Available Gemini models
| Model | Context | Strengths |
|---|---|---|
| Gemini 2.5 Pro | 1M tokens | Strongest reasoning, long-context analysis |
| Gemini 2.5 Flash | 1M tokens | Fast inference, cost-effective |
Basic usage
Long-context analysis (1M tokens)
Gemini’s 1M token context handles entire codebases or books in a single request:Vision — image analysis
Runcrate vs. direct Google API
| Direct Google API | Runcrate | |
|---|---|---|
| Auth | Google Cloud project + service account | Single API key |
| Format | Google-specific SDK | OpenAI-compatible |
| Other models | Gemini only | 140+ models, same key |
Pro vs. Flash
| Scenario | Model | Why |
|---|---|---|
| Complex reasoning | Gemini 2.5 Pro | Stronger reasoning |
| Bulk processing | Gemini 2.5 Flash | Faster, cheaper |
| Real-time chat | Gemini 2.5 Flash | Lower latency |
| Vision / image analysis | Either | Both support multimodal |
Tips
- 1M context is real — you can feed entire repositories or book-length texts.
- Gemini 2.5 Flash is the cost-effective choice for high-volume tasks.
- Same API format: just change the model string from DeepSeek or Llama.
Next steps
- Chat completions reference
- AI Summarization — Gemini Flash for long-document summarization
- Model catalog