Available Models
Binom.Router integrates with leading AI model providers, giving you access to state-of-the-art language models through a unified API. Configure multiple providers and switch between them seamlessly based on your needs.
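As an illustration of the unified-API idea, switching providers typically amounts to changing the model name in an OpenAI-compatible chat payload. The sketch below builds such a payload; the helper name and structure are illustrative, not a confirmed Binom.Router client API.

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completion payload.

    Illustrative sketch: the same payload shape works across
    providers, so switching models is a one-line change.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same request shape, different providers -- only the model name changes:
payload = build_chat_request("gpt-4o-mini", "Hello!")
body = json.dumps(payload)  # ready to send as the HTTP request body
print(payload["model"])  # → gpt-4o-mini
```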
OpenAI
OpenAI's GPT series represents some of the most capable language models available, excelling at complex reasoning, coding, and creative tasks.
Available Models
| Model | Context Window | Streaming | Best For |
|---|---|---|---|
| gpt-5 | 128K+ | ✅ | General-purpose tasks, complex reasoning |
| gpt-5-codex | 128K+ | ✅ | Code generation, debugging, technical documentation |
| gpt-4o-mini | 128K | ✅ | Cost-effective tasks, simple queries |
| gpt-4-turbo | 128K | ✅ | Balanced performance and speed |
| gpt-3.5-turbo | 16K | ✅ | Lightweight tasks, high-throughput scenarios |
Authentication Methods
- API Key: Direct authentication with OpenAI
- OAuth: Sign in with your OpenAI/ChatGPT account
- OpenAI-Compatible: Connect to any OpenAI-compatible endpoint (OpenRouter, Groq, custom servers)
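Connecting to an OpenAI-compatible endpoint usually means pointing the client at a different base URL while keeping the standard `/chat/completions` path. A minimal sketch, assuming the providers follow that convention (verify the base URLs against each provider's own documentation):

```python
# Base URLs for some OpenAI-compatible providers -- confirm against
# each provider's docs before relying on them.
KNOWN_BASES = {
    "openai": "https://api.openai.com/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "groq": "https://api.groq.com/openai/v1",
}

def chat_url(provider_or_base: str) -> str:
    """Resolve the chat-completions URL for a known provider name
    or any custom OpenAI-compatible base URL."""
    base = KNOWN_BASES.get(provider_or_base, provider_or_base).rstrip("/")
    return f"{base}/chat/completions"

print(chat_url("groq"))
# A custom or self-hosted server works the same way:
print(chat_url("http://localhost:11434/v1"))
```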
Pricing
Pricing varies by model and usage tier. Visit your Billing page for current rates based on your subscription plan.
Google Gemini
Google's Gemini models offer state-of-the-art performance with strong capabilities in multimodal understanding and long-context reasoning.
Available Models
| Model | Context Window | Streaming | Best For |
|---|---|---|---|
| gemini-2.0-flash-exp | 1M+ | ✅ | Experimental features, cutting-edge performance |
| gemini-1.5-pro | 1M | ✅ | Complex reasoning, large document analysis |
| gemini-1.5-flash | 1M | ✅ | Fast responses, real-time applications |
| gemini-1.0-pro | 32K | ✅ | General-purpose tasks, cost-effective |
Authentication Methods
- API Key: Google AI Studio API key
- OAuth: Sign in with your Google Account (supports Gemini 2.5/3 models)
- Vertex AI: Enterprise Google Cloud authentication
Special Features
- Massive Context: Up to 1 million token context window
- Multimodal: Native support for images, audio, and video
- Code Execution: Built-in code execution capabilities
Anthropic Claude
Claude models are designed with a focus on helpfulness, honesty, and safety, with strong performance in analysis, writing, and coding tasks.
Available Models
| Model | Context Window | Streaming | Best For |
|---|---|---|---|
| claude-opus-4-1-20250805 | 200K | ✅ | Most complex tasks, deepest reasoning |
| claude-sonnet-4-5-20250929 | 200K | ✅ | Balanced performance, code and writing |
| claude-3-5-sonnet-20240620 | 200K | ✅ | Advanced coding and technical tasks |
| claude-3-opus-20240229 | 200K | ✅ | Complex analysis and research |
| claude-3-sonnet-20240229 | 200K | ✅ | General-purpose, balanced tasks |
| claude-3-haiku-20240307 | 200K | ✅ | Fast responses, lightweight tasks |
| claude-3-5-haiku-20241022 | 200K | ✅ | Quick responses, simple queries |
Authentication Methods
- API Key: Direct Anthropic API key authentication
- OAuth: Sign in with your Anthropic account
Special Features
- Large Context: Up to 200K token context window
- Constitutional AI: Built-in safety and alignment
- Vision: Native multimodal capabilities
Other Supported Providers
Binom.Router supports additional model providers through flexible integration options.
Qwen (Alibaba)
| Model | Context Window | Streaming | Best For |
|---|---|---|---|
| qwen3-max | 32K | ✅ | General-purpose, high performance |
| qwen3-coder-plus | 32K | ✅ | Code generation and programming |
| qwen3-235b-a22b-instruct | 32K | ✅ | Complex instruction following |
GLM (Zhipu AI)
| Model | Context Window | Streaming | Best For |
|---|---|---|---|
| glm-4.7 | 128K | ✅ | Chinese language tasks, general use |
Additional Providers
- Cohere: Enterprise-focused language models
- Groq: Ultra-fast inference with open-source models
- Together AI: Access to open-source models at scale
- Ollama: Self-hosted models for privacy and control
- Custom: Connect any OpenAI-compatible endpoint
Model Comparison
Quick comparison of popular models across providers:
| Provider | Model | Context | Speed | Cost | Best Use Case |
|---|---|---|---|---|---|
| OpenAI | gpt-5 | 128K+ | High | High | General-purpose, complex tasks |
| OpenAI | gpt-5-codex | 128K+ | High | High | Code generation |
| OpenAI | gpt-4o-mini | 128K | Very High | Low | Cost-effective queries |
| Google | gemini-2.0-flash-exp | 1M+ | Very High | Medium | Experimental, large context |
| Google | gemini-1.5-pro | 1M | High | Medium | Document analysis, research |
| Anthropic | claude-opus-4-1-20250805 | 200K | High | High | Complex reasoning, research |
| Anthropic | claude-sonnet-4-5-20250929 | 200K | High | Medium | Balanced performance |
| Anthropic | claude-3-5-haiku-20241022 | 200K | Very High | Low | Fast responses |
| Qwen | qwen3-max | 32K | High | Low | General-purpose |
Choosing the Right Model
Consider these factors when selecting a model:
- Task Complexity: Use larger models (Opus, GPT-5) for complex reasoning
- Speed Requirements: Choose Flash or Haiku models for real-time applications
- Cost Constraints: Opt for gpt-4o-mini or Haiku for high-volume tasks
- Context Length: Select Gemini 1.5 for documents requiring massive context
- Specialization: Use Codex models for programming tasks
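The guidance above can be condensed into a small selection heuristic. This is an illustrative sketch only; the model names mirror the tables above, and you should substitute whatever models your plan actually exposes.

```python
def pick_model(complex_reasoning=False, realtime=False, budget=False,
               long_context=False, coding=False) -> str:
    """Illustrative heuristic mirroring the guidance above.

    Priorities are applied top-down: a massive-context need or a
    coding task outweighs the other factors.
    """
    if long_context:
        return "gemini-1.5-pro"           # 1M-token context window
    if coding:
        return "gpt-5-codex"              # code-specialized
    if complex_reasoning:
        return "claude-opus-4-1-20250805" # deepest reasoning
    if realtime:
        return "gemini-1.5-flash"         # fast responses
    if budget:
        return "gpt-4o-mini"              # cost-effective, high volume
    return "claude-sonnet-4-5-20250929"   # balanced default

print(pick_model(long_context=True))  # → gemini-1.5-pro
```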
Model Availability
Model availability depends on:
- Your subscription plan
- Provider service status
- Geographic restrictions
- API key permissions
Check your Dashboard for real-time availability status.
Streaming Support
All listed models support streaming responses for real-time interaction. Enable streaming by setting `"stream": true` in your API request parameters.
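With streaming enabled, responses typically arrive as server-sent events in the common OpenAI `data: {json}` format, terminated by `data: [DONE]`. The sketch below extracts text deltas from such a stream; it assumes that wire format, so verify it against the provider you connect through.

```python
import json

def extract_stream_text(sse_lines):
    """Concatenate content deltas from OpenAI-style SSE lines.

    Assumes the common `data: {json}` convention terminated by
    `data: [DONE]` -- verify against your provider's stream format.
    """
    out = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content", "")
        out.append(delta)
    return "".join(out)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(extract_stream_text(sample))  # → Hello
```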
Rate Limits
Rate limits vary by provider and model. Refer to your Billing page for detailed rate limit information based on your plan.