Model Providers
Connect to external AI providers with unified API access and automatic key management. The AI Gateway supports multiple third-party providers, giving you flexibility to choose the best model for your use case.
Supported Providers
OpenAI
Access state-of-the-art language models from OpenAI:
- GPT-4: Most capable model for complex tasks
- GPT-4 Turbo: Faster and more cost-effective version of GPT-4
- GPT-3.5 Turbo: Fast and affordable model for common tasks
Best for: General-purpose text generation, conversation, code generation, analysis
Anthropic
Deploy Claude models with advanced reasoning capabilities:
- Claude 3 Opus: Most powerful model for complex analysis and creative tasks
- Claude 3 Sonnet: Balanced performance and speed for most workloads
- Claude 3 Haiku: Fastest and most compact model for simple tasks
Best for: Long-form content, analysis, research, conversational AI
Google
Access Google's latest language models:
- Gemini Pro: Advanced multimodal model for text and vision
- Gemini Ultra: Most capable model for complex reasoning
- PaLM 2: Efficient language model for diverse tasks
Best for: Multimodal tasks, translation, code generation
Cohere
Specialized models for enterprise applications:
- Command: Instruction-following model for business tasks
- Command-Light: Faster, lighter version for simple commands
- Embed: Text embeddings for semantic search and classification
Best for: Enterprise applications, semantic search, text classification
Azure OpenAI
Enterprise-grade OpenAI models with Azure SLA:
- Same models as OpenAI (GPT-4, GPT-3.5 Turbo)
- Azure infrastructure and compliance
- Enterprise support and SLA guarantees
- Private network deployment options
Best for: Enterprise deployments requiring Azure compliance and support
Unified API Access
All providers use a consistent API interface, making it easy to switch between models:
```javascript
// Read the gateway-provisioned service credentials from the environment
const services = JSON.parse(process.env.STRONGLY_SERVICES);
const aiModel = services.aiModels['model-id'];

const response = await fetch(aiModel.endpoint, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${aiModel.apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: aiModel.name,
    messages: [{ role: 'user', content: 'Hello!' }],
    max_tokens: 500,
    temperature: 0.7
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);
```
Automatic Key Management
The AI Gateway handles API key management automatically:
- Secure Storage: Keys are encrypted at rest and in transit
- Rotation: Automatic key rotation for enhanced security
- Access Control: Role-based access to different models
- Usage Tracking: Monitor API usage per key and user
Comparing Self-Hosted vs Third-Party
| Factor | Self-Hosted Models | Third-Party Providers |
|---|---|---|
| Control | Full control over infrastructure | Limited to provider's capabilities |
| Privacy | Data stays in your infrastructure | Data sent to external provider |
| Cost | Pay for compute resources | Pay per token/request |
| Latency | Depends on your infrastructure | Provider's infrastructure latency |
| Scalability | Manual or autoscaling configuration | Automatic by provider |
| Maintenance | You manage updates and infrastructure | Provider handles maintenance |
| Customization | Full model fine-tuning capability | Limited fine-tuning options |
| Availability | Depends on your uptime | Provider's SLA guarantees |
Choosing the Right Provider
Consider these factors when selecting a model provider:
Performance Requirements
- Response Time: Third-party providers typically have lower latency
- Throughput: Self-hosted models can be optimized for high throughput
- Consistency: Third-party providers offer consistent performance with SLAs
Cost Considerations
- Usage Patterns: On-demand self-hosted models are cost-effective for intermittent use
- Volume: High-volume workloads may benefit from self-hosted optimization
- Budget: Third-party providers offer predictable per-token pricing
Privacy and Compliance
- Data Sensitivity: Self-hosted models keep data within your infrastructure
- Regulatory Requirements: Some industries require on-premises deployment
- Audit Trail: Both options support comprehensive logging and monitoring
Model Capabilities
- Task Complexity: More complex tasks may benefit from latest provider models
- Specialization: Fine-tuned self-hosted models excel at domain-specific tasks
- Multimodal Needs: Some providers offer vision and text capabilities
Provider Pricing Models
OpenAI and Anthropic
- Pay per Token: Charged based on input and output tokens
- Different Rates: Varies by model (GPT-4 more expensive than GPT-3.5)
- No Minimum: Pay only for what you use
Self-Hosted Models
- Compute Costs: Pay for GPU/CPU resources while running
- Storage Costs: Pay for model storage
- Deployment Options: Always On, On Demand, or Scheduled
- Potential Savings: 70-90% cost reduction with on-demand deployment
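To see where the on-demand savings come from, here is a back-of-the-envelope comparison. The hourly rate and active-hours figures are illustrative assumptions, not actual pricing:

```javascript
// Illustrative cost comparison (hypothetical rates, not real pricing):
// an always-on GPU instance vs. on-demand deployment that only bills
// while the model is actively serving requests.
const gpuRatePerHour = 2.50;       // hypothetical hourly GPU rate
const hoursPerMonth = 24 * 30;     // 720 hours

const alwaysOnCost = gpuRatePerHour * hoursPerMonth;       // 1800.00

// On-demand: billed only for active hours, e.g. 4 busy hours per day
const activeHoursPerMonth = 4 * 30;                        // 120 hours
const onDemandCost = gpuRatePerHour * activeHoursPerMonth; // 300.00

const savings = 1 - onDemandCost / alwaysOnCost;
console.log(`On-demand savings: ${(savings * 100).toFixed(0)}%`); // ~83%
```

With this usage pattern the savings land inside the 70-90% range quoted above; a workload that is busy fewer hours per day saves more, and a near-continuous workload saves less.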
Managing Multiple Providers
The AI Gateway makes it easy to work with multiple providers simultaneously:
- Unified Dashboard: View all models in one place
- Consistent API: Same interface across all providers
- Centralized Monitoring: Track usage and costs across providers
- Easy Switching: Change providers without code changes
- A/B Testing: Compare model performance across providers
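Because every provider shares the same request shape, switching providers reduces to looking up a different model ID. A minimal sketch (the model IDs and helper below are illustrative, not part of the gateway API):

```javascript
// Build a provider request from the gateway's service map.
// Only the model-ID lookup changes between providers.
function buildRequest(services, modelId, prompt) {
  const aiModel = services.aiModels[modelId];
  return {
    endpoint: aiModel.endpoint,
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${aiModel.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model: aiModel.name,
        messages: [{ role: 'user', content: prompt }]
      })
    }
  };
}

// Usage: read the service map once, then switch providers by key alone.
// const services = JSON.parse(process.env.STRONGLY_SERVICES);
// const req = buildRequest(services, 'anthropic-claude-3-sonnet', 'Hello!');
```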
Best Practices
Development
- Start with third-party providers for rapid prototyping
- Use smaller, faster models during development
- Switch to self-hosted models for production if needed
Production
- Choose providers based on latency and reliability requirements
- Implement fallback providers for high availability
- Monitor performance and costs continuously
- Use guardrails to ensure content safety
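The fallback recommendation above can be sketched as a simple ordered chain: try each configured model in turn and return the first successful response. This is an assumed pattern built on the unified API, not a built-in gateway feature; model IDs are illustrative.

```javascript
// Try each model ID in order; return the first successful completion.
async function completeWithFallback(services, modelIds, prompt) {
  let lastError;
  for (const modelId of modelIds) {
    const aiModel = services.aiModels[modelId];
    try {
      const response = await fetch(aiModel.endpoint, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${aiModel.apiKey}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          model: aiModel.name,
          messages: [{ role: 'user', content: prompt }]
        })
      });
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      const data = await response.json();
      return data.choices[0].message.content;
    } catch (err) {
      lastError = err; // fall through to the next provider
    }
  }
  throw new Error(`All providers failed: ${lastError}`);
}

// Usage:
// const answer = await completeWithFallback(services,
//   ['primary-model-id', 'backup-model-id'], 'Hello!');
```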
Cost Optimization
- Use on-demand deployment for variable workloads
- Choose appropriate model sizes for tasks
- Implement caching for repeated queries
- Monitor token usage to identify optimization opportunities
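A minimal sketch of the caching recommendation above: an in-memory map keyed by model and prompt, so identical requests skip the provider call (and its token cost). The wrapper below is a hypothetical helper; a production cache would also bound entry count and age.

```javascript
// Wrap any completion function with an in-memory response cache.
function makeCachedCompleter(complete) {
  const cache = new Map();
  return async function cachedComplete(modelId, prompt) {
    const key = `${modelId}\u0000${prompt}`;
    if (cache.has(key)) return cache.get(key); // cache hit: no provider call
    const result = await complete(modelId, prompt);
    cache.set(key, result);
    return result;
  };
}

// Usage: repeated identical prompts hit the cache instead of the provider.
// const cached = makeCachedCompleter(callGateway);
// await cached('model-id', 'What is our refund policy?');
```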
Next Steps
- Deploy your first model
- Configure autoscaling for self-hosted models
- Set up monitoring to track performance
- Optimize costs across providers