Model Providers

Connect to external AI providers with unified API access and automatic key management. The AI Gateway supports multiple third-party providers, giving you flexibility to choose the best model for your use case.

Supported Providers

OpenAI

Access state-of-the-art language models from OpenAI:

  • GPT-4: Most capable model for complex tasks
  • GPT-4 Turbo: Faster and more cost-effective version of GPT-4
  • GPT-3.5 Turbo: Fast and affordable model for common tasks

Best for: General-purpose text generation, conversation, code generation, analysis

Anthropic

Deploy Claude models with advanced reasoning capabilities:

  • Claude 3 Opus: Most powerful model for complex analysis and creative tasks
  • Claude 3 Sonnet: Balanced performance and speed for most workloads
  • Claude 3 Haiku: Fastest and most compact model for simple tasks

Best for: Long-form content, analysis, research, conversational AI

Google

Access Google's latest language models:

  • Gemini Pro: Advanced multimodal model for text and vision
  • Gemini Ultra: Most capable model for complex reasoning
  • PaLM 2: Efficient language model for diverse tasks

Best for: Multimodal tasks, translation, code generation

Cohere

Specialized models for enterprise applications:

  • Command: Instruction-following model for business tasks
  • Command-Light: Faster, lighter version for simple commands
  • Embed: Text embeddings for semantic search and classification

Best for: Enterprise applications, semantic search, text classification

Azure OpenAI

Enterprise-grade OpenAI models with Azure SLA:

  • Same models as OpenAI (GPT-4, GPT-3.5 Turbo)
  • Azure infrastructure and compliance
  • Enterprise support and SLA guarantees
  • Private network deployment options

Best for: Enterprise deployments requiring Azure compliance and support

Unified API Access

All providers use a consistent API interface, making it easy to switch between models:

const services = JSON.parse(process.env.STRONGLY_SERVICES);
const aiModel = services.aiModels['model-id'];

// The gateway accepts the same OpenAI-style chat payload for every
// provider, so this request works unchanged across models.
const response = await fetch(aiModel.endpoint, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${aiModel.apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: aiModel.name,
    messages: [{ role: 'user', content: 'Hello!' }],
    max_tokens: 500,
    temperature: 0.7
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);

Automatic Key Management

The AI Gateway handles API key management automatically:

  • Secure Storage: Keys are encrypted at rest and in transit
  • Rotation: Automatic key rotation for enhanced security
  • Access Control: Role-based access to different models
  • Usage Tracking: Monitor API usage per key and user
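
Because keys rotate automatically, read them from the injected configuration at call time instead of copying them into long-lived variables. A minimal sketch, assuming the platform refreshes STRONGLY_SERVICES when a key rotates (getModelConfig is a hypothetical helper, not part of the gateway API):

// Look up credentials at call time so rotated keys are picked up;
// never hardcode or log aiModel.apiKey.
function getModelConfig(modelId) {
  const services = JSON.parse(process.env.STRONGLY_SERVICES);
  return services.aiModels[modelId];
}

const aiModel = getModelConfig('model-id');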

Comparing Self-Hosted vs Third-Party

Factor        | Self-Hosted Models                    | Third-Party Providers
--------------|---------------------------------------|-----------------------------------
Control       | Full control over infrastructure      | Limited to provider's capabilities
Privacy       | Data stays in your infrastructure     | Data sent to external provider
Cost          | Pay for compute resources             | Pay per token/request
Latency       | Depends on your infrastructure        | Provider's infrastructure latency
Scalability   | Manual or autoscaling configuration   | Automatic by provider
Maintenance   | You manage updates and infrastructure | Provider handles maintenance
Customization | Full model fine-tuning capability     | Limited fine-tuning options
Availability  | Depends on your uptime                | Provider's SLA guarantees

Choosing the Right Provider

Consider these factors when selecting a model provider:

Performance Requirements

  • Response Time: Third-party providers typically have lower latency
  • Throughput: Self-hosted models can be optimized for high throughput
  • Consistency: Third-party providers offer consistent performance with SLAs

Cost Considerations

  • Usage Patterns: On-demand self-hosted models are cost-effective for intermittent use
  • Volume: High-volume workloads may benefit from self-hosted optimization
  • Budget: Third-party providers offer predictable per-token pricing

Privacy and Compliance

  • Data Sensitivity: Self-hosted models keep data within your infrastructure
  • Regulatory Requirements: Some industries require on-premises deployment
  • Audit Trail: Both options support comprehensive logging and monitoring

Model Capabilities

  • Task Complexity: More complex tasks may benefit from the latest provider models
  • Specialization: Fine-tuned self-hosted models excel at domain-specific tasks
  • Multimodal Needs: Some providers offer vision and text capabilities

Provider Pricing Models

OpenAI and Anthropic

  • Pay per Token: Charged based on input and output tokens
  • Different Rates: Varies by model (GPT-4 more expensive than GPT-3.5)
  • No Minimum: Pay only for what you use
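
As a rough illustration of per-token billing (both rates below are hypothetical placeholders, not actual provider prices):

// Illustrative cost calculation; the rates are placeholders.
const INPUT_RATE = 0.01 / 1000;   // $ per input token (hypothetical)
const OUTPUT_RATE = 0.03 / 1000;  // $ per output token (hypothetical)

function requestCost(inputTokens, outputTokens) {
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

console.log(requestCost(1200, 500).toFixed(3)); // "0.027" at these rates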

Self-Hosted Models

  • Compute Costs: Pay for GPU/CPU resources while running
  • Storage Costs: Pay for model storage
  • Deployment Options: Always On, On Demand, or Scheduled
  • Potential Savings: 70-90% cost reduction with on-demand deployment
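
For a sense of scale: at a hypothetical $1.00/hour GPU rate, an always-on deployment runs about $730 per month, while an on-demand deployment active four hours a day runs about $120 per month, roughly an 84% reduction.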

Managing Multiple Providers

The AI Gateway makes it easy to work with multiple providers simultaneously:

  1. Unified Dashboard: View all models in one place
  2. Consistent API: Same interface across all providers
  3. Centralized Monitoring: Track usage and costs across providers
  4. Easy Switching: Change providers without code changes (see the sketch after this list)
  5. A/B Testing: Compare model performance across providers
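
Because every model exposes the same request shape, switching providers (or A/B testing two of them) reduces to changing the model ID you look up. A minimal sketch reusing the STRONGLY_SERVICES pattern from the unified API example above; the model IDs are hypothetical:

const services = JSON.parse(process.env.STRONGLY_SERVICES);

// Same request shape for any provider; only the model ID changes.
async function complete(modelId, prompt) {
  const aiModel = services.aiModels[modelId];
  const response = await fetch(aiModel.endpoint, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${aiModel.apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: aiModel.name,
      messages: [{ role: 'user', content: prompt }]
    })
  });
  if (!response.ok) {
    throw new Error(`Gateway returned ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}

// A/B test the same prompt against two providers (hypothetical IDs).
const fromOpenAI = await complete('gpt-model-id', 'Summarize our roadmap.');
const fromClaude = await complete('claude-model-id', 'Summarize our roadmap.');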

Best Practices

Development

  • Start with third-party providers for rapid prototyping
  • Use smaller, faster models during development
  • Switch to self-hosted models for production if needed

Production

  • Choose providers based on latency and reliability requirements
  • Implement fallback providers for high availability (sketched after this list)
  • Monitor performance and costs continuously
  • Use guardrails to ensure content safety
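
One way to implement a fallback chain, building on the complete() helper sketched under "Managing Multiple Providers" (the ordering of model IDs is illustrative):

// Try providers in order; return the first successful completion.
async function completeWithFallback(modelIds, prompt) {
  let lastError;
  for (const modelId of modelIds) {
    try {
      return await complete(modelId, prompt);
    } catch (err) {
      lastError = err; // note the failure, then try the next provider
    }
  }
  throw lastError; // every provider failed
}

const reply = await completeWithFallback(
  ['gpt-model-id', 'claude-model-id'], // hypothetical IDs, primary first
  'Hello!'
);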

Cost Optimization

  • Use on-demand deployment for variable workloads
  • Choose appropriate model sizes for tasks
  • Implement caching for repeated queries (see the sketch after this list)
  • Monitor token usage to identify optimization opportunities
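
A minimal in-memory cache over the complete() helper from earlier; it matches prompts exactly, and a production version would add TTLs and size bounds:

// Cache completions keyed by model and prompt so identical
// requests are only billed once per process.
const completionCache = new Map();

async function completeCached(modelId, prompt) {
  const key = `${modelId}\u0000${prompt}`;
  if (!completionCache.has(key)) {
    completionCache.set(key, await complete(modelId, prompt));
  }
  return completionCache.get(key);
}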

Next Steps