AI Models
Manage AI models registered in the Strongly AI Gateway. Models can be third-party provider models (OpenAI, Anthropic, etc.) or self-hosted models deployed on your cluster.
AIModel Object
{
"id": "model_abc123",
"name": "GPT-4 Production",
"type": "third-party",
"provider": "openai",
"vendorModelId": "gpt-4",
"modelType": "chat",
"status": "active",
"description": "GPT-4 for production workloads",
"capabilities": ["chat", "function-calling"],
"maxTokens": 8192,
"contextWindow": 128000,
"owner": "user_abc123",
"organizationId": "org_abc123",
"isShared": true,
"sharedWith": ["user_def456"],
"config": {
"defaultTemperature": 0.7,
"rateLimit": 100
},
"createdAt": "2025-01-15T10:00:00.000Z",
"updatedAt": "2025-02-01T14:30:00.000Z"
}
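Clients that want typed access to this object can mirror it with a small dataclass. A minimal sketch, with field names taken from the example above (optional fields default to sensible blanks; unknown keys are dropped):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AIModel:
    """Typed view of the AIModel object shown above."""
    id: str
    name: str
    type: str                         # "third-party" or "self-hosted"
    provider: str
    vendorModelId: str
    status: str
    modelType: Optional[str] = None
    maxTokens: Optional[int] = None
    contextWindow: Optional[int] = None
    isShared: bool = False
    config: dict = field(default_factory=dict)

    @classmethod
    def from_json(cls, obj: dict) -> "AIModel":
        # Keep only the keys the dataclass declares; ignore anything extra,
        # so new server-side fields don't break parsing.
        known = set(cls.__dataclass_fields__)
        return cls(**{k: v for k, v in obj.items() if k in known})
```

This is a client-side convenience only; the API itself returns plain JSON.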
GET /api/v1/ai/models
List AI models
Returns a paginated list of AI models accessible to the current user.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| search | query | string | No | Search by model name |
| type | query | string | No | Filter by type: third-party, self-hosted |
| status | query | string | No | Filter by status: active, deploying, stopped, failed |
| provider | query | string | No | Filter by provider: openai, anthropic, google, etc. |
| modelType | query | string | No | Filter by model type: chat, completion, embedding |
| limit | query | integer | No | Max results (default: 50, max: 200) |
| offset | query | integer | No | Pagination offset |
| sort | query | string | No | Sort field (default: -createdAt) |
Response: 200 OK (paginated)
{
"data": [
{
"id": "model_abc123",
"name": "GPT-4 Production",
"type": "third-party",
"provider": "openai",
"vendorModelId": "gpt-4",
"modelType": "chat",
"status": "active",
"description": "GPT-4 for production workloads",
"capabilities": ["chat", "function-calling"],
"maxTokens": 8192,
"contextWindow": 128000,
"owner": "user_abc123",
"organizationId": "org_abc123",
"isShared": true,
"createdAt": "2025-01-15T10:00:00.000Z",
"updatedAt": "2025-02-01T14:30:00.000Z"
}
],
"meta": { "total": 12, "limit": 50, "offset": 0, "hasMore": false, "requestId": "req_abc123" }
}
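The limit/offset/hasMore shape above lends itself to a simple pagination loop. A sketch with the HTTP call abstracted behind a fetch_page callable (a real client would wrap an authenticated GET to /api/v1/ai/models; the helper name is ours):

```python
from typing import Callable, Iterator

def iter_models(fetch_page: Callable[[int, int], dict], limit: int = 50) -> Iterator[dict]:
    """Yield every model by walking limit/offset pages until hasMore is false.

    fetch_page(limit, offset) must return the endpoint's response shape:
    {"data": [...], "meta": {"hasMore": bool, ...}}.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page["meta"].get("hasMore"):
            break
        offset += limit
```

Because the generator stops on hasMore rather than an empty page, it avoids one final wasted request when the total is an exact multiple of the page size.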
GET /api/v1/ai/models/overview
Get model overview statistics
Returns aggregate counts of models by status and type.
Scope: ai-gateway:read
Response: 200 OK
{
"data": {
"total": 12,
"active": 8,
"deploying": 1,
"stopped": 2,
"failed": 1,
"thirdParty": 9,
"selfHosted": 3
},
"meta": { "requestId": "req_abc123" }
}
POST /api/v1/ai/models
Create a new AI model
Registers a new model in the AI Gateway. Third-party models become active immediately; self-hosted models require deployment.
Scope: ai-gateway:write
Request Body:
{
"name": "GPT-4 Production",
"type": "third-party",
"provider": "openai",
"vendorModelId": "gpt-4",
"modelType": "chat",
"description": "GPT-4 for production workloads",
"capabilities": ["chat", "function-calling"],
"maxTokens": 8192,
"contextWindow": 128000,
"config": {
"defaultTemperature": 0.7,
"rateLimit": 100
}
}
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable model name |
| type | string | Yes | Model type: third-party or self-hosted |
| provider | string | Yes | Provider: openai, anthropic, google, huggingface, etc. |
| vendorModelId | string | Yes | Provider's model identifier (e.g., gpt-4, claude-3-opus) |
| modelType | string | No | Capability type: chat, completion, embedding |
| description | string | No | Model description |
| capabilities | string[] | No | List of capabilities (e.g., chat, function-calling, vision) |
| maxTokens | integer | No | Maximum output tokens |
| contextWindow | integer | No | Maximum context window size in tokens |
| config | object | No | Additional provider-specific configuration |
Response: 201 Created
{
"data": {
"id": "model_abc123",
"name": "GPT-4 Production",
"type": "third-party",
"provider": "openai",
"vendorModelId": "gpt-4",
"modelType": "chat",
"status": "active",
"owner": "user_abc123",
"organizationId": "org_abc123",
"createdAt": "2025-02-07T10:00:00.000Z",
"updatedAt": "2025-02-07T10:00:00.000Z"
},
"meta": { "requestId": "req_abc123" }
}
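A quick client-side check of the required fields before POSTing saves a round trip on an obvious 400. A sketch based on the table above (field names as documented; the helper name and the exact messages are ours):

```python
REQUIRED_FIELDS = ("name", "type", "provider", "vendorModelId")
VALID_TYPES = {"third-party", "self-hosted"}

def validate_create_body(body: dict) -> list:
    """Return a list of problems; an empty list means the body looks POSTable."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if not body.get(f)]
    if body.get("type") and body["type"] not in VALID_TYPES:
        problems.append(f"invalid type: {body['type']}")
    return problems
```

The server remains the source of truth; this only catches the failures the documentation already makes predictable.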
GET /api/v1/ai/models/:id
Get an AI model
Returns the full details of a single model.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"id": "model_abc123",
"name": "GPT-4 Production",
"type": "third-party",
"provider": "openai",
"vendorModelId": "gpt-4",
"modelType": "chat",
"status": "active",
"description": "GPT-4 for production workloads",
"capabilities": ["chat", "function-calling"],
"maxTokens": 8192,
"contextWindow": 128000,
"owner": "user_abc123",
"organizationId": "org_abc123",
"isShared": true,
"sharedWith": ["user_def456"],
"config": {
"defaultTemperature": 0.7,
"rateLimit": 100
},
"createdAt": "2025-01-15T10:00:00.000Z",
"updatedAt": "2025-02-01T14:30:00.000Z"
},
"meta": { "requestId": "req_abc123" }
}
PUT /api/v1/ai/models/:id
Update an AI model
Updates model properties. Only provided fields are changed.
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Request Body:
{
"name": "GPT-4 Production (Updated)",
"description": "Updated description",
"maxTokens": 4096,
"config": {
"defaultTemperature": 0.5
}
}
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | Updated model name |
| description | string | No | Updated description |
| capabilities | string[] | No | Updated capabilities list |
| maxTokens | integer | No | Updated max output tokens |
| contextWindow | integer | No | Updated context window size |
| config | object | No | Updated configuration (merged with existing) |
Response: 200 OK
{
"data": {
"id": "model_abc123",
"name": "GPT-4 Production (Updated)",
"updatedAt": "2025-02-07T12:00:00.000Z"
},
"meta": { "requestId": "req_abc123" }
}
DELETE /api/v1/ai/models/:id
Delete an AI model
Permanently removes a model from the AI Gateway. Self-hosted models are undeployed first.
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 204 No Content
POST /api/v1/ai/models/:id/deploy
Deploy a self-hosted model
Deploys a self-hosted model to the Kubernetes cluster. Only applicable to models with type: "self-hosted".
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"id": "model_abc123",
"status": "deploying",
"message": "Deployment initiated"
},
"meta": { "requestId": "req_abc123" }
}
POST /api/v1/ai/models/:id/start
Start a stopped model
Starts a previously stopped self-hosted model.
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"id": "model_abc123",
"status": "deploying",
"message": "Model starting"
},
"meta": { "requestId": "req_abc123" }
}
POST /api/v1/ai/models/:id/stop
Stop a running model
Stops a running self-hosted model, freeing cluster resources.
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"id": "model_abc123",
"status": "stopped",
"message": "Model stopped"
},
"meta": { "requestId": "req_abc123" }
}
GET /api/v1/ai/models/:id/status
Get model deployment status
Returns the current deployment status and replica information for a self-hosted model.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"status": "active",
"replicas": 2,
"readyReplicas": 2,
"conditions": [
{
"type": "Available",
"status": "True",
"lastTransitionTime": "2025-02-07T10:00:00.000Z",
"reason": "MinimumReplicasAvailable",
"message": "Deployment has minimum availability"
}
]
},
"meta": { "requestId": "req_abc123" }
}
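Deploy and start both return status: "deploying", so callers typically poll this endpoint until the model settles. A sketch with the GET abstracted behind a get_status callable; the terminal-state set and default timings are our assumptions, not documented guarantees:

```python
import time
from typing import Callable

TERMINAL_STATES = {"active", "failed", "stopped"}

def wait_for_model(get_status: Callable[[], dict],
                   interval: float = 5.0, timeout: float = 600.0) -> str:
    """Poll until the status leaves 'deploying' or the timeout elapses.

    get_status() must return the endpoint's data object,
    e.g. {"status": "deploying", "replicas": 2, ...}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()["status"]
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError("model did not reach a terminal state in time")
```

Callers should still check whether the returned state is "active" or "failed" before routing traffic to the model.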
GET /api/v1/ai/models/:id/metrics
Get model metrics
Returns performance and usage metrics for a model.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"totalRequests": 15420,
"totalTokens": 2500000,
"avgLatencyMs": 450,
"errorRate": 0.02,
"requestsPerMinute": 12.5,
"uptimePercent": 99.8
},
"meta": { "requestId": "req_abc123" }
}
GET /api/v1/ai/models/:id/logs
Get model logs
Returns container logs for a self-hosted model deployment.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
| lines | query | integer | No | Number of log lines to return (default: 100) |
| since | query | string | No | Return logs since this time (ISO 8601 or duration like 1h, 30m) |
| container | query | string | No | Container name (for multi-container pods) |
Response: 200 OK
{
"data": {
"logs": "2025-02-07T10:00:00Z INFO Model loaded successfully\n2025-02-07T10:00:01Z INFO Ready to serve requests on port 8080\n",
"container": "model-server",
"lines": 100
},
"meta": { "requestId": "req_abc123" }
}
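The since parameter accepts either an ISO 8601 timestamp or a shorthand duration like 1h or 30m. A client-side sketch that normalizes both forms to an absolute datetime; the full unit set (s, m, h, d) is an assumption beyond the two examples documented:

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_DURATION = re.compile(r"^(\d+)([smhd])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def resolve_since(since: str, now: Optional[datetime] = None) -> datetime:
    """Turn '1h'/'30m'-style durations or ISO 8601 strings into a datetime."""
    m = _DURATION.match(since)
    if m:
        now = now or datetime.now(timezone.utc)
        return now - timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})
    # Anything that isn't a recognized duration is treated as ISO 8601.
    return datetime.fromisoformat(since)
```

Passing the raw string through unchanged also works, since the server accepts both forms; this helper is only useful when the client needs the resolved instant (e.g. for display or local filtering).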
GET /api/v1/ai/models/:id/permissions
Get model permissions
Returns sharing and ownership information for a model.
Scope: ai-gateway:read
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Response: 200 OK
{
"data": {
"owner": "user_abc123",
"sharedWith": ["user_def456", "user_ghi789"],
"isShared": true,
"organizationId": "org_abc123"
},
"meta": { "requestId": "req_abc123" }
}
PUT /api/v1/ai/models/:id/permissions
Update model permissions
Updates sharing settings for a model. Only the model owner can modify permissions.
Scope: ai-gateway:write
Parameters:
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| id | path | string | Yes | Model ID |
Request Body:
{
"isShared": true,
"sharedWith": ["user_def456", "user_ghi789"]
}
| Field | Type | Required | Description |
|---|---|---|---|
| isShared | boolean | No | Whether the model is shared with other users |
| sharedWith | string[] | No | List of user IDs to share with |
Response: 200 OK
{
"data": {
"owner": "user_abc123",
"sharedWith": ["user_def456", "user_ghi789"],
"isShared": true,
"organizationId": "org_abc123"
},
"meta": { "requestId": "req_abc123" }
}
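To add or remove a single user, a client would read the current permissions, modify the list, and PUT it back; this sketch assumes the PUT replaces sharedWith wholesale rather than merging it (the table above does not say which). Helper names are ours:

```python
def share_with(current: dict, user_id: str) -> dict:
    """Build a PUT body that adds user_id to the current sharedWith list."""
    # dict.fromkeys deduplicates while preserving order.
    shared = list(dict.fromkeys(current.get("sharedWith", []) + [user_id]))
    return {"isShared": True, "sharedWith": shared}

def unshare(current: dict, user_id: str) -> dict:
    """Build a PUT body that removes user_id; turns sharing off if none remain."""
    shared = [u for u in current.get("sharedWith", []) if u != user_id]
    return {"isShared": bool(shared), "sharedWith": shared}
```

Pair these with GET /api/v1/ai/models/:id/permissions to fetch the current state first; since only the owner can modify permissions, non-owners will get an authorization error on the PUT regardless.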