AI Models

Manage AI models registered in the Strongly AI Gateway. A model is either a third-party provider model (OpenAI, Anthropic, etc.) or a self-hosted model deployed on your Kubernetes cluster.


AIModel Object

{
  "id": "model_abc123",
  "name": "GPT-4 Production",
  "type": "third-party",
  "provider": "openai",
  "vendorModelId": "gpt-4",
  "modelType": "chat",
  "status": "active",
  "description": "GPT-4 for production workloads",
  "capabilities": ["chat", "function-calling"],
  "maxTokens": 8192,
  "contextWindow": 128000,
  "owner": "user_abc123",
  "organizationId": "org_abc123",
  "isShared": true,
  "sharedWith": ["user_def456"],
  "config": {
    "defaultTemperature": 0.7,
    "rateLimit": 100
  },
  "createdAt": "2025-01-15T10:00:00.000Z",
  "updatedAt": "2025-02-01T14:30:00.000Z"
}

GET /api/v1/ai/models

List AI models

Returns a paginated list of AI models accessible to the current user.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| search | query | string | No | Search by model name |
| type | query | string | No | Filter by type: third-party, self-hosted |
| status | query | string | No | Filter by status: active, deploying, stopped, failed |
| provider | query | string | No | Filter by provider: openai, anthropic, google, etc. |
| modelType | query | string | No | Filter by model type: chat, completion, embedding |
| limit | query | integer | No | Max results (default: 50, max: 200) |
| offset | query | integer | No | Pagination offset |
| sort | query | string | No | Sort field (default: -createdAt) |
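For example, a filtered, sorted request URL can be assembled from these query parameters. The base URL below is a placeholder; substitute your gateway host:

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute your gateway host.
BASE_URL = "https://gateway.example.com/api/v1/ai/models"

def build_list_url(**params):
    """Build a GET /api/v1/ai/models URL, dropping unset filters."""
    query = {k: v for k, v in params.items() if v is not None}
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL

url = build_list_url(type="self-hosted", status="active", limit=100, sort="-createdAt")
print(url)
```

Omitted filters are simply left out of the query string, so the same helper covers unfiltered listing as well.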

Response: 200 OK (paginated)

{
  "data": [
    {
      "id": "model_abc123",
      "name": "GPT-4 Production",
      "type": "third-party",
      "provider": "openai",
      "vendorModelId": "gpt-4",
      "modelType": "chat",
      "status": "active",
      "description": "GPT-4 for production workloads",
      "capabilities": ["chat", "function-calling"],
      "maxTokens": 8192,
      "contextWindow": 128000,
      "owner": "user_abc123",
      "organizationId": "org_abc123",
      "isShared": true,
      "createdAt": "2025-01-15T10:00:00.000Z",
      "updatedAt": "2025-02-01T14:30:00.000Z"
    }
  ],
  "meta": { "total": 12, "limit": 50, "offset": 0, "hasMore": false, "requestId": "req_abc123" }
}
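The meta block supports offset pagination. A client loop that follows hasMore might look like the sketch below, where fetch_page stands in for the authenticated HTTP call:

```python
def fetch_all_models(fetch_page, limit=50):
    """Collect every model by advancing offset until hasMore is false.

    fetch_page(limit, offset) stands in for an authenticated
    GET /api/v1/ai/models call and must return the parsed JSON body.
    """
    models, offset = [], 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        models.extend(page["data"])
        if not page["meta"]["hasMore"]:
            return models
        offset += limit

# Stubbed two-and-a-half-page backend for illustration:
def fake_fetch(limit, offset):
    items = [{"id": f"model_{i}"} for i in range(120)]
    chunk = items[offset:offset + limit]
    return {"data": chunk,
            "meta": {"total": len(items), "limit": limit,
                     "offset": offset, "hasMore": offset + limit < len(items)}}

print(len(fetch_all_models(fake_fetch)))  # 120
```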

GET /api/v1/ai/models/overview

Get model overview statistics

Returns aggregate counts of models by status and type.

Scope: ai-gateway:read

Response: 200 OK

{
  "data": {
    "total": 12,
    "active": 8,
    "deploying": 1,
    "stopped": 2,
    "failed": 1,
    "thirdParty": 9,
    "selfHosted": 3
  },
  "meta": { "requestId": "req_abc123" }
}

POST /api/v1/ai/models

Create a new AI model

Registers a new model in the AI Gateway. Third-party models become active immediately; self-hosted models require deployment.

Scope: ai-gateway:write

Request Body:

{
  "name": "GPT-4 Production",
  "type": "third-party",
  "provider": "openai",
  "vendorModelId": "gpt-4",
  "modelType": "chat",
  "description": "GPT-4 for production workloads",
  "capabilities": ["chat", "function-calling"],
  "maxTokens": 8192,
  "contextWindow": 128000,
  "config": {
    "defaultTemperature": 0.7,
    "rateLimit": 100
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable model name |
| type | string | Yes | Hosting type: third-party or self-hosted |
| provider | string | Yes | Provider: openai, anthropic, google, huggingface, etc. |
| vendorModelId | string | Yes | Provider's model identifier (e.g., gpt-4, claude-3-opus) |
| modelType | string | No | Capability type: chat, completion, embedding |
| description | string | No | Model description |
| capabilities | string[] | No | List of capabilities (e.g., chat, function-calling, vision) |
| maxTokens | integer | No | Maximum output tokens |
| contextWindow | integer | No | Maximum context window size in tokens |
| config | object | No | Additional provider-specific configuration |
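Because name, type, provider, and vendorModelId are required, a thin client can check the body before sending it. The validation below is illustrative, not part of the API; the server performs its own checks:

```python
REQUIRED_FIELDS = ("name", "type", "provider", "vendorModelId")
VALID_TYPES = ("third-party", "self-hosted")

def validate_create_body(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body looks sendable."""
    errors = [f"missing required field: {f}" for f in REQUIRED_FIELDS if not body.get(f)]
    if body.get("type") and body["type"] not in VALID_TYPES:
        errors.append(f"type must be one of {VALID_TYPES}")
    return errors

body = {"name": "GPT-4 Production", "type": "third-party",
        "provider": "openai", "vendorModelId": "gpt-4"}
print(validate_create_body(body))  # []
```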

Response: 201 Created

{
  "data": {
    "id": "model_abc123",
    "name": "GPT-4 Production",
    "type": "third-party",
    "provider": "openai",
    "vendorModelId": "gpt-4",
    "modelType": "chat",
    "status": "active",
    "owner": "user_abc123",
    "organizationId": "org_abc123",
    "createdAt": "2025-02-07T10:00:00.000Z",
    "updatedAt": "2025-02-07T10:00:00.000Z"
  },
  "meta": { "requestId": "req_abc123" }
}

GET /api/v1/ai/models/:id

Get an AI model

Returns the full details of a single model.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "id": "model_abc123",
    "name": "GPT-4 Production",
    "type": "third-party",
    "provider": "openai",
    "vendorModelId": "gpt-4",
    "modelType": "chat",
    "status": "active",
    "description": "GPT-4 for production workloads",
    "capabilities": ["chat", "function-calling"],
    "maxTokens": 8192,
    "contextWindow": 128000,
    "owner": "user_abc123",
    "organizationId": "org_abc123",
    "isShared": true,
    "sharedWith": ["user_def456"],
    "config": {
      "defaultTemperature": 0.7,
      "rateLimit": 100
    },
    "createdAt": "2025-01-15T10:00:00.000Z",
    "updatedAt": "2025-02-01T14:30:00.000Z"
  },
  "meta": { "requestId": "req_abc123" }
}

PUT /api/v1/ai/models/:id

Update an AI model

Updates model properties. Only provided fields are changed.

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Request Body:

{
  "name": "GPT-4 Production (Updated)",
  "description": "Updated description",
  "maxTokens": 4096,
  "config": {
    "defaultTemperature": 0.5
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Updated model name |
| description | string | No | Updated description |
| capabilities | string[] | No | Updated capabilities list |
| maxTokens | integer | No | Updated max output tokens |
| contextWindow | integer | No | Updated context window size |
| config | object | No | Updated configuration (merged with existing) |
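The table notes that config is merged with the existing object rather than replaced. Assuming a shallow merge (the exact semantics are the server's; this is an illustration), an update behaves like:

```python
def merge_config(existing: dict, update: dict) -> dict:
    """Shallow merge: keys present in the update override, all others are kept."""
    merged = dict(existing)
    merged.update(update)
    return merged

existing = {"defaultTemperature": 0.7, "rateLimit": 100}
print(merge_config(existing, {"defaultTemperature": 0.5}))
# rateLimit survives because only the provided key changes
```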

Response: 200 OK

{
  "data": {
    "id": "model_abc123",
    "name": "GPT-4 Production (Updated)",
    "updatedAt": "2025-02-07T12:00:00.000Z"
  },
  "meta": { "requestId": "req_abc123" }
}

DELETE /api/v1/ai/models/:id

Delete an AI model

Permanently removes a model from the AI Gateway. Self-hosted models are undeployed first.

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 204 No Content


POST /api/v1/ai/models/:id/deploy

Deploy a self-hosted model

Deploys a self-hosted model to the Kubernetes cluster. Only applicable to models with type: "self-hosted".

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "id": "model_abc123",
    "status": "deploying",
    "message": "Deployment initiated"
  },
  "meta": { "requestId": "req_abc123" }
}
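Deployment is asynchronous: this response reports deploying, and the model reaches active (or failed) later. A polling sketch against GET /api/v1/ai/models/:id/status, with get_status standing in for the HTTP call:

```python
import time

def wait_until_deployed(get_status, poll_seconds=5, max_polls=60):
    """Poll the status endpoint until the model leaves the 'deploying' state."""
    for _ in range(max_polls):
        status = get_status()["data"]["status"]
        if status != "deploying":
            return status  # e.g. "active", "failed", or "stopped"
        time.sleep(poll_seconds)
    raise TimeoutError("model did not finish deploying")

# Stub that becomes active on the third poll:
responses = iter(["deploying", "deploying", "active"])
print(wait_until_deployed(lambda: {"data": {"status": next(responses)}},
                          poll_seconds=0))  # active
```

In practice you would also cap total wait time and surface the failure reason from the status endpoint's conditions list.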

POST /api/v1/ai/models/:id/start

Start a stopped model

Starts a previously stopped self-hosted model.

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "id": "model_abc123",
    "status": "deploying",
    "message": "Model starting"
  },
  "meta": { "requestId": "req_abc123" }
}

POST /api/v1/ai/models/:id/stop

Stop a running model

Stops a running self-hosted model, freeing cluster resources.

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "id": "model_abc123",
    "status": "stopped",
    "message": "Model stopped"
  },
  "meta": { "requestId": "req_abc123" }
}

GET /api/v1/ai/models/:id/status

Get model deployment status

Returns the current deployment status and replica information for a self-hosted model.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "status": "active",
    "replicas": 2,
    "readyReplicas": 2,
    "conditions": [
      {
        "type": "Available",
        "status": "True",
        "lastTransitionTime": "2025-02-07T10:00:00.000Z",
        "reason": "MinimumReplicasAvailable",
        "message": "Deployment has minimum availability"
      }
    ]
  },
  "meta": { "requestId": "req_abc123" }
}

GET /api/v1/ai/models/:id/metrics

Get model metrics

Returns performance and usage metrics for a model.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "totalRequests": 15420,
    "totalTokens": 2500000,
    "avgLatencyMs": 450,
    "errorRate": 0.02,
    "requestsPerMinute": 12.5,
    "uptimePercent": 99.8
  },
  "meta": { "requestId": "req_abc123" }
}
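Assuming errorRate is a fraction (0.02 = 2%), derived figures follow directly from these fields:

```python
metrics = {"totalRequests": 15420, "totalTokens": 2500000,
           "avgLatencyMs": 450, "errorRate": 0.02}

# Approximate failed-request count and average tokens per request.
failed = round(metrics["totalRequests"] * metrics["errorRate"])
tokens_per_request = metrics["totalTokens"] / metrics["totalRequests"]
print(failed, round(tokens_per_request, 1))
```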

GET /api/v1/ai/models/:id/logs

Get model logs

Returns container logs for a self-hosted model deployment.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |
| lines | query | integer | No | Number of log lines to return (default: 100) |
| since | query | string | No | Return logs since this time (ISO 8601 timestamp or a duration such as 1h, 30m) |
| container | query | string | No | Container name (for multi-container pods) |
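The since parameter accepts either an ISO 8601 timestamp or a shorthand duration such as 1h or 30m. A client helper that converts the shorthand to seconds might look like this; the exact unit set the gateway accepts is an assumption:

```python
import re

# Assumed unit suffixes; the gateway may accept others.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def duration_to_seconds(value: str) -> int:
    """Convert '30m'-style shorthand to seconds; raise on anything else."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if not match:
        raise ValueError(f"not a duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]

print(duration_to_seconds("1h"), duration_to_seconds("30m"))  # 3600 1800
```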

Response: 200 OK

{
  "data": {
    "logs": "2025-02-07T10:00:00Z INFO Model loaded successfully\n2025-02-07T10:00:01Z INFO Ready to serve requests on port 8080\n",
    "container": "model-server",
    "lines": 100
  },
  "meta": { "requestId": "req_abc123" }
}

GET /api/v1/ai/models/:id/permissions

Get model permissions

Returns sharing and ownership information for a model.

Scope: ai-gateway:read

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Response: 200 OK

{
  "data": {
    "owner": "user_abc123",
    "sharedWith": ["user_def456", "user_ghi789"],
    "isShared": true,
    "organizationId": "org_abc123"
  },
  "meta": { "requestId": "req_abc123" }
}

PUT /api/v1/ai/models/:id/permissions

Update model permissions

Updates sharing settings for a model. Only the model owner can modify permissions.

Scope: ai-gateway:write

Parameters:

| Name | In | Type | Required | Description |
| --- | --- | --- | --- | --- |
| id | path | string | Yes | Model ID |

Request Body:

{
  "isShared": true,
  "sharedWith": ["user_def456", "user_ghi789"]
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| isShared | boolean | No | Whether the model is shared with other users |
| sharedWith | string[] | No | List of user IDs to share with |
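A small client-side helper (not part of the API) can build the request body for granting one more user access, keeping sharedWith free of duplicates and isShared consistent:

```python
def share_with(permissions: dict, user_id: str) -> dict:
    """Build a PUT /permissions body that grants user_id access."""
    # dict.fromkeys deduplicates while preserving order.
    shared = list(dict.fromkeys(permissions.get("sharedWith", []) + [user_id]))
    return {"isShared": True, "sharedWith": shared}

current = {"isShared": True, "sharedWith": ["user_def456"]}
print(share_with(current, "user_ghi789"))
# {'isShared': True, 'sharedWith': ['user_def456', 'user_ghi789']}
```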

Response: 200 OK

{
"data": {
"owner": "user_abc123",
"sharedWith": ["user_def456", "user_ghi789"],
"isShared": true,
"organizationId": "org_abc123"
},
"meta": { "requestId": "req_abc123" }
}