AI Gateway

Access LLMs through Strongly's AI Gateway for completions, chat, streaming, and embeddings.

Overview

The AI Gateway provides:

  • Unified API for multiple LLM providers
  • Text completions and chat conversations
  • Streaming responses
  • Text embeddings for similarity search
  • Automatic model routing and fallbacks

Basic Usage

from strongly_python import gateway

# Simple completion
response = gateway.complete("Explain machine learning in one sentence:")
print(response.content)

# Chat conversation
chat = gateway.Chat(model="gpt-4o-mini")
chat.add_system("You are a helpful assistant.")
response = chat.send("Hello!")
print(response.content)

Text Completions

Generate text from a prompt:

from strongly_python import gateway

# Basic completion
response = gateway.complete("What is Python?")
print(response.content)

# With options
response = gateway.complete(
    prompt="Explain machine learning:",
    model="gpt-4o-mini",
    max_tokens=100
)
print(response.content)
print(f"Tokens used: {response.usage.total_tokens}")

Chat Conversations

Build multi-turn conversations:

from strongly_python import gateway

# Create chat session
chat = gateway.Chat(model="gpt-4o-mini")

# Add system prompt
chat.add_system("You are a helpful AI assistant specialized in data science.")

# Send messages and get responses
response1 = chat.send("What is supervised learning?")
print(f"Assistant: {response1.content}")

# Continue conversation (context is maintained)
response2 = chat.send("Give me an example.")
print(f"Assistant: {response2.content}")

Manual Chat Messages

For more control, use the direct chat function:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
    {"role": "user", "content": "What's the weather like?"}
]

response = gateway.chat(messages, model="gpt-4o-mini")
print(response.content)
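When you manage the message list yourself, each assistant reply must be appended back into the list before the next turn, or the model loses context. A minimal sketch of that bookkeeping (the `append_turn` helper is ours for illustration, not part of the SDK):

```python
def append_turn(messages, user_text, assistant_text):
    """Record one completed exchange so the next chat call sees the full history."""
    messages.append({"role": "user", "content": user_text})
    messages.append({"role": "assistant", "content": assistant_text})
    return messages

history = [{"role": "system", "content": "You are a helpful assistant."}]
append_turn(history, "Hello!", "Hi there! How can I help?")
print(len(history))  # 3: system + user + assistant
```

After each `gateway.chat(history, ...)` call, pass the user prompt and `response.content` to the helper before sending the next message.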

Streaming Responses

Stream tokens as they're generated:

from strongly_python import gateway

print("Response: ", end="", flush=True)
for chunk in gateway.stream(
    prompt="Write a haiku about Python programming:",
    model="gpt-4o-mini"
):
    print(chunk.content, end="", flush=True)
print()  # Newline at end
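If you also need the complete text after streaming (for logging or caching), accumulate the chunks as you print them. A sketch of the pattern, shown with a stand-in list in place of the chunk contents yielded by `gateway.stream`:

```python
def collect_stream(chunk_contents):
    """Print tokens as they arrive and return the assembled text."""
    parts = []
    for content in chunk_contents:
        print(content, end="", flush=True)
        parts.append(content)
    print()
    return "".join(parts)

# Stand-in for (chunk.content for chunk in gateway.stream(...))
full_text = collect_stream(["Code ", "flows ", "like water"])
```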

Text Embeddings

Generate embeddings for similarity search and RAG applications:

from strongly_python import gateway

# Generate embeddings
texts = [
    "Machine learning is a subset of AI.",
    "Deep learning uses neural networks.",
    "The weather today is sunny."
]

response = gateway.embed(texts)

print(f"Generated {len(response.embeddings)} embeddings")
print(f"Embedding dimension: {len(response.embeddings[0])}")

Similarity Search Example

import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Get embeddings
texts = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with many layers.",
    "The weather today is sunny and warm."
]
response = gateway.embed(texts)

# Compare similarity
sim_01 = cosine_similarity(response.embeddings[0], response.embeddings[1])
sim_02 = cosine_similarity(response.embeddings[0], response.embeddings[2])

print(f"ML vs Deep Learning: {sim_01:.4f}") # High similarity
print(f"ML vs Weather: {sim_02:.4f}") # Low similarity
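For RAG, the usual next step is retrieval: rank document embeddings by similarity to a query embedding and keep the top matches. A self-contained sketch of that ranking, using toy 3-dimensional vectors in place of real `gateway.embed` output (the `top_k` helper is ours, not an SDK function):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy vectors standing in for response.embeddings
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs, k=2))  # → [0, 1]
```

With real embeddings, `docs` would come from `gateway.embed(texts).embeddings` and `query` from embedding the user's question.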

Available Models

List models available in your workspace:

from strongly_python import gateway

models = gateway.list_models()
for model in models[:10]:
    print(f" - {model.id}: {model.name}")

# Get specific model info
model = gateway.get_model("gpt-4o-mini")
print(f"Model: {model.name}")
print(f"Provider: {model.provider}")
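Since available models vary by workspace, a defensive pattern is to pick the first preferred model that is actually exposed. A minimal sketch (the `pick_model` helper and the model IDs are illustrative, not part of the SDK):

```python
def pick_model(available_ids, preferred):
    """Return the first preferred model ID present in the workspace."""
    for model_id in preferred:
        if model_id in available_ids:
            return model_id
    raise ValueError("none of the preferred models are available")

# Stand-in for [m.id for m in gateway.list_models()]
ids = ["gpt-4o-mini", "gpt-4o"]
print(pick_model(ids, ["gpt-4o", "gpt-4o-mini"]))  # → gpt-4o
```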

Complete Example

from strongly_python import gateway
import math

def basic_completion():
    """Simple text completion"""
    print("--- Basic Completion ---")
    response = gateway.complete(
        prompt="Explain machine learning in one sentence:",
        model="gpt-4o-mini",
        max_tokens=100
    )
    print(response.content)
    print(f"Tokens used: {response.usage.total_tokens}\n")

def chat_conversation():
    """Multi-turn chat conversation"""
    print("--- Chat Conversation ---")

    chat = gateway.Chat(model="gpt-4o-mini")
    chat.add_system("You are a helpful AI assistant specialized in data science.")

    response1 = chat.send("What is the difference between supervised and unsupervised learning?")
    print(f"Assistant: {response1.content[:200]}...\n")

    response2 = chat.send("Give me an example of each.")
    print(f"Assistant: {response2.content[:200]}...\n")

def streaming_response():
    """Stream tokens as they're generated"""
    print("--- Streaming Response ---")

    print("Response: ", end="", flush=True)
    for chunk in gateway.stream(
        prompt="Write a haiku about Python programming:",
        model="gpt-4o-mini"
    ):
        print(chunk.content, end="", flush=True)
    print("\n")

def text_embeddings():
    """Generate text embeddings and compare their similarity"""
    print("--- Text Embeddings ---")

    texts = [
        "Machine learning is a subset of artificial intelligence.",
        "Deep learning uses neural networks with many layers.",
        "The weather today is sunny and warm."
    ]

    response = gateway.embed(texts)

    print(f"Generated {len(response.embeddings)} embeddings")
    print(f"Embedding dimension: {len(response.embeddings[0])}")

    def cosine_similarity(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    sim_01 = cosine_similarity(response.embeddings[0], response.embeddings[1])
    sim_02 = cosine_similarity(response.embeddings[0], response.embeddings[2])

    print(f"Similarity (text 0 vs 1): {sim_01:.4f}")
    print(f"Similarity (text 0 vs 2): {sim_02:.4f}")

def list_available_models():
    """List models available in your workspace"""
    print("\n--- Available Models ---")
    models = gateway.list_models()
    for model in models[:10]:
        print(f" - {model.id}: {model.name}")

def main():
    basic_completion()
    chat_conversation()
    streaming_response()
    text_embeddings()
    list_available_models()

if __name__ == "__main__":
    main()

API Reference

Quick Functions

Function                                      Description
gateway.complete(prompt, model, max_tokens)   Text completion
gateway.chat(messages, model)                 Chat with message list
gateway.stream(prompt, model)                 Streaming completion
gateway.embed(texts)                          Generate embeddings

Chat Session

chat = gateway.Chat(model="gpt-4o-mini")
chat.add_system("system prompt")       # Add system message
chat.add_user("user message")          # Add user message
chat.add_assistant("assistant reply")  # Add assistant message
response = chat.send("new message")    # Send and get response

Model Discovery

Function                 Description
gateway.list_models()    List available models
gateway.get_model(name)  Get model details

Response Objects

Completion Response

response.content                   # Generated text
response.usage.total_tokens        # Total tokens used
response.usage.prompt_tokens       # Prompt tokens
response.usage.completion_tokens   # Generated tokens
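The usage fields make it easy to track spend per request. A sketch of a cost estimate from those fields (the per-1K-token rates below are placeholders, not real pricing; the helper is ours, not part of the SDK):

```python
def estimate_cost(prompt_tokens, completion_tokens, in_rate, out_rate):
    """Dollar cost given per-1K-token input and output rates (illustrative only)."""
    return (prompt_tokens / 1000) * in_rate + (completion_tokens / 1000) * out_rate

# e.g. response.usage.prompt_tokens=120, response.usage.completion_tokens=80
cost = estimate_cost(120, 80, in_rate=0.15, out_rate=0.60)
print(f"${cost:.4f}")  # → $0.0660
```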

Embedding Response

response.embeddings    # List of embedding vectors