API Reference

Complete API documentation for all Vel classes and functions.

Agent

Main orchestrator class for running agents with LLM providers.

Constructor

Agent(
    id: str,
    model: Dict[str, Any],
    prompt_env: str = 'prod',
    tools: List[str] | None = None,
    policies: Dict[str, Any] | None = None,
    context_manager: Optional[ContextManager] = None,
    session_storage: Literal['memory', 'database'] = 'memory',
    generation_config: Optional[Dict[str, Any]] = None
)

Parameters:

id (required)

  • Type: str
  • Agent identifier (unique name for this agent)
  • Example: 'my-agent', 'chat-general:v1'

model (required)

  • Type: Dict[str, Any]
  • Model configuration with provider, model, and optional api_key keys
  • Examples:
    • {'provider': 'openai', 'model': 'gpt-4o'} - Uses OPENAI_API_KEY env var
    • {'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-...'} - Uses explicit API key
  • Supported providers: 'openai', 'google', 'anthropic'
  • API Key Handling:
    • If api_key is provided, it overrides the environment variable for this agent instance
    • If api_key is not provided, falls back to environment variable (OPENAI_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY)
    • This enables multi-tenant applications where different agents use different API keys, as in the sketch below
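
A minimal multi-tenant sketch (keys elided; each agent carries its own key, overriding OPENAI_API_KEY):

from vel import Agent

# Hypothetical tenant IDs; each instance uses its own tenant's key
tenant_a = Agent(
    id='tenant-a',
    model={'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-...'}
)
tenant_b = Agent(
    id='tenant-b',
    model={'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-...'}
)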

prompt_env

  • Type: str
  • Default: 'prod'
  • Environment for prompts (for future prompt versioning)

tools

  • Type: List[str] | None
  • Default: None
  • List of tool names to enable
  • Example: ['get_weather', 'search']
  • Tools must be registered before creating the agent

policies

  • Type: Dict[str, Any] | None
  • Default: {'max_steps': 24, 'retry': {'attempts': 2}}
  • Execution policies controlling agent behavior
  • Available policies:
    • max_steps: Maximum tool calls per run (default: 24)
    • retry: Retry configuration (future feature)

context_manager

  • Type: Optional[ContextManager]
  • Default: None (uses ContextManager())
  • Custom context manager instance for conversation memory
  • Options:
    • None: Default full memory
    • ContextManager(): Full memory (explicit)
    • ContextManager(max_history=10): Limited history
    • StatelessContextManager(): No memory
    • Custom subclass

session_storage

  • Type: Literal['memory', 'database']
  • Default: 'memory'
  • Where to persist session context
  • Options:
    • 'memory': In-memory only (fast, not persistent)
    • 'database': Postgres-backed (persistent, requires POSTGRES_DSN; see the sketch below)
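
A sketch of a persistent agent; it assumes POSTGRES_DSN is already set in the environment:

# Sessions survive restarts; requires POSTGRES_DSN,
# e.g. postgresql://user:pass@localhost:5432/vel (illustrative DSN)
agent = Agent(
    id='persistent-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    session_storage='database'
)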

generation_config

  • Type: Optional[Dict[str, Any]]
  • Default: None
  • Model generation parameters (temperature, max_tokens, etc.)
  • Common parameters:
    • temperature: float (0-2) - Sampling temperature
    • max_tokens: int - Maximum output tokens
    • top_p: float (0-1) - Nucleus sampling
    • top_k: int - Top-K sampling (Gemini, Anthropic)
    • presence_penalty: float (-2 to 2) - Penalize new tokens (OpenAI)
    • frequency_penalty: float (-2 to 2) - Penalize repeated tokens (OpenAI)
    • stop: List[str] - Stop sequences
    • seed: int - Reproducibility seed (OpenAI, Anthropic)
  • Can be overridden per run via the generation_config parameter of run() or run_stream()

Example:

from vel import Agent, ContextManager

# Basic agent
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'}
)

# Agent with tools and limited memory
agent = Agent(
    id='my-agent',
    model={'provider': 'google', 'model': 'gemini-1.5-pro'},
    tools=['get_weather', 'search'],
    context_manager=ContextManager(max_history=20),
    session_storage='database',
    policies={'max_steps': 10}
)
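
Generation parameters can also be fixed at construction time; a sketch with illustrative values:

# Agent with default generation parameters (overridable per run)
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    generation_config={'temperature': 0.7, 'max_tokens': 512}
)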

Methods

run()

Non-streaming execution - returns final answer only.

async def run(
    input: Dict[str, Any],
    session_id: Optional[str] = None,
    generation_config: Optional[Dict[str, Any]] = None
) -> str

Parameters:

input (required)

  • Type: Dict[str, Any]
  • Input dictionary with 'message' field
  • Example: {'message': 'Hello'}

session_id

  • Type: Optional[str]
  • Default: None
  • Session ID for multi-turn conversations
  • If provided, context persists across calls
  • Example: 'user-123', 'conv-abc'

generation_config

  • Type: Optional[Dict[str, Any]]
  • Default: None
  • Per-run generation config that overrides agent-level config
  • Example: {'temperature': 0, 'max_tokens': 100}
  • See Agent constructor for supported parameters

Returns:

  • Type: str
  • Final answer from the agent

Raises:

  • RuntimeError: If max_steps is exceeded
  • Exception: On LLM or tool errors

Example:

# Single-turn
answer = await agent.run({'message': 'What is 2+2?'})
print(answer)  # "4"

# Multi-turn with sessions
session_id = 'user-123'
answer1 = await agent.run({'message': 'My name is Alice'}, session_id=session_id)
answer2 = await agent.run({'message': 'What is my name?'}, session_id=session_id)
print(answer2)  # "Your name is Alice"

# With per-run generation config override
answer3 = await agent.run(
    {'message': 'Explain quantum computing'},
    generation_config={'temperature': 0, 'max_tokens': 100}  # Deterministic, brief
)
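
A hedged sketch of guarding against the RuntimeError raised when max_steps is exceeded:

# Tool loops that exceed max_steps surface as RuntimeError (see Raises above)
try:
    answer = await agent.run({'message': 'Plan a multi-step analysis'})
except RuntimeError as e:
    print(f'Run aborted: {e}')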

run_stream()

Streaming execution - yields stream protocol events in real-time.

async def run_stream(
    input: Dict[str, Any],
    session_id: Optional[str] = None,
    generation_config: Optional[Dict[str, Any]] = None
) -> AsyncGenerator[Dict[str, Any], None]

Parameters:

input (required)

  • Type: Dict[str, Any]
  • Input dictionary with 'message' field
  • Example: {'message': 'Tell me a story'}

session_id

  • Type: Optional[str]
  • Default: None
  • Session ID for multi-turn conversations
  • Context persists across calls if provided

generation_config

  • Type: Optional[Dict[str, Any]]
  • Default: None
  • Per-run generation config that overrides agent-level config
  • Example: {'temperature': 0.9, 'max_tokens': 1000}
  • See Agent constructor for supported parameters

Yields:

  • Type: Dict[str, Any]
  • Stream protocol events as they occur
  • Event types: text-delta, tool-input-available, finish-message, etc.

Example:

# Stream text to console
async for event in agent.run_stream({'message': 'Tell me a joke'}):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)
    elif event['type'] == 'finish-message':
        print()  # Newline
        break

# With sessions
session_id = 'user-123'
async for event in agent.run_stream({'message': 'My name is Bob'}, session_id=session_id):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)

# With per-run generation config override
async for event in agent.run_stream(
    {'message': 'Write a haiku'},
    generation_config={'temperature': 0.9, 'max_tokens': 50}  # Creative, brief
):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)

ContextManager

Manages conversation history with configurable memory behavior.

Constructor

ContextManager(
    max_history: Optional[int] = None,
    summarize: bool = False
)

Parameters:

max_history

  • Type: Optional[int]
  • Default: None (unlimited)
  • Maximum number of messages to retain
  • Implements sliding window (keeps last N messages)
  • Example: max_history=10 keeps last 10 messages (~5 turns)

summarize

  • Type: bool
  • Default: False
  • Whether to summarize old messages (future feature, not implemented)

Example:

from vel import Agent, ContextManager

# Full memory (default)
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=ContextManager()
)

# Limited memory (last 20 messages)
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=ContextManager(max_history=20)
)
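
To verify the window, inspect the retained context with get_session_context() (documented below); with max_history=20 above, at most the last 20 messages survive:

# Returns [] if the session has no history yet
context = agent.ctxmgr.get_session_context('user-123')
assert len(context) <= 20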

Methods

get_session_context()

Get all messages for a session.

def get_session_context(session_id: str) -> List[Dict[str, Any]]

Parameters:

  • session_id: Session identifier

Returns:

  • List of message dictionaries with role and content fields

Example:

context = agent.ctxmgr.get_session_context('user-123')
print(context)
# [
#   {'role': 'user', 'content': 'Hello'},
#   {'role': 'assistant', 'content': 'Hi there!'}
# ]

set_session_context()

Set messages for a session (used when loading from storage).

def set_session_context(session_id: str, messages: List[Dict[str, Any]])

Parameters:

  • session_id: Session identifier
  • messages: List of message dictionaries

Example:

messages = [
    {'role': 'user', 'content': 'Hello'},
    {'role': 'assistant', 'content': 'Hi there!'}
]
agent.ctxmgr.set_session_context('user-123', messages)

clear_session()

Clear all messages for a session.

def clear_session(session_id: str)

Parameters:

  • session_id: Session identifier

Example:

agent.ctxmgr.clear_session('user-123')

StatelessContextManager

Stateless context manager - no memory between calls.

Constructor

StatelessContextManager()

No parameters. Each call is completely independent.

Example:

from vel import Agent, StatelessContextManager

agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=StatelessContextManager()
)

# First call
answer1 = await agent.run({'message': 'My name is Alice'}, session_id='user-1')

# Second call - does NOT remember Alice
answer2 = await agent.run({'message': 'What is my name?'}, session_id='user-1')
# Agent will say it doesn't know

ToolSpec

Specification for a custom tool with JSON schema validation.

Constructor

ToolSpec(
    name: str,
    input_schema: Dict[str, Any],
    output_schema: Dict[str, Any],
    handler: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]
)

Parameters:

name (required)

  • Type: str
  • Unique tool identifier
  • Convention: lowercase_with_underscores
  • Example: 'get_weather', 'search_products'

input_schema (required)

  • Type: Dict[str, Any]
  • JSON Schema (Draft 2020-12) for input validation
  • Must include type, properties, required
  • Validated before calling handler

output_schema (required)

  • Type: Dict[str, Any]
  • JSON Schema for output validation
  • Validates handler return value
  • Ensures every handler returns a consistent, predictable shape

handler (required)

  • Type: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]
  • Function that executes tool logic
  • Signature: (input: dict, ctx: dict) -> dict
  • Can be sync or async (auto-detected)

Example:

from vel import ToolSpec, register_tool

def get_weather_handler(input: dict, ctx: dict) -> dict:
    city = input['city']
    # Your logic here
    return {
        'temp_f': 72,
        'condition': 'sunny',
        'city': city
    }

weather_tool = ToolSpec(
    name='get_weather',
    input_schema={
        'type': 'object',
        'properties': {
            'city': {'type': 'string', 'description': 'City name'}
        },
        'required': ['city']
    },
    output_schema={
        'type': 'object',
        'properties': {
            'temp_f': {'type': 'number'},
            'condition': {'type': 'string'},
            'city': {'type': 'string'}
        },
        'required': ['temp_f', 'condition', 'city']
    },
    handler=get_weather_handler
)

register_tool(weather_tool)
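
Handlers may also be async (auto-detected, per the handler parameter above); a hypothetical async variant, with the sleep standing in for real I/O:

import asyncio

async def get_weather_async(input: dict, ctx: dict) -> dict:
    await asyncio.sleep(0.1)  # placeholder for an HTTP call to a weather API
    return {'temp_f': 72, 'condition': 'sunny', 'city': input['city']}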

Methods

run()

Execute the tool handler with validation.

async def run(input: Dict[str, Any], ctx: Dict[str, Any]) -> Dict[str, Any]

Parameters:

  • input: Tool input (validated against input_schema)
  • ctx: Runtime context with run_id, session_id, agent_id

Returns:

  • Tool output (validated against output_schema)

Note: Usually called internally by Agent, not directly.
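
For ad-hoc testing you can still invoke it yourself; a sketch using the weather_tool defined above (the ctx values are illustrative):

import asyncio

ctx = {'run_id': 'test-run', 'session_id': 'test-session', 'agent_id': 'test-agent'}
output = asyncio.run(weather_tool.run({'city': 'Berlin'}, ctx))
print(output)  # e.g. {'temp_f': 72, 'condition': 'sunny', 'city': 'Berlin'}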


Providers

BaseProvider

Abstract base class for LLM providers.

class BaseProvider(ABC):
    name: str

    @abstractmethod
    async def stream(
        self,
        messages: List[LLMMessage],
        model: str,
        tools: Dict[str, Any]
    ) -> AsyncGenerator[StreamEvent, None]:
        ...

    @abstractmethod
    async def generate(
        self,
        messages: List[LLMMessage],
        model: str,
        tools: Dict[str, Any]
    ) -> Dict[str, Any]:
        ...

Note: Implement this to create custom providers.
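
A minimal sketch of a custom provider; it assumes BaseProvider is in scope (the import path is not documented here) and that events are plain dicts matching the Stream Events table below. The generate() return shape is an assumption, not a documented contract:

# Hypothetical echo provider for offline testing
class EchoProvider(BaseProvider):
    name = 'echo'

    async def stream(self, messages, model, tools):
        text = messages[-1]['content']
        yield {'type': 'text-start', 'id': 'blk-1'}
        yield {'type': 'text-delta', 'id': 'blk-1', 'delta': text}
        yield {'type': 'text-end', 'id': 'blk-1'}
        yield {'type': 'finish-message', 'finishReason': 'stop'}

    async def generate(self, messages, model, tools):
        # Assumed return shape; adapt to the real contract
        return {'text': messages[-1]['content']}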


OpenAIProvider

OpenAI provider implementation.

Environment Variables:

  • OPENAI_API_KEY (required)
  • OPENAI_API_BASE (optional, default: https://api.openai.com/v1)

Supported Models:

  • gpt-4o
  • gpt-4-turbo
  • gpt-4
  • gpt-3.5-turbo

Example:

# Set in .env
OPENAI_API_KEY=sk-...
OPENAI_API_BASE=https://api.openai.com/v1  # Optional

# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'}
)

GeminiProvider

Google Gemini provider implementation.

Environment Variables:

  • GOOGLE_API_KEY (required)

Supported Models:

  • gemini-1.5-pro
  • gemini-1.5-flash
  • gemini-pro

Example:

# Set in .env
GOOGLE_API_KEY=AIza...

# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'google', 'model': 'gemini-1.5-pro'}
)

AnthropicProvider

Anthropic Claude provider implementation.

Environment Variables:

  • ANTHROPIC_API_KEY (required)
  • ANTHROPIC_API_BASE (optional, default: https://api.anthropic.com)

Supported Models:

  • claude-opus-4-20250514
  • claude-sonnet-4-20250514
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • claude-3-opus-20240229
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

Example:

# Set in .env
ANTHROPIC_API_KEY=sk-ant-...

# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'anthropic', 'model': 'claude-sonnet-4-20250514'}
)

Stream Events

All stream events have a type field and extend the base StreamEvent class.

Event Types

Each event type and its fields:

  • start: messageId (optional)
  • text-start: id (block ID)
  • text-delta: id, delta (text chunk)
  • text-end: id
  • tool-input-start: toolCallId, toolName
  • tool-input-delta: toolCallId, inputTextDelta
  • tool-input-available: toolCallId, toolName, input (object)
  • tool-output-available: toolCallId, output (any)
  • finish-message: finishReason
  • error: error (string)

See Stream Protocol for complete event documentation.
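
A consumer sketch dispatching on these event types (field access follows the list above):

async for event in agent.run_stream({'message': 'Weather in Paris?'}):
    t = event['type']
    if t == 'text-delta':
        print(event['delta'], end='', flush=True)
    elif t == 'tool-input-available':
        print(f"\n[calling {event['toolName']} with {event['input']}]")
    elif t == 'tool-output-available':
        print(f"[{event['toolCallId']} -> {event['output']}]")
    elif t == 'error':
        print(f"\n[error: {event['error']}]")
    elif t == 'finish-message':
        print(f"\n[done: {event['finishReason']}]")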


Helper Functions

register_tool()

Register a tool in the global registry.

def register_tool(tool: ToolSpec)

Parameters:

  • tool: ToolSpec instance

Example:

from vel import ToolSpec, register_tool

tool = ToolSpec(
    name='my_tool',
    input_schema={...},
    output_schema={...},
    handler=my_handler
)

register_tool(tool)

validate_io()

Validate data against a JSON schema.

def validate_io(schema: Dict[str, Any], value: Dict[str, Any])

Parameters:

  • schema: JSON Schema dictionary
  • value: Data to validate

Raises:

  • jsonschema.exceptions.ValidationError: If validation fails

Example:

from vel import validate_io

schema = {
    'type': 'object',
    'properties': {'count': {'type': 'number'}},
    'required': ['count']
}

validate_io(schema, {'count': 5})  # OK
validate_io(schema, {'count': 'five'})  # Raises ValidationError
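
Validation failures raise jsonschema's ValidationError, so callers can catch it explicitly:

from jsonschema.exceptions import ValidationError

try:
    validate_io(schema, {'count': 'five'})
except ValidationError as e:
    print(f'Invalid payload: {e.message}')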

Type Definitions

LLMMessage

Message format for LLM providers.

LLMMessage = Dict[str, Any]

# Structure:
{
    'role': 'user' | 'assistant' | 'system',
    'content': str
}
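
For example, a short conversation in this format:

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Hello'},
    {'role': 'assistant', 'content': 'Hi there!'}
]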

Default Values

Agent Defaults

{
    'prompt_env': 'prod',
    'tools': None,
    'policies': {'max_steps': 24, 'retry': {'attempts': 2}},
    'context_manager': ContextManager(),
    'session_storage': 'memory'
}

ContextManager Defaults

{
    'max_history': None,  # Unlimited
    'summarize': False
}

Policy Defaults

{
    'max_steps': 24,
    'retry': {'attempts': 2}  # Future feature
}

Complete Example

import asyncio
from dotenv import load_dotenv
from vel import Agent, ContextManager, ToolSpec, register_tool

load_dotenv()

# Create custom tool
def calculate_handler(input: dict, ctx: dict) -> dict:
    a = input['a']
    b = input['b']
    op = input['operation']

    if op == 'add':
        result = a + b
    elif op == 'multiply':
        result = a * b
    else:
        result = 0

    return {'result': result}

calc_tool = ToolSpec(
    name='calculate',
    input_schema={
        'type': 'object',
        'properties': {
            'a': {'type': 'number'},
            'b': {'type': 'number'},
            'operation': {'type': 'string', 'enum': ['add', 'multiply']}
        },
        'required': ['a', 'b', 'operation']
    },
    output_schema={
        'type': 'object',
        'properties': {'result': {'type': 'number'}},
        'required': ['result']
    },
    handler=calculate_handler
)

register_tool(calc_tool)

# Create agent
agent = Agent(
    id='calculator-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    tools=['calculate'],
    context_manager=ContextManager(max_history=10),
    session_storage='memory',
    policies={'max_steps': 5}
)

async def main():
    session_id = 'session-1'

    # Turn 1
    answer1 = await agent.run(
        {'message': 'What is 5 + 3?'},
        session_id=session_id
    )
    print(answer1)

    # Turn 2: Streaming
    print("\nTurn 2:")
    async for event in agent.run_stream(
        {'message': 'Now multiply that by 2'},
        session_id=session_id
    ):
        if event['type'] == 'text-delta':
            print(event['delta'], end='', flush=True)
        elif event['type'] == 'tool-input-available':
            print(f"\n[Tool: {event['toolName']}({event['input']})]")

if __name__ == '__main__':
    asyncio.run(main())

Next Steps