API Reference
Complete API documentation for all Vel classes and functions.
Table of Contents
- Agent
- ContextManager
- StatelessContextManager
- ToolSpec
- Providers
- Stream Events
- Helper Functions
- Type Definitions
- Default Values
- Complete Example
Agent
Main orchestrator class for running agents with LLM providers.
Constructor
Agent(
    id: str,
    model: Dict[str, Any],
    prompt_env: str = 'prod',
    tools: List[str] | None = None,
    policies: Dict[str, Any] | None = None,
    context_manager: Optional[ContextManager] = None,
    session_storage: Literal['memory', 'database'] = 'memory',
    generation_config: Optional[Dict[str, Any]] = None
)
Parameters:
id (required)
- Type: str
- Agent identifier (unique name for this agent)
- Example: 'my-agent', 'chat-general:v1'
model (required)
- Type: Dict[str, Any]
- Model configuration with provider, model, and optional api_key keys
- Examples:
  - {'provider': 'openai', 'model': 'gpt-4o'} (uses the OPENAI_API_KEY env var)
  - {'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-...'} (uses an explicit API key)
- Supported providers: 'openai', 'google', 'anthropic'
- API key handling:
  - If api_key is provided, it overrides the environment variable for this agent instance
  - If api_key is not provided, the provider falls back to its environment variable (OPENAI_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY)
  - This enables multi-tenant applications where different agents use different API keys, as shown in the sketch below
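For example, a multi-tenant setup might create one agent per tenant, each with its own key (the key values here are placeholders):

from vel import Agent

# Each tenant gets its own agent instance with its own API key;
# the explicit api_key overrides OPENAI_API_KEY for that instance.
tenant_a = Agent(
    id='chat-tenant-a',
    model={'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-tenant-a-...'}
)
tenant_b = Agent(
    id='chat-tenant-b',
    model={'provider': 'openai', 'model': 'gpt-4o', 'api_key': 'sk-tenant-b-...'}
)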
prompt_env
- Type: str
- Default: 'prod'
- Environment for prompts (for future prompt versioning)
tools
- Type: List[str] | None
- Default: None
- List of tool names to enable
- Example: ['get_weather', 'search']
- Tools must be registered (via register_tool) before creating the agent
policies
- Type: Dict[str, Any] | None
- Default: {'max_steps': 24, 'retry': {'attempts': 2}}
- Execution policies controlling agent behavior
- Available policies:
  - max_steps: Maximum tool calls per run (default: 24)
  - retry: Retry configuration (future feature)
context_manager
- Type: Optional[ContextManager]
- Default: None (uses ContextManager())
- Custom context manager instance for conversation memory
- Options:
  - None: Default full memory
  - ContextManager(): Full memory (explicit)
  - ContextManager(max_history=10): Limited history
  - StatelessContextManager(): No memory
  - Custom subclass
session_storage
- Type: Literal['memory', 'database']
- Default: 'memory'
- Where to persist session context
- Options:
  - 'memory': In-memory only (fast, not persistent)
  - 'database': Postgres-backed (persistent, requires POSTGRES_DSN); see the sketch below
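A minimal sketch of database-backed sessions, assuming POSTGRES_DSN accepts a standard Postgres connection string (the DSN below is a placeholder):

# .env (placeholder DSN; adjust host, credentials, and database name)
# POSTGRES_DSN=postgresql://user:password@localhost:5432/vel

from vel import Agent

agent = Agent(
    id='persistent-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    session_storage='database'  # context survives process restarts
)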
generation_config
- Type: Optional[Dict[str, Any]]
- Default: None
- Model generation parameters (temperature, max_tokens, etc.)
- Common parameters:
  - temperature: float (0-2) - Sampling temperature
  - max_tokens: int - Maximum output tokens
  - top_p: float (0-1) - Nucleus sampling
  - top_k: int - Top-K sampling (Gemini, Anthropic)
  - presence_penalty: float (-2 to 2) - Penalize tokens already present, encouraging new topics (OpenAI)
  - frequency_penalty: float (-2 to 2) - Penalize repeated tokens (OpenAI)
  - stop: List[str] - Stop sequences
  - seed: int - Reproducibility seed (OpenAI, Anthropic)
- Can be overridden per-run using the generation_config parameter of run() or run_stream()
Example:
from vel import Agent, ContextManager

# Basic agent
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'}
)

# Agent with tools and limited memory
agent = Agent(
    id='my-agent',
    model={'provider': 'google', 'model': 'gemini-1.5-pro'},
    tools=['get_weather', 'search'],
    context_manager=ContextManager(max_history=20),
    session_storage='database',
    policies={'max_steps': 10}
)
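The constructor also accepts generation_config for agent-wide defaults; a sketch with illustrative values:

# Agent with default generation parameters (values are illustrative)
agent = Agent(
    id='my-agent',
    model={'provider': 'anthropic', 'model': 'claude-sonnet-4-20250514'},
    generation_config={'temperature': 0.3, 'max_tokens': 512}
)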
Methods
run()
Non-streaming execution - returns final answer only.
async def run(
    input: Dict[str, Any],
    session_id: Optional[str] = None,
    generation_config: Optional[Dict[str, Any]] = None
) -> str
Parameters:
input (required)
- Type: Dict[str, Any]
- Input dictionary with a 'message' field
- Example: {'message': 'Hello'}
session_id
- Type: Optional[str]
- Default: None
- Session ID for multi-turn conversations
- If provided, context persists across calls
- Example: 'user-123', 'conv-abc'
generation_config
- Type: Optional[Dict[str, Any]]
- Default: None
- Per-run generation config that overrides agent-level config
- Example: {'temperature': 0, 'max_tokens': 100}
- See the Agent constructor for supported parameters
Returns:
- Type: str
- Final answer from the agent
Raises:
- RuntimeError: If max_steps is exceeded
- Exception: On LLM or tool errors
Example:
# Single-turn
answer = await agent.run({'message': 'What is 2+2?'})
print(answer) # "4"
# Multi-turn with sessions
session_id = 'user-123'
answer1 = await agent.run({'message': 'My name is Alice'}, session_id=session_id)
answer2 = await agent.run({'message': 'What is my name?'}, session_id=session_id)
print(answer2) # "Your name is Alice"
# With per-run generation config override
answer3 = await agent.run(
    {'message': 'Explain quantum computing'},
    generation_config={'temperature': 0, 'max_tokens': 100}  # Deterministic, brief
)
run_stream()
Streaming execution - yields stream protocol events in real-time.
async def run_stream(
    input: Dict[str, Any],
    session_id: Optional[str] = None,
    generation_config: Optional[Dict[str, Any]] = None
) -> AsyncGenerator[Dict[str, Any], None]
Parameters:
input (required)
- Type: Dict[str, Any]
- Input dictionary with a 'message' field
- Example: {'message': 'Tell me a story'}
session_id
- Type: Optional[str]
- Default: None
- Session ID for multi-turn conversations
- Context persists across calls if provided
generation_config
- Type: Optional[Dict[str, Any]]
- Default: None
- Per-run generation config that overrides agent-level config
- Example: {'temperature': 0.9, 'max_tokens': 1000}
- See the Agent constructor for supported parameters
Yields:
- Type: Dict[str, Any]
- Stream protocol events as they occur
- Event types: text-delta, tool-input-available, finish-message, etc.
Example:
# Stream text to console
async for event in agent.run_stream({'message': 'Tell me a joke'}):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)
    elif event['type'] == 'finish-message':
        print()  # Newline
        break

# With sessions
session_id = 'user-123'
async for event in agent.run_stream({'message': 'My name is Bob'}, session_id=session_id):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)

# With per-run generation config override
async for event in agent.run_stream(
    {'message': 'Write a haiku'},
    generation_config={'temperature': 0.9, 'max_tokens': 50}  # Creative, brief
):
    if event['type'] == 'text-delta':
        print(event['delta'], end='', flush=True)
ContextManager
Manages conversation history with configurable memory behavior.
Constructor
ContextManager(
    max_history: Optional[int] = None,
    summarize: bool = False
)
Parameters:
max_history
- Type: Optional[int]
- Default: None (unlimited)
- Maximum number of messages to retain
- Implements a sliding window (keeps the last N messages)
- Example: max_history=10 keeps the last 10 messages (~5 turns)
summarize
- Type: bool
- Default: False
- Whether to summarize old messages (future feature, not implemented)
Example:
from vel import Agent, ContextManager

# Full memory (default)
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=ContextManager()
)

# Limited memory (last 20 messages)
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=ContextManager(max_history=20)
)
Methods
get_session_context()
Get all messages for a session.
def get_session_context(session_id: str) -> List[Dict[str, Any]]
Parameters:
- session_id: Session identifier
Returns:
- List of message dictionaries with role and content fields
Example:
context = agent.ctxmgr.get_session_context('user-123')
print(context)
# [
# {'role': 'user', 'content': 'Hello'},
# {'role': 'assistant', 'content': 'Hi there!'}
# ]
set_session_context()
Set messages for a session (used when loading from storage).
def set_session_context(session_id: str, messages: List[Dict[str, Any]])
Parameters:
- session_id: Session identifier
- messages: List of message dictionaries
Example:
messages = [
    {'role': 'user', 'content': 'Hello'},
    {'role': 'assistant', 'content': 'Hi there!'}
]
agent.ctxmgr.set_session_context('user-123', messages)
clear_session()
Clear all messages for a session.
def clear_session(session_id: str)
Parameters:
- session_id: Session identifier
Example:
agent.ctxmgr.clear_session('user-123')
StatelessContextManager
Stateless context manager - no memory between calls.
Constructor
StatelessContextManager()
No parameters. Each call is completely independent.
Example:
from vel import Agent, StatelessContextManager

agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    context_manager=StatelessContextManager()
)

# First call
answer1 = await agent.run({'message': 'My name is Alice'}, session_id='user-1')

# Second call - does NOT remember Alice
answer2 = await agent.run({'message': 'What is my name?'}, session_id='user-1')
# Agent will say it doesn't know
ToolSpec
Specification for a custom tool with JSON schema validation.
Constructor
ToolSpec(
    name: str,
    input_schema: Dict[str, Any],
    output_schema: Dict[str, Any],
    handler: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]
)
Parameters:
name (required)
- Type: str
- Unique tool identifier
- Convention: lowercase_with_underscores
- Example: 'get_weather', 'search_products'
input_schema (required)
- Type: Dict[str, Any]
- JSON Schema (Draft 2020-12) for input validation
- Must include type, properties, required
- Validated before calling the handler
output_schema (required)
- Type: Dict[str, Any]
- JSON Schema for output validation
- Validates handler return value
- Ensures consistent behavior
handler (required)
- Type: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]
- Function that executes tool logic
- Signature: (input: dict, ctx: dict) -> dict
- Can be sync or async (auto-detected); see the async variant after the example below
Example:
from vel import ToolSpec, register_tool

def get_weather_handler(input: dict, ctx: dict) -> dict:
    city = input['city']
    # Your logic here
    return {
        'temp_f': 72,
        'condition': 'sunny',
        'city': city
    }

weather_tool = ToolSpec(
    name='get_weather',
    input_schema={
        'type': 'object',
        'properties': {
            'city': {'type': 'string', 'description': 'City name'}
        },
        'required': ['city']
    },
    output_schema={
        'type': 'object',
        'properties': {
            'temp_f': {'type': 'number'},
            'condition': {'type': 'string'},
            'city': {'type': 'string'}
        },
        'required': ['temp_f', 'condition', 'city']
    },
    handler=get_weather_handler
)

register_tool(weather_tool)
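Since handlers can be async (auto-detected), an async variant of the same handler might look like this sketch; registration is unchanged:

import asyncio

# Hypothetical async variant of get_weather_handler
async def get_weather_handler_async(input: dict, ctx: dict) -> dict:
    await asyncio.sleep(0.1)  # stands in for a real async call (e.g., an HTTP request)
    return {'temp_f': 72, 'condition': 'sunny', 'city': input['city']}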
Methods
run()
Execute the tool handler with validation.
async def run(input: Dict[str, Any], ctx: Dict[str, Any]) -> Dict[str, Any]
Parameters:
- input: Tool input (validated against input_schema)
- ctx: Runtime context with run_id, session_id, agent_id
Returns:
- Tool output (validated against output_schema)
Note: Usually called internally by Agent, not directly.
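That said, calling it directly can be handy when unit-testing a tool in isolation (the ctx values here are placeholders):

# Inside an async function; validates input, runs the handler, validates output
result = await weather_tool.run(
    {'city': 'Tokyo'},
    {'run_id': 'test-run', 'session_id': 'test-session', 'agent_id': 'my-agent'}
)
print(result)  # {'temp_f': 72, 'condition': 'sunny', 'city': 'Tokyo'}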
Providers
BaseProvider
Abstract base class for LLM providers.
class BaseProvider(ABC):
    name: str

    @abstractmethod
    async def stream(
        messages: List[LLMMessage],
        model: str,
        tools: Dict[str, Any]
    ) -> AsyncGenerator[StreamEvent, None]

    @abstractmethod
    async def generate(
        messages: List[LLMMessage],
        model: str,
        tools: Dict[str, Any]
    ) -> Dict[str, Any]
Note: Implement this to create custom providers.
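A minimal subclass sketch, assuming the events yielded by stream() are the dicts described under Stream Events below; the import path for BaseProvider and the return shape of generate() are assumptions to check against the source:

from vel import BaseProvider  # import path assumed

class EchoProvider(BaseProvider):
    name = 'echo'

    async def stream(self, messages, model, tools):
        # Emit one text block that echoes the last message back
        text = messages[-1]['content']
        yield {'type': 'text-start', 'id': 'blk-1'}
        yield {'type': 'text-delta', 'id': 'blk-1', 'delta': text}
        yield {'type': 'text-end', 'id': 'blk-1'}
        yield {'type': 'finish-message', 'finishReason': 'stop'}

    async def generate(self, messages, model, tools):
        # Return shape is assumed; check BaseProvider's contract in the source
        return {'text': messages[-1]['content']}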
OpenAIProvider
OpenAI provider implementation.
Environment Variables:
OPENAI_API_KEY
(required)OPENAI_API_BASE
(optional, default:https://api.openai.com/v1
)
Supported Models:
- gpt-4o
- gpt-4-turbo
- gpt-4
- gpt-3.5-turbo
Example:
# Set in .env
OPENAI_API_KEY=sk-...
OPENAI_API_BASE=https://api.openai.com/v1 # Optional
# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'}
)
GeminiProvider
Google Gemini provider implementation.
Environment Variables:
- GOOGLE_API_KEY (required)
Supported Models:
- gemini-1.5-pro
- gemini-1.5-flash
- gemini-pro
Example:
# Set in .env
GOOGLE_API_KEY=AIza...
# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'google', 'model': 'gemini-1.5-pro'}
)
AnthropicProvider
Anthropic Claude provider implementation.
Environment Variables:
- ANTHROPIC_API_KEY (required)
- ANTHROPIC_API_BASE (optional, default: https://api.anthropic.com)
Supported Models:
- claude-opus-4-20250514
- claude-sonnet-4-20250514
- claude-3-5-sonnet-20241022
- claude-3-5-haiku-20241022
- claude-3-opus-20240229
- claude-3-sonnet-20240229
- claude-3-haiku-20240307
Example:
# Set in .env
ANTHROPIC_API_KEY=sk-ant-...
# Use with agent
agent = Agent(
    id='my-agent',
    model={'provider': 'anthropic', 'model': 'claude-sonnet-4-20250514'}
)
Stream Events
All stream events have a type field and extend the base StreamEvent class.
Event Types
| Event | Fields |
|---|---|
| start | messageId (optional) |
| text-start | id (block ID) |
| text-delta | id, delta (text chunk) |
| text-end | id |
| tool-input-start | toolCallId, toolName |
| tool-input-delta | toolCallId, inputTextDelta |
| tool-input-available | toolCallId, toolName, input (object) |
| tool-output-available | toolCallId, output (any) |
| finish-message | finishReason |
| error | error (string) |
See Stream Protocol for complete event documentation.
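Putting the fields above together, a small consumer that handles the most common events (a sketch; tool events only fire when tools are enabled):

async def consume(agent, message: str):
    async for event in agent.run_stream({'message': message}):
        etype = event['type']
        if etype == 'text-delta':
            print(event['delta'], end='', flush=True)
        elif etype == 'tool-input-available':
            print(f"\n[calling {event['toolName']} with {event['input']}]")
        elif etype == 'tool-output-available':
            print(f"[tool returned {event['output']}]")
        elif etype == 'error':
            print(f"\n[error: {event['error']}]")
        elif etype == 'finish-message':
            print(f"\n[done: {event['finishReason']}]")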
Helper Functions
register_tool()
Register a tool in the global registry.
def register_tool(tool: ToolSpec)
Parameters:
- tool: ToolSpec instance
Example:
from vel import ToolSpec, register_tool
tool = ToolSpec(
    name='my_tool',
    input_schema={...},
    output_schema={...},
    handler=my_handler
)
register_tool(tool)
validate_io()
Validate data against a JSON schema.
def validate_io(schema: Dict[str, Any], value: Dict[str, Any])
Parameters:
- schema: JSON Schema dictionary
- value: Data to validate
Raises:
- jsonschema.exceptions.ValidationError: If validation fails
Example:
from vel import validate_io
schema = {
    'type': 'object',
    'properties': {'count': {'type': 'number'}},
    'required': ['count']
}

validate_io(schema, {'count': 5})       # OK
validate_io(schema, {'count': 'five'})  # Raises ValidationError
Type Definitions
LLMMessage
Message format for LLM providers.
LLMMessage = Dict[str, Any]

# Structure:
{
    'role': 'user' | 'assistant' | 'system',
    'content': str
}
Default Values
Agent Defaults
{
    'prompt_env': 'prod',
    'tools': None,
    'policies': {'max_steps': 24, 'retry': {'attempts': 2}},
    'context_manager': ContextManager(),
    'session_storage': 'memory',
    'generation_config': None
}
ContextManager Defaults
{
    'max_history': None,  # Unlimited
    'summarize': False
}
Policy Defaults
{
    'max_steps': 24,
    'retry': {'attempts': 2}  # Future feature
}
Complete Example
import asyncio
from dotenv import load_dotenv
from vel import Agent, ContextManager, ToolSpec, register_tool

load_dotenv()

# Create custom tool
def calculate_handler(input: dict, ctx: dict) -> dict:
    a = input['a']
    b = input['b']
    op = input['operation']
    if op == 'add':
        result = a + b
    elif op == 'multiply':
        result = a * b
    else:
        result = 0
    return {'result': result}

calc_tool = ToolSpec(
    name='calculate',
    input_schema={
        'type': 'object',
        'properties': {
            'a': {'type': 'number'},
            'b': {'type': 'number'},
            'operation': {'type': 'string', 'enum': ['add', 'multiply']}
        },
        'required': ['a', 'b', 'operation']
    },
    output_schema={
        'type': 'object',
        'properties': {'result': {'type': 'number'}},
        'required': ['result']
    },
    handler=calculate_handler
)
register_tool(calc_tool)

# Create agent
agent = Agent(
    id='calculator-agent',
    model={'provider': 'openai', 'model': 'gpt-4o'},
    tools=['calculate'],
    context_manager=ContextManager(max_history=10),
    session_storage='memory',
    policies={'max_steps': 5}
)

async def main():
    session_id = 'session-1'

    # Turn 1
    answer1 = await agent.run(
        {'message': 'What is 5 + 3?'},
        session_id=session_id
    )
    print(answer1)

    # Turn 2: Streaming
    print("\nTurn 2:")
    async for event in agent.run_stream(
        {'message': 'Now multiply that by 2'},
        session_id=session_id
    ):
        if event['type'] == 'text-delta':
            print(event['delta'], end='', flush=True)
        elif event['type'] == 'tool-input-available':
            print(f"\n[Tool: {event['toolName']}({event['input']})]")

if __name__ == '__main__':
    asyncio.run(main())
Next Steps
- Getting Started - Quick start guide
- Stream Protocol - Streaming event details
- Tools - Tool system in depth
- Providers - Provider configuration