Using Translators Directly

This guide explains how to use Vel’s translators for custom orchestration while maintaining compatibility with AI SDK frontend components.

Architecture: Why Two Layers?

Vel separates protocol translation from orchestration based on the Single Responsibility Principle:

Translator: Protocol Adapter Pattern

  • Job: Convert provider-specific → standard protocol
  • Scope: Single LLM response stream
  • Stateful: Only tracks current response (text blocks, tool calls)
  • Reusable: Can be used in any orchestrator (Vel Agent, Mesh, LangGraph, custom)

Agent: Orchestrator Pattern

  • Job: Multi-step execution, tool calling, context management
  • Scope: Full agentic workflow
  • Stateful: Sessions, context, run history
  • Opinionated: Implements specific orchestration pattern

Why This Matters: This separation enables composability. You can use Vel’s translators in any orchestration framework without adopting Vel’s Agent pattern. Each layer does one thing well.

Understanding the Gap

Vel translators convert provider-specific events to standardized Vel protocol events, but they only handle content-level events from a single LLM response:

What Translators Emit ✅

from vel.providers.translators import OpenAIAPITranslator

translator = OpenAIAPITranslator()

async for chunk in openai_stream:
    event = translator.translate_chunk(chunk)
    # Emits:
    # - text-start, text-delta, text-end
    # - tool-input-start, tool-input-delta, tool-input-available
    # - response-metadata
    # - finish-message

What Translators DON’T Emit ❌

  • start - Generation begins
  • start-step - Step begins (multi-step agents)
  • finish-step - Step complete with metadata
  • finish - Full generation complete
  • tool-output-available - Tool execution results

Why This Matters for Frontend

AI SDK frontend components (React, Vue, Svelte hooks) expect the full event stream:

// Frontend expects this:
useChat({
  api: '/api/chat',
  // Needs: start, start-step, text-delta, finish-step, finish
});

If you use the translator directly without filling these gaps, frontend components will:

  • ❌ Not detect generation start
  • ❌ Not track usage/metadata properly
  • ❌ Not show tool execution results
  • ❌ Not detect completion correctly

Solution: Manual Event Orchestration

Here’s how to bridge the gap:

Example 1: Single-Step Text Generation

from vel.providers.translators import OpenAIAPITranslator
from openai import AsyncOpenAI

async def stream_with_orchestration(prompt: str):
    """
    Use translator with manual orchestration.
    Compatible with AI SDK frontend components.
    """
    client = AsyncOpenAI()
    translator = OpenAIAPITranslator()

    # 1. Emit start event
    yield {'type': 'start'}

    # 2. Emit start-step
    yield {'type': 'start-step'}

    # 3. Stream LLM response through translator
    usage = None
    response_id = None
    model_id = None
    finish_reason = 'stop'

    stream = await client.chat.completions.create(
        model='gpt-4o',
        messages=[{'role': 'user', 'content': prompt}],
        stream=True,
        stream_options={'include_usage': True}
    )

    async for chunk in stream:
        chunk_dict = chunk.model_dump()

        # Translate to Vel event
        event = translator.translate_chunk(chunk_dict)

        if event:
            # Track metadata (don't forward)
            if event.type == 'response-metadata':
                response_id = event.id
                model_id = event.model_id
                usage = event.usage
                continue  # Don't emit

            # Track finish reason (don't forward)
            if event.type == 'finish-message':
                finish_reason = event.finish_reason
                continue  # Don't emit

            # Forward content events
            yield event.to_dict()

    # 4. Finalize any pending events
    for pending_event in translator.finalize_tool_calls():
        yield pending_event.to_dict()

    # 5. Emit finish-step with metadata
    yield {
        'type': 'finish-step',
        'finishReason': finish_reason,
        'usage': usage,
        'response': {
            'id': response_id,
            'modelId': model_id
        }
    }

    # 6. Emit finish
    yield {
        'type': 'finish',
        'finishReason': finish_reason,
        'totalUsage': usage
    }

# FastAPI endpoint
@app.post('/api/chat')
async def chat_endpoint(request: ChatRequest):
    return StreamingResponse(
        stream_with_orchestration(request.message),
        media_type='text/event-stream'
    )
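
Note that StreamingResponse expects strings or bytes, while the generator above yields plain event dicts. A minimal serialization shim is sketched below; it assumes simple SSE framing (one data: line carrying a JSON payload per event), so check the exact wire format your AI SDK transport expects before relying on it.

import json

async def to_sse(events):
    # Serialize each event dict as an SSE 'data:' line before streaming it
    async for event in events:
        yield f"data: {json.dumps(event)}\n\n"

# In the endpoint above, wrap the generator:
#   StreamingResponse(to_sse(stream_with_orchestration(request.message)),
#                     media_type='text/event-stream')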

Example 2: Multi-Step with Tool Calling

import json

async def multi_step_stream(prompt: str, tools: dict):
    """
    Multi-step orchestration with tool execution.
    Fully compatible with AI SDK frontend.
    """
    client = AsyncOpenAI()
    translator = OpenAIAPITranslator()

    messages = [{'role': 'user', 'content': prompt}]
    max_steps = 5

    # Emit start
    yield {'type': 'start'}

    for step in range(max_steps):
        # Emit start-step
        yield {'type': 'start-step'}

        translator.reset()
        usage = None
        response_metadata = None
        finish_reason = 'stop'
        tool_calls = []

        # Stream LLM response
        stream = await client.chat.completions.create(
            model='gpt-4o',
            messages=messages,
            tools=[
                {'type': 'function', 'function': {'name': name, 'parameters': schema['input']}}
                for name, schema in tools.items()
            ],
            stream=True,
            stream_options={'include_usage': True}
        )

        async for chunk in stream:
            chunk_dict = chunk.model_dump()
            event = translator.translate_chunk(chunk_dict)

            if event:
                # Track metadata internally
                if event.type == 'response-metadata':
                    if not response_metadata:
                        response_metadata = {}
                    response_metadata['id'] = event.id
                    response_metadata['modelId'] = event.model_id
                    usage = event.usage
                    continue

                if event.type == 'finish-message':
                    finish_reason = event.finish_reason
                    continue

                # Track tool calls
                if event.type == 'tool-input-available':
                    tool_calls.append({
                        'id': event.tool_call_id,
                        'name': event.tool_name,
                        'input': event.input
                    })

                # Forward content events
                yield event.to_dict()

        # Finalize translator
        for pending in translator.finalize_tool_calls():
            yield pending.to_dict()

        # If no tool calls, we're done
        if not tool_calls:
            # Emit finish-step
            yield {
                'type': 'finish-step',
                'finishReason': finish_reason,
                'usage': usage,
                'response': response_metadata
            }

            # Emit finish
            yield {
                'type': 'finish',
                'finishReason': finish_reason,
                'totalUsage': usage
            }
            break

        # Add the assistant tool-call message to history first
        # (one assistant message listing every tool call, per the OpenAI chat format)
        messages.append({
            'role': 'assistant',
            'tool_calls': [{
                'id': tc['id'],
                'type': 'function',
                'function': {
                    'name': tc['name'],
                    'arguments': json.dumps(tc['input'])
                }
            } for tc in tool_calls]
        })

        # Execute tools
        for tc in tool_calls:
            # Execute tool (your logic here)
            result = await execute_tool(tc['name'], tc['input'])

            # Emit tool-output-available (CRITICAL - translator doesn't emit this)
            yield {
                'type': 'tool-output-available',
                'toolCallId': tc['id'],
                'output': result
            }

            # Add the tool result to message history
            messages.append({
                'role': 'tool',
                'tool_call_id': tc['id'],
                'content': json.dumps(result)
            })

        # Emit finish-step after tool execution
        yield {
            'type': 'finish-step',
            'finishReason': finish_reason,
            'usage': usage,
            'response': response_metadata
        }

        # Continue to next step (don't emit finish yet)

    else:
        # Loop finished without a break: we hit max_steps, so emit the final finish.
        # Note: usage here is from the last step only; a production implementation
        # would accumulate totalUsage across steps.
        yield {
            'type': 'finish',
            'finishReason': 'length',
            'totalUsage': usage
        }
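
Example 2 calls an execute_tool helper that is left to your application. A minimal sketch is shown below; it assumes each entry in the tools dict carries an async 'execute' callable next to its 'input' schema, which is an illustrative convention rather than anything Vel requires.

async def execute_tool(name: str, tool_input: dict):
    # Look up the tool by name and invoke its callable with the parsed input.
    # Assumes the same `tools` dict is in scope here, e.g. define this as a
    # nested helper inside multi_step_stream or pass tools in explicitly.
    tool = tools[name]
    return await tool['execute'](**tool_input)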

Event Checklist

When using translators directly, ensure you emit:

Required for All Scenarios

  • start - At the beginning
  • start-step - Before each LLM call
  • finish-step - After each LLM response (with usage/response)
  • finish - At the very end (with totalUsage)

Required for Tool Calling

  • tool-output-available - After executing each tool
  • ✅ Track tool calls from tool-input-available events
  • ✅ Execute tools yourself (translator doesn’t do this)
  • ✅ Add tool results to message history

Internal (Don’t Forward)

  • 🔒 response-metadata - Consume, don’t emit (use in finish-step)
  • 🔒 finish-message - Consume, don’t emit (use in finish-step)

Comparison: Translator vs Agent

Using Translator (Manual)

Pros:

  • Full control over orchestration logic
  • Can integrate with custom frameworks
  • Fine-grained event handling

Cons:

  • Must implement orchestration yourself
  • Must emit start/finish events manually
  • Must handle tool execution
  • More code to maintain

# ~80 lines of orchestration code
async def custom_stream():
    yield {'type': 'start'}
    yield {'type': 'start-step'}
    # ... translator usage ...
    yield {'type': 'finish-step', ...}
    yield {'type': 'finish', ...}

Using Agent (Automatic)

Pros:

  • Zero orchestration code needed
  • Automatic multi-step handling
  • Built-in tool execution
  • Session/context management

Cons:

  • Less control over orchestration flow
  • Fixed orchestration pattern

# 5 lines - everything handled
agent = Agent(id='my-agent', model={...}, tools=[...])
async for event in agent.run_stream({'message': prompt}):
    yield event

When to Use Each

Use Translator When:

  • Building integration with external frameworks (Mesh, LangGraph)
  • Need custom orchestration logic not supported by Agent
  • Want full control over step execution
  • Testing/validating protocol translation

Use Agent When:

  • Standard multi-step agentic workflow
  • Want automatic tool execution
  • Need session persistence
  • Want less code to maintain
  • Building production applications quickly

Common Pitfalls

1. Missing tool-output-available

# ❌ BAD - Frontend won't see tool results
async for chunk in stream:
    event = translator.translate_chunk(chunk.model_dump())
    if event:
        if event.type == 'tool-input-available':
            result = await execute_tool(event.tool_name, event.input)
            # Missing: yield tool-output-available!
        yield event.to_dict()

# ✅ GOOD - Emit tool output
async for chunk in stream:
    event = translator.translate_chunk(chunk.model_dump())
    if event:
        yield event.to_dict()
        if event.type == 'tool-input-available':
            result = await execute_tool(event.tool_name, event.input)
            yield {
                'type': 'tool-output-available',
                'toolCallId': event.tool_call_id,
                'output': result
            }

2. Forwarding Internal Events

# ❌ BAD - Breaks AI SDK v5 compatibility
async for chunk in stream:
    event = translator.translate_chunk(chunk.model_dump())
    if event:
        yield event.to_dict()  # Includes response-metadata, finish-message

# ✅ GOOD - Consume internal events
async for chunk in stream:
    event = translator.translate_chunk(chunk.model_dump())
    if event:
        if event.type in ('response-metadata', 'finish-message'):
            # Track internally, don't emit
            continue
        yield event.to_dict()

3. Missing Usage Metadata

# ❌ BAD - No usage tracking
yield {'type': 'finish-step', 'finishReason': 'stop'}

# ✅ GOOD - Include usage
yield {
    'type': 'finish-step',
    'finishReason': 'stop',
    'usage': usage,  # From response-metadata
    'response': {'id': response_id, 'modelId': model_id}
}

Testing Your Implementation

Verify frontend compatibility:

// React component
function TestChat() {
  const { messages, isLoading } = useChat({
    api: '/api/chat',
  });

  // Should work correctly:
  // - isLoading true during generation
  // - messages update with tool calls
  // - isLoading false when complete

  return <div>{/* ... */}</div>;
}
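
You can also sanity-check the event lifecycle on the backend. A rough sketch using pytest and pytest-asyncio is shown below; note that it exercises the real provider unless you stub the OpenAI client.

import pytest

@pytest.mark.asyncio
async def test_event_lifecycle():
    events = [event async for event in stream_with_orchestration('Hello')]
    types = [event['type'] for event in events]

    assert types[0] == 'start'        # generation begins
    assert types[1] == 'start-step'   # first step begins
    assert 'finish-step' in types     # step metadata emitted
    assert types[-1] == 'finish'      # stream completes correctly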

See Also