# Vel Memory Architecture
Vel's memory system is modular and runtime-owned: it operates entirely inside the agent runtime rather than through LLM tool calls. It lets agents recall facts, reasoning strategies, and user-specific context without increasing latency or changing the prompt architecture.
## Overview

Vel defines three complementary memory components:

| Component | Type | Scope | Purpose |
|---|---|---|---|
| ContextManager | short-term | per run / session | Maintains current conversation state |
| Fact Store | mid-term | persistent | Stores facts, summaries, and user data |
| ReasoningBank | long-term | persistent | Stores learned strategies and heuristics |

These can be mixed and matched via the `MemoryConfig` object.
## Conceptual Layers

```
┌─────────────────────────────┐
│         LLM Runtime         │
│      (Agent Execution)      │
└──────────────┬──────────────┘
               │
┌──────────────┴───────────────────┐
│          Context Layer           │
│ (Conversation & In-Run Context)  │
│         → ContextManager         │
└──────────────┬───────────────────┘
               │
┌──────────────┴──────────────┐
│        Memory Layer         │
│ Fact Store + ReasoningBank  │
│ (Optional Runtime Boosts)   │
└──────────────┬──────────────┘
               │
┌──────────────┴──────────────┐
│     SQLite / Vector DB      │
│    (Memory persistence)     │
└─────────────────────────────┘
```
## Runtime Flow

### 1. Pre-Run Initialization

When a new agent run starts:

- The `ContextManager` loads any session or run-specific history.
- If enabled:
  - The `Fact Store` retrieves relevant key-value items (facts, preferences, summaries).
  - The `ReasoningBank` retrieves the top-K most similar strategies based on task-signature embeddings.
- The combined context is injected into the system prompt before model generation.
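The pre-run assembly described above can be sketched as a simple prompt-building helper. Note that `build_system_prompt` and its arguments are hypothetical stand-ins for the actual ContextManager / Fact Store / ReasoningBank calls, not Vel's real API:

```python
# Sketch of pre-run prompt assembly: pre-loaded memory is merged into
# the system prompt BEFORE model generation (no memory tool calls).
# All names here are illustrative, not Vel's actual API.

def build_system_prompt(base_prompt, history, facts, strategies):
    """Combine pre-loaded memory sections into a single system prompt."""
    sections = [base_prompt]
    if facts:  # Fact Store retrievals (if enabled)
        sections.append("Known facts:\n" + "\n".join(f"- {f}" for f in facts))
    if strategies:  # ReasoningBank top-K strategies (if enabled)
        sections.append("Useful strategies:\n" + "\n".join(f"- {s}" for s in strategies))
    if history:  # ContextManager session/run history
        sections.append("Recent conversation:\n" + "\n".join(history))
    return "\n\n".join(sections)

prompt = build_system_prompt(
    "You are a planning assistant.",
    history=["user: Plan a FastAPI project"],
    facts=["User prefers dark theme"],
    strategies=["Clarify user intent before planning."],
)
print(prompt)
```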
### 2. Streaming and Execution

- The LLM streams tokens as usual.
- The runtime decides when to stop streaming (no open memory tools or callbacks required).
- The agent executes tools and gathers results via the normal `ContextManager` interface.
**Key difference from "tool-based" memory systems:** the LLM never calls a `memory_read` or `memory_write` tool; all memory operations happen before and after model inference.
### 3. Post-Run Finalization

After the response is completed:

- The `ContextManager` appends messages to session memory.
- If enabled:
  - The `Fact Store` may store summaries, outcomes, or key facts.
  - The `ReasoningBank` evaluates whether the strategy was successful and updates its vector database accordingly.
- Updates are handled asynchronously in background threads so they do not block the response.
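A minimal sketch of that asynchronous finalization pattern, assuming a plain background thread; the callback names are illustrative, not Vel's actual API:

```python
import threading

def finalize_run_async(messages, store_fact, update_strategy):
    """Run post-run memory updates off the hot path so persistence
    never blocks the response (illustrative, not Vel's real API)."""
    def _work():
        # e.g. Fact Store write + ReasoningBank outcome evaluation
        store_fact("last_summary", messages[-1])
        update_strategy(success=True)
    t = threading.Thread(target=_work, daemon=True)
    t.start()
    return t  # caller may ignore this; we keep it so tests can join

saved = {}
t = finalize_run_async(
    ["Plan a FastAPI project", "Done: scaffolding created"],
    store_fact=lambda k, v: saved.update({k: v}),
    update_strategy=lambda success: saved.update({"strategy_ok": success}),
)
t.join()  # only for demonstration; the runtime would not wait here
```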
## Component Details

### 1. ContextManager

- Maintains transient conversation state (per-run or per-session).
- Handles inputs, outputs, and message trimming (`max_history`).
- Stateless variants can disable all retention.

```python
# Keep only the last five messages of per-run history
ctx = ContextManager(max_history=5)
ctx.set_input(run_id, {"message": "Plan a FastAPI project"})
ctx.append_assistant_message(run_id, "Let's start with project scaffolding.")
```
### 2. Fact Store

- Namespaced key-value store for long-term structured data.
- Ideal for user preferences, project metadata, and domain knowledge.
- Implemented with SQLite under the hood.

```python
# Facts are scoped by namespace (here, a per-user namespace)
ctx.fact_put("user:richard", "theme", "dark")
theme = ctx.fact_get("user:richard", "theme")
```

The fact store is *structured*: it stores curated metadata you explicitly save.
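For illustration, a namespaced key-value table of the kind the Fact Store could use on top of SQLite might look like this (the schema and helper names here are assumptions, not Vel's actual implementation):

```python
import sqlite3

# Hypothetical sketch of a namespaced fact table backed by SQLite.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (namespace TEXT, key TEXT, value TEXT, "
    "PRIMARY KEY (namespace, key))"
)

def fact_put(ns, key, value):
    # Upsert: overwrite the value if the (namespace, key) pair exists
    conn.execute(
        "INSERT INTO facts VALUES (?, ?, ?) "
        "ON CONFLICT(namespace, key) DO UPDATE SET value = excluded.value",
        (ns, key, value),
    )

def fact_get(ns, key):
    row = conn.execute(
        "SELECT value FROM facts WHERE namespace = ? AND key = ?", (ns, key)
    ).fetchone()
    return row[0] if row else None

fact_put("user:richard", "theme", "dark")
print(fact_get("user:richard", "theme"))  # -> dark
```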
### 3. ReasoningBank

- Strategic memory that recalls *how* to think, not *what* to think.
- Each entry stores a distilled heuristic plus vector embeddings for its context.

```python
# Retrieve strategies whose signatures are conceptually similar to this task
advice = ctx.reasoningbank.get_advice(signature)
print(advice)
# -> ["Clarify user intent before planning.", "Avoid replanning mid-stream."]
```

Retrieval is powered by embeddings for conceptual similarity:

- `vector_strategy`: embedding of the heuristic text
- `vector_signature`: embedding of the contextual metadata
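Top-K strategy recall over signature embeddings can be sketched with plain cosine similarity; the `bank` data and `top_k` helper are illustrative, and Vel's actual retrieval may differ:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, entries, k=2):
    """entries: (heuristic_text, vector_signature) pairs; returns the k
    heuristics whose signatures are most similar to the query."""
    ranked = sorted(entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy bank: tiny hand-made vectors stand in for real embeddings
bank = [
    ("Clarify user intent before planning.", [0.9, 0.1, 0.0]),
    ("Avoid replanning mid-stream.", [0.8, 0.2, 0.1]),
    ("Prefer batch tool calls.", [0.0, 0.1, 0.9]),
]
print(top_k([1.0, 0.0, 0.0], bank, k=2))
```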
## Control Flow Diagram

```
┌─────────────────────────────┐
│     Agent Runtime (Vel)     │
└─────────────┬───────────────┘
              │
              ▼
     ┌──────────────────┐
     │  ContextManager  │
     │ Load session/run │
     └────────┬─────────┘
              │
     ┌────────┴────────────┐
     │ MemoryConfig Switch │
     │  mode = facts/both  │
     └────────┬────────────┘
              │
     ┌────────┴──────────┐
     │    Fact Store     │
     │   (facts & kv)    │
     ├───────────────────┤
     │   ReasoningBank   │
     │ (strategy recall) │
     └────────┬──────────┘
              │
              ▼
     ┌───────────────────┐
     │ SQLite / VectorDB │
     └───────────────────┘
```
## Runtime Behavior

| Stage | Component | Operation | Timing |
|---|---|---|---|
| Run start | ReasoningBank | Retrieve top-K strategies | synchronous |
| Run start | Fact Store | Retrieve contextual facts | synchronous |
| Run loop | ContextManager | Manage messages, tools | synchronous |
| Run end | ReasoningBank | Update confidence, anti-patterns | async |
| Run end | Fact Store | Store new facts or summaries | async |
## Design Goals

- **Zero LLM dependency**: no memory tools or external calls.
- **Predictable latency**: pre-load and post-update, never mid-stream.
- **Composable configuration**: enable memory per run or globally.
- **Embeddings optional**: works even with simple string similarity if desired.
- **Persistent but lightweight**: uses SQLite for simplicity; can scale to vector DBs.
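As an example of the "embeddings optional" goal, a dependency-free string-similarity fallback can be as simple as Jaccard overlap on word sets. This particular fallback is an assumption for illustration, not necessarily what Vel ships:

```python
def token_similarity(a, b):
    """Jaccard overlap on lowercase word sets: a zero-dependency
    stand-in for embedding similarity (illustrative assumption)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

score = token_similarity("plan a fastapi project", "plan a flask project")
print(score)  # 3 shared words out of 5 distinct -> 0.6
```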
## Typical Configuration

```python
from vel.core.context import MemoryConfig

mem = MemoryConfig(
    mode="all",            # "none" | "facts" | "reasoning" | "all"
    db_path=".vel/vel.db",
    rb_top_k=5,
    embeddings_fn=encode   # custom or model-based
)
```

The `MemoryConfig` object wires all components together dynamically at runtime.
## Summary

| Component | Memory Type | Purpose | Retention | Example |
|---|---|---|---|---|
| ContextManager | Working memory | Holds conversation state | transient | "This run" |
| Fact Store | Declarative | Stores facts, summaries | persistent | "User prefers dark theme" |
| ReasoningBank | Procedural | Stores reasoning heuristics | persistent | "Clarify before planning" |

Together, they form a cognitive architecture:

- ContextManager = short-term "working memory"
- Fact Store = mid-term "fact memory"
- ReasoningBank = long-term "strategy memory"

In short: Vel agents don't just remember what they said; they learn how to think.