Everything agents need to manage memory.

A full memory stack — ingestion, storage, retrieval, and lifecycle management — as a managed service.*

A full memory stack — ingestion, storage, retrieval, and lifecycle management — as a managed service.*

GGeett  ssttaarrtteedd  ffoorr  ffrreeee
RReeaadd  tthhee  ddooccss

*No new infrastructure to provision. No custom schemas to maintain.

*No new infrastructure to provision. No custom schemas to maintain.

Architecture

How it fits in your stack.

Agents Memory runs as a managed service you call over HTTP. It connects to your agent logic — not directly to your LLM. You call it before and after each LLM interaction.
At write time, your agent sends interactions to Agents Memory: messages, events, extracted facts. These are processed, scored, and stored — in session state or long-term storage depending on content and configuration.
At read time, your agent queries Agents Memory before constructing its prompt. The API returns ranked memory items with metadata. You decide how to inject them.

Request flow

Your Agent

↓ retrieve(user_id, query)

Agents Memory API

↓ Conversation + Visitor/Merchant

Ranked context returned

↓ ranked context

Your Agent builds prompt

↓ prompt + context

LLM (OpenAI / Anthropic / …)

↓ response

Your Agent calls add()

↓ scored + consolidated (Visitor Rollup auto-triggers)

↓ scored + consolidated

Agents Memory API

Sub-200ms total overhead per round-trip

The Digital Brain

Every region mapped to a real system.

Agents Memory's architecture is a direct analogy of biological memory. Each brain region that handles a distinct memory function maps to a purpose-built subsystem.

Prefrontal Cortex

Working Memory

Session state, last N turns, auto-TTL. Fast hot storage for active context.

Hippocampus

Episodic Memory

Episodic Memory

Past interactions with vector embeddings for semantic recall across sessions.

Neocortex

Semantic Memory

Semantic Memory

Frequency tracking and confidence scoring on every stored fact.

Sleep Consolidation

Memory Replay

Memory Replay

Auto-triggers after every ingest, merging conversation data into persistent profiles.

Forgetting Curve

Natural Decay

Natural Decay

Importance fades over time; frequency protects strong neural pathways from decay.

Amygdala

Emotional Tagging

Emotional Tagging

Importance score (0.0–1.0) ensures high-impact events persist longer than low-signal noise.

memory_lifecycle.txt

New fact ingested:

freq=1, conf=0.50, imp=0.8

↓ queried 3 times:

freq=4, conf=0.59

← pathway strengthens

↓ queried 5 more:

freq=9, conf=0.74

← neural reinforcement

↓ 30 days no access:

imp × 0.96

← only 4% loss (frequency protects)

↓ queried again:

saved from decay

← Ebbinghaus curve reset

Core Capabilities

Five memory primitives. One API.

Session Memory

01

Context that survives the conversation.

Session memory captures the full thread of a conversation — what the user asked, what the agent responded, what actions were taken. It's scoped to a session ID and persists beyond the LLM's context window.

Use it to maintain coherent multi-turn conversations, resume interrupted sessions, and pass relevant history to downstream agents in a pipeline.

Key behaviors

Automatic message threading

Token-aware summarization for long sessions

Session expiry controls

Resumable across deployments

Long-term Memory

02

What the agent should remember forever.

Long-term memory stores facts, preferences, and conclusions that should persist across sessions. It accumulates over time and is not tied to a single conversation.

Agents Memory applies scoring to determine what deserves long-term retention versus what's transient. You can also write to long-term memory explicitly via API.

Key behaviors

User-attributed fact storage

Memory scoring and relevance weighting

Manual and automatic write modes

Configurable retention policies

User Memory

03

A persistent profile for every user.

User memory is a structured, evolving record of what an agent knows about a specific user: preferences, past behavior, stated goals, and inferred patterns.

It's separate from conversation history. It answers the question: what do I know about this person? — not what did we talk about last time?

Key behaviors

Per-user memory namespacing

Preference and trait extraction

Exportable and deletable for GDPR/CCPA

Versioned updates — no silent overwrites

Temporal Fact Tracking

04

Facts that know when they're true.

Every fact is stored with valid_from and valid_to timestamps. When a new fact contradicts an existing one, Agents Memory detects the conflict, versions the old fact, and surfaces the current truth.

This eliminates the silent drift problem — where an agent confidently cites outdated information. The full version history is queryable so you can see exactly how a fact evolved over time.

Key behaviors

Contradiction detection on every ingest

valid_from / valid_to timestamp tracking

Full version history per fact

Confidence scoring (0.0–1.0) with each write

Retrieval Engine

05

Relevant context, ranked and ready.

The retrieval engine surfaces the right memory at the right time. Given a query or current message, it returns ranked memory items — a mix of recent history, long-term facts, and user context — formatted for prompt injection.

It's not raw similarity search. It combines semantic relevance with recency, user scope, and memory type weighting.

Key behaviors

Hybrid retrieval (semantic + recency + type)

Configurable context window budget

Returns memory with source metadata

Streaming support for real-time agents

Positioning

Not a replacement. The missing piece.

LLMs generate responses. Vector databases store and search embeddings. Agents Memory manages what your agent remembers and when.
If you're already using Weaviate, Pinecone, or pgvector for document retrieval — keep using them. Agents Memory handles a different problem: agent state, user context, and conversational memory. These are complementary.
If you don't have a vector store, you don't need to add one. Agents Memory handles its own storage and retrieval internally.

LLM

Generates responses

In-context reasoning

In-context reasoning

Vector DB

Document retrieval (RAG)

Knowledge base search

Knowledge base search

Agents Memory

Agent memory layer

Agent memory layer

Persistence, user context, statefulness

<< agents memory >>

Example Workflow

A typical agent interaction.

1

User sends message

Your agent receives it

2

Agent calls .retrieve()

Gets ranked context: preferences, session summary, long-term facts

3

Build the prompt

Inject context + current message, call your LLM

4

LLM responds

Agent sends response to user

5

Agent calls .add()

Exchange stored, scored, and updated in memory

Start building agents that remember.

Free tier available. No credit card required.

GGeett  ssttaarrtteedd  ffoorr  ffrreeee
RReeaadd  tthhee  ddooccss

By signing up, you agree to our Terms of Service and Privacy Policy.