Memory configuration
Short-term window, long-term retrieval, cold tier, cross-session opt-in. The four levers that shape persona memory.
Moonborn's memory layer has three tiers (short-term context, long-term pgvector retrieval, cold archive) and one opt-in escape hatch (cross-session memory). All four are config-driven.
Short-term window
chat.memory.short_term.window_turns (default 12). The last N turns
ride in-prompt verbatim. Higher = more recall but more cost and slower
inference.
```ts
await client.config.setItem({
  key: 'chat.memory.short_term.window_turns',
  value: 16,
  scope: 'workspace',
  scopeId: 'ws_...',
});
```

Long-term retrieval
Old turns get summarized and embedded with voyage-3-large (default;
configurable via engine.embedding.model). Each new turn retrieves
the top-K relevant chunks:
- chat.memory.long_term.top_k (default 4)
- chat.memory.long_term.retrieval_strategy (default hybrid: semantic + BM25 + rerank + MMR)
For long support sessions, bump top_k to 8. For creative play
where coherence matters less, drop to 2.
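To make the retrieval knobs concrete, here is an illustrative sketch of the MMR (maximal marginal relevance) stage that the hybrid strategy ends with. This is a re-implementation for explanation only, not Moonborn's actual code; the Chunk shape, the cosine helper, and the lambda weight are assumptions.

```ts
// Illustrative MMR re-ranking: balance relevance against redundancy.
// All names here are assumptions for the sketch, not Moonborn internals.

interface Chunk {
  id: string;
  score: number;       // relevance to the query (post-rerank)
  embedding: number[]; // vector used for the diversity term
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Pick topK chunks. lambda=1 is pure relevance; lambda=0 is pure diversity.
function mmr(candidates: Chunk[], topK: number, lambda = 0.7): Chunk[] {
  const selected: Chunk[] = [];
  const pool = [...candidates];
  while (selected.length < topK && pool.length > 0) {
    let bestIdx = 0;
    let bestVal = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      // Redundancy = similarity to the closest already-selected chunk.
      const redundancy = selected.length
        ? Math.max(...selected.map((s) => cosine(pool[i].embedding, s.embedding)))
        : 0;
      const val = lambda * pool[i].score - (1 - lambda) * redundancy;
      if (val > bestVal) {
        bestVal = val;
        bestIdx = i;
      }
    }
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

The diversity term is why a higher top_k is safe for support sessions: near-duplicate chunks are penalized, so extra slots tend to surface genuinely new context rather than restatements.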
Cold tier
Chunks older than chat.memory.long_term.cold_tier_after_days
(default 90) move to slower storage and are skipped by default
retrieval. They remain queryable if the user explicitly references
something old.
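Retention is tuned with the same config API as the other levers. A sketch, assuming this key accepts workspace scope like the others, that keeps chunks hot for six months:

```ts
await client.config.setItem({
  key: 'chat.memory.long_term.cold_tier_after_days',
  value: 180,
  scope: 'workspace',
  scopeId: 'ws_...',
});
```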
Cross-session memory (Team+)
By default memory is session-scoped — a persona doesn't remember between sessions. To enable cross-session continuity:
```ts
await client.config.setItem({
  key: 'chat.memory.cross_session.enabled',
  value: true,
  scope: 'workspace',
  scopeId: 'ws_...',
});
```

This has privacy implications: a persona now carries information between sessions, possibly across users. Pair it with consent UI and a clear retention policy.
Manual forgetting
DELETE /v1/chat/sessions/{id}/memory/{chunk_id} removes a memory
chunk. The persona forgets that specific fact for the session.
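A minimal sketch of calling this endpoint over raw HTTP. Only the path shape comes from the endpoint above; the base URL, the bearer-token header, and both helper names are assumptions for illustration.

```ts
// Build the documented endpoint path; the IDs are placeholders.
function memoryChunkPath(sessionId: string, chunkId: string): string {
  return `/v1/chat/sessions/${sessionId}/memory/${chunkId}`;
}

// Hypothetical raw HTTP call (auth header name is an assumption).
async function forgetChunk(
  baseUrl: string,
  apiKey: string,
  sessionId: string,
  chunkId: string,
): Promise<boolean> {
  const res = await fetch(baseUrl + memoryChunkPath(sessionId, chunkId), {
    method: 'DELETE',
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  return res.ok; // true on a 2xx response
}
```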