Voice fingerprint
The voice fingerprint, drift detection, and recovery strategies that keep a persona from drifting out of voice during long conversations or after a model swap.
What a fingerprint is
A voice fingerprint is a numeric embedding of a persona's Mask — its register, rhythm, vocabulary, and signature phrases — captured by running fifty short scenario completions against the persona at generation time. The result is a "voice DNA" the chat runtime can score against on every reply.
The fingerprint is computed once at generation, recomputed on refine, and stored alongside the persona. It is not user-visible content; it is a defensive baseline.
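As a rough illustration of how many scenario embeddings could collapse into one baseline vector, the sketch below averages them and normalizes the result. How the fifty completions are actually combined is not stated here, so the centroid approach and the `build_fingerprint` name are assumptions for illustration only.

```python
import math


def build_fingerprint(embeddings):
    """Average equal-length embedding vectors and L2-normalize the mean.

    Assumption: the baseline is a simple centroid of the scenario
    embeddings. The real aggregation method is not documented here.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(v) != dim for v in embeddings):
        raise ValueError("all embeddings must have the same length")

    # Component-wise mean across all scenario embeddings.
    mean = [sum(v[i] for v in embeddings) / len(embeddings) for i in range(dim)]

    # Normalize to unit length so later cosine comparisons are stable.
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0:
        raise ValueError("mean embedding is the zero vector")
    return [x / norm for x in mean]
```

In this sketch, a refine would simply call `build_fingerprint` again on fresh scenario embeddings and overwrite the stored vector.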
The drift score
Each chat reply is embedded with the same model and compared against the fingerprint via cosine distance. The result is a number between 0 and 1 — closer to 0 means the reply is in voice, closer to 1 means it has slipped.
{
  "driftScore": 0.12,
  "driftThreshold": 0.30,
  "driftAlert": false
}
Below the threshold, the reply ships normally. Above the threshold, an alert fires (logged, webhook-emitted, optionally enforced).
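The scoring step can be sketched in a few lines of plain Python. This is a minimal illustration, not the runtime's implementation: `cosine_distance` and `score_reply` are hypothetical names, and the clamp to the 0 to 1 range is an assumption (raw cosine distance can reach 2 for opposed vectors).

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def score_reply(reply_embedding, fingerprint, threshold=0.30):
    """Build the three drift fields carried on a chat response."""
    raw = cosine_distance(reply_embedding, fingerprint)
    # Assumption: clamp into [0, 1] to match the documented score range.
    score = min(max(raw, 0.0), 1.0)
    return {
        "driftScore": round(score, 4),
        "driftThreshold": threshold,
        # The alert fires only when the score is strictly above the threshold.
        "driftAlert": score > threshold,
    }
```

A reply whose embedding points the same way as the fingerprint scores 0.0 and does not alert; an orthogonal one scores 1.0 and does.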
What causes drift
- Long context. The system prompt's authority decays as the conversation history grows.
- Off-topic steering. Users push a persona into territory it was not generated for.
- Provider model swap. Switching from Claude Opus to Sonnet (or vice versa) changes the voice surface, even with identical prompts.
- High temperature. The variance reads as drift even when the persona is "still itself."
- Cross-tool calls. Tool responses inject system-like text that bleeds back into the reply tone.
Recovery strategies
Four layers of defense can run before a drifted reply reaches the user:
- Auto-recovery. A small re-prompt with the fingerprint reference often pulls the persona back inside the threshold.
- Session reset. Truncate the chat history to the system prompt + last N turns.
- Manual fingerprint refresh. Recompute the baseline after a deliberate Mask refine.
- Fallback model. Drop to a stricter model in the pipeline config for the affected step.
All four are individually configurable via engine.pipeline.drift_detection.*.
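Of the strategies above, session reset is the easiest to picture in code. The sketch below assumes a history stored as a list of `{"role", "content"}` dictionaries and counts one non-system message as one turn; both the shape and the `reset_session` name are hypothetical.

```python
def reset_session(history, keep_last):
    """Return the system prompt plus the last `keep_last` non-system messages.

    Assumptions: `history` is a list of {"role": ..., "content": ...} dicts,
    the first message with role "system" is the system prompt, and each
    non-system message counts as one turn.
    """
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")

    # Keep only the original system prompt, not later system-like messages.
    system = [m for m in history if m["role"] == "system"][:1]
    turns = [m for m in history if m["role"] != "system"]

    # Guard the zero case: turns[-0:] would return the whole list.
    recent = turns[-keep_last:] if keep_last > 0 else []
    return system + recent
```

Keeping only the first system message is a deliberate choice in this sketch: it also drops system-like text injected later, which is one of the drift causes listed earlier.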
Threshold tuning
The default threshold is 0.30. It can be set per workspace to match the use case:
- Customer support, regulated content → 0.20. Strict. Trips early so QA can review.
- Open-ended chat, creative play → 0.45. Loose. Lets the persona breathe.
- A/B brand variants → set per persona, override workspace default.
The config item is engine.pipeline.drift_detection.threshold; per-persona overrides land via the persona's runtime contract.
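The precedence described here (persona override, then workspace setting, then the 0.30 default) might be resolved along these lines. The nested-dictionary config layout and the `driftThreshold` field on the runtime contract are assumptions made for this sketch; only the dotted config path comes from the text above.

```python
DEFAULT_THRESHOLD = 0.30
CONFIG_PATH = "engine.pipeline.drift_detection.threshold"


def resolve_threshold(workspace_config, persona_contract=None):
    """Pick the effective drift threshold for one persona.

    Assumptions: workspace config is a nested dict keyed by the segments
    of CONFIG_PATH, and a per-persona override lives under a hypothetical
    "driftThreshold" key in the persona's runtime contract.
    """
    # Walk the dotted path; fall back to the default if any segment is missing.
    node = workspace_config
    for key in CONFIG_PATH.split("."):
        if not isinstance(node, dict) or key not in node:
            node = DEFAULT_THRESHOLD
            break
        node = node[key]
    threshold = node

    # A per-persona override wins over the workspace value.
    if persona_contract and persona_contract.get("driftThreshold") is not None:
        threshold = persona_contract["driftThreshold"]

    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return float(threshold)
```

For example, a support workspace set to 0.20 with one creative persona overriding to 0.45 resolves to 0.45 for that persona and 0.20 for every other one.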
Telemetry
Every chat response carries driftScore, driftThreshold, and driftAlert. Above-threshold replies emit the webhook event conversation.drift_detected (signed with HMAC-SHA256, retried up to five times with exponential backoff). Wire this into your support QA queue so drift incidents become a metric, not a surprise.
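On the receiving side, a handler would typically verify the signature before trusting the event. The sketch below uses Python's standard `hmac` module and assumes the signature arrives as a hex digest of the raw request body; the transport of that signature (header name, encoding) is not specified here, so treat those details and the function names as hypothetical. A sample of the event body follows this sketch.

```python
import hashlib
import hmac
import json


def verify_signature(secret, raw_body, signature_hex):
    """Check an HMAC-SHA256 hex digest over the raw body in constant time."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(secret, raw_body, signature_hex):
    """Return a small QA record for a verified drift event, else None."""
    if not verify_signature(secret, raw_body, signature_hex):
        return None  # reject unsigned or tampered payloads

    event = json.loads(raw_body)
    if event.get("type") != "conversation.drift_detected":
        return None  # ignore other event types

    data = event["data"]
    return {
        "sessionId": data["sessionId"],
        "messageId": data["messageId"],
        "personaId": data["personaId"],
        # How far past the threshold the reply landed.
        "overBy": round(data["driftScore"] - data["driftThreshold"], 4),
    }
```

Verifying against the raw bytes rather than the re-serialized JSON matters, because re-encoding can change whitespace or key order and invalidate the digest. Because deliveries are retried, a real handler would also want to deduplicate, for instance on messageId.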
{
  "type": "conversation.drift_detected",
  "data": {
    "sessionId": "sess_...",
    "messageId": "msg_...",
    "personaId": "persona_...",
    "driftScore": 0.41,
    "driftThreshold": 0.30
  }
}
Return to Quickstart for the chat-session basics, or read the Chat API reference.