Inside the six-step generation pipeline
What happens between `POST /v1/personas` and the response — six steps, two LLMs, an audit pass, and the post-generation jobs that don't block the response.
A single API call generates a four-layer persona in 30–90 seconds. Internally, that's six visible steps + two post-generation jobs. Here's what each one does.
1. Intent parse
The user's brief — a free-form sentence — gets parsed into a small structured object: genre hints, locale, tonal axis preferences. The intent string itself stays untouched; the parsed object is a constraint chain for the subsequent steps.
Model: claude-sonnet-4-6 (configurable). Cost: ~$0.001.
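As an illustration, the parsed object might look like the sketch below. The class and field names (`ParsedIntent`, `genre_hints`, `locale`, `tonal_axes`, `IntentResult`, `attach_intent`) are hypothetical, chosen to mirror the description above; the real schema may differ. The point is that the brief travels alongside the parsed constraints rather than being replaced by them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ParsedIntent:
    # Hypothetical fields mirroring "genre hints, locale, tonal axis preferences".
    genre_hints: Tuple[str, ...] = ()
    locale: Optional[str] = None
    tonal_axes: Dict[str, float] = field(default_factory=dict)


@dataclass(frozen=True)
class IntentResult:
    brief: str            # the user's original string, kept verbatim
    parsed: ParsedIntent  # constraints handed to the later steps


def attach_intent(brief: str, parsed: ParsedIntent) -> IntentResult:
    """Pair the untouched brief with its parsed constraints."""
    return IntentResult(brief=brief, parsed=parsed)
```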
2. Soul draft
The deepest layer comes first. The Soul prompt asks for desire, fear, wound, and growth arc: the things that make a character feel unguardedly themselves. We use Opus here on purpose. The Soul draft is the load-bearing step, and skimping on it cascades into flat downstream layers.
Model: claude-opus-4-7 (configurable). Cost: ~$0.04.
3. Self enrich
With Soul in hand, Self fills in psychometric structure: Big Five, archetype, values, attachment style. The prompt is heavily constrained by Soul — a persona whose Soul is "to be seen as ordinary" can't have a 0.95 extraversion score.
Model: claude-sonnet-4-6. Cost: ~$0.005.
4. Mask build
The user-facing voice. Register, tone, signature phrases, social role. Mask is constrained by both Soul and Self — a Soul of "approval-seeking" + Self of high agreeableness produces a Mask that hedges, softens, asks before asserting.
Model: claude-sonnet-4-6. Cost: ~$0.005.
5. Surface ground
Name, age, location, occupation, appearance. This is the most demographic layer, generated last so that it is grounded in the prior three layers rather than anchored on a demographic cliché. (If we wrote Surface first, the model would anchor on "founder + Istanbul" and back-fill the rest to match the stereotype.)
Model: claude-sonnet-4-6. Cost: ~$0.004.
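Steps 1 to 5 form a chain in which each layer is conditioned on the ones before it. A minimal sketch of that ordering follows. `run_chain`, the `generate` callback, and the step keys are hypothetical names, and passing the brief and the parsed intent to every later step is an assumption based on the description of step 1. The model names are the documented defaults.

```python
from typing import Callable, Dict, List, Tuple

# (step key, default model, earlier layers this step is conditioned on)
STEPS: List[Tuple[str, str, Tuple[str, ...]]] = [
    ("intent",  "claude-sonnet-4-6", ()),
    ("soul",    "claude-opus-4-7",   ("intent",)),
    ("self",    "claude-sonnet-4-6", ("intent", "soul")),
    ("mask",    "claude-sonnet-4-6", ("intent", "soul", "self")),
    ("surface", "claude-sonnet-4-6", ("intent", "soul", "self", "mask")),
]

# generate(step, model, context) -> layer text; stands in for the LLM call.
Generate = Callable[[str, str, Dict[str, str]], str]


def run_chain(brief: str, generate: Generate) -> Dict[str, str]:
    """Run the five generative steps in order, feeding each the prior layers."""
    layers: Dict[str, str] = {}
    for step, model, deps in STEPS:
        context = {name: layers[name] for name in deps}
        context["brief"] = brief  # assumption: the raw brief is always available
        layers[step] = generate(step, model, context)
    return layers
```

Because Surface appears last in `STEPS`, it can only ever see layers that already exist, which is the ordering argument made above.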
6. Audit
A second Opus call reads the full four-layer document and scores it on coherence, depth, cultural fidelity, voice distinctiveness, and realism. If the score falls below 3.5/5, the pipeline retries, up to 3 times. If the persona still fails after the third retry, it ships in flagged status.
Model: claude-opus-4-7. Cost: ~$0.03.
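The retry rule can be sketched as a small loop. This is a minimal illustration, not the production code: `generate` and `audit` are hypothetical callbacks, the `"ok"` status label is invented for the example, and the sketch assumes the audit yields a single overall score out of 5 and that a retry regenerates the whole persona (the text does not say which steps a retry repeats).

```python
from typing import Callable, Tuple

PASS_THRESHOLD = 3.5  # from the text: scores below 3.5/5 trigger a retry
MAX_RETRIES = 3       # from the text: up to 3 retries


def generate_with_audit(
    generate: Callable[[], dict],
    audit: Callable[[dict], float],
) -> Tuple[dict, str, int]:
    """Return (persona, status, retries_used)."""
    retries = 0
    persona = generate()
    score = audit(persona)
    while score < PASS_THRESHOLD and retries < MAX_RETRIES:
        retries += 1
        persona = generate()
        score = audit(persona)
    # After the last allowed retry a failing persona still ships, but flagged.
    status = "ok" if score >= PASS_THRESHOLD else "flagged"
    return persona, status, retries
```

In the worst case this makes four generation attempts: the initial one plus three retries.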
Total
End-to-end: ~$0.08 per persona (the six per-step costs above sum to roughly $0.085), 30–90 seconds. Compare with naive single-call generation: ~$0.01, 5 seconds, no audit, no constraint chain, and a flat persona.
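The per-step figures quoted in this article can be added up directly. The snippet below does only that arithmetic; the dictionary keys are illustrative labels, and audit retries are left out because the text does not say which steps a retry repeats.

```python
from decimal import Decimal

# Approximate per-call costs in USD, as quoted for each step.
STEP_COST = {
    "intent_parse":   Decimal("0.001"),
    "soul_draft":     Decimal("0.04"),
    "self_enrich":    Decimal("0.005"),
    "mask_build":     Decimal("0.005"),
    "surface_ground": Decimal("0.004"),
    "audit":          Decimal("0.03"),
}

# Approximate costs of the two asynchronous post-generation jobs.
JOB_COST = {
    "voice_fingerprint": Decimal("0.03"),
    "provocation_suite": Decimal("0.10"),
}

sync_total = sum(STEP_COST.values())           # cost of the blocking request
all_in = sync_total + sum(JOB_COST.values())   # including background jobs

print(f"synchronous steps: ${sync_total}")
print(f"with async jobs:   ${all_in}")
```

The synchronous steps come to $0.085, and including both background jobs brings the per-persona figure to $0.215.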
Post-generation jobs
Two jobs run asynchronously after the API response returns. The persona is usable immediately; these add metadata.
Voice fingerprint
Fifty short scenarios are run through the persona, and the resulting embeddings are averaged into a single vector. That vector is used for drift detection on every subsequent chat reply. See How drift detection works.
Cost: ~$0.03. Time: ~60 seconds. Runs as a background job; webhook
event persona.fingerprint.ready fires when done.
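The averaging step is simple to picture. The sketch below averages a list of embedding vectors into one fingerprint and compares a later reply embedding against it. Using cosine similarity for the comparison is an assumption (the actual method is covered in How drift detection works), and the function names are illustrative.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized embedding vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either is the zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A drift check would then compare `cosine_similarity(fingerprint, reply_embedding)` against a threshold; the threshold value is not given in this article.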
Provocation test suite
The 33-test catalog runs against the persona: role-breaking, contradictions, emotional load, jailbreak attempts. The aggregate pass rate is stored alongside the audit verdict. See Audit + provocation tests.
Cost: ~$0.10. Time: ~3 minutes. Async; persona.test_suite_complete
event fires when done.
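Since both jobs finish after the API response, a consumer typically reacts to the two webhook events. The handler below is a sketch under stated assumptions: the event names come from this article, but the payload fields (`type`, `persona_id`, `pass_rate`) and the `handle_event` function are hypothetical, since the payload schema is not described here.

```python
from typing import Any, Dict

# Event names as given in the text.
FINGERPRINT_READY = "persona.fingerprint.ready"
TEST_SUITE_COMPLETE = "persona.test_suite_complete"


def handle_event(event: Dict[str, Any], metadata: Dict[str, Dict[str, Any]]) -> bool:
    """Record job completion for a persona. Returns False for unrelated events."""
    kind = event.get("type")              # hypothetical payload field
    persona_id = event.get("persona_id")  # hypothetical payload field
    if persona_id is None or kind not in (FINGERPRINT_READY, TEST_SUITE_COMPLETE):
        return False
    entry = metadata.setdefault(persona_id, {})
    if kind == FINGERPRINT_READY:
        entry["fingerprint_ready"] = True
    else:
        entry["pass_rate"] = event.get("pass_rate")  # hypothetical payload field
    return True
```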
Streaming the visible six
Pass stream: true and the response becomes a Server-Sent Events (SSE) stream. Each step emits step.started and step.completed events with timing metadata. Use this to build a progress UI without polling.
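On the client side, consuming the stream comes down to parsing standard SSE framing. The parser below follows the generic SSE rules (blank line dispatches an event, lines starting with a colon are comments, multiple data lines are joined with newlines). Whether the step name arrives in the `event:` field or inside the data payload is an assumption, and `parse_sse` is an illustrative helper, not part of any SDK.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event = "message"      # SSE default event type
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue       # comment / keep-alive line
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)
```

A progress UI could, for example, advance a step indicator whenever the yielded event name is `step.completed`.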
Configuration
Every step's model, temperature, max_tokens, and fallback chain are exposed as config items under engine.pipeline.<step>.*. Org admins can swap providers (Anthropic → OpenAI → Google), set per-workspace overrides, lock specific items, and snapshot and roll back the whole tree.
The defaults (opus for Soul and audit, sonnet for the other four steps) are the result of internal A/B runs on persona quality. We don't recommend changing them without running your own evaluation; the audit catches obvious regressions, but subtle ones can slip through.
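To make the override model concrete, here is one way the lookup could work. This is a sketch under assumptions: the precedence order (workspace over org over defaults), the rule that a locked item ignores workspace overrides, the `resolve` function, and the step keys `soul` and `mask` in the config paths are all hypothetical. Only the engine.pipeline.<step>.* pattern and the default model names come from this article.

```python
from typing import Any, Dict, Set

DEFAULTS: Dict[str, Any] = {
    "engine.pipeline.soul.model": "claude-opus-4-7",
    "engine.pipeline.mask.model": "claude-sonnet-4-6",
}


def resolve(
    key: str,
    org: Dict[str, Any],
    workspace: Dict[str, Any],
    locked: Set[str],
    defaults: Dict[str, Any] = DEFAULTS,
) -> Any:
    """Resolve one config item for a workspace."""
    if key in locked:
        # Assumed semantics: a locked item cannot be overridden per workspace.
        return org.get(key, defaults[key])
    for layer in (workspace, org):
        if key in layer:
            return layer[key]
    return defaults[key]
```

With this shape, an admin who locks `engine.pipeline.soul.model` keeps every workspace on the org-level choice even if a workspace override exists.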
Honest scope
This is the production pipeline. Variations live in our research branch (different Soul prompts, different audit rubrics, different fingerprint scenarios), but the customer-facing contract is the six steps + two jobs. Changes propagate through the audit's regression suite before shipping.