Moonborn — Developers

Inside the six-step generation pipeline

What happens between `POST /v1/personas` and the response — six steps, two LLMs, an audit pass, and the post-generation jobs that don't block the response.

A single API call generates a four-layer persona in 30–90 seconds. Internally, that call runs six visible steps plus two post-generation jobs. Here's what each one does.

1. Intent parse

The user's brief, a free-form sentence, is parsed into a small structured object: genre hints, locale, tonal axis preferences. The intent string itself stays untouched; the parsed object is the first link in the constraint chain that the subsequent steps build on.

Model: claude-sonnet-4-6 (configurable). Cost: ~$0.001.
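As a rough illustration of what "a small structured object" could look like, here is a minimal sketch. The class name, field names, and the signed-float representation of tonal axes are all assumptions made for the example; the actual schema is not documented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ParsedIntent:
    """Hypothetical shape of the step-1 output (field names are assumptions)."""

    brief: str                                   # original intent string, kept verbatim
    genre_hints: Tuple[str, ...] = ()
    locale: Optional[str] = None
    tonal_axes: Dict[str, float] = field(default_factory=dict)  # assumed signed scale


def to_constraints(intent: ParsedIntent) -> List[str]:
    """Flatten the parsed object into constraint lines a later prompt could include."""
    lines: List[str] = []
    if intent.genre_hints:
        lines.append("genre: " + ", ".join(intent.genre_hints))
    if intent.locale:
        lines.append(f"locale: {intent.locale}")
    for axis, value in sorted(intent.tonal_axes.items()):
        lines.append(f"tone.{axis}: {value:+.1f}")
    return lines
```

The point of the sketch is the separation: `brief` is carried along unchanged, while the derived fields are what later steps would consume as constraints.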

2. Soul draft

The deepest layer first. The Soul prompt asks for desire, fear, wound, and growth arc: the things that make a character feel unguardedly themselves. We use Opus here on purpose. The Soul draft is the load-bearing step, and skimping on it cascades into flat downstream layers.

Model: claude-opus-4-7 (configurable). Cost: ~$0.04.

3. Self enrich

With Soul in hand, Self fills in psychometric structure: Big Five, archetype, values, attachment style. The prompt is heavily constrained by Soul — a persona whose Soul is "to be seen as ordinary" can't have a 0.95 extraversion score.

Model: claude-sonnet-4-6. Cost: ~$0.005.

4. Mask build

The user-facing voice. Register, tone, signature phrases, social role. Mask is constrained by both Soul and Self — a Soul of "approval-seeking" + Self of high agreeableness produces a Mask that hedges, softens, asks before asserting.

Model: claude-sonnet-4-6. Cost: ~$0.005.

5. Surface ground

Name, age, location, occupation, appearance. The most demographic layer, generated last so it grounds itself in the prior three rather than anchoring on a demographic cliché. (If we wrote Surface first, the model would anchor on "founder + Istanbul" and back-fill the rest to match the stereotype.)

Model: claude-sonnet-4-6. Cost: ~$0.004.
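The ordering in steps 2 through 5 can be sketched as a chain in which each layer only sees the layers generated before it. This is a minimal sketch, not the production code: `CHAIN`, `run_chain`, and the dict-based layer format are hypothetical names chosen for the example, and the generator functions stand in for the LLM calls.

```python
from typing import Callable, Dict, List, Tuple

Layers = Dict[str, dict]
# A step generator receives the parsed intent plus the layers it may see.
StepFn = Callable[[dict, Layers], dict]

# Order and visibility follow the text: Soul first, Surface last.
CHAIN: List[Tuple[str, List[str]]] = [
    ("soul", []),
    ("self", ["soul"]),
    ("mask", ["soul", "self"]),
    ("surface", ["soul", "self", "mask"]),
]


def run_chain(intent: dict, generators: Dict[str, StepFn]) -> Layers:
    """Run each step in order, passing only the prior layers as constraints."""
    layers: Layers = {}
    for name, visible in CHAIN:
        context = {key: layers[key] for key in visible}
        layers[name] = generators[name](intent, context)
    return layers
```

Because Surface receives all three earlier layers and contributes to none of them, a demographic detail cannot leak backwards into Soul, Self, or Mask.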

6. Audit

A second Opus call reads the full four-layer document and scores it on coherence, depth, cultural fidelity, voice distinctiveness, and realism. If the score is below 3.5/5, the pipeline retries (up to 3 times). If the third retry still scores below 3.5, the persona ships in flagged status.

Model: claude-opus-4-7. Cost: ~$0.03.
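The retry rule can be sketched as a small loop. The threshold and retry count come from the text above; everything else is an assumption for illustration: that the overall score is an unweighted mean of the five dimensions, that a retry regenerates the whole persona, and the names `generate_with_audit`, `generate`, and `audit`.

```python
from statistics import mean
from typing import Callable, Dict, Tuple

THRESHOLD = 3.5   # from the text: scores below 3.5/5 trigger a retry
MAX_RETRIES = 3   # from the text: up to 3 retries


def generate_with_audit(
    generate: Callable[[], dict],
    audit: Callable[[dict], Dict[str, float]],
) -> Tuple[dict, str, int]:
    """Return (persona, status, attempts); status is 'passed' or 'flagged'."""
    attempts = 0
    while True:
        attempts += 1
        persona = generate()                    # assumed: regenerate everything
        scores = audit(persona)                 # one score per rubric dimension
        overall = mean(list(scores.values()))   # assumed: unweighted mean
        if overall >= THRESHOLD:
            return persona, "passed", attempts
        if attempts > MAX_RETRIES:              # first attempt + 3 retries used up
            return persona, "flagged", attempts
```

Under these assumptions a persona that never clears the bar costs four generation passes and four audit calls before it ships flagged.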

Total

End-to-end: ~$0.085 per persona (the sum of the six step costs above, before any audit retries) and 30–90 seconds. Compare with naive single-call generation: ~$0.01 and 5 seconds, but with no audit, no constraint chain, and a flat persona.
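For reference, the arithmetic behind the total, using the approximate per-step costs quoted in this document. The dictionary keys are illustrative labels, not config names, and the figures assume a single pass with no audit retries.

```python
# Approximate per-step costs in USD, as quoted in the step descriptions.
STEP_COSTS_USD = {
    "intent_parse": 0.001,
    "soul_draft": 0.04,
    "self_enrich": 0.005,
    "mask_build": 0.005,
    "surface_ground": 0.004,
    "audit": 0.03,
}

# Approximate costs of the two asynchronous post-generation jobs.
POST_JOB_COSTS_USD = {
    "voice_fingerprint": 0.03,
    "provocation_suite": 0.10,
}

# Cost of the six blocking steps, rounded to tenths of a cent.
blocking = round(sum(STEP_COSTS_USD.values()), 3)

# Cost once both background jobs have also run.
with_jobs = round(blocking + sum(POST_JOB_COSTS_USD.values()), 3)

print(f"six steps: ${blocking:.3f}, with post-generation jobs: ${with_jobs:.3f}")
```

The two Opus calls (Soul draft and Audit) account for $0.07 of the blocking cost, which is why the choice of model for those two steps dominates the budget.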

Post-generation jobs

Two jobs run asynchronously after the API response returns. The persona is usable immediately; these add metadata.

Voice fingerprint

The persona answers fifty short scenarios, and the embeddings of those answers are averaged into a single vector. That vector is used for drift detection on every subsequent chat reply. See How drift detection works.

Cost: ~$0.03. Time: ~60 seconds. Runs as a background job; the webhook event `persona.fingerprint.ready` fires when done.
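A minimal sketch of the averaging step, plus one plausible way to compare a new reply against the fingerprint. The averaging follows the description above; the use of cosine similarity, the `min_similarity` threshold of 0.8, and all function names are assumptions for illustration, since the actual drift metric is described elsewhere.

```python
import math
from typing import List, Sequence


def fingerprint(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equal-length embedding vectors."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def has_drifted(fp: Sequence[float], reply: Sequence[float],
                min_similarity: float = 0.8) -> bool:
    """Assumed rule: flag a reply whose similarity to the fingerprint is too low."""
    return cosine_similarity(fp, reply) < min_similarity
```

Averaging fifty embeddings smooths out scenario-specific content, so what remains is closer to the persona's consistent voice than to any single topic.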

Provocation test suite

The 33-test catalog runs against the persona — role-breaking, contradictions, emotional load, jailbreak attempts. Aggregate pass rate stored alongside the audit verdict. See Audit + provocation tests.

Cost: ~$0.10. Time: ~3 minutes. Runs asynchronously; the `persona.test_suite_complete` event fires when done.
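Because the persona is usable before either job finishes, a consumer may want to track which metadata has arrived. The sketch below does that from the two event names given above. The payload shape (a `type` field and a `persona_id` field) and the function names are hypothetical; check the webhook reference for the real schema.

```python
from typing import Dict, Set

# Event names are from this document; the job labels are illustrative.
POST_JOB_EVENTS = {
    "persona.fingerprint.ready": "fingerprint",
    "persona.test_suite_complete": "test_suite",
}


def apply_event(state: Dict[str, Set[str]], event: dict) -> Dict[str, Set[str]]:
    """Return a new state with the finished job recorded for the persona."""
    job = POST_JOB_EVENTS.get(event.get("type"))
    if job is None:
        return state                      # ignore unrelated events
    persona_id = event["persona_id"]      # assumed payload field
    new_state = dict(state)
    new_state[persona_id] = set(state.get(persona_id, set())) | {job}
    return new_state


def fully_enriched(state: Dict[str, Set[str]], persona_id: str) -> bool:
    """True once both post-generation jobs have reported in."""
    return state.get(persona_id, set()) == set(POST_JOB_EVENTS.values())
```

Recording jobs in a set makes the handler idempotent, which matters if a webhook is delivered more than once.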

Streaming the visible six

Pass `stream: true` and the response becomes a Server-Sent Events (SSE) stream. Each step emits `step.started` and `step.completed` events with timing metadata. Use this to build progress UI without polling.
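A minimal sketch of consuming such a stream, assuming each event arrives as a standard SSE `event:` line followed by a JSON `data:` line. The payload fields `step` and `duration_ms` are hypothetical stand-ins for the timing metadata, and the parser works on already-decoded text lines so it can sit behind any HTTP client.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, json_payload) pairs from SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                       # a blank line dispatches the event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):           # comment line, e.g. a keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def completed_steps(lines: Iterable[str]) -> List[str]:
    """Names of steps that have finished, in arrival order."""
    return [
        payload.get("step")
        for name, payload in iter_sse_events(lines)
        if name == "step.completed"
    ]
```

A progress bar can then be driven by `len(completed_steps(...))` out of six, with `step.started` events used to label the step currently running.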

Configuration

Every step's model, temperature, max_tokens, and fallback chain are exposed as config items under `engine.pipeline.<step>.*`. Org admins can swap providers (Anthropic → OpenAI → Google), set per-workspace overrides, lock specific items, and snapshot and roll back the whole tree.
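One way to picture how org defaults, workspace overrides, and locks could interact is the lookup below. It is a sketch under stated assumptions: the flat dotted-key layout, the step name `soul` in the key, the temperature values, and the rule that a locked item ignores workspace overrides are all illustrative guesses, not documented behavior.

```python
from typing import Any, Dict, Set


def resolve(key: str, org: Dict[str, Any], workspace: Dict[str, Any],
            locked: Set[str]) -> Any:
    """Return the effective value for a config key.

    Assumed precedence: a workspace override wins unless the key is locked,
    in which case the org-level value always applies.
    """
    if key in locked or key not in workspace:
        return org[key]
    return workspace[key]


# Hypothetical org-level defaults and one workspace's overrides.
org_defaults = {
    "engine.pipeline.soul.model": "claude-opus-4-7",
    "engine.pipeline.soul.temperature": 0.7,
}
workspace_overrides = {
    "engine.pipeline.soul.model": "claude-sonnet-4-6",
    "engine.pipeline.soul.temperature": 0.9,
}
locked_keys = {"engine.pipeline.soul.model"}

effective_model = resolve("engine.pipeline.soul.model",
                          org_defaults, workspace_overrides, locked_keys)
effective_temperature = resolve("engine.pipeline.soul.temperature",
                                org_defaults, workspace_overrides, locked_keys)
```

In this example the workspace can tune temperature but cannot downgrade the Soul model, because the admin has locked that item.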

The defaults (Opus for Soul and Audit, Sonnet for the other four steps) are the result of internal A/B runs on persona quality. We don't recommend changing them without running your own evaluation; the audit catches obvious regressions, but subtle ones can slip through.

Honest scope

This is the production pipeline. Variations live in our research branch (different Soul prompts, different audit rubrics, different fingerprint scenarios), but the customer-facing contract is the six steps + two jobs. Changes propagate through the audit's regression suite before shipping.