perf: tighten the dynamic prompt — persona split + lean deliberation

The per-turn prompt was ~5.5K tokens (persona alone ~40%), sent up to 3x/turn. Tightened by RELEVANCE (the control plane decides what each turn needs), not by deletion — fidelity preserved, focus improved (buried instructions were getting ignored), tokens roughly halved. - persona split: core (identity + voice — always) vs situational sections pulled in only when relevant. mind._persona_block: self-model/origin only on meta turns (generous _META_HINTS), poker guardrails only in poker context (mode/strategic/ _POKER_HINTS). persona.core_prompt()/section(); system_prompt() kept as fallback. - lean deliberation: the private 'what do I think' pass now uses a focused context (her interiority + recent turns + the message), not the full persona/profile/ narrative/recall dump. It shapes the take, not the voice. Measured: casual Talk turn 21,949 -> 15,974 chars (-27%); deliberation 21,949 -> 6,026 (-72%); meta turns still include the self-model. Suite 98 green, ruff clean. Real retirement of the long prompt is still the fine-tune (mouth); this is the cheap, high-leverage cut that also improves adherence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:48:44 +00:00
parent 8a3c9b2701
commit 51c2d6abb9
3 changed files with 121 additions and 16 deletions
@@ -1,20 +1,60 @@
 """Persona: Lyra's identity and voice, loaded from an editable markdown prompt.

-The prompt lives in `personas/<name>.md` so it can be tuned without touching
-code. `LYRA_PERSONA` selects which file to load (default: "lyra").
+The prompt lives in `personas/<name>.md` so it can be tuned without touching code.
+`LYRA_PERSONA` selects which file to load (default: "lyra").
+
+The file is split on `## ` headers so the control plane can include only what a turn
+needs: the **core** (identity + voice — the anti-generic essentials) is always sent;
+the heavier situational sections (her origin, the self-model, the poker guardrails)
+are pulled in by `mind` only when relevant. This keeps the per-turn prompt tight
+without losing fidelity. `system_prompt()` still returns the whole thing (fallback).
 """
 from __future__ import annotations

 import os
+import re
 from functools import lru_cache
 from pathlib import Path

 _PERSONA_DIR = Path(__file__).parent / "personas"

+# Sections always sent (besides the intro) — the voice + identity that keep her her.
+_CORE = ("Who you are", "How you talk", "Right now")
+
+
+def _name(name: str | None) -> str:
+    return name or os.getenv("LYRA_PERSONA", "lyra")
+
+
+@lru_cache(maxsize=None)
+def _sections(name: str) -> dict[str, str]:
+    """Parse the persona file into {header: text}; the pre-header preamble is 'intro'."""
+    text = (_PERSONA_DIR / f"{name}.md").read_text(encoding="utf-8").strip()
+    chunks = re.split(r"(?m)^## ", text)
+    out = {"intro": chunks[0].strip()}
+    for ch in chunks[1:]:
+        header = ch.split("\n", 1)[0].strip()
+        out[header] = ("## " + ch).strip()
+    return out
+

@lru_cache(maxsize=None)
 def system_prompt(name: str | None = None) -> str:
-    """Return the persona system prompt. Cached; pass a name to override env."""
-    name = name or os.getenv("LYRA_PERSONA", "lyra")
-    path = _PERSONA_DIR / f"{name}.md"
-    return path.read_text(encoding="utf-8").strip()
+    """The full persona (every section). Fallback / back-compat."""
+    return (_PERSONA_DIR / f"{_name(name)}.md").read_text(encoding="utf-8").strip()
+
+
+def core_prompt(name: str | None = None) -> str:
+    """Intro + the always-on core sections (identity + voice)."""
+    s = _sections(_name(name))
+    parts = [s["intro"]] + [section(h, name) for h in _CORE]
+    return "\n\n".join(p for p in parts if p)
+
+
+def section(header_prefix: str, name: str | None = None) -> str:
+    """A situational section by header prefix (e.g. 'How you actually work'); '' if absent."""
+    pref = header_prefix.lower()
+    for header, body in _sections(_name(name)).items():
+        if header.lower().startswith(pref):
+            return body
+    return ""