feat(P3): mind/mouth split — separate voice model for the final reply (seam, default off)

The mind (chat backend/model) decides, reasons, and runs tools → a draft; the mouth
re-voices that draft in her character. Default: no mouth configured → the mind's
draft IS the reply, bit-for-bit the old behavior (and old streaming path untouched).

- config: MOUTH_BACKEND / MOUTH_MODEL. The slot for an eventual fine-tuned voice.
- chat: _mind_loop (tool/generation loop, non-stream, returns draft + tools_run),
  _voice_pass / mind.voice_messages (re-voice the draft, keep every fact/number),
  _mouth_target (active only when configured AND != mind). respond + respond_stream
  branch: mouth off = stream the mind directly (unchanged); mouth on = mind decides
  + runs tools, then the mouth streams the re-voiced reply. Falls back to the draft
  on any mouth failure (chat never breaks).
- Key payoff: the mouth needs no tool support (the mind handles tools), so it can be
  a non-tool character model (Dolphin / Claude / fine-tune). Makes the fine-tune
  easy: teach a small model to *sound* like Lyra, not to be smart.
- tests: mouth target on/off, voice_messages shape, voice_pass revoice+fallback.
  Suite 96 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-24 06:08:06 +00:00
parent a7af461cdb
commit 03aceec6fa
5 changed files with 183 additions and 58 deletions
+2
View File
@@ -49,3 +49,5 @@ PING_AUTO_SALIENCE=0.8 # a thought this salient auto-pings even without an exp
PING_COOLDOWN_MIN=60 # min minutes between AUTO pings (explicit reach-outs bypass)
DIGEST_HOUR=18 # local hour to send her daily "what I've been thinking" digest
CHAT_DELIBERATE=true # think privately before answering substantive chat turns (false = faster, shallower)
MOUTH_BACKEND= # mind/mouth split: separate character/voice model for the final reply (empty = mind speaks)
MOUTH_MODEL=