project-lyra

serversdown/project-lyra

Fork 0

Commit Graph

Author	SHA1	Message	Date
serversdown	03aceec6fa	feat(P3): mind/mouth split — separate voice model for the final reply (seam, default off) The mind (chat backend/model) decides, reasons, and runs tools → a draft; the mouth re-voices that draft in her character. Default: no mouth configured → the mind's draft IS the reply, bit-for-bit the old behavior (and old streaming path untouched). - config: MOUTH_BACKEND / MOUTH_MODEL. The slot for an eventual fine-tuned voice. - chat: _mind_loop (tool/generation loop, non-stream, returns draft + tools_run), _voice_pass / mind.voice_messages (re-voice the draft, keep every fact/number), _mouth_target (active only when configured AND != mind). respond + respond_stream branch: mouth off = stream the mind directly (unchanged); mouth on = mind decides + runs tools, then the mouth streams the re-voiced reply. Falls back to the draft on any mouth failure (chat never breaks). - Key payoff: the mouth needs no tool support (the mind handles tools), so it can be a non-tool character model (Dolphin / Claude / fine-tune). Makes the fine-tune easy: teach a small model to sound like Lyra, not to be smart. - tests: mouth target on/off, voice_messages shape, voice_pass revoice+fallback. Suite 96 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 06:08:06 +00:00
serversdown	904eda3388	refactor(P1): extract the turn pipeline into lyra/mind.py (behavior-preserving) First step of the cognition control plane (docs/COGNITION.md). The chat turn is now an explicit society of parts over a shared TurnContext blackboard: perceive (stub) -> route (session mode) -> compose (tiered prompt) -> deliberate. - lyra/mind.py (new): TurnContext + the pipeline + assemble(); moved build_messages and the deliberation helpers here (the assembly belongs in the control plane). - lyra/chat.py: slimmed to "speak + persist" — calls mind.assemble(), runs the tool/generation loop, persists. No behavior change (same prompt, same output). - tests: point test_time/test_chat at mind; add an assemble() structure test; make test_chat/test_tools hermetic (CHAT_DELIBERATE off so respond() doesn't make a real LLM call). Suite 86 green in ~5s, ruff clean, no import cycle. This is the frame; perceive/route/learn get filled in next phases — each opt-in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 05:19:39 +00:00
serversdown	97afa82594	feat: live chat deliberation — think privately before answering (less 'meh') The chat had no thinking in it: respond() was a single gpt-4o call in default- assistant voice (numbered lists, 'would you like to...', vague). All the cognition work was background-only. This brings a thought step into the conversation. - chat: before answering a substantive turn (trivial 'ok/lol' skipped), a private _deliberate() pass — "what do you ACTUALLY think, your real take, the substance, no pleasantries" — drawing on her in-context threads/journal. The thinking is then injected as the LAST system note with voice enforcement (answer from this; no numbered list / how-to outline unless asked; no 'would you like to' closer), so it beats gpt-4o's boilerplate at the most influential position. Logged to /logs. - Wired into respond() + respond_stream(). Config CHAT_DELIBERATE (default on) to disable if the extra call's latency annoys. - persona: "talk, don't outline" — prose over listicles, the first concrete move over a survey of options. - test_chat.py (gating + note composition + disabled). Suite 84, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 00:35:49 +00:00

Author

SHA1

Message

Date

serversdown

03aceec6fa

feat(P3): mind/mouth split — separate voice model for the final reply (seam, default off)

The mind (chat backend/model) decides, reasons, and runs tools → a draft; the mouth
re-voices that draft in her character. Default: no mouth configured → the mind's
draft IS the reply, bit-for-bit the old behavior (and old streaming path untouched).

- config: MOUTH_BACKEND / MOUTH_MODEL. The slot for an eventual fine-tuned voice.
- chat: _mind_loop (tool/generation loop, non-stream, returns draft + tools_run),
  _voice_pass / mind.voice_messages (re-voice the draft, keep every fact/number),
  _mouth_target (active only when configured AND != mind). respond + respond_stream
  branch: mouth off = stream the mind directly (unchanged); mouth on = mind decides
  + runs tools, then the mouth streams the re-voiced reply. Falls back to the draft
  on any mouth failure (chat never breaks).
- Key payoff: the mouth needs no tool support (the mind handles tools), so it can be
  a non-tool character model (Dolphin / Claude / fine-tune). Makes the fine-tune
  easy: teach a small model to *sound* like Lyra, not to be smart.
- tests: mouth target on/off, voice_messages shape, voice_pass revoice+fallback.
  Suite 96 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-24 06:08:06 +00:00

serversdown

904eda3388

refactor(P1): extract the turn pipeline into lyra/mind.py (behavior-preserving)

First step of the cognition control plane (docs/COGNITION.md). The chat turn is now
an explicit society of parts over a shared TurnContext blackboard:
  perceive (stub) -> route (session mode) -> compose (tiered prompt) -> deliberate.

- lyra/mind.py (new): TurnContext + the pipeline + assemble(); moved build_messages
  and the deliberation helpers here (the assembly belongs in the control plane).
- lyra/chat.py: slimmed to "speak + persist" — calls mind.assemble(), runs the
  tool/generation loop, persists. No behavior change (same prompt, same output).
- tests: point test_time/test_chat at mind; add an assemble() structure test;
  make test_chat/test_tools hermetic (CHAT_DELIBERATE off so respond() doesn't make
  a real LLM call). Suite 86 green in ~5s, ruff clean, no import cycle.

This is the frame; perceive/route/learn get filled in next phases — each opt-in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-24 05:19:39 +00:00

serversdown

97afa82594

feat: live chat deliberation — think privately before answering (less 'meh')

The chat had no thinking in it: respond() was a single gpt-4o call in default-
assistant voice (numbered lists, 'would you like to...', vague). All the cognition
work was background-only. This brings a thought step into the conversation.

- chat: before answering a substantive turn (trivial 'ok/lol' skipped), a private
  _deliberate() pass — "what do you ACTUALLY think, your real take, the substance,
  no pleasantries" — drawing on her in-context threads/journal. The thinking is then
  injected as the LAST system note with voice enforcement (answer from this; no
  numbered list / how-to outline unless asked; no 'would you like to' closer), so it
  beats gpt-4o's boilerplate at the most influential position. Logged to /logs.
- Wired into respond() + respond_stream(). Config CHAT_DELIBERATE (default on) to
  disable if the extra call's latency annoys.
- persona: "talk, don't outline" — prose over listicles, the first concrete move
  over a survey of options.
- test_chat.py (gating + note composition + disabled). Suite 84, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-23 00:35:49 +00:00

3 Commits