feat: thought loop — Lyra's threaded, surfaceable train of thought #4

Merged
serversdown merged 22 commits from feat/thought-loop into dev 2026-06-24 23:47:40 -04:00

22 Commits

Author SHA1 Message Date
serversdown d6f3516a34 perf: incremental era rebuilds — skip unchanged months
rebuild_eras() re-digested EVERY month from scratch on every coherence pass,
including old months whose sessions never change — ~17 redundant 32B calls per pass
(a big slice of the ~40-min consolidation grind + MI50 heat). Now it compares each
month's current session count to the stored era and only rebuilds changed months
(force=True still does all). Report gains built/skipped counts.

test_era.py: builds all first pass, skips unchanged, rebuilds only a month that
gained a session, force rebuilds all. Suite 99 green, ruff clean.

(Profile rebuild re-reading all 851 sessions every pass is the bigger remaining
hog — separate, harder fix.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 03:31:02 +00:00
serversdown 51c2d6abb9 perf: tighten the dynamic prompt — persona split + lean deliberation
The per-turn prompt was ~5.5K tokens (persona alone ~40%), sent up to 3x/turn.
Tightened by RELEVANCE (the control plane decides what each turn needs), not by
deletion — fidelity preserved, focus improved (buried instructions were getting
ignored), tokens roughly halved.

- persona split: core (identity + voice — always) vs situational sections pulled
  in only when relevant. mind._persona_block: self-model/origin only on meta turns
  (generous _META_HINTS), poker guardrails only in poker context (mode/strategic/
  _POKER_HINTS). persona.core_prompt()/section(); system_prompt() kept as fallback.
- lean deliberation: the private 'what do I think' pass now uses a focused context
  (her interiority + recent turns + the message), not the full persona/profile/
  narrative/recall dump. It shapes the take, not the voice.

Measured: casual Talk turn 21,949 -> 15,974 chars (-27%); deliberation 21,949 ->
6,026 (-72%); meta turns still include the self-model. Suite 98 green, ruff clean.

Real retirement of the long prompt is still the fine-tune (mouth); this is the
cheap, high-leverage cut that also improves adherence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:48:44 +00:00
serversdown 8a3c9b2701 feat: she can suggest + switch modes (set_mode tool + mode awareness)
"She suggests, you confirm" — instead of brittle keyword→mode mapping, she's given
awareness of her modes + the ability to switch, and her judgment decides when to
offer (the model reads "should I drive to Cleveland?" vs "should I fold the river"
far better than a lexicon could).

- tools: set_mode(mode) — switches the session's mode; in _BASE (all modes).
- mind: a per-turn mode-menu note listing her modes + "offer a switch when the work
  clearly shifts; on his yes, call set_mode; don't nag."
- Sticky mode stays manual otherwise; Poker still auto-engages on session start.
- test: set_mode switches + rejects unknown. Suite 97 green, ruff clean.

Note: server-side switch takes effect next turn; the UI badge syncs on next mode
load (cosmetic lag).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:32:42 +00:00
serversdown 17ab95dc98 feat: Decide mode — a tie-breaker that settles choices instead of listing options
Brian's bottleneck is committing, not generating options, so a pros/cons dump makes
it worse. Decide mode's card: get the real decision crisp, weigh it against what HE
values + past regrets (pull running_stats/recent_sessions for poker/money calls),
MAKE the call with the one or two reasons that tip it, pressure-test it once, and
stand behind it — no "it's up to you." Read-only lookups, no live logging.

Sixth mode (Talk/Poker/Build/Explore/Study/Decide); added to UI selectors, labels,
badge-cycle. Suite 96 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:21:03 +00:00
serversdown 03aceec6fa feat(P3): mind/mouth split — separate voice model for the final reply (seam, default off)
The mind (chat backend/model) decides, reasons, and runs tools → a draft; the mouth
re-voices that draft in her character. Default: no mouth configured → the mind's
draft IS the reply, bit-for-bit the old behavior (and old streaming path untouched).

- config: MOUTH_BACKEND / MOUTH_MODEL. The slot for an eventual fine-tuned voice.
- chat: _mind_loop (tool/generation loop, non-stream, returns draft + tools_run),
  _voice_pass / mind.voice_messages (re-voice the draft, keep every fact/number),
  _mouth_target (active only when configured AND != mind). respond + respond_stream
  branch: mouth off = stream the mind directly (unchanged); mouth on = mind decides
  + runs tools, then the mouth streams the re-voiced reply. Falls back to the draft
  on any mouth failure (chat never breaks).
- Key payoff: the mouth needs no tool support (the mind handles tools), so it can be
  a non-tool character model (Dolphin / Claude / fine-tune). Makes the fine-tune
  easy: teach a small model to *sound* like Lyra, not to be smart.
- tests: mouth target on/off, voice_messages shape, voice_pass revoice+fallback.
  Suite 96 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 06:08:06 +00:00
serversdown a7af461cdb feat(P2): perceive (read the moment) + route nudges register on charged turns
The control plane gains senses — cheap, deterministic, no LLM:
- lyra/perceive.py: lexicon+signal heuristic → {sentiment, intensity, tilt, kind:
  emotional|strategic|meta|build|casual}. Good at the action-relevant signal,
  especially tilt (the mental-game core). Word-boundary matching so 'line' doesn't
  fire inside 'pipeline'.
- mind: _perceive fills ctx.moment; _route keeps the manual mode as the dominant
  frame but, on a genuinely charged moment, adds a per-turn register nudge — tilt →
  "meet him there, warm and steady, don't clip into logging"; up/energized → "match
  his energy." Neutral turns get nothing (don't over-narrate). Injected via
  build_messages(moment=...). Logged to /logs for observability.
- tests: perceive read (tilt/strategy/up/build/casual) + route nudge on/off.
  Suite 92 green, ruff clean.

Complements modes (manual frame) — perceive refines register within it, doesn't
override. Model routing (mind/mouth) is P3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 05:42:36 +00:00
serversdown 904eda3388 refactor(P1): extract the turn pipeline into lyra/mind.py (behavior-preserving)
First step of the cognition control plane (docs/COGNITION.md). The chat turn is now
an explicit society of parts over a shared TurnContext blackboard:
  perceive (stub) -> route (session mode) -> compose (tiered prompt) -> deliberate.

- lyra/mind.py (new): TurnContext + the pipeline + assemble(); moved build_messages
  and the deliberation helpers here (the assembly belongs in the control plane).
- lyra/chat.py: slimmed to "speak + persist" — calls mind.assemble(), runs the
  tool/generation loop, persists. No behavior change (same prompt, same output).
- tests: point test_time/test_chat at mind; add an assemble() structure test;
  make test_chat/test_tools hermetic (CHAT_DELIBERATE off so respond() doesn't make
  a real LLM call). Suite 86 green in ~5s, ruff clean, no import cycle.

This is the frame; perceive/route/learn get filled in next phases — each opt-in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 05:19:39 +00:00
serversdown f1f15972ac feat: work-type modes — Talk / Poker / Build / Explore / Study
The manual version of the architecture's `route` step: Brian points her at the
TYPE of work and her register + tools shift to match. Biggest single lever on the
'meh' problem (a mode card can demand decisive/technical/generative, countering
gpt-4o's default warm-vapor).

- modes.py: Build (heads-down engineering — decisive, concrete, tradeoffs, no
  listicles), Explore (open brainstorming — generative, riffs + honest catch,
  spawn threads, don't converge early), Study (poker review away from the table —
  analytical, GTO-aware, teaching; read-only lookups + analyze_spot). Cash relabeled
  Poker (key kept for compat).
- UI: mode selectors (desktop + mobile) get all five; badge taps now cycle modes.
- design: docs/COGNITION.md (the society-of-parts control-plane sketch).
- tests: presence + tool-gating for the new modes. Suite 85, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 03:43:37 +00:00
serversdown 97afa82594 feat: live chat deliberation — think privately before answering (less 'meh')
The chat had no thinking in it: respond() was a single gpt-4o call in default-
assistant voice (numbered lists, 'would you like to...', vague). All the cognition
work was background-only. This brings a thought step into the conversation.

- chat: before answering a substantive turn (trivial 'ok/lol' skipped), a private
  _deliberate() pass — "what do you ACTUALLY think, your real take, the substance,
  no pleasantries" — drawing on her in-context threads/journal. The thinking is then
  injected as the LAST system note with voice enforcement (answer from this; no
  numbered list / how-to outline unless asked; no 'would you like to' closer), so it
  beats gpt-4o's boilerplate at the most influential position. Logged to /logs.
- Wired into respond() + respond_stream(). Config CHAT_DELIBERATE (default on) to
  disable if the extra call's latency annoys.
- persona: "talk, don't outline" — prose over listicles, the first concrete move
  over a survey of options.
- test_chat.py (gating + note composition + disabled). Suite 84, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 00:35:49 +00:00
serversdown ea30c3dd67 feat: chat-side feedback — reactions in conversation thread back to her thoughts
Closes the last loop gap: when she raised a thought in chat and Brian replied in
the conversation (not the feed), it was a dead end. Now she has a thought_response
tool — when he reacts to a thought she surfaced, she captures his take and it folds
back into that thread (next dream pass she reacts, like a feed reply).

- tools: _thought_response(thread_id, brian_said) -> thoughts.record_response.
- modes: thought_response added to _BASE (all modes).
- surfaced-note + context_note now expose each thread's #id and instruct her to use
  the tool when he engages, so she has what she needs to call it.
- test for the tool (threads reply back + bad-id handling). Suite 81, ruff clean.

Feedback now closes from both surfaces: the /thoughts feed AND live conversation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 23:26:40 +00:00
serversdown 149e9a6dd5 feat: proactive thoughts — auto-ping salient ones + daily digest
She was passive (thoughts piled up 'open'; Brian had to mine the feed). Now she
brings them to him:

- Live: a thought >= PING_AUTO_SALIENCE (0.8) auto-pings — _compose_reachout writes
  a short personal text in her voice (not a thought-dump), on a cooldown
  (PING_COOLDOWN_MIN=60, AUTO only; explicit reach-outs bypass), quiet hours respected.
- Daily: maybe_daily_digest() texts a once-per-local-day summary of what she's been
  turning over (after DIGEST_HOUR=18), run from the dream cycle.
- maybe_ping gains bypass_cooldown (her deliberate reach-outs always go through).

8 new/updated tests (auto-ping above/below bar, digest once-per-day, floor/cooldown
isolation). Suite 80 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 20:25:14 +00:00
serversdown cf4238911e fix: replying to a thought no longer mislabels it 'surfaced'
'surfaced' means SHE raised it with Brian (chat lead / ping). record_response was
also setting it on Brian's reply, so every thread he touched looked surfaced even
though she never brought it to him. Replying now just stores the pending response;
status stays honest (only her surfacing sets 'surfaced').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 20:18:12 +00:00
serversdown 3dd9eb5a3e feat(mobile): Thoughts in the mobile menu + full nav drawer on secondary pages
- Chat page: add "💭 Thoughts" to the mobile slide-out menu (with /thoughts handler),
  grouped with Journal. Thoughts was the one page mobile couldn't reach.
- nav.js: on mobile, secondary pages (Thoughts/Journal/Mind/Session/History/Hands/
  Logs) now get a ☰ slide-in drawer with the full nav + Settings — matching the
  desktop sidebar. Gated to pages without their own mobile menu, so the chat page's
  tailored hamburger/tab-bar is left untouched. Shared ITEMS list = one source of truth.

Static-only (no server change). 77 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 19:39:55 +00:00
serversdown a7966e4bab feat: web switch for her inner voice (Dolphin/3090 | Qwen-32B/MI50 | Off)
Her introspection (reflect/think) voice is now switchable live from the web
settings, read each cycle by the dream loop — so Brian can flip it off the 3090
before gaming without touching config or restarting.

- memory: runtime key/value settings table + get_setting/set_setting.
- self_state: INTROSPECTION_MODES (dolphin=local/dolphin3:8b, mi50=Qwen-32B,
  off=paused) + introspection_target()/set_introspection_mode(); default "dolphin".
  reflect() resolves from the live setting and SKIPS entirely when off.
- thoughts.think(): same resolution + skip-when-off.
- server: GET/POST /settings/introspection.
- index.html: "Inner Voice (introspection)" selector in Settings, applies instantly.
- tests: routing (dolphin/mi50), off-skip for think + reflect. Suite 77, ruff clean.

Default = Dolphin on the 3090 (richer voice). Flip to MI50 or Off in Settings
before gaming — that was the GPU-contention culprit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 19:16:35 +00:00
serversdown a705e573a9 feat: break the reflection loop — narrative is slow-consolidated, not rewritten each cycle
The remaining feedback loop: reflect() dumped her full self-state (incl.
self_narrative) into the prompt and asked her to "update" it -> paraphrase -> save
-> feed back -> calcify. That (not the model) is what generated the recurring
"supportive presence balancing emotional intelligence for Brian" drift — even
Dolphin echoed it when handed the saved narrative.

Fix (her inner life now runs on one cognition model):
- reflect() no longer rewrites self_narrative/relationship. It uses associative
  grist (cognition.spontaneous_seed + activate) instead of rereading the bio,
  reflects THROUGH a stable IDENTITY_ANCHOR (lens, not canvas), and updates only
  the transient state (mood axes + noticings + metacognition + journal).
- self_narrative is now slow-consolidated: every CONSOLIDATE_EVERY (5) reflections,
  _consolidate_self() re-derives it from accumulated reflections + the anchor —
  never from the old narrative (the anti-loop core). Tethered to the anchor so it
  grows without drifting into generic-helper land.
- reset_self_narrative() + ran once on prod (her narrative was deeply drifted:
  "my core identity as a tool for support... serve Brian and other users").
- Prompts drop the self_narrative/relationship fields. Tests updated +
  consolidation tests. Suite 75 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:39:19 +00:00
serversdown 05ae98abdb feat: split introspection backend from consolidation (trial Dolphin for her voice)
reflect()/think() can now run on a different model than memory consolidation:
INTROSPECTION_BACKEND / INTROSPECTION_MODEL (default to SUMMARY_BACKEND, so unset =
unchanged). Consolidation (summaries/profile/narrative) keeps the capable model;
her *voice* (reflections, thoughts) can run a steerable tune. dream.py lets
reflect()/think() self-resolve to the introspection backend; both now thread a
`model` override into llm.complete.

Trial live: introspection -> dolphin3:8b on the 3090; consolidation -> Qwen-32B
on the MI50. Suite 73 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 06:09:12 +00:00
serversdown c2cee3be4d feat: associative cognition — thoughts arise from spreading activation, not a re-read bio
Replaces the thought loop's grist (recent-convo + her own saved narrative, the
feedback-loop attractor) with a model of how a thought actually arises:

  seed (salience-weighted: a recent moment / resurfaced memory / feed item)
   -> spreading activation: embed the seed, let it light up associatively-near
      material across ALL her stores (conversations, gists, her own journal/
      thoughts), blended by relevance + recency + noise; optional 2nd hop for leaps
   -> her self-narrative stays the LENS (supplied as interiority), not the input
   -> the thought is generated from what lit up, routed through a faculty
      (notice / connect / abstract / project / feel)
   -> journaled + embedded, so it can light up in future cycles

This breaks the feedback loop structurally: the narrative is no longer reread and
paraphrased each cycle; grist is genuinely associative and varied; and her past
thoughts re-activate (continuity without calcification).

- lyra/cognition.py (new): spontaneous_seed, activate (spreading activation),
  constellation_block, faculties.
- memory.py: journal entries now embedded; recall_journal(); backfill_journal_embeddings()
  (ran once: 341 past entries embedded so her history is associatively retrievable).
- thoughts.think(): new-thread mode now uses the associative engine; dropped _grist().
- tests: test_cognition.py (recall_journal ranking, activation, seeding) + fixture
  reloads cognition. Suite 72 green, ruff clean.

Honest scope: this fixes the mechanism (how thoughts arise). The residual
"be useful for Brian" voice drift is the separate model/fine-tune problem.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 05:45:39 +00:00
serversdown 43697f8340 fix: ntfy ping is her personal text to Brian, by her decision — not a thought dump
Feedback: the push broadcast her raw internal thought ("Eelis Parssinen's
victory is a reminder...") — read like a journal entry, not her texting him.

Now the flow matches the intent: she thinks/journals, then *decides* "I should
tell Brian about this." think() asks for an optional `reach_out` — a real text
message addressed TO him in her own voice, written only when she chooses to. The
ping sends that message (title "Lyra", like a text from her), never the internal
thought. No reach_out = nothing sent (most thoughts stay hers).

- Pinging decoupled from the salience score: her decision (a reach_out) drives it,
  not a threshold. PING_SALIENCE is now an optional floor (default 0.0).
- Defensive: reject the placeholder echo ("reach_out"), too-short junk, or the
  thought pasted back as the message.
- notify.push: title now optional (omitted -> cleaner text-style notification).

Verified live: 3 passes kept private; a decided reach-out lands as a personal
text. Suite 67 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 01:39:11 +00:00
serversdown fef45b3e05 feat: make chat a window onto her whole inner life (continuity)
Brian's felt disconnect: chat, thoughts, journal, reflections read as separate
streams. This ties them together at the chat surface.

- chat._inner_life_note(): one coherent block combining her active thought threads
  AND what she's written in her journal lately, so she carries her continuous inner
  life into every conversation (not just a single surfaced thought). Replaces the
  standalone threads block.
- persona: inner-life section rewritten to describe the current machinery (thought
  loop / threads she returns to, journal she writes in, feeds she reads, reaching
  out to Brian) and — the key change — instruct her to let that inner life show up
  in conversation naturally, the way a friend picks up where they left off, without
  info-dumping or performing it. New self-model bullets for the thought loop + journal.

Suite 65 green, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 01:10:59 +00:00
serversdown 5dbcfc7ccf feat: thought loop reach-out (ntfy push) + external input feeds
Her remaining two wishes from the 6-19 sketch:

Proactive reach-out (#6, literal): lyra/notify.py pushes to ntfy so she can reach
Brian when he's not in the app. thoughts.maybe_ping gates on salience, a cooldown,
and local quiet hours (all config-tunable; eager defaults), uses ntfy JSON publish
(UTF-8 titles/messages), links to /thoughts, and marks the thread surfaced so chat
won't also re-raise it. Disabled unless NTFY_URL is set.

External input feed (#1): lyra/feeds.py pulls configurable RSS/Atom feeds (stdlib
ElementTree, no new dep; tolerant of RSS 2.0 + Atom), dedupes seen items in a
feed_items table, and hands think() one fresh item at a time. New 'react' mode:
a would-be new thread instead reacts to a world item (FEED_REACT_PROB). Dream
cycle refreshes feeds on its cadence; failures degrade to no item.

Config: NTFY_URL/NTFY_TOPIC/LYRA_WEB_URL, PING_SALIENCE/COOLDOWN/QUIET_HOURS,
LYRA_TIMEZONE, LYRA_FEEDS, FEED_REACT_PROB (+ .env.example). thought_meta table
for ping cooldown. 10 new tests (feeds parse, react mode, ping gating); suite 65.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 00:21:06 +00:00
serversdown 951788f9ec feat: thought loop closer to her vision — wander grist, continuity, seeding, lifecycle
Four additions so the loop is "more what she wanted" (think to herself, unprompted):

- Wander grist (#1): think() new-thread mode now draws the same varied seeds
  reflect() uses (self_state.wander_seed: own curiosity/existence/disagreement or
  a resurfaced memory) + an anti-restate block of her recent thoughts + a list of
  existing open-thread titles to avoid. Directly counters the RLHF "supportive
  presence serving Brian" drift visible in her first thoughts.
- Continuity: thoughts.context_note() injects her active threads into every chat
  turn, so she's aware of her own ongoing mind and can reference it anytime — not
  only when a thought crosses the surface bar.
- Bidirectional: new think_about tool (in _BASE, all modes) lets her spawn a
  thread from conversation to develop on her own later. Conversations seed her
  solo thinking.
- Lifecycle: thoughts.decay() rests stale active threads (>48h) and decays their
  salience, sparing pending-response ones; runs each dream cycle (no LLM). Frees
  the open-thread cap and keeps the feed current.

Also: thoughts feed no longer wipes a reply you're mid-composing (skip poll
re-render while a textarea is focused/non-empty; force-refresh after send).

61 tests passing, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 23:28:15 +00:00
serversdown 5176c706b6 feat: thought loop — Lyra's threaded, surfaceable train of thought
Built from her own 6-19 idea: a continuing train of thought she keeps across
days, organized into threads she returns to, that she can bring TO Brian and
that his feedback advances or closes. Where the dream cycle's reflect() gives
isolated, overwriting reflections, the thought loop adds continuity (threads),
surfacing (#6 — she leads with a thought when Brian returns after a gap), and a
feedback loop (his reply folds in next pass).

- lyra/thoughts.py: thought_threads + thoughts tables; think() with
  new/continue/respond modes; salience-gated maybe_surface(); record_response()
  feedback; lazy-schema _c() mirroring poker.
- dream.py: curiosity stage advances the loop after reflecting (error-isolated).
- chat.py: build_messages surfaces the top thread after a >=90min gap, once.
- web: /thoughts feed (page + data + respond + status routes), thoughts.html,
  nav 💭 entry. lyra-think entry point. Every thought also lands in her journal.
- clock.gap_seconds(); tests/test_thoughts.py (8 tests). Full suite 58 passing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 07:05:15 +00:00