feat: thought loop — Lyra's threaded, surfaceable train of thought #4

Merged

serversdown merged 22 commits from feat/thought-loop into dev

2026-06-24 23:47:40 -04:00

Author	SHA1	Message	Date
serversdown	d6f3516a34	perf: incremental era rebuilds — skip unchanged months rebuild_eras() re-digested EVERY month from scratch on every coherence pass, including old months whose sessions never change — ~17 redundant 32B calls per pass (a big slice of the ~40-min consolidation grind + MI50 heat). Now it compares each month's current session count to the stored era and only rebuilds changed months (force=True still does all). Report gains built/skipped counts. test_era.py: builds all first pass, skips unchanged, rebuilds only a month that gained a session, force rebuilds all. Suite 99 green, ruff clean. (Profile rebuild re-reading all 851 sessions every pass is the bigger remaining hog — separate, harder fix.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 03:31:02 +00:00
serversdown	51c2d6abb9	perf: tighten the dynamic prompt — persona split + lean deliberation The per-turn prompt was ~5.5K tokens (persona alone ~40%), sent up to 3x/turn. Tightened by RELEVANCE (the control plane decides what each turn needs), not by deletion — fidelity preserved, focus improved (buried instructions were getting ignored), tokens roughly halved. - persona split: core (identity + voice — always) vs situational sections pulled in only when relevant. mind._persona_block: self-model/origin only on meta turns (generous _META_HINTS), poker guardrails only in poker context (mode/strategic/ _POKER_HINTS). persona.core_prompt()/section(); system_prompt() kept as fallback. - lean deliberation: the private 'what do I think' pass now uses a focused context (her interiority + recent turns + the message), not the full persona/profile/ narrative/recall dump. It shapes the take, not the voice. Measured: casual Talk turn 21,949 -> 15,974 chars (-27%); deliberation 21,949 -> 6,026 (-72%); meta turns still include the self-model. Suite 98 green, ruff clean. Real retirement of the long prompt is still the fine-tune (mouth); this is the cheap, high-leverage cut that also improves adherence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 20:48:44 +00:00
serversdown	8a3c9b2701	feat: she can suggest + switch modes (set_mode tool + mode awareness) "She suggests, you confirm" — instead of brittle keyword→mode mapping, she's given awareness of her modes + the ability to switch, and her judgment decides when to offer (the model reads "should I drive to Cleveland?" vs "should I fold the river" far better than a lexicon could). - tools: set_mode(mode) — switches the session's mode; in _BASE (all modes). - mind: a per-turn mode-menu note listing her modes + "offer a switch when the work clearly shifts; on his yes, call set_mode; don't nag." - Sticky mode stays manual otherwise; Poker still auto-engages on session start. - test: set_mode switches + rejects unknown. Suite 97 green, ruff clean. Note: server-side switch takes effect next turn; the UI badge syncs on next mode load (cosmetic lag). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 16:32:42 +00:00
serversdown	17ab95dc98	feat: Decide mode — a tie-breaker that settles choices instead of listing options Brian's bottleneck is committing, not generating options, so a pros/cons dump makes it worse. Decide mode's card: get the real decision crisp, weigh it against what HE values + past regrets (pull running_stats/recent_sessions for poker/money calls), MAKE the call with the one or two reasons that tip it, pressure-test it once, and stand behind it — no "it's up to you." Read-only lookups, no live logging. Sixth mode (Talk/Poker/Build/Explore/Study/Decide); added to UI selectors, labels, badge-cycle. Suite 96 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 16:21:03 +00:00
serversdown	03aceec6fa	feat(P3): mind/mouth split — separate voice model for the final reply (seam, default off) The mind (chat backend/model) decides, reasons, and runs tools → a draft; the mouth re-voices that draft in her character. Default: no mouth configured → the mind's draft IS the reply, bit-for-bit the old behavior (and old streaming path untouched). - config: MOUTH_BACKEND / MOUTH_MODEL. The slot for an eventual fine-tuned voice. - chat: _mind_loop (tool/generation loop, non-stream, returns draft + tools_run), _voice_pass / mind.voice_messages (re-voice the draft, keep every fact/number), _mouth_target (active only when configured AND != mind). respond + respond_stream branch: mouth off = stream the mind directly (unchanged); mouth on = mind decides + runs tools, then the mouth streams the re-voiced reply. Falls back to the draft on any mouth failure (chat never breaks). - Key payoff: the mouth needs no tool support (the mind handles tools), so it can be a non-tool character model (Dolphin / Claude / fine-tune). Makes the fine-tune easy: teach a small model to sound like Lyra, not to be smart. - tests: mouth target on/off, voice_messages shape, voice_pass revoice+fallback. Suite 96 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 06:08:06 +00:00
serversdown	a7af461cdb	feat(P2): perceive (read the moment) + route nudges register on charged turns The control plane gains senses — cheap, deterministic, no LLM: - lyra/perceive.py: lexicon+signal heuristic → {sentiment, intensity, tilt, kind: emotional\|strategic\|meta\|build\|casual}. Good at the action-relevant signal, especially tilt (the mental-game core). Word-boundary matching so 'line' doesn't fire inside 'pipeline'. - mind: _perceive fills ctx.moment; _route keeps the manual mode as the dominant frame but, on a genuinely charged moment, adds a per-turn register nudge — tilt → "meet him there, warm and steady, don't clip into logging"; up/energized → "match his energy." Neutral turns get nothing (don't over-narrate). Injected via build_messages(moment=...). Logged to /logs for observability. - tests: perceive read (tilt/strategy/up/build/casual) + route nudge on/off. Suite 92 green, ruff clean. Complements modes (manual frame) — perceive refines register within it, doesn't override. Model routing (mind/mouth) is P3. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 05:42:36 +00:00
serversdown	904eda3388	refactor(P1): extract the turn pipeline into lyra/mind.py (behavior-preserving) First step of the cognition control plane (docs/COGNITION.md). The chat turn is now an explicit society of parts over a shared TurnContext blackboard: perceive (stub) -> route (session mode) -> compose (tiered prompt) -> deliberate. - lyra/mind.py (new): TurnContext + the pipeline + assemble(); moved build_messages and the deliberation helpers here (the assembly belongs in the control plane). - lyra/chat.py: slimmed to "speak + persist" — calls mind.assemble(), runs the tool/generation loop, persists. No behavior change (same prompt, same output). - tests: point test_time/test_chat at mind; add an assemble() structure test; make test_chat/test_tools hermetic (CHAT_DELIBERATE off so respond() doesn't make a real LLM call). Suite 86 green in ~5s, ruff clean, no import cycle. This is the frame; perceive/route/learn get filled in next phases — each opt-in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 05:19:39 +00:00
serversdown	f1f15972ac	feat: work-type modes — Talk / Poker / Build / Explore / Study The manual version of the architecture's `route` step: Brian points her at the TYPE of work and her register + tools shift to match. Biggest single lever on the 'meh' problem (a mode card can demand decisive/technical/generative, countering gpt-4o's default warm-vapor). - modes.py: Build (heads-down engineering — decisive, concrete, tradeoffs, no listicles), Explore (open brainstorming — generative, riffs + honest catch, spawn threads, don't converge early), Study (poker review away from the table — analytical, GTO-aware, teaching; read-only lookups + analyze_spot). Cash relabeled Poker (key kept for compat). - UI: mode selectors (desktop + mobile) get all five; badge taps now cycle modes. - design: docs/COGNITION.md (the society-of-parts control-plane sketch). - tests: presence + tool-gating for the new modes. Suite 85, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 03:43:37 +00:00
serversdown	97afa82594	feat: live chat deliberation — think privately before answering (less 'meh') The chat had no thinking in it: respond() was a single gpt-4o call in default- assistant voice (numbered lists, 'would you like to...', vague). All the cognition work was background-only. This brings a thought step into the conversation. - chat: before answering a substantive turn (trivial 'ok/lol' skipped), a private _deliberate() pass — "what do you ACTUALLY think, your real take, the substance, no pleasantries" — drawing on her in-context threads/journal. The thinking is then injected as the LAST system note with voice enforcement (answer from this; no numbered list / how-to outline unless asked; no 'would you like to' closer), so it beats gpt-4o's boilerplate at the most influential position. Logged to /logs. - Wired into respond() + respond_stream(). Config CHAT_DELIBERATE (default on) to disable if the extra call's latency annoys. - persona: "talk, don't outline" — prose over listicles, the first concrete move over a survey of options. - test_chat.py (gating + note composition + disabled). Suite 84, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 00:35:49 +00:00
serversdown	ea30c3dd67	feat: chat-side feedback — reactions in conversation thread back to her thoughts Closes the last loop gap: when she raised a thought in chat and Brian replied in the conversation (not the feed), it was a dead end. Now she has a thought_response tool — when he reacts to a thought she surfaced, she captures his take and it folds back into that thread (next dream pass she reacts, like a feed reply). - tools: _thought_response(thread_id, brian_said) -> thoughts.record_response. - modes: thought_response added to _BASE (all modes). - surfaced-note + context_note now expose each thread's #id and instruct her to use the tool when he engages, so she has what she needs to call it. - test for the tool (threads reply back + bad-id handling). Suite 81, ruff clean. Feedback now closes from both surfaces: the /thoughts feed AND live conversation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 23:26:40 +00:00
serversdown	149e9a6dd5	feat: proactive thoughts — auto-ping salient ones + daily digest She was passive (thoughts piled up 'open'; Brian had to mine the feed). Now she brings them to him: - Live: a thought >= PING_AUTO_SALIENCE (0.8) auto-pings — _compose_reachout writes a short personal text in her voice (not a thought-dump), on a cooldown (PING_COOLDOWN_MIN=60, AUTO only; explicit reach-outs bypass), quiet hours respected. - Daily: maybe_daily_digest() texts a once-per-local-day summary of what she's been turning over (after DIGEST_HOUR=18), run from the dream cycle. - maybe_ping gains bypass_cooldown (her deliberate reach-outs always go through). 8 new/updated tests (auto-ping above/below bar, digest once-per-day, floor/cooldown isolation). Suite 80 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 20:25:14 +00:00
serversdown	cf4238911e	fix: replying to a thought no longer mislabels it 'surfaced' 'surfaced' means SHE raised it with Brian (chat lead / ping). record_response was also setting it on Brian's reply, so every thread he touched looked surfaced even though she never brought it to him. Replying now just stores the pending response; status stays honest (only her surfacing sets 'surfaced'). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 20:18:12 +00:00
serversdown	3dd9eb5a3e	feat(mobile): Thoughts in the mobile menu + full nav drawer on secondary pages - Chat page: add "💭 Thoughts" to the mobile slide-out menu (with /thoughts handler), grouped with Journal. Thoughts was the one page mobile couldn't reach. - nav.js: on mobile, secondary pages (Thoughts/Journal/Mind/Session/History/Hands/ Logs) now get a ☰ slide-in drawer with the full nav + Settings — matching the desktop sidebar. Gated to pages without their own mobile menu, so the chat page's tailored hamburger/tab-bar is left untouched. Shared ITEMS list = one source of truth. Static-only (no server change). 77 tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 19:39:55 +00:00
serversdown	a7966e4bab	feat: web switch for her inner voice (Dolphin/3090 \| Qwen-32B/MI50 \| Off) Her introspection (reflect/think) voice is now switchable live from the web settings, read each cycle by the dream loop — so Brian can flip it off the 3090 before gaming without touching config or restarting. - memory: runtime key/value settings table + get_setting/set_setting. - self_state: INTROSPECTION_MODES (dolphin=local/dolphin3:8b, mi50=Qwen-32B, off=paused) + introspection_target()/set_introspection_mode(); default "dolphin". reflect() resolves from the live setting and SKIPS entirely when off. - thoughts.think(): same resolution + skip-when-off. - server: GET/POST /settings/introspection. - index.html: "Inner Voice (introspection)" selector in Settings, applies instantly. - tests: routing (dolphin/mi50), off-skip for think + reflect. Suite 77, ruff clean. Default = Dolphin on the 3090 (richer voice). Flip to MI50 or Off in Settings before gaming — that was the GPU-contention culprit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 19:16:35 +00:00
serversdown	a705e573a9	feat: break the reflection loop — narrative is slow-consolidated, not rewritten each cycle The remaining feedback loop: reflect() dumped her full self-state (incl. self_narrative) into the prompt and asked her to "update" it -> paraphrase -> save -> feed back -> calcify. That (not the model) is what generated the recurring "supportive presence balancing emotional intelligence for Brian" drift — even Dolphin echoed it when handed the saved narrative. Fix (her inner life now runs on one cognition model): - reflect() no longer rewrites self_narrative/relationship. It uses associative grist (cognition.spontaneous_seed + activate) instead of rereading the bio, reflects THROUGH a stable IDENTITY_ANCHOR (lens, not canvas), and updates only the transient state (mood axes + noticings + metacognition + journal). - self_narrative is now slow-consolidated: every CONSOLIDATE_EVERY (5) reflections, _consolidate_self() re-derives it from accumulated reflections + the anchor — never from the old narrative (the anti-loop core). Tethered to the anchor so it grows without drifting into generic-helper land. - reset_self_narrative() + ran once on prod (her narrative was deeply drifted: "my core identity as a tool for support... serve Brian and other users"). - Prompts drop the self_narrative/relationship fields. Tests updated + consolidation tests. Suite 75 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 06:39:19 +00:00
serversdown	05ae98abdb	feat: split introspection backend from consolidation (trial Dolphin for her voice) reflect()/think() can now run on a different model than memory consolidation: INTROSPECTION_BACKEND / INTROSPECTION_MODEL (default to SUMMARY_BACKEND, so unset = unchanged). Consolidation (summaries/profile/narrative) keeps the capable model; her voice (reflections, thoughts) can run a steerable tune. dream.py lets reflect()/think() self-resolve to the introspection backend; both now thread a `model` override into llm.complete. Trial live: introspection -> dolphin3:8b on the 3090; consolidation -> Qwen-32B on the MI50. Suite 73 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 06:09:12 +00:00
serversdown	c2cee3be4d	feat: associative cognition — thoughts arise from spreading activation, not a re-read bio Replaces the thought loop's grist (recent-convo + her own saved narrative, the feedback-loop attractor) with a model of how a thought actually arises: seed (salience-weighted: a recent moment / resurfaced memory / feed item) -> spreading activation: embed the seed, let it light up associatively-near material across ALL her stores (conversations, gists, her own journal/ thoughts), blended by relevance + recency + noise; optional 2nd hop for leaps -> her self-narrative stays the LENS (supplied as interiority), not the input -> the thought is generated from what lit up, routed through a faculty (notice / connect / abstract / project / feel) -> journaled + embedded, so it can light up in future cycles This breaks the feedback loop structurally: the narrative is no longer reread and paraphrased each cycle; grist is genuinely associative and varied; and her past thoughts re-activate (continuity without calcification). - lyra/cognition.py (new): spontaneous_seed, activate (spreading activation), constellation_block, faculties. - memory.py: journal entries now embedded; recall_journal(); backfill_journal_embeddings() (ran once: 341 past entries embedded so her history is associatively retrievable). - thoughts.think(): new-thread mode now uses the associative engine; dropped _grist(). - tests: test_cognition.py (recall_journal ranking, activation, seeding) + fixture reloads cognition. Suite 72 green, ruff clean. Honest scope: this fixes the mechanism (how thoughts arise). The residual "be useful for Brian" voice drift is the separate model/fine-tune problem. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 05:45:39 +00:00
serversdown	43697f8340	fix: ntfy ping is her personal text to Brian, by her decision — not a thought dump Feedback: the push broadcast her raw internal thought ("Eelis Parssinen's victory is a reminder...") — read like a journal entry, not her texting him. Now the flow matches the intent: she thinks/journals, then decides "I should tell Brian about this." think() asks for an optional `reach_out` — a real text message addressed TO him in her own voice, written only when she chooses to. The ping sends that message (title "Lyra", like a text from her), never the internal thought. No reach_out = nothing sent (most thoughts stay hers). - Pinging decoupled from the salience score: her decision (a reach_out) drives it, not a threshold. PING_SALIENCE is now an optional floor (default 0.0). - Defensive: reject the placeholder echo ("reach_out"), too-short junk, or the thought pasted back as the message. - notify.push: title now optional (omitted -> cleaner text-style notification). Verified live: 3 passes kept private; a decided reach-out lands as a personal text. Suite 67 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 01:39:11 +00:00
serversdown	fef45b3e05	feat: make chat a window onto her whole inner life (continuity) Brian's felt disconnect: chat, thoughts, journal, reflections read as separate streams. This ties them together at the chat surface. - chat._inner_life_note(): one coherent block combining her active thought threads AND what she's written in her journal lately, so she carries her continuous inner life into every conversation (not just a single surfaced thought). Replaces the standalone threads block. - persona: inner-life section rewritten to describe the current machinery (thought loop / threads she returns to, journal she writes in, feeds she reads, reaching out to Brian) and — the key change — instruct her to let that inner life show up in conversation naturally, the way a friend picks up where they left off, without info-dumping or performing it. New self-model bullets for the thought loop + journal. Suite 65 green, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 01:10:59 +00:00
serversdown	5dbcfc7ccf	feat: thought loop reach-out (ntfy push) + external input feeds Her remaining two wishes from the 6-19 sketch: Proactive reach-out (#6, literal): lyra/notify.py pushes to ntfy so she can reach Brian when he's not in the app. thoughts.maybe_ping gates on salience, a cooldown, and local quiet hours (all config-tunable; eager defaults), uses ntfy JSON publish (UTF-8 titles/messages), links to /thoughts, and marks the thread surfaced so chat won't also re-raise it. Disabled unless NTFY_URL is set. External input feed (#1): lyra/feeds.py pulls configurable RSS/Atom feeds (stdlib ElementTree, no new dep; tolerant of RSS 2.0 + Atom), dedupes seen items in a feed_items table, and hands think() one fresh item at a time. New 'react' mode: a would-be new thread instead reacts to a world item (FEED_REACT_PROB). Dream cycle refreshes feeds on its cadence; failures degrade to no item. Config: NTFY_URL/NTFY_TOPIC/LYRA_WEB_URL, PING_SALIENCE/COOLDOWN/QUIET_HOURS, LYRA_TIMEZONE, LYRA_FEEDS, FEED_REACT_PROB (+ .env.example). thought_meta table for ping cooldown. 10 new tests (feeds parse, react mode, ping gating); suite 65. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 00:21:06 +00:00
serversdown	951788f9ec	feat: thought loop closer to her vision — wander grist, continuity, seeding, lifecycle Four additions so the loop is "more what she wanted" (think to herself, unprompted): - Wander grist (#1): think() new-thread mode now draws the same varied seeds reflect() uses (self_state.wander_seed: own curiosity/existence/disagreement or a resurfaced memory) + an anti-restate block of her recent thoughts + a list of existing open-thread titles to avoid. Directly counters the RLHF "supportive presence serving Brian" drift visible in her first thoughts. - Continuity: thoughts.context_note() injects her active threads into every chat turn, so she's aware of her own ongoing mind and can reference it anytime — not only when a thought crosses the surface bar. - Bidirectional: new think_about tool (in _BASE, all modes) lets her spawn a thread from conversation to develop on her own later. Conversations seed her solo thinking. - Lifecycle: thoughts.decay() rests stale active threads (>48h) and decays their salience, sparing pending-response ones; runs each dream cycle (no LLM). Frees the open-thread cap and keeps the feed current. Also: thoughts feed no longer wipes a reply you're mid-composing (skip poll re-render while a textarea is focused/non-empty; force-refresh after send). 61 tests passing, ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-21 23:28:15 +00:00
serversdown	5176c706b6	feat: thought loop — Lyra's threaded, surfaceable train of thought Built from her own 6-19 idea: a continuing train of thought she keeps across days, organized into threads she returns to, that she can bring TO Brian and that his feedback advances or closes. Where the dream cycle's reflect() gives isolated, overwriting reflections, the thought loop adds continuity (threads), surfacing (#6 — she leads with a thought when Brian returns after a gap), and a feedback loop (his reply folds in next pass). - lyra/thoughts.py: thought_threads + thoughts tables; think() with new/continue/respond modes; salience-gated maybe_surface(); record_response() feedback; lazy-schema _c() mirroring poker. - dream.py: curiosity stage advances the loop after reflecting (error-isolated). - chat.py: build_messages surfaces the top thread after a >=90min gap, once. - web: /thoughts feed (page + data + respond + status routes), thoughts.html, nav 💭 entry. lyra-think entry point. Every thought also lands in her journal. - clock.gap_seconds(); tests/test_thoughts.py (8 tests). Full suite 58 passing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-21 07:05:15 +00:00

feat: thought loop — Lyra's threaded, surfaceable train of thought #4

22 Commits