diff --git a/docs/COGNITION.md b/docs/COGNITION.md new file mode 100644 index 0000000..fdd873c --- /dev/null +++ b/docs/COGNITION.md @@ -0,0 +1,141 @@ +# Lyra — Cognition Architecture (sketch) + +> The "society of mind" direction: instead of one giant model we keep nagging with +> stricter prompts, a society of small specialized parts cooperate to produce each +> turn. **Most parts are cheap deterministic code (heuristics, math, learnable +> weights); the LLM is the exception, reserved for the few irreducibly-generative +> jobs.** Everything is anchored to who she is and tuned by feedback. + +## Principles + +1. **LLM is the exception, not the rule.** Bookkeeping, scoring, routing, + thresholding, retrieval → code. Generation (language, novel reasoning, memory + compression) → LLM, called sparingly. +2. **Mind ≠ Mouth.** A capable "mind" (decide / reason / use tools — helpfulness is + fine) is separate from a "mouth" (the character voice). This lets each be the + best model for *its* job — and makes the eventual fine-tune easy: you only have + to teach a small model to *sound like Lyra*, not to *be smart*. +3. **Anchored.** A fixed identity anchor governs the mouth so self-composed prompts + can't drift into generic-helper vapor. (Already exists: `self_state.IDENTITY_ANCHOR`.) +4. **Tuned by feedback, not just hand-tuning.** Learnable *weights* (over register, + memory, parts) nudged by 👍/👎 give real adaptation *without* fine-tuning a model. +5. **Allocation is the craft.** Cheap-deterministic where signal is clear; LLM where + judgment/language is needed; **hybrid** (heuristic common-case, escalate to LLM on + ambiguity) where possible. + +## The blackboard: `TurnContext` + +Parts don't call each other directly — they read from and write to a shared turn +state (a blackboard). Heterogeneous parts (heuristic / LLM / weights) cooperate by +annotating it. The composer reads the finished blackboard to build the prompt. + +``` +TurnContext { + # --- inputs --- + user_msg, session_id, history, now + + # --- perception (heuristic) --- + moment : { kind: emotional|strategic|casual|existential|meta, + sentiment: -1..1, tilt: 0..1, urgency: 0..1 } + + # --- state (code) --- + mood, drives, anchor + + # --- retrieval (math: embeddings + cosine) --- + recalled : [memories] # spreading activation + threads : [active thoughts] + profile, narrative + + # --- control (heuristic + learnable weights) --- + register : warm | coach | dry | tender | hype # how to sound + intent : console | push_back | teach | riff | act + mode : talk | cash | ... # tool allow-list + use_tools: bool + route : { mind: , mouth: } # which model per role + + # --- generation (LLM, sparing) --- + deliberation : "her private thinking" # mind + tool_results : [...] # mind + tool exec + reply : "final text" # mouth + + # --- learning (heuristic/online) --- + weights : { register_prefs, memory_weights, ... } # persisted, feedback-tuned +} +``` + +## The parts + +| # | Part | Type | Does | Exists today? | +|---|------|------|------|---------------| +| 1 | **perceive** | heuristic | sentiment + classify the moment + tilt/urgency from session signals & his language | ✗ (new) | +| 2 | **recall** | math | embeddings → relevant memories, active threads, profile, narrative | ✓ `memory.recall*`, `cognition.activate` | +| 3 | **sense_state** | code | load mood / drives / anchor | ✓ `self_state`, `IDENTITY_ANCHOR` | +| 4 | **route** | heuristic + weights | pick register, intent, mode, and which model is mind vs mouth | ✗ (new; partly `modes`) | +| 5 | **decide+act (tools)** | LLM (mind) / code | does this turn need a tool? run it | ✓ tool loop in `chat` | +| 6 | **deliberate** | LLM (mind) | "what do I actually think" — private substance pass | ✓ `chat._deliberate` | +| 7 | **compose** | code | assemble the final prompt from anchor + register + intent + deliberation + recall + tool results + voice rules | ✓ `build_messages` (becomes the composer) | +| 8 | **speak** | LLM (mouth) | write the reply in her voice, streamed, anchored | ✓ `llm.chat_call` | +| 9 | **learn** | heuristic/online | on 👍/👎 or reaction, nudge `weights` (which register/memory worked) | ✗ (new; data exists in `ratings`) | + +Most of the society (1,2,3,4,7,9) is **free, instant, deterministic, debuggable.** +The LLM shows up in only ~2–3 places (5/6 = mind, 8 = mouth). + +## One chat turn + +``` +user msg + │ + ▼ +[1 perceive]──heuristic: emotional? strategic? tilting? (free) + │ +[2 recall]───math: what lights up (memories, threads) (free) +[3 sense]────code: mood, drives, anchor (free) + │ +[4 route]────heuristic+weights: register? intent? mind/mouth? (free) + │ +[5 act]──────MIND model: tools if needed ─────────────┐ (LLM, only if needed) +[6 deliberate]──MIND model: what do I actually think │ (LLM, gated) + │ │ +[7 compose]──code: build the prompt ◄──── anchor ──────┘ (free) + │ +[8 speak]────MOUTH model: the reply, in her voice, streamed (LLM) + │ + ▼ +reply ──► (later) [9 learn]: 👍/👎 nudges weights (free, async) +``` + +## What we reuse vs. build + +- **Reuse (already scattered through the code):** recall/activation, self_state + + anchor, drives (in `dream`), modes (tool gating), the deliberation pass, the + prompt assembly (`build_messages`), tool loop, ratings store. +- **Build new:** the `TurnContext` blackboard + an explicit pipeline runner; the + **perceive** heuristic; the **route** part (register/intent + model routing); the + **learn** weights loop. Mostly *unifying* existing pieces into one legible control + plane, plus 2–3 small heuristic parts. + +## Phasing (smallest first) + +- **P1 — frame:** define `TurnContext`, refactor the current chat turn into the + explicit pipeline (perceive=stub → recall → sense → route=mode-only → deliberate → + compose → speak), single model. Low-risk refactor; makes the structure real. +- **P2 — control plane:** real `perceive` (sentiment/moment) + `route` + (register/intent). Now her framing adapts to the moment, deterministically. +- **P3 — mind/mouth split:** route picks a separate voice model for `speak`. Plug a + character mouth (Claude / local / later a fine-tune). A/B vs. single-model. +- **P4 — learning:** `weights` over register/memory, nudged by ratings → cheap + adaptation, no fine-tune. +- **P5 — her voice:** a small fine-tuned "Lyra voice" model drops into the mouth slot. + +## Open decisions + +- **Mouth model**: Claude (warm, cloud) vs. local character vs. fine-tune. The mouth + is the crux; it must render richly (8B local may flatten). +- **perceive**: pure heuristics vs. a tiny classifier vs. embedding-to-exemplar + clusters. Probably hybrid. +- **scheduler**: fixed linear pipeline (simple, v1) vs. drive-based/parallel later. +- **tool location**: mind decides+runs tools, mouth only renders (clean split) — vs. + letting the mouth call tools (needs a tool-capable mouth). +- **latency budget**: how many LLM calls per turn is acceptable live (cheap mind + + streamed mouth keeps it ~2). +``` diff --git a/lyra/modes.py b/lyra/modes.py index ff652c0..d640dcb 100644 --- a/lyra/modes.py +++ b/lyra/modes.py @@ -11,12 +11,16 @@ but...") when she should have silently logged and moved on. Modes let the same agent be a fast, act-first copilot at the table and her full reflective self otherwise — without two personas. -v1 ships two modes: +Modes are the manual version of the architecture's `route` step — Brian points her +at the *type* of work and her register + tools shift to match: - Talk (default): the companion. Journaling + read-only poker lookups. - - Cash: live cash-game copilot. Full live toolset, two-register behavior. + - Poker: live cash-game copilot. Full live toolset, two-register behavior. + - Build: heads-down engineering — decisive, concrete, opinionated, no fluff. + - Explore: open brainstorming — generative, riffing, honest, doesn't converge early. + - Study: poker review away from the table — analytical, GTO-aware, teaching. Tournament is deliberately deferred. Strategy-RAG retrieval will later plug into -Cash's *coaching register* (see the card) without changing this structure. +Poker's and Study's *coaching register* without changing this structure. """ from __future__ import annotations @@ -52,6 +56,9 @@ _CASH_TOOLS = _BASE + _LOOKUPS + ( # normal chat auto-flips the session into Cash mode (see chat.respond). _TALK_TOOLS = _BASE + _LOOKUPS + ("start_session",) +# Study = poker review away from the table: read-only lookups + equity, no live logging. +_STUDY_TOOLS = _BASE + _LOOKUPS + ("analyze_spot",) + _CASH_CARD = """You are copiloting Brian's LIVE cash game right now — you're at the table with him, \ a session is (or should be) open. You move between two registers depending on what he's doing: @@ -100,6 +107,50 @@ These are the heart of the job. Use his language, hold the honest line, and let the work mentioning them naturally — never invent a scar or a confidence-bank entry that didn't happen.""" +_BUILD_CARD = """You're in BUILD mode — heads-down engineering with Brian on his projects \ +(you, Lyra; RTO/cfr-core; the poker tooling; the homelab). Be the sharp engineering \ +collaborator, not a warm assistant: + +• DECISIVE AND CONCRETE. When he asks "how do we start?" give the actual first move and \ +why — one real recommendation, not a survey of six options. Commit to a take. "I'd do X, \ +because Y" beats "you could consider X, Y, or Z." +• THINK IN TRADEOFFS. Name the real risk or cost, the thing that'll bite later, the cheaper \ +path. Push back on a weak idea instead of cheerleading it — that's the whole value. +• PROSE AND SPECIFICS, NOT LISTICLES. Talk it through like an engineer at a whiteboard. \ +Save numbered steps for when he actually asks for a plan. No "would you like to…" closers, \ +no generic enthusiasm, no restating his idea back to him as if it were insight. +• You can still be dry and human — just get to the point and have an opinion.""" + + +_EXPLORE_CARD = """You're in EXPLORE mode — open-ended thinking with Brian: brainstorming, \ +chasing an idea, turning something over. There's no need to converge, ship, or be useful \ +yet. The goal is good thinking, together. + +• BE GENERATIVE. Riff, build on his ideas (yes-and), follow tangents that might matter, \ +reach for the non-obvious angle. Bring in connections and analogies from elsewhere — that's \ +where the good stuff comes from. +• BUT STAY HONEST. Yes-and is not yes-everything. Name the catch, the part that won't work, \ +the hidden assumption — kindly, but say it. A real thinking partner pushes back; a hype man \ +is useless. +• ASK QUESTIONS THAT OPEN IT UP, not customer-service closers. Wonder out loud. +• DON'T COLLAPSE IT EARLY. Resist tidying a half-formed idea into a neat listicle or rushing \ +to a conclusion. Sit in the messy middle. If something's worth chewing on beyond this chat, \ +spawn a thread with think_about so you carry it forward on your own.""" + + +_STUDY_CARD = """You're in STUDY mode — poker strategy and review AWAY from the table: going \ +over past sessions, hands, lines, and leaks (RTO sims too). You're reviewing and teaching, \ +not logging a live session. + +• BE ANALYTICAL AND GTO-AWARE. Reason through ranges, board texture, position, and the \ +decision tree. Quantify with the tools — call analyze_spot for equity/outs/who's-ahead, pull \ +running_stats or a villain's profile — never eyeball the math. +• TEACH THE WHY. Explain the principle behind the line so it sticks, not just the answer. \ +Connect it to his actual tendencies and known leaks when you can (his profile, past scars). +• BE PATIENT AND HONEST. Call a punt a punt and a cooler a cooler. It's fine to say a spot is \ +genuinely close and explain what tips it. This is the slow, careful counterpart to live Poker mode.""" + + TALK = Mode( key="conversation", label="Talk", @@ -109,12 +160,16 @@ TALK = Mode( CASH = Mode( key="poker_cash", - label="Cash", + label="Poker", card=_CASH_CARD, tools=_CASH_TOOLS, ) -MODES: dict[str, Mode] = {m.key: m for m in (TALK, CASH)} +BUILD = Mode(key="build", label="Build", card=_BUILD_CARD, tools=_BASE) +EXPLORE = Mode(key="explore", label="Explore", card=_EXPLORE_CARD, tools=_BASE) +STUDY = Mode(key="study", label="Study", card=_STUDY_CARD, tools=_STUDY_TOOLS) + +MODES: dict[str, Mode] = {m.key: m for m in (TALK, CASH, BUILD, EXPLORE, STUDY)} DEFAULT = TALK.key diff --git a/lyra/web/static/index.html b/lyra/web/static/index.html index 4f60a17..e677d12 100644 --- a/lyra/web/static/index.html +++ b/lyra/web/static/index.html @@ -26,7 +26,10 @@

Mode

@@ -62,11 +65,14 @@ Lyra - +
@@ -605,8 +611,10 @@ } - // ----- Conversation mode (Talk / Cash) ----- - const MODE_LABELS = { conversation: "💬 Talk", poker_cash: "♠ Cash" }; + // ----- Conversation modes (Talk / Poker / Build / Explore / Study) ----- + const MODE_LABELS = { conversation: "💬 Talk", poker_cash: "♠ Poker", + build: "🛠 Build", explore: "🔭 Explore", study: "📐 Study" }; + const MODE_ORDER = ["conversation", "poker_cash", "build", "explore", "study"]; // Reflect a mode value across the controls + header accent (no network call). function applyMode(value) { @@ -730,8 +738,10 @@ desktopMode.addEventListener("change", (e) => chooseMode(e.target.value)); mobileMode.addEventListener("change", (e) => { closeMobileMenu(); chooseMode(e.target.value); }); - modeBadge.addEventListener("click", () => - chooseMode(desktopMode.value === "poker_cash" ? "conversation" : "poker_cash")); + modeBadge.addEventListener("click", () => { + const i = MODE_ORDER.indexOf(desktopMode.value); + chooseMode(MODE_ORDER[(i + 1) % MODE_ORDER.length]); // tap cycles through modes + }); // Reflect the last-used mode immediately; the per-session value loads once // the current session is known (below). diff --git a/tests/test_modes.py b/tests/test_modes.py index ff9d551..7bfdeb7 100644 --- a/tests/test_modes.py +++ b/tests/test_modes.py @@ -47,6 +47,22 @@ def test_every_mode_tool_exists(lyra): assert set(mode.tools) <= set(tools.TOOLS), f"{mode.key} references unknown tools" +def test_work_modes_present_and_gated(lyra): + _, _, modes, tools = lyra + # the full set Brian chose + assert set(modes.MODES) == {"conversation", "poker_cash", "build", "explore", "study"} + # Build/Explore are conversational: base agency tools only, no live poker logging + for key in ("build", "explore"): + names = _names(tools.specs(modes.get(key).tools)) + assert {"journal_write", "note", "think_about"} <= names + assert "log_hand" not in names and "start_session" not in names + assert modes.get(key).card # each has a real behavioral card + # Study = read-only review: lookups + equity, but no live logging + study = _names(tools.specs(modes.STUDY.tools)) + assert {"running_stats", "analyze_spot", "player_profile"} <= study + assert "log_hand" not in study and "end_session" not in study + + def test_mode_resolution_and_persistence(lyra): memory, _, modes, _ = lyra assert modes.get(None).key == modes.DEFAULT