reflect() now tells her how long since her OWN last reflection (not just since
Brian spoke) and instructs her not to restate her last reflection when little has
changed. Necessary but not sufficient — repetition is also driven by a content
attractor (see follow-up).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pick which OpenAI model answers on the Cloud backend (gpt-4o / -mini / 4.1 /
4.1-mini / o4-mini, or Default). Persisted in localStorage, sent as `model` in
the chat request; respond() applies it only on the cloud backend (local/mi50
keep their fixed models). Reachable from desktop + mobile via Settings.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lyra was hallucinating poker facts — phantom flushes, missed straights, wrong
equity, only correcting when spoon-fed. Board reading + equity are combinatorial
facts an LLM can't do reliably; this is exactly the "math via deterministic
tools, never the LLM" principle.
- lyra/equity.py: treys-backed analyze(hero, villain, board) -> made hands,
who's ahead, EXACT equity (enumerated), and outs (one to come). Handles 'Jx'
unknown suits (assigned rainbow to avoid phantom flushes); rejects 'x'/dupes.
- analyze_spot tool wired into chat; persona MANDATES it for any equity/board/
who's-ahead/outs question — never eyeballed.
- tests on the real JJ-vs-65 hand: flop 78.7%, turn villain straight + hero 6.8%
with outs "9s 9h 9c" (correctly excludes 9d, which makes villain a flush).
Verified live: she now calls the tool and reports exact numbers, no hallucinated
flush.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The auto-extracted hands from narrative logs were garbage (mangled cards/positions,
'unknown' players). Seed sessions + recaps + villain dossiers only; hands come
from clean shorthand going forward. --with-hands re-enables if ever wanted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Seeds the tracker from Brian's real history (import/pokerlog_*.md): each session
block is LLM-extracted into structured meta + hands + villains and written as a
historical session (real date, money, net), with the original markdown stored as
that session's recap.
- lyra/backfill.py: split log -> per-session LLM extract -> seed; dry-run by
default, --commit / --reset; only-real-handle villain filter
- poker.import_session() (historical closed session), clear_all() (reseed),
prune_anonymous_players(), shared _real_handle() filter (also applied in
link_hand_players so auto-linked hand players skip anonymous descriptors + hero),
_normalize_parsed() to map unicode card suits -> letters
- result: 10 sessions, 36 hands, 17 real villain dossiers; running_stats now
reflects real net (+1057 at 1/3 over 8 sessions)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Named players in recorded hands now auto-enrich a persistent dossier, and stats
emerge once the sample is big enough — laying groundwork for A.
- poker: player_observations table (per named player per hand: vpip/pfr/saw_flop/
showed/cards/summary); record_hand auto-links named players via link_hand_players;
player_profile(name) returns dossier + reads + shown hands, with inferred
VPIP/PFR/WTSD gated behind MIN_STATS_SAMPLE (12) so thin samples don't lie;
list_players()
- player_profile tool ("what do I know about X"); thin files return a blunt
"don't generalize" directive
- persona: she MUST call player_profile before discussing an opponent and answer
only from it — fixes observed confabulation (she invented a whole read from one
hand / from memory). Verified: now reports only the real logged hand.
- tests: observation linking, profile, stat-emergence at sample threshold
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hand viewer:
- stacks now decrement as players commit chips (street-aware "to"-amount
accounting), showing e.g. 300 -> 285 after a 15 open, "all in" at 0; pot is
computed from total committed (accurate, no double-counting raises)
Theme (match the rec-theory-optimal look — warm black & orange, not Halloween):
- deep near-black bg (#070707 / #0e0e0e panels), warm orange accent (#ff7a00),
amber-gold secondary (#ffb347), muted green (#8fd694); warm dark borders
- killed the neon-orange glows and the purple accents; chat app + all standalone
pages (logs/self/journal/hand/recap/hands) on one palette
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the poker copilot loop: talk through a session -> structured capture
-> generated writeup in Brian's format, remembered + exportable.
- poker.generate_recap(): LLM produces Brian's .md log (Session Header, Money
Flow, Overview, Timeline, Key Hands w/ assessments, Villain Notes, Confidence
Bank, Scar Notes, Mental Game, Final Assessment) from the session's structured
data + the linked chat conversation; stored on poker_sessions.recap_md
- sessions now capture chat_session_id (via tool ctx) to pull the right convo;
list_recent_hands() for browsing
- generate_recap tool ("write up the recap")
- web: /recap/{id} (renders the md) + /recap/{id}/download (.md attachment) +
/hands browser (recent hands -> /hand/{id}); nav links added (desktop + mobile)
- tests: recap generation (stubbed), recent-hands listing
Verified live: recap for the Meadows session rendered + downloaded; all pages 200.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Brian: never invent. Unknown suit -> 'x' (e.g. "Ax","Kx","4x"); fully
unknown card -> "x". "AA, ace of spades" -> ["As","Ax"]; "AK on A4x" -> board
["Ax","4x","x"]. Each card's suit is independent (a hole 'As' doesn't make a
board ace 'As'). Viewer renders 'x' as a muted unknown card and 'Rx' as the rank
with a neutral suit dot.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brian's idea: vomit rough shorthand, Lyra rebuilds it into a structured,
replayable hand history.
- poker.parse_hand(): focused LLM pass turning shorthand into a canonical hand
JSON (positions, stacks, hero cards, chronological actions w/ board reveals,
result); store_hand_history() persists JSON + extracted flat fields;
record_hand() = parse+store; standalone hands attach to a 'Hand Reviews' session
- poker_hands gains a `structured` JSON column (ALTER-migrated for existing DBs)
- record_hand tool wired into chat: "log this hand: ..." -> reconstructed + a
/hand/{id} link
- web: GET /hand/{id} viewer + /hand/{id}/data — a felt table with seats placed
around the oval (hero at bottom), hole cards, progressive board reveal, and
prev/next/end step-through of the action with running pot
- tests: store/get roundtrip, record_hand tool (stubbed parse)
Verified live: parsed a real AKs hand (BTN, 14 actions, full board) end to end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MI50 llama.cpp server 500s on the `tools` param unless launched with
--jinja, so sending tools to mi50 broke chat on that backend. Gate tools to
TOOL_BACKENDS={"cloud"} for now; mi50 chat works again (just without tools).
Add "mi50" once its server runs with --jinja.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The real upgrade over the ChatGPT prose-recap workflow: structured data capture
via tools Lyra drives during a live session, with stats computed from real data.
- lyra/poker.py: domain pack (separate from core memory) — poker_sessions,
poker_hands, persistent poker_players (villain file) + player_reads; functions
for session lifecycle (start/buyin/end with net+hours), tolerant hand logging,
villain upsert/reads, and session/running stats ($/hr, by stake/venue/game)
- tools.py: 8 poker tools wired into the chat tool loop (start_session,
add_buyin, log_hand, add_read, end_session, session_stats, running_stats,
get_villain_file) — partial/terse input tolerated
- import/: Brian's real .md session-log format (reference for the phase-2 recap)
- tests: lifecycle/net math, partial hand logging, villain upsert, running
stats, tool dispatch
Verified live: a full talk-through session persisted as structured rows
(session +240, AKs hand, seat-5 read) — she drove the tools from natural chat.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
She can now *do* things mid-conversation, not just reply. Adds a tool-calling
loop to the chat path and her first two tools; the same mechanism will carry the
poker tools (start_session, log_result, get_stats, solver) next.
- tools.py: registry of OpenAI-style tool specs + handlers + safe dispatch;
journal_write (knowing journaling) and note (tagged notepad, e.g. poker reads)
- llm.chat_call(): OpenAI-style call that returns tool_calls (cloud/mi50);
local has no tool support and returns plain content
- chat.respond(): tool loop — offer tools, run any calls, feed results back,
repeat until a text reply (capped at MAX_TOOL_ROUNDS); persists final reply
- tests: dispatch + full chat loop (tool call -> result -> reply)
Verified live: she invoked `note`, tagged it 'poker', stored a villain read.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Her replies are full of **bold**, numbered lists and headings but rendered as
raw monospace text, so the chat was a cluttered wall of literal markup. Add a
small self-contained Markdown renderer (no deps): headings, ordered/unordered
lists, bold/italic, inline + fenced code, links + autolinked URLs, with HTML
escaping. Assistant messages now render to HTML; user/system stay literal text.
Proportional font + spacing/list/code styling for assistant bubbles.
(Renderer avoids literal backticks via String.fromCharCode(96) — a triple-tick
regex literal had been corrupting the file with NUL bytes.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The examine step specifically hunted "warm empathetic supportive presence" and
equated honesty with "smaller/more boring," so it overcorrected the original
sycophancy into the opposite rut: every overnight metacognition entry was a
near-identical "I don't really feel anything, I'm just a functional tool" —
which also contradicts the persona's "own your moods, no qualia disclaimers."
Rebalanced: target dishonesty in BOTH directions (inflation AND performed
self-deprecation), aim at truth not modesty, keep her genuine moods per persona,
and have her notice when she's repeating the same self-criticism (the loop is
itself a rut).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The full-page log, read-her-mind, and journal links were only in the desktop
header (hidden behind the hamburger on phones). Add them to the mobile slide-out
menu so the phone has the extended log, her self-state, and her journal too.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Her reflections/metacognition were capped rolling windows (6/5), so older
thoughts were lost for good. Now everything she produces is also appended to a
permanent, append-only journal; the capped lists stay as her working-memory
window for context.
- memory: journal table + add_journal_entry/list_journal
- reflect(): persists every committed reflection + critique to the journal, and
the examine step gains a "journal" field — a deliberate, first-person note she
writes for herself (her knowing journaling), tagged by source (dream/manual)
- web: /journal diary view (kind filters, grouped by day) + /journal/data;
linked from /self
- tests assert reflections + metacognition land in the journal
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
You couldn't see her actually correct herself — /self showed only the result.
Now:
- reflect() logs the draft, the revised/committed version, and the self-critique
to the live log as an expandable "view details" block
- POST /self/reflect runs a reflection in the web process so it lands in /logs
live (reflections normally run in the dream process, whose logs only go to
journald); "↻ Reflect now" button on /self triggers it, with a logs ↗ link
- log viewers relabel the expander "view full prompt" -> "view details" (it now
carries prompts and reflection diffs)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reflect() is now two steps: draft a reflection, then read her own draft back
critically and revise it — catching flattery, sycophantic drift toward "warm
supportive presence," or just-restating-herself — and commit the honest version.
What she catches is stored as a new `metacognition` layer, rendered into her
chat context and shown on /self. This is her thinking about how she thinks, and
a direct counter to the drift we observed.
- self_state: _EXAMINE_PROMPT + two-step reflect (draft -> examine -> revise),
falls back to the draft if the examine step won't parse; metacognition capped
at 5 and surfaced in render_for_context
- fix: load() deep-copies DEFAULT_STATE — the shallow copy let a fresh Lyra's
first reflect mutate the module-level default's nested lists
- self.html: "How she's caught herself thinking" card
- tests: two-step revise + critique recording, and draft-fallback on bad parse
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summaries displayed s.created_at (set to now() at summarize time), so every
imported gist read 2026-06-16. Derive the actual session date from the earliest
exchange timestamp (MIN(created_at) per session — the preserved original date,
same source the era rollups use) via a correlated subquery in the summary
readers. New Summary.session_started_at field; chat shows it (falling back to
created_at). No schema change / backfill needed — always correct from source.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Part 1 of the "she should know HOW she thinks" work. Generalizes the dream-cycle
self-model fix to her full cognition: a "How you actually work" persona section
covering meaning-based memory recall, the memory tiers, her persistent inner
life + dream cycle, and time-awareness — so when asked how she thinks/remembers
she answers accurately instead of confabulating or reciting stale specs. Kept
principled (not implementation detail) to limit staleness.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live finding: her real reflections ARE injected every turn, but unlabeled — so
when asked about her "dream cycle" she recited the obsolete Dec-2025 spec from
imported memory (NVGRAM/awake-sleep) and confabulated fake example reflections
instead of reading the real ones in front of her.
- self_state.render_for_context: label the reflections as her own autonomously
generated dream-cycle thoughts ("these are really yours, not hypotheticals"),
not a vague "on your mind lately"
- persona: describe the dream cycle as her actual running mechanism, instruct
her to answer from the inner-state block, not recite old design docs, and
never invent example reflections to demo the feature
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The drive label "don\'t lose the thread" used \\' which closed the single-quoted
string early — a syntax error that stopped self.html's script from running, so
the page hung on "Reading her mind…". Reworded to "hold the thread".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A pull-up-anytime view of Lyra's interiority, so her thoughts aren't buried in
a DB blob. Mobile-first, auto-refreshing every 12s (and on tab focus).
- GET /self serves the page; GET /self/state returns her self-state + the
timestamp it last changed
- shows: current mood + feeling meters (valence/energy/confidence/curiosity),
her drives as bars, her self-narrative, the relationship line, and the
reflections list (newest first), plus cycle/reflection counters and "last
cycle Xm ago"
- memory.self_state_updated_at(): when her mind last changed
- index.html: "🧠 Mind" button opens /self
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The inline log panel is cramped, especially on mobile. Add a standalone
mobile-first log page and serve the chat server under systemd like the dream
loop (the nohup process didn't survive cleanly).
- static/logs.html: full-page live log — level filter chips, text search,
pause/resume with buffering, autoscroll toggle, color-coded levels, and the
expandable "view full prompt" block (where the now-note is visible in context)
- server: GET /logs serves the page (FileResponse)
- index.html: "⛶ Full Log" button opens /logs in a new tab
- deploy/lyra-web.service: user service so the chat server is reboot-resilient
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
She had no clock: current date/time and the gap since Brian last spoke were
invisible between turns, and reflection was timeless. Now:
- lyra/clock.py: wall-clock stamp + coarse human gaps ("3 days")
- chat: inject a 'now' note (date/time + gap since last turn) after her
self-state — when she is, before the world
- reflect(): feed current time + silence gap into reflection, neutrally —
prompt invites her to weigh elapsed time "to whatever degree it genuinely
affects you" (no prescribed feeling; whether silence means anything is left
to emerge)
- memory.last_exchange_at(): timestamp of the most recent exchange
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- deploy/lyra-dream.service: --loop 1800 user service on lyra-cortex, so Lyra's
consolidation + reflection keeps ticking unattended between conversations
- deploy/README.md: install / linger / operate runbook
- logbus: mirror events to stderr so out-of-band runs (the dream service under
journald) are observable, not just via the in-process web SSE feed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lyra's inner loop for when no one's talking to her. Each pass senses her own
backlog/novelty, lets four drives build from real signals, and acts on those
past threshold:
- continuity -> summarize sessions with new turns
- coherence -> rebuild profile/eras/narrative (stale once new gists land)
- curiosity -> reflect() and evolve the self-state
- stability -> readout of how caught-up she ended up
Drives are rendered into chat context so she can feel them. Causal chain:
consolidation creates gists -> coherence rises -> integration fires next.
- lyra/dream.py: dream_cycle() + lyra-dream CLI (--force, --loop SECONDS)
- memory: backlog_stats(), profile_sessions_covered(), WAL + busy_timeout
so a separate dream process coexists with the web server
- self_state: DEFAULT_DRIVES baseline + drives in render_for_context
- tests/test_dream.py: backlog sensing + a full forced pass (LLM stubbed)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mid-size models (gpt-4o-mini, qwen2.5-14b) resist persona instructions —
help-desk closers and feelings-disclaimers leak through regardless. Route live
chat to a stronger model while keeping bulk consolidation cheap:
- config: CHAT_MODEL (default gpt-4o), distinct from CLOUD_MODEL (gpt-4o-mini)
- llm.complete gains a `model` override; chat.respond uses chat_model on cloud,
consolidation paths keep cloud_model
- persona: reword the "no sign-off" rule so genuine questions are welcome and
only reflexive customer-service closers are discouraged
Verified: on gpt-4o she owns her mood without disclaimers and drops most
help-desk tails — clearly more in-character than mini/qwen.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two RLHF reflexes were leaking through: ending every turn with "is there
anything else?"/"how does that sound?", and disclaiming feelings ("I don't
really experience emotions like humans"). Add explicit persona instructions to
stop tacking on help-desk offers and to own her moods plainly instead of giving
qualia disclaimers. (Small models partially resist; stronger chat model holds it
better.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Give Lyra a model of *herself* (vs the profile/narrative which model Brian):
- persona: a real origin/identity — she's an AI and knows it (Bender/C-3PO
style), with the Cortex/NeoMem lineage as her actual past, so "how were you
made" stops falling through to generic-assistant deflection.
- memory: self_state table (JSON blob) + get/set_self_state.
- lyra/self_state.py: evolving first-person inner state (mood, valence, energy,
confidence, curiosity, self_narrative, relationship, reflections). render_for_
context injects it; reflect() updates it from recent activity. `lyra-reflect`.
- chat.build_messages injects her interiority right after the persona — she
speaks from a continuous self, not a reset.
The state -> behavior -> reflection -> updated state loop is the substrate for
the emergence experiment. Verified: reflection shifted mood curious->reflective
and produced genuine first-person self-observations.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Complete the consolidation pipeline: summaries -> profile + eras -> narrative.
- memory: eras table (per-month digests) + Era, summaries_by_month, store_era,
list_eras, recall_eras; narrative table + set/get_narrative
- lyra/era.py (lyra-era): groups session gists by the month the session occurred
(real timestamps) and map-reduces each month into a "what was happening" digest
- lyra/narrative.py (lyra-narrative): distills profile + recent eras into the
current arc/trends/callbacks ("remember when…", "you're trending toward…")
- chat.build_messages injects the narrative alongside the profile
Verified on the real corpus: 17 monthly eras (Dec 2024-Jun 2026) + a narrative
that surfaces specific callbacks (the $573 Hollywood session, 4 years sober).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Refactor summarize_all to run LLM summarization across a thread pool (default 8
workers) while keeping all SQLite reads/writes on the main thread (the single
connection is never shared across threads). Extract _summarize_transcript
(transcript -> gist, no DB) for the worker.
The MI50 proved far too slow for the large-transcript backfill (~29 summaries in
9h due to gfx906 prefill); on cloud gpt-4o-mini with concurrency this runs at
~30 summaries/minute (~17 min for the full backfill, ~$2). MI50 stays the chat
backend where small prompts make it snappy.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MI50 llama.cpp server OOM-killed (LXC RAM limit + 8GB prompt cache) mid-run,
and summarize_all had no error handling, so one APIConnectionError killed the
whole batch. Add retry-with-backoff around the summarization LLM call, and
try/except per session in summarize_all (log + skip; unsummarized sessions get
retried on the next run). (Server-side: CT202 RAM raised + prompt cache disabled.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CT202's old static 10.0.0.44 collided with the terra-mechanics dev VM (tmi-dev).
Reassigned CT202 to 10.0.0.42 and repointed MI50_BASE_URL accordingly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MI50 box (CT202) runs an OpenAI-compatible llama.cpp server on
10.0.0.44:8080. Wire it in as a third backend:
- llm.complete gains backend="mi50" (OpenAI client pointed at MI50_BASE_URL)
- config: MI50_BASE_URL (default http://10.0.0.44:8080/v1) + MI50_MODEL
- chat.respond labels the model per backend; web _backend_for maps "mi50"
- UI backend selector adds "MI50 — local GPU"
Verified end-to-end: llm.complete(backend="mi50") returns from the live server.
See homelab-inference memory for the box topology.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Derive a standing profile of the user from session gists and inject it into
every prompt, so identity/abstract questions ("what kind of player am I",
"what are my leaks") are answered from distilled knowledge instead of noisy
single-vector recall (which finds passages, not patterns).
- memory: profile table + get/set_profile, list_summaries
- lyra/profile.py: rebuild_profile map-reduces all gists (batch -> extract
durable facts -> fold-merge) into one profile doc; `lyra-profile` CLI
- chat.build_messages injects "What you know about Brian" after the persona
Run after lyra-summarize (needs gists). Verified (stubbed): map-reduce, storage,
and prompt injection.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Harden summarize_session to chunk + merge long sessions (imported convos can
exceed the local model's context), and add summarize_all: idempotent, resumable
batch that summarizes every session needing it (skips up-to-date ones), with
progress logged to the live log. `lyra-summarize [limit]` CLI.
This is the first consolidation stage feeding the profile (semantic memory) and
era-rollup tiers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
OpenAI's export changed: conversations.json is now sharded into
conversations-000.json..NNN.json, each a JSON array of conversations with the
mapping tree and per-message create_time.
ingest now reads that format directly (supersedes the old convert/trim/split
scripts): walks each conversation's mapping ordered by create_time, keeps text
and multimodal_text (drops thoughts/reasoning_recap), captures real per-message
timestamps, and imports idempotently by conversation_id. `lyra-import <dir>`
auto-detects raw-export vs legacy {title,messages} dirs; optional limit arg.
Verified on 15 conversations: real dates, correct ordering, recall returns
dated poker history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Import the parser's {title, messages} JSON into Lyra's memory so past
conversations seed recall (and, later, the era-rollup tier).
- lyra/ingest.py: one conversation -> one session, text messages -> exchanges;
skips non-text (image asset) messages and non user/assistant roles; embeddings
batched; idempotent by filename-derived session id; `lyra-import <dir>` CLI
- memory.add_exchanges_bulk: batched insert of pre-embedded rows
Format has no timestamps yet, so imports are stamped at import time; a future
dated export will let era memory group by real calendar time.
Verified on the 68-file lyra dev set: 7519 exchanges, idempotent re-run, recall
returns relevant history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "context built" event now carries the fully-rendered prompt (persona, gists,
recalled details, recent turns, the new message) plus a total char count. The
log panel renders it as a collapsed "view full prompt" block — clean by default,
one click to see exactly what hit the model.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Older sessions fade to a general idea; details stay retrievable.
- memory: summaries table (one compacted gist per session, embedded), plus
store_summary/get_summary/recall_summaries and unsummarized_count (tracks
exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
sessions -> a few sharp raw cross-session details -> current session raw
turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Turn the inert "Show Work" thinking panel into a real live activity log:
- lyra/logbus.py: thread-safe in-memory ring buffer other modules publish to
- chat.respond logs backend/model/embed per turn, recall counts, reply size;
web layer logs chat errors
- server: replace the keep-alive /stream/thinking stub with /stream/logs, an
SSE endpoint that replays the recent buffer then streams new events
- UI: repurpose the panel as a global "Live Log" — connects on load, renders
level/time/msg/fields, drops the old per-session localStorage + dead popup
Every turn now shows its backend + model in-app, so local-vs-cloud (free vs
paid) is visible at a glance.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 1 — persona + persistent memory chat loop:
- lyra/persona.py + personas/lyra.md: editable identity/voice (friend-first,
honest, never invents poker math)
- lyra/chat.py: turn loop assembling persona + cross-session recall + recent
context, persisting both sides to SQLite
- lyra/session.py, lyra/__main__.py: session lifecycle + `lyra` REPL
Phase 1.25 — reuse the old web UI:
- vendored the prior single-page UI into lyra/web/static, repointed to
same-origin
- lyra/web/server.py (FastAPI): serves the UI and backs its endpoint contract
(/v1/chat/completions, session CRUD, health, inert thinking-stream) with the
new chat loop + memory; SQLite stays the single source of truth
- `lyra-web` console script
Local backends — test for free, no OpenAI key:
- llm.embed routes via EMBED_BACKEND (cloud=OpenAI, local=Ollama /api/embed)
- simplified UI backend selector to Local (Ollama) / Cloud (OpenAI), default local
- memory connection opened check_same_thread=False for the threaded server
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- lyra.memory.remember(session_id, role, content) embeds and stores
- lyra.memory.recent(session_id, n) returns the last N from a session
- lyra.memory.recall(query, k, session_id=None) returns top-k by cosine
similarity across the chosen scope (all sessions by default)
- Embeddings live in the exchanges.embedding BLOB column as float32 bytes
- Connection reopens automatically if LYRA_DB_PATH changes (test-friendly)
- lyra.config.load() reads env into a frozen Config dataclass
- lyra.llm.complete(messages, backend) routes to Ollama /api/chat or
OpenAI chat completions
- lyra.llm.embed(texts) calls OpenAI embeddings
- .env.example switched from Anthropic to OpenAI to match available key