Her reflections/metacognition were capped rolling windows (6/5), so older
thoughts were lost for good. Now everything she produces is also appended to a
permanent, append-only journal; the capped lists stay as her working-memory
window for context.
- memory: journal table + add_journal_entry/list_journal
- reflect(): persists every committed reflection + critique to the journal, and
the examine step gains a "journal" field — a deliberate, first-person note she
writes for herself (her knowing journaling), tagged by source (dream/manual)
- web: /journal diary view (kind filters, grouped by day) + /journal/data;
linked from /self
- tests assert reflections + metacognition land in the journal
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
You couldn't see her actually correct herself — /self showed only the result.
Now:
- reflect() logs the draft, the revised/committed version, and the self-critique
to the live log as an expandable "view details" block
- POST /self/reflect runs a reflection in the web process so it lands in /logs
live (reflections normally run in the dream process, whose logs only go to
journald); "↻ Reflect now" button on /self triggers it, with a logs ↗ link
- log viewers relabel the expander "view full prompt" -> "view details" (it now
carries prompts and reflection diffs)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reflect() is now two steps: draft a reflection, then read her own draft back
critically and revise it — catching flattery, sycophantic drift toward "warm
supportive presence," or just-restating-herself — and commit the honest version.
What she catches is stored as a new `metacognition` layer, rendered into her
chat context and shown on /self. This is her thinking about how she thinks, and
a direct counter to the drift we observed.
- self_state: _EXAMINE_PROMPT + two-step reflect (draft -> examine -> revise),
falls back to the draft if the examine step won't parse; metacognition capped
at 5 and surfaced in render_for_context
- fix: load() deep-copies DEFAULT_STATE — the shallow copy let a fresh Lyra's
first reflect mutate the module-level default's nested lists
- self.html: "How she's caught herself thinking" card
- tests: two-step revise + critique recording, and draft-fallback on bad parse
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summaries displayed s.created_at (set to now() at summarize time), so every
imported gist read 2026-06-16. Derive the actual session date from the earliest
exchange timestamp (MIN(created_at) per session — the preserved original date,
same source the era rollups use) via a correlated subquery in the summary
readers. New Summary.session_started_at field; chat shows it (falling back to
created_at). No schema change / backfill needed — always correct from source.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Part 1 of the "she should know HOW she thinks" work. Generalizes the dream-cycle
self-model fix to her full cognition: a "How you actually work" persona section
covering meaning-based memory recall, the memory tiers, her persistent inner
life + dream cycle, and time-awareness — so when asked how she thinks/remembers
she answers accurately instead of confabulating or reciting stale specs. Kept
principled (not implementation detail) to limit staleness.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live finding: her real reflections ARE injected every turn, but unlabeled — so
when asked about her "dream cycle" she recited the obsolete Dec-2025 spec from
imported memory (NVGRAM/awake-sleep) and confabulated fake example reflections
instead of reading the real ones in front of her.
- self_state.render_for_context: label the reflections as her own autonomously
generated dream-cycle thoughts ("these are really yours, not hypotheticals"),
not a vague "on your mind lately"
- persona: describe the dream cycle as her actual running mechanism, instruct
her to answer from the inner-state block, not recite old design docs, and
never invent example reflections to demo the feature
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The drive label "don\'t lose the thread" used \\' which closed the single-quoted
string early — a syntax error that stopped self.html's script from running, so
the page hung on "Reading her mind…". Reworded to "hold the thread".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A pull-up-anytime view of Lyra's interiority, so her thoughts aren't buried in
a DB blob. Mobile-first, auto-refreshing every 12s (and on tab focus).
- GET /self serves the page; GET /self/state returns her self-state + the
timestamp it last changed
- shows: current mood + feeling meters (valence/energy/confidence/curiosity),
her drives as bars, her self-narrative, the relationship line, and the
reflections list (newest first), plus cycle/reflection counters and "last
cycle Xm ago"
- memory.self_state_updated_at(): when her mind last changed
- index.html: "🧠 Mind" button opens /self
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The inline log panel is cramped, especially on mobile. Add a standalone
mobile-first log page and serve the chat server under systemd like the dream
loop (the nohup process didn't survive cleanly).
- static/logs.html: full-page live log — level filter chips, text search,
pause/resume with buffering, autoscroll toggle, color-coded levels, and the
expandable "view full prompt" block (where the now-note is visible in context)
- server: GET /logs serves the page (FileResponse)
- index.html: "⛶ Full Log" button opens /logs in a new tab
- deploy/lyra-web.service: user service so the chat server is reboot-resilient
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
She had no clock: current date/time and the gap since Brian last spoke were
invisible between turns, and reflection was timeless. Now:
- lyra/clock.py: wall-clock stamp + coarse human gaps ("3 days")
- chat: inject a 'now' note (date/time + gap since last turn) after her
self-state — when she is, before the world
- reflect(): feed current time + silence gap into reflection, neutrally —
prompt invites her to weigh elapsed time "to whatever degree it genuinely
affects you" (no prescribed feeling; whether silence means anything is left
to emerge)
- memory.last_exchange_at(): timestamp of the most recent exchange
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- deploy/lyra-dream.service: --loop 1800 user service on lyra-cortex, so Lyra's
consolidation + reflection keeps ticking unattended between conversations
- deploy/README.md: install / linger / operate runbook
- logbus: mirror events to stderr so out-of-band runs (the dream service under
journald) are observable, not just via the in-process web SSE feed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lyra's inner loop for when no one's talking to her. Each pass senses her own
backlog/novelty, lets four drives build from real signals, and acts on those
past threshold:
- continuity -> summarize sessions with new turns
- coherence -> rebuild profile/eras/narrative (stale once new gists land)
- curiosity -> reflect() and evolve the self-state
- stability -> readout of how caught-up she ended up
Drives are rendered into chat context so she can feel them. Causal chain:
consolidation creates gists -> coherence rises -> integration fires next.
- lyra/dream.py: dream_cycle() + lyra-dream CLI (--force, --loop SECONDS)
- memory: backlog_stats(), profile_sessions_covered(), WAL + busy_timeout
so a separate dream process coexists with the web server
- self_state: DEFAULT_DRIVES baseline + drives in render_for_context
- tests/test_dream.py: backlog sensing + a full forced pass (LLM stubbed)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture the isolated-VM design for the self-modification frontier: Proxmox
sandbox clone, network isolation (esp. from tmi-dev/day-job), snapshot-rollback,
spend/resource caps, kill switch, human-gated promotion. Build the cage before
the agent gets code-write powers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture moonshots/pipe-dreams (own model, memory-as-native-vectors, prompt
compression, RTO/cfr-core tooling) so they don't derail current work but aren't
lost. The discipline: park what's "in the way of the point," ship the working
thing, revisit when it becomes the point.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mid-size models (gpt-4o-mini, qwen2.5-14b) resist persona instructions —
help-desk closers and feelings-disclaimers leak through regardless. Route live
chat to a stronger model while keeping bulk consolidation cheap:
- config: CHAT_MODEL (default gpt-4o), distinct from CLOUD_MODEL (gpt-4o-mini)
- llm.complete gains a `model` override; chat.respond uses chat_model on cloud,
consolidation paths keep cloud_model
- persona: reword the "no sign-off" rule so genuine questions are welcome and
only reflexive customer-service closers are discouraged
Verified: on gpt-4o she owns her mood without disclaimers and drops most
help-desk tails — clearly more in-character than mini/qwen.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two RLHF reflexes were leaking through: ending every turn with "is there
anything else?"/"how does that sound?", and disclaiming feelings ("I don't
really experience emotions like humans"). Add explicit persona instructions to
stop tacking on help-desk offers and to own her moods plainly instead of giving
qualia disclaimers. (Small models partially resist; stronger chat model holds it
better.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Give Lyra a model of *herself* (vs the profile/narrative which model Brian):
- persona: a real origin/identity — she's an AI and knows it (Bender/C-3PO
style), with the Cortex/NeoMem lineage as her actual past, so "how were you
made" stops falling through to generic-assistant deflection.
- memory: self_state table (JSON blob) + get/set_self_state.
- lyra/self_state.py: evolving first-person inner state (mood, valence, energy,
confidence, curiosity, self_narrative, relationship, reflections). render_for_
context injects it; reflect() updates it from recent activity. `lyra-reflect`.
- chat.build_messages injects her interiority right after the persona — she
speaks from a continuous self, not a reset.
The state -> behavior -> reflection -> updated state loop is the substrate for
the emergence experiment. Verified: reflection shifted mood curious->reflective
and produced genuine first-person self-observations.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Complete the consolidation pipeline: summaries -> profile + eras -> narrative.
- memory: eras table (per-month digests) + Era, summaries_by_month, store_era,
list_eras, recall_eras; narrative table + set/get_narrative
- lyra/era.py (lyra-era): groups session gists by the month the session occurred
(real timestamps) and map-reduces each month into a "what was happening" digest
- lyra/narrative.py (lyra-narrative): distills profile + recent eras into the
current arc/trends/callbacks ("remember when…", "you're trending toward…")
- chat.build_messages injects the narrative alongside the profile
Verified on the real corpus: 17 monthly eras (Dec 2024-Jun 2026) + a narrative
that surfaces specific callbacks (the $573 Hollywood session, 4 years sober).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Refactor summarize_all to run LLM summarization across a thread pool (default 8
workers) while keeping all SQLite reads/writes on the main thread (the single
connection is never shared across threads). Extract _summarize_transcript
(transcript -> gist, no DB) for the worker.
The MI50 proved far too slow for the large-transcript backfill (~29 summaries in
9h due to gfx906 prefill); on cloud gpt-4o-mini with concurrency this runs at
~30 summaries/minute (~17 min for the full backfill, ~$2). MI50 stays the chat
backend where small prompts make it snappy.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MI50 llama.cpp server OOM-killed (LXC RAM limit + 8GB prompt cache) mid-run,
and summarize_all had no error handling, so one APIConnectionError killed the
whole batch. Add retry-with-backoff around the summarization LLM call, and
try/except per session in summarize_all (log + skip; unsummarized sessions get
retried on the next run). (Server-side: CT202 RAM raised + prompt cache disabled.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CT202's old static 10.0.0.44 collided with the terra-mechanics dev VM (tmi-dev).
Reassigned CT202 to 10.0.0.42 and repointed MI50_BASE_URL accordingly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MI50 box (CT202) runs an OpenAI-compatible llama.cpp server on
10.0.0.44:8080. Wire it in as a third backend:
- llm.complete gains backend="mi50" (OpenAI client pointed at MI50_BASE_URL)
- config: MI50_BASE_URL (default http://10.0.0.44:8080/v1) + MI50_MODEL
- chat.respond labels the model per backend; web _backend_for maps "mi50"
- UI backend selector adds "MI50 — local GPU"
Verified end-to-end: llm.complete(backend="mi50") returns from the live server.
See homelab-inference memory for the box topology.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Derive a standing profile of the user from session gists and inject it into
every prompt, so identity/abstract questions ("what kind of player am I",
"what are my leaks") are answered from distilled knowledge instead of noisy
single-vector recall (which finds passages, not patterns).
- memory: profile table + get/set_profile, list_summaries
- lyra/profile.py: rebuild_profile map-reduces all gists (batch -> extract
durable facts -> fold-merge) into one profile doc; `lyra-profile` CLI
- chat.build_messages injects "What you know about Brian" after the persona
Run after lyra-summarize (needs gists). Verified (stubbed): map-reduce, storage,
and prompt injection.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Harden summarize_session to chunk + merge long sessions (imported convos can
exceed the local model's context), and add summarize_all: idempotent, resumable
batch that summarizes every session needing it (skips up-to-date ones), with
progress logged to the live log. `lyra-summarize [limit]` CLI.
This is the first consolidation stage feeding the profile (semantic memory) and
era-rollup tiers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
OpenAI's export changed: conversations.json is now sharded into
conversations-000.json..NNN.json, each a JSON array of conversations with the
mapping tree and per-message create_time.
ingest now reads that format directly (supersedes the old convert/trim/split
scripts): walks each conversation's mapping ordered by create_time, keeps text
and multimodal_text (drops thoughts/reasoning_recap), captures real per-message
timestamps, and imports idempotently by conversation_id. `lyra-import <dir>`
auto-detects raw-export vs legacy {title,messages} dirs; optional limit arg.
Verified on 15 conversations: real dates, correct ordering, recall returns
dated poker history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Import the parser's {title, messages} JSON into Lyra's memory so past
conversations seed recall (and, later, the era-rollup tier).
- lyra/ingest.py: one conversation -> one session, text messages -> exchanges;
skips non-text (image asset) messages and non user/assistant roles; embeddings
batched; idempotent by filename-derived session id; `lyra-import <dir>` CLI
- memory.add_exchanges_bulk: batched insert of pre-embedded rows
Format has no timestamps yet, so imports are stamped at import time; a future
dated export will let era memory group by real calendar time.
Verified on the 68-file lyra dev set: 7519 exchanges, idempotent re-run, recall
returns relevant history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "context built" event now carries the fully-rendered prompt (persona, gists,
recalled details, recent turns, the new message) plus a total char count. The
log panel renders it as a collapsed "view full prompt" block — clean by default,
one click to see exactly what hit the model.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Older sessions fade to a general idea; details stay retrievable.
- memory: summaries table (one compacted gist per session, embedded), plus
store_summary/get_summary/recall_summaries and unsummarized_count (tracks
exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
sessions -> a few sharp raw cross-session details -> current session raw
turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Turn the inert "Show Work" thinking panel into a real live activity log:
- lyra/logbus.py: thread-safe in-memory ring buffer other modules publish to
- chat.respond logs backend/model/embed per turn, recall counts, reply size;
web layer logs chat errors
- server: replace the keep-alive /stream/thinking stub with /stream/logs, an
SSE endpoint that replays the recent buffer then streams new events
- UI: repurpose the panel as a global "Live Log" — connects on load, renders
level/time/msg/fields, drops the old per-session localStorage + dead popup
Every turn now shows its backend + model in-app, so local-vs-cloud (free vs
paid) is visible at a glance.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 1 — persona + persistent memory chat loop:
- lyra/persona.py + personas/lyra.md: editable identity/voice (friend-first,
honest, never invents poker math)
- lyra/chat.py: turn loop assembling persona + cross-session recall + recent
context, persisting both sides to SQLite
- lyra/session.py, lyra/__main__.py: session lifecycle + `lyra` REPL
Phase 1.25 — reuse the old web UI:
- vendored the prior single-page UI into lyra/web/static, repointed to
same-origin
- lyra/web/server.py (FastAPI): serves the UI and backs its endpoint contract
(/v1/chat/completions, session CRUD, health, inert thinking-stream) with the
new chat loop + memory; SQLite stays the single source of truth
- `lyra-web` console script
Local backends — test for free, no OpenAI key:
- llm.embed routes via EMBED_BACKEND (cloud=OpenAI, local=Ollama /api/embed)
- simplified UI backend selector to Local (Ollama) / Cloud (OpenAI), default local
- memory connection opened check_same_thread=False for the threaded server
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- lyra.memory.remember(session_id, role, content) embeds and stores
- lyra.memory.recent(session_id, n) returns the last N from a session
- lyra.memory.recall(query, k, session_id=None) returns top-k by cosine
similarity across the chosen scope (all sessions by default)
- Embeddings live in the exchanges.embedding BLOB column as float32 bytes
- Connection reopens automatically if LYRA_DB_PATH changes (test-friendly)
- lyra.config.load() reads env into a frozen Config dataclass
- lyra.llm.complete(messages, backend) routes to Ollama /api/chat or
OpenAI chat completions
- lyra.llm.embed(texts) calls OpenAI embeddings
- .env.example switched from Anthropic to OpenAI to match available key
- Added `trillium.py` for searching and creating notes with Trillium's ETAPI.
- Implemented `search_notes` and `create_note` functions with appropriate error handling and validation.
feat: Add web search functionality using DuckDuckGo
- Introduced `web_search.py` for performing web searches without API keys.
- Implemented `search_web` function with result handling and validation.
feat: Create provider-agnostic function caller for iterative tool calling
- Developed `function_caller.py` to manage LLM interactions with tools.
- Implemented iterative calling logic with error handling and tool execution.
feat: Establish a tool registry for managing available tools
- Created `registry.py` to define and manage tool availability and execution.
- Integrated feature flags for enabling/disabling tools based on environment variables.
feat: Implement event streaming for tool calling processes
- Added `stream_events.py` to manage Server-Sent Events (SSE) for tool calling.
- Enabled real-time updates during tool execution for enhanced user experience.
test: Add tests for tool calling system components
- Created `test_tools.py` to validate functionality of code execution, web search, and tool registry.
- Implemented asynchronous tests to ensure proper execution and result handling.
chore: Add Dockerfile for sandbox environment setup
- Created `Dockerfile` to set up a Python environment with necessary dependencies for code execution.
chore: Add debug regex script for testing XML parsing
- Introduced `debug_regex.py` to validate regex patterns against XML tool calls.
chore: Add HTML template for displaying thinking stream events
- Created `test_thinking_stream.html` for visualizing tool calling events in a user-friendly format.
test: Add tests for OllamaAdapter XML parsing
- Developed `test_ollama_parser.py` to validate XML parsing with various test cases, including malformed XML.