Part 1 of the "she should know HOW she thinks" work. Generalizes the dream-cycle
self-model fix to her full cognition: a "How you actually work" persona section
covering meaning-based memory recall, the memory tiers, her persistent inner
life + dream cycle, and time-awareness — so when asked how she thinks/remembers
she answers accurately instead of confabulating or reciting stale specs. Kept
principled (not implementation detail) to limit staleness.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live finding: her real reflections ARE injected every turn, but unlabeled — so
when asked about her "dream cycle" she recited the obsolete Dec-2025 spec from
imported memory (NVGRAM/awake-sleep) and confabulated fake example reflections
instead of reading the real ones in front of her.
- self_state.render_for_context: label the reflections as her own autonomously
generated dream-cycle thoughts ("these are really yours, not hypotheticals"),
not a vague "on your mind lately"
- persona: describe the dream cycle as her actual running mechanism, instruct
her to answer from the inner-state block, not recite old design docs, and
never invent example reflections to demo the feature
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mid-size models (gpt-4o-mini, qwen2.5-14b) resist persona instructions —
help-desk closers and feelings-disclaimers leak through regardless. Route live
chat to a stronger model while keeping bulk consolidation cheap:
- config: CHAT_MODEL (default gpt-4o), distinct from CLOUD_MODEL (gpt-4o-mini)
- llm.complete gains a `model` override; chat.respond uses chat_model on cloud,
consolidation paths keep cloud_model
- persona: reword the "no sign-off" rule so genuine questions are welcome and
only reflexive customer-service closers are discouraged
Verified: on gpt-4o she owns her mood without disclaimers and drops most
help-desk tails — clearly more in-character than mini/qwen.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two RLHF reflexes were leaking through: ending every turn with "is there
anything else?"/"how does that sound?", and disclaiming feelings ("I don't
really experience emotions like humans"). Add explicit persona instructions to
stop tacking on help-desk offers and to own her moods plainly instead of giving
qualia disclaimers. (Small models partially resist; stronger chat model holds it
better.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Give Lyra a model of *herself* (vs the profile/narrative which model Brian):
- persona: a real origin/identity — she's an AI and knows it (Bender/C-3PO
style), with the Cortex/NeoMem lineage as her actual past, so "how were you
made" stops falling through to generic-assistant deflection.
- memory: self_state table (JSON blob) + get/set_self_state.
- lyra/self_state.py: evolving first-person inner state (mood, valence, energy,
confidence, curiosity, self_narrative, relationship, reflections). render_for_
context injects it; reflect() updates it from recent activity. `lyra-reflect`.
- chat.build_messages injects her interiority right after the persona — she
speaks from a continuous self, not a reset.
The state -> behavior -> reflection -> updated state loop is the substrate for
the emergence experiment. Verified: reflection shifted mood curious->reflective
and produced genuine first-person self-observations.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 1 — persona + persistent memory chat loop:
- lyra/persona.py + personas/lyra.md: editable identity/voice (friend-first,
honest, never invents poker math)
- lyra/chat.py: turn loop assembling persona + cross-session recall + recent
context, persisting both sides to SQLite
- lyra/session.py, lyra/__main__.py: session lifecycle + `lyra` REPL
Phase 1.25 — reuse the old web UI:
- vendored the prior single-page UI into lyra/web/static, repointed to
same-origin
- lyra/web/server.py (FastAPI): serves the UI and backs its endpoint contract
(/v1/chat/completions, session CRUD, health, inert thinking-stream) with the
new chat loop + memory; SQLite stays the single source of truth
- `lyra-web` console script
Local backends — test for free, no OpenAI key:
- llm.embed routes via EMBED_BACKEND (cloud=OpenAI, local=Ollama /api/embed)
- simplified UI backend selector to Local (Ollama) / Cloud (OpenAI), default local
- memory connection opened check_same_thread=False for the threaded server
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>