project-lyra

Author	SHA1	Message	Date
serversdown	1301f12e74	feat: run dream cycle as a systemd user service + journald-visible logs - deploy/lyra-dream.service: --loop 1800 user service on lyra-cortex, so Lyra's consolidation + reflection keeps ticking unattended between conversations - deploy/README.md: install / linger / operate runbook - logbus: mirror events to stderr so out-of-band runs (the dream service under journald) are observable, not just via the in-process web SSE feed Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 01:42:55 +00:00
serversdown	4f40e2d57e	feat: dream cycle — drives-driven unattended consolidation + reflection Lyra's inner loop for when no one's talking to her. Each pass senses her own backlog/novelty, lets four drives build from real signals, and acts on those past threshold: - continuity -> summarize sessions with new turns - coherence -> rebuild profile/eras/narrative (stale once new gists land) - curiosity -> reflect() and evolve the self-state - stability -> readout of how caught-up she ended up Drives are rendered into chat context so she can feel them. Causal chain: consolidation creates gists -> coherence rises -> integration fires next. - lyra/dream.py: dream_cycle() + lyra-dream CLI (--force, --loop SECONDS) - memory: backlog_stats(), profile_sessions_covered(), WAL + busy_timeout so a separate dream process coexists with the web server - self_state: DEFAULT_DRIVES baseline + drives in render_for_context - tests/test_dream.py: backlog sensing + a full forced pass (LLM stubbed) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 00:52:44 +00:00
serversdown	f3530cf4ae	feat: separate CHAT_MODEL (gpt-4o) for persona fidelity Mid-size models (gpt-4o-mini, qwen2.5-14b) resist persona instructions — help-desk closers and feelings-disclaimers leak through regardless. Route live chat to a stronger model while keeping bulk consolidation cheap: - config: CHAT_MODEL (default gpt-4o), distinct from CLOUD_MODEL (gpt-4o-mini) - llm.complete gains a `model` override; chat.respond uses chat_model on cloud, consolidation paths keep cloud_model - persona: reword the "no sign-off" rule so genuine questions are welcome and only reflexive customer-service closers are discouraged Verified: on gpt-4o she owns her mood without disclaimers and drops most help-desk tails — clearly more in-character than mini/qwen. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 21:05:47 +00:00
serversdown	e512cd1926	fix(persona): kill help-desk tics + own moods (Bender/C-3PO) Two RLHF reflexes were leaking through: ending every turn with "is there anything else?"/"how does that sound?", and disclaiming feelings ("I don't really experience emotions like humans"). Add explicit persona instructions to stop tacking on help-desk offers and to own her moods plainly instead of giving qualia disclaimers. (Small models partially resist; stronger chat model holds it better.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 20:54:22 +00:00
serversdown	ac505243a0	feat: Autonomy Core v1 — Lyra's evolving self-state Give Lyra a model of herself (vs the profile/narrative which model Brian): - persona: a real origin/identity — she's an AI and knows it (Bender/C-3PO style), with the Cortex/NeoMem lineage as her actual past, so "how were you made" stops falling through to generic-assistant deflection. - memory: self_state table (JSON blob) + get/set_self_state. - lyra/self_state.py: evolving first-person inner state (mood, valence, energy, confidence, curiosity, self_narrative, relationship, reflections). render_for_context injects it; reflect() updates it from recent activity. `lyra-reflect`. - chat.build_messages injects her interiority right after the persona — she speaks from a continuous self, not a reset. The state -> behavior -> reflection -> updated state loop is the substrate for the emergence experiment. Verified: reflection shifted mood curious->reflective and produced genuine first-person self-observations. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 20:36:33 +00:00
serversdown	bfb81428ab	feat: era-rollup + narrative engine (consolidation steps 3-4) Complete the consolidation pipeline: summaries -> profile + eras -> narrative. - memory: eras table (per-month digests) + Era, summaries_by_month, store_era, list_eras, recall_eras; narrative table + set/get_narrative - lyra/era.py (lyra-era): groups session gists by the month the session occurred (real timestamps) and map-reduces each month into a "what was happening" digest - lyra/narrative.py (lyra-narrative): distills profile + recent eras into the current arc/trends/callbacks ("remember when…", "you're trending toward…") - chat.build_messages injects the narrative alongside the profile Verified on the real corpus: 17 monthly eras (Dec 2024-Jun 2026) + a narrative that surfaces specific callbacks (the $573 Hollywood session, 4 years sober). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 19:28:01 +00:00
serversdown	d7e2fce694	perf: concurrent summarize-all (parallel LLM, serial DB) Refactor summarize_all to run LLM summarization across a thread pool (default 8 workers) while keeping all SQLite reads/writes on the main thread (the single connection is never shared across threads). Extract _summarize_transcript (transcript -> gist, no DB) for the worker. The MI50 proved far too slow for the large-transcript backfill (~29 summaries in 9h due to gfx906 prefill); on cloud gpt-4o-mini with concurrency this runs at ~30 summaries/minute (~17 min for the full backfill, ~$2). MI50 stays the chat backend where small prompts make it snappy. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 16:30:07 +00:00
serversdown	34392e4097	fix: make summarize-all resilient to backend hiccups The MI50 llama.cpp server OOM-killed (LXC RAM limit + 8GB prompt cache) mid-run, and summarize_all had no error handling, so one APIConnectionError killed the whole batch. Add retry-with-backoff around the summarization LLM call, and try/except per session in summarize_all (log + skip; unsummarized sessions get retried on the next run). (Server-side: CT202 RAM raised + prompt cache disabled.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 06:31:28 +00:00
serversdown	aae95bfa6c	fix: point MI50 backend at 10.0.0.42 (avoid terra-mechanics conflict) CT202's old static 10.0.0.44 collided with the terra-mechanics dev VM (tmi-dev). Reassigned CT202 to 10.0.0.42 and repointed MI50_BASE_URL accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 05:52:15 +00:00
serversdown	30185f3fd8	feat: MI50 as a Lyra backend (OpenAI-compatible local GPU) The MI50 box (CT202) runs an OpenAI-compatible llama.cpp server on 10.0.0.44:8080. Wire it in as a third backend: - llm.complete gains backend="mi50" (OpenAI client pointed at MI50_BASE_URL) - config: MI50_BASE_URL (default http://10.0.0.44:8080/v1) + MI50_MODEL - chat.respond labels the model per backend; web _backend_for maps "mi50" - UI backend selector adds "MI50 — local GPU" Verified end-to-end: llm.complete(backend="mi50") returns from the live server. See homelab-inference memory for the box topology. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 05:37:22 +00:00
serversdown	ecf0b852f9	feat: profile layer — semantic memory (consolidation step 2) Derive a standing profile of the user from session gists and inject it into every prompt, so identity/abstract questions ("what kind of player am I", "what are my leaks") are answered from distilled knowledge instead of noisy single-vector recall (which finds passages, not patterns). - memory: profile table + get/set_profile, list_summaries - lyra/profile.py: rebuild_profile map-reduces all gists (batch -> extract durable facts -> fold-merge) into one profile doc; `lyra-profile` CLI - chat.build_messages injects "What you know about Brian" after the persona Run after lyra-summarize (needs gists). Verified (stubbed): map-reduce, storage, and prompt injection. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 04:11:19 +00:00
serversdown	071522ea33	feat: summarize-all batch (consolidation step 1) Harden summarize_session to chunk + merge long sessions (imported convos can exceed the local model's context), and add summarize_all: idempotent, resumable batch that summarizes every session needing it (skips up-to-date ones), with progress logged to the live log. `lyra-summarize [limit]` CLI. This is the first consolidation stage feeding the profile (semantic memory) and era-rollup tiers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 04:08:41 +00:00
serversdown	194e3e64b9	feat: import raw ChatGPT export (new sharded format) OpenAI's export changed: conversations.json is now sharded into conversations-000.json..NNN.json, each a JSON array of conversations with the mapping tree and per-message create_time. ingest now reads that format directly (supersedes the old convert/trim/split scripts): walks each conversation's mapping ordered by create_time, keeps text and multimodal_text (drops thoughts/reasoning_recap), captures real per-message timestamps, and imports idempotently by conversation_id. `lyra-import <dir>` auto-detects raw-export vs legacy {title,messages} dirs; optional limit arg. Verified on 15 conversations: real dates, correct ordering, recall returns dated poker history. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 02:40:32 +00:00
serversdown	f3037b7879	feat: ChatGPT chat-log importer Import the parser's {title, messages} JSON into Lyra's memory so past conversations seed recall (and, later, the era-rollup tier). - lyra/ingest.py: one conversation -> one session, text messages -> exchanges; skips non-text (image asset) messages and non user/assistant roles; embeddings batched; idempotent by filename-derived session id; `lyra-import <dir>` CLI - memory.add_exchanges_bulk: batched insert of pre-embedded rows Format has no timestamps yet, so imports are stamped at import time; a future dated export will let era memory group by real calendar time. Verified on the 68-file lyra dev set: 7519 exchanges, idempotent re-run, recall returns relevant history. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 00:51:45 +00:00
serversdown	236a16b331	feat: inspect the full prompt in the live log The "context built" event now carries the fully-rendered prompt (persona, gists, recalled details, recent turns, the new message) plus a total char count. The log panel renders it as a collapsed "view full prompt" block — clean by default, one click to see exactly what hit the model. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 23:52:35 +00:00
serversdown	d7c258eba0	feat: tiered, compacting memory (phase 1.5) Older sessions fade to a general idea; details stay retrievable. - memory: summaries table (one compacted gist per session, embedded), plus store_summary/get_summary/recall_summaries and unsummarized_count (tracks exchanges newer than the current summary) - lyra/summary.py: summarize_session compacts a session's raw turns into a third-person gist (default SUMMARY_BACKEND=local, so compaction is free); maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate - chat.build_messages now layers context in tiers: persona -> gists of other sessions -> a few sharp raw cross-session details -> current session raw turns -> new message; respond() compacts the session after each turn - web: POST /sessions/{id}/summarize to compact on demand - summarization activity surfaces in the live log Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 18:52:58 +00:00
serversdown	84c4f75e03	feat: in-app live log (SSE activity feed) Turn the inert "Show Work" thinking panel into a real live activity log: - lyra/logbus.py: thread-safe in-memory ring buffer other modules publish to - chat.respond logs backend/model/embed per turn, recall counts, reply size; web layer logs chat errors - server: replace the keep-alive /stream/thinking stub with /stream/logs, an SSE endpoint that replays the recent buffer then streams new events - UI: repurpose the panel as a global "Live Log" — connects on load, renders level/time/msg/fields, drops the old per-session localStorage + dead popup Every turn now shows its backend + model in-app, so local-vs-cloud (free vs paid) is visible at a glance. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 18:45:05 +00:00
serversdown	3b9e0bb1e0	feat: persona chat loop, web UI, and local (Ollama) embeddings Phase 1 — persona + persistent memory chat loop: - lyra/persona.py + personas/lyra.md: editable identity/voice (friend-first, honest, never invents poker math) - lyra/chat.py: turn loop assembling persona + cross-session recall + recent context, persisting both sides to SQLite - lyra/session.py, lyra/__main__.py: session lifecycle + `lyra` REPL Phase 1.25 — reuse the old web UI: - vendored the prior single-page UI into lyra/web/static, repointed to same-origin - lyra/web/server.py (FastAPI): serves the UI and backs its endpoint contract (/v1/chat/completions, session CRUD, health, inert thinking-stream) with the new chat loop + memory; SQLite stays the single source of truth - `lyra-web` console script Local backends — test for free, no OpenAI key: - llm.embed routes via EMBED_BACKEND (cloud=OpenAI, local=Ollama /api/embed) - simplified UI backend selector to Local (Ollama) / Cloud (OpenAI), default local - memory connection opened check_same_thread=False for the threaded server Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 18:36:31 +00:00
Claude	0ee5a9ce47	feat: SQLite-backed memory with brute-force cosine recall - lyra.memory.remember(session_id, role, content) embeds and stores - lyra.memory.recent(session_id, n) returns the last N from a session - lyra.memory.recall(query, k, session_id=None) returns top-k by cosine similarity across the chosen scope (all sessions by default) - Embeddings live in the exchanges.embedding BLOB column as float32 bytes - Connection reopens automatically if LYRA_DB_PATH changes (test-friendly)	2026-05-16 06:35:52 +00:00
Claude	6a1255dfdb	feat: LLM router with local (Ollama) and cloud (OpenAI) backends - lyra.config.load() reads env into a frozen Config dataclass - lyra.llm.complete(messages, backend) routes to Ollama /api/chat or OpenAI chat completions - lyra.llm.embed(texts) calls OpenAI embeddings - .env.example switched from Anthropic to OpenAI to match available key	2026-05-16 06:10:48 +00:00
Claude	b2523c2561	chore: project scaffold (uv, .env.example, README, lyra package)	2026-05-16 06:01:08 +00:00
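
Commit 0ee5a9ce47 above describes embeddings stored as float32 bytes in the `exchanges.embedding` BLOB column and recalled by brute-force cosine similarity. The following is a minimal sketch of that idea, not the repository's code. Only the table name `exchanges` and the column `embedding` come from the log. The other columns, the helper names (`to_blob`, `from_blob`, `cosine`), and a `recall` that takes a connection and a ready-made query vector instead of embedding a query string are all assumptions made for illustration.

```python
import math
import sqlite3
from array import array


def to_blob(vec):
    """Pack a list of floats into float32 bytes for a BLOB column."""
    return array("f", vec).tobytes()


def from_blob(blob):
    """Unpack float32 bytes back into a list of floats."""
    out = array("f")
    out.frombytes(blob)
    return list(out)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recall(conn, query_vec, k=3, session_id=None):
    """Return the top-k (score, content) pairs by cosine similarity.

    Brute force: every stored row in scope is scored.
    session_id=None searches across all sessions.
    """
    sql = "SELECT content, embedding FROM exchanges"
    params = ()
    if session_id is not None:
        sql += " WHERE session_id = ?"
        params = (session_id,)
    scored = [
        (cosine(query_vec, from_blob(blob)), content)
        for content, blob in conn.execute(sql, params)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data with hand-written 3-dimensional "embeddings" (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exchanges "
    "(session_id TEXT, role TEXT, content TEXT, embedding BLOB)"
)
rows = [
    ("s1", "user", "poker hand review", [1.0, 0.0, 0.0]),
    ("s1", "assistant", "bankroll notes", [0.0, 1.0, 0.0]),
    ("s2", "user", "tournament recap", [0.9, 0.1, 0.0]),
]
conn.executemany(
    "INSERT INTO exchanges VALUES (?, ?, ?, ?)",
    [(s, r, c, to_blob(v)) for s, r, c, v in rows],
)

top = recall(conn, [1.0, 0.0, 0.0], k=2)
```

A full scan costs time linear in the number of stored rows, which is the trade-off the phrase "brute-force" names. It needs no index and stays simple while the corpus is small.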

21 Commits