feat: tiered, compacting memory (phase 1.5)

Older sessions fade to a general idea; details stay retrievable.

- memory: summaries table (one compacted gist per session, embedded), plus
  store_summary/get_summary/recall_summaries and unsummarized_count (tracks
  exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
  third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
  maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
  sessions -> a few sharp raw cross-session details -> current session raw
  turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-15 18:52:58 +00:00
parent 84c4f75e03
commit d7c258eba0
6 changed files with 211 additions and 23 deletions
+3
View File
@@ -12,5 +12,8 @@ EMBED_BACKEND=cloud
EMBED_MODEL=text-embedding-3-small
LOCAL_EMBED_MODEL=nomic-embed-text
# Backend used to compact old sessions into summaries ("local" keeps it free).
SUMMARY_BACKEND=local
# Where Lyra stores her memory.
LYRA_DB_PATH=data/lyra.db