feat: tiered, compacting memory (phase 1.5)
Older sessions fade to a general idea; details stay retrievable.
- memory: summaries table (one compacted gist per session, embedded), plus
store_summary/get_summary/recall_summaries and unsummarized_count (tracks
exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
sessions -> a few sharp raw cross-session details -> current session raw
turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -19,6 +19,7 @@ class Config:
|
||||
embed_backend: str # "cloud" (OpenAI) or "local" (Ollama)
|
||||
embed_model: str # OpenAI embedding model
|
||||
local_embed_model: str # Ollama embedding model
|
||||
summary_backend: str # "local" or "cloud" — backend used to compact memory
|
||||
db_path: Path
|
||||
|
||||
|
||||
@@ -31,5 +32,6 @@ def load() -> Config:
|
||||
embed_backend=os.getenv("EMBED_BACKEND", "cloud").lower(),
|
||||
embed_model=os.getenv("EMBED_MODEL", "text-embedding-3-small"),
|
||||
local_embed_model=os.getenv("LOCAL_EMBED_MODEL", "nomic-embed-text"),
|
||||
summary_backend=os.getenv("SUMMARY_BACKEND", "local").lower(),
|
||||
db_path=Path(os.getenv("LYRA_DB_PATH", "data/lyra.db")),
|
||||
)
|
||||
|
||||
Reference in New Issue
Block a user