d7c258eba0
Older sessions fade to a general idea; details stay retrievable.
- memory: summaries table (one compacted gist per session, embedded), plus
store_summary/get_summary/recall_summaries and unsummarized_count (tracks
exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
sessions -> a few sharp raw cross-session details -> current session raw
turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
20 lines
675 B
Bash
20 lines
675 B
Bash
# Local backend (Ollama) — free, private. Point this at your home-lab Ollama.
|
|
LOCAL_BASE_URL=http://localhost:11434
|
|
LOCAL_MODEL=qwen2.5:7b-instruct
|
|
|
|
# Cloud backend (OpenAI) — higher quality, costs money.
|
|
OPENAI_API_KEY=
|
|
CLOUD_MODEL=gpt-4o-mini
|
|
|
|
# Embeddings: "cloud" (OpenAI) or "local" (Ollama). A database is tied to whichever
|
|
# backend created it — don't switch this against an existing DB (vector spaces differ).
|
|
EMBED_BACKEND=cloud
|
|
EMBED_MODEL=text-embedding-3-small
|
|
LOCAL_EMBED_MODEL=nomic-embed-text
|
|
|
|
# Backend used to compact old sessions into summaries ("local" keeps it free).
|
|
SUMMARY_BACKEND=local
|
|
|
|
# Where Lyra stores her memory.
|
|
LYRA_DB_PATH=data/lyra.db
|