feat: tiered, compacting memory (phase 1.5)
Older sessions fade to a general idea; details stay retrievable.
- memory: summaries table (one compacted gist per session, embedded), plus
store_summary/get_summary/recall_summaries and unsummarized_count (tracks
exchanges newer than the current summary)
- lyra/summary.py: summarize_session compacts a session's raw turns into a
third-person gist (default SUMMARY_BACKEND=local, so compaction is free);
maybe_summarize re-summarizes once SUMMARIZE_AFTER new turns accumulate
- chat.build_messages now layers context in tiers: persona -> gists of other
sessions -> a few sharp raw cross-session details -> current session raw
turns -> new message; respond() compacts the session after each turn
- web: POST /sessions/{id}/summarize to compact on demand
- summarization activity surfaces in the live log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
+6
-1
@@ -18,7 +18,7 @@ from fastapi import FastAPI, Request
|
||||
from fastapi.responses import StreamingResponse
|
||||
from fastapi.staticfiles import StaticFiles
|
||||
|
||||
from lyra import chat, logbus, memory
|
||||
from lyra import chat, logbus, memory, summary
|
||||
from lyra.llm import Backend
|
||||
|
||||
|
||||
@@ -77,6 +77,11 @@ def create_app() -> FastAPI:
|
||||
memory.delete_session(session_id)
|
||||
return {"ok": True}
|
||||
|
||||
@app.post("/sessions/{session_id}/summarize")
|
||||
async def summarize(session_id: str) -> dict:
|
||||
gist = await asyncio.to_thread(summary.summarize_session, session_id)
|
||||
return {"ok": gist is not None, "summary": gist}
|
||||
|
||||
@app.post("/v1/chat/completions")
|
||||
async def chat_completions(request: Request) -> dict:
|
||||
body = await request.json()
|
||||
|
||||
Reference in New Issue
Block a user