feat: persona chat loop, web UI, and local (Ollama) embeddings
Phase 1 — persona + persistent memory chat loop:
- lyra/persona.py + personas/lyra.md: editable identity/voice (friend-first, honest, never invents poker math)
- lyra/chat.py: turn loop assembling persona + cross-session recall + recent context, persisting both sides to SQLite
- lyra/session.py, lyra/__main__.py: session lifecycle + `lyra` REPL

Phase 1.25 — reuse the old web UI:
- vendored the prior single-page UI into lyra/web/static, repointed to same-origin
- lyra/web/server.py (FastAPI): serves the UI and backs its endpoint contract (/v1/chat/completions, session CRUD, health, inert thinking-stream) with the new chat loop + memory; SQLite stays the single source of truth
- `lyra-web` console script

Local backends — test for free, no OpenAI key:
- llm.embed routes via EMBED_BACKEND (cloud=OpenAI, local=Ollama /api/embed)
- simplified UI backend selector to Local (Ollama) / Cloud (OpenAI), default local
- memory connection opened check_same_thread=False for the threaded server

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
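For reviewers unfamiliar with the `check_same_thread=False` change, here is a minimal sketch of what it allows and what it does not. The helper names, the `messages` schema, and the explicit lock are assumptions for illustration and are not taken from lyra's code; the flag only disables the sqlite3 module's same-thread check, so this sketch serializes access itself.

```python
import sqlite3
import threading


def open_memory(path: str = ":memory:"):
    """Open one shared connection usable from any thread (hypothetical helper)."""
    # check_same_thread=False lets threads other than the creator use the connection.
    conn = sqlite3.connect(path, check_same_thread=False)
    # Illustrative schema only; the real memory tables are not shown in this commit page.
    conn.execute("CREATE TABLE IF NOT EXISTS messages (role TEXT, content TEXT)")
    conn.commit()
    return conn, threading.Lock()


def save_message(conn, lock, role: str, content: str) -> None:
    # The flag removes a safety check but adds no coordination between callers,
    # so each write-and-commit runs under the lock.
    with lock:
        conn.execute(
            "INSERT INTO messages (role, content) VALUES (?, ?)", (role, content)
        )
        conn.commit()


def demo() -> int:
    """Write from several worker threads, then count the rows."""
    conn, lock = open_memory()
    workers = [
        threading.Thread(target=save_message, args=(conn, lock, "user", f"hi {i}"))
        for i in range(8)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    with lock:
        (count,) = conn.execute("SELECT COUNT(*) FROM messages").fetchone()
    conn.close()
    return count
```

Without the flag, the first `execute` from a worker thread would raise `sqlite3.ProgrammingError`, which is the failure a threaded server would otherwise hit.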
+7
-2
@@ -1,11 +1,16 @@
-# Local backend (Ollama) — used by default for most calls.
+# Local backend (Ollama) — free, private. Point this at your home-lab Ollama.
 LOCAL_BASE_URL=http://localhost:11434
 LOCAL_MODEL=qwen2.5:7b-instruct
 
-# Cloud backend (OpenAI) — used for harder reasoning and embeddings.
+# Cloud backend (OpenAI) — higher quality, costs money.
 OPENAI_API_KEY=
 CLOUD_MODEL=gpt-4o-mini
+
+# Embeddings: "cloud" (OpenAI) or "local" (Ollama). A database is tied to whichever
+# backend created it — don't switch this against an existing DB (vector spaces differ).
+EMBED_BACKEND=cloud
 EMBED_MODEL=text-embedding-3-small
+LOCAL_EMBED_MODEL=nomic-embed-text
 
 # Where Lyra stores her memory.
 LYRA_DB_PATH=data/lyra.db
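To make the `EMBED_BACKEND` switch concrete, the following is a rough sketch of how these variables could select an embedding endpoint, plus a guard for the "don't switch against an existing DB" warning. It only builds the request and makes no network call. `EmbedRequest`, `build_embed_request`, and `check_backend` are hypothetical names, and storing the backend name alongside the database is an assumption; the actual `llm.embed` implementation is not shown here.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbedRequest:
    """What would be POSTed to the chosen backend (illustrative container)."""
    url: str
    payload: dict
    headers: dict = field(default_factory=dict)


def build_embed_request(text: str, env: Optional[Mapping[str, str]] = None) -> EmbedRequest:
    """Pick the embedding endpoint from EMBED_BACKEND; defaults mirror the env file."""
    env = os.environ if env is None else env
    backend = env.get("EMBED_BACKEND", "cloud").strip().lower()
    if backend == "local":
        base = env.get("LOCAL_BASE_URL", "http://localhost:11434").rstrip("/")
        model = env.get("LOCAL_EMBED_MODEL", "nomic-embed-text")
        # Ollama's /api/embed accepts a model name and the input text.
        return EmbedRequest(f"{base}/api/embed", {"model": model, "input": text})
    if backend == "cloud":
        key = env.get("OPENAI_API_KEY", "")
        if not key:
            raise ValueError("EMBED_BACKEND=cloud requires OPENAI_API_KEY")
        model = env.get("EMBED_MODEL", "text-embedding-3-small")
        return EmbedRequest(
            "https://api.openai.com/v1/embeddings",
            {"model": model, "input": text},
            {"Authorization": f"Bearer {key}"},
        )
    raise ValueError(f"unknown EMBED_BACKEND: {backend!r}")


def check_backend(db_backend: Optional[str], configured: str) -> str:
    """Refuse to mix vector spaces: a DB keeps the backend that first wrote to it.

    db_backend is assumed to be read from some metadata stored with the DB
    (None for a fresh database); that storage is not part of this sketch.
    """
    if db_backend is not None and db_backend != configured:
        raise RuntimeError(
            f"database was built with {db_backend!r} embeddings, "
            f"but EMBED_BACKEND={configured!r}"
        )
    return configured
```

Passing a plain dict as `env` keeps the function easy to exercise without touching the process environment, for example `build_embed_request("hello", {"EMBED_BACKEND": "local"})`.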