30185f3fd8b1eb49be1853f705ddfb00b5440550
The MI50 box (CT202) runs an OpenAI-compatible llama.cpp server on 10.0.0.44:8080. Wire it in as a third backend:
- llm.complete gains backend="mi50" (OpenAI client pointed at MI50_BASE_URL)
- config: MI50_BASE_URL (default http://10.0.0.44:8080/v1) + MI50_MODEL
- chat.respond labels the model per backend; web _backend_for maps "mi50"
- UI backend selector adds "MI50 — local GPU"

Verified end-to-end: llm.complete(backend="mi50") returns from the live server. See homelab-inference memory for the box topology.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lyra
A persistent, autonomous AI assistant. From-scratch rewrite of an earlier attempt.
The design thinking that survives the rewrite lives in docs/ — start with docs/ARCH_v0-6-1.md. The previous implementation is preserved on the archive branch.
Status
Pre-MVP. Building toward the smallest useful version: chat with persistent memory across sessions.
Setup
uv sync
cp .env.example .env
# fill in ANTHROPIC_API_KEY and point LOCAL_BASE_URL at your Ollama server
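As a rough illustration of how these settings could be resolved, here is a minimal sketch. It is not the repository's actual config code. The variable names ANTHROPIC_API_KEY, LOCAL_BASE_URL, MI50_BASE_URL and MI50_MODEL, the "mi50" backend name, and the MI50 default URL come from this page. The `Settings` class, the `load_settings` and `base_url_for` helpers, and the "local" backend name are hypothetical. The sketch also assumes that the local backend is reached through an OpenAI-compatible URL and that the `.env` values have already been loaded into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default taken from the commit note on this page; everything else is illustrative.
MI50_DEFAULT_BASE_URL = "http://10.0.0.44:8080/v1"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values read from .env."""
    anthropic_api_key: str
    local_base_url: str
    mi50_base_url: str
    mi50_model: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, failing early on the keys Setup asks for."""
    required = ("ANTHROPIC_API_KEY", "LOCAL_BASE_URL")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return Settings(
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        local_base_url=env["LOCAL_BASE_URL"],
        # Fall back to the documented default when MI50_BASE_URL is unset or empty.
        mi50_base_url=env.get("MI50_BASE_URL") or MI50_DEFAULT_BASE_URL,
        # No default model is documented, so leave it unset rather than guess.
        mi50_model=env.get("MI50_MODEL") or None,
    )


def base_url_for(settings: Settings, backend: str) -> str:
    """Map a backend name to its base URL ("local" is an assumed name)."""
    endpoints = {
        "local": settings.local_base_url,
        "mi50": settings.mi50_base_url,
    }
    if backend not in endpoints:
        raise ValueError(f"no base URL configured for backend {backend!r}")
    return endpoints[backend]
```

Passing the environment as a mapping keeps the helper easy to exercise with a plain dict, without touching the real `.env`.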
Architecture
The long-term target is the cognitive split in docs/ARCH_v0-6-1.md — Inner Self as the seat of consciousness, Executive for hard reasoning, Cortex Chat for drafting, Persona for voice. The MVP implements only the chat + memory baseline. Cognitive layers come back one at a time.
Languages: HTML 46.7%, Python 32.1%, CSS 21.2%