docs updated v0.7.0
README.md
@@ -1,10 +1,12 @@
# Project Lyra - README v0.6.0
# Project Lyra - README v0.7.0

Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.

**Current Version:** v0.6.0 (2025-12-18)
**NEW in v0.7.0:** Standard Mode for simple chatbot functionality + UI backend selection + server-side session persistence

**Current Version:** v0.7.0 (2025-12-21)

> **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.

@@ -25,14 +27,18 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Coordinates all module interactions
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
- Internal endpoint: `POST /chat`
- Routes messages through Cortex reasoning pipeline
- Dual-mode routing: Standard Mode (simple chat) or Cortex Mode (full reasoning), as sketched after this list
- Server-side session persistence with file-based storage
- Session management API: `GET/POST/PATCH/DELETE /sessions`
- Manages async calls to Cortex ingest
- *(NeoMem integration currently disabled in v0.6.0)*

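The dual-mode dispatch can be summarized in a few lines. The sketch below assumes a FastAPI-style Relay and reuses the `CORTEX_REASON_URL`/`CORTEX_SIMPLE_URL` variables from the configuration section later in this README; Relay's actual implementation may differ.

```python
# Hypothetical sketch of Relay's dual-mode routing (not the actual Relay source).
import os

import httpx
from fastapi import FastAPI, Request

app = FastAPI()

# Endpoint URLs taken from the Service URLs section of this README.
CORTEX_REASON_URL = os.getenv("CORTEX_REASON_URL", "http://cortex:7081/reason")
CORTEX_SIMPLE_URL = os.getenv("CORTEX_SIMPLE_URL", "http://cortex:7081/simple")

@app.post("/v1/chat/completions")
async def chat_completions(request: Request):
    payload = await request.json()
    # Standard Mode goes to /simple; anything else takes the full pipeline.
    url = CORTEX_SIMPLE_URL if payload.get("mode") == "standard" else CORTEX_REASON_URL
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(url, json=payload)
    return resp.json()
```
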
**2. UI** (Static HTML)
- Browser-based chat interface with cyberpunk theme
- Connects to Relay
- Saves and loads sessions
- **NEW:** Mode selector (Standard/Cortex) in header
- **NEW:** Settings modal with backend selection and session management
- **NEW:** Light/Dark mode toggle (dark by default)
- Server-synced session management (persists across browsers and reboots)
- OpenAI-compatible message format

**3. NeoMem** (Python/FastAPI) - Port 7077 - **DISABLED IN v0.6.0**
@@ -49,15 +55,22 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Primary reasoning engine with multi-stage pipeline and autonomy system
- **Includes embedded Intake module** (no separate service as of v0.5.1)
- **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
- **4-Stage Processing:**
  1. **Reflection** - Generates meta-awareness notes about conversation
  2. **Reasoning** - Creates initial draft answer using context
  3. **Refinement** - Polishes and improves the draft
  4. **Persona** - Applies Lyra's personality and speaking style
- **Dual Operating Modes:**
  - **Standard Mode** (NEW in v0.7.0) - Simple chatbot with context retention
    - Bypasses reflection, reasoning, refinement stages
    - Direct LLM call with conversation history
    - User-selectable backend (SECONDARY, OPENAI, or custom)
    - Faster responses for coding and practical tasks
  - **Cortex Mode** - Full 4-stage reasoning pipeline (sketched after this list)
    1. **Reflection** - Generates meta-awareness notes about conversation
    2. **Reasoning** - Creates initial draft answer using context
    3. **Refinement** - Polishes and improves the draft
    4. **Persona** - Applies Lyra's personality and speaking style
- Integrates with Intake for short-term context via internal Python imports
- Flexible LLM router supporting multiple backends via HTTP
- **Endpoints:**
  - `POST /reason` - Main reasoning pipeline
  - `POST /reason` - Main reasoning pipeline (Cortex Mode)
  - `POST /simple` - Direct LLM chat (Standard Mode) **NEW in v0.7.0**
  - `POST /ingest` - Receives conversation exchanges from Relay
  - `GET /health` - Service health check
  - `GET /debug/sessions` - Inspect in-memory SESSIONS state
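The Cortex Mode chain can be pictured as four sequential LLM calls, each stage consuming the previous stage's output. This is a minimal illustrative sketch only: the `call_llm` helper and the prompts are hypothetical, and the per-stage backend assignments merely follow the configuration examples later in this README.

```python
# Hypothetical sketch of the 4-stage Cortex pipeline (helpers are illustrative).

async def call_llm(backend: str, prompt: str) -> str:
    """Placeholder for the HTTP LLM router described in this README."""
    raise NotImplementedError

async def run_pipeline(user_message: str, context: str) -> str:
    # 1. Reflection: meta-awareness notes about the conversation.
    notes = await call_llm("OPENAI", f"Reflect on this exchange:\n{context}\n{user_message}")
    # 2. Reasoning: initial draft answer using context + notes.
    draft = await call_llm("SECONDARY", f"Notes: {notes}\nContext: {context}\nAnswer: {user_message}")
    # 3. Refinement: polish and improve the draft.
    refined = await call_llm("SECONDARY", f"Improve this draft:\n{draft}")
    # 4. Persona: apply Lyra's personality and speaking style.
    return await call_llm("OPENAI", f"Rewrite in Lyra's voice:\n{refined}")
```
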
@@ -129,12 +142,38 @@ The autonomy system operates in coordinated layers, all maintaining state in `se

---

## Data Flow Architecture (v0.6.0)
## Data Flow Architecture (v0.7.0)

### Normal Message Flow:
### Standard Mode Flow (NEW in v0.7.0):

```
User (UI) → POST /v1/chat/completions
User (UI) → POST /v1/chat/completions {mode: "standard", backend: "SECONDARY"}
    ↓
Relay (7078)
    ↓ POST /simple
Cortex (7081)
    ↓ (internal Python call)
Intake module → get_recent_messages() (last 20 messages)
    ↓
Direct LLM call (user-selected backend: SECONDARY/OPENAI/custom)
    ↓
Returns simple response to Relay
    ↓
Relay → POST /ingest (async)
    ↓
Cortex → add_exchange_internal() → SESSIONS buffer
    ↓
Relay → POST /sessions/:id (save session to file)
    ↓
Relay → UI (returns final response)

Note: Bypasses reflection, reasoning, refinement, persona stages
```

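To make the flow concrete, here is a minimal sketch of what a `/simple` handler could look like. `get_recent_messages()` is named in the diagram above; its signature, the request shape, and `call_llm` are assumptions, not Cortex's actual source.

```python
# Hypothetical sketch of Cortex's /simple handler (Standard Mode).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def get_recent_messages(session_id: str) -> list[dict]:
    """Stand-in for the embedded Intake helper named in the flow diagram;
    the real one returns the last 20 messages for the session."""
    return []

async def call_llm(backend: str, messages: list[dict]) -> str:
    """Placeholder for the HTTP LLM router (SECONDARY/OPENAI/custom)."""
    raise NotImplementedError

class SimpleRequest(BaseModel):
    sessionId: str
    backend: str = "SECONDARY"
    messages: list[dict]

@app.post("/simple")
async def simple_chat(req: SimpleRequest):
    history = get_recent_messages(req.sessionId)   # short-term context from Intake
    reply = await call_llm(req.backend, history + req.messages)
    return {"response": reply}
```
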
### Cortex Mode Flow (Full Reasoning):

```
User (UI) → POST /v1/chat/completions {mode: "cortex"}
    ↓
Relay (7078)
    ↓ POST /reason
@@ -158,11 +197,26 @@ Cortex → add_exchange_internal() → SESSIONS buffer
    ↓
Autonomy System → Update self_state.json (pattern tracking)
    ↓
Relay → POST /sessions/:id (save session to file)
    ↓
Relay → UI (returns final response)

Note: NeoMem integration disabled in v0.6.0
```

### Session Persistence Flow (NEW in v0.7.0):

```
UI loads → GET /sessions → Relay → List all sessions from files → UI dropdown
User sends message → POST /sessions/:id → Relay → Save to sessions/*.json
User renames session → PATCH /sessions/:id/metadata → Relay → Update *.meta.json
User deletes session → DELETE /sessions/:id → Relay → Remove session files

Sessions stored in: core/relay/sessions/
- {sessionId}.json (conversation history)
- {sessionId}.meta.json (name, timestamps, metadata)
```

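A minimal sketch of helpers matching this file layout. The directory and file names come from the diagram above; the functions themselves are hypothetical, not Relay's actual code.

```python
# Hypothetical session storage helpers mirroring core/relay/sessions/ layout.
import json
from pathlib import Path

SESSIONS_DIR = Path("core/relay/sessions")

def save_session(session_id: str, history: list[dict]) -> None:
    SESSIONS_DIR.mkdir(parents=True, exist_ok=True)
    (SESSIONS_DIR / f"{session_id}.json").write_text(json.dumps(history, indent=2))

def load_session(session_id: str) -> list[dict]:
    return json.loads((SESSIONS_DIR / f"{session_id}.json").read_text())

def update_metadata(session_id: str, name: str) -> None:
    meta_path = SESSIONS_DIR / f"{session_id}.meta.json"
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {}
    meta["name"] = name
    meta_path.write_text(json.dumps(meta, indent=2))

def delete_session(session_id: str) -> None:
    for suffix in (".json", ".meta.json"):
        (SESSIONS_DIR / f"{session_id}{suffix}").unlink(missing_ok=True)
```
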
### Cortex 4-Stage Reasoning Pipeline:

1. **Reflection** (`reflection.py`) - Cloud LLM (OpenAI)
@@ -196,6 +250,14 @@ Note: NeoMem integration disabled in v0.6.0
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
- Internal endpoint: `POST /chat`
- Health check: `GET /_health`
- **NEW:** Dual-mode routing (Standard/Cortex)
- **NEW:** Server-side session persistence with CRUD API
- **NEW:** Session management endpoints (client usage sketched after this list):
  - `GET /sessions` - List all sessions
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - `PATCH /sessions/:id/metadata` - Update session metadata
  - `DELETE /sessions/:id` - Delete session
- Async non-blocking calls to Cortex
- Shared request handler for code reuse
- Comprehensive error handling
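As promised above, a quick client-side tour of the CRUD API. The endpoints come from the list; the request payload shapes are assumptions.

```python
# Hypothetical client calls against Relay's session CRUD API (payloads assumed).
import requests

BASE = "http://localhost:7078"

sessions = requests.get(f"{BASE}/sessions").json()                     # list all
history = requests.get(f"{BASE}/sessions/sess-abc123").json()          # fetch one
requests.post(f"{BASE}/sessions/sess-abc123",
              json={"messages": [{"role": "user", "content": "hi"}]})  # save
requests.patch(f"{BASE}/sessions/sess-abc123/metadata",
               json={"name": "My renamed session"})                    # rename
requests.delete(f"{BASE}/sessions/sess-abc123")                        # delete
```
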
@@ -210,19 +272,35 @@ Note: NeoMem integration disabled in v0.6.0

**UI**:
- Lightweight static HTML chat interface
- Cyberpunk theme
- Session save/load functionality
- Cyberpunk theme with light/dark mode toggle
- **NEW:** Mode selector (Standard/Cortex) in header
- **NEW:** Settings modal (⚙ button) with:
  - Backend selection for Standard Mode (SECONDARY/OPENAI/custom)
  - Session management (view, delete sessions)
  - Theme toggle (dark mode default)
- **NEW:** Server-synced session management
  - Sessions persist across browsers and reboots
  - Rename sessions with custom names
  - Delete sessions with confirmation
  - Automatic session save on every message
- OpenAI message format support

### Reasoning Layer

**Cortex** (v0.5.1):
- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona)
**Cortex** (v0.7.0):
- **NEW:** Dual operating modes:
  - **Standard Mode** - Simple chat with context (`/simple` endpoint)
    - User-selectable backend (SECONDARY, OPENAI, or custom)
    - Full conversation history via Intake integration
    - Bypasses reasoning pipeline for faster responses
  - **Cortex Mode** - Full reasoning pipeline (`/reason` endpoint)
    - Multi-stage processing: reflection → reasoning → refine → persona
    - Per-stage backend selection
    - Autonomy system integration
- Flexible LLM backend routing via HTTP
- Per-stage backend selection
- Async processing throughout
- Embedded Intake module for short-term context
- `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
- `/reason`, `/simple`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
- Lenient error handling - never fails the chat pipeline

**Intake** (Embedded Module):

@@ -327,7 +405,28 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

## Version History

### v0.6.0 (2025-12-18) - Current Release
### v0.7.0 (2025-12-21) - Current Release
**Major Features: Standard Mode + Backend Selection + Session Persistence**
- ✅ Added Standard Mode for simple chatbot functionality
- ✅ UI mode selector (Standard/Cortex) in header
- ✅ Settings modal with backend selection for Standard Mode
- ✅ Server-side session persistence with file-based storage
- ✅ Session management UI (view, rename, delete sessions)
- ✅ Light/Dark mode toggle (dark by default)
- ✅ Context retention in Standard Mode via Intake integration
- ✅ Fixed modal positioning and z-index issues
- ✅ Cortex `/simple` endpoint for direct LLM calls
- ✅ Session CRUD API in Relay
- ✅ Full backward compatibility - Cortex Mode unchanged

**Key Changes:**
- Standard Mode skips the reflection, reasoning, refinement, and persona stages for faster responses
- Sessions now sync across browsers and survive container restarts
- User can select SECONDARY (Ollama), OPENAI, or custom backend for Standard Mode
- Theme preference and backend selection persisted in localStorage
- Session files stored in `core/relay/sessions/` directory

### v0.6.0 (2025-12-18)
**Major Feature: Autonomy System (Phase 1, 2, and 2.5)**
- ✅ Added autonomous decision-making framework
- ✅ Implemented executive planning and goal-setting layer
@@ -394,30 +493,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

---

## Known Issues (v0.6.0)
## Known Issues (v0.7.0)

### Temporarily Disabled (v0.6.0)
### Temporarily Disabled
- **NeoMem disabled by default** - Being refined independently before full integration
  - PostgreSQL + pgvector storage inactive
  - Neo4j graph database inactive
  - Memory persistence endpoints not active
- RAG service (Beta Lyrae) currently disabled in docker-compose.yml

### Non-Critical
- Session management endpoints not fully implemented in Relay
- Full autonomy system integration still being refined
- Memory retrieval integration pending NeoMem re-enablement

### Standard Mode Limitations
- No reflection, reasoning, or refinement stages (by design)
- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)
- No RAG integration (same as Cortex Mode - currently disabled)
- No NeoMem memory storage (same as Cortex Mode - currently disabled)

### Session Management Limitations
- Sessions stored in container filesystem - requires volume mount for true persistence
- No session import/export functionality yet
- No session search or filtering
- Old localStorage sessions don't automatically migrate to server

### Operational Notes
- **Single-worker constraint**: Cortex must run with a single Uvicorn worker to maintain SESSIONS state (see the sketch after this list)
- Multi-worker scaling requires migrating SESSIONS to Redis or shared storage
- Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting
- Backend selection only affects Standard Mode - Cortex Mode uses environment-configured backends

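The constraint exists because SESSIONS is an in-process dict: a second worker would hold its own empty copy. A minimal launch sketch follows; the `main:app` module path is an assumption.

```python
# run_cortex.py - hypothetical launcher pinning Cortex to one worker so the
# in-memory SESSIONS dict stays consistent. "main:app" is an assumed module path.
import uvicorn

if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=7081, workers=1)
```
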
### Future Enhancements
- Re-enable NeoMem integration after pipeline refinement
- Full autonomy system maturation and optimization
- Re-enable RAG service integration
- Implement full session persistence
- Session import/export functionality
- Session search and filtering UI
- Migrate SESSIONS to Redis for multi-worker support
- Add request correlation IDs for tracing
- Comprehensive health checks across all services
@@ -457,17 +565,56 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
curl http://localhost:7077/health
```

4. Access the UI at `http://localhost:7078`
4. Access the UI at `http://localhost:8081`

### Using the UI

**Mode Selection:**
- Use the **Mode** dropdown in the header to switch between:
  - **Standard** - Simple chatbot for coding and practical tasks
  - **Cortex** - Full reasoning pipeline with autonomy features

**Settings Menu:**
1. Click the **⚙ Settings** button in the header
2. **Backend Selection** (Standard Mode only):
   - Choose **SECONDARY** (Ollama/Qwen on 3090) - Fast, local
   - Choose **OPENAI** (GPT-4o-mini) - Cloud-based, high quality
   - Enter custom backend name for advanced configurations
3. **Session Management**:
   - View all saved sessions with message counts and timestamps
   - Click 🗑️ to delete unwanted sessions
4. **Theme Toggle**:
   - Click **🌙 Dark Mode** or **☀️ Light Mode** to switch themes

**Session Management:**
- Sessions automatically save on every message
- Use the **Session** dropdown to switch between sessions
- Click **➕ New** to create a new session
- Click **✏️ Rename** to rename the current session
- Sessions persist across browsers and container restarts

### Test

**Test Relay → Cortex pipeline:**
**Test Standard Mode:**
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "standard",
    "backend": "SECONDARY",
    "messages": [{"role": "user", "content": "Hello!"}],
    "sessionId": "test"
  }'
```

**Test Cortex Mode (Full Reasoning):**
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "cortex",
    "messages": [{"role": "user", "content": "Hello Lyra!"}],
    "session_id": "test"
    "sessionId": "test"
  }'
```

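Because the endpoint is OpenAI-compatible, the standard `openai` Python client can likely target Relay directly; passing the Lyra-specific fields through `extra_body` is an assumption based on the curl examples above, not a documented guarantee.

```python
# Hypothetical use of the stock OpenAI client against Relay's compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:7078/v1", api_key="unused")

resp = client.chat.completions.create(
    model="lyra",  # model name is likely ignored by Relay; placeholder value
    messages=[{"role": "user", "content": "Hello Lyra!"}],
    # Lyra-specific fields, shape assumed from the curl examples above:
    extra_body={"mode": "standard", "backend": "SECONDARY", "sessionId": "test"},
)
print(resp.choices[0].message.content)
```
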
@@ -492,6 +639,21 @@ curl http://localhost:7081/debug/sessions
curl "http://localhost:7081/debug/summary?session_id=test"
```

**List all sessions:**
```bash
curl http://localhost:7078/sessions
```

**Get session history:**
```bash
curl http://localhost:7078/sessions/sess-abc123
```

**Delete a session:**
```bash
curl -X DELETE http://localhost:7078/sessions/sess-abc123
```

All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.

---

@@ -515,12 +677,13 @@ OPENAI_API_KEY=sk-...

**Module-specific backend selection:**
```bash
CORTEX_LLM=SECONDARY   # Use Ollama for reasoning
INTAKE_LLM=PRIMARY     # Use llama.cpp for summarization
SPEAK_LLM=OPENAI       # Use OpenAI for persona
NEOMEM_LLM=PRIMARY     # Use llama.cpp for memory
UI_LLM=OPENAI          # Use OpenAI for UI
RELAY_LLM=PRIMARY      # Use llama.cpp for relay
CORTEX_LLM=SECONDARY   # Use Ollama for reasoning
INTAKE_LLM=PRIMARY     # Use llama.cpp for summarization
SPEAK_LLM=OPENAI       # Use OpenAI for persona
NEOMEM_LLM=PRIMARY     # Use llama.cpp for memory
UI_LLM=OPENAI          # Use OpenAI for UI
RELAY_LLM=PRIMARY      # Use llama.cpp for relay
STANDARD_MODE_LLM=SECONDARY   # Default backend for Standard Mode (NEW in v0.7.0)
```

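For illustration, a sketch of how a module might resolve its backend from these variables. The precedence shown (request override, then the module's variable, then a default) is an assumption, not documented behavior.

```python
# Hypothetical backend resolution from the env vars above; precedence is assumed.
import os

def resolve_backend(module: str, request_override: str | None = None) -> str:
    """Pick the LLM backend for a module, e.g. resolve_backend("CORTEX")."""
    if request_override:                      # e.g. Standard Mode UI selection
        return request_override
    return os.getenv(f"{module}_LLM",         # e.g. CORTEX_LLM=SECONDARY
                     os.getenv("STANDARD_MODE_LLM", "PRIMARY"))
```
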
### Database Configuration

@@ -541,6 +704,7 @@ NEO4J_PASSWORD=neomemgraph
NEOMEM_API=http://neomem-api:7077
CORTEX_API=http://cortex:7081
CORTEX_REASON_URL=http://cortex:7081/reason
CORTEX_SIMPLE_URL=http://cortex:7081/simple   # NEW in v0.7.0
CORTEX_INGEST_URL=http://cortex:7081/ingest
RELAY_URL=http://relay:7078
```

@@ -685,7 +849,10 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
### Debugging Tips
- Enable verbose logging: `VERBOSE_DEBUG=true` in `.env`
- Check Cortex logs: `docker logs cortex -f`
- Check Relay logs: `docker logs relay -f`
- Inspect SESSIONS: `curl http://localhost:7081/debug/sessions`
- Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"`
- Check Relay logs: `docker logs relay -f`
- List sessions: `curl http://localhost:7078/sessions`
- Test Standard Mode: `curl -X POST http://localhost:7078/v1/chat/completions -H "Content-Type: application/json" -d '{"mode":"standard","backend":"SECONDARY","messages":[{"role":"user","content":"test"}],"sessionId":"test"}'`
- Monitor Docker network: `docker network inspect lyra_net`
- Check session files: `ls -la core/relay/sessions/`