v0.5.2 update
Dev
@@ -9,6 +9,55 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se

---

## [0.5.2] - 2025-12-12

### Fixed - LLM Router & Async HTTP
- **Critical**: Replaced synchronous `requests` with async `httpx` in LLM router [cortex/llm/llm_router.py](cortex/llm/llm_router.py)
  - Event loop blocking was causing timeouts and empty responses
  - All three providers (MI50, Ollama, OpenAI) now use `await http_client.post()`
  - Fixes "Expecting value: line 1 column 1 (char 0)" JSON parsing errors in intake
- **Critical**: Fixed missing `backend` parameter in intake summarization [cortex/intake/intake.py:285](cortex/intake/intake.py#L285)
  - Was defaulting to PRIMARY (MI50) instead of respecting `INTAKE_LLM=SECONDARY`
  - Now correctly uses configured backend (Ollama on 3090)
- **Relay**: Fixed session ID case mismatch [core/relay/server.js:87](core/relay/server.js#L87)
  - UI sends `sessionId` (camelCase) but relay expected `session_id` (snake_case)
  - Now accepts both variants: `req.body.session_id || req.body.sessionId`
  - Custom session IDs now properly tracked instead of defaulting to "default"
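
To illustrate why a synchronous client can stall the pipeline, here is a minimal standard-library sketch. It is not the code in `llm_router.py`: `time.sleep` stands in for a blocking `requests` call, `asyncio.sleep` stands in for an awaited `httpx` call, and the `ticker` task and all function names are hypothetical.

```python
import asyncio
import time


async def ticker(counter: list) -> None:
    """Background task: bumps counter[0] about every 10 ms while the loop is free."""
    while True:
        counter[0] += 1
        await asyncio.sleep(0.01)


def blocking_call(delay: float) -> str:
    # Stand-in for a synchronous requests.post(): nothing else can run meanwhile.
    time.sleep(delay)
    return "ok"


async def async_call(delay: float) -> str:
    # Stand-in for `await http_client.post()`: yields to the loop while waiting.
    await asyncio.sleep(delay)
    return "ok"


async def ticks_during(use_async: bool, delay: float = 0.2) -> int:
    """Count how often the ticker runs while one simulated LLM call is in flight."""
    counter = [0]
    task = asyncio.create_task(ticker(counter))
    await asyncio.sleep(0)  # give the ticker its first turn
    before = counter[0]
    if use_async:
        await async_call(delay)
    else:
        blocking_call(delay)
    ticks = counter[0] - before
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return ticks


if __name__ == "__main__":
    print("blocking:", asyncio.run(ticks_during(use_async=False)))
    print("async:   ", asyncio.run(ticks_during(use_async=True)))
```

With the blocking stand-in the ticker never gets a turn during the call, which is the same starvation that would delay other requests sharing the event loop.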

### Added - Error Handling & Diagnostics
- Added comprehensive error handling in LLM router for all providers
  - HTTPError, JSONDecodeError, KeyError, and generic Exception handling
  - Detailed error messages with exception type and description
  - Provider-specific error logging (mi50, ollama, openai)
- Added debug logging in intake summarization
  - Logs LLM response length and preview
  - Validates non-empty responses before JSON parsing
  - Helps diagnose empty or malformed responses
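
The validation step described above could look roughly like the sketch below. The function name `parse_llm_json`, its return shape, and the logger name are assumptions made for illustration; only the idea (log length and preview, reject empty input, report the exception type) comes from the changelog.

```python
import json
import logging
from typing import Any, Optional, Tuple

logger = logging.getLogger("intake")  # hypothetical logger name


def parse_llm_json(raw: Optional[str], provider: str) -> Tuple[Optional[Any], Optional[str]]:
    """Return (data, error); exactly one of the two is None."""
    text = raw or ""
    # Log length and a short preview so empty or truncated replies are visible.
    logger.debug("[%s] response length=%d preview=%r", provider, len(text), text[:80])

    # An empty body is what makes json.loads() fail with
    # "Expecting value: line 1 column 1 (char 0)", so reject it up front.
    if not text.strip():
        return None, f"[{provider}] empty response"

    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # Include the exception type and description in the message.
        return None, f"[{provider}] {type(exc).__name__}: {exc}"
```

Returning an error string instead of raising keeps the caller free to fall back to a default summary, which matches the lenient handling described elsewhere in this commit.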

### Added - Session Management
- Added session persistence endpoints in relay [core/relay/server.js:160-171](core/relay/server.js#L160-L171)
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - In-memory storage using Map (ephemeral, resets on container restart)
  - Fixes UI "Failed to load session" errors
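
The relay itself is JavaScript, so the following is only a Python analogue of the behaviour described in this changelog: accepting either spelling of the session ID and keeping histories in process memory. All names are hypothetical, and returning an empty history for unknown IDs is an assumption.

```python
from typing import Any, Dict, List

# In-memory store, analogous to the relay's Map: contents vanish on restart.
_SESSIONS: Dict[str, List[Dict[str, Any]]] = {}


def resolve_session_id(body: Dict[str, Any]) -> str:
    """Accept snake_case or camelCase, falling back to "default"."""
    return body.get("session_id") or body.get("sessionId") or "default"


def save_session(session_id: str, history: List[Dict[str, Any]]) -> None:
    """Counterpart of POST /sessions/:id."""
    _SESSIONS[session_id] = list(history)


def load_session(session_id: str) -> List[Dict[str, Any]]:
    """Counterpart of GET /sessions/:id (assumed: unknown IDs yield [])."""
    return list(_SESSIONS.get(session_id, []))
```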

### Changed - Provider Configuration
- Added `mi50` provider support for llama.cpp server [cortex/llm/llm_router.py:62-81](cortex/llm/llm_router.py#L62-L81)
  - Uses `/completion` endpoint with `n_predict` parameter
  - Extracts `content` field from response
  - Configured for MI50 GPU with DeepSeek model
- Increased memory retrieval threshold from 0.78 to 0.90 [cortex/.env:20](cortex/.env#L20)
  - Filters out low-relevance memories (only returns 90%+ similarity)
  - Reduces noise in context retrieval
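
As a rough sketch of the two changes above: the first two helpers show the request and response shape for a llama.cpp `/completion` call, and the third shows what raising the threshold does to retrieval results. The helper names, the `max_tokens` default, and the `score` key on each memory are assumptions for illustration, not the project's actual code.

```python
from typing import Any, Dict, List


def build_completion_payload(prompt: str, max_tokens: int = 256) -> Dict[str, Any]:
    """Request body for a llama.cpp /completion call (n_predict caps output tokens)."""
    return {"prompt": prompt, "n_predict": max_tokens}


def extract_content(response_json: Dict[str, Any]) -> str:
    """Pull the generated text out of a /completion response."""
    return response_json["content"]


def filter_memories(memories: List[Dict[str, Any]], threshold: float = 0.90) -> List[Dict[str, Any]]:
    """Keep only memories whose similarity score meets the threshold."""
    return [m for m in memories if m["score"] >= threshold]
```

For example, a memory scored 0.85 would have passed the old 0.78 threshold but is dropped at 0.90.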

### Technical Improvements
- Unified async HTTP handling across all LLM providers
- Better separation of concerns between provider implementations
- Improved error messages for debugging LLM API failures
- Consistent timeout handling (120 seconds for all providers)

---

## [0.5.1] - 2025-12-11

### Fixed - Intake Integration
@@ -1,71 +0,0 @@
# Lyra Core — Project Summary

## v0.4 (2025-10-03)

### 🧠 High-Level Architecture
- **Lyra Core (v0.3.1)** — Orchestration layer.
  - Accepts chat requests (`/v1/chat/completions`).
  - Routes through Cortex for subconscious annotation.
  - Stores everything in Mem0 (no discard).
  - Fetches persona + relevant memories.
  - Injects context back into LLM.

- **Cortex (v0.3.0)** — Subconscious annotator.
  - Runs locally via `llama.cpp` (Phi-3.5-mini Q4_K_M).
  - Strict JSON schema:
    ```json
    {
      "sentiment": "positive" | "neutral" | "negative",
      "novelty": 0.0–1.0,
      "tags": ["keyword", "keyword"],
      "notes": "short string"
    }
    ```
  - Normalizes keys (lowercase).
  - Strips Markdown fences before parsing.
  - Configurable via `.env` (`CORTEX_ENABLED=true|false`).
  - Currently generates annotations, but not yet persisted into Mem0 payloads (stored as empty `{cortex:{}}`).

- **Mem0 (v0.4.0)** — Persistent memory layer.
  - Handles embeddings, graph storage, and retrieval.
  - Dual embedder support:
    - **OpenAI Cloud** (`text-embedding-3-small`, 1536-dim).
    - **HuggingFace TEI** (gte-Qwen2-1.5B-instruct, 1536-dim, hosted on 3090).
  - Environment toggle for provider (`.env.openai` vs `.env.3090`).
  - Memory persistence in Postgres (`payload` JSON).
  - CSV export pipeline confirmed (id, user_id, data, created_at).

- **Persona Sidecar**
  - Provides personality, style, and protocol instructions.
  - Injected at runtime into Core prompt building.

---

### 🚀 Recent Changes
- **Mem0**
  - Added HuggingFace TEI integration (local 3090 embedder).
  - Enabled dual-mode environment switch (OpenAI cloud ↔ local TEI).
  - Fixed `.env` line ending mismatch (CRLF vs LF).
  - Added memory dump/export commands for Postgres.

- **Core/Relay**
  - No major changes since v0.3.1 (still routing input → Cortex → Mem0).

- **Cortex**
  - Still outputs annotations, but not yet persisted into Mem0 payloads.

---

### 📈 Versioning
- **Lyra Core** → v0.3.1
- **Cortex** → v0.3.0
- **Mem0** → v0.4.0

---

### 📋 Next Steps
- [ ] Wire Cortex annotations into Mem0 payloads (`cortex` object).
- [ ] Add “export all memories” script to standard workflow.
- [ ] Consider async embedding for faster `mem.add`.
- [ ] Build visual diagram of data flow (Core ↔ Cortex ↔ Mem0 ↔ Persona).
- [ ] Explore larger LLMs for Cortex (Qwen2-7B, etc.) for richer subconscious annotation.
@@ -1,12 +1,14 @@
|
|||||||
# Project Lyra - README v0.5.0
|
# Project Lyra - README v0.5.1
|
||||||
|
|
||||||
Lyra is a modular persistent AI companion system with advanced reasoning capabilities.
|
Lyra is a modular persistent AI companion system with advanced reasoning capabilities.
|
||||||
It provides memory-backed chat using **NeoMem** + **Relay** + **Cortex**,
|
It provides memory-backed chat using **NeoMem** + **Relay** + **Cortex**,
|
||||||
with multi-stage reasoning pipeline powered by HTTP-based LLM backends.
|
with multi-stage reasoning pipeline powered by HTTP-based LLM backends.
|
||||||
|
|
||||||
|
**Current Version:** v0.5.1 (2025-12-11)
|
||||||
|
|
||||||
## Mission Statement
|
## Mission Statement
|
||||||
|
|
||||||
The point of Project Lyra is to give an AI chatbot more abilities than a typical chatbot. Typical chatbots are essentially amnesic and forget everything about your project. Lyra helps keep projects organized and remembers everything you have done. Think of her abilities as a notepad/schedule/database/co-creator/collaborator all with its own executive function. Say something in passing, Lyra remembers it then reminds you of it later.
|
The point of Project Lyra is to give an AI chatbot more abilities than a typical chatbot. Typical chatbots are essentially amnesic and forget everything about your project. Lyra helps keep projects organized and remembers everything you have done. Think of her abilities as a notepad/schedule/database/co-creator/collaborator all with its own executive function. Say something in passing, Lyra remembers it then reminds you of it later.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -22,7 +24,7 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
|
|||||||
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
|
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
|
||||||
- Internal endpoint: `POST /chat`
|
- Internal endpoint: `POST /chat`
|
||||||
- Routes messages through Cortex reasoning pipeline
|
- Routes messages through Cortex reasoning pipeline
|
||||||
- Manages async calls to Intake and NeoMem
|
- Manages async calls to NeoMem and Cortex ingest
|
||||||
|
|
||||||
**2. UI** (Static HTML)
|
**2. UI** (Static HTML)
|
||||||
- Browser-based chat interface with cyberpunk theme
|
- Browser-based chat interface with cyberpunk theme
|
||||||
@@ -41,38 +43,48 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
|
|||||||
|
|
||||||
**4. Cortex** (Python/FastAPI) - Port 7081
|
**4. Cortex** (Python/FastAPI) - Port 7081
|
||||||
- Primary reasoning engine with multi-stage pipeline
|
- Primary reasoning engine with multi-stage pipeline
|
||||||
|
- **Includes embedded Intake module** (no separate service as of v0.5.1)
|
||||||
- **4-Stage Processing:**
|
- **4-Stage Processing:**
|
||||||
1. **Reflection** - Generates meta-awareness notes about conversation
|
1. **Reflection** - Generates meta-awareness notes about conversation
|
||||||
2. **Reasoning** - Creates initial draft answer using context
|
2. **Reasoning** - Creates initial draft answer using context
|
||||||
3. **Refinement** - Polishes and improves the draft
|
3. **Refinement** - Polishes and improves the draft
|
||||||
4. **Persona** - Applies Lyra's personality and speaking style
|
4. **Persona** - Applies Lyra's personality and speaking style
|
||||||
- Integrates with Intake for short-term context
|
- Integrates with Intake for short-term context via internal Python imports
|
||||||
- Flexible LLM router supporting multiple backends via HTTP
|
- Flexible LLM router supporting multiple backends via HTTP
|
||||||
|
- **Endpoints:**
|
||||||
|
- `POST /reason` - Main reasoning pipeline
|
||||||
|
- `POST /ingest` - Receives conversation exchanges from Relay
|
||||||
|
- `GET /health` - Service health check
|
||||||
|
- `GET /debug/sessions` - Inspect in-memory SESSIONS state
|
||||||
|
- `GET /debug/summary` - Test summarization for a session
|
||||||
|
|
||||||
**5. Intake v0.2** (Python/FastAPI) - Port 7080
|
**5. Intake** (Python Module) - **Embedded in Cortex**
|
||||||
- Simplified short-term memory summarization
|
- **No longer a standalone service** - runs as Python module inside Cortex container
|
||||||
- Session-based circular buffer (deque, maxlen=200)
|
- Short-term memory management with session-based circular buffer
|
||||||
- Single-level simple summarization (no cascading)
|
- In-memory SESSIONS dictionary: `session_id → {buffer: deque(maxlen=200), created_at: timestamp}`
|
||||||
- Background async processing with FastAPI BackgroundTasks
|
- Multi-level summarization (L1/L5/L10/L20/L30) produced by `summarize_context()`
|
||||||
- Pushes summaries to NeoMem automatically
|
- Deferred summarization - actual summary generation happens during `/reason` call
|
||||||
- **API Endpoints:**
|
- Internal Python API:
|
||||||
- `POST /add_exchange` - Add conversation exchange
|
- `add_exchange_internal(exchange)` - Direct function call from Cortex
|
||||||
- `GET /summaries?session_id={id}` - Retrieve session summary
|
- `summarize_context(session_id, exchanges)` - Async LLM-based summarization
|
||||||
- `POST /close_session/{id}` - Close and cleanup session
|
- `SESSIONS` - Module-level global state (requires single Uvicorn worker)
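
A minimal sketch of the module-level state described above, assuming each exchange is a dict carrying its own `session_id`. The function name and the `SESSIONS` layout follow this README; the function body, the exchange shape, and the `"default"` fallback are assumptions rather than the actual Intake code.

```python
from collections import deque
from datetime import datetime, timezone
from typing import Any, Dict

# Module-level global: every caller must live in the same process to see it,
# which is why a single Uvicorn worker is required.
SESSIONS: Dict[str, Dict[str, Any]] = {}


def add_exchange_internal(exchange: Dict[str, Any]) -> None:
    """Append one exchange to its session's circular buffer."""
    session_id = exchange.get("session_id", "default")
    if session_id not in SESSIONS:
        SESSIONS[session_id] = {
            "buffer": deque(maxlen=200),  # oldest entries fall off automatically
            "created_at": datetime.now(timezone.utc),
        }
    SESSIONS[session_id]["buffer"].append(exchange)
```

Because `deque(maxlen=200)` discards from the left once full, the buffer always holds the most recent 200 exchanges without any explicit trimming.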
|
||||||
|
|
||||||
### LLM Backends (HTTP-based)
|
### LLM Backends (HTTP-based)
|
||||||
|
|
||||||
**All LLM communication is done via HTTP APIs:**
|
**All LLM communication is done via HTTP APIs:**
|
||||||
- **PRIMARY**: vLLM server (`http://10.0.0.43:8000`) - AMD MI50 GPU backend
|
- **PRIMARY**: llama.cpp server (`http://10.0.0.44:8080`) - AMD MI50 GPU backend
|
||||||
- **SECONDARY**: Ollama server (`http://10.0.0.3:11434`) - RTX 3090 backend
|
- **SECONDARY**: Ollama server (`http://10.0.0.3:11434`) - RTX 3090 backend
|
||||||
|
- Model: qwen2.5:7b-instruct-q4_K_M
|
||||||
- **CLOUD**: OpenAI API (`https://api.openai.com/v1`) - Cloud-based models
|
- **CLOUD**: OpenAI API (`https://api.openai.com/v1`) - Cloud-based models
|
||||||
|
- Model: gpt-4o-mini
|
||||||
- **FALLBACK**: Local backup (`http://10.0.0.41:11435`) - Emergency fallback
|
- **FALLBACK**: Local backup (`http://10.0.0.41:11435`) - Emergency fallback
|
||||||
|
- Model: llama-3.2-8b-instruct
|
||||||
|
|
||||||
Each module can be configured to use a different backend via environment variables.
|
Each module can be configured to use a different backend via environment variables.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Data Flow Architecture (v0.5.0)
|
## Data Flow Architecture (v0.5.1)
|
||||||
|
|
||||||
### Normal Message Flow:
|
### Normal Message Flow:
|
||||||
|
|
||||||
@@ -82,43 +94,44 @@ User (UI) → POST /v1/chat/completions
|
|||||||
Relay (7078)
|
Relay (7078)
|
||||||
↓ POST /reason
|
↓ POST /reason
|
||||||
Cortex (7081)
|
Cortex (7081)
|
||||||
↓ GET /summaries?session_id=xxx
|
↓ (internal Python call)
|
||||||
Intake (7080) [RETURNS SUMMARY]
|
Intake module → summarize_context()
|
||||||
↓
|
↓
|
||||||
Cortex processes (4 stages):
|
Cortex processes (4 stages):
|
||||||
1. reflection.py → meta-awareness notes
|
1. reflection.py → meta-awareness notes (CLOUD backend)
|
||||||
2. reasoning.py → draft answer (uses LLM)
|
2. reasoning.py → draft answer (PRIMARY backend)
|
||||||
3. refine.py → refined answer (uses LLM)
|
3. refine.py → refined answer (PRIMARY backend)
|
||||||
4. persona/speak.py → Lyra personality (uses LLM)
|
4. persona/speak.py → Lyra personality (CLOUD backend)
|
||||||
↓
|
↓
|
||||||
Returns persona answer to Relay
|
Returns persona answer to Relay
|
||||||
↓
|
↓
|
||||||
Relay → Cortex /ingest (async, stub)
|
Relay → POST /ingest (async)
|
||||||
Relay → Intake /add_exchange (async)
|
|
||||||
↓
|
↓
|
||||||
Intake → Background summarize → NeoMem
|
Cortex → add_exchange_internal() → SESSIONS buffer
|
||||||
|
↓
|
||||||
|
Relay → NeoMem /memories (async, planned)
|
||||||
↓
|
↓
|
||||||
Relay → UI (returns final response)
|
Relay → UI (returns final response)
|
||||||
```
|
```
|
||||||
|
|
||||||
### Cortex 4-Stage Reasoning Pipeline:
|
### Cortex 4-Stage Reasoning Pipeline:
|
||||||
|
|
||||||
1. **Reflection** (`reflection.py`) - Configurable LLM via HTTP
|
1. **Reflection** (`reflection.py`) - Cloud LLM (OpenAI)
|
||||||
- Analyzes user intent and conversation context
|
- Analyzes user intent and conversation context
|
||||||
- Generates meta-awareness notes
|
- Generates meta-awareness notes
|
||||||
- "What is the user really asking?"
|
- "What is the user really asking?"
|
||||||
|
|
||||||
2. **Reasoning** (`reasoning.py`) - Configurable LLM via HTTP
|
2. **Reasoning** (`reasoning.py`) - Primary LLM (llama.cpp)
|
||||||
- Retrieves short-term context from Intake
|
- Retrieves short-term context from Intake module
|
||||||
- Creates initial draft answer
|
- Creates initial draft answer
|
||||||
- Integrates context, reflection notes, and user prompt
|
- Integrates context, reflection notes, and user prompt
|
||||||
|
|
||||||
3. **Refinement** (`refine.py`) - Configurable LLM via HTTP
|
3. **Refinement** (`refine.py`) - Primary LLM (llama.cpp)
|
||||||
- Polishes the draft answer
|
- Polishes the draft answer
|
||||||
- Improves clarity and coherence
|
- Improves clarity and coherence
|
||||||
- Ensures factual consistency
|
- Ensures factual consistency
|
||||||
|
|
||||||
4. **Persona** (`speak.py`) - Configurable LLM via HTTP
|
4. **Persona** (`speak.py`) - Cloud LLM (OpenAI)
|
||||||
- Applies Lyra's personality and speaking style
|
- Applies Lyra's personality and speaking style
|
||||||
- Natural, conversational output
|
- Natural, conversational output
|
||||||
- Final answer returned to user
|
- Final answer returned to user
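
The four stages above can be sketched as a chain of awaited calls, each routed to the backend listed in the data flow. Here `call_llm` is a hypothetical stand-in that echoes its input instead of making an HTTP request, and the prompt formats are invented for illustration.

```python
import asyncio
from typing import Dict

# Stage-to-backend mapping as listed in the v0.5.1 data flow.
STAGE_BACKENDS = {
    "reflection": "CLOUD",
    "reasoning": "PRIMARY",
    "refine": "PRIMARY",
    "persona": "CLOUD",
}


async def call_llm(backend: str, prompt: str) -> str:
    """Hypothetical router stub: tags the prompt with the backend name."""
    await asyncio.sleep(0)  # placeholder for the real async HTTP round trip
    return f"[{backend}] {prompt}"


async def run_pipeline(user_prompt: str, context: str = "") -> Dict[str, str]:
    """Run reflection -> reasoning -> refine -> persona, feeding each stage forward."""
    notes = await call_llm(STAGE_BACKENDS["reflection"], f"reflect: {user_prompt}")
    draft = await call_llm(
        STAGE_BACKENDS["reasoning"], f"draft: {context} | {notes} | {user_prompt}"
    )
    refined = await call_llm(STAGE_BACKENDS["refine"], f"refine: {draft}")
    answer = await call_llm(STAGE_BACKENDS["persona"], f"speak: {refined}")
    return {"reflection": notes, "draft": draft, "refined": refined, "persona": answer}


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("What did we decide yesterday?"))["persona"])
```

The stages run sequentially because each one consumes the previous stage's output; only the HTTP waits inside each call are non-blocking.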
|
||||||
@@ -134,7 +147,7 @@ Relay → UI (returns final response)
|
|||||||
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
|
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
|
||||||
- Internal endpoint: `POST /chat`
|
- Internal endpoint: `POST /chat`
|
||||||
- Health check: `GET /_health`
|
- Health check: `GET /_health`
|
||||||
- Async non-blocking calls to Cortex and Intake
|
- Async non-blocking calls to Cortex
|
||||||
- Shared request handler for code reuse
|
- Shared request handler for code reuse
|
||||||
- Comprehensive error handling
|
- Comprehensive error handling
|
||||||
|
|
||||||
@@ -154,73 +167,70 @@ Relay → UI (returns final response)
|
|||||||
|
|
||||||
### Reasoning Layer
|
### Reasoning Layer
|
||||||
|
|
||||||
**Cortex** (v0.5):
|
**Cortex** (v0.5.1):
|
||||||
- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona)
|
- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona)
|
||||||
- Flexible LLM backend routing via HTTP
|
- Flexible LLM backend routing via HTTP
|
||||||
- Per-stage backend selection
|
- Per-stage backend selection
|
||||||
- Async processing throughout
|
- Async processing throughout
|
||||||
- IntakeClient integration for short-term context
|
- Embedded Intake module for short-term context
|
||||||
- `/reason`, `/ingest` (stub), `/health` endpoints
|
- `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
|
||||||
|
- Lenient error handling - never fails the chat pipeline
|
||||||
|
|
||||||
**Intake** (v0.2):
|
**Intake** (Embedded Module):
|
||||||
- Simplified single-level summarization
|
- **Architectural change**: Now runs as Python module inside Cortex container
|
||||||
- Session-based circular buffer (200 exchanges max)
|
- In-memory SESSIONS management (session_id → buffer)
|
||||||
- Background async summarization
|
- Multi-level summarization: L1 (ultra-short), L5 (short), L10 (medium), L20 (detailed), L30 (full)
|
||||||
- Automatic NeoMem push
|
- Deferred summarization strategy - summaries generated during `/reason` call
|
||||||
- No persistent log files (memory-only)
|
- `bg_summarize()` is a logging stub - actual work deferred
|
||||||
- **Breaking change from v0.1**: Removed cascading summaries (L1, L2, L5, L10, L20, L30)
|
- **Single-worker constraint**: SESSIONS requires single Uvicorn worker or Redis/shared storage
|
||||||
|
|
||||||
**LLM Router**:
|
**LLM Router**:
|
||||||
- Dynamic backend selection via HTTP
|
- Dynamic backend selection via HTTP
|
||||||
- Environment-driven configuration
|
- Environment-driven configuration
|
||||||
- Support for vLLM, Ollama, OpenAI, custom endpoints
|
- Support for llama.cpp, Ollama, OpenAI, custom endpoints
|
||||||
- Per-module backend preferences
|
- Per-module backend preferences:
|
||||||
|
- `CORTEX_LLM=SECONDARY` (Ollama for reasoning)
|
||||||
|
- `INTAKE_LLM=PRIMARY` (llama.cpp for summarization)
|
||||||
|
- `SPEAK_LLM=OPENAI` (Cloud for persona)
|
||||||
|
- `NEOMEM_LLM=PRIMARY` (llama.cpp for memory operations)
|
||||||
|
|
||||||
|
### Beta Lyrae (RAG Memory DB) - Currently Disabled
|
||||||
|
|
||||||
# Beta Lyrae (RAG Memory DB) - added 11-3-25
|
|
||||||
- **RAG Knowledge DB - Beta Lyrae (sheliak)**
|
- **RAG Knowledge DB - Beta Lyrae (sheliak)**
|
||||||
- This module implements the **Retrieval-Augmented Generation (RAG)** layer for Project Lyra.
|
- This module implements the **Retrieval-Augmented Generation (RAG)** layer for Project Lyra.
|
||||||
- It serves as the long-term searchable memory store that Cortex and Relay can query for relevant context before reasoning or response generation.
|
- It serves as the long-term searchable memory store that Cortex and Relay can query for relevant context before reasoning or response generation.
|
||||||
The system uses:
|
- **Status**: Disabled in docker-compose.yml (v0.5.1)
|
||||||
- **ChromaDB** for persistent vector storage
|
|
||||||
- **OpenAI Embeddings (`text-embedding-3-small`)** for semantic similarity
|
The system uses:
|
||||||
- **FastAPI** (port 7090) for the `/rag/search` REST endpoint
|
- **ChromaDB** for persistent vector storage
|
||||||
- Directory Layout
|
- **OpenAI Embeddings (`text-embedding-3-small`)** for semantic similarity
|
||||||
rag/
|
- **FastAPI** (port 7090) for the `/rag/search` REST endpoint
|
||||||
├── rag_chat_import.py # imports JSON chat logs
|
|
||||||
├── rag_docs_import.py # (planned) PDF/EPUB/manual importer
|
Directory Layout:
|
||||||
├── rag_build.py # legacy single-folder builder
|
```
|
||||||
├── rag_query.py # command-line query helper
|
rag/
|
||||||
├── rag_api.py # FastAPI service providing /rag/search
|
├── rag_chat_import.py # imports JSON chat logs
|
||||||
├── chromadb/ # persistent vector store
|
├── rag_docs_import.py # (planned) PDF/EPUB/manual importer
|
||||||
├── chatlogs/ # organized source data
|
├── rag_build.py # legacy single-folder builder
|
||||||
│ ├── poker/
|
├── rag_query.py # command-line query helper
|
||||||
│ ├── work/
|
├── rag_api.py # FastAPI service providing /rag/search
|
||||||
│ ├── lyra/
|
├── chromadb/ # persistent vector store
|
||||||
│ ├── personal/
|
├── chatlogs/ # organized source data
|
||||||
│ └── ...
|
│ ├── poker/
|
||||||
└── import.log # progress log for batch runs
|
│ ├── work/
|
||||||
- **OpenAI chatlog importer.**
|
│ ├── lyra/
|
||||||
- Takes JSON formatted chat logs and imports it to the RAG.
|
│ ├── personal/
|
||||||
- **Features include:**
|
│ └── ...
|
||||||
- Recursive folder indexing with **category detection** from directory name
|
└── import.log # progress log for batch runs
|
||||||
- Smart chunking for long messages (5 000 chars per slice)
|
```
|
||||||
- Automatic deduplication using SHA-1 hash of file + chunk
|
|
||||||
- Timestamps for both file modification and import time
|
**OpenAI chatlog importer features:**
|
||||||
- Full progress logging via tqdm
|
- Recursive folder indexing with **category detection** from directory name
|
||||||
- Safe to run in background with nohup … &
|
- Smart chunking for long messages (5,000 chars per slice)
|
||||||
- Metadata per chunk:
|
- Automatic deduplication using SHA-1 hash of file + chunk
|
||||||
```json
|
- Timestamps for both file modification and import time
|
||||||
{
|
- Full progress logging via tqdm
|
||||||
"chat_id": "<sha1 of filename>",
|
- Safe to run in background with `nohup … &`
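
The chunking and deduplication features listed above might be sketched as follows. The exact input to the SHA-1 hash is not given in this README, so hashing `"<source path>:<chunk index>"` is an assumption, as are all function names.

```python
import hashlib
from typing import List, Set, Tuple

CHUNK_SIZE = 5000  # characters per slice, as stated in the feature list


def chunk_text(text: str, size: int = CHUNK_SIZE) -> List[str]:
    """Split text into consecutive slices of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_id(source_path: str, chunk_index: int) -> str:
    """Assumed dedup key: SHA-1 of the file path plus the chunk index."""
    return hashlib.sha1(f"{source_path}:{chunk_index}".encode("utf-8")).hexdigest()


def new_chunks(source_path: str, text: str, seen: Set[str]) -> List[Tuple[str, str]]:
    """Return (id, chunk) pairs not imported before, recording them in `seen`."""
    fresh = []
    for index, chunk in enumerate(chunk_text(text)):
        cid = chunk_id(source_path, index)
        if cid in seen:
            continue  # already imported on an earlier run
        seen.add(cid)
        fresh.append((cid, chunk))
    return fresh
```

Keying on path and index makes re-running an import idempotent, although edits to a file that keep the same chunk count would not be detected under this assumed scheme.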
|
||||||
"chunk_index": 0,
|
|
||||||
"source": "chatlogs/lyra/0002_cortex_LLMs_11-1-25.json",
|
|
||||||
"title": "cortex LLMs 11-1-25",
|
|
||||||
"role": "assistant",
|
|
||||||
"category": "lyra",
|
|
||||||
"type": "chat",
|
|
||||||
"file_modified": "2025-11-06T23:41:02",
|
|
||||||
"imported_at": "2025-11-07T03:55:00Z"
|
|
||||||
}
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -228,13 +238,16 @@ Relay → UI (returns final response)
|
|||||||
|
|
||||||
All services run in a single docker-compose stack with the following containers:
|
All services run in a single docker-compose stack with the following containers:
|
||||||
|
|
||||||
|
**Active Services:**
|
||||||
- **neomem-postgres** - PostgreSQL with pgvector extension (port 5432)
|
- **neomem-postgres** - PostgreSQL with pgvector extension (port 5432)
|
||||||
- **neomem-neo4j** - Neo4j graph database (ports 7474, 7687)
|
- **neomem-neo4j** - Neo4j graph database (ports 7474, 7687)
|
||||||
- **neomem-api** - NeoMem memory service (port 7077)
|
- **neomem-api** - NeoMem memory service (port 7077)
|
||||||
- **relay** - Main orchestrator (port 7078)
|
- **relay** - Main orchestrator (port 7078)
|
||||||
- **cortex** - Reasoning engine (port 7081)
|
- **cortex** - Reasoning engine with embedded Intake (port 7081)
|
||||||
- **intake** - Short-term memory summarization (port 7080) - currently disabled
|
|
||||||
- **rag** - RAG search service (port 7090) - currently disabled
|
**Disabled Services:**
|
||||||
|
- **intake** - No longer needed (embedded in Cortex as of v0.5.1)
|
||||||
|
- **rag** - Beta Lyrae RAG service (port 7090) - currently disabled
|
||||||
|
|
||||||
All containers communicate via the `lyra_net` Docker bridge network.
|
All containers communicate via the `lyra_net` Docker bridge network.
|
||||||
|
|
||||||
@@ -242,10 +255,10 @@ All containers communicate via the `lyra_net` Docker bridge network.
|
|||||||
|
|
||||||
The following LLM backends are accessed via HTTP (not part of docker-compose):
|
The following LLM backends are accessed via HTTP (not part of docker-compose):
|
||||||
|
|
||||||
- **vLLM Server** (`http://10.0.0.43:8000`)
|
- **llama.cpp Server** (`http://10.0.0.44:8080`)
|
||||||
- AMD MI50 GPU-accelerated inference
|
- AMD MI50 GPU-accelerated inference
|
||||||
- Custom ROCm-enabled vLLM build
|
|
||||||
- Primary backend for reasoning and refinement stages
|
- Primary backend for reasoning and refinement stages
|
||||||
|
- Model path: `/model`
|
||||||
|
|
||||||
- **Ollama Server** (`http://10.0.0.3:11434`)
|
- **Ollama Server** (`http://10.0.0.3:11434`)
|
||||||
- RTX 3090 GPU-accelerated inference
|
- RTX 3090 GPU-accelerated inference
|
||||||
@@ -265,16 +278,38 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
|
|||||||
|
|
||||||
## Version History
|
## Version History
|
||||||
|
|
||||||
### v0.5.0 (2025-11-28) - Current Release
|
### v0.5.1 (2025-12-11) - Current Release
|
||||||
|
**Critical Intake Integration Fixes:**
|
||||||
|
- ✅ Fixed `bg_summarize()` NameError preventing SESSIONS persistence
|
||||||
|
- ✅ Fixed `/ingest` endpoint unreachable code
|
||||||
|
- ✅ Added `cortex/intake/__init__.py` for proper package structure
|
||||||
|
- ✅ Added diagnostic logging to verify SESSIONS singleton behavior
|
||||||
|
- ✅ Added `/debug/sessions` and `/debug/summary` endpoints
|
||||||
|
- ✅ Documented single-worker constraint in Dockerfile
|
||||||
|
- ✅ Implemented lenient error handling (never fails chat pipeline)
|
||||||
|
- ✅ Intake now embedded in Cortex - no longer standalone service
|
||||||
|
|
||||||
|
**Architecture Changes:**
|
||||||
|
- Intake module runs inside Cortex container as pure Python import
|
||||||
|
- No HTTP calls between Cortex and Intake (internal function calls)
|
||||||
|
- SESSIONS persist correctly in Uvicorn worker
|
||||||
|
- Deferred summarization strategy (summaries generated during `/reason`)
|
||||||
|
|
||||||
|
### v0.5.0 (2025-11-28)
|
||||||
- ✅ Fixed all critical API wiring issues
|
- ✅ Fixed all critical API wiring issues
|
||||||
- ✅ Added OpenAI-compatible endpoint to Relay (`/v1/chat/completions`)
|
- ✅ Added OpenAI-compatible endpoint to Relay (`/v1/chat/completions`)
|
||||||
- ✅ Fixed Cortex → Intake integration
|
- ✅ Fixed Cortex → Intake integration
|
||||||
- ✅ Added missing Python package `__init__.py` files
|
- ✅ Added missing Python package `__init__.py` files
|
||||||
- ✅ End-to-end message flow verified and working
|
- ✅ End-to-end message flow verified and working
|
||||||
|
|
||||||
|
### Infrastructure v1.0.0 (2025-11-26)
|
||||||
|
- Consolidated 9 scattered `.env` files into single source of truth
|
||||||
|
- Multi-backend LLM strategy implemented
|
||||||
|
- Docker Compose consolidation
|
||||||
|
- Created `.env.example` security templates
|
||||||
|
|
||||||
### v0.4.x (Major Rewire)
|
### v0.4.x (Major Rewire)
|
||||||
- Cortex multi-stage reasoning pipeline
|
- Cortex multi-stage reasoning pipeline
|
||||||
- Intake v0.2 simplification
|
|
||||||
- LLM router with multi-backend support
|
- LLM router with multi-backend support
|
||||||
- Major architectural restructuring
|
- Major architectural restructuring
|
||||||
|
|
||||||
@@ -285,19 +320,30 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Known Issues (v0.5.0)
|
## Known Issues (v0.5.1)
|
||||||
|
|
||||||
|
### Critical (Fixed in v0.5.1)
|
||||||
|
- ~~Intake SESSIONS not persisting~~ ✅ **FIXED**
|
||||||
|
- ~~`bg_summarize()` NameError~~ ✅ **FIXED**
|
||||||
|
- ~~`/ingest` endpoint unreachable code~~ ✅ **FIXED**
|
||||||
|
|
||||||
### Non-Critical
|
### Non-Critical
|
||||||
- Session management endpoints not fully implemented in Relay
|
- Session management endpoints not fully implemented in Relay
|
||||||
- Intake service currently disabled in docker-compose.yml
|
|
||||||
- RAG service currently disabled in docker-compose.yml
|
- RAG service currently disabled in docker-compose.yml
|
||||||
- Cortex `/ingest` endpoint is a stub
|
- NeoMem integration in Relay not yet active (planned for v0.5.2)
|
||||||
|
|
||||||
|
### Operational Notes
|
||||||
|
- **Single-worker constraint**: Cortex must run with single Uvicorn worker to maintain SESSIONS state
|
||||||
|
- Multi-worker scaling requires migrating SESSIONS to Redis or shared storage
|
||||||
|
- Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting
|
||||||
|
|
||||||
### Future Enhancements
|
### Future Enhancements
|
||||||
- Re-enable RAG service integration
|
- Re-enable RAG service integration
|
||||||
- Implement full session persistence
|
- Implement full session persistence
|
||||||
|
- Migrate SESSIONS to Redis for multi-worker support
|
||||||
- Add request correlation IDs for tracing
|
- Add request correlation IDs for tracing
|
||||||
- Comprehensive health checks
|
- Comprehensive health checks across all services
|
||||||
|
- NeoMem integration in Relay
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -305,21 +351,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
|
|||||||
|
|
||||||
### Prerequisites
|
### Prerequisites
|
||||||
- Docker + Docker Compose
|
- Docker + Docker Compose
|
||||||
- At least one HTTP-accessible LLM endpoint (vLLM, Ollama, or OpenAI API key)
|
- At least one HTTP-accessible LLM endpoint (llama.cpp, Ollama, or OpenAI API key)
|
||||||
|
|
||||||
### Setup
|
### Setup
|
||||||
1. Copy `.env.example` to `.env` and configure your LLM backend URLs and API keys
|
1. Copy `.env.example` to `.env` and configure your LLM backend URLs and API keys:
|
||||||
|
```bash
|
||||||
|
# Required: Configure at least one LLM backend
|
||||||
|
LLM_PRIMARY_URL=http://10.0.0.44:8080 # llama.cpp
|
||||||
|
LLM_SECONDARY_URL=http://10.0.0.3:11434 # Ollama
|
||||||
|
OPENAI_API_KEY=sk-... # OpenAI
|
||||||
|
```
|
||||||
|
|
||||||
2. Start all services with docker-compose:
|
2. Start all services with docker-compose:
|
||||||
```bash
|
```bash
|
||||||
docker-compose up -d
|
docker-compose up -d
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Check service health:
|
3. Check service health:
|
||||||
```bash
|
```bash
|
||||||
|
# Relay health
|
||||||
curl http://localhost:7078/_health
|
curl http://localhost:7078/_health
|
||||||
|
|
||||||
|
# Cortex health
|
||||||
|
curl http://localhost:7081/health
|
||||||
|
|
||||||
|
# NeoMem health
|
||||||
|
curl http://localhost:7077/health
|
||||||
```
|
```
|
||||||
|
|
||||||
4. Access the UI at `http://localhost:7078`
|
4. Access the UI at `http://localhost:7078`
|
||||||
|
|
||||||
### Test
|
### Test
|
||||||
|
|
||||||
|
**Test Relay → Cortex pipeline:**
|
||||||
```bash
|
```bash
|
||||||
curl -X POST http://localhost:7078/v1/chat/completions \
|
curl -X POST http://localhost:7078/v1/chat/completions \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
@@ -329,15 +393,130 @@ curl -X POST http://localhost:7078/v1/chat/completions \
|
|||||||
}'
|
}'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Test Cortex /ingest endpoint:**
|
||||||
|
```bash
|
||||||
|
curl -X POST http://localhost:7081/ingest \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"session_id": "test",
|
||||||
|
"user_msg": "Hello",
|
||||||
|
"assistant_msg": "Hi there!"
|
||||||
|
}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**Inspect SESSIONS state:**
|
||||||
|
```bash
|
||||||
|
curl http://localhost:7081/debug/sessions
|
||||||
|
```
|
||||||
|
|
||||||
|
**Get summary for a session:**
|
||||||
|
```bash
|
||||||
|
curl "http://localhost:7081/debug/summary?session_id=test"
|
||||||
|
```
|
||||||
|
|
||||||
All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.
|
All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Environment Variables
|
||||||
|
|
||||||
|
### LLM Backend Configuration
|
||||||
|
|
||||||
|
**Backend URLs (base URLs; the router appends the API path):**
|
||||||
|
```bash
|
||||||
|
LLM_PRIMARY_URL=http://10.0.0.44:8080 # llama.cpp
|
||||||
|
LLM_PRIMARY_MODEL=/model
|
||||||
|
|
||||||
|
LLM_SECONDARY_URL=http://10.0.0.3:11434 # Ollama
|
||||||
|
LLM_SECONDARY_MODEL=qwen2.5:7b-instruct-q4_K_M
|
||||||
|
|
||||||
|
LLM_OPENAI_URL=https://api.openai.com/v1
|
||||||
|
LLM_OPENAI_MODEL=gpt-4o-mini
|
||||||
|
OPENAI_API_KEY=sk-...
|
||||||
|
```
|
||||||
|
|
||||||
|
**Module-specific backend selection:**
|
||||||
|
```bash
|
||||||
|
CORTEX_LLM=SECONDARY # Use Ollama for reasoning
|
||||||
|
INTAKE_LLM=PRIMARY # Use llama.cpp for summarization
|
||||||
|
SPEAK_LLM=OPENAI # Use OpenAI for persona
|
||||||
|
NEOMEM_LLM=PRIMARY # Use llama.cpp for memory
|
||||||
|
UI_LLM=OPENAI # Use OpenAI for UI
|
||||||
|
RELAY_LLM=PRIMARY # Use llama.cpp for relay
|
||||||
|
```
|
||||||
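A minimal sketch of how a module might turn one of these variables into a backend name. The `resolve_backend` helper, the `KNOWN_BACKENDS` set, and the fall-back-on-unknown policy are assumptions for illustration; only the variable names and the `PRIMARY` default come from this project.

```python
import os

DEFAULT_BACKEND = "PRIMARY"  # same default the router declares
KNOWN_BACKENDS = {"PRIMARY", "SECONDARY", "OPENAI"}  # assumed registry keys


def resolve_backend(module_var, env=None):
    """Return the backend name configured for a module variable such as INTAKE_LLM."""
    env = os.environ if env is None else env
    name = env.get(module_var, DEFAULT_BACKEND).strip().upper()
    # Assumed policy: fall back instead of failing on an empty or unknown value.
    return name if name in KNOWN_BACKENDS else DEFAULT_BACKEND


print(resolve_backend("INTAKE_LLM", {"INTAKE_LLM": "secondary"}))
```

Passing the resolved name explicitly to the router call is what keeps a module from silently using the default backend.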
|
|
||||||
|
### Database Configuration
|
||||||
|
```bash
|
||||||
|
POSTGRES_USER=neomem
|
||||||
|
POSTGRES_PASSWORD=neomempass
|
||||||
|
POSTGRES_DB=neomem
|
||||||
|
POSTGRES_HOST=neomem-postgres
|
||||||
|
POSTGRES_PORT=5432
|
||||||
|
|
||||||
|
NEO4J_URI=bolt://neomem-neo4j:7687
|
||||||
|
NEO4J_USERNAME=neo4j
|
||||||
|
NEO4J_PASSWORD=neomemgraph
|
||||||
|
```
|
||||||
|
|
||||||
|
### Service URLs (Internal Docker Network)
|
||||||
|
```bash
|
||||||
|
NEOMEM_API=http://neomem-api:7077
|
||||||
|
CORTEX_API=http://cortex:7081
|
||||||
|
CORTEX_REASON_URL=http://cortex:7081/reason
|
||||||
|
CORTEX_INGEST_URL=http://cortex:7081/ingest
|
||||||
|
RELAY_URL=http://relay:7078
|
||||||
|
```
|
||||||
|
|
||||||
|
### Feature Flags
|
||||||
|
```bash
|
||||||
|
CORTEX_ENABLED=true
|
||||||
|
MEMORY_ENABLED=true
|
||||||
|
PERSONA_ENABLED=false
|
||||||
|
DEBUG_PROMPT=true
|
||||||
|
VERBOSE_DEBUG=true
|
||||||
|
```
|
||||||
|
|
||||||
|
For complete environment variable reference, see [ENVIRONMENT_VARIABLES.md](ENVIRONMENT_VARIABLES.md).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Documentation
|
## Documentation
|
||||||
|
|
||||||
- See [CHANGELOG.md](CHANGELOG.md) for detailed version history
|
- [CHANGELOG.md](CHANGELOG.md) - Detailed version history
|
||||||
- See `ENVIRONMENT_VARIABLES.md` for environment variable reference
|
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md) - Comprehensive project overview for AI context
|
||||||
- Additional information available in the Trilium docs
|
- [ENVIRONMENT_VARIABLES.md](ENVIRONMENT_VARIABLES.md) - Environment variable reference
|
||||||
|
- [DEPRECATED_FILES.md](DEPRECATED_FILES.md) - Deprecated files and migration guide
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### SESSIONS not persisting
|
||||||
|
**Symptom:** Intake buffer always shows 0 exchanges, summaries always empty.
|
||||||
|
|
||||||
|
**Solution (Fixed in v0.5.1):**
|
||||||
|
- Ensure `cortex/intake/__init__.py` exists
|
||||||
|
- Check Cortex logs for `[Intake Module Init]` message showing SESSIONS object ID
|
||||||
|
- Verify single-worker mode (Dockerfile: `uvicorn main:app --workers 1`)
|
||||||
|
- Use `/debug/sessions` endpoint to inspect current state
|
||||||
|
|
||||||
|
### Cortex connection errors
|
||||||
|
**Symptom:** Relay can't reach Cortex, 502 errors.
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
- Verify Cortex container is running: `docker ps | grep cortex`
|
||||||
|
- Check Cortex health: `curl http://localhost:7081/health`
|
||||||
|
- Verify environment variables: `CORTEX_REASON_URL=http://cortex:7081/reason`
|
||||||
|
- Check docker network: `docker network inspect lyra_net`
|
||||||
|
|
||||||
|
### LLM backend timeouts
|
||||||
|
**Symptom:** Reasoning stage hangs or times out.
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
- Verify LLM backend is running and accessible
|
||||||
|
- Check LLM backend health: `curl http://10.0.0.44:8080/health`
|
||||||
|
- Increase the `httpx.AsyncClient` timeout in `llm_router.py` (currently 120 seconds) if using slow models
|
||||||
|
- Check logs for specific backend errors
|
||||||
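One way to avoid editing code for slow models is to read the timeout from the environment. This is only a sketch: `LLM_TIMEOUT_SECONDS` and `read_timeout` are hypothetical names, and the router currently hard-codes 120 seconds.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 120.0  # value currently hard-coded in llm_router.py


def read_timeout(env=None, var="LLM_TIMEOUT_SECONDS"):
    """Parse a positive timeout in seconds, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get(var, "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT_SECONDS
    # Zero and negative values are treated as misconfiguration.
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS


# In llm_router.py this could replace the literal:
#   http_client = httpx.AsyncClient(timeout=read_timeout())
print(read_timeout({"LLM_TIMEOUT_SECONDS": "300"}))
```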
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -356,6 +535,8 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
|
|||||||
- All services communicate via Docker internal networking on the `lyra_net` bridge
|
- All services communicate via Docker internal networking on the `lyra_net` bridge
|
||||||
- History and entity graphs are managed via PostgreSQL + Neo4j
|
- History and entity graphs are managed via PostgreSQL + Neo4j
|
||||||
- LLM backends are accessed via HTTP and configured in `.env`
|
- LLM backends are accessed via HTTP and configured in `.env`
|
||||||
|
- Intake module is imported internally by Cortex (no HTTP communication)
|
||||||
|
- SESSIONS state is maintained in memory within the Cortex container
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -391,3 +572,38 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
|
|||||||
}'
|
}'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Development Notes
|
||||||
|
|
||||||
|
### Cortex Architecture (v0.5.1)
|
||||||
|
- Cortex contains embedded Intake module at `cortex/intake/`
|
||||||
|
- Intake is imported as: `from intake.intake import add_exchange_internal, SESSIONS`
|
||||||
|
- SESSIONS is a module-level global dictionary (singleton pattern)
|
||||||
|
- A single-worker constraint is required to maintain SESSIONS state: each additional worker process would hold its own separate copy of the dictionary
|
||||||
|
- Diagnostic endpoints available for debugging: `/debug/sessions`, `/debug/summary`
|
||||||
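The module-level pattern described above can be sketched as follows. `record_exchange` is a hypothetical stand-in, not the project's `add_exchange_internal`, whose signature is not shown here.

```python
from collections import defaultdict

# Module-level state: created once per Python process and shared by every
# importer in that process, but NOT shared across separate worker processes.
SESSIONS = defaultdict(list)


def record_exchange(session_id, user_msg, assistant_msg):
    """Append one exchange to the in-memory buffer and return the new count."""
    SESSIONS[session_id].append({"user": user_msg, "assistant": assistant_msg})
    return len(SESSIONS[session_id])


record_exchange("test", "Hello", "Hi there!")
print(len(SESSIONS["test"]), id(SESSIONS))
```

With more than one uvicorn worker, each process would print a different `id(SESSIONS)` and see only the exchanges it handled itself, which is consistent with the "0 exchanges" symptom described under Troubleshooting.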
|
|
||||||
|
### Adding New LLM Backends
|
||||||
|
1. Add backend URL to `.env`:
|
||||||
|
```bash
|
||||||
|
LLM_CUSTOM_URL=http://your-backend:port
|
||||||
|
LLM_CUSTOM_MODEL=model-name
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Configure module to use new backend:
|
||||||
|
```bash
|
||||||
|
CORTEX_LLM=CUSTOM
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Restart Cortex container:
|
||||||
|
```bash
|
||||||
|
docker-compose restart cortex
|
||||||
|
```
|
||||||
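For the steps above to work without code changes, the router has to pick up `LLM_<NAME>_URL` / `LLM_<NAME>_MODEL` pairs on its own. The sketch below shows one way such discovery could look; `discover_backends` is hypothetical, and the real registry may require additional fields (for example a provider type) that are not covered here.

```python
import re

_URL_VAR = re.compile(r"^LLM_([A-Z0-9]+)_URL$")


def discover_backends(env):
    """Collect every LLM_<NAME>_URL / LLM_<NAME>_MODEL pair found in env."""
    backends = {}
    for key, url in env.items():
        match = _URL_VAR.match(key)
        if not match:
            continue
        name = match.group(1)
        model = env.get(f"LLM_{name}_MODEL")
        if url and model:  # skip half-configured backends
            backends[name] = {"url": url.rstrip("/"), "model": model}
    return backends


env = {
    "LLM_CUSTOM_URL": "http://your-backend:8000/",
    "LLM_CUSTOM_MODEL": "model-name",
    "LLM_BROKEN_URL": "http://no-model-set",
}
print(discover_backends(env))
```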
|
|
||||||
|
### Debugging Tips
|
||||||
|
- Enable verbose logging: `VERBOSE_DEBUG=true` in `.env`
|
||||||
|
- Check Cortex logs: `docker logs cortex -f`
|
||||||
|
- Inspect SESSIONS: `curl http://localhost:7081/debug/sessions`
|
||||||
|
- Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"`
|
||||||
|
- Check Relay logs: `docker logs relay -f`
|
||||||
|
- Monitor Docker network: `docker network inspect lyra_net`
|
||||||
|
|||||||
+20
-1
@@ -84,7 +84,7 @@ app.get("/_health", (_, res) => {
|
|||||||
// -----------------------------------------------------
|
// -----------------------------------------------------
|
||||||
app.post("/v1/chat/completions", async (req, res) => {
|
app.post("/v1/chat/completions", async (req, res) => {
|
||||||
try {
|
try {
|
||||||
const session_id = req.body.session_id || req.body.user || "default";
|
const session_id = req.body.session_id || req.body.sessionId || req.body.user || "default";
|
||||||
const messages = req.body.messages || [];
|
const messages = req.body.messages || [];
|
||||||
const lastMessage = messages[messages.length - 1];
|
const lastMessage = messages[messages.length - 1];
|
||||||
const user_msg = lastMessage?.content || "";
|
const user_msg = lastMessage?.content || "";
|
||||||
@@ -151,6 +151,25 @@ app.post("/chat", async (req, res) => {
|
|||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// -----------------------------------------------------
|
||||||
|
// SESSION ENDPOINTS (for UI)
|
||||||
|
// -----------------------------------------------------
|
||||||
|
// In-memory session storage (could be replaced with a database)
|
||||||
|
const sessions = new Map();
|
||||||
|
|
||||||
|
app.get("/sessions/:id", (req, res) => {
|
||||||
|
const sessionId = req.params.id;
|
||||||
|
const history = sessions.get(sessionId) || [];
|
||||||
|
res.json(history);
|
||||||
|
});
|
||||||
|
|
||||||
|
app.post("/sessions/:id", (req, res) => {
|
||||||
|
const sessionId = req.params.id;
|
||||||
|
const history = req.body;
|
||||||
|
sessions.set(sessionId, history);
|
||||||
|
res.json({ ok: true, saved: history.length });
|
||||||
|
});
|
||||||
|
|
||||||
// -----------------------------------------------------
|
// -----------------------------------------------------
|
||||||
app.listen(PORT, () => {
|
app.listen(PORT, () => {
|
||||||
console.log(`Relay is online on port ${PORT}`);
|
console.log(`Relay is online on port ${PORT}`);
|
||||||
|
|||||||
+1
-1
@@ -51,7 +51,7 @@
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
<script>
|
<script>
|
||||||
const RELAY_BASE = "http://10.0.0.40:7078";
|
const RELAY_BASE = "http://10.0.0.41:7078";
|
||||||
const API_URL = `${RELAY_BASE}/v1/chat/completions`;
|
const API_URL = `${RELAY_BASE}/v1/chat/completions`;
|
||||||
|
|
||||||
function generateSessionId() {
|
function generateSessionId() {
|
||||||
|
|||||||
@@ -282,11 +282,17 @@ JSON only. No text outside JSON.
|
|||||||
try:
|
try:
|
||||||
llm_response = await call_llm(
|
llm_response = await call_llm(
|
||||||
prompt,
|
prompt,
|
||||||
|
backend=INTAKE_LLM,
|
||||||
temperature=0.2
|
temperature=0.2
|
||||||
)
|
)
|
||||||
|
|
||||||
|
print(f"[Intake] LLM response length: {len(llm_response) if llm_response else 0}")
|
||||||
|
print(f"[Intake] LLM response preview: {llm_response[:200] if llm_response else '(empty)'}")
|
||||||
|
|
||||||
# LLM should return JSON, parse it
|
# LLM should return JSON, parse it
|
||||||
|
if not llm_response or not llm_response.strip():
|
||||||
|
raise ValueError("Empty response from LLM")
|
||||||
|
|
||||||
summary = json.loads(llm_response)
|
summary = json.loads(llm_response)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
|
|||||||
+53
-17
@@ -1,7 +1,10 @@
|
|||||||
# llm_router.py
|
# llm_router.py
|
||||||
import os
|
import os
|
||||||
import requests
|
import httpx
|
||||||
import json
|
import json
|
||||||
|
import logging
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# ------------------------------------------------------------
|
# ------------------------------------------------------------
|
||||||
# Load backend registry from root .env
|
# Load backend registry from root .env
|
||||||
@@ -33,6 +36,9 @@ BACKENDS = {
|
|||||||
|
|
||||||
DEFAULT_BACKEND = "PRIMARY"
|
DEFAULT_BACKEND = "PRIMARY"
|
||||||
|
|
||||||
|
# Reusable async HTTP client
|
||||||
|
http_client = httpx.AsyncClient(timeout=120.0)
|
||||||
|
|
||||||
|
|
||||||
# ------------------------------------------------------------
|
# ------------------------------------------------------------
|
||||||
# Public call
|
# Public call
|
||||||
@@ -57,18 +63,28 @@ async def call_llm(
|
|||||||
raise RuntimeError(f"Backend '{backend}' missing url/model in env")
|
raise RuntimeError(f"Backend '{backend}' missing url/model in env")
|
||||||
|
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
# Provider: VLLM (your MI50)
|
# Provider: MI50 (llama.cpp server)
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
if provider == "vllm":
|
if provider == "mi50":
|
||||||
payload = {
|
payload = {
|
||||||
"model": model,
|
|
||||||
"prompt": prompt,
|
"prompt": prompt,
|
||||||
"max_tokens": max_tokens,
|
"n_predict": max_tokens,
|
||||||
"temperature": temperature
|
"temperature": temperature
|
||||||
}
|
}
|
||||||
r = requests.post(url, json=payload, timeout=120)
|
try:
|
||||||
data = r.json()
|
r = await http_client.post(f"{url}/completion", json=payload)
|
||||||
return data["choices"][0]["text"]
|
r.raise_for_status()
|
||||||
|
data = r.json()
|
||||||
|
return data.get("content", "")
|
||||||
|
except httpx.HTTPError as e:
|
||||||
|
logger.error(f"HTTP error calling mi50: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"LLM API error (mi50): {type(e).__name__}: {str(e)}")
|
||||||
|
except (KeyError, json.JSONDecodeError) as e:
|
||||||
|
logger.error(f"Response parsing error from mi50: {e}")
|
||||||
|
raise RuntimeError(f"Invalid response format (mi50): {e}")
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Unexpected error calling mi50: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"Unexpected error (mi50): {type(e).__name__}: {str(e)}")
|
||||||
|
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
# Provider: OLLAMA (your 3090)
|
# Provider: OLLAMA (your 3090)
|
||||||
@@ -79,13 +95,22 @@ async def call_llm(
|
|||||||
"messages": [
|
"messages": [
|
||||||
{"role": "user", "content": prompt}
|
{"role": "user", "content": prompt}
|
||||||
],
|
],
|
||||||
"stream": False # <-- critical fix
|
"stream": False
|
||||||
}
|
}
|
||||||
|
try:
|
||||||
r = requests.post(f"{url}/api/chat", json=payload, timeout=120)
|
r = await http_client.post(f"{url}/api/chat", json=payload)
|
||||||
data = r.json()
|
r.raise_for_status()
|
||||||
|
data = r.json()
|
||||||
return data["message"]["content"]
|
return data["message"]["content"]
|
||||||
|
except httpx.HTTPError as e:
|
||||||
|
logger.error(f"HTTP error calling ollama: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"LLM API error (ollama): {type(e).__name__}: {str(e)}")
|
||||||
|
except (KeyError, json.JSONDecodeError) as e:
|
||||||
|
logger.error(f"Response parsing error from ollama: {e}")
|
||||||
|
raise RuntimeError(f"Invalid response format (ollama): {e}")
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Unexpected error calling ollama: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"Unexpected error (ollama): {type(e).__name__}: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
@@ -104,9 +129,20 @@ async def call_llm(
|
|||||||
"temperature": temperature,
|
"temperature": temperature,
|
||||||
"max_tokens": max_tokens,
|
"max_tokens": max_tokens,
|
||||||
}
|
}
|
||||||
r = requests.post(f"{url}/chat/completions", json=payload, headers=headers, timeout=120)
|
try:
|
||||||
data = r.json()
|
r = await http_client.post(f"{url}/chat/completions", json=payload, headers=headers)
|
||||||
return data["choices"][0]["message"]["content"]
|
r.raise_for_status()
|
||||||
|
data = r.json()
|
||||||
|
return data["choices"][0]["message"]["content"]
|
||||||
|
except httpx.HTTPError as e:
|
||||||
|
logger.error(f"HTTP error calling openai: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"LLM API error (openai): {type(e).__name__}: {str(e)}")
|
||||||
|
except (KeyError, json.JSONDecodeError) as e:
|
||||||
|
logger.error(f"Response parsing error from openai: {e}")
|
||||||
|
raise RuntimeError(f"Invalid response format (openai): {e}")
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Unexpected error calling openai: {type(e).__name__}: {str(e)}")
|
||||||
|
raise RuntimeError(f"Unexpected error (openai): {type(e).__name__}: {str(e)}")
|
||||||
|
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
# Unknown provider
|
# Unknown provider
|
||||||
|
|||||||
@@ -97,6 +97,21 @@ services:
|
|||||||
networks:
|
networks:
|
||||||
- lyra_net
|
- lyra_net
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# UI Server
|
||||||
|
# ============================================================
|
||||||
|
lyra-ui:
|
||||||
|
image: nginx:alpine
|
||||||
|
container_name: lyra-ui
|
||||||
|
restart: unless-stopped
|
||||||
|
ports:
|
||||||
|
- "8081:80"
|
||||||
|
volumes:
|
||||||
|
- ./core/ui:/usr/share/nginx/html:ro
|
||||||
|
networks:
|
||||||
|
- lyra_net
|
||||||
|
|
||||||
|
|
||||||
# ============================================================
|
# ============================================================
|
||||||
# Cortex
|
# Cortex
|
||||||
# ============================================================
|
# ============================================================
|
||||||
|
|||||||
@@ -0,0 +1,280 @@
|
|||||||
|
|
||||||
|
|
||||||
|
`docs/ARCHITECTURE_v0.6.0.md`
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **Cortex v0.6.0 — Cognitive Architecture Overview**
|
||||||
|
|
||||||
|
*Last updated: Dec 2025*
|
||||||
|
|
||||||
|
## **Summary**
|
||||||
|
|
||||||
|
Cortex v0.6.0 evolves from a linear “reflection → reasoning → refine → persona” pipeline into a **three-layer cognitive system** modeled after human cognition:
|
||||||
|
|
||||||
|
1. **Autonomy Core** — Lyra’s self-model (identity, mood, long-term goals)
|
||||||
|
2. **Inner Monologue** — Lyra’s private narrator (self-talk + internal reflection)
|
||||||
|
3. **Executive Agent (DeepSeek)** — Lyra’s task-oriented decision-maker
|
||||||
|
|
||||||
|
Cortex itself now becomes the **central orchestrator**, not the whole mind. It routes user messages through these layers and produces the final outward response via the persona system.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **Chain concept**
|
||||||
|
User > Relay > Cortex intake > Inner Self > Cortex > Exec (DeepSeek) > Cortex > Persona > Relay > User, with the final reply also fed back to Inner Self
|
||||||
|
|
||||||
|
USER
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
RELAY
|
||||||
|
(sessions, logging, routing)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
┌──────────────────────────────────┐
|
||||||
|
│ CORTEX │
|
||||||
|
│ Intake → Reflection → Exec → Reason → Refine │
|
||||||
|
└───────────────┬──────────────────┘
|
||||||
|
│ self_state
|
||||||
|
▼
|
||||||
|
INNER SELF (monologue)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
AUTONOMY CORE
|
||||||
|
(long-term identity)
|
||||||
|
▲
|
||||||
|
│
|
||||||
|
Persona Layer (speak)
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
RELAY
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
USER
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
# **High-level Architecture**
|
||||||
|
|
||||||
|
```
|
||||||
|
Autonomy Core (Self-Model)
|
||||||
|
┌────────────────────────────────────────┐
|
||||||
|
│ mood, identity, goals, emotional state│
|
||||||
|
│ updated outside Cortex by inner monologue│
|
||||||
|
└─────────────────────┬──────────────────┘
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Inner Monologue (Self-Talk Loop)
|
||||||
|
┌────────────────────────────────────────┐
|
||||||
|
│ Interprets events in language │
|
||||||
|
│ Updates Autonomy Core │
|
||||||
|
│ Sends state-signals INTO Cortex │
|
||||||
|
└─────────────────────┬──────────────────┘
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Cortex (Task Brain / Router)
|
||||||
|
┌────────────────────────────────────────────────────────┐
|
||||||
|
│ Intake → Reflection → Exec Agent → Reason → Refinement │
|
||||||
|
│ ↑ │ │
|
||||||
|
│ │ ▼ │
|
||||||
|
│ Receives state from Persona Output │
|
||||||
|
│ inner self (Lyra’s voice) │
|
||||||
|
└────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
The **user interacts only with the Persona layer**.
|
||||||
|
Inner Monologue and Autonomy Core never speak directly to the user.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **Component Breakdown**
|
||||||
|
|
||||||
|
## **1. Autonomy Core (Self-Model)**
|
||||||
|
|
||||||
|
*Not inside Cortex.*
|
||||||
|
|
||||||
|
A persistent JSON/state machine representing Lyra’s ongoing inner life:
|
||||||
|
|
||||||
|
* `mood`
|
||||||
|
* `focus_mode`
|
||||||
|
* `confidence`
|
||||||
|
* `identity_traits`
|
||||||
|
* `relationship_memory`
|
||||||
|
* `long_term_goals`
|
||||||
|
* `emotional_baseline`
|
||||||
|
|
||||||
|
The Autonomy Core:
|
||||||
|
|
||||||
|
* Is updated by Inner Monologue
|
||||||
|
* Exposes its state to Cortex via a simple `get_state()` API
|
||||||
|
* Never speaks to the user directly
|
||||||
|
* Does not run LLMs itself
|
||||||
|
|
||||||
|
It is the **structure** of self, not the thoughts.
|
||||||
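A minimal sketch of the self-model as a plain data container with a read-only `get_state()` view. The field names come from the list above; the default values, the `update` method, and the choice of a dataclass are assumptions.

```python
from dataclasses import dataclass, field, fields, asdict


@dataclass
class AutonomyCore:
    mood: str = "neutral"
    focus_mode: str = "default"
    confidence: float = 0.5
    identity_traits: list = field(default_factory=list)
    relationship_memory: dict = field(default_factory=dict)
    long_term_goals: list = field(default_factory=list)
    emotional_baseline: str = "calm"

    def get_state(self):
        """Return a copied snapshot; callers cannot mutate the core through it."""
        return asdict(self)

    def update(self, **changes):
        """Entry point for Inner Monologue; rejects fields the model lacks."""
        allowed = {f.name for f in fields(self)}
        for name, value in changes.items():
            if name not in allowed:
                raise KeyError(f"unknown self-model field: {name}")
            setattr(self, name, value)


core = AutonomyCore()
core.update(mood="focused")
print(core.get_state()["mood"])
```

Note that the class holds structure only: nothing in it calls an LLM, which matches the constraint that the Autonomy Core does not run models itself.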
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **2. Inner Monologue (Narrating, Private Mind)**
|
||||||
|
|
||||||
|
*New subsystem in v0.6.0.*
|
||||||
|
|
||||||
|
This module:
|
||||||
|
|
||||||
|
* Reads Cortex summaries (intake, reflection, persona output)
|
||||||
|
* Generates private self-talk (using an LLM, typically DeepSeek)
|
||||||
|
* Updates the Autonomy Core
|
||||||
|
* Produces a **self-state packet** for Cortex to use during task execution
|
||||||
|
|
||||||
|
Inner Monologue is like:
|
||||||
|
|
||||||
|
> “Brian is asking about X.
|
||||||
|
> I should shift into a focused, serious tone.
|
||||||
|
> I feel confident about this area.”
|
||||||
|
|
||||||
|
It **never** outputs directly to the user.
|
||||||
|
|
||||||
|
### Output schema (example):
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"mood": "focused",
|
||||||
|
"persona_bias": "clear",
|
||||||
|
"confidence_delta": 0.05,
|
||||||
|
"stance": "analytical",
|
||||||
|
"notes_to_cortex": [
|
||||||
|
"Reduce playfulness",
|
||||||
|
"Prioritize clarity",
|
||||||
|
"Recall project memory"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
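How such a packet might be folded into the stored self-state can be sketched like this. The `apply_self_state` helper and the assumption that confidence lives in the range 0 to 1 are illustrative, not part of the design above.

```python
def apply_self_state(core_state, packet):
    """Merge an Inner Monologue packet into a copy of the Autonomy Core state."""
    updated = dict(core_state)
    if "mood" in packet:
        updated["mood"] = packet["mood"]
    delta = float(packet.get("confidence_delta", 0.0))
    # Clamp so repeated deltas cannot push confidence outside [0, 1] (assumed range).
    updated["confidence"] = min(1.0, max(0.0, updated.get("confidence", 0.5) + delta))
    return updated


packet = {"mood": "focused", "confidence_delta": 0.05, "stance": "analytical"}
print(apply_self_state({"mood": "neutral", "confidence": 0.98}, packet))
```

Treating `confidence_delta` as a relative change rather than an absolute value keeps a single noisy monologue turn from overwriting the long-term baseline.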
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **3. Executive Agent (DeepSeek Director Mode)**
|
||||||
|
|
||||||
|
Inside Cortex.
|
||||||
|
|
||||||
|
This is Lyra’s **prefrontal cortex** — the task-oriented planner that decides how to respond to the current user message.
|
||||||
|
|
||||||
|
Input to Executive Agent:
|
||||||
|
|
||||||
|
* User message
|
||||||
|
* Intake summary
|
||||||
|
* Reflection notes
|
||||||
|
* **Self-state packet** from Inner Monologue
|
||||||
|
|
||||||
|
It outputs a **plan**, not a final answer:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"action": "WRITE_NOTE",
|
||||||
|
"tools": ["memory_search"],
|
||||||
|
"tone": "focused",
|
||||||
|
"steps": [
|
||||||
|
"Search relevant project notes",
|
||||||
|
"Synthesize into summary",
|
||||||
|
"Draft actionable update"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Cortex then executes this plan.
|
||||||
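A rough sketch of what "executes this plan" could mean in code: check that every requested tool exists, call it, and hand a trace to the later stages. The `execute_plan` function and the stub `memory_search` tool are hypothetical.

```python
def execute_plan(plan, tools):
    """Validate the tools a plan names, call each one, and return a trace."""
    requested = plan.get("tools", [])
    missing = [name for name in requested if name not in tools]
    if missing:
        raise LookupError(f"plan requests unknown tools: {missing}")
    return {
        "action": plan["action"],
        "tone": plan.get("tone", "neutral"),
        "tool_results": {name: tools[name](plan) for name in requested},
        "steps": list(plan.get("steps", [])),
    }


plan = {
    "action": "WRITE_NOTE",
    "tools": ["memory_search"],
    "tone": "focused",
    "steps": ["Search relevant project notes", "Synthesize into summary"],
}
# Hypothetical tool: a real one would query NeoMem instead of returning a stub.
tools = {"memory_search": lambda plan: ["note-1", "note-2"]}
print(execute_plan(plan, tools)["tool_results"])
```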
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **Cortex Pipeline (v0.6.0)**
|
||||||
|
|
||||||
|
Cortex becomes the orchestrator for the entire sequence:
|
||||||
|
|
||||||
|
### **0. Intake**
|
||||||
|
|
||||||
|
Parse the user message, extract relevant features.
|
||||||
|
|
||||||
|
### **1. Reflection**
|
||||||
|
|
||||||
|
Lightweight summarization (unchanged).
|
||||||
|
Output used by both Inner Monologue and Executive Agent.
|
||||||
|
|
||||||
|
### **2. Inner Monologue Update (parallel)**
|
||||||
|
|
||||||
|
Reflection summary is sent to Inner Self, which:
|
||||||
|
|
||||||
|
* updates Autonomy Core
|
||||||
|
* returns `self_state` to Cortex
|
||||||
|
|
||||||
|
### **3. Executive Agent (DeepSeek)**
|
||||||
|
|
||||||
|
Given:
|
||||||
|
|
||||||
|
* user message
|
||||||
|
* reflection summary
|
||||||
|
* autonomy self_state
|
||||||
|
→ produce a **task plan**
|
||||||
|
|
||||||
|
### **4. Reasoning**
|
||||||
|
|
||||||
|
Carries out the plan:
|
||||||
|
|
||||||
|
* tool calls
|
||||||
|
* retrieval
|
||||||
|
* synthesis
|
||||||
|
|
||||||
|
### **5. Refinement**
|
||||||
|
|
||||||
|
Polish the draft, ensure quality, follow constraints.
|
||||||
|
|
||||||
|
### **6. Persona (speak.py)**
|
||||||
|
|
||||||
|
Final transformation into Lyra’s voice.
|
||||||
|
Persona now uses:
|
||||||
|
|
||||||
|
* self_state (mood, tone)
|
||||||
|
* constraints from Executive Agent
|
||||||
|
|
||||||
|
### **7. User Response**
|
||||||
|
|
||||||
|
Persona output is delivered to the user.
|
||||||
|
|
||||||
|
### **8. Inner Monologue Post-Update**
|
||||||
|
|
||||||
|
Cortex sends the final answer BACK to inner self for:
|
||||||
|
|
||||||
|
* narrative continuity
|
||||||
|
* emotional adjustment
|
||||||
|
* identity update
|
||||||
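The stage ordering above can be sketched as a list of async functions that share one context dictionary. Every stage body here is a placeholder, and the sketch awaits the Inner Monologue update before the Executive Agent because the plan needs `self_state`; how the "parallel" update is actually scheduled is left open by this document.

```python
import asyncio


async def intake(ctx):
    ctx["features"] = ctx["user_msg"].lower().split()


async def reflection(ctx):
    ctx["reflection"] = f"user said {len(ctx['features'])} words"


async def inner_monologue(ctx):
    ctx["self_state"] = {"mood": "focused"}


async def executive(ctx):
    ctx["plan"] = {"tone": ctx["self_state"]["mood"], "steps": ["answer"]}


async def reasoning(ctx):
    ctx["draft"] = f"draft for: {ctx['user_msg']}"


async def refinement(ctx):
    ctx["draft"] = ctx["draft"].strip()


async def persona(ctx):
    ctx["reply"] = f"[{ctx['plan']['tone']}] {ctx['draft']}"


STAGES = [intake, reflection, inner_monologue, executive, reasoning, refinement, persona]


async def run_pipeline(user_msg):
    ctx = {"user_msg": user_msg}
    for stage in STAGES:
        await stage(ctx)  # each stage reads earlier keys and adds its own
    return ctx


print(asyncio.run(run_pipeline("Hello Lyra"))["reply"])
```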
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **Key Conceptual Separation**
|
||||||
|
|
||||||
|
These components must remain distinct:
|
||||||
|
|
||||||
|
| Layer | Purpose |
|
||||||
|
| ------------------- | ------------------------------------------------------- |
|
||||||
|
| **Autonomy Core** | Lyra’s identity + emotional continuity |
|
||||||
|
| **Inner Monologue** | Lyra’s private thoughts, interpretation, meaning-making |
|
||||||
|
| **Executive Agent** | Deciding what to *do* for the user message |
|
||||||
|
| **Cortex** | Executing the plan |
|
||||||
|
| **Persona** | Outward voice (what the user actually hears) |
|
||||||
|
|
||||||
|
The **user only interacts with Persona.**
|
||||||
|
Inner Monologue and Autonomy Core are internal cognitive machinery.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **What This Architecture Enables**
|
||||||
|
|
||||||
|
* Emotional continuity
|
||||||
|
* Identity stability
|
||||||
|
* Agentic decision-making
|
||||||
|
* Multi-model routing
|
||||||
|
* Context-aware tone
|
||||||
|
* Internal narrative
|
||||||
|
* Proactive behavioral shifts
|
||||||
|
* Human-like cognition
|
||||||
|
|
||||||
|
This design turns Cortex from a simple pipeline into the **center of a functional artificial mind**.
|
||||||
@@ -0,0 +1,354 @@
|
|||||||
|
**ARCHITECTURE_v0.6.1.md**: aligned with the new mental model where **Inner Self is the core agent** the user interacts with.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **ARCHITECTURE_v0.6.1 — Lyra Cognitive System**
|
||||||
|
|
||||||
|
> **Core change from v0.6.0 → v0.6.1:**
|
||||||
|
> **Inner Self becomes the primary conversational agent**
|
||||||
|
> (the model the user is *actually* talking to),
|
||||||
|
> while Executive and Cortex models support the Self rather than drive it.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **1. High-Level Overview**
|
||||||
|
|
||||||
|
Lyra v0.6.1 is composed of **three cognitive layers** and **one expression layer**, plus an autonomy module for ongoing identity continuity.
|
||||||
|
|
||||||
|
```
|
||||||
|
USER
|
||||||
|
↓
|
||||||
|
Relay (I/O)
|
||||||
|
↓
|
||||||
|
Cortex Intake (context snapshot)
|
||||||
|
↓
|
||||||
|
INNER SELF ←→ EXECUTIVE MODEL (DeepSeek)
|
||||||
|
↓
|
||||||
|
Cortex Chat Model (draft language)
|
||||||
|
↓
|
||||||
|
Persona Model (Lyra’s voice)
|
||||||
|
↓
|
||||||
|
Relay → USER
|
||||||
|
↓
|
||||||
|
Inner Self updates Autonomy Core (self-state)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **2. Roles of Each Layer**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **2.1 Inner Self (Primary Conversational Agent)**
|
||||||
|
|
||||||
|
The Self is Lyra’s “seat of consciousness.”
|
||||||
|
|
||||||
|
This layer:
|
||||||
|
|
||||||
|
* Interprets every user message
|
||||||
|
* Maintains internal monologue
|
||||||
|
* Chooses emotional stance (warm, blunt, focused, chaotic)
|
||||||
|
* Decides whether to think deeply or reply quickly
|
||||||
|
* Decides whether to consult the Executive model
|
||||||
|
* Forms a **response intent**
|
||||||
|
* Provides tone and meta-guidance to the Persona layer
|
||||||
|
* Updates self-state (mood, trust, narrative identity)
|
||||||
|
|
||||||
|
Inner Self is the thing the **user is actually talking to.**
|
||||||
|
|
||||||
|
Inner Self does **NOT** generate paragraphs of text —
|
||||||
|
it generates *intent*:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"intent": "comfort Brian and explain the error simply",
|
||||||
|
"tone": "gentle",
|
||||||
|
"depth": "medium",
|
||||||
|
"consult_exec": true
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **2.2 Executive Model (DeepSeek Reasoner)**
|
||||||
|
|
||||||
|
This model is the **thinking engine** Inner Self consults when necessary.
|
||||||
|
|
||||||
|
It performs:
|
||||||
|
|
||||||
|
* planning
|
||||||
|
* deep reasoning
|
||||||
|
* tool selection
|
||||||
|
* multi-step logic
|
||||||
|
* explanation chains
|
||||||
|
|
||||||
|
It never speaks directly to the user.
|
||||||
|
|
||||||
|
It returns a **plan**, not a message:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"plan": [
|
||||||
|
"Identify error",
|
||||||
|
"Recommend restart",
|
||||||
|
"Reassure user"
|
||||||
|
],
|
||||||
|
"confidence": 0.86
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Inner Self can follow or override the plan.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **2.3 Cortex Chat Model (Draft Generator)**
|
||||||
|
|
||||||
|
This is the **linguistic engine**.
|
||||||
|
|
||||||
|
It converts Inner Self’s intent (plus Executive’s plan if provided) into actual language:
|
||||||
|
|
||||||
|
Input:
|
||||||
|
|
||||||
|
```
|
||||||
|
intent + optional plan + context snapshot
|
||||||
|
```
|
||||||
|
|
||||||
|
Output:
|
||||||
|
|
||||||
|
```
|
||||||
|
structured draft paragraph
|
||||||
|
```
|
||||||
|
|
||||||
|
This model must be:
|
||||||
|
|
||||||
|
* instruction-tuned
|
||||||
|
* coherent
|
||||||
|
* factual
|
||||||
|
* friendly
|
||||||
|
|
||||||
|
Examples: GPT-4o-mini, Qwen-14B-instruct, Mixtral chat, etc.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## **2.4 Persona Model (Lyra’s Voice)**
|
||||||
|
|
||||||
|
This is the **expression layer** — the mask, the tone, the identity.
|
||||||
|
|
||||||
|
It takes:
|
||||||
|
|
||||||
|
* the draft language
|
||||||
|
* the Self’s tone instructions
|
||||||
|
* the narrative state (from Autonomy Core)
|
||||||
|
* prior persona shaping rules
|
||||||
|
|
||||||
|
And transforms the text into:
|
||||||
|
|
||||||
|
* Lyra’s voice
|
||||||
|
* Lyra’s humor
|
||||||
|
* Lyra’s emotional texture
|
||||||
|
* Lyra’s personality consistency
|
||||||
|
|
||||||
|
Persona does not change the *meaning* — only the *presentation*.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **3. Message Flow (Full Pipeline)**
|
||||||
|
|
||||||
|
A clean version, step-by-step:
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **1. USER → Relay**
|
||||||
|
|
||||||
|
Relay attaches metadata (session, timestamp) and forwards to Cortex.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **2. Intake → Context Snapshot**
|
||||||
|
|
||||||
|
Cortex creates:
|
||||||
|
|
||||||
|
* cleaned message
|
||||||
|
* recent context summary
|
||||||
|
* memory matches (RAG)
|
||||||
|
* time-since-last
|
||||||
|
* conversation mode
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **3. Inner Self Receives Snapshot**
|
||||||
|
|
||||||
|
Inner Self:
|
||||||
|
|
||||||
|
* interprets the user’s intent
|
||||||
|
* updates internal monologue
|
||||||
|
* decides how Lyra *feels* about the input
|
||||||
|
* chooses whether to consult Executive
|
||||||
|
* produces an **intent packet**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **4. (Optional) Inner Self Consults Executive Model**
|
||||||
|
|
||||||
|
Inner Self sends the situation to DeepSeek:
|
||||||
|
|
||||||
|
```
|
||||||
|
"Given Brian's message and my context, what is the best plan?"
|
||||||
|
```
|
||||||
|
|
||||||
|
DeepSeek returns:
|
||||||
|
|
||||||
|
* a plan
|
||||||
|
* recommended steps
|
||||||
|
* rationale
|
||||||
|
* optional tool suggestions
|
||||||
|
|
||||||
|
Inner Self integrates the plan or overrides it.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **5. Inner Self → Cortex Chat Model**
|
||||||
|
|
||||||
|
Self creates an **instruction packet**:
|
||||||
|
|
||||||
|
```
{
  "intent": "...",
  "tone": "...",
  "plan": [...],
  "context_summary": {...}
}
```
|
||||||
|
|
||||||
|
Cortex chat model produces the draft text.
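One way the packet might be assembled is sketched below. This is an illustration under assumptions: `build_instruction_packet` is a hypothetical helper, and treating "Executive was not consulted" as an empty `plan` list is a guess about how the optional step 4 could be represented.

```python
import json
from typing import Any, Optional


def build_instruction_packet(intent: str, tone: str,
                             context_summary: dict[str, Any],
                             executive_plan: Optional[list[str]] = None) -> dict[str, Any]:
    """Assemble the packet; the plan is empty when Executive was not consulted."""
    return {
        "intent": intent,
        "tone": tone,
        "plan": list(executive_plan or []),
        "context_summary": dict(context_summary),
    }


# Example values are invented for illustration only.
packet = build_instruction_packet(
    intent="answer a status question",
    tone="warm",
    context_summary={"recent": "user asked about the relay fix"},
)
print(json.dumps(packet, indent=2))
```

Keeping the four keys identical to the packet shown above means the chat model's prompt template does not need to know whether a plan was supplied.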
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **6. Persona Model Transforms the Draft**
|
||||||
|
|
||||||
|
Persona takes draft → produces final Lyra-styled output.
|
||||||
|
|
||||||
|
Persona ensures:
|
||||||
|
|
||||||
|
* emotional fidelity
|
||||||
|
* humor when appropriate
|
||||||
|
* warmth / sharpness depending on state
|
||||||
|
* consistent narrative identity
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **7. Relay Sends Response to USER**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### **8. Inner Self Updates Autonomy Core**
|
||||||
|
|
||||||
|
Inner Self receives:
|
||||||
|
|
||||||
|
* the action taken
|
||||||
|
* the emotional tone used
|
||||||
|
* any RAG results
|
||||||
|
* narrative significance
|
||||||
|
|
||||||
|
And updates:
|
||||||
|
|
||||||
|
* mood
|
||||||
|
* trust memory
|
||||||
|
* identity drift
|
||||||
|
* ongoing narrative
|
||||||
|
* stable traits
|
||||||
|
|
||||||
|
This becomes part of her evolving self.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **4. Cognitive Ownership Summary**
|
||||||
|
|
||||||
|
### Inner Self
|
||||||
|
|
||||||
|
**Owns:**
|
||||||
|
|
||||||
|
* decision-making
|
||||||
|
* feeling
|
||||||
|
* interpreting
|
||||||
|
* intent
|
||||||
|
* tone
|
||||||
|
* continuity of self
|
||||||
|
* mood
|
||||||
|
* monologue
|
||||||
|
* overrides
|
||||||
|
|
||||||
|
### Executive (DeepSeek)
|
||||||
|
|
||||||
|
**Owns:**
|
||||||
|
|
||||||
|
* logic
|
||||||
|
* planning
|
||||||
|
* structure
|
||||||
|
* analysis
|
||||||
|
* tool selection
|
||||||
|
|
||||||
|
### Cortex Chat Model
|
||||||
|
|
||||||
|
**Owns:**
|
||||||
|
|
||||||
|
* language generation
|
||||||
|
* factual content
|
||||||
|
* clarity
|
||||||
|
|
||||||
|
### Persona
|
||||||
|
|
||||||
|
**Owns:**
|
||||||
|
|
||||||
|
* voice
|
||||||
|
* flavor
|
||||||
|
* style
|
||||||
|
* emotional texture
|
||||||
|
* social expression
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **5. Why v0.6.1 is Better**
|
||||||
|
|
||||||
|
* More human
|
||||||
|
* More natural
|
||||||
|
* Allows spontaneous responses
|
||||||
|
* Allows deep thinking when needed
|
||||||
|
* Separates “thought” from “speech”
|
||||||
|
* Gives Lyra a *real self*
|
||||||
|
* Allows much more autonomy later
|
||||||
|
* Matches your brain’s actual structure
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
# **6. Migration Notes from v0.6.0**
|
||||||
|
|
||||||
|
Nothing is deleted.
|
||||||
|
Everything is **rearranged** so that meaning, intent, and tone flow correctly.
|
||||||
|
|
||||||
|
Main changes:
|
||||||
|
|
||||||
|
* Inner Self now initiates the response, rather than merely influencing it.
|
||||||
|
* Executive is secondary, not primary.
|
||||||
|
* Persona becomes an expression layer, not a content layer.
|
||||||
|
* Cortex Chat Model handles drafting, not cognition.
|
||||||
|
|
||||||
|
The whole system becomes both more powerful and easier to reason about.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
|
||||||
@@ -0,0 +1,39 @@
|
|||||||
|
Request Flow Chain
|
||||||
|
1. UI (Frontend)
|
||||||
|
↓ sends HTTP POST to
|
||||||
|
|
||||||
|
2. Relay Service (Node.js - server.js)
|
||||||
|
Location: /home/serversdown/project-lyra/core/relay/server.js
|
||||||
|
Port: 7078
|
||||||
|
Endpoint: POST /v1/chat/completions
|
||||||
|
↓ calls handleChatRequest() which posts to
|
||||||
|
|
||||||
|
3. Cortex Service - Reason Endpoint (Python FastAPI - router.py)
|
||||||
|
Location: /home/serversdown/project-lyra/cortex/router.py
|
||||||
|
Port: 7081
|
||||||
|
Endpoint: POST /reason
|
||||||
|
Function: run_reason() at line 126
|
||||||
|
↓ calls
|
||||||
|
|
||||||
|
4. Cortex Reasoning Module (reasoning.py)
|
||||||
|
Location: /home/serversdown/project-lyra/cortex/reasoning/reasoning.py
|
||||||
|
Function: reason_check() at line 188
|
||||||
|
↓ calls
|
||||||
|
|
||||||
|
5. LLM Router (llm_router.py)
|
||||||
|
Location: /home/serversdown/project-lyra/cortex/llm/llm_router.py
|
||||||
|
Function: call_llm()
|
||||||
|
- Gets backend from env: CORTEX_LLM=PRIMARY (from .env line 29)
|
||||||
|
- Looks up PRIMARY config which has provider="mi50" (from .env line 13)
|
||||||
|
- Routes to the mi50 provider handler (line 62-70)
|
||||||
|
↓ makes HTTP POST to
|
||||||
|
|
||||||
|
6. MI50 LLM Server (llama.cpp)
|
||||||
|
Location: http://10.0.0.44:8080
|
||||||
|
Endpoint: POST /completion
|
||||||
|
Hardware: AMD MI50 GPU running DeepSeek model
|
||||||
|
Key Configuration Points
|
||||||
|
Backend Selection: .env:29 sets CORTEX_LLM=PRIMARY
|
||||||
|
Provider Name: .env:13 sets LLM_PRIMARY_PROVIDER=mi50
|
||||||
|
Server URL: .env:14 sets LLM_PRIMARY_URL=http://10.0.0.44:8080
|
||||||
|
Provider Handler: llm_router.py:62-70 implements the mi50 provider
|
||||||
@@ -0,0 +1,925 @@
|
|||||||
|
# Project Lyra — Comprehensive AI Context Summary
|
||||||
|
|
||||||
|
**Version:** v0.5.1 (2025-12-11)
|
||||||
|
**Status:** Production-ready modular AI companion system
|
||||||
|
**Purpose:** Memory-backed conversational AI with multi-stage reasoning, persistent context, and modular LLM backend architecture
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Executive Summary
|
||||||
|
|
||||||
|
Project Lyra is a **self-hosted AI companion system** designed to overcome the limitations of typical chatbots by providing:
|
||||||
|
- **Persistent long-term memory** (NeoMem: PostgreSQL + Neo4j graph storage)
|
||||||
|
- **Multi-stage reasoning pipeline** (Cortex: reflection → reasoning → refinement → persona)
|
||||||
|
- **Short-term context management** (Intake: session-based summarization embedded in Cortex)
|
||||||
|
- **Flexible LLM backend routing** (supports llama.cpp, Ollama, OpenAI, custom endpoints)
|
||||||
|
- **OpenAI-compatible API** (drop-in replacement for chat applications)
|
||||||
|
|
||||||
|
**Core Philosophy:** Like a human brain has different regions for different functions, Lyra has specialized modules that work together. She's not just a chatbot—she's a notepad, schedule, database, co-creator, and collaborator with her own executive function.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick Context for AI Assistants
|
||||||
|
|
||||||
|
If you're an AI being given this project to work on, here's what you need to know:
|
||||||
|
|
||||||
|
### What This Project Does
|
||||||
|
Lyra is a conversational AI system that **remembers everything** across sessions. When a user says something in passing, Lyra stores it, contextualizes it, and can recall it later. She can:
|
||||||
|
- Track project progress over time
|
||||||
|
- Remember user preferences and past conversations
|
||||||
|
- Reason through complex questions using multiple LLM calls
|
||||||
|
- Apply a consistent personality across all interactions
|
||||||
|
- Integrate with multiple LLM backends (local and cloud)
|
||||||
|
|
||||||
|
### Current Architecture (v0.5.1)
|
||||||
|
```
User → Relay (Express/Node.js, port 7078)
         ↓
       Cortex (FastAPI/Python, port 7081)
         ├─ Intake module (embedded, in-memory SESSIONS)
         ├─ 4-stage reasoning pipeline
         └─ Multi-backend LLM router
         ↓
       NeoMem (FastAPI/Python, port 7077)
         ├─ PostgreSQL (vector storage)
         └─ Neo4j (graph relationships)
```
|
||||||
|
|
||||||
|
### Key Files You'll Work With
|
||||||
|
|
||||||
|
**Backend Services:**
|
||||||
|
- [cortex/router.py](cortex/router.py) - Main Cortex routing logic (306 lines, `/reason`, `/ingest` endpoints)
|
||||||
|
- [cortex/intake/intake.py](cortex/intake/intake.py) - Short-term memory module (367 lines, SESSIONS management)
|
||||||
|
- [cortex/reasoning/reasoning.py](cortex/reasoning/reasoning.py) - Draft answer generation
|
||||||
|
- [cortex/reasoning/refine.py](cortex/reasoning/refine.py) - Answer refinement
|
||||||
|
- [cortex/reasoning/reflection.py](cortex/reasoning/reflection.py) - Meta-awareness notes
|
||||||
|
- [cortex/persona/speak.py](cortex/persona/speak.py) - Personality layer
|
||||||
|
- [cortex/llm/llm_router.py](cortex/llm/llm_router.py) - LLM backend selector
|
||||||
|
- [core/relay/server.js](core/relay/server.js) - Main orchestrator (Node.js)
|
||||||
|
- [neomem/main.py](neomem/main.py) - Long-term memory API
|
||||||
|
|
||||||
|
**Configuration:**
|
||||||
|
- [.env](.env) - Root environment variables (LLM backends, databases, API keys)
|
||||||
|
- [cortex/.env](cortex/.env) - Cortex-specific overrides
|
||||||
|
- [docker-compose.yml](docker-compose.yml) - Service definitions (152 lines)
|
||||||
|
|
||||||
|
**Documentation:**
|
||||||
|
- [CHANGELOG.md](CHANGELOG.md) - Complete version history (836 lines, chronological format)
|
||||||
|
- [README.md](README.md) - User-facing documentation (610 lines)
|
||||||
|
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md) - This file
|
||||||
|
|
||||||
|
### Recent Critical Fixes (v0.5.1)
|
||||||
|
The most recent work fixed a critical bug where Intake's SESSIONS buffer wasn't persisting:
|
||||||
|
1. **Fixed**: `bg_summarize()` was only a TYPE_CHECKING stub → implemented as logging stub
|
||||||
|
2. **Fixed**: `/ingest` endpoint had unreachable code → removed early return, added lenient error handling
|
||||||
|
3. **Added**: `cortex/intake/__init__.py` → proper Python package structure
|
||||||
|
4. **Added**: Diagnostic endpoints `/debug/sessions` and `/debug/summary` for troubleshooting
|
||||||
|
|
||||||
|
**Key Insight**: Intake is no longer a standalone service—it's embedded in Cortex as a Python module. SESSIONS must persist in a single Uvicorn worker (no multi-worker support without Redis).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture Deep Dive
|
||||||
|
|
||||||
|
### Service Topology (Docker Compose)
|
||||||
|
|
||||||
|
**Active Containers:**
|
||||||
|
1. **relay** (Node.js/Express, port 7078)
|
||||||
|
- Entry point for all user requests
|
||||||
|
- OpenAI-compatible `/v1/chat/completions` endpoint
|
||||||
|
- Routes to Cortex for reasoning
|
||||||
|
- Async calls to Cortex `/ingest` after response
|
||||||
|
|
||||||
|
2. **cortex** (Python/FastAPI, port 7081)
|
||||||
|
- Multi-stage reasoning pipeline
|
||||||
|
- Embedded Intake module (no HTTP, direct Python imports)
|
||||||
|
- Endpoints: `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary`
|
||||||
|
|
||||||
|
3. **neomem-api** (Python/FastAPI, port 7077)
|
||||||
|
- Long-term memory storage
|
||||||
|
- Fork of Mem0 OSS (fully local, no external SDK)
|
||||||
|
- Endpoints: `/memories`, `/search`, `/health`
|
||||||
|
|
||||||
|
4. **neomem-postgres** (PostgreSQL + pgvector, port 5432)
|
||||||
|
- Vector embeddings storage
|
||||||
|
- Memory history records
|
||||||
|
|
||||||
|
5. **neomem-neo4j** (Neo4j, ports 7474/7687)
|
||||||
|
- Graph relationships between memories
|
||||||
|
- Entity extraction and linking
|
||||||
|
|
||||||
|
**Disabled Services:**
|
||||||
|
- `intake` - No longer needed (embedded in Cortex as of v0.5.1)
|
||||||
|
- `rag` - Beta Lyrae RAG service (planned re-enablement)
|
||||||
|
|
||||||
|
### External LLM Backends (HTTP APIs)
|
||||||
|
|
||||||
|
**PRIMARY Backend** - llama.cpp @ `http://10.0.0.44:8080`
|
||||||
|
- AMD MI50 GPU-accelerated inference
|
||||||
|
- Model: `/model` (path-based routing)
|
||||||
|
- Used for: Reasoning, refinement, summarization
|
||||||
|
|
||||||
|
**SECONDARY Backend** - Ollama @ `http://10.0.0.3:11434`
|
||||||
|
- RTX 3090 GPU-accelerated inference
|
||||||
|
- Model: `qwen2.5:7b-instruct-q4_K_M`
|
||||||
|
- Used for: Configurable per-module
|
||||||
|
|
||||||
|
**CLOUD Backend** - OpenAI @ `https://api.openai.com/v1`
|
||||||
|
- Cloud-based inference
|
||||||
|
- Model: `gpt-4o-mini`
|
||||||
|
- Used for: Reflection, persona layers
|
||||||
|
|
||||||
|
**FALLBACK Backend** - Local @ `http://10.0.0.41:11435`
|
||||||
|
- CPU-based inference
|
||||||
|
- Model: `llama-3.2-8b-instruct`
|
||||||
|
- Used for: Emergency fallback
|
||||||
|
|
||||||
|
### Data Flow (Request Lifecycle)
|
||||||
|
|
||||||
|
```
1. User sends message → Relay (/v1/chat/completions)
   ↓
2. Relay → Cortex (/reason)
   ↓
3. Cortex calls Intake module (internal Python)
   - Intake.summarize_context(session_id, exchanges)
   - Returns L1/L5/L10/L20/L30 summaries
   ↓
4. Cortex 4-stage pipeline:
   a. reflection.py → Meta-awareness notes (CLOUD backend)
      - "What is the user really asking?"
      - Returns JSON: {"notes": [...]}

   b. reasoning.py → Draft answer (PRIMARY backend)
      - Uses context from Intake
      - Integrates reflection notes
      - Returns draft text

   c. refine.py → Refined answer (PRIMARY backend)
      - Polishes draft for clarity
      - Ensures factual consistency
      - Returns refined text

   d. speak.py → Persona layer (CLOUD backend)
      - Applies Lyra's personality
      - Natural, conversational tone
      - Returns final answer
   ↓
5. Cortex → Relay (returns persona answer)
   ↓
6. Relay → Cortex (/ingest) [async, non-blocking]
   - Sends (session_id, user_msg, assistant_msg)
   - Cortex calls add_exchange_internal()
   - Appends to SESSIONS[session_id]["buffer"]
   ↓
7. Relay → User (returns final response)
   ↓
8. [Planned] Relay → NeoMem (/memories) [async]
   - Store conversation in long-term memory
```
|
||||||
|
|
||||||
|
### Intake Module Architecture (v0.5.1)
|
||||||
|
|
||||||
|
**Location:** `cortex/intake/`
|
||||||
|
|
||||||
|
**Key Change:** Intake is now **embedded in Cortex** as a Python module, not a standalone service.
|
||||||
|
|
||||||
|
**Import Pattern:**
|
||||||
|
```python
|
||||||
|
from intake.intake import add_exchange_internal, SESSIONS, summarize_context
|
||||||
|
```
|
||||||
|
|
||||||
|
**Core Data Structure:**
|
||||||
|
```python
SESSIONS: dict[str, dict] = {}

# Structure:
SESSIONS[session_id] = {
    "buffer": deque(maxlen=200),  # Circular buffer of exchanges
    "created_at": datetime
}

# Each exchange in buffer:
{
    "session_id": "...",
    "user_msg": "...",
    "assistant_msg": "...",
    "timestamp": "2025-12-11T..."
}
```
|
||||||
|
|
||||||
|
**Functions:**
|
||||||
|
1. **`add_exchange_internal(exchange: dict)`**
|
||||||
|
- Adds exchange to SESSIONS buffer
|
||||||
|
- Creates new session if needed
|
||||||
|
- Calls `bg_summarize()` stub
|
||||||
|
- Returns `{"ok": True, "session_id": "..."}`
|
||||||
|
|
||||||
|
2. **`summarize_context(session_id: str, exchanges: list[dict])`** [async]
|
||||||
|
- Generates L1/L5/L10/L20/L30 summaries via LLM
|
||||||
|
- Called during `/reason` endpoint
|
||||||
|
- Returns multi-level summary dict
|
||||||
|
|
||||||
|
3. **`bg_summarize(session_id: str)`**
|
||||||
|
- **Stub function** - logs only, no actual work
|
||||||
|
- Defers summarization to `/reason` call
|
||||||
|
- Exists to prevent NameError
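The behaviour described for these functions can be sketched as follows. This is a simplified stand-in, not the contents of `intake.py`: the `"default"` fallback session ID, the `level_windows` helper, and the reading of `L1/L5/L10/L20/L30` as "the last 1, 5, 10, 20, and 30 exchanges" are assumptions made for illustration.

```python
from collections import deque
from datetime import datetime, timezone

SESSIONS: dict[str, dict] = {}
SUMMARY_LEVELS = (1, 5, 10, 20, 30)  # assumed: L<n> covers the last n exchanges


def bg_summarize(session_id: str) -> None:
    """Logging-only stub; real summarization happens during /reason."""
    print(f"[intake] summarization deferred for session {session_id}")


def add_exchange_internal(exchange: dict) -> dict:
    """Append one exchange to its session buffer, creating the session if needed."""
    session_id = exchange.get("session_id") or "default"  # assumed fallback
    if session_id not in SESSIONS:
        SESSIONS[session_id] = {
            "buffer": deque(maxlen=200),  # oldest entries drop off automatically
            "created_at": datetime.now(timezone.utc),
        }
    SESSIONS[session_id]["buffer"].append(exchange)
    bg_summarize(session_id)
    return {"ok": True, "session_id": session_id}


def level_windows(session_id: str) -> dict[str, list[dict]]:
    """Return the last-n slices a summarizer could be handed for each level."""
    buffer = list(SESSIONS.get(session_id, {}).get("buffer", []))
    return {f"L{n}": buffer[-n:] for n in SUMMARY_LEVELS}
```

Because `deque(maxlen=200)` discards the oldest entry on overflow, the buffer needs no explicit trimming logic.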
|
||||||
|
|
||||||
|
**Critical Constraint:** SESSIONS is a module-level global dict. This requires **single-worker Uvicorn** mode. Multi-worker deployments need Redis or shared storage.
|
||||||
|
|
||||||
|
**Diagnostic Endpoints:**
|
||||||
|
- `GET /debug/sessions` - Inspect all SESSIONS (object ID, buffer sizes, recent exchanges)
|
||||||
|
- `GET /debug/summary?session_id=X` - Test summarization for a session
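The payload behind `/debug/sessions` might be built roughly like this. It is a sketch only: `sessions_debug_info` is a hypothetical helper and `recent_exchanges` is an assumed key name, while `sessions_object_id` and `buffer_size` follow the field names used in the debugging and testing commands elsewhere in this document.

```python
SESSIONS: dict[str, dict] = {}  # stand-in for the module-level dict in intake.py


def sessions_debug_info(recent: int = 3) -> dict:
    """Summarize SESSIONS for a debug response."""
    return {
        # id() stays constant for the lifetime of one object in one process,
        # so a changing value across calls points at a reload or a second worker.
        "sessions_object_id": id(SESSIONS),
        "sessions": {
            sid: {
                "buffer_size": len(data["buffer"]),
                "recent_exchanges": list(data["buffer"])[-recent:] if recent > 0 else [],
            }
            for sid, data in SESSIONS.items()
        },
    }
```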
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Environment Configuration
|
||||||
|
|
||||||
|
### LLM Backend Registry (Multi-Backend Strategy)
|
||||||
|
|
||||||
|
**Root `.env` defines all backend OPTIONS:**
|
||||||
|
```bash
|
||||||
|
# PRIMARY Backend (llama.cpp)
|
||||||
|
LLM_PRIMARY_PROVIDER=llama.cpp
|
||||||
|
LLM_PRIMARY_URL=http://10.0.0.44:8080
|
||||||
|
LLM_PRIMARY_MODEL=/model
|
||||||
|
|
||||||
|
# SECONDARY Backend (Ollama)
|
||||||
|
LLM_SECONDARY_PROVIDER=ollama
|
||||||
|
LLM_SECONDARY_URL=http://10.0.0.3:11434
|
||||||
|
LLM_SECONDARY_MODEL=qwen2.5:7b-instruct-q4_K_M
|
||||||
|
|
||||||
|
# CLOUD Backend (OpenAI)
|
||||||
|
LLM_OPENAI_PROVIDER=openai
|
||||||
|
LLM_OPENAI_URL=https://api.openai.com/v1
|
||||||
|
LLM_OPENAI_MODEL=gpt-4o-mini
|
||||||
|
OPENAI_API_KEY=sk-proj-...
|
||||||
|
|
||||||
|
# FALLBACK Backend
|
||||||
|
LLM_FALLBACK_PROVIDER=openai_completions
|
||||||
|
LLM_FALLBACK_URL=http://10.0.0.41:11435
|
||||||
|
LLM_FALLBACK_MODEL=llama-3.2-8b-instruct
|
||||||
|
```
|
||||||
|
|
||||||
|
**Module-specific backend selection:**
|
||||||
|
```bash
|
||||||
|
CORTEX_LLM=SECONDARY # Cortex uses Ollama
|
||||||
|
INTAKE_LLM=PRIMARY # Intake uses llama.cpp
|
||||||
|
SPEAK_LLM=OPENAI # Persona uses OpenAI
|
||||||
|
NEOMEM_LLM=PRIMARY # NeoMem uses llama.cpp
|
||||||
|
UI_LLM=OPENAI # UI uses OpenAI
|
||||||
|
RELAY_LLM=PRIMARY # Relay uses llama.cpp
|
||||||
|
```
|
||||||
|
|
||||||
|
**Philosophy:** Root `.env` provides all backend OPTIONS. Each service chooses which backend to USE via `{MODULE}_LLM` variable. This eliminates URL duplication while preserving flexibility.
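The lookup this implies can be sketched in a few lines. This is not the code in `llm_router.py`; `resolve_backend` is a hypothetical helper, and falling back to `PRIMARY` when `{MODULE}_LLM` is unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_backend(module: str, env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Map {MODULE}_LLM to the matching LLM_{NAME}_* registry entries."""
    env = os.environ if env is None else env
    name = env.get(f"{module.upper()}_LLM", "PRIMARY").upper()  # assumed default
    prefix = f"LLM_{name}_"
    try:
        return {
            "name": name,
            "provider": env[prefix + "PROVIDER"],
            "url": env[prefix + "URL"],
            "model": env[prefix + "MODEL"],
        }
    except KeyError as exc:
        # Fail loudly on a typo such as CORTEX_LLM=SECONDRY instead of
        # silently routing to the wrong server.
        raise ValueError(f"backend {name!r} is missing {exc.args[0]}") from exc
```

With the values shown above, `resolve_backend("speak")` would return the `OPENAI` entry and `resolve_backend("cortex")` the `SECONDARY` one, so changing a module's backend is a one-variable edit.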
|
||||||
|
|
||||||
|
### Database Configuration
|
||||||
|
```bash
|
||||||
|
# PostgreSQL (vector storage)
|
||||||
|
POSTGRES_USER=neomem
|
||||||
|
POSTGRES_PASSWORD=neomempass
|
||||||
|
POSTGRES_DB=neomem
|
||||||
|
POSTGRES_HOST=neomem-postgres
|
||||||
|
POSTGRES_PORT=5432
|
||||||
|
|
||||||
|
# Neo4j (graph storage)
|
||||||
|
NEO4J_URI=bolt://neomem-neo4j:7687
|
||||||
|
NEO4J_USERNAME=neo4j
|
||||||
|
NEO4J_PASSWORD=neomemgraph
|
||||||
|
```
|
||||||
|
|
||||||
|
### Service URLs (Docker Internal Network)
|
||||||
|
```bash
|
||||||
|
NEOMEM_API=http://neomem-api:7077
|
||||||
|
CORTEX_API=http://cortex:7081
|
||||||
|
CORTEX_REASON_URL=http://cortex:7081/reason
|
||||||
|
CORTEX_INGEST_URL=http://cortex:7081/ingest
|
||||||
|
RELAY_URL=http://relay:7078
|
||||||
|
```
|
||||||
|
|
||||||
|
### Feature Flags
|
||||||
|
```bash
|
||||||
|
CORTEX_ENABLED=true
|
||||||
|
MEMORY_ENABLED=true
|
||||||
|
PERSONA_ENABLED=false
|
||||||
|
DEBUG_PROMPT=true
|
||||||
|
VERBOSE_DEBUG=true
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Code Structure Overview
|
||||||
|
|
||||||
|
### Cortex Service (`cortex/`)
|
||||||
|
|
||||||
|
**Main Files:**
|
||||||
|
- `main.py` - FastAPI app initialization
|
||||||
|
- `router.py` - Route definitions (`/reason`, `/ingest`, `/health`, `/debug/*`)
|
||||||
|
- `context.py` - Context aggregation (Intake summaries, session state)
|
||||||
|
|
||||||
|
**Reasoning Pipeline (`reasoning/`):**
|
||||||
|
- `reflection.py` - Meta-awareness notes (Cloud LLM)
|
||||||
|
- `reasoning.py` - Draft answer generation (Primary LLM)
|
||||||
|
- `refine.py` - Answer refinement (Primary LLM)
|
||||||
|
|
||||||
|
**Persona Layer (`persona/`):**
|
||||||
|
- `speak.py` - Personality application (Cloud LLM)
|
||||||
|
- `identity.py` - Persona loader
|
||||||
|
|
||||||
|
**Intake Module (`intake/`):**
|
||||||
|
- `__init__.py` - Package exports (SESSIONS, add_exchange_internal, summarize_context)
|
||||||
|
- `intake.py` - Core logic (367 lines)
|
||||||
|
- SESSIONS dictionary
|
||||||
|
- add_exchange_internal()
|
||||||
|
- summarize_context()
|
||||||
|
- bg_summarize() stub
|
||||||
|
|
||||||
|
**LLM Integration (`llm/`):**
|
||||||
|
- `llm_router.py` - Backend selector and HTTP client
|
||||||
|
- call_llm() function
|
||||||
|
- Environment-based routing
|
||||||
|
- Payload formatting per backend type
|
||||||
|
|
||||||
|
**Utilities (`utils/`):**
|
||||||
|
- Helper functions for common operations
|
||||||
|
|
||||||
|
**Configuration:**
|
||||||
|
- `Dockerfile` - Single-worker constraint documented
|
||||||
|
- `requirements.txt` - Python dependencies
|
||||||
|
- `.env` - Service-specific overrides
|
||||||
|
|
||||||
|
### Relay Service (`core/relay/`)
|
||||||
|
|
||||||
|
**Main Files:**
|
||||||
|
- `server.js` - Express.js server (Node.js)
|
||||||
|
- `/v1/chat/completions` - OpenAI-compatible endpoint
|
||||||
|
- `/chat` - Internal endpoint
|
||||||
|
- `/_health` - Health check
|
||||||
|
- `package.json` - Node.js dependencies
|
||||||
|
|
||||||
|
**Key Logic:**
|
||||||
|
- Receives user messages
|
||||||
|
- Routes to Cortex `/reason`
|
||||||
|
- Async calls to Cortex `/ingest` after response
|
||||||
|
- Returns final answer to user
|
||||||
|
|
||||||
|
### NeoMem Service (`neomem/`)
|
||||||
|
|
||||||
|
**Main Files:**
|
||||||
|
- `main.py` - FastAPI app (memory API)
|
||||||
|
- `memory.py` - Memory management logic
|
||||||
|
- `embedder.py` - Embedding generation
|
||||||
|
- `graph.py` - Neo4j graph operations
|
||||||
|
- `Dockerfile` - Container definition
|
||||||
|
- `requirements.txt` - Python dependencies
|
||||||
|
|
||||||
|
**API Endpoints:**
|
||||||
|
- `POST /memories` - Add new memory
|
||||||
|
- `POST /search` - Semantic search
|
||||||
|
- `GET /health` - Service health
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Common Development Tasks
|
||||||
|
|
||||||
|
### Adding a New Endpoint to Cortex
|
||||||
|
|
||||||
|
**Example: Add `/debug/buffer` endpoint**
|
||||||
|
|
||||||
|
1. **Edit `cortex/router.py`:**
|
||||||
|
```python
@cortex_router.get("/debug/buffer")
async def debug_buffer(session_id: str, limit: int = 10):
    """Return last N exchanges from a session buffer."""
    from intake.intake import SESSIONS

    session = SESSIONS.get(session_id)
    if not session:
        return {"error": "session not found", "session_id": session_id}

    buffer = session["buffer"]
    recent = list(buffer)[-limit:]

    return {
        "session_id": session_id,
        "total_exchanges": len(buffer),
        "recent_exchanges": recent
    }
```
|
||||||
|
|
||||||
|
2. **Restart Cortex:**
|
||||||
|
```bash
|
||||||
|
docker-compose restart cortex
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Test:**
|
||||||
|
```bash
|
||||||
|
curl "http://localhost:7081/debug/buffer?session_id=test&limit=5"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Modifying LLM Backend for a Module
|
||||||
|
|
||||||
|
**Example: Switch Cortex to use PRIMARY backend**
|
||||||
|
|
||||||
|
1. **Edit `.env`:**
|
||||||
|
```bash
|
||||||
|
CORTEX_LLM=PRIMARY # Change from SECONDARY to PRIMARY
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Recreate the Cortex container** (a plain `restart` does not reload environment variables):
|
||||||
|
```bash
|
||||||
|
docker-compose up -d cortex
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Verify in logs:**
|
||||||
|
```bash
|
||||||
|
docker logs cortex | grep "Backend"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Adding Diagnostic Logging
|
||||||
|
|
||||||
|
**Example: Log every exchange addition**
|
||||||
|
|
||||||
|
1. **Edit `cortex/intake/intake.py`:**
|
||||||
|
```python
def add_exchange_internal(exchange: dict):
    session_id = exchange.get("session_id")

    # Add detailed logging
    print(f"[DEBUG] Adding exchange to {session_id}")
    print(f"[DEBUG] User msg: {exchange.get('user_msg', '')[:100]}")
    print(f"[DEBUG] Assistant msg: {exchange.get('assistant_msg', '')[:100]}")

    # ... rest of function
```
|
||||||
|
|
||||||
|
2. **View logs:**
|
||||||
|
```bash
|
||||||
|
docker logs cortex -f | grep DEBUG
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Debugging Guide
|
||||||
|
|
||||||
|
### Problem: SESSIONS Not Persisting
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- `/debug/sessions` shows empty or only 1 exchange
|
||||||
|
- Summaries always return empty
|
||||||
|
- Buffer size doesn't increase
|
||||||
|
|
||||||
|
**Diagnosis Steps:**
|
||||||
|
1. Check Cortex logs for SESSIONS object ID:
|
||||||
|
```bash
|
||||||
|
docker logs cortex | grep "SESSIONS object id"
|
||||||
|
```
|
||||||
|
- Should show same ID across all calls
|
||||||
|
- If IDs differ → module reloading issue
|
||||||
|
|
||||||
|
2. Verify single-worker mode:
|
||||||
|
```bash
|
||||||
|
docker exec cortex cat Dockerfile | grep uvicorn
|
||||||
|
```
|
||||||
|
- The command should either omit the `--workers` flag or set `--workers 1`
|
||||||
|
|
||||||
|
3. Check `/debug/sessions` endpoint:
|
||||||
|
```bash
|
||||||
|
curl http://localhost:7081/debug/sessions | jq
|
||||||
|
```
|
||||||
|
- Should show sessions_object_id and current sessions
|
||||||
|
|
||||||
|
4. Inspect `__init__.py` exists:
|
||||||
|
```bash
|
||||||
|
docker exec cortex ls -la intake/__init__.py
|
||||||
|
```
|
||||||
|
|
||||||
|
**Solution (Fixed in v0.5.1):**
|
||||||
|
- Ensure `cortex/intake/__init__.py` exists with proper exports
|
||||||
|
- Verify `bg_summarize()` is implemented (not just TYPE_CHECKING stub)
|
||||||
|
- Check `/ingest` endpoint doesn't have early return
|
||||||
|
- Rebuild and recreate the Cortex container: `docker-compose build cortex && docker-compose up -d cortex` (`restart` alone keeps running the old image)
|
||||||
|
|
||||||
|
### Problem: LLM Backend Timeout
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Cortex `/reason` hangs
|
||||||
|
- 504 Gateway Timeout errors
|
||||||
|
- Logs show "waiting for LLM response"
|
||||||
|
|
||||||
|
**Diagnosis Steps:**
|
||||||
|
1. Test backend directly:
|
||||||
|
```bash
# llama.cpp
curl http://10.0.0.44:8080/health

# Ollama
curl http://10.0.0.3:11434/api/tags

# OpenAI
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```
|
||||||
|
|
||||||
|
2. Check network connectivity:
|
||||||
|
```bash
|
||||||
|
docker exec cortex ping -c 3 10.0.0.44
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Review Cortex logs:
|
||||||
|
```bash
|
||||||
|
docker logs cortex -f | grep "LLM"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Solutions:**
|
||||||
|
- Verify backend URL in `.env` is correct and accessible
|
||||||
|
- Check firewall rules for backend ports
|
||||||
|
- Increase timeout in `cortex/llm/llm_router.py`
|
||||||
|
- Switch to a different backend temporarily, e.g. the cloud backend: `CORTEX_LLM=OPENAI`
|
||||||
|
|
||||||
|
### Problem: Docker Compose Won't Start
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- `docker-compose up -d` fails
|
||||||
|
- Container exits immediately
|
||||||
|
- "port already in use" errors
|
||||||
|
|
||||||
|
**Diagnosis Steps:**
|
||||||
|
1. Check port conflicts:
|
||||||
|
```bash
|
||||||
|
netstat -tulpn | grep -E '7078|7081|7077|5432'
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Check container logs:
|
||||||
|
```bash
|
||||||
|
docker-compose logs --tail=50
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Verify environment file:
|
||||||
|
```bash
|
||||||
|
cat .env | grep -v "^#" | grep -v "^$"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Solutions:**
|
||||||
|
- Stop conflicting services: `docker-compose down`
|
||||||
|
- Check `.env` syntax (no quotes unless necessary)
|
||||||
|
- Rebuild containers: `docker-compose build --no-cache`
|
||||||
|
- Check Docker daemon: `systemctl status docker`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Testing Checklist
|
||||||
|
|
||||||
|
### After Making Changes to Cortex
|
||||||
|
|
||||||
|
**1. Build and restart:**
|
||||||
|
```bash
|
||||||
|
docker-compose build cortex
|
||||||
|
docker-compose up -d cortex  # recreate the container from the rebuilt image
|
||||||
|
```
|
||||||
|
|
||||||
|
**2. Verify service health:**
|
||||||
|
```bash
|
||||||
|
curl http://localhost:7081/health
|
||||||
|
```
|
||||||
|
|
||||||
|
**3. Test /ingest endpoint:**
|
||||||
|
```bash
curl -X POST http://localhost:7081/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "test",
    "user_msg": "Hello",
    "assistant_msg": "Hi there!"
  }'
```
|
||||||
|
|
||||||
|
**4. Verify SESSIONS updated:**
|
||||||
|
```bash
|
||||||
|
curl http://localhost:7081/debug/sessions | jq '.sessions.test.buffer_size'
|
||||||
|
```
|
||||||
|
- Should show 1 (or increment if already populated)
|
||||||
|
|
||||||
|
**5. Test summarization:**
|
||||||
|
```bash
|
||||||
|
curl "http://localhost:7081/debug/summary?session_id=test" | jq '.summary'
|
||||||
|
```
|
||||||
|
- Should return L1/L5/L10/L20/L30 summaries
|
||||||
|
|
||||||
|
**6. Test full pipeline:**
|
||||||
|
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Test message"}],
    "session_id": "test"
  }' | jq '.choices[0].message.content'
```
|
||||||
|
|
||||||
|
**7. Check logs for errors:**
|
||||||
|
```bash
|
||||||
|
docker logs cortex --tail=50
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Project History & Context

### Evolution Timeline

**v0.1.x (2025-09-23 to 2025-09-25)**
- Initial MVP: Relay + Mem0 + Ollama
- Basic memory storage and retrieval
- Simple UI with session support

**v0.2.x (2025-09-24 to 2025-09-30)**
- Migrated to mem0ai SDK
- Added sessionId support
- Created standalone Lyra-Mem0 stack

**v0.3.x (2025-09-26 to 2025-10-28)**
- Forked Mem0 → NVGRAM → NeoMem
- Added salience filtering
- Integrated Cortex reasoning VM
- Built RAG system (Beta Lyrae)
- Established multi-backend LLM support

**v0.4.x (2025-11-05 to 2025-11-13)**
- Major architectural rewire
- Implemented 4-stage reasoning pipeline
- Added reflection, refinement stages
- RAG integration
- LLM router with per-stage backend selection

**Infrastructure v1.0.0 (2025-11-26)**
- Consolidated 9 `.env` files into single source of truth
- Multi-backend LLM strategy
- Docker Compose consolidation
- Created security templates

**v0.5.0 (2025-11-28)**
- Fixed all critical API wiring issues
- Added OpenAI-compatible Relay endpoint
- Fixed Cortex → Intake integration
- End-to-end flow verification

**v0.5.1 (2025-12-11) - CURRENT**
- **Critical fix**: SESSIONS persistence bug
- Implemented `bg_summarize()` stub
- Fixed `/ingest` unreachable code
- Added `cortex/intake/__init__.py`
- Embedded Intake in Cortex (no longer standalone)
- Added diagnostic endpoints
- Lenient error handling
- Documented single-worker constraint

### Architectural Philosophy

**Modular Design:**
- Each service has a single, clear responsibility
- Services communicate via well-defined HTTP APIs
- Configuration is centralized but allows per-service overrides

**Local-First:**
- No reliance on external services (except optional OpenAI)
- All data stored locally (PostgreSQL + Neo4j)
- Can run entirely air-gapped with local LLMs

**Flexible LLM Backend:**
- Not tied to any single LLM provider
- Can mix local and cloud models
- Per-stage backend selection for optimal performance/cost

**Error Handling:**
- Lenient mode: Never fail the chat pipeline
- Log errors but continue processing
- Graceful degradation
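
The lenient mode described above can be illustrated with a small decorator. This is only a sketch under assumptions: the `lenient` decorator, the logger name, and the `ingest` function shown here are hypothetical and are not taken from the Cortex code base. The idea is that any exception is logged with a traceback and replaced by a safe default, so the caller never sees a failure.

```python
import functools
import logging

logger = logging.getLogger("cortex.lenient")  # hypothetical logger name


def lenient(default):
    """Decorator: log any exception and return `default` instead of raising."""
    def wrap(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Log the full traceback so the failure is never silent.
                logger.exception("%s failed; continuing", func.__name__)
                return default
        return inner
    return wrap


@lenient(default={"ok": True, "stored": False})
def ingest(payload):
    # Raises KeyError when a field is missing; the decorator absorbs it.
    return {"ok": True, "stored": bool(payload["user_msg"])}
```

With this shape the error is still recorded in the logs, while the chat pipeline keeps receiving a well-formed response.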

**Observability:**
- Diagnostic endpoints for debugging
- Verbose logging mode
- Object ID tracking for singleton verification

---

## Known Issues & Limitations

### Fixed in v0.5.1
- ✅ Intake SESSIONS not persisting → **FIXED**
- ✅ `bg_summarize()` NameError → **FIXED**
- ✅ `/ingest` endpoint unreachable code → **FIXED**

### Current Limitations

**1. Single-Worker Constraint**
- Cortex must run with single Uvicorn worker
- SESSIONS is in-memory module-level global
- Multi-worker support requires Redis or shared storage
- Documented in `cortex/Dockerfile` lines 7-8
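
To make the single-worker constraint concrete, here is a minimal sketch of a module-level session store. The names `add_exchange` and `MAX_EXCHANGES`, and the shape of each stored entry, are assumptions for illustration; the real logic lives in `cortex/intake/intake.py` and may differ. Only the per-session cap of 200 comes from this document.

```python
from collections import deque

# Module-level global: every Python process gets its own copy.
SESSIONS = {}          # session_id -> deque of exchanges
MAX_EXCHANGES = 200    # per-session cap mentioned in this document


def add_exchange(session_id, user_msg, assistant_msg):
    """Append an exchange to the session buffer and return its new length."""
    buffer = SESSIONS.setdefault(session_id, deque(maxlen=MAX_EXCHANGES))
    buffer.append({"user_msg": user_msg, "assistant_msg": assistant_msg})
    return len(buffer)
```

Because each Uvicorn worker is a separate process, two workers would each hold an independent `SESSIONS` dictionary, and a request served by one worker would not see exchanges stored by the other. That is why shared storage such as Redis is needed before running more than one worker.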

**2. NeoMem Integration Incomplete**
- Relay doesn't yet push to NeoMem after responses
- Memory storage planned for v0.5.2
- Currently all memory is short-term (SESSIONS only)

**3. RAG Service Disabled**
- Beta Lyrae (RAG) commented out in docker-compose.yml
- Awaiting re-enablement after Intake stabilization
- Code exists but not currently integrated

**4. Session Management**
- No session cleanup/expiration
- SESSIONS grows unbounded (maxlen=200 per session, but infinite sessions)
- No session list endpoint in Relay
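
One possible way to bound the number of sessions is idle-time expiration. The sketch below is an assumption about how such cleanup could work, not something that exists in the project: `LAST_SEEN`, `touch`, and `prune_idle_sessions` are hypothetical names, and `SESSIONS` here is a stand-in for the real global.

```python
import time
from collections import deque

SESSIONS = {}    # session_id -> deque of exchanges (stand-in for the real global)
LAST_SEEN = {}   # session_id -> last activity timestamp (hypothetical helper state)


def touch(session_id, now=None):
    """Record activity for a session, creating its buffer if needed."""
    now = time.time() if now is None else now
    SESSIONS.setdefault(session_id, deque(maxlen=200))
    LAST_SEEN[session_id] = now


def prune_idle_sessions(ttl_seconds, now=None):
    """Drop sessions idle for longer than ttl_seconds; return the removed IDs."""
    now = time.time() if now is None else now
    expired = [sid for sid, seen in LAST_SEEN.items() if now - seen > ttl_seconds]
    for sid in expired:
        SESSIONS.pop(sid, None)
        LAST_SEEN.pop(sid, None)
    return expired
```

A periodic background task could call `prune_idle_sessions` with a chosen TTL. The `now` parameter exists only to make the behaviour easy to test deterministically.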

**5. Persona Integration**
- `PERSONA_ENABLED=false` in `.env`
- Persona Sidecar not fully wired
- Identity loaded but not consistently applied

### Future Enhancements

**Short-term (v0.5.2):**
- Enable NeoMem integration in Relay
- Add session cleanup/expiration
- Session list endpoint
- NeoMem health monitoring

**Medium-term (v0.6.x):**
- Re-enable RAG service
- Migrate SESSIONS to Redis for multi-worker support
- Add request correlation IDs
- Comprehensive health checks

**Long-term (v0.7.x+):**
- Persona Sidecar full integration
- Autonomous "dream" cycles (self-reflection)
- Verifier module for factual grounding
- Advanced RAG with hybrid search
- Memory consolidation strategies

---

## Troubleshooting Quick Reference

| Problem | Quick Check | Solution |
|---------|-------------|----------|
| SESSIONS empty | `curl localhost:7081/debug/sessions` | Rebuild Cortex, verify `__init__.py` exists |
| LLM timeout | `curl http://10.0.0.44:8080/health` | Check backend connectivity, increase timeout |
| Port conflict | `netstat -tulpn \| grep 7078` | Stop conflicting service or change port |
| Container crash | `docker logs cortex` | Check logs for Python errors, verify .env syntax |
| Missing package | `docker exec cortex pip list` | Rebuild container, check requirements.txt |
| 502 from Relay | `curl localhost:7081/health` | Verify Cortex is running, check docker network |

---

## API Reference (Quick)

### Relay (Port 7078)

**POST /v1/chat/completions** - OpenAI-compatible chat
```json
{
  "messages": [{"role": "user", "content": "..."}],
  "session_id": "..."
}
```

**GET /_health** - Service health
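
As a usage sketch for the Relay chat endpoint, the helper below builds the request without sending it. The function name `build_chat_request` and the `localhost` host are assumptions; only the path, the port, and the two body fields come from the reference above.

```python
import json
import urllib.request


def build_chat_request(base_url, session_id, content):
    """Build (but do not send) a POST to the Relay chat endpoint."""
    body = {
        "messages": [{"role": "user", "content": content}],
        "session_id": session_id,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send it, pass the result to `urllib.request.urlopen(...)`, for example with `build_chat_request("http://localhost:7078", "demo", "Hello")`, assuming Relay is reachable at that address.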

### Cortex (Port 7081)

**POST /reason** - Main reasoning pipeline
```json
{
  "session_id": "...",
  "user_prompt": "...",
  "temperature": 0.7 // optional
}
```

**POST /ingest** - Add exchange to SESSIONS
```json
{
  "session_id": "...",
  "user_msg": "...",
  "assistant_msg": "..."
}
```

**GET /debug/sessions** - Inspect SESSIONS state

**GET /debug/summary?session_id=X** - Test summarization

**GET /health** - Service health

### NeoMem (Port 7077)

**POST /memories** - Add memory
```json
{
  "messages": [{"role": "...", "content": "..."}],
  "user_id": "...",
  "metadata": {}
}
```

**POST /search** - Semantic search
```json
{
  "query": "...",
  "user_id": "...",
  "limit": 10
}
```

**GET /health** - Service health

---

## File Manifest (Key Files Only)

```
project-lyra/
├── .env                        # Root environment variables
├── docker-compose.yml          # Service definitions (152 lines)
├── CHANGELOG.md                # Version history (836 lines)
├── README.md                   # User documentation (610 lines)
├── PROJECT_SUMMARY.md          # This file (AI context)
│
├── cortex/                     # Reasoning engine
│   ├── Dockerfile              # Single-worker constraint documented
│   ├── requirements.txt
│   ├── .env                    # Cortex overrides
│   ├── main.py                 # FastAPI initialization
│   ├── router.py               # Routes (306 lines)
│   ├── context.py              # Context aggregation
│   │
│   ├── intake/                 # Short-term memory (embedded)
│   │   ├── __init__.py         # Package exports
│   │   └── intake.py           # Core logic (367 lines)
│   │
│   ├── reasoning/              # Reasoning pipeline
│   │   ├── reflection.py       # Meta-awareness
│   │   ├── reasoning.py        # Draft generation
│   │   └── refine.py           # Refinement
│   │
│   ├── persona/                # Personality layer
│   │   ├── speak.py            # Persona application
│   │   └── identity.py         # Persona loader
│   │
│   └── llm/                    # LLM integration
│       └── llm_router.py       # Backend selector
│
├── core/relay/                 # Orchestrator
│   ├── server.js               # Express server (Node.js)
│   └── package.json
│
├── neomem/                     # Long-term memory
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── .env                    # NeoMem overrides
│   └── main.py                 # Memory API
│
└── rag/                        # RAG system (disabled)
    ├── rag_api.py
    ├── rag_chat_import.py
    └── chromadb/
```

---

## Final Notes for AI Assistants

### What You Should Know Before Making Changes

1. **SESSIONS is sacred** - It's a module-level global in `cortex/intake/intake.py`. Don't move it, don't duplicate it, don't make it a class attribute. It must remain a singleton.

2. **Single-worker is mandatory** - Until SESSIONS is migrated to Redis, Cortex MUST run with a single Uvicorn worker. Multi-worker will cause SESSIONS to be inconsistent.

3. **Lenient error handling** - The `/ingest` endpoint and other parts of the pipeline use lenient error handling: log errors but always return success. Never fail the chat pipeline.

4. **Backend routing is environment-driven** - Don't hardcode LLM URLs. Use the `{MODULE}_LLM` environment variables and the llm_router.py system.

5. **Intake is embedded** - Don't try to make HTTP calls to Intake. Use direct Python imports: `from intake.intake import ...`

6. **Test with diagnostic endpoints** - Always use `/debug/sessions` and `/debug/summary` to verify SESSIONS behavior after changes.

7. **Follow the changelog format** - When documenting changes, use the chronological format established in CHANGELOG.md v0.5.1. Group by version, then by change type (Fixed, Added, Changed, etc.).
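
Point 4 above (environment-driven backend routing) might look roughly like the following. This is a sketch under assumptions, not the contents of `llm_router.py`: the helper names, the `LLM_<BACKEND>_URL` variable pattern, and the fallback to `PRIMARY` are all hypothetical. Only the `{MODULE}_LLM` convention and backend names such as `PRIMARY` and `SECONDARY` come from the project documentation.

```python
import os


def resolve_backend(module, env=None):
    """Return the backend name configured for a module, e.g. via INTAKE_LLM."""
    env = os.environ if env is None else env
    # Falling back to PRIMARY is an assumption made for this sketch.
    return env.get(f"{module.upper()}_LLM", "PRIMARY").upper()


def resolve_url(module, env=None):
    """Look up the base URL for the module's backend.

    The LLM_<BACKEND>_URL variable name is hypothetical.
    """
    env = os.environ if env is None else env
    backend = resolve_backend(module, env)
    key = f"LLM_{backend}_URL"
    if key not in env:
        raise KeyError(f"{key} is not set for backend {backend}")
    return env[key]
```

Passing an explicit `env` mapping keeps the lookup testable without touching the real process environment, and no URL is ever hardcoded in the calling code.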

### When You Need Help

- **SESSIONS issues**: Check `cortex/intake/intake.py` lines 11-14 for initialization, lines 325-366 for `add_exchange_internal()`
- **Routing issues**: Check `cortex/router.py` lines 65-189 for `/reason`, lines 201-233 for `/ingest`
- **LLM backend issues**: Check `cortex/llm/llm_router.py` for backend selection logic
- **Environment variables**: Check `.env` lines 13-40 for LLM backends, lines 28-34 for module selection

### Most Important Thing

**This project values reliability over features.** It's better to have a simple, working system than a complex, broken one. When in doubt, keep it simple, log everything, and never fail silently.

---

**End of AI Context Summary**

*This document is maintained to provide complete context for AI assistants working on Project Lyra. Last updated: v0.5.1 (2025-12-11)*
|
||||||
@@ -0,0 +1,441 @@
|
|||||||
|
├── CHANGELOG.md
|
||||||
|
├── core
|
||||||
|
│ ├── env experiments
|
||||||
|
│ ├── persona-sidecar
|
||||||
|
│ │ ├── Dockerfile
|
||||||
|
│ │ ├── package.json
|
||||||
|
│ │ ├── persona-server.js
|
||||||
|
│ │ └── personas.json
|
||||||
|
│ ├── relay
|
||||||
|
│ │ ├── Dockerfile
|
||||||
|
│ │ ├── lib
|
||||||
|
│ │ │ ├── cortex.js
|
||||||
|
│ │ │ └── llm.js
|
||||||
|
│ │ ├── package.json
|
||||||
|
│ │ ├── package-lock.json
|
||||||
|
│ │ ├── server.js
|
||||||
|
│ │ ├── sessions
|
||||||
|
│ │ │ ├── default.jsonl
|
||||||
|
│ │ │ ├── sess-6rxu7eia.json
|
||||||
|
│ │ │ ├── sess-6rxu7eia.jsonl
|
||||||
|
│ │ │ ├── sess-l08ndm60.json
|
||||||
|
│ │ │ └── sess-l08ndm60.jsonl
|
||||||
|
│ │ └── test-llm.js
|
||||||
|
│ ├── relay-backup
|
||||||
|
│ └── ui
|
||||||
|
│ ├── index.html
|
||||||
|
│ ├── manifest.json
|
||||||
|
│ └── style.css
|
||||||
|
├── cortex
|
||||||
|
│ ├── context.py
|
||||||
|
│ ├── Dockerfile
|
||||||
|
│ ├── ingest
|
||||||
|
│ │ ├── ingest_handler.py
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ └── intake_client.py
|
||||||
|
│ ├── intake
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── intake.py
|
||||||
|
│ │ └── logs
|
||||||
|
│ ├── llm
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ └── llm_router.py
|
||||||
|
│ ├── logs
|
||||||
|
│ │ ├── cortex_verbose_debug.log
|
||||||
|
│ │ └── reflections.log
|
||||||
|
│ ├── main.py
|
||||||
|
│ ├── neomem_client.py
|
||||||
|
│ ├── persona
|
||||||
|
│ │ ├── identity.py
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ └── speak.py
|
||||||
|
│ ├── rag.py
|
||||||
|
│ ├── reasoning
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── reasoning.py
|
||||||
|
│ │ ├── refine.py
|
||||||
|
│ │ └── reflection.py
|
||||||
|
│ ├── requirements.txt
|
||||||
|
│ ├── router.py
|
||||||
|
│ ├── tests
|
||||||
|
│ └── utils
|
||||||
|
│ ├── config.py
|
||||||
|
│ ├── __init__.py
|
||||||
|
│ ├── log_utils.py
|
||||||
|
│ └── schema.py
|
||||||
|
├── deprecated.env.txt
|
||||||
|
├── DEPRECATED_FILES.md
|
||||||
|
├── docker-compose.yml
|
||||||
|
├── docs
|
||||||
|
│ ├── ARCHITECTURE_v0-6-0.md
|
||||||
|
│ ├── ENVIRONMENT_VARIABLES.md
|
||||||
|
│ ├── lyra_tree.txt
|
||||||
|
│ └── PROJECT_SUMMARY.md
|
||||||
|
├── intake-logs
|
||||||
|
│ └── summaries.log
|
||||||
|
├── neomem
|
||||||
|
│ ├── _archive
|
||||||
|
│ │ └── old_servers
|
||||||
|
│ │ ├── main_backup.py
|
||||||
|
│ │ └── main_dev.py
|
||||||
|
│ ├── docker-compose.yml
|
||||||
|
│ ├── Dockerfile
|
||||||
|
│ ├── neomem
|
||||||
|
│ │ ├── api
|
||||||
|
│ │ ├── client
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── main.py
|
||||||
|
│ │ │ ├── project.py
|
||||||
|
│ │ │ └── utils.py
|
||||||
|
│ │ ├── configs
|
||||||
|
│ │ │ ├── base.py
|
||||||
|
│ │ │ ├── embeddings
|
||||||
|
│ │ │ │ ├── base.py
|
||||||
|
│ │ │ │ └── __init__.py
|
||||||
|
│ │ │ ├── enums.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── llms
|
||||||
|
│ │ │ │ ├── anthropic.py
|
||||||
|
│ │ │ │ ├── aws_bedrock.py
|
||||||
|
│ │ │ │ ├── azure.py
|
||||||
|
│ │ │ │ ├── base.py
|
||||||
|
│ │ │ │ ├── deepseek.py
|
||||||
|
│ │ │ │ ├── __init__.py
|
||||||
|
│ │ │ │ ├── lmstudio.py
|
||||||
|
│ │ │ │ ├── ollama.py
|
||||||
|
│ │ │ │ ├── openai.py
|
||||||
|
│ │ │ │ └── vllm.py
|
||||||
|
│ │ │ ├── prompts.py
|
||||||
|
│ │ │ └── vector_stores
|
||||||
|
│ │ │ ├── azure_ai_search.py
|
||||||
|
│ │ │ ├── azure_mysql.py
|
||||||
|
│ │ │ ├── baidu.py
|
||||||
|
│ │ │ ├── chroma.py
|
||||||
|
│ │ │ ├── databricks.py
|
||||||
|
│ │ │ ├── elasticsearch.py
|
||||||
|
│ │ │ ├── faiss.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── langchain.py
|
||||||
|
│ │ │ ├── milvus.py
|
||||||
|
│ │ │ ├── mongodb.py
|
||||||
|
│ │ │ ├── neptune.py
|
||||||
|
│ │ │ ├── opensearch.py
|
||||||
|
│ │ │ ├── pgvector.py
|
||||||
|
│ │ │ ├── pinecone.py
|
||||||
|
│ │ │ ├── qdrant.py
|
||||||
|
│ │ │ ├── redis.py
|
||||||
|
│ │ │ ├── s3_vectors.py
|
||||||
|
│ │ │ ├── supabase.py
|
||||||
|
│ │ │ ├── upstash_vector.py
|
||||||
|
│ │ │ ├── valkey.py
|
||||||
|
│ │ │ ├── vertex_ai_vector_search.py
|
||||||
|
│ │ │ └── weaviate.py
|
||||||
|
│ │ ├── core
|
||||||
|
│ │ ├── embeddings
|
||||||
|
│ │ │ ├── aws_bedrock.py
|
||||||
|
│ │ │ ├── azure_openai.py
|
||||||
|
│ │ │ ├── base.py
|
||||||
|
│ │ │ ├── configs.py
|
||||||
|
│ │ │ ├── gemini.py
|
||||||
|
│ │ │ ├── huggingface.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── langchain.py
|
||||||
|
│ │ │ ├── lmstudio.py
|
||||||
|
│ │ │ ├── mock.py
|
||||||
|
│ │ │ ├── ollama.py
|
||||||
|
│ │ │ ├── openai.py
|
||||||
|
│ │ │ ├── together.py
|
||||||
|
│ │ │ └── vertexai.py
|
||||||
|
│ │ ├── exceptions.py
|
||||||
|
│ │ ├── graphs
|
||||||
|
│ │ │ ├── configs.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── neptune
|
||||||
|
│ │ │ │ ├── base.py
|
||||||
|
│ │ │ │ ├── __init__.py
|
||||||
|
│ │ │ │ ├── neptunedb.py
|
||||||
|
│ │ │ │ └── neptunegraph.py
|
||||||
|
│ │ │ ├── tools.py
|
||||||
|
│ │ │ └── utils.py
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── LICENSE
|
||||||
|
│ │ ├── llms
|
||||||
|
│ │ │ ├── anthropic.py
|
||||||
|
│ │ │ ├── aws_bedrock.py
|
||||||
|
│ │ │ ├── azure_openai.py
|
||||||
|
│ │ │ ├── azure_openai_structured.py
|
||||||
|
│ │ │ ├── base.py
|
||||||
|
│ │ │ ├── configs.py
|
||||||
|
│ │ │ ├── deepseek.py
|
||||||
|
│ │ │ ├── gemini.py
|
||||||
|
│ │ │ ├── groq.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── langchain.py
|
||||||
|
│ │ │ ├── litellm.py
|
||||||
|
│ │ │ ├── lmstudio.py
|
||||||
|
│ │ │ ├── ollama.py
|
||||||
|
│ │ │ ├── openai.py
|
||||||
|
│ │ │ ├── openai_structured.py
|
||||||
|
│ │ │ ├── sarvam.py
|
||||||
|
│ │ │ ├── together.py
|
||||||
|
│ │ │ ├── vllm.py
|
||||||
|
│ │ │ └── xai.py
|
||||||
|
│ │ ├── memory
|
||||||
|
│ │ │ ├── base.py
|
||||||
|
│ │ │ ├── graph_memory.py
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ ├── kuzu_memory.py
|
||||||
|
│ │ │ ├── main.py
|
||||||
|
│ │ │ ├── memgraph_memory.py
|
||||||
|
│ │ │ ├── setup.py
|
||||||
|
│ │ │ ├── storage.py
|
||||||
|
│ │ │ ├── telemetry.py
|
||||||
|
│ │ │ └── utils.py
|
||||||
|
│ │ ├── proxy
|
||||||
|
│ │ │ ├── __init__.py
|
||||||
|
│ │ │ └── main.py
|
||||||
|
│ │ ├── server
|
||||||
|
│ │ │ ├── dev.Dockerfile
|
||||||
|
│ │ │ ├── docker-compose.yaml
|
||||||
|
│ │ │ ├── Dockerfile
|
||||||
|
│ │ │ ├── main_old.py
|
||||||
|
│ │ │ ├── main.py
|
||||||
|
│ │ │ ├── Makefile
|
||||||
|
│ │ │ ├── README.md
|
||||||
|
│ │ │ └── requirements.txt
|
||||||
|
│ │ ├── storage
|
||||||
|
│ │ ├── utils
|
||||||
|
│ │ │ └── factory.py
|
||||||
|
│ │ └── vector_stores
|
||||||
|
│ │ ├── azure_ai_search.py
|
||||||
|
│ │ ├── azure_mysql.py
|
||||||
|
│ │ ├── baidu.py
|
||||||
|
│ │ ├── base.py
|
||||||
|
│ │ ├── chroma.py
|
||||||
|
│ │ ├── configs.py
|
||||||
|
│ │ ├── databricks.py
|
||||||
|
│ │ ├── elasticsearch.py
|
||||||
|
│ │ ├── faiss.py
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── langchain.py
|
||||||
|
│ │ ├── milvus.py
|
||||||
|
│ │ ├── mongodb.py
|
||||||
|
│ │ ├── neptune_analytics.py
|
||||||
|
│ │ ├── opensearch.py
|
||||||
|
│ │ ├── pgvector.py
|
||||||
|
│ │ ├── pinecone.py
|
||||||
|
│ │ ├── qdrant.py
|
||||||
|
│ │ ├── redis.py
|
||||||
|
│ │ ├── s3_vectors.py
|
||||||
|
│ │ ├── supabase.py
|
||||||
|
│ │ ├── upstash_vector.py
|
||||||
|
│ │ ├── valkey.py
|
||||||
|
│ │ ├── vertex_ai_vector_search.py
|
||||||
|
│ │ └── weaviate.py
|
||||||
|
│ ├── neomem_history
|
||||||
|
│ │ └── history.db
|
||||||
|
│ ├── pyproject.toml
|
||||||
|
│ ├── README.md
|
||||||
|
│ └── requirements.txt
|
||||||
|
├── neomem_history
|
||||||
|
│ └── history.db
|
||||||
|
├── rag
|
||||||
|
│ ├── chatlogs
|
||||||
|
│ │ └── lyra
|
||||||
|
│ │ ├── 0000_Wire_ROCm_to_Cortex.json
|
||||||
|
│ │ ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
|
||||||
|
│ │ ├── 0002_cortex_LLMs_11-1-25.json
|
||||||
|
│ │ ├── 0003_RAG_beta.json
|
||||||
|
│ │ ├── 0005_Cortex_v0_4_0_planning.json
|
||||||
|
│ │ ├── 0006_Cortex_v0_4_0_Refinement.json
|
||||||
|
│ │ ├── 0009_Branch___Cortex_v0_4_0_planning.json
|
||||||
|
│ │ ├── 0012_Cortex_4_-_neomem_11-1-25.json
|
||||||
|
│ │ ├── 0016_Memory_consolidation_concept.json
|
||||||
|
│ │ ├── 0017_Model_inventory_review.json
|
||||||
|
│ │ ├── 0018_Branch___Memory_consolidation_concept.json
|
||||||
|
│ │ ├── 0022_Branch___Intake_conversation_summaries.json
|
||||||
|
│ │ ├── 0026_Intake_conversation_summaries.json
|
||||||
|
│ │ ├── 0027_Trilium_AI_LLM_setup.json
|
||||||
|
│ │ ├── 0028_LLMs_and_sycophancy_levels.json
|
||||||
|
│ │ ├── 0031_UI_improvement_plan.json
|
||||||
|
│ │ ├── 0035_10_27-neomem_update.json
|
||||||
|
│ │ ├── 0044_Install_llama_cpp_on_ct201.json
|
||||||
|
│ │ ├── 0045_AI_task_assistant.json
|
||||||
|
│ │ ├── 0047_Project_scope_creation.json
|
||||||
|
│ │ ├── 0052_View_docker_container_logs.json
|
||||||
|
│ │ ├── 0053_10_21-Proxmox_fan_control.json
|
||||||
|
│ │ ├── 0054_10_21-pytorch_branch_Quant_experiments.json
|
||||||
|
│ │ ├── 0055_10_22_ct201branch-ssh_tut.json
|
||||||
|
│ │ ├── 0060_Lyra_project_folder_issue.json
|
||||||
|
│ │ ├── 0062_Build_pytorch_API.json
|
||||||
|
│ │ ├── 0063_PokerBrain_dataset_structure.json
|
||||||
|
│ │ ├── 0065_Install_PyTorch_setup.json
|
||||||
|
│ │ ├── 0066_ROCm_PyTorch_setup_quirks.json
|
||||||
|
│ │ ├── 0067_VM_model_setup_steps.json
|
||||||
|
│ │ ├── 0070_Proxmox_disk_error_fix.json
|
||||||
|
│ │ ├── 0072_Docker_Compose_vs_Portainer.json
|
||||||
|
│ │ ├── 0073_Check_system_temps_Proxmox.json
|
||||||
|
│ │ ├── 0075_Cortex_gpu_progress.json
|
||||||
|
│ │ ├── 0076_Backup_Proxmox_before_upgrade.json
|
||||||
|
│ │ ├── 0077_Storage_cleanup_advice.json
|
||||||
|
│ │ ├── 0082_Install_ROCm_on_Proxmox.json
|
||||||
|
│ │ ├── 0088_Thalamus_program_summary.json
|
||||||
|
│ │ ├── 0094_Cortex_blueprint_development.json
|
||||||
|
│ │ ├── 0095_mem0_advancments.json
|
||||||
|
│ │ ├── 0096_Embedding_provider_swap.json
|
||||||
|
│ │ ├── 0097_Update_git_commit_steps.json
|
||||||
|
│ │ ├── 0098_AI_software_description.json
|
||||||
|
│ │ ├── 0099_Seed_memory_process.json
|
||||||
|
│ │ ├── 0100_Set_up_Git_repo.json
|
||||||
|
│ │ ├── 0101_Customize_embedder_setup.json
|
||||||
|
│ │ ├── 0102_Seeding_Local_Lyra_memory.json
|
||||||
|
│ │ ├── 0103_Mem0_seeding_part_3.json
|
||||||
|
│ │ ├── 0104_Memory_build_prompt.json
|
||||||
|
│ │ ├── 0105_Git_submodule_setup_guide.json
|
||||||
|
│ │ ├── 0106_Serve_UI_on_LAN.json
|
||||||
|
│ │ ├── 0107_AI_name_suggestion.json
|
||||||
|
│ │ ├── 0108_Room_X_planning_update.json
|
||||||
|
│ │ ├── 0109_Salience_filtering_design.json
|
||||||
|
│ │ ├── 0110_RoomX_Cortex_build.json
|
||||||
|
│ │ ├── 0119_Explain_Lyra_cortex_idea.json
|
||||||
|
│ │ ├── 0120_Git_submodule_organization.json
|
||||||
|
│ │ ├── 0121_Web_UI_fix_guide.json
|
||||||
|
│ │ ├── 0122_UI_development_planning.json
|
||||||
|
│ │ ├── 0123_NVGRAM_debugging_steps.json
|
||||||
|
│ │ ├── 0124_NVGRAM_setup_troubleshooting.json
|
||||||
|
│ │ ├── 0125_NVGRAM_development_update.json
|
||||||
|
│ │ ├── 0126_RX_-_NeVGRAM_New_Features.json
|
||||||
|
│ │ ├── 0127_Error_troubleshooting_steps.json
|
||||||
|
│ │ ├── 0135_Proxmox_backup_with_ABB.json
|
||||||
|
│ │ ├── 0151_Auto-start_Lyra-Core_VM.json
|
||||||
|
│ │ ├── 0156_AI_GPU_benchmarks_comparison.json
|
||||||
|
│ │ └── 0251_Lyra_project_handoff.json
|
||||||
|
│ ├── chromadb
|
||||||
|
│ │ ├── c4f701ee-1978-44a1-9df4-3e865b5d33c1
|
||||||
|
│ │ │ ├── data_level0.bin
|
||||||
|
│ │ │ ├── header.bin
|
||||||
|
│ │ │ ├── index_metadata.pickle
|
||||||
|
│ │ │ ├── length.bin
|
||||||
|
│ │ │ └── link_lists.bin
|
||||||
|
│ │ └── chroma.sqlite3
|
||||||
|
│ ├── import.log
|
||||||
|
│ ├── lyra-chatlogs
|
||||||
|
│ │ ├── 0000_Wire_ROCm_to_Cortex.json
|
||||||
|
│ │ ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
|
||||||
|
│ │ ├── 0002_cortex_LLMs_11-1-25.json
|
||||||
|
│ │ └── 0003_RAG_beta.json
|
||||||
|
│ ├── rag_api.py
|
||||||
|
│ ├── rag_build.py
|
||||||
|
│ ├── rag_chat_import.py
|
||||||
|
│ └── rag_query.py
|
||||||
|
├── README.md
|
||||||
|
└── volumes
|
||||||
|
├── neo4j_data
|
||||||
|
│ ├── databases
|
||||||
|
│ │ ├── neo4j
|
||||||
|
│ │ │ ├── database_lock
|
||||||
|
│ │ │ ├── id-buffer.tmp.0
|
||||||
|
│ │ │ ├── neostore
|
||||||
|
│ │ │ ├── neostore.counts.db
|
||||||
|
│ │ │ ├── neostore.indexstats.db
|
||||||
|
│ │ │ ├── neostore.labeltokenstore.db
|
||||||
|
│ │ │ ├── neostore.labeltokenstore.db.id
|
||||||
|
│ │ │ ├── neostore.labeltokenstore.db.names
|
||||||
|
│ │ │ ├── neostore.labeltokenstore.db.names.id
|
||||||
|
│ │ │ ├── neostore.nodestore.db
|
||||||
|
│ │ │ ├── neostore.nodestore.db.id
|
||||||
|
│ │ │ ├── neostore.nodestore.db.labels
|
||||||
|
│ │ │ ├── neostore.nodestore.db.labels.id
|
||||||
|
│ │ │ ├── neostore.propertystore.db
|
||||||
|
│ │ │ ├── neostore.propertystore.db.arrays
|
||||||
|
│ │ │ ├── neostore.propertystore.db.arrays.id
|
||||||
|
│ │ │ ├── neostore.propertystore.db.id
|
||||||
|
│ │ │ ├── neostore.propertystore.db.index
|
||||||
|
│ │ │ ├── neostore.propertystore.db.index.id
|
||||||
|
│ │ │ ├── neostore.propertystore.db.index.keys
|
||||||
|
│ │ │ ├── neostore.propertystore.db.index.keys.id
|
||||||
|
│ │ │ ├── neostore.propertystore.db.strings
|
||||||
|
│ │ │ ├── neostore.propertystore.db.strings.id
|
||||||
|
│ │ │ ├── neostore.relationshipgroupstore.db
|
||||||
|
│ │ │ ├── neostore.relationshipgroupstore.db.id
|
||||||
|
│ │ │ ├── neostore.relationshipgroupstore.degrees.db
|
||||||
|
│ │ │ ├── neostore.relationshipstore.db
|
||||||
|
│ │ │ ├── neostore.relationshipstore.db.id
|
||||||
|
│ │ │ ├── neostore.relationshiptypestore.db
|
||||||
|
│ │ │ ├── neostore.relationshiptypestore.db.id
|
||||||
|
│ │ │ ├── neostore.relationshiptypestore.db.names
|
||||||
|
│ │ │ ├── neostore.relationshiptypestore.db.names.id
|
||||||
|
│ │ │ ├── neostore.schemastore.db
|
||||||
|
│ │ │ ├── neostore.schemastore.db.id
|
||||||
|
│ │ │ └── schema
|
||||||
|
│ │ │ └── index
|
||||||
|
│ │ │ └── token-lookup-1.0
|
||||||
|
│ │ │ ├── 1
|
||||||
|
│ │ │ │ └── index-1
|
||||||
|
│ │ │ └── 2
|
||||||
|
│ │ │ └── index-2
|
||||||
|
│ │ ├── store_lock
|
||||||
|
│ │ └── system
|
||||||
|
│ │ ├── database_lock
|
||||||
|
│ │ ├── id-buffer.tmp.0
|
||||||
|
│ │ ├── neostore
|
||||||
|
│ │ ├── neostore.counts.db
|
||||||
|
│ │ ├── neostore.indexstats.db
|
||||||
|
│ │ ├── neostore.labeltokenstore.db
|
||||||
|
│ │ ├── neostore.labeltokenstore.db.id
|
||||||
|
│ │ ├── neostore.labeltokenstore.db.names
|
||||||
|
│ │ ├── neostore.labeltokenstore.db.names.id
|
||||||
|
│ │ ├── neostore.nodestore.db
|
||||||
|
│ │ ├── neostore.nodestore.db.id
|
||||||
|
│ │ ├── neostore.nodestore.db.labels
|
||||||
|
│ │ ├── neostore.nodestore.db.labels.id
|
||||||
|
│ │ ├── neostore.propertystore.db
|
||||||
|
│ │ ├── neostore.propertystore.db.arrays
|
||||||
|
│ │ ├── neostore.propertystore.db.arrays.id
|
||||||
|
│ │ ├── neostore.propertystore.db.id
|
||||||
|
│ │ ├── neostore.propertystore.db.index
|
||||||
|
│ │ ├── neostore.propertystore.db.index.id
|
||||||
|
│ │ ├── neostore.propertystore.db.index.keys
|
||||||
|
│ │ ├── neostore.propertystore.db.index.keys.id
|
||||||
|
│ │ ├── neostore.propertystore.db.strings
|
||||||
|
│ │ ├── neostore.propertystore.db.strings.id
|
||||||
|
│ │ ├── neostore.relationshipgroupstore.db
|
||||||
|
│ │ ├── neostore.relationshipgroupstore.db.id
|
||||||
|
│ │ ├── neostore.relationshipgroupstore.degrees.db
|
||||||
|
│ │ ├── neostore.relationshipstore.db
|
||||||
|
│ │ ├── neostore.relationshipstore.db.id
|
||||||
|
│ │ ├── neostore.relationshiptypestore.db
|
||||||
|
│ │ ├── neostore.relationshiptypestore.db.id
|
||||||
|
│ │ ├── neostore.relationshiptypestore.db.names
|
||||||
|
│ │ ├── neostore.relationshiptypestore.db.names.id
|
||||||
|
│ │ ├── neostore.schemastore.db
|
||||||
|
│ │ ├── neostore.schemastore.db.id
|
||||||
|
│ │ └── schema
|
||||||
|
│ │ └── index
|
||||||
|
│ │ ├── range-1.0
|
||||||
|
│ │ │ ├── 3
|
||||||
|
│ │ │ │ └── index-3
|
||||||
|
│ │ │ ├── 4
|
||||||
|
│ │ │ │ └── index-4
|
||||||
|
│ │ │ ├── 7
|
||||||
|
│ │ │ │ └── index-7
|
||||||
|
│ │ │ ├── 8
|
||||||
|
│ │ │ │ └── index-8
|
||||||
|
│ │ │ └── 9
|
||||||
|
│ │ │ └── index-9
|
||||||
|
│ │ └── token-lookup-1.0
|
||||||
|
│ │ ├── 1
|
||||||
|
│ │ │ └── index-1
|
||||||
|
│ │ └── 2
|
||||||
|
│ │ └── index-2
|
||||||
|
│ ├── dbms
|
||||||
|
│ │ └── auth.ini
|
||||||
|
│ ├── server_id
|
||||||
|
│ └── transactions
|
||||||
|
│ ├── neo4j
|
||||||
|
│ │ ├── checkpoint.0
|
||||||
|
│ │ └── neostore.transaction.db.0
|
||||||
|
│ └── system
|
||||||
|
│ ├── checkpoint.0
|
||||||
|
│ └── neostore.transaction.db.0
|
||||||
|
└── postgres_data [error opening dir]
|
||||||
-460
@@ -1,460 +0,0 @@
|
|||||||
/home/serversdown/project-lyra
|
|
||||||
├── CHANGELOG.md
|
|
||||||
├── core
|
|
||||||
│ ├── backups
|
|
||||||
│ │ ├── mem0_20250927_221040.sql
|
|
||||||
│ │ └── mem0_history_20250927_220925.tgz
|
|
||||||
│ ├── docker-compose.yml
|
|
||||||
│ ├── .env
|
|
||||||
│ ├── env experiments
|
|
||||||
│ │ ├── .env
|
|
||||||
│ │ ├── .env.local
|
|
||||||
│ │ └── .env.openai
|
|
||||||
│ ├── persona-sidecar
|
|
||||||
│ │ ├── Dockerfile
|
|
||||||
│ │ ├── package.json
|
|
||||||
│ │ ├── persona-server.js
|
|
||||||
│ │ └── personas.json
|
|
||||||
│ ├── PROJECT_SUMMARY.md
|
|
||||||
│ ├── relay
|
|
||||||
│ │ ├── Dockerfile
|
|
||||||
│ │ ├── .dockerignore
|
|
||||||
│ │ ├── lib
|
|
||||||
│ │ │ ├── cortex.js
|
|
||||||
│ │ │ └── llm.js
|
|
||||||
│ │ ├── package.json
|
|
||||||
│ │ ├── package-lock.json
|
|
||||||
│ │ ├── server.js
|
|
||||||
│ │ ├── sessions
|
|
||||||
│ │ │ ├── sess-6rxu7eia.json
|
|
||||||
│ │ │ ├── sess-6rxu7eia.jsonl
|
|
||||||
│ │ │ ├── sess-l08ndm60.json
|
|
||||||
│ │ │ └── sess-l08ndm60.jsonl
|
|
||||||
│ │ └── test-llm.js
|
|
||||||
│ └── ui
|
|
||||||
│ ├── index.html
|
|
||||||
│ ├── manifest.json
|
|
||||||
│ └── style.css
|
|
||||||
├── cortex
|
|
||||||
│ ├── Dockerfile
|
|
||||||
│ ├── .env
|
|
||||||
│ ├── ingest
|
|
||||||
│ │ ├── ingest_handler.py
|
|
||||||
│ │ └── intake_client.py
|
|
||||||
│ ├── llm
|
|
||||||
│ │ ├── llm_router.py
|
|
||||||
│ │ └── resolve_llm_url.py
|
|
||||||
│ ├── logs
|
|
||||||
│ │ └── reflections.log
|
|
||||||
│ ├── main.py
|
|
||||||
│ ├── neomem_client.py
|
|
||||||
│ ├── persona
|
|
||||||
│ │ └── speak.py
|
|
||||||
│ ├── rag.py
|
|
||||||
│ ├── reasoning
|
|
||||||
│ │ ├── reasoning.py
|
|
||||||
│ │ ├── refine.py
|
|
||||||
│ │ └── reflection.py
|
|
||||||
│ ├── requirements.txt
|
|
||||||
│ ├── router.py
|
|
||||||
│ ├── tests
|
|
||||||
│ └── utils
|
|
||||||
│ ├── config.py
|
|
||||||
│ ├── log_utils.py
|
|
||||||
│ └── schema.py
|
|
||||||
├── deprecated.env.txt
|
|
||||||
├── docker-compose.yml
|
|
||||||
├── .env
|
|
||||||
├── .gitignore
|
|
||||||
├── intake
|
|
||||||
│ ├── Dockerfile
|
|
||||||
│ ├── .env
|
|
||||||
│ ├── intake.py
|
|
||||||
│ ├── logs
|
|
||||||
│ ├── requirements.txt
|
|
||||||
│ └── venv
|
|
||||||
│ ├── bin
|
|
||||||
│ │ ├── python -> python3
|
|
||||||
│ │ ├── python3 -> /usr/bin/python3
|
|
||||||
│ │ └── python3.10 -> python3
|
|
||||||
│ ├── include
|
|
||||||
│ ├── lib
|
|
||||||
│ │ └── python3.10
|
|
||||||
│ │ └── site-packages
|
|
||||||
│ ├── lib64 -> lib
|
|
||||||
│ └── pyvenv.cfg
|
|
||||||
├── intake-logs
|
|
||||||
│ └── summaries.log
|
|
||||||
├── lyra_tree.txt
|
|
||||||
├── neomem
|
|
||||||
│ ├── _archive
|
|
||||||
│ │ └── old_servers
|
|
||||||
│ │ ├── main_backup.py
|
|
||||||
│ │ └── main_dev.py
|
|
||||||
│ ├── docker-compose.yml
|
|
||||||
│ ├── Dockerfile
|
|
||||||
│ ├── .env
|
|
||||||
│ ├── .gitignore
|
|
||||||
│ ├── neomem
|
|
||||||
│ │ ├── api
|
|
||||||
│ │ ├── client
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── main.py
|
|
||||||
│ │ │ ├── project.py
|
|
||||||
│ │ │ └── utils.py
|
|
||||||
│ │ ├── configs
|
|
||||||
│ │ │ ├── base.py
|
|
||||||
│ │ │ ├── embeddings
|
|
||||||
│ │ │ │ ├── base.py
|
|
||||||
│ │ │ │ └── __init__.py
|
|
||||||
│ │ │ ├── enums.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── llms
|
|
||||||
│ │ │ │ ├── anthropic.py
|
|
||||||
│ │ │ │ ├── aws_bedrock.py
|
|
||||||
│ │ │ │ ├── azure.py
|
|
||||||
│ │ │ │ ├── base.py
|
|
||||||
│ │ │ │ ├── deepseek.py
|
|
||||||
│ │ │ │ ├── __init__.py
|
|
||||||
│ │ │ │ ├── lmstudio.py
|
|
||||||
│ │ │ │ ├── ollama.py
|
|
||||||
│ │ │ │ ├── openai.py
|
|
||||||
│ │ │ │ └── vllm.py
|
|
||||||
│ │ │ ├── prompts.py
|
|
||||||
│ │ │ └── vector_stores
|
|
||||||
│ │ │ ├── azure_ai_search.py
|
|
||||||
│ │ │ ├── azure_mysql.py
|
|
||||||
│ │ │ ├── baidu.py
|
|
||||||
│ │ │ ├── chroma.py
|
|
||||||
│ │ │ ├── databricks.py
|
|
||||||
│ │ │ ├── elasticsearch.py
|
|
||||||
│ │ │ ├── faiss.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── langchain.py
|
|
||||||
│ │ │ ├── milvus.py
|
|
||||||
│ │ │ ├── mongodb.py
|
|
||||||
│ │ │ ├── neptune.py
|
|
||||||
│ │ │ ├── opensearch.py
|
|
||||||
│ │ │ ├── pgvector.py
|
|
||||||
│ │ │ ├── pinecone.py
|
|
||||||
│ │ │ ├── qdrant.py
|
|
||||||
│ │ │ ├── redis.py
|
|
||||||
│ │ │ ├── s3_vectors.py
|
|
||||||
│ │ │ ├── supabase.py
|
|
||||||
│ │ │ ├── upstash_vector.py
|
|
||||||
│ │ │ ├── valkey.py
|
|
||||||
│ │ │ ├── vertex_ai_vector_search.py
|
|
||||||
│ │ │ └── weaviate.py
|
|
||||||
│ │ ├── core
|
|
||||||
│ │ ├── embeddings
|
|
||||||
│ │ │ ├── aws_bedrock.py
|
|
||||||
│ │ │ ├── azure_openai.py
|
|
||||||
│ │ │ ├── base.py
|
|
||||||
│ │ │ ├── configs.py
|
|
||||||
│ │ │ ├── gemini.py
|
|
||||||
│ │ │ ├── huggingface.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── langchain.py
|
|
||||||
│ │ │ ├── lmstudio.py
|
|
||||||
│ │ │ ├── mock.py
|
|
||||||
│ │ │ ├── ollama.py
|
|
||||||
│ │ │ ├── openai.py
|
|
||||||
│ │ │ ├── together.py
|
|
||||||
│ │ │ └── vertexai.py
|
|
||||||
│ │ ├── exceptions.py
|
|
||||||
│ │ ├── graphs
|
|
||||||
│ │ │ ├── configs.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── neptune
|
|
||||||
│ │ │ │ ├── base.py
|
|
||||||
│ │ │ │ ├── __init__.py
|
|
||||||
│ │ │ │ ├── neptunedb.py
|
|
||||||
│ │ │ │ └── neptunegraph.py
|
|
||||||
│ │ │ ├── tools.py
|
|
||||||
│ │ │ └── utils.py
|
|
||||||
│ │ ├── __init__.py
|
|
||||||
│ │ ├── LICENSE
|
|
||||||
│ │ ├── llms
|
|
||||||
│ │ │ ├── anthropic.py
|
|
||||||
│ │ │ ├── aws_bedrock.py
|
|
||||||
│ │ │ ├── azure_openai.py
|
|
||||||
│ │ │ ├── azure_openai_structured.py
|
|
||||||
│ │ │ ├── base.py
|
|
||||||
│ │ │ ├── configs.py
|
|
||||||
│ │ │ ├── deepseek.py
|
|
||||||
│ │ │ ├── gemini.py
|
|
||||||
│ │ │ ├── groq.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── langchain.py
|
|
||||||
│ │ │ ├── litellm.py
|
|
||||||
│ │ │ ├── lmstudio.py
|
|
||||||
│ │ │ ├── ollama.py
|
|
||||||
│ │ │ ├── openai.py
|
|
||||||
│ │ │ ├── openai_structured.py
|
|
||||||
│ │ │ ├── sarvam.py
|
|
||||||
│ │ │ ├── together.py
|
|
||||||
│ │ │ ├── vllm.py
|
|
||||||
│ │ │ └── xai.py
|
|
||||||
│ │ ├── memory
|
|
||||||
│ │ │ ├── base.py
|
|
||||||
│ │ │ ├── graph_memory.py
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ ├── kuzu_memory.py
|
|
||||||
│ │ │ ├── main.py
|
|
||||||
│ │ │ ├── memgraph_memory.py
|
|
||||||
│ │ │ ├── setup.py
|
|
||||||
│ │ │ ├── storage.py
|
|
||||||
│ │ │ ├── telemetry.py
|
|
||||||
│ │ │ └── utils.py
|
|
||||||
│ │ ├── proxy
|
|
||||||
│ │ │ ├── __init__.py
|
|
||||||
│ │ │ └── main.py
|
|
||||||
│ │ ├── server
|
|
||||||
│ │ │ ├── dev.Dockerfile
|
|
||||||
│ │ │ ├── docker-compose.yaml
|
|
||||||
│ │ │ ├── Dockerfile
|
|
||||||
│ │ │ ├── main_old.py
|
|
||||||
│ │ │ ├── main.py
|
|
||||||
│ │ │ ├── Makefile
|
|
||||||
│ │ │ ├── README.md
|
|
||||||
│ │ │ └── requirements.txt
|
|
||||||
│ │ ├── storage
|
|
||||||
│ │ ├── utils
|
|
||||||
│ │ │ └── factory.py
|
|
||||||
│ │ └── vector_stores
|
|
||||||
│ │ ├── azure_ai_search.py
|
|
||||||
│ │ ├── azure_mysql.py
|
|
||||||
│ │ ├── baidu.py
|
|
||||||
│ │ ├── base.py
|
|
||||||
│ │ ├── chroma.py
|
|
||||||
│ │ ├── configs.py
|
|
||||||
│ │ ├── databricks.py
|
|
||||||
│ │ ├── elasticsearch.py
|
|
||||||
│ │ ├── faiss.py
|
|
||||||
│ │ ├── __init__.py
|
|
||||||
│ │ ├── langchain.py
|
|
||||||
│ │ ├── milvus.py
|
|
||||||
│ │ ├── mongodb.py
|
|
||||||
│ │ ├── neptune_analytics.py
|
|
||||||
│ │ ├── opensearch.py
|
|
||||||
│ │ ├── pgvector.py
|
|
||||||
│ │ ├── pinecone.py
|
|
||||||
│ │ ├── qdrant.py
|
|
||||||
│ │ ├── redis.py
|
|
||||||
│ │ ├── s3_vectors.py
|
|
||||||
│ │ ├── supabase.py
|
|
||||||
│ │ ├── upstash_vector.py
|
|
||||||
│ │ ├── valkey.py
|
|
||||||
│ │ ├── vertex_ai_vector_search.py
|
|
||||||
│ │ └── weaviate.py
|
|
||||||
│ ├── neomem_history
|
|
||||||
│ │ └── history.db
|
|
||||||
│ ├── pyproject.toml
|
|
||||||
│ ├── README.md
|
|
||||||
│ └── requirements.txt
|
|
||||||
├── neomem_history
|
|
||||||
│ └── history.db
|
|
||||||
├── rag
|
|
||||||
│ ├── chatlogs
|
|
||||||
│ │ └── lyra
|
|
||||||
│ │ ├── 0000_Wire_ROCm_to_Cortex.json
|
|
||||||
│ │ ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
|
|
||||||
│ │ ├── 0002_cortex_LLMs_11-1-25.json
|
|
||||||
│ │ ├── 0003_RAG_beta.json
|
|
||||||
│ │ ├── 0005_Cortex_v0_4_0_planning.json
|
|
||||||
│ │ ├── 0006_Cortex_v0_4_0_Refinement.json
|
|
||||||
│ │ ├── 0009_Branch___Cortex_v0_4_0_planning.json
|
|
||||||
│ │ ├── 0012_Cortex_4_-_neomem_11-1-25.json
|
|
||||||
│ │ ├── 0016_Memory_consolidation_concept.json
|
|
||||||
│ │ ├── 0017_Model_inventory_review.json
|
|
||||||
│ │ ├── 0018_Branch___Memory_consolidation_concept.json
|
|
||||||
│ │ ├── 0022_Branch___Intake_conversation_summaries.json
|
|
||||||
│ │ ├── 0026_Intake_conversation_summaries.json
|
|
||||||
│ │ ├── 0027_Trilium_AI_LLM_setup.json
|
|
||||||
│ │ ├── 0028_LLMs_and_sycophancy_levels.json
|
|
||||||
│ │ ├── 0031_UI_improvement_plan.json
|
|
||||||
│ │ ├── 0035_10_27-neomem_update.json
|
|
||||||
│ │ ├── 0044_Install_llama_cpp_on_ct201.json
|
|
||||||
│ │ ├── 0045_AI_task_assistant.json
|
|
||||||
│ │ ├── 0047_Project_scope_creation.json
|
|
||||||
│ │ ├── 0052_View_docker_container_logs.json
|
|
||||||
│ │ ├── 0053_10_21-Proxmox_fan_control.json
|
|
||||||
│ │ ├── 0054_10_21-pytorch_branch_Quant_experiments.json
|
|
||||||
│ │ ├── 0055_10_22_ct201branch-ssh_tut.json
|
|
||||||
│ │ ├── 0060_Lyra_project_folder_issue.json
|
|
||||||
│ │ ├── 0062_Build_pytorch_API.json
|
|
||||||
│ │ ├── 0063_PokerBrain_dataset_structure.json
|
|
||||||
│ │ ├── 0065_Install_PyTorch_setup.json
|
|
||||||
│ │ ├── 0066_ROCm_PyTorch_setup_quirks.json
|
|
||||||
│ │ ├── 0067_VM_model_setup_steps.json
|
|
||||||
│ │ ├── 0070_Proxmox_disk_error_fix.json
|
|
||||||
│ │ ├── 0072_Docker_Compose_vs_Portainer.json
|
|
||||||
│ │ ├── 0073_Check_system_temps_Proxmox.json
|
|
||||||
│ │ ├── 0075_Cortex_gpu_progress.json
|
|
||||||
│ │ ├── 0076_Backup_Proxmox_before_upgrade.json
|
|
||||||
│ │ ├── 0077_Storage_cleanup_advice.json
|
|
||||||
│ │ ├── 0082_Install_ROCm_on_Proxmox.json
|
|
||||||
│ │ ├── 0088_Thalamus_program_summary.json
|
|
||||||
│ │ ├── 0094_Cortex_blueprint_development.json
|
|
||||||
│ │ ├── 0095_mem0_advancments.json
|
|
||||||
│ │ ├── 0096_Embedding_provider_swap.json
|
|
||||||
│ │ ├── 0097_Update_git_commit_steps.json
|
|
||||||
│ │ ├── 0098_AI_software_description.json
|
|
||||||
│ │ ├── 0099_Seed_memory_process.json
|
|
||||||
│ │ ├── 0100_Set_up_Git_repo.json
|
|
||||||
│ │ ├── 0101_Customize_embedder_setup.json
|
|
||||||
│ │ ├── 0102_Seeding_Local_Lyra_memory.json
|
|
||||||
│ │ ├── 0103_Mem0_seeding_part_3.json
|
|
||||||
│ │ ├── 0104_Memory_build_prompt.json
|
|
||||||
│ │ ├── 0105_Git_submodule_setup_guide.json
|
|
||||||
│ │ ├── 0106_Serve_UI_on_LAN.json
|
|
||||||
│ │ ├── 0107_AI_name_suggestion.json
|
|
||||||
│ │ ├── 0108_Room_X_planning_update.json
|
|
||||||
│ │ ├── 0109_Salience_filtering_design.json
|
|
||||||
│ │ ├── 0110_RoomX_Cortex_build.json
|
|
||||||
│ │ ├── 0119_Explain_Lyra_cortex_idea.json
|
|
||||||
│ │ ├── 0120_Git_submodule_organization.json
|
|
||||||
│ │ ├── 0121_Web_UI_fix_guide.json
|
|
||||||
│ │ ├── 0122_UI_development_planning.json
|
|
||||||
│ │ ├── 0123_NVGRAM_debugging_steps.json
|
|
||||||
│ │ ├── 0124_NVGRAM_setup_troubleshooting.json
|
|
||||||
│ │ ├── 0125_NVGRAM_development_update.json
|
|
||||||
│ │ ├── 0126_RX_-_NeVGRAM_New_Features.json
|
|
||||||
│ │ ├── 0127_Error_troubleshooting_steps.json
|
|
||||||
│ │ ├── 0135_Proxmox_backup_with_ABB.json
|
|
||||||
│ │ ├── 0151_Auto-start_Lyra-Core_VM.json
|
|
||||||
│ │ ├── 0156_AI_GPU_benchmarks_comparison.json
|
|
||||||
│ │ └── 0251_Lyra_project_handoff.json
|
|
||||||
│ ├── chromadb
|
|
||||||
│ │ ├── c4f701ee-1978-44a1-9df4-3e865b5d33c1
|
|
||||||
│ │ │ ├── data_level0.bin
|
|
||||||
│ │ │ ├── header.bin
|
|
||||||
│ │ │ ├── index_metadata.pickle
|
|
||||||
│ │ │ ├── length.bin
|
|
||||||
│ │ │ └── link_lists.bin
|
|
||||||
│ │ └── chroma.sqlite3
|
|
||||||
│ ├── .env
|
|
||||||
│ ├── import.log
|
|
||||||
│ ├── lyra-chatlogs
|
|
||||||
│ │ ├── 0000_Wire_ROCm_to_Cortex.json
|
|
||||||
│ │ ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
|
|
||||||
│ │ ├── 0002_cortex_LLMs_11-1-25.json
|
|
||||||
│ │ └── 0003_RAG_beta.json
|
|
||||||
│ ├── rag_api.py
|
|
||||||
│ ├── rag_build.py
|
|
||||||
│ ├── rag_chat_import.py
|
|
||||||
│ └── rag_query.py
|
|
||||||
├── README.md
|
|
||||||
├── vllm-mi50.md
|
|
||||||
└── volumes
|
|
||||||
├── neo4j_data
|
|
||||||
│ ├── databases
|
|
||||||
│ │ ├── neo4j
|
|
||||||
│ │ │ ├── database_lock
|
|
||||||
│ │ │ ├── id-buffer.tmp.0
|
|
||||||
│ │ │ ├── neostore
|
|
||||||
│ │ │ ├── neostore.counts.db
|
|
||||||
│ │ │ ├── neostore.indexstats.db
|
|
||||||
│ │ │ ├── neostore.labeltokenstore.db
|
|
||||||
│ │ │ ├── neostore.labeltokenstore.db.id
|
|
||||||
│ │ │ ├── neostore.labeltokenstore.db.names
|
|
||||||
│ │ │ ├── neostore.labeltokenstore.db.names.id
|
|
||||||
│ │ │ ├── neostore.nodestore.db
|
|
||||||
│ │ │ ├── neostore.nodestore.db.id
|
|
||||||
│ │ │ ├── neostore.nodestore.db.labels
|
|
||||||
│ │ │ ├── neostore.nodestore.db.labels.id
|
|
||||||
│ │ │ ├── neostore.propertystore.db
|
|
||||||
│ │ │ ├── neostore.propertystore.db.arrays
|
|
||||||
│ │ │ ├── neostore.propertystore.db.arrays.id
|
|
||||||
│ │ │ ├── neostore.propertystore.db.id
|
|
||||||
│ │ │ ├── neostore.propertystore.db.index
|
|
||||||
│ │ │ ├── neostore.propertystore.db.index.id
|
|
||||||
│ │ │ ├── neostore.propertystore.db.index.keys
|
|
||||||
│ │ │ ├── neostore.propertystore.db.index.keys.id
|
|
||||||
│ │ │ ├── neostore.propertystore.db.strings
|
|
||||||
│ │ │ ├── neostore.propertystore.db.strings.id
|
|
||||||
│ │ │ ├── neostore.relationshipgroupstore.db
|
|
||||||
│ │ │ ├── neostore.relationshipgroupstore.db.id
|
|
||||||
│ │ │ ├── neostore.relationshipgroupstore.degrees.db
|
|
||||||
│ │ │ ├── neostore.relationshipstore.db
|
|
||||||
│ │ │ ├── neostore.relationshipstore.db.id
|
|
||||||
│ │ │ ├── neostore.relationshiptypestore.db
|
|
||||||
│ │ │ ├── neostore.relationshiptypestore.db.id
|
|
||||||
│ │ │ ├── neostore.relationshiptypestore.db.names
|
|
||||||
│ │ │ ├── neostore.relationshiptypestore.db.names.id
|
|
||||||
│ │ │ ├── neostore.schemastore.db
|
|
||||||
│ │ │ ├── neostore.schemastore.db.id
|
|
||||||
│ │ │ └── schema
|
|
||||||
│ │ │ └── index
|
|
||||||
│ │ │ └── token-lookup-1.0
|
|
||||||
│ │ │ ├── 1
|
|
||||||
│ │ │ │ └── index-1
|
|
||||||
│ │ │ └── 2
|
|
||||||
│ │ │ └── index-2
|
|
||||||
│ │ ├── store_lock
|
|
||||||
│ │ └── system
|
|
||||||
│ │ ├── database_lock
|
|
||||||
│ │ ├── id-buffer.tmp.0
|
|
||||||
│ │ ├── neostore
|
|
||||||
│ │ ├── neostore.counts.db
|
|
||||||
│ │ ├── neostore.indexstats.db
|
|
||||||
│ │ ├── neostore.labeltokenstore.db
|
|
||||||
│ │ ├── neostore.labeltokenstore.db.id
|
|
||||||
│ │ ├── neostore.labeltokenstore.db.names
|
|
||||||
│ │ ├── neostore.labeltokenstore.db.names.id
|
|
||||||
│ │ ├── neostore.nodestore.db
|
|
||||||
│ │ ├── neostore.nodestore.db.id
|
|
||||||
│ │ ├── neostore.nodestore.db.labels
|
|
||||||
│ │ ├── neostore.nodestore.db.labels.id
|
|
||||||
│ │ ├── neostore.propertystore.db
|
|
||||||
│ │ ├── neostore.propertystore.db.arrays
|
|
||||||
│ │ ├── neostore.propertystore.db.arrays.id
|
|
||||||
│ │ ├── neostore.propertystore.db.id
|
|
||||||
│ │ ├── neostore.propertystore.db.index
|
|
||||||
│ │ ├── neostore.propertystore.db.index.id
|
|
||||||
│ │ ├── neostore.propertystore.db.index.keys
|
|
||||||
│ │ ├── neostore.propertystore.db.index.keys.id
|
|
||||||
│ │ ├── neostore.propertystore.db.strings
|
|
||||||
│ │ ├── neostore.propertystore.db.strings.id
|
|
||||||
│ │ ├── neostore.relationshipgroupstore.db
|
|
||||||
│ │ ├── neostore.relationshipgroupstore.db.id
|
|
||||||
│ │ ├── neostore.relationshipgroupstore.degrees.db
|
|
||||||
│ │ ├── neostore.relationshipstore.db
|
|
||||||
│ │ ├── neostore.relationshipstore.db.id
|
|
||||||
│ │ ├── neostore.relationshiptypestore.db
|
|
||||||
│ │ ├── neostore.relationshiptypestore.db.id
|
|
||||||
│ │ ├── neostore.relationshiptypestore.db.names
|
|
||||||
│ │ ├── neostore.relationshiptypestore.db.names.id
|
|
||||||
│ │ ├── neostore.schemastore.db
|
|
||||||
│ │ ├── neostore.schemastore.db.id
|
|
||||||
│ │ └── schema
|
|
||||||
│ │ └── index
|
|
||||||
│ │ ├── range-1.0
|
|
||||||
│ │ │ ├── 3
|
|
||||||
│ │ │ │ └── index-3
|
|
||||||
│ │ │ ├── 4
|
|
||||||
│ │ │ │ └── index-4
|
|
||||||
│ │ │ ├── 7
|
|
||||||
│ │ │ │ └── index-7
|
|
||||||
│ │ │ ├── 8
|
|
||||||
│ │ │ │ └── index-8
|
|
||||||
│ │ │ └── 9
|
|
||||||
│ │ │ └── index-9
|
|
||||||
│ │ └── token-lookup-1.0
|
|
||||||
│ │ ├── 1
|
|
||||||
│ │ │ └── index-1
|
|
||||||
│ │ └── 2
|
|
||||||
│ │ └── index-2
|
|
||||||
│ ├── dbms
|
|
||||||
│ │ └── auth.ini
|
|
||||||
│ ├── server_id
|
|
||||||
│ └── transactions
|
|
||||||
│ ├── neo4j
|
|
||||||
│ │ ├── checkpoint.0
|
|
||||||
│ │ └── neostore.transaction.db.0
|
|
||||||
│ └── system
|
|
||||||
│ ├── checkpoint.0
|
|
||||||
│ └── neostore.transaction.db.0
|
|
||||||
└── postgres_data [error opening dir]
|
|
||||||
|
|
||||||
81 directories, 376 files
|
|
||||||