autonomy, initial scaffold

This commit is contained in:
serversdwn
2025-12-11 13:12:44 -05:00
parent d5d7ea3469
commit 30f6c1a3da
9 changed files with 0 additions and 0 deletions


@@ -0,0 +1,250 @@
# Environment Variables Reference
This document describes all environment variables used across Project Lyra services.
## Quick Start
1. Copy environment templates:
```bash
cp .env.example .env
cp cortex/.env.example cortex/.env
cp neomem/.env.example neomem/.env
cp intake/.env.example intake/.env
```
2. Edit `.env` and add your credentials:
- `OPENAI_API_KEY`: Your OpenAI API key
- `POSTGRES_PASSWORD`: Database password
- `NEO4J_PASSWORD`: Graph database password
- `NEOMEM_API_KEY`: Generate a secure token
3. Update service URLs if your infrastructure differs from defaults
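
Step 2 asks for a secure `NEOMEM_API_KEY` but does not say how to produce one. A minimal sketch, assuming NeoMem accepts any sufficiently random URL-safe string (the source does not specify a token format); the helper name `generate_api_key` is illustrative only:

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from `num_bytes` of entropy."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into .env as NEOMEM_API_KEY=<token>
    print(generate_api_key())
```
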
## File Structure
### Root `.env` - Shared Infrastructure
Contains all shared configuration used by multiple services:
- LLM backend options (PRIMARY, SECONDARY, CLOUD, FALLBACK)
- Database credentials (Postgres, Neo4j)
- API keys (OpenAI)
- Internal service URLs
- Feature flags
### Service-Specific `.env` Files
Each service has minimal overrides for service-specific parameters:
- **`cortex/.env`**: Cortex operational parameters
- **`neomem/.env`**: NeoMem LLM naming convention mappings
- **`intake/.env`**: Intake summarization parameters
## Environment Loading Order
Docker Compose loads environment files in this order (later overrides earlier):
1. Root `.env`
2. Service-specific `.env` (e.g., `cortex/.env`)
This means service-specific files can override root values when needed.
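
The override rule can be modeled as a left-to-right dictionary merge in which the last layer wins on key collisions, which is how a service-specific value replaces a root one. This is a sketch of the precedence only, not of how Docker Compose parses files; `merge_env_layers` and the sample values are hypothetical:

```python
def merge_env_layers(*layers: dict[str, str]) -> dict[str, str]:
    """Merge env mappings left to right; later layers override earlier ones."""
    merged: dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical values for illustration only.
root_env = {"LLM_TEMPERATURE": "0.7", "CORTEX_LOG_LEVEL": "info"}
cortex_env = {"CORTEX_LOG_LEVEL": "debug"}

# The service-specific layer is passed last, so its value wins.
effective = merge_env_layers(root_env, cortex_env)
print(effective["CORTEX_LOG_LEVEL"])  # debug
```
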
## Global Variables (Root `.env`)
### Global Configuration
| Variable | Default | Description |
|----------|---------|-------------|
| `LOCAL_TZ_LABEL` | `America/New_York` | Timezone for logs and timestamps |
| `DEFAULT_SESSION_ID` | `default` | Default chat session identifier |
### LLM Backend Options
Each service chooses which backend to use from these available options.
#### Primary Backend (vLLM on MI50 GPU)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PRIMARY_PROVIDER` | `vllm` | Provider type |
| `LLM_PRIMARY_URL` | `http://10.0.0.43:8000` | vLLM server endpoint |
| `LLM_PRIMARY_MODEL` | `/model` | Model path for vLLM |
#### Secondary Backend (Ollama on 3090 GPU)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_SECONDARY_PROVIDER` | `ollama` | Provider type |
| `LLM_SECONDARY_URL` | `http://10.0.0.3:11434` | Ollama server endpoint |
| `LLM_SECONDARY_MODEL` | `qwen2.5:7b-instruct-q4_K_M` | Ollama model name |
#### Cloud Backend (OpenAI)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_CLOUD_PROVIDER` | `openai_chat` | Provider type |
| `LLM_CLOUD_URL` | `https://api.openai.com/v1` | OpenAI API endpoint |
| `LLM_CLOUD_MODEL` | `gpt-4o-mini` | OpenAI model to use |
| `OPENAI_API_KEY` | *required* | OpenAI API authentication key |
#### Fallback Backend (llama.cpp/LM Studio)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_FALLBACK_PROVIDER` | `openai_completions` | Provider type (llama.cpp mimics OpenAI) |
| `LLM_FALLBACK_URL` | `http://10.0.0.41:11435` | Fallback server endpoint |
| `LLM_FALLBACK_MODEL` | `llama-3.2-8b-instruct` | Fallback model name |
#### LLM Global Settings
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_TEMPERATURE` | `0.7` | Sampling temperature (0.0-2.0) |
### Database Configuration
#### PostgreSQL (with pgvector)
| Variable | Default | Description |
|----------|---------|-------------|
| `POSTGRES_USER` | `neomem` | PostgreSQL username |
| `POSTGRES_PASSWORD` | *required* | PostgreSQL password |
| `POSTGRES_DB` | `neomem` | Database name |
| `POSTGRES_HOST` | `neomem-postgres` | Container name/hostname |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
#### Neo4j Graph Database
| Variable | Default | Description |
|----------|---------|-------------|
| `NEO4J_URI` | `bolt://neomem-neo4j:7687` | Neo4j connection URI |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | *required* | Neo4j password |
| `NEO4J_AUTH` | `neo4j/<password>` | Neo4j auth string |
### Memory Services (NeoMem)
| Variable | Default | Description |
|----------|---------|-------------|
| `NEOMEM_API` | `http://neomem-api:7077` | NeoMem API endpoint |
| `NEOMEM_API_KEY` | *required* | NeoMem API authentication token |
| `NEOMEM_HISTORY_DB` | `postgresql://...` | PostgreSQL connection string for history |
| `EMBEDDER_PROVIDER` | `openai` | Embedding provider (used by NeoMem) |
| `EMBEDDER_MODEL` | `text-embedding-3-small` | Embedding model name |
### Internal Service URLs
All URLs use Docker container names for network communication:
| Variable | Default | Description |
|----------|---------|-------------|
| `INTAKE_API_URL` | `http://intake:7080` | Intake summarizer service |
| `CORTEX_API` | `http://cortex:7081` | Cortex reasoning service |
| `CORTEX_URL` | `http://cortex:7081/reflect` | Cortex reflection endpoint |
| `CORTEX_URL_INGEST` | `http://cortex:7081/ingest` | Cortex ingest endpoint |
| `RAG_API_URL` | `http://rag:7090` | RAG service (if enabled) |
| `RELAY_URL` | `http://relay:7078` | Relay orchestration service |
| `PERSONA_URL` | `http://persona-sidecar:7080/current` | Persona service (optional) |
### Feature Flags
| Variable | Default | Description |
|----------|---------|-------------|
| `CORTEX_ENABLED` | `true` | Enable Cortex autonomous reflection |
| `MEMORY_ENABLED` | `true` | Enable NeoMem long-term memory |
| `PERSONA_ENABLED` | `false` | Enable persona sidecar |
| `DEBUG_PROMPT` | `true` | Enable debug logging for prompts |
## Service-Specific Variables
### Cortex (`cortex/.env`)
Cortex operational parameters:
| Variable | Default | Description |
|----------|---------|-------------|
| `CORTEX_MODE` | `autonomous` | Operation mode (autonomous/manual) |
| `CORTEX_LOOP_INTERVAL` | `300` | Seconds between reflection loops |
| `CORTEX_REFLECTION_INTERVAL` | `86400` | Seconds between deep reflections (24h) |
| `CORTEX_LOG_LEVEL` | `debug` | Logging verbosity |
| `NEOMEM_HEALTH_CHECK_INTERVAL` | `300` | NeoMem health check frequency |
| `REFLECTION_NOTE_TARGET` | `trilium` | Where to store reflection notes |
| `REFLECTION_NOTE_PATH` | `/app/logs/reflections.log` | Reflection output path |
| `RELEVANCE_THRESHOLD` | `0.78` | Memory retrieval relevance threshold |
**Note**: Cortex uses `LLM_PRIMARY` (vLLM on MI50) by default from root `.env`.
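
The `RELEVANCE_THRESHOLD` row describes a cutoff for memory retrieval. One plausible reading, sketched below, is that retrieved memories carry a similarity score and anything under the threshold is dropped. The `score` field, the "higher is more relevant" convention, and the `filter_relevant` helper are all assumptions, not confirmed by the source:

```python
import os
from typing import Optional


def filter_relevant(memories: list[dict], threshold: Optional[float] = None) -> list[dict]:
    """Keep memories whose 'score' is at or above the threshold."""
    if threshold is None:
        # Fall back to the documented default when the variable is unset.
        threshold = float(os.getenv("RELEVANCE_THRESHOLD", "0.78"))
    return [m for m in memories if m.get("score", 0.0) >= threshold]


# Made-up candidates for illustration.
candidates = [
    {"text": "likes green tea", "score": 0.91},
    {"text": "mentioned a dentist", "score": 0.42},
]
print(filter_relevant(candidates, threshold=0.78))
```
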
### NeoMem (`neomem/.env`)
NeoMem uses different variable naming conventions:
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PROVIDER` | `ollama` | NeoMem's LLM provider name |
| `LLM_MODEL` | `qwen2.5:7b-instruct-q4_K_M` | NeoMem's LLM model |
| `LLM_API_BASE` | `http://10.0.0.3:11434` | NeoMem's LLM endpoint (Ollama) |
**Note**: NeoMem uses Ollama (SECONDARY) for reasoning and OpenAI for embeddings. Database credentials and `OPENAI_API_KEY` are inherited from root `.env`.
### Intake (`intake/.env`)
Intake summarization parameters:
| Variable | Default | Description |
|----------|---------|-------------|
| `SUMMARY_MODEL_NAME` | `/model` | Model path for summarization |
| `SUMMARY_API_URL` | `http://10.0.0.43:8000` | LLM endpoint for summaries |
| `SUMMARY_MAX_TOKENS` | `400` | Max tokens for summary generation |
| `SUMMARY_TEMPERATURE` | `0.4` | Temperature for summaries (lower = more focused) |
| `SUMMARY_INTERVAL` | `300` | Seconds between summary checks |
| `INTAKE_LOG_PATH` | `/app/logs/intake.log` | Log file location |
| `INTAKE_LOG_LEVEL` | `info` | Logging verbosity |
**Note**: Intake uses `LLM_PRIMARY` (vLLM) by default.
## Multi-Backend LLM Strategy
Project Lyra supports flexible backend selection per service:
**Root `.env` provides backend OPTIONS**:
- PRIMARY: vLLM on MI50 GPU (high performance)
- SECONDARY: Ollama on 3090 GPU (local inference)
- CLOUD: OpenAI API (cloud fallback)
- FALLBACK: llama.cpp/LM Studio (CPU-only)
**Services choose which backend to USE**:
- **Cortex** → vLLM (PRIMARY) for autonomous reasoning
- **NeoMem** → Ollama (SECONDARY) + OpenAI embeddings
- **Intake** → vLLM (PRIMARY) for summarization
- **Relay** → Implements fallback cascade with user preference
This design eliminates URL duplication while preserving per-service flexibility.
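
The fallback cascade attributed to Relay can be sketched as "try each backend in order and return the first success". Relay itself is a Node.js service, so the Python below is purely illustrative; `call_with_fallback` and the stand-in backend callables are hypothetical and do not reflect Relay's actual code:

```python
from typing import Callable


def call_with_fallback(
    prompt: str,
    backends: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # any backend failure moves on to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stand-in callables; a real client would issue HTTP requests.
def primary(prompt: str) -> str:
    raise ConnectionError("primary unreachable")


def secondary(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = call_with_fallback("hi", [("PRIMARY", primary), ("SECONDARY", secondary)])
print(used, reply)  # SECONDARY echo: hi
```

A user preference could be honored by reordering the `backends` list before the call.
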
## Security Best Practices
1. **Never commit `.env` files to git** - they contain secrets
2. **Use `.env.example` templates** for documentation and onboarding
3. **Rotate credentials regularly**, especially:
- `OPENAI_API_KEY`
- `NEOMEM_API_KEY`
- Database passwords
4. **Use strong passwords** for production databases
5. **Restrict network access** to LLM backends and databases
## Troubleshooting
### Services can't connect to each other
- Verify container names match in service URLs
- Check all services are on the `lyra_net` Docker network
- Use `docker-compose ps` to verify all services are running
### LLM calls failing
- Verify backend URLs are correct for your infrastructure
- Check if LLM servers are running and accessible
- Test with `curl <LLM_URL>/v1/models` (OpenAI-compatible APIs)
### Database connection errors
- Verify database credentials match in all locations
- Check if database containers are healthy: `docker-compose ps`
- Review database logs: `docker-compose logs neomem-postgres`
### Environment variables not loading
- Verify env_file paths in docker-compose.yml
- Check file permissions: `.env` files must be readable
- Remember loading order: service `.env` overrides root `.env`
## Migration from Old Setup
If you have the old multi-file setup with duplicated variables:
1. **Back up existing files**: All original `.env` files are kept in `.env-backups/`
2. **Copy new templates**: Use `.env.example` files as base
3. **Merge credentials**: Transfer your actual keys/passwords to new root `.env`
4. **Test thoroughly**: Verify all services start and communicate correctly
## Support
For issues or questions:
- Check logs: `docker-compose logs <service>`
- Verify configuration: `docker exec <container> env | grep <VAR>`
- Review this documentation for variable descriptions

docs/PROJECT_SUMMARY.md Normal file

@@ -0,0 +1,925 @@
# Project Lyra — Comprehensive AI Context Summary
**Version:** v0.5.1 (2025-12-11)
**Status:** Production-ready modular AI companion system
**Purpose:** Memory-backed conversational AI with multi-stage reasoning, persistent context, and modular LLM backend architecture
---
## Executive Summary
Project Lyra is a **self-hosted AI companion system** designed to overcome the limitations of typical chatbots by providing:
- **Persistent long-term memory** (NeoMem: PostgreSQL + Neo4j graph storage)
- **Multi-stage reasoning pipeline** (Cortex: reflection → reasoning → refinement → persona)
- **Short-term context management** (Intake: session-based summarization embedded in Cortex)
- **Flexible LLM backend routing** (supports llama.cpp, Ollama, OpenAI, custom endpoints)
- **OpenAI-compatible API** (drop-in replacement for chat applications)
**Core Philosophy:** Like a human brain has different regions for different functions, Lyra has specialized modules that work together. She's not just a chatbot—she's a notepad, schedule, database, co-creator, and collaborator with her own executive function.
---
## Quick Context for AI Assistants
If you're an AI being given this project to work on, here's what you need to know:
### What This Project Does
Lyra is a conversational AI system that **remembers everything** across sessions. When a user says something in passing, Lyra stores it, contextualizes it, and can recall it later. She can:
- Track project progress over time
- Remember user preferences and past conversations
- Reason through complex questions using multiple LLM calls
- Apply a consistent personality across all interactions
- Integrate with multiple LLM backends (local and cloud)
### Current Architecture (v0.5.1)
```
User → Relay (Express/Node.js, port 7078)
Cortex (FastAPI/Python, port 7081)
├─ Intake module (embedded, in-memory SESSIONS)
├─ 4-stage reasoning pipeline
└─ Multi-backend LLM router
NeoMem (FastAPI/Python, port 7077)
├─ PostgreSQL (vector storage)
└─ Neo4j (graph relationships)
```
### Key Files You'll Work With
**Backend Services:**
- [cortex/router.py](cortex/router.py) - Main Cortex routing logic (306 lines, `/reason`, `/ingest` endpoints)
- [cortex/intake/intake.py](cortex/intake/intake.py) - Short-term memory module (367 lines, SESSIONS management)
- [cortex/reasoning/reasoning.py](cortex/reasoning/reasoning.py) - Draft answer generation
- [cortex/reasoning/refine.py](cortex/reasoning/refine.py) - Answer refinement
- [cortex/reasoning/reflection.py](cortex/reasoning/reflection.py) - Meta-awareness notes
- [cortex/persona/speak.py](cortex/persona/speak.py) - Personality layer
- [cortex/llm/llm_router.py](cortex/llm/llm_router.py) - LLM backend selector
- [core/relay/server.js](core/relay/server.js) - Main orchestrator (Node.js)
- [neomem/main.py](neomem/main.py) - Long-term memory API
**Configuration:**
- [.env](.env) - Root environment variables (LLM backends, databases, API keys)
- [cortex/.env](cortex/.env) - Cortex-specific overrides
- [docker-compose.yml](docker-compose.yml) - Service definitions (152 lines)
**Documentation:**
- [CHANGELOG.md](CHANGELOG.md) - Complete version history (836 lines, chronological format)
- [README.md](README.md) - User-facing documentation (610 lines)
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md) - This file
### Recent Critical Fixes (v0.5.1)
The most recent work fixed a critical bug where Intake's SESSIONS buffer wasn't persisting:
1. **Fixed**: `bg_summarize()` was only a TYPE_CHECKING stub → implemented as logging stub
2. **Fixed**: `/ingest` endpoint had unreachable code → removed early return, added lenient error handling
3. **Added**: `cortex/intake/__init__.py` → proper Python package structure
4. **Added**: Diagnostic endpoints `/debug/sessions` and `/debug/summary` for troubleshooting
**Key Insight**: Intake is no longer a standalone service; it is embedded in Cortex as a Python module. Because SESSIONS lives in process memory, Cortex must run as a single Uvicorn worker (multi-worker support would require Redis or other shared storage).
---
## Architecture Deep Dive
### Service Topology (Docker Compose)
**Active Containers:**
1. **relay** (Node.js/Express, port 7078)
- Entry point for all user requests
- OpenAI-compatible `/v1/chat/completions` endpoint
- Routes to Cortex for reasoning
- Async calls to Cortex `/ingest` after response
2. **cortex** (Python/FastAPI, port 7081)
- Multi-stage reasoning pipeline
- Embedded Intake module (no HTTP, direct Python imports)
- Endpoints: `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary`
3. **neomem-api** (Python/FastAPI, port 7077)
- Long-term memory storage
- Fork of Mem0 OSS (fully local, no external SDK)
- Endpoints: `/memories`, `/search`, `/health`
4. **neomem-postgres** (PostgreSQL + pgvector, port 5432)
- Vector embeddings storage
- Memory history records
5. **neomem-neo4j** (Neo4j, ports 7474/7687)
- Graph relationships between memories
- Entity extraction and linking
**Disabled Services:**
- `intake` - No longer needed (embedded in Cortex as of v0.5.1)
- `rag` - Beta Lyrae RAG service (planned re-enablement)
### External LLM Backends (HTTP APIs)
**PRIMARY Backend** - llama.cpp @ `http://10.0.0.44:8080`
- AMD MI50 GPU-accelerated inference
- Model: `/model` (path-based routing)
- Used for: Reasoning, refinement, summarization
**SECONDARY Backend** - Ollama @ `http://10.0.0.3:11434`
- RTX 3090 GPU-accelerated inference
- Model: `qwen2.5:7b-instruct-q4_K_M`
- Used for: Configurable per-module
**CLOUD Backend** - OpenAI @ `https://api.openai.com/v1`
- Cloud-based inference
- Model: `gpt-4o-mini`
- Used for: Reflection, persona layers
**FALLBACK Backend** - Local @ `http://10.0.0.41:11435`
- CPU-based inference
- Model: `llama-3.2-8b-instruct`
- Used for: Emergency fallback
### Data Flow (Request Lifecycle)
```
1. User sends message → Relay (/v1/chat/completions)
2. Relay → Cortex (/reason)
3. Cortex calls Intake module (internal Python)
- Intake.summarize_context(session_id, exchanges)
- Returns L1/L5/L10/L20/L30 summaries
4. Cortex 4-stage pipeline:
a. reflection.py → Meta-awareness notes (CLOUD backend)
- "What is the user really asking?"
- Returns JSON: {"notes": [...]}
b. reasoning.py → Draft answer (PRIMARY backend)
- Uses context from Intake
- Integrates reflection notes
- Returns draft text
c. refine.py → Refined answer (PRIMARY backend)
- Polishes draft for clarity
- Ensures factual consistency
- Returns refined text
d. speak.py → Persona layer (CLOUD backend)
- Applies Lyra's personality
- Natural, conversational tone
- Returns final answer
5. Cortex → Relay (returns persona answer)
6. Relay → Cortex (/ingest) [async, non-blocking]
- Sends (session_id, user_msg, assistant_msg)
- Cortex calls add_exchange_internal()
- Appends to SESSIONS[session_id]["buffer"]
7. Relay → User (returns final response)
8. [Planned] Relay → NeoMem (/memories) [async]
- Store conversation in long-term memory
```
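
Step 4 of the lifecycle is a linear chain in which each stage consumes the previous stage's output. A minimal sketch of that shape, with stub stages standing in for the real LLM calls; the state-dict convention, the function bodies, and `run_pipeline` are assumptions for illustration, not the contents of the actual modules:

```python
from typing import Callable

# A stage takes the running state dict and returns an updated copy.
Stage = Callable[[dict], dict]


def reflect(state: dict) -> dict:
    # Stub for reflection.py: produce meta-awareness notes.
    return {**state, "notes": [f"intent of: {state['user_prompt']}"]}


def reason(state: dict) -> dict:
    # Stub for reasoning.py: draft an answer from context and notes.
    return {**state, "draft": f"draft answer using {len(state['notes'])} note(s)"}


def refine(state: dict) -> dict:
    # Stub for refine.py: polish the draft.
    return {**state, "refined": state["draft"].capitalize()}


def speak(state: dict) -> dict:
    # Stub for speak.py: apply the persona layer.
    return {**state, "final": f"Lyra: {state['refined']}"}


PIPELINE: list[Stage] = [reflect, reason, refine, speak]


def run_pipeline(user_prompt: str, context: str = "") -> dict:
    """Run every stage in order, threading the state through each one."""
    state = {"user_prompt": user_prompt, "context": context}
    for stage in PIPELINE:
        state = stage(state)
    return state


result = run_pipeline("What did I plan for Friday?")
print(result["final"])  # Lyra: Draft answer using 1 note(s)
```
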
### Intake Module Architecture (v0.5.1)
**Location:** `cortex/intake/`
**Key Change:** Intake is now **embedded in Cortex** as a Python module, not a standalone service.
**Import Pattern:**
```python
from intake.intake import add_exchange_internal, SESSIONS, summarize_context
```
**Core Data Structure:**
```python
SESSIONS: dict[str, dict] = {}

# Structure:
SESSIONS[session_id] = {
    "buffer": deque(maxlen=200),  # Circular buffer of exchanges
    "created_at": datetime
}

# Each exchange in buffer:
{
    "session_id": "...",
    "user_msg": "...",
    "assistant_msg": "...",
    "timestamp": "2025-12-11T..."
}
```
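
To make the circular-buffer behavior concrete, the sketch below fills a session past its capacity and shows the oldest exchanges being evicted. It mirrors the structure above but is not the project's `add_exchange_internal()`; the `add_exchange` helper and the small `maxlen=3` are chosen only for the demonstration:

```python
from collections import deque
from datetime import datetime, timezone

SESSIONS: dict[str, dict] = {}


def add_exchange(session_id: str, user_msg: str, assistant_msg: str, maxlen: int = 200) -> int:
    """Append an exchange, creating the session on first use; return the buffer size."""
    now = datetime.now(timezone.utc)
    if session_id not in SESSIONS:
        SESSIONS[session_id] = {"buffer": deque(maxlen=maxlen), "created_at": now}
    buffer = SESSIONS[session_id]["buffer"]
    buffer.append(
        {
            "session_id": session_id,
            "user_msg": user_msg,
            "assistant_msg": assistant_msg,
            "timestamp": now.isoformat(),
        }
    )
    return len(buffer)


# With maxlen=3, the fourth and fifth appends push out q0 and q1.
for i in range(5):
    add_exchange("demo", f"q{i}", f"a{i}", maxlen=3)

print([e["user_msg"] for e in SESSIONS["demo"]["buffer"]])  # ['q2', 'q3', 'q4']
```
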
**Functions:**
1. **`add_exchange_internal(exchange: dict)`**
- Adds exchange to SESSIONS buffer
- Creates new session if needed
- Calls `bg_summarize()` stub
- Returns `{"ok": True, "session_id": "..."}`
2. **`summarize_context(session_id: str, exchanges: list[dict])`** [async]
- Generates L1/L5/L10/L20/L30 summaries via LLM
- Called during `/reason` endpoint
- Returns multi-level summary dict
3. **`bg_summarize(session_id: str)`**
- **Stub function** - logs only, no actual work
- Defers summarization to `/reason` call
- Exists to prevent NameError
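
The source does not define what the L1/L5/L10/L20/L30 levels cover. One plausible interpretation, sketched here as an assumption, is that level `Ln` summarizes the most recent `n` exchanges. The helper below only selects those windows (the LLM call is omitted), and skipping levels the buffer cannot yet fill is likewise a guess:

```python
from collections import deque

SUMMARY_LEVELS = (1, 5, 10, 20, 30)


def summary_windows(
    buffer: deque, levels: tuple[int, ...] = SUMMARY_LEVELS
) -> dict[str, list[dict]]:
    """Map 'L<n>' to the most recent n exchanges, skipping levels the buffer cannot fill."""
    exchanges = list(buffer)
    return {f"L{n}": exchanges[-n:] for n in levels if len(exchanges) >= n}


# Seven exchanges are enough for L1 and L5 but not for L10 and above.
buf = deque(({"user_msg": f"q{i}"} for i in range(7)), maxlen=200)
windows = summary_windows(buf)
print(list(windows))  # ['L1', 'L5']
```
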
**Critical Constraint:** SESSIONS is a module-level global dict. This requires **single-worker Uvicorn** mode. Multi-worker deployments need Redis or shared storage.
**Diagnostic Endpoints:**
- `GET /debug/sessions` - Inspect all SESSIONS (object ID, buffer sizes, recent exchanges)
- `GET /debug/summary?session_id=X` - Test summarization for a session
---
## Environment Configuration
### LLM Backend Registry (Multi-Backend Strategy)
**Root `.env` defines all backend OPTIONS:**
```bash
# PRIMARY Backend (llama.cpp)
LLM_PRIMARY_PROVIDER=llama.cpp
LLM_PRIMARY_URL=http://10.0.0.44:8080
LLM_PRIMARY_MODEL=/model
# SECONDARY Backend (Ollama)
LLM_SECONDARY_PROVIDER=ollama
LLM_SECONDARY_URL=http://10.0.0.3:11434
LLM_SECONDARY_MODEL=qwen2.5:7b-instruct-q4_K_M
# CLOUD Backend (OpenAI)
LLM_OPENAI_PROVIDER=openai
LLM_OPENAI_URL=https://api.openai.com/v1
LLM_OPENAI_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-proj-...
# FALLBACK Backend
LLM_FALLBACK_PROVIDER=openai_completions
LLM_FALLBACK_URL=http://10.0.0.41:11435
LLM_FALLBACK_MODEL=llama-3.2-8b-instruct
```
**Module-specific backend selection:**
```bash
CORTEX_LLM=SECONDARY # Cortex uses Ollama
INTAKE_LLM=PRIMARY # Intake uses llama.cpp
SPEAK_LLM=OPENAI # Persona uses OpenAI
NEOMEM_LLM=PRIMARY # NeoMem uses llama.cpp
UI_LLM=OPENAI # UI uses OpenAI
RELAY_LLM=PRIMARY # Relay uses llama.cpp
```
**Philosophy:** Root `.env` provides all backend OPTIONS. Each service chooses which backend to USE via `{MODULE}_LLM` variable. This eliminates URL duplication while preserving flexibility.
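
The `{MODULE}_LLM` indirection amounts to a two-step lookup: read the module's selector, then read that backend's `LLM_<NAME>_*` entries. A minimal sketch under that assumption; `resolve_backend` is a hypothetical helper (not the project's `llm_router.py`), and the `PRIMARY` default for an unset selector is a guess:

```python
import os
from typing import Optional


def resolve_backend(module: str, env: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Look up {MODULE}_LLM, then read that backend's PROVIDER/URL/MODEL entries."""
    env = dict(os.environ) if env is None else env
    # Defaulting to PRIMARY when the selector is unset is an assumption.
    backend = env.get(f"{module.upper()}_LLM", "PRIMARY").upper()
    prefix = f"LLM_{backend}_"
    try:
        return {
            "backend": backend,
            "provider": env[prefix + "PROVIDER"],
            "url": env[prefix + "URL"],
            "model": env[prefix + "MODEL"],
        }
    except KeyError as exc:
        raise KeyError(f"backend {backend!r} is missing {exc.args[0]}") from exc


# Values copied from the registry above.
sample_env = {
    "CORTEX_LLM": "SECONDARY",
    "LLM_SECONDARY_PROVIDER": "ollama",
    "LLM_SECONDARY_URL": "http://10.0.0.3:11434",
    "LLM_SECONDARY_MODEL": "qwen2.5:7b-instruct-q4_K_M",
}
print(resolve_backend("cortex", sample_env)["url"])  # http://10.0.0.3:11434
```

Failing loudly on a selector that names an undefined backend surfaces typos at startup rather than at the first LLM call.
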
### Database Configuration
```bash
# PostgreSQL (vector storage)
POSTGRES_USER=neomem
POSTGRES_PASSWORD=neomempass
POSTGRES_DB=neomem
POSTGRES_HOST=neomem-postgres
POSTGRES_PORT=5432
# Neo4j (graph storage)
NEO4J_URI=bolt://neomem-neo4j:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=neomemgraph
```
### Service URLs (Docker Internal Network)
```bash
NEOMEM_API=http://neomem-api:7077
CORTEX_API=http://cortex:7081
CORTEX_REASON_URL=http://cortex:7081/reason
CORTEX_INGEST_URL=http://cortex:7081/ingest
RELAY_URL=http://relay:7078
```
### Feature Flags
```bash
CORTEX_ENABLED=true
MEMORY_ENABLED=true
PERSONA_ENABLED=false
DEBUG_PROMPT=true
VERBOSE_DEBUG=true
```
---
## Code Structure Overview
### Cortex Service (`cortex/`)
**Main Files:**
- `main.py` - FastAPI app initialization
- `router.py` - Route definitions (`/reason`, `/ingest`, `/health`, `/debug/*`)
- `context.py` - Context aggregation (Intake summaries, session state)
**Reasoning Pipeline (`reasoning/`):**
- `reflection.py` - Meta-awareness notes (Cloud LLM)
- `reasoning.py` - Draft answer generation (Primary LLM)
- `refine.py` - Answer refinement (Primary LLM)
**Persona Layer (`persona/`):**
- `speak.py` - Personality application (Cloud LLM)
- `identity.py` - Persona loader
**Intake Module (`intake/`):**
- `__init__.py` - Package exports (SESSIONS, add_exchange_internal, summarize_context)
- `intake.py` - Core logic (367 lines)
- SESSIONS dictionary
- add_exchange_internal()
- summarize_context()
- bg_summarize() stub
**LLM Integration (`llm/`):**
- `llm_router.py` - Backend selector and HTTP client
- call_llm() function
- Environment-based routing
- Payload formatting per backend type
**Utilities (`utils/`):**
- Helper functions for common operations
**Configuration:**
- `Dockerfile` - Single-worker constraint documented
- `requirements.txt` - Python dependencies
- `.env` - Service-specific overrides
### Relay Service (`core/relay/`)
**Main Files:**
- `server.js` - Express.js server (Node.js)
- `/v1/chat/completions` - OpenAI-compatible endpoint
- `/chat` - Internal endpoint
- `/_health` - Health check
- `package.json` - Node.js dependencies
**Key Logic:**
- Receives user messages
- Routes to Cortex `/reason`
- Async calls to Cortex `/ingest` after response
- Returns final answer to user
### NeoMem Service (`neomem/`)
**Main Files:**
- `main.py` - FastAPI app (memory API)
- `memory.py` - Memory management logic
- `embedder.py` - Embedding generation
- `graph.py` - Neo4j graph operations
- `Dockerfile` - Container definition
- `requirements.txt` - Python dependencies
**API Endpoints:**
- `POST /memories` - Add new memory
- `POST /search` - Semantic search
- `GET /health` - Service health
---
## Common Development Tasks
### Adding a New Endpoint to Cortex
**Example: Add `/debug/buffer` endpoint**
1. **Edit `cortex/router.py`:**
```python
@cortex_router.get("/debug/buffer")
async def debug_buffer(session_id: str, limit: int = 10):
    """Return last N exchanges from a session buffer."""
    from intake.intake import SESSIONS

    session = SESSIONS.get(session_id)
    if not session:
        return {"error": "session not found", "session_id": session_id}

    buffer = session["buffer"]
    recent = list(buffer)[-limit:]
    return {
        "session_id": session_id,
        "total_exchanges": len(buffer),
        "recent_exchanges": recent
    }
```
2. **Restart Cortex:**
```bash
docker-compose restart cortex
```
3. **Test:**
```bash
curl "http://localhost:7081/debug/buffer?session_id=test&limit=5"
```
### Modifying LLM Backend for a Module
**Example: Switch Cortex to use PRIMARY backend**
1. **Edit `.env`:**
```bash
CORTEX_LLM=PRIMARY # Change from SECONDARY to PRIMARY
```
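2. **Recreate Cortex** so the new value is loaded (`docker-compose restart` reuses the existing container and its old environment):
```bash
docker-compose up -d cortex
```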
3. **Verify in logs:**
```bash
docker logs cortex | grep "Backend"
```
### Adding Diagnostic Logging
**Example: Log every exchange addition**
1. **Edit `cortex/intake/intake.py`:**
```python
def add_exchange_internal(exchange: dict):
    session_id = exchange.get("session_id")
    # Add detailed logging
    print(f"[DEBUG] Adding exchange to {session_id}")
    print(f"[DEBUG] User msg: {exchange.get('user_msg', '')[:100]}")
    print(f"[DEBUG] Assistant msg: {exchange.get('assistant_msg', '')[:100]}")
    # ... rest of function
```
2. **View logs:**
```bash
docker logs cortex -f | grep DEBUG
```
---
## Debugging Guide
### Problem: SESSIONS Not Persisting
**Symptoms:**
- `/debug/sessions` shows no sessions, or a session with only one exchange
- Summaries always return empty
- Buffer size doesn't increase
**Diagnosis Steps:**
1. Check Cortex logs for SESSIONS object ID:
```bash
docker logs cortex | grep "SESSIONS object id"
```
- Should show same ID across all calls
- If IDs differ → module reloading issue
2. Verify single-worker mode:
```bash
docker exec cortex cat Dockerfile | grep uvicorn
```
- The command should either omit the `--workers` flag or set `--workers 1`
3. Check `/debug/sessions` endpoint:
```bash
curl http://localhost:7081/debug/sessions | jq
```
- Should show sessions_object_id and current sessions
4. Inspect `__init__.py` exists:
```bash
docker exec cortex ls -la intake/__init__.py
```
**Solution (Fixed in v0.5.1):**
- Ensure `cortex/intake/__init__.py` exists with proper exports
- Verify `bg_summarize()` is implemented (not just TYPE_CHECKING stub)
- Check `/ingest` endpoint doesn't have early return
- Rebuild and recreate the Cortex container: `docker-compose build cortex && docker-compose up -d cortex`
### Problem: LLM Backend Timeout
**Symptoms:**
- Cortex `/reason` hangs
- 504 Gateway Timeout errors
- Logs show "waiting for LLM response"
**Diagnosis Steps:**
1. Test backend directly:
```bash
# llama.cpp
curl http://10.0.0.44:8080/health
# Ollama
curl http://10.0.0.3:11434/api/tags
# OpenAI
curl https://api.openai.com/v1/models \
-H "Authorization: Bearer $OPENAI_API_KEY"
```
2. Check network connectivity:
```bash
docker exec cortex ping -c 3 10.0.0.44
```
3. Review Cortex logs:
```bash
docker logs cortex -f | grep "LLM"
```
**Solutions:**
- Verify backend URL in `.env` is correct and accessible
- Check firewall rules for backend ports
- Increase timeout in `cortex/llm/llm_router.py`
- Switch to a different backend temporarily, e.g. `CORTEX_LLM=OPENAI` (the value must match a backend key defined in the root `.env`)
### Problem: Docker Compose Won't Start
**Symptoms:**
- `docker-compose up -d` fails
- Container exits immediately
- "port already in use" errors
**Diagnosis Steps:**
1. Check port conflicts:
```bash
netstat -tulpn | grep -E '7078|7081|7077|5432'
```
2. Check container logs:
```bash
docker-compose logs --tail=50
```
3. Verify environment file:
```bash
cat .env | grep -v "^#" | grep -v "^$"
```
**Solutions:**
- Stop conflicting services: `docker-compose down`
- Check `.env` syntax (no quotes unless necessary)
- Rebuild containers: `docker-compose build --no-cache`
- Check Docker daemon: `systemctl status docker`
---
## Testing Checklist
### After Making Changes to Cortex
**1. Build and restart:**
```bash
docker-compose build cortex
docker-compose up -d cortex  # recreate the container so it runs the rebuilt image
```
**2. Verify service health:**
```bash
curl http://localhost:7081/health
```
**3. Test /ingest endpoint:**
```bash
curl -X POST http://localhost:7081/ingest \
-H "Content-Type: application/json" \
-d '{
"session_id": "test",
"user_msg": "Hello",
"assistant_msg": "Hi there!"
}'
```
**4. Verify SESSIONS updated:**
```bash
curl http://localhost:7081/debug/sessions | jq '.sessions.test.buffer_size'
```
- Should show 1 (or increment if already populated)
**5. Test summarization:**
```bash
curl "http://localhost:7081/debug/summary?session_id=test" | jq '.summary'
```
- Should return L1/L5/L10/L20/L30 summaries
**6. Test full pipeline:**
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Test message"}],
"session_id": "test"
}' | jq '.choices[0].message.content'
```
**7. Check logs for errors:**
```bash
docker logs cortex --tail=50
```
---
## Project History & Context
### Evolution Timeline
**v0.1.x (2025-09-23 to 2025-09-25)**
- Initial MVP: Relay + Mem0 + Ollama
- Basic memory storage and retrieval
- Simple UI with session support
**v0.2.x (2025-09-24 to 2025-09-30)**
- Migrated to mem0ai SDK
- Added sessionId support
- Created standalone Lyra-Mem0 stack
**v0.3.x (2025-09-26 to 2025-10-28)**
- Forked Mem0 → NVGRAM → NeoMem
- Added salience filtering
- Integrated Cortex reasoning VM
- Built RAG system (Beta Lyrae)
- Established multi-backend LLM support
**v0.4.x (2025-11-05 to 2025-11-13)**
- Major architectural rewire
- Implemented 4-stage reasoning pipeline
- Added reflection, refinement stages
- RAG integration
- LLM router with per-stage backend selection
**Infrastructure v1.0.0 (2025-11-26)**
- Consolidated 9 `.env` files into single source of truth
- Multi-backend LLM strategy
- Docker Compose consolidation
- Created security templates
**v0.5.0 (2025-11-28)**
- Fixed all critical API wiring issues
- Added OpenAI-compatible Relay endpoint
- Fixed Cortex → Intake integration
- End-to-end flow verification
**v0.5.1 (2025-12-11) - CURRENT**
- **Critical fix**: SESSIONS persistence bug
- Implemented `bg_summarize()` stub
- Fixed `/ingest` unreachable code
- Added `cortex/intake/__init__.py`
- Embedded Intake in Cortex (no longer standalone)
- Added diagnostic endpoints
- Lenient error handling
- Documented single-worker constraint
### Architectural Philosophy
**Modular Design:**
- Each service has a single, clear responsibility
- Services communicate via well-defined HTTP APIs
- Configuration is centralized but allows per-service overrides
**Local-First:**
- No reliance on external services (except optional OpenAI)
- All data stored locally (PostgreSQL + Neo4j)
- Can run entirely air-gapped with local LLMs
**Flexible LLM Backend:**
- Not tied to any single LLM provider
- Can mix local and cloud models
- Per-stage backend selection for optimal performance/cost
**Error Handling:**
- Lenient mode: Never fail the chat pipeline
- Log errors but continue processing
- Graceful degradation
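
The lenient mode described above can be pictured as a wrapper that logs a failure and returns a fallback value instead of propagating the exception. This is a sketch of the general pattern, not the project's implementation; the `lenient` decorator and the `ingest` stand-in are hypothetical:

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("lenient")


def lenient(fallback: Any) -> Callable:
    """Decorator: log any exception from the wrapped call and return `fallback` instead."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Record the traceback, then degrade gracefully.
                logger.exception("%s failed; continuing with fallback", func.__name__)
                return fallback

        return wrapper

    return decorator


@lenient(fallback={"ok": False})
def ingest(exchange: dict) -> dict:
    if "session_id" not in exchange:
        raise ValueError("missing session_id")
    return {"ok": True, "session_id": exchange["session_id"]}


print(ingest({"session_id": "test"}))  # {'ok': True, 'session_id': 'test'}
print(ingest({}))  # {'ok': False}
```
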
**Observability:**
- Diagnostic endpoints for debugging
- Verbose logging mode
- Object ID tracking for singleton verification
---
## Known Issues & Limitations
### Fixed in v0.5.1
- ✅ Intake SESSIONS not persisting → **FIXED**
- ✅ `bg_summarize()` NameError → **FIXED**
- ✅ `/ingest` endpoint unreachable code → **FIXED**
### Current Limitations
**1. Single-Worker Constraint**
- Cortex must run with single Uvicorn worker
- SESSIONS is in-memory module-level global
- Multi-worker support requires Redis or shared storage
- Documented in `cortex/Dockerfile` lines 7-8
**2. NeoMem Integration Incomplete**
- Relay doesn't yet push to NeoMem after responses
- Memory storage planned for v0.5.2
- Currently all memory is short-term (SESSIONS only)
**3. RAG Service Disabled**
- Beta Lyrae (RAG) commented out in docker-compose.yml
- Awaiting re-enablement after Intake stabilization
- Code exists but not currently integrated
**4. Session Management**
- No session cleanup/expiration
- SESSIONS grows without bound: each buffer is capped at 200 exchanges (`maxlen=200`), but the number of sessions is unlimited
- No session list endpoint in Relay
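
One way the missing cleanup could work is an age-based sweep over the SESSIONS dict. The sketch below is not part of the codebase: `expire_sessions` is a hypothetical helper, and expiring on `created_at` (rather than on last activity, which the documented structure does not track) is a simplification:

```python
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import Optional


def expire_sessions(
    sessions: dict, max_age: timedelta, now: Optional[datetime] = None
) -> list[str]:
    """Delete sessions whose 'created_at' is older than max_age; return the removed IDs."""
    now = now or datetime.now(timezone.utc)
    stale = [sid for sid, s in sessions.items() if now - s["created_at"] > max_age]
    for sid in stale:
        del sessions[sid]
    return stale


# Fixed reference time so the example is deterministic.
now = datetime(2025, 12, 11, 12, 0, tzinfo=timezone.utc)
sessions = {
    "old": {"buffer": deque(maxlen=200), "created_at": now - timedelta(days=3)},
    "new": {"buffer": deque(maxlen=200), "created_at": now - timedelta(minutes=5)},
}
print(expire_sessions(sessions, max_age=timedelta(days=1), now=now))  # ['old']
```
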
**5. Persona Integration**
- `PERSONA_ENABLED=false` in `.env`
- Persona Sidecar not fully wired
- Identity loaded but not consistently applied
### Future Enhancements
**Short-term (v0.5.2):**
- Enable NeoMem integration in Relay
- Add session cleanup/expiration
- Session list endpoint
- NeoMem health monitoring
**Medium-term (v0.6.x):**
- Re-enable RAG service
- Migrate SESSIONS to Redis for multi-worker support
- Add request correlation IDs
- Comprehensive health checks
**Long-term (v0.7.x+):**
- Persona Sidecar full integration
- Autonomous "dream" cycles (self-reflection)
- Verifier module for factual grounding
- Advanced RAG with hybrid search
- Memory consolidation strategies
---
## Troubleshooting Quick Reference
| Problem | Quick Check | Solution |
|---------|-------------|----------|
| SESSIONS empty | `curl localhost:7081/debug/sessions` | Rebuild Cortex, verify `__init__.py` exists |
| LLM timeout | `curl http://10.0.0.44:8080/health` | Check backend connectivity, increase timeout |
| Port conflict | `netstat -tulpn \| grep 7078` | Stop conflicting service or change port |
| Container crash | `docker logs cortex` | Check logs for Python errors, verify .env syntax |
| Missing package | `docker exec cortex pip list` | Rebuild container, check requirements.txt |
| 502 from Relay | `curl localhost:7081/health` | Verify Cortex is running, check docker network |
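Several of the quick checks above are plain health probes, so they can be scripted. This is a rough sketch using only the standard library; the ports and paths are taken from the API reference in this document, while `localhost` and the helper names `http_ok()` and `check_services()` are assumptions for illustration.

```python
import urllib.error
import urllib.request

# Ports and paths follow the API reference in this document;
# "localhost" is an assumption about where the stack is running.
HEALTH_URLS = {
    "relay": "http://localhost:7078/_health",
    "cortex": "http://localhost:7081/health",
    "neomem": "http://localhost:7077/health",
}

def http_ok(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False

def check_services(urls=HEALTH_URLS, probe=http_ok):
    """Probe every service and return a name -> bool mapping."""
    return {name: probe(url) for name, url in urls.items()}
```

Running `check_services()` returns a dictionary such as `{"relay": ..., "cortex": ..., "neomem": ...}`, which gives a first indication of which row of the table to start with.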
---
## API Reference (Quick)
### Relay (Port 7078)
**POST /v1/chat/completions** - OpenAI-compatible chat
```json
{
"messages": [{"role": "user", "content": "..."}],
"session_id": "..."
}
```
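As an illustration of the request shape above, the following sketch builds and sends a chat request with the standard library. The host `localhost` and the helper names are assumptions; the response format is not documented here, so the sketch simply returns the parsed JSON.

```python
import json
import urllib.request

RELAY_URL = "http://localhost:7078"  # assumed host; port from this section

def build_chat_request(content, session_id, base_url=RELAY_URL):
    """Build a POST request matching the body shown above."""
    payload = {
        "messages": [{"role": "user", "content": content}],
        "session_id": session_id,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send_chat(content, session_id, timeout=60.0):
    """Send the request and return the decoded JSON response."""
    req = build_chat_request(content, session_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```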
**GET /_health** - Service health
### Cortex (Port 7081)
**POST /reason** - Main reasoning pipeline (`temperature` is optional)
```json
{
  "session_id": "...",
  "user_prompt": "...",
  "temperature": 0.7
}
```
**POST /ingest** - Add exchange to SESSIONS
```json
{
"session_id": "...",
"user_msg": "...",
"assistant_msg": "..."
}
```
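This document describes `/ingest` as using lenient error handling: errors are logged but the caller still gets a success response. A minimal sketch of that pattern follows, assuming a `store` callable that stands in for `add_exchange_internal()` (its real signature is not shown here) and a hypothetical response shape.

```python
import logging

logger = logging.getLogger("cortex.ingest")

def lenient_ingest(payload, store):
    """Try to record an exchange; log failures instead of raising.

    `store` is a stand-in for the real storage function, and the
    returned dict is a hypothetical response shape.
    """
    try:
        store(payload["session_id"], payload["user_msg"], payload["assistant_msg"])
    except Exception:
        # Log the full traceback, but never break the chat pipeline.
        logger.exception("ingest failed; continuing without storing")
        return {"status": "ok", "stored": False}
    return {"status": "ok", "stored": True}
```

The trade-off is that callers cannot rely on the status alone to know whether the exchange was stored, which is why the failure is always logged.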
**GET /debug/sessions** - Inspect SESSIONS state
**GET /debug/summary?session_id=X** - Test summarization
**GET /health** - Service health
### NeoMem (Port 7077)
**POST /memories** - Add memory
```json
{
"messages": [{"role": "...", "content": "..."}],
"user_id": "...",
"metadata": {}
}
```
**POST /search** - Semantic search
```json
{
"query": "...",
"user_id": "...",
"limit": 10
}
```
**GET /health** - Service health
---
## File Manifest (Key Files Only)
```
project-lyra/
├── .env                        # Root environment variables
├── docker-compose.yml          # Service definitions (152 lines)
├── CHANGELOG.md                # Version history (836 lines)
├── README.md                   # User documentation (610 lines)
├── PROJECT_SUMMARY.md          # This file (AI context)
├── cortex/                     # Reasoning engine
│   ├── Dockerfile              # Single-worker constraint documented
│   ├── requirements.txt
│   ├── .env                    # Cortex overrides
│   ├── main.py                 # FastAPI initialization
│   ├── router.py               # Routes (306 lines)
│   ├── context.py              # Context aggregation
│   │
│   ├── intake/                 # Short-term memory (embedded)
│   │   ├── __init__.py         # Package exports
│   │   └── intake.py           # Core logic (367 lines)
│   │
│   ├── reasoning/              # Reasoning pipeline
│   │   ├── reflection.py       # Meta-awareness
│   │   ├── reasoning.py        # Draft generation
│   │   └── refine.py           # Refinement
│   │
│   ├── persona/                # Personality layer
│   │   ├── speak.py            # Persona application
│   │   └── identity.py         # Persona loader
│   │
│   └── llm/                    # LLM integration
│       └── llm_router.py       # Backend selector
├── core/relay/                 # Orchestrator
│   ├── server.js               # Express server (Node.js)
│   └── package.json
├── neomem/                     # Long-term memory
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── .env                    # NeoMem overrides
│   └── main.py                 # Memory API
└── rag/                        # RAG system (disabled)
    ├── rag_api.py
    ├── rag_chat_import.py
    └── chromadb/
---
## Final Notes for AI Assistants
### What You Should Know Before Making Changes
1. **SESSIONS is sacred** - It's a module-level global in `cortex/intake/intake.py`. Don't move it, don't duplicate it, don't make it a class attribute. It must remain a singleton.
2. **Single-worker is mandatory** - Until SESSIONS is migrated to Redis, Cortex MUST run with a single Uvicorn worker. With multiple workers, each process holds its own copy of SESSIONS, so session state becomes inconsistent between requests.
3. **Lenient error handling** - The `/ingest` endpoint and other parts of the pipeline use lenient error handling: log errors but always return success. Never fail the chat pipeline.
4. **Backend routing is environment-driven** - Don't hardcode LLM URLs. Use the `{MODULE}_LLM` environment variables and the llm_router.py system.
5. **Intake is embedded** - Don't try to make HTTP calls to Intake. Use direct Python imports: `from intake.intake import ...`
6. **Test with diagnostic endpoints** - Always use `/debug/sessions` and `/debug/summary` to verify SESSIONS behavior after changes.
7. **Follow the changelog format** - When documenting changes, use the chronological format established in CHANGELOG.md v0.5.1. Group by version, then by change type (Fixed, Added, Changed, etc.).
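To illustrate rule 4, here is a hedged sketch of environment-driven backend selection. It assumes a `{MODULE}_LLM` variable names one of the backend options (for example `PRIMARY`) and that each backend has a URL variable; the `LLM_<BACKEND>_URL` naming and the `resolve_backend()` helper are hypothetical, so check the root `.env` and `llm_router.py` for the actual names.

```python
import os

def resolve_backend(module, env=None):
    """Map a module name to (backend_name, url) via environment variables.

    Assumed naming: CORTEX_LLM=PRIMARY selects LLM_PRIMARY_URL.
    The *_URL variable name is hypothetical, not taken from the project.
    """
    env = os.environ if env is None else env
    selector = f"{module.upper()}_LLM"
    backend = env.get(selector)
    if not backend:
        raise KeyError(f"{selector} is not set")
    url_var = f"LLM_{backend.upper()}_URL"
    url = env.get(url_var)
    if not url:
        raise KeyError(f"{url_var} is not set for backend {backend}")
    return backend.upper(), url
```

Keeping the lookup in one function means no call site needs a hardcoded URL, and switching a module to another backend is a one-line change in `.env`.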
### When You Need Help
- **SESSIONS issues**: Check `cortex/intake/intake.py` lines 11-14 for initialization, lines 325-366 for `add_exchange_internal()`
- **Routing issues**: Check `cortex/router.py` lines 65-189 for `/reason`, lines 201-233 for `/ingest`
- **LLM backend issues**: Check `cortex/llm/llm_router.py` for backend selection logic
- **Environment variables**: Check `.env` lines 13-40 for LLM backends, lines 28-34 for module selection
### Most Important Thing
**This project values reliability over features.** It's better to have a simple, working system than a complex, broken one. When in doubt, keep it simple, log everything, and never fail silently.
---
**End of AI Context Summary**
*This document is maintained to provide complete context for AI assistants working on Project Lyra. Last updated: v0.5.1 (2025-12-11)*

docs/lyra_tree.txt Normal file

@@ -0,0 +1,460 @@
/home/serversdown/project-lyra
├── CHANGELOG.md
├── core
│   ├── backups
│   │   ├── mem0_20250927_221040.sql
│   │   └── mem0_history_20250927_220925.tgz
│   ├── docker-compose.yml
│   ├── .env
│   ├── env experiments
│   │   ├── .env
│   │   ├── .env.local
│   │   └── .env.openai
│   ├── persona-sidecar
│   │   ├── Dockerfile
│   │   ├── package.json
│   │   ├── persona-server.js
│   │   └── personas.json
│   ├── PROJECT_SUMMARY.md
│   ├── relay
│   │   ├── Dockerfile
│   │   ├── .dockerignore
│   │   ├── lib
│   │   │   ├── cortex.js
│   │   │   └── llm.js
│   │   ├── package.json
│   │   ├── package-lock.json
│   │   ├── server.js
│   │   ├── sessions
│   │   │   ├── sess-6rxu7eia.json
│   │   │   ├── sess-6rxu7eia.jsonl
│   │   │   ├── sess-l08ndm60.json
│   │   │   └── sess-l08ndm60.jsonl
│   │   └── test-llm.js
│   └── ui
│       ├── index.html
│       ├── manifest.json
│       └── style.css
├── cortex
│   ├── Dockerfile
│   ├── .env
│   ├── ingest
│   │   ├── ingest_handler.py
│   │   └── intake_client.py
│   ├── llm
│   │   ├── llm_router.py
│   │   └── resolve_llm_url.py
│   ├── logs
│   │   └── reflections.log
│   ├── main.py
│   ├── neomem_client.py
│   ├── persona
│   │   └── speak.py
│   ├── rag.py
│   ├── reasoning
│   │   ├── reasoning.py
│   │   ├── refine.py
│   │   └── reflection.py
│   ├── requirements.txt
│   ├── router.py
│   ├── tests
│   └── utils
│       ├── config.py
│       ├── log_utils.py
│       └── schema.py
├── deprecated.env.txt
├── docker-compose.yml
├── .env
├── .gitignore
├── intake
│   ├── Dockerfile
│   ├── .env
│   ├── intake.py
│   ├── logs
│   ├── requirements.txt
│   └── venv
│       ├── bin
│       │   ├── python -> python3
│       │   ├── python3 -> /usr/bin/python3
│       │   └── python3.10 -> python3
│       ├── include
│       ├── lib
│       │   └── python3.10
│       │       └── site-packages
│       ├── lib64 -> lib
│       └── pyvenv.cfg
├── intake-logs
│   └── summaries.log
├── lyra_tree.txt
├── neomem
│   ├── _archive
│   │   └── old_servers
│   │       ├── main_backup.py
│   │       └── main_dev.py
│   ├── docker-compose.yml
│   ├── Dockerfile
│   ├── .env
│   ├── .gitignore
│   ├── neomem
│   │   ├── api
│   │   ├── client
│   │   │   ├── __init__.py
│   │   │   ├── main.py
│   │   │   ├── project.py
│   │   │   └── utils.py
│   │   ├── configs
│   │   │   ├── base.py
│   │   │   ├── embeddings
│   │   │   │   ├── base.py
│   │   │   │   └── __init__.py
│   │   │   ├── enums.py
│   │   │   ├── __init__.py
│   │   │   ├── llms
│   │   │   │   ├── anthropic.py
│   │   │   │   ├── aws_bedrock.py
│   │   │   │   ├── azure.py
│   │   │   │   ├── base.py
│   │   │   │   ├── deepseek.py
│   │   │   │   ├── __init__.py
│   │   │   │   ├── lmstudio.py
│   │   │   │   ├── ollama.py
│   │   │   │   ├── openai.py
│   │   │   │   └── vllm.py
│   │   │   ├── prompts.py
│   │   │   └── vector_stores
│   │   │       ├── azure_ai_search.py
│   │   │       ├── azure_mysql.py
│   │   │       ├── baidu.py
│   │   │       ├── chroma.py
│   │   │       ├── databricks.py
│   │   │       ├── elasticsearch.py
│   │   │       ├── faiss.py
│   │   │       ├── __init__.py
│   │   │       ├── langchain.py
│   │   │       ├── milvus.py
│   │   │       ├── mongodb.py
│   │   │       ├── neptune.py
│   │   │       ├── opensearch.py
│   │   │       ├── pgvector.py
│   │   │       ├── pinecone.py
│   │   │       ├── qdrant.py
│   │   │       ├── redis.py
│   │   │       ├── s3_vectors.py
│   │   │       ├── supabase.py
│   │   │       ├── upstash_vector.py
│   │   │       ├── valkey.py
│   │   │       ├── vertex_ai_vector_search.py
│   │   │       └── weaviate.py
│   │   ├── core
│   │   ├── embeddings
│   │   │   ├── aws_bedrock.py
│   │   │   ├── azure_openai.py
│   │   │   ├── base.py
│   │   │   ├── configs.py
│   │   │   ├── gemini.py
│   │   │   ├── huggingface.py
│   │   │   ├── __init__.py
│   │   │   ├── langchain.py
│   │   │   ├── lmstudio.py
│   │   │   ├── mock.py
│   │   │   ├── ollama.py
│   │   │   ├── openai.py
│   │   │   ├── together.py
│   │   │   └── vertexai.py
│   │   ├── exceptions.py
│   │   ├── graphs
│   │   │   ├── configs.py
│   │   │   ├── __init__.py
│   │   │   ├── neptune
│   │   │   │   ├── base.py
│   │   │   │   ├── __init__.py
│   │   │   │   ├── neptunedb.py
│   │   │   │   └── neptunegraph.py
│   │   │   ├── tools.py
│   │   │   └── utils.py
│   │   ├── __init__.py
│   │   ├── LICENSE
│   │   ├── llms
│   │   │   ├── anthropic.py
│   │   │   ├── aws_bedrock.py
│   │   │   ├── azure_openai.py
│   │   │   ├── azure_openai_structured.py
│   │   │   ├── base.py
│   │   │   ├── configs.py
│   │   │   ├── deepseek.py
│   │   │   ├── gemini.py
│   │   │   ├── groq.py
│   │   │   ├── __init__.py
│   │   │   ├── langchain.py
│   │   │   ├── litellm.py
│   │   │   ├── lmstudio.py
│   │   │   ├── ollama.py
│   │   │   ├── openai.py
│   │   │   ├── openai_structured.py
│   │   │   ├── sarvam.py
│   │   │   ├── together.py
│   │   │   ├── vllm.py
│   │   │   └── xai.py
│   │   ├── memory
│   │   │   ├── base.py
│   │   │   ├── graph_memory.py
│   │   │   ├── __init__.py
│   │   │   ├── kuzu_memory.py
│   │   │   ├── main.py
│   │   │   ├── memgraph_memory.py
│   │   │   ├── setup.py
│   │   │   ├── storage.py
│   │   │   ├── telemetry.py
│   │   │   └── utils.py
│   │   ├── proxy
│   │   │   ├── __init__.py
│   │   │   └── main.py
│   │   ├── server
│   │   │   ├── dev.Dockerfile
│   │   │   ├── docker-compose.yaml
│   │   │   ├── Dockerfile
│   │   │   ├── main_old.py
│   │   │   ├── main.py
│   │   │   ├── Makefile
│   │   │   ├── README.md
│   │   │   └── requirements.txt
│   │   ├── storage
│   │   ├── utils
│   │   │   └── factory.py
│   │   └── vector_stores
│   │       ├── azure_ai_search.py
│   │       ├── azure_mysql.py
│   │       ├── baidu.py
│   │       ├── base.py
│   │       ├── chroma.py
│   │       ├── configs.py
│   │       ├── databricks.py
│   │       ├── elasticsearch.py
│   │       ├── faiss.py
│   │       ├── __init__.py
│   │       ├── langchain.py
│   │       ├── milvus.py
│   │       ├── mongodb.py
│   │       ├── neptune_analytics.py
│   │       ├── opensearch.py
│   │       ├── pgvector.py
│   │       ├── pinecone.py
│   │       ├── qdrant.py
│   │       ├── redis.py
│   │       ├── s3_vectors.py
│   │       ├── supabase.py
│   │       ├── upstash_vector.py
│   │       ├── valkey.py
│   │       ├── vertex_ai_vector_search.py
│   │       └── weaviate.py
│   ├── neomem_history
│   │   └── history.db
│   ├── pyproject.toml
│   ├── README.md
│   └── requirements.txt
├── neomem_history
│   └── history.db
├── rag
│   ├── chatlogs
│   │   └── lyra
│   │       ├── 0000_Wire_ROCm_to_Cortex.json
│   │       ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
│   │       ├── 0002_cortex_LLMs_11-1-25.json
│   │       ├── 0003_RAG_beta.json
│   │       ├── 0005_Cortex_v0_4_0_planning.json
│   │       ├── 0006_Cortex_v0_4_0_Refinement.json
│   │       ├── 0009_Branch___Cortex_v0_4_0_planning.json
│   │       ├── 0012_Cortex_4_-_neomem_11-1-25.json
│   │       ├── 0016_Memory_consolidation_concept.json
│   │       ├── 0017_Model_inventory_review.json
│   │       ├── 0018_Branch___Memory_consolidation_concept.json
│   │       ├── 0022_Branch___Intake_conversation_summaries.json
│   │       ├── 0026_Intake_conversation_summaries.json
│   │       ├── 0027_Trilium_AI_LLM_setup.json
│   │       ├── 0028_LLMs_and_sycophancy_levels.json
│   │       ├── 0031_UI_improvement_plan.json
│   │       ├── 0035_10_27-neomem_update.json
│   │       ├── 0044_Install_llama_cpp_on_ct201.json
│   │       ├── 0045_AI_task_assistant.json
│   │       ├── 0047_Project_scope_creation.json
│   │       ├── 0052_View_docker_container_logs.json
│   │       ├── 0053_10_21-Proxmox_fan_control.json
│   │       ├── 0054_10_21-pytorch_branch_Quant_experiments.json
│   │       ├── 0055_10_22_ct201branch-ssh_tut.json
│   │       ├── 0060_Lyra_project_folder_issue.json
│   │       ├── 0062_Build_pytorch_API.json
│   │       ├── 0063_PokerBrain_dataset_structure.json
│   │       ├── 0065_Install_PyTorch_setup.json
│   │       ├── 0066_ROCm_PyTorch_setup_quirks.json
│   │       ├── 0067_VM_model_setup_steps.json
│   │       ├── 0070_Proxmox_disk_error_fix.json
│   │       ├── 0072_Docker_Compose_vs_Portainer.json
│   │       ├── 0073_Check_system_temps_Proxmox.json
│   │       ├── 0075_Cortex_gpu_progress.json
│   │       ├── 0076_Backup_Proxmox_before_upgrade.json
│   │       ├── 0077_Storage_cleanup_advice.json
│   │       ├── 0082_Install_ROCm_on_Proxmox.json
│   │       ├── 0088_Thalamus_program_summary.json
│   │       ├── 0094_Cortex_blueprint_development.json
│   │       ├── 0095_mem0_advancments.json
│   │       ├── 0096_Embedding_provider_swap.json
│   │       ├── 0097_Update_git_commit_steps.json
│   │       ├── 0098_AI_software_description.json
│   │       ├── 0099_Seed_memory_process.json
│   │       ├── 0100_Set_up_Git_repo.json
│   │       ├── 0101_Customize_embedder_setup.json
│   │       ├── 0102_Seeding_Local_Lyra_memory.json
│   │       ├── 0103_Mem0_seeding_part_3.json
│   │       ├── 0104_Memory_build_prompt.json
│   │       ├── 0105_Git_submodule_setup_guide.json
│   │       ├── 0106_Serve_UI_on_LAN.json
│   │       ├── 0107_AI_name_suggestion.json
│   │       ├── 0108_Room_X_planning_update.json
│   │       ├── 0109_Salience_filtering_design.json
│   │       ├── 0110_RoomX_Cortex_build.json
│   │       ├── 0119_Explain_Lyra_cortex_idea.json
│   │       ├── 0120_Git_submodule_organization.json
│   │       ├── 0121_Web_UI_fix_guide.json
│   │       ├── 0122_UI_development_planning.json
│   │       ├── 0123_NVGRAM_debugging_steps.json
│   │       ├── 0124_NVGRAM_setup_troubleshooting.json
│   │       ├── 0125_NVGRAM_development_update.json
│   │       ├── 0126_RX_-_NeVGRAM_New_Features.json
│   │       ├── 0127_Error_troubleshooting_steps.json
│   │       ├── 0135_Proxmox_backup_with_ABB.json
│   │       ├── 0151_Auto-start_Lyra-Core_VM.json
│   │       ├── 0156_AI_GPU_benchmarks_comparison.json
│   │       └── 0251_Lyra_project_handoff.json
│   ├── chromadb
│   │   ├── c4f701ee-1978-44a1-9df4-3e865b5d33c1
│   │   │   ├── data_level0.bin
│   │   │   ├── header.bin
│   │   │   ├── index_metadata.pickle
│   │   │   ├── length.bin
│   │   │   └── link_lists.bin
│   │   └── chroma.sqlite3
│   ├── .env
│   ├── import.log
│   ├── lyra-chatlogs
│   │   ├── 0000_Wire_ROCm_to_Cortex.json
│   │   ├── 0001_Branch___10_22_ct201branch-ssh_tut.json
│   │   ├── 0002_cortex_LLMs_11-1-25.json
│   │   └── 0003_RAG_beta.json
│   ├── rag_api.py
│   ├── rag_build.py
│   ├── rag_chat_import.py
│   └── rag_query.py
├── README.md
├── vllm-mi50.md
└── volumes
    ├── neo4j_data
    │   ├── databases
    │   │   ├── neo4j
    │   │   │   ├── database_lock
    │   │   │   ├── id-buffer.tmp.0
    │   │   │   ├── neostore
    │   │   │   ├── neostore.counts.db
    │   │   │   ├── neostore.indexstats.db
    │   │   │   ├── neostore.labeltokenstore.db
    │   │   │   ├── neostore.labeltokenstore.db.id
    │   │   │   ├── neostore.labeltokenstore.db.names
    │   │   │   ├── neostore.labeltokenstore.db.names.id
    │   │   │   ├── neostore.nodestore.db
    │   │   │   ├── neostore.nodestore.db.id
    │   │   │   ├── neostore.nodestore.db.labels
    │   │   │   ├── neostore.nodestore.db.labels.id
    │   │   │   ├── neostore.propertystore.db
    │   │   │   ├── neostore.propertystore.db.arrays
    │   │   │   ├── neostore.propertystore.db.arrays.id
    │   │   │   ├── neostore.propertystore.db.id
    │   │   │   ├── neostore.propertystore.db.index
    │   │   │   ├── neostore.propertystore.db.index.id
    │   │   │   ├── neostore.propertystore.db.index.keys
    │   │   │   ├── neostore.propertystore.db.index.keys.id
    │   │   │   ├── neostore.propertystore.db.strings
    │   │   │   ├── neostore.propertystore.db.strings.id
    │   │   │   ├── neostore.relationshipgroupstore.db
    │   │   │   ├── neostore.relationshipgroupstore.db.id
    │   │   │   ├── neostore.relationshipgroupstore.degrees.db
    │   │   │   ├── neostore.relationshipstore.db
    │   │   │   ├── neostore.relationshipstore.db.id
    │   │   │   ├── neostore.relationshiptypestore.db
    │   │   │   ├── neostore.relationshiptypestore.db.id
    │   │   │   ├── neostore.relationshiptypestore.db.names
    │   │   │   ├── neostore.relationshiptypestore.db.names.id
    │   │   │   ├── neostore.schemastore.db
    │   │   │   ├── neostore.schemastore.db.id
    │   │   │   └── schema
    │   │   │       └── index
    │   │   │           └── token-lookup-1.0
    │   │   │               ├── 1
    │   │   │               │   └── index-1
    │   │   │               └── 2
    │   │   │                   └── index-2
    │   │   ├── store_lock
    │   │   └── system
    │   │       ├── database_lock
    │   │       ├── id-buffer.tmp.0
    │   │       ├── neostore
    │   │       ├── neostore.counts.db
    │   │       ├── neostore.indexstats.db
    │   │       ├── neostore.labeltokenstore.db
    │   │       ├── neostore.labeltokenstore.db.id
    │   │       ├── neostore.labeltokenstore.db.names
    │   │       ├── neostore.labeltokenstore.db.names.id
    │   │       ├── neostore.nodestore.db
    │   │       ├── neostore.nodestore.db.id
    │   │       ├── neostore.nodestore.db.labels
    │   │       ├── neostore.nodestore.db.labels.id
    │   │       ├── neostore.propertystore.db
    │   │       ├── neostore.propertystore.db.arrays
    │   │       ├── neostore.propertystore.db.arrays.id
    │   │       ├── neostore.propertystore.db.id
    │   │       ├── neostore.propertystore.db.index
    │   │       ├── neostore.propertystore.db.index.id
    │   │       ├── neostore.propertystore.db.index.keys
    │   │       ├── neostore.propertystore.db.index.keys.id
    │   │       ├── neostore.propertystore.db.strings
    │   │       ├── neostore.propertystore.db.strings.id
    │   │       ├── neostore.relationshipgroupstore.db
    │   │       ├── neostore.relationshipgroupstore.db.id
    │   │       ├── neostore.relationshipgroupstore.degrees.db
    │   │       ├── neostore.relationshipstore.db
    │   │       ├── neostore.relationshipstore.db.id
    │   │       ├── neostore.relationshiptypestore.db
    │   │       ├── neostore.relationshiptypestore.db.id
    │   │       ├── neostore.relationshiptypestore.db.names
    │   │       ├── neostore.relationshiptypestore.db.names.id
    │   │       ├── neostore.schemastore.db
    │   │       ├── neostore.schemastore.db.id
    │   │       └── schema
    │   │           └── index
    │   │               ├── range-1.0
    │   │               │   ├── 3
    │   │               │   │   └── index-3
    │   │               │   ├── 4
    │   │               │   │   └── index-4
    │   │               │   ├── 7
    │   │               │   │   └── index-7
    │   │               │   ├── 8
    │   │               │   │   └── index-8
    │   │               │   └── 9
    │   │               │       └── index-9
    │   │               └── token-lookup-1.0
    │   │                   ├── 1
    │   │                   │   └── index-1
    │   │                   └── 2
    │   │                       └── index-2
    │   ├── dbms
    │   │   └── auth.ini
    │   ├── server_id
    │   └── transactions
    │       ├── neo4j
    │       │   ├── checkpoint.0
    │       │   └── neostore.transaction.db.0
    │       └── system
    │           ├── checkpoint.0
    │           └── neostore.transaction.db.0
    └── postgres_data [error opening dir]
81 directories, 376 files