Update to v0.9.1 #1

CHANGELOG.md
@@ -9,6 +9,271 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se
---
## [0.7.0] - 2025-12-21

### Added - Standard Mode & UI Enhancements

**Standard Mode Implementation**
- Added "Standard Mode" chat option that bypasses the complex Cortex reasoning pipeline
- Provides simple chatbot functionality for coding and practical tasks
- Maintains full conversation context across messages
- Backend-agnostic - works with SECONDARY (Ollama), OPENAI, or custom backends
- Created `/simple` endpoint in Cortex router [cortex/router.py:389](cortex/router.py#L389)
- Mode selector in UI with toggle between Standard and Cortex modes
  - Standard Mode: Direct LLM chat with context retention
  - Cortex Mode: Full 7-stage reasoning pipeline (unchanged)

**Backend Selection System**
- UI settings modal with LLM backend selection for Standard Mode
- Radio button selector: SECONDARY (Ollama/Qwen), OPENAI (GPT-4o-mini), or custom
- Backend preference persisted in localStorage
- Custom backend text input for advanced users
- Backend parameter routing through the entire stack:
  - UI sends `backend` parameter in request body
  - Relay forwards backend selection to Cortex
  - Cortex `/simple` endpoint respects the user's backend choice
  - Environment-based fallback: uses `STANDARD_MODE_LLM` if no backend specified

**Session Management Overhaul**
- Complete rewrite of the session system to use server-side persistence
  - File-based storage in `core/relay/sessions/` directory
  - Session files: `{sessionId}.json` for history, `{sessionId}.meta.json` for metadata
  - Server is the source of truth - sessions sync across browsers and reboots
- Session metadata system for friendly names
  - Sessions display custom names instead of random IDs
  - Rename functionality in session dropdown
  - Last-modified timestamps and message counts
- Full CRUD API for sessions in Relay (usage sketch below):
  - `GET /sessions` - List all sessions with metadata
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - `PATCH /sessions/:id/metadata` - Update session name/metadata
  - `DELETE /sessions/:id` - Delete session and metadata
- Session management UI in settings modal:
  - List of all sessions with message counts and timestamps
  - Delete button for each session with confirmation
  - Automatic session cleanup when deleting the current session
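
As a quick usage illustration of the CRUD API above, here is a minimal Python `requests` sketch; the routes come from this changelog, while the request/response body shapes are assumptions:

```python
# Hypothetical client for the Relay session API; only the routes are
# documented above - the payload shapes ("history", "name") are assumptions.
import requests

RELAY = "http://localhost:7078"

# List all sessions with metadata
sessions = requests.get(f"{RELAY}/sessions").json()

# Save history for a session (assumed body shape)
requests.post(f"{RELAY}/sessions/test", json={
    "history": [{"role": "user", "content": "Hello!"}],
})

# Rename a session via its metadata
requests.patch(f"{RELAY}/sessions/test/metadata", json={"name": "My chat"})

# Fetch the history back, then delete the session and its metadata
history = requests.get(f"{RELAY}/sessions/test").json()
requests.delete(f"{RELAY}/sessions/test")
```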

**UI Improvements**
- Settings modal with hamburger menu (⚙ Settings button)
  - Backend selection section for Standard Mode
  - Session management section with delete functionality
  - Clean modal overlay with cyberpunk theme
  - ESC key and click-outside to close
- Light/Dark mode toggle with dark mode as default
  - Theme preference persisted in localStorage
  - CSS variables for seamless theme switching
  - Toggle button shows current mode (🌙 Dark Mode / ☀️ Light Mode)
- Removed redundant model selector dropdown from header
- Fixed modal positioning and z-index layering
  - Modal moved outside #chat container for proper rendering
  - Fixed z-index: overlay (999), modal content (1001)
  - Centered modal with proper backdrop blur

**Context Retention for Standard Mode**
- Integration with Intake module for conversation history
  - Added `get_recent_messages()` function in intake.py
  - Standard Mode retrieves last 20 messages from session buffer
  - Full context sent to LLM on each request
- Message array format support in LLM router:
  - Updated Ollama provider to accept `messages` parameter
  - Updated OpenAI provider to accept `messages` parameter
  - Automatic conversion from messages to prompt string for non-chat APIs

### Changed - Architecture & Routing

**Relay Server Updates** [core/relay/server.js](core/relay/server.js)
- ES module migration for session persistence:
  - Imported `fs/promises`, `path`, `fileURLToPath` for file operations
  - Created `SESSIONS_DIR` constant for session storage location
- Mode-based routing in both `/chat` and `/v1/chat/completions` endpoints:
  - Extracts `mode` parameter from request body (default: "cortex")
  - Routes to `CORTEX_SIMPLE` for Standard Mode, `CORTEX_REASON` for Cortex Mode
  - Backend parameter only used in Standard Mode
- Session persistence functions (sketched below):
  - `ensureSessionsDir()` - Creates sessions directory if needed
  - `loadSession(sessionId)` - Reads session history from file
  - `saveSession(sessionId, history, metadata)` - Writes session to file
  - `loadSessionMetadata(sessionId)` - Reads session metadata
  - `saveSessionMetadata(sessionId, metadata)` - Updates session metadata
  - `listSessions()` - Returns all sessions with metadata, sorted by last modified
  - `deleteSession(sessionId)` - Removes session and metadata files
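
Relay implements these helpers in JavaScript; purely to illustrate the on-disk layout they manage, here is the equivalent logic as a Python sketch (function names mirror the list above, everything else is assumed):

```python
# Illustrative Python equivalent of the Relay session helpers (the real
# implementation lives in core/relay/server.js and is JavaScript).
import json
from pathlib import Path

SESSIONS_DIR = Path("core/relay/sessions")

def ensure_sessions_dir() -> None:
    # Creates the sessions directory if needed
    SESSIONS_DIR.mkdir(parents=True, exist_ok=True)

def save_session(session_id: str, history: list, metadata: dict) -> None:
    # {sessionId}.json holds history, {sessionId}.meta.json holds metadata
    ensure_sessions_dir()
    (SESSIONS_DIR / f"{session_id}.json").write_text(json.dumps(history))
    (SESSIONS_DIR / f"{session_id}.meta.json").write_text(json.dumps(metadata))

def load_session(session_id: str) -> list:
    # Reads session history from file; empty history if the file is missing
    path = SESSIONS_DIR / f"{session_id}.json"
    return json.loads(path.read_text()) if path.exists() else []

def list_sessions() -> list:
    # All sessions with metadata, sorted by last modified (newest first)
    entries = []
    for meta_file in SESSIONS_DIR.glob("*.meta.json"):
        meta = json.loads(meta_file.read_text())
        meta["sessionId"] = meta_file.name[: -len(".meta.json")]
        entries.append((meta_file.stat().st_mtime, meta))
    return [meta for _, meta in sorted(entries, key=lambda e: e[0], reverse=True)]

def delete_session(session_id: str) -> None:
    # Removes both the history file and the metadata file
    (SESSIONS_DIR / f"{session_id}.json").unlink(missing_ok=True)
    (SESSIONS_DIR / f"{session_id}.meta.json").unlink(missing_ok=True)
```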

**Cortex Router Updates** [cortex/router.py](cortex/router.py)
- Added `backend` field to `ReasonRequest` Pydantic model (optional)
- Created `/simple` endpoint for Standard Mode:
  - Bypasses reflection, reasoning, refinement stages
  - Direct LLM call with conversation context
  - Uses backend from request or falls back to `STANDARD_MODE_LLM` env variable
  - Returns simple response structure without reasoning artifacts
- Backend selection logic in `/simple` (sketched below):
  - Normalizes backend names to uppercase
  - Maps UI backend names to system backend names
  - Validates backend availability before calling
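
A minimal sketch of that selection logic, assuming a simple alias map and backend registry (the names here are illustrative, not the actual `router.py` code):

```python
# Sketch of /simple backend resolution: normalize -> map UI name -> validate,
# with STANDARD_MODE_LLM as the environment fallback. AVAILABLE_BACKENDS and
# BACKEND_ALIASES are assumed names, not taken from the repo.
import os

AVAILABLE_BACKENDS = {"PRIMARY", "SECONDARY", "OPENAI"}  # assumed registry
BACKEND_ALIASES = {"OLLAMA": "SECONDARY"}                # assumed UI-to-system map

def resolve_backend(requested=None) -> str:
    # Fall back to the STANDARD_MODE_LLM env variable if no backend was sent
    name = (requested or os.getenv("STANDARD_MODE_LLM", "SECONDARY")).upper()
    name = BACKEND_ALIASES.get(name, name)
    if name not in AVAILABLE_BACKENDS:
        raise ValueError(f"Unknown backend: {name}")
    return name
```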

**Intake Integration** [cortex/intake/intake.py](cortex/intake/intake.py)
- Added `get_recent_messages(session_id, limit)` function (sketch below):
  - Retrieves last N messages from session buffer
  - Returns empty list if session doesn't exist
  - Used by `/simple` endpoint for context retrieval
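
A plausible shape for `get_recent_messages()`, assuming the in-memory `SESSIONS` buffer maps session IDs to message lists (the real implementation may differ):

```python
# Sketch only: SESSIONS here stands in for Intake's session buffer.
SESSIONS = {}  # session_id -> list of {"role": ..., "content": ...} dicts

def get_recent_messages(session_id: str, limit: int = 20) -> list:
    # Returns an empty list if the session doesn't exist
    buffer = SESSIONS.get(session_id)
    if not buffer:
        return []
    # Last N messages from the session buffer, oldest first
    return buffer[-limit:]
```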

**LLM Router Enhancements** [cortex/llm/llm_router.py](cortex/llm/llm_router.py)
- Added `messages` parameter support across all providers
- Automatic message-to-prompt conversion for legacy APIs (sketched below)
- Chat completion format for Ollama and OpenAI providers
- Stop sequences for MI50/DeepSeek R1 to prevent runaway generation:
  - `"User:"`, `"\nUser:"`, `"Assistant:"`, `"\n\n\n"`
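
For completion-style (non-chat) backends, the conversion might look like the sketch below; the `User:`/`Assistant:` template matches the stop sequences above, though the exact prompt format in `llm_router.py` is an assumption:

```python
# Flatten an OpenAI-style messages array into a single prompt string for
# backends that only accept plain prompts. Template is assumed.
R1_STOP_SEQUENCES = ["User:", "\nUser:", "Assistant:", "\n\n\n"]

def messages_to_prompt(messages: list) -> str:
    lines = []
    for msg in messages:
        role = "User" if msg["role"] == "user" else "Assistant"
        lines.append(f"{role}: {msg['content']}")
    lines.append("Assistant:")  # cue the model to answer as the assistant
    return "\n".join(lines)
```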

**Environment Configuration** [.env](.env)
- Added `STANDARD_MODE_LLM=SECONDARY` for default Standard Mode backend
- Added `CORTEX_SIMPLE_URL=http://cortex:7081/simple` for routing

**UI Architecture** [core/ui/index.html](core/ui/index.html)
- Server-based session loading system:
  - `loadSessionsFromServer()` - Fetches sessions from Relay API
  - `renderSessions()` - Populates session dropdown from server data
  - Session state synchronized with server on every change
- Backend selection persistence:
  - Loads saved backend from localStorage on page load
  - Includes backend parameter in request body when in Standard Mode
  - Settings modal pre-selects current backend choice
- Dark mode by default:
  - Checks localStorage for theme preference
  - Sets dark theme if no preference found
  - Toggle button updates localStorage and applies theme

**CSS Styling** [core/ui/style.css](core/ui/style.css)
- Light mode CSS variables:
  - `--bg-dark: #f5f5f5` (light background)
  - `--text-main: #1a1a1a` (dark text)
  - `--text-fade: #666` (dimmed text)
- Dark mode CSS variables (default):
  - `--bg-dark: #0a0a0a` (dark background)
  - `--text-main: #e6e6e6` (light text)
  - `--text-fade: #999` (dimmed text)
- Modal positioning fixes:
  - `position: fixed` with `top: 50%`, `left: 50%`, `transform: translate(-50%, -50%)`
  - Z-index layering: overlay (999), content (1001)
  - Backdrop blur effect on modal overlay
- Session list styling:
  - Session item cards with hover effects
  - Delete button with red hover state
  - Message count and timestamp display

### Fixed - Critical Issues

**DeepSeek R1 Runaway Generation**
- Root cause: R1 reasoning model generates its thinking process and hallucinates conversations
- Solution:
  - Changed `STANDARD_MODE_LLM` to SECONDARY (Ollama/Qwen) instead of PRIMARY (MI50/R1)
  - Added stop sequences to MI50 provider to prevent continuation
  - Documented R1 limitations for Standard Mode usage

**Context Not Maintained in Standard Mode**
- Root cause: `/simple` endpoint didn't retrieve conversation history from Intake
- Solution:
  - Created `get_recent_messages()` function in intake.py
  - Standard Mode now pulls last 20 messages from session buffer
  - Full context sent to LLM with each request
- User feedback: "it's saying it hasn't received any other messages from me, so it looks like the standard mode llm isn't getting the full chat"

**OpenAI Backend 400 Errors**
- Root cause: OpenAI provider only accepted prompt strings, not messages arrays
- Solution: updated OpenAI provider to support `messages` parameter like Ollama
- Now handles chat completion format correctly

**Modal Formatting Issues**
- Root cause: settings modal sat inside #chat container with overflow constraints
- Symptoms: modal appearing at bottom, jumbled layout, couldn't close
- Solution:
  - Moved modal outside #chat container to be a direct child of body
  - Changed positioning from absolute to fixed
  - Added proper z-index layering (overlay: 999, content: 1001)
  - Removed old model selector from header
- User feedback: "the formating for the settings is all off. Its at the bottom and all jumbling together, i cant get it to go away"

**Session Persistence Broken**
- Root cause: sessions stored only in localStorage, not synced with server
- Symptoms: sessions didn't persist across browsers or reboots, couldn't load messages
- Solution: complete rewrite of session system
  - Implemented server-side file persistence in Relay
  - Created CRUD API endpoints for session management
  - Updated UI to load sessions from server instead of localStorage
  - Added metadata system for session names
  - Sessions now survive container restarts and sync across browsers
- User feedback: "sessions seem to exist locally only, i cant get them to actually load any messages and there is now way to delete them. If i open the ui in a different browser those arent there."

### Technical Improvements

**Backward Compatibility**
- All changes include defaults to maintain existing behavior
- Cortex Mode completely unchanged - still uses full 7-stage pipeline
- Standard Mode is opt-in via UI mode selector
- If no backend specified, falls back to `STANDARD_MODE_LLM` env variable
- Existing requests without mode parameter default to "cortex"

**Code Quality**
- Consistent async/await patterns throughout stack
- Proper error handling with fallbacks
- Clean separation between Standard and Cortex modes
- Session persistence abstracted into helper functions
- Modular UI code with clear event handlers

**Performance**
- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
- Session loading optimized with file-based caching
- Backend selection happens once per message, not per LLM call
- Minimal overhead for mode detection and routing

### Architecture - Dual-Mode Chat System

**Standard Mode Flow:**
```
User (UI) → Relay → Cortex /simple → Intake (get_recent_messages)
→ LLM (direct call with context) → Relay → UI
```

**Cortex Mode Flow (Unchanged):**
```
User (UI) → Relay → Cortex /reason → Reflection → Reasoning
→ Refinement → Persona → Relay → UI
```

**Session Persistence:**
```
UI → POST /sessions/:id → Relay → File system (sessions/*.json)
UI → GET /sessions → Relay → List all sessions → UI dropdown
```

### Known Limitations

**Standard Mode:**
- No reflection, reasoning, or refinement stages
- No RAG integration (same as Cortex Mode - currently disabled)
- No NeoMem memory storage (same as Cortex Mode - currently disabled)
- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)

**Session Management:**
- Sessions stored in container filesystem - need volume mount for true persistence
- No session import/export functionality yet
- No session search or filtering

### Migration Notes

**For Users Upgrading:**
1. Existing sessions in localStorage will not automatically migrate to the server
2. Create new sessions after upgrading for server-side persistence
3. Theme preference (light/dark) will be preserved from localStorage
4. Backend preference will default to SECONDARY if not previously set

**For Developers:**
1. Relay now requires `fs/promises` for session persistence
2. Cortex `/simple` endpoint accepts an optional `backend` parameter
3. UI sends `mode` and `backend` parameters in request body
4. Session files stored in `core/relay/sessions/` directory

---

## [0.6.0] - 2025-12-18

### Added - Autonomy System (Phase 1 & 2)

README.md
@@ -1,10 +1,12 @@
-# Project Lyra - README v0.6.0
+# Project Lyra - README v0.7.0

 Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
 It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
 featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.

-**Current Version:** v0.6.0 (2025-12-18)
+**NEW in v0.7.0:** Standard Mode for simple chatbot functionality + UI backend selection + server-side session persistence
+
+**Current Version:** v0.7.0 (2025-12-21)

 > **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.
@@ -25,14 +27,18 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Coordinates all module interactions
 - OpenAI-compatible endpoint: `POST /v1/chat/completions`
 - Internal endpoint: `POST /chat`
-- Routes messages through Cortex reasoning pipeline
+- Dual-mode routing: Standard Mode (simple chat) or Cortex Mode (full reasoning)
+- Server-side session persistence with file-based storage
+- Session management API: `GET/POST/PATCH/DELETE /sessions`
 - Manages async calls to Cortex ingest
 - *(NeoMem integration currently disabled in v0.6.0)*

 **2. UI** (Static HTML)
 - Browser-based chat interface with cyberpunk theme
-- Connects to Relay
-- Saves and loads sessions
+- **NEW:** Mode selector (Standard/Cortex) in header
+- **NEW:** Settings modal with backend selection and session management
+- **NEW:** Light/Dark mode toggle (dark by default)
+- Server-synced session management (persists across browsers and reboots)
 - OpenAI-compatible message format

 **3. NeoMem** (Python/FastAPI) - Port 7077 - **DISABLED IN v0.6.0**
@@ -49,7 +55,13 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Primary reasoning engine with multi-stage pipeline and autonomy system
 - **Includes embedded Intake module** (no separate service as of v0.5.1)
 - **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
-- **4-Stage Processing:**
+- **Dual Operating Modes:**
+  - **Standard Mode** (NEW in v0.7.0) - Simple chatbot with context retention
+    - Bypasses reflection, reasoning, refinement stages
+    - Direct LLM call with conversation history
+    - User-selectable backend (SECONDARY, OPENAI, or custom)
+    - Faster responses for coding and practical tasks
+  - **Cortex Mode** - Full 4-stage reasoning pipeline
   1. **Reflection** - Generates meta-awareness notes about conversation
   2. **Reasoning** - Creates initial draft answer using context
   3. **Refinement** - Polishes and improves the draft
@@ -57,7 +69,8 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Integrates with Intake for short-term context via internal Python imports
 - Flexible LLM router supporting multiple backends via HTTP
 - **Endpoints:**
-  - `POST /reason` - Main reasoning pipeline
+  - `POST /reason` - Main reasoning pipeline (Cortex Mode)
+  - `POST /simple` - Direct LLM chat (Standard Mode) **NEW in v0.7.0**
   - `POST /ingest` - Receives conversation exchanges from Relay
   - `GET /health` - Service health check
   - `GET /debug/sessions` - Inspect in-memory SESSIONS state
@@ -129,12 +142,38 @@ The autonomy system operates in coordinated layers, all maintaining state in `se

 ---

-## Data Flow Architecture (v0.6.0)
+## Data Flow Architecture (v0.7.0)

-### Normal Message Flow:
+### Standard Mode Flow (NEW in v0.7.0):

 ```
-User (UI) → POST /v1/chat/completions
+User (UI) → POST /v1/chat/completions {mode: "standard", backend: "SECONDARY"}
+    ↓
+Relay (7078)
+    ↓ POST /simple
+Cortex (7081)
+    ↓ (internal Python call)
+Intake module → get_recent_messages() (last 20 messages)
+    ↓
+Direct LLM call (user-selected backend: SECONDARY/OPENAI/custom)
+    ↓
+Returns simple response to Relay
+    ↓
+Relay → POST /ingest (async)
+    ↓
+Cortex → add_exchange_internal() → SESSIONS buffer
+    ↓
+Relay → POST /sessions/:id (save session to file)
+    ↓
+Relay → UI (returns final response)
+
+Note: Bypasses reflection, reasoning, refinement, persona stages
+```
+
+### Cortex Mode Flow (Full Reasoning):
+
+```
+User (UI) → POST /v1/chat/completions {mode: "cortex"}
     ↓
 Relay (7078)
     ↓ POST /reason
@@ -158,11 +197,26 @@ Cortex → add_exchange_internal() → SESSIONS buffer
     ↓
 Autonomy System → Update self_state.json (pattern tracking)
     ↓
+Relay → POST /sessions/:id (save session to file)
+    ↓
 Relay → UI (returns final response)

 Note: NeoMem integration disabled in v0.6.0
 ```

+### Session Persistence Flow (NEW in v0.7.0):
+
+```
+UI loads → GET /sessions → Relay → List all sessions from files → UI dropdown
+User sends message → POST /sessions/:id → Relay → Save to sessions/*.json
+User renames session → PATCH /sessions/:id/metadata → Relay → Update *.meta.json
+User deletes session → DELETE /sessions/:id → Relay → Remove session files
+
+Sessions stored in: core/relay/sessions/
+- {sessionId}.json (conversation history)
+- {sessionId}.meta.json (name, timestamps, metadata)
+```
+
 ### Cortex 4-Stage Reasoning Pipeline:

 1. **Reflection** (`reflection.py`) - Cloud LLM (OpenAI)
@@ -196,6 +250,14 @@ Note: NeoMem integration disabled in v0.6.0
 - OpenAI-compatible endpoint: `POST /v1/chat/completions`
 - Internal endpoint: `POST /chat`
 - Health check: `GET /_health`
+- **NEW:** Dual-mode routing (Standard/Cortex)
+- **NEW:** Server-side session persistence with CRUD API
+- **NEW:** Session management endpoints:
+  - `GET /sessions` - List all sessions
+  - `GET /sessions/:id` - Retrieve session history
+  - `POST /sessions/:id` - Save session history
+  - `PATCH /sessions/:id/metadata` - Update session metadata
+  - `DELETE /sessions/:id` - Delete session
 - Async non-blocking calls to Cortex
 - Shared request handler for code reuse
 - Comprehensive error handling
@@ -210,19 +272,35 @@ Note: NeoMem integration disabled in v0.6.0

 **UI**:
 - Lightweight static HTML chat interface
-- Cyberpunk theme
+- Cyberpunk theme with light/dark mode toggle
-- Session save/load functionality
+- **NEW:** Mode selector (Standard/Cortex) in header
+- **NEW:** Settings modal (⚙ button) with:
+  - Backend selection for Standard Mode (SECONDARY/OPENAI/custom)
+  - Session management (view, delete sessions)
+  - Theme toggle (dark mode default)
+- **NEW:** Server-synced session management
+  - Sessions persist across browsers and reboots
+  - Rename sessions with custom names
+  - Delete sessions with confirmation
+  - Automatic session save on every message
 - OpenAI message format support

 ### Reasoning Layer

-**Cortex** (v0.5.1):
+**Cortex** (v0.7.0):
-- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona)
+- **NEW:** Dual operating modes:
+  - **Standard Mode** - Simple chat with context (`/simple` endpoint)
+    - User-selectable backend (SECONDARY, OPENAI, or custom)
+    - Full conversation history via Intake integration
+    - Bypasses reasoning pipeline for faster responses
+  - **Cortex Mode** - Full reasoning pipeline (`/reason` endpoint)
+    - Multi-stage processing: reflection → reasoning → refine → persona
+    - Per-stage backend selection
+    - Autonomy system integration
 - Flexible LLM backend routing via HTTP
-- Per-stage backend selection
 - Async processing throughout
 - Embedded Intake module for short-term context
-- `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
+- `/reason`, `/simple`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
 - Lenient error handling - never fails the chat pipeline

 **Intake** (Embedded Module):
@@ -327,7 +405,28 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

 ## Version History

-### v0.6.0 (2025-12-18) - Current Release
+### v0.7.0 (2025-12-21) - Current Release
+**Major Features: Standard Mode + Backend Selection + Session Persistence**
+- ✅ Added Standard Mode for simple chatbot functionality
+- ✅ UI mode selector (Standard/Cortex) in header
+- ✅ Settings modal with backend selection for Standard Mode
+- ✅ Server-side session persistence with file-based storage
+- ✅ Session management UI (view, rename, delete sessions)
+- ✅ Light/Dark mode toggle (dark by default)
+- ✅ Context retention in Standard Mode via Intake integration
+- ✅ Fixed modal positioning and z-index issues
+- ✅ Cortex `/simple` endpoint for direct LLM calls
+- ✅ Session CRUD API in Relay
+- ✅ Full backward compatibility - Cortex Mode unchanged
+
+**Key Changes:**
+- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
+- Sessions now sync across browsers and survive container restarts
+- User can select SECONDARY (Ollama), OPENAI, or custom backend for Standard Mode
+- Theme preference and backend selection persisted in localStorage
+- Session files stored in `core/relay/sessions/` directory
+
+### v0.6.0 (2025-12-18)
 **Major Feature: Autonomy System (Phase 1, 2, and 2.5)**
 - ✅ Added autonomous decision-making framework
 - ✅ Implemented executive planning and goal-setting layer
@@ -394,30 +493,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

 ---

-## Known Issues (v0.6.0)
+## Known Issues (v0.7.0)

-### Temporarily Disabled (v0.6.0)
+### Temporarily Disabled
 - **NeoMem disabled by default** - Being refined independently before full integration
 - PostgreSQL + pgvector storage inactive
 - Neo4j graph database inactive
 - Memory persistence endpoints not active
 - RAG service (Beta Lyrae) currently disabled in docker-compose.yml

-### Non-Critical
-- Session management endpoints not fully implemented in Relay
-- Full autonomy system integration still being refined
-- Memory retrieval integration pending NeoMem re-enablement
+### Standard Mode Limitations
+- No reflection, reasoning, or refinement stages (by design)
+- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)
+- No RAG integration (same as Cortex Mode - currently disabled)
+- No NeoMem memory storage (same as Cortex Mode - currently disabled)
+
+### Session Management Limitations
+- Sessions stored in container filesystem - requires volume mount for true persistence
+- No session import/export functionality yet
+- No session search or filtering
+- Old localStorage sessions don't automatically migrate to server

 ### Operational Notes
 - **Single-worker constraint**: Cortex must run with single Uvicorn worker to maintain SESSIONS state
 - Multi-worker scaling requires migrating SESSIONS to Redis or shared storage
 - Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting
+- Backend selection only affects Standard Mode - Cortex Mode uses environment-configured backends

 ### Future Enhancements
 - Re-enable NeoMem integration after pipeline refinement
 - Full autonomy system maturation and optimization
 - Re-enable RAG service integration
-- Implement full session persistence
+- Session import/export functionality
+- Session search and filtering UI
 - Migrate SESSIONS to Redis for multi-worker support
 - Add request correlation IDs for tracing
 - Comprehensive health checks across all services
@@ -457,17 +565,56 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
 curl http://localhost:7077/health
 ```

-4. Access the UI at `http://localhost:7078`
+4. Access the UI at `http://localhost:8081`
+
+### Using the UI
+
+**Mode Selection:**
+- Use the **Mode** dropdown in the header to switch between:
+  - **Standard** - Simple chatbot for coding and practical tasks
+  - **Cortex** - Full reasoning pipeline with autonomy features
+
+**Settings Menu:**
+1. Click the **⚙ Settings** button in the header
+2. **Backend Selection** (Standard Mode only):
+   - Choose **SECONDARY** (Ollama/Qwen on 3090) - Fast, local
+   - Choose **OPENAI** (GPT-4o-mini) - Cloud-based, high quality
+   - Enter custom backend name for advanced configurations
+3. **Session Management**:
+   - View all saved sessions with message counts and timestamps
+   - Click 🗑️ to delete unwanted sessions
+4. **Theme Toggle**:
+   - Click **🌙 Dark Mode** or **☀️ Light Mode** to switch themes
+
+**Session Management:**
+- Sessions automatically save on every message
+- Use the **Session** dropdown to switch between sessions
+- Click **➕ New** to create a new session
+- Click **✏️ Rename** to rename the current session
+- Sessions persist across browsers and container restarts

 ### Test

-**Test Relay → Cortex pipeline:**
+**Test Standard Mode:**
 ```bash
 curl -X POST http://localhost:7078/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
+    "mode": "standard",
+    "backend": "SECONDARY",
+    "messages": [{"role": "user", "content": "Hello!"}],
+    "sessionId": "test"
+  }'
+```
+
+**Test Cortex Mode (Full Reasoning):**
+```bash
+curl -X POST http://localhost:7078/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "mode": "cortex",
     "messages": [{"role": "user", "content": "Hello Lyra!"}],
-    "session_id": "test"
+    "sessionId": "test"
   }'
 ```
@@ -492,6 +639,21 @@ curl http://localhost:7081/debug/sessions
 curl "http://localhost:7081/debug/summary?session_id=test"
 ```

+**List all sessions:**
+```bash
+curl http://localhost:7078/sessions
+```
+
+**Get session history:**
+```bash
+curl http://localhost:7078/sessions/sess-abc123
+```
+
+**Delete a session:**
+```bash
+curl -X DELETE http://localhost:7078/sessions/sess-abc123
+```
+
 All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.

 ---
@@ -521,6 +683,7 @@ SPEAK_LLM=OPENAI # Use OpenAI for persona
 NEOMEM_LLM=PRIMARY # Use llama.cpp for memory
 UI_LLM=OPENAI # Use OpenAI for UI
 RELAY_LLM=PRIMARY # Use llama.cpp for relay
+STANDARD_MODE_LLM=SECONDARY # Default backend for Standard Mode (NEW in v0.7.0)
 ```

 ### Database Configuration
@@ -541,6 +704,7 @@ NEO4J_PASSWORD=neomemgraph
 NEOMEM_API=http://neomem-api:7077
 CORTEX_API=http://cortex:7081
 CORTEX_REASON_URL=http://cortex:7081/reason
+CORTEX_SIMPLE_URL=http://cortex:7081/simple # NEW in v0.7.0
 CORTEX_INGEST_URL=http://cortex:7081/ingest
 RELAY_URL=http://relay:7078
 ```
@@ -685,7 +849,10 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
 ### Debugging Tips
 - Enable verbose logging: `VERBOSE_DEBUG=true` in `.env`
 - Check Cortex logs: `docker logs cortex -f`
+- Check Relay logs: `docker logs relay -f`
 - Inspect SESSIONS: `curl http://localhost:7081/debug/sessions`
 - Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"`
-- Check Relay logs: `docker logs relay -f`
+- List sessions: `curl http://localhost:7078/sessions`
+- Test Standard Mode: `curl -X POST http://localhost:7078/v1/chat/completions -H "Content-Type: application/json" -d '{"mode":"standard","backend":"SECONDARY","messages":[{"role":"user","content":"test"}],"sessionId":"test"}'`
 - Monitor Docker network: `docker network inspect lyra_net`
+- Check session files: `ls -la core/relay/sessions/`
@@ -14,5 +14,13 @@
 {
   "role": "assistant",
   "content": "Hello Brian! Nice to meet you. As an AI, I don't have physical design capabilities, but I'm here to help with any information or tasks you need. How can I assist you in your design process?"
+},
+{
+  "role": "user",
+  "content": "Can you code python scripts for me?"
+},
+{
+  "role": "assistant",
+  "content": "Sure thing, Brian! I can help you with Python scripting. What specifically do you need assistance with? Whether it's a simple script or something more complex, just let me know the details!"
 }
 ]