diff --git a/CHANGELOG.md b/CHANGELOG.md index f5784f7..7da96ec 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,271 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se --- +## [0.7.0] - 2025-12-21 + +### Added - Standard Mode & UI Enhancements + +**Standard Mode Implementation** +- Added "Standard Mode" chat option that bypasses complex cortex reasoning pipeline + - Provides simple chatbot functionality for coding and practical tasks + - Maintains full conversation context across messages + - Backend-agnostic - works with SECONDARY (Ollama), OPENAI, or custom backends + - Created `/simple` endpoint in Cortex router [cortex/router.py:389](cortex/router.py#L389) +- Mode selector in UI with toggle between Standard and Cortex modes + - Standard Mode: Direct LLM chat with context retention + - Cortex Mode: Full 7-stage reasoning pipeline (unchanged) + +**Backend Selection System** +- UI settings modal with LLM backend selection for Standard Mode + - Radio button selector: SECONDARY (Ollama/Qwen), OPENAI (GPT-4o-mini), or custom + - Backend preference persisted in localStorage + - Custom backend text input for advanced users +- Backend parameter routing through entire stack: + - UI sends `backend` parameter in request body + - Relay forwards backend selection to Cortex + - Cortex `/simple` endpoint respects user's backend choice +- Environment-based fallback: Uses `STANDARD_MODE_LLM` if no backend specified + +**Session Management Overhaul** +- Complete rewrite of session system to use server-side persistence + - File-based storage in `core/relay/sessions/` directory + - Session files: `{sessionId}.json` for history, `{sessionId}.meta.json` for metadata + - Server is source of truth - sessions sync across browsers and reboots +- Session metadata system for friendly names + - Sessions display custom names instead of random IDs + - Rename functionality in session dropdown + - Last modified timestamps and message counts +- Full CRUD API for sessions in Relay: + - `GET /sessions` - List all sessions with metadata + - `GET /sessions/:id` - Retrieve session history + - `POST /sessions/:id` - Save session history + - `PATCH /sessions/:id/metadata` - Update session name/metadata + - `DELETE /sessions/:id` - Delete session and metadata +- Session management UI in settings modal: + - List of all sessions with message counts and timestamps + - Delete button for each session with confirmation + - Automatic session cleanup when deleting current session + +**UI Improvements** +- Settings modal with hamburger menu (⚙ Settings button) + - Backend selection section for Standard Mode + - Session management section with delete functionality + - Clean modal overlay with cyberpunk theme + - ESC key and click-outside to close +- Light/Dark mode toggle with dark mode as default + - Theme preference persisted in localStorage + - CSS variables for seamless theme switching + - Toggle button shows current mode (🌙 Dark Mode / ☀️ Light Mode) +- Removed redundant model selector dropdown from header +- Fixed modal positioning and z-index layering + - Modal moved outside #chat container for proper rendering + - Fixed z-index: overlay (999), modal content (1001) + - Centered modal with proper backdrop blur + +**Context Retention for Standard Mode** +- Integration with Intake module for conversation history + - Added `get_recent_messages()` function in intake.py + - Standard Mode retrieves last 20 messages from session buffer + - Full context sent to LLM on each request +- Message array 
format support in LLM router: + - Updated Ollama provider to accept `messages` parameter + - Updated OpenAI provider to accept `messages` parameter + - Automatic conversion from messages to prompt string for non-chat APIs + +### Changed - Architecture & Routing + +**Relay Server Updates** [core/relay/server.js](core/relay/server.js) +- ES module migration for session persistence: + - Imported `fs/promises`, `path`, `fileURLToPath` for file operations + - Created `SESSIONS_DIR` constant for session storage location +- Mode-based routing in both `/chat` and `/v1/chat/completions` endpoints: + - Extracts `mode` parameter from request body (default: "cortex") + - Routes to `CORTEX_SIMPLE` for Standard Mode, `CORTEX_REASON` for Cortex Mode + - Backend parameter only used in Standard Mode +- Session persistence functions: + - `ensureSessionsDir()` - Creates sessions directory if needed + - `loadSession(sessionId)` - Reads session history from file + - `saveSession(sessionId, history, metadata)` - Writes session to file + - `loadSessionMetadata(sessionId)` - Reads session metadata + - `saveSessionMetadata(sessionId, metadata)` - Updates session metadata + - `listSessions()` - Returns all sessions with metadata, sorted by last modified + - `deleteSession(sessionId)` - Removes session and metadata files + +**Cortex Router Updates** [cortex/router.py](cortex/router.py) +- Added `backend` field to `ReasonRequest` Pydantic model (optional) +- Created `/simple` endpoint for Standard Mode: + - Bypasses reflection, reasoning, refinement stages + - Direct LLM call with conversation context + - Uses backend from request or falls back to `STANDARD_MODE_LLM` env variable + - Returns simple response structure without reasoning artifacts +- Backend selection logic in `/simple`: + - Normalizes backend names to uppercase + - Maps UI backend names to system backend names + - Validates backend availability before calling + +**Intake Integration** [cortex/intake/intake.py](cortex/intake/intake.py) +- Added `get_recent_messages(session_id, limit)` function: + - Retrieves last N messages from session buffer + - Returns empty list if session doesn't exist + - Used by `/simple` endpoint for context retrieval + +**LLM Router Enhancements** [cortex/llm/llm_router.py](cortex/llm/llm_router.py) +- Added `messages` parameter support across all providers +- Automatic message-to-prompt conversion for legacy APIs +- Chat completion format for Ollama and OpenAI providers +- Stop sequences for MI50/DeepSeek R1 to prevent runaway generation: + - `"User:"`, `"\nUser:"`, `"Assistant:"`, `"\n\n\n"` + +**Environment Configuration** [.env](.env) +- Added `STANDARD_MODE_LLM=SECONDARY` for default Standard Mode backend +- Added `CORTEX_SIMPLE_URL=http://cortex:7081/simple` for routing + +**UI Architecture** [core/ui/index.html](core/ui/index.html) +- Server-based session loading system: + - `loadSessionsFromServer()` - Fetches sessions from Relay API + - `renderSessions()` - Populates session dropdown from server data + - Session state synchronized with server on every change +- Backend selection persistence: + - Loads saved backend from localStorage on page load + - Includes backend parameter in request body when in Standard Mode + - Settings modal pre-selects current backend choice +- Dark mode by default: + - Checks localStorage for theme preference + - Sets dark theme if no preference found + - Toggle button updates localStorage and applies theme + +**CSS Styling** [core/ui/style.css](core/ui/style.css) +- Light mode CSS variables: 
+ - `--bg-dark: #f5f5f5` (light background) + - `--text-main: #1a1a1a` (dark text) + - `--text-fade: #666` (dimmed text) +- Dark mode CSS variables (default): + - `--bg-dark: #0a0a0a` (dark background) + - `--text-main: #e6e6e6` (light text) + - `--text-fade: #999` (dimmed text) +- Modal positioning fixes: + - `position: fixed` with `top: 50%`, `left: 50%`, `transform: translate(-50%, -50%)` + - Z-index layering: overlay (999), content (1001) + - Backdrop blur effect on modal overlay +- Session list styling: + - Session item cards with hover effects + - Delete button with red hover state + - Message count and timestamp display + +### Fixed - Critical Issues + +**DeepSeek R1 Runaway Generation** +- Root cause: R1 reasoning model generates thinking process and hallucinates conversations +- Solution: + - Changed `STANDARD_MODE_LLM` to SECONDARY (Ollama/Qwen) instead of PRIMARY (MI50/R1) + - Added stop sequences to MI50 provider to prevent continuation + - Documented R1 limitations for Standard Mode usage + +**Context Not Maintained in Standard Mode** +- Root cause: `/simple` endpoint didn't retrieve conversation history from Intake +- Solution: + - Created `get_recent_messages()` function in intake.py + - Standard Mode now pulls last 20 messages from session buffer + - Full context sent to LLM with each request +- User feedback: "it's saying it hasn't received any other messages from me, so it looks like the standard mode llm isn't getting the full chat" + +**OpenAI Backend 400 Errors** +- Root cause: OpenAI provider only accepted prompt strings, not messages arrays +- Solution: Updated OpenAI provider to support messages parameter like Ollama +- Now handles chat completion format correctly + +**Modal Formatting Issues** +- Root cause: Settings modal inside #chat container with overflow constraints +- Symptoms: Modal appearing at bottom, jumbled layout, couldn't close +- Solution: + - Moved modal outside #chat container to be direct child of body + - Changed positioning from absolute to fixed + - Added proper z-index layering (overlay: 999, content: 1001) + - Removed old model selector from header +- User feedback: "the formating for the settings is all off. Its at the bottom and all jumbling together, i cant get it to go away" + +**Session Persistence Broken** +- Root cause: Sessions stored only in localStorage, not synced with server +- Symptoms: Sessions didn't persist across browsers or reboots, couldn't load messages +- Solution: Complete rewrite of session system + - Implemented server-side file persistence in Relay + - Created CRUD API endpoints for session management + - Updated UI to load sessions from server instead of localStorage + - Added metadata system for session names + - Sessions now survive container restarts and sync across browsers +- User feedback: "sessions seem to exist locally only, i cant get them to actually load any messages and there is now way to delete them. If i open the ui in a different browser those arent there." 
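+
+The context-retention fix above is the key piece for Standard Mode. Below is a minimal sketch of the new Intake helper, assuming Intake keeps per-session history in the in-memory `SESSIONS` buffer referenced elsewhere in this changelog; the structure and field names are illustrative, not the exact implementation:
+
+```python
+# Hypothetical sketch: SESSIONS maps session_id -> {"messages": [{"role": ..., "content": ...}, ...]}
+SESSIONS: dict[str, dict] = {}
+
+def get_recent_messages(session_id: str, limit: int = 20) -> list[dict]:
+    """Return the last `limit` messages for a session, or [] if the session doesn't exist."""
+    session = SESSIONS.get(session_id)
+    if not session:
+        return []
+    return list(session.get("messages", []))[-limit:]
+```
+
+The `/simple` endpoint passes this history to the selected backend as the `messages` array, which is what restores conversational context in Standard Mode.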
+ +### Technical Improvements + +**Backward Compatibility** +- All changes include defaults to maintain existing behavior +- Cortex Mode completely unchanged - still uses full 7-stage pipeline +- Standard Mode is opt-in via UI mode selector +- If no backend specified, falls back to `STANDARD_MODE_LLM` env variable +- Existing requests without mode parameter default to "cortex" + +**Code Quality** +- Consistent async/await patterns throughout stack +- Proper error handling with fallbacks +- Clean separation between Standard and Cortex modes +- Session persistence abstracted into helper functions +- Modular UI code with clear event handlers + +**Performance** +- Standard Mode bypasses 6 of 7 reasoning stages for faster responses +- Session loading optimized with file-based caching +- Backend selection happens once per message, not per LLM call +- Minimal overhead for mode detection and routing + +### Architecture - Dual-Mode Chat System + +**Standard Mode Flow:** +``` +User (UI) → Relay → Cortex /simple → Intake (get_recent_messages) +→ LLM (direct call with context) → Relay → UI +``` + +**Cortex Mode Flow (Unchanged):** +``` +User (UI) → Relay → Cortex /reason → Reflection → Reasoning +→ Refinement → Persona → Relay → UI +``` + +**Session Persistence:** +``` +UI → POST /sessions/:id → Relay → File system (sessions/*.json) +UI → GET /sessions → Relay → List all sessions → UI dropdown +``` + +### Known Limitations + +**Standard Mode:** +- No reflection, reasoning, or refinement stages +- No RAG integration (same as Cortex Mode - currently disabled) +- No NeoMem memory storage (same as Cortex Mode - currently disabled) +- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts) + +**Session Management:** +- Sessions stored in container filesystem - need volume mount for true persistence +- No session import/export functionality yet +- No session search or filtering + +### Migration Notes + +**For Users Upgrading:** +1. Existing sessions in localStorage will not automatically migrate to server +2. Create new sessions after upgrade for server-side persistence +3. Theme preference (light/dark) will be preserved from localStorage +4. Backend preference will default to SECONDARY if not previously set + +**For Developers:** +1. Relay now requires `fs/promises` for session persistence +2. Cortex `/simple` endpoint expects `backend` parameter (optional) +3. UI sends `mode` and `backend` parameters in request body +4. Session files stored in `core/relay/sessions/` directory + +--- + ## [0.6.0] - 2025-12-18 ### Added - Autonomy System (Phase 1 & 2) diff --git a/README.md b/README.md index 0afc2b6..b8b1525 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,12 @@ -# Project Lyra - README v0.6.0 +# Project Lyra - README v0.7.0 Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making. It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**, featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends. -**Current Version:** v0.6.0 (2025-12-18) +**NEW in v0.7.0:** Standard Mode for simple chatbot functionality + UI backend selection + server-side session persistence + +**Current Version:** v0.7.0 (2025-12-21) > **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration. 
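+
+For a quick feel for the new dual-mode request shape, here is a minimal client-side sketch (Python `requests`, complementing the curl examples in the Test section below). It assumes the Relay is reachable on `localhost:7078` and does not assume anything about the exact response schema beyond it being JSON:
+
+```python
+import requests
+
+# Standard Mode request: Relay routes this to Cortex /simple instead of /reason
+resp = requests.post(
+    "http://localhost:7078/v1/chat/completions",
+    json={
+        "mode": "standard",      # omit or use "cortex" for the full reasoning pipeline
+        "backend": "SECONDARY",  # SECONDARY (Ollama), OPENAI, or a custom backend name
+        "sessionId": "test",
+        "messages": [{"role": "user", "content": "Hello!"}],
+    },
+    timeout=120,
+)
+print(resp.json())  # print the raw JSON; exact response fields are not assumed here
+```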
@@ -25,14 +27,18 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do - Coordinates all module interactions - OpenAI-compatible endpoint: `POST /v1/chat/completions` - Internal endpoint: `POST /chat` -- Routes messages through Cortex reasoning pipeline +- Dual-mode routing: Standard Mode (simple chat) or Cortex Mode (full reasoning) +- Server-side session persistence with file-based storage +- Session management API: `GET/POST/PATCH/DELETE /sessions` - Manages async calls to Cortex ingest - *(NeoMem integration currently disabled in v0.6.0)* **2. UI** (Static HTML) - Browser-based chat interface with cyberpunk theme -- Connects to Relay -- Saves and loads sessions +- **NEW:** Mode selector (Standard/Cortex) in header +- **NEW:** Settings modal with backend selection and session management +- **NEW:** Light/Dark mode toggle (dark by default) +- Server-synced session management (persists across browsers and reboots) - OpenAI-compatible message format **3. NeoMem** (Python/FastAPI) - Port 7077 - **DISABLED IN v0.6.0** @@ -49,15 +55,22 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do - Primary reasoning engine with multi-stage pipeline and autonomy system - **Includes embedded Intake module** (no separate service as of v0.5.1) - **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below -- **4-Stage Processing:** - 1. **Reflection** - Generates meta-awareness notes about conversation - 2. **Reasoning** - Creates initial draft answer using context - 3. **Refinement** - Polishes and improves the draft - 4. **Persona** - Applies Lyra's personality and speaking style +- **Dual Operating Modes:** + - **Standard Mode** (NEW in v0.7.0) - Simple chatbot with context retention + - Bypasses reflection, reasoning, refinement stages + - Direct LLM call with conversation history + - User-selectable backend (SECONDARY, OPENAI, or custom) + - Faster responses for coding and practical tasks + - **Cortex Mode** - Full 4-stage reasoning pipeline + 1. **Reflection** - Generates meta-awareness notes about conversation + 2. **Reasoning** - Creates initial draft answer using context + 3. **Refinement** - Polishes and improves the draft + 4. 
**Persona** - Applies Lyra's personality and speaking style - Integrates with Intake for short-term context via internal Python imports - Flexible LLM router supporting multiple backends via HTTP - **Endpoints:** - - `POST /reason` - Main reasoning pipeline + - `POST /reason` - Main reasoning pipeline (Cortex Mode) + - `POST /simple` - Direct LLM chat (Standard Mode) **NEW in v0.7.0** - `POST /ingest` - Receives conversation exchanges from Relay - `GET /health` - Service health check - `GET /debug/sessions` - Inspect in-memory SESSIONS state @@ -129,12 +142,38 @@ The autonomy system operates in coordinated layers, all maintaining state in `se --- -## Data Flow Architecture (v0.6.0) +## Data Flow Architecture (v0.7.0) -### Normal Message Flow: +### Standard Mode Flow (NEW in v0.7.0): ``` -User (UI) → POST /v1/chat/completions +User (UI) → POST /v1/chat/completions {mode: "standard", backend: "SECONDARY"} + ↓ +Relay (7078) + ↓ POST /simple +Cortex (7081) + ↓ (internal Python call) +Intake module → get_recent_messages() (last 20 messages) + ↓ +Direct LLM call (user-selected backend: SECONDARY/OPENAI/custom) + ↓ +Returns simple response to Relay + ↓ +Relay → POST /ingest (async) + ↓ +Cortex → add_exchange_internal() → SESSIONS buffer + ↓ +Relay → POST /sessions/:id (save session to file) + ↓ +Relay → UI (returns final response) + +Note: Bypasses reflection, reasoning, refinement, persona stages +``` + +### Cortex Mode Flow (Full Reasoning): + +``` +User (UI) → POST /v1/chat/completions {mode: "cortex"} ↓ Relay (7078) ↓ POST /reason @@ -158,11 +197,26 @@ Cortex → add_exchange_internal() → SESSIONS buffer ↓ Autonomy System → Update self_state.json (pattern tracking) ↓ +Relay → POST /sessions/:id (save session to file) + ↓ Relay → UI (returns final response) Note: NeoMem integration disabled in v0.6.0 ``` +### Session Persistence Flow (NEW in v0.7.0): + +``` +UI loads → GET /sessions → Relay → List all sessions from files → UI dropdown +User sends message → POST /sessions/:id → Relay → Save to sessions/*.json +User renames session → PATCH /sessions/:id/metadata → Relay → Update *.meta.json +User deletes session → DELETE /sessions/:id → Relay → Remove session files + +Sessions stored in: core/relay/sessions/ +- {sessionId}.json (conversation history) +- {sessionId}.meta.json (name, timestamps, metadata) +``` + ### Cortex 4-Stage Reasoning Pipeline: 1. 
**Reflection** (`reflection.py`) - Cloud LLM (OpenAI) @@ -196,6 +250,14 @@ Note: NeoMem integration disabled in v0.6.0 - OpenAI-compatible endpoint: `POST /v1/chat/completions` - Internal endpoint: `POST /chat` - Health check: `GET /_health` +- **NEW:** Dual-mode routing (Standard/Cortex) +- **NEW:** Server-side session persistence with CRUD API +- **NEW:** Session management endpoints: + - `GET /sessions` - List all sessions + - `GET /sessions/:id` - Retrieve session history + - `POST /sessions/:id` - Save session history + - `PATCH /sessions/:id/metadata` - Update session metadata + - `DELETE /sessions/:id` - Delete session - Async non-blocking calls to Cortex - Shared request handler for code reuse - Comprehensive error handling @@ -210,19 +272,35 @@ Note: NeoMem integration disabled in v0.6.0 **UI**: - Lightweight static HTML chat interface -- Cyberpunk theme -- Session save/load functionality +- Cyberpunk theme with light/dark mode toggle +- **NEW:** Mode selector (Standard/Cortex) in header +- **NEW:** Settings modal (⚙ button) with: + - Backend selection for Standard Mode (SECONDARY/OPENAI/custom) + - Session management (view, delete sessions) + - Theme toggle (dark mode default) +- **NEW:** Server-synced session management + - Sessions persist across browsers and reboots + - Rename sessions with custom names + - Delete sessions with confirmation + - Automatic session save on every message - OpenAI message format support ### Reasoning Layer -**Cortex** (v0.5.1): -- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona) +**Cortex** (v0.7.0): +- **NEW:** Dual operating modes: + - **Standard Mode** - Simple chat with context (`/simple` endpoint) + - User-selectable backend (SECONDARY, OPENAI, or custom) + - Full conversation history via Intake integration + - Bypasses reasoning pipeline for faster responses + - **Cortex Mode** - Full reasoning pipeline (`/reason` endpoint) + - Multi-stage processing: reflection → reasoning → refine → persona + - Per-stage backend selection + - Autonomy system integration - Flexible LLM backend routing via HTTP -- Per-stage backend selection - Async processing throughout - Embedded Intake module for short-term context -- `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints +- `/reason`, `/simple`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints - Lenient error handling - never fails the chat pipeline **Intake** (Embedded Module): @@ -327,7 +405,28 @@ The following LLM backends are accessed via HTTP (not part of docker-compose): ## Version History -### v0.6.0 (2025-12-18) - Current Release +### v0.7.0 (2025-12-21) - Current Release +**Major Features: Standard Mode + Backend Selection + Session Persistence** +- ✅ Added Standard Mode for simple chatbot functionality +- ✅ UI mode selector (Standard/Cortex) in header +- ✅ Settings modal with backend selection for Standard Mode +- ✅ Server-side session persistence with file-based storage +- ✅ Session management UI (view, rename, delete sessions) +- ✅ Light/Dark mode toggle (dark by default) +- ✅ Context retention in Standard Mode via Intake integration +- ✅ Fixed modal positioning and z-index issues +- ✅ Cortex `/simple` endpoint for direct LLM calls +- ✅ Session CRUD API in Relay +- ✅ Full backward compatibility - Cortex Mode unchanged + +**Key Changes:** +- Standard Mode bypasses 6 of 7 reasoning stages for faster responses +- Sessions now sync across browsers and survive container restarts +- User can select SECONDARY (Ollama), OPENAI, or 
custom backend for Standard Mode +- Theme preference and backend selection persisted in localStorage +- Session files stored in `core/relay/sessions/` directory + +### v0.6.0 (2025-12-18) **Major Feature: Autonomy System (Phase 1, 2, and 2.5)** - ✅ Added autonomous decision-making framework - ✅ Implemented executive planning and goal-setting layer @@ -394,30 +493,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose): --- -## Known Issues (v0.6.0) +## Known Issues (v0.7.0) -### Temporarily Disabled (v0.6.0) +### Temporarily Disabled - **NeoMem disabled by default** - Being refined independently before full integration - PostgreSQL + pgvector storage inactive - Neo4j graph database inactive - Memory persistence endpoints not active - RAG service (Beta Lyrae) currently disabled in docker-compose.yml -### Non-Critical -- Session management endpoints not fully implemented in Relay -- Full autonomy system integration still being refined -- Memory retrieval integration pending NeoMem re-enablement +### Standard Mode Limitations +- No reflection, reasoning, or refinement stages (by design) +- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts) +- No RAG integration (same as Cortex Mode - currently disabled) +- No NeoMem memory storage (same as Cortex Mode - currently disabled) + +### Session Management Limitations +- Sessions stored in container filesystem - requires volume mount for true persistence +- No session import/export functionality yet +- No session search or filtering +- Old localStorage sessions don't automatically migrate to server ### Operational Notes - **Single-worker constraint**: Cortex must run with single Uvicorn worker to maintain SESSIONS state - Multi-worker scaling requires migrating SESSIONS to Redis or shared storage - Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting +- Backend selection only affects Standard Mode - Cortex Mode uses environment-configured backends ### Future Enhancements - Re-enable NeoMem integration after pipeline refinement - Full autonomy system maturation and optimization - Re-enable RAG service integration -- Implement full session persistence +- Session import/export functionality +- Session search and filtering UI - Migrate SESSIONS to Redis for multi-worker support - Add request correlation IDs for tracing - Comprehensive health checks across all services @@ -457,17 +565,56 @@ The following LLM backends are accessed via HTTP (not part of docker-compose): curl http://localhost:7077/health ``` -4. Access the UI at `http://localhost:7078` +4. Access the UI at `http://localhost:8081` + +### Using the UI + +**Mode Selection:** +- Use the **Mode** dropdown in the header to switch between: + - **Standard** - Simple chatbot for coding and practical tasks + - **Cortex** - Full reasoning pipeline with autonomy features + +**Settings Menu:** +1. Click the **⚙ Settings** button in the header +2. **Backend Selection** (Standard Mode only): + - Choose **SECONDARY** (Ollama/Qwen on 3090) - Fast, local + - Choose **OPENAI** (GPT-4o-mini) - Cloud-based, high quality + - Enter custom backend name for advanced configurations +3. **Session Management**: + - View all saved sessions with message counts and timestamps + - Click 🗑️ to delete unwanted sessions +4. 
**Theme Toggle**: + - Click **🌙 Dark Mode** or **☀️ Light Mode** to switch themes + +**Session Management:** +- Sessions automatically save on every message +- Use the **Session** dropdown to switch between sessions +- Click **➕ New** to create a new session +- Click **✏️ Rename** to rename the current session +- Sessions persist across browsers and container restarts ### Test -**Test Relay → Cortex pipeline:** +**Test Standard Mode:** ```bash curl -X POST http://localhost:7078/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ + "mode": "standard", + "backend": "SECONDARY", + "messages": [{"role": "user", "content": "Hello!"}], + "sessionId": "test" + }' +``` + +**Test Cortex Mode (Full Reasoning):** +```bash +curl -X POST http://localhost:7078/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "mode": "cortex", "messages": [{"role": "user", "content": "Hello Lyra!"}], - "session_id": "test" + "sessionId": "test" }' ``` @@ -492,6 +639,21 @@ curl http://localhost:7081/debug/sessions curl "http://localhost:7081/debug/summary?session_id=test" ``` +**List all sessions:** +```bash +curl http://localhost:7078/sessions +``` + +**Get session history:** +```bash +curl http://localhost:7078/sessions/sess-abc123 +``` + +**Delete a session:** +```bash +curl -X DELETE http://localhost:7078/sessions/sess-abc123 +``` + All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack. --- @@ -515,12 +677,13 @@ OPENAI_API_KEY=sk-... **Module-specific backend selection:** ```bash -CORTEX_LLM=SECONDARY # Use Ollama for reasoning -INTAKE_LLM=PRIMARY # Use llama.cpp for summarization -SPEAK_LLM=OPENAI # Use OpenAI for persona -NEOMEM_LLM=PRIMARY # Use llama.cpp for memory -UI_LLM=OPENAI # Use OpenAI for UI -RELAY_LLM=PRIMARY # Use llama.cpp for relay +CORTEX_LLM=SECONDARY # Use Ollama for reasoning +INTAKE_LLM=PRIMARY # Use llama.cpp for summarization +SPEAK_LLM=OPENAI # Use OpenAI for persona +NEOMEM_LLM=PRIMARY # Use llama.cpp for memory +UI_LLM=OPENAI # Use OpenAI for UI +RELAY_LLM=PRIMARY # Use llama.cpp for relay +STANDARD_MODE_LLM=SECONDARY # Default backend for Standard Mode (NEW in v0.7.0) ``` ### Database Configuration @@ -541,6 +704,7 @@ NEO4J_PASSWORD=neomemgraph NEOMEM_API=http://neomem-api:7077 CORTEX_API=http://cortex:7081 CORTEX_REASON_URL=http://cortex:7081/reason +CORTEX_SIMPLE_URL=http://cortex:7081/simple # NEW in v0.7.0 CORTEX_INGEST_URL=http://cortex:7081/ingest RELAY_URL=http://relay:7078 ``` @@ -685,7 +849,10 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0). 
### Debugging Tips - Enable verbose logging: `VERBOSE_DEBUG=true` in `.env` - Check Cortex logs: `docker logs cortex -f` +- Check Relay logs: `docker logs relay -f` - Inspect SESSIONS: `curl http://localhost:7081/debug/sessions` - Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"` -- Check Relay logs: `docker logs relay -f` +- List sessions: `curl http://localhost:7078/sessions` +- Test Standard Mode: `curl -X POST http://localhost:7078/v1/chat/completions -H "Content-Type: application/json" -d '{"mode":"standard","backend":"SECONDARY","messages":[{"role":"user","content":"test"}],"sessionId":"test"}'` - Monitor Docker network: `docker network inspect lyra_net` +- Check session files: `ls -la core/relay/sessions/` diff --git a/core/relay/sessions/sess-dnm44wyb.json b/core/relay/sessions/sess-dnm44wyb.json index 9d8a32e..d541f62 100644 --- a/core/relay/sessions/sess-dnm44wyb.json +++ b/core/relay/sessions/sess-dnm44wyb.json @@ -14,5 +14,13 @@ { "role": "assistant", "content": "Hello Brian! Nice to meet you. As an AI, I don't have physical design capabilities, but I'm here to help with any information or tasks you need. How can I assist you in your design process?" + }, + { + "role": "user", + "content": "Can you code python scripts for me?" + }, + { + "role": "assistant", + "content": "Sure thing, Brian! I can help you with Python scripting. What specifically do you need assistance with? Whether it's a simple script or something more complex, just let me know the details!" } ] \ No newline at end of file