Update to v0.9.1 #1

CHANGELOG.md
@@ -9,6 +9,271 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se
---
## [0.7.0] - 2025-12-21

### Added - Standard Mode & UI Enhancements

**Standard Mode Implementation**
- Added "Standard Mode" chat option that bypasses the complex Cortex reasoning pipeline
- Provides simple chatbot functionality for coding and practical tasks
- Maintains full conversation context across messages
- Backend-agnostic - works with SECONDARY (Ollama), OPENAI, or custom backends
- Created `/simple` endpoint in Cortex router [cortex/router.py:389](cortex/router.py#L389)
- Mode selector in UI with toggle between Standard and Cortex modes
  - Standard Mode: Direct LLM chat with context retention
  - Cortex Mode: Full 7-stage reasoning pipeline (unchanged)

**Backend Selection System**
- UI settings modal with LLM backend selection for Standard Mode
- Radio button selector: SECONDARY (Ollama/Qwen), OPENAI (GPT-4o-mini), or custom
- Backend preference persisted in localStorage
- Custom backend text input for advanced users
- Backend parameter routing through the entire stack:
  - UI sends `backend` parameter in request body
  - Relay forwards backend selection to Cortex
  - Cortex `/simple` endpoint respects the user's backend choice
  - Environment-based fallback: uses `STANDARD_MODE_LLM` if no backend specified

**Session Management Overhaul**
- Complete rewrite of the session system to use server-side persistence
  - File-based storage in `core/relay/sessions/` directory
  - Session files: `{sessionId}.json` for history, `{sessionId}.meta.json` for metadata
  - Server is the source of truth - sessions sync across browsers and reboots
- Session metadata system for friendly names
  - Sessions display custom names instead of random IDs
  - Rename functionality in session dropdown
  - Last-modified timestamps and message counts
- Full CRUD API for sessions in Relay (usage sketch below):
  - `GET /sessions` - List all sessions with metadata
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - `PATCH /sessions/:id/metadata` - Update session name/metadata
  - `DELETE /sessions/:id` - Delete session and metadata
- Session management UI in settings modal:
  - List of all sessions with message counts and timestamps
  - Delete button for each session with confirmation
  - Automatic session cleanup when deleting the current session
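
As a quick usage illustration of the CRUD API above, here is a minimal Python `requests` sketch; the routes come from this changelog, while the request/response body shapes are assumptions:

```python
# Hypothetical client for the Relay session API; only the routes are
# documented above - the payload shapes ("history", "name") are assumptions.
import requests

RELAY = "http://localhost:7078"

# List all sessions with metadata
sessions = requests.get(f"{RELAY}/sessions").json()

# Save history for a session (assumed body shape)
requests.post(f"{RELAY}/sessions/test", json={
    "history": [{"role": "user", "content": "Hello!"}],
})

# Rename a session via its metadata
requests.patch(f"{RELAY}/sessions/test/metadata", json={"name": "My chat"})

# Fetch the history back, then delete the session and its metadata
history = requests.get(f"{RELAY}/sessions/test").json()
requests.delete(f"{RELAY}/sessions/test")
```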

**UI Improvements**
- Settings modal with hamburger menu (⚙ Settings button)
  - Backend selection section for Standard Mode
  - Session management section with delete functionality
  - Clean modal overlay with cyberpunk theme
  - ESC key and click-outside to close
- Light/Dark mode toggle with dark mode as default
  - Theme preference persisted in localStorage
  - CSS variables for seamless theme switching
  - Toggle button shows current mode (🌙 Dark Mode / ☀️ Light Mode)
- Removed redundant model selector dropdown from header
- Fixed modal positioning and z-index layering
  - Modal moved outside #chat container for proper rendering
  - Fixed z-index: overlay (999), modal content (1001)
  - Centered modal with proper backdrop blur

**Context Retention for Standard Mode**
- Integration with Intake module for conversation history
  - Added `get_recent_messages()` function in intake.py
  - Standard Mode retrieves last 20 messages from session buffer
  - Full context sent to LLM on each request
- Message array format support in LLM router:
  - Updated Ollama provider to accept `messages` parameter
  - Updated OpenAI provider to accept `messages` parameter
  - Automatic conversion from messages to prompt string for non-chat APIs

### Changed - Architecture & Routing

**Relay Server Updates** [core/relay/server.js](core/relay/server.js)
- ES module migration for session persistence:
  - Imported `fs/promises`, `path`, `fileURLToPath` for file operations
  - Created `SESSIONS_DIR` constant for session storage location
- Mode-based routing in both `/chat` and `/v1/chat/completions` endpoints:
  - Extracts `mode` parameter from request body (default: "cortex")
  - Routes to `CORTEX_SIMPLE` for Standard Mode, `CORTEX_REASON` for Cortex Mode
  - Backend parameter only used in Standard Mode
- Session persistence functions (sketched below):
  - `ensureSessionsDir()` - Creates sessions directory if needed
  - `loadSession(sessionId)` - Reads session history from file
  - `saveSession(sessionId, history, metadata)` - Writes session to file
  - `loadSessionMetadata(sessionId)` - Reads session metadata
  - `saveSessionMetadata(sessionId, metadata)` - Updates session metadata
  - `listSessions()` - Returns all sessions with metadata, sorted by last modified
  - `deleteSession(sessionId)` - Removes session and metadata files
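
Relay implements these helpers in JavaScript; purely to illustrate the on-disk layout they manage, here is the equivalent logic as a Python sketch (function names mirror the list above, everything else is assumed):

```python
# Illustrative Python equivalent of the Relay session helpers (the real
# implementation lives in core/relay/server.js and is JavaScript).
import json
from pathlib import Path

SESSIONS_DIR = Path("core/relay/sessions")

def ensure_sessions_dir() -> None:
    # Creates the sessions directory if needed
    SESSIONS_DIR.mkdir(parents=True, exist_ok=True)

def save_session(session_id: str, history: list, metadata: dict) -> None:
    # {sessionId}.json holds history, {sessionId}.meta.json holds metadata
    ensure_sessions_dir()
    (SESSIONS_DIR / f"{session_id}.json").write_text(json.dumps(history))
    (SESSIONS_DIR / f"{session_id}.meta.json").write_text(json.dumps(metadata))

def load_session(session_id: str) -> list:
    # Reads session history from file; empty history if the file is missing
    path = SESSIONS_DIR / f"{session_id}.json"
    return json.loads(path.read_text()) if path.exists() else []

def list_sessions() -> list:
    # All sessions with metadata, sorted by last modified (newest first)
    entries = []
    for meta_file in SESSIONS_DIR.glob("*.meta.json"):
        meta = json.loads(meta_file.read_text())
        meta["sessionId"] = meta_file.name[: -len(".meta.json")]
        entries.append((meta_file.stat().st_mtime, meta))
    return [meta for _, meta in sorted(entries, key=lambda e: e[0], reverse=True)]

def delete_session(session_id: str) -> None:
    # Removes both the history file and the metadata file
    (SESSIONS_DIR / f"{session_id}.json").unlink(missing_ok=True)
    (SESSIONS_DIR / f"{session_id}.meta.json").unlink(missing_ok=True)
```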

**Cortex Router Updates** [cortex/router.py](cortex/router.py)
- Added `backend` field to `ReasonRequest` Pydantic model (optional)
- Created `/simple` endpoint for Standard Mode:
  - Bypasses reflection, reasoning, refinement stages
  - Direct LLM call with conversation context
  - Uses backend from request or falls back to `STANDARD_MODE_LLM` env variable
  - Returns simple response structure without reasoning artifacts
- Backend selection logic in `/simple` (sketched below):
  - Normalizes backend names to uppercase
  - Maps UI backend names to system backend names
  - Validates backend availability before calling
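
A minimal sketch of that selection logic, assuming a simple alias map and backend registry (the names here are illustrative, not the actual `router.py` code):

```python
# Sketch of /simple backend resolution: normalize -> map UI name -> validate,
# with STANDARD_MODE_LLM as the environment fallback. AVAILABLE_BACKENDS and
# BACKEND_ALIASES are assumed names, not taken from the repo.
import os

AVAILABLE_BACKENDS = {"PRIMARY", "SECONDARY", "OPENAI"}  # assumed registry
BACKEND_ALIASES = {"OLLAMA": "SECONDARY"}                # assumed UI-to-system map

def resolve_backend(requested=None) -> str:
    # Fall back to the STANDARD_MODE_LLM env variable if no backend was sent
    name = (requested or os.getenv("STANDARD_MODE_LLM", "SECONDARY")).upper()
    name = BACKEND_ALIASES.get(name, name)
    if name not in AVAILABLE_BACKENDS:
        raise ValueError(f"Unknown backend: {name}")
    return name
```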

**Intake Integration** [cortex/intake/intake.py](cortex/intake/intake.py)
- Added `get_recent_messages(session_id, limit)` function (sketch below):
  - Retrieves last N messages from session buffer
  - Returns empty list if session doesn't exist
  - Used by `/simple` endpoint for context retrieval
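
A plausible shape for `get_recent_messages()`, assuming the in-memory `SESSIONS` buffer maps session IDs to message lists (the real implementation may differ):

```python
# Sketch only: SESSIONS here stands in for Intake's session buffer.
SESSIONS = {}  # session_id -> list of {"role": ..., "content": ...} dicts

def get_recent_messages(session_id: str, limit: int = 20) -> list:
    # Returns an empty list if the session doesn't exist
    buffer = SESSIONS.get(session_id)
    if not buffer:
        return []
    # Last N messages from the session buffer, oldest first
    return buffer[-limit:]
```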

**LLM Router Enhancements** [cortex/llm/llm_router.py](cortex/llm/llm_router.py)
- Added `messages` parameter support across all providers
- Automatic message-to-prompt conversion for legacy APIs (sketched below)
- Chat completion format for Ollama and OpenAI providers
- Stop sequences for MI50/DeepSeek R1 to prevent runaway generation:
  - `"User:"`, `"\nUser:"`, `"Assistant:"`, `"\n\n\n"`
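
For completion-style (non-chat) backends, the conversion might look like the sketch below; the `User:`/`Assistant:` template matches the stop sequences above, though the exact prompt format in `llm_router.py` is an assumption:

```python
# Flatten an OpenAI-style messages array into a single prompt string for
# backends that only accept plain prompts. Template is assumed.
R1_STOP_SEQUENCES = ["User:", "\nUser:", "Assistant:", "\n\n\n"]

def messages_to_prompt(messages: list) -> str:
    lines = []
    for msg in messages:
        role = "User" if msg["role"] == "user" else "Assistant"
        lines.append(f"{role}: {msg['content']}")
    lines.append("Assistant:")  # cue the model to answer as the assistant
    return "\n".join(lines)
```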

**Environment Configuration** [.env](.env)
- Added `STANDARD_MODE_LLM=SECONDARY` for default Standard Mode backend
- Added `CORTEX_SIMPLE_URL=http://cortex:7081/simple` for routing

**UI Architecture** [core/ui/index.html](core/ui/index.html)
- Server-based session loading system:
  - `loadSessionsFromServer()` - Fetches sessions from Relay API
  - `renderSessions()` - Populates session dropdown from server data
  - Session state synchronized with server on every change
- Backend selection persistence:
  - Loads saved backend from localStorage on page load
  - Includes backend parameter in request body when in Standard Mode
  - Settings modal pre-selects current backend choice
- Dark mode by default:
  - Checks localStorage for theme preference
  - Sets dark theme if no preference found
  - Toggle button updates localStorage and applies theme

**CSS Styling** [core/ui/style.css](core/ui/style.css)
- Light mode CSS variables:
  - `--bg-dark: #f5f5f5` (light background)
  - `--text-main: #1a1a1a` (dark text)
  - `--text-fade: #666` (dimmed text)
- Dark mode CSS variables (default):
  - `--bg-dark: #0a0a0a` (dark background)
  - `--text-main: #e6e6e6` (light text)
  - `--text-fade: #999` (dimmed text)
- Modal positioning fixes:
  - `position: fixed` with `top: 50%`, `left: 50%`, `transform: translate(-50%, -50%)`
  - Z-index layering: overlay (999), content (1001)
  - Backdrop blur effect on modal overlay
- Session list styling:
  - Session item cards with hover effects
  - Delete button with red hover state
  - Message count and timestamp display

### Fixed - Critical Issues

**DeepSeek R1 Runaway Generation**
- Root cause: R1 reasoning model generates its thinking process and hallucinates conversations
- Solution:
  - Changed `STANDARD_MODE_LLM` to SECONDARY (Ollama/Qwen) instead of PRIMARY (MI50/R1)
  - Added stop sequences to MI50 provider to prevent continuation
  - Documented R1 limitations for Standard Mode usage

**Context Not Maintained in Standard Mode**
- Root cause: `/simple` endpoint didn't retrieve conversation history from Intake
- Solution:
  - Created `get_recent_messages()` function in intake.py
  - Standard Mode now pulls last 20 messages from session buffer
  - Full context sent to LLM with each request
- User feedback: "it's saying it hasn't received any other messages from me, so it looks like the standard mode llm isn't getting the full chat"

**OpenAI Backend 400 Errors**
- Root cause: OpenAI provider only accepted prompt strings, not messages arrays
- Solution: updated OpenAI provider to support `messages` parameter like Ollama
- Now handles chat completion format correctly

**Modal Formatting Issues**
- Root cause: settings modal sat inside #chat container with overflow constraints
- Symptoms: modal appearing at bottom, jumbled layout, couldn't close
- Solution:
  - Moved modal outside #chat container to be a direct child of body
  - Changed positioning from absolute to fixed
  - Added proper z-index layering (overlay: 999, content: 1001)
  - Removed old model selector from header
- User feedback: "the formating for the settings is all off. Its at the bottom and all jumbling together, i cant get it to go away"

**Session Persistence Broken**
- Root cause: sessions stored only in localStorage, not synced with server
- Symptoms: sessions didn't persist across browsers or reboots, couldn't load messages
- Solution: complete rewrite of session system
  - Implemented server-side file persistence in Relay
  - Created CRUD API endpoints for session management
  - Updated UI to load sessions from server instead of localStorage
  - Added metadata system for session names
  - Sessions now survive container restarts and sync across browsers
- User feedback: "sessions seem to exist locally only, i cant get them to actually load any messages and there is now way to delete them. If i open the ui in a different browser those arent there."

### Technical Improvements

**Backward Compatibility**
- All changes include defaults to maintain existing behavior
- Cortex Mode completely unchanged - still uses full 7-stage pipeline
- Standard Mode is opt-in via UI mode selector
- If no backend specified, falls back to `STANDARD_MODE_LLM` env variable
- Existing requests without mode parameter default to "cortex"

**Code Quality**
- Consistent async/await patterns throughout stack
- Proper error handling with fallbacks
- Clean separation between Standard and Cortex modes
- Session persistence abstracted into helper functions
- Modular UI code with clear event handlers

**Performance**
- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
- Session loading optimized with file-based caching
- Backend selection happens once per message, not per LLM call
- Minimal overhead for mode detection and routing

### Architecture - Dual-Mode Chat System

**Standard Mode Flow:**
```
User (UI) → Relay → Cortex /simple → Intake (get_recent_messages)
→ LLM (direct call with context) → Relay → UI
```

**Cortex Mode Flow (Unchanged):**
```
User (UI) → Relay → Cortex /reason → Reflection → Reasoning
→ Refinement → Persona → Relay → UI
```

**Session Persistence:**
```
UI → POST /sessions/:id → Relay → File system (sessions/*.json)
UI → GET /sessions → Relay → List all sessions → UI dropdown
```

### Known Limitations

**Standard Mode:**
- No reflection, reasoning, or refinement stages
- No RAG integration (same as Cortex Mode - currently disabled)
- No NeoMem memory storage (same as Cortex Mode - currently disabled)
- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)

**Session Management:**
- Sessions stored in container filesystem - need volume mount for true persistence
- No session import/export functionality yet
- No session search or filtering

### Migration Notes

**For Users Upgrading:**
1. Existing sessions in localStorage will not automatically migrate to the server
2. Create new sessions after upgrading for server-side persistence
3. Theme preference (light/dark) will be preserved from localStorage
4. Backend preference will default to SECONDARY if not previously set

**For Developers:**
1. Relay now requires `fs/promises` for session persistence
2. Cortex `/simple` endpoint accepts an optional `backend` parameter
3. UI sends `mode` and `backend` parameters in request body
4. Session files stored in `core/relay/sessions/` directory

---

## [0.6.0] - 2025-12-18

### Added - Autonomy System (Phase 1 & 2)

README.md
@@ -1,10 +1,12 @@
-# Project Lyra - README v0.6.0
+# Project Lyra - README v0.7.0

 Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
 It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
 featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.

-**Current Version:** v0.6.0 (2025-12-18)
+**NEW in v0.7.0:** Standard Mode for simple chatbot functionality + UI backend selection + server-side session persistence
+
+**Current Version:** v0.7.0 (2025-12-21)

 > **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.
@@ -25,14 +27,18 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Coordinates all module interactions
 - OpenAI-compatible endpoint: `POST /v1/chat/completions`
 - Internal endpoint: `POST /chat`
-- Routes messages through Cortex reasoning pipeline
+- Dual-mode routing: Standard Mode (simple chat) or Cortex Mode (full reasoning)
+- Server-side session persistence with file-based storage
+- Session management API: `GET/POST/PATCH/DELETE /sessions`
 - Manages async calls to Cortex ingest
 - *(NeoMem integration currently disabled in v0.6.0)*

 **2. UI** (Static HTML)
 - Browser-based chat interface with cyberpunk theme
-- Connects to Relay
-- Saves and loads sessions
+- **NEW:** Mode selector (Standard/Cortex) in header
+- **NEW:** Settings modal with backend selection and session management
+- **NEW:** Light/Dark mode toggle (dark by default)
+- Server-synced session management (persists across browsers and reboots)
 - OpenAI-compatible message format

 **3. NeoMem** (Python/FastAPI) - Port 7077 - **DISABLED IN v0.6.0**
@@ -49,7 +55,13 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Primary reasoning engine with multi-stage pipeline and autonomy system
 - **Includes embedded Intake module** (no separate service as of v0.5.1)
 - **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
-- **4-Stage Processing:**
+- **Dual Operating Modes:**
+  - **Standard Mode** (NEW in v0.7.0) - Simple chatbot with context retention
+    - Bypasses reflection, reasoning, refinement stages
+    - Direct LLM call with conversation history
+    - User-selectable backend (SECONDARY, OPENAI, or custom)
+    - Faster responses for coding and practical tasks
+  - **Cortex Mode** - Full 4-stage reasoning pipeline
   1. **Reflection** - Generates meta-awareness notes about conversation
   2. **Reasoning** - Creates initial draft answer using context
   3. **Refinement** - Polishes and improves the draft
@@ -57,7 +69,8 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Integrates with Intake for short-term context via internal Python imports
 - Flexible LLM router supporting multiple backends via HTTP
 - **Endpoints:**
-  - `POST /reason` - Main reasoning pipeline
+  - `POST /reason` - Main reasoning pipeline (Cortex Mode)
+  - `POST /simple` - Direct LLM chat (Standard Mode) **NEW in v0.7.0**
   - `POST /ingest` - Receives conversation exchanges from Relay
   - `GET /health` - Service health check
   - `GET /debug/sessions` - Inspect in-memory SESSIONS state
@@ -129,12 +142,38 @@ The autonomy system operates in coordinated layers, all maintaining state in `se

 ---

-## Data Flow Architecture (v0.6.0)
+## Data Flow Architecture (v0.7.0)

-### Normal Message Flow:
+### Standard Mode Flow (NEW in v0.7.0):

 ```
-User (UI) → POST /v1/chat/completions
+User (UI) → POST /v1/chat/completions {mode: "standard", backend: "SECONDARY"}
+    ↓
+Relay (7078)
+    ↓ POST /simple
+Cortex (7081)
+    ↓ (internal Python call)
+Intake module → get_recent_messages() (last 20 messages)
+    ↓
+Direct LLM call (user-selected backend: SECONDARY/OPENAI/custom)
+    ↓
+Returns simple response to Relay
+    ↓
+Relay → POST /ingest (async)
+    ↓
+Cortex → add_exchange_internal() → SESSIONS buffer
+    ↓
+Relay → POST /sessions/:id (save session to file)
+    ↓
+Relay → UI (returns final response)
+
+Note: Bypasses reflection, reasoning, refinement, persona stages
+```
+
+### Cortex Mode Flow (Full Reasoning):
+
+```
+User (UI) → POST /v1/chat/completions {mode: "cortex"}
     ↓
 Relay (7078)
     ↓ POST /reason
@@ -158,11 +197,26 @@ Cortex → add_exchange_internal() → SESSIONS buffer
     ↓
 Autonomy System → Update self_state.json (pattern tracking)
     ↓
+Relay → POST /sessions/:id (save session to file)
+    ↓
 Relay → UI (returns final response)

 Note: NeoMem integration disabled in v0.6.0
 ```

+### Session Persistence Flow (NEW in v0.7.0):
+
+```
+UI loads → GET /sessions → Relay → List all sessions from files → UI dropdown
+User sends message → POST /sessions/:id → Relay → Save to sessions/*.json
+User renames session → PATCH /sessions/:id/metadata → Relay → Update *.meta.json
+User deletes session → DELETE /sessions/:id → Relay → Remove session files
+
+Sessions stored in: core/relay/sessions/
+- {sessionId}.json (conversation history)
+- {sessionId}.meta.json (name, timestamps, metadata)
+```
+
 ### Cortex 4-Stage Reasoning Pipeline:

 1. **Reflection** (`reflection.py`) - Cloud LLM (OpenAI)
@@ -196,6 +250,14 @@ Note: NeoMem integration disabled in v0.6.0
 - OpenAI-compatible endpoint: `POST /v1/chat/completions`
 - Internal endpoint: `POST /chat`
 - Health check: `GET /_health`
+- **NEW:** Dual-mode routing (Standard/Cortex)
+- **NEW:** Server-side session persistence with CRUD API
+- **NEW:** Session management endpoints:
+  - `GET /sessions` - List all sessions
+  - `GET /sessions/:id` - Retrieve session history
+  - `POST /sessions/:id` - Save session history
+  - `PATCH /sessions/:id/metadata` - Update session metadata
+  - `DELETE /sessions/:id` - Delete session
 - Async non-blocking calls to Cortex
 - Shared request handler for code reuse
 - Comprehensive error handling
@@ -210,19 +272,35 @@ Note: NeoMem integration disabled in v0.6.0

 **UI**:
 - Lightweight static HTML chat interface
-- Cyberpunk theme
+- Cyberpunk theme with light/dark mode toggle
-- Session save/load functionality
+- **NEW:** Mode selector (Standard/Cortex) in header
+- **NEW:** Settings modal (⚙ button) with:
+  - Backend selection for Standard Mode (SECONDARY/OPENAI/custom)
+  - Session management (view, delete sessions)
+  - Theme toggle (dark mode default)
+- **NEW:** Server-synced session management
+  - Sessions persist across browsers and reboots
+  - Rename sessions with custom names
+  - Delete sessions with confirmation
+  - Automatic session save on every message
 - OpenAI message format support

 ### Reasoning Layer

-**Cortex** (v0.5.1):
+**Cortex** (v0.7.0):
-- Multi-stage reasoning pipeline (reflection → reasoning → refine → persona)
+- **NEW:** Dual operating modes:
+  - **Standard Mode** - Simple chat with context (`/simple` endpoint)
+    - User-selectable backend (SECONDARY, OPENAI, or custom)
+    - Full conversation history via Intake integration
+    - Bypasses reasoning pipeline for faster responses
+  - **Cortex Mode** - Full reasoning pipeline (`/reason` endpoint)
+    - Multi-stage processing: reflection → reasoning → refine → persona
+    - Per-stage backend selection
+    - Autonomy system integration
 - Flexible LLM backend routing via HTTP
-- Per-stage backend selection
 - Async processing throughout
 - Embedded Intake module for short-term context
-- `/reason`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
+- `/reason`, `/simple`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
 - Lenient error handling - never fails the chat pipeline

 **Intake** (Embedded Module):
@@ -327,7 +405,28 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

 ## Version History

-### v0.6.0 (2025-12-18) - Current Release
+### v0.7.0 (2025-12-21) - Current Release
+**Major Features: Standard Mode + Backend Selection + Session Persistence**
+- ✅ Added Standard Mode for simple chatbot functionality
+- ✅ UI mode selector (Standard/Cortex) in header
+- ✅ Settings modal with backend selection for Standard Mode
+- ✅ Server-side session persistence with file-based storage
+- ✅ Session management UI (view, rename, delete sessions)
+- ✅ Light/Dark mode toggle (dark by default)
+- ✅ Context retention in Standard Mode via Intake integration
+- ✅ Fixed modal positioning and z-index issues
+- ✅ Cortex `/simple` endpoint for direct LLM calls
+- ✅ Session CRUD API in Relay
+- ✅ Full backward compatibility - Cortex Mode unchanged
+
+**Key Changes:**
+- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
+- Sessions now sync across browsers and survive container restarts
+- User can select SECONDARY (Ollama), OPENAI, or custom backend for Standard Mode
+- Theme preference and backend selection persisted in localStorage
+- Session files stored in `core/relay/sessions/` directory
+
+### v0.6.0 (2025-12-18)
 **Major Feature: Autonomy System (Phase 1, 2, and 2.5)**
 - ✅ Added autonomous decision-making framework
 - ✅ Implemented executive planning and goal-setting layer
@@ -394,30 +493,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

 ---

-## Known Issues (v0.6.0)
+## Known Issues (v0.7.0)

-### Temporarily Disabled (v0.6.0)
+### Temporarily Disabled
 - **NeoMem disabled by default** - Being refined independently before full integration
 - PostgreSQL + pgvector storage inactive
 - Neo4j graph database inactive
 - Memory persistence endpoints not active
 - RAG service (Beta Lyrae) currently disabled in docker-compose.yml

-### Non-Critical
-- Session management endpoints not fully implemented in Relay
-- Full autonomy system integration still being refined
-- Memory retrieval integration pending NeoMem re-enablement
+### Standard Mode Limitations
+- No reflection, reasoning, or refinement stages (by design)
+- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)
+- No RAG integration (same as Cortex Mode - currently disabled)
+- No NeoMem memory storage (same as Cortex Mode - currently disabled)
+
+### Session Management Limitations
+- Sessions stored in container filesystem - requires volume mount for true persistence
+- No session import/export functionality yet
+- No session search or filtering
+- Old localStorage sessions don't automatically migrate to server

 ### Operational Notes
 - **Single-worker constraint**: Cortex must run with single Uvicorn worker to maintain SESSIONS state
 - Multi-worker scaling requires migrating SESSIONS to Redis or shared storage
 - Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting
+- Backend selection only affects Standard Mode - Cortex Mode uses environment-configured backends

 ### Future Enhancements
 - Re-enable NeoMem integration after pipeline refinement
 - Full autonomy system maturation and optimization
 - Re-enable RAG service integration
-- Implement full session persistence
+- Session import/export functionality
+- Session search and filtering UI
 - Migrate SESSIONS to Redis for multi-worker support
 - Add request correlation IDs for tracing
 - Comprehensive health checks across all services
@@ -457,17 +565,56 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
 curl http://localhost:7077/health
 ```

-4. Access the UI at `http://localhost:7078`
+4. Access the UI at `http://localhost:8081`
+
+### Using the UI
+
+**Mode Selection:**
+- Use the **Mode** dropdown in the header to switch between:
+  - **Standard** - Simple chatbot for coding and practical tasks
+  - **Cortex** - Full reasoning pipeline with autonomy features
+
+**Settings Menu:**
+1. Click the **⚙ Settings** button in the header
+2. **Backend Selection** (Standard Mode only):
+   - Choose **SECONDARY** (Ollama/Qwen on 3090) - Fast, local
+   - Choose **OPENAI** (GPT-4o-mini) - Cloud-based, high quality
+   - Enter custom backend name for advanced configurations
+3. **Session Management**:
+   - View all saved sessions with message counts and timestamps
+   - Click 🗑️ to delete unwanted sessions
+4. **Theme Toggle**:
+   - Click **🌙 Dark Mode** or **☀️ Light Mode** to switch themes
+
+**Session Management:**
+- Sessions automatically save on every message
+- Use the **Session** dropdown to switch between sessions
+- Click **➕ New** to create a new session
+- Click **✏️ Rename** to rename the current session
+- Sessions persist across browsers and container restarts

 ### Test

-**Test Relay → Cortex pipeline:**
+**Test Standard Mode:**
 ```bash
 curl -X POST http://localhost:7078/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
+    "mode": "standard",
+    "backend": "SECONDARY",
+    "messages": [{"role": "user", "content": "Hello!"}],
+    "sessionId": "test"
+  }'
+```
+
+**Test Cortex Mode (Full Reasoning):**
+```bash
+curl -X POST http://localhost:7078/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "mode": "cortex",
     "messages": [{"role": "user", "content": "Hello Lyra!"}],
-    "session_id": "test"
+    "sessionId": "test"
   }'
 ```
@@ -492,6 +639,21 @@ curl http://localhost:7081/debug/sessions
 curl "http://localhost:7081/debug/summary?session_id=test"
 ```

+**List all sessions:**
+```bash
+curl http://localhost:7078/sessions
+```
+
+**Get session history:**
+```bash
+curl http://localhost:7078/sessions/sess-abc123
+```
+
+**Delete a session:**
+```bash
+curl -X DELETE http://localhost:7078/sessions/sess-abc123
+```
+
 All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.

 ---
@@ -521,6 +683,7 @@ SPEAK_LLM=OPENAI # Use OpenAI for persona
 NEOMEM_LLM=PRIMARY # Use llama.cpp for memory
 UI_LLM=OPENAI # Use OpenAI for UI
 RELAY_LLM=PRIMARY # Use llama.cpp for relay
+STANDARD_MODE_LLM=SECONDARY # Default backend for Standard Mode (NEW in v0.7.0)
 ```

 ### Database Configuration
@@ -541,6 +704,7 @@ NEO4J_PASSWORD=neomemgraph
 NEOMEM_API=http://neomem-api:7077
 CORTEX_API=http://cortex:7081
 CORTEX_REASON_URL=http://cortex:7081/reason
+CORTEX_SIMPLE_URL=http://cortex:7081/simple # NEW in v0.7.0
 CORTEX_INGEST_URL=http://cortex:7081/ingest
 RELAY_URL=http://relay:7078
 ```
@@ -685,7 +849,10 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
 ### Debugging Tips
 - Enable verbose logging: `VERBOSE_DEBUG=true` in `.env`
 - Check Cortex logs: `docker logs cortex -f`
+- Check Relay logs: `docker logs relay -f`
 - Inspect SESSIONS: `curl http://localhost:7081/debug/sessions`
 - Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"`
-- Check Relay logs: `docker logs relay -f`
+- List sessions: `curl http://localhost:7078/sessions`
+- Test Standard Mode: `curl -X POST http://localhost:7078/v1/chat/completions -H "Content-Type: application/json" -d '{"mode":"standard","backend":"SECONDARY","messages":[{"role":"user","content":"test"}],"sessionId":"test"}'`
 - Monitor Docker network: `docker network inspect lyra_net`
+- Check session files: `ls -la core/relay/sessions/`
@@ -14,5 +14,13 @@
 {
   "role": "assistant",
   "content": "Hello Brian! Nice to meet you. As an AI, I don't have physical design capabilities, but I'm here to help with any information or tasks you need. How can I assist you in your design process?"
+},
+{
+  "role": "user",
+  "content": "Can you code python scripts for me?"
+},
+{
+  "role": "assistant",
+  "content": "Sure thing, Brian! I can help you with Python scripting. What specifically do you need assistance with? Whether it's a simple script or something more complex, just let me know the details!"
 }
 ]