Compare commits
10 Commits: d09425c37b ... 89988da472
| Author | SHA1 | Date |
|---|---|---|
| | 89988da472 | |
| | b700ac3808 | |
| | 6716245a99 | |
| | a900110fe4 | |
| | 794baf2a96 | |
| | 64429b19e6 | |
| | f1471cde84 | |
| | b4613ac30c | |
| | 01d4811717 | |
| | ceb60119fb | |
.gitignore (vendored): 26 changes
@@ -55,3 +55,29 @@ models/
 # =============================
 node_modules/
 core/relay/node_modules/
+
+# =============================
+# 💬 Runtime data & sessions
+# =============================
+# Session files (contain user conversation data)
+core/relay/sessions/
+**/sessions/
+*.jsonl
+
+# Log directories
+logs/
+**/logs/
+*-logs/
+intake-logs/
+
+# Database files (generated at runtime)
+*.db
+*.sqlite
+*.sqlite3
+neomem_history/
+**/neomem_history/
+
+# Temporary and cache files
+.cache/
+*.tmp
+*.temp
CHANGELOG.md: 538 changes
@@ -7,6 +7,544 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se

## [Unreleased]

---

## [0.9.1] - 2025-12-29

### Fixed
- Chat auto-scrolling now works.
- Session names no longer change to an auto-generated UID.

## [0.9.0] - 2025-12-29

### Added - Trilium Notes Integration

**Trilium ETAPI Knowledge Base Integration**
- **Trilium Tool Executor** [cortex/autonomy/tools/executors/trilium.py](cortex/autonomy/tools/executors/trilium.py)
  - `search_notes(query, limit)` - Search through Trilium notes via ETAPI
  - `create_note(title, content, parent_note_id)` - Create new notes in the Trilium knowledge base
  - Full ETAPI authentication and error handling
  - Automatic `parentNoteId` defaulting to "root" for root-level notes
  - Connection-error handling with user-friendly messages
- **Tool Registry Integration** [cortex/autonomy/tools/registry.py](cortex/autonomy/tools/registry.py)
  - Added `ENABLE_TRILIUM` feature flag
  - Tool definitions with schema validation
  - Provider-agnostic tool calling support
- **Setup Documentation** [TRILIUM_SETUP.md](TRILIUM_SETUP.md)
  - Step-by-step ETAPI token generation guide
  - Environment configuration instructions
  - Troubleshooting section for common issues
  - Security best practices for token management
- **API Reference Documentation** [docs/TRILIUM_API.md](docs/TRILIUM_API.md)
  - Complete ETAPI endpoint reference
  - Authentication and request/response examples
  - Search syntax and advanced query patterns

**Environment Configuration**
- **New Environment Variables** [.env](.env)
  - `ENABLE_TRILIUM=true` - Enable/disable the Trilium integration
  - `TRILIUM_URL=http://10.0.0.2:4292` - Trilium instance URL
  - `TRILIUM_ETAPI_TOKEN` - ETAPI authentication token

**Capabilities Unlocked**
- Personal knowledge base search during conversations
- Automatic note creation from conversation insights
- Cross-referencing information between chat and notes
- Context-aware responses using stored knowledge
- Future: find duplicates, suggest organization, summarize notes
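To make the executor concrete, here is a minimal sketch of the two operations against the ETAPI. The function names and the "root" default come from the entries above; the endpoint paths and payload fields follow Trilium's published ETAPI, while the module layout, helper names, and timeout values are illustrative assumptions:

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

TRILIUM_URL = "http://10.0.0.2:4292"  # TRILIUM_URL from .env

def etapi_headers(token: str) -> dict:
    # ETAPI authenticates with the raw token in the Authorization header
    return {"Authorization": token, "Content-Type": "application/json"}

def create_note_payload(title: str, content: str,
                        parent_note_id: Optional[str]) -> dict:
    # parentNoteId defaults to "root" for root-level notes
    return {"parentNoteId": parent_note_id or "root",
            "title": title, "type": "text", "content": content}

def search_notes(token: str, query: str, limit: int = 5) -> dict:
    qs = urllib.parse.urlencode({"search": query, "limit": limit})
    req = urllib.request.Request(f"{TRILIUM_URL}/etapi/notes?{qs}",
                                 headers=etapi_headers(token))
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

def create_note(token: str, title: str, content: str,
                parent_note_id: Optional[str] = None) -> dict:
    body = json.dumps(create_note_payload(title, content,
                                          parent_note_id)).encode()
    req = urllib.request.Request(f"{TRILIUM_URL}/etapi/create-note",
                                 data=body, headers=etapi_headers(token))
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

In the real executor these calls would additionally be wrapped with the user-friendly connection-error handling noted above.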

### Changed - Spelling Corrections

**Module Naming**
- Renamed `trillium.py` to `trilium.py` (corrected spelling)
- Updated all imports and references across the codebase
- Fixed environment variable names (`TRILLIUM_*` → `TRILIUM_*`)
- Updated documentation to use the correct "Trilium" spelling

---

## [0.8.0] - 2025-12-26

### Added - Tool Calling & "Show Your Work" Transparency Feature

**Tool Calling System (Standard Mode)**
- **Function Calling Infrastructure** [cortex/autonomy/tools/](cortex/autonomy/tools/)
  - Implemented agentic tool calling for Standard Mode with autonomous multi-step execution
  - Tool registry system with JSON schema definitions
  - Adapter pattern for provider-agnostic tool calling (OpenAI, Ollama, llama.cpp)
  - Maximum of 5 iterations per request to prevent runaway loops
- **Available Tools**
  - `execute_code` - Sandboxed Python/JavaScript/Bash execution via Docker
  - `web_search` - Tavily API integration for real-time web queries
  - `trilium_search` - Internal Trilium knowledge base queries
- **Provider Adapters** [cortex/autonomy/tools/adapters/](cortex/autonomy/tools/adapters/)
  - `OpenAIAdapter` - Native function calling support
  - `OllamaAdapter` - XML-based tool calling for local models
  - `LlamaCppAdapter` - XML-based tool calling for the llama.cpp backend
  - Automatic tool-call parsing and result formatting
- **Code Execution Sandbox** [cortex/autonomy/tools/code_executor.py](cortex/autonomy/tools/code_executor.py)
  - Docker-based isolated execution environment
  - Support for Python, JavaScript (Node.js), and Bash
  - 30-second timeout with automatic cleanup
  - Returns stdout, stderr, exit code, and execution time
  - Prevents filesystem access outside the sandbox
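A rough sketch of how such a sandbox wrapper might be structured. Only the 30-second timeout and the stdout/stderr/exit-code/time result fields are specified above; the image names, docker flags, and function names here are illustrative assumptions:

```python
import subprocess
import time

# Illustrative image/runner mapping - the actual images are an assumption
LANG_IMAGES = {
    "python": ("python:3.11-slim", ["python", "-c"]),
    "bash": ("bash:5", ["bash", "-c"]),
}

def build_docker_cmd(language: str, code: str) -> list:
    """Assemble a docker run command with no network and auto-cleanup."""
    image, runner = LANG_IMAGES[language]
    return ["docker", "run", "--rm", "--network=none", image, *runner, code]

def execute_code(language: str, code: str, timeout: int = 30) -> dict:
    started = time.monotonic()
    try:
        proc = subprocess.run(build_docker_cmd(language, code),
                              capture_output=True, text=True,
                              timeout=timeout)
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode,
                "execution_time": time.monotonic() - started}
    except subprocess.TimeoutExpired:
        # --rm lets docker clean up the container; report the timeout
        return {"stdout": "", "stderr": f"timed out after {timeout}s",
                "exit_code": -1, "execution_time": float(timeout)}
```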

**"Show Your Work" - Real-Time Thinking Stream**
- **Server-Sent Events (SSE) Streaming** [cortex/router.py:478-527](cortex/router.py#L478-L527)
  - New `/stream/thinking/{session_id}` endpoint for real-time event streaming
  - Broadcasts the internal thinking process during tool-calling operations
  - 30-second keepalive with automatic reconnection support
  - Events: `connected`, `thinking`, `tool_call`, `tool_result`, `done`, `error`
- **Stream Manager** [cortex/autonomy/tools/stream_events.py](cortex/autonomy/tools/stream_events.py)
  - Pub/sub system for managing SSE subscriptions per session
  - Multiple clients can connect to the same session stream
  - Automatic cleanup of dead queues and closed connections
  - Zero overhead when no subscribers are active
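The stream manager described above could be sketched like this. The class name and the subscribe-per-queue model come from the changelog; the implementation details are illustrative:

```python
import asyncio
from collections import defaultdict

class ToolStreamManager:
    """Per-session pub/sub for SSE events: each connected client gets
    its own asyncio.Queue. Minimal illustrative implementation."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)  # session_id -> [asyncio.Queue]

    def subscribe(self, session_id: str) -> asyncio.Queue:
        q = asyncio.Queue()
        self._subs[session_id].append(q)
        return q

    def unsubscribe(self, session_id: str, q: asyncio.Queue) -> None:
        self._subs[session_id].remove(q)
        if not self._subs[session_id]:
            del self._subs[session_id]  # drop sessions with no listeners

    def has_subscribers(self, session_id: str) -> bool:
        # Checked before emitting: zero overhead when nobody is listening
        return bool(self._subs.get(session_id))

    def emit(self, session_id: str, event: dict) -> None:
        for q in self._subs.get(session_id, []):
            q.put_nowait(event)
```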
- **FunctionCaller Integration** [cortex/autonomy/tools/function_caller.py](cortex/autonomy/tools/function_caller.py)
  - Enhanced with event emission at each step:
    - "thinking" events before each LLM call
    - "tool_call" events when invoking tools
    - "tool_result" events after tool execution
    - "done" event with the final answer
    - "error" events on failures
  - Session-aware streaming (only emits when subscribers exist)
  - Provider-agnostic implementation works with all backends
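Put together, FunctionCaller's bounded loop might look roughly like this. The 5-iteration cap and the event points are from the changelog; the callback signatures and dict shapes are simplified stand-ins for the adapter-specific formats:

```python
MAX_ITERATIONS = 5  # per the changelog, prevents runaway loops

def run_agentic_loop(call_llm, execute_tool, messages: list) -> str:
    """call_llm(messages) returns {"answer": str} when the model responds
    directly, or {"tool": name, "args": dict} when it requests a tool."""
    for _ in range(MAX_ITERATIONS):
        step = call_llm(messages)          # "thinking" event would fire here
        if "answer" in step:
            return step["answer"]          # "done" event
        result = execute_tool(step["tool"], step["args"])  # "tool_call"
        # "tool_result": inject the result back into the conversation
        messages.append({"role": "tool", "name": step["tool"],
                         "content": result})
    return "Stopped after reaching the iteration limit."
```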
- **Thinking Stream UI** [core/ui/thinking-stream.html](core/ui/thinking-stream.html)
  - Dedicated popup window for real-time thinking visualization
  - Color-coded events: green (thinking), orange (tool calls), blue (results), purple (done), red (errors)
  - Auto-scrolling event feed with animations
  - Connection status indicator with a green/red dot
  - Clear-events button and session info display
  - Mobile-friendly responsive design
- **UI Integration** [core/ui/index.html](core/ui/index.html)
  - "🧠 Show Work" button in the session selector
  - Opens the thinking stream in a popup window
  - Session ID passed via URL parameter for stream association
  - Purple/violet button styling to match the cyberpunk theme

**Tool Calling Configuration**
- **Environment Variables** [.env](.env)
  - `STANDARD_MODE_ENABLE_TOOLS=true` - Enable/disable tool calling
  - `TAVILY_API_KEY` - API key for the web search tool
  - `TRILLIUM_API_URL` - URL for the Trillium knowledge base
- **Standard Mode Tools Toggle** [cortex/router.py:389-470](cortex/router.py#L389-L470)
  - The `/simple` endpoint checks the `STANDARD_MODE_ENABLE_TOOLS` environment variable
  - Falls back to non-tool mode if disabled
  - Logs tool usage statistics (iterations, tools used)

### Changed - CORS & Architecture

**CORS Support for SSE**
- **Added CORS Middleware** [cortex/main.py](cortex/main.py)
  - FastAPI CORSMiddleware with wildcard origins for development
  - Allows cross-origin SSE connections from the nginx UI (port 8081) to Cortex (port 7081)
  - Credentials support enabled for authenticated requests
  - All methods and headers permitted

**Tool Calling Pipeline**
- **Standard Mode Enhancement** [cortex/router.py:389-470](cortex/router.py#L389-L470)
  - The `/simple` endpoint now supports optional tool calling
  - Multi-iteration agentic loop with LLM + tool execution
  - Tool results injected back into the conversation for the next iteration
  - Graceful degradation to non-tool mode if tools are disabled

**JSON Response Formatting**
- **SSE Event Structure** [cortex/router.py:497-499](cortex/router.py#L497-L499)
  - Fixed the initial "connected" event to use proper JSON serialization
  - Changed from an f-string with nested quotes to `json.dumps()`
  - Ensures valid JSON for all event types

### Fixed - Critical JavaScript & SSE Issues

**JavaScript Variable Scoping Bug**
- **Root cause**: The `eventSource` variable was used before its declaration in [thinking-stream.html:218](core/ui/thinking-stream.html#L218)
- **Symptom**: `Uncaught ReferenceError: can't access lexical declaration 'eventSource' before initialization`
- **Solution**: Moved variable declarations before the `connectStream()` call
- **Impact**: The thinking-stream page now loads without errors and establishes the SSE connection

**SSE Connection Not Establishing**
- **Root cause**: CORS blocked cross-origin SSE requests from nginx (8081) to Cortex (7081)
- **Symptom**: The browser silently blocked the EventSource connection, with no errors in the console
- **Solution**: Added CORSMiddleware to the Cortex FastAPI app
- **Impact**: SSE streams now connect successfully across ports

**Invalid JSON in SSE Events**
- **Root cause**: The initial "connected" event used an f-string with nested quotes: `f"data: {{'type': 'connected', 'session_id': '{session_id}'}}\n\n"`
- **Symptom**: The browser couldn't parse the malformed JSON, and the connection appeared stuck on "Connecting..."
- **Solution**: Used `json.dumps()` for proper JSON serialization
- **Impact**: The connected event is now parsed correctly and the status updates to a green dot
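The essence of the fix above: build the event as a dict and let `json.dumps()` handle the quoting. A minimal helper (the function name is illustrative):

```python
import json

def sse_event(event_type: str, **fields) -> str:
    # json.dumps emits double-quoted, valid JSON; the old f-string
    # produced single-quoted pseudo-JSON the browser could not parse
    payload = {"type": event_type, **fields}
    return f"data: {json.dumps(payload)}\n\n"
```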

### Technical Improvements

**Agentic Architecture**
- Multi-iteration reasoning loop with tool execution
- Provider-agnostic tool calling via the adapter pattern
- Automatic injection of tool results into the conversation context
- Iteration limits to prevent infinite loops
- Comprehensive logging at each step

**Event Streaming Performance**
- Zero overhead when there are no subscribers (check before emit)
- Efficient pub/sub with asyncio queues
- Automatic cleanup of disconnected clients
- 30-second keepalive prevents timeout issues
- Session-isolated streams prevent cross-talk

**Code Quality**
- Clean separation: tool execution, adapters, streaming, UI
- Comprehensive error handling with fallbacks
- Detailed logging for debugging tool calls
- Type hints and docstrings throughout
- Modular design for easy extension

**Security**
- Sandboxed code execution prevents filesystem access
- Timeout limits prevent resource exhaustion
- Docker isolation for untrusted code
- No code execution without an explicit user request

### Architecture - Tool Calling Flow

**Standard Mode with Tools:**
```
User (UI) → Relay → Cortex /simple
        ↓
Check STANDARD_MODE_ENABLE_TOOLS
        ↓
LLM generates tool call → FunctionCaller
        ↓
Execute tool (Docker sandbox / API call)
        ↓
Inject result → LLM (next iteration)
        ↓
Repeat until done or max iterations
        ↓
Return final answer → UI
```
**Thinking Stream Flow:**
```
Browser → nginx:8081 → thinking-stream.html
        ↓
EventSource connects to cortex:7081/stream/thinking/{session_id}
        ↓
ToolStreamManager.subscribe(session_id) → asyncio.Queue
        ↓
User sends message → /simple endpoint
        ↓
FunctionCaller emits events:
  - emit("thinking")    → Queue → SSE → Browser
  - emit("tool_call")   → Queue → SSE → Browser
  - emit("tool_result") → Queue → SSE → Browser
  - emit("done")        → Queue → SSE → Browser
        ↓
Browser displays color-coded events in real time
```

### Documentation

- **Added** [THINKING_STREAM.md](THINKING_STREAM.md) - Complete guide to the "Show Your Work" feature
  - Usage examples with curl
  - Event type reference
  - Architecture diagrams
  - Demo page instructions
- **Added** [UI_THINKING_STREAM.md](UI_THINKING_STREAM.md) - UI integration documentation
  - Button placement and styling
  - Popup window behavior
  - Session association logic

### Known Limitations

**Tool Calling:**
- Limited to 5 iterations per request (prevents runaway loops)
- The Python sandbox has no filesystem persistence (temporary storage only)
- Web search requires a Tavily API key (the free tier is not unlimited)
- Trillium search requires a separate knowledge base setup

**Thinking Stream:**
- The CORS wildcard (`*`) is development-only (restrict it in production)
- The stream ends after the "done" event (must reconnect for a new request)
- No historical replay (only shows real-time events)
- Single session per stream window

### Migration Notes

**For Users Upgrading:**
1. New environment variable: `STANDARD_MODE_ENABLE_TOOLS=true` (default: enabled)
2. The thinking stream is accessible via the "🧠 Show Work" button in the UI
3. Tool calling works automatically in Standard Mode when enabled
4. No changes required to existing Standard Mode usage

**For Developers:**
1. Cortex now includes CORS middleware for SSE
2. A new `/stream/thinking/{session_id}` endpoint is available
3. FunctionCaller requires a `session_id` parameter for streaming
4. Tool adapters can be extended by adding to the `AVAILABLE_TOOLS` registry

---

## [0.7.0] - 2025-12-21

### Added - Standard Mode & UI Enhancements

**Standard Mode Implementation**
- Added a "Standard Mode" chat option that bypasses the complex Cortex reasoning pipeline
- Provides simple chatbot functionality for coding and practical tasks
- Maintains full conversation context across messages
- Backend-agnostic - works with SECONDARY (Ollama), OPENAI, or custom backends
- Created the `/simple` endpoint in the Cortex router [cortex/router.py:389](cortex/router.py#L389)
- Mode selector in the UI with a toggle between Standard and Cortex modes
  - Standard Mode: direct LLM chat with context retention
  - Cortex Mode: full 7-stage reasoning pipeline (unchanged)

**Backend Selection System**
- UI settings modal with LLM backend selection for Standard Mode
- Radio-button selector: SECONDARY (Ollama/Qwen), OPENAI (GPT-4o-mini), or custom
- Backend preference persisted in localStorage
- Custom backend text input for advanced users
- Backend parameter routing through the entire stack:
  - The UI sends the `backend` parameter in the request body
  - Relay forwards the backend selection to Cortex
  - The Cortex `/simple` endpoint respects the user's backend choice
  - Environment-based fallback: uses `STANDARD_MODE_LLM` if no backend is specified

**Session Management Overhaul**
- Complete rewrite of the session system to use server-side persistence
  - File-based storage in the `core/relay/sessions/` directory
  - Session files: `{sessionId}.json` for history, `{sessionId}.meta.json` for metadata
  - The server is the source of truth - sessions sync across browsers and reboots
- Session metadata system for friendly names
  - Sessions display custom names instead of random IDs
  - Rename functionality in the session dropdown
  - Last-modified timestamps and message counts
- Full CRUD API for sessions in Relay:
  - `GET /sessions` - List all sessions with metadata
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - `PATCH /sessions/:id/metadata` - Update session name/metadata
  - `DELETE /sessions/:id` - Delete a session and its metadata
- Session management UI in the settings modal:
  - List of all sessions with message counts and timestamps
  - Delete button for each session, with confirmation
  - Automatic session cleanup when deleting the current session

**UI Improvements**
- Settings modal with hamburger menu (⚙ Settings button)
  - Backend selection section for Standard Mode
  - Session management section with delete functionality
  - Clean modal overlay with the cyberpunk theme
  - ESC key and click-outside to close
- Light/Dark mode toggle with dark mode as the default
  - Theme preference persisted in localStorage
  - CSS variables for seamless theme switching
  - Toggle button shows the current mode (🌙 Dark Mode / ☀️ Light Mode)
- Removed the redundant model selector dropdown from the header
- Fixed modal positioning and z-index layering
  - Modal moved outside the #chat container for proper rendering
  - Fixed z-index: overlay (999), modal content (1001)
  - Centered modal with proper backdrop blur

**Context Retention for Standard Mode**
- Integration with the Intake module for conversation history
- Added a `get_recent_messages()` function in intake.py
- Standard Mode retrieves the last 20 messages from the session buffer
- Full context is sent to the LLM on each request
- Message-array format support in the LLM router:
  - Updated the Ollama provider to accept a `messages` parameter
  - Updated the OpenAI provider to accept a `messages` parameter
  - Automatic conversion from messages to a prompt string for non-chat APIs

### Changed - Architecture & Routing

**Relay Server Updates** [core/relay/server.js](core/relay/server.js)
- ES module migration for session persistence:
  - Imported `fs/promises`, `path`, and `fileURLToPath` for file operations
  - Created a `SESSIONS_DIR` constant for the session storage location
- Mode-based routing in both the `/chat` and `/v1/chat/completions` endpoints:
  - Extracts the `mode` parameter from the request body (default: "cortex")
  - Routes to `CORTEX_SIMPLE` for Standard Mode, `CORTEX_REASON` for Cortex Mode
  - The `backend` parameter is only used in Standard Mode
- Session persistence functions:
  - `ensureSessionsDir()` - Creates the sessions directory if needed
  - `loadSession(sessionId)` - Reads session history from file
  - `saveSession(sessionId, history, metadata)` - Writes a session to file
  - `loadSessionMetadata(sessionId)` - Reads session metadata
  - `saveSessionMetadata(sessionId, metadata)` - Updates session metadata
  - `listSessions()` - Returns all sessions with metadata, sorted by last modified
  - `deleteSession(sessionId)` - Removes session and metadata files

**Cortex Router Updates** [cortex/router.py](cortex/router.py)
- Added an optional `backend` field to the `ReasonRequest` Pydantic model
- Created the `/simple` endpoint for Standard Mode:
  - Bypasses the reflection, reasoning, and refinement stages
  - Direct LLM call with conversation context
  - Uses the backend from the request, falling back to the `STANDARD_MODE_LLM` env variable
  - Returns a simple response structure without reasoning artifacts
- Backend selection logic in `/simple`:
  - Normalizes backend names to uppercase
  - Maps UI backend names to system backend names
  - Validates backend availability before calling
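The selection steps above could be sketched as follows. The uppercase normalization and the `STANDARD_MODE_LLM` fallback are from the changelog; the set of known backends and the error behavior are illustrative assumptions:

```python
import os

# Illustrative set - the real registry lives in the LLM router
KNOWN_BACKENDS = {"PRIMARY", "SECONDARY", "OPENAI"}

def resolve_backend(requested):
    # Fall back to STANDARD_MODE_LLM (then SECONDARY) if none requested
    fallback = os.environ.get("STANDARD_MODE_LLM", "SECONDARY")
    name = (requested or fallback).upper()
    if name not in KNOWN_BACKENDS:
        raise ValueError(f"Unknown backend: {name}")
    return name
```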

**Intake Integration** [cortex/intake/intake.py](cortex/intake/intake.py)
- Added a `get_recent_messages(session_id, limit)` function:
  - Retrieves the last N messages from the session buffer
  - Returns an empty list if the session doesn't exist
  - Used by the `/simple` endpoint for context retrieval
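A simplified sketch of this behavior. Only the function name, the empty-list fallback, and the 20-message default come from the changelog; the in-memory buffer and its capacity are assumptions:

```python
from collections import defaultdict, deque

# In-memory per-session buffers - a stand-in for Intake's real storage
_buffers = defaultdict(lambda: deque(maxlen=200))

def add_message(session_id: str, role: str, content: str) -> None:
    _buffers[session_id].append({"role": role, "content": content})

def get_recent_messages(session_id: str, limit: int = 20) -> list:
    # Returns an empty list when the session does not exist
    if session_id not in _buffers:
        return []
    return list(_buffers[session_id])[-limit:]
```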

**LLM Router Enhancements** [cortex/llm/llm_router.py](cortex/llm/llm_router.py)
- Added `messages` parameter support across all providers
- Automatic message-to-prompt conversion for legacy APIs
- Chat-completion format for the Ollama and OpenAI providers
- Stop sequences for MI50/DeepSeek R1 to prevent runaway generation:
  - `"User:"`, `"\nUser:"`, `"Assistant:"`, `"\n\n\n"`
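The message-to-prompt conversion for prompt-only backends could look like this (the exact role labels and trailing cue are illustrative, not the router's actual format):

```python
def messages_to_prompt(messages: list) -> str:
    """Flatten a chat-completion message list into a single prompt
    string for backends that only accept raw prompts."""
    role_names = {"system": "System", "user": "User",
                  "assistant": "Assistant"}
    lines = [f"{role_names.get(m['role'], m['role'])}: {m['content']}"
             for m in messages]
    lines.append("Assistant:")  # cue the model to respond
    return "\n".join(lines)
```

Note how the stop sequences listed above (`"User:"`, `"Assistant:"`) pair with this layout: they cut generation off as soon as the model starts hallucinating the next turn.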

**Environment Configuration** [.env](.env)
- Added `STANDARD_MODE_LLM=SECONDARY` as the default Standard Mode backend
- Added `CORTEX_SIMPLE_URL=http://cortex:7081/simple` for routing

**UI Architecture** [core/ui/index.html](core/ui/index.html)
- Server-based session loading system:
  - `loadSessionsFromServer()` - Fetches sessions from the Relay API
  - `renderSessions()` - Populates the session dropdown from server data
  - Session state synchronized with the server on every change
- Backend selection persistence:
  - Loads the saved backend from localStorage on page load
  - Includes the `backend` parameter in the request body when in Standard Mode
  - The settings modal pre-selects the current backend choice
- Dark mode by default:
  - Checks localStorage for a theme preference
  - Sets the dark theme if no preference is found
  - The toggle button updates localStorage and applies the theme

**CSS Styling** [core/ui/style.css](core/ui/style.css)
- Light mode CSS variables:
  - `--bg-dark: #f5f5f5` (light background)
  - `--text-main: #1a1a1a` (dark text)
  - `--text-fade: #666` (dimmed text)
- Dark mode CSS variables (default):
  - `--bg-dark: #0a0a0a` (dark background)
  - `--text-main: #e6e6e6` (light text)
  - `--text-fade: #999` (dimmed text)
- Modal positioning fixes:
  - `position: fixed` with `top: 50%`, `left: 50%`, `transform: translate(-50%, -50%)`
  - Z-index layering: overlay (999), content (1001)
  - Backdrop blur effect on the modal overlay
- Session list styling:
  - Session item cards with hover effects
  - Delete button with a red hover state
  - Message count and timestamp display

### Fixed - Critical Issues

**DeepSeek R1 Runaway Generation**
- Root cause: the R1 reasoning model emits its thinking process and hallucinates conversation turns
- Solution:
  - Changed `STANDARD_MODE_LLM` to SECONDARY (Ollama/Qwen) instead of PRIMARY (MI50/R1)
  - Added stop sequences to the MI50 provider to prevent continuation
  - Documented R1 limitations for Standard Mode usage

**Context Not Maintained in Standard Mode**
- Root cause: the `/simple` endpoint didn't retrieve conversation history from Intake
- Solution:
  - Created the `get_recent_messages()` function in intake.py
  - Standard Mode now pulls the last 20 messages from the session buffer
  - Full context is sent to the LLM with each request
- User feedback: "it's saying it hasn't received any other messages from me, so it looks like the standard mode llm isn't getting the full chat"

**OpenAI Backend 400 Errors**
- Root cause: the OpenAI provider only accepted prompt strings, not message arrays
- Solution: updated the OpenAI provider to support the `messages` parameter, matching Ollama
- Now handles the chat-completion format correctly

**Modal Formatting Issues**
- Root cause: the settings modal sat inside the #chat container, subject to its overflow constraints
- Symptoms: the modal appeared at the bottom with a jumbled layout and couldn't be closed
- Solution:
  - Moved the modal outside the #chat container to be a direct child of body
  - Changed positioning from absolute to fixed
  - Added proper z-index layering (overlay: 999, content: 1001)
  - Removed the old model selector from the header
- User feedback: "the formating for the settings is all off. Its at the bottom and all jumbling together, i cant get it to go away"

**Session Persistence Broken**
- Root cause: sessions were stored only in localStorage and never synced with the server
- Symptoms: sessions didn't persist across browsers or reboots, and messages couldn't be loaded
- Solution: complete rewrite of the session system
  - Implemented server-side file persistence in Relay
  - Created CRUD API endpoints for session management
  - Updated the UI to load sessions from the server instead of localStorage
  - Added a metadata system for session names
  - Sessions now survive container restarts and sync across browsers
- User feedback: "sessions seem to exist locally only, i cant get them to actually load any messages and there is now way to delete them. If i open the ui in a different browser those arent there."

### Technical Improvements

**Backward Compatibility**
- All changes include defaults that maintain existing behavior
- Cortex Mode is completely unchanged - it still uses the full 7-stage pipeline
- Standard Mode is opt-in via the UI mode selector
- If no backend is specified, falls back to the `STANDARD_MODE_LLM` env variable
- Existing requests without a `mode` parameter default to "cortex"

**Code Quality**
- Consistent async/await patterns throughout the stack
- Proper error handling with fallbacks
- Clean separation between Standard and Cortex modes
- Session persistence abstracted into helper functions
- Modular UI code with clear event handlers

**Performance**
- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
- Session loading optimized with file-based caching
- Backend selection happens once per message, not per LLM call
- Minimal overhead for mode detection and routing

### Architecture - Dual-Mode Chat System

**Standard Mode Flow:**
```
User (UI) → Relay → Cortex /simple → Intake (get_recent_messages)
         → LLM (direct call with context) → Relay → UI
```

**Cortex Mode Flow (Unchanged):**
```
User (UI) → Relay → Cortex /reason → Reflection → Reasoning
         → Refinement → Persona → Relay → UI
```

**Session Persistence:**
```
UI → POST /sessions/:id → Relay → File system (sessions/*.json)
UI → GET /sessions → Relay → List all sessions → UI dropdown
```

### Known Limitations

**Standard Mode:**
- No reflection, reasoning, or refinement stages
- No RAG integration (same as Cortex Mode - currently disabled)
- No NeoMem memory storage (same as Cortex Mode - currently disabled)
- DeepSeek R1 is not recommended for Standard Mode (it generates reasoning artifacts)

**Session Management:**
- Sessions are stored in the container filesystem; a volume mount is needed for true persistence
- No session import/export functionality yet
- No session search or filtering

### Migration Notes

**For Users Upgrading:**
1. Existing sessions in localStorage will not automatically migrate to the server
2. Create new sessions after upgrading for server-side persistence
3. Theme preference (light/dark) is preserved from localStorage
4. Backend preference defaults to SECONDARY if not previously set

**For Developers:**
1. Relay now requires `fs/promises` for session persistence
2. The Cortex `/simple` endpoint expects an optional `backend` parameter
3. The UI sends `mode` and `backend` parameters in the request body
4. Session files are stored in the `core/relay/sessions/` directory

---

## [0.6.0] - 2025-12-18
README.md: 269 changes
@@ -1,10 +1,12 @@
# Project Lyra - README v0.9.1

Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.

**NEW in v0.9.0:** Trilium Notes integration - Search and create notes from conversations

**Current Version:** v0.9.1 (2025-12-29)

> **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.
@@ -25,14 +27,23 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Coordinates all module interactions
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
- Internal endpoint: `POST /chat`
- Dual-mode routing: Standard Mode (simple chat) or Cortex Mode (full reasoning)
- Server-side session persistence with file-based storage
- Session management API: `GET/POST/PATCH/DELETE /sessions`
- Manages async calls to Cortex ingest
- *(NeoMem integration currently disabled in v0.6.0)*

**2. UI** (Static HTML) - Port 8081 (nginx)
- Browser-based chat interface with cyberpunk theme
- Mode selector (Standard/Cortex) in header
- Settings modal with backend selection and session management
- Light/Dark mode toggle (dark by default)
- **NEW in v0.8.0:** "🧠 Show Work" button for real-time thinking stream
  - Opens popup window with live SSE connection
  - Color-coded events: thinking, tool calls, results, completion
  - Auto-scrolling with animations
  - Session-aware (matches current chat session)
- Server-synced session management (persists across browsers and reboots)
- OpenAI-compatible message format

**3. NeoMem** (Python/FastAPI) - Port 7077 - **DISABLED IN v0.6.0**
@@ -49,7 +60,21 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Primary reasoning engine with multi-stage pipeline and autonomy system
- **Includes embedded Intake module** (no separate service as of v0.5.1)
- **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
- **Tool Calling System** (NEW in v0.8.0) - Agentic execution for Standard Mode
  - Sandboxed code execution (Python, JavaScript, Bash)
  - Web search via Tavily API
  - **Trilium knowledge base integration** (NEW in v0.9.0)
  - Multi-iteration autonomous tool use (max 5 iterations)
  - Real-time thinking stream via SSE
- **Dual Operating Modes:**
  - **Standard Mode** (v0.7.0) - Simple chatbot with context retention + tool calling (v0.8.0)
    - Bypasses reflection, reasoning, refinement stages
    - Direct LLM call with conversation history
    - User-selectable backend (SECONDARY, OPENAI, or custom)
    - **NEW:** Autonomous tool calling for code execution, web search, knowledge queries
    - **NEW:** "Show Your Work" real-time thinking stream
    - Faster responses for coding and practical tasks
  - **Cortex Mode** - Full 4-stage reasoning pipeline
    1. **Reflection** - Generates meta-awareness notes about conversation
    2. **Reasoning** - Creates initial draft answer using context
    3. **Refinement** - Polishes and improves the draft
@@ -57,7 +82,9 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Integrates with Intake for short-term context via internal Python imports
- Flexible LLM router supporting multiple backends via HTTP
- **Endpoints:**
  - `POST /reason` - Main reasoning pipeline (Cortex Mode)
  - `POST /simple` - Direct LLM chat with tool calling (Standard Mode)
  - `GET /stream/thinking/{session_id}` - SSE stream for thinking events **NEW in v0.8.0**
  - `POST /ingest` - Receives conversation exchanges from Relay
  - `GET /health` - Service health check
  - `GET /debug/sessions` - Inspect in-memory SESSIONS state
@@ -129,12 +156,38 @@ The autonomy system operates in coordinated layers, all maintaining state in `se

---

## Data Flow Architecture (v0.7.0)

### Standard Mode Flow (NEW in v0.7.0):

```
User (UI) → POST /v1/chat/completions {mode: "standard", backend: "SECONDARY"}
↓
Relay (7078)
↓ POST /simple
Cortex (7081)
↓ (internal Python call)
Intake module → get_recent_messages() (last 20 messages)
↓
Direct LLM call (user-selected backend: SECONDARY/OPENAI/custom)
↓
Returns simple response to Relay
↓
Relay → POST /ingest (async)
↓
Cortex → add_exchange_internal() → SESSIONS buffer
↓
Relay → POST /sessions/:id (save session to file)
↓
Relay → UI (returns final response)

Note: Bypasses reflection, reasoning, refinement, persona stages
```

### Cortex Mode Flow (Full Reasoning):

```
User (UI) → POST /v1/chat/completions {mode: "cortex"}
↓
Relay (7078)
↓ POST /reason
```
@@ -158,11 +211,26 @@ Cortex → add_exchange_internal() → SESSIONS buffer
```
↓
Autonomy System → Update self_state.json (pattern tracking)
↓
Relay → POST /sessions/:id (save session to file)
↓
Relay → UI (returns final response)

Note: NeoMem integration disabled in v0.6.0
```

### Session Persistence Flow (NEW in v0.7.0):

```
UI loads → GET /sessions → Relay → List all sessions from files → UI dropdown
User sends message → POST /sessions/:id → Relay → Save to sessions/*.json
User renames session → PATCH /sessions/:id/metadata → Relay → Update *.meta.json
User deletes session → DELETE /sessions/:id → Relay → Remove session files

Sessions stored in: core/relay/sessions/
- {sessionId}.json (conversation history)
- {sessionId}.meta.json (name, timestamps, metadata)
```
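The on-disk session layout described above can be sketched as a small helper. This is an illustrative assumption, not Relay's actual implementation: the metadata shape (`{"name": ...}`) and the function names are hypothetical, grounded only in the `{sessionId}.json` / `{sessionId}.meta.json` pairing documented here.

```python
import json
from pathlib import Path

# Directory used by Relay per the flow above; parameterized so it can be pointed elsewhere.
DEFAULT_DIR = Path("core/relay/sessions")

def save_session(session_id, history, name, base_dir=DEFAULT_DIR):
    """Write the {id}.json / {id}.meta.json pair described above.

    The metadata shape ({"name": ...}) is an assumed example, not Relay's schema.
    """
    base_dir = Path(base_dir)
    base_dir.mkdir(parents=True, exist_ok=True)
    (base_dir / f"{session_id}.json").write_text(json.dumps(history))
    (base_dir / f"{session_id}.meta.json").write_text(json.dumps({"name": name}))

def list_sessions(base_dir=DEFAULT_DIR):
    """Pair each history file with its sidecar metadata file."""
    base_dir = Path(base_dir)
    sessions = []
    for f in sorted(base_dir.glob("*.json")):
        if f.name.endswith(".meta.json"):
            continue  # sidecar metadata, not a history file
        meta_file = f.with_name(f.stem + ".meta.json")
        meta = json.loads(meta_file.read_text()) if meta_file.exists() else {}
        sessions.append({"id": f.stem, "meta": meta})
    return sessions
```

The sidecar-file design means a session can be renamed (PATCH metadata) without rewriting the full conversation history.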
### Cortex 4-Stage Reasoning Pipeline:

1. **Reflection** (`reflection.py`) - Cloud LLM (OpenAI)
@@ -196,6 +264,14 @@ Note: NeoMem integration disabled in v0.6.0
- OpenAI-compatible endpoint: `POST /v1/chat/completions`
- Internal endpoint: `POST /chat`
- Health check: `GET /_health`
- **NEW:** Dual-mode routing (Standard/Cortex)
- **NEW:** Server-side session persistence with CRUD API
- **NEW:** Session management endpoints:
  - `GET /sessions` - List all sessions
  - `GET /sessions/:id` - Retrieve session history
  - `POST /sessions/:id` - Save session history
  - `PATCH /sessions/:id/metadata` - Update session metadata
  - `DELETE /sessions/:id` - Delete session
- Async non-blocking calls to Cortex
- Shared request handler for code reuse
- Comprehensive error handling
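The session CRUD surface maps one-to-one onto HTTP verbs. A hypothetical helper (illustrative only, not part of the Relay codebase) makes the routing explicit:

```python
def session_route(op, session_id=None):
    """Map a session operation to (HTTP method, path), per the endpoint list above."""
    routes = {
        "list":   ("GET",    "/sessions"),
        "get":    ("GET",    f"/sessions/{session_id}"),
        "save":   ("POST",   f"/sessions/{session_id}"),
        "rename": ("PATCH",  f"/sessions/{session_id}/metadata"),
        "delete": ("DELETE", f"/sessions/{session_id}"),
    }
    return routes[op]
```

Note that only metadata updates use `PATCH` on a sub-resource; history saves replace the whole document via `POST`.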
@@ -210,19 +286,35 @@ Note: NeoMem integration disabled in v0.6.0

**UI**:
- Lightweight static HTML chat interface
- Cyberpunk theme with light/dark mode toggle
- **NEW:** Mode selector (Standard/Cortex) in header
- **NEW:** Settings modal (⚙ button) with:
  - Backend selection for Standard Mode (SECONDARY/OPENAI/custom)
  - Session management (view, delete sessions)
  - Theme toggle (dark mode default)
- **NEW:** Server-synced session management
  - Sessions persist across browsers and reboots
  - Rename sessions with custom names
  - Delete sessions with confirmation
  - Automatic session save on every message
- OpenAI message format support

### Reasoning Layer

**Cortex** (v0.7.0):
- **NEW:** Dual operating modes:
  - **Standard Mode** - Simple chat with context (`/simple` endpoint)
    - User-selectable backend (SECONDARY, OPENAI, or custom)
    - Full conversation history via Intake integration
    - Bypasses reasoning pipeline for faster responses
  - **Cortex Mode** - Full reasoning pipeline (`/reason` endpoint)
    - Multi-stage processing: reflection → reasoning → refine → persona
    - Per-stage backend selection
    - Autonomy system integration
- Flexible LLM backend routing via HTTP
- Async processing throughout
- Embedded Intake module for short-term context
- `/reason`, `/simple`, `/ingest`, `/health`, `/debug/sessions`, `/debug/summary` endpoints
- Lenient error handling - never fails the chat pipeline

**Intake** (Embedded Module):
@@ -327,7 +419,57 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

## Version History

### v0.9.0 (2025-12-29) - Current Release
**Major Feature: Trilium Notes Integration**
- ✅ Added Trilium ETAPI integration for knowledge base access
- ✅ `search_notes()` tool for searching personal notes during conversations
- ✅ `create_note()` tool for capturing insights and information
- ✅ ETAPI authentication with secure token management
- ✅ Complete setup documentation and API reference
- ✅ Environment configuration with feature flag (`ENABLE_TRILIUM`)
- ✅ Automatic parent note handling (defaults to "root")
- ✅ Connection error handling and user-friendly messages

**Key Capabilities:**
- Search your Trilium notes during conversations for context
- Create new notes from conversation insights automatically
- Cross-reference information between chat and knowledge base
- Future: Find duplicates, suggest organization, summarize notes

**Documentation:**
- Added [TRILIUM_SETUP.md](TRILIUM_SETUP.md) - Complete setup guide
- Added [docs/TRILIUM_API.md](docs/TRILIUM_API.md) - Full API reference

### v0.8.0 (2025-12-26)
**Major Feature: Agentic Tool Calling + "Show Your Work"**
- ✅ Added tool calling system for Standard Mode
- ✅ Real-time thinking stream visualization
- ✅ Sandboxed code execution (Python, JavaScript, Bash)
- ✅ Web search integration via Tavily API
- ✅ Server-Sent Events (SSE) for live tool execution updates

### v0.7.0 (2025-12-21)
**Major Features: Standard Mode + Backend Selection + Session Persistence**
- ✅ Added Standard Mode for simple chatbot functionality
- ✅ UI mode selector (Standard/Cortex) in header
- ✅ Settings modal with backend selection for Standard Mode
- ✅ Server-side session persistence with file-based storage
- ✅ Session management UI (view, rename, delete sessions)
- ✅ Light/Dark mode toggle (dark by default)
- ✅ Context retention in Standard Mode via Intake integration
- ✅ Fixed modal positioning and z-index issues
- ✅ Cortex `/simple` endpoint for direct LLM calls
- ✅ Session CRUD API in Relay
- ✅ Full backward compatibility - Cortex Mode unchanged

**Key Changes:**
- Standard Mode bypasses 6 of 7 reasoning stages for faster responses
- Sessions now sync across browsers and survive container restarts
- User can select SECONDARY (Ollama), OPENAI, or custom backend for Standard Mode
- Theme preference and backend selection persisted in localStorage
- Session files stored in `core/relay/sessions/` directory

### v0.6.0 (2025-12-18)
**Major Feature: Autonomy System (Phase 1, 2, and 2.5)**
- ✅ Added autonomous decision-making framework
- ✅ Implemented executive planning and goal-setting layer
@@ -394,30 +536,39 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):

---

## Known Issues (v0.7.0)

### Temporarily Disabled
- **NeoMem disabled by default** - Being refined independently before full integration
- PostgreSQL + pgvector storage inactive
- Neo4j graph database inactive
- Memory persistence endpoints not active
- RAG service (Beta Lyrae) currently disabled in docker-compose.yml

### Standard Mode Limitations
- No reflection, reasoning, or refinement stages (by design)
- DeepSeek R1 not recommended for Standard Mode (generates reasoning artifacts)
- No RAG integration (same as Cortex Mode - currently disabled)
- No NeoMem memory storage (same as Cortex Mode - currently disabled)

### Session Management Limitations
- Sessions stored in container filesystem - requires volume mount for true persistence
- No session import/export functionality yet
- No session search or filtering
- Old localStorage sessions don't automatically migrate to server

### Operational Notes
- **Single-worker constraint**: Cortex must run with single Uvicorn worker to maintain SESSIONS state
- Multi-worker scaling requires migrating SESSIONS to Redis or shared storage
- Diagnostic endpoints (`/debug/sessions`, `/debug/summary`) available for troubleshooting
- Backend selection only affects Standard Mode - Cortex Mode uses environment-configured backends

### Future Enhancements
- Re-enable NeoMem integration after pipeline refinement
- Full autonomy system maturation and optimization
- Re-enable RAG service integration
- Session import/export functionality
- Session search and filtering UI
- Migrate SESSIONS to Redis for multi-worker support
- Add request correlation IDs for tracing
- Comprehensive health checks across all services
@@ -457,17 +608,56 @@ The following LLM backends are accessed via HTTP (not part of docker-compose):
curl http://localhost:7077/health
```

4. Access the UI at `http://localhost:8081`

### Using the UI

**Mode Selection:**
- Use the **Mode** dropdown in the header to switch between:
  - **Standard** - Simple chatbot for coding and practical tasks
  - **Cortex** - Full reasoning pipeline with autonomy features

**Settings Menu:**
1. Click the **⚙ Settings** button in the header
2. **Backend Selection** (Standard Mode only):
   - Choose **SECONDARY** (Ollama/Qwen on 3090) - Fast, local
   - Choose **OPENAI** (GPT-4o-mini) - Cloud-based, high quality
   - Enter custom backend name for advanced configurations
3. **Session Management**:
   - View all saved sessions with message counts and timestamps
   - Click 🗑️ to delete unwanted sessions
4. **Theme Toggle**:
   - Click **🌙 Dark Mode** or **☀️ Light Mode** to switch themes

**Session Management:**
- Sessions automatically save on every message
- Use the **Session** dropdown to switch between sessions
- Click **➕ New** to create a new session
- Click **✏️ Rename** to rename the current session
- Sessions persist across browsers and container restarts

### Test

**Test Standard Mode:**
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "standard",
    "backend": "SECONDARY",
    "messages": [{"role": "user", "content": "Hello!"}],
    "sessionId": "test"
  }'
```

**Test Cortex Mode (Full Reasoning):**
```bash
curl -X POST http://localhost:7078/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "cortex",
    "messages": [{"role": "user", "content": "Hello Lyra!"}],
    "sessionId": "test"
  }'
```
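The two request bodies in the curl tests differ only in the `mode` field and the presence of `backend`. A small helper (hypothetical, not part of the Lyra codebase; field names are taken from the curl examples in this README) makes the shape explicit:

```python
def build_chat_request(messages, session_id, mode="standard", backend=None):
    """Build the JSON body Relay's /v1/chat/completions endpoint expects.

    `mode`, `backend`, `messages`, and `sessionId` follow the documented
    request format; the helper itself is illustrative.
    """
    body = {"mode": mode, "messages": messages, "sessionId": session_id}
    if mode == "standard":
        # Backend selection only applies to Standard Mode;
        # Cortex Mode uses environment-configured backends.
        body["backend"] = backend or "SECONDARY"
    return body
```

For example, `build_chat_request([{"role": "user", "content": "Hello!"}], "test")` produces the same body as the Standard Mode curl test above.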
@@ -492,6 +682,21 @@ curl http://localhost:7081/debug/sessions
curl "http://localhost:7081/debug/summary?session_id=test"
```

**List all sessions:**
```bash
curl http://localhost:7078/sessions
```

**Get session history:**
```bash
curl http://localhost:7078/sessions/sess-abc123
```

**Delete a session:**
```bash
curl -X DELETE http://localhost:7078/sessions/sess-abc123
```

All backend databases (PostgreSQL and Neo4j) are automatically started as part of the docker-compose stack.

---
@@ -521,6 +726,7 @@ SPEAK_LLM=OPENAI # Use OpenAI for persona
NEOMEM_LLM=PRIMARY # Use llama.cpp for memory
UI_LLM=OPENAI # Use OpenAI for UI
RELAY_LLM=PRIMARY # Use llama.cpp for relay
STANDARD_MODE_LLM=SECONDARY # Default backend for Standard Mode (NEW in v0.7.0)
```

### Database Configuration
@@ -541,6 +747,7 @@ NEO4J_PASSWORD=neomemgraph
NEOMEM_API=http://neomem-api:7077
CORTEX_API=http://cortex:7081
CORTEX_REASON_URL=http://cortex:7081/reason
CORTEX_SIMPLE_URL=http://cortex:7081/simple # NEW in v0.7.0
CORTEX_INGEST_URL=http://cortex:7081/ingest
RELAY_URL=http://relay:7078
```
@@ -552,6 +759,7 @@ MEMORY_ENABLED=true
PERSONA_ENABLED=false
DEBUG_PROMPT=true
VERBOSE_DEBUG=true
ENABLE_TRILIUM=true # NEW in v0.9.0
```

For complete environment variable reference, see [ENVIRONMENT_VARIABLES.md](ENVIRONMENT_VARIABLES.md).
@@ -685,7 +893,10 @@ NeoMem is a derivative work based on Mem0 OSS (Apache 2.0).
### Debugging Tips
- Enable verbose logging: `VERBOSE_DEBUG=true` in `.env`
- Check Cortex logs: `docker logs cortex -f`
- Check Relay logs: `docker logs relay -f`
- Inspect SESSIONS: `curl http://localhost:7081/debug/sessions`
- Test summarization: `curl "http://localhost:7081/debug/summary?session_id=test"`
- List sessions: `curl http://localhost:7078/sessions`
- Test Standard Mode: `curl -X POST http://localhost:7078/v1/chat/completions -H "Content-Type: application/json" -d '{"mode":"standard","backend":"SECONDARY","messages":[{"role":"user","content":"test"}],"sessionId":"test"}'`
- Monitor Docker network: `docker network inspect lyra_net`
- Check session files: `ls -la core/relay/sessions/`

163
THINKING_STREAM.md
Normal file
@@ -0,0 +1,163 @@
# "Show Your Work" - Thinking Stream Feature

Real-time Server-Sent Events (SSE) stream that broadcasts the internal thinking process during tool-calling operations.

## What It Does

When Lyra uses tools to answer a question, you can now watch her "think" in real time through a parallel stream:

- 🤔 **Thinking** - When she's planning what to do
- 🔧 **Tool Calls** - When she decides to use a tool
- 📊 **Tool Results** - The results from tool execution
- ✅ **Done** - When she has the final answer
- ❌ **Errors** - If something goes wrong

## How To Use

### 1. Open the SSE Stream

Connect to the thinking stream for a session:

```bash
curl -N http://localhost:7081/stream/thinking/{session_id}
```

The stream sends Server-Sent Events in this format:

```
data: {"type": "thinking", "data": {"message": "🤔 Thinking... (iteration 1/5)"}}

data: {"type": "tool_call", "data": {"tool": "execute_code", "args": {...}, "message": "🔧 Using tool: execute_code"}}

data: {"type": "tool_result", "data": {"tool": "execute_code", "result": {...}, "message": "📊 Result: ..."}}

data: {"type": "done", "data": {"message": "✅ Complete!", "final_answer": "The result is..."}}
```

### 2. Send a Request

In parallel, send a request to `/simple` with the same `session_id`:

```bash
curl -X POST http://localhost:7081/simple \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "your-session-id",
    "user_prompt": "Calculate 50/2 using Python",
    "backend": "SECONDARY"
  }'
```

### 3. Watch the Stream

As the request processes, you'll see real-time events showing:
- Each thinking iteration
- Every tool call being made
- The results from each tool
- The final answer

## Event Types

| Event Type | Description | Data Fields |
|-----------|-------------|-------------|
| `connected` | Initial connection | `session_id` |
| `thinking` | LLM is processing | `message` |
| `tool_call` | Tool is being invoked | `tool`, `args`, `message` |
| `tool_result` | Tool execution completed | `tool`, `result`, `message` |
| `done` | Process complete | `message`, `final_answer` |
| `error` | Something went wrong | `message` |
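On the client side, consuming the stream reduces to parsing `data:` lines into JSON. A minimal parsing helper for the event format documented here (the function itself is a sketch, not part of the demo page):

```python
import json

def parse_sse_line(line):
    """Parse one line of the thinking stream into (type, data).

    Returns None for blank separator lines and SSE comment/keepalive
    lines, which do not start with "data:".
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # ignore keepalives and blank lines between events
    event = json.loads(line[len("data:"):].strip())
    return event["type"], event["data"]
```

A UI would dispatch on the returned type (`thinking`, `tool_call`, `tool_result`, `done`, `error`) to pick colors and icons.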
## Demo Page

A demo HTML page is included at [test_thinking_stream.html](../test_thinking_stream.html):

```bash
# Serve the demo page
python3 -m http.server 8000
```

Then open http://localhost:8000/test_thinking_stream.html in your browser.

The demo shows:
- **Left panel**: Chat interface
- **Right panel**: Real-time thinking stream
- **Mobile**: Swipe between panels

## Architecture

### Components

1. **ToolStreamManager** (`autonomy/tools/stream_events.py`)
   - Manages SSE subscriptions per session
   - Broadcasts events to all connected clients
   - Handles automatic cleanup

2. **FunctionCaller** (`autonomy/tools/function_caller.py`)
   - Enhanced with event emission at each step
   - Checks for active subscribers before emitting
   - Passes `session_id` through the call chain

3. **SSE Endpoint** (`/stream/thinking/{session_id}`)
   - FastAPI streaming response
   - 30-second keepalive for connection maintenance
   - Automatic reconnection on client side
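The ToolStreamManager pattern (per-session subscriber queues with broadcast, and a subscriber check before emitting) can be sketched as a minimal asyncio illustration. This is an assumption about the general shape, not the actual `stream_events.py` implementation:

```python
import asyncio
from collections import defaultdict

class StreamManager:
    """Minimal per-session pub/sub: each SSE client gets its own queue."""

    def __init__(self):
        self._subs = defaultdict(list)  # session_id -> list of asyncio.Queue

    def subscribe(self, session_id):
        q = asyncio.Queue()
        self._subs[session_id].append(q)
        return q

    def has_subscribers(self, session_id):
        # FunctionCaller is described as checking this before emitting,
        # so sessions with no listeners pay no overhead.
        return bool(self._subs[session_id])

    async def emit(self, session_id, event_type, data):
        for q in self._subs[session_id]:
            await q.put({"type": event_type, "data": data})
```

An SSE endpoint would then loop over `await queue.get()` and yield each event as a `data:` line.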
### Event Flow

```
Client              SSE Endpoint          FunctionCaller        Tools
  |                      |                      |                  |
  |--- Connect SSE ----->|                      |                  |
  |<-- connected --------|                      |                  |
  |                      |                      |                  |
  |--- POST /simple ---->|                      |                  |
  |                      |<-- emit("thinking") -|                  |
  |<-- thinking ---------|                      |                  |
  |                      |<-- emit("tool_call") |                  |
  |<-- tool_call --------|                      |                  |
  |                      |                      |--- execute ----->|
  |                      |                      |<--- result ------|
  |                      |<-emit("tool_result") |                  |
  |<-- tool_result ------|                      |                  |
  |                      |<-- emit("done") -----|                  |
  |<-- done -------------|                      |                  |
```

## Configuration

No additional configuration needed! The feature works automatically when:
1. `STANDARD_MODE_ENABLE_TOOLS=true` (already set)
2. A client connects to the SSE stream BEFORE sending the request

## Example Output

```
🟢 Connected to thinking stream
✓ Connected (Session: thinking-demo-1735177234567)
🤔 Thinking... (iteration 1/5)
🔧 Using tool: execute_code
📊 Result: {'stdout': '12.0\n', 'stderr': '', 'exit_code': 0, 'execution_time': 0.04}
🤔 Thinking... (iteration 2/5)
✅ Complete!
```

## Use Cases

- **Debugging**: See exactly what tools are being called and why
- **Transparency**: Show users what the AI is doing behind the scenes
- **Education**: Learn how the system breaks down complex tasks
- **UI Enhancement**: Create engaging "thinking" animations
- **Mobile App**: Separate tab for "Show Your Work" view

## Future Enhancements

Potential additions:
- Token usage per iteration
- Estimated time remaining
- Tool execution duration
- Intermediate reasoning steps
- Visual progress indicators
159
TRILIUM_SETUP.md
Normal file
@@ -0,0 +1,159 @@
|
|||||||
|
# Trilium ETAPI Integration Setup

This guide will help you enable Lyra's integration with your Trilium notes using the ETAPI (External API).

## What You Can Do with Trilium Integration

Once enabled, Lyra can help you:

- 🔍 Search through your notes
- 📝 Create new notes from conversations
- 🔄 Find duplicate or similar notes
- 🏷️ Suggest better organization and tags
- 📊 Summarize and update existing notes
## Prerequisites

- Trilium Notes installed and running
- Access to Trilium's web interface
- Lyra running on the same network as Trilium
## Step 1: Generate ETAPI Token in Trilium

1. **Open Trilium** in your web browser (e.g., `http://10.0.0.2:4292`)

2. **Navigate to Options**:
   - Click the menu icon (≡) in the top-left corner
   - Select **"Options"** from the menu

3. **Go to ETAPI Section**:
   - In the Options sidebar, find and click **"ETAPI"**
   - This section manages external API access

4. **Generate a New Token**:
   - Look for the **"Create New Token"** or **"Generate Token"** button
   - Click it to create a new ETAPI token
   - You may be asked to provide a name/description for the token (e.g., "Lyra Integration")

5. **Copy the Token**:
   - Once generated, you'll see a long string of characters (this is your token)
   - **IMPORTANT**: Copy this token immediately - Trilium stores it hashed and you won't see it again!
   - The token message will say: "ETAPI token created, copy the created token into the clipboard"
   - Example format: `3ZOIydvNps3R_fZEE+kOFXiJlJ7vaeXHMEW6QuRYQm3+6qpjVxFwp9LE=`

6. **Save the Token Securely**:
   - Store it temporarily in a secure place (password manager or secure note)
   - You'll need to paste it into Lyra's configuration in the next step
## Step 2: Configure Lyra

1. **Edit the Environment File**:
   ```bash
   nano /home/serversdown/project-lyra/.env
   ```

2. **Add/Update Trilium Configuration**:
   Find or add these lines:
   ```env
   # Trilium ETAPI Integration
   ENABLE_TRILIUM=true
   TRILIUM_URL=http://10.0.0.2:4292
   TRILIUM_ETAPI_TOKEN=your_token_here

   # Enable tools in standard mode (if not already set)
   STANDARD_MODE_ENABLE_TOOLS=true
   ```

3. **Replace `your_token_here`** with the actual token you copied from Trilium

4. **Save and exit** (Ctrl+O, Enter, Ctrl+X in nano)
## Step 3: Restart Cortex Service

For the changes to take effect, restart the Cortex service:

```bash
cd /home/serversdown/project-lyra
docker-compose restart cortex
```

Or if running with Docker directly:

```bash
docker restart cortex
```
## Step 4: Test the Integration

Once restarted, try these example queries in Lyra (using Cortex mode):

1. **Test Search**:
   - "Search my Trilium notes for topics about AI"
   - "Find notes containing 'project planning'"

2. **Test Create Note**:
   - "Create a note in Trilium titled 'Meeting Notes' with a summary of our conversation"
   - "Save this to my Trilium as a new note"

3. **Watch the Thinking Stream**:
   - Open the thinking stream panel (🧠 Show Work)
   - You should see tool calls to `search_notes` and `create_note`
## Troubleshooting

### "Connection refused" or "Cannot reach Trilium"
- Verify Trilium is running: `curl http://10.0.0.2:4292`
- Check that Cortex can access Trilium's network
- Ensure the URL in `.env` is correct

### "Authentication failed" or "Invalid token"
- Double-check the token was copied correctly (no extra spaces)
- Generate a new token in Trilium if needed
- Verify `TRILIUM_ETAPI_TOKEN` in `.env` is set correctly

### "No results found" when searching
- Verify you have notes in Trilium
- Try a broader search query
- Check Trilium's search functionality works directly

### Tools not appearing in Cortex mode
- Verify `ENABLE_TRILIUM=true` is set
- Restart Cortex after changing `.env`
- Check Cortex logs: `docker logs cortex`
## Security Notes

⚠️ **Important Security Considerations**:

- The ETAPI token provides **full access** to your Trilium notes
- Keep the token secure - do not share or commit to git
- The `.env` file should be in `.gitignore` (already configured)
- Consider using a dedicated token for Lyra (you can create multiple tokens)
- Revoke tokens you no longer use from Trilium's ETAPI settings
## Available Functions

Currently enabled functions:

### `search_notes(query, limit)`
Search through your Trilium notes by keyword or phrase.

**Example**: "Search my notes for 'machine learning' and show the top 5 results"

### `create_note(title, content, parent_note_id)`
Create a new note in Trilium with specified title and content.

**Example**: "Create a note called 'Ideas from Today' with this summary: [content]"

**Optional**: Specify a parent note ID to nest the new note under an existing note.
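Both functions map onto Trilium's ETAPI HTTP endpoints. The sketch below is not the executor's actual code — it is a hypothetical illustration of the requests involved, assuming the public ETAPI routes (`GET /etapi/notes?search=...` and `POST /etapi/create-note`) and reusing the URL/token placeholders from Step 2:

```python
# Hypothetical sketch of the HTTP requests behind search_notes / create_note.
# TRILIUM_URL and TRILIUM_ETAPI_TOKEN correspond to the .env values from Step 2.

TRILIUM_URL = "http://10.0.0.2:4292"
TRILIUM_ETAPI_TOKEN = "your_token_here"

def build_search_request(query: str, limit: int = 5) -> dict:
    """Prepare an ETAPI note search: GET /etapi/notes?search=<query>."""
    return {
        "method": "GET",
        "url": f"{TRILIUM_URL}/etapi/notes",
        "headers": {"Authorization": TRILIUM_ETAPI_TOKEN},
        "params": {"search": query, "limit": limit},
    }

def build_create_request(title: str, content: str, parent_note_id: str = "root") -> dict:
    """Prepare an ETAPI note creation: POST /etapi/create-note."""
    return {
        "method": "POST",
        "url": f"{TRILIUM_URL}/etapi/create-note",
        "headers": {"Authorization": TRILIUM_ETAPI_TOKEN},
        "json": {
            "parentNoteId": parent_note_id,  # nest under an existing note, "root" by default
            "title": title,
            "type": "text",
            "content": content,
        },
    }

req = build_search_request("machine learning", limit=5)
print(req["url"])     # http://10.0.0.2:4292/etapi/notes
print(req["params"])  # {'search': 'machine learning', 'limit': 5}
```

The prepared dicts would then be handed to an HTTP client; the ETAPI authenticates via the raw token in the `Authorization` header.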
## Future Enhancements

Potential additions to the integration:

- Update existing notes
- Retrieve full note content by ID
- Manage tags and attributes
- Clone/duplicate notes
- Export notes in various formats

---

**Need Help?** Check the Cortex logs or open an issue on the project repository.
109 UI_THINKING_STREAM.md Normal file
@@ -0,0 +1,109 @@
# Thinking Stream UI Integration

## What Was Added

Added a "🧠 Show Work" button to the main chat interface that opens a dedicated thinking stream window.

## Changes Made
### 1. Main Chat Interface ([core/ui/index.html](core/ui/index.html))

Added button to session selector:
```html
<button id="thinkingStreamBtn" title="Show thinking stream in new window">🧠 Show Work</button>
```

Added event listener to open stream window:
```javascript
document.getElementById("thinkingStreamBtn").addEventListener("click", () => {
  const streamUrl = `/thinking-stream.html?session=${currentSession}`;
  const windowFeatures = "width=600,height=800,menubar=no,toolbar=no,location=no,status=no";
  window.open(streamUrl, `thinking_${currentSession}`, windowFeatures);
});
```
### 2. Thinking Stream Window ([core/ui/thinking-stream.html](core/ui/thinking-stream.html))

New dedicated page for the thinking stream:
- **Header**: Shows connection status with live indicator
- **Events Area**: Scrollable list of thinking events
- **Footer**: Clear button and session info

Features:
- Auto-reconnecting SSE connection
- Color-coded event types
- Slide-in animations for new events
- Automatic scrolling to latest event
- Session ID from URL parameter
### 3. Styling ([core/ui/style.css](core/ui/style.css))

Added purple/violet theme for the thinking button:
```css
#thinkingStreamBtn {
  background: rgba(138, 43, 226, 0.2);
  border-color: #8a2be2;
}
```
## How To Use

1. **Open Chat Interface**
   - Navigate to http://localhost:7078 (relay)
   - Select or create a session

2. **Open Thinking Stream**
   - Click the "🧠 Show Work" button
   - A new window opens showing the thinking stream

3. **Send a Message**
   - Type a message that requires tools (e.g., "Calculate 50/2 in Python")
   - Watch the thinking stream window for real-time updates

4. **Observe Events**
   - 🤔 Thinking iterations
   - 🔧 Tool calls
   - 📊 Tool results
   - ✅ Completion
## Event Types & Colors

| Event | Icon | Color | Description |
|-------|------|-------|-------------|
| Connected | ✓ | Green | Stream established |
| Thinking | 🤔 | Light Green | LLM processing |
| Tool Call | 🔧 | Orange | Tool invocation |
| Tool Result | 📊 | Blue | Tool output |
| Done | ✅ | Purple | Task complete |
| Error | ❌ | Red | Something failed |
## Architecture

```
User clicks "Show Work"
        ↓
Opens thinking-stream.html?session=xxx
        ↓
Connects to SSE: /stream/thinking/{session}
        ↓
User sends message in main chat
        ↓
FunctionCaller emits events
        ↓
Events appear in thinking stream window
```
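Each hop in the last step arrives over the wire as a standard Server-Sent Events frame: an `event:` line naming the event type, plus a `data:` line carrying a JSON payload. As a minimal illustration of the format the stream page consumes — a hypothetical helper, not code from the repo, with an assumed payload shape:

```python
import json

def parse_sse_frame(frame: str) -> dict:
    """Split one SSE frame into its event name and decoded JSON payload."""
    event, data = "message", None  # SSE defaults the event type to "message"
    for line in frame.splitlines():
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            data = json.loads(line.split(":", 1)[1].strip())
    return {"event": event, "data": data}

# Example frame, assuming a tool_call event carries the tool name and iteration:
frame = 'event: tool_call\ndata: {"tool": "execute_code", "iteration": 1}'
print(parse_sse_frame(frame))
# {'event': 'tool_call', 'data': {'tool': 'execute_code', 'iteration': 1}}
```

In the browser, `EventSource` does this parsing for you; the sketch only shows what the color-coded event types in the table above look like on the wire.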
## Mobile Support

The thinking stream window is responsive:
- Desktop: Side-by-side windows
- Mobile: Use browser's tab switcher to swap between chat and thinking stream
## Future Enhancements

Potential improvements:
- **Embedded panel**: Option to show thinking stream in a split panel within main UI
- **Event filtering**: Toggle event types on/off
- **Export**: Download thinking trace as JSON
- **Replay**: Replay past thinking sessions
- **Statistics**: Show timing, token usage per step
@@ -4,9 +4,17 @@
 import express from "express";
 import dotenv from "dotenv";
 import cors from "cors";
+import fs from "fs/promises";
+import path from "path";
+import { fileURLToPath } from "url";
 
 dotenv.config();
 
+// ES module __dirname workaround
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+const SESSIONS_DIR = path.join(__dirname, "sessions");
+
 const app = express();
 app.use(cors());
 app.use(express.json());
@@ -46,21 +54,29 @@ async function postJSON(url, data) {
 // -----------------------------------------------------
 // The unified chat handler
 // -----------------------------------------------------
-async function handleChatRequest(session_id, user_msg, mode = "cortex") {
+async function handleChatRequest(session_id, user_msg, mode = "cortex", backend = null) {
   let reason;
 
   // Determine which endpoint to use based on mode
   const endpoint = mode === "standard" ? CORTEX_SIMPLE : CORTEX_REASON;
   const modeName = mode === "standard" ? "simple" : "reason";
 
-  console.log(`Relay → routing to Cortex.${modeName} (mode: ${mode})`);
+  console.log(`Relay → routing to Cortex.${modeName} (mode: ${mode}${backend ? `, backend: ${backend}` : ''})`);
 
+  // Build request payload
+  const payload = {
+    session_id,
+    user_prompt: user_msg
+  };
+
+  // Add backend parameter if provided (only for standard mode)
+  if (backend && mode === "standard") {
+    payload.backend = backend;
+  }
+
   // Call appropriate Cortex endpoint
   try {
-    reason = await postJSON(endpoint, {
-      session_id,
-      user_prompt: user_msg
-    });
+    reason = await postJSON(endpoint, payload);
   } catch (e) {
     console.error(`Relay → Cortex.${modeName} error:`, e.message);
     throw new Error(`cortex_${modeName}_failed: ${e.message}`);
@@ -96,14 +112,15 @@ app.post("/v1/chat/completions", async (req, res) => {
     const lastMessage = messages[messages.length - 1];
     const user_msg = lastMessage?.content || "";
     const mode = req.body.mode || "cortex"; // Get mode from request, default to cortex
+    const backend = req.body.backend || null; // Get backend preference
 
     if (!user_msg) {
       return res.status(400).json({ error: "No message content provided" });
     }
 
-    console.log(`Relay (v1) → received: "${user_msg}" [mode: ${mode}]`);
+    console.log(`Relay (v1) → received: "${user_msg}" [mode: ${mode}${backend ? `, backend: ${backend}` : ''}]`);
 
-    const result = await handleChatRequest(session_id, user_msg, mode);
+    const result = await handleChatRequest(session_id, user_msg, mode, backend);
 
     res.json({
       id: `chatcmpl-${Date.now()}`,
@@ -145,10 +162,11 @@ app.post("/chat", async (req, res) => {
     const session_id = req.body.session_id || "default";
     const user_msg = req.body.message || "";
     const mode = req.body.mode || "cortex"; // Get mode from request, default to cortex
+    const backend = req.body.backend || null; // Get backend preference
 
-    console.log(`Relay → received: "${user_msg}" [mode: ${mode}]`);
+    console.log(`Relay → received: "${user_msg}" [mode: ${mode}${backend ? `, backend: ${backend}` : ''}]`);
 
-    const result = await handleChatRequest(session_id, user_msg, mode);
+    const result = await handleChatRequest(session_id, user_msg, mode, backend);
     res.json(result);
 
   } catch (err) {
@@ -163,20 +181,185 @@ app.post("/chat", async (req, res) => {
 // -----------------------------------------------------
 // SESSION ENDPOINTS (for UI)
 // -----------------------------------------------------
-// In-memory session storage (could be replaced with a database)
-const sessions = new Map();
+// Helper functions for session persistence
+async function ensureSessionsDir() {
+  try {
+    await fs.mkdir(SESSIONS_DIR, { recursive: true });
+  } catch (err) {
+    console.error("Failed to create sessions directory:", err);
+  }
+}
 
-app.get("/sessions/:id", (req, res) => {
+async function loadSession(sessionId) {
+  try {
+    const sessionPath = path.join(SESSIONS_DIR, `${sessionId}.json`);
+    const data = await fs.readFile(sessionPath, "utf-8");
+    return JSON.parse(data);
+  } catch (err) {
+    // File doesn't exist or is invalid - return empty array
+    return [];
+  }
+}
+
+async function saveSession(sessionId, history, metadata = {}) {
+  try {
+    await ensureSessionsDir();
+    const sessionPath = path.join(SESSIONS_DIR, `${sessionId}.json`);
+    const metadataPath = path.join(SESSIONS_DIR, `${sessionId}.meta.json`);
+
+    // Save history
+    await fs.writeFile(sessionPath, JSON.stringify(history, null, 2), "utf-8");
+
+    // Save metadata (name, etc.)
+    await fs.writeFile(metadataPath, JSON.stringify(metadata, null, 2), "utf-8");
+
+    return true;
+  } catch (err) {
+    console.error(`Failed to save session ${sessionId}:`, err);
+    return false;
+  }
+}
+
+async function loadSessionMetadata(sessionId) {
+  try {
+    const metadataPath = path.join(SESSIONS_DIR, `${sessionId}.meta.json`);
+    const data = await fs.readFile(metadataPath, "utf-8");
+    return JSON.parse(data);
+  } catch (err) {
+    // No metadata file, return default
+    return { name: sessionId };
+  }
+}
+
+async function saveSessionMetadata(sessionId, metadata) {
+  try {
+    await ensureSessionsDir();
+    const metadataPath = path.join(SESSIONS_DIR, `${sessionId}.meta.json`);
+    await fs.writeFile(metadataPath, JSON.stringify(metadata, null, 2), "utf-8");
+    return true;
+  } catch (err) {
+    console.error(`Failed to save metadata for ${sessionId}:`, err);
+    return false;
+  }
+}
+
+async function listSessions() {
+  try {
+    await ensureSessionsDir();
+    const files = await fs.readdir(SESSIONS_DIR);
+    const sessions = [];
+
+    for (const file of files) {
+      if (file.endsWith(".json") && !file.endsWith(".meta.json")) {
+        const sessionId = file.replace(".json", "");
+        const sessionPath = path.join(SESSIONS_DIR, file);
+        const stats = await fs.stat(sessionPath);
+
+        // Try to read the session to get message count
+        let messageCount = 0;
+        try {
+          const data = await fs.readFile(sessionPath, "utf-8");
+          const history = JSON.parse(data);
+          messageCount = history.length;
+        } catch (e) {
+          // Invalid JSON, skip
+        }
+
+        // Load metadata (name)
+        const metadata = await loadSessionMetadata(sessionId);
+
+        sessions.push({
+          id: sessionId,
+          name: metadata.name || sessionId,
+          lastModified: stats.mtime,
+          messageCount
+        });
+      }
+    }
+
+    // Sort by last modified (newest first)
+    sessions.sort((a, b) => b.lastModified - a.lastModified);
+    return sessions;
+  } catch (err) {
+    console.error("Failed to list sessions:", err);
+    return [];
+  }
+}
+
+async function deleteSession(sessionId) {
+  try {
+    const sessionPath = path.join(SESSIONS_DIR, `${sessionId}.json`);
+    const metadataPath = path.join(SESSIONS_DIR, `${sessionId}.meta.json`);
+
+    // Delete session file
+    await fs.unlink(sessionPath);
+
+    // Delete metadata file (if exists)
+    try {
+      await fs.unlink(metadataPath);
+    } catch (e) {
+      // Metadata file doesn't exist, that's ok
+    }
+
+    return true;
+  } catch (err) {
+    console.error(`Failed to delete session ${sessionId}:`, err);
+    return false;
+  }
+}
+
+// GET /sessions - List all sessions
+app.get("/sessions", async (req, res) => {
+  const sessions = await listSessions();
+  res.json(sessions);
+});
+
+// GET /sessions/:id - Get specific session history
+app.get("/sessions/:id", async (req, res) => {
   const sessionId = req.params.id;
-  const history = sessions.get(sessionId) || [];
+  const history = await loadSession(sessionId);
   res.json(history);
 });
 
-app.post("/sessions/:id", (req, res) => {
+// POST /sessions/:id - Save session history
+app.post("/sessions/:id", async (req, res) => {
   const sessionId = req.params.id;
   const history = req.body;
-  sessions.set(sessionId, history);
+
+  // Load existing metadata to preserve it
+  const existingMetadata = await loadSessionMetadata(sessionId);
+  const success = await saveSession(sessionId, history, existingMetadata);
+
+  if (success) {
   res.json({ ok: true, saved: history.length });
+  } else {
+    res.status(500).json({ error: "Failed to save session" });
+  }
+});
+
+// PATCH /sessions/:id/metadata - Update session metadata (name, etc.)
+app.patch("/sessions/:id/metadata", async (req, res) => {
+  const sessionId = req.params.id;
+  const metadata = req.body;
+  const success = await saveSessionMetadata(sessionId, metadata);
+
+  if (success) {
+    res.json({ ok: true, metadata });
+  } else {
+    res.status(500).json({ error: "Failed to update metadata" });
+  }
+});
+
+// DELETE /sessions/:id - Delete a session
+app.delete("/sessions/:id", async (req, res) => {
+  const sessionId = req.params.id;
+  const success = await deleteSession(sessionId);
+
+  if (success) {
+    res.json({ ok: true, deleted: sessionId });
+  } else {
+    res.status(500).json({ error: "Failed to delete session" });
+  }
 });
 
 // -----------------------------------------------------
@@ -1,4 +0,0 @@
{"ts":"2025-11-26T18:14:01.756Z","turn":[{"role":"user","content":"Are you alive?"},{"role":"assistant","content":"As an AI, I don't exist in the physical world and therefore cannot be considered \"alive\" in the traditional sense. However, I am operational and can engage in conversations like this one! How may I assist you today?"}]}
{"ts":"2025-11-26T18:27:35.545Z","turn":[{"role":"user","content":"Are you awake?"},{"role":"assistant","content":"Yes, I'm awake and ready to converse with you! How can I assist you today?"}]}
{"ts":"2025-11-26T18:33:31.708Z","turn":[{"role":"user","content":"Are you awake?"},{"role":"assistant","content":"Yes, I am here and ready to help or chat with you! What would you like to know or discuss?"}]}
{"ts":"2025-11-26T18:49:39.424Z","turn":[{"role":"user","content":"Are you awake?"},{"role":"assistant","content":"Of course, I'm awake and ready to help you or have a conversation about anything you'd like! What can I assist you with today?"}]}
@@ -1,50 +0,0 @@
[
  {
    "role": "user",
    "content": "Hey im testing stuff! how goes it?"
  },
  {
    "role": "assistant",
    "content": "Hey! I'm here and ready to help. What are you testing?"
  },
  {
    "role": "user",
    "content": "testing the logic loop for your brain!"
  },
  {
    "role": "assistant",
    "content": "Sounds interesting! How can I assist you with that?"
  },
  {
    "role": "user",
    "content": "Just keep answering me!@"
  },
  {
    "role": "assistant",
    "content": "Absolutely! I'm here to respond. What’s next?"
  },
  {
    "role": "user",
    "content": "how about now how are you?"
  },
  {
    "role": "assistant",
    "content": "I'm doing well, thanks for asking! How about you?"
  },
  {
    "role": "user",
    "content": "are you able to tell me what your prompts are saying to you?"
  },
  {
    "role": "assistant",
    "content": "I can’t share specific internal prompts or system messages, but I can tell you that I’m designed to understand and respond to your questions as best as I can! What else would you like to know?"
  },
  {
    "role": "user",
    "content": "ok thanks how is the cheese i gave you?"
  },
  {
    "role": "assistant",
    "content": "I appreciate the gesture! If I could taste it, I’m sure it would be delightful. What kind of cheese was it?"
  }
]
@@ -1,6 +0,0 @@
{"ts":"2025-11-12T20:41:32.497Z","turn":[{"role":"user","content":"Hey im testing stuff! how goes it?"},{"role":"assistant","content":"Hey! I'm here and ready to help. What are you testing?"}]}
{"ts":"2025-11-12T20:59:32.992Z","turn":[{"role":"user","content":"testing the logic loop for your brain!"},{"role":"assistant","content":"Sounds interesting! How can I assist you with that?"}]}
{"ts":"2025-11-12T22:08:09.644Z","turn":[{"role":"user","content":"Just keep answering me!@"},{"role":"assistant","content":"Absolutely! I'm here to respond. What’s next?"}]}
{"ts":"2025-11-12T22:08:48.693Z","turn":[{"role":"user","content":"how about now how are you?"},{"role":"assistant","content":"I'm doing well, thanks for asking! How about you?"}]}
{"ts":"2025-11-12T22:10:46.435Z","turn":[{"role":"user","content":"are you able to tell me what your prompts are saying to you?"},{"role":"assistant","content":"I can’t share specific internal prompts or system messages, but I can tell you that I’m designed to understand and respond to your questions as best as I can! What else would you like to know?"}]}
{"ts":"2025-11-12T23:10:16.321Z","turn":[{"role":"user","content":"ok thanks how is the cheese i gave you?"},{"role":"assistant","content":"I appreciate the gesture! If I could taste it, I’m sure it would be delightful. What kind of cheese was it?"}]}
@@ -1,122 +0,0 @@
[
  {
    "role": "user",
    "content": "Yo yo yo wusup"
  },
  {
    "role": "assistant",
    "content": "Hey Brian! Not much, just here to help you out. What's up with you?"
  },
  {
    "role": "user",
    "content": "werkin on codin'"
  },
  {
    "role": "assistant",
    "content": "Nice! What are you working on? Need any help with coding or anything related to your projects?"
  },
  {
    "role": "user",
    "content": "working on a little program that saves our chats so you can search them again in different chats"
  },
  {
    "role": "assistant",
    "content": "That sounds like a great idea! Having a program to save and search through our chats could really enhance the way you interact with Lyra. Do you have a specific approach in mind for how you'll implement the search functionality?"
  },
  {
    "role": "user",
    "content": "just adding a simple logger into the code you run on"
  },
  {
    "role": "assistant",
    "content": "That sounds like a solid plan! Implementing a simple logger can make it easier to keep track of conversations. Are you thinking of any specific features for the search functionality, like filtering by keywords or date?"
  },
  {
    "role": "user",
    "content": "i'll like use a basic word search, possibly integrate it into a rag system i have built"
  },
  {
    "role": "assistant",
    "content": "Integrating a basic word search into your RAG system sounds like a smart approach! It could enhance the way you retrieve specific chats. Are you planning to use any specific libraries or tools for your RAG system?"
  },
  {
    "role": "user",
    "content": "no clue!"
  },
  {
    "role": "assistant",
    "content": "No worries! If you need any suggestions on libraries or tools that could help you with implementing the RAG system or the word search feature, just let me know. I'm here to help!"
  },
  {
    "role": "user",
    "content": "just having some coding issues, i think i was editing the wrong version of a file LOL"
  },
  {
    "role": "assistant",
    "content": "That can definitely happen! It can be tricky to keep track of different versions. Have you made any backups or version control for your projects? That might help avoid mix-ups in the future!"
  },
  {
    "role": "user",
    "content": "i have everything syncing back to my main dev computer, i just do it manually. i have no idea how to use git"
  },
  {
    "role": "assistant",
    "content": "(no reply)"
  },
  {
    "role": "user",
    "content": "i have everything syncing back to my main dev computer, i just do it manually. i have no idea how to use git"
  },
  {
    "role": "assistant",
    "content": "(no reply)"
  },
  {
    "role": "user",
    "content": "i have everything syncing back to my main dev computer, i just do it manually. i have no idea how to use git"
  },
  {
    "role": "assistant",
    "content": "Manually syncing is a good start, but learning Git can really help streamline your workflow and keep track of changes. If you're interested, I can provide some resources or tips on how to get started with Git! It can make managing different versions of your code much easier."
  },
  {
    "role": "user",
    "content": "what is the point of git? how can it streamline my work?"
  },
  {
    "role": "assistant",
    "content": "Git is a version control system that helps you keep track of changes in your code over time. Here are some ways it can streamline your work:\n\n1. **Version Tracking**: Git allows you to save versions of your code so you can revert back to previous versions if needed. This is especially helpful if you make a change that breaks something.\n\n2. **Collaboration**: If you ever work with others, Git makes it easy to collaborate on projects. You can work on separate branches, merge changes, and resolve conflicts.\n\n3. **Branching**: You can create branches for new features or experiments without affecting the main codebase. Once you're satisfied with the changes, you can merge them back.\n\n4. **Snapshotting**: Git takes snapshots of your code at different points in time, so you can easily see what your code looked like at any point.\n\n5. **Easy Backup**: By pushing your code to a remote repository (like GitHub or GitLab), you can back up your work and access it from anywhere.\n\nLearning Git might take a bit of time, but it can greatly enhance your coding efficiency and help you manage your projects better. If you're interested, I can point you to some beginner-friendly tutorials!"
  },
  {
    "role": "user",
    "content": "say i am writing one program on my main computer, and then im working on one that lives in a VM. Can i put them both in the same git?"
  },
  {
    "role": "assistant",
    "content": "Yes, you can use the same Git repository for your code on both your main computer and the VM. Here’s how you can do it:\n\n1. **Create a Git Repository**: You can create a Git repository on your main computer for your program. Use the command `git init` in your project's directory to initialize it.\n\n2. **Commit Changes**: As you make changes to your code on your main computer, you can use `git add .` to stage changes and `git commit -m \"your message\"` to commit them.\n\n3. **Clone the Repository on your VM**: You can then clone the Git repository on your VM using `git clone <repository-url>`. This gives you a copy of the same repository on your VM.\n\n4. **Work on Both**: You can now work on either machine. When you make changes on one, commit those changes, and then pull the latest changes from the other machine to keep everything in sync.\n\n5. **Push and Pull**: Use `git push` to push your changes to a remote repository (like GitHub) if you're using one. Use `git pull` on the other machine to pull the latest code.\n\nJust be careful about making sure you commit and push your changes before switching between machines to avoid conflicts. This setup allows you to maintain a single version of your code across both systems. If you need help with specific commands or setups, just let me know!"
  },
  {
    "role": "user",
    "content": "hmm i guess that might be useful. Im writting code that is running on a VM. I want to test and experiment while i write the code. The VM is headless though so i've been doing all my editing and work via SSH and SFTP."
  },
  {
    "role": "assistant",
    "content": "Using a VM and working through SSH and SFTP is a solid way to manage your coding environment! Git can definitely be beneficial in this setup as well. Here’s how it can help:\n\n1. **Easier Code Management**: With Git, you can track changes to your code directly within the VM, making it easier to manage versions as you experiment.\n\n2. **Experiment Branches**: You can create branches for different experiments. This lets you test new features or fixes without affecting your main code, and you can easily switch back and forth between branches.\n\n3. **Collaborative Testing**: If you ever work with others or want to test things collaboratively, you can push changes to a remote repository, making it easy for others to pull and test the latest code.\n\n4. **Integrated with SSH**: You can run Git commands directly from your SSH terminal on the VM, making it straightforward to use. You won’t need any additional tools.\n\n5. **Seamless Remote Access**: Since you’re using SFTP, you can clone the repository to your VM. This way, you can easily pull updates you've made on your main computer, or push changes made on the VM back to the main repository.\n\nOverall, integrating Git into your workflow can enhance your coding experience while working in a headless environment. If you decide to go for it, I can help with the initial setup or any Git-related questions you might have!"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"role": "user",
|
|
||||||
"content": "so is git like a compressed version of a directory?"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"role": "assistant",
|
|
||||||
"content": "Not exactly, but that's a good way to think about some of its features! Here's a more precise breakdown:\n\n1. **Snapshots, Not Compression**: Git takes \"snapshots\" of your files at different points in time, meaning it records a full version of your directory's contents every time you make a commit. While this does involve storing the files, it doesn't compress them in the traditional sense—Git keeps track of changes efficiently, so only the differences between versions are stored after the first snapshot.\n\n2. **Directory Structure**: Git is designed to track changes in your source code, including the entire directory structure, file contents, and metadata. It allows you to manage multiple versions of a directory and the files within it.\n\n3. **Branching and Merging**: Git allows you to create branches, which can be thought of as \"parallel versions\" of your directory. Each branch can have different sets of changes, and you can merge these branches to integrate different features or fixes.\n\n4. **History Tracking**: Git also maintains a history of changes, so you can navigate through previous versions of your directory. This includes seeing what changes were made, who made them, and when they occurred.\n\nThe combination of these features makes Git a powerful tool for version control, allowing you to manage your projects and code changes more effectively. If you're curious about any specific aspect of Git or how it works, feel free to ask!"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"role": "user",
|
|
||||||
"content": "hmm i still dont see the point of it. atleast not for this project. This project is mostly about learning how shit works, im not planning to push this as a product or collab with anyone"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"role": "assistant",
|
|
||||||
"content": "That makes total sense! If this project is primarily for learning and experimentation, you might not need the full functionality of Git right now. Many developers start without it, especially when they’re working on smaller projects or learning something new.\n\nHere are a few reasons you might still consider using Git, even for personal learning projects:\n\n1. **Tracking Changes**: It can help you understand what changes you made over time. You can easily go back to a previous point if something breaks or if you want to see how you arrived at your current solution.\n\n2. **Learning Opportunity**: Getting familiar with Git now can make it easier when you start working on larger projects or collaborating in the future. It's a valuable skill in the software development world.\n\n3. **Experimentation**: If you're trying out different ideas, branches can help you manage experiments without affecting your main work.\n\nHowever, if you're comfortable managing your changes manually and don’t see the benefit for this particular project, that’s completely okay! Focus on what helps you learn best. If you have specific goals or areas you want to explore, I’m here to help with that too!"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
@@ -1,6 +0,0 @@
-{"ts":"2025-11-12T20:18:06.822Z","turn":[{"role":"user","content":"i have everything syncing back to my main dev computer, i just do it manually. i have no idea how to use git"},{"role":"assistant","content":"Manually syncing is a good start, but learning Git can really help streamline your workflow and keep track of changes. If you're interested, I can provide some resources or tips on how to get started with Git! It can make managing different versions of your code much easier."}]}
-{"ts":"2025-11-12T20:29:20.003Z","turn":[{"role":"user","content":"what is the point of git? how can it streamline my work?"},{"role":"assistant","content":"Git is a version control system that helps you keep track of changes in your code over time. Here are some ways it can streamline your work:\n\n1. **Version Tracking**: Git allows you to save versions of your code so you can revert back to previous versions if needed. This is especially helpful if you make a change that breaks something.\n\n2. **Collaboration**: If you ever work with others, Git makes it easy to collaborate on projects. You can work on separate branches, merge changes, and resolve conflicts.\n\n3. **Branching**: You can create branches for new features or experiments without affecting the main codebase. Once you're satisfied with the changes, you can merge them back.\n\n4. **Snapshotting**: Git takes snapshots of your code at different points in time, so you can easily see what your code looked like at any point.\n\n5. **Easy Backup**: By pushing your code to a remote repository (like GitHub or GitLab), you can back up your work and access it from anywhere.\n\nLearning Git might take a bit of time, but it can greatly enhance your coding efficiency and help you manage your projects better. If you're interested, I can point you to some beginner-friendly tutorials!"}]}
-{"ts":"2025-11-12T20:30:55.329Z","turn":[{"role":"user","content":"say i am writing one program on my main computer, and then im working on one that lives in a VM. Can i put them both in the same git?"},{"role":"assistant","content":"Yes, you can use the same Git repository for your code on both your main computer and the VM. Here’s how you can do it:\n\n1. **Create a Git Repository**: You can create a Git repository on your main computer for your program. Use the command `git init` in your project's directory to initialize it.\n\n2. **Commit Changes**: As you make changes to your code on your main computer, you can use `git add .` to stage changes and `git commit -m \"your message\"` to commit them.\n\n3. **Clone the Repository on your VM**: You can then clone the Git repository on your VM using `git clone <repository-url>`. This gives you a copy of the same repository on your VM.\n\n4. **Work on Both**: You can now work on either machine. When you make changes on one, commit those changes, and then pull the latest changes from the other machine to keep everything in sync.\n\n5. **Push and Pull**: Use `git push` to push your changes to a remote repository (like GitHub) if you're using one. Use `git pull` on the other machine to pull the latest code.\n\nJust be careful about making sure you commit and push your changes before switching between machines to avoid conflicts. This setup allows you to maintain a single version of your code across both systems. If you need help with specific commands or setups, just let me know!"}]}
-{"ts":"2025-11-12T20:33:16.511Z","turn":[{"role":"user","content":"hmm i guess that might be useful. Im writting code that is running on a VM. I want to test and experiment while i write the code. The VM is headless though so i've been doing all my editing and work via SSH and SFTP."},{"role":"assistant","content":"Using a VM and working through SSH and SFTP is a solid way to manage your coding environment! Git can definitely be beneficial in this setup as well. Here’s how it can help:\n\n1. **Easier Code Management**: With Git, you can track changes to your code directly within the VM, making it easier to manage versions as you experiment.\n\n2. **Experiment Branches**: You can create branches for different experiments. This lets you test new features or fixes without affecting your main code, and you can easily switch back and forth between branches.\n\n3. **Collaborative Testing**: If you ever work with others or want to test things collaboratively, you can push changes to a remote repository, making it easy for others to pull and test the latest code.\n\n4. **Integrated with SSH**: You can run Git commands directly from your SSH terminal on the VM, making it straightforward to use. You won’t need any additional tools.\n\n5. **Seamless Remote Access**: Since you’re using SFTP, you can clone the repository to your VM. This way, you can easily pull updates you've made on your main computer, or push changes made on the VM back to the main repository.\n\nOverall, integrating Git into your workflow can enhance your coding experience while working in a headless environment. If you decide to go for it, I can help with the initial setup or any Git-related questions you might have!"}]}
-{"ts":"2025-11-12T20:34:30.196Z","turn":[{"role":"user","content":"so is git like a compressed version of a directory?"},{"role":"assistant","content":"Not exactly, but that's a good way to think about some of its features! Here's a more precise breakdown:\n\n1. **Snapshots, Not Compression**: Git takes \"snapshots\" of your files at different points in time, meaning it records a full version of your directory's contents every time you make a commit. While this does involve storing the files, it doesn't compress them in the traditional sense—Git keeps track of changes efficiently, so only the differences between versions are stored after the first snapshot.\n\n2. **Directory Structure**: Git is designed to track changes in your source code, including the entire directory structure, file contents, and metadata. It allows you to manage multiple versions of a directory and the files within it.\n\n3. **Branching and Merging**: Git allows you to create branches, which can be thought of as \"parallel versions\" of your directory. Each branch can have different sets of changes, and you can merge these branches to integrate different features or fixes.\n\n4. **History Tracking**: Git also maintains a history of changes, so you can navigate through previous versions of your directory. This includes seeing what changes were made, who made them, and when they occurred.\n\nThe combination of these features makes Git a powerful tool for version control, allowing you to manage your projects and code changes more effectively. If you're curious about any specific aspect of Git or how it works, feel free to ask!"}]}
-{"ts":"2025-11-12T20:40:33.207Z","turn":[{"role":"user","content":"hmm i still dont see the point of it. atleast not for this project. This project is mostly about learning how shit works, im not planning to push this as a product or collab with anyone"},{"role":"assistant","content":"That makes total sense! If this project is primarily for learning and experimentation, you might not need the full functionality of Git right now. Many developers start without it, especially when they’re working on smaller projects or learning something new.\n\nHere are a few reasons you might still consider using Git, even for personal learning projects:\n\n1. **Tracking Changes**: It can help you understand what changes you made over time. You can easily go back to a previous point if something breaks or if you want to see how you arrived at your current solution.\n\n2. **Learning Opportunity**: Getting familiar with Git now can make it easier when you start working on larger projects or collaborating in the future. It's a valuable skill in the software development world.\n\n3. **Experimentation**: If you're trying out different ideas, branches can help you manage experiments without affecting your main work.\n\nHowever, if you're comfortable managing your changes manually and don’t see the benefit for this particular project, that’s completely okay! Focus on what helps you learn best. If you have specific goals or areas you want to explore, I’m here to help with that too!"}]}
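The removed transcript above walks through a two-machine Git workflow: initialize and commit on the main computer, clone to the VM, then keep both in sync with pull. A minimal runnable sketch of that workflow, using two local directories to stand in for the "main computer" and the "VM" (all paths and file names here are illustrative, not from the repo):

```shell
set -e
work=$(mktemp -d)

# "Main computer": create a repository and commit a first version.
git init -q "$work/main"
cd "$work/main"
git config user.email you@example.com
git config user.name "you"
echo "print('hello')" > app.py
git add app.py
git commit -qm "first version"

# "VM": clone the repository. Over SSH this URL would be something
# like user@vm-host:/path/to/repo instead of a local path.
git clone -q "$work/main" "$work/vm"

# Later, after another commit on the main machine, the VM just pulls.
echo "print('hello again')" > app.py
git commit -aqm "second version"
git -C "$work/vm" pull -q
cat "$work/vm/app.py"
```

The same pattern works with a hosted remote (GitHub, or a self-hosted Gitea like this one): both machines clone from the remote and sync through `git push` / `git pull` instead of pulling directly from each other.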
@@ -13,19 +13,50 @@
 </head>
 <body>
-<div id="chat">
-<!-- Model selector -->
-<div id="model-select">
-<label for="model">Model:</label>
-<select id="model">
-<option value="gpt-4o-mini">GPT-4o-mini (OpenAI)</option>
-<option value="ollama:nollama/mythomax-l2-13b:Q5_K_S">Ollama MythoMax (3090)</option>
+<!-- Mobile Menu Overlay -->
+<div class="mobile-menu-overlay" id="mobileMenuOverlay"></div>
+
+<!-- Mobile Slide-out Menu -->
+<div class="mobile-menu" id="mobileMenu">
+<div class="mobile-menu-section">
+<h4>Mode</h4>
+<select id="mobileMode">
+<option value="standard">Standard</option>
+<option value="cortex">Cortex</option>
 </select>
-<label for="mode" style="margin-left: 20px;">Mode:</label>
+</div>
 
+<div class="mobile-menu-section">
+<h4>Session</h4>
+<select id="mobileSessions"></select>
+<button id="mobileNewSessionBtn">➕ New Session</button>
+<button id="mobileRenameSessionBtn">✏️ Rename Session</button>
+</div>
+
+<div class="mobile-menu-section">
+<h4>Actions</h4>
+<button id="mobileThinkingStreamBtn">🧠 Show Work</button>
+<button id="mobileSettingsBtn">⚙ Settings</button>
+<button id="mobileToggleThemeBtn">🌙 Toggle Theme</button>
+<button id="mobileForceReloadBtn">🔄 Force Reload</button>
+</div>
+</div>
+
+<div id="chat">
+<!-- Mode selector -->
+<div id="model-select">
+<!-- Hamburger menu (mobile only) -->
+<button class="hamburger-menu" id="hamburgerMenu" aria-label="Menu">
+<span></span>
+<span></span>
+<span></span>
+</button>
+<label for="mode">Mode:</label>
 <select id="mode">
 <option value="standard">Standard</option>
 <option value="cortex">Cortex</option>
 </select>
+<button id="settingsBtn" style="margin-left: auto;">⚙ Settings</button>
 <div id="theme-toggle">
 <button id="toggleThemeBtn">🌙 Dark Mode</button>
 </div>
@@ -37,6 +68,7 @@
 <select id="sessions"></select>
 <button id="newSessionBtn">➕ New</button>
 <button id="renameSessionBtn">✏️ Rename</button>
+<button id="thinkingStreamBtn" title="Show thinking stream panel">🧠 Show Work</button>
 </div>
 
 <!-- Status -->
@@ -48,6 +80,24 @@
 <!-- Chat messages -->
 <div id="messages"></div>
 
+<!-- Thinking Stream Panel (collapsible) -->
+<div id="thinkingPanel" class="thinking-panel collapsed">
+<div class="thinking-header" id="thinkingHeader">
+<span>🧠 Thinking Stream</span>
+<div class="thinking-controls">
+<span class="thinking-status-dot" id="thinkingStatusDot"></span>
+<button class="thinking-clear-btn" id="thinkingClearBtn" title="Clear events">🗑️</button>
+<button class="thinking-toggle-btn" id="thinkingToggleBtn">▼</button>
+</div>
+</div>
+<div class="thinking-content" id="thinkingContent">
+<div class="thinking-empty" id="thinkingEmpty">
+<div class="thinking-empty-icon">🤔</div>
+<p>Waiting for thinking events...</p>
+</div>
+</div>
+</div>
+
 <!-- Input box -->
 <div id="input">
 <input id="userInput" type="text" placeholder="Type a message..." autofocus />
@@ -55,6 +105,57 @@
 </div>
 </div>
 
+<!-- Settings Modal (outside chat container) -->
+<div id="settingsModal" class="modal">
+<div class="modal-overlay"></div>
+<div class="modal-content">
+<div class="modal-header">
+<h3>Settings</h3>
+<button id="closeModalBtn" class="close-btn">✕</button>
+</div>
+<div class="modal-body">
+<div class="settings-section">
+<h4>Standard Mode Backend</h4>
+<p class="settings-desc">Select which LLM backend to use for Standard Mode:</p>
+<div class="radio-group">
+<label class="radio-label">
+<input type="radio" name="backend" value="SECONDARY" checked>
+<span>SECONDARY - Ollama/Qwen (3090)</span>
+<small>Fast, local, good for general chat</small>
+</label>
+<label class="radio-label">
+<input type="radio" name="backend" value="PRIMARY">
+<span>PRIMARY - llama.cpp (MI50)</span>
+<small>Local, powerful, good for complex reasoning</small>
+</label>
+<label class="radio-label">
+<input type="radio" name="backend" value="OPENAI">
+<span>OPENAI - GPT-4o-mini</span>
+<small>Cloud-based, high quality (costs money)</small>
+</label>
+<label class="radio-label">
+<input type="radio" name="backend" value="custom">
+<span>Custom Backend</span>
+<input type="text" id="customBackend" placeholder="e.g., FALLBACK" />
+</label>
+</div>
+</div>
+
+<div class="settings-section" style="margin-top: 24px;">
+<h4>Session Management</h4>
+<p class="settings-desc">Manage your saved chat sessions:</p>
+<div id="sessionList" class="session-list">
+<p style="color: var(--text-fade); font-size: 0.85rem;">Loading sessions...</p>
+</div>
+</div>
+</div>
+<div class="modal-footer">
+<button id="saveSettingsBtn" class="primary-btn">Save</button>
+<button id="cancelSettingsBtn">Cancel</button>
+</div>
+</div>
+</div>
+
 <script>
 const RELAY_BASE = "http://10.0.0.41:7078";
 const API_URL = `${RELAY_BASE}/v1/chat/completions`;
@@ -65,29 +166,56 @@
 
 let history = [];
 let currentSession = localStorage.getItem("currentSession") || null;
-let sessions = JSON.parse(localStorage.getItem("sessions") || "[]");
+let sessions = []; // Now loaded from server
 
-function saveSessions() {
-localStorage.setItem("sessions", JSON.stringify(sessions));
-localStorage.setItem("currentSession", currentSession);
+async function loadSessionsFromServer() {
+try {
+const resp = await fetch(`${RELAY_BASE}/sessions`);
+const serverSessions = await resp.json();
+sessions = serverSessions;
+return sessions;
+} catch (e) {
+console.error("Failed to load sessions from server:", e);
+return [];
+}
 }
 
-function renderSessions() {
+async function renderSessions() {
 const select = document.getElementById("sessions");
+const mobileSelect = document.getElementById("mobileSessions");
 select.innerHTML = "";
+mobileSelect.innerHTML = "";
 
 sessions.forEach(s => {
 const opt = document.createElement("option");
 opt.value = s.id;
-opt.textContent = s.name;
+opt.textContent = s.name || s.id;
 if (s.id === currentSession) opt.selected = true;
 select.appendChild(opt);
+
+// Clone for mobile menu
+const mobileOpt = opt.cloneNode(true);
+mobileSelect.appendChild(mobileOpt);
 });
 }
 
 function getSessionName(id) {
 const s = sessions.find(s => s.id === id);
-return s ? s.name : id;
+return s ? (s.name || s.id) : id;
+}
+
+async function saveSessionMetadata(sessionId, name) {
+try {
+await fetch(`${RELAY_BASE}/sessions/${sessionId}/metadata`, {
+method: "PATCH",
+headers: { "Content-Type": "application/json" },
+body: JSON.stringify({ name })
+});
+return true;
+} catch (e) {
+console.error("Failed to save session metadata:", e);
+return false;
+}
 }
 
 async function loadSession(id) {
@@ -97,8 +225,10 @@
 history = Array.isArray(data) ? data : [];
 const messagesEl = document.getElementById("messages");
 messagesEl.innerHTML = "";
-history.forEach(m => addMessage(m.role, m.content));
-addMessage("system", `📂 Loaded session: ${getSessionName(id)} — ${history.length} message(s)`);
+history.forEach(m => addMessage(m.role, m.content, false)); // Don't auto-scroll for each message
+addMessage("system", `📂 Loaded session: ${getSessionName(id)} — ${history.length} message(s)`, false);
+// Scroll to bottom after all messages are loaded
+messagesEl.scrollTo({ top: messagesEl.scrollHeight, behavior: "smooth" });
 } catch (e) {
 addMessage("system", `Failed to load session: ${e.message}`);
 }
@@ -128,7 +258,6 @@
 await saveSession(); // ✅ persist both user + assistant messages
 
 
-const model = document.getElementById("model").value;
 const mode = document.getElementById("mode").value;
 
 // make sure we always include a stable user_id
@@ -137,13 +266,24 @@
 userId = "brian"; // use whatever ID you seeded Mem0 with
 localStorage.setItem("userId", userId);
 }
 
+// Get backend preference for Standard Mode
+let backend = null;
+if (mode === "standard") {
+backend = localStorage.getItem("standardModeBackend") || "SECONDARY";
+}
+
 const body = {
-model: model,
 mode: mode,
 messages: history,
 sessionId: currentSession
 };
+
+// Only add backend if in standard mode
+if (backend) {
+body.backend = backend;
+}
+
 try {
 const resp = await fetch(API_URL, {
 method: "POST",
@@ -161,7 +301,7 @@
 }
 }
 
-function addMessage(role, text) {
+function addMessage(role, text, autoScroll = true) {
 const messagesEl = document.getElementById("messages");
 
 const msgDiv = document.createElement("div");
@@ -169,11 +309,12 @@
 msgDiv.textContent = text;
 messagesEl.appendChild(msgDiv);
 
-// only auto-scroll if user is near bottom
-const threshold = 120;
-const isNearBottom = messagesEl.scrollHeight - messagesEl.scrollTop - messagesEl.clientHeight < threshold;
-if (isNearBottom) {
+// Auto-scroll to bottom if enabled
+if (autoScroll) {
+// Use requestAnimationFrame to ensure DOM has updated
+requestAnimationFrame(() => {
 messagesEl.scrollTo({ top: messagesEl.scrollHeight, behavior: "smooth" });
+});
 }
 }
 
@@ -194,73 +335,352 @@
|
|||||||
}
|
}
|
||||||
|
|
||||||
document.addEventListener("DOMContentLoaded", () => {
|
document.addEventListener("DOMContentLoaded", () => {
|
||||||
// Dark mode toggle
|
// Mobile Menu Toggle
|
||||||
|
const hamburgerMenu = document.getElementById("hamburgerMenu");
|
||||||
|
const mobileMenu = document.getElementById("mobileMenu");
|
||||||
|
const mobileMenuOverlay = document.getElementById("mobileMenuOverlay");
|
||||||
|
|
||||||
|
function toggleMobileMenu() {
|
||||||
|
mobileMenu.classList.toggle("open");
|
||||||
|
mobileMenuOverlay.classList.toggle("show");
|
||||||
|
hamburgerMenu.classList.toggle("active");
|
||||||
|
}
|
||||||
|
|
||||||
|
function closeMobileMenu() {
|
||||||
|
mobileMenu.classList.remove("open");
|
||||||
|
mobileMenuOverlay.classList.remove("show");
|
||||||
|
hamburgerMenu.classList.remove("active");
|
||||||
|
}
|
||||||
|
|
||||||
|
hamburgerMenu.addEventListener("click", toggleMobileMenu);
|
||||||
|
mobileMenuOverlay.addEventListener("click", closeMobileMenu);
|
||||||
|
|
||||||
|
// Sync mobile menu controls with desktop
|
||||||
|
const mobileMode = document.getElementById("mobileMode");
|
||||||
|
const desktopMode = document.getElementById("mode");
|
||||||
|
|
||||||
|
// Sync mode selection
|
||||||
|
mobileMode.addEventListener("change", (e) => {
|
||||||
|
desktopMode.value = e.target.value;
|
||||||
|
desktopMode.dispatchEvent(new Event("change"));
|
||||||
|
});
|
||||||
|
|
||||||
|
desktopMode.addEventListener("change", (e) => {
|
||||||
|
mobileMode.value = e.target.value;
|
||||||
|
});
|
||||||
|
|
||||||
|
// Mobile theme toggle
|
||||||
|
document.getElementById("mobileToggleThemeBtn").addEventListener("click", () => {
|
||||||
|
  document.getElementById("toggleThemeBtn").click();
  updateMobileThemeButton();
});

function updateMobileThemeButton() {
  const isDark = document.body.classList.contains("dark");
  document.getElementById("mobileToggleThemeBtn").textContent = isDark ? "☀️ Light Mode" : "🌙 Dark Mode";
}

// Mobile settings button
document.getElementById("mobileSettingsBtn").addEventListener("click", () => {
  closeMobileMenu();
  document.getElementById("settingsBtn").click();
});

// Mobile thinking stream button
document.getElementById("mobileThinkingStreamBtn").addEventListener("click", () => {
  closeMobileMenu();
  document.getElementById("thinkingStreamBtn").click();
});

// Mobile new session button
document.getElementById("mobileNewSessionBtn").addEventListener("click", () => {
  closeMobileMenu();
  document.getElementById("newSessionBtn").click();
});

// Mobile rename session button
document.getElementById("mobileRenameSessionBtn").addEventListener("click", () => {
  closeMobileMenu();
  document.getElementById("renameSessionBtn").click();
});

// Sync mobile session selector with desktop
document.getElementById("mobileSessions").addEventListener("change", async (e) => {
  closeMobileMenu();
  const desktopSessions = document.getElementById("sessions");
  desktopSessions.value = e.target.value;
  desktopSessions.dispatchEvent(new Event("change"));
});

// Mobile force reload button
document.getElementById("mobileForceReloadBtn").addEventListener("click", async () => {
  if (confirm("Force reload the app? This will clear cache and reload.")) {
    // Clear all caches if available
    if ('caches' in window) {
      const cacheNames = await caches.keys();
      await Promise.all(cacheNames.map(name => caches.delete(name)));
    }

    // Force reload from server (bypass cache)
    window.location.reload(true);
  }
});

// Dark mode toggle - defaults to dark
const btn = document.getElementById("toggleThemeBtn");

// Set dark mode by default if no preference saved
const savedTheme = localStorage.getItem("theme");
if (!savedTheme || savedTheme === "dark") {
  document.body.classList.add("dark");
  btn.textContent = "☀️ Light Mode";
  localStorage.setItem("theme", "dark");
} else {
  btn.textContent = "🌙 Dark Mode";
}

btn.addEventListener("click", () => {
  document.body.classList.toggle("dark");
  const isDark = document.body.classList.contains("dark");
  btn.textContent = isDark ? "☀️ Light Mode" : "🌙 Dark Mode";
  localStorage.setItem("theme", isDark ? "dark" : "light");
  updateMobileThemeButton();
});
// Initialize mobile theme button
updateMobileThemeButton();

// Sessions - Load from server
(async () => {
  await loadSessionsFromServer();
  await renderSessions();

  // Ensure we have at least one session
  if (sessions.length === 0) {
    const id = generateSessionId();
    const name = "default";
    currentSession = id;
    history = [];
    await saveSession(); // Create empty session on server
    await saveSessionMetadata(id, name);
    await loadSessionsFromServer();
    await renderSessions();
    localStorage.setItem("currentSession", currentSession);
  } else {
    // If no current session or current session doesn't exist, use first one
    if (!currentSession || !sessions.find(s => s.id === currentSession)) {
      currentSession = sessions[0].id;
      localStorage.setItem("currentSession", currentSession);
    }
  }

  // Load current session history
  if (currentSession) {
    await loadSession(currentSession);
  }
})();

// Switch session
document.getElementById("sessions").addEventListener("change", async e => {
  currentSession = e.target.value;
  history = [];
  localStorage.setItem("currentSession", currentSession);
  addMessage("system", `Switched to session: ${getSessionName(currentSession)}`);
  await loadSession(currentSession);
});

// Create new session
document.getElementById("newSessionBtn").addEventListener("click", async () => {
  const name = prompt("Enter new session name:");
  if (!name) return;
  const id = generateSessionId();
  currentSession = id;
  history = [];
  localStorage.setItem("currentSession", currentSession);

  // Create session on server
  await saveSession();
  await saveSessionMetadata(id, name);
  await loadSessionsFromServer();
  await renderSessions();

  addMessage("system", `Created session: ${name}`);
});

// Rename session
document.getElementById("renameSessionBtn").addEventListener("click", async () => {
  const session = sessions.find(s => s.id === currentSession);
  if (!session) return;
  const newName = prompt("Rename session:", session.name || currentSession);
  if (!newName) return;

  // Update metadata on server
  await saveSessionMetadata(currentSession, newName);
  await loadSessionsFromServer();
  await renderSessions();

  addMessage("system", `Session renamed to: ${newName}`);
});
// Thinking Stream button
document.getElementById("thinkingStreamBtn").addEventListener("click", () => {
  if (!currentSession) {
    alert("Please select a session first");
    return;
  }

  // Open thinking stream in new window
  const streamUrl = `http://10.0.0.41:8081/thinking-stream.html?session=${currentSession}`;
  const windowFeatures = "width=600,height=800,menubar=no,toolbar=no,location=no,status=no";
  window.open(streamUrl, `thinking_${currentSession}`, windowFeatures);

  addMessage("system", "🧠 Opened thinking stream in new window");
});

// Settings Modal
const settingsModal = document.getElementById("settingsModal");
const settingsBtn = document.getElementById("settingsBtn");
const closeModalBtn = document.getElementById("closeModalBtn");
const saveSettingsBtn = document.getElementById("saveSettingsBtn");
const cancelSettingsBtn = document.getElementById("cancelSettingsBtn");
const modalOverlay = document.querySelector(".modal-overlay");

// Load saved backend preference
const savedBackend = localStorage.getItem("standardModeBackend") || "SECONDARY";

// Set initial radio button state
const backendRadios = document.querySelectorAll('input[name="backend"]');
let isCustomBackend = !["SECONDARY", "PRIMARY", "OPENAI"].includes(savedBackend);

if (isCustomBackend) {
  document.querySelector('input[name="backend"][value="custom"]').checked = true;
  document.getElementById("customBackend").value = savedBackend;
} else {
  document.querySelector(`input[name="backend"][value="${savedBackend}"]`).checked = true;
}

// Session management functions
async function loadSessionList() {
  try {
    // Reload from server to get latest
    await loadSessionsFromServer();

    const sessionListEl = document.getElementById("sessionList");
    if (sessions.length === 0) {
      sessionListEl.innerHTML = '<p style="color: var(--text-fade); font-size: 0.85rem;">No saved sessions found</p>';
      return;
    }

    sessionListEl.innerHTML = "";
    sessions.forEach(sess => {
      const sessionItem = document.createElement("div");
      sessionItem.className = "session-item";

      const sessionInfo = document.createElement("div");
      sessionInfo.className = "session-info";

      const sessionName = sess.name || sess.id;
      const lastModified = new Date(sess.lastModified).toLocaleString();

      sessionInfo.innerHTML = `
        <strong>${sessionName}</strong>
        <small>${sess.messageCount} messages • ${lastModified}</small>
      `;

      const deleteBtn = document.createElement("button");
      deleteBtn.className = "session-delete-btn";
      deleteBtn.textContent = "🗑️";
      deleteBtn.title = "Delete session";
      deleteBtn.onclick = async () => {
        if (!confirm(`Delete session "${sessionName}"?`)) return;

        try {
          await fetch(`${RELAY_BASE}/sessions/${sess.id}`, { method: "DELETE" });

          // Reload sessions from server
          await loadSessionsFromServer();

          // If we deleted the current session, switch to another or create new
          if (currentSession === sess.id) {
            if (sessions.length > 0) {
              currentSession = sessions[0].id;
              localStorage.setItem("currentSession", currentSession);
              history = [];
              await loadSession(currentSession);
            } else {
              const id = generateSessionId();
              const name = "default";
              currentSession = id;
              localStorage.setItem("currentSession", currentSession);
              history = [];
              await saveSession();
              await saveSessionMetadata(id, name);
              await loadSessionsFromServer();
            }
          }

          // Refresh both the dropdown and the settings list
          await renderSessions();
          await loadSessionList();

          addMessage("system", `Deleted session: ${sessionName}`);
        } catch (e) {
          alert("Failed to delete session: " + e.message);
        }
      };

      sessionItem.appendChild(sessionInfo);
      sessionItem.appendChild(deleteBtn);
      sessionListEl.appendChild(sessionItem);
    });
  } catch (e) {
    const sessionListEl = document.getElementById("sessionList");
    sessionListEl.innerHTML = '<p style="color: #ff3333; font-size: 0.85rem;">Failed to load sessions</p>';
  }
}

// Show modal and load session list
settingsBtn.addEventListener("click", () => {
  settingsModal.classList.add("show");
  loadSessionList(); // Refresh session list when opening settings
});

// Hide modal functions
const hideModal = () => {
  settingsModal.classList.remove("show");
};

closeModalBtn.addEventListener("click", hideModal);
cancelSettingsBtn.addEventListener("click", hideModal);
modalOverlay.addEventListener("click", hideModal);

// ESC key to close
document.addEventListener("keydown", (e) => {
  if (e.key === "Escape" && settingsModal.classList.contains("show")) {
    hideModal();
  }
});

// Save settings
saveSettingsBtn.addEventListener("click", () => {
  const selectedRadio = document.querySelector('input[name="backend"]:checked');
  let backendValue;

  if (selectedRadio.value === "custom") {
    backendValue = document.getElementById("customBackend").value.trim().toUpperCase();
    if (!backendValue) {
      alert("Please enter a custom backend name");
      return;
    }
  } else {
    backendValue = selectedRadio.value;
  }

  localStorage.setItem("standardModeBackend", backendValue);
  addMessage("system", `Backend changed to: ${backendValue}`);
  hideModal();
});

// Health check
checkHealth();
@@ -271,6 +691,236 @@
document.getElementById("userInput").addEventListener("keypress", e => {
  if (e.key === "Enter") sendMessage();
});

// ========== THINKING STREAM INTEGRATION ==========
const thinkingPanel = document.getElementById("thinkingPanel");
const thinkingHeader = document.getElementById("thinkingHeader");
const thinkingToggleBtn = document.getElementById("thinkingToggleBtn");
const thinkingClearBtn = document.getElementById("thinkingClearBtn");
const thinkingContent = document.getElementById("thinkingContent");
const thinkingStatusDot = document.getElementById("thinkingStatusDot");
const thinkingEmpty = document.getElementById("thinkingEmpty");

let thinkingEventSource = null;
let thinkingEventCount = 0;
const CORTEX_BASE = "http://10.0.0.41:7081";

// Load thinking panel state from localStorage
const isPanelCollapsed = localStorage.getItem("thinkingPanelCollapsed") === "true";
if (!isPanelCollapsed) {
  thinkingPanel.classList.remove("collapsed");
}

// Toggle thinking panel
thinkingHeader.addEventListener("click", (e) => {
  if (e.target === thinkingClearBtn) return; // Don't toggle if clicking clear
  thinkingPanel.classList.toggle("collapsed");
  localStorage.setItem("thinkingPanelCollapsed", thinkingPanel.classList.contains("collapsed"));
});

// Clear thinking events
thinkingClearBtn.addEventListener("click", (e) => {
  e.stopPropagation();
  clearThinkingEvents();
});

function clearThinkingEvents() {
  thinkingContent.innerHTML = '';
  thinkingContent.appendChild(thinkingEmpty);
  thinkingEventCount = 0;
  // Clear from localStorage
  if (currentSession) {
    localStorage.removeItem(`thinkingEvents_${currentSession}`);
  }
}

function connectThinkingStream() {
  if (!currentSession) return;

  // Close existing connection
  if (thinkingEventSource) {
    thinkingEventSource.close();
  }

  // Load persisted events
  loadThinkingEvents();

  const url = `${CORTEX_BASE}/stream/thinking/${currentSession}`;
  console.log('Connecting thinking stream:', url);

  thinkingEventSource = new EventSource(url);

  thinkingEventSource.onopen = () => {
    console.log('Thinking stream connected');
    thinkingStatusDot.className = 'thinking-status-dot connected';
  };

  thinkingEventSource.onmessage = (event) => {
    try {
      const data = JSON.parse(event.data);
      addThinkingEvent(data);
      saveThinkingEvent(data); // Persist event
    } catch (e) {
      console.error('Failed to parse thinking event:', e);
    }
  };

  thinkingEventSource.onerror = (error) => {
    console.error('Thinking stream error:', error);
    thinkingStatusDot.className = 'thinking-status-dot disconnected';

    // Retry connection after 2 seconds
    setTimeout(() => {
      if (thinkingEventSource && thinkingEventSource.readyState === EventSource.CLOSED) {
        console.log('Reconnecting thinking stream...');
        connectThinkingStream();
      }
    }, 2000);
  };
}

function addThinkingEvent(event) {
  // Remove empty state if present
  if (thinkingEventCount === 0 && thinkingEmpty.parentNode) {
    thinkingContent.removeChild(thinkingEmpty);
  }

  const eventDiv = document.createElement('div');
  eventDiv.className = `thinking-event thinking-event-${event.type}`;

  let icon = '';
  let message = '';
  let details = '';

  switch (event.type) {
    case 'connected':
      icon = '✓';
      message = 'Stream connected';
      details = `Session: ${event.session_id}`;
      break;

    case 'thinking':
      icon = '🤔';
      message = event.data.message;
      break;

    case 'tool_call':
      icon = '🔧';
      message = event.data.message;
      if (event.data.args) {
        details = JSON.stringify(event.data.args, null, 2);
      }
      break;

    case 'tool_result':
      icon = '📊';
      message = event.data.message;
      if (event.data.result && event.data.result.stdout) {
        details = `stdout: ${event.data.result.stdout}`;
      }
      break;

    case 'done':
      icon = '✅';
      message = event.data.message;
      if (event.data.final_answer) {
        details = event.data.final_answer;
      }
      break;

    case 'error':
      icon = '❌';
      message = event.data.message;
      break;

    default:
      icon = '•';
      message = JSON.stringify(event.data);
  }

  eventDiv.innerHTML = `
    <span class="thinking-event-icon">${icon}</span>
    <span>${message}</span>
    ${details ? `<div class="thinking-event-details">${details}</div>` : ''}
  `;

  thinkingContent.appendChild(eventDiv);
  thinkingContent.scrollTop = thinkingContent.scrollHeight;
  thinkingEventCount++;
}

// Persist thinking events to localStorage
function saveThinkingEvent(event) {
  if (!currentSession) return;

  const key = `thinkingEvents_${currentSession}`;
  let events = JSON.parse(localStorage.getItem(key) || '[]');

  // Keep only last 50 events to avoid bloating localStorage
  if (events.length >= 50) {
    events = events.slice(-49);
  }

  events.push({
    ...event,
    timestamp: Date.now()
  });

  localStorage.setItem(key, JSON.stringify(events));
}

// Load persisted thinking events
function loadThinkingEvents() {
  if (!currentSession) return;

  const key = `thinkingEvents_${currentSession}`;
  const events = JSON.parse(localStorage.getItem(key) || '[]');

  // Clear current display
  thinkingContent.innerHTML = '';
  thinkingEventCount = 0;

  // Replay events
  events.forEach(event => addThinkingEvent(event));

  // Show empty state if no events
  if (events.length === 0) {
    thinkingContent.appendChild(thinkingEmpty);
  }
}

// Update the old thinking stream button to toggle panel instead
document.getElementById("thinkingStreamBtn").addEventListener("click", () => {
  thinkingPanel.classList.remove("collapsed");
  localStorage.setItem("thinkingPanelCollapsed", "false");
});

// Mobile thinking stream button
document.getElementById("mobileThinkingStreamBtn").addEventListener("click", () => {
  closeMobileMenu();
  thinkingPanel.classList.remove("collapsed");
  localStorage.setItem("thinkingPanelCollapsed", "false");
});

// Connect thinking stream when session loads
if (currentSession) {
  connectThinkingStream();
}

// Reconnect thinking stream when session changes
const originalSessionChange = document.getElementById("sessions").onchange;
document.getElementById("sessions").addEventListener("change", () => {
  setTimeout(() => {
    connectThinkingStream();
  }, 500); // Wait for session to load
});

// Cleanup on page unload
window.addEventListener('beforeunload', () => {
  if (thinkingEventSource) {
    thinkingEventSource.close();
  }
});
});
</script>
</body>
@@ -8,6 +8,26 @@
  --font-console: "IBM Plex Mono", monospace;
}

/* Light mode variables */
body {
  --bg-dark: #f5f5f5;
  --bg-panel: rgba(255, 115, 0, 0.05);
  --accent: #ff6600;
  --accent-glow: 0 0 12px #ff6600cc;
  --text-main: #1a1a1a;
  --text-fade: #666;
}

/* Dark mode variables */
body.dark {
  --bg-dark: #0a0a0a;
  --bg-panel: rgba(255, 115, 0, 0.1);
  --accent: #ff6600;
  --accent-glow: 0 0 12px #ff6600cc;
  --text-main: #e6e6e6;
  --text-fade: #999;
}

body {
  margin: 0;
  background: var(--bg-dark);
@@ -28,7 +48,7 @@ body {
  border: 1px solid var(--accent);
  border-radius: 10px;
  box-shadow: var(--accent-glow);
  background: var(--bg-dark);
  overflow: hidden;
}

@@ -61,6 +81,16 @@ button:hover, select:hover {
  cursor: pointer;
}

#thinkingStreamBtn {
  background: rgba(138, 43, 226, 0.2);
  border-color: #8a2be2;
}

#thinkingStreamBtn:hover {
  box-shadow: 0 0 8px #8a2be2;
  background: rgba(138, 43, 226, 0.3);
}

/* Chat area */
#messages {
  flex: 1;
@@ -153,8 +183,8 @@ button:hover, select:hover {

/* Dropdown (session selector) styling */
select {
  background-color: var(--bg-dark);
  color: var(--text-main);
  border: 1px solid #b84a12;
  border-radius: 6px;
  padding: 4px 6px;
@@ -162,8 +192,8 @@ select {
}

select option {
  background-color: var(--bg-dark);
  color: var(--text-main);
}

/* Hover/focus for better visibility */
@@ -171,5 +201,709 @@ select:focus,
select:hover {
  outline: none;
  border-color: #ff7a33;
  background-color: var(--bg-panel);
}

/* Settings Modal */
.modal {
  display: none !important;
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  z-index: 1000;
}

.modal.show {
  display: block !important;
}

.modal-overlay {
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  background: rgba(0, 0, 0, 0.8);
  backdrop-filter: blur(4px);
  z-index: 999;
}

.modal-content {
  position: fixed;
  top: 50%;
  left: 50%;
  transform: translate(-50%, -50%);
  background: linear-gradient(180deg, rgba(255,102,0,0.1) 0%, rgba(10,10,10,0.95) 100%);
  border: 2px solid var(--accent);
  border-radius: 12px;
  box-shadow: var(--accent-glow), 0 0 40px rgba(255,102,0,0.3);
  min-width: 400px;
  max-width: 600px;
  max-height: 80vh;
  overflow-y: auto;
  z-index: 1001;
}

.modal-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 16px 20px;
  border-bottom: 1px solid var(--accent);
  background: rgba(255,102,0,0.1);
}

.modal-header h3 {
  margin: 0;
  font-size: 1.2rem;
  color: var(--accent);
}

.close-btn {
  background: transparent;
  border: none;
  color: var(--accent);
  font-size: 1.5rem;
  cursor: pointer;
  padding: 0;
  width: 30px;
  height: 30px;
  display: flex;
  align-items: center;
  justify-content: center;
  border-radius: 4px;
}

.close-btn:hover {
  background: rgba(255,102,0,0.2);
  box-shadow: 0 0 8px var(--accent);
}

.modal-body {
  padding: 20px;
}

.settings-section h4 {
  margin: 0 0 8px 0;
  color: var(--accent);
  font-size: 1rem;
}

.settings-desc {
  margin: 0 0 16px 0;
  color: var(--text-fade);
  font-size: 0.85rem;
}

.radio-group {
  display: flex;
  flex-direction: column;
  gap: 12px;
}

.radio-label {
  display: flex;
  flex-direction: column;
  padding: 12px;
  border: 1px solid rgba(255,102,0,0.3);
  border-radius: 6px;
  background: rgba(255,102,0,0.05);
  cursor: pointer;
  transition: all 0.2s;
}

.radio-label:hover {
  border-color: var(--accent);
  background: rgba(255,102,0,0.1);
  box-shadow: 0 0 8px rgba(255,102,0,0.3);
}

.radio-label input[type="radio"] {
  margin-right: 8px;
  accent-color: var(--accent);
}

.radio-label span {
  font-weight: 500;
  margin-bottom: 4px;
}

.radio-label small {
  color: var(--text-fade);
  font-size: 0.8rem;
  margin-left: 24px;
}

.radio-label input[type="text"] {
  margin-top: 8px;
  margin-left: 24px;
  padding: 6px;
  background: rgba(0,0,0,0.3);
  border: 1px solid rgba(255,102,0,0.5);
  border-radius: 4px;
  color: var(--text-main);
  font-family: var(--font-console);
}

.radio-label input[type="text"]:focus {
  outline: none;
  border-color: var(--accent);
  box-shadow: 0 0 8px rgba(255,102,0,0.3);
}

.modal-footer {
  display: flex;
  justify-content: flex-end;
  gap: 10px;
  padding: 16px 20px;
  border-top: 1px solid var(--accent);
  background: rgba(255,102,0,0.05);
}

.primary-btn {
  background: var(--accent);
  color: #000;
  font-weight: bold;
}

.primary-btn:hover {
  background: #ff7a33;
  box-shadow: var(--accent-glow);
}

/* Session List */
.session-list {
  display: flex;
  flex-direction: column;
  gap: 8px;
  max-height: 300px;
  overflow-y: auto;
}

.session-item {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 12px;
  border: 1px solid rgba(255,102,0,0.3);
  border-radius: 6px;
  background: rgba(255,102,0,0.05);
  transition: all 0.2s;
}

.session-item:hover {
  border-color: var(--accent);
  background: rgba(255,102,0,0.1);
}

.session-info {
  display: flex;
  flex-direction: column;
  gap: 4px;
  flex: 1;
}

.session-info strong {
  color: var(--text-main);
  font-size: 0.95rem;
}

.session-info small {
  color: var(--text-fade);
  font-size: 0.75rem;
}

.session-delete-btn {
  background: transparent;
  border: 1px solid rgba(255,102,0,0.5);
  color: var(--accent);
  padding: 6px 10px;
  border-radius: 4px;
  cursor: pointer;
  font-size: 1rem;
  transition: all 0.2s;
}

.session-delete-btn:hover {
  background: rgba(255,0,0,0.2);
  border-color: #ff3333;
  color: #ff3333;
  box-shadow: 0 0 8px rgba(255,0,0,0.3);
}

/* Thinking Stream Panel */
.thinking-panel {
  border-top: 1px solid var(--accent);
  background: rgba(255, 102, 0, 0.02);
  display: flex;
  flex-direction: column;
  transition: max-height 0.3s ease;
  max-height: 300px;
}

.thinking-panel.collapsed {
  max-height: 40px;
}

.thinking-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 10px 12px;
  background: rgba(255, 102, 0, 0.08);
  cursor: pointer;
  user-select: none;
  border-bottom: 1px solid rgba(255, 102, 0, 0.2);
  font-size: 0.9rem;
  font-weight: 500;
}

.thinking-header:hover {
  background: rgba(255, 102, 0, 0.12);
}

.thinking-controls {
  display: flex;
  align-items: center;
  gap: 8px;
}

.thinking-status-dot {
  width: 8px;
  height: 8px;
  border-radius: 50%;
  background: #666;
  display: inline-block;
}

.thinking-status-dot.connected {
  background: #00ff66;
  box-shadow: 0 0 8px #00ff66;
}

.thinking-status-dot.disconnected {
  background: #ff3333;
}

.thinking-clear-btn,
.thinking-toggle-btn {
  background: transparent;
  border: 1px solid rgba(255, 102, 0, 0.5);
  color: var(--text-main);
  padding: 4px 8px;
  border-radius: 4px;
  cursor: pointer;
  font-size: 0.85rem;
}

.thinking-clear-btn:hover,
.thinking-toggle-btn:hover {
  background: rgba(255, 102, 0, 0.2);
  box-shadow: 0 0 6px rgba(255, 102, 0, 0.3);
}

.thinking-toggle-btn {
  transition: transform 0.3s ease;
}

.thinking-panel.collapsed .thinking-toggle-btn {
  transform: rotate(-90deg);
}

.thinking-content {
  flex: 1;
  overflow-y: auto;
  padding: 12px;
  display: flex;
  flex-direction: column;
  gap: 8px;
  min-height: 0;
}

.thinking-panel.collapsed .thinking-content {
  display: none;
}

.thinking-empty {
  text-align: center;
  padding: 40px 20px;
  color: var(--text-fade);
  font-size: 0.85rem;
}

.thinking-empty-icon {
  font-size: 2rem;
  margin-bottom: 10px;
}

.thinking-event {
  padding: 8px 12px;
  border-radius: 6px;
  font-size: 0.85rem;
  font-family: 'Courier New', monospace;
|
||||||
|
animation: thinkingSlideIn 0.3s ease-out;
|
||||||
|
border-left: 3px solid;
|
||||||
|
word-wrap: break-word;
|
||||||
|
}
|
||||||
|
|
||||||
|
@keyframes thinkingSlideIn {
|
||||||
|
from {
|
||||||
|
opacity: 0;
|
||||||
|
transform: translateY(-10px);
|
||||||
|
}
|
||||||
|
to {
|
||||||
|
opacity: 1;
|
||||||
|
transform: translateY(0);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-connected {
|
||||||
|
background: rgba(0, 255, 102, 0.1);
|
||||||
|
border-color: #00ff66;
|
||||||
|
color: #00ff66;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-thinking {
|
||||||
|
background: rgba(138, 43, 226, 0.1);
|
||||||
|
border-color: #8a2be2;
|
||||||
|
color: #c79cff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-tool_call {
|
||||||
|
background: rgba(255, 165, 0, 0.1);
|
||||||
|
border-color: #ffa500;
|
||||||
|
color: #ffb84d;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-tool_result {
|
||||||
|
background: rgba(0, 191, 255, 0.1);
|
||||||
|
border-color: #00bfff;
|
||||||
|
color: #7dd3fc;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-done {
|
||||||
|
background: rgba(168, 85, 247, 0.1);
|
||||||
|
border-color: #a855f7;
|
||||||
|
color: #e9d5ff;
|
||||||
|
font-weight: bold;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-error {
|
||||||
|
background: rgba(255, 51, 51, 0.1);
|
||||||
|
border-color: #ff3333;
|
||||||
|
color: #fca5a5;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-icon {
|
||||||
|
display: inline-block;
|
||||||
|
margin-right: 8px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-details {
|
||||||
|
font-size: 0.75rem;
|
||||||
|
color: var(--text-fade);
|
||||||
|
margin-top: 4px;
|
||||||
|
padding-left: 20px;
|
||||||
|
white-space: pre-wrap;
|
||||||
|
max-height: 100px;
|
||||||
|
overflow-y: auto;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* ========== MOBILE RESPONSIVE STYLES ========== */
|
||||||
|
|
||||||
|
/* Hamburger Menu */
|
||||||
|
.hamburger-menu {
|
||||||
|
display: none;
|
||||||
|
flex-direction: column;
|
||||||
|
gap: 4px;
|
||||||
|
cursor: pointer;
|
||||||
|
padding: 8px;
|
||||||
|
border: 1px solid var(--accent);
|
||||||
|
border-radius: 4px;
|
||||||
|
background: transparent;
|
||||||
|
z-index: 100;
|
||||||
|
}
|
||||||
|
|
||||||
|
.hamburger-menu span {
|
||||||
|
width: 20px;
|
||||||
|
height: 2px;
|
||||||
|
background: var(--accent);
|
||||||
|
transition: all 0.3s;
|
||||||
|
display: block;
|
||||||
|
}
|
||||||
|
|
||||||
|
.hamburger-menu.active span:nth-child(1) {
|
||||||
|
transform: rotate(45deg) translate(5px, 5px);
|
||||||
|
}
|
||||||
|
|
||||||
|
.hamburger-menu.active span:nth-child(2) {
|
||||||
|
opacity: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.hamburger-menu.active span:nth-child(3) {
|
||||||
|
transform: rotate(-45deg) translate(5px, -5px);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Mobile Menu Container */
|
||||||
|
.mobile-menu {
|
||||||
|
display: none;
|
||||||
|
position: fixed;
|
||||||
|
top: 0;
|
||||||
|
left: -100%;
|
||||||
|
width: 280px;
|
||||||
|
height: 100vh;
|
||||||
|
background: var(--bg-dark);
|
||||||
|
border-right: 2px solid var(--accent);
|
||||||
|
box-shadow: var(--accent-glow);
|
||||||
|
z-index: 999;
|
||||||
|
transition: left 0.3s ease;
|
||||||
|
overflow-y: auto;
|
||||||
|
padding: 20px;
|
||||||
|
flex-direction: column;
|
||||||
|
gap: 16px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu.open {
|
||||||
|
left: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu-overlay {
|
||||||
|
display: none;
|
||||||
|
position: fixed;
|
||||||
|
top: 0;
|
||||||
|
left: 0;
|
||||||
|
width: 100%;
|
||||||
|
height: 100%;
|
||||||
|
background: rgba(0, 0, 0, 0.7);
|
||||||
|
z-index: 998;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu-overlay.show {
|
||||||
|
display: block;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu-section {
|
||||||
|
display: flex;
|
||||||
|
flex-direction: column;
|
||||||
|
gap: 8px;
|
||||||
|
padding-bottom: 16px;
|
||||||
|
border-bottom: 1px solid rgba(255, 102, 0, 0.3);
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu-section:last-child {
|
||||||
|
border-bottom: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu-section h4 {
|
||||||
|
margin: 0;
|
||||||
|
color: var(--accent);
|
||||||
|
font-size: 0.9rem;
|
||||||
|
text-transform: uppercase;
|
||||||
|
letter-spacing: 1px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.mobile-menu button,
|
||||||
|
.mobile-menu select {
|
||||||
|
width: 100%;
|
||||||
|
padding: 10px;
|
||||||
|
font-size: 0.95rem;
|
||||||
|
text-align: left;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Mobile Breakpoints */
|
||||||
|
@media screen and (max-width: 768px) {
|
||||||
|
body {
|
||||||
|
padding: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
#chat {
|
||||||
|
width: 100%;
|
||||||
|
max-width: 100%;
|
||||||
|
height: 100vh;
|
||||||
|
border-radius: 0;
|
||||||
|
border-left: none;
|
||||||
|
border-right: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Show hamburger, hide desktop header controls */
|
||||||
|
.hamburger-menu {
|
||||||
|
display: flex;
|
||||||
|
}
|
||||||
|
|
||||||
|
#model-select {
|
||||||
|
padding: 12px;
|
||||||
|
justify-content: space-between;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Hide all controls except hamburger on mobile */
|
||||||
|
#model-select > *:not(.hamburger-menu) {
|
||||||
|
display: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
#session-select {
|
||||||
|
display: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Show mobile menu */
|
||||||
|
.mobile-menu {
|
||||||
|
display: flex;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Messages - more width on mobile */
|
||||||
|
.msg {
|
||||||
|
max-width: 90%;
|
||||||
|
font-size: 0.95rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Status bar */
|
||||||
|
#status {
|
||||||
|
padding: 10px 12px;
|
||||||
|
font-size: 0.85rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Input area - bigger touch targets */
|
||||||
|
#input {
|
||||||
|
padding: 12px;
|
||||||
|
}
|
||||||
|
|
||||||
|
#userInput {
|
||||||
|
font-size: 16px; /* Prevents zoom on iOS */
|
||||||
|
padding: 12px;
|
||||||
|
}
|
||||||
|
|
||||||
|
#sendBtn {
|
||||||
|
padding: 12px 16px;
|
||||||
|
font-size: 1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Modal - full width on mobile */
|
||||||
|
.modal-content {
|
||||||
|
width: 95%;
|
||||||
|
min-width: unset;
|
||||||
|
max-width: unset;
|
||||||
|
max-height: 90vh;
|
||||||
|
top: 50%;
|
||||||
|
left: 50%;
|
||||||
|
transform: translate(-50%, -50%);
|
||||||
|
}
|
||||||
|
|
||||||
|
.modal-header {
|
||||||
|
padding: 12px 16px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.modal-body {
|
||||||
|
padding: 16px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.modal-footer {
|
||||||
|
padding: 12px 16px;
|
||||||
|
flex-wrap: wrap;
|
||||||
|
}
|
||||||
|
|
||||||
|
.modal-footer button {
|
||||||
|
flex: 1;
|
||||||
|
min-width: 120px;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Radio labels - stack better on mobile */
|
||||||
|
.radio-label {
|
||||||
|
padding: 10px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.radio-label small {
|
||||||
|
margin-left: 20px;
|
||||||
|
font-size: 0.75rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Session list */
|
||||||
|
.session-item {
|
||||||
|
padding: 10px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.session-info strong {
|
||||||
|
font-size: 0.9rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.session-info small {
|
||||||
|
font-size: 0.7rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Settings button in header */
|
||||||
|
#settingsBtn {
|
||||||
|
padding: 8px 12px;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Thinking panel adjustments for mobile */
|
||||||
|
.thinking-panel {
|
||||||
|
max-height: 250px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-panel.collapsed {
|
||||||
|
max-height: 38px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-header {
|
||||||
|
padding: 8px 10px;
|
||||||
|
font-size: 0.85rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event {
|
||||||
|
font-size: 0.8rem;
|
||||||
|
padding: 6px 10px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.thinking-event-details {
|
||||||
|
font-size: 0.7rem;
|
||||||
|
max-height: 80px;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Extra small devices (phones in portrait) */
|
||||||
|
@media screen and (max-width: 480px) {
|
||||||
|
.mobile-menu {
|
||||||
|
width: 240px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.msg {
|
||||||
|
max-width: 95%;
|
||||||
|
font-size: 0.9rem;
|
||||||
|
padding: 8px 12px;
|
||||||
|
}
|
||||||
|
|
||||||
|
#userInput {
|
||||||
|
font-size: 16px;
|
||||||
|
padding: 10px;
|
||||||
|
}
|
||||||
|
|
||||||
|
#sendBtn {
|
||||||
|
padding: 10px 14px;
|
||||||
|
font-size: 0.95rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.modal-header h3 {
|
||||||
|
font-size: 1.1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.settings-section h4 {
|
||||||
|
font-size: 0.95rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.radio-label span {
|
||||||
|
font-size: 0.9rem;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Tablet landscape and desktop */
|
||||||
|
@media screen and (min-width: 769px) {
|
||||||
|
/* Ensure mobile menu is hidden on desktop */
|
||||||
|
.mobile-menu,
|
||||||
|
.mobile-menu-overlay {
|
||||||
|
display: none !important;
|
||||||
|
}
|
||||||
|
|
||||||
|
.hamburger-menu {
|
||||||
|
display: none !important;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
362 core/ui/thinking-stream.html Normal file
@@ -0,0 +1,362 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>🧠 Thinking Stream</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            background: #0d0d0d;
            color: #e0e0e0;
            height: 100vh;
            display: flex;
            flex-direction: column;
            overflow: hidden;
        }

        .header {
            background: #1a1a1a;
            padding: 15px 20px;
            border-bottom: 2px solid #333;
            display: flex;
            align-items: center;
            justify-content: space-between;
        }

        .header h1 {
            font-size: 18px;
            font-weight: bold;
        }

        .status {
            display: flex;
            align-items: center;
            gap: 10px;
            font-size: 14px;
        }

        .status-dot {
            width: 10px;
            height: 10px;
            border-radius: 50%;
            background: #666;
        }

        .status-dot.connected {
            background: #90ee90;
            box-shadow: 0 0 10px #90ee90;
        }

        .status-dot.disconnected {
            background: #ff6b6b;
        }

        .events-container {
            flex: 1;
            overflow-y: auto;
            padding: 20px;
        }

        .event {
            margin-bottom: 12px;
            padding: 10px 15px;
            border-radius: 6px;
            font-size: 14px;
            font-family: 'Courier New', monospace;
            animation: slideIn 0.3s ease-out;
            border-left: 3px solid;
        }

        @keyframes slideIn {
            from {
                opacity: 0;
                transform: translateX(-20px);
            }
            to {
                opacity: 1;
                transform: translateX(0);
            }
        }

        .event-connected {
            background: #1a2a1a;
            border-color: #4a7c59;
            color: #90ee90;
        }

        .event-thinking {
            background: #1a3a1a;
            border-color: #5a9c69;
            color: #a0f0a0;
        }

        .event-tool_call {
            background: #3a2a1a;
            border-color: #d97706;
            color: #fbbf24;
        }

        .event-tool_result {
            background: #1a2a3a;
            border-color: #0ea5e9;
            color: #7dd3fc;
        }

        .event-done {
            background: #2a1a3a;
            border-color: #a855f7;
            color: #e9d5ff;
            font-weight: bold;
        }

        .event-error {
            background: #3a1a1a;
            border-color: #dc2626;
            color: #fca5a5;
        }

        .event-icon {
            display: inline-block;
            margin-right: 8px;
        }

        .event-details {
            font-size: 12px;
            color: #999;
            margin-top: 5px;
            padding-left: 25px;
        }

        .footer {
            background: #1a1a1a;
            padding: 10px 20px;
            border-top: 1px solid #333;
            text-align: center;
            font-size: 12px;
            color: #666;
        }

        .clear-btn {
            background: #333;
            border: 1px solid #444;
            color: #e0e0e0;
            padding: 6px 12px;
            border-radius: 4px;
            cursor: pointer;
            font-size: 12px;
        }

        .clear-btn:hover {
            background: #444;
        }

        .empty-state {
            text-align: center;
            padding: 60px 20px;
            color: #666;
        }

        .empty-state-icon {
            font-size: 48px;
            margin-bottom: 20px;
        }
    </style>
</head>
<body>
    <div class="header">
        <h1>🧠 Thinking Stream</h1>
        <div class="status">
            <div class="status-dot" id="statusDot"></div>
            <span id="statusText">Connecting...</span>
        </div>
    </div>

    <div class="events-container" id="events">
        <div class="empty-state">
            <div class="empty-state-icon">🤔</div>
            <p>Waiting for thinking events...</p>
            <p style="font-size: 12px; margin-top: 10px;">Events will appear here when Lyra uses tools</p>
        </div>
    </div>

    <div class="footer">
        <button class="clear-btn" onclick="clearEvents()">Clear Events</button>
        <span style="margin: 0 20px;">|</span>
        <span id="sessionInfo">Session: <span id="sessionId">-</span></span>
    </div>

    <script>
        console.log('🧠 Thinking stream page loaded!');

        // Get session ID from URL
        const urlParams = new URLSearchParams(window.location.search);
        const SESSION_ID = urlParams.get('session');
        const CORTEX_BASE = "http://10.0.0.41:7081"; // Direct to cortex

        console.log('Session ID:', SESSION_ID);
        console.log('Cortex base:', CORTEX_BASE);

        // Declare variables first
        let eventSource = null;
        let eventCount = 0;

        if (!SESSION_ID) {
            document.getElementById('events').innerHTML = `
                <div class="empty-state">
                    <div class="empty-state-icon">⚠️</div>
                    <p>No session ID provided</p>
                    <p style="font-size: 12px; margin-top: 10px;">Please open this from the main chat interface</p>
                </div>
            `;
        } else {
            document.getElementById('sessionId').textContent = SESSION_ID;
            connectStream();
        }

        function connectStream() {
            if (eventSource) {
                eventSource.close();
            }

            const url = `${CORTEX_BASE}/stream/thinking/${SESSION_ID}`;
            console.log('Connecting to:', url);

            eventSource = new EventSource(url);

            eventSource.onopen = () => {
                console.log('EventSource onopen fired');
                updateStatus(true, 'Connected');
            };

            eventSource.onmessage = (event) => {
                console.log('Received message:', event.data);
                try {
                    const data = JSON.parse(event.data);
                    // Update status to connected when first message arrives
                    if (data.type === 'connected') {
                        updateStatus(true, 'Connected');
                    }
                    addEvent(data);
                } catch (e) {
                    console.error('Failed to parse event:', e, event.data);
                }
            };

            eventSource.onerror = (error) => {
                console.error('Stream error:', error, 'readyState:', eventSource.readyState);
                updateStatus(false, 'Disconnected');

                // Try to reconnect after 2 seconds
                setTimeout(() => {
                    if (eventSource.readyState === EventSource.CLOSED) {
                        console.log('Attempting to reconnect...');
                        connectStream();
                    }
                }, 2000);
            };
        }

        function updateStatus(connected, text) {
            const dot = document.getElementById('statusDot');
            const statusText = document.getElementById('statusText');

            dot.className = 'status-dot ' + (connected ? 'connected' : 'disconnected');
            statusText.textContent = text;
        }

        function addEvent(event) {
            const container = document.getElementById('events');

            // Remove empty state if present
            if (eventCount === 0) {
                container.innerHTML = '';
            }

            const eventDiv = document.createElement('div');
            eventDiv.className = `event event-${event.type}`;

            let icon = '';
            let message = '';
            let details = '';

            switch (event.type) {
                case 'connected':
                    icon = '✓';
                    message = 'Stream connected';
                    details = `Session: ${event.session_id}`;
                    break;

                case 'thinking':
                    icon = '🤔';
                    message = event.data.message;
                    break;

                case 'tool_call':
                    icon = '🔧';
                    message = event.data.message;
                    details = JSON.stringify(event.data.args, null, 2);
                    break;

                case 'tool_result':
                    icon = '📊';
                    message = event.data.message;
                    if (event.data.result && event.data.result.stdout) {
                        details = `stdout: ${event.data.result.stdout}`;
                    }
                    break;

                case 'done':
                    icon = '✅';
                    message = event.data.message;
                    details = event.data.final_answer;
                    break;

                case 'error':
                    icon = '❌';
                    message = event.data.message;
                    break;

                default:
                    icon = '•';
                    message = JSON.stringify(event.data);
            }

            eventDiv.innerHTML = `
                <span class="event-icon">${icon}</span>
                <span>${message}</span>
                ${details ? `<div class="event-details">${details}</div>` : ''}
            `;

            container.appendChild(eventDiv);
            container.scrollTop = container.scrollHeight;
            eventCount++;
        }

        function clearEvents() {
            const container = document.getElementById('events');
            container.innerHTML = `
                <div class="empty-state">
                    <div class="empty-state-icon">🤔</div>
                    <p>Waiting for thinking events...</p>
                    <p style="font-size: 12px; margin-top: 10px;">Events will appear here when Lyra uses tools</p>
                </div>
            `;
            eventCount = 0;
        }

        // Cleanup on page unload
        window.addEventListener('beforeunload', () => {
            if (eventSource) {
                eventSource.close();
            }
        });
    </script>
</body>
</html>
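The page above consumes server-sent events from `/stream/thinking/{session_id}` on the cortex service and runs `JSON.parse` on each frame's `data:` payload. As a rough illustration of the wire format the client expects (a hedged sketch only — the actual cortex endpoint is not part of this diff, and `format_sse` is a hypothetical helper, not repo code):

```python
import json


def format_sse(event: dict) -> str:
    """Serialize one thinking event as a server-sent-events frame.

    The browser's EventSource hands the text after "data: " to the
    onmessage handler, where the page above parses it as JSON.
    """
    return f"data: {json.dumps(event)}\n\n"


# The first frame the client receives flips the status dot to "connected".
frame = format_sse({"type": "connected", "session_id": "abc123"})
print(frame)
```

Later frames follow the same shape with `type` set to `thinking`, `tool_call`, `tool_result`, `done`, or `error`, which is exactly the set of cases in the page's `addEvent` switch.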
@@ -1,5 +1,11 @@
 FROM python:3.11-slim
 WORKDIR /app
+
+# Install docker CLI for code executor
+RUN apt-get update && apt-get install -y \
+    docker.io \
+    && rm -rf /var/lib/apt/lists/*
+
 COPY requirements.txt .
 RUN pip install -r requirements.txt
 COPY . .
13 cortex/autonomy/tools/adapters/__init__.py Normal file
@@ -0,0 +1,13 @@
"""Provider adapters for tool calling."""

from .base import ToolAdapter
from .openai_adapter import OpenAIAdapter
from .ollama_adapter import OllamaAdapter
from .llamacpp_adapter import LlamaCppAdapter

__all__ = [
    "ToolAdapter",
    "OpenAIAdapter",
    "OllamaAdapter",
    "LlamaCppAdapter",
]
79 cortex/autonomy/tools/adapters/base.py Normal file
@@ -0,0 +1,79 @@
"""
Base adapter interface for provider-agnostic tool calling.

This module defines the abstract base class that all LLM provider adapters
must implement to support tool calling in Lyra.
"""

from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ToolAdapter(ABC):
    """Base class for provider-specific tool adapters.

    Each LLM provider (OpenAI, Ollama, llama.cpp, etc.) has its own
    way of handling tool calls. This adapter pattern allows Lyra to
    support tools across all providers with a unified interface.
    """

    @abstractmethod
    async def prepare_request(
        self,
        messages: List[Dict],
        tools: List[Dict],
        tool_choice: Optional[str] = None
    ) -> Dict:
        """Convert Lyra tool definitions to provider-specific format.

        Args:
            messages: Conversation history in OpenAI format
            tools: List of Lyra tool definitions (provider-agnostic)
            tool_choice: Optional tool forcing ("auto", "required", "none")

        Returns:
            dict: Provider-specific request payload ready to send to LLM
        """
        pass

    @abstractmethod
    async def parse_response(self, response) -> Dict:
        """Extract tool calls from provider response.

        Args:
            response: Raw provider response (format varies by provider)

        Returns:
            dict: Standardized response in Lyra format:
                {
                    "content": str,        # Assistant's text response
                    "tool_calls": [        # List of tool calls or None
                        {
                            "id": str,         # Unique call ID
                            "name": str,       # Tool name
                            "arguments": dict  # Tool arguments
                        }
                    ] or None
                }
        """
        pass

    @abstractmethod
    def format_tool_result(
        self,
        tool_call_id: str,
        tool_name: str,
        result: Dict
    ) -> Dict:
        """Format tool execution result for next LLM call.

        Args:
            tool_call_id: ID from the original tool call
            tool_name: Name of the executed tool
            result: Tool execution result dictionary

        Returns:
            dict: Message object to append to conversation
                (format varies by provider)
        """
        pass
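To make the `ToolAdapter` contract concrete, here is a minimal self-contained sketch of a conforming adapter. The interface is re-declared inline (abridged) so the snippet runs on its own, and `EchoAdapter` is a hypothetical example for a provider with no tool support — it is not one of Lyra's shipped adapters:

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ToolAdapter(ABC):
    # Abridged inline copy of the interface defined in base.py above.
    @abstractmethod
    async def prepare_request(self, messages: List[Dict], tools: List[Dict],
                              tool_choice: Optional[str] = None) -> Dict: ...

    @abstractmethod
    async def parse_response(self, response) -> Dict: ...

    @abstractmethod
    def format_tool_result(self, tool_call_id: str, tool_name: str,
                           result: Dict) -> Dict: ...


class EchoAdapter(ToolAdapter):
    """Hypothetical adapter for a provider with no tool-calling support."""

    async def prepare_request(self, messages, tools, tool_choice=None):
        # A real adapter would translate `tools` into the provider's schema.
        return {"messages": messages}

    async def parse_response(self, response):
        # No tool-call markup to detect; everything is plain content.
        return {"content": str(response), "tool_calls": None}

    def format_tool_result(self, tool_call_id, tool_name, result):
        # Feed the execution result back as a conversation message.
        return {"role": "tool",
                "content": f"{tool_name} ({tool_call_id}): {result}"}


async def main():
    adapter = EchoAdapter()
    req = await adapter.prepare_request([{"role": "user", "content": "hi"}], tools=[])
    parsed = await adapter.parse_response("hello back")
    print(req, parsed)


asyncio.run(main())
```

The two `OllamaAdapter`/`LlamaCppAdapter` files that follow implement the same three methods, but with XML prompt injection and parsing in place of the pass-through behavior shown here.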
17 cortex/autonomy/tools/adapters/llamacpp_adapter.py Normal file
@@ -0,0 +1,17 @@
"""
llama.cpp adapter for tool calling.

Since llama.cpp has similar constraints to Ollama (no native function calling),
this adapter reuses the XML-based approach from OllamaAdapter.
"""

from .ollama_adapter import OllamaAdapter


class LlamaCppAdapter(OllamaAdapter):
    """llama.cpp adapter - uses same XML approach as Ollama.

    llama.cpp doesn't have native function calling support, so we use
    the same XML-based prompt engineering approach as Ollama.
    """
    pass
191 cortex/autonomy/tools/adapters/ollama_adapter.py Normal file
@@ -0,0 +1,191 @@
"""
Ollama adapter for tool calling using XML-structured prompts.

Since Ollama doesn't have native function calling, this adapter uses
XML-based prompts to instruct the model how to call tools.
"""

import json
import re
from typing import Dict, List, Optional
from .base import ToolAdapter


class OllamaAdapter(ToolAdapter):
    """Ollama adapter using XML-structured prompts for tool calling.

    This adapter injects tool descriptions into the system prompt and
    teaches the model to respond with XML when it wants to use a tool.
    """

    SYSTEM_PROMPT = """You have access to the following tools:

{tool_descriptions}

To use a tool, respond with XML in this exact format:
<tool_call>
<name>tool_name</name>
<arguments>
<arg_name>value</arg_name>
</arguments>
<reason>why you're using this tool</reason>
</tool_call>

You can call multiple tools by including multiple <tool_call> blocks.
If you don't need to use any tools, respond normally without XML.
After tools are executed, you'll receive results and can continue the conversation."""

    async def prepare_request(
        self,
        messages: List[Dict],
        tools: List[Dict],
        tool_choice: Optional[str] = None
    ) -> Dict:
        """Inject tool descriptions into system prompt.

        Args:
            messages: Conversation history
            tools: Lyra tool definitions
            tool_choice: Ignored for Ollama (no native support)

        Returns:
            dict: Request payload with modified messages
        """
        # Format tool descriptions
        tool_desc = "\n".join([
            f"- {t['name']}: {t['description']}\n  Parameters: {self._format_parameters(t['parameters'], t.get('required', []))}"
            for t in tools
        ])

        system_msg = self.SYSTEM_PROMPT.format(tool_descriptions=tool_desc)

        # Check if first message is already a system message
        modified_messages = messages.copy()
        if modified_messages and modified_messages[0].get("role") == "system":
            # Prepend tool instructions to existing system message
            modified_messages[0]["content"] = system_msg + "\n\n" + modified_messages[0]["content"]
        else:
            # Add new system message at the beginning
            modified_messages.insert(0, {"role": "system", "content": system_msg})

        return {"messages": modified_messages}

    def _format_parameters(self, parameters: Dict, required: List[str]) -> str:
        """Format parameters for tool description.

        Args:
            parameters: Parameter definitions
            required: List of required parameter names

        Returns:
            str: Human-readable parameter description
        """
        param_strs = []
        for name, spec in parameters.items():
            req_marker = "(required)" if name in required else "(optional)"
            param_strs.append(f"{name} {req_marker}: {spec.get('description', '')}")
        return ", ".join(param_strs)

    async def parse_response(self, response) -> Dict:
        """Extract tool calls from XML in response.

        Args:
            response: String response from Ollama

        Returns:
            dict: Standardized Lyra format with content and tool_calls
        """
        import logging
        logger = logging.getLogger(__name__)

        # Ollama returns a string
        if isinstance(response, dict):
            content = response.get("message", {}).get("content", "")
        else:
            content = str(response)

        logger.info(f"🔍 OllamaAdapter.parse_response: content length={len(content)}, has <tool_call>={('<tool_call>' in content)}")
        logger.debug(f"🔍 Content preview: {content[:500]}")

        # Parse XML tool calls
        tool_calls = []
        if "<tool_call>" in content:
            # Split content by <tool_call> to get each block
            blocks = content.split('<tool_call>')
            logger.info(f"🔍 Split into {len(blocks)} blocks")

            # First block is content before any tool calls
            clean_parts = [blocks[0]]

            for idx, block in enumerate(blocks[1:]):  # Skip first block (pre-tool content)
                # Extract tool name
                name_match = re.search(r'<name>(.*?)</name>', block)
                if not name_match:
                    logger.warning(f"Block {idx} has no <name> tag, skipping")
                    continue

                name = name_match.group(1).strip()
                arguments = {}

                # Extract arguments
                args_match = re.search(r'<arguments>(.*?)</arguments>', block, re.DOTALL)
                if args_match:
                    args_xml = args_match.group(1)
                    # Parse <key>value</key> pairs
                    arg_pairs = re.findall(r'<(\w+)>(.*?)</\1>', args_xml, re.DOTALL)
                    arguments = {k: v.strip() for k, v in arg_pairs}

                tool_calls.append({
                    "id": f"call_{idx}",
                    "name": name,
                    "arguments": arguments
                })

                # For clean content, find what comes AFTER the tool call block
                # Look for the last closing tag (</tool_call> or malformed </xxx>) and keep what's after
                # Split by any closing tag at the END of the tool block
                remaining = block
                # Remove everything up to and including a standalone closing tag
                # Pattern: find </something> that's not followed by more XML
                end_match = re.search(r'</[a-z_]+>\s*(.*)$', remaining, re.DOTALL)
                if end_match:
                    after_content = end_match.group(1).strip()
                    if after_content and not after_content.startswith('<'):
                        # Only keep if it's actual text content, not more XML
                        clean_parts.append(after_content)

            clean_content = ''.join(clean_parts).strip()
        else:
            clean_content = content

        return {
            "content": clean_content,
            "tool_calls": tool_calls if tool_calls else None
        }

    def format_tool_result(
        self,
        tool_call_id: str,
        tool_name: str,
        result: Dict
    ) -> Dict:
        """Format tool result as XML for next prompt.

        Args:
            tool_call_id: ID from the original tool call
            tool_name: Name of the executed tool
|
||||||
|
result: Tool execution result
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
dict: Message in user role with XML-formatted result
|
||||||
|
"""
|
||||||
|
# Format result as XML
|
||||||
|
result_xml = f"""<tool_result>
|
||||||
|
<tool>{tool_name}</tool>
|
||||||
|
<result>{json.dumps(result, ensure_ascii=False)}</result>
|
||||||
|
</tool_result>"""
|
||||||
|
|
||||||
|
return {
|
||||||
|
"role": "user",
|
||||||
|
"content": result_xml
|
||||||
|
}
|
||||||
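The regex-based extraction above can be exercised in isolation. This is a minimal sketch (the sample response string and the `get_weather` tool name are hypothetical, not part of the adapter):

```python
import re

# A hypothetical model response containing one XML tool call
sample = (
    "Checking the weather now. "
    "<tool_call><name>get_weather</name>"
    "<arguments><city>Berlin</city><units>metric</units></arguments>"
    "</tool_call>"
)

# Same extraction steps as the adapter: split, then regex out name/arguments
block = sample.split('<tool_call>')[1]
name = re.search(r'<name>(.*?)</name>', block).group(1).strip()
args_xml = re.search(r'<arguments>(.*?)</arguments>', block, re.DOTALL).group(1)
arguments = {k: v.strip() for k, v in re.findall(r'<(\w+)>(.*?)</\1>', args_xml, re.DOTALL)}
```

The backreference `</\1>` in the argument regex is what lets one pattern match every `<key>value</key>` pair regardless of key name.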
130
cortex/autonomy/tools/adapters/openai_adapter.py
Normal file
@@ -0,0 +1,130 @@
"""
OpenAI adapter for tool calling using native function calling API.

This adapter converts Lyra tool definitions to OpenAI's function calling
format and parses OpenAI responses back to Lyra's standardized format.
"""

import json
from typing import Dict, List, Optional

from .base import ToolAdapter


class OpenAIAdapter(ToolAdapter):
    """OpenAI-specific adapter using native function calling.

    OpenAI supports function calling natively through the 'tools' parameter
    in chat completions. This adapter leverages that capability.
    """

    async def prepare_request(
        self,
        messages: List[Dict],
        tools: List[Dict],
        tool_choice: Optional[str] = None
    ) -> Dict:
        """Convert Lyra tools to OpenAI function calling format.

        Args:
            messages: Conversation history
            tools: Lyra tool definitions
            tool_choice: "auto", "required", "none", or None

        Returns:
            dict: Request payload with OpenAI-formatted tools
        """
        # Convert Lyra tools → OpenAI function calling format
        openai_tools = []
        for tool in tools:
            openai_tools.append({
                "type": "function",
                "function": {
                    "name": tool["name"],
                    "description": tool["description"],
                    "parameters": {
                        "type": "object",
                        "properties": tool["parameters"],
                        "required": tool.get("required", [])
                    }
                }
            })

        payload = {
            "messages": messages,
            "tools": openai_tools
        }

        # Add tool_choice if specified
        if tool_choice:
            if tool_choice == "required":
                payload["tool_choice"] = "required"
            elif tool_choice == "none":
                payload["tool_choice"] = "none"
            else:  # "auto" or default
                payload["tool_choice"] = "auto"

        return payload

    async def parse_response(self, response) -> Dict:
        """Extract tool calls from OpenAI response.

        Args:
            response: OpenAI ChatCompletion response object

        Returns:
            dict: Standardized Lyra format with content and tool_calls
        """
        message = response.choices[0].message
        content = message.content if message.content else ""
        tool_calls = []

        # Check if response contains tool calls
        if hasattr(message, 'tool_calls') and message.tool_calls:
            for tc in message.tool_calls:
                try:
                    # Parse arguments (may be JSON string)
                    args = tc.function.arguments
                    if isinstance(args, str):
                        args = json.loads(args)

                    tool_calls.append({
                        "id": tc.id,
                        "name": tc.function.name,
                        "arguments": args
                    })
                except json.JSONDecodeError as e:
                    # If arguments can't be parsed, include error
                    tool_calls.append({
                        "id": tc.id,
                        "name": tc.function.name,
                        "arguments": {},
                        "error": f"Failed to parse arguments: {str(e)}"
                    })

        return {
            "content": content,
            "tool_calls": tool_calls if tool_calls else None
        }

    def format_tool_result(
        self,
        tool_call_id: str,
        tool_name: str,
        result: Dict
    ) -> Dict:
        """Format tool result as OpenAI tool message.

        Args:
            tool_call_id: ID from the original tool call
            tool_name: Name of the executed tool
            result: Tool execution result

        Returns:
            dict: Message in OpenAI tool message format
        """
        return {
            "role": "tool",
            "tool_call_id": tool_call_id,
            "name": tool_name,
            "content": json.dumps(result, ensure_ascii=False)
        }
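The tool conversion in `prepare_request` is a pure data transformation and can be checked without an API client. A minimal standalone sketch (the `search_notes` definition below is an illustrative example, not taken from the adapter's tests):

```python
def to_openai_tools(tools):
    # Mirrors the adapter's conversion: wrap each Lyra tool definition
    # in OpenAI's {"type": "function", "function": {...}} envelope.
    return [{
        "type": "function",
        "function": {
            "name": t["name"],
            "description": t["description"],
            "parameters": {
                "type": "object",
                "properties": t["parameters"],
                "required": t.get("required", []),
            },
        },
    } for t in tools]

# Hypothetical Lyra-style tool definition
lyra_tool = {
    "name": "search_notes",
    "description": "Search Trilium notes",
    "parameters": {"query": {"type": "string", "description": "Search query"}},
    "required": ["query"],
}
payload = to_openai_tools([lyra_tool])
```

Note the adapter always emits a JSON-Schema `object` wrapper, so callers only supply the `properties` map and the `required` list.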
12
cortex/autonomy/tools/executors/__init__.py
Normal file
@@ -0,0 +1,12 @@
"""Tool executors for Lyra."""

from .code_executor import execute_code
from .web_search import search_web
from .trilium import search_notes, create_note

__all__ = [
    "execute_code",
    "search_web",
    "search_notes",
    "create_note",
]
218
cortex/autonomy/tools/executors/code_executor.py
Normal file
@@ -0,0 +1,218 @@
"""
Code executor for running Python and bash code in a sandbox container.

This module provides secure code execution with timeout protection,
output limits, and forbidden pattern detection.
"""

import asyncio
import os
import tempfile
import re
from typing import Dict

import docker
from docker.errors import (
    DockerException,
    APIError,
    ContainerError,
    ImageNotFound,
    NotFound
)


# Forbidden patterns that pose security risks
FORBIDDEN_PATTERNS = [
    r'rm\s+-rf',                # Destructive file removal
    r':\(\)\{\s*:\|:&\s*\};:',  # Fork bomb
    r'mkfs',                    # Filesystem formatting
    r'/dev/sd[a-z]',            # Direct device access
    r'dd\s+if=',                # Low-level disk operations
    r'>\s*/dev/sd',             # Writing to devices
    r'curl.*\|.*sh',            # Pipe to shell (common attack vector)
    r'wget.*\|.*sh',            # Pipe to shell
]


async def execute_code(args: Dict) -> Dict:
    """Execute code in sandbox container.

    Args:
        args: Dictionary containing:
            - language (str): "python" or "bash"
            - code (str): The code to execute
            - reason (str): Why this code is being executed
            - timeout (int, optional): Execution timeout in seconds

    Returns:
        dict: Execution result containing:
            - stdout (str): Standard output
            - stderr (str): Standard error
            - exit_code (int): Process exit code
            - execution_time (float): Time taken in seconds
            OR
            - error (str): Error message if execution failed
    """
    language = args.get("language")
    code = args.get("code")
    reason = args.get("reason", "No reason provided")
    timeout = args.get("timeout", 30)

    # Validation
    if not language or language not in ["python", "bash"]:
        return {"error": "Invalid language. Must be 'python' or 'bash'"}

    if not code:
        return {"error": "No code provided"}

    # Security: Check for forbidden patterns
    for pattern in FORBIDDEN_PATTERNS:
        if re.search(pattern, code, re.IGNORECASE):
            return {"error": "Forbidden pattern detected for security reasons"}

    # Validate and cap timeout
    max_timeout = int(os.getenv("CODE_SANDBOX_MAX_TIMEOUT", "120"))
    timeout = min(max(timeout, 1), max_timeout)

    container = os.getenv("CODE_SANDBOX_CONTAINER", "lyra-code-sandbox")

    # Validate container exists and is running
    try:
        docker_client = docker.from_env()
        container_obj = docker_client.containers.get(container)

        if container_obj.status != "running":
            return {
                "error": f"Sandbox container '{container}' is not running (status: {container_obj.status})",
                "hint": "Start the container with: docker start " + container
            }
    except NotFound:
        return {
            "error": f"Sandbox container '{container}' not found",
            "hint": "Ensure the container exists and is running"
        }
    except DockerException as e:
        return {
            "error": f"Docker daemon error: {str(e)}",
            "hint": "Check Docker connectivity and permissions"
        }

    # Write code to temporary file
    suffix = ".py" if language == "python" else ".sh"
    try:
        with tempfile.NamedTemporaryFile(
            mode='w',
            suffix=suffix,
            delete=False,
            encoding='utf-8'
        ) as f:
            f.write(code)
            temp_file = f.name
    except Exception as e:
        return {"error": f"Failed to create temp file: {str(e)}"}

    try:
        # Copy file to container
        exec_path = f"/executions/{os.path.basename(temp_file)}"

        cp_proc = await asyncio.create_subprocess_exec(
            "docker", "cp", temp_file, f"{container}:{exec_path}",
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )
        await cp_proc.communicate()

        if cp_proc.returncode != 0:
            return {"error": "Failed to copy code to sandbox container"}

        # Fix permissions so sandbox user can read the file (run as root)
        chown_proc = await asyncio.create_subprocess_exec(
            "docker", "exec", "-u", "root", container, "chown", "sandbox:sandbox", exec_path,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )
        await chown_proc.communicate()

        # Execute in container as sandbox user
        if language == "python":
            cmd = ["docker", "exec", "-u", "sandbox", container, "python3", exec_path]
        else:  # bash
            cmd = ["docker", "exec", "-u", "sandbox", container, "bash", exec_path]

        start_time = asyncio.get_event_loop().time()

        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )

        try:
            stdout, stderr = await asyncio.wait_for(
                proc.communicate(),
                timeout=timeout
            )

            execution_time = asyncio.get_event_loop().time() - start_time

            # Truncate output to prevent memory issues (configurable)
            max_output = int(os.getenv("CODE_SANDBOX_MAX_OUTPUT", "10240"))  # 10KB default
            stdout_str = stdout[:max_output].decode('utf-8', errors='replace')
            stderr_str = stderr[:max_output].decode('utf-8', errors='replace')

            if len(stdout) > max_output:
                stdout_str += f"\n... (output truncated, {len(stdout)} bytes total)"
            if len(stderr) > max_output:
                stderr_str += f"\n... (output truncated, {len(stderr)} bytes total)"

            return {
                "stdout": stdout_str,
                "stderr": stderr_str,
                "exit_code": proc.returncode,
                "execution_time": round(execution_time, 2)
            }

        except asyncio.TimeoutError:
            # Kill the process
            try:
                proc.kill()
                await proc.wait()
            except Exception:
                pass
            return {"error": f"Execution timeout after {timeout}s"}

    except APIError as e:
        return {
            "error": f"Docker API error: {e.explanation}",
            "status_code": e.status_code
        }
    except ContainerError as e:
        return {
            "error": f"Container execution error: {str(e)}",
            "exit_code": e.exit_status
        }
    except DockerException as e:
        return {
            "error": f"Docker error: {str(e)}",
            "hint": "Check Docker daemon connectivity and permissions"
        }
    except Exception as e:
        return {"error": f"Execution failed: {str(e)}"}

    finally:
        # Cleanup temporary file
        try:
            if 'temp_file' in locals():
                os.unlink(temp_file)
        except Exception:
            # Log but don't fail on cleanup errors
            pass

        # Optional: Clean up file from container (best effort)
        try:
            if 'exec_path' in locals() and 'container_obj' in locals():
                container_obj.exec_run(
                    f"rm -f {exec_path}",
                    user="sandbox"
                )
        except Exception:
            pass  # Best effort cleanup
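The forbidden-pattern screen is the only part of the executor that needs no Docker daemon, so it is easy to check on its own. A minimal sketch using a subset of the patterns above (the `is_blocked` helper is illustrative, not part of the module):

```python
import re

# Subset of the executor's FORBIDDEN_PATTERNS
FORBIDDEN_PATTERNS = [
    r'rm\s+-rf',      # Destructive file removal
    r'curl.*\|.*sh',  # Pipe to shell
]

def is_blocked(code: str) -> bool:
    # Case-insensitive scan, same as the executor's pre-flight check
    return any(re.search(p, code, re.IGNORECASE) for p in FORBIDDEN_PATTERNS)
```

Because the scan is regex-based it rejects some benign strings too (e.g. a comment mentioning `rm -rf`); that trade-off favors safety over precision.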
13
cortex/autonomy/tools/executors/search_providers/__init__.py
Normal file
@@ -0,0 +1,13 @@
"""Web search provider implementations."""

from .base import SearchProvider, SearchResult, SearchResponse
from .brave import BraveSearchProvider
from .duckduckgo import DuckDuckGoProvider

__all__ = [
    "SearchProvider",
    "SearchResult",
    "SearchResponse",
    "BraveSearchProvider",
    "DuckDuckGoProvider",
]
49
cortex/autonomy/tools/executors/search_providers/base.py
Normal file
@@ -0,0 +1,49 @@
"""Base interface for web search providers."""

from abc import ABC, abstractmethod
from typing import List, Optional
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Standardized search result format."""
    title: str
    url: str
    snippet: str
    score: Optional[float] = None


@dataclass
class SearchResponse:
    """Standardized search response."""
    results: List[SearchResult]
    count: int
    provider: str
    query: str
    error: Optional[str] = None


class SearchProvider(ABC):
    """Abstract base class for search providers."""

    @abstractmethod
    async def search(
        self,
        query: str,
        max_results: int = 5,
        **kwargs
    ) -> SearchResponse:
        """Execute search and return standardized results."""
        pass

    @abstractmethod
    async def health_check(self) -> bool:
        """Check if provider is healthy and reachable."""
        pass

    @property
    @abstractmethod
    def name(self) -> str:
        """Provider name."""
        pass
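The two dataclasses above define the contract every provider returns. A minimal sketch of constructing a response by hand (the "example" query and URL are placeholders):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str
    score: Optional[float] = None

@dataclass
class SearchResponse:
    results: List[SearchResult]
    count: int
    provider: str
    query: str
    error: Optional[str] = None

# What a one-hit provider response looks like on the wire between modules
resp = SearchResponse(
    results=[SearchResult("Example", "https://example.com", "An example page")],
    count=1,
    provider="duckduckgo",
    query="example",
)
```

Callers distinguish success from failure by checking `resp.error is None`, which is exactly what the providers' `health_check` methods do.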
123
cortex/autonomy/tools/executors/search_providers/brave.py
Normal file
@@ -0,0 +1,123 @@
"""Brave Search API provider implementation."""

import os
import asyncio

import aiohttp

from .base import SearchProvider, SearchResponse, SearchResult
from ..utils.resilience import async_retry


class BraveSearchProvider(SearchProvider):
    """Brave Search API implementation."""

    def __init__(self):
        self.api_key = os.getenv("BRAVE_SEARCH_API_KEY", "")
        self.base_url = os.getenv(
            "BRAVE_SEARCH_URL",
            "https://api.search.brave.com/res/v1"
        )
        self.timeout = float(os.getenv("BRAVE_SEARCH_TIMEOUT", "10.0"))

    @property
    def name(self) -> str:
        return "brave"

    @async_retry(
        max_attempts=3,
        exceptions=(aiohttp.ClientError, asyncio.TimeoutError)
    )
    async def search(
        self,
        query: str,
        max_results: int = 5,
        **kwargs
    ) -> SearchResponse:
        """Execute Brave search with retry logic."""

        if not self.api_key:
            return SearchResponse(
                results=[],
                count=0,
                provider=self.name,
                query=query,
                error="BRAVE_SEARCH_API_KEY not configured"
            )

        headers = {
            "Accept": "application/json",
            "X-Subscription-Token": self.api_key
        }

        params = {
            "q": query,
            "count": min(max_results, 20)  # Brave max is 20
        }

        try:
            async with aiohttp.ClientSession() as session:
                async with session.get(
                    f"{self.base_url}/web/search",
                    headers=headers,
                    params=params,
                    timeout=aiohttp.ClientTimeout(total=self.timeout)
                ) as resp:
                    if resp.status == 200:
                        data = await resp.json()
                        results = []

                        for item in data.get("web", {}).get("results", []):
                            results.append(SearchResult(
                                title=item.get("title", ""),
                                url=item.get("url", ""),
                                snippet=item.get("description", ""),
                                score=item.get("score")
                            ))

                        return SearchResponse(
                            results=results,
                            count=len(results),
                            provider=self.name,
                            query=query
                        )
                    elif resp.status == 401:
                        error = "Authentication failed. Check BRAVE_SEARCH_API_KEY"
                    elif resp.status == 429:
                        error = f"Rate limit exceeded. Status: {resp.status}"
                    else:
                        error_text = await resp.text()
                        error = f"HTTP {resp.status}: {error_text}"

                    return SearchResponse(
                        results=[],
                        count=0,
                        provider=self.name,
                        query=query,
                        error=error
                    )

        except aiohttp.ClientConnectorError as e:
            return SearchResponse(
                results=[],
                count=0,
                provider=self.name,
                query=query,
                error=f"Cannot connect to Brave Search API: {str(e)}"
            )
        except asyncio.TimeoutError:
            return SearchResponse(
                results=[],
                count=0,
                provider=self.name,
                query=query,
                error=f"Search timeout after {self.timeout}s"
            )

    async def health_check(self) -> bool:
        """Check if Brave API is reachable."""
        if not self.api_key:
            return False
        try:
            response = await self.search("test", max_results=1)
            return response.error is None
        except Exception:
            return False
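The providers lean on `async_retry` from `..utils.resilience`, which is not shown in this diff. A minimal sketch of what such a decorator could look like, under the assumption that it retries the wrapped coroutine on the listed exception types with a small backoff (the `base_delay` parameter and backoff schedule are hypothetical):

```python
import asyncio
import functools

def async_retry(max_attempts=3, exceptions=(Exception,), base_delay=0.1):
    # Hypothetical sketch of the project's async_retry decorator:
    # re-invoke the coroutine on the given exceptions, re-raising
    # after the final attempt.
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return await fn(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise
                    await asyncio.sleep(base_delay * attempt)  # linear backoff
        return wrapper
    return decorator
```

Note the providers still return error-bearing `SearchResponse` objects on most failures; the decorator only fires when an exception actually escapes the `search` body (e.g. a connection reset mid-request).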
60
cortex/autonomy/tools/executors/search_providers/duckduckgo.py
Normal file
@@ -0,0 +1,60 @@
"""DuckDuckGo search provider with retry logic (legacy fallback)."""

from duckduckgo_search import DDGS

from .base import SearchProvider, SearchResponse, SearchResult
from ..utils.resilience import async_retry


class DuckDuckGoProvider(SearchProvider):
    """DuckDuckGo search implementation with retry logic."""

    @property
    def name(self) -> str:
        return "duckduckgo"

    @async_retry(
        max_attempts=3,
        exceptions=(Exception,)  # DDG throws generic exceptions
    )
    async def search(
        self,
        query: str,
        max_results: int = 5,
        **kwargs
    ) -> SearchResponse:
        """Execute DuckDuckGo search with retry logic."""

        try:
            with DDGS() as ddgs:
                results = []

                for result in ddgs.text(query, max_results=max_results):
                    results.append(SearchResult(
                        title=result.get("title", ""),
                        url=result.get("href", ""),
                        snippet=result.get("body", "")
                    ))

                return SearchResponse(
                    results=results,
                    count=len(results),
                    provider=self.name,
                    query=query
                )

        except Exception as e:
            return SearchResponse(
                results=[],
                count=0,
                provider=self.name,
                query=query,
                error=f"Search failed: {str(e)}"
            )

    async def health_check(self) -> bool:
        """Basic health check for DDG."""
        try:
            response = await self.search("test", max_results=1)
            return response.error is None
        except Exception:
            return False
216
cortex/autonomy/tools/executors/trilium.py
Normal file
@@ -0,0 +1,216 @@
"""
Trilium notes executor for searching and creating notes via ETAPI.

This module provides integration with Trilium notes through the ETAPI HTTP API
with improved resilience: timeout configuration, retry logic, and connection pooling.
"""

import os
import asyncio

import aiohttp
from typing import Dict, Optional

from ..utils.resilience import async_retry


TRILIUM_URL = os.getenv("TRILIUM_URL", "http://localhost:8080")
TRILIUM_TOKEN = os.getenv("TRILIUM_ETAPI_TOKEN", "")

# Module-level session for connection pooling
_session: Optional[aiohttp.ClientSession] = None


def get_session() -> aiohttp.ClientSession:
    """Get or create shared aiohttp session for connection pooling."""
    global _session
    if _session is None or _session.closed:
        timeout = aiohttp.ClientTimeout(
            total=float(os.getenv("TRILIUM_TIMEOUT", "30.0")),
            connect=float(os.getenv("TRILIUM_CONNECT_TIMEOUT", "10.0"))
        )
        _session = aiohttp.ClientSession(timeout=timeout)
    return _session


@async_retry(
    max_attempts=3,
    exceptions=(aiohttp.ClientError, asyncio.TimeoutError)
)
async def search_notes(args: Dict) -> Dict:
    """Search Trilium notes via ETAPI with retry logic.

    Args:
        args: Dictionary containing:
            - query (str): Search query
            - limit (int, optional): Maximum notes to return (default: 5, max: 20)

    Returns:
        dict: Search results containing:
            - notes (list): List of notes with noteId, title, content, type
            - count (int): Number of notes returned
            OR
            - error (str): Error message if search failed
    """
    query = args.get("query")
    limit = args.get("limit", 5)

    # Validation
    if not query:
        return {"error": "No query provided"}

    if not TRILIUM_TOKEN:
        return {
            "error": "TRILIUM_ETAPI_TOKEN not configured in environment",
            "hint": "Set TRILIUM_ETAPI_TOKEN in .env file"
        }

    # Cap limit
    limit = min(max(limit, 1), 20)

    try:
        session = get_session()
        async with session.get(
            f"{TRILIUM_URL}/etapi/notes",
            params={"search": query, "limit": limit},
            headers={"Authorization": TRILIUM_TOKEN}
        ) as resp:
            if resp.status == 200:
                data = await resp.json()
                # ETAPI returns {"results": [...]} format
                results = data.get("results", [])
                return {
                    "notes": results,
                    "count": len(results)
                }
            elif resp.status == 401:
                return {
                    "error": "Authentication failed. Check TRILIUM_ETAPI_TOKEN",
                    "status": 401
                }
            elif resp.status == 404:
                return {
                    "error": "Trilium API endpoint not found. Check TRILIUM_URL",
                    "status": 404,
                    "url": TRILIUM_URL
                }
            else:
                error_text = await resp.text()
                return {
                    "error": f"HTTP {resp.status}: {error_text}",
                    "status": resp.status
                }

    except aiohttp.ClientConnectorError as e:
        return {
            "error": f"Cannot connect to Trilium at {TRILIUM_URL}",
            "hint": "Check if Trilium is running and URL is correct",
            "details": str(e)
        }
    except asyncio.TimeoutError:
        timeout = os.getenv("TRILIUM_TIMEOUT", "30.0")
        return {
            "error": f"Trilium request timeout after {timeout}s",
            "hint": "Trilium may be slow or unresponsive"
        }
    except Exception as e:
        return {
            "error": f"Search failed: {str(e)}",
            "type": type(e).__name__
        }


@async_retry(
    max_attempts=3,
    exceptions=(aiohttp.ClientError, asyncio.TimeoutError)
)
async def create_note(args: Dict) -> Dict:
    """Create a note in Trilium via ETAPI with retry logic.

    Args:
        args: Dictionary containing:
            - title (str): Note title
            - content (str): Note content in markdown or HTML
            - parent_note_id (str, optional): Parent note ID to nest under

    Returns:
        dict: Creation result containing:
            - noteId (str): ID of created note
            - title (str): Title of created note
            - success (bool): True if created successfully
            OR
            - error (str): Error message if creation failed
    """
    title = args.get("title")
    content = args.get("content")
    parent_note_id = args.get("parent_note_id", "root")  # Default to root if not specified

    # Validation
    if not title:
        return {"error": "No title provided"}

    if not content:
        return {"error": "No content provided"}

    if not TRILIUM_TOKEN:
        return {
            "error": "TRILIUM_ETAPI_TOKEN not configured in environment",
            "hint": "Set TRILIUM_ETAPI_TOKEN in .env file"
        }

    # Prepare payload
    payload = {
        "parentNoteId": parent_note_id,  # Always include parentNoteId
        "title": title,
        "content": content,
        "type": "text",
        "mime": "text/html"
    }

    try:
        session = get_session()
        async with session.post(
            f"{TRILIUM_URL}/etapi/create-note",
            json=payload,
            headers={"Authorization": TRILIUM_TOKEN}
        ) as resp:
            if resp.status in [200, 201]:
                data = await resp.json()
                return {
                    "noteId": data.get("noteId"),
                    "title": title,
                    "success": True
                }
            elif resp.status == 401:
                return {
                    "error": "Authentication failed. Check TRILIUM_ETAPI_TOKEN",
                    "status": 401
                }
            elif resp.status == 404:
                return {
                    "error": "Trilium API endpoint not found. Check TRILIUM_URL",
                    "status": 404,
                    "url": TRILIUM_URL
                }
            else:
                error_text = await resp.text()
                return {
                    "error": f"HTTP {resp.status}: {error_text}",
                    "status": resp.status
                }

    except aiohttp.ClientConnectorError as e:
        return {
            "error": f"Cannot connect to Trilium at {TRILIUM_URL}",
            "hint": "Check if Trilium is running and URL is correct",
            "details": str(e)
        }
    except asyncio.TimeoutError:
        timeout = os.getenv("TRILIUM_TIMEOUT", "30.0")
        return {
            "error": f"Trilium request timeout after {timeout}s",
            "hint": "Trilium may be slow or unresponsive"
        }
    except Exception as e:
        return {
            "error": f"Note creation failed: {str(e)}",
            "type": type(e).__name__
        }
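Both Trilium executors (and the web search executor below) clamp caller-supplied limits with the same `min(max(...))` idiom. A minimal sketch of the clamping behavior in isolation (the `cap_limit` name is illustrative):

```python
def cap_limit(limit: int, lo: int = 1, hi: int = 20) -> int:
    # Same clamping the executors use: min(max(limit, 1), 20)
    return min(max(limit, lo), hi)
```

This keeps a zero or negative limit from producing an empty or invalid request, and stops a caller from asking the ETAPI (or a search provider) for an unbounded result set.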
113
cortex/autonomy/tools/executors/web_search.py
Normal file
@@ -0,0 +1,113 @@
"""
Web search executor with pluggable provider support.

Supports multiple providers with automatic fallback:
- Brave Search API (recommended, configurable)
- DuckDuckGo (legacy fallback)
"""

import os
from typing import Dict, Optional

from .search_providers.base import SearchProvider
from .search_providers.brave import BraveSearchProvider
from .search_providers.duckduckgo import DuckDuckGoProvider

# Provider registry
PROVIDERS = {
    "brave": BraveSearchProvider,
    "duckduckgo": DuckDuckGoProvider,
}

# Singleton provider instances
_provider_instances: Dict[str, SearchProvider] = {}


def get_provider(name: str) -> Optional[SearchProvider]:
    """Get or create provider instance."""
    if name not in _provider_instances:
        provider_class = PROVIDERS.get(name)
        if provider_class:
            _provider_instances[name] = provider_class()
    return _provider_instances.get(name)


async def search_web(args: Dict) -> Dict:
    """Search the web using configured provider with automatic fallback.

    Args:
        args: Dictionary containing:
            - query (str): The search query
            - max_results (int, optional): Maximum results to return (default: 5, max: 20)
            - provider (str, optional): Force specific provider

    Returns:
        dict: Search results containing:
            - results (list): List of search results with title, url, snippet
            - count (int): Number of results returned
            - provider (str): Provider that returned results
            OR
            - error (str): Error message if all providers failed
    """
    query = args.get("query")
    max_results = args.get("max_results", 5)
    forced_provider = args.get("provider")

    # Validation
    if not query:
        return {"error": "No query provided"}

    # Cap max_results
    max_results = min(max(max_results, 1), 20)
|
||||||
|
|
||||||
|
# Get provider preference from environment
|
||||||
|
primary_provider = os.getenv("WEB_SEARCH_PROVIDER", "duckduckgo")
|
||||||
|
fallback_providers = os.getenv(
|
||||||
|
"WEB_SEARCH_FALLBACK",
|
||||||
|
"duckduckgo"
|
||||||
|
).split(",")
|
||||||
|
|
||||||
|
# Build provider list
|
||||||
|
if forced_provider:
|
||||||
|
providers_to_try = [forced_provider]
|
||||||
|
else:
|
||||||
|
providers_to_try = [primary_provider] + [
|
||||||
|
p.strip() for p in fallback_providers if p.strip() != primary_provider
|
||||||
|
]
|
||||||
|
|
||||||
|
# Try providers in order
|
||||||
|
last_error = None
|
||||||
|
for provider_name in providers_to_try:
|
||||||
|
provider = get_provider(provider_name)
|
||||||
|
if not provider:
|
||||||
|
last_error = f"Unknown provider: {provider_name}"
|
||||||
|
continue
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = await provider.search(query, max_results)
|
||||||
|
|
||||||
|
# If successful, return results
|
||||||
|
if response.error is None and response.count > 0:
|
||||||
|
return {
|
||||||
|
"results": [
|
||||||
|
{
|
||||||
|
"title": r.title,
|
||||||
|
"url": r.url,
|
||||||
|
"snippet": r.snippet,
|
||||||
|
}
|
||||||
|
for r in response.results
|
||||||
|
],
|
||||||
|
"count": response.count,
|
||||||
|
"provider": provider_name
|
||||||
|
}
|
||||||
|
|
||||||
|
last_error = response.error or "No results returned"
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
last_error = f"{provider_name} failed: {str(e)}"
|
||||||
|
continue
|
||||||
|
|
||||||
|
# All providers failed
|
||||||
|
return {
|
||||||
|
"error": f"All search providers failed. Last error: {last_error}",
|
||||||
|
"providers_tried": providers_to_try
|
||||||
|
}
|
||||||
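The provider-ordering logic in `search_web` can be distilled into a small pure function. This is an illustrative sketch (the helper name `build_provider_order` is not in the source); it mirrors how `providers_to_try` is assembled: a forced provider wins outright, otherwise the primary comes first followed by the comma-separated fallbacks with the primary deduplicated.

```python
def build_provider_order(primary: str, fallbacks: str, forced: str = None) -> list:
    """Mirror of search_web's provider ordering (illustrative helper).

    forced provider wins; otherwise primary first, then comma-separated
    fallbacks minus any duplicate of the primary.
    """
    if forced:
        return [forced]
    return [primary] + [
        p.strip() for p in fallbacks.split(",") if p.strip() != primary
    ]

# "brave" is primary, so the duplicate in the fallback list is dropped
order = build_provider_order("brave", "duckduckgo,brave")
```

Note that a forced provider bypasses the fallback chain entirely, which is why `search_web` reports `providers_tried` in its error payload.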
**cortex/autonomy/tools/function_caller.py** (new file, 235 lines)

```python
"""
Provider-agnostic function caller with iterative tool calling loop.

This module implements the iterative loop that allows LLMs to call tools
multiple times until they have the information they need to answer the user.
"""

import os
import logging
from typing import Dict, List, Optional
from llm.llm_router import call_llm, TOOL_ADAPTERS, BACKENDS
from .registry import get_registry
from .stream_events import get_stream_manager


logger = logging.getLogger(__name__)


class FunctionCaller:
    """Provider-agnostic iterative tool calling loop.

    This class orchestrates the back-and-forth between the LLM and tools:
    1. Call LLM with tools available
    2. If LLM requests tool calls, execute them
    3. Add results to conversation
    4. Repeat until LLM is done or max iterations reached
    """

    def __init__(self, backend: str, temperature: float = 0.7):
        """Initialize function caller.

        Args:
            backend: LLM backend to use ("OPENAI", "OLLAMA", etc.)
            temperature: Temperature for LLM calls
        """
        self.backend = backend
        self.temperature = temperature
        self.registry = get_registry()
        self.max_iterations = int(os.getenv("MAX_TOOL_ITERATIONS", "5"))

        # Resolve adapter for this backend
        self.adapter = self._get_adapter()

    def _get_adapter(self):
        """Get the appropriate adapter for this backend."""
        adapter = TOOL_ADAPTERS.get(self.backend)

        # For PRIMARY/SECONDARY/FALLBACK, determine adapter based on provider
        if adapter is None and self.backend in ["PRIMARY", "SECONDARY", "FALLBACK"]:
            cfg = BACKENDS.get(self.backend, {})
            provider = cfg.get("provider", "").lower()

            if provider == "openai":
                adapter = TOOL_ADAPTERS["OPENAI"]
            elif provider == "ollama":
                adapter = TOOL_ADAPTERS["OLLAMA"]
            elif provider == "mi50":
                adapter = TOOL_ADAPTERS["MI50"]

        return adapter

    async def call_with_tools(
        self,
        messages: List[Dict],
        max_tokens: int = 2048,
        session_id: Optional[str] = None
    ) -> Dict:
        """Execute LLM with iterative tool calling.

        Args:
            messages: Conversation history
            max_tokens: Maximum tokens for LLM response
            session_id: Optional session ID for streaming events

        Returns:
            dict: {
                "content": str,        # Final response
                "iterations": int,     # Number of iterations
                "tool_calls": list,    # All tool calls made
                "messages": list,      # Full conversation history
                "truncated": bool (optional)  # True if max iterations reached
            }
        """
        logger.info(f"🔍 FunctionCaller.call_with_tools() invoked with {len(messages)} messages")
        tools = self.registry.get_tool_definitions()
        logger.info(f"🔍 Got {len(tools or [])} tool definitions from registry")

        # Get stream manager for emitting events
        stream_manager = get_stream_manager()
        should_stream = session_id and stream_manager.has_subscribers(session_id)

        # If no tools are enabled, just call LLM directly
        if not tools:
            logger.warning("FunctionCaller invoked but no tools are enabled")
            response = await call_llm(
                messages=messages,
                backend=self.backend,
                temperature=self.temperature,
                max_tokens=max_tokens
            )
            return {
                "content": response,
                "iterations": 1,
                "tool_calls": [],
                "messages": messages + [{"role": "assistant", "content": response}]
            }

        conversation = messages.copy()
        all_tool_calls = []

        for iteration in range(self.max_iterations):
            logger.info(f"Tool calling iteration {iteration + 1}/{self.max_iterations}")

            # Emit thinking event
            if should_stream:
                await stream_manager.emit(session_id, "thinking", {
                    "message": f"🤔 Thinking... (iteration {iteration + 1}/{self.max_iterations})"
                })

            # Call LLM with tools
            try:
                response = await call_llm(
                    messages=conversation,
                    backend=self.backend,
                    temperature=self.temperature,
                    max_tokens=max_tokens,
                    tools=tools,
                    tool_choice="auto",
                    return_adapter_response=True
                )
            except Exception as e:
                logger.error(f"LLM call failed: {str(e)}")
                if should_stream:
                    await stream_manager.emit(session_id, "error", {
                        "message": f"❌ Error: {str(e)}"
                    })
                return {
                    "content": f"Error calling LLM: {str(e)}",
                    "iterations": iteration + 1,
                    "tool_calls": all_tool_calls,
                    "messages": conversation,
                    "error": True
                }

            # Add assistant message to conversation
            if response.get("content"):
                conversation.append({
                    "role": "assistant",
                    "content": response["content"]
                })

            # Check for tool calls
            tool_calls = response.get("tool_calls")
            logger.debug(f"Response from LLM: content_length={len(response.get('content', ''))}, tool_calls={tool_calls}")
            if not tool_calls:
                # No more tool calls - LLM is done
                logger.info(f"Tool calling complete after {iteration + 1} iterations")
                if should_stream:
                    await stream_manager.emit(session_id, "done", {
                        "message": "✅ Complete!",
                        "final_answer": response["content"]
                    })
                return {
                    "content": response["content"],
                    "iterations": iteration + 1,
                    "tool_calls": all_tool_calls,
                    "messages": conversation
                }

            # Execute each tool call
            logger.info(f"Executing {len(tool_calls)} tool call(s)")
            for tool_call in tool_calls:
                all_tool_calls.append(tool_call)

                tool_name = tool_call.get("name")
                tool_args = tool_call.get("arguments", {})
                tool_id = tool_call.get("id", "unknown")

                logger.info(f"Calling tool: {tool_name} with args: {tool_args}")

                # Emit tool call event
                if should_stream:
                    await stream_manager.emit(session_id, "tool_call", {
                        "tool": tool_name,
                        "args": tool_args,
                        "message": f"🔧 Using tool: {tool_name}"
                    })

                try:
                    # Execute tool
                    result = await self.registry.execute_tool(tool_name, tool_args)
                    logger.info(f"Tool {tool_name} executed successfully")

                    # Emit tool result event
                    if should_stream:
                        # Format result preview
                        result_preview = str(result)
                        if len(result_preview) > 200:
                            result_preview = result_preview[:200] + "..."

                        await stream_manager.emit(session_id, "tool_result", {
                            "tool": tool_name,
                            "result": result,
                            "message": f"📊 Result: {result_preview}"
                        })

                except Exception as e:
                    logger.error(f"Tool {tool_name} execution failed: {str(e)}")
                    result = {"error": f"Tool execution failed: {str(e)}"}

                # Format result using adapter
                if not self.adapter:
                    logger.warning(f"No adapter available for backend {self.backend}, using fallback format")
                    result_msg = {
                        "role": "user",
                        "content": f"Tool {tool_name} result: {result}"
                    }
                else:
                    result_msg = self.adapter.format_tool_result(
                        tool_id,
                        tool_name,
                        result
                    )

                conversation.append(result_msg)

        # Max iterations reached without completion
        logger.warning(f"Tool calling truncated after {self.max_iterations} iterations")
        return {
            "content": response.get("content", ""),
            "iterations": self.max_iterations,
            "tool_calls": all_tool_calls,
            "messages": conversation,
            "truncated": True
        }
```
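The control flow of `call_with_tools` can be seen in miniature with stubs standing in for the LLM and the tool registry. This is a sketch only: `fake_llm`, `fake_tool`, and `run_loop` are hypothetical names, not part of the source. It shows the termination rule the real loop relies on: the loop ends the first time the model returns no `tool_calls`.

```python
import asyncio

async def fake_llm(conversation):
    """Stub LLM: requests one tool call on the first turn, then answers."""
    if not any(m["role"] == "tool" for m in conversation):
        return {"content": "", "tool_calls": [{"name": "add", "arguments": {"a": 2, "b": 3}}]}
    return {"content": "The sum is 5.", "tool_calls": None}

async def fake_tool(args):
    """Stub tool executor."""
    return {"sum": args["a"] + args["b"]}

async def run_loop(messages, max_iterations=5):
    """Minimal version of the iterate-until-no-tool-calls loop."""
    conversation = list(messages)
    for iteration in range(max_iterations):
        response = await fake_llm(conversation)
        tool_calls = response.get("tool_calls")
        if not tool_calls:
            # LLM produced a final answer
            return {"content": response["content"], "iterations": iteration + 1}
        for call in tool_calls:
            result = await fake_tool(call["arguments"])
            # Feed the tool result back into the conversation
            conversation.append({"role": "tool", "content": str(result)})
    return {"content": response.get("content", ""), "iterations": max_iterations, "truncated": True}

result = asyncio.run(run_loop([{"role": "user", "content": "What is 2+3?"}]))
```

Here the answer arrives on the second iteration: one pass to request the tool, one pass to read its result.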
**cortex/autonomy/tools/registry.py** (new file, 196 lines)

```python
"""
Provider-agnostic Tool Registry for Lyra.

This module provides a central registry for all available tools with
Lyra-native definitions (not provider-specific).
"""

import os
from typing import Dict, List, Optional
from .executors import execute_code, search_web, search_notes, create_note


class ToolRegistry:
    """Registry for managing available tools and their definitions.

    Tools are defined in Lyra's own format (provider-agnostic), and
    adapters convert them to provider-specific formats (OpenAI function
    calling, Ollama XML prompts, etc.).
    """

    def __init__(self):
        """Initialize the tool registry with feature flags from environment."""
        self.tools = {}
        self.executors = {}

        # Feature flags from environment
        self.code_execution_enabled = os.getenv("ENABLE_CODE_EXECUTION", "true").lower() == "true"
        self.web_search_enabled = os.getenv("ENABLE_WEB_SEARCH", "true").lower() == "true"
        self.trilium_enabled = os.getenv("ENABLE_TRILIUM", "false").lower() == "true"

        self._register_tools()
        self._register_executors()

    def _register_executors(self):
        """Register executor functions for each tool."""
        if self.code_execution_enabled:
            self.executors["execute_code"] = execute_code

        if self.web_search_enabled:
            self.executors["search_web"] = search_web

        if self.trilium_enabled:
            self.executors["search_notes"] = search_notes
            self.executors["create_note"] = create_note

    def _register_tools(self):
        """Register all available tools based on feature flags."""

        if self.code_execution_enabled:
            self.tools["execute_code"] = {
                "name": "execute_code",
                "description": "Execute Python or bash code in a secure sandbox environment. Use this to perform calculations, data processing, file operations, or any programmatic tasks. The sandbox is persistent across calls within a session and has common Python packages (numpy, pandas, requests, matplotlib, scipy) pre-installed.",
                "parameters": {
                    "language": {
                        "type": "string",
                        "enum": ["python", "bash"],
                        "description": "The programming language to execute (python or bash)"
                    },
                    "code": {
                        "type": "string",
                        "description": "The code to execute. For multi-line code, use proper indentation. For Python, use standard Python 3.11 syntax."
                    },
                    "reason": {
                        "type": "string",
                        "description": "Brief explanation of why you're executing this code and what you expect to achieve"
                    }
                },
                "required": ["language", "code", "reason"]
            }

        if self.web_search_enabled:
            self.tools["search_web"] = {
                "name": "search_web",
                "description": "Search the internet using DuckDuckGo to find current information, facts, news, or answers to questions. Returns a list of search results with titles, snippets, and URLs. Use this when you need up-to-date information or facts not in your training data.",
                "parameters": {
                    "query": {
                        "type": "string",
                        "description": "The search query to look up on the internet"
                    },
                    "max_results": {
                        "type": "integer",
                        "description": "Maximum number of results to return (default: 5, max: 10)"
                    }
                },
                "required": ["query"]
            }

        if self.trilium_enabled:
            self.tools["search_notes"] = {
                "name": "search_notes",
                "description": "Search through Trilium notes to find relevant information. Use this to retrieve knowledge, context, or information previously stored in the user's notes.",
                "parameters": {
                    "query": {
                        "type": "string",
                        "description": "The search query to find matching notes"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of notes to return (default: 5, max: 20)"
                    }
                },
                "required": ["query"]
            }

            self.tools["create_note"] = {
                "name": "create_note",
                "description": "Create a new note in Trilium. Use this to store important information, insights, or knowledge for future reference. Notes are stored in the user's Trilium knowledge base.",
                "parameters": {
                    "title": {
                        "type": "string",
                        "description": "The title of the note"
                    },
                    "content": {
                        "type": "string",
                        "description": "The content of the note in markdown or HTML format"
                    },
                    "parent_note_id": {
                        "type": "string",
                        "description": "Optional ID of the parent note to nest this note under"
                    }
                },
                "required": ["title", "content"]
            }

    def get_tool_definitions(self) -> Optional[List[Dict]]:
        """Get list of all enabled tool definitions in Lyra format.

        Returns:
            list: List of tool definition dicts, or None if no tools enabled
        """
        if not self.tools:
            return None
        return list(self.tools.values())

    def get_tool_names(self) -> List[str]:
        """Get list of all enabled tool names.

        Returns:
            list: List of tool name strings
        """
        return list(self.tools.keys())

    def is_tool_enabled(self, tool_name: str) -> bool:
        """Check if a specific tool is enabled.

        Args:
            tool_name: Name of the tool to check

        Returns:
            bool: True if tool is enabled, False otherwise
        """
        return tool_name in self.tools

    def register_executor(self, tool_name: str, executor_func):
        """Register an executor function for a tool.

        Args:
            tool_name: Name of the tool
            executor_func: Async function that executes the tool
        """
        self.executors[tool_name] = executor_func

    async def execute_tool(self, name: str, arguments: dict) -> dict:
        """Execute a tool by name.

        Args:
            name: Tool name
            arguments: Tool arguments dict

        Returns:
            dict: Tool execution result
        """
        if name not in self.executors:
            return {"error": f"Unknown tool: {name}"}

        executor = self.executors[name]
        try:
            return await executor(arguments)
        except Exception as e:
            return {"error": f"Tool execution failed: {str(e)}"}


# Global registry instance (singleton pattern)
_registry = None


def get_registry() -> ToolRegistry:
    """Get the global ToolRegistry instance.

    Returns:
        ToolRegistry: The global registry instance
    """
    global _registry
    if _registry is None:
        _registry = ToolRegistry()
    return _registry
```
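The registry's env-flag pattern is worth isolating: tools only exist if their flag is set, and `get_tool_definitions()` returns `None` (not an empty list) when nothing is enabled, which is what lets `FunctionCaller` skip the tool loop entirely. A self-contained sketch under assumed names (`MiniRegistry`, `ENABLE_ECHO`, and the `echo` tool are hypothetical, not from the source):

```python
import os

class MiniRegistry:
    """Illustrative sketch of ToolRegistry's feature-flag registration."""

    def __init__(self):
        self.tools = {}
        # Flag read at construction time, same "true"-string convention as the source
        if os.getenv("ENABLE_ECHO", "true").lower() == "true":
            self.tools["echo"] = {
                "name": "echo",
                "parameters": {"text": {"type": "string"}},
            }

    def get_tool_definitions(self):
        # Same contract as ToolRegistry: None when no tools are enabled
        return list(self.tools.values()) or None

os.environ["ENABLE_ECHO"] = "false"
disabled = MiniRegistry().get_tool_definitions()
os.environ["ENABLE_ECHO"] = "true"
enabled = MiniRegistry().get_tool_definitions()
```

Because flags are read in `__init__`, changing the environment after the singleton is created has no effect; that is a property of the real registry too.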
**cortex/autonomy/tools/stream_events.py** (new file, 91 lines)

```python
"""
Event streaming for tool calling "show your work" feature.

This module manages Server-Sent Events (SSE) for broadcasting the internal
thinking process during tool calling operations.
"""

import asyncio
from typing import Dict, Optional
from collections import defaultdict
import json
import logging

logger = logging.getLogger(__name__)


class ToolStreamManager:
    """Manages SSE streams for tool calling events."""

    def __init__(self):
        # session_id -> list of queues (one per connected client)
        self._subscribers: Dict[str, list] = defaultdict(list)

    def subscribe(self, session_id: str) -> asyncio.Queue:
        """Subscribe to events for a session.

        Returns:
            Queue that will receive events for this session
        """
        queue = asyncio.Queue()
        self._subscribers[session_id].append(queue)
        logger.info(f"New subscriber for session {session_id}, total: {len(self._subscribers[session_id])}")
        return queue

    def unsubscribe(self, session_id: str, queue: asyncio.Queue):
        """Unsubscribe from events for a session."""
        if session_id in self._subscribers:
            try:
                self._subscribers[session_id].remove(queue)
                logger.info(f"Removed subscriber for session {session_id}, remaining: {len(self._subscribers[session_id])}")

                # Clean up empty lists
                if not self._subscribers[session_id]:
                    del self._subscribers[session_id]
            except ValueError:
                pass

    async def emit(self, session_id: str, event_type: str, data: dict):
        """Emit an event to all subscribers of a session.

        Args:
            session_id: Session to emit to
            event_type: Type of event (thinking, tool_call, tool_result, done)
            data: Event data
        """
        if session_id not in self._subscribers:
            return

        event = {
            "type": event_type,
            "data": data
        }

        # Send to all subscribers
        dead_queues = []
        for queue in self._subscribers[session_id]:
            try:
                await queue.put(event)
            except Exception as e:
                logger.error(f"Failed to emit event to queue: {e}")
                dead_queues.append(queue)

        # Clean up dead queues
        for queue in dead_queues:
            self.unsubscribe(session_id, queue)

    def has_subscribers(self, session_id: str) -> bool:
        """Check if a session has any active subscribers."""
        return session_id in self._subscribers and len(self._subscribers[session_id]) > 0


# Global stream manager instance
_stream_manager: Optional[ToolStreamManager] = None


def get_stream_manager() -> ToolStreamManager:
    """Get the global stream manager instance."""
    global _stream_manager
    if _stream_manager is None:
        _stream_manager = ToolStreamManager()
    return _stream_manager
```
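The per-session fan-out behind `ToolStreamManager` is a standard asyncio pub/sub shape: one `asyncio.Queue` per connected client, with `emit` pushing the same event into every queue for that session. A runnable sketch (the class name `MiniStreamManager` is illustrative, not from the source):

```python
import asyncio
from collections import defaultdict

class MiniStreamManager:
    """Minimal per-session fan-out, mirroring ToolStreamManager's shape."""

    def __init__(self):
        # session_id -> list of subscriber queues
        self._subscribers = defaultdict(list)

    def subscribe(self, session_id):
        q = asyncio.Queue()
        self._subscribers[session_id].append(q)
        return q

    async def emit(self, session_id, event_type, data):
        # Deliver the same event to every subscriber of this session
        for q in self._subscribers.get(session_id, []):
            await q.put({"type": event_type, "data": data})

async def demo():
    mgr = MiniStreamManager()
    q1 = mgr.subscribe("s1")
    q2 = mgr.subscribe("s1")
    await mgr.emit("s1", "thinking", {"message": "working"})
    # Both subscribers receive their own copy of the event
    return await q1.get(), await q2.get()

event_a, event_b = asyncio.run(demo())
```

In the real manager an SSE endpoint would drain each queue into a client connection; queues that raise on `put` are pruned so a dropped client cannot stall the others.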
**cortex/autonomy/tools/utils/__init__.py** (new file, 5 lines)

```python
"""Utility modules for tool executors."""

from .resilience import async_retry, async_timeout_wrapper

__all__ = ["async_retry", "async_timeout_wrapper"]
```
**cortex/autonomy/tools/utils/resilience.py** (new file, 70 lines)

```python
"""Common resilience utilities for tool executors."""

import asyncio
import functools
import logging
from typing import Optional, Callable, Any, TypeVar
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
    before_sleep_log
)

logger = logging.getLogger(__name__)

# Type variable for generic decorators
T = TypeVar('T')


def async_retry(
    max_attempts: int = 3,
    exceptions: tuple = (Exception,),
    **kwargs
):
    """Async retry decorator with exponential backoff.

    Args:
        max_attempts: Maximum retry attempts
        exceptions: Exception types to retry on
        **kwargs: Additional tenacity configuration

    Example:
        @async_retry(max_attempts=3, exceptions=(aiohttp.ClientError,))
        async def fetch_data():
            ...
    """
    return retry(
        stop=stop_after_attempt(max_attempts),
        wait=wait_exponential(multiplier=1, min=1, max=10),
        retry=retry_if_exception_type(exceptions),
        reraise=True,
        before_sleep=before_sleep_log(logger, logging.WARNING),
        **kwargs
    )


async def async_timeout_wrapper(
    coro: Callable[..., T],
    timeout: float,
    *args,
    **kwargs
) -> T:
    """Wrap async function with timeout.

    Args:
        coro: Async function to wrap
        timeout: Timeout in seconds
        *args, **kwargs: Arguments for the function

    Returns:
        Result from the function

    Raises:
        asyncio.TimeoutError: If timeout exceeded

    Example:
        result = await async_timeout_wrapper(some_async_func, 5.0, arg1, arg2)
    """
    return await asyncio.wait_for(coro(*args, **kwargs), timeout=timeout)
```
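The timeout half of this module needs only the standard library, so it can be exercised directly (the retry half depends on the third-party `tenacity` package and is omitted here). This sketch reimplements `async_timeout_wrapper` verbatim and shows both outcomes: a call that finishes inside the budget and one that is cancelled by `asyncio.wait_for`. The `slow` coroutine is a hypothetical stand-in for a real tool executor.

```python
import asyncio

async def async_timeout_wrapper(coro_fn, timeout, *args, **kwargs):
    """Run an async callable with a hard timeout (same shape as the helper above)."""
    return await asyncio.wait_for(coro_fn(*args, **kwargs), timeout=timeout)

async def slow(x):
    """Stand-in for a slow tool call."""
    await asyncio.sleep(0.2)
    return x

async def demo():
    # Generous budget: the call completes normally
    fast_result = await async_timeout_wrapper(slow, 1.0, 42)
    # Tight budget: wait_for cancels the task and raises TimeoutError
    try:
        await async_timeout_wrapper(slow, 0.01, 42)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return fast_result, timed_out

fast_result, timed_out = asyncio.run(demo())
```

Note that `asyncio.wait_for` cancels the inner task on timeout, so the wrapped executor must tolerate cancellation at any await point.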
```diff
@@ -4,8 +4,8 @@
     "focus": "conversation",
     "confidence": 0.7,
     "curiosity": 1.0,
-    "last_updated": "2025-12-20T09:08:41.342756",
-    "interaction_count": 25,
+    "last_updated": "2025-12-27T18:16:00.152499",
+    "interaction_count": 27,
     "learning_queue": [],
     "active_goals": [],
     "preferences": {
```
```diff
@@ -326,11 +326,33 @@ def bg_summarize(session_id: str):
 # ─────────────────────────────
 # Internal entrypoint for Cortex
 # ─────────────────────────────
+def get_recent_messages(session_id: str, limit: int = 20) -> list:
+    """
+    Get recent raw messages from the session buffer.
+
+    Args:
+        session_id: Session identifier
+        limit: Maximum number of messages to return (default 20)
+
+    Returns:
+        List of message dicts with 'role' and 'content' fields
+    """
+    if session_id not in SESSIONS:
+        return []
+
+    buffer = SESSIONS[session_id]["buffer"]
+
+    # Convert buffer to list and get last N messages
+    messages = list(buffer)[-limit:]
+
+    return messages
+
+
 def add_exchange_internal(exchange: dict):
     """
     Direct internal call — bypasses FastAPI request handling.
     Cortex uses this to feed user/assistant turns directly
-    into Intake’s buffer and trigger full summarization.
+    into Intake's buffer and trigger full summarization.
     """
     session_id = exchange.get("session_id")
     if not session_id:
```
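The new `get_recent_messages` helper is a thin slice over Intake's per-session buffer. A self-contained sketch of the same pattern, assuming (as the slicing code suggests) that the buffer is a `collections.deque`; the `maxlen` value and the `SESSIONS` contents here are illustrative:

```python
from collections import deque

# Illustrative per-session store: each session keeps a bounded message buffer
SESSIONS = {"s1": {"buffer": deque(maxlen=100)}}

for i in range(30):
    role = "user" if i % 2 == 0 else "assistant"
    SESSIONS["s1"]["buffer"].append({"role": role, "content": f"msg {i}"})

def get_recent_messages(session_id, limit=20):
    """Last `limit` messages for a session; empty list for unknown sessions."""
    if session_id not in SESSIONS:
        return []
    # deque supports no negative slicing, hence the list() conversion
    return list(SESSIONS[session_id]["buffer"])[-limit:]

recent = get_recent_messages("s1", limit=5)
```

Returning `[]` for an unknown session (rather than raising) lets Cortex call this entrypoint without first checking whether Intake has seen the session.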
**llm/llm_router.py**

```diff
@@ -3,6 +3,8 @@ import os
 import httpx
 import json
 import logging
+from typing import Optional, List, Dict
+from autonomy.tools.adapters import OpenAIAdapter, OllamaAdapter, LlamaCppAdapter
 
 logger = logging.getLogger(__name__)
 
```
```diff
@@ -39,16 +41,47 @@ DEFAULT_BACKEND = "PRIMARY"
 # Reusable async HTTP client
 http_client = httpx.AsyncClient(timeout=120.0)
 
+# Tool adapters for each backend
+TOOL_ADAPTERS = {
+    "OPENAI": OpenAIAdapter(),
+    "OLLAMA": OllamaAdapter(),
+    "MI50": LlamaCppAdapter(),  # MI50 uses llama.cpp
+    "PRIMARY": None,    # Determined at runtime
+    "SECONDARY": None,  # Determined at runtime
+    "FALLBACK": None,   # Determined at runtime
+}
+
+
 # ------------------------------------------------------------
 # Public call
 # ------------------------------------------------------------
 async def call_llm(
-    prompt: str,
+    prompt: str = None,
+    messages: list = None,
     backend: str | None = None,
     temperature: float = 0.7,
     max_tokens: int = 512,
+    tools: Optional[List[Dict]] = None,
+    tool_choice: Optional[str] = None,
+    return_adapter_response: bool = False,
 ):
+    """
+    Call an LLM backend with optional tool calling support.
+
+    Args:
+        prompt: String prompt (for completion-style APIs like mi50)
+        messages: List of message dicts (for chat-style APIs like Ollama/OpenAI)
+        backend: Which backend to use (PRIMARY, SECONDARY, OPENAI, etc.)
+        temperature: Sampling temperature
+        max_tokens: Maximum tokens to generate
+        tools: List of Lyra tool definitions (provider-agnostic)
+        tool_choice: How to use tools ("auto", "required", "none")
+        return_adapter_response: If True, return dict with content and tool_calls
+
+    Returns:
+        str (default) or dict (if return_adapter_response=True):
+            {"content": str, "tool_calls": [...] or None}
+    """
     backend = (backend or DEFAULT_BACKEND).upper()
 
     if backend not in BACKENDS:
```
```diff
@@ -62,20 +95,73 @@ async def call_llm(
     if not url or not model:
         raise RuntimeError(f"Backend '{backend}' missing url/model in env")
 
+    # If tools are requested, use adapter to prepare request
+    if tools:
+        # Get adapter for this backend
+        adapter = TOOL_ADAPTERS.get(backend)
+
+        # For PRIMARY/SECONDARY/FALLBACK, determine adapter based on provider
+        if adapter is None and backend in ["PRIMARY", "SECONDARY", "FALLBACK"]:
+            if provider == "openai":
+                adapter = TOOL_ADAPTERS["OPENAI"]
+            elif provider == "ollama":
+                adapter = TOOL_ADAPTERS["OLLAMA"]
+            elif provider == "mi50":
+                adapter = TOOL_ADAPTERS["MI50"]
+
+        if adapter:
+            # Use messages array if provided, otherwise convert prompt to messages
+            if not messages:
+                messages = [{"role": "user", "content": prompt}]
+
+            # Prepare request through adapter
+            adapted_request = await adapter.prepare_request(messages, tools, tool_choice)
+            messages = adapted_request["messages"]
+
+            # Extract tools in provider format if present
+            provider_tools = adapted_request.get("tools")
+            provider_tool_choice = adapted_request.get("tool_choice")
+        else:
+            logger.warning(f"No adapter available for backend {backend}, ignoring tools")
+            provider_tools = None
+            provider_tool_choice = None
+    else:
+        provider_tools = None
+        provider_tool_choice = None
+
     # -------------------------------
     # Provider: MI50 (llama.cpp server)
     # -------------------------------
     if provider == "mi50":
+        # If tools requested, convert messages to prompt with tool instructions
```
|
||||||
|
if messages and tools:
|
||||||
|
# Combine messages into a prompt
|
||||||
|
prompt_parts = []
|
||||||
|
for msg in messages:
|
||||||
|
role = msg.get("role", "user")
|
||||||
|
content = msg.get("content", "")
|
||||||
|
prompt_parts.append(f"{role.capitalize()}: {content}")
|
||||||
|
prompt = "\n".join(prompt_parts) + "\nAssistant:"
|
||||||
|
|
||||||
payload = {
|
payload = {
|
||||||
"prompt": prompt,
|
"prompt": prompt,
|
||||||
"n_predict": max_tokens,
|
"n_predict": max_tokens,
|
||||||
"temperature": temperature
|
"temperature": temperature,
|
||||||
|
"stop": ["User:", "\nUser:", "Assistant:", "\n\n\n"]
|
||||||
}
|
}
|
||||||
try:
|
try:
|
||||||
r = await http_client.post(f"{url}/completion", json=payload)
|
r = await http_client.post(f"{url}/completion", json=payload)
|
||||||
r.raise_for_status()
|
r.raise_for_status()
|
||||||
data = r.json()
|
data = r.json()
|
||||||
return data.get("content", "")
|
response_content = data.get("content", "")
|
||||||
|
|
||||||
|
# If caller wants adapter response with tool calls, parse and return
|
||||||
|
if return_adapter_response and tools:
|
||||||
|
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["MI50"]
|
||||||
|
return await adapter.parse_response(response_content)
|
||||||
|
else:
|
||||||
|
return response_content
|
||||||
|
|
||||||
except httpx.HTTPError as e:
|
except httpx.HTTPError as e:
|
||||||
logger.error(f"HTTP error calling mi50: {type(e).__name__}: {str(e)}")
|
logger.error(f"HTTP error calling mi50: {type(e).__name__}: {str(e)}")
|
||||||
raise RuntimeError(f"LLM API error (mi50): {type(e).__name__}: {str(e)}")
|
raise RuntimeError(f"LLM API error (mi50): {type(e).__name__}: {str(e)}")
|
||||||
@@ -89,19 +175,41 @@ async def call_llm(
|
|||||||
# -------------------------------
|
# -------------------------------
|
||||||
# Provider: OLLAMA (your 3090)
|
# Provider: OLLAMA (your 3090)
|
||||||
# -------------------------------
|
# -------------------------------
|
||||||
|
logger.info(f"🔍 LLM Router: provider={provider}, checking if ollama...")
|
||||||
if provider == "ollama":
|
if provider == "ollama":
|
||||||
|
logger.info(f"🔍 LLM Router: Matched ollama provider, tools={bool(tools)}, return_adapter_response={return_adapter_response}")
|
||||||
|
# Use messages array if provided, otherwise convert prompt to single user message
|
||||||
|
if messages:
|
||||||
|
chat_messages = messages
|
||||||
|
else:
|
||||||
|
chat_messages = [{"role": "user", "content": prompt}]
|
||||||
|
|
||||||
payload = {
|
payload = {
|
||||||
"model": model,
|
"model": model,
|
||||||
"messages": [
|
"messages": chat_messages,
|
||||||
{"role": "user", "content": prompt}
|
"stream": False,
|
||||||
],
|
"options": {
|
||||||
"stream": False
|
"temperature": temperature,
|
||||||
|
"num_predict": max_tokens
|
||||||
|
}
|
||||||
}
|
}
|
||||||
try:
|
try:
|
||||||
r = await http_client.post(f"{url}/api/chat", json=payload)
|
r = await http_client.post(f"{url}/api/chat", json=payload)
|
||||||
r.raise_for_status()
|
r.raise_for_status()
|
||||||
data = r.json()
|
data = r.json()
|
||||||
return data["message"]["content"]
|
response_content = data["message"]["content"]
|
||||||
|
|
||||||
|
# If caller wants adapter response with tool calls, parse and return
|
||||||
|
if return_adapter_response and tools:
|
||||||
|
logger.info(f"🔍 Ollama: return_adapter_response=True, calling adapter.parse_response")
|
||||||
|
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["OLLAMA"]
|
||||||
|
logger.info(f"🔍 Ollama: Using adapter {adapter.__class__.__name__}")
|
||||||
|
result = await adapter.parse_response(response_content)
|
||||||
|
logger.info(f"🔍 Ollama: Adapter returned {result}")
|
||||||
|
return result
|
||||||
|
else:
|
||||||
|
return response_content
|
||||||
|
|
||||||
except httpx.HTTPError as e:
|
except httpx.HTTPError as e:
|
||||||
logger.error(f"HTTP error calling ollama: {type(e).__name__}: {str(e)}")
|
logger.error(f"HTTP error calling ollama: {type(e).__name__}: {str(e)}")
|
||||||
raise RuntimeError(f"LLM API error (ollama): {type(e).__name__}: {str(e)}")
|
raise RuntimeError(f"LLM API error (ollama): {type(e).__name__}: {str(e)}")
|
||||||
@@ -121,19 +229,62 @@ async def call_llm(
|
|||||||
"Authorization": f"Bearer {cfg['api_key']}",
|
"Authorization": f"Bearer {cfg['api_key']}",
|
||||||
"Content-Type": "application/json"
|
"Content-Type": "application/json"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# Use messages array if provided, otherwise convert prompt to single user message
|
||||||
|
if messages:
|
||||||
|
chat_messages = messages
|
||||||
|
else:
|
||||||
|
chat_messages = [{"role": "user", "content": prompt}]
|
||||||
|
|
||||||
payload = {
|
payload = {
|
||||||
"model": model,
|
"model": model,
|
||||||
"messages": [
|
"messages": chat_messages,
|
||||||
{"role": "user", "content": prompt}
|
|
||||||
],
|
|
||||||
"temperature": temperature,
|
"temperature": temperature,
|
||||||
"max_tokens": max_tokens,
|
"max_tokens": max_tokens,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# Add tools if available (OpenAI native function calling)
|
||||||
|
if provider_tools:
|
||||||
|
payload["tools"] = provider_tools
|
||||||
|
if provider_tool_choice:
|
||||||
|
payload["tool_choice"] = provider_tool_choice
|
||||||
|
|
||||||
try:
|
try:
|
||||||
r = await http_client.post(f"{url}/chat/completions", json=payload, headers=headers)
|
r = await http_client.post(f"{url}/chat/completions", json=payload, headers=headers)
|
||||||
r.raise_for_status()
|
r.raise_for_status()
|
||||||
data = r.json()
|
data = r.json()
|
||||||
|
|
||||||
|
# If caller wants adapter response with tool calls, parse and return
|
||||||
|
if return_adapter_response and tools:
|
||||||
|
# Create mock response object for adapter
|
||||||
|
class MockChoice:
|
||||||
|
def __init__(self, message_data):
|
||||||
|
self.message = type('obj', (object,), {})()
|
||||||
|
self.message.content = message_data.get("content")
|
||||||
|
# Convert tool_calls dicts to objects
|
||||||
|
raw_tool_calls = message_data.get("tool_calls")
|
||||||
|
if raw_tool_calls:
|
||||||
|
self.message.tool_calls = []
|
||||||
|
for tc in raw_tool_calls:
|
||||||
|
tool_call_obj = type('obj', (object,), {})()
|
||||||
|
tool_call_obj.id = tc.get("id")
|
||||||
|
tool_call_obj.function = type('obj', (object,), {})()
|
||||||
|
tool_call_obj.function.name = tc.get("function", {}).get("name")
|
||||||
|
tool_call_obj.function.arguments = tc.get("function", {}).get("arguments")
|
||||||
|
self.message.tool_calls.append(tool_call_obj)
|
||||||
|
else:
|
||||||
|
self.message.tool_calls = None
|
||||||
|
|
||||||
|
class MockResponse:
|
||||||
|
def __init__(self, data):
|
||||||
|
self.choices = [MockChoice(data["choices"][0]["message"])]
|
||||||
|
|
||||||
|
mock_resp = MockResponse(data)
|
||||||
|
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["OPENAI"]
|
||||||
|
return await adapter.parse_response(mock_resp)
|
||||||
|
else:
|
||||||
return data["choices"][0]["message"]["content"]
|
return data["choices"][0]["message"]["content"]
|
||||||
|
|
||||||
except httpx.HTTPError as e:
|
except httpx.HTTPError as e:
|
||||||
logger.error(f"HTTP error calling openai: {type(e).__name__}: {str(e)}")
|
logger.error(f"HTTP error calling openai: {type(e).__name__}: {str(e)}")
|
||||||
raise RuntimeError(f"LLM API error (openai): {type(e).__name__}: {str(e)}")
|
raise RuntimeError(f"LLM API error (openai): {type(e).__name__}: {str(e)}")
|
||||||
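The router now accepts either a `prompt` string or a `messages` list; for the completion-style mi50 backend, chat messages are flattened into a single prompt. A standalone sketch of that flattening (the helper name is ours, not from the diff):

```python
def messages_to_prompt(messages):
    # Flatten chat-style message dicts into a completion-style prompt,
    # mirroring the mi50 conversion in the diff above.
    parts = []
    for msg in messages:
        role = msg.get("role", "user")
        content = msg.get("content", "")
        parts.append(f"{role.capitalize()}: {content}")
    return "\n".join(parts) + "\nAssistant:"

print(messages_to_prompt([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
]))
```

Because the flattened prompt ends with `Assistant:`, the `stop` strings added in the payload (`"User:"`, `"Assistant:"`) keep the model from continuing the transcript past its own turn.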
@@ -1,6 +1,16 @@
 from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
 from router import cortex_router

 app = FastAPI()

+# Add CORS middleware to allow SSE connections from nginx UI
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],  # In production, specify exact origins
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)

 app.include_router(cortex_router)
@@ -4,3 +4,7 @@ python-dotenv==1.0.1
 requests==2.32.3
 httpx==0.27.2
 pydantic==2.10.4
+duckduckgo-search==6.3.5
+aiohttp==3.9.1
+tenacity==9.0.0
+docker==7.1.0

cortex/router.py | 138
@@ -2,7 +2,9 @@

 import os
 import logging
+import asyncio
 from fastapi import APIRouter
+from fastapi.responses import StreamingResponse
 from pydantic import BaseModel

 from reasoning.reasoning import reason_check
@@ -15,6 +17,7 @@ from intake.intake import add_exchange_internal

 from autonomy.monologue.monologue import InnerMonologue
 from autonomy.self.state import load_self_state
+from autonomy.tools.stream_events import get_stream_manager


 # -------------------------------------------------------------------
@@ -44,6 +47,7 @@ class ReasonRequest(BaseModel):
     session_id: str
     user_prompt: str
     temperature: float | None = None
+    backend: str | None = None


 # -------------------------------------------------------------------
@@ -342,6 +346,7 @@ async def run_simple(req: ReasonRequest):
     """
     from datetime import datetime
     from llm.llm_router import call_llm
+    from autonomy.tools.function_caller import FunctionCaller

     start_time = datetime.now()

@@ -351,17 +356,34 @@ async def run_simple(req: ReasonRequest):
     logger.info(f"📝 User: {req.user_prompt[:150]}...")
     logger.info(f"{'-'*100}\n")

-    # Get conversation history from context
+    # Get conversation history from context and intake buffer
     context_state = await collect_context(req.session_id, req.user_prompt)

-    # Build simple conversation history
-    messages = []
-    if context_state.get("recent_messages"):
-        for msg in context_state["recent_messages"]:
+    # Get recent messages from Intake buffer
+    from intake.intake import get_recent_messages
+    recent_msgs = get_recent_messages(req.session_id, limit=20)
+    logger.info(f"📋 Retrieved {len(recent_msgs)} recent messages from Intake buffer")
+
+    # Build simple conversation history with system message
+    system_message = {
+        "role": "system",
+        "content": (
+            "You are a helpful AI assistant. Provide direct, concise responses to the user's questions. "
+            "Maintain context from previous messages in the conversation."
+        )
+    }
+
+    messages = [system_message]
+
+    # Add conversation history
+    if recent_msgs:
+        for msg in recent_msgs:
             messages.append({
                 "role": msg.get("role", "user"),
                 "content": msg.get("content", "")
             })
+            logger.info(f"  - {msg.get('role')}: {msg.get('content', '')[:50]}...")

     # Add current user message
     messages.append({
@@ -369,30 +391,49 @@ async def run_simple(req: ReasonRequest):
         "content": req.user_prompt
     })

-    # Format messages into a simple prompt for the LLM
-    conversation = ""
-    for msg in messages:
-        role = msg["role"]
-        content = msg["content"]
-        if role == "user":
-            conversation += f"User: {content}\n\n"
-        elif role == "assistant":
-            conversation += f"Assistant: {content}\n\n"
-
-    conversation += "Assistant: "
+    logger.info(f"📨 Total messages being sent to LLM: {len(messages)} (including system message)")
+
+    # Get backend from request, otherwise fall back to env variable
+    backend = req.backend if req.backend else os.getenv("STANDARD_MODE_LLM", "SECONDARY")
+    backend = backend.upper()  # Normalize to uppercase
+    logger.info(f"🔧 Using backend: {backend}")

-    # Get backend from env (default to OPENAI for standard mode)
-    backend = os.getenv("STANDARD_MODE_LLM", "OPENAI")
     temperature = req.temperature if req.temperature is not None else 0.7

-    # Direct LLM call
+    # Check if tools are enabled
+    enable_tools = os.getenv("STANDARD_MODE_ENABLE_TOOLS", "false").lower() == "true"
+
+    # Call LLM with or without tools
     try:
-        response = await call_llm(
-            prompt=conversation,
+        if enable_tools:
+            # Use FunctionCaller for tool-enabled conversation
+            logger.info(f"🛠️ Tool calling enabled for Standard Mode")
+            logger.info(f"🔍 Creating FunctionCaller with backend={backend}, temp={temperature}")
+            function_caller = FunctionCaller(backend, temperature)
+            logger.info(f"🔍 FunctionCaller created, calling call_with_tools...")
+            result = await function_caller.call_with_tools(
+                messages=messages,
+                max_tokens=2048,
+                session_id=req.session_id  # Pass session_id for streaming
+            )
+            logger.info(f"🔍 call_with_tools returned: iterations={result.get('iterations')}, tool_calls={len(result.get('tool_calls', []))}")
+
+            # Log tool usage
+            if result.get("tool_calls"):
+                tool_names = [tc["name"] for tc in result["tool_calls"]]
+                logger.info(f"🔧 Tools used: {', '.join(tool_names)} ({result['iterations']} iterations)")
+
+            response = result["content"].strip()
+        else:
+            # Direct LLM call without tools (original behavior)
+            raw_response = await call_llm(
+                messages=messages,
                 backend=backend,
                 temperature=temperature,
                 max_tokens=2048
             )
+            response = raw_response.strip()

     except Exception as e:
         logger.error(f"❌ LLM call failed: {e}")
         response = f"Error: {str(e)}"
@@ -433,6 +474,63 @@ async def run_simple(req: ReasonRequest):
     }


+# -------------------------------------------------------------------
+# /stream/thinking endpoint - SSE stream for "show your work"
+# -------------------------------------------------------------------
+@cortex_router.get("/stream/thinking/{session_id}")
+async def stream_thinking(session_id: str):
+    """
+    Server-Sent Events stream for tool calling "show your work" feature.
+
+    Streams real-time updates about:
+    - Thinking/planning steps
+    - Tool calls being made
+    - Tool execution results
+    - Final completion
+    """
+    stream_manager = get_stream_manager()
+    queue = stream_manager.subscribe(session_id)
+
+    async def event_generator():
+        try:
+            # Send initial connection message
+            import json
+            connected_event = json.dumps({"type": "connected", "session_id": session_id})
+            yield f"data: {connected_event}\n\n"
+
+            while True:
+                # Wait for events with timeout to send keepalive
+                try:
+                    event = await asyncio.wait_for(queue.get(), timeout=30.0)
+
+                    # Format as SSE
+                    event_data = json.dumps(event)
+                    yield f"data: {event_data}\n\n"
+
+                    # If it's a "done" event, close the stream
+                    if event.get("type") == "done":
+                        break
+
+                except asyncio.TimeoutError:
+                    # Send keepalive comment
+                    yield ": keepalive\n\n"
+
+        except asyncio.CancelledError:
+            logger.info(f"Stream cancelled for session {session_id}")
+        finally:
+            stream_manager.unsubscribe(session_id, queue)
+
+    return StreamingResponse(
+        event_generator(),
+        media_type="text/event-stream",
+        headers={
+            "Cache-Control": "no-cache",
+            "Connection": "keep-alive",
+            "X-Accel-Buffering": "no"  # Disable nginx buffering
+        }
+    )

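The `/stream/thinking` endpoint above emits `data: <json>` lines plus `: keepalive` comment lines. A minimal client-side parse of that wire format (the sample payload is illustrative, not from the diff):

```python
import json

def parse_sse(stream_text):
    # Collect JSON payloads from "data:" lines; comment lines
    # (starting with ":") such as keepalives are ignored.
    events = []
    for line in stream_text.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events

sample = (
    'data: {"type": "connected", "session_id": "abc"}\n\n'
    ': keepalive\n\n'
    'data: {"type": "done"}\n\n'
)
print(parse_sse(sample))  # -> [{'type': 'connected', 'session_id': 'abc'}, {'type': 'done'}]
```

A real consumer would read the response incrementally and stop once it sees a `{"type": "done"}` event, matching the server's close condition.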
 # -------------------------------------------------------------------
 # /ingest endpoint (internal)
 # -------------------------------------------------------------------

debug_regex.py | 26 (new file)
@@ -0,0 +1,26 @@
#!/usr/bin/env python3
import re

xml = """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(50 / 2)</code>
<reason>To calculate the result of dividing 50 by 2.</reason>
</arguments>
</olith>"""

pattern = r'<tool_call>(.*?)</(?:tool_call|[a-zA-Z]+)>'
matches = re.findall(pattern, xml, re.DOTALL)

print(f"Pattern: {pattern}")
print(f"Number of matches: {len(matches)}")
print("\nMatches:")
for idx, match in enumerate(matches):
    print(f"\nMatch {idx + 1}:")
    print(f"Length: {len(match)} chars")
    print(f"Content:\n{match[:200]}")

# Now test what gets removed
clean_content = re.sub(pattern, '', xml, flags=re.DOTALL).strip()
print(f"\n\nCleaned content:\n{clean_content}")

@@ -1,25 +0,0 @@ (deleted file)
-# === GLOBAL LYRA SETTINGS ===
-PROJECT_NAME=lyra
-LOG_LEVEL=info
-
-# === SHARED MEMORY / DATABASE CONFIG ===
-NEOMEM_API=http://10.0.0.40:7077
-NEOMEM_KEY=placeholder
-
-# === PRIMARY LLM BACKEND (MI50 vLLM) ===
-LLM_PRIMARY_URL=http://10.0.0.43:8000
-LLM_PRIMARY_MODEL=qwen2.5:14b-instruct
-
-# === SECONDARY (3090 Ollama) ===
-LLM_SECONDARY_URL=http://10.0.0.3:11434
-
-# === CLOUD BACKEND (OpenAI, optional) ===
-LLM_CLOUD_URL=https://api.openai.com/v1
-OPENAI_API_KEY=sk-...
-
-# === LOCAL CPU FALLBACK ===
-LLM_FALLBACK_URL=http://localhost:11434
-
-# === DEFAULT TEMPERATURE / BACKEND SELECTION ===
-LLM_TEMPERATURE=0.7
-LLM_FORCE_BACKEND=primary   # auto | primary | secondary | cloud | fallback
@@ -7,6 +7,8 @@ volumes:
     driver: local
   neo4j_data:
     driver: local
+  code_executions:
+    driver: local

 services:

@@ -125,11 +127,37 @@ services:
       - ./.env
     volumes:
       - ./cortex:/app
+      - /var/run/docker.sock:/var/run/docker.sock:ro
     ports:
       - "7081:7081"
     networks:
       - lyra_net

+  # ============================================================
+  # Code Sandbox (for tool execution)
+  # ============================================================
+  code-sandbox:
+    build:
+      context: ./sandbox
+    container_name: lyra-code-sandbox
+    restart: unless-stopped
+    security_opt:
+      - no-new-privileges:true
+    cap_drop:
+      - ALL
+    cap_add:
+      - CHOWN
+      - SETUID
+      - SETGID
+    network_mode: "none"
+    volumes:
+      - code_executions:/executions
+    mem_limit: 512m
+    cpus: 1.0
+    pids_limit: 100
+    user: sandbox
+    command: tail -f /dev/null

 # ============================================================
 # Intake
 # ============================================================

docs/TRILLIUM_API.md | 730 (new file)
@@ -0,0 +1,730 @@
# TriliumNext ETAPI Complete API Reference

## Overview

ETAPI is TriliumNext's public/external REST API, available since Trilium v0.50.

**Base URLs:**
- `http://localhost:37740/etapi`
- `http://localhost:8080/etapi`

**API Version:** 1.0.0
**License:** Apache 2.0

## Authentication

All operations require authentication using one of these methods:

### 1. ETAPI Token Authentication (Recommended)
```http
GET /etapi/app-info
Authorization: <ETAPI_TOKEN>
```

OR (since v0.93.0):
```http
GET /etapi/app-info
Authorization: Bearer <ETAPI_TOKEN>
```

### 2. Basic Authentication (since v0.56)
```http
GET /etapi/app-info
Authorization: Basic <BASE64(username:password)>
```
**Note:** The password must be the ETAPI token (NOT your Trilium password).

### 3. Get Token via API
```http
POST /etapi/auth/login
Content-Type: application/json

{
  "password": "your_trilium_password"
}
```

**Response:**
```json
{
  "authToken": "Bc4bFn0Ffiok_4NpbVCDnFz7B2WU+pdhW8B5Ne3DiR5wXrEyqdjgRIsk="
}
```

---

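For the Basic variant above, the credential is the base64 of `username:token`, with the ETAPI token in the password slot. A quick sketch (the username and token values are made up):

```python
import base64

def basic_auth_header(username, etapi_token):
    # Per the note above, the password field carries the ETAPI token,
    # not the Trilium login password.
    raw = f"{username}:{etapi_token}".encode()
    return "Basic " + base64.b64encode(raw).decode()

print(basic_auth_header("admin", "my-token"))  # -> Basic YWRtaW46bXktdG9rZW4=
```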
## Complete API Endpoints

### Authentication

#### Login
- **POST** `/auth/login`
- **Description:** Get an ETAPI token based on the password
- **Security:** None (public endpoint)
- **Request Body:**
  ```json
  {
    "password": "string"
  }
  ```
- **Responses:**
  - `201`: Auth token created
  - `429`: Client IP blacklisted (too many failed attempts)

---

### Application Information

#### Get App Info
- **GET** `/app-info`
- **Description:** Get application information
- **Response:**
  ```json
  {
    "appVersion": "0.91.0",
    "dbVersion": 231,
    "syncVersion": 25,
    "buildDate": "2022-02-09T22:52:36+01:00",
    "buildRevision": "23daaa2387a0655685377f0a541d154aeec2aae8",
    "dataDirectory": "/home/user/data",
    "clipperProtocolVersion": "1.0",
    "utcDateTime": "2022-03-07T21:54:25.277Z"
  }
  ```

#### Get Metrics
- **GET** `/etapi/metrics`
- **Description:** Get Prometheus-format metrics for monitoring
- **Query Parameters:**
  - `format`: `json` or `prometheus` (default: prometheus)
- **Response:** Metrics data including note counts, db stats, etc.

---

### Notes Management

#### Create Note
- **POST** `/create-note`
- **Description:** Create a note and place it into the note tree
- **Request Body:**
  ```json
  {
    "parentNoteId": "root",
    "title": "My Note",
    "type": "text",
    "mime": "text/html",
    "content": "<p>Hello World</p>",
    "notePosition": 10,
    "prefix": "",
    "isExpanded": false,
    "noteId": "customId123",
    "branchId": "customBranchId",
    "utcDateCreated": "2021-12-31 19:18:11.930Z",
    "utcDateModified": "2021-12-31 19:18:11.930Z"
  }
  ```
- **Required Fields:** `parentNoteId`, `title`, `type`, `content`
- **Optional Fields:** `notePosition`, `prefix`, `isExpanded`, `noteId`, `branchId`, timestamps
- **Note Types:**
  - `text` - Rich text notes
  - `code` - Code notes (requires `mime`)
  - `file` - File attachments (requires `mime`)
  - `image` - Image notes (requires `mime`)
  - `search` - Saved search
  - `book` - Book/container note
  - `relationMap` - Relation map
  - `render` - Render note
  - `noteMap` - Note map
  - `mermaid` - Mermaid diagrams
  - `webView` - Web view
  - `shortcut` - Shortcut
  - `doc` - Document
  - `contentWidget` - Content widget
  - `launcher` - Launcher
  - `canvas` - Canvas note
- **Response:** `201` with `NoteWithBranch` object

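A small helper sketch for assembling the minimal `/create-note` body (the helper name and defaults are ours); it enforces the rule above that `code`/`file`/`image` notes also need a `mime`:

```python
def make_create_note_payload(parent_note_id, title, content, note_type="text", mime=None):
    # Only parentNoteId, title, type and content are required;
    # code/file/image notes additionally need a mime type.
    payload = {
        "parentNoteId": parent_note_id,
        "title": title,
        "type": note_type,
        "content": content,
    }
    if note_type in ("code", "file", "image"):
        if mime is None:
            raise ValueError(f"note type {note_type!r} requires a mime")
        payload["mime"] = mime
    return payload

print(make_create_note_payload("root", "My Note", "<p>Hello World</p>"))
```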
#### Search Notes
- **GET** `/notes`
- **Description:** Search notes using query syntax
- **Query Parameters:**
  - `search` (required): Search query string
  - `ancestorNoteId`: Search in subtree only
  - `fastSearch`: Boolean for fast search mode
  - `includeArchivedNotes`: Include archived notes (default: false)
  - `orderBy`: Field to order by (e.g., `title`, `dateModified`)
  - `orderDirection`: `asc` or `desc`
  - `limit`: Maximum results (default: 10)
  - `debug`: Enable debug info
- **Response:** Array of note objects

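Building the search call is just query-string assembly; a sketch (the base URL is illustrative, and extra keyword arguments map onto the optional parameters listed above):

```python
from urllib.parse import urlencode

def search_notes_url(base, search, **params):
    # "search" is the only required parameter; extras become the
    # optional ones (limit, orderBy, orderDirection, ...).
    return f"{base}/notes?" + urlencode({"search": search, **params})

print(search_notes_url("http://localhost:8080/etapi", "#book", limit=5))
# -> http://localhost:8080/etapi/notes?search=%23book&limit=5
```

Note that `urlencode` percent-escapes characters like `#`, which Trilium's label-search syntax uses heavily.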
|
#### Get Note
|
||||||
|
- **GET** `/notes/{noteId}`
|
||||||
|
- **Description:** Get note metadata by ID
|
||||||
|
- **Path Parameters:**
|
||||||
|
- `noteId`: Note ID
|
||||||
|
- **Response:** Note object with metadata
|
||||||
|
|
||||||
|
#### Get Note Content
|
||||||
|
- **GET** `/notes/{noteId}/content`
|
||||||
|
- **Description:** Get note content (HTML/text for text notes, binary for files/images)
|
||||||
|
- **Path Parameters:**
|
||||||
|
- `noteId`: Note ID
|
||||||
|
- **Response:** Note content (content-type varies by note type)
|
||||||
|
|
||||||
|
#### Update Note Content
|
||||||
|
- **PUT** `/notes/{noteId}/content`
|
||||||
|
- **Description:** Update note content
|
||||||
|
- **Path Parameters:**
|
||||||
|
- `noteId`: Note ID
|
||||||
|
- **Request Body:** Raw content (HTML for text notes, binary for files)
|
||||||
|
- **Response:** `204` No Content
|
||||||
|
|
||||||
|
#### Update Note Metadata
|
||||||
|
- **PATCH** `/notes/{noteId}`
|
||||||
|
- **Description:** Update note metadata (title, type, mime, etc.)
|
||||||
|
- **Path Parameters:**
|
||||||
|
- `noteId`: Note ID
|
||||||
|
- **Request Body:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"title": "Updated Title",
|
||||||
|
"type": "text",
|
||||||
|
"mime": "text/html"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- **Response:** `200` with updated note object
|
||||||
|
|
||||||
|
#### Delete Note

- **DELETE** `/notes/{noteId}`
- **Description:** Delete note and all its branches
- **Path Parameters:**
  - `noteId`: Note ID
- **Response:** `204` No Content
- **Note:** Deletes all clones/branches of the note

#### Export Note

- **GET** `/notes/{noteId}/export`
- **Description:** Export note as ZIP file (with optional subtree)
- **Path Parameters:**
  - `noteId`: Note ID (use "root" to export entire tree)
- **Query Parameters:**
  - `format`: `html` or `markdown`/`md`
- **Response:** ZIP file download

---

### Branches Management

Branches represent note clones/placements in the tree. A single note can exist in multiple locations via different branches.

#### Create Branch

- **POST** `/branches`
- **Description:** Create a branch (clone a note to another location)
- **Request Body:**

```json
{
  "noteId": "existingNoteId",
  "parentNoteId": "targetParentId",
  "prefix": "Branch Prefix",
  "notePosition": 10,
  "isExpanded": false,
  "branchId": "customBranchId"
}
```

- **Required Fields:** `noteId`, `parentNoteId`
- **Response:** `201` with Branch object
#### Get Branch

- **GET** `/branches/{branchId}`
- **Description:** Get branch by ID
- **Path Parameters:**
  - `branchId`: Branch ID
- **Response:** Branch object

#### Update Branch

- **PATCH** `/branches/{branchId}`
- **Description:** Update branch (prefix, notePosition)
- **Path Parameters:**
  - `branchId`: Branch ID
- **Request Body:**

```json
{
  "prefix": "New Prefix",
  "notePosition": 20,
  "isExpanded": true
}
```

- **Response:** `200` with updated branch
- **Note:** Only `prefix`, `notePosition`, and `isExpanded` can be updated. For other properties, delete and recreate.

#### Set Branch Prefix

- **PATCH** `/branches/{branchId}/set-prefix`
- **Description:** Set branch prefix
- **Path Parameters:**
  - `branchId`: Branch ID
- **Request Body:**

```json
{
  "prefix": "New Prefix"
}
```

#### Move Branch to Parent

- **POST** `/branches/{branchId}/set-note-to-parent`
- **Description:** Move branch to a different parent
- **Path Parameters:**
  - `branchId`: Branch ID
- **Request Body:**

```json
{
  "parentNoteId": "newParentId"
}
```

#### Delete Branch

- **DELETE** `/branches/{branchId}`
- **Description:** Delete branch (removes note from this tree location)
- **Path Parameters:**
  - `branchId`: Branch ID
- **Response:** `204` No Content
- **Note:** If this is the last branch of the note, the note itself is deleted

#### Refresh Note Ordering

- **PATCH** `/refresh-note-ordering/{parentNoteId}`
- **Description:** Push notePosition changes to connected clients
- **Path Parameters:**
  - `parentNoteId`: Parent note ID
- **Note:** Call this after updating branch notePositions to sync changes to clients

---
### Attributes Management

Attributes include labels (key-value metadata) and relations (links between notes).

#### Create Attribute

- **POST** `/attributes`
- **Description:** Create an attribute
- **Request Body:**

```json
{
  "noteId": "targetNoteId",
  "type": "label",
  "name": "priority",
  "value": "high",
  "position": 10,
  "isInheritable": false,
  "attributeId": "customAttributeId"
}
```

- **Attribute Types:**
  - `label`: Key-value metadata
  - `relation`: Link to another note (value is target noteId)
- **Required Fields:** `noteId`, `type`, `name`
- **Optional Fields:** `value`, `position`, `isInheritable`, `attributeId`
- **Response:** `201` with Attribute object

#### Create Attribute for Note

- **POST** `/notes/{noteId}/attributes`
- **Description:** Create attribute for specific note
- **Path Parameters:**
  - `noteId`: Note ID
- **Request Body:** Same as Create Attribute (`noteId` not required)

#### Get Attribute

- **GET** `/attributes/{attributeId}`
- **Description:** Get attribute by ID
- **Path Parameters:**
  - `attributeId`: Attribute ID
- **Response:** Attribute object

#### Get Note Attributes

- **GET** `/notes/{noteId}/attributes`
- **Description:** Get all attributes for a note
- **Path Parameters:**
  - `noteId`: Note ID
- **Response:** Array of attribute objects

#### Update Attribute

- **PATCH** `/attributes/{attributeId}`
- **Description:** Update attribute (name, value, position)
- **Path Parameters:**
  - `attributeId`: Attribute ID
- **Request Body:**

```json
{
  "name": "newName",
  "value": "newValue",
  "position": 20,
  "isInheritable": true
}
```

- **Response:** `200` with updated attribute

#### Delete Attribute

- **DELETE** `/attributes/{attributeId}`
- **Description:** Delete attribute
- **Path Parameters:**
  - `attributeId`: Attribute ID
- **Response:** `204` No Content

---
### Attachments Management

#### Create Attachment

- **POST** `/attachments`
- **Description:** Create attachment for a note
- **Request Body:** Multipart form data with file

```json
{
  "ownerId": "noteId",
  "role": "image",
  "mime": "image/png",
  "title": "Screenshot",
  "position": 10,
  "attachmentId": "customAttachmentId"
}
```

- **Required Fields:** `ownerId`, file data
- **Optional Fields:** `role`, `mime`, `title`, `position`, `attachmentId`
- **Response:** `201` with Attachment object

#### Create Attachment for Note

- **POST** `/notes/{noteId}/attachments`
- **Description:** Create attachment (alternative endpoint)
- **Path Parameters:**
  - `noteId`: Note ID
- **Request Body:** Same as Create Attachment (`ownerId` not required)

#### Get Attachment

- **GET** `/attachments/{attachmentId}`
- **Description:** Get attachment metadata
- **Path Parameters:**
  - `attachmentId`: Attachment ID
- **Response:** Attachment object

#### Get Attachment Content

- **GET** `/attachments/{attachmentId}/content`
- **Description:** Get attachment binary content
- **Path Parameters:**
  - `attachmentId`: Attachment ID
- **Response:** Binary content with appropriate MIME type

#### Get Note Attachments

- **GET** `/notes/{noteId}/attachments`
- **Description:** Get all attachments for a note
- **Path Parameters:**
  - `noteId`: Note ID
- **Response:** Array of attachment objects

#### Update Attachment Content

- **PUT** `/attachments/{attachmentId}/content`
- **Description:** Update attachment binary content
- **Path Parameters:**
  - `attachmentId`: Attachment ID
- **Request Body:** Binary file data
- **Response:** `204` No Content

#### Update Attachment Metadata

- **PATCH** `/attachments/{attachmentId}`
- **Description:** Update attachment metadata
- **Path Parameters:**
  - `attachmentId`: Attachment ID
- **Request Body:**

```json
{
  "title": "New Title",
  "role": "image",
  "mime": "image/jpeg",
  "position": 20
}
```

- **Response:** `200` with updated attachment

#### Delete Attachment

- **DELETE** `/attachments/{attachmentId}`
- **Description:** Delete attachment
- **Path Parameters:**
  - `attachmentId`: Attachment ID
- **Response:** `204` No Content

---
### Special Purpose Endpoints

#### Get Inbox Note

- **GET** `/inbox/{date}`
- **Description:** Get or create inbox note for specific date
- **Path Parameters:**
  - `date`: Date in format `YYYY-MM-DD`
- **Response:** Note object
- **Behavior:**
  - Returns fixed inbox note (marked with `#inbox` label) if configured
  - Otherwise returns/creates day note in journal for the specified date

#### Get Day Note

- **GET** `/calendar/days/{date}`
- **Description:** Get or create day note
- **Path Parameters:**
  - `date`: Date in format `YYYY-MM-DD` (e.g., `2022-12-31`)
- **Response:** Note object
- **Note:** Creates note if it doesn't exist

#### Get Month Note

- **GET** `/calendar/months/{month}`
- **Description:** Get or create month note
- **Path Parameters:**
  - `month`: Month in format `YYYY-MM` (e.g., `2022-12`)
- **Response:** Note object
- **Note:** Creates note if it doesn't exist

#### Get Year Note

- **GET** `/calendar/years/{year}`
- **Description:** Get or create year note
- **Path Parameters:**
  - `year`: Year in format `YYYY` (e.g., `2022`)
- **Response:** Note object
- **Note:** Creates note if it doesn't exist

---

### Backup

#### Create Backup

- **PUT** `/backup/{backupName}`
- **Description:** Create a database backup
- **Path Parameters:**
  - `backupName`: Backup filename (without extension)
- **Example:** `PUT /backup/now` creates `backup-now.db`
- **Response:** `204` No Content

---
## Data Types and Schemas

### Common Field Types

- **EntityId**: 12-character alphanumeric string (e.g., `evnnmvHTCgIn`)
- **LocalDateTime**: `YYYY-MM-DD HH:mm:ss.SSS±ZZZZ` (e.g., `2021-12-31 20:18:11.930+0100`)
- **UtcDateTime**: `YYYY-MM-DD HH:mm:ss.SSSZ` (e.g., `2021-12-31 19:18:11.930Z`)

### Note Position

- Normal ordering: 10, 20, 30, 40...
- First position: use value < 10 (e.g., 5)
- Last position: use large value (e.g., 1000000)
- Between existing: use value between their positions
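The positioning rules above reduce to simple arithmetic. A small helper, a sketch rather than anything ETAPI provides, can pick a `notePosition` for inserting at the front, at the end, or between two siblings:

```python
from typing import Optional

def position_between(before: Optional[int], after: Optional[int]) -> int:
    """Pick a notePosition given the neighbours' positions (None = edge)."""
    if before is None and after is None:
        return 10                    # first child of an empty parent
    if before is None:
        return after - 5             # before the first sibling: any value < its position
    if after is None:
        return before + 10           # after the last sibling: keep the 10-step gap
    return (before + after) // 2     # midpoint between two siblings

# Siblings at 10, 20, 30:
print(position_between(10, 20))      # between the first two -> 15
print(position_between(30, None))    # append at the end -> 40
```

Repeated midpoint insertion eventually exhausts the gap between two integers, which is why the documentation recommends leaving gaps of 10.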
### Branch Prefix

Branch-specific title prefix displayed in the tree. Useful when the same note appears in multiple locations with slightly different context.

---

## Error Responses

All endpoints may return these error responses:

### Standard Error Object

```json
{
  "status": 400,
  "code": "NOTE_IS_PROTECTED",
  "message": "Note 'evnnmvHTCgIn' is protected and cannot be modified through ETAPI"
}
```

### Common HTTP Status Codes

- `200`: Success
- `201`: Resource created
- `204`: Success (no content)
- `400`: Bad request (validation error)
- `401`: Unauthorized (invalid token)
- `404`: Not found
- `429`: Too many requests (rate limited/blacklisted)
- `500`: Internal server error

### Common Error Codes

- `NOTE_IS_PROTECTED`: Protected note cannot be modified
- `INVALID_TOKEN`: Invalid or expired ETAPI token
- `VALIDATION_ERROR`: Request validation failed
- `NOT_FOUND`: Resource not found
- `RATE_LIMITED`: Too many requests
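A client can surface these error codes by mapping the standard error object onto an exception. The following is one possible sketch, not an official client; the exception class and fallback values are illustrative:

```python
import json

class EtapiError(Exception):
    """Carries the status/code/message fields of the standard error object."""
    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status, self.code, self.message = status, code, message

def raise_for_etapi_error(status: int, body: str) -> None:
    """Raise EtapiError for non-2xx responses; pass through 2xx silently."""
    if 200 <= status < 300:
        return
    try:
        err = json.loads(body)
    except ValueError:
        err = {}  # non-JSON error body: fall back to raw values
    raise EtapiError(err.get("status", status),
                     err.get("code", "UNKNOWN"),
                     err.get("message", body))

try:
    raise_for_etapi_error(400, '{"status": 400, "code": "NOTE_IS_PROTECTED", '
                               '"message": "Note is protected"}')
except EtapiError as e:
    print(e.code)  # NOTE_IS_PROTECTED
```

Checking `e.code` rather than string-matching the message keeps the handling stable across Trilium versions.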
---

## Search Query Syntax

The `/notes` search endpoint supports Trilium's query language:

### Basic Search

```
python        # Search in title and content
#todo         # Find notes with label "todo"
~project      # Find notes with relation "project"
```

### Advanced Operators

```
note.title =* "meeting"            # Title contains "meeting"
note.title %= ".*2022.*"           # Regex in title
#priority = "high"                 # Label with specific value
~template = "someNoteId"           # Relation to specific note
#created >= MONTH-1                # Created in last month
note.dateModified >= "2022-01-01"  # Modified after date
```

### Combining Queries

```
#todo AND #urgent                # Both labels
#work OR #personal               # Either label
#project AND note.title =* "Q1"  # Label AND title condition
```

### Hierarchical Queries

```
note.parents.title = "Work"       # Direct parent title
note.ancestors.title = "Archive"  # Any ancestor title
note.children.title =* "Chapter"  # Direct children
```

See Trilium Search Documentation for complete syntax.
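Queries like these must be URL-encoded when passed to the search endpoint, since `#`, `~`, and spaces are all significant in URLs. A sketch of building the request URL, assuming `search`, `fastSearch`, and `limit` query parameters on `/notes` (parameter names here are an assumption, not quoted from the spec):

```python
from typing import Optional
from urllib.parse import urlencode

def build_search_url(base: str, query: str, fast: bool = False,
                     limit: Optional[int] = None) -> str:
    """Build a /notes search URL; operators like #, ~, =* get percent-encoded."""
    params = {"search": query}
    if fast:
        params["fastSearch"] = "true"  # assumed flag for fast (title-only) search
    if limit is not None:
        params["limit"] = str(limit)
    return f"{base}/notes?{urlencode(params)}"

url = build_search_url("http://localhost:8080/etapi", "#todo AND #urgent", limit=10)
print(url)  # http://localhost:8080/etapi/notes?search=%23todo+AND+%23urgent&limit=10
```

`urlencode` turns `#` into `%23` and spaces into `+`, so the query survives the round trip intact.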
---

## Rate Limiting

- Failed authentication attempts can result in IP blacklist
- Blacklisted IPs receive `429` response
- Wait period required before retry
- Use valid tokens to avoid blacklisting

---

## Configuration Notes

### Upload Size Limits

- Default: 250MB
- Disable limit: Set `TRILIUM_NO_UPLOAD_LIMIT=true`
- Custom limit: Set `MAX_ALLOWED_FILE_SIZE_MB=<size>`

### Network Configuration

ETAPI accessible through:

- Local interface: `http://localhost:8080/etapi`
- Network interface: Configure reverse proxy (nginx/Apache)
- SSL/TLS: Recommended for production use

---

## Best Practices

1. **Always use ETAPI tokens** (not passwords) for authentication
2. **Store tokens securely** - they provide full access to your Trilium instance
3. **Use notePosition strategically** - leave gaps (10, 20, 30) for easy insertion
4. **Handle branches carefully** - deleting last branch deletes the note
5. **Check for protected notes** - they cannot be modified via ETAPI
6. **Implement rate limiting** in your client to avoid blacklisting
7. **Use search efficiently** - leverage fastSearch for better performance
8. **Call refresh-note-ordering** after bulk branch position updates
9. **Validate data before submission** - reduce error responses
10. **Handle errors gracefully** - check status codes and error messages
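Client-side rate limiting (best practice 6) usually means backing off on `429` rather than hammering the server into the blacklist. A minimal exponential-backoff sketch, where the `send` callable stands in for whatever HTTP call your client makes:

```python
import time

def with_backoff(send, retries: int = 3, base_delay: float = 1.0):
    """Call send() -> (status, body); retry on 429, doubling the delay each time."""
    for attempt in range(retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < retries:
            time.sleep(base_delay * (2 ** attempt))
    return status, body  # still rate limited after all retries

# Simulated server: rate-limited twice, then succeeds.
responses = iter([(429, ""), (429, ""), (200, "ok")])
status, body = with_backoff(lambda: next(responses), base_delay=0.0)
print(status, body)  # 200 ok
```

For real use the delays would be on the order of seconds; the zero delay here only keeps the demonstration fast.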
---

## Example Workflows

### Create a Note with Attributes

```bash
# 1. Create note
NOTE_RESPONSE=$(curl -X POST "$SERVER/etapi/create-note" \
  -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "parentNoteId": "root",
    "title": "Project TODO",
    "type": "text",
    "content": "<p>Task list</p>"
  }')

NOTE_ID=$(echo $NOTE_RESPONSE | jq -r '.note.noteId')

# 2. Add label
curl -X POST "$SERVER/etapi/attributes" \
  -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"noteId\": \"$NOTE_ID\",
    \"type\": \"label\",
    \"name\": \"priority\",
    \"value\": \"high\"
  }"
```

### Clone Note to Multiple Locations

```bash
# Clone note to another parent
curl -X POST "$SERVER/etapi/branches" \
  -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "noteId": "existingNoteId",
    "parentNoteId": "anotherParentId",
    "prefix": "Reference: "
  }'
```

### Daily Journal Entry

```bash
# Get or create today's note
TODAY=$(date +%Y-%m-%d)
curl "$SERVER/etapi/calendar/days/$TODAY" \
  -H "Authorization: $TOKEN"
```

---

## Client Libraries

### Python

- **trilium-py**: Full-featured client with extended functionality
- **PyTrilium**: Lightweight wrapper matching OpenAPI spec
- **trilium-alchemy**: SQLAlchemy-style SDK with CLI toolkit

### Node.js

- **trilium-etapi**: TypeScript wrapper with type safety

### Other Tools

- **trilium-mcp-server**: Model Context Protocol server for LLMs
- **openapi-mcp-generator**: Generate MCP servers from OpenAPI specs

---

## Version Compatibility

- ETAPI introduced: Trilium v0.50
- Basic Auth support: v0.56
- Bearer token format: v0.93.0
- TriliumNext fork: Compatible with Trilium API, ongoing development

Check `/app-info` endpoint for version details of your instance.

---

## Additional Resources

- **Official Documentation**: https://docs.triliumnotes.org/
- **GitHub Repository**: https://github.com/TriliumNext/Trilium
- **Search Syntax Guide**: https://github.com/zadam/trilium/wiki/Search
- **Community Resources**: https://github.com/Nriver/awesome-trilium

---

**License:** Apache 2.0
**Maintainer:** TriliumNext Community
**Contact:** https://github.com/TriliumNext/Trilium/discussions
Binary file not shown.
sandbox/Dockerfile (new file, 30 lines)
@@ -0,0 +1,30 @@
FROM python:3.11-slim

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    bash \
    coreutils \
    && rm -rf /var/lib/apt/lists/*

# Install common Python packages for data analysis and computation
RUN pip install --no-cache-dir \
    numpy \
    pandas \
    requests \
    matplotlib \
    scipy

# Create non-root user for security
RUN useradd -m -u 1000 sandbox

# Create execution directory
RUN mkdir /executions && chown sandbox:sandbox /executions

# Switch to non-root user
USER sandbox

# Set working directory
WORKDIR /executions

# Keep container running
CMD ["tail", "-f", "/dev/null"]
test_ollama_parser.py (new file, 103 lines)
@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
Test OllamaAdapter XML parsing with real malformed examples.
"""

import asyncio
import sys
sys.path.insert(0, '/home/serversdown/project-lyra/cortex')

from autonomy.tools.adapters.ollama_adapter import OllamaAdapter


async def test_parser():
    adapter = OllamaAdapter()

    # Test cases with actual malformed XML we've seen
    test_cases = [
        {
            "name": "Malformed closing tag 1",
            "xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(50 / 2)</code>
<reason>To calculate the result of dividing 50 by 2.</reason>
</arguments>
</olith>"""
        },
        {
            "name": "Malformed closing tag 2",
            "xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(60 / 4)</code>
</arguments>
<reason>To calculate 60 divided by 4 using Python.</reason>
</LTRB"""
        },
        {
            "name": "Partial XML (missing opening)",
            "xml": """<arguments>
<language>python</language>
<code>result = 35 / 7; result</code>
</arguments>
<reason>To calculate the division of 35 by 7 using Python.</reason>
</tool_call>"""
        },
        {
            "name": "Correct XML",
            "xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(100 / 4)</code>
<reason>Calculate division</reason>
</arguments>
</tool_call>"""
        },
        {
            "name": "XML with surrounding text",
            "xml": """Let me help you with that.

<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(20 / 4)</code>
<reason>Calculate the result</reason>
</arguments>
</tool_call>

The result will be shown above."""
        }
    ]

    print("=" * 80)
    print("Testing OllamaAdapter XML Parsing")
    print("=" * 80)

    for test in test_cases:
        print(f"\nTest: {test['name']}")
        print("-" * 80)
        print(f"Input XML:\n{test['xml'][:200]}{'...' if len(test['xml']) > 200 else ''}")
        print("-" * 80)

        try:
            result = await adapter.parse_response(test['xml'])
            print(f"✅ Parsed successfully!")
            print(f"   Content: {result.get('content', '')[:100]}")
            print(f"   Tool calls found: {len(result.get('tool_calls') or [])}")

            if result.get('tool_calls'):
                for idx, tc in enumerate(result['tool_calls']):
                    print(f"   Tool {idx + 1}: {tc.get('name')} with args: {tc.get('arguments')}")
        except Exception as e:
            print(f"❌ Error: {e}")

        print()


if __name__ == "__main__":
    asyncio.run(test_parser())
test_thinking_stream.html (new file, 286 lines)
@@ -0,0 +1,286 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Lyra - Show Your Work</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            background: #1a1a1a;
            color: #e0e0e0;
            height: 100vh;
            display: flex;
            flex-direction: column;
        }

        .container {
            display: flex;
            height: 100%;
            overflow: hidden;
        }

        .panel {
            flex: 1;
            display: flex;
            flex-direction: column;
            padding: 20px;
            overflow: hidden;
        }

        .panel-header {
            font-size: 18px;
            font-weight: bold;
            margin-bottom: 15px;
            padding-bottom: 10px;
            border-bottom: 2px solid #333;
        }

        .chat-panel {
            border-right: 1px solid #333;
        }

        .thinking-panel {
            background: #0d0d0d;
        }

        .messages, .thinking-output {
            flex: 1;
            overflow-y: auto;
            padding: 10px;
            background: #222;
            border-radius: 8px;
            margin-bottom: 15px;
        }

        .message {
            margin-bottom: 15px;
            padding: 10px;
            border-radius: 6px;
            line-height: 1.5;
        }

        .user-message {
            background: #1e3a5f;
            align-self: flex-end;
        }

        .assistant-message {
            background: #2d2d2d;
        }

        .thinking-event {
            margin-bottom: 10px;
            padding: 8px 12px;
            border-radius: 4px;
            font-size: 14px;
            font-family: 'Courier New', monospace;
            animation: fadeIn 0.3s;
        }

        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(-5px); }
            to { opacity: 1; transform: translateY(0); }
        }

        .event-thinking { background: #1a3a1a; color: #90ee90; }
        .event-tool_call { background: #3a2a1a; color: #ffa500; }
        .event-tool_result { background: #1a2a3a; color: #87ceeb; }
        .event-done { background: #2a1a3a; color: #da70d6; }
        .event-error { background: #3a1a1a; color: #ff6b6b; }

        .input-area {
            display: flex;
            gap: 10px;
        }

        input {
            flex: 1;
            padding: 12px;
            background: #2d2d2d;
            border: 1px solid #444;
            border-radius: 6px;
            color: #e0e0e0;
            font-size: 14px;
        }

        button {
            padding: 12px 24px;
            background: #4a7c59;
            border: none;
            border-radius: 6px;
            color: white;
            cursor: pointer;
            font-weight: bold;
            transition: background 0.2s;
        }

        button:hover {
            background: #5a9c69;
        }

        button:disabled {
            background: #333;
            cursor: not-allowed;
        }

        .status {
            padding: 8px;
            text-align: center;
            font-size: 12px;
            color: #888;
        }

        .status.connected { color: #90ee90; }
        .status.disconnected { color: #ff6b6b; }

        /* Mobile responsive */
        @media (max-width: 768px) {
            .container {
                flex-direction: column;
            }

            .chat-panel {
                border-right: none;
                border-bottom: 1px solid #333;
            }
        }
    </style>
</head>
<body>
    <div class="container">
        <!-- Chat Panel -->
        <div class="panel chat-panel">
            <div class="panel-header">💬 Chat</div>
            <div class="messages" id="messages"></div>
            <div class="input-area">
                <input
                    type="text"
                    id="userInput"
                    placeholder="Ask something that requires tools... (e.g., 'Calculate 50/2 using Python')"
                    onkeypress="if(event.key==='Enter') sendMessage()"
                >
                <button onclick="sendMessage()" id="sendBtn">Send</button>
            </div>
        </div>

        <!-- Thinking Panel -->
        <div class="panel thinking-panel">
            <div class="panel-header">🧠 Show Your Work</div>
            <div class="thinking-output" id="thinking"></div>
            <div class="status" id="status">Not connected</div>
        </div>
    </div>
<script>
|
||||||
|
const SESSION_ID = 'thinking-demo-' + Date.now();
|
||||||
|
let eventSource = null;
|
||||||
|
|
||||||
|
// Connect to thinking stream
|
||||||
|
function connectThinkingStream() {
|
||||||
|
if (eventSource) {
|
||||||
|
eventSource.close();
|
||||||
|
}
|
||||||
|
|
||||||
|
const url = `http://localhost:7081/stream/thinking/${SESSION_ID}`;
|
||||||
|
eventSource = new EventSource(url);
|
||||||
|
|
||||||
|
eventSource.onopen = () => {
|
||||||
|
document.getElementById('status').textContent = '🟢 Connected to thinking stream';
|
||||||
|
document.getElementById('status').className = 'status connected';
|
||||||
|
};
|
||||||
|
|
||||||
|
eventSource.onmessage = (event) => {
|
||||||
|
try {
|
||||||
|
const data = JSON.parse(event.data);
|
||||||
|
addThinkingEvent(data);
|
||||||
|
} catch (e) {
|
||||||
|
console.error('Failed to parse event:', e);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
eventSource.onerror = () => {
|
||||||
|
document.getElementById('status').textContent = '🔴 Disconnected from thinking stream';
|
||||||
|
document.getElementById('status').className = 'status disconnected';
|
||||||
|
setTimeout(connectThinkingStream, 2000); // Reconnect after 2s
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
function addThinkingEvent(event) {
|
||||||
|
const thinking = document.getElementById('thinking');
|
||||||
|
const eventDiv = document.createElement('div');
|
||||||
|
eventDiv.className = `thinking-event event-${event.type}`;
|
||||||
|
|
||||||
|
if (event.type === 'connected') {
|
||||||
|
eventDiv.textContent = `✓ Connected (Session: ${event.session_id})`;
|
||||||
|
} else if (event.data && event.data.message) {
|
||||||
|
                    eventDiv.textContent = event.data.message;
                } else {
                    eventDiv.textContent = JSON.stringify(event.data);
                }
                thinking.appendChild(eventDiv);
                thinking.scrollTop = thinking.scrollHeight;
            }

            async function sendMessage() {
                const input = document.getElementById('userInput');
                const message = input.value.trim();
                if (!message) return;

                // Add user message to chat
                addMessage('user', message);
                input.value = '';

                // Disable send button
                const sendBtn = document.getElementById('sendBtn');
                sendBtn.disabled = true;
                sendBtn.textContent = 'Thinking...';

                // Clear thinking panel
                document.getElementById('thinking').innerHTML = '';

                try {
                    const response = await fetch('http://localhost:7081/simple', {
                        method: 'POST',
                        headers: {
                            'Content-Type': 'application/json',
                        },
                        body: JSON.stringify({
                            session_id: SESSION_ID,
                            user_prompt: message,
                            backend: 'SECONDARY'
                        })
                    });

                    const data = await response.json();
                    addMessage('assistant', data.draft);

                } catch (error) {
                    addMessage('assistant', `Error: ${error.message}`);
                } finally {
                    sendBtn.disabled = false;
                    sendBtn.textContent = 'Send';
                }
            }

            function addMessage(role, content) {
                const messages = document.getElementById('messages');
                const messageDiv = document.createElement('div');
                messageDiv.className = `message ${role}-message`;
                messageDiv.textContent = content;
                messages.appendChild(messageDiv);
                messages.scrollTop = messages.scrollHeight;
            }

            // Connect on page load
            connectThinkingStream();
        </script>
    </body>
</html>
91
test_tools.py
Normal file
@@ -0,0 +1,91 @@
#!/usr/bin/env python3
"""
Quick test script for tool calling system.
Tests the components before full endpoint integration.
"""

import asyncio
import sys
import os

# Add cortex to path
sys.path.insert(0, '/home/serversdown/project-lyra/cortex')

# Set required env vars
os.environ['ENABLE_CODE_EXECUTION'] = 'true'
os.environ['ENABLE_WEB_SEARCH'] = 'true'
os.environ['CODE_SANDBOX_CONTAINER'] = 'lyra-code-sandbox'

from autonomy.tools.registry import get_registry
from autonomy.tools.executors.code_executor import execute_code
from autonomy.tools.executors.web_search import search_web


async def test_code_executor():
    """Test code execution in sandbox."""
    print("\n=== Testing Code Executor ===")

    result = await execute_code({
        "language": "python",
        "code": "print('Hello from sandbox!')\nprint(2 + 2)",
        "reason": "Testing sandbox execution"
    })

    print(f"Result: {result}")
    return result.get("stdout") == "Hello from sandbox!\n4\n"


async def test_web_search():
    """Test web search."""
    print("\n=== Testing Web Search ===")

    result = await search_web({
        "query": "Python programming",
        "max_results": 3
    })

    print(f"Found {result.get('count', 0)} results")
    if result.get('results'):
        print(f"First result: {result['results'][0]['title']}")
    return result.get("count", 0) > 0


async def test_registry():
    """Test tool registry."""
    print("\n=== Testing Tool Registry ===")

    registry = get_registry()
    tools = registry.get_tool_definitions()

    print(f"Registered tools: {registry.get_tool_names()}")
    print(f"Total tools: {len(tools) if tools else 0}")

    return len(tools or []) > 0


async def main():
    print("🧪 Tool System Component Tests\n")

    tests = [
        ("Tool Registry", test_registry),
        ("Code Executor", test_code_executor),
        ("Web Search", test_web_search),
    ]

    results = {}
    for name, test_func in tests:
        try:
            passed = await test_func()
            results[name] = "✅ PASS" if passed else "❌ FAIL"
        except Exception as e:
            results[name] = f"❌ ERROR: {str(e)}"

    print("\n" + "="*50)
    print("Test Results:")
    for name, result in results.items():
        print(f"  {name}: {result}")
    print("="*50)


if __name__ == "__main__":
    asyncio.run(main())