feat: Implement Trillium notes executor for searching and creating notes via ETAPI

- Added `trillium.py` for searching and creating notes with Trillium's ETAPI.
- Implemented `search_notes` and `create_note` functions with appropriate error handling and validation.

feat: Add web search functionality using DuckDuckGo

- Introduced `web_search.py` for performing web searches without API keys.
- Implemented `search_web` function with result handling and validation.

feat: Create provider-agnostic function caller for iterative tool calling

- Developed `function_caller.py` to manage LLM interactions with tools.
- Implemented iterative calling logic with error handling and tool execution.

feat: Establish a tool registry for managing available tools

- Created `registry.py` to define and manage tool availability and execution.
- Integrated feature flags for enabling/disabling tools based on environment variables.

feat: Implement event streaming for tool calling processes

- Added `stream_events.py` to manage Server-Sent Events (SSE) for tool calling.
- Enabled real-time updates during tool execution for enhanced user experience.

test: Add tests for tool calling system components

- Created `test_tools.py` to validate functionality of code execution, web search, and tool registry.
- Implemented asynchronous tests to ensure proper execution and result handling.

chore: Add Dockerfile for sandbox environment setup

- Created `Dockerfile` to set up a Python environment with necessary dependencies for code execution.

chore: Add debug regex script for testing XML parsing

- Introduced `debug_regex.py` to validate regex patterns against XML tool calls.

chore: Add HTML template for displaying thinking stream events

- Created `test_thinking_stream.html` for visualizing tool calling events in a user-friendly format.

test: Add tests for OllamaAdapter XML parsing

- Developed `test_ollama_parser.py` to validate XML parsing with various test cases, including malformed XML.
2025-12-26 03:49:20 -05:00
parent f1471cde84
commit 64429b19e6
37 changed files with 3238 additions and 23 deletions
--- a/THINKING_STREAM.md
+++ b/THINKING_STREAM.md
@@ -0,0 +1,163 @@
+# "Show Your Work" - Thinking Stream Feature
+
+This feature adds a real-time Server-Sent Events (SSE) stream that broadcasts the internal thinking process during tool-calling operations.
+
+## What It Does
+
+When Lyra uses tools to answer a question, you can watch her "think" in real time through a parallel stream:
+
+- 🤔 **Thinking** - When she's planning what to do
+- 🔧 **Tool Calls** - When she decides to use a tool
+- 📊 **Tool Results** - The results from tool execution
+- ✅ **Done** - When she has the final answer
+- ❌ **Errors** - If something goes wrong
+
+## How To Use
+
+### 1. Open the SSE Stream
+
+Connect to the thinking stream for a session:
+
+```bash
+curl -N http://localhost:7081/stream/thinking/{session_id}
+```
+
+The stream will send Server-Sent Events in this format:
+
+```
+data: {"type": "thinking", "data": {"message": "🤔 Thinking... (iteration 1/5)"}}
+
+data: {"type": "tool_call", "data": {"tool": "execute_code", "args": {...}, "message": "🔧 Using tool: execute_code"}}
+
+data: {"type": "tool_result", "data": {"tool": "execute_code", "result": {...}, "message": "📊 Result: ..."}}
+
+data: {"type": "done", "data": {"message": "✅ Complete!", "final_answer": "The result is..."}}
+```
+
+### 2. Send a Request
+
+In parallel, send a request to `/simple` with the same `session_id`:
+
+```bash
+curl -X POST http://localhost:7081/simple \
+  -H "Content-Type: application/json" \
+  -d '{
+    "session_id": "your-session-id",
+    "user_prompt": "Calculate 50/2 using Python",
+    "backend": "SECONDARY"
+  }'
+```
+
+### 3. Watch the Stream
+
+As the request is processed, the stream shows real-time events for:
+- Each thinking iteration
+- Every tool call being made
+- The results from each tool
+- The final answer
+
+## Event Types
+
+| Event Type | Description | Data Fields |
+|-----------|-------------|-------------|
+| `connected` | Initial connection | `session_id` |
+| `thinking` | LLM is processing | `message` |
+| `tool_call` | Tool is being invoked | `tool`, `args`, `message` |
+| `tool_result` | Tool execution completed | `tool`, `result`, `message` |
+| `done` | Process complete | `message`, `final_answer` |
+| `error` | Something went wrong | `message` |
+
+## Demo Page
+
+A demo HTML page is included at [test_thinking_stream.html](../test_thinking_stream.html):
+
+```bash
+# Serve the demo page
+python3 -m http.server 8000
+```
+
+Then open http://localhost:8000/test_thinking_stream.html in your browser.
+
+The demo shows:
+- **Left panel**: Chat interface
+- **Right panel**: Real-time thinking stream
+- **Mobile**: Swipe between panels
+
+## Architecture
+
+### Components
+
+1. **ToolStreamManager** (`autonomy/tools/stream_events.py`)
+   - Manages SSE subscriptions per session
+   - Broadcasts events to all connected clients
+   - Handles automatic cleanup
+
+2. **FunctionCaller** (`autonomy/tools/function_caller.py`)
+   - Enhanced with event emission at each step
+   - Checks for active subscribers before emitting
+   - Passes `session_id` through the call chain
+
+3. **SSE Endpoint** (`/stream/thinking/{session_id}`)
+   - FastAPI streaming response
+   - 30-second keepalive for connection maintenance
+   - Automatic reconnection on client side
+
+### Event Flow
+
+```
+Client                 SSE Endpoint           FunctionCaller          Tools
+  |                         |                         |                  |
+  |--- Connect SSE -------->|                         |                  |
+  |<-- connected -----------|                         |                  |
+  |                         |                         |                  |
+  |--- POST /simple ------->|                         |                  |
+  |                         |                         |                  |
+  |                         |<-- emit("thinking") ----|                  |
+  |<-- thinking ------------|                         |                  |
+  |                         |                         |                  |
+  |                         |<-- emit("tool_call") ---|                  |
+  |<-- tool_call -----------|                         |                  |
+  |                         |                         |-- execute ------>|
+  |                         |                         |<-- result -------|
+  |                         |<-- emit("tool_result")--|                  |
+  |<-- tool_result ---------|                         |                  |
+  |                         |                         |                  |
+  |                         |<-- emit("done") --------|                  |
+  |<-- done ----------------|                         |                  |
+  |                         |                         |                  |
+```
+
+## Configuration
+
+No additional configuration is needed. Events are streamed when both of the following hold:
+1. `STANDARD_MODE_ENABLE_TOOLS=true` (already set)
+2. A client connects to the SSE stream BEFORE sending the request, because `FunctionCaller` only emits events while a subscriber is active
+
+## Example Output
+
+```
+🟢 Connected to thinking stream
+✓ Connected (Session: thinking-demo-1735177234567)
+🤔 Thinking... (iteration 1/5)
+🔧 Using tool: execute_code
+📊 Result: {'stdout': '12.0\n', 'stderr': '', 'exit_code': 0, 'execution_time': 0.04}
+🤔 Thinking... (iteration 2/5)
+✅ Complete!
+```
+
+## Use Cases
+
+- **Debugging**: See exactly what tools are being called and why
+- **Transparency**: Show users what the AI is doing behind the scenes
+- **Education**: Learn how the system breaks down complex tasks
+- **UI Enhancement**: Create engaging "thinking" animations
+- **Mobile App**: Separate tab for "Show Your Work" view
+
+## Future Enhancements
+
+Potential additions:
+- Token usage per iteration
+- Estimated time remaining
+- Tool execution duration
+- Intermediate reasoning steps
+- Visual progress indicators
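
Illustrative note on the event format documented in `THINKING_STREAM.md` above (not part of this commit): a minimal client-side sketch of how the `data: {...}` lines could be decoded and followed until the run finishes. The function names `parse_sse_events` and `collect_until_done` are hypothetical, the sample payload values (including the `code` argument) are made up for the demonstration, and the sketch assumes each event arrives as a single `data:` line carrying JSON, with keepalives sent as SSE comment lines beginning with `:`.

```python
import json
from typing import Iterable, Iterator, List


def parse_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Blank lines separate events; lines starting with ":" are SSE
        # comments, which servers commonly use as keepalives.
        if not line or line.startswith(":"):
            continue
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            try:
                yield json.loads(payload)
            except json.JSONDecodeError:
                # Skip payloads that are not valid JSON instead of aborting.
                continue


def collect_until_done(lines: Iterable[str]) -> List[dict]:
    """Collect events until a terminal `done` or `error` event is seen."""
    events = []
    for event in parse_sse_events(lines):
        events.append(event)
        if event.get("type") in ("done", "error"):
            break
    return events


# Made-up sample stream shaped like the documented format.
sample = [
    ": keepalive",
    'data: {"type": "thinking", "data": {"message": "Thinking... (iteration 1/5)"}}',
    "",
    'data: {"type": "tool_call", "data": {"tool": "execute_code", '
    '"args": {"code": "print(50/2)"}, "message": "Using tool: execute_code"}}',
    "",
    'data: {"type": "done", "data": {"message": "Complete!", "final_answer": "25.0"}}',
    "",
    'data: {"type": "thinking", "data": {"message": "ignored after done"}}',
]

if __name__ == "__main__":
    for event in collect_until_done(sample):
        print(event["type"], "-", event["data"]["message"])
```

In a real client the `lines` argument would come from a streaming HTTP response rather than a list; stopping at `done` or `error` lets the client close the connection instead of waiting on keepalives.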