"Show Your Work" - Thinking Stream Feature

A real-time Server-Sent Events (SSE) stream that broadcasts the internal thinking process during tool-calling operations.

What It Does

When Lyra uses tools to answer a question, you can now watch her "think" in real time through a parallel stream:

  • 🤔 Thinking - When she's planning what to do
  • 🔧 Tool Calls - When she decides to use a tool
  • 📊 Tool Results - The results from tool execution
  • Done - When she has the final answer
  • Errors - If something goes wrong

How To Use

1. Open the SSE Stream

Connect to the thinking stream for a session:

curl -N http://localhost:7081/stream/thinking/{session_id}

The stream will send Server-Sent Events in this format:

data: {"type": "thinking", "data": {"message": "🤔 Thinking... (iteration 1/5)"}}

data: {"type": "tool_call", "data": {"tool": "execute_code", "args": {...}, "message": "🔧 Using tool: execute_code"}}

data: {"type": "tool_result", "data": {"tool": "execute_code", "result": {...}, "message": "📊 Result: ..."}}

data: {"type": "done", "data": {"message": "✅ Complete!", "final_answer": "The result is..."}}
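For programmatic consumers, each `data:` payload above is plain JSON. As an illustration only (this helper is not part of the codebase), the stream's lines can be decoded like this in Python:

```python
import json

def parse_sse_data_lines(lines):
    """Decode 'data: {...}' lines from the thinking stream into event dicts,
    skipping keepalive comments (lines starting with ':') and blank separators."""
    events = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(":"):
            continue  # keepalive comment or event separator
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):].strip()))
    return events

# Sample lines in the format shown above (payload contents are made up)
raw = [
    'data: {"type": "thinking", "data": {"message": "Thinking... (iteration 1/5)"}}',
    '',
    'data: {"type": "done", "data": {"message": "Complete!", "final_answer": "25.0"}}',
]
for event in parse_sse_data_lines(raw):
    print(event["type"])
# → thinking
#   done
```

A browser client gets the same parsing for free via `EventSource`, which delivers each `data:` payload as a message event.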

2. Send a Request

In parallel, send a request to /simple with the same session_id:

curl -X POST http://localhost:7081/simple \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "your-session-id",
    "user_prompt": "Calculate 50/2 using Python",
    "backend": "SECONDARY"
  }'

3. Watch the Stream

As the request processes, you'll see real-time events showing:

  • Each thinking iteration
  • Every tool call being made
  • The results from each tool
  • The final answer

Event Types

Event Type    Description                Data Fields
-----------   ------------------------   ---------------------
connected     Initial connection         session_id
thinking      LLM is processing          message
tool_call     Tool is being invoked      tool, args, message
tool_result   Tool execution completed   tool, result, message
done          Process complete           message, final_answer
error         Something went wrong       message
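Client code typically branches on the event type and reads the per-type data fields listed above. A hypothetical dispatcher (the field names come from the table; the helper itself is illustrative):

```python
def summarize_event(event: dict) -> str:
    """Turn one thinking-stream event into a one-line summary,
    using the per-type data fields from the event-type table."""
    etype = event.get("type")
    data = event.get("data", {})
    if etype == "connected":
        return f"connected: {data.get('session_id')}"
    if etype in ("thinking", "error"):
        return data.get("message", "")
    if etype == "tool_call":
        return f"calling {data.get('tool')} with {data.get('args')}"
    if etype == "tool_result":
        return f"{data.get('tool')} -> {data.get('result')}"
    if etype == "done":
        return f"done: {data.get('final_answer')}"
    return f"unknown event type: {etype}"

print(summarize_event({"type": "tool_call",
                       "data": {"tool": "execute_code",
                                "args": {"code": "print(50/2)"}}}))
# → calling execute_code with {'code': 'print(50/2)'}
```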

Demo Page

A demo HTML page is included at test_thinking_stream.html:

# Serve the demo page
python3 -m http.server 8000

Then open http://localhost:8000/test_thinking_stream.html in your browser.

The demo shows:

  • Left panel: Chat interface
  • Right panel: Real-time thinking stream
  • Mobile: Swipe between panels

Architecture

Components

  1. ToolStreamManager (autonomy/tools/stream_events.py)

    • Manages SSE subscriptions per session
    • Broadcasts events to all connected clients
    • Handles automatic cleanup
  2. FunctionCaller (autonomy/tools/function_caller.py)

    • Enhanced with event emission at each step
    • Checks for active subscribers before emitting
    • Passes session_id through the call chain
  3. SSE Endpoint (/stream/thinking/{session_id})

    • FastAPI streaming response
    • 30-second keepalive for connection maintenance
    • Automatic reconnection on client side
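The broadcast side can be pictured as a per-session fan-out of asyncio queues: each SSE connection subscribes a queue, and the caller pushes events to every queue for that session. The sketch below is a reconstruction under that assumption, not the actual `ToolStreamManager` source:

```python
import asyncio
from collections import defaultdict

class ToolStreamManager:
    """Illustrative per-session SSE broadcaster (the real class lives in
    autonomy/tools/stream_events.py; this reconstructs its likely shape)."""

    def __init__(self):
        # session_id -> set of subscriber queues
        self._subscribers = defaultdict(set)

    def has_subscribers(self, session_id: str) -> bool:
        # FunctionCaller checks this before emitting to avoid wasted work
        return bool(self._subscribers.get(session_id))

    async def subscribe(self, session_id: str) -> asyncio.Queue:
        queue = asyncio.Queue()
        self._subscribers[session_id].add(queue)
        return queue

    def unsubscribe(self, session_id: str, queue: asyncio.Queue) -> None:
        self._subscribers[session_id].discard(queue)
        if not self._subscribers[session_id]:
            del self._subscribers[session_id]  # automatic cleanup

    async def emit(self, session_id: str, event_type: str, data: dict) -> None:
        # Fan out the event to every connected client for this session
        for queue in self._subscribers.get(session_id, set()):
            await queue.put({"type": event_type, "data": data})
```

The SSE endpoint would then `await queue.get()` in a loop and yield each event as a `data:` line, unsubscribing when the client disconnects.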

Event Flow

Client                SSE Endpoint              FunctionCaller          Tools
  |                         |                          |                  |
  |--- Connect SSE --------->|                          |                  |
  |<-- connected -----------|                          |                  |
  |                         |                          |                  |
  |--- POST /simple -------->|                          |                  |
  |                         |                          |                  |
  |                         |<-- emit("thinking") -----|                  |
  |<-- thinking -------------|                          |                  |
  |                         |                          |                  |
  |                         |<-- emit("tool_call") ----|                  |
  |<-- tool_call ------------|                          |                  |
  |                         |                          |--- execute ----->|
  |                         |                          |<-- result -------|
  |                         |<-- emit("tool_result") --|                  |
  |<-- tool_result ----------|                          |                  |
  |                         |                          |                  |
  |                         |<-- emit("done") ---------|                  |
  |<-- done -----------------|                          |                  |
  |                         |                          |                  |
Configuration

No additional configuration is needed. The feature works automatically when:

  1. STANDARD_MODE_ENABLE_TOOLS=true (already set)
  2. A client connects to the SSE stream BEFORE sending the request

Example Output

🟢 Connected to thinking stream
✓ Connected (Session: thinking-demo-1735177234567)
🤔 Thinking... (iteration 1/5)
🔧 Using tool: execute_code
📊 Result: {'stdout': '12.0\n', 'stderr': '', 'exit_code': 0, 'execution_time': 0.04}
🤔 Thinking... (iteration 2/5)
✅ Complete!

Use Cases

  • Debugging: See exactly what tools are being called and why
  • Transparency: Show users what the AI is doing behind the scenes
  • Education: Learn how the system breaks down complex tasks
  • UI Enhancement: Create engaging "thinking" animations
  • Mobile App: Separate tab for "Show Your Work" view

Future Enhancements

Potential additions:

  • Token usage per iteration
  • Estimated time remaining
  • Tool execution duration
  • Intermediate reasoning steps
  • Visual progress indicators