feat: Implement Trillium notes executor for searching and creating notes via ETAPI

- Added `trillium.py` for searching and creating notes with Trillium's ETAPI.
- Implemented `search_notes` and `create_note` with error handling and input validation.
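
  For reference, a minimal sketch of the two ETAPI calls (the env-var names and defaults here are illustrative assumptions, not necessarily what `trillium.py` uses):

  ```python
  import os
  import requests

  # Assumed configuration names; the real module may read these differently.
  ETAPI_URL = os.environ.get("TRILLIUM_URL", "http://localhost:8080/etapi")
  HEADERS = {"Authorization": os.environ["TRILLIUM_TOKEN"]}

  def search_notes(query: str) -> list[dict]:
      """Full-text search via ETAPI's GET /notes endpoint."""
      resp = requests.get(f"{ETAPI_URL}/notes", params={"search": query},
                          headers=HEADERS, timeout=10)
      resp.raise_for_status()
      return resp.json().get("results", [])

  def create_note(title: str, content: str, parent_id: str = "root") -> dict:
      """Create a text note under parent_id via POST /create-note."""
      payload = {"parentNoteId": parent_id, "title": title,
                 "type": "text", "content": content}
      resp = requests.post(f"{ETAPI_URL}/create-note", json=payload,
                           headers=HEADERS, timeout=10)
      resp.raise_for_status()
      return resp.json()
  ```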

feat: Add web search functionality using DuckDuckGo

- Introduced `web_search.py` for performing web searches without API keys.
- Implemented `search_web` function with result handling and validation.
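
  A minimal sketch of a key-free search, assuming the `duckduckgo_search` package:

  ```python
  from duckduckgo_search import DDGS

  def search_web(query: str, max_results: int = 5) -> list[dict]:
      """Return DuckDuckGo text results as {title, href, body} dicts."""
      with DDGS() as ddgs:
          return list(ddgs.text(query, max_results=max_results))
  ```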

feat: Create provider-agnostic function caller for iterative tool calling

- Developed `function_caller.py` to manage LLM interactions with tools.
- Implemented iterative calling logic with error handling and tool execution.
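
  The core loop is roughly the following (the `llm.chat` / `registry.execute` interfaces are assumptions about `function_caller.py`, not its actual API):

  ```python
  MAX_ITERATIONS = 5  # cap on autonomous tool-use rounds

  async def run_with_tools(llm, registry, messages: list[dict]) -> str:
      """Call the LLM, execute any requested tools, and loop until done."""
      for _ in range(MAX_ITERATIONS):
          response = await llm.chat(messages)        # provider-agnostic chat call
          tool_calls = response.get("tool_calls") or []
          if not tool_calls:
              return response["content"]             # model answered without tools
          for call in tool_calls:
              try:
                  result = await registry.execute(call["name"], call["arguments"])
              except Exception as exc:               # feed tool failures back to the model
                  result = f"error: {exc}"
              messages.append({"role": "tool", "name": call["name"],
                               "content": str(result)})
      return "Stopped: tool-iteration limit reached."
  ```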

feat: Establish a tool registry for managing available tools

- Created `registry.py` to define and manage tool availability and execution.
- Integrated feature flags for enabling/disabling tools based on environment variables.
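
  A sketch of feature-flagged registration under assumed flag semantics (flag names and the `execute` signature are illustrative):

  ```python
  import asyncio
  import os
  from typing import Callable

  def _flag_enabled(flag: str) -> bool:
      """Treat common truthy strings as 'on'; tools default to off."""
      return os.environ.get(flag, "false").lower() in ("1", "true", "yes")

  TOOLS: dict[str, Callable] = {}

  def register(name: str, func: Callable, flag: str) -> None:
      """Expose a tool to the model only when its feature flag is set."""
      if _flag_enabled(flag):
          TOOLS[name] = func

  async def execute(name: str, arguments: dict):
      """Run a registered tool, awaiting it if it returns a coroutine."""
      if name not in TOOLS:
          raise KeyError(f"tool not available: {name}")
      result = TOOLS[name](**arguments)
      return await result if asyncio.iscoroutine(result) else result
  ```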

feat: Implement event streaming for tool calling processes

- Added `stream_events.py` to manage Server-Sent Events (SSE) for tool calling.
- Enabled real-time progress updates in the UI during tool execution.
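
  One plausible shape for the per-session fan-out, using `asyncio` queues (the queue structure and the `complete` event name are assumptions about `stream_events.py`):

  ```python
  import asyncio
  import json

  QUEUES: dict[str, asyncio.Queue] = {}  # one event queue per chat session

  async def emit(session_id: str, event: str, data: dict) -> None:
      """Push a thinking/tool event onto the session's stream."""
      await QUEUES.setdefault(session_id, asyncio.Queue()).put((event, data))

  async def sse_stream(session_id: str):
      """Async generator yielding SSE-formatted frames for one session."""
      queue = QUEUES.setdefault(session_id, asyncio.Queue())
      while True:
          event, data = await queue.get()
          yield f"event: {event}\ndata: {json.dumps(data)}\n\n"
          if event == "complete":   # terminal event closes the stream
              break
  ```

  An endpoint such as `GET /stream/thinking/{session_id}` can then wrap `sse_stream()` in a streaming HTTP response.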

test: Add tests for tool calling system components

- Created `test_tools.py` to validate functionality of code execution, web search, and tool registry.
- Implemented asynchronous tests to ensure proper execution and result handling.
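
  The tests are roughly of this shape (pytest-asyncio assumed; the import path is hypothetical):

  ```python
  import pytest
  from cortex.tools import registry  # hypothetical import path

  @pytest.mark.asyncio
  async def test_web_search_tool_executes():
      results = await registry.execute("web_search", {"query": "python asyncio"})
      assert isinstance(results, list) and results, "expected at least one result"
  ```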

chore: Add Dockerfile for sandbox environment setup

- Created `Dockerfile` to set up a Python environment with necessary dependencies for code execution.

chore: Add debug regex script for testing XML parsing

- Introduced `debug_regex.py` to validate regex patterns against XML tool calls.
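
  The pattern being exercised is along these lines (the exact regex in `debug_regex.py` may differ):

  ```python
  import re

  # Non-greedy match so multiple tool calls in one reply are captured separately.
  TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)

  sample = 'text <tool_call>{"name": "web_search", "arguments": {"query": "x"}}</tool_call> more'
  print(TOOL_CALL_RE.findall(sample))
  # ['{"name": "web_search", "arguments": {"query": "x"}}']
  ```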

chore: Add HTML template for displaying thinking stream events

- Created `test_thinking_stream.html` to visualize tool-calling events in the browser during development.

test: Add tests for OllamaAdapter XML parsing

- Developed `test_ollama_parser.py` to validate XML parsing with various test cases, including malformed XML.
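
  A representative malformed-input case (the import path and `parse_tool_calls` method are assumptions about the adapter's API):

  ```python
  from cortex.llm.ollama_adapter import OllamaAdapter  # hypothetical import path

  def test_unclosed_tool_call_is_ignored():
      adapter = OllamaAdapter()
      # Missing closing tag: the parser should return no calls rather than crash.
      assert adapter.parse_tool_calls('<tool_call>{"name": "x"}') == []
  ```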

serversdwn committed 2025-12-26 03:49:20 -05:00
parent f1471cde84 · commit 64429b19e6
37 changed files with 3238 additions and 23 deletions

```diff
@@ -1,12 +1,12 @@
-# Project Lyra - README v0.7.0
+# Project Lyra - README v0.8.0
 Lyra is a modular persistent AI companion system with advanced reasoning capabilities and autonomous decision-making.
 It provides memory-backed chat using **Relay** + **Cortex** with integrated **Autonomy System**,
 featuring a multi-stage reasoning pipeline powered by HTTP-based LLM backends.
-**NEW in v0.7.0:** Standard Mode for simple chatbot functionality + UI backend selection + server-side session persistence
+**NEW in v0.8.0:** Agentic tool calling + "Show Your Work" real-time thinking stream visualization
-**Current Version:** v0.7.0 (2025-12-21)
+**Current Version:** v0.8.0 (2025-12-26)
 > **Note:** As of v0.6.0, NeoMem is **disabled by default** while we work out integration hiccups in the pipeline. The autonomy system is being refined independently before full memory integration.
@@ -33,11 +33,16 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Manages async calls to Cortex ingest
 - *(NeoMem integration currently disabled in v0.6.0)*
-**2. UI** (Static HTML)
+**2. UI** (Static HTML) - Port 8081 (nginx)
 - Browser-based chat interface with cyberpunk theme
-- **NEW:** Mode selector (Standard/Cortex) in header
-- **NEW:** Settings modal with backend selection and session management
-- **NEW:** Light/Dark mode toggle (dark by default)
+- Mode selector (Standard/Cortex) in header
+- Settings modal with backend selection and session management
+- Light/Dark mode toggle (dark by default)
+- **NEW in v0.8.0:** "🧠 Show Work" button for real-time thinking stream
+  - Opens popup window with live SSE connection
+  - Color-coded events: thinking, tool calls, results, completion
+  - Auto-scrolling with animations
+  - Session-aware (matches current chat session)
 - Server-synced session management (persists across browsers and reboots)
 - OpenAI-compatible message format
@@ -55,11 +60,19 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Primary reasoning engine with multi-stage pipeline and autonomy system
 - **Includes embedded Intake module** (no separate service as of v0.5.1)
 - **Integrated Autonomy System** (NEW in v0.6.0) - See Autonomy System section below
+- **Tool Calling System** (NEW in v0.8.0) - Agentic execution for Standard Mode
+  - Sandboxed code execution (Python, JavaScript, Bash)
+  - Web search via Tavily API
+  - Trillium knowledge base integration
+  - Multi-iteration autonomous tool use (max 5 iterations)
+  - Real-time thinking stream via SSE
 - **Dual Operating Modes:**
-  - **Standard Mode** (NEW in v0.7.0) - Simple chatbot with context retention
+  - **Standard Mode** (v0.7.0) - Simple chatbot with context retention + tool calling (v0.8.0)
     - Bypasses reflection, reasoning, refinement stages
     - Direct LLM call with conversation history
     - User-selectable backend (SECONDARY, OPENAI, or custom)
+    - **NEW:** Autonomous tool calling for code execution, web search, knowledge queries
+    - **NEW:** "Show Your Work" real-time thinking stream
     - Faster responses for coding and practical tasks
   - **Cortex Mode** - Full 4-stage reasoning pipeline
     1. **Reflection** - Generates meta-awareness notes about conversation
@@ -70,7 +83,8 @@ Project Lyra operates as a **single docker-compose deployment** with multiple Do
 - Flexible LLM router supporting multiple backends via HTTP
 - **Endpoints:**
   - `POST /reason` - Main reasoning pipeline (Cortex Mode)
-  - `POST /simple` - Direct LLM chat (Standard Mode) **NEW in v0.7.0**
+  - `POST /simple` - Direct LLM chat with tool calling (Standard Mode)
+  - `GET /stream/thinking/{session_id}` - SSE stream for thinking events **NEW in v0.8.0**
   - `POST /ingest` - Receives conversation exchanges from Relay
   - `GET /health` - Service health check
   - `GET /debug/sessions` - Inspect in-memory SESSIONS state
```
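
For reference, the thinking stream is plain SSE over HTTP; a minimal client sketch (host/port are placeholders, not the documented Cortex address):

```python
import requests

session_id = "demo-session"
url = f"http://localhost:8000/stream/thinking/{session_id}"  # placeholder host/port

with requests.get(url, stream=True, timeout=None) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if line:          # skip blank keep-alive separators between events
            print(line)   # e.g. 'event: tool_call' then 'data: {...}'
```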