Request Flow Chain

  1. UI (Frontend)
     ↓ sends HTTP POST to

  2. Relay Service (Node.js - server.js)
     Location: /home/serversdown/project-lyra/core/relay/server.js
     Port: 7078
     Endpoint: POST /v1/chat/completions (an example request follows the chain)
     ↓ calls handleChatRequest(), which posts to

  3. Cortex Service - Reason Endpoint (Python FastAPI - router.py)
     Location: /home/serversdown/project-lyra/cortex/router.py
     Port: 7081
     Endpoint: POST /reason (a minimal sketch follows the chain)
     Function: run_reason() at line 126
     ↓ calls

  4. Cortex Reasoning Module (reasoning.py)
     Location: /home/serversdown/project-lyra/cortex/reasoning/reasoning.py
     Function: reason_check() at line 188
     ↓ calls

  5. LLM Router (llm_router.py)
     Location: /home/serversdown/project-lyra/cortex/llm/llm_router.py
     Function: call_llm() (routing sketched after the chain)

    • Gets the backend from the environment: CORTEX_LLM=PRIMARY (from .env line 29)
    • Looks up the PRIMARY config, which has provider="mi50" (from .env line 13)
    • Routes to the mi50 provider handler (lines 62-70)
      ↓ makes an HTTP POST to
  6. MI50 LLM Server (llama.cpp)
     Location: http://10.0.0.44:8080
     Endpoint: POST /completion
     Hardware: AMD MI50 GPU running a DeepSeek model

Key Configuration Points

    • Backend Selection: .env:29 sets CORTEX_LLM=PRIMARY
    • Provider Name: .env:13 sets LLM_PRIMARY_PROVIDER=mi50
    • Server URL: .env:14 sets LLM_PRIMARY_URL=http://10.0.0.44:8080
    • Provider Handler: llm_router.py:62-70 implements the mi50 provider
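
For reference, those three keys would look roughly like this in .env (other entries omitted; the line numbers cited above refer to the full file):

```ini
# Backend selection (.env:29): Cortex uses the PRIMARY backend
CORTEX_LLM=PRIMARY
# Provider name (.env:13): selects the mi50 handler in llm_router.py
LLM_PRIMARY_PROVIDER=mi50
# Server URL (.env:14): llama.cpp server on the MI50 box
LLM_PRIMARY_URL=http://10.0.0.44:8080
```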
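
The entry point of the chain is the relay's OpenAI-style endpoint on port 7078. The snippet below is a minimal sketch of a request into step 2; the port and path come from the chain above, while the localhost host, the model name, and the exact body/response schema the relay expects are assumptions.

```python
# Hedged example of step 1 -> 2: a client POSTing a chat request to the relay.
# Port 7078 and /v1/chat/completions are documented; the localhost host,
# OpenAI-style body, and "model" value are assumptions.
import requests

RELAY_URL = "http://localhost:7078/v1/chat/completions"

payload = {
    "model": "lyra",  # placeholder; the relay may ignore or override this
    "messages": [{"role": "user", "content": "Hello, Lyra."}],
}

resp = requests.post(RELAY_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())
```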
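
At steps 3 and 4, the Cortex router hands the request off to the reasoning module. Below is a minimal FastAPI sketch of that handoff: only the /reason route, the run_reason() and reason_check() names, and the file layout come from the notes above; the request model, its field names, and the reason_check() signature are assumptions.

```python
# Hedged sketch of steps 3 -> 4: the /reason route delegating to reason_check().
# Route and function names come from router.py / reasoning.py as documented;
# the request model and the reason_check() signature are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

from reasoning.reasoning import reason_check  # cortex/reasoning/reasoning.py

app = FastAPI()

class ReasonRequest(BaseModel):
    prompt: str                 # assumed field name
    temperature: float = 0.7    # assumed default

@app.post("/reason")
def run_reason(req: ReasonRequest):
    # reason_check() (reasoning.py:188) performs the LLM call via call_llm().
    answer = reason_check(req.prompt, temperature=req.temperature)
    return {"answer": answer}
```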
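
Steps 5 and 6 together reduce to: read the backend from the environment, then POST the prompt to the llama.cpp server. The sketch below is a hedged version of that path. The env key names and values, the mi50 provider name, the server URL, and the /completion endpoint come from the notes above; the config structure, the call_llm() signature, the hypothetical call_mi50() helper, and the request/response fields (which follow llama.cpp's server API) are assumptions to check against llm_router.py and the running server.

```python
# Hedged sketch of steps 5 -> 6: env-driven backend selection in call_llm()
# and the mi50 handler's POST to the llama.cpp /completion endpoint.
# Env keys/values and the URL match the Key Configuration Points above;
# everything else (structure, signatures, payload fields) is assumed.
import os
import requests

def resolve_backend() -> dict:
    name = os.getenv("CORTEX_LLM", "PRIMARY")            # .env:29 -> PRIMARY
    return {
        "name": name,
        "provider": os.environ[f"LLM_{name}_PROVIDER"],  # .env:13 -> mi50
        "url": os.environ[f"LLM_{name}_URL"],            # .env:14 -> http://10.0.0.44:8080
    }

def call_mi50(base_url: str, prompt: str) -> str:
    # llama.cpp's HTTP server exposes POST /completion; the field names below
    # follow its JSON API but should be verified against the deployed build.
    resp = requests.post(
        f"{base_url}/completion",
        json={"prompt": prompt, "n_predict": 512, "temperature": 0.7},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json().get("content", "")

def call_llm(prompt: str) -> str:
    cfg = resolve_backend()
    if cfg["provider"] == "mi50":                        # handler per llm_router.py:62-70
        return call_mi50(cfg["url"], prompt)
    raise ValueError(f"No handler for provider {cfg['provider']!r}")
```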