# Environment Variables Reference
This document describes all environment variables used across Project Lyra services.
## Quick Start
1. Copy environment templates:
```bash
cp .env.example .env
cp cortex/.env.example cortex/.env
cp neomem/.env.example neomem/.env
cp intake/.env.example intake/.env
```
2. Edit `.env` and add your credentials (a minimal sketch follows this list):
- `OPENAI_API_KEY`: Your OpenAI API key
- `POSTGRES_PASSWORD`: Database password
- `NEO4J_PASSWORD`: Graph database password
- `NEOMEM_API_KEY`: Generate a secure token
3. Update service URLs if your infrastructure differs from defaults
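The minimal sketch referenced in step 2 (all values are placeholders; substitute your own credentials):
```bash
# Generate a secure token for NEOMEM_API_KEY (one option)
openssl rand -hex 32

# Root .env entries (placeholder values shown)
OPENAI_API_KEY=sk-your-openai-key
POSTGRES_PASSWORD=choose-a-strong-password
NEO4J_PASSWORD=choose-a-strong-password
NEO4J_AUTH=neo4j/choose-a-strong-password
NEOMEM_API_KEY=paste-the-generated-token-here
```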
## File Structure
### Root `.env` - Shared Infrastructure
Contains all shared configuration used by multiple services:
- LLM backend options (PRIMARY, SECONDARY, CLOUD, FALLBACK)
- Database credentials (Postgres, Neo4j)
- API keys (OpenAI)
- Internal service URLs
- Feature flags
### Service-Specific `.env` Files
Each service has minimal overrides for service-specific parameters:
- **`cortex/.env`**: Cortex operational parameters
- **`neomem/.env`**: NeoMem LLM naming convention mappings
- **`intake/.env`**: Intake summarization parameters
## Environment Loading Order
Docker Compose loads environment files in this order (later entries override earlier ones):
1. Root `.env`
2. Service-specific `.env` (e.g., `cortex/.env`)
Because the service-specific file is loaded last, it can override root values when needed.
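To confirm which value actually reached a container after both files are merged, inspect its environment; the service and variable below are just examples:
```bash
# Show the effective value of one variable inside the running cortex container
docker-compose exec cortex env | grep CORTEX_LOG_LEVEL
```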
## Global Variables (Root `.env`)
### Global Configuration
| Variable | Default | Description |
|----------|---------|-------------|
| `LOCAL_TZ_LABEL` | `America/New_York` | Timezone for logs and timestamps |
| `DEFAULT_SESSION_ID` | `default` | Default chat session identifier |
### LLM Backend Options
Each service chooses which backend to use from these available options.
#### Primary Backend (vLLM on MI50 GPU)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PRIMARY_PROVIDER` | `vllm` | Provider type |
| `LLM_PRIMARY_URL` | `http://10.0.0.43:8000` | vLLM server endpoint |
| `LLM_PRIMARY_MODEL` | `/model` | Model path for vLLM |
#### Secondary Backend (Ollama on 3090 GPU)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_SECONDARY_PROVIDER` | `ollama` | Provider type |
| `LLM_SECONDARY_URL` | `http://10.0.0.3:11434` | Ollama server endpoint |
| `LLM_SECONDARY_MODEL` | `qwen2.5:7b-instruct-q4_K_M` | Ollama model name |
#### Cloud Backend (OpenAI)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_CLOUD_PROVIDER` | `openai_chat` | Provider type |
| `LLM_CLOUD_URL` | `https://api.openai.com/v1` | OpenAI API endpoint |
| `LLM_CLOUD_MODEL` | `gpt-4o-mini` | OpenAI model to use |
| `OPENAI_API_KEY` | *required* | OpenAI API authentication key |
#### Fallback Backend (llama.cpp/LM Studio)
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_FALLBACK_PROVIDER` | `openai_completions` | Provider type (llama.cpp mimics OpenAI) |
| `LLM_FALLBACK_URL` | `http://10.0.0.41:11435` | Fallback server endpoint |
| `LLM_FALLBACK_MODEL` | `llama-3.2-8b-instruct` | Fallback model name |
#### LLM Global Settings
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_TEMPERATURE` | `0.7` | Sampling temperature (0.0-2.0) |
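Most of these backends expose an OpenAI-compatible HTTP API, so a quick smoke test looks roughly the same against any of them. A sketch against the PRIMARY (vLLM) backend, assuming it is reachable at its default URL:
```bash
# Minimal chat completion against the PRIMARY backend (OpenAI-compatible API)
curl -s http://10.0.0.43:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "/model",
        "messages": [{"role": "user", "content": "Reply with one short sentence."}],
        "temperature": 0.7,
        "max_tokens": 50
      }'
```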
### Database Configuration
#### PostgreSQL (with pgvector)
| Variable | Default | Description |
|----------|---------|-------------|
| `POSTGRES_USER` | `neomem` | PostgreSQL username |
| `POSTGRES_PASSWORD` | *required* | PostgreSQL password |
| `POSTGRES_DB` | `neomem` | Database name |
| `POSTGRES_HOST` | `neomem-postgres` | Container name/hostname |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
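A quick way to confirm the database is up and the pgvector extension is installed (assumes the container name matches `POSTGRES_HOST`):
```bash
# Check that the vector extension is present in the neomem database
docker-compose exec neomem-postgres \
  psql -U neomem -d neomem -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"
```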
#### Neo4j Graph Database
| Variable | Default | Description |
|----------|---------|-------------|
| `NEO4J_URI` | `bolt://neomem-neo4j:7687` | Neo4j connection URI |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | *required* | Neo4j password |
| `NEO4J_AUTH` | `neo4j/<password>` | Neo4j auth string |
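`NEO4J_AUTH` is the username and password joined with a slash, so keep it in sync with `NEO4J_PASSWORD`. A connectivity check, assuming the default container name and that `cypher-shell` ships with the image:
```bash
# Verify Neo4j credentials against the running container
# (export NEO4J_PASSWORD in your shell first, or substitute the literal password)
docker-compose exec neomem-neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN 1;"
```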
### Memory Services (NeoMem)
| Variable | Default | Description |
|----------|---------|-------------|
| `NEOMEM_API` | `http://neomem-api:7077` | NeoMem API endpoint |
| `NEOMEM_API_KEY` | *required* | NeoMem API authentication token |
| `NEOMEM_HISTORY_DB` | `postgresql://...` | PostgreSQL connection string for history |
| `EMBEDDER_PROVIDER` | `openai` | Embedding provider (used by NeoMem) |
| `EMBEDDER_MODEL` | `text-embedding-3-small` | Embedding model name |
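The history connection string is not spelled out above; one plausible shape, assuming the standard PostgreSQL URI format and the database settings from the previous section, would be:
```bash
# Illustrative only - compose NEOMEM_HISTORY_DB from the PostgreSQL variables above
NEOMEM_HISTORY_DB=postgresql://neomem:choose-a-strong-password@neomem-postgres:5432/neomem
```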
### Internal Service URLs
All of these URLs use Docker container names for communication on the internal Docker network:
| Variable | Default | Description |
|----------|---------|-------------|
| `INTAKE_API_URL` | `http://intake:7080` | Intake summarizer service |
| `CORTEX_API` | `http://cortex:7081` | Cortex reasoning service |
| `CORTEX_URL` | `http://cortex:7081/reflect` | Cortex reflection endpoint |
| `CORTEX_URL_INGEST` | `http://cortex:7081/ingest` | Cortex ingest endpoint |
| `RAG_API_URL` | `http://rag:7090` | RAG service (if enabled) |
| `RELAY_URL` | `http://relay:7078` | Relay orchestration service |
| `PERSONA_URL` | `http://persona-sidecar:7080/current` | Persona service (optional) |
### Feature Flags
| Variable | Default | Description |
|----------|---------|-------------|
| `CORTEX_ENABLED` | `true` | Enable Cortex autonomous reflection |
| `MEMORY_ENABLED` | `true` | Enable NeoMem long-term memory |
| `PERSONA_ENABLED` | `false` | Enable persona sidecar |
| `DEBUG_PROMPT` | `true` | Enable debug logging for prompts |
## Service-Specific Variables
### Cortex (`cortex/.env`)
Cortex operational parameters:
| Variable | Default | Description |
|----------|---------|-------------|
| `CORTEX_MODE` | `autonomous` | Operation mode (autonomous/manual) |
| `CORTEX_LOOP_INTERVAL` | `300` | Seconds between reflection loops |
| `CORTEX_REFLECTION_INTERVAL` | `86400` | Seconds between deep reflections (24h) |
| `CORTEX_LOG_LEVEL` | `debug` | Logging verbosity |
| `NEOMEM_HEALTH_CHECK_INTERVAL` | `300` | Seconds between NeoMem health checks |
| `REFLECTION_NOTE_TARGET` | `trilium` | Where to store reflection notes |
| `REFLECTION_NOTE_PATH` | `/app/logs/reflections.log` | Reflection output path |
| `RELEVANCE_THRESHOLD` | `0.78` | Memory retrieval relevance threshold |
**Note**: Cortex uses the PRIMARY backend (`LLM_PRIMARY_*`, vLLM on the MI50) from the root `.env` by default.
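A `cortex/.env` that sticks to the defaults above would look something like this:
```bash
# cortex/.env - operational parameters for the Cortex service
CORTEX_MODE=autonomous
CORTEX_LOOP_INTERVAL=300
CORTEX_REFLECTION_INTERVAL=86400
CORTEX_LOG_LEVEL=debug
RELEVANCE_THRESHOLD=0.78
```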
### NeoMem (`neomem/.env`)
NeoMem uses different variable naming conventions:
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PROVIDER` | `ollama` | NeoMem's LLM provider name |
| `LLM_MODEL` | `qwen2.5:7b-instruct-q4_K_M` | NeoMem's LLM model |
| `LLM_API_BASE` | `http://10.0.0.3:11434` | NeoMem's LLM endpoint (Ollama) |
**Note**: NeoMem uses Ollama (SECONDARY) for reasoning and OpenAI for embeddings. Database credentials and `OPENAI_API_KEY` are inherited from the root `.env`.
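In practice, the mapping from the shared SECONDARY backend to NeoMem's own variable names looks like this (a sketch mirroring the tables above):
```bash
# neomem/.env - NeoMem's naming for the shared Ollama (SECONDARY) backend
LLM_PROVIDER=ollama
LLM_MODEL=qwen2.5:7b-instruct-q4_K_M
LLM_API_BASE=http://10.0.0.3:11434
```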
### Intake (`intake/.env`)
Intake summarization parameters:
| Variable | Default | Description |
|----------|---------|-------------|
| `SUMMARY_MODEL_NAME` | `/model` | Model path for summarization |
| `SUMMARY_API_URL` | `http://10.0.0.43:8000` | LLM endpoint for summaries |
| `SUMMARY_MAX_TOKENS` | `400` | Max tokens for summary generation |
| `SUMMARY_TEMPERATURE` | `0.4` | Temperature for summaries (lower = more focused) |
| `SUMMARY_INTERVAL` | `300` | Seconds between summary checks |
| `INTAKE_LOG_PATH` | `/app/logs/intake.log` | Log file location |
| `INTAKE_LOG_LEVEL` | `info` | Logging verbosity |
**Note**: Intake uses the PRIMARY backend (`LLM_PRIMARY_*`, vLLM) by default.
## Multi-Backend LLM Strategy
Project Lyra supports flexible backend selection per service:
**Root `.env` provides backend OPTIONS**:
- PRIMARY: vLLM on MI50 GPU (high performance)
- SECONDARY: Ollama on 3090 GPU (local inference)
- CLOUD: OpenAI API (cloud fallback)
- FALLBACK: llama.cpp/LM Studio (CPU-only)
**Services choose which backend to USE**:
- **Cortex** → vLLM (PRIMARY) for autonomous reasoning
- **NeoMem** → Ollama (SECONDARY) + OpenAI embeddings
- **Intake** → vLLM (PRIMARY) for summarization
- **Relay** → Implements fallback cascade with user preference
This design eliminates URL duplication while preserving per-service flexibility.
## Security Best Practices
1. **Never commit `.env` files to git** - they contain secrets
2. **Use `.env.example` templates** for documentation and onboarding
3. **Rotate credentials regularly**, especially:
- `OPENAI_API_KEY`
- `NEOMEM_API_KEY`
- Database passwords
4. **Use strong passwords** for production databases
5. **Restrict network access** to LLM backends and databases
## Troubleshooting
### Services can't connect to each other
- Verify container names match in service URLs
- Check all services are on the `lyra_net` Docker network
- Use `docker-compose ps` to verify all services are running
### LLM calls failing
- Verify backend URLs are correct for your infrastructure
- Check if LLM servers are running and accessible
- Test with `curl <LLM_URL>/v1/models` (OpenAI-compatible APIs)
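For example, against the PRIMARY (vLLM) backend, a JSON model list in the response means the server is reachable:
```bash
# List models exposed by the PRIMARY backend
curl -s http://10.0.0.43:8000/v1/models
```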
### Database connection errors
- Verify database credentials match in all locations
- Check if database containers are healthy: `docker-compose ps`
- Review database logs: `docker-compose logs neomem-postgres`
### Environment variables not loading
- Verify env_file paths in docker-compose.yml
- Check file permissions: `.env` files must be readable (see the check below)
- Remember loading order: service `.env` overrides root `.env`
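A quick sanity check that the expected files exist and are readable (paths assume the layout from the Quick Start):
```bash
# Verify all env files referenced by docker-compose.yml are present and readable
ls -l .env cortex/.env neomem/.env intake/.env
```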
## Migration from Old Setup
If you have the old multi-file setup with duplicated variables:
1. **Backup existing files**: All original `.env` files are in `.env-backups/`
2. **Copy new templates**: Use `.env.example` files as base
3. **Merge credentials**: Transfer your actual keys/passwords to new root `.env`
4. **Test thoroughly**: Verify all services start and communicate correctly
## Support
For issues or questions:
- Check logs: `docker-compose logs <service>`
- Verify configuration: `docker exec <container> env | grep <VAR>`
- Review this documentation for variable descriptions