
claude c015e39925 feat: add incremental embedding, unchanged files now reuse cached embeddings to save on tokens.

fix: early return bug in ingest.py fixed.

2026-03-09 16:23:59 -04:00

.env.example

init commit: v0.1.0

2026-03-04 17:20:29 -05:00

.gitignore

init commit: v0.1.0

2026-03-04 17:20:29 -05:00

ingest.py

feat: add incremental embedding, unchanged files now reuse cached embeddings to save on tokens.

2026-03-09 16:23:59 -04:00

query.py

init commit: v0.1.0

2026-03-04 17:20:29 -05:00

README.md

doc: add readme.md

2026-03-05 23:10:25 -05:00

requirements.txt

init commit: v0.1.0

2026-03-04 17:20:29 -05:00

sources.yaml

init commit: v0.1.0

2026-03-04 17:20:29 -05:00

README.md

TMI RAG (rag-tmi)

A lightweight local Retrieval-Augmented Generation (RAG) system used to index and search technical documentation and source code across the Terra-Mechanics development workspace.

This tool allows Claude Code, Codex, or any LLM assistant to retrieve relevant context from engineering documentation before answering questions.

The goal is to create a searchable semantic memory layer for projects such as:

Terra-View
Seismo Relay
Series 3 / Minimate protocol research
Modem / SLM documentation
Reverse engineering notes
Parser documentation

Instead of manually searching repos or notes, the system retrieves relevant information using vector similarity search.

How It Works

The system follows a standard RAG architecture.

1. Indexing

ingest.py:

Reads directories listed in sources.yaml
Recursively scans for supported files (.md, .txt, .py)
Splits content into chunks
Generates embeddings using OpenAI
Stores vectors in a FAISS index

Result:

index/
  index.faiss
  meta.pkl

This becomes the semantic database.
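The chunking step above can be sketched as follows. This is a minimal illustration, not the code in ingest.py: the function name `chunk_text`, the character-based sliding window, and the `chunk_size` and `overlap` defaults are all assumptions, since this README does not say how chunks are sized.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are hypothetical defaults chosen for illustration.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that contain only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


demo_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Overlapping windows keep a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of embedding some text twice.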

2. Querying

query.py:

Embeds the query text
Searches the FAISS vector index
Returns the most relevant chunks of text

These results can be pasted into an LLM chat (Claude Code, Codex, etc.) as context.

Example:

python query.py "checksum algorithm"
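Conceptually, the search step is a nearest-neighbour lookup over embedding vectors. The sketch below swaps FAISS for a brute-force cosine-similarity scan in plain Python so that it runs without dependencies. The 3-dimensional vectors are made-up stand-ins for real embeddings, and `top_k` and the parallel `vectors`/`chunks` lists are assumptions rather than the layout query.py uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, vectors, chunks, k=3):
    """Return the k (score, chunk) pairs most similar to query_vec."""
    scored = [(cosine(query_vec, vec), chunk) for vec, chunk in zip(vectors, chunks)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: hand-written vectors, not real embeddings.
vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
chunks = ["checksum calculation", "payload CRC", "modem setup"]
results = top_k([1.0, 0.0, 0.0], vectors, chunks, k=2)
```

A vector index such as FAISS serves the same purpose, but is built to stay fast as the number of stored vectors grows.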

Repository Structure

rag-tmi/
│
├─ ingest.py
├─ query.py
├─ sources.yaml
├─ requirements.txt
├─ README.md
├─ .env.example
├─ .gitignore
│
├─ index/            # Generated FAISS index (not committed)
│
└─ .venv/            # Local Python environment (not committed)

Installation

Create a virtual environment:

python -m venv .venv

Activate it.

Windows (PowerShell):

.venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Environment Variables

Create a .env file in the project root.

OPENAI_API_KEY=your_key_here

The key is loaded automatically by python-dotenv.
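A minimal sketch of how a script might pick up the key is shown below. The helper name `load_api_key` and the error message are hypothetical, and the scripts in this repo may load the key differently. The sketch falls back to the plain process environment when python-dotenv is not installed.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key, reading a .env file first when python-dotenv is available."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
        # Looks for a .env file; variables already set in the environment win.
        load_dotenv()
    except ImportError:
        pass  # rely on whatever is already exported in the environment

    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to .env or export it")
    return key
```

Failing early with a clear message avoids a less obvious authentication error later, when the first embedding request is made.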

Running the Indexer

Build the vector database:

python ingest.py

The script will:

scan all configured sources
generate embeddings
build a FAISS index

You should see progress like:

Embedding 72 chunks

The index will be written to:

index/index.faiss
index/meta.pkl

Querying the Database

Run:

python query.py "frame checksum"

The system will return the most relevant chunks from indexed documents.

These chunks can be copied into Claude or Codex to provide accurate context for responses.

sources.yaml

This file defines which directories are indexed.

Example:

sources:
  - ../terra-view
  - ../seismo-relay
  - ../series3-agent
  - ../protocol-docs

Paths are relative to the rag-tmi directory.
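The relative-path rule can be illustrated with the sketch below. It assumes the file is parsed with PyYAML and that each entry is resolved against the directory containing sources.yaml. The function names `resolve_sources` and `load_sources` are hypothetical and may not match ingest.py.

```python
from pathlib import Path


def resolve_sources(entries, base_dir):
    """Resolve each sources.yaml entry against base_dir (the rag-tmi directory)."""
    base = Path(base_dir).resolve()
    return [(base / entry).resolve() for entry in entries]


def load_sources(config_path="sources.yaml"):
    """Read the `sources` list and return absolute directory paths."""
    import yaml  # PyYAML; assumed here, imported lazily so the resolver works without it

    config_path = Path(config_path)
    with config_path.open(encoding="utf-8") as handle:
        config = yaml.safe_load(handle) or {}  # an empty file parses to None
    return resolve_sources(config.get("sources", []), config_path.resolve().parent)


# "../terra-view" relative to /work/rag-tmi is the sibling directory /work/terra-view.
example = resolve_sources(["../terra-view"], "/work/rag-tmi")
```

Resolving against the config file's directory rather than the current working directory means the indexer behaves the same wherever it is launched from.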

Supported File Types

Currently indexed:

.md
.txt
.py

Additional types can be added inside ingest.py.
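As a rough sketch of what that change might look like, the filter below keeps the supported extensions in one set. The names `SUPPORTED_EXTENSIONS` and `find_supported_files` are assumptions for illustration, and ingest.py may organise this differently.

```python
from pathlib import Path

# Extensions listed in this README; extend the set to index more file types.
SUPPORTED_EXTENSIONS = {".md", ".txt", ".py"}


def find_supported_files(root, extensions=SUPPORTED_EXTENSIONS):
    """Recursively collect files under root whose suffix is in extensions."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )
```

With this layout, indexing an extra type is a one-line change, for example `find_supported_files(root, SUPPORTED_EXTENSIONS | {".rst"})`.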

Git Notes

The following files should not be committed:

.venv/
.env
index/
__pycache__/

Example .gitignore entries:

.venv/
.env
index/
__pycache__/

Why This Exists

Large engineering projects accumulate knowledge across:

source code
documentation
research notes
reverse engineering logs

Traditional search tools rely on exact text matches.

Embedding-based retrieval enables semantic search, meaning a query like:

"frame checksum"

can retrieve documentation that contains phrases like:

payload CRC
frame validation
checksum calculation

even if the exact words do not match.

Future Improvements

Possible upgrades:

CLI command wrapper (rag "query")
automatic repo indexing
incremental indexing
MCP server for AI tool access
reranking model
code-aware chunking
web UI for search
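Of the upgrades listed, incremental indexing lends itself to a short sketch: hash each file's content and re-embed only the files whose hash changed since the previous run. The cache layout (path mapped to SHA-256 digest) and the names `content_digest` and `split_changed` are hypothetical, not taken from ingest.py.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(data).hexdigest()


def split_changed(files: dict[str, bytes], cache: dict[str, str]):
    """Partition files into (unchanged, changed) relative to the cached digests.

    files maps path -> current content; cache maps path -> digest from the last run.
    """
    unchanged, changed = [], []
    for path, data in files.items():
        if cache.get(path) == content_digest(data):
            unchanged.append(path)  # cached embeddings can be reused
        else:
            changed.append(path)  # new or modified: needs re-embedding
    return unchanged, changed


previous = {"a.md": content_digest(b"one"), "b.md": content_digest(b"old")}
current = {"a.md": b"one", "b.md": b"two", "c.md": b"three"}
unchanged, changed = split_changed(current, previous)
```

Hashing content rather than comparing modification times means files that were touched or freshly checked out, but not edited, are not re-embedded.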

Philosophy

This tool acts as a local memory layer for engineering work.

Instead of searching through repos manually, developers and AI assistants can retrieve the most relevant information instantly.

The system is intentionally simple:

local
transparent
easy to rebuild
no heavy frameworks

It follows the principle that small tools that solve real problems are better than complex systems that are hard to maintain.