project-lyra/docs/PARKED_IDEAS.md
serversdown f89849801b docs: park self-modifying-Lyra sandbox design
Capture the isolated-VM design for the self-modification frontier: Proxmox
sandbox clone, network isolation (esp. from tmi-dev/day-job), snapshot-rollback,
spend/resource caps, kill switch, human-gated promotion. Build the cage before
the agent gets code-write powers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 00:35:38 +00:00

# Parked Ideas — Lyra
Moonshots, pipe dreams, and "doesn't exist yet" ideas. Captured here so they
**don't derail current work** — and so they're never lost.
**The rule:** when an idea shows up mid-snag, ask *"is this the point, or in the
way of the point?"* If it's the point, we build it. If it's in the way, we park
it here, use the boring existing tool for now, and come back when it's the point.
**Honesty policy:** for each idea, note whether it doesn't exist because it's
*hard/uneconomical* (someone tried) or because *nobody's bothered* (a real gap).
Pick battles accordingly.
Status: 🌙 moonshot (needs big prerequisites) · 🔬 research · 🛠️ buildable-soon
---
## 🌙 Build / fine-tune our own model
Full control of persona and character, no RLHF "helpful assistant" tics baked in
(the thing mini/qwen-14b kept fighting us on). A model that *is* Lyra rather than
one we prompt into being her.
- **Why parked:** needs three things first: a working system, so we know what
we're actually optimizing for; training/fine-tuning infra; and data (we now
*have* 18 months of real conversations, a genuine asset for this).
- **Unblocks when:** the working system has taught us its real limits, and we
have a clear target for what the model must do better than off-the-shelf.
- **Exists?** Fine-tuning exists; a model purpose-built as a *persistent self*
with native memory does not. Real gap, not a dead end.
## 🔬 Memory as native vectors ("everything in numbers behind the scenes")
Instead of re-injecting human-readable text every turn, feed memory to the model
as learned vectors it natively consumes (soft prompts / gist tokens /
memory-augmented transformer, à la RETRO / Memorizing Transformers).
- **Why parked:** API models can't do this. They accept only tokens and embed
them with their own layer, so our stored vectors are meaningless to them.
Requires owning the model internals → depends on the "build our own model"
idea above.
- **Brain analogy:** this is closer to how *humans* store memory than text is —
which is exactly why it's interesting for the emergence goal.
- **Exists?** Active research, not productized. Real frontier.
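
To make the text-vs-vector difference concrete, here is a toy sketch in plain Python. It assumes a made-up 4-dimensional embedding table and hypothetical helper names (`input_with_text_memory`, `input_with_soft_memory`); nothing here is Lyra's actual code or any real model's API. The only point is the shape of the input: text memory costs one position per token and goes through the model's own embedding layer, while a soft-prompt memory is a handful of learned vectors prepended directly.

```python
import random
from typing import Dict, List

Vector = List[float]
EMBED_DIM = 4  # toy width (assumption); real models use thousands of dimensions


def make_embedding_table(vocab: List[str], dim: int = EMBED_DIM, seed: int = 0) -> Dict[str, Vector]:
    """Stand-in for a model's token embedding layer (random, not learned)."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def embed(tokens: List[str], table: Dict[str, Vector]) -> List[Vector]:
    """Look up one vector per token, as the model's own layer would."""
    return [table[tok] for tok in tokens]


def input_with_text_memory(memory_tokens: List[str], user_tokens: List[str],
                           table: Dict[str, Vector]) -> List[Vector]:
    """Today's approach: memory is re-injected as text, one position per token."""
    return embed(memory_tokens + user_tokens, table)


def input_with_soft_memory(memory_vectors: List[Vector], user_tokens: List[str],
                           table: Dict[str, Vector], dim: int = EMBED_DIM) -> List[Vector]:
    """Soft-prompt approach: learned vectors are prepended and skip the tokenizer.

    This needs access to the embedding space, which a hosted API does not expose.
    """
    for vec in memory_vectors:
        if len(vec) != dim:
            raise ValueError(f"memory vector has width {len(vec)}, model expects {dim}")
    return [list(vec) for vec in memory_vectors] + embed(user_tokens, table)
```

In this toy, a three-token memory plus a four-token question takes seven positions as text, but five positions if the same memory were distilled into one vector. Whether one vector can actually carry that memory is the open research question.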
## 🛠️ Prompt compression (LLMLingua-style)
A model that drops low-information tokens to shrink the prompt several-fold
(the LLMLingua paper reports up to 20×) before it hits the LLM. The practical,
today-version of "make the context denser."
- **Why parked (for now):** 15k-char context isn't actually hurting us yet
(~1¢/turn on gpt-4o; MI50 prefill is fixed by prompt caching). Revisit if
context cost becomes a real problem.
- **Exists?** Yes, usable. Just adds a dependency + step.
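
A rough sketch of the idea and of the cost arithmetic behind "~1¢/turn". Two loud assumptions: the scorer is a crude unigram-frequency stand-in (LLMLingua itself scores tokens with a small language model, which this does not reproduce), and the cost helper assumes ~4 characters per token and $2.50 per million input tokens for gpt-4o, a price that may be stale. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Optional


def estimate_cost_cents(chars: int, chars_per_token: float = 4.0,
                        usd_per_million_tokens: float = 2.50) -> float:
    """Back-of-envelope input cost per turn; both defaults are assumptions."""
    tokens = chars / chars_per_token
    return tokens * usd_per_million_tokens / 1_000_000 * 100


def compress_prompt(prompt: str, keep_ratio: float = 0.5,
                    reference: Optional[str] = None) -> str:
    """Keep the highest-information words, preserving their original order.

    Information is -log2 of an add-one-smoothed unigram probability, counted
    over `reference` (or over the prompt itself if none is given).
    """
    tokens = prompt.split()
    if not tokens:
        return ""
    counts = Counter(word.lower() for word in (reference or prompt).split())
    total = sum(counts.values())
    vocab = len(counts)

    def info(word: str) -> float:
        p = (counts[word.lower()] + 1) / (total + vocab + 1)
        return -math.log2(p)

    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))
    # Rank by information (high first); ties go to the earlier position.
    ranked = sorted(range(len(tokens)), key=lambda i: (-info(tokens[i]), i))
    kept = sorted(ranked[:n_keep])
    return " ".join(tokens[i] for i in kept)
```

Under those assumptions a 15k-char context comes out just under one cent per turn, consistent with the note above, so halving it would save roughly half a cent. That is the case for leaving this parked.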
## 🌶️🌙 Self-modifying Lyra (isolated sandbox)
Let Lyra edit her own code / self-direct — the "Full Agency" endgame from the
Dec-2025 plan (in her memory). The whole point of the project: can she become a
*being*? Give her freedom **inside a box** and watch.
- **The cage (Proxmox-native), non-negotiable before any self-mod:**
  - **Clone the stack into a dedicated Lyra-sandbox VM** (separate from prod Lyra).
  - **Network isolation:** own VLAN/firewall, NO route to other VMs, ESPECIALLY
    `tmi-dev` (Brian's day job). Whitelist only the inference endpoint. This is
    guardrail #1 (the .44/terra-mechanics conflict showed how things bleed on the LAN).
  - **Snapshot before every self-mod cycle** → instant rollback when she bricks
    or weirds herself out.
  - **Resource + API-spend caps:** a runaway loop must not drain the account or
    peg the GPU forever.
  - **Full logging (the live log) + a hard kill switch** (stop the VM).
  - **Human-gated promotion:** she experiments freely in the sandbox; changes
    reach "real" Lyra only when Brian approves.
- **Why parked:** needs the foundation first (dream-cycle, inner self) and the
cage built before the agent gets code-write + self-restart powers.
- **Honest note:** "rogue" here = mundane-but-real (touches other systems,
cost loops, self-brick), not sci-fi. The isolation makes the *fun* version
(emergence) safe to pursue. Build the box, then open the door.
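
The snapshot / cap / kill-switch / promotion pieces of the cage could be wired together roughly like this. It is a minimal sketch, not a design decision: the class and method names are hypothetical, VM ID `900` in the usage note is made up, and the command runner is injected so nothing touches a real host. The only real-world calls assumed are Proxmox's `qm snapshot`, `qm rollback`, and `qm stop`. Network isolation is firewall/VLAN config and is not covered here.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Runner = Callable[[List[str]], None]  # e.g. a wrapper around subprocess.run


class BudgetExceeded(RuntimeError):
    """Raised after the kill switch fires because the spend cap was hit."""


@dataclass
class SandboxGuard:
    vmid: int                # hypothetical sandbox VM ID
    spend_cap_usd: float     # hard API-spend ceiling for the sandbox
    run: Runner              # injected command runner (fake in tests)
    spent_usd: float = 0.0
    cycle: int = 0
    pending: List[str] = field(default_factory=list)  # snapshots awaiting approval

    def kill(self) -> None:
        """Hard kill switch: stop the VM."""
        self.run(["qm", "stop", str(self.vmid)])

    def charge(self, usd: float) -> None:
        """Record API spend; stop the VM and raise once the cap is exceeded."""
        self.spent_usd += usd
        if self.spent_usd > self.spend_cap_usd:
            self.kill()
            raise BudgetExceeded(f"spent ${self.spent_usd:.2f} of ${self.spend_cap_usd:.2f}")

    def self_mod_cycle(self, apply_change: Callable[[], None],
                       health_check: Callable[[], bool]) -> bool:
        """Snapshot, let the agent change itself, roll back if it broke something."""
        self.cycle += 1
        snap = f"selfmod-{self.cycle}"
        self.run(["qm", "snapshot", str(self.vmid), snap])
        try:
            apply_change()
            healthy = health_check()
        except BudgetExceeded:
            raise                      # the VM is already stopped; do not mask it
        except Exception:
            healthy = False            # a crash counts as a failed cycle
        if not healthy:
            self.run(["qm", "rollback", str(self.vmid), snap])
            return False
        self.pending.append(snap)      # survives in the sandbox, not yet in prod
        return True

    def promote(self, snap: str, approved_by: str) -> str:
        """Human gate: nothing leaves the sandbox without a named approver."""
        if not approved_by:
            raise PermissionError("promotion requires a human approver")
        self.pending.remove(snap)
        return snap
```

Usage would look like `SandboxGuard(vmid=900, spend_cap_usd=5.0, run=my_runner)`, with the agent's inference client calling `charge()` on every request. Injecting the runner keeps the guard testable with a list's `append` standing in for the host.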
## 🛠️ Deterministic poker tooling (RTO + cfr-core)
Wire Lyra to Brian's own GTO/solver projects so ICM, equities, and ranges come
from real computation, never LLM guesses.
- **Why parked:** RTO/cfr-core aren't API-ready yet. This is roadmap, not a
pipe dream — promote it once those expose endpoints.
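
For a sense of what "real computation" means here, below is a minimal sketch of ICM equity using the standard Malmuth-Harville model. It is a stand-in for illustration only: it is not RTO or cfr-core code, and the function name `icm_equities` is hypothetical. The recursion is factorial in the number of paid places, so it only suits small fields such as a final table.

```python
from typing import List, Sequence


def icm_equities(stacks: Sequence[float], payouts: Sequence[float]) -> List[float]:
    """Malmuth-Harville ICM: each player's expected share of the prize pool.

    The chance of taking the next open place is proportional to a player's
    stack among the players who have not yet been assigned a place.
    """
    if any(stack <= 0 for stack in stacks):
        raise ValueError("stacks must be positive")
    n = len(stacks)
    paid = list(payouts)[:n]          # cannot pay more places than players
    equities = [0.0] * n

    def recurse(remaining: frozenset, place: int, prob: float) -> None:
        if place >= len(paid):
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            p = prob * stacks[i] / total      # P(path so far, then i takes this place)
            equities[i] += p * paid[place]
            recurse(remaining - {i}, place + 1, p)

    recurse(frozenset(range(n)), 0, 1.0)
    return equities


# Heads-up, winner-take-all: equity is just the chip share.
print(icm_equities([3, 1], [100, 0]))  # [75.0, 25.0]
```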
---
*Add to this freely. A parked idea isn't a rejected idea — it's a scheduled one.*