Capture moonshots/pipe-dreams (own model, memory-as-native-vectors, prompt compression, RTO/cfr-core tooling) so they don't derail current work but aren't lost. The discipline: park what's "in the way of the point," ship the working thing, revisit when it becomes the point. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Parked Ideas — Lyra
Moonshots, pipe dreams, and "doesn't exist yet" ideas. Captured here so they don't derail current work — and so they're never lost.
The rule: when an idea shows up mid-snag, ask "is this the point, or in the way of the point?" If it's the point, we build it. If it's in the way, we park it here, use the boring existing tool for now, and come back when it's the point.
Honesty policy: for each idea, note whether it doesn't exist because it's hard/uneconomical (someone tried) or because nobody's bothered (a real gap). Pick battles accordingly.
Status: 🌙 moonshot (needs big prerequisites) · 🔬 research · 🛠️ buildable-soon
🌙 Build / fine-tune our own model
Full control of persona and character, no RLHF "helpful assistant" tics baked in (the thing mini/qwen-14b kept fighting us on). A model that is Lyra rather than one we prompt into being her.
- Why parked: we need a working system first to know what we're actually optimizing for, plus training/fine-tuning infra and data (we now have 18 months of real conversations, a genuine asset for this).
- Unblocks when: the working system has taught us its real limits, and we have a clear target for what the model must do better than off-the-shelf.
- Exists? Fine-tuning exists; a model purpose-built as a persistent self with native memory does not. Real gap, not a dead end.
🔬 Memory as native vectors ("everything in numbers behind the scenes")
Instead of re-injecting human-readable text every turn, feed memory to the model as learned vectors it natively consumes (soft prompts / gist tokens / memory-augmented transformer, à la RETRO / Memorizing Transformers).
- Why parked: not possible on API-only models. They accept tokens and embed them with their own layer, so vectors we store in a different embedding space are meaningless to them. Requires owning the model internals, so it depends on the "build our own model" idea above.
- Brain analogy: this is closer to how humans store memory than text is — which is exactly why it's interesting for the emergence goal.
- Exists? Active research, not productized. Real frontier.
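To make the "why parked" point concrete, here is a minimal sketch of the soft-prompt idea in plain Python, assuming a toy embedding width of 4. The function names (`make_embedding_table`, `embed_tokens`, `prepend_memory`) and the memory values are hypothetical illustrations, not part of any real model or library. The thing to notice is that memory enters at the embedding level, which an API that only accepts tokens never exposes.

```python
import random

EMBED_DIM = 4  # toy width (assumption); real models use thousands of dimensions


def make_embedding_table(vocab_size, dim, seed=0):
    """Toy stand-in for a model's own token-embedding layer."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]


def embed_tokens(token_ids, table):
    """What a model does internally with the token ids it receives."""
    return [table[t] for t in token_ids]


def prepend_memory(memory_vectors, token_embeddings):
    """Soft-prompt style: memory enters as vectors, skipping the text step."""
    dim = len(token_embeddings[0])
    for vec in memory_vectors:
        if len(vec) != dim:
            raise ValueError("memory vectors must match the model's embedding width")
    return memory_vectors + token_embeddings


table = make_embedding_table(vocab_size=10, dim=EMBED_DIM)
turn = embed_tokens([3, 1, 4], table)              # the current user turn
memory = [[0.1] * EMBED_DIM, [0.2] * EMBED_DIM]    # hypothetical learned memory vectors
model_input = prepend_memory(memory, turn)
print(len(model_input), "positions x", EMBED_DIM, "dims")
```

In a real system the memory vectors would have to be trained against the same model that consumes them, which is why this stays parked behind owning the model.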
🛠️ Prompt compression (LLMLingua-style)
A small model drops low-information tokens to shrink the prompt 2–5× before it hits the LLM. It's the practical, available-today version of "make the context denser."
- Why parked (for now): 15k-char context isn't actually hurting us yet (~1¢/turn on gpt-4o; MI50 prefill is fixed by prompt caching). Revisit if context cost becomes a real problem.
- Exists? Yes, usable. Just adds a dependency + step.
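As a rough illustration of both the cost math and the technique, here is a toy sketch. It is not LLMLingua's API: LLMLingua scores tokens with a small language model, while this stand-in treats words that are rare within the prompt as more informative. The 4 characters per token and $2.50 per million input tokens figures are assumptions used only to sanity-check the ~1¢/turn estimate, and `estimate_cost_cents` and `compress_prompt` are hypothetical names.

```python
import math
from collections import Counter


def estimate_cost_cents(num_chars, chars_per_token=4.0, usd_per_million_tokens=2.50):
    """Rough input cost of one turn in cents; both defaults are assumptions."""
    tokens = num_chars / chars_per_token
    return tokens * usd_per_million_tokens / 1_000_000 * 100


def compress_prompt(text, keep_ratio=0.5):
    """Keep the rarest words (a crude self-information proxy), in original order."""
    words = text.split()
    if not words:
        return ""
    counts = Counter(w.lower() for w in words)
    total = len(words)
    # Higher score = rarer within this prompt = treated as more informative.
    scores = [-math.log(counts[w.lower()] / total) for w in words]
    keep_n = max(1, int(len(words) * keep_ratio))
    # Rank positions by score, break ties by position, then restore reading order.
    ranked = sorted(range(len(words)), key=lambda i: (-scores[i], i))
    kept = sorted(ranked[:keep_n])
    return " ".join(words[i] for i in kept)


print(f"{estimate_cost_cents(15_000):.2f} cents per turn at the assumed rates")
prompt = "the cat and the dog and the bird saw the fox"
print(compress_prompt(prompt, keep_ratio=0.5))  # prints "cat dog bird saw fox"
```

Under those assumptions a 15k-char context lands just under a cent per turn, so even a 2–5× reduction saves fractions of a cent, which is consistent with parking this for now.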
🛠️ Deterministic poker tooling (RTO + cfr-core)
Wire Lyra to Brian's own GTO/solver projects so ICM, equities, and ranges come from real computation, never LLM guesses.
- Why parked: RTO/cfr-core aren't API-ready yet. This is roadmap, not a pipe dream — promote it once those expose endpoints.
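For a sense of what "real computation, never LLM guesses" means here, below is a minimal sketch of an ICM calculation using the standard Malmuth-Harville model. It is a stand-in written for illustration, not RTO's or cfr-core's actual interface (neither exposes endpoints yet), and `icm_equities` is a hypothetical name.

```python
def icm_equities(stacks, payouts):
    """Malmuth-Harville ICM: expected payout per player from chip stacks."""
    if any(s <= 0 for s in stacks):
        raise ValueError("all stacks must be positive")
    equities = [0.0] * len(stacks)

    def assign(remaining, place, prob):
        # remaining: player indices that have not been given a finishing place yet
        if place >= len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            # Chance that player i takes this place, given the places already assigned.
            p = prob * stacks[i] / total
            equities[i] += p * payouts[place]
            assign([j for j in remaining if j != i], place + 1, p)

    assign(list(range(len(stacks))), 0, 1.0)
    return equities


# Heads-up example: 3000 vs 1000 chips, paying 70 / 30.
print(icm_equities([3000, 1000], [70, 30]))  # prints [60.0, 40.0]
```

The intent would be for Lyra to pass numbers like these through verbatim and only handle the explanation, so the model never has to estimate an equity itself.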
Add to this freely. A parked idea isn't a rejected idea — it's a scheduled one.