docs: clean up waveform-codec doc layers per review

Three "truth layers" had drifted apart between commits.  Fixed:

1. waveform_codec.py docstring rewritten from the 2026-05-08
   "structural framing only" state to the 2026-05-11 "Tran segment 0
   solved + segment-header partially decoded" state.  Killed stale
   "~80 sample-sets per segment" language (real segments are
   flash-page-byte-sized, not sample-count-sized; observed first-segment
   sizes are 42-510 samples depending on signal).  Killed stale
   "preamble is 7 or 9 bytes" language (always 7).

2. docs/instantel_protocol_reference.md §7.6.1: added a clear
   "CURRENT STATUS" box at the top with a status table.  Replaced the
   stale "~80 sample-sets" line with the verified per-event segment
   sizes.  Merged two redundant segment-header field-table sections.

3. docs/waveform_codec_re_status.md (NEW): clean working-status doc.
   Solved / not solved / hypothesis / next experiment / fixtures /
   tests.  The protocol reference remains the historical Rosetta
   Stone; this new file is the current-truth working note that
   shouldn't accumulate fossil layers.

4. CLAUDE.md §"Waveform body codec": prominent warning box at top —
   "DO NOT TRUST decoded sample arrays yet."  BW binary passthrough
   is the only sample-bearing output to trust until the decoder
   lands.  Added a "Next experiment" subsection pointing the next
   pass at the segment-channel scoring analyzer.

40 tests still pass.
Claude
2026-05-12 02:43:25 +00:00
committed by serversdown
parent 5bf5329369
commit f68ee9f0f9
4 changed files with 385 additions and 139 deletions
+33 -6
@@ -61,10 +61,24 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
## Waveform body codec — PARTIAL (2026-05-11)
> ### ⛔️ DO NOT TRUST decoded sample arrays yet
>
> `client.py:_decode_a5_waveform` still uses the broken legacy int16 LE
> decoder. The `.h5` sidecars SFM writes contain WRONG sample values
> for every event. Treat decoded sample arrays as "unverified" in all
> downstream consumers.
>
> The **BW binary write path** (`blastware_file.py`) is unaffected —
> it's pure passthrough of device flash bytes and remains byte-perfect.
> Use the `.bw` binary as the authoritative waveform output until the
> codec is fully decoded.
>
> Clean working-status doc: `docs/waveform_codec_re_status.md`.
> Full archaeological record: `docs/instantel_protocol_reference.md §7.6.1`.
The **per-byte decoding** of the Blastware waveform-file body (between the
21-byte STRT record and the 26-byte footer) was historically claimed to be
-"raw int16 LE, 8 bytes per sample-set." That was wrong — see the
-retraction in `docs/instantel_protocol_reference.md §7.6.1`. The body
+"raw int16 LE, 8 bytes per sample-set." That was wrong. The body
is actually a tagged-block stream with a custom delta+RLE codec.
### What's solved (2026-05-11)
@@ -96,13 +110,26 @@ is actually a tagged-block stream with a custom delta+RLE codec.
(SS0, SV0) and breaks the simple Tran walk there. Probably a channel-
switch or alternative-encoding marker for high-amplitude regions.
### Next experiment
**Don't hero-code the full decoder.** Build a small analysis tool — a
segment-channel scoring analyzer. For each segment of each fixture
event, run the segment-0 Tran block-walk + RLE decode and score the
cumulative trajectory against the BW ASCII truth for each of {Tran,
Vert, Long, MicL} over that segment's sample range, trying different
anchor-bytes candidates from the segment header. The winning
(channel, anchor-location) combination for each segment reveals
whether segments rotate channels and which header bytes encode the
per-segment channel anchors.
See `docs/waveform_codec_re_status.md` for the full specification of
the next experiment.
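A minimal sketch of the scoring step described above, under stated assumptions: the segment has already been block-walked and RLE-decoded into a list of integer deltas, the BW ASCII truth is available as one integer list per channel, and the first sample equals the anchor plus the first delta. The names `score_segment` and `best_match` are hypothetical and do not exist in the repository.

```python
from itertools import accumulate


def score_segment(deltas, anchor, truth_by_channel, start):
    """Return {channel: mean absolute error} for one decoded segment.

    Assumption: sample[i] = anchor + sum(deltas[:i + 1]).
    """
    trajectory = list(accumulate(deltas, initial=anchor))[1:]
    if not trajectory:
        return {}
    scores = {}
    for channel, truth in truth_by_channel.items():
        window = truth[start:start + len(trajectory)]
        if len(window) < len(trajectory):
            continue  # truth does not cover this sample range
        total = sum(abs(a - b) for a, b in zip(trajectory, window))
        scores[channel] = total / len(trajectory)
    return scores


def best_match(deltas, anchor_candidates, truth_by_channel, start):
    """Return the (channel, anchor, error) triple with the lowest error."""
    best = None
    for anchor in anchor_candidates:
        scored = score_segment(deltas, anchor, truth_by_channel, start)
        for channel, err in scored.items():
            if best is None or err < best[2]:
                best = (channel, anchor, err)
    return best
```

In this sketch each anchor candidate would be a value read from one candidate byte position in the segment header; a near-zero error for exactly one (channel, anchor) pair is the signal the experiment is looking for.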
### Production-code status
`client.py:_decode_a5_waveform` still uses the old (broken) int16 LE
-decoder. Until the multi-channel decoder lands, the `.h5` sidecars
-produced by SFM contain WRONG samples — keep treating them as
-"unverified" downstream. `decode_waveform_v2()` returns `None` as a
-placeholder.
+decoder (see warning at the top of this section). `decode_waveform_v2()`
+in `minimateplus/waveform_codec.py` returns `None` as a placeholder.
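One way a downstream consumer might carry the "unverified" status explicitly is sketched below. It is an illustration only: `WaveformResult` and `build_result` are hypothetical names, not part of the codebase, and the sketch assumes only that a decoder may return `None` or a sample list.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WaveformResult:
    bw_path: str                      # authoritative .bw binary passthrough
    samples: Optional[Sequence[int]]  # decoded samples, or None if no decoder
    samples_verified: bool = False    # stays False until the codec is solved


def build_result(bw_path, decoded):
    """Wrap decoder output without ever marking the samples as trusted."""
    return WaveformResult(bw_path=bw_path, samples=decoded)


def trusted_samples(result):
    """Return samples only when they are flagged as verified."""
    if result.samples is None or not result.samples_verified:
        return None
    return result.samples
```

Keeping the flag on the result object means a consumer has to opt in to using the samples, rather than remembering a warning in the docs.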
### Test fixtures