docs: clean up waveform-codec doc layers per review
Three "truth layers" had drifted apart between commits. Fixed:

1. waveform_codec.py docstring rewritten from the 2026-05-08
   "structural framing only" state to the 2026-05-11 "Tran segment 0
   solved + segment-header partially decoded" state. Killed stale
   "~80 sample-sets per segment" language (real segments are
   flash-page-byte-sized, not sample-count-sized; observed
   first-segment sizes are 42-510 samples depending on signal).
   Killed stale "preamble is 7 or 9 bytes" language (always 7).

2. docs/instantel_protocol_reference.md §7.6.1: added a clear
   "CURRENT STATUS" box at the top with a status table. Replaced the
   stale "~80 sample-sets" line with the verified per-event segment
   sizes. Merged two redundant segment-header field-table sections.

3. docs/waveform_codec_re_status.md (NEW): clean working-status doc.
   Solved / not solved / hypothesis / next experiment / fixtures /
   tests. The protocol reference remains the historical Rosetta
   Stone; this new file is the current-truth working note that
   shouldn't accumulate fossil layers.

4. CLAUDE.md §"Waveform body codec": prominent warning box at top:
   "DO NOT TRUST decoded sample arrays yet." BW binary passthrough is
   the only sample-bearing output to trust until the decoder lands.
   Added a "Next experiment" subsection pointing the next pass at the
   segment-channel scoring analyzer.

40 tests still pass.
@@ -860,20 +860,39 @@ MicL: 39 64 1D AA = 0.0000875 psi
 
 ---
 
-#### 7.6.1 Blast / Waveform mode — 🟡 STRUCTURAL FRAMING + TRAN CODEC DECODED (2026-05-11)
+#### 7.6.1 Blast / Waveform mode — 🟡 PARTIAL DECODE (2026-05-11)
 
-> **Status (2026-05-11):** Block-level framing is solved. The Tran-channel
-> encoding (preamble + first data block) is **fully verified** against the
-> 3-event May 11 2026 high-amplitude bundle (PPV 6-7 in/s) and the 4-event
-> May 8 bundle. Verts / Long / MicL channel encodings and multi-block
-> Tran continuation are **still open**. The previous int16 LE claim
-> remains REFUTED (see history below).
+> ### 📌 CURRENT STATUS — read this first
 >
-> The earlier "4-channel interleaved s16 LE, 8 bytes per sample-set"
-> claim was never validated and was wrong. No event in the project's
-> archive ever came close to ADC saturation, yet the int16 LE decoder
-> consistently produced full-scale ±32K noise — that was the signature
-> of mis-aligned encoded data, not signal saturation.
+> The body codec is **partially decoded** as of 2026-05-11. This
+> section contains both current-truth spec AND historical retractions;
+> when in doubt, the working summary lives at
+> `docs/waveform_codec_re_status.md`.
+>
+> | Item | Status |
+> |---|---|
+> | Body has tagged variable-length blocks, NOT raw int16 LE | ✅ confirmed |
+> | 5 block tag types (10/20/00/30/40 NN) with lengths | ✅ confirmed |
+> | 7-byte preamble: `00 02 00` + Tran[0] + Tran[1] int16 BE | ✅ confirmed |
+> | `00 NN` = RLE for zero deltas in the current channel | ✅ confirmed |
+> | Tran channel, segment 0 (~482-510 samples / event) | ✅ byte-exact, 5/5 events |
+> | Multi-segment Tran continuation | ❌ open (breaks at sample ~512) |
+> | Vert / Long / MicL channel decoders | ❌ open |
+> | `30 NN` block content (loud-from-start events) | ❌ open |
+> | Earlier "raw int16 LE, 8 bytes per sample-set" claim | ❌ REFUTED |
+>
+> **Production code in `client.py:_decode_a5_waveform` still uses the
+> broken int16 LE decoder.** The `.h5` sidecars SFM produces contain
+> wrong sample values and must be treated as "unverified" downstream.
+> The BW binary write path is unaffected (it's pure passthrough of the
+> device's flash bytes, no decoding) and remains byte-perfect.
+
+The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim that
+appeared in earlier revisions of this section was never validated and
+was wrong. No event in the project's archive ever came close to ADC
+saturation, yet the int16 LE decoder consistently produced full-scale
+±32K noise — that was the signature of mis-aligned encoded data, not
+signal saturation.
 
 ##### Body file layout
 
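The preamble row in the status table above (`00 02 00` + Tran[0] + Tran[1], int16 BE, 7 bytes total) can be illustrated with a minimal sketch. The function name `parse_preamble` and the example bytes are hypothetical and are not taken from `waveform_codec.py`; the sketch only assumes what the table states, namely a fixed 3-byte marker followed by two signed 16-bit big-endian samples.

```python
import struct

PREAMBLE_MARKER = b"\x00\x02\x00"  # fixed 3-byte prefix, per the status table
PREAMBLE_LEN = 7                   # 3 marker bytes + two int16 BE samples


def parse_preamble(body: bytes) -> tuple[int, int]:
    """Return (Tran[0], Tran[1]) from the first 7 bytes of a waveform body.

    Hypothetical helper for illustration; not the project's decoder.
    """
    if len(body) < PREAMBLE_LEN:
        raise ValueError("body is shorter than the 7-byte preamble")
    if body[:3] != PREAMBLE_MARKER:
        raise ValueError(f"unexpected preamble marker: {body[:3].hex()}")
    # ">hh" reads two signed 16-bit big-endian integers starting at offset 3.
    tran0, tran1 = struct.unpack_from(">hh", body, 3)
    return tran0, tran1


# Synthetic bytes for illustration only (not captured from a device).
example = bytes.fromhex("0002000010fff0")
print(parse_preamble(example))  # (16, -16)
```

Rejecting an unexpected marker up front gives a cheap alignment check before any block walking starts.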
@@ -932,23 +951,38 @@ followed by a ``00 NN`` marker before the next data block.
 
 ##### Segments
 
-The body is divided into ~16 SEGMENTS for a 1280-sample event (= 1
-segment per ~80 sample-sets), separated by ``40 02`` segment headers.
-A 3328-sample event has ~42 segments.
+The body is divided into segments separated by ``40 02`` segment headers.
+**Segment size is variable** — bounded by a fixed device-flash byte
+budget, not a fixed sample count. Quiet events fit more samples per
+segment (RLE compacts zero deltas via ``00 NN`` markers); loud events
+fit fewer. Observed first-segment sizes in the bundled fixtures:
 
-The 18-byte ``40 02`` payload structure (CONFIRMED across all 4
-fixtures by inspecting the increment of bytes [8:12]):
+| Event | Segment 0 size (Tran samples) |
+|---|---|
+| SP0 (loud, 0.25s pretrig) | 510 |
+| SV0 (loud-from-start) | 58 (stops at first ``30 NN``) |
+| SS0 (loud-from-start) | 42 (stops at first ``30 04``) |
+| JQ0 (Vert-heavy, quiet Tran) | 510 |
+| V70 (Mic-heavy, quiet geos) | 510 |
 
-| Offset | Length | Field |
-|--------|--------|--------------------------------------------------|
-| 0 | 4 | Anchor / channel state (open — see below) |
-| 4 | 4 | Variable field (open) |
-| 8 | 4 | uint32 LE counter — increments by 1 per segment |
-| 12 | 4 | Fixed pattern ``02 00 00 01`` |
-| 16 | 2 | Variable tail |
+⚠️ Earlier drafts of this section claimed "~80 sample-sets per segment"
+based on incomplete walks; that figure is wrong. Segments are
+flash-page-sized in bytes, not sample-count-sized.
 
-The counter at bytes [8:12] starts in the 0x40s for a freshly-erased
-device and increments cleanly — useful as a structural sanity check.
+The 18-byte ``40 02`` payload structure:
+
+| Offset | Field | Status |
+|-----------|---------------------------------------------|-------------|
+| [0:2] | T_delta at first sample of new segment | ✅ confirmed|
+| | (int16 BE, in 16-count units) | |
+| [2:4] | Likely T_delta at sample seg_start+1 | 🟡 likely |
+| [4:6] | Unknown (varies; possibly a checksum) | ❓ open |
+| [6:8] | Byte length to next segment header − 2 | ✅ confirmed|
+| | (uint16 BE; useful for walker pre-scan) | |
+| [8:12] | Monotonic uint32 LE counter | ✅ confirmed|
+| | (starts ~0x47, increments by 1 per segment) | |
+| [12:14] | Constant ``02 00`` | ✅ confirmed|
+| [14:18] | Unknown 4-byte field | ❓ open |
 
 Examples from event-c (1 sec single-shot):
 
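The 18-byte ``40 02`` payload table added in the hunk above maps directly onto a few `struct` reads. The following is a minimal sketch under that table's stated layout; `SegmentHeaderPayload`, `parse_segment_payload`, and `counters_increment_by_one` are hypothetical names, the payload bytes are synthetic, and the open fields are kept as raw bytes rather than interpreted.

```python
import struct
from typing import NamedTuple, Sequence


class SegmentHeaderPayload(NamedTuple):
    t_delta: int          # [0:2]   int16 BE, raw value (table: 16-count units)
    t_delta_next: int     # [2:4]   int16 BE, only "likely" a T_delta
    unknown_4_6: bytes    # [4:6]   open; kept as raw bytes
    length_field: int     # [6:8]   uint16 BE, byte length to next header minus 2
    counter: int          # [8:12]  uint32 LE, +1 per segment
    unknown_14_18: bytes  # [14:18] open; kept as raw bytes


def parse_segment_payload(payload: bytes) -> SegmentHeaderPayload:
    """Split the 18 bytes that follow a ``40 02`` tag into the table's fields."""
    if len(payload) != 18:
        raise ValueError(f"expected 18 payload bytes, got {len(payload)}")
    if payload[12:14] != b"\x02\x00":
        raise ValueError("constant 02 00 missing at [12:14]; likely misaligned")
    t_delta, t_delta_next = struct.unpack_from(">hh", payload, 0)
    (length_field,) = struct.unpack_from(">H", payload, 6)
    (counter,) = struct.unpack_from("<I", payload, 8)
    return SegmentHeaderPayload(
        t_delta, t_delta_next, payload[4:6], length_field, counter, payload[14:18]
    )


def counters_increment_by_one(headers: Sequence[SegmentHeaderPayload]) -> bool:
    """Structural sanity check: each counter is the previous one plus 1."""
    return all(b.counter == a.counter + 1 for a, b in zip(headers, headers[1:]))


# Synthetic payload for illustration only (not from a fixture).
demo = (
    struct.pack(">hh", 1, 0)      # [0:4]   T_delta = +1, next = 0
    + b"\xaa\xbb"                 # [4:6]   open field
    + struct.pack(">H", 0x01FE)   # [6:8]   length field
    + struct.pack("<I", 0x47)     # [8:12]  counter
    + b"\x02\x00"                 # [12:14] constant
    + bytes(4)                    # [14:18] open field
)
print(parse_segment_payload(demo))
```

Checking the constant at [12:14] and the counter increment gives two independent signals that a walker is still aligned on real segment headers.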
@@ -1008,26 +1042,25 @@ where the codec is most complex stop at the first ``30 04``.
 
 Implementation: :func:`minimateplus.waveform_codec.decode_tran_initial`.
 
-##### Segment header T-delta (PARTIAL 2026-05-11)
+##### Multi-segment Tran continuation — OPEN
 
-The 20-byte ``40 02`` segment header has its first 2 bytes ([0:2] of
-payload) as an int16 BE Tran delta for the first sample of the new
-segment. Verified across V70 (3 segments with 0 deltas) and SP0/JQ0
-(1 segment with +1 delta). Other bytes of the segment header payload
-are partially understood:
+After segment 0 ends and the segment header's T_delta (bytes [0:2])
+is applied, the next segment's blocks produce values that diverge from
+truth by sample ~512. The block structure inside segment 1 is
+identical to segment 0 (alternating ``10 NN`` / ``20 NN`` data +
+``00 NN`` RLE), and the per-segment delta budget exactly matches the
+segment size — V70 segment 1 has 264 nibble-deltas + 244 RLE-zeros =
+508 = the segment's sample count. Cumulative deltas are correct in
+aggregate (V70 net-zero ≈ truth net-zero) but the per-sample trajectory
+is wrong when applied as Tran continuation.
 
-| Payload offset | Field | Status |
-|---|---|---|
-| [0:2] | T_delta at first sample of new segment (int16 BE) | ✅ confirmed |
-| [2:4] | unknown (often 0; not a simple V or T delta) | ❓ open |
-| [4:6] | unknown (varies per event; possibly a checksum) | ❓ open |
-| [6:8] | byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
-| [8:12] | monotonic uint32 LE counter | ✅ confirmed |
-| [12:14] | constant ``02 00`` | ✅ confirmed |
-| [14:18] | unknown 4-byte field | ❓ open |
-
-Multi-segment Tran decoding diverges after sample ~512 — the per-segment
-channel ordering after the header is still unknown.
+The strongest unverified hypothesis is that **segments rotate
+channels**: segment 0 = Tran, segment 1 = Vert, segment 2 = Long,
+segment 3 = Mic, segment 4 = Tran continuation, … This would explain
+the per-segment delta-budget match while also explaining why segment
+1 isn't Tran continuation. Verification needs the per-channel anchor
+to come from segment-header bytes [4:6] or [14:18], which are still
+open.
 
 ##### What's still open
 
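The delta-budget arithmetic and the channel-rotation hypothesis from the last hunk can be written down as a small sketch. Both helpers are hypothetical names introduced here, and the rotation order is only the unverified hypothesis quoted in the diff, not a confirmed property of the codec.

```python
# Rotation order as stated in the (unverified) hypothesis in the diff above.
CHANNEL_ROTATION = ("Tran", "Vert", "Long", "Mic")


def hypothesized_channel(segment_index: int) -> str:
    """Channel a segment would carry IF segments rotate channels (unverified)."""
    return CHANNEL_ROTATION[segment_index % len(CHANNEL_ROTATION)]


def delta_budget_matches(nibble_deltas: int, rle_zeros: int, samples: int) -> bool:
    """True when nibble-deltas plus RLE zeros account for every sample."""
    return nibble_deltas + rle_zeros == samples


# Segments 0..5 under the hypothesis: segment 4 would be the Tran continuation.
print([hypothesized_channel(i) for i in range(6)])
# ['Tran', 'Vert', 'Long', 'Mic', 'Tran', 'Vert']

# V70 segment 1 figures quoted in the diff: 264 + 244 = 508 samples.
print(delta_budget_matches(264, 244, 508))  # True
```

A matching budget only shows that the segment holds one delta per sample; it does not by itself say which channel those deltas belong to, which is why the rotation remains a hypothesis.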