doc(fix): retracts raw int16 LE sample set assumptions. #18

Merged
serversdown merged 1 commit from sfm-waveform-store into main 2026-05-08 15:27:27 -04:00
Showing only changes of commit 8aea46b8a0
+61 -8
@@ -11,6 +11,7 @@
| Date | Section | Change |
|---|---|---|
| 2026-05-08 | §7.6.1 (RETRACTION) | **❌ RETRACTED — "raw int16 LE 8 bytes/sample-set" body codec was never validated.** The original 4-2-26 confirmation was based on misreading broken-decoder output (full-scale ±32K noise) as evidence the signal had saturated. BW's own 0C peaks for that capture (Tran=0.420 / Vert=3.870 / Long=0.495 in/s) prove the signal was NOT saturated — none of those exceed 13K ADC counts. No event in the project's archive has ever come close to saturation, yet the decoder consistently produces ±32K noise on every event. Conclusion: the body codec is not raw int16 LE; the actual encoding is open. Body byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`, lots of `10 XX` pairs) — likely a delta encoding with `0x10` as escape, but unverified. Retraction box added at top of §7.6.1; "fully-saturating event" claim removed from channel-identification note. The histogram codec in §7.6.2 IS verified and decoded correctly (different recording mode, 32-byte blocks); use it as a structural hint when reverse-engineering the waveform codec. |
| 2026-02-26 | Initial | Document created from first hex dump analysis |
| 2026-02-26 | §2 Frame Structure | **CORRECTED:** Frame uses DLE-STX (`0x10 0x02`) and DLE-ETX (`0x10 0x03`), not bare `0x02`/`0x03`. `0x41` confirmed as ACK not STX. DLE stuffing rule added. |
| 2026-02-26 | §8 Timestamp | **UPDATED:** Year `0x07CB = 1995` confirmed as MiniMate hardware default date when RTC battery is disconnected. Not an encoding error. Confidence upgraded from ❓ to 🔶. |
@@ -851,14 +852,59 @@ MicL: 39 64 1D AA = 0.0000875 psi
> strings actually live — NOT in any sample-chunk frame)
> - **§7.8.8** — multi-event "Download All" sequence
>
> The waveform sample encoding (4-channel interleaved s16 LE, 8 bytes per sample-set) described in §7.6.1
> below is still correct. Only the frame-indexing claims and metadata-source claims are wrong.
> The waveform sample encoding described in §7.6.1 below (4-channel interleaved s16 LE, 8 bytes
> per sample-set) is **NOT actually verified** — see the retraction note at the top of §7.6.1.
> The frame-indexing claims and metadata-source claims in §7.6 are also wrong; use §7.8.5–§7.8.8.
**Two distinct formats exist depending on recording mode. Both confirmed from captures.**
---
#### 7.6.1 Blast / Waveform mode — ✅ CONFIRMED (4-2-26 capture)
#### 7.6.1 Blast / Waveform mode — ❌ NOT VERIFIED (retracted 2026-05-08)
> ## ⚠️ RETRACTION (2026-05-08)
>
> The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim
> below was **never actually validated**. It got into this document
> because the decoder built around that assumption produced full-scale
> ±32K counts on every channel of the 4-2-26 capture, and the
> ±32K-shaped output was misread as "the signal must have saturated."
>
> Cross-checking the BW-reported peaks proves the opposite:
>
> | Channel | BW PPV (in/s) | Expected ADC counts at 10 in/s FS |
> |---|---|---|
> | Tran | 0.420 | **1,376** |
> | Vert | 3.870 | **12,681** |
> | Long | 0.495 | **1,622** |
>
> None of these are anywhere near ±32K saturation. No event in the
> project's archive (across all captures from 1-2-26 onward) has
> ever come close to saturation either. Yet the decoder has
> consistently produced ±32K-shaped noise on every event. The right
> conclusion is that the byte-to-sample interpretation has been wrong
> the whole time, NOT that every event happened to saturate.
>
> What's actually known about the body bytes:
>
> - The byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`,
> plus high frequencies of `0x01`, `0x04`, `0x0F`, `0xF0`, and `0xF1`),
> with many `10 XX` pairs. Reading the bytes as LE int16 produces uniform
> ±32K noise, the signature of misaligned or encoded data.
> - The CHANGELOG note for v0.14.2 calls the body a "delta-encoded
> ADC stream" — that hint plus the byte distribution points toward
> a delta encoding with `0x10` as an escape marker, but no decoder
> has been worked out yet.
> - The histogram-mode codec in §7.6.2 IS verified and decoded
> correctly (different format: 32-byte blocks with 9× int16 LE
> samples + metadata). The same firmware emits both formats, so
> §7.6.2 may share encoding primitives with the waveform codec
> and is worth using as a structural hint when reverse-engineering.
>
> **Treat the spec below as a starting hypothesis to disprove, not
> ground truth.** The frame-layout pieces (STRT location, preamble,
> chunk header) appear correct; the per-byte sample interpretation
> is the open question.
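
The cross-check and the byte statistics in the retraction above can be reproduced with a short script. The sketch below is illustrative only and is not project code: all function names are hypothetical, and the PPV conversion assumes a linear mapping of the 10 in/s full scale onto 32768 counts, as the table heading suggests.

```python
import struct
from collections import Counter

FULL_SCALE_IN_S = 10.0     # assumed geophone full scale (from the table heading)
FULL_SCALE_COUNTS = 32768  # assumed signed 16-bit ADC range


def expected_counts(ppv_in_s, full_scale=FULL_SCALE_IN_S):
    """Estimate the ADC count for a reported PPV, assuming linear scaling."""
    return ppv_in_s / full_scale * FULL_SCALE_COUNTS


def byte_fractions(body):
    """Return {byte value: fraction of the body} for a skew check."""
    total = len(body)
    if total == 0:
        return {}
    return {value: n / total for value, n in Counter(body).items()}


def naive_s16le_peaks(body, channels=4):
    """Per-channel peak |sample| under the retracted interleaved s16 LE reading."""
    usable = len(body) - len(body) % (2 * channels)  # drop a trailing partial set
    samples = struct.unpack("<%dh" % (usable // 2), body[:usable])
    return [
        max((abs(s) for s in samples[ch::channels]), default=0)
        for ch in range(channels)
    ]
```

If `naive_s16le_peaks` on a captured body returns values near 32K on every channel while `expected_counts` for the BW-reported peaks stays below 13K, the s16 LE reading is inconsistent with the 0C record for that event.
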
4-channel interleaved signed 16-bit little-endian, 8 bytes per sample-set:
@@ -923,11 +969,18 @@ Total: 7633B → 954 naive sample-sets, 948 alignment-corrected
Only 948 of 9306 sample-sets captured (10%) — `stop_after_metadata=True` terminated
download after A5[7] was received.
**Channel identification note:** The 4-2-26 blast saturated all four geophone channels
to near-maximum ADC output (~32000–32617 counts). Channel ordering [Tran, Vert, Long, Mic]
= [ch0, ch1, ch2, ch3] is the Blastware convention and is consistent with per-channel PPV
values (Tran=0.420, Vert=3.870, Long=0.495 in/s from 0C record), but cannot be
independently confirmed from a fully-saturating event alone.
**Channel identification note:** Channel ordering [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3]
is the Blastware convention. This ordering has not been independently verified end-to-end,
since no decoder yet produces samples that match BW's own rendering of the same event (see
the retraction at the top of §7.6.1). Once the body codec is decoded, the per-channel PPV
values from the 0C record (Tran=0.420, Vert=3.870, Long=0.495 in/s for the 4-2-26 capture)
provide the cross-check that pins down channel order.
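
One way the cross-check could work once a decoder exists is sketched below. This is a hypothetical helper, not project code: `decoded_peaks` stands for per-channel peak counts from a future working decoder, and the `counts_per_in_s` scale (32768 counts per 10 in/s) is an assumption. The Mic channel is left out because its peak is not reported in in/s.

```python
from itertools import permutations

# 0C-record peaks for the 4-2-26 capture, in in/s (values from this section)
REPORTED_PPV = {"Tran": 0.420, "Vert": 3.870, "Long": 0.495}


def best_channel_order(decoded_peaks, reported_ppv, counts_per_in_s=3276.8):
    """Find the channel assignment whose decoded peaks best match the reported PPVs.

    decoded_peaks: per-channel peak |count| values, indexed by channel number.
    Returns {channel name: channel index} with the smallest total absolute error.
    """
    names = list(reported_ppv)
    if len(decoded_peaks) < len(names):
        raise ValueError("need at least one decoded channel per reported PPV")
    best_err, best_map = None, None
    # Try every way of assigning distinct channel indices to the named channels.
    for perm in permutations(range(len(decoded_peaks)), len(names)):
        err = sum(
            abs(decoded_peaks[idx] / counts_per_in_s - reported_ppv[name])
            for name, idx in zip(names, perm)
        )
        if best_err is None or err < best_err:
            best_err, best_map = err, dict(zip(names, perm))
    return best_map
```

Because Tran and Long differ by only 0.075 in/s in this capture, a single event may not separate them convincingly; repeating the check across several events would give a firmer answer.
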
> **Historical note:** earlier revisions of this section claimed the 4-2-26 blast had
> "saturated all four channels to ~32000–32617 counts," citing that as evidence the s16 LE
> interpretation was correct. That claim was wrong — the ±32K values were the broken
> decoder's output, not the actual signal amplitude (which the 0C peaks above show was
> nowhere near saturation). Retracted 2026-05-08.
---