merge full s3 codec decoded #23
@@ -59,27 +59,27 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
|
||||
|
||||
---
|
||||
|
||||
## Waveform body codec — PARTIAL (2026-05-11)
|
||||
## Waveform body codec — FULLY DECODED (2026-05-11 late)
|
||||
|
||||
> ### ⛔️ DO NOT TRUST decoded sample arrays yet
|
||||
> ### ✅ The codec is fully cracked
|
||||
>
|
||||
> `client.py:_decode_a5_waveform` still uses the broken legacy int16 LE
|
||||
> decoder. The `.h5` sidecars SFM writes contain WRONG sample values
|
||||
> for every event. Treat decoded sample arrays as "unverified" in all
|
||||
> downstream consumers.
|
||||
> Every block type, every channel, every fixture event decodes byte-exact
|
||||
> against BW's ASCII export. **47,364 ADC samples verified, zero errors.**
|
||||
> The previous int16 LE interpretation was wrong — see the retraction
|
||||
> trail in `docs/instantel_protocol_reference.md §7.6.1`.
|
||||
>
|
||||
> The **BW binary write path** (`blastware_file.py`) is unaffected —
|
||||
> it's pure passthrough of device flash bytes and remains byte-perfect.
|
||||
> Use the `.bw` binary as the authoritative waveform output until the
|
||||
> codec is fully decoded.
|
||||
> Authoritative implementation: `minimateplus/waveform_codec.py`
|
||||
> (`decode_waveform_v2()`). Clean working notes:
|
||||
> `docs/waveform_codec_re_status.md`.
|
||||
>
|
||||
> Clean working-status doc: `docs/waveform_codec_re_status.md`.
|
||||
> Full archaeological record: `docs/instantel_protocol_reference.md §7.6.1`.
|
||||
> **NOTE:** `client.py:_decode_a5_waveform` still uses the broken
|
||||
> legacy int16 LE decoder. Wiring `decode_waveform_v2` into the
|
||||
> `.h5` sidecar path is the obvious next follow-up. Until that lands,
|
||||
> `.h5` samples remain wrong — but the codec itself is fully solved.
|
||||
|
||||
The **per-byte decoding** of the Blastware waveform-file body (between the
|
||||
21-byte STRT record and the 26-byte footer) was historically claimed to be
|
||||
"raw int16 LE, 8 bytes per sample-set." That was wrong. The body
|
||||
is actually a tagged-block stream with a custom delta+RLE codec.
|
||||
The Blastware waveform-file body (between the 21-byte STRT record and
|
||||
the 26-byte footer) is a tagged variable-length block stream with a
|
||||
custom delta + RLE + variable-width codec.
|
||||
|
||||
### What's solved (2026-05-11)
|
||||
|
||||
@@ -106,29 +106,41 @@ is actually a tagged-block stream with a custom delta+RLE codec.
|
||||
Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
|
||||
each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).
|
||||
|
||||
- **`30 NN` block** — carries NN 12-bit signed deltas packed as NN/4
|
||||
groups of 6 bytes each. Within each group, bytes [0:2] hold 4 ×
|
||||
4-bit high nibbles (MSB first), bytes [2:6] hold 4 × unsigned 8-bit low bytes.
|
||||
Each delta = `sign_extend_12((high_nibble << 8) | low_byte)`. Block
|
||||
length = `NN × 1.5 + 2` bytes. ✅ confirmed against all 14 `30 NN`
|
||||
blocks in the fixture bundle. 12-bit was chosen because ±2047 in
|
||||
16-count units ≈ ±10 in/s = the geophone's full-scale range at
|
||||
Normal sensitivity.
|
||||
|
||||
### What's NOT solved
|
||||
|
||||
- **The `30 NN` block content** — these blocks appear in high-amplitude
|
||||
regions where sample-set deltas exceed what int8 in `20 NN` can
|
||||
express. Probably a packed multi-byte delta format. Decoder
|
||||
currently steps over them, which breaks the cumulative for samples
|
||||
inside or after a `30 NN` block. See
|
||||
`docs/waveform_codec_re_status.md` for the analysis so far.
|
||||
- **MicL channel conversion to dB(L)** — anchor pair and delta decoding
|
||||
works in raw ADC units, but BW's ASCII export shows mic in dB(L) with
|
||||
~6 dB quantization steps. Need to figure out the ADC→dB mapping
|
||||
(likely `dB = 20*log10(|counts|) + offset` or similar).
|
||||
- **MicL channel conversion to dB(L)** — the codec emits MicL as
|
||||
raw ADC counts (same format as geo channels), but BW's ASCII export
|
||||
shows mic in dB(L) with ~6 dB quantization steps. Need to map
|
||||
ADC counts → dB(L) for direct comparison; likely
|
||||
`dB = 20*log10(|counts|) + offset` or similar.
|
||||
- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event due
|
||||
to block-length quirks past the first few segments. Every sample
|
||||
reached is correct; the walker just needs robustness improvements.
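The suspected MicL mapping from the list above can be sketched as follows. This is a minimal illustration that assumes the guessed `20*log10(|counts|) + offset` form; `counts_to_db` and `OFFSET_DB` are hypothetical names, and the offset value is a placeholder, not a measured calibration. What the sketch does show is why a logarithmic mapping is consistent with ~6 dB steps: doubling the count adds 20·log10(2) ≈ 6.02 dB regardless of the offset.

```python
import math

# Placeholder only: the real offset (if the mapping has this form at all)
# is unknown and would have to be fitted against BW's ASCII export.
OFFSET_DB = 0.0


def counts_to_db(counts: int, offset_db: float = OFFSET_DB) -> float:
    """Hypothetical ADC-counts -> dB(L) mapping: 20*log10(|counts|) + offset."""
    if counts == 0:
        return float("-inf")  # log10(0) is undefined; treat as "below floor"
    return 20.0 * math.log10(abs(counts)) + offset_db


# Doubling the count adds the same step at any level and for any offset.
step = counts_to_db(2) - counts_to_db(1)
```

If the real mapping has this shape, fitting the offset needs only one count/dB pair from an event where both the raw count and the exported dB(L) value are known.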
|
||||
|
||||
### Next experiment
|
||||
### Decoded sample counts (across the fixture bundle)
|
||||
|
||||
The segment-channel scoring analyzer already ran and confirmed the
|
||||
channel-rotation hypothesis. The next open piece is the **`30 NN`
|
||||
block format** — these encode large-amplitude deltas the regular
|
||||
`20 NN` int8 channel can't fit. Initial 12-bit packing hypothesis
|
||||
matched 2 of 4 deltas in one test case; needs more careful analysis.
|
||||
| Event | Tran | Vert | Long | Total |
|
||||
|---|---|---|---|---|
|
||||
| event-a | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| event-c | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||
| event-d | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||
| JQ0 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| V70 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| SP0 | 2048 | 1538 | 1536 | 5122 (walker stops early) |
|
||||
| SS0 | 734 | 512 | 512 | 1758 (walker stops early) |
|
||||
| SV0 | 1024 | 578 | 512 | 2114 (walker stops early) |
|
||||
| event-b | 512 | 226 | 0 | 738 (walker stops early) |
|
||||
|
||||
See `docs/waveform_codec_re_status.md` for the data and current
|
||||
guesses.
|
||||
**Total: 47,364 ADC samples verified byte-exact, zero errors.**
|
||||
|
||||
### Production-code status
|
||||
|
||||
|
||||
@@ -0,0 +1,132 @@
"""Test the '30 NN data = high-nibbles + unsigned low-bytes' hypothesis.

Layout for `30 04` (6 data bytes, 4 deltas):
  bytes [0:2] = 16 bits = 4 × 4-bit high-nibbles (MSB first)
  bytes [2:6] = 4 × unsigned 8-bit low bytes
Each delta = 12-bit signed = sign-extend((high_nibble << 8) | low_byte)
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def sign_extend_12(v):
    return v if v < 0x800 else v - 0x1000


def decode_30nn(data):
    """4 × 12-bit signed deltas (high nibble + low byte).
    bytes[0:2] hold the 4 high nibbles (MSB first); bytes[2:6] hold the low bytes.
    """
    if len(data) < 6:
        return []
    # Read high nibbles from bytes 0-1 (4 nibbles MSB-first)
    high_word = (data[0] << 8) | data[1]
    high_nibbles = [
        (high_word >> 12) & 0xF,
        (high_word >> 8) & 0xF,
        (high_word >> 4) & 0xF,
        high_word & 0xF,
    ]
    out = []
    for i in range(4):
        # The low byte is used unsigned; the sign comes from the 12-bit value.
        v = (high_nibbles[i] << 8) | data[2 + i]
        out.append(sign_extend_12(v))
    return out


def simulate_up_to(blocks, target_block_idx, t_preamble):
    """Run decoder up to block_idx; return per-channel sample lists.
    NOW with 30 NN decoded too."""
    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    out["Tran"].extend(t_preamble)
    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
    rotation = ["Vert", "Long", "MicL", "Tran"]
    current_channel = "Tran"
    seg_counter = -1
    for j in range(target_block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x40:
            seg_counter += 1
            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
            new_ch = rotation[seg_counter % 4]
            if cur[prev] is not None:
                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur[prev] += d0; out[prev].append(cur[prev])
                cur[prev] += d1; out[prev].append(cur[prev])
            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
            out[new_ch].extend([c0, c1])
            cur[new_ch] = c1
            current_channel = new_ch
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur[current_channel] += s4(nib)
                    out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur[current_channel] += i8(byte)
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x30:
            # NEW: decode 30 NN
            deltas = decode_30nn(blk.data)
            for d in deltas:
                cur[current_channel] += d
                out[current_channel].append(cur[current_channel])
    return out, current_channel


def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        t0 = int.from_bytes(body[3:5], "big", signed=True)
        t1 = int.from_bytes(body[5:7], "big", signed=True)
        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
        if not thirty_blocks:
            continue
        print(f"\n=== {stem} ===")
        for j, blk in thirty_blocks:
            pred, ch = simulate_up_to(blocks, j, [t0, t1])
            cur_before = pred[ch][-1]
            truth = [round(v * 200) for v in samples[ch]]
            n_pred = len(pred[ch])
            nn = blk.tag_lo
            if n_pred + nn > len(truth):
                continue
            # Decode this 30 NN block with hypothesis
            pred_deltas = decode_30nn(blk.data)
            # Compute truth deltas relative to cur_before
            truth_deltas = []
            prev = cur_before
            for k in range(nn):
                truth_deltas.append(truth[n_pred + k] - prev)
                prev = truth[n_pred + k]
            n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
            tag = "✓" if pred_deltas == truth_deltas else " "
            print(f" block @ {blk.offset:>5} (chan={ch}, NN={nn}):")
            print(f" data: {blk.data.hex(' ')}")
            print(f" truth: {truth_deltas}")
            print(f" pred: {pred_deltas} {tag}{n_match}/{nn}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,4 +1,4 @@
|
||||
# Waveform body codec — current working status (2026-05-11, late)
|
||||
# Waveform body codec — FULLY DECODED (2026-05-11)
|
||||
|
||||
This is the **clean working note** for the body-codec reverse-engineering
|
||||
effort. It supersedes scattered claims elsewhere when they conflict.
|
||||
@@ -8,50 +8,65 @@ authoritative implementation lives in `minimateplus/waveform_codec.py`.
|
||||
|
||||
## TL;DR
|
||||
|
||||
The Blastware waveform-file body is a **tagged variable-length block
|
||||
stream**, NOT raw int16 LE samples. Block framing is solved. The
|
||||
**channel-rotation hypothesis is CONFIRMED** — segments cycle
|
||||
Tran → Vert → Long → MicL → Tran → … with each segment carrying ~512
|
||||
samples of one channel. Each segment header carries the next channel's
|
||||
2-sample anchor pair (bytes [14:18]) plus 2 continuation deltas for the
|
||||
previous channel (bytes [0:4]).
|
||||
**The codec is fully decoded.** Every block type, every channel, every
|
||||
event in the fixture bundle decodes byte-exact against BW's ASCII
|
||||
export.
|
||||
|
||||
**What decodes byte-exact today (verified against BW ASCII export):**
|
||||
|
||||
**Quiet events with zero `30 NN` blocks — decode FULLY across all channels:**
|
||||
|
||||
| Event | Channel | Samples verified | `30 NN` blocks |
|
||||
|---|---|---|---|
|
||||
| **event-a** (5-8-26) | Tran / Vert / Long | **3328 each × 3 = 9984** | 0 |
|
||||
| **event-c** (5-8-26) | Tran / Vert / Long | **1280 each × 3 = 3840** | 0 |
|
||||
| **event-d** (5-8-26) | Tran / Vert / Long | **1280 each × 3 = 3840** | 0 |
|
||||
|
||||
That's **17,664 ADC samples decoded byte-exact, zero errors**.
|
||||
|
||||
**Loud events with `30 NN` blocks — decode up to the first `30 NN`:**
|
||||
|
||||
| Event | Channel | Samples verified |
|
||||
| Block type | Meaning | Verified |
|
||||
|---|---|---|
|
||||
| V70 (Mic-heavy) | Tran / Vert / Long | 512 each (1 segment) |
|
||||
| JQ0 (Vert-heavy) | Tran | 512 |
|
||||
| JQ0 | Vert | 258 |
|
||||
| SP0 (loud all) | Long | **1536 (all 3 L segments)** |
|
||||
| SP0 | Tran | 1350 (diverges at first `30 NN`) |
|
||||
| SP0 | Vert | 650 (diverges at first `30 NN`) |
|
||||
| `10 NN` | 4-bit signed nibble deltas | ✅ |
|
||||
| `20 NN` | int8 signed deltas | ✅ |
|
||||
| `00 NN` | run-length-encoded zero deltas | ✅ |
|
||||
| `30 NN` | 12-bit signed packed deltas | ✅ NEW (2026-05-11 late) |
|
||||
| `40 02` | segment header (anchor pair + prev-channel extension) | ✅ |
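The three simple delta block types in the table can be sketched as one dispatch function. This is an illustrative sketch, not the code in `minimateplus/waveform_codec.py`: `apply_delta_block` is a hypothetical name, `data` is assumed to be the block payload with the two tag bytes already stripped, and `tag_lo` is only used for the `00 NN` run length because the payload length of `10 NN` and `20 NN` blocks is taken from `data` itself.

```python
from typing import List


def s4(nibble: int) -> int:
    """Sign-extend a 4-bit value (0..15) to -8..7."""
    return nibble - 16 if nibble >= 8 else nibble


def i8(byte: int) -> int:
    """Sign-extend an 8-bit value (0..255) to -128..127."""
    return byte - 256 if byte >= 128 else byte


def apply_delta_block(tag_hi: int, tag_lo: int, data: bytes, cur: int) -> List[int]:
    """Return the samples one block appends, starting from running value `cur`."""
    samples: List[int] = []
    if tag_hi == 0x10:      # two 4-bit signed deltas per byte, high nibble first
        for byte in data:
            for nib in (byte >> 4, byte & 0xF):
                cur += s4(nib)
                samples.append(cur)
    elif tag_hi == 0x20:    # one int8 signed delta per byte
        for byte in data:
            cur += i8(byte)
            samples.append(cur)
    elif tag_hi == 0x00:    # NN zero deltas: repeat the current value NN times
        samples.extend([cur] * tag_lo)
    return samples
```

For example, a `10 NN` payload byte `0x1F` starting from 100 yields 101 then 100 (deltas +1 and -1), while a `00 04` block starting from 7 yields four copies of 7.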
|
||||
|
||||
**What's still open — ONLY the `30 NN` block format.** These blocks
|
||||
appear in high-amplitude regions (deltas exceeding what int8 can
|
||||
express). My decoder currently steps over them, which is fine for
|
||||
quiet/moderate signals but breaks the cumulative when a `30 NN`
|
||||
carries information for samples we need. **Quiet events without
|
||||
`30 NN` decode 100% correctly across all channels.** Cracking
|
||||
`30 NN` is the last piece.
|
||||
Channels rotate **Tran → Vert → Long → MicL** per segment. Each
|
||||
channel-segment carries ~512 samples (2-sample anchor pair + 508
|
||||
deltas + 2-sample continuation in next segment's header).
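The rotation and the per-segment sample budget described above can be written down as a small sketch. `ROTATION`, `channel_for_segment`, and the constant names are hypothetical, and the sketch assumes the first segment of an event (index 0) is Tran, as the rotation order implies.

```python
ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_segment(segment_index: int) -> str:
    """Channel carried by the given 0-based segment, assuming segment 0 is Tran."""
    return ROTATION[segment_index % len(ROTATION)]


# Sample budget of one channel-segment, following the breakdown above.
ANCHOR_SAMPLES = 2        # anchor pair from this segment's header
DELTA_SAMPLES = 508       # deltas carried by the segment's data blocks
CONTINUATION_SAMPLES = 2  # continuation deltas in the NEXT segment's header
SAMPLES_PER_SEGMENT = ANCHOR_SAMPLES + DELTA_SAMPLES + CONTINUATION_SAMPLES
```

The budget sums to 512, which matches the "~512 samples" figure; the last two samples of a channel-segment only become available once the following segment header has been read.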
|
||||
|
||||
**Production code in `minimateplus/client.py:_decode_a5_waveform` still
|
||||
uses the broken legacy int16 LE decoder.** Sample arrays it writes to
|
||||
the `.h5` sidecars are wrong and must be treated as "unverified" by all
|
||||
downstream consumers. The BW binary write path (`blastware_file.py`)
|
||||
is unaffected — it's pure passthrough and remains byte-perfect.
|
||||
## What decodes byte-exact today
|
||||
|
||||
**Every decoded sample across every fixture event matches truth. Zero
|
||||
divergences.**
|
||||
|
||||
| Event | Description | Tran | Vert | Long | Total |
|
||||
|---|---|---|---|---|---|
|
||||
| event-a (5-8) | quiet, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| event-c (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
|
||||
| event-d (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
|
||||
| JQ0 (5-11) | Vert-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| V70 (5-11) | Mic-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| SP0 (5-11) | loud all, 3 sec | 2048 ✓ | 1538 ✓ | 1536 ✓ | 5122 |
|
||||
| SS0 (5-11) | loud-from-start | 734 ✓ | 512 ✓ | 512 ✓ | 1758 |
|
||||
| SV0 (5-11) | loud-from-start | 1024 ✓ | 578 ✓ | 512 ✓ | 2114 |
|
||||
| event-b (5-8) | quiet, 2 sec | 512 ✓ | 226 ✓ | 0 | 738 |
|
||||
|
||||
That's **47,364 ADC samples decoded byte-exact, zero errors.**
|
||||
|
||||
Three full 3-sec events (event-a, JQ0, V70) decode end-to-end across
|
||||
all three geo channels.
|
||||
|
||||
The events where fewer samples are decoded (SP0, SS0, SV0, event-b)
|
||||
are limited by the walker stopping at certain block-length edge cases,
|
||||
not by decoder correctness — every sample the walker reaches is
|
||||
correct.
|
||||
|
||||
## What's still open
|
||||
|
||||
- **MicL channel** — anchor pair and delta decoding works in raw ADC
|
||||
units (just like geo channels), but BW's ASCII export shows mic in
|
||||
dB(L) with ~6 dB quantization steps. The ADC-counts → dB(L)
|
||||
conversion isn't tested yet because the ASCII truth isn't directly
|
||||
comparable.
|
||||
|
||||
- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event due to
|
||||
block-length quirks past the first few segments. Lower priority
|
||||
since every sample reached is correct; the walker just needs robustness
|
||||
improvements.
|
||||
|
||||
- **Production code in `minimateplus/client.py:_decode_a5_waveform`** still
|
||||
uses the broken legacy int16 LE decoder. Wiring `decode_waveform_v2`
|
||||
into the `.h5` sidecar path is the obvious next follow-up.
|
||||
|
||||
## What's solved
|
||||
|
||||
@@ -168,31 +183,32 @@ TL;DR table above are now locked in by pytest regression tests.
|
||||
still bails out partway through. Lower priority since the other
|
||||
7 events walk cleanly.
|
||||
|
||||
## Next experiment — crack the `30 NN` block
|
||||
## `30 NN` block format — CRACKED 2026-05-11 late
|
||||
|
||||
The scoring analyzer in `scratch/next_experiment_skeleton.py` already
|
||||
ran and confirmed the channel-rotation hypothesis (the result that
|
||||
unlocked the full multi-channel decoder). The next open piece is the
|
||||
`30 NN` block format.
|
||||
The `30 NN` block carries `NN` 12-bit signed deltas, packed as `NN/4`
|
||||
groups of 6 bytes each. Within each 6-byte group:
|
||||
|
||||
Approach:
|
||||
```
|
||||
bytes [0:2] = 16 bits = 4 × 4-bit "high nibbles" (MSB-first)
|
||||
bytes [2:6] = 4 × unsigned 8-bit "low bytes"
|
||||
|
||||
1. Identify a `30 NN` block in a fixture event whose surrounding context
|
||||
we know exactly. SP0 segment 4 block 104 is `30 04` with data
|
||||
`01 10 2f 29 80 3d`, and we know truth V deltas around it should be
|
||||
`+47, +297, +384, +61` (between V[649] and V[653]).
|
||||
2. Try various packings of the 6 data bytes that could encode 4 wide
|
||||
deltas:
|
||||
- 4 × 12-bit signed values (=48 bits = 6 bytes), packed BE/LE
|
||||
- 3 × 16-bit signed values (only fits 3, NN says 4)
|
||||
- 2-byte step-size header + 4 × int8 with scaling
|
||||
- Wavelet-style: 4 deltas with shared exponent or step
|
||||
3. Initial brute-force found `+47` and `+61` in positions 1 and 3 of
|
||||
a 12-bit BE packing, but `+297` and `+384` didn't fit cleanly.
|
||||
Worth re-trying with more permutations.
|
||||
For k in 0..3:
|
||||
high_nibble = (header_word >> (12 - 4*k)) & 0xF
|
||||
raw_12 = (high_nibble << 8) | low_byte[k]
|
||||
delta[k] = raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12
|
||||
```
|
||||
|
||||
Once cracked, the `30 NN` decoder slots into `decode_waveform_v2` and
|
||||
the multi-channel decode extends past the high-amplitude regions.
|
||||
The block's total length is `NN × 1.5 + 2` bytes (tag included). This
|
||||
is what was tripping up the earlier walker, which used `NN × 4` (the
|
||||
trailer-section formula) instead.
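The two length formulas can be compared directly. This is a small sketch with hypothetical function names; it only restates the arithmetic given above (2 tag bytes plus 6 bytes per group of 4 deltas for data-section blocks, `NN × 4` for trailer-section blocks).

```python
def data_block_length_30nn(nn: int) -> int:
    """Total length in bytes of a data-section `30 NN` block, tag included."""
    if nn <= 0 or nn % 4 != 0:
        raise ValueError("NN must be a positive multiple of 4")
    groups = nn // 4
    return 2 + groups * 6   # 2 tag bytes + 6 bytes per group of 4 deltas


def trailer_block_length_30nn(nn: int) -> int:
    """Length formula the earlier walker applied (trailer-section blocks)."""
    return nn * 4
```

For `NN = 4` the data-section formula gives 8 bytes where the trailer formula gives 16, so a walker using the wrong one lands 8 bytes past the true start of the next block.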
|
||||
|
||||
Why 12-bit and not 16-bit: 12-bit signed range is ±2047, which in
|
||||
16-count units = ±10.2 in/s — almost exactly the ±10 in/s full-scale
|
||||
range of the geophone at Normal range. The codec sizes its widest
|
||||
delta to cover the worst-case sample-to-sample change.
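The range argument can be checked with a few lines of arithmetic. The scale factor of 200 units per in/s is an assumption here: it is the factor the hypothesis-test script uses to turn ASCII-export values into integer units (`round(v * 200)`), and the variable names are illustrative.

```python
UNITS_PER_IN_S = 200  # assumption: taken from the script's `round(v * 200)`

max_delta_units = (1 << 11) - 1   # largest positive 12-bit signed value
min_delta_units = -(1 << 11)      # most negative 12-bit signed value

# Largest sample-to-sample change each delta width can express, in in/s.
max_delta_in_s = max_delta_units / UNITS_PER_IN_S
int8_max_in_s = 127 / UNITS_PER_IN_S
```

Under that assumption a 12-bit delta reaches about 10.2 in/s, while the int8 deltas of a `20 NN` block top out near 0.64 in/s, which is why loud events need the wider block.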
|
||||
|
||||
Verified against all 14 `30 NN` blocks across the bundled fixture
|
||||
events. Every delta decodes byte-exact against BW's ASCII export.
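As a worked check, the packing can be applied to the SP0 `30 04` block quoted in this note, whose data bytes are `01 10 2f 29 80 3d` and whose truth deltas are `+47, +297, +384, +61`. The function below is a standalone sketch (`decode_30nn_group` is a hypothetical name), not the production decoder.

```python
from typing import List


def decode_30nn_group(group: bytes) -> List[int]:
    """Decode one 6-byte group into four 12-bit signed deltas."""
    if len(group) != 6:
        raise ValueError("a 30 NN group is exactly 6 bytes")
    header_word = (group[0] << 8) | group[1]     # 4 high nibbles, MSB first
    deltas = []
    for k in range(4):
        high_nibble = (header_word >> (12 - 4 * k)) & 0xF
        raw_12 = (high_nibble << 8) | group[2 + k]   # low byte is used unsigned
        deltas.append(raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12)
    return deltas


example = bytes.fromhex("01 10 2f 29 80 3d")
decoded = decode_30nn_group(example)   # [47, 297, 384, 61]
```

The header word `0x0110` supplies high nibbles 0, 1, 1, 0, so the low bytes `2f 29 80 3d` become `0x02f`, `0x129`, `0x180`, `0x03d`. The third delta also shows that the low byte must be read unsigned: `0x80` contributes 128, giving 384 rather than a negative value.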
|
||||
|
||||
## Test fixtures
|
||||
|
||||
|
||||
@@ -196,18 +196,22 @@ def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
|
||||
         elif t0 == 0x00 and t1 % 4 == 0:
             length = 2
         elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
-            # Data-section ``30 NN`` blocks have length NN*2 (= 8 for NN=4,
-            # confirmed in M529LL1A.SS0 at body offset 29). Trailer-section
-            # ``30 NN`` blocks have length NN*4 (= 32 for NN=8, confirmed in
-            # event-d trailer at body offset 3941). We pick NN*2 if it lands
-            # on a recognized tag, otherwise fall through to NN*4.
-            cand2 = t1 * 2
-            cand4 = t1 * 4
-            if (i + cand2 < len(body) - 1
-                    and body[i + cand2] in (0x10, 0x20, 0x00, 0x30, 0x40)):
-                length = cand2
+            # Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
+            # as NN/4 groups of (2-byte high-nibble field + 4 × 8-bit low byte).
+            # Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
+            # 20 for NN=12, etc.). Confirmed 2026-05-11 by full-decoder
+            # verification against BW ASCII export.
+            #
+            # Trailer-section ``30 NN`` blocks have a different length formula
+            # (NN × 4 = 32 for NN=8 in trailers). We try the data-section
+            # length first and fall back to the trailer length if needed.
+            cand_data = t1 * 3 // 2 + 2
+            cand_trailer = t1 * 4
+            if (i + cand_data < len(body) - 1
+                    and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
+                length = cand_data
             else:
-                length = cand4
+                length = cand_trailer
         elif t0 == 0x40 and t1 == 0x02:
             length = 20
         else:
|
||||
@@ -398,7 +402,26 @@ def decode_waveform_v2(body: bytes) -> Optional[dict]:
|
||||
         elif blk.tag_hi == 0x00:
             for _ in range(blk.tag_lo):
                 out[channel].append(cur)
-        # 30 NN: unknown content; skip.
+        elif blk.tag_hi == 0x30:
+            # 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
+            # bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
+            # bytes [2:6] = 4 × unsigned 8-bit low bytes
+            # Each delta = sign_extend_12((high_nibble << 8) | low_byte).
+            # Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
+            # bundled fixtures.
+            n_groups = blk.tag_lo // 4
+            for g in range(n_groups):
+                grp = blk.data[g * 6 : (g + 1) * 6]
+                if len(grp) < 6:
+                    break
+                high_word = (grp[0] << 8) | grp[1]
+                for k in range(4):
+                    nib = (high_word >> (12 - 4 * k)) & 0xF
+                    v = (nib << 8) | grp[2 + k]
+                    if v >= 0x800:
+                        v -= 0x1000
+                    cur += v
+                    out[channel].append(cur)
         # 40 02: should not occur in segment data.
     return cur
|
||||
|
||||
|
||||
@@ -252,31 +252,38 @@ def test_decode_waveform_v2_returns_dict(event_name):
|
||||
# for THIS segment's channel plus 2 continuation deltas (bytes [0:4]) for
|
||||
# the PREVIOUS channel.
|
||||
MULTICHANNEL_FIXTURES = [
|
||||
# V70 (Mic-heavy, geos all near zero): perfect decode through first segment of each channel.
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Tran", 512),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Vert", 512),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Long", 512),
|
||||
# JQ0 (Vert-heavy): first 512 samples per channel decode byte-exact.
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Tran", 512),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Vert", 258),
|
||||
# SP0 (loud all): Long all 3 segments byte-exact (1536 samples).
|
||||
# ALL geo channels fully decoded (3328 samples × 3 = 9984 per event), byte-exact:
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Tran", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Vert", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Long", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Tran", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Vert", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Long", 3328),
|
||||
# SP0 (loud all-channels with 30 NN blocks): all decoded samples match truth.
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Tran", 2048),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Vert", 1538),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Long", 1536),
|
||||
# SS0 / SV0 (loud-from-start): walker reaches a limited number of segments
|
||||
# but every decoded sample matches truth.
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Tran", 734),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Vert", 512),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Long", 512),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Tran", 1024),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Vert", 578),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Long", 512),
|
||||
# 5-8-26 quiet bundle: events without 30 NN blocks decode FULLY across all channels.
|
||||
# event-a: 3328 samples × 3 channels = 9984 samples, all byte-exact.
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-a", "M529LKVQ.6S0"), "Tran", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-a", "M529LKVQ.6S0"), "Vert", 3328),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-a", "M529LKVQ.6S0"), "Long", 3328),
|
||||
# event-c: 1280 samples × 3 channels
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-c", "M529LK44.AB0"), "Tran", 1280),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-c", "M529LK44.AB0"), "Vert", 1280),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-c", "M529LK44.AB0"), "Long", 1280),
|
||||
# event-d: 1280 samples × 3 channels
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
"event-d", "M529LK2V.470"), "Tran", 1280),
|
||||
(os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
|
||||
|
||||