codec-re: 30 NN block CRACKED — codec fully decoded
User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC
range constraint led to the final piece.
30 NN block format (CONFIRMED across all 14 blocks in the fixture
bundle):
NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each.
Within each group:
bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first)
bytes [2:6] = 4 × int8 low bytes
delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k])
Block length = NN × 1.5 + 2 bytes (tag included). Earlier walker
used NN × 4 which is only correct in the TRAILER section.
Why 12-bit: ±2047 in 16-count units ≈ ±10 in/s = the geophone's
full-scale range at Normal sensitivity. The codec sizes its widest
delta to cover the worst-case sample-to-sample change.
Results: every decoded sample across all fixture events matches truth
byte-exact. ZERO divergences.
event-a: 9984 samples (full event, all 3 geos)
event-c: 3840 (full event)
event-d: 3840 (full event)
JQ0: 9984 (full event)
V70: 9984 (full event)
SP0: 5122 (walker stops early on edge cases)
SS0: 1758
SV0: 2114
event-b: 738
TOTAL: 47,364 ADC samples verified, zero errors.
Three full 3-sec events decode end-to-end across all three geo
channels. The events where fewer samples decode (SP0/SS0/SV0/event-b)
are limited by walker robustness issues past the first few segments,
NOT by decoder correctness.
64 tests pass (up from 55). Files: minimateplus/waveform_codec.py
(new 30 NN decode + corrected walker length), tests/test_waveform_codec.py
(new full-event regression tests), docs/* (updated status everywhere),
analysis/test_30nn_hybrid.py (new — the analysis script that confirmed
the format).
This commit is contained in:
@@ -196,18 +196,22 @@ def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
length = 2
|
||||
elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
|
||||
# Data-section ``30 NN`` blocks have length NN*2 (= 8 for NN=4,
|
||||
# confirmed in M529LL1A.SS0 at body offset 29). Trailer-section
|
||||
# ``30 NN`` blocks have length NN*4 (= 32 for NN=8, confirmed in
|
||||
# event-d trailer at body offset 3941). We pick NN*2 if it lands
|
||||
# on a recognized tag, otherwise fall through to NN*4.
|
||||
cand2 = t1 * 2
|
||||
cand4 = t1 * 4
|
||||
if (i + cand2 < len(body) - 1
|
||||
and body[i + cand2] in (0x10, 0x20, 0x00, 0x30, 0x40)):
|
||||
length = cand2
|
||||
# Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
|
||||
# as NN/4 groups of (2-byte high-nibble field + 4 × int8 low byte).
|
||||
# Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
|
||||
# 20 for NN=12, etc.). Confirmed 2026-05-11 by full-decoder
|
||||
# verification against BW ASCII export.
|
||||
#
|
||||
# Trailer-section ``30 NN`` blocks have a different length formula
|
||||
# (NN × 4 = 32 for NN=8 in trailers). We try the data-section
|
||||
# length first and fall back to the trailer length if needed.
|
||||
cand_data = t1 * 3 // 2 + 2
|
||||
cand_trailer = t1 * 4
|
||||
if (i + cand_data < len(body) - 1
|
||||
and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
|
||||
length = cand_data
|
||||
else:
|
||||
length = cand4
|
||||
length = cand_trailer
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
length = 20
|
||||
else:
|
||||
@@ -398,7 +402,26 @@ def decode_waveform_v2(body: bytes) -> Optional[dict]:
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
out[channel].append(cur)
|
||||
# 30 NN: unknown content; skip.
|
||||
elif blk.tag_hi == 0x30:
|
||||
# 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
|
||||
# bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
|
||||
# bytes [2:6] = 4 × int8 low bytes
|
||||
# Each delta = sign_extend_12((high_nibble << 8) | low_byte).
|
||||
# Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
|
||||
# bundled fixtures.
|
||||
n_groups = blk.tag_lo // 4
|
||||
for g in range(n_groups):
|
||||
grp = blk.data[g * 6 : (g + 1) * 6]
|
||||
if len(grp) < 6:
|
||||
break
|
||||
high_word = (grp[0] << 8) | grp[1]
|
||||
for k in range(4):
|
||||
nib = (high_word >> (12 - 4 * k)) & 0xF
|
||||
v = (nib << 8) | grp[2 + k]
|
||||
if v >= 0x800:
|
||||
v -= 0x1000
|
||||
cur += v
|
||||
out[channel].append(cur)
|
||||
# 40 02: should not occur in segment data.
|
||||
return cur
|
||||
|
||||
|
||||
Reference in New Issue
Block a user