codec-re: crack Tran channel codec with high-amplitude May 11 bundle

User uploaded 3 high-amplitude events (PPV 6-7 in/s, enough to shake the
geophone hard) to decode-re/5-11-26/. These cracked the Tran codec:

- Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE
  in 16-count units (LSB = 0.005 in/s). Confirmed across all 7
  fixtures.
- First data block carries Tran deltas from sample 2 onward:
    * 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed
      nibble deltas (high nibble first)
    * 20 NN block: NN int8 signed deltas

Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors
against BW's ASCII export.
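
A minimal worked sketch of the rule described above, for illustration only. The function name `decode_first_block`, the helper names, and the byte values are made up here and are not taken from any fixture; the sketch assumes the caller has already split the body into preamble, tag byte, and payload.

```python
from typing import List

LSB_IN_PER_S = 0.005  # one 16-count unit, per the note above


def s4(n: int) -> int:
    """Sign-extend a 4-bit nibble (0..7 -> 0..7, 8..15 -> -8..-1)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Reinterpret an unsigned byte as a signed int8."""
    return b if b < 128 else b - 256


def decode_first_block(preamble: bytes, tag: int, payload: bytes) -> List[int]:
    """Hypothetical helper: Tran[0], Tran[1] from the preamble, then deltas."""
    t0 = int.from_bytes(preamble[3:5], "big", signed=True)
    t1 = int.from_bytes(preamble[5:7], "big", signed=True)
    if tag == 0x10:
        # Two signed nibble deltas per byte, high nibble first.
        deltas = [s4(n) for byte in payload for n in (byte >> 4, byte & 0xF)]
    elif tag == 0x20:
        # One signed int8 delta per byte.
        deltas = [i8(byte) for byte in payload]
    else:
        raise ValueError(f"unsupported tag 0x{tag:02X}")
    samples, cur = [t0, t1], t1
    for d in deltas:
        cur += d
        samples.append(cur)
    return samples


# Made-up preamble: magic 00 02 00, Tran[0] = 5, Tran[1] = -2.
preamble = bytes([0x00, 0x02, 0x00, 0x00, 0x05, 0xFF, 0xFE])

nibble_samples = decode_first_block(preamble, 0x10, bytes([0x1F, 0x70]))
byte_samples = decode_first_block(preamble, 0x20, bytes([0x64, 0x9C]))
velocities = [s * LSB_IN_PER_S for s in byte_samples]

print(nibble_samples)  # [5, -2, -1, -2, 5, 5]
print(byte_samples)    # [5, -2, 98, -2]
```

The nibble payload `1F 70` carries the deltas +1, -1, +7, 0, and the int8 payload `64 9C` carries +100, -100. Multiplying by 0.005 converts the 16-count units to in/s.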

Why the earlier 96-combination brute force failed: the quiet May 8 (5-8)
events all had Tran[0] = Tran[1] ≈ 0, so the preamble's per-channel
encoding was undetectable. Loud events made the encoding obvious.
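
To illustrate the point with a sketch: an all-zero field decodes to 0 under every reading, so no candidate can be ruled out, while a non-zero field separates the candidates. The candidate set below (byte order, signedness, scale) is hypothetical and is not the actual 96-combination search space; the byte values are made up.

```python
import itertools


def candidate_values(raw: bytes) -> dict:
    """Decode a 2-byte field under a few hypothetical interpretations."""
    out = {}
    for order, signed, scale in itertools.product(
        ("big", "little"), (True, False), (1, 16)
    ):
        out[(order, signed, scale)] = int.from_bytes(raw, order, signed=signed) * scale
    return out


quiet = candidate_values(bytes([0x00, 0x00]))  # quiet event: sample is 0
loud = candidate_values(bytes([0xFE, 0xD4]))   # made-up loud sample

quiet_distinct = set(quiet.values())
loud_distinct = set(loud.values())

print(len(quiet_distinct))  # 1: every interpretation yields 0
print(len(loud_distinct))   # 8: each interpretation yields a different value
```

With the loud bytes, only one interpretation (big-endian, signed, scale 1) gives -300, so a comparison against a known export value can pick it out.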

What's solved:
- minimateplus.waveform_codec.decode_tran_initial: returns the first
  N Tran samples (sample 0 through the end of the first data block)
  in 16-count units for any body.
- Walker length formula for in-data 30 NN blocks: NN*2, not the NN*4
  that applies to trailer-section 30 NN blocks.
- Walker now handles bodies that start with 20 NN (in addition to 10 NN).
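
A small sketch of the 30 NN length choice, mirroring the rule in the diff (try NN*2 first, keep it only if it lands on a recognized tag byte, otherwise use NN*4). The function name `pick_30_length` and the test bodies are hypothetical, with 0xAA used as filler.

```python
KNOWN_TAGS = (0x10, 0x20, 0x00, 0x30, 0x40)


def pick_30_length(body: bytes, i: int) -> int:
    """Return the assumed length of the ``30 NN`` block starting at offset i."""
    nn = body[i + 1]
    cand2 = nn * 2  # data-section length
    cand4 = nn * 4  # trailer-section length
    if i + cand2 < len(body) - 1 and body[i + cand2] in KNOWN_TAGS:
        return cand2
    return cand4


# 30 04, six filler bytes, then a 10 NN tag at offset 8: NN*2 lands on a tag.
data_like = bytes([0x30, 0x04]) + bytes([0xAA] * 6) + bytes([0x10, 0x04, 0x00, 0x00])
# 30 04, fourteen filler bytes: offset 8 is filler, so fall through to NN*4.
trailer_like = bytes([0x30, 0x04]) + bytes([0xAA] * 14) + bytes([0x10, 0x04])

print(pick_30_length(data_like, 0))     # 8
print(pick_30_length(trailer_like, 0))  # 16
```

One caveat of this kind of heuristic: a payload byte at offset i + NN*2 that happens to equal a tag value would also select NN*2.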

What's still open:
- Tran past the first data block (multi-block channel switching).
- Vert / Long / MicL channel encodings.
- Walker correctness past offset ~427 in event-b.

Tests: 36 pass. decode_waveform_v2 still returns None because the full
multi-channel decoder is not wired up. decode_tran_initial is the
new verified entry point.

Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py
(adds 5-11-26 fixtures + decode_tran_initial tests), and
docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).
@@ -137,9 +137,17 @@ class WaveformBlock:
 
 
 def find_data_start(body: bytes) -> int:
-    """Auto-detect the offset of the first ``10 NN`` block."""
+    """Auto-detect the offset of the first data block (``10 NN`` or ``20 NN``).
+
+    The preamble is always either 7 bytes (when sample 0 and 1 have small
+    values) or 9 bytes (when they don't, but only on continuous-mode events
+    in the small May-8 bundle). Returning the offset of the first ``10/20 NN``
+    tag is the most robust heuristic.
+    """
     for i in range(min(20, len(body) - 1)):
-        if body[i] == 0x10 and body[i + 1] % 4 == 0 and 0 < body[i + 1] <= 0xFC:
+        b = body[i]
+        nn = body[i + 1]
+        if b in (0x10, 0x20) and nn % 4 == 0 and 0 < nn <= 0xFC:
             return i
     return -1
 
@@ -167,7 +175,18 @@ def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
         elif t0 == 0x00 and t1 % 4 == 0:
             length = 2
         elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
-            length = t1 * 4
+            # Data-section ``30 NN`` blocks have length NN*2 (= 8 for NN=4,
+            # confirmed in M529LL1A.SS0 at body offset 29). Trailer-section
+            # ``30 NN`` blocks have length NN*4 (= 32 for NN=8, confirmed in
+            # event-d trailer at body offset 3941). We pick NN*2 if it lands
+            # on a recognized tag, otherwise fall through to NN*4.
+            cand2 = t1 * 2
+            cand4 = t1 * 4
+            if (i + cand2 < len(body) - 1
+                    and body[i + cand2] in (0x10, 0x20, 0x00, 0x30, 0x40)):
+                length = cand2
+            else:
+                length = cand4
         elif t0 == 0x40 and t1 == 0x02:
             length = 20
         else:
@@ -227,16 +246,91 @@ def parse_segment_header(block: WaveformBlock) -> Optional[dict]:
     }
 
 
+def _s4(n: int) -> int:
+    """Sign-extend a 4-bit value to signed int (0..7 → 0..7; 8..F → -8..-1)."""
+    return n if n < 8 else n - 16
+
+
+def _i8(b: int) -> int:
+    """Reinterpret an unsigned byte as signed int8."""
+    return b if b < 128 else b - 256
+
+
+def decode_tran_initial(body: bytes) -> Optional[List[int]]:
+    """
+    Decode the initial Tran-channel samples from the body — VERIFIED 2026-05-11
+    against M529LL1A.SP0 / .SS0 / .SV0 (22 + 42 + 46 samples, 0 errors).
+
+    Returns a list of Tran sample values in **16-count units** (LSB = 0.005 in/s
+    at Normal range, the same quantization BW uses for its ASCII export).
+    Returns ``None`` if the body cannot be parsed.
+
+    The decoded list extends from sample 0 (= ``Tran[0]`` from preamble bytes
+    [3:5]) through the end of the FIRST data block. Subsequent samples
+    require decoding additional blocks — that walk is not yet wired up here
+    because the multi-block channel-switching rule is still under
+    investigation (see waveform_codec module docstring).
+
+    Codec details (CONFIRMED 2026-05-11):
+
+    - Body bytes [0:3] are the magic ``00 02 00``.
+    - Body bytes [3:5] = ``Tran[0]`` as int16 BE in 16-count units.
+    - Body bytes [5:7] = ``Tran[1]`` as int16 BE in 16-count units.
+    - The first data block (``10 NN`` or ``20 NN``) carries Tran deltas
+      starting at sample 2:
+
+      * ``10 NN``: NN nibbles = NN/2 bytes; each nibble is a 4-bit signed
+        delta (0..7 → 0..+7; 8..F → -8..-1). High nibble of each byte
+        comes first.
+      * ``20 NN``: NN int8 signed deltas (one delta per byte).
+    """
+    if len(body) < 9:
+        return None
+    if body[0:3] != b"\x00\x02\x00":
+        return None
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    start = find_data_start(body)
+    if start < 0:
+        return None
+    blocks = walk_body(body, start)
+    if not blocks:
+        return [t0, t1]
+    first = blocks[0]
+
+    out = [t0, t1]
+    cur = t1
+    if first.tag_hi == 0x10:
+        for byte in first.data:
+            for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                cur += _s4(nib)
+                out.append(cur)
+    elif first.tag_hi == 0x20:
+        for byte in first.data:
+            cur += _i8(byte)
+            out.append(cur)
+    else:
+        # First block is something else — fall back to just the preamble.
+        return out
+    return out
+
+
 def decode_waveform_v2(body: bytes) -> Optional[dict]:
     """
     Decode the body into per-channel sample arrays.
 
-    Returns a dict ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
-    when a verified decoder is wired up; returns ``None`` otherwise.
+    Returns ``None`` because the full multi-channel decoder is not yet
+    wired up. Tran is partially solved — see :func:`decode_tran_initial`
+    for the initial portion (verified against ground-truth BW exports).
 
-    Currently returns ``None`` because the byte-to-sample mapping is OPEN.
-    The block framing in :func:`walk_body` is verified — callers can use
-    that to inspect block-level structure without claiming the per-byte
-    interpretation.
+    Status (2026-05-11):
+      - Tran[0:N] correctly decoded by ``decode_tran_initial`` for the
+        first N samples of every fixture (where N = 22 / 42 / 46
+        depending on event).
+      - Subsequent Tran samples + all Vert / Long / MicL samples: open.
+        The block stream after the first data block likely interleaves
+        channels with ``30 NN`` channel-switch markers, but the exact
+        switching rule is still under investigation.
     """
     return None