seismo-relay/analysis
Claude 2ff2762eec codec-re: 30 NN block CRACKED — codec fully decoded
User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC
range constraint led to the final piece.

30 NN block format (CONFIRMED across all 14 blocks in the fixture
bundle):

  NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each.
  Within each group:
    bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first)
    bytes [2:6] = 4 × uint8 low bytes
    delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k])

  Block length = NN × 1.5 + 2 bytes (tag included).  Earlier walker
  used NN × 4 which is only correct in the TRAILER section.
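
A minimal sketch of the group layout described above, not the code in minimateplus/waveform_codec.py. The function names (`sign_extend_12`, `decode_group`, `block_length`) are hypothetical. It assumes "MSB-first" means nibble 0 is the top nibble of byte 0, and it reads the low bytes as unsigned 0..255 values so that the OR in the delta formula is well defined.

```python
def sign_extend_12(value: int) -> int:
    """Interpret the low 12 bits of value as a two's-complement integer."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def decode_group(group: bytes) -> list[int]:
    """Decode one 6-byte group into four signed 12-bit deltas."""
    if len(group) != 6:
        raise ValueError("a group is exactly 6 bytes")
    # bytes [0:2]: four 4-bit high nibbles, assumed most significant first
    nibbles = int.from_bytes(group[0:2], "big")
    deltas = []
    for k in range(4):
        high = (nibbles >> (12 - 4 * k)) & 0xF
        low = group[2 + k]  # bytes [2:6]: low 8 bits of delta k (unsigned)
        deltas.append(sign_extend_12((high << 8) | low))
    return deltas


def block_length(nn: int) -> int:
    """Total block size in bytes: 2 tag bytes plus 1.5 bytes per delta."""
    if nn % 4 != 0:
        raise ValueError("NN must be a multiple of 4 for whole groups")
    return nn * 3 // 2 + 2


# Hand-built group: high nibbles 0x0, 0xF, 0x8, 0x0; low bytes 01 FF 00 00
example = bytes([0x0F, 0x80, 0x01, 0xFF, 0x00, 0x00])
print(decode_group(example))  # [1, -1, -2048, 0]
print(block_length(4))        # 8
```

The example bytes are made up for illustration and do not come from the fixture bundle.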

Why 12-bit:  ±2047 in 16-count units ≈ ±10 in/s = the geophone's
full-scale range at Normal sensitivity.  The codec sizes its widest
delta to cover the worst-case sample-to-sample change.
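
The arithmetic behind this can be checked in a few lines. This is a sketch under one assumption: "16-count units" is read here as one delta step being 16 raw ADC counts. The constant names are illustrative only.

```python
# Assumption: one delta step equals 16 raw int16 ADC counts.
COUNTS_PER_STEP = 16
DELTA_MIN, DELTA_MAX = -(1 << 11), (1 << 11) - 1  # signed 12-bit: -2048..2047
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1  # int16 ADC: -32768..32767

max_swing = DELTA_MAX * COUNTS_PER_STEP  # largest positive step, in counts
min_swing = DELTA_MIN * COUNTS_PER_STEP  # largest negative step, in counts

print(max_swing, INT16_MAX - max_swing)  # 32752 15
print(min_swing == INT16_MIN)            # True

# An 11-bit delta would reach only about half of full scale.
print(((1 << 10) - 1) * COUNTS_PER_STEP)  # 16368
```

Under that reading, a 12-bit delta is the narrowest width whose largest step spans the full int16 range in one direction.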

Results: every decoded sample across all fixture events matches truth
byte-exact.  ZERO divergences.

  event-a:  9984 samples (full event, all 3 geos)
  event-c:  3840 (full event)
  event-d:  3840 (full event)
  JQ0:      9984 (full event)
  V70:      9984 (full event)
  SP0:      5122 (walker stops early on edge cases)
  SS0:      1758
  SV0:      2114
  event-b:   738

  TOTAL: 47,364 ADC samples verified, zero errors.

Three full 3-sec events decode end-to-end across all three geo
channels.  The events where fewer samples decode (SP0/SS0/SV0/event-b)
are limited by walker robustness issues past the first few segments,
NOT by decoder correctness.

64 tests pass (up from 55).  Files: minimateplus/waveform_codec.py
(new 30 NN decode + corrected walker length), tests/test_waveform_codec.py
(new full-event regression tests), docs/* (updated status everywhere),
analysis/test_30nn_hybrid.py (new — the analysis script that confirmed
the format).
2026-05-20 17:28:54 +00:00

analysis/ — exploratory scripts for waveform-body RE

These are scratch. Run them, read them, copy them, but don't trust them as documentation. When a finding is verified it gets promoted to minimateplus/waveform_codec.py and tests/test_waveform_codec.py; when it's wrong it stays here as a fossil.

Authoritative status lives in:

  • docs/waveform_codec_re_status.md (current truth, working note)
  • minimateplus/waveform_codec.py (verified implementation + docstring)
  • tests/test_waveform_codec.py (regression locks against fixtures)

Still useful

| File | What it does |
| --- | --- |
| load_bundle.py | Fixture loader. Parses BW binary + ASCII TXT into a Bundle dataclass with samples, metadata, body bytes. Used by most other scripts here. |
| verify_tran.py | Verifies decode_tran_initial against fixture ground truth across all events. Useful when you change the decoder and want a quick sanity check. |
| inspect_5_11.py | Inspects the 5-11-26 high-amplitude bundle's body structure, prints metadata, peaks, and block counts. |
| walk_5_11.py | Walks blocks for the 5-11-26 bundle and prints offset/tag/length/data. |
| seg1_blocks.py | Dumps all blocks in segment 1 of each event. The starting point for cracking multi-segment Tran continuation. |
| full_tran.py | Multi-segment Tran decoder attempt (broken: diverges at sample ~512). Useful as a starting scaffold for the next experiment. |
| multi_segment.py | Earlier multi-segment attempt with different segment-header consumption strategies. Records what didn't work. |
| test_rle.py | Tests 00 NN interpretation as zero-RLE with different divisor values. Documents how the RLE rule was confirmed. |

Superseded — keep for archaeology

| File | Superseded by |
| --- | --- |
| walk_v2.py to walk_v5.py | walk_v6.py and ultimately minimateplus/waveform_codec.walk_body. Each version represents one round of refinement. Don't read in isolation; read the diff between them to see what was learned. |
| walk_chunks.py | walk_v6.py / production walker |
| decode_v1.py | First naive decoder attempt. Wrong but readable. |

Pure exploration — read if curious

| File | What it explored |
| --- | --- |
| inspect_body.py | Byte-frequency stats per event. Established that bytes 0x00 / 0x10 dominate. |
| find_blocks.py | Searched for repeating 2-byte tag patterns. |
| find_signal_runs.py | Searched for stretches of bytes that "look like a smooth signal" (small inter-byte deltas). Found the 20 NN literal blocks. |
| dump_head.py, dump_trailer.py, dump_around.py | Hex dumpers at various body positions. |
| compare_cd.py | Byte-diff between event-c and event-d (same length, similar signal). Used to identify structural vs data bytes. |
| brute_force.py | Tested 96 combinations of channel-permutation × nibble-order × sign-convention × init-from-header on the quiet bundle. All failed because the quiet bundle had T[0]=T[1]=0, making the preamble undetectable. |
| try_nibbles.py, try_layouts.py | Earlier channel-interleaving hypotheses. All wrong. |
| test_tran_continue.py | Test of "Tran continues uninterrupted across 30 04 blocks" hypothesis. Disproven. |

Adding new scripts

If you're picking up the codec work, feel free to add new scripts here. Suggested conventions:

  • Start the filename with what you're testing: test_<hypothesis>.py, verify_<piece>.py, inspect_<region>.py.
  • Print enough output that the reader can see exactly which events match / diverge and where.
  • When a finding is solid, move the verified logic to minimateplus/waveform_codec.py and add a regression test in tests/test_waveform_codec.py — don't leave the truth only in this directory.
  • If a script is fully superseded, leave it in place (don't delete) — the fossil record is useful when re-evaluating hypotheses later.
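
One way to satisfy the "print enough output" convention above is a small reporting helper, sketched here. The names `first_divergence` and `report` are hypothetical and exist in no script in this directory. The sample lists are made up; a real script would take them from the fixture loader and the decoder under test.

```python
def first_divergence(decoded, truth):
    """Return the index of the first mismatching sample, or None if none."""
    for i, (d, t) in enumerate(zip(decoded, truth)):
        if d != t:
            return i
    return None


def report(event, decoded, truth):
    """Print and return one line saying whether an event matches, and where it stops."""
    idx = first_divergence(decoded, truth)
    if idx is None:
        # Everything decoded agrees with truth; show how far decoding got.
        line = f"{event}: MATCH {len(decoded)}/{len(truth)} samples"
    else:
        line = (f"{event}: DIVERGE at sample {idx} "
                f"(decoded={decoded[idx]}, truth={truth[idx]})")
    print(line)
    return line


# Made-up data for illustration only.
report("event-x", [1, 2, 3], [1, 2, 3, 4])  # event-x: MATCH 3/4 samples
report("event-y", [1, 9, 3], [1, 2, 3])     # event-y: DIVERGE at sample 1 (decoded=9, truth=2)
```

Reporting the decoded/total count separately from the divergence index keeps "the walker stopped early" distinguishable from "the decoder produced a wrong value".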