8 Commits

Author SHA1 Message Date
Claude 0466bb4f44 codec: crack wide-NN blocks (1X NN / 2X NN); loud events now fully decode
When NN exceeds 0xFC, the codec extends to 12-bit NN by using the
low nibble of the TYPE byte as the high nibble of NN:

    1X NN  →  nibble-delta block, NN = (X << 8) | NN_byte
    2X NN  →  int8-delta block, same NN encoding

Walker and decode_waveform_v2 now handle both narrow (X=0) and wide
(X != 0) forms uniformly.

Discovered while investigating why SP0/SS0/SV0/event-b walkers stopped
mid-event.  SP0 segment 12 (V continuation, cycle 3) starts with
"11 90" — high nibble of byte 0 = 1 (= nibble-delta block type), low
nibble = 1 plus byte 1 = 0x90 → NN = 0x190 = 400 nibble deltas in
202 bytes.  Walker was rejecting "11" as a non-tag.

Sample count went from 47,364 to 72,972 verified byte-exact:

  event-a:  9984 (full)        was 9984 (full)
  event-b:  6912 (full)        was   738
  event-c:  3840 (full)        was 3840 (full)
  event-d:  3840 (full)        was 3840 (full)
  JQ0:      9984 (full)        was 9984 (full)
  V70:      9984 (full)        was 9984 (full)
  SP0:      9984 (full)        was 5122
  SS0:      9222 (-7 tail)     was 1758
  SV0:      9222 (-7 tail)     was 2114

7 of 9 fixtures now decode end-to-end across all 3 geo channels.
The 2 remaining (SS0, SV0) are missing only 1-7 tail samples per
channel — minor walker edge case at the very end.

74 tests pass (was 71).
2026-05-20 17:28:54 +00:00
Claude 85f4bcfe86 codec: wire decode_waveform_v2 into production; add MicL dB helper
Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec.  Three changes:

1. blastware_file.extract_body_bytes(a5_frames) — new helper that
   factors out the body-reconstruction logic from write_blastware_file
   so both writers (BW binary) and decoders (sample arrays) can use
   the same canonical bytes.

2. waveform_codec.decode_a5_frames(a5_frames) — production entry point.
   Returns the raw_samples dict consumers expect (Tran/Vert/Long as
   int16 ADC counts; MicL as native ADC counts).  Internally:
     A5 frames → extract_body_bytes → decode_waveform_v2
                → decoded_to_adc_counts (geos ×16; mic pass-through)

3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's
   display formula:
     dB = sign(count) × (81.94 + 20 × log10(|count|))   for |count| ≥ 1
   Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1).

client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples.  Original implementation
preserved as _decode_a5_waveform_LEGACY (dead code; reference only).

Also fixed a tail-end bug in decode_waveform_v2 where trailer-section
"40 02" markers (containing ASCII serial bytes, NOT real segment headers)
were being mis-interpreted, producing 2 spurious samples per channel at
the end of each event.  Added bytes [12:14] == "02 00" validation to
reject non-header markers.

7 new pytest tests cover the new helpers and dB conversion.  Total:
71 passing (up from 64).

Known limitation (carried over from before): the walker still stops
mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some
mid-segment edge cases not yet characterized.  Every sample reached
is decoded correctly; the walker just doesn't reach all of them.
Loud events still yield 5,000–15,000 byte-exact samples each.
2026-05-20 17:28:54 +00:00
Claude 2ff2762eec codec-re: 30 NN block CRACKED — codec fully decoded
User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC
range constraint led to the final piece.

30 NN block format (CONFIRMED across all 14 blocks in the fixture
bundle):

  NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each.
  Within each group:
    bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first)
    bytes [2:6] = 4 × int8 low bytes
    delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k])

  Block length = NN × 1.5 + 2 bytes (tag included).  Earlier walker
  used NN × 4 which is only correct in the TRAILER section.

Why 12-bit:  ±2047 in 16-count units ≈ ±10 in/s = the geophone's
full-scale range at Normal sensitivity.  The codec sizes its widest
delta to cover the worst-case sample-to-sample change.

Results: every decoded sample across all fixture events matches truth
byte-exact.  ZERO divergences.

  event-a:  9984 samples (full event, all 3 geos)
  event-c:  3840 (full event)
  event-d:  3840 (full event)
  JQ0:      9984 (full event)
  V70:      9984 (full event)
  SP0:      5122 (walker stops early on edge cases)
  SS0:      1758
  SV0:      2114
  event-b:   738

  TOTAL: 47,364 ADC samples verified, zero errors.

Three full 3-sec events decode end-to-end across all three geo
channels.  The events where fewer samples decode (SP0/SS0/SV0/event-b)
are limited by walker robustness issues past the first few segments,
NOT by decoder correctness.

64 tests pass (up from 55).  Files: minimateplus/waveform_codec.py
(new 30 NN decode + corrected walker length), tests/test_waveform_codec.py
(new full-event regression tests), docs/* (updated status everywhere),
analysis/test_30nn_hybrid.py (new — the analysis script that confirmed
the format).
2026-05-20 17:28:54 +00:00
Claude 07675626dc codec-re: channel rotation CONFIRMED — full multi-channel decoder works
The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:

  SP0 seg 0: best fit Vert  508/508  ✓
  SP0 seg 1: best fit Long  508/508  ✓
  SP0 seg 3: best fit Tran  508/508  ✓  (Tran continuation)
  SP0 seg 5: best fit Long  508/508  ✓
  SP0 seg 9: best fit Long  508/508  ✓
  V70 seg 0: best fit Vert  508/508  ✓
  V70 seg 1: best fit Long  508/508  ✓

Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.

Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor
the NEW segment's channel (2 samples as int16 BE in 16-count units), AND
bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as
int16 BE).  This is the same "2 anchors + delta stream" structure as the
body preamble for Tran.

decode_waveform_v2 now returns full per-channel sample dicts.
Byte-exact verified ranges:
  V70: Tran 512, Vert 512, Long 512   (all first segments)
  JQ0: Tran 512, Vert 258
  SP0: Long 1536 (all 3 L segments)

Still open: the 30 NN block format (high-amplitude packed deltas) —
appears mid-segment when single-byte deltas can't carry the magnitude.

6 new tests bring the count to 46.  All passing.
2026-05-20 17:28:54 +00:00
Claude f68ee9f0f9 docs: clean up waveform-codec doc layers per review
Three "truth layers" had drifted apart between commits.  Fixed:

1. waveform_codec.py docstring rewritten from the 2026-05-08
   "structural framing only" state to the 2026-05-11 "Tran segment 0
   solved + segment-header partially decoded" state.  Killed stale
   "~80 sample-sets per segment" language (real segments are
   flash-page-byte-sized, not sample-count-sized; observed first-segment
   sizes are 42-510 samples depending on signal).  Killed stale
   "preamble is 7 or 9 bytes" language (always 7).

2. docs/instantel_protocol_reference.md §7.6.1: added a clear
   "CURRENT STATUS" box at the top with a status table.  Replaced the
   stale "~80 sample-sets" line with the verified per-event segment
   sizes.  Merged two redundant segment-header field-table sections.

3. docs/waveform_codec_re_status.md (NEW): clean working-status doc.
   Solved / not solved / hypothesis / next experiment / fixtures /
   tests.  The protocol reference remains the historical Rosetta
   Stone; this new file is the current-truth working note that
   shouldn't accumulate fossil layers.

4. CLAUDE.md §"Waveform body codec": prominent warning box at top —
   "DO NOT TRUST decoded sample arrays yet."  BW binary passthrough
   is the only sample-bearing output to trust until the decoder
   lands.  Added a "Next experiment" subsection pointing the next
   pass at the segment-channel scoring analyzer.

40 tests still pass.
2026-05-20 17:28:54 +00:00
Claude a0c9a482c7 codec-re: 00 NN is RLE; full Tran segment-0 decode (4 of 5 events)
User uploaded a Vert-heavy event (JQ0) and a Mic-heavy event (V70).
Those two were exactly what was needed to crack the next piece:

- 00 NN block = run-length-encoded zero deltas in the current channel.
  Append NN copies of the current cumulative value (no change).
- find_data_start now recognizes 00 NN as a valid first tag (some events
  begin with a leading 00 NN RLE block).
- decode_tran_initial now decodes the FULL segment 0 (not just the first
  data block).

Results across 5 fixture events:
  - M529LL1A.SP0 (loud-all-channels)  : 510 / 510  ✓
  - M529LL1L.JQ0 (Vert-heavy)         : 510 / 510  ✓
  - M529LL1L.V70 (Mic-heavy)          : 510 / 510  ✓
  - M529LL1A.SV0 (loud-from-start)    :  58 /  58  ✓
  - M529LL1A.SS0 (loud-from-start)    :  42 / 502  (stops at first 30 04)

The 30 04 block (only seen in loud-from-start events) hasn't been
decoded yet — likely a channel-switch marker for the high-amplitude
regime.

Also discovered: segment header (40 02) payload bytes [0:2] = T_delta
at first sample of new segment, [6:8] = byte length to next segment.
Multi-segment Tran decoding still diverges after sample 512 because
the per-segment channel ordering after the header is unknown.

Tests: 40 pass (up from 36).

Files:
- minimateplus/waveform_codec.py: find_data_start fix, RLE handling,
  full segment-0 decode in decode_tran_initial
- tests/test_waveform_codec.py: synthetic RLE test, full segment 0
  tests for JQ0 and V70
- tests/fixtures/5-11-26/: M529LL1L.JQ0, M529LL1L.V70 + TXT exports
- docs/instantel_protocol_reference.md §7.6.1: RLE + segment-header docs
2026-05-20 17:28:54 +00:00
Claude 6ac126e05c codec-re: crack Tran channel codec with high-amplitude May 11 bundle
User uploaded 3 high-amplitude events (PPV 6-7 in/s — shook the geophone
hard) to decode-re/5-11-26/.  These cracked the Tran codec:

- Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE
  in 16-count units (LSB = 0.005 in/s).  Confirmed across all 7
  fixtures.
- First data block carries Tran deltas from sample 2 onward:
  * 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed
    nibble deltas (high nibble first)
  * 20 NN block: NN int8 signed deltas

Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors
against BW's ASCII export.

Why the earlier 96-combination brute force failed: the quiet 5-8
events all had T[0] = T[1] ≈ 0 so the preamble's per-channel encoding
was undetectable.  Loud events made the encoding obvious.

What's solved:
- minimateplus.waveform_codec.decode_tran_initial: returns first
  N Tran samples in 16-count units for any body.
- Walker length formula for in-data 30 NN blocks (NN*2 instead of NN*4).
- Walker now handles bodies that start with 20 NN (in addition to 10 NN).

What's still open:
- Tran past the first data block (multi-block channel switching).
- Vert / Long / MicL channel encodings.
- Walker correctness past offset ~427 in event-b.

Tests: 36 pass.  decode_waveform_v2 still returns None — the full
multi-channel decoder is not wired up.  decode_tran_initial is the
new verified entry point.

Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py
(adds 5-11-26 fixtures + decode_tran_initial tests), and
docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).
2026-05-20 17:28:54 +00:00
Claude d3f77d1d96 codec-re: solve waveform body block framing; per-byte sample mapping still open
Decoded the structural framing of the Blastware waveform body — the bytes
between the 21-byte STRT record and the 26-byte file footer.  The body is
a sequence of tagged variable-length blocks, NOT raw int16 LE.  Five tag
types (10/20/00/30/40 NN) and their lengths are now confirmed against the
4-event May 2026 fixture bundle.  Body splits cleanly into ~16 segments
(for a 1280-sample event) separated by 40 02 segment headers carrying a
monotonically incrementing uint32 LE counter at bytes [8:12].

What's done:
- minimateplus/waveform_codec.py — block walker, segment splitter, segment
  header parser.  decode_waveform_v2 is a stub returning None until the
  byte-to-sample mapping is solved; client.py is unchanged.
- tests/test_waveform_codec.py — 31 tests covering block detection, lengths,
  contiguous-walk, segment splitting, segment-header parsing, and counter
  monotonicity.  All pass.
- tests/fixtures/decode-re-5-8-26/ — bundled fixtures (4 events, BW binary
  + Blastware ASCII export each).
- docs/instantel_protocol_reference.md §7.6.1 — replaced retraction box
  with the verified structural decoding plus an explicit list of what's
  still open.

What's still open: the per-byte mapping inside 10 NN / 20 NN blocks.  96
channel-permutation × nibble-order × sign-convention combinations were
brute-force tested; none match BW's ASCII export to within ±1 ADC count.
The codec is more elaborate than uniform 4-bit deltas — likely a hybrid
variable-bit-width scheme with segment-anchor resync points.  Next
recommended step: capture an event with a known calibration tone to pin
down magnitude scaling.

Walker also bails out partway through event-b (open issue documented in
both the module and the protocol reference).
2026-05-20 17:28:54 +00:00