Commit Graph

141 Commits

Author SHA1 Message Date
serversdown d21e3b5298 histogram aggregation + parser extension for BW interval fields
Three layered changes that together make histogram charts visually
match BW's printout (one bar per interval, not per codec block):

1. bw_ascii_report parser captures histogram fields it previously
   dropped:
     - Histogram Start/Stop Time + Date → datetime
     - Number of Intervals + Interval Size (string + parsed seconds)
     - <Channel> Peak Time + Peak Date → datetime (per-channel)
     - Peak Vector Sum Date (combined with PVS Time → datetime;
       clears the bogus seconds parse that interpreted "22:33:52"
       as 22.0)
   New _parse_iso_date() handles BW's ISO format for histograms
   (waveforms use "May 8, 2026" long form).  New _parse_interval_size()
   handles "1 minute" / "5 minutes" / "15 seconds" etc.

2. _bw_report_to_dict() projects the new fields into a new
   bw_report.histogram block in the sidecar.

3. /db/events/{id}/waveform.json wraps the existing path 1 (HDF5)
   output with _maybe_aggregate_histogram(): when the event is a
   histogram AND the sidecar has bw_report.histogram.n_intervals,
   group the codec's per-block samples into N intervals via
   max-per-group and return the aggregated array.  time_axis gains
   histogram_aggregated / n_intervals / interval_size_s / interval_times
   fields.
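
The interval-size parsing named in item 1 could look roughly like the sketch below. It assumes the field is a number followed by a singular or plural unit word; `parse_interval_size_sketch` and its unit table are illustrative names, not the repo's `_parse_interval_size()`.

```python
import re
from typing import Optional

# Hypothetical unit table covering only the units quoted in the commit message.
_UNIT_SECONDS = {"second": 1, "minute": 60}


def parse_interval_size_sketch(text: str) -> Optional[float]:
    """Turn '1 minute' / '5 minutes' / '15 seconds' into seconds, else None."""
    # Number, optional whitespace, unit word with an optional plural 's'.
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+?)s?\s*", text)
    if m is None:
        return None
    unit = m.group(2).lower()
    if unit not in _UNIT_SECONDS:
        return None
    return float(m.group(1)) * _UNIT_SECONDS[unit]
```

Returning None for anything unrecognized (including a bare clock string such as "22:33:52") keeps a bad parse from silently becoming a number.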

Frontend (both modal chart in sfm_webapp.html + standalone event
browser) uses interval_times as x-axis labels when provided (BW-style
HH:MM:SS), falls back to interval index.

Defensive: aggregation is a no-op when the sidecar lacks the histogram
block (events ingested before this change).  Activates automatically
on prod once a watcher re-forward populates new sidecars.
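
A minimal sketch of max-per-group aggregation with the no-op fallback, assuming the per-block samples are split into contiguous, near-equal groups. The real `_maybe_aggregate_histogram()` may group differently; `aggregate_max_sketch` is a hypothetical name.

```python
from typing import List, Optional, Sequence


def aggregate_max_sketch(samples: Sequence[float],
                         n_intervals: Optional[int]) -> List[float]:
    """Collapse per-block samples into n_intervals values via max-per-group."""
    # No-op when the interval count is missing or cannot shrink the array.
    if not n_intervals or n_intervals <= 0 or n_intervals >= len(samples):
        return list(samples)
    total = len(samples)
    out = []
    for i in range(n_intervals):
        # Integer arithmetic keeps the groups contiguous and non-empty.
        start = i * total // n_intervals
        stop = (i + 1) * total // n_intervals
        out.append(max(samples[start:stop]))
    return out
```

With `n_intervals=None` (an older sidecar) the input comes back unchanged, which mirrors the defensive behavior described above.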

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:23:05 +00:00
serversdown ad2b553c7b ingest: preserve raw BW ASCII report (.TXT) alongside the binary
Previously the .TXT was parsed into the sidecar's bw_report projection
and then discarded at ingest time.  Now save_imported_bw() writes it
to <store>/<serial>/<filename>_ASCII.TXT permanently.

Rationale: with BW Mail / Forwarding Agent being phased out of the
operator workflow, the XML/PDF/WMF those tools produce won't be
available — the binary + .TXT (created by BW ACH itself) are our
only authoritative inputs going forward.  Keeping the raw .TXT
unlocks:

  - Parser bug fixes can be applied RETROACTIVELY by re-parsing the
    stored .TXT, instead of requiring a re-forward from the watcher
    PC (which lost the .TXT after BW ACH cleanup).
  - Audit trail of what BW actually sent us, for debugging.
  - The five known parser-PPV-miss events will be re-parseable once
    the regex fix lands (instead of staying broken indefinitely).

Storage cost: ~15 KB per event × 14k events = ~210 MB on the
existing prod corpus.  Negligible.

Implementation:
  - WaveformStore gains txt_path_for() + open_txt()
  - save_imported_bw() writes the .TXT when bw_report_text is supplied
  - sidecar source block records the txt_filename
  - backfill_sidecars.py preserves txt_filename across regens
  - New GET /db/events/{id}/ascii_report.txt endpoint serves it
  - Returns 404 for events ingested before this change (no .TXT in
    the store yet) — re-forward to populate

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:01:12 +00:00
serversdown 35842ac50a backfill: overlay bw_report onto Event before DB upsert
Mirror what the ingest path does: BW's reported peaks (and sample_rate
/ record_time) take precedence over codec output where present.

Without this, --force backfill silently overwrites bw_report-overlaid
DB columns with codec-derived peaks.  Wrong for events where the codec
doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style
events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing
PVS=0 on real high-amplitude events.  This bit prod on 2026-05-22, when
three top-10 waveform events ended up at PVS=0 (rolled back the same day;
this fix is the proper resolution).

New helper minimateplus.event_file_io.apply_bw_report_dict_to_event
operates on the projected sidecar dict shape (the structure
_bw_report_to_dict produces, which is what gets preserved in the
sidecar).  Mirrors apply_report_to_event's semantics: only writes
fields where bw_report has a non-None value, no-ops cleanly on
empty / None input.
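
The "only write non-None fields" rule can be sketched as follows. `EventSketch` and its three field names are hypothetical stand-ins; the real Event and the real sidecar dict shape are not shown in this log.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class EventSketch:
    # Hypothetical subset of fields, for illustration only.
    peak_vector_sum: Optional[float] = None
    sample_rate: Optional[int] = None
    record_time: Optional[float] = None


def apply_overlay_sketch(event: EventSketch,
                         bw_report: Optional[Mapping[str, Any]]) -> EventSketch:
    """Overlay report values onto the event, skipping missing or None fields."""
    if not bw_report:
        return event  # empty / None input: clean no-op
    for name in ("peak_vector_sum", "sample_rate", "record_time"):
        value = bw_report.get(name)
        if value is not None:
            setattr(event, name, value)
    return event
```

Because None never overwrites an existing value, a partially populated report cannot erase codec-derived fields it has no opinion about.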

Dev validation against prod snapshot:
  pre  : 1839.7315 pvs_sum   356 events with DB PVS ≠ sidecar bw_report
  post : 2016.4902 pvs_sum     2 events still mismatched (both have NULL
                                timestamp + duplicate rows, edge case)

Both edge-case events DO get the correct value written by the new
backfill — their stale rows from prior backfills remain because
UNIQUE(serial, timestamp) doesn't fire on NULL.  Separate dedup
cleanup needed for those 2 events (0.014% of corpus); not blocking.
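
The NULL behavior can be reproduced in isolation. The sketch below assumes an SQLite-style database, which this log does not confirm; the table layout and the serial "BE0001" are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (serial TEXT, timestamp TEXT, pvs REAL, "
    "UNIQUE(serial, timestamp))"
)
upsert = (
    "INSERT INTO events (serial, timestamp, pvs) VALUES (?, ?, ?) "
    "ON CONFLICT(serial, timestamp) DO UPDATE SET pvs = excluded.pvs"
)

# Non-NULL timestamp: the second write updates the first row in place.
conn.execute(upsert, ("BE0001", "2026-05-22T00:00:00", 0.0))
conn.execute(upsert, ("BE0001", "2026-05-22T00:00:00", 1.5))

# NULL timestamp: NULLs count as distinct, so no conflict fires and
# the second write lands as a duplicate row next to the stale one.
conn.execute(upsert, ("BE0001", None, 0.0))
conn.execute(upsert, ("BE0001", None, 1.5))

rows = conn.execute(
    "SELECT timestamp, pvs FROM events ORDER BY timestamp IS NULL, pvs"
).fetchall()
```

The table ends up with one row for the real timestamp and two rows for the NULL one, which matches the "correct value written, stale row remains" description.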

Backfill remains idempotent + bw_report preservation still passes
(0 WIPED, 0 CHANGED on the 3rd consecutive run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:56:22 +00:00
serversdown d506ebc103 histogram_codec: peak count is uint8 (not uint16 LE) — properly cracks
the BE9558 / BE18003 extension-byte case

The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still
unclear — empirically non-zero on intervals with sub-Hz or unmeasurable
freq), NOT the high byte of the peak count.  The N844 fixture corpus
the original RE was done against had zero values in those bytes for
every block, so uint8 and uint16 LE were equivalent there — but on
real BE9558 Tran-drift events and BE18003 Histogram+Continuous events
the uint16 LE interpretation produced peaks up to 268 in/s and 35×
inflated PVS sums.

Cross-correlated against BW's per-interval ASCII export on:
  - K558LKZU/LL1P/LL3K  → 100% T/V/L/M peak match (1435 blocks each)
  - T003LKZR/LL0O/LL1M  → 100% T/V/L, 99.3% M (0.05 dB rounding only)
  - N599LKZS/LL0L        → 100% all channels
  - N844 fixture corpus  → 100% all channels (unchanged)

Annotations preserved on every record for future RE; the defensive
_MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s,
well below any physical limit).

Synthetic regression test added using the verbatim K558LKZU.RE0H
interval-12 block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 06:05:19 +00:00
serversdown e949232875 histogram_codec + backfill: tighter peak ceiling, preserve bw_report
histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling
let extension-byte blocks slip through at up to 20.48 in/s per
channel, producing 35× inflated PVS sums when first deployed to
prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts)
plus 10% headroom for quantization edge cases.

backfill_sidecars: also preserve the bw_report block alongside
review + extensions when regenerating sidecars. event_to_sidecar_dict
takes a BwAsciiReport dataclass, not a dict, so for bw_report we
overlay the existing block after regen rather than passing it as a kwarg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:50:10 +00:00
serversdown bc5a2d3f19 histogram_codec: defensive bounds-check on peak counts
Discovered while running the backfill on prod: certain histogram
blocks contain an undocumented extension byte format whose naive
uint16 LE interpretation yields physically impossible peak values
(150+ in/s when the device max is 10).  Concrete example from
K558LKSG.3I0H block at body+7424:

  bytes [6:10] = 05 79 69 00
  current code: T_peak = uint16 LE = 0x7905 = 30981 → 154.9 in/s
  reality:     T_peak = byte[6] = 5 → 0.025 in/s (matches BW display)

The high byte (0x79 here) appears to be an extension field — possibly
"time of peak within interval" or a Histogram+Continuous sub-mode
marker.  Observed across BE9558 and BE18003 units in prod data; never
appeared in the BE12844 fixture corpus the codec was originally
verified against.
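
The two readings of the quoted bytes can be checked directly. This sketch uses only the four bytes shown above and the 0.005 in/s-per-count Normal-range scale from the histogram codec notes; the variable names are illustrative.

```python
import struct

COUNT_TO_INS = 0.005  # in/s per count at Normal range

raw = bytes.fromhex("05796900")  # bytes [6:10] of the quoted block

# Reading 1: uint16 little-endian over the first two bytes (0x7905).
as_uint16_le = struct.unpack_from("<H", raw, 0)[0]

# Reading 2: only the first byte is the peak count.
as_uint8 = raw[0]

ins_uint16 = as_uint16_le * COUNT_TO_INS
ins_uint8 = as_uint8 * COUNT_TO_INS
```

The uint16 reading lands far above the 10 in/s device maximum, while the single-byte reading stays in the range the BW display reports.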

Effect on prod: 26 out of 1433 blocks in this one event had inflated
peaks, plus dozens of similar events across the fleet → sum(PVS)
inflated from baseline 988 to 34501 (35x).  Rolled back via the
pre-backfill snapshot before any UI exposure.

Defensive fix: bounds-check peak counts in `_decode_block`.  Any
field exceeding `_MAX_PEAK_COUNT` (4096 = ~20 in/s, well past the
device's 10 in/s Normal-range FS) causes the block to be skipped
entirely.  Other valid blocks in the same event still decode
correctly.

Trade-off: those skipped blocks lose their per-interval data
(peaks + frequencies).  Acceptable until the extension format is
reverse-engineered — better than propagating bogus values into PVS
computations downstream.

The 24 existing tests all still pass — the fixtures used during the
original codec development don't exercise the extension-byte case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:17:33 +00:00
serversdown 7183b953e4 minimateplus: histogram body codec — FULLY DECODED
The histogram-mode event body is now byte-exact decodable.
Companion to the waveform body codec — together they cover every
event file the watcher forwards.  Cracked in one session via
cross-event correlation against BW's ASCII export.

The §7.6.2 spec in instantel_protocol_reference.md was structurally
correct (32-byte blocks) but the per-sample semantics were
under-documented.  Cross-checking block 130 of N844L6Z8.ZR0H
against its TXT row revealed the layout perfectly:

  slot[0] = 10 (constant marker)
  slot[1] = T_peak_count    (× 0.005 → in/s at Normal range)
  slot[2] = T_halfperiod    (freq_Hz = 512 / halfp)
  slot[3] = V_peak_count
  slot[4] = V_halfperiod
  slot[5] = L_peak_count
  slot[6] = L_halfperiod
  slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db)
  slot[8] = MicL_halfperiod

The `>100 Hz` sentinel is halfperiod ≤ 5 (512/5 = 102.4 Hz, while 512/6 ≈ 85.3 Hz).
Mic dB uses the SAME formula as the waveform codec (sign × (81.94
+ 20·log10(|count|))) — they share the mic ADC calibration constant.
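
The two geophone conversions above can be sketched as below. The `_sketch` names are hypothetical, and using None to represent the `>100 Hz` sentinel is an assumption; the repo's `half_period_to_hz` and `geo_count_to_ins` may represent it differently.

```python
from typing import Optional

COUNT_TO_INS = 0.005  # in/s per count at Normal range


def half_period_to_hz_sketch(halfp: int) -> Optional[float]:
    """freq_Hz = 512 / halfp; None stands in for the '>100 Hz' sentinel."""
    if halfp <= 5:  # also guards against division by zero
        return None
    return 512.0 / halfp


def geo_count_to_ins_sketch(count: int) -> float:
    """Convert a geophone peak count to in/s at Normal range."""
    return count * COUNT_TO_INS
```

For instance, a half-period of 105 gives roughly 4.9 Hz, and a count of 2000 corresponds to the 10 in/s full scale.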

Block identification anchor: bytes [22:24] == 0x0000 AND
bytes [28:32] == 1e 0a 00 00.  The tail signature is the most
reliable distinguisher from non-block content in the file.
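
A sketch of the anchor test, assuming a candidate has already been sliced to the 32-byte block size; `looks_like_histogram_block` is an illustrative name, not the module's API.

```python
BLOCK_SIZE = 32
TAIL_SIGNATURE = bytes.fromhex("1e0a0000")


def looks_like_histogram_block(block: bytes) -> bool:
    """Apply the two anchors: zero bytes at [22:24] and the tail signature."""
    return (
        len(block) == BLOCK_SIZE
        and block[22:24] == b"\x00\x00"
        and block[28:32] == TAIL_SIGNATURE
    )
```

A walker could slide over the body and keep only the 32-byte windows for which this returns True.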

Files:

  minimateplus/histogram_codec.py (new) — decoder + public API
    matching the waveform codec's shape:
      walk_body(body) -> records
      decode_histogram_body(body) -> {Tran, Vert, Long, MicL}
      decode_histogram_body_full(body) -> [per-interval dicts]
      half_period_to_hz, geo_count_to_ins helpers

  minimateplus/event_file_io.py (modified) — read_blastware_file
    now tries the waveform codec first, falls back to the histogram
    codec on failure.  Same output shape, same downstream pipeline.

  tests/test_histogram_codec.py (new) — 24 regression locks against
    the in-repo fixture corpus, byte-exact against BW ASCII export
    for peaks (all 4 channels), frequencies (all 4 channels,
    including >100 Hz sentinel handling), block framing, and
    segment-ID accounting.

  scripts/backfill_sidecars.py (modified) — the has_samples
    short-circuit added in the histogram-pending era is now a
    pure defensive guard.  Histograms in prod will regen .h5 files
    correctly on the next backfill run.

  docs/histogram_codec_re_status.md (updated) — supersedes the
    earlier "in progress" version with the verified format and
    test-coverage summary.  Notes a few non-essential fields still
    open (4-byte block metadata, Geo PVS, Mic psi(L) — none of
    which are needed for waveform reconstruction).

Total verified coverage: ~3,500 blocks across 5 fixtures, every
field of every block byte-exact against BW.

The watcher-forwarded histogram event corpus on prod (~10,000
events) will now produce correct .h5 sidecars on the next backfill
run.  No additional changes needed to the backfill flow — the
existing tool_version-bump cascade picks them up automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:05:13 +00:00
serversdown fa9d3cdef2 read_blastware_file: leave peak_values=None when samples can't be decoded
Fixes a data-loss bug discovered while dry-running the backfill against
the prod store.

Symptom: for every histogram event in the store, read_blastware_file
hands the body to the codec → codec returns None → samples = empty dict →
``ev.peak_values = _peaks_from_samples(empty)`` evaluates to
``PeakValues(0, 0, 0, 0, 0)`` (NOT None).  The backfill script's
existing "seed from DB row when peak_values is None" branch then
correctly *skips* the seeding, and the all-zeros PeakValues flows into
``db.insert_events()``'s UPSERT path, OVERWRITING the existing good DB
peak values for that event (which were populated from the paired BW
ASCII report at ingest).

Net effect: running the backfill on prod would have wiped the PPV /
mic / vector-sum columns for ~10,000 histogram events.

Fix: only compute peaks-from-samples when there are actually samples.
For events the codec couldn't decode (histogram-mode bodies, until
the §7.6.2 histogram codec is wired in), leave peak_values=None as
the "we don't know" signal.  Downstream consumers:

  - backfill_sidecars.py — its existing ``if ev.peak_values is None:``
    branch (line 243) seeds from the DB row, preserving the real
    BW-report peaks across the regen.
  - WaveformStore.save_imported_bw — apply_report_to_event overlays
    peaks from the paired BW ASCII report when one was uploaded.
    Histogram imports without a paired report end up with NULL peaks
    in the DB, which is correct (better than zeros — clearly says
    "no peak data available" rather than "peaks are exactly zero").

Updated the existing synthetic-event round-trip test to expect
peak_values=None for the no-real-body case, which is the truth now.
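
The "None means unknown" rule can be sketched as below. The dict-of-lists sample shape and the name `peaks_or_none_sketch` are assumptions for illustration; the real `_peaks_from_samples` and `PeakValues` are not reproduced here.

```python
from typing import Dict, List, Optional


def peaks_or_none_sketch(samples: Dict[str, List[int]]) -> Optional[Dict[str, int]]:
    """Return per-channel absolute peaks, or None when nothing was decoded."""
    # An empty dict, or a dict of empty channels, means "we don't know",
    # which must stay distinguishable from "peaks are exactly zero".
    if not any(samples.values()):
        return None
    return {
        channel: max((abs(v) for v in values), default=0)
        for channel, values in samples.items()
    }
```

A caller can then branch on `is None` to seed peaks from the DB row instead of writing zeros over good values.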

The 7 fixture-corpus regression tests for real BW waveforms continue
to pass — those have decodable samples, so peak_values is still
populated from the codec output as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 20:30:53 +00:00
serversdown e8682d49ad scripts/backfill_sidecars: cascade h5 regen when sidecar is stale + bump TOOL_VERSION
Two coupled changes that close the rollout gap left by the
read_blastware_file codec wiring:

1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to
   0.20.0.  This is the version stamp the backfill script reads from
   each sidecar's source.tool_version field to detect "this sidecar
   was written before the current decoder shipped, regenerate it."
   Bumping past every value baked into existing prod sidecars flags
   them all as stale on the next backfill run — which is exactly what
   we want, since every pre-codec-wiring sidecar was written by the
   retracted int16-LE decoder.

2. scripts/backfill_sidecars.py: when the sidecar is being
   regenerated this iteration (sha mismatch, tool_version too old,
   or --force), also regenerate the .h5.  Previously the .h5 logic
   only rewrote when --force was passed or the file was missing —
   so a tool_version-driven sidecar regen left the broken .h5 in
   place forever.  Added a `sidecar_stale` boolean to track the
   "we're rewriting the sidecar this iteration" state and wired it
   into the h5 need-rewrite check.

   Path coverage (verified by trace):
     - sidecar missing  → both regen
     - --force          → both regen
     - sha mismatch     → both regen
     - tool_ver too old → both regen (THE post-codec-wiring case)
     - everything OK    → skip iteration entirely (h5 untouched)
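
The path coverage above reduces to one predicate. The sketch below is a guess at its shape, not the script's code: the function name, the argument list, and treating a missing tool_version as stale are all assumptions.

```python
from typing import Optional, Tuple

CURRENT_TOOL_VERSION = "0.20.0"


def _version_tuple(version: str) -> Tuple[int, ...]:
    """'0.16.1' -> (0, 16, 1) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


def sidecar_is_stale_sketch(sidecar_exists: bool, sha_matches: bool,
                            tool_version: Optional[str], force: bool) -> bool:
    """True when the sidecar, and therefore the .h5, should be regenerated."""
    if force or not sidecar_exists or not sha_matches:
        return True
    if tool_version is None:
        return True  # assumption: an unstamped sidecar counts as stale
    return _version_tuple(tool_version) < _version_tuple(CURRENT_TOOL_VERSION)
```

Comparing parsed tuples matters here: as plain strings, "0.9.0" would sort after "0.20.0" and an old sidecar would be skipped.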

Operator review state (review.false_trigger, reviewer, notes) and
the sidecar's extensions block are preserved across regen by the
existing read-existing-sidecar / pass-into-event_to_sidecar_dict
path — unchanged from prior behavior.

Deploy procedure (on prod):
  1. Pull this change + the read_blastware_file codec wiring.
  2. `python scripts/backfill_sidecars.py --dry-run` to preview.
     Every sidecar with source.tool_version<0.20.0 will show as
     "would (re)write".
  3. Run for real (drop --dry-run).  Expect every pre-fix event
     to regen.  Big stores may take a while.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:24:06 +00:00
serversdown 31d691b40b minimateplus: wire read_blastware_file to verified body codec
`read_blastware_file()` was still calling `_decode_samples_4ch_int16_le`
(the retracted int16-LE-interleaved hypothesis) on the body bytes,
producing ±32K noise on every channel of every BW file read from disk.
This was the path watcher-forwarded events take into the system
(via the import endpoint → save_imported_bw → read_blastware_file,
since the watcher doesn't ship A5 frames), so every .h5 sidecar
generated for a forwarded event has been wrong since the feature
shipped.

The fix is mechanical: pass the body bytes straight to
`waveform_codec.decode_waveform_v2()` and run the result through
`decoded_to_adc_counts()` for the 16x geo scaling.  The body already
starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE]
[Tran[1] BE]` — confirmed by `body[:3].hex()` across all 9 fixture
events.  No body-slice adjustment needed.

If the codec returns None (truncated/malformed file, synthetic test
input with no real waveform), fall back to empty channels with a log
warning.  The rest of the event (timestamp, waveform_key, project
strings, sensor_location, peaks-from-samples=0) is still recoverable.

Verified against the bundled fixture corpus:

  V70  Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth
       within the 0.005 in/s display quantum, every row
  6S0/RG0/AB0/470 (5-8-26)  3328/2304/1280/1280 samples; Vert PPVs
       match BW's own report within 0.02 in/s
  JQ0  3328 samples, Vert PPV 3.384 vs BW 3.465
  SP0/SS0/SV0 (loud events)  3072–3328 samples; known walker
       tail-truncation 1–7 samples per channel, samples reached are
       byte-exact

Existing `test_read_blastware_file_round_trip` (synthetic empty event)
continues to pass thanks to the None-fallback.  Codec verify scripts
(`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`)
re-run unchanged.

Added two regression-lock tests in tests/test_event_file_io.py:
  - test_read_blastware_file_decodes_via_codec[6 fixtures]
    — verifies sample count + Vert PPV per fixture
  - test_read_blastware_file_v70_samples_match_txt_truth
    — verifies every one of V70's 3328 sample-sets across Tran/Vert/Long
      matches the .TXT ground truth row-by-row within 0.003 in/s

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:13:24 +00:00
Claude 0466bb4f44 codec: crack wide-NN blocks (1X NN / 2X NN); loud events now fully decode
When NN exceeds 0xFC, the codec extends to 12-bit NN by using the
low nibble of the TYPE byte as the high nibble of NN:

    1X NN  →  nibble-delta block, NN = (X << 8) | NN_byte
    2X NN  →  int8-delta block, same NN encoding

Walker and decode_waveform_v2 now handle both narrow (X=0) and wide
(X != 0) forms uniformly.

Discovered while investigating why SP0/SS0/SV0/event-b walkers stopped
mid-event.  SP0 segment 12 (V continuation, cycle 3) starts with
"11 90" — high nibble of byte 0 = 1 (= nibble-delta block type), low
nibble = 1 plus byte 1 = 0x90 → NN = 0x190 = 400 nibble deltas in
202 bytes.  Walker was rejecting "11" as a non-tag.
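
The tag parsing for narrow and wide forms can be sketched as one function. The name `parse_delta_tag_sketch` is hypothetical, and rounding the nibble payload up for an odd NN is an assumption that this log does not confirm.

```python
from typing import Optional, Tuple


def parse_delta_tag_sketch(b0: int, b1: int) -> Optional[Tuple[str, int, int]]:
    """Return (kind, nn, payload_len) for 1X NN / 2X NN tags, else None."""
    kind = b0 >> 4
    # Low nibble of the TYPE byte supplies the high 4 bits of a 12-bit NN;
    # the narrow form is simply the X == 0 case.
    nn = ((b0 & 0x0F) << 8) | b1
    if kind == 0x1:
        return ("nibble", nn, (nn + 1) // 2)  # two 4-bit deltas per byte
    if kind == 0x2:
        return ("int8", nn, nn)  # one signed byte per delta
    return None
```

For the "11 90" example above this yields NN = 400 and a 200-byte payload, which with the 2-byte tag gives the 202 bytes quoted.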

Sample count went from 47,364 to 72,972 verified byte-exact:

  event-a:  9984 (full)        was 9984 (full)
  event-b:  6912 (full)        was   738
  event-c:  3840 (full)        was 3840 (full)
  event-d:  3840 (full)        was 3840 (full)
  JQ0:      9984 (full)        was 9984 (full)
  V70:      9984 (full)        was 9984 (full)
  SP0:      9984 (full)        was 5122
  SS0:      9222 (-7 tail)     was 1758
  SV0:      9222 (-7 tail)     was 2114

7 of 9 fixtures now decode end-to-end across all 3 geo channels.
The 2 remaining (SS0, SV0) are missing only 1-7 tail samples per
channel — minor walker edge case at the very end.

74 tests pass (was 71).
2026-05-20 17:28:54 +00:00
Claude 85f4bcfe86 codec: wire decode_waveform_v2 into production; add MicL dB helper
Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec.  Three changes:

1. blastware_file.extract_body_bytes(a5_frames) — new helper that
   factors out the body-reconstruction logic from write_blastware_file
   so both writers (BW binary) and decoders (sample arrays) can use
   the same canonical bytes.

2. waveform_codec.decode_a5_frames(a5_frames) — production entry point.
   Returns the raw_samples dict consumers expect (Tran/Vert/Long as
   int16 ADC counts; MicL as native ADC counts).  Internally:
     A5 frames → extract_body_bytes → decode_waveform_v2
                → decoded_to_adc_counts (geos ×16; mic pass-through)

3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's
   display formula:
     dB = sign(count) × (81.94 + 20 × log10(|count|))   for |count| ≥ 1
   Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1).
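
The display formula in item 3 translates directly into code. In this sketch the `_sketch` name is hypothetical, and returning 0.0 for |count| < 1 is an assumption, since the formula above is only stated for |count| ≥ 1.

```python
import math


def mic_count_to_db_sketch(count: int) -> float:
    """dB = sign(count) * (81.94 + 20 * log10(|count|)) for |count| >= 1."""
    if abs(count) < 1:
        return 0.0  # assumption: behavior below 1 count is not specified
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))
```

Calling it with count=813 reproduces the V70 check to within display rounding.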

client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples.  Original implementation
preserved as _decode_a5_waveform_LEGACY (dead code; reference only).

Also fixed a tail-end bug in decode_waveform_v2 where trailer-section
"40 02" markers (containing ASCII serial bytes, NOT real segment headers)
were being mis-interpreted, producing 2 spurious samples per channel at
the end of each event.  Added bytes [12:14] == "02 00" validation to
reject non-header markers.

7 new pytest tests cover the new helpers and dB conversion.  Total:
71 passing (up from 64).

Known limitation (carried over from before): the walker still stops
mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some
mid-segment edge cases not yet characterized.  Every sample reached
is decoded correctly; the walker just doesn't reach all of them.
Loud events still yield 5,000–15,000 byte-exact samples each.
2026-05-20 17:28:54 +00:00
Claude 2ff2762eec codec-re: 30 NN block CRACKED — codec fully decoded
User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC
range constraint led to the final piece.

30 NN block format (CONFIRMED across all 14 blocks in the fixture
bundle):

  NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each.
  Within each group:
    bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first)
    bytes [2:6] = 4 × int8 low bytes
    delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k])

  Block length = NN × 1.5 + 2 bytes (tag included).  Earlier walker
  used NN × 4 which is only correct in the TRAILER section.
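
One 6-byte group of the 30 NN format can be unpacked as in the sketch below. It assumes "MSB-first" means bytes [0:2] are read big-endian with the first delta's nibble in the top 4 bits, and that the low bytes are combined as raw 0-255 values before sign extension; the function names are illustrative.

```python
from typing import List


def sign_extend_12(value: int) -> int:
    """Interpret a 12-bit value as two's-complement signed."""
    return value - 0x1000 if value & 0x800 else value


def decode_12bit_group_sketch(group: bytes) -> List[int]:
    """Decode one 6-byte group into four signed 12-bit deltas."""
    if len(group) != 6:
        raise ValueError("a 30 NN group is exactly 6 bytes")
    highs = int.from_bytes(group[0:2], "big")  # four 4-bit high nibbles
    deltas = []
    for k in range(4):
        high_nibble = (highs >> (12 - 4 * k)) & 0xF
        low_byte = group[2 + k]
        deltas.append(sign_extend_12((high_nibble << 8) | low_byte))
    return deltas


def block_length_30nn_sketch(nn: int) -> int:
    """Block length in bytes, tag included: NN * 1.5 + 2."""
    return nn * 3 // 2 + 2
```

Four deltas at 12 bits each fill exactly 48 bits, which is why every group is 6 bytes and the block length scales as NN × 1.5.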

Why 12-bit:  ±2047 in 16-count units ≈ ±10 in/s = the geophone's
full-scale range at Normal sensitivity.  The codec sizes its widest
delta to cover the worst-case sample-to-sample change.

Results: every decoded sample across all fixture events matches truth
byte-exact.  ZERO divergences.

  event-a:  9984 samples (full event, all 3 geos)
  event-c:  3840 (full event)
  event-d:  3840 (full event)
  JQ0:      9984 (full event)
  V70:      9984 (full event)
  SP0:      5122 (walker stops early on edge cases)
  SS0:      1758
  SV0:      2114
  event-b:   738

  TOTAL: 47,364 ADC samples verified, zero errors.

Three full 3-sec events decode end-to-end across all three geo
channels.  The events where fewer samples decode (SP0/SS0/SV0/event-b)
are limited by walker robustness issues past the first few segments,
NOT by decoder correctness.

64 tests pass (up from 55).  Files: minimateplus/waveform_codec.py
(new 30 NN decode + corrected walker length), tests/test_waveform_codec.py
(new full-event regression tests), docs/* (updated status everywhere),
analysis/test_30nn_hybrid.py (new — the analysis script that confirmed
the format).
2026-05-20 17:28:54 +00:00
Claude 07675626dc codec-re: channel rotation CONFIRMED — full multi-channel decoder works
The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:

  SP0 seg 0: best fit Vert  508/508  ✓
  SP0 seg 1: best fit Long  508/508  ✓
  SP0 seg 3: best fit Tran  508/508  ✓  (Tran continuation)
  SP0 seg 5: best fit Long  508/508  ✓
  SP0 seg 9: best fit Long  508/508  ✓
  V70 seg 0: best fit Vert  508/508  ✓
  V70 seg 1: best fit Long  508/508  ✓

Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.
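
A sketch of the rotation as a lookup. It assumes segment 0 is Vert because Tran begins in the body preamble; that offset is inferred only from the seven scores listed above, so treat it as a hypothesis rather than the decoder's actual indexing.

```python
# Rotation order starting at segment 0 (assumed, see note above).
SEGMENT_ROTATION = ("Vert", "Long", "MicL", "Tran")


def channel_for_segment_sketch(segment_index: int) -> str:
    """Map a 40 02 segment index to its channel under the assumed rotation."""
    return SEGMENT_ROTATION[segment_index % len(SEGMENT_ROTATION)]
```

Under this assumption, segments 0, 1, 3, 5 and 9 map to Vert, Long, Tran, Long and Long, matching the analyzer's best fits.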

Also discovered that the segment header does DOUBLE duty: bytes [14:18] anchor
the NEW segment's channel (2 samples as int16 BE in 16-count units), AND
bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as
int16 BE).  This is the same "2 anchors + delta stream" structure as the
body preamble for Tran.

decode_waveform_v2 now returns full per-channel sample dicts.
Byte-exact verified ranges:
  V70: Tran 512, Vert 512, Long 512   (all first segments)
  JQ0: Tran 512, Vert 258
  SP0: Long 1536 (all 3 L segments)

Still open: the 30 NN block format (high-amplitude packed deltas) —
appears mid-segment when single-byte deltas can't carry the magnitude.

6 new tests bring the count to 46.  All passing.
2026-05-20 17:28:54 +00:00
Claude f68ee9f0f9 docs: clean up waveform-codec doc layers per review
Three "truth layers" had drifted apart between commits.  Fixed:

1. waveform_codec.py docstring rewritten from the 2026-05-08
   "structural framing only" state to the 2026-05-11 "Tran segment 0
   solved + segment-header partially decoded" state.  Killed stale
   "~80 sample-sets per segment" language (real segments are
   flash-page-byte-sized, not sample-count-sized; observed first-segment
   sizes are 42-510 samples depending on signal).  Killed stale
   "preamble is 7 or 9 bytes" language (always 7).

2. docs/instantel_protocol_reference.md §7.6.1: added a clear
   "CURRENT STATUS" box at the top with a status table.  Replaced the
   stale "~80 sample-sets" line with the verified per-event segment
   sizes.  Merged two redundant segment-header field-table sections.

3. docs/waveform_codec_re_status.md (NEW): clean working-status doc.
   Solved / not solved / hypothesis / next experiment / fixtures /
   tests.  The protocol reference remains the historical Rosetta
   Stone; this new file is the current-truth working note that
   shouldn't accumulate fossil layers.

4. CLAUDE.md §"Waveform body codec": prominent warning box at top —
   "DO NOT TRUST decoded sample arrays yet."  BW binary passthrough
   is the only sample-bearing output to trust until the decoder
   lands.  Added a "Next experiment" subsection pointing the next
   pass at the segment-channel scoring analyzer.

40 tests still pass.
2026-05-20 17:28:54 +00:00
Claude a0c9a482c7 codec-re: 00 NN is RLE; full Tran segment-0 decode (4 of 5 events)
User uploaded a Vert-heavy event (JQ0) and a Mic-heavy event (V70).
Those two were exactly what was needed to crack the next piece:

- 00 NN block = run-length-encoded zero deltas in the current channel.
  Append NN copies of the current cumulative value (no change).
- find_data_start now recognizes 00 NN as a valid first tag (some events
  begin with a leading 00 NN RLE block).
- decode_tran_initial now decodes the FULL segment 0 (not just the first
  data block).
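
The RLE rule in the first bullet amounts to repeating the last sample. A minimal sketch, assuming the sample list already holds at least the two preamble anchors; the function name is hypothetical.

```python
from typing import List


def apply_rle_zero_block_sketch(samples: List[int], nn: int) -> List[int]:
    """00 NN: append NN copies of the current cumulative value (zero deltas)."""
    if not samples:
        raise ValueError("need at least one prior sample to repeat")
    return samples + [samples[-1]] * nn
```

Returning a new list rather than mutating the input keeps the helper easy to test against a known prefix.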

Results across 5 fixture events:
  - M529LL1A.SP0 (loud-all-channels)  : 510 / 510  ✓
  - M529LL1L.JQ0 (Vert-heavy)         : 510 / 510  ✓
  - M529LL1L.V70 (Mic-heavy)          : 510 / 510  ✓
  - M529LL1A.SV0 (loud-from-start)    :  58 /  58  ✓
  - M529LL1A.SS0 (loud-from-start)    :  42 / 502  (stops at first 30 04)

The 30 04 block (only seen in loud-from-start events) hasn't been
decoded yet — likely a channel-switch marker for the high-amplitude
regime.

Also discovered: segment header (40 02) payload bytes [0:2] = T_delta
at first sample of new segment, [6:8] = byte length to next segment.
Multi-segment Tran decoding still diverges after sample 512 because
the per-segment channel ordering after the header is unknown.

Tests: 40 pass (up from 36).

Files:
- minimateplus/waveform_codec.py: find_data_start fix, RLE handling,
  full segment-0 decode in decode_tran_initial
- tests/test_waveform_codec.py: synthetic RLE test, full segment 0
  tests for JQ0 and V70
- tests/fixtures/5-11-26/: M529LL1L.JQ0, M529LL1L.V70 + TXT exports
- docs/instantel_protocol_reference.md §7.6.1: RLE + segment-header docs
2026-05-20 17:28:54 +00:00
Claude 6ac126e05c codec-re: crack Tran channel codec with high-amplitude May 11 bundle
User uploaded 3 high-amplitude events (PPV 6-7 in/s — shook the geophone
hard) to decode-re/5-11-26/.  These cracked the Tran codec:

- Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE
  in 16-count units (LSB = 0.005 in/s).  Confirmed across all 7
  fixtures.
- First data block carries Tran deltas from sample 2 onward:
  * 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed
    nibble deltas (high nibble first)
  * 20 NN block: NN int8 signed deltas
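
The preamble anchors and the two delta block types can be combined into a small decoder. This sketch assumes the first data block starts immediately after the 7-byte preamble, that nibbles are two's-complement, and that deltas accumulate as a running sum; `decode_tran_first_block_sketch` is not the module's `decode_tran_initial`.

```python
import struct
from typing import List


def _signed_nibble(n: int) -> int:
    """Interpret a 4-bit value as two's-complement signed (-8..7)."""
    return n - 16 if n & 0x8 else n


def decode_tran_first_block_sketch(body: bytes) -> List[int]:
    """Decode Tran[0], Tran[1] plus the first 10 NN / 20 NN delta block."""
    t0, t1 = struct.unpack_from(">hh", body, 3)  # int16 BE at [3:5], [5:7]
    samples = [t0, t1]
    tag, nn = body[7], body[8]
    payload = body[9:]
    if tag == 0x10:
        deltas = []
        for byte in payload[: nn // 2]:
            deltas.append(_signed_nibble(byte >> 4))    # high nibble first
            deltas.append(_signed_nibble(byte & 0x0F))
    elif tag == 0x20:
        deltas = list(struct.unpack_from(f"{nn}b", payload, 0))
    else:
        raise ValueError(f"unsupported first tag 0x{tag:02x}")
    for delta in deltas:
        samples.append(samples[-1] + delta)  # running sum in 16-count units
    return samples
```

Multiplying each result by 0.005 would give in/s, since one unit here is the 0.005 in/s LSB noted above.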

Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors
against BW's ASCII export.

Why the earlier 96-combination brute force failed: the quiet 5-8
events all had T[0] = T[1] ≈ 0 so the preamble's per-channel encoding
was undetectable.  Loud events made the encoding obvious.

What's solved:
- minimateplus.waveform_codec.decode_tran_initial: returns first
  N Tran samples in 16-count units for any body.
- Walker length formula for in-data 30 NN blocks (NN*2 instead of NN*4).
- Walker now handles bodies that start with 20 NN (in addition to 10 NN).

What's still open:
- Tran past the first data block (multi-block channel switching).
- Vert / Long / MicL channel encodings.
- Walker correctness past offset ~427 in event-b.

Tests: 36 pass.  decode_waveform_v2 still returns None — the full
multi-channel decoder is not wired up.  decode_tran_initial is the
new verified entry point.

Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py
(adds 5-11-26 fixtures + decode_tran_initial tests), and
docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).
2026-05-20 17:28:54 +00:00
Claude d3f77d1d96 codec-re: solve waveform body block framing; per-byte sample mapping still open
Decoded the structural framing of the Blastware waveform body — the bytes
between the 21-byte STRT record and the 26-byte file footer.  The body is
a sequence of tagged variable-length blocks, NOT raw int16 LE.  Five tag
types (10/20/00/30/40 NN) and their lengths are now confirmed against the
4-event May 2026 fixture bundle.  Body splits cleanly into ~16 segments
(for a 1280-sample event) separated by 40 02 segment headers carrying a
monotonically incrementing uint32 LE counter at bytes [8:12].

What's done:
- minimateplus/waveform_codec.py — block walker, segment splitter, segment
  header parser.  decode_waveform_v2 is a stub returning None until the
  byte-to-sample mapping is solved; client.py is unchanged.
- tests/test_waveform_codec.py — 31 tests covering block detection, lengths,
  contiguous-walk, segment splitting, segment-header parsing, and counter
  monotonicity.  All pass.
- tests/fixtures/decode-re-5-8-26/ — bundled fixtures (4 events, BW binary
  + Blastware ASCII export each).
- docs/instantel_protocol_reference.md §7.6.1 — replaced retraction box
  with the verified structural decoding plus an explicit list of what's
  still open.

What's still open: the per-byte mapping inside 10 NN / 20 NN blocks.  96
channel-permutation × nibble-order × sign-convention combinations were
brute-force tested; none match BW's ASCII export to within ±1 ADC count.
The codec is more elaborate than uniform 4-bit deltas — likely a hybrid
variable-bit-width scheme with segment-anchor resync points.  Next
recommended step: capture an event with a known calibration tone to pin
down magnitude scaling.

Walker also bails out partway through event-b (open issue documented in
both the module and the protocol reference).
2026-05-20 17:28:54 +00:00
serversdown cd20be2eff feat: add thor/micromate compatibility v0.18.0
2026-05-19 04:32:43 +00:00
serversdown aac1c8e06d fix(import): derive record_type from filename suffix instead of hardcoding "Waveform"
The BW ACH ingest path was inserting every event with
record_type="Waveform" regardless of the actual type because
read_blastware_file() had `ev.record_type = "Waveform"` hardcoded, and
the live watcher-forward path parses files from a tmp path (suffix
".bw") that doesn't carry the original extension.

V10.72+ MiniMate Plus firmware encodes the event type as the last
character of the AB0T extension scheme (H=Histogram, W=Waveform,
M=Manual, E=Event, C=Combo).  This change:

  1. Adds derive_record_type_from_filename() public helper in
     minimateplus/event_file_io.py
  2. Uses it inside read_blastware_file() so direct callers (the
     --dry-run path of scripts/import_bw.py, tests, ad-hoc scripts)
     get correct types automatically
  3. Overrides ev.record_type in WaveformStore.save_imported_bw()
     using the ORIGINAL filename (source_path.name) — required
     because the parser sees only the tmp file

Old S338 firmware (3-char extensions ending in `0`) and any
unrecognized suffix fall back to "Waveform".
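
A minimal sketch of the suffix rule described above. `record_type_from_name` is a hypothetical stand-in for derive_record_type_from_filename(), and the example filenames in the usage note are made up.

```python
from pathlib import Path

# Last character of the extension -> record type, per the list above.
_SUFFIX_TYPES = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}

def record_type_from_name(filename: str) -> str:
    """Map the final extension character to a record type, defaulting to Waveform."""
    ext = Path(filename).suffix      # e.g. ".AB0H"; empty when there is no extension
    last_char = ext[-1:].upper()     # slicing avoids an IndexError on an empty suffix
    return _SUFFIX_TYPES.get(last_char, "Waveform")
```

With this rule, a name such as `EVENT001.AB0H` maps to "Histogram", while a 3-char extension ending in `0` or a file with no extension falls back to "Waveform".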

Existing DB rows ingested before this fix are stuck with
record_type="Waveform" — a one-off SQL backfill would fix them
retroactively if desired.  Terra-view's event modal also derives
client-side from the filename, so the UI already shows the correct
type for old events even without the backfill.

Version bumped to 0.16.1 in pyproject.toml, event_file_io.py
TOOL_VERSION, sfm/server.py FastAPI version, and CHANGELOG.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 21:09:21 +00:00
serversdown 197c0630e2 chore(release): v0.16.0 — BW ACH ingestion
The "BW ACH ingestion" release.  Paired with series3-watcher v1.5.0,
every Blastware ACH event (binary + _ASCII.TXT report) lands in
SeismoDb with device-authoritative peaks, project metadata, sensor
self-check, and ZC/Time-of-Peak data — without depending on the
still-undecoded waveform body codec.

Bumps pyproject.toml + minimateplus/event_file_io.py TOOL_VERSION
to 0.16.0.  README banner + CHANGELOG entry summarise the work
that landed across commits cdfe4ad..f83993a on this branch.
2026-05-11 07:33:48 +00:00
serversdown 6b2a44ff02 fix(import): overlay BW report onto Event + upsert DB row on re-import
Two compounding bugs caused forwarded events to land in the DB with
broken-codec peak values (~10 in/s saturation on every channel) and
no project info, even when the watcher correctly paired a BW ASCII
report with the binary.

Bug 1: save_imported_bw built the sidecar JSON with the report's
authoritative peak / project values via event_to_sidecar_dict(
bw_report=...), but never overlaid those onto the in-memory Event
that flows to db.insert_events().  So the DB row got peak_values
from read_blastware_file()._peaks_from_samples() — which runs the
still-undecoded waveform body codec assuming raw int16 LE and
produces ±32K-shaped noise (= ±10 in/s at Normal range) regardless
of the actual signal.  The sidecar JSON had the truth but the DB
columns (which the webapp queries for fast filter/sort) lied.

Bug 2: insert_events' IntegrityError handler only refreshed the
filename/filesize/a5_pickle/sidecar columns when a duplicate
(serial, timestamp) was seen.  Peak values, project info,
sample_rate, record_type stayed locked in at whatever the FIRST
insert wrote.  So even after Bug 1 was fixed, the historical
events in the DB (already inserted with broken-codec peaks) would
never get their values corrected, because a re-forward would just
hit IntegrityError and skip the field refresh.

Fix 1 (minimateplus/event_file_io.py + sfm/waveform_store.py):
  - New apply_report_to_event(event, report) helper folds the BW
    report's device-authoritative fields onto the Event in-place:
    per-channel PPV, peak vector sum, mic PSPL→psi, project /
    client / operator / sensor_location, sample_rate, record_time.
  - save_imported_bw() calls the helper right after parsing the
    report.  The Event that flows to insert_events() now carries
    correct values.

Fix 2 (sfm/database.py):
  - insert_events()'s IntegrityError UPDATE now refreshes every
    device-authoritative column from the new data: tran_ppv,
    vert_ppv, long_ppv, peak_vector_sum, mic_ppv, project, client,
    operator, sensor_location, sample_rate, record_type, plus
    the existing filename/filesize/a5_pickle/sidecar fields.
  - Preserves: id, waveform_key, session_id, created_at (immutable
    / FK fields), and false_trigger (operator review state).
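
The insert-then-update pattern from Fix 2 can be sketched as follows. The table is a cut-down, hypothetical schema (only `tran_ppv` and `project` stand in for the device-authoritative columns), and `upsert_event` is not the real insert_events().

```python
import sqlite3

# Hypothetical, reduced schema: one row per (serial, timestamp).
SCHEMA = """
CREATE TABLE events (
    id            INTEGER PRIMARY KEY,
    serial        TEXT NOT NULL,
    timestamp     TEXT NOT NULL,
    tran_ppv      REAL,
    project       TEXT,
    false_trigger INTEGER NOT NULL DEFAULT 0,
    UNIQUE (serial, timestamp)
)
"""

def upsert_event(conn, serial, timestamp, tran_ppv, project):
    """Insert a row; on a duplicate key, refresh only device-authoritative columns."""
    try:
        conn.execute(
            "INSERT INTO events (serial, timestamp, tran_ppv, project) "
            "VALUES (?, ?, ?, ?)",
            (serial, timestamp, tran_ppv, project),
        )
    except sqlite3.IntegrityError:
        # id and false_trigger are deliberately left out of the SET list,
        # so operator review state survives a re-import.
        conn.execute(
            "UPDATE events SET tran_ppv = ?, project = ? "
            "WHERE serial = ? AND timestamp = ?",
            (tran_ppv, project, serial, timestamp),
        )
    conn.commit()
```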

End-to-end simulation verified:
  - Step 1: import without report → DB has ±10 in/s peaks, no project
  - Step 2: re-import WITH report → upsert path fires, DB now has
            device-authoritative 0.005 in/s peaks + sensor_location
  - Step 3: operator sets false_trigger=1, re-import again → flag
            preserved, peaks remain correct

For the user's situation: deleting the watcher state file forces a
re-forward of all events.  Each re-forward now pairs with its
_ASCII.TXT, applies the report onto the Event, and the upsert
refreshes the DB row.  No DB nuke needed.

Full SFM suite: 62 passed, 44 skipped.
2026-05-11 05:51:39 +00:00
serversdown a032fa5451 refactor(bw-report): parse user notes by POSITION, not by label
The four operator-supplied note fields in BW's Compliance Setup →
Notes tab (Project / Client / User Name / Seis Loc) have
USER-EDITABLE LABELS — an operator can rename them in BW's UI to
"Building:", "Site Address:", "Inspector:", or anything else, and
the ASCII export writes those literal labels verbatim.  The
previous label-normalisation map approach (just added in commit
6a7e8c6) was fragile: it could only match label spellings we'd
enumerated in advance.  An operator using "Site:" instead of
"Seis Loc:" would have their sensor location silently dropped.

What IS reliable: BW always writes the 4 user-notes lines
contiguously, in the same order, between the "Units :" line and
the "Geo Range :" line of the export.  So parse them by POSITION:

  position 1 → project
  position 2 → client
  position 3 → operator
  position 4 → sensor_location

The original labels BW wrote are preserved in a new
`BwAsciiReport.user_note_labels` dict (canonical slot → literal
label string) so terra-view can render them as the operator named
them.

Removes the `_OPERATOR_LABEL_MAP` / `_normalise_label_for_lookup`
helpers and the elif-by-normalised-label branch in `parse_report`.
Replaces with a small state machine that flips on the "Units" line
and flips off on the "Geo Range" line.
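
A sketch of such a state machine, under the assumption that every export line has the form `label : value` and can be split on its first colon (the real export layout may differ). `parse_user_notes` is a hypothetical name, and the captured labels are stored without their trailing colon.

```python
SLOTS = ("project", "client", "operator", "sensor_location")

def parse_user_notes(lines):
    """Assign the lines between "Units" and "Geo Range" to slots by position."""
    values = dict.fromkeys(SLOTS)   # every slot starts as None
    labels = {}                     # canonical slot -> label as written
    in_block = False
    position = 0
    for line in lines:
        label, sep, value = line.partition(":")
        if not sep:
            continue                # assumption: lines without a colon are skipped
        label = label.strip()
        if label == "Units":
            in_block, position = True, 0
            continue
        if label == "Geo Range":
            in_block = False
            continue
        if in_block and position < len(SLOTS):
            slot = SLOTS[position]
            values[slot] = value.strip()
            labels[slot] = label
            position += 1
        # Lines past the fourth slot, or outside the block, are ignored.
    return values, labels
```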

Tests:
  - Default-label fixtures (waveform + histogram) still populate
    correctly, with operator's labels captured.
  - Synthetic custom-labelled exports ("Building:" / "Site Address:" /
    etc.) populate the right slots by position.
  - Histogram-specific "Seis. Location:" works.
  - Lines outside the Units→Geo Range range are ignored even if
    they look like user notes (defensive against malformed exports).
  - Partial blocks (fewer than 4 lines) leave later slots None.
  - Extra lines beyond 4 are dropped (5th slot doesn't exist).

26 tests in test_bw_ascii_report.py (was 33; net drop reflects
parametrised label tests collapsed into 6 focused position tests).
Full SFM suite: 62 passed, 44 skipped.

Pairs with series3-watcher v1.5.0 which fixes the filename pairing
so the report reaches this parser in the first place.
2026-05-10 22:28:31 +00:00
serversdown 6a7e8c6e86 feat(bw-report): normalise operator-field label variants
Blastware writes the operator-supplied fields with different label
spellings across firmware versions and recording modes — most
notably "Seis. Location" on histogram exports vs "Seis Loc:" on
waveform exports.  Previous parser only matched the latter, so
every histogram event silently lost its sensor_location field.

Replace the four hardcoded `key.rstrip(":") == "X"` branches with
a single `_OPERATOR_LABEL_MAP` dispatch table keyed by normalised
label (lowercase, trailing colon/period stripped, internal
whitespace collapsed).  Adds these variants on day 1:

  project:         "Project:" / "Project"
  client:          "Client:"  / "Client"
  operator:        "User Name:" / "User Name"
  sensor_location: "Seis Loc:" / "Seis. Location" / "Seis Location"
                 / "Sensor Location" / "Seis Loc"

To absorb future BW label drift, add a one-line dict entry — no
new elif branch.
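
A minimal sketch of the normalise-then-dispatch idea, assuming the label has already been separated from its value. `normalise_label`, `LABEL_MAP`, and `field_for_label` are illustrative names rather than the module's actual helpers.

```python
import re

def normalise_label(label: str) -> str:
    """Lowercase, collapse internal whitespace, strip trailing colons/periods."""
    collapsed = re.sub(r"\s+", " ", label.strip())
    return collapsed.rstrip(":.").rstrip().lower()

# One entry per normalised spelling; a new variant is a one-line addition.
LABEL_MAP = {
    "project": "project",
    "client": "client",
    "user name": "operator",
    "seis loc": "sensor_location",
    "seis. location": "sensor_location",
    "seis location": "sensor_location",
    "sensor location": "sensor_location",
}

def field_for_label(label: str):
    """Return the canonical field name, or None for an unknown label."""
    return LABEL_MAP.get(normalise_label(label))
```

An unlisted label such as "Site:" returns None here, so any spelling not enumerated in advance is not routed to a field.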

14 new tests cover:
  - Each label variant routes to the correct field (parametrised)
  - Case-insensitive matching ("seis loc" / "SEIS LOC" / "SeIs LoC")
  - Whitespace-collapse ("Seis  Loc" with double-space)
  - End-to-end parse of a real histogram fixture from
    example-events/histogram/ — sensor_location ('Loc #1 - 2652 Hepner...')
    populates correctly even though the file uses "Seis. Location"

Total bw_ascii_report tests: 19 → 33.  Full SFM suite still green
(69 passed, 44 skipped — pre-existing skips for h5py-dep tests).

Pairs with series3-watcher v1.5.0 (which fixes the filename pairing
so histograms actually reach this parser in the first place).
2026-05-10 20:13:44 +00:00
serversdown cdfe4ad3c8 feat(import): parse paired BW ASCII reports on /db/import/blastware_file
Blastware's ACH writes a per-event ASCII report (.TXT) alongside each
event binary, containing the rich derived per-channel fields BW
computes (PPV, ZC Freq, Time of Peak, Peak Acceleration, Peak
Displacement, Peak Vector Sum + time, sensor self-check Pass/Fail,
monitor-log timestamps).  None of this lives in the BW binary itself.

When the watcher daemon forwards both files to /db/import/blastware_file
in one multipart POST, we now:

  - Pair binaries with their .TXT partners by filename match
  - Parse the report into a structured BwAsciiReport
  - Land the rich fields in a new top-level `bw_report` block of the
    sidecar JSON
  - Overlay the report's peaks/project_info/timestamp/sample_rate/
    record_time/total_samples/pretrig_samples onto the canonical
    sidecar fields (the report values are device-authoritative; the
    BW-binary STRT-derived values had bugs like reading the 0x46
    record-type marker as rectime)

This unblocks the monthly-summary review workflow — events become
sortable/filterable by peak, location, project, etc. — without
depending on the still-undecoded waveform body codec.
2026-05-08 23:56:43 +00:00
serversdown e1a73b2c44 Merge pull request 'feat: add waveform store handling' (#16) from sfm-waveform-store into main
Reviewed-on: #16
2026-05-08 15:03:32 -04:00
serversdown bbed85f7e2 fix: update channel keys to include 'MicL' in device_event_waveform documentation 2026-05-08 18:48:06 +00:00
serversdown c641d5fc10 feat: v0.15.0
### Added

- **Layered event storage architecture.**  Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file).  Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression).  This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).**  The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side.  The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.**  The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor.  This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive).  Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
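
The arithmetic behind the viewer accuracy fix can be sketched as below. It assumes a linear mapping in which 32767 ADC counts correspond to the geophone full-scale value; `counts_to_velocity` is a hypothetical helper, not the server's conversion routine.

```python
# Full-scale velocities quoted in the entry above, in in/s.
GEO_FULL_SCALE_IN_S = {"Normal": 10.0, "Sensitive": 1.25}
ADC_FULL_SCALE_COUNTS = 32767

def counts_to_velocity(counts, geo_range="Normal"):
    """Convert signed ADC counts to in/s (assumed linear scaling)."""
    scale = GEO_FULL_SCALE_IN_S[geo_range] / ADC_FULL_SCALE_COUNTS
    return [c * scale for c in counts]

# The old viewer used 6.206053 / 32767 instead of 10.0 / 32767 for Normal range.
old_scale = 6.206053 / ADC_FULL_SCALE_COUNTS
new_scale = GEO_FULL_SCALE_IN_S["Normal"] / ADC_FULL_SCALE_COUNTS
shortfall = 1.0 - old_scale / new_scale   # fraction by which plots were too low
```

The ratio of the two factors is 6.206053 / 10.0, which is where the roughly 38% shortfall comes from.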

### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).
2026-05-08 04:39:51 +00:00
serversdown 9afa3484f4 feat(cache): implement integrity checks for cached events and waveforms
- Added `waveform_key` and `event_timestamp` columns to `CachedEvent` and `CachedWaveform` for integrity verification.
- Implemented logic to flush the cache when a mismatch in (waveform_key, event_timestamp) is detected during event and waveform updates.
- Enhanced `set_events` and `set_waveform` methods to check for mismatches and trigger cache eviction as necessary.
- Introduced a new `LiveCache` class to manage in-memory caching of live device data, separating it from the server logic for better testability.
- Added tests to verify the correctness of cache invalidation logic, particularly for post-erase key reuse scenarios.
- Updated web application to include a "Force refresh" toggle, allowing users to bypass the cache and re-fetch data from the device.
2026-05-07 04:42:00 +00:00
serversdown 429c6ac87a feat(protocol): implement v0.14.0 SUB 5A protocol rewrite with enhanced chunk handling and new helpers
test: add regression tests for v0.14.x SUB 5A protocol fixes
refactor(logging): change warning logs to debug for less verbosity in write_blastware_file
2026-05-06 14:18:31 -04:00
claude a27693242d fix(protocol): implement partial DLE stuffing for 0x10 bytes in params to prevent request corruption 2026-05-05 18:28:28 -04:00
claude eefec0bd64 fix(blastware_file): remove harmful "duplicate header+STRT" strip logic to preserve valid waveform data 2026-05-05 17:48:40 -04:00
claude 7444738883 debug(protocol): event-N probe is now at counter = start_offset instead of start_offset + 0x46 2026-05-05 16:46:35 -04:00
claude b66cc9d075 fix(blastware_file): update TERM detection logic and strip duplicate header blocks for accurate file writing 2026-05-04 14:28:11 -04:00
claude 45e61fbcaf big refactor of waveform protocol. 2026-05-03 01:20:21 -04:00
claude d758825c67 fix(protocol): correct continuous-mode record header classification for accurate timestamp extraction 2026-05-01 20:28:55 -04:00
claude 0fbb39c21a Big event bugfix. See details:
## v0.13.0 — 2026-05-01

### Fixed

- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
  `read_bulk_waveform_stream` was walking the chunk counter past the actual
  end of the event, picking up post-event circular-buffer garbage that
  corrupted reconstructed Blastware files for any waveform > ~1 sec.  The
  loop now extracts the event's `end_offset` from the STRT record at
  `data[23:27]` of the probe response and stops the chunk walk when the next
  counter would step past it.  Verified against three BW MITM captures
  (4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
  bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.
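
A sketch of the bounded walk. The `step` default of 0x0400 is a placeholder, and whether a chunk starting exactly at `end_offset` should still be read is an assumption; `bounded_chunk_counters` is a hypothetical helper, not `read_bulk_waveform_stream` itself.

```python
def bounded_chunk_counters(start_offset: int, end_offset: int, step: int = 0x0400):
    """List the chunk counters to request, stopping before stepping past end_offset."""
    if step <= 0:
        raise ValueError("step must be positive")
    counters = []
    counter = start_offset
    while counter <= end_offset:
        counters.append(counter)
        counter += step   # the next counter; the loop ends once it passes end_offset
    return counters
```

Without the `end_offset` bound, a walk like this would only stop at a fixed chunk cap, which is how post-event buffer contents ended up in the reconstructed file.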

### Added

- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)` —
  computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
  formula confirmed across all 3 BW captures.  Not yet wired into
  `read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
  existing `blastware_file.write_blastware_file` frame-structure expectations);
  available for the next iteration that switches to BW's 0x0200 chunk step.
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
  from the STRT record in an A5 response payload.
2026-05-01 18:37:34 -04:00
Claude 625b0a4dfc feat(seismo_lab): add Download tab that captures wire bytes during event download
Adds a new CapturingTransport wrapper in minimateplus.transport that mirrors
every TX/RX byte to two raw .bin files using the same on-wire format as
bridges/ach_mitm.py, so the resulting captures are byte-for-byte compatible
with the existing Blastware MITM captures and load directly in the Analyzer.

A new "Download" tab in seismo_lab.py lets the user connect to a device over
TCP or serial and run connect / list-keys / download-events while the wrapper
saves raw_bw_<ts>.bin (our TX) and raw_s3_<ts>.bin (device TX) into a
seismo_dl_<ts>[_<label>]/ session directory. On completion, the panel hands
both files to the Analyzer and switches tabs, mirroring the UX of the
existing Bridge capture flow.
2026-05-01 00:12:02 +00:00
claude a7585cb5e0 fix(blastware_file, server): implement logic to skip extra chunks after metadata for accurate file writing 2026-04-26 16:32:32 -04:00
claude ae30a02898 fix(blastware_file, server): enhance logging and correct chunk handling for accurate data processing 2026-04-26 16:03:07 -04:00
claude 2f084ed105 fix(protocol): update chunk counter formula to use max(key4[2:4], 0x0400) for accurate data streaming 2026-04-26 01:28:47 -04:00
claude 7976b544ed fix(blastware_file): never skip A5 frames based on classification at fi>0
Frame 0 is always the probe; frames 1+ are always data (waveform ADC
chunks, compliance config, compliance continuation).  Gating on
classify_frame() at fi>0 produces false positives: ADC binary data
can coincidentally contain b"STRT\xff\xfe", causing frames 1 and 5
to be silently dropped from the body (confirmed from live capture on
event key=01110000).  Remove all type-based filtering; include every
frame unconditionally with the standard index-based skip amounts.
2026-04-26 00:59:36 -04:00
claude 0415af19b4 fix(blastware_file): remove seen_metadata flag and adjust frame processing logic 2026-04-24 20:21:03 -04:00
claude 35c3f4f945 fix(protocol): correct A5 frame classification and chunk counter formula 2026-04-24 17:25:29 -04:00
claude 43c8158493 feat(blastware_file): classify A5 frames, only write waveform frames to body
Add classify_frame() which categorises each A5 frame by content:
  terminator    — page_key == 0x0000
  probe_or_strt — contains b"STRT"
  metadata      — contains compliance-config ASCII markers
                  (Project:, Client:, Standard Recording Setup, …)
  waveform      — binary-heavy (< 20% printable ASCII), i.e. raw ADC data
  unknown       — fallback

Update write_blastware_file() body loop: frame 0 (probe) is still
always processed; frames 1+ are only included when classify_frame
returns "waveform".  Metadata frames (compliance config block with
Project:/Client:/etc.) and any stray STRT-bearing frames are skipped
with a warning/debug log.  Terminator frame handling is unchanged.
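
A minimal sketch of a content-based classifier along these lines. The signature, the marker tuple, and the definition of "printable" as bytes 0x20 to 0x7E are assumptions; `classify_frame_sketch` is not the real classify_frame().

```python
# Assumed subset of the compliance-config ASCII markers.
METADATA_MARKERS = (b"Project:", b"Client:", b"Standard Recording Setup")

def classify_frame_sketch(page_key: int, data: bytes) -> str:
    """Categorise an A5 frame payload by its content."""
    if page_key == 0x0000:
        return "terminator"
    if b"STRT" in data:
        return "probe_or_strt"
    if any(marker in data for marker in METADATA_MARKERS):
        return "metadata"
    if data:
        printable = sum(1 for b in data if 0x20 <= b <= 0x7E)
        if printable / len(data) < 0.20:
            return "waveform"   # binary-heavy payload, presumed ADC data
    return "unknown"
```

A check like this labels any payload that happens to contain the bytes `STRT` as probe_or_strt, including binary sample data.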

Adds temporary print() diagnostics so each frame's classification is
visible in the server log to aid debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 15:48:37 -04:00
claude 242666f358 fix(protocol): correct chunk counter formula for accurate data streaming 2026-04-24 12:52:02 -04:00
claude 03540fdc00 fix: raise max_chunks to 128 for metadata-only 5A download
For 2-second events at 1024 sps the "Project:" metadata frame appears
beyond chunk 32 (the old default cap), causing the safety limit to be
hit and ~34 KB of waveform data to be downloaded instead of stopping
at the metadata frame.  Raising max_chunks to 128 ensures
stop_after_metadata=True can locate the metadata frame for record
times up to ~4 seconds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 02:19:27 -04:00
claude ab2c11e9a9 fix(protocol): refine extra chunk fetching logic for accurate termination response 2026-04-23 20:30:07 -04:00
claude fa887b85d9 fix(protocol): update extra chunk fetching logic to stop at silence detection 2026-04-23 18:28:14 -04:00
claude ecd980d345 fix(protocol): enhance extra chunk fetching logic to ensure footer detection 2026-04-23 18:22:27 -04:00