The /db/events/{id}/waveform.json endpoint returns `time_axis` as a
metadata object — {sample_rate, pretrig_samples, t0_ms, dt_ms,
n_samples, total_samples, rectime_seconds} — not a per-sample times
array. Both viewers (sfm_webapp.html sidecar modal + event_browser.html)
were treating it as an array, silently falling back to a derived path
that ignored pretrig entirely and started the time axis at 0.
Symptom: trigger line drawn at the very left edge of every chart, no
visible "leading up to the event" samples even though they're in the
decoded data.
Fix: read time_axis.t0_ms (negative when pretrig samples exist),
time_axis.dt_ms, build per-sample times as `t0_ms + i * dt_ms`. Trigger
line lands at sample where t crosses 0; pretrig samples render at
negative t to the left of it.
Confirmed on a K558 event with 208 pretrig samples + 2 sec rectime at
1024 sps — time axis now spans -203 ms to +2046 ms, trigger line at
~9% from the left edge as expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three UX upgrades to the main SFM webapp at /, all reinforcing the
'browse stored events' flow as the primary entry point:
1. Default section is now Database, not Live Device. Most users land
here to look at stored events; Live Device is opt-in (click the tab
to talk to a unit). Initial history + units fetch fires on first
paint so the table is populated when the page loads.
2. History table columns are sortable. Click any header to sort:
timestamp, serial, per-channel PPV (Tran/Vert/Long), PVS, mic dB(L),
project, client, type, key. Default direction varies by column type
(desc for numbers + timestamps, asc for text). Sort arrows appear
in the active column header. Headers are sticky so they stay
visible while scrolling.
3. Click-event-to-see-waveform. The existing sidecar review modal now
renders the 4-channel waveform plot inline at the top, fetched from
/db/events/{id}/waveform.json in parallel with the sidecar fetch.
Channels stacked MicL / Long / Vert / Tran (Instantel printout
order), shared bottom time axis, dashed trigger line + triangle
markers at t=0, zero baseline with "0.0" label on the right edge,
peak callouts per channel. Charts cleaned up on modal close.
Resolves the "where is the viewer" surprise — operators no longer need
to know about the /events route to see waveforms.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the cheap visual wins from the BW Event Report layout:
1. Channel order reversed → MicL (top), Long, Vert, Tran (bottom)
to match the Instantel printout.
2. Shared bottom time axis — x-axis ticks only render on the
bottom-most data channel; other channels hide ticks so all four
visually share one time scale.
3. Triangle trigger markers above and below the t=0 dashed line.
4. Horizontal zero-baseline (dotted) per channel with "0.0" label
on the right edge — Instantel convention.
5. "Print view" toggle that flips dark→light theme (white panels,
light grids, dark text) so the viewer can render usefully on
paper-style output / @media print.
6. Per-channel PPV stats table in the metadata header, with Peak
Vector Sum displayed prominently.
7. Colors adjusted to approximate BW trace colors (magenta MicL,
blue Long, green Vert, red Tran).
Future PDF-export work will reproduce the same layout server-side
once you upload a real example PDF and we pick a rendering pipeline
(weasyprint / chromium --print-to-pdf / etc.).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New standalone HTML page (sfm/event_browser.html, ~470 lines, Chart.js)
that lets you browse persisted events from the SeismoDb + WaveformStore.
Companion to the existing live-device viewer at /waveform:
/waveform — connect to a unit and pull events in real time
/events — browse events already stored in the DB
Flow:
1. Page loads → GET /db/units → populate serial dropdown
2. Select serial → GET /db/events?serial=X&limit=500 → event list
3. Click event → GET /db/events/{id}/waveform.json → render
Layout is Instantel-printout-ready: channels stacked vertically in
Tran / Vert / Long / MicL order, trigger line at t=0, peak labels,
clean dark theme. Frames the future PDF-export feature without
needing extra layout work.
Smoke-tested against the dev prod-snapshot — 4 channels render with
correct peaks for K558 events (L=0.3 in/s = the offset-fault peak
we've been chasing all week).
CHANGELOG entry added under [Unreleased] per the v0.20.0 release plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. bw_ascii_report parser misses PPV/vector_sum fields on certain TXT
formats (5 events in prod). Parser extracts every OTHER field for
the same channels — likely a regex / format mismatch specific to
some firmware-or-event-type combination.
2. NULL-timestamp duplicate rows. events.timestamp can come back as
NULL when the codec can't extract a footer timestamp; UNIQUE(serial,
timestamp) doesn't fire on NULL, so backfills create new rows
instead of upserting. 2 affected events on prod, easy SQL cleanup.
3. Histogram body sub-format with byte[5] != 0. ~3 events on prod
(T190LD5Q, O121L4L1) use a histogram body the walker doesn't
recognize. Codec returns 0 valid blocks; DB peaks come from the
bw_report ASCII overlay so DB columns are correct, only the .h5
plot is empty. Cracking the sub-format unlocks the plot.
All three are pre-existing issues that today's deployment surfaced
during validation; none are regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror what the ingest path does: BW's reported peaks (and sample_rate
/ record_time) take precedence over codec output where present.
Without this, --force backfill silently overwrites bw_report-overlaid
DB columns with codec-derived peaks. Wrong for events where the codec
doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style
events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing
PVS=0 on real high-amplitude events. Bit on prod 2026-05-22 with
three top-10 waveform events ending up at PVS=0 (rolled back same day,
this fix is the proper resolution).
New helper minimateplus.event_file_io.apply_bw_report_dict_to_event
operates on the projected sidecar dict shape (the structure
_bw_report_to_dict produces, which is what gets preserved in the
sidecar). Mirrors apply_report_to_event's semantics: only writes
fields where bw_report has a non-None value, no-ops cleanly on
empty / None input.
Dev validation against prod snapshot:
pre : 1839.7315 pvs_sum 356 events with DB PVS ≠ sidecar bw_report
post : 2016.4902 pvs_sum 2 events still mismatched (both have NULL
timestamp + duplicate rows, edge case)
Both edge-case events DO get the correct value written by the new
backfill — their stale rows from prior backfills remain because
UNIQUE(serial, timestamp) doesn't fire on NULL. Separate dedup
cleanup needed for those 2 events (0.014% of corpus); not blocking.
Backfill remains idempotent + bw_report preservation still passes
(0 WIPED, 0 CHANGED on the 3rd consecutive run).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md gains an Architecture section near the top describing the
canonical three-tier mental model:
- SFM: device-side, live connections, /device/* endpoints
- SDM: data-side, DB + waveform store + /db/* endpoints (currently
living under sfm/ for historical reasons; rename deferred)
- Codec library: pure data-interpretation, used by both tiers
Future code should be placed and named according to this model even
though the directory layout doesn't fully reflect it yet. Decision
rule for where new code goes is documented inline.
README.md's Roadmap section gains two strategic-direction subsections:
- "Strategic direction" — frames the suite-of-components vision and
notes that BW ACH + Thor IDF call-home remain the data movers;
seismo-relay's value is on the receiving and processing side.
- "Terra-View ↔ SFM device control" — the long-term vision where
Terra-View can launch into SFM device-control surfaces (operator
notices missing unit → clicks "Connect to Device" → live view in
browser). Includes concrete implementation checklist (auth,
embedded live-monitor view, action history, series IV live
support).
The existing tactical roadmap items remain unchanged below.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-step tool to verify that backfill_sidecars doesn't wipe the
bw_report block from existing sidecars. Workflow:
1. snapshot --out before.json (canonical-JSON hash per sidecar)
2. run backfill
3. diff --baseline before.json (classifies every sidecar:
PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED)
Exit code 1 if any WIPED or CHANGED entries found, 0 otherwise — so
it can gate a CI step or a deploy script.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
the BE9558 / BE18003 extension-byte case
The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still
unclear — empirically non-zero on intervals with sub-Hz or unmeasurable
freq), NOT the high byte of the peak count. The N844 fixture corpus
the original RE was done against had zero values in those bytes for
every block, so uint8 and uint16 LE were equivalent there — but on
real BE9558 Tran-drift events and BE18003 Histogram+Continuous events
the uint16 LE interpretation produced peaks up to 268 in/s and 35×
inflated PVS sums.
Cross-correlated against BW's per-interval ASCII export on:
- K558LKZU/LL1P/LL3K → 100% T/V/L/M peak match (1435 blocks each)
- T003LKZR/LL0O/LL1M → 100% T/V/L, 99.3% M (0.05 dB rounding only)
- N599LKZS/LL0L → 100% all channels
- N844 fixture corpus → 100% all channels (unchanged)
Annotations preserved on every record for future RE; the defensive
_MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s,
well below any physical limit).
Synthetic regression test added using the verbatim K558LKZU.RE0H
interval-12 block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling
let extension-byte blocks slip through at up to 20.48 in/s per
channel, producing 35× inflated PVS sums when first deployed to
prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts)
plus 10% headroom for quantization edge cases.
backfill_sidecars: also preserve the bw_report block alongside
review + extensions when regenerating sidecars. event_to_sidecar_dict
takes a BwAsciiReport dataclass not a dict, so for bw_report we
overlay the existing block after regen rather than passing as a kwarg.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while running the backfill on prod: certain histogram
blocks contain an undocumented extension byte format whose naive
uint16 LE interpretation yields physically impossible peak values
(150+ in/s when the device max is 10). Concrete example from
K558LKSG.3I0H block at body+7424:
bytes [6:10] = 05 79 69 00
current code: T_peak = uint16 LE = 0x7905 = 30981 → 154.9 in/s
reality: T_peak = byte[6] = 5 → 0.025 in/s (matches BW display)
The high byte (0x79 here) appears to be an extension field — possibly
"time of peak within interval" or a Histogram+Continuous sub-mode
marker. Observed across BE9558 and BE18003 units in prod data; never
appeared in the BE12844 fixture corpus the codec was originally
verified against.
Effect on prod: 26 out of 1433 blocks in this one event had inflated
peaks, plus dozens of similar events across the fleet → sum(PVS)
inflated from baseline 988 to 34501 (35x). Rolled back via the
pre-backfill snapshot before any UI exposure.
Defensive fix: bounds-check peak counts in `_decode_block`. Any
field exceeding `_MAX_PEAK_COUNT` (4096 = ~20 in/s, well past the
device's 10 in/s Normal-range FS) causes the block to be skipped
entirely. Other valid blocks in the same event still decode
correctly.
Trade-off: those skipped blocks lose their per-interval data
(peaks + frequencies). Acceptable until the extension format is
reverse-engineered — better than propagating bogus values into PVS
computations downstream.
The 24 existing tests all still pass — the fixtures used during the
original codec development don't exercise the extension-byte case.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while dry-running the backfill on prod: the waveform store
contains both BW (.AB0*/.N00) and Thor IDF (.IDFW/.IDFH) event files
side-by-side because both go through the same per-serial directory
layout. The script's `_looks_like_event_file` heuristic accepted any
3-4 char extension ending in W or H, which matched both BW and IDF.
The script then routes everything through
`event_file_io.read_blastware_file`, which rejects IDF files with
"not a Blastware file (bad header prefix)" — 3807 errors on prod
out of 7201 total events.
Thor IDF events have their own ingest path
(`WaveformStore.save_imported_idf`) and their sidecars are populated
at ingest from the paired `.IDFW.txt` ASCII report. The backfill
script has no value to add for them — there's no decoder to refresh,
and the sidecar metadata is already correct. Filter them out.
After this fix, the prod backfill should run clean: ~3392 BW events
get sidecar+h5 regen as expected; the ~3807 Thor IDF events are
silently skipped.
The proper "IDF backfill" (refresh tool_version stamp on IDF
sidecars by re-running event_to_sidecar_dict against the stored
DB row + sidecar extensions block) is a separate, narrower
follow-up — not blocking the BW backfill rollout.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The histogram-mode event body is now byte-exact decodable.
Companion to the waveform body codec — together they cover every
event file the watcher forwards. Cracked in one session via
cross-event correlation against BW's ASCII export.
The §7.6.2 spec in instantel_protocol_reference.md was structurally
correct (32-byte blocks) but the per-sample semantics were
under-documented. Cross-checking block 130 of N844L6Z8.ZR0H
against its TXT row revealed the layout perfectly:
slot[0] = 10 (constant marker)
slot[1] = T_peak_count (× 0.005 → in/s at Normal range)
slot[2] = T_halfperiod (freq_Hz = 512 / halfp)
slot[3] = V_peak_count
slot[4] = V_halfperiod
slot[5] = L_peak_count
slot[6] = L_halfperiod
slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db)
slot[8] = MicL_halfperiod
The `>100 Hz` sentinel is halfperiod ≤ 5 (since 512/5 = 100 Hz).
Mic dB uses the SAME formula as the waveform codec (sign × (81.94
+ 20·log10(|count|))) — they share the mic ADC calibration constant.
Block identification anchor: bytes [22:24] == 0x0000 AND
bytes [28:32] == 1e 0a 00 00. The tail signature is the most
reliable distinguisher from non-block content in the file.
Files:
minimateplus/histogram_codec.py (new) — decoder + public API
matching the waveform codec's shape:
walk_body(body) -> records
decode_histogram_body(body) -> {Tran, Vert, Long, MicL}
decode_histogram_body_full(body) -> [per-interval dicts]
half_period_to_hz, geo_count_to_ins helpers
minimateplus/event_file_io.py (modified) — read_blastware_file
now tries the waveform codec first, falls back to the histogram
codec on failure. Same output shape, same downstream pipeline.
tests/test_histogram_codec.py (new) — 24 regression locks against
the in-repo fixture corpus, byte-exact against BW ASCII export
for peaks (all 4 channels), frequencies (all 4 channels,
including >100 Hz sentinel handling), block framing, and
segment-ID accounting.
scripts/backfill_sidecars.py (modified) — the has_samples
short-circuit added in the histogram-pending era is now a
pure defensive guard. Histograms in prod will regen .h5 files
correctly on the next backfill run.
docs/histogram_codec_re_status.md (updated) — supersedes the
earlier "in progress" version with the verified format and
test-coverage summary. Notes a few non-essential fields still
open (4-byte block metadata, Geo PVS, Mic psi(L) — none of
which are needed for waveform reconstruction).
Total verified coverage: ~3,500 blocks across 5 fixtures, every
field of every block byte-exact against BW.
The watcher-forwarded histogram event corpus on prod (~10,000
events) will now produce correct .h5 sidecars on the next backfill
run. No additional changes needed to the backfill flow — the
existing tool_version-bump cascade picks them up automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures everything learned in the 2026-05-20 session before scope
forced a pause:
- Block framing is solved: 32-byte blocks, one per histogram
interval, signature byte pattern `[22:24]=0x0000` +
`[28:32]=0x1e 0x0a 0x00 0x00` reliably identifies data blocks.
- Block count = interval count (791 blocks in N844L20G.630H for
a TXT-reported 792 intervals).
- Sample[0] = Tran peak in 0.0005 in/s/count units (verified on
one event — needs cross-event confirmation).
- Samples 1-8 → channel/metric mapping is still open. None of
the obvious layouts (peak-then-freq alternating, all-peaks-
then-all-freqs, per-channel 3-tuples) match the TXT values
across multiple blocks. Likely needs a higher-activity
fixture (current N844 corpus is all noise-floor data) to
disambiguate.
- `>100 Hz` sentinel encoding in the binary is unknown.
- 4-byte variable metadata field at block[24:28] needs
correlation work against TXT columns.
Doc mirrors the structure of docs/waveform_codec_re_status.md so
a future RE session has a familiar entry point. Includes the
suggested attack plan + the code seam where the eventual decoder
will land (minimateplus/histogram_codec.py).
The §7.6.2 spec in instantel_protocol_reference.md is structurally
correct but doesn't pin down per-sample semantics — this doc
supersedes it where they conflict on confidence level.
No code shipped on this branch. When the codec is cracked, the
plan is to land minimateplus/histogram_codec.py + wire into
event_file_io.read_blastware_file() + remove the has_samples
short-circuit from scripts/backfill_sidecars.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes a data-loss bug discovered while dry-running the backfill against
the prod store.
Symptom: every histogram event in the store has its body decoded by
read_blastware_file → codec returns None → samples = empty dict →
``ev.peak_values = _peaks_from_samples(empty)`` returns
``PeakValues(0, 0, 0, 0, 0)`` (NOT None). The backfill script's
existing "seed from DB row when peak_values is None" branch then
correctly *skips* the seeding, and the all-zeros PeakValues flows into
``db.insert_events()``'s UPSERT path, OVERWRITING the existing good DB
peak values for that event (which were populated from the paired BW
ASCII report at ingest).
Net effect: running the backfill on prod would have wiped the PPV /
mic / vector-sum columns for ~10,000 histogram events.
Fix: only compute peaks-from-samples when there are actually samples.
For events the codec couldn't decode (histogram-mode bodies, until
the §7.6.2 histogram codec is wired in), leave peak_values=None as
the "we don't know" signal. Downstream consumers:
- backfill_sidecars.py — its existing ``if ev.peak_values is None:``
branch (line 243) seeds from the DB row, preserving the real
BW-report peaks across the regen.
- WaveformStore.save_imported_bw — apply_report_to_event overlays
peaks from the paired BW ASCII report when one was uploaded.
Histogram imports without a paired report end up with NULL peaks
in the DB, which is correct (better than zeros — clearly says
"no peak data available" rather than "peaks are exactly zero").
Updated the existing synthetic-event round-trip test to expect
peak_values=None for the no-real-body case, which is the truth now.
The 7 fixture-corpus regression tests for real BW waveforms continue
to pass — those have decodable samples, so peak_values is still
populated from the codec output as before.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while dry-running the backfill on the prod store: ~10,000
of ~10,059 events are histogram-mode (filename extension `*H`), and
the waveform-body codec wired in via the previous commit doesn't
handle histogram-mode bodies — only the waveform-mode codec at
§7.6.1 is implemented; the histogram-mode codec at §7.6.2 of the
protocol reference is documented but no Python implementation
exists yet.
Without this guard, every histogram event's .h5 file would be
*replaced* with an empty one — strictly worse than today's
broken-int16-LE .h5 because any downstream viewer expecting
non-empty sample arrays would now error out instead of just
rendering wrong values.
Fix: after the decoder runs, check whether any channel has samples.
If not, skip the .h5 write entirely. The sidecar still regenerates
(refreshing the tool_version stamp and any peaks/project info from
the DB row), but the existing .h5 is left untouched.
This is a *temporary* gate. When the histogram codec lands (next
branch: `feat/wire-histogram-codec`), the has_samples check can be
removed and the backfill will then correctly regenerate all .h5
files, histogram and waveform alike.
Observed effect (dry-run on prod store, 10,059 events):
- waveform events (~5%): "[DRY ] would write … + .h5 (would (re)write)"
- histogram events (~95%): "[DRY ] would write … + .h5 (skipped-empty-samples)"
- sidecar tool_version bump succeeds for both
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two coupled changes that close the rollout gap left by the
read_blastware_file codec wiring:
1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to
0.20.0. This is the version stamp the backfill script reads from
each sidecar's source.tool_version field to detect "this sidecar
was written before the current decoder shipped, regenerate it."
Bumping past every value baked into existing prod sidecars flags
them all as stale on the next backfill run — which is exactly what
we want, since every pre-codec-wiring sidecar was written by the
retracted int16-LE decoder.
2. scripts/backfill_sidecars.py: when the sidecar is being
regenerated this iteration (sha mismatch, tool_version too old,
or --force), also regenerate the .h5. Previously the .h5 logic
only rewrote when --force was passed or the file was missing —
so a tool_version-driven sidecar regen left the broken .h5 in
place forever. Added a `sidecar_stale` boolean to track the
"we're rewriting the sidecar this iteration" state and wired it
into the h5 need-rewrite check.
Path coverage (verified by trace):
- sidecar missing → both regen
- --force → both regen
- sha mismatch → both regen
- tool_ver too old → both regen (THE post-codec-wiring case)
- everything OK → skip iteration entirely (h5 untouched)
Operator review state (review.false_trigger, reviewer, notes) and
the sidecar's extensions block are preserved across regen by the
existing read-existing-sidecar / pass-into-event_to_sidecar_dict
path — unchanged from prior behavior.
Deploy procedure (on prod):
1. Pull this change + the read_blastware_file codec wiring.
2. `python scripts/backfill_sidecars.py --dry-run` to preview.
Every sidecar with source.tool_version<0.20.0 will show as
"would (re)write".
3. Run for real (drop --dry-run). Expect every pre-fix event
to regen. Big stores may take a while.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`read_blastware_file()` was still calling `_decode_samples_4ch_int16_le`
(the retracted int16-LE-interleaved hypothesis) on the body bytes,
producing ±32K noise on every channel of every BW file read from disk.
This was the path watcher-forwarded events take into the system
(via the import endpoint → save_imported_bw → read_blastware_file,
since the watcher doesn't ship A5 frames), so every .h5 sidecar
generated for a forwarded event has been wrong since the feature
shipped.
The fix is mechanical: pass the body bytes straight to
`waveform_codec.decode_waveform_v2()` and run the result through
`decoded_to_adc_counts()` for the 16x geo scaling. The body already
starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE]
[Tran[1] BE]` — confirmed by `body[:3].hex()` across all 9 fixture
events. No body-slice adjustment needed.
If the codec returns None (truncated/malformed file, synthetic test
input with no real waveform), fall back to empty channels with a log
warning. The rest of the event (timestamp, waveform_key, project
strings, sensor_location, peaks-from-samples=0) is still recoverable.
Verified against the bundled fixture corpus:
V70 Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth
within the 0.005 in/s display quantum, every row
6S0/RG0/AB0/470 (5-8-26) 3328/2304/1280/1280 samples; Vert PPVs
match BW's own report within 0.02 in/s
JQ0 3328 samples, Vert PPV 3.384 vs BW 3.465
SP0/SS0/SV0 (loud events) 3072–3328 samples; known walker
tail-truncation 1–7 samples per channel, samples reached are
byte-exact
Existing `test_read_blastware_file_round_trip` (synthetic empty event)
continues to pass thanks to the None-fallback. Codec verify scripts
(`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`)
re-run unchanged.
Added two regression-lock tests in tests/test_event_file_io.py:
- test_read_blastware_file_decodes_via_codec[6 fixtures]
— verifies sample count + Vert PPV per fixture
- test_read_blastware_file_v70_samples_match_txt_truth
— verifies every one of V70's 3328 sample-sets across Tran/Vert/Long
matches the .TXT ground truth row-by-row within 0.003 in/s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When NN exceeds 0xFC, the codec extends to 12-bit NN by using the
low nibble of the TYPE byte as the high nibble of NN:
1X NN → nibble-delta block, NN = (X << 8) | NN_byte
2X NN → int8-delta block, same NN encoding
Walker and decode_waveform_v2 now handle both narrow (X=0) and wide
(X != 0) forms uniformly.
Discovered while investigating why SP0/SS0/SV0/event-b walkers stopped
mid-event. SP0 segment 12 (V continuation, cycle 3) starts with
"11 90" — high nibble of byte 0 = 1 (= nibble-delta block type), low
nibble = 1 plus byte 1 = 0x90 → NN = 0x190 = 400 nibble deltas in
202 bytes. Walker was rejecting "11" as a non-tag.
Sample count went from 47,364 to 72,972 verified byte-exact:
event-a: 9984 (full) was 9984 (full)
event-b: 6912 (full) was 738
event-c: 3840 (full) was 3840 (full)
event-d: 3840 (full) was 3840 (full)
JQ0: 9984 (full) was 9984 (full)
V70: 9984 (full) was 9984 (full)
SP0: 9984 (full) was 5122
SS0: 9222 (-7 tail) was 1758
SV0: 9222 (-7 tail) was 2114
7 of 9 fixtures now decode end-to-end across all 3 geo channels.
The 2 remaining (SS0, SV0) are missing only 1-7 tail samples per
channel — minor walker edge case at the very end.
74 tests pass (was 71).
Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec. Three changes:
1. blastware_file.extract_body_bytes(a5_frames) — new helper that
factors out the body-reconstruction logic from write_blastware_file
so both writers (BW binary) and decoders (sample arrays) can use
the same canonical bytes.
2. waveform_codec.decode_a5_frames(a5_frames) — production entry point.
Returns the raw_samples dict consumers expect (Tran/Vert/Long as
int16 ADC counts; MicL as native ADC counts). Internally:
A5 frames → extract_body_bytes → decode_waveform_v2
→ decoded_to_adc_counts (geos ×16; mic pass-through)
3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's
display formula:
dB = sign(count) × (81.94 + 20 × log10(|count|)) for |count| ≥ 1
Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1).
client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples. Original implementation
preserved as _decode_a5_waveform_LEGACY (dead code; reference only).
Also fixed a tail-end bug in decode_waveform_v2 where trailer-section
"40 02" markers (containing ASCII serial bytes, NOT real segment headers)
were being mis-interpreted, producing 2 spurious samples per channel at
the end of each event. Added bytes [12:14] == "02 00" validation to
reject non-header markers.
7 new pytest tests cover the new helpers and dB conversion. Total:
71 passing (up from 64).
Known limitation (carried over from before): the walker still stops
mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some
mid-segment edge cases not yet characterized. Every sample reached
is decoded correctly; the walker just doesn't reach all of them.
Loud events still yield 5,000–15,000 byte-exact samples each.
User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC
range constraint led to the final piece.
30 NN block format (CONFIRMED across all 14 blocks in the fixture
bundle):
NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each.
Within each group:
bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first)
bytes [2:6] = 4 × int8 low bytes
delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k])
Block length = NN × 1.5 + 2 bytes (tag included). Earlier walker
used NN × 4 which is only correct in the TRAILER section.
Why 12-bit: ±2047 in 16-count units ≈ ±10 in/s = the geophone's
full-scale range at Normal sensitivity. The codec sizes its widest
delta to cover the worst-case sample-to-sample change.
Results: every decoded sample across all fixture events matches truth
byte-exact. ZERO divergences.
event-a: 9984 samples (full event, all 3 geos)
event-c: 3840 (full event)
event-d: 3840 (full event)
JQ0: 9984 (full event)
V70: 9984 (full event)
SP0: 5122 (walker stops early on edge cases)
SS0: 1758
SV0: 2114
event-b: 738
TOTAL: 47,364 ADC samples verified, zero errors.
Three full 3-sec events decode end-to-end across all three geo
channels. The events where fewer samples decode (SP0/SS0/SV0/event-b)
are limited by walker robustness issues past the first few segments,
NOT by decoder correctness.
64 tests pass (up from 55). Files: minimateplus/waveform_codec.py
(new 30 NN decode + corrected walker length), tests/test_waveform_codec.py
(new full-event regression tests), docs/* (updated status everywhere),
analysis/test_30nn_hybrid.py (new — the analysis script that confirmed
the format).
Tested the 12-bit signed packed delta hypothesis (motivated by the
observation that ±2047 in 16-count units ≈ ±32K raw ADC counts, almost
exactly the int16 ADC range — a strong design hint).
Result: mixed. For SP0 block @1689 (V seg 4, samples 650..653):
truth deltas: 47, 297, 384, 61 (sum = 789)
12-bit BE contiguous pred: 17, 47, 664, 61 (sum = 789)
Positions 1 and 3 of the pred match truth values at positions 0 and 3
exactly, AND the total sum across all 4 positions matches. But
positions 0 and 2 of pred don't match any truth value.
Hypothesis space narrows to:
- 12-bit deltas WITH a specific re-ordering or interleaving
- 12-bit deltas with one of the positions being a "step size" or
"checksum-like" repacked value
- A nonlinear / coded format where the underlying total displacement
is preserved but per-sample distribution is encoded differently
Two analysis scripts committed (test_30nn_12bit.py, test_30nn_v2.py).
The v2 script uses a real-decoder simulation to get the exact channel
+ sample-index for each 30 NN block, eliminating off-by-one errors in
the truth lookup.
User asked the right question: do events without 30 NN blocks decode
fully? Answer: YES.
event-a: Tran 3328 ✓ Vert 3328 ✓ Long 3328 ✓ (28 segments, 0 '30 NN')
event-c: Tran 1280 ✓ Vert 1280 ✓ Long 1280 ✓ (12 segments, 0 '30 NN')
event-d: Tran 1280 ✓ Vert 1280 ✓ Long 1280 ✓ (12 segments, 0 '30 NN')
17,664 ADC samples decoded byte-exact against BW's ASCII export.
Zero divergences across event-a, event-c, event-d.
This means the codec is FULLY SOLVED for any event without 30 NN
blocks. The remaining gap is the 30 NN block format only — used for
high-amplitude regions where deltas exceed int8 range. For quiet
events (or quiet stretches of loud events), the decoder is complete.
9 new regression tests bring the total to 55, all passing.
Files: tests/test_waveform_codec.py + docs/waveform_codec_re_status.md
+ new analysis/verify_quiet_bundle.py.
The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:
SP0 seg 0: best fit Vert 508/508 ✓
SP0 seg 1: best fit Long 508/508 ✓
SP0 seg 3: best fit Tran 508/508 ✓ (Tran continuation)
SP0 seg 5: best fit Long 508/508 ✓
SP0 seg 9: best fit Long 508/508 ✓
V70 seg 0: best fit Vert 508/508 ✓
V70 seg 1: best fit Long 508/508 ✓
Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.
Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor
the NEW segment's channel (2 samples as int16 BE in 16-count units), AND
bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as
int16 BE). This is the same "2 anchors + delta stream" structure as the
body preamble for Tran.
decode_waveform_v2 now returns full per-channel sample dicts.
Byte-exact verified ranges:
V70: Tran 512, Vert 512, Long 512 (all first segments)
JQ0: Tran 512, Vert 258
SP0: Long 1536 (all 3 L segments)
Still open: the 30 NN block format (high-amplitude packed deltas) —
appears mid-segment when single-byte deltas can't carry the magnitude.
6 new tests bring the count to 46. All passing.
Three things to make pickup smoother:
1. analysis/README.md (NEW): catalogues the ~25 scratch scripts.
Categorizes them as "still useful" / "superseded — keep for
archaeology" / "pure exploration". Tells a fresh engineer which
files to read first and which to ignore.
2. scratch/next_experiment_skeleton.py (NEW): stub + spec for the
segment-channel scoring analyzer. Includes the fixture loader,
block walker, and decode-segment-as-channel helper — just enough
scaffolding that the next pass starts from "fill in
score_segment_against_all_channels()" rather than from scratch.
Already runs and confirms 13 segments per 3-sec event with sample
starts going to 6590 (way past the 3328 actual samples) — strong
evidence that not all segments carry Tran.
3. Removed decode-re/ duplicate. It was a mirror of tests/fixtures/.
Analysis scripts that hardcoded decode-re/ paths updated to point
at tests/fixtures/. CLAUDE.md note updated: future event uploads
go directly into a dated subdirectory under tests/fixtures/.
All 40 tests still pass. Skeleton runs.
Three "truth layers" had drifted apart between commits. Fixed:
1. waveform_codec.py docstring rewritten from the 2026-05-08
"structural framing only" state to the 2026-05-11 "Tran segment 0
solved + segment-header partially decoded" state. Killed stale
"~80 sample-sets per segment" language (real segments are
flash-page-byte-sized, not sample-count-sized; observed first-segment
sizes are 42-510 samples depending on signal). Killed stale
"preamble is 7 or 9 bytes" language (always 7).
2. docs/instantel_protocol_reference.md §7.6.1: added a clear
"CURRENT STATUS" box at the top with a status table. Replaced the
stale "~80 sample-sets" line with the verified per-event segment
sizes. Merged two redundant segment-header field-table sections.
3. docs/waveform_codec_re_status.md (NEW): clean working-status doc.
Solved / not solved / hypothesis / next experiment / fixtures /
tests. The protocol reference remains the historical Rosetta
Stone; this new file is the current-truth working note that
shouldn't accumulate fossil layers.
4. CLAUDE.md §"Waveform body codec": prominent warning box at top —
"DO NOT TRUST decoded sample arrays yet." BW binary passthrough
is the only sample-bearing output to trust until the decoder
lands. Added a "Next experiment" subsection pointing the next
pass at the segment-channel scoring analyzer.
40 tests still pass.
Mirrors the structural findings now documented in
docs/instantel_protocol_reference.md §7.6.1: block framing solved,
Tran segment-0 decode verified across 5 fixture events, multi-segment
continuation still open. Also adds waveform_codec.py to the project
layout map.
Investigated multi-segment Tran continuation but couldn't crack it.
Each hypothesis tried (segment header consumes 0/1/2 T deltas, blocks
continue Tran with various interpretations) breaks at sample ~512.
Block budget for V70 segment 1: 264 nibbles + 244 RLE zeros = 508
deltas — exactly the segment size. So the block structure CAN encode
508 single-channel samples, but applying segment 1 blocks as Tran
gives wrong values.
Most likely the channel ordering changes in segment 1+ (e.g., segment
0 = Tran, segment 1 = Vert, segment 2 = Long, etc.) but I couldn't
verify cleanly. Stopping here — segment-0 Tran decode is solid and
multi-segment work needs more fresh thinking.
User uploaded a Vert-heavy event (JQ0) and a Mic-heavy event (V70).
Those two were exactly what was needed to crack the next piece:
- 00 NN block = run-length-encoded zero deltas in the current channel.
Append NN copies of the current cumulative value (no change).
- find_data_start now recognizes 00 NN as a valid first tag (some events
begin with a leading 00 NN RLE block).
- decode_tran_initial now decodes the FULL segment 0 (not just the first
data block).
Results across 5 fixture events:
- M529LL1A.SP0 (loud-all-channels) : 510 / 510 ✓
- M529LL1L.JQ0 (Vert-heavy) : 510 / 510 ✓
- M529LL1L.V70 (Mic-heavy) : 510 / 510 ✓
- M529LL1A.SV0 (loud-from-start) : 58 / 58 ✓
- M529LL1A.SS0 (loud-from-start) : 42 / 502 (stops at first 30 04)
The 30 04 block (only seen in loud-from-start events) hasn't been
decoded yet — likely a channel-switch marker for the high-amplitude
regime.
Also discovered: segment header (40 02) payload bytes [0:2] = T_delta
at first sample of new segment, [6:8] = byte length to next segment.
Multi-segment Tran decoding still diverges after sample 512 because
the per-segment channel ordering after the header is unknown.
Tests: 40 pass (up from 36).
Files:
- minimateplus/waveform_codec.py: find_data_start fix, RLE handling,
full segment-0 decode in decode_tran_initial
- tests/test_waveform_codec.py: synthetic RLE test, full segment 0
tests for JQ0 and V70
- tests/fixtures/5-11-26/: M529LL1L.JQ0, M529LL1L.V70 + TXT exports
- docs/instantel_protocol_reference.md §7.6.1: RLE + segment-header docs
User uploaded 3 high-amplitude events (PPV 6-7 in/s — shook the geophone
hard) to decode-re/5-11-26/. These cracked the Tran codec:
- Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE
in 16-count units (LSB = 0.005 in/s). Confirmed across all 7
fixtures.
- First data block carries Tran deltas from sample 2 onward:
* 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed
nibble deltas (high nibble first)
* 20 NN block: NN int8 signed deltas
Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors
against BW's ASCII export.
Why the earlier 96-combination brute force failed: the quiet 5-8
events all had T[0] = T[1] ≈ 0 so the preamble's per-channel encoding
was undetectable. Loud events made the encoding obvious.
What's solved:
- minimateplus.waveform_codec.decode_tran_initial: returns first
N Tran samples in 16-count units for any body.
- Walker length formula for in-data 30 NN blocks (NN*2 instead of NN*4).
- Walker now handles bodies that start with 20 NN (in addition to 10 NN).
What's still open:
- Tran past the first data block (multi-block channel switching).
- Vert / Long / MicL channel encodings.
- Walker correctness past offset ~427 in event-b.
Tests: 36 pass. decode_waveform_v2 still returns None — the full
multi-channel decoder is not wired up. decode_tran_initial is the
new verified entry point.
Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py
(adds 5-11-26 fixtures + decode_tran_initial tests), and
docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).
Decoded the structural framing of the Blastware waveform body — the bytes
between the 21-byte STRT record and the 26-byte file footer. The body is
a sequence of tagged variable-length blocks, NOT raw int16 LE. Five tag
types (10/20/00/30/40 NN) and their lengths are now confirmed against the
4-event May 2026 fixture bundle. Body splits cleanly into ~16 segments
(for a 1280-sample event) separated by 40 02 segment headers carrying a
monotonically incrementing uint32 LE counter at bytes [8:12].
What's done:
- minimateplus/waveform_codec.py — block walker, segment splitter, segment
header parser. decode_waveform_v2 is a stub returning None until the
byte-to-sample mapping is solved; client.py is unchanged.
- tests/test_waveform_codec.py — 31 tests covering block detection, lengths,
contiguous-walk, segment splitting, segment-header parsing, and counter
monotonicity. All pass.
- tests/fixtures/decode-re-5-8-26/ — bundled fixtures (4 events, BW binary
+ Blastware ASCII export each).
- docs/instantel_protocol_reference.md §7.6.1 — replaced retraction box
with the verified structural decoding plus an explicit list of what's
still open.
What's still open: the per-byte mapping inside 10 NN / 20 NN blocks. 96
channel-permutation × nibble-order × sign-convention combinations were
brute-force tested; none match BW's ASCII export to within ±1 ADC count.
The codec is more elaborate than uniform 4-bit deltas — likely a hybrid
variable-bit-width scheme with segment-anchor resync points. Next
recommended step: capture an event with a known calibration tone to pin
down magnitude scaling.
Walker also bails out partway through event-b (open issue documented in
both the module and the protocol reference).
Three-pass audit of docs/instantel_protocol_reference.md against
CLAUDE.md and the minimateplus/ implementation. Closes long-standing
discrepancies that had accumulated as the protocol understanding
evolved month over month.
Major corrections:
- §2/§3: S3 frames terminate on bare ETX, not DLE+ETX; payload
byte[1] is flags / byte[2] is SUB (was wrongly DLE/ADDR).
- §4.2: probe responses do not carry data length; DATA_LENGTH
is a per-SUB hardcoded constant.
- §5.1: dropped stale duplicate "SUB 1C = TRIGGER CONFIG READ"
row; SUB 0A lengths corrected from 0x30/0x26 to 0x46/0x2C.
- §5.3: added the missing write-frame mechanics (BW_CMD-only
doubling, DLE-aware checksum, offset = data[1]+2, ack format,
SUB 71 chunk parameters).
- §7.6.x: switched compliance-anchor convention from the unstable
10-byte form to the canonical 6-byte `\xbe\x80\x00\x00\x00\x00`;
recording_mode confirmed at anchor−8 in both read and write
(the prior anchor−3/−4 split caused anchor drift on write).
Sample_rate at anchor−6, histogram_interval at anchor−4 (now ✅),
record_time at anchor+6. Geo_range row added at channel_label+33.
- §7.5b/§8: added the 10-byte sub_code=0x03 continuous-mode
timestamp variant; peak vector sum location corrected from
fixed offset 87 to label-relative tran_pos−12.
- §7.7.2: SUB 1E/1F token byte at params[7], not params[6].
- §7.7.3: SUB 0A length disambiguation rewritten.
- §7.8.4/§7.8.7: fi==9 skip marked FIXED; metadata-page TODO
replaced with current decoder state.
- §11: POLL example wire bytes corrected; SUB 5A row added to
checksum table.
- §13/§14: device-under-test updated to BE11529/S338.17; TCP
Idle Timeout consistency fix (0→2 min); Data Forwarding
Timeout units clarified.
- §15 (renumbered from second §14): open-question entries
already resolved in CLAUDE.md closed out.
- Appendix D: extension taxonomy rewritten — extensions encode
a timestamp (AB0T scheme), not recording mode.
Navigation note added to §7 acknowledging the organic-growth
duplicate section numbers (§7.5/§7.5b, §7.6, §7.7, §7.8, §7.9)
and pointing readers to the canonical sections for each topic.
https://claude.ai/code/session_019tWZybD94YUsBaEGhnM5A2
## v0.19.0 — 2026-05-20
The "device-family separation" release. Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics.
### Added — Phase 1: `device_family` column on `events`
- **`events.device_family TEXT`** — new column carrying `"series3"` or `"series4"`. Populated by every import path (`/db/import/blastware_file`, `/db/import/idf_file`, ACH server, BW CLI, sidecar backfill script). Returned through `/db/events` since `query_events` uses `SELECT *`.
- **Self-applying migration** — on startup, `ALTER TABLE ... ADD COLUMN` lands the new column; a follow-on `UPDATE` backfills existing rows from the binary filename extension (`.IDFH`/`.IDFW` → `series4`, everything else → `series3`). No manual SQL needed.
- **UPSERT preserves family** — re-imports without an explicit family don't blank existing rows (`COALESCE(?, device_family)`).
- **UI dispatches on the column** — `sfm_webapp.html` events-table mic formatter now branches on `ev.device_family === 'series4'` (Thor stores native dB(L); BW stores psi). Modal uses `source.kind === 'idf-import'` from the sidecar (sidecars don't carry the DB column). Source-files section labels changed from "BW filename / BW filesize / BW sha256" to format-neutral "Event file / File size / File sha256".
### Added — Phase 2: `micromate/` package alongside `minimateplus/`
- **`micromate/`** — new sibling package for the Thor / Micromate Series IV device. Currently scoped to offline-file ingest; live-device support (TCP transport, framing, protocol, client) will land here when reverse-engineering happens.
- `micromate/idf_ascii_report.py` — moved from `sfm/idf_ascii_report.py`. No behaviour change.
- `micromate/models.py` — typed `IdfReport`, `IdfEvent`, `IdfPeaks`, `IdfProjectInfo`, `IdfSensorCheck`. Stores mic in native `mic_pspl_dbl` (dB(L)) instead of the pseudo-psi shoehorn that the BW-shaped model uses. `IdfEvent.from_report()` constructs from a parsed dict + filename; `IdfEvent.to_minimateplus_event(waveform_key)` bridges to the existing sidecar / DB-insert machinery.
- `micromate/idf_file.py` — placeholder for the binary codec (`.IDFH` / `.IDFW`). Stubbed `read_idf_file()` raises `NotImplementedError`; documents the planned reverse-engineering path.
- **`WaveformStore.save_imported_idf`** refactored to use the native `IdfEvent` and bridge at the SQL-insert boundary. Cleaner separation of "parse a Thor event" (in `micromate/`) from "store it on disk + write a sidecar" (in `sfm/waveform_store.py`).
- **Tests** — `tests/test_idf_ascii_report.py` imports updated to `micromate.idf_ascii_report`. All 1,014 example-data sidecars round-trip through `IdfEvent.from_report()` without errors.
### Companion releases
- **thor-watcher** unaffected — it talks to the relay over HTTP only. No version bump needed.
- **terra-view** unaffected today; can use `device_family` in its event-detail rendering when convenient.
---
## v0.18.0 — 2026-05-19
The "Thor / Series IV ingest adapter" release. Seismo-relay can now accept event files from Instantel Micromate Series IV (Thor) units alongside the existing MiniMate Plus (Series III) Blastware pipeline.
### Added — Thor (Series IV) IDF ingest
- **`POST /db/import/idf_file`** (`sfm/server.py`) — multipart upload endpoint for `.IDFH` (histogram) and `.IDFW` (waveform) event files plus their `.IDFH.txt` / `.IDFW.txt` ASCII sidecars. Mirrors the shape of `/db/import/blastware_file`: pairing by filename, optional `serial` query hint, per-file outcome reporting.
- **`sfm/idf_ascii_report.py`** — parser for Thor's TXT sidecars (verified against 1,014 real-world samples). Extracts device-authoritative PPV, ZC Freq, Peak Vector Sum, Mic PSPL, calibration date, firmware version, sensor self-check results, and project/client/operator strings.
- **`WaveformStore.save_imported_idf()`** (`sfm/waveform_store.py`) — stores Thor binaries verbatim in `<root>/<serial>/<filename>`, writes a `.sfm.json` sidecar with `source.kind = "idf-import"` and the full parsed report under `extensions.idf_report`. Reuses the existing `events` table — Thor events dedupe on (serial, timestamp) and surface in `/db/events` alongside BW events.
- **`tests/test_idf_ascii_report.py`** — parser tests against the `thor-watcher/example-data/` corpus.
### Changed
- `event_to_sidecar_dict()` (`minimateplus/event_file_io.py`) allow-list for `source_kind` now includes `"idf-import"` so the existing sidecar machinery can carry Thor imports.
- Bumped `pyproject.toml` version to `0.18.0`.
### Companion release
This release ships alongside **thor-watcher v0.3.0**, which adds the SFM forwarder that targets the new `/db/import/idf_file` endpoint. Operators flip the switch in thor-watcher's new "SFM Forward" Settings tab; events POST to seismo-relay just like the series3-watcher BW forwarder does today.
Tighten the Series III / Series IV boundary so UI and storage dispatch
on a clean signal instead of sniffing filenames or applying magnitude
heuristics.
Phase 1 — events.device_family column ("series3" | "series4"):
self-applying migration with filename-based backfill of existing rows
(1,132 backfilled on prod 2026-05-20); plumbed through every import
path (BW endpoint, IDF endpoint, ACH server, BW CLI, sidecar
backfill); UPSERT preserves via COALESCE; UI dispatches on it.
Phase 2 — extract micromate/ package alongside minimateplus/:
native IdfEvent / IdfReport / IdfPeaks / IdfProjectInfo /
IdfSensorCheck (mic in dB(L), not pseudo-psi); moved
idf_ascii_report.py from sfm/ to micromate/; refactored
save_imported_idf to use IdfEvent and bridge to minimateplus.Event at
the SQL-insert boundary; idf_file.py stub for the future binary codec.
Phase 3 prep — docs/idf_protocol_reference.md captures the two
observed Thor binary header signatures (1,012 newer-firmware files vs
2 old files whose layout is byte-for-byte BW-STRT-compatible), file-size
hints suggesting int8 sample encoding, open questions in dependency
order, and a concrete first-session plan for cracking the codec.
Also rolled in the v0.18.1 hotfixes that motivated this work:
- idf_ascii_report parser now handles "<0.005 in/s" (below-threshold)
and "N/A" markers without leaving raw strings in numeric DB columns.
- sfm_webapp.html: defensive _ppvFmt / mic formatter so future
data-shape drift can't kill the whole events table render.
All 1,014 example-data sidecars round-trip through the new package.
See CHANGELOG.md for full notes.
Reviewed-on: #21
## v0.17.0 — 2026-05-17
The "field rescue + DB management" release. Hardened against units that are stuck in a runaway call-home loop, and added an operator-facing path for purging bogus events that those same units dump into the DB before recovery. All work in this release was driven by the BE9558H incident (full incident log + recovery procedure at `docs/runbooks/wedged_unit_recovery.md`).
### Added — wedged-unit recovery toolkit
A toolkit for breaking the call-home loop on a misbehaving unit whose firmware is too busy to keep up with normal request/response handshakes. Tested in production against BE9558H (16 May 2026) — a unit with a stuck-triggered Long-axis geophone that had been call-homing the office BW ACH server every 30 seconds for hours. Endpoints layered from "single attempt" to "siege mode" to suit different contention levels:
- **`GET /device/events/storage_range`** — SUB 0x06 probe. POLL + one read; ~2s. Returns first/last event keys and an `is_empty` flag. Use to triage whether a unit has stored events without invoking the slow `count_events()` 1E/1F chain (which choked on BE9558H's corrupted event chain).
- **`GET /device/events/index`** — SUB 0x08 probe. POLL + one read; ~2s. Returns the lifetime event counter (does NOT decrement on erase — use `storage_range` for "right now" state).
- **`POST /device/events/erase`** — full erase sequence `0xA3 → 0x1C → 0x06 → 0xA2` (confirmed 2026-04-11, see the protocol reference). Resets event keys to `0x01110000`. Caller's responsibility to disable ACH first if the underlying trigger condition will re-fill the buffer.
- **`POST /device/rescue`** — one TCP session, short connect+recv timeouts: POLL → disable ACH (compliance config write) → erase events → close. Designed for race-loop usage when the device is busy in another session. 503 on connect-refused, 502 on protocol failure, 200 on full sequence success.
- **`POST /device/stop_monitoring_blind`** — fire-and-forget Stop Monitoring (SUB 0x97), TCP-only. Dumps `SESSION_RESET + POLL_PROBE + SESSION_RESET + POLL_DATA + 0x97 × repeat` and closes without reading any S3 response. The full POLL preamble is required — write commands without it are silently ignored by the device's protocol parser (false-positive surface area that bit the first version of this endpoint). Use when the device's firmware can't keep up with full request/response but might process inbound bytes at its own pace.
- **`POST /device/stop_monitoring_spam`** — server-side hammer loop, duration-bounded. Open TCP → write the same blind payload → close → repeat as fast as possible until `duration_s` elapses. Configurable `connect_timeout` (default 500ms) and `repeat` (frames per session). Reports `sent_ok`, `connect_failed`, `write_failed`, `rate_attempts_per_s`. Clamped to 5min duration.
- **`POST /device/stop_monitoring_slow_drip`** — opposite of spam. Open ONE TCP session, drip the wake handshake + stop frames at `interval_s` (default 3s) for `duration_s` (default 120s, max 10min). Each drip is ~23 bytes — well under any UART FIFO size. Opportunistically drains any inbound bytes the device sends back; `bytes_received > 0` in the response strongly suggests the device has started talking and the session is healthy. **This is the endpoint that saved BE9558H.** Spam mode had been overrunning the device's UART FIFO; slow drip stayed under it.
- **Six rescue scripts** under `scripts/` — thin bash wrappers around the endpoints, default `SFM_BASE_URL=http://localhost:8200` (direct, not via Terra-View proxy whose 60s timeout would cut off the longer endpoints):
- `rescue_device.sh` — race-loop wrapper for `/device/rescue`
- `blind_stop.sh` — race-loop wrapper for `/device/stop_monitoring_blind`
- `spam_stop.sh` — single-call burst hammer
- `slow_drip.sh` — single-call held-session drip
- `watch_unit.sh` — passive periodic reachability check (every N min, logs to file), useful for unattended overnight monitoring of a wedged unit
- **`docs/runbooks/wedged_unit_recovery.md`** — symptoms, quick-reference recovery procedure, the modem-layer mechanism (Sierra Wireless serial-port mode-flipping is the real failure mode — not the device firmware), and a table of "why simpler approaches don't work" so the next incident skips the dead ends.
### Added — operator event DB management
Endpoints powering Terra-View's new `/admin/events` page (v0.12.0). Designed for purging bogus events from a unit that's been forwarding them in bulk (e.g. a stuck-triggered seismograph dumping hundreds of junk events before it's recovered).
- **`DELETE /db/events/{event_id}`** — hard-delete one event row. Also unlinks the associated blastware binary (`.AB0*`), `.a5.pkl`, `.sfm.json` sidecar, and `.h5` clean-waveform files via the WaveformStore. Returns the per-file removal status. 404 if the event doesn't exist.
- **`POST /db/events/delete_bulk`** — filter-based or id-list-based bulk delete with safety rails:
- Filters (`serial`, `from_dt`, `to_dt`, `false_trigger`) combine with AND; same semantics as `GET /db/events`. `ids` is an additional inclusion list. Refuses to run with no filters (would wipe the whole table — raises 422).
- `confirm` must be `true` to actually delete. Otherwise returns a dry-run summary (`status: "dry_run"`, `matched: N`, `sample_serials: [...]`).
- `max_rows` (default 10,000) caps how many rows can be deleted by-filter in one call. If exceeded, returns `status: "too_many"` with a hint to narrow or raise the cap. Bypassed when only `ids` is supplied.
- **`_cleanup_event_files(row)`** helper in `sfm/server.py` — best-effort `unlink()` of all four sidecar paths derived from the row's `blastware_filename`. Logged at WARN if a path exists but unlink fails; the DB row deletion still proceeds.
- **`SeismoDb.delete_event(id)` and `SeismoDb.delete_events_bulk(...)`** in `sfm/database.py` — both return the deleted row dict(s) so callers can do file cleanup. `delete_events_bulk` raises `ValueError` if no filters are supplied.
### Changed
- **Default protocol recv timeout dropped from 30s → 10s** in `_build_client()`. The unit usually responds in well under a second over cellular; 10s leaves comfortable headroom for retransmits while failing reasonably fast when a unit is wedged. The two endpoints that perform full 5A waveform downloads still pass `timeout=120.0` explicitly so multi-minute event transfers are unaffected.
- **`_build_client()` now accepts an optional `connect_timeout`** (TCP-only) so rescue / race-loop endpoints can fail fast on busy modems without affecting the protocol-level recv timeout.
### Fixed
- **`GET /device/monitor/status` returned HTTP 500 + uncaught traceback when the device was unresponsive**. The retry-on-`Exception` inner block let the second `client.poll()`'s `ProtocolError` propagate out of the handler. Now wrapped in proper try/except — returns 502 with `{"detail": "Protocol error: No S3 frame received within 10.0s ..."}` on timeout, 502 on connection errors, 500 only for genuinely unexpected exceptions.
### Migration
No schema changes. No data migration required.
If you've been running a previous version against a wedged unit and accumulated bogus events, the new `/admin/events` page in Terra-View v0.12.0 (or direct `POST /db/events/delete_bulk` with `confirm: true`) is the cleanup tool. Watcher state on the upstream DL2 PC does NOT need separate cleaning — the watcher's `sfm_forwarded.json` keys on file sha256 and won't re-forward the same files.
### Pairing
This release pairs with **Terra-View v0.12.0**, which adds the `/admin/events` UI that consumes the new bulk-delete endpoints, the bulk false-trigger flagging on `/unit/{id}`, and the field-deployment workflow that uses the same `series3-watcher` → SFM ingest path as before.
---
## v0.16.1 — 2026-05-14
### Fixed
- **`record_type` always "Waveform" for forwarded events.** `read_blastware_file()` hardcoded `ev.record_type = "Waveform"` regardless of the file's actual type. The watcher-forward pipeline (the main BW ACH ingest path) compounds this by parsing files from a tmp path with a `.bw` suffix, so even a filename-based fallback inside the parser still wouldn't see the original extension. Now:
1. New `derive_record_type_from_filename(filename)` helper in `minimateplus/event_file_io.py` derives the type from the LAST character of the filename's extension (V10.72+ AB0T scheme: `H`=Histogram, `W`=Waveform, `M`=Manual, `E`=Event, `C`=Combo). Falls back to `"Waveform"` for old S338 firmware (3-char extensions ending in `0`) and any unrecognized suffix.
2. `read_blastware_file()` now calls the helper with its `path.name` so direct callers (the `--dry-run` path in `scripts/import_bw.py`, tests, ad-hoc scripts) get the right value automatically.
3. `WaveformStore.save_imported_bw()` overrides `ev.record_type` with the **original** filename's derived type after parsing (the tmp file inside the parser doesn't carry the original extension). This is the path the live watcher-forwarder hits, so the DB column now reflects the actual event type going forward.
Events ingested before this fix are stuck with `record_type="Waveform"` in the DB; a one-off backfill (`UPDATE events SET record_type = ... WHERE blastware_filename LIKE '%H'`) would fix them retroactively if desired. Terra-view's event modal also derives client-side from the filename, so the UI already shows the correct type for old events even without the backfill.
---
- Created a comprehensive runbook (`wedged_unit_recovery.md`) detailing the recovery process for units stuck in a call-home loop, including symptoms, recovery steps, and explanations of the failure mode.
- Added `blind_stop.sh` script to send stop-monitoring commands in a tight loop for unresponsive devices.
- Introduced `rescue_device.sh` script to disable Auto Call Home and erase events from a busy device.
- Implemented `slow_drip.sh` script to send stop-monitoring frames at a slow rate to prevent UART overrun.
- Developed `spam_stop.sh` script to rapidly send stop-monitoring commands to a device.
- Created `watch_unit.sh` script for passive monitoring of device reachability, logging results over time.