Previously the .TXT was parsed into the sidecar's bw_report projection
and then discarded at ingest time. Now save_imported_bw() writes it
to <store>/<serial>/<filename>_ASCII.TXT permanently.
Rationale: with BW Mail / Forwarding Agent being phased out of the
operator workflow, the XML/PDF/WMF those tools produce won't be
available — the binary + .TXT (created by BW ACH itself) are our
only authoritative inputs going forward. Keeping the raw .TXT
unlocks:
- Parser bug fixes can be applied RETROACTIVELY by re-parsing the
stored .TXT, instead of requiring a re-forward from the watcher
PC (which lost the .TXT after BW ACH cleanup).
- Audit trail of what BW actually sent us, for debugging.
- The five known parser-PPV-miss events will be re-parseable once
the regex fix lands (instead of staying broken indefinitely).
Storage cost: ~15 KB per event × 14k events = ~210 MB on the
existing prod corpus. Negligible.
Implementation:
- WaveformStore gains txt_path_for() + open_txt()
- save_imported_bw() writes the .TXT when bw_report_text is supplied
- sidecar source block records the txt_filename
- backfill_sidecars.py preserves txt_filename across regens
- New GET /db/events/{id}/ascii_report.txt endpoint serves it
- Returns 404 for events ingested before this change (no .TXT in
the store yet) — re-forward to populate
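The path convention above can be sketched as follows. This is a minimal illustration, not the actual WaveformStore method: the free function `txt_path_for` and its three-argument signature are assumptions, and the example store root is made up.

```python
from pathlib import Path


def txt_path_for(store_root: str, serial: str, filename: str) -> Path:
    """Hypothetical stand-in for WaveformStore.txt_path_for().

    Builds <store>/<serial>/<filename>_ASCII.TXT as described above.
    """
    return Path(store_root) / serial / f"{filename}_ASCII.TXT"


# Example with a made-up store root and a filename seen elsewhere in this log.
print(txt_path_for("/data/store", "BE17353", "S353L5KC.DR0H").as_posix())
```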
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror what the ingest path does: BW's reported peaks (and sample_rate
/ record_time) take precedence over codec output where present.
Without this, --force backfill silently overwrites bw_report-overlaid
DB columns with codec-derived peaks. Wrong for events where the codec
doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style
events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing
PVS=0 on real high-amplitude events. This bit prod on 2026-05-22, when
three top-10 waveform events ended up at PVS=0 (rolled back the same
day; this fix is the proper resolution).
New helper minimateplus.event_file_io.apply_bw_report_dict_to_event
operates on the projected sidecar dict shape (the structure
_bw_report_to_dict produces, which is what gets preserved in the
sidecar). Mirrors apply_report_to_event's semantics: only writes
fields where bw_report has a non-None value, no-ops cleanly on
empty / None input.
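A minimal sketch of the overlay semantics described above, assuming the event and the projected bw_report are both plain dicts. The function name `apply_report_dict` and the field names (`pvs`, `sample_rate`) are illustrative, not the real helper's signature or schema.

```python
from typing import Any, Optional


def apply_report_dict(event: dict[str, Any],
                      bw_report: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Overlay non-None bw_report values onto event; no-op on empty/None input."""
    if not bw_report:
        return event
    for key, value in bw_report.items():
        if value is not None:
            # BW's reported value takes precedence over the codec-derived one.
            event[key] = value
    return event


event = {"pvs": 0.0, "sample_rate": 1024}
apply_report_dict(event, {"pvs": 1.25, "sample_rate": None})
print(event)
```

Fields the report leaves as None keep their codec-derived value, which is what makes the overlay safe to run repeatedly.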
Dev validation against prod snapshot:
pre  : 1839.7315 pvs_sum   356 events with DB PVS ≠ sidecar bw_report
post : 2016.4902 pvs_sum     2 events still mismatched (both have NULL
                             timestamp + duplicate rows, edge case)
Both edge-case events DO get the correct value written by the new
backfill — their stale rows from prior backfills remain because
UNIQUE(serial, timestamp) doesn't fire on NULL. Separate dedup
cleanup needed for those 2 events (0.014% of corpus); not blocking.
Backfill remains idempotent + bw_report preservation still passes
(0 WIPED, 0 CHANGED on the 3rd consecutive run).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-step tool to verify that backfill_sidecars doesn't wipe the
bw_report block from existing sidecars. Workflow:
1. snapshot --out before.json (canonical-JSON hash per sidecar)
2. run backfill
3. diff --baseline before.json (classifies every sidecar:
PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED)
Exit code 1 if any WIPED or CHANGED entries found, 0 otherwise — so
it can gate a CI step or a deploy script.
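The snapshot/diff idea can be sketched as below. This is an assumption-laden illustration, not the tool's code: the use of SHA-256, the function names, and the exact meaning of each status are guesses, and the ADDED / REMOVED statuses are left out because their definitions are not given here.

```python
import hashlib
import json
from typing import Any, Optional


def canonical_hash(block: Optional[dict[str, Any]]) -> Optional[str]:
    """Hash a bw_report block via canonical JSON; None means the block is absent."""
    if block is None:
        return None
    payload = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(before: Optional[str], after: Optional[str]) -> str:
    """Assumed meanings for five of the statuses, keyed on before/after hashes."""
    if before is None:
        return "STILL_MISSING" if after is None else "NEW"
    if after is None:
        return "WIPED"
    return "PRESERVED" if before == after else "CHANGED"


def exit_code(statuses: list[str]) -> int:
    """1 if any sidecar was WIPED or CHANGED, 0 otherwise."""
    return 1 if any(s in ("WIPED", "CHANGED") for s in statuses) else 0
```

Sorting keys before hashing means a sidecar that was merely re-serialized with a different key order still counts as PRESERVED.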
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling
let extension-byte blocks slip through at up to 20.48 in/s per
channel, producing 35× inflated PVS sums when first deployed to
prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts)
plus 10% headroom for quantization edge cases.
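The arithmetic behind the two ceilings, as a small check. The constant names other than `_MAX_PEAK_COUNT` and the `is_plausible_peak` helper are hypothetical; the numbers come from the paragraph above.

```python
FULL_SCALE_COUNTS = 2000      # Normal-range full scale, in counts
FULL_SCALE_INS = 10.0         # ... which corresponds to 10 in/s
INS_PER_COUNT = FULL_SCALE_INS / FULL_SCALE_COUNTS   # 0.005 in/s per count

OLD_MAX_PEAK_COUNT = 4096
NEW_MAX_PEAK_COUNT = FULL_SCALE_COUNTS + FULL_SCALE_COUNTS // 10  # +10% headroom


def is_plausible_peak(count: int, ceiling: int = NEW_MAX_PEAK_COUNT) -> bool:
    """Hypothetical guard: reject peak counts above the ceiling."""
    return 0 <= count <= ceiling


print(OLD_MAX_PEAK_COUNT * INS_PER_COUNT)   # velocity the old ceiling allowed
print(NEW_MAX_PEAK_COUNT)
```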
backfill_sidecars: also preserve the bw_report block alongside
review + extensions when regenerating sidecars. event_to_sidecar_dict
takes a BwAsciiReport dataclass not a dict, so for bw_report we
overlay the existing block after regen rather than passing as a kwarg.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while dry-running the backfill on prod: the waveform store
contains both BW (.AB0*/.N00) and Thor IDF (.IDFW/.IDFH) event files
side-by-side because both go through the same per-serial directory
layout. The script's `_looks_like_event_file` heuristic accepted any
3-4 char extension ending in W or H, which matched both BW and IDF.
The script then routes everything through
`event_file_io.read_blastware_file`, which rejects IDF files with
"not a Blastware file (bad header prefix)" — 3807 errors on prod
out of 7201 total events.
Thor IDF events have their own ingest path
(`WaveformStore.save_imported_idf`) and their sidecars are populated
at ingest from the paired `.IDFW.txt` ASCII report. The backfill
script has no value to add for them — there's no decoder to refresh,
and the sidecar metadata is already correct. Filter them out.
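A sketch of the filter, assuming the heuristic works on the filename extension as described above. The function name `looks_like_bw_event_file` and the `IDF_SUFFIXES` constant are hypothetical; the script's real `_looks_like_event_file` may be structured differently.

```python
from pathlib import Path

# Thor IDF extensions named above; these have their own ingest path.
IDF_SUFFIXES = {".IDFW", ".IDFH"}


def looks_like_bw_event_file(path: str) -> bool:
    """Accept 3-4 char extensions ending in W or H, but skip Thor IDF files."""
    suffix = Path(path).suffix.upper()
    if suffix in IDF_SUFFIXES:
        return False
    ext = suffix.lstrip(".")
    return 3 <= len(ext) <= 4 and ext[-1] in ("W", "H")


print(looks_like_bw_event_file("BE17353/S353L5KC.DR0H"))
print(looks_like_bw_event_file("BE17353/EVENT.IDFW"))
```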
After this fix, the prod backfill should run clean: ~3392 BW events
get sidecar+h5 regen as expected; the ~3807 Thor IDF events are
silently skipped.
The proper "IDF backfill" (refresh tool_version stamp on IDF
sidecars by re-running event_to_sidecar_dict against the stored
DB row + sidecar extensions block) is a separate, narrower
follow-up — not blocking the BW backfill rollout.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The histogram-mode event body is now byte-exact decodable.
Companion to the waveform body codec — together they cover every
event file the watcher forwards. Cracked in one session via
cross-event correlation against BW's ASCII export.
The §7.6.2 spec in instantel_protocol_reference.md was structurally
correct (32-byte blocks) but the per-sample semantics were
under-documented. Cross-checking block 130 of N844L6Z8.ZR0H
against its TXT row revealed the layout perfectly:
slot[0] = 10 (constant marker)
slot[1] = T_peak_count (× 0.005 → in/s at Normal range)
slot[2] = T_halfperiod (freq_Hz = 512 / halfp)
slot[3] = V_peak_count
slot[4] = V_halfperiod
slot[5] = L_peak_count
slot[6] = L_halfperiod
slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db)
slot[8] = MicL_halfperiod
The `>100 Hz` sentinel is halfperiod ≤ 5 (512/5 = 102.4 Hz is the first
value above 100 Hz; 512/6 ≈ 85.3 Hz).
Mic dB uses the SAME formula as the waveform codec (sign × (81.94
+ 20·log10(|count|))) — they share the mic ADC calibration constant.
Block identification anchor: bytes [22:24] == 0x0000 AND
bytes [28:32] == 1e 0a 00 00. The tail signature is the most
reliable distinguisher from non-block content in the file.
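The per-slot conversions and the block anchor can be sketched as below. The function names echo the helpers listed in this commit, but these bodies are independent reconstructions from the formulas above, not the shipped code. Returning None for the `>100 Hz` sentinel and 0.0 for a zero mic count are assumptions, and slot byte widths and endianness are deliberately not modelled because they are not stated here.

```python
import math
from typing import Optional

GEO_INS_PER_COUNT = 0.005                  # in/s per count at Normal range
TAIL_SIGNATURE = bytes.fromhex("1e0a0000")  # bytes [28:32] of a valid block


def half_period_to_hz(halfp: int) -> Optional[float]:
    """freq_Hz = 512 / halfperiod; None stands in for the '>100 Hz' sentinel."""
    if halfp <= 5:
        return None
    return 512 / halfp


def geo_count_to_ins(count: int) -> float:
    """Geophone peak count to in/s (Normal range only)."""
    return count * GEO_INS_PER_COUNT


def mic_count_to_db(count: int) -> float:
    """sign * (81.94 + 20*log10(|count|)), per the formula above."""
    if count == 0:
        return 0.0  # assumption: zero-count handling is not specified above
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))


def is_histogram_block(block: bytes) -> bool:
    """Anchor check: bytes [22:24] are zero and the tail signature matches."""
    return (len(block) == 32
            and block[22:24] == b"\x00\x00"
            and block[28:32] == TAIL_SIGNATURE)
```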
Files:
minimateplus/histogram_codec.py (new) — decoder + public API
matching the waveform codec's shape:
walk_body(body) -> records
decode_histogram_body(body) -> {Tran, Vert, Long, MicL}
decode_histogram_body_full(body) -> [per-interval dicts]
half_period_to_hz, geo_count_to_ins helpers
minimateplus/event_file_io.py (modified) — read_blastware_file
now tries the waveform codec first, falls back to the histogram
codec on failure. Same output shape, same downstream pipeline.
tests/test_histogram_codec.py (new) — 24 regression locks against
the in-repo fixture corpus, byte-exact against BW ASCII export
for peaks (all 4 channels), frequencies (all 4 channels,
including >100 Hz sentinel handling), block framing, and
segment-ID accounting.
scripts/backfill_sidecars.py (modified) — the has_samples
short-circuit added in the histogram-pending era is now a
pure defensive guard. Histograms in prod will regen .h5 files
correctly on the next backfill run.
docs/histogram_codec_re_status.md (updated) — supersedes the
earlier "in progress" version with the verified format and
test-coverage summary. Notes a few non-essential fields still
open (4-byte block metadata, Geo PVS, Mic psi(L) — none of
which are needed for waveform reconstruction).
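The "waveform codec first, histogram codec on failure" order described for read_blastware_file can be sketched generically. This assumes each codec signals failure by raising ValueError, which is not confirmed here; `decode_with_fallback` and the two stub decoders are hypothetical.

```python
from typing import Callable, Sequence


def decode_with_fallback(body: bytes,
                         decoders: Sequence[Callable[[bytes], dict]]) -> dict:
    """Try each decoder in order; return the first successful result."""
    last_error = None
    for decode in decoders:
        try:
            return decode(body)
        except ValueError as exc:   # assumed failure signal
            last_error = exc
    raise ValueError("no codec could decode the event body") from last_error


def fake_waveform_codec(body: bytes) -> dict:
    """Stub: pretends only bodies starting with b'W' are waveform bodies."""
    if not body.startswith(b"W"):
        raise ValueError("not a waveform body")
    return {"codec": "waveform"}


def fake_histogram_codec(body: bytes) -> dict:
    """Stub: accepts anything."""
    return {"codec": "histogram"}


print(decode_with_fallback(b"H...", [fake_waveform_codec, fake_histogram_codec]))
```

Because both codecs return the same output shape, callers do not need to know which one succeeded.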
Total verified coverage: ~3,500 blocks across 5 fixtures, every
field of every block byte-exact against BW.
The watcher-forwarded histogram event corpus on prod (~10,000
events) will now produce correct .h5 sidecars on the next backfill
run. No additional changes needed to the backfill flow — the
existing tool_version-bump cascade picks them up automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while dry-running the backfill on the prod store: ~10,000
of ~10,059 events are histogram-mode (filename extension `*H`), and
the waveform-body codec wired in via the previous commit doesn't
handle histogram-mode bodies. Only the waveform-mode codec (§7.6.1
of the protocol reference) is implemented; the histogram-mode codec
(§7.6.2) is documented but has no Python implementation yet.
Without this guard, every histogram event's .h5 file would be
*replaced* with an empty one — strictly worse than today's
broken-int16-LE .h5 because any downstream viewer expecting
non-empty sample arrays would now error out instead of just
rendering wrong values.
Fix: after the decoder runs, check whether any channel has samples.
If not, skip the .h5 write entirely. The sidecar still regenerates
(refreshing the tool_version stamp and any peaks/project info from
the DB row), but the existing .h5 is left untouched.
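A minimal sketch of the guard, assuming the decoder output is a dict of per-channel sample sequences. The channel names follow the codec's documented output keys; `has_samples` here and `plan_h5_write` are illustrative, not the script's code.

```python
from typing import Mapping, Sequence


def has_samples(channels: Mapping[str, Sequence[float]]) -> bool:
    """True if at least one channel decoded to a non-empty sample array."""
    return any(len(samples) > 0 for samples in channels.values())


def plan_h5_write(channels: Mapping[str, Sequence[float]]) -> str:
    """Decide what to do with the .h5; the sidecar regenerates either way."""
    return "write" if has_samples(channels) else "skipped-empty-samples"


empty = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
print(plan_h5_write(empty))
```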
This is a *temporary* gate. When the histogram codec lands (next
branch: `feat/wire-histogram-codec`), the has_samples check can be
removed and the backfill will then correctly regenerate all .h5
files, histogram and waveform alike.
Observed effect (dry-run on prod store, 10,059 events):
- waveform events (~5%): "[DRY ] would write … + .h5 (would (re)write)"
- histogram events (~95%): "[DRY ] would write … + .h5 (skipped-empty-samples)"
- sidecar tool_version bump succeeds for both
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two coupled changes that close the rollout gap left by the
read_blastware_file codec wiring:
1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to
0.20.0. This is the version stamp the backfill script reads from
each sidecar's source.tool_version field to detect "this sidecar
was written before the current decoder shipped, regenerate it."
Bumping past every value baked into existing prod sidecars flags
them all as stale on the next backfill run — which is exactly what
we want, since every pre-codec-wiring sidecar was written by the
retracted int16-LE decoder.
2. scripts/backfill_sidecars.py: when the sidecar is being
regenerated this iteration (sha mismatch, tool_version too old,
or --force), also regenerate the .h5. Previously the .h5 logic
only rewrote when --force was passed or the file was missing —
so a tool_version-driven sidecar regen left the broken .h5 in
place forever. Added a `sidecar_stale` boolean to track the
"we're rewriting the sidecar this iteration" state and wired it
into the h5 need-rewrite check.
Path coverage (verified by trace):
- sidecar missing → both regen
- --force → both regen
- sha mismatch → both regen
- tool_ver too old → both regen (THE post-codec-wiring case)
- everything OK → skip iteration entirely (h5 untouched)
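The path coverage above can be expressed as a small decision function. This is a sketch under assumptions: the keyword arguments, `parse_version`, and the dotted-integer version format are illustrative, and the pre-existing "h5 file missing" rewrite rule is not modelled.

```python
CURRENT_TOOL_VERSION = "0.20.0"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.16.1' into (0, 16, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def sidecar_stale(*, sidecar_exists: bool, force: bool, sha_matches: bool,
                  tool_version: str) -> bool:
    """True when the sidecar will be rewritten this iteration."""
    if not sidecar_exists or force or not sha_matches:
        return True
    return parse_version(tool_version) < parse_version(CURRENT_TOOL_VERSION)


def plan(**state) -> str:
    """A stale sidecar drags the .h5 along; otherwise the iteration is skipped."""
    return "regen sidecar + h5" if sidecar_stale(**state) else "skip"


print(plan(sidecar_exists=True, force=False, sha_matches=True,
           tool_version="0.16.1"))
```

Comparing parsed tuples rather than raw strings matters once a component reaches two digits: as strings, "0.9.0" sorts after "0.20.0".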
Operator review state (review.false_trigger, reviewer, notes) and
the sidecar's extensions block are preserved across regen by the
existing read-existing-sidecar / pass-into-event_to_sidecar_dict
path — unchanged from prior behavior.
Deploy procedure (on prod):
1. Pull this change + the read_blastware_file codec wiring.
2. `python scripts/backfill_sidecars.py --dry-run` to preview.
Every sidecar with source.tool_version<0.20.0 will show as
"would (re)write".
3. Run for real (drop --dry-run). Expect every pre-fix event
to regen. Big stores may take a while.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Created a comprehensive runbook (`wedged_unit_recovery.md`) detailing the recovery process for units stuck in a call-home loop, including symptoms, recovery steps, and explanations of the failure mode.
- Added `blind_stop.sh` script to send stop-monitoring commands in a tight loop for unresponsive devices.
- Introduced `rescue_device.sh` script to disable Auto Call Home and erase events from a busy device.
- Implemented `slow_drip.sh` script to send stop-monitoring frames at a slow rate to prevent UART overrun.
- Developed `spam_stop.sh` script to rapidly send stop-monitoring commands to a device.
- Created `watch_unit.sh` script for passive monitoring of device reachability, logging results over time.
Pre-v0.16.1 (commit aac1c8e), every event ingested through
read_blastware_file got record_type="Waveform" regardless of actual
type because the field was hardcoded. New ingests derive correctly
from the AB0T filename scheme (H/W/M/E/C). Existing rows still hold
the wrong value.
This script walks the events table, derives the correct record_type
from each row's blastware_filename, and bulk-updates rows that differ.
Idempotent + dry-run by default.
Usage:
python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db
python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db --apply
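A sketch of the derivation step, assuming the type letter is the last character of the extension. Only H and W are mapped, because the record_type strings for M, E and C are not spelled out here; the "Histogram" label and the names `TYPE_BY_LETTER` and `derive_record_type` are assumptions.

```python
from pathlib import Path
from typing import Optional

# Partial, assumed mapping from the trailing extension letter to record_type.
TYPE_BY_LETTER = {"H": "Histogram", "W": "Waveform"}


def derive_record_type(blastware_filename: str) -> Optional[str]:
    """Return the record_type implied by the filename, or None if unknown."""
    ext = Path(blastware_filename).suffix
    if not ext:
        return None
    return TYPE_BY_LETTER.get(ext[-1].upper())


def needs_update(current: str, blastware_filename: str) -> bool:
    """A row is rewritten only when a type can be derived and it differs."""
    derived = derive_record_type(blastware_filename)
    return derived is not None and derived != current


print(derive_record_type("S353L5KC.DR0H"))
```

Skipping rows that already match is what makes a second run a no-op.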
Terra-view's event-detail modal already derives the record_type
client-side from the filename for display, so operators see the
correct type in the UI even before this backfill runs. This script
brings the DB column in line with what the UI is already showing —
matters for reporting and any downstream consumer that reads the
column directly.
The /db/import/blastware_file endpoint was bucketing every
forwarded event into serial='UNKNOWN' in the DB. WaveformStore
correctly decoded the serial from the BW filename and saved
files to <store>/<serial>/<filename> (e.g.
.../BE17353/S353L5KC.DR0H.h5), but the endpoint code called
db.insert_events(serial=_serial_from_event(ev)) — and
_serial_from_event was a stub that always returned None,
falling back to "UNKNOWN".
Effect on the user's prod server: 3,039 events forwarded across
24 distinct units, ALL inserted under serial='UNKNOWN'. The
on-disk waveform store + sidecars + HDF5s were fine, but the
SFM webapp's /db/units only showed the two original manually-
uploaded serials because every forwarded row had its serial
column zeroed to UNKNOWN.
Fix:
- WaveformStore.save_imported_bw() now surfaces the decoded
serial on the returned `rec` dict (rec["serial"]).
- The import endpoint uses rec["serial"] as the authoritative
fallback when the operator hasn't supplied a serial_hint query
parameter. Order of precedence:
query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN"
- Response payload now includes `serial` per file so the watcher
log lines (or any future caller) can see which unit each event
was attributed to.
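The order of precedence amounts to a first-non-empty chain. A minimal sketch, with `resolve_serial` and its parameter names as hypothetical stand-ins for the endpoint's inline logic:

```python
from typing import Optional


def resolve_serial(query_serial: Optional[str],
                   rec_serial: Optional[str],
                   event_serial: Optional[str]) -> str:
    """Pick the first non-empty candidate, falling back to 'UNKNOWN'."""
    for candidate in (query_serial, rec_serial, event_serial):
        if candidate:
            return candidate
    return "UNKNOWN"


# No operator-supplied serial: the store-decoded serial wins.
print(resolve_serial(None, "BE17353", None))
```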
Recovery for existing DB rows:
scripts/repair_unknown_serials.py walks the events table looking
for rows with serial='UNKNOWN' and re-attributes each one to the
serial decoded from blastware_filename. Updates the row in place
unless the target (serial, timestamp) already has a row, in which
case the UNKNOWN duplicate is deleted. Idempotent. Default
dry-run; pass --apply to commit.
Verified on the user's actual DB (dry-run):
UNKNOWN rows scanned:                            3039
Updated to real serial:                          2602
Deleted (duplicate of an already-correct row):    437
Unresolved (bad filename):                          0
After running the repair, /db/units will show all 24 units
correctly populated.
### Added
- **Layered event storage architecture.** Each event now lands as four
files in the per-serial waveform store, each with a clear role:
- `<filename>` — the Blastware-readable binary (BW file). Untouched.
- `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
- `<filename>.h5` — clean per-channel waveform arrays in physical
units (in/s for geo, psi for mic) plus event metadata (HDF5 with
gzip compression). This is the canonical format for downstream
analysis tools.
- `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
project, source provenance, review state, extensions).
SQLite (`seismo_relay.db`) is the searchable index over all four.
- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
and `/db/events/{id}/waveform.json` endpoints now return samples in
physical units with explicit time-axis metadata, peak markers, and
per-channel unit hints — no more guessing the ADC-to-velocity scale
client-side. The webapp waveform viewer was rewritten to consume
this shape.
- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
(≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
*in/s per V* hardware constant — not the ADC-counts-to-velocity
factor. This silently scaled every plot ~38% too low for Normal-range
geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
Sensitive). Conversion is now done server-side using the geo_range
from compliance config; the client just plots.
- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
`read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
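The "~38% too low" figure in the viewer accuracy fix above follows directly from the two constants. A quick check, where `shortfall` is a hypothetical helper and only the Normal-range case is computed:

```python
GEO_ADC_SCALE = 6.206053        # in/s per V hardware constant (the wrong scale)
NORMAL_FULL_SCALE_INS = 10.0    # correct Normal-range full scale, in/s


def shortfall(wrong_scale: float, correct_scale: float) -> float:
    """Fraction by which amplitudes are under-reported with the wrong scale."""
    return 1.0 - wrong_scale / correct_scale


fraction = shortfall(GEO_ADC_SCALE, NORMAL_FULL_SCALE_INS)
print(f"{fraction:.1%} too low")
```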
### Dependencies
- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
`/db/import/blastware_file` endpoint introduced in this release).