seismo-relay

Author	SHA1	Message	Date
serversdown	3457ed0072	bw_ascii_report: parse OORANGE saturation marker + TimeSum typo BW writes "OORANGE" (truncation of "Out Of Range") when a channel exceeds its full-scale, and uses a typo'd label "Peak Vector Sum TimeSum" for the PVS time field. Both confirmed against real ASCII files pulled from a Windows watcher PC 2026-05-27: T190LD5Q.LK0W Vert PPV = OORANGE (Normal range, 10 in/s exceeded) T438L713.RY0W All three PPVs OORANGE (Sensitive range, 1.25 in/s) K557L3YM.OE0W Tran+Vert PPV OORANGE + MicL PSPL OORANGE Previously our _parse_number() returned None for OORANGE → DB columns ended up NULL → events vanished from filters / sorts / dashboards despite being legitimate high-amplitude events. New behavior — substitute a conservative bound + set a saturation flag: - Channel PPV → geo_range_ips + ChannelStats.ppv_saturated - Peak Vector Sum → sqrt(3) * geo_range_ips + peak_vector_sum_saturated - MicL PSPL → 140 dB(L) + MicStats.pspl_saturated Flags propagate to the sidecar's bw_report block so the SFM UI can render "> 10 in/s" / "> 140 dBL" rather than treating the substituted value as exact. Same commit also accepts "Peak Vector Sum TimeSum" as an alias for "Peak Vector Sum Time" (BW always writes the typo on OORANGE PVS lines — every example file confirms it). Tests: new test_oorange_marker_treated_as_saturation (synthetic) + test_real_oorange_event_t190_parses (skips if real fixture absent). 177/177 tests pass; 16 pre-existing missing-fixture skips unchanged. Five events on prod (T190, T438, K557, plus 2 others matching the same fault pattern) will pick up correct peaks + saturation flags once watchers re-forward. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:32:56 +00:00
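The saturation handling described in this commit can be sketched roughly as follows. This is an illustrative reconstruction, not the project's `_parse_number()`: the function names, the return shape, and the choice to read only the leading numeric token are assumptions, while the `OORANGE` marker, the `sqrt(3) * geo_range_ips` bound, the 140 dB(L) mic bound, and the `TimeSum` alias come from the message above.

```python
import math
from typing import Optional, Tuple

OORANGE = "OORANGE"      # marker text quoted in the commit message
MIC_BOUND_DBL = 140.0    # substitute bound for a saturated MicL PSPL

# BW writes the typo'd label on saturated PVS lines; map it to the canonical one.
LABEL_ALIASES = {"Peak Vector Sum TimeSum": "Peak Vector Sum Time"}


def parse_saturable(text: str, bound: float) -> Tuple[Optional[float], bool]:
    """Return (value, saturated); `bound` is substituted when the field reads OORANGE."""
    token = text.strip()
    if token.upper().startswith(OORANGE):
        return bound, True
    try:
        # Take the leading numeric token, ignoring a trailing unit such as "in/s".
        return float(token.split()[0]), False
    except (ValueError, IndexError):
        return None, False


def pvs_bound(geo_range_ips: float) -> float:
    """Upper bound on the vector sum when all three axes sit at full scale."""
    return math.sqrt(3.0) * geo_range_ips


def canonical_label(label: str) -> str:
    """Resolve known label typos; unknown labels pass through unchanged."""
    return LABEL_ALIASES.get(label, label)
```

Returning the flag next to the value lets a caller render "> 10 in/s" instead of treating the substituted bound as an exact reading.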
serversdown	d21e3b5298	histogram aggregation + parser extension for BW interval fields Three layered changes that together make histogram charts visually match BW's printout (one bar per interval, not per codec block): 1. bw_ascii_report parser captures histogram fields it previously dropped: - Histogram Start/Stop Time + Date → datetime - Number of Intervals + Interval Size (string + parsed seconds) - <Channel> Peak Time + Peak Date → datetime (per-channel) - Peak Vector Sum Date (combined with PVS Time → datetime; clears the bogus seconds parse that interpreted "22:33:52" as 22.0) New _parse_iso_date() handles BW's ISO format for histograms (waveforms use "May 8, 2026" long form). New _parse_interval_size() handles "1 minute" / "5 minutes" / "15 seconds" etc. 2. _bw_report_to_dict() projects the new fields into a new bw_report.histogram block in the sidecar. 3. /db/events/{id}/waveform.json wraps the existing path 1 (HDF5) output with _maybe_aggregate_histogram(): when the event is a histogram AND the sidecar has bw_report.histogram.n_intervals, group the codec's per-block samples into N intervals via max-per-group and return the aggregated array. time_axis gains histogram_aggregated / n_intervals / interval_size_s / interval_times fields. Frontend (both modal chart in sfm_webapp.html + standalone event browser) uses interval_times as x-axis labels when provided (BW-style HH:MM:SS), falls back to interval index. Defensive: aggregation is no-op when the sidecar lacks the histogram block (events ingested before this change). Activates automatically on prod once a watcher re-forward populates new sidecars. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:23:05 +00:00
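A minimal sketch of the two pieces this commit describes: interval-size parsing and max-per-group aggregation. The function names are hypothetical, the "hour" unit is an assumption (the message only shows seconds and minutes), and the even contiguous grouping of blocks into intervals is a guess at how `_maybe_aggregate_histogram()` partitions the samples.

```python
import re
from typing import List, Optional, Sequence

_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}  # "hour" is an assumption


def parse_interval_size(text: str) -> Optional[int]:
    """Convert strings like '1 minute' or '15 seconds' to whole seconds."""
    m = re.fullmatch(r"\s*(\d+)\s*(second|minute|hour)s?\s*", text, re.IGNORECASE)
    if m is None:
        return None
    return int(m.group(1)) * _UNIT_SECONDS[m.group(2).lower()]


def aggregate_max(samples: Sequence[float], n_intervals: int) -> List[float]:
    """Reduce per-block samples to n_intervals values, keeping each group's maximum."""
    if n_intervals <= 0 or len(samples) < n_intervals:
        return list(samples)  # no-op when aggregation is impossible
    out = []
    for i in range(n_intervals):
        # Contiguous, near-equal slices; every slice is non-empty because
        # len(samples) >= n_intervals.
        lo = i * len(samples) // n_intervals
        hi = (i + 1) * len(samples) // n_intervals
        out.append(max(samples[lo:hi]))
    return out
```

The early return mirrors the defensive no-op described above for sidecars that lack the histogram block.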
serversdown	ad2b553c7b	ingest: preserve raw BW ASCII report (.TXT) alongside the binary Previously the .TXT was parsed into the sidecar's bw_report projection and then discarded at ingest time. Now save_imported_bw() writes it to <store>/<serial>/<filename>_ASCII.TXT permanently. Rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF those tools produce won't be available — the binary + .TXT (created by BW ACH itself) are our only authoritative inputs going forward. Keeping the raw .TXT unlocks: - Parser bug fixes can be applied RETROACTIVELY by re-parsing the stored .TXT, instead of requiring a re-forward from the watcher PC (which lost the .TXT after BW ACH cleanup). - Audit trail of what BW actually sent us, for debugging. - The five known parser-PPV-miss events will be re-parseable once the regex fix lands (instead of staying broken indefinitely). Storage cost: ~15 KB per event × 14k events = ~210 MB on the existing prod corpus. Negligible. Implementation: - WaveformStore gains txt_path_for() + open_txt() - save_imported_bw() writes the .TXT when bw_report_text is supplied - sidecar source block records the txt_filename - backfill_sidecars.py preserves txt_filename across regens - New GET /db/events/{id}/ascii_report.txt endpoint serves it - Returns 404 for events ingested before this change (no .TXT in the store yet) — re-forward to populate Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:01:12 +00:00
serversdown	35842ac50a	backfill: overlay bw_report onto Event before DB upsert Mirror what the ingest path does: BW's reported peaks (and sample_rate / record_time) take precedence over codec output where present. Without this, --force backfill silently overwrites bw_report-overlaid DB columns with codec-derived peaks. Wrong for events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing PVS=0 on real high-amplitude events. Bit on prod 2026-05-22 with three top-10 waveform events ending up at PVS=0 (rolled back same day, this fix is the proper resolution). New helper minimateplus.event_file_io.apply_bw_report_dict_to_event operates on the projected sidecar dict shape (the structure _bw_report_to_dict produces, which is what gets preserved in the sidecar). Mirrors apply_report_to_event's semantics: only writes fields where bw_report has a non-None value, no-ops cleanly on empty / None input. Dev validation against prod snapshot: pre : 1839.7315 pvs_sum 356 events with DB PVS ≠ sidecar bw_report post : 2016.4902 pvs_sum 2 events still mismatched (both have NULL timestamp + duplicate rows, edge case) Both edge-case events DO get the correct value written by the new backfill — their stale rows from prior backfills remain because UNIQUE(serial, timestamp) doesn't fire on NULL. Separate dedup cleanup needed for those 2 events (0.014% of corpus); not blocking. Backfill remains idempotent + bw_report preservation still passes (0 WIPED, 0 CHANGED on the 3rd consecutive run). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 18:56:22 +00:00
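The overlay semantics ("only writes fields where bw_report has a non-None value, no-ops cleanly on empty / None input") might look like the sketch below. The `Event` dataclass and `overlay_report` are hypothetical stand-ins; the real `apply_bw_report_dict_to_event` operates on the project's own types and field names, which are not shown in this log.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Event:  # hypothetical stand-in for the project's Event type
    tran_ppv: Optional[float] = None
    vert_ppv: Optional[float] = None
    peak_vector_sum: Optional[float] = None
    sample_rate: Optional[int] = None


def overlay_report(event: Event, report: Optional[Mapping[str, Any]]) -> Event:
    """Copy non-None report fields onto the event; leave everything else alone."""
    if not report:
        return event  # empty or None input is a clean no-op
    for name, value in report.items():
        # Skip missing values and keys the event does not model.
        if value is not None and hasattr(event, name):
            setattr(event, name, value)
    return event
```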
serversdown	7183b953e4	minimateplus: histogram body codec — FULLY DECODED The histogram-mode event body is now byte-exact decodable. Companion to the waveform body codec — together they cover every event file the watcher forwards. Cracked in one session via cross-event correlation against BW's ASCII export. The §7.6.2 spec in instantel_protocol_reference.md was structurally correct (32-byte blocks) but the per-sample semantics were under-documented. Cross-checking block 130 of N844L6Z8.ZR0H against its TXT row revealed the layout perfectly: slot[0] = 10 (constant marker) slot[1] = T_peak_count (× 0.005 → in/s at Normal range) slot[2] = T_halfperiod (freq_Hz = 512 / halfp) slot[3] = V_peak_count slot[4] = V_halfperiod slot[5] = L_peak_count slot[6] = L_halfperiod slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db) slot[8] = MicL_halfperiod The `>100 Hz` sentinel is halfperiod ≤ 5 (since 512/5 = 102.4 Hz, while 512/6 ≈ 85.3 Hz). Mic dB uses the SAME formula as the waveform codec (sign × (81.94 + 20·log10(|count|))) — they share the mic ADC calibration constant. Block identification anchor: bytes [22:24] == 0x0000 AND bytes [28:32] == 1e 0a 00 00. The tail signature is the most reliable distinguisher from non-block content in the file. Files: minimateplus/histogram_codec.py (new) — decoder + public API matching the waveform codec's shape: walk_body(body) -> records decode_histogram_body(body) -> {Tran, Vert, Long, MicL} decode_histogram_body_full(body) -> [per-interval dicts] half_period_to_hz, geo_count_to_ins helpers minimateplus/event_file_io.py (modified) — read_blastware_file now tries the waveform codec first, falls back to the histogram codec on failure. Same output shape, same downstream pipeline. tests/test_histogram_codec.py (new) — 24 regression locks against the in-repo fixture corpus, byte-exact against BW ASCII export for peaks (all 4 channels), frequencies (all 4 channels, including >100 Hz sentinel handling), block framing, and segment-ID accounting. 
scripts/backfill_sidecars.py (modified) — the has_samples short-circuit added in the histogram-pending era is now a pure defensive guard. Histograms in prod will regen .h5 files correctly on the next backfill run. docs/histogram_codec_re_status.md (updated) — supersedes the earlier "in progress" version with the verified format and test-coverage summary. Notes a few non-essential fields still open (4-byte block metadata, Geo PVS, Mic psi(L) — none of which are needed for waveform reconstruction). Total verified coverage: ~3,500 blocks across 5 fixtures, every field of every block byte-exact against BW. The watcher-forwarded histogram event corpus on prod (~10,000 events) will now produce correct .h5 sidecars on the next backfill run. No additional changes needed to the backfill flow — the existing tool_version-bump cascade picks them up automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 23:05:13 +00:00
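The per-block conversions and the block anchor test from this commit can be approximated as below. These are re-implementations for illustration only: the helper names echo those in the message, but their real signatures are not shown in this log, and returning `None` for the sentinel and `0.0` for a zero mic count are assumptions.

```python
import math
from typing import Optional

BLOCK_SIZE = 32
TAIL_SIGNATURE = bytes.fromhex("1e0a0000")


def is_histogram_block(block: bytes) -> bool:
    """Apply the anchor test: bytes [22:24] are zero and bytes [28:32] match the tail."""
    return (
        len(block) == BLOCK_SIZE
        and block[22:24] == b"\x00\x00"
        and block[28:32] == TAIL_SIGNATURE
    )


def half_period_to_hz(halfp: int) -> Optional[float]:
    """Return frequency in Hz, or None for the '>100 Hz' sentinel (halfp <= 5)."""
    if halfp <= 5:
        return None  # 512/5 = 102.4 Hz, so anything this short reports as >100 Hz
    return 512.0 / halfp


def geo_count_to_ins(count: int, ips_per_count: float = 0.005) -> float:
    """Scale a geophone peak count to in/s (0.005 in/s per count at Normal range)."""
    return count * ips_per_count


def mic_count_to_db(count: int) -> float:
    """sign * (81.94 + 20*log10(|count|)); a zero count is mapped to 0.0 here."""
    if count == 0:
        return 0.0  # assumption: log10(0) is undefined, so treat silence as 0
    return math.copysign(81.94 + 20.0 * math.log10(abs(count)), count)
```

Slot offsets and byte order within the 32-byte block are deliberately left out, since the message does not state them.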
serversdown	fa9d3cdef2	read_blastware_file: leave peak_values=None when samples can't be decoded Fixes a data-loss bug discovered while dry-running the backfill against the prod store. Symptom: every histogram event in the store has its body decoded by read_blastware_file → codec returns None → samples = empty dict → ``ev.peak_values = _peaks_from_samples(empty)`` returns ``PeakValues(0, 0, 0, 0, 0)`` (NOT None). The backfill script's existing "seed from DB row when peak_values is None" branch then correctly skips the seeding, and the all-zeros PeakValues flows into ``db.insert_events()``'s UPSERT path, OVERWRITING the existing good DB peak values for that event (which were populated from the paired BW ASCII report at ingest). Net effect: running the backfill on prod would have wiped the PPV / mic / vector-sum columns for ~10,000 histogram events. Fix: only compute peaks-from-samples when there are actually samples. For events the codec couldn't decode (histogram-mode bodies, until the §7.6.2 histogram codec is wired in), leave peak_values=None as the "we don't know" signal. Downstream consumers: - backfill_sidecars.py — its existing ``if ev.peak_values is None:`` branch (line 243) seeds from the DB row, preserving the real BW-report peaks across the regen. - WaveformStore.save_imported_bw — apply_report_to_event overlays peaks from the paired BW ASCII report when one was uploaded. Histogram imports without a paired report end up with NULL peaks in the DB, which is correct (better than zeros — clearly says "no peak data available" rather than "peaks are exactly zero"). Updated the existing synthetic-event round-trip test to expect peak_values=None for the no-real-body case, which is the truth now. The 7 fixture-corpus regression tests for real BW waveforms continue to pass — those have decodable samples, so peak_values is still populated from the codec output as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:30:53 +00:00
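The distinction this fix relies on, `None` meaning "unknown" versus zeros meaning "measured as zero", can be illustrated with a small sketch. Both function names and the dict-based peak shape are hypothetical; the project uses a `PeakValues` type whose fields are not shown here.

```python
from typing import Dict, Optional, Sequence


def peaks_from_samples(samples: Dict[str, Sequence[float]]) -> Optional[Dict[str, float]]:
    """Return per-channel peak magnitudes, or None when nothing was decoded."""
    if not samples or not any(len(v) for v in samples.values()):
        return None  # "unknown", so an upsert cannot overwrite good peaks with zeros
    return {ch: max((abs(x) for x in v), default=0.0) for ch, v in samples.items()}


def resolve_peaks(decoded: Optional[Dict[str, float]],
                  db_row_peaks: Optional[Dict[str, float]]) -> Optional[Dict[str, float]]:
    """Prefer decoded peaks; fall back to the values already stored in the DB row."""
    return decoded if decoded is not None else db_row_peaks
```

With this shape, a backfill that cannot decode a body carries the existing DB peaks forward instead of replacing them.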
serversdown	e8682d49ad	scripts/backfill_sidecars: cascade h5 regen when sidecar is stale + bump TOOL_VERSION Two coupled changes that close the rollout gap left by the read_blastware_file codec wiring: 1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to 0.20.0. This is the version stamp the backfill script reads from each sidecar's source.tool_version field to detect "this sidecar was written before the current decoder shipped, regenerate it." Bumping past every value baked into existing prod sidecars flags them all as stale on the next backfill run — which is exactly what we want, since every pre-codec-wiring sidecar was written by the retracted int16-LE decoder. 2. scripts/backfill_sidecars.py: when the sidecar is being regenerated this iteration (sha mismatch, tool_version too old, or --force), also regenerate the .h5. Previously the .h5 logic only rewrote when --force was passed or the file was missing — so a tool_version-driven sidecar regen left the broken .h5 in place forever. Added a `sidecar_stale` boolean to track the "we're rewriting the sidecar this iteration" state and wired it into the h5 need-rewrite check. Path coverage (verified by trace): - sidecar missing → both regen - --force → both regen - sha mismatch → both regen - tool_ver too old → both regen (THE post-codec-wiring case) - everything OK → skip iteration entirely (h5 untouched) Operator review state (review.false_trigger, reviewer, notes) and the sidecar's extensions block are preserved across regen by the existing read-existing-sidecar / pass-into-event_to_sidecar_dict path — unchanged from prior behavior. Deploy procedure (on prod): 1. Pull this change + the read_blastware_file codec wiring. 2. `python scripts/backfill_sidecars.py --dry-run` to preview. Every sidecar with source.tool_version<0.20.0 will show as "would (re)write". 3. Run for real (drop --dry-run). Expect every pre-fix event to regen. Big stores may take a while. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 18:24:06 +00:00
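The path-coverage table in this commit reduces to a pair of predicates, sketched here under stated assumptions: the function names and the `sha256` key are hypothetical, and versions are compared as integer tuples so that, for example, 0.9.0 sorts before 0.20.0 (a plain string comparison would get that wrong).

```python
from typing import Optional, Tuple

TOOL_VERSION = "0.20.0"


def version_tuple(version: str) -> Tuple[int, ...]:
    """'0.16.1' -> (0, 16, 1), so comparison is numeric rather than lexical."""
    return tuple(int(part) for part in version.split("."))


def sidecar_is_stale(sidecar: Optional[dict], file_sha: str, force: bool = False) -> bool:
    """True when the sidecar is missing, forced, sha-mismatched, or too old."""
    if force or sidecar is None:
        return True
    source = sidecar.get("source", {})
    if source.get("sha256") != file_sha:  # key name is an assumption
        return True
    return version_tuple(source.get("tool_version", "0")) < version_tuple(TOOL_VERSION)


def needs_h5_rewrite(sidecar_stale: bool, h5_exists: bool, force: bool = False) -> bool:
    """The .h5 follows the sidecar: regenerate whenever the sidecar is rewritten."""
    return force or sidecar_stale or not h5_exists
```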
serversdown	31d691b40b	minimateplus: wire read_blastware_file to verified body codec `read_blastware_file()` was still calling `_decode_samples_4ch_int16_le` (the retracted int16-LE-interleaved hypothesis) on the body bytes, producing ±32K noise on every channel of every BW file read from disk. This was the path watcher-forwarded events take into the system (via the import endpoint → save_imported_bw → read_blastware_file, since the watcher doesn't ship A5 frames), so every .h5 sidecar generated for a forwarded event has been wrong since the feature shipped. The fix is mechanical: pass the body bytes straight to `waveform_codec.decode_waveform_v2()` and run the result through `decoded_to_adc_counts()` for the 16x geo scaling. The body already starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]` — confirmed by `body[:3].hex()` across all 9 fixture events. No body-slice adjustment needed. If the codec returns None (truncated/malformed file, synthetic test input with no real waveform), fall back to empty channels with a log warning. The rest of the event (timestamp, waveform_key, project strings, sensor_location, peaks-from-samples=0) is still recoverable. Verified against the bundled fixture corpus: V70 Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth within the 0.005 in/s display quantum, every row 6S0/RG0/AB0/470 (5-8-26) 3328/2304/1280/1280 samples; Vert PPVs match BW's own report within 0.02 in/s JQ0 3328 samples, Vert PPV 3.384 vs BW 3.465 SP0/SS0/SV0 (loud events) 3072–3328 samples; known walker tail-truncation 1–7 samples per channel, samples reached are byte-exact Existing `test_read_blastware_file_round_trip` (synthetic empty event) continues to pass thanks to the None-fallback. Codec verify scripts (`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`) re-run unchanged. 
Added two regression-lock tests in tests/test_event_file_io.py: - test_read_blastware_file_decodes_via_codec[6 fixtures] — verifies sample count + Vert PPV per fixture - test_read_blastware_file_v70_samples_match_txt_truth — verifies every one of V70's 3328 sample-sets across Tran/Vert/Long matches the .TXT ground truth row-by-row within 0.003 in/s Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 18:13:24 +00:00
serversdown	cd20be2eff	feat: add thor/micromate compatibility v0.18.0	2026-05-19 04:32:43 +00:00
serversdown	aac1c8e06d	fix(import): derive record_type from filename suffix instead of hardcoding "Waveform" The BW ACH ingest path was inserting every event with record_type="Waveform" regardless of the actual type because read_blastware_file() had `ev.record_type = "Waveform"` hardcoded, and the live watcher-forward path parses files from a tmp path (suffix ".bw") that doesn't carry the original extension. V10.72+ MiniMate Plus firmware encodes the event type as the last character of the AB0T extension scheme (H=Histogram, W=Waveform, M=Manual, E=Event, C=Combo). This change: 1. Adds derive_record_type_from_filename() public helper in minimateplus/event_file_io.py 2. Uses it inside read_blastware_file() so direct callers (the --dry-run path of scripts/import_bw.py, tests, ad-hoc scripts) get correct types automatically 3. Overrides ev.record_type in WaveformStore.save_imported_bw() using the ORIGINAL filename (source_path.name) — required because the parser sees only the tmp file Old S338 firmware (3-char extensions ending in `0`) and any unrecognized suffix fall back to "Waveform". Existing DB rows ingested before this fix are stuck with record_type="Waveform" — a one-off SQL backfill would fix them retroactively if desired. Terra-view's event modal also derives client-side from the filename, so the UI already shows the correct type for old events even without the backfill. Version bumped to 0.16.1 in pyproject.toml, event_file_io.py TOOL_VERSION, sfm/server.py FastAPI version, and CHANGELOG.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 21:09:21 +00:00
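A rough sketch of the suffix rule described in this commit. The name `derive_record_type` is a stand-in for the real `derive_record_type_from_filename()`, and requiring exactly four extension characters before consulting the table is an assumption drawn from the "AB0T" scheme; everything else falls back to "Waveform" as stated above.

```python
from pathlib import Path

_SUFFIX_TYPES = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def derive_record_type(filename: str) -> str:
    """Map the last character of a 4-char extension to a record type."""
    ext = Path(filename).suffix.lstrip(".").upper()
    if len(ext) == 4:
        return _SUFFIX_TYPES.get(ext[-1], "Waveform")
    # Old 3-char extensions, tmp names like "upload.bw", and anything else.
    return "Waveform"
```

Because the parser only sees a temporary path, the caller has to pass the original filename, which is the reason for the override in `save_imported_bw()`.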
serversdown	197c0630e2	chore(release): v0.16.0 — BW ACH ingestion The "BW ACH ingestion" release. Paired with series3-watcher v1.5.0, every Blastware ACH event (binary + _ASCII.TXT report) lands in SeismoDb with device-authoritative peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data — without depending on the still-undecoded waveform body codec. Bumps pyproject.toml + minimateplus/event_file_io.py TOOL_VERSION to 0.16.0. README banner + CHANGELOG entry summarise the work that landed across commits cdfe4ad..f83993a on this branch.	2026-05-11 07:33:48 +00:00
serversdown	6b2a44ff02	fix(import): overlay BW report onto Event + upsert DB row on re-import Two compounding bugs caused forwarded events to land in the DB with broken-codec peak values (~10 in/s saturation on every channel) and no project info, even when the watcher correctly paired a BW ASCII report with the binary. Bug 1: save_imported_bw built the sidecar JSON with the report's authoritative peak / project values via event_to_sidecar_dict( bw_report=...), but never overlaid those onto the in-memory Event that flows to db.insert_events(). So the DB row got peak_values from read_blastware_file()._peaks_from_samples() — which runs the still-undecoded waveform body codec assuming raw int16 LE and produces ±32K-shaped noise (= ±10 in/s at Normal range) regardless of the actual signal. The sidecar JSON had the truth but the DB columns (which the webapp queries for fast filter/sort) lied. Bug 2: insert_events' IntegrityError handler only refreshed the filename/filesize/a5_pickle/sidecar columns when a duplicate (serial, timestamp) was seen. Peak values, project info, sample_rate, record_type stayed locked in at whatever the FIRST insert wrote. So even after Bug 1 was fixed, the historical events in the DB (already inserted with broken-codec peaks) would never get their values corrected, because a re-forward would just hit IntegrityError and skip the field refresh. Fix 1 (minimateplus/event_file_io.py + sfm/waveform_store.py): - New apply_report_to_event(event, report) helper folds the BW report's device-authoritative fields onto the Event in-place: per-channel PPV, peak vector sum, mic PSPL→psi, project / client / operator / sensor_location, sample_rate, record_time. - save_imported_bw() calls the helper right after parsing the report. The Event that flows to insert_events() now carries correct values. 
Fix 2 (sfm/database.py): - insert_events()'s IntegrityError UPDATE now refreshes every device-authoritative column from the new data: tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv, project, client, operator, sensor_location, sample_rate, record_type, plus the existing filename/filesize/a5_pickle/sidecar fields. - Preserves: id, waveform_key, session_id, created_at (immutable / FK fields), and false_trigger (operator review state). End-to-end simulation verified: - Step 1: import without report → DB has ±10 in/s peaks, no project - Step 2: re-import WITH report → upsert path fires, DB now has device-authoritative 0.005 in/s peaks + sensor_location - Step 3: operator sets false_trigger=1, re-import again → flag preserved, peaks remain correct For the user's situation: deleting the watcher state file forces a re-forward of all events. Each re-forward now pairs with its _ASCII.TXT, applies the report onto the Event, and the upsert refreshes the DB row. No DB nuke needed. Full SFM suite: 62 passed, 44 skipped.	2026-05-11 05:51:39 +00:00
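Fix 2 can be illustrated with a self-contained `sqlite3` sketch. The table layout, the reduced column list, and both function names are assumptions made for brevity; the behaviour shown (refresh device-authoritative columns on a duplicate `(serial, timestamp)`, leave `false_trigger` alone) follows the description above.

```python
import sqlite3

REFRESH_COLUMNS = ("vert_ppv", "project", "record_type")  # subset, for brevity


def make_db() -> sqlite3.Connection:
    """Create an in-memory events table with the UNIQUE(serial, timestamp) rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, serial TEXT, timestamp TEXT, "
        "vert_ppv REAL, project TEXT, record_type TEXT, "
        "false_trigger INTEGER DEFAULT 0, UNIQUE(serial, timestamp))"
    )
    return conn


def upsert_event(conn: sqlite3.Connection, event: dict) -> None:
    """Insert the event, or refresh device-authoritative columns on a duplicate."""
    cols = ("serial", "timestamp") + REFRESH_COLUMNS
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"INSERT INTO events ({', '.join(cols)}) "
                f"VALUES ({', '.join('?' for _ in cols)})",
                [event.get(c) for c in cols],
            )
    except sqlite3.IntegrityError:
        # Duplicate row: update peaks and metadata, never touch false_trigger.
        assignments = ", ".join(f"{c} = ?" for c in REFRESH_COLUMNS)
        with conn:
            conn.execute(
                f"UPDATE events SET {assignments} WHERE serial = ? AND timestamp = ?",
                [event.get(c) for c in REFRESH_COLUMNS]
                + [event["serial"], event["timestamp"]],
            )
```

Listing the refreshed columns explicitly keeps operator review state out of the update by construction, rather than relying on callers to omit it.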
serversdown	cdfe4ad3c8	feat(import): parse paired BW ASCII reports on /db/import/blastware_file Blastware's ACH writes a per-event ASCII report (.TXT) alongside each event binary, containing the rich derived per-channel fields BW computes (PPV, ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement, Peak Vector Sum + time, sensor self-check Pass/Fail, monitor-log timestamps). None of this lives in the BW binary itself. When the watcher daemon forwards both files to /db/import/blastware_file in one multipart POST, we now: - Pair binaries with their .TXT partners by filename match - Parse the report into a structured BwAsciiReport - Land the rich fields in a new top-level `bw_report` block of the sidecar JSON - Overlay the report's peaks/project_info/timestamp/sample_rate/ record_time/total_samples/pretrig_samples onto the canonical sidecar fields (the report values are device-authoritative; the BW-binary STRT-derived values had bugs like reading the 0x46 record-type marker as rectime) This unblocks the monthly-summary review workflow — events become sortable/filterable by peak, location, project, etc. — without depending on the still-undecoded waveform body codec.	2026-05-08 23:56:43 +00:00
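Pairing by filename match could be sketched as follows. The naming rule used here, `<binary filename>_ASCII.TXT`, is an assumption inferred from the `_ASCII.TXT` suffix mentioned elsewhere in this log; the real endpoint may match on a different part of the name, and `pair_uploads` is a hypothetical helper.

```python
from typing import Dict, Iterable, Optional

REPORT_SUFFIX = "_ASCII.TXT"


def pair_uploads(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each binary filename to its report filename, or None if unpaired."""
    names = list(names)
    # Index reports case-insensitively, keeping the name as uploaded.
    reports = {n.upper(): n for n in names if n.upper().endswith(REPORT_SUFFIX)}
    pairs: Dict[str, Optional[str]] = {}
    for name in names:
        if name.upper().endswith(REPORT_SUFFIX):
            continue  # reports are values, never keys
        pairs[name] = reports.get((name + REPORT_SUFFIX).upper())
    return pairs
```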
serversdown	c641d5fc10	feat: v0.15.0 ### Added - Layered event storage architecture. Each event now lands as four files in the per-serial waveform store, each with a clear role: - `<filename>` — the Blastware-readable binary (BW file). Untouched. - `<filename>.a5.pkl` — the raw 5A frames (regenerative source). - `<filename>.h5` — clean per-channel waveform arrays in physical units (in/s for geo, psi for mic) plus event metadata (HDF5 with gzip compression). This is the canonical format for downstream analysis tools. - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks, project, source provenance, review state, extensions). SQLite (`seismo_relay.db`) is the searchable index over all four. - Plot-ready waveform JSON (`sfm.plot.v1`). The `/device/event/{idx}/waveform` and `/db/events/{id}/waveform.json` endpoints now return samples in physical units with explicit time-axis metadata, peak markers, and per-channel unit hints — no more guessing the ADC-to-velocity scale client-side. The webapp waveform viewer was rewritten to consume this shape. - In-app waveform viewer accuracy fix. The standalone SFM webapp viewer was scaling geophone amplitudes by `geoAdcScale / 32767` (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's in/s per V hardware constant — not the ADC-counts-to-velocity factor. This silently scaled every plot ~38% too low for Normal-range geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for Sensitive). Conversion is now done server-side using the geo_range from compliance config; the client just plots. - New `sfm/event_hdf5.py` module: `write_event_hdf5()`, `read_event_hdf5()`, plus a plot-JSON helper. - Backfill script extended to also emit `.h5` for existing events. ### Dependencies - Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer. - Added `python-multipart>=0.0.7` (required by FastAPI for the `/db/import/blastware_file` endpoint introduced in this release).	2026-05-08 04:39:51 +00:00
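The size of the viewer scaling error described in this changelog entry can be checked with a short calculation. The linear count-to-velocity mapping (`count * full_scale / 32767`) is an assumption inferred from the text, and both function names are hypothetical; the constants 10.0, 1.25, and 6.206053 are quoted from the entry.

```python
FULL_SCALE_IPS = {"Normal": 10.0, "Sensitive": 1.25}  # in/s at full scale
ADC_FULL_SCALE = 32767


def counts_to_ips(count: int, geo_range: str = "Normal") -> float:
    """Convert an ADC count to in/s using the range's full-scale velocity."""
    return count * FULL_SCALE_IPS[geo_range] / ADC_FULL_SCALE


def old_viewer_scale(count: int, geo_adc_scale: float = 6.206053) -> float:
    """The superseded client-side scaling, kept only to show the size of the error."""
    return count * geo_adc_scale / ADC_FULL_SCALE


# Fractional shortfall of the old scaling relative to Normal range.
shortfall = 1.0 - old_viewer_scale(1000) / counts_to_ips(1000)
```

Here `shortfall` works out to 1 - 6.206053/10, which matches the "~38% too low" figure quoted for Normal-range geophones.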

14 Commits