seismo-relay

Author	SHA1	Message	Date
serversdown	35842ac50a	backfill: overlay bw_report onto Event before DB upsert Mirror what the ingest path does: BW's reported peaks (and sample_rate / record_time) take precedence over codec output where present. Without this, --force backfill silently overwrites bw_report-overlaid DB columns with codec-derived peaks. Wrong for events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing PVS=0 on real high-amplitude events. Bit on prod 2026-05-22 with three top-10 waveform events ending up at PVS=0 (rolled back same day, this fix is the proper resolution). New helper minimateplus.event_file_io.apply_bw_report_dict_to_event operates on the projected sidecar dict shape (the structure _bw_report_to_dict produces, which is what gets preserved in the sidecar). Mirrors apply_report_to_event's semantics: only writes fields where bw_report has a non-None value, no-ops cleanly on empty / None input. Dev validation against prod snapshot: pre : 1839.7315 pvs_sum 356 events with DB PVS ≠ sidecar bw_report post : 2016.4902 pvs_sum 2 events still mismatched (both have NULL timestamp + duplicate rows, edge case) Both edge-case events DO get the correct value written by the new backfill — their stale rows from prior backfills remain because UNIQUE(serial, timestamp) doesn't fire on NULL. Separate dedup cleanup needed for those 2 events (0.014% of corpus); not blocking. Backfill remains idempotent + bw_report preservation still passes (0 WIPED, 0 CHANGED on the 3rd consecutive run). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 18:56:22 +00:00
serversdown	fa9d3cdef2	read_blastware_file: leave peak_values=None when samples can't be decoded Fixes a data-loss bug discovered while dry-running the backfill against the prod store. Symptom: every histogram event in the store has its body decoded by read_blastware_file → codec returns None → samples = empty dict → ``ev.peak_values = _peaks_from_samples(empty)`` returns ``PeakValues(0, 0, 0, 0, 0)`` (NOT None). The backfill script's existing "seed from DB row when peak_values is None" branch then correctly skips the seeding, and the all-zeros PeakValues flows into ``db.insert_events()``'s UPSERT path, OVERWRITING the existing good DB peak values for that event (which were populated from the paired BW ASCII report at ingest). Net effect: running the backfill on prod would have wiped the PPV / mic / vector-sum columns for ~10,000 histogram events. Fix: only compute peaks-from-samples when there are actually samples. For events the codec couldn't decode (histogram-mode bodies, until the §7.6.2 histogram codec is wired in), leave peak_values=None as the "we don't know" signal. Downstream consumers: - backfill_sidecars.py — its existing ``if ev.peak_values is None:`` branch (line 243) seeds from the DB row, preserving the real BW-report peaks across the regen. - WaveformStore.save_imported_bw — apply_report_to_event overlays peaks from the paired BW ASCII report when one was uploaded. Histogram imports without a paired report end up with NULL peaks in the DB, which is correct (better than zeros — clearly says "no peak data available" rather than "peaks are exactly zero"). Updated the existing synthetic-event round-trip test to expect peak_values=None for the no-real-body case, which is the truth now. The 7 fixture-corpus regression tests for real BW waveforms continue to pass — those have decodable samples, so peak_values is still populated from the codec output as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:30:53 +00:00
serversdown	31d691b40b	minimateplus: wire read_blastware_file to verified body codec `read_blastware_file()` was still calling `_decode_samples_4ch_int16_le` (the retracted int16-LE-interleaved hypothesis) on the body bytes, producing ±32K noise on every channel of every BW file read from disk. This was the path watcher-forwarded events take into the system (via the import endpoint → save_imported_bw → read_blastware_file, since the watcher doesn't ship A5 frames), so every .h5 sidecar generated for a forwarded event has been wrong since the feature shipped. The fix is mechanical: pass the body bytes straight to `waveform_codec.decode_waveform_v2()` and run the result through `decoded_to_adc_counts()` for the 16x geo scaling. The body already starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`, confirmed by `body[:3].hex()` across all 9 fixture events. No body-slice adjustment needed. If the codec returns None (truncated/malformed file, synthetic test input with no real waveform), fall back to empty channels with a log warning. The rest of the event (timestamp, waveform_key, project strings, sensor_location, peaks-from-samples=0) is still recoverable. Verified against the bundled fixture corpus: V70: Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth within the 0.005 in/s display quantum, every row; 6S0/RG0/AB0/470 (5-8-26): 3328/2304/1280/1280 samples, Vert PPVs match BW's own report within 0.02 in/s; JQ0: 3328 samples, Vert PPV 3.384 vs BW 3.465; SP0/SS0/SV0 (loud events): 3072–3328 samples, known walker tail-truncation 1–7 samples per channel, samples reached are byte-exact. Existing `test_read_blastware_file_round_trip` (synthetic empty event) continues to pass thanks to the None-fallback. Codec verify scripts (`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`) re-run unchanged. Added two regression-lock tests in tests/test_event_file_io.py: - test_read_blastware_file_decodes_via_codec[6 fixtures]: verifies sample count + Vert PPV per fixture - test_read_blastware_file_v70_samples_match_txt_truth: verifies every one of V70's 3328 sample-sets across Tran/Vert/Long matches the .TXT ground truth row-by-row within 0.003 in/s Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 18:13:24 +00:00
serversdown	082e5946bc	fix(import): resolve real serial from BW filename instead of bucketing to UNKNOWN The /db/import/blastware_file endpoint was bucketing every forwarded event into serial='UNKNOWN' in the DB. WaveformStore correctly decoded the serial from the BW filename and saved files to <store>/<serial>/<filename> (e.g. .../BE17353/S353L5KC.DR0H.h5), but the endpoint code called db.insert_events(serial=_serial_from_event(ev)) — and _serial_from_event was a stub that always returned None, falling back to "UNKNOWN". Effect on the user's prod server: 3,039 events forwarded across 24 distinct units, ALL inserted under serial='UNKNOWN'. The on-disk waveform store + sidecars + HDF5s were fine, but the SFM webapp's /db/units only showed the two original manually-uploaded serials because every forwarded row had its serial column zeroed to UNKNOWN. Fix: - WaveformStore.save_imported_bw() now surfaces the decoded serial on the returned `rec` dict (rec["serial"]). - The import endpoint uses rec["serial"] as the authoritative fallback when the operator hasn't supplied a serial_hint query parameter. Order of precedence: query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN" - Response payload now includes `serial` per file so the watcher log lines (or any future caller) can see which unit each event was attributed to. Recovery for existing DB rows: scripts/repair_unknown_serials.py walks the events table looking for rows with serial='UNKNOWN' and re-attributes each one to the serial decoded from blastware_filename. Updates the row in place unless the target (serial, timestamp) already has a row, in which case the UNKNOWN duplicate is deleted. Idempotent. Default dry-run; pass --apply to commit. Verified on the user's actual DB (dry-run): UNKNOWN rows scanned: 3039; Updated to real serial: 2602; Deleted (duplicate of an already-correct row): 437; Unresolved (bad filename): 0. After running the repair, /db/units will show all 24 units correctly populated.	2026-05-11 02:25:08 +00:00
serversdown	cdfe4ad3c8	feat(import): parse paired BW ASCII reports on /db/import/blastware_file Blastware's ACH writes a per-event ASCII report (.TXT) alongside each event binary, containing the rich derived per-channel fields BW computes (PPV, ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement, Peak Vector Sum + time, sensor self-check Pass/Fail, monitor-log timestamps). None of this lives in the BW binary itself. When the watcher daemon forwards both files to /db/import/blastware_file in one multipart POST, we now: - Pair binaries with their .TXT partners by filename match - Parse the report into a structured BwAsciiReport - Land the rich fields in a new top-level `bw_report` block of the sidecar JSON - Overlay the report's peaks/project_info/timestamp/sample_rate/record_time/total_samples/pretrig_samples onto the canonical sidecar fields (the report values are device-authoritative; the BW-binary STRT-derived values had bugs like reading the 0x46 record-type marker as rectime) This unblocks the monthly-summary review workflow (events become sortable/filterable by peak, location, project, etc.) without depending on the still-undecoded waveform body codec.	2026-05-08 23:56:43 +00:00
serversdown	c641d5fc10	feat: v0.15.0 ### Added - Layered event storage architecture. Each event now lands as four files in the per-serial waveform store, each with a clear role: - `<filename>` — the Blastware-readable binary (BW file). Untouched. - `<filename>.a5.pkl` — the raw 5A frames (regenerative source). - `<filename>.h5` — clean per-channel waveform arrays in physical units (in/s for geo, psi for mic) plus event metadata (HDF5 with gzip compression). This is the canonical format for downstream analysis tools. - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks, project, source provenance, review state, extensions). SQLite (`seismo_relay.db`) is the searchable index over all four. - Plot-ready waveform JSON (`sfm.plot.v1`). The `/device/event/{idx}/waveform` and `/db/events/{id}/waveform.json` endpoints now return samples in physical units with explicit time-axis metadata, peak markers, and per-channel unit hints — no more guessing the ADC-to-velocity scale client-side. The webapp waveform viewer was rewritten to consume this shape. - In-app waveform viewer accuracy fix. The standalone SFM webapp viewer was scaling geophone amplitudes by `geoAdcScale / 32767` (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's in/s per V hardware constant — not the ADC-counts-to-velocity factor. This silently scaled every plot ~38% too low for Normal-range geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for Sensitive). Conversion is now done server-side using the geo_range from compliance config; the client just plots. - New `sfm/event_hdf5.py` module: `write_event_hdf5()`, `read_event_hdf5()`, plus a plot-JSON helper. - Backfill script extended to also emit `.h5` for existing events. ### Dependencies - Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer. - Added `python-multipart>=0.0.7` (required by FastAPI for the `/db/import/blastware_file` endpoint introduced in this release).	2026-05-08 04:39:51 +00:00
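
The overlay semantics described in commit 35842ac50a (write a field only where bw_report has a non-None value, and no-op on empty or None input) can be sketched as below. This is an illustration, not the actual `apply_bw_report_dict_to_event`: the `Event` dataclass, the `OVERLAY_FIELDS` tuple, and the `overlay_report` name are hypothetical stand-ins, and the real sidecar dict shape may differ.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Event:
    # Hypothetical stand-in for the real Event model; field names are assumed.
    pvs: Optional[float] = None
    sample_rate: Optional[int] = None
    record_time: Optional[float] = None


# Assumed set of fields the report is allowed to override.
OVERLAY_FIELDS = ("pvs", "sample_rate", "record_time")


def overlay_report(event: Event, report: Optional[Mapping[str, Any]]) -> Event:
    """Copy non-None report values onto the event; leave everything else alone."""
    if not report:
        # Empty dict or None: nothing to overlay, codec-derived values stay.
        return event
    for name in OVERLAY_FIELDS:
        value = report.get(name)
        if value is not None:
            setattr(event, name, value)
    return event
```

With this shape, a codec-derived `pvs` of 0.0 is replaced when the report carries a real peak, while a field the report leaves as None keeps whatever the codec produced.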

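The fix in commit fa9d3cdef2 hinges on distinguishing "no samples decoded" from "peaks are exactly zero". A minimal sketch of that guard follows, assuming samples arrive as a dict of per-channel lists; the `peaks_from_samples` name and the dict-based return value are hypothetical and do not mirror the real `PeakValues` type.

```python
from typing import Dict, List, Optional


def peaks_from_samples(samples: Dict[str, List[float]]) -> Optional[Dict[str, float]]:
    """Return per-channel absolute peaks, or None when nothing was decoded."""
    if not any(samples.values()):
        # No channel has any samples: signal "unknown" rather than all zeros,
        # so a downstream upsert does not overwrite good peaks with 0.
        return None
    return {
        channel: max((abs(v) for v in values), default=0.0)
        for channel, values in samples.items()
    }
```

A caller can then branch on `is None` to seed peaks from another source, for example an existing DB row, instead of persisting zeros.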
6 Commits
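
The serial precedence stated in commit 082e5946bc (query string `serial`, then rec["serial"], then `_serial_from_event(ev)`, then "UNKNOWN") amounts to a first-non-empty lookup. The sketch below assumes each source is passed in as an optional string; the `resolve_serial` function and its parameter names are hypothetical and are not taken from the endpoint code.

```python
from typing import Optional


def resolve_serial(
    query_serial: Optional[str],
    rec_serial: Optional[str],
    event_serial: Optional[str],
) -> str:
    """Pick the first non-empty serial in precedence order, else 'UNKNOWN'."""
    for candidate in (query_serial, rec_serial, event_serial):
        if candidate:
            return candidate
    return "UNKNOWN"
```

For example, `resolve_serial(None, "BE17353", None)` yields the filename-derived serial, and the "UNKNOWN" bucket is reached only when all three sources are empty.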