seismo-relay

Author	SHA1	Message	Date
serversdown	a3cc44d30a	feat(backfill): --reparse-txt flag to refresh bw_report from preserved .TXT The existing backfill_sidecars.py PRESERVES the bw_report block across regenerations: it's treated as the source of truth from the original ingest pass (the .TXT isn't reachable from the script's normal data path, so it can't be re-derived). That means parser-side fixes (like the 2026-05-28 ">100 Hz" ZC Freq addition) won't reach old events even with --force. The new --reparse-txt flag fixes that: when the sidecar's source.txt_filename points at a preserved <serial>/<filename>_ASCII.TXT, the script re-runs the current parser against it and overwrites the bw_report block. Implies sidecar regeneration on every event (bypasses the sha-up-to-date / version-up-to-date skip), so that the .h5 cascade-regenerates alongside. No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27). Idempotent: re-running it produces the same sidecar bytes when the parser hasn't changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:56:23 +00:00
serversdown	6a73523e4d	ui: surface per-channel ZC Freq (and ">100") in event modals The PDF report shows per-channel ZC Freq alongside PPV in the stats block, but neither modal exposed it. Now that the sidecar projection carries zc_freq_hz + zc_freq_above_range, plumb them through: - sfm_webapp.html: inline suffix on existing Peaks cells, e.g. "Tran 0.04500 in/s · >100 Hz". Empty suffix when no ZC is available (legacy events without a preserved .TXT). - event_browser.html: new ZC Freq column on the per-channel stats table. Required adding a parallel sidecar fetch in loadEvent() (waveform.json alone doesn't carry bw_report). Fetch failure is non-fatal — falls back to "—" in the new column. Above-range ZC peaks (BW ">100 Hz") render with a literal ">" prefix mirroring the PDF, so operators don't have to generate the PDF to see when a channel hit the zero-crossing ceiling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:47:37 +00:00
serversdown	780b45a371	feat: render ">100" for above-range ZC Freq instead of "—" BW writes ">100 Hz" for ZC Freq when the zero-crossing algorithm sees a peak too fast to count — the device's reporting ceiling is 100 Hz on V10.72. Our parser fell back to None via _parse_number (which requires a leading digit), so the PDF rendered "—" where BW shows ">100". Mirrors the OORANGE/saturated pattern already used for PPV and PSPL: parser stores the threshold (100.0) on zc_freq_hz + sets a new zc_freq_above_range flag. Projection carries the flag through to the sidecar; PDF renderer prepends ">" when set. Affects both per-channel stats tables (waveform + histogram variants) and the mic block's ZC Freq row. Verified on the real T190LD5Q.LK0W fixture: Tran zc_freq_hz=100.0 above_range=True; Vert/Long (normal values) above_range=False; "N/A" still produces zc_freq_hz=None which renders as "—" (unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:38:49 +00:00
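The above-range handling in this commit can be sketched roughly as follows. This is an illustration only: `parse_zc_freq` and `render_zc_freq` are hypothetical names (the log names only `_parse_number`), and the in-range field format `43 Hz` is an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical helpers, not the repository's actual code.
_ABOVE = re.compile(r"^\s*>\s*(\d+(?:\.\d+)?)")   # e.g. ">100 Hz"
_PLAIN = re.compile(r"^\s*(\d+(?:\.\d+)?)")       # e.g. "43 Hz" (assumed format)


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for one ZC Freq field."""
    m = _ABOVE.match(text)
    if m:
        # Store the threshold itself and flag it as a lower bound.
        return float(m.group(1)), True
    m = _PLAIN.match(text)
    if m:
        return float(m.group(1)), False
    return None, False  # "N/A" and anything else unparseable


def render_zc_freq(value: Optional[float], above_range: bool) -> str:
    """Format for display: '>100' when flagged, a dash when missing."""
    if value is None:
        return "\u2014"
    return f">{value:g}" if above_range else f"{value:g}"
```

Keeping the threshold as a number plus a separate flag lets filters and sorts still use the value while the renderer decides how to mark it.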
serversdown	f6abe3caa0	fix(report_pdf): histogram geo channels share nice-quantized y-axis Two related visual bugs on histogram PDFs: 1. Per-channel auto-scale meant Tran/Vert/Long had different y-axes (e.g. 0-0.015, 0-0.025, 0-0.020) — bars looked taller on the channel that happened to be quietest. Not directly comparable. 2. Footer "Amplitude Geo: X in/s/div" was just amax/5 of the FIRST geo channel with data, with no LSB quantization — producing nonsense like 0.003 in/s/div when the geophone LSB is 0.005. Fix: compute a single shared geo y-axis range from max(Tran,Vert,Long), quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same ylim + ticks to all three geo subplots, and use that same step for the footer label. MicL stays on its own auto-scale (different units). Verified across edge cases including the reported event (geo max 0.025 → 0.005/div, top 0.025), small PVS events, and large blast amplitudes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:22:20 +00:00
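One way to get the step values listed in this commit (0.005, 0.01, 0.025, 0.05, ...) is to run the 1-2-5 sequence over whole LSB counts. The sketch below assumes that interpretation and assumes five divisions per axis (suggested by the old "amax/5" logic); `shared_geo_step` is a hypothetical name, not the function in report_pdf.py.

```python
import math

GEO_LSB = 0.005      # in/s, geophone LSB quoted in the commit message
N_DIVISIONS = 5      # assumption, taken from the old "amax/5" footer logic


def shared_geo_step(peaks):
    """Return (step_per_div, y_top) shared by all geo channels."""
    amax = max(peaks) if peaks else 0.0
    # Work in whole LSB counts to avoid float drift in comparisons.
    need_lsb = math.ceil(round(amax / GEO_LSB, 6))
    per_div = max(1, math.ceil(need_lsb / N_DIVISIONS))
    exp = 0
    while True:
        # 1-2-5 sequence over LSB counts: 1, 2, 5, 10, 20, 50, ...
        for mult in (1, 2, 5):
            count = mult * 10 ** exp
            if count >= per_div:
                step = count * GEO_LSB
                return step, step * N_DIVISIONS
        exp += 1
```

For the event reported in the commit (geo max 0.025 in/s) this yields 0.005 in/s/div with a top of 0.025; the same `(step, top)` pair would then be applied to Tran, Vert and Long and reused for the footer label.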
serversdown	ad2702d4bf	fix(report_pdf): add missing histogram_interval_size_s field The histogram-interval-times derivation block at line 314 references rd.histogram_interval_size_s, but the field wasn't declared on the ReportData dataclass — only the string form histogram_interval_size was. Result: every PDF render of a histogram event raised AttributeError → 500 from /db/events/{id}/report.pdf. Cause: when the histogram aggregation block was inlined into gather_report_data, the seconds-numeric counterpart that the projection already carries (bw_report.histogram.interval_size_s) was never wired into the dataclass. Waveform PDFs weren't affected because the offending line is gated on is_histogram. Fix: add the field, read it from the projection alongside the other histogram keys. No-op for waveform events (the field stays None and the gate skips it). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:07:41 +00:00
serversdown	86325b9bab	docs: roadmap entry for a SECOND undecoded histogram sub-format (S353) Observed in fresh ingest logs on 2026-05-28: BE17353 events (S353L4H2.FZ0H, S353L4H2.P00H, etc.) cause "body codec failed to decode" warnings. Different from the byte[5]!=0 case already tracked (T190 / O121) — these have byte[5]==0x00 with what looks like a valid block header, but the walker finds zero data blocks anyway. Operational impact identical to the existing case: ingestion succeeds, DB peaks come from bw_report overlay, only the chart is empty. No data loss. Pinning so it doesn't get lost — needs a hex dump of one body to work out what's different about these. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 05:42:18 +00:00
serversdown	6381dcb312	tz: server-wide display timezone via TZ env var (default EST/EDT) User-reported issue: server logs were timestamped in UTC ("05:36:20" when local was ~01:36 EDT), and the PDF report's "Created" footer similarly showed raw UTC. Inconsistent with the modal which already converts to browser local via toLocaleString. Solution: standard Linux TZ env var. Set once in the container, and: - Python's datetime.now() uses local - Logging module's timestamps use local - matplotlib renderers + report_pdf formatters use local - astimezone() conversions resolve to the configured TZ DB columns stay UTC (created_at uses SQLite's strftime('%Y-...Z', 'now') which is always UTC, regardless of TZ env var — proper "store UTC, display local" pattern). Changes: - Dockerfile: install tzdata (python:3.11-slim omits the timezone database), set default TZ=America/New_York - sfm/report_pdf.py: _fmt_iso_to_bw and _split_iso_to_date_time now convert UTC inputs (Z-suffixed) to local via astimezone(); naïve inputs (BW recorded-at, already unit-local) returned as-is. New _to_display_local helper centralizes the logic. - "Created" line in the PDF page footer now uses the converted timestamp. Override per-deployment via the TZ env var in docker-compose (separate commit on terra-view side). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 05:41:10 +00:00
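The "store UTC, display local" rule described here might look like the sketch below. `to_display_local` is a hypothetical stand-in for the `_to_display_local` helper named in the commit, whose real signature is not shown; the optional `tz` parameter is an assumption added so the behaviour can be checked without depending on the host's TZ setting.

```python
from datetime import datetime, timedelta, timezone


def to_display_local(iso: str, tz=None) -> datetime:
    """Convert Z-suffixed UTC strings to local time; pass naive ones through.

    tz=None lets astimezone() fall back to the system local zone, which is
    what the TZ environment variable controls on Linux.
    """
    if iso.endswith("Z"):
        utc = datetime.fromisoformat(iso[:-1]).replace(tzinfo=timezone.utc)
        return utc.astimezone(tz)
    # Naive input: already unit-local wall-clock time, so leave it alone.
    return datetime.fromisoformat(iso)
```

With a fixed UTC-4 offset standing in for EDT, `"2026-05-28T05:36:20Z"` comes out as 01:36:20, which matches the mismatch reported in the commit message.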
serversdown	53c05d93e2	delete: also clean up preserved _ASCII.TXT file _cleanup_event_files() removes the on-disk artifacts when an event is hard-deleted (binary, a5_pickle, sidecar, h5). Today's .TXT preservation feature added a new on-disk file (_ASCII.TXT next to the binary) but the cleanup didn't know about it — so any event deleted via /db/events/{id} (single) or /db/events/delete_bulk (or the Terra-View "SFM Event DB Manager" UI which proxies through to those endpoints) was leaving orphan .TXT files in the store. Added "txt" to the cleanup list using the new WaveformStore.txt_path_for(). Safe for old events without a .TXT — the exists() check skips the unlink. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 05:31:08 +00:00
serversdown	a5888e1b5c	report_pdf: PDF histogram aggregation + fix footer/x-axis overlap Two issues spotted on a histogram event PDF: 1. Footer scale ("Time — /div Amplitude Geo: X in/s/div Mic: Y psi(L)/div") was overlapping horizontally with the x-axis tick labels (0, 20, 40, 60...). Both rendered on the same Y row. Fix: bumped gridspec bottom margin from 0.06 → 0.12, moved the footer text from y=0.045 → y=0.030 (below the tick labels), moved the page-bottom Created/Event line from y=0.015 → y=0.005. Trigger legend on waveforms moved 0.030 → 0.018. Everything stacks cleanly now without collision. 2. PDF was showing the raw codec output (~150+ bars per histogram) instead of BW's per-interval aggregation. Why: the aggregation I'd added to /db/events/{id}/waveform.json wasn't replicated in the PDF gather path. Now: gather_report_data does the same max-per-group aggregation when bw_report.histogram.n_intervals is populated, AND derives per-interval HH:MM:SS labels from the start time + interval_size_s. Result: histogram PDFs now match BW's display (one bar per BW interval, x-axis labeled with actual times) — same fix as the modal chart, applied to the PDF. For events ingested BEFORE the parser extension (no histogram block in their sidecar), aggregation is a no-op — they still render with per-block bars + interval-index x-axis (but the overlap fix applies to them too). Re-forwarding repopulates the histogram block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 04:33:53 +00:00
serversdown	b9f8bbb220	viewers: enforce minimum Y-range on histogram channels Quiet histogram events were filling the chart panel even though the peak was tiny (0.005 in/s rendered as 90% of chart height because Chart.js auto-scaled to peak * 1.1). Made everything look uniformly loud regardless of actual amplitude. BW's solution: a near-fixed scale per channel ("Geo: 0.002 in/s/div" from the footer). Quiet events render small, loud events render proportionally tall. Match the intent without copying BW's "no Y-axis labels at all" convention. For histogram channels: Geo (in/s): min Y range 0.05 in/s Mic in psi: min Y range 0.001 psi Mic in dBL: unchanged (the 60 dBL floor + peak+5 top already gives quiet events a sensible baseline) So a 0.005 in/s geo event renders as ~10% of chart height; a 0.05 event fills it; a 5.0 event still fills it (max(peak * 1.1, 0.05) == peak * 1.1 for any peak > 0.045). Waveform charts unchanged: they should zoom for shape detail. Applied to both the modal in sfm_webapp.html and the standalone /events page in event_browser.html. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 04:23:01 +00:00
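The minimum-range rule reduces to a single `max()`. The viewers are JavaScript; the Python sketch below only illustrates the arithmetic, and the constant and function names are hypothetical.

```python
GEO_MIN_RANGE_IPS = 0.05    # in/s, from the commit message
MIC_MIN_RANGE_PSI = 0.001   # psi, from the commit message


def histogram_y_top(peak: float, min_range: float) -> float:
    """Top of the Y axis: the usual 10% headroom, but never below min_range."""
    return max(peak * 1.1, min_range)


def fill_fraction(peak: float, min_range: float) -> float:
    """Fraction of the chart height that the tallest bar occupies."""
    return peak / histogram_y_top(peak, min_range)
```

A 0.005 in/s peak then occupies a tenth of the panel, while any peak above roughly 0.045 in/s falls back to the ordinary auto-scale.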
serversdown	b59f886cb7	docs: roadmap entry for sensor-check waveform extraction BW's Event Report PDFs include a per-channel sensor-check response waveform on the right side of the bottom plot (damped sinusoid for geo channels, sawtooth-at-test-freq for mic). Looks like real per-sample data extracted from the binary, not synthesized. Our parser captures the test results (freq, ratio, amplitude, pass/fail) but not the waveform samples — so the report shows text only for sensor check. Pinning a roadmap entry to investigate the binary for the sample data (path a) or fall back to synthesized visualization (path b). Current text-only display is operationally sufficient. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 04:17:50 +00:00
serversdown	87aec3f4d1	viewers: smoother mic dBL chart + restore binary/TXT download links Two issues spotted in the modal: 1. Mic dBL chart looked spikey/discontinuous — isolated bars at 80-95 with gaps in between. Cause: _psiToDbl() returns null for zero or negative samples, and most mic samples on a quiet event sit at the digitization noise floor where they're effectively zero. Result: the chart only renders the moments when instantaneous SPL exceeded the Y-axis bottom — looks like a sound trigger gate. Fix: new _psiToDblForChart() rectifies the AC waveform (abs), then converts to dBL, then floors at MIC_DBL_FLOOR=60 dBL. Chart now has a continuous 60 dBL baseline with peaks above it — matches how acoustic engineers expect SPL-vs-time. Y-axis bottom pinned to MIC_DBL_FLOOR, top to peak + 5 dB headroom. Peak label still uses the unrectified _psiToDbl so the displayed peak value is exact. 2. Filename in Source/Files block was unlinked. Endpoint exists (/db/events/{id}/blastware_file) — just wasn't wired to the modal. Made it a clickable download link. Same treatment for the preserved .TXT — added "(download .TXT)" link next to source kind when source.txt_filename is populated (events ingested after the .TXT preservation feature landed; older events show no link). Applied to both the inline modal in sfm_webapp.html and the standalone /events page in event_browser.html. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 23:08:21 +00:00
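The rectify-then-floor conversion described in item 1 could be sketched as below. This is a Python illustration of the arithmetic only (the real `_psiToDblForChart` lives in the JavaScript viewers); the 2.9e-9 psi reference is the one quoted in the mic-unit commit further down this log, and `chart_y_range` is a hypothetical helper.

```python
import math

PSI_REF = 2.9e-9        # psi, 20 uPa reference quoted elsewhere in this log
MIC_DBL_FLOOR = 60.0    # dBL, chart baseline from the commit message


def psi_to_dbl_for_chart(psi: float) -> float:
    """Rectify the AC sample, convert to dBL, and clamp to the floor."""
    rectified = abs(psi)
    if rectified == 0.0:
        return MIC_DBL_FLOOR  # log10(0) is undefined; sit on the baseline
    return max(MIC_DBL_FLOOR, 20.0 * math.log10(rectified / PSI_REF))


def chart_y_range(samples_psi):
    """Y axis: bottom pinned to the floor, top at peak + 5 dB headroom."""
    values = [psi_to_dbl_for_chart(s) for s in samples_psi]
    return MIC_DBL_FLOOR, max(values, default=MIC_DBL_FLOOR) + 5.0
```

Because every sample maps to a number at or above the floor, the chart has no gaps; the exact peak label would still come from the unrectified conversion, as the commit notes.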
serversdown	ace542cba5	report_pdf: wire histogram peak date/time + PVS-when + Finish field Spotted comparing our PDF to BW's reference for T003LLUB.CE0H: - Finish blank - Per-channel Date / Time rows all dashes - MicL PSPL line missing "on May 27, 2026 at 06:19:14" - Peak Vector Sum missing "on May 27, 2026 At 06:06:14" Root cause: I'd added these fields to the projection (write side) in _bw_report_to_dict but never wired them into gather_report_data (read side). Plus the projection used keys "start"/"stop" while gather was reading "start_str"/"stop_str" — typo'd lookup. Fixes: - gather_report_data now reads bw_report.histogram.start / .stop / .channel_peak_when (correct keys, matching the projection) - Per-channel "peak_date" / "peak_time" populated from channel_peak_when[<channel>] for the histogram stats table - MicL PSPL line formats as "PSPL 125.7 dB(L) on May 27, 2026 at 06:19:14" (BW style) when channel_peak_when["MicL"] is present; falls back to the waveform-relative "at 0.012 sec" otherwise - PVS line formats as "Peak Vector Sum 0.091 in/s on May 27, 2026 At 06:06:14" (BW style) when bw_report.peaks.vector_sum.when is populated; falls back to the relative time_s for waveforms - New _split_iso_to_date_time() helper splits ISO timestamps into BW-formatted ("May 27 /26", "06:06:14") date+time pairs for the stats table's separate Date and Time rows Events ingested BEFORE the parser extension landed (most of the existing prod corpus) still show dashes — their sidecars lack the histogram block. Re-forwarding repopulates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 22:47:53 +00:00
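The two timestamp formats quoted in this commit can be produced from an ISO string as in the sketch below. The function names are hypothetical (the commit only names `_split_iso_to_date_time`, without its body), and month names assume the default C/English locale.

```python
from datetime import datetime


def split_iso_to_date_time(iso: str):
    """Split an ISO timestamp into BW-style ("May 27 /26", "06:06:14")."""
    dt = datetime.fromisoformat(iso)
    return f"{dt:%b} {dt.day} /{dt:%y}", f"{dt:%H:%M:%S}"


def peak_when_phrase(iso: str) -> str:
    """Build the "on May 27, 2026 at 06:19:14" suffix for the PSPL line."""
    dt = datetime.fromisoformat(iso)
    return f"on {dt:%B} {dt.day}, {dt.year} at {dt:%H:%M:%S}"
```

Using `dt.day` rather than `%d` avoids a zero-padded day, which matches the examples in the commit message.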
serversdown	8cbda09917	viewers: render timestamps in browser-local time Spotted on the SFM webapp event modal — "Received by server at" was showing the raw ISO string "2026-05-27T21:59:57.213043Z" because we were assigning ev.timestamp / src.captured_at directly to the textContent of the modal fields, bypassing the existing _fmtTs() helper that wraps them in toLocaleString(). Net effect for operators: confusing "21:59 vs it's 6 PM" mismatch when the displayed UTC timestamp didn't match wall-clock time. The values were always correct; the display was just ambiguous. After this fix: - "Recorded at" (naive ISO from BW = unit local time) renders cleanly as the unit wrote it: "5/27/2026, 6:00:13 AM" - "Received by server at" (UTC with Z suffix) converts to browser local: "5/27/2026, 5:59:57 PM" - Timestamp column in the history table already used _fmtTs — unchanged - Same fix applied to the standalone /events page (sidebar event list + meta header) via a new _fmtTsLocal helper Note: did NOT add file-mtime-on-watcher-PC tracking as a separate "Called in at" column — discussed and decided created_at is close enough for schedule-compliance monitoring (worst case lag = watcher poll interval ~60s, indistinguishable from BW write time at the operationally-relevant resolution). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 22:30:43 +00:00
serversdown	3457ed0072	bw_ascii_report: parse OORANGE saturation marker + TimeSum typo BW writes "OORANGE" (truncation of "Out Of Range") when a channel exceeds its full-scale, and uses a typo'd label "Peak Vector Sum TimeSum" for the PVS time field. Both confirmed against real ASCII files pulled from a Windows watcher PC 2026-05-27: T190LD5Q.LK0W Vert PPV = OORANGE (Normal range, 10 in/s exceeded) T438L713.RY0W All three PPVs OORANGE (Sensitive range, 1.25 in/s) K557L3YM.OE0W Tran+Vert PPV OORANGE + MicL PSPL OORANGE Previously our _parse_number() returned None for OORANGE → DB columns ended up NULL → events vanished from filters / sorts / dashboards despite being legitimate high-amplitude events. New behavior — substitute a conservative bound + set a saturation flag: - Channel PPV → geo_range_ips + ChannelStats.ppv_saturated - Peak Vector Sum → sqrt(3) * geo_range_ips + peak_vector_sum_saturated - MicL PSPL → 140 dB(L) + MicStats.pspl_saturated Flags propagate to the sidecar's bw_report block so the SFM UI can render "> 10 in/s" / "> 140 dBL" rather than treating the substituted value as exact. Same commit also accepts "Peak Vector Sum TimeSum" as an alias for "Peak Vector Sum Time" (BW always writes the typo on OORANGE PVS lines — every example file confirms it). Tests: new test_oorange_marker_treated_as_saturation (synthetic) + test_real_oorange_event_t190_parses (skips if real fixture absent). 177/177 tests pass; 16 pre-existing missing-fixture skips unchanged. Five events on prod (T190, T438, K557, plus 2 others matching the same fault pattern) will pick up correct peaks + saturation flags once watchers re-forward. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:32:56 +00:00
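A rough sketch of the substitution rule follows. `parse_ppv` and `saturated_vector_sum` are hypothetical names, and the in-range field format `0.045 in/s` is an assumption. The sqrt(3) factor is the vector sum of three orthogonal channels that each sit at full scale, sqrt(r^2 + r^2 + r^2).

```python
import math

MIC_SATURATION_DBL = 140.0  # dB(L) bound substituted for a saturated PSPL


def parse_ppv(text: str, geo_range_ips: float):
    """Return (ppv, ppv_saturated) for one channel's PPV field."""
    token = text.strip()
    if token.upper() == "OORANGE":
        # Saturated: substitute the full-scale range as a conservative bound.
        return geo_range_ips, True
    try:
        return float(token.split()[0]), False
    except (IndexError, ValueError):
        return None, False


def saturated_vector_sum(geo_range_ips: float) -> float:
    """Bound for a saturated Peak Vector Sum: all three axes at full scale."""
    return math.sqrt(3.0) * geo_range_ips
```

Storing a bound plus a flag keeps the event visible to filters and sorts, while the UI can still render it as "> 10 in/s" instead of an exact reading.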
serversdown	d21e3b5298	histogram aggregation + parser extension for BW interval fields Three layered changes that together make histogram charts visually match BW's printout (one bar per interval, not per codec block): 1. bw_ascii_report parser captures histogram fields it previously dropped: - Histogram Start/Stop Time + Date → datetime - Number of Intervals + Interval Size (string + parsed seconds) - <Channel> Peak Time + Peak Date → datetime (per-channel) - Peak Vector Sum Date (combined with PVS Time → datetime; clears the bogus seconds parse that interpreted "22:33:52" as 22.0) New _parse_iso_date() handles BW's ISO format for histograms (waveforms use "May 8, 2026" long form). New _parse_interval_size() handles "1 minute" / "5 minutes" / "15 seconds" etc. 2. _bw_report_to_dict() projects the new fields into a new bw_report.histogram block in the sidecar. 3. /db/events/{id}/waveform.json wraps the existing path 1 (HDF5) output with _maybe_aggregate_histogram(): when the event is a histogram AND the sidecar has bw_report.histogram.n_intervals, group the codec's per-block samples into N intervals via max-per-group and return the aggregated array. time_axis gains histogram_aggregated / n_intervals / interval_size_s / interval_times fields. Frontend (both modal chart in sfm_webapp.html + standalone event browser) uses interval_times as x-axis labels when provided (BW-style HH:MM:SS), falls back to interval index. Defensive: aggregation is no-op when the sidecar lacks the histogram block (events ingested before this change). Activates automatically on prod once a watcher re-forward populates new sidecars. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:23:05 +00:00
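The three pieces described above (interval-size parsing, max-per-group aggregation, and interval time labels) could look roughly like this. All names are hypothetical. How the real code splits blocks that do not divide evenly into N groups is not stated, so near-equal contiguous chunks are an assumption, as are labelling each interval by its start time and supporting "hour" as a unit.

```python
from datetime import datetime, timedelta

_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}  # "hour" is assumed


def parse_interval_size(text: str):
    """Parse "1 minute" / "5 minutes" / "15 seconds" into seconds."""
    parts = text.strip().lower().split()
    if len(parts) != 2:
        return None
    try:
        count = float(parts[0])
    except ValueError:
        return None
    unit = _UNIT_SECONDS.get(parts[1].rstrip("s"))
    return None if unit is None else count * unit


def aggregate_max(samples, n_intervals: int):
    """Collapse per-block samples into n_intervals bars via max-per-group."""
    if not samples or n_intervals <= 0:
        return list(samples)  # no histogram block: leave data untouched
    n = len(samples)
    out = []
    for i in range(n_intervals):
        group = samples[i * n // n_intervals:(i + 1) * n // n_intervals]
        out.append(max(group) if group else 0.0)
    return out


def interval_labels(start: datetime, interval_size_s: float, n_intervals: int):
    """HH:MM:SS label for the start of each interval."""
    return [
        (start + timedelta(seconds=i * interval_size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]
```

Returning the input unchanged when `n_intervals` is missing mirrors the defensive no-op the commit describes for events ingested before the parser extension.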
serversdown	ad2b553c7b	ingest: preserve raw BW ASCII report (.TXT) alongside the binary Previously the .TXT was parsed into the sidecar's bw_report projection and then discarded at ingest time. Now save_imported_bw() writes it to <store>/<serial>/<filename>_ASCII.TXT permanently. Rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF those tools produce won't be available — the binary + .TXT (created by BW ACH itself) are our only authoritative inputs going forward. Keeping the raw .TXT unlocks: - Parser bug fixes can be applied RETROACTIVELY by re-parsing the stored .TXT, instead of requiring a re-forward from the watcher PC (which lost the .TXT after BW ACH cleanup). - Audit trail of what BW actually sent us, for debugging. - The five known parser-PPV-miss events will be re-parseable once the regex fix lands (instead of staying broken indefinitely). Storage cost: ~15 KB per event × 14k events = ~210 MB on the existing prod corpus. Negligible. Implementation: - WaveformStore gains txt_path_for() + open_txt() - save_imported_bw() writes the .TXT when bw_report_text is supplied - sidecar source block records the txt_filename - backfill_sidecars.py preserves txt_filename across regens - New GET /db/events/{id}/ascii_report.txt endpoint serves it - Returns 404 for events ingested before this change (no .TXT in the store yet) — re-forward to populate Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 20:01:12 +00:00
serversdown	dfbc8b8520	report_pdf: split waveform vs histogram layouts (BW PDF iteration) Reviewed against real Blastware Event Report PDFs (uploaded to example-events/pdfsnstuff/) for K558LLB7.V20H (histogram) and K558LLB8.0E0W (waveform). Each event type has its own layout because BW's printouts genuinely differ: Waveform header: Date/Time, Trigger Source, Range, Sample Rate Histogram header: Start, Finish, Intervals At Size, Range, Sample Rate (no trigger field — histograms aren't triggered) Waveform stats: PPV, ZC Freq, Time (Rel. to Trig), Peak Acceleration, Peak Displacement, Sensor Check Histogram stats: PPV, ZC Freq, Date, Time (of peak), Sensor Check Waveform plot: 4-channel stacked line, x-axis in SECONDS, trigger triangle + window markers, symmetric Y for geo, zero-anchored mic, "0.0" baseline label on right edge per BW convention Histogram plot: 4-channel stacked bars, Y-axis 0-to-peak only (never negative — peaks are magnitudes), 0.0 baseline at the bottom Waveform footer: USBM chart placeholder upper-right; "Time X sec/div Amplitude Geo: Y in/s/div Mic: 0.001 psi(L)/div" "Trigger = ▶━━◀" Histogram footer: No USBM chart; same scale-info footer with interval-size as the time unit Other fixes from the first-pass screenshot review: - Channel labels (MicL/Long/Vert/Tran) no longer cut off (wider left margin) - Histogram bars rise from zero baseline (abs of any signed values) - ISO timestamp "2026-05-16T22:33:50" → "22:33:50 May 16, 2026" matching BW's display format Known gaps (separate work): - Histogram codec returns per-block granularity (~200 bars for BW's 4-interval display). XML-driven data source is the planned fix; the structured BW XML has the per-interval aggregates. - USBM RI8507 / OSMRE compliance chart still placeholder Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 18:22:03 +00:00
serversdown	411ef8139e	sfm: Event Report PDF generation (v0.20.0 stub layout) New endpoint GET /db/events/{id}/report.pdf returns a single-page letter-portrait PDF for any event with waveform data on disk. Architecture: sfm/report_pdf.py — gather_report_data() assembles fields from SeismoDb row + .sfm.json sidecar (bw_report block) + .h5 samples; render_event_report_pdf() turns that into PDF bytes via matplotlib. sfm/server.py — new endpoint wires them together, streams PDF back with Content-Disposition: inline so the browser displays it. sfm_webapp.html — new "Download PDF" button in the event modal footer that opens the endpoint in a new tab. Fields surfaced — same coverage as a Blastware Event Report: Header metadata (date/time, trigger source, range, sample rate, project, client, operator, location, serial+firmware, battery, calibration, file name) Microphone block (PSPL in dB(L) + psi, ZC freq, channel test) Per-channel stats (PPV, ZC Freq, Time of Peak, Peak Accel, Peak Disp, Sensor Check) for Tran/Vert/Long Peak Vector Sum Waveform plot (MicL/Long/Vert/Tran stacked, shared time axis, trigger marker, symmetric Y for geo, zero-anchored mic) — OR per-interval bar chart for histograms. Rendering pipeline = matplotlib only (vector PDF, no headless-browser dep). Adds matplotlib>=3.8 to deps. Visual layout is approximate until reference PDFs from Instantel land at docs/reference/instantel/ for iteration. USBM RI8507 / OSMRE compliance chart is stubbed (placeholder rectangle) — separate work item. Smoke-tested on a K558 waveform event: 77 KB valid PDF, all fields populated correctly from the snapshot DB. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 02:55:58 +00:00
serversdown	ed926de3f4	viewers: default mic to dB(L) + add Mic-unit toggle (dBL ↔ psi) The sidecar-modal waveform plot was rendering mic in raw psi, while the rest of SFM (history table column, peaks block, live-device chart, event detail modal mic field) had already converted to dB(L) — matching the BW Event Report convention. Unifying. Both viewers now: - Default mic chart values + axis title + peak label to dB(L) - Provide a header toggle ("Mic: dBL" pill) to flip to psi - Persist the preference via localStorage (sfm_mic_unit) - Re-render the open chart immediately on toggle Conversion: dBL = 20 * log10(psi / 2.9e-9), where 2.9e-9 psi is the 20 µPa reference pressure already defined for the rest of the webapp. Non-positive psi samples (log undefined) render as null; Chart.js handles them as gaps in line mode and missing bars in histogram mode. Also fixes event_browser.html's stats table — the MicL row was hard-coding "<value> psi"; now honors the same toggle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 02:30:56 +00:00
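The conversion formula from this commit, written out as a small Python sketch (the viewers themselves are JavaScript; `psi_to_dbl` is a hypothetical name standing in for `_psiToDbl`):

```python
import math

PSI_REF = 2.9e-9  # psi, the 20 uPa reference pressure quoted in the commit


def psi_to_dbl(psi: float):
    """dBL = 20 * log10(psi / 2.9e-9); None where the log is undefined."""
    if psi <= 0.0:
        return None  # rendered as a gap (line) or missing bar (histogram)
    return 20.0 * math.log10(psi / PSI_REF)
```

A pressure equal to the reference maps to 0 dBL, and every tenfold increase in pressure adds 20 dB.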
serversdown	5d5441604b	viewers: symmetric Y-axis on geo waveforms + clarify timestamp labels Two fixes from the second screenshot review: 1. Geophone waveform Y-axis now renders SYMMETRIC around zero — zero line sits in the middle of the chart, signal goes both above and below. Standard seismograph display convention; matches the Instantel printout look. Previously Chart.js auto-scaled to the data range so e.g. Vert showing values from -0.005 to -0.015 had the zero line completely off-screen. Mic channel (sound pressure, always positive) keeps the default auto-scale anchored at zero. Histograms (per-interval peaks, also always positive) likewise keep bars rising from a zero baseline. 2. Modal labels clarified to remove the 'Timestamp' vs 'Captured at' ambiguity: 'Timestamp' → 'Recorded at' (when the seismograph recorded the event — from BW report's Event Time field) 'Captured at' → 'Received by server at' (when our sfm-db inserted the row) Both have tooltips explaining the distinction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 20:26:23 +00:00
serversdown	784f2cca36	viewers: decimal peak labels + bar chart for histograms + clean x-axis ticks Three polish fixes spotted in the first prod screenshot of the inline event-modal waveform plot: 1. Peak labels were rendering as "PEAK 2.500E-2 IN/S" because of a blanket toExponential(3) call. New _fmtPeak() formatter picks decimal with adaptive precision for normal-range values (0.0001 to 10000) and falls back to scientific only for truly extreme magnitudes. Same value now reads "peak 0.0250 in/s". 2. Histogram events were being plotted as connected line charts, but histograms are per-INTERVAL peaks (one bar per minute, typically), not per-sample waveforms. Now: detect histogram via record_type, render as a tight bar graph (bars touch), suppress the trigger line + zero baseline overlays (no trigger event on a histogram), and label the x-axis with interval number instead of milliseconds. 3. X-axis tick labels were displaying as "11.7187040000000002 ms" because the callback used the raw float, not the formatted label. Snap to 1 decimal place (or integer for whole-number values like histogram intervals). Applied to both the inline modal plot in sfm_webapp.html and the standalone /events viewer in event_browser.html — they share the same data shape and presentation conventions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 19:54:04 +00:00
serversdown	6abfadae4f	viewers: render pre-trigger samples (time_axis is metadata, not an array) The /db/events/{id}/waveform.json endpoint returns `time_axis` as a metadata object — {sample_rate, pretrig_samples, t0_ms, dt_ms, n_samples, total_samples, rectime_seconds} — not a per-sample times array. Both viewers (sfm_webapp.html sidecar modal + event_browser.html) were treating it as an array, silently falling back to a derived path that ignored pretrig entirely and started the time axis at 0. Symptom: trigger line drawn at the very left edge of every chart, no visible "leading up to the event" samples even though they're in the decoded data. Fix: read time_axis.t0_ms (negative when pretrig samples exist), time_axis.dt_ms, build per-sample times as `t0_ms + i * dt_ms`. Trigger line lands at sample where t crosses 0; pretrig samples render at negative t to the left of it. Confirmed on a K558 event with 208 pretrig samples + 2 sec rectime at 1024 sps — time axis now spans -203 ms to +2046 ms, trigger line at ~9% from the left edge as expected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 21:58:20 +00:00
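Rebuilding the per-sample time axis from the metadata object might look like the sketch below (Python for illustration; the viewers are JavaScript). The function names are hypothetical, and the sample count is passed in explicitly because the log does not say whether `n_samples` or `total_samples` is the right field. The usage values assume `t0_ms = -pretrig_samples * dt_ms`, which is consistent with the -203 ms figure quoted above.

```python
def build_times_ms(time_axis: dict, n: int):
    """Per-sample times in ms: t0_ms + i * dt_ms."""
    t0 = time_axis["t0_ms"]
    dt = time_axis["dt_ms"]
    return [t0 + i * dt for i in range(n)]


def trigger_index(times):
    """Index of the first sample at or after t = 0, or None if there is none."""
    for i, t in enumerate(times):
        if t >= 0.0:
            return i
    return None
```

For 208 pre-trigger samples at 1024 sps, `dt_ms` is 1000 / 1024 and `t0_ms` is -203.125, so the trigger line lands on sample 208 rather than at the left edge.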
serversdown	fd0e28657d	sfm_webapp: default to Database view + sortable columns + inline waveform plot Three UX upgrades to the main SFM webapp at /, all reinforcing the 'browse stored events' flow as the primary entry point: 1. Default section is now Database, not Live Device. Most users land here to look at stored events; Live Device is opt-in (click the tab to talk to a unit). Initial history + units fetch fires on first paint so the table is populated when the page loads. 2. History table columns are sortable. Click any header to sort: timestamp, serial, per-channel PPV (Tran/Vert/Long), PVS, mic dB(L), project, client, type, key. Default direction varies by column type (desc for numbers + timestamps, asc for text). Sort arrows appear in the active column header. Headers are sticky so they stay visible while scrolling. 3. Click-event-to-see-waveform. The existing sidecar review modal now renders the 4-channel waveform plot inline at the top, fetched from /db/events/{id}/waveform.json in parallel with the sidecar fetch. Channels stacked MicL / Long / Vert / Tran (Instantel printout order), shared bottom time axis, dashed trigger line + triangle markers at t=0, zero baseline with "0.0" label on the right edge, peak callouts per channel. Charts cleaned up on modal close. Resolves the "where is the viewer" surprise — operators no longer need to know about the /events route to see waveforms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 19:39:18 +00:00
serversdown	c14a8c54db	event_browser: Instantel-printout-style polish Apply the cheap visual wins from the BW Event Report layout: 1. Channel order reversed → MicL (top), Long, Vert, Tran (bottom) to match the Instantel printout. 2. Shared bottom time axis — x-axis ticks only render on the bottom-most data channel; other channels hide ticks so all four visually share one time scale. 3. Triangle trigger markers above and below the t=0 dashed line. 4. Horizontal zero-baseline (dotted) per channel with "0.0" label on the right edge — Instantel convention. 5. "Print view" toggle that flips dark→light theme (white panels, light grids, dark text) so the viewer can render usefully on paper-style output / @media print. 6. Per-channel PPV stats table in the metadata header, with Peak Vector Sum displayed prominently. 7. Colors adjusted to approximate BW trace colors (magenta MicL, blue Long, green Vert, red Tran). Future PDF-export work will reproduce the same layout server-side once you upload a real example PDF and we pick a rendering pipeline (weasyprint / chromium --print-to-pdf / etc.). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 07:09:12 +00:00
serversdown	460006e5cd	sfm: stored-event browser at /events New standalone HTML page (sfm/event_browser.html, ~470 lines, Chart.js) that lets you browse persisted events from the SeismoDb + WaveformStore. Companion to the existing live-device viewer at /waveform: /waveform — connect to a unit and pull events in real time /events — browse events already stored in the DB Flow: 1. Page loads → GET /db/units → populate serial dropdown 2. Select serial → GET /db/events?serial=X&limit=500 → event list 3. Click event → GET /db/events/{id}/waveform.json → render Layout is Instantel-printout-ready: channels stacked vertically in Tran / Vert / Long / MicL order, trigger line at t=0, peak labels, clean dark theme. Frames the future PDF-export feature without needing extra layout work. Smoke-tested against the dev prod-snapshot — 4 channels render with correct peaks for K558 events (L=0.3 in/s = the offset-fault peak we've been chasing all week). CHANGELOG entry added under [Unreleased] per the v0.20.0 release plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 06:53:48 +00:00
serversdown	8710b8f327	docs: record three known issues discovered during prod deployment 1. bw_ascii_report parser misses PPV/vector_sum fields on certain TXT formats (5 events in prod). Parser extracts every OTHER field for the same channels — likely a regex / format mismatch specific to some firmware-or-event-type combination. 2. NULL-timestamp duplicate rows. events.timestamp can come back as NULL when the codec can't extract a footer timestamp; UNIQUE(serial, timestamp) doesn't fire on NULL, so backfills create new rows instead of upserting. 2 affected events on prod, easy SQL cleanup. 3. Histogram body sub-format with byte[5] != 0. ~3 events on prod (T190LD5Q, O121L4L1) use a histogram body the walker doesn't recognize. Codec returns 0 valid blocks; DB peaks come from the bw_report ASCII overlay so DB columns are correct, only the .h5 plot is empty. Cracking the sub-format unlocks the plot. All three are pre-existing issues that today's deployment surfaced during validation; none are regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 21:02:13 +00:00
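The NULL-timestamp duplication in issue 2 can be reproduced with a minimal sketch, assuming a SQLite-style table with `UNIQUE(serial, timestamp)`. The table layout, the `upsert` helper, and the serial value are hypothetical, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY, serial TEXT, timestamp TEXT, pvs REAL,"
    " UNIQUE(serial, timestamp))"
)


def upsert(serial, timestamp, pvs):
    """Insert a row, or update pvs when (serial, timestamp) already exists."""
    conn.execute(
        "INSERT INTO events (serial, timestamp, pvs) VALUES (?, ?, ?) "
        "ON CONFLICT(serial, timestamp) DO UPDATE SET pvs = excluded.pvs",
        (serial, timestamp, pvs),
    )


# A real timestamp: the second call hits the UNIQUE constraint and updates.
upsert("BE0000", "2026-05-22T00:00:00", 0.1)
upsert("BE0000", "2026-05-22T00:00:00", 0.2)

# A NULL timestamp: NULLs never compare equal, so the constraint does not
# fire and every call creates a new row.
upsert("BE0000", None, 0.3)
upsert("BE0000", None, 0.3)

total_rows = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
null_rows = conn.execute(
    "SELECT COUNT(*) FROM events WHERE timestamp IS NULL"
).fetchone()[0]
print(total_rows, null_rows)  # 3 2
```

The sketch relies on SQLite's upsert syntax (3.24 or newer); a cleanup query for existing duplicates would have to match rows with `timestamp IS NULL` explicitly.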
serversdown	db657bcac9	Merge pull request 'fix: bw_report overlay onto event before DB, prevents data loss docs: three-tier architecture model + strategic roadmap' (#27) from feat/wire-histogram-codec into dev Reviewed-on: #27	2026-05-22 15:46:46 -04:00
serversdown	35842ac50a	backfill: overlay bw_report onto Event before DB upsert Mirror what the ingest path does: BW's reported peaks (and sample_rate / record_time) take precedence over codec output where present. Without this, --force backfill silently overwrites bw_report-overlaid DB columns with codec-derived peaks. Wrong for events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing PVS=0 on real high-amplitude events. Bit on prod 2026-05-22 with three top-10 waveform events ending up at PVS=0 (rolled back same day, this fix is the proper resolution). New helper minimateplus.event_file_io.apply_bw_report_dict_to_event operates on the projected sidecar dict shape (the structure _bw_report_to_dict produces, which is what gets preserved in the sidecar). Mirrors apply_report_to_event's semantics: only writes fields where bw_report has a non-None value, no-ops cleanly on empty / None input. Dev validation against prod snapshot: pre : 1839.7315 pvs_sum 356 events with DB PVS ≠ sidecar bw_report post : 2016.4902 pvs_sum 2 events still mismatched (both have NULL timestamp + duplicate rows, edge case) Both edge-case events DO get the correct value written by the new backfill — their stale rows from prior backfills remain because UNIQUE(serial, timestamp) doesn't fire on NULL. Separate dedup cleanup needed for those 2 events (0.014% of corpus); not blocking. Backfill remains idempotent + bw_report preservation still passes (0 WIPED, 0 CHANGED on the 3rd consecutive run). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 18:56:22 +00:00
serversdown	49a524d0d4	docs: three-tier architecture model + strategic roadmap CLAUDE.md gains an Architecture section near the top describing the canonical three-tier mental model: - SFM: device-side, live connections, /device/* endpoints - SDM: data-side, DB + waveform store + /db/* endpoints (currently living under sfm/ for historical reasons; rename deferred) - Codec library: pure data-interpretation, used by both tiers Future code should be placed and named according to this model even though the directory layout doesn't fully reflect it yet. Decision rule for where new code goes is documented inline. README.md's Roadmap section gains two strategic-direction subsections: - "Strategic direction" — frames the suite-of-components vision and notes that BW ACH + Thor IDF call-home remain the data movers; seismo-relay's value is on the receiving and processing side. - "Terra-View ↔ SFM device control" — the long-term vision where Terra-View can launch into SFM device-control surfaces (operator notices missing unit → clicks "Connect to Device" → live view in browser). Includes concrete implementation checklist (auth, embedded live-monitor view, action history, series IV live support). The existing tactical roadmap items remain unchanged below. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 18:38:00 +00:00
serversdown	9ef424d098	Merge pull request 'Histogram body codec — full RE + peak-count fix that resolves the prod inflation incident' (#26) from feat/wire-histogram-codec into dev Reviewed-on: #26	2026-05-22 13:08:03 -04:00
serversdown	ed6982c512	scripts: bw_report preservation check for backfill safety Two-step tool to verify that backfill_sidecars doesn't wipe the bw_report block from existing sidecars. Workflow: 1. snapshot --out before.json (canonical-JSON hash per sidecar) 2. run backfill 3. diff --baseline before.json (classifies every sidecar: PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED) Exit code 1 if any WIPED or CHANGED entries found, 0 otherwise — so it can gate a CI step or a deploy script. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 06:13:52 +00:00
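The snapshot/diff idea can be sketched as follows, assuming each sidecar is a JSON-like dict with an optional `bw_report` key. The function names are hypothetical, and only the four labels whose meaning is evident from the commit message are covered; the semantics of NEW / ADDED / REMOVED are not spelled out there, so they are left unclassified here.

```python
import hashlib
import json


def bw_report_hash(sidecar):
    """Hash the bw_report block in a key-order-independent (canonical) form."""
    block = sidecar.get("bw_report")
    if not block:
        return None  # no block to protect
    canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(before_hash, after_hash):
    """Compare one sidecar's pre-backfill and post-backfill hashes."""
    if before_hash is None and after_hash is None:
        return "STILL_MISSING"
    if before_hash is not None and after_hash is None:
        return "WIPED"
    if before_hash is None:
        return "UNCLASSIFIED"  # placeholder label, not one of the tool's
    return "PRESERVED" if before_hash == after_hash else "CHANGED"


def exit_code(labels):
    """Mirror the described gate: 1 if anything was WIPED or CHANGED."""
    return 1 if any(label in ("WIPED", "CHANGED") for label in labels) else 0
```

Hashing the canonical form rather than the raw file bytes means a sidecar that was rewritten with a different key order still counts as preserved.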
serversdown	d506ebc103	histogram_codec: peak count is uint8 (not uint16 LE) — properly cracks the BE9558 / BE18003 extension-byte case The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still unclear — empirically non-zero on intervals with sub-Hz or unmeasurable freq), NOT the high byte of the peak count. The N844 fixture corpus the original RE was done against had zero values in those bytes for every block, so uint8 and uint16 LE were equivalent there — but on real BE9558 Tran-drift events and BE18003 Histogram+Continuous events the uint16 LE interpretation produced peaks up to 268 in/s and 35× inflated PVS sums. Cross-correlated against BW's per-interval ASCII export on: - K558LKZU/LL1P/LL3K → 100% T/V/L/M peak match (1435 blocks each) - T003LKZR/LL0O/LL1M → 100% T/V/L, 99.3% M (0.05 dB rounding only) - N599LKZS/LL0L → 100% all channels - N844 fixture corpus → 100% all channels (unchanged) Annotations preserved on every record for future RE; the defensive _MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s, well below any physical limit). Synthetic regression test added using the verbatim K558LKZU.RE0H interval-12 block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 06:05:19 +00:00
serversdown	e949232875	histogram_codec + backfill: tighter peak ceiling, preserve bw_report histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling let extension-byte blocks slip through at up to 20.48 in/s per channel, producing 35× inflated PVS sums when first deployed to prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts) plus 10% headroom for quantization edge cases. backfill_sidecars: also preserve the bw_report block alongside review + extensions when regenerating sidecars. event_to_sidecar_dict takes a BwAsciiReport dataclass not a dict, so for bw_report we overlay the existing block after regen rather than passing as a kwarg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 02:50:10 +00:00
serversdown	bc5a2d3f19	histogram_codec: defensive bounds-check on peak counts Discovered while running the backfill on prod: certain histogram blocks contain an undocumented extension byte format whose naive uint16 LE interpretation yields physically impossible peak values (150+ in/s when the device max is 10). Concrete example from K558LKSG.3I0H block at body+7424: bytes [6:10] = 05 79 69 00 current code: T_peak = uint16 LE = 0x7905 = 30981 → 154.9 in/s reality: T_peak = byte[6] = 5 → 0.025 in/s (matches BW display) The high byte (0x79 here) appears to be an extension field — possibly "time of peak within interval" or a Histogram+Continuous sub-mode marker. Observed across BE9558 and BE18003 units in prod data; never appeared in the BE12844 fixture corpus the codec was originally verified against. Effect on prod: 26 out of 1433 blocks in this one event had inflated peaks, plus dozens of similar events across the fleet → sum(PVS) inflated from baseline 988 to 34501 (35x). Rolled back via the pre-backfill snapshot before any UI exposure. Defensive fix: bounds-check peak counts in `_decode_block`. Any field exceeding `_MAX_PEAK_COUNT` (4096 = ~20 in/s, well past the device's 10 in/s Normal-range FS) causes the block to be skipped entirely. Other valid blocks in the same event still decode correctly. Trade-off: those skipped blocks lose their per-interval data (peaks + frequencies). Acceptable until the extension format is reverse-engineered — better than propagating bogus values into PVS computations downstream. The 24 existing tests all still pass — the fixtures used during the original codec development don't exercise the extension-byte case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 02:17:33 +00:00
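The K558LKSG.3I0H example can be checked with a few lines of arithmetic. This is a sketch using only the four bytes and the constants quoted in the message; `is_plausible` is a hypothetical stand-in for the bounds check, not the code in `_decode_block`.

```python
import struct

COUNTS_TO_INS = 0.005    # in/s per count at Normal range
MAX_PEAK_COUNT = 4096    # ceiling quoted in the commit (about 20 in/s)

block_bytes = bytes.fromhex("05796900")  # bytes [6:10] of the block

# Naive reading: the first two bytes form a little-endian uint16.
as_uint16_le = struct.unpack_from("<H", block_bytes, 0)[0]
# Alternative reading: only the first byte is the peak count.
as_uint8 = block_bytes[0]

print(as_uint16_le, as_uint16_le * COUNTS_TO_INS)  # 30981 counts, about 154.9 in/s
print(as_uint8, as_uint8 * COUNTS_TO_INS)          # 5 counts, 0.025 in/s


def is_plausible(peak_count):
    """Reject counts past the ceiling instead of propagating them."""
    return peak_count <= MAX_PEAK_COUNT


print(is_plausible(as_uint16_le), is_plausible(as_uint8))  # False True
```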
serversdown	88549bc659	backfill_sidecars: filter out Thor IDF files Discovered while dry-running the backfill on prod: the waveform store contains both BW (.AB0*/.N00) and Thor IDF (.IDFW/.IDFH) event files side-by-side because both go through the same per-serial directory layout. The script's `_looks_like_event_file` heuristic accepted any 3-4 char extension ending in W or H, which matched both BW and IDF. The script then routes everything through `event_file_io.read_blastware_file`, which rejects IDF files with "not a Blastware file (bad header prefix)" — 3807 errors on prod out of 7201 total events. Thor IDF events have their own ingest path (`WaveformStore.save_imported_idf`) and their sidecars are populated at ingest from the paired `.IDFW.txt` ASCII report. The backfill script has no value to add for them — there's no decoder to refresh, and the sidecar metadata is already correct. Filter them out. After this fix, the prod backfill should run clean: ~3392 BW events get sidecar+h5 regen as expected; the ~3807 Thor IDF events are silently skipped. The proper "IDF backfill" (refresh tool_version stamp on IDF sidecars by re-running event_to_sidecar_dict against the stored DB row + sidecar extensions block) is a separate, narrower follow-up — not blocking the BW backfill rollout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 01:20:08 +00:00
serversdown	76bce0b5a3	Merge pull request 'v0.20.0 - prerelease features.' (#25) from feat/wire-histogram-codec into dev - dockerfile fix - histogram body codec FULLY decoded - backfill scripts fixed. - docs added for histogram codec	2026-05-20 21:05:37 -04:00
serversdown	7183b953e4	minimateplus: histogram body codec — FULLY DECODED The histogram-mode event body is now byte-exact decodable. Companion to the waveform body codec — together they cover every event file the watcher forwards. Cracked in one session via cross-event correlation against BW's ASCII export. The §7.6.2 spec in instantel_protocol_reference.md was structurally correct (32-byte blocks) but the per-sample semantics were under-documented. Cross-checking block 130 of N844L6Z8.ZR0H against its TXT row revealed the layout perfectly: slot[0] = 10 (constant marker) slot[1] = T_peak_count (× 0.005 → in/s at Normal range) slot[2] = T_halfperiod (freq_Hz = 512 / halfp) slot[3] = V_peak_count slot[4] = V_halfperiod slot[5] = L_peak_count slot[6] = L_halfperiod slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db) slot[8] = MicL_halfperiod The `>100 Hz` sentinel is halfperiod ≤ 5 (since 512/5 ≈ 102 Hz, just above the 100 Hz ceiling). Mic dB uses the SAME formula as the waveform codec (sign × (81.94 + 20·log10(|count|))) — they share the mic ADC calibration constant. Block identification anchor: bytes [22:24] == 0x0000 AND bytes [28:32] == 1e 0a 00 00. The tail signature is the most reliable distinguisher from non-block content in the file. Files: minimateplus/histogram_codec.py (new) — decoder + public API matching the waveform codec's shape: walk_body(body) -> records decode_histogram_body(body) -> {Tran, Vert, Long, MicL} decode_histogram_body_full(body) -> [per-interval dicts] half_period_to_hz, geo_count_to_ins helpers minimateplus/event_file_io.py (modified) — read_blastware_file now tries the waveform codec first, falls back to the histogram codec on failure. Same output shape, same downstream pipeline. tests/test_histogram_codec.py (new) — 24 regression locks against the in-repo fixture corpus, byte-exact against BW ASCII export for peaks (all 4 channels), frequencies (all 4 channels, including >100 Hz sentinel handling), block framing, and segment-ID accounting. 
scripts/backfill_sidecars.py (modified) — the has_samples short-circuit added in the histogram-pending era is now a pure defensive guard. Histograms in prod will regen .h5 files correctly on the next backfill run. docs/histogram_codec_re_status.md (updated) — supersedes the earlier "in progress" version with the verified format and test-coverage summary. Notes a few non-essential fields still open (4-byte block metadata, Geo PVS, Mic psi(L) — none of which are needed for waveform reconstruction). Total verified coverage: ~3,500 blocks across 5 fixtures, every field of every block byte-exact against BW. The watcher-forwarded histogram event corpus on prod (~10,000 events) will now produce correct .h5 sidecars on the next backfill run. No additional changes needed to the backfill flow — the existing tool_version-bump cascade picks them up automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 23:05:13 +00:00
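The per-slot conversions described in this commit can be sketched as below. The constants come from the message; the function names and return shapes are assumptions for illustration and may differ from the real `half_period_to_hz` / `geo_count_to_ins` helpers.

```python
GEO_INS_PER_COUNT = 0.005   # in/s per count at Normal range
FREQ_NUMERATOR = 512        # freq_Hz = 512 / halfperiod
ABOVE_RANGE_HALFPERIOD = 5  # halfperiod <= 5 is reported as ">100 Hz"


def halfperiod_to_hz_sketch(halfperiod):
    """Return (freq_hz, above_range); freq_hz is None for the >100 Hz sentinel."""
    if halfperiod <= ABOVE_RANGE_HALFPERIOD:
        return None, True
    return FREQ_NUMERATOR / halfperiod, False


def geo_count_to_ins_sketch(count):
    """Convert a geophone peak count to in/s at Normal range."""
    return count * GEO_INS_PER_COUNT


print(halfperiod_to_hz_sketch(8))   # (64.0, False)
print(halfperiod_to_hz_sketch(5))   # (None, True), since 512/5 = 102.4 Hz
print(geo_count_to_ins_sketch(10))  # about 0.05 in/s
```

Returning the above-range flag separately keeps the sentinel from being mistaken for a measured frequency downstream.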
serversdown	c3c7fe559c	docs: histogram body codec RE — starting-point status doc Captures everything learned in the 2026-05-20 session before scope forced a pause: - Block framing is solved: 32-byte blocks, one per histogram interval, signature byte pattern `[22:24]=0x0000` + `[28:32]=0x1e 0x0a 0x00 0x00` reliably identifies data blocks. - Block count = interval count (791 blocks in N844L20G.630H for a TXT-reported 792 intervals). - Sample[0] = Tran peak in 0.0005 in/s/count units (verified on one event — needs cross-event confirmation). - Samples 1-8 → channel/metric mapping is still open. None of the obvious layouts (peak-then-freq alternating, all-peaks- then-all-freqs, per-channel 3-tuples) match the TXT values across multiple blocks. Likely needs a higher-activity fixture (current N844 corpus is all noise-floor data) to disambiguate. - `>100 Hz` sentinel encoding in the binary is unknown. - 4-byte variable metadata field at block[24:28] needs correlation work against TXT columns. Doc mirrors the structure of docs/waveform_codec_re_status.md so a future RE session has a familiar entry point. Includes the suggested attack plan + the code seam where the eventual decoder will land (minimateplus/histogram_codec.py). The §7.6.2 spec in instantel_protocol_reference.md is structurally correct but doesn't pin down per-sample semantics — this doc supersedes it where they conflict on confidence level. No code shipped on this branch. When the codec is cracked, the plan is to land minimateplus/histogram_codec.py + wire into event_file_io.read_blastware_file() + remove the has_samples short-circuit from scripts/backfill_sidecars.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 21:13:26 +00:00
serversdown	fa9d3cdef2	read_blastware_file: leave peak_values=None when samples can't be decoded Fixes a data-loss bug discovered while dry-running the backfill against the prod store. Symptom: every histogram event in the store has its body decoded by read_blastware_file → codec returns None → samples = empty dict → ``ev.peak_values = _peaks_from_samples(empty)`` returns ``PeakValues(0, 0, 0, 0, 0)`` (NOT None). The backfill script's existing "seed from DB row when peak_values is None" branch then correctly skips the seeding, and the all-zeros PeakValues flows into ``db.insert_events()``'s UPSERT path, OVERWRITING the existing good DB peak values for that event (which were populated from the paired BW ASCII report at ingest). Net effect: running the backfill on prod would have wiped the PPV / mic / vector-sum columns for ~10,000 histogram events. Fix: only compute peaks-from-samples when there are actually samples. For events the codec couldn't decode (histogram-mode bodies, until the §7.6.2 histogram codec is wired in), leave peak_values=None as the "we don't know" signal. Downstream consumers: - backfill_sidecars.py — its existing ``if ev.peak_values is None:`` branch (line 243) seeds from the DB row, preserving the real BW-report peaks across the regen. - WaveformStore.save_imported_bw — apply_report_to_event overlays peaks from the paired BW ASCII report when one was uploaded. Histogram imports without a paired report end up with NULL peaks in the DB, which is correct (better than zeros — clearly says "no peak data available" rather than "peaks are exactly zero"). Updated the existing synthetic-event round-trip test to expect peak_values=None for the no-real-body case, which is the truth now. The 7 fixture-corpus regression tests for real BW waveforms continue to pass — those have decodable samples, so peak_values is still populated from the codec output as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:30:53 +00:00
serversdown	c4648c1959	scripts/backfill_sidecars: skip .h5 write when decoder returned no samples Discovered while dry-running the backfill on the prod store: ~10,000 of ~10,059 events are histogram-mode (filename extension `H`), and the waveform-body codec wired in via the previous commit doesn't handle histogram-mode bodies — only the waveform-mode codec at §7.6.1 is implemented; the histogram-mode codec at §7.6.2 of the protocol reference is documented but no Python implementation exists yet. Without this guard, every histogram event's .h5 file would be replaced with an empty one — strictly worse than today's broken-int16-LE .h5 because any downstream viewer expecting non-empty sample arrays would now error out instead of just rendering wrong values. Fix: after the decoder runs, check whether any channel has samples. If not, skip the .h5 write entirely. The sidecar still regenerates (refreshing the tool_version stamp and any peaks/project info from the DB row), but the existing .h5 is left untouched. This is a temporary gate. When the histogram codec lands (next branch: `feat/wire-histogram-codec`), the has_samples check can be removed and the backfill will then correctly regenerate all .h5 files, histogram and waveform alike. Observed effect (dry-run on prod store, 10,059 events): - waveform events (~5%): "[DRY ] would write … + .h5 (would (re)write)" - histogram events (~95%): "[DRY ] would write … + .h5 (skipped-empty-samples)" - sidecar tool_version bump succeeds for both Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:16:31 +00:00
serversdown	0e89125495	docker: fix dockerfile to include scripts and micromate folders	2026-05-20 19:58:54 +00:00
serversdown	fffb363b2b	Merge pull request 'minimateplus: wire read_blastware_file to verified body codec' (#24) from feat/wire-codec-to-import-path into dev Reviewed-on: #24	2026-05-20 15:26:15 -04:00
serversdown	e8682d49ad	scripts/backfill_sidecars: cascade h5 regen when sidecar is stale + bump TOOL_VERSION Two coupled changes that close the rollout gap left by the read_blastware_file codec wiring: 1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to 0.20.0. This is the version stamp the backfill script reads from each sidecar's source.tool_version field to detect "this sidecar was written before the current decoder shipped, regenerate it." Bumping past every value baked into existing prod sidecars flags them all as stale on the next backfill run — which is exactly what we want, since every pre-codec-wiring sidecar was written by the retracted int16-LE decoder. 2. scripts/backfill_sidecars.py: when the sidecar is being regenerated this iteration (sha mismatch, tool_version too old, or --force), also regenerate the .h5. Previously the .h5 logic only rewrote when --force was passed or the file was missing — so a tool_version-driven sidecar regen left the broken .h5 in place forever. Added a `sidecar_stale` boolean to track the "we're rewriting the sidecar this iteration" state and wired it into the h5 need-rewrite check. Path coverage (verified by trace): - sidecar missing → both regen - --force → both regen - sha mismatch → both regen - tool_ver too old → both regen (THE post-codec-wiring case) - everything OK → skip iteration entirely (h5 untouched) Operator review state (review.false_trigger, reviewer, notes) and the sidecar's extensions block are preserved across regen by the existing read-existing-sidecar / pass-into-event_to_sidecar_dict path — unchanged from prior behavior. Deploy procedure (on prod): 1. Pull this change + the read_blastware_file codec wiring. 2. `python scripts/backfill_sidecars.py --dry-run` to preview. Every sidecar with source.tool_version<0.20.0 will show as "would (re)write". 3. Run for real (drop --dry-run). Expect every pre-fix event to regen. Big stores may take a while. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 18:24:06 +00:00
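The path-coverage table in this commit reduces to one staleness decision that drives both the sidecar and the .h5 rewrite. The following is a sketch of that decision only; the function names and parameters are hypothetical, and the real script's structure may differ.

```python
CURRENT_TOOL_VERSION = "0.20.0"


def version_tuple(version):
    """Parse 'X.Y.Z' into a tuple so 0.9.0 sorts below 0.20.0."""
    return tuple(int(part) for part in version.split("."))


def sidecar_is_stale(sidecar_exists, force, sha_matches, sidecar_tool_version):
    """True when the sidecar (and therefore the .h5) should be regenerated."""
    if not sidecar_exists or force:
        return True
    if not sha_matches:
        return True
    return version_tuple(sidecar_tool_version) < version_tuple(CURRENT_TOOL_VERSION)


# The post-codec-wiring case: everything matches except the version stamp.
print(sidecar_is_stale(True, False, True, "0.16.1"))  # True
# Everything up to date: the iteration is skipped and the .h5 is untouched.
print(sidecar_is_stale(True, False, True, "0.20.0"))  # False
```

Comparing parsed tuples rather than raw strings avoids the lexicographic trap where `"0.9.0" > "0.20.0"`.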
serversdown	31d691b40b	minimateplus: wire read_blastware_file to verified body codec `read_blastware_file()` was still calling `_decode_samples_4ch_int16_le` (the retracted int16-LE-interleaved hypothesis) on the body bytes, producing ±32K noise on every channel of every BW file read from disk. This was the path watcher-forwarded events take into the system (via the import endpoint → save_imported_bw → read_blastware_file, since the watcher doesn't ship A5 frames), so every .h5 sidecar generated for a forwarded event has been wrong since the feature shipped. The fix is mechanical: pass the body bytes straight to `waveform_codec.decode_waveform_v2()` and run the result through `decoded_to_adc_counts()` for the 16x geo scaling. The body already starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]` — confirmed by `body[:3].hex()` across all 9 fixture events. No body-slice adjustment needed. If the codec returns None (truncated/malformed file, synthetic test input with no real waveform), fall back to empty channels with a log warning. The rest of the event (timestamp, waveform_key, project strings, sensor_location, peaks-from-samples=0) is still recoverable. Verified against the bundled fixture corpus: V70 Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth within the 0.005 in/s display quantum, every row 6S0/RG0/AB0/470 (5-8-26) 3328/2304/1280/1280 samples; Vert PPVs match BW's own report within 0.02 in/s JQ0 3328 samples, Vert PPV 3.384 vs BW 3.465 SP0/SS0/SV0 (loud events) 3072–3328 samples; known walker tail-truncation 1–7 samples per channel, samples reached are byte-exact Existing `test_read_blastware_file_round_trip` (synthetic empty event) continues to pass thanks to the None-fallback. Codec verify scripts (`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`) re-run unchanged. 
Added two regression-lock tests in tests/test_event_file_io.py: - test_read_blastware_file_decodes_via_codec[6 fixtures] — verifies sample count + Vert PPV per fixture - test_read_blastware_file_v70_samples_match_txt_truth — verifies every one of V70's 3328 sample-sets across Tran/Vert/Long matches the .TXT ground truth row-by-row within 0.003 in/s Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 18:13:24 +00:00
serversdown	beca5de06e	docs: clean up and verify s3 protocol docs	2026-05-20 17:55:02 +00:00
serversdown	d85df4c886	Merge pull request 'merge full s3 codec decoded' (#23) from codec-re into main Reviewed-on: #23	2026-05-20 13:45:32 -04:00
Claude	0466bb4f44	codec: crack wide-NN blocks (1X NN / 2X NN); loud events now fully decode When NN exceeds 0xFC, the codec extends to 12-bit NN by using the low nibble of the TYPE byte as the high nibble of NN: 1X NN → nibble-delta block, NN = (X << 8) | NN_byte 2X NN → int8-delta block, same NN encoding Walker and decode_waveform_v2 now handle both narrow (X=0) and wide (X != 0) forms uniformly. Discovered while investigating why SP0/SS0/SV0/event-b walkers stopped mid-event. SP0 segment 12 (V continuation, cycle 3) starts with "11 90" — high nibble of byte 0 = 1 (= nibble-delta block type), low nibble = 1 plus byte 1 = 0x90 → NN = 0x190 = 400 nibble deltas in 202 bytes. Walker was rejecting "11" as a non-tag. Sample count went from 47,364 to 72,972 verified byte-exact: event-a: 9984 (full) was 9984 (full) event-b: 6912 (full) was 738 event-c: 3840 (full) was 3840 (full) event-d: 3840 (full) was 3840 (full) JQ0: 9984 (full) was 9984 (full) V70: 9984 (full) was 9984 (full) SP0: 9984 (full) was 5122 SS0: 9222 (-7 tail) was 1758 SV0: 9222 (-7 tail) was 2114 7 of 9 fixtures now decode end-to-end across all 3 geo channels. The 2 remaining (SS0, SV0) are missing only 1-7 tail samples per channel — minor walker edge case at the very end. 74 tests pass (was 71).	2026-05-20 17:28:54 +00:00
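The wide-NN tag parsing can be sketched in a few lines, assuming the two-byte tag layout described in this commit. The function names are hypothetical, and the nibble-block length formula is inferred from the "400 nibble deltas in 202 bytes" figure rather than taken from the walker's code.

```python
def parse_block_tag(byte0, byte1):
    """Split a two-byte tag into (block_type, nn).

    block_type is the high nibble of byte 0 (1 = nibble-delta, 2 = int8-delta);
    the low nibble X supplies the high 4 bits of a 12-bit NN.
    """
    block_type = byte0 >> 4
    x = byte0 & 0x0F
    nn = (x << 8) | byte1
    return block_type, nn


def nibble_block_length(nn):
    """Two tag bytes plus two 4-bit deltas per payload byte (assumes even NN)."""
    return 2 + nn // 2


print(parse_block_tag(0x11, 0x90))  # (1, 400), the SP0 segment 12 example
print(parse_block_tag(0x10, 0x20))  # (1, 32), narrow form with X = 0
print(nibble_block_length(400))     # 202
```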
Claude	85f4bcfe86	codec: wire decode_waveform_v2 into production; add MicL dB helper Replaces the broken legacy int16 LE decoder in client.py with the verified multi-channel codec. Three changes: 1. blastware_file.extract_body_bytes(a5_frames) — new helper that factors out the body-reconstruction logic from write_blastware_file so both writers (BW binary) and decoders (sample arrays) can use the same canonical bytes. 2. waveform_codec.decode_a5_frames(a5_frames) — production entry point. Returns the raw_samples dict consumers expect (Tran/Vert/Long as int16 ADC counts; MicL as native ADC counts). Internally: A5 frames → extract_body_bytes → decode_waveform_v2 → decoded_to_adc_counts (geos ×16; mic pass-through) 3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's display formula: dB = sign(count) × (81.94 + 20 × log10(|count|)) for |count| ≥ 1 Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1). client.py:_decode_a5_waveform is reduced to a thin wrapper that calls decode_a5_frames and populates event.raw_samples. Original implementation preserved as _decode_a5_waveform_LEGACY (dead code; reference only). Also fixed a tail-end bug in decode_waveform_v2 where trailer-section "40 02" markers (containing ASCII serial bytes, NOT real segment headers) were being mis-interpreted, producing 2 spurious samples per channel at the end of each event. Added bytes [12:14] == "02 00" validation to reject non-header markers. 7 new pytest tests cover the new helpers and dB conversion. Total: 71 passing (up from 64). Known limitation (carried over from before): the walker still stops mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some mid-segment edge cases not yet characterized. Every sample reached is decoded correctly; the walker just doesn't reach all of them. Loud events still yield 5,000–15,000 byte-exact samples each.	2026-05-20 17:28:54 +00:00
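The MicL display formula quoted in this commit can be checked against the V70 figure with a short sketch. The function name is hypothetical, and returning `None` for |count| < 1 is an assumption, since the message only defines the formula for |count| ≥ 1.

```python
import math

MIC_DB_OFFSET = 81.94  # calibration constant from the commit message


def mic_count_to_db_sketch(count):
    """dB = sign(count) * (81.94 + 20 * log10(|count|)) for |count| >= 1."""
    if abs(count) < 1:
        return None  # assumption: undefined below one count
    sign = 1 if count > 0 else -1
    return sign * (MIC_DB_OFFSET + 20 * math.log10(abs(count)))


print(round(mic_count_to_db_sketch(813), 2))  # 140.14
```

The signed result keeps the polarity of the pressure peak; a caller that only wants the magnitude can take the absolute value.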
Claude	2ff2762eec	codec-re: 30 NN block CRACKED — codec fully decoded User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC range constraint led to the final piece. 30 NN block format (CONFIRMED across all 14 blocks in the fixture bundle): NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each. Within each group: bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first) bytes [2:6] = 4 × int8 low bytes delta[k] = sign_extend_12((high_nibble[k] << 8) | low_byte[k]) Block length = NN × 1.5 + 2 bytes (tag included). Earlier walker used NN × 4 which is only correct in the TRAILER section. Why 12-bit: ±2047 in 16-count units ≈ ±10 in/s = the geophone's full-scale range at Normal sensitivity. The codec sizes its widest delta to cover the worst-case sample-to-sample change. Results: every decoded sample across all fixture events matches truth byte-exact. ZERO divergences. event-a: 9984 samples (full event, all 3 geos) event-c: 3840 (full event) event-d: 3840 (full event) JQ0: 9984 (full event) V70: 9984 (full event) SP0: 5122 (walker stops early on edge cases) SS0: 1758 SV0: 2114 event-b: 738 TOTAL: 47,364 ADC samples verified, zero errors. Three full 3-sec events decode end-to-end across all three geo channels. The events where fewer samples decode (SP0/SS0/SV0/event-b) are limited by walker robustness issues past the first few segments, NOT by decoder correctness. 64 tests pass (up from 55). Files: minimateplus/waveform_codec.py (new 30 NN decode + corrected walker length), tests/test_waveform_codec.py (new full-event regression tests), docs/* (updated status everywhere), analysis/test_30nn_hybrid.py (new — the analysis script that confirmed the format).	2026-05-20 17:28:54 +00:00
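The 6-byte group layout described in this commit can be illustrated with a minimal unpacking sketch. The function names are hypothetical; reading the first two bytes as one big-endian 16-bit word and treating each low byte as 8 unsigned bits before sign extension are assumptions consistent with the formula in the message, not a copy of `waveform_codec.py`.

```python
def sign_extend_12(value):
    """Interpret the low 12 bits of value as a signed two's-complement number."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def unpack_group(group):
    """Unpack one 6-byte group into four signed 12-bit deltas."""
    if len(group) != 6:
        raise ValueError("a group is exactly 6 bytes")
    highs = int.from_bytes(group[0:2], "big")  # 4 high nibbles, MSB-first
    deltas = []
    for k in range(4):
        high_nibble = (highs >> (12 - 4 * k)) & 0xF
        low_byte = group[2 + k]
        deltas.append(sign_extend_12((high_nibble << 8) | low_byte))
    return deltas


def block_length_30nn(nn):
    """Block length in bytes, tag included: NN * 1.5 + 2 (NN a multiple of 4)."""
    return nn * 3 // 2 + 2


# High nibbles 0, F, 7, 0 paired with low bytes 01, FF, FF, 00.
print(unpack_group(bytes([0x0F, 0x70, 0x01, 0xFF, 0xFF, 0x00])))  # [1, -1, 2047, 0]
print(block_length_30nn(4))  # 8
```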

366 Commits