histogram aggregation + parser extension for BW interval fields

Three layered changes that together make histogram charts visually
match BW's printout (one bar per interval, not per codec block):

1. bw_ascii_report parser captures histogram fields it previously
   dropped:
   - Histogram Start/Stop Time + Date → datetime
   - Number of Intervals + Interval Size (string + parsed seconds)
   - <Channel> Peak Time + Peak Date → datetime (per-channel)
   - Peak Vector Sum Date (combined with PVS Time → datetime;
     clears the bogus seconds parse that interpreted "22:33:52"
     as 22.0)
   New _parse_iso_date() handles BW's ISO format for histograms
   (waveforms use "May 8, 2026" long form). New _parse_interval_size()
   handles "1 minute" / "5 minutes" / "15 seconds" etc.
2. _bw_report_to_dict() projects the new fields into a new
   bw_report.histogram block in the sidecar.
3. /db/events/{id}/waveform.json wraps the existing path 1 (HDF5)
   output with _maybe_aggregate_histogram(): when the event is a
   histogram AND the sidecar has bw_report.histogram.n_intervals,
   group the codec's per-block samples into N intervals via
   max-per-group and return the aggregated array. time_axis gains
   histogram_aggregated / n_intervals / interval_size_s / interval_times
   fields.

Frontend (both modal chart in sfm_webapp.html + standalone event
browser) uses interval_times as x-axis labels when provided (BW-style
HH:MM:SS), falls back to interval index.

Defensive: aggregation is no-op when the sidecar lacks the histogram
block (events ingested before this change). Activates automatically
on prod once a watcher re-forward populates new sidecars.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -8,6 +8,15 @@ All notable changes to seismo-relay are documented here.
### Added
- **Histogram per-interval aggregation in `waveform.json`.** Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output). When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array. `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings). Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present. Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
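The max-per-group step described in this entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind `_maybe_aggregate_histogram()`: the helper names `aggregate_max_per_group` and `interval_labels` are hypothetical, the even integer split of samples across intervals is an assumption, and so is the choice to label each interval by its start time.

```python
from datetime import datetime, timedelta


def aggregate_max_per_group(samples, n_intervals):
    """Collapse per-block samples into n_intervals values (hypothetical helper).

    Assumption: samples are split into contiguous, near-equal groups and
    each group is reduced to its maximum.
    """
    if n_intervals <= 0 or not samples:
        return list(samples)  # nothing to aggregate; pass through unchanged
    n = len(samples)
    out = []
    for i in range(n_intervals):
        # Integer boundaries spread any remainder across the groups.
        lo = i * n // n_intervals
        hi = (i + 1) * n // n_intervals
        group = samples[lo:hi]
        # A group is empty only when n_intervals exceeds len(samples).
        out.append(max(group) if group else 0.0)
    return out


def interval_labels(start, interval_size_s, n_intervals):
    """Build HH:MM:SS labels, one per interval (hypothetical helper).

    Assumption: each label marks the start of its interval.
    """
    return [
        (start + timedelta(seconds=i * interval_size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]


bars = aggregate_max_per_group([1, 5, 2, 8, 3, 4], 3)
labels = interval_labels(datetime(2026, 5, 8, 22, 30, 0), 60, 3)
```

When the sidecar has no `histogram.n_intervals`, a caller would skip both helpers and return the raw codec output, which matches the no-op behaviour described above.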
- **`bw_ascii_report` parser now captures histogram-specific fields.** Previously the parser dropped these fields silently (Roadmap item closed):
- `Histogram Start Time` / `Histogram Start Date` (combined into `histogram_start: datetime`)
- `Histogram Stop Time` / `Histogram Stop Date` (combined into `histogram_stop: datetime`)
- `Number of Intervals` (`histogram_n_intervals: int`)
- `Interval Size` ("1 minute" string + parsed seconds: `histogram_interval_size_str`, `histogram_interval_size_s`)
- `<Channel> Peak Time` + `<Channel> Peak Date` for histogram events (combined into `channel_peak_when: dict`; waveforms continue to use `time_of_peak_s` relative)
- `Peak Vector Sum Date` (combined with PVS Time into `peak_vector_sum_when: datetime`; clears the previous bogus `peak_vector_sum_time_s` parse that interpreted "22:33:52" as 22.0 seconds)
- All new fields land in the sidecar's `bw_report.histogram` block via `_bw_report_to_dict`. Tested against synthetic K558LLB7.V20H-shaped input.
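For illustration, the two parsing steps listed above (interval size to seconds, and date + time to a `datetime`) might look something like the sketch below. The function names `parse_interval_size` and `combine_date_time` are hypothetical stand-ins rather than the parser's actual helpers. The `YYYY-MM-DD` date layout and the inclusion of an `hour` unit are assumptions; the changelog only confirms an ISO-style date and the "1 minute" / "5 minutes" / "15 seconds" forms.

```python
import re
from datetime import datetime

# Assumption: only these units occur; "hour" is included speculatively.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

_INTERVAL_RE = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*(second|minute|hour)s?\s*", re.IGNORECASE
)


def parse_interval_size(text):
    """Turn '1 minute' / '15 seconds' into seconds, or None if unrecognised."""
    m = _INTERVAL_RE.fullmatch(text)
    if m is None:
        return None
    return float(m.group(1)) * _UNIT_SECONDS[m.group(2).lower()]


def combine_date_time(date_text, time_text):
    """Join separate date and time fields into one datetime.

    Assumption: the date is YYYY-MM-DD and the time is HH:MM:SS.
    Returns None instead of raising when either field does not match.
    """
    try:
        return datetime.strptime(
            f"{date_text.strip()} {time_text.strip()}", "%Y-%m-%d %H:%M:%S"
        )
    except ValueError:
        return None


size_s = parse_interval_size("5 minutes")
pvs_when = combine_date_time("2026-05-08", "22:33:52")
```

Parsing the time together with its date keeps a value such as "22:33:52" as a clock time, rather than letting a numeric parse read it as 22.0 seconds.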
- **Raw BW ASCII report (.TXT) preservation.** `save_imported_bw` now writes the paired `_ASCII.TXT` to `<store>/<serial>/<filename>_ASCII.TXT` alongside the binary at ingest time. Previously the .TXT was parsed into the sidecar's `bw_report` projection and then discarded, so parser bug fixes couldn't be applied retroactively without re-forwarding from the watcher PC. Now the raw .TXT lives in the waveform store permanently (~15 KB per event; ~210 MB total for a 14k-event store; negligible). The sidecar's `source.txt_filename` field records the saved path; `backfill_sidecars` preserves it across regenerations. New `GET /db/events/{id}/ascii_report.txt` endpoint serves the raw .TXT for any event ingested after this change. Events ingested before this change still return 404 from that endpoint until re-forwarded. Architectural rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF that those tools produced are no longer available, so the binary + .TXT (created by BW ACH itself) are our authoritative source for everything going forward.
- **Event Report PDF generation** — `GET /db/events/{id}/report.pdf` returns a single-page letter-portrait PDF for any event with waveform data on disk. Covers every field a Blastware Event Report includes: header metadata (date/time, trigger source, range, sample rate, project/client/operator/location, serial+firmware, battery, calibration, file name), microphone block (PSPL in dB(L) + psi, ZC freq, channel test), per-channel stats table (rows differ for waveform vs histogram), Peak Vector Sum, and the 4-channel plot. Iterated against real Blastware reference PDFs (uploaded to `example-events/pdfsnstuff/`):