Merge pull request 'v0.20.0 -- Full s3 event parse and PDF creation.' (#28) from dev into main

Reviewed-on: #28
This commit was merged in pull request #28.
2026-05-28 17:54:31 -04:00
21 changed files with 4652 additions and 95 deletions
+1 -1
@@ -1,6 +1,6 @@
/bridges/captures/
/example-events/
/tests/fixtures/
/manuals/
# Python build artifacts
+93
@@ -4,6 +4,99 @@ All notable changes to seismo-relay are documented here.
---
## [Unreleased]
---
## v0.20.0 — 2026-05-28
The "PDF + parser polish" release. Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events. Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC. Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.
### Added (2026-05-28)
- **Server-wide display timezone via `TZ` env var.** Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod). Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering. DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one. Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`). Override per-deployment via the `TZ` line in `docker-compose.yml`.
- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.** BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72). Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`. Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag. Flag rides through the sidecar's `bw_report` block. Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table. Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
- **Per-channel ZC Freq surfaced in event modals.** Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq. Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table. Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`). Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
- **`scripts/backfill_sidecars.py --reparse-txt` flag.** Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events. The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar. Implies sidecar regeneration on every event (bypasses the sha/version skip). No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27 .TXT-preservation rollout). Idempotent. Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing. Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
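The `>100 Hz` handling described in the Added list above can be sketched roughly as follows. This is a minimal illustration, not the real `_parse_number()` code: the helper name `parse_zc_freq` and its tuple return shape are hypothetical. Only the stored value (100.0 on `zc_freq_hz`) and the `zc_freq_above_range` flag come from the changelog.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns: an optional ">" prefix, a number, an optional "Hz" unit.
_ABOVE_RANGE = re.compile(r"^>\s*(\d+(?:\.\d+)?)\s*(?:Hz)?$", re.IGNORECASE)
_PLAIN = re.compile(r"^(\d+(?:\.\d+)?)\s*(?:Hz)?$", re.IGNORECASE)


def parse_zc_freq(raw: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for one ASCII-report field."""
    text = raw.strip().strip('"')
    match = _ABOVE_RANGE.match(text)
    if match:
        # ">100 Hz": keep the cutoff as the value and flag it as a lower bound.
        return float(match.group(1)), True
    match = _PLAIN.match(text)
    if match:
        return float(match.group(1)), False
    # Anything else is treated as missing.
    return None, False
```

Under these assumptions a renderer can show `>100` whenever the flag is set and fall back to the plain number otherwise, instead of losing the reading entirely.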
### Fixed (2026-05-28)
- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.** The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (the derivation was inlined into `gather_report_data`, but its numeric seconds counterpart never made it onto the dataclass). Every histogram PDF render raised `AttributeError` and returned a 500. Waveform PDFs were unaffected. Fix: add the field and read it from the projection's existing `bw_report.histogram.interval_size_s` key.
- **Histogram PDF geo channels now share a single nice-quantized y-axis.** Previously each geo subplot auto-scaled independently: Tran, Vert, and Long each showed a different per-channel max, so bar heights weren't directly comparable across channels. The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005 in/s. Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence in multiples of the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label. MicL stays on its own auto-scale (different units). Matches BW's chart styling.
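The shared-axis quantization in the fix above can be illustrated with a short sketch. It assumes five divisions per axis (inferred from the old `max(...) / 5` formula) and reads the listed steps as 1-2-5 multiples of the LSB. The function name `shared_geo_axis` is hypothetical and this is not the renderer's actual code.

```python
import math
from typing import Iterable, Tuple

GEO_LSB_IPS = 0.005  # geophone LSB in in/s, as stated in the changelog


def shared_geo_axis(peaks: Iterable[float], divisions: int = 5,
                    lsb: float = GEO_LSB_IPS) -> Tuple[float, float]:
    """Return (step_per_division, y_max) shared by the Tran/Vert/Long subplots."""
    peak = max(peaks)
    # Whole LSB counts each division must span so the tallest bar still fits.
    needed = max(1, math.ceil(peak / (lsb * divisions) - 1e-9))
    decade = 1
    while True:
        for mantissa in (1, 2, 5):
            counts = mantissa * decade
            if counts >= needed:
                step = round(counts * lsb, 6)
                return step, round(step * divisions, 6)
        decade *= 10
```

With these assumptions, a peak of 0.015 in/s (which the old formula would have labelled 0.003 in/s/div) lands on the 0.005 in/s/div step, and the same `(step, y_max)` pair would be applied to all three geo subplots and to the footer label.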
### Docs (2026-05-28)
- **Roadmap entry for a second undecoded histogram body sub-format.** BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks. Different from the existing `byte[5] != 0` roadmap entry (T190 / O121). Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty. Sample events captured in the roadmap entry for future RE work.
### Migration / Operations
- **Re-parse existing events to pick up the new parser fields.** Run on whichever box hosts the live waveform store:
```bash
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
--reparse-txt --skip-hdf5 --dry-run -v | tail
# Looks reasonable? Run for real:
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
--reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
```
Idempotent; safe to re-run. Only touches sidecars on disk — no DB writes.
- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks. Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.
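As a rough illustration of the display-side conversion the `TZ` setting drives, the sketch below converts a UTC timestamp to the zone named by `TZ`. The helper name `to_display_time` is hypothetical, and the fall-back to UTC when no timezone database is available is an assumption that mirrors the behaviour described for images without `tzdata`. It is not the PDF renderer's actual code.

```python
import os
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def to_display_time(utc_dt: datetime) -> datetime:
    """Convert an aware UTC datetime to the zone named by the TZ env var."""
    zone_name = os.environ.get("TZ", "America/New_York")
    try:
        zone = ZoneInfo(zone_name)
    except (ZoneInfoNotFoundError, ValueError):
        # No usable timezone database or key: timestamps stay in UTC.
        zone = timezone.utc
    return utc_dt.astimezone(zone)
```

The stored value is untouched. Only the rendered copy changes zone, which matches the display-side versus storage-side split described for this release.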
### Pre-v0.20.0 work (rolled into this release)
The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.
#### Fixed
- **bw_ascii_report parser now handles `OORANGE` saturation marker.** BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale. Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events. Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE). New behavior:
- Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
- Peak Vector Sum: substitute `sqrt(3) * geo_range_ips` (the theoretical max when all 3 channels are simultaneously at full-scale) + `peak_vector_sum_saturated` flag
- MicL PSPL: substitute 140 dB(L) (conservative NL-43 max) + `pspl_saturated` flag
- Saturation flags are propagated into the sidecar's `bw_report` block for downstream UI rendering (`> 10 in/s` or similar)
- Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
- **bw_ascii_report parser handles the misspelled `Peak Vector Sum TimeSum` label.** Real BW output uses this label, with a stray `Sum` appended to the intended "Peak Vector Sum Time". It is now accepted as an alias. Confirmed against all three OORANGE example files; every one has the typo.
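The OORANGE substitutions listed above can be sketched as below. The function names and the `(value, saturated)` return shape are hypothetical. The substituted values follow the changelog: `geo_range_ips` for a per-channel PPV, and `sqrt(3) * geo_range_ips` for the Peak Vector Sum, since the vector sum of three channels all at full scale `r` is `sqrt(r² + r² + r²)`.

```python
import math
from typing import Optional, Tuple

OORANGE = "OORANGE"  # BW's saturation marker


def _parse_with_ceiling(raw: str, ceiling: float) -> Tuple[Optional[float], bool]:
    """Return (value, saturated); a saturated field reports the ceiling instead."""
    text = raw.strip()
    if text.upper() == OORANGE:
        return ceiling, True
    try:
        # Take the leading numeric token and ignore any trailing unit.
        return float(text.split()[0]), False
    except (ValueError, IndexError):
        return None, False


def parse_ppv(raw: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Per-channel PPV: full-scale range is a conservative lower bound."""
    return _parse_with_ceiling(raw, geo_range_ips)


def parse_pvs(raw: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Peak Vector Sum: theoretical max with all three channels at full scale."""
    return _parse_with_ceiling(raw, math.sqrt(3) * geo_range_ips)
```

Returning the flag alongside the number is what would let downstream code render a bound such as `> 10 in/s` rather than presenting the substituted value as a measurement.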
#### Added
- **Histogram per-interval aggregation in `waveform.json`.** Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output). When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array. `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings). Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present. Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
- **`bw_ascii_report` parser now captures histogram-specific fields.** Previously the parser dropped these fields silently (Roadmap item closed):
- `Histogram Start Time` / `Histogram Start Date` (combined into `histogram_start: datetime`)
- `Histogram Stop Time` / `Histogram Stop Date` (combined into `histogram_stop: datetime`)
- `Number of Intervals` (`histogram_n_intervals: int`)
- `Interval Size` ("1 minute" string + parsed seconds: `histogram_interval_size_str`, `histogram_interval_size_s`)
- `<Channel> Peak Time` + `<Channel> Peak Date` for histogram events (combined into `channel_peak_when: dict`; waveforms continue to use `time_of_peak_s` relative)
- `Peak Vector Sum Date` (combined with PVS Time into `peak_vector_sum_when: datetime`; clears the previous bogus `peak_vector_sum_time_s` parse that interpreted "22:33:52" as 22.0 seconds)
- All new fields land in the sidecar's `bw_report.histogram` block via `_bw_report_to_dict`. Tested against synthetic K558LLB7.V20H-shaped input.
- **Raw BW ASCII report (.TXT) preservation.** `save_imported_bw` now writes the paired `_ASCII.TXT` to `<store>/<serial>/<filename>_ASCII.TXT` alongside the binary at ingest time. Previously the .TXT was parsed into the sidecar's `bw_report` projection and then discarded — meaning parser bug fixes couldn't be applied retroactively without re-forwarding from the watcher PC. Now the raw .TXT lives in the waveform store permanently (~15 KB per event; ~210 MB total for a 14k-event store; negligible). Sidecar's `source.txt_filename` field records the saved path; backfill_sidecars preserves it across regens. New `GET /db/events/{id}/ascii_report.txt` endpoint serves the raw .TXT for any event ingested after this change. Events ingested before today still return 404 from that endpoint until re-forwarded. Architectural rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF that those tools produced are no longer available — the binary + .TXT (created by BW ACH itself) are our authoritative source for everything going forward.
- **Event Report PDF generation** — `GET /db/events/{id}/report.pdf` returns a single-page letter-portrait PDF for any event with waveform data on disk. Covers every field a Blastware Event Report includes: header metadata (date/time, trigger source, range, sample rate, project/client/operator/location, serial+firmware, battery, calibration, file name), microphone block (PSPL in dB(L) + psi, ZC freq, channel test), per-channel stats table (rows differ for waveform vs histogram), Peak Vector Sum, and the 4-channel plot. Iterated against real Blastware reference PDFs (uploaded to `example-events/pdfsnstuff/`):
- **Waveform layout**: header shows Date/Time, Trigger Source, Range, Sample Rate; stats table has PPV / ZC Freq / Time (Rel. to Trig) / Peak Accel / Peak Disp / Sensor Check; bottom plot is 4-channel line waveform (MicL top → Tran bottom), shared time axis in seconds, dashed trigger line + triangle marker at t=0, symmetric Y on geo channels, zero-anchored on mic, "0.0" baseline label on right per BW convention; footer shows `Time X sec/div Amplitude Geo: Y in/s/div Mic: 0.001 psi(L)/div` and the trigger window `▶━━◀` marker. USBM RI8507/OSMRE compliance chart placeholder upper-right.
- **Histogram layout**: header shows Start / Finish / Intervals At Size / Range / Sample Rate (no Trigger Source — histograms aren't triggered); NO USBM chart; stats table has PPV / ZC Freq / Date / Time / Sensor Check; bottom plot is per-interval bar chart, Y-axis 0-to-peak (never negative), 0.0 baseline at the bottom; footer shows `Time INTERVAL_SIZE /div Amplitude Geo: Y in/s/div Mic: 0.001 psi(L)/div`.
- Backed by matplotlib (vector PDF, no headless-browser dep). Adds matplotlib>=3.8 to deps.
- **Known gap**: histogram codec returns per-block granularity (~200 bars for a 4-interval event) instead of BW's per-interval aggregation. Visual difference vs BW's 4-bar display. XML-driven data source (parsing the structured `_XML.XML` files BW also exports) is the planned fix; that route also resolves the bw_ascii_report PPV-miss bug.
- **Stubbed**: USBM RI8507 / OSMRE compliance chart curves (separate work item; requires coding the regulatory piecewise functions).
- **"Download PDF" button** in the event modal's footer — triggers the new endpoint; opens in a new tab so the browser handles save-or-display + surfaces any 404 / server errors visibly.
- **SFM webapp now opens to Database view by default** and the History table is fully interactive. Click any column header to sort ascending / descending (timestamp, serial, per-channel PPV, PVS, mic dB(L), project, client, record type, key — all sortable). Click any event row to open the event modal, which now renders a **4-channel waveform plot inline** (MicL / Long / Vert / Tran stacked, Instantel-printout order) alongside the existing sidecar review fields. Headers are sticky so the columns stay visible while scrolling long event lists. No more "where is the viewer" — pick a unit from the filter dropdown, scan the table, click the event, see the waveform.
- **Stored-event browser** — new standalone HTML page at `GET /events` (`sfm/event_browser.html`). Pick a serial from the unit dropdown, scroll through that unit's events (newest-first), click any event to render its decoded waveform via the existing `/db/events/{id}/waveform.json` endpoint. Dark-themed Chart.js viewer, channels stacked vertically (MicL / Long / Vert / Tran — Instantel printout order, designed PDF-export-ready), trigger line at t=0, peak labels, search/filter, false-trigger flag honored. Companion to the existing live-device viewer at `/waveform`; the two routes are now clearly delineated in their docstrings. The webapp's inline plot at `/` is the primary path; `/events` remains a useful diagnostic when you want just a viewer.
- **Histogram body codec — uint8 peak count fix.** Per-channel peak fields at `block[6]/[10]/[14]/[18]` are `uint8`, not `uint16 LE` spanning `block[6:8]` etc. The original interpretation was byte-exact on the N844 fixture corpus only because every annotation byte (`block[7]/[11]/[15]/[19]`) in those fixtures was zero. On non-N844 events with non-zero annotation bytes (observed across BE9558 Tran-drift and BE18003 Histogram+Continuous units), the old interpretation produced peaks up to 268 in/s per channel and 35× inflated PVS sums when first deployed to prod (rolled back same day; properly fixed in this release). Cross-correlated against BW's per-interval ASCII export on K558 / T003 / N599 / N844 corpora — 100% byte-exact on T/V/L, 99%+ on M (sub-precision rounding). Annotation byte preserved on each record as `record["annotations"]` for future RE. Verified against ~3,500 blocks across 5 in-repo fixtures + a synthetic K558 interval-12 regression block.
- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`. Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file. BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars. Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED. Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
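The max-per-group histogram aggregation described earlier in this list can be sketched as follows. How the real endpoint divides codec samples among intervals is not stated, so the contiguous near-equal split here is an assumption, and `aggregate_max_per_interval` is a hypothetical name.

```python
from typing import List, Sequence


def aggregate_max_per_interval(samples: Sequence[float],
                               n_intervals: int) -> List[float]:
    """Collapse raw codec samples into one peak per BW-reported interval."""
    if n_intervals <= 0 or not samples:
        # No interval count available (older sidecars): leave the data as-is.
        return list(samples)
    total = len(samples)
    peaks: List[float] = []
    for index in range(n_intervals):
        # Contiguous, near-equal groups (assumed split).
        start = index * total // n_intervals
        stop = (index + 1) * total // n_intervals
        group = samples[start:stop]
        peaks.append(max(group) if group else 0.0)
    return peaks
```

For example, eight samples with `n_intervals=4` yield four bars, one per pair, while `n_intervals=0` returns the raw samples unchanged, similar to the defensive no-op described for events ingested before the parser extension.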
#### Fixed
- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.** Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict). Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.** The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts). Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; this bit us on prod on 2026-05-22 and was rolled back the same day.
- **Thor IDF files no longer attempted as BW events in backfill.** `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
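The `bw_report` preservation fix above amounts to overlaying a few blocks from the existing sidecar onto the regenerated one. A minimal sketch, assuming those blocks are top-level keys of the sidecar dict (the key names come from the changelog; the helper name `overlay_preserved_blocks` is hypothetical):

```python
from typing import Any, Dict, Optional

# Blocks the changelog says survive a regeneration.
PRESERVED_KEYS = ("bw_report", "review", "extensions")


def overlay_preserved_blocks(regenerated: Dict[str, Any],
                             existing: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Copy preserved blocks from the old sidecar onto the freshly built one."""
    merged = dict(regenerated)
    if existing:
        for key in PRESERVED_KEYS:
            if existing.get(key) is not None:
                merged[key] = existing[key]
    return merged
```

Blocks missing from the old sidecar are simply left out of the result, which is one way to keep a "still missing" sidecar distinguishable from a wiped one when comparing snapshots.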
#### Docs
- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes. Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
- **`docs/histogram_codec_re_status.md`** updated with the uint8 retraction and the annotation-byte status.
- Three known issues recorded in the Roadmap that were discovered during prod validation: (1) `bw_ascii_report` parser misses PPV / `vector_sum` on some `.TXT` formats (5 events on prod); (2) NULL-timestamp duplicate-row dedup needed (2 events on prod); (3) histogram body sub-format with `byte[5] != 0` not yet decoded (~3 events on prod with empty `.h5` plots).
---
## v0.19.0 — 2026-05-20
The "device-family separation" release. Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics.
+79 -1
@@ -2,12 +2,90 @@
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
-(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
+(Sierra Wireless RV50 / RV55). Current version: **v0.20.0**.
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
---
## Architecture: three-tier conceptual model
seismo-relay is a **suite of cooperating components**, not a single app.
The three tiers below are the canonical mental model — the current
directory layout doesn't fully reflect them yet (some of what is
conceptually SDM lives under `sfm/` today), but new code should be
placed and named according to this model.
### 1. SFM — the device-side (active connection to physical units)
Replaces Blastware's *talk-to-the-meter* role. Lives where a connection
to a physical seismograph is open.
In scope:
- `minimateplus/{transport,framing,protocol,client}.py` — wire protocol
- `seismo_lab.py` — diagnostic GUI (a thick client for SFM)
- The `/device/*` HTTP endpoints in `sfm/server.py`
`/device/info`, `/device/events`, `/device/monitor/*`, `/device/call_home`,
etc. Anything that opens a connection at the moment of the request.
- Future: a Thor / Micromate live client (mirror `minimateplus/`)
- Future: a control surface Terra-View can launch into — see the
README's Roadmap.
Does NOT own a database. Outputs `Event` objects. Has a "spun up when
needed" runtime profile rather than "always on".
### 2. SDM — the data-side (storage, ingest, and serving)
The new name for the receiving-and-storing role. Originally called SFM
because the FastAPI service started life as a thin device proxy, but
the actual role has migrated heavily toward data management. **For now
the directory remains `sfm/`** — renaming requires touching ~30-50
files in seismo-relay + ~10-15 in terra-view + a Docker volume
migration; deferred until the codebase is quiet enough to do it as a
clean refactor.
In scope:
- `sfm/database.py` (`SeismoDb`)
- `sfm/waveform_store.py`, `sfm/event_hdf5.py`
- The `/db/*` HTTP endpoints — `events`, `units`, `monitor_log`,
`sessions`, `false_trigger` mutations
- The `/db/import/*` ingest endpoints — `blastware_file` (series3),
`idf_file` (series4); anything that receives events FROM somewhere
- `scripts/backfill_sidecars.py`, `scripts/check_bw_report_preservation.py`,
and similar data-maintenance tools
- The `.sfm.json` sidecars and `.h5` files in the waveform store
- The shape that Terra-View consumes (Terra-View should never need to
reach into SFM/device-side endpoints to populate its UI)
Always-on, scaled for storage/serving, has the DB and waveform store.
### 3. Codec library — pure data interpretation (used by both sides)
Neither SFM nor SDM — a shared library both depend on.
In scope:
- `minimateplus/{waveform_codec,histogram_codec,event_file_io,bw_ascii_report,blastware_file}.py`
- `micromate/{idf_ascii_report,idf_file}.py`
These modules take bytes (off the wire on the SFM side, or from a
forwarded file on the SDM side) and return `Event` objects. They
should not import from `sfm/`, must not touch a DB, and have no I/O
beyond reading files passed as arguments. Keep them pure — both
tiers can then depend on them without circularity.
### Practical consequences
When deciding where new code goes, ask:
- *Does it need a connection to a device?* → SFM
- *Does it operate on stored events / sidecars / DB rows?* → SDM
- *Does it interpret bytes into structured data, with no I/O of its own?* → codec lib
Terra-View is downstream of SDM for data, and (per the roadmap) will
eventually call into SFM's device-control endpoints to provide a
"connect to unit" experience.
---
## Project layout
``` ```
+12 -1
@@ -2,10 +2,21 @@ FROM python:3.11-slim
WORKDIR /app
# tzdata is required for the TZ env var to take effect (python:slim
# omits the timezone database). Without it, datetime.now() / logging
# / matplotlib all stay in UTC regardless of TZ. Default zone gets
# set further down via ENV; users override per-deployment via the
# `TZ` env var in docker-compose.
RUN apt-get update && \
-apt-get install -y --no-install-recommends curl && \
+apt-get install -y --no-install-recommends curl tzdata && \
rm -rf /var/lib/apt/lists/*
# Default display timezone — applied to server logs, datetime.now(),
# matplotlib rendered timestamps, and any naïve-vs-aware datetime
# conversions in the PDF renderer. Override via TZ env var in
# docker-compose; storage in the DB is always UTC regardless.
ENV TZ=America/New_York
COPY pyproject.toml requirements.txt ./
COPY minimateplus ./minimateplus
COPY micromate ./micromate
+84 -3
@@ -1,4 +1,4 @@
-# seismo-relay `v0.19.0`
+# seismo-relay `v0.20.0`
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
software for managing seismographs. Supports both the **MiniMate Plus
@@ -35,6 +35,16 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
> and storage layer dispatch deterministically instead of sniffing
> filenames. Self-applying migration backfills existing rows from the
> binary filename extension.
> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
> started in v0.17.x: histogram layouts render correctly against BW
> reference PDFs, the ASCII parser handles real-world edge cases
> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
> Freq is surfaced in both modals (event browser + main webapp).
> Adds a server-wide `TZ` env var so operator-visible timestamps
> render in local time instead of UTC. New
> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
> applied retroactively to existing events without re-forwarding,
> using the `.TXT` files preserved at ingest time.
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
---
@@ -459,6 +469,72 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
## Roadmap (Future)
### Strategic direction — where this is going
seismo-relay is being built as a **suite of cooperating components**
that together replace and improve on Blastware's role. Three logical
tiers:
1. **SFM** (device-side) — owns the active connection to a physical
unit. Today: `minimateplus/`, `/device/*` HTTP endpoints,
`seismo_lab.py`. Future: live Thor / Micromate support.
2. **SDM** (data-side) — owns the database, waveform store, ingest
pipelines, and the read-API that Terra-View consumes. Today this
code lives under `sfm/` for historical reasons; the role has
migrated and the eventual rename is on the long-tail cleanup list.
3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
`bw_ascii_report.py`, `micromate/idf_*.py`. Used by both SFM and
SDM, depends on neither.
Terra-View is downstream of SDM for fleet listings, event detail, etc.
The long-term vision adds a **second link** from Terra-View → SFM for
direct device interaction (see below).
The codec work in this repo isn't trying to replace BW's network
layer — BW's ACH file forwarding and Thor's IDF call-home are
battle-tested. The value is in the receiving and processing side: turn
the stream of binary+ASCII pairs into something users can search,
filter, alert on, and report from.
### Terra-View ↔ SFM device control (the long-term vision)
Today Terra-View only reads from SDM (event listings, dashboards,
project reports). When a unit goes missing — operator notices in the
Terra-View dashboard — there's no way to *do* anything from the UI.
The path of least resistance is to RDP into a Windows box and open
Blastware, which defeats the purpose of having Terra-View.
Target experience:
- Operator notices a unit in Terra-View dashboard hasn't called in.
- Clicks unit detail → "Connect to Device" button.
- Terra-View opens an embedded view (modal or side-panel) that talks
to SFM's `/device/*` endpoints over the network.
- Live view: device clock, battery, memory, current monitor status.
- Actions: start/stop monitoring, push compliance config changes, pull
fresh events, run a sensor self-check, change call-home settings.
- Audit log: every connect / action recorded in SDM for the unit
history.
Implementation steps (concrete):
- [ ] **SFM authentication & authorization layer.** Today `/device/*`
endpoints are unauthenticated — anyone on the network can call
them. Need at minimum token-based auth, ideally with a "who
can connect to which units" mapping. Hard prerequisite for
letting Terra-View users into the control surface.
- [ ] **Terra-View "Connect to Device" entry point** on the unit
detail page. Renders only when unit has connection info on file
and the user has permission.
- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
`seismo_lab.py`'s Bridge tab, but in the browser. Polls SFM's
`/device/monitor/status` on an interval; sends start/stop via
`/device/monitor/{start,stop}`.
- [ ] **Action history** — every connect / push / action call records
a row in `unit_history`, viewable on the unit detail page.
- [ ] **Series IV live-device support in SFM** — currently `/device/*`
only supports MiniMate Plus. Blocks "Connect to Device" for
Thor units until done. Depends on Thor wire-protocol capture
and a `micromate/` parallel of the `minimateplus/` modules.
### High-impact (unblocks product features)
- [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
@@ -470,9 +546,10 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
### BW ASCII report parser enhancements (built in v0.16.0)
- [ ] **Histogram-specific structural fields.** Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds). - [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value. Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag. All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now. Land in the sidecar's `bw_report.histogram` block.
- [ ] **Histogram interval bin-table parsing.** Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed. Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
- [ ] **`>100 Hz` value parsing.** Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information). - [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag. PDF + both modals render `>100 Hz` instead of `—`.
### Ingestion gaps
@@ -498,3 +575,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
- [ ] Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring).
- [ ] Call Home: map time slots 3/4 offsets; confirm `modem_power_relay_enabled`.
- [ ] RV55 DCD/DTR: newer RV55 firmware doesn't assert DCD by default; units don't resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred).
- [ ] **NULL-timestamp duplicate-row dedup.** A small handful of events (2 known on prod as of 2026-05-22) have `events.timestamp IS NULL` because the codec couldn't extract a timestamp from the binary footer. The `UNIQUE(serial, timestamp)` constraint doesn't fire on `NULL` (SQL semantics: `NULL ≠ NULL`), so every `--force` backfill INSERTs a new row instead of UPSERTing the existing one. Cleanup: a one-shot SQL query that keeps only the newest row per `(serial, blastware_filename)` and deletes the rest. Longer-term: extend the unique key to `(serial, COALESCE(timestamp, blastware_filename))` or reject inserts with NULL timestamp.
- [ ] **Histogram body sub-format with `byte[5] != 0`.** ~3 events on prod (`T190LD5Q.LD0H`, `O121L4L1.GU0H`) use a histogram body my walker doesn't recognize — the first block has `byte[5] = 0x01` or `0x07` instead of `0x00`, and the entire body lacks the `1e 0a 00 00` tail signature. Codec returns 0 valid blocks; their DB PVS comes from the bw_report ASCII overlay (which BW computed from the same binary, so the DB columns are correct). Only the `.h5` waveform plot is empty. Cracking the sub-format would unlock the plot. Needs binary+ASCII pairs from a few `byte[5]!=0` events; same RE approach as the K558 case.
- [ ] **Histogram body sub-format with `byte[5] == 0x00` but undecodable.** Observed 2026-05-28 on BE17353 (S353) events: `S353L4H2.FZ0H`, `S353L4H2.P00H`, `S353L4H3.7O0H`, `S353L4H3.E10H`. Body starts `00 00 00 01 0a 00 XX 00 ...` which LOOKS like a valid histogram block header (marker 0x000a at byte[4:6] ✓, byte[5]=0x00 normal-format ✓), but the walker finds zero data blocks across the whole body. Likely an extra header before the block stream OR a different tail signature than `1e 0a 00 00`. Smaller body lengths (1900-2100 bytes) suggest these may be short-recording histogram variants. Same operational impact as the byte[5]!=0 case: event ingests cleanly, DB peaks correct via bw_report overlay, only the chart is empty. Worth dumping a hex view of one body to diagnose.
- [ ] **Sensor-check waveform extraction from the BW binary.** BW's Event Report PDFs include a narrow panel on the right side of the waveform plot showing each channel's response to the sensor self-check signal (a damped sinusoid for geo, sawtooth-at-test-freq for mic). Our parser captures the test RESULTS (`test_freq_hz`, `test_ratio`, `test_amplitude_mv`, `test_results` pass/fail) and the PDF + modal display them as text — but BW's per-sample sensor-check waveform isn't accessible to us today. Two paths to add it: (a) RE the binary to find where the sensor-check samples are stored — could be a section before STRT, after the footer, or in a separate sub-record; protocol reference doesn't currently mention it. (b) If samples aren't in the binary, synthesize a representative waveform from the test parameters (damped sinusoid at `test_freq_hz` with damping from `test_ratio`). Path (a) is the honest answer; path (b) is decorative. Until either lands, the text-only sensor-check display in the report is fine.
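The NULL-timestamp item in the list above can be reproduced in isolation. The following is a minimal sketch, assuming an SQLite-style engine and a hypothetical reduced `events` schema (only `serial`, `timestamp`, and `blastware_filename` come from the text; the `id` column, the sample values, and "newest means highest `id`" are assumptions for illustration). It shows that `UNIQUE(serial, timestamp)` accepts repeated NULL timestamps, and one possible shape for the one-shot cleanup query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real table has many more columns.
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        serial TEXT NOT NULL,
        timestamp TEXT,
        blastware_filename TEXT,
        UNIQUE (serial, timestamp)
    )
    """
)

# Two "backfill" passes over the same NULL-timestamp event. Both INSERTs
# succeed because NULLs never compare equal inside a UNIQUE constraint.
for _ in range(2):
    conn.execute(
        "INSERT INTO events (serial, timestamp, blastware_filename) "
        "VALUES (?, NULL, ?)",
        ("SERIAL-A", "EXAMPLE.AB0W"),  # made-up values
    )
rows_before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Cleanup: among NULL-timestamp rows, keep only the newest row
# (assumed here to be the highest id) per (serial, blastware_filename).
conn.execute(
    """
    DELETE FROM events
    WHERE timestamp IS NULL
      AND id NOT IN (
          SELECT MAX(id) FROM events
          WHERE timestamp IS NULL
          GROUP BY serial, blastware_filename
      )
    """
)
rows_after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(rows_before, rows_after)
```

If the production database is not SQLite, the NULL-in-UNIQUE behaviour is the same under standard SQL semantics, but the cleanup statement may need dialect adjustments.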
+185
@@ -0,0 +1,185 @@
# Histogram body codec — FULLY DECODED (2026-05-20)
Clean working status doc for the MiniMate Plus histogram-mode event
body codec. Companion to `waveform_codec_re_status.md`. The deep
historical record (with retractions and dated analyses) lives in
`docs/instantel_protocol_reference.md §7.6.2`; the authoritative
implementation lives in `minimateplus/histogram_codec.py`.
## TL;DR
**The codec is fully decoded.** Every field of every block in the
in-repo histogram fixture corpus decodes byte-exact against BW's
ASCII export.
26 regression tests pass against ~3,500 blocks across 5 in-repo
fixtures, plus a synthetic regression block taken from a real
BE9558 prod event to lock in the uint8-peak interpretation.
**Important correction (2026-05-21):** the per-channel peak count
is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
etc. The N844 fixture corpus the original RE was done against has
zero values in bytes [7]/[11]/[15]/[19] for every block, so the
two interpretations happened to be equivalent there. Cross-correlating
non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
against BW's per-interval ASCII export (4 channels × ~1400 blocks
per event, across multiple events) gives a 100% byte-exact match
only when the peak is read as uint8. Reading it as uint16 LE produced
peaks up to 268 in/s per channel and 35× inflated PVS sums when first
deployed to prod (rolled back, root-caused, and fixed in commit 7183b95+1).
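To make the correction concrete, here is a small sketch of why the two readings agree on the N844 corpus and diverge elsewhere. The blocks are synthetic (not taken from any fixture), and the helper names are illustrative only:

```python
import struct

GEO_LSB_IPS = 0.005  # in/s per count at Normal range, per this document


def peak_as_uint8(block: bytes, offset: int) -> int:
    """Current interpretation: one byte of peak count."""
    return block[offset]


def peak_as_uint16_le(block: bytes, offset: int) -> int:
    """Earlier (retracted) interpretation: two bytes, little-endian."""
    return struct.unpack_from("<H", block, offset)[0]


# Synthetic block where the byte after the peak is zero (N844-like).
quiet = bytearray(32)
quiet[6], quiet[7] = 6, 0x00

# Synthetic block where the following byte is non-zero (annotation set).
annotated = bytearray(32)
annotated[6], annotated[7] = 6, 0x02

for name, blk in (("quiet", bytes(quiet)), ("annotated", bytes(annotated))):
    u8 = peak_as_uint8(blk, 6)
    u16 = peak_as_uint16_le(blk, 6)
    print(name, u8 * GEO_LSB_IPS, u16 * GEO_LSB_IPS)
```

With the annotation byte at zero both readings give 6 counts; with it set to `0x02` the uint16 reading becomes 518 counts, which is the kind of inflation described above.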
## Body format
```
body = [stream of 32-byte data blocks] + [small trailing remnant]
```
Each block represents one histogram interval. Block layout:
```
[0] 0x00 always-zero tag
[1] segment_id (uint8) 0x00..0x03 — 256 blocks per segment
[2:4] block_ctr (uint16 LE) resets each segment (0x0100, 0x0101, …)
[4:6] 0x000a (uint16 LE) constant marker (= 10)
[6] T_peak_count uint8 Tran peak (count × 0.005 → in/s at Normal,
max 1.275 in/s — fits in uint8)
[7] T_annotation uint8 empirically non-zero on intervals with sub-Hz
or unmeasurable freq; meaning not fully RE'd
[8:10] T_halfperiod uint16 LE Tran half-period in samples
(freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
[10] V_peak_count uint8 Vert peak
[11] V_annotation uint8
[12:14] V_halfperiod uint16 LE Vert freq half-period
[14] L_peak_count uint8 Long peak
[15] L_annotation uint8
[16:18] L_halfperiod uint16 LE Long freq half-period
[18] M_peak_count uint8 MicL peak count
(dB via waveform_codec.mic_count_to_db)
[19] M_annotation uint8
[20:22] M_halfperiod uint16 LE MicL freq half-period
[22:24] 0x00 0x00 constant
[24:28] 4-byte variable purpose unknown — possibly CRC,
timestamp delta, or psi(L) numeric;
not needed for waveform reconstruction
[28:32] 0x1e 0x0a 0x00 0x00 constant block-end signature
```
Reliable block-identification anchor:
```python
block[22:24] == b"\x00\x00" and block[28:32] == b"\x1e\x0a\x00\x00"
```
(The `1e 0a 00 00` constant tail is the most distinctive signature.)
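The layout and anchor above can be exercised with a short parser. This is a sketch under stated assumptions, not the code in `minimateplus/histogram_codec.py`: `HistBlock`, `parse_block`, and `build_block` are hypothetical names, and the sample block is assembled synthetically (the counter value `0x0182` is made up).

```python
import struct
from typing import NamedTuple, Optional, Tuple

BLOCK_SIZE = 32
TAIL_SIGNATURE = b"\x1e\x0a\x00\x00"
CHANNEL_OFFSETS = (6, 10, 14, 18)  # Tran, Vert, Long, MicL


class HistBlock(NamedTuple):
    segment_id: int
    block_ctr: int
    peaks: Tuple[int, ...]        # uint8 counts
    annotations: Tuple[int, ...]  # uint8, meaning not fully known
    halfperiods: Tuple[int, ...]  # uint16 LE, in samples


def parse_block(block: bytes) -> Optional[HistBlock]:
    """Return the decoded fields, or None if the anchor bytes do not match."""
    if len(block) != BLOCK_SIZE:
        return None
    if block[22:24] != b"\x00\x00" or block[28:32] != TAIL_SIGNATURE:
        return None
    block_ctr = struct.unpack_from("<H", block, 2)[0]
    # Each channel is (peak uint8, annotation uint8, halfperiod uint16 LE).
    fields = [struct.unpack_from("<BBH", block, off) for off in CHANNEL_OFFSETS]
    peaks, annotations, halfperiods = zip(*fields)
    return HistBlock(block[1], block_ctr, peaks, annotations, halfperiods)


def build_block(segment_id: int, block_ctr: int, channels) -> bytes:
    """Test helper: assemble a synthetic block from (peak, ann, halfp) triples."""
    out = bytearray(BLOCK_SIZE)
    out[1] = segment_id
    struct.pack_into("<HH", out, 2, block_ctr, 0x000A)
    for off, triple in zip(CHANNEL_OFFSETS, channels):
        struct.pack_into("<BBH", out, off, *triple)
    out[28:32] = TAIL_SIGNATURE
    return bytes(out)


sample = build_block(0, 0x0182, [(6, 0, 24), (4, 0, 18), (5, 0, 21), (5, 0, 9)])
parsed = parse_block(sample)
print(parsed)
```

An all-zero 32-byte buffer fails the tail check and yields `None`, which is the property that makes the signature useful as a walker anchor.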
## Per-channel encoding
| Channel | Peak encoding | Frequency encoding |
|---|---|---|
| Tran | count × 0.005 = in/s at Normal range | `freq_Hz = 512 / halfperiod` |
| Vert | same | same |
| Long | same | same |
| MicL | count → dB via `mic_count_to_db(count)` (same formula as waveform codec) | same |
**`>100 Hz` sentinel**: when halfperiod ≤ 5 (512/5 = 102.4 Hz, so every
value in this range exceeds 100 Hz under the 512/halfp formula), BW
displays `>100 Hz`. The codec's `half_period_to_hz` returns `None` in
this range.
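The conversions in the table can be sketched as three small functions. These are stand-ins written for illustration, not the real `half_period_to_hz` or `mic_count_to_db`; in particular the `81.94` offset is taken from the worked example in this document, and returning `None` for a zero mic count is an assumption.

```python
import math
from typing import Optional

GEO_LSB_IPS = 0.005           # in/s per count at Normal range
HALFPERIOD_NUMERATOR = 512    # freq_Hz = 512 / halfperiod
ABOVE_RANGE_HALFPERIOD = 5    # halfperiod <= 5 is shown by BW as ">100 Hz"
MIC_DB_OFFSET = 81.94         # from the worked example in this document


def geo_count_to_ips(count: int) -> float:
    return count * GEO_LSB_IPS


def half_period_to_hz(halfperiod: int) -> Optional[float]:
    """None stands for the '>100 Hz' sentinel range."""
    if halfperiod <= ABOVE_RANGE_HALFPERIOD:
        return None
    return HALFPERIOD_NUMERATOR / halfperiod


def mic_count_to_db(count: int) -> Optional[float]:
    """Assumed behaviour: a zero count has no dB value."""
    if count <= 0:
        return None
    return MIC_DB_OFFSET + 20 * math.log10(count)


print(round(geo_count_to_ips(6), 3))   # Tran peak, in/s
print(round(half_period_to_hz(24)))    # Tran freq, Hz
print(round(mic_count_to_db(5), 2))    # MicL peak, dB
print(half_period_to_hz(5))            # sentinel range
```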
## Verified facts (cross-checked against fixture corpus)
Example: N844L6Z8.ZR0H block 130 → all 8 decoded fields byte-exact:
```
binary samples [10, 6, 24, 4, 18, 5, 21, 5, 9]
TXT row [0.030, 21, 0.020, 28, 0.025, 24, 0.040, 0.000, 95.92, 57]
slot[0] = 10 marker
slot[1] = 6 × 0.005 = 0.030 in/s ✓ T_peak
slot[2] = 24 → 512/24 = 21.3 → 21 Hz ✓ T_freq
slot[3] = 4 × 0.005 = 0.020 in/s ✓ V_peak
slot[4] = 18 → 512/18 = 28.4 → 28 Hz ✓ V_freq
slot[5] = 5 × 0.005 = 0.025 in/s ✓ L_peak
slot[6] = 21 → 512/21 = 24.4 → 24 Hz ✓ L_freq
slot[7] = 5 → 81.94 + 20·log10(5) = 95.92 dB ✓ M_peak
slot[8] = 9 → 512/9 = 56.9 → 57 Hz ✓ M_freq
```
## Verified test coverage
`tests/test_histogram_codec.py` (24 tests):
- Block walking: yields one record per `.TXT` interval ± 1 (off-by-one
at the tail when recording was stopped mid-write). Segment-ID
groups of 256 blocks confirmed.
- Geo peaks: every block of N844L20G, N844L6Z8, N844L6XE, N844L23B
matches `.TXT` within the 0.0005 in/s quantization step.
- Geo freqs: every block of N844L6Z8 and N844L6XE matches `.TXT`
within 1 Hz (BW display rounds). `>100 Hz` sentinel handled correctly.
- Mic dB: every block of N844L6XE, N844L23B, N844L6Z8 matches `.TXT`
within 0.1 dB (BW display precision).
- Mic freq: matches `.TXT` within 1 Hz across active blocks.
## What's NOT yet decoded
- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**. Empirically
non-zero on intervals where the per-channel ZC frequency comes
out as `N/A` or sub-Hz (`<1.0`, `1.X`). Hypothesis tested in the
RE session: byte != 0 ↔ sub-Hz freq. Only ~50% correlation
across the K558 corpus, so the relationship is more complex.
Possibilities: time-of-peak-within-interval, halfp extension for
very-long-period signals, or a debug/diagnostic field the firmware
writes opportunistically. Doesn't affect peak amplitudes or
waveform reconstruction. Captured as `record["annotations"]` for
future RE.
- **4-byte variable metadata field (bytes 24:28)**. Not needed for
waveform reconstruction. Speculation: per-block CRC, sub-second
timestamp offset, or a Mic psi(L) count not in the 9 samples.
Punt until something needs it.
- **Geo PVS (TXT col 7, e.g. "0.040 in/s")**. Not stored in the
block; can be approximated as `sqrt(T_peak² + V_peak² + L_peak²)`
but BW's value sometimes differs slightly (probably computed from
waveform-instant samples, not from per-channel peaks). Punt — the
`.h5` consumers don't need PVS as a sample channel.
- **Mic psi(L) value (TXT col 8)**. TXT shows it as a small psi value
derived from the dB measurement. Not in the 9 samples. Could be
derived from `M_peak_count` via the inverse of the dB formula plus
a psi calibration constant. Defer.
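As a quick check of the Geo PVS item above, the approximation can be evaluated on the block 130 peaks quoted earlier in this document. The function name is illustrative. Because each channel's peak may occur at a different instant, the root-sum-square of per-channel peaks is an upper bound on the true vector-sum peak, which is consistent with BW's value being somewhat lower.

```python
import math


def approx_pvs(t_peak: float, v_peak: float, l_peak: float) -> float:
    """Root-sum-square of per-channel peaks (an upper bound on true PVS)."""
    return math.sqrt(t_peak ** 2 + v_peak ** 2 + l_peak ** 2)


# Per-channel peaks from the N844L6Z8.ZR0H block 130 example (in/s).
pvs = approx_pvs(0.030, 0.020, 0.025)
bw_reported = 0.040  # TXT col 7 for the same interval

print(f"approx={pvs:.4f} in/s, BW={bw_reported:.3f} in/s")
```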
## Output shape
`decode_histogram_body` returns the standard 4-channel dict that
mirrors `waveform_codec.decode_waveform_v2`'s output:
```python
{
"Tran": [peak_count_per_interval, ...], # 16-count units (LSB = 0.005 in/s)
"Vert": [..., ...],
"Long": [..., ...],
"MicL": [..., ...], # raw ADC counts
}
```
Run through `waveform_codec.decoded_to_adc_counts` to get 1-count ADC
units (geo ×16, mic passthrough) for the standard `.h5` writer.
For the full per-interval record with frequencies + metadata, use
`decode_histogram_body_full()`.
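The scaling step can be pictured with a stand-in for `waveform_codec.decoded_to_adc_counts`. The real function's signature is not shown in this document, so `to_adc_counts` below is a hypothetical equivalent that only encodes the stated rule (geo ×16, mic passthrough):

```python
from typing import Dict, List

GEO_CHANNELS = ("Tran", "Vert", "Long")
GEO_SCALE = 16  # 16-count units -> 1-count ADC units


def to_adc_counts(decoded: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Scale geo channels by 16; leave the mic channel untouched."""
    out: Dict[str, List[int]] = {}
    for name, samples in decoded.items():
        scale = GEO_SCALE if name in GEO_CHANNELS else 1
        out[name] = [s * scale for s in samples]
    return out


decoded = {"Tran": [6, 7], "Vert": [4, 4], "Long": [5, 3], "MicL": [5, 9]}
adc = to_adc_counts(decoded)
print(adc)
```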
## Where it's wired
- `minimateplus/event_file_io.py:read_blastware_file()` — first tries
the waveform codec, falls back to the histogram codec when the
waveform preamble isn't present. Same output shape, same
downstream pipeline.
- `scripts/backfill_sidecars.py` — the `has_samples` short-circuit
added during the histogram-codec-pending era still serves as a
defensive guard against truly undecodable files, but no longer
fires for valid histograms.
## Companion reference
- `docs/waveform_codec_re_status.md` — sibling status doc for the
much-more-complex waveform-mode codec.
- `docs/instantel_protocol_reference.md §7.6.2` — historical
protocol-reference entry. Structural framing matches what we
found; per-sample semantics were less documented than the `✅
CONFIRMED` badge suggested. This doc supersedes §7.6.2 where they
conflict on confidence level.
+220 -4
@@ -60,6 +60,18 @@ class ChannelStats:
time_of_peak_s: Optional[float] = None # seconds (relative to trigger; can be negative)
peak_accel_g: Optional[float] = None # g (geo channels only)
peak_disp_in: Optional[float] = None # in (geo channels only)
# When BW writes "OORANGE" (Out Of Range — truncated) for a PPV
# value, the true peak exceeded the channel's full-scale range.
# We substitute the range max (e.g. 10.000 in/s for Normal range)
# as a lower bound, and flag here so downstream UI / alerts know
# to render "> 10 in/s" or "saturated" instead of trusting the
# value as an exact measurement.
ppv_saturated: bool = False
# Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
# algorithm's peak frequency exceeded the device's reporting
# ceiling (typically 100 Hz on V10.72). zc_freq_hz gets the
# threshold (100.0) as a lower bound; downstream UI renders ">100".
zc_freq_above_range: bool = False
@dataclass
@@ -69,6 +81,14 @@ class MicStats:
pspl_dbl: Optional[float] = None # dB(L)
zc_freq_hz: Optional[float] = None
time_of_peak_s: Optional[float] = None
# Set when BW writes "OORANGE" for PSPL — mic exceeded its
# measurement range. pspl_dbl gets the conservative upper bound
# 140 dBL (typical NL-43 max; some units cap at 148). Consumers
# should render "> 140 dB(L)" or similar when this flag is set.
pspl_saturated: bool = False
# Same semantics as ChannelStats.zc_freq_above_range — mic ZC
# peak exceeded device reporting ceiling.
zc_freq_above_range: bool = False
@dataclass
@@ -92,6 +112,35 @@ class MonitorLogEntry:
description: Optional[str] = None
# BW saturation marker — appears in PPV / Peak Vector Sum / similar
# numeric fields when the underlying measurement exceeded the
# channel's full-scale range (e.g., a geophone reading > 10 in/s at
# Normal range, or a mic exceeding its sensitivity ceiling). Treated
# as "≥ range_max" + a saturated flag rather than discarded.
# Appears as: ``"Tran PPV : OORANGE in/s"``
_OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")
def _is_oorange(value: str) -> bool:
"""True when a BW numeric field is an Out-Of-Range saturation marker."""
s = value.strip().upper()
return any(m in s for m in _OORANGE_MARKERS)
def _parse_above_range(value: str) -> Optional[float]:
"""For BW "above-range" markers like ">100 Hz", return the threshold.
BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
a peak too fast to count (device cuts off at 100 Hz). Returns the
numeric portion after the '>' (e.g. 100.0), or None if `value` is
not an above-range marker.
"""
s = value.strip()
if not s.startswith(">"):
return None
return _parse_number(s[1:])
@dataclass
class BwAsciiReport:
"""Structured representation of one BW per-event ASCII export."""
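Review note on the hunk above: the OORANGE and `>100 Hz` handling share one idea, which is to keep a bound plus a flag instead of discarding the field. A self-contained sketch of that idea follows. `parse_number` is a hypothetical stand-in for the module's `_parse_number` (whose body is not shown in this diff), and the tuple-returning helpers are illustrative rather than the module's API.

```python
import re
from typing import Optional, Tuple

_NUM = re.compile(r"[-+]?\d+(?:\.\d+)?")
_OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")


def parse_number(value: str) -> Optional[float]:
    """Stand-in: leading numeric portion of a field like '0.030 in/s'."""
    m = _NUM.match(value.strip())
    return float(m.group()) if m else None


def parse_zc_freq(value: str) -> Tuple[Optional[float], bool]:
    """Return (hz, above_range). '>100 Hz' yields (100.0, True)."""
    s = value.strip()
    if s.startswith(">"):
        threshold = parse_number(s[1:])
        if threshold is not None:
            return threshold, True
    return parse_number(s), False


def parse_ppv(value: str, range_max: Optional[float]) -> Tuple[Optional[float], bool]:
    """Return (ips, saturated). OORANGE yields (range_max, True)."""
    if any(marker in value.upper() for marker in _OORANGE_MARKERS):
        return range_max, True
    return parse_number(value), False


print(parse_zc_freq(">100 Hz"))
print(parse_zc_freq("21 Hz"))
print(parse_ppv("OORANGE in/s", 10.0))
print(parse_ppv("0.030 in/s", 10.0))
```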
@@ -144,6 +193,29 @@ class BwAsciiReport:
# ── Vector sum ──────────────────────────────────────────────────────────
peak_vector_sum_ips: Optional[float] = None
peak_vector_sum_time_s: Optional[float] = None
# Saturation flag — set when BW writes "OORANGE" for the PVS. We
# then substitute sqrt(3) * geo_range_ips as a conservative upper
# bound (the theoretical maximum PVS when all 3 geo channels are
# simultaneously at full-scale). Consumers should display this as
# ">{value} in/s" or similar.
peak_vector_sum_saturated: bool = False
# Histograms additionally have an absolute date+time for the PVS
# (it occurred at a specific interval). Waveform reports show
# only the relative-time value above.
peak_vector_sum_when: Optional[datetime.datetime] = None
# ── Histogram-specific fields (populated only when Event Type starts
# with 'Histogram' / 'Full Histogram' / 'Histogram + Continuous') ──
histogram_start: Optional[datetime.datetime] = None
histogram_stop: Optional[datetime.datetime] = None
histogram_n_intervals: Optional[int] = None # e.g. 4, 1436
histogram_interval_size_str: Optional[str] = None # "1 minute" / "5 minutes" / "15 seconds"
histogram_interval_size_s: Optional[float] = None # parsed to seconds
# Per-channel absolute peak time+date (histogram-specific). For
# waveform events these are None — those reports use the channel's
# time_of_peak_s (relative to trigger) instead. Keyed by channel
# name ("Tran", "Vert", "Long", "MicL").
channel_peak_when: Dict[str, datetime.datetime] = field(default_factory=dict)
# ── Sensor self-check (per channel) ─────────────────────────────────────
sensor_check: Dict[str, SensorCheck] = field(default_factory=dict)
@@ -223,6 +295,46 @@ def _parse_event_date(s: str) -> Optional[datetime.date]:
return None
def _parse_iso_date(s: str) -> Optional[datetime.date]:
"""Parse "2026-05-16" → date. Histograms use ISO format for their
Start Date / Stop Date / Peak Date fields; waveforms use the
"May 8, 2026" long form which `_parse_event_date` handles."""
s = s.strip()
try:
return datetime.date.fromisoformat(s)
except ValueError:
return None
_INTERVAL_UNIT_SECONDS = {
"second": 1, "seconds": 1, "sec": 1, "secs": 1,
"minute": 60, "minutes": 60, "min": 60, "mins": 60,
"hour": 3600, "hours": 3600, "hr": 3600, "hrs": 3600,
}
def _parse_interval_size(s: str) -> Optional[float]:
"""Parse "1 minute" / "5 minutes" / "15 seconds" / "2 seconds" → seconds.
Handles the BW Compliance Setup Histogram Interval values verbatim
("2 seconds", "5 seconds", "15 seconds", "1 minute", "5 minutes",
"15 minutes") plus a few defensive variants.
"""
if not s:
return None
parts = s.strip().split()
if len(parts) < 2:
return None
try:
n = float(parts[0])
except ValueError:
return None
unit_per_s = _INTERVAL_UNIT_SECONDS.get(parts[1].lower())
if unit_per_s is None:
return None
return n * unit_per_s
def _parse_event_time(s: str) -> Optional[datetime.time]:
"""Parse "15:56:35" → time."""
s = s.strip()
@@ -336,6 +448,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
in_user_notes_block = False
user_note_position = 0
# Histogram-field staging — BW writes <Channel> Peak Time and
# <Channel> Peak Date on separate lines (and similarly Histogram
# Start Time / Date). We stash the partial value when the time
# line arrives and combine it when the matching date line arrives.
_hist_start_time: Optional[datetime.time] = None
_hist_stop_time: Optional[datetime.time] = None
_pending_peak_time: Dict[str, Optional[datetime.time]] = {}
_pvs_time_raw: Optional[str] = None # last Peak Vector Sum Time value, raw
while i < n:
raw_line = lines[i]
i += 1
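Review note on the staging variables above: the pattern is "stash the time when its line arrives, combine when the matching date line arrives". A stripped-down sketch of just that pattern, outside the real parser loop, is shown here. `combine_peak_lines` is a hypothetical function; it handles only the `<Channel> Peak Time` / `<Channel> Peak Date` labels and not the `MicL Time` / `MicL Date` variants that the real code also accepts.

```python
import datetime
from typing import Dict, Iterable, Optional


def combine_peak_lines(lines: Iterable[str]) -> Dict[str, datetime.datetime]:
    pending: Dict[str, Optional[datetime.time]] = {}
    combined: Dict[str, datetime.datetime] = {}
    for line in lines:
        # Split on the first colon only, so "22:31:38" stays intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        channel = key.split(" ", 1)[0]
        if key.endswith("Peak Time"):
            try:
                pending[channel] = datetime.time.fromisoformat(value)
            except ValueError:
                pending[channel] = None
        elif key.endswith("Peak Date"):
            staged = pending.get(channel)
            try:
                date = datetime.date.fromisoformat(value)
            except ValueError:
                continue
            # Only emit when both halves were seen and parsed.
            if staged is not None:
                combined[channel] = datetime.datetime.combine(date, staged)
    return combined


result = combine_peak_lines(
    [
        "Tran Peak Time : 22:31:38",
        "Tran Peak Date : 2026-05-16",
        "Vert Peak Date : 2026-05-16",  # date without a staged time: skipped
    ]
)
print(result)
```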
@@ -420,23 +541,112 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
):
ch_name, stat = key.split(" ", 1)
cs = report.channels.setdefault(ch_name, ChannelStats())
if stat == "PPV":
if _is_oorange(value):
# Channel saturated — substitute range max as lower
# bound; flag so downstream UI can render "> 10 in/s".
cs.ppv_ips = report.geo_range_ips
cs.ppv_saturated = True
else:
cs.ppv_ips = _parse_number(value)
elif stat == "ZC Freq":
# ">100 Hz" → store threshold + flag; numeric → parse normally
threshold = _parse_above_range(value)
if threshold is not None:
cs.zc_freq_hz = threshold
cs.zc_freq_above_range = True
else:
cs.zc_freq_hz = _parse_number(value)
else:
num = _parse_number(value) num = _parse_number(value)
if stat == "PPV": cs.ppv_ips = num if stat == "Time of Peak": cs.time_of_peak_s = num
elif stat == "ZC Freq": cs.zc_freq_hz = num
elif stat == "Time of Peak": cs.time_of_peak_s = num
elif stat == "Peak Acceleration": cs.peak_accel_g = num elif stat == "Peak Acceleration": cs.peak_accel_g = num
elif stat == "Peak Displacement": cs.peak_disp_in = num elif stat == "Peak Displacement": cs.peak_disp_in = num
# ── Histogram-specific fields ────────────────────────────────────────
# Histograms have Start/Stop time+date pairs + an interval count
# and size, plus per-channel absolute Peak Time/Date instead of
# the waveform's relative Time of Peak.
elif key == "Histogram Start Time":
_hist_start_time = _parse_event_time(value)
elif key == "Histogram Start Date":
_d = _parse_iso_date(value)
if _d and _hist_start_time:
report.histogram_start = datetime.datetime.combine(_d, _hist_start_time)
elif key == "Histogram Stop Time":
_hist_stop_time = _parse_event_time(value)
elif key == "Histogram Stop Date":
_d = _parse_iso_date(value)
if _d and _hist_stop_time:
report.histogram_stop = datetime.datetime.combine(_d, _hist_stop_time)
elif key == "Number of Intervals":
try:
report.histogram_n_intervals = int(float(value.strip()))
except ValueError:
pass
elif key == "Interval Size":
report.histogram_interval_size_str = value.strip()
report.histogram_interval_size_s = _parse_interval_size(value)
# ── Per-channel histogram Peak Date / Peak Time ──
# Lines like "Tran Peak Time : 22:31:38" + "Tran Peak Date : 2026-05-16"
elif key in ("Tran Peak Time", "Vert Peak Time", "Long Peak Time", "MicL Time"):
ch_name = "MicL" if key == "MicL Time" else key.split(" ", 1)[0]
_pending_peak_time[ch_name] = _parse_event_time(value)
elif key in ("Tran Peak Date", "Vert Peak Date", "Long Peak Date", "MicL Date"):
ch_name = "MicL" if key == "MicL Date" else key.split(" ", 1)[0]
_d = _parse_iso_date(value)
_t = _pending_peak_time.get(ch_name)
if _d and _t:
report.channel_peak_when[ch_name] = datetime.datetime.combine(_d, _t)
# ── Vector Sum ───────────────────────────────────────────────────────
elif key == "Peak Vector Sum":
if _is_oorange(value):
# PVS saturated — conservative upper bound is
# sqrt(3) * geo_range_ips (all 3 channels at full-scale).
# Real PVS could be lower (channels rarely peak
# simultaneously) but never higher within the range.
if report.geo_range_ips is not None:
import math as _math
report.peak_vector_sum_ips = _math.sqrt(3) * report.geo_range_ips
report.peak_vector_sum_saturated = True
else:
report.peak_vector_sum_ips = _parse_number(value)
elif key == "Peak Vector Sum Time": # BW writes the PVS-time label with a typo: "Peak Vector Sum TimeSum"
# (looks like Sum got appended twice). Accept both forms. Confirmed
# against actual BW output on 2026-05-27 — every PVS-time line in
# the field examples (T190, T438, K557) uses the typo'd label.
elif key in ("Peak Vector Sum Time", "Peak Vector Sum TimeSum"):
report.peak_vector_sum_time_s = _parse_number(value)
_pvs_time_raw = value
elif key == "Peak Vector Sum Date":
# Histogram-mode PVS gets paired with a date. We may have
# captured 'Peak Vector Sum Time' as either a relative
# seconds float (waveform) or an HH:MM:SS string we
# interpreted as a number. For histograms, BW writes
# "Peak Vector Sum Time : 22:33:52" which _parse_number
# parses as 22.0 (loses information). When Peak Vector Sum
# Date arrives, re-parse the previous PVS time line as a
# clock time and combine into an absolute datetime.
_d = _parse_iso_date(value)
if _d and _pvs_time_raw is not None:
_t = _parse_event_time(_pvs_time_raw)
if _t:
report.peak_vector_sum_when = datetime.datetime.combine(_d, _t)
# The earlier seconds parse was bogus for histograms;
# clear it so consumers don't think it's a real offset.
report.peak_vector_sum_time_s = None
# ── Microphone block ────────────────────────────────────────────────
elif key == "Microphone":
report.mic.weighting = value
elif key == "MicL PSPL":
if _is_oorange(value):
# Mic saturated — substitute conservative upper bound 140 dBL.
report.mic.pspl_dbl = 140.0
report.mic.pspl_saturated = True
else:
report.mic.pspl_dbl = _parse_number(value)
# Mirror onto the "MicL" entry in channels so callers querying # Mirror onto the "MicL" entry in channels so callers querying
# `channels["MicL"].ppv_ips` see something — but it's dB(L), not # `channels["MicL"].ppv_ips` see something — but it's dB(L), not
@@ -446,9 +656,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
cs = report.channels.setdefault("MicL", ChannelStats())
cs.time_of_peak_s = report.mic.time_of_peak_s
elif key == "MicL ZC Freq":
threshold = _parse_above_range(value)
if threshold is not None:
report.mic.zc_freq_hz = threshold
report.mic.zc_freq_above_range = True
else:
report.mic.zc_freq_hz = _parse_number(value)
cs = report.channels.setdefault("MicL", ChannelStats())
cs.zc_freq_hz = report.mic.zc_freq_hz
cs.zc_freq_above_range = report.mic.zc_freq_above_range
# ── Sensor self-check ────────────────────────────────────────────────
elif key in (
+136 -8
@@ -27,6 +27,8 @@ from typing import Optional, Union
from .models import Event, PeakValues, ProjectInfo, Timestamp
from . import blastware_file as _bw # avoid circular reference at module load
from .bw_ascii_report import BwAsciiReport
from .waveform_codec import decode_waveform_v2, decoded_to_adc_counts
from .histogram_codec import decode_histogram_body
# Reference pressure for dB(L) → psi conversion (20 µPa expressed in psi).
# Same constant as sfm/sfm_webapp.html so server-side and browser-side
@@ -47,7 +49,7 @@ SIDECAR_KIND = "sfm.event"
# bumped without a `pip install` re-run — leading to confusing stale # bumped without a `pip install` re-run — leading to confusing stale
# version stamps in sidecars. Bump this constant and CHANGELOG.md # version stamps in sidecars. Bump this constant and CHANGELOG.md
# together at release time. # together at release time.
TOOL_VERSION = "0.16.1" TOOL_VERSION = "0.20.0"
try:
# Best-effort: prefer the installed metadata when it's NEWER than the
@@ -118,7 +120,16 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
"peak_disp_in": cs.peak_disp_in, "peak_disp_in": cs.peak_disp_in,
} }
# Drop all-None entries — keeps the JSON tidy for partial reports. # Drop all-None entries — keeps the JSON tidy for partial reports.
return {k: v for k, v in out.items() if v is not None} out = {k: v for k, v in out.items() if v is not None}
# Saturation flag (only present when True) — signals that ppv_ips
# is the channel range max (a lower bound), not an exact reading.
if getattr(cs, "ppv_saturated", False):
out["ppv_saturated"] = True
# ZC Freq above device reporting ceiling (BW ">100 Hz") — value
# in zc_freq_hz is the threshold, not an exact measurement.
if getattr(cs, "zc_freq_above_range", False):
out["zc_freq_above_range"] = True
return out
def _sc(ch_name: str) -> dict:
sc = report.sensor_check.get(ch_name)
@@ -169,12 +180,22 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
"vector_sum": {
"ips": report.peak_vector_sum_ips,
"time_s": report.peak_vector_sum_time_s,
# Histogram events have an absolute date+time for the PVS
# (the interval at which it occurred); waveform events
# only have the time_s offset.
"when": report.peak_vector_sum_when.isoformat() if report.peak_vector_sum_when else None,
# Set when BW reported the PVS as OORANGE — value is the
# conservative upper bound sqrt(3) * geo_range_ips, not
# an exact peak.
"saturated": bool(getattr(report, "peak_vector_sum_saturated", False)),
},
},
"mic": {
"weighting": report.mic.weighting,
"pspl_dbl": report.mic.pspl_dbl,
"pspl_saturated": bool(getattr(report.mic, "pspl_saturated", False)),
"zc_freq_hz": report.mic.zc_freq_hz,
"zc_freq_above_range": bool(getattr(report.mic, "zc_freq_above_range", False)),
"time_of_peak_s": report.mic.time_of_peak_s,
},
"sensor_check": {
@@ -183,6 +204,17 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
"long": _sc("Long"),
"mic": _sc("MicL"),
},
# Histogram-specific fields (None on waveform-mode events).
# Per-channel absolute peak time/date for histograms — for
# waveforms see channels[ch]["time_of_peak_s"] instead.
"histogram": {
"start": report.histogram_start.isoformat() if report.histogram_start else None,
"stop": report.histogram_stop.isoformat() if report.histogram_stop else None,
"n_intervals": report.histogram_n_intervals,
"interval_size": report.histogram_interval_size_str,
"interval_size_s": report.histogram_interval_size_s,
"channel_peak_when": {ch: dt.isoformat() for ch, dt in report.channel_peak_when.items()},
},
"monitor_log": monitor_log,
"pc_sw_version": report.pc_sw_version,
}
@@ -252,6 +284,60 @@ def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
event.rectime_seconds = report.record_time_s event.rectime_seconds = report.record_time_s
def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
"""Mirror of ``apply_report_to_event`` for the projected sidecar
dict shape (as produced by ``_bw_report_to_dict``).
Why this exists
The ingest path holds a live ``BwAsciiReport`` parsed straight from
the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
device-authoritative peaks onto the codec output before insert.
The backfill path doesn't have the original ``.TXT`` (it's not
retained in the waveform store), but it does have the preserved
``bw_report`` block from the sidecar which contains the same
projected fields. Re-overlaying those during a backfill keeps the
DB peak columns aligned with what BW reports rather than letting
the codec output (which may be incomplete for unhandled formats or
walker edge cases) win by default.
No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
any particular sub-field; only fields with a concrete value get
written. Mirrors ``apply_report_to_event``'s "report wins where
present" semantics.
"""
if not bw_report:
return
if event.peak_values is None:
event.peak_values = PeakValues()
pv = event.peak_values
peaks = bw_report.get("peaks") or {}
tran = (peaks.get("tran") or {}).get("ppv_ips")
vert = (peaks.get("vert") or {}).get("ppv_ips")
long = (peaks.get("long") or {}).get("ppv_ips")
if tran is not None: pv.tran = tran
if vert is not None: pv.vert = vert
if long is not None: pv.long = long
vs_ips = (peaks.get("vector_sum") or {}).get("ips")
if vs_ips is not None:
pv.peak_vector_sum = vs_ips
mic = bw_report.get("mic") or {}
pspl = mic.get("pspl_dbl")
if pspl is not None and pspl > 0:
pv.micl = _dbl_to_psi(pspl)
rec = bw_report.get("recording") or {}
sr = rec.get("sample_rate_sps")
if sr:
event.sample_rate = sr
rt = rec.get("record_time_s")
if rt is not None:
event.rectime_seconds = rt
def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
if pi is None:
return {
@@ -276,6 +362,7 @@ def event_to_sidecar_dict(
blastware_filesize: int,
blastware_sha256: str,
source_kind: str = "sfm-live",
txt_filename: Optional[str] = None,
a5_pickle_filename: Optional[str] = None,
tool_version: str = _TOOL_VERSION_DEFAULT,
captured_at: Optional[datetime.datetime] = None,
@@ -392,6 +479,7 @@ def event_to_sidecar_dict(
"captured_at": captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
"tool_version": tool_version,
"a5_pickle_filename": a5_pickle_filename,
"txt_filename": txt_filename,
},
"review": review or {
@@ -755,11 +843,40 @@ def read_blastware_file(path: Union[str, Path]) -> Event:
ts1 = _bw._decode_ts_be(footer[2:10])
ts2 = _bw._decode_ts_be(footer[10:18])
-# Body: first 6 bytes are the preamble (00 00 ff ff ff ff). Strip
-# them before decoding samples. Any trailing tail past the last
-# full sample-set is silently truncated by _decode_samples_4ch.
-sample_bytes = body[6:] if body[:6].hex() in ("0000ffffffff", "0000FFFFFFFF") else body
-samples = _decode_samples_4ch_int16_le(sample_bytes)
# Body: decode via the verified body codecs. Two formats coexist:
#
# 1. Waveform-mode (.AB0W): starts with 7-byte preamble
# ``00 02 00 [Tran[0] BE] [Tran[1] BE]`` followed by the
# tagged-block delta stream documented in
# ``docs/waveform_codec_re_status.md`` and §7.6.1 of the
# protocol reference. Decoded by ``waveform_codec.decode_waveform_v2``.
#
# 2. Histogram-mode (.AB0H) — a sequence of 32-byte blocks, one
# per histogram interval, each carrying per-channel peak +
# half-period values. Decoded by
# ``histogram_codec.decode_histogram_body``. Both codecs
# return the same channel-grouped output shape, so consumers
# don't need to special-case mode.
#
# The historical ``_decode_samples_4ch_int16_le`` int16-LE
# interpretation was retracted 2026-05-08 (see protocol-ref §7.6.1
# retraction box) — it produced ±32K noise on every event.
#
# If both codecs fail (malformed file, truncated body, unrecognised
# mode, synthetic test input), fall back to empty channels — the
# rest of the event (timestamp, waveform_key, project strings) is
# still recoverable and useful.
decoded = decode_waveform_v2(body)
if decoded is None:
decoded = decode_histogram_body(body)
if decoded is None:
log.warning(
"%s: body codec failed to decode (body starts %s) — "
"raw_samples will be empty", path, body[:8].hex(" "),
)
samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
else:
samples = decoded_to_adc_counts(decoded)
# Metadata strings (label-anchored search across the body).
project = _find_first_string(body, b"Project:")
@@ -793,7 +910,18 @@ def read_blastware_file(path: Union[str, Path]) -> Event:
project=project, client=client, operator=user, sensor_location=seisloc,
)
ev.raw_samples = samples
-ev.peak_values = _peaks_from_samples(samples)
# Only compute peaks from samples when we actually have samples.
# For events neither body codec could decode (malformed, truncated,
# or unrecognised mode), every channel list in samples is empty
# and ``_peaks_from_samples`` would return PeakValues(0, 0, 0, 0, 0).
# That would then OVERWRITE existing good DB peak values (e.g. from
# paired BW ASCII reports) during the backfill UPSERT path.
# Leaving peak_values=None signals "we don't know" to downstream
# consumers; the backfill script seeds from the DB row when it sees
# None, and ``apply_report_to_event`` overlays from a paired ASCII
# report when one is supplied.
has_samples = any(samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL"))
ev.peak_values = _peaks_from_samples(samples) if has_samples else None
ev._a5_frames = None # not recoverable from BW file
return ev
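The "report wins where present" overlay described in the `apply_bw_report_dict_to_event` docstring can be sketched in isolation. This is only a minimal illustration: `Peaks` and `overlay_peaks` are hypothetical stand-ins (not the project's `PeakValues` / `Event` types), and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peaks:
    """Hypothetical stand-in for the project's PeakValues."""
    tran: Optional[float] = None
    vert: Optional[float] = None
    long: Optional[float] = None


def overlay_peaks(codec: Peaks, bw_report: Optional[dict]) -> Peaks:
    """Overlay report values onto codec output.

    Only fields with a concrete (non-None) value are written; a missing
    or empty report leaves the codec output untouched.
    """
    if not bw_report:
        return codec
    peaks = bw_report.get("peaks") or {}
    for ch in ("tran", "vert", "long"):
        value = (peaks.get(ch) or {}).get("ppv_ips")
        if value is not None:
            setattr(codec, ch, value)
    return codec


# Made-up values: the codec produced zeros for Tran/Long, the report
# only carries a concrete Tran peak.
codec_out = Peaks(tran=0.0, vert=0.12, long=0.0)
report = {"peaks": {"tran": {"ppv_ips": 0.345}, "vert": {"ppv_ips": None}, "long": None}}
merged = overlay_peaks(codec_out, report)
print(merged)
```

The `or {}` guards are what make the overlay a clean no-op for sub-blocks that are present but `None`, which a plain `dict.get(key, {})` would not handle.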
+283
View File
@@ -0,0 +1,283 @@
"""
histogram_codec.py: decoder for MiniMate Plus histogram-mode event bodies.
FULLY DECODED 2026-05-20. Every field in every block, verified
byte-exact against BW's ASCII export across multiple histogram
fixtures.
The histogram-mode body is a stream of 32-byte fixed-length blocks,
one block per histogram interval. Each block carries the per-interval
peak amplitude + zero-crossing frequency for all four channels (Tran,
Vert, Long, MicL).
Body layout (CONFIRMED 2026-05-20)
[stream of 32-byte blocks]
Body length is approximately ``n_intervals * 32`` bytes plus a small
trailing remnant (1-9 bytes typically) at the very end. Walker should
iterate 32-stride and stop before the tail.
32-byte block layout
[0] 0x00 always-zero tag
[1] segment_id (uint8) 0x00..0x03 256 blocks per segment
[2:4] block_ctr (uint16 LE) resets each segment (0x0100, 0x0101, )
[4:6] 0x000a (uint16 LE) constant marker (= 10)
[6] T_peak_count uint8 Tran peak (count × 0.005 in/s, max 1.275 in/s)
[7] T_annotation uint8 empirically non-zero on intervals with sub-Hz
or unmeasurable Tran freq; meaning not fully RE'd
[8:10] T_halfperiod uint16 LE Tran half-period in samples (freq = 512 / halfp Hz)
[10] V_peak_count uint8
[11] V_annotation uint8
[12:14] V_halfperiod uint16 LE
[14] L_peak_count uint8
[15] L_annotation uint8
[16:18] L_halfperiod uint16 LE
[18] M_peak_count uint8 MicL peak (count → dB via mic_count_to_db)
[19] M_annotation uint8
[20:22] M_halfperiod uint16 LE MicL half-period in samples (freq = 512 / halfp Hz)
[22:24] 0x00 0x00 constant
[24:28] 4-byte variable purpose unknown (possibly CRC or timestamp delta)
[28:32] 0x1e 0x0a 0x00 0x00 constant block-end signature
NOTE on peak-count width: an earlier interpretation treated the peak
fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
That happened to be byte-exact against the N844 fixture corpus only
because every annotation byte in those fixtures was zero, making
``uint16 LE == uint8``. Cross-correlating BE9558 (K558) Tran-drift
and BE18003 (T003) Histogram+Continuous events against the BW ASCII
export proved peak is uint8 alone; see test_histogram_codec.py
and docs/histogram_codec_re_status.md.
Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``. This is the reliable
distinguisher from non-block content in the file.
Per-channel encoding
Geophone channels (Tran, Vert, Long):
- peak_count × 0.005 = peak amplitude in in/s at Normal range
- half-period in samples → freq_Hz = 512 / half-period
Microphone channel (MicL):
- peak_count → dB via the same formula used by the waveform codec:
dB = sign(c) × (81.94 + 20·log10(|c|)) for |c| ≥ 1
dB = 0 for c == 0
- half-period → freq_Hz = 512 / half-period (same as geo)
Frequency `>100 Hz` sentinel: the device emits half-period ≤ 5 when the
measured zero-crossing rate exceeds the geophone's measurement range
(since 512/5 ≈ 102 Hz; the BW display rounds anything > 100 to ">100").
Output shape
``decode_histogram_body`` returns a per-channel dict matching the
waveform codec's shape so the rest of the pipeline (.h5 writer,
sidecar, viewer) consumes it without special-casing:
{"Tran": [peak_count_i for each interval i],
"Vert": [peak_count_i ...],
"Long": [peak_count_i ...],
"MicL": [peak_count_i ...]}
Values are in **16-count units for geo** (LSB = 0.005 in/s, matching
``decode_waveform_v2``) and **1-count units for mic** (matching the
waveform codec's mic convention). Run through
``waveform_codec.decoded_to_adc_counts`` to scale geo to 1-count ADC.
Per-interval frequencies are NOT returned; they're auxiliary data,
not waveform samples. Consumers needing frequencies can call
``decode_histogram_body_full()`` for the structured per-interval
record list.
"""
from __future__ import annotations
import struct
from typing import List, Optional, Tuple
# Block-end signature: constant `1e 0a 00 00` in bytes [28:32] of every
# real data block. More distinctive than the byte-22 `00 00` (which
# matches many false positives), so we anchor on this.
_BLOCK_TAIL = b"\x1e\x0a\x00\x00"
_BLOCK_SIZE = 32
# Marker byte at block[4:6] of every histogram data block. Used as
# additional validation that we're looking at a real block.
_BLOCK_MARKER = 10
# Geo peak scaling: stored as "count × 0.005 in/s" where 1 count = one
# 0.005 in/s display quantum. Equivalent to the waveform codec's
# 16-count-unit output (1 unit = 0.005 in/s = 16 ADC counts).
_GEO_LSB_INS = 0.005
# Frequency formula: freq_Hz = _FREQ_NUMERATOR / half_period_samples.
# Empirically determined to be 512 (= sample_rate / 2, where sample rate
# is 1024 sps for the standard MiniMate Plus configuration).
_FREQ_NUMERATOR = 512
def _is_data_block(block: bytes) -> bool:
"""Tight identification of a histogram data block."""
if len(block) < _BLOCK_SIZE:
return False
if block[28:32] != _BLOCK_TAIL:
return False
if block[22:24] != b"\x00\x00":
return False
if block[0] != 0x00:
return False
marker = block[4] | (block[5] << 8)
if marker != _BLOCK_MARKER:
return False
return True
def _decode_block(block: bytes) -> Optional[dict]:
"""Decode one 32-byte histogram block. Caller must have validated
with ``_is_data_block`` first.
Returns a record with per-channel peak counts (uint8) and
half-periods (uint16 LE).
"""
# Peak counts are uint8 at bytes [6] / [10] / [14] / [18]. The
# adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
# whose meaning isn't fully understood (empirically non-zero in
# intervals with sub-Hz or unmeasurable geo frequencies, mostly
# zero otherwise — see test fixtures from BE9558/BE18003 corpora).
# Crucially, those annotation bytes are NOT the high byte of the
# peak count: cross-correlating against BW's per-interval ASCII
# export proves the peak is uint8 alone.
#
# Reading the peak as uint16 LE (the original interpretation) was
# accidentally correct only because every block in the N844 fixture
# corpus had a zero annotation byte; non-N844 events with non-zero
# annotation bytes decoded to physically impossible peaks (e.g.
# 268 in/s per channel) and produced 35× inflated PVS sums when
# first run against prod data. See histogram_codec_re_status.md.
t_peak = block[6]
v_peak = block[10]
l_peak = block[14]
m_peak = block[18]
t_halfp = block[8] | (block[9] << 8)
v_halfp = block[12] | (block[13] << 8)
l_halfp = block[16] | (block[17] << 8)
m_halfp = block[20] | (block[21] << 8)
segment_id = block[1]
block_ctr = block[2] | (block[3] << 8)
var_meta = bytes(block[24:28])
annotations = (block[7], block[11], block[15], block[19])
return {
"segment_id": segment_id,
"block_ctr": block_ctr,
"t_peak": t_peak,
"t_halfp": t_halfp,
"v_peak": v_peak,
"v_halfp": v_halfp,
"l_peak": l_peak,
"l_halfp": l_halfp,
"m_peak": m_peak,
"m_halfp": m_halfp,
"meta_var": var_meta,
"annotations": annotations,
}
def walk_body(body: bytes) -> List[dict]:
"""Walk the body and return one dict per histogram interval.
Iterates 32-byte strides from offset 0. Yields a decoded record
for every block that passes ``_is_data_block`` validation. Stops
when the remaining bytes are too short to form a complete block.
In Histogram+Continuous mode the body interleaves data blocks with
other 32-byte content (likely continuous-mode waveform blocks) that
fail the data-block validation; the walker naturally skips them
without losing 32-byte alignment. Use ``block_ctr`` from each
returned record to map back to the original interval index; the
record list is sparse when other block types are interleaved.
"""
records: List[dict] = []
for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
blk = body[off:off + _BLOCK_SIZE]
if not _is_data_block(blk):
# Hit non-block content (likely a sync or stream marker).
# Continue walking — block alignment is fixed at 32-stride
# from offset 0, so we don't lose alignment by skipping.
continue
decoded = _decode_block(blk)
if decoded is None:
# Defensive guard: ``_decode_block`` is typed Optional, but as
# written it always returns a record for a validated block, so
# this branch is not expected to fire. Skip rather than
# append a missing record.
continue
records.append(decoded)
return records
def decode_histogram_body(body: bytes) -> Optional[dict]:
"""Decode a histogram-mode body into per-channel peak-sample arrays.
Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
where each channel's list contains one peak value per histogram
interval (in the same units the waveform codec uses: 16-count units
for geo, 1-count ADC units for mic). Returns ``None`` if the body
doesn't contain any valid histogram blocks.
To convert to physical units:
- Geo channels: ``count * 0.005`` = peak in in/s at Normal range
(or run through ``waveform_codec.decoded_to_adc_counts`` first
to get 1-count ADC values, then ``count / 32767 * 10.0`` for in/s)
- Mic channel: use ``waveform_codec.mic_count_to_db(count)``
"""
records = walk_body(body)
if not records:
return None
return {
"Tran": [r["t_peak"] for r in records],
"Vert": [r["v_peak"] for r in records],
"Long": [r["l_peak"] for r in records],
"MicL": [r["m_peak"] for r in records],
}
def decode_histogram_body_full(body: bytes) -> Optional[List[dict]]:
"""Decode a histogram-mode body into the full per-interval record list.
Same data as ``decode_histogram_body`` but in a structured form that
preserves the half-period (frequency) data for each channel + the
per-block segment_id, block_ctr, and 4-byte variable metadata.
Useful for diagnostic tools, sidecar enrichment, and future-codec
work.
Returns ``None`` if the body has no valid blocks.
"""
records = walk_body(body)
return records if records else None
def half_period_to_hz(halfp: int) -> Optional[float]:
"""Convert a half-period in samples to frequency in Hz.
Returns ``None`` for half-period ≤ 5; the device emits values in
that range when the measured zero-crossing rate exceeds 100 Hz
(the BW display reports `>100 Hz` for such cases). Callers can
treat ``None`` as the `>100 Hz` sentinel.
"""
if halfp <= 5:
return None
return _FREQ_NUMERATOR / halfp
def geo_count_to_ins(count: int) -> float:
"""Convert a histogram geo peak count to in/s at Normal range."""
return count * _GEO_LSB_INS
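The 32-byte block layout documented in the module docstring can be exercised end to end with a synthetic block. The sketch below is an assumption-laden illustration, not the module's code: `build_block` is a hypothetical test helper, the annotation bytes and the [24:28] field are simply zeroed, and `mic_count_to_db` is re-implemented here from the docstring formula rather than imported from `waveform_codec`.

```python
import math
import struct
from typing import Optional

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"


def build_block(peaks, halfps, segment_id=0, block_ctr=0x0100) -> bytes:
    """Assemble one synthetic block (hypothetical helper for illustration)."""
    out = struct.pack("<BBHH", 0x00, segment_id, block_ctr, 10)  # bytes [0:6]
    for peak, halfp in zip(peaks, halfps):
        # uint8 peak, uint8 annotation (zeroed), uint16 LE half-period
        out += struct.pack("<BBH", peak, 0, halfp)
    out += b"\x00\x00"          # [22:24] constant
    out += b"\x00\x00\x00\x00"  # [24:28] variable field, zeroed here
    out += BLOCK_TAIL           # [28:32] block-end signature
    return out


def is_data_block(block: bytes) -> bool:
    """Same four checks the docstring lists as the block anchor."""
    return (
        len(block) == BLOCK_SIZE
        and block[0] == 0x00
        and struct.unpack_from("<H", block, 4)[0] == 10
        and block[22:24] == b"\x00\x00"
        and block[28:32] == BLOCK_TAIL
    )


def half_period_to_hz(halfp: int) -> Optional[float]:
    """None stands for the '>100 Hz' sentinel (half-period <= 5)."""
    return None if halfp <= 5 else 512 / halfp


def mic_count_to_db(count: int) -> float:
    """dB = sign(c) * (81.94 + 20*log10(|c|)), and 0 for c == 0."""
    if count == 0:
        return 0.0
    return math.copysign(81.94 + 20 * math.log10(abs(count)), count)


# Channel order: Tran, Vert, Long, MicL.
block = build_block(peaks=(20, 4, 0, 10), halfps=(32, 5, 0, 64))
assert is_data_block(block)
t_peak, _, t_halfp = struct.unpack_from("<BBH", block, 6)
m_peak, _, m_halfp = struct.unpack_from("<BBH", block, 18)
print(t_peak * 0.005, half_period_to_hz(t_halfp))           # Tran: in/s, Hz
print(mic_count_to_db(m_peak), half_period_to_hz(m_halfp))  # MicL: dB, Hz
```

Reading each channel with `"<BBH"` keeps the annotation byte separate from the peak, which is the distinction the uint8-versus-uint16 note above turns on.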
+2 -1
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "seismo-relay"
-version = "0.19.0"
version = "0.20.0"
description = "Python client and REST server for MiniMate Plus seismographs"
requires-python = ">=3.10"
dependencies = [
@@ -15,6 +15,7 @@ dependencies = [
"python-multipart>=0.0.7",
"h5py>=3.10",
"numpy>=1.24",
"matplotlib>=3.8",
]
[tool.setuptools.packages.find]
+1
View File
@@ -5,3 +5,4 @@ pyserial
python-multipart
h5py
numpy
matplotlib
+130 -11
View File
@@ -12,8 +12,20 @@ Walks `<store_root>/<serial>/<filename>` and for each BW event file:
parsing the BW binary directly (peaks computed from samples).
Clean waveform (.h5):
-- Skip when <filename>.h5 already exists (idempotent).
-- Else write from .a5.pkl (preferred) or BW binary parse (fallback).
- Regenerated whenever the sidecar is regenerated (sha mismatch
OR sidecar.source.tool_version < current TOOL_VERSION OR --force).
The .h5 and the sidecar both come from the same decoder output,
so if the sidecar is stale the .h5 is too.
- Written when missing.
- --skip-hdf5 turns off all .h5 writes.
Typical use after a decoder upgrade:
1. Pull the new seismo-relay code (which bumped TOOL_VERSION).
2. Run this script: every sidecar with an older tool_version
stamp regenerates, and the associated .h5 cascade-regenerates.
3. Operator review state (review.false_trigger, notes, reviewer)
and the sidecar's extensions block are preserved across the
regen.
Usage:
python scripts/backfill_sidecars.py [--store-root PATH]
@@ -42,14 +54,26 @@ log = logging.getLogger("backfill_sidecars")
def _looks_like_event_file(path: Path) -> bool:
-"""Same heuristic as the importer CLI."""
"""Same heuristic as the importer CLI.
Filters to BW (Series III) event files only; Thor (Series IV)
`.IDFW` / `.IDFH` files share the store but have their own ingest
path (`WaveformStore.save_imported_idf`) and are NOT decodable by
`event_file_io.read_blastware_file`. Their sidecars are populated
at ingest from the paired `.IDFW.txt` ASCII report; nothing the
backfill regenerates would improve on them, so we exclude them
from scope.
"""
if not path.is_file():
return False
-if path.name.endswith((".a5.pkl", ".sfm.json")):
if path.name.endswith((".a5.pkl", ".sfm.json", ".h5")):
return False
ext = path.suffix.lstrip(".")
if not (3 <= len(ext) <= 4):
return False
# Thor IDF files share the .{W,H}-suffix shape but aren't BW.
if ext.upper() in ("IDFW", "IDFH"):
return False
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
return False
try:
@@ -79,6 +103,17 @@ def main(argv=None) -> int:
"STRT-rectime byte-offset fix in v0.15.x)."
),
)
p.add_argument(
"--reparse-txt", action="store_true",
help=(
"Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
"current bw_ascii_report parser and overwrite the sidecar's "
"bw_report block. Use this after upgrading the ASCII parser to "
"pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
"ZC peaks). No-op for events without a preserved .TXT; safely "
"idempotent when the parser hasn't changed."
),
)
p.add_argument("-v", "--verbose", action="store_true")
args = p.parse_args(argv)
@@ -123,7 +158,13 @@ def main(argv=None) -> int:
# the sidecar was written by a build that includes any
# decoder fixes shipped since).
# Either part failing → regenerate. --force bypasses both.
-if sidecar_path.exists() and not args.force:
#
# Tracks whether we're regenerating the sidecar this iteration
# so the .h5 logic below knows to refresh that too — staleness
# of the sidecar implies staleness of the derived .h5 (both
# come out of the same decoder).
sidecar_stale = True
if sidecar_path.exists() and not args.force and not args.reparse_txt:
try:
existing = event_file_io.read_sidecar(sidecar_path)
sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
@@ -136,6 +177,7 @@ def main(argv=None) -> int:
ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
if sha_ok and ver_ok:
skipped += 1
sidecar_stale = False
continue
if sha_ok and not ver_ok:
log.info(
@@ -256,19 +298,68 @@ def main(argv=None) -> int:
or ev.total_samples < derived // 4):
ev.total_samples = derived
-# Preserve user-edited review state + extensions from the
-# existing sidecar (false_trigger flag, notes, etc.) so a
-# backfill never wipes them out.
# Preserve user-edited review state + extensions + the
# bw_report block from the existing sidecar so a backfill
# never wipes them out. The bw_report block originates
# from the paired .TXT ASCII report parsed at ORIGINAL
# import time (ach forward / direct upload); the .TXT
# file is not in the waveform store, so we can't re-derive
# it from disk. event_to_sidecar_dict takes a
# BwAsciiReport dataclass (not a dict), so for bw_report
# we overlay the existing block after regen instead of
# passing it as a kwarg.
preserved_review = None
preserved_ext = None
preserved_bw_report = None
preserved_txt_fn = None
if sidecar_path.exists():
try:
_existing = event_file_io.read_sidecar(sidecar_path)
preserved_review = _existing.get("review")
preserved_ext = _existing.get("extensions")
preserved_bw_report = _existing.get("bw_report")
# Preserve txt_filename so backfills don't blank out the
# pointer to the saved raw .TXT (events ingested after
# 2026-05-27 have this).
preserved_txt_fn = (_existing.get("source") or {}).get("txt_filename")
except Exception:
pass
# --reparse-txt: if a .TXT is preserved on disk, run the
# current parser against it and overwrite the bw_report
# block. Picks up post-ingest parser fixes (e.g. the
# 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
if args.reparse_txt and preserved_txt_fn:
try:
from minimateplus import bw_ascii_report
txt_path = store.txt_path_for(serial, path.name)
if txt_path.exists():
refreshed = bw_ascii_report.parse_report_file(txt_path)
preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
log.debug("reparsed bw_report from %s", txt_path.name)
else:
log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
txt_path, preserved_txt_fn)
except Exception as exc:
log.warning("--reparse-txt failed for %s: %s", path.name, exc)
# Overlay BW ASCII report fields onto the rebuilt Event
# BEFORE the sidecar + DB write. Mirrors what the ingest
# path does — BW's reported peaks (and sample_rate /
# record_time) win over codec output where present.
#
# Without this step, --force backfill silently overwrites
# the bw_report-overlaid DB columns with codec-derived
# values, which is wrong for events the codec doesn't
# fully decode (e.g. waveform walker edge cases on
# SP0/SS0/SV0-style events, or histogram sub-formats with
# byte[5]!=0 that aren't yet RE'd). Net effect was PVS=0
# on three top-10 events on 2026-05-22.
if preserved_bw_report:
event_file_io.apply_bw_report_dict_to_event(
ev, preserved_bw_report,
)
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
@@ -277,16 +368,44 @@ def main(argv=None) -> int:
blastware_sha256=bw_sha,
source_kind=source_kind,
a5_pickle_filename=a5_filename,
txt_filename=preserved_txt_fn,
review=preserved_review,
extensions=preserved_ext,
)
if preserved_bw_report is not None:
sidecar["bw_report"] = preserved_bw_report
-# Also emit the .h5 clean-waveform file when missing OR when
-# --force was passed (so a re-backfill picks up decoder fixes).
# Also emit the .h5 clean-waveform file when:
# - it's missing, OR
# - --force was passed, OR
# - the sidecar is being regenerated this iteration
# (sha mismatch / tool_version too old). The .h5 and
# the sidecar are both derived from the same decoder
# output, so if the sidecar is stale, so is the .h5.
#
# Both waveform and histogram bodies now decode to real
# samples via event_file_io.read_blastware_file → either
# waveform_codec.decode_waveform_v2 or histogram_codec.
# decode_histogram_body. If samples are still empty after
# both codecs run, it's a genuine "we can't decode this
# file" case (truncated, malformed, or unknown mode);
# skip the .h5 write so we don't replace whatever's
# there with an empty placeholder.
has_samples = bool(
ev.raw_samples and any(
ev.raw_samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL")
)
)
hdf5_path = store.hdf5_path_for(serial, path.name)
hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
hdf5_action = "kept"
-need_h5 = not args.skip_hdf5 and (args.force or not hdf5_path.exists())
need_h5 = (
not args.skip_hdf5
and (args.force or not hdf5_path.exists() or sidecar_stale)
and has_samples
)
if not has_samples and not args.skip_hdf5:
hdf5_action = "skipped-undecodable"
if need_h5:
if args.dry_run:
hdf5_action = "would (re)write"
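The regeneration rule this script applies (sha mismatch, older `tool_version`, or `--force` marks the sidecar stale, and a stale sidecar cascades to the `.h5`) can be sketched as two pure functions. This is a simplified illustration under stated assumptions: `version_tuple`, `sidecar_is_stale`, and `need_h5` are hypothetical names, the script's own `_vt` helper is not shown in this diff and may parse versions differently, and the `--reparse-txt` path is left out.

```python
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse 'v0.20.0' or '0.20.0' into (0, 20, 0); non-numeric parts count as 0."""
    parts = version.lstrip("vV").split(".")
    return tuple(int(p) if p.isdigit() else 0 for p in parts)


def sidecar_is_stale(existing: Optional[dict], bw_sha: str, tool_version: str,
                     force: bool = False) -> bool:
    """True when the sidecar must be regenerated."""
    if force or existing is None:
        return True
    sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
    src_ver = (existing.get("source") or {}).get("tool_version") or "0"
    ver_ok = version_tuple(src_ver) >= version_tuple(tool_version)
    return not (sha_ok and ver_ok)


def need_h5(stale: bool, h5_exists: bool, has_samples: bool,
            force: bool = False, skip_hdf5: bool = False) -> bool:
    """The .h5 is rewritten when missing, forced, or the sidecar is stale,
    but never when the body could not be decoded into samples."""
    return (not skip_hdf5) and (force or not h5_exists or stale) and has_samples


# Made-up sidecar written by an older build.
old = {"blastware": {"sha256": "abc"}, "source": {"tool_version": "0.19.0"}}
stale = sidecar_is_stale(old, "abc", "0.20.0")
print(stale, need_h5(stale, h5_exists=True, has_samples=True))
```

Comparing parsed tuples rather than raw strings matters once a minor version reaches two digits: as strings, "0.9.0" sorts after "0.10.0".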
+185
View File
@@ -0,0 +1,185 @@
"""
scripts/check_bw_report_preservation.py: verify that running backfill_sidecars
doesn't wipe the `bw_report` block from sidecars that already had one.
Two-step workflow:
# Before running backfill — capture a baseline snapshot:
python scripts/check_bw_report_preservation.py snapshot \
--store-root /path/to/waveforms \
--out before.json
# Run backfill:
python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force
# After backfill — diff against the baseline:
python scripts/check_bw_report_preservation.py diff \
--store-root /path/to/waveforms \
--baseline before.json
The diff classifies every sidecar into one of:
PRESERVED      had bw_report before, has same hash now: GOOD
CHANGED        had bw_report before, has different hash now: suspicious
               (backfill should only ever copy the block verbatim)
WIPED          had bw_report before, doesn't now ← BUG: data loss
STILL_MISSING  didn't have bw_report before, still doesn't: expected
NEW            didn't have bw_report before, has one now
               (only possible if a re-ingest happened between snapshots;
               shouldn't happen during backfill)
REMOVED        sidecar existed in baseline, file is gone now
ADDED          sidecar didn't exist in baseline, exists now
Exit code is 0 if no WIPED or CHANGED entries are found, 1 otherwise
(2 when the store root or the baseline file does not exist).
"""
from __future__ import annotations
import argparse
import hashlib
import json
import sys
from pathlib import Path
from typing import Optional
# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from minimateplus import event_file_io
def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
"""Canonical-JSON hash of the bw_report block, or None if absent."""
br = sidecar_data.get("bw_report")
if not br:
return None
# sort_keys for stable hashing across dict-ordering differences
blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
return hashlib.sha256(blob.encode()).hexdigest()
def _scan_store(store_root: Path) -> dict:
"""Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.
Relpath is `<serial>/<filename>`, stable across machines/snapshots.
"""
out: dict[str, Optional[str]] = {}
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
for sidecar in sorted(serial_dir.glob("*.sfm.json")):
relpath = f"{serial_dir.name}/{sidecar.name}"
try:
data = event_file_io.read_sidecar(sidecar)
except Exception as exc:
print(f" WARN: failed to read {relpath}: {exc}", file=sys.stderr)
continue
out[relpath] = _bw_report_hash(data)
return out
def cmd_snapshot(args) -> int:
store_root = Path(args.store_root).expanduser().resolve()
if not store_root.exists():
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
return 2
out_path = Path(args.out).expanduser().resolve()
print(f"Scanning {store_root}")
snapshot = _scan_store(store_root)
with_bw = sum(1 for v in snapshot.values() if v is not None)
without_bw = sum(1 for v in snapshot.values() if v is None)
print(f" total sidecars: {len(snapshot)}")
print(f" with bw_report: {with_bw}")
print(f" without bw_report: {without_bw}")
out_path.parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w") as f:
json.dump({
"store_root": str(store_root),
"total": len(snapshot),
"with_bw": with_bw,
"sidecars": snapshot,
}, f, indent=2, sort_keys=True)
print(f"Wrote baseline → {out_path}")
return 0


def cmd_diff(args) -> int:
    store_root = Path(args.store_root).expanduser().resolve()
    if not store_root.exists():
        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
        return 2
    baseline_path = Path(args.baseline).expanduser().resolve()
    if not baseline_path.exists():
        print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
        return 2
    with open(baseline_path) as f:
        baseline = json.load(f)
    before = baseline["sidecars"]
    print(f"Scanning {store_root} for comparison against {baseline_path.name}")
    after = _scan_store(store_root)
    classes = {k: [] for k in (
        "PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
    )}
    all_keys = set(before) | set(after)
    for key in sorted(all_keys):
        b = before.get(key, "__MISSING__")
        a = after.get(key, "__MISSING__")
        if b == "__MISSING__":
            classes["ADDED"].append(key)
        elif a == "__MISSING__":
            classes["REMOVED"].append(key)
        elif b is None and a is None:
            classes["STILL_MISSING"].append(key)
        elif b is None and a is not None:
            classes["NEW"].append(key)
        elif b is not None and a is None:
            classes["WIPED"].append(key)
        elif b == a:
            classes["PRESERVED"].append(key)
        else:
            classes["CHANGED"].append(key)
    print()
    print(f"{'class':16s} {'count':>7s}")
    print("-" * 24)
    for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
              "NEW", "ADDED", "REMOVED"):
        print(f"{k:16s} {len(classes[k]):>7d}")
    # Show samples of the concerning classes
    for k in ("WIPED", "CHANGED"):
        if classes[k]:
            print(f"\n=== {k} samples (up to 10) ===")
            for key in classes[k][:10]:
                print(f" {key}")
    if classes["WIPED"] or classes["CHANGED"]:
        print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
        return 1
    print("\nbw_report preservation looks intact.")
    return 0


def main(argv=None) -> int:
    p = argparse.ArgumentParser(description=__doc__)
    sub = p.add_subparsers(dest="cmd", required=True)
    p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
    p_snap.add_argument("--store-root", required=True)
    p_snap.add_argument("--out", required=True, help="output JSON path")
    p_snap.set_defaults(func=cmd_snapshot)
    p_diff = sub.add_parser("diff", help="diff current store against a baseline")
    p_diff.add_argument("--store-root", required=True)
    p_diff.add_argument("--baseline", required=True, help="JSON from `snapshot`")
    p_diff.set_defaults(func=cmd_diff)
    args = p.parse_args(argv)
    return args.func(args)


if __name__ == "__main__":
    sys.exit(main())
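
The classification rules in `cmd_diff` can be restated as a small pure function, which makes each branch easy to check in isolation. This is only a sketch under stated assumptions: `classify` and the `_MISSING` sentinel are hypothetical names that do not appear in the script (which uses the string `"__MISSING__"`), and snapshot values are assumed to be either a hash string or `None`, as the baseline JSON above suggests.

```python
_MISSING = object()  # hypothetical sentinel; the script itself uses "__MISSING__"


def classify(before: dict, after: dict, key: str) -> str:
    """Return the preservation class for one sidecar key.

    `before` and `after` map sidecar keys to a bw_report hash,
    or to None when the sidecar has no bw_report block.
    """
    b = before.get(key, _MISSING)
    a = after.get(key, _MISSING)
    if b is _MISSING:
        return "ADDED"          # sidecar did not exist at snapshot time
    if a is _MISSING:
        return "REMOVED"        # sidecar has disappeared since the snapshot
    if b is None and a is None:
        return "STILL_MISSING"  # never had a bw_report
    if b is None:
        return "NEW"            # bw_report appeared
    if a is None:
        return "WIPED"          # bw_report was lost
    return "PRESERVED" if b == a else "CHANGED"
```

Only `WIPED` and `CHANGED` indicate a preservation failure, which matches the exit code 1 path in `cmd_diff`.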
+909
@@ -0,0 +1,909 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>SFM Event Browser</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js"></script>
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
background: #0d1117;
color: #c9d1d9;
font-family: 'Segoe UI', system-ui, sans-serif;
font-size: 13px;
height: 100vh;
display: flex;
flex-direction: column;
overflow: hidden;
}
header {
background: #161b22;
border-bottom: 1px solid #30363d;
padding: 12px 20px;
display: flex;
align-items: center;
gap: 16px;
flex-shrink: 0;
}
header h1 {
font-size: 15px;
font-weight: 600;
color: #f0f6fc;
white-space: nowrap;
}
label { color: #8b949e; font-size: 12px; }
select, input[type="text"], input[type="search"] {
background: #0d1117;
border: 1px solid #30363d;
border-radius: 6px;
color: #c9d1d9;
padding: 5px 8px;
font-size: 13px;
}
select { min-width: 140px; }
input[type="search"] { width: 200px; }
select:focus, input:focus { outline: none; border-color: #388bfd; }
button {
background: #1f6feb;
border: none;
border-radius: 6px;
color: #fff;
cursor: pointer;
font-size: 13px;
font-weight: 500;
padding: 5px 14px;
}
button:hover { background: #388bfd; }
button:disabled { background: #21262d; color: #484f58; cursor: not-allowed; }
#main {
flex: 1;
display: flex;
overflow: hidden;
}
/* ── Event list (left sidebar) ────────────────────────────────── */
#event-list-wrap {
width: 320px;
flex-shrink: 0;
background: #0d1117;
border-right: 1px solid #21262d;
display: flex;
flex-direction: column;
}
#event-list-header {
padding: 10px 14px;
border-bottom: 1px solid #21262d;
font-size: 11px;
color: #8b949e;
text-transform: uppercase;
letter-spacing: 0.06em;
display: flex;
justify-content: space-between;
}
#event-list {
flex: 1;
overflow-y: auto;
}
.event-row {
padding: 8px 14px;
border-bottom: 1px solid #161b22;
cursor: pointer;
transition: background 0.1s;
}
.event-row:hover { background: #161b22; }
.event-row.active { background: #1f3a5f; border-left: 3px solid #58a6ff; padding-left: 11px; }
.event-row .er-top {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 2px;
}
.event-row .er-ts { font-family: monospace; font-size: 12px; color: #c9d1d9; }
.event-row .er-pvs { font-family: monospace; font-size: 12px; color: #58a6ff; font-weight: 600; }
.event-row .er-meta { font-size: 11px; color: #8b949e; }
.event-row.false_trigger .er-pvs { color: #f85149; text-decoration: line-through; }
/* ── Main viewer (right side) ─────────────────────────────────── */
#viewer {
flex: 1;
display: flex;
flex-direction: column;
overflow: hidden;
}
#event-meta {
padding: 12px 20px;
background: #161b22;
border-bottom: 1px solid #21262d;
display: grid;
grid-template-columns: repeat(auto-fit, minmax(160px, 1fr));
gap: 8px 24px;
flex-shrink: 0;
}
.meta-field {
display: flex;
flex-direction: column;
gap: 1px;
}
.meta-field .mf-label {
font-size: 10px;
color: #484f58;
text-transform: uppercase;
letter-spacing: 0.05em;
}
.meta-field .mf-value {
font-family: monospace;
font-size: 13px;
color: #c9d1d9;
}
.meta-field .mf-value.highlight { color: #58a6ff; font-weight: 600; }
#charts {
flex: 1;
overflow-y: auto;
padding: 12px 16px;
display: flex;
flex-direction: column;
gap: 10px;
}
.chart-wrap {
background: #161b22;
border: 1px solid #21262d;
border-radius: 8px;
padding: 10px 30px 8px 12px; /* right padding leaves room for the "0.0" baseline label */
}
.chart-label {
font-size: 11px;
font-weight: 600;
letter-spacing: 0.06em;
text-transform: uppercase;
margin-bottom: 4px;
display: flex;
justify-content: space-between;
}
.chart-canvas-wrap { position: relative; height: 130px; }
/* Label colors match CHANNEL_COLORS in the script so each label identifies its trace. */
.ch-tran { color: #f85149; }
.ch-vert { color: #3fb950; }
.ch-long { color: #3a80ff; }
.ch-micl { color: #e066ff; }
#status-bar {
background: #161b22;
border-top: 1px solid #21262d;
padding: 5px 20px;
font-size: 12px;
color: #8b949e;
min-height: 26px;
flex-shrink: 0;
}
#status-bar.error { color: #f85149; }
#status-bar.ok { color: #3fb950; }
#empty-state {
flex: 1;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
color: #484f58;
gap: 8px;
}
#empty-state svg { opacity: 0.3; }
.pill {
background: #21262d;
border-radius: 4px;
padding: 2px 8px;
color: #c9d1d9;
font-family: monospace;
font-size: 11px;
margin-left: 8px;
}
/* Per-channel stats table in the metadata header */
.stats-table {
grid-column: 1 / -1;
border-collapse: collapse;
font-family: monospace;
font-size: 12px;
margin-top: 4px;
}
.stats-table th, .stats-table td {
padding: 3px 14px 3px 0;
text-align: left;
color: #c9d1d9;
}
.stats-table th {
color: #484f58;
font-size: 10px;
text-transform: uppercase;
letter-spacing: 0.05em;
font-weight: 500;
}
/* ── Print view (light theme matching the Instantel printout) ─── */
body.print-view {
background: #ffffff;
color: #000000;
}
body.print-view header,
body.print-view #event-list-wrap,
body.print-view #event-list-header,
body.print-view #event-meta,
body.print-view #status-bar,
body.print-view .chart-wrap {
background: #ffffff;
border-color: #cccccc;
color: #000000;
}
body.print-view .event-row { color: #000; border-bottom-color: #eee; }
body.print-view .event-row:hover { background: #f4f4f4; }
body.print-view .event-row.active {
background: #e6f0ff;
border-left-color: #1f6feb;
}
body.print-view .er-ts { color: #000; }
body.print-view .er-pvs { color: #003a8c; }
body.print-view .er-meta,
body.print-view #event-list-header,
body.print-view .meta-field .mf-label,
body.print-view .stats-table th {
color: #666;
}
body.print-view .mf-value { color: #000; }
body.print-view .mf-value.highlight { color: #003a8c; }
body.print-view label { color: #444; }
body.print-view input, body.print-view select {
background: #fff; color: #000; border-color: #ccc;
}
/* In print theme, the channel-label colors stay (they identify
the trace). Only the chart panel background flips. */
@media print {
header, #event-list-wrap, #status-bar, button { display: none !important; }
body { overflow: visible; height: auto; }
#main, #viewer { overflow: visible; }
#charts { overflow: visible; }
}
</style>
</head>
<body>
<header>
<h1>SFM Event Browser</h1>
<label>Serial</label>
<select id="serial-select">
<option value="">Loading…</option>
</select>
<input type="search" id="event-filter" placeholder="filter events…" />
<span class="pill" id="count-pill"></span>
<button id="mic-unit-toggle" style="margin-left:auto;background:#21262d"
onclick="_setMicUnit(_getMicUnit() === 'dBL' ? 'psi' : 'dBL')"
title="Toggle mic display unit (dBL ↔ psi). Persists across page loads.">
Mic: dBL
</button>
<button id="print-btn" onclick="togglePrintView()" style="background:#21262d">Print view</button>
<button id="reload-btn" onclick="loadSerials()">Reload</button>
</header>
<div id="main">
<div id="event-list-wrap">
<div id="event-list-header">
<span>Events</span>
<span id="event-list-count"></span>
</div>
<div id="event-list"></div>
</div>
<div id="viewer">
<div id="empty-state">
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5">
<polyline points="22 12 18 12 15 21 9 3 6 12 2 12"/>
</svg>
<p>Select a unit and event to view its waveform.</p>
</div>
<div id="event-meta" style="display:none"></div>
<div id="charts" style="display:none"></div>
</div>
</div>
<div id="status-bar">Ready.</div>
<script>
// Channel colors and rendering order mirror Instantel's BW Event Report
// printout: MicL at the top, Tran at the bottom. Colors approximate
// what BW renders (magenta mic, blue long, green vert, red tran).
const CHANNEL_COLORS = {
MicL: '#e066ff',
Long: '#3a80ff',
Vert: '#3fb950',
Tran: '#f85149',
};
const CHANNEL_ORDER = ['MicL', 'Long', 'Vert', 'Tran'];
// Reference pressure for dB(L) — 20 µPa expressed in psi (≈ 2.9e-9 psi).
const DBL_REF = 2.9e-9;
// User-toggleable mic display unit: 'dBL' (default, matches BW printout
// + the rest of SFM) or 'psi' (raw sample unit).
function _getMicUnit() {
return localStorage.getItem('sfm_mic_unit') === 'psi' ? 'psi' : 'dBL';
}
function _setMicUnit(u) {
localStorage.setItem('sfm_mic_unit', u === 'psi' ? 'psi' : 'dBL');
_refreshMicUnitToggle();
if (currentEventId) loadEvent(currentEventId);
}
function _refreshMicUnitToggle() {
const b = document.getElementById('mic-unit-toggle');
if (b) b.textContent = `Mic: ${_getMicUnit()}`;
}
// psi → dB(L). Null for non-positive (log undefined; Chart.js renders as a gap).
function _psiToDbl(psi) {
if (psi == null || !(psi > 0)) return null;
return 20 * Math.log10(psi / DBL_REF);
}
// Per-sample mic chart conversion: rectify the AC waveform, convert to
// dBL, and clamp to the noise-floor minimum. Gives a continuous baseline
// instead of the spiky/discontinuous look you get from raw _psiToDbl.
const MIC_DBL_FLOOR = 60;
function _psiToDblForChart(psi) {
if (psi == null) return MIC_DBL_FLOOR;
const a = Math.abs(psi);
if (a === 0) return MIC_DBL_FLOOR;
const dbl = 20 * Math.log10(a / DBL_REF);
return dbl > MIC_DBL_FLOOR ? dbl : MIC_DBL_FLOOR;
}
// Format an ISO timestamp in the browser's local timezone. UTC values
// (with 'Z' suffix) convert; naive values are interpreted as local clock.
// Returns '—' for null/empty input and the raw string when unparseable.
function _fmtTsLocal(iso) {
if (!iso) return '—';
const d = new Date(iso);
if (isNaN(d)) return iso;
return d.toLocaleString();
}
// Adaptive decimal formatter — scientific notation only for truly extreme
// values. Normal-range peaks render as plain decimals with sensible
// precision (was previously forcing toExponential(3) which produced ugly
// "2.500E-2 IN/S" labels).
function _fmtPeak(v, unit) {
if (v == null || (typeof v === 'number' && !isFinite(v))) return '';
if (typeof v !== 'number') return String(v) + (unit ? ' ' + unit : '');
if (v === 0) return '0' + (unit ? ' ' + unit : '');
const a = Math.abs(v);
const u = unit ? ' ' + unit : '';
if (a >= 0.0001 && a < 10000) {
const d = a >= 100 ? 1 : a >= 10 ? 2 : a >= 1 ? 3 : a >= 0.1 ? 4 : 5;
return v.toFixed(d) + u;
}
return v.toExponential(2) + u;
}
let allEvents = [];
let filteredEvents = [];
let currentEventId = null;
let charts = {};
const apiBase = window.location.origin;
function setStatus(msg, cls = '') {
const bar = document.getElementById('status-bar');
bar.textContent = msg;
bar.className = cls;
}
async function loadSerials() {
setStatus('Loading serials…');
try {
const r = await fetch(`${apiBase}/db/units`);
if (!r.ok) throw new Error(r.statusText);
// /db/units returns a bare list[dict], not {units:[...]}
const units = await r.json();
const sel = document.getElementById('serial-select');
sel.innerHTML = '';
if (!units || units.length === 0) {
sel.innerHTML = '<option value="">(no units found)</option>';
setStatus('No units in DB.', 'error');
return;
}
sel.innerHTML = '<option value="">— pick a unit —</option>' +
units.map(u => {
const n = u.total_events ?? 0;
return `<option value="${u.serial}">${u.serial} (${n} events)</option>`;
}).join('');
setStatus(`Loaded ${units.length} units.`, 'ok');
} catch (e) {
setStatus(`Failed to load units: ${e.message}`, 'error');
}
}
async function loadEventsForSerial(serial) {
if (!serial) {
allEvents = [];
renderEventList();
return;
}
setStatus(`Loading events for ${serial}…`);
try {
const r = await fetch(`${apiBase}/db/events?serial=${encodeURIComponent(serial)}&limit=500`);
if (!r.ok) throw new Error(r.statusText);
const d = await r.json();
allEvents = d.events || [];
document.getElementById('count-pill').textContent = `${allEvents.length} events`;
applyFilter();
setStatus(`Loaded ${allEvents.length} events for ${serial}.`, 'ok');
} catch (e) {
setStatus(`Failed to load events: ${e.message}`, 'error');
}
}
function applyFilter() {
const q = document.getElementById('event-filter').value.toLowerCase().trim();
if (!q) {
filteredEvents = allEvents;
} else {
filteredEvents = allEvents.filter(ev =>
(ev.blastware_filename || '').toLowerCase().includes(q) ||
(ev.timestamp || '').toLowerCase().includes(q) ||
(ev.record_type || '').toLowerCase().includes(q) ||
(ev.project || '').toLowerCase().includes(q)
);
}
document.getElementById('event-list-count').textContent = `${filteredEvents.length} / ${allEvents.length}`;
renderEventList();
}
function renderEventList() {
const list = document.getElementById('event-list');
list.innerHTML = '';
if (filteredEvents.length === 0) {
list.innerHTML = '<div style="padding:14px;color:#484f58;font-size:12px">No events.</div>';
return;
}
for (const ev of filteredEvents) {
const row = document.createElement('div');
row.className = 'event-row' + (ev.false_trigger ? ' false_trigger' : '');
if (ev.id === currentEventId) row.className += ' active';
const ts = _fmtTsLocal(ev.timestamp);
const pvs = ev.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(3)} in/s` : '—';
row.innerHTML = `
<div class="er-top">
<span class="er-ts">${ts || '(no ts)'}</span>
<span class="er-pvs">${pvs}</span>
</div>
<div class="er-meta">${ev.record_type || '?'} · ${ev.blastware_filename || ev.id.slice(0,8)}</div>
`;
row.onclick = () => loadEvent(ev.id);
list.appendChild(row);
}
}
async function loadEvent(eventId) {
currentEventId = eventId;
renderEventList();
setStatus('Loading waveform…');
try {
// Sidecar fetch runs in parallel — its bw_report block carries ZC
// Freq + above-range flags + sensor-check results that the per-
// channel stats table surfaces. Failures are non-fatal (legacy
// events without a preserved .TXT have no sidecar bw_report).
const sidecarP = fetch(`${apiBase}/db/events/${eventId}/sidecar`)
.then(r => r.ok ? r.json() : null)
.catch(() => null);
const r = await fetch(`${apiBase}/db/events/${eventId}/waveform.json`);
if (!r.ok) {
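if (r.status === 404) {
showEmpty('No waveform data for this event (codec returned no samples).');
// Clear the "Loading waveform…" status so the bar does not appear stuck.
setStatus('No waveform data for this event.', 'error');
return;
}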
throw new Error(r.statusText);
}
const data = await r.json();
renderWaveform(data);
// Look up the already-loaded events-list row for a richer header (no extra request)
const ev = allEvents.find(e => e.id === eventId);
const sidecar = await sidecarP;
renderMeta(data, ev, sidecar);
setStatus(`Event loaded.`, 'ok');
} catch (e) {
setStatus(`Failed to load event: ${e.message}`, 'error');
showEmpty(`Error: ${e.message}`);
}
}
function showEmpty(msg) {
document.getElementById('empty-state').style.display = 'flex';
document.getElementById('empty-state').querySelector('p').textContent = msg;
document.getElementById('event-meta').style.display = 'none';
document.getElementById('charts').style.display = 'none';
Object.values(charts).forEach(c => c.destroy());
charts = {};
}
function renderMeta(data, ev, sidecar) {
const metaDiv = document.getElementById('event-meta');
const fields = [
['Serial', data.serial || ev?.serial || '—'],
['Timestamp', _fmtTsLocal(data.timestamp || ev?.timestamp)],
['Record', data.record_type || ev?.record_type || '—'],
['Sample rate', data.sample_rate ? `${data.sample_rate} sps` : '—'],
['Geo range', data.geo_range ? `${data.geo_range} (${data.geo_full_scale_ips} in/s FS)` : '—'],
['Project', ev?.project || '—'],
['Location', ev?.sensor_location || '—'],
['Peak Vector Sum',
ev?.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(4)} in/s` : '—'],
];
// Per-channel stats table mirroring the printout's middle block.
// PPV from the events DB row; ZC Freq + saturation flags from the
// sidecar's bw_report block (when a .TXT was preserved on ingest).
const bwrPeaks = (sidecar?.bw_report || {}).peaks || {};
const bwrMic = (sidecar?.bw_report || {}).mic || {};
const fmt = v => (v == null ? '—' : (typeof v === 'number' ? v.toFixed(3) : v));
const fmtZc = bwr => {
if (!bwr || bwr.zc_freq_hz == null) return '—';
const prefix = bwr.zc_freq_above_range ? '>' : '';
return `${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
};
const rows = [
['Tran', ev?.tran_ppv, fmtZc(bwrPeaks.tran)],
['Vert', ev?.vert_ppv, fmtZc(bwrPeaks.vert)],
['Long', ev?.long_ppv, fmtZc(bwrPeaks.long)],
];
// Mic display honors the current user preference (dBL default).
// mic_ppv is stored as raw psi on series3 events; convert when needed.
const micPsi = ev?.mic_ppv;
const micUnitDisplay = _getMicUnit();
let micStr;
if (micPsi == null) {
micStr = '—';
} else if (micUnitDisplay === 'dBL') {
const d = _psiToDbl(Number(micPsi));
micStr = (d != null ? d.toFixed(1) : '—') + ' dBL';
} else {
micStr = Number(micPsi).toExponential(2) + ' psi';
}
const statsHtml = `
<table class="stats-table">
<thead>
<tr><th>Channel</th><th>PPV (in/s)</th><th>ZC Freq</th></tr>
</thead>
<tbody>
${rows.map(([ch, ppv, zc]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td><td>${zc}</td></tr>`).join('')}
<tr><td>MicL</td><td>${micStr}</td><td>${fmtZc(bwrMic)}</td></tr>
</tbody>
</table>
`;
metaDiv.innerHTML =
fields.map(([l, v]) =>
`<div class="meta-field"><span class="mf-label">${l}</span><span class="mf-value${l === 'Peak Vector Sum' ? ' highlight' : ''}">${v}</span></div>`
).join('') + statsHtml;
metaDiv.style.display = 'grid';
}
function togglePrintView() {
document.body.classList.toggle('print-view');
// Force chart redraw so axis/grid colors are re-evaluated against the
// new background. Easiest: re-render the current event.
if (currentEventId) {
loadEvent(currentEventId);
}
}
function renderWaveform(data) {
document.getElementById('empty-state').style.display = 'none';
const chartsDiv = document.getElementById('charts');
chartsDiv.style.display = 'flex';
chartsDiv.innerHTML = '';
Object.values(charts).forEach(c => c.destroy());
charts = {};
const channels = data.channels || {};
// time_axis is METADATA from sfm.plot.v1 — sample_rate, pretrig_samples,
// t0_ms (first-sample time relative to trigger; negative when pretrig
// exists), dt_ms. Trigger is at t=0 by convention.
const ta = data.time_axis || {};
const sr = ta.sample_rate || 1024;
const dtMs = ta.dt_ms || (1000.0 / sr);
const t0Ms = ta.t0_ms != null ? ta.t0_ms : 0;
const isPrintMode = document.body.classList.contains('print-view');
// Histograms record per-interval peaks (typically 1 per minute/5-min),
// not per-sample waveforms. Render as a tight bar graph instead of a
// line plot — matches the BW Event Report's histogram presentation.
const isHistogram = String(data.record_type || '').toLowerCase().includes('histogram');
// Which channels actually have data → determines which one renders the
// shared x-axis at the bottom (Instantel printout has the time scale
// only on the bottom-most chart).
const channelsWithData = CHANNEL_ORDER.filter(ch =>
channels[ch] && (channels[ch].values || []).length > 0
);
const lastDataCh = channelsWithData[channelsWithData.length - 1];
const micUnit = _getMicUnit();
for (const ch of CHANNEL_ORDER) {
const chData = channels[ch];
if (!chData) continue;
if ((chData.values || []).length === 0) {
// Render an empty card so user sees the channel exists but is missing
const wrap = document.createElement('div');
wrap.className = 'chart-wrap';
wrap.innerHTML = `
<div class="chart-label ch-${ch.toLowerCase()}">
<span>${ch}</span>
<span style="color:#484f58">no samples decoded</span>
</div>
<div class="chart-canvas-wrap" style="display:flex;align-items:center;justify-content:center;color:#484f58;font-size:12px">empty</div>
`;
chartsDiv.appendChild(wrap);
continue;
}
// Mic channel: convert from raw psi to dB(L) when the user prefers dBL
// (the default). We mutate `values`, `peak`, and `unit` locally so the
// chart datasets + axis title + tooltip + peak label all stay aligned.
let values = chData.values || [];
let unit = chData.unit || 'unit';
let peak = chData.peak;
const peakT = chData.peak_t_ms;
if (ch === 'MicL' && unit === 'psi' && micUnit === 'dBL') {
// Per-sample chart uses rectified-and-floored conversion so the
// baseline is continuous; the peak label uses the unrectified
// converter to preserve the true measurement.
values = values.map(_psiToDblForChart);
peak = _psiToDbl(peak);
unit = 'dB(L)';
}
const peakLabel = peak != null
? `peak ${_fmtPeak(peak, unit)}`
+ (!isHistogram && peakT != null ? ` @ ${peakT.toFixed(1)} ms` : '')
: '';
// Hide x-axis on every chart except the bottom-most data channel —
// gives the "single shared time axis" feel of the BW printout.
const showXAxis = (ch === lastDataCh);
const wrap = document.createElement('div');
wrap.className = 'chart-wrap';
const lbl = document.createElement('div');
lbl.className = `chart-label ch-${ch.toLowerCase()}`;
lbl.innerHTML = `<span>${ch}</span><span style="color:#8b949e;font-weight:normal">${peakLabel}</span>`;
wrap.appendChild(lbl);
const canvasWrap = document.createElement('div');
canvasWrap.className = 'chart-canvas-wrap';
const canvas = document.createElement('canvas');
canvasWrap.appendChild(canvas);
wrap.appendChild(canvasWrap);
chartsDiv.appendChild(wrap);
// Waveform: per-sample time in ms relative to trigger (negative for pretrig).
// Histogram: when the server has aggregated to BW-reported intervals AND
// provides per-interval timestamps, use those as x-axis labels (HH:MM:SS).
// Falls back to interval index.
let times;
if (isHistogram) {
const intervalTimes = ta.interval_times || [];
times = (intervalTimes.length === values.length)
? intervalTimes
: values.map((_, i) => i + 1);
} else {
times = values.map((_, i) => t0Ms + i * dtMs);
}
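// Downsample for rendering. This is plain stride decimation (keep every
// step-th sample), so a narrow peak can fall between kept samples; the
// peak label above still reports the true value from the server.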
const MAX_POINTS = 4000;
let rT = times, rV = values;
if (values.length > MAX_POINTS) {
const step = Math.ceil(values.length / MAX_POINTS);
rT = times.filter((_, i) => i % step === 0);
rV = values.filter((_, i) => i % step === 0);
}
// Tick formatter — round to 1 decimal so we don't get
// "11.7187040000000002 ms" garbage from floating-point accumulation.
const xAxisUnit = isHistogram ? '' : ' ms';
const fmtTick = i => {
const v = rT[i];
if (typeof v !== 'number') return String(v) + xAxisUnit;
return (Number.isInteger(v) ? String(v) : v.toFixed(1)) + xAxisUnit;
};
// Y-axis bounds. Geophone waveforms render symmetric around zero
// (seismograph convention — zero line in the middle, signal goes
// up AND down). Mic + histograms keep default auto-scale (always
// positive values; zero at the bottom).
let yBounds = {};
const isGeo = ch !== 'MicL';
if (isGeo && !isHistogram) {
// Waveform geo: symmetric around zero for full shape detail.
let absMax = 0;
for (const v of values) {
const a = Math.abs(v);
if (a > absMax) absMax = a;
}
const padded = (absMax || 1) * 1.10;
yBounds = { min: -padded, max: padded };
} else if (isGeo && isHistogram) {
// Histogram geo: enforce minimum chart range so quiet events
// look quiet (matches BW's near-fixed-scale convention).
const HIST_GEO_MIN_INS = 0.05;
let p = 0;
for (const v of values) { const a = Math.abs(v); if (a > p) p = a; }
yBounds = { min: 0, max: Math.max(p * 1.10, HIST_GEO_MIN_INS) };
} else if (ch === 'MicL' && micUnit === 'dBL') {
// Mic dBL: baseline at noise-floor minimum, top at peak + 5 dB.
const peakDbl = (typeof peak === 'number' && isFinite(peak))
? peak + 5
: 100;
yBounds = { min: MIC_DBL_FLOOR, max: Math.max(peakDbl, MIC_DBL_FLOOR + 20) };
} else if (ch === 'MicL' && isHistogram && micUnit === 'psi') {
// Mic histogram in psi: same minimum-range treatment as geo.
const HIST_MIC_MIN_PSI = 0.001;
let p = 0;
for (const v of values) { const a = Math.abs(v); if (a > p) p = a; }
yBounds = { min: 0, max: Math.max(p * 1.10, HIST_MIC_MIN_PSI) };
}
const chart = new Chart(canvas, {
type: isHistogram ? 'bar' : 'line',
data: {
labels: rT.map(t => (typeof t === 'number' ? (Number.isInteger(t) ? String(t) : t.toFixed(2)) : t)),
datasets: isHistogram ? [{
data: rV,
backgroundColor: CHANNEL_COLORS[ch],
borderWidth: 0,
barPercentage: 1.0,
categoryPercentage: 1.0, // bars touch — tight bargraph
}] : [{
data: rV,
borderColor: CHANNEL_COLORS[ch],
borderWidth: 1,
pointRadius: 0,
tension: 0,
}],
},
options: {
animation: false,
responsive: true,
maintainAspectRatio: false,
plugins: {
legend: { display: false },
tooltip: {
mode: 'index',
intersect: false,
callbacks: {
title: items => isHistogram
? `interval ${items[0].label}`
: `t = ${items[0].label} ms`,
label: item => `${ch}: ${_fmtPeak(item.raw, unit)}`,
},
},
},
scales: {
x: {
type: 'category',
display: showXAxis,
ticks: {
color: isPrintMode ? '#666' : '#484f58',
maxTicksLimit: 10,
maxRotation: 0,
callback: (val, i) => fmtTick(i),
},
grid: { color: isPrintMode ? '#e0e0e0' : '#21262d', drawTicks: showXAxis },
},
y: {
...yBounds,
ticks: { color: isPrintMode ? '#666' : '#484f58', maxTicksLimit: 5 },
grid: { color: isPrintMode ? '#e0e0e0' : '#21262d' },
title: { display: true, text: unit,
color: isPrintMode ? '#666' : '#484f58', font: { size: 10 } },
},
},
},
plugins: isHistogram ? [] : [{
// Trigger line @ t=0 + triangle markers above/below + "0.0"
// baseline label on the right edge. Matches the Instantel
// BW Event Report printout style. Skipped for histograms —
// they have no trigger event.
id: 'instantelOverlays',
afterDraw(chart) {
const ctx = chart.ctx;
const xAxis = chart.scales.x;
const yAxis = chart.scales.y;
const fgPrim = isPrintMode ? '#000' : '#c9d1d9';
const fgTrigger = '#f85149';
// Dashed vertical trigger line at t=0
const zeroIdx = rT.findIndex(t => parseFloat(t) >= 0);
if (zeroIdx >= 0) {
const x = xAxis.getPixelForValue(zeroIdx);
ctx.save();
ctx.beginPath();
ctx.moveTo(x, yAxis.top);
ctx.lineTo(x, yAxis.bottom);
ctx.strokeStyle = isPrintMode ? '#cc0000' : 'rgba(248, 81, 73, 0.8)';
ctx.lineWidth = 1.2;
ctx.setLineDash([4, 3]);
ctx.stroke();
ctx.restore();
// Triangles above and below the chart at the trigger column
ctx.save();
ctx.fillStyle = fgTrigger;
ctx.beginPath(); // top triangle pointing down
ctx.moveTo(x - 5, yAxis.top - 8);
ctx.lineTo(x + 5, yAxis.top - 8);
ctx.lineTo(x, yAxis.top - 1);
ctx.closePath();
ctx.fill();
ctx.beginPath(); // bottom triangle pointing up
ctx.moveTo(x - 5, yAxis.bottom + 8);
ctx.lineTo(x + 5, yAxis.bottom + 8);
ctx.lineTo(x, yAxis.bottom + 1);
ctx.closePath();
ctx.fill();
ctx.restore();
}
// "0.0" baseline label on the right edge — printout convention.
// Position vertically at the zero-amplitude level.
const zeroY = yAxis.getPixelForValue(0);
if (zeroY >= yAxis.top && zeroY <= yAxis.bottom) {
ctx.save();
ctx.strokeStyle = isPrintMode ? '#aaa' : '#30363d';
ctx.lineWidth = 0.8;
ctx.setLineDash([2, 2]);
ctx.beginPath();
ctx.moveTo(xAxis.left, zeroY);
ctx.lineTo(xAxis.right, zeroY);
ctx.stroke();
ctx.restore();
ctx.save();
ctx.fillStyle = fgPrim;
ctx.font = '11px monospace';
ctx.textAlign = 'left';
ctx.textBaseline = 'middle';
ctx.fillText('0.0', xAxis.right + 6, zeroY);
ctx.restore();
}
},
}],
});
charts[ch] = chart;
}
}
// Wire up handlers
document.getElementById('serial-select').addEventListener('change', e => {
loadEventsForSerial(e.target.value);
});
document.getElementById('event-filter').addEventListener('input', applyFilter);
// Reflect any persisted mic-unit preference in the header toggle button on load
_refreshMicUnitToggle();
// Initial load
loadSerials();
</script>
</body>
</html>
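
The mic unit handling in the page above comes down to two conversions: a strict psi to dB(L) conversion for the peak label, and a rectified, floored variant for per-sample plotting. The following is a minimal Python restatement of the JavaScript helpers `_psiToDbl` and `_psiToDblForChart`, offered as a sketch rather than as code from the repository. The function names `psi_to_dbl` and `psi_to_dbl_for_chart` are hypothetical; the constants (2.9e-9 psi reference, 60 dB floor) are taken from the source.

```python
import math
from typing import Optional

DBL_REF_PSI = 2.9e-9    # 20 µPa expressed in psi (value from the source)
MIC_DBL_FLOOR = 60.0    # chart noise-floor minimum, in dB(L) (value from the source)


def psi_to_dbl(psi: Optional[float]) -> Optional[float]:
    """Strict conversion: None for missing, NaN, or non-positive input."""
    if psi is None or not (psi > 0):
        return None
    return 20.0 * math.log10(psi / DBL_REF_PSI)


def psi_to_dbl_for_chart(psi: Optional[float]) -> float:
    """Chart conversion: rectify the sample, convert, clamp to the floor."""
    if psi is None:
        return MIC_DBL_FLOOR
    magnitude = abs(psi)
    if magnitude == 0:
        return MIC_DBL_FLOOR
    return max(20.0 * math.log10(magnitude / DBL_REF_PSI), MIC_DBL_FLOOR)
```

The strict version returns no value for non-positive samples because the logarithm is undefined there, while the chart version always returns a number so the plotted baseline stays continuous.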
+900
@@ -0,0 +1,900 @@
"""
sfm/report_pdf.py: generate Instantel-style Event Report PDFs.

Stub layout for v0.20.0; the exact visual is iterated against actual
Blastware reference PDFs (uploaded to docs/reference/instantel/).
Current output captures all the data fields a real BW Event Report
contains, but the visual hierarchy / spacing is still approximate.

Architecture

1. ``gather_report_data(db, store, event_id)`` assembles a flat
   ``ReportData`` record from three sources: the SeismoDb events row,
   the .sfm.json sidecar (bw_report block), and the .h5 waveform
   samples. Returns ``None`` when the event doesn't exist or has no
   waveform data on disk.
2. ``render_event_report_pdf(data)`` takes that record and produces a
   single-page letter-sized PDF as bytes, using matplotlib's PDF
   backend (vector output, no rasterization, prints cleanly).
3. The HTTP endpoint at ``/db/events/{id}/report.pdf`` wires them
   together: fetch event -> gather -> render -> stream bytes back with
   ``Content-Type: application/pdf``.

What's in the report (every field BW's printout includes):

  Header (left):   Date/Time, Trigger Source, Range, Sample Rate, Notes,
                   Project, Client, User Name, Seis. Loc
  Header (right):  Serial + firmware, Battery, Calibration, File Name,
                   Post Event Notes
  Mic block:       PSPL (dBL + psi), ZC Freq, Channel Test result
  Stats table:     per-channel PPV / ZC Freq / Time of Peak /
                   Peak Acceleration / Peak Displacement / Sensor Check
  Peak Vector Sum
  Waveform plot:   4 channels stacked (MicL/Long/Vert/Tran), shared
                   time axis, trigger marker, peak markers
  USBM RI8507/OSMRE compliance chart: STUBBED (separate work item)

Histogram events: the layout differs (Number of Intervals header
field, no trigger marker, per-interval bar chart instead of waveform).
Handled via a record_type branch in ``render_event_report_pdf``.
"""
from __future__ import annotations
import io
import json
import logging
import math
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
import matplotlib
matplotlib.use("Agg") # headless — no display required
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages
log = logging.getLogger(__name__)
# Reference pressure for dB(L) conversion: 20 µPa expressed in psi.
DBL_REF_PSI = 2.9e-9
# ── Data assembly ────────────────────────────────────────────────────────────
@dataclass
class ReportData:
"""All fields needed to render an Instantel-style Event Report.
Most fields are Optional BW's printout shows '' or just omits
sections when source data is missing. The renderer mirrors that.
"""
# Header — left column
event_datetime_str: Optional[str] = None
trigger_source: Optional[str] = None
geo_range_str: Optional[str] = None
sample_rate_str: Optional[str] = None
notes: Optional[str] = None
project: Optional[str] = None
client: Optional[str] = None
operator: Optional[str] = None
sensor_location: Optional[str] = None
# Header — right column
serial: Optional[str] = None
firmware: Optional[str] = None
battery_volts: Optional[float] = None
calibration_date: Optional[str] = None
calibration_by: Optional[str] = None
file_name: Optional[str] = None
post_event_notes: Optional[str] = None
# Microphone block
mic_pspl_dbl: Optional[float] = None
mic_pspl_psi: Optional[float] = None
mic_pspl_time_s: Optional[float] = None
mic_pspl_when_str: Optional[str] = None # histogram absolute date+time, BW-formatted
mic_zc_freq_hz: Optional[float] = None
mic_zc_freq_above_range: bool = False
mic_channel_test_result: Optional[str] = None
mic_channel_test_freq_hz: Optional[float] = None
mic_channel_test_amp_mv: Optional[float] = None
# Per-channel stats — list of dicts (one per channel)
# Keys: name, ppv_ips, zc_freq_hz, time_of_peak_s,
# peak_accel_g, peak_disp_in, sensor_check
channel_stats: list[dict] = field(default_factory=list)
# Peak Vector Sum
peak_vector_sum_ips: Optional[float] = None
peak_vector_sum_time_s: Optional[float] = None
# Waveform samples — channels[ch] = list of floats in physical units
# Time axis derived from sample_rate + pretrig_samples
channels: dict = field(default_factory=dict)
sample_rate_sps: Optional[int] = None
pretrig_samples: Optional[int] = None
t0_ms: Optional[float] = None
dt_ms: Optional[float] = None
# Record-type discriminator
record_type: Optional[str] = None
is_histogram: bool = False
# Histogram-only fields, populated only when record_type starts with 'Hist'
histogram_start_str: Optional[str] = None # "22:30:38 May 16, 2026"
histogram_stop_str: Optional[str] = None
histogram_n_intervals: Optional[float] = None # 4.00
histogram_interval_size: Optional[str] = None # "1 minute"
histogram_interval_size_s: Optional[float] = None # 60.0 — numeric seconds, used to derive interval_times
histogram_interval_times: list[str] = field(default_factory=list) # per-interval timestamps for x-axis
# Peak Vector Sum metadata (histograms show absolute date+time)
peak_vector_sum_when_str: Optional[str] = None
# Bookkeeping
event_id: Optional[str] = None
server_received_at: Optional[str] = None
bw_pc_sw_version: Optional[str] = None
def gather_report_data(
db,
store,
event_id: str,
) -> Optional[ReportData]:
"""Collect every field needed to render an event report.
Returns ``None`` if the event is unknown or has no waveform data
on disk (no .h5, no .a5.pkl; the same condition the waveform.json
endpoint 404s on).
"""
row = db.get_event(event_id)
if row is None:
return None
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
return None
rd = ReportData(
event_id=event_id,
serial=serial,
file_name=filename,
record_type=row.get("record_type"),
is_histogram=str(row.get("record_type", "")).lower().startswith("hist"),
event_datetime_str=row.get("timestamp"),
sample_rate_sps=row.get("sample_rate"),
project=row.get("project"),
client=row.get("client"),
operator=row.get("operator"),
sensor_location=row.get("sensor_location"),
server_received_at=row.get("created_at"),
)
# ── Sidecar bw_report — the rich BW-derived fields ──
sidecar_path = store.sidecar_path_for(serial, filename)
if sidecar_path.exists():
try:
sc = json.loads(sidecar_path.read_text())
except Exception as exc:
log.warning("gather_report_data: sidecar read failed: %s", exc)
sc = {}
bw = sc.get("bw_report") or {}
# Trigger / range / sample-rate display
trig = bw.get("trigger") or {}
rd.trigger_source = (
f"{trig.get('channel','')}: {trig.get('geo_level_ips')} in/s"
if trig.get("channel") or trig.get("geo_level_ips") is not None
else None
)
rec = bw.get("recording") or {}
rd.geo_range_str = (
f"Geo: {rec.get('geo_range_ips')} in/s"
if rec.get("geo_range_ips") is not None else None
)
rt = rec.get("record_time_s")
if rt is not None and rd.sample_rate_sps:
rd.sample_rate_str = f"{rt:.1f} sec At {rd.sample_rate_sps} Sps"
# Device block
dev = bw.get("device") or {}
rd.battery_volts = dev.get("battery_volts")
rd.calibration_date = dev.get("calibration_date")
rd.calibration_by = dev.get("calibration_by")
rd.firmware = bw.get("version")
rd.bw_pc_sw_version = bw.get("pc_sw_version")
# Microphone block
mic = bw.get("mic") or {}
rd.mic_pspl_dbl = mic.get("pspl_dbl")
if rd.mic_pspl_dbl is not None and rd.mic_pspl_dbl > 0:
# Inverse of the dBL formula → psi. Mirrors waveform_codec convention.
rd.mic_pspl_psi = DBL_REF_PSI * (10 ** (rd.mic_pspl_dbl / 20))
rd.mic_pspl_time_s = mic.get("time_of_peak_s")
rd.mic_zc_freq_hz = mic.get("zc_freq_hz")
rd.mic_zc_freq_above_range = bool(mic.get("zc_freq_above_range"))
sc_mic = (bw.get("sensor_check") or {}).get("mic") or {}
rd.mic_channel_test_result = sc_mic.get("result")
rd.mic_channel_test_freq_hz = sc_mic.get("freq_hz")
rd.mic_channel_test_amp_mv = sc_mic.get("amplitude_mv")
# Per-channel stats (Tran / Vert / Long). Per-channel peak
# date+time for histograms comes from bw_report.histogram.channel_peak_when
# (populated when the parser captured it; see the bw_ascii_report
# parser's histogram-fields handler).
peaks = bw.get("peaks") or {}
sc_block = bw.get("sensor_check") or {}
hist_block = bw.get("histogram") or {}
peak_when = hist_block.get("channel_peak_when") or {}
for ch_lc, ch_label in (("tran", "Tran"), ("vert", "Vert"), ("long", "Long")):
ch = peaks.get(ch_lc) or {}
sc_ch = sc_block.get(ch_lc) or {}
ch_when_iso = peak_when.get(ch_label)
peak_date, peak_time = _split_iso_to_date_time(ch_when_iso)
rd.channel_stats.append({
"name": ch_label,
"ppv_ips": ch.get("ppv_ips"),
"zc_freq_hz": ch.get("zc_freq_hz"),
"zc_freq_above_range": bool(ch.get("zc_freq_above_range")),
"time_of_peak_s": ch.get("time_of_peak_s"),
"peak_accel_g": ch.get("peak_accel_g"),
"peak_disp_in": ch.get("peak_disp_in"),
"sensor_check": sc_ch.get("result"),
"peak_date": peak_date,
"peak_time": peak_time,
})
# MicL peak time (used in the mic block — "PSPL ... on DATE at TIME")
mic_when_iso = peak_when.get("MicL")
rd.mic_pspl_when_str = _fmt_iso_to_bw(mic_when_iso) if mic_when_iso else None
# Peak Vector Sum
vs = peaks.get("vector_sum") or {}
rd.peak_vector_sum_ips = vs.get("ips")
rd.peak_vector_sum_time_s = vs.get("time_s")
# PVS absolute date+time (histograms). Same formatting as Mic.
pvs_when_iso = vs.get("when")
rd.peak_vector_sum_when_str = _fmt_iso_to_bw(pvs_when_iso) if pvs_when_iso else None
# Histogram-specific header fields — keys match the projection in
# _bw_report_to_dict ("start" / "stop", not "_str" suffixed).
if rd.is_histogram:
rd.histogram_start_str = hist_block.get("start") or rd.event_datetime_str
rd.histogram_stop_str = hist_block.get("stop")
rd.histogram_n_intervals = hist_block.get("n_intervals")
rd.histogram_interval_size = hist_block.get("interval_size")
rd.histogram_interval_size_s = hist_block.get("interval_size_s")
rd.histogram_interval_times = hist_block.get("interval_times") or []
# ── Waveform samples — from the .h5 via the existing helper ──
from sfm import event_hdf5
h5_path = store.hdf5_path_for(serial, filename)
if h5_path.exists():
try:
wf = event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
rd.channels = {
ch: (chd.get("values") or [])
for ch, chd in (wf.get("channels") or {}).items()
}
ta = wf.get("time_axis") or {}
rd.sample_rate_sps = rd.sample_rate_sps or ta.get("sample_rate")
rd.pretrig_samples = ta.get("pretrig_samples")
rd.t0_ms = ta.get("t0_ms")
rd.dt_ms = ta.get("dt_ms")
except Exception as exc:
log.warning("gather_report_data: hdf5 read failed: %s", exc)
# ── Histogram aggregation ──
# Codec emits ~N per-block samples (typically 1/sec); BW reports
# one bar per configured interval (1 min / 5 min / etc.). When
# bw_report.histogram.n_intervals is populated (events ingested
# with the parser extension), group max-per-group to match. Also
# derives per-interval timestamps for the x-axis. No-op for
# waveform events or when n_intervals is missing.
if rd.is_histogram and rd.histogram_n_intervals and rd.histogram_n_intervals >= 1:
n = int(rd.histogram_n_intervals)
for ch, vals in list(rd.channels.items()):
if not vals:
continue
per_group = len(vals) // n
remainder = len(vals) % n
agg: list = []
offset = 0
for i in range(n):
grp_size = per_group + (1 if i < remainder else 0)
if grp_size > 0:
grp = vals[offset:offset + grp_size]
agg.append(max((abs(v) for v in grp if v is not None), default=0))
offset += grp_size
else:
agg.append(0)
rd.channels[ch] = agg
# Derive per-interval HH:MM:SS labels if we have the start time + size
if rd.histogram_start_str and rd.histogram_interval_size_s and not rd.histogram_interval_times:
try:
import datetime as _dt
start = _dt.datetime.fromisoformat(rd.histogram_start_str)
rd.histogram_interval_times = [
(start + _dt.timedelta(seconds=(i + 1) * rd.histogram_interval_size_s)).strftime("%H:%M:%S")
for i in range(n)
]
except Exception:
pass
return rd
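# A minimal standalone sketch of the histogram aggregation step above, assuming
# the same max-of-absolute-values grouping and end-of-interval labels. The
# function names max_abs_per_group and interval_end_labels are hypothetical.
```python
import datetime as dt
from typing import Optional, Sequence


def max_abs_per_group(vals: Sequence[Optional[float]], n: int) -> list:
    """Split vals into n contiguous groups and keep max(|v|) of each.

    The first len(vals) % n groups receive one extra sample; None samples
    are skipped, and an empty group contributes 0.
    """
    per_group, remainder = divmod(len(vals), n)
    out, offset = [], 0
    for i in range(n):
        size = per_group + (1 if i < remainder else 0)
        grp = vals[offset:offset + size]
        out.append(max((abs(v) for v in grp if v is not None), default=0))
        offset += size
    return out


def interval_end_labels(start_iso: str, interval_s: float, n: int) -> list:
    """HH:MM:SS label for the END of each of n intervals after start_iso."""
    start = dt.datetime.fromisoformat(start_iso)
    return [
        (start + dt.timedelta(seconds=(i + 1) * interval_s)).strftime("%H:%M:%S")
        for i in range(n)
    ]
```
# With 7 samples and 3 intervals the group sizes come out as 3, 2, 2, so the
# uneven remainder lands in the earliest groups.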
# ── PDF rendering ────────────────────────────────────────────────────────────
def render_event_report_pdf(rd: ReportData) -> bytes:
"""Render an event report dict to a single-page letter PDF.
Branches on ``rd.is_histogram``: waveform and histogram layouts
differ in their header fields, stats-table rows, and bottom plot.
Layout modeled on Blastware's Event Report PDFs (samples in
docs/reference/instantel/).
"""
# Letter portrait — 8.5"×11"
fig = plt.figure(figsize=(8.5, 11), dpi=100)
fig.patch.set_facecolor("white")
if rd.is_histogram:
_render_histogram_layout(fig, rd)
else:
_render_waveform_layout(fig, rd)
# Page footer (common to both layouts) — Created date + event id.
# Pushed to the very page bottom so it doesn't collide with the
# waveform footer scale / trigger legend lines just above.
# Convert UTC server_received_at to local for display.
created_local = _fmt_iso_to_bw(rd.server_received_at) if rd.server_received_at else ""
fig.text(
0.07, 0.005,
f"Created: {created_local} • seismo-relay",
fontsize=6, color="#888", ha="left",
)
fig.text(
0.93, 0.005,
f"Event {rd.event_id[:8] if rd.event_id else ''}",
fontsize=6, color="#888", ha="right",
)
buf = io.BytesIO()
fig.savefig(buf, format="pdf")
plt.close(fig)
return buf.getvalue()
def _render_waveform_layout(fig, rd: ReportData) -> None:
"""Waveform layout: header / mic+USBM / per-channel stats / waveform plot.
Stats table includes Time (Rel. to Trig), Peak Accel, Peak Disp.
Left margin sized to fit the channel labels (MicL/Long/Vert/Tran).
Extra bottom margin reserves space for x-axis tick labels +
"Amplitude Geo: X in/s/div Mic: Y psi(L)/div" footer + trigger
legend without overlap.
"""
gs = fig.add_gridspec(
nrows=4, ncols=1,
left=0.11, right=0.94, top=0.97, bottom=0.12,
height_ratios=[1.7, 2.0, 1.8, 5.5],
hspace=0.35,
)
ax_header = fig.add_subplot(gs[0]); ax_header.axis("off")
_draw_header_waveform(ax_header, rd)
ax_mid = fig.add_subplot(gs[1]); ax_mid.axis("off")
_draw_mic_and_usbm(ax_mid, rd)
ax_stats = fig.add_subplot(gs[2]); ax_stats.axis("off")
_draw_channel_stats_waveform(ax_stats, rd)
_draw_waveform_subplot(fig, gs[3], rd)
def _render_histogram_layout(fig, rd: ReportData) -> None:
"""Histogram layout: header / mic-only / per-channel stats / bar plot.
No USBM compliance chart (it's a waveform-only concept). Stats table
uses Date + Time-of-peak instead of relative-time + accel + disp.
Left margin sized to fit the channel labels. Extra bottom margin
leaves room for the x-axis time labels + footer scale legend
without overlap.
"""
gs = fig.add_gridspec(
nrows=4, ncols=1,
left=0.11, right=0.94, top=0.97, bottom=0.12,
height_ratios=[1.8, 0.9, 1.7, 5.6],
hspace=0.35,
)
ax_header = fig.add_subplot(gs[0]); ax_header.axis("off")
_draw_header_histogram(ax_header, rd)
ax_mic = fig.add_subplot(gs[1]); ax_mic.axis("off")
_draw_mic_only(ax_mic, rd)
ax_stats = fig.add_subplot(gs[2]); ax_stats.axis("off")
_draw_channel_stats_histogram(ax_stats, rd)
_draw_histogram_subplot(fig, gs[3], rd)
def _to_display_local(iso: str):
"""Parse an ISO timestamp and return a datetime in the system's local
timezone (set by the TZ env var, default America/New_York via the
Dockerfile).
Behaviour:
- "...Z" or "...+HH:MM" suffix: tz-aware, converted from UTC (or the
given offset) to local.
- Naïve "YYYY-MM-DDTHH:MM:SS" (no tz): returned as-is. This
matches the convention used elsewhere in seismo-relay: BW's
recorded-at timestamps are naïve and ALREADY in the unit's
local clock; we don't second-guess them.
"""
import datetime as _dt
dt = _dt.datetime.fromisoformat(iso.replace("Z", "+00:00"))
if dt.tzinfo is not None:
# Convert from UTC (or other tz) → local per the TZ env var.
# astimezone() without arg uses the system timezone.
dt = dt.astimezone()
return dt
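# A hedged sketch of the aware-vs-naive rule described above. To keep the
# example deterministic it takes an explicit tz argument instead of relying on
# the TZ env var; to_display and the fixed-offset EDT zone are hypothetical
# names used only for illustration.
```python
import datetime as dt
from typing import Optional


def to_display(iso: str, tz: Optional[dt.tzinfo] = None) -> dt.datetime:
    """Aware timestamps are converted to tz; naive ones pass through untouched."""
    parsed = dt.datetime.fromisoformat(iso.replace("Z", "+00:00"))
    if parsed.tzinfo is not None:
        # tz=None makes astimezone() fall back to the system local zone.
        parsed = parsed.astimezone(tz)
    return parsed


# Fixed UTC-4 offset standing in for America/New_York during daylight time.
EDT = dt.timezone(dt.timedelta(hours=-4), "EDT")
```
# A "Z"-suffixed 21:54:31 becomes 17:54:31 under the UTC-4 stand-in, while the
# same wall-clock string without a suffix is returned unchanged and still naive.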
def _fmt_iso_to_bw(iso: Optional[str]) -> Optional[str]:
"""Convert an ISO-8601 timestamp to BW's display format
'22:30:37 May 16, 2026'. UTC inputs (with Z suffix) are
converted to the system's local timezone first; naïve inputs
are formatted as-is. Returns input unchanged on parse failure."""
if not iso or "T" not in iso:
return iso
try:
return _to_display_local(iso).strftime("%H:%M:%S %B %d, %Y").replace(" 0", " ")
except Exception:
return iso
def _split_iso_to_date_time(iso: Optional[str]) -> tuple[Optional[str], Optional[str]]:
"""Split an ISO timestamp into BW-formatted ('May 27 /26', '06:06:14')
date+time strings. Used for the histogram stats table where the
Date and Time rows are presented separately. UTC inputs are
converted to local time first. Returns (None, None) on parse failure."""
if not iso:
return (None, None)
try:
dt = _to_display_local(iso)
# BW format: 'May 27 /26' (3-letter month + 2-digit year)
date_str = dt.strftime("%b %d /%y").replace(" 0", " ")
time_str = dt.strftime("%H:%M:%S")
return (date_str, time_str)
except Exception:
return (None, None)
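# Illustrative alternative for the two BW display formats above. Instead of
# strftime followed by .replace(" 0", " "), this sketch reads the day number
# directly so no zero-padding has to be stripped. bw_long and bw_date_time are
# hypothetical names; month names assume the default C/English locale.
```python
import datetime as dt


def bw_long(d: dt.datetime) -> str:
    """Time first, full month name, unpadded day: '22:30:37 May 6, 2026'."""
    return f"{d:%H:%M:%S} {d:%B} {d.day}, {d.year}"


def bw_date_time(d: dt.datetime) -> tuple:
    """Separate date and time cells: ('May 27 /26', '06:06:14')."""
    return (f"{d:%b} {d.day} /{d:%y}", f"{d:%H:%M:%S}")
```
# For a single-digit day both approaches agree; the direct form just avoids
# depending on where a " 0" substring happens to appear.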
def _kv(ax, x, y, label, value, *, label_w=0.18):
"""Render a 'Label Value' row at axes-coordinates (x, y)."""
ax.text(x, y, label, fontsize=8, color="#555", ha="left", va="top",
transform=ax.transAxes)
ax.text(x + label_w, y, _fmt(value), fontsize=8, ha="left", va="top",
transform=ax.transAxes, family="monospace")
def _fmt(v):
"""Format any field for display — '' for None, str otherwise."""
if v is None:
return ""
if isinstance(v, float):
return f"{v:.4f}".rstrip("0").rstrip(".")
return str(v)
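# A small standalone copy of the trimming rule above, to show what the
# four-decimal format plus rstrip does at the edges. fmt_value is a
# hypothetical name for this illustration only.
```python
def fmt_value(v) -> str:
    """'' for None, floats to 4 decimals with trailing zeros trimmed, else str()."""
    if v is None:
        return ""
    if isinstance(v, float):
        return f"{v:.4f}".rstrip("0").rstrip(".")
    return str(v)
```
# Note that magnitudes below 0.00005 round to "0.0000" first and therefore
# display as "0"; ints bypass the float branch entirely.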
def _draw_header_waveform(ax, rd: ReportData) -> None:
"""Two-column metadata header — waveform variant."""
rows_left = [
("Date/Time", _fmt_iso_to_bw(rd.event_datetime_str)),
("Trigger Source", rd.trigger_source),
("Range", rd.geo_range_str),
("Sample Rate", rd.sample_rate_str),
("Notes", rd.notes),
("Project:", rd.project),
("Client:", rd.client),
("User Name:", rd.operator),
("Seis. Loc:", rd.sensor_location),
]
_draw_header_columns(ax, rows_left, rd)
def _draw_header_histogram(ax, rd: ReportData) -> None:
"""Two-column metadata header — histogram variant.
Histograms have Start / Finish / Intervals fields instead of
Trigger Source (there's no trigger event for a histogram capture).
"""
intervals_str = None
if rd.histogram_n_intervals is not None and rd.histogram_interval_size:
intervals_str = f"{rd.histogram_n_intervals} At {rd.histogram_interval_size}"
rows_left = [
("Start", _fmt_iso_to_bw(rd.histogram_start_str or rd.event_datetime_str)),
("Finish", _fmt_iso_to_bw(rd.histogram_stop_str)),
("Intervals", intervals_str),
("Range", rd.geo_range_str),
("Sample Rate", (f"{rd.sample_rate_sps} Sps" if rd.sample_rate_sps else None)),
("Notes", rd.notes),
("Project:", rd.project),
("Client:", rd.client),
("User Name:", rd.operator),
("Seis. Loc:", rd.sensor_location),
]
_draw_header_columns(ax, rows_left, rd)
def _draw_header_columns(ax, rows_left, rd: ReportData) -> None:
"""Shared 2-column header rendering used by both layouts."""
rows_right = [
("Serial Number", f"{rd.serial or ''}" + (f" {rd.firmware}" if rd.firmware else "")),
("Battery Level", f"{rd.battery_volts:.1f} Volts" if rd.battery_volts is not None else None),
("Unit Calibration", (f"{rd.calibration_date}" + (f" by {rd.calibration_by}" if rd.calibration_by else ""))
if rd.calibration_date else None),
("File Name", rd.file_name),
("Post Event Notes", rd.post_event_notes),
]
y = 0.95
dy = 0.095
for label, value in rows_left:
_kv(ax, 0.0, y, label, value, label_w=0.18)
y -= dy
y = 0.95
for label, value in rows_right:
_kv(ax, 0.55, y, label, value, label_w=0.20)
y -= dy
def _draw_mic_only(ax, rd: ReportData) -> None:
"""Mic block (histogram variant — no USBM chart)."""
ax.text(0.0, 0.95, "Microphone Linear Weighting", fontsize=8, color="#555",
transform=ax.transAxes, va="top")
rows = _mic_rows(rd)
y = 0.70
for label, value in rows:
_kv(ax, 0.0, y, label, value, label_w=0.18)
y -= 0.22
def _draw_mic_and_usbm(ax, rd: ReportData) -> None:
"""Mic block on the left + USBM compliance chart placeholder on right.
(Waveform variant only: USBM is a velocity-vs-frequency compliance plot
that doesn't apply to histograms.)"""
ax.text(0.0, 0.95, "Microphone Linear Weighting", fontsize=8, color="#555",
transform=ax.transAxes, va="top")
rows = _mic_rows(rd)
y = 0.80
for label, value in rows:
_kv(ax, 0.0, y, label, value, label_w=0.18)
y -= 0.15
# USBM chart placeholder (upper-right). Real piecewise compliance
# curves are a separate work item; for now this just shows the title
# + a "coming soon" message so the layout is correct.
ax.text(0.72, 0.97, "USBM RI8507 And OSMRE",
fontsize=9, weight="bold", color="#333", ha="center", va="top",
transform=ax.transAxes)
ax.text(0.72, 0.50, "[compliance chart\ncoming soon]",
fontsize=8, color="#bbb", ha="center", va="center",
transform=ax.transAxes, style="italic")
def _mic_rows(rd: ReportData) -> list[tuple[str, Optional[str]]]:
"""Build the mic-section value rows (shared by both layouts).
For histograms, BW formats the PSPL line as
"125.7 dB(L) on May 27, 2026 at 06:19:14"
(absolute date+time of peak). Waveform events show the relative
"at 0.012 sec." instead. Both formats covered here based on which
field is populated.
"""
rows: list[tuple[str, Optional[str]]] = []
if rd.mic_pspl_dbl is not None:
line = f"{rd.mic_pspl_dbl:.1f} dB(L)"
if rd.mic_pspl_when_str:
# Histogram-style: "PSPL 125.7 dB(L) on May 27, 2026 at 06:19:14"
# mic_pspl_when_str is already "HH:MM:SS Month DD, YYYY";
# reformat to "on Month DD, YYYY at HH:MM:SS" for BW match.
parts = rd.mic_pspl_when_str.split(" ", 1)
if len(parts) == 2:
line += f" on {parts[1]} at {parts[0]}"
else:
line += f" on {rd.mic_pspl_when_str}"
elif rd.mic_pspl_time_s is not None:
# Waveform-style: relative-to-trigger seconds.
line += f" at {rd.mic_pspl_time_s:.3f} sec."
rows.append(("PSPL", line))
if rd.mic_zc_freq_hz is not None:
prefix = ">" if rd.mic_zc_freq_above_range else ""
rows.append(("ZC Freq", f"{prefix}{rd.mic_zc_freq_hz:.0f} Hz"))
if rd.mic_channel_test_result:
line = rd.mic_channel_test_result
if rd.mic_channel_test_freq_hz is not None and rd.mic_channel_test_amp_mv is not None:
line += (f" (Freq = {rd.mic_channel_test_freq_hz:.1f} Hz, "
f"Amp = {rd.mic_channel_test_amp_mv:.0f} mv)")
rows.append(("Channel Test", line))
return rows
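# A hedged sketch of the PSPL line assembly above: the histogram form reorders
# "HH:MM:SS Month DD, YYYY" into "on <date> at <time>", and the waveform form
# appends the relative time. pspl_line is a hypothetical helper name.
```python
from typing import Optional


def pspl_line(dbl: float,
              when_str: Optional[str] = None,
              time_s: Optional[float] = None) -> str:
    """Build the mic PSPL display string for either record type."""
    line = f"{dbl:.1f} dB(L)"
    if when_str:
        # Split once at the first space: time on the left, date on the right.
        time_part, _, date_part = when_str.partition(" ")
        if date_part:
            line += f" on {date_part} at {time_part}"
        else:
            line += f" on {when_str}"
    elif time_s is not None:
        line += f" at {time_s:.3f} sec."
    return line
```
# The absolute timestamp wins when both fields are populated, matching the
# if/elif order in the function above.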
def _draw_channel_stats_waveform(ax, rd: ReportData) -> None:
"""Waveform stats table — has Time (Rel. to Trig), Peak Accel, Peak Disp.
Followed by Peak Vector Sum line."""
rows_spec = [
("PPV", "ppv_ips", "in/s"),
("ZC Freq", "zc_freq_hz", "Hz"),
("Time (Rel. to Trig)", "time_of_peak_s", "sec"),
("Peak Acceleration", "peak_accel_g", "g"),
("Peak Displacement", "peak_disp_in", "in"),
("Sensor Check", "sensor_check", ""),
]
_draw_stats_table(ax, rd, rows_spec)
if rd.peak_vector_sum_ips is not None:
line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
if rd.peak_vector_sum_time_s is not None:
line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
ha="left", va="top", transform=ax.transAxes)
ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
ha="left", va="top", transform=ax.transAxes)
def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
"""Histogram stats table — PPV, ZC Freq, Date, Time of peak, Sensor Check.
Followed by Peak Vector Sum line."""
# Date / Time of peak are per-channel timestamps for the interval at peak.
# bw_report stores time_of_peak_s as relative seconds, but for histograms
# BW shows them as absolute date+time. We populate from rd.channel_stats
# when those absolute fields are present; otherwise the cells stay blank.
rows_spec = [
("PPV", "ppv_ips", "in/s"),
("ZC Freq", "zc_freq_hz", "Hz"),
("Date", "peak_date", ""),
("Time", "peak_time", ""),
("Sensor Check", "sensor_check", ""),
]
_draw_stats_table(ax, rd, rows_spec)
if rd.peak_vector_sum_ips is not None:
line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
# Histograms: "0.091 in/s on May 27, 2026 At 06:06:14"
# The when_str is "HH:MM:SS Month DD, YYYY" — reformat for BW match.
if rd.peak_vector_sum_when_str:
parts = rd.peak_vector_sum_when_str.split(" ", 1)
if len(parts) == 2:
line += f" on {parts[1]} At {parts[0]}"
else:
line += f" on {rd.peak_vector_sum_when_str}"
ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
ha="left", va="top", transform=ax.transAxes)
ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
ha="left", va="top", transform=ax.transAxes)
def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]]) -> None:
"""Render a per-channel stats table (Tran/Vert/Long).
rows_spec: list of (label, field_name_in_channel_stats, unit_string)
"""
headers = ["", "Tran", "Vert", "Long", ""]
ch_lookup = {c["name"]: c for c in rd.channel_stats}
def _cell(field, ch_name):
ch_rec = ch_lookup.get(ch_name, {})
val = ch_rec.get(field)
if val is None:
return ""
if isinstance(val, float):
# ZC Freq is integer-formatted in BW; ">100 Hz" sentinel
# rendered as ">N" (val carries the threshold). Everything
# else gets 3 decimals.
if field == "zc_freq_hz":
prefix = ">" if ch_rec.get("zc_freq_above_range") else ""
return f"{prefix}{val:.0f}"
return f"{val:.3f}"
return str(val)
table_data = [headers]
for label, field_name, unit in rows_spec:
table_data.append([
label,
_cell(field_name, "Tran"),
_cell(field_name, "Vert"),
_cell(field_name, "Long"),
unit,
])
tbl = ax.table(
cellText=table_data, loc="upper left",
colWidths=[0.28, 0.14, 0.14, 0.14, 0.10],
cellLoc="left", edges="open",
)
tbl.auto_set_font_size(False)
tbl.set_fontsize(8)
tbl.scale(1, 1.4)
for j in range(5):
tbl[(0, j)].set_text_props(weight="bold", color="#555")
def _channel_axis_color(ch: str) -> str:
return {"MicL": "#cc00cc", "Long": "#0066ff", "Vert": "#009933", "Tran": "#cc0000"}.get(ch, "#444")
def _draw_waveform_subplot(fig, gridspec_cell, rd: ReportData) -> None:
"""4-channel stacked waveform plot — Instantel printout order
(MicL on top, Tran on bottom), shared x-axis in SECONDS, trigger
triangle markers at t=0, '0.0' baseline label on right of each."""
inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
order = ["MicL", "Long", "Vert", "Tran"]
sr = rd.sample_rate_sps or 1024
# Convert ms-based time axis to seconds for the x-axis
dt_s = (rd.dt_ms or (1000.0 / sr)) / 1000.0
t0_s = (rd.t0_ms if rd.t0_ms is not None else 0.0) / 1000.0
last_idx = len(order) - 1
for i, ch in enumerate(order):
ax = fig.add_subplot(inner[i])
values = rd.channels.get(ch) or []
times = [t0_s + j * dt_s for j in range(len(values))]
if values:
color = _channel_axis_color(ch)
ax.plot(times, values, color=color, linewidth=0.5)
# Symmetric y-axis around zero for every channel (geo and mic alike).
# An all-zero trace falls back to 0.001 so the limits never collapse.
amax = max((abs(v) for v in values), default=0.0) or 0.001
ax.set_ylim(-amax * 1.10, amax * 1.10)
# Channel label on the LEFT (matches BW)
ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
color=_channel_axis_color(ch), weight="bold", labelpad=14)
# "0.0" on the RIGHT (BW convention)
ax.text(1.005, 0.5, "0.0", transform=ax.transAxes,
fontsize=7, color="#555", va="center", ha="left")
ax.grid(True, linestyle="--", linewidth=0.3, color="#bbb", alpha=0.6)
# Vertical dashed trigger line at t=0
ax.axvline(0.0, color="#cc0000", linestyle="--", linewidth=0.6, alpha=0.7)
# Zero baseline horizontal
ax.axhline(0.0, color=_channel_axis_color(ch), linestyle="-",
linewidth=0.4, alpha=0.5)
if i != last_idx:
ax.set_xticklabels([])
ax.tick_params(axis="x", length=0)
else:
ax.tick_params(axis="x", labelsize=7)
ax.tick_params(axis="y", labelsize=6)
# Trigger triangle marker ▼ above the top channel at t=0
top_ax = fig.axes[-4] # MicL is the first added in this gridspec
top_ax.plot([0], [top_ax.get_ylim()[1]], marker="v", color="black",
markersize=8, clip_on=False, zorder=10)
# Compute scale-per-division for the footer (10 divs across the chart)
# and find peak geo amplitude for the geo amp/div setting.
total_s = times[-1] - times[0] if values else 0
div_s = total_s / 10 if total_s > 0 else 0
geo_amp_div = ""
for ch in ("Tran", "Vert", "Long"):
v = rd.channels.get(ch) or []
if v:
amax = max(abs(x) for x in v)
geo_amp_div = f"{(amax * 1.1 * 2) / 10:.3f}"
break
fig.text(
0.11, 0.030,
f"Time(Seconds) {div_s:.2f} sec/div Amplitude Geo: {geo_amp_div} in/s/div Mic: 0.001 psi(L)/div",
fontsize=7, color="#444", ha="left",
)
fig.text(
0.11, 0.018,
"Trigger = ▶━━━━━ ━━━━━━◀",
fontsize=7, color="#444", ha="left",
)
def _nice_geo_step(amax: float) -> float:
"""Pick a "nice" per-division step for the geo y-axis.
Geo LSB is 0.005 in/s, so sub-LSB steps like 0.003/div are nonsense.
Quantize to the BW-style 1-2.5-5 sequence (0.005, 0.01, 0.025, 0.05,
...) and return the smallest step where 5 divisions >= amax, so the
top of the chart lands on a tick.
"""
if amax <= 0:
return 0.005
for step in (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0):
if step * 5 >= amax:
return step
return 10.0
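# Usage sketch for the step selection above, assuming the same candidate list
# and five divisions. STEPS, nice_geo_step (without the underscore) and y_ticks
# are hypothetical stand-ins written for this example.
```python
STEPS = (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0)


def nice_geo_step(amax: float, divisions: int = 5) -> float:
    """Smallest candidate step whose `divisions` multiples cover amax."""
    if amax <= 0:
        return STEPS[0]
    return next((s for s in STEPS if s * divisions >= amax), STEPS[-1])


def y_ticks(amax: float, divisions: int = 5) -> list:
    """Tick values from 0 up to divisions * step, rounded to the 0.001 grid."""
    step = nice_geo_step(amax, divisions)
    return [round(j * step, 3) for j in range(divisions + 1)]
```
# A 0.091 in/s peak skips 0.005 and 0.01 (their tops are 0.025 and 0.05) and
# settles on 0.025 in/s per division, putting the top tick at 0.125.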
def _draw_histogram_subplot(fig, gridspec_cell, rd: ReportData) -> None:
"""4-channel stacked histogram bar chart — per-interval peaks.
X-axis labeled with the actual times from rd.histogram_interval_times
when available; otherwise interval index.
The three geo channels share a single y-axis scale (a BW-style nice
multiple of the 0.005 in/s LSB) so bar heights are directly
comparable across channels. MicL has its own auto-scale.
"""
inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
order = ["MicL", "Long", "Vert", "Tran"]
last_idx = len(order) - 1
# X-axis: use absolute time labels if we have them, else interval index
have_times = bool(rd.histogram_interval_times)
# Shared geo scale: max across Tran/Vert/Long, quantized to a nice
# tick step. Used for ylim + the footer "Amplitude Geo: X in/s/div".
geo_amax = 0.0
for gch in ("Tran", "Vert", "Long"):
gv = rd.channels.get(gch) or []
if gv:
geo_amax = max(geo_amax, max((abs(x) for x in gv if x is not None), default=0.0))
geo_step = _nice_geo_step(geo_amax)
geo_top = geo_step * 5 # 5 divisions — top tick lands at this value
for i, ch in enumerate(order):
ax = fig.add_subplot(inner[i])
values = rd.channels.get(ch) or []
if values:
# Histograms record per-interval PEAK magnitudes — always
# non-negative. Codec output occasionally includes signed
# values when the underlying .h5 was scaled like a waveform;
# take the absolute value so the bars rise from zero.
abs_vals = [abs(v) if v is not None else 0 for v in values]
xs = np.arange(len(abs_vals))
color = _channel_axis_color(ch)
ax.bar(xs, abs_vals, color=color, width=0.85, linewidth=0)
if ch in ("Tran", "Vert", "Long"):
ax.set_ylim(0, geo_top)
ax.set_yticks([j * geo_step for j in range(6)])
else:
amax = max(abs_vals, default=0)
if amax > 0:
ax.set_ylim(0, amax * 1.10)
ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
color=_channel_axis_color(ch), weight="bold", labelpad=14)
ax.text(1.005, 0.02, "0.0", transform=ax.transAxes,
fontsize=7, color="#555", va="bottom", ha="left")
ax.grid(True, axis="y", linestyle="--", linewidth=0.3, color="#bbb", alpha=0.6)
if i != last_idx:
ax.set_xticklabels([])
ax.tick_params(axis="x", length=0)
else:
if have_times and len(rd.histogram_interval_times) == len(values):
# Label every (n // 4)-th interval: 4 to 7 evenly spaced ticks once
# n >= 4, and every interval when n < 4.
n = len(values)
step = max(1, n // 4)
tick_positions = list(range(0, n, step))
ax.set_xticks(tick_positions)
ax.set_xticklabels([rd.histogram_interval_times[t] for t in tick_positions],
rotation=0, fontsize=6)
else:
ax.set_xlabel("Interval", fontsize=8)
ax.tick_params(axis="x", labelsize=7)
ax.tick_params(axis="y", labelsize=6)
# Footer scale info — histograms use minute/div. Reuses the shared
# geo_step computed above so the label matches the actual y-axis
# tick spacing on every subplot.
interval_str = rd.histogram_interval_size or ""
geo_amp_div = f"{geo_step:.3f}"
fig.text(
0.11, 0.030,
f"Time {interval_str} /div Amplitude Geo: {geo_amp_div} in/s/div Mic: 0.001 psi(L)/div",
fontsize=7, color="#444", ha="left",
)
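# A tiny sketch of the x-tick thinning used for the bottom histogram axis,
# assuming the same step = max(1, n // 4) rule. tick_positions is a
# hypothetical name for this example.
```python
def tick_positions(n: int, groups: int = 4) -> list:
    """Indices of the intervals that receive an x-axis label."""
    step = max(1, n // groups)
    return list(range(0, n, step))
```
# Because the step is floor-divided, short histograms keep every label (up to
# seven), while a 60-interval run is thinned to one label per 15 intervals.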
+166 -4
@@ -46,7 +46,7 @@ from typing import Optional
# FastAPI / Pydantic
try:
-from fastapi import Body, FastAPI, File, HTTPException, Query, UploadFile
+from fastapi import Body, FastAPI, File, HTTPException, Query, Response, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
from pydantic import BaseModel
@@ -381,10 +381,24 @@ def webapp():
@app.get("/waveform", response_class=FileResponse)
def waveform_viewer():
-"""Serve the standalone waveform viewer."""
+"""Serve the standalone LIVE-device waveform viewer.
Talks to ``/device/*`` endpoints for plotting events pulled from
a connected unit in real time. For the stored-event browser that
reads from the SeismoDb + WaveformStore, see ``/events``.
"""
return str(Path(__file__).parent / "waveform_viewer.html")
@app.get("/events", response_class=FileResponse)
def event_browser():
"""Serve the stored-event browser — pick a serial, list its events,
render any one's waveform from the persisted ``.h5`` via the
``/db/events/{id}/waveform.json`` endpoint. Standalone HTML +
Chart.js, no auth, no build step."""
return str(Path(__file__).parent / "event_browser.html")
@app.get("/device/info")
def device_info(
port: Optional[str] = Query(None, description="Serial port (e.g. COM5, /dev/ttyUSB0)"),
@@ -1975,8 +1989,13 @@ def _cleanup_event_files(row: dict) -> dict:
bw_path, a5_path = store.paths_for(serial, base_name)
sc_path = store.sidecar_path_for(serial, base_name)
h5_path = store.hdf5_path_for(serial, base_name)
# Preserved BW ASCII report (added 2026-05-27 with the .TXT
# preservation feature) — needs to be cleaned up too, otherwise
# deletes leave orphan _ASCII.TXT files behind.
txt_path = store.txt_path_for(serial, base_name)
for kind, p in [("blastware", bw_path), ("a5_pickle", a5_path),
-("sidecar", sc_path), ("hdf5", h5_path)]:
+("sidecar", sc_path), ("hdf5", h5_path),
+("txt", txt_path)]:
try:
if p.exists():
p.unlink()
@@ -2164,6 +2183,148 @@ def db_event_blastware_file(event_id: str) -> FileResponse:
)
@app.get("/db/events/{event_id}/ascii_report.txt")
def db_event_ascii_report_txt(event_id: str):
"""Serve the raw BW ASCII report (.TXT) for an event, when preserved.
Returns 404 for events ingested before the .TXT-preservation feature
landed (2026-05-27); those events have only the parsed ``bw_report``
block in the sidecar, not the raw .TXT. Re-forwarding from the
watcher PC will populate the .TXT going forward.
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(status_code=404, detail="Event has no associated BW file")
txt_path = _get_store().open_txt(serial, filename)
if txt_path is None:
raise HTTPException(
status_code=404,
detail=(
f"Raw .TXT not preserved for {filename}. Events ingested "
"before 2026-05-27 don't have it; re-forward from the "
"watcher PC to populate."
),
)
return FileResponse(
path=str(txt_path),
media_type="text/plain",
filename=txt_path.name,
)
@app.get("/db/events/{event_id}/report.pdf")
def db_event_report_pdf(event_id: str):
"""Render an Instantel-style Event Report as a PDF.
Single-page letter portrait, matches the BW Event Report's data
coverage and layout (header / mic block / per-channel stats /
waveform plot). v0.20.0 stub: the exact visual is still being iterated
against reference PDFs in ``docs/reference/instantel/``.
Returns 404 if the event is unknown or has no waveform data on
disk (same condition as /waveform.json).
"""
from sfm import report_pdf
rd = report_pdf.gather_report_data(_get_db(), _get_store(), event_id)
if rd is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found or has no waveform")
pdf_bytes = report_pdf.render_event_report_pdf(rd)
# Suggested download filename based on the BW file basename.
fname = (rd.file_name or event_id).replace(".", "_")
return Response(
content=pdf_bytes,
media_type="application/pdf",
headers={"Content-Disposition": f'inline; filename="{fname}_report.pdf"'},
)
def _maybe_aggregate_histogram(plot: dict, store, serial: str, filename: str, row: dict) -> dict:
"""For histogram events, aggregate the codec's per-block samples into
the BW-reported number of intervals. No-op for waveforms or when
we don't have the histogram metadata (interval count + size) in the
sidecar's bw_report block.
Why: the histogram codec emits one value per internal block (~1 per
second), but BW's printout shows one bar per configured interval
(typically 1-15 minutes). For a 1-minute-interval event the codec
gives ~60 blocks per BW bar. Aggregating max-per-group makes the
SFM chart + PDF visually match BW's display.
"""
record_type = row.get("record_type") or ""
if not record_type.lower().startswith("hist"):
return plot
# Read interval count + size from the sidecar's bw_report.histogram block
try:
import json as _json
sidecar_path = store.sidecar_path_for(serial, filename)
if not sidecar_path.exists():
return plot
sc = _json.loads(sidecar_path.read_text())
hist = (sc.get("bw_report") or {}).get("histogram") or {}
n_intervals = hist.get("n_intervals")
interval_size_s = hist.get("interval_size_s")
start_iso = hist.get("start")
except Exception:
return plot
if not n_intervals or n_intervals < 1:
return plot
# Aggregate each channel's values into n_intervals groups, max-per-group
channels = plot.get("channels") or {}
aggregated_channels: dict = {}
for ch, chd in channels.items():
vals = chd.get("values") or []
if not vals:
aggregated_channels[ch] = chd
continue
# Distribute len(vals) samples across n_intervals groups; uneven
# remainders get distributed across the first few groups.
per_group = len(vals) // n_intervals
remainder = len(vals) % n_intervals
agg: list = []
offset = 0
for i in range(n_intervals):
grp_size = per_group + (1 if i < remainder else 0)
if grp_size > 0:
grp = vals[offset:offset + grp_size]
# Max of absolute values (peaks are magnitudes).
agg.append(max((abs(v) for v in grp if v is not None), default=0))
offset += grp_size
else:
agg.append(0)
aggregated_channels[ch] = {**chd, "values": agg}
# Build per-interval timestamp labels for the x-axis if we have start time
interval_times: list = []
if start_iso and interval_size_s:
try:
import datetime as _dt
start = _dt.datetime.fromisoformat(start_iso)
for i in range(int(n_intervals)):
# Show the END of each interval (BW convention — the
# peak reported is for samples taken THROUGH that time)
end = start + _dt.timedelta(seconds=(i + 1) * interval_size_s)
interval_times.append(end.strftime("%H:%M:%S"))
except Exception:
pass
# Override the time_axis to reflect intervals (not samples).
plot_aggr = {**plot, "channels": aggregated_channels}
plot_aggr["time_axis"] = {
**(plot.get("time_axis") or {}),
"histogram_aggregated": True,
"n_intervals": int(n_intervals),
"interval_size_s": interval_size_s,
"interval_times": interval_times,
}
return plot_aggr
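The aggregation step above can be checked in isolation. The following is a minimal standalone sketch, not the function from the diff: `aggregate_max_per_group` and `interval_end_labels` are hypothetical helper names, and the sample values and start time are made up for illustration. It assumes the same rules the diff describes: uneven remainders go to the first groups, peaks are the max of absolute values, and each label marks the end of its interval.

```python
import datetime as dt


def aggregate_max_per_group(values, n_intervals):
    """Collapse per-block samples into n_intervals peaks (max of |value|)."""
    per_group, remainder = divmod(len(values), n_intervals)
    peaks, offset = [], 0
    for i in range(n_intervals):
        # The first `remainder` groups each take one extra sample.
        size = per_group + (1 if i < remainder else 0)
        group = values[offset:offset + size]
        # Empty groups, or groups holding only None, fall back to 0.
        peaks.append(max((abs(v) for v in group if v is not None), default=0))
        offset += size
    return peaks


def interval_end_labels(start_iso, n_intervals, interval_size_s):
    """HH:MM:SS label for the END of each interval."""
    start = dt.datetime.fromisoformat(start_iso)
    return [
        (start + dt.timedelta(seconds=(i + 1) * interval_size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]


# Hypothetical data: 7 codec blocks folded into 3 intervals (sizes 3, 2, 2).
samples = [0.1, -0.4, 0.2, None, 0.3, -0.05, 0.6]
peaks = aggregate_max_per_group(samples, 3)
labels = interval_end_labels("2026-05-27T06:00:00", 3, 60)
```

With these inputs the groups are `[0.1, -0.4, 0.2]`, `[None, 0.3]` and `[-0.05, 0.6]`, so a negative excursion still counts as a peak and a `None` sample is skipped rather than treated as zero.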
@app.get("/db/events/{event_id}/waveform.json") @app.get("/db/events/{event_id}/waveform.json")
def db_event_waveform_json(event_id: str) -> dict: def db_event_waveform_json(event_id: str) -> dict:
""" """
@@ -2195,7 +2356,8 @@ def db_event_waveform_json(event_id: str) -> dict:
h5_path = store.hdf5_path_for(serial, filename) h5_path = store.hdf5_path_for(serial, filename)
if h5_path.exists(): if h5_path.exists():
try: try:
return event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id) plot = event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
return _maybe_aggregate_histogram(plot, store, serial, filename, row)
except Exception as exc: except Exception as exc:
log.warning("HDF5 read failed (%s); falling back to A5 path", exc) log.warning("HDF5 read failed (%s); falling back to A5 path", exc)
+557 -39
@@ -499,6 +499,20 @@
text-align: left; text-align: left;
border-bottom: 1px solid var(--border); border-bottom: 1px solid var(--border);
white-space: nowrap; white-space: nowrap;
position: sticky;
top: 0;
z-index: 1;
}
table.db-table thead th[data-sort]:hover {
background: var(--border2);
color: var(--text);
}
table.db-table thead th .sort-arrow {
display: inline-block;
width: 10px;
color: var(--accent, #58a6ff);
font-weight: 900;
text-align: center;
} }
table.db-table tbody tr { border-bottom: 1px solid var(--border2); } table.db-table tbody tr { border-bottom: 1px solid var(--border2); }
table.db-table tbody tr:last-child { border-bottom: none; } table.db-table tbody tr:last-child { border-bottom: none; }
@@ -758,7 +772,9 @@
overflow: hidden; overflow: hidden;
min-height: 0; min-height: 0;
} }
#section-db { display: none; } /* Default to Database view on page load — most users are here to
browse stored events, not connect to a live unit. */
#section-live { display: none; }
/* ── Live connect bar (host/port/connect, live section only) ── */ /* ── Live connect bar (host/port/connect, live section only) ── */
#live-connect-bar { #live-connect-bar {
@@ -792,8 +808,8 @@
</div> </div>
<div class="hdr-sep"></div> <div class="hdr-sep"></div>
<div class="section-switcher"> <div class="section-switcher">
<button class="section-btn active" onclick="switchSection('live')">Live Device</button> <button class="section-btn" onclick="switchSection('live')">Live Device</button>
<button class="section-btn" onclick="switchSection('db')">Database</button> <button class="section-btn active" onclick="switchSection('db')">Database</button>
</div> </div>
<div class="hdr-sep"></div> <div class="hdr-sep"></div>
<label class="force-toggle" id="force-toggle" <label class="force-toggle" id="force-toggle"
@@ -802,6 +818,12 @@
<span class="ft-dot"></span> <span class="ft-dot"></span>
<span>Force refresh</span> <span>Force refresh</span>
</label> </label>
<div class="hdr-sep"></div>
<button id="mic-unit-toggle" class="section-btn"
onclick="_setMicUnit(_getMicUnit() === 'dBL' ? 'psi' : 'dBL')"
title="Toggle microphone display unit (dBL ↔ psi) for waveform plots. Affects all mic charts; persists across page loads.">
Mic: dBL
</button>
</header> </header>
<!-- ════════════════════════════════════════════════════════════════ <!-- ════════════════════════════════════════════════════════════════
@@ -1224,18 +1246,18 @@
<div class="db-table-wrap" id="hist-table-wrap" style="display:none"> <div class="db-table-wrap" id="hist-table-wrap" style="display:none">
<table class="db-table" id="hist-table"> <table class="db-table" id="hist-table">
<thead> <thead>
<tr> <tr id="hist-header-row">
<th>Timestamp</th> <th data-sort="timestamp">Timestamp <span class="sort-arrow"></span></th>
<th>Serial</th> <th data-sort="serial">Serial <span class="sort-arrow"></span></th>
<th>Tran (in/s)</th> <th data-sort="tran_ppv">Tran (in/s) <span class="sort-arrow"></span></th>
<th>Vert (in/s)</th> <th data-sort="vert_ppv">Vert (in/s) <span class="sort-arrow"></span></th>
<th>Long (in/s)</th> <th data-sort="long_ppv">Long (in/s) <span class="sort-arrow"></span></th>
<th>PVS (in/s)</th> <th data-sort="peak_vector_sum">PVS (in/s) <span class="sort-arrow"></span></th>
<th>Mic (dBL)</th> <th data-sort="mic_ppv">Mic (dBL) <span class="sort-arrow"></span></th>
<th>Project</th> <th data-sort="project">Project <span class="sort-arrow"></span></th>
<th>Client</th> <th data-sort="client">Client <span class="sort-arrow"></span></th>
<th>Type</th> <th data-sort="record_type">Type <span class="sort-arrow"></span></th>
<th>Key</th> <th data-sort="waveform_key">Key <span class="sort-arrow"></span></th>
<th></th> <th></th>
</tr> </tr>
</thead> </thead>
@@ -1388,7 +1410,9 @@ function deviceParams() {
} }
// ── Section switching ───────────────────────────────────────────────────────── // ── Section switching ─────────────────────────────────────────────────────────
let currentSection = 'live'; // Default to Database — most users land here to browse stored events.
// Live Device is opt-in (click the tab to talk to a unit).
let currentSection = 'db';
function switchSection(name) { function switchSection(name) {
currentSection = name; currentSection = name;
@@ -2333,6 +2357,12 @@ async function _fetchUnits() {
} }
// ── History tab ──────────────────────────────────────────────────────────────── // ── History tab ────────────────────────────────────────────────────────────────
// Module-level state for the history table — preserved across re-sorts.
// We sort + re-render without re-fetching.
let _histEvents = [];
let _histSortKey = 'timestamp';
let _histSortDir = 'desc'; // 'asc' | 'desc'
async function loadHistory() { async function loadHistory() {
histLoaded = true; histLoaded = true;
const serial = document.getElementById('hist-serial-filter').value; const serial = document.getElementById('hist-serial-filter').value;
@@ -2364,10 +2394,20 @@ async function loadHistory() {
_populateSerialDropdown('monlog-serial-filter'); _populateSerialDropdown('monlog-serial-filter');
_populateSerialDropdown('sess-serial-filter'); _populateSerialDropdown('sess-serial-filter');
document.getElementById('hist-count').textContent = `${events.length} event${events.length !== 1 ? 's' : ''}`; _histEvents = events;
renderHistTable();
}
// Re-render the history table from `_histEvents` using the current sort
// state. Pulled out of `loadHistory` so column-header clicks can re-sort
// in-memory without re-fetching from the server.
function renderHistTable() {
const events = _histEvents;
document.getElementById('hist-count').textContent =
`${events.length} event${events.length !== 1 ? 's' : ''}`;
const tbody = document.getElementById('hist-tbody'); const tbody = document.getElementById('hist-tbody');
tbody.innerHTML = ''; tbody.innerHTML = '';
if (events.length === 0) { if (events.length === 0) {
document.getElementById('hist-empty').style.display = 'block'; document.getElementById('hist-empty').style.display = 'block';
document.getElementById('hist-table-wrap').style.display = 'none'; document.getElementById('hist-table-wrap').style.display = 'none';
@@ -2376,11 +2416,31 @@ async function loadHistory() {
document.getElementById('hist-empty').style.display = 'none'; document.getElementById('hist-empty').style.display = 'none';
document.getElementById('hist-table-wrap').style.display = 'block'; document.getElementById('hist-table-wrap').style.display = 'block';
for (const ev of events) { // Sort a copy by current key + direction. Nulls sink to the bottom
// regardless of direction.
const k = _histSortKey;
const dir = _histSortDir === 'asc' ? 1 : -1;
const sorted = [...events].sort((a, b) => {
const av = a[k], bv = b[k];
if (av == null && bv == null) return 0;
if (av == null) return 1;
if (bv == null) return -1;
if (typeof av === 'number' && typeof bv === 'number') return (av - bv) * dir;
return String(av).localeCompare(String(bv)) * dir;
});
// Update arrow indicators in the headers
document.querySelectorAll('#hist-header-row th[data-sort]').forEach(th => {
const arrow = th.querySelector('.sort-arrow');
if (!arrow) return;
arrow.textContent = th.dataset.sort === k ? (_histSortDir === 'asc' ? '↑' : '↓') : '';
});
for (const ev of sorted) {
const tr = document.createElement('tr'); const tr = document.createElement('tr');
const pvs = ev.peak_vector_sum; const pvs = ev.peak_vector_sum;
tr.classList.add('clickable'); tr.classList.add('clickable');
tr.title = 'Click to review (open sidecar editor)'; tr.title = 'Click to view waveform + sidecar';
tr.dataset.eventId = ev.id; tr.dataset.eventId = ev.id;
tr.innerHTML = ` tr.innerHTML = `
<td>${_fmtTs(ev.timestamp)}</td> <td>${_fmtTs(ev.timestamp)}</td>
@@ -2408,6 +2468,28 @@ async function loadHistory() {
} }
} }
// Click a column header → toggle sort. Click another → set sort to that column.
document.addEventListener('DOMContentLoaded', () => {
const headerRow = document.getElementById('hist-header-row');
if (!headerRow) return;
headerRow.querySelectorAll('th[data-sort]').forEach(th => {
th.style.cursor = 'pointer';
th.style.userSelect = 'none';
th.addEventListener('click', () => {
const k = th.dataset.sort;
if (_histSortKey === k) {
_histSortDir = _histSortDir === 'asc' ? 'desc' : 'asc';
} else {
_histSortKey = k;
// Default direction: 'desc' for numbers + timestamps (biggest/newest first),
// 'asc' for text columns (alphabetical).
_histSortDir = ['serial','project','client','record_type','waveform_key'].includes(k) ? 'asc' : 'desc';
}
renderHistTable();
});
});
});
// ── Sidecar review modal ─────────────────────────────────────────────────────── // ── Sidecar review modal ───────────────────────────────────────────────────────
// //
// Opens on row click in the History table. Loads the .sfm.json sidecar // Opens on row click in the History table. Loads the .sfm.json sidecar
@@ -2430,23 +2512,373 @@ async function openSidecarModal(eventId) {
document.getElementById('sc-edit-ft').checked = false; document.getElementById('sc-edit-ft').checked = false;
document.getElementById('sc-edit-reviewer').value = ''; document.getElementById('sc-edit-reviewer').value = '';
document.getElementById('sc-edit-notes').value = ''; document.getElementById('sc-edit-notes').value = '';
// Reset waveform area
document.getElementById('sc-waveform-status').textContent = 'Loading waveform…';
document.getElementById('sc-waveform-charts').innerHTML = '';
_destroyScCharts();
try { // Sidecar + waveform fetched in parallel — neither blocks the other.
const r = await fetch(`${api()}/db/events/${eventId}/sidecar`); const sidecarP = fetch(`${api()}/db/events/${eventId}/sidecar`)
if (!r.ok) { .then(async r => {
const e = await r.json().catch(() => ({})); if (!r.ok) { const e = await r.json().catch(() => ({})); throw new Error(e.detail || r.statusText); }
throw new Error(e.detail || r.statusText); return r.json();
} });
const data = await r.json(); const waveformP = fetch(`${api()}/db/events/${eventId}/waveform.json`)
.then(async r => {
if (r.status === 404) return null; // no waveform available — render empty state
if (!r.ok) { const e = await r.json().catch(() => ({})); throw new Error(e.detail || r.statusText); }
return r.json();
});
// Sidecar usually loads first (smaller payload). Each one renders
// independently so the modal becomes useful as soon as either lands.
sidecarP.then(data => {
_scCurrentSidecar = data; _scCurrentSidecar = data;
_renderSidecar(data); _renderSidecar(data);
document.getElementById('sc-status').textContent = ''; document.getElementById('sc-status').textContent = '';
} catch (e) { }).catch(e => {
document.getElementById('sc-status').className = 'sc-status error'; document.getElementById('sc-status').className = 'sc-status error';
document.getElementById('sc-status').textContent = `Load failed: ${e.message}`; document.getElementById('sc-status').textContent = `Sidecar load failed: ${e.message}`;
});
waveformP.then(data => {
if (!data) {
document.getElementById('sc-waveform-status').textContent = 'No waveform data for this event.';
return;
}
_renderScWaveform(data);
}).catch(e => {
document.getElementById('sc-waveform-status').textContent = `Waveform load failed: ${e.message}`;
});
}
// ── Sidecar-modal waveform plot ──────────────────────────────────────────────
// Renders the 4-channel decoded waveform fetched from
// /db/events/{id}/waveform.json — MicL on top, Tran on bottom (matches
// Instantel BW Event Report layout). Uses Chart.js (loaded at the top of
// the page for the live-device viewer).
const _SC_CHANNEL_COLORS = {
MicL: '#e066ff',
Long: '#3a80ff',
Vert: '#3fb950',
Tran: '#f85149',
};
const _SC_CHANNEL_ORDER = ['MicL', 'Long', 'Vert', 'Tran'];
let _scCharts = {};
// User preference for how mic is displayed in plots — dBL (default,
// matches BW printout convention + the rest of SFM) or psi (the raw
// sample unit). Toggleable via the header pill; persists in localStorage.
function _getMicUnit() {
return localStorage.getItem('sfm_mic_unit') === 'psi' ? 'psi' : 'dBL';
}
function _setMicUnit(u) {
localStorage.setItem('sfm_mic_unit', u === 'psi' ? 'psi' : 'dBL');
_refreshMicUnitToggleLabel();
// Re-render the open modal so the change is immediately visible.
if (_scCurrentEventId) openSidecarModal(_scCurrentEventId);
}
function _refreshMicUnitToggleLabel() {
const b = document.getElementById('mic-unit-toggle');
if (b) b.textContent = `Mic: ${_getMicUnit()}`;
}
// Convert a psi value to dB(L). Returns null for non-positive values
// (log of zero is undefined) — Chart.js handles null as a gap in the line.
function _psiToDbl(psi) {
if (psi == null || !(psi > 0)) return null;
return 20 * Math.log10(psi / DBL_REF);
}
// Per-sample mic display floor. Sound pressure AC samples spend most
// of their time at the digitization noise floor (1-2 ADC counts ≈ ~20-40
// dBL). Rendering each one as null/-inf produces a spikey discontinuous
// chart of "moments when sound briefly exceeded 80 dBL" — confusing.
// Instead we rectify (abs the AC waveform), convert to dBL, and floor
// anything below MIC_DBL_FLOOR so the chart has a continuous baseline
// with peaks rising above it. Matches how acoustic engineers expect to
// see SPL-vs-time.
const MIC_DBL_FLOOR = 60;
function _psiToDblForChart(psi) {
if (psi == null) return MIC_DBL_FLOOR;
const a = Math.abs(psi);
if (a === 0) return MIC_DBL_FLOOR;
const dbl = 20 * Math.log10(a / DBL_REF);
return dbl > MIC_DBL_FLOOR ? dbl : MIC_DBL_FLOOR;
}
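For checking the mic conversion off-page, here is a Python sketch of the same arithmetic as `_psiToDbl` and `_psiToDblForChart`. It is an illustration under one loud assumption: `DBL_REF_PSI` is set to 2.9e-9 psi (20 µPa expressed in psi), whereas the page's actual `DBL_REF` constant is defined outside this diff and may differ. The function names are hypothetical Python equivalents, not part of the codebase.

```python
import math

# ASSUMPTION: 20 µPa reference pressure expressed in psi. The page's own
# DBL_REF is defined elsewhere and is not shown in this diff.
DBL_REF_PSI = 2.9e-9
MIC_DBL_FLOOR = 60  # same display floor as the JS constant


def psi_to_dbl(psi):
    """True dB(L) for a positive pressure; None when the log is undefined."""
    if psi is None or not psi > 0:
        return None
    return 20 * math.log10(psi / DBL_REF_PSI)


def psi_to_dbl_for_chart(psi):
    """Rectify, convert, and clamp to the floor so the trace stays continuous."""
    if psi is None or psi == 0:
        return MIC_DBL_FLOOR
    dbl = 20 * math.log10(abs(psi) / DBL_REF_PSI)
    return max(dbl, MIC_DBL_FLOOR)


loud = psi_to_dbl(0.001)             # roughly 110.75 dB(L) under the assumed reference
rectified = psi_to_dbl_for_chart(-0.001)
quiet = psi_to_dbl_for_chart(1e-9)   # below the floor, so clamped to 60
```

The two functions differ on purpose: the peak label keeps the unrectified value and returns nothing for non-positive input, while the chart variant never returns a gap.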
// Adaptive decimal formatter — scientific notation is reserved for truly
// extreme values (10000+ or sub-0.0001). Normal-range values (most peaks
// fall here) render as decimals with sensible precision. Replaces the
// previous .toExponential(3) call that turned every peak into an ugly "2.500e-2".
function _fmtPeak(v, unit) {
if (v == null || (typeof v === 'number' && !isFinite(v))) return '';
if (typeof v !== 'number') return String(v) + (unit ? ' ' + unit : '');
if (v === 0) return '0' + (unit ? ' ' + unit : '');
const a = Math.abs(v);
const u = unit ? ' ' + unit : '';
if (a >= 0.0001 && a < 10000) {
const d = a >= 100 ? 1 : a >= 10 ? 2 : a >= 1 ? 3 : a >= 0.1 ? 4 : 5;
return v.toFixed(d) + u;
}
return v.toExponential(2) + u;
}
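The precision ladder in `_fmtPeak` is easy to misread, so here is a Python sketch of the same rules for experimenting with sample values. `fmt_peak` is a hypothetical port, not code from the repository, and it differs from the JavaScript in one known way: Python's `e` format pads the exponent to two digits (`2.00e-05`) where `toExponential(2)` would give `2.00e-5`. The non-number string branch of the original is omitted.

```python
import math


def fmt_peak(v, unit=""):
    """Adaptive decimal formatting: more decimals for smaller magnitudes."""
    if v is None or (isinstance(v, float) and not math.isfinite(v)):
        return ""
    u = f" {unit}" if unit else ""
    if v == 0:
        return "0" + u
    a = abs(v)
    if 0.0001 <= a < 10000:
        # 1 decimal at >=100, 2 at >=10, 3 at >=1, 4 at >=0.1, otherwise 5.
        d = 1 if a >= 100 else 2 if a >= 10 else 3 if a >= 1 else 4 if a >= 0.1 else 5
        return f"{v:.{d}f}{u}"
    # Only extreme magnitudes fall through to scientific notation.
    return f"{v:.2e}{u}"


small = fmt_peak(0.025, "in/s")
mid = fmt_peak(2.5)
large = fmt_peak(123.456)
```

Each magnitude band keeps roughly four significant digits, which is why a 0.025 in/s peak shows five decimals while a value above 100 shows one.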
function _destroyScCharts() {
Object.values(_scCharts).forEach(c => { try { c.destroy(); } catch {} });
_scCharts = {};
}
function _renderScWaveform(data) {
document.getElementById('sc-waveform-status').textContent = '';
const chartsDiv = document.getElementById('sc-waveform-charts');
chartsDiv.innerHTML = '';
_destroyScCharts();
const channels = data.channels || {};
// time_axis is METADATA, not an array — it carries sample_rate,
// pretrig_samples, t0_ms (first-sample time relative to trigger,
// negative when pretrig samples exist), and dt_ms. Trigger is at
// t=0 by convention.
const ta = data.time_axis || {};
const sr = ta.sample_rate || 1024;
const dtMs = ta.dt_ms || (1000.0 / sr);
const t0Ms = ta.t0_ms != null ? ta.t0_ms : 0;
// Histogram events have per-interval peaks, not per-sample data.
// Render as bars (one per interval) instead of a connected line, and
// suppress trigger/zero overlays which don't apply. X-axis becomes
// interval index since the sample_rate-based time math is meaningless
// here (each "sample" is one interval, typically 1-5 minutes long).
const isHistogram = String(data.record_type || '').toLowerCase().includes('histogram');
// Which channels have data — determines which one renders the shared bottom axis.
const withData = _SC_CHANNEL_ORDER.filter(ch =>
channels[ch] && (channels[ch].values || []).length > 0
);
const lastCh = withData[withData.length - 1];
const micUnit = _getMicUnit(); // user preference: 'dBL' or 'psi'
for (const ch of _SC_CHANNEL_ORDER) {
const chData = channels[ch];
if (!chData) continue;
let values = chData.values || [];
let chUnit = chData.unit || '';
let chPeak = chData.peak;
// Mic channel: convert from raw psi to dB(L) when user prefers dBL
// (default). Per-sample values use _psiToDblForChart which rectifies
// (abs) the AC waveform and floors at MIC_DBL_FLOOR so the chart is
// continuous with a baseline + peaks above it, instead of a sparse
// pattern of isolated spikes for "moments when sound briefly exceeded
// the Y-axis bottom". The peak label uses _psiToDbl with the
// unrectified peak (preserves the true measurement).
if (ch === 'MicL' && chUnit === 'psi' && micUnit === 'dBL') {
values = values.map(_psiToDblForChart);
chPeak = _psiToDbl(chPeak);
chUnit = 'dB(L)';
}
const wrap = document.createElement('div');
wrap.style.cssText = 'background:var(--surface);border:1px solid var(--border2);border-radius:6px;padding:6px 30px 4px 10px';
const lbl = document.createElement('div');
lbl.style.cssText = `font-size:10px;font-weight:600;letter-spacing:0.05em;text-transform:uppercase;margin-bottom:2px;color:${_SC_CHANNEL_COLORS[ch]};display:flex;justify-content:space-between`;
const peakStr = chPeak != null
? `peak ${_fmtPeak(chPeak, chUnit)}`
: '';
lbl.innerHTML = `<span>${ch}</span><span style="color:var(--text-dim);font-weight:normal">${peakStr}</span>`;
wrap.appendChild(lbl);
if (values.length === 0) {
const e = document.createElement('div');
e.style.cssText = 'height:80px;display:flex;align-items:center;justify-content:center;color:var(--text-dim);font-size:11px';
e.textContent = 'no samples decoded';
wrap.appendChild(e);
chartsDiv.appendChild(wrap);
continue;
}
const canvasWrap = document.createElement('div');
canvasWrap.style.cssText = 'position:relative;height:100px';
const canvas = document.createElement('canvas');
canvasWrap.appendChild(canvas);
wrap.appendChild(canvasWrap);
chartsDiv.appendChild(wrap);
// Waveform: per-sample time in ms relative to trigger (negative for pretrig).
// Histogram: when the server has aggregated to BW-reported intervals AND
// provides per-interval timestamps, use those as x-axis labels (HH:MM:SS).
// Falls back to interval index.
let times;
if (isHistogram) {
const intervalTimes = ta.interval_times || [];
times = (intervalTimes.length === values.length)
? intervalTimes
: values.map((_, i) => i + 1);
} else {
times = values.map((_, i) => t0Ms + i * dtMs);
}
// Downsample for rendering when very long.
const MAX = 3000;
let rT = times, rV = values;
if (values.length > MAX) {
const step = Math.ceil(values.length / MAX);
rT = times.filter((_, i) => i % step === 0);
rV = values.filter((_, i) => i % step === 0);
}
const showX = (ch === lastCh);
// Tick label formatter: snap floats to 1 decimal place so we don't get
// "11.7187040000000002 ms" garbage from accumulated floating-point error.
const xAxisLabel = isHistogram ? '' : ' ms';
const fmtTick = i => {
const v = rT[i];
if (typeof v === 'number') {
// Whole numbers (intervals) → no decimals. Sub-integer ms → 1 decimal.
const s = Number.isInteger(v) ? String(v) : v.toFixed(1);
return s + xAxisLabel;
}
return String(v) + xAxisLabel;
};
// Y-axis bounds. Convention:
// - Geophones (Tran/Vert/Long) on waveform-mode events:
// symmetric around zero so the zero line sits in the middle and
// positive/negative excursions are visually balanced.
// - Mic (always positive sound pressure) + histograms (per-interval
// peaks, always positive): default auto-scale, zero at the bottom.
let yBounds = {};
const isGeo = ch !== 'MicL';
if (isGeo && !isHistogram) {
// Waveform geo: symmetric around zero, full zoom to shape detail.
let absMax = 0;
for (const v of values) {
const a = Math.abs(v);
if (a > absMax) absMax = a;
}
const padded = (absMax || 1) * 1.10;
yBounds = { min: -padded, max: padded };
} else if (isGeo && isHistogram) {
// Histogram geo: enforce a minimum chart range so a quiet
// 0.005 in/s event renders as ~10% of chart height instead of
// filling the panel. Matches BW's near-fixed-scale convention
// (their footer is "Geo: 0.002 in/s/div" — a chart-relative scale,
// not auto-zoom).
const HIST_GEO_MIN_INS = 0.05;
let peak = 0;
for (const v of values) { const a = Math.abs(v); if (a > peak) peak = a; }
yBounds = { min: 0, max: Math.max(peak * 1.10, HIST_GEO_MIN_INS) };
} else if (ch === 'MicL' && micUnit === 'dBL') {
// Mic in dBL — pin baseline at noise-floor minimum (where we floored
// quiet samples), top at actual peak + a few dB headroom.
const peakDbl = (typeof chPeak === 'number' && isFinite(chPeak))
? chPeak + 5
: 100;
yBounds = { min: MIC_DBL_FLOOR, max: Math.max(peakDbl, MIC_DBL_FLOOR + 20) };
} else if (ch === 'MicL' && isHistogram && micUnit === 'psi') {
// Mic histogram in psi — same minimum-range treatment as geo.
// 0.001 psi ≈ 110 dBL — typical "loud" mic peak. Quiet events
// sit near the bottom.
const HIST_MIC_MIN_PSI = 0.001;
let peak = 0;
for (const v of values) { const a = Math.abs(v); if (a > peak) peak = a; }
yBounds = { min: 0, max: Math.max(peak * 1.10, HIST_MIC_MIN_PSI) };
}
_scCharts[ch] = new Chart(canvas, {
type: isHistogram ? 'bar' : 'line',
data: {
labels: rT.map(t => (typeof t === 'number' ? (Number.isInteger(t) ? String(t) : t.toFixed(2)) : t)),
datasets: isHistogram ? [{
data: rV,
backgroundColor: _SC_CHANNEL_COLORS[ch],
borderWidth: 0,
barPercentage: 1.0,
categoryPercentage: 1.0, // bars touch — "tight bargraph" look
}] : [{
data: rV,
borderColor: _SC_CHANNEL_COLORS[ch],
borderWidth: 1,
pointRadius: 0,
tension: 0,
}],
},
options: {
animation: false, responsive: true, maintainAspectRatio: false,
plugins: {
legend: { display: false },
tooltip: {
mode: 'index', intersect: false,
callbacks: {
title: items => isHistogram
? `interval ${items[0].label}`
: `t = ${items[0].label} ms`,
label: item => `${ch}: ${_fmtPeak(item.raw, chUnit)}`,
},
},
},
scales: {
x: {
type: 'category', display: showX,
ticks: { color: '#484f58', maxTicksLimit: 8, maxRotation: 0, callback: (v, i) => fmtTick(i) },
grid: { color: '#21262d', drawTicks: showX },
},
y: {
...yBounds,
ticks: { color: '#484f58', maxTicksLimit: 4 },
grid: { color: '#21262d' },
title: { display: true, text: chUnit, color: '#484f58', font: { size: 9 } },
},
},
},
plugins: isHistogram ? [] : [{
// Trigger line + triangle markers + zero baseline — only meaningful
// for waveform-mode events. Histograms have no trigger.
id: 'overlays',
afterDraw(chart) {
const ctx = chart.ctx, x = chart.scales.x, y = chart.scales.y;
// Dashed trigger line at t=0
const zi = rT.findIndex(t => parseFloat(t) >= 0);
if (zi >= 0) {
const px = x.getPixelForValue(zi);
ctx.save();
ctx.beginPath(); ctx.moveTo(px, y.top); ctx.lineTo(px, y.bottom);
ctx.strokeStyle = 'rgba(248,81,73,0.8)'; ctx.lineWidth = 1.2;
ctx.setLineDash([4, 3]); ctx.stroke(); ctx.restore();
// Triangle markers above and below the chart
ctx.save();
ctx.fillStyle = '#f85149';
ctx.beginPath();
ctx.moveTo(px - 4, y.top - 7); ctx.lineTo(px + 4, y.top - 7); ctx.lineTo(px, y.top - 1);
ctx.closePath(); ctx.fill();
ctx.beginPath();
ctx.moveTo(px - 4, y.bottom + 7); ctx.lineTo(px + 4, y.bottom + 7); ctx.lineTo(px, y.bottom + 1);
ctx.closePath(); ctx.fill();
ctx.restore();
}
// Zero baseline + label
const zy = y.getPixelForValue(0);
if (zy >= y.top && zy <= y.bottom) {
ctx.save();
ctx.strokeStyle = '#30363d'; ctx.lineWidth = 0.8;
ctx.setLineDash([2, 2]);
ctx.beginPath(); ctx.moveTo(x.left, zy); ctx.lineTo(x.right, zy); ctx.stroke();
ctx.restore();
ctx.save();
ctx.fillStyle = '#c9d1d9'; ctx.font = '10px monospace';
ctx.textAlign = 'left'; ctx.textBaseline = 'middle';
ctx.fillText('0.0', x.right + 6, zy);
ctx.restore();
}
},
}],
});
} }
} }
// Make sure charts get cleaned up when the modal closes.
function _scCleanupOnClose() { _destroyScCharts(); }
function _renderSidecar(data) { function _renderSidecar(data) {
const ev = data.event || {}; const ev = data.event || {};
const pv = data.peak_values || {}; const pv = data.peak_values || {};
@@ -2454,6 +2886,12 @@ function _renderSidecar(data) {
const bw = data.blastware || {}; const bw = data.blastware || {};
const src = data.source || {}; const src = data.source || {};
const rev = data.review || {}; const rev = data.review || {};
// bw_report carries the per-channel ASCII-derived stats (ZC Freq,
// saturation flags, peak time, etc.). Only present on events
// ingested with a preserved .TXT (post-2026-05-27); falls back to
// empty for legacy events.
const bwrPeaks = (data.bw_report || {}).peaks || {};
const bwrMic = (data.bw_report || {}).mic || {};
document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`; document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`;
@@ -2479,27 +2917,72 @@ function _renderSidecar(data) {
}; };
document.getElementById('sc-f-serial').textContent = ev.serial || '—'; document.getElementById('sc-f-serial').textContent = ev.serial || '—';
document.getElementById('sc-f-ts').textContent = ev.timestamp || '—'; // Route through _fmtTs so the unit-local naive timestamp shows as
// "5/27/2026, 6:00:13 AM" instead of "2026-05-27T06:00:13".
document.getElementById('sc-f-ts').textContent = _fmtTs(ev.timestamp);
document.getElementById('sc-f-rt').textContent = ev.record_type || '—'; document.getElementById('sc-f-rt').textContent = ev.record_type || '—';
document.getElementById('sc-f-sr').textContent = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : ''); document.getElementById('sc-f-sr').textContent = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : '');
document.getElementById('sc-f-key').textContent = ev.waveform_key || '—'; document.getElementById('sc-f-key').textContent = ev.waveform_key || '—';
document.getElementById('sc-f-tran').textContent = fmtPpv(pv.transverse); // Suffix with " · {prefix}{N} Hz" when bw_report has a ZC Freq.
document.getElementById('sc-f-vert').textContent = fmtPpv(pv.vertical); // Above-range ZC peaks (BW ">100 Hz") get a literal ">" prefix so
document.getElementById('sc-f-long').textContent = fmtPpv(pv.longitudinal); // operators see the same indicator the PDF shows.
const fmtZc = bwr => {
if (!bwr || bwr.zc_freq_hz == null) return '';
const prefix = bwr.zc_freq_above_range ? '>' : '';
return ` · ${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
};
document.getElementById('sc-f-tran').textContent = fmtPpv(pv.transverse) + fmtZc(bwrPeaks.tran);
document.getElementById('sc-f-vert').textContent = fmtPpv(pv.vertical) + fmtZc(bwrPeaks.vert);
document.getElementById('sc-f-long').textContent = fmtPpv(pv.longitudinal) + fmtZc(bwrPeaks.long);
document.getElementById('sc-f-pvs').textContent = fmtPpv(pv.vector_sum); document.getElementById('sc-f-pvs').textContent = fmtPpv(pv.vector_sum);
document.getElementById('sc-f-mic').textContent = fmtMic(pv.mic_psi); document.getElementById('sc-f-mic').textContent = fmtMic(pv.mic_psi) + fmtZc(bwrMic);
document.getElementById('sc-f-project').textContent = pi.project || '—'; document.getElementById('sc-f-project').textContent = pi.project || '—';
document.getElementById('sc-f-client').textContent = pi.client || '—'; document.getElementById('sc-f-client').textContent = pi.client || '—';
document.getElementById('sc-f-operator').textContent = pi.operator || '—'; document.getElementById('sc-f-operator').textContent = pi.operator || '—';
document.getElementById('sc-f-loc').textContent = pi.sensor_location || '—'; document.getElementById('sc-f-loc').textContent = pi.sensor_location || '—';
document.getElementById('sc-f-bw').textContent = bw.filename || '—'; // Filename rendered as a clickable download link for the original BW
// binary. Same endpoint the live-device viewer uses for stored events
// (/db/events/{id}/blastware_file).
const bwCell = document.getElementById('sc-f-bw');
bwCell.innerHTML = '';
if (bw.filename && _scCurrentEventId) {
const a = document.createElement('a');
a.href = `${api()}/db/events/${_scCurrentEventId}/blastware_file`;
a.textContent = bw.filename;
a.download = bw.filename;
a.title = 'Download original BW event binary';
a.style.color = 'var(--accent, #58a6ff)';
a.style.textDecoration = 'underline';
bwCell.appendChild(a);
} else {
bwCell.textContent = '—';
}
document.getElementById('sc-f-bwsize').textContent = bw.filesize != null ? `${bw.filesize} bytes` : '—';
document.getElementById('sc-f-sha').textContent = bw.sha256 || '—';
// Source kind + a download link for the preserved BW ASCII report
// (.TXT), when available. Only events ingested after 2026-05-27
// have the .TXT preserved; older events show "—".
const srcCell = document.getElementById('sc-f-src');
srcCell.innerHTML = '';
srcCell.appendChild(document.createTextNode(src.kind || '—'));
if (src.txt_filename && _scCurrentEventId) {
const a = document.createElement('a');
a.href = `${api()}/db/events/${_scCurrentEventId}/ascii_report.txt`;
a.textContent = ' (download .TXT)';
a.download = src.txt_filename;
a.title = 'Download preserved BW ASCII report';
a.style.color = 'var(--accent, #58a6ff)';
a.style.marginLeft = '8px';
a.style.fontSize = '11px';
srcCell.appendChild(a);
}
// captured_at has a "Z" suffix (UTC); _fmtTs converts to browser local
// — matches the BW-reported recorded-at, no more "21:59:57 vs it's 6 PM"
// confusion from operators reading the raw UTC value.
document.getElementById('sc-f-cap').textContent = _fmtTs(src.captured_at);
document.getElementById('sc-edit-ft').checked = !!rev.false_trigger;
document.getElementById('sc-edit-reviewer').value = rev.reviewer || '';
@@ -2512,6 +2995,19 @@ function closeSidecarModal() {
document.getElementById('sc-overlay').classList.remove('visible');
_scCurrentEventId = null;
_scCurrentSidecar = null;
_destroyScCharts();
}
// Trigger a PDF download for the currently-open event. The browser
// handles the actual save dialog from the Content-Disposition header
// the server sends.
function downloadEventReport() {
if (!_scCurrentEventId) return;
const url = `${api()}/db/events/${_scCurrentEventId}/report.pdf`;
// Open in a new tab — browser prompts to save or displays inline,
// and a failed fetch (e.g. 404 for events with no waveform) shows
// its JSON error in-page rather than silently failing.
window.open(url, '_blank');
}
function onSidecarOverlayClick(e) {
@@ -2722,6 +3218,16 @@ document.addEventListener('keydown', e => {
// hit localhost:8200, 10.0.0.44:8200, or anything else.
document.getElementById('api-base').value = window.location.origin;
// Reflect any persisted mic-unit preference in the header pill on load
_refreshMicUnitToggleLabel();
// We default to Database view → trigger initial history + units load
// (switchSection handles this when clicked, but we never click on first paint).
if (currentSection === 'db') {
if (!histLoaded) loadHistory();
if (!unitsLoaded) loadUnits();
}
// Press Enter in any live connect field to connect
['dev-host','dev-port'].forEach(id => {
document.getElementById(id)?.addEventListener('keydown', e => { if (e.key === 'Enter') connectUnit(); });
@@ -2738,11 +3244,18 @@ document.getElementById('api-base').value = window.location.origin;
<button class="sc-close" onclick="closeSidecarModal()">×</button>
</div>
<div class="sc-body">
<!-- Waveform plot: 4 channels stacked (MicL, Long, Vert, Tran) -->
<div class="sc-section" id="sc-section-waveform">
<h4>Waveform</h4>
<div id="sc-waveform-status" style="color:var(--text-dim);font-size:11px;margin-bottom:6px">Loading…</div>
<div id="sc-waveform-charts" style="display:flex;flex-direction:column;gap:6px"></div>
</div>
<div class="sc-section">
<h4>Event</h4>
<dl class="sc-grid">
<dt>Serial</dt> <dd id="sc-f-serial"></dd>
<dt title="When the seismograph recorded this event (from the BW report's Event Time field)">Recorded at</dt>
<dd id="sc-f-ts"></dd>
<dt>Record type</dt> <dd id="sc-f-rt"></dd>
<dt>Sample rate</dt> <dd id="sc-f-sr"></dd>
<dt>Waveform key</dt> <dd id="sc-f-key"></dd>
@@ -2774,7 +3287,8 @@ document.getElementById('api-base').value = window.location.origin;
<dt id="sc-l-bwsize">File size</dt> <dd id="sc-f-bwsize"></dd>
<dt id="sc-l-sha">File sha256</dt> <dd id="sc-f-sha"></dd>
<dt>Source kind</dt> <dd id="sc-f-src"></dd>
<dt title="When our server received and stored this event (sfm-db insert time, not the recording time)">Received by server at</dt>
<dd id="sc-f-cap"></dd>
</dl>
</div>
<div class="sc-section">
@@ -2797,6 +3311,10 @@ document.getElementById('api-base').value = window.location.origin;
</div>
<div class="sc-footer">
<span class="sc-status" id="sc-status"></span>
<button class="btn btn-ghost" id="sc-pdf-btn" onclick="downloadEventReport()"
title="Download an Instantel-style Event Report PDF for this event">
Download PDF
</button>
<button class="btn btn-ghost" onclick="closeSidecarModal()">Cancel</button>
<button class="btn" id="sc-save-btn" onclick="saveSidecarReview()">Save</button>
</div>
+42
@@ -108,11 +108,30 @@ class WaveformStore:
"""Return absolute path to the .h5 clean-waveform file for a given event."""
return self._serial_dir(serial) / f"{filename}.h5"
def txt_path_for(self, serial: str, filename: str) -> Path:
"""Return absolute path to the preserved BW ASCII report (.TXT)
for a given event.
We name it ``<filename>_ASCII.TXT`` to match BW's own filename
convention in the ACH folder. Saved at ingest time alongside
the binary so the parser bug fixes can be applied retroactively
by re-parsing without needing to re-forward from the watcher PC.
"""
return self._serial_dir(serial) / f"{filename}_ASCII.TXT"
def open_blastware(self, serial: str, filename: str) -> Optional[Path]:
"""Return absolute path to an existing event file or None."""
bw_path, _ = self.paths_for(serial, filename)
return bw_path if bw_path.exists() else None
def open_txt(self, serial: str, filename: str) -> Optional[Path]:
"""Return absolute path to the preserved BW ASCII report for an
event, or None if the .TXT wasn't saved at ingest time (events
ingested before .TXT preservation landed will show None until
re-forwarded)."""
p = self.txt_path_for(serial, filename)
return p if p.exists() else None
# ── save / load ─────────────────────────────────────────────────────────────
def save(
@@ -357,6 +376,28 @@ class WaveformStore:
filesize = bw_path.stat().st_size
sha256 = event_file_io.file_sha256(bw_path)
# 1b. preserve the raw BW ASCII report (.TXT) alongside the binary.
# Saved at <root>/<serial>/<filename>_ASCII.TXT. Lets us re-parse
# offline after parser fixes without needing to re-forward from
# the watcher PC. Negligible storage cost (~15 KB per event).
# Skipped silently when no report was supplied (live download path,
# manual upload without paired TXT).
txt_filename: Optional[str] = None
if bw_report_text is not None:
try:
txt_path = self.txt_path_for(serial, filename)
if isinstance(bw_report_text, bytes):
txt_path.write_bytes(bw_report_text)
else:
txt_path.write_text(bw_report_text)
txt_filename = txt_path.name
except Exception as exc:
log.warning(
"save_imported_bw: failed to save TXT for %s: %s; "
"continuing without it",
filename, exc,
)
# 2. write the .h5 clean-waveform file from the parsed Event.
# Note: peaks here are computed from raw samples (the BW file
# doesn't carry the device-authoritative 0C peaks). Best-effort.
@@ -393,6 +434,7 @@ class WaveformStore:
blastware_sha256=sha256,
source_kind="bw-import",
a5_pickle_filename=None,
txt_filename=txt_filename,
review=existing_review,
bw_report=bw_report,
)
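The `.TXT` preservation step in this file can be sketched in isolation. This is a minimal illustration under stated assumptions, not the `WaveformStore` code: `save_report_text` and its `root` argument are hypothetical names, and only the `<root>/<serial>/<filename>_ASCII.TXT` layout and the bytes-or-str handling are taken from the diff.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def save_report_text(root: Path, serial: str, filename: str,
                     report: Optional[Union[str, bytes]]) -> Optional[str]:
    """Hypothetical helper: store the ASCII report, return its file name."""
    if report is None:
        # No paired report was supplied, so there is nothing to preserve.
        return None
    serial_dir = root / serial
    serial_dir.mkdir(parents=True, exist_ok=True)
    txt_path = serial_dir / f"{filename}_ASCII.TXT"
    if isinstance(report, bytes):
        txt_path.write_bytes(report)
    else:
        txt_path.write_text(report)
    return txt_path.name


with tempfile.TemporaryDirectory() as tmp:
    # "EVENT01" is a made-up event name used only for this example.
    saved_name = save_report_text(Path(tmp), "BE18190", "EVENT01", "Serial Number : BE18190\n")
    saved_exists = (Path(tmp) / "BE18190" / saved_name).exists()
    skipped = save_report_text(Path(tmp), "BE18190", "EVENT01", None)
```

Returning the bare file name rather than the full path mirrors how `txt_filename` is passed on to the sidecar in the hunk above.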
+92
@@ -385,6 +385,98 @@ def test_user_notes_extra_lines_beyond_four_are_dropped():
assert "L5" not in r.user_note_labels.values()
def test_oorange_marker_treated_as_saturation():
"""BW writes 'OORANGE' (Out Of Range — truncated) when a channel
exceeds its full-scale. Verify ppv_ips falls back to geo_range_ips
+ saturated flag is set, mirroring the real T190LD5Q.LK0W,
T438L713.RY0W, and K557L3YM.OE0W events from prod 2026-05-27.
"""
txt = """\
"Event Type : Full Waveform"
"Serial Number : BE18190"
"Geo Range : 10.000 in/s"
"Tran PPV : 2.140 in/s"
"Vert PPV : OORANGE in/s"
"Long PPV : 2.830 in/s"
"Peak Vector Sum : OORANGE in/s"
"Peak Vector Sum TimeSum : 0.007 s"
"MicL PSPL : OORANGE "
"""
r = parse_report(txt)
# Tran/Long parse normally
assert r.channels["Tran"].ppv_ips == 2.14
assert r.channels["Tran"].ppv_saturated is False
assert r.channels["Long"].ppv_ips == 2.83
# Vert saturated → range max + flag
assert r.channels["Vert"].ppv_ips == 10.0
assert r.channels["Vert"].ppv_saturated is True
# PVS saturated → sqrt(3) * range_max as upper bound + flag
import math
assert r.peak_vector_sum_ips == pytest.approx(math.sqrt(3) * 10.0)
assert r.peak_vector_sum_saturated is True
# Mic saturated → 140 dBL conservative upper bound + flag
assert r.mic.pspl_dbl == 140.0
assert r.mic.pspl_saturated is True
# PVS time still parses despite the BW typo'd label "TimeSum"
assert r.peak_vector_sum_time_s == pytest.approx(0.007)
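The saturation fallback this test locks in can be sketched as follows. This is an illustration only: `parse_ppv` and `pvs_upper_bound` are hypothetical names, not the parser's real functions, and the behavior is inferred from the assertions above.

```python
import math
from typing import Optional, Tuple


def parse_ppv(token: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Hypothetical helper returning (ppv_ips, saturated)."""
    text = token.strip()
    if text.upper() == "OORANGE":
        # Out-of-range marker: fall back to the full-scale range and flag it.
        return geo_range_ips, True
    try:
        return float(text), False
    except ValueError:
        # Anything else non-numeric is treated as unknown.
        return None, False


def pvs_upper_bound(geo_range_ips: float) -> float:
    # Three orthogonal axes all at full scale give sqrt(3) * range.
    return math.sqrt(3) * geo_range_ips
```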
def test_real_oorange_event_t190_parses():
"""End-to-end against the real T190LD5Q.LK0W ASCII file pulled from
a Windows watcher PC on 2026-05-27. This is the canonical example
of the parser-PPV-miss bug we fixed in this iteration."""
fixture_path = (
Path(__file__).parent.parent / "example-events" /
"ascii-5-27-26" / "T190LD5Q_LK0W_ASCII.TXT"
)
if not fixture_path.exists():
pytest.skip("real ASCII fixture not present (local-only)")
r = parse_report_file(fixture_path)
assert r.serial == "BE18190"
assert r.geo_range_ips == 10.0
# Tran reads cleanly, Vert was OORANGE
assert r.channels["Tran"].ppv_ips == pytest.approx(2.14)
assert r.channels["Vert"].ppv_ips == 10.0
assert r.channels["Vert"].ppv_saturated is True
assert r.channels["Long"].ppv_ips == pytest.approx(2.83)
assert r.peak_vector_sum_saturated is True
assert r.peak_vector_sum_time_s == pytest.approx(0.007)
# Same fixture: Tran ZC Freq is ">100 Hz" — must parse as 100 +
# above_range flag, not None (which would render as "—" on the PDF).
assert r.channels["Tran"].zc_freq_hz == 100.0
assert r.channels["Tran"].zc_freq_above_range is True
# Vert/Long are normal numeric values; flag stays False.
assert r.channels["Vert"].zc_freq_above_range is False
assert r.channels["Long"].zc_freq_above_range is False
def test_above_range_marker_treated_as_zc_threshold():
"""BW writes '>100 Hz' for ZC Freq when the zero-crossing algorithm
sees a peak too fast to count (cuts off at the device's 100 Hz
reporting ceiling). Parser must store the threshold + flag, not
fall back to None.
"""
txt = """\
"Event Type : Full Waveform"
"Serial Number : BE18190"
"Tran ZC Freq : >100 Hz"
"Vert ZC Freq : 73 Hz"
"Long ZC Freq : N/A Hz"
"MicL ZC Freq : >100 Hz"
"""
r = parse_report(txt)
assert r.channels["Tran"].zc_freq_hz == 100.0
assert r.channels["Tran"].zc_freq_above_range is True
assert r.channels["Vert"].zc_freq_hz == 73.0
assert r.channels["Vert"].zc_freq_above_range is False
# N/A → None, flag stays False
assert r.channels["Long"].zc_freq_hz is None
assert r.channels["Long"].zc_freq_above_range is False
# Mic above-range
assert r.mic.zc_freq_hz == 100.0
assert r.mic.zc_freq_above_range is True
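The above-range handling exercised by this test can be sketched as a small token parser. `parse_zc_freq` is a hypothetical name; the real parser may be structured differently, and only the three outcomes (threshold plus flag, plain number, `None`) come from the assertions above.

```python
from typing import Optional, Tuple


def parse_zc_freq(token: str) -> Tuple[Optional[float], bool]:
    """Hypothetical helper returning (zc_freq_hz, above_range)."""
    text = token.strip()
    if text.startswith(">"):
        try:
            # ">100" keeps the threshold value and sets the flag.
            return float(text[1:]), True
        except ValueError:
            return None, False
    try:
        return float(text), False
    except ValueError:
        # "N/A" and other non-numeric tokens: unknown, flag stays False.
        return None, False
```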
def test_real_histogram_fixture_populates_sensor_location():
"""End-to-end: the histogram fixture uses 'Seis. Location:', which must
successfully populate sensor_location via position-based parsing."""
+171 -3
@@ -289,9 +289,106 @@ def test_read_blastware_file_round_trip(tmp_path: Path):
assert parsed.timestamp.second == ev.timestamp.second
# No A5 source recoverable.
assert parsed._a5_frames is None
# The synthetic event has no real waveform body, so the codec can't
# decode samples → read_blastware_file leaves peak_values=None
# (the "we don't know" signal) rather than fabricating all-zero
# peaks that would otherwise overwrite real DB values via UPSERT.
assert parsed.peak_values is None
assert parsed.raw_samples is not None
# Empty channels — codec returned None for the malformed synthetic body.
for ch in ("Tran", "Vert", "Long", "MicL"):
assert parsed.raw_samples[ch] == []
_BW_CODEC_FIXTURES = [
# (path, expected_n_samples_per_channel, BW-reported Vert PPV in/s for sanity)
("tests/fixtures/decode-re-5-8-26/event-a/M529LKVQ.6S0", 3328, 0.780),
("tests/fixtures/decode-re-5-8-26/event-b/M529LK5Q.RG0", 2304, 0.505),
("tests/fixtures/decode-re-5-8-26/event-c/M529LK44.AB0", 1280, 0.610),
("tests/fixtures/decode-re-5-8-26/event-d/M529LK2V.470", 1280, 0.565),
("tests/fixtures/5-11-26/M529LL1L.V70", 3328, 0.010),
("tests/fixtures/5-11-26/M529LL1L.JQ0", 3328, 3.465),
]
@pytest.mark.parametrize("path,expected_n,expected_ppv", _BW_CODEC_FIXTURES)
def test_read_blastware_file_decodes_via_codec(path: str, expected_n: int, expected_ppv: float):
"""Regression lock: ``read_blastware_file()`` must use the verified
waveform-body codec (``minimateplus.waveform_codec``), not the
retracted int16-LE assumption.
Verifies against the real BW fixture corpus: every event in the
bundled fixtures must produce the expected per-channel sample count
and a Vert PPV close to BW's own reported value. Catches any
accidental regression of the body decoder back to the old
``_decode_samples_4ch_int16_le`` path (which produced ±32K noise
on every event, giving wildly wrong PPVs).
"""
repo_root = Path(__file__).resolve().parent.parent
full_path = repo_root / path
if not full_path.exists():
pytest.skip(f"fixture missing: {full_path}")
ev = event_file_io.read_blastware_file(full_path)
assert ev.raw_samples is not None
for ch in ("Tran", "Vert", "Long"):
assert len(ev.raw_samples[ch]) == expected_n, (
f"{ch}: expected {expected_n} samples, got {len(ev.raw_samples[ch])}"
)
# PPV check: the codec produces decoded samples in 1-count ADC units;
# _peaks_from_samples scales by GEO_NORMAL_FS_INS / 32767. BW's own
# PPV is computed at slightly different precision/interpolation, so
# we allow a 0.2 in/s tolerance — well under the broken-decoder
# signature (which would produce ~10 in/s saturation).
assert ev.peak_values is not None
assert abs(ev.peak_values.vert - expected_ppv) < 0.2, (
f"Vert PPV {ev.peak_values.vert:.3f} differs from BW's "
f"{expected_ppv:.3f} by >0.2 in/s — codec regression?"
)
def test_read_blastware_file_v70_samples_match_txt_truth():
"""Strongest regression lock: every one of V70's 3328 decoded
sample-sets must match the .TXT ground truth table within the
0.005 in/s display quantum."""
repo_root = Path(__file__).resolve().parent.parent
bw_path = repo_root / "tests/fixtures/5-11-26/M529LL1L.V70"
txt_path = repo_root / "tests/fixtures/5-11-26/M529LL1L.V70.TXT"
if not bw_path.exists() or not txt_path.exists():
pytest.skip("V70 fixture missing")
import re
ev = event_file_io.read_blastware_file(bw_path)
# Parse .TXT ground truth sample table
text = txt_path.read_text()
lines = text.splitlines()
hdr_idx = next(i for i, line in enumerate(lines)
if re.match(r"^Tran\s+Vert\s+Long\s+MicL?", line.strip()))
truth = []
for line in lines[hdr_idx + 1:]:
parts = line.strip().split()
if len(parts) != 4:
continue
try:
truth.append([float(x) for x in parts])
except ValueError:
continue
assert len(truth) == 3328, f"expected 3328 truth rows, got {len(truth)}"
def adc_to_ins(count):
return count / 32767.0 * 10.0
for i, truth_row in enumerate(truth):
for ch_idx, ch_name in enumerate(("Tran", "Vert", "Long")):
decoded_ips = adc_to_ins(ev.raw_samples[ch_name][i])
truth_ips = truth_row[ch_idx]
# 0.003 in/s tolerance: <0.005 quantum + small float precision room
assert abs(decoded_ips - truth_ips) < 0.003, (
f"row {i} {ch_name}: decoded {decoded_ips:+.4f} vs "
f"truth {truth_ips:+.4f} (delta {decoded_ips - truth_ips:+.4f})"
)
def test_save_imported_bw_with_paired_report(tmp_path: Path):
@@ -432,6 +529,77 @@ def test_save_imported_bw_round_trip(tmp_path: Path):
assert stored_path.read_bytes() == src.read_bytes()
# ── apply_bw_report_dict_to_event ────────────────────────────────────────────
def test_apply_bw_report_dict_overlays_peaks_and_recording():
"""Verbatim mirror of the data shape produced by `_bw_report_to_dict`
when projecting a parsed `BwAsciiReport` into the sidecar. Confirms
each field overlays onto Event correctly so the backfill path
matches ingest behavior."""
from minimateplus.models import PeakValues
ev = Event(index=0)
bw_report = {
"peaks": {
"tran": {"ppv_ips": 9.84375},
"vert": {"ppv_ips": 0.305},
"long": {"ppv_ips": 0.405},
"vector_sum": {"ips": 14.86736},
},
"mic": {"pspl_dbl": 115.9},
"recording": {"sample_rate_sps": 1024, "record_time_s": 3.0},
}
event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
assert ev.peak_values is not None
assert ev.peak_values.tran == 9.84375
assert ev.peak_values.vert == 0.305
assert ev.peak_values.long == 0.405
assert ev.peak_values.peak_vector_sum == 14.86736
# MicL is converted dB → psi via _dbl_to_psi — just confirm non-zero
assert ev.peak_values.micl is not None and ev.peak_values.micl > 0
assert ev.sample_rate == 1024
assert ev.rectime_seconds == 3.0
def test_apply_bw_report_dict_overwrites_codec_peaks():
"""The whole point of this helper: bw_report wins over whatever the
codec produced. This is what the 2026-05-22 prod backfill missed:
DB peaks got overwritten with codec output (incl. PVS=0 on the
three top events) when they should have stayed bw_report-overlaid."""
from minimateplus.models import PeakValues
ev = Event(index=0)
# Simulate codec output that's clearly wrong (incomplete decode):
ev.peak_values = PeakValues(
tran=2.09, vert=0.0, long=0.0, peak_vector_sum=0.0,
)
bw_report = {
"peaks": {
"tran": {"ppv_ips": 9.84},
"vert": {"ppv_ips": 4.95},
"long": {"ppv_ips": 8.05},
"vector_sum": {"ips": 14.95},
},
}
event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
assert ev.peak_values.tran == 9.84
assert ev.peak_values.vert == 4.95
assert ev.peak_values.long == 8.05
assert ev.peak_values.peak_vector_sum == 14.95
def test_apply_bw_report_dict_no_op_on_empty():
"""None / empty dict / missing keys should leave Event untouched."""
from minimateplus.models import PeakValues
for empty in (None, {}, {"peaks": {}}, {"peaks": {"tran": {}}}):
ev = Event(index=0)
ev.peak_values = PeakValues(tran=1.0, vert=2.0, long=3.0)
event_file_io.apply_bw_report_dict_to_event(ev, empty)
# Unchanged
assert ev.peak_values.tran == 1.0
assert ev.peak_values.vert == 2.0
assert ev.peak_values.long == 3.0
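The overlay rule these three tests describe (report values win, missing values leave the event untouched) can be sketched with plain dictionaries. `overlay_peaks` is a hypothetical stand-in for `apply_bw_report_dict_to_event`; it works on dicts rather than `Event` and `PeakValues`, and it ignores the mic and recording fields.

```python
from typing import Optional


def overlay_peaks(current: dict, bw_report: Optional[dict]) -> dict:
    """Hypothetical helper: return a copy of `current` with report peaks applied."""
    merged = dict(current)
    peaks = (bw_report or {}).get("peaks") or {}
    for channel in ("tran", "vert", "long"):
        value = (peaks.get(channel) or {}).get("ppv_ips")
        if value is not None:
            merged[channel] = value  # the report wins over codec output
    pvs = (peaks.get("vector_sum") or {}).get("ips")
    if pvs is not None:
        merged["peak_vector_sum"] = pvs
    return merged


codec_peaks = {"tran": 2.09, "vert": 0.0, "long": 0.0, "peak_vector_sum": 0.0}
report = {"peaks": {"tran": {"ppv_ips": 9.84}, "vert": {"ppv_ips": 4.95},
                    "long": {"ppv_ips": 8.05}, "vector_sum": {"ips": 14.95}}}
overlaid = overlay_peaks(codec_peaks, report)
untouched = overlay_peaks(codec_peaks, {"peaks": {"tran": {}}})
```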
if __name__ == "__main__":
if pytest is not None:
pytest.main([__file__, "-v"])
+385
@@ -0,0 +1,385 @@
"""
test_histogram_codec.py: regression locks for the histogram body codec.
The codec is verified byte-exact against BW's ASCII export across the
in-repo histogram fixture bundle. Each test cross-checks decoded
binary fields against the corresponding .TXT row.
Run:
python -m pytest tests/test_histogram_codec.py -q
"""
from __future__ import annotations
import os
import re
import sys
from pathlib import Path
import pytest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.blastware_file import _WAVEFORM_HEADER_SIZE
from minimateplus.histogram_codec import (
_BLOCK_SIZE,
decode_histogram_body,
decode_histogram_body_full,
geo_count_to_ins,
half_period_to_hz,
walk_body,
)
from minimateplus.waveform_codec import mic_count_to_db
_FIXTURE_DIR = Path(__file__).resolve().parent.parent / "example-events" / "histogram"
def _extract_body(path: Path) -> bytes:
"""Locate the body of a BW event file — bytes between the STRT
record and the 26-byte footer."""
raw = path.read_bytes()
body_start = _WAVEFORM_HEADER_SIZE + 21
pos = body_start
footer_pos = -1
while True:
pos = raw.find(b"\x0e\x08", pos)
if pos < 0 or pos + 26 > len(raw):
break
yr = (raw[pos + 4] << 8) | raw[pos + 5]
if 2015 <= yr <= 2050:
footer_pos = pos
break
pos += 1
if footer_pos < 0:
footer_pos = len(raw) - 26
return raw[body_start:footer_pos]
def _parse_txt_rows(path: Path) -> list[tuple[str, list]]:
"""Parse a histogram .TXT into ``[(time_str, [10 col values]), …]``.
Special tokens:
- ``">100"`` (the BW-display sentinel for freq > 100 Hz) → ``None``
- non-numeric → ``None``
"""
text = path.read_text()
lines = text.splitlines()
hdr = None
for i, line in enumerate(lines):
if re.match(r"^Tran\s+", line.strip()):
hdr = i + 3 # skip 2-row header + units row
break
if hdr is None:
return []
rows: list[tuple[str, list]] = []
for line in lines[hdr:]:
parts = line.split("\t")
if len(parts) != 11:
continue
vals: list = []
for p in parts[1:]:
s = p.strip()
if s.startswith(">"):
vals.append(None) # ">100 Hz" sentinel
continue
try:
vals.append(float(s))
except ValueError:
vals.append(None)
rows.append((parts[0].strip(), vals))
return rows
# ── Block-walker plumbing ────────────────────────────────────────────────────
@pytest.mark.parametrize("fixture", [
"N844L20G.630H",
"N844L21H.2R0H",
"N844L6Z8.ZR0H",
"N844L6XE.BH0H",
"N844L23B.ND0H",
])
def test_walk_body_returns_records(fixture: str):
"""Walker yields at least one valid block per fixture."""
path = _FIXTURE_DIR / fixture
if not path.exists():
pytest.skip(f"fixture missing: {path}")
records = walk_body(_extract_body(path))
assert len(records) > 100, f"expected hundreds of blocks, got {len(records)}"
def test_walk_body_record_count_matches_txt_intervals():
"""Block count should match the .TXT interval count (off-by-one
at the tail is acceptable: the last interval may be truncated at
recording stop)."""
bin_path = _FIXTURE_DIR / "N844L20G.630H"
txt_path = _FIXTURE_DIR / "N844L20G_630H_ASCII.TXT"
if not bin_path.exists() or not txt_path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(bin_path))
txt_rows = _parse_txt_rows(txt_path)
# Allow off-by-one (final block may have been mid-write at stop)
assert abs(len(records) - len(txt_rows)) <= 1, (
f"binary {len(records)} blocks vs TXT {len(txt_rows)} intervals"
)
def test_walk_body_segment_id_increments_every_256_blocks():
"""Segment ID advances 0→1→2→… after every 256 blocks within
one event."""
path = _FIXTURE_DIR / "N844L20G.630H"
if not path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(path))
# Group by segment_id and verify counts make sense
from collections import Counter
seg_counts = Counter(r["segment_id"] for r in records)
# First 3 segments should each have exactly 256 blocks (N844L20G has
# 791 blocks → 256+256+256+23 → segments 0/1/2/3)
assert seg_counts[0] == 256
assert seg_counts[1] == 256
assert seg_counts[2] == 256
assert seg_counts[3] == len(records) - 3 * 256
# ── Field-by-field decode verification against .TXT ground truth ─────────────
@pytest.mark.parametrize("fixture", [
"N844L20G.630H",
"N844L6Z8.ZR0H",
"N844L6XE.BH0H",
"N844L23B.ND0H",
])
def test_decoded_geo_peaks_match_txt(fixture: str):
"""For every block, decoded Tran/Vert/Long peak (count × 0.005)
matches the corresponding .TXT cell."""
bin_path = _FIXTURE_DIR / fixture
txt_path = _FIXTURE_DIR / (fixture.replace(".", "_") + "_ASCII.TXT")
if not bin_path.exists() or not txt_path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(bin_path))
txt_rows = _parse_txt_rows(txt_path)
n = min(len(records), len(txt_rows))
assert n > 0
for i in range(n):
rec = records[i]
_ts, txt = txt_rows[i]
# TXT cols 0/2/4 are T/V/L peak in in/s
for slot, key in (("T", "t_peak"), ("V", "v_peak"), ("L", "l_peak")):
col = {"T": 0, "V": 2, "L": 4}[slot]
decoded_ips = geo_count_to_ins(rec[key])
expected = txt[col]
assert abs(decoded_ips - expected) < 0.0005, (
f"{fixture} block {i} {slot}_peak: "
f"decoded={decoded_ips:.4f} vs txt={expected:.4f}"
)
@pytest.mark.parametrize("fixture", [
"N844L6Z8.ZR0H",
"N844L6XE.BH0H",
])
def test_decoded_geo_freqs_match_txt(fixture: str):
"""Decoded half-period → Hz matches the .TXT freq column for blocks
where the freq is in-range (not the `>100 Hz` sentinel)."""
bin_path = _FIXTURE_DIR / fixture
txt_path = _FIXTURE_DIR / (fixture.replace(".", "_") + "_ASCII.TXT")
if not bin_path.exists() or not txt_path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(bin_path))
txt_rows = _parse_txt_rows(txt_path)
n = min(len(records), len(txt_rows))
for i in range(n):
rec = records[i]
_ts, txt = txt_rows[i]
for slot, key, col in (("T", "t_halfp", 1), ("V", "v_halfp", 3), ("L", "l_halfp", 5)):
decoded_hz = half_period_to_hz(rec[key])
expected = txt[col]
if expected is None:
# TXT shows `>100 Hz`: codec should yield None (or a value above 100 Hz)
assert decoded_hz is None or decoded_hz > 100, (
f"{fixture} block {i} {slot}_freq: codec says "
f"{decoded_hz} but TXT says >100"
)
continue
# TXT rounds; allow ±1 Hz
assert decoded_hz is not None
assert abs(decoded_hz - expected) < 1.0, (
f"{fixture} block {i} {slot}_freq: "
f"decoded={decoded_hz:.2f} Hz vs txt={expected:.2f} Hz"
)
@pytest.mark.parametrize("fixture", [
"N844L6XE.BH0H",
"N844L23B.ND0H",
"N844L6Z8.ZR0H",
])
def test_decoded_mic_db_matches_txt(fixture: str):
"""Decoded MicL peak count → dB(L) via mic_count_to_db matches
the .TXT dB(L) column."""
bin_path = _FIXTURE_DIR / fixture
txt_path = _FIXTURE_DIR / (fixture.replace(".", "_") + "_ASCII.TXT")
if not bin_path.exists() or not txt_path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(bin_path))
txt_rows = _parse_txt_rows(txt_path)
n = min(len(records), len(txt_rows))
for i in range(n):
rec = records[i]
_ts, txt = txt_rows[i]
# TXT col 8 = MicL dB(L)
decoded_db = mic_count_to_db(rec["m_peak"])
expected = txt[8]
if expected is None:
continue
# BW rounds to 1 decimal place for display. Tolerance 0.1 dB
# absorbs both rounding modes (truncate vs round-half-even).
assert abs(decoded_db - expected) < 0.1, (
f"{fixture} block {i} M_dB: "
f"decoded={decoded_db:.2f} dB vs txt={expected:.2f} dB"
)
@pytest.mark.parametrize("fixture", [
"N844L20G.630H",
"N844L6Z8.ZR0H",
])
def test_decoded_mic_freq_matches_txt(fixture: str):
"""Decoded MicL half-period → freq matches the .TXT col 9 freq."""
bin_path = _FIXTURE_DIR / fixture
txt_path = _FIXTURE_DIR / (fixture.replace(".", "_") + "_ASCII.TXT")
if not bin_path.exists() or not txt_path.exists():
pytest.skip("fixture missing")
records = walk_body(_extract_body(bin_path))
txt_rows = _parse_txt_rows(txt_path)
n = min(len(records), len(txt_rows))
for i in range(n):
rec = records[i]
_ts, txt = txt_rows[i]
decoded_hz = half_period_to_hz(rec["m_halfp"])
expected = txt[9]
if expected is None:
assert decoded_hz is None or decoded_hz > 100
continue
assert decoded_hz is not None
assert abs(decoded_hz - expected) < 1.0, (
f"{fixture} block {i} M_freq: "
f"decoded={decoded_hz:.2f} Hz vs txt={expected:.2f} Hz"
)
# ── Public API ───────────────────────────────────────────────────────────────
def test_decode_histogram_body_returns_four_channels():
"""The public API returns the standard 4-channel dict shape."""
path = _FIXTURE_DIR / "N844L20G.630H"
if not path.exists():
pytest.skip("fixture missing")
decoded = decode_histogram_body(_extract_body(path))
assert decoded is not None
assert set(decoded.keys()) == {"Tran", "Vert", "Long", "MicL"}
# All channels same length (one value per histogram interval)
n = len(decoded["Tran"])
assert all(len(decoded[ch]) == n for ch in ("Vert", "Long", "MicL"))
assert n > 100
def test_decode_histogram_body_returns_none_for_non_histogram():
"""A waveform-mode body (starts with 00 02 00) doesn't decode as
a histogram body."""
fake_waveform_body = b"\x00\x02\x00" + b"\x00" * 100
assert decode_histogram_body(fake_waveform_body) is None
def test_decode_histogram_body_returns_none_for_garbage():
"""Bytes that don't form valid blocks return None."""
assert decode_histogram_body(b"\xff" * 256) is None
def test_decode_histogram_body_full_preserves_frequency_data():
"""The structured-record API preserves the per-channel half-period
fields that the flat-channel API drops."""
path = _FIXTURE_DIR / "N844L20G.630H"
if not path.exists():
pytest.skip("fixture missing")
records = decode_histogram_body_full(_extract_body(path))
assert records is not None
r0 = records[0]
expected_fields = {
"segment_id", "block_ctr",
"t_peak", "t_halfp", "v_peak", "v_halfp",
"l_peak", "l_halfp", "m_peak", "m_halfp",
"meta_var",
}
assert set(r0.keys()) >= expected_fields
# ── Helpers ──────────────────────────────────────────────────────────────────
def test_half_period_to_hz_sentinel():
"""Half-period ≤ 5 returns None (the `>100 Hz` sentinel)."""
assert half_period_to_hz(5) is None
assert half_period_to_hz(1) is None
# halfp=6 gives 512/6 = 85.3 Hz — below the >100 threshold
assert half_period_to_hz(6) == pytest.approx(85.33, abs=0.01)
def test_geo_count_to_ins_scale():
"""1 count = 0.005 in/s at Normal range."""
assert geo_count_to_ins(1) == pytest.approx(0.005)
assert geo_count_to_ins(10) == pytest.approx(0.050)
assert geo_count_to_ins(0) == 0.0
# ── Regression: peak is uint8 byte[N], NOT uint16 LE byte[N:N+2] ────────────
#
# Block taken verbatim from K558LKZU.RE0H (BE9558) interval 12 — a real
# field event where the Tran channel had developed a DC offset and was
# producing sub-Hz drift content the device couldn't characterize.
# The annotation byte at [7] = 0xd2 is non-zero in that case. The
# legacy codec read [6:8] as uint16 LE, producing T_peak = 53763 →
# 268 in/s, physically impossible and roughly 18,000× too high for the actual
# 0.015 in/s value (T_lo = 3 alone gives the correct count).
# Verified against the paired BW ASCII export.
_K558_INTERVAL_12_BLOCK = bytes.fromhex(
"00 00 0c 01 0a 00 03 d2 45 00 02 00 02 00 02 00"
"02 00 10 00 06 00 00 00 0e 91 2f 00 1e 0a 00 00".replace(" ", "")
)
def test_extension_byte_does_not_inflate_peak():
"""The annotation byte at [7]/[11]/[15]/[19] must NOT contribute to
the peak count. Decoded T_peak must be 3 (uint8 byte[6]), NOT
53763 (uint16 LE byte[6:8])."""
body = _K558_INTERVAL_12_BLOCK
records = decode_histogram_body_full(body)
assert records is not None
assert len(records) == 1
r = records[0]
assert r["t_peak"] == 3, f"T_peak should be 3 (uint8), got {r['t_peak']}"
assert r["v_peak"] == 2
assert r["l_peak"] == 2
assert r["m_peak"] == 16
# Half-periods unchanged — still uint16 LE.
assert r["t_halfp"] == 0x0045 # 69 → 7.4 Hz
assert r["m_halfp"] == 6 # → 85.3 Hz
# Annotation byte is preserved (for future RE) but does not affect peak.
assert r["annotations"] == (0xd2, 0x00, 0x00, 0x00)
def test_extension_byte_decoded_to_correct_in_s():
"""End-to-end: the channel-grouped output for the K558 ext block
should give T = 3 counts = 0.015 in/s, not 53763 counts = 268 in/s."""
channels = decode_histogram_body(_K558_INTERVAL_12_BLOCK)
assert channels is not None
assert channels["Tran"] == [3]
assert geo_count_to_ins(channels["Tran"][0]) == pytest.approx(0.015)
assert channels["Vert"] == [2]
assert channels["Long"] == [2]
assert channels["MicL"] == [16]
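The difference between the two reads of the K558 block can be reproduced with `struct` alone. This is a sketch, not the `histogram_codec` implementation: the per-channel layout (uint8 peak, uint8 annotation, uint16 LE half-period, starting at byte 6) is inferred from the assertions above, the 512 numerator is assumed from the "512/6 = 85.3 Hz" comment, and `read_channel` is a hypothetical name.

```python
import struct
from typing import Optional, Tuple

GEO_IPS_PER_COUNT = 0.005  # 1 count = 0.005 in/s at Normal range
HALFP_NUMERATOR = 512.0    # assumption, taken from the test comment

# The 32-byte interval-12 block quoted in the test above.
block = bytes.fromhex(
    "00000c010a0003d24500020002000200"
    "02001000060000000e912f001e0a0000"
)


def read_channel(data: bytes, offset: int) -> Tuple[int, int, int]:
    """Hypothetical reader: (peak, annotation, half_period) for one channel."""
    return struct.unpack_from("<BBH", data, offset)


def half_period_to_hz(half_period: int) -> Optional[float]:
    # Half-periods of 5 or less stand for the ">100 Hz" sentinel.
    if half_period <= 5:
        return None
    return HALFP_NUMERATOR / half_period


t_peak, t_annotation, t_halfp = read_channel(block, 6)
# The legacy read folds the annotation byte into the peak count.
legacy_t_peak = struct.unpack_from("<H", block, 6)[0]
t_peak_ips = t_peak * GEO_IPS_PER_COUNT
```

Reading the peak as a single byte keeps the annotation available as its own field, which is what lets the test assert on it separately.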