v0.20.0 -- Full s3 event parse and PDF creation. #28
+43
-4
@@ -6,7 +6,46 @@ All notable changes to seismo-relay are documented here.
|
||||
|
||||
## [Unreleased]
|
||||
|
||||
### Fixed
|
||||
---
|
||||
|
||||
## v0.20.0 — 2026-05-28
|
||||
|
||||
The "PDF + parser polish" release. Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events. Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC. Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.
|
||||
|
||||
### Added (2026-05-28)
|
||||
|
||||
- **Server-wide display timezone via `TZ` env var.** Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod). Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering. DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one. Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`). Override per-deployment via the `TZ` line in `docker-compose.yml`.
|
||||
- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.** BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72). Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`. Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag. Flag rides through the sidecar's `bw_report` block. Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table. Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
|
||||
- **Per-channel ZC Freq surfaced in event modals.** Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq. Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table. Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`). Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
|
||||
- **`scripts/backfill_sidecars.py --reparse-txt` flag.** Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events. The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar. Implies sidecar regeneration on every event (bypasses the sha/version skip). No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27 .TXT-preservation rollout). Idempotent. Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing. Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
|
||||
|
||||
### Fixed (2026-05-28)
|
||||
|
||||
- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.** The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (it was inlined into `gather_report_data` without the seconds-numeric counterpart making it onto the dataclass). Every histogram PDF render raised `AttributeError → 500`. Waveform PDFs were unaffected. Fix: add the field, read it from the projection's existing `bw_report.histogram.interval_size_s` key.
|
||||
- **Histogram PDF geo channels now share a single nice-quantized y-axis.** Previously each geo subplot auto-scaled independently — Tran, Vert, and Long all showed different per-channel maxes, so bar heights weren't directly comparable across channels. The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005. Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label. MicL stays on its own auto-scale (different units). Matches BW's chart styling.
|
||||
|
||||
### Docs (2026-05-28)
|
||||
|
||||
- **Roadmap entry for a second undecoded histogram body sub-format.** BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks. Different from the existing `byte[5] != 0` roadmap entry (T190 / O121). Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty. Sample events captured in the roadmap entry for future RE work.
|
||||
|
||||
### Migration / Operations
|
||||
|
||||
- **Re-parse existing events to pick up the new parser fields.** Run on whichever box hosts the live waveform store:
|
||||
```bash
|
||||
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
|
||||
--reparse-txt --skip-hdf5 --dry-run -v | tail
|
||||
# Looks reasonable? Run for real:
|
||||
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
|
||||
--reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
|
||||
```
|
||||
Idempotent; safe to re-run. Only touches sidecars on disk — no DB writes.
|
||||
- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks. Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.
|
||||
|
||||
### Pre-v0.20.0 work (rolled into this release)
|
||||
|
||||
The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.
|
||||
|
||||
#### Fixed
|
||||
|
||||
- **bw_ascii_report parser now handles `OORANGE` saturation marker.** BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale. Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events. Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE). New behavior:
|
||||
- Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
|
||||
@@ -16,7 +55,7 @@ All notable changes to seismo-relay are documented here.
|
||||
- Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
|
||||
- **bw_ascii_report parser handles `Peak Vector Sum TimeSum` typo'd label.** Real BW output uses this misspelled label (Sum appended twice instead of "Peak Vector Sum Time"). Now accepted as an alias. Confirmed against all three OORANGE example files — every one has the typo.
|
||||
|
||||
### Added
|
||||
#### Added
|
||||
|
||||
- **Histogram per-interval aggregation in `waveform.json`.** Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output). When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array. `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings). Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present. Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
|
||||
- **`bw_ascii_report` parser now captures histogram-specific fields.** Previously the parser dropped these fields silently (Roadmap item closed):
|
||||
@@ -43,13 +82,13 @@ All notable changes to seismo-relay are documented here.
|
||||
- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`. Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file. BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
|
||||
- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars. Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED. Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
|
||||
|
||||
### Fixed
|
||||
#### Fixed
|
||||
|
||||
- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.** Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict). Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
|
||||
- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.** The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts). Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; bit on prod 2026-05-22, rolled back the same day.
|
||||
- **Thor IDF files no longer attempted as BW events in backfill.** `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
|
||||
|
||||
### Docs
|
||||
#### Docs
|
||||
|
||||
- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes. Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
|
||||
- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
|
||||
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.20.0**.
|
||||
|
||||
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# seismo-relay `v0.19.0`
|
||||
# seismo-relay `v0.20.0`
|
||||
|
||||
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
|
||||
software for managing seismographs. Supports both the **MiniMate Plus
|
||||
@@ -35,6 +35,16 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||
> and storage layer dispatch deterministically instead of sniffing
|
||||
> filenames. Self-applying migration backfills existing rows from the
|
||||
> binary filename extension.
|
||||
> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
|
||||
> started in v0.17.x: histogram layouts render correctly against BW
|
||||
> reference PDFs, the ASCII parser handles real-world edge cases
|
||||
> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
|
||||
> Freq is surfaced in both modals (event browser + main webapp).
|
||||
> Adds a server-wide `TZ` env var so operator-visible timestamps
|
||||
> render in local time instead of UTC. New
|
||||
> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
|
||||
> applied retroactively to existing events without re-forwarding,
|
||||
> using the `.TXT` files preserved at ingest time.
|
||||
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
|
||||
|
||||
---
|
||||
@@ -536,10 +546,10 @@ Implementation steps (concrete):
|
||||
|
||||
### BW ASCII report parser enhancements (built in v0.16.0)
|
||||
|
||||
- [ ] **PPV field misses on certain TXT formats.** Discovered 2026-05-22 during the histogram-codec backfill validation: a handful of events (5 in prod) have a `bw_report` block where `peaks.{tran,vert,long}.ppv_ips` and `peaks.vector_sum.ips` are all `None`, despite the parser correctly extracting every OTHER field for the same channels (zc_freq_hz, time_of_peak_s, peak_accel_g, peak_disp_in). Symptom on the DB side: `peak_vector_sum=0` after a `--force` backfill that overlays from the parsed bw_report dict. Affected events on prod include `T190LD5Q.LK0W`, `T438L713.RY0W`, `K557L3YM.OE0W`. Root cause likely a regex or format mismatch for the "PPV" header line in those specific firmware/event-type outputs. Once fixed, re-forwarding the events from series3-watcher will re-populate the `bw_report` blocks correctly.
|
||||
- [ ] **Histogram-specific structural fields.** Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds).
|
||||
- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value. Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag. All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
|
||||
- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now. Land in the sidecar's `bw_report.histogram` block.
|
||||
- [ ] **Histogram interval bin-table parsing.** Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed. Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
|
||||
- [ ] **`>100 Hz` value parsing.** Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information).
|
||||
- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag. PDF + both modals render `>100 Hz` instead of `—`.
|
||||
|
||||
### Ingestion gaps
|
||||
|
||||
|
||||
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "seismo-relay"
|
||||
version = "0.19.0"
|
||||
version = "0.20.0"
|
||||
description = "Python client and REST server for MiniMate Plus seismographs"
|
||||
requires-python = ">=3.10"
|
||||
dependencies = [
|
||||
|
||||
Reference in New Issue
Block a user