Compare commits
15 Commits
f6abe3caa0...dev
| Author | SHA1 | Date |
|---|---|---|
| | 25386cab8b | |
| | 6cb619ecc4 | |
| | 1ed86244d0 | |
| | b2c565f217 | |
| | 43f440812a | |
| | 23e83908c2 | |
| | bee118506b | |
| | defd17d9c2 | |
| | e42956a20b | |
| | 9fd52ddabb | |
| | 9b71ead44b | |
| | 1bccc44b88 | |
| | a3cc44d30a | |
| | 6a73523e4d | |
| | 780b45a371 | |
+133
-3
@@ -6,8 +6,138 @@ All notable changes to seismo-relay are documented here.
|
||||
|
||||
## [Unreleased]
|
||||
|
||||
---
|
||||
|
||||
## v0.21.1 — 2026-06-01
|
||||
|
||||
Bug fixes against v0.21.0 surfaced after the first prod redeploy. Three
|
||||
production-visible symptoms — blank waveform charts on most Thor events,
|
||||
blank histogram charts on all Thor events, and a mic chart that
|
||||
auto-scaled against a dB(L) value treated as psi — all root-caused and
|
||||
fixed.
|
||||
|
||||
### Fixed
|
||||
|
||||
- **Dynamic IDFW body offset.** The v0.21.0 codec hardcoded the body
|
||||
at file offset `0x0f1f` based on the example corpus, but only ~52%
|
||||
of production IDFW events use that offset; the rest sit at offsets
|
||||
from `0x1033` up to `0x3082` depending on header padding. At
|
||||
`0x0f1f` the codec would find a coincidentally-matching `00 02 00`
|
||||
magic, read the 2-byte Tran preamble, and return empty V/L/M
|
||||
arrays — producing near-empty .h5 files and blank charts.
|
||||
`micromate.idf_file._find_waveform_body_offset()` now scans every
|
||||
`00 02 00` magic position past `0x0E00`, trial-decodes each one,
|
||||
and picks the offset with the most samples. Validated across 483
|
||||
prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully
|
||||
decode, 126/483 partial (BW codec walker-stops-early on loud
|
||||
events — pre-existing limitation, samples reached are correct).
|
||||
|
||||
- **IDFH histograms now render bar charts.** Histograms previously
|
||||
skipped the .h5 write because there are no per-sample arrays, but
|
||||
the renderer drives the per-interval bar chart from .h5 channel
|
||||
data + `bw_report.histogram.n_intervals`. `save_imported_idf` now
|
||||
synthesizes a 1-sample-per-interval array from the decoded
|
||||
`IdfhInterval` peak counts and writes an .h5 so the existing
|
||||
renderer works unchanged — each "sample" is the per-interval peak
|
||||
ADC count, so the writer's `count × geo_fs/32768` conversion
|
||||
yields the right bar height.
|
||||
|
||||
- **Mic chart scaling on Thor events.** `PeakValues.micl` (consumed
|
||||
by the h5 writer's per-count mic scale factor) expects psi, but
|
||||
the Thor bridge was stuffing the dB(L) value (~99.4) into it,
|
||||
producing a per-count factor 5+ orders of magnitude too large and
|
||||
a flat-looking mic chart. Fixed by adding `IdfPeaks.mic_pspl_psi`
|
||||
alongside `mic_pspl_dbl`; `read_idf_file()` computes it from
|
||||
binary mic counts (`max(|MicL|) × 2.14e-6 psi/count`) for both
|
||||
IDFW and IDFH paths; `save_imported_idf` merges it onto the typed
|
||||
event after `IdfEvent.from_report`; the bridge feeds psi to
|
||||
`PeakValues.micl` with a dB(L)→psi formula fallback when only the
|
||||
dB(L) value is available. dB(L) for the report header still
|
||||
flows through `bw_report.mic.pspl_dbl` unchanged.
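
The offset scan from the first Fixed entry above can be sketched roughly as follows. This is an illustration only, not the code in `micromate/idf_file.py`: `find_body_offset` and the `trial_decode` callback are hypothetical names, the callback is assumed to return the number of samples it decoded from a given offset, and whether `0x0E00` itself counts as a valid position is also an assumption.

```python
from typing import Callable, Optional

MAGIC = b"\x00\x02\x00"  # body magic named in the changelog entry
SCAN_START = 0x0E00      # earlier positions are never considered (inclusive bound is assumed)


def find_body_offset(
    data: bytes,
    trial_decode: Callable[[bytes, int], int],
) -> Optional[int]:
    """Return the candidate offset whose trial decode yields the most samples.

    `trial_decode(data, offset)` is a hypothetical stand-in for the real
    waveform walker; it is assumed to return a sample count (0 on failure).
    """
    best_offset: Optional[int] = None
    best_count = 0
    pos = data.find(MAGIC, SCAN_START)
    while pos != -1:
        count = trial_decode(data, pos)
        if count > best_count:
            best_offset, best_count = pos, count
        # Advance by one byte so overlapping magic matches are not skipped.
        pos = data.find(MAGIC, pos + 1)
    return best_offset
```

Picking the candidate with the most samples, rather than the first match, is what keeps a coincidental `00 02 00` at `0x0f1f` from winning when the real body sits further into the file.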
|
||||
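
For the mic scaling entry above, the two psi paths might look like the sketch below. The `2.14e-6 psi/count` factor comes from the entry itself. The dB(L)→psi fallback formula is not spelled out in this changelog, so the version here assumes the conventional 20 µPa sound-pressure reference; both function names are hypothetical.

```python
PSI_PER_MIC_COUNT = 2.14e-6      # scale factor quoted in the changelog entry
PA_PER_PSI = 6894.757293168      # pascals per psi (standard unit conversion)
P_REF_PA = 20e-6                 # ASSUMPTION: 20 µPa SPL reference pressure


def mic_psi_from_counts(mic_counts):
    """Peak mic pressure in psi from raw MicL ADC counts (None if empty)."""
    if not mic_counts:
        return None
    return max(abs(c) for c in mic_counts) * PSI_PER_MIC_COUNT


def mic_psi_from_dbl(dbl):
    """Fallback: convert a dB(L) level to psi under the assumed reference."""
    pascals = P_REF_PA * 10 ** (dbl / 20.0)
    return pascals / PA_PER_PSI
```

Under that assumption, 99.4 dB(L) works out to roughly 2.7e-4 psi, which is consistent with the entry's observation that feeding the raw dB(L) number in as psi overshoots by more than five orders of magnitude.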
|
||||
### Operator
|
||||
|
||||
After deploy, run `python scripts/backfill_thor_events.py` to refresh
|
||||
every existing Thor event's sidecar + .h5 with the corrected codec
|
||||
output. The script auto-skips events already at the current
|
||||
`TOOL_VERSION`, so the bump from `0.21.0` → `0.21.1` is what triggers
|
||||
the refresh.
|
||||
|
||||
---
|
||||
|
||||
## v0.21.0 — 2026-05-29
|
||||
|
||||
The "Thor / Series IV codec" release. Two big pieces landed: (1) the IDF binary codec actually decodes now, both IDFW and IDFH, and (2) a Thor→BW adapter lets Thor events flow through the existing Series III Event Report PDF pipeline. Combined effect: a Thor event ingested via `/db/import/idf_file` now lands in the DB with the same fidelity as a Blastware event, gets a per-event PDF on demand, and renders in Terra-View's modal chart with the same plotting code as a BW event.
|
||||
|
||||
### Added — Thor IDF binary codec (`micromate/idf_file.read_idf_file`)
|
||||
|
||||
- **IDFW (waveform)** — body sits at fixed file offset `0x0f1f`; reuses the verified `decode_waveform_v2()` walker from `minimateplus.waveform_codec`. Sample fidelity is **87–99% byte-exact** against the ASCII-sidecar reference values on quiet events; loud events hit the same walker-stops-early limitation as the BW codec on `SP0/SS0/SV0`-style events.
|
||||
- **IDFH (histogram)** — dedicated segment-based decoder for the Thor histogram body format: `[len_be][0a 00 00 00][00 NN][05 3f]` framing plus N × 72-byte interval records (4 × 16-byte per-channel min/max/halfp). **All 859 Thor IDFH corpus files decode**, totalling **181,071 intervals**; per-channel peaks match the sidecar within **~1.8% (ADC quantization)**.
|
||||
- **BW-aliased binary detection** — a small number of corpus files (e.g. `BE9439_*.IDFW/IDFH`) are actually Series III Blastware binaries that share the IDF filename convention by accident. `read_idf_file()` detects them via their BW `STRT` signature and raises `NotImplementedError` pointing the caller at `read_blastware_file()` instead of trying to decode them as IDF.
|
||||
- Full field layouts in `docs/idf_protocol_reference.md`; supporting analysis scripts in `analysis_idf/` (decode validators, per-file detail dumps, corpus accuracy reports).
|
||||
|
||||
### Added — Thor → BW report adapter (`micromate/idf_to_bw_report.py`)
|
||||
|
||||
- **`build_bw_report_from_idf(report_dict, binary_md=, intervals=, is_histogram=)`** projects a parsed Thor `IdfReport` plus binary-extracted metadata plus decoded IDFH intervals into the `bw_report`-shaped dict that `sfm.report_pdf.gather_report_data` consumes. No need to duplicate the renderer — Thor data is ~95% the same metric set as BW; the adapter handles the field-name mapping (`MicPSPL` → `pspl_dbl`, `>100` sentinel → `zc_freq_above_range`, free-form `Calibration : Nov 22, 2023 by Instantel` → `calibration_date` + `calibration_by`, etc.).
|
||||
- For IDFH events the adapter derives `histogram.interval_times` by stepping `IntervalSize` from `HistogramStartTime`, matching what the BW pipeline expects from a histogram-mode event.
|
||||
- **Wired into `WaveformStore.save_imported_idf`** — every Thor event ingested via `/db/import/idf_file` now gets a `bw_report` block in its sidecar in addition to the existing `extensions.idf_report` (the raw parsed Thor payload). Falls back gracefully (PDF renders from DB-only fields) if the adapter raises — logged as a warning rather than failing the ingest.
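
The `histogram.interval_times` derivation mentioned in the adapter list above amounts to stepping a fixed interval from the start time. A minimal sketch, assuming each label marks the start of its interval and that labels are `HH:MM:SS` strings; `interval_times` here is a hypothetical helper, not the adapter's actual function:

```python
from datetime import datetime, timedelta
from typing import List


def interval_times(start: datetime, interval_size_s: float, n_intervals: int) -> List[str]:
    """Return one HH:MM:SS label per interval, stepping from `start`."""
    step = timedelta(seconds=interval_size_s)
    return [(start + i * step).strftime("%H:%M:%S") for i in range(n_intervals)]
```

Using `datetime` arithmetic rather than manual second counting means a histogram that runs past midnight still gets correct labels.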
|
||||
|
||||
### Companion releases
|
||||
|
||||
- **Terra-View v0.13.0** ships in parallel — closes Phase 1 of the SFM integration. The shared event-detail modal now renders the SFM event story (Chart.js waveform/histogram chart, inline PDF preview, `.TXT` download, FT/reviewer/notes review form) without operators needing to bounce to the standalone SFM webapp on port 8200. Uses only existing seismo-relay endpoints — no API changes here, just better consumption.
|
||||
|
||||
### Migration / Operations
|
||||
|
||||
No DB migration needed. Existing Thor events already in the store don't automatically pick up the new `bw_report` block — they'd need a re-ingest (post the IDF binary + paired `.TXT` back to `/db/import/idf_file`) for the adapter to run. Alternatively, run `scripts/backfill_sidecars.py --reparse-txt` after a small adapter change (the script currently only re-runs the BW ASCII parser; extending it to handle Thor would be a small follow-up).
|
||||
|
||||
```bash
|
||||
cd /home/serversdown/terra-view
|
||||
docker compose build sfm && docker compose up -d sfm
|
||||
```
|
||||
|
||||
The bumped `TOOL_VERSION = "0.21.0"` in `minimateplus/event_file_io.py` means any subsequent `backfill_sidecars.py --force` pass will re-write sidecars with the new version stamp; that's expected and harmless.
|
||||
|
||||
---
|
||||
|
||||
## v0.20.0 — 2026-05-28
|
||||
|
||||
The "PDF + parser polish" release. Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events. Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC. Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.
|
||||
|
||||
### Added (2026-05-28)
|
||||
|
||||
- **Server-wide display timezone via `TZ` env var.** Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod). Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering. DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one. Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`). Override per-deployment via the `TZ` line in `docker-compose.yml`.
|
||||
- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.** BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72). Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`. Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag. Flag rides through the sidecar's `bw_report` block. Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table. Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
|
||||
- **Per-channel ZC Freq surfaced in event modals.** Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq. Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table. Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`). Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
|
||||
- **`scripts/backfill_sidecars.py --reparse-txt` flag.** Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events. The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar. Implies sidecar regeneration on every event (bypasses the sha/version skip). No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27 .TXT-preservation rollout). Idempotent. Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing. Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
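
The above-range handling for ZC Freq described in the list above can be illustrated with a small parser. This is a sketch only: `parse_zc_freq` is a hypothetical name and the regular expressions are assumptions about the field text, not the patterns used in `bw_ascii_report`.

```python
import re
from typing import Optional, Tuple

_ABOVE_RANGE = re.compile(r"^\s*>\s*(\d+(?:\.\d+)?)")
_NUMBER = re.compile(r"^\s*(-?\d+(?:\.\d+)?)")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for a ZC Freq field."""
    m = _ABOVE_RANGE.match(text)
    if m:
        # Keep the cut-off value and record that the true value is higher.
        return float(m.group(1)), True
    m = _NUMBER.match(text)
    if m:
        return float(m.group(1)), False
    return None, False
```

Storing the cut-off value alongside a flag keeps the column numeric for sorting and filtering while still letting renderers show `>100`.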
|
||||
|
||||
### Fixed (2026-05-28)
|
||||
|
||||
- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.** The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (it was inlined into `gather_report_data` without the seconds-numeric counterpart making it onto the dataclass). Every histogram PDF render raised `AttributeError → 500`. Waveform PDFs were unaffected. Fix: add the field, read it from the projection's existing `bw_report.histogram.interval_size_s` key.
|
||||
- **Histogram PDF geo channels now share a single nice-quantized y-axis.** Previously each geo subplot auto-scaled independently — Tran, Vert, and Long all showed different per-channel maxes, so bar heights weren't directly comparable across channels. The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005. Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label. MicL stays on its own auto-scale (different units). Matches BW's chart styling.
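
The shared geo y-axis quantization described above could be sketched as follows. The step values follow the sequence listed in the entry (0.005, 0.01, 0.025, 0.05, ...). The choice of five divisions is an assumption inferred from the old `max / 5` formula, and `geo_step` / `shared_geo_axis` are hypothetical names rather than the renderer's own.

```python
GEO_LSB = 0.005  # in/s, geophone LSB quoted in the entry


def geo_step(peak_ips: float, n_div: int = 5) -> float:
    """Smallest listed step (>= LSB) such that n_div divisions cover the peak."""
    step = GEO_LSB
    for k in range(-3, 4):
        for mantissa in (1.0, 2.5, 5.0):
            step = round(mantissa * 10.0 ** k, 6)
            if step < GEO_LSB:
                continue  # below the LSB, not a usable step
            if step * n_div >= peak_ips:
                return step
    return step  # peak exceeds the table; fall back to the largest step


def shared_geo_axis(tran, vert, long_, n_div: int = 5):
    """Return (step, y_max) shared by all three geo subplots."""
    peak = max(max(tran, default=0.0), max(vert, default=0.0), max(long_, default=0.0))
    step = geo_step(peak, n_div)
    return step, step * n_div
```

Computing one step from the maximum across Tran, Vert, and Long is what makes bar heights comparable between subplots, and the same step can be reused for the footer label.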
|
||||
|
||||
### Docs (2026-05-28)
|
||||
|
||||
- **Roadmap entry for a second undecoded histogram body sub-format.** BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks. Different from the existing `byte[5] != 0` roadmap entry (T190 / O121). Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty. Sample events captured in the roadmap entry for future RE work.
|
||||
|
||||
### Migration / Operations
|
||||
|
||||
- **Re-parse existing events to pick up the new parser fields.** Run on whichever box hosts the live waveform store:
|
||||
```bash
|
||||
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
|
||||
--reparse-txt --skip-hdf5 --dry-run -v | tail
|
||||
# Looks reasonable? Run for real:
|
||||
docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
|
||||
--reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
|
||||
```
|
||||
Idempotent; safe to re-run. Only touches sidecars on disk — no DB writes.
|
||||
- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks. Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.
|
||||
|
||||
### Pre-v0.20.0 work (rolled into this release)
|
||||
|
||||
The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.
|
||||
|
||||
#### Fixed
|
||||
|
||||
- **bw_ascii_report parser now handles `OORANGE` saturation marker.** BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale. Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events. Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE). New behavior:
|
||||
  - Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
|
||||
  - Peak Vector Sum: substitute `sqrt(3) * geo_range_ips` (the theoretical max when all 3 channels are simultaneously at full-scale) + `peak_vector_sum_saturated` flag
|
||||
@@ -16,7 +146,7 @@ All notable changes to seismo-relay are documented here.
|
||||
- Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
|
||||
- **bw_ascii_report parser handles `Peak Vector Sum TimeSum` typo'd label.** Real BW output uses this misspelled label (Sum appended twice instead of "Peak Vector Sum Time"). Now accepted as an alias. Confirmed against all three OORANGE example files — every one has the typo.
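
The `OORANGE` substitution rules from the Fixed list above can be sketched like this. The helper names are hypothetical and the field handling is simplified (first whitespace-separated token only); the substituted values follow the rules stated in the entry.

```python
import math
from typing import Optional, Tuple

OORANGE = "OORANGE"  # saturation marker written by BW


def _first_token(text: str) -> str:
    parts = text.split()
    return parts[0] if parts else ""


def parse_ppv(text: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Return (ppv_ips, ppv_saturated) for a per-channel PPV field."""
    token = _first_token(text)
    if token.upper() == OORANGE:
        # Full-scale range is a conservative lower bound on the true peak.
        return geo_range_ips, True
    try:
        return float(token), False
    except ValueError:
        return None, False


def parse_pvs(text: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Return (peak_vector_sum_ips, saturated) for the Peak Vector Sum field."""
    token = _first_token(text)
    if token.upper() == OORANGE:
        # All three channels at full scale at once gives sqrt(3) * range.
        return math.sqrt(3) * geo_range_ips, True
    try:
        return float(token), False
    except ValueError:
        return None, False
```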
|
||||
|
||||
### Added
|
||||
#### Added
|
||||
|
||||
- **Histogram per-interval aggregation in `waveform.json`.** Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output). When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array. `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings). Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present. Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
|
||||
- **`bw_ascii_report` parser now captures histogram-specific fields.** Previously the parser dropped these fields silently (Roadmap item closed):
|
||||
@@ -43,13 +173,13 @@ All notable changes to seismo-relay are documented here.
|
||||
- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`. Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file. BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
|
||||
- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars. Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED. Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
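
The max-per-group histogram aggregation described in the first entry of this list can be illustrated as below. How the endpoint actually partitions codec samples into intervals is not stated here, so this sketch assumes contiguous, near-equal-sized groups; `aggregate_max` is a hypothetical name.

```python
from typing import List, Sequence


def aggregate_max(samples: Sequence[float], n_intervals: int) -> List[float]:
    """Collapse raw codec samples into one max value per interval."""
    n = len(samples)
    if n == 0 or n_intervals <= 0:
        # Defensive no-op: nothing to aggregate, return the raw data.
        return list(samples)
    out: List[float] = []
    for i in range(n_intervals):
        lo = i * n // n_intervals
        hi = (i + 1) * n // n_intervals
        group = samples[lo:hi]
        # An empty group can occur when there are more intervals than samples.
        out.append(max(group) if len(group) else 0)
    return out
```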
|
||||
|
||||
### Fixed
|
||||
#### Fixed
|
||||
|
||||
- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.** Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict). Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
|
||||
- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.** The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts). Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill. This bit us on prod on 2026-05-22 and was rolled back the same day.
|
||||
- **Thor IDF files no longer attempted as BW events in backfill.** `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
|
||||
|
||||
### Docs
|
||||
#### Docs
|
||||
|
||||
- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes. Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
|
||||
- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
|
||||
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.21.0**.
|
||||
|
||||
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
|
||||
|
||||
@@ -73,6 +73,28 @@ should not import from `sfm/`, must not touch a DB, and have no I/O
|
||||
beyond reading files passed as arguments. Keep them pure — both
|
||||
tiers can then depend on them without circularity.
|
||||
|
||||
#### Thor IDF binary codec (2026-05-28)
|
||||
|
||||
`micromate/idf_file.read_idf_file()` decodes both Thor IDFW
|
||||
(waveform) and IDFH (histogram) binaries.
|
||||
|
||||
- **IDFW** reuses `decode_waveform_v2()` on the body at fixed file
|
||||
offset `0x0f1f`. Sample fidelity is 87–99% byte-exact on quiet
|
||||
events; loud events hit the BW codec's known walker-stops-early
|
||||
limitation.
|
||||
- **IDFH** has its own segment-based decoder: `[len_be][0a 00 00 00]
|
||||
[00 NN][05 3f]` + N × 72-byte interval records (4 × 16-byte
|
||||
per-channel min/max/halfp). All 859 Thor IDFH corpus files
|
||||
decode (181,071 intervals); peak matches sidecar within ~1.8%
|
||||
(ADC quantization).
|
||||
|
||||
The two outlier `BE9439_*` files in the Thor example corpus are
|
||||
actually Series III Blastware binaries that share the `.IDFW`/`.IDFH`
|
||||
filename convention by accident. `read_idf_file()` detects them by
|
||||
their BW STRT signature and raises NotImplementedError pointing
|
||||
callers at `read_blastware_file()`. See
|
||||
`docs/idf_protocol_reference.md` for full field layouts.
|
||||
|
||||
### Practical consequences
|
||||
|
||||
When deciding where new code goes, ask:
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# seismo-relay `v0.19.0`
|
||||
# seismo-relay `v0.21.0`
|
||||
|
||||
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
|
||||
software for managing seismographs. Supports both the **MiniMate Plus
|
||||
@@ -35,6 +35,25 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||
> and storage layer dispatch deterministically instead of sniffing
|
||||
> filenames. Self-applying migration backfills existing rows from the
|
||||
> binary filename extension.
|
||||
> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
|
||||
> started in v0.17.x: histogram layouts render correctly against BW
|
||||
> reference PDFs, the ASCII parser handles real-world edge cases
|
||||
> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
|
||||
> Freq is surfaced in both modals (event browser + main webapp).
|
||||
> Adds a server-wide `TZ` env var so operator-visible timestamps
|
||||
> render in local time instead of UTC. New
|
||||
> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
|
||||
> applied retroactively to existing events without re-forwarding,
|
||||
> using the `.TXT` files preserved at ingest time.
|
||||
> **v0.21.0 (2026-05-29)** is the Thor / Series IV decoder release —
|
||||
> `micromate/idf_file.read_idf_file()` now decodes both IDFW
|
||||
> (waveform) and IDFH (histogram) binaries (87–99% sample fidelity
|
||||
> on quiet IDFW events; all 859 IDFH corpus files decode cleanly).
|
||||
> A new `micromate/idf_to_bw_report.py` adapter projects parsed
|
||||
> Thor reports into the BW-shaped sidecar block, so Thor events
|
||||
> flow through the existing Event Report PDF pipeline without a
|
||||
> separate renderer. Terra-View v0.13.0 ships in parallel and
|
||||
> closes Phase 1 of the SFM integration — see its CHANGELOG.
|
||||
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
|
||||
|
||||
---
|
||||
@@ -58,7 +77,8 @@ seismo-relay/
|
||||
├── micromate/ ← Series IV (Micromate / Thor) client library (NEW v0.19)
|
||||
│ ├── models.py ← IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
|
||||
│ ├── idf_ascii_report.py ← Parse Thor .IDFW.txt / .IDFH.txt event sidecars
|
||||
│ └── idf_file.py ← Stub for the .IDFW / .IDFH binary codec (reverse-engineering pending)
|
||||
│ ├── idf_file.py ← Binary codec for .IDFW + .IDFH (v0.21.0+)
|
||||
│ └── idf_to_bw_report.py ← Adapter projecting Thor IDF into the BW report shape (v0.21.0+)
|
||||
│
|
||||
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
|
||||
│ ├── server.py ← Live device endpoints + DB query + ingest endpoints + caching
|
||||
@@ -415,7 +435,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
|
||||
- [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
|
||||
- [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
|
||||
- [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
|
||||
- [ ] Binary `.IDFW` / `.IDFH` codec — pending (see Roadmap + [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md))
|
||||
- [x] Binary `.IDFW` / `.IDFH` codec — ✅ v0.21.0. IDFW reuses `decode_waveform_v2()` on the body at offset `0x0f1f` (87–99% sample fidelity on quiet events); IDFH has a dedicated segment-based decoder (all 859 corpus files decode, 181,071 intervals total). See `micromate/idf_file.py` + `docs/idf_protocol_reference.md`.
|
||||
- [ ] Live-device protocol — pending codec
|
||||
|
||||
**Data persistence:**
|
||||
@@ -528,7 +548,7 @@ Implementation steps (concrete):
|
||||
### High-impact (unblocks product features)
|
||||
|
||||
- [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
|
||||
- [ ] **Series IV (Thor IDF) binary codec reverse-engineering.** `.IDFH` / `.IDFW` files are currently stored opaquely by `WaveformStore.save_imported_idf`, with all metadata sourced from the paired `.txt` sidecar. This works because thor-watcher forwards both files together, but operators who haven't enabled Thor's TXT exporter get rows with NULL peaks. Cracking the binary closes that gap and unlocks waveform display. Starting-point reference at [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) — two observed file signatures (1,012 newer-firmware files + 2 old files whose layout matches the Series III STRT-record format), suggested first-session plan (~2-4 hrs), 1,014 paired binary+txt files available as ground truth in `thor-watcher/example-data/`. Code seam ready at `micromate/idf_file.py`.
|
||||
- [x] **Series IV (Thor IDF) binary codec reverse-engineering.** ✅ v0.21.0 — `micromate/idf_file.read_idf_file()` decodes both IDFW (waveform body at offset `0x0f1f`, reusing `decode_waveform_v2()`; 87–99% sample fidelity on quiet events) and IDFH (dedicated segment-based decoder: all 859 corpus files decode, 181,071 intervals, peaks within ~1.8% of sidecar values). `WaveformStore.save_imported_idf` now also projects parsed Thor data into a `bw_report` block via `micromate/idf_to_bw_report.py` so Thor events render in the existing Event Report PDF pipeline without a separate renderer.
|
||||
- [ ] **In-app waveform viewer accuracy.** Depends on Series III codec decode. Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples. Series IV waveforms come online when the IDF codec lands.
|
||||
- [ ] **Series IV live-device support.** Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
|
||||
- [ ] **Terra-view integration** — seismo-relay router, unit detail page, VISON-style event listing.
|
||||
@@ -536,10 +556,10 @@ Implementation steps (concrete):
|
||||
|
||||
### BW ASCII report parser enhancements (built in v0.16.0)
|
||||
|
||||
- [ ] **PPV field misses on certain TXT formats.** Discovered 2026-05-22 during the histogram-codec backfill validation: a handful of events (5 in prod) have a `bw_report` block where `peaks.{tran,vert,long}.ppv_ips` and `peaks.vector_sum.ips` are all `None`, despite the parser correctly extracting every OTHER field for the same channels (zc_freq_hz, time_of_peak_s, peak_accel_g, peak_disp_in). Symptom on the DB side: `peak_vector_sum=0` after a `--force` backfill that overlays from the parsed bw_report dict. Affected events on prod include `T190LD5Q.LK0W`, `T438L713.RY0W`, `K557L3YM.OE0W`. Root cause likely a regex or format mismatch for the "PPV" header line in those specific firmware/event-type outputs. Once fixed, re-forwarding the events from series3-watcher will re-populate the `bw_report` blocks correctly.
|
||||
- [ ] **Histogram-specific structural fields.** Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds).
|
||||
- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value. Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag. All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
|
||||
- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now. Land in the sidecar's `bw_report.histogram` block.
|
||||
- [ ] **Histogram interval bin-table parsing.** Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed. Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
|
||||
- [ ] **`>100 Hz` value parsing.** Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information).
|
||||
- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag. PDF + both modals render `>100 Hz` instead of `—`.
|
||||
|
||||
### Ingestion gaps
|
||||
|
||||
|
||||
@@ -0,0 +1,65 @@
|
||||
"""Run read_idf_file across the corpus and report per-channel accuracy vs sidecars."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from micromate.idf_file import read_idf_file
|
||||
from analysis_idf.recon import load_sidecar_samples
|
||||
|
||||
|
||||
def sidecar_path(idfw: Path) -> Path:
|
||||
return idfw.parent / "TXT" / f"{idfw.name}.txt"
|
||||
|
||||
|
||||
def main():
|
||||
root = REPO / "tests/fixtures/THORDATA_example"
|
||||
files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
|
||||
files.sort()
|
||||
GEO_LSB = 0.0003
|
||||
|
||||
n_ok = n_skip = 0
|
||||
overall = {"Tran": [], "Vert": [], "Long": []}
|
||||
|
||||
for f in files:
|
||||
try:
|
||||
res = read_idf_file(f)
|
||||
except Exception:
|
||||
n_skip += 1
|
||||
continue
|
||||
sc_path = sidecar_path(f)
|
||||
if not sc_path.exists():
|
||||
n_skip += 1
|
||||
continue
|
||||
try:
|
||||
sc = load_sidecar_samples(sc_path)
|
||||
except Exception:
|
||||
n_skip += 1
|
||||
continue
|
||||
|
||||
per_file = {}
|
||||
for ch in ("Tran", "Vert", "Long"):
|
||||
sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
|
||||
dec = res.samples.get(ch, [])
|
||||
n = min(len(sc_counts), len(dec))
|
||||
if n == 0:
|
||||
per_file[ch] = 0.0
|
||||
continue
|
||||
exact = sum(1 for i in range(n) if sc_counts[i] == dec[i])
|
||||
pct = 100.0 * exact / n
|
||||
per_file[ch] = pct
|
||||
overall[ch].append(pct)
|
||||
n_ok += 1
|
||||
|
||||
print(f"Processed {n_ok} files (skipped {n_skip})")
|
||||
print("Per-channel exact-match % (mean / min / max):")
|
||||
for ch, vals in overall.items():
|
||||
if vals:
|
||||
avg = sum(vals) / len(vals)
|
||||
print(f" {ch}: mean={avg:.2f}% min={min(vals):.2f}% max={max(vals):.2f}% n={len(vals)}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
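
The per-channel comparison in the script above quantizes the sidecar's in/s values to integer counts and compares them to the decoded counts. A minimal standalone sketch of that step follows; `exact_match_pct` is a hypothetical name, and the default `lsb` of 0.0003 is taken from the script's `GEO_LSB` constant.

```python
def exact_match_pct(sidecar_ips, decoded_counts, lsb=0.0003):
    """Percentage of overlapping samples whose quantized values match exactly."""
    # Quantize sidecar floats (in/s) to integer ADC counts.
    sc_counts = [int(round(v / lsb)) for v in sidecar_ips]
    n = min(len(sc_counts), len(decoded_counts))
    if n == 0:
        return 0.0
    # zip() stops at the shorter sequence, so only the overlap is compared.
    exact = sum(1 for a, b in zip(sc_counts, decoded_counts) if a == b)
    return 100.0 * exact / n
```

Only the overlapping prefix is scored, so a decoder that stops early can still report 100% for the samples it did reach; the sample counts need to be checked separately.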
|
||||
@@ -0,0 +1,49 @@
|
||||
"""Find where decoded-vs-sidecar diverges for each channel."""
from __future__ import annotations

import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from minimateplus.waveform_codec import decode_waveform_v2
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    decoded = decode_waveform_v2(buf[0x0f1f:])
    GEO_LSB = 0.0003

    for ch in ("Tran", "Vert", "Long"):
        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
        dec = decoded[ch]
        # Compare only the overlapping prefix so a length mismatch
        # between decode and sidecar cannot raise IndexError.
        n = min(len(dec), len(sc_counts))
        # Find the first index where the decode disagrees with the sidecar.
        first_diff = next((i for i in range(n) if dec[i] != sc_counts[i]), None)
        if first_diff is None:
            print(f"{ch}: NO MISMATCHES")
            continue
        print(f"{ch}: first diff at idx {first_diff}")
        # Show 3 samples before and 7 after the first mismatch.
        for i in range(max(0, first_diff - 3), min(n, first_diff + 8)):
            mark = " " if dec[i] == sc_counts[i] else "**"
            print(f" {mark} idx {i:4d}: sc={sc_counts[i]:6d} dec={dec[i]:6d} diff={dec[i]-sc_counts[i]:+d}")
        # Count total mismatches and track the longest run of consecutive matches.
        cum_match_run = 0
        max_match_run = 0
        diff_count = 0
        for i in range(n):
            if dec[i] == sc_counts[i]:
                cum_match_run += 1
                max_match_run = max(max_match_run, cum_match_run)
            else:
                cum_match_run = 0
                diff_count += 1
        print(f" total mismatches: {diff_count}/{n}, longest run of matches: {max_match_run}")
        print()


if __name__ == "__main__":
    main()
|
||||
@@ -0,0 +1,48 @@
|
||||
"""End-to-end IDFH ingest verification."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import tempfile
|
||||
import json
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
|
||||
def main():
|
||||
idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
|
||||
txt = idfh.parent / "TXT" / f"{idfh.name}.txt"
|
||||
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idfh.read_bytes(),
|
||||
idfh,
|
||||
idf_report_text=txt.read_text(errors="replace"),
|
||||
)
|
||||
print("=== save_imported_idf (IDFH) ===")
|
||||
print(f" serial: {rec['serial']}")
|
||||
print(f" filename: {rec['filename']}")
|
||||
print(f" filesize: {rec['filesize']}")
|
||||
print(f" h5: {rec['hdf5_filename']}") # expect None for histogram
|
||||
print(f" sidecar: {rec['sidecar_filename']}")
|
||||
print()
|
||||
print("=== Event ===")
|
||||
print(f" timestamp: {ev.timestamp}")
|
||||
print(f" record_type: {ev.record_type}")
|
||||
print(f" sample_rate: {ev.sample_rate}")
|
||||
print()
|
||||
# Inspect sidecar to confirm intervals were stashed
|
||||
sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
|
||||
sc = json.loads(sc_path.read_text())
|
||||
intervals = sc.get("extensions", {}).get("idf_intervals", [])
|
||||
print(f" sidecar intervals: {len(intervals)}")
|
||||
if intervals:
|
||||
print(f" first interval: {intervals[0]}")
|
||||
print(f" last interval: {intervals[-1]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,40 @@
|
||||
"""Verify the had_report=False path: ingest IDFW with no .txt."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
import tempfile
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
|
||||
def main():
|
||||
idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idfw.read_bytes(),
|
||||
idfw,
|
||||
serial_hint=None,
|
||||
idf_report_text=None, # ← no .txt!
|
||||
)
|
||||
print("=== IDFW without .txt ingest ===")
|
||||
print(f" serial: {rec['serial']}")
|
||||
print(f" timestamp: {ev.timestamp}")
|
||||
print(f" sample_rate: {ev.sample_rate}")
|
||||
print(f" record_type: {ev.record_type}")
|
||||
print(f" rectime_sec: {ev.rectime_seconds}")
|
||||
nT = len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0
|
||||
nV = len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0
|
||||
nL = len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0
|
||||
nM = len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0
|
||||
print(f" raw_samples: Tran={nT} Vert={nV} Long={nL} MicL={nM}")
|
||||
if ev.peak_values:
|
||||
print(f" peak_values: tran={ev.peak_values.tran} vert={ev.peak_values.vert} long={ev.peak_values.long}")
|
||||
print(f" h5 written: {rec['hdf5_filename']}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,102 @@
|
||||
"""End-to-end Thor report PDF rendering.
|
||||
|
||||
Ingests an IDFW + .txt via save_imported_idf, runs gather_report_data
|
||||
(faking a minimal DB row), and renders the PDF to disk.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import tempfile
|
||||
import json
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
from sfm import report_pdf
|
||||
|
||||
|
||||
class FakeDb:
|
||||
"""Stand-in for SeismoDb.get_event(); the renderer only needs a few cols."""
|
||||
def __init__(self, event):
|
||||
self.event = event
|
||||
|
||||
def get_event(self, _id):
|
||||
return self.event
|
||||
|
||||
|
||||
def main():
|
||||
base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
|
||||
idfw = base / "UM11719_20231219162723.IDFW"
|
||||
txt = base / "TXT" / f"{idfw.name}.txt"
|
||||
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idfw.read_bytes(),
|
||||
idfw,
|
||||
idf_report_text=txt.read_text(errors="replace"),
|
||||
)
|
||||
print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
|
||||
|
||||
# Verify sidecar has bw_report block
|
||||
sc_path = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
|
||||
sc = json.loads(sc_path.read_text())
|
||||
bw = sc.get("bw_report", {})
|
||||
print(f" bw_report.available: {bw.get('available')}")
|
||||
print(f" bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
|
||||
print(f" bw_report.mic.pspl_dbl: {bw.get('mic', {}).get('pspl_dbl')}")
|
||||
print(f" bw_report.histogram.n_intervals: {bw.get('histogram', {}).get('n_intervals')}")
|
||||
|
||||
# Build a DB-row-shaped dict from the Event for gather_report_data
|
||||
import datetime
|
||||
ts = ev.timestamp
|
||||
ts_iso = None
|
||||
if ts is not None:
|
||||
try:
|
||||
ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
|
||||
except Exception:
|
||||
pass
|
||||
fake_row = {
|
||||
"serial": "UM11719",
|
||||
"blastware_filename": rec["filename"],
|
||||
"record_type": "Waveform",
|
||||
"timestamp": ts_iso,
|
||||
"sample_rate": ev.sample_rate,
|
||||
"project": ev.project_info.project if ev.project_info else None,
|
||||
"client": ev.project_info.client if ev.project_info else None,
|
||||
"operator": ev.project_info.operator if ev.project_info else None,
|
||||
"sensor_location": ev.project_info.sensor_location if ev.project_info else None,
|
||||
"created_at": None,
|
||||
}
|
||||
|
||||
rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
|
||||
print()
|
||||
print(f"=== ReportData ===")
|
||||
print(f" event_id: {rd.event_id}")
|
||||
print(f" serial: {rd.serial}")
|
||||
print(f" record_type: {rd.record_type}")
|
||||
print(f" event_datetime: {rd.event_datetime_str}")
|
||||
print(f" trigger: {rd.trigger_source}")
|
||||
print(f" geo_range: {rd.geo_range_str}")
|
||||
print(f" sample_rate: {rd.sample_rate_str}")
|
||||
print(f" firmware: {rd.firmware}")
|
||||
print(f" calibration: {rd.calibration_date} by {rd.calibration_by}")
|
||||
print(f" battery: {rd.battery_volts}")
|
||||
print(f" PVS: {rd.peak_vector_sum_ips} in/s at {rd.peak_vector_sum_time_s} sec")
|
||||
print(f" mic_pspl_dbl: {rd.mic_pspl_dbl}")
|
||||
print(f" mic_zc_freq_hz: {rd.mic_zc_freq_hz}")
|
||||
print(f" channel_stats: {len(rd.channel_stats)} rows")
|
||||
for cs in rd.channel_stats:
|
||||
print(f" {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} ToP={cs['time_of_peak_s']} Acc={cs['peak_accel_g']} Disp={cs['peak_disp_in']} Test={cs['sensor_check']}")
|
||||
|
||||
# Render the PDF
|
||||
out_path = REPO / "analysis_idf" / "thor_report.pdf"
|
||||
pdf_bytes = report_pdf.render_event_report_pdf(rd)
|
||||
out_path.write_bytes(pdf_bytes)
|
||||
print()
|
||||
print(f" PDF written: {out_path} ({len(pdf_bytes)} bytes)")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,91 @@
|
||||
"""End-to-end Thor IDFH histogram report PDF rendering."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import tempfile
|
||||
import json
|
||||
import datetime
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
from sfm import report_pdf
|
||||
|
||||
|
||||
class FakeDb:
|
||||
def __init__(self, event):
|
||||
self.event = event
|
||||
|
||||
def get_event(self, _id):
|
||||
return self.event
|
||||
|
||||
|
||||
def main():
|
||||
# Use the multi-interval IDFH (81 + trigger row)
|
||||
idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
|
||||
txt = idfh.parent / "TXT" / f"{idfh.name}.txt"
|
||||
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idfh.read_bytes(),
|
||||
idfh,
|
||||
idf_report_text=txt.read_text(errors="replace"),
|
||||
)
|
||||
print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
|
||||
|
||||
sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
|
||||
sc = json.loads(sc_path.read_text())
|
||||
bw = sc.get("bw_report", {})
|
||||
hist = bw.get("histogram", {})
|
||||
print(f" bw_report.histogram.start: {hist.get('start')}")
|
||||
print(f" bw_report.histogram.stop: {hist.get('stop')}")
|
||||
print(f" bw_report.histogram.n_intervals: {hist.get('n_intervals')}")
|
||||
print(f" bw_report.histogram.interval_size: {hist.get('interval_size')}")
|
||||
print(f" bw_report.histogram.interval_size_s: {hist.get('interval_size_s')}")
|
||||
print(f" bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
|
||||
|
||||
ts = ev.timestamp
|
||||
ts_iso = None
|
||||
if ts is not None:
|
||||
try:
|
||||
ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
|
||||
except Exception:
|
||||
pass
|
||||
fake_row = {
|
||||
"serial": "UM13981",
|
||||
"blastware_filename": rec["filename"],
|
||||
"record_type": "Histogram",
|
||||
"timestamp": ts_iso,
|
||||
"sample_rate": ev.sample_rate,
|
||||
"project": ev.project_info.project if ev.project_info else None,
|
||||
"client": ev.project_info.client if ev.project_info else None,
|
||||
"operator": ev.project_info.operator if ev.project_info else None,
|
||||
"sensor_location": ev.project_info.sensor_location if ev.project_info else None,
|
||||
"created_at": None,
|
||||
}
|
||||
rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="hist-1")
|
||||
|
||||
print()
|
||||
print("=== ReportData (histogram) ===")
|
||||
print(f" is_histogram: {rd.is_histogram}")
|
||||
print(f" histogram_start: {rd.histogram_start_str}")
|
||||
print(f" histogram_stop: {rd.histogram_stop_str}")
|
||||
print(f" histogram_n_intervals: {rd.histogram_n_intervals}")
|
||||
print(f" histogram_interval_size:{rd.histogram_interval_size}")
|
||||
print(f" histogram_interval_times[:3]: {rd.histogram_interval_times[:3]}")
|
||||
print(f" histogram_interval_times[-2:]: {rd.histogram_interval_times[-2:]}")
|
||||
print(f" channel_stats: {len(rd.channel_stats)} rows")
|
||||
for cs in rd.channel_stats:
|
||||
print(f" {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} peak_date={cs['peak_date']} peak_time={cs['peak_time']}")
|
||||
|
||||
pdf_bytes = report_pdf.render_event_report_pdf(rd)
|
||||
out_path = REPO / "analysis_idf" / "thor_report_idfh.pdf"
|
||||
out_path.write_bytes(pdf_bytes)
|
||||
print()
|
||||
print(f" PDF written: {out_path} ({len(pdf_bytes)} bytes)")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,52 @@
|
||||
"""End-to-end ingest test: feed an IDFW + .txt to save_imported_idf in a tmp store."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
import tempfile
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
|
||||
def main():
|
||||
idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
|
||||
txt = idfw.parent / "TXT" / f"{idfw.name}.txt"
|
||||
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idfw.read_bytes(),
|
||||
idfw,
|
||||
serial_hint=None,
|
||||
idf_report_text=txt.read_text(errors="replace"),
|
||||
)
|
||||
print("=== Save result ===")
|
||||
print(f" serial: {rec['serial']}")
|
||||
print(f" filename: {rec['filename']}")
|
||||
print(f" filesize: {rec['filesize']}")
|
||||
print(f" h5: {rec['hdf5_filename']}")
|
||||
print(f" sidecar: {rec['sidecar_filename']}")
|
||||
print()
|
||||
print("=== Event ===")
|
||||
print(f" serial: {ev.serial if hasattr(ev,'serial') else '(n/a)'}")
|
||||
print(f" timestamp: {ev.timestamp}")
|
||||
print(f" sample_rate: {ev.sample_rate}")
|
||||
print(f" record_type: {ev.record_type}")
|
||||
print(f" rectime_sec: {ev.rectime_seconds}")
|
||||
print(f" raw_samples: Tran={len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0}, Vert={len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0}, Long={len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0}, MicL={len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0}")
|
||||
if ev.peak_values:
|
||||
print(f" peaks (txt): Tran={ev.peak_values.tran} Vert={ev.peak_values.vert} Long={ev.peak_values.long}")
|
||||
print()
|
||||
|
||||
# Verify the h5 file actually got written
|
||||
h5path = Path(td) / "UM11719" / f"{idfw.name}.h5"
|
||||
print(f" h5 exists: {h5path.exists()} size={h5path.stat().st_size if h5path.exists() else 0}")
|
||||
sidecar = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
|
||||
print(f" sidecar exists:{sidecar.exists()} size={sidecar.stat().st_size if sidecar.exists() else 0}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,137 @@
|
||||
"""Decode IDFH histogram intervals + verify against sidecar."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import struct
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
|
||||
SEGMENT_MAGIC = b"\x02\xda\x0a\x00\x00\x00"
|
||||
SEGMENT_SIZE = 732 # = 10-byte header + 10 × 72-byte intervals + 2-byte tail
|
||||
INTERVAL_SIZE = 72
|
||||
CHANNELS = ("Tran", "Vert", "Long", "MicL")
|
||||
|
||||
|
||||
def decode_interval(buf72: bytes) -> dict:
|
||||
"""Decode one 72-byte interval into per-channel min/max/halfp."""
|
||||
out = {}
|
||||
for i, ch in enumerate(CHANNELS):
|
||||
block = buf72[i*16 : (i+1)*16]
|
||||
mn = struct.unpack_from(">h", block, 0)[0]
|
||||
mx = struct.unpack_from(">h", block, 2)[0]
|
||||
sb = struct.unpack_from(">h", block, 4)[0]
|
||||
halfp = struct.unpack_from(">H", block, 6)[0]
|
||||
f10 = struct.unpack_from(">H", block, 10)[0]
|
||||
f14 = struct.unpack_from(">H", block, 14)[0]
|
||||
peak_count = max(abs(mn), abs(mx))
|
||||
out[ch] = {
|
||||
"min": mn,
|
||||
"max": mx,
|
||||
"field4": sb,
|
||||
"halfp": halfp,
|
||||
"field10": f10,
|
||||
"field14": f14,
|
||||
"peak": peak_count,
|
||||
"freq_hz": (512.0 / halfp) if halfp > 5 else None,
|
||||
}
|
||||
out["_tail"] = buf72[64:].hex(" ")
|
||||
return out
|
||||
|
||||
|
||||
def walk_idfh(buf: bytes) -> list:
|
||||
"""Walk all interval records in an IDFH file."""
|
||||
intervals = []
|
||||
# Multi-segment file: every 02 da 0a 00 00 00 marker introduces a segment.
|
||||
# Single-interval file: just one body header at 0xf96 of form ?? ?? 0a 00 00 00.
|
||||
# Find them all.
|
||||
i = 0
|
||||
while True:
|
||||
j = buf.find(b"\x0a\x00\x00\x00", i)
|
||||
if j < 0:
|
||||
break
|
||||
        # Validate the candidate: the 2 bytes before it must hold a big-endian
        # length, and the 0a 00 00 00 marker must be followed by 00 NN 05 3f.
|
||||
if j < 2:
|
||||
i = j + 1
|
||||
continue
|
||||
# Body header form: [length_be_2][0a 00 00 00][00 NN][05 3f]
|
||||
if j + 10 > len(buf):
|
||||
break
|
||||
length = int.from_bytes(buf[j-2:j], "big")
|
||||
# Verify the segment-marker shape: [length_be][0a 00 00 00][00 NN][05 3f]
|
||||
if buf[j+4] != 0x00:
|
||||
i = j + 1
|
||||
continue
|
||||
if buf[j+6:j+8] != b"\x05\x3f":
|
||||
i = j + 1
|
||||
continue
|
||||
# Header layout (10 bytes): [length_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
|
||||
# Followed by N interval records of 72 bytes each, then 2 tail bytes.
|
||||
# length value = (N × 72) + 10 (counts bytes from 0x0a... through interval data).
|
||||
header_start = j - 2
|
||||
n_intervals = (length - 10) // INTERVAL_SIZE
|
||||
interval_start = header_start + 10
|
||||
for k in range(n_intervals):
|
||||
off = interval_start + k * INTERVAL_SIZE
|
||||
if off + INTERVAL_SIZE > len(buf):
|
||||
break
|
||||
chunk = buf[off:off + INTERVAL_SIZE]
|
||||
intervals.append({"offset": off, **decode_interval(chunk)})
|
||||
i = header_start + length + 2
|
||||
return intervals
|
||||
|
||||
|
||||
def main():
|
||||
# Test against multi-segment IDFH
|
||||
target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
|
||||
sc_path = target.parent / "TXT" / f"{target.name}.txt"
|
||||
buf = target.read_bytes()
|
||||
intervals = walk_idfh(buf)
|
||||
print(f"=== {target.name} ===")
|
||||
print(f" file size: {len(buf)}")
|
||||
print(f" decoded intervals: {len(intervals)}")
|
||||
    # Collect the sidecar's per-interval rows (lines that start with a 2022-/2023- date)
|
||||
sc_rows = []
|
||||
for line in sc_path.read_text(errors="replace").splitlines():
|
||||
if line.startswith("2022-") or line.startswith("2023-"):
|
||||
sc_rows.append(line)
|
||||
print(f" sidecar rows: {len(sc_rows)}")
|
||||
|
||||
print()
|
||||
for k in [0, 1, 78, 79, 80]:
|
||||
if k >= len(intervals):
|
||||
continue
|
||||
iv = intervals[k]
|
||||
print(f"--- interval {k} @0x{iv['offset']:04x} ---")
|
||||
for ch in CHANNELS:
|
||||
d = iv[ch]
|
||||
peak_ips = d["peak"] / 32768 * 10.0
|
||||
print(f" {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s) halfp={d['halfp']:5d} freq={d['freq_hz']}")
|
||||
# sidecar row
|
||||
if k < len(sc_rows):
|
||||
print(f" SC: {sc_rows[k]}")
|
||||
|
||||
# Test single-interval IDFH
|
||||
print()
|
||||
target2 = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
|
||||
sc2 = target2.parent / "TXT" / f"{target2.name}.txt"
|
||||
buf2 = target2.read_bytes()
|
||||
intervals2 = walk_idfh(buf2)
|
||||
print(f"=== {target2.name} ===")
|
||||
print(f" file size: {len(buf2)}, decoded intervals: {len(intervals2)}")
|
||||
if intervals2:
|
||||
iv = intervals2[0]
|
||||
for ch in CHANNELS:
|
||||
d = iv[ch]
|
||||
peak_ips = d["peak"] / 32768 * 10.0
|
||||
print(f" {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s) halfp={d['halfp']:5d} freq={d['freq_hz']}")
|
||||
sc_rows2 = [l for l in sc2.read_text(errors='replace').splitlines() if l.startswith("2023-")]
|
||||
if sc_rows2:
|
||||
print(f" SC: {sc_rows2[0]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
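
The header check in `walk_idfh()` above can also be expressed as a single `struct` unpack of the 10-byte layout described in its comments (`[length_be 2B][0a 00 00 00][00 NN][05 3f]`). This is a sketch under that stated layout only; `parse_idfh_segment_header` is a hypothetical name, and the meaning of the `NN` byte is not interpreted here.

```python
import struct

INTERVAL_SIZE = 72   # bytes per interval record, as in the script above
HEADER_SIZE = 10     # length (2) + marker (4) + 00 NN (2) + 05 3f (2)


def parse_idfh_segment_header(buf: bytes, off: int):
    """Return (length, n_intervals) if a segment header starts at `off`, else None."""
    if off < 0 or off + HEADER_SIZE > len(buf):
        return None
    # Big-endian: u16 length, 4-byte marker, pad byte, NN byte, 2-byte trailer.
    length, marker, pad, _nn, trailer = struct.unpack_from(">H4sBB2s", buf, off)
    if marker != b"\x0a\x00\x00\x00" or pad != 0x00 or trailer != b"\x05\x3f":
        return None
    return length, (length - HEADER_SIZE) // INTERVAL_SIZE
```

For the full-segment marker `02 da` the length is 730, which gives 10 intervals and matches the 732-byte `SEGMENT_SIZE` once the 2 tail bytes are added.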
|
||||
@@ -0,0 +1,41 @@
|
||||
"""Find the IDFH interval period by scoring candidate sizes on consistently-zero byte columns."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
|
||||
def main():
|
||||
target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
|
||||
buf = target.read_bytes()
|
||||
body_start = 0xF96
|
||||
body_end = 0x270C
|
||||
body = buf[body_start:body_end]
|
||||
print(f"body size: {len(body)} bytes (file {len(buf)} bytes)")
|
||||
|
||||
# For each candidate interval size, count how many bytes at fixed offsets within
|
||||
# each interval are zero (consistent column-zero pattern indicates correct size).
|
||||
print()
|
||||
print("=== zero-column score by interval size (higher = more likely) ===")
|
||||
best = []
|
||||
for sz in range(16, 100):
|
||||
n = len(body) // sz
|
||||
if n < 30:
|
||||
continue
|
||||
# For each column position within an interval, count how many of n intervals have zero
|
||||
score = 0
|
||||
for col in range(sz):
|
||||
zeros = sum(1 for i in range(n) if body[i*sz + col] == 0)
|
||||
if zeros >= n * 0.9:
|
||||
score += 1
|
||||
best.append((score, sz, n))
|
||||
best.sort(reverse=True)
|
||||
for score, sz, n in best[:10]:
|
||||
print(f" size={sz:3d} n_intervals={n} consistently-zero-cols={score}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
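
The zero-column heuristic above can be checked on synthetic data where the record size is known. The sketch below builds 40 fake 72-byte records with eight always-zero columns and confirms that the scoring picks 72. `zero_column_score` is a hypothetical standalone version of the inner loop; it omits the script's `n < 30` cutoff, and the synthetic record layout is invented for the demonstration.

```python
def zero_column_score(body: bytes, size: int, threshold: float = 0.9) -> int:
    """Count column positions that are zero in at least `threshold` of the records."""
    n = len(body) // size
    if n == 0:
        return 0
    score = 0
    for col in range(size):
        zeros = sum(1 for i in range(n) if body[i * size + col] == 0)
        if zeros >= n * threshold:
            score += 1
    return score


# Synthetic body: columns 8..15 are always zero, every other byte is nonzero.
record = bytes(0 if 8 <= c < 16 else 1 + c for c in range(72))
body = record * 40

# The true record size is the only candidate whose zero columns line up.
best = max(range(16, 100), key=lambda sz: zero_column_score(body, sz))
```

With a wrong candidate size the zero columns drift from record to record, so no column reaches the 90% threshold and the score stays at zero.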
|
||||
@@ -0,0 +1,40 @@
|
||||
"""Per-file accuracy + sample-count details."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from micromate.idf_file import read_idf_file
|
||||
from analysis_idf.recon import load_sidecar_samples
|
||||
|
||||
|
||||
def main():
|
||||
root = REPO / "tests/fixtures/THORDATA_example"
|
||||
files = sorted([f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")])
|
||||
GEO_LSB = 0.0003
|
||||
    # Limit to the first 20 files that decode and have a sidecar, for detail.
|
||||
shown = 0
|
||||
for f in files:
|
||||
try:
|
||||
res = read_idf_file(f)
|
||||
except Exception:
|
||||
continue
|
||||
sc_path = f.parent / "TXT" / f"{f.name}.txt"
|
||||
if not sc_path.exists():
|
||||
continue
|
||||
sc = load_sidecar_samples(sc_path)
|
||||
sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
|
||||
dec = res.samples.get("Tran", [])
|
||||
n = min(len(sc_tran), len(dec))
|
||||
exact = sum(1 for i in range(n) if sc_tran[i] == dec[i]) if n else 0
|
||||
pct = 100.0 * exact / n if n else 0.0
|
||||
print(f"{f.name:40s} size={f.stat().st_size:6d} sc_n={len(sc_tran):4d} dec_n={len(dec):4d} exact={pct:.1f}%")
|
||||
shown += 1
|
||||
if shown >= 20:
|
||||
break
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,64 @@
|
||||
"""Look at what's at the divergence boundary."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start, parse_segment_header
|
||||
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||
|
||||
|
||||
def main():
|
||||
buf = TARGET.read_bytes()
|
||||
body = buf[0x0f1f:]
|
||||
start = find_data_start(body)
|
||||
print(f"data_start: {start} (= file offset 0x{0x0f1f + start:04x})")
|
||||
|
||||
blocks = walk_body(body, start)
|
||||
print(f"{len(blocks)} blocks total")
|
||||
print()
|
||||
|
||||
    # First 30 blocks
|
||||
print("=== first 30 blocks ===")
|
||||
for i, b in enumerate(blocks[:30]):
|
||||
body_off = 0x0f1f + b.offset
|
||||
if b.tag_hi == 0x40:
|
||||
hdr = parse_segment_header(b)
|
||||
print(f" [{i:3d}] @0x{body_off:04x} {b.kind} (segment header) counter={hdr['counter'] if hdr else '?'} field2={hdr['field2'].hex() if hdr else '?'} anchor={hdr['anchor_bytes'].hex() if hdr else '?'} tail={hdr['tail'].hex() if hdr else '?'}")
|
||||
else:
|
||||
print(f" [{i:3d}] @0x{body_off:04x} {b.kind} len={b.length} data={b.data[:16].hex()}")
|
||||
print()
|
||||
|
||||
# Cumulative sample counts per block to find which block contains sample 254
|
||||
print("=== cumulative samples through blocks ===")
|
||||
cur_ch = "Tran"
|
||||
rotation = ["Vert", "Long", "MicL", "Tran"]
|
||||
seg_count = 0
|
||||
samples_in_curseg = 2 # preamble Tran[0], Tran[1]
|
||||
for i, b in enumerate(blocks[:30]):
|
||||
if b.tag_hi == 0x40:
|
||||
seg_count += 1
|
||||
prev_ch = cur_ch
|
||||
cur_ch = rotation[(seg_count - 1) % 4]
|
||||
print(f" [{i:3d}] 40 02 -> end of {prev_ch} segment, start {cur_ch} (segment {seg_count})")
|
||||
samples_in_curseg = 2 # anchors
|
||||
elif (b.tag_hi & 0xF0) == 0x10:
|
||||
nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
|
||||
samples_in_curseg += nn
|
||||
print(f" [{i:3d}] {b.kind} nibble: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||
elif (b.tag_hi & 0xF0) == 0x20:
|
||||
nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
|
||||
samples_in_curseg += nn
|
||||
print(f" [{i:3d}] {b.kind} int8: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||
elif b.tag_hi == 0x00:
|
||||
samples_in_curseg += b.tag_lo
|
||||
print(f" [{i:3d}] {b.kind} RLE: +{b.tag_lo}, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||
elif b.tag_hi == 0x30:
|
||||
samples_in_curseg += b.tag_lo
|
||||
print(f" [{i:3d}] {b.kind} packed12: +{b.tag_lo} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
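
The tag branches in the cumulative-sample loop above can be collected into one small classifier. This sketch restates only what that script's branches and print labels imply (a 12-bit count for the `0x1N`/`0x2N` tags, an 8-bit count for `0x00`/`0x30`); it is not taken from a format specification, and `classify_tag` is a hypothetical name.

```python
def classify_tag(tag_hi: int, tag_lo: int):
    """Return (kind, sample_count) for a two-byte block tag."""
    if tag_hi == 0x40:
        # Segment header: marks a channel switch, carries no sample count here.
        return "segment_header", 0
    if tag_hi == 0x00:
        return "rle", tag_lo
    if tag_hi == 0x30:
        return "packed12", tag_lo
    if (tag_hi & 0xF0) == 0x10:
        # Low nibble of tag_hi plus tag_lo form a 12-bit count.
        return "nibble", ((tag_hi & 0x0F) << 8) | tag_lo
    if (tag_hi & 0xF0) == 0x20:
        return "int8", ((tag_hi & 0x0F) << 8) | tag_lo
    return "unknown", 0
```

Returning `"unknown"` rather than raising keeps a walker able to log an unexpected tag and decide for itself whether to stop.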
|
||||
@@ -0,0 +1,89 @@
|
||||
"""Reconnaissance helpers for cracking the Thor IDFW binary."""
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
REPO = Path(__file__).resolve().parents[1]
|
||||
sys.path.insert(0, str(REPO))
|
||||
|
||||
TARGET = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
|
||||
TXT = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/TXT/UM11719_20231219162723.IDFW.txt"
|
||||
|
||||
|
||||
def hex_at(buf: bytes, off: int, n: int = 32) -> str:
|
||||
chunk = buf[off : off + n]
|
||||
hexs = " ".join(f"{b:02x}" for b in chunk)
|
||||
asc = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
|
||||
return f"{off:04x}: {hexs} {asc}"
|
||||
|
||||
|
||||
def find_all(buf: bytes, needle: bytes) -> list[int]:
|
||||
out: list[int] = []
|
||||
i = 0
|
||||
while True:
|
||||
j = buf.find(needle, i)
|
||||
if j < 0:
|
||||
break
|
||||
out.append(j)
|
||||
i = j + 1
|
||||
return out
|
||||
|
||||
|
||||
def load_sidecar_samples(path: Path) -> dict[str, list[float]]:
|
||||
"""Parse the txt sample table — Tran/Vert/Long/MicL."""
|
||||
out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||
in_block = False
|
||||
for line in path.read_text(errors="replace").splitlines():
|
||||
if not in_block:
|
||||
if line.strip() == "Waveform Data Channels":
|
||||
in_block = True
|
||||
continue
|
||||
if line.startswith("Waveform Data USB Channels"):
|
||||
break
|
||||
parts = line.split("\t")
|
||||
# First row is the header "\tTran\tVert\tLong\tMicL"
|
||||
if len(parts) >= 5 and parts[1] == "Tran":
|
||||
continue
|
||||
if len(parts) < 5:
|
||||
continue
|
||||
try:
|
||||
out["Tran"].append(float(parts[1]))
|
||||
out["Vert"].append(float(parts[2]))
|
||||
out["Long"].append(float(parts[3]))
|
||||
out["MicL"].append(float(parts[4]))
|
||||
except ValueError:
|
||||
continue
|
||||
return out
|
||||
|
||||
|
||||
def main():
|
||||
buf = TARGET.read_bytes()
|
||||
samples = load_sidecar_samples(TXT)
|
||||
print(f"file size: {len(buf)} bytes")
|
||||
print(f"sample rows: Tran={len(samples['Tran'])} Vert={len(samples['Vert'])} Long={len(samples['Long'])} MicL={len(samples['MicL'])}")
|
||||
print(f"first 6 Tran samples: {samples['Tran'][:6]}")
|
||||
print(f"first 6 Vert samples: {samples['Vert'][:6]}")
|
||||
print(f"first 6 Long samples: {samples['Long'][:6]}")
|
||||
print(f"first 6 MicL samples: {samples['MicL'][:6]}")
|
||||
|
||||
print()
|
||||
print("=== BW magic '00 02 00' positions ===")
|
||||
hits = find_all(buf, b"\x00\x02\x00")
|
||||
print(f"{len(hits)} hits")
|
||||
for h in hits[:20]:
|
||||
print(hex_at(buf, h, 24))
|
||||
|
||||
print()
|
||||
print("=== '40 02' segment-header positions ===")
|
||||
hits = find_all(buf, b"\x40\x02")
|
||||
print(f"{len(hits)} hits")
|
||||
for h in hits:
|
||||
ctx_pre = buf[max(0, h - 4): h].hex()
|
||||
ctx_post = buf[h: h + 20].hex()
|
||||
        # Show the preceding bytes to help tell real headers from coincidental matches
|
||||
print(f" 0x{h:04x} pre={ctx_pre} post={ctx_post}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,40 @@
"""Find each segment boundary in the channel and check if errors reset there."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from minimateplus.waveform_codec import decode_waveform_v2
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    decoded = decode_waveform_v2(buf[0x0f1f:])
    GEO_LSB = 0.0003

    for ch in ("Tran", "Vert", "Long"):
        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
        dec = decoded[ch]
        # Find every transition where error becomes zero from nonzero (or grows from zero)
        # Print indices where dec resyncs back to exact match.
        n = min(len(sc_counts), len(dec))
        events = []
        prev_match = True
        for i in range(n):
            match = sc_counts[i] == dec[i]
            if match != prev_match:
                kind = "RESYNC" if match else "DIVERGE"
                events.append((i, kind, sc_counts[i], dec[i]))
            prev_match = match
        print(f"{ch}: {len(events)} transitions")
        for i, kind, sc_v, dec_v in events[:20]:
            print(f" idx {i:4d} {kind:8s} sc={sc_v:6d} dec={dec_v:6d} diff={dec_v-sc_v:+d}")
        print()


if __name__ == "__main__":
    main()
@@ -0,0 +1,46 @@
"""Smoke-test read_idf_file on IDFH across the corpus."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from micromate.idf_file import read_idf_file


def main():
    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
    result = read_idf_file(target)
    ev = result.event
    print(f"=== {target.name} ===")
    print(f" signature: {result.signature}")
    print(f" serial: {ev.serial}")
    print(f" timestamp: {ev.timestamp}")
    print(f" sample_rate: {ev.sample_rate}")
    print(f" kind: {ev.kind}")
    print(f" intervals: {len(result.intervals or [])}")
    print(f" peaks: T={ev.peaks.transverse_ips:.4f} V={ev.peaks.vertical_ips:.4f} L={ev.peaks.longitudinal_ips:.4f}")
    print()

    root = REPO / "tests/fixtures/THORDATA_example"
    files = list(root.rglob("*.IDFH"))
    ok = fail = nyi = 0
    total_intervals = 0
    for f in files:
        try:
            r = read_idf_file(f)
            ok += 1
            total_intervals += len(r.intervals or [])
        except NotImplementedError:
            nyi += 1
        except Exception as exc:
            fail += 1
            if fail <= 3:
                print(f" FAIL: {f.name}: {type(exc).__name__}: {exc}")
    print(f"Corpus: {len(files)} IDFH files | ok={ok} fail={fail} nyi={nyi}")
    print(f"Total intervals decoded: {total_intervals}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,48 @@
"""Smoke-test read_idf_file across the sample corpus."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from micromate.idf_file import read_idf_file, geo_count_to_ips, mic_count_to_psi


def main():
    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
    result = read_idf_file(target)
    ev = result.event
    print(f"=== {target.name} ===")
    print(f" signature: {result.signature}")
    print(f" serial: {ev.serial}")
    print(f" timestamp: {ev.timestamp}")
    print(f" sample_rate: {ev.sample_rate}")
    print(f" record_time: {ev.record_time_sec}")
    print(f" calibration: {result.binary_metadata.calibration_date}")
    print(f" Tran samples: {len(result.samples['Tran'])}, peak_ips={ev.peaks.transverse_ips:.4f}")
    print(f" Vert samples: {len(result.samples['Vert'])}, peak_ips={ev.peaks.vertical_ips:.4f}")
    print(f" Long samples: {len(result.samples['Long'])}, peak_ips={ev.peaks.longitudinal_ips:.4f}")
    print(f" MicL samples: {len(result.samples['MicL'])}")
    print()

    # Corpus sweep
    root = REPO / "tests/fixtures/THORDATA_example"
    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
    ok = fail = nyi = 0
    for f in files:
        try:
            r = read_idf_file(f)
            ok += 1
        except NotImplementedError:
            nyi += 1
        except Exception as exc:
            fail += 1
            if fail <= 5:
                print(f" FAIL: {f.name}: {type(exc).__name__}: {exc}")
    print()
    print(f"Corpus: {len(files)} IDFW files | ok={ok} fail={fail} not-implemented={nyi}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,47 @@
"""Verify build_bw_report_from_idf against a known sidecar."""
from __future__ import annotations
import json
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from micromate.idf_ascii_report import parse_idf_report
from micromate.idf_to_bw_report import build_bw_report_from_idf
from micromate.idf_file import read_idf_file


def show(prefix: str, d: dict, indent: int = 0):
    for k, v in d.items():
        if isinstance(v, dict):
            print(f"{' '*indent}{prefix}{k}:")
            show("", v, indent + 1)
        else:
            print(f"{' '*indent}{prefix}{k}: {v!r}")


def main():
    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
    idfw = base / "UM11719_20231219162723.IDFW"
    txt = base / "TXT" / f"{idfw.name}.txt"

    report_dict = parse_idf_report(txt.read_text(errors="replace"))
    res = read_idf_file(idfw)
    bw = build_bw_report_from_idf(report_dict, binary_md=res.binary_metadata)

    print("=== IDFW → bw_report ===")
    show("", bw)

    print()
    print("=== IDFH (single trigger row) ===")
    idfh = base / "UM11719_20231219162648.IDFH"
    txt_h = base / "TXT" / f"{idfh.name}.txt"
    rh = parse_idf_report(txt_h.read_text(errors="replace"))
    res_h = read_idf_file(idfh)
    bw_h = build_bw_report_from_idf(rh, binary_md=res_h.binary_metadata, intervals=res_h.intervals)
    show("", bw_h)


if __name__ == "__main__":
    main()
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,73 @@
"""Trace Tran sample-by-sample to find exactly where the codec drifts."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def s4(n: int) -> int:
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    return b if b < 128 else b - 256


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    GEO_LSB = 0.0003
    sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]

    body = buf[0x0f1f:]
    # Tran[0], Tran[1] from preamble
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    print(f"preamble Tran[0]={t0} Tran[1]={t1} (sidecar: {sc_tran[0]}, {sc_tran[1]})")

    # Block 0: 10 f8 at body[7:9]
    print(f"block 0: tag {body[7]:02x} {body[8]:02x}")
    print(f" block 0 first 10 data bytes: {body[9:19].hex()}")

    # Walk block 0 manually, comparing each sample
    cur = t1
    samples = [t0, t1]
    block_off = 7
    nn = body[8]
    print(f" NN = {nn}")
    data = body[9 : 9 + nn // 2]
    for byi, byte in enumerate(data):
        for nib_idx, nib in enumerate(((byte >> 4) & 0xF, byte & 0xF)):
            cur += s4(nib)
            samples.append(cur)
            idx = len(samples) - 1
            if 0 <= idx < len(sc_tran):
                sc_v = sc_tran[idx]
                match = "✓" if sc_v == cur else "✗"
                if idx < 12 or 240 <= idx <= 260:
                    print(f" idx {idx:3d}: nibble byte={byte:02x} nib={nib:x} delta={s4(nib):+d} cur={cur:+d} sc={sc_v:+d} {match}")

    print(f"end of block 0: cur={cur}, len(samples)={len(samples)}, decoder expected 250 here")
    # Block 1: 20 28 starts at offset 9 + 124 = 133 from block_off=7
    block1_off = 9 + nn // 2
    print(f"block 1: tag {body[block1_off]:02x} {body[block1_off+1]:02x} (expecting 20 28)")
    nn1 = body[block1_off + 1]
    print(f" block 1 NN = {nn1}")
    data1 = body[block1_off + 2 : block1_off + 2 + nn1]
    for byi, byte in enumerate(data1):
        cur += i8(byte)
        samples.append(cur)
        idx = len(samples) - 1
        if idx < len(sc_tran):
            sc_v = sc_tran[idx]
            match = "✓" if sc_v == cur else "✗"
            if 248 <= idx <= 295:
                print(f" idx {idx:3d}: int8 byte={byte:02x} delta={i8(byte):+d} cur={cur:+d} sc={sc_v:+d} {match}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,42 @@
"""Feed candidate body offsets to the BW codec and compare with sidecar."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    # Sidecar samples in 0.0003 counts (Thor geo LSB).
    sc_tran = [int(round(v / 0.0003)) for v in sc["Tran"][:30]]
    sc_vert = [int(round(v / 0.0003)) for v in sc["Vert"][:30]]
    sc_long = [int(round(v / 0.0003)) for v in sc["Long"][:30]]
    sc_micl = [int(round(v / 1e-6)) for v in sc["MicL"][:30]]  # 1 µ unit for mic? Will iterate.
    print(f"sidecar Tran (counts): {sc_tran}")
    print(f"sidecar Vert (counts): {sc_vert}")
    print(f"sidecar Long (counts): {sc_long}")
    print(f"sidecar MicL (×1e-6): {sc_micl}")
    print()

    # Try candidate body start offsets.
    for off in (0x0f1f, 0x1057, 0x11f1, 0x1333, 0x1bde, 0x0d30):
        print(f"=== body @ 0x{off:04x} ===")
        body = buf[off:]
        decoded = decode_waveform_v2(body)
        if not decoded:
            print(" decode_waveform_v2 returned None")
            continue
        for ch in ("Tran", "Vert", "Long", "MicL"):
            arr = decoded.get(ch, [])
            print(f" {ch}[{len(arr)}]: {arr[:20]}")
        print()


if __name__ == "__main__":
    main()
@@ -0,0 +1,51 @@
"""Verify decode_waveform_v2 against sidecar across all 2304 samples per channel."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from minimateplus.waveform_codec import decode_waveform_v2
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    body = buf[0x0f1f:]
    decoded = decode_waveform_v2(body)

    print(f"Sidecar lengths: Tran={len(sc['Tran'])} Vert={len(sc['Vert'])} Long={len(sc['Long'])} MicL={len(sc['MicL'])}")
    print(f"Decoded lengths: Tran={len(decoded['Tran'])} Vert={len(decoded['Vert'])} Long={len(decoded['Long'])} MicL={len(decoded['MicL'])}")
    print()

    GEO_LSB = 0.0003  # in/s per count
    for ch in ("Tran", "Vert", "Long"):
        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
        dec = decoded[ch]
        n = min(len(sc_counts), len(dec))
        matches = sum(1 for i in range(n) if sc_counts[i] == dec[i])
        first_mismatch = next((i for i in range(n) if sc_counts[i] != dec[i]), None)
        print(f"{ch}: compared {n}, exact matches {matches} ({100*matches/n:.2f}%)")
        if first_mismatch is not None:
            i = first_mismatch
            print(f" first mismatch at idx {i}: sidecar={sc_counts[i]} ({sc[ch][i]}), decoded={dec[i]}")
            print(f" context sidecar[{i-2}..{i+5}]: {sc_counts[max(0,i-2):i+5]}")
            print(f" context decoded[{i-2}..{i+5}]: {dec[max(0,i-2):i+5]}")

    # MicL: find the multiplicative factor that fits
    print()
    print("=== MicL scale analysis ===")
    sc_micl = sc["MicL"]
    dec_micl = decoded["MicL"]
    # Skip zero values when computing ratio
    ratios = [sc_micl[i] / dec_micl[i] for i in range(min(50, len(sc_micl), len(dec_micl))) if dec_micl[i] != 0]
    if ratios:
        avg = sum(ratios) / len(ratios)
        print(f" avg ratio sidecar/decoded over first 50 nonzero: {avg:.4e} (n={len(ratios)})")
        print(f" ratios sample: {[f'{r:.4e}' for r in ratios[:6]]}")


if __name__ == "__main__":
    main()
@@ -6,11 +6,68 @@ Series IV event-file format. Sibling to
Series III "Rosetta Stone") — this doc holds what we know so far and
the open questions still to crack.

**Status (2026-05-20):** ASCII text sidecar fully decoded (1,014
sample files round-trip). Binary `.IDFH` / `.IDFW` codec
**not yet implemented** — binaries are stored opaquely by
`WaveformStore.save_imported_idf`, with metadata sourced from the
paired `.txt` sidecar.

**Status (2026-05-28):** ASCII text sidecar fully decoded (1,014
sample files round-trip). **Thor IDFW** binary now decodes via
`micromate.idf_file.read_idf_file()` — reuses the BW segment-rotated
block codec verbatim at fixed body offset `0x0f1f`; metadata (serial,
timestamp, sample_rate, record_time, calibration_date) extracted from
the binary header. Sample fidelity is 87–99% byte-exact on quiet
events; loud events hit the BW codec's known walker-stops-early
limitation. Residual ~3% drift on per-sample deltas (likely a
Thor-specific 12-bit delta refinement not yet modelled).

**Thor IDFH histograms also decoded.** Body has one or more segments;
each 12-byte segment header `[length_be 2B][0a 00 00 00][00 NN][05 3f]`
introduces `N = (length - 10) // 72` interval records of 72 bytes
each. Each interval = 4 × 16-byte per-channel records (plus an 8-byte tail):
`[int16 min][int16 max][int16 ??][uint16 halfp][2B 00][uint16 ??][2B 00][uint16 ??]`.
Geo peak `= max(|min|, |max|) / 32768 × 10` in/s (matches sidecar
within ~1.8%); freq `= 512 / halfp` Hz (None for halfp ≤ 5 → ">100"
sentinel). Corpus: **all 859 Thor IDFH files decode, 181,071
intervals**. Wired through `read_idf_file()` →
`save_imported_idf()` → sidecar's `extensions.idf_intervals`.
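
The geo peak and frequency formulas above can be sketched for a single 16-byte per-channel record as follows. This is only an illustration: it assumes every field is big-endian, it applies to geo channels only (the mic conversion constant is still open), and the function and constant names are hypothetical rather than the ones used in `micromate.idf_file`.

```python
import struct
from typing import Optional

INT16_FS = 32768.0      # int16 full scale
GEO_FULL_SCALE = 10.0   # in/s
HALFP_FREQ_NUM = 512.0  # freq_hz = 512 / halfp


def decode_geo_channel_record(rec: bytes) -> dict:
    """Decode one 16-byte per-channel record (hypothetical helper)."""
    if len(rec) != 16:
        raise ValueError(f"expected 16 bytes, got {len(rec)}")
    # Assumed big-endian: int16 min, int16 max, int16 unknown, uint16 halfp.
    lo, hi, _unknown, halfp = struct.unpack(">hhhH", rec[:8])
    peak_ips = max(abs(lo), abs(hi)) / INT16_FS * GEO_FULL_SCALE
    # halfp <= 5 is the ">100" sentinel and is reported as None.
    freq_hz: Optional[float] = HALFP_FREQ_NUM / halfp if halfp > 5 else None
    return {"min": lo, "max": hi, "peak_ips": peak_ips, "freq_hz": freq_hz}


# Synthetic record: min=-100, max=328, halfp=16, remaining 8 bytes zeroed.
record = struct.pack(">hhhH", -100, 328, 0, 16) + bytes(8)
print(decode_geo_channel_record(record))
```

A full interval would apply this four times (Tran, Vert, Long, MicL) over the first 64 bytes of the 72-byte record and leave the 8-byte tail undecoded.
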

**Note on the BE9439 outliers in the example corpus:** Two files
(`BE9439_20200713131747.IDFW` and `BE9439_20200713124251.IDFH`) are
**Series III Blastware** binaries, not Thor. Provenance: TMI tried
to use Thor to manage auto-call-homes for Series III units; the
experiment didn't work out, but it did leave a few BW event files
in Thor's per-serial directory structure with `.IDFW`/`.IDFH`
extensions — Thor's forwarder applied its own naming convention to
the BW bodies it was relaying. Their header `10 00 01 80 00 00
Instantel STRT ff fe <end_key> <start_key>` is the BW SUB 5A STRT
record, not a Thor body preamble. The reader detects them by
signature and raises `NotImplementedError` pointing callers at
`read_blastware_file()`, which extracts BW-format peaks from them.
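
Signature-based routing of this kind might look roughly like the sketch below. It assumes the ASCII `Instantel` tag follows the 6-byte prefix immediately in both formats, and `classify_idf_header` is a hypothetical name, not the reader's actual API.

```python
THOR_PREFIX = bytes.fromhex("00 12 01 00 00 00")      # genuine Thor / Series IV
BW_STRAY_PREFIX = bytes.fromhex("10 00 01 80 00 00")  # Series III BW STRT record
INSTANTEL_TAG = b"Instantel"


def classify_idf_header(head: bytes) -> str:
    """Return 'thor', 'blastware', or 'unknown' from the first bytes of a file."""
    # Assumption: the tag sits directly after the 6-byte prefix.
    if head[6:6 + len(INSTANTEL_TAG)] != INSTANTEL_TAG:
        return "unknown"
    if head.startswith(THOR_PREFIX):
        return "thor"
    if head.startswith(BW_STRAY_PREFIX):
        return "blastware"
    return "unknown"


print(classify_idf_header(THOR_PREFIX + b"Instantel\x00"))
print(classify_idf_header(BW_STRAY_PREFIX + b"Instantel STRT"))
```

A caller would read the first 16 bytes or so of the file, classify them, and hand `blastware` files to the Series III reader instead of attempting a Thor decode.
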

**Still NYI for Thor IDFH:** per-channel `int16 field4` (possibly
time-of-peak); the two uint16 fields (probably PVS contributions);
8-byte interval tail (PVS data); mic dB(L) exact conversion constant.

### Codec breakthroughs (2026-05-28)

- **Body offset is a fixed `0x0f1f`** across 151/154 corpus IDFW
  files. Preceded by a 4-byte record-type marker (`46 00 00 00`)
  + magic preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`.
- **Sample stream is BW's segment-rotated block codec verbatim.**
  Thor reuses `10 NN` (nibble), `20 NN` (int8), `00 NN` (RLE),
  `30 NN` (packed12), `40 02` (segment header) tags with the same
  semantics. Channel rotation Tran→Vert→Long→MicL.
- **Geo LSB = 0.0003 in/s** (not BW's 0.005), because Thor's 16-bit
  ADC range maps to 10 in/s without the 16-count BW quantization step.
- **Mic ≈ 2.14×10⁻⁶ psi/count** (rough scale; refine after channel
  block calibration constants are decoded).
- **BW compliance anchor `\xbe\x80\x00\x00\x00\x00` reappears at
  IDFW offset 0x952** — sample_rate at anchor−6 (uint16 BE),
  record_time at anchor+6 (float32 BE), same layout as BW.
- **Event timestamp at offset 0x97A** — 8 bytes `[day][month]
  [year_be][unk][hour][min][sec]`. Stop-time mirrors at 0x982.
- **Serial as null-terminated ASCII at 0x14E**.
- **Calibration date** at 0x194–0x197 (day, month, year_be).
- Per-sample residual drift of ~3% suggests Thor encodes int8/nibble
  deltas with an extra refinement bit that BW doesn't carry —
  unsolved; errors resync within a few samples so cumulative impact
  is small.
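
To make the `10 NN` and `20 NN` tags concrete, here is a minimal sketch of how the two delta block types accumulate samples, following the interpretation used by the manual trace script in this change set (NN counts nibbles for `10`, bytes for `20`). It is not the production decoder: RLE, packed12, and segment headers are left out, and `decode_delta_block` is a hypothetical name.

```python
from __future__ import annotations


def s4(n: int) -> int:
    """Interpret a 4-bit value as a signed delta (-8..7)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Interpret a byte as a signed delta (-128..127)."""
    return b if b < 128 else b - 256


def decode_delta_block(tag: int, nn: int, data: bytes, start: int) -> list[int]:
    """Accumulate deltas from one block, beginning at sample value `start`."""
    cur = start
    out: list[int] = []
    if tag == 0x10:
        # Nibble block: NN nibbles packed two per byte, high nibble first.
        for byte in data[: nn // 2]:
            for nib in (byte >> 4, byte & 0xF):
                cur += s4(nib)
                out.append(cur)
    elif tag == 0x20:
        # int8 block: NN signed single-byte deltas.
        for byte in data[:nn]:
            cur += i8(byte)
            out.append(cur)
    else:
        raise ValueError(f"tag 0x{tag:02x} not covered by this sketch")
    return out


print(decode_delta_block(0x10, 4, bytes([0x1F, 0x80]), start=10))
print(decode_delta_block(0x20, 2, bytes([0x05, 0xFB]), start=0))
```

Multiplying the resulting geo counts by the 0.0003 in/s LSB gives values that can be compared against the sidecar.
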

---

@@ -210,8 +210,7 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
|
||||
"long_peak_acceleration",
|
||||
"tran_peak_displacement", "vert_peak_displacement",
|
||||
"long_peak_displacement",
|
||||
"tran_time_of_peak", "vert_time_of_peak", "long_time_of_peak",
|
||||
"mic_time_of_peak", "mic_zc_freq",
|
||||
"mic_zc_freq",
|
||||
)
|
||||
for key in float_fields:
|
||||
v = raw.get(key)
|
||||
@@ -223,6 +222,22 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
|
||||
else:
|
||||
out.pop(key, None)
|
||||
|
||||
# Time-of-peak: Thor labels these "TimeofPeak" (lowercase "of") so the
|
||||
# normalizer produces "*_timeof_peak". Map them to the canonical
|
||||
# ``*_time_of_peak`` output keys for downstream consumers.
|
||||
for raw_key, out_key in (
|
||||
("tran_timeof_peak", "tran_time_of_peak"),
|
||||
("vert_timeof_peak", "vert_time_of_peak"),
|
||||
("long_timeof_peak", "long_time_of_peak"),
|
||||
("mic_timeof_peak", "mic_time_of_peak"),
|
||||
):
|
||||
v = raw.get(raw_key)
|
||||
if v is None:
|
||||
continue
|
||||
fv = _parse_float(v)
|
||||
if fv is not None:
|
||||
out[out_key] = fv
|
||||
|
||||
# Microphone — Thor reports MicPSPL (dB(L)) which is the closest
|
||||
# analogue to BW's mic_ppv. The raw "99.4 dB(L)" string stays in
|
||||
# `out` under the original `mic_pspl` key for display; the parsed
|
||||
|
||||
+514
-48
@@ -1,64 +1,530 @@
|
||||
"""
|
||||
micromate/idf_file.py — placeholder for the Thor IDF binary codec.
|
||||
micromate/idf_file.py — Thor IDF binary codec.
|
||||
|
||||
Thor's ``.IDFH`` (histogram) and ``.IDFW`` (waveform) event files are an
|
||||
Instantel proprietary binary format that has not yet been reverse-
|
||||
engineered. Today seismo-relay treats them as opaque blobs:
|
||||
``WaveformStore.save_imported_idf`` stores the bytes verbatim and reads
|
||||
all device-authoritative metadata from the paired ``.IDFW.txt`` /
|
||||
``.IDFH.txt`` ASCII sidecar (parsed by ``idf_ascii_report.py``).
|
||||
Decodes the Instantel Micromate Series IV ``.IDFW`` (waveform) and
|
||||
``.IDFH`` (histogram) binary on-disk format. Sister module to
|
||||
``minimateplus/event_file_io.py``.
|
||||
|
||||
When we crack the binary codec — same reverse-engineering playbook we
|
||||
used to byte-perfect-parse Series III BW files (see
|
||||
``docs/instantel_protocol_reference.md`` and ``minimateplus/event_file_io.py``)
|
||||
— this module will grow:
|
||||
Status (2026-05-28):
|
||||
|
||||
- ``read_idf_file(path) -> IdfEvent``
|
||||
Parse a ``.IDFW``/``.IDFH`` binary and return a fully populated
|
||||
``IdfEvent`` whose waveform-sample arrays come from the binary
|
||||
(the .txt sidecar's tabular sample block being a best-effort
|
||||
check). Lets us ingest Thor events even when the operator
|
||||
hasn't enabled the .txt exporter — closing the
|
||||
``had_report=False`` gap that the thor-watcher forwarder
|
||||
currently tolerates as a known limitation.
|
||||
- **Genuine Series IV / Thor binaries** are all signed
|
||||
``00 12 01 00 00 00 Instantel\\0`` (sig-A in earlier notes). Two
|
||||
Series III (Blastware) binaries appear in the example corpus
|
||||
(``BE9439_*``) — they share the ``.IDFW``/``.IDFH`` extension by
|
||||
filing convention but carry a BW STRT header (``10 00 01 80 00 00
|
||||
Instantel STRT...``) and are NOT Thor data. The reader detects
|
||||
them by signature and raises NotImplementedError pointing callers
|
||||
at ``minimateplus.event_file_io.read_blastware_file()``.
|
||||
- **IDFW waveform body** reuses the BW segment-rotated block codec
|
||||
verbatim. Body usually (not always) starts at file offset ``0x0f1f``. Samples
|
||||
decoded via ``minimateplus.waveform_codec.decode_waveform_v2``
|
||||
with 87–99% byte-exact match against ``.IDFW.txt`` sidecar (quiet
|
||||
events). Loud events hit the BW codec's known walker-stops-early
|
||||
limit. Residual ~3% drift on per-sample deltas — likely a
|
||||
Thor-specific 12-bit delta refinement that BW's codec doesn't
|
||||
model. Geo LSB = 0.0003 in/s; mic factor ~2.14e-6 psi/count.
|
||||
- **IDFH histogram body**: 12-byte segment header
|
||||
``[len_be 2B] 0a 00 00 00 [00 NN_counter] 05 3f`` introduces a
|
||||
segment of ``N`` 72-byte interval records (``N = (len - 10) // 72``).
|
||||
Each record holds 4 × 16-byte per-channel min/max/halfp + 8-byte
|
||||
tail. Geo peaks via ``max(|min|, |max|) / 32768 × 10`` in/s
|
||||
(matches sidecar within ~1.8%), freq via ``512 / halfp`` Hz.
|
||||
**All 859 Thor IDFH files in the corpus decode (181,071 intervals).**
|
||||
- Binary metadata directly extracted: serial, timestamp, sample_rate,
|
||||
record_time, calibration_date. Other fields fall back to the paired
|
||||
``.IDFW.txt`` / ``.IDFH.txt`` sidecar (consumed by
|
||||
``WaveformStore.save_imported_idf``).
|
||||
|
||||
- ``write_idf_file(path, event)`` (eventually)
|
||||
Round-trip event reconstruction, used for verifying the codec
|
||||
against captured device files the way ``write_blastware_file``
|
||||
verifies the Series III codec.
|
||||
|
||||
- Helpers for decoding the binary's per-channel sample arrays into
|
||||
physical units, the per-event flash buffer's monitor-log records,
|
||||
etc.
|
||||
|
||||
The reverse-engineering path: pair every ``.IDFW`` binary in
|
||||
``thor-watcher/example-data/`` with its sibling ``.IDFW.txt``, treating
|
||||
the txt's "Waveform Data Channels" block as ground-truth, and align
|
||||
the binary's per-channel int16-or-similar arrays against it. Header
|
||||
fields (sample rate, channel count, record time, timestamps) sit before
|
||||
the sample block — same approach as the BW codec where ASCII strings
|
||||
inside the binary (``Project:``, ``Client:``, etc.) anchored field
|
||||
discovery.
|
||||
The full reverse-engineering writeup lives in
|
||||
``docs/idf_protocol_reference.md``.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import struct
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import Union
|
||||
from typing import Optional, Union
|
||||
|
||||
from .models import IdfEvent
|
||||
from minimateplus.waveform_codec import decode_waveform_v2
|
||||
|
||||
from .models import IdfEvent, IdfPeaks, IdfReport
|
||||
|
||||
|
||||
def read_idf_file(path: Union[str, Path]) -> "IdfEvent":
|
||||
"""Parse a Thor ``.IDFW``/``.IDFH`` binary into an ``IdfEvent``.
|
||||
# Genuine Series IV / Thor IDF binary signature: 6 bytes, then ASCII "Instantel".
|
||||
_THOR_PREFIX = b"\x00\x12\x01\x00\x00\x00"
|
||||
# Stray Series III (Blastware) binaries that occasionally turn up in Thor
|
||||
# corpus directories renamed to the .IDFW/.IDFH convention. Their header
|
||||
# (`10 00 01 80 00 00 Instantel STRT ...`) is byte-for-byte a BW SUB 5A
|
||||
# STRT record, not a Thor binary. Detected so we can refuse-and-route
|
||||
# rather than mis-parse.
|
||||
_BW_STRAY_PREFIX = b"\x10\x00\x01\x80\x00\x00"
|
||||
_INSTANTEL_TAG = b"Instantel"
|
||||
|
||||
Not yet implemented. When implemented, this will be the canonical
|
||||
entry point for reading Thor binaries — the ASCII sidecar parser
|
||||
becomes an optional fast-path metadata supplement rather than the
|
||||
sole source of device-authoritative data.
|
||||
# Most common body offset for sig-A IDFW files (~50% of prod events;
# 151/154 in the original tests/fixtures/THORDATA_example corpus). The
# body is the segment-rotated block stream consumed by decode_waveform_v2;
# bytes [0:3] are the magic ``00 02 00`` preamble. Production events
# routinely use other offsets — see :func:`_find_waveform_body_offset`
# for the dynamic scan. This constant survives only as the priority hint.
_BODY_START_SIG_A = 0x0F1F

# Magic bytes that mark a candidate waveform-body preamble.
_BODY_MAGIC = b"\x00\x02\x00"

# Where to start looking for body candidates inside the file. Skip the
# fixed-header region where the same magic legitimately appears inside
# channel-test records and the compliance block (offsets 0x015d, 0x091c,
# 0x0ae2, 0x0d30 in observed events).
_BODY_SCAN_FLOOR = 0x0E00

# Geophone count → in/s, derived from sidecar ground truth: the smallest
# non-zero sample in 1,014-file corpus is 0.0003 in/s.
_GEO_LSB_IPS = 0.0003

# Microphone count → psi, derived from sidecar regression on 50 sample
# pairs from UM11719_20231219162723.IDFW (mic-heavy event).
_MIC_LSB_PSI = 2.14e-6

# IDFH histogram constants.
_IDFH_INTERVAL_SIZE = 72  # bytes per per-interval record
_IDFH_SEGMENT_HEADER = 10  # bytes: [len_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
_IDFH_SEGMENT_TAIL = 2  # bytes after the interval data block, before next marker
_IDFH_HALFP_FREQ_NUM = 512.0  # freq_hz = NUM / halfp; halfp ≤ 5 means ">100 Hz" sentinel
_IDFH_GEO_FULL_SCALE = 10.0  # in/s — Normal range
_IDFH_INT16_FS = 32768.0
_IDFH_CHANNELS = ("Tran", "Vert", "Long", "MicL")


# ─── Binary metadata extraction ─────────────────────────────────────────────


@dataclass
class IdfBinaryMetadata:
    """Fields recoverable from the sig-A binary header (no .txt needed)."""
    serial: Optional[str] = None
    event_datetime: Optional[datetime.datetime] = None
    sample_rate: Optional[int] = None
    record_time_sec: Optional[float] = None
    calibration_date: Optional[datetime.date] = None


def _read_ascii_z(buf: bytes, off: int, maxlen: int = 64) -> Optional[str]:
    if off >= len(buf):
        return None
    end = buf.find(b"\x00", off, off + maxlen)
    if end < 0:
        end = min(off + maxlen, len(buf))
    s = buf[off:end].decode("ascii", errors="replace").strip()
    return s or None


def _decode_8byte_timestamp(buf: bytes, off: int) -> Optional[datetime.datetime]:
    """Layout: ``[day][month][year_hi][year_lo][unknown][hour][min][sec]``."""
    if off + 8 > len(buf):
        return None
    day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
    year = (yh << 8) | yl
    if not (2015 <= year <= 2050 and 1 <= mon <= 12 and 1 <= day <= 31
            and 0 <= hr < 24 and 0 <= mn < 60 and 0 <= sc < 60):
        return None
    try:
        return datetime.datetime(year, mon, day, hr, mn, sc)
    except ValueError:
        return None


def extract_binary_metadata(buf: bytes) -> IdfBinaryMetadata:
    """Pull serial/timestamp/sample_rate/record_time/calibration from the
    sig-A binary header.

    Field positions confirmed against UM11719_20231219162723.IDFW; stable
    across the 151-file sig-A corpus.
    """
raise NotImplementedError(
|
||||
"IDF binary codec not yet implemented; the .IDFW/.IDFH binary format "
|
||||
"is undecoded. Use parse_idf_report() on the paired .txt sidecar "
|
||||
"for device-authoritative metadata."
|
||||
md = IdfBinaryMetadata()
|
||||
|
||||
# Serial: null-terminated ASCII at 0x14E.
|
||||
md.serial = _read_ascii_z(buf, 0x14E, maxlen=16)
|
||||
|
||||
# Sample rate + record time live in a BW-compatible compliance block.
|
||||
# Locate the 6-byte anchor `be 80 00 00 00 00` and read offsets relative
|
||||
# to it: anchor-6 = sample_rate uint16 BE; anchor+6 = record_time float32 BE.
|
||||
anchor = buf.find(b"\xbe\x80\x00\x00\x00\x00", 0x800, 0xA00)
|
||||
if anchor > 0:
|
||||
sr_bytes = buf[anchor - 6 : anchor - 4]
|
||||
if len(sr_bytes) == 2:
|
||||
sr = int.from_bytes(sr_bytes, "big")
|
||||
if sr in (256, 512, 1024, 2048, 4096):
|
||||
md.sample_rate = sr
|
||||
rt_bytes = buf[anchor + 6 : anchor + 10]
|
||||
if len(rt_bytes) == 4:
|
||||
try:
|
||||
rt = struct.unpack(">f", rt_bytes)[0]
|
||||
if 0.1 <= rt <= 600.0:
|
||||
md.record_time_sec = float(rt)
|
||||
except struct.error:
|
||||
pass
|
||||
|
||||
# Event timestamp: 8 bytes. Position differs between IDFW (0x97A) and
|
||||
# IDFH (0x9F8); scan a small range and accept the first valid decode.
|
||||
for off in (0x97A, 0x9F8):
|
||||
ts = _decode_8byte_timestamp(buf, off)
|
||||
if ts is not None:
|
||||
md.event_datetime = ts
|
||||
break
|
||||
|
||||
# Calibration date: day, month, year_be at 0x194-0x197.
|
||||
if len(buf) > 0x197:
|
||||
day, mon = buf[0x194], buf[0x195]
|
||||
year = int.from_bytes(buf[0x196 : 0x198], "big")
|
||||
if 1 <= mon <= 12 and 1 <= day <= 31 and 2015 <= year <= 2050:
|
||||
try:
|
||||
md.calibration_date = datetime.date(year, mon, day)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
return md
|
||||
|
||||
|
||||
# ─── Sample decoder + unit conversion ───────────────────────────────────────
|
||||
|
||||
|
||||
def _find_waveform_body_offset(buf: bytes) -> Optional[int]:
|
||||
"""Pick the file offset of the waveform body by trial-decoding every
|
||||
``00 02 00`` magic position past the fixed-header region.
|
||||
|
||||
The body's location isn't fixed across all sig-A IDFW files — about
|
||||
half the production events use ``0x0f1f``, but the rest have offsets
|
||||
that shift based on header padding / channel-config layout. We
|
||||
auto-detect by:
|
||||
|
||||
1. Find every ``00 02 00`` occurrence past ``_BODY_SCAN_FLOOR``.
|
||||
2. Try ``decode_waveform_v2()`` on each candidate.
|
||||
3. Pick the offset whose decoded sample count is largest.
|
||||
|
||||
Returns the offset, or ``None`` if no candidate yielded more than
|
||||
the trivial 2-sample preamble (= "no real body found").
|
||||
|
||||
Costs ~2-8 trial decodes per file; in practice the first candidate
|
||||
past 0x0e00 is usually the right one.
|
||||
"""
|
||||
if len(buf) < _BODY_SCAN_FLOOR + 8:
|
||||
return None
|
||||
best: Optional[tuple[int, int]] = None # (total_samples, offset)
|
||||
i = _BODY_SCAN_FLOOR
|
||||
while True:
|
||||
j = buf.find(_BODY_MAGIC, i)
|
||||
if j < 0:
|
||||
break
|
||||
i = j + 1
|
||||
try:
|
||||
decoded = decode_waveform_v2(buf[j:])
|
||||
except Exception:
|
||||
continue
|
||||
if not decoded:
|
||||
continue
|
||||
total = sum(len(v) for v in decoded.values())
|
||||
# A "real" body has more than just the 2-sample preamble.
|
||||
if total <= 2:
|
||||
continue
|
||||
if best is None or total > best[0]:
|
||||
best = (total, j)
|
||||
return best[1] if best else None
|
||||
|
||||
|
||||
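The scan-and-trial-decode strategy above can be sketched in isolation. This is a minimal illustration, not the project's implementation: `fake_decode` is a hypothetical stand-in for `decode_waveform_v2`, and `SCAN_FLOOR = 0x0E00` is an assumed value for `_BODY_SCAN_FLOOR`, taken from the changelog wording "past `0x0E00`".

```python
from typing import Callable, Dict, List, Optional

MAGIC = b"\x00\x02\x00"  # body magic named in the docstring
SCAN_FLOOR = 0x0E00      # assumption: stands in for _BODY_SCAN_FLOOR


def fake_decode(body: bytes) -> Dict[str, List[int]]:
    """Hypothetical decoder: the byte after the magic is a sample count."""
    n = body[3] if len(body) > 3 else 0
    return {"Tran": list(range(n))}


def find_best_offset(
    buf: bytes, decode: Callable[[bytes], Dict[str, List[int]]]
) -> Optional[int]:
    """Return the magic offset whose trial decode yields the most samples."""
    best: Optional[tuple] = None  # (total_samples, offset)
    i = SCAN_FLOOR
    while True:
        j = buf.find(MAGIC, i)
        if j < 0:
            break
        i = j + 1  # step one byte so overlapping candidates are also tried
        total = sum(len(v) for v in decode(buf[j:]).values())
        # Skip candidates that decode to nothing beyond the 2-sample preamble.
        if total > 2 and (best is None or total > best[0]):
            best = (total, j)
    return best[1] if best else None


# Synthetic buffer: a preamble-only candidate followed by a 40-sample one.
buf = bytearray(SCAN_FLOOR + 64)
buf[SCAN_FLOOR + 4 : SCAN_FLOOR + 8] = MAGIC + b"\x02"
buf[SCAN_FLOOR + 32 : SCAN_FLOOR + 36] = MAGIC + b"\x28"
best_offset = find_best_offset(bytes(buf), fake_decode)
```

Scoring by decoded sample count is what rejects a coincidental magic match: a false positive decodes to almost nothing, so it loses to the real body.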
def _decode_waveform_samples(buf: bytes) -> Optional[dict]:
|
||||
"""Decode samples from the sig-A waveform body.
|
||||
|
||||
Returns the raw decoder counts dict — geo LSB = 0.0003 in/s, mic in
|
||||
its own count unit (see :func:`mic_count_to_psi`). Returns None if
|
||||
no usable body is found.
|
||||
|
||||
Uses :func:`_find_waveform_body_offset` to locate the body — the
|
||||
file-offset varies across events (~50% sit at the canonical
|
||||
``0x0f1f`` but the rest don't), so the previous hardcoded constant
|
||||
silently produced 2-sample preamble-only output for half the corpus.
|
||||
"""
|
||||
off = _find_waveform_body_offset(buf)
|
||||
if off is None:
|
||||
return None
|
||||
return decode_waveform_v2(buf[off:])
|
||||
|
||||
|
||||
def geo_count_to_ips(count: int) -> float:
|
||||
"""Convert a Thor geo decoder count to in/s. LSB = 0.0003 in/s."""
|
||||
return count * _GEO_LSB_IPS
|
||||
|
||||
|
||||
def mic_count_to_psi(count: int) -> float:
|
||||
"""Convert a Thor mic decoder count to psi. Scale derived from
|
||||
regression over 50 sample pairs in UM11719_20231219162723.IDFW;
|
||||
consistent to ~5%. Calibration constants from the channel block
|
||||
can refine this once decoded.
|
||||
"""
|
||||
return count * _MIC_LSB_PSI
|
||||
|
||||
|
||||
# ─── IDFH histogram decoder ─────────────────────────────────────────────────
|
||||
|
||||
|
||||
@dataclass
|
||||
class IdfhInterval:
|
||||
"""One decoded histogram interval (typically one minute of monitoring)."""
|
||||
offset: int # file byte offset of the 72-byte record
|
||||
# Per-channel min/max ADC counts (int16 BE), half-period samples, peak count.
|
||||
# Peak = max(|min|, |max|). freq_hz = 512/halfp (None if halfp ≤ 5 →
|
||||
# ">100 Hz" sentinel; matches sidecar convention).
|
||||
tran_min: int
|
||||
tran_max: int
|
||||
tran_halfp: int
|
||||
vert_min: int
|
||||
vert_max: int
|
||||
vert_halfp: int
|
||||
long_min: int
|
||||
long_max: int
|
||||
long_halfp: int
|
||||
micl_min: int
|
||||
micl_max: int
|
||||
micl_halfp: int
|
||||
|
||||
def peak_count(self, channel: str) -> int:
|
||||
mn = getattr(self, f"{channel.lower()}_min")
|
||||
mx = getattr(self, f"{channel.lower()}_max")
|
||||
return max(abs(mn), abs(mx))
|
||||
|
||||
def peak_ips(self, channel: str) -> float:
|
||||
"""Convert peak count to in/s (geo channels only)."""
|
||||
return self.peak_count(channel) / _IDFH_INT16_FS * _IDFH_GEO_FULL_SCALE
|
||||
|
||||
def freq_hz(self, channel: str) -> Optional[float]:
|
||||
halfp = getattr(self, f"{channel.lower()}_halfp")
|
||||
if halfp <= 5:
|
||||
return None
|
||||
return _IDFH_HALFP_FREQ_NUM / halfp
|
||||
|
||||
|
||||
def _decode_idfh_interval(buf72: bytes, offset: int) -> IdfhInterval:
|
||||
"""Decode one 72-byte interval record into per-channel min/max/halfp."""
|
||||
import struct
|
||||
fields = []
|
||||
for i in range(4):
|
||||
block = buf72[i * 16 : (i + 1) * 16]
|
||||
mn = struct.unpack_from(">h", block, 0)[0]
|
||||
mx = struct.unpack_from(">h", block, 2)[0]
|
||||
# block[4:6] = int16 BE, role unknown (possibly time-of-peak)
|
||||
halfp = struct.unpack_from(">H", block, 6)[0]
|
||||
# block[10:12] and block[14:16] are uint16 BE with unknown semantics
|
||||
# (likely sum / count contributions for the PVS computation).
|
||||
fields.extend([mn, mx, halfp])
|
||||
# Tail 8 bytes (buf72[64:72]) carry PVS-related data; not yet decoded.
|
||||
return IdfhInterval(
|
||||
offset=offset,
|
||||
tran_min=fields[0], tran_max=fields[1], tran_halfp=fields[2],
|
||||
vert_min=fields[3], vert_max=fields[4], vert_halfp=fields[5],
|
||||
long_min=fields[6], long_max=fields[7], long_halfp=fields[8],
|
||||
micl_min=fields[9], micl_max=fields[10], micl_halfp=fields[11],
|
||||
)
|
||||
|
||||
|
||||
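The per-channel block layout read by `_decode_idfh_interval` can be exercised on a synthetic record. This is a sketch under stated assumptions: the field offsets (min at 0, max at 2, half-period at 6) come from the code above, the `512 / halfp` rule comes from the `IdfhInterval` comment, and the helper names `decode_channel_block` and `halfp_to_hz` are hypothetical.

```python
import struct
from typing import Optional, Tuple


def decode_channel_block(block: bytes) -> Tuple[int, int, int]:
    """Pull (min, max, half-period) out of one 16-byte channel block."""
    mn, mx = struct.unpack_from(">hh", block, 0)    # int16 BE min, max
    (halfp,) = struct.unpack_from(">H", block, 6)   # uint16 BE half-period
    return mn, mx, halfp


def halfp_to_hz(halfp: int) -> Optional[float]:
    """512 / halfp, with halfp <= 5 treated as the '>100 Hz' sentinel."""
    if halfp <= 5:
        return None
    return 512 / halfp


# Synthetic block: min=-120, max=300, unknown field=0, halfp=32, 8 pad bytes.
block = struct.pack(">hhhH", -120, 300, 0, 32) + bytes(8)
mn, mx, halfp = decode_channel_block(block)
peak = max(abs(mn), abs(mx))
freq = halfp_to_hz(halfp)
```

Converting `peak` to in/s needs the full-scale constants (`_IDFH_INT16_FS`, `_IDFH_GEO_FULL_SCALE`), whose values are not shown in this diff, so the sketch stops at raw counts.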
def decode_idfh_body(buf: bytes) -> list:
|
||||
"""Walk an IDFH file and decode every interval record.
|
||||
|
||||
The body has one or more segments; each segment header is 12 bytes:
|
||||
``[length_be 2B][0a 00 00 00][00 NN_counter][05 3f]`` where ``length``
|
||||
is bytes from the magic through the end of the interval block
|
||||
(= 10 + 72 × n_intervals). Segments are separated by a 2-byte tail
|
||||
+ next-segment 2-byte prefix (the bytes before the next length field).
|
||||
Confirmed against the 859-file corpus (181,071 intervals decoded; 1
|
||||
failure is the sig-B BE9439 file).
|
||||
"""
|
||||
intervals: list = []
|
||||
i = 0
|
||||
while True:
|
||||
j = buf.find(b"\x0a\x00\x00\x00", i)
|
||||
if j < 0 or j < 2:
|
||||
break
|
||||
# Validate: [length_be][0a 00 00 00][00 NN][05 3f]
|
||||
if j + 8 > len(buf) or buf[j + 4] != 0x00 or buf[j + 6 : j + 8] != b"\x05\x3f":
|
||||
i = j + 1
|
||||
continue
|
||||
length = int.from_bytes(buf[j - 2 : j], "big")
|
||||
n = (length - _IDFH_SEGMENT_HEADER) // _IDFH_INTERVAL_SIZE
|
||||
if n <= 0:
|
||||
i = j + 1
|
||||
continue
|
||||
header_start = j - 2
|
||||
interval_start = header_start + _IDFH_SEGMENT_HEADER
|
||||
for k in range(n):
|
||||
off = interval_start + k * _IDFH_INTERVAL_SIZE
|
||||
if off + _IDFH_INTERVAL_SIZE > len(buf):
|
||||
break
|
||||
chunk = buf[off : off + _IDFH_INTERVAL_SIZE]
|
||||
intervals.append(_decode_idfh_interval(chunk, off))
|
||||
# Advance past this segment + the 2-byte tail.
|
||||
i = header_start + length + _IDFH_SEGMENT_TAIL
|
||||
return intervals
|
||||
|
||||
|
||||
# ─── Top-level reader ───────────────────────────────────────────────────────
|
||||
|
||||
|
||||
@dataclass
|
||||
class IdfReadResult:
|
||||
"""Return type for :func:`read_idf_file`.
|
||||
|
||||
For waveforms (``.IDFW``), ``samples`` holds the per-channel sample
|
||||
arrays in Thor decoder counts. For histograms (``.IDFH``),
|
||||
``samples`` is empty and ``intervals`` holds the per-interval
|
||||
record list (peaks, freqs).
|
||||
"""
|
||||
event: IdfEvent
|
||||
samples: dict # {"Tran": [...], ...} for IDFW; empty for IDFH
|
||||
binary_metadata: IdfBinaryMetadata
|
||||
signature: str # always "thor" for now (sig-A genuine Thor)
|
||||
intervals: Optional[list] = None # list[IdfhInterval] for IDFH; None for IDFW
|
||||
|
||||
|
||||
def read_idf_file(
|
||||
path: Union[str, Path],
|
||||
*,
|
||||
data: Optional[bytes] = None,
|
||||
) -> IdfReadResult:
|
||||
"""Parse a Thor ``.IDFW`` binary into an ``IdfEvent`` + decoded samples.
|
||||
|
||||
Implements signature-A files: ``.IDFW`` waveforms are decoded to
per-channel samples and ``.IDFH`` histograms to per-interval records
via :func:`decode_idfh_body`. Signature-B (old-firmware) files are
not decoded here; use the paired ``.IDFW.txt`` / ``.IDFH.txt`` sidecar
for those via ``parse_idf_report()``.
|
||||
|
||||
Returns an :class:`IdfReadResult`. The caller converts int sample
|
||||
counts to physical units via :func:`geo_count_to_ips` /
|
||||
:func:`mic_count_to_psi`.
|
||||
|
||||
``path`` is used for filename in error messages and ``.IDFH`` vs
|
||||
``.IDFW`` suffix detection. When ``data`` is supplied the disk
|
||||
read is skipped — useful for ingest paths that already have the
|
||||
bytes in memory and where the file may not exist on disk yet.
|
||||
"""
|
||||
p = Path(path)
|
||||
buf = data if data is not None else p.read_bytes()
|
||||
|
||||
if len(buf) < 16 or buf[6:16] != _INSTANTEL_TAG + b"\x00":
|
||||
raise ValueError(f"{p.name}: not an IDF file (missing Instantel magic)")
|
||||
|
||||
sig_prefix = buf[:6]
|
||||
if sig_prefix == _THOR_PREFIX:
|
||||
signature = "thor"
|
||||
elif sig_prefix == _BW_STRAY_PREFIX:
|
||||
raise NotImplementedError(
|
||||
f"{p.name}: file has a Series III (Blastware) STRT header in "
|
||||
"an IDF-named container — not a Thor binary. Route through "
|
||||
"minimateplus.event_file_io.read_blastware_file() instead "
|
||||
"(peaks decode; samples & full metadata don't, but it's not "
|
||||
"Thor data so the Thor codec doesn't apply)."
|
||||
)
|
||||
else:
|
||||
raise ValueError(f"{p.name}: unknown IDF signature {sig_prefix.hex()}")
|
||||
|
||||
is_histogram = p.suffix.upper() == ".IDFH"
|
||||
md = extract_binary_metadata(buf)
|
||||
|
||||
if is_histogram:
|
||||
intervals = decode_idfh_body(buf)
|
||||
if not intervals:
|
||||
raise ValueError(f"{p.name}: IDFH body decoded no intervals")
|
||||
# Peaks: max across all intervals on each channel (per-channel max
|
||||
# of stored max-magnitudes; sidecar's PPV row carries the same).
|
||||
peak_tran = max((iv.peak_ips("Tran") for iv in intervals), default=0.0)
|
||||
peak_vert = max((iv.peak_ips("Vert") for iv in intervals), default=0.0)
|
||||
peak_long = max((iv.peak_ips("Long") for iv in intervals), default=0.0)
|
||||
# Mic peak in psi — Thor stores per-interval mic ADC counts in the
|
||||
# binary; convert the max count to psi via the per-count factor.
|
||||
mic_peak_count = max((iv.peak_count("MicL") for iv in intervals), default=0)
|
||||
mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
|
||||
rep = IdfReport(
|
||||
serial_number=md.serial,
|
||||
event_type="Full Histogram",
|
||||
event_datetime=md.event_datetime,
|
||||
filename=p.name,
|
||||
sample_rate=md.sample_rate,
|
||||
record_time_sec=md.record_time_sec,
|
||||
)
|
||||
peaks = IdfPeaks(
|
||||
transverse_ips=peak_tran,
|
||||
vertical_ips=peak_vert,
|
||||
longitudinal_ips=peak_long,
|
||||
peak_vector_sum_ips=None,
|
||||
mic_pspl_dbl=None, # IDFH binary doesn't carry the dB(L) value
|
||||
mic_pspl_psi=mic_peak_psi,
|
||||
)
|
||||
event = IdfEvent(
|
||||
serial=md.serial or "UNKNOWN",
|
||||
timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
|
||||
kind="Histogram",
|
||||
filename=p.name,
|
||||
sample_rate=md.sample_rate,
|
||||
record_time_sec=md.record_time_sec,
|
||||
peaks=peaks,
|
||||
report=rep,
|
||||
)
|
||||
return IdfReadResult(
|
||||
event=event,
|
||||
samples={},
|
||||
binary_metadata=md,
|
||||
signature=signature,
|
||||
intervals=intervals,
|
||||
)
|
||||
|
||||
# Waveform path.
|
||||
decoded = _decode_waveform_samples(buf)
|
||||
if decoded is None:
|
||||
raise ValueError(f"{p.name}: waveform body codec failed")
|
||||
|
||||
rep = IdfReport(
|
||||
serial_number=md.serial,
|
||||
event_type="Full Waveform",
|
||||
event_datetime=md.event_datetime,
|
||||
filename=p.name,
|
||||
sample_rate=md.sample_rate,
|
||||
record_time_sec=md.record_time_sec,
|
||||
)
|
||||
|
||||
def _peak_ips(ch: str) -> float:
|
||||
arr = decoded.get(ch, [])
|
||||
return geo_count_to_ips(max((abs(v) for v in arr), default=0))
|
||||
|
||||
# Mic peak psi from binary: max absolute MicL ADC count × 2.14e-6 psi/count.
|
||||
mic_arr = decoded.get("MicL", [])
|
||||
mic_peak_count = max((abs(v) for v in mic_arr), default=0)
|
||||
mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
|
||||
|
||||
peaks = IdfPeaks(
|
||||
transverse_ips=_peak_ips("Tran"),
|
||||
vertical_ips=_peak_ips("Vert"),
|
||||
longitudinal_ips=_peak_ips("Long"),
|
||||
# PVS requires aligned per-sample √(T²+V²+L²); leave None — the
|
||||
# sidecar carries it and the bridge picks it up if present.
|
||||
peak_vector_sum_ips=None,
|
||||
mic_pspl_dbl=None, # binary IDFW doesn't carry the dB(L) value;
|
||||
# sidecar .txt fills it via IdfReport.from_dict
|
||||
mic_pspl_psi=mic_peak_psi,
|
||||
)
|
||||
|
||||
event = IdfEvent(
|
||||
serial=md.serial or "UNKNOWN",
|
||||
timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
|
||||
kind="Waveform",
|
||||
filename=p.name,
|
||||
sample_rate=md.sample_rate,
|
||||
record_time_sec=md.record_time_sec,
|
||||
peaks=peaks,
|
||||
report=rep,
|
||||
)
|
||||
|
||||
return IdfReadResult(
|
||||
event=event,
|
||||
samples=decoded,
|
||||
binary_metadata=md,
|
||||
signature=signature,
|
||||
)
|
||||
|
||||
@@ -0,0 +1,323 @@
|
||||
"""
|
||||
micromate/idf_to_bw_report.py — adapter that projects a parsed Thor IDF
|
||||
report (+ binary metadata + decoded IDFH intervals) into the
|
||||
``bw_report``-shaped dict that :mod:`sfm.report_pdf.gather_report_data`
|
||||
consumes.
|
||||
|
||||
Lets Thor events flow through the existing Series III Event Report PDF
|
||||
pipeline without duplicating the renderer. Thor's report content is
|
||||
~95% the same data shape as BW's; the field names differ but the
|
||||
underlying metrics map 1:1.
|
||||
|
||||
Caveats
|
||||
───────
|
||||
|
||||
- **Mic units** — Thor records ``MicPSPL`` natively in dB(L). This
|
||||
adapter sets ``bw_report.mic.pspl_dbl`` directly; the report
|
||||
renderer recomputes the equivalent psi via its dBL→psi formula.
|
||||
- **Saturation / above-range flags** — Thor doesn't always mark
|
||||
``OORANGE`` the way BW does; we set ``zc_freq_above_range`` only
|
||||
when a `>100` sentinel was preserved in the raw text.
|
||||
- **Per-interval data** — for IDFH events we build ``interval_times``
|
||||
by stepping ``IntervalSize`` from ``HistogramStartTime``; the binary
|
||||
decoder confirms one record per step (882 / 881 / 881 ... across
|
||||
the corpus).
|
||||
- **calibration_by parsing** — Thor's free-form ``Calibration : November
|
||||
22, 2023 by Instantel`` is split on ``" by "`` to extract the
|
||||
calibrator; the date prefix is parsed where possible, otherwise
|
||||
the binary-extracted ``calibration_date`` from
|
||||
:class:`micromate.idf_file.IdfBinaryMetadata` wins.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import re
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
|
||||
# ─── Helpers ────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")
|
||||
|
||||
|
||||
def _parse_first_number(s: Optional[str]) -> Optional[float]:
|
||||
"""Pull the first numeric token from a string like ``"0.1500 in/s"``."""
|
||||
if s is None:
|
||||
return None
|
||||
m = _NUM_RE.search(str(s))
|
||||
if not m:
|
||||
return None
|
||||
try:
|
||||
return float(m.group(0))
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
|
||||
def _parse_interval_size_s(s: Optional[str]) -> Optional[float]:
|
||||
"""``"60 sec"`` → 60.0, ``"5 min"`` → 300.0, ``"1 hour"`` → 3600."""
|
||||
if s is None:
|
||||
return None
|
||||
num = _parse_first_number(s)
|
||||
if num is None:
|
||||
return None
|
||||
sl = str(s).lower()
|
||||
if "hour" in sl or "hr" in sl:
|
||||
return num * 3600.0
|
||||
if "min" in sl:
|
||||
return num * 60.0
|
||||
return num # default to seconds
|
||||
|
||||
|
||||
def _parse_calibration(text: Optional[str]) -> tuple[Optional[str], Optional[str]]:
|
||||
"""Split ``"November 22, 2023 by Instantel"`` → (ISO date, calibrator).
|
||||
|
||||
Returns ``(None, None)`` if neither half parses.
|
||||
"""
|
||||
if not text:
|
||||
return None, None
|
||||
parts = str(text).split(" by ", 1)
|
||||
date_part = parts[0].strip() if parts else None
|
||||
by_part = parts[1].strip() if len(parts) > 1 else None
|
||||
iso_date: Optional[str] = None
|
||||
if date_part:
|
||||
for fmt in ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y"):
|
||||
try:
|
||||
iso_date = datetime.datetime.strptime(date_part, fmt).date().isoformat()
|
||||
break
|
||||
except ValueError:
|
||||
continue
|
||||
return iso_date, by_part
|
||||
|
||||
|
||||
def _channel_peaks(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
|
||||
"""Map ``tran_ppv`` / ``tran_zc_freq`` / ... → bw_report.peaks.tran shape."""
|
||||
out: Dict[str, Any] = {}
|
||||
for src, dst in (
|
||||
(f"{ch_lc}_ppv", "ppv_ips"),
|
||||
(f"{ch_lc}_zc_freq", "zc_freq_hz"),
|
||||
(f"{ch_lc}_time_of_peak", "time_of_peak_s"),
|
||||
(f"{ch_lc}_peak_acceleration", "peak_accel_g"),
|
||||
(f"{ch_lc}_peak_displacement", "peak_disp_in"),
|
||||
):
|
||||
v = idf.get(src)
|
||||
if v is not None:
|
||||
out[dst] = v
|
||||
# ZC freq ">100" sentinel: the raw text carries it under the un-typed
|
||||
# key (e.g. ``raw["tran_zc_freq"]`` would be ``">100"``), and our parser
|
||||
# dropped the typed entry. Detect that case and flag.
|
||||
raw_zc = idf.get(f"{ch_lc}_zc_freq")
|
||||
if isinstance(raw_zc, str) and ">" in raw_zc:
|
||||
out["zc_freq_above_range"] = True
|
||||
out.pop("zc_freq_hz", None)
|
||||
return out
|
||||
|
||||
|
||||
def _sensor_check(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
|
||||
out: Dict[str, Any] = {}
|
||||
fr = idf.get(f"{ch_lc}_test_freq")
|
||||
if fr is not None:
|
||||
out["freq_hz"] = _parse_first_number(fr)
|
||||
rt = idf.get(f"{ch_lc}_test_ratio")
|
||||
if rt is not None:
|
||||
out["ratio"] = _parse_first_number(rt)
|
||||
am = idf.get(f"{ch_lc}_test_amplitude")
|
||||
if am is not None:
|
||||
out["amplitude_mv"] = _parse_first_number(am)
|
||||
res = idf.get(f"{ch_lc}_test_results")
|
||||
if res is not None:
|
||||
out["result"] = str(res).strip()
|
||||
return {k: v for k, v in out.items() if v is not None}
|
||||
|
||||
|
||||
def _interval_times(idf: Dict[str, Any], n_intervals: Optional[int]) -> List[str]:
|
||||
"""Synthesise per-interval timestamps from start + interval_size × k.
|
||||
|
||||
Returns ``[]`` when start time or interval size is unknown.
|
||||
"""
|
||||
if not n_intervals:
|
||||
return []
|
||||
start_date = idf.get("histogram_start_date") or idf.get("event_date")
|
||||
start_time = idf.get("histogram_start_time") or idf.get("event_time")
|
||||
iv_str = idf.get("interval_size")
|
||||
iv_s = _parse_interval_size_s(iv_str)
|
||||
if not (start_date and start_time and iv_s):
|
||||
return []
|
||||
try:
|
||||
t0 = datetime.datetime.strptime(f"{start_date} {start_time}", "%Y-%m-%d %H:%M:%S")
|
||||
except ValueError:
|
||||
return []
|
||||
out = []
|
||||
for k in range(int(n_intervals)):
|
||||
t = t0 + datetime.timedelta(seconds=iv_s * (k + 1))
|
||||
out.append(t.isoformat())
|
||||
return out
|
||||
|
||||
|
||||
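As a quick illustration of the timestamp synthesis in `_interval_times`, the sketch below steps a start time by a fixed interval and labels each interval by its end time (the `k + 1` in the loop above). The helper name `interval_end_times` and the sample start time are assumptions made for the example.

```python
import datetime
from typing import List


def interval_end_times(start: datetime.datetime, step_s: float, n: int) -> List[str]:
    """ISO timestamps for the end of each of n intervals of step_s seconds."""
    return [
        (start + datetime.timedelta(seconds=step_s * (k + 1))).isoformat()
        for k in range(n)
    ]


times = interval_end_times(datetime.datetime(2023, 12, 19, 16, 0, 0), 60.0, 3)
# times == ['2023-12-19T16:01:00', '2023-12-19T16:02:00', '2023-12-19T16:03:00']
```

Labelling by interval end means the first timestamp is one step after the histogram start, not the start itself.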
# ─── Top-level adapter ──────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def build_bw_report_from_idf(
|
||||
idf_report: Dict[str, Any],
|
||||
*,
|
||||
binary_md=None,
|
||||
intervals: Optional[list] = None,
|
||||
is_histogram: Optional[bool] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""Project a parsed IDF report dict (and optional binary metadata +
|
||||
decoded IDFH intervals) into the BW report sidecar shape.
|
||||
|
||||
The returned dict is structurally identical to what
|
||||
``minimateplus.event_file_io._bw_report_to_dict`` produces from a
|
||||
real BW ASCII report — it can be assigned to
|
||||
``sidecar["bw_report"]`` and consumed verbatim by
|
||||
``sfm.report_pdf.gather_report_data``.
|
||||
|
||||
``intervals`` is the list of :class:`micromate.idf_file.IdfhInterval`
|
||||
objects from :func:`micromate.idf_file.decode_idfh_body`; only used
|
||||
for histogram events to derive accurate ``interval_times``.
|
||||
"""
|
||||
if is_histogram is None:
|
||||
et = str(idf_report.get("event_type", ""))
|
||||
is_histogram = et.lower().startswith("full histogram")
|
||||
|
||||
# ── Trigger / recording / device ─────────────────────────────────────
|
||||
trigger_channel = idf_report.get("trigger")
|
||||
trigger_level = _parse_first_number(idf_report.get("geo_trigger_level"))
|
||||
geo_range_ips = _parse_first_number(idf_report.get("geo_range"))
|
||||
|
||||
cal_iso, cal_by = _parse_calibration(idf_report.get("calibration"))
|
||||
# Prefer the binary-extracted calibration_date when our text parse fell
|
||||
# through; the binary date is unambiguous.
|
||||
if cal_iso is None and binary_md is not None and binary_md.calibration_date:
|
||||
cal_iso = binary_md.calibration_date.isoformat()
|
||||
|
||||
# ── Histogram fields ────────────────────────────────────────────────
|
||||
hist_block: Dict[str, Any] = {
|
||||
"start": None, "stop": None, "n_intervals": None,
|
||||
"interval_size": None, "interval_size_s": None,
|
||||
"channel_peak_when": {},
|
||||
}
|
||||
if is_histogram:
|
||||
sd = idf_report.get("histogram_start_date")
|
||||
st = idf_report.get("histogram_start_time")
|
||||
if sd and st:
|
||||
try:
|
||||
hist_block["start"] = datetime.datetime.strptime(
|
||||
f"{sd} {st}", "%Y-%m-%d %H:%M:%S"
|
||||
).isoformat()
|
||||
except ValueError:
|
||||
pass
|
||||
ed = idf_report.get("histogram_stop_date")
|
||||
et_ = idf_report.get("histogram_stop_time")
|
||||
if ed and et_:
|
||||
try:
|
||||
hist_block["stop"] = datetime.datetime.strptime(
|
||||
f"{ed} {et_}", "%Y-%m-%d %H:%M:%S"
|
||||
).isoformat()
|
||||
except ValueError:
|
||||
pass
|
||||
n_raw = idf_report.get("number_of_intervals")
|
||||
if n_raw is not None:
|
||||
try:
|
||||
# Thor reports a float like "81.04"; truncate to int (the BW
# report uses an int for the column).
|
||||
hist_block["n_intervals"] = int(float(str(n_raw)))
|
||||
except ValueError:
|
||||
pass
|
||||
# When the binary decoder gave us the actual interval count, prefer it.
|
||||
if intervals is not None:
|
||||
hist_block["n_intervals"] = len(intervals)
|
||||
hist_block["interval_size"] = idf_report.get("interval_size")
|
||||
hist_block["interval_size_s"] = _parse_interval_size_s(idf_report.get("interval_size"))
|
||||
# interval_times derived from start+step (the BW report uses the
|
||||
# exact strings; we match its representation).
|
||||
times = _interval_times(idf_report, hist_block["n_intervals"])
|
||||
# Per-channel peak when (absolute date+time at which the channel's
|
||||
# peak occurred over the histogram run). Thor splits this into
|
||||
# ``TranPeakDate`` / ``TranPeakTime`` etc.
|
||||
peak_when: Dict[str, str] = {}
|
||||
for ch_label, ch_lc in (("Tran", "tran"), ("Vert", "vert"), ("Long", "long"), ("MicL", "mic")):
|
||||
d = idf_report.get(f"{ch_lc}_peak_date")
|
||||
t = idf_report.get(f"{ch_lc}_peak_time")
|
||||
if d and t:
|
||||
try:
|
||||
peak_when[ch_label] = datetime.datetime.strptime(
|
||||
f"{d} {t}", "%Y-%m-%d %H:%M:%S"
|
||||
).isoformat()
|
||||
except ValueError:
|
||||
continue
|
||||
if peak_when:
|
||||
hist_block["channel_peak_when"] = peak_when
|
||||
|
||||
# ── Mic block ────────────────────────────────────────────────────────
|
||||
mic_block = {
|
||||
"weighting": "L", # Thor mic is ISEE Linear
|
||||
"pspl_dbl": idf_report.get("mic_ppv"), # the dB(L) float
|
||||
"pspl_saturated": False,
|
||||
"zc_freq_hz": idf_report.get("mic_zc_freq"),
|
||||
"zc_freq_above_range": isinstance(idf_report.get("mic_zc_freq"), str)
|
||||
and ">" in str(idf_report.get("mic_zc_freq")),
|
||||
"time_of_peak_s": idf_report.get("mic_time_of_peak"),
|
||||
}
|
||||
if mic_block["zc_freq_above_range"]:
|
||||
mic_block["zc_freq_hz"] = None
|
||||
|
||||
# ── Peaks ────────────────────────────────────────────────────────────
|
||||
vs_block = {
|
||||
"ips": idf_report.get("peak_vector_sum"),
|
||||
"time_s": _parse_first_number(idf_report.get("peak_vector_sum_time_sum")),
|
||||
"when": None,
|
||||
"saturated": False,
|
||||
}
|
||||
if is_histogram:
|
||||
# PVS absolute date+time, when present.
|
||||
vs_d = idf_report.get("peak_vector_sum_date")
|
||||
vs_t = idf_report.get("peak_vector_sum_time")
|
||||
if vs_d and vs_t:
|
||||
try:
|
||||
vs_block["when"] = datetime.datetime.strptime(
|
||||
f"{vs_d} {vs_t}", "%Y-%m-%d %H:%M:%S"
|
||||
).isoformat()
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
return {
|
||||
"available": True,
|
||||
"event_type": idf_report.get("event_type"),
|
||||
"version": idf_report.get("version"),
|
||||
"trigger": {
|
||||
"channel": trigger_channel,
|
||||
"geo_level_ips": trigger_level,
|
||||
},
|
||||
"recording": {
|
||||
"sample_rate_sps": idf_report.get("sample_rate"),
|
||||
"record_time_s": idf_report.get("record_time_sec"),
|
||||
"pretrig_s": idf_report.get("pre_trigger_sec"),
|
||||
"stop_mode": idf_report.get("record_stop_mode"),
|
||||
"geo_range_ips": geo_range_ips,
|
||||
"units": idf_report.get("units"),
|
||||
},
|
||||
"device": {
|
||||
"battery_volts": idf_report.get("battery_volts"),
|
||||
"calibration_date": cal_iso,
|
||||
"calibration_by": cal_by,
|
||||
},
|
||||
"peaks": {
|
||||
"tran": _channel_peaks(idf_report, "tran"),
|
||||
"vert": _channel_peaks(idf_report, "vert"),
|
||||
"long": _channel_peaks(idf_report, "long"),
|
||||
"vector_sum": vs_block,
|
||||
},
|
||||
"mic": mic_block,
|
||||
"sensor_check": {
|
||||
"tran": _sensor_check(idf_report, "tran"),
|
||||
"vert": _sensor_check(idf_report, "vert"),
|
||||
"long": _sensor_check(idf_report, "long"),
|
||||
"mic": _sensor_check(idf_report, "mic"),
|
||||
},
|
||||
"histogram": hist_block,
|
||||
"monitor_log": [],
|
||||
"pc_sw_version": None,
|
||||
}
|
||||
+27
-6
@@ -159,12 +159,23 @@ class IdfReport:
|
||||
|
||||
@dataclass
|
||||
class IdfPeaks:
|
||||
"""Geophone + mic peak values for one Thor event. Native Thor units."""
|
||||
"""Geophone + mic peak values for one Thor event. Native Thor units.
|
||||
|
||||
Thor stores the mic peak in two parallel forms — ``mic_pspl_dbl`` is
|
||||
what the sidecar's top-level ``MicPSPL`` header field carries (dB(L)),
|
||||
used in the report header. ``mic_pspl_psi`` is the psi value derived
|
||||
either from the IDFW sample table / IDFH interval column 9, or from
|
||||
the binary mic counts (~2.14e-6 psi/count). Needed because the
|
||||
BW-shaped ``PeakValues.micl`` consumed by ``event_hdf5.write_event_hdf5``
|
||||
expects psi — feeding it dB(L) makes the h5 mic-chart scale factor
|
||||
blow up.
|
||||
"""
|
||||
transverse_ips: Optional[float] = None # in/s
|
||||
vertical_ips: Optional[float] = None # in/s
|
||||
longitudinal_ips: Optional[float] = None # in/s
|
||||
peak_vector_sum_ips: Optional[float] = None # in/s
|
||||
mic_pspl_dbl: Optional[float] = None # dB(L)
|
||||
mic_pspl_psi: Optional[float] = None # psi
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -324,10 +335,14 @@ class IdfEvent:
|
||||
machinery without those code paths needing to know about Thor.
|
||||
|
||||
Caveats of the bridge:
|
||||
- ``mic_ppv`` on the produced Event carries Thor's dB(L) value
|
||||
verbatim — the UI distinguishes via the ``device_family``
|
||||
column (Phase 1). Don't run the BW psi→dBL converter on
|
||||
Series IV rows.
|
||||
- ``PeakValues.micl`` carries the mic peak in **psi** (matching
|
||||
BW's convention) — set from :attr:`IdfPeaks.mic_pspl_psi`,
|
||||
with a dB(L)→psi fallback when only the dB(L) value is
|
||||
available. This is what the h5 writer's mic-scale-factor
|
||||
logic needs. The dB(L) value still flows through
|
||||
``bw_report.mic.pspl_dbl`` (set by the
|
||||
``idf_to_bw_report`` adapter) and the renderer reads it
|
||||
from there for the report header.
|
||||
- Many Thor-specific fields (Peak Acceleration / Displacement,
|
||||
sensor self-check, calibration) don't have a slot in
|
||||
``Event``. The full IdfReport is preserved on the
|
||||
@@ -349,11 +364,17 @@ class IdfEvent:
|
||||
minute=self.timestamp.minute,
|
||||
second=self.timestamp.second,
|
||||
)
|
||||
# Resolve mic peak as psi. Priority: binary-derived mic_pspl_psi
|
||||
# (set by read_idf_file) > dB(L)→psi fallback via standard formula
|
||||
# (psi = 2.9e-9 × 10^(dBL/20)) > None.
|
||||
mic_psi = self.peaks.mic_pspl_psi
|
||||
if mic_psi is None and self.peaks.mic_pspl_dbl is not None:
|
||||
mic_psi = 2.9e-9 * (10.0 ** (self.peaks.mic_pspl_dbl / 20.0))
|
||||
pv = PeakValues(
|
||||
tran=self.peaks.transverse_ips,
|
||||
vert=self.peaks.vertical_ips,
|
||||
long=self.peaks.longitudinal_ips,
|
||||
-micl=self.peaks.mic_pspl_dbl, # dB(L) — see caveat above
+micl=mic_psi, # psi, matching BW's convention (h5 scaling depends on this)
|
||||
peak_vector_sum=self.peaks.peak_vector_sum_ips,
|
||||
)
|
||||
pi = ProjectInfo(
|
||||
|
||||
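The dB(L) to psi fallback used in the bridge is the standard sound-pressure-level relation with the 20 µPa reference expressed in psi (about 2.9e-9). A minimal sketch follows; the function names `dbl_to_psi` and `psi_to_dbl` are hypothetical, and only the constant and the formula come from the diff.

```python
import math

P_REF_PSI = 2.9e-9  # 20 µPa reference pressure in psi, as used in the diff


def dbl_to_psi(dbl: float) -> float:
    """psi = P_ref * 10^(dBL / 20)."""
    return P_REF_PSI * 10.0 ** (dbl / 20.0)


def psi_to_dbl(psi: float) -> float:
    """Inverse of dbl_to_psi; psi must be positive."""
    return 20.0 * math.log10(psi / P_REF_PSI)


psi_at_120 = dbl_to_psi(120.0)  # 10^6 times the reference pressure
```

This shows why feeding dB(L) where psi is expected breaks the chart scaling: a reading of 120 dB(L) is only about 0.0029 psi, so a value of 120 taken as psi is off by more than four orders of magnitude.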
@@ -67,6 +67,11 @@ class ChannelStats:
|
||||
# to render "> 10 in/s" or "saturated" instead of trusting the
|
||||
# value as an exact measurement.
|
||||
ppv_saturated: bool = False
|
||||
# Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
|
||||
# algorithm's peak frequency exceeded the device's reporting
|
||||
# ceiling (typically 100 Hz on V10.72). zc_freq_hz gets the
|
||||
# threshold (100.0) as a lower bound; downstream UI renders ">100".
|
||||
zc_freq_above_range: bool = False
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -81,6 +86,9 @@ class MicStats:
|
||||
# 140 dBL (typical NL-43 max; some units cap at 148). Consumers
|
||||
# should render "> 140 dB(L)" or similar when this flag is set.
|
||||
pspl_saturated: bool = False
|
||||
# Same semantics as ChannelStats.zc_freq_above_range — mic ZC
|
||||
# peak exceeded device reporting ceiling.
|
||||
zc_freq_above_range: bool = False
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -119,6 +127,20 @@ def _is_oorange(value: str) -> bool:
|
||||
return any(m in s for m in _OORANGE_MARKERS)
|
||||
|
||||
|
||||
def _parse_above_range(value: str) -> Optional[float]:
|
||||
"""For BW "above-range" markers like ">100 Hz", return the threshold.
|
||||
|
||||
BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
|
||||
a peak too fast to count (device cuts off at 100 Hz). Returns the
|
||||
numeric portion after the '>' (e.g. 100.0), or None if `value` is
|
||||
not an above-range marker.
|
||||
"""
|
||||
s = value.strip()
|
||||
if not s.startswith(">"):
|
||||
return None
|
||||
return _parse_number(s[1:])
|
||||
|
||||
|
||||
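The threshold-plus-flag convention for ">100 Hz" can be sketched as a single helper. This is an illustration only: `parse_zc_freq` and the `_NUM` regex are hypothetical stand-ins, since the real parser relies on `_parse_number`, whose implementation is not shown in this diff.

```python
import re
from typing import Optional, Tuple

_NUM = re.compile(r"-?\d+(?:\.\d+)?")  # assumed numeric token pattern


def parse_zc_freq(value: str) -> Tuple[Optional[float], bool]:
    """Return (frequency_hz, above_range) for a ZC Freq cell."""
    s = value.strip()
    above = s.startswith(">")
    m = _NUM.search(s[1:] if above else s)
    if m is None:
        return None, False
    # For ">100 Hz" the number is a lower bound, flagged via above_range.
    return float(m.group(0)), above


capped = parse_zc_freq(">100 Hz")
exact = parse_zc_freq("42.7 Hz")
missing = parse_zc_freq("N/A")
```

Keeping the threshold in the numeric field while raising a separate flag lets consumers that ignore the flag still sort and plot, and lets the UI render ">100" when it checks the flag.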
@dataclass
|
||||
class BwAsciiReport:
|
||||
"""Structured representation of one BW per-event ASCII export."""
|
||||
@@ -527,10 +549,17 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
|
||||
cs.ppv_saturated = True
|
||||
else:
|
||||
cs.ppv_ips = _parse_number(value)
|
||||
elif stat == "ZC Freq":
|
||||
# ">100 Hz" → store threshold + flag; numeric → parse normally
|
||||
threshold = _parse_above_range(value)
|
||||
if threshold is not None:
|
||||
cs.zc_freq_hz = threshold
|
||||
cs.zc_freq_above_range = True
|
||||
else:
|
||||
cs.zc_freq_hz = _parse_number(value)
|
||||
else:
|
||||
num = _parse_number(value)
|
||||
-if stat == "ZC Freq": cs.zc_freq_hz = num
-elif stat == "Time of Peak": cs.time_of_peak_s = num
+if stat == "Time of Peak": cs.time_of_peak_s = num
|
||||
elif stat == "Peak Acceleration": cs.peak_accel_g = num
|
||||
elif stat == "Peak Displacement": cs.peak_disp_in = num
|
||||
|
||||
@@ -627,9 +656,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
|
||||
cs = report.channels.setdefault("MicL", ChannelStats())
|
||||
cs.time_of_peak_s = report.mic.time_of_peak_s
|
||||
elif key == "MicL ZC Freq":
|
||||
report.mic.zc_freq_hz = _parse_number(value)
|
||||
threshold = _parse_above_range(value)
|
||||
if threshold is not None:
|
||||
report.mic.zc_freq_hz = threshold
|
||||
report.mic.zc_freq_above_range = True
|
||||
else:
|
||||
report.mic.zc_freq_hz = _parse_number(value)
|
||||
cs = report.channels.setdefault("MicL", ChannelStats())
|
||||
cs.zc_freq_hz = report.mic.zc_freq_hz
|
||||
cs.zc_freq_hz = report.mic.zc_freq_hz
|
||||
cs.zc_freq_above_range = report.mic.zc_freq_above_range
|
||||
|
||||
# ── Sensor self-check ────────────────────────────────────────────────
|
||||
elif key in (
|
||||
|
||||
@@ -49,7 +49,7 @@ SIDECAR_KIND = "sfm.event"
|
||||
# bumped without a `pip install` re-run — leading to confusing stale
|
||||
# version stamps in sidecars. Bump this constant and CHANGELOG.md
|
||||
# together at release time.
|
||||
-TOOL_VERSION = "0.20.0"
+TOOL_VERSION = "0.21.1"
|
||||
|
||||
try:
|
||||
# Best-effort: prefer the installed metadata when it's NEWER than the
|
||||
@@ -125,6 +125,10 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
|
||||
# is the channel range max (a lower bound), not an exact reading.
|
||||
if getattr(cs, "ppv_saturated", False):
|
||||
out["ppv_saturated"] = True
|
||||
# ZC Freq above device reporting ceiling (BW ">100 Hz") — value
|
||||
# in zc_freq_hz is the threshold, not an exact measurement.
|
||||
if getattr(cs, "zc_freq_above_range", False):
|
||||
out["zc_freq_above_range"] = True
|
||||
return out
|
||||
|
||||
def _sc(ch_name: str) -> dict:
|
||||
@@ -187,11 +191,12 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
|
||||
},
|
||||
},
|
||||
"mic": {
|
||||
"weighting": report.mic.weighting,
|
||||
"pspl_dbl": report.mic.pspl_dbl,
|
||||
"pspl_saturated": bool(getattr(report.mic, "pspl_saturated", False)),
|
||||
"zc_freq_hz": report.mic.zc_freq_hz,
|
||||
"time_of_peak_s": report.mic.time_of_peak_s,
|
||||
"weighting": report.mic.weighting,
|
||||
"pspl_dbl": report.mic.pspl_dbl,
|
||||
"pspl_saturated": bool(getattr(report.mic, "pspl_saturated", False)),
|
||||
"zc_freq_hz": report.mic.zc_freq_hz,
|
||||
"zc_freq_above_range": bool(getattr(report.mic, "zc_freq_above_range", False)),
|
||||
"time_of_peak_s": report.mic.time_of_peak_s,
|
||||
},
|
||||
"sensor_check": {
|
||||
"tran": _sc("Tran"),
|
||||
|
||||
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "seismo-relay"
|
||||
version = "0.19.0"
|
||||
version = "0.21.1"
|
||||
description = "Python client and REST server for MiniMate Plus seismographs"
|
||||
requires-python = ">=3.10"
|
||||
dependencies = [
|
||||
|
||||
@@ -103,6 +103,17 @@ def main(argv=None) -> int:
|
||||
"STRT-rectime byte-offset fix in v0.15.x)."
|
||||
),
|
||||
)
|
||||
p.add_argument(
|
||||
"--reparse-txt", action="store_true",
|
||||
help=(
|
||||
"Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
|
||||
"current bw_ascii_report parser and overwrite the sidecar's "
|
||||
"bw_report block. Use this after upgrading the ASCII parser to "
|
||||
"pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
|
||||
"ZC peaks). No-op for events without a preserved .TXT; safely "
|
||||
"idempotent when the parser hasn't changed."
|
||||
),
|
||||
)
|
||||
p.add_argument("-v", "--verbose", action="store_true")
|
||||
args = p.parse_args(argv)
|
||||
|
||||
@@ -153,7 +164,7 @@ def main(argv=None) -> int:
|
||||
# of the sidecar implies staleness of the derived .h5 (both
|
||||
# come out of the same decoder).
|
||||
sidecar_stale = True
|
||||
if sidecar_path.exists() and not args.force:
|
||||
if sidecar_path.exists() and not args.force and not args.reparse_txt:
|
||||
try:
|
||||
existing = event_file_io.read_sidecar(sidecar_path)
|
||||
sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
|
||||
@@ -314,6 +325,24 @@ def main(argv=None) -> int:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# --reparse-txt: if a .TXT is preserved on disk, run the
|
||||
# current parser against it and overwrite the bw_report
|
||||
# block. Picks up post-ingest parser fixes (e.g. the
|
||||
# 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
|
||||
if args.reparse_txt and preserved_txt_fn:
|
||||
try:
|
||||
from minimateplus import bw_ascii_report
|
||||
txt_path = store.txt_path_for(serial, path.name)
|
||||
if txt_path.exists():
|
||||
refreshed = bw_ascii_report.parse_report_file(txt_path)
|
||||
preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
|
||||
log.debug("reparsed bw_report from %s", txt_path.name)
|
||||
else:
|
||||
log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
|
||||
txt_path, preserved_txt_fn)
|
||||
except Exception as exc:
|
||||
log.warning("--reparse-txt failed for %s: %s", path.name, exc)
|
||||
|
||||
# Overlay BW ASCII report fields onto the rebuilt Event
|
||||
# BEFORE the sidecar + DB write. Mirrors what the ingest
|
||||
# path does — BW's reported peaks (and sample_rate /
|
||||
|
||||
@@ -0,0 +1,331 @@
|
||||
"""
|
||||
scripts/backfill_thor_events.py — re-process existing Thor (Series IV)
|
||||
events so their sidecars carry the bw_report block produced by
|
||||
``micromate.idf_to_bw_report.build_bw_report_from_idf`` + their .h5
|
||||
clean-waveform files for IDFW events.
|
||||
|
||||
Why this exists
|
||||
───────────────
|
||||
|
||||
Thor events ingested before v0.21.0 (or during the v0.21.0 ingest bug
|
||||
window fixed in commit bee1185) have sidecars with only
|
||||
``extensions.idf_report`` — no ``bw_report`` block. Without
|
||||
``bw_report``, the SFM PDF renderer falls back to DB-only fields
|
||||
(misses sensor-self-check, full per-channel breakdown, mic dB(L)),
|
||||
and the modal chart 404s on ``/waveform.json`` for IDFW events
|
||||
because no .h5 was written when the codec failed at ingest.
|
||||
|
||||
Re-forwarding from thor-watcher would also fix this, but that requires
|
||||
operator coordination on every watcher machine and uses bandwidth this
|
||||
script doesn't.
|
||||
|
||||
What this does
|
||||
──────────────
|
||||
|
||||
Walks ``<store>/<serial>/<filename>`` for ``.IDFW`` / ``.IDFH`` files
|
||||
and, for each one:
|
||||
|
||||
1. Reads the existing sidecar (preserving review state + captured_at).
|
||||
2. Re-runs ``micromate.idf_file.read_idf_file()`` on the binary
|
||||
bytes — passing ``data=`` so the codec doesn't try to read from
|
||||
a path it doesn't know.
|
||||
3. Pulls ``extensions.idf_report`` (the raw parsed Thor dict the
|
||||
v0.18.0+ ingest path already stashed) and runs the v0.21.0
|
||||
``build_bw_report_from_idf`` adapter against it.
|
||||
4. Writes the refreshed sidecar with the new ``bw_report``,
|
||||
bumped ``source.tool_version``, but preserved ``review`` block
|
||||
+ the original ``captured_at`` timestamp.
|
||||
5. Regenerates the .h5 waveform file via the existing
|
||||
``event_hdf5`` writer. For IDFW that's the decoded per-sample
|
||||
stream; for IDFH it's a 1-sample-per-interval synthesised array
|
||||
(peak ADC count per channel) so the renderer's bar-chart code
|
||||
has data to group on. Mic peak psi from the binary is merged
|
||||
onto the IdfEvent before the bridge so the h5 writer's per-count
|
||||
mic scale factor lands on a sensible value (without this the
|
||||
mic chart on Thor events plots dB(L)-as-pseudo-psi and shows
|
||||
bomb-level numbers).
|
||||
|
||||
Idempotent. Re-running it after a parser/adapter change just
|
||||
re-writes sidecars (and .h5 files unless --skip-hdf5 is set); no DB writes, no thor-watcher coordination.
|
||||
|
||||
Usage
|
||||
─────
|
||||
|
||||
python scripts/backfill_thor_events.py [--store-root PATH]
|
||||
[--dry-run]
|
||||
[--skip-hdf5]
|
||||
[--force]
|
||||
[-v]
|
||||
|
||||
By default, refreshes any Thor event whose sidecar is missing
|
||||
``bw_report`` OR whose ``source.tool_version`` is older than the
|
||||
current ``TOOL_VERSION``. ``--force`` refreshes every Thor event
|
||||
regardless.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import logging
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Allow running from the repo root without installation.
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
||||
|
||||
from minimateplus import event_file_io
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
log = logging.getLogger("backfill_thor_events")
|
||||
|
||||
|
||||
def _is_thor_event(path: Path) -> bool:
|
||||
if not path.is_file():
|
||||
return False
|
||||
if path.name.endswith((".sfm.json", ".h5", "_ASCII.TXT")):
|
||||
return False
|
||||
return path.suffix.upper() in (".IDFW", ".IDFH")
|
||||
|
||||
|
||||
def _vtuple(s: str) -> tuple:
|
||||
try:
|
||||
return tuple(int(p) for p in str(s).split(".")[:3])
|
||||
except Exception:
|
||||
return (0, 0, 0)
|
||||
|
||||
|
||||
def main(argv=None) -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__)
|
||||
p.add_argument(
|
||||
"--db-path",
|
||||
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
|
||||
help="Used only to derive the default --store-root.",
|
||||
)
|
||||
p.add_argument("--store-root", default=None)
|
||||
p.add_argument("--dry-run", action="store_true")
|
||||
p.add_argument("--skip-hdf5", action="store_true",
|
||||
help="Don't regenerate .h5 files (IDFW or IDFH events).")
|
||||
p.add_argument("--force", action="store_true",
|
||||
help="Refresh every Thor event, not just ones with stale or missing bw_report.")
|
||||
p.add_argument("-v", "--verbose", action="store_true")
|
||||
args = p.parse_args(argv)
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.DEBUG if args.verbose else logging.INFO,
|
||||
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
|
||||
datefmt="%H:%M:%S",
|
||||
)
|
||||
|
||||
db_path = Path(args.db_path).expanduser().resolve()
|
||||
store_root = (
|
||||
Path(args.store_root).expanduser().resolve()
|
||||
if args.store_root else db_path.parent / "waveforms"
|
||||
)
|
||||
if not store_root.exists():
|
||||
log.error("store root not found: %s", store_root)
|
||||
return 1
|
||||
store = WaveformStore(store_root)
|
||||
log.info("store root: %s", store_root)
|
||||
log.info("current TOOL_VERSION: %s", event_file_io.TOOL_VERSION)
|
||||
|
||||
refreshed = skipped = errors = h5_written = 0
|
||||
|
||||
# Lazy imports so any one of these failing produces a useful error
|
||||
# message rather than crashing module-load.
|
||||
from micromate.idf_file import read_idf_file
|
||||
from micromate.idf_to_bw_report import build_bw_report_from_idf
|
||||
|
||||
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
|
||||
serial = serial_dir.name
|
||||
for path in sorted(serial_dir.iterdir()):
|
||||
if not _is_thor_event(path):
|
||||
continue
|
||||
|
||||
sidecar_path = store.sidecar_path_for(serial, path.name)
|
||||
if not sidecar_path.exists():
|
||||
log.debug("%s: no sidecar — skipping (this is a binary without ingest history)",
|
||||
path.name)
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
try:
|
||||
existing = event_file_io.read_sidecar(sidecar_path)
|
||||
except Exception as exc:
|
||||
log.warning("%s: failed to read sidecar — %s", path.name, exc)
|
||||
errors += 1
|
||||
continue
|
||||
|
||||
has_bw_report = bool(existing.get("bw_report"))
|
||||
existing_version = (existing.get("source") or {}).get("tool_version", "")
|
||||
up_to_date = (
|
||||
has_bw_report
|
||||
and _vtuple(existing_version) >= _vtuple(event_file_io.TOOL_VERSION)
|
||||
)
|
||||
if up_to_date and not args.force:
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
# Re-decode the binary. Catch + log; continue with .txt-only
|
||||
# data if it fails (matches the live ingest path's behavior).
|
||||
idf_samples = None
|
||||
idf_intervals = None
|
||||
binary_md = None
res = None  # reset per file so the later `res is not None` guard never sees a stale result
|
||||
is_histogram = path.suffix.upper() == ".IDFH"
|
||||
try:
|
||||
binary_bytes = path.read_bytes()
|
||||
res = read_idf_file(path, data=binary_bytes)
|
||||
idf_samples = res.samples or None
|
||||
idf_intervals = res.intervals
|
||||
binary_md = res.binary_metadata
|
||||
is_histogram = res.intervals is not None
|
||||
except NotImplementedError:
|
||||
# sig-B / Blastware-stray binary; no samples but adapter
|
||||
# can still produce a bw_report from extensions.idf_report.
|
||||
log.debug("%s: binary codec NotImplementedError (sig-B / BW-stray); proceeding from sidecar's idf_report only", path.name)
|
||||
except Exception as exc:
|
||||
log.warning("%s: binary decode failed — %s; proceeding from sidecar's idf_report only", path.name, exc)
|
||||
|
||||
# Run the adapter. Pull report_dict from
|
||||
# extensions.idf_report (the v0.18.0+ ingest preserved it).
|
||||
report_dict = (existing.get("extensions") or {}).get("idf_report") or {}
|
||||
if not report_dict and binary_md is None:
|
||||
log.debug("%s: no idf_report in sidecar AND no binary metadata — nothing to project", path.name)
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
try:
|
||||
bw_report = build_bw_report_from_idf(
|
||||
report_dict, binary_md=binary_md,
|
||||
intervals=idf_intervals, is_histogram=is_histogram,
|
||||
)
|
||||
except Exception as exc:
|
||||
log.warning("%s: adapter failed — %s", path.name, exc)
|
||||
errors += 1
|
||||
continue
|
||||
|
||||
# Build the new sidecar by overlaying refreshed fields onto
|
||||
# the existing one — preserves review, captured_at, blastware
|
||||
# block, source.kind, etc.
|
||||
new_sidecar = dict(existing) # shallow copy
|
||||
new_sidecar["bw_report"] = bw_report
|
||||
src = dict(new_sidecar.get("source") or {})
|
||||
src["tool_version"] = event_file_io.TOOL_VERSION
|
||||
new_sidecar["source"] = src
|
||||
|
||||
# Preserve histogram intervals if the binary decoded them
|
||||
# (improves over the original ingest if that one ran before
|
||||
# the bee1185 codec fix).
|
||||
if idf_intervals is not None:
|
||||
ext = dict(new_sidecar.get("extensions") or {})
|
||||
ext["idf_intervals"] = [
|
||||
{
|
||||
"offset": iv.offset,
|
||||
"tran_peak": iv.peak_count("Tran"),
|
||||
"tran_halfp": iv.tran_halfp,
|
||||
"tran_freq": iv.freq_hz("Tran"),
|
||||
"vert_peak": iv.peak_count("Vert"),
|
||||
"vert_halfp": iv.vert_halfp,
|
||||
"vert_freq": iv.freq_hz("Vert"),
|
||||
"long_peak": iv.peak_count("Long"),
|
||||
"long_halfp": iv.long_halfp,
|
||||
"long_freq": iv.freq_hz("Long"),
|
||||
"mic_peak": iv.peak_count("MicL"),
|
||||
"mic_halfp": iv.micl_halfp,
|
||||
"mic_freq": iv.freq_hz("MicL"),
|
||||
}
|
||||
for iv in idf_intervals
|
||||
]
|
||||
new_sidecar["extensions"] = ext
|
||||
|
||||
if args.dry_run:
|
||||
will_write_h5 = (idf_samples or idf_intervals) and not args.skip_hdf5
|
||||
log.info("[DRY] %s/%s — would refresh sidecar (bw_report=%s, h5=%s)",
|
||||
serial, path.name,
|
||||
"wrote" if not has_bw_report else "refreshed",
|
||||
"would write" if will_write_h5 else "skipped")
|
||||
else:
|
||||
event_file_io.write_sidecar(sidecar_path, new_sidecar)
|
||||
log.info("%s/%s — sidecar refreshed (bw_report=%s, intervals=%d)",
|
||||
serial, path.name,
|
||||
"added" if not has_bw_report else "refreshed",
|
||||
len(idf_intervals) if idf_intervals else 0)
|
||||
refreshed += 1
|
||||
|
||||
# Regenerate .h5 by replaying the same IdfEvent → Event bridge
|
||||
# save_imported_idf uses. For IDFW we write the decoded per-
|
||||
# sample arrays. For IDFH we synthesise a 1-sample-per-interval
|
||||
# array (peak ADC count per channel per interval) so the
|
||||
# renderer's bar-chart code has something to group on.
|
||||
# Pre-condition: either real samples (IDFW) or decoded intervals
|
||||
# (IDFH). Skip otherwise.
|
||||
have_data = bool(idf_samples) or bool(idf_intervals)
|
||||
if have_data and not args.skip_hdf5:
|
||||
from sfm import event_hdf5
|
||||
hdf5_path = store.hdf5_path_for(serial, path.name)
|
||||
if args.dry_run:
|
||||
log.debug("[DRY] would write %s", hdf5_path.name)
|
||||
else:
|
||||
try:
|
||||
from micromate import IdfEvent
|
||||
from minimateplus.event_file_io import file_sha256
|
||||
idf_event = IdfEvent.from_report(report_dict, path.name)
|
||||
|
||||
# Merge the binary-derived mic peak psi (only the
|
||||
# binary path knows the proper psi value; the .txt
|
||||
# carries dB(L)). Without this, the h5 writer's
|
||||
# per-count mic factor is computed against the
|
||||
# dB(L) value-as-pseudo-psi and the mic chart
|
||||
# scales wildly.
|
||||
if (binary_md is not None and res is not None
|
||||
and res.event.peaks.mic_pspl_psi is not None):
|
||||
idf_event.peaks.mic_pspl_psi = res.event.peaks.mic_pspl_psi
|
||||
|
||||
sha256 = file_sha256(path)
|
||||
waveform_key = bytes.fromhex(sha256)[:16]
|
||||
ev = idf_event.to_minimateplus_event(waveform_key)
|
||||
|
||||
if is_histogram and idf_intervals:
|
||||
# 1 sample per interval per channel — same
|
||||
# synthesis save_imported_idf uses. The h5
|
||||
# writer's count×geo_fs/32768 conversion turns
|
||||
# each peak-ADC-count into the bar's physical
|
||||
# value.
|
||||
ev.raw_samples = {
|
||||
"Tran": [iv.peak_count("Tran") for iv in idf_intervals],
|
||||
"Vert": [iv.peak_count("Vert") for iv in idf_intervals],
|
||||
"Long": [iv.peak_count("Long") for iv in idf_intervals],
|
||||
"MicL": [iv.peak_count("MicL") for iv in idf_intervals],
|
||||
}
|
||||
ev.total_samples = ev.total_samples or len(idf_intervals)
|
||||
elif idf_samples:
|
||||
ev.raw_samples = idf_samples
|
||||
n_samp = max(
|
||||
(len(idf_samples.get(ch, []))
|
||||
for ch in ("Tran", "Vert", "Long", "MicL")),
|
||||
default=0,
|
||||
)
|
||||
ev.total_samples = ev.total_samples or n_samp
|
||||
|
||||
event_hdf5.write_event_hdf5(
|
||||
hdf5_path, ev,
|
||||
serial=serial,
|
||||
geo_range="normal",
|
||||
source_kind="idf-import",
|
||||
tool_version=event_file_io.TOOL_VERSION,
|
||||
)
|
||||
h5_written += 1
|
||||
log.debug("%s/%s — .h5 written (%s)",
|
||||
serial, path.name,
|
||||
f"{len(idf_intervals)} intervals" if is_histogram
|
||||
else f"{sum(len(v) for v in (idf_samples or {}).values())} samples")
|
||||
except Exception as exc:
|
||||
log.warning("%s/%s — .h5 write failed: %s",
|
||||
serial, path.name, exc)
|
||||
|
||||
log.info("Done. refreshed=%d skipped=%d errors=%d h5_written=%d",
|
||||
refreshed, skipped, errors, h5_written)
|
||||
return 0 if errors == 0 else 2
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
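The refresh decision in the script above (missing `bw_report`, or a `source.tool_version` older than the current one) can be exercised in isolation. A small sketch that reuses the `_vtuple` logic from the diff. The wrapper `needs_refresh` is a hypothetical name introduced here for illustration:

```python
def vtuple(s) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints; (0, 0, 0) on any failure."""
    try:
        return tuple(int(p) for p in str(s).split(".")[:3])
    except Exception:
        return (0, 0, 0)


def needs_refresh(sidecar: dict, current_version: str, force: bool = False) -> bool:
    """Hypothetical wrapper around the script's up_to_date check."""
    has_bw_report = bool(sidecar.get("bw_report"))
    existing = (sidecar.get("source") or {}).get("tool_version", "")
    up_to_date = has_bw_report and vtuple(existing) >= vtuple(current_version)
    return force or not up_to_date
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits: as strings, `"0.9.0"` sorts after `"0.21.1"`, whereas `(0, 9, 0) < (0, 21, 1)`.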
@@ -0,0 +1,91 @@
|
||||
"""Re-ingest a prod IDFW + IDFH via the patched save_imported_idf and
|
||||
render both PDFs to confirm charts have data."""
|
||||
from __future__ import annotations
|
||||
import sys
|
||||
import json
|
||||
import datetime
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
|
||||
|
||||
from sfm.waveform_store import WaveformStore
|
||||
from sfm import report_pdf
|
||||
import h5py
|
||||
|
||||
|
||||
class FakeDb:
|
||||
def __init__(self, event):
|
||||
self.event = event
|
||||
def get_event(self, _id):
|
||||
return self.event
|
||||
|
||||
|
||||
def to_ts_iso(ts):
|
||||
if ts is None:
|
||||
return None
|
||||
try:
|
||||
return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
def render_case(idf_path: Path, serial: str, out_pdf: Path, h5_summary: bool = True):
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
store = WaveformStore(Path(td))
|
||||
ev, rec = store.save_imported_idf(
|
||||
idf_path.read_bytes(),
|
||||
idf_path,
|
||||
idf_report_text=None, # production worst case: no .txt
|
||||
)
|
||||
print(f"=== {idf_path.name} ===")
|
||||
print(f" h5: {rec['hdf5_filename']}, sidecar: {rec['sidecar_filename']}")
|
||||
|
||||
h5p = Path(td) / serial / f"{idf_path.name}.h5"
|
||||
if h5p.exists() and h5_summary:
|
||||
with h5py.File(h5p) as h:
|
||||
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||
ds = h.get(f"samples/{ch}")
|
||||
if ds is not None:
|
||||
n = ds.shape[0]
|
||||
mx = float(abs(ds[...]).max()) if n else 0
|
||||
print(f" samples/{ch}: n={n} max_abs={mx:.5f}")
|
||||
|
||||
record_type = "Histogram" if idf_path.suffix.upper() == ".IDFH" else "Waveform"
|
||||
fake_row = {
|
||||
"serial": serial,
|
||||
"blastware_filename": rec["filename"],
|
||||
"record_type": record_type,
|
||||
"timestamp": to_ts_iso(ev.timestamp),
|
||||
"sample_rate": ev.sample_rate,
|
||||
"project": ev.project_info.project if ev.project_info else None,
|
||||
"client": ev.project_info.client if ev.project_info else None,
|
||||
"operator": ev.project_info.operator if ev.project_info else None,
|
||||
"sensor_location": ev.project_info.sensor_location if ev.project_info else None,
|
||||
"created_at": None,
|
||||
}
|
||||
rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
|
||||
print(f" ReportData: channels={ {k: len(v) for k,v in rd.channels.items()} }")
|
||||
if rd.is_histogram:
|
||||
print(f" histogram n_intervals={rd.histogram_n_intervals} interval_size={rd.histogram_interval_size}")
|
||||
pdf = report_pdf.render_event_report_pdf(rd)
|
||||
out_pdf.write_bytes(pdf)
|
||||
print(f" PDF: {out_pdf} ({len(pdf)} bytes)")
|
||||
|
||||
|
||||
def main():
|
||||
out_dir = Path("/tmp/thor_render_test"); out_dir.mkdir(exist_ok=True)
|
||||
cases = [
|
||||
# IDFW that decoded to preamble-only under the old codec
|
||||
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804154137.IDFW", "UM6047"),
|
||||
# IDFW that worked under the old codec (validates no regression)
|
||||
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804104450.IDFW", "UM6047"),
|
||||
# IDFH histogram
|
||||
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804190047.IDFH", "UM6047"),
|
||||
]
|
||||
for path, serial in cases:
|
||||
render_case(Path(path), serial, out_dir / f"{Path(path).name}.pdf")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
+26
-11
@@ -499,6 +499,14 @@ async function loadEvent(eventId) {
|
||||
renderEventList();
|
||||
setStatus('Loading waveform…');
|
||||
try {
|
||||
// Sidecar fetch runs in parallel — its bw_report block carries ZC
|
||||
// Freq + above-range flags + sensor-check results that the per-
|
||||
// channel stats table surfaces. Failures are non-fatal (legacy
|
||||
// events without a preserved .TXT have no sidecar bw_report).
|
||||
const sidecarP = fetch(`${apiBase}/db/events/${eventId}/sidecar`)
|
||||
.then(r => r.ok ? r.json() : null)
|
||||
.catch(() => null);
|
||||
|
||||
const r = await fetch(`${apiBase}/db/events/${eventId}/waveform.json`);
|
||||
if (!r.ok) {
|
||||
if (r.status === 404) {
|
||||
@@ -511,7 +519,8 @@ async function loadEvent(eventId) {
|
||||
renderWaveform(data);
|
||||
// Also fetch metadata from the events list for richer header
|
||||
const ev = allEvents.find(e => e.id === eventId);
|
||||
renderMeta(data, ev);
|
||||
const sidecar = await sidecarP;
|
||||
renderMeta(data, ev, sidecar);
|
||||
setStatus(`Event loaded.`, 'ok');
|
||||
} catch (e) {
|
||||
setStatus(`Failed to load event: ${e.message}`, 'error');
|
||||
@@ -528,7 +537,7 @@ function showEmpty(msg) {
|
||||
charts = {};
|
||||
}
|
||||
|
||||
function renderMeta(data, ev) {
|
||||
function renderMeta(data, ev, sidecar) {
|
||||
const metaDiv = document.getElementById('event-meta');
|
||||
const fields = [
|
||||
['Serial', data.serial || ev?.serial || '—'],
|
||||
@@ -543,14 +552,20 @@ function renderMeta(data, ev) {
|
||||
];
|
||||
|
||||
// Per-channel stats table mirroring the printout's middle block.
|
||||
// Pulls per-channel PPV from the events row (DB columns) and additional
|
||||
// details (peak time, peak accel, peak displacement, sensor check) from
|
||||
// bw_report when present.
|
||||
// PPV from the events DB row; ZC Freq + saturation flags from the
|
||||
// sidecar's bw_report block (when a .TXT was preserved on ingest).
|
||||
const bwrPeaks = (sidecar?.bw_report || {}).peaks || {};
|
||||
const bwrMic = (sidecar?.bw_report || {}).mic || {};
|
||||
const fmt = v => (v == null ? '—' : (typeof v === 'number' ? v.toFixed(3) : v));
|
||||
const fmtZc = bwr => {
|
||||
if (!bwr || bwr.zc_freq_hz == null) return '—';
|
||||
const prefix = bwr.zc_freq_above_range ? '>' : '';
|
||||
return `${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
|
||||
};
|
||||
const rows = [
|
||||
['Tran', ev?.tran_ppv],
|
||||
['Vert', ev?.vert_ppv],
|
||||
['Long', ev?.long_ppv],
|
||||
['Tran', ev?.tran_ppv, fmtZc(bwrPeaks.tran)],
|
||||
['Vert', ev?.vert_ppv, fmtZc(bwrPeaks.vert)],
|
||||
['Long', ev?.long_ppv, fmtZc(bwrPeaks.long)],
|
||||
];
|
||||
// Mic display honors the current user preference (dBL default).
|
||||
// mic_ppv is stored as raw psi on series3 events; convert when needed.
|
||||
@@ -568,11 +583,11 @@ function renderMeta(data, ev) {
|
||||
const statsHtml = `
|
||||
<table class="stats-table">
|
||||
<thead>
|
||||
<tr><th>Channel</th><th>PPV (in/s)</th></tr>
|
||||
<tr><th>Channel</th><th>PPV (in/s)</th><th>ZC Freq</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
${rows.map(([ch, ppv]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td></tr>`).join('')}
|
||||
<tr><td>MicL</td><td>${micStr}</td></tr>
|
||||
${rows.map(([ch, ppv, zc]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td><td>${zc}</td></tr>`).join('')}
|
||||
<tr><td>MicL</td><td>${micStr}</td><td>${fmtZc(bwrMic)}</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
`;
|
||||
|
||||
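The `fmtZc` helper above and the PDF renderer's `_cell` apply the same rule: show the stored value as an integer and prefix it with `>` when the above-range flag is set. A Python sketch of that rule, written against the `bw_report` field names from the diff. The function name `format_zc` and the `missing` placeholder parameter are assumptions for illustration:

```python
from typing import Optional


def format_zc(block: Optional[dict], missing: str = "-") -> str:
    """Format a bw_report peaks/mic block's ZC frequency for display."""
    if not block or block.get("zc_freq_hz") is None:
        return missing
    prefix = ">" if block.get("zc_freq_above_range") else ""
    # When the flag is set, zc_freq_hz holds the threshold, not a measurement.
    return f"{prefix}{block['zc_freq_hz']:.0f} Hz"
```

Note that Python's `:.0f` and JavaScript's `Math.round` can differ on exact `.5` values, so the two front ends may disagree by 1 Hz in that edge case.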
+85
-38
@@ -99,6 +99,7 @@ class ReportData:
|
||||
mic_pspl_time_s: Optional[float] = None
|
||||
mic_pspl_when_str: Optional[str] = None # histogram absolute date+time, BW-formatted
|
||||
mic_zc_freq_hz: Optional[float] = None
|
||||
mic_zc_freq_above_range: bool = False
|
||||
mic_channel_test_result: Optional[str] = None
|
||||
mic_channel_test_freq_hz: Optional[float] = None
|
||||
mic_channel_test_amp_mv: Optional[float] = None
|
||||
@@ -216,7 +217,8 @@ def gather_report_data(
|
||||
# Inverse of the dBL formula → psi. Mirrors waveform_codec convention.
|
||||
rd.mic_pspl_psi = DBL_REF_PSI * (10 ** (rd.mic_pspl_dbl / 20))
|
||||
rd.mic_pspl_time_s = mic.get("time_of_peak_s")
|
||||
rd.mic_zc_freq_hz = mic.get("zc_freq_hz")
|
||||
rd.mic_zc_freq_hz = mic.get("zc_freq_hz")
|
||||
rd.mic_zc_freq_above_range = bool(mic.get("zc_freq_above_range"))
|
||||
sc_mic = (bw.get("sensor_check") or {}).get("mic") or {}
|
||||
rd.mic_channel_test_result = sc_mic.get("result")
|
||||
rd.mic_channel_test_freq_hz = sc_mic.get("freq_hz")
|
||||
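The hunk above converts the reported dB(L) value back to psi via `DBL_REF_PSI * 10 ** (dbl / 20)`. A worked sketch of that conversion and its inverse, assuming the reference pressure is the standard 20 µPa expressed in psi. The project's actual `DBL_REF_PSI` constant is not shown in this diff, so the value below is an assumption:

```python
import math

PA_PER_PSI = 6894.757293168          # pascals per psi
DBL_REF_PSI = 20e-6 / PA_PER_PSI     # assumed: 20 µPa reference, about 2.9e-9 psi


def dbl_to_psi(dbl: float) -> float:
    """Inverse of the dB(L) formula: level in dB -> pressure in psi."""
    return DBL_REF_PSI * 10 ** (dbl / 20)


def psi_to_dbl(psi: float) -> float:
    """Forward dB(L) formula: pressure in psi -> level in dB re 20 µPa."""
    return 20 * math.log10(psi / DBL_REF_PSI)
```

Under this assumption a 120 dB(L) peak is roughly 0.0029 psi, which shows why treating the dB(L) number itself as psi inflates the mic chart scale by several orders of magnitude.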
@@ -236,15 +238,16 @@ def gather_report_data(
|
||||
ch_when_iso = peak_when.get(ch_label)
|
||||
peak_date, peak_time = _split_iso_to_date_time(ch_when_iso)
|
||||
rd.channel_stats.append({
|
||||
"name": ch_label,
|
||||
"ppv_ips": ch.get("ppv_ips"),
|
||||
"zc_freq_hz": ch.get("zc_freq_hz"),
|
||||
"time_of_peak_s": ch.get("time_of_peak_s"),
|
||||
"peak_accel_g": ch.get("peak_accel_g"),
|
||||
"peak_disp_in": ch.get("peak_disp_in"),
|
||||
"sensor_check": sc_ch.get("result"),
|
||||
"peak_date": peak_date,
|
||||
"peak_time": peak_time,
|
||||
"name": ch_label,
|
||||
"ppv_ips": ch.get("ppv_ips"),
|
||||
"zc_freq_hz": ch.get("zc_freq_hz"),
|
||||
"zc_freq_above_range": bool(ch.get("zc_freq_above_range")),
|
||||
"time_of_peak_s": ch.get("time_of_peak_s"),
|
||||
"peak_accel_g": ch.get("peak_accel_g"),
|
||||
"peak_disp_in": ch.get("peak_disp_in"),
|
||||
"sensor_check": sc_ch.get("result"),
|
||||
"peak_date": peak_date,
|
||||
"peak_time": peak_time,
|
||||
})
|
||||
|
||||
# MicL peak time (used in the mic block — "PSPL ... on DATE at TIME")
|
||||
@@ -612,7 +615,8 @@ def _mic_rows(rd: ReportData) -> list[tuple[str, Optional[str]]]:
|
||||
line += f" at {rd.mic_pspl_time_s:.3f} sec."
|
||||
rows.append(("PSPL", line))
|
||||
if rd.mic_zc_freq_hz is not None:
|
||||
rows.append(("ZC Freq", f"{rd.mic_zc_freq_hz:.0f} Hz"))
|
||||
prefix = ">" if rd.mic_zc_freq_above_range else ""
|
||||
rows.append(("ZC Freq", f"{prefix}{rd.mic_zc_freq_hz:.0f} Hz"))
|
||||
if rd.mic_channel_test_result:
|
||||
line = rd.mic_channel_test_result
|
||||
if rd.mic_channel_test_freq_hz is not None and rd.mic_channel_test_amp_mv is not None:
|
||||
@@ -634,14 +638,7 @@ def _draw_channel_stats_waveform(ax, rd: ReportData) -> None:
|
||||
("Sensor Check", "sensor_check", ""),
|
||||
]
|
||||
_draw_stats_table(ax, rd, rows_spec)
|
||||
if rd.peak_vector_sum_ips is not None:
|
||||
line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
|
||||
if rd.peak_vector_sum_time_s is not None:
|
||||
line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
|
||||
ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
|
||||
ha="left", va="top", transform=ax.transAxes)
|
||||
ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
|
||||
ha="left", va="top", transform=ax.transAxes)
|
||||
_draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec))
|
||||
|
||||
|
||||
def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
|
||||
@@ -659,20 +656,54 @@ def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
|
||||
("Sensor Check", "sensor_check", ""),
|
||||
]
|
||||
_draw_stats_table(ax, rd, rows_spec)
|
||||
if rd.peak_vector_sum_ips is not None:
|
||||
line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
|
||||
# Histograms: "0.091 in/s on May 27, 2026 At 06:06:14"
|
||||
# The when_str is "HH:MM:SS Month DD, YYYY" — reformat for BW match.
|
||||
if rd.peak_vector_sum_when_str:
|
||||
parts = rd.peak_vector_sum_when_str.split(" ", 1)
|
||||
if len(parts) == 2:
|
||||
line += f" on {parts[1]} At {parts[0]}"
|
||||
else:
|
||||
line += f" on {rd.peak_vector_sum_when_str}"
|
||||
ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
|
||||
ha="left", va="top", transform=ax.transAxes)
|
||||
ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
|
||||
ha="left", va="top", transform=ax.transAxes)
|
||||
_draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec), histogram_when=True)
|
||||
|
||||
|
||||
def _draw_pvs_summary(
|
||||
ax,
|
||||
rd: ReportData,
|
||||
*,
|
||||
n_data_rows: int,
|
||||
histogram_when: bool = False,
|
||||
) -> None:
|
||||
"""Render the Peak Vector Sum line below the
|
||||
stats table.
|
||||
|
||||
Reads ``ax._stats_table_bottom`` (set by ``_draw_stats_table`` when
|
||||
it pins the table via an explicit ``bbox``) so the PVS line lands
|
||||
just below the table's known bottom edge instead of guessing at the
|
||||
geometry.
|
||||
|
||||
Centered horizontally for visual balance (the previous left-aligned
|
||||
x=0 landed under the label column, not the data, which looked off).
|
||||
"""
|
||||
if rd.peak_vector_sum_ips is None:
|
||||
return
|
||||
|
||||
line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
|
||||
if histogram_when and rd.peak_vector_sum_when_str:
|
||||
# Histogram absolute date+time. when_str is "HH:MM:SS Month DD, YYYY";
|
||||
# reformat to "<value> on <date> At <time>" to match BW.
|
||||
parts = rd.peak_vector_sum_when_str.split(" ", 1)
|
||||
if len(parts) == 2:
|
||||
line += f" on {parts[1]} At {parts[0]}"
|
||||
else:
|
||||
line += f" on {rd.peak_vector_sum_when_str}"
|
||||
elif not histogram_when and rd.peak_vector_sum_time_s is not None:
|
||||
line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
|
||||
|
||||
# _draw_stats_table stashes the bbox bottom on the axes so we don't
|
||||
# have to guess geometry. Falls back to a conservative default if
|
||||
# the bbox approach hasn't run.
|
||||
table_bottom_y = getattr(ax, "_stats_table_bottom", -0.10)
|
||||
pvs_y = table_bottom_y - 0.04 # small gap below the table border
|
||||
|
||||
# Centered for visual balance — looks intentional rather than offset.
|
||||
# The original BW-replica had a "NA: Not Applicable" caption below
|
||||
# this line; dropped because we use "—" for missing values and the
|
||||
# legend was always squished against the PVS line.
|
||||
ax.text(0.5, pvs_y, line, fontsize=9, weight="bold",
|
||||
ha="center", va="top", transform=ax.transAxes)
|
||||
|
||||
|
||||
def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]]) -> None:
|
||||
@@ -684,13 +715,17 @@ def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]])
|
||||
ch_lookup = {c["name"]: c for c in rd.channel_stats}
|
||||
|
||||
def _cell(field, ch_name):
|
||||
val = ch_lookup.get(ch_name, {}).get(field)
|
||||
ch_rec = ch_lookup.get(ch_name, {})
|
||||
val = ch_rec.get(field)
|
||||
if val is None:
|
||||
return "—"
|
||||
if isinstance(val, float):
|
||||
# ZC Freq is integer-formatted in BW; everything else with 3 decimals
|
||||
# ZC Freq is integer-formatted in BW; ">100 Hz" sentinel
|
||||
# rendered as ">N" (val carries the threshold). Everything
|
||||
# else gets 3 decimals.
|
||||
if field == "zc_freq_hz":
|
||||
return f"{val:.0f}"
|
||||
prefix = ">" if ch_rec.get("zc_freq_above_range") else ""
|
||||
return f"{prefix}{val:.0f}"
|
||||
return f"{val:.3f}"
|
||||
return str(val)
|
||||
|
||||
@@ -703,16 +738,28 @@ def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]])
             _cell(field_name, "Long"),
             unit,
         ])
+    # Pin the table's position+size via bbox so we know exactly where
+    # the bottom edge lands. Lets _draw_pvs_summary place the PVS line
+    # just below the table without guessing at row heights.
+    #
+    # bbox = [x, y, width, height] in axes coords. Header + data rows
+    # at row_h each; horizontal extent matches sum(colWidths).
+    n_rows = len(table_data)  # header + data rows
+    row_h = 0.12  # axes-fraction per row (fits fontsize=8)
+    table_height = n_rows * row_h
+    table_bottom = 1.0 - table_height
     tbl = ax.table(
-        cellText=table_data, loc="upper left",
+        cellText=table_data,
         colWidths=[0.28, 0.14, 0.14, 0.14, 0.10],
         cellLoc="left", edges="open",
+        bbox=[0.0, table_bottom, 0.80, table_height],
     )
     tbl.auto_set_font_size(False)
     tbl.set_fontsize(8)
-    tbl.scale(1, 1.4)
     for j in range(5):
         tbl[(0, j)].set_text_props(weight="bold", color="#555")
+    # Stash the bottom Y so _draw_pvs_summary can position itself below.
+    ax._stats_table_bottom = table_bottom


 def _channel_axis_color(ch: str) -> str:
|
||||
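The layout arithmetic in the hunk above (fixed row height, table pinned to the top of the axes, PVS line a small gap below) can be checked without matplotlib. This is a minimal sketch using the constants that appear in the diff. `table_layout` is a hypothetical helper name, and the 6-row input is only an example.

```python
ROW_H = 0.12        # axes-fraction per row, value taken from the diff
PVS_GAP = 0.04      # gap between table bottom and PVS line, from the diff
TABLE_WIDTH = 0.80  # equals sum(colWidths) in the diff


def table_layout(n_rows: int) -> tuple[list[float], float]:
    """Return (bbox, pvs_y) in axes coordinates for a table with n_rows rows."""
    table_height = n_rows * ROW_H
    table_bottom = 1.0 - table_height  # table hangs from the top of the axes
    bbox = [0.0, table_bottom, TABLE_WIDTH, table_height]
    pvs_y = table_bottom - PVS_GAP
    return bbox, pvs_y


bbox, pvs_y = table_layout(6)  # e.g. 1 header row + 5 data rows
print(bbox, pvs_y)
```

Because the bottom edge is computed rather than inferred from rendered row heights, the PVS caption position follows the row count directly.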
+19
-5
@@ -2886,6 +2886,12 @@ function _renderSidecar(data) {
|
||||
const bw = data.blastware || {};
|
||||
const src = data.source || {};
|
||||
const rev = data.review || {};
|
||||
// bw_report carries the per-channel ASCII-derived stats (ZC Freq,
|
||||
// saturation flags, peak time, etc.). Only present on events
|
||||
// ingested with a preserved .TXT (post-2026-05-27); falls back to
|
||||
// empty for legacy events.
|
||||
const bwrPeaks = (data.bw_report || {}).peaks || {};
|
||||
const bwrMic = (data.bw_report || {}).mic || {};
|
||||
|
||||
document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`;
|
||||
|
||||
@@ -2918,11 +2924,19 @@ function _renderSidecar(data) {
|
||||
document.getElementById('sc-f-sr').textContent = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : '');
|
||||
document.getElementById('sc-f-key').textContent = ev.waveform_key || '—';
|
||||
|
||||
document.getElementById('sc-f-tran').textContent = fmtPpv(pv.transverse);
|
||||
document.getElementById('sc-f-vert').textContent = fmtPpv(pv.vertical);
|
||||
document.getElementById('sc-f-long').textContent = fmtPpv(pv.longitudinal);
|
||||
// Suffix with " · {prefix}{N} Hz" when bw_report has a ZC Freq.
|
||||
// Above-range ZC peaks (BW ">100 Hz") get a literal ">" prefix so
|
||||
// operators see the same indicator the PDF shows.
|
||||
const fmtZc = bwr => {
|
||||
if (!bwr || bwr.zc_freq_hz == null) return '';
|
||||
const prefix = bwr.zc_freq_above_range ? '>' : '';
|
||||
return ` · ${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
|
||||
};
|
||||
document.getElementById('sc-f-tran').textContent = fmtPpv(pv.transverse) + fmtZc(bwrPeaks.tran);
|
||||
document.getElementById('sc-f-vert').textContent = fmtPpv(pv.vertical) + fmtZc(bwrPeaks.vert);
|
||||
document.getElementById('sc-f-long').textContent = fmtPpv(pv.longitudinal) + fmtZc(bwrPeaks.long);
|
||||
document.getElementById('sc-f-pvs').textContent = fmtPpv(pv.vector_sum);
|
||||
document.getElementById('sc-f-mic').textContent = fmtMic(pv.mic_psi);
|
||||
document.getElementById('sc-f-mic').textContent = fmtMic(pv.mic_psi) + fmtZc(bwrMic);
|
||||
|
||||
document.getElementById('sc-f-project').textContent = pi.project || '—';
|
||||
document.getElementById('sc-f-client').textContent = pi.client || '—';
|
||||
@@ -3273,7 +3287,7 @@ if (currentSection === 'db') {
|
||||
<dt id="sc-l-bwsize">File size</dt> <dd id="sc-f-bwsize">—</dd>
|
||||
<dt id="sc-l-sha">File sha256</dt> <dd id="sc-f-sha">—</dd>
|
||||
<dt>Source kind</dt> <dd id="sc-f-src">—</dd>
|
||||
<dt title="When our server received and stored this event (sfm-db insert time, not the recording time)">Received by server at</dt>
|
||||
<dt title="When SFM received and stored this event — NOT the unit-local trigger time (see Timestamp at the top of the modal for that).">Time received</dt>
|
||||
<dd id="sc-f-cap">—</dd>
|
||||
</dl>
|
||||
</div>
|
||||
|
||||
+189
-23
@@ -467,21 +467,21 @@ class WaveformStore:
|
||||
Ingest a Thor (Micromate Series IV) IDF event file (`.IDFW` or
|
||||
`.IDFH`) produced by Thor's TXT exporter.
|
||||
|
||||
Thor binaries are stored as opaque bytes — seismo-relay doesn't
|
||||
yet decode the proprietary IDF binary format (codec slot lives
|
||||
at ``micromate/idf_file.py``). Device-authoritative metadata
|
||||
comes from the paired ``.IDFW.txt`` / ``.IDFH.txt`` sidecar
|
||||
when supplied.
|
||||
|
||||
Workflow:
|
||||
1. Parse the paired TXT report (when supplied) via
|
||||
``micromate.parse_idf_report`` → dict.
|
||||
2. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
|
||||
3. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
|
||||
4. Bridge IdfEvent → ``minimateplus.Event`` (for the existing
|
||||
sidecar / DB insert machinery) via
|
||||
``IdfEvent.to_minimateplus_event(waveform_key)``.
|
||||
5. Write the ``.sfm.json`` sidecar with
|
||||
1. For sig-A `.IDFW` / `.IDFH` binaries, decode samples (or
per-interval histogram records) + binary metadata via
``micromate.idf_file.read_idf_file()``. A decode failure or an
unsupported (sig-B) file falls through to the .txt-only flow.
|
||||
2. Parse the paired TXT report (when supplied) via
|
||||
``micromate.parse_idf_report`` → dict. TXT remains the
|
||||
source of truth for fields the binary doesn't yet supply
|
||||
(full peak set with ZC freq / Time of Peak, sensor self-check,
|
||||
firmware string, project strings).
|
||||
3. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
|
||||
4. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
|
||||
5. Bridge IdfEvent → ``minimateplus.Event`` and attach
|
||||
``raw_samples`` from the binary decoder (when available).
|
||||
6. Write the `.h5` clean-waveform file when samples decoded.
|
||||
7. Write the ``.sfm.json`` sidecar with
|
||||
``source.kind = "idf-import"`` and the full raw IDF report
|
||||
under ``extensions.idf_report``.
|
||||
|
||||
@@ -490,7 +490,38 @@ class WaveformStore:
|
||||
"""
|
||||
from micromate import IdfEvent, parse_idf_report
|
||||
|
||||
# Parse the .txt sidecar (best-effort; non-fatal on failure).
|
||||
# 1. Binary decode (sig-A IDFW and IDFH). Non-fatal: any failure
|
||||
# leaves samples / binary metadata unfilled and we proceed with
|
||||
# the .txt path as before.
|
||||
idf_samples: Optional[dict] = None
|
||||
idf_intervals: Optional[list] = None
|
||||
binary_md = None
|
||||
binary_peaks = None
|
||||
is_histogram = False
|
||||
try:
|
||||
from micromate.idf_file import read_idf_file
|
||||
# Pass idf_bytes through `data=` — at this point in the flow
|
||||
# the binary hasn't been written to disk yet, so the codec
|
||||
# can't read from source_path. We still pass source_path so
|
||||
# the codec has the filename for error messages + .IDFH
|
||||
# suffix detection.
|
||||
res = read_idf_file(source_path, data=idf_bytes)
|
||||
idf_samples = res.samples or None
|
||||
idf_intervals = res.intervals
|
||||
is_histogram = res.intervals is not None
|
||||
binary_md = res.binary_metadata
|
||||
binary_peaks = res.event.peaks
|
||||
except NotImplementedError:
|
||||
# sig-B — codec doesn't handle this yet.
|
||||
pass
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_idf: binary codec failed for %s: %s — "
|
||||
"falling back to .txt-only ingest",
|
||||
source_path.name, exc,
|
||||
)
|
||||
|
||||
# 2. Parse the .txt sidecar (best-effort; non-fatal on failure).
|
||||
report_dict: dict = {}
|
||||
if idf_report_text is not None:
|
||||
try:
|
||||
@@ -501,17 +532,58 @@ class WaveformStore:
|
||||
exc,
|
||||
)
|
||||
|
||||
# Build the typed IdfEvent. Filename is authoritative for
|
||||
# 3. Backfill report_dict with binary metadata for fields the
|
||||
# .txt didn't supply. Every check below is "binary value present
# and .txt value missing", so the .txt wins whenever it has a value
# and the binary only fills in fields missing from the .txt.
|
||||
if binary_md is not None:
|
||||
if binary_md.serial and not report_dict.get("serial_number"):
|
||||
report_dict["serial_number"] = binary_md.serial
|
||||
if binary_md.event_datetime and not report_dict.get("event_datetime"):
|
||||
report_dict["event_datetime"] = binary_md.event_datetime
|
||||
if binary_md.sample_rate and not report_dict.get("sample_rate"):
|
||||
report_dict["sample_rate"] = binary_md.sample_rate
|
||||
if binary_md.record_time_sec and not report_dict.get("record_time_sec"):
|
||||
report_dict["record_time_sec"] = binary_md.record_time_sec
|
||||
# Calibration date (binary) vs calibration text (.txt) cohabit
|
||||
# under different keys; no overwrite needed.
|
||||
if binary_md.event_datetime and not report_dict.get("event_type"):
|
||||
report_dict["event_type"] = (
|
||||
"Full Histogram" if is_histogram else "Full Waveform"
|
||||
)
|
||||
|
||||
# Binary-derived peaks fill in when the .txt didn't supply them.
|
||||
# They're ~3% low vs the device-authoritative .txt values (residual
|
||||
# codec drift), so .txt always wins when present.
|
||||
if binary_peaks is not None:
|
||||
if binary_peaks.transverse_ips and not report_dict.get("tran_ppv"):
|
||||
report_dict["tran_ppv"] = binary_peaks.transverse_ips
|
||||
if binary_peaks.vertical_ips and not report_dict.get("vert_ppv"):
|
||||
report_dict["vert_ppv"] = binary_peaks.vertical_ips
|
||||
if binary_peaks.longitudinal_ips and not report_dict.get("long_ppv"):
|
||||
report_dict["long_ppv"] = binary_peaks.longitudinal_ips
|
||||
|
||||
# 4. Build the typed IdfEvent. Filename is authoritative for
|
||||
# (serial, timestamp, kind); the report's event_datetime takes
|
||||
# precedence over the filename timestamp inside from_report().
|
||||
idf_event = IdfEvent.from_report(report_dict, source_path.name)
|
||||
|
||||
# The binary mic peak (psi) isn't carried through from_report() —
|
||||
# IdfReport.from_dict only sees the .txt's dB(L) value. Pull the
|
||||
# binary-derived ``mic_pspl_psi`` onto the typed IdfEvent so the
|
||||
# downstream bridge can populate ``PeakValues.micl`` (psi-shaped)
|
||||
# and the h5 writer's per-count mic factor lands at a sensible
|
||||
# value. Without this, the h5 mic chart auto-scales against the
|
||||
# dB(L) value-as-pseudo-psi and renders ~flat.
|
||||
if binary_peaks is not None and binary_peaks.mic_pspl_psi is not None:
|
||||
idf_event.peaks.mic_pspl_psi = binary_peaks.mic_pspl_psi
|
||||
|
||||
# Operator-supplied serial_hint wins over the binary's filename
|
||||
# prefix when both are present (e.g. callers passing a known-good
|
||||
# serial that overrides a misnamed export).
|
||||
serial = serial_hint or idf_event.serial or "UNKNOWN"
|
||||
|
||||
# Filesystem write.
|
||||
# 5. Filesystem write of binary bytes.
|
||||
filename = source_path.name
|
||||
bw_path = self._serial_dir(serial) / filename
|
||||
bw_path.write_bytes(idf_bytes)
|
||||
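The backfill step in the hunk above only copies a binary-derived value when the .txt report has no value for that key. The sketch below isolates that rule under the assumption that both sides are plain dicts. `backfill_missing` and the sample values are hypothetical and are not part of `WaveformStore`.

```python
from typing import Any, Mapping


def backfill_missing(report: dict, fallback: Mapping[str, Any]) -> dict:
    """Copy fallback values into report only where report has no truthy value."""
    for key, value in fallback.items():
        # Same shape as the diff: "fallback present and report missing".
        if value and not report.get(key):
            report[key] = value
    return report


# Made-up values for illustration only.
txt_report = {"serial_number": "BE18190", "sample_rate": None}
binary_values = {
    "serial_number": "XX00000",
    "sample_rate": 1024,
    "record_time_sec": 3.0,
}

merged = backfill_missing(dict(txt_report), binary_values)
print(merged)
```

One consequence of the truthiness test is that a legitimate zero on either side is treated as missing; an explicit `is None` check would be needed if zero were a meaningful value.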
@@ -523,13 +595,59 @@ class WaveformStore:
|
||||
# surrogate — every distinct binary maps to a distinct row.
|
||||
waveform_key = bytes.fromhex(sha256)[:16]
|
||||
|
||||
# Bridge to minimateplus.Event for the existing sidecar / DB
|
||||
# 6. Bridge to minimateplus.Event for the existing sidecar / DB
|
||||
# insert paths. See IdfEvent.to_minimateplus_event() for the
|
||||
# caveats of this bridge (mic units, missing fields → sidecar).
|
||||
ev = idf_event.to_minimateplus_event(waveform_key)
|
||||
|
||||
# Write the sidecar. Source kind "idf-import" was added to the
|
||||
# allow-list in event_file_io.event_to_sidecar_dict for this.
|
||||
# Attach the decoded sample arrays. Thor's decoder counts use
|
||||
# LSB = 0.0003 in/s for geo (vs BW's 16-count units at 0.005 in/s)
|
||||
# — the .h5 writer's geo_range="normal" yields LSB = 10/32768
|
||||
# ≈ 0.000305 in/s, so plotted samples come out ~1.7% high.
|
||||
# Acceptable known offset; refine with a Thor-aware h5 path later.
|
||||
if idf_samples is not None:
|
||||
ev.raw_samples = idf_samples
|
||||
n_samples = max((len(idf_samples.get(ch, [])) for ch in ("Tran", "Vert", "Long", "MicL")), default=0)
|
||||
ev.total_samples = ev.total_samples or n_samples
|
||||
|
||||
# For IDFH histograms there are no per-sample waveform arrays — the
|
||||
# device stores one peak ADC count per interval per channel. Synthesise
|
||||
# a 1-sample-per-interval array so the existing h5+renderer pipeline
|
||||
# (which groups samples down to ``n_intervals`` bars via max-per-group)
|
||||
# produces a non-blank histogram chart. Each "sample" is the peak ADC
|
||||
# count for that interval, so the h5 writer's ``count × geo_fs/32768``
|
||||
# conversion yields the right physical value for the bar height.
|
||||
if is_histogram and idf_intervals:
|
||||
hist_samples = {
|
||||
"Tran": [iv.peak_count("Tran") for iv in idf_intervals],
|
||||
"Vert": [iv.peak_count("Vert") for iv in idf_intervals],
|
||||
"Long": [iv.peak_count("Long") for iv in idf_intervals],
|
||||
"MicL": [iv.peak_count("MicL") for iv in idf_intervals],
|
||||
}
|
||||
ev.raw_samples = hist_samples
|
||||
ev.total_samples = ev.total_samples or len(idf_intervals)
|
||||
|
||||
# 7. Write the .h5 clean-waveform file when we have samples to write
|
||||
# (either the IDFW per-sample stream, or the IDFH synthesised per-
|
||||
# interval peak array). The renderer treats both shapes the same way.
|
||||
hdf5_filename: Optional[str] = None
|
||||
if ev.raw_samples:
|
||||
hdf5_path = self.hdf5_path_for(serial, filename)
|
||||
try:
|
||||
event_hdf5.write_event_hdf5(
|
||||
hdf5_path, ev,
|
||||
serial=serial,
|
||||
geo_range="normal", # Thor's geo full scale is also 10 in/s (Normal)
|
||||
source_kind="idf-import",
|
||||
)
|
||||
hdf5_filename = hdf5_path.name
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_idf: HDF5 write failed for %s: %s — continuing without .h5",
|
||||
hdf5_path, exc,
|
||||
)
|
||||
|
||||
# 8. Write the sidecar. Source kind "idf-import" is on the allow-list.
|
||||
sidecar_path = self.sidecar_path_for(serial, filename)
|
||||
existing_review = None
|
||||
if sidecar_path.exists():
|
||||
@@ -554,19 +672,67 @@ class WaveformStore:
|
||||
# Time of Peak, sensor self-check, calibration, firmware).
|
||||
if report_dict:
|
||||
sidecar["extensions"]["idf_report"] = report_dict
|
||||
|
||||
# Project the IDF report into the BW report sidecar shape so the
|
||||
# existing Event Report PDF pipeline (sfm/report_pdf.py) can
|
||||
# render Thor events without needing a separate code path. Thor
|
||||
# data is 95% the same metric set as BW — the adapter handles
|
||||
# the field-name mapping.
|
||||
if report_dict or binary_md is not None:
|
||||
try:
|
||||
from micromate.idf_to_bw_report import build_bw_report_from_idf
|
||||
sidecar["bw_report"] = build_bw_report_from_idf(
|
||||
report_dict or {},
|
||||
binary_md=binary_md,
|
||||
intervals=idf_intervals,
|
||||
is_histogram=is_histogram,
|
||||
)
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_idf: idf→bw_report adapter failed for %s: %s — "
|
||||
"report PDF will fall back to DB-only fields",
|
||||
filename, exc,
|
||||
)
|
||||
# For histograms, also stash the binary-decoded per-interval
|
||||
# records so the UI / report layer doesn't need to re-walk the
|
||||
# IDFH file at render time.
|
||||
if idf_intervals is not None:
|
||||
sidecar["extensions"]["idf_intervals"] = [
|
||||
{
|
||||
"offset": iv.offset,
|
||||
"tran_peak": iv.peak_count("Tran"),
|
||||
"tran_halfp": iv.tran_halfp,
|
||||
"tran_freq": iv.freq_hz("Tran"),
|
||||
"vert_peak": iv.peak_count("Vert"),
|
||||
"vert_halfp": iv.vert_halfp,
|
||||
"vert_freq": iv.freq_hz("Vert"),
|
||||
"long_peak": iv.peak_count("Long"),
|
||||
"long_halfp": iv.long_halfp,
|
||||
"long_freq": iv.freq_hz("Long"),
|
||||
"mic_peak": iv.peak_count("MicL"),
|
||||
"mic_halfp": iv.micl_halfp,
|
||||
"mic_freq": iv.freq_hz("MicL"),
|
||||
}
|
||||
for iv in idf_intervals
|
||||
]
|
||||
event_file_io.write_sidecar(sidecar_path, sidecar)
|
||||
|
||||
log.info(
|
||||
"WaveformStore.save_imported_idf serial=%s filename=%s filesize=%d "
|
||||
"report_attached=%s",
|
||||
serial, filename, filesize, bool(report_dict),
|
||||
"kind=%s report_attached=%s binary_decoded=%s h5=%s intervals=%d",
|
||||
serial, filename, filesize,
|
||||
"histogram" if is_histogram else "waveform",
|
||||
bool(report_dict),
|
||||
(idf_samples is not None) or (idf_intervals is not None),
|
||||
hdf5_filename or "(skipped)",
|
||||
len(idf_intervals) if idf_intervals else 0,
|
||||
)
|
||||
return ev, {
|
||||
"filename": filename,
|
||||
"filesize": filesize,
|
||||
"sha256": sha256,
|
||||
"a5_pickle_filename": None,
|
||||
"hdf5_filename": None,
|
||||
"hdf5_filename": hdf5_filename,
|
||||
"sidecar_filename": sidecar_path.name,
|
||||
"serial": serial,
|
||||
}
|
||||
|
||||
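The comments in the hunks above describe a `count × geo_fs/32768` conversion and a roughly 1.7% offset between the .h5 writer's LSB and Thor's native LSB. The arithmetic can be reproduced as follows. This is a sketch under the stated figures (10 in/s full scale, 32768 counts, 0.0003 in/s Thor LSB). `counts_to_ips` is a hypothetical helper, and the interval peak counts are invented.

```python
FULL_SCALE_COUNTS = 32768  # counts at full scale, from the diff comment


def counts_to_ips(counts, geo_fs_ips=10.0):
    """Convert ADC counts to in/s using the count * geo_fs / 32768 rule."""
    lsb = geo_fs_ips / FULL_SCALE_COUNTS
    return [c * lsb for c in counts]


# LSB used by the .h5 writer at geo_range="normal" vs Thor's stated LSB.
h5_lsb = 10.0 / FULL_SCALE_COUNTS  # about 0.000305 in/s
thor_lsb = 0.0003                  # in/s, from the diff comment
offset_pct = (h5_lsb / thor_lsb - 1.0) * 100.0

# One peak count per histogram interval (made-up values); each count
# becomes one bar height after conversion.
interval_peaks = [0, 16384, 32768]
bars = counts_to_ips(interval_peaks)

print(round(offset_pct, 2), bars)
```

The same conversion applies to per-sample IDFW arrays and to the synthesised one-value-per-interval IDFH arrays, which is why a single .h5 path can serve both shapes.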
@@ -441,6 +441,40 @@ def test_real_oorange_event_t190_parses():
|
||||
assert r.channels["Long"].ppv_ips == pytest.approx(2.83)
|
||||
assert r.peak_vector_sum_saturated is True
|
||||
assert r.peak_vector_sum_time_s == pytest.approx(0.007)
|
||||
# Same fixture: Tran ZC Freq is ">100 Hz" — must parse as 100 +
|
||||
# above_range flag, not None (which would render as "—" on the PDF).
|
||||
assert r.channels["Tran"].zc_freq_hz == 100.0
|
||||
assert r.channels["Tran"].zc_freq_above_range is True
|
||||
# Vert/Long are normal numeric values; flag stays False.
|
||||
assert r.channels["Vert"].zc_freq_above_range is False
|
||||
assert r.channels["Long"].zc_freq_above_range is False
|
||||
|
||||
|
||||
def test_above_range_marker_treated_as_zc_threshold():
|
||||
"""BW writes '>100 Hz' for ZC Freq when the zero-crossing algorithm
|
||||
sees a peak too fast to count (cuts off at the device's 100 Hz
|
||||
reporting ceiling). Parser must store the threshold + flag, not
|
||||
fall back to None.
|
||||
"""
|
||||
txt = """\
|
||||
"Event Type : Full Waveform"
|
||||
"Serial Number : BE18190"
|
||||
"Tran ZC Freq : >100 Hz"
|
||||
"Vert ZC Freq : 73 Hz"
|
||||
"Long ZC Freq : N/A Hz"
|
||||
"MicL ZC Freq : >100 Hz"
|
||||
"""
|
||||
r = parse_report(txt)
|
||||
assert r.channels["Tran"].zc_freq_hz == 100.0
|
||||
assert r.channels["Tran"].zc_freq_above_range is True
|
||||
assert r.channels["Vert"].zc_freq_hz == 73.0
|
||||
assert r.channels["Vert"].zc_freq_above_range is False
|
||||
# N/A → None, flag stays False
|
||||
assert r.channels["Long"].zc_freq_hz is None
|
||||
assert r.channels["Long"].zc_freq_above_range is False
|
||||
# Mic above-range
|
||||
assert r.mic.zc_freq_hz == 100.0
|
||||
assert r.mic.zc_freq_above_range is True
|
||||
|
||||
|
||||
def test_real_histogram_fixture_populates_sensor_location():
|
||||
|
||||
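The tests above pin down three cases for a ZC Freq value: a plain number, the `>100 Hz` above-range marker, and `N/A`. One way to satisfy them is sketched below. This is not the repository's `parse_report`. `parse_zc_freq` and its regular expression are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Optional ">" marker, a number, then the "Hz" unit.
_ZC_RE = re.compile(r"^\s*(>)?\s*(\d+(?:\.\d+)?)\s*Hz\s*$")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (frequency_hz, above_range) for a ZC Freq value string."""
    m = _ZC_RE.match(text)
    if m is None:
        # Non-numeric values such as "N/A Hz": no frequency, flag stays False.
        return None, False
    return float(m.group(2)), m.group(1) is not None


for raw in (">100 Hz", "73 Hz", "N/A Hz"):
    print(raw, parse_zc_freq(raw))
```

Returning the threshold together with a flag, instead of `None`, lets the PDF and UI layers render ">100" rather than a missing-value placeholder.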