fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge

Two gaps in backfill_thor_events.py left old Thor events showing stale
charts after a v0.21.1 backfill pass:

1. IDFH events were skipped from .h5 regeneration (the "have decoded
   samples" gate was IDFW-only). Histograms kept their pre-v0.21.1 .h5,
   written from raw_samples = None, which the renderer turned into a
   near-empty bar chart. Older events instead kept the
   dB(L)-as-pseudo-psi mic scale that produced "107.7 psi" peaks
   (atomic-bomb level instead of footstep level).

   Fix: synthesise the same 1-sample-per-interval array that
   save_imported_idf v0.21.1 uses (peak ADC count per channel per
   interval) so the renderer's bar-chart grouping has data to work with.

2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the
   IdfEvent before to_minimateplus_event(). The live save_imported_idf
   does this merge. Without it, IdfEvent.from_report() only sees the
   .txt's dB(L) value, the bridge falls back to the dBL→psi formula
   (instead of the binary-accurate 2.14e-6 psi/count value), and the h5
   writer's per-count mic factor lands on a less-correct value.

   Fix: same merge the live ingest does (lift
   res.event.peaks.mic_pspl_psi onto idf_event.peaks before the bridge
   call).

Verified against UM6047_20250804190047.IDFH (250-interval prod
histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being
treated as dB(L)=107.7 in the old h5).

Operator: re-run after deploy.
`docker compose exec sfm python scripts/backfill_thor_events.py` is
idempotent: the existing version check still skips events already at
the new TOOL_VERSION, and review state + captured_at are preserved on
the second pass.
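
For illustration only, the two fixes described in this commit message can be sketched roughly as below. This is not the code in backfill_thor_events.py: `Interval`, `Peaks`, and the three helper functions are hypothetical stand-ins invented for this sketch, the channel names are taken from the changelog, and taking the absolute value of each peak count is an assumption. The only constant lifted from the message is the 2.14e-6 psi/count mic factor.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

MIC_PSI_PER_COUNT = 2.14e-6  # psi per ADC count, as quoted in the commit message
CHANNELS = ("Tran", "Vert", "Long", "MicL")


@dataclass
class Interval:
    """Hypothetical stand-in for one decoded IDFH interval record."""
    peak_counts: Dict[str, int]  # peak ADC count per channel for this interval


@dataclass
class Peaks:
    """Hypothetical stand-in for the peaks block on the typed event."""
    mic_pspl_dbl: Optional[float] = None  # value parsed from the .txt sidecar
    mic_pspl_psi: Optional[float] = None  # value derived from the binary


def synthesise_interval_samples(intervals: List[Interval]) -> Dict[str, List[int]]:
    """Build one 'sample' per interval per channel: that interval's peak count.

    Assumption: the magnitude of the peak is what drives the bar height.
    """
    return {
        ch: [abs(iv.peak_counts.get(ch, 0)) for iv in intervals]
        for ch in CHANNELS
    }


def mic_psi_from_counts(mic_counts: List[int]) -> Optional[float]:
    """max(|MicL|) x psi-per-count, or None when there are no counts."""
    if not mic_counts:
        return None
    return max(abs(c) for c in mic_counts) * MIC_PSI_PER_COUNT


def merge_binary_mic_psi(txt_peaks: Peaks, binary_psi: Optional[float]) -> Peaks:
    """Lift the binary-derived psi onto the .txt-derived peaks when available."""
    if binary_psi is not None:
        txt_peaks.mic_pspl_psi = binary_psi
    return txt_peaks


# Made-up interval data, chosen only to exercise the helpers.
intervals = [
    Interval({"Tran": 120, "Vert": -340, "Long": 95, "MicL": 13}),
    Interval({"Tran": 80, "Vert": 60, "Long": -210, "MicL": -7}),
]
samples = synthesise_interval_samples(intervals)
peaks = merge_binary_mic_psi(
    Peaks(mic_pspl_dbl=107.7),
    mic_psi_from_counts(samples["MicL"]),
)
print(samples["Vert"], peaks.mic_pspl_psi)
```

With the made-up counts above, the largest mic magnitude is 13 counts, so the merged value is 13 × 2.14e-6 ≈ 2.78e-5 psi, while the dB(L) value from the sidecar is left untouched for the report header. When the binary yields no mic counts, the merge is a no-op and a downstream dB(L)→psi fallback would still apply.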
version bump - 0.21.1
2026-06-01 20:02:54 +00:00 · 2026-06-01 19:33:44 +00:00 · 2026-06-01 18:27:24 +00:00 · 2026-05-31 20:51:09 +00:00 · 2026-05-30 04:37:43 +00:00 · 2026-05-29 22:17:43 +00:00
40 changed files with 3147 additions and 161 deletions
@@ -6,8 +6,138 @@ All notable changes to seismo-relay are documented here.

 ## [Unreleased]

+---
+
+## v0.21.1 — 2026-06-01
+
+Fixes for v0.21.0 bugs that surfaced after the first prod redeploy.
+Three production-visible symptoms were root-caused and fixed: blank
+waveform charts on most Thor events, blank histogram charts on all
+Thor events, and a mic chart that auto-scaled against a dB(L) value
+treated as psi.
+
 ### Fixed

+- **Dynamic IDFW body offset.**  The v0.21.0 codec hardcoded the body
+  at file offset `0x0f1f` based on the example corpus, but only ~52%
+  of production IDFW events use that offset; the rest sit at offsets
+  from `0x1033` up to `0x3082` depending on header padding.  At
+  `0x0f1f` the codec would find a coincidentally-matching `00 02 00`
+  magic, read the 2-byte Tran preamble, and return empty V/L/M
+  arrays — producing near-empty .h5 files and blank charts.
+  `micromate.idf_file._find_waveform_body_offset()` now scans every
+  `00 02 00` magic position past `0x0E00`, trial-decodes each one,
+  and picks the offset with the most samples.  Validated across 483
+  prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully
+  decode, 126/483 partial (BW codec walker-stops-early on loud
+  events — pre-existing limitation, samples reached are correct).
+
+- **IDFH histograms now render bar charts.**  Histograms previously
+  skipped the .h5 write because there are no per-sample arrays, but
+  the renderer drives the per-interval bar chart from .h5 channel
+  data + `bw_report.histogram.n_intervals`.  `save_imported_idf` now
+  synthesizes a 1-sample-per-interval array from the decoded
+  `IdfhInterval` peak counts and writes an .h5 so the existing
+  renderer works unchanged — each "sample" is the per-interval peak
+  ADC count, so the writer's `count × geo_fs/32768` conversion
+  yields the right bar height.
+
+- **Mic chart scaling on Thor events.**  `PeakValues.micl` (consumed
+  by the h5 writer's per-count mic scale factor) expects psi, but
+  the Thor bridge was stuffing the dB(L) value (~99.4) into it,
+  producing a per-count factor 5+ orders of magnitude too large and
+  a flat-looking mic chart.  Fixed by adding `IdfPeaks.mic_pspl_psi`
+  alongside `mic_pspl_dbl`; `read_idf_file()` computes it from
+  binary mic counts (`max(|MicL|) × 2.14e-6 psi/count`) for both
+  IDFW and IDFH paths; `save_imported_idf` merges it onto the typed
+  event after `IdfEvent.from_report`; the bridge feeds psi to
+  `PeakValues.micl` with a dB(L)→psi formula fallback when only the
+  dB(L) value is available.  dB(L) for the report header still
+  flows through `bw_report.mic.pspl_dbl` unchanged.
+
+### Operator
+
+After deploy, run `python scripts/backfill_thor_events.py` to refresh
+every existing Thor event's sidecar + .h5 with the corrected codec
+output.  The script auto-skips events already at the current
+`TOOL_VERSION`, so the bump from `0.21.0` → `0.21.1` is what triggers
+the refresh.
+
+---
+
+## v0.21.0 — 2026-05-29
+
+The "Thor / Series IV codec" release.  Two big pieces landed: (1) the IDF binary codec actually decodes now, both IDFW and IDFH, and (2) a Thor→BW adapter lets Thor events flow through the existing Series III Event Report PDF pipeline.  Combined effect: a Thor event ingested via `/db/import/idf_file` now lands in the DB with the same fidelity as a Blastware event, gets a per-event PDF on demand, and renders in Terra-View's modal chart with the same plotting code as a BW event.
+
+### Added — Thor IDF binary codec (`micromate/idf_file.read_idf_file`)
+
+- **IDFW (waveform)** — body sits at fixed file offset `0x0f1f`; reuses the verified `decode_waveform_v2()` walker from `minimateplus.waveform_codec`.  Sample fidelity is **87–99% byte-exact** against the ASCII-sidecar reference values on quiet events; loud events hit the same walker-stops-early limitation as the BW codec on `SP0/SS0/SV0`-style events.
+- **IDFH (histogram)** — dedicated segment-based decoder for the Thor histogram body format: `[len_be][0a 00 00 00][00 NN][05 3f]` framing plus N × 72-byte interval records (4 × 16-byte per-channel min/max/halfp).  **All 859 Thor IDFH corpus files decode**, totalling **181,071 intervals**; per-channel peaks match the sidecar within **~1.8% (ADC quantization)**.
+- **BW-aliased binary detection** — a small number of corpus files (e.g. `BE9439_*.IDFW/IDFH`) are actually Series III Blastware binaries that share the IDF filename convention by accident.  `read_idf_file()` detects them via their BW `STRT` signature and raises `NotImplementedError` pointing the caller at `read_blastware_file()` instead of trying to decode them as IDF.
+- Full field layouts in `docs/idf_protocol_reference.md`; supporting analysis scripts in `analysis_idf/` (decode validators, per-file detail dumps, corpus accuracy reports).
+
+### Added — Thor → BW report adapter (`micromate/idf_to_bw_report.py`)
+
+- **`build_bw_report_from_idf(report_dict, binary_md=, intervals=, is_histogram=)`** projects a parsed Thor `IdfReport` plus binary-extracted metadata plus decoded IDFH intervals into the `bw_report`-shaped dict that `sfm.report_pdf.gather_report_data` consumes.  No need to duplicate the renderer — Thor data is ~95% the same metric set as BW; the adapter handles the field-name mapping (`MicPSPL` → `pspl_dbl`, `>100` sentinel → `zc_freq_above_range`, free-form `Calibration : Nov 22, 2023 by Instantel` → `calibration_date` + `calibration_by`, etc.).
+- For IDFH events the adapter derives `histogram.interval_times` by stepping `IntervalSize` from `HistogramStartTime`, matching what the BW pipeline expects from a histogram-mode event.
+- **Wired into `WaveformStore.save_imported_idf`** — every Thor event ingested via `/db/import/idf_file` now gets a `bw_report` block in its sidecar in addition to the existing `extensions.idf_report` (the raw parsed Thor payload).  Falls back gracefully (PDF renders from DB-only fields) if the adapter raises — logged as a warning rather than failing the ingest.
+
+### Companion releases
+
+- **Terra-View v0.13.0** ships in parallel — closes Phase 1 of the SFM integration.  The shared event-detail modal now renders the SFM event story (Chart.js waveform/histogram chart, inline PDF preview, `.TXT` download, FT/reviewer/notes review form) without operators needing to bounce to the standalone SFM webapp on port 8200.  Uses only existing seismo-relay endpoints — no API changes here, just better consumption.
+
+### Migration / Operations
+
+No DB migration needed.  Existing Thor events already in the store don't automatically pick up the new `bw_report` block — they'd need a re-ingest (post the IDF binary + paired `.TXT` back to `/db/import/idf_file`) for the adapter to run.  Alternatively, run `scripts/backfill_sidecars.py --reparse-txt` after a small adapter change (the script currently only re-runs the BW ASCII parser; extending it to handle Thor would be a small follow-up).
+
+```bash
+cd /home/serversdown/terra-view
+docker compose build sfm && docker compose up -d sfm
+```
+
+The bumped `TOOL_VERSION = "0.21.0"` in `minimateplus/event_file_io.py` means any subsequent `backfill_sidecars.py --force` pass will re-write sidecars with the new version stamp; that's expected and harmless.
+
+---
+
+## v0.20.0 — 2026-05-28
+
+The "PDF + parser polish" release.  Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events.  Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC.  Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.
+
+### Added (2026-05-28)
+
+- **Server-wide display timezone via `TZ` env var.**  Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod).  Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering.  DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one.  Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`).  Override per-deployment via the `TZ` line in `docker-compose.yml`.
+- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.**  BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72).  Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`.  Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag.  Flag rides through the sidecar's `bw_report` block.  Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table.  Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
+- **Per-channel ZC Freq surfaced in event modals.**  Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq.  Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table.  Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`).  Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
+- **`scripts/backfill_sidecars.py --reparse-txt` flag.**  Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events.  The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar.  Implies sidecar regeneration on every event (bypasses the sha/version skip).  No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27 .TXT-preservation rollout).  Idempotent.  Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing.  Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
+
+### Fixed (2026-05-28)
+
+- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.**  The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (it was inlined into `gather_report_data` without the seconds-numeric counterpart making it onto the dataclass).  Every histogram PDF render raised `AttributeError → 500`.  Waveform PDFs were unaffected.  Fix: add the field, read it from the projection's existing `bw_report.histogram.interval_size_s` key.
+- **Histogram PDF geo channels now share a single nice-quantized y-axis.**  Previously each geo subplot auto-scaled independently — Tran, Vert, and Long all showed different per-channel maxes, so bar heights weren't directly comparable across channels.  The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005.  Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label.  MicL stays on its own auto-scale (different units).  Matches BW's chart styling.
+
+### Docs (2026-05-28)
+
+- **Roadmap entry for a second undecoded histogram body sub-format.**  BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks.  Different from the existing `byte[5] != 0` roadmap entry (T190 / O121).  Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty.  Sample events captured in the roadmap entry for future RE work.
+
+### Migration / Operations
+
+- **Re-parse existing events to pick up the new parser fields.**  Run on whichever box hosts the live waveform store:
+  ```bash
+  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
+      --reparse-txt --skip-hdf5 --dry-run -v | tail
+  # Looks reasonable?  Run for real:
+  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
+      --reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
+  ```
+  Idempotent; safe to re-run.  Only touches sidecars on disk — no DB writes.
+- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks.  Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.
+
+### Pre-v0.20.0 work (rolled into this release)
+
+The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.
+
+#### Fixed
+
 - **bw_ascii_report parser now handles `OORANGE` saturation marker.**  BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale.  Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events.  Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE).  New behavior:
   - Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
   - Peak Vector Sum: substitute `sqrt(3) * geo_range_ips` (the theoretical max when all 3 channels are simultaneously at full-scale) + `peak_vector_sum_saturated` flag
@@ -16,7 +146,7 @@ All notable changes to seismo-relay are documented here.
   - Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
 - **bw_ascii_report parser handles `Peak Vector Sum TimeSum` typo'd label.**  Real BW output uses this misspelled label (Sum appended twice instead of "Peak Vector Sum Time").  Now accepted as an alias.  Confirmed against all three OORANGE example files — every one has the typo.

-### Added
+#### Added

 - **Histogram per-interval aggregation in `waveform.json`.**  Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output).  When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array.  `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings).  Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present.  Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
 - **`bw_ascii_report` parser now captures histogram-specific fields.**  Previously the parser dropped these fields silently (Roadmap item closed):
@@ -43,13 +173,13 @@ All notable changes to seismo-relay are documented here.
 - **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`.  Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file.  BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
 - **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars.  Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED.  Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.

-### Fixed
+#### Fixed

 - **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.**  Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict).  Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
 - **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.**  The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts).  Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; this bit us on prod on 2026-05-22 and was rolled back the same day.
 - **Thor IDF files no longer attempted as BW events in backfill.**  `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.

-### Docs
+#### Docs

 - **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes.  Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
 - **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
@@ -2,7 +2,7 @@

 Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
 managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
-(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
+(Sierra Wireless RV50 / RV55). Current version: **v0.21.0**.

 When new information about the protocol is discovered, please update `instantel_protocol_reference.md` with the findings in addition to this document.

@@ -73,6 +73,28 @@ should not import from `sfm/`, must not touch a DB, and have no I/O
 beyond reading files passed as arguments.  Keep them pure — both
 tiers can then depend on them without circularity.

+#### Thor IDF binary codec (2026-05-28)
+
+`micromate/idf_file.read_idf_file()` decodes both Thor IDFW
+(waveform) and IDFH (histogram) binaries.
+
+- **IDFW** reuses `decode_waveform_v2()` on the body at fixed file
+  offset `0x0f1f`.  Sample fidelity is 87–99% byte-exact on quiet
+  events; loud events hit the BW codec's known walker-stops-early
+  limitation.
+- **IDFH** has its own segment-based decoder: `[len_be][0a 00 00 00]
+  [00 NN][05 3f]` + N × 72-byte interval records (4 × 16-byte
+  per-channel min/max/halfp).  All 859 Thor IDFH corpus files
+  decode (181,071 intervals); peak matches sidecar within ~1.8%
+  (ADC quantization).
+
+The two outlier `BE9439_*` files in the Thor example corpus are
+actually Series III Blastware binaries that share the `.IDFW`/`.IDFH`
+filename convention by accident.  `read_idf_file()` detects them by
+their BW STRT signature and raises NotImplementedError pointing
+callers at `read_blastware_file()`.  See
+`docs/idf_protocol_reference.md` for full field layouts.
+
 ### Practical consequences

 When deciding where new code goes, ask:
@@ -1,4 +1,4 @@
-# seismo-relay  `v0.19.0`
+# seismo-relay  `v0.21.0`

 A ground-up replacement for **Blastware** — Instantel's aging Windows-only
 software for managing seismographs.  Supports both the **MiniMate Plus
@@ -35,6 +35,25 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
 > and storage layer dispatch deterministically instead of sniffing
 > filenames.  Self-applying migration backfills existing rows from the
 > binary filename extension.
+> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
+> started in v0.17.x: histogram layouts render correctly against BW
+> reference PDFs, the ASCII parser handles real-world edge cases
+> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
+> Freq is surfaced in both modals (event browser + main webapp).
+> Adds a server-wide `TZ` env var so operator-visible timestamps
+> render in local time instead of UTC.  New
+> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
+> applied retroactively to existing events without re-forwarding,
+> using the `.TXT` files preserved at ingest time.
+> **v0.21.0 (2026-05-29)** is the Thor / Series IV decoder release —
+> `micromate/idf_file.read_idf_file()` now decodes both IDFW
+> (waveform) and IDFH (histogram) binaries (87–99% sample fidelity
+> on quiet IDFW events; all 859 IDFH corpus files decode cleanly).
+> A new `micromate/idf_to_bw_report.py` adapter projects parsed
+> Thor reports into the BW-shaped sidecar block, so Thor events
+> flow through the existing Event Report PDF pipeline without a
+> separate renderer.  Terra-View v0.13.0 ships in parallel and
+> closes Phase 1 of the SFM integration — see its CHANGELOG.
 > See [CHANGELOG.md](CHANGELOG.md) for full version history.

 ---
@@ -58,7 +77,8 @@ seismo-relay/
 ├── micromate/                 ← Series IV (Micromate / Thor) client library (NEW v0.19)
 │   ├── models.py              ←   IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
 │   ├── idf_ascii_report.py    ←   Parse Thor .IDFW.txt / .IDFH.txt event sidecars
-│   └── idf_file.py            ←   Stub for the .IDFW / .IDFH binary codec (reverse-engineering pending)
+│   ├── idf_file.py            ←   Binary codec for .IDFW + .IDFH (v0.21.0+)
+│   └── idf_to_bw_report.py    ←   Adapter projecting Thor IDF into the BW report shape (v0.21.0+)
 │
 ├── sfm/                       ← SFM REST API server (FastAPI, port 8200)
 │   ├── server.py              ←   Live device endpoints + DB query + ingest endpoints + caching
@@ -415,7 +435,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
 - [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
 - [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
 - [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
- [ ] Binary `.IDFW` / `.IDFH` codec — pending (see Roadmap + [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md))
+- [x] Binary `.IDFW` / `.IDFH` codec — ✅ v0.21.0.  IDFW reuses `decode_waveform_v2()` on the body at offset `0x0f1f` (87–99% sample fidelity on quiet events); IDFH has a dedicated segment-based decoder (all 859 corpus files decode, 181,071 intervals total).  See `micromate/idf_file.py` + `docs/idf_protocol_reference.md`.
 - [ ] Live-device protocol — pending codec

 **Data persistence:**
@@ -528,7 +548,7 @@ Implementation steps (concrete):
 ### High-impact (unblocks product features)

 - [ ] **Series III waveform body codec reverse-engineering.**  The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`).  Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open.  Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise.  Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
- [ ] **Series IV (Thor IDF) binary codec reverse-engineering.**  `.IDFH` / `.IDFW` files are currently stored opaquely by `WaveformStore.save_imported_idf`, with all metadata sourced from the paired `.txt` sidecar.  This works because thor-watcher forwards both files together, but operators who haven't enabled Thor's TXT exporter get rows with NULL peaks.  Cracking the binary closes that gap and unlocks waveform display.  Starting-point reference at [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) — two observed file signatures (1,012 newer-firmware files + 2 old files whose layout matches the Series III STRT-record format), suggested first-session plan (~2-4 hrs), 1,014 paired binary+txt files available as ground truth in `thor-watcher/example-data/`.  Code seam ready at `micromate/idf_file.py`.
+- [x] **Series IV (Thor IDF) binary codec reverse-engineering.** ✅ v0.21.0 — `micromate/idf_file.read_idf_file()` decodes both IDFW (waveform body at offset `0x0f1f`, reusing `decode_waveform_v2()`; 87–99% sample fidelity on quiet events) and IDFH (dedicated segment-based decoder: all 859 corpus files decode, 181,071 intervals, peaks within ~1.8% of sidecar values).  `WaveformStore.save_imported_idf` now also projects parsed Thor data into a `bw_report` block via `micromate/idf_to_bw_report.py` so Thor events render in the existing Event Report PDF pipeline without a separate renderer.
 - [ ] **In-app waveform viewer accuracy.**  Depends on Series III codec decode.  Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples.  Series IV waveforms come online when the IDF codec lands.
 - [ ] **Series IV live-device support.**  Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
 - [ ] **Terra-view integration** — seismo-relay router, unit detail page, VISON-style event listing.
@@ -536,10 +556,10 @@ Implementation steps (concrete):

 ### BW ASCII report parser enhancements (built in v0.16.0)

- [ ] **PPV field misses on certain TXT formats.**  Discovered 2026-05-22 during the histogram-codec backfill validation: a handful of events (5 in prod) have a `bw_report` block where `peaks.{tran,vert,long}.ppv_ips` and `peaks.vector_sum.ips` are all `None`, despite the parser correctly extracting every OTHER field for the same channels (zc_freq_hz, time_of_peak_s, peak_accel_g, peak_disp_in).  Symptom on the DB side: `peak_vector_sum=0` after a `--force` backfill that overlays from the parsed bw_report dict.  Affected events on prod include `T190LD5Q.LK0W`, `T438L713.RY0W`, `K557L3YM.OE0W`.  Root cause likely a regex or format mismatch for the "PPV" header line in those specific firmware/event-type outputs.  Once fixed, re-forwarding the events from series3-watcher will re-populate the `bw_report` blocks correctly.
- [ ] **Histogram-specific structural fields.**  Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds).
+- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value.  Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag.  All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
+- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now.  Land in the sidecar's `bw_report.histogram` block.
 - [ ] **Histogram interval bin-table parsing.**  Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed.  Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
- [ ] **`>100 Hz` value parsing.**  Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information).
+- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag.  PDF + both modals render `>100 Hz` instead of `—`.

 ### Ingestion gaps

@@ -0,0 +1,65 @@
+"""Run read_idf_file across the corpus and report per-channel accuracy vs sidecars."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def sidecar_path(idfw: Path) -> Path:
+    return idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
+    files.sort()
+    GEO_LSB = 0.0003
+
+    n_ok = n_skip = 0
+    overall = {"Tran": [], "Vert": [], "Long": []}
+
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            n_skip += 1
+            continue
+        sc_path = sidecar_path(f)
+        if not sc_path.exists():
+            n_skip += 1
+            continue
+        try:
+            sc = load_sidecar_samples(sc_path)
+        except Exception:
+            n_skip += 1
+            continue
+
+        per_file = {}
+        for ch in ("Tran", "Vert", "Long"):
+            sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+            dec = res.samples.get(ch, [])
+            n = min(len(sc_counts), len(dec))
+            if n == 0:
+                per_file[ch] = 0.0
+                continue
+            exact = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+            pct = 100.0 * exact / n
+            per_file[ch] = pct
+            overall[ch].append(pct)
+        n_ok += 1
+
+    print(f"Processed {n_ok} files (skipped {n_skip})")
+    print("Per-channel exact-match % (mean / min / max):")
+    for ch, vals in overall.items():
+        if vals:
+            avg = sum(vals) / len(vals)
+            print(f"  {ch}: mean={avg:.2f}%  min={min(vals):.2f}%  max={max(vals):.2f}%  n={len(vals)}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,49 @@
+"""Find where decoded-vs-sidecar diverges for each channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Compare only the overlapping range so a length mismatch cannot raise IndexError.
+        n = min(len(dec), len(sc_counts))
+        first_diff = next((i for i in range(n) if dec[i] != sc_counts[i]), None)
+        if first_diff is None:
+            print(f"{ch}: NO MISMATCHES")
+            continue
+        print(f"{ch}: first diff at idx {first_diff}")
+        # Show 3 samples before the first diff and 7 after it
+        for i in range(max(0, first_diff - 3), min(n, first_diff + 8)):
+            mark = "  " if dec[i] == sc_counts[i] else "**"
+            print(f"  {mark} idx {i:4d}: sc={sc_counts[i]:6d}  dec={dec[i]:6d}  diff={dec[i]-sc_counts[i]:+d}")
+        # Count total mismatches and track the longest run of consecutive matches
+        cum_match_run = 0
+        max_match_run = 0
+        diff_count = 0
+        for i in range(n):
+            if dec[i] == sc_counts[i]:
+                cum_match_run += 1
+                max_match_run = max(max_match_run, cum_match_run)
+            else:
+                cum_match_run = 0
+                diff_count += 1
+        print(f"  total mismatches: {diff_count}/{n}, longest run of matches: {max_match_run}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""End-to-end IDFH ingest verification."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== save_imported_idf (IDFH) ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  filename:      {rec['filename']}")
+        print(f"  filesize:      {rec['filesize']}")
+        print(f"  h5:            {rec['hdf5_filename']}")  # expect None for histogram
+        print(f"  sidecar:       {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print()
+        # Inspect sidecar to confirm intervals were stashed
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        intervals = sc.get("extensions", {}).get("idf_intervals", [])
+        print(f"  sidecar intervals: {len(intervals)}")
+        if intervals:
+            print(f"  first interval:    {intervals[0]}")
+            print(f"  last interval:     {intervals[-1]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Verify the had_report=False path: ingest IDFW with no .txt."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=None,        # ← no .txt!
+        )
+        print("=== IDFW without .txt ingest ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        nT = len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0
+        nV = len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0
+        nL = len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0
+        nM = len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0
+        print(f"  raw_samples:   Tran={nT} Vert={nV} Long={nL} MicL={nM}")
+        if ev.peak_values:
+            print(f"  peak_values:   tran={ev.peak_values.tran} vert={ev.peak_values.vert} long={ev.peak_values.long}")
+        print(f"  h5 written:    {rec['hdf5_filename']}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,102 @@
+"""End-to-end Thor report PDF rendering.
+
+Ingests an IDFW + .txt via save_imported_idf, runs gather_report_data
+(faking a minimal DB row), and renders the PDF to disk.
+"""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    """Stand-in for SeismoDb.get_event(); the renderer only needs a few cols."""
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        # Verify sidecar has bw_report block
+        sc_path = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        print(f"  bw_report.available: {bw.get('available')}")
+        print(f"  bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+        print(f"  bw_report.mic.pspl_dbl: {bw.get('mic', {}).get('pspl_dbl')}")
+        print(f"  bw_report.histogram.n_intervals: {bw.get('histogram', {}).get('n_intervals')}")
+
+        # Build a DB-row-shaped dict from the Event for gather_report_data
+        import datetime
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM11719",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Waveform",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
+        print()
+        print("=== ReportData ===")
+        print(f"  event_id:           {rd.event_id}")
+        print(f"  serial:             {rd.serial}")
+        print(f"  record_type:        {rd.record_type}")
+        print(f"  event_datetime:     {rd.event_datetime_str}")
+        print(f"  trigger:            {rd.trigger_source}")
+        print(f"  geo_range:          {rd.geo_range_str}")
+        print(f"  sample_rate:        {rd.sample_rate_str}")
+        print(f"  firmware:           {rd.firmware}")
+        print(f"  calibration:        {rd.calibration_date} by {rd.calibration_by}")
+        print(f"  battery:            {rd.battery_volts}")
+        print(f"  PVS:                {rd.peak_vector_sum_ips} in/s at {rd.peak_vector_sum_time_s} sec")
+        print(f"  mic_pspl_dbl:       {rd.mic_pspl_dbl}")
+        print(f"  mic_zc_freq_hz:     {rd.mic_zc_freq_hz}")
+        print(f"  channel_stats:      {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} ToP={cs['time_of_peak_s']} Acc={cs['peak_accel_g']} Disp={cs['peak_disp_in']} Test={cs['sensor_check']}")
+
+        # Render the PDF
+        out_path = REPO / "analysis_idf" / "thor_report.pdf"
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,91 @@
+"""End-to-end Thor IDFH histogram report PDF rendering."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+import datetime
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    # Use the multi-interval IDFH (81 + trigger row)
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        hist = bw.get("histogram", {})
+        print(f"  bw_report.histogram.start:           {hist.get('start')}")
+        print(f"  bw_report.histogram.stop:            {hist.get('stop')}")
+        print(f"  bw_report.histogram.n_intervals:     {hist.get('n_intervals')}")
+        print(f"  bw_report.histogram.interval_size:   {hist.get('interval_size')}")
+        print(f"  bw_report.histogram.interval_size_s: {hist.get('interval_size_s')}")
+        print(f"  bw_report.peaks.tran.ppv_ips:        {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM13981",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Histogram",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="hist-1")
+
+        print()
+        print("=== ReportData (histogram) ===")
+        print(f"  is_histogram:           {rd.is_histogram}")
+        print(f"  histogram_start:        {rd.histogram_start_str}")
+        print(f"  histogram_stop:         {rd.histogram_stop_str}")
+        print(f"  histogram_n_intervals:  {rd.histogram_n_intervals}")
+        print(f"  histogram_interval_size:{rd.histogram_interval_size}")
+        print(f"  histogram_interval_times[:3]: {rd.histogram_interval_times[:3]}")
+        print(f"  histogram_interval_times[-2:]: {rd.histogram_interval_times[-2:]}")
+        print(f"  channel_stats: {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} peak_date={cs['peak_date']} peak_time={cs['peak_time']}")
+
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path = REPO / "analysis_idf" / "thor_report_idfh.pdf"
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,52 @@
+"""End-to-end ingest test: feed an IDFW + .txt to save_imported_idf in a tmp store."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+import shutil
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    txt  = idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== Save result ===")
+        print(f"  serial:    {rec['serial']}")
+        print(f"  filename:  {rec['filename']}")
+        print(f"  filesize:  {rec['filesize']}")
+        print(f"  h5:        {rec['hdf5_filename']}")
+        print(f"  sidecar:   {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  serial:        {getattr(ev, 'serial', '(n/a)')}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        print(f"  raw_samples:   Tran={len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0}, Vert={len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0}, Long={len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0}, MicL={len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0}")
+        if ev.peak_values:
+            print(f"  peaks (txt):   Tran={ev.peak_values.tran} Vert={ev.peak_values.vert} Long={ev.peak_values.long}")
+        print()
+
+        # Verify the h5 file actually got written
+        h5path = Path(td) / "UM11719" / f"{idfw.name}.h5"
+        print(f"  h5 exists:     {h5path.exists()}  size={h5path.stat().st_size if h5path.exists() else 0}")
+        sidecar = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        print(f"  sidecar exists:{sidecar.exists()}  size={sidecar.stat().st_size if sidecar.exists() else 0}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,137 @@
+"""Decode IDFH histogram intervals + verify against sidecar."""
+from __future__ import annotations
+import sys
+import struct
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+SEGMENT_MAGIC = b"\x02\xda\x0a\x00\x00\x00"
+SEGMENT_SIZE = 732   # = 10-byte header + 10 × 72-byte intervals + 2-byte tail
+INTERVAL_SIZE = 72
+CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+def decode_interval(buf72: bytes) -> dict:
+    """Decode one 72-byte interval into per-channel min/max/halfp."""
+    out = {}
+    for i, ch in enumerate(CHANNELS):
+        block = buf72[i*16 : (i+1)*16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        sb = struct.unpack_from(">h", block, 4)[0]
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        f10 = struct.unpack_from(">H", block, 10)[0]
+        f14 = struct.unpack_from(">H", block, 14)[0]
+        peak_count = max(abs(mn), abs(mx))
+        out[ch] = {
+            "min":     mn,
+            "max":     mx,
+            "field4":  sb,
+            "halfp":   halfp,
+            "field10": f10,
+            "field14": f14,
+            "peak":    peak_count,
+            "freq_hz": (512.0 / halfp) if halfp > 5 else None,
+        }
+    out["_tail"] = buf72[64:].hex(" ")
+    return out
+
+
+def walk_idfh(buf: bytes) -> list:
+    """Walk all interval records in an IDFH file."""
+    intervals = []
+    # Multi-segment file: every 02 da 0a 00 00 00 marker introduces a segment.
+    # Single-interval file: just one body header at 0xf96 of form ?? ?? 0a 00 00 00.
+    # Find them all.
+    i = 0
+    while True:
+        j = buf.find(b"\x0a\x00\x00\x00", i)
+        if j < 0:
+            break
+        # Validate the candidate: the 2 bytes before it must hold a big-endian
+        # length, and the bytes after it must match the body-header shape
+        # "00 NN 05 3f" (checked below).
+        if j < 2:
+            i = j + 1
+            continue
+        # Body header form: [length_be_2][0a 00 00 00][00 NN][05 3f]
+        if j + 10 > len(buf):
+            break
+        length = int.from_bytes(buf[j-2:j], "big")
+        # Verify the segment-marker shape: [length_be][0a 00 00 00][00 NN][05 3f]
+        if buf[j+4] != 0x00:
+            i = j + 1
+            continue
+        if buf[j+6:j+8] != b"\x05\x3f":
+            i = j + 1
+            continue
+        # Header layout (10 bytes): [length_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+        # Followed by N interval records of 72 bytes each, then 2 tail bytes.
+        # length value = (N × 72) + 10  (counts bytes from 0x0a... through interval data).
+        header_start = j - 2
+        n_intervals = (length - 10) // INTERVAL_SIZE
+        interval_start = header_start + 10
+        for k in range(n_intervals):
+            off = interval_start + k * INTERVAL_SIZE
+            if off + INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off:off + INTERVAL_SIZE]
+            intervals.append({"offset": off, **decode_interval(chunk)})
+        i = header_start + length + 2
+    return intervals
+
+
+def main():
+    # Test against multi-segment IDFH
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    sc_path = target.parent / "TXT" / f"{target.name}.txt"
+    buf = target.read_bytes()
+    intervals = walk_idfh(buf)
+    print(f"=== {target.name} ===")
+    print(f"  file size: {len(buf)}")
+    print(f"  decoded intervals: {len(intervals)}")
+    # Collect the sidecar's per-interval rows (each starts with the year of its date stamp)
+    sc_rows = []
+    for line in sc_path.read_text(errors="replace").splitlines():
+        if line.startswith("2022-") or line.startswith("2023-"):
+            sc_rows.append(line)
+    print(f"  sidecar rows: {len(sc_rows)}")
+
+    print()
+    for k in [0, 1, 78, 79, 80]:
+        if k >= len(intervals):
+            continue
+        iv = intervals[k]
+        print(f"--- interval {k} @0x{iv['offset']:04x} ---")
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        # sidecar row
+        if k < len(sc_rows):
+            print(f"  SC: {sc_rows[k]}")
+
+    # Test single-interval IDFH
+    print()
+    target2 = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    sc2 = target2.parent / "TXT" / f"{target2.name}.txt"
+    buf2 = target2.read_bytes()
+    intervals2 = walk_idfh(buf2)
+    print(f"=== {target2.name} ===")
+    print(f"  file size: {len(buf2)}, decoded intervals: {len(intervals2)}")
+    if intervals2:
+        iv = intervals2[0]
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        sc_rows2 = [l for l in sc2.read_text(errors='replace').splitlines() if l.startswith("2023-")]
+        if sc_rows2:
+            print(f"  SC: {sc_rows2[0]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,41 @@
+"""Find the IDFH interval record size by scoring consistently-zero byte columns."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+from collections import Counter
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    buf = target.read_bytes()
+    body_start = 0xF96
+    body_end   = 0x270C
+    body = buf[body_start:body_end]
+    print(f"body size: {len(body)} bytes (file {len(buf)} bytes)")
+
+    # For each candidate interval size, count how many bytes at fixed offsets within
+    # each interval are zero (consistent column-zero pattern indicates correct size).
+    print()
+    print("=== zero-column score by interval size (higher = more likely) ===")
+    best = []
+    for sz in range(16, 100):
+        n = len(body) // sz
+        if n < 30:
+            continue
+        # For each column position within an interval, count how many of n intervals have zero
+        score = 0
+        for col in range(sz):
+            zeros = sum(1 for i in range(n) if body[i*sz + col] == 0)
+            if zeros >= n * 0.9:
+                score += 1
+        best.append((score, sz, n))
+    best.sort(reverse=True)
+    for score, sz, n in best[:10]:
+        print(f"  size={sz:3d}  n_intervals={n}  consistently-zero-cols={score}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Per-file accuracy + sample-count details."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = sorted([f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")])
+    GEO_LSB = 0.0003
+    # Limit to the first 20 successful files for detail.
+    shown = 0
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            continue
+        sc_path = f.parent / "TXT" / f"{f.name}.txt"
+        if not sc_path.exists():
+            continue
+        sc = load_sidecar_samples(sc_path)
+        sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+        dec = res.samples.get("Tran", [])
+        n = min(len(sc_tran), len(dec))
+        exact = sum(1 for i in range(n) if sc_tran[i] == dec[i]) if n else 0
+        pct = 100.0 * exact / n if n else 0.0
+        print(f"{f.name:40s}  size={f.stat().st_size:6d}  sc_n={len(sc_tran):4d}  dec_n={len(dec):4d}  exact={pct:.1f}%")
+        shown += 1
+        if shown >= 20:
+            break
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,64 @@
+"""Look at what's at the divergence boundary."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import walk_body, find_data_start, parse_segment_header
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    body = buf[0x0f1f:]
+    start = find_data_start(body)
+    print(f"data_start: {start}  (= file offset 0x{0x0f1f + start:04x})")
+
+    blocks = walk_body(body, start)
+    print(f"{len(blocks)} blocks total")
+    print()
+
+    # First 30 blocks
+    print("=== first 30 blocks ===")
+    for i, b in enumerate(blocks[:30]):
+        body_off = 0x0f1f + b.offset
+        if b.tag_hi == 0x40:
+            hdr = parse_segment_header(b)
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  (segment header)  counter={hdr['counter'] if hdr else '?'}  field2={hdr['field2'].hex() if hdr else '?'}  anchor={hdr['anchor_bytes'].hex() if hdr else '?'}  tail={hdr['tail'].hex() if hdr else '?'}")
+        else:
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  len={b.length}  data={b.data[:16].hex()}")
+    print()
+
+    # Cumulative sample counts per block to find which block contains sample 254
+    print("=== cumulative samples through blocks ===")
+    cur_ch = "Tran"
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    seg_count = 0
+    samples_in_curseg = 2  # preamble Tran[0], Tran[1]
+    for i, b in enumerate(blocks[:30]):
+        if b.tag_hi == 0x40:
+            seg_count += 1
+            prev_ch = cur_ch
+            cur_ch = rotation[(seg_count - 1) % 4]
+            print(f"  [{i:3d}] 40 02 -> end of {prev_ch} segment, start {cur_ch} (segment {seg_count})")
+            samples_in_curseg = 2  # anchors
+        elif (b.tag_hi & 0xF0) == 0x10:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} nibble: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif (b.tag_hi & 0xF0) == 0x20:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} int8: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x00:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} RLE: +{b.tag_lo}, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x30:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} packed12: +{b.tag_lo} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,89 @@
+"""Reconnaissance helpers for cracking the Thor IDFW binary."""
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+TARGET = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+TXT = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/TXT/UM11719_20231219162723.IDFW.txt"
+
+
+def hex_at(buf: bytes, off: int, n: int = 32) -> str:
+    chunk = buf[off : off + n]
+    hexs = " ".join(f"{b:02x}" for b in chunk)
+    asc = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
+    return f"{off:04x}: {hexs}  {asc}"
+
+
+def find_all(buf: bytes, needle: bytes) -> list[int]:
+    out: list[int] = []
+    i = 0
+    while True:
+        j = buf.find(needle, i)
+        if j < 0:
+            break
+        out.append(j)
+        i = j + 1
+    return out
+
+
+def load_sidecar_samples(path: Path) -> dict[str, list[float]]:
+    """Parse the txt sample table — Tran/Vert/Long/MicL."""
+    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    in_block = False
+    for line in path.read_text(errors="replace").splitlines():
+        if not in_block:
+            if line.strip() == "Waveform Data Channels":
+                in_block = True
+            continue
+        if line.startswith("Waveform Data USB Channels"):
+            break
+        parts = line.split("\t")
+        # First row is the header "\tTran\tVert\tLong\tMicL"
+        if len(parts) >= 5 and parts[1] == "Tran":
+            continue
+        if len(parts) < 5:
+            continue
+        try:
+            out["Tran"].append(float(parts[1]))
+            out["Vert"].append(float(parts[2]))
+            out["Long"].append(float(parts[3]))
+            out["MicL"].append(float(parts[4]))
+        except ValueError:
+            continue
+    return out
+
+
+def main():
+    buf = TARGET.read_bytes()
+    samples = load_sidecar_samples(TXT)
+    print(f"file size: {len(buf)} bytes")
+    print(f"sample rows: Tran={len(samples['Tran'])} Vert={len(samples['Vert'])} Long={len(samples['Long'])} MicL={len(samples['MicL'])}")
+    print(f"first 6 Tran samples: {samples['Tran'][:6]}")
+    print(f"first 6 Vert samples: {samples['Vert'][:6]}")
+    print(f"first 6 Long samples: {samples['Long'][:6]}")
+    print(f"first 6 MicL samples: {samples['MicL'][:6]}")
+
+    print()
+    print("=== BW magic '00 02 00' positions ===")
+    hits = find_all(buf, b"\x00\x02\x00")
+    print(f"{len(hits)} hits")
+    for h in hits[:20]:
+        print(hex_at(buf, h, 24))
+
+    print()
+    print("=== '40 02' segment-header positions ===")
+    hits = find_all(buf, b"\x40\x02")
+    print(f"{len(hits)} hits")
+    for h in hits:
+        ctx_pre = buf[max(0, h - 4): h].hex()
+        ctx_post = buf[h: h + 20].hex()
+        # Show the preceding bytes to help tell real headers from coincidental matches
+        print(f"  0x{h:04x}  pre={ctx_pre}  post={ctx_post}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Find each segment boundary in the channel and check if errors reset there."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Record every flip between exact match and mismatch:
+        # DIVERGE when the decoder leaves the sidecar, RESYNC when it returns.
+        n = min(len(sc_counts), len(dec))
+        events = []
+        prev_match = True
+        for i in range(n):
+            match = sc_counts[i] == dec[i]
+            if match != prev_match:
+                kind = "RESYNC" if match else "DIVERGE"
+                events.append((i, kind, sc_counts[i], dec[i]))
+                prev_match = match
+        print(f"{ch}: {len(events)} transitions")
+        for i, kind, sc_v, dec_v in events[:20]:
+            print(f"  idx {i:4d}  {kind:8s}  sc={sc_v:6d}  dec={dec_v:6d}  diff={dec_v-sc_v:+d}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,46 @@
+"""Smoke-test read_idf_file on IDFH across the corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature:   {result.signature}")
+    print(f"  serial:      {ev.serial}")
+    print(f"  timestamp:   {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  kind:        {ev.kind}")
+    print(f"  intervals:   {len(result.intervals or [])}")
+    print(f"  peaks:       T={ev.peaks.transverse_ips:.4f} V={ev.peaks.vertical_ips:.4f} L={ev.peaks.longitudinal_ips:.4f}")
+    print()
+
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = list(root.rglob("*.IDFH"))
+    ok = fail = nyi = 0
+    total_intervals = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+            total_intervals += len(r.intervals or [])
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 3:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print(f"Corpus: {len(files)} IDFH files | ok={ok} fail={fail} nyi={nyi}")
+    print(f"Total intervals decoded: {total_intervals}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""Smoke-test read_idf_file across the sample corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file, geo_count_to_ips, mic_count_to_psi
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature: {result.signature}")
+    print(f"  serial:    {ev.serial}")
+    print(f"  timestamp: {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  record_time: {ev.record_time_sec}")
+    print(f"  calibration: {result.binary_metadata.calibration_date}")
+    print(f"  Tran samples: {len(result.samples['Tran'])}, peak_ips={ev.peaks.transverse_ips:.4f}")
+    print(f"  Vert samples: {len(result.samples['Vert'])}, peak_ips={ev.peaks.vertical_ips:.4f}")
+    print(f"  Long samples: {len(result.samples['Long'])}, peak_ips={ev.peaks.longitudinal_ips:.4f}")
+    print(f"  MicL samples: {len(result.samples['MicL'])}")
+    print()
+
+    # Corpus sweep
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
+    ok = fail = nyi = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 5:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print()
+    print(f"Corpus: {len(files)} IDFW files | ok={ok} fail={fail} not-implemented={nyi}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,47 @@
+"""Verify build_bw_report_from_idf against a known sidecar."""
+from __future__ import annotations
+import json
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_ascii_report import parse_idf_report
+from micromate.idf_to_bw_report import build_bw_report_from_idf
+from micromate.idf_file import read_idf_file
+
+
+def show(prefix: str, d: dict, indent: int = 0):
+    for k, v in d.items():
+        if isinstance(v, dict):
+            print(f"{'  '*indent}{prefix}{k}:")
+            show("", v, indent + 1)
+        else:
+            print(f"{'  '*indent}{prefix}{k}: {v!r}")
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    report_dict = parse_idf_report(txt.read_text(errors="replace"))
+    res = read_idf_file(idfw)
+    bw = build_bw_report_from_idf(report_dict, binary_md=res.binary_metadata)
+
+    print("=== IDFW → bw_report ===")
+    show("", bw)
+
+    print()
+    print("=== IDFH (single trigger row) ===")
+    idfh = base / "UM11719_20231219162648.IDFH"
+    txt_h = base / "TXT" / f"{idfh.name}.txt"
+    rh = parse_idf_report(txt_h.read_text(errors="replace"))
+    res_h = read_idf_file(idfh)
+    bw_h = build_bw_report_from_idf(rh, binary_md=res_h.binary_metadata, intervals=res_h.intervals)
+    show("", bw_h)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,73 @@
+"""Trace Tran sample-by-sample to find exactly where the codec drifts."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def s4(n: int) -> int:
+    return n if n < 8 else n - 16
+
+
+def i8(b: int) -> int:
+    return b if b < 128 else b - 256
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    GEO_LSB = 0.0003
+    sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+
+    body = buf[0x0f1f:]
+    # Tran[0], Tran[1] from preamble
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+    print(f"preamble Tran[0]={t0}  Tran[1]={t1}  (sidecar: {sc_tran[0]}, {sc_tran[1]})")
+
+    # Block 0: 10 f8 at body[7:9]
+    print(f"block 0: tag {body[7]:02x} {body[8]:02x}")
+    print(f"  block 0 first 10 data bytes: {body[9:19].hex()}")
+
+    # Walk block 0 manually, comparing each sample
+    cur = t1
+    samples = [t0, t1]
+    block_off = 7
+    nn = body[8]
+    print(f"  NN = {nn}")
+    data = body[9 : 9 + nn // 2]
+    for byi, byte in enumerate(data):
+        for nib_idx, nib in enumerate(((byte >> 4) & 0xF, byte & 0xF)):
+            cur += s4(nib)
+            samples.append(cur)
+            idx = len(samples) - 1
+            if 0 <= idx < len(sc_tran):
+                sc_v = sc_tran[idx]
+                match = "✓" if sc_v == cur else "✗"
+                if idx < 12 or 240 <= idx <= 260:
+                    print(f"    idx {idx:3d}: nibble byte={byte:02x} nib={nib:x} delta={s4(nib):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+    print(f"end of block 0: cur={cur}, len(samples)={len(samples)}, decoder expected 250 here")
+    # Block 1 (expected tag 20 28) starts right after block 0's data: body offset 9 + 124 = 133 when NN = 0xf8.
+    block1_off = 9 + nn // 2
+    print(f"block 1: tag {body[block1_off]:02x} {body[block1_off+1]:02x} (expecting 20 28)")
+    nn1 = body[block1_off + 1]
+    print(f"  block 1 NN = {nn1}")
+    data1 = body[block1_off + 2 : block1_off + 2 + nn1]
+    for byi, byte in enumerate(data1):
+        cur += i8(byte)
+        samples.append(cur)
+        idx = len(samples) - 1
+        if idx < len(sc_tran):
+            sc_v = sc_tran[idx]
+            match = "✓" if sc_v == cur else "✗"
+            if 248 <= idx <= 295:
+                print(f"    idx {idx:3d}: int8 byte={byte:02x} delta={i8(byte):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,42 @@
+"""Feed candidate body offsets to the BW codec and compare with sidecar."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    # Sidecar samples in 0.0003 counts (Thor geo LSB).
+    sc_tran = [int(round(v / 0.0003)) for v in sc["Tran"][:30]]
+    sc_vert = [int(round(v / 0.0003)) for v in sc["Vert"][:30]]
+    sc_long = [int(round(v / 0.0003)) for v in sc["Long"][:30]]
+    sc_micl = [int(round(v / 1e-6)) for v in sc["MicL"][:30]]  # 1 µ unit for mic? Will iterate.
+    print(f"sidecar Tran (counts): {sc_tran}")
+    print(f"sidecar Vert (counts): {sc_vert}")
+    print(f"sidecar Long (counts): {sc_long}")
+    print(f"sidecar MicL (×1e-6):  {sc_micl}")
+    print()
+
+    # Try candidate body start offsets.
+    for off in (0x0f1f, 0x1057, 0x11f1, 0x1333, 0x1bde, 0x0d30):
+        print(f"=== body @ 0x{off:04x} ===")
+        body = buf[off:]
+        decoded = decode_waveform_v2(body)
+        if not decoded:
+            print("  decode_waveform_v2 returned None")
+            continue
+        for ch in ("Tran", "Vert", "Long", "MicL"):
+            arr = decoded.get(ch, [])
+            print(f"  {ch}[{len(arr)}]: {arr[:20]}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,51 @@
+"""Verify decode_waveform_v2 against sidecar across all 2304 samples per channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    body = buf[0x0f1f:]
+    decoded = decode_waveform_v2(body)
+
+    print(f"Sidecar lengths: Tran={len(sc['Tran'])} Vert={len(sc['Vert'])} Long={len(sc['Long'])} MicL={len(sc['MicL'])}")
+    print(f"Decoded lengths: Tran={len(decoded['Tran'])} Vert={len(decoded['Vert'])} Long={len(decoded['Long'])} MicL={len(decoded['MicL'])}")
+    print()
+
+    GEO_LSB = 0.0003  # in/s per count
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        n = min(len(sc_counts), len(dec))
+        matches = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+        first_mismatch = next((i for i in range(n) if sc_counts[i] != dec[i]), None)
+        print(f"{ch}: compared {n}, exact matches {matches} ({100*matches/n:.2f}%)")
+        if first_mismatch is not None:
+            i = first_mismatch
+            print(f"  first mismatch at idx {i}: sidecar={sc_counts[i]} ({sc[ch][i]}), decoded={dec[i]}")
+            print(f"  context sidecar[{max(0,i-2)}..{i+4}]: {sc_counts[max(0,i-2):i+5]}")
+            print(f"  context decoded[{max(0,i-2)}..{i+4}]: {dec[max(0,i-2):i+5]}")
+
+    # MicL: find the multiplicative factor that fits
+    print()
+    print("=== MicL scale analysis ===")
+    sc_micl = sc["MicL"]
+    dec_micl = decoded["MicL"]
+    # Skip zero values when computing ratio
+    ratios = [sc_micl[i] / dec_micl[i] for i in range(min(50, len(sc_micl), len(dec_micl))) if dec_micl[i] != 0]
+    if ratios:
+        avg = sum(ratios) / len(ratios)
+        print(f"  avg ratio sidecar/decoded over nonzero samples in first 50: {avg:.4e} (n={len(ratios)})")
+        print(f"  ratios sample: {[f'{r:.4e}' for r in ratios[:6]]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -6,11 +6,68 @@ Series IV event-file format.  Sibling to
 Series III "Rosetta Stone") — this doc holds what we know so far and
 the open questions still to crack.

-**Status (2026-05-20):** ASCII text sidecar fully decoded (1,014
-sample files round-trip).  Binary `.IDFH` / `.IDFW` codec
-**not yet implemented** — binaries are stored opaquely by
-`WaveformStore.save_imported_idf`, with metadata sourced from the
-paired `.txt` sidecar.
+**Status (2026-05-28):** ASCII text sidecar fully decoded (1,014
+sample files round-trip).  **Thor IDFW** binary now decodes via
+`micromate.idf_file.read_idf_file()` — reuses the BW segment-rotated
+block codec verbatim, body usually at offset `0x0f1f`; metadata (serial,
+timestamp, sample_rate, record_time, calibration_date) extracted from
+the binary header.  Sample fidelity is 87–99% byte-exact on quiet
+events; loud events hit the BW codec's known walker-stops-early
+limitation.  Residual ~3% drift on per-sample deltas (likely a
+Thor-specific 12-bit delta refinement not yet modelled).
+
+**Thor IDFH histograms also decoded.**  Body has one or more segments;
+each 10-byte segment header `[length_be 2B][0a 00 00 00][00 NN][05 3f]`
+introduces `N = (length - 10) // 72` interval records of 72 bytes
+each.  Each interval = 4 × 16-byte per-channel records + an 8-byte tail.
+Channel record: `[int16 min][int16 max][int16 ??][uint16 halfp][2B 00][uint16 ??][2B 00][uint16 ??]`.
+Geo peak `= max(|min|, |max|) / 32768 × 10` in/s (matches sidecar
+within ~1.8%); freq `= 512 / halfp` Hz (None for halfp ≤ 5 → ">100"
+sentinel).  Corpus: **all 859 Thor IDFH files decode, 181,071
+intervals**.  Wired through `read_idf_file()` →
+`save_imported_idf()` → sidecar's `extensions.idf_intervals`.
+
+**Note on the BE9439 outliers in the example corpus:** Two files
+(`BE9439_20200713131747.IDFW` and `BE9439_20200713124251.IDFH`) are
+**Series III Blastware** binaries, not Thor.  Provenance: TMI tried
+to use Thor to manage auto-call-homes for Series III units; the
+experiment didn't work out, but it did leave a few BW event files
+in Thor's per-serial directory structure with `.IDFW`/`.IDFH`
+extensions — Thor's forwarder applied its own naming convention to
+the BW bodies it was relaying.  Their header `10 00 01 80 00 00
+Instantel STRT ff fe <end_key> <start_key>` is the BW SUB 5A STRT
+record, not a Thor body preamble.  The reader detects them by
+signature and raises `NotImplementedError` pointing callers at
+`read_blastware_file()`, which extracts BW-format peaks from them.
+
+**Still NYI for Thor IDFH:** per-channel `int16 field4` (possibly
+time-of-peak); the two uint16 fields (probably PVS contributions);
+8-byte interval tail (PVS data); mic dB(L) exact conversion constant.
+
+### Codec breakthroughs (2026-05-28)
+
+- **Body offset is `0x0f1f`** in 151/154 corpus IDFW files.  It is
+  preceded by a 4-byte record-type marker (`46 00 00 00`) and opens
+  with the magic preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`.
+- **Sample stream is BW's segment-rotated block codec verbatim.**
+  Thor reuses `10 NN` (nibble), `20 NN` (int8), `00 NN` (RLE),
+  `30 NN` (packed12), `40 02` (segment header) tags with the same
+  semantics.  Channel rotation Tran→Vert→Long→MicL.
+- **Geo LSB = 0.0003 in/s** (not BW's 0.005), because Thor's 16-bit
+  ADC range maps to 10 in/s without the 16-count BW quantization step.
+- **Mic ≈ 2.14×10⁻⁶ psi/count** (rough scale; refine after channel
+  block calibration constants are decoded).
+- **BW compliance anchor `\xbe\x80\x00\x00\x00\x00` reappears at
+  IDFW offset 0x952** — sample_rate at anchor−6 (uint16 BE),
+  record_time at anchor+6 (float32 BE), same layout as BW.
+- **Event timestamp at offset 0x97A** — 8 bytes `[day][month]
+  [year_be][unk][hour][min][sec]`.  Stop-time mirrors at 0x982.
+- **Serial as null-terminated ASCII at 0x14E**.
+- **Calibration date** at 0x194–0x197 (day, month, year_be).
+- Per-sample residual drift of ~3% suggests Thor encodes int8/nibble
+  deltas with an extra refinement bit that BW doesn't carry —
+  unsolved; errors resync within a few samples so cumulative impact
+  is small.

 ---

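The IDFH interval layout described in the status notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual decoder: the names `decode_interval`, `CHANNELS`, `RECORD_SIZE`, `FULL_SCALE_IPS` and `HALFP_FREQ_NUM` are hypothetical, the sample record is synthetic rather than device data, and the unknown fields and the 8-byte tail are simply skipped.

```python
import struct
from typing import Dict, Optional

CHANNELS = ("Tran", "Vert", "Long", "MicL")
RECORD_SIZE = 72          # 4 x 16-byte channel records + 8-byte tail
FULL_SCALE_IPS = 10.0     # geo full scale in in/s, per the notes above
HALFP_FREQ_NUM = 512.0    # freq_hz = 512 / halfp


def decode_interval(record: bytes) -> Dict[str, dict]:
    """Decode one 72-byte IDFH interval into per-channel peak and frequency."""
    if len(record) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} bytes, got {len(record)}")
    out: Dict[str, dict] = {}
    for i, name in enumerate(CHANNELS):
        # Big-endian: int16 min, int16 max, int16 (role unknown), uint16 halfp.
        mn, mx, _unknown, halfp = struct.unpack_from(">hhhH", record, i * 16)
        peak_count = max(abs(mn), abs(mx))
        freq: Optional[float] = HALFP_FREQ_NUM / halfp if halfp > 5 else None
        out[name] = {
            "peak_count": peak_count,
            # Geo scaling; only meaningful for Tran/Vert/Long.
            "peak_ips": peak_count / 32768 * FULL_SCALE_IPS,
            # halfp <= 5 is the ">100 Hz" sentinel, reported as None.
            "freq_hz": freq,
        }
    return out


# Synthetic record (not real device data): Tran min=-100, max=80, halfp=32.
tran = struct.pack(">hhhH", -100, 80, 0, 32) + bytes(8)
record = tran + bytes(16 * 3) + bytes(8)
decoded = decode_interval(record)
print(decoded["Tran"])
```

With the synthetic values, the Tran peak is 100 counts, which scales to 100 / 32768 × 10 ≈ 0.0305 in/s, and the half-period of 32 gives 512 / 32 = 16 Hz. The all-zero channels fall under the `halfp ≤ 5` sentinel and report no frequency.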
@@ -210,8 +210,7 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
        "long_peak_acceleration",
        "tran_peak_displacement", "vert_peak_displacement",
        "long_peak_displacement",
-        "tran_time_of_peak", "vert_time_of_peak", "long_time_of_peak",
-        "mic_time_of_peak", "mic_zc_freq",
+        "mic_zc_freq",
    )
    for key in float_fields:
        v = raw.get(key)
@@ -223,6 +222,22 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
        else:
            out.pop(key, None)

+    # Time-of-peak: Thor labels these "TimeofPeak" (lowercase "of") so the
+    # normalizer produces "*_timeof_peak".  Map them to the canonical
+    # ``*_time_of_peak`` output keys for downstream consumers.
+    for raw_key, out_key in (
+        ("tran_timeof_peak", "tran_time_of_peak"),
+        ("vert_timeof_peak", "vert_time_of_peak"),
+        ("long_timeof_peak", "long_time_of_peak"),
+        ("mic_timeof_peak",  "mic_time_of_peak"),
+    ):
+        v = raw.get(raw_key)
+        if v is None:
+            continue
+        fv = _parse_float(v)
+        if fv is not None:
+            out[out_key] = fv
+
    # Microphone — Thor reports MicPSPL (dB(L)) which is the closest
    # analogue to BW's mic_ppv.  The raw "99.4 dB(L)" string stays in
    # `out` under the original `mic_pspl` key for display; the parsed
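The time-of-peak remap in the hunk above can be exercised in isolation. This is a rough sketch: it assumes the raw keys already arrive normalised as `*_timeof_peak`, as the hunk's comment describes, and `parse_float_or_none` is a hypothetical stand-in for `_parse_float`, whose real behaviour is not shown here.

```python
from typing import Any, Dict, Optional

# Raw (normalised) key -> canonical output key, as in the hunk above.
REMAP = (
    ("tran_timeof_peak", "tran_time_of_peak"),
    ("vert_timeof_peak", "vert_time_of_peak"),
    ("long_timeof_peak", "long_time_of_peak"),
    ("mic_timeof_peak", "mic_time_of_peak"),
)


def parse_float_or_none(value: Any) -> Optional[float]:
    """Hypothetical stand-in for _parse_float: first token as float, else None."""
    try:
        return float(str(value).split()[0])
    except (ValueError, IndexError):
        return None


def remap_time_of_peak(raw: Dict[str, Any]) -> Dict[str, float]:
    """Copy parseable time-of-peak values onto their canonical keys."""
    out: Dict[str, float] = {}
    for raw_key, out_key in REMAP:
        value = raw.get(raw_key)
        if value is None:
            continue
        parsed = parse_float_or_none(value)
        if parsed is not None:
            out[out_key] = parsed
    return out


result = remap_time_of_peak({"tran_timeof_peak": "0.125 sec", "mic_timeof_peak": "n/a"})
print(result)
```

Missing keys and unparseable values are dropped rather than emitted as `None`, which mirrors the `continue` / `if fv is not None` guards in the hunk.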
@@ -1,64 +1,530 @@
 """
-micromate/idf_file.py — placeholder for the Thor IDF binary codec.
+micromate/idf_file.py — Thor IDF binary codec.

-Thor's ``.IDFH`` (histogram) and ``.IDFW`` (waveform) event files are an
-Instantel proprietary binary format that has not yet been reverse-
-engineered.  Today seismo-relay treats them as opaque blobs:
-``WaveformStore.save_imported_idf`` stores the bytes verbatim and reads
-all device-authoritative metadata from the paired ``.IDFW.txt`` /
-``.IDFH.txt`` ASCII sidecar (parsed by ``idf_ascii_report.py``).
+Decodes the Instantel Micromate Series IV ``.IDFW`` (waveform) and
+``.IDFH`` (histogram) binary on-disk format.  Sister module to
+``minimateplus/event_file_io.py``.

-When we crack the binary codec — same reverse-engineering playbook we
-used to byte-perfect-parse Series III BW files (see
-``docs/instantel_protocol_reference.md`` and ``minimateplus/event_file_io.py``)
-— this module will grow:
+Status (2026-05-28):

-  - ``read_idf_file(path) -> IdfEvent``
-        Parse a ``.IDFW``/``.IDFH`` binary and return a fully populated
-        ``IdfEvent`` whose waveform-sample arrays come from the binary
-        (the .txt sidecar's tabular sample block being a best-effort
-        check).  Lets us ingest Thor events even when the operator
-        hasn't enabled the .txt exporter — closing the
-        ``had_report=False`` gap that the thor-watcher forwarder
-        currently tolerates as a known limitation.
+- **Genuine Series IV / Thor binaries** are all signed
+  ``00 12 01 00 00 00 Instantel\\0`` (sig-A in earlier notes).  Two
+  Series III (Blastware) binaries appear in the example corpus
+  (``BE9439_*``) — they share the ``.IDFW``/``.IDFH`` extension by
+  filing convention but carry a BW STRT header (``10 00 01 80 00 00
+  Instantel STRT...``) and are NOT Thor data.  The reader detects
+  them by signature and raises NotImplementedError pointing callers
+  at ``minimateplus.event_file_io.read_blastware_file()``.
+- **IDFW waveform body** reuses the BW segment-rotated block codec
+  verbatim.  Body usually starts at file offset ``0x0f1f``.  Samples
+  decoded via ``minimateplus.waveform_codec.decode_waveform_v2``
+  with 87–99% byte-exact match against ``.IDFW.txt`` sidecar (quiet
+  events).  Loud events hit the BW codec's known walker-stops-early
+  limit.  Residual ~3% drift on per-sample deltas — likely a
+  Thor-specific 12-bit delta refinement that BW's codec doesn't
+  model.  Geo LSB = 0.0003 in/s; mic factor ~2.14e-6 psi/count.
+- **IDFH histogram body**: 10-byte segment header
+  ``[len_be 2B] 0a 00 00 00 [00 NN_counter] 05 3f`` introduces a
+  segment of ``N`` 72-byte interval records (``N = (len - 10) // 72``).
+  Each record holds 4 × 16-byte per-channel min/max/halfp + 8-byte
+  tail.  Geo peaks via ``max(|min|, |max|) / 32768 × 10`` in/s
+  (matches sidecar within ~1.8%), freq via ``512 / halfp`` Hz.
+  **All 859 Thor IDFH files in the corpus decode (181,071 intervals).**
+- Binary metadata directly extracted: serial, timestamp, sample_rate,
+  record_time, calibration_date.  Other fields fall back to the paired
+  ``.IDFW.txt`` / ``.IDFH.txt`` sidecar (consumed by
+  ``WaveformStore.save_imported_idf``).

-  - ``write_idf_file(path, event)`` (eventually)
-        Round-trip event reconstruction, used for verifying the codec
-        against captured device files the way ``write_blastware_file``
-        verifies the Series III codec.
-
-  - Helpers for decoding the binary's per-channel sample arrays into
-    physical units, the per-event flash buffer's monitor-log records,
-    etc.
-
-The reverse-engineering path: pair every ``.IDFW`` binary in
-``thor-watcher/example-data/`` with its sibling ``.IDFW.txt``, treating
-the txt's "Waveform Data Channels" block as ground-truth, and align
-the binary's per-channel int16-or-similar arrays against it.  Header
-fields (sample rate, channel count, record time, timestamps) sit before
-the sample block — same approach as the BW codec where ASCII strings
-inside the binary (``Project:``, ``Client:``, etc.) anchored field
-discovery.
+The full reverse-engineering writeup lives in
+``docs/idf_protocol_reference.md``.
 """

 from __future__ import annotations

+import datetime
+import struct
+from dataclasses import dataclass
 from pathlib import Path
-from typing import Union
+from typing import Optional, Union

-from .models import IdfEvent
+from minimateplus.waveform_codec import decode_waveform_v2
+
+from .models import IdfEvent, IdfPeaks, IdfReport


-def read_idf_file(path: Union[str, Path]) -> "IdfEvent":
-    """Parse a Thor ``.IDFW``/``.IDFH`` binary into an ``IdfEvent``.
+# Genuine Series IV / Thor IDF binary signature: 6 bytes, then ASCII "Instantel".
+_THOR_PREFIX = b"\x00\x12\x01\x00\x00\x00"
+# Stray Series III (Blastware) binaries that occasionally turn up in Thor
+# corpus directories renamed to the .IDFW/.IDFH convention.  Their header
+# (`10 00 01 80 00 00 Instantel STRT ...`) is byte-for-byte a BW SUB 5A
+# STRT record, not a Thor binary.  Detected so we can refuse-and-route
+# rather than mis-parse.
+_BW_STRAY_PREFIX = b"\x10\x00\x01\x80\x00\x00"
+_INSTANTEL_TAG = b"Instantel"

-    Not yet implemented.  When implemented, this will be the canonical
-    entry point for reading Thor binaries — the ASCII sidecar parser
-    becomes an optional fast-path metadata supplement rather than the
-    sole source of device-authoritative data.
+# Most common body offset for sig-A IDFW files (~50% of prod events;
+# 151/154 in the original tests/fixtures/THORDATA_example corpus).  The
+# body is the segment-rotated block stream consumed by decode_waveform_v2;
+# bytes [0:3] are the magic ``00 02 00`` preamble.  Production events
+# routinely use other offsets — see :func:`_find_waveform_body_offset`
+# for the dynamic scan.  This constant survives only as the priority hint.
+_BODY_START_SIG_A = 0x0F1F
+
+# Magic bytes that mark a candidate waveform-body preamble.
+_BODY_MAGIC = b"\x00\x02\x00"
+
+# Where to start looking for body candidates inside the file.  Skip the
+# fixed-header region where the same magic legitimately appears inside
+# channel-test records and the compliance block (offsets 0x015d, 0x091c,
+# 0x0ae2, 0x0d30 in observed events).
+_BODY_SCAN_FLOOR = 0x0E00
+
+# Geophone count → in/s, derived from sidecar ground truth: the smallest
+# non-zero sample in the 1,014-file corpus is 0.0003 in/s.
+_GEO_LSB_IPS = 0.0003
+
+# Microphone count → psi, derived from sidecar regression on 50 sample
+# pairs from UM11719_20231219162723.IDFW (mic-heavy event).
+_MIC_LSB_PSI = 2.14e-6
+
+# IDFH histogram constants.
+_IDFH_INTERVAL_SIZE = 72        # bytes per per-interval record
+_IDFH_SEGMENT_HEADER = 10       # bytes: [len_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+_IDFH_SEGMENT_TAIL   = 2        # bytes after the interval data block, before next marker
+_IDFH_HALFP_FREQ_NUM = 512.0    # freq_hz = NUM / halfp; halfp ≤ 5 means ">100 Hz" sentinel
+_IDFH_GEO_FULL_SCALE = 10.0     # in/s — Normal range
+_IDFH_INT16_FS = 32768.0
+_IDFH_CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+# ─── Binary metadata extraction ─────────────────────────────────────────────
+
+
+@dataclass
+class IdfBinaryMetadata:
+    """Fields recoverable from the sig-A binary header (no .txt needed)."""
+    serial:           Optional[str] = None
+    event_datetime:   Optional[datetime.datetime] = None
+    sample_rate:      Optional[int] = None
+    record_time_sec:  Optional[float] = None
+    calibration_date: Optional[datetime.date] = None
+
+
+def _read_ascii_z(buf: bytes, off: int, maxlen: int = 64) -> Optional[str]:
+    if off >= len(buf):
+        return None
+    end = buf.find(b"\x00", off, off + maxlen)
+    if end < 0:
+        end = min(off + maxlen, len(buf))
+    s = buf[off:end].decode("ascii", errors="replace").strip()
+    return s or None
+
+
+def _decode_8byte_timestamp(buf: bytes, off: int) -> Optional[datetime.datetime]:
+    """Layout: ``[day][month][year_hi][year_lo][unknown][hour][min][sec]``."""
+    if off + 8 > len(buf):
+        return None
+    day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
+    year = (yh << 8) | yl
+    if not (2015 <= year <= 2050 and 1 <= mon <= 12 and 1 <= day <= 31
+            and 0 <= hr < 24 and 0 <= mn < 60 and 0 <= sc < 60):
+        return None
+    try:
+        return datetime.datetime(year, mon, day, hr, mn, sc)
+    except ValueError:
+        return None
+
+
+def extract_binary_metadata(buf: bytes) -> IdfBinaryMetadata:
+    """Pull serial/timestamp/sample_rate/record_time/calibration from the
+    sig-A binary header.
+
+    Field positions confirmed against UM11719_20231219162723.IDFW; stable
+    across the 151-file sig-A corpus.
    """
-    raise NotImplementedError(
-        "IDF binary codec not yet implemented; the .IDFW/.IDFH binary format "
-        "is undecoded.  Use parse_idf_report() on the paired .txt sidecar "
-        "for device-authoritative metadata."
+    md = IdfBinaryMetadata()
+
+    # Serial: null-terminated ASCII at 0x14E.
+    md.serial = _read_ascii_z(buf, 0x14E, maxlen=16)
+
+    # Sample rate + record time live in a BW-compatible compliance block.
+    # Locate the 6-byte anchor `be 80 00 00 00 00` and read offsets relative
+    # to it: anchor-6 = sample_rate uint16 BE; anchor+6 = record_time float32 BE.
+    anchor = buf.find(b"\xbe\x80\x00\x00\x00\x00", 0x800, 0xA00)
+    if anchor > 0:
+        sr_bytes = buf[anchor - 6 : anchor - 4]
+        if len(sr_bytes) == 2:
+            sr = int.from_bytes(sr_bytes, "big")
+            if sr in (256, 512, 1024, 2048, 4096):
+                md.sample_rate = sr
+        rt_bytes = buf[anchor + 6 : anchor + 10]
+        if len(rt_bytes) == 4:
+            try:
+                rt = struct.unpack(">f", rt_bytes)[0]
+                if 0.1 <= rt <= 600.0:
+                    md.record_time_sec = float(rt)
+            except struct.error:
+                pass
+
+    # Event timestamp: 8 bytes.  Position differs between IDFW (0x97A) and
+    # IDFH (0x9F8); try both and accept the first valid decode.
+    for off in (0x97A, 0x9F8):
+        ts = _decode_8byte_timestamp(buf, off)
+        if ts is not None:
+            md.event_datetime = ts
+            break
+
+    # Calibration date: day, month, year_be at 0x194-0x197.
+    if len(buf) > 0x197:
+        day, mon = buf[0x194], buf[0x195]
+        year = int.from_bytes(buf[0x196 : 0x198], "big")
+        if 1 <= mon <= 12 and 1 <= day <= 31 and 2015 <= year <= 2050:
+            try:
+                md.calibration_date = datetime.date(year, mon, day)
+            except ValueError:
+                pass
+
+    return md
+
+
+# ─── Sample decoder + unit conversion ───────────────────────────────────────
+
+
+def _find_waveform_body_offset(buf: bytes) -> Optional[int]:
+    """Pick the file offset of the waveform body by trial-decoding every
+    ``00 02 00`` magic position past the fixed-header region.
+
+    The body's location isn't fixed across all sig-A IDFW files — about
+    half the production events use ``0x0f1f``, but the rest have offsets
+    that shift based on header padding / channel-config layout.  We
+    auto-detect by:
+
+      1. Find every ``00 02 00`` occurrence past ``_BODY_SCAN_FLOOR``.
+      2. Try ``decode_waveform_v2()`` on each candidate.
+      3. Pick the offset whose decoded sample count is largest.
+
+    Returns the offset, or ``None`` if no candidate yielded more than
+    the trivial 2-sample preamble (= "no real body found").
+
+    Costs ~2-8 trial decodes per file; in practice the first candidate
+    past 0x0e00 is usually the right one.
+    """
+    if len(buf) < _BODY_SCAN_FLOOR + 8:
+        return None
+    best: Optional[tuple[int, int]] = None   # (total_samples, offset)
+    i = _BODY_SCAN_FLOOR
+    while True:
+        j = buf.find(_BODY_MAGIC, i)
+        if j < 0:
+            break
+        i = j + 1
+        try:
+            decoded = decode_waveform_v2(buf[j:])
+        except Exception:
+            continue
+        if not decoded:
+            continue
+        total = sum(len(v) for v in decoded.values())
+        # A "real" body has more than just the 2-sample preamble.
+        if total <= 2:
+            continue
+        if best is None or total > best[0]:
+            best = (total, j)
+    return best[1] if best else None
+
+
+def _decode_waveform_samples(buf: bytes) -> Optional[dict]:
+    """Decode samples from the sig-A waveform body.
+
+    Returns the raw decoder counts dict — geo LSB = 0.0003 in/s, mic in
+    its own count unit (see :func:`mic_count_to_psi`).  Returns None if
+    no usable body is found.
+
+    Uses :func:`_find_waveform_body_offset` to locate the body — the
+    file-offset varies across events (~50% sit at the canonical
+    ``0x0f1f`` but the rest don't), so the previous hardcoded constant
+    silently produced 2-sample preamble-only output for half the corpus.
+    """
+    off = _find_waveform_body_offset(buf)
+    if off is None:
+        return None
+    return decode_waveform_v2(buf[off:])
+
+
+def geo_count_to_ips(count: int) -> float:
+    """Convert a Thor geo decoder count to in/s.  LSB = 0.0003 in/s."""
+    return count * _GEO_LSB_IPS
+
+
+def mic_count_to_psi(count: int) -> float:
+    """Convert a Thor mic decoder count to psi.  Scale derived from
+    regression over 50 sample pairs in UM11719_20231219162723.IDFW;
+    consistent to ~5%.  Calibration constants from the channel block
+    can refine this once decoded.
+    """
+    return count * _MIC_LSB_PSI
+
+
+# ─── IDFH histogram decoder ─────────────────────────────────────────────────
+
+
+@dataclass
+class IdfhInterval:
+    """One decoded histogram interval (typically one minute of monitoring)."""
+    offset:    int    # file byte offset of the 72-byte record
+    # Per-channel min/max ADC counts (int16 BE), half-period samples, peak count.
+    # Peak = max(|min|, |max|).  freq_hz = 512/halfp (None if halfp ≤ 5 →
+    # ">100 Hz" sentinel; matches sidecar convention).
+    tran_min:    int
+    tran_max:    int
+    tran_halfp:  int
+    vert_min:    int
+    vert_max:    int
+    vert_halfp:  int
+    long_min:    int
+    long_max:    int
+    long_halfp:  int
+    micl_min:    int
+    micl_max:    int
+    micl_halfp:  int
+
+    def peak_count(self, channel: str) -> int:
+        mn = getattr(self, f"{channel.lower()}_min")
+        mx = getattr(self, f"{channel.lower()}_max")
+        return max(abs(mn), abs(mx))
+
+    def peak_ips(self, channel: str) -> float:
+        """Convert peak count to in/s (geo channels only)."""
+        return self.peak_count(channel) / _IDFH_INT16_FS * _IDFH_GEO_FULL_SCALE
+
+    def freq_hz(self, channel: str) -> Optional[float]:
+        halfp = getattr(self, f"{channel.lower()}_halfp")
+        if halfp <= 5:
+            return None
+        return _IDFH_HALFP_FREQ_NUM / halfp
+
+
+def _decode_idfh_interval(buf72: bytes, offset: int) -> IdfhInterval:
+    """Decode one 72-byte interval record into per-channel min/max/halfp."""
+    import struct
+    fields = []
+    for i in range(4):
+        block = buf72[i * 16 : (i + 1) * 16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        # block[4:6] = int16 BE, role unknown (possibly time-of-peak)
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        # block[10:12] and block[14:16] are uint16 BE with unknown semantics
+        # (likely sum / count contributions for the PVS computation).
+        fields.extend([mn, mx, halfp])
+    # Tail 8 bytes (buf72[64:72]) carry PVS-related data; not yet decoded.
+    return IdfhInterval(
+        offset=offset,
+        tran_min=fields[0], tran_max=fields[1], tran_halfp=fields[2],
+        vert_min=fields[3], vert_max=fields[4], vert_halfp=fields[5],
+        long_min=fields[6], long_max=fields[7], long_halfp=fields[8],
+        micl_min=fields[9], micl_max=fields[10], micl_halfp=fields[11],
+    )
+
+
+def decode_idfh_body(buf: bytes) -> list:
+    """Walk an IDFH file and decode every interval record.
+
+    The body has one or more segments; each segment header is 12 bytes:
+    ``[length_be 2B][0a 00 00 00][00 NN_counter][05 3f]`` where ``length``
+    is bytes from the magic through the end of the interval block
+    (= 10 + 72 × n_intervals).  Segments are separated by a 2-byte tail
+    + next-segment 2-byte prefix (the bytes before the next length field).
+    Confirmed against the 859-file corpus (181,071 intervals decoded; 1
+    failure is the sig-B BE9439 file).
+    """
+    intervals: list = []
+    i = 0
+    while True:
+        j = buf.find(b"\x0a\x00\x00\x00", i)
+        if j < 0 or j < 2:
+            break
+        # Validate: [length_be][0a 00 00 00][00 NN][05 3f].  Slices (not
+        # indexing) so a magic at the very end of buf cannot raise IndexError.
+        if buf[j + 4 : j + 5] != b"\x00" or buf[j + 6 : j + 8] != b"\x05\x3f":
+            i = j + 1
+            continue
+        length = int.from_bytes(buf[j - 2 : j], "big")
+        n = (length - _IDFH_SEGMENT_HEADER) // _IDFH_INTERVAL_SIZE
+        if n <= 0:
+            i = j + 1
+            continue
+        header_start = j - 2
+        interval_start = header_start + _IDFH_SEGMENT_HEADER
+        for k in range(n):
+            off = interval_start + k * _IDFH_INTERVAL_SIZE
+            if off + _IDFH_INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off : off + _IDFH_INTERVAL_SIZE]
+            intervals.append(_decode_idfh_interval(chunk, off))
+        # Advance past this segment + the 2-byte tail.
+        i = header_start + length + _IDFH_SEGMENT_TAIL
+    return intervals
+
+
+# ─── Top-level reader ───────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfReadResult:
+    """Return type for :func:`read_idf_file`.
+
+    For waveforms (``.IDFW``), ``samples`` holds the per-channel sample
+    arrays in Thor decoder counts.  For histograms (``.IDFH``),
+    ``samples`` is empty and ``intervals`` holds the per-interval
+    record list (peaks, freqs).
+    """
+    event:           IdfEvent
+    samples:         dict   # {"Tran": [...], ...} for IDFW; empty for IDFH
+    binary_metadata: IdfBinaryMetadata
+    signature:       str    # always "thor" for now (sig-A genuine Thor)
+    intervals:       Optional[list] = None  # list[IdfhInterval] for IDFH; None for IDFW
+
+
+def read_idf_file(
+    path: Union[str, Path],
+    *,
+    data: Optional[bytes] = None,
+) -> IdfReadResult:
+    """Parse a Thor ``.IDFW`` / ``.IDFH`` binary into an ``IdfEvent``.
+
+    Implements signature-A files only: waveforms decode to per-channel
+    samples, histograms to per-interval records.  Other files raise
+    (NotImplementedError for a Series III STRT header, ValueError for an
+    unknown prefix); for those, use the paired ``.IDFW.txt`` /
+    ``.IDFH.txt`` sidecar via ``parse_idf_report()``.
+
+    Returns an :class:`IdfReadResult`.  The caller converts int sample
+    counts to physical units via :func:`geo_count_to_ips` /
+    :func:`mic_count_to_psi`.
+
+    ``path`` is used for filename in error messages and ``.IDFH`` vs
+    ``.IDFW`` suffix detection.  When ``data`` is supplied the disk
+    read is skipped — useful for ingest paths that already have the
+    bytes in memory and where the file may not exist on disk yet.
+    """
+    p = Path(path)
+    buf = data if data is not None else p.read_bytes()
+
+    if len(buf) < 16 or buf[6:16] != _INSTANTEL_TAG + b"\x00":
+        raise ValueError(f"{p.name}: not an IDF file (missing Instantel magic)")
+
+    sig_prefix = buf[:6]
+    if sig_prefix == _THOR_PREFIX:
+        signature = "thor"
+    elif sig_prefix == _BW_STRAY_PREFIX:
+        raise NotImplementedError(
+            f"{p.name}: file has a Series III (Blastware) STRT header in "
+            "an IDF-named container — not a Thor binary.  Route through "
+            "minimateplus.event_file_io.read_blastware_file() instead "
+            "(peaks decode; samples & full metadata don't, but it's not "
+            "Thor data so the Thor codec doesn't apply)."
+        )
+    else:
+        raise ValueError(f"{p.name}: unknown IDF signature {sig_prefix.hex()}")
+
+    is_histogram = p.suffix.upper() == ".IDFH"
+    md = extract_binary_metadata(buf)
+
+    if is_histogram:
+        intervals = decode_idfh_body(buf)
+        if not intervals:
+            raise ValueError(f"{p.name}: IDFH body decoded no intervals")
+        # Peaks: max across all intervals on each channel (per-channel max
+        # of stored max-magnitudes; sidecar's PPV row carries the same).
+        peak_tran = max((iv.peak_ips("Tran") for iv in intervals), default=0.0)
+        peak_vert = max((iv.peak_ips("Vert") for iv in intervals), default=0.0)
+        peak_long = max((iv.peak_ips("Long") for iv in intervals), default=0.0)
+        # Mic peak in psi — Thor stores per-interval mic ADC counts in the
+        # binary; convert the max count to psi via the per-count factor.
+        mic_peak_count = max((iv.peak_count("MicL") for iv in intervals), default=0)
+        mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+        rep = IdfReport(
+            serial_number=md.serial,
+            event_type="Full Histogram",
+            event_datetime=md.event_datetime,
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+        )
+        peaks = IdfPeaks(
+            transverse_ips=peak_tran,
+            vertical_ips=peak_vert,
+            longitudinal_ips=peak_long,
+            peak_vector_sum_ips=None,
+            mic_pspl_dbl=None,         # IDFH binary doesn't carry the dB(L) value
+            mic_pspl_psi=mic_peak_psi,
+        )
+        event = IdfEvent(
+            serial=md.serial or "UNKNOWN",
+            timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+            kind="Histogram",
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+            peaks=peaks,
+            report=rep,
+        )
+        return IdfReadResult(
+            event=event,
+            samples={},
+            binary_metadata=md,
+            signature=signature,
+            intervals=intervals,
+        )
+
+    # Waveform path.
+    decoded = _decode_waveform_samples(buf)
+    if decoded is None:
+        raise ValueError(f"{p.name}: waveform body codec failed")
+
+    rep = IdfReport(
+        serial_number=md.serial,
+        event_type="Full Waveform",
+        event_datetime=md.event_datetime,
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+    )
+
+    def _peak_ips(ch: str) -> float:
+        arr = decoded.get(ch, [])
+        return geo_count_to_ips(max((abs(v) for v in arr), default=0))
+
+    # Mic peak psi from binary: max absolute MicL ADC count × 2.14e-6 psi/count.
+    mic_arr = decoded.get("MicL", [])
+    mic_peak_count = max((abs(v) for v in mic_arr), default=0)
+    mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+
+    peaks = IdfPeaks(
+        transverse_ips=_peak_ips("Tran"),
+        vertical_ips=_peak_ips("Vert"),
+        longitudinal_ips=_peak_ips("Long"),
+        # PVS requires aligned per-sample √(T²+V²+L²); leave None — the
+        # sidecar carries it and the bridge picks it up if present.
+        peak_vector_sum_ips=None,
+        mic_pspl_dbl=None,             # binary IDFW doesn't carry the dB(L) value;
+                                       # sidecar .txt fills it via IdfReport.from_dict
+        mic_pspl_psi=mic_peak_psi,
+    )
+
+    event = IdfEvent(
+        serial=md.serial or "UNKNOWN",
+        timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+        kind="Waveform",
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+        peaks=peaks,
+        report=rep,
+    )
+
+    return IdfReadResult(
+        event=event,
+        samples=decoded,
+        binary_metadata=md,
+        signature=signature,
    )
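
Reviewer note: the per-channel layout that `_decode_idfh_interval` reads can be exercised in isolation. The following is a minimal sketch, not the module's code. It assumes the layout stated in the diff comments (int16 BE min at offset 0, int16 BE max at offset 2, uint16 BE half-period at offset 6 of each 16-byte block) and takes the `512 / halfp` numerator from the `IdfhInterval` comment. The helper names and the synthetic block are hypothetical.

```python
import struct
from typing import Optional, Tuple


def decode_channel_block(block16: bytes) -> Tuple[int, int, int]:
    """Return (min, max, halfp) from one 16-byte channel block."""
    mn, mx = struct.unpack_from(">hh", block16, 0)   # signed int16 BE at offsets 0 and 2
    (halfp,) = struct.unpack_from(">H", block16, 6)  # unsigned int16 BE at offset 6
    return mn, mx, halfp


def peak_count(mn: int, mx: int) -> int:
    """Peak magnitude of the interval: max(|min|, |max|)."""
    return max(abs(mn), abs(mx))


def freq_hz(halfp: int, numerator: float = 512.0) -> Optional[float]:
    """Half-period to Hz; halfp <= 5 is treated as the '>100 Hz' sentinel."""
    if halfp <= 5:
        return None
    return numerator / halfp


# Synthetic block (illustrative values only): min=-120, max=95,
# unknown int16 at offset 4 set to 0, halfp=32, remaining 8 bytes zeroed.
block = struct.pack(">hhhH", -120, 95, 0, 32) + bytes(8)
mn, mx, halfp = decode_channel_block(block)
print(peak_count(mn, mx), freq_hz(halfp))
```

Four such blocks (Tran, Vert, Long, MicL) plus the 8-byte undecoded tail make up the 72-byte record.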
@@ -0,0 +1,323 @@
+"""
+micromate/idf_to_bw_report.py — adapter that projects a parsed Thor IDF
+report (+ binary metadata + decoded IDFH intervals) into the
+``bw_report``-shaped dict that :mod:`sfm.report_pdf.gather_report_data`
+consumes.
+
+Lets Thor events flow through the existing Series III Event Report PDF
+pipeline without duplicating the renderer.  Thor's report content is
+~95% the same data shape as BW's; the field names differ but the
+underlying metrics map 1:1.
+
+Caveats
+───────
+
+- **Mic units** — Thor records ``MicPSPL`` natively in dB(L).  This
+  adapter sets ``bw_report.mic.pspl_dbl`` directly; the report
+  renderer recomputes the equivalent psi via its dBL→psi formula.
+- **Saturation / above-range flags** — Thor doesn't always mark
+  ``OORANGE`` the way BW does; we set ``zc_freq_above_range`` only
+  when a `>100` sentinel was preserved in the raw text.
+- **Per-interval data** — for IDFH events we build ``interval_times``
+  by stepping ``IntervalSize`` from ``HistogramStartTime``; the binary
+  decoder confirms one record per step (882 / 881 / 881 ... across
+  the corpus).
+- **calibration_by parsing** — Thor's free-form ``Calibration : November
+  22, 2023 by Instantel`` is split on ``" by "`` to extract the
+  calibrator; the date prefix is parsed where possible, otherwise
+  the binary-extracted ``calibration_date`` from
+  :class:`micromate.idf_file.IdfBinaryMetadata` wins.
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any, Dict, List, Optional
+
+
+# ─── Helpers ────────────────────────────────────────────────────────────────
+
+
+_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")
+
+
+def _parse_first_number(s: Optional[str]) -> Optional[float]:
+    """Pull the first numeric token from a string like ``"0.1500 in/s"``."""
+    if s is None:
+        return None
+    m = _NUM_RE.search(str(s))
+    if not m:
+        return None
+    try:
+        return float(m.group(0))
+    except ValueError:
+        return None
+
+
+def _parse_interval_size_s(s: Optional[str]) -> Optional[float]:
+    """``"60 sec"`` → 60.0, ``"5 min"`` → 300.0, ``"1 hour"`` → 3600."""
+    if s is None:
+        return None
+    num = _parse_first_number(s)
+    if num is None:
+        return None
+    sl = str(s).lower()
+    if "hour" in sl or "hr" in sl:
+        return num * 3600.0
+    if "min" in sl:
+        return num * 60.0
+    return num   # default to seconds
+
+
+def _parse_calibration(text: Optional[str]) -> tuple[Optional[str], Optional[str]]:
+    """Split ``"November 22, 2023 by Instantel"`` → (ISO date, calibrator).
+
+    Returns ``(None, None)`` if neither half parses.
+    """
+    if not text:
+        return None, None
+    parts = str(text).split(" by ", 1)
+    date_part = parts[0].strip() if parts else None
+    by_part = parts[1].strip() if len(parts) > 1 else None
+    iso_date: Optional[str] = None
+    if date_part:
+        for fmt in ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y"):
+            try:
+                iso_date = datetime.datetime.strptime(date_part, fmt).date().isoformat()
+                break
+            except ValueError:
+                continue
+    return iso_date, by_part
+
+
+def _channel_peaks(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    """Map ``tran_ppv`` / ``tran_zc_freq`` / ... → bw_report.peaks.tran shape."""
+    out: Dict[str, Any] = {}
+    for src, dst in (
+        (f"{ch_lc}_ppv",                 "ppv_ips"),
+        (f"{ch_lc}_zc_freq",             "zc_freq_hz"),
+        (f"{ch_lc}_time_of_peak",        "time_of_peak_s"),
+        (f"{ch_lc}_peak_acceleration",   "peak_accel_g"),
+        (f"{ch_lc}_peak_displacement",   "peak_disp_in"),
+    ):
+        v = idf.get(src)
+        if v is not None:
+            out[dst] = v
+    # ZC freq ">100" sentinel: if the value under this key is still the raw
+    # string (e.g. ``">100"``) rather than a float, set the above-range flag
+    # and drop the non-numeric ``zc_freq_hz`` copied by the loop above.
+    raw_zc = idf.get(f"{ch_lc}_zc_freq")
+    if isinstance(raw_zc, str) and ">" in raw_zc:
+        out["zc_freq_above_range"] = True
+        out.pop("zc_freq_hz", None)
+    return out
+
+
+def _sensor_check(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    out: Dict[str, Any] = {}
+    fr = idf.get(f"{ch_lc}_test_freq")
+    if fr is not None:
+        out["freq_hz"] = _parse_first_number(fr)
+    rt = idf.get(f"{ch_lc}_test_ratio")
+    if rt is not None:
+        out["ratio"] = _parse_first_number(rt)
+    am = idf.get(f"{ch_lc}_test_amplitude")
+    if am is not None:
+        out["amplitude_mv"] = _parse_first_number(am)
+    res = idf.get(f"{ch_lc}_test_results")
+    if res is not None:
+        out["result"] = str(res).strip()
+    return {k: v for k, v in out.items() if v is not None}
+
+
+def _interval_times(idf: Dict[str, Any], n_intervals: Optional[int]) -> List[str]:
+    """Synthesise per-interval end timestamps: start + interval_size × (k + 1).
+
+    Returns ``[]`` when start time or interval size is unknown.
+    """
+    if not n_intervals:
+        return []
+    start_date = idf.get("histogram_start_date") or idf.get("event_date")
+    start_time = idf.get("histogram_start_time") or idf.get("event_time")
+    iv_str = idf.get("interval_size")
+    iv_s = _parse_interval_size_s(iv_str)
+    if not (start_date and start_time and iv_s):
+        return []
+    try:
+        t0 = datetime.datetime.strptime(f"{start_date} {start_time}", "%Y-%m-%d %H:%M:%S")
+    except ValueError:
+        return []
+    out = []
+    for k in range(int(n_intervals)):
+        t = t0 + datetime.timedelta(seconds=iv_s * (k + 1))
+        out.append(t.isoformat())
+    return out
+
+
+# ─── Top-level adapter ──────────────────────────────────────────────────────
+
+
+def build_bw_report_from_idf(
+    idf_report: Dict[str, Any],
+    *,
+    binary_md=None,
+    intervals: Optional[list] = None,
+    is_histogram: Optional[bool] = None,
+) -> Dict[str, Any]:
+    """Project a parsed IDF report dict (and optional binary metadata +
+    decoded IDFH intervals) into the BW report sidecar shape.
+
+    The returned dict is structurally identical to what
+    ``minimateplus.event_file_io._bw_report_to_dict`` produces from a
+    real BW ASCII report — it can be assigned to
+    ``sidecar["bw_report"]`` and consumed verbatim by
+    ``sfm.report_pdf.gather_report_data``.
+
+    ``intervals`` is the list of :class:`micromate.idf_file.IdfhInterval`
+    objects from :func:`micromate.idf_file.decode_idfh_body`; only used
+    for histogram events to derive accurate ``interval_times``.
+    """
+    if is_histogram is None:
+        et = str(idf_report.get("event_type", ""))
+        is_histogram = et.lower().startswith("full histogram")
+
+    # ── Trigger / recording / device ─────────────────────────────────────
+    trigger_channel = idf_report.get("trigger")
+    trigger_level   = _parse_first_number(idf_report.get("geo_trigger_level"))
+    geo_range_ips   = _parse_first_number(idf_report.get("geo_range"))
+
+    cal_iso, cal_by = _parse_calibration(idf_report.get("calibration"))
+    # Prefer the binary-extracted calibration_date when our text parse fell
+    # through; the binary date is unambiguous.
+    if cal_iso is None and binary_md is not None and binary_md.calibration_date:
+        cal_iso = binary_md.calibration_date.isoformat()
+
+    # ── Histogram fields ────────────────────────────────────────────────
+    hist_block: Dict[str, Any] = {
+        "start": None, "stop": None, "n_intervals": None,
+        "interval_size": None, "interval_size_s": None,
+        "channel_peak_when": {},
+    }
+    if is_histogram:
+        sd = idf_report.get("histogram_start_date")
+        st = idf_report.get("histogram_start_time")
+        if sd and st:
+            try:
+                hist_block["start"] = datetime.datetime.strptime(
+                    f"{sd} {st}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        ed = idf_report.get("histogram_stop_date")
+        et_ = idf_report.get("histogram_stop_time")
+        if ed and et_:
+            try:
+                hist_block["stop"] = datetime.datetime.strptime(
+                    f"{ed} {et_}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        n_raw = idf_report.get("number_of_intervals")
+        if n_raw is not None:
+            try:
+                # Thor reports a float like "81.04"; truncate to int (the BW
+                # report uses an int for the column).
+                hist_block["n_intervals"] = int(float(str(n_raw)))
+            except ValueError:
+                pass
+        # When the binary decoder gave us the actual interval count, prefer it.
+        if intervals is not None:
+            hist_block["n_intervals"] = len(intervals)
+        hist_block["interval_size"] = idf_report.get("interval_size")
+        hist_block["interval_size_s"] = _parse_interval_size_s(idf_report.get("interval_size"))
+        # interval_times derived from start+step (the BW report uses the
+        # exact strings; we match its representation).
+        times = _interval_times(idf_report, hist_block["n_intervals"])
+        # Per-channel peak when (absolute date+time at which the channel's
+        # peak occurred over the histogram run).  Thor splits this into
+        # ``TranPeakDate`` / ``TranPeakTime`` etc.
+        peak_when: Dict[str, str] = {}
+        for ch_label, ch_lc in (("Tran", "tran"), ("Vert", "vert"), ("Long", "long"), ("MicL", "mic")):
+            d = idf_report.get(f"{ch_lc}_peak_date")
+            t = idf_report.get(f"{ch_lc}_peak_time")
+            if d and t:
+                try:
+                    peak_when[ch_label] = datetime.datetime.strptime(
+                        f"{d} {t}", "%Y-%m-%d %H:%M:%S"
+                    ).isoformat()
+                except ValueError:
+                    continue
+        if peak_when:
+            hist_block["channel_peak_when"] = peak_when
+
+    # ── Mic block ────────────────────────────────────────────────────────
+    mic_block = {
+        "weighting":           "L",                   # Thor mic is ISEE Linear
+        "pspl_dbl":            idf_report.get("mic_ppv"),  # the dB(L) float
+        "pspl_saturated":      False,
+        "zc_freq_hz":          idf_report.get("mic_zc_freq"),
+        "zc_freq_above_range": isinstance(idf_report.get("mic_zc_freq"), str)
+                               and ">" in str(idf_report.get("mic_zc_freq")),
+        "time_of_peak_s":      idf_report.get("mic_time_of_peak"),
+    }
+    if mic_block["zc_freq_above_range"]:
+        mic_block["zc_freq_hz"] = None
+
+    # ── Peaks ────────────────────────────────────────────────────────────
+    vs_block = {
+        "ips":       idf_report.get("peak_vector_sum"),
+        "time_s":    _parse_first_number(idf_report.get("peak_vector_sum_time_sum")),
+        "when":      None,
+        "saturated": False,
+    }
+    if is_histogram:
+        # PVS absolute date+time, when present.
+        vs_d = idf_report.get("peak_vector_sum_date")
+        vs_t = idf_report.get("peak_vector_sum_time")
+        if vs_d and vs_t:
+            try:
+                vs_block["when"] = datetime.datetime.strptime(
+                    f"{vs_d} {vs_t}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+
+    return {
+        "available":  True,
+        "event_type": idf_report.get("event_type"),
+        "version":    idf_report.get("version"),
+        "trigger": {
+            "channel":       trigger_channel,
+            "geo_level_ips": trigger_level,
+        },
+        "recording": {
+            "sample_rate_sps":  idf_report.get("sample_rate"),
+            "record_time_s":    idf_report.get("record_time_sec"),
+            "pretrig_s":        idf_report.get("pre_trigger_sec"),
+            "stop_mode":        idf_report.get("record_stop_mode"),
+            "geo_range_ips":    geo_range_ips,
+            "units":            idf_report.get("units"),
+        },
+        "device": {
+            "battery_volts":    idf_report.get("battery_volts"),
+            "calibration_date": cal_iso,
+            "calibration_by":   cal_by,
+        },
+        "peaks": {
+            "tran":       _channel_peaks(idf_report, "tran"),
+            "vert":       _channel_peaks(idf_report, "vert"),
+            "long":       _channel_peaks(idf_report, "long"),
+            "vector_sum": vs_block,
+        },
+        "mic":          mic_block,
+        "sensor_check": {
+            "tran": _sensor_check(idf_report, "tran"),
+            "vert": _sensor_check(idf_report, "vert"),
+            "long": _sensor_check(idf_report, "long"),
+            "mic":  _sensor_check(idf_report, "mic"),
+        },
+        "histogram":    hist_block,
+        "monitor_log":  [],
+        "pc_sw_version": None,
+    }
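
Reviewer note: the interval-time synthesis in `_interval_times` combined with `_parse_interval_size_s` can be sketched standalone as below. This is a simplified illustration under the assumption that each interval is stamped with its end time (`start + step × (k + 1)`, as the loop in the diff does). The function names are hypothetical and the start timestamp is only an illustrative value.

```python
import datetime
import re
from typing import List, Optional

_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")


def interval_size_seconds(text: str) -> Optional[float]:
    """'60 sec' -> 60.0, '5 min' -> 300.0, '1 hour' -> 3600.0."""
    m = _NUM_RE.search(text)
    if not m:
        return None
    num = float(m.group(0))
    lowered = text.lower()
    if "hour" in lowered or "hr" in lowered:
        return num * 3600.0
    if "min" in lowered:
        return num * 60.0
    return num  # default unit: seconds


def interval_end_times(start: datetime.datetime, step_s: float, n: int) -> List[str]:
    """ISO timestamps for the end of each of n intervals."""
    return [
        (start + datetime.timedelta(seconds=step_s * (k + 1))).isoformat()
        for k in range(n)
    ]


t0 = datetime.datetime(2025, 8, 4, 19, 0, 47)  # illustrative start time
print(interval_end_times(t0, interval_size_seconds("60 sec"), 2))
```

Stamping the interval end rather than its start is a design choice of the diff's loop; if the BW report lists interval start times instead, the `k + 1` would become `k`.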
@@ -159,12 +159,23 @@ class IdfReport:

@dataclass
 class IdfPeaks:
-    """Geophone + mic peak values for one Thor event.  Native Thor units."""
+    """Geophone + mic peak values for one Thor event.  Native Thor units.
+
+    Thor stores the mic peak in two parallel forms.  ``mic_pspl_dbl`` is
+    the dB(L) value carried by the sidecar's top-level ``MicPSPL`` header
+    field and shown in the report header.  ``mic_pspl_psi`` is the psi
+    value derived either from the IDFW sample table / IDFH interval
+    column 9, or from the binary mic counts (~2.14e-6 psi/count).  The
+    psi form is needed because the BW-shaped ``PeakValues.micl`` consumed
+    by ``event_hdf5.write_event_hdf5`` expects psi; feeding it dB(L)
+    makes the h5 mic-chart scale factor blow up.
+    """
    transverse_ips:    Optional[float] = None    # in/s
    vertical_ips:      Optional[float] = None    # in/s
    longitudinal_ips:  Optional[float] = None    # in/s
    peak_vector_sum_ips: Optional[float] = None  # in/s
    mic_pspl_dbl:      Optional[float] = None    # dB(L)
+    mic_pspl_psi:      Optional[float] = None    # psi


@dataclass
@@ -324,10 +335,14 @@ class IdfEvent:
        machinery without those code paths needing to know about Thor.

        Caveats of the bridge:
-          - ``mic_ppv`` on the produced Event carries Thor's dB(L) value
-            verbatim — the UI distinguishes via the ``device_family``
-            column (Phase 1).  Don't run the BW psi→dBL converter on
-            Series IV rows.
+          - ``PeakValues.micl`` carries the mic peak in **psi** (matching
+            BW's convention) — set from :attr:`IdfPeaks.mic_pspl_psi`,
+            with a dB(L)→psi fallback when only the dB(L) value is
+            available.  This is what the h5 writer's mic-scale-factor
+            logic needs.  The dB(L) value still flows through
+            ``bw_report.mic.pspl_dbl`` (set by the
+            ``idf_to_bw_report`` adapter) and the renderer reads it
+            from there for the report header.
          - Many Thor-specific fields (Peak Acceleration / Displacement,
            sensor self-check, calibration) don't have a slot in
            ``Event``.  The full IdfReport is preserved on the
@@ -349,11 +364,17 @@ class IdfEvent:
            minute=self.timestamp.minute,
            second=self.timestamp.second,
        )
+        # Resolve mic peak as psi.  Priority: binary-derived mic_pspl_psi
+        # (set by read_idf_file) > dB(L)→psi fallback via standard formula
+        # (psi = 2.9e-9 × 10^(dBL/20)) > None.
+        mic_psi = self.peaks.mic_pspl_psi
+        if mic_psi is None and self.peaks.mic_pspl_dbl is not None:
+            mic_psi = 2.9e-9 * (10.0 ** (self.peaks.mic_pspl_dbl / 20.0))
        pv = PeakValues(
            tran=self.peaks.transverse_ips,
            vert=self.peaks.vertical_ips,
            long=self.peaks.longitudinal_ips,
-            micl=self.peaks.mic_pspl_dbl,   # dB(L) — see caveat above
+            micl=mic_psi,   # psi, matching BW's convention (h5 scaling depends on this)
            peak_vector_sum=self.peaks.peak_vector_sum_ips,
        )
        pi = ProjectInfo(
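
Reviewer note: the mic-peak resolution order used by the bridge (binary-derived psi first, then the dB(L) fallback, then `None`) can be checked with a small standalone sketch. The constant `2.9e-9` is the one in the diff; it corresponds to the 20 µPa reference pressure expressed in psi. The function names here are hypothetical and not part of the module.

```python
from typing import Optional

P_REF_PSI = 2.9e-9  # reference pressure (20 µPa) in psi, as used in the diff


def dbl_to_psi(dbl: float) -> float:
    """dB(L) -> psi via psi = P_REF_PSI * 10^(dBL / 20)."""
    return P_REF_PSI * 10.0 ** (dbl / 20.0)


def resolve_mic_psi(
    mic_pspl_psi: Optional[float],
    mic_pspl_dbl: Optional[float],
) -> Optional[float]:
    """Prefer the binary-derived psi; fall back to the dB(L) formula."""
    if mic_pspl_psi is not None:
        return mic_pspl_psi
    if mic_pspl_dbl is not None:
        return dbl_to_psi(mic_pspl_dbl)
    return None


# The binary value wins even when a dB(L) value is also present.
print(resolve_mic_psi(2.78e-5, 107.7))
# With only dB(L) available, the formula is used.
print(resolve_mic_psi(None, 107.7))
```

Passing a dB(L) number straight through as if it were psi, as the pre-fix bridge did, is what produced the "107.7 psi" peaks described in the commit message.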
@@ -67,6 +67,11 @@ class ChannelStats:
    # to render "> 10 in/s" or "saturated" instead of trusting the
    # value as an exact measurement.
    ppv_saturated:     bool = False
+    # Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
+    # algorithm's peak frequency exceeded the device's reporting
+    # ceiling (typically 100 Hz on V10.72).  zc_freq_hz gets the
+    # threshold (100.0) as a lower bound; downstream UI renders ">100".
+    zc_freq_above_range: bool = False


@dataclass
@@ -81,6 +86,9 @@ class MicStats:
    # 140 dBL (typical NL-43 max; some units cap at 148).  Consumers
    # should render "> 140 dB(L)" or similar when this flag is set.
    pspl_saturated:    bool = False
+    # Same semantics as ChannelStats.zc_freq_above_range — mic ZC
+    # peak exceeded device reporting ceiling.
+    zc_freq_above_range: bool = False


@dataclass
@@ -119,6 +127,20 @@ def _is_oorange(value: str) -> bool:
    return any(m in s for m in _OORANGE_MARKERS)


+def _parse_above_range(value: str) -> Optional[float]:
+    """For BW "above-range" markers like ">100 Hz", return the threshold.
+
+    BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
+    a peak too fast to count (device cuts off at 100 Hz).  Returns the
+    numeric portion after the '>' (e.g. 100.0), or None if `value` is
+    not an above-range marker.
+    """
+    s = value.strip()
+    if not s.startswith(">"):
+        return None
+    return _parse_number(s[1:])
+
+
@dataclass
 class BwAsciiReport:
    """Structured representation of one BW per-event ASCII export."""
@@ -527,10 +549,17 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
                    cs.ppv_saturated = True
                else:
                    cs.ppv_ips = _parse_number(value)
+            elif stat == "ZC Freq":
+                # ">100 Hz" → store threshold + flag; numeric → parse normally
+                threshold = _parse_above_range(value)
+                if threshold is not None:
+                    cs.zc_freq_hz = threshold
+                    cs.zc_freq_above_range = True
+                else:
+                    cs.zc_freq_hz = _parse_number(value)
            else:
                num = _parse_number(value)
-                if   stat == "ZC Freq":             cs.zc_freq_hz     = num
-                elif stat == "Time of Peak":        cs.time_of_peak_s = num
+                if   stat == "Time of Peak":        cs.time_of_peak_s = num
                elif stat == "Peak Acceleration":   cs.peak_accel_g   = num
                elif stat == "Peak Displacement":   cs.peak_disp_in   = num

@@ -627,9 +656,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.time_of_peak_s = report.mic.time_of_peak_s
        elif key == "MicL ZC Freq":
-            report.mic.zc_freq_hz = _parse_number(value)
+            threshold = _parse_above_range(value)
+            if threshold is not None:
+                report.mic.zc_freq_hz         = threshold
+                report.mic.zc_freq_above_range = True
+            else:
+                report.mic.zc_freq_hz = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
-            cs.zc_freq_hz = report.mic.zc_freq_hz
+            cs.zc_freq_hz          = report.mic.zc_freq_hz
+            cs.zc_freq_above_range = report.mic.zc_freq_above_range

        # ── Sensor self-check ────────────────────────────────────────────────
        elif key in (
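
Reviewer note: the ">100 Hz" handling added to `parse_report` in the hunks above reduces to the pattern below. This is a sketch only. `parse_number` is a hypothetical stand-in for the module's `_parse_number`, whose real implementation is not shown in this diff, and `parse_zc_freq` is an illustrative wrapper rather than a function in the module.

```python
import re
from typing import Optional, Tuple

_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")


def parse_number(value: str) -> Optional[float]:
    """Hypothetical stand-in: first numeric token in the string, or None."""
    m = _NUM_RE.search(value)
    return float(m.group(0)) if m else None


def parse_above_range(value: str) -> Optional[float]:
    """Return the threshold for markers like '>100 Hz', else None."""
    s = value.strip()
    if not s.startswith(">"):
        return None
    return parse_number(s[1:])


def parse_zc_freq(value: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for one ZC Freq cell."""
    threshold = parse_above_range(value)
    if threshold is not None:
        return threshold, True   # value is a lower bound, not a measurement
    return parse_number(value), False


print(parse_zc_freq(">100 Hz"))
print(parse_zc_freq("37.5 Hz"))
```

Storing the threshold together with a flag keeps `zc_freq_hz` numeric for consumers that sort or plot it, while the flag tells the UI to render ">100".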
@@ -49,7 +49,7 @@ SIDECAR_KIND   = "sfm.event"
 # bumped without a `pip install` re-run — leading to confusing stale
 # version stamps in sidecars.  Bump this constant and CHANGELOG.md
 # together at release time.
-TOOL_VERSION = "0.20.0"
+TOOL_VERSION = "0.21.1"

 try:
    # Best-effort: prefer the installed metadata when it's NEWER than the
@@ -125,6 +125,10 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
        # is the channel range max (a lower bound), not an exact reading.
        if getattr(cs, "ppv_saturated", False):
            out["ppv_saturated"] = True
+        # ZC Freq above device reporting ceiling (BW ">100 Hz") — value
+        # in zc_freq_hz is the threshold, not an exact measurement.
+        if getattr(cs, "zc_freq_above_range", False):
+            out["zc_freq_above_range"] = True
        return out

    def _sc(ch_name: str) -> dict:
@@ -187,11 +191,12 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
            },
        },
        "mic": {
-            "weighting":        report.mic.weighting,
-            "pspl_dbl":         report.mic.pspl_dbl,
-            "pspl_saturated":   bool(getattr(report.mic, "pspl_saturated", False)),
-            "zc_freq_hz":       report.mic.zc_freq_hz,
-            "time_of_peak_s":   report.mic.time_of_peak_s,
+            "weighting":             report.mic.weighting,
+            "pspl_dbl":              report.mic.pspl_dbl,
+            "pspl_saturated":        bool(getattr(report.mic, "pspl_saturated", False)),
+            "zc_freq_hz":            report.mic.zc_freq_hz,
+            "zc_freq_above_range":   bool(getattr(report.mic, "zc_freq_above_range", False)),
+            "time_of_peak_s":        report.mic.time_of_peak_s,
        },
        "sensor_check": {
            "tran": _sc("Tran"),
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "seismo-relay"
-version = "0.19.0"
+version = "0.21.1"
 description = "Python client and REST server for MiniMate Plus seismographs"
 requires-python = ">=3.10"
 dependencies = [
@@ -103,6 +103,17 @@ def main(argv=None) -> int:
            "STRT-rectime byte-offset fix in v0.15.x)."
        ),
    )
+    p.add_argument(
+        "--reparse-txt", action="store_true",
+        help=(
+            "Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
+            "current bw_ascii_report parser and overwrite the sidecar's "
+            "bw_report block.  Use this after upgrading the ASCII parser to "
+            "pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
+            "ZC peaks).  No-op for events without a preserved .TXT; safely "
+            "idempotent when the parser hasn't changed."
+        ),
+    )
    p.add_argument("-v", "--verbose", action="store_true")
    args = p.parse_args(argv)

@@ -153,7 +164,7 @@ def main(argv=None) -> int:
            # of the sidecar implies staleness of the derived .h5 (both
            # come out of the same decoder).
            sidecar_stale = True
-            if sidecar_path.exists() and not args.force:
+            if sidecar_path.exists() and not args.force and not args.reparse_txt:
                try:
                    existing = event_file_io.read_sidecar(sidecar_path)
                    sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
@@ -314,6 +325,24 @@ def main(argv=None) -> int:
                    except Exception:
                        pass

+                # --reparse-txt: if a .TXT is preserved on disk, run the
+                # current parser against it and overwrite the bw_report
+                # block.  Picks up post-ingest parser fixes (e.g. the
+                # 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
+                if args.reparse_txt and preserved_txt_fn:
+                    try:
+                        from minimateplus import bw_ascii_report
+                        txt_path = store.txt_path_for(serial, path.name)
+                        if txt_path.exists():
+                            refreshed = bw_ascii_report.parse_report_file(txt_path)
+                            preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
+                            log.debug("reparsed bw_report from %s", txt_path.name)
+                        else:
+                            log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
+                                      txt_path, preserved_txt_fn)
+                    except Exception as exc:
+                        log.warning("--reparse-txt failed for %s: %s", path.name, exc)
+
                # Overlay BW ASCII report fields onto the rebuilt Event
                # BEFORE the sidecar + DB write.  Mirrors what the ingest
                # path does — BW's reported peaks (and sample_rate /
@@ -0,0 +1,331 @@
+"""
+scripts/backfill_thor_events.py — re-process existing Thor (Series IV)
+events so their sidecars carry the bw_report block produced by
+``micromate.idf_to_bw_report.build_bw_report_from_idf`` and their .h5
+files are regenerated (IDFW waveforms and IDFH histograms).
+
+Why this exists
+───────────────
+
+Thor events ingested before v0.21.0 (or during the v0.21.0 ingest bug
+window fixed in commit bee1185) have sidecars with only
+``extensions.idf_report`` — no ``bw_report`` block.  Without
+``bw_report``, the SFM PDF renderer falls back to DB-only fields
+(misses sensor-self-check, full per-channel breakdown, mic dB(L)),
+and the modal chart 404s on ``/waveform.json`` for IDFW events
+because no .h5 was written when the codec failed at ingest.
+
+Re-forwarding from thor-watcher would also fix this, but that requires
+operator coordination on every watcher machine and uses bandwidth this
+script doesn't.
+
+What this does
+──────────────
+
+Walks ``<store>/<serial>/<filename>`` for ``.IDFW`` / ``.IDFH`` files
+and, for each one:
+
+  1. Reads the existing sidecar (preserving review state + captured_at).
+  2. Re-runs ``micromate.idf_file.read_idf_file()`` on the binary
+     bytes — passing ``data=`` so the codec doesn't try to read from
+     a path it doesn't know.
+  3. Pulls ``extensions.idf_report`` (the raw parsed Thor dict the
+     v0.18.0+ ingest path already stashed) and runs the v0.21.0
+     ``build_bw_report_from_idf`` adapter against it.
+  4. Writes the refreshed sidecar with the new ``bw_report``,
+     bumped ``source.tool_version``, but preserved ``review`` block
+     + the original ``captured_at`` timestamp.
+  5. Regenerates the .h5 waveform file via the existing
+     ``event_hdf5`` writer.  For IDFW that's the decoded per-sample
+     stream; for IDFH it's a 1-sample-per-interval synthesised array
+     (peak ADC count per channel) so the renderer's bar-chart code
+     has data to group on.  Mic peak psi from the binary is merged
+     onto the IdfEvent before the bridge so the h5 writer's per-count
+     mic scale factor lands on a sensible value (without this the
+     mic chart on Thor events plots dB(L)-as-pseudo-psi and shows
+     bomb-level numbers).
+
+Idempotent.  Re-running it after a parser/adapter change just
+re-writes sidecars and .h5 files: no DB writes, no thor-watcher coordination.
+
+Usage
+─────
+
+    python scripts/backfill_thor_events.py [--db-path PATH]
+                                           [--store-root PATH]
+                                           [--dry-run]
+                                           [--skip-hdf5]
+                                           [--force]
+                                           [-v]
+
+By default, refreshes any Thor event whose sidecar is missing
+``bw_report`` OR whose ``source.tool_version`` is older than the
+current ``TOOL_VERSION``.  ``--force`` refreshes every Thor event
+regardless.
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+import sys
+from pathlib import Path
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+from sfm.waveform_store import WaveformStore
+
+log = logging.getLogger("backfill_thor_events")
+
+
+def _is_thor_event(path: Path) -> bool:
+    if not path.is_file():
+        return False
+    if path.name.endswith((".sfm.json", ".h5", "_ASCII.TXT")):
+        return False
+    return path.suffix.upper() in (".IDFW", ".IDFH")
+
+
+def _vtuple(s: str) -> tuple:
+    try:
+        return tuple(int(p) for p in str(s).split(".")[:3])
+    except Exception:
+        return (0, 0, 0)
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__)
+    p.add_argument(
+        "--db-path",
+        default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
+        help="Used only to derive the default --store-root.",
+    )
+    p.add_argument("--store-root", default=None)
+    p.add_argument("--dry-run", action="store_true")
+    p.add_argument("--skip-hdf5", action="store_true",
+                   help="Don't regenerate .h5 files (IDFW waveforms or IDFH histograms).")
+    p.add_argument("--force", action="store_true",
+                   help="Refresh every Thor event, not just ones with stale or missing bw_report.")
+    p.add_argument("-v", "--verbose", action="store_true")
+    args = p.parse_args(argv)
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+        datefmt="%H:%M:%S",
+    )
+
+    db_path = Path(args.db_path).expanduser().resolve()
+    store_root = (
+        Path(args.store_root).expanduser().resolve()
+        if args.store_root else db_path.parent / "waveforms"
+    )
+    if not store_root.exists():
+        log.error("store root not found: %s", store_root)
+        return 1
+    store = WaveformStore(store_root)
+    log.info("store root: %s", store_root)
+    log.info("current TOOL_VERSION: %s", event_file_io.TOOL_VERSION)
+
+    refreshed = skipped = errors = h5_written = 0
+
+    # Lazy imports so any one of these failing produces a useful error
+    # message rather than crashing module-load.
+    from micromate.idf_file import read_idf_file
+    from micromate.idf_to_bw_report import build_bw_report_from_idf
+
+    for serial_dir in sorted(d for d in store_root.iterdir() if d.is_dir()):
+        serial = serial_dir.name
+        for path in sorted(serial_dir.iterdir()):
+            if not _is_thor_event(path):
+                continue
+
+            sidecar_path = store.sidecar_path_for(serial, path.name)
+            if not sidecar_path.exists():
+                log.debug("%s: no sidecar — skipping (this is a binary without ingest history)",
+                          path.name)
+                skipped += 1
+                continue
+
+            try:
+                existing = event_file_io.read_sidecar(sidecar_path)
+            except Exception as exc:
+                log.warning("%s: failed to read sidecar — %s", path.name, exc)
+                errors += 1
+                continue
+
+            has_bw_report = bool(existing.get("bw_report"))
+            existing_version = (existing.get("source") or {}).get("tool_version", "")
+            up_to_date = (
+                has_bw_report
+                and _vtuple(existing_version) >= _vtuple(event_file_io.TOOL_VERSION)
+            )
+            if up_to_date and not args.force:
+                skipped += 1
+                continue
+
+            # Re-decode the binary.  Catch + log; continue with .txt-only
+            # data if it fails (matches the live ingest path's behavior).
+            res = None  # reset per file so a failed decode never reuses a stale result
+            idf_samples = idf_intervals = None
+            binary_md = None
+            is_histogram = path.suffix.upper() == ".IDFH"
+            try:
+                binary_bytes = path.read_bytes()
+                res = read_idf_file(path, data=binary_bytes)
+                idf_samples = res.samples or None
+                idf_intervals = res.intervals
+                binary_md = res.binary_metadata
+                is_histogram = res.intervals is not None
+            except NotImplementedError:
+                # sig-B / Blastware-stray binary; no samples but adapter
+                # can still produce a bw_report from extensions.idf_report.
+                log.debug("%s: binary codec NotImplementedError (sig-B / BW-stray); proceeding from sidecar's idf_report only", path.name)
+            except Exception as exc:
+                log.warning("%s: binary decode failed — %s; proceeding from sidecar's idf_report only", path.name, exc)
+
+            # Run the adapter.  Pull report_dict from
+            # extensions.idf_report (the v0.18.0+ ingest preserved it).
+            report_dict = (existing.get("extensions") or {}).get("idf_report") or {}
+            if not report_dict and binary_md is None:
+                log.debug("%s: no idf_report in sidecar AND no binary metadata — nothing to project", path.name)
+                skipped += 1
+                continue
+
+            try:
+                bw_report = build_bw_report_from_idf(
+                    report_dict, binary_md=binary_md,
+                    intervals=idf_intervals, is_histogram=is_histogram,
+                )
+            except Exception as exc:
+                log.warning("%s: adapter failed — %s", path.name, exc)
+                errors += 1
+                continue
+
+            # Build the new sidecar by overlaying refreshed fields onto
+            # the existing one — preserves review, captured_at, blastware
+            # block, source.kind, etc.
+            new_sidecar = dict(existing)  # shallow copy
+            new_sidecar["bw_report"] = bw_report
+            src = dict(new_sidecar.get("source") or {})
+            src["tool_version"] = event_file_io.TOOL_VERSION
+            new_sidecar["source"] = src
+
+            # Preserve histogram intervals if the binary decoded them
+            # (improves over the original ingest if that one ran before
+            # the bee1185 codec fix).
+            if idf_intervals is not None:
+                ext = dict(new_sidecar.get("extensions") or {})
+                ext["idf_intervals"] = [
+                    {
+                        "offset":     iv.offset,
+                        "tran_peak":  iv.peak_count("Tran"),
+                        "tran_halfp": iv.tran_halfp,
+                        "tran_freq":  iv.freq_hz("Tran"),
+                        "vert_peak":  iv.peak_count("Vert"),
+                        "vert_halfp": iv.vert_halfp,
+                        "vert_freq":  iv.freq_hz("Vert"),
+                        "long_peak":  iv.peak_count("Long"),
+                        "long_halfp": iv.long_halfp,
+                        "long_freq":  iv.freq_hz("Long"),
+                        "mic_peak":   iv.peak_count("MicL"),
+                        "mic_halfp":  iv.micl_halfp,
+                        "mic_freq":   iv.freq_hz("MicL"),
+                    }
+                    for iv in idf_intervals
+                ]
+                new_sidecar["extensions"] = ext
+
+            if args.dry_run:
+                will_write_h5 = (idf_samples or idf_intervals) and not args.skip_hdf5
+                log.info("[DRY] %s/%s — would refresh sidecar (bw_report=%s, h5=%s)",
+                         serial, path.name,
+                         "added" if not has_bw_report else "refreshed",
+                         "would write" if will_write_h5 else "skipped")
+            else:
+                event_file_io.write_sidecar(sidecar_path, new_sidecar)
+                log.info("%s/%s — sidecar refreshed (bw_report=%s, intervals=%d)",
+                         serial, path.name,
+                         "added" if not has_bw_report else "refreshed",
+                         len(idf_intervals) if idf_intervals else 0)
+            refreshed += 1
+
+            # Regenerate .h5 by replaying the same IdfEvent → Event bridge
+            # save_imported_idf uses.  For IDFW we write the decoded per-
+            # sample arrays.  For IDFH we synthesise a 1-sample-per-interval
+            # array (peak ADC count per channel per interval) so the
+            # renderer's bar-chart code has something to group on.
+            # Pre-condition: either real samples (IDFW) or decoded intervals
+            # (IDFH).  Skip otherwise.
+            have_data = bool(idf_samples) or bool(idf_intervals)
+            if have_data and not args.skip_hdf5:
+                from sfm import event_hdf5
+                hdf5_path = store.hdf5_path_for(serial, path.name)
+                if args.dry_run:
+                    log.debug("[DRY] would write %s", hdf5_path.name)
+                else:
+                    try:
+                        from micromate import IdfEvent
+                        from minimateplus.event_file_io import file_sha256
+                        idf_event = IdfEvent.from_report(report_dict, path.name)
+
+                        # Merge the binary-derived mic peak psi (only the
+                        # binary path knows the proper psi value; the .txt
+                        # carries dB(L)).  Without this, the h5 writer's
+                        # per-count mic factor is computed against the
+                        # dB(L) value-as-pseudo-psi and the mic chart
+                        # scales wildly.
+                        if (binary_md is not None and res is not None
+                                and res.event.peaks.mic_pspl_psi is not None):
+                            idf_event.peaks.mic_pspl_psi = res.event.peaks.mic_pspl_psi
+
+                        sha256 = file_sha256(path)
+                        waveform_key = bytes.fromhex(sha256)[:16]
+                        ev = idf_event.to_minimateplus_event(waveform_key)
+
+                        if is_histogram and idf_intervals:
+                            # 1 sample per interval per channel — same
+                            # synthesis save_imported_idf uses.  The h5
+                            # writer's count×geo_fs/32768 conversion turns
+                            # each peak-ADC-count into the bar's physical
+                            # value.
+                            ev.raw_samples = {
+                                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
+                                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
+                                "Long": [iv.peak_count("Long") for iv in idf_intervals],
+                                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
+                            }
+                            ev.total_samples = ev.total_samples or len(idf_intervals)
+                        elif idf_samples:
+                            ev.raw_samples = idf_samples
+                            n_samp = max(
+                                (len(idf_samples.get(ch, []))
+                                 for ch in ("Tran", "Vert", "Long", "MicL")),
+                                default=0,
+                            )
+                            ev.total_samples = ev.total_samples or n_samp
+
+                        event_hdf5.write_event_hdf5(
+                            hdf5_path, ev,
+                            serial=serial,
+                            geo_range="normal",
+                            source_kind="idf-import",
+                            tool_version=event_file_io.TOOL_VERSION,
+                        )
+                        h5_written += 1
+                        log.debug("%s/%s — .h5 written (%s)",
+                                  serial, path.name,
+                                  f"{len(idf_intervals)} intervals" if is_histogram and idf_intervals
+                                  else f"{sum(len(v) for v in (idf_samples or {}).values())} samples")
+                    except Exception as exc:
+                        log.warning("%s/%s — .h5 write failed: %s",
+                                    serial, path.name, exc)
+
+    log.info("Done.  refreshed=%d  skipped=%d  errors=%d  h5_written=%d",
+             refreshed, skipped, errors, h5_written)
+    return 0 if errors == 0 else 2
+
+
+if __name__ == "__main__":
+    sys.exit(main())
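
The IDFH branch of the script above can be illustrated in isolation. The following is a minimal sketch, not the project's implementation: `Interval` is a hypothetical stand-in for the decoded interval objects (only `peak_count()` is modelled), and the full-scale value of 10.0 is an assumed range chosen for the arithmetic, not a value taken from the source. It shows how one peak ADC count per interval becomes a per-channel array, and how the count × full-scale / 32768 conversion mentioned in the diff comment would turn each count into a bar height.

```python
from dataclasses import dataclass

ADC_FULL_SCALE = 32768  # divisor quoted in the diff comment
CHANNELS = ("Tran", "Vert", "Long", "MicL")


@dataclass
class Interval:
    """Hypothetical stand-in for one decoded IDFH interval."""
    peaks: dict  # channel name -> peak ADC count

    def peak_count(self, channel: str) -> int:
        return self.peaks[channel]


def synthesise_raw_samples(intervals):
    """Build one synthetic sample per interval per channel (the peak count)."""
    return {ch: [iv.peak_count(ch) for iv in intervals] for ch in CHANNELS}


def counts_to_physical(counts, full_scale):
    """Apply count * full_scale / 32768 to every count."""
    return [c * full_scale / ADC_FULL_SCALE for c in counts]


intervals = [
    Interval({"Tran": 16384, "Vert": 0, "Long": -8192, "MicL": 13}),
    Interval({"Tran": 32768, "Vert": 4096, "Long": 0, "MicL": 7}),
]
raw = synthesise_raw_samples(intervals)
# full_scale=10.0 is an assumed range, used only to make the numbers easy to check.
tran_bars = counts_to_physical(raw["Tran"], full_scale=10.0)
```

Each channel array has exactly as many entries as there are intervals, which is why the script falls back to `len(idf_intervals)` for `total_samples` on histogram events.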
@@ -0,0 +1,91 @@
+"""Re-ingest a prod IDFW + IDFH via the patched save_imported_idf and
+render both PDFs to confirm charts have data."""
+from __future__ import annotations
+import sys
+import json
+import datetime
+import tempfile
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+import h5py
+
+
+class FakeDb:
+    def __init__(self, event):
+        self.event = event
+    def get_event(self, _id):
+        return self.event
+
+
+def to_ts_iso(ts):
+    if ts is None:
+        return None
+    try:
+        return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+    except Exception:
+        return None
+
+
+def render_case(idf_path: Path, serial: str, out_pdf: Path, h5_summary: bool = True):
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idf_path.read_bytes(),
+            idf_path,
+            idf_report_text=None,    # production worst case: no .txt
+        )
+        print(f"=== {idf_path.name} ===")
+        print(f"  h5: {rec['hdf5_filename']}, sidecar: {rec['sidecar_filename']}")
+
+        h5p = Path(td) / serial / f"{idf_path.name}.h5"
+        if h5p.exists() and h5_summary:
+            with h5py.File(h5p) as h:
+                for ch in ("Tran", "Vert", "Long", "MicL"):
+                    ds = h.get(f"samples/{ch}")
+                    if ds is not None:
+                        n = ds.shape[0]
+                        mx = float(abs(ds[...]).max()) if n else 0
+                        print(f"  samples/{ch}: n={n}  max_abs={mx:.5f}")
+
+        record_type = "Histogram" if idf_path.suffix.upper() == ".IDFH" else "Waveform"
+        fake_row = {
+            "serial":              serial,
+            "blastware_filename":  rec["filename"],
+            "record_type":         record_type,
+            "timestamp":           to_ts_iso(ev.timestamp),
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
+        print(f"  ReportData: channels={ {k: len(v) for k,v in rd.channels.items()} }")
+        if rd.is_histogram:
+            print(f"  histogram n_intervals={rd.histogram_n_intervals} interval_size={rd.histogram_interval_size}")
+        pdf = report_pdf.render_event_report_pdf(rd)
+        out_pdf.write_bytes(pdf)
+        print(f"  PDF: {out_pdf}  ({len(pdf)} bytes)")
+
+
+def main():
+    out_dir = Path("/tmp/thor_render_test")
+    out_dir.mkdir(exist_ok=True)
+    cases = [
+        # IDFW that decoded to preamble-only under the old codec
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804154137.IDFW", "UM6047"),
+        # IDFW that worked under the old codec (validates no regression)
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804104450.IDFW", "UM6047"),
+        # IDFH histogram
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804190047.IDFH", "UM6047"),
+    ]
+    for path, serial in cases:
+        render_case(Path(path), serial, out_dir / f"{Path(path).name}.pdf")
+
+
+if __name__ == "__main__":
+    main()
@@ -499,6 +499,14 @@ async function loadEvent(eventId) {
  renderEventList();
  setStatus('Loading waveform…');
  try {
+    // Sidecar fetch runs in parallel — its bw_report block carries ZC
+    // Freq + above-range flags + sensor-check results that the per-
+    // channel stats table surfaces.  Failures are non-fatal (legacy
+    // events without a preserved .TXT have no sidecar bw_report).
+    const sidecarP = fetch(`${apiBase}/db/events/${eventId}/sidecar`)
+      .then(r => r.ok ? r.json() : null)
+      .catch(() => null);
+
    const r = await fetch(`${apiBase}/db/events/${eventId}/waveform.json`);
    if (!r.ok) {
      if (r.status === 404) {
@@ -511,7 +519,8 @@ async function loadEvent(eventId) {
    renderWaveform(data);
    // Also fetch metadata from the events list for richer header
    const ev = allEvents.find(e => e.id === eventId);
-    renderMeta(data, ev);
+    const sidecar = await sidecarP;
+    renderMeta(data, ev, sidecar);
    setStatus(`Event loaded.`, 'ok');
  } catch (e) {
    setStatus(`Failed to load event: ${e.message}`, 'error');
@@ -528,7 +537,7 @@ function showEmpty(msg) {
  charts = {};
 }

-function renderMeta(data, ev) {
+function renderMeta(data, ev, sidecar) {
  const metaDiv = document.getElementById('event-meta');
  const fields = [
    ['Serial',      data.serial || ev?.serial || '—'],
@@ -543,14 +552,20 @@ function renderMeta(data, ev) {
  ];

  // Per-channel stats table mirroring the printout's middle block.
-  // Pulls per-channel PPV from the events row (DB columns) and additional
-  // details (peak time, peak accel, peak displacement, sensor check) from
-  // bw_report when present.
+  // PPV from the events DB row; ZC Freq + saturation flags from the
+  // sidecar's bw_report block (when a .TXT was preserved on ingest).
+  const bwrPeaks = (sidecar?.bw_report || {}).peaks || {};
+  const bwrMic   = (sidecar?.bw_report || {}).mic   || {};
  const fmt = v => (v == null ? '—' : (typeof v === 'number' ? v.toFixed(3) : v));
+  const fmtZc = bwr => {
+    if (!bwr || bwr.zc_freq_hz == null) return '—';
+    const prefix = bwr.zc_freq_above_range ? '>' : '';
+    return `${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
+  };
  const rows = [
-    ['Tran', ev?.tran_ppv],
-    ['Vert', ev?.vert_ppv],
-    ['Long', ev?.long_ppv],
+    ['Tran', ev?.tran_ppv, fmtZc(bwrPeaks.tran)],
+    ['Vert', ev?.vert_ppv, fmtZc(bwrPeaks.vert)],
+    ['Long', ev?.long_ppv, fmtZc(bwrPeaks.long)],
  ];
  // Mic display honors the current user preference (dBL default).
  // mic_ppv is stored as raw psi on series3 events; convert when needed.
@@ -568,11 +583,11 @@ function renderMeta(data, ev) {
  const statsHtml = `
    <table class="stats-table">
      <thead>
-        <tr><th>Channel</th><th>PPV (in/s)</th></tr>
+        <tr><th>Channel</th><th>PPV (in/s)</th><th>ZC Freq</th></tr>
      </thead>
      <tbody>
-        ${rows.map(([ch, ppv]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td></tr>`).join('')}
-        <tr><td>MicL</td><td>${micStr}</td></tr>
+        ${rows.map(([ch, ppv, zc]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td><td>${zc}</td></tr>`).join('')}
+        <tr><td>MicL</td><td>${micStr}</td><td>${fmtZc(bwrMic)}</td></tr>
      </tbody>
    </table>
  `;
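
The `fmtZc` helper in the hunk above and the `prefix` logic in the PDF renderer follow the same rule: when `zc_freq_above_range` is set, the stored `zc_freq_hz` is the reporting ceiling and is shown with a leading `>`. Below is a rough Python mirror of that rule, offered as a sketch only. `fmt_zc` and `MISSING` are illustrative names, not part of the codebase, and the rounding is written as `floor(x + 0.5)` on the assumption that matching JavaScript's `Math.round` is what is wanted.

```python
import math
from typing import Optional

MISSING = "\u2014"  # the dash placeholder the UI shows for absent values


def fmt_zc(block: Optional[dict]) -> str:
    """Format a bw_report channel/mic block's ZC frequency for display."""
    if not block or block.get("zc_freq_hz") is None:
        return MISSING
    # Above-range readings carry the threshold, so flag them with ">".
    prefix = ">" if block.get("zc_freq_above_range") else ""
    # floor(x + 0.5) mirrors JavaScript's Math.round (Python's round() differs on .5).
    rounded = math.floor(block["zc_freq_hz"] + 0.5)
    return f"{prefix}{rounded} Hz"


above = fmt_zc({"zc_freq_hz": 100.0, "zc_freq_above_range": True})
exact = fmt_zc({"zc_freq_hz": 37.4})
absent = fmt_zc(None)
```

Sidecars written before the parser change have no `zc_freq_above_range` key, so they format as plain values until the sidecar is refreshed with `--reparse-txt`.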
@@ -99,6 +99,7 @@ class ReportData:
    mic_pspl_time_s:        Optional[float] = None
    mic_pspl_when_str:      Optional[str] = None    # histogram absolute date+time, BW-formatted
    mic_zc_freq_hz:         Optional[float] = None
+    mic_zc_freq_above_range: bool           = False
    mic_channel_test_result: Optional[str] = None
    mic_channel_test_freq_hz: Optional[float] = None
    mic_channel_test_amp_mv: Optional[float] = None
@@ -216,7 +217,8 @@ def gather_report_data(
            # Inverse of the dBL formula → psi.  Mirrors waveform_codec convention.
            rd.mic_pspl_psi = DBL_REF_PSI * (10 ** (rd.mic_pspl_dbl / 20))
        rd.mic_pspl_time_s = mic.get("time_of_peak_s")
-        rd.mic_zc_freq_hz  = mic.get("zc_freq_hz")
+        rd.mic_zc_freq_hz             = mic.get("zc_freq_hz")
+        rd.mic_zc_freq_above_range    = bool(mic.get("zc_freq_above_range"))
        sc_mic = (bw.get("sensor_check") or {}).get("mic") or {}
        rd.mic_channel_test_result   = sc_mic.get("result")
        rd.mic_channel_test_freq_hz  = sc_mic.get("freq_hz")
@@ -236,15 +238,16 @@ def gather_report_data(
            ch_when_iso = peak_when.get(ch_label)
            peak_date, peak_time = _split_iso_to_date_time(ch_when_iso)
            rd.channel_stats.append({
-                "name":          ch_label,
-                "ppv_ips":       ch.get("ppv_ips"),
-                "zc_freq_hz":    ch.get("zc_freq_hz"),
-                "time_of_peak_s": ch.get("time_of_peak_s"),
-                "peak_accel_g":  ch.get("peak_accel_g"),
-                "peak_disp_in":  ch.get("peak_disp_in"),
-                "sensor_check":  sc_ch.get("result"),
-                "peak_date":     peak_date,
-                "peak_time":     peak_time,
+                "name":               ch_label,
+                "ppv_ips":            ch.get("ppv_ips"),
+                "zc_freq_hz":         ch.get("zc_freq_hz"),
+                "zc_freq_above_range": bool(ch.get("zc_freq_above_range")),
+                "time_of_peak_s":     ch.get("time_of_peak_s"),
+                "peak_accel_g":       ch.get("peak_accel_g"),
+                "peak_disp_in":       ch.get("peak_disp_in"),
+                "sensor_check":       sc_ch.get("result"),
+                "peak_date":          peak_date,
+                "peak_time":          peak_time,
            })

        # MicL peak time (used in the mic block — "PSPL ... on DATE at TIME")
@@ -612,7 +615,8 @@ def _mic_rows(rd: ReportData) -> list[tuple[str, Optional[str]]]:
            line += f" at {rd.mic_pspl_time_s:.3f} sec."
        rows.append(("PSPL", line))
    if rd.mic_zc_freq_hz is not None:
-        rows.append(("ZC Freq", f"{rd.mic_zc_freq_hz:.0f} Hz"))
+        prefix = ">" if rd.mic_zc_freq_above_range else ""
+        rows.append(("ZC Freq", f"{prefix}{rd.mic_zc_freq_hz:.0f} Hz"))
    if rd.mic_channel_test_result:
        line = rd.mic_channel_test_result
        if rd.mic_channel_test_freq_hz is not None and rd.mic_channel_test_amp_mv is not None:
@@ -634,14 +638,7 @@ def _draw_channel_stats_waveform(ax, rd: ReportData) -> None:
        ("Sensor Check",         "sensor_check",   ""),
    ]
    _draw_stats_table(ax, rd, rows_spec)
-    if rd.peak_vector_sum_ips is not None:
-        line = f"Peak Vector Sum   {rd.peak_vector_sum_ips:.3f} in/s"
-        if rd.peak_vector_sum_time_s is not None:
-            line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
-        ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
-                ha="left", va="top", transform=ax.transAxes)
-        ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
-                ha="left", va="top", transform=ax.transAxes)
+    _draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec))


 def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
@@ -659,20 +656,54 @@ def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
        ("Sensor Check", "sensor_check",    ""),
    ]
    _draw_stats_table(ax, rd, rows_spec)
-    if rd.peak_vector_sum_ips is not None:
-        line = f"Peak Vector Sum   {rd.peak_vector_sum_ips:.3f} in/s"
-        # Histograms: "0.091 in/s on May 27, 2026 At 06:06:14"
-        # The when_str is "HH:MM:SS Month DD, YYYY" — reformat for BW match.
-        if rd.peak_vector_sum_when_str:
-            parts = rd.peak_vector_sum_when_str.split(" ", 1)
-            if len(parts) == 2:
-                line += f" on {parts[1]} At {parts[0]}"
-            else:
-                line += f" on {rd.peak_vector_sum_when_str}"
-        ax.text(0.0, -0.08, line, fontsize=9, weight="bold",
-                ha="left", va="top", transform=ax.transAxes)
-        ax.text(0.0, -0.18, "NA: Not Applicable", fontsize=7, color="#888",
-                ha="left", va="top", transform=ax.transAxes)
+    _draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec), histogram_when=True)
+
+
+def _draw_pvs_summary(
+    ax,
+    rd: ReportData,
+    *,
+    n_data_rows: int,
+    histogram_when: bool = False,
+) -> None:
+    """Render the Peak Vector Sum line below the stats table (the old
+    'NA: Not Applicable' caption is no longer drawn).
+
+    Reads ``ax._stats_table_bottom`` (set by ``_draw_stats_table`` when
+    it pins the table via an explicit ``bbox``) so the PVS line lands
+    just below the table's known bottom edge instead of guessing at the
+    geometry.
+
+    Centered horizontally for visual balance (the previous left-aligned
+    x=0 landed under the label column, not the data, which looked off).
+    """
+    if rd.peak_vector_sum_ips is None:
+        return
+
+    line = f"Peak Vector Sum   {rd.peak_vector_sum_ips:.3f} in/s"
+    if histogram_when and rd.peak_vector_sum_when_str:
+        # Histogram absolute date+time.  when_str is "HH:MM:SS Month DD, YYYY";
+        # reformat to "<value> on <date> At <time>" to match BW.
+        parts = rd.peak_vector_sum_when_str.split(" ", 1)
+        if len(parts) == 2:
+            line += f" on {parts[1]} At {parts[0]}"
+        else:
+            line += f" on {rd.peak_vector_sum_when_str}"
+    elif not histogram_when and rd.peak_vector_sum_time_s is not None:
+        line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
+
+    # _draw_stats_table stashes the bbox bottom on the axes so we don't
+    # have to guess geometry.  Falls back to a conservative default if
+    # the bbox approach hasn't run.
+    table_bottom_y = getattr(ax, "_stats_table_bottom", -0.10)
+    pvs_y = table_bottom_y - 0.04   # small gap below the table border
+
+    # Centered for visual balance — looks intentional rather than offset.
+    # The original BW-replica had a "NA: Not Applicable" caption below
+    # this line; dropped because we use "—" for missing values and the
+    # legend was always squished against the PVS line.
+    ax.text(0.5, pvs_y, line, fontsize=9, weight="bold",
+            ha="center", va="top", transform=ax.transAxes)


 def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]]) -> None:
@@ -684,13 +715,17 @@ def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]])
    ch_lookup = {c["name"]: c for c in rd.channel_stats}

    def _cell(field, ch_name):
-        val = ch_lookup.get(ch_name, {}).get(field)
+        ch_rec = ch_lookup.get(ch_name, {})
+        val = ch_rec.get(field)
        if val is None:
            return "—"
        if isinstance(val, float):
-            # ZC Freq is integer-formatted in BW; everything else with 3 decimals
+            # ZC Freq is integer-formatted in BW; ">100 Hz" sentinel
+            # rendered as ">N" (val carries the threshold).  Everything
+            # else gets 3 decimals.
            if field == "zc_freq_hz":
-                return f"{val:.0f}"
+                prefix = ">" if ch_rec.get("zc_freq_above_range") else ""
+                return f"{prefix}{val:.0f}"
            return f"{val:.3f}"
        return str(val)

@@ -703,16 +738,28 @@ def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]])
            _cell(field_name, "Long"),
            unit,
        ])
+    # Pin the table's position+size via bbox so we know exactly where
+    # the bottom edge lands.  Lets _draw_pvs_summary place the PVS line
+    # just below the table without guessing at row heights.
+    #
+    # bbox = [x, y, width, height] in axes coords.  Header + data rows
+    # at row_h each; horizontal extent matches sum(colWidths).
+    n_rows = len(table_data)        # header + data rows
+    row_h  = 0.12                   # axes-fraction per row (fits fontsize=8)
+    table_height = n_rows * row_h
+    table_bottom = 1.0 - table_height
    tbl = ax.table(
-        cellText=table_data, loc="upper left",
+        cellText=table_data,
        colWidths=[0.28, 0.14, 0.14, 0.14, 0.10],
        cellLoc="left", edges="open",
+        bbox=[0.0, table_bottom, 0.80, table_height],
    )
    tbl.auto_set_font_size(False)
    tbl.set_fontsize(8)
-    tbl.scale(1, 1.4)
    for j in range(5):
        tbl[(0, j)].set_text_props(weight="bold", color="#555")
+    # Stash the bottom Y so _draw_pvs_summary can position itself below.
+    ax._stats_table_bottom = table_bottom


 def _channel_axis_color(ch: str) -> str:
@@ -2886,6 +2886,12 @@ function _renderSidecar(data) {
  const bw   = data.blastware    || {};
  const src  = data.source       || {};
  const rev  = data.review       || {};
+  // bw_report carries the per-channel ASCII-derived stats (ZC Freq,
+  // saturation flags, peak time, etc.).  Only present on events
+  // ingested with a preserved .TXT (post-2026-05-27); falls back to
+  // empty for legacy events.
+  const bwrPeaks = (data.bw_report || {}).peaks || {};
+  const bwrMic   = (data.bw_report || {}).mic   || {};

  document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`;

@@ -2918,11 +2924,19 @@ function _renderSidecar(data) {
  document.getElementById('sc-f-sr').textContent       = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : '');
  document.getElementById('sc-f-key').textContent      = ev.waveform_key    || '—';

-  document.getElementById('sc-f-tran').textContent     = fmtPpv(pv.transverse);
-  document.getElementById('sc-f-vert').textContent     = fmtPpv(pv.vertical);
-  document.getElementById('sc-f-long').textContent     = fmtPpv(pv.longitudinal);
+  // Suffix with " · {prefix}{N} Hz" when bw_report has a ZC Freq.
+  // Above-range ZC peaks (BW ">100 Hz") get a literal ">" prefix so
+  // operators see the same indicator the PDF shows.
+  const fmtZc = bwr => {
+    if (!bwr || bwr.zc_freq_hz == null) return '';
+    const prefix = bwr.zc_freq_above_range ? '>' : '';
+    return ` · ${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
+  };
+  document.getElementById('sc-f-tran').textContent     = fmtPpv(pv.transverse)   + fmtZc(bwrPeaks.tran);
+  document.getElementById('sc-f-vert').textContent     = fmtPpv(pv.vertical)     + fmtZc(bwrPeaks.vert);
+  document.getElementById('sc-f-long').textContent     = fmtPpv(pv.longitudinal) + fmtZc(bwrPeaks.long);
  document.getElementById('sc-f-pvs').textContent      = fmtPpv(pv.vector_sum);
-  document.getElementById('sc-f-mic').textContent      = fmtMic(pv.mic_psi);
+  document.getElementById('sc-f-mic').textContent      = fmtMic(pv.mic_psi)      + fmtZc(bwrMic);

  document.getElementById('sc-f-project').textContent  = pi.project         || '—';
  document.getElementById('sc-f-client').textContent   = pi.client          || '—';
@@ -3273,7 +3287,7 @@ if (currentSection === 'db') {
          <dt id="sc-l-bwsize">File size</dt>   <dd id="sc-f-bwsize">—</dd>
          <dt id="sc-l-sha">File sha256</dt>    <dd id="sc-f-sha">—</dd>
          <dt>Source kind</dt>      <dd id="sc-f-src">—</dd>
-          <dt title="When our server received and stored this event (sfm-db insert time, not the recording time)">Received by server at</dt>
+          <dt title="When SFM received and stored this event — NOT the unit-local trigger time (see Timestamp at the top of the modal for that).">Time received</dt>
                                    <dd id="sc-f-cap">—</dd>
        </dl>
      </div>
@@ -467,21 +467,21 @@ class WaveformStore:
        Ingest a Thor (Micromate Series IV) IDF event file (`.IDFW` or
        `.IDFH`) produced by Thor's TXT exporter.

-        Thor binaries are stored as opaque bytes — seismo-relay doesn't
-        yet decode the proprietary IDF binary format (codec slot lives
-        at ``micromate/idf_file.py``).  Device-authoritative metadata
-        comes from the paired ``.IDFW.txt`` / ``.IDFH.txt`` sidecar
-        when supplied.
-
        Workflow:
-          1. Parse the paired TXT report (when supplied) via
-             ``micromate.parse_idf_report`` → dict.
-          2. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
-          3. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
-          4. Bridge IdfEvent → ``minimateplus.Event`` (for the existing
-             sidecar / DB insert machinery) via
-             ``IdfEvent.to_minimateplus_event(waveform_key)``.
-          5. Write the ``.sfm.json`` sidecar with
+          1. For sig-A `.IDFW` / `.IDFH` binaries, decode samples or intervals
+             plus binary metadata via ``micromate.idf_file.read_idf_file()``.
+             Decode failure (or sig-B) falls through to the .txt-only flow.
+          2. Parse the paired TXT report (when supplied) via
+             ``micromate.parse_idf_report`` → dict.  TXT remains the
+             source of truth for fields the binary doesn't yet supply
+             (full peak set with ZC freq / Time of Peak, sensor self-check,
+             firmware string, project strings).
+          3. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
+          4. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
+          5. Bridge IdfEvent → ``minimateplus.Event`` and attach
+             ``raw_samples`` from the binary decoder (when available).
+          6. Write the `.h5` clean-waveform file when samples decoded.
+          7. Write the ``.sfm.json`` sidecar with
             ``source.kind = "idf-import"`` and the full raw IDF report
             under ``extensions.idf_report``.

@@ -490,7 +490,38 @@ class WaveformStore:
        """
        from micromate import IdfEvent, parse_idf_report

-        # Parse the .txt sidecar (best-effort; non-fatal on failure).
+        # 1. Binary decode (sig-A IDFW and IDFH).  Non-fatal: any failure
+        # leaves samples / binary metadata unfilled and we proceed with
+        # the .txt path as before.
+        idf_samples: Optional[dict] = None
+        idf_intervals: Optional[list] = None
+        binary_md = None
+        binary_peaks = None
+        is_histogram = False
+        try:
+            from micromate.idf_file import read_idf_file
+            # Pass idf_bytes through `data=` — at this point in the flow
+            # the binary hasn't been written to disk yet, so the codec
+            # can't read from source_path.  We still pass source_path so
+            # the codec has the filename for error messages + .IDFH
+            # suffix detection.
+            res = read_idf_file(source_path, data=idf_bytes)
+            idf_samples = res.samples or None
+            idf_intervals = res.intervals
+            is_histogram = res.intervals is not None
+            binary_md = res.binary_metadata
+            binary_peaks = res.event.peaks
+        except NotImplementedError:
+            # sig-B — codec doesn't handle this yet.
+            pass
+        except Exception as exc:
+            log.warning(
+                "save_imported_idf: binary codec failed for %s: %s — "
+                "falling back to .txt-only ingest",
+                source_path.name, exc,
+            )
+
+        # 2. Parse the .txt sidecar (best-effort; non-fatal on failure).
        report_dict: dict = {}
        if idf_report_text is not None:
            try:
@@ -501,17 +532,58 @@ class WaveformStore:
                    exc,
                )

-        # Build the typed IdfEvent.  Filename is authoritative for
+        # 3. Backfill report_dict with binary metadata for fields the
+        # .txt didn't supply.  Each guard below only fills a key the .txt
+        # left empty, so a .txt value always wins when present (including
+        # timestamp and sample_rate).
+        if binary_md is not None:
+            if binary_md.serial and not report_dict.get("serial_number"):
+                report_dict["serial_number"] = binary_md.serial
+            if binary_md.event_datetime and not report_dict.get("event_datetime"):
+                report_dict["event_datetime"] = binary_md.event_datetime
+            if binary_md.sample_rate and not report_dict.get("sample_rate"):
+                report_dict["sample_rate"] = binary_md.sample_rate
+            if binary_md.record_time_sec and not report_dict.get("record_time_sec"):
+                report_dict["record_time_sec"] = binary_md.record_time_sec
+            # Calibration date (binary) vs calibration text (.txt) cohabit
+            # under different keys; no overwrite needed.
+            if binary_md.event_datetime and not report_dict.get("event_type"):
+                report_dict["event_type"] = (
+                    "Full Histogram" if is_histogram else "Full Waveform"
+                )
+
+        # Binary-derived peaks fill in when the .txt didn't supply them.
+        # They're ~3% low vs the device-authoritative .txt values (residual
+        # codec drift), so .txt always wins when present.
+        if binary_peaks is not None:
+            if binary_peaks.transverse_ips and not report_dict.get("tran_ppv"):
+                report_dict["tran_ppv"] = binary_peaks.transverse_ips
+            if binary_peaks.vertical_ips and not report_dict.get("vert_ppv"):
+                report_dict["vert_ppv"] = binary_peaks.vertical_ips
+            if binary_peaks.longitudinal_ips and not report_dict.get("long_ppv"):
+                report_dict["long_ppv"] = binary_peaks.longitudinal_ips
+
+        # 4. Build the typed IdfEvent.  Filename supplies the default
        # (serial, timestamp, kind); the report's event_datetime takes
        # precedence over the filename timestamp inside from_report().
        idf_event = IdfEvent.from_report(report_dict, source_path.name)

+        # The binary mic peak (psi) isn't carried through from_report() —
+        # IdfReport.from_dict only sees the .txt's dB(L) value.  Pull the
+        # binary-derived ``mic_pspl_psi`` onto the typed IdfEvent so the
+        # downstream bridge can populate ``PeakValues.micl`` (psi-shaped)
+        # and the h5 writer's per-count mic factor lands at a sensible
+        # value.  Without this, the h5 mic chart auto-scales against the
+        # dB(L) value-as-pseudo-psi and renders ~flat.
+        if binary_peaks is not None and binary_peaks.mic_pspl_psi is not None:
+            idf_event.peaks.mic_pspl_psi = binary_peaks.mic_pspl_psi
+
        # Operator-supplied serial_hint wins over the binary's filename
        # prefix when both are present (e.g. callers passing a known-good
        # serial that overrides a misnamed export).
        serial = serial_hint or idf_event.serial or "UNKNOWN"

-        # Filesystem write.
+        # 5. Filesystem write of binary bytes.
        filename = source_path.name
        bw_path = self._serial_dir(serial) / filename
        bw_path.write_bytes(idf_bytes)
@@ -523,13 +595,59 @@ class WaveformStore:
        # surrogate — every distinct binary maps to a distinct row.
        waveform_key = bytes.fromhex(sha256)[:16]

-        # Bridge to minimateplus.Event for the existing sidecar / DB
+        # 6. Bridge to minimateplus.Event for the existing sidecar / DB
        # insert paths.  See IdfEvent.to_minimateplus_event() for the
        # caveats of this bridge (mic units, missing fields → sidecar).
        ev = idf_event.to_minimateplus_event(waveform_key)

-        # Write the sidecar.  Source kind "idf-import" was added to the
-        # allow-list in event_file_io.event_to_sidecar_dict for this.
+        # Attach the decoded sample arrays.  Thor's decoder counts use
+        # LSB = 0.0003 in/s for geo (vs BW's 16-count units at 0.005 in/s)
+        # — the .h5 writer's geo_range="normal" yields LSB = 10/32768
+        # ≈ 0.000305 in/s, so plotted samples come out ~1.7% high.
+        # Acceptable known offset; refine with a Thor-aware h5 path later.
+        if idf_samples is not None:
+            ev.raw_samples = idf_samples
+            n_samples = max((len(idf_samples.get(ch, [])) for ch in ("Tran", "Vert", "Long", "MicL")), default=0)
+            ev.total_samples = ev.total_samples or n_samples
+
+        # For IDFH histograms there are no per-sample waveform arrays — the
+        # device stores one peak ADC count per interval per channel.  Synthesise
+        # a 1-sample-per-interval array so the existing h5+renderer pipeline
+        # (which groups samples down to ``n_intervals`` bars via max-per-group)
+        # produces a non-blank histogram chart.  Each "sample" is the peak ADC
+        # count for that interval, so the h5 writer's ``count × geo_fs/32768``
+        # conversion yields the right physical value for the bar height.
+        if is_histogram and idf_intervals:
+            hist_samples = {
+                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
+                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
+                "Long": [iv.peak_count("Long") for iv in idf_intervals],
+                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
+            }
+            ev.raw_samples = hist_samples
+            ev.total_samples = ev.total_samples or len(idf_intervals)
+
+        # 7. Write the .h5 clean-waveform file when we have samples to write
+        # (either the IDFW per-sample stream, or the IDFH synthesised per-
+        # interval peak array).  The renderer treats both shapes the same way.
+        hdf5_filename: Optional[str] = None
+        if ev.raw_samples:
+            hdf5_path = self.hdf5_path_for(serial, filename)
+            try:
+                event_hdf5.write_event_hdf5(
+                    hdf5_path, ev,
+                    serial=serial,
+                    geo_range="normal",   # Thor's geo full scale is also 10 in/s (Normal)
+                    source_kind="idf-import",
+                )
+                hdf5_filename = hdf5_path.name
+            except Exception as exc:
+                log.warning(
+                    "save_imported_idf: HDF5 write failed for %s: %s — continuing without .h5",
+                    hdf5_path, exc,
+                )
+
+        # 8. Write the sidecar.  Source kind "idf-import" is on the allow-list.
        sidecar_path = self.sidecar_path_for(serial, filename)
        existing_review = None
        if sidecar_path.exists():
@@ -554,19 +672,67 @@ class WaveformStore:
        # Time of Peak, sensor self-check, calibration, firmware).
        if report_dict:
            sidecar["extensions"]["idf_report"] = report_dict
+
+        # Project the IDF report into the BW report sidecar shape so the
+        # existing Event Report PDF pipeline (sfm/report_pdf.py) can
+        # render Thor events without needing a separate code path.  Thor
+        # data is 95% the same metric set as BW — the adapter handles
+        # the field-name mapping.
+        if report_dict or binary_md is not None:
+            try:
+                from micromate.idf_to_bw_report import build_bw_report_from_idf
+                sidecar["bw_report"] = build_bw_report_from_idf(
+                    report_dict or {},
+                    binary_md=binary_md,
+                    intervals=idf_intervals,
+                    is_histogram=is_histogram,
+                )
+            except Exception as exc:
+                log.warning(
+                    "save_imported_idf: idf→bw_report adapter failed for %s: %s — "
+                    "report PDF will fall back to DB-only fields",
+                    filename, exc,
+                )
+        # For histograms, also stash the binary-decoded per-interval
+        # records so the UI / report layer doesn't need to re-walk the
+        # IDFH file at render time.
+        if idf_intervals is not None:
+            sidecar["extensions"]["idf_intervals"] = [
+                {
+                    "offset":     iv.offset,
+                    "tran_peak":  iv.peak_count("Tran"),
+                    "tran_halfp": iv.tran_halfp,
+                    "tran_freq":  iv.freq_hz("Tran"),
+                    "vert_peak":  iv.peak_count("Vert"),
+                    "vert_halfp": iv.vert_halfp,
+                    "vert_freq":  iv.freq_hz("Vert"),
+                    "long_peak":  iv.peak_count("Long"),
+                    "long_halfp": iv.long_halfp,
+                    "long_freq":  iv.freq_hz("Long"),
+                    "mic_peak":   iv.peak_count("MicL"),
+                    "mic_halfp":  iv.micl_halfp,
+                    "mic_freq":   iv.freq_hz("MicL"),
+                }
+                for iv in idf_intervals
+            ]
        event_file_io.write_sidecar(sidecar_path, sidecar)

        log.info(
            "WaveformStore.save_imported_idf serial=%s filename=%s filesize=%d "
-            "report_attached=%s",
-            serial, filename, filesize, bool(report_dict),
+            "kind=%s report_attached=%s binary_decoded=%s h5=%s intervals=%d",
+            serial, filename, filesize,
+            "histogram" if is_histogram else "waveform",
+            bool(report_dict),
+            (idf_samples is not None) or (idf_intervals is not None),
+            hdf5_filename or "(skipped)",
+            len(idf_intervals) if idf_intervals else 0,
        )
        return ev, {
            "filename":           filename,
            "filesize":           filesize,
            "sha256":             sha256,
            "a5_pickle_filename": None,
-            "hdf5_filename":      None,
+            "hdf5_filename":      hdf5_filename,
            "sidecar_filename":   sidecar_path.name,
            "serial":             serial,
        }
@@ -441,6 +441,40 @@ def test_real_oorange_event_t190_parses():
    assert r.channels["Long"].ppv_ips == pytest.approx(2.83)
    assert r.peak_vector_sum_saturated is True
    assert r.peak_vector_sum_time_s == pytest.approx(0.007)
+    # Same fixture: Tran ZC Freq is ">100 Hz" — must parse as 100 +
+    # above_range flag, not None (which would render as "—" on the PDF).
+    assert r.channels["Tran"].zc_freq_hz == 100.0
+    assert r.channels["Tran"].zc_freq_above_range is True
+    # Vert/Long are normal numeric values; flag stays False.
+    assert r.channels["Vert"].zc_freq_above_range is False
+    assert r.channels["Long"].zc_freq_above_range is False
+
+
+def test_above_range_marker_treated_as_zc_threshold():
+    """BW writes '>100 Hz' for ZC Freq when the zero-crossing algorithm
+    sees a peak too fast to count (cuts off at the device's 100 Hz
+    reporting ceiling).  Parser must store the threshold + flag, not
+    fall back to None.
+    """
+    txt = """\
+"Event Type : Full Waveform"
+"Serial Number : BE18190"
+"Tran ZC Freq : >100 Hz"
+"Vert ZC Freq : 73 Hz"
+"Long ZC Freq : N/A Hz"
+"MicL  ZC Freq : >100 Hz"
+"""
+    r = parse_report(txt)
+    assert r.channels["Tran"].zc_freq_hz == 100.0
+    assert r.channels["Tran"].zc_freq_above_range is True
+    assert r.channels["Vert"].zc_freq_hz == 73.0
+    assert r.channels["Vert"].zc_freq_above_range is False
+    # N/A → None, flag stays False
+    assert r.channels["Long"].zc_freq_hz is None
+    assert r.channels["Long"].zc_freq_above_range is False
+    # Mic above-range
+    assert r.mic.zc_freq_hz == 100.0
+    assert r.mic.zc_freq_above_range is True


 def test_real_histogram_fixture_populates_sensor_location():
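The above-range ZC Freq behaviour exercised by the tests in this diff can be sketched in isolation. This is a minimal illustration only: `parse_zc_freq` and `format_zc_freq` are hypothetical names, not the project's `parse_report` internals, and the regex is an assumption about the value portion of a `"Tran ZC Freq : >100 Hz"` line.

```python
import re
from typing import Optional, Tuple

# Assumed shape of the value part of a ZC Freq line: optional ">" marker,
# a number, then the "Hz" unit.  Anything else (e.g. "N/A Hz") is missing.
_ZC_RE = re.compile(r"^\s*(>?)\s*(\d+(?:\.\d+)?)\s*Hz\s*$")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for one value string."""
    m = _ZC_RE.match(text)
    if m is None:
        # Unparseable or "N/A": no value, flag stays False.
        return None, False
    # ">100 Hz" stores the threshold (100.0) plus the above-range flag.
    return float(m.group(2)), m.group(1) == ">"


def format_zc_freq(value: Optional[float], above_range: bool) -> str:
    """Mirror the stats-table cell: integer Hz with an optional '>' prefix."""
    if value is None:
        return "\u2014"  # the dash placeholder used for missing cells
    prefix = ">" if above_range else ""
    return f"{prefix}{value:.0f}"


tran = parse_zc_freq(">100 Hz")
vert = parse_zc_freq("73 Hz")
long_ = parse_zc_freq("N/A Hz")
```

Keeping the threshold as a plain float and the marker as a separate boolean means numeric consumers (sorting, plotting) keep working, while renderers decide independently whether to show the `>` prefix.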
Author	SHA1	Message	Date
serversdown	25386cab8b	fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge Two gaps in backfill_thor_events.py that left old Thor events showing stale charts after a v0.21.1 backfill pass: 1. IDFH events were skipped from .h5 regeneration (the "have decoded samples" gate was IDFW-only). Histograms kept their pre-v0.21.1 .h5, written from raw_samples = None, which the renderer turned into a near-empty bar chart, or for older events the dB(L)-as-pseudo-psi mic scale that produced "107.7 psi" peaks (atomic-bomb level instead of footstep level). Fix: synthesise the same 1-sample-per-interval array save_imported_idf v0.21.1 uses (peak ADC count per channel per interval) so the renderer's bar-chart grouping has data to work with. 2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the IdfEvent before to_minimateplus_event(). The live save_imported_idf does this merge; without it, IdfEvent.from_report() only sees the .txt's dB(L) value, the bridge falls back to the dBL→psi formula (instead of the binary-accurate 2.14e-6 psi/count value), and the h5 writer's per-count mic factor lands on a less-correct value. Fix: same merge the live ingest does (lift res.event.peaks.mic_pspl_psi onto idf_event.peaks before the bridge call). Verified against UM6047_20250804190047.IDFH (250-interval prod histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being treated as dB(L)=107.7 in the old h5). Operator: re-run after deploy. `docker compose exec sfm python scripts/backfill_thor_events.py` is idempotent: the existing version check still skips events already at the new TOOL_VERSION, and review state + captured_at are preserved on the second pass.	2026-06-01 20:02:54 +00:00
serversdown	6cb619ecc4	version bump - 0.21.1	2026-06-01 19:33:44 +00:00
serversdown	1ed86244d0	fix(thor-events): add parallel field for mic psi. Now shows mic in dbl and psi. (psi for charts)	2026-06-01 18:27:24 +00:00
serversdown	b2c565f217	fix(idf_waveforms): _find_waveform_body_offset() — scans every 00 02 00 magic past offset 0x0E00, runs decode_waveform_v2 on each candidate, picks the one that returns the most samples. Validated on 483 prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully decode, 126/483 partial (BW codec walker-stops-early on loud events — known issue). IDFH now synthesises a 1-sample-per-interval array from the binary intervals and writes an .h5 so the existing renderer works unchanged. Each "sample" is the per-interval peak ADC count → h5_value = count × geo_fs/32768 yields the right bar height.	2026-05-31 20:51:09 +00:00
serversdown	43f440812a	scripts: add backfill_thor_events.py Refreshes the bw_report sidecar block + .h5 waveform files for Thor events ingested before the v0.21.0 adapter wiring + the `bee1185` codec fix. Those events landed with extensions.idf_report only (no bw_report, no .h5 for IDFW) — symptom on the UI side: the modal chart 404'd on /waveform.json and the PDF rendered from DB-only fields without sensor self-check, full per-channel breakdown, or mic dB(L). Walks <store>/<serial>/<filename>: - Reads the existing sidecar (preserves review state + captured_at) - Re-runs read_idf_file() on the binary bytes (passes data= kwarg so codec doesn't try the broken bare-path Path.read_bytes) - Reads extensions.idf_report from the existing sidecar - Runs build_bw_report_from_idf adapter - Writes refreshed sidecar with bw_report + bumped tool_version, preserving review block and original captured_at - For IDFW: regenerates .h5 by bridging IdfEvent.from_report -> to_minimateplus_event -> write_event_hdf5 (mirrors save_imported_idf steps 4-7) - IDFH events skip .h5 (histograms have no per-sample data) Skips events already at current TOOL_VERSION with bw_report present. --force overrides. --skip-hdf5 limits to sidecar-only refresh. --dry-run for preview. Validated against the prod-snap waveform store: 3,815 Thor sidecars refreshed cleanly with 0 errors, 462 IDFW .h5 files written, 2 skipped (binaries with no sidecar — backfill doesn't conjure events from nothing). Verified one originally-broken IDFW event now serves waveform.json (200, 168KB) and a fully populated PDF (119KB vs the previous 56KB sparse output). Operator workflow on prod: docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py --dry-run # Inspect counts, then for real: docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py Idempotent — re-running it is a no-op once everything's at the current TOOL_VERSION. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-30 04:37:43 +00:00
serversdown	23e83908c2	report_pdf: fix PVS overlapping stats table, drop NA caption Three related fixes to the per-channel stats block: 1. Pin the stats table's position via an explicit bbox= on ax.table() so the bottom edge is at a known axes-fraction Y. The previous loc="upper left" + tbl.scale(1, 1.4) combo let matplotlib choose row heights based on text size, which made the table extend further below the axes than the hard-coded PVS line at y=-0.08 expected. Result was the "Peak Vector Sum X in/s" string landing horizontally inside the Peak Displacement row. With bbox=[0, 1-N*0.12, 0.80, N*0.12] the table is pinned to a precise rectangle (12% axes-fraction per row × N rows tall). _draw_stats_table now stashes the bottom Y on the axes for the PVS helper to reference, so the geometry stays in sync. 2. Center PVS horizontally (ha="center" at x=0.5 instead of ha="left" at x=0). The previous left-edge alignment put PVS at the same X as the label column, which read as "off-center" once the rest of the stats data was column-aligned further right. 3. Drop the "NA: Not Applicable" caption. It existed to explain "—" placeholder cells, but "—" is universally understood and the caption was always visually squished against the PVS line below. Less cruft on the page; one fewer position to manage. Verified against a real BE12599 histogram event (5 data rows) and a real UM12947 IDFW waveform event (6 data rows); both layouts clear the table cleanly with no overlap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 22:17:43 +00:00
serversdown	bee118506b	fix(idf): decode from in-memory bytes during ingest Bug shipped in v0.21.0: save_imported_idf called read_idf_file() with `source_path` (a bare filename like "UM12947_….IDFW") BEFORE writing the binary to disk. The codec did Path(path).read_bytes() which resolved relative to /app and hit FileNotFoundError. The error was caught + logged as a warning, and ingest fell back to .txt-only — events still landed in the DB but lost the bw_report block + .h5 waveform that the codec was supposed to produce. Observed during a full re-forward from thor-watcher on 2026-05-29: every Thor event logged "binary codec failed for X: [Errno 2] No such file or directory" and got binary_decoded=False. Fix: - read_idf_file() gains a `data: Optional[bytes]` kwarg. When supplied, skips the disk read and decodes the provided bytes directly. `path` stays required (used for filename in error messages + .IDFH vs .IDFW suffix detection); only the read is conditional. Backward compatible — existing positional callers (CLI scripts, tests) continue to work unchanged. - save_imported_idf passes `data=idf_bytes` since the bytes are already in memory from the multipart upload. Filesystem write still happens at step 5 of the existing flow; codec just no longer depends on it. Verified end-to-end against UM11719_20231219162723.IDFW from the example-data corpus: ingest endpoint returns inserted=1, log line shows binary_decoded=True + h5=...IDFW.h5, no warnings. Re-forward existing Thor events from thor-watcher after deploy to backfill the bw_report block — UPSERT preserves review state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 20:09:54 +00:00
serversdown	defd17d9c2	sfm_webapp: harmonize "Received by server at" → "Time received" Matches Terra-View's event-modal relabel from the same iteration. Wording was already clearer here than in Terra-View's "Captured at", but using identical text across both surfaces means operators see the same label whether they're in the native modal or the standalone webapp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 19:51:58 +00:00
serversdown	e42956a20b	release: v0.21.0 — Thor / Series IV codec + Thor→BW adapter Documents two commits that landed on dev since v0.20.0: `9b71ead` series 4 codec work, initial decode success micromate/idf_file.read_idf_file() decodes both IDFW (waveform; 87-99% sample fidelity reusing decode_waveform_v2 at offset 0x0f1f) and IDFH (histogram; dedicated segment-based decoder, all 859 corpus files decode, 181,071 intervals total). `9fd52dd` feat: add thor report generation, pdf generation micromate/idf_to_bw_report.py adapter projects parsed Thor data into the bw_report sidecar shape so Thor events flow through sfm/report_pdf.py without a separate renderer. Wired into save_imported_idf. Net effect: a Thor event ingested via /db/import/idf_file now lands with the same fidelity as a BW event, gets a per-event PDF on demand, and renders in Terra-View's modal chart using the same plotting code as a BW event. Roadmap items closed: - Binary .IDFW / .IDFH codec (was pending) - Series IV (Thor IDF) binary codec reverse-engineering Companion: Terra-View v0.13.0 ships in parallel and closes Phase 1 of the SFM integration. No API changes in seismo-relay for that piece — Terra-View just consumes existing endpoints better. Bumps: - pyproject.toml 0.20.0 → 0.21.0 - minimateplus.event_file_io.TOOL_VERSION 0.20.0 → 0.21.0 (any subsequent backfill_sidecars.py --force will re-stamp existing sidecars; expected + harmless) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 19:25:44 +00:00
serversdown	9fd52ddabb	feat: add thor report generation, pdf generation.	2026-05-29 19:03:06 +00:00
serversdown	9b71ead44b	series 4 codec work, initial decode success	2026-05-29 06:33:13 +00:00
serversdown	1bccc44b88	release: v0.20.0 — PDF + parser polish Closes out the Event-Report PDF iteration started in v0.17.x and ships the parser fixes the real-world events were tripping over. Today's additions on top of the pre-v0.20 unreleased body: - Server-wide display TZ via the TZ env var (default America/New_York on prod). Affects server logs, the PDF report's "Created" footer, matplotlib datetime axes. DB columns stay UTC. Dockerfile now installs tzdata. - ZC Freq "above-range" handling — parser stores 100.0 + zc_freq_above_range flag for BW's ">100 Hz" marker. Renders as >100 in the PDF stats table, both modals (inline on webapp Peaks, new column on event-browser table). - scripts/backfill_sidecars.py --reparse-txt — re-runs the current parser against the preserved _ASCII.TXT and overwrites the sidecar's bw_report block. Lets parser fixes reach old events without re-forwarding. Validated end-to-end against ~10k prod events. Fixes shipped today: - histogram_interval_size_s missing from ReportData → every histogram PDF render 500'd. - Histogram PDF geo channels now share a nice-quantized y-axis (0.005-LSB-aware 1-2-5 step sequence) instead of auto-scaling per channel + inventing sub-LSB "0.003 in/s/div" footer labels. Roadmap delta: closes the BW ASCII parser "PPV-miss on some TXT formats", "histogram-specific structural fields", and ">100 Hz value parsing" items. Adds a new entry for the byte[5]==0 histogram body sub-format observed on S353 events. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 21:17:53 +00:00
serversdown	a3cc44d30a	feat(backfill): --reparse-txt flag to refresh bw_report from preserved .TXT The existing backfill_sidecars.py PRESERVES the bw_report block across regenerations — it's treated as the source of truth from the original ingest pass (the .TXT isn't reachable from the script's normal data path, so it can't be re-derived). That means parser-side fixes (like the 2026-05-28 ">100 Hz" ZC Freq addition) won't reach old events even with --force. The new --reparse-txt flag fixes that: when the sidecar's source.txt_filename points at a preserved <serial>/<filename>_ASCII.TXT, the script re-runs the current parser against it and overwrites the bw_report block. Implies sidecar regeneration on every event (bypasses the sha-up-to-date / version-up-to-date skip), so that the .h5 cascade-regenerates alongside. No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27). Idempotent — re-running it produces the same sidecar bytes when the parser hasn't changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:56:23 +00:00
serversdown	6a73523e4d	ui: surface per-channel ZC Freq (and ">100") in event modals The PDF report shows per-channel ZC Freq alongside PPV in the stats block, but neither modal exposed it. Now that the sidecar projection carries zc_freq_hz + zc_freq_above_range, plumb them through: - sfm_webapp.html: inline suffix on existing Peaks cells, e.g. "Tran 0.04500 in/s · >100 Hz". Empty suffix when no ZC is available (legacy events without a preserved .TXT). - event_browser.html: new ZC Freq column on the per-channel stats table. Required adding a parallel sidecar fetch in loadEvent() (waveform.json alone doesn't carry bw_report). Fetch failure is non-fatal — falls back to "—" in the new column. Above-range ZC peaks (BW ">100 Hz") render with a literal ">" prefix mirroring the PDF, so operators don't have to generate the PDF to see when a channel hit the zero-crossing ceiling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:47:37 +00:00
serversdown	780b45a371	feat: render ">100" for above-range ZC Freq instead of "—" BW writes ">100 Hz" for ZC Freq when the zero-crossing algorithm sees a peak too fast to count — the device's reporting ceiling is 100 Hz on V10.72. Our parser fell back to None via _parse_number (which requires a leading digit), so the PDF rendered "—" where BW shows ">100". Mirrors the OORANGE/saturated pattern already used for PPV and PSPL: parser stores the threshold (100.0) on zc_freq_hz + sets a new zc_freq_above_range flag. Projection carries the flag through to the sidecar; PDF renderer prepends ">" when set. Affects both per-channel stats tables (waveform + histogram variants) and the mic block's ZC Freq row. Verified on the real T190LD5Q.LK0W fixture: Tran zc_freq_hz=100.0 above_range=True; Vert/Long (normal values) above_range=False; "N/A" still produces zc_freq_hz=None which renders as "—" (unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 18:38:49 +00:00
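The above-range ZC Freq handling described in the entry above can be sketched roughly as follows. This is an illustration only, not the project's parser: `parse_zc_freq` and `render_zc_freq` are hypothetical names, and the sketch assumes the cell text looks like "23.5 Hz", ">100 Hz", or "N/A".

```python
import re
from typing import Optional, Tuple

# Optional ">" marker followed by a number; anything else (e.g. "N/A") is unparsed.
_ZC_CELL = re.compile(r"^(>?)\s*(\d+(?:\.\d+)?)")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for one ZC Freq cell."""
    match = _ZC_CELL.match(text.strip())
    if match is None:
        return None, False
    # For ">100 Hz" the stored value is the threshold itself (100.0).
    return float(match.group(2)), match.group(1) == ">"


def render_zc_freq(value: Optional[float], above_range: bool) -> str:
    """Format a cell for display: '>' prefix when flagged, dash when missing."""
    if value is None:
        return "\u2014"  # the dash placeholder the commit says is shown for N/A
    prefix = ">" if above_range else ""
    return f"{prefix}{value:g}"
```

Keeping the numeric threshold and the flag as separate fields means downstream consumers can still sort or plot on `zc_freq_hz` while the renderer decides whether to show the ">" prefix.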