Merge pull request 'update to v0.21.1, thor data import successful' (#29) from dev into main

Reviewed-on: #29
fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge
2026-06-01 16:54:23 -04:00 · 2026-06-01 20:02:54 +00:00 · 2026-06-01 19:33:44 +00:00 · 2026-06-01 18:27:24 +00:00 · 2026-05-31 20:51:09 +00:00 · 2026-05-30 04:37:43 +00:00
49 changed files with 6609 additions and 209 deletions
@@ -1,6 +1,6 @@
 /bridges/captures/
 /example-events/
-
+/tests/fixtures/
 /manuals/

 # Python build artifacts
@@ -4,6 +4,190 @@ All notable changes to seismo-relay are documented here.

 ---

+## [Unreleased]
+
+---
+
+## v0.21.1 — 2026-06-01
+
+Bug fixes against v0.21.0 surfaced after the first prod redeploy.  Three
+production-visible symptoms — blank waveform charts on most Thor events,
+blank histogram charts on all Thor events, and a mic chart that
+auto-scaled against a dB(L) value treated as psi — all root-caused and
+fixed.
+
+### Fixed
+
+- **Dynamic IDFW body offset.**  The v0.21.0 codec hardcoded the body
+  at file offset `0x0f1f` based on the example corpus, but only ~52%
+  of production IDFW events use that offset; the rest sit at offsets
+  from `0x1033` up to `0x3082` depending on header padding.  At
+  `0x0f1f` the codec would find a coincidentally-matching `00 02 00`
+  magic, read the 2-byte Tran preamble, and return empty V/L/M
+  arrays — producing near-empty .h5 files and blank charts.
+  `micromate.idf_file._find_waveform_body_offset()` now scans every
+  `00 02 00` magic position past `0x0E00`, trial-decodes each one,
+  and picks the offset with the most samples.  Validated across 483
+  prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully
+  decode, 126/483 partial (BW codec walker-stops-early on loud
+  events — pre-existing limitation, samples reached are correct).
+
+- **IDFH histograms now render bar charts.**  Histograms previously
+  skipped the .h5 write because there are no per-sample arrays, but
+  the renderer drives the per-interval bar chart from .h5 channel
+  data + `bw_report.histogram.n_intervals`.  `save_imported_idf` now
+  synthesizes a 1-sample-per-interval array from the decoded
+  `IdfhInterval` peak counts and writes an .h5 so the existing
+  renderer works unchanged — each "sample" is the per-interval peak
+  ADC count, so the writer's `count × geo_fs/32768` conversion
+  yields the right bar height.
+
+- **Mic chart scaling on Thor events.**  `PeakValues.micl` (consumed
+  by the h5 writer's per-count mic scale factor) expects psi, but
+  the Thor bridge was stuffing the dB(L) value (~99.4) into it,
+  producing a per-count factor 5+ orders of magnitude too large and
+  a flat-looking mic chart.  Fixed by adding `IdfPeaks.mic_pspl_psi`
+  alongside `mic_pspl_dbl`; `read_idf_file()` computes it from
+  binary mic counts (`max(|MicL|) × 2.14e-6 psi/count`) for both
+  IDFW and IDFH paths; `save_imported_idf` merges it onto the typed
+  event after `IdfEvent.from_report`; the bridge feeds psi to
+  `PeakValues.micl` with a dB(L)→psi formula fallback when only the
+  dB(L) value is available.  dB(L) for the report header still
+  flows through `bw_report.mic.pspl_dbl` unchanged.
+
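The offset scan described in the first "Fixed" bullet above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `micromate/idf_file.py`: `find_body_offset` and the `trial_decode` callable are hypothetical names, the scan is assumed to start at `0x0E00` inclusive, and the real `_find_waveform_body_offset()` may score candidates differently.

```python
from typing import Callable, Optional

MAGIC = b"\x00\x02\x00"  # body magic named in the changelog entry
SCAN_START = 0x0E00      # assumption: scanning begins here, inclusive


def find_body_offset(
    data: bytes,
    trial_decode: Callable[[bytes], int],
) -> Optional[int]:
    """Return the magic offset whose trial decode yields the most samples.

    `trial_decode` is a hypothetical stand-in for the real waveform
    walker: it receives the bytes starting at a candidate offset and
    returns how many samples it managed to decode from them.
    """
    best_offset: Optional[int] = None
    best_count = 0
    pos = data.find(MAGIC, SCAN_START)
    while pos != -1:
        count = trial_decode(data[pos:])
        if count > best_count:
            best_offset, best_count = pos, count
        # Continue from the next byte so overlapping matches are not missed.
        pos = data.find(MAGIC, pos + 1)
    return best_offset
```

Picking the candidate with the most decoded samples is what lets a coincidental match at `0x0f1f` (which decodes only a preamble) lose to the real body further into the file.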
+### Operator
+
+After deploy, run `python scripts/backfill_thor_events.py` to refresh
+every existing Thor event's sidecar + .h5 with the corrected codec
+output.  The script auto-skips events already at the current
+`TOOL_VERSION`, so the bump from `0.21.0` → `0.21.1` is what triggers
+the refresh.
+
+---
+
+## v0.21.0 — 2026-05-29
+
+The "Thor / Series IV codec" release.  Two big pieces landed: (1) the IDF binary codec actually decodes now, both IDFW and IDFH, and (2) a Thor→BW adapter lets Thor events flow through the existing Series III Event Report PDF pipeline.  Combined effect: a Thor event ingested via `/db/import/idf_file` now lands in the DB with the same fidelity as a Blastware event, gets a per-event PDF on demand, and renders in Terra-View's modal chart with the same plotting code as a BW event.
+
+### Added — Thor IDF binary codec (`micromate/idf_file.read_idf_file`)
+
+- **IDFW (waveform)** — body sits at fixed file offset `0x0f1f`; reuses the verified `decode_waveform_v2()` walker from `minimateplus.waveform_codec`.  Sample fidelity is **87–99% byte-exact** against the ASCII-sidecar reference values on quiet events; loud events hit the same walker-stops-early limitation as the BW codec on `SP0/SS0/SV0`-style events.
+- **IDFH (histogram)** — dedicated segment-based decoder for the Thor histogram body format: `[len_be][0a 00 00 00][00 NN][05 3f]` framing plus N × 72-byte interval records (4 × 16-byte per-channel min/max/halfp).  **All 859 Thor IDFH corpus files decode**, totalling **181,071 intervals**; per-channel peaks match the sidecar within **~1.8% (ADC quantization)**.
+- **BW-aliased binary detection** — a small number of corpus files (e.g. `BE9439_*.IDFW/IDFH`) are actually Series III Blastware binaries that share the IDF filename convention by accident.  `read_idf_file()` detects them via their BW `STRT` signature and raises `NotImplementedError` pointing the caller at `read_blastware_file()` instead of trying to decode them as IDF.
+- Full field layouts in `docs/idf_protocol_reference.md`; supporting analysis scripts in `analysis_idf/` (decode validators, per-file detail dumps, corpus accuracy reports).
+
+### Added — Thor → BW report adapter (`micromate/idf_to_bw_report.py`)
+
+- **`build_bw_report_from_idf(report_dict, binary_md=, intervals=, is_histogram=)`** projects a parsed Thor `IdfReport` plus binary-extracted metadata plus decoded IDFH intervals into the `bw_report`-shaped dict that `sfm.report_pdf.gather_report_data` consumes.  No need to duplicate the renderer — Thor data is ~95% the same metric set as BW; the adapter handles the field-name mapping (`MicPSPL` → `pspl_dbl`, `>100` sentinel → `zc_freq_above_range`, free-form `Calibration : Nov 22, 2023 by Instantel` → `calibration_date` + `calibration_by`, etc.).
+- For IDFH events the adapter derives `histogram.interval_times` by stepping `IntervalSize` from `HistogramStartTime`, matching what the BW pipeline expects from a histogram-mode event.
+- **Wired into `WaveformStore.save_imported_idf`** — every Thor event ingested via `/db/import/idf_file` now gets a `bw_report` block in its sidecar in addition to the existing `extensions.idf_report` (the raw parsed Thor payload).  Falls back gracefully (PDF renders from DB-only fields) if the adapter raises — logged as a warning rather than failing the ingest.
+
+### Companion releases
+
+- **Terra-View v0.13.0** ships in parallel — closes Phase 1 of the SFM integration.  The shared event-detail modal now renders the SFM event story (Chart.js waveform/histogram chart, inline PDF preview, `.TXT` download, FT/reviewer/notes review form) without operators needing to bounce to the standalone SFM webapp on port 8200.  Uses only existing seismo-relay endpoints — no API changes here, just better consumption.
+
+### Migration / Operations
+
+No DB migration needed.  Existing Thor events already in the store don't automatically pick up the new `bw_report` block — they'd need a re-ingest (post the IDF binary + paired `.TXT` back to `/db/import/idf_file`) for the adapter to run.  Alternatively, run `scripts/backfill_sidecars.py --reparse-txt` after a small adapter change (the script currently only re-runs the BW ASCII parser; extending it to handle Thor would be a small follow-up).
+
+```bash
+cd /home/serversdown/terra-view
+docker compose build sfm && docker compose up -d sfm
+```
+
+The bumped `TOOL_VERSION = "0.21.0"` in `minimateplus/event_file_io.py` means any subsequent `backfill_sidecars.py --force` pass will re-write sidecars with the new version stamp; that's expected and harmless.
+
+---
+
+## v0.20.0 — 2026-05-28
+
+The "PDF + parser polish" release.  Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events.  Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC.  Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.
+
+### Added (2026-05-28)
+
+- **Server-wide display timezone via `TZ` env var.**  Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod).  Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering.  DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one.  Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`).  Override per-deployment via the `TZ` line in `docker-compose.yml`.
+- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.**  BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72).  Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`.  Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag.  Flag rides through the sidecar's `bw_report` block.  Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table.  Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
+- **Per-channel ZC Freq surfaced in event modals.**  Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq.  Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table.  Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`).  Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
+- **`scripts/backfill_sidecars.py --reparse-txt` flag.**  Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events.  The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar.  Implies sidecar regeneration on every event (bypasses the sha/version skip).  No-op for events without a preserved .TXT (legacy ingests pre-2026-05-27 .TXT-preservation rollout).  Idempotent.  Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing.  Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
+
+### Fixed (2026-05-28)
+
+- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.**  The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (it was inlined into `gather_report_data` without the seconds-numeric counterpart making it onto the dataclass).  Every histogram PDF render raised `AttributeError → 500`.  Waveform PDFs were unaffected.  Fix: add the field, read it from the projection's existing `bw_report.histogram.interval_size_s` key.
+- **Histogram PDF geo channels now share a single nice-quantized y-axis.**  Previously each geo subplot auto-scaled independently — Tran, Vert, and Long all showed different per-channel maxes, so bar heights weren't directly comparable across channels.  The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005.  Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label.  MicL stays on its own auto-scale (different units).  Matches BW's chart styling.
+
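The shared geo y-axis step from the second "Fixed" bullet above could be computed along these lines. This is a sketch only: `geo_step_per_div` is a hypothetical name, the ladder is taken from the values listed in the entry (0.005, 0.01, 0.025, 0.05, ...) and assumed to repeat per decade, and five divisions are assumed because the old footer code divided by 5.

```python
LADDER_MILLI = (5, 10, 25)  # 0.005, 0.01, 0.025 in/s, repeated per decade


def geo_step_per_div(peak_ips: float, n_div: int = 5) -> float:
    """Smallest ladder step (in/s/div) such that n_div divisions cover peak_ips.

    Works in integer thousandths of an in/s so the ladder values stay
    exact multiples of the 0.005 in/s LSB.
    """
    needed_milli = peak_ips * 1000.0 / n_div
    decade = 1
    while True:
        for base in LADDER_MILLI:
            step_milli = base * decade
            if step_milli >= needed_milli:
                return step_milli / 1000.0
        decade *= 10


# Usage: one step for all three geo subplots, from the shared peak.
tran, vert, long_ = [0.01, 0.12], [0.2, 0.05], [0.03, 0.08]
shared_peak = max(max(tran), max(vert), max(long_))
step = geo_step_per_div(shared_peak)
y_max = step * 5  # the same upper limit would be applied to every geo subplot
```

Because the step comes from the shared peak rather than the first channel, the footer label and all three subplots stay consistent with each other.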
+### Docs (2026-05-28)
+
+- **Roadmap entry for a second undecoded histogram body sub-format.**  BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks.  Different from the existing `byte[5] != 0` roadmap entry (T190 / O121).  Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty.  Sample events captured in the roadmap entry for future RE work.
+
+### Migration / Operations
+
+- **Re-parse existing events to pick up the new parser fields.**  Run on whichever box hosts the live waveform store:
+  ```bash
+  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
+      --reparse-txt --skip-hdf5 --dry-run -v | tail
+  # Looks reasonable?  Run for real:
+  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
+      --reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
+  ```
+  Idempotent; safe to re-run.  Only touches sidecars on disk — no DB writes.
+- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks.  Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.
+
+### Pre-v0.20.0 work (rolled into this release)
+
+The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.
+
+#### Fixed
+
+- **bw_ascii_report parser now handles `OORANGE` saturation marker.**  BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale.  Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events.  Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE).  New behavior:
+   - Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
+   - Peak Vector Sum: substitute `sqrt(3) * geo_range_ips` (the theoretical max when all 3 channels are simultaneously at full-scale) + `peak_vector_sum_saturated` flag
+   - MicL PSPL: substitute 140 dB(L) (conservative NL-43 max) + `pspl_saturated` flag
+   - Saturation flags are propagated into the sidecar's `bw_report` block for downstream UI rendering (`> 10 in/s` or similar)
+   - Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
+- **bw_ascii_report parser handles `Peak Vector Sum TimeSum` typo'd label.**  Real BW output uses this misspelled label (Sum appended twice instead of "Peak Vector Sum Time").  Now accepted as an alias.  Confirmed against all three OORANGE example files — every one has the typo.
+
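The `OORANGE` substitution rules listed above can be illustrated with a small sketch. The function names here are hypothetical and do not come from `bw_ascii_report`; the field format (`"0.045 in/s"`, value first, unit after) is also an assumption. Only the substitute values (channel range, `sqrt(3)` times range, 140 dB(L)) come from the entry.

```python
import math
from typing import Optional, Tuple

OORANGE = "OORANGE"
MIC_SATURATED_DBL = 140.0  # substitute value named in the changelog entry


def parse_peak(raw: str, substitute: float) -> Tuple[Optional[float], bool]:
    """Parse one peak field and return (value, saturated).

    A saturated field yields the substitute value plus a True flag;
    an unparseable field yields (None, False).
    """
    text = raw.strip()
    if text.upper().startswith(OORANGE):
        return substitute, True
    try:
        return float(text.split()[0]), False
    except (ValueError, IndexError):
        return None, False


def parse_channel_ppv(raw: str, geo_range_ips: float):
    # Per-channel PPV: the channel range is a conservative lower bound.
    return parse_peak(raw, geo_range_ips)


def parse_pvs(raw: str, geo_range_ips: float):
    # Peak Vector Sum: all three channels at full scale simultaneously.
    return parse_peak(raw, math.sqrt(3) * geo_range_ips)


def parse_mic_pspl(raw: str):
    return parse_peak(raw, MIC_SATURATED_DBL)
```

Returning the flag alongside the value keeps the substitute distinguishable from a genuine reading, which is what lets a UI render something like `> 10 in/s`.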
+#### Added
+
+- **Histogram per-interval aggregation in `waveform.json`.**  Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output).  When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array.  `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings).  Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present.  Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
+- **`bw_ascii_report` parser now captures histogram-specific fields.**  Previously the parser dropped these fields silently (Roadmap item closed):
+   - `Histogram Start Time` / `Histogram Start Date` (combined into `histogram_start: datetime`)
+   - `Histogram Stop Time` / `Histogram Stop Date` (combined into `histogram_stop: datetime`)
+   - `Number of Intervals` (`histogram_n_intervals: int`)
+   - `Interval Size` ("1 minute" string + parsed seconds: `histogram_interval_size_str`, `histogram_interval_size_s`)
+   - `<Channel> Peak Time` + `<Channel> Peak Date` for histogram events (combined into `channel_peak_when: dict`; waveforms continue to use `time_of_peak_s` relative)
+   - `Peak Vector Sum Date` (combined with PVS Time into `peak_vector_sum_when: datetime`; clears the previous bogus `peak_vector_sum_time_s` parse that interpreted "22:33:52" as 22.0 seconds)
+   - All new fields land in the sidecar's `bw_report.histogram` block via `_bw_report_to_dict`.  Tested against synthetic K558LLB7.V20H-shaped input.
+- **Raw BW ASCII report (.TXT) preservation.**  `save_imported_bw` now writes the paired `_ASCII.TXT` to `<store>/<serial>/<filename>_ASCII.TXT` alongside the binary at ingest time.  Previously the .TXT was parsed into the sidecar's `bw_report` projection and then discarded — meaning parser bug fixes couldn't be applied retroactively without re-forwarding from the watcher PC.  Now the raw .TXT lives in the waveform store permanently (~15 KB per event; ~210 MB total for a 14k-event store; negligible).  Sidecar's `source.txt_filename` field records the saved path; backfill_sidecars preserves it across regens.  New `GET /db/events/{id}/ascii_report.txt` endpoint serves the raw .TXT for any event ingested after this change.  Events ingested before today still return 404 from that endpoint until re-forwarded.  Architectural rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF that those tools produced are no longer available — the binary + .TXT (created by BW ACH itself) are our authoritative source for everything going forward.
+
+- **Event Report PDF generation** — `GET /db/events/{id}/report.pdf` returns a single-page letter-portrait PDF for any event with waveform data on disk.  Covers every field a Blastware Event Report includes: header metadata (date/time, trigger source, range, sample rate, project/client/operator/location, serial+firmware, battery, calibration, file name), microphone block (PSPL in dB(L) + psi, ZC freq, channel test), per-channel stats table (rows differ for waveform vs histogram), Peak Vector Sum, and the 4-channel plot.  Iterated against real Blastware reference PDFs (uploaded to `example-events/pdfsnstuff/`):
+   - **Waveform layout**: header shows Date/Time, Trigger Source, Range, Sample Rate; stats table has PPV / ZC Freq / Time (Rel. to Trig) / Peak Accel / Peak Disp / Sensor Check; bottom plot is 4-channel line waveform (MicL top → Tran bottom), shared time axis in seconds, dashed trigger line + triangle marker at t=0, symmetric Y on geo channels, zero-anchored on mic, "0.0" baseline label on right per BW convention; footer shows `Time X sec/div   Amplitude Geo: Y in/s/div   Mic: 0.001 psi(L)/div` and the trigger window `▶━━◀` marker.  USBM RI8507/OSMRE compliance chart placeholder upper-right.
+   - **Histogram layout**: header shows Start / Finish / Intervals At Size / Range / Sample Rate (no Trigger Source — histograms aren't triggered); NO USBM chart; stats table has PPV / ZC Freq / Date / Time / Sensor Check; bottom plot is per-interval bar chart, Y-axis 0-to-peak (never negative), 0.0 baseline at the bottom; footer shows `Time INTERVAL_SIZE /div   Amplitude Geo: Y in/s/div   Mic: 0.001 psi(L)/div`.
+   - Backed by matplotlib (vector PDF, no headless-browser dep).  Adds matplotlib>=3.8 to deps.
+   - **Known gap**: histogram codec returns per-block granularity (~200 bars for a 4-interval event) instead of BW's per-interval aggregation.  Visual difference vs BW's 4-bar display.  XML-driven data source (parsing the structured `_XML.XML` files BW also exports) is the planned fix; that route also resolves the bw_ascii_report PPV-miss bug.
+   - **Stubbed**: USBM RI8507 / OSMRE compliance chart curves (separate work item; requires coding the regulatory piecewise functions).
+- **"Download PDF" button** in the event modal's footer — triggers the new endpoint; opens in a new tab so the browser handles save-or-display + surfaces any 404 / server errors visibly.
+
+- **SFM webapp now opens to Database view by default** and the History table is fully interactive.  Click any column header to sort ascending / descending (timestamp, serial, per-channel PPV, PVS, mic dB(L), project, client, record type, key — all sortable).  Click any event row to open the event modal, which now renders a **4-channel waveform plot inline** (MicL / Long / Vert / Tran stacked, Instantel-printout order) alongside the existing sidecar review fields.  Headers are sticky so the columns stay visible while scrolling long event lists.  No more "where is the viewer" — pick a unit from the filter dropdown, scan the table, click the event, see the waveform.
+- **Stored-event browser** — new standalone HTML page at `GET /events` (`sfm/event_browser.html`).  Pick a serial from the unit dropdown, scroll through that unit's events (newest-first), click any event to render its decoded waveform via the existing `/db/events/{id}/waveform.json` endpoint.  Dark-themed Chart.js viewer, channels stacked vertically (MicL / Long / Vert / Tran — Instantel printout order, designed PDF-export-ready), trigger line at t=0, peak labels, search/filter, false-trigger flag honored.  Companion to the existing live-device viewer at `/waveform`; the two routes are now clearly delineated in their docstrings.  The webapp's inline plot at `/` is the primary path; `/events` remains a useful diagnostic when you want just a viewer.
+- **Histogram body codec — uint8 peak count fix.**  Per-channel peak fields at `block[6]/[10]/[14]/[18]` are `uint8`, not `uint16 LE` spanning `block[6:8]` etc.  The original interpretation was byte-exact on the N844 fixture corpus only because every annotation byte (`block[7]/[11]/[15]/[19]`) in those fixtures was zero.  On non-N844 events with non-zero annotation bytes (observed across BE9558 Tran-drift and BE18003 Histogram+Continuous units), the old interpretation produced peaks up to 268 in/s per channel and 35× inflated PVS sums when first deployed to prod (rolled back same day; properly fixed in this release).  Cross-correlated against BW's per-interval ASCII export on K558 / T003 / N599 / N844 corpora — 100% byte-exact on T/V/L, 99%+ on M (sub-precision rounding).  Annotation byte preserved on each record as `record["annotations"]` for future RE.  Verified against ~3,500 blocks across 5 in-repo fixtures + a synthetic K558 interval-12 regression block.
+- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`.  Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file.  BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
+- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars.  Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED.  Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
+
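The max-per-group aggregation from the "Histogram per-interval aggregation" entry in this section might look like the sketch below. `aggregate_max_per_interval` is a hypothetical name, and the way samples are split when they do not divide evenly into `n_intervals` is an assumption; the entry only states that codec samples are grouped into N intervals and reduced with a max.

```python
from typing import List, Sequence


def aggregate_max_per_interval(
    samples: Sequence[float], n_intervals: int
) -> List[float]:
    """Reduce raw codec samples to one bar per interval via max-per-group.

    Returns the samples unchanged when n_intervals is missing or zero,
    mirroring the defensive no-op described for older sidecars.
    """
    if n_intervals <= 0 or not samples:
        return list(samples)
    n = len(samples)
    bars: List[float] = []
    for i in range(n_intervals):
        # Integer boundaries spread any remainder across the groups.
        start = i * n // n_intervals
        stop = (i + 1) * n // n_intervals
        group = samples[start:stop]
        bars.append(max(group) if group else 0.0)
    return bars
```

For an event with roughly 200 raw bars and 4 reported intervals, this yields 4 bars, each the peak of its share of the raw output.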
+#### Fixed
+
+- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.**  Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict).  Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
+- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.**  The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts).  Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; this hit prod on 2026-05-22 and was rolled back the same day.
+- **Thor IDF files no longer attempted as BW events in backfill.**  `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
+
+#### Docs
+
+- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes.  Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
+- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
+- **`docs/histogram_codec_re_status.md`** updated with the uint8 retraction and the annotation-byte status.
+- Three known issues recorded in the Roadmap that were discovered during prod validation: (1) `bw_ascii_report` parser misses PPV / `vector_sum` on some `.TXT` formats (5 events on prod); (2) NULL-timestamp duplicate-row dedup needed (2 events on prod); (3) histogram body sub-format with `byte[5] != 0` not yet decoded (~3 events on prod with empty `.h5` plots).
+
+---
+
 ## v0.19.0 — 2026-05-20

 The "device-family separation" release.  Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics.
@@ -2,12 +2,112 @@

 Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
 managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
-(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
+(Sierra Wireless RV50 / RV55). Current version: **v0.21.0**.

 When new information about the protocol is discovered, please update `instantel_protocol_reference.md` with the findings in addition to this document.

 ---

+## Architecture: three-tier conceptual model
+
+seismo-relay is a **suite of cooperating components**, not a single app.
+The three tiers below are the canonical mental model — the current
+directory layout doesn't fully reflect them yet (some of what is
+conceptually SDM lives under `sfm/` today), but new code should be
+placed and named according to this model.
+
+### 1. SFM — the device-side (active connection to physical units)
+
+Replaces Blastware's *talk-to-the-meter* role.  Lives where a connection
+to a physical seismograph is open.
+
+In scope:
+- `minimateplus/{transport,framing,protocol,client}.py` — wire protocol
+- `seismo_lab.py` — diagnostic GUI (a thick client for SFM)
+- The `/device/*` HTTP endpoints in `sfm/server.py` —
+  `/device/info`, `/device/events`, `/device/monitor/*`, `/device/call_home`,
+  etc.  Anything that opens a connection at the moment of the request.
+- Future: a Thor / Micromate live client (mirror `minimateplus/`)
+- Future: a control surface Terra-View can launch into — see the
+  README's Roadmap.
+
+Does NOT own a database.  Outputs `Event` objects.  Has a "spun up when
+needed" runtime profile rather than "always on".
+
+### 2. SDM — the data-side (storage, ingest, and serving)
+
+The new name for the receiving-and-storing role.  Originally called SFM
+because the FastAPI service started life as a thin device proxy, but
+the actual role has migrated heavily toward data management.  **For now
+the directory remains `sfm/`** — renaming requires touching ~30-50
+files in seismo-relay + ~10-15 in terra-view + a Docker volume
+migration; deferred until the codebase is quiet enough to do it as a
+clean refactor.
+
+In scope:
+- `sfm/database.py` (`SeismoDb`)
+- `sfm/waveform_store.py`, `sfm/event_hdf5.py`
+- The `/db/*` HTTP endpoints — `events`, `units`, `monitor_log`,
+  `sessions`, `false_trigger` mutations
+- The `/db/import/*` ingest endpoints — `blastware_file` (series3),
+  `idf_file` (series4); anything that receives events FROM somewhere
+- `scripts/backfill_sidecars.py`, `scripts/check_bw_report_preservation.py`,
+  and similar data-maintenance tools
+- The `.sfm.json` sidecars and `.h5` files in the waveform store
+- The shape that Terra-View consumes (Terra-View should never need to
+  reach into SFM/device-side endpoints to populate its UI)
+
+Always-on, scaled for storage/serving, has the DB and waveform store.
+
+### 3. Codec library — pure data interpretation (used by both sides)
+
+Neither SFM nor SDM — a shared library both depend on.
+
+In scope:
+- `minimateplus/{waveform_codec,histogram_codec,event_file_io,bw_ascii_report,blastware_file}.py`
+- `micromate/{idf_ascii_report,idf_file}.py`
+
+These modules take bytes (off the wire on the SFM side, or from a
+forwarded file on the SDM side) and return `Event` objects.  They
+should not import from `sfm/`, must not touch a DB, and have no I/O
+beyond reading files passed as arguments.  Keep them pure — both
+tiers can then depend on them without circularity.
+
+#### Thor IDF binary codec (2026-05-28)
+
+`micromate/idf_file.read_idf_file()` decodes both Thor IDFW
+(waveform) and IDFH (histogram) binaries.
+
+- **IDFW** reuses `decode_waveform_v2()` on the body at fixed file
+  offset `0x0f1f`.  Sample fidelity is 87–99% byte-exact on quiet
+  events; loud events hit the BW codec's known walker-stops-early
+  limitation.
+- **IDFH** has its own segment-based decoder: `[len_be][0a 00 00 00]
+  [00 NN][05 3f]` + N × 72-byte interval records (4 × 16-byte
+  per-channel min/max/halfp).  All 859 Thor IDFH corpus files
+  decode (181,071 intervals); peak matches sidecar within ~1.8%
+  (ADC quantization).
+
+The two outlier `BE9439_*` files in the Thor example corpus are
+actually Series III Blastware binaries that share the `.IDFW`/`.IDFH`
+filename convention by accident.  `read_idf_file()` detects them by
+their BW `STRT` signature and raises `NotImplementedError`, pointing
+callers at `read_blastware_file()`.  See
+`docs/idf_protocol_reference.md` for full field layouts.
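The v0.21.1 changelog entry describes replacing the fixed `0x0f1f` body offset with a scan over every `00 02 00` magic position past `0x0E00`, trial-decoding each candidate and keeping the one that yields the most samples. A minimal sketch of that selection logic follows. It is not the code in `micromate/idf_file.py`: `find_body_offset`, `trial_decode`, and the toy decoder are hypothetical stand-ins, and treating `0x0E00` itself as a valid start position is an assumption.

```python
from typing import Callable, Optional

MAGIC = b"\x00\x02\x00"   # body magic quoted in the v0.21.1 changelog
SCAN_START = 0x0E00       # changelog: scan magic positions past 0x0E00


def find_body_offset(buf: bytes, trial_decode: Callable[[bytes], int]) -> Optional[int]:
    """Return the magic offset whose trial decode yields the most samples.

    `trial_decode` stands in for the real codec: it receives the bytes from
    a candidate offset onward and returns how many samples it decoded.
    """
    best_offset, best_count = None, 0
    pos = buf.find(MAGIC, SCAN_START)
    while pos != -1:
        count = trial_decode(buf[pos:])
        if count > best_count:
            best_offset, best_count = pos, count
        pos = buf.find(MAGIC, pos + 1)
    return best_offset


def toy_decode(body: bytes) -> int:
    """Toy decoder: pretend the byte after the magic is the sample count."""
    return body[3] if len(body) > 3 else 0


# Fake file: a coincidental magic at 0x0f1f (decodes to nothing) and the
# real body at 0x1033 (decodes to 40 samples).
buf = bytearray(0x1100)
buf[0x0F1F:0x0F22] = MAGIC
buf[0x1033:0x1037] = MAGIC + bytes([40])
print(hex(find_body_offset(bytes(buf), toy_decode)))
```

Ranking candidates by decoded sample count is what lets the coincidental match at `0x0f1f` lose to the real body, which is the failure mode the changelog describes.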
+
+### Practical consequences
+
+When deciding where new code goes, ask:
+- *Does it need a connection to a device?* → SFM
+- *Does it operate on stored events / sidecars / DB rows?* → SDM
+- *Does it interpret bytes into structured data, with no I/O of its own?* → codec lib
+
+Terra-View is downstream of SDM for data, and (per the roadmap) will
+eventually call SFM's device-control endpoints to provide a
+"connect to unit" experience.
+
+---
+
 ## Project layout

 ```
@@ -2,10 +2,21 @@ FROM python:3.11-slim

 WORKDIR /app

+# tzdata is required for the TZ env var to take effect (python:slim
+# omits the timezone database).  Without it, datetime.now() / logging
+# / matplotlib all stay in UTC regardless of TZ.  Default zone gets
+# set further down via ENV; users override per-deployment via the
+# `TZ` env var in docker-compose.
 RUN apt-get update && \
-    apt-get install -y --no-install-recommends curl && \
+    apt-get install -y --no-install-recommends curl tzdata && \
    rm -rf /var/lib/apt/lists/*

+# Default display timezone — applied to server logs, datetime.now(),
+# matplotlib rendered timestamps, and any naïve-vs-aware datetime
+# conversions in the PDF renderer.  Override via TZ env var in
+# docker-compose; storage in the DB is always UTC regardless.
+ENV TZ=America/New_York
+
 COPY pyproject.toml requirements.txt ./
 COPY minimateplus ./minimateplus
 COPY micromate    ./micromate
@@ -1,4 +1,4 @@
-# seismo-relay  `v0.19.0`
+# seismo-relay  `v0.21.0`

 A ground-up replacement for **Blastware** — Instantel's aging Windows-only
 software for managing seismographs.  Supports both the **MiniMate Plus
@@ -35,6 +35,25 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
 > and storage layer dispatch deterministically instead of sniffing
 > filenames.  Self-applying migration backfills existing rows from the
 > binary filename extension.
+> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
+> started in v0.17.x: histogram layouts render correctly against BW
+> reference PDFs, the ASCII parser handles real-world edge cases
+> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
+> Freq is surfaced in both modals (event browser + main webapp).
+> Adds a server-wide `TZ` env var so operator-visible timestamps
+> render in local time instead of UTC.  New
+> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
+> applied retroactively to existing events without re-forwarding,
+> using the `.TXT` files preserved at ingest time.
+> **v0.21.0 (2026-05-29)** is the Thor / Series IV decoder release —
+> `micromate/idf_file.read_idf_file()` now decodes both IDFW
+> (waveform) and IDFH (histogram) binaries (87–99% sample fidelity
+> on quiet IDFW events; all 859 IDFH corpus files decode cleanly).
+> A new `micromate/idf_to_bw_report.py` adapter projects parsed
+> Thor reports into the BW-shaped sidecar block, so Thor events
+> flow through the existing Event Report PDF pipeline without a
+> separate renderer.  Terra-View v0.13.0 ships in parallel and
+> closes Phase 1 of the SFM integration — see its CHANGELOG.
 > See [CHANGELOG.md](CHANGELOG.md) for full version history.

 ---
@@ -58,7 +77,8 @@ seismo-relay/
 ├── micromate/                 ← Series IV (Micromate / Thor) client library (NEW v0.19)
 │   ├── models.py              ←   IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
 │   ├── idf_ascii_report.py    ←   Parse Thor .IDFW.txt / .IDFH.txt event sidecars
-│   └── idf_file.py            ←   Stub for the .IDFW / .IDFH binary codec (reverse-engineering pending)
+│   ├── idf_file.py            ←   Binary codec for .IDFW + .IDFH (v0.21.0+)
+│   └── idf_to_bw_report.py    ←   Adapter projecting Thor IDF into the BW report shape (v0.21.0+)
 │
 ├── sfm/                       ← SFM REST API server (FastAPI, port 8200)
 │   ├── server.py              ←   Live device endpoints + DB query + ingest endpoints + caching
@@ -415,7 +435,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
 - [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
 - [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
 - [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
- [ ] Binary `.IDFW` / `.IDFH` codec — pending (see Roadmap + [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md))
+- [x] Binary `.IDFW` / `.IDFH` codec — ✅ v0.21.0.  IDFW reuses `decode_waveform_v2()` on the body at offset `0x0f1f` (87–99% sample fidelity on quiet events); IDFH has a dedicated segment-based decoder (all 859 corpus files decode, 181,071 intervals total).  See `micromate/idf_file.py` + `docs/idf_protocol_reference.md`.
 - [ ] Live-device protocol — pending codec

 **Data persistence:**
@@ -459,10 +479,76 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.

 ## Roadmap (Future)

+### Strategic direction — where this is going
+
+seismo-relay is being built as a **suite of cooperating components**
+that together replace and improve on Blastware's role.  Three logical
+tiers:
+
+1. **SFM** (device-side) — owns the active connection to a physical
+   unit.  Today: `minimateplus/`, `/device/*` HTTP endpoints,
+   `seismo_lab.py`.  Future: live Thor / Micromate support.
+2. **SDM** (data-side) — owns the database, waveform store, ingest
+   pipelines, and the read-API that Terra-View consumes.  Today this
+   code lives under `sfm/` for historical reasons; the role has
+   migrated and the eventual rename is on the long-tail cleanup list.
+3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
+   `bw_ascii_report.py`, `micromate/idf_*.py`.  Used by both SFM and
+   SDM, depends on neither.
+
+Terra-View is downstream of SDM for fleet listings, event detail, etc.
+The long-term vision adds a **second link** from Terra-View → SFM for
+direct device interaction (see below).
+
+The codec work in this repo isn't trying to replace BW's network
+layer — BW's ACH file forwarding and Thor's IDF call-home are
+battle-tested.  The value is in the receiving and processing side: turn
+the stream of binary+ASCII pairs into something users can search,
+filter, alert on, and report from.
+
+### Terra-View ↔ SFM device control (the long-term vision)
+
+Today Terra-View only reads from SDM (event listings, dashboards,
+project reports).  When a unit goes missing — operator notices in the
+Terra-View dashboard — there's no way to *do* anything from the UI.
+The path of least resistance is to RDP into a Windows box and open
+Blastware, which defeats the purpose of having Terra-View.
+
+Target experience:
+- Operator notices a unit in Terra-View dashboard hasn't called in.
+- Clicks unit detail → "Connect to Device" button.
+- Terra-View opens an embedded view (modal or side-panel) that talks
+  to SFM's `/device/*` endpoints over the network.
+- Live view: device clock, battery, memory, current monitor status.
+- Actions: start/stop monitoring, push compliance config changes, pull
+  fresh events, run a sensor self-check, change call-home settings.
+- Audit log: every connect / action recorded in SDM for the unit
+  history.
+
+Implementation steps (concrete):
+- [ ] **SFM authentication & authorization layer.**  Today `/device/*`
+      endpoints are unauthenticated — anyone on the network can call
+      them.  Need at minimum a token-based auth, ideally with a "who
+      can connect to which units" mapping.  Hard prerequisite for
+      letting Terra-View users into the control surface.
+- [ ] **Terra-View "Connect to Device" entry point** on the unit
+      detail page.  Renders only when unit has connection info on file
+      and the user has permission.
+- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
+      `seismo_lab.py`'s Bridge tab, but in the browser.  Polls SFM's
+      `/device/monitor/status` on an interval; sends start/stop via
+      `/device/monitor/{start,stop}`.
+- [ ] **Action history** — every connect / push / action call records
+      a row in `unit_history`, viewable on the unit detail page.
+- [ ] **Series IV live-device support in SFM** — currently `/device/*`
+      only supports MiniMate Plus.  Blocks "Connect to Device" for
+      Thor units until done.  Depends on Thor wire-protocol capture
+      and a `micromate/` parallel of the `minimateplus/` modules.
+
 ### High-impact (unblocks product features)

 - [ ] **Series III waveform body codec reverse-engineering.**  The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`).  Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open.  Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise.  Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
- [ ] **Series IV (Thor IDF) binary codec reverse-engineering.**  `.IDFH` / `.IDFW` files are currently stored opaquely by `WaveformStore.save_imported_idf`, with all metadata sourced from the paired `.txt` sidecar.  This works because thor-watcher forwards both files together, but operators who haven't enabled Thor's TXT exporter get rows with NULL peaks.  Cracking the binary closes that gap and unlocks waveform display.  Starting-point reference at [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) — two observed file signatures (1,012 newer-firmware files + 2 old files whose layout matches the Series III STRT-record format), suggested first-session plan (~2-4 hrs), 1,014 paired binary+txt files available as ground truth in `thor-watcher/example-data/`.  Code seam ready at `micromate/idf_file.py`.
+- [x] **Series IV (Thor IDF) binary codec reverse-engineering.** ✅ v0.21.0 — `micromate/idf_file.read_idf_file()` decodes both IDFW (waveform body at offset `0x0f1f`, reusing `decode_waveform_v2()`; 87–99% sample fidelity on quiet events) and IDFH (dedicated segment-based decoder: all 859 corpus files decode, 181,071 intervals, peaks within ~1.8% of sidecar values).  `WaveformStore.save_imported_idf` now also projects parsed Thor data into a `bw_report` block via `micromate/idf_to_bw_report.py` so Thor events render in the existing Event Report PDF pipeline without a separate renderer.
 - [ ] **In-app waveform viewer accuracy.**  Depends on Series III codec decode.  Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples.  Series IV waveforms come online when the IDF codec lands.
 - [ ] **Series IV live-device support.**  Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
 - [ ] **Terra-view integration** — seismo-relay router, unit detail page, VISON-style event listing.
@@ -470,9 +556,10 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.

 ### BW ASCII report parser enhancements (built in v0.16.0)

- [ ] **Histogram-specific structural fields.**  Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds).
+- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value.  Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag.  All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
+- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now.  Land in the sidecar's `bw_report.histogram` block.
 - [ ] **Histogram interval bin-table parsing.**  Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed.  Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
- [ ] **`>100 Hz` value parsing.**  Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information).
+- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag.  PDF + both modals render `>100 Hz` instead of `—`.
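The `OORANGE` and `>100 Hz` handling described in the items above can be pictured roughly as follows. This is a sketch under stated assumptions, not the parser's actual code: `ParsedValue` and `parse_report_number` are hypothetical names, and only the two flags (`ppv_saturated`, `zc_freq_above_range`) and the `geo_range_ips` fallback come from the changelog text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedValue:
    value: Optional[float]
    saturated: bool = False      # mirrors the `ppv_saturated` flag
    above_range: bool = False    # mirrors the `zc_freq_above_range` flag


def parse_report_number(text: str, geo_range_ips: Optional[float] = None) -> ParsedValue:
    """Parse one numeric field, keeping range markers instead of dropping them."""
    token = text.strip()
    if token.upper().startswith("OORANGE"):
        # Saturated channel: report the configured range as a lower bound.
        return ParsedValue(geo_range_ips, saturated=True)
    above = token.startswith(">")
    if above:
        token = token[1:]
    try:
        # Take the leading number and ignore a trailing unit such as "Hz".
        return ParsedValue(float(token.split()[0]), above_range=above)
    except (ValueError, IndexError):
        return ParsedValue(None)


print(parse_report_number(">100 Hz"))
print(parse_report_number("OORANGE", geo_range_ips=10.0))
print(parse_report_number("0.125 in/s"))
```

Carrying the flag alongside the number lets a renderer print `>100 Hz` while numeric consumers still get a usable value.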

 ### Ingestion gaps

@@ -498,3 +585,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
 - [ ] Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring).
 - [ ] Call Home — map time slots 3/4 offsets; confirm `modem_power_relay_enabled`.
 - [ ] RV55 DCD/DTR — newer RV55 firmware doesn't assert DCD by default; units don't resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred).
+- [ ] **NULL-timestamp duplicate-row dedup.**  A handful of events (2 known on prod as of 2026-05-22) have `events.timestamp IS NULL` because the codec couldn't extract a timestamp from the binary footer.  The `UNIQUE(serial, timestamp)` constraint doesn't fire on `NULL` (SQL treats `NULL`s as distinct for uniqueness, so two `NULL` timestamps never collide), so every `--force` backfill INSERTs a new row instead of UPSERTing the existing one.  Cleanup: a one-shot SQL query that keeps only the newest row per `(serial, blastware_filename)` and deletes the rest.  Longer-term: extend the unique key to `(serial, COALESCE(timestamp, blastware_filename))` or reject inserts with NULL timestamp.
+- [ ] **Histogram body sub-format with `byte[5] != 0`.**  ~3 events on prod (including `T190LD5Q.LD0H` and `O121L4L1.GU0H`) use a histogram body the walker doesn't recognize: the first block has `byte[5] = 0x01` or `0x07` instead of `0x00`, and the entire body lacks the `1e 0a 00 00` tail signature.  The codec returns 0 valid blocks; their DB PVS comes from the bw_report ASCII overlay (which BW computed from the same binary, so the DB columns are correct).  Only the `.h5` waveform plot is empty.  Cracking the sub-format would unlock the plot.  Needs binary+ASCII pairs from a few `byte[5]!=0` events; same RE approach as the K558 case.
+- [ ] **Histogram body sub-format with `byte[5] == 0x00` but undecodable.**  Observed 2026-05-28 on BE17353 (S353) events: `S353L4H2.FZ0H`, `S353L4H2.P00H`, `S353L4H3.7O0H`, `S353L4H3.E10H`.  Body starts `00 00 00 01 0a 00 XX 00 ...` which LOOKS like a valid histogram block header (marker 0x000a at byte[4:6] ✓, byte[5]=0x00 normal-format ✓), but the walker finds zero data blocks across the whole body.  Likely an extra header before the block stream OR a different tail signature than `1e 0a 00 00`.  Smaller body lengths (1900-2100 bytes) suggest these may be short-recording histogram variants.  Same operational impact as the byte[5]!=0 case: event ingests cleanly, DB peaks correct via bw_report overlay, only the chart is empty.  Worth dumping a hex view of one body to diagnose.
+- [ ] **Sensor-check waveform extraction from the BW binary.**  BW's Event Report PDFs include a narrow panel on the right side of the waveform plot showing each channel's response to the sensor self-check signal (a damped sinusoid for geo, sawtooth-at-test-freq for mic).  Our parser captures the test RESULTS (`test_freq_hz`, `test_ratio`, `test_amplitude_mv`, `test_results` pass/fail) and the PDF + modal display them as text — but BW's per-sample sensor-check waveform isn't accessible to us today.  Two paths to add it:  (a) RE the binary to find where the sensor-check samples are stored — could be a section before STRT, after the footer, or in a separate sub-record; protocol reference doesn't currently mention it.  (b) If samples aren't in the binary, synthesize a representative waveform from the test parameters (damped sinusoid at `test_freq_hz` with damping from `test_ratio`).  Path (a) is the honest answer; path (b) is decorative.  Until either lands, the text-only sensor-check display in the report is fine.
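The NULL-timestamp item above can be reproduced in a few lines. The sketch below uses an in-memory SQLite database with a simplified, assumed schema: the `id` column, the placeholder serial and filename, and the use of the highest `id` as "newest" are illustrative choices, not the real `events` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        serial TEXT NOT NULL,
        timestamp TEXT,
        blastware_filename TEXT NOT NULL,
        UNIQUE (serial, timestamp)
    )
    """
)

# Three backfills of the same event with a NULL timestamp: the UNIQUE
# constraint never fires, so each INSERT adds another row.
for _ in range(3):
    conn.execute(
        "INSERT INTO events (serial, timestamp, blastware_filename) VALUES (?, NULL, ?)",
        ("UNIT1", "EVENT.0W"),
    )
before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# One-shot cleanup: keep the newest row (highest id) per (serial, filename).
conn.execute(
    """
    DELETE FROM events
    WHERE timestamp IS NULL
      AND id NOT IN (
          SELECT MAX(id) FROM events
          WHERE timestamp IS NULL
          GROUP BY serial, blastware_filename
      )
    """
)
after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(before, after)
```

Restricting the `DELETE` to `timestamp IS NULL` rows keeps the cleanup from touching events that the unique constraint already protects.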
@@ -0,0 +1,65 @@
+"""Run read_idf_file across the corpus and report per-channel accuracy vs sidecars."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def sidecar_path(idfw: Path) -> Path:
+    return idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    # rglob("*.IDFW") already excludes .CDB files, so no extra filter is needed
+    files = sorted(root.rglob("*.IDFW"))
+    GEO_LSB = 0.0003
+
+    n_ok = n_skip = 0
+    overall = {"Tran": [], "Vert": [], "Long": []}
+
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            n_skip += 1
+            continue
+        sc_path = sidecar_path(f)
+        if not sc_path.exists():
+            n_skip += 1
+            continue
+        try:
+            sc = load_sidecar_samples(sc_path)
+        except Exception:
+            n_skip += 1
+            continue
+
+        per_file = {}
+        for ch in ("Tran", "Vert", "Long"):
+            sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+            dec = res.samples.get(ch, [])
+            n = min(len(sc_counts), len(dec))
+            if n == 0:
+                per_file[ch] = 0.0
+                continue
+            exact = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+            pct = 100.0 * exact / n
+            per_file[ch] = pct
+            overall[ch].append(pct)
+        n_ok += 1
+
+    print(f"Processed {n_ok} files (skipped {n_skip})")
+    print("Per-channel exact-match % (mean / min / max):")
+    for ch, vals in overall.items():
+        if vals:
+            avg = sum(vals) / len(vals)
+            print(f"  {ch}: mean={avg:.2f}%  min={min(vals):.2f}%  max={max(vals):.2f}%  n={len(vals)}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,49 @@
+"""Find where decoded-vs-sidecar diverges for each channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Compare only the overlapping range; the decoder can return more
+        # or fewer samples than the sidecar lists.
+        n = min(len(dec), len(sc_counts))
+        # Index of the first decoded-vs-sidecar mismatch, if any
+        first_diff = next((i for i in range(n) if dec[i] != sc_counts[i]), None)
+        if first_diff is None:
+            print(f"{ch}: NO MISMATCHES")
+            continue
+        print(f"{ch}: first diff at idx {first_diff}")
+        # Show 3 samples before through 7 after the first mismatch
+        for i in range(max(0, first_diff - 3), min(n, first_diff + 8)):
+            mark = "  " if dec[i] == sc_counts[i] else "**"
+            print(f"  {mark} idx {i:4d}: sc={sc_counts[i]:6d}  dec={dec[i]:6d}  diff={dec[i]-sc_counts[i]:+d}")
+        # Count mismatches and track the longest run of consecutive matches
+        cum_match_run = 0
+        max_match_run = 0
+        diff_count = 0
+        for i in range(n):
+            if dec[i] == sc_counts[i]:
+                cum_match_run += 1
+                max_match_run = max(max_match_run, cum_match_run)
+            else:
+                cum_match_run = 0
+                diff_count += 1
+        print(f"  total mismatches: {diff_count}/{n}, longest run of matches: {max_match_run}")
+        print()
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""End-to-end IDFH ingest verification."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== save_imported_idf (IDFH) ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  filename:      {rec['filename']}")
+        print(f"  filesize:      {rec['filesize']}")
+        print(f"  h5:            {rec['hdf5_filename']}")  # was None before v0.21.1 (histograms skipped the .h5 write)
+        print(f"  sidecar:       {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print()
+        # Inspect sidecar to confirm intervals were stashed
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        intervals = sc.get("extensions", {}).get("idf_intervals", [])
+        print(f"  sidecar intervals: {len(intervals)}")
+        if intervals:
+            print(f"  first interval:    {intervals[0]}")
+            print(f"  last interval:     {intervals[-1]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Verify the had_report=False path: ingest IDFW with no .txt."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=None,        # ← no .txt!
+        )
+        print("=== IDFW without .txt ingest ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        nT = len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0
+        nV = len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0
+        nL = len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0
+        nM = len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0
+        print(f"  raw_samples:   Tran={nT} Vert={nV} Long={nL} MicL={nM}")
+        if ev.peak_values:
+            print(f"  peak_values:   tran={ev.peak_values.tran} vert={ev.peak_values.vert} long={ev.peak_values.long}")
+        print(f"  h5 written:    {rec['hdf5_filename']}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,102 @@
+"""End-to-end Thor report PDF rendering.
+
+Ingests an IDFW + .txt via save_imported_idf, runs gather_report_data
+(faking a minimal DB row), and renders the PDF to disk.
+"""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    """Stand-in for SeismoDb.get_event(); the renderer only needs a few cols."""
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        # Verify sidecar has bw_report block
+        sc_path = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        print(f"  bw_report.available: {bw.get('available')}")
+        print(f"  bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+        print(f"  bw_report.mic.pspl_dbl: {bw.get('mic', {}).get('pspl_dbl')}")
+        print(f"  bw_report.histogram.n_intervals: {bw.get('histogram', {}).get('n_intervals')}")
+
+        # Build a DB-row-shaped dict from the Event for gather_report_data
+        import datetime
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM11719",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Waveform",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
+        print()
+        print("=== ReportData ===")
+        print(f"  event_id:           {rd.event_id}")
+        print(f"  serial:             {rd.serial}")
+        print(f"  record_type:        {rd.record_type}")
+        print(f"  event_datetime:     {rd.event_datetime_str}")
+        print(f"  trigger:            {rd.trigger_source}")
+        print(f"  geo_range:          {rd.geo_range_str}")
+        print(f"  sample_rate:        {rd.sample_rate_str}")
+        print(f"  firmware:           {rd.firmware}")
+        print(f"  calibration:        {rd.calibration_date} by {rd.calibration_by}")
+        print(f"  battery:            {rd.battery_volts}")
+        print(f"  PVS:                {rd.peak_vector_sum_ips} in/s at {rd.peak_vector_sum_time_s} sec")
+        print(f"  mic_pspl_dbl:       {rd.mic_pspl_dbl}")
+        print(f"  mic_zc_freq_hz:     {rd.mic_zc_freq_hz}")
+        print(f"  channel_stats:      {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} ToP={cs['time_of_peak_s']} Acc={cs['peak_accel_g']} Disp={cs['peak_disp_in']} Test={cs['sensor_check']}")
+
+        # Render the PDF
+        out_path = REPO / "analysis_idf" / "thor_report.pdf"
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,91 @@
+"""End-to-end Thor IDFH histogram report PDF rendering."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+import datetime
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    # Use the multi-interval IDFH (81 + trigger row)
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        hist = bw.get("histogram", {})
+        print(f"  bw_report.histogram.start:           {hist.get('start')}")
+        print(f"  bw_report.histogram.stop:            {hist.get('stop')}")
+        print(f"  bw_report.histogram.n_intervals:     {hist.get('n_intervals')}")
+        print(f"  bw_report.histogram.interval_size:   {hist.get('interval_size')}")
+        print(f"  bw_report.histogram.interval_size_s: {hist.get('interval_size_s')}")
+        print(f"  bw_report.peaks.tran.ppv_ips:        {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM13981",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Histogram",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="hist-1")
+
+        print()
+        print("=== ReportData (histogram) ===")
+        print(f"  is_histogram:           {rd.is_histogram}")
+        print(f"  histogram_start:        {rd.histogram_start_str}")
+        print(f"  histogram_stop:         {rd.histogram_stop_str}")
+        print(f"  histogram_n_intervals:  {rd.histogram_n_intervals}")
+        print(f"  histogram_interval_size:{rd.histogram_interval_size}")
+        print(f"  histogram_interval_times[:3]: {rd.histogram_interval_times[:3]}")
+        print(f"  histogram_interval_times[-2:]: {rd.histogram_interval_times[-2:]}")
+        print(f"  channel_stats: {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} peak_date={cs['peak_date']} peak_time={cs['peak_time']}")
+
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path = REPO / "analysis_idf" / "thor_report_idfh.pdf"
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,52 @@
+"""End-to-end ingest test: feed an IDFW + .txt to save_imported_idf in a tmp store."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+import shutil
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    txt  = idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== Save result ===")
+        print(f"  serial:    {rec['serial']}")
+        print(f"  filename:  {rec['filename']}")
+        print(f"  filesize:  {rec['filesize']}")
+        print(f"  h5:        {rec['hdf5_filename']}")
+        print(f"  sidecar:   {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  serial:        {ev.serial if hasattr(ev,'serial') else '(n/a)'}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        print(f"  raw_samples:   Tran={len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0}, Vert={len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0}, Long={len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0}, MicL={len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0}")
+        if ev.peak_values:
+            print(f"  peaks (txt):   Tran={ev.peak_values.tran} Vert={ev.peak_values.vert} Long={ev.peak_values.long}")
+        print()
+
+        # Verify the h5 file actually got written
+        h5path = Path(td) / "UM11719" / f"{idfw.name}.h5"
+        print(f"  h5 exists:     {h5path.exists()}  size={h5path.stat().st_size if h5path.exists() else 0}")
+        sidecar = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        print(f"  sidecar exists:{sidecar.exists()}  size={sidecar.stat().st_size if sidecar.exists() else 0}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,137 @@
+"""Decode IDFH histogram intervals + verify against sidecar."""
+from __future__ import annotations
+import sys
+import struct
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+SEGMENT_MAGIC = b"\x02\xda\x0a\x00\x00\x00"
+SEGMENT_SIZE = 732   # = 10-byte header + 10 × 72-byte intervals + 2-byte tail
+INTERVAL_SIZE = 72
+CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+def decode_interval(buf72: bytes) -> dict:
+    """Decode one 72-byte interval into per-channel min/max/halfp."""
+    out = {}
+    for i, ch in enumerate(CHANNELS):
+        block = buf72[i*16 : (i+1)*16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        sb = struct.unpack_from(">h", block, 4)[0]
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        f10 = struct.unpack_from(">H", block, 10)[0]
+        f14 = struct.unpack_from(">H", block, 14)[0]
+        peak_count = max(abs(mn), abs(mx))
+        out[ch] = {
+            "min":     mn,
+            "max":     mx,
+            "field4":  sb,
+            "halfp":   halfp,
+            "field10": f10,
+            "field14": f14,
+            "peak":    peak_count,
+            "freq_hz": (512.0 / halfp) if halfp > 5 else None,
+        }
+    out["_tail"] = buf72[64:].hex(" ")
+    return out
+
+
+def walk_idfh(buf: bytes) -> list:
+    """Walk all interval records in an IDFH file."""
+    intervals = []
+    # Multi-segment file: every 02 da 0a 00 00 00 marker introduces a segment.
+    # Single-interval file: just one body header at 0xf96 of form ?? ?? 0a 00 00 00.
+    # Find them all.
+    i = 0
+    while True:
+        j = buf.find(b"\x0a\x00\x00\x00", i)
+        if j < 0:
+            break
+        # Validate the candidate: a real body header has a 2-byte
+        # big-endian length before the marker and "00 NN 05 3f" after it.
+        # First, make sure there is room for the length field.
+        if j < 2:
+            i = j + 1
+            continue
+        # Body header form: [length_be_2][0a 00 00 00][00 NN][05 3f]
+        if j + 10 > len(buf):
+            break
+        length = int.from_bytes(buf[j-2:j], "big")
+        # Verify the segment-marker shape: [length_be][0a 00 00 00][00 NN][05 3f]
+        if buf[j+4] != 0x00:
+            i = j + 1
+            continue
+        if buf[j+6:j+8] != b"\x05\x3f":
+            i = j + 1
+            continue
+        # Header layout (10 bytes): [length_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+        # Followed by N interval records of 72 bytes each, then 2 tail bytes.
+        # length value = (N × 72) + 10  (counts bytes from 0x0a... through interval data).
+        header_start = j - 2
+        n_intervals = (length - 10) // INTERVAL_SIZE
+        interval_start = header_start + 10
+        for k in range(n_intervals):
+            off = interval_start + k * INTERVAL_SIZE
+            if off + INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off:off + INTERVAL_SIZE]
+            intervals.append({"offset": off, **decode_interval(chunk)})
+        i = header_start + length + 2
+    return intervals
+
+
+def main():
+    # Test against multi-segment IDFH
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    sc_path = target.parent / "TXT" / f"{target.name}.txt"
+    buf = target.read_bytes()
+    intervals = walk_idfh(buf)
+    print(f"=== {target.name} ===")
+    print(f"  file size: {len(buf)}")
+    print(f"  decoded intervals: {len(intervals)}")
+    # Collect sidecar interval rows (data lines start with the year).
+    sc_rows = []
+    for line in sc_path.read_text(errors="replace").splitlines():
+        if line.startswith("2022-") or line.startswith("2023-"):
+            sc_rows.append(line)
+    print(f"  sidecar rows: {len(sc_rows)}")
+
+    print()
+    # Show the first 2 intervals plus 78-80 (skipped if out of range).
+    for k in [0, 1, 78, 79, 80]:
+        if k >= len(intervals):
+            continue
+        iv = intervals[k]
+        print(f"--- interval {k} @0x{iv['offset']:04x} ---")
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        # sidecar row
+        if k < len(sc_rows):
+            print(f"  SC: {sc_rows[k]}")
+
+    # Test single-interval IDFH
+    print()
+    target2 = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    sc2 = target2.parent / "TXT" / f"{target2.name}.txt"
+    buf2 = target2.read_bytes()
+    intervals2 = walk_idfh(buf2)
+    print(f"=== {target2.name} ===")
+    print(f"  file size: {len(buf2)}, decoded intervals: {len(intervals2)}")
+    if intervals2:
+        iv = intervals2[0]
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        sc_rows2 = [l for l in sc2.read_text(errors='replace').splitlines() if l.startswith("2023-")]
+        if sc_rows2:
+            print(f"  SC: {sc_rows2[0]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,41 @@
+"""Find IDFH interval period via auto-correlation of structural patterns."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+from collections import Counter
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    buf = target.read_bytes()
+    body_start = 0xF96
+    body_end   = 0x270C
+    body = buf[body_start:body_end]
+    print(f"body size: {len(body)} bytes (file {len(buf)} bytes)")
+
+    # For each candidate interval size, count how many bytes at fixed offsets within
+    # each interval are zero (consistent column-zero pattern indicates correct size).
+    print()
+    print("=== zero-column score by interval size (higher = more likely) ===")
+    best = []
+    for sz in range(16, 100):
+        n = len(body) // sz
+        if n < 30:
+            continue
+        # For each column position within an interval, count how many of n intervals have zero
+        score = 0
+        for col in range(sz):
+            zeros = sum(1 for i in range(n) if body[i*sz + col] == 0)
+            if zeros >= n * 0.9:
+                score += 1
+        best.append((score, sz, n))
+    best.sort(reverse=True)
+    for score, sz, n in best[:10]:
+        print(f"  size={sz:3d}  n_intervals={n}  consistently-zero-cols={score}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Per-file accuracy + sample-count details."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = sorted([f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")])
+    GEO_LSB = 0.0003
+    # Limit to first 20 successful files for detail.
+    shown = 0
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            continue
+        sc_path = f.parent / "TXT" / f"{f.name}.txt"
+        if not sc_path.exists():
+            continue
+        sc = load_sidecar_samples(sc_path)
+        sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+        dec = res.samples.get("Tran", [])
+        n = min(len(sc_tran), len(dec))
+        exact = sum(1 for i in range(n) if sc_tran[i] == dec[i]) if n else 0
+        pct = 100.0 * exact / n if n else 0.0
+        print(f"{f.name:40s}  size={f.stat().st_size:6d}  sc_n={len(sc_tran):4d}  dec_n={len(dec):4d}  exact={pct:.1f}%")
+        shown += 1
+        if shown >= 20:
+            break
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,64 @@
+"""Look at what's at the divergence boundary."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import walk_body, find_data_start, parse_segment_header
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    body = buf[0x0f1f:]
+    start = find_data_start(body)
+    print(f"data_start: {start}  (= file offset 0x{0x0f1f + start:04x})")
+
+    blocks = walk_body(body, start)
+    print(f"{len(blocks)} blocks total")
+    print()
+
+    # First 30 blocks
+    print("=== first 30 blocks ===")
+    for i, b in enumerate(blocks[:30]):
+        body_off = 0x0f1f + b.offset
+        if b.tag_hi == 0x40:
+            hdr = parse_segment_header(b)
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  (segment header)  counter={hdr['counter'] if hdr else '?'}  field2={hdr['field2'].hex() if hdr else '?'}  anchor={hdr['anchor_bytes'].hex() if hdr else '?'}  tail={hdr['tail'].hex() if hdr else '?'}")
+        else:
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  len={b.length}  data={b.data[:16].hex()}")
+    print()
+
+    # Cumulative sample counts per block to find which block contains sample 254
+    print("=== cumulative samples through blocks ===")
+    cur_ch = "Tran"
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    seg_count = 0
+    samples_in_curseg = 2  # preamble Tran[0], Tran[1]
+    for i, b in enumerate(blocks[:30]):
+        if b.tag_hi == 0x40:
+            seg_count += 1
+            prev_ch = cur_ch
+            cur_ch = rotation[(seg_count - 1) % 4]
+            print(f"  [{i:3d}] 40 02 -> end of {prev_ch} segment, start {cur_ch} (segment {seg_count})")
+            samples_in_curseg = 2  # anchors
+        elif (b.tag_hi & 0xF0) == 0x10:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} nibble: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif (b.tag_hi & 0xF0) == 0x20:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} int8: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x00:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} RLE: +{b.tag_lo}, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x30:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} packed12: +{b.tag_lo} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,89 @@
+"""Reconnaissance helpers for cracking the Thor IDFW binary."""
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+TARGET = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+TXT = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/TXT/UM11719_20231219162723.IDFW.txt"
+
+
+def hex_at(buf: bytes, off: int, n: int = 32) -> str:
+    chunk = buf[off : off + n]
+    hexs = " ".join(f"{b:02x}" for b in chunk)
+    asc = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
+    return f"{off:04x}: {hexs}  {asc}"
+
+
+def find_all(buf: bytes, needle: bytes) -> list[int]:
+    out: list[int] = []
+    i = 0
+    while True:
+        j = buf.find(needle, i)
+        if j < 0:
+            break
+        out.append(j)
+        i = j + 1
+    return out
+
+
+def load_sidecar_samples(path: Path) -> dict[str, list[float]]:
+    """Parse the txt sample table — Tran/Vert/Long/MicL."""
+    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    in_block = False
+    for line in path.read_text(errors="replace").splitlines():
+        if not in_block:
+            if line.strip() == "Waveform Data Channels":
+                in_block = True
+            continue
+        if line.startswith("Waveform Data USB Channels"):
+            break
+        parts = line.split("\t")
+        # First row is the header "\tTran\tVert\tLong\tMicL"
+        if len(parts) >= 5 and parts[1] == "Tran":
+            continue
+        if len(parts) < 5:
+            continue
+        try:
+            out["Tran"].append(float(parts[1]))
+            out["Vert"].append(float(parts[2]))
+            out["Long"].append(float(parts[3]))
+            out["MicL"].append(float(parts[4]))
+        except ValueError:
+            continue
+    return out
+
+
+def main():
+    buf = TARGET.read_bytes()
+    samples = load_sidecar_samples(TXT)
+    print(f"file size: {len(buf)} bytes")
+    print(f"sample rows: Tran={len(samples['Tran'])} Vert={len(samples['Vert'])} Long={len(samples['Long'])} MicL={len(samples['MicL'])}")
+    print(f"first 6 Tran samples: {samples['Tran'][:6]}")
+    print(f"first 6 Vert samples: {samples['Vert'][:6]}")
+    print(f"first 6 Long samples: {samples['Long'][:6]}")
+    print(f"first 6 MicL samples: {samples['MicL'][:6]}")
+
+    print()
+    print("=== BW magic '00 02 00' positions ===")
+    hits = find_all(buf, b"\x00\x02\x00")
+    print(f"{len(hits)} hits")
+    for h in hits[:20]:
+        print(hex_at(buf, h, 24))
+
+    print()
+    print("=== '40 02' segment-header positions ===")
+    hits = find_all(buf, b"\x40\x02")
+    print(f"{len(hits)} hits")
+    for h in hits:
+        ctx_pre = buf[max(0, h - 4): h].hex()
+        ctx_post = buf[h: h + 20].hex()
+        # Show byte preceding to help identify real headers vs casual occurrences
+        print(f"  0x{h:04x}  pre={ctx_pre}  post={ctx_post}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Find each segment boundary in the channel and check if errors reset there."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Find every transition where error becomes zero from nonzero (or grows from zero)
+        # Print indices where dec resyncs back to exact match.
+        n = min(len(sc_counts), len(dec))
+        events = []
+        prev_match = True
+        for i in range(n):
+            match = sc_counts[i] == dec[i]
+            if match != prev_match:
+                kind = "RESYNC" if match else "DIVERGE"
+                events.append((i, kind, sc_counts[i], dec[i]))
+                prev_match = match
+        print(f"{ch}: {len(events)} transitions")
+        for i, kind, sc_v, dec_v in events[:20]:
+            print(f"  idx {i:4d}  {kind:8s}  sc={sc_v:6d}  dec={dec_v:6d}  diff={dec_v-sc_v:+d}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,46 @@
+"""Smoke-test read_idf_file on IDFH across the corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature:   {result.signature}")
+    print(f"  serial:      {ev.serial}")
+    print(f"  timestamp:   {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  kind:        {ev.kind}")
+    print(f"  intervals:   {len(result.intervals or [])}")
+    print(f"  peaks:       T={ev.peaks.transverse_ips:.4f} V={ev.peaks.vertical_ips:.4f} L={ev.peaks.longitudinal_ips:.4f}")
+    print()
+
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = list(root.rglob("*.IDFH"))
+    ok = fail = nyi = 0
+    total_intervals = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+            total_intervals += len(r.intervals or [])
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 3:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print(f"Corpus: {len(files)} IDFH files | ok={ok} fail={fail} nyi={nyi}")
+    print(f"Total intervals decoded: {total_intervals}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""Smoke-test read_idf_file across the sample corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file, geo_count_to_ips, mic_count_to_psi
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature: {result.signature}")
+    print(f"  serial:    {ev.serial}")
+    print(f"  timestamp: {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  record_time: {ev.record_time_sec}")
+    print(f"  calibration: {result.binary_metadata.calibration_date}")
+    print(f"  Tran samples: {len(result.samples['Tran'])}, peak_ips={ev.peaks.transverse_ips:.4f}")
+    print(f"  Vert samples: {len(result.samples['Vert'])}, peak_ips={ev.peaks.vertical_ips:.4f}")
+    print(f"  Long samples: {len(result.samples['Long'])}, peak_ips={ev.peaks.longitudinal_ips:.4f}")
+    print(f"  MicL samples: {len(result.samples['MicL'])}")
+    print()
+
+    # Corpus sweep
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
+    ok = fail = nyi = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 5:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print()
+    print(f"Corpus: {len(files)} IDFW files | ok={ok} fail={fail} not-implemented={nyi}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,47 @@
+"""Verify build_bw_report_from_idf against a known sidecar."""
+from __future__ import annotations
+import json
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_ascii_report import parse_idf_report
+from micromate.idf_to_bw_report import build_bw_report_from_idf
+from micromate.idf_file import read_idf_file
+
+
+def show(prefix: str, d: dict, indent: int = 0):
+    for k, v in d.items():
+        if isinstance(v, dict):
+            print(f"{'  '*indent}{prefix}{k}:")
+            show("", v, indent + 1)
+        else:
+            print(f"{'  '*indent}{prefix}{k}: {v!r}")
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    report_dict = parse_idf_report(txt.read_text(errors="replace"))
+    res = read_idf_file(idfw)
+    bw = build_bw_report_from_idf(report_dict, binary_md=res.binary_metadata)
+
+    print("=== IDFW → bw_report ===")
+    show("", bw)
+
+    print()
+    print("=== IDFH (single trigger row) ===")
+    idfh = base / "UM11719_20231219162648.IDFH"
+    txt_h = base / "TXT" / f"{idfh.name}.txt"
+    rh = parse_idf_report(txt_h.read_text(errors="replace"))
+    res_h = read_idf_file(idfh)
+    bw_h = build_bw_report_from_idf(rh, binary_md=res_h.binary_metadata, intervals=res_h.intervals)
+    show("", bw_h)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,73 @@
+"""Trace Tran sample-by-sample to find exactly where the codec drifts."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def s4(n: int) -> int:
+    return n if n < 8 else n - 16
+
+
+def i8(b: int) -> int:
+    return b if b < 128 else b - 256
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    GEO_LSB = 0.0003
+    sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+
+    body = buf[0x0f1f:]
+    # Tran[0], Tran[1] from preamble
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+    print(f"preamble Tran[0]={t0}  Tran[1]={t1}  (sidecar: {sc_tran[0]}, {sc_tran[1]})")
+
+    # Block 0: 10 f8 at body[7:9]
+    print(f"block 0: tag {body[7]:02x} {body[8]:02x}")
+    print(f"  block 0 first 10 data bytes: {body[9:19].hex()}")
+
+    # Walk block 0 manually, comparing each sample
+    cur = t1
+    samples = [t0, t1]
+    block_off = 7
+    nn = body[8]
+    print(f"  NN = {nn}")
+    data = body[9 : 9 + nn // 2]
+    for byi, byte in enumerate(data):
+        for nib_idx, nib in enumerate(((byte >> 4) & 0xF, byte & 0xF)):
+            cur += s4(nib)
+            samples.append(cur)
+            idx = len(samples) - 1
+            if 0 <= idx < len(sc_tran):
+                sc_v = sc_tran[idx]
+                match = "✓" if sc_v == cur else "✗"
+                if idx < 12 or 240 <= idx <= 260:
+                    print(f"    idx {idx:3d}: nibble byte={byte:02x} nib={nib:x} delta={s4(nib):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+    print(f"end of block 0: cur={cur}, len(samples)={len(samples)}, decoder expected 250 here")
+    # Block 1 (expected tag 20 28) starts at body offset 9 + nn // 2 (= 9 + 124 = 133 when nn = 0xf8).
+    block1_off = 9 + nn // 2
+    print(f"block 1: tag {body[block1_off]:02x} {body[block1_off+1]:02x} (expecting 20 28)")
+    nn1 = body[block1_off + 1]
+    print(f"  block 1 NN = {nn1}")
+    data1 = body[block1_off + 2 : block1_off + 2 + nn1]
+    for byi, byte in enumerate(data1):
+        cur += i8(byte)
+        samples.append(cur)
+        idx = len(samples) - 1
+        if idx < len(sc_tran):
+            sc_v = sc_tran[idx]
+            match = "✓" if sc_v == cur else "✗"
+            if 248 <= idx <= 295:
+                print(f"    idx {idx:3d}: int8 byte={byte:02x} delta={i8(byte):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,42 @@
+"""Feed candidate body offsets to the BW codec and compare with sidecar."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    # Sidecar samples in 0.0003 counts (Thor geo LSB).
+    sc_tran = [int(round(v / 0.0003)) for v in sc["Tran"][:30]]
+    sc_vert = [int(round(v / 0.0003)) for v in sc["Vert"][:30]]
+    sc_long = [int(round(v / 0.0003)) for v in sc["Long"][:30]]
+    sc_micl = [int(round(v / 1e-6)) for v in sc["MicL"][:30]]  # 1 µ unit for mic? Will iterate.
+    print(f"sidecar Tran (counts): {sc_tran}")
+    print(f"sidecar Vert (counts): {sc_vert}")
+    print(f"sidecar Long (counts): {sc_long}")
+    print(f"sidecar MicL (×1e-6):  {sc_micl}")
+    print()
+
+    # Try candidate body start offsets.
+    for off in (0x0f1f, 0x1057, 0x11f1, 0x1333, 0x1bde, 0x0d30):
+        print(f"=== body @ 0x{off:04x} ===")
+        body = buf[off:]
+        decoded = decode_waveform_v2(body)
+        if not decoded:
+            print("  decode_waveform_v2 returned None")
+            continue
+        for ch in ("Tran", "Vert", "Long", "MicL"):
+            arr = decoded.get(ch, [])
+            print(f"  {ch}[{len(arr)}]: {arr[:20]}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,51 @@
+"""Verify decode_waveform_v2 against sidecar across all 2304 samples per channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    body = buf[0x0f1f:]
+    decoded = decode_waveform_v2(body)
+
+    print(f"Sidecar lengths: Tran={len(sc['Tran'])} Vert={len(sc['Vert'])} Long={len(sc['Long'])} MicL={len(sc['MicL'])}")
+    print(f"Decoded lengths: Tran={len(decoded['Tran'])} Vert={len(decoded['Vert'])} Long={len(decoded['Long'])} MicL={len(decoded['MicL'])}")
+    print()
+
+    GEO_LSB = 0.0003  # in/s per count
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        n = min(len(sc_counts), len(dec))
+        matches = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+        first_mismatch = next((i for i in range(n) if sc_counts[i] != dec[i]), None)
+        pct = 100 * matches / n if n else 0.0  # avoid ZeroDivisionError when nothing decoded
+        print(f"{ch}: compared {n}, exact matches {matches} ({pct:.2f}%)")
+        if first_mismatch is not None:
+            i = first_mismatch
+            print(f"  first mismatch at idx {i}: sidecar={sc_counts[i]} ({sc[ch][i]}), decoded={dec[i]}")
+            lo = max(0, i - 2)  # clamp so the label matches the slice actually printed
+            print(f"  context sidecar[{lo}..{i+4}]: {sc_counts[lo:i+5]}")
+            print(f"  context decoded[{lo}..{i+4}]: {dec[lo:i+5]}")
+
+    # MicL: find the multiplicative factor that fits
+    print()
+    print("=== MicL scale analysis ===")
+    sc_micl = sc["MicL"]
+    dec_micl = decoded["MicL"]
+    # Skip zero values when computing ratio
+    ratios = [sc_micl[i] / dec_micl[i] for i in range(min(50, len(sc_micl), len(dec_micl))) if dec_micl[i] != 0]
+    if ratios:
+        avg = sum(ratios) / len(ratios)
+        print(f"  avg ratio sidecar/decoded over nonzero entries in first 50 samples: {avg:.4e} (n={len(ratios)})")
+        print(f"  ratios sample: {[f'{r:.4e}' for r in ratios[:6]]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -12,7 +12,21 @@ implementation lives in `minimateplus/histogram_codec.py`.
 in-repo histogram fixture corpus decodes byte-exact against BW's
 ASCII export.

-24 regression tests pass against ~3,500 blocks across 5 fixtures.
+26 regression tests pass against ~3,500 blocks across 5 in-repo
+fixtures, plus a synthetic regression block taken from a real
+BE9558 prod event to lock in the uint8-peak interpretation.
+
+**Important correction (2026-05-21):** the per-channel peak count
+is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
+etc.  The N844 fixture corpus used for the original RE has zero
+values in bytes [7]/[11]/[15]/[19] for every block, so the two
+interpretations happened to be equivalent there.  Cross-correlating
+non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
+against BW's per-interval ASCII export (4 channels × ~1400 blocks
+per event, multiple events) is 100% byte-exact only when the peak
+is read as uint8.  Reading it as uint16 LE produced peaks up to 268
+in/s per channel and 35× inflated PVS sums when first deployed to
+prod (rolled back, root-caused, and fixed in commit 7183b95+1).
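
To make the difference concrete, the sketch below reads the same two bytes both ways. It is only an illustration: `read_peak_both_ways` is a hypothetical helper, not part of `minimateplus/histogram_codec.py`, and the byte values are made up rather than taken from a BE9558 or BE18003 event.

```python
from __future__ import annotations

import struct


def read_peak_both_ways(pair: bytes) -> tuple[int, int]:
    """Hypothetical helper: interpret bytes [6:8] of a block two ways."""
    peak_u8 = pair[0]                                 # uint8 at byte[6] (correct)
    peak_u16 = struct.unpack_from("<H", pair, 0)[0]   # uint16 LE at byte[6:8] (wrong)
    return peak_u8, peak_u16


# Made-up example: peak count 40 followed by a non-zero annotation byte.
u8, u16 = read_peak_both_ways(bytes([40, 2]))
print(f"uint8 peak count:     {u8}")
print(f"uint16 LE peak count: {u16}")

# With a zero annotation byte (as in the N844 corpus) both readings agree.
same_u8, same_u16 = read_peak_both_ways(bytes([40, 0]))
print(f"zero annotation byte: {same_u8} vs {same_u16}")
```

Any non-zero value in the second byte adds a multiple of 256 to the uint16 reading, which is why the error only showed up on events outside the N844 corpus.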

 ## Body format

@@ -27,15 +41,21 @@ Each block represents one histogram interval.  Block layout:
 [1]    segment_id (uint8)        0x00..0x03 — 256 blocks per segment
 [2:4]  block_ctr (uint16 LE)     resets each segment (0x0100, 0x0101, …)
 [4:6]  0x000a (uint16 LE)        constant marker (= 10)
-[6:8]  T_peak_count   uint16 LE  Tran peak (count × 0.005 → in/s at Normal)
+[6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s at Normal,
+                                  max 1.275 in/s — fits in uint8)
+[7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                  or unmeasurable freq; meaning not fully RE'd
 [8:10] T_halfperiod   uint16 LE  Tran half-period in samples
                                  (freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
-[10:12] V_peak_count  uint16 LE  Vert peak
+[10]   V_peak_count   uint8      Vert peak
+[11]   V_annotation   uint8
 [12:14] V_halfperiod  uint16 LE  Vert freq half-period
-[14:16] L_peak_count  uint16 LE  Long peak
+[14]   L_peak_count   uint8      Long peak
+[15]   L_annotation   uint8
 [16:18] L_halfperiod  uint16 LE  Long freq half-period
-[18:20] M_peak_count  uint16 LE  MicL peak count
+[18]   M_peak_count   uint8      MicL peak count
                                  (dB via waveform_codec.mic_count_to_db)
+[19]   M_annotation   uint8
 [20:22] M_halfperiod  uint16 LE  MicL freq half-period
 [22:24] 0x00 0x00                constant
 [24:28] 4-byte variable          purpose unknown — possibly CRC,
@@ -99,6 +119,16 @@ slot[8] = 9  → 512/9 = 56.9 → 57 Hz       ✓ M_freq

 ## What's NOT yet decoded

+- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**.  Empirically
+  non-zero on intervals where the per-channel ZC frequency comes
+  out as `N/A` or sub-Hz (`<1.0`, `1.X`).  Hypothesis tested in the
+  RE session: byte != 0 ↔ sub-Hz freq.  Only ~50% correlation
+  across the K558 corpus, so the relationship is more complex.
+  Possibilities: time-of-peak-within-interval, halfp extension for
+  very-long-period signals, or a debug/diagnostic field the firmware
+  writes opportunistically.  Doesn't affect peak amplitudes or
+  waveform reconstruction.  Captured as `record["annotations"]` for
+  future RE.
 - **4-byte variable metadata field (bytes 24:28)**.  Not needed for
  waveform reconstruction.  Speculation: per-block CRC, sub-second
  timestamp offset, or a Mic psi(L) count not in the 9 samples.
@@ -6,11 +6,68 @@ Series IV event-file format.  Sibling to
 Series III "Rosetta Stone") — this doc holds what we know so far and
 the open questions still to crack.

-**Status (2026-05-20):** ASCII text sidecar fully decoded (1,014
-sample files round-trip).  Binary `.IDFH` / `.IDFW` codec
-**not yet implemented** — binaries are stored opaquely by
-`WaveformStore.save_imported_idf`, with metadata sourced from the
-paired `.txt` sidecar.
+**Status (2026-05-28):** ASCII text sidecar fully decoded (1,014
+sample files round-trip).  **Thor IDFW** binary now decodes via
+`micromate.idf_file.read_idf_file()`, which reuses the BW segment-rotated
+block codec verbatim.  The body sits at offset `0x0f1f` in the example
+corpus, but production files vary, so the reader scans for it.  Metadata
+(serial, timestamp, sample_rate, record_time, calibration_date) is
+extracted from the binary header.  Sample fidelity is 87–99% byte-exact on quiet
+events; loud events hit the BW codec's known walker-stops-early
+limitation.  Residual ~3% drift on per-sample deltas (likely a
+Thor-specific 12-bit delta refinement not yet modelled).
+
+**Thor IDFH histograms also decoded.**  Body has one or more segments;
+each 10-byte segment header `[length_be 2B][0a 00 00 00][00 NN][05 3f]`
+introduces `N = (length - 10) // 72` interval records of 72 bytes
+each.  Each interval = 4 × 16-byte per-channel records:
+`[int16 min][int16 max][int16 ??][uint16 halfp][2B 00][uint16 ??][2B 00][uint16 ??]`.
+Geo peak `= max(|min|, |max|) / 32768 × 10` in/s (matches the sidecar
+within ~1.8%); freq `= 512 / halfp` Hz (None for halfp ≤ 5 → ">100"
+sentinel).  Corpus: **all 859 Thor IDFH files decode, 181,071
+intervals**.  Wired through `read_idf_file()` →
+`save_imported_idf()` → sidecar's `extensions.idf_intervals`.
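
The arithmetic in the paragraph above can be sketched as follows. This is an illustration, not the shipped decoder: the function names are hypothetical, and the big-endian field order is an assumption taken from `_decode_idfh_interval()` in this PR.

```python
import struct
from typing import Optional, Tuple

INTERVAL_SIZE = 72    # bytes per interval record
SEGMENT_OVERHEAD = 10  # the "- 10" in N = (length - 10) // 72


def intervals_in_segment(length: int) -> int:
    """Number of 72-byte interval records announced by a segment length."""
    return max((length - SEGMENT_OVERHEAD) // INTERVAL_SIZE, 0)


def geo_peak_ips(mn: int, mx: int) -> float:
    """Peak in in/s from int16 min/max counts (10 in/s full scale)."""
    return max(abs(mn), abs(mx)) / 32768 * 10


def freq_hz(halfp: int) -> Optional[float]:
    """512 / halfp, or None for the '>100 Hz' sentinel (halfp <= 5)."""
    return None if halfp <= 5 else 512.0 / halfp


def decode_channel_record(rec16: bytes) -> Tuple[int, int, int]:
    """Return (min, max, halfp) from one 16-byte per-channel record."""
    mn, mx = struct.unpack_from(">hh", rec16, 0)   # assumed big-endian int16
    halfp = struct.unpack_from(">H", rec16, 6)[0]  # assumed big-endian uint16
    return mn, mx, halfp


# Synthetic record: min = -16384, max = 8192, halfp = 9.
rec = struct.pack(">hhhH", -16384, 8192, 0, 9) + bytes(8)
mn, mx, halfp = decode_channel_record(rec)
print(intervals_in_segment(10 + 72 * 3), geo_peak_ips(mn, mx), freq_hz(halfp))
```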
+
+**Note on the BE9439 outliers in the example corpus:** Two files
+(`BE9439_20200713131747.IDFW` and `BE9439_20200713124251.IDFH`) are
+**Series III Blastware** binaries, not Thor.  Provenance: TMI tried
+to use Thor to manage auto-call-homes for Series III units; the
+experiment didn't work out, but it did leave a few BW event files
+in Thor's per-serial directory structure with `.IDFW`/`.IDFH`
+extensions — Thor's forwarder applied its own naming convention to
+the BW bodies it was relaying.  Their header `10 00 01 80 00 00
+Instantel STRT ff fe <end_key> <start_key>` is the BW SUB 5A STRT
+record, not a Thor body preamble.  The reader detects them by
+signature and raises `NotImplementedError` pointing callers at
+`read_blastware_file()`, which extracts BW-format peaks from them.
+
+**Still NYI for Thor IDFH:** per-channel `int16 field4` (possibly
+time-of-peak); the two uint16 fields (probably PVS contributions);
+8-byte interval tail (PVS data); mic dB(L) exact conversion constant.
+
+### Codec breakthroughs (2026-05-28)
+
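+- **Body offset is `0x0f1f`** in 151/154 corpus IDFW files, but it is
+  not fixed in general: production files also use other offsets, which
+  `_find_waveform_body_offset()` locates by scanning.  Preceded by a
+  4-byte record-type marker (`46 00 00 00`)
+  + magic preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`.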
+- **Sample stream is BW's segment-rotated block codec verbatim.**
+  Thor reuses `10 NN` (nibble), `20 NN` (int8), `00 NN` (RLE),
+  `30 NN` (packed12), `40 02` (segment header) tags with the same
+  semantics.  Channel rotation Tran→Vert→Long→MicL.
+- **Geo LSB = 0.0003 in/s** (not BW's 0.005), because Thor's 16-bit
+  ADC range maps to 10 in/s without the 16-count BW quantization step.
+- **Mic ≈ 2.14×10⁻⁶ psi/count** (rough scale; refine after channel
+  block calibration constants are decoded).
+- **BW compliance anchor `\xbe\x80\x00\x00\x00\x00` reappears at
+  IDFW offset 0x952** — sample_rate at anchor−6 (uint16 BE),
+  record_time at anchor+6 (float32 BE), same layout as BW.
+- **Event timestamp at offset 0x97A** — 8 bytes `[day][month]
+  [year_be][unk][hour][min][sec]`.  Stop-time mirrors at 0x982.
+- **Serial as null-terminated ASCII at 0x14E**.
+- **Calibration date** at 0x194–0x197 (day, month, year_be).
+- Per-sample residual drift of ~3% suggests Thor encodes int8/nibble
+  deltas with an extra refinement bit that BW doesn't carry —
+  unsolved; errors resync within a few samples so cumulative impact
+  is small.
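
The header offsets listed above can be exercised on a synthetic buffer. The sketch below is an illustration only: `parse_header_fields` is a hypothetical helper, the buffer contents are made up, and it skips the range validation that `extract_binary_metadata()` performs.

```python
import datetime
import struct

ANCHOR = b"\xbe\x80\x00\x00\x00\x00"  # BW compliance anchor
TIMESTAMP_OFF = 0x97A                 # IDFW event timestamp offset


def parse_header_fields(buf: bytes) -> dict:
    """Read sample_rate, record_time and event timestamp from an IDFW header."""
    anchor = buf.find(ANCHOR)
    if anchor < 6 or anchor + 10 > len(buf):
        raise ValueError("compliance anchor not found")
    if len(buf) < TIMESTAMP_OFF + 8:
        raise ValueError("buffer too short for the timestamp")
    sample_rate = int.from_bytes(buf[anchor - 6 : anchor - 4], "big")
    record_time = struct.unpack_from(">f", buf, anchor + 6)[0]
    day, month, yr_hi, yr_lo, _unk, hour, minute, second = buf[
        TIMESTAMP_OFF : TIMESTAMP_OFF + 8
    ]
    year = (yr_hi << 8) | yr_lo
    stamp = datetime.datetime(year, month, day, hour, minute, second)
    return {"sample_rate": sample_rate, "record_time": record_time, "timestamp": stamp}


# Build a synthetic header with the anchor at 0x952.
buf = bytearray(0x990)
buf[0x952:0x958] = ANCHOR
buf[0x94C:0x94E] = (1024).to_bytes(2, "big")       # anchor - 6
buf[0x958:0x95C] = struct.pack(">f", 3.0)          # anchor + 6
buf[0x97A:0x982] = bytes([19, 12, 0x07, 0xE7, 0, 16, 27, 23])
print(parse_header_fields(bytes(buf)))
```

Searching for the anchor rather than hardcoding `0x952` keeps the sketch tolerant of header shifts, at the cost of a possible false match if the same six bytes appear earlier in a real file.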

 ---

@@ -210,8 +210,7 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
        "long_peak_acceleration",
        "tran_peak_displacement", "vert_peak_displacement",
        "long_peak_displacement",
-        "tran_time_of_peak", "vert_time_of_peak", "long_time_of_peak",
-        "mic_time_of_peak", "mic_zc_freq",
+        "mic_zc_freq",
    )
    for key in float_fields:
        v = raw.get(key)
@@ -223,6 +222,22 @@ def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
        else:
            out.pop(key, None)

+    # Time-of-peak: Thor labels these "TimeofPeak" (lowercase "of") so the
+    # normalizer produces "*_timeof_peak".  Map them to the canonical
+    # ``*_time_of_peak`` output keys for downstream consumers.
+    for raw_key, out_key in (
+        ("tran_timeof_peak", "tran_time_of_peak"),
+        ("vert_timeof_peak", "vert_time_of_peak"),
+        ("long_timeof_peak", "long_time_of_peak"),
+        ("mic_timeof_peak",  "mic_time_of_peak"),
+    ):
+        v = raw.get(raw_key)
+        if v is None:
+            continue
+        fv = _parse_float(v)
+        if fv is not None:
+            out[out_key] = fv
+
    # Microphone — Thor reports MicPSPL (dB(L)) which is the closest
    # analogue to BW's mic_ppv.  The raw "99.4 dB(L)" string stays in
    # `out` under the original `mic_pspl` key for display; the parsed
@@ -1,64 +1,530 @@
 """
-micromate/idf_file.py — placeholder for the Thor IDF binary codec.
+micromate/idf_file.py — Thor IDF binary codec.

-Thor's ``.IDFH`` (histogram) and ``.IDFW`` (waveform) event files are an
-Instantel proprietary binary format that has not yet been reverse-
-engineered.  Today seismo-relay treats them as opaque blobs:
-``WaveformStore.save_imported_idf`` stores the bytes verbatim and reads
-all device-authoritative metadata from the paired ``.IDFW.txt`` /
-``.IDFH.txt`` ASCII sidecar (parsed by ``idf_ascii_report.py``).
+Decodes the Instantel Micromate Series IV ``.IDFW`` (waveform) and
+``.IDFH`` (histogram) binary on-disk format.  Sister module to
+``minimateplus/event_file_io.py``.

-When we crack the binary codec — same reverse-engineering playbook we
-used to byte-perfect-parse Series III BW files (see
-``docs/instantel_protocol_reference.md`` and ``minimateplus/event_file_io.py``)
-— this module will grow:
+Status (2026-05-28):

-  - ``read_idf_file(path) -> IdfEvent``
-        Parse a ``.IDFW``/``.IDFH`` binary and return a fully populated
-        ``IdfEvent`` whose waveform-sample arrays come from the binary
-        (the .txt sidecar's tabular sample block being a best-effort
-        check).  Lets us ingest Thor events even when the operator
-        hasn't enabled the .txt exporter — closing the
-        ``had_report=False`` gap that the thor-watcher forwarder
-        currently tolerates as a known limitation.
+- **Genuine Series IV / Thor binaries** are all signed
+  ``00 12 01 00 00 00 Instantel\\0`` (sig-A in earlier notes).  Two
+  Series III (Blastware) binaries appear in the example corpus
+  (``BE9439_*``) — they share the ``.IDFW``/``.IDFH`` extension by
+  filing convention but carry a BW STRT header (``10 00 01 80 00 00
+  Instantel STRT...``) and are NOT Thor data.  The reader detects
+  them by signature and raises NotImplementedError pointing callers
+  at ``minimateplus.event_file_io.read_blastware_file()``.
+- **IDFW waveform body** reuses the BW segment-rotated block codec
+  verbatim.  Body offset varies: ``0x0f1f`` is the most common, and
+  ``_find_waveform_body_offset()`` scans for the rest.  Samples
+  decoded via ``minimateplus.waveform_codec.decode_waveform_v2``
+  with 87–99% byte-exact match against ``.IDFW.txt`` sidecar (quiet
+  events).  Loud events hit the BW codec's known walker-stops-early
+  limit.  Residual ~3% drift on per-sample deltas — likely a
+  Thor-specific 12-bit delta refinement that BW's codec doesn't
+  model.  Geo LSB = 0.0003 in/s; mic factor ~2.14e-6 psi/count.
+- **IDFH histogram body**: 10-byte segment header
+  ``[len_be 2B] 0a 00 00 00 [00 NN_counter] 05 3f`` introduces a
+  segment of ``N`` 72-byte interval records (``N = (len - 10) // 72``).
+  Each record holds 4 × 16-byte per-channel min/max/halfp + 8-byte
+  tail.  Geo peaks via ``max(|min|, |max|) / 32768 × 10`` in/s
+  (matches sidecar within ~1.8%), freq via ``512 / halfp`` Hz.
+  **All 859 Thor IDFH files in the corpus decode (181,071 intervals).**
+- Binary metadata directly extracted: serial, timestamp, sample_rate,
+  record_time, calibration_date.  Other fields fall back to the paired
+  ``.IDFW.txt`` / ``.IDFH.txt`` sidecar (consumed by
+  ``WaveformStore.save_imported_idf``).

-  - ``write_idf_file(path, event)`` (eventually)
-        Round-trip event reconstruction, used for verifying the codec
-        against captured device files the way ``write_blastware_file``
-        verifies the Series III codec.
-
-  - Helpers for decoding the binary's per-channel sample arrays into
-    physical units, the per-event flash buffer's monitor-log records,
-    etc.
-
-The reverse-engineering path: pair every ``.IDFW`` binary in
-``thor-watcher/example-data/`` with its sibling ``.IDFW.txt``, treating
-the txt's "Waveform Data Channels" block as ground-truth, and align
-the binary's per-channel int16-or-similar arrays against it.  Header
-fields (sample rate, channel count, record time, timestamps) sit before
-the sample block — same approach as the BW codec where ASCII strings
-inside the binary (``Project:``, ``Client:``, etc.) anchored field
-discovery.
+The full reverse-engineering writeup lives in
+``docs/idf_protocol_reference.md``.
 """

 from __future__ import annotations

+import datetime
+import struct
+from dataclasses import dataclass
 from pathlib import Path
-from typing import Union
+from typing import Optional, Union

-from .models import IdfEvent
+from minimateplus.waveform_codec import decode_waveform_v2
+
+from .models import IdfEvent, IdfPeaks, IdfReport


-def read_idf_file(path: Union[str, Path]) -> "IdfEvent":
-    """Parse a Thor ``.IDFW``/``.IDFH`` binary into an ``IdfEvent``.
+# Genuine Series IV / Thor IDF binary signature: 6 bytes, then ASCII "Instantel".
+_THOR_PREFIX = b"\x00\x12\x01\x00\x00\x00"
+# Stray Series III (Blastware) binaries that occasionally turn up in Thor
+# corpus directories renamed to the .IDFW/.IDFH convention.  Their header
+# (`10 00 01 80 00 00 Instantel STRT ...`) is byte-for-byte a BW SUB 5A
+# STRT record, not a Thor binary.  Detected so we can refuse-and-route
+# rather than mis-parse.
+_BW_STRAY_PREFIX = b"\x10\x00\x01\x80\x00\x00"
+_INSTANTEL_TAG = b"Instantel"

-    Not yet implemented.  When implemented, this will be the canonical
-    entry point for reading Thor binaries — the ASCII sidecar parser
-    becomes an optional fast-path metadata supplement rather than the
-    sole source of device-authoritative data.
+# Most common body offset for sig-A IDFW files (~50% of prod events;
+# 151/154 in the original tests/fixtures/THORDATA_example corpus).  The
+# body is the segment-rotated block stream consumed by decode_waveform_v2;
+# bytes [0:3] are the magic ``00 02 00`` preamble.  Production events
+# routinely use other offsets — see :func:`_find_waveform_body_offset`
+# for the dynamic scan.  This constant survives only as the priority hint.
+_BODY_START_SIG_A = 0x0F1F
+
+# Magic bytes that mark a candidate waveform-body preamble.
+_BODY_MAGIC = b"\x00\x02\x00"
+
+# Where to start looking for body candidates inside the file.  Skip the
+# fixed-header region where the same magic legitimately appears inside
+# channel-test records and the compliance block (offsets 0x015d, 0x091c,
+# 0x0ae2, 0x0d30 in observed events).
+_BODY_SCAN_FLOOR = 0x0E00
+
+# Geophone count → in/s, derived from sidecar ground truth: the smallest
+# non-zero sample in the 1,014-file corpus is 0.0003 in/s.
+_GEO_LSB_IPS = 0.0003
+
+# Microphone count → psi, derived from sidecar regression on 50 sample
+# pairs from UM11719_20231219162723.IDFW (mic-heavy event).
+_MIC_LSB_PSI = 2.14e-6
+
+# IDFH histogram constants.
+_IDFH_INTERVAL_SIZE = 72        # bytes per per-interval record
+_IDFH_SEGMENT_HEADER = 10       # bytes: [len_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+_IDFH_SEGMENT_TAIL   = 2        # bytes after the interval data block, before next marker
+_IDFH_HALFP_FREQ_NUM = 512.0    # freq_hz = NUM / halfp; halfp ≤ 5 means ">100 Hz" sentinel
+_IDFH_GEO_FULL_SCALE = 10.0     # in/s — Normal range
+_IDFH_INT16_FS = 32768.0
+_IDFH_CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+# ─── Binary metadata extraction ─────────────────────────────────────────────
+
+
+@dataclass
+class IdfBinaryMetadata:
+    """Fields recoverable from the sig-A binary header (no .txt needed)."""
+    serial:           Optional[str] = None
+    event_datetime:   Optional[datetime.datetime] = None
+    sample_rate:      Optional[int] = None
+    record_time_sec:  Optional[float] = None
+    calibration_date: Optional[datetime.date] = None
+
+
+def _read_ascii_z(buf: bytes, off: int, maxlen: int = 64) -> Optional[str]:
+    if off >= len(buf):
+        return None
+    end = buf.find(b"\x00", off, off + maxlen)
+    if end < 0:
+        end = min(off + maxlen, len(buf))
+    s = buf[off:end].decode("ascii", errors="replace").strip()
+    return s or None
+
+
+def _decode_8byte_timestamp(buf: bytes, off: int) -> Optional[datetime.datetime]:
+    """Layout: ``[day][month][year_hi][year_lo][unknown][hour][min][sec]``."""
+    if off + 8 > len(buf):
+        return None
+    day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
+    year = (yh << 8) | yl
+    if not (2015 <= year <= 2050 and 1 <= mon <= 12 and 1 <= day <= 31
+            and 0 <= hr < 24 and 0 <= mn < 60 and 0 <= sc < 60):
+        return None
+    try:
+        return datetime.datetime(year, mon, day, hr, mn, sc)
+    except ValueError:
+        return None
+
+
+def extract_binary_metadata(buf: bytes) -> IdfBinaryMetadata:
+    """Pull serial/timestamp/sample_rate/record_time/calibration from the
+    sig-A binary header.
+
+    Field positions confirmed against UM11719_20231219162723.IDFW; stable
+    across the 151-file sig-A corpus.
    """
-    raise NotImplementedError(
-        "IDF binary codec not yet implemented; the .IDFW/.IDFH binary format "
-        "is undecoded.  Use parse_idf_report() on the paired .txt sidecar "
-        "for device-authoritative metadata."
+    md = IdfBinaryMetadata()
+
+    # Serial: null-terminated ASCII at 0x14E.
+    md.serial = _read_ascii_z(buf, 0x14E, maxlen=16)
+
+    # Sample rate + record time live in a BW-compatible compliance block.
+    # Locate the 6-byte anchor `be 80 00 00 00 00` and read offsets relative
+    # to it: anchor-6 = sample_rate uint16 BE; anchor+6 = record_time float32 BE.
+    anchor = buf.find(b"\xbe\x80\x00\x00\x00\x00", 0x800, 0xA00)
+    if anchor > 0:
+        sr_bytes = buf[anchor - 6 : anchor - 4]
+        if len(sr_bytes) == 2:
+            sr = int.from_bytes(sr_bytes, "big")
+            if sr in (256, 512, 1024, 2048, 4096):
+                md.sample_rate = sr
+        rt_bytes = buf[anchor + 6 : anchor + 10]
+        if len(rt_bytes) == 4:
+            try:
+                rt = struct.unpack(">f", rt_bytes)[0]
+                if 0.1 <= rt <= 600.0:
+                    md.record_time_sec = float(rt)
+            except struct.error:
+                pass
+
+    # Event timestamp: 8 bytes.  Position differs between IDFW (0x97A) and
+    # IDFH (0x9F8); try each known offset and accept the first valid decode.
+    for off in (0x97A, 0x9F8):
+        ts = _decode_8byte_timestamp(buf, off)
+        if ts is not None:
+            md.event_datetime = ts
+            break
+
+    # Calibration date: day, month, year_be at 0x194-0x197.
+    if len(buf) > 0x197:
+        day, mon = buf[0x194], buf[0x195]
+        year = int.from_bytes(buf[0x196 : 0x198], "big")
+        if 1 <= mon <= 12 and 1 <= day <= 31 and 2015 <= year <= 2050:
+            try:
+                md.calibration_date = datetime.date(year, mon, day)
+            except ValueError:
+                pass
+
+    return md
+
+
+# ─── Sample decoder + unit conversion ───────────────────────────────────────
+
+
+def _find_waveform_body_offset(buf: bytes) -> Optional[int]:
+    """Pick the file offset of the waveform body by trial-decoding every
+    ``00 02 00`` magic position past the fixed-header region.
+
+    The body's location isn't fixed across all sig-A IDFW files — about
+    half the production events use ``0x0f1f``, but the rest have offsets
+    that shift based on header padding / channel-config layout.  We
+    auto-detect by:
+
+      1. Find every ``00 02 00`` occurrence past ``_BODY_SCAN_FLOOR``.
+      2. Try ``decode_waveform_v2()`` on each candidate.
+      3. Pick the offset whose decoded sample count is largest.
+
+    Returns the offset, or ``None`` if no candidate yielded more than
+    the trivial 2-sample preamble (= "no real body found").
+
+    Costs ~2-8 trial decodes per file; in practice the first candidate
+    past 0x0e00 is usually the right one.
+    """
+    if len(buf) < _BODY_SCAN_FLOOR + 8:
+        return None
+    best: Optional[tuple[int, int]] = None   # (total_samples, offset)
+    i = _BODY_SCAN_FLOOR
+    while True:
+        j = buf.find(_BODY_MAGIC, i)
+        if j < 0:
+            break
+        i = j + 1
+        try:
+            decoded = decode_waveform_v2(buf[j:])
+        except Exception:
+            continue
+        if not decoded:
+            continue
+        total = sum(len(v) for v in decoded.values())
+        # A "real" body has more than just the 2-sample preamble.
+        if total <= 2:
+            continue
+        if best is None or total > best[0]:
+            best = (total, j)
+    return best[1] if best else None
+
+
+def _decode_waveform_samples(buf: bytes) -> Optional[dict]:
+    """Decode samples from the sig-A waveform body.
+
+    Returns the raw decoder counts dict — geo LSB = 0.0003 in/s, mic in
+    its own count unit (see :func:`mic_count_to_psi`).  Returns None if
+    no usable body is found.
+
+    Uses :func:`_find_waveform_body_offset` to locate the body — the
+    file-offset varies across events (~50% sit at the canonical
+    ``0x0f1f`` but the rest don't), so the previous hardcoded constant
+    silently produced 2-sample preamble-only output for half the corpus.
+    """
+    off = _find_waveform_body_offset(buf)
+    if off is None:
+        return None
+    return decode_waveform_v2(buf[off:])
+
+
+def geo_count_to_ips(count: int) -> float:
+    """Convert a Thor geo decoder count to in/s.  LSB = 0.0003 in/s."""
+    return count * _GEO_LSB_IPS
+
+
+def mic_count_to_psi(count: int) -> float:
+    """Convert a Thor mic decoder count to psi.  Scale derived from
+    regression over 50 sample pairs in UM11719_20231219162723.IDFW;
+    consistent to ~5%.  Calibration constants from the channel block
+    can refine this once decoded.
+    """
+    return count * _MIC_LSB_PSI
+
+
+# ─── IDFH histogram decoder ─────────────────────────────────────────────────
+
+
+@dataclass
+class IdfhInterval:
+    """One decoded histogram interval (typically one minute of monitoring)."""
+    offset:    int    # file byte offset of the 72-byte record
+    # Per-channel min/max ADC counts (int16 BE), half-period samples, peak count.
+    # Peak = max(|min|, |max|).  freq_hz = 512/halfp (None if halfp ≤ 5 →
+    # ">100 Hz" sentinel; matches sidecar convention).
+    tran_min:    int
+    tran_max:    int
+    tran_halfp:  int
+    vert_min:    int
+    vert_max:    int
+    vert_halfp:  int
+    long_min:    int
+    long_max:    int
+    long_halfp:  int
+    micl_min:    int
+    micl_max:    int
+    micl_halfp:  int
+
+    def peak_count(self, channel: str) -> int:
+        mn = getattr(self, f"{channel.lower()}_min")
+        mx = getattr(self, f"{channel.lower()}_max")
+        return max(abs(mn), abs(mx))
+
+    def peak_ips(self, channel: str) -> float:
+        """Convert peak count to in/s (geo channels only)."""
+        return self.peak_count(channel) / _IDFH_INT16_FS * _IDFH_GEO_FULL_SCALE
+
+    def freq_hz(self, channel: str) -> Optional[float]:
+        halfp = getattr(self, f"{channel.lower()}_halfp")
+        if halfp <= 5:
+            return None
+        return _IDFH_HALFP_FREQ_NUM / halfp
+
+
+def _decode_idfh_interval(buf72: bytes, offset: int) -> IdfhInterval:
+    """Decode one 72-byte interval record into per-channel min/max/halfp."""
+    fields = []
+    for i in range(4):
+        block = buf72[i * 16 : (i + 1) * 16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        # block[4:6] = int16 BE, role unknown (possibly time-of-peak)
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        # block[10:12] and block[14:16] are uint16 BE with unknown semantics
+        # (likely sum / count contributions for the PVS computation).
+        fields.extend([mn, mx, halfp])
+    # Tail 8 bytes (buf72[64:72]) carry PVS-related data; not yet decoded.
+    return IdfhInterval(
+        offset=offset,
+        tran_min=fields[0], tran_max=fields[1], tran_halfp=fields[2],
+        vert_min=fields[3], vert_max=fields[4], vert_halfp=fields[5],
+        long_min=fields[6], long_max=fields[7], long_halfp=fields[8],
+        micl_min=fields[9], micl_max=fields[10], micl_halfp=fields[11],
+    )
+
+
+def decode_idfh_body(buf: bytes) -> list:
+    """Walk an IDFH file and decode every interval record.
+
+    The body has one or more segments; each segment header is 10 bytes:
+    ``[length_be 2B][0a 00 00 00][00 NN_counter][05 3f]`` where ``length``
+    is bytes from the magic through the end of the interval block
+    (= 10 + 72 × n_intervals).  Segments are separated by a 2-byte tail
+    + next-segment 2-byte prefix (the bytes before the next length field).
+    Confirmed against the 859-file corpus (181,071 intervals decoded; 1
+    failure is the sig-B BE9439 file).
+    """
+    intervals: list = []
+    i = 0
+    while True:
+        j = buf.find(b"\x0a\x00\x00\x00", i)
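+        if j < 0:
+            break
+        # Validate: [length_be][0a 00 00 00][00 NN][05 3f].  Skip hits with
+        # no room for the 2-byte length before them, and slice (rather than
+        # index) byte j+4 so a hit near EOF can't raise IndexError.
+        if (j < 2 or buf[j + 4 : j + 5] != b"\x00"
+                or buf[j + 6 : j + 8] != b"\x05\x3f"):
+            i = j + 1
+            continue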
+        length = int.from_bytes(buf[j - 2 : j], "big")
+        n = (length - _IDFH_SEGMENT_HEADER) // _IDFH_INTERVAL_SIZE
+        if n <= 0:
+            i = j + 1
+            continue
+        header_start = j - 2
+        interval_start = header_start + _IDFH_SEGMENT_HEADER
+        for k in range(n):
+            off = interval_start + k * _IDFH_INTERVAL_SIZE
+            if off + _IDFH_INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off : off + _IDFH_INTERVAL_SIZE]
+            intervals.append(_decode_idfh_interval(chunk, off))
+        # Advance past this segment + the 2-byte tail.
+        i = header_start + length + _IDFH_SEGMENT_TAIL
+    return intervals
+
+
+# ─── Top-level reader ───────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfReadResult:
+    """Return type for :func:`read_idf_file`.
+
+    For waveforms (``.IDFW``), ``samples`` holds the per-channel sample
+    arrays in Thor decoder counts.  For histograms (``.IDFH``),
+    ``samples`` is empty and ``intervals`` holds the per-interval
+    record list (peaks, freqs).
+    """
+    event:           IdfEvent
+    samples:         dict   # {"Tran": [...], ...} for IDFW; empty for IDFH
+    binary_metadata: IdfBinaryMetadata
+    signature:       str    # always "thor" for now (sig-A genuine Thor)
+    intervals:       Optional[list] = None  # list[IdfhInterval] for IDFH; None for IDFW
+
+
+def read_idf_file(
+    path: Union[str, Path],
+    *,
+    data: Optional[bytes] = None,
+) -> IdfReadResult:
+    """Parse a Thor ``.IDFW`` / ``.IDFH`` binary into an ``IdfEvent``.
+
+    Handles genuine Thor (sig-A) files: waveforms return decoded
+    samples, histograms return per-interval records.  Stray Series III
+    (Blastware) binaries raise NotImplementedError; route those through
+    ``minimateplus.event_file_io.read_blastware_file()``.  Any other
+    signature raises ValueError.
+
+    Returns an :class:`IdfReadResult`.  The caller converts int sample
+    counts to physical units via :func:`geo_count_to_ips` /
+    :func:`mic_count_to_psi`.
+
+    ``path`` is used for filename in error messages and ``.IDFH`` vs
+    ``.IDFW`` suffix detection.  When ``data`` is supplied the disk
+    read is skipped — useful for ingest paths that already have the
+    bytes in memory and where the file may not exist on disk yet.
+    """
+    p = Path(path)
+    buf = data if data is not None else p.read_bytes()
+
+    if len(buf) < 16 or buf[6:16] != _INSTANTEL_TAG + b"\x00":
+        raise ValueError(f"{p.name}: not an IDF file (missing Instantel magic)")
+
+    sig_prefix = buf[:6]
+    if sig_prefix == _THOR_PREFIX:
+        signature = "thor"
+    elif sig_prefix == _BW_STRAY_PREFIX:
+        raise NotImplementedError(
+            f"{p.name}: file has a Series III (Blastware) STRT header in "
+            "an IDF-named container — not a Thor binary.  Route through "
+            "minimateplus.event_file_io.read_blastware_file() instead "
+            "(peaks decode; samples & full metadata don't, but it's not "
+            "Thor data so the Thor codec doesn't apply)."
+        )
+    else:
+        raise ValueError(f"{p.name}: unknown IDF signature {sig_prefix.hex()}")
+
+    is_histogram = p.suffix.upper() == ".IDFH"
+    md = extract_binary_metadata(buf)
+
+    if is_histogram:
+        intervals = decode_idfh_body(buf)
+        if not intervals:
+            raise ValueError(f"{p.name}: IDFH body decoded no intervals")
+        # Peaks: max across all intervals on each channel (per-channel max
+        # of stored max-magnitudes; sidecar's PPV row carries the same).
+        peak_tran = max((iv.peak_ips("Tran") for iv in intervals), default=0.0)
+        peak_vert = max((iv.peak_ips("Vert") for iv in intervals), default=0.0)
+        peak_long = max((iv.peak_ips("Long") for iv in intervals), default=0.0)
+        # Mic peak in psi — Thor stores per-interval mic ADC counts in the
+        # binary; convert the max count to psi via the per-count factor.
+        mic_peak_count = max((iv.peak_count("MicL") for iv in intervals), default=0)
+        mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+        rep = IdfReport(
+            serial_number=md.serial,
+            event_type="Full Histogram",
+            event_datetime=md.event_datetime,
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+        )
+        peaks = IdfPeaks(
+            transverse_ips=peak_tran,
+            vertical_ips=peak_vert,
+            longitudinal_ips=peak_long,
+            peak_vector_sum_ips=None,
+            mic_pspl_dbl=None,         # IDFH binary doesn't carry the dB(L) value
+            mic_pspl_psi=mic_peak_psi,
+        )
+        event = IdfEvent(
+            serial=md.serial or "UNKNOWN",
+            timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+            kind="Histogram",
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+            peaks=peaks,
+            report=rep,
+        )
+        return IdfReadResult(
+            event=event,
+            samples={},
+            binary_metadata=md,
+            signature=signature,
+            intervals=intervals,
+        )
+
+    # Waveform path.
+    decoded = _decode_waveform_samples(buf)
+    if decoded is None:
+        raise ValueError(f"{p.name}: waveform body codec failed")
+
+    rep = IdfReport(
+        serial_number=md.serial,
+        event_type="Full Waveform",
+        event_datetime=md.event_datetime,
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+    )
+
+    def _peak_ips(ch: str) -> float:
+        arr = decoded.get(ch, [])
+        return geo_count_to_ips(max((abs(v) for v in arr), default=0))
+
+    # Mic peak psi from binary: max absolute MicL ADC count × 2.14e-6 psi/count.
+    mic_arr = decoded.get("MicL", [])
+    mic_peak_count = max((abs(v) for v in mic_arr), default=0)
+    mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+
+    peaks = IdfPeaks(
+        transverse_ips=_peak_ips("Tran"),
+        vertical_ips=_peak_ips("Vert"),
+        longitudinal_ips=_peak_ips("Long"),
+        # PVS requires aligned per-sample √(T²+V²+L²); leave None — the
+        # sidecar carries it and the bridge picks it up if present.
+        peak_vector_sum_ips=None,
+        mic_pspl_dbl=None,             # binary IDFW doesn't carry the dB(L) value;
+                                       # sidecar .txt fills it via IdfReport.from_dict
+        mic_pspl_psi=mic_peak_psi,
+    )
+
+    event = IdfEvent(
+        serial=md.serial or "UNKNOWN",
+        timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+        kind="Waveform",
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+        peaks=peaks,
+        report=rep,
+    )
+
+    return IdfReadResult(
+        event=event,
+        samples=decoded,
+        binary_metadata=md,
+        signature=signature,
    )
@@ -0,0 +1,323 @@
+"""
+micromate/idf_to_bw_report.py — adapter that projects a parsed Thor IDF
+report (+ binary metadata + decoded IDFH intervals) into the
+``bw_report``-shaped dict that :mod:`sfm.report_pdf.gather_report_data`
+consumes.
+
+Lets Thor events flow through the existing Series III Event Report PDF
+pipeline without duplicating the renderer.  Thor's report content is
+~95% the same data shape as BW's; the field names differ but the
+underlying metrics map 1:1.
+
+Caveats
+───────
+
+- **Mic units** — Thor records ``MicPSPL`` natively in dB(L).  This
+  adapter sets ``bw_report.mic.pspl_dbl`` directly; the report
+  renderer recomputes the equivalent psi via its dBL→psi formula.
+- **Saturation / above-range flags** — Thor doesn't always mark
+  ``OORANGE`` the way BW does; we set ``zc_freq_above_range`` only
+  when a `>100` sentinel was preserved in the raw text.
+- **Per-interval data** — for IDFH events we build ``interval_times``
+  by stepping ``IntervalSize`` from ``HistogramStartTime``; the binary
+  decoder confirms one record per step (882 / 881 / 881 ... across
+  the corpus).
+- **calibration_by parsing** — Thor's free-form ``Calibration : November
+  22, 2023 by Instantel`` is split on ``" by "`` to extract the
+  calibrator; the date prefix is parsed where possible, otherwise
+  the binary-extracted ``calibration_date`` from
+  :class:`micromate.idf_file.IdfBinaryMetadata` wins.
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any, Dict, List, Optional
+
+
+# ─── Helpers ────────────────────────────────────────────────────────────────
+
+
+_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")
+
+
+def _parse_first_number(s: Optional[str]) -> Optional[float]:
+    """Pull the first numeric token from a string like ``"0.1500 in/s"``."""
+    if s is None:
+        return None
+    m = _NUM_RE.search(str(s))
+    if not m:
+        return None
+    try:
+        return float(m.group(0))
+    except ValueError:
+        return None
+
+
+def _parse_interval_size_s(s: Optional[str]) -> Optional[float]:
+    """``"60 sec"`` → 60.0, ``"5 min"`` → 300.0, ``"1 hour"`` → 3600."""
+    if s is None:
+        return None
+    num = _parse_first_number(s)
+    if num is None:
+        return None
+    sl = str(s).lower()
+    if "hour" in sl or "hr" in sl:
+        return num * 3600.0
+    if "min" in sl:
+        return num * 60.0
+    return num   # default to seconds
+
+
+def _parse_calibration(text: Optional[str]) -> tuple[Optional[str], Optional[str]]:
+    """Split ``"November 22, 2023 by Instantel"`` → (ISO date, calibrator).
+
+    Returns ``(None, None)`` if neither half parses.
+    """
+    if not text:
+        return None, None
+    parts = str(text).split(" by ", 1)
+    date_part = parts[0].strip() if parts else None
+    by_part = parts[1].strip() if len(parts) > 1 else None
+    iso_date: Optional[str] = None
+    if date_part:
+        for fmt in ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y"):
+            try:
+                iso_date = datetime.datetime.strptime(date_part, fmt).date().isoformat()
+                break
+            except ValueError:
+                continue
+    return iso_date, by_part
+
+
+def _channel_peaks(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    """Map ``tran_ppv`` / ``tran_zc_freq`` / ... → bw_report.peaks.tran shape."""
+    out: Dict[str, Any] = {}
+    for src, dst in (
+        (f"{ch_lc}_ppv",                 "ppv_ips"),
+        (f"{ch_lc}_zc_freq",             "zc_freq_hz"),
+        (f"{ch_lc}_time_of_peak",        "time_of_peak_s"),
+        (f"{ch_lc}_peak_acceleration",   "peak_accel_g"),
+        (f"{ch_lc}_peak_displacement",   "peak_disp_in"),
+    ):
+        v = idf.get(src)
+        if v is not None:
+            out[dst] = v
+    # ZC freq ">100" sentinel: when the parser could not type the value, the
+    # key still holds the raw string (e.g. ``idf["tran_zc_freq"] == ">100"``).
+    # Detect that case, set the flag, and drop the non-numeric ``zc_freq_hz``.
+    raw_zc = idf.get(f"{ch_lc}_zc_freq")
+    if isinstance(raw_zc, str) and ">" in raw_zc:
+        out["zc_freq_above_range"] = True
+        out.pop("zc_freq_hz", None)
+    return out
+
+
+def _sensor_check(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    out: Dict[str, Any] = {}
+    fr = idf.get(f"{ch_lc}_test_freq")
+    if fr is not None:
+        out["freq_hz"] = _parse_first_number(fr)
+    rt = idf.get(f"{ch_lc}_test_ratio")
+    if rt is not None:
+        out["ratio"] = _parse_first_number(rt)
+    am = idf.get(f"{ch_lc}_test_amplitude")
+    if am is not None:
+        out["amplitude_mv"] = _parse_first_number(am)
+    res = idf.get(f"{ch_lc}_test_results")
+    if res is not None:
+        out["result"] = str(res).strip()
+    return {k: v for k, v in out.items() if v is not None}
+
+
+def _interval_times(idf: Dict[str, Any], n_intervals: Optional[int]) -> List[str]:
+    """Synthesise interval-end timestamps: start + interval_size × (k + 1).
+
+    Returns ``[]`` when the count, start time, or interval size is unknown.
+    """
+    if not n_intervals:
+        return []
+    start_date = idf.get("histogram_start_date") or idf.get("event_date")
+    start_time = idf.get("histogram_start_time") or idf.get("event_time")
+    iv_str = idf.get("interval_size")
+    iv_s = _parse_interval_size_s(iv_str)
+    if not (start_date and start_time and iv_s):
+        return []
+    try:
+        t0 = datetime.datetime.strptime(f"{start_date} {start_time}", "%Y-%m-%d %H:%M:%S")
+    except ValueError:
+        return []
+    out = []
+    for k in range(int(n_intervals)):
+        t = t0 + datetime.timedelta(seconds=iv_s * (k + 1))
+        out.append(t.isoformat())
+    return out
+
+
+# ─── Top-level adapter ──────────────────────────────────────────────────────
+
+
+def build_bw_report_from_idf(
+    idf_report: Dict[str, Any],
+    *,
+    binary_md=None,
+    intervals: Optional[list] = None,
+    is_histogram: Optional[bool] = None,
+) -> Dict[str, Any]:
+    """Project a parsed IDF report dict (and optional binary metadata +
+    decoded IDFH intervals) into the BW report sidecar shape.
+
+    The returned dict is structurally identical to what
+    ``minimateplus.event_file_io._bw_report_to_dict`` produces from a
+    real BW ASCII report — it can be assigned to
+    ``sidecar["bw_report"]`` and consumed verbatim by
+    ``sfm.report_pdf.gather_report_data``.
+
+    ``intervals`` is the list of :class:`micromate.idf_file.IdfhInterval`
+    objects from :func:`micromate.idf_file.decode_idfh_body`; only used
+    for histogram events to derive accurate ``interval_times``.
+    """
+    if is_histogram is None:
+        et = str(idf_report.get("event_type", ""))
+        is_histogram = et.lower().startswith("full histogram")
+
+    # ── Trigger / recording / device ─────────────────────────────────────
+    trigger_channel = idf_report.get("trigger")
+    trigger_level   = _parse_first_number(idf_report.get("geo_trigger_level"))
+    geo_range_ips   = _parse_first_number(idf_report.get("geo_range"))
+
+    cal_iso, cal_by = _parse_calibration(idf_report.get("calibration"))
+    # Fall back to the binary-extracted calibration_date when the text parse
+    # fails; the binary date is unambiguous.
+    if cal_iso is None and binary_md is not None and binary_md.calibration_date:
+        cal_iso = binary_md.calibration_date.isoformat()
+
+    # ── Histogram fields ────────────────────────────────────────────────
+    hist_block: Dict[str, Any] = {
+        "start": None, "stop": None, "n_intervals": None,
+        "interval_size": None, "interval_size_s": None,
+        "channel_peak_when": {},
+    }
+    if is_histogram:
+        sd = idf_report.get("histogram_start_date")
+        st = idf_report.get("histogram_start_time")
+        if sd and st:
+            try:
+                hist_block["start"] = datetime.datetime.strptime(
+                    f"{sd} {st}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        ed = idf_report.get("histogram_stop_date")
+        et_ = idf_report.get("histogram_stop_time")
+        if ed and et_:
+            try:
+                hist_block["stop"] = datetime.datetime.strptime(
+                    f"{ed} {et_}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        n_raw = idf_report.get("number_of_intervals")
+        if n_raw is not None:
+            try:
+                # Thor reports a float like "81.04"; truncate to int (the BW
+                # report uses an int for the column).
+                hist_block["n_intervals"] = int(float(str(n_raw)))
+            except ValueError:
+                pass
+        # When the binary decoder gave us the actual interval count, prefer it.
+        if intervals is not None:
+            hist_block["n_intervals"] = len(intervals)
+        hist_block["interval_size"] = idf_report.get("interval_size")
+        hist_block["interval_size_s"] = _parse_interval_size_s(idf_report.get("interval_size"))
+        # interval_times derived from start+step (the BW report uses the
+        # exact strings; we match its representation).
+        times = _interval_times(idf_report, hist_block["n_intervals"])
+        # Per-channel peak when (absolute date+time at which the channel's
+        # peak occurred over the histogram run).  Thor splits this into
+        # ``TranPeakDate`` / ``TranPeakTime`` etc.
+        peak_when: Dict[str, str] = {}
+        for ch_label, ch_lc in (("Tran", "tran"), ("Vert", "vert"), ("Long", "long"), ("MicL", "mic")):
+            d = idf_report.get(f"{ch_lc}_peak_date")
+            t = idf_report.get(f"{ch_lc}_peak_time")
+            if d and t:
+                try:
+                    peak_when[ch_label] = datetime.datetime.strptime(
+                        f"{d} {t}", "%Y-%m-%d %H:%M:%S"
+                    ).isoformat()
+                except ValueError:
+                    continue
+        if peak_when:
+            hist_block["channel_peak_when"] = peak_when
+
+    # ── Mic block ────────────────────────────────────────────────────────
+    mic_block = {
+        "weighting":           "L",                   # Thor mic is ISEE Linear
+        "pspl_dbl":            idf_report.get("mic_ppv"),  # the dB(L) float
+        "pspl_saturated":      False,
+        "zc_freq_hz":          idf_report.get("mic_zc_freq"),
+        "zc_freq_above_range": isinstance(idf_report.get("mic_zc_freq"), str)
+                               and ">" in str(idf_report.get("mic_zc_freq")),
+        "time_of_peak_s":      idf_report.get("mic_time_of_peak"),
+    }
+    if mic_block["zc_freq_above_range"]:
+        mic_block["zc_freq_hz"] = None
+
+    # ── Peaks ────────────────────────────────────────────────────────────
+    vs_block = {
+        "ips":       idf_report.get("peak_vector_sum"),
+        "time_s":    _parse_first_number(idf_report.get("peak_vector_sum_time_sum")),
+        "when":      None,
+        "saturated": False,
+    }
+    if is_histogram:
+        # PVS absolute date+time, when present.
+        vs_d = idf_report.get("peak_vector_sum_date")
+        vs_t = idf_report.get("peak_vector_sum_time")
+        if vs_d and vs_t:
+            try:
+                vs_block["when"] = datetime.datetime.strptime(
+                    f"{vs_d} {vs_t}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+
+    return {
+        "available":  True,
+        "event_type": idf_report.get("event_type"),
+        "version":    idf_report.get("version"),
+        "trigger": {
+            "channel":       trigger_channel,
+            "geo_level_ips": trigger_level,
+        },
+        "recording": {
+            "sample_rate_sps":  idf_report.get("sample_rate"),
+            "record_time_s":    idf_report.get("record_time_sec"),
+            "pretrig_s":        idf_report.get("pre_trigger_sec"),
+            "stop_mode":        idf_report.get("record_stop_mode"),
+            "geo_range_ips":    geo_range_ips,
+            "units":            idf_report.get("units"),
+        },
+        "device": {
+            "battery_volts":    idf_report.get("battery_volts"),
+            "calibration_date": cal_iso,
+            "calibration_by":   cal_by,
+        },
+        "peaks": {
+            "tran":       _channel_peaks(idf_report, "tran"),
+            "vert":       _channel_peaks(idf_report, "vert"),
+            "long":       _channel_peaks(idf_report, "long"),
+            "vector_sum": vs_block,
+        },
+        "mic":          mic_block,
+        "sensor_check": {
+            "tran": _sensor_check(idf_report, "tran"),
+            "vert": _sensor_check(idf_report, "vert"),
+            "long": _sensor_check(idf_report, "long"),
+            "mic":  _sensor_check(idf_report, "mic"),
+        },
+        "histogram":    hist_block,
+        "monitor_log":  [],
+        "pc_sw_version": None,
+    }
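The interval-timestamp synthesis in `_interval_times` above steps by `(k + 1)`, so each timestamp marks the end of its interval. A minimal standalone sketch of that stepping follows; the helper name `interval_end_times` and the sample start time are hypothetical and not part of the diff.

```python
import datetime
from typing import List


def interval_end_times(start: datetime.datetime, step_s: float, n: int) -> List[str]:
    """Return ISO timestamps for the end of each of ``n`` intervals."""
    return [
        # k + 1 so the first entry is one full step after the start time.
        (start + datetime.timedelta(seconds=step_s * (k + 1))).isoformat()
        for k in range(n)
    ]


# Hypothetical histogram: three 60-second intervals starting at 22:00:00.
times = interval_end_times(datetime.datetime(2026, 5, 16, 22, 0, 0), 60.0, 3)
```

With these inputs the first entry falls one minute after the start, not on the start itself.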
@@ -159,12 +159,23 @@ class IdfReport:

@dataclass
 class IdfPeaks:
-    """Geophone + mic peak values for one Thor event.  Native Thor units."""
+    """Geophone + mic peak values for one Thor event.  Native Thor units.
+
+    Thor stores the mic peak in two parallel forms.  ``mic_pspl_dbl`` is
+    the dB(L) value carried by the sidecar's top-level ``MicPSPL`` header
+    field and shown in the report header.  ``mic_pspl_psi`` is the psi
+    value derived either from the IDFW sample table / IDFH interval
+    column 9, or from the binary mic counts (~2.14e-6 psi/count).  The psi
+    form is needed because the BW-shaped ``PeakValues.micl`` consumed by
+    ``event_hdf5.write_event_hdf5`` expects psi; feeding it dB(L) makes the
+    h5 mic-chart scale factor blow up.
+    """
    transverse_ips:    Optional[float] = None    # in/s
    vertical_ips:      Optional[float] = None    # in/s
    longitudinal_ips:  Optional[float] = None    # in/s
    peak_vector_sum_ips: Optional[float] = None  # in/s
    mic_pspl_dbl:      Optional[float] = None    # dB(L)
+    mic_pspl_psi:      Optional[float] = None    # psi


@dataclass
@@ -324,10 +335,14 @@ class IdfEvent:
        machinery without those code paths needing to know about Thor.

        Caveats of the bridge:
-          - ``mic_ppv`` on the produced Event carries Thor's dB(L) value
-            verbatim — the UI distinguishes via the ``device_family``
-            column (Phase 1).  Don't run the BW psi→dBL converter on
-            Series IV rows.
+          - ``PeakValues.micl`` carries the mic peak in **psi** (matching
+            BW's convention) — set from :attr:`IdfPeaks.mic_pspl_psi`,
+            with a dB(L)→psi fallback when only the dB(L) value is
+            available.  This is what the h5 writer's mic-scale-factor
+            logic needs.  The dB(L) value still flows through
+            ``bw_report.mic.pspl_dbl`` (set by the
+            ``idf_to_bw_report`` adapter) and the renderer reads it
+            from there for the report header.
          - Many Thor-specific fields (Peak Acceleration / Displacement,
            sensor self-check, calibration) don't have a slot in
            ``Event``.  The full IdfReport is preserved on the
@@ -349,11 +364,17 @@ class IdfEvent:
            minute=self.timestamp.minute,
            second=self.timestamp.second,
        )
+        # Resolve mic peak as psi.  Priority: binary-derived mic_pspl_psi
+        # (set by read_idf_file) > dB(L)→psi fallback via standard formula
+        # (psi = 2.9e-9 × 10^(dBL/20)) > None.
+        mic_psi = self.peaks.mic_pspl_psi
+        if mic_psi is None and self.peaks.mic_pspl_dbl is not None:
+            mic_psi = 2.9e-9 * (10.0 ** (self.peaks.mic_pspl_dbl / 20.0))
        pv = PeakValues(
            tran=self.peaks.transverse_ips,
            vert=self.peaks.vertical_ips,
            long=self.peaks.longitudinal_ips,
-            micl=self.peaks.mic_pspl_dbl,   # dB(L) — see caveat above
+            micl=mic_psi,   # psi, matching BW's convention (h5 scaling depends on this)
            peak_vector_sum=self.peaks.peak_vector_sum_ips,
        )
        pi = ProjectInfo(
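The dB(L)→psi fallback in the hunk above uses `psi = 2.9e-9 × 10^(dBL/20)`, where 2.9e-9 psi is approximately the 20 µPa reference pressure. A small sketch of that conversion and its inverse, assuming only the constant shown in the diff; the names `P_REF_PSI`, `dbl_to_psi` and `psi_to_dbl` are hypothetical.

```python
import math

# Reference pressure in psi (about 20 µPa), the constant used in the diff above.
P_REF_PSI = 2.9e-9


def dbl_to_psi(dbl: float) -> float:
    """Convert a linear-weighted sound pressure level in dB(L) to psi."""
    return P_REF_PSI * (10.0 ** (dbl / 20.0))


def psi_to_dbl(psi: float) -> float:
    """Inverse conversion; ``psi`` must be positive."""
    return 20.0 * math.log10(psi / P_REF_PSI)
```

Because every 20 dB is a factor of ten in pressure, a 120 dB(L) peak maps to roughly 2.9e-3 psi. This is why passing the dB(L) number where psi is expected overstates the value by several orders of magnitude.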
@@ -60,6 +60,18 @@ class ChannelStats:
    time_of_peak_s:    Optional[float] = None      # seconds (relative to trigger; can be negative)
    peak_accel_g:      Optional[float] = None      # g               (geo channels only)
    peak_disp_in:      Optional[float] = None      # in              (geo channels only)
+    # When BW writes "OORANGE" (Out Of Range — truncated) for a PPV
+    # value, the true peak exceeded the channel's full-scale range.
+    # We substitute the range max (e.g. 10.000 in/s for Normal range)
+    # as a lower bound, and flag here so downstream UI / alerts know
+    # to render "> 10 in/s" or "saturated" instead of trusting the
+    # value as an exact measurement.
+    ppv_saturated:     bool = False
+    # Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
+    # algorithm's peak frequency exceeded the device's reporting
+    # ceiling (typically 100 Hz on V10.72).  zc_freq_hz gets the
+    # threshold (100.0) as a lower bound; downstream UI renders ">100".
+    zc_freq_above_range: bool = False


@dataclass
@@ -69,6 +81,14 @@ class MicStats:
    pspl_dbl:          Optional[float] = None      # dB(L)
    zc_freq_hz:        Optional[float] = None
    time_of_peak_s:    Optional[float] = None
+    # Set when BW writes "OORANGE" for PSPL — mic exceeded its
+    # measurement range.  pspl_dbl gets 140 dBL as a conservative lower
+    # bound (typical NL-43 max; some units cap at 148).  Consumers
+    # should render "> 140 dB(L)" or similar when this flag is set.
+    pspl_saturated:    bool = False
+    # Same semantics as ChannelStats.zc_freq_above_range — mic ZC
+    # peak exceeded device reporting ceiling.
+    zc_freq_above_range: bool = False


@dataclass
@@ -92,6 +112,35 @@ class MonitorLogEntry:
    description: Optional[str] = None


+# BW saturation marker — appears in PPV / Peak Vector Sum / similar
+# numeric fields when the underlying measurement exceeded the
+# channel's full-scale range (e.g., a geophone reading > 10 in/s at
+# Normal range, or a mic exceeding its sensitivity ceiling).  Treated
+# as "≥ range_max" + a saturated flag rather than discarded.
+# Appears as: ``"Tran PPV : OORANGE in/s"``
+_OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")
+
+
+def _is_oorange(value: str) -> bool:
+    """True when a BW numeric field is an Out-Of-Range saturation marker."""
+    s = value.strip().upper()
+    return any(m in s for m in _OORANGE_MARKERS)
+
+
+def _parse_above_range(value: str) -> Optional[float]:
+    """For BW "above-range" markers like ">100 Hz", return the threshold.
+
+    BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
+    a peak too fast to count (device cuts off at 100 Hz).  Returns the
+    numeric portion after the '>' (e.g. 100.0), or None if `value` is
+    not an above-range marker.
+    """
+    s = value.strip()
+    if not s.startswith(">"):
+        return None
+    return _parse_number(s[1:])
+
+
@dataclass
 class BwAsciiReport:
    """Structured representation of one BW per-event ASCII export."""
@@ -144,6 +193,29 @@ class BwAsciiReport:
    # ── Vector sum ──────────────────────────────────────────────────────────
    peak_vector_sum_ips:    Optional[float] = None
    peak_vector_sum_time_s: Optional[float] = None
+    # Saturation flag — set when BW writes "OORANGE" for the PVS.  We
+    # then substitute sqrt(3) * geo_range_ips as a conservative upper
+    # bound (the theoretical maximum PVS when all 3 geo channels are
+    # simultaneously at full-scale).  Consumers should display this as
+    # ">{value} in/s" or similar.
+    peak_vector_sum_saturated: bool = False
+    # Histograms additionally have an absolute date+time for the PVS
+    # (it occurred at a specific interval).  Waveform reports show
+    # only the relative-time value above.
+    peak_vector_sum_when:   Optional[datetime.datetime] = None
+
+    # ── Histogram-specific fields (populated only when Event Type starts
+    # with 'Histogram' / 'Full Histogram' / 'Histogram + Continuous') ──
+    histogram_start:        Optional[datetime.datetime] = None
+    histogram_stop:         Optional[datetime.datetime] = None
+    histogram_n_intervals:  Optional[int]   = None      # e.g. 4, 1436
+    histogram_interval_size_str: Optional[str]   = None  # "1 minute" / "5 minutes" / "15 seconds"
+    histogram_interval_size_s:   Optional[float] = None  # parsed to seconds
+    # Per-channel absolute peak time+date (histogram-specific).  For
+    # waveform events these are None — those reports use the channel's
+    # time_of_peak_s (relative to trigger) instead.  Keyed by channel
+    # name ("Tran", "Vert", "Long", "MicL").
+    channel_peak_when:      Dict[str, datetime.datetime] = field(default_factory=dict)

    # ── Sensor self-check (per channel) ─────────────────────────────────────
    sensor_check:      Dict[str, SensorCheck] = field(default_factory=dict)
@@ -223,6 +295,46 @@ def _parse_event_date(s: str) -> Optional[datetime.date]:
        return None


+def _parse_iso_date(s: str) -> Optional[datetime.date]:
+    """Parse "2026-05-16" → date.  Histograms use ISO format for their
+    Start Date / Stop Date / Peak Date fields; waveforms use the
+    "May 8, 2026" long form which `_parse_event_date` handles."""
+    s = s.strip()
+    try:
+        return datetime.date.fromisoformat(s)
+    except ValueError:
+        return None
+
+
+_INTERVAL_UNIT_SECONDS = {
+    "second": 1, "seconds": 1, "sec": 1, "secs": 1,
+    "minute": 60, "minutes": 60, "min": 60, "mins": 60,
+    "hour": 3600, "hours": 3600, "hr": 3600, "hrs": 3600,
+}
+
+
+def _parse_interval_size(s: str) -> Optional[float]:
+    """Parse "1 minute" / "5 minutes" / "15 seconds" / "2 seconds" → seconds.
+
+    Handles the BW Compliance Setup → Histogram Interval values verbatim
+    ("2 seconds", "5 seconds", "15 seconds", "1 minute", "5 minutes",
+    "15 minutes") plus a few defensive variants.
+    """
+    if not s:
+        return None
+    parts = s.strip().split()
+    if len(parts) < 2:
+        return None
+    try:
+        n = float(parts[0])
+    except ValueError:
+        return None
+    seconds_per_unit = _INTERVAL_UNIT_SECONDS.get(parts[1].lower())
+    if seconds_per_unit is None:
+        return None
+    return n * seconds_per_unit
+
+
 def _parse_event_time(s: str) -> Optional[datetime.time]:
    """Parse "15:56:35" → time."""
    s = s.strip()
@@ -336,6 +448,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
    in_user_notes_block = False
    user_note_position = 0

+    # Histogram-field staging — BW writes <Channel> Peak Time and
+    # <Channel> Peak Date on separate lines (and similarly Histogram
+    # Start Time / Date).  We stash the partial value when the time
+    # line arrives and combine it when the matching date line arrives.
+    _hist_start_time: Optional[datetime.time] = None
+    _hist_stop_time:  Optional[datetime.time] = None
+    _pending_peak_time: Dict[str, Optional[datetime.time]] = {}
+    _pvs_time_raw: Optional[str] = None  # last Peak Vector Sum Time value, raw
+
    while i < n:
        raw_line = lines[i]
        i += 1
@@ -420,24 +541,113 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
        ):
            ch_name, stat = key.split(" ", 1)
            cs = report.channels.setdefault(ch_name, ChannelStats())
-            num = _parse_number(value)
-            if   stat == "PPV":                 cs.ppv_ips        = num
-            elif stat == "ZC Freq":             cs.zc_freq_hz     = num
-            elif stat == "Time of Peak":        cs.time_of_peak_s = num
-            elif stat == "Peak Acceleration":   cs.peak_accel_g   = num
-            elif stat == "Peak Displacement":   cs.peak_disp_in   = num
+            if stat == "PPV":
+                if _is_oorange(value):
+                    # Channel saturated — substitute range max as lower
+                    # bound; flag so downstream UI can render "> 10 in/s".
+                    cs.ppv_ips       = report.geo_range_ips
+                    cs.ppv_saturated = True
+                else:
+                    cs.ppv_ips = _parse_number(value)
+            elif stat == "ZC Freq":
+                # ">100 Hz" → store threshold + flag; numeric → parse normally
+                threshold = _parse_above_range(value)
+                if threshold is not None:
+                    cs.zc_freq_hz = threshold
+                    cs.zc_freq_above_range = True
+                else:
+                    cs.zc_freq_hz = _parse_number(value)
+            else:
+                num = _parse_number(value)
+                if   stat == "Time of Peak":        cs.time_of_peak_s = num
+                elif stat == "Peak Acceleration":   cs.peak_accel_g   = num
+                elif stat == "Peak Displacement":   cs.peak_disp_in   = num
+
+        # ── Histogram-specific fields ────────────────────────────────────────
+        # Histograms have Start/Stop time+date pairs + an interval count
+        # and size, plus per-channel absolute Peak Time/Date instead of
+        # the waveform's relative Time of Peak.
+        elif key == "Histogram Start Time":
+            _hist_start_time = _parse_event_time(value)
+        elif key == "Histogram Start Date":
+            _d = _parse_iso_date(value)
+            if _d and _hist_start_time:
+                report.histogram_start = datetime.datetime.combine(_d, _hist_start_time)
+        elif key == "Histogram Stop Time":
+            _hist_stop_time = _parse_event_time(value)
+        elif key == "Histogram Stop Date":
+            _d = _parse_iso_date(value)
+            if _d and _hist_stop_time:
+                report.histogram_stop = datetime.datetime.combine(_d, _hist_stop_time)
+        elif key == "Number of Intervals":
+            try:
+                report.histogram_n_intervals = int(float(value.strip()))
+            except ValueError:
+                pass
+        elif key == "Interval Size":
+            report.histogram_interval_size_str = value.strip()
+            report.histogram_interval_size_s   = _parse_interval_size(value)
+
+        # ── Per-channel histogram Peak Date / Peak Time ──
+        # Lines like "Tran Peak Time : 22:31:38" + "Tran Peak Date : 2026-05-16"
+        elif key in ("Tran Peak Time", "Vert Peak Time", "Long Peak Time", "MicL Time"):
+            ch_name = "MicL" if key == "MicL Time" else key.split(" ", 1)[0]
+            _pending_peak_time[ch_name] = _parse_event_time(value)
+        elif key in ("Tran Peak Date", "Vert Peak Date", "Long Peak Date", "MicL Date"):
+            ch_name = "MicL" if key == "MicL Date" else key.split(" ", 1)[0]
+            _d = _parse_iso_date(value)
+            _t = _pending_peak_time.get(ch_name)
+            if _d and _t:
+                report.channel_peak_when[ch_name] = datetime.datetime.combine(_d, _t)

        # ── Vector Sum ───────────────────────────────────────────────────────
        elif key == "Peak Vector Sum":
-            report.peak_vector_sum_ips = _parse_number(value)
-        elif key == "Peak Vector Sum Time":
+            if _is_oorange(value):
+                # PVS saturated — conservative upper bound is
+                # sqrt(3) * geo_range_ips (all 3 channels at full-scale).
+                # Real PVS could be lower (channels rarely peak
+                # simultaneously) but never higher within the range.
+                if report.geo_range_ips is not None:
+                    import math as _math
+                    report.peak_vector_sum_ips = _math.sqrt(3) * report.geo_range_ips
+                report.peak_vector_sum_saturated = True
+            else:
+                report.peak_vector_sum_ips = _parse_number(value)
+        # BW writes the PVS-time label with a typo: "Peak Vector Sum TimeSum"
+        # (looks like Sum got appended twice).  Accept both forms.  Confirmed
+        # against actual BW output on 2026-05-27 — every PVS-time line in
+        # the field examples (T190, T438, K557) uses the typo'd label.
+        elif key in ("Peak Vector Sum Time", "Peak Vector Sum TimeSum"):
            report.peak_vector_sum_time_s = _parse_number(value)
+            _pvs_time_raw = value
+        elif key == "Peak Vector Sum Date":
+            # Histogram-mode PVS gets paired with a date.  We may have
+            # captured 'Peak Vector Sum Time' as either a relative
+            # seconds float (waveform) or an HH:MM:SS string we
+            # interpreted as a number.  For histograms, BW writes
+            # "Peak Vector Sum Time : 22:33:52" which _parse_number
+            # parses as 22.0 (loses information).  When Peak Vector Sum
+            # Date arrives, re-parse the previous PVS time line as a
+            # clock time and combine into an absolute datetime.
+            _d = _parse_iso_date(value)
+            if _d and _pvs_time_raw is not None:
+                _t = _parse_event_time(_pvs_time_raw)
+                if _t:
+                    report.peak_vector_sum_when = datetime.datetime.combine(_d, _t)
+                    # The earlier seconds parse was bogus for histograms;
+                    # clear it so consumers don't think it's a real offset.
+                    report.peak_vector_sum_time_s = None

        # ── Microphone block ────────────────────────────────────────────────
        elif key == "Microphone":
            report.mic.weighting = value
        elif key == "MicL PSPL":
-            report.mic.pspl_dbl = _parse_number(value)
+            if _is_oorange(value):
+                # Mic saturated: substitute 140 dBL as a conservative lower bound.
+                report.mic.pspl_dbl       = 140.0
+                report.mic.pspl_saturated = True
+            else:
+                report.mic.pspl_dbl = _parse_number(value)
            # Mirror onto the "MicL" entry in channels so callers querying
            # `channels["MicL"].ppv_ips` see something — but it's dB(L), not
            # in/s, so we store as-is in the MicStats and mark the channel.
@@ -446,9 +656,15 @@ def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwA
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.time_of_peak_s = report.mic.time_of_peak_s
        elif key == "MicL ZC Freq":
-            report.mic.zc_freq_hz = _parse_number(value)
+            threshold = _parse_above_range(value)
+            if threshold is not None:
+                report.mic.zc_freq_hz         = threshold
+                report.mic.zc_freq_above_range = True
+            else:
+                report.mic.zc_freq_hz = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
-            cs.zc_freq_hz = report.mic.zc_freq_hz
+            cs.zc_freq_hz          = report.mic.zc_freq_hz
+            cs.zc_freq_above_range = report.mic.zc_freq_above_range

        # ── Sensor self-check ────────────────────────────────────────────────
        elif key in (
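The sentinel handling in the parser hunks above (an `OORANGE` marker replaced by the range maximum, a `>100 Hz` value stored as its threshold, both with a flag) can be sketched in isolation. This is a simplified illustration, not the parser's code: `parse_field` is a hypothetical helper that merges the two flags into one, and the regex stands in for `_parse_number`, whose body is not shown in this diff.

```python
import re
from typing import Optional, Tuple

_NUM = re.compile(r"-?\d+(?:\.\d+)?")          # assumed numeric-token pattern
OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")  # markers listed in the diff


def parse_field(value: str, range_max: Optional[float]) -> Tuple[Optional[float], bool]:
    """Return ``(number, flagged)`` for one report field value.

    ``flagged`` is True when the number is a bound, not an exact reading.
    """
    s = value.strip()
    if any(marker in s.upper() for marker in OORANGE_MARKERS):
        # Saturated: substitute the caller-supplied range maximum.
        return range_max, True
    m = _NUM.search(s)
    num = float(m.group(0)) if m else None
    # A leading '>' means the value is a reporting ceiling, e.g. ">100 Hz".
    return num, s.startswith(">") and num is not None
```

For example, `parse_field("OORANGE in/s", 10.0)` yields the range maximum with the flag set, while an ordinary value such as `"0.1500 in/s"` parses normally with the flag clear.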
@@ -49,7 +49,7 @@ SIDECAR_KIND   = "sfm.event"
 # bumped without a `pip install` re-run — leading to confusing stale
 # version stamps in sidecars.  Bump this constant and CHANGELOG.md
 # together at release time.
-TOOL_VERSION = "0.20.0"
+TOOL_VERSION = "0.21.1"

 try:
    # Best-effort: prefer the installed metadata when it's NEWER than the
@@ -120,7 +120,16 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
            "peak_disp_in":    cs.peak_disp_in,
        }
        # Drop all-None entries — keeps the JSON tidy for partial reports.
-        return {k: v for k, v in out.items() if v is not None}
+        out = {k: v for k, v in out.items() if v is not None}
+        # Saturation flag (only present when True) — signals that ppv_ips
+        # is the channel range max (a lower bound), not an exact reading.
+        if getattr(cs, "ppv_saturated", False):
+            out["ppv_saturated"] = True
+        # ZC Freq above device reporting ceiling (BW ">100 Hz") — value
+        # in zc_freq_hz is the threshold, not an exact measurement.
+        if getattr(cs, "zc_freq_above_range", False):
+            out["zc_freq_above_range"] = True
+        return out

    def _sc(ch_name: str) -> dict:
        sc = report.sensor_check.get(ch_name)
@@ -169,15 +178,25 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
            "vert":         _ch("Vert"),
            "long":         _ch("Long"),
            "vector_sum": {
-                "ips":     report.peak_vector_sum_ips,
-                "time_s":  report.peak_vector_sum_time_s,
+                "ips":       report.peak_vector_sum_ips,
+                "time_s":    report.peak_vector_sum_time_s,
+                # Histogram events have an absolute date+time for the PVS
+                # (the interval at which it occurred); waveform events
+                # only have the time_s offset.
+                "when":      report.peak_vector_sum_when.isoformat() if report.peak_vector_sum_when else None,
+                # Set when BW reported the PVS as OORANGE — value is the
+                # conservative upper bound sqrt(3) * geo_range_ips, not
+                # an exact peak.
+                "saturated": bool(getattr(report, "peak_vector_sum_saturated", False)),
            },
        },
        "mic": {
-            "weighting":        report.mic.weighting,
-            "pspl_dbl":         report.mic.pspl_dbl,
-            "zc_freq_hz":       report.mic.zc_freq_hz,
-            "time_of_peak_s":   report.mic.time_of_peak_s,
+            "weighting":             report.mic.weighting,
+            "pspl_dbl":              report.mic.pspl_dbl,
+            "pspl_saturated":        bool(getattr(report.mic, "pspl_saturated", False)),
+            "zc_freq_hz":            report.mic.zc_freq_hz,
+            "zc_freq_above_range":   bool(getattr(report.mic, "zc_freq_above_range", False)),
+            "time_of_peak_s":        report.mic.time_of_peak_s,
        },
        "sensor_check": {
            "tran": _sc("Tran"),
@@ -185,6 +204,17 @@ def _bw_report_to_dict(report: BwAsciiReport) -> dict:
            "long": _sc("Long"),
            "mic":  _sc("MicL"),
        },
+        # Histogram-specific fields (None on waveform-mode events).
+        # Per-channel absolute peak time/date for histograms — for
+        # waveforms see channels[ch]["time_of_peak_s"] instead.
+        "histogram": {
+            "start":               report.histogram_start.isoformat() if report.histogram_start else None,
+            "stop":                report.histogram_stop.isoformat()  if report.histogram_stop  else None,
+            "n_intervals":         report.histogram_n_intervals,
+            "interval_size":       report.histogram_interval_size_str,
+            "interval_size_s":     report.histogram_interval_size_s,
+            "channel_peak_when":   {ch: dt.isoformat() for ch, dt in report.channel_peak_when.items()},
+        },
        "monitor_log":   monitor_log,
        "pc_sw_version": report.pc_sw_version,
    }
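Review note: the new `ppv_saturated` and `zc_freq_above_range` flags only help if consumers treat the flagged value as a bound rather than an exact reading. A minimal sketch of how a renderer might do that is below. The helper names `format_ppv` and `format_zc_freq` are hypothetical and not part of this PR; only the dict keys (`ppv_ips`, `ppv_saturated`, `zc_freq_hz`, `zc_freq_above_range`) come from the hunk above.

```python
from typing import Optional


def format_ppv(channel: dict) -> Optional[str]:
    """Render a per-channel PPV, marking saturated readings as lower bounds.

    `channel` is assumed to be one of the per-channel dicts under
    bw_report["peaks"]; this helper is hypothetical.
    """
    ppv = channel.get("ppv_ips")
    if ppv is None:
        return None
    # ppv_saturated is only present when True, so a missing key means "exact".
    prefix = ">= " if channel.get("ppv_saturated") else ""
    return f"{prefix}{ppv:.3f} in/s"


def format_zc_freq(block: dict) -> Optional[str]:
    """Render a ZC frequency, marking above-range values as thresholds."""
    freq = block.get("zc_freq_hz")
    if freq is None:
        return None
    prefix = "> " if block.get("zc_freq_above_range") else ""
    return f"{prefix}{freq:g} Hz"
```

Because both flags default to "absent means exact", sidecars written before this change would render unchanged under this scheme.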
@@ -254,6 +284,60 @@ def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
        event.rectime_seconds = report.record_time_s


+def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
+    """Mirror of ``apply_report_to_event`` for the projected sidecar
+    dict shape (as produced by ``_bw_report_to_dict``).
+
+    Why this exists
+    ───────────────
+    The ingest path holds a live ``BwAsciiReport`` parsed straight from
+    the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
+    device-authoritative peaks onto the codec output before insert.
+
+    The backfill path doesn't have the original ``.TXT`` (it's not
+    retained in the waveform store), but it does have the preserved
+    ``bw_report`` block from the sidecar — which contains the same
+    projected fields.  Re-overlaying those during a backfill keeps the
+    DB peak columns aligned with what BW reports rather than letting
+    the codec output (which may be incomplete for unhandled formats or
+    walker edge cases) win by default.
+
+    No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
+    any particular sub-field — only fields with a concrete value get
+    written.  Mirrors ``apply_report_to_event``'s "report wins where
+    present" semantics.
+    """
+    if not bw_report:
+        return
+    if event.peak_values is None:
+        event.peak_values = PeakValues()
+    pv = event.peak_values
+
+    peaks = bw_report.get("peaks") or {}
+    tran = (peaks.get("tran") or {}).get("ppv_ips")
+    vert = (peaks.get("vert") or {}).get("ppv_ips")
+    long = (peaks.get("long") or {}).get("ppv_ips")
+    if tran is not None: pv.tran = tran
+    if vert is not None: pv.vert = vert
+    if long is not None: pv.long = long
+    vs_ips = (peaks.get("vector_sum") or {}).get("ips")
+    if vs_ips is not None:
+        pv.peak_vector_sum = vs_ips
+
+    mic = bw_report.get("mic") or {}
+    pspl = mic.get("pspl_dbl")
+    if pspl is not None and pspl > 0:
+        pv.micl = _dbl_to_psi(pspl)
+
+    rec = bw_report.get("recording") or {}
+    sr = rec.get("sample_rate_sps")
+    if sr:
+        event.sample_rate = sr
+    rt = rec.get("record_time_s")
+    if rt is not None:
+        event.rectime_seconds = rt
+
+
 def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
    if pi is None:
        return {
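Review note: `apply_bw_report_dict_to_event` relies on `_dbl_to_psi`, whose body is not shown in this diff. For readers checking the mic overlay, here is a sketch of the standard conversion, assuming the usual 20 µPa reference pressure for dB(L). The function names and constants below are illustrative assumptions and may differ from what `_dbl_to_psi` actually does.

```python
import math

# Assumed constants: standard acoustic reference pressure and Pa-per-psi.
_P_REF_PA = 20e-6
_PA_PER_PSI = 6894.757293168


def dbl_to_psi_sketch(db_l: float) -> float:
    """Convert a linear-weighted sound pressure level in dB to psi."""
    pascals = _P_REF_PA * 10.0 ** (db_l / 20.0)
    return pascals / _PA_PER_PSI


def psi_to_dbl_sketch(psi: float) -> float:
    """Inverse of dbl_to_psi_sketch; psi must be positive."""
    if psi <= 0:
        raise ValueError("pressure must be positive to express in dB")
    return 20.0 * math.log10(psi * _PA_PER_PSI / _P_REF_PA)
```

Under these assumptions 94 dB(L) is roughly 1 Pa, which is about 1.45e-4 psi. That is why plotting a dB(L) number directly on a psi axis is off by several orders of magnitude.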
@@ -278,6 +362,7 @@ def event_to_sidecar_dict(
    blastware_filesize: int,
    blastware_sha256: str,
    source_kind: str = "sfm-live",
+    txt_filename: Optional[str] = None,
    a5_pickle_filename: Optional[str] = None,
    tool_version: str = _TOOL_VERSION_DEFAULT,
    captured_at: Optional[datetime.datetime] = None,
@@ -394,6 +479,7 @@ def event_to_sidecar_dict(
            "captured_at":        captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
            "tool_version":       tool_version,
            "a5_pickle_filename": a5_pickle_filename,
+            "txt_filename":       txt_filename,
        },

        "review": review or {
@@ -28,18 +28,32 @@ iterate 32-stride and stop before the tail.
    [1]    segment_id  (uint8)       0x00..0x03 — 256 blocks per segment
    [2:4]  block_ctr  (uint16 LE)    resets each segment (0x0100, 0x0101, …)
    [4:6]  0x000a (uint16 LE)        constant marker (= 10)
-    [6:8]  T_peak_count   uint16 LE  Tran peak (count × 0.005 → in/s)
+    [6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s, max 1.275 in/s)
+    [7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                     or unmeasurable Tran freq; meaning not fully RE'd
    [8:10] T_halfperiod   uint16 LE  Tran half-period in samples (freq = 512 / halfp Hz)
-    [10:12] V_peak_count  uint16 LE
+    [10]   V_peak_count   uint8
+    [11]   V_annotation   uint8
    [12:14] V_halfperiod  uint16 LE
-    [14:16] L_peak_count  uint16 LE
+    [14]   L_peak_count   uint8
+    [15]   L_annotation   uint8
    [16:18] L_halfperiod  uint16 LE
-    [18:20] M_peak_count  uint16 LE  MicL peak (count → dB via mic_count_to_db)
+    [18]   M_peak_count   uint8      MicL peak (count → dB via mic_count_to_db)
+    [19]   M_annotation   uint8
    [20:22] M_halfperiod  uint16 LE  MicL half-period in samples (freq = 512 / halfp Hz)
    [22:24] 0x00 0x00                constant
    [24:28] 4-byte variable          purpose unknown (possibly CRC or timestamp delta)
    [28:32] 0x1e 0x0a 0x00 0x00      constant block-end signature

+NOTE on peak-count width: an earlier interpretation treated the peak
+fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
+That happened to be byte-exact against the N844 fixture corpus only
+because every annotation byte in those fixtures was zero, making
+``uint16 LE == uint8``.  Cross-correlating BE9558 (K558) Tran-drift
+and BE18003 (T003) Histogram+Continuous events against the BW ASCII
+export proved peak is uint8 alone — see test_histogram_codec.py
+and docs/histogram_codec_re_status.md.
+
 Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
 ``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``.  This is the reliable
 distinguisher from non-block content in the file.
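Review note: the revised field layout can be expressed as one `struct` format, which makes the uint8 peak plus uint8 annotation split explicit. This is a hedged illustration, not the module's code: `decode_block_sketch`, `geo_peak_ips`, and `freq_hz` are hypothetical names. Only the byte offsets, field widths, the 0.005 in/s scale, and the `512 / halfp` relation are taken from the docstring.

```python
import struct
from typing import Optional

# Four channels (Tran, Vert, Long, MicL), each laid out as:
#   peak (uint8), annotation (uint8), half-period (uint16 LE)
# starting at byte 6 of the 32-byte block.
_PEAK_FIELDS = struct.Struct("<BBHBBHBBHBBH")
_CHANNEL_PREFIXES = ("t", "v", "l", "m")


def decode_block_sketch(block: bytes) -> dict:
    """Decode the per-channel fields of one 32-byte histogram block."""
    if len(block) != 32:
        raise ValueError("histogram blocks are exactly 32 bytes")
    values = _PEAK_FIELDS.unpack_from(block, 6)
    record = {}
    for i, prefix in enumerate(_CHANNEL_PREFIXES):
        peak, annotation, halfp = values[3 * i: 3 * i + 3]
        record[f"{prefix}_peak"] = peak
        record[f"{prefix}_annotation"] = annotation
        record[f"{prefix}_halfp"] = halfp
    return record


def geo_peak_ips(count: int) -> float:
    """Geophone peak in in/s at 0.005 in/s per count."""
    return count * 0.005


def freq_hz(halfp: int) -> Optional[float]:
    """Frequency from a half-period in samples; None when unmeasurable."""
    return 512.0 / halfp if halfp > 0 else None
```

With this layout, a block whose bytes [6:8] are `05 79` decodes to a Tran peak of 5 counts (0.025 in/s) with annotation 0x79. The earlier uint16 LE read of the same two bytes gives 30981 counts, which is the kind of impossible value the removed `_MAX_PEAK_COUNT` guard was filtering out.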
@@ -101,23 +115,6 @@ _BLOCK_SIZE = 32
 # additional validation that we're looking at a real block.
 _BLOCK_MARKER = 10

-# Maximum plausible peak-count value.  Normal-range geophone tops out
-# at 10 in/s = 2000 counts at the 0.005 in/s per count scale; even
-# Sensitive range (1.25 in/s FS) wouldn't exceed ~250.  Mic counts run
-# 0..~400 in observed data.  4096 leaves comfortable headroom for any
-# legitimate value across all modes.
-#
-# Some prod blocks have been observed with peak-count fields whose
-# HIGH byte is non-zero (block[7] != 0 etc.) — observed across BE9558
-# and BE18003 units in Histogram-mode events.  Reading these as
-# uint16 LE produces values like 30981 / 41733 / 62469, which scale
-# to physically impossible peaks (150+ in/s).  Best guess: an
-# undocumented "time-of-peak-within-interval" extension byte the
-# device writes in some sub-mode (possibly Histogram+Continuous).
-# Until reverse-engineered, blocks exceeding this bound are skipped
-# rather than propagating bogus values into PVS computations.
-_MAX_PEAK_COUNT = 4096
-
 # Geo peak scaling: stored as "count × 0.005 in/s" where 1 count = one
 # 0.005 in/s display quantum.  Equivalent to the waveform codec's
 # 16-count-unit output (1 unit = 0.005 in/s = 16 ADC counts).
@@ -149,23 +146,36 @@ def _decode_block(block: bytes) -> Optional[dict]:
    """Decode one 32-byte histogram block.  Caller must have validated
    with ``_is_data_block`` first.

-    Returns ``None`` if any peak field exceeds ``_MAX_PEAK_COUNT`` —
-    those blocks contain an undocumented extension byte format whose
-    naive uint16 LE interpretation gives physically impossible peaks.
-    Skipping the block is safer than propagating bogus values into
-    PVS computations downstream.
+    Returns a record with per-channel peak counts (uint8) and
+    half-periods (uint16 LE).
    """
-    # All 16-bit fields are little-endian unsigned.  Peak counts are
-    # always non-negative; half-periods are always positive when valid.
-    t_peak, t_halfp, v_peak, v_halfp, l_peak, l_halfp, m_peak, m_halfp = struct.unpack_from(
-        "<HHHHHHHH", block, 6
-    )
-    if (t_peak > _MAX_PEAK_COUNT or v_peak > _MAX_PEAK_COUNT
-            or l_peak > _MAX_PEAK_COUNT or m_peak > _MAX_PEAK_COUNT):
-        return None
+    # Peak counts are uint8 at bytes [6] / [10] / [14] / [18].  The
+    # adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
+    # whose meaning isn't fully understood (empirically non-zero in
+    # intervals with sub-Hz or unmeasurable geo frequencies, mostly
+    # zero otherwise — see test fixtures from BE9558/BE18003 corpora).
+    # Crucially, those annotation bytes are NOT the high byte of the
+    # peak count: cross-correlating against BW's per-interval ASCII
+    # export proves the peak is uint8 alone.
+    #
+    # Reading the peak as uint16 LE (the original interpretation) was
+    # accidentally correct only because every block in the N844 fixture
+    # corpus had a zero annotation byte; non-N844 events with non-zero
+    # annotation bytes decoded to physically impossible peaks (e.g.
+    # 268 in/s per channel) and produced 35× inflated PVS sums when
+    # first run against prod data.  See histogram_codec_re_status.md.
+    t_peak = block[6]
+    v_peak = block[10]
+    l_peak = block[14]
+    m_peak = block[18]
+    t_halfp = block[8]  | (block[9]  << 8)
+    v_halfp = block[12] | (block[13] << 8)
+    l_halfp = block[16] | (block[17] << 8)
+    m_halfp = block[20] | (block[21] << 8)
    segment_id = block[1]
    block_ctr  = block[2] | (block[3] << 8)
    var_meta   = bytes(block[24:28])
+    annotations = (block[7], block[11], block[15], block[19])
    return {
        "segment_id":  segment_id,
        "block_ctr":   block_ctr,
@@ -178,6 +188,7 @@ def _decode_block(block: bytes) -> Optional[dict]:
        "m_peak":      m_peak,
        "m_halfp":     m_halfp,
        "meta_var":    var_meta,
+        "annotations": annotations,
    }


@@ -185,10 +196,15 @@ def walk_body(body: bytes) -> List[dict]:
    """Walk the body and return one dict per histogram interval.

    Iterates 32-byte strides from offset 0.  Yields a decoded record
-    for every block that passes ``_is_data_block`` validation AND has
-    plausible peak values (``_decode_block`` returns None for blocks
-    with out-of-bound peaks).  Stops when the remaining bytes are too
-    short to form a complete block.
+    for every block that passes ``_is_data_block`` validation.  Stops
+    when the remaining bytes are too short to form a complete block.
+
+    In Histogram+Continuous mode the body interleaves data blocks with
+    other 32-byte content (likely continuous-mode waveform blocks) that
+    fail the data-block validation; the walker naturally skips them
+    without losing 32-byte alignment.  Use ``block_ctr`` from each
+    returned record to map back to the original interval index — the
+    record list is sparse when other block types are interleaved.
    """
    records: List[dict] = []
    for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
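Review note: the skip-without-losing-alignment behaviour described in the `walk_body` docstring can be sketched as follows. This is an assumption-laden illustration. `is_data_block_sketch` and `walk_body_sketch` are hypothetical names, and checking the `0x000a` marker at [4:6] alongside the documented [22:24] and [28:32] anchors is a guess at what `_is_data_block` validates.

```python
from typing import List

_BLOCK_SIZE = 32
_END_SIGNATURE = b"\x1e\x0a\x00\x00"


def is_data_block_sketch(block: bytes) -> bool:
    """Return True when a 32-byte slice looks like a histogram data block."""
    return (
        len(block) == _BLOCK_SIZE
        and block[4:6] == b"\x0a\x00"        # constant marker (= 10), assumed check
        and block[22:24] == b"\x00\x00"      # documented anchor
        and block[28:32] == _END_SIGNATURE   # documented block-end signature
    )


def walk_body_sketch(body: bytes) -> List[dict]:
    """Walk 32-byte strides, keeping data blocks and skipping everything else."""
    records: List[dict] = []
    for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
        block = body[off:off + _BLOCK_SIZE]
        if not is_data_block_sketch(block):
            # Interleaved non-data content: skip it but keep the stride.
            continue
        records.append({
            "offset": off,
            "segment_id": block[1],
            "block_ctr": int.from_bytes(block[2:4], "little"),
        })
    return records
```

Since skipped blocks leave gaps, callers should key on `segment_id` and `block_ctr` rather than on list position when they need the original interval index.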
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "seismo-relay"
-version = "0.19.0"
+version = "0.21.1"
 description = "Python client and REST server for MiniMate Plus seismographs"
 requires-python = ">=3.10"
 dependencies = [
@@ -15,6 +15,7 @@ dependencies = [
    "python-multipart>=0.0.7",
    "h5py>=3.10",
    "numpy>=1.24",
+    "matplotlib>=3.8",
 ]

 [tool.setuptools.packages.find]
@@ -5,3 +5,4 @@ pyserial
 python-multipart
 h5py
 numpy
+matplotlib
@@ -103,6 +103,17 @@ def main(argv=None) -> int:
            "STRT-rectime byte-offset fix in v0.15.x)."
        ),
    )
+    p.add_argument(
+        "--reparse-txt", action="store_true",
+        help=(
+            "Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
+            "current bw_ascii_report parser and overwrite the sidecar's "
+            "bw_report block.  Use this after upgrading the ASCII parser to "
+            "pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
+            "ZC peaks).  No-op for events without a preserved .TXT; safely "
+            "idempotent when the parser hasn't changed."
+        ),
+    )
    p.add_argument("-v", "--verbose", action="store_true")
    args = p.parse_args(argv)

@@ -153,7 +164,7 @@ def main(argv=None) -> int:
            # of the sidecar implies staleness of the derived .h5 (both
            # come out of the same decoder).
            sidecar_stale = True
-            if sidecar_path.exists() and not args.force:
+            if sidecar_path.exists() and not args.force and not args.reparse_txt:
                try:
                    existing = event_file_io.read_sidecar(sidecar_path)
                    sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
@@ -287,19 +298,68 @@ def main(argv=None) -> int:
                            or ev.total_samples < derived // 4):
                        ev.total_samples = derived

-                # Preserve user-edited review state + extensions from the
-                # existing sidecar (false_trigger flag, notes, etc.) so a
-                # backfill never wipes them out.
-                preserved_review = None
-                preserved_ext    = None
+                # Preserve user-edited review state + extensions + the
+                # bw_report block from the existing sidecar so a backfill
+                # never wipes them out.  The bw_report block originates
+                # from the paired .TXT ASCII report parsed at ORIGINAL
+                # import time (ach forward / direct upload).  For events
+                # whose sidecar has no txt_filename (only recorded for
+                # events ingested after 2026-05-27) the .TXT file is not
+                # in the waveform store, so we can't re-derive the block
+                # from disk.  event_to_sidecar_dict takes a BwAsciiReport
+                # dataclass (not a dict), so for bw_report we overlay the
+                # existing block after regen instead of passing it as a
+                # kwarg.
+                preserved_review     = None
+                preserved_ext        = None
+                preserved_bw_report  = None
+                preserved_txt_fn     = None
                if sidecar_path.exists():
                    try:
                        _existing = event_file_io.read_sidecar(sidecar_path)
-                        preserved_review = _existing.get("review")
-                        preserved_ext    = _existing.get("extensions")
+                        preserved_review    = _existing.get("review")
+                        preserved_ext       = _existing.get("extensions")
+                        preserved_bw_report = _existing.get("bw_report")
+                        # Preserve txt_filename so backfills don't blank out the
+                        # pointer to the saved raw .TXT (events ingested after
+                        # 2026-05-27 have this).
+                        preserved_txt_fn    = (_existing.get("source") or {}).get("txt_filename")
                    except Exception:
                        pass

+                # --reparse-txt: if a .TXT is preserved on disk, run the
+                # current parser against it and overwrite the bw_report
+                # block.  Picks up post-ingest parser fixes (e.g. the
+                # 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
+                if args.reparse_txt and preserved_txt_fn:
+                    try:
+                        from minimateplus import bw_ascii_report
+                        txt_path = store.txt_path_for(serial, path.name)
+                        if txt_path.exists():
+                            refreshed = bw_ascii_report.parse_report_file(txt_path)
+                            preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
+                            log.debug("reparsed bw_report from %s", txt_path.name)
+                        else:
+                            log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
+                                      txt_path, preserved_txt_fn)
+                    except Exception as exc:
+                        log.warning("--reparse-txt failed for %s: %s", path.name, exc)
+
+                # Overlay BW ASCII report fields onto the rebuilt Event
+                # BEFORE the sidecar + DB write.  Mirrors what the ingest
+                # path does — BW's reported peaks (and sample_rate /
+                # record_time) win over codec output where present.
+                #
+                # Without this step, --force backfill silently overwrites
+                # the bw_report-overlaid DB columns with codec-derived
+                # values, which is wrong for events the codec doesn't
+                # fully decode (e.g. waveform walker edge cases on
+                # SP0/SS0/SV0-style events, or histogram sub-formats with
+                # byte[5]!=0 that aren't yet RE'd).  Net effect was PVS=0
+                # on three top-10 events on 2026-05-22.
+                if preserved_bw_report:
+                    event_file_io.apply_bw_report_dict_to_event(
+                        ev, preserved_bw_report,
+                    )
+
                sidecar = event_file_io.event_to_sidecar_dict(
                    ev,
                    serial=serial,
@@ -308,9 +368,12 @@ def main(argv=None) -> int:
                    blastware_sha256=bw_sha,
                    source_kind=source_kind,
                    a5_pickle_filename=a5_filename,
+                    txt_filename=preserved_txt_fn,
                    review=preserved_review,
                    extensions=preserved_ext,
                )
+                if preserved_bw_report is not None:
+                    sidecar["bw_report"] = preserved_bw_report

                # Also emit the .h5 clean-waveform file when:
                #   - it's missing, OR
@@ -0,0 +1,331 @@
+"""
+scripts/backfill_thor_events.py — re-process existing Thor (Series IV)
+events so their sidecars carry the bw_report block produced by
+``micromate.idf_to_bw_report.build_bw_report_from_idf`` and their .h5
+files are regenerated (decoded waveforms for IDFW, per-interval peaks
+for IDFH).
+
+Why this exists
+───────────────
+
+Thor events ingested before v0.21.0 (or during the v0.21.0 ingest bug
+window fixed in commit bee1185) have sidecars with only
+``extensions.idf_report`` — no ``bw_report`` block.  Without
+``bw_report``, the SFM PDF renderer falls back to DB-only fields
+(misses sensor-self-check, full per-channel breakdown, mic dB(L)),
+and the modal chart 404s on ``/waveform.json`` for IDFW events
+because no .h5 was written when the codec failed at ingest.
+
+Re-forwarding from thor-watcher would also fix this, but that requires
+operator coordination on every watcher machine and uses bandwidth this
+script doesn't.
+
+What this does
+──────────────
+
+Walks ``<store>/<serial>/<filename>`` for ``.IDFW`` / ``.IDFH`` files
+and, for each one:
+
+  1. Reads the existing sidecar (preserving review state + captured_at).
+  2. Re-runs ``micromate.idf_file.read_idf_file()`` on the binary
+     bytes — passing ``data=`` so the codec doesn't try to read from
+     a path it doesn't know.
+  3. Pulls ``extensions.idf_report`` (the raw parsed Thor dict the
+     v0.18.0+ ingest path already stashed) and runs the v0.21.0
+     ``build_bw_report_from_idf`` adapter against it.
+  4. Writes the refreshed sidecar with the new ``bw_report``,
+     bumped ``source.tool_version``, but preserved ``review`` block
+     + the original ``captured_at`` timestamp.
+  5. Regenerates the .h5 waveform file via the existing
+     ``event_hdf5`` writer.  For IDFW that's the decoded per-sample
+     stream; for IDFH it's a 1-sample-per-interval synthesised array
+     (peak ADC count per channel) so the renderer's bar-chart code
+     has data to group on.  Mic peak psi from the binary is merged
+     onto the IdfEvent before the bridge so the h5 writer's per-count
+     mic scale factor lands on a sensible value (without this the
+     mic chart on Thor events plots dB(L)-as-pseudo-psi and shows
+     bomb-level numbers).
+
+Idempotent.  Re-running it after a parser/adapter change just
+re-writes sidecars (and .h5 files unless --skip-hdf5 is given); it
+makes no DB writes and needs no thor-watcher coordination.
+
+Usage
+─────
+
+    python scripts/backfill_thor_events.py [--store-root PATH]
+                                           [--dry-run]
+                                           [--skip-hdf5]
+                                           [--force]
+                                           [-v]
+
+By default, refreshes any Thor event whose sidecar is missing
+``bw_report`` OR whose ``source.tool_version`` is older than the
+current ``TOOL_VERSION``.  ``--force`` refreshes every Thor event
+regardless.
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+import sys
+from pathlib import Path
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+from sfm.waveform_store import WaveformStore
+
+log = logging.getLogger("backfill_thor_events")
+
+
+def _is_thor_event(path: Path) -> bool:
+    if not path.is_file():
+        return False
+    if path.name.endswith((".sfm.json", ".h5", "_ASCII.TXT")):
+        return False
+    return path.suffix.upper() in (".IDFW", ".IDFH")
+
+
+def _vtuple(s: str) -> tuple:
+    try:
+        return tuple(int(p) for p in str(s).split(".")[:3])
+    except Exception:
+        return (0, 0, 0)
+
+
+def main(argv=None) -> int:
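+    # RawDescriptionHelpFormatter keeps the module docstring's section
+    # layout intact in --help instead of reflowing it into one paragraph.
+    p = argparse.ArgumentParser(
+        description=__doc__,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )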
+    p.add_argument(
+        "--db-path",
+        default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
+        help="Used only to derive the default --store-root.",
+    )
+    p.add_argument("--store-root", default=None)
+    p.add_argument("--dry-run", action="store_true")
+    p.add_argument("--skip-hdf5", action="store_true",
+                   help="Don't regenerate .h5 files (IDFW waveforms or IDFH histograms).")
+    p.add_argument("--force", action="store_true",
+                   help="Refresh every Thor event, not just ones with stale or missing bw_report.")
+    p.add_argument("-v", "--verbose", action="store_true")
+    args = p.parse_args(argv)
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+        datefmt="%H:%M:%S",
+    )
+
+    db_path = Path(args.db_path).expanduser().resolve()
+    store_root = (
+        Path(args.store_root).expanduser().resolve()
+        if args.store_root else db_path.parent / "waveforms"
+    )
+    if not store_root.exists():
+        log.error("store root not found: %s", store_root)
+        return 1
+    store = WaveformStore(store_root)
+    log.info("store root: %s", store_root)
+    log.info("current TOOL_VERSION: %s", event_file_io.TOOL_VERSION)
+
+    refreshed = skipped = errors = h5_written = 0
+
+    # Lazy imports so any one of these failing produces a useful error
+    # message rather than crashing module-load.
+    from micromate.idf_file import read_idf_file
+    from micromate.idf_to_bw_report import build_bw_report_from_idf
+
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        serial = serial_dir.name
+        for path in sorted(serial_dir.iterdir()):
+            if not _is_thor_event(path):
+                continue
+
+            sidecar_path = store.sidecar_path_for(serial, path.name)
+            if not sidecar_path.exists():
+                log.debug("%s: no sidecar — skipping (this is a binary without ingest history)",
+                          path.name)
+                skipped += 1
+                continue
+
+            try:
+                existing = event_file_io.read_sidecar(sidecar_path)
+            except Exception as exc:
+                log.warning("%s: failed to read sidecar — %s", path.name, exc)
+                errors += 1
+                continue
+
+            has_bw_report = bool(existing.get("bw_report"))
+            existing_version = (existing.get("source") or {}).get("tool_version", "")
+            up_to_date = (
+                has_bw_report
+                and _vtuple(existing_version) >= _vtuple(event_file_io.TOOL_VERSION)
+            )
+            if up_to_date and not args.force:
+                skipped += 1
+                continue
+
+            # Re-decode the binary.  Catch + log; continue with .txt-only
+            # data if it fails (matches the live ingest path's behavior).
+            idf_samples = None
+            idf_intervals = None
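+            binary_md = None
+            # Reset per file so the mic-psi merge further down can never
+            # see a decode result left over from a previous iteration.
+            res = None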
+            is_histogram = path.suffix.upper() == ".IDFH"
+            try:
+                binary_bytes = path.read_bytes()
+                res = read_idf_file(path, data=binary_bytes)
+                idf_samples = res.samples or None
+                idf_intervals = res.intervals
+                binary_md = res.binary_metadata
+                is_histogram = res.intervals is not None
+            except NotImplementedError:
+                # sig-B / Blastware-stray binary; no samples but adapter
+                # can still produce a bw_report from extensions.idf_report.
+                log.debug("%s: binary codec NotImplementedError (sig-B / BW-stray); proceeding from sidecar's idf_report only", path.name)
+            except Exception as exc:
+                log.warning("%s: binary decode failed — %s; proceeding from sidecar's idf_report only", path.name, exc)
+
+            # Run the adapter.  Pull report_dict from
+            # extensions.idf_report (the v0.18.0+ ingest preserved it).
+            report_dict = (existing.get("extensions") or {}).get("idf_report") or {}
+            if not report_dict and binary_md is None:
+                log.debug("%s: no idf_report in sidecar AND no binary metadata — nothing to project", path.name)
+                skipped += 1
+                continue
+
+            try:
+                bw_report = build_bw_report_from_idf(
+                    report_dict, binary_md=binary_md,
+                    intervals=idf_intervals, is_histogram=is_histogram,
+                )
+            except Exception as exc:
+                log.warning("%s: adapter failed — %s", path.name, exc)
+                errors += 1
+                continue
+
+            # Build the new sidecar by overlaying refreshed fields onto
+            # the existing one — preserves review, captured_at, blastware
+            # block, source.kind, etc.
+            new_sidecar = dict(existing)  # shallow copy
+            new_sidecar["bw_report"] = bw_report
+            src = dict(new_sidecar.get("source") or {})
+            src["tool_version"] = event_file_io.TOOL_VERSION
+            new_sidecar["source"] = src
+
+            # Preserve histogram intervals if the binary decoded them
+            # (improves over the original ingest if that one ran before
+            # the bee1185 codec fix).
+            if idf_intervals is not None:
+                ext = dict(new_sidecar.get("extensions") or {})
+                ext["idf_intervals"] = [
+                    {
+                        "offset":     iv.offset,
+                        "tran_peak":  iv.peak_count("Tran"),
+                        "tran_halfp": iv.tran_halfp,
+                        "tran_freq":  iv.freq_hz("Tran"),
+                        "vert_peak":  iv.peak_count("Vert"),
+                        "vert_halfp": iv.vert_halfp,
+                        "vert_freq":  iv.freq_hz("Vert"),
+                        "long_peak":  iv.peak_count("Long"),
+                        "long_halfp": iv.long_halfp,
+                        "long_freq":  iv.freq_hz("Long"),
+                        "mic_peak":   iv.peak_count("MicL"),
+                        "mic_halfp":  iv.micl_halfp,
+                        "mic_freq":   iv.freq_hz("MicL"),
+                    }
+                    for iv in idf_intervals
+                ]
+                new_sidecar["extensions"] = ext
+
+            if args.dry_run:
+                will_write_h5 = (idf_samples or idf_intervals) and not args.skip_hdf5
+                log.info("[DRY] %s/%s — would refresh sidecar (bw_report=%s, h5=%s)",
+                         serial, path.name,
+                         "would add" if not has_bw_report else "would refresh",
+                         "would write" if will_write_h5 else "skipped")
+            else:
+                event_file_io.write_sidecar(sidecar_path, new_sidecar)
+                log.info("%s/%s — sidecar refreshed (bw_report=%s, intervals=%d)",
+                         serial, path.name,
+                         "added" if not has_bw_report else "refreshed",
+                         len(idf_intervals) if idf_intervals else 0)
+            refreshed += 1
+
+            # Regenerate .h5 by replaying the same IdfEvent → Event bridge
+            # save_imported_idf uses.  For IDFW we write the decoded per-
+            # sample arrays.  For IDFH we synthesise a 1-sample-per-interval
+            # array (peak ADC count per channel per interval) so the
+            # renderer's bar-chart code has something to group on.
+            # Pre-condition: either real samples (IDFW) or decoded intervals
+            # (IDFH).  Skip otherwise.
+            have_data = bool(idf_samples) or bool(idf_intervals)
+            if have_data and not args.skip_hdf5:
+                from sfm import event_hdf5
+                hdf5_path = store.hdf5_path_for(serial, path.name)
+                if args.dry_run:
+                    log.debug("[DRY] would write %s", hdf5_path.name)
+                else:
+                    try:
+                        from micromate import IdfEvent
+                        from minimateplus.event_file_io import file_sha256
+                        idf_event = IdfEvent.from_report(report_dict, path.name)
+
+                        # Merge the binary-derived mic peak psi (only the
+                        # binary path knows the proper psi value; the .txt
+                        # carries dB(L)).  Without this, the h5 writer's
+                        # per-count mic factor is computed against the
+                        # dB(L) value-as-pseudo-psi and the mic chart
+                        # scales wildly.
+                        if (binary_md is not None and res is not None
+                                and res.event.peaks.mic_pspl_psi is not None):
+                            idf_event.peaks.mic_pspl_psi = res.event.peaks.mic_pspl_psi
+
+                        sha256 = file_sha256(path)
+                        waveform_key = bytes.fromhex(sha256)[:16]
+                        ev = idf_event.to_minimateplus_event(waveform_key)
+
+                        if is_histogram and idf_intervals:
+                            # 1 sample per interval per channel — same
+                            # synthesis save_imported_idf uses.  The h5
+                            # writer's count×geo_fs/32768 conversion turns
+                            # each peak-ADC-count into the bar's physical
+                            # value.
+                            ev.raw_samples = {
+                                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
+                                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
+                                "Long": [iv.peak_count("Long") for iv in idf_intervals],
+                                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
+                            }
+                            ev.total_samples = ev.total_samples or len(idf_intervals)
+                        elif idf_samples:
+                            ev.raw_samples = idf_samples
+                            n_samp = max(
+                                (len(idf_samples.get(ch, []))
+                                 for ch in ("Tran", "Vert", "Long", "MicL")),
+                                default=0,
+                            )
+                            ev.total_samples = ev.total_samples or n_samp
+
+                        event_hdf5.write_event_hdf5(
+                            hdf5_path, ev,
+                            serial=serial,
+                            geo_range="normal",
+                            source_kind="idf-import",
+                            tool_version=event_file_io.TOOL_VERSION,
+                        )
+                        h5_written += 1
+                        log.debug("%s/%s — .h5 written (%s)",
+                                  serial, path.name,
+                                  f"{len(idf_intervals)} intervals" if is_histogram and idf_intervals
+                                  else f"{sum(len(v) for v in (idf_samples or {}).values())} samples")
+                    except Exception as exc:
+                        log.warning("%s/%s — .h5 write failed: %s",
+                                    serial, path.name, exc)
+
+    log.info("Done.  refreshed=%d  skipped=%d  errors=%d  h5_written=%d",
+             refreshed, skipped, errors, h5_written)
+    return 0 if errors == 0 else 2
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -0,0 +1,185 @@
+"""
+scripts/check_bw_report_preservation.py — verify that running backfill_sidecars
+doesn't wipe the `bw_report` block from sidecars that already had one.
+
+Three-step workflow:
+
+  # Before running backfill — capture a baseline snapshot:
+  python scripts/check_bw_report_preservation.py snapshot \
+      --store-root /path/to/waveforms \
+      --out before.json
+
+  # Run backfill:
+  python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force
+
+  # After backfill — diff against the baseline:
+  python scripts/check_bw_report_preservation.py diff \
+      --store-root /path/to/waveforms \
+      --baseline before.json
+
+The diff classifies every sidecar into one of:
+
+  PRESERVED      had bw_report before, has same hash now  ← GOOD
+  CHANGED        had bw_report before, has different hash now  ← suspicious
+                 (backfill should only ever copy the block verbatim)
+  WIPED          had bw_report before, doesn't now  ← BUG — data loss
+  STILL_MISSING  didn't have bw_report before, still doesn't  ← expected
+  NEW            didn't have bw_report before, has one now
+                 (only possible if a re-ingest happened between snapshots;
+                  shouldn't happen during backfill)
+  REMOVED        sidecar existed in baseline, file is gone now
+  ADDED          sidecar didn't exist in baseline, exists now
+
+Exit code is 0 if no WIPED or CHANGED entries are found, 1 if any are, and
+2 if the store root or baseline file does not exist.
+"""
+
+from __future__ import annotations
+
+import argparse
+import hashlib
+import json
+import sys
+from pathlib import Path
+from typing import Optional
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+
+
+def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
+    """Canonical-JSON hash of the bw_report block, or None if absent or empty."""
+    br = sidecar_data.get("bw_report")
+    if not br:
+        return None
+    # sort_keys for stable hashing across dict-ordering differences
+    blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(blob.encode()).hexdigest()
+
+
+def _scan_store(store_root: Path) -> dict:
+    """Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.
+
+    Relpath is `<serial>/<filename>` — stable across machines/snapshots.
+    """
+    out: dict[str, Optional[str]] = {}
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        for sidecar in sorted(serial_dir.glob("*.sfm.json")):
+            relpath = f"{serial_dir.name}/{sidecar.name}"
+            try:
+                data = event_file_io.read_sidecar(sidecar)
+            except Exception as exc:
+                print(f"  WARN: failed to read {relpath}: {exc}", file=sys.stderr)
+                continue
+            out[relpath] = _bw_report_hash(data)
+    return out
+
+
+def cmd_snapshot(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    out_path = Path(args.out).expanduser().resolve()
+
+    print(f"Scanning {store_root} …")
+    snapshot = _scan_store(store_root)
+
+    with_bw    = sum(1 for v in snapshot.values() if v is not None)
+    without_bw = sum(1 for v in snapshot.values() if v is None)
+    print(f"  total sidecars:     {len(snapshot)}")
+    print(f"  with bw_report:     {with_bw}")
+    print(f"  without bw_report:  {without_bw}")
+
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    with open(out_path, "w") as f:
+        json.dump({
+            "store_root":  str(store_root),
+            "total":       len(snapshot),
+            "with_bw":     with_bw,
+            "sidecars":    snapshot,
+        }, f, indent=2, sort_keys=True)
+    print(f"Wrote baseline → {out_path}")
+    return 0
+
+
+def cmd_diff(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    baseline_path = Path(args.baseline).expanduser().resolve()
+    if not baseline_path.exists():
+        print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
+        return 2
+
+    with open(baseline_path) as f:
+        baseline = json.load(f)
+    before = baseline["sidecars"]
+    print(f"Scanning {store_root} for comparison against {baseline_path.name} …")
+    after = _scan_store(store_root)
+
+    classes = {k: [] for k in (
+        "PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
+    )}
+    all_keys = set(before) | set(after)
+    for key in sorted(all_keys):
+        b = before.get(key, "__MISSING__")
+        a = after.get(key, "__MISSING__")
+        if b == "__MISSING__":
+            classes["ADDED"].append(key)
+        elif a == "__MISSING__":
+            classes["REMOVED"].append(key)
+        elif b is None and a is None:
+            classes["STILL_MISSING"].append(key)
+        elif b is None and a is not None:
+            classes["NEW"].append(key)
+        elif b is not None and a is None:
+            classes["WIPED"].append(key)
+        elif b == a:
+            classes["PRESERVED"].append(key)
+        else:
+            classes["CHANGED"].append(key)
+
+    print()
+    print(f"{'class':16s} {'count':>7s}")
+    print("-" * 24)
+    for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
+              "NEW", "ADDED", "REMOVED"):
+        print(f"{k:16s} {len(classes[k]):>7d}")
+
+    # Show samples of the concerning classes
+    for k in ("WIPED", "CHANGED"):
+        if classes[k]:
+            print(f"\n=== {k} samples (up to 10) ===")
+            for key in classes[k][:10]:
+                print(f"  {key}")
+
+    if classes["WIPED"] or classes["CHANGED"]:
+        print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
+        return 1
+    print("\nbw_report preservation looks intact.")
+    return 0
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    sub = p.add_subparsers(dest="cmd", required=True)
+
+    p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
+    p_snap.add_argument("--store-root", required=True)
+    p_snap.add_argument("--out", required=True, help="output JSON path")
+    p_snap.set_defaults(func=cmd_snapshot)
+
+    p_diff = sub.add_parser("diff", help="diff current store against a baseline")
+    p_diff.add_argument("--store-root", required=True)
+    p_diff.add_argument("--baseline",   required=True, help="JSON from `snapshot`")
+    p_diff.set_defaults(func=cmd_diff)
+
+    args = p.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -0,0 +1,91 @@
+"""Re-ingest a prod IDFW + IDFH via the patched save_imported_idf and
+render both PDFs to confirm charts have data."""
+from __future__ import annotations
+import sys
+import json
+import datetime
+import tempfile
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+import h5py
+
+
+class FakeDb:
+    def __init__(self, event):
+        self.event = event
+    def get_event(self, _id):
+        return self.event
+
+
+def to_ts_iso(ts):
+    if ts is None:
+        return None
+    try:
+        return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+    except Exception:
+        return None
+
+
+def render_case(idf_path: Path, serial: str, out_pdf: Path, h5_summary: bool = True):
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idf_path.read_bytes(),
+            idf_path,
+            idf_report_text=None,    # production worst case: no .txt
+        )
+        print(f"=== {idf_path.name} ===")
+        print(f"  h5: {rec['hdf5_filename']}, sidecar: {rec['sidecar_filename']}")
+
+        h5p = Path(td) / serial / f"{idf_path.name}.h5"
+        if h5p.exists() and h5_summary:
+            with h5py.File(h5p) as h:
+                for ch in ("Tran", "Vert", "Long", "MicL"):
+                    ds = h.get(f"samples/{ch}")
+                    if ds is not None:
+                        n = ds.shape[0]
+                        mx = float(abs(ds[...]).max()) if n else 0
+                        print(f"  samples/{ch}: n={n}  max_abs={mx:.5f}")
+
+        record_type = "Histogram" if idf_path.suffix.upper() == ".IDFH" else "Waveform"
+        fake_row = {
+            "serial":              serial,
+            "blastware_filename":  rec["filename"],
+            "record_type":         record_type,
+            "timestamp":           to_ts_iso(ev.timestamp),
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
+        print(f"  ReportData: channels={ {k: len(v) for k,v in rd.channels.items()} }")
+        if rd.is_histogram:
+            print(f"  histogram n_intervals={rd.histogram_n_intervals} interval_size={rd.histogram_interval_size}")
+        pdf = report_pdf.render_event_report_pdf(rd)
+        out_pdf.write_bytes(pdf)
+        print(f"  PDF: {out_pdf}  ({len(pdf)} bytes)")
+
+
+def main():
+    out_dir = Path("/tmp/thor_render_test"); out_dir.mkdir(exist_ok=True)
+    cases = [
+        # IDFW that decoded to preamble-only under the old codec
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804154137.IDFW", "UM6047"),
+        # IDFW that worked under the old codec (validates no regression)
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804104450.IDFW", "UM6047"),
+        # IDFH histogram
+        ("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804190047.IDFH", "UM6047"),
+    ]
+    for path, serial in cases:
+        render_case(Path(path), serial, out_dir / f"{Path(path).name}.pdf")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,909 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+  <title>SFM Event Browser</title>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js"></script>
+  <style>
+    * { box-sizing: border-box; margin: 0; padding: 0; }
+
+    body {
+      background: #0d1117;
+      color: #c9d1d9;
+      font-family: 'Segoe UI', system-ui, sans-serif;
+      font-size: 13px;
+      height: 100vh;
+      display: flex;
+      flex-direction: column;
+      overflow: hidden;
+    }
+
+    header {
+      background: #161b22;
+      border-bottom: 1px solid #30363d;
+      padding: 12px 20px;
+      display: flex;
+      align-items: center;
+      gap: 16px;
+      flex-shrink: 0;
+    }
+
+    header h1 {
+      font-size: 15px;
+      font-weight: 600;
+      color: #f0f6fc;
+      white-space: nowrap;
+    }
+
+    label { color: #8b949e; font-size: 12px; }
+
+    select, input[type="text"], input[type="search"] {
+      background: #0d1117;
+      border: 1px solid #30363d;
+      border-radius: 6px;
+      color: #c9d1d9;
+      padding: 5px 8px;
+      font-size: 13px;
+    }
+    select { min-width: 140px; }
+    input[type="search"] { width: 200px; }
+    select:focus, input:focus { outline: none; border-color: #388bfd; }
+
+    button {
+      background: #1f6feb;
+      border: none;
+      border-radius: 6px;
+      color: #fff;
+      cursor: pointer;
+      font-size: 13px;
+      font-weight: 500;
+      padding: 5px 14px;
+    }
+    button:hover { background: #388bfd; }
+    button:disabled { background: #21262d; color: #484f58; cursor: not-allowed; }
+
+    #main {
+      flex: 1;
+      display: flex;
+      overflow: hidden;
+    }
+
+    /* ── Event list (left sidebar) ────────────────────────────────── */
+    #event-list-wrap {
+      width: 320px;
+      flex-shrink: 0;
+      background: #0d1117;
+      border-right: 1px solid #21262d;
+      display: flex;
+      flex-direction: column;
+    }
+
+    #event-list-header {
+      padding: 10px 14px;
+      border-bottom: 1px solid #21262d;
+      font-size: 11px;
+      color: #8b949e;
+      text-transform: uppercase;
+      letter-spacing: 0.06em;
+      display: flex;
+      justify-content: space-between;
+    }
+
+    #event-list {
+      flex: 1;
+      overflow-y: auto;
+    }
+
+    .event-row {
+      padding: 8px 14px;
+      border-bottom: 1px solid #161b22;
+      cursor: pointer;
+      transition: background 0.1s;
+    }
+    .event-row:hover { background: #161b22; }
+    .event-row.active { background: #1f3a5f; border-left: 3px solid #58a6ff; padding-left: 11px; }
+    .event-row .er-top {
+      display: flex;
+      justify-content: space-between;
+      align-items: center;
+      margin-bottom: 2px;
+    }
+    .event-row .er-ts { font-family: monospace; font-size: 12px; color: #c9d1d9; }
+    .event-row .er-pvs { font-family: monospace; font-size: 12px; color: #58a6ff; font-weight: 600; }
+    .event-row .er-meta { font-size: 11px; color: #8b949e; }
+    .event-row.false_trigger .er-pvs { color: #f85149; text-decoration: line-through; }
+
+    /* ── Main viewer (right side) ─────────────────────────────────── */
+    #viewer {
+      flex: 1;
+      display: flex;
+      flex-direction: column;
+      overflow: hidden;
+    }
+
+    #event-meta {
+      padding: 12px 20px;
+      background: #161b22;
+      border-bottom: 1px solid #21262d;
+      display: grid;
+      grid-template-columns: repeat(auto-fit, minmax(160px, 1fr));
+      gap: 8px 24px;
+      flex-shrink: 0;
+    }
+    .meta-field {
+      display: flex;
+      flex-direction: column;
+      gap: 1px;
+    }
+    .meta-field .mf-label {
+      font-size: 10px;
+      color: #484f58;
+      text-transform: uppercase;
+      letter-spacing: 0.05em;
+    }
+    .meta-field .mf-value {
+      font-family: monospace;
+      font-size: 13px;
+      color: #c9d1d9;
+    }
+    .meta-field .mf-value.highlight { color: #58a6ff; font-weight: 600; }
+
+    #charts {
+      flex: 1;
+      overflow-y: auto;
+      padding: 12px 16px;
+      display: flex;
+      flex-direction: column;
+      gap: 10px;
+    }
+    .chart-wrap {
+      background: #161b22;
+      border: 1px solid #21262d;
+      border-radius: 8px;
+      padding: 10px 30px 8px 12px;  /* right padding leaves room for the "0.0" baseline label */
+    }
+    .chart-label {
+      font-size: 11px;
+      font-weight: 600;
+      letter-spacing: 0.06em;
+      text-transform: uppercase;
+      margin-bottom: 4px;
+      display: flex;
+      justify-content: space-between;
+    }
+    .chart-canvas-wrap { position: relative; height: 130px; }
+
+    .ch-tran { color: #58a6ff; }
+    .ch-vert { color: #3fb950; }
+    .ch-long { color: #d29922; }
+    .ch-micl { color: #bc8cff; }
+
+    #status-bar {
+      background: #161b22;
+      border-top: 1px solid #21262d;
+      padding: 5px 20px;
+      font-size: 12px;
+      color: #8b949e;
+      min-height: 26px;
+      flex-shrink: 0;
+    }
+    #status-bar.error { color: #f85149; }
+    #status-bar.ok    { color: #3fb950; }
+
+    #empty-state {
+      flex: 1;
+      display: flex;
+      flex-direction: column;
+      align-items: center;
+      justify-content: center;
+      color: #484f58;
+      gap: 8px;
+    }
+    #empty-state svg { opacity: 0.3; }
+
+    .pill {
+      background: #21262d;
+      border-radius: 4px;
+      padding: 2px 8px;
+      color: #c9d1d9;
+      font-family: monospace;
+      font-size: 11px;
+      margin-left: 8px;
+    }
+
+    /* Per-channel stats table in the metadata header */
+    .stats-table {
+      grid-column: 1 / -1;
+      border-collapse: collapse;
+      font-family: monospace;
+      font-size: 12px;
+      margin-top: 4px;
+    }
+    .stats-table th, .stats-table td {
+      padding: 3px 14px 3px 0;
+      text-align: left;
+      color: #c9d1d9;
+    }
+    .stats-table th {
+      color: #484f58;
+      font-size: 10px;
+      text-transform: uppercase;
+      letter-spacing: 0.05em;
+      font-weight: 500;
+    }
+
+    /* ── Print view (light theme matching the Instantel printout) ─── */
+    body.print-view {
+      background: #ffffff;
+      color: #000000;
+    }
+    body.print-view header,
+    body.print-view #event-list-wrap,
+    body.print-view #event-list-header,
+    body.print-view #event-meta,
+    body.print-view #status-bar,
+    body.print-view .chart-wrap {
+      background: #ffffff;
+      border-color: #cccccc;
+      color: #000000;
+    }
+    body.print-view .event-row { color: #000; border-bottom-color: #eee; }
+    body.print-view .event-row:hover { background: #f4f4f4; }
+    body.print-view .event-row.active {
+      background: #e6f0ff;
+      border-left-color: #1f6feb;
+    }
+    body.print-view .er-ts { color: #000; }
+    body.print-view .er-pvs { color: #003a8c; }
+    body.print-view .er-meta,
+    body.print-view #event-list-header,
+    body.print-view .meta-field .mf-label,
+    body.print-view .stats-table th {
+      color: #666;
+    }
+    body.print-view .mf-value { color: #000; }
+    body.print-view .mf-value.highlight { color: #003a8c; }
+    body.print-view label { color: #444; }
+    body.print-view input, body.print-view select {
+      background: #fff; color: #000; border-color: #ccc;
+    }
+    /* In print theme, the channel-label colors stay (they identify
+       the trace).  Only the chart panel background flips. */
+
+    @media print {
+      header, #event-list-wrap, #status-bar, button { display: none !important; }
+      body { overflow: visible; height: auto; }
+      #main, #viewer { overflow: visible; }
+      #charts { overflow: visible; }
+    }
+  </style>
+</head>
+<body>
+
+<header>
+  <h1>SFM Event Browser</h1>
+  <label>Serial</label>
+  <select id="serial-select">
+    <option value="">Loading…</option>
+  </select>
+  <input type="search" id="event-filter" placeholder="filter events…" />
+  <span class="pill" id="count-pill">—</span>
+  <button id="mic-unit-toggle" style="margin-left:auto;background:#21262d"
+          onclick="_setMicUnit(_getMicUnit() === 'dBL' ? 'psi' : 'dBL')"
+          title="Toggle mic display unit (dBL ↔ psi). Persists across page loads.">
+    Mic: dBL
+  </button>
+  <button id="print-btn" onclick="togglePrintView()" style="background:#21262d">Print view</button>
+  <button id="reload-btn" onclick="loadSerials()">Reload</button>
+</header>
+
+<div id="main">
+  <div id="event-list-wrap">
+    <div id="event-list-header">
+      <span>Events</span>
+      <span id="event-list-count">—</span>
+    </div>
+    <div id="event-list"></div>
+  </div>
+
+  <div id="viewer">
+    <div id="empty-state">
+      <svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5">
+        <polyline points="22 12 18 12 15 21 9 3 6 12 2 12"/>
+      </svg>
+      <p>Select a unit and event to view its waveform.</p>
+    </div>
+    <div id="event-meta" style="display:none"></div>
+    <div id="charts" style="display:none"></div>
+  </div>
+</div>
+
+<div id="status-bar">Ready.</div>
+
+<script>
+// Channel colors and rendering order mirror Instantel's BW Event Report
+// printout: MicL at the top, Tran at the bottom.  Colors approximate
+// what BW renders (magenta mic, blue long, green vert, red tran).
+const CHANNEL_COLORS = {
+  MicL: '#e066ff',
+  Long: '#3a80ff',
+  Vert: '#3fb950',
+  Tran: '#f85149',
+};
+const CHANNEL_ORDER = ['MicL', 'Long', 'Vert', 'Tran'];
+
+// Reference pressure for dB(L) — 20 µPa expressed in psi (≈ 2.9e-9 psi).
+const DBL_REF = 2.9e-9;
+
+// User-toggleable mic display unit: 'dBL' (default, matches BW printout
+// + the rest of SFM) or 'psi' (raw sample unit).
+function _getMicUnit() {
+  return localStorage.getItem('sfm_mic_unit') === 'psi' ? 'psi' : 'dBL';
+}
+function _setMicUnit(u) {
+  localStorage.setItem('sfm_mic_unit', u === 'psi' ? 'psi' : 'dBL');
+  _refreshMicUnitToggle();
+  if (currentEventId) loadEvent(currentEventId);
+}
+function _refreshMicUnitToggle() {
+  const b = document.getElementById('mic-unit-toggle');
+  if (b) b.textContent = `Mic: ${_getMicUnit()}`;
+}
+// psi → dB(L).  Null for non-positive (log undefined; Chart.js renders as a gap).
+function _psiToDbl(psi) {
+  if (psi == null || !(psi > 0)) return null;
+  return 20 * Math.log10(psi / DBL_REF);
+}
+
+// Per-sample mic chart conversion: rectify the AC waveform, convert to
+// dB(L), and clamp to MIC_DBL_FLOOR.  Gives a continuous baseline instead
+// of the spiky, gap-ridden trace that raw _psiToDbl produces.
+const MIC_DBL_FLOOR = 60;
+function _psiToDblForChart(psi) {
+  if (psi == null) return MIC_DBL_FLOOR;
+  const a = Math.abs(psi);
+  if (a === 0) return MIC_DBL_FLOOR;
+  const dbl = 20 * Math.log10(a / DBL_REF);
+  return dbl > MIC_DBL_FLOOR ? dbl : MIC_DBL_FLOOR;
+}
+
+// Format an ISO timestamp in the browser's local timezone.  UTC values
+// (with 'Z' suffix) convert; naive values are interpreted as local clock.
+// Returns '—' for null/empty input; unparseable input is returned as-is.
+function _fmtTsLocal(iso) {
+  if (!iso) return '—';
+  const d = new Date(iso);
+  if (isNaN(d)) return iso;
+  return d.toLocaleString();
+}
+
+// Adaptive decimal formatter — scientific notation only for truly extreme
+// values.  Normal-range peaks render as plain decimals with sensible
+// precision (was previously forcing toExponential(3) which produced ugly
+// "2.500E-2 IN/S" labels).
+function _fmtPeak(v, unit) {
+  if (v == null || (typeof v === 'number' && !isFinite(v))) return '';
+  if (typeof v !== 'number') return String(v) + (unit ? ' ' + unit : '');
+  if (v === 0) return '0' + (unit ? ' ' + unit : '');
+  const a = Math.abs(v);
+  const u = unit ? ' ' + unit : '';
+  if (a >= 0.0001 && a < 10000) {
+    const d = a >= 100 ? 1 : a >= 10 ? 2 : a >= 1 ? 3 : a >= 0.1 ? 4 : 5;
+    return v.toFixed(d) + u;
+  }
+  return v.toExponential(2) + u;
+}
+
+let allEvents = [];
+let filteredEvents = [];
+let currentEventId = null;
+let charts = {};
+
+const apiBase = window.location.origin;
+
+function setStatus(msg, cls = '') {
+  const bar = document.getElementById('status-bar');
+  bar.textContent = msg;
+  bar.className = cls;
+}
+
+async function loadSerials() {
+  setStatus('Loading serials…');
+  try {
+    const r = await fetch(`${apiBase}/db/units`);
+    if (!r.ok) throw new Error(r.statusText);
+    // /db/units returns a bare list[dict], not {units:[...]}
+    const units = await r.json();
+    const sel = document.getElementById('serial-select');
+    sel.innerHTML = '';
+    if (!units || units.length === 0) {
+      sel.innerHTML = '<option value="">(no units found)</option>';
+      setStatus('No units in DB.', 'error');
+      return;
+    }
+    sel.innerHTML = '<option value="">— pick a unit —</option>' +
+      units.map(u => {
+        const n = u.total_events ?? 0;
+        return `<option value="${u.serial}">${u.serial}  (${n} events)</option>`;
+      }).join('');
+    setStatus(`Loaded ${units.length} units.`, 'ok');
+  } catch (e) {
+    setStatus(`Failed to load units: ${e.message}`, 'error');
+  }
+}
+
+async function loadEventsForSerial(serial) {
+  if (!serial) {
+    allEvents = [];
+    renderEventList();
+    return;
+  }
+  setStatus(`Loading events for ${serial}…`);
+  try {
+    const r = await fetch(`${apiBase}/db/events?serial=${encodeURIComponent(serial)}&limit=500`);
+    if (!r.ok) throw new Error(r.statusText);
+    const d = await r.json();
+    allEvents = d.events || [];
+    document.getElementById('count-pill').textContent = `${allEvents.length} events`;
+    applyFilter();
+    setStatus(`Loaded ${allEvents.length} events for ${serial}.`, 'ok');
+  } catch (e) {
+    setStatus(`Failed to load events: ${e.message}`, 'error');
+  }
+}
+
+function applyFilter() {
+  const q = document.getElementById('event-filter').value.toLowerCase().trim();
+  if (!q) {
+    filteredEvents = allEvents;
+  } else {
+    filteredEvents = allEvents.filter(ev =>
+      (ev.blastware_filename || '').toLowerCase().includes(q) ||
+      (ev.timestamp           || '').toLowerCase().includes(q) ||
+      (ev.record_type         || '').toLowerCase().includes(q) ||
+      (ev.project             || '').toLowerCase().includes(q)
+    );
+  }
+  document.getElementById('event-list-count').textContent = `${filteredEvents.length} / ${allEvents.length}`;
+  renderEventList();
+}
+
+function renderEventList() {
+  const list = document.getElementById('event-list');
+  list.innerHTML = '';
+  if (filteredEvents.length === 0) {
+    list.innerHTML = '<div style="padding:14px;color:#484f58;font-size:12px">No events.</div>';
+    return;
+  }
+  for (const ev of filteredEvents) {
+    const row = document.createElement('div');
+    row.className = 'event-row' + (ev.false_trigger ? ' false_trigger' : '');
+    if (ev.id === currentEventId) row.className += ' active';
+    const ts = _fmtTsLocal(ev.timestamp);
+    const pvs = ev.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(3)} in/s` : '—';
+    row.innerHTML = `
+      <div class="er-top">
+        <span class="er-ts">${ts || '(no ts)'}</span>
+        <span class="er-pvs">${pvs}</span>
+      </div>
+      <div class="er-meta">${ev.record_type || '?'} · ${ev.blastware_filename || ev.id.slice(0,8)}</div>
+    `;
+    row.onclick = () => loadEvent(ev.id);
+    list.appendChild(row);
+  }
+}
+
+async function loadEvent(eventId) {
+  currentEventId = eventId;
+  renderEventList();
+  setStatus('Loading waveform…');
+  try {
+    // Sidecar fetch runs in parallel — its bw_report block carries ZC
+    // Freq + above-range flags + sensor-check results that the per-
+    // channel stats table surfaces.  Failures are non-fatal (legacy
+    // events without a preserved .TXT have no sidecar bw_report).
+    const sidecarP = fetch(`${apiBase}/db/events/${eventId}/sidecar`)
+      .then(r => r.ok ? r.json() : null)
+      .catch(() => null);
+
+    const r = await fetch(`${apiBase}/db/events/${eventId}/waveform.json`);
+    if (!r.ok) {
+      if (r.status === 404) {
+        showEmpty('No waveform data for this event (codec returned no samples).');
+        setStatus('No waveform data for this event.', 'error');
+        return;
+      }
+      throw new Error(r.statusText);
+    }
+    const data = await r.json();
+    renderWaveform(data);
+    // Look up the already-loaded events-list row for a richer header (no extra request)
+    const ev = allEvents.find(e => e.id === eventId);
+    const sidecar = await sidecarP;
+    renderMeta(data, ev, sidecar);
+    setStatus(`Event loaded.`, 'ok');
+  } catch (e) {
+    setStatus(`Failed to load event: ${e.message}`, 'error');
+    showEmpty(`Error: ${e.message}`);
+  }
+}
+
+function showEmpty(msg) {
+  document.getElementById('empty-state').style.display = 'flex';
+  document.getElementById('empty-state').querySelector('p').textContent = msg;
+  document.getElementById('event-meta').style.display = 'none';
+  document.getElementById('charts').style.display = 'none';
+  Object.values(charts).forEach(c => c.destroy());
+  charts = {};
+}
+
+function renderMeta(data, ev, sidecar) {
+  const metaDiv = document.getElementById('event-meta');
+  const fields = [
+    ['Serial',      data.serial || ev?.serial || '—'],
+    ['Timestamp',   _fmtTsLocal(data.timestamp || ev?.timestamp)],
+    ['Record',      data.record_type || ev?.record_type || '—'],
+    ['Sample rate', data.sample_rate ? `${data.sample_rate} sps` : '—'],
+    ['Geo range',   data.geo_range ? `${data.geo_range} (${data.geo_full_scale_ips} in/s FS)` : '—'],
+    ['Project',     ev?.project || '—'],
+    ['Location',    ev?.sensor_location || '—'],
+    ['Peak Vector Sum',
+                    ev?.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(4)} in/s` : '—'],
+  ];
+
+  // Per-channel stats table mirroring the printout's middle block.
+  // PPV from the events DB row; ZC Freq + saturation flags from the
+  // sidecar's bw_report block (when a .TXT was preserved on ingest).
+  const bwrPeaks = (sidecar?.bw_report || {}).peaks || {};
+  const bwrMic   = (sidecar?.bw_report || {}).mic   || {};
+  const fmt = v => (v == null ? '—' : (typeof v === 'number' ? v.toFixed(3) : v));
+  const fmtZc = bwr => {
+    if (!bwr || bwr.zc_freq_hz == null) return '—';
+    const prefix = bwr.zc_freq_above_range ? '>' : '';
+    return `${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
+  };
+  const rows = [
+    ['Tran', ev?.tran_ppv, fmtZc(bwrPeaks.tran)],
+    ['Vert', ev?.vert_ppv, fmtZc(bwrPeaks.vert)],
+    ['Long', ev?.long_ppv, fmtZc(bwrPeaks.long)],
+  ];
+  // Mic display honors the current user preference (dBL default).
+  // mic_ppv is stored as raw psi on series3 events; convert when needed.
+  const micPsi = ev?.mic_ppv;
+  const micUnitDisplay = _getMicUnit();
+  let micStr;
+  if (micPsi == null) {
+    micStr = '—';
+  } else if (micUnitDisplay === 'dBL') {
+    const d = _psiToDbl(Number(micPsi));
+    micStr = (d != null ? d.toFixed(1) : '—') + ' dBL';
+  } else {
+    micStr = Number(micPsi).toExponential(2) + ' psi';
+  }
+  const statsHtml = `
+    <table class="stats-table">
+      <thead>
+        <tr><th>Channel</th><th>PPV (in/s)</th><th>ZC Freq</th></tr>
+      </thead>
+      <tbody>
+        ${rows.map(([ch, ppv, zc]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td><td>${zc}</td></tr>`).join('')}
+        <tr><td>MicL</td><td>${micStr}</td><td>${fmtZc(bwrMic)}</td></tr>
+      </tbody>
+    </table>
+  `;
+
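+  const esc = s => String(s).replace(/[&<>"']/g, c => `&#${c.charCodeAt(0)};`);  // values are free text; escape before innerHTML
+  metaDiv.innerHTML = fields.map(([l, v]) =>
+      `<div class="meta-field"><span class="mf-label">${l}</span><span class="mf-value${l === 'Peak Vector Sum' ? ' highlight' : ''}">${esc(v)}</span></div>`
+    ).join('') + statsHtml;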
+  metaDiv.style.display = 'grid';
+}
+
+function togglePrintView() {
+  document.body.classList.toggle('print-view');
+  // Force chart redraw so axis/grid colors are re-evaluated against the
+  // new background.  Easiest: re-render the current event.
+  if (currentEventId) {
+    loadEvent(currentEventId);
+  }
+}
+
+function renderWaveform(data) {
+  document.getElementById('empty-state').style.display = 'none';
+  const chartsDiv = document.getElementById('charts');
+  chartsDiv.style.display = 'flex';
+  chartsDiv.innerHTML = '';
+  Object.values(charts).forEach(c => c.destroy());
+  charts = {};
+
+  const channels = data.channels || {};
+  // time_axis is METADATA from sfm.plot.v1 — sample_rate, pretrig_samples,
+  // t0_ms (first-sample time relative to trigger; negative when pretrig
+  // exists), dt_ms.  Trigger is at t=0 by convention.
+  const ta    = data.time_axis || {};
+  const sr    = ta.sample_rate || 1024;
+  const dtMs  = ta.dt_ms || (1000.0 / sr);
+  const t0Ms  = ta.t0_ms != null ? ta.t0_ms : 0;
+  const isPrintMode = document.body.classList.contains('print-view');
+  // Histograms record per-interval peaks (typically 1 per minute/5-min),
+  // not per-sample waveforms.  Render as a tight bar graph instead of a
+  // line plot — matches the BW Event Report's histogram presentation.
+  const isHistogram = String(data.record_type || '').toLowerCase().includes('histogram');
+
+  // Which channels actually have data → determines which one renders the
+  // shared x-axis at the bottom (Instantel printout has the time scale
+  // only on the bottom-most chart).
+  const channelsWithData = CHANNEL_ORDER.filter(ch =>
+    channels[ch] && (channels[ch].values || []).length > 0
+  );
+  const lastDataCh = channelsWithData[channelsWithData.length - 1];
+
+  const micUnit = _getMicUnit();
+  for (const ch of CHANNEL_ORDER) {
+    const chData = channels[ch];
+    if (!chData) continue;
+    if ((chData.values || []).length === 0) {
+      // Render an empty card so user sees the channel exists but is missing
+      const wrap = document.createElement('div');
+      wrap.className = 'chart-wrap';
+      wrap.innerHTML = `
+        <div class="chart-label ch-${ch.toLowerCase()}">
+          <span>${ch}</span>
+          <span style="color:#484f58">no samples decoded</span>
+        </div>
+        <div class="chart-canvas-wrap" style="display:flex;align-items:center;justify-content:center;color:#484f58;font-size:12px">empty</div>
+      `;
+      chartsDiv.appendChild(wrap);
+      continue;
+    }
+
+    // Mic channel: convert from raw psi to dB(L) when the user prefers dBL
+    // (the default).  We mutate `values`, `peak`, and `unit` locally so the
+    // chart datasets + axis title + tooltip + peak label all stay aligned.
+    let values = chData.values || [];
+    let unit  = chData.unit || 'unit';
+    let peak  = chData.peak;
+    const peakT = chData.peak_t_ms;
+    if (ch === 'MicL' && unit === 'psi' && micUnit === 'dBL') {
+      // Per-sample chart uses rectified-and-floored conversion so the
+      // baseline is continuous; the peak label uses the unrectified
+      // converter to preserve the true measurement.
+      values = values.map(_psiToDblForChart);
+      peak   = _psiToDbl(peak);
+      unit   = 'dB(L)';
+    }
+
+    const peakLabel = peak != null
+      ? `peak ${_fmtPeak(peak, unit)}`
+        + (!isHistogram && peakT != null ? ` @ ${peakT.toFixed(1)} ms` : '')
+      : '';
+    // Hide x-axis on every chart except the bottom-most data channel —
+    // gives the "single shared time axis" feel of the BW printout.
+    const showXAxis = (ch === lastDataCh);
+
+    const wrap = document.createElement('div');
+    wrap.className = 'chart-wrap';
+    const lbl = document.createElement('div');
+    lbl.className = `chart-label ch-${ch.toLowerCase()}`;
+    lbl.innerHTML = `<span>${ch}</span><span style="color:#8b949e;font-weight:normal">${peakLabel}</span>`;
+    wrap.appendChild(lbl);
+
+    const canvasWrap = document.createElement('div');
+    canvasWrap.className = 'chart-canvas-wrap';
+    const canvas = document.createElement('canvas');
+    canvasWrap.appendChild(canvas);
+    wrap.appendChild(canvasWrap);
+    chartsDiv.appendChild(wrap);
+
+    // Waveform: per-sample time in ms relative to trigger (negative for pretrig).
+    // Histogram: when the server has aggregated to BW-reported intervals AND
+    // provides per-interval timestamps, use those as x-axis labels (HH:MM:SS).
+    // Falls back to interval index.
+    let times;
+    if (isHistogram) {
+      const intervalTimes = ta.interval_times || [];
+      times = (intervalTimes.length === values.length)
+        ? intervalTimes
+        : values.map((_, i) => i + 1);
+    } else {
+      times = values.map((_, i) => t0Ms + i * dtMs);
+    }
+
+    // Downsample for rendering (plain stride decimation; the peak label above still comes from chData.peak)
+    const MAX_POINTS = 4000;
+    let rT = times, rV = values;
+    if (values.length > MAX_POINTS) {
+      const step = Math.ceil(values.length / MAX_POINTS);
+      rT = times.filter((_, i) => i % step === 0);
+      rV = values.filter((_, i) => i % step === 0);
+    }
+
+    // Tick formatter — round to 1 decimal so we don't get
+    // "11.7187040000000002 ms" garbage from floating-point accumulation.
+    const xAxisUnit = isHistogram ? '' : ' ms';
+    const fmtTick = i => {
+      const v = rT[i];
+      if (typeof v !== 'number') return String(v) + xAxisUnit;
+      return (Number.isInteger(v) ? String(v) : v.toFixed(1)) + xAxisUnit;
+    };
+
+    // Y-axis bounds.  Geophone waveforms render symmetric around zero
+    // (seismograph convention — zero line in the middle, signal goes
+    // up AND down).  Mic + histograms keep default auto-scale (always
+    // positive values; zero at the bottom).
+    let yBounds = {};
+    const isGeo = ch !== 'MicL';
+    if (isGeo && !isHistogram) {
+      // Waveform geo: symmetric around zero for full shape detail.
+      let absMax = 0;
+      for (const v of values) {
+        const a = Math.abs(v);
+        if (a > absMax) absMax = a;
+      }
+      const padded = (absMax || 1) * 1.10;
+      yBounds = { min: -padded, max: padded };
+    } else if (isGeo && isHistogram) {
+      // Histogram geo: enforce minimum chart range so quiet events
+      // look quiet (matches BW's near-fixed-scale convention).
+      const HIST_GEO_MIN_INS = 0.05;
+      let p = 0;
+      for (const v of values) { const a = Math.abs(v); if (a > p) p = a; }
+      yBounds = { min: 0, max: Math.max(p * 1.10, HIST_GEO_MIN_INS) };
+    } else if (ch === 'MicL' && micUnit === 'dBL') {
+      // Mic dBL: baseline at noise-floor minimum, top at peak + 5 dB.
+      const peakDbl = (typeof peak === 'number' && isFinite(peak))
+        ? peak + 5
+        : 100;
+      yBounds = { min: MIC_DBL_FLOOR, max: Math.max(peakDbl, MIC_DBL_FLOOR + 20) };
+    } else if (ch === 'MicL' && isHistogram && micUnit === 'psi') {
+      // Mic histogram in psi: same minimum-range treatment as geo.
+      const HIST_MIC_MIN_PSI = 0.001;
+      let p = 0;
+      for (const v of values) { const a = Math.abs(v); if (a > p) p = a; }
+      yBounds = { min: 0, max: Math.max(p * 1.10, HIST_MIC_MIN_PSI) };
+    }
+
+    const chart = new Chart(canvas, {
+      type: isHistogram ? 'bar' : 'line',
+      data: {
+        labels: rT.map(t => (typeof t === 'number' ? (Number.isInteger(t) ? String(t) : t.toFixed(2)) : t)),
+        datasets: isHistogram ? [{
+          data: rV,
+          backgroundColor: CHANNEL_COLORS[ch],
+          borderWidth: 0,
+          barPercentage: 1.0,
+          categoryPercentage: 1.0,  // bars touch — tight bargraph
+        }] : [{
+          data: rV,
+          borderColor: CHANNEL_COLORS[ch],
+          borderWidth: 1,
+          pointRadius: 0,
+          tension: 0,
+        }],
+      },
+      options: {
+        animation: false,
+        responsive: true,
+        maintainAspectRatio: false,
+        plugins: {
+          legend: { display: false },
+          tooltip: {
+            mode: 'index',
+            intersect: false,
+            callbacks: {
+              title: items => isHistogram
+                ? `interval ${items[0].label}`
+                : `t = ${items[0].label} ms`,
+              label: item => `${ch}: ${_fmtPeak(item.raw, unit)}`,
+            },
+          },
+        },
+        scales: {
+          x: {
+            type: 'category',
+            display: showXAxis,
+            ticks: {
+              color: isPrintMode ? '#666' : '#484f58',
+              maxTicksLimit: 10,
+              maxRotation: 0,
+              callback: (val, i) => fmtTick(i),
+            },
+            grid: { color: isPrintMode ? '#e0e0e0' : '#21262d', drawTicks: showXAxis },
+          },
+          y: {
+            ...yBounds,
+            ticks: { color: isPrintMode ? '#666' : '#484f58', maxTicksLimit: 5 },
+            grid: { color: isPrintMode ? '#e0e0e0' : '#21262d' },
+            title: { display: true, text: unit,
+                     color: isPrintMode ? '#666' : '#484f58', font: { size: 10 } },
+          },
+        },
+      },
+      plugins: isHistogram ? [] : [{
+        // Trigger line @ t=0 + triangle markers above/below + "0.0"
+        // baseline label on the right edge.  Matches the Instantel
+        // BW Event Report printout style.  Skipped for histograms —
+        // they have no trigger event.
+        id: 'instantelOverlays',
+        afterDraw(chart) {
+          const ctx   = chart.ctx;
+          const xAxis = chart.scales.x;
+          const yAxis = chart.scales.y;
+          const fgPrim = isPrintMode ? '#000' : '#c9d1d9';
+          const fgTrigger = '#f85149';
+
+          // Dashed vertical trigger line at t=0
+          const zeroIdx = rT.findIndex(t => parseFloat(t) >= 0);
+          if (zeroIdx >= 0) {
+            const x = xAxis.getPixelForValue(zeroIdx);
+            ctx.save();
+            ctx.beginPath();
+            ctx.moveTo(x, yAxis.top);
+            ctx.lineTo(x, yAxis.bottom);
+            ctx.strokeStyle = isPrintMode ? '#cc0000' : 'rgba(248, 81, 73, 0.8)';
+            ctx.lineWidth = 1.2;
+            ctx.setLineDash([4, 3]);
+            ctx.stroke();
+            ctx.restore();
+
+            // Triangles above and below the chart at the trigger column
+            ctx.save();
+            ctx.fillStyle = fgTrigger;
+            ctx.beginPath();  // top triangle pointing down
+            ctx.moveTo(x - 5, yAxis.top - 8);
+            ctx.lineTo(x + 5, yAxis.top - 8);
+            ctx.lineTo(x,     yAxis.top - 1);
+            ctx.closePath();
+            ctx.fill();
+            ctx.beginPath();  // bottom triangle pointing up
+            ctx.moveTo(x - 5, yAxis.bottom + 8);
+            ctx.lineTo(x + 5, yAxis.bottom + 8);
+            ctx.lineTo(x,     yAxis.bottom + 1);
+            ctx.closePath();
+            ctx.fill();
+            ctx.restore();
+          }
+
+          // "0.0" baseline label on the right edge — printout convention.
+          // Position vertically at the zero-amplitude level.
+          const zeroY = yAxis.getPixelForValue(0);
+          if (zeroY >= yAxis.top && zeroY <= yAxis.bottom) {
+            ctx.save();
+            ctx.strokeStyle = isPrintMode ? '#aaa' : '#30363d';
+            ctx.lineWidth = 0.8;
+            ctx.setLineDash([2, 2]);
+            ctx.beginPath();
+            ctx.moveTo(xAxis.left, zeroY);
+            ctx.lineTo(xAxis.right, zeroY);
+            ctx.stroke();
+            ctx.restore();
+
+            ctx.save();
+            ctx.fillStyle = fgPrim;
+            ctx.font = '11px monospace';
+            ctx.textAlign = 'left';
+            ctx.textBaseline = 'middle';
+            ctx.fillText('0.0', xAxis.right + 6, zeroY);
+            ctx.restore();
+          }
+        },
+      }],
+    });
+    charts[ch] = chart;
+  }
+}
+
+// Wire up handlers
+document.getElementById('serial-select').addEventListener('change', e => {
+  loadEventsForSerial(e.target.value);
+});
+document.getElementById('event-filter').addEventListener('input', applyFilter);
+
+// Reflect any persisted mic-unit preference in the header pill on load
+_refreshMicUnitToggle();
+
+// Initial load
+loadSerials();
+</script>
+</body>
+</html>
@@ -0,0 +1,939 @@
+"""
+sfm/report_pdf.py — generate Instantel-style Event Report PDFs.
+
+Stub layout for v0.20.0: the exact visual is still being iterated against
+actual Blastware reference PDFs (uploaded to docs/reference/instantel/).
+The current output captures all the data fields a real BW Event Report
+contains, but the visual hierarchy and spacing are still approximate.
+
+Architecture
+────────────
+1. ``gather_report_data(db, store, event_id)`` assembles a flat
+   ``ReportData`` from three sources: the SeismoDb events row, the
+   .sfm.json sidecar (bw_report block), and the .h5 waveform samples.
+   Returns ``None`` when the event row is missing or has no serial/filename.
+
+2. ``render_event_report_pdf(rd)`` takes that ``ReportData`` and produces
+   a single-page letter-sized PDF as bytes, using matplotlib's PDF
+   backend (vector output, no rasterization, prints cleanly).
+
+3. The HTTP endpoint at ``/db/events/{id}/report.pdf`` wires them
+   together: fetch event → gather → render → stream bytes back with
+   ``Content-Type: application/pdf``.
+
+What's in the report (every field BW's printout includes):
+
+  Header (left):  Date/Time, Trigger Source, Range, Sample Rate, Notes,
+                  Project, Client, User Name, Seis. Loc
+  Header (right): Serial + firmware, Battery, Calibration, File Name,
+                  Post Event Notes
+  Mic block:      PSPL (dBL + psi), ZC Freq, Channel Test result
+  Stats table:    per-channel PPV / ZC Freq / Time of Peak /
+                  Peak Acceleration / Peak Displacement / Sensor Check
+  Peak Vector Sum
+  Waveform plot:  4 channels stacked (MicL/Long/Vert/Tran), shared
+                  time axis, trigger marker, peak markers
+  USBM RI8507/OSMRE compliance chart:  STUBBED — separate work item
+
+Histogram events: the layout differs (Number of Intervals header
+field, no trigger marker, per-interval bar chart instead of waveform).
+Handled via a record_type branch in ``render_event_report_pdf``.
+"""
+
+from __future__ import annotations
+
+import io
+import json
+import logging
+import math
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Optional
+
+import matplotlib
+matplotlib.use("Agg")   # headless — no display required
+import matplotlib.pyplot as plt
+import numpy as np
+from matplotlib.backends.backend_pdf import PdfPages
+
+log = logging.getLogger(__name__)
+
+
+# Reference pressure for dB(L) conversion: 20 µPa expressed in psi.
+DBL_REF_PSI = 2.9e-9
+
+
+# ── Data assembly ────────────────────────────────────────────────────────────
+
+
+@dataclass
+class ReportData:
+    """All fields needed to render an Instantel-style Event Report.
+
+    Most fields are Optional — BW's printout shows '—' or just omits
+    sections when source data is missing.  The renderer mirrors that.
+    """
+    # Header — left column
+    event_datetime_str: Optional[str] = None
+    trigger_source:     Optional[str] = None
+    geo_range_str:      Optional[str] = None
+    sample_rate_str:    Optional[str] = None
+    notes:              Optional[str] = None
+    project:            Optional[str] = None
+    client:             Optional[str] = None
+    operator:           Optional[str] = None
+    sensor_location:    Optional[str] = None
+
+    # Header — right column
+    serial:                 Optional[str] = None
+    firmware:               Optional[str] = None
+    battery_volts:          Optional[float] = None
+    calibration_date:       Optional[str] = None
+    calibration_by:         Optional[str] = None
+    file_name:              Optional[str] = None
+    post_event_notes:       Optional[str] = None
+
+    # Microphone block
+    mic_pspl_dbl:           Optional[float] = None
+    mic_pspl_psi:           Optional[float] = None
+    mic_pspl_time_s:        Optional[float] = None
+    mic_pspl_when_str:      Optional[str] = None    # histogram absolute date+time, BW-formatted
+    mic_zc_freq_hz:         Optional[float] = None
+    mic_zc_freq_above_range: bool           = False
+    mic_channel_test_result: Optional[str] = None
+    mic_channel_test_freq_hz: Optional[float] = None
+    mic_channel_test_amp_mv: Optional[float] = None
+
+    # Per-channel stats — list of dicts (one per channel)
+    # Keys: name, ppv_ips, zc_freq_hz, time_of_peak_s,
+    #       peak_accel_g, peak_disp_in, sensor_check
+    channel_stats:          list[dict] = field(default_factory=list)
+
+    # Peak Vector Sum
+    peak_vector_sum_ips:    Optional[float] = None
+    peak_vector_sum_time_s: Optional[float] = None
+
+    # Waveform samples — channels[ch] = list of floats in physical units
+    # Time axis derived from sample_rate + pretrig_samples
+    channels:               dict = field(default_factory=dict)
+    sample_rate_sps:        Optional[int] = None
+    pretrig_samples:        Optional[int] = None
+    t0_ms:                  Optional[float] = None
+    dt_ms:                  Optional[float] = None
+
+    # Record-type discriminator
+    record_type:            Optional[str] = None
+    is_histogram:           bool = False
+
+    # Histogram-only fields: populated only when record_type starts with 'Hist'
+    histogram_start_str:    Optional[str] = None       # "22:30:38 May 16, 2026"
+    histogram_stop_str:     Optional[str] = None
+    histogram_n_intervals:  Optional[float] = None     # 4.00
+    histogram_interval_size: Optional[str] = None      # "1 minute"
+    histogram_interval_size_s: Optional[float] = None  # 60.0 — numeric seconds, used to derive interval_times
+    histogram_interval_times: list[str] = field(default_factory=list)  # per-interval timestamps for x-axis
+
+    # Peak Vector Sum metadata (histograms show absolute date+time)
+    peak_vector_sum_when_str: Optional[str] = None
+
+    # Bookkeeping
+    event_id:               Optional[str] = None
+    server_received_at:     Optional[str] = None
+    bw_pc_sw_version:       Optional[str] = None
+
+
+def gather_report_data(
+    db,
+    store,
+    event_id: str,
+) -> Optional[ReportData]:
+    """Collect every field needed to render an event report.
+
+    Returns ``None`` if the event is unknown or its row lacks a serial
+    or Blastware filename.  A missing sidecar or .h5 is tolerated: the
+    corresponding fields are left empty rather than failing the report.
+    """
+    row = db.get_event(event_id)
+    if row is None:
+        return None
+    serial   = row.get("serial")
+    filename = row.get("blastware_filename")
+    if not serial or not filename:
+        return None
+
+    rd = ReportData(
+        event_id=event_id,
+        serial=serial,
+        file_name=filename,
+        record_type=row.get("record_type"),
+        is_histogram=str(row.get("record_type", "")).lower().startswith("hist"),
+        event_datetime_str=row.get("timestamp"),
+        sample_rate_sps=row.get("sample_rate"),
+        project=row.get("project"),
+        client=row.get("client"),
+        operator=row.get("operator"),
+        sensor_location=row.get("sensor_location"),
+        server_received_at=row.get("created_at"),
+    )
+
+    # ── Sidecar bw_report — the rich BW-derived fields ──
+    sidecar_path = store.sidecar_path_for(serial, filename)
+    if sidecar_path.exists():
+        try:
+            sc = json.loads(sidecar_path.read_text())
+        except Exception as exc:
+            log.warning("gather_report_data: sidecar read failed: %s", exc)
+            sc = {}
+        bw = sc.get("bw_report") or {}
+
+        # Trigger / range / sample-rate display
+        trig = bw.get("trigger") or {}
+        rd.trigger_source = (
+            f"{trig.get('channel','')}: {trig.get('geo_level_ips')} in/s"
+            if trig.get("channel") or trig.get("geo_level_ips") is not None
+            else None
+        )
+        rec = bw.get("recording") or {}
+        rd.geo_range_str = (
+            f"Geo: {rec.get('geo_range_ips')} in/s"
+            if rec.get("geo_range_ips") is not None else None
+        )
+        rt = rec.get("record_time_s")
+        if rt is not None and rd.sample_rate_sps:
+            rd.sample_rate_str = f"{rt:.1f} sec At {rd.sample_rate_sps} Sps"
+
+        # Device block
+        dev = bw.get("device") or {}
+        rd.battery_volts    = dev.get("battery_volts")
+        rd.calibration_date = dev.get("calibration_date")
+        rd.calibration_by   = dev.get("calibration_by")
+        rd.firmware         = bw.get("version")
+        rd.bw_pc_sw_version = bw.get("pc_sw_version")
+
+        # Microphone block
+        mic = bw.get("mic") or {}
+        rd.mic_pspl_dbl    = mic.get("pspl_dbl")
+        if rd.mic_pspl_dbl is not None and rd.mic_pspl_dbl > 0:
+            # Inverse of the dBL formula → psi.  Mirrors waveform_codec convention.
+            rd.mic_pspl_psi = DBL_REF_PSI * (10 ** (rd.mic_pspl_dbl / 20))
+        rd.mic_pspl_time_s = mic.get("time_of_peak_s")
+        rd.mic_zc_freq_hz             = mic.get("zc_freq_hz")
+        rd.mic_zc_freq_above_range    = bool(mic.get("zc_freq_above_range"))
+        sc_mic = (bw.get("sensor_check") or {}).get("mic") or {}
+        rd.mic_channel_test_result   = sc_mic.get("result")
+        rd.mic_channel_test_freq_hz  = sc_mic.get("freq_hz")
+        rd.mic_channel_test_amp_mv   = sc_mic.get("amplitude_mv")
+
+        # Per-channel stats (Tran / Vert / Long).  Per-channel peak
+        # date+time for histograms comes from bw_report.histogram.channel_peak_when
+        # (populated when the parser captured it; see the bw_ascii_report
+        # parser's histogram-fields handler).
+        peaks = bw.get("peaks") or {}
+        sc_block = bw.get("sensor_check") or {}
+        hist_block = bw.get("histogram") or {}
+        peak_when = hist_block.get("channel_peak_when") or {}
+        for ch_lc, ch_label in (("tran", "Tran"), ("vert", "Vert"), ("long", "Long")):
+            ch = peaks.get(ch_lc) or {}
+            sc_ch = sc_block.get(ch_lc) or {}
+            ch_when_iso = peak_when.get(ch_label)
+            peak_date, peak_time = _split_iso_to_date_time(ch_when_iso)
+            rd.channel_stats.append({
+                "name":               ch_label,
+                "ppv_ips":            ch.get("ppv_ips"),
+                "zc_freq_hz":         ch.get("zc_freq_hz"),
+                "zc_freq_above_range": bool(ch.get("zc_freq_above_range")),
+                "time_of_peak_s":     ch.get("time_of_peak_s"),
+                "peak_accel_g":       ch.get("peak_accel_g"),
+                "peak_disp_in":       ch.get("peak_disp_in"),
+                "sensor_check":       sc_ch.get("result"),
+                "peak_date":          peak_date,
+                "peak_time":          peak_time,
+            })
+
+        # MicL peak time (used in the mic block — "PSPL ... on DATE at TIME")
+        mic_when_iso = peak_when.get("MicL")
+        rd.mic_pspl_when_str = _fmt_iso_to_bw(mic_when_iso) if mic_when_iso else None
+
+        # Peak Vector Sum
+        vs = peaks.get("vector_sum") or {}
+        rd.peak_vector_sum_ips    = vs.get("ips")
+        rd.peak_vector_sum_time_s = vs.get("time_s")
+        # PVS absolute date+time (histograms).  Same formatting as Mic.
+        pvs_when_iso = vs.get("when")
+        rd.peak_vector_sum_when_str = _fmt_iso_to_bw(pvs_when_iso) if pvs_when_iso else None
+
+        # Histogram-specific header fields — keys match the projection in
+        # _bw_report_to_dict ("start" / "stop", not "_str" suffixed).
+        if rd.is_histogram:
+            rd.histogram_start_str   = hist_block.get("start") or rd.event_datetime_str
+            rd.histogram_stop_str    = hist_block.get("stop")
+            rd.histogram_n_intervals = hist_block.get("n_intervals")
+            rd.histogram_interval_size = hist_block.get("interval_size")
+            rd.histogram_interval_size_s = hist_block.get("interval_size_s")
+            rd.histogram_interval_times = hist_block.get("interval_times") or []
+
+    # ── Waveform samples — from the .h5 via the existing helper ──
+    from sfm import event_hdf5
+    h5_path = store.hdf5_path_for(serial, filename)
+    if h5_path.exists():
+        try:
+            wf = event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
+            rd.channels = {
+                ch: (chd.get("values") or [])
+                for ch, chd in (wf.get("channels") or {}).items()
+            }
+            ta = wf.get("time_axis") or {}
+            rd.sample_rate_sps  = rd.sample_rate_sps or ta.get("sample_rate")
+            rd.pretrig_samples  = ta.get("pretrig_samples")
+            rd.t0_ms            = ta.get("t0_ms")
+            rd.dt_ms            = ta.get("dt_ms")
+        except Exception as exc:
+            log.warning("gather_report_data: hdf5 read failed: %s", exc)
+
+    # ── Histogram aggregation ──
+    # Codec emits ~N per-block samples (typically 1/sec); BW reports
+    # one bar per configured interval (1 min / 5 min / etc.).  When
+    # bw_report.histogram.n_intervals is populated (events ingested
+    # with the parser extension), group max-per-group to match.  Also
+    # derives per-interval timestamps for the x-axis.  No-op for
+    # waveform events or when n_intervals is missing.
+    if rd.is_histogram and rd.histogram_n_intervals and rd.histogram_n_intervals >= 1:
+        n = int(rd.histogram_n_intervals)
+        for ch, vals in list(rd.channels.items()):
+            if not vals:
+                continue
+            per_group = len(vals) // n
+            remainder = len(vals) % n
+            agg: list = []
+            offset = 0
+            for i in range(n):
+                grp_size = per_group + (1 if i < remainder else 0)
+                if grp_size > 0:
+                    grp = vals[offset:offset + grp_size]
+                    agg.append(max((abs(v) for v in grp if v is not None), default=0))
+                    offset += grp_size
+                else:
+                    agg.append(0)
+            rd.channels[ch] = agg
+        # Derive per-interval HH:MM:SS labels if we have the start time + size
+        if rd.histogram_start_str and rd.histogram_interval_size_s and not rd.histogram_interval_times:
+            try:
+                import datetime as _dt
+                start = _dt.datetime.fromisoformat(rd.histogram_start_str)
+                rd.histogram_interval_times = [
+                    (start + _dt.timedelta(seconds=(i + 1) * rd.histogram_interval_size_s)).strftime("%H:%M:%S")
+                    for i in range(n)
+                ]
+            except Exception:
+                pass
+
+    return rd
+
+
+# ── PDF rendering ────────────────────────────────────────────────────────────
+
+
+def render_event_report_pdf(rd: ReportData) -> bytes:
+    """Render a ``ReportData`` to a single-page letter PDF.
+
+    Branches on ``rd.is_histogram`` — waveform and histogram layouts
+    differ in their header fields, stats-table rows, and bottom plot.
+    Layout modeled on Blastware's Event Report PDFs (samples in
+    docs/reference/instantel/).
+    """
+    # Letter portrait — 8.5"×11"
+    fig = plt.figure(figsize=(8.5, 11), dpi=100)
+    fig.patch.set_facecolor("white")
+
+    if rd.is_histogram:
+        _render_histogram_layout(fig, rd)
+    else:
+        _render_waveform_layout(fig, rd)
+
+    # Page footer (common to both layouts) — Created date + event id.
+    # Pushed to the very page bottom so it doesn't collide with the
+    # waveform footer scale / trigger legend lines just above.
+    # Convert UTC server_received_at to local for display.
+    created_local = _fmt_iso_to_bw(rd.server_received_at) if rd.server_received_at else "—"
+    fig.text(
+        0.07, 0.005,
+        f"Created: {created_local}  •  seismo-relay",
+        fontsize=6, color="#888", ha="left",
+    )
+    fig.text(
+        0.93, 0.005,
+        f"Event {rd.event_id[:8] if rd.event_id else '—'}",
+        fontsize=6, color="#888", ha="right",
+    )
+
+    buf = io.BytesIO()
+    fig.savefig(buf, format="pdf")
+    plt.close(fig)
+    return buf.getvalue()
+
+
+def _render_waveform_layout(fig, rd: ReportData) -> None:
+    """Waveform layout: header / mic+USBM / per-channel stats / waveform plot.
+
+    Stats table includes Time (Rel. to Trig), Peak Accel, Peak Disp.
+    Left margin sized to fit the channel labels (MicL/Long/Vert/Tran).
+    Extra bottom margin reserves space for x-axis tick labels +
+    "Amplitude Geo: X in/s/div Mic: Y psi(L)/div" footer + trigger
+    legend without overlap.
+    """
+    gs = fig.add_gridspec(
+        nrows=4, ncols=1,
+        left=0.11, right=0.94, top=0.97, bottom=0.12,
+        height_ratios=[1.7, 2.0, 1.8, 5.5],
+        hspace=0.35,
+    )
+    ax_header = fig.add_subplot(gs[0]); ax_header.axis("off")
+    _draw_header_waveform(ax_header, rd)
+
+    ax_mid = fig.add_subplot(gs[1]); ax_mid.axis("off")
+    _draw_mic_and_usbm(ax_mid, rd)
+
+    ax_stats = fig.add_subplot(gs[2]); ax_stats.axis("off")
+    _draw_channel_stats_waveform(ax_stats, rd)
+
+    _draw_waveform_subplot(fig, gs[3], rd)
+
+
+def _render_histogram_layout(fig, rd: ReportData) -> None:
+    """Histogram layout: header / mic-only / per-channel stats / bar plot.
+
+    No USBM compliance chart (it's a waveform-only concept).  Stats table
+    uses Date + Time-of-peak instead of relative-time + accel + disp.
+    Left margin sized to fit the channel labels.  Extra bottom margin
+    leaves room for the x-axis time labels + footer scale legend
+    without overlap.
+    """
+    gs = fig.add_gridspec(
+        nrows=4, ncols=1,
+        left=0.11, right=0.94, top=0.97, bottom=0.12,
+        height_ratios=[1.8, 0.9, 1.7, 5.6],
+        hspace=0.35,
+    )
+    ax_header = fig.add_subplot(gs[0]); ax_header.axis("off")
+    _draw_header_histogram(ax_header, rd)
+
+    ax_mic = fig.add_subplot(gs[1]); ax_mic.axis("off")
+    _draw_mic_only(ax_mic, rd)
+
+    ax_stats = fig.add_subplot(gs[2]); ax_stats.axis("off")
+    _draw_channel_stats_histogram(ax_stats, rd)
+
+    _draw_histogram_subplot(fig, gs[3], rd)
+
+
+def _to_display_local(iso: str):
+    """Parse an ISO timestamp and return a datetime in the system's local
+    timezone (set by the TZ env var, default America/New_York via the
+    Dockerfile).
+
+    Behaviour:
+      - "...Z" or "...+HH:MM" suffix → tz-aware datetime → converted to local
+      - Naïve "YYYY-MM-DDTHH:MM:SS" (no tz) → returned as-is.  This
+        matches the convention used elsewhere in seismo-relay: BW's
+        recorded-at timestamps are naïve and ALREADY in the unit's
+        local clock; we don't second-guess them.
+    """
+    import datetime as _dt
+    dt = _dt.datetime.fromisoformat(iso.replace("Z", "+00:00"))
+    if dt.tzinfo is not None:
+        # Convert from UTC (or other tz) → local per the TZ env var.
+        # astimezone() without arg uses the system timezone.
+        dt = dt.astimezone()
+    return dt
+
+
+def _fmt_iso_to_bw(iso: Optional[str]) -> Optional[str]:
+    """Convert an ISO-8601 timestamp to BW's display format
+    '22:30:37 May 16, 2026'.  UTC inputs (with Z suffix) are
+    converted to the system's local timezone first; naïve inputs
+    are formatted as-is.  Returns input unchanged on parse failure."""
+    if not iso or "T" not in iso:
+        return iso
+    try:
+        return _to_display_local(iso).strftime("%H:%M:%S %B %d, %Y").replace(" 0", " ")
+    except Exception:
+        return iso
+
+
+def _split_iso_to_date_time(iso: Optional[str]) -> tuple[Optional[str], Optional[str]]:
+    """Split an ISO timestamp into BW-formatted ('May 27 /26', '06:06:14')
+    date+time strings.  Used for the histogram stats table where the
+    Date and Time rows are presented separately.  UTC inputs are
+    converted to local time first.  Returns (None, None) on parse failure."""
+    if not iso:
+        return (None, None)
+    try:
+        dt = _to_display_local(iso)
+        # BW format: 'May 27 /26' (3-letter month + 2-digit year)
+        date_str = dt.strftime("%b %d /%y").replace(" 0", " ")
+        time_str = dt.strftime("%H:%M:%S")
+        return (date_str, time_str)
+    except Exception:
+        return (None, None)
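Review note: a minimal standalone sketch of the timestamp handling in the three helpers above, using only the standard library. The names `to_display_local` and `fmt_bw` are illustrative stand-ins, not the module's API. It shows that a tz-aware input still denotes the same instant after conversion, that a naive input passes through untouched, and that `.replace(" 0", " ")` strips the leading zero from the day.

```python
import datetime as dt


def to_display_local(iso: str) -> dt.datetime:
    """Parse an ISO timestamp; convert to system-local time only if tz-aware."""
    parsed = dt.datetime.fromisoformat(iso.replace("Z", "+00:00"))
    if parsed.tzinfo is not None:
        # astimezone() with no argument converts to the system timezone.
        parsed = parsed.astimezone()
    return parsed


def fmt_bw(iso: str) -> str:
    """Format as 'HH:MM:SS Month D, YYYY' (day without a leading zero)."""
    text = to_display_local(iso).strftime("%H:%M:%S %B %d, %Y")
    # Only the day can follow a space with a leading zero here: the time
    # starts the string and the year never begins with "0".
    return text.replace(" 0", " ")


naive = to_display_local("2026-05-06T22:30:37")   # stays naive
aware = to_display_local("2026-05-16T22:30:37Z")  # same instant, local zone
print(naive.tzinfo, aware.tzinfo is not None)
print(fmt_bw("2026-05-06T22:30:37"))
```

The rendered wall-clock time of an aware input depends on the host's TZ setting, so only the naive case has a fixed printed form.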
+
+
+def _kv(ax, x, y, label, value, *, label_w=0.18):
+    """Render a 'Label  Value' row at axes-coordinates (x, y)."""
+    ax.text(x, y, label, fontsize=8, color="#555", ha="left", va="top",
+            transform=ax.transAxes)
+    ax.text(x + label_w, y, _fmt(value), fontsize=8, ha="left", va="top",
+            transform=ax.transAxes, family="monospace")
+
+
+def _fmt(v):
+    """Format any field for display — '—' for None, str otherwise."""
+    if v is None:
+        return "—"
+    if isinstance(v, float):
+        return f"{v:.4f}".rstrip("0").rstrip(".")
+    return str(v)
+
+
+def _draw_header_waveform(ax, rd: ReportData) -> None:
+    """Two-column metadata header — waveform variant."""
+    rows_left = [
+        ("Date/Time",      _fmt_iso_to_bw(rd.event_datetime_str)),
+        ("Trigger Source", rd.trigger_source),
+        ("Range",          rd.geo_range_str),
+        ("Sample Rate",    rd.sample_rate_str),
+        ("Notes",          rd.notes),
+        ("Project:",       rd.project),
+        ("Client:",        rd.client),
+        ("User Name:",     rd.operator),
+        ("Seis. Loc:",     rd.sensor_location),
+    ]
+    _draw_header_columns(ax, rows_left, rd)
+
+
+def _draw_header_histogram(ax, rd: ReportData) -> None:
+    """Two-column metadata header — histogram variant.
+
+    Histograms have Start / Finish / Intervals fields instead of
+    Trigger Source (there's no trigger event for a histogram capture).
+    """
+    intervals_str = None
+    if rd.histogram_n_intervals is not None and rd.histogram_interval_size:
+        intervals_str = f"{rd.histogram_n_intervals} At {rd.histogram_interval_size}"
+    rows_left = [
+        ("Start",      _fmt_iso_to_bw(rd.histogram_start_str or rd.event_datetime_str)),
+        ("Finish",     _fmt_iso_to_bw(rd.histogram_stop_str)),
+        ("Intervals",  intervals_str),
+        ("Range",      rd.geo_range_str),
+        ("Sample Rate", (f"{rd.sample_rate_sps} Sps" if rd.sample_rate_sps else None)),
+        ("Notes",      rd.notes),
+        ("Project:",   rd.project),
+        ("Client:",    rd.client),
+        ("User Name:", rd.operator),
+        ("Seis. Loc:", rd.sensor_location),
+    ]
+    _draw_header_columns(ax, rows_left, rd)
+
+
+def _draw_header_columns(ax, rows_left, rd: ReportData) -> None:
+    """Shared 2-column header rendering used by both layouts."""
+    rows_right = [
+        ("Serial Number", f"{rd.serial or '—'}" + (f"  {rd.firmware}" if rd.firmware else "")),
+        ("Battery Level", f"{rd.battery_volts:.1f} Volts" if rd.battery_volts is not None else None),
+        ("Unit Calibration", (f"{rd.calibration_date}" + (f" by {rd.calibration_by}" if rd.calibration_by else ""))
+                              if rd.calibration_date else None),
+        ("File Name", rd.file_name),
+        ("Post Event Notes", rd.post_event_notes),
+    ]
+    y = 0.95
+    dy = 0.095
+    for label, value in rows_left:
+        _kv(ax, 0.0, y, label, value, label_w=0.18)
+        y -= dy
+    y = 0.95
+    for label, value in rows_right:
+        _kv(ax, 0.55, y, label, value, label_w=0.20)
+        y -= dy
+
+
+def _draw_mic_only(ax, rd: ReportData) -> None:
+    """Mic block (histogram variant — no USBM chart)."""
+    ax.text(0.0, 0.95, "Microphone   Linear Weighting", fontsize=8, color="#555",
+            transform=ax.transAxes, va="top")
+    rows = _mic_rows(rd)
+    y = 0.70
+    for label, value in rows:
+        _kv(ax, 0.0, y, label, value, label_w=0.18)
+        y -= 0.22
+
+
+def _draw_mic_and_usbm(ax, rd: ReportData) -> None:
+    """Mic block on the left + USBM compliance chart placeholder on right.
+    (Waveform variant — USBM is a velocity-vs-frequency compliance plot
+    that doesn't apply to histograms.)"""
+    ax.text(0.0, 0.95, "Microphone   Linear Weighting", fontsize=8, color="#555",
+            transform=ax.transAxes, va="top")
+    rows = _mic_rows(rd)
+    y = 0.80
+    for label, value in rows:
+        _kv(ax, 0.0, y, label, value, label_w=0.18)
+        y -= 0.15
+
+    # USBM chart placeholder — upper-right.  Real piecewise compliance
+    # curves are a separate work item; for now this just shows the title
+    # + a "coming soon" message so the layout is correct.
+    ax.text(0.72, 0.97, "USBM RI8507 And OSMRE",
+            fontsize=9, weight="bold", color="#333", ha="center", va="top",
+            transform=ax.transAxes)
+    ax.text(0.72, 0.50, "[compliance chart\ncoming soon]",
+            fontsize=8, color="#bbb", ha="center", va="center",
+            transform=ax.transAxes, style="italic")
+
+
+def _mic_rows(rd: ReportData) -> list[tuple[str, Optional[str]]]:
+    """Build the mic-section value rows (shared by both layouts).
+
+    For histograms, BW formats the PSPL line as
+        "125.7 dB(L) on May 27, 2026 at 06:19:14"
+    (absolute date+time of peak).  Waveform events show the relative
+    "at 0.012 sec." instead.  Both formats covered here based on which
+    field is populated.
+    """
+    rows: list[tuple[str, Optional[str]]] = []
+    if rd.mic_pspl_dbl is not None:
+        line = f"{rd.mic_pspl_dbl:.1f} dB(L)"
+        if rd.mic_pspl_when_str:
+            # Histogram-style: "PSPL  125.7 dB(L) on May 27, 2026 at 06:19:14"
+            # mic_pspl_when_str is already "HH:MM:SS Month DD, YYYY";
+            # reformat to "on Month DD, YYYY at HH:MM:SS" for BW match.
+            parts = rd.mic_pspl_when_str.split(" ", 1)
+            if len(parts) == 2:
+                line += f" on {parts[1]} at {parts[0]}"
+            else:
+                line += f" on {rd.mic_pspl_when_str}"
+        elif rd.mic_pspl_time_s is not None:
+            # Waveform-style: relative-to-trigger seconds.
+            line += f" at {rd.mic_pspl_time_s:.3f} sec."
+        rows.append(("PSPL", line))
+    if rd.mic_zc_freq_hz is not None:
+        prefix = ">" if rd.mic_zc_freq_above_range else ""
+        rows.append(("ZC Freq", f"{prefix}{rd.mic_zc_freq_hz:.0f} Hz"))
+    if rd.mic_channel_test_result:
+        line = rd.mic_channel_test_result
+        if rd.mic_channel_test_freq_hz is not None and rd.mic_channel_test_amp_mv is not None:
+            line += (f" (Freq = {rd.mic_channel_test_freq_hz:.1f} Hz, "
+                     f"Amp = {rd.mic_channel_test_amp_mv:.0f} mv)")
+        rows.append(("Channel Test", line))
+    return rows
+
+
+def _draw_channel_stats_waveform(ax, rd: ReportData) -> None:
+    """Waveform stats table — has Time (Rel. to Trig), Peak Accel, Peak Disp.
+    Followed by Peak Vector Sum line."""
+    rows_spec = [
+        ("PPV",                  "ppv_ips",        "in/s"),
+        ("ZC Freq",              "zc_freq_hz",     "Hz"),
+        ("Time (Rel. to Trig)",  "time_of_peak_s", "sec"),
+        ("Peak Acceleration",    "peak_accel_g",   "g"),
+        ("Peak Displacement",    "peak_disp_in",   "in"),
+        ("Sensor Check",         "sensor_check",   ""),
+    ]
+    _draw_stats_table(ax, rd, rows_spec)
+    _draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec))
+
+
+def _draw_channel_stats_histogram(ax, rd: ReportData) -> None:
+    """Histogram stats table — PPV, ZC Freq, Date, Time of peak, Sensor Check.
+    Followed by Peak Vector Sum line."""
+    # Date / Time of peak are per-channel timestamps for the interval at peak.
+    # bw_report stores time_of_peak_s as relative seconds, but for histograms
+    # BW shows them as absolute date+time.  The rows below read the absolute
+    # peak_date / peak_time fields from rd.channel_stats; a channel without
+    # them renders as "—" (there is no fallback to the relative time).
+    rows_spec = [
+        ("PPV",          "ppv_ips",         "in/s"),
+        ("ZC Freq",      "zc_freq_hz",      "Hz"),
+        ("Date",         "peak_date",       ""),
+        ("Time",         "peak_time",       ""),
+        ("Sensor Check", "sensor_check",    ""),
+    ]
+    _draw_stats_table(ax, rd, rows_spec)
+    _draw_pvs_summary(ax, rd, n_data_rows=len(rows_spec), histogram_when=True)
+
+
+def _draw_pvs_summary(
+    ax,
+    rd: ReportData,
+    *,
+    n_data_rows: int,
+    histogram_when: bool = False,
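+) -> None:
+    """Render the Peak Vector Sum line below the stats table.  The
+    'NA: Not Applicable' caption from the BW original is not drawn, and
+    ``n_data_rows`` is currently unused (positioning comes from the
+    stashed table bottom described below).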
+
+    Reads ``ax._stats_table_bottom`` (set by ``_draw_stats_table`` when
+    it pins the table via an explicit ``bbox``) so the PVS line lands
+    just below the table's known bottom edge instead of guessing at the
+    geometry.
+
+    Centered horizontally for visual balance (the previous left-aligned
+    x=0 landed under the label column, not the data, which looked off).
+    """
+    if rd.peak_vector_sum_ips is None:
+        return
+
+    line = f"Peak Vector Sum   {rd.peak_vector_sum_ips:.3f} in/s"
+    if histogram_when and rd.peak_vector_sum_when_str:
+        # Histogram absolute date+time.  when_str is "HH:MM:SS Month DD, YYYY";
+        # reformat to "<value> on <date> At <time>" to match BW.
+        parts = rd.peak_vector_sum_when_str.split(" ", 1)
+        if len(parts) == 2:
+            line += f" on {parts[1]} At {parts[0]}"
+        else:
+            line += f" on {rd.peak_vector_sum_when_str}"
+    elif not histogram_when and rd.peak_vector_sum_time_s is not None:
+        line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
+
+    # _draw_stats_table stashes the bbox bottom on the axes so we don't
+    # have to guess geometry.  Falls back to a conservative default if
+    # the bbox approach hasn't run.
+    table_bottom_y = getattr(ax, "_stats_table_bottom", -0.10)
+    pvs_y = table_bottom_y - 0.04   # small gap below the table border
+
+    # Centered for visual balance — looks intentional rather than offset.
+    # The original BW-replica had a "NA: Not Applicable" caption below
+    # this line; dropped because we use "—" for missing values and the
+    # legend was always squished against the PVS line.
+    ax.text(0.5, pvs_y, line, fontsize=9, weight="bold",
+            ha="center", va="top", transform=ax.transAxes)
+
+
+def _draw_stats_table(ax, rd: ReportData, rows_spec: list[tuple[str, str, str]]) -> None:
+    """Render a per-channel stats table (Tran/Vert/Long).
+
+    rows_spec: list of (label, field_name_in_channel_stats, unit_string)
+    """
+    headers = ["", "Tran", "Vert", "Long", ""]
+    ch_lookup = {c["name"]: c for c in rd.channel_stats}
+
+    def _cell(field, ch_name):
+        ch_rec = ch_lookup.get(ch_name, {})
+        val = ch_rec.get(field)
+        if val is None:
+            return "—"
+        if isinstance(val, float):
+            # ZC Freq is integer-formatted in BW; ">100 Hz" sentinel
+            # rendered as ">N" (val carries the threshold).  Everything
+            # else gets 3 decimals.
+            if field == "zc_freq_hz":
+                prefix = ">" if ch_rec.get("zc_freq_above_range") else ""
+                return f"{prefix}{val:.0f}"
+            return f"{val:.3f}"
+        return str(val)
+
+    table_data = [headers]
+    for label, field_name, unit in rows_spec:
+        table_data.append([
+            label,
+            _cell(field_name, "Tran"),
+            _cell(field_name, "Vert"),
+            _cell(field_name, "Long"),
+            unit,
+        ])
+    # Pin the table's position+size via bbox so we know exactly where
+    # the bottom edge lands.  Lets _draw_pvs_summary place the PVS line
+    # just below the table without guessing at row heights.
+    #
+    # bbox = [x, y, width, height] in axes coords.  Header + data rows
+    # at row_h each; horizontal extent matches sum(colWidths).
+    n_rows = len(table_data)        # header + data rows
+    row_h  = 0.12                   # axes-fraction per row (fits fontsize=8)
+    table_height = n_rows * row_h
+    table_bottom = 1.0 - table_height
+    tbl = ax.table(
+        cellText=table_data,
+        colWidths=[0.28, 0.14, 0.14, 0.14, 0.10],
+        cellLoc="left", edges="open",
+        bbox=[0.0, table_bottom, 0.80, table_height],
+    )
+    tbl.auto_set_font_size(False)
+    tbl.set_fontsize(8)
+    for j in range(5):
+        tbl[(0, j)].set_text_props(weight="bold", color="#555")
+    # Stash the bottom Y so _draw_pvs_summary can position itself below.
+    ax._stats_table_bottom = table_bottom
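Review note: the bbox arithmetic above can be checked without matplotlib. The sketch below is an assumption-labelled reproduction of the geometry only; `table_geometry` is a hypothetical helper name, and the 0.04 gap is the value used by `_draw_pvs_summary`.

```python
def table_geometry(n_data_rows: int, row_h: float = 0.12, gap: float = 0.04):
    """Return (table_bottom, pvs_y) in axes coordinates.

    The table is pinned to the top of the axes (y = 1.0) and is
    (header + data rows) * row_h tall; the PVS line sits `gap` below it.
    """
    n_rows = n_data_rows + 1            # one header row plus the data rows
    table_height = n_rows * row_h
    table_bottom = 1.0 - table_height
    return table_bottom, table_bottom - gap


# Waveform table has 6 data rows, histogram table has 5.
wf_bottom, wf_pvs = table_geometry(6)
hist_bottom, hist_pvs = table_geometry(5)
print(round(wf_bottom, 2), round(wf_pvs, 2))
print(round(hist_bottom, 2), round(hist_pvs, 2))
```

With these numbers the PVS line stays inside the axes for both layouts; a table with more than 8 rows in total would push its bottom edge below y = 0.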
+
+
+def _channel_axis_color(ch: str) -> str:
+    return {"MicL": "#cc00cc", "Long": "#0066ff", "Vert": "#009933", "Tran": "#cc0000"}.get(ch, "#444")
+
+
+def _draw_waveform_subplot(fig, gridspec_cell, rd: ReportData) -> None:
+    """4-channel stacked waveform plot — Instantel printout order
+    (MicL on top, Tran on bottom), shared x-axis in SECONDS, trigger
+    triangle markers at t=0, '0.0' baseline label on right of each."""
+    inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
+    order = ["MicL", "Long", "Vert", "Tran"]
+    sr = rd.sample_rate_sps or 1024
+    # Convert ms-based time axis to seconds for the x-axis
+    dt_s = (rd.dt_ms or (1000.0 / sr)) / 1000.0
+    t0_s = (rd.t0_ms if rd.t0_ms is not None else 0.0) / 1000.0
+
+    last_idx = len(order) - 1
+    for i, ch in enumerate(order):
+        ax = fig.add_subplot(inner[i])
+        values = rd.channels.get(ch) or []
+        times = [t0_s + j * dt_s for j in range(len(values))]
+
+        if values:
+            color = _channel_axis_color(ch)
+            ax.plot(times, values, color=color, linewidth=0.5)
+            # Symmetric y-axis around zero for every channel (geo and
+            # mic alike), with 10% headroom.  Floor the span so an
+            # all-zero trace doesn't produce identical y-limits.
+            amax = max((abs(v) for v in values), default=0.0) or 0.001
+            ax.set_ylim(-amax * 1.10, amax * 1.10)
+
+        # Channel label on the LEFT (matches BW)
+        ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
+                      color=_channel_axis_color(ch), weight="bold", labelpad=14)
+        # "0.0" on the RIGHT (BW convention)
+        ax.text(1.005, 0.5, "0.0", transform=ax.transAxes,
+                fontsize=7, color="#555", va="center", ha="left")
+
+        ax.grid(True, linestyle="--", linewidth=0.3, color="#bbb", alpha=0.6)
+        # Vertical dashed trigger line at t=0
+        ax.axvline(0.0, color="#cc0000", linestyle="--", linewidth=0.6, alpha=0.7)
+        # Zero baseline horizontal
+        ax.axhline(0.0, color=_channel_axis_color(ch), linestyle="-",
+                   linewidth=0.4, alpha=0.5)
+
+        if i != last_idx:
+            ax.set_xticklabels([])
+            ax.tick_params(axis="x", length=0)
+        else:
+            ax.tick_params(axis="x", labelsize=7)
+        ax.tick_params(axis="y", labelsize=6)
+
+    # Trigger triangle marker ▼ above the top channel at t=0
+    top_ax = fig.axes[-4]  # MicL: first of the 4 axes this function just added
+    top_ax.plot([0], [top_ax.get_ylim()[1]], marker="v", color="black",
+                markersize=8, clip_on=False, zorder=10)
+
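+    # Compute scale-per-division for the footer (10 divs across the chart).
+    # The time span comes from the last channel plotted (Tran); the geo
+    # amp/div comes from the first non-empty channel in Tran/Vert/Long,
+    # not the peak across all three.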
+    total_s = times[-1] - times[0] if values else 0
+    div_s = total_s / 10 if total_s > 0 else 0
+    geo_amp_div = "—"
+    for ch in ("Tran", "Vert", "Long"):
+        v = rd.channels.get(ch) or []
+        if v:
+            amax = max(abs(x) for x in v)
+            geo_amp_div = f"{(amax * 1.1 * 2) / 10:.3f}"
+            break
+    fig.text(
+        0.11, 0.030,
+        f"Time(Seconds) {div_s:.2f} sec/div   Amplitude Geo: {geo_amp_div} in/s/div   Mic: 0.001 psi(L)/div",
+        fontsize=7, color="#444", ha="left",
+    )
+    fig.text(
+        0.11, 0.018,
+        "Trigger = ▶━━━━━ ━━━━━━◀",
+        fontsize=7, color="#444", ha="left",
+    )
+
+
+def _nice_geo_step(amax: float) -> float:
+    """Pick a "nice" per-division step for the geo y-axis.
+
+    Geo LSB is 0.005 in/s, so sub-LSB steps like 0.003/div are nonsense.
+    Quantize to the BW-style 1-2.5-5 sequence (0.005, 0.01, 0.025, 0.05,
+    …) and return the smallest step where 5 divisions >= amax, so the
+    top of the chart lands on a tick.  Peaks above 50 in/s fall back
+    to the largest step (10.0).
+    """
+    if amax <= 0:
+        return 0.005
+    for step in (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0):
+        if step * 5 >= amax:
+            return step
+    return 10.0
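Review note: a small standalone sketch of how the step table above maps a peak amplitude to a y-limit and tick list, assuming the same 5-division convention used by the histogram layout. `nice_geo_step` and `geo_axis` here are illustrative copies, not imports from the module.

```python
STEPS = (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0)


def nice_geo_step(amax: float, divisions: int = 5) -> float:
    """Smallest table step whose `divisions` multiples cover amax."""
    if amax <= 0:
        return STEPS[0]
    for step in STEPS:
        if step * divisions >= amax:
            return step
    return STEPS[-1]  # peak exceeds the table; the axis will clip


def geo_axis(amax: float, divisions: int = 5):
    """Return (y_top, ticks) for a zero-based geo axis."""
    step = nice_geo_step(amax, divisions)
    ticks = [j * step for j in range(divisions + 1)]
    return step * divisions, ticks


print(nice_geo_step(0.03))   # 0.01  (5 x 0.005 = 0.025 is too small)
print(nice_geo_step(0.12))   # 0.025 (5 x 0.01 = 0.05 is too small)
top, ticks = geo_axis(0.12)
print(len(ticks))            # 6 ticks: 0 through the top of the chart
```

One consequence of the fallback: a peak above 50 in/s still gets a 10.0 step, so bars taller than the 50 in/s y-limit would be clipped.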
+
+
+def _draw_histogram_subplot(fig, gridspec_cell, rd: ReportData) -> None:
+    """4-channel stacked histogram bar chart — per-interval peaks.
+
+    X-axis labeled with the actual times from rd.histogram_interval_times
+    when available; otherwise interval index.
+
+    The three geo channels share a single y-axis scale (a BW-style nice
+    multiple of the 0.005 in/s LSB) so bar heights are directly
+    comparable across channels.  MicL has its own auto-scale.
+    """
+    inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
+    order = ["MicL", "Long", "Vert", "Tran"]
+    last_idx = len(order) - 1
+
+    # X-axis: use absolute time labels if we have them, else interval index
+    have_times = bool(rd.histogram_interval_times)
+
+    # Shared geo scale: max across Tran/Vert/Long, quantized to a nice
+    # tick step.  Used for ylim + the footer "Amplitude Geo: X in/s/div".
+    geo_amax = 0.0
+    for gch in ("Tran", "Vert", "Long"):
+        gv = rd.channels.get(gch) or []
+        if gv:
+            geo_amax = max(geo_amax, max((abs(x) for x in gv if x is not None), default=0.0))
+    geo_step = _nice_geo_step(geo_amax)
+    geo_top  = geo_step * 5  # 5 divisions — top tick lands at this value
+
+    for i, ch in enumerate(order):
+        ax = fig.add_subplot(inner[i])
+        values = rd.channels.get(ch) or []
+        if values:
+            # Histograms record per-interval PEAK magnitudes — always
+            # non-negative.  Codec output occasionally includes signed
+            # values when the underlying .h5 was scaled like a waveform;
+            # take the absolute value so the bars rise from zero.
+            abs_vals = [abs(v) if v is not None else 0 for v in values]
+            xs = np.arange(len(abs_vals))
+            color = _channel_axis_color(ch)
+            ax.bar(xs, abs_vals, color=color, width=0.85, linewidth=0)
+            if ch in ("Tran", "Vert", "Long"):
+                ax.set_ylim(0, geo_top)
+                ax.set_yticks([j * geo_step for j in range(6)])
+            else:
+                amax = max(abs_vals, default=0)
+                if amax > 0:
+                    ax.set_ylim(0, amax * 1.10)
+        ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
+                      color=_channel_axis_color(ch), weight="bold", labelpad=14)
+        ax.text(1.005, 0.02, "0.0", transform=ax.transAxes,
+                fontsize=7, color="#555", va="bottom", ha="left")
+        ax.grid(True, axis="y", linestyle="--", linewidth=0.3, color="#bbb", alpha=0.6)
+        if i != last_idx:
+            ax.set_xticklabels([])
+            ax.tick_params(axis="x", length=0)
+        else:
+            if have_times and len(rd.histogram_interval_times) == len(values):
+                # Label every (n // 4)-th interval: 4-7 labels once n >= 4
+                n = len(values)
+                step = max(1, n // 4)
+                tick_positions = list(range(0, n, step))
+                ax.set_xticks(tick_positions)
+                ax.set_xticklabels([rd.histogram_interval_times[t] for t in tick_positions],
+                                   rotation=0, fontsize=6)
+            else:
+                ax.set_xlabel("Interval", fontsize=8)
+            ax.tick_params(axis="x", labelsize=7)
+        ax.tick_params(axis="y", labelsize=6)
+
+    # Footer scale info — histograms use minute/div.  Reuses the shared
+    # geo_step computed above so the label matches the actual y-axis
+    # tick spacing on every subplot.
+    interval_str = rd.histogram_interval_size or "—"
+    geo_amp_div = f"{geo_step:.3f}"
+    fig.text(
+        0.11, 0.030,
+        f"Time {interval_str} /div   Amplitude Geo: {geo_amp_div} in/s/div   Mic: 0.001 psi(L)/div",
+        fontsize=7, color="#444", ha="left",
+    )
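Review note: the x-axis label thinning in `_draw_histogram_subplot` uses `step = max(1, n // 4)`. The sketch below isolates that arithmetic so the resulting label counts are easy to see; `tick_positions` is a hypothetical helper name used only for illustration.

```python
def tick_positions(n: int) -> list[int]:
    """Indices that receive an x-axis label for n histogram intervals."""
    step = max(1, n // 4)
    return list(range(0, n, step))


for n in (1, 7, 11, 60):
    print(n, tick_positions(n))
```

For n = 60 this yields four labels (0, 15, 30, 45). Short histograms get more: n = 7 labels every bar, because `n // 4` is 1.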
@@ -46,7 +46,7 @@ from typing import Optional

 # FastAPI / Pydantic
 try:
-    from fastapi import Body, FastAPI, File, HTTPException, Query, UploadFile
+    from fastapi import Body, FastAPI, File, HTTPException, Query, Response, UploadFile
    from fastapi.middleware.cors import CORSMiddleware
    from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
    from pydantic import BaseModel
@@ -381,10 +381,24 @@ def webapp():

@app.get("/waveform", response_class=FileResponse)
 def waveform_viewer():
-    """Serve the standalone waveform viewer."""
+    """Serve the standalone LIVE-device waveform viewer.
+
+    Talks to ``/device/*`` endpoints — for plotting events pulled from
+    a connected unit in real time.  For the stored-event browser that
+    reads from the SeismoDb + WaveformStore, see ``/events``.
+    """
    return str(Path(__file__).parent / "waveform_viewer.html")


+@app.get("/events", response_class=FileResponse)
+def event_browser():
+    """Serve the stored-event browser — pick a serial, list its events,
+    render any one's waveform from the persisted ``.h5`` via the
+    ``/db/events/{id}/waveform.json`` endpoint.  Standalone HTML +
+    Chart.js, no auth, no build step."""
+    return str(Path(__file__).parent / "event_browser.html")
+
+
@app.get("/device/info")
 def device_info(
    port:     Optional[str] = Query(None,             description="Serial port (e.g. COM5, /dev/ttyUSB0)"),
@@ -1973,10 +1987,15 @@ def _cleanup_event_files(row: dict) -> dict:
    base_name = bw_name or a5_name or sc_name
    if base_name:
        bw_path, a5_path = store.paths_for(serial, base_name)
-        sc_path = store.sidecar_path_for(serial, base_name)
-        h5_path = store.hdf5_path_for(serial, base_name)
+        sc_path  = store.sidecar_path_for(serial, base_name)
+        h5_path  = store.hdf5_path_for(serial, base_name)
+        # Preserved BW ASCII report (added 2026-05-27 with the .TXT
+        # preservation feature) — needs to be cleaned up too, otherwise
+        # deletes leave orphan _ASCII.TXT files behind.
+        txt_path = store.txt_path_for(serial, base_name)
        for kind, p in [("blastware", bw_path), ("a5_pickle", a5_path),
-                        ("sidecar", sc_path), ("hdf5", h5_path)]:
+                        ("sidecar", sc_path), ("hdf5", h5_path),
+                        ("txt", txt_path)]:
            try:
                if p.exists():
                    p.unlink()
@@ -2164,6 +2183,148 @@ def db_event_blastware_file(event_id: str) -> FileResponse:
    )


+@app.get("/db/events/{event_id}/ascii_report.txt")
+def db_event_ascii_report_txt(event_id: str):
+    """Serve the raw BW ASCII report (.TXT) for an event, when preserved.
+
+    Returns 404 for events ingested before the .TXT-preservation feature
+    landed (2026-05-27) — those events have only the parsed ``bw_report``
+    block in the sidecar, not the raw .TXT.  Re-forwarding from the
+    watcher PC will populate the .TXT going forward.
+    """
+    row = _get_db().get_event(event_id)
+    if row is None:
+        raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
+    serial   = row.get("serial")
+    filename = row.get("blastware_filename")
+    if not serial or not filename:
+        raise HTTPException(status_code=404, detail="Event has no associated BW file")
+    txt_path = _get_store().open_txt(serial, filename)
+    if txt_path is None:
+        raise HTTPException(
+            status_code=404,
+            detail=(
+                f"Raw .TXT not preserved for {filename}.  Events ingested "
+                "before 2026-05-27 don't have it; re-forward from the "
+                "watcher PC to populate."
+            ),
+        )
+    return FileResponse(
+        path=str(txt_path),
+        media_type="text/plain",
+        filename=txt_path.name,
+    )
+
+
+@app.get("/db/events/{event_id}/report.pdf")
+def db_event_report_pdf(event_id: str):
+    """Render an Instantel-style Event Report as a PDF.
+
+    Single-page letter portrait, matches the BW Event Report's data
+    coverage and layout (header / mic block / per-channel stats /
+    waveform plot).  V0.20.0 stub — exact visual being iterated
+    against reference PDFs in ``docs/reference/instantel/``.
+
+    Returns 404 if the event is unknown or has no waveform data on
+    disk (same condition as /waveform.json).
+    """
+    from sfm import report_pdf
+    rd = report_pdf.gather_report_data(_get_db(), _get_store(), event_id)
+    if rd is None:
+        raise HTTPException(status_code=404, detail=f"Event {event_id} not found or has no waveform")
+    pdf_bytes = report_pdf.render_event_report_pdf(rd)
+    # Suggested download filename based on the BW file basename.
+    fname = (rd.file_name or event_id).replace(".", "_")
+    return Response(
+        content=pdf_bytes,
+        media_type="application/pdf",
+        headers={"Content-Disposition": f'inline; filename="{fname}_report.pdf"'},
+    )
+
+
+def _maybe_aggregate_histogram(plot: dict, store, serial: str, filename: str, row: dict) -> dict:
+    """For histogram events, aggregate the codec's per-block samples into
+    the BW-reported number of intervals.  No-op for waveforms or when
+    we don't have the histogram metadata (interval count + size) in the
+    sidecar's bw_report block.
+
+    Why: the histogram codec emits one value per internal block (~1 per
+    second), but BW's printout shows one bar per configured interval
+    (typically 1-15 minutes).  For a 1-minute-interval event the codec
+    gives ~60 blocks per BW bar.  Aggregating max-per-group makes the
+    SFM chart + PDF visually match BW's display.
+    """
+    record_type = row.get("record_type") or ""
+    if not record_type.lower().startswith("hist"):
+        return plot
+
+    # Read interval count + size from the sidecar's bw_report.histogram block
+    try:
+        import json as _json
+        sidecar_path = store.sidecar_path_for(serial, filename)
+        if not sidecar_path.exists():
+            return plot
+        sc = _json.loads(sidecar_path.read_text())
+        hist = (sc.get("bw_report") or {}).get("histogram") or {}
+        n_intervals = hist.get("n_intervals")
+        interval_size_s = hist.get("interval_size_s")
+        start_iso = hist.get("start")
+    except Exception:
+        return plot
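+    # Coerce once so range() below can't fail on a float/str from JSON.
+    try:
+        n_intervals = int(n_intervals)
+    except (TypeError, ValueError):
+        return plot
+    if n_intervals < 1:
+        return plot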
+
+    # Aggregate each channel's values into n_intervals groups, max-per-group
+    channels = plot.get("channels") or {}
+    aggregated_channels: dict = {}
+    for ch, chd in channels.items():
+        vals = chd.get("values") or []
+        if not vals:
+            aggregated_channels[ch] = chd
+            continue
+        # Distribute len(vals) samples across n_intervals groups; uneven
+        # remainders get distributed across the first few groups.
+        per_group = len(vals) // n_intervals
+        remainder = len(vals) % n_intervals
+        agg: list = []
+        offset = 0
+        for i in range(n_intervals):
+            grp_size = per_group + (1 if i < remainder else 0)
+            if grp_size > 0:
+                grp = vals[offset:offset + grp_size]
+                # Max of absolute values (peaks are magnitudes).
+                agg.append(max((abs(v) for v in grp if v is not None), default=0))
+                offset += grp_size
+            else:
+                agg.append(0)
+        aggregated_channels[ch] = {**chd, "values": agg}
+
+    # Build per-interval timestamp labels for the x-axis if we have start time
+    interval_times: list = []
+    if start_iso and interval_size_s:
+        try:
+            import datetime as _dt
+            start = _dt.datetime.fromisoformat(start_iso)
+            for i in range(int(n_intervals)):
+                # Show the END of each interval (BW convention — the
+                # peak reported is for samples taken THROUGH that time)
+                end = start + _dt.timedelta(seconds=(i + 1) * interval_size_s)
+                interval_times.append(end.strftime("%H:%M:%S"))
+        except Exception:
+            pass
+
+    # Extend the time_axis with interval metadata (existing keys are kept).
+    plot_aggr = {**plot, "channels": aggregated_channels}
+    plot_aggr["time_axis"] = {
+        **(plot.get("time_axis") or {}),
+        "histogram_aggregated": True,
+        "n_intervals":           int(n_intervals),
+        "interval_size_s":       interval_size_s,
+        "interval_times":        interval_times,
+    }
+    return plot_aggr
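Review note: a standalone sketch of the max-per-group aggregation and interval-end labelling performed above, using only the standard library. `aggregate_max` and `interval_end_labels` are hypothetical names for illustration; the grouping rule (remainder spread over the first groups, peaks taken as absolute values, `None` skipped) follows the function above.

```python
import datetime as dt


def aggregate_max(vals: list, n_intervals: int) -> list:
    """Collapse vals into n_intervals groups, keeping each group's max |v|."""
    per_group, remainder = divmod(len(vals), n_intervals)
    out, offset = [], 0
    for i in range(n_intervals):
        # The first `remainder` groups take one extra sample each.
        size = per_group + (1 if i < remainder else 0)
        grp = vals[offset:offset + size]
        out.append(max((abs(v) for v in grp if v is not None), default=0))
        offset += size
    return out


def interval_end_labels(start_iso: str, n_intervals: int, size_s: float) -> list:
    """HH:MM:SS label for the END of each interval."""
    start = dt.datetime.fromisoformat(start_iso)
    return [
        (start + dt.timedelta(seconds=(i + 1) * size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]


print(aggregate_max([1, -5, 2, None, 3, -1, 4], 3))        # [5, 3, 4]
print(interval_end_labels("2026-05-27T06:00:00", 3, 60))
```

When there are fewer samples than intervals, the trailing groups are empty and report 0, which matches the `agg.append(0)` branch above.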
+
+
@app.get("/db/events/{event_id}/waveform.json")
 def db_event_waveform_json(event_id: str) -> dict:
    """
@@ -2195,7 +2356,8 @@ def db_event_waveform_json(event_id: str) -> dict:
    h5_path = store.hdf5_path_for(serial, filename)
    if h5_path.exists():
        try:
-            return event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
+            plot = event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
+            return _maybe_aggregate_histogram(plot, store, serial, filename, row)
        except Exception as exc:
            log.warning("HDF5 read failed (%s); falling back to A5 path", exc)

@@ -499,6 +499,20 @@
      text-align: left;
      border-bottom: 1px solid var(--border);
      white-space: nowrap;
+      position: sticky;
+      top: 0;
+      z-index: 1;
+    }
+    table.db-table thead th[data-sort]:hover {
+      background: var(--border2);
+      color: var(--text);
+    }
+    table.db-table thead th .sort-arrow {
+      display: inline-block;
+      width: 10px;
+      color: var(--accent, #58a6ff);
+      font-weight: 900;
+      text-align: center;
    }
    table.db-table tbody tr { border-bottom: 1px solid var(--border2); }
    table.db-table tbody tr:last-child { border-bottom: none; }
@@ -758,7 +772,9 @@
      overflow: hidden;
      min-height: 0;
    }
-    #section-db { display: none; }
+    /* Default to Database view on page load — most users are here to
+       browse stored events, not connect to a live unit. */
+    #section-live { display: none; }

    /* ── Live connect bar (host/port/connect, live section only) ── */
    #live-connect-bar {
@@ -792,8 +808,8 @@
  </div>
  <div class="hdr-sep"></div>
  <div class="section-switcher">
-    <button class="section-btn active" onclick="switchSection('live')">Live Device</button>
-    <button class="section-btn"        onclick="switchSection('db')">Database</button>
+    <button class="section-btn"        onclick="switchSection('live')">Live Device</button>
+    <button class="section-btn active" onclick="switchSection('db')">Database</button>
  </div>
  <div class="hdr-sep"></div>
  <label class="force-toggle" id="force-toggle"
@@ -802,6 +818,12 @@
    <span class="ft-dot"></span>
    <span>Force refresh</span>
  </label>
+  <div class="hdr-sep"></div>
+  <button id="mic-unit-toggle" class="section-btn"
+          onclick="_setMicUnit(_getMicUnit() === 'dBL' ? 'psi' : 'dBL')"
+          title="Toggle microphone display unit (dBL ↔ psi) for waveform plots.  Affects all mic charts; persists across page loads.">
+    Mic: dBL
+  </button>
 </header>

 <!-- ════════════════════════════════════════════════════════════════
@@ -1224,18 +1246,18 @@
    <div class="db-table-wrap" id="hist-table-wrap" style="display:none">
      <table class="db-table" id="hist-table">
        <thead>
-          <tr>
-            <th>Timestamp</th>
-            <th>Serial</th>
-            <th>Tran (in/s)</th>
-            <th>Vert (in/s)</th>
-            <th>Long (in/s)</th>
-            <th>PVS (in/s)</th>
-            <th>Mic (dBL)</th>
-            <th>Project</th>
-            <th>Client</th>
-            <th>Type</th>
-            <th>Key</th>
+          <tr id="hist-header-row">
+            <th data-sort="timestamp">Timestamp <span class="sort-arrow"></span></th>
+            <th data-sort="serial">Serial <span class="sort-arrow"></span></th>
+            <th data-sort="tran_ppv">Tran (in/s) <span class="sort-arrow"></span></th>
+            <th data-sort="vert_ppv">Vert (in/s) <span class="sort-arrow"></span></th>
+            <th data-sort="long_ppv">Long (in/s) <span class="sort-arrow"></span></th>
+            <th data-sort="peak_vector_sum">PVS (in/s) <span class="sort-arrow"></span></th>
+            <th data-sort="mic_ppv">Mic (dBL) <span class="sort-arrow"></span></th>
+            <th data-sort="project">Project <span class="sort-arrow"></span></th>
+            <th data-sort="client">Client <span class="sort-arrow"></span></th>
+            <th data-sort="record_type">Type <span class="sort-arrow"></span></th>
+            <th data-sort="waveform_key">Key <span class="sort-arrow"></span></th>
            <th></th>
          </tr>
        </thead>
@@ -1388,7 +1410,9 @@ function deviceParams() {
 }

 // ── Section switching ─────────────────────────────────────────────────────────
-let currentSection = 'live';
+// Default to Database — most users land here to browse stored events.
+// Live Device is opt-in (click the tab to talk to a unit).
+let currentSection = 'db';

 function switchSection(name) {
  currentSection = name;
@@ -2333,6 +2357,12 @@ async function _fetchUnits() {
 }

 // ── History tab ────────────────────────────────────────────────────────────────
+// Module-level state for the history table — preserved across re-sorts.
+// We sort + re-render without re-fetching.
+let _histEvents = [];
+let _histSortKey = 'timestamp';
+let _histSortDir = 'desc';   // 'asc' | 'desc'
+
 async function loadHistory() {
  histLoaded = true;
  const serial  = document.getElementById('hist-serial-filter').value;
@@ -2364,10 +2394,20 @@ async function loadHistory() {
  _populateSerialDropdown('monlog-serial-filter');
  _populateSerialDropdown('sess-serial-filter');

-  document.getElementById('hist-count').textContent = `${events.length} event${events.length !== 1 ? 's' : ''}`;
+  _histEvents = events;
+  renderHistTable();
+}
+
+// Re-render the history table from `_histEvents` using the current sort
+// state.  Pulled out of `loadHistory` so column-header clicks can re-sort
+// in-memory without re-fetching from the server.
+function renderHistTable() {
+  const events = _histEvents;
+  document.getElementById('hist-count').textContent =
+    `${events.length} event${events.length !== 1 ? 's' : ''}`;
+
  const tbody = document.getElementById('hist-tbody');
  tbody.innerHTML = '';
-
  if (events.length === 0) {
    document.getElementById('hist-empty').style.display = 'block';
    document.getElementById('hist-table-wrap').style.display = 'none';
@@ -2376,11 +2416,31 @@ async function loadHistory() {
  document.getElementById('hist-empty').style.display = 'none';
  document.getElementById('hist-table-wrap').style.display = 'block';

-  for (const ev of events) {
+  // Sort a copy by current key + direction (`_histEvents` keeps its
+  // fetch order).  Nulls sink to the bottom regardless of direction.
+  const k = _histSortKey;
+  const dir = _histSortDir === 'asc' ? 1 : -1;
+  const sorted = [...events].sort((a, b) => {
+    const av = a[k], bv = b[k];
+    if (av == null && bv == null) return 0;
+    if (av == null) return 1;
+    if (bv == null) return -1;
+    if (typeof av === 'number' && typeof bv === 'number') return (av - bv) * dir;
+    return String(av).localeCompare(String(bv)) * dir;
+  });
+
+  // Update arrow indicators in the headers
+  document.querySelectorAll('#hist-header-row th[data-sort]').forEach(th => {
+    const arrow = th.querySelector('.sort-arrow');
+    if (!arrow) return;
+    arrow.textContent = th.dataset.sort === k ? (_histSortDir === 'asc' ? '↑' : '↓') : '';
+  });
+
+  for (const ev of sorted) {
    const tr = document.createElement('tr');
    const pvs = ev.peak_vector_sum;
    tr.classList.add('clickable');
-    tr.title = 'Click to review (open sidecar editor)';
+    tr.title = 'Click to view waveform + sidecar';
    tr.dataset.eventId = ev.id;
    tr.innerHTML = `
      <td>${_fmtTs(ev.timestamp)}</td>
@@ -2408,6 +2468,28 @@ async function loadHistory() {
  }
 }

+// Click a column header → toggle sort.  Click another → set sort to that column.
+document.addEventListener('DOMContentLoaded', () => {
+  const headerRow = document.getElementById('hist-header-row');
+  if (!headerRow) return;
+  headerRow.querySelectorAll('th[data-sort]').forEach(th => {
+    th.style.cursor = 'pointer';
+    th.style.userSelect = 'none';
+    th.addEventListener('click', () => {
+      const k = th.dataset.sort;
+      if (_histSortKey === k) {
+        _histSortDir = _histSortDir === 'asc' ? 'desc' : 'asc';
+      } else {
+        _histSortKey = k;
+        // Default direction: 'desc' for numbers + timestamps (biggest/newest first),
+        // 'asc' for text columns (alphabetical).
+        _histSortDir = ['serial','project','client','record_type','waveform_key'].includes(k) ? 'asc' : 'desc';
+      }
+      renderHistTable();
+    });
+  });
+});
+
 // ── Sidecar review modal ───────────────────────────────────────────────────────
 //
 // Opens on row click in the History table.  Loads the .sfm.json sidecar
@@ -2430,23 +2512,373 @@ async function openSidecarModal(eventId) {
  document.getElementById('sc-edit-ft').checked = false;
  document.getElementById('sc-edit-reviewer').value = '';
  document.getElementById('sc-edit-notes').value = '';
+  // Reset waveform area
+  document.getElementById('sc-waveform-status').textContent = 'Loading waveform…';
+  document.getElementById('sc-waveform-charts').innerHTML = '';
+  _destroyScCharts();

-  try {
-    const r = await fetch(`${api()}/db/events/${eventId}/sidecar`);
-    if (!r.ok) {
-      const e = await r.json().catch(() => ({}));
-      throw new Error(e.detail || r.statusText);
-    }
-    const data = await r.json();
+  // Sidecar + waveform fetched in parallel — neither blocks the other.
+  const sidecarP  = fetch(`${api()}/db/events/${eventId}/sidecar`)
+    .then(async r => {
+      if (!r.ok) { const e = await r.json().catch(() => ({})); throw new Error(e.detail || r.statusText); }
+      return r.json();
+    });
+  const waveformP = fetch(`${api()}/db/events/${eventId}/waveform.json`)
+    .then(async r => {
+      if (r.status === 404) return null;  // no waveform available — render empty state
+      if (!r.ok) { const e = await r.json().catch(() => ({})); throw new Error(e.detail || r.statusText); }
+      return r.json();
+    });
+
+  // Sidecar usually loads first (smaller payload).  Each one renders
+  // independently so the modal becomes useful as soon as either lands.
+  sidecarP.then(data => {
    _scCurrentSidecar = data;
    _renderSidecar(data);
    document.getElementById('sc-status').textContent = '';
-  } catch (e) {
+  }).catch(e => {
    document.getElementById('sc-status').className = 'sc-status error';
-    document.getElementById('sc-status').textContent = `Load failed: ${e.message}`;
+    document.getElementById('sc-status').textContent = `Sidecar load failed: ${e.message}`;
+  });
+
+  waveformP.then(data => {
+    if (!data) {
+      document.getElementById('sc-waveform-status').textContent = 'No waveform data for this event.';
+      return;
+    }
+    _renderScWaveform(data);
+  }).catch(e => {
+    document.getElementById('sc-waveform-status').textContent = `Waveform load failed: ${e.message}`;
+  });
+}
+
+// ── Sidecar-modal waveform plot ──────────────────────────────────────────────
+// Renders the 4-channel decoded waveform fetched from
+// /db/events/{id}/waveform.json — MicL on top, Tran on bottom (matches
+// Instantel BW Event Report layout).  Uses Chart.js (loaded at the top of
+// the page for the live-device viewer).
+const _SC_CHANNEL_COLORS = {
+  MicL: '#e066ff',
+  Long: '#3a80ff',
+  Vert: '#3fb950',
+  Tran: '#f85149',
+};
+const _SC_CHANNEL_ORDER = ['MicL', 'Long', 'Vert', 'Tran'];
+let _scCharts = {};
+
+// User preference for how mic is displayed in plots — dBL (default,
+// matches BW printout convention + the rest of SFM) or psi (the raw
+// sample unit).  Toggleable via the header pill; persists in localStorage.
+function _getMicUnit() {
+  return localStorage.getItem('sfm_mic_unit') === 'psi' ? 'psi' : 'dBL';
+}
+function _setMicUnit(u) {
+  localStorage.setItem('sfm_mic_unit', u === 'psi' ? 'psi' : 'dBL');
+  _refreshMicUnitToggleLabel();
+  // Re-render the open modal so the change is immediately visible.
+  if (_scCurrentEventId) openSidecarModal(_scCurrentEventId);
+}
+function _refreshMicUnitToggleLabel() {
+  const b = document.getElementById('mic-unit-toggle');
+  if (b) b.textContent = `Mic: ${_getMicUnit()}`;
+}
+// Convert a psi value to dB(L).  Returns null for non-positive values
+// (log of zero is undefined) — Chart.js handles null as a gap in the line.
+function _psiToDbl(psi) {
+  if (psi == null || !(psi > 0)) return null;
+  return 20 * Math.log10(psi / DBL_REF);
+}
+
+// Per-sample mic display floor.  Sound pressure AC samples spend most
+// of their time at the digitization noise floor (1-2 ADC counts, roughly
+// 20-40 dBL).  Rendering each one as null/-inf produces a spiky,
+// discontinuous chart of "moments when sound briefly exceeded the Y-axis
+// bottom", which is confusing.  Instead we rectify (abs the AC waveform),
+// convert to dBL, and floor anything below MIC_DBL_FLOOR so the chart has
+// a continuous baseline with peaks rising above it.  Matches how acoustic
+// engineers expect to see SPL-vs-time.
+const MIC_DBL_FLOOR = 60;
+function _psiToDblForChart(psi) {
+  if (psi == null) return MIC_DBL_FLOOR;
+  const a = Math.abs(psi);
+  if (a === 0) return MIC_DBL_FLOOR;
+  const dbl = 20 * Math.log10(a / DBL_REF);
+  return dbl > MIC_DBL_FLOOR ? dbl : MIC_DBL_FLOOR;
+}
+
+// Adaptive decimal formatter — scientific notation is reserved for truly
+// extreme values (10000+ or sub-0.0001).  Normal-range values (most peaks
+// fall here) render as decimals with sensible precision.  Replaces the
+// previous .toExponential(3) call that turned every peak into ugly "2.500e-2".
+function _fmtPeak(v, unit) {
+  if (v == null || (typeof v === 'number' && !isFinite(v))) return '';
+  if (typeof v !== 'number') return String(v) + (unit ? ' ' + unit : '');
+  if (v === 0) return '0' + (unit ? ' ' + unit : '');
+  const a = Math.abs(v);
+  const u = unit ? ' ' + unit : '';
+  if (a >= 0.0001 && a < 10000) {
+    const d = a >= 100 ? 1 : a >= 10 ? 2 : a >= 1 ? 3 : a >= 0.1 ? 4 : 5;
+    return v.toFixed(d) + u;
+  }
+  return v.toExponential(2) + u;
+}
+
+function _destroyScCharts() {
+  Object.values(_scCharts).forEach(c => { try { c.destroy(); } catch {} });
+  _scCharts = {};
+}
+
+function _renderScWaveform(data) {
+  document.getElementById('sc-waveform-status').textContent = '';
+  const chartsDiv = document.getElementById('sc-waveform-charts');
+  chartsDiv.innerHTML = '';
+  _destroyScCharts();
+
+  const channels = data.channels || {};
+  // time_axis is METADATA, not an array — it carries sample_rate,
+  // pretrig_samples, t0_ms (first-sample time relative to trigger,
+  // negative when pretrig samples exist), and dt_ms.  Trigger is at
+  // t=0 by convention.
+  const ta       = data.time_axis || {};
+  const sr       = ta.sample_rate || 1024;
+  const dtMs    = ta.dt_ms || (1000.0 / sr);
+  const t0Ms    = ta.t0_ms != null ? ta.t0_ms : 0;
+  // Histogram events have per-interval peaks, not per-sample data.
+  // Render as bars (one per interval) instead of a connected line, and
+  // suppress trigger/zero overlays which don't apply.  X-axis becomes
+  // interval index since the sample_rate-based time math is meaningless
+  // here (each "sample" is one interval, typically 1-5 minutes long).
+  const isHistogram = String(data.record_type || '').toLowerCase().includes('histogram');
+
+  // Which channels have data — determines which one renders the shared bottom axis.
+  const withData = _SC_CHANNEL_ORDER.filter(ch =>
+    channels[ch] && (channels[ch].values || []).length > 0
+  );
+  const lastCh = withData[withData.length - 1];
+
+  const micUnit = _getMicUnit();   // user preference: 'dBL' or 'psi'
+
+  for (const ch of _SC_CHANNEL_ORDER) {
+    const chData = channels[ch];
+    if (!chData) continue;
+    let values = chData.values || [];
+    let chUnit = chData.unit || '';
+    let chPeak = chData.peak;
+
+    // Mic channel: convert from raw psi to dB(L) when user prefers dBL
+    // (default).  Per-sample values use _psiToDblForChart which rectifies
+    // (abs) the AC waveform and floors at MIC_DBL_FLOOR so the chart is
+    // continuous with a baseline + peaks above it, instead of a sparse
+    // pattern of isolated spikes for "moments when sound briefly exceeded
+    // the Y-axis bottom".  The peak label uses _psiToDbl with the
+    // unrectified peak (preserves the true measurement).
+    if (ch === 'MicL' && chUnit === 'psi' && micUnit === 'dBL') {
+      values = values.map(_psiToDblForChart);
+      chPeak = _psiToDbl(chPeak);
+      chUnit = 'dB(L)';
+    }
+
+    const wrap = document.createElement('div');
+    wrap.style.cssText = 'background:var(--surface);border:1px solid var(--border2);border-radius:6px;padding:6px 30px 4px 10px';
+    const lbl = document.createElement('div');
+    lbl.style.cssText = `font-size:10px;font-weight:600;letter-spacing:0.05em;text-transform:uppercase;margin-bottom:2px;color:${_SC_CHANNEL_COLORS[ch]};display:flex;justify-content:space-between`;
+    const peakStr = chPeak != null
+      ? `peak ${_fmtPeak(chPeak, chUnit)}`
+      : '';
+    lbl.innerHTML = `<span>${ch}</span><span style="color:var(--text-dim);font-weight:normal">${peakStr}</span>`;
+    wrap.appendChild(lbl);
+
+    if (values.length === 0) {
+      const e = document.createElement('div');
+      e.style.cssText = 'height:80px;display:flex;align-items:center;justify-content:center;color:var(--text-dim);font-size:11px';
+      e.textContent = 'no samples decoded';
+      wrap.appendChild(e);
+      chartsDiv.appendChild(wrap);
+      continue;
+    }
+
+    const canvasWrap = document.createElement('div');
+    canvasWrap.style.cssText = 'position:relative;height:100px';
+    const canvas = document.createElement('canvas');
+    canvasWrap.appendChild(canvas);
+    wrap.appendChild(canvasWrap);
+    chartsDiv.appendChild(wrap);
+
+    // Waveform: per-sample time in ms relative to trigger (negative for pretrig).
+    // Histogram: when the server has aggregated to BW-reported intervals AND
+    // provides per-interval timestamps, use those as x-axis labels (HH:MM:SS).
+    // Falls back to interval index.
+    let times;
+    if (isHistogram) {
+      const intervalTimes = ta.interval_times || [];
+      times = (intervalTimes.length === values.length)
+        ? intervalTimes
+        : values.map((_, i) => i + 1);
+    } else {
+      times = values.map((_, i) => t0Ms + i * dtMs);
+    }
+
+    // Downsample for rendering when very long.
+    const MAX = 3000;
+    let rT = times, rV = values;
+    if (values.length > MAX) {
+      const step = Math.ceil(values.length / MAX);
+      rT = times.filter((_, i) => i % step === 0);
+      rV = values.filter((_, i) => i % step === 0);
+    }
+    const showX = (ch === lastCh);
+
+    // Tick label formatter: snap floats to 1 decimal place so we don't get
+    // "11.7187040000000002 ms" garbage from accumulated floating-point error.
+    const xAxisLabel = isHistogram ? '' : ' ms';
+    const fmtTick = i => {
+      const v = rT[i];
+      if (typeof v === 'number') {
+        // Whole numbers (intervals) → no decimals.  Sub-integer ms → 1 decimal.
+        const s = Number.isInteger(v) ? String(v) : v.toFixed(1);
+        return s + xAxisLabel;
+      }
+      return String(v) + xAxisLabel;
+    };
+
+    // Y-axis bounds.  Convention:
+    //   - Geo (Tran/Vert/Long) waveform: symmetric around zero, so the zero
+    //     line sits mid-chart and +/- excursions are visually balanced.
+    //   - Geo histogram, mic histogram in psi: zero at the bottom, with a
+    //     minimum range so quiet events don't fill the panel.
+    //   - Mic in dBL: MIC_DBL_FLOOR to peak + 5 dB.  Mic waveform in psi: auto-scale.
+    let yBounds = {};
+    const isGeo = ch !== 'MicL';
+    if (isGeo && !isHistogram) {
+      // Waveform geo: symmetric around zero, full zoom to shape detail.
+      let absMax = 0;
+      for (const v of values) {
+        const a = Math.abs(v);
+        if (a > absMax) absMax = a;
+      }
+      const padded = (absMax || 1) * 1.10;
+      yBounds = { min: -padded, max: padded };
+    } else if (isGeo && isHistogram) {
+      // Histogram geo: enforce a minimum chart range so a quiet
+      // 0.005 in/s event renders as ~10% of chart height instead of
+      // filling the panel.  Matches BW's near-fixed-scale convention
+      // (their footer is "Geo: 0.002 in/s/div" — a chart-relative scale,
+      // not auto-zoom).
+      const HIST_GEO_MIN_INS = 0.05;
+      let peak = 0;
+      for (const v of values) { const a = Math.abs(v); if (a > peak) peak = a; }
+      yBounds = { min: 0, max: Math.max(peak * 1.10, HIST_GEO_MIN_INS) };
+    } else if (ch === 'MicL' && micUnit === 'dBL') {
+      // Mic in dBL — pin baseline at noise-floor minimum (where we floored
+      // quiet samples), top at actual peak + a few dB headroom.
+      const peakDbl = (typeof chPeak === 'number' && isFinite(chPeak))
+        ? chPeak + 5
+        : 100;
+      yBounds = { min: MIC_DBL_FLOOR, max: Math.max(peakDbl, MIC_DBL_FLOOR + 20) };
+    } else if (ch === 'MicL' && isHistogram && micUnit === 'psi') {
+      // Mic histogram in psi — same minimum-range treatment as geo.
+      // 0.001 psi ≈ 110 dBL — typical "loud" mic peak.  Quiet events
+      // sit near the bottom.
+      const HIST_MIC_MIN_PSI = 0.001;
+      let peak = 0;
+      for (const v of values) { const a = Math.abs(v); if (a > peak) peak = a; }
+      yBounds = { min: 0, max: Math.max(peak * 1.10, HIST_MIC_MIN_PSI) };
+    }
+
+    _scCharts[ch] = new Chart(canvas, {
+      type: isHistogram ? 'bar' : 'line',
+      data: {
+        labels: rT.map(t => (typeof t === 'number' ? (Number.isInteger(t) ? String(t) : t.toFixed(2)) : t)),
+        datasets: isHistogram ? [{
+          data: rV,
+          backgroundColor: _SC_CHANNEL_COLORS[ch],
+          borderWidth: 0,
+          barPercentage: 1.0,
+          categoryPercentage: 1.0,  // bars touch — "tight bargraph" look
+        }] : [{
+          data: rV,
+          borderColor: _SC_CHANNEL_COLORS[ch],
+          borderWidth: 1,
+          pointRadius: 0,
+          tension: 0,
+        }],
+      },
+      options: {
+        animation: false, responsive: true, maintainAspectRatio: false,
+        plugins: {
+          legend: { display: false },
+          tooltip: {
+            mode: 'index', intersect: false,
+            callbacks: {
+              title: items => isHistogram
+                ? `interval ${items[0].label}`
+                : `t = ${items[0].label} ms`,
+              label: item => `${ch}: ${_fmtPeak(item.raw, chUnit)}`,
+            },
+          },
+        },
+        scales: {
+          x: {
+            type: 'category', display: showX,
+            ticks: { color: '#484f58', maxTicksLimit: 8, maxRotation: 0, callback: (v, i) => fmtTick(i) },
+            grid:  { color: '#21262d', drawTicks: showX },
+          },
+          y: {
+            ...yBounds,
+            ticks: { color: '#484f58', maxTicksLimit: 4 },
+            grid:  { color: '#21262d' },
+            title: { display: true, text: chUnit, color: '#484f58', font: { size: 9 } },
+          },
+        },
+      },
+      plugins: isHistogram ? [] : [{
+        // Trigger line + triangle markers + zero baseline — only meaningful
+        // for waveform-mode events.  Histograms have no trigger.
+        id: 'overlays',
+        afterDraw(chart) {
+          const ctx = chart.ctx, x = chart.scales.x, y = chart.scales.y;
+          // Dashed trigger line at t=0
+          const zi = rT.findIndex(t => parseFloat(t) >= 0);
+          if (zi >= 0) {
+            const px = x.getPixelForValue(zi);
+            ctx.save();
+            ctx.beginPath(); ctx.moveTo(px, y.top); ctx.lineTo(px, y.bottom);
+            ctx.strokeStyle = 'rgba(248,81,73,0.8)'; ctx.lineWidth = 1.2;
+            ctx.setLineDash([4, 3]); ctx.stroke(); ctx.restore();
+            // Triangle markers above and below the chart
+            ctx.save();
+            ctx.fillStyle = '#f85149';
+            ctx.beginPath();
+            ctx.moveTo(px - 4, y.top - 7); ctx.lineTo(px + 4, y.top - 7); ctx.lineTo(px, y.top - 1);
+            ctx.closePath(); ctx.fill();
+            ctx.beginPath();
+            ctx.moveTo(px - 4, y.bottom + 7); ctx.lineTo(px + 4, y.bottom + 7); ctx.lineTo(px, y.bottom + 1);
+            ctx.closePath(); ctx.fill();
+            ctx.restore();
+          }
+          // Zero baseline + label
+          const zy = y.getPixelForValue(0);
+          if (zy >= y.top && zy <= y.bottom) {
+            ctx.save();
+            ctx.strokeStyle = '#30363d'; ctx.lineWidth = 0.8;
+            ctx.setLineDash([2, 2]);
+            ctx.beginPath(); ctx.moveTo(x.left, zy); ctx.lineTo(x.right, zy); ctx.stroke();
+            ctx.restore();
+            ctx.save();
+            ctx.fillStyle = '#c9d1d9'; ctx.font = '10px monospace';
+            ctx.textAlign = 'left'; ctx.textBaseline = 'middle';
+            ctx.fillText('0.0', x.right + 6, zy);
+            ctx.restore();
+          }
+        },
+      }],
+    });
  }
 }

+// Make sure charts get cleaned up when the modal closes.
+function _scCleanupOnClose() { _destroyScCharts(); }
+
 function _renderSidecar(data) {
  const ev   = data.event        || {};
  const pv   = data.peak_values  || {};
@@ -2454,6 +2886,12 @@ function _renderSidecar(data) {
  const bw   = data.blastware    || {};
  const src  = data.source       || {};
  const rev  = data.review       || {};
+  // bw_report carries the per-channel ASCII-derived stats (ZC Freq,
+  // saturation flags, peak time, etc.).  Only present on events
+  // ingested with a preserved .TXT (post-2026-05-27); falls back to
+  // empty for legacy events.
+  const bwrPeaks = (data.bw_report || {}).peaks || {};
+  const bwrMic   = (data.bw_report || {}).mic   || {};

  document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`;

@@ -2479,27 +2917,72 @@ function _renderSidecar(data) {
  };

  document.getElementById('sc-f-serial').textContent   = ev.serial          || '—';
-  document.getElementById('sc-f-ts').textContent       = ev.timestamp       || '—';
+  // Route through _fmtTs so the unit-local naive timestamp shows as
+  // "5/27/2026, 6:00:13 AM" instead of "2026-05-27T06:00:13".
+  document.getElementById('sc-f-ts').textContent       = _fmtTs(ev.timestamp);
  document.getElementById('sc-f-rt').textContent       = ev.record_type     || '—';
  document.getElementById('sc-f-sr').textContent       = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : '');
  document.getElementById('sc-f-key').textContent      = ev.waveform_key    || '—';

-  document.getElementById('sc-f-tran').textContent     = fmtPpv(pv.transverse);
-  document.getElementById('sc-f-vert').textContent     = fmtPpv(pv.vertical);
-  document.getElementById('sc-f-long').textContent     = fmtPpv(pv.longitudinal);
+  // Suffix with " · {prefix}{N} Hz" when bw_report has a ZC Freq.
+  // Above-range ZC peaks (BW ">100 Hz") get a literal ">" prefix so
+  // operators see the same indicator the PDF shows.
+  const fmtZc = bwr => {
+    if (!bwr || bwr.zc_freq_hz == null) return '';
+    const prefix = bwr.zc_freq_above_range ? '>' : '';
+    return ` · ${prefix}${Math.round(bwr.zc_freq_hz)} Hz`;
+  };
+  document.getElementById('sc-f-tran').textContent     = fmtPpv(pv.transverse)   + fmtZc(bwrPeaks.tran);
+  document.getElementById('sc-f-vert').textContent     = fmtPpv(pv.vertical)     + fmtZc(bwrPeaks.vert);
+  document.getElementById('sc-f-long').textContent     = fmtPpv(pv.longitudinal) + fmtZc(bwrPeaks.long);
  document.getElementById('sc-f-pvs').textContent      = fmtPpv(pv.vector_sum);
-  document.getElementById('sc-f-mic').textContent      = fmtMic(pv.mic_psi);
+  document.getElementById('sc-f-mic').textContent      = fmtMic(pv.mic_psi)      + fmtZc(bwrMic);

  document.getElementById('sc-f-project').textContent  = pi.project         || '—';
  document.getElementById('sc-f-client').textContent   = pi.client          || '—';
  document.getElementById('sc-f-operator').textContent = pi.operator        || '—';
  document.getElementById('sc-f-loc').textContent      = pi.sensor_location || '—';

-  document.getElementById('sc-f-bw').textContent       = bw.filename        || '—';
+  // Filename rendered as a clickable download link for the original BW
+  // binary.  Same endpoint the live-device viewer uses for stored events
+  // (/db/events/{id}/blastware_file).
+  const bwCell = document.getElementById('sc-f-bw');
+  bwCell.innerHTML = '';
+  if (bw.filename && _scCurrentEventId) {
+    const a = document.createElement('a');
+    a.href = `${api()}/db/events/${_scCurrentEventId}/blastware_file`;
+    a.textContent = bw.filename;
+    a.download = bw.filename;
+    a.title = 'Download original BW event binary';
+    a.style.color = 'var(--accent, #58a6ff)';
+    a.style.textDecoration = 'underline';
+    bwCell.appendChild(a);
+  } else {
+    bwCell.textContent = '—';
+  }
  document.getElementById('sc-f-bwsize').textContent   = bw.filesize != null ? `${bw.filesize} bytes` : '—';
  document.getElementById('sc-f-sha').textContent      = bw.sha256          || '—';
-  document.getElementById('sc-f-src').textContent      = src.kind           || '—';
-  document.getElementById('sc-f-cap').textContent      = src.captured_at    || '—';
+  // Source kind + a download link for the preserved BW ASCII report
+  // (.TXT), when available.  Only events ingested after 2026-05-27
+  // have the .TXT preserved; older events show "—".
+  const srcCell = document.getElementById('sc-f-src');
+  srcCell.innerHTML = '';
+  srcCell.appendChild(document.createTextNode(src.kind || '—'));
+  if (src.txt_filename && _scCurrentEventId) {
+    const a = document.createElement('a');
+    a.href = `${api()}/db/events/${_scCurrentEventId}/ascii_report.txt`;
+    a.textContent = ' (download .TXT)';
+    a.download = src.txt_filename;
+    a.title = 'Download preserved BW ASCII report';
+    a.style.color = 'var(--accent, #58a6ff)';
+    a.style.marginLeft = '8px';
+    a.style.fontSize = '11px';
+    srcCell.appendChild(a);
+  }
+  // captured_at has a "Z" suffix (UTC); _fmtTs converts to browser local
+  // — matches the BW-reported recorded-at, no more "21:59:57 vs it's 6 PM"
+  // confusion from operators reading the raw UTC value.
+  document.getElementById('sc-f-cap').textContent      = _fmtTs(src.captured_at);

  document.getElementById('sc-edit-ft').checked        = !!rev.false_trigger;
  document.getElementById('sc-edit-reviewer').value    = rev.reviewer || '';
@@ -2512,6 +2995,19 @@ function closeSidecarModal() {
  document.getElementById('sc-overlay').classList.remove('visible');
  _scCurrentEventId = null;
  _scCurrentSidecar = null;
+  _destroyScCharts();
+}
+
+// Trigger a PDF download for the currently-open event.  The browser
+// handles the actual save dialog from the Content-Disposition header
+// the server sends.
+function downloadEventReport() {
+  if (!_scCurrentEventId) return;
+  const url = `${api()}/db/events/${_scCurrentEventId}/report.pdf`;
+  // Open in a new tab — browser prompts to save or displays inline,
+  // and a failed fetch (e.g. 404 for events with no waveform) shows
+  // its JSON error in-page rather than silently failing.
+  window.open(url, '_blank');
 }

 function onSidecarOverlayClick(e) {
@@ -2722,6 +3218,16 @@ document.addEventListener('keydown', e => {
 // hit localhost:8200, 10.0.0.44:8200, or anything else.
 document.getElementById('api-base').value = window.location.origin;

+// Reflect any persisted mic-unit preference in the header pill on load
+_refreshMicUnitToggleLabel();
+
+// We default to Database view → trigger initial history + units load
+// (switchSection handles this when clicked, but we never click on first paint).
+if (currentSection === 'db') {
+  if (!histLoaded)  loadHistory();
+  if (!unitsLoaded) loadUnits();
+}
+
 // Press Enter in any live connect field to connect
 ['dev-host','dev-port'].forEach(id => {
  document.getElementById(id)?.addEventListener('keydown', e => { if (e.key === 'Enter') connectUnit(); });
@@ -2738,11 +3244,18 @@ document.getElementById('api-base').value = window.location.origin;
      <button class="sc-close" onclick="closeSidecarModal()">×</button>
    </div>
    <div class="sc-body">
+      <!-- Waveform plot: 4 channels stacked (MicL, Long, Vert, Tran) -->
+      <div class="sc-section" id="sc-section-waveform">
+        <h4>Waveform</h4>
+        <div id="sc-waveform-status" style="color:var(--text-dim);font-size:11px;margin-bottom:6px">Loading…</div>
+        <div id="sc-waveform-charts" style="display:flex;flex-direction:column;gap:6px"></div>
+      </div>
      <div class="sc-section">
        <h4>Event</h4>
        <dl class="sc-grid">
          <dt>Serial</dt>           <dd id="sc-f-serial">—</dd>
-          <dt>Timestamp</dt>        <dd id="sc-f-ts">—</dd>
+          <dt title="When the seismograph recorded this event (from the BW report's Event Time field)">Recorded at</dt>
+                                    <dd id="sc-f-ts">—</dd>
          <dt>Record type</dt>      <dd id="sc-f-rt">—</dd>
          <dt>Sample rate</dt>      <dd id="sc-f-sr">—</dd>
          <dt>Waveform key</dt>     <dd id="sc-f-key">—</dd>
@@ -2774,7 +3287,8 @@ document.getElementById('api-base').value = window.location.origin;
          <dt id="sc-l-bwsize">File size</dt>   <dd id="sc-f-bwsize">—</dd>
          <dt id="sc-l-sha">File sha256</dt>    <dd id="sc-f-sha">—</dd>
          <dt>Source kind</dt>      <dd id="sc-f-src">—</dd>
-          <dt>Captured at</dt>      <dd id="sc-f-cap">—</dd>
+          <dt title="When SFM received and stored this event, NOT the unit-local trigger time (see 'Recorded at' in the Event section for that).">Time received</dt>
+                                    <dd id="sc-f-cap">—</dd>
        </dl>
      </div>
      <div class="sc-section">
@@ -2797,6 +3311,10 @@ document.getElementById('api-base').value = window.location.origin;
    </div>
    <div class="sc-footer">
      <span class="sc-status" id="sc-status"></span>
+      <button class="btn btn-ghost" id="sc-pdf-btn" onclick="downloadEventReport()"
+              title="Download an Instantel-style Event Report PDF for this event">
+        Download PDF
+      </button>
      <button class="btn btn-ghost" onclick="closeSidecarModal()">Cancel</button>
      <button class="btn" id="sc-save-btn" onclick="saveSidecarReview()">Save</button>
    </div>
@@ -108,11 +108,30 @@ class WaveformStore:
        """Return absolute path to the .h5 clean-waveform file for a given event."""
        return self._serial_dir(serial) / f"{filename}.h5"

+    def txt_path_for(self, serial: str, filename: str) -> Path:
+        """Return absolute path to the preserved BW ASCII report (.TXT)
+        for a given event.
+
+        We name it ``<filename>_ASCII.TXT`` to match BW's own filename
+        convention in the ACH folder.  Saved at ingest time alongside
+        the binary so that parser bug fixes can be applied retroactively
+        by re-parsing, without re-forwarding from the watcher PC.
+        """
+        return self._serial_dir(serial) / f"{filename}_ASCII.TXT"
+
    def open_blastware(self, serial: str, filename: str) -> Optional[Path]:
        """Return absolute path to an existing event file or None."""
        bw_path, _ = self.paths_for(serial, filename)
        return bw_path if bw_path.exists() else None

+    def open_txt(self, serial: str, filename: str) -> Optional[Path]:
+        """Return absolute path to the preserved BW ASCII report for an
+        event, or None if the .TXT wasn't saved at ingest time (events
+        ingested before .TXT preservation landed return None until
+        re-forwarded)."""
+        p = self.txt_path_for(serial, filename)
+        return p if p.exists() else None
+
    # ── save / load ─────────────────────────────────────────────────────────────

    def save(
@@ -357,6 +376,28 @@ class WaveformStore:
        filesize = bw_path.stat().st_size
        sha256   = event_file_io.file_sha256(bw_path)

+        # 1b. preserve the raw BW ASCII report (.TXT) alongside the binary.
+        # Saved at <root>/<serial>/<filename>_ASCII.TXT.  Lets us re-parse
+        # offline after parser fixes without needing to re-forward from
+        # the watcher PC.  Negligible storage cost (~15 KB per event).
+        # Skipped silently when no report was supplied (live download path,
+        # manual upload without paired TXT).
+        txt_filename: Optional[str] = None
+        if bw_report_text is not None:
+            try:
+                txt_path = self.txt_path_for(serial, filename)
+                if isinstance(bw_report_text, bytes):
+                    txt_path.write_bytes(bw_report_text)
+                else:
+                    txt_path.write_text(bw_report_text)
+                txt_filename = txt_path.name
+            except Exception as exc:
+                log.warning(
+                    "save_imported_bw: failed to save TXT for %s: %s — "
+                    "continuing without it",
+                    filename, exc,
+                )
+
        # 2. write the .h5 clean-waveform file from the parsed Event.
        # Note: peaks here are computed from raw samples (the BW file
        # doesn't carry the device-authoritative 0C peaks).  Best-effort.
@@ -393,6 +434,7 @@ class WaveformStore:
            blastware_sha256=sha256,
            source_kind="bw-import",
            a5_pickle_filename=None,
+            txt_filename=txt_filename,
            review=existing_review,
            bw_report=bw_report,
        )
@@ -425,21 +467,21 @@ class WaveformStore:
        Ingest a Thor (Micromate Series IV) IDF event file (`.IDFW` or
        `.IDFH`) produced by Thor's TXT exporter.

-        Thor binaries are stored as opaque bytes — seismo-relay doesn't
-        yet decode the proprietary IDF binary format (codec slot lives
-        at ``micromate/idf_file.py``).  Device-authoritative metadata
-        comes from the paired ``.IDFW.txt`` / ``.IDFH.txt`` sidecar
-        when supplied.
-
        Workflow:
-          1. Parse the paired TXT report (when supplied) via
-             ``micromate.parse_idf_report`` → dict.
-          2. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
-          3. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
-          4. Bridge IdfEvent → ``minimateplus.Event`` (for the existing
-             sidecar / DB insert machinery) via
-             ``IdfEvent.to_minimateplus_event(waveform_key)``.
-          5. Write the ``.sfm.json`` sidecar with
+          1. For sig-A `.IDFW` / `.IDFH` binaries, decode samples or
+             interval peaks + metadata via ``micromate.idf_file.read_idf_file()``.
+             Decode failure or a sig-B file falls through to the .txt-only flow.
+          2. Parse the paired TXT report (when supplied) via
+             ``micromate.parse_idf_report`` → dict.  TXT remains the
+             source of truth for fields the binary doesn't yet supply
+             (full peak set with ZC freq / Time of Peak, sensor self-check,
+             firmware string, project strings).
+          3. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
+          4. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
+          5. Bridge IdfEvent → ``minimateplus.Event`` and attach
+             ``raw_samples`` from the binary decoder (when available).
+          6. Write the `.h5` clean-waveform file when samples decoded.
+          7. Write the ``.sfm.json`` sidecar with
             ``source.kind = "idf-import"`` and the full raw IDF report
             under ``extensions.idf_report``.

@@ -448,7 +490,38 @@ class WaveformStore:
        """
        from micromate import IdfEvent, parse_idf_report

-        # Parse the .txt sidecar (best-effort; non-fatal on failure).
+        # 1. Binary decode (sig-A IDFW and IDFH).  Non-fatal: any failure
+        # leaves samples / binary metadata unfilled and we proceed with
+        # the .txt path as before.
+        idf_samples: Optional[dict] = None
+        idf_intervals: Optional[list] = None
+        binary_md = None
+        binary_peaks = None
+        is_histogram = False
+        try:
+            from micromate.idf_file import read_idf_file
+            # Pass idf_bytes through `data=` — at this point in the flow
+            # the binary hasn't been written to disk yet, so the codec
+            # can't read from source_path.  We still pass source_path so
+            # the codec has the filename for error messages + .IDFH
+            # suffix detection.
+            res = read_idf_file(source_path, data=idf_bytes)
+            idf_samples = res.samples or None
+            idf_intervals = res.intervals
+            is_histogram = res.intervals is not None
+            binary_md = res.binary_metadata
+            binary_peaks = res.event.peaks
+        except NotImplementedError:
+            # sig-B — codec doesn't handle this yet.
+            pass
+        except Exception as exc:
+            log.warning(
+                "save_imported_idf: binary codec failed for %s: %s — "
+                "falling back to .txt-only ingest",
+                source_path.name, exc,
+            )
+
+        # 2. Parse the .txt sidecar (best-effort; non-fatal on failure).
        report_dict: dict = {}
        if idf_report_text is not None:
            try:
@@ -459,17 +532,58 @@ class WaveformStore:
                    exc,
                )

-        # Build the typed IdfEvent.  Filename is authoritative for
+        # 3. Backfill report_dict with binary metadata for fields the
+        # .txt didn't supply.  The .txt value wins whenever it is
+        # present; binary metadata only fills keys that are missing
+        # or empty (serial, timestamp, sample_rate, record time).
+        if binary_md is not None:
+            if binary_md.serial and not report_dict.get("serial_number"):
+                report_dict["serial_number"] = binary_md.serial
+            if binary_md.event_datetime and not report_dict.get("event_datetime"):
+                report_dict["event_datetime"] = binary_md.event_datetime
+            if binary_md.sample_rate and not report_dict.get("sample_rate"):
+                report_dict["sample_rate"] = binary_md.sample_rate
+            if binary_md.record_time_sec and not report_dict.get("record_time_sec"):
+                report_dict["record_time_sec"] = binary_md.record_time_sec
+            # Calibration date (binary) vs calibration text (.txt) cohabit
+            # under different keys; no overwrite needed.
+            if binary_md.event_datetime and not report_dict.get("event_type"):
+                report_dict["event_type"] = (
+                    "Full Histogram" if is_histogram else "Full Waveform"
+                )
+
+        # Binary-derived peaks fill in when the .txt didn't supply them.
+        # They're ~3% low vs the device-authoritative .txt values (residual
+        # codec drift), so .txt always wins when present.
+        if binary_peaks is not None:
+            if binary_peaks.transverse_ips and not report_dict.get("tran_ppv"):
+                report_dict["tran_ppv"] = binary_peaks.transverse_ips
+            if binary_peaks.vertical_ips and not report_dict.get("vert_ppv"):
+                report_dict["vert_ppv"] = binary_peaks.vertical_ips
+            if binary_peaks.longitudinal_ips and not report_dict.get("long_ppv"):
+                report_dict["long_ppv"] = binary_peaks.longitudinal_ips
+
+        # 4. Build the typed IdfEvent.  Filename is authoritative for
        # (serial, timestamp, kind); the report's event_datetime takes
        # precedence over the filename timestamp inside from_report().
        idf_event = IdfEvent.from_report(report_dict, source_path.name)

+        # The binary mic peak (psi) isn't carried through from_report() —
+        # IdfReport.from_dict only sees the .txt's dB(L) value.  Pull the
+        # binary-derived ``mic_pspl_psi`` onto the typed IdfEvent so the
+        # downstream bridge can populate ``PeakValues.micl`` (psi-shaped)
+        # and the h5 writer's per-count mic factor lands at a sensible
+        # value.  Without this, the h5 mic chart auto-scales against the
+        # dB(L) value-as-pseudo-psi and renders ~flat.
+        if binary_peaks is not None and binary_peaks.mic_pspl_psi is not None:
+            idf_event.peaks.mic_pspl_psi = binary_peaks.mic_pspl_psi
+
        # Operator-supplied serial_hint wins over the binary's filename
        # prefix when both are present (e.g. callers passing a known-good
        # serial that overrides a misnamed export).
        serial = serial_hint or idf_event.serial or "UNKNOWN"

-        # Filesystem write.
+        # 5. Filesystem write of binary bytes.
        filename = source_path.name
        bw_path = self._serial_dir(serial) / filename
        bw_path.write_bytes(idf_bytes)
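
Review note: the step 3 backfill above repeats one rule per field: copy the binary value only when the binary has one and the .txt left the key missing or empty. A minimal sketch of that rule follows. The helper name `fill_missing` and the plain dict standing in for `binary_md` are hypothetical and not part of this PR.

```python
from typing import Any, Mapping


def fill_missing(report: dict, fallback: Mapping[str, Any]) -> dict:
    """Copy truthy fallback values into `report` only where the
    report's own value is missing or falsy (hypothetical helper)."""
    for key, value in fallback.items():
        # Same guard shape as the diff: `if binary.x and not report.get(x)`.
        if value and not report.get(key):
            report[key] = value
    return report


# The .txt supplied a serial but no sample rate.
report = {"serial_number": "BE18190", "sample_rate": None}
# Stand-in for binary metadata; a record time of 0 counts as "not available".
binary = {"serial_number": "BE00000", "sample_rate": 1024, "record_time_sec": 0}

merged = fill_missing(report, binary)
```

Because the guard is truthiness-based, a legitimate zero from either side is treated as absent. That matches the guards in the diff, and it is worth keeping in mind for any field where 0 is a meaningful value.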
@@ -481,13 +595,59 @@ class WaveformStore:
        # surrogate — every distinct binary maps to a distinct row.
        waveform_key = bytes.fromhex(sha256)[:16]

-        # Bridge to minimateplus.Event for the existing sidecar / DB
+        # 6. Bridge to minimateplus.Event for the existing sidecar / DB
        # insert paths.  See IdfEvent.to_minimateplus_event() for the
        # caveats of this bridge (mic units, missing fields → sidecar).
        ev = idf_event.to_minimateplus_event(waveform_key)

-        # Write the sidecar.  Source kind "idf-import" was added to the
-        # allow-list in event_file_io.event_to_sidecar_dict for this.
+        # Attach the decoded sample arrays.  Thor's decoder counts use
+        # LSB = 0.0003 in/s for geo (vs BW's 16-count units at 0.005 in/s)
+        # — the .h5 writer's geo_range="normal" yields LSB = 10/32768
+        # ≈ 0.000305 in/s, so plotted samples come out ~1.7% high.
+        # Acceptable known offset; refine with a Thor-aware h5 path later.
+        if idf_samples is not None:
+            ev.raw_samples = idf_samples
+            n_samples = max((len(idf_samples.get(ch, [])) for ch in ("Tran", "Vert", "Long", "MicL")), default=0)
+            ev.total_samples = ev.total_samples or n_samples
+
+        # For IDFH histograms there are no per-sample waveform arrays — the
+        # device stores one peak ADC count per interval per channel.  Synthesise
+        # a 1-sample-per-interval array so the existing h5+renderer pipeline
+        # (which groups samples down to ``n_intervals`` bars via max-per-group)
+        # produces a non-blank histogram chart.  Each "sample" is the peak ADC
+        # count for that interval, so the h5 writer's ``count × geo_fs/32768``
+        # conversion yields the right physical value for the bar height.
+        if is_histogram and idf_intervals:
+            hist_samples = {
+                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
+                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
+                "Long": [iv.peak_count("Long") for iv in idf_intervals],
+                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
+            }
+            ev.raw_samples = hist_samples
+            ev.total_samples = ev.total_samples or len(idf_intervals)
+
+        # 7. Write the .h5 clean-waveform file when we have samples to write
+        # (either the IDFW per-sample stream, or the IDFH synthesised per-
+        # interval peak array).  The renderer treats both shapes the same way.
+        hdf5_filename: Optional[str] = None
+        if ev.raw_samples:
+            hdf5_path = self.hdf5_path_for(serial, filename)
+            try:
+                event_hdf5.write_event_hdf5(
+                    hdf5_path, ev,
+                    serial=serial,
+                    geo_range="normal",   # Thor's geo full scale is also 10 in/s (Normal)
+                    source_kind="idf-import",
+                )
+                hdf5_filename = hdf5_path.name
+            except Exception as exc:
+                log.warning(
+                    "save_imported_idf: HDF5 write failed for %s: %s — continuing without .h5",
+                    hdf5_path, exc,
+                )
+
+        # 8. Write the sidecar.  Source kind "idf-import" is on the allow-list.
        sidecar_path = self.sidecar_path_for(serial, filename)
        existing_review = None
        if sidecar_path.exists():
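
Review note: the "~1.7% high" figure in the raw_samples comment above can be reproduced from the two LSB values that comment states. The sketch below only restates that arithmetic; the constant and function names are illustrative and are not taken from the h5 writer.

```python
GEO_FULL_SCALE_IPS = 10.0   # "normal" geo range full scale, per the diff comment
ADC_HALF_SPAN = 32768       # signed 16-bit half span used by the h5 writer's scaling
THOR_LSB_IPS = 0.0003       # Thor decoder LSB, as stated in the diff comment

# LSB the h5 writer applies when geo_range="normal".
h5_lsb_ips = GEO_FULL_SCALE_IPS / ADC_HALF_SPAN

# Relative overshoot when Thor counts are scaled with the h5 LSB.
overshoot_pct = (h5_lsb_ips / THOR_LSB_IPS - 1.0) * 100.0


def counts_to_ips(count: int, lsb_ips: float = h5_lsb_ips) -> float:
    """Convert a raw count to in/s with the given LSB (illustrative name)."""
    return count * lsb_ips
```

A Thor-aware h5 path, as the comment suggests, would amount to passing `THOR_LSB_IPS` instead of the default.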
@@ -512,19 +672,67 @@ class WaveformStore:
        # Time of Peak, sensor self-check, calibration, firmware).
        if report_dict:
            sidecar["extensions"]["idf_report"] = report_dict
+
+        # Project the IDF report into the BW report sidecar shape so the
+        # existing Event Report PDF pipeline (sfm/report_pdf.py) can
+        # render Thor events without needing a separate code path.  Thor
+        # data is 95% the same metric set as BW — the adapter handles
+        # the field-name mapping.
+        if report_dict or binary_md is not None:
+            try:
+                from micromate.idf_to_bw_report import build_bw_report_from_idf
+                sidecar["bw_report"] = build_bw_report_from_idf(
+                    report_dict or {},
+                    binary_md=binary_md,
+                    intervals=idf_intervals,
+                    is_histogram=is_histogram,
+                )
+            except Exception as exc:
+                log.warning(
+                    "save_imported_idf: idf→bw_report adapter failed for %s: %s — "
+                    "report PDF will fall back to DB-only fields",
+                    filename, exc,
+                )
+        # For histograms, also stash the binary-decoded per-interval
+        # records so the UI / report layer doesn't need to re-walk the
+        # IDFH file at render time.
+        if idf_intervals is not None:
+            sidecar["extensions"]["idf_intervals"] = [
+                {
+                    "offset":     iv.offset,
+                    "tran_peak":  iv.peak_count("Tran"),
+                    "tran_halfp": iv.tran_halfp,
+                    "tran_freq":  iv.freq_hz("Tran"),
+                    "vert_peak":  iv.peak_count("Vert"),
+                    "vert_halfp": iv.vert_halfp,
+                    "vert_freq":  iv.freq_hz("Vert"),
+                    "long_peak":  iv.peak_count("Long"),
+                    "long_halfp": iv.long_halfp,
+                    "long_freq":  iv.freq_hz("Long"),
+                    "mic_peak":   iv.peak_count("MicL"),
+                    "mic_halfp":  iv.micl_halfp,
+                    "mic_freq":   iv.freq_hz("MicL"),
+                }
+                for iv in idf_intervals
+            ]
        event_file_io.write_sidecar(sidecar_path, sidecar)

        log.info(
            "WaveformStore.save_imported_idf serial=%s filename=%s filesize=%d "
-            "report_attached=%s",
-            serial, filename, filesize, bool(report_dict),
+            "kind=%s report_attached=%s binary_decoded=%s h5=%s intervals=%d",
+            serial, filename, filesize,
+            "histogram" if is_histogram else "waveform",
+            bool(report_dict),
+            (idf_samples is not None) or (idf_intervals is not None),
+            hdf5_filename or "(skipped)",
+            len(idf_intervals) if idf_intervals else 0,
        )
        return ev, {
            "filename":           filename,
            "filesize":           filesize,
            "sha256":             sha256,
            "a5_pickle_filename": None,
-            "hdf5_filename":      None,
+            "hdf5_filename":      hdf5_filename,
            "sidecar_filename":   sidecar_path.name,
            "serial":             serial,
        }
@@ -385,6 +385,98 @@ def test_user_notes_extra_lines_beyond_four_are_dropped():
    assert "L5" not in r.user_note_labels.values()


+def test_oorange_marker_treated_as_saturation():
+    """BW writes 'OORANGE' (Out Of Range — truncated) when a channel
+    exceeds its full-scale.  Verify ppv_ips falls back to geo_range_ips
+    + saturated flag is set, mirroring the real T190LD5Q.LK0W,
+    T438L713.RY0W, and K557L3YM.OE0W events from prod 2026-05-27.
+    """
+    txt = """\
+"Event Type : Full Waveform"
+"Serial Number : BE18190"
+"Geo Range : 10.000 in/s"
+"Tran PPV : 2.140 in/s"
+"Vert PPV : OORANGE in/s"
+"Long PPV : 2.830 in/s"
+"Peak Vector Sum : OORANGE in/s"
+"Peak Vector Sum TimeSum : 0.007 s"
+"MicL PSPL : OORANGE "
+"""
+    r = parse_report(txt)
+    # Tran/Long parse normally
+    assert r.channels["Tran"].ppv_ips == 2.14
+    assert r.channels["Tran"].ppv_saturated is False
+    assert r.channels["Long"].ppv_ips == 2.83
+    # Vert saturated → range max + flag
+    assert r.channels["Vert"].ppv_ips == 10.0
+    assert r.channels["Vert"].ppv_saturated is True
+    # PVS saturated → sqrt(3) * range_max as upper bound + flag
+    import math
+    assert r.peak_vector_sum_ips == pytest.approx(math.sqrt(3) * 10.0)
+    assert r.peak_vector_sum_saturated is True
+    # Mic saturated → 140 dBL conservative upper bound + flag
+    assert r.mic.pspl_dbl == 140.0
+    assert r.mic.pspl_saturated is True
+    # PVS time still parses despite the BW typo'd label "TimeSum"
+    assert r.peak_vector_sum_time_s == pytest.approx(0.007)
+
+
+def test_real_oorange_event_t190_parses():
+    """End-to-end against the real T190LD5Q.LK0W ASCII file pulled from
+    a Windows watcher PC on 2026-05-27.  This is the canonical example
+    of the parser-PPV-miss bug we fixed in this iteration."""
+    fixture_path = (
+        Path(__file__).parent.parent / "example-events" /
+        "ascii-5-27-26" / "T190LD5Q_LK0W_ASCII.TXT"
+    )
+    if not fixture_path.exists():
+        pytest.skip("real ASCII fixture not present (local-only)")
+    r = parse_report_file(fixture_path)
+    assert r.serial == "BE18190"
+    assert r.geo_range_ips == 10.0
+    # Tran reads cleanly, Vert was OORANGE
+    assert r.channels["Tran"].ppv_ips == pytest.approx(2.14)
+    assert r.channels["Vert"].ppv_ips == 10.0
+    assert r.channels["Vert"].ppv_saturated is True
+    assert r.channels["Long"].ppv_ips == pytest.approx(2.83)
+    assert r.peak_vector_sum_saturated is True
+    assert r.peak_vector_sum_time_s == pytest.approx(0.007)
+    # Same fixture: Tran ZC Freq is ">100 Hz" — must parse as 100 +
+    # above_range flag, not None (which would render as "—" on the PDF).
+    assert r.channels["Tran"].zc_freq_hz == 100.0
+    assert r.channels["Tran"].zc_freq_above_range is True
+    # Vert/Long are normal numeric values; flag stays False.
+    assert r.channels["Vert"].zc_freq_above_range is False
+    assert r.channels["Long"].zc_freq_above_range is False
+
+
+def test_above_range_marker_treated_as_zc_threshold():
+    """BW writes '>100 Hz' for ZC Freq when the zero-crossing algorithm
+    sees a peak too fast to count (cuts off at the device's 100 Hz
+    reporting ceiling).  Parser must store the threshold + flag, not
+    fall back to None.
+    """
+    txt = """\
+"Event Type : Full Waveform"
+"Serial Number : BE18190"
+"Tran ZC Freq : >100 Hz"
+"Vert ZC Freq : 73 Hz"
+"Long ZC Freq : N/A Hz"
+"MicL  ZC Freq : >100 Hz"
+"""
+    r = parse_report(txt)
+    assert r.channels["Tran"].zc_freq_hz == 100.0
+    assert r.channels["Tran"].zc_freq_above_range is True
+    assert r.channels["Vert"].zc_freq_hz == 73.0
+    assert r.channels["Vert"].zc_freq_above_range is False
+    # N/A → None, flag stays False
+    assert r.channels["Long"].zc_freq_hz is None
+    assert r.channels["Long"].zc_freq_above_range is False
+    # Mic above-range
+    assert r.mic.zc_freq_hz == 100.0
+    assert r.mic.zc_freq_above_range is True
+
+
 def test_real_histogram_fixture_populates_sensor_location():
    """End-to-end: the histogram fixture uses 'Seis. Location:' — must
    successfully populate sensor_location via position-based parsing."""
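
Review note: the three new tests above pin down two marker conventions: `OORANGE` maps to the range maximum plus a saturation flag, and `>100 Hz` maps to the threshold plus an above-range flag. The sketch below shows one way such value parsing could look. `parse_ppv` and `parse_zc_freq` are hypothetical helpers written for illustration, not the parser's actual functions.

```python
import re
from typing import Optional, Tuple


def parse_ppv(value: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Return (ppv_ips, saturated) for a value such as '2.140 in/s'
    or 'OORANGE in/s' (hypothetical helper)."""
    parts = value.split()
    token = parts[0] if parts else ""
    if token == "OORANGE":
        # Saturated channel: report the range maximum and set the flag.
        return geo_range_ips, True
    try:
        return float(token), False
    except ValueError:
        return None, False


def parse_zc_freq(value: str) -> Tuple[Optional[float], bool]:
    """Return (freq_hz, above_range) for '73 Hz', '>100 Hz' or 'N/A Hz'
    (hypothetical helper)."""
    m = re.match(r"\s*(>?)\s*(\d+(?:\.\d+)?)", value)
    if m is None:
        # Non-numeric values such as 'N/A' yield no frequency and no flag.
        return None, False
    return float(m.group(2)), m.group(1) == ">"
```

Returning the flag next to the value keeps the numeric field usable for sorting and plotting while still letting the report layer render the marker.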
@@ -529,6 +529,77 @@ def test_save_imported_bw_round_trip(tmp_path: Path):
    assert stored_path.read_bytes() == src.read_bytes()


+# ── apply_bw_report_dict_to_event ────────────────────────────────────────────
+
+
+def test_apply_bw_report_dict_overlays_peaks_and_recording():
+    """Verbatim mirror of the data shape produced by `_bw_report_to_dict`
+    when projecting a parsed `BwAsciiReport` into the sidecar.  Confirms
+    each field overlays onto Event correctly so the backfill path
+    matches ingest behavior."""
+    from minimateplus.models import PeakValues
+    ev = Event(index=0)
+    bw_report = {
+        "peaks": {
+            "tran":       {"ppv_ips": 9.84375},
+            "vert":       {"ppv_ips": 0.305},
+            "long":       {"ppv_ips": 0.405},
+            "vector_sum": {"ips": 14.86736},
+        },
+        "mic": {"pspl_dbl": 115.9},
+        "recording": {"sample_rate_sps": 1024, "record_time_s": 3.0},
+    }
+    event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
+    assert ev.peak_values is not None
+    assert ev.peak_values.tran             == 9.84375
+    assert ev.peak_values.vert             == 0.305
+    assert ev.peak_values.long             == 0.405
+    assert ev.peak_values.peak_vector_sum  == 14.86736
+    # MicL is converted dB → psi via _dbl_to_psi — just confirm non-zero
+    assert ev.peak_values.micl is not None and ev.peak_values.micl > 0
+    assert ev.sample_rate    == 1024
+    assert ev.rectime_seconds == 3.0
+
+
+def test_apply_bw_report_dict_overwrites_codec_peaks():
+    """The whole point of this helper: bw_report wins over whatever the
+    codec produced.  This is what the 2026-05-22 prod backfill missed —
+    DB peaks got overwritten with codec output (incl. PVS=0 on the
+    three top events) when they should have stayed bw_report-overlaid."""
+    from minimateplus.models import PeakValues
+    ev = Event(index=0)
+    # Simulate codec output that's clearly wrong (incomplete decode):
+    ev.peak_values = PeakValues(
+        tran=2.09, vert=0.0, long=0.0, peak_vector_sum=0.0,
+    )
+    bw_report = {
+        "peaks": {
+            "tran":       {"ppv_ips": 9.84},
+            "vert":       {"ppv_ips": 4.95},
+            "long":       {"ppv_ips": 8.05},
+            "vector_sum": {"ips": 14.95},
+        },
+    }
+    event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
+    assert ev.peak_values.tran             == 9.84
+    assert ev.peak_values.vert             == 4.95
+    assert ev.peak_values.long             == 8.05
+    assert ev.peak_values.peak_vector_sum  == 14.95
+
+
+def test_apply_bw_report_dict_no_op_on_empty():
+    """None / empty dict / missing keys should leave Event untouched."""
+    from minimateplus.models import PeakValues
+    for empty in (None, {}, {"peaks": {}}, {"peaks": {"tran": {}}}):
+        ev = Event(index=0)
+        ev.peak_values = PeakValues(tran=1.0, vert=2.0, long=3.0)
+        event_file_io.apply_bw_report_dict_to_event(ev, empty)
+        # Unchanged
+        assert ev.peak_values.tran == 1.0
+        assert ev.peak_values.vert == 2.0
+        assert ev.peak_values.long == 3.0
+
+
 if __name__ == "__main__":
    if pytest is not None:
        pytest.main([__file__, "-v"])
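
Review note: the first overlay test above only checks that the MicL value is non-zero after the dB to psi conversion. For reference, a sketch of the standard acoustic conversion follows, assuming the usual 20 µPa reference pressure. The body of `_dbl_to_psi` is not shown in this diff, so the function below is an assumption about its shape, not a copy of it.

```python
P_REF_PA = 20e-6        # standard airborne sound reference pressure (20 µPa)
PA_PER_PSI = 6894.757   # pascals per pound-force per square inch


def dbl_to_psi(db_l: float) -> float:
    """Convert a linear-weighted sound pressure level in dB to psi
    (assumed formula; the project's _dbl_to_psi may differ)."""
    pascals = P_REF_PA * 10.0 ** (db_l / 20.0)
    return pascals / PA_PER_PSI


# The value used in the overlay test's bw_report dict.
mic_psi = dbl_to_psi(115.9)
```

If the project helper uses the same reference pressure, the test could assert an approximate psi value instead of only checking for a non-zero result.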
@@ -335,3 +335,51 @@ def test_geo_count_to_ins_scale():
    assert geo_count_to_ins(1)  == pytest.approx(0.005)
    assert geo_count_to_ins(10) == pytest.approx(0.050)
    assert geo_count_to_ins(0)  == 0.0
+
+
+# ── Regression: peak is uint8 byte[N], NOT uint16 LE byte[N:N+2] ────────────
+#
+# Block taken verbatim from K558LKZU.RE0H (BE9558) interval 12 — a real
+# field event where the Tran channel had developed a DC offset and was
+# producing sub-Hz drift content the device couldn't characterize.
+# The annotation byte at [7] = 0xd2 is non-zero in that case.  The
+# legacy codec read [6:8] as uint16 LE, producing T_peak = 53763 →
+# 268 in/s, which is physically impossible and roughly 17,900× the actual
+# 0.015 in/s value (T_lo = 3 alone gives the correct count).
+# Verified against the paired BW ASCII export.
+_K558_INTERVAL_12_BLOCK = bytes.fromhex(
+    "00 00 0c 01 0a 00 03 d2 45 00 02 00 02 00 02 00"
+    "02 00 10 00 06 00 00 00 0e 91 2f 00 1e 0a 00 00".replace(" ", "")
+)
+
+
+def test_extension_byte_does_not_inflate_peak():
+    """The annotation byte at [7]/[11]/[15]/[19] must NOT contribute to
+    the peak count.  Decoded T_peak must be 3 (uint8 byte[6]), NOT
+    53763 (uint16 LE byte[6:8])."""
+    body = _K558_INTERVAL_12_BLOCK
+    records = decode_histogram_body_full(body)
+    assert records is not None
+    assert len(records) == 1
+    r = records[0]
+    assert r["t_peak"] == 3,    f"T_peak should be 3 (uint8), got {r['t_peak']}"
+    assert r["v_peak"] == 2
+    assert r["l_peak"] == 2
+    assert r["m_peak"] == 16
+    # Half-periods unchanged — still uint16 LE.
+    assert r["t_halfp"] == 0x0045  # 69 → 7.4 Hz
+    assert r["m_halfp"] == 6       # → 85.3 Hz
+    # Annotation byte is preserved (for future RE) but does not affect peak.
+    assert r["annotations"] == (0xd2, 0x00, 0x00, 0x00)
+
+
+def test_extension_byte_decoded_to_correct_in_s():
+    """End-to-end: the channel-grouped output for the K558 ext block
+    should give T = 3 counts = 0.015 in/s, not 53763 counts = 268 in/s."""
+    channels = decode_histogram_body(_K558_INTERVAL_12_BLOCK)
+    assert channels is not None
+    assert channels["Tran"] == [3]
+    assert geo_count_to_ins(channels["Tran"][0]) == pytest.approx(0.015)
+    assert channels["Vert"] == [2]
+    assert channels["Long"] == [2]
+    assert channels["MicL"] == [16]
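
Review note: the regression above comes down to reading one byte versus two at offset 6. The sketch below reproduces both readings on the K558 block with `struct`. The 0.005 in/s per count factor comes from the `geo_count_to_ins` test. The 1024 sps rate in `halfp_to_hz` is an assumption inferred from the "69 → 7.4 Hz" and "6 → 85.3 Hz" comments, and the function name is illustrative.

```python
import struct

# Interval 12 block, same bytes as _K558_INTERVAL_12_BLOCK in the test.
block = bytes.fromhex(
    "00 00 0c 01 0a 00 03 d2 45 00 02 00 02 00 02 00"
    "02 00 10 00 06 00 00 00 0e 91 2f 00 1e 0a 00 00"
)

IPS_PER_COUNT = 0.005  # per geo_count_to_ins(1) in the test file

# Legacy reading: uint16 LE over bytes [6:8] pulls in the annotation byte 0xd2.
legacy_peak = struct.unpack_from("<H", block, 6)[0]

# Fixed reading: the peak is the single byte at [6].
fixed_peak = block[6]

# Half-periods are still uint16 LE; Tran's sits at [8:10].
t_halfp = struct.unpack_from("<H", block, 8)[0]


def halfp_to_hz(halfp: int, sample_rate: int = 1024) -> float:
    """Frequency from a half-period in samples (assumes 1024 sps)."""
    return sample_rate / (2 * halfp)


legacy_ips = legacy_peak * IPS_PER_COUNT
fixed_ips = fixed_peak * IPS_PER_COUNT
```

With these inputs the legacy reading lands far above the 10 in/s full scale, while the single-byte reading gives the 0.015 in/s that the test expects.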