fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge

Two gaps in backfill_thor_events.py that left old Thor events showing
stale charts after a v0.21.1 backfill pass:

1. IDFH events were skipped from .h5 regeneration (the "have decoded
   samples" gate was IDFW-only). Histograms kept their pre-v0.21.1 .h5,
   written from raw_samples = None, which the renderer turned into a
   near-empty bar chart, or for older events the dB(L)-as-pseudo-psi mic
   scale that produced "107.7 psi" peaks (atomic-bomb level instead of
   footstep level).

   Fix: synthesise the same 1-sample-per-interval array
   save_imported_idf v0.21.1 uses (peak ADC count per channel per
   interval) so the renderer's bar-chart grouping has data to work with.

2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the
   IdfEvent before to_minimateplus_event(). The live save_imported_idf
   does this merge; without it, IdfEvent.from_report() only sees the
   .txt's dB(L) value, the bridge falls back to the dBL→psi formula
   (instead of the binary-accurate 2.14e-6 psi/count value), and the h5
   writer's per-count mic factor lands on a less-correct value.

   Fix: same merge the live ingest does (lift
   res.event.peaks.mic_pspl_psi onto idf_event.peaks before the bridge
   call).

Verified against UM6047_20250804190047.IDFH (250-interval prod
histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being
treated as dB(L)=107.7 in the old h5).

Operator: re-run after deploy.
`docker compose exec sfm python scripts/backfill_thor_events.py` is
idempotent: the existing version check still skips events already at
the new TOOL_VERSION, and review state + captured_at are preserved on
the second pass.
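
A minimal sketch of the interval-array synthesis described in the first fix above, assuming each decoded IDFH interval exposes one peak ADC count per channel. The `Interval` type, the channel names, and `synthesise_interval_samples` are hypothetical illustrations, not code from backfill_thor_events.py or save_imported_idf.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical channel names; the real decoder may label them differently.
CHANNELS = ("tran", "vert", "long", "mic")


@dataclass
class Interval:
    """One histogram interval: peak ADC count per channel (assumed shape)."""
    peaks: Dict[str, int]


def synthesise_interval_samples(intervals: List[Interval]) -> Dict[str, List[int]]:
    """Build a 1-sample-per-interval array for each channel.

    A channel missing from an interval contributes 0, so every channel
    ends up with exactly len(intervals) samples.
    """
    return {
        ch: [iv.peaks.get(ch, 0) for iv in intervals]
        for ch in CHANNELS
    }


intervals = [
    Interval({"tran": 12, "vert": 7, "long": 9, "mic": 13}),
    Interval({"tran": 15, "vert": 6, "long": 11}),  # mic missing -> 0
]
samples = synthesise_interval_samples(intervals)
```

Keeping one value per interval means a bar-chart renderer gets a non-empty array of the same length as the interval count, which is the property the fix relies on.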
version bump - 0.21.1
2026-06-01 20:02:54 +00:00 · 2026-06-01 19:33:44 +00:00 · 2026-06-01 18:27:24 +00:00 · 2026-05-31 20:51:09 +00:00 · 2026-05-30 04:37:43 +00:00 · 2026-05-29 22:17:43 +00:00
143 changed files with 59868 additions and 1261 deletions
@@ -0,0 +1,28 @@
+.git
+.gitignore
+
+.venv
+venv
+env
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.pytest_cache
+.mypy_cache
+.ruff_cache
+
+*.db
+*.db-wal
+*.db-shm
+*.sqlite
+*.sqlite3
+
+sfm/data
+bridges/captures
+example-events
+captures
+logs
+
+.DS_Store
+Thumbs.db
@@ -1,28 +1,33 @@
-/bridges/captures/
-/example-events/
-
-/manuals/
-
-# Python bytecode
-__pycache__/
-*.py[cod]
-
-# Virtual environments
-.venv/
-venv/
-env/
-
-# Editor / OS
-.vscode/
-*.swp
-.DS_Store
-Thumbs.db
-
-# Analyzer outputs
-*.report
-claude_export_*.md
-
-# Frame database
-*.db
-*.db-wal
-*.db-shm
+/bridges/captures/
+/example-events/
+/tests/fixtures/
+/manuals/
+
+# Python build artifacts
+*.egg-info/
+dist/
+build/
+
+# Python bytecode
+__pycache__/
+*.py[cod]
+
+# Virtual environments
+.venv/
+venv/
+env/
+
+# Editor / OS
+.vscode/
+*.swp
+.DS_Store
+Thumbs.db
+
+# Analyzer outputs
+*.report
+claude_export_*.md
+
+# Frame database
+*.db
+*.db-wal
+*.db-shm
@@ -0,0 +1,31 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+# tzdata is required for the TZ env var to take effect (python:slim
+# omits the timezone database).  Without it, datetime.now() / logging
+# / matplotlib all stay in UTC regardless of TZ.  Default zone gets
+# set further down via ENV; users override per-deployment via the
+# `TZ` env var in docker-compose.
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends curl tzdata && \
+    rm -rf /var/lib/apt/lists/*
+
+# Default display timezone — applied to server logs, datetime.now(),
+# matplotlib rendered timestamps, and any naïve-vs-aware datetime
+# conversions in the PDF renderer.  Override via TZ env var in
+# docker-compose; storage in the DB is always UTC regardless.
+ENV TZ=America/New_York
+
+COPY pyproject.toml requirements.txt ./
+COPY minimateplus ./minimateplus
+COPY micromate    ./micromate
+COPY sfm          ./sfm
+COPY bridges      ./bridges
+COPY scripts      ./scripts
+
+RUN pip install --no-cache-dir -e .
+
+EXPOSE 8200
+
+CMD ["python", "-m", "uvicorn", "sfm.server:app", "--host", "0.0.0.0", "--port", "8200"]
@@ -1,16 +1,60 @@
-# seismo-relay  `v0.6.0`
+# seismo-relay  `v0.21.0`

 A ground-up replacement for **Blastware** — Instantel's aging Windows-only
-software for managing MiniMate Plus seismographs.
+software for managing seismographs.  Supports both the **MiniMate Plus
+(Series III)** and the **Micromate (Series IV / "Thor")** families:
+Series III via the live RS-232 / TCP wire protocol *and* Blastware ACH file
+ingest; Series IV via Thor IDF file ingest (binary `.IDFW` / `.IDFH` decode
+plus `.txt` sidecars, v0.21.0+), with the live-device protocol still pending.

-Built in Python. Runs on Windows. Connects to instruments over direct RS-232
-or cellular modem (Sierra Wireless RV50 / RV55).
+Built in Python. Runs on Windows, Linux, or macOS. Connects to instruments
+over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).

-> **Status:** Active development. Full read pipeline working end-to-end:
-> device info, compliance config (with geo thresholds), event download with
-> true event-time metadata (project / client / operator / sensor location
-> sourced from the device at record-time via SUB 5A). Write commands in progress.
-> See [CHANGELOG.md](CHANGELOG.md) for version history.
+> **Status:** Active development. Full read + write + erase + monitoring
+> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
+> handles inbound unit connections, downloads events, and persists everything
+> to a SQLite database. SFM REST API exposes device control and DB queries.
+> **As of v0.14.3 (2026-05-05): SUB 5A bulk waveform protocol is verified
+> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
+> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
+> full Event Reports, frequency analysis, and waveform plots.
+> **v0.16.0 (2026-05-11)** adds BW ASCII report ingestion to
+> `/db/import/blastware_file` — paired with **series3-watcher v1.5.0**,
+> every Blastware ACH event lands in SeismoDb with device-authoritative
+> peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data,
+> without depending on the still-undecoded waveform body codec.
+> **v0.18.0 (2026-05-19)** adds Thor / Micromate Series IV ingest at
+> `/db/import/idf_file` — paired with **thor-watcher v0.3.0**, every
+> `.IDFH` / `.IDFW` event file (plus its `.txt` sidecar) lands in
+> SeismoDb the same way BW events do.  See
+> [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for
+> the IDF format reference and reverse-engineering plan.
+> **v0.19.0 (2026-05-20)** separates Series III and Series IV at the
+> code level: new `micromate/` package alongside `minimateplus/`, new
+> `events.device_family` DB column ("series3" / "series4") so the UI
+> and storage layer dispatch deterministically instead of sniffing
+> filenames.  Self-applying migration backfills existing rows from the
+> binary filename extension.
+> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
+> started in v0.17.x: histogram layouts render correctly against BW
+> reference PDFs, the ASCII parser handles real-world edge cases
+> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
+> Freq is surfaced in both modals (event browser + main webapp).
+> Adds a server-wide `TZ` env var so operator-visible timestamps
+> render in local time instead of UTC.  New
+> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
+> applied retroactively to existing events without re-forwarding,
+> using the `.TXT` files preserved at ingest time.
+> **v0.21.0 (2026-05-29)** is the Thor / Series IV decoder release —
+> `micromate/idf_file.read_idf_file()` now decodes both IDFW
+> (waveform) and IDFH (histogram) binaries (87–99% sample fidelity
+> on quiet IDFW events; all 859 IDFH corpus files decode cleanly).
+> A new `micromate/idf_to_bw_report.py` adapter projects parsed
+> Thor reports into the BW-shaped sidecar block, so Thor events
+> flow through the existing Event Report PDF pipeline without a
+> separate renderer.  Terra-View v0.13.0 ships in parallel and
+> closes Phase 1 of the SFM integration — see its CHANGELOG.
+> See [CHANGELOG.md](CHANGELOG.md) for full version history.

 ---

@@ -18,156 +62,153 @@ or cellular modem (Sierra Wireless RV50 / RV55).

 ```
 seismo-relay/
-├── seismo_lab.py              ← Main GUI (Bridge + Analyzer + Console tabs)
+├── seismo_lab.py              ← Main GUI (Bridge + Analyzer + Download + Console tabs)
 │
-├── minimateplus/              ← MiniMate Plus client library
-│   ├── transport.py           ←   SerialTransport and TcpTransport
-│   ├── protocol.py            ←   DLE frame layer (read/write/parse)
-│   ├── client.py              ←   High-level client (connect, get_config, etc.)
-│   ├── framing.py             ←   Frame builder/parser primitives
-│   └── models.py              ←   DeviceInfo, EventRecord, etc.
+├── minimateplus/              ← Series III (MiniMate Plus) client library
+│   ├── transport.py           ←   SerialTransport, TcpTransport, SocketTransport
+│   ├── protocol.py            ←   DLE frame layer, SUB command dispatch
+│   ├── client.py              ←   High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …)
+│   ├── framing.py             ←   Frame builders, DLE codec, S3FrameParser
+│   ├── models.py              ←   DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig, …
+│   ├── bw_ascii_report.py     ←   Parse BW per-event ASCII reports (.TXT sidecars)
+│   ├── event_file_io.py       ←   Read BW binaries, write .sfm.json sidecars
+│   └── blastware_file.py      ←   Write events to Blastware-compatible .AB0 files
 │
-├── sfm/                       ← SFM REST API server (FastAPI)
-│   └── server.py              ←   /device/info, /device/events, /device/event
+├── micromate/                 ← Series IV (Micromate / Thor) client library (NEW v0.19)
+│   ├── models.py              ←   IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
+│   ├── idf_ascii_report.py    ←   Parse Thor .IDFW.txt / .IDFH.txt event sidecars
+│   ├── idf_file.py            ←   Binary codec for .IDFW + .IDFH (v0.21.0+)
+│   └── idf_to_bw_report.py    ←   Adapter projecting Thor IDF into the BW report shape (v0.21.0+)
+│
+├── sfm/                       ← SFM REST API server (FastAPI, port 8200)
+│   ├── server.py              ←   Live device endpoints + DB query + ingest endpoints + caching
+│   ├── database.py            ←   SeismoDb — SQLite persistence (events, monitor_log, ach_sessions)
+│   ├── waveform_store.py      ←   On-disk store for BW + IDF event binaries + .sfm.json sidecars
+│   └── sfm_webapp.html        ←   Embedded web UI with Call Home config tab
 │
 ├── bridges/
-│   ├── s3-bridge/
-│   │   └── s3_bridge.py       ←   RS-232 serial bridge (capture tool)
+│   ├── ach_server.py          ←   Inbound ACH call-home server (main production server)
+│   ├── ach_mitm.py            ←   Transparent MITM proxy for capturing BW sessions
+│   ├── s3-bridge/             ←   RS-232 serial bridge (capture tool)
 │   ├── tcp_serial_bridge.py   ←   Local TCP↔serial bridge (bench testing)
-│   ├── gui_bridge.py          ←   Standalone bridge GUI (legacy)
+│   ├── gui_bridge.py          ←   Standalone bridge GUI with raw capture checkboxes
 │   └── raw_capture.py         ←   Simple raw capture tool
 │
 ├── parsers/
-│   ├── s3_parser.py           ←   DLE frame extractor
 │   ├── s3_analyzer.py         ←   Session parser, differ, Claude export
-│   ├── gui_analyzer.py        ←   Standalone analyzer GUI (legacy)
+│   ├── gui_analyzer.py        ←   Standalone analyzer GUI
 │   └── frame_db.py            ←   SQLite frame database
 │
 └── docs/
-    └── instantel_protocol_reference.md  ← Reverse-engineered protocol spec
+    ├── instantel_protocol_reference.md  ← Series III protocol spec (the Rosetta Stone)
+    └── idf_protocol_reference.md         ← Series IV (Thor IDF) format reference + codec RE plan
 ```

 ---

 ## Quick start

-### Seismo Lab (main GUI)
+### ACH inbound server (production)

-The all-in-one tool. Three tabs: **Bridge**, **Analyzer**, **Console**.
+Listens for inbound unit call-homes, downloads all new events and monitor log
+entries, and writes everything to `bridges/captures/seismo_relay.db`.

+```bash
+python bridges/ach_server.py --port 12345 --output bridges/captures/
 ```
-python seismo_lab.py
+
+Point the unit's ACEmanager **Remote Host** to this machine's IP and **Remote Port** to `12345`.
+
+Options:
+```
+--port N            Listen port (default 12345)
+--output DIR        Capture directory (default bridges/captures/)
+--allow-ip IP       Allowlist an IP (repeat for multiple; default: accept all)
+--max-events N      Safety cap for first run (default: unlimited)
+--clear-after-download  Erase device memory after successful download
+--verbose           Debug logging
 ```

 ### SFM REST server

-Exposes MiniMate Plus commands as a REST API for integration with other systems.
+Exposes device control and DB queries as a REST API. Proxied by terra-view.

-```
-cd sfm
-uvicorn server:app --reload
+```bash
+python sfm/server.py               # default: 0.0.0.0:8200
+python -m uvicorn sfm.server:app --host 0.0.0.0 --port 8200 --reload
 ```

-**Endpoints:**
+Open `http://localhost:8200` for the embedded web UI, or `http://localhost:8200/docs`
+for the interactive API docs.
+
+### Seismo Lab GUI
+
+```bash
+python seismo_lab.py
+```
+
+---
+
+## SFM REST API
+
+### Live device endpoints
+
+Each call dials the device, does its work, and closes the connection. TCP
+connections are retried once on `ProtocolError` to handle cold-boot timing.
+
+**In-memory caching** — frequently-polled endpoints avoid redundant TCP round-trips
+via a thread-safe `_LiveCache` (plain Python dict + `threading.Lock`):
+
+| Method | URL | Cache strategy / behavior |
+|--------|-----|---|
+| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
+| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
+| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
+| `GET` | `/device/monitor/status` | 30-second TTL; invalidated by monitor start/stop |
+| `GET` | `/device/call_home` | Fresh read from device (not cached) |
+| `POST` | `/device/connect` | — |
+| `POST` | `/device/config` | Writes compliance config; invalidates info + events cache |
+| `POST` | `/device/config/project` | Patches project/client/operator/sensor_location strings |
+| `POST` | `/device/monitor/start` | Sends SUB 0x96; immediately evicts status cache |
+| `POST` | `/device/monitor/stop` | Sends SUB 0x97; immediately evicts status cache |
+| `POST` | `/device/call_home` | Reads, patches specified fields, writes back to device |
+
+**Cache bypass** — All cached endpoints accept `?force=true` to skip the cache and
+force a fresh read from the device.
+
+**Cache stats** — `GET /cache/stats` returns hit/miss counts and TTL info; `DELETE /cache/device`
+clears the device cache immediately.
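
The caching behaviour described above (a plain dict guarded by a `threading.Lock`, optional TTL, forced bypass, hit/miss counters) could be sketched roughly as follows. This is an illustration under assumptions, not the actual `_LiveCache`: the class name `LiveCacheSketch`, the `get_or_load` and `invalidate` methods, and the cache key string are all hypothetical.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class LiveCacheSketch:
    """Dict + lock cache with optional per-entry TTL and hit/miss counters."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._clock = clock
        # key -> (value, expiry time, or None for "never expires")
        self._entries: Dict[str, Tuple[Any, Optional[float]]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(
        self,
        key: str,
        loader: Callable[[], Any],
        ttl: Optional[float] = None,
        force: bool = False,
    ) -> Any:
        """Return the cached value, or call loader() and cache its result.

        force=True plays the role of ?force=true: skip the cache and re-read.
        """
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and not force:
                value, expires = entry
                if expires is None or self._clock() < expires:
                    self.hits += 1
                    return value
            self.misses += 1
        # Load outside the lock so a slow device read does not block other keys.
        value = loader()
        expires = None if ttl is None else self._clock() + ttl
        with self._lock:
            self._entries[key] = (value, expires)
        return value

    def invalidate(self, key: str) -> None:
        """Drop one entry, e.g. after a monitor start/stop."""
        with self._lock:
            self._entries.pop(key, None)


cache = LiveCacheSketch()
calls = []


def read_status() -> dict:
    calls.append(1)  # stands in for a TCP round-trip to the device
    return {"is_monitoring": True}


cache.get_or_load("monitor/status", read_status, ttl=30.0)  # miss: reads device
cache.get_or_load("monitor/status", read_status, ttl=30.0)  # hit: served from cache
cache.invalidate("monitor/status")
cache.get_or_load("monitor/status", read_status, ttl=30.0)  # miss again
```

Passing `ttl=None` models the "indefinite" and "permanent" rows in the table, while a 30-second TTL models the monitor status row.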
+
+Transport query params (supply one set):
+```
+Serial:  ?port=COM5&baud=38400
+TCP:     ?host=1.2.3.4&tcp_port=12345
+```
+
+### DB read endpoints
+
+Query the SQLite database written by `ach_server.py`. All read-only except
+`PATCH /db/events/{id}/false_trigger`.

 | Method | URL | Description |
 |--------|-----|-------------|
-| `GET` | `/device/info?port=COM5` | Device info via serial |
-| `GET` | `/device/info?host=1.2.3.4&tcp_port=9034` | Device info via cellular modem |
-| `GET` | `/device/events?port=COM5` | Event index |
-| `GET` | `/device/event?port=COM5&index=0` | Single event record |
+| `GET` | `/db/units` | All known serials with summary stats |
+| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger).  Response rows include `device_family` ("series3" / "series4") so clients dispatch on unit type without sniffing filenames. |
+| `GET` | `/db/monitor_log` | Monitoring intervals |
+| `GET` | `/db/sessions` | ACH call-home session history |
+| `PATCH` | `/db/events/{id}/false_trigger?value=true` | Flag / unflag false triggers |

---
+### File ingest endpoints

-## Seismo Lab tabs
+Used by watcher daemons to push field-collected event files into the SFM DB
+and waveform store.  Both accept multipart uploads of binary event files
+optionally paired with their ASCII sidecar reports; both dedup by
+`(serial, timestamp)` and UPSERT device-authoritative fields on re-import.

-### Bridge tab
-
-Captures live RS-232 traffic between Blastware and the seismograph. Sits in
-the middle as a transparent pass-through while logging everything to disk.
-
-```
-Blastware → COM4 (virtual) ↔ s3_bridge ↔ COM5 (physical) → MiniMate Plus
-```
-
-Set your COM ports and log directory, then hit **Start Bridge**. Use
-**Add Mark** to annotate the capture at specific moments (e.g. "changed
-trigger level"). When the bridge starts, the Analyzer tab automatically wires
-up to the live files and starts updating in real time.
-
-### Analyzer tab
-
-Parses raw captures into DLE-framed protocol sessions, diffs consecutive
-sessions to show exactly which bytes changed, and lets you query across all
-historical captures via the built-in SQLite database.
-
- **Inventory** — all frames in a session, click to drill in
- **Hex Dump** — full payload hex dump with changed-byte annotations
- **Diff** — byte-level before/after diff between sessions
- **Full Report** — plain text session report
- **Query DB** — search across all captures by SUB, direction, or byte value
-
-Use **Export for Claude** to generate a self-contained `.md` report for
-AI-assisted field mapping.
-
-### Console tab
-
-Direct connection to a MiniMate Plus — no bridge, no Blastware. Useful for
-diagnosing field units over cellular without a full capture session.
-
-**Connection:** choose Serial (COM port + baud) or TCP (IP + port for
-cellular modem).
-
-**Commands:**
-| Button | What it does |
-|--------|-------------|
-| POLL | Startup handshake — confirms unit is alive and identifies model |
-| Serial # | Reads unit serial number |
-| Full Config | Reads full 166-byte config block (firmware version, channel scales, etc.) |
-| Event Index | Reads stored event list |
-
-Output is colour-coded: TX in blue, raw RX bytes in teal, decoded fields in
-green, errors in red. **Save Log** writes a timestamped `.log` file to
-`bridges/captures/`. **Send to Analyzer** injects the captured bytes into the
-Analyzer tab for deeper inspection.
-
---
-
-## Connecting over cellular (RV50 / RV55 modems)
-
-Field units connect via Sierra Wireless RV50 or RV55 cellular modems. Use
-TCP mode in the Console or SFM:
-
-```
-# Console tab
-Transport: TCP
-Host: <modem public IP>
-Port: 9034          ← Device Port in ACEmanager (call-up mode)
-```
-
-```python
-# In code
-from minimateplus.transport import TcpTransport
-from minimateplus.client import MiniMateClient
-
-client = MiniMateClient(transport=TcpTransport("1.2.3.4", 9034), timeout=30.0)
-info = client.connect()
-```
-
-### Required ACEmanager settings (Serial tab)
-
-These must match exactly — a single wrong setting causes the unit to beep
-on connect but never respond:
-
-| Setting | Value | Why |
-|---------|-------|-----|
-| Configure Serial Port | `38400,8N1` | Must match MiniMate baud rate |
-| Flow Control | `None` | Hardware flow control blocks unit TX if pins unconnected |
-| **Quiet Mode** | **Enable** | **Critical.** Disabled → modem injects `RING`/`CONNECT` onto serial line, corrupting the S3 handshake |
-| Data Forwarding Timeout | `1` (= 0.1 s) | Lower latency; `5` works but is sluggish |
-| TCP Connect Response Delay | `0` | Non-zero silently drops the first POLL frame |
-| TCP Idle Timeout | `2` (minutes) | Prevents premature disconnect |
-| DB9 Serial Echo | `Disable` | Echo corrupts the data stream |
+| Method | URL | Description |
+|--------|-----|-------------|
+| `POST` | `/db/import/blastware_file` | Series III: `.AB0*` / `.N00` binaries + paired `_ASCII.TXT`.  Source: `series3-watcher`. |
+| `POST` | `/db/import/idf_file` | Series IV: `.IDFH` / `.IDFW` binaries + paired `.IDFW.txt` / `.IDFH.txt`.  Source: `thor-watcher`. |

 ---

@@ -175,25 +216,150 @@ on connect but never respond:

 ```python
 from minimateplus import MiniMateClient
-from minimateplus.transport import SerialTransport, TcpTransport
+from minimateplus.transport import TcpTransport

 # Serial
 client = MiniMateClient(port="COM5")

 # TCP (cellular modem)
-client = MiniMateClient(transport=TcpTransport("1.2.3.4", 9034), timeout=30.0)
+client = MiniMateClient(transport=TcpTransport("1.2.3.4", 12345), timeout=30.0)

 with client:
-    info    = client.connect()        # DeviceInfo — model, serial, firmware, compliance config
-    serial  = client.get_serial()     # Serial number string
-    config  = client.get_config()     # Full config block (bytes)
-    events  = client.get_events()     # List[EventRecord] with true event-time metadata
+    # Read
+    info     = client.connect()               # DeviceInfo — serial, firmware, compliance config
+    count    = client.count_events()          # Number of stored events
+    keys     = client.list_event_keys()       # Fast browse walk — event keys only, no download
+    events   = client.get_events()            # Full download: headers + peaks + metadata
+    monitor  = client.get_monitor_status()    # Battery, memory, is_monitoring flag
+    log      = client.get_monitor_log_entries()  # Monitoring intervals (partial 0x2C records)
+    ach_cfg  = client.get_call_home_config()  # Auto Call Home settings (SUB 0x2C)
+
+    # Write
+    client.apply_config(
+        sample_rate=1024,
+        recording_mode="Continuous",         # Single Shot / Continuous / Histogram / Histogram+Continuous
+        histogram_interval_sec=15,            # 2, 5, 15, 60, 300, 900
+        trigger_level_geo=0.5,
+        geo_range="Normal",                   # Normal (10.000 in/s) / Sensitive (1.25 in/s)
+        project="Bridge Inspection 2026",
+        client_name="City of Portland",
+        operator="B. Harrison",
+    )
+    
+    client.set_call_home_config(
+        auto_call_home_enabled=True,
+        after_event_recorded=True,
+        at_specified_times=True,
+        time1_hour=18, time1_min=30,          # 6:30 PM
+        time2_hour=6, time2_min=0,            # 6:00 AM
+    )
+
+    # Control
+    client.start_monitoring()                # SUB 0x96
+    client.stop_monitoring()                 # SUB 0x97
+    client.delete_all_events()              # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
 ```

-`get_events()` runs the full download sequence per event: `1E → 0A → 0C → 5A → 1F`.
-The SUB 5A bulk waveform stream is used to retrieve `client`, `operator`, and
-`sensor_location` as they existed at record time — not backfilled from the current
-compliance config.
+`get_events()` runs the full per-event sequence:
+`1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`.
+The SUB 5A bulk stream walks chunks bounded by the `end_offset` extracted from
+the STRT record at byte 17 of the probe response — no over-reading, no
+chunk-count cap. Project / client / operator / sensor location strings come
+from the dedicated metadata pages at counter `0x1002` and `0x1004`,
+read once per session (they reflect the compliance setup at session start,
+not per individual event).
+
+---
+
+## micromate library
+
+Series IV / Thor support, sibling to `minimateplus`.  Currently scoped to
+offline-file ingest: Thor `.txt` sidecars plus, since v0.21.0, the `.IDFW` /
+`.IDFH` binaries.  The live-device protocol is not implemented yet.
+
+```python
+from micromate import IdfEvent, parse_idf_report
+
+# Parse a .IDFW.txt / .IDFH.txt sidecar (1014 example files round-trip cleanly)
+text = open("UM11719_20231219162723.IDFW.txt").read()
+report_dict = parse_idf_report(text)        # permissive dict
+
+# Wrap into a typed event using the device-native binary filename
+event = IdfEvent.from_report(report_dict, "UM11719_20231219162723.IDFW")
+
+event.serial                     # "UM11719"
+event.kind                       # "Waveform" or "Histogram"
+event.peaks.transverse_ips       # 0.0251  (in/s, native unit)
+event.peaks.mic_pspl_dbl         # 99.4    (dB(L), Thor's native mic unit — NOT psi)
+event.project_info.project       # "UPMC Presby-Loc 3-Level1-1R Elevator Rm"
+event.sensor_check.tran          # True (passed self-check)
+event.firmware_version           # "Micromate ISEE 11.0AK"
+event.calibration_text           # "November 22, 2023 by Instantel"
+
+# Bridge to the existing minimateplus.Event shape for the DB / sidecar paths
+# (waveform_key is a 16-byte sha256 prefix when ingesting from a binary file)
+bridged_event = event.to_minimateplus_event(waveform_key=b"\x00" * 16)
+```
+
+The binary codec for the `.IDFW` / `.IDFH` event files themselves landed in
+v0.21.0 as `read_idf_file()` in `micromate/idf_file.py`.  See
+[`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for the
+format reference, the two observed file signatures, and the
+reverse-engineering plan.
+
+---
+
+## Database
+
+`ach_server.py` and the file-ingest endpoints write to
+`bridges/captures/seismo_relay.db` (SQLite, WAL mode) via the `SeismoDb`
+persistence layer.  Three tables, all unit-keyed by serial number:
+
+| Table | Key | Contents |
+|-------|-----|----------|
+| `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds |
+| `events` | UUID, UNIQUE(serial, timestamp) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag, **`device_family`** ("series3" / "series4"), `blastware_filename` (binary at-rest in `waveforms/`), sidecar references |
+| `monitor_log` | UUID, UNIQUE(serial, start_time) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips |
+
+**Deduplication is by `(serial, timestamp)`** — the device clock is the
+stable natural key.  Repeat call-homes or re-runs UPSERT the row in place,
+refreshing every device-authoritative field (peaks, project strings,
+sample_rate, file references) so the latest writer wins.  `false_trigger`
+and `device_family` are preserved across UPSERTs.  Earlier versions used
+`(serial, waveform_key)` for dedup, but the device's event-key counter
+resets to `0x01110000` after every erase, so timestamps are the correct
+dedup field.  Migration handles the transition transparently on first
+startup.
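
As a rough illustration of the dedup rule above, the following sketch uses SQLite's `ON CONFLICT ... DO UPDATE` against an in-memory database. The column set is trimmed and partly hypothetical (`tran_ppv` in particular), and `upsert_event` is an illustrative helper rather than the real `SeismoDb` API. It assumes an SQLite build with UPSERT support (3.24 or newer).

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id            TEXT PRIMARY KEY,
        serial        TEXT NOT NULL,
        timestamp     TEXT NOT NULL,
        tran_ppv      REAL,
        project       TEXT,
        device_family TEXT NOT NULL,
        false_trigger INTEGER NOT NULL DEFAULT 0,
        UNIQUE (serial, timestamp)
    )
    """
)

# Only device-authoritative columns appear in DO UPDATE, so false_trigger
# and device_family keep whatever the existing row already holds.
UPSERT_SQL = """
    INSERT INTO events (id, serial, timestamp, tran_ppv, project, device_family)
    VALUES (?, ?, ?, ?, ?, ?)
    ON CONFLICT (serial, timestamp) DO UPDATE SET
        tran_ppv = excluded.tran_ppv,
        project  = excluded.project
"""


def upsert_event(serial, timestamp, tran_ppv, project, device_family):
    """Insert a new event, or refresh the row with the same (serial, timestamp)."""
    conn.execute(
        UPSERT_SQL,
        (str(uuid.uuid4()), serial, timestamp, tran_ppv, project, device_family),
    )
    conn.commit()


upsert_event("UM11719", "2023-12-19T16:27:23", 0.0251, "Old project", "series4")
# Operator review state set between imports.
conn.execute("UPDATE events SET false_trigger = 1 WHERE serial = 'UM11719'")
upsert_event("UM11719", "2023-12-19T16:27:23", 0.03, "New project", "series4")

row = conn.execute(
    "SELECT COUNT(*), MAX(tran_ppv), MAX(project), MAX(false_trigger) FROM events"
).fetchone()
```

After the second import there is still a single row: the peak and project string come from the latest writer, and the operator's `false_trigger` flag survives.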
+
+**`device_family` (added v0.19.0)** discriminates Series III from Series
+IV at the SQL level.  Set by every import path; the UI dispatches on it
+to render mic units correctly (Series III: psi → dBL conversion; Series
+IV: native dBL passthrough).  Existing rows are backfilled at first
+startup of v0.19.0+ by sniffing the binary filename extension.
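
The extension-based backfill could look something like this minimal sketch. The function name `family_from_filename` is hypothetical, and treating every non-IDF extension as Series III is an assumption made for illustration; the real migration may handle unknown extensions differently.

```python
from pathlib import PurePath

# Series IV (Thor) binaries, per the ingest endpoint description.
SERIES4_EXTENSIONS = {".idfh", ".idfw"}


def family_from_filename(filename: str) -> str:
    """Map a stored binary filename to a device_family value.

    Assumption: anything that is not an IDF binary is Series III.
    """
    ext = PurePath(filename).suffix.lower()
    return "series4" if ext in SERIES4_EXTENSIONS else "series3"


thor_family = family_from_filename("UM11719_20231219162723.IDFW")
bw_family = family_from_filename("EXAMPLE.AB0")  # hypothetical Series III filename
```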
+
+The on-disk waveform store lives at `bridges/captures/waveforms/<serial>/`
+and holds the original event binaries (BW `.AB0*` / `.N00` for Series III,
+`.IDFH` / `.IDFW` for Series IV) plus their `.sfm.json` review/metadata
+sidecars.  Series III events also produce `.a5.pkl` source-frame pickles
+and `.h5` clean-waveform exports; Series IV events get `.h5` exports
+written from the decoded IDF binaries.
+
+---
+
+## Connecting over cellular (RV50 / RV55)
+
+Field units connect via Sierra Wireless RV50 or RV55 cellular modems.
+
+### Required ACEmanager settings
+
+| Setting | Value | Why |
+|---------|-------|-----|
+| Configure Serial Port | `38400,8N1` | Must match MiniMate baud rate |
+| Flow Control | `None` | Hardware FC blocks TX if pins unconnected |
+| **Quiet Mode** | **Enable** | **Critical** — disabled injects `RING`/`CONNECT` onto serial, corrupting the S3 handshake |
+| Data Forwarding Timeout | `1` (= 0.1 s) | Lower latency |
+| TCP Connect Response Delay | `0` | Non-zero silently drops the first POLL frame |
+| TCP Idle Timeout | `2` (minutes) | Prevents premature disconnect |
+| DB9 Serial Echo | `Disable` | Echo corrupts the data stream |

 ---

@@ -204,56 +370,222 @@ compliance config.
 | DLE | `0x10` | Data Link Escape |
 | STX | `0x02` | Start of frame |
 | ETX | `0x03` | End of frame |
-| ACK | `0x41` (`'A'`) | Frame-start marker sent before every frame |
+| ACK | `0x41` | Frame-start marker sent before every BW frame |
 | DLE stuffing | `10 10` on wire | Literal `0x10` in payload |

-**S3-side frame** (seismograph → Blastware): `ACK DLE+STX [payload] CHK DLE+ETX`
-
-**De-stuffed payload header:**
-```
-[0] CMD        0x10 = BW request, 0x00 = S3 response
-[1] ?          unknown (0x00 BW / 0x10 S3)
-[2] SUB        Command/response identifier  ← the key field
-[3] PAGE_HI    Page address high byte
-[4] PAGE_LO    Page address low byte
-[5+] DATA      Payload content
-```
-
-**Response SUB rule:** `response_SUB = 0xFF - request_SUB`
-Example: request SUB `0x08` (Event Index) → response SUB `0xF7`
+**Response SUB rule:** `response_SUB = 0xFF - request_SUB` (no exceptions)
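
The SUB rule and DLE stuffing from the table above can be sketched in a few lines. The helper names are hypothetical, and the stuffing helpers assume they are handed the payload only, with the `DLE+STX` / `DLE+ETX` frame delimiters already stripped.

```python
DLE = b"\x10"


def response_sub(request_sub: int) -> int:
    """Response SUB is the one's-complement of the request SUB byte."""
    if not 0 <= request_sub <= 0xFF:
        raise ValueError("SUB must fit in one byte")
    return 0xFF - request_sub


def dle_stuff(payload: bytes) -> bytes:
    """Double every literal 0x10 so it is not mistaken for a control DLE."""
    return payload.replace(DLE, DLE + DLE)


def dle_unstuff(wire_payload: bytes) -> bytes:
    """Collapse each doubled 0x10 back into a single literal byte."""
    return wire_payload.replace(DLE + DLE, DLE)


raw = b"\x01\x10\x02"
stuffed = dle_stuff(raw)
restored = dle_unstuff(stuffed)
event_index_reply = response_sub(0x08)
```

For example, a request with SUB `0x08` is answered with SUB `0xF7`, and a request with SUB `0x5A` is answered with SUB `0xA5`.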

 Full protocol documentation: [`docs/instantel_protocol_reference.md`](docs/instantel_protocol_reference.md)

 ---

+## Compliance Config Features
+
+The REST API and web UI expose full control over device compliance settings:
+
+- **Recording Mode** (Single Shot / Continuous / Histogram / Histogram+Continuous)
+- **Sample Rate** (1024 / 2048 / 4096 sps)
+- **Record Time** (float, seconds)
+- **Histogram Interval** (2s, 5s, 15s, 1m, 5m, 15m) — when recording mode includes histogram
+- **Geo Trigger Levels** (float, in/s per channel)
+- **Geo Maximum Range** (Normal 10.000 in/s / Sensitive 1.250 in/s per channel)
+- **Project / Client / Operator / Sensor Location** (ASCII strings)
+
+Auto Call Home config:
+- **Auto Call Home Enable** (bool)
+- **Dial String** (read-only; 40-byte ASCII)
+- **Trigger on Event** (bool)
+- **Scheduled Call-Ins** (two time slots with HH:MM each)
+- **Retry Settings** (count, delay, connection timeout, warm-up time)
+
+---
+
 ## Requirements

-```
+```bash
 pip install pyserial fastapi uvicorn
 ```

 Python 3.10+. Tkinter is included with the standard Python installer on
-Windows (make sure "tcl/tk and IDLE" is checked during install).
+Windows (check "tcl/tk and IDLE" during install).

 ---

 ## Virtual COM ports (bridge capture)

-The bridge needs two COM ports on the same PC — one that Blastware connects
-to, and one wired to the seismograph. Use a virtual COM port pair
-(**com0com** or **VSPD**) to give Blastware a port to talk to.
-
 ```
 Blastware → COM4 (virtual) ↔ s3_bridge.py ↔ COM5 (physical) → MiniMate Plus
 ```

+Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
+
 ---

-## Roadmap
+## Key Features

- [x] Event download — pull waveform records from the unit (`1E → 0A → 0C → 5A → 1F`)
- [x] True event-time metadata — project / client / operator / sensor location from SUB 5A
- [ ] Write commands — push config changes to the unit (compliance setup, channel config, trigger settings)
- [ ] ACH inbound server — accept call-home connections from field units
- [ ] Modem manager — push standard configs to RV50/RV55 fleet via Sierra Wireless API
- [ ] Full Blastware parity — complete read/write/download cycle without Blastware
+**Series III (MiniMate Plus) device support:**
+- [x] Full read/write/erase pipelines over RS-232 or TCP/cellular
+- [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings)
+- [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries)
+- [x] Monitor control (start/stop, status polling, battery/memory)
+- [x] Monitor log entries (continuous monitoring intervals without full waveform download)
+- [x] Blastware file ingest at `/db/import/blastware_file` (paired with `series3-watcher`)
+
+**Series IV (Micromate / Thor) device support:**
+- [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
+- [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
+- [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
+- [x] Binary `.IDFW` / `.IDFH` codec — ✅ v0.21.0.  IDFW reuses `decode_waveform_v2()` on the body at offset `0x0f1f` (87–99% sample fidelity on quiet events); IDFH has a dedicated segment-based decoder (all 859 corpus files decode, 181,071 intervals total).  See `micromate/idf_file.py` + `docs/idf_protocol_reference.md`.
+- [ ] Live-device protocol (pending Thor wire-protocol capture; the file codec landed in v0.21.0)
+
+**Data persistence:**
+- [x] SQLite database (`seismo_relay.db`) with `events`, `monitor_log`, `ach_sessions` tables
+- [x] Per-row `device_family` column ("series3" / "series4") for clean UI / unit-of-measurement dispatch (v0.19.0+)
+- [x] Deduplication by `(serial, timestamp)` — natural key handles post-erase counter resets
+- [x] UPSERT on re-import refreshes every device-authoritative field (peaks, project, sample_rate); preserves operator review state (`false_trigger`)
+- [x] Post-erase key-reuse detection (tracks high-water mark in `ach_state.json`)
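
The UPSERT behavior listed above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration under stated assumptions, not the project's actual schema or query: the trimmed `events` columns, the `upsert_event` helper, and the sample serial are hypothetical, and `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3

# Hypothetical, trimmed-down schema; the real `events` table has more columns.
SCHEMA = """
CREATE TABLE events (
    serial        TEXT NOT NULL,
    timestamp     TEXT,
    ppv           REAL,
    false_trigger INTEGER NOT NULL DEFAULT 0,
    UNIQUE (serial, timestamp)
)
"""

# On a natural-key conflict, refresh the device-authoritative field (ppv)
# and leave the operator-owned column (false_trigger) untouched.
UPSERT = """
INSERT INTO events (serial, timestamp, ppv)
VALUES (:serial, :timestamp, :ppv)
ON CONFLICT (serial, timestamp) DO UPDATE SET ppv = excluded.ppv
"""


def upsert_event(conn, serial, timestamp, ppv):
    """Insert a new event or refresh an existing one in place."""
    conn.execute(UPSERT, {"serial": serial, "timestamp": timestamp, "ppv": ppv})


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
upsert_event(conn, "UM0000", "2026-05-28T10:00:00", 0.12)   # first import
conn.execute("UPDATE events SET false_trigger = 1")         # operator review
upsert_event(conn, "UM0000", "2026-05-28T10:00:00", 0.15)   # re-import
rows = conn.execute("SELECT serial, ppv, false_trigger FROM events").fetchall()
print(rows)
```

After the re-import there is still a single row: the peak value is refreshed and the review flag survives.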
+
+**REST API:**
+- [x] Live device endpoints with in-memory caching (`_LiveCache`)
+- [x] Cache statistics (`/cache/stats`) and manual invalidation (`/cache/device`)
+- [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH)
+- [x] Call Home config read/write endpoints
+- [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`)
+- [x] Import endpoints for both device families (`/db/import/blastware_file`, `/db/import/idf_file`)
+
+**File output (v0.7+, byte-perfect as of v0.14.3):**
+- [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata)
+- [x] Multi-channel waveform decode from SUB 5A bulk stream
+- [x] Second-resolution timestamp encoding in Blastware filename
+- [x] **Byte-perfect against BW reference captures** (verified across 2-sec / 3-sec / 10-sec event durations, both event 0 and event N continuation events)
+- [x] STRT-bounded chunk walk + correct event-N probe counter + partial DLE stuffing of `0x10` in 5A params (the four fixes that landed in v0.14.0–v0.14.3)
+
+**Capture tools:**
+- [x] Serial-to-TCP bridge with raw BW/S3 capture (s3_bridge.py, defaults to auto-capture)
+- [x] GUI bridge with raw capture checkboxes (gui_bridge.py)
+- [x] ACH inbound server with bidirectional capture (ach_server.py saves raw_tx + raw_rx)
+- [x] Transparent TCP MITM proxy for live BW session capture (ach_mitm.py)
+
+**Analysis tools:**
+- [x] s3_analyzer.py — session parser, frame differ, Claude export
+- [x] gui_analyzer.py — standalone analyzer GUI
+- [x] frame_db.py — SQLite frame database for capture analysis
+
+**seismo_lab.py GUI:**
+- [x] Bridge tab — Serial/TCP mode selector with raw capture options
+- [x] Analyzer tab — BW/S3 capture playback and differencing
+- [x] Download tab — Live wire-byte capture during event download
+- [x] Console tab — Logging and diagnostics
+
+## Roadmap (Future)
+
+### Strategic direction — where this is going
+
+seismo-relay is being built as a **suite of cooperating components**
+that together replace and improve on Blastware's role.  Three logical
+tiers:
+
+1. **SFM** (device-side) — owns the active connection to a physical
+   unit.  Today: `minimateplus/`, `/device/*` HTTP endpoints,
+   `seismo_lab.py`.  Future: live Thor / Micromate support.
+2. **SDM** (data-side) — owns the database, waveform store, ingest
+   pipelines, and the read-API that Terra-View consumes.  Today this
+   code lives under `sfm/` for historical reasons; the role has
+   migrated and the eventual rename is on the long-tail cleanup list.
+3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
+   `bw_ascii_report.py`, `micromate/idf_*.py`.  Used by both SFM and
+   SDM, depends on neither.
+
+Terra-View is downstream of SDM for fleet listings, event detail, etc.
+The long-term vision adds a **second link** from Terra-View → SFM for
+direct device interaction (see below).
+
+The codec work in this repo isn't trying to replace BW's network
+layer — BW's ACH file forwarding and Thor's IDF call-home are
+battle-tested.  The value is in the receiving and processing side: turn
+the stream of binary+ASCII pairs into something users can search,
+filter, alert on, and report from.
+
+### Terra-View ↔ SFM device control (the long-term vision)
+
+Today Terra-View only reads from SDM (event listings, dashboards,
+project reports).  When a unit goes missing — operator notices in the
+Terra-View dashboard — there's no way to *do* anything from the UI.
+The path of least resistance is to RDP into a Windows box and open
+Blastware, which defeats the purpose of having Terra-View.
+
+Target experience:
+- Operator notices a unit in Terra-View dashboard hasn't called in.
+- Clicks unit detail → "Connect to Device" button.
+- Terra-View opens an embedded view (modal or side-panel) that talks
+  to SFM's `/device/*` endpoints over the network.
+- Live view: device clock, battery, memory, current monitor status.
+- Actions: start/stop monitoring, push compliance config changes, pull
+  fresh events, run a sensor self-check, change call-home settings.
+- Audit log: every connect / action recorded in SDM for the unit
+  history.
+
+Implementation steps (concrete):
+- [ ] **SFM authentication & authorization layer.**  Today `/device/*`
+      endpoints are unauthenticated — anyone on the network can call
+      them.  Needs at minimum token-based auth, ideally with a "who
+      can connect to which units" mapping.  Hard prerequisite for
+      letting Terra-View users into the control surface.
+- [ ] **Terra-View "Connect to Device" entry point** on the unit
+      detail page.  Renders only when unit has connection info on file
+      and the user has permission.
+- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
+      `seismo_lab.py`'s Bridge tab, but in the browser.  Polls SFM's
+      `/device/monitor/status` on an interval; sends start/stop via
+      `/device/monitor/{start,stop}`.
+- [ ] **Action history** — every connect / push / action call records
+      a row in `unit_history`, viewable on the unit detail page.
+- [ ] **Series IV live-device support in SFM** — currently `/device/*`
+      only supports MiniMate Plus.  Blocks "Connect to Device" for
+      Thor units until done.  Depends on Thor wire-protocol capture
+      and a `micromate/` parallel of the `minimateplus/` modules.
+
+### High-impact (unblocks product features)
+
+- [ ] **Series III waveform body codec reverse-engineering.**  The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`).  Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open.  Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise.  Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
+- [x] **Series IV (Thor IDF) binary codec reverse-engineering.** ✅ v0.21.0 — `micromate/idf_file.read_idf_file()` decodes both IDFW (waveform body at offset `0x0f1f`, reusing `decode_waveform_v2()`; 87–99% sample fidelity on quiet events) and IDFH (dedicated segment-based decoder: all 859 corpus files decode, 181,071 intervals, peaks within ~1.8% of sidecar values).  `WaveformStore.save_imported_idf` now also projects parsed Thor data into a `bw_report` block via `micromate/idf_to_bw_report.py` so Thor events render in the existing Event Report PDF pipeline without a separate renderer.
+- [ ] **In-app waveform viewer accuracy.**  Depends on Series III codec decode.  Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples.  Series IV waveforms come online when the IDF codec lands.
+- [ ] **Series IV live-device support.**  With the IDF binary codec in place (v0.21.0), extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout.  Depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
+- [ ] **Terra-View integration** — seismo-relay router, unit detail page, VISON-style event listing.
+- [ ] **Vibration summary reports** — highest legit PPV per project → Word doc (false-trigger filtering first).
+
+### BW ASCII report parser enhancements (built in v0.16.0)
+
+- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value.  Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag.  All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
+- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now.  Land in the sidecar's `bw_report.histogram` block.
+- [ ] **Histogram interval bin-table parsing.**  Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed.  Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
+- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag.  PDF + both modals render `>100 Hz` instead of `—`.
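
The `OORANGE` and `>100 Hz` handling described above follows one pattern: keep a usable number and record a flag saying the true value lies beyond it. A minimal sketch, assuming a hypothetical `parse_field` helper and `ParsedValue` container (neither is the parser's real API, and the exact TXT field layout is an assumption):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedValue:
    value: Optional[float]
    saturated: bool = False     # "OORANGE": true value is at least `value`
    above_range: bool = False   # ">N": true value is above `value`


def parse_field(text: str, full_scale: float) -> ParsedValue:
    """Parse one numeric report field, tolerating out-of-range markers."""
    cleaned = text.strip()
    if cleaned.upper().startswith("OORANGE"):
        # Saturated channel: substitute the full-scale range as a lower bound.
        return ParsedValue(full_scale, saturated=True)
    above = cleaned.startswith(">")
    parts = cleaned.lstrip(">").split()
    try:
        number = float(parts[0])
    except (IndexError, ValueError):
        return ParsedValue(None)  # empty or non-numeric field
    return ParsedValue(number, above_range=above)


print(parse_field("OORANGE", full_scale=10.0))
print(parse_field(">100 Hz", full_scale=10.0))
print(parse_field("0.125 in/s", full_scale=10.0))
```

Keeping the flag next to the number lets a renderer print `>100 Hz` or mark a saturated peak instead of showing a blank.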
+
+### Ingestion gaps
+
+- [ ] **MLG forwarding.**  `series3-watcher` forwards event binaries + their `_ASCII.TXT` reports, but skips `.MLG` per-unit monitor log files entirely.  Adding a `POST /db/import/mlg_file` endpoint + watcher scan path would populate `monitor_log` for non-ACH-routed units (coverage queries, "was this unit monitoring on date X" lookups).
+- [ ] **0C-record raw bytes persistence in the sidecar.**  Currently on branch `claude/codec-re-cBGNe` as commit `a187124`; cherry-pick if useful as a standalone fix.  Preserves the 210-byte 0C record under `extensions.raw_records.waveform_record_b64` so future field-offset analysis (Peak Acceleration / Time of Peak / etc. — the fields BW computes client-side from samples) can run offline.
+
+### Operational
+
+- [ ] **`series3-watcher` file archive manager** — 90-day-old events moved to `<watch_folder>_archive/<year>/<month>/` subfolders.  Plan drafted in `claude/codec-re-cBGNe`'s plan-mode session; awaiting a 5-minute test on whether Blastware UI walks subfolders before any code lands (determines layout: in-place subfolders vs sibling archive).
+- [ ] **Compliance config encoder** — build raw write payloads from a `ComplianceConfig` object.
+- [ ] **Modem manager** — push RV50/RV55 configs via Sierra Wireless API.
+- [ ] **Call Home dial_string write support** (requires DLE escaping for embedded control characters).
+- [ ] **Histogram mode recording support** (5A stream analysis for mode 0x03 — separate from histogram ASCII parsing above).
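
For the `series3-watcher` archive item above, the sibling-archive layout could be computed as in the sketch below. This is only an illustration of one of the two candidate layouts: the `archive_destination` and `is_due` helpers are hypothetical names, zero-padded month folders are an assumption, and the open question about whether Blastware walks subfolders still decides the real design.

```python
from datetime import datetime, timedelta
from pathlib import Path


def archive_destination(watch_folder: Path, file_name: str, event_time: datetime) -> Path:
    """Return <watch_folder>_archive/<year>/<month>/<file_name>."""
    archive_root = watch_folder.with_name(watch_folder.name + "_archive")
    return archive_root / f"{event_time:%Y}" / f"{event_time:%m}" / file_name


def is_due(event_time: datetime, now: datetime, max_age_days: int = 90) -> bool:
    """True once the event is at least `max_age_days` old."""
    return now - event_time >= timedelta(days=max_age_days)


dest = archive_destination(Path("/data/watch"), "EXAMPLE.AB0", datetime(2026, 1, 15))
print(dest)
print(is_due(datetime(2026, 1, 1), now=datetime(2026, 4, 1)))
```

The sketch only computes paths and ages; moving files is left out on purpose until the layout question is settled.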
+
+### Test coverage
+
+- [ ] Verify 30-sec event download — body may exceed `0xFFFF` and force the device into a different `end_key` encoding (none of the 2/3/10-sec test cases hit this boundary).
+- [ ] Histogram mode (0x03) write via SFM — confirmed working for Single Shot / Continuous / Histogram+Continuous; Histogram (0x03) needs a live test from a non-Histogram starting state.
+
+### Lower-priority cleanups
+
+- [ ] Compliance write anchor-9 cleanup — when changing recording_mode via SFM, a spurious `0x10` may persist after Histogram→other mode transitions.  Doesn't affect device operation but differs from BW's byte-perfect output.
+- [ ] Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring).
+- [ ] Call Home — map time slots 3/4 offsets; confirm `modem_power_relay_enabled`.
+- [ ] RV55 DCD/DTR — newer RV55 firmware doesn't assert DCD by default; units don't resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred).
+- [ ] **NULL-timestamp duplicate-row dedup.**  A small handful of events (2 known on prod as of 2026-05-22) have `events.timestamp IS NULL` because the codec couldn't extract a timestamp from the binary footer.  The `UNIQUE(serial, timestamp)` constraint doesn't fire on `NULL` (SQL semantics: `NULL ≠ NULL`), so every `--force` backfill INSERTs a new row instead of UPSERTing the existing one.  Cleanup: a one-shot SQL query that keeps only the newest row per `(serial, blastware_filename)` and deletes the rest.  Longer-term: extend the unique key to `(serial, COALESCE(timestamp, blastware_filename))` or reject inserts with NULL timestamp.
+- [ ] **Histogram body sub-format with `byte[5] != 0`.**  ~3 events on prod (`T190LD5Q.LD0H`, `O121L4L1.GU0H`) use a histogram body my walker doesn't recognize — the first block has `byte[5] = 0x01` or `0x07` instead of `0x00`, and the entire body lacks the `1e 0a 00 00` tail signature.  Codec returns 0 valid blocks; their DB PVS comes from the bw_report ASCII overlay (which BW computed from the same binary, so the DB columns are correct).  Only the `.h5` waveform plot is empty.  Cracking the sub-format would unlock the plot.  Needs binary+ASCII pairs from a few `byte[5]!=0` events; same RE approach as the K558 case.
+- [ ] **Histogram body sub-format with `byte[5] == 0x00` but undecodable.**  Observed 2026-05-28 on BE17353 (S353) events: `S353L4H2.FZ0H`, `S353L4H2.P00H`, `S353L4H3.7O0H`, `S353L4H3.E10H`.  Body starts `00 00 00 01 0a 00 XX 00 ...` which LOOKS like a valid histogram block header (marker 0x000a at byte[4:6] ✓, byte[5]=0x00 normal-format ✓), but the walker finds zero data blocks across the whole body.  Likely an extra header before the block stream OR a different tail signature than `1e 0a 00 00`.  Smaller body lengths (1900-2100 bytes) suggest these may be short-recording histogram variants.  Same operational impact as the byte[5]!=0 case: event ingests cleanly, DB peaks correct via bw_report overlay, only the chart is empty.  Worth dumping a hex view of one body to diagnose.
+- [ ] **Sensor-check waveform extraction from the BW binary.**  BW's Event Report PDFs include a narrow panel on the right side of the waveform plot showing each channel's response to the sensor self-check signal (a damped sinusoid for geo, sawtooth-at-test-freq for mic).  Our parser captures the test RESULTS (`test_freq_hz`, `test_ratio`, `test_amplitude_mv`, `test_results` pass/fail) and the PDF + modal display them as text — but BW's per-sample sensor-check waveform isn't accessible to us today.  Two paths to add it:  (a) RE the binary to find where the sensor-check samples are stored — could be a section before STRT, after the footer, or in a separate sub-record; protocol reference doesn't currently mention it.  (b) If samples aren't in the binary, synthesize a representative waveform from the test parameters (damped sinusoid at `test_freq_hz` with damping from `test_ratio`).  Path (a) is the honest answer; path (b) is decorative.  Until either lands, the text-only sensor-check display in the report is fine.
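
The NULL-timestamp item above rests on standard SQL semantics: a `UNIQUE` constraint treats each `NULL` as distinct, so repeated inserts with a `NULL` timestamp never conflict. The sketch below reproduces that in an in-memory SQLite database and then applies a one-shot cleanup. The `id` column, the use of the highest `id` as "newest", and the sample serial and filename are assumptions for illustration, not the production schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id                 INTEGER PRIMARY KEY,
        serial             TEXT NOT NULL,
        timestamp          TEXT,
        blastware_filename TEXT,
        UNIQUE (serial, timestamp)
    )
""")

# Three forced backfill passes over one event whose timestamp failed to decode.
# None of them violates UNIQUE(serial, timestamp) because NULL never equals NULL.
for _ in range(3):
    conn.execute(
        "INSERT INTO events (serial, timestamp, blastware_filename) "
        "VALUES (?, NULL, ?)",
        ("UM0000", "EXAMPLE.AB0"),
    )
dupes_before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Keep only the newest row (highest id, by assumption) per
# (serial, blastware_filename) among the NULL-timestamp rows.
conn.execute("""
    DELETE FROM events
    WHERE timestamp IS NULL
      AND id NOT IN (
          SELECT MAX(id) FROM events
          WHERE timestamp IS NULL
          GROUP BY serial, blastware_filename
      )
""")
remaining = conn.execute("SELECT id FROM events").fetchall()
print(dupes_before, remaining)
```

Rows with a real timestamp are never touched by the `DELETE`, since both the outer filter and the subquery are restricted to `timestamp IS NULL`.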
@@ -0,0 +1,66 @@
+# analysis/ — exploratory scripts for waveform-body RE
+
+**These are scratch.** Run them, read them, copy them, but don't trust
+them as documentation.  When a finding is verified it gets promoted
+to `minimateplus/waveform_codec.py` and `tests/test_waveform_codec.py`;
+when it's wrong it stays here as a fossil.
+
+Authoritative status lives in:
+
+- `docs/waveform_codec_re_status.md` (current truth, working note)
+- `minimateplus/waveform_codec.py` (verified implementation + docstring)
+- `tests/test_waveform_codec.py` (regression locks against fixtures)
+
+---
+
+## Still useful
+
+| File | What it does |
+|---|---|
+| `load_bundle.py` | Fixture loader.  Parses BW binary + ASCII TXT into a `Bundle` dataclass with samples, metadata, body bytes.  Used by most other scripts here. |
+| `verify_tran.py` | Verifies `decode_tran_initial` against fixture ground truth across all events.  Useful when you change the decoder and want a quick sanity check. |
+| `inspect_5_11.py` | Inspects the 5-11-26 high-amplitude bundle's body structure, prints metadata, peaks, and block counts. |
+| `walk_5_11.py` | Walks blocks for the 5-11-26 bundle and prints offset/tag/length/data. |
+| `seg1_blocks.py` | Dumps all blocks in segment 1 of each event.  The starting point for cracking multi-segment Tran continuation. |
+| `full_tran.py` | Multi-segment Tran decoder attempt (broken — diverges at sample ~512).  Useful as a starting scaffold for the next experiment. |
+| `multi_segment.py` | Earlier multi-segment attempt with different segment-header consumption strategies.  Records what didn't work. |
+| `test_rle.py` | Tests `00 NN` interpretation as zero-RLE with different divisor values.  Documents how the RLE rule was confirmed. |
+
+## Superseded — keep for archaeology
+
+| File | Superseded by |
+|---|---|
+| `walk_v2.py` … `walk_v5.py` | `walk_v6.py` and ultimately `minimateplus/waveform_codec.walk_body`.  Each version represents one round of refinement.  Don't read in isolation — read the diff between them to see what was learned. |
+| `walk_chunks.py` | `walk_v6.py` / production walker |
+| `decode_v1.py` | First naive decoder attempt.  Wrong but readable. |
+
+## Pure exploration — read if curious
+
+| File | What it explored |
+|---|---|
+| `inspect_body.py` | Byte-frequency stats per event.  Established that bytes 0x00 / 0x10 dominate. |
+| `find_blocks.py` | Searched for repeating 2-byte tag patterns. |
+| `find_signal_runs.py` | Searched for stretches of bytes that "look like a smooth signal" (small inter-byte deltas).  Found the `20 NN` literal blocks. |
+| `dump_head.py`, `dump_trailer.py`, `dump_around.py` | Hex dumpers at various body positions. |
+| `compare_cd.py` | Byte-diff between event-c and event-d (same length, similar signal).  Used to identify structural vs data bytes. |
+| `brute_force.py` | Tested 192 combinations (24 channel permutations × 2 nibble orders × 2 sign conventions × 2 init-from-header settings) on the quiet bundle.  All failed because the quiet bundle had T[0]=T[1]=0, making the preamble undetectable. |
+| `try_nibbles.py`, `try_layouts.py` | Earlier channel-interleaving hypotheses.  All wrong. |
+| `test_tran_continue.py` | Test of "Tran continues uninterrupted across `30 04` blocks" hypothesis.  Disproven. |
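
Several of the scripts listed above (`brute_force.py`, `decode_v1.py`, `full_tran.py`) share one building block: treat each nibble as a signed 4-bit delta and accumulate. A minimal, self-contained sketch of that idea follows. The `s4` helper mirrors the one in those scripts; `nibble_deltas` and `accumulate` are hypothetical names for illustration, and this shows a hypothesis under test, not the verified codec.

```python
def s4(n: int) -> int:
    """Interpret a 4-bit value (0..15) as two's-complement signed (-8..7)."""
    return n if n < 8 else n - 16


def nibble_deltas(data: bytes, high_first: bool = True) -> list[int]:
    """Split each byte into two nibbles and decode both as signed deltas."""
    out = []
    for byte in data:
        hi, lo = (byte >> 4) & 0xF, byte & 0xF
        pair = (hi, lo) if high_first else (lo, hi)
        out.extend(s4(n) for n in pair)
    return out


def accumulate(start: int, deltas: list[int]) -> list[int]:
    """Running sum of deltas, starting from a known initial sample."""
    samples, cur = [], start
    for d in deltas:
        cur += d
        samples.append(cur)
    return samples


deltas = nibble_deltas(bytes([0x1F, 0x70, 0x8E]))
print(deltas)                  # [1, -1, 7, 0, -8, -2]
print(accumulate(10, deltas))  # [11, 10, 17, 17, 9, 7]
```

Flipping `high_first` is the same nibble-order switch that `brute_force.py` sweeps over.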
+
+---
+
+## Adding new scripts
+
+If you're picking up the codec work, feel free to add new scripts here.
+Suggested conventions:
+
+- Start the filename with what you're testing: `test_<hypothesis>.py`,
+  `verify_<piece>.py`, `inspect_<region>.py`.
+- Print enough output that the reader can see exactly which events
+  match / diverge and where.
+- When a finding is solid, move the verified logic to
+  `minimateplus/waveform_codec.py` and add a regression test in
+  `tests/test_waveform_codec.py` — don't leave the truth only in
+  this directory.
+- If a script is fully superseded, leave it in place (don't delete) —
+  the fossil record is useful when re-evaluating hypotheses later.
@@ -0,0 +1,93 @@
+"""Brute-force test channel permutations / nibble orders on event-d (simplest signal)."""
+import sys
+import itertools
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+from minimateplus.waveform_codec import walk_body
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def decode(body, channel_perm, nibble_order, sign_mode, init_from_header):
+    """Try one decoder configuration on event-d. Returns up to the first 16 cumulative samples per channel."""
+    blocks = walk_body(body)
+    # Initial values from bytes [4:7] if init_from_header else 0
+    if init_from_header:
+        init = [body[4] if body[4] < 128 else body[4] - 256,
+                body[5] if body[5] < 128 else body[5] - 256,
+                body[6] if body[6] < 128 else body[6] - 256,
+                0]
+    else:
+        init = [0, 0, 0, 0]
+    cur = list(init)
+    out = [[init[0]], [init[1]], [init[2]], [init[3]]]  # sample 0 = init
+    nibble_idx = 0  # within delta stream; channel = channel_perm[nibble_idx % 4]
+
+    # Walk only the 10 NN data blocks
+    for blk in blocks:
+        if blk.tag_hi != 0x10:
+            continue
+        for byte in blk.data:
+            if nibble_order == 'high_first':
+                nib1, nib2 = (byte >> 4) & 0xF, byte & 0xF
+            else:
+                nib1, nib2 = byte & 0xF, (byte >> 4) & 0xF
+            for nib in (nib1, nib2):
+                if sign_mode == 'signed':
+                    delta = s4(nib)
+                else:
+                    delta = nib
+                ch = channel_perm[nibble_idx % 4]
+                cur[ch] += delta
+                if (nibble_idx + 1) % 4 == 0:
+                    out[0].append(cur[0])
+                    out[1].append(cur[1])
+                    out[2].append(cur[2])
+                    out[3].append(cur[3])
+                nibble_idx += 1
+                if len(out[0]) >= 16:
+                    return out
+    return out
+
+
+def best_match(pred, truth, n=10):
+    """Sum of squared differences in first n samples."""
+    n = min(n, len(pred), len(truth))
+    return sum((pred[i] - truth[i])**2 for i in range(n))
+
+
+def main():
+    b = load_bundle("event-d")
+    # truth in 16-count units
+    tr = {ch: [round(v * 200) for v in b.samples[ch]] for ch in ("Tran", "Vert", "Long")}
+
+    print("Truth event-d first 10 samples:")
+    for ch in ("Tran", "Vert", "Long"):
+        print(f"  {ch}: {tr[ch][:10]}")
+
+    # Test all 192 combinations (24 permutations x 2 nibble orders x 2 sign modes x 2 init modes)
+    best = []
+    for perm in itertools.permutations([0, 1, 2, 3]):
+        for nibble_order in ('high_first', 'low_first'):
+            for sign in ('signed', 'unsigned'):
+                for init_h in (False, True):
+                    decoded = decode(b.body, perm, nibble_order, sign, init_h)
+                    # Score as TVL channel-sum
+                    score = sum(
+                        best_match(decoded[i], tr[ch], n=10)
+                        for i, ch in enumerate(("Tran", "Vert", "Long"))
+                        if i < 3
+                    )
+                    label = f"perm={perm} nib={nibble_order[:1]} sign={sign[:3]} init={init_h}"
+                    best.append((score, label, decoded))
+
+    best.sort(key=lambda x: x[0])
+    print("\nTop 10 configurations:")
+    for s, lbl, dec in best[:10]:
+        print(f"  score={s:>5}  {lbl}  T={dec[0][:8]}  V={dec[1][:8]}  L={dec[2][:8]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,42 @@
+"""Compare event-c and event-d (same N_samples) to find header vs data bytes."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def main():
+    bc = load_bundle("event-c")
+    bd = load_bundle("event-d")
+
+    # Compare prefixes
+    nc, nd = len(bc.body), len(bd.body)
+    n = min(nc, nd)
+    diffs = []
+    for i in range(n):
+        if bc.body[i] != bd.body[i]:
+            diffs.append(i)
+    print(f"event-c body={nc}, event-d body={nd}")
+    print(f"Total diffs (first {n}): {len(diffs)}")
+
+    # Show common prefix
+    same_prefix = 0
+    for i in range(n):
+        if bc.body[i] == bd.body[i]:
+            same_prefix += 1
+        else:
+            break
+    print(f"Common prefix length: {same_prefix}")
+    print(f"event-c prefix: {bc.body[:same_prefix].hex(' ')}")
+
+    # Look for runs of common bytes
+    print(f"\nFirst 32 diff positions: {diffs[:32]}")
+
+    # Show the "diff fingerprint" of the first 100 bytes
+    print(f"\n  pos    c     d")
+    for i in range(min(100, n)):  # n bounds both bodies, so no IndexError
+        c_b, d_b = bc.body[i], bd.body[i]
+        marker = " " if c_b == d_b else "*"
+        print(f"  {i:>3}  {c_b:02x}{marker}  {d_b:02x}")
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,99 @@
+"""
+Decoder v1: nibble-pair signed deltas in 10 NN blocks, 4-channel round-robin.
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def walk_blocks(body, start):
+    i = start
+    blocks = []
+    while i + 1 < len(body):
+        t0, t1 = body[i], body[i + 1]
+        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 // 2 + 2
+            data = bytes(body[i + 2 : i + length])
+            blocks.append(("10", t1, data))
+            i += length
+        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 + 2
+            data = bytes(body[i + 2 : i + length])
+            blocks.append(("20", t1, data))
+            i += length
+        elif t0 == 0x00 and t1 % 4 == 0:
+            blocks.append(("00", t1, b""))
+            i += 2
+        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
+            length = t1 * 4
+            data = bytes(body[i + 2 : i + length])
+            blocks.append(("30", t1, data))
+            i += length
+        elif t0 == 0x40 and t1 == 0x02:
+            length = 20
+            data = bytes(body[i + 2 : i + length])
+            blocks.append(("40", t1, data))
+            i += length
+        else:
+            blocks.append(("??", t0, bytes(body[i:i+8])))
+            break
+    return blocks
+
+
+def decode_v1(body, start, n_samples):
+    """Decode by accumulating nibble-pair deltas from all 10 NN blocks."""
+    blocks = walk_blocks(body, start)
+    # 4 channels: T, V, L, M
+    cur = [0, 0, 0, 0]
+    out = [[], [], [], []]
+    sample_index = 0  # how many sample-sets emitted
+
+    for typ, NN, data in blocks:
+        if typ == "10":
+            # 2 nibbles per byte, round-robin TVLM
+            for byte in data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    ch = sample_index % 4
+                    cur[ch] += s4(nib)
+                    out[ch].append(cur[ch])
+                    # One nibble fills one channel slot; advance round-robin.
+                    sample_index += 1
+                    # We emit per-nibble, but the structure is unclear
+        elif typ == "20":
+            # int8 absolute or delta?
+            for byte in data:
+                v = byte if byte < 128 else byte - 256
+                ch = sample_index % 4
+                cur[ch] = v  # treat as absolute
+                out[ch].append(cur[ch])
+                sample_index += 1
+    return out
+
+
+def main():
+    b = load_bundle("event-c")
+    body = b.body
+    truth_T = [round(v * 200) for v in b.samples["Tran"]]
+    truth_V = [round(v * 200) for v in b.samples["Vert"]]
+    truth_L = [round(v * 200) for v in b.samples["Long"]]
+
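+    # Find start: first valid `10 NN` tag within the first 15 bytes
+    for s in range(15):
+        if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
+            start = s
+            break
+    else:
+        raise SystemExit("no 10 NN block tag found in the first 15 bytes")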
+
+    blocks = walk_blocks(body, start)
+    # Print block-by-block what's in each
+    print(f"Total blocks: {len(blocks)}")
+    bytes_processed = 0
+    for typ, NN, data in blocks[:30]:
+        print(f"  type={typ} NN=0x{NN:02x} data_len={len(data)} data_hex={data[:32].hex(' ')}{'...' if len(data) > 32 else ''}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,27 @@
+"""Dump body bytes around a specific offset."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def dump_around(name: str, center: int, radius: int = 96):
+    b = load_bundle(name)
+    body = b.body
+    start = max(0, center - radius)
+    end = min(len(body), center + radius)
+    print(f"\n=== {name} body[{start}:{end}] (full body={len(body)}) ===")
+    for i in range(start, end, 32):
+        row = body[i:i+32]
+        marker = "  <-- center" if i <= center < i+32 else ""
+        print(f"  +{i:>5}  {row.hex(' ')}{marker}")
+
+
+def main():
+    # Look at the trailer transitions
+    trailer_starts = {"event-a": 7047, "event-b": 6475, "event-c": 4043, "event-d": 3941}
+    for name, off in trailer_starts.items():
+        dump_around(name, off, 96)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,18 @@
+"""Dump the START of each body in 32-byte rows."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def main():
+    for name in ("event-a", "event-c"):
+        b = load_bundle(name)
+        body = b.body
+        print(f"\n=== {name} body[0:512] (full body={len(body)}, samples={len(b.samples['Tran'])}) ===")
+        for i in range(0, min(512, len(body)), 32):
+            row = body[i:i+32]
+            print(f"  +{i:>5}  {row.hex(' ')}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,24 @@
+"""Dump body bytes split into 32-byte rows starting from `start_offset`."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def dump(body: bytes, name: str, start: int, n_rows: int = 30):
+    print(f"\n=== {name} body[{start}:] (full body={len(body)}) ===")
+    end = min(start + 32 * n_rows, len(body))
+    for i in range(start, end, 32):
+        row = body[i:i+32]
+        print(f"  +{i:>5}  {row.hex(' ')}")
+
+
+def main():
+    for name in ("event-a", "event-b", "event-c", "event-d"):
+        b = load_bundle(name)
+        # Print up to the last 384 bytes (12 rows of 32) to see the tail structure
+        start = max(0, len(b.body) - 32 * 12)
+        dump(b.body, name, start, 12)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,41 @@
+"""Search for structural repetition in the body bytes."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def find_pattern_offsets(body: bytes, pattern: bytes, max_count=20):
+    out = []
+    i = 0
+    while True:
+        i = body.find(pattern, i)
+        if i < 0:
+            break
+        out.append(i)
+        i += 1
+        if len(out) >= max_count:
+            break
+    return out
+
+
+def main():
+    for name in ("event-a", "event-b", "event-c", "event-d"):
+        b = load_bundle(name)
+        body = b.body
+        print(f"\n=== {name} (body={len(body)}, N_samples={len(b.samples['Tran'])}) ===")
+
+        # Try to find repeating substructures (2-byte tag patterns, mostly 0x10-prefixed)
+        for prefix in [b"\x10\x10", b"\x10\x04", b"\x10\x08", b"\x10\x0c", b"\x10\x18",
+                       b"\x10\x14", b"\x10\x20", b"\x10\x40", b"\x10\x80", b"\x10\x00",
+                       b"\x10\x01", b"\x10\x03", b"\x10\xf0", b"\xf1\x10", b"\x00\x10",
+                       b"\x40\x02", b"\x20\x04", b"\x30\x04", b"\x30\x08", b"\x00\x1a"]:
+            offs = find_pattern_offsets(body, prefix, max_count=200)
+            if 1 <= len(offs) <= 1000:
+                # Show the first 6 and last 3 offsets
+                first = offs[:6]
+                last = offs[-3:]
+                print(f"  '{prefix.hex()}' x{len(offs):>4}  first={first} last={last}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,34 @@
+"""Find body byte ranges that look like absolute int8 sample data (smooth waveform)."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def looks_like_smooth_int8(buf):
+    """Convert bytes to int8 and return the mean absolute delta between successive values (lower = smoother)."""
+    if len(buf) < 8:
+        return 0.0
+    vals = [b if b < 128 else b - 256 for b in buf]
+    diffs = [abs(vals[i+1] - vals[i]) for i in range(len(vals)-1)]
+    avg_diff = sum(diffs) / len(diffs)
+    return avg_diff
+
+
+def main():
+    for name in ("event-a", "event-c"):
+        b = load_bundle(name)
+        body = b.body
+        # Scan with sliding window of 64 bytes; find segments where the bytes look like a smooth wave
+        win = 64
+        scores = []
+        for i in range(len(body) - win + 1):  # +1 so the final full window is scored too
+            scores.append((i, looks_like_smooth_int8(body[i:i+win])))
+        # Lowest avg_diff means smoothest
+        scores.sort(key=lambda x: x[1])
+        print(f"\n=== {name} (body={len(body)}) — smoothest 10 windows ===")
+        for off, s in scores[:10]:
+            print(f"  +{off:>5}  avg_diff={s:.2f}  bytes={body[off:off+24].hex(' ')}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,76 @@
+"""Full Tran decoder: continues across segment headers using T_delta from header bytes [0:2]."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def decode_full_tran(body):
+    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
+        return None
+    T0 = int.from_bytes(body[3:5], "big", signed=True)
+    T1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    i = 7
+    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
+        i += 1
+
+    blocks = walk_body(body, i)
+    T = [T0, T1]
+    cur = T1
+    for blk in blocks:
+        if blk.tag_hi == 0x40:
+            # Segment header carries 2 T deltas (int16 BE each) at bytes [0:2] and [2:4]
+            if len(blk.data) >= 4:
+                delta1 = int.from_bytes(blk.data[0:2], "big", signed=True)
+                cur += delta1
+                T.append(cur)
+                delta2 = int.from_bytes(blk.data[2:4], "big", signed=True)
+                cur += delta2
+                T.append(cur)
+        elif blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur += s4(nib)
+                    T.append(cur)
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur += i8(byte)
+                T.append(cur)
+        elif blk.tag_hi == 0x00:
+            for _ in range(blk.tag_lo):
+                T.append(cur)
+        # 30 NN: skip for now
+    return T
+
+
+def main():
+    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        truth_T = [round(v*200) for v in samples["Tran"]]
+        n_truth = len(truth_T)
+
+        decoded = decode_full_tran(body) or []  # None when the preamble check fails
+        n = min(len(decoded), n_truth)
+        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
+        div_at = -1
+        for i in range(n):
+            if decoded[i] != truth_T[i]:
+                div_at = i
+                break
+        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,50 @@
+"""Quick inspection of the new high-amplitude events."""
+import os, sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+ROOT = "tests/fixtures/5-11-26"
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        bin_path = os.path.join(ROOT, stem)
+        txt_path = bin_path + ".TXT"
+        with open(bin_path, "rb") as f:
+            raw = f.read()
+        body = raw[43:-26]
+        meta, samples = _parse_txt(txt_path)
+        n = len(samples["Tran"])
+
+        print(f"\n=== {stem} ===")
+        print(f"  file={len(raw)}, body={len(body)}, N_samples={n}")
+        print(f"  rectime={meta.get('Record Time')} pretrig={meta.get('Pre-trigger Length')}")
+        print(f"  PPV(T,V,L)={meta.get('Tran PPV')} / {meta.get('Vert PPV')} / {meta.get('Long PPV')}")
+        # Show first few non-trivial samples
+        print(f"  First 5 truth samples (in/s):")
+        for i in range(min(5, n)):
+            print(f"    T={samples['Tran'][i]:8.3f}  V={samples['Vert'][i]:8.3f}  "
+                  f"L={samples['Long'][i]:8.3f}  M={samples['MicL'][i]:8.3f}")
+        # Peak sample positions
+        for ch in ("Tran", "Vert", "Long"):
+            vals = samples[ch]
+            peak_i = max(range(n), key=lambda i: abs(vals[i]))
+            print(f"  {ch}: peak {vals[peak_i]:.3f} at sample {peak_i} (t={peak_i/1024:.3f}s)")
+        # Body structure
+        start = find_data_start(body)
+        blocks = walk_body(body, start)
+        types = {}
+        for b in blocks:
+            types[b.tag_hi] = types.get(b.tag_hi, 0) + 1
+        print(f"  body start={start}, total blocks walked: {len(blocks)}")
+        print(f"  block tag counts: {types}")
+        # How far the walker got
+        if blocks:
+            last = blocks[-1]
+            walked = last.offset + last.length
+            print(f"  walker stopped at offset {walked}/{len(body)} ({100*walked/len(body):.0f}%)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,23 @@
+"""Print raw body hex + byte-distribution stats for one event."""
+from collections import Counter
+import sys
+
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def main():
+    for name in ("event-a", "event-b", "event-c", "event-d"):
+        b = load_bundle(name)
+        body = b.body
+        print(f"\n=== {name} ({len(body)} body bytes) ===")
+        print(f"  STRT: {b.strt.hex()}")
+        print(f"  body[0:64]:   {body[:64].hex()}")
+        print(f"  body[64:128]: {body[64:128].hex()}")
+        print(f"  body[-32:]:   {body[-32:].hex()}")
+        cnt = Counter(body)
+        print(f"  top 16 bytes: {[(f'0x{k:02x}', f'{v/len(body):.2%}') for k,v in cnt.most_common(16)]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,144 @@
+"""
+load_bundle.py — extract body bytes from BW binary + parse sample columns from TXT.
+
+Used by the codec reverse-engineering scripts in this directory.
+"""
+from __future__ import annotations
+
+import os
+import re
+from dataclasses import dataclass
+
+
+BUNDLE_ROOT = os.path.join(
+    os.path.dirname(__file__), "..", "tests", "fixtures", "decode-re-5-8-26"
+)
+
+
+@dataclass
+class Bundle:
+    name: str
+    bin_path: str
+    txt_path: str
+    bin: bytes
+    body: bytes  # bytes between STRT (43) and footer (last 26)
+    strt: bytes  # 21-byte STRT record
+    samples: dict  # {"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}
+    sample_rate: int
+    rectime_sec: float
+    pretrig_sec: float
+    geo_range_ips: float
+    ppv: dict  # {"Tran": float, "Vert": float, "Long": float}
+    mic_pspl: float
+    serial: str
+
+
+def _parse_txt(path: str) -> tuple[dict, dict]:
+    with open(path, "r", encoding="utf-8", errors="replace") as f:
+        text = f.read()
+
+    meta = {}
+    samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+
+    # Find header line that starts the columns ("Tran   Vert   Long   MicL").
+    # Then every line after is sample data (4 tab-separated floats).
+    lines = text.splitlines()
+    header_idx = None
+    for i, line in enumerate(lines):
+        if "Tran" in line and "Vert" in line and "Long" in line and "MicL" in line:
+            # The columns header.  Sample lines start a few lines later.
+            header_idx = i
+            break
+    if header_idx is None:
+        raise ValueError(f"no Tran/Vert/Long/MicL header in {path}")
+
+    # Parse meta — quoted lines with "Field : value"
+    for line in lines[:header_idx]:
+        m = re.match(r'^"([^":]+?)\s*:\s*([^"]*)"', line.strip())  # split on the FIRST colon so values may contain ':'
+        if m:
+            k, v = m.group(1).strip(), m.group(2).strip()
+            meta[k] = v
+
+    # Parse samples
+    for line in lines[header_idx + 1 :]:
+        line = line.strip()
+        if not line:
+            continue
+        parts = re.split(r"\s+", line)
+        if len(parts) < 4:
+            continue
+        try:
+            t = float(parts[0])
+            v = float(parts[1])
+            l = float(parts[2])
+            m = float(parts[3])
+        except ValueError:
+            continue
+        samples["Tran"].append(t)
+        samples["Vert"].append(v)
+        samples["Long"].append(l)
+        samples["MicL"].append(m)
+
+    return meta, samples
+
+
+def load_bundle(name: str) -> Bundle:
+    folder = os.path.join(BUNDLE_ROOT, name)
+    files = os.listdir(folder)
+    bin_name = next(f for f in files if not f.endswith(".TXT"))
+    txt_name = next(f for f in files if f.endswith(".TXT"))
+
+    bin_path = os.path.join(folder, bin_name)
+    txt_path = os.path.join(folder, txt_name)
+
+    with open(bin_path, "rb") as f:
+        binary = f.read()
+
+    # Header is 22 bytes; STRT at [22:43]; footer at last 26 bytes.
+    strt = binary[22:43]
+    body = binary[43:-26]
+
+    meta, samples = _parse_txt(txt_path)
+
+    sample_rate = int(re.search(r"(\d+)", meta.get("Sample Rate", "1024")).group(1))
+    rectime_sec = float(re.search(r"([\d.]+)", meta.get("Record Time", "3.0")).group(1))
+    pretrig_sec = float(re.search(r"-?[\d.]+", meta.get("Pre-trigger Length", "0")).group(0))
+    geo_range_ips = float(re.search(r"([\d.]+)", meta.get("Geo Range", "10.0")).group(1))
+    serial = meta.get("Serial Number", "").strip()
+
+    def _f(s):
+        return float(re.search(r"-?[\d.]+", s).group(0))
+
+    ppv = {
+        "Tran": _f(meta.get("Tran PPV", "0")),
+        "Vert": _f(meta.get("Vert PPV", "0")),
+        "Long": _f(meta.get("Long PPV", "0")),
+    }
+    mic_pspl = _f(meta.get("MicL PSPL", "0"))
+
+    return Bundle(
+        name=name,
+        bin_path=bin_path,
+        txt_path=txt_path,
+        bin=binary,
+        body=body,
+        strt=strt,
+        samples=samples,
+        sample_rate=sample_rate,
+        rectime_sec=rectime_sec,
+        pretrig_sec=pretrig_sec,
+        geo_range_ips=geo_range_ips,
+        ppv=ppv,
+        mic_pspl=mic_pspl,
+        serial=serial,
+    )
+
+
+if __name__ == "__main__":
+    for name in ("event-a", "event-b", "event-c", "event-d"):
+        b = load_bundle(name)
+        n = len(b.samples["Tran"])
+        print(f"{name}: body={len(b.body):>6}  N_samples={n}  rate={b.sample_rate}  "
+              f"rectime={b.rectime_sec}  pretrig={b.pretrig_sec}  range={b.geo_range_ips}  "
+              f"PPV(T,V,L)={b.ppv['Tran']:.3f},{b.ppv['Vert']:.3f},{b.ppv['Long']:.3f}  "
+              f"MicL={b.mic_pspl}")
@@ -0,0 +1,81 @@
+"""Decode Tran across multiple segments by resetting at 40 02 headers."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def decode_full_tran(body):
+    """Decode all Tran samples in the body, walking through segments."""
+    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
+        return None
+    T0 = int.from_bytes(body[3:5], "big", signed=True)
+    T1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    # Locate first tag
+    i = 7
+    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
+        i += 1
+
+    blocks = walk_body(body, i)
+    T = [T0, T1]
+    cur = T1
+    for bi, blk in enumerate(blocks):
+        if blk.tag_hi == 0x40:
+            # Segment header — try interpreting bytes [0:2] as new T anchor
+            if len(blk.data) >= 2:
+                new_anchor = int.from_bytes(blk.data[0:2], "big", signed=True)
+                # The next sample IS this anchor value, NOT a delta from cur.
+                T.append(new_anchor)
+                cur = new_anchor
+        elif blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur += s4(nib)
+                    T.append(cur)
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur += i8(byte)
+                T.append(cur)
+        elif blk.tag_hi == 0x00:
+            # RLE: append NN zero deltas
+            for _ in range(blk.tag_lo):
+                T.append(cur)
+        # 30 NN: skip
+    return T
+
+
+def main():
+    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        truth_T = [round(v*200) for v in samples["Tran"]]
+        n_truth = len(truth_T)
+
+        decoded = decode_full_tran(body) or []  # None when the preamble check fails
+        n = min(len(decoded), n_truth)
+        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
+        # Find first divergence
+        div_at = -1
+        for i in range(n):
+            if decoded[i] != truth_T[i]:
+                div_at = i
+                break
+        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
+        if div_at >= 0 and div_at < 30:
+            print(f"  truth around div [{max(0,div_at-3)}:{div_at+8}]: {truth_T[max(0,div_at-3):div_at+8]}")
+            print(f"  pred  around div [{max(0,div_at-3)}:{div_at+8}]: {decoded[max(0,div_at-3):div_at+8]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,28 @@
+"""Dump all blocks in segment 1 of each event with their data."""
+import sys
+sys.path.insert(0, ".")
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        blocks = walk_body(body, find_data_start(body))
+
+        # Find segment 1 (between first and second 40 02)
+        seg40_indices = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
+        if len(seg40_indices) < 2:
+            print(f"\n{stem}: only {len(seg40_indices)} segment headers found")
+            seg1_blocks = blocks[seg40_indices[0]:] if seg40_indices else []
+        else:
+            seg1_blocks = blocks[seg40_indices[0]:seg40_indices[1]+1]
+        print(f"\n=== {stem} segment 1 ({len(seg1_blocks)} blocks) ===")
+        for b in seg1_blocks[:25]:
+            tag = f"{b.tag_hi:02x}{b.tag_lo:02x}"
+            print(f"  off={b.offset:>5} {tag} NN=0x{b.tag_lo:02x}({b.tag_lo:>3}) len={b.length:>3}  data={b.data[:16].hex(' ')}{'...' if len(b.data)>16 else ''}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,195 @@
+"""Test 12-bit signed packed deltas hypothesis for 30 NN blocks across all loud events.
+
+For each 30 NN block in each event, identify what samples it should cover
+(based on the cumulative delta count up to that point) and compare the
+truth deltas against various 12-bit packing schemes.
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+CHANNEL_ORDER = ["Vert", "Long", "MicL", "Tran"]  # rotation after initial T
+
+
+def s12(v):
+    """Sign-extend a 12-bit unsigned value to signed int."""
+    return v if v < 0x800 else v - 0x1000
+
+
+def unpack_12bit_be(data):
+    """4 deltas in 6 bytes, BE order: byte[0:1.5], byte[1.5:3], byte[3:4.5], byte[4.5:6]."""
+    # bits 0..47 (MSB-first), split into 4 × 12-bit
+    val = int.from_bytes(data, "big")
+    out = []
+    for i in range(4):
+        d = (val >> (12 * (3 - i))) & 0xFFF
+        out.append(s12(d))
+    return out
+
+
+def unpack_12bit_le(data):
+    """4 deltas in 6 bytes, LE order: bytes packed as 2 × 24-bit groups."""
+    out = []
+    # First 3 bytes contain 2 deltas
+    b0, b1, b2 = data[0], data[1], data[2]
+    d0 = b0 | ((b1 & 0x0F) << 8)
+    d1 = (b1 >> 4) | (b2 << 4)
+    out.append(s12(d0))
+    out.append(s12(d1))
+    # Next 3 bytes contain 2 more deltas
+    b3, b4, b5 = data[3], data[4], data[5]
+    d2 = b3 | ((b4 & 0x0F) << 8)
+    d3 = (b4 >> 4) | (b5 << 4)
+    out.append(s12(d2))
+    out.append(s12(d3))
+    return out
+
+
+def unpack_12bit_be_per_triplet(data):
+    """4 deltas as 2 triplets of (high4, low8) BE within each 3-byte group."""
+    out = []
+    b0, b1, b2 = data[0], data[1], data[2]
+    d0 = (b0 << 4) | (b1 >> 4)
+    d1 = ((b1 & 0x0F) << 8) | b2
+    out.append(s12(d0))
+    out.append(s12(d1))
+    b3, b4, b5 = data[3], data[4], data[5]
+    d2 = (b3 << 4) | (b4 >> 4)
+    d3 = ((b4 & 0x0F) << 8) | b5
+    out.append(s12(d2))
+    out.append(s12(d3))
+    return out
+
+
+def truth_deltas_for_block(blocks, block_idx, event_truth, channel):
+    """For a 30 NN block at block_idx, determine which samples it covers.
+    Returns (first sample index within the segment, NN); truth deltas are left to the caller.
+
+    Walks through all blocks before block_idx (within the same segment) and
+    counts how many deltas have been emitted since the segment's anchor pair.
+    event_truth and channel are currently unused.
+    """
+    # Find the segment header that contains this block.
+    seg_header_idx = None
+    for j in range(block_idx, -1, -1):
+        if blocks[j].tag_hi == 0x40:
+            seg_header_idx = j
+            break
+    if seg_header_idx is None:
+        # block is in the initial T segment; samples count from sample 2.
+        first_sample_in_segment = 2
+    else:
+        # Anchor pair covers samples [N, N+1] for some N.  Subsequent deltas
+        # are samples [N+2, N+2+1, ...].  We don't actually need to know N
+        # for this test — just the relative position within the segment.
+        first_sample_in_segment = 2  # anchor=0,1; deltas start at 2
+
+    # Count deltas from segment-data start to block_idx.
+    delta_count = 0
+    start_block = seg_header_idx + 1 if seg_header_idx is not None else 0
+    for j in range(start_block, block_idx):
+        blk = blocks[j]
+        if blk.tag_hi == 0x10:
+            delta_count += blk.tag_lo  # NN nibbles = NN deltas
+        elif blk.tag_hi == 0x20:
+            delta_count += blk.tag_lo  # NN int8 deltas
+        elif blk.tag_hi == 0x00:
+            delta_count += blk.tag_lo  # RLE zero deltas
+    # Now the 30 NN block carries NN deltas.
+    nn = blocks[block_idx].tag_lo
+    # First sample affected: segment first_sample + delta_count.
+    # But we ALSO need to know which segment this is, since the segment maps
+    # to a specific channel and a specific starting absolute sample index.
+    return first_sample_in_segment + delta_count, nn
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
+                 "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        blocks = walk_body(body, find_data_start(body))
+        seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
+
+        # Find all 30 NN blocks in DATA section (not trailer).
+        thirty_blocks = []
+        for bi, b in enumerate(blocks):
+            if b.tag_hi != 0x30:
+                continue
+            # Determine which segment this is in
+            seg_num = None
+            for k, hi in enumerate(seg_idx):
+                next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
+                if hi < bi < next_hi:
+                    seg_num = k
+                    break
+            if seg_num is None and (not seg_idx or bi < seg_idx[0]):
+                seg_num = -1  # initial T segment
+            thirty_blocks.append((bi, b, seg_num))
+
+        if not thirty_blocks:
+            continue
+
+        print(f"\n=== {stem} ===")
+        for bi, b, seg_num in thirty_blocks:
+            # Channel for this segment
+            if seg_num == -1:
+                channel = "Tran"
+                seg_label = "initial T"
+            else:
+                channel = CHANNEL_ORDER[seg_num % 4]
+                seg_label = f"seg {seg_num}"
+
+            # Count deltas before this block within the same segment.
+            seg_header_idx = seg_idx[seg_num] if seg_num >= 0 else -1
+            start_block = seg_header_idx + 1 if seg_header_idx >= 0 else 0
+            delta_count = 0
+            for j in range(start_block, bi):
+                blk = blocks[j]
+                if blk.tag_hi in (0x10, 0x20, 0x00):
+                    delta_count += blk.tag_lo
+
+            # First sample this 30 NN block affects (within the segment)
+            # = anchor positions + delta_count + 2 (since anchor pair was samples 0,1)
+            # But the segment's first absolute sample index in the channel is
+            # (seg_num // 4) * 512 (approximately) if segment 0 is the first V seg.
+            cycle = (seg_num // 4) if seg_num >= 0 else 0
+            base = cycle * 512 + 2  # +2 for anchor pair
+            sample_idx = base + delta_count
+            truth_ch = [round(v * 200) for v in samples[channel]]
+            nn = b.tag_lo
+
+            if sample_idx + nn > len(truth_ch):
+                print(f"  block @ {b.offset} ({seg_label} {channel}): out of truth range")
+                continue
+
+            # Get the previous sample so we can compute truth deltas
+            if sample_idx == 0:
+                prev = 0
+            else:
+                prev = truth_ch[sample_idx - 1]
+            truth_deltas = []
+            for k in range(nn):
+                truth_deltas.append(truth_ch[sample_idx + k] - (prev if k == 0 else truth_ch[sample_idx + k - 1]))
+
+            # Try each packing
+            schemes = [
+                ("12-bit BE contiguous", unpack_12bit_be(b.data)),
+                ("12-bit LE per-triplet", unpack_12bit_le(b.data)),
+                ("12-bit BE per-triplet", unpack_12bit_be_per_triplet(b.data)),
+            ]
+            print(f"  block @ {b.offset:>5} ({seg_label} {channel}, samples {sample_idx}..{sample_idx+nn-1}):")
+            print(f"    data:  {b.data.hex(' ')}")
+            print(f"    truth: {truth_deltas}")
+            for name, pred in schemes:
+                match = "✓" if pred == truth_deltas else " "
+                n_match = sum(1 for x, y in zip(pred, truth_deltas) if x == y)
+                print(f"    {match}{n_match}/4  {name}: {pred}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,132 @@
+"""Test the '30 NN data = high-nibbles + int8 low-bytes' hypothesis.
+
+Layout for `30 04` (6 data bytes, 4 deltas):
+  bytes [0:2] = 16 bits = 4 × 4-bit high-nibbles (MSB first)
+  bytes [2:6] = 4 × int8 low bytes
+  Each delta = 12-bit signed = sign-extend((high_nibble << 8) | low_byte)
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def sign_extend_12(v):
+    return v if v < 0x800 else v - 0x1000
+
+
+def decode_30nn(data):
+    """4 × 12-bit signed deltas (high nibble + low byte).
+    bytes[0:2] hold the 4 high nibbles (MSB first); bytes[2:6] hold the low bytes.
+    """
+    if len(data) < 6:
+        return []
+    # Read high nibbles from bytes 0-1 (4 nibbles MSB-first)
+    high_word = (data[0] << 8) | data[1]
+    high_nibbles = [
+        (high_word >> 12) & 0xF,
+        (high_word >> 8) & 0xF,
+        (high_word >> 4) & 0xF,
+        high_word & 0xF,
+    ]
+    out = []
+    for i in range(4):
+        v = (high_nibbles[i] << 8) | data[2 + i]
+        out.append(sign_extend_12(v))
+    return out
+
+
+def simulate_up_to(blocks, target_block_idx, t_preamble):
+    """Run decoder up to (not including) target_block_idx; return (per-channel sample lists, current channel).
+    NOW with 30 NN decoded too."""
+    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    out["Tran"].extend(t_preamble)
+    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    current_channel = "Tran"
+    seg_counter = -1
+    for j in range(target_block_idx):
+        blk = blocks[j]
+        if blk.tag_hi == 0x40:
+            seg_counter += 1
+            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
+            new_ch = rotation[seg_counter % 4]
+            if cur[prev] is not None:
+                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
+                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
+                cur[prev] += d0; out[prev].append(cur[prev])
+                cur[prev] += d1; out[prev].append(cur[prev])
+            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
+            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
+            out[new_ch].extend([c0, c1])
+            cur[new_ch] = c1
+            current_channel = new_ch
+        elif blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur[current_channel] += s4(nib)
+                    out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur[current_channel] += i8(byte)
+                out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x00:
+            for _ in range(blk.tag_lo):
+                out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x30:
+            # NEW: decode 30 NN
+            deltas = decode_30nn(blk.data)
+            for d in deltas:
+                cur[current_channel] += d
+                out[current_channel].append(cur[current_channel])
+    return out, current_channel
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
+                 "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        blocks = walk_body(body, find_data_start(body))
+        t0 = int.from_bytes(body[3:5], "big", signed=True)
+        t1 = int.from_bytes(body[5:7], "big", signed=True)
+        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
+        if not thirty_blocks:
+            continue
+        print(f"\n=== {stem} ===")
+        for j, blk in thirty_blocks:
+            pred, ch = simulate_up_to(blocks, j, [t0, t1])
+            cur_before = pred[ch][-1]
+            truth = [round(v * 200) for v in samples[ch]]
+            n_pred = len(pred[ch])
+            nn = blk.tag_lo
+            if n_pred + nn > len(truth):
+                continue
+            # Decode this 30 NN block with hypothesis
+            pred_deltas = decode_30nn(blk.data)
+            # Compute truth deltas relative to cur_before
+            truth_deltas = []
+            prev = cur_before
+            for k in range(nn):
+                truth_deltas.append(truth[n_pred + k] - prev)
+                prev = truth[n_pred + k]
+            n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
+            tag = "✓" if pred_deltas == truth_deltas else " "
+            print(f"  block @ {blk.offset:>5} (chan={ch}, NN={nn}):")
+            print(f"    data:  {blk.data.hex(' ')}")
+            print(f"    truth: {truth_deltas}")
+            print(f"    pred:  {pred_deltas}  {tag}{n_match}/{nn}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,141 @@
+"""Test 30 NN packing by running the real decoder up to each 30 NN block,
+recording how many samples have been produced for each channel at that point,
+then checking truth deltas immediately after."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def s12(v):
+    return v if v < 0x800 else v - 0x1000
+
+
+def unpack_12bit_be_contiguous(data):
+    out = []
+    val = int.from_bytes(data, "big")
+    n = len(data) * 8 // 12
+    for i in range(n):
+        d = (val >> (12 * (n - 1 - i))) & 0xFFF
+        out.append(s12(d))
+    return out
+
+
+def unpack_12bit_per_triplet_be(data):
+    out = []
+    for i in range(0, len(data), 3):
+        if i + 2 >= len(data):
+            break
+        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
+        d0 = (b0 << 4) | (b1 >> 4)
+        d1 = ((b1 & 0x0F) << 8) | b2
+        out.append(s12(d0))
+        out.append(s12(d1))
+    return out
+
+
+def simulate_up_to(blocks, target_block_idx, t_preamble):
+    """Run the decoder up to (not including) target_block_idx; return (per-channel sample lists, current channel)."""
+    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    out["Tran"].extend(t_preamble)
+    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    seg_idx = [j for j, b in enumerate(blocks) if b.tag_hi == 0x40]
+
+    # Determine which channel we're CURRENTLY decoding into
+    current_channel = "Tran"
+    seg_counter = -1  # incremented at each 40 02
+
+    for j in range(target_block_idx):
+        blk = blocks[j]
+        if blk.tag_hi == 0x40:
+            # Switch: extend prev channel, set up new channel
+            seg_counter += 1
+            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
+            new_ch = rotation[seg_counter % 4]
+            if cur[prev] is not None:
+                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
+                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
+                cur[prev] += d0; out[prev].append(cur[prev])
+                cur[prev] += d1; out[prev].append(cur[prev])
+            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
+            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
+            out[new_ch].extend([c0, c1])
+            cur[new_ch] = c1
+            current_channel = new_ch
+        elif blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur[current_channel] += s4(nib)
+                    out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur[current_channel] += i8(byte)
+                out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x00:
+            for _ in range(blk.tag_lo):
+                out[current_channel].append(cur[current_channel])
+        elif blk.tag_hi == 0x30:
+            # Skip for now — we want to know what comes next
+            pass
+
+    return out, current_channel
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
+                 "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        blocks = walk_body(body, find_data_start(body))
+        t0 = int.from_bytes(body[3:5], "big", signed=True)
+        t1 = int.from_bytes(body[5:7], "big", signed=True)
+
+        # Find all 30 NN blocks in data section
+        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
+        if not thirty_blocks:
+            continue
+
+        print(f"\n=== {stem} ===")
+        for j, blk in thirty_blocks:
+            pred, ch = simulate_up_to(blocks, j, [t0, t1])
+            n_pred = len(pred[ch])
+            # The 30 NN block carries NN deltas for channel `ch` starting at sample n_pred
+            truth = [round(v * 200) for v in samples[ch]]
+            if n_pred >= len(truth):
+                continue
+            # Truth deltas: truth[n_pred] - cur, truth[n_pred+1] - truth[n_pred], ...
+            cur_val = pred[ch][-1]
+            nn = blk.tag_lo
+            truth_deltas = []
+            prev = cur_val
+            for k in range(min(nn, len(truth) - n_pred)):
+                truth_deltas.append(truth[n_pred + k] - prev)
+                prev = truth[n_pred + k]
+
+            print(f"  block @ {blk.offset:>5} (chan={ch}, after sample {n_pred-1}, "
+                  f"NN={nn}, last_val={cur_val}):")
+            print(f"    data:  {blk.data.hex(' ')}")
+            print(f"    truth: {truth_deltas}")
+            schemes = [
+                ("12-bit BE contiguous", unpack_12bit_be_contiguous(blk.data)),
+                ("12-bit per-triplet BE", unpack_12bit_per_triplet_be(blk.data)),
+            ]
+            for name, pred_deltas in schemes:
+                n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
+                tag = "✓" if pred_deltas[:len(truth_deltas)] == truth_deltas else " "
+                print(f"    {tag}{n_match}/{nn}  {name}: {pred_deltas[:nn]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,86 @@
+"""Test: 00 NN markers might be RLE for zero-deltas in current channel."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def decode_with_rle(body):
+    """Decode Tran assuming:
+    - preamble[3:5], [5:7] = T[0], T[1]
+    - All 10 NN / 20 NN blocks until segment_header (40 02) are Tran deltas
+    - 00 NN markers are RLE: NN/4 zero T deltas (or NN, or NN/2 — try them)
+    """
+    if len(body) < 9 or body[0:3] != b"\x00\x02\x00":
+        return {}, None, None  # empty results so callers can still iterate
+    T0 = int.from_bytes(body[3:5], "big", signed=True)
+    T1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    # Find first tag (might be 00 NN, 10 NN, or 20 NN)
+    i = 7
+    while i + 1 < len(body):
+        if body[i] in (0x00, 0x10, 0x20):
+            break
+        i += 1
+    start = i
+
+    blocks = walk_body(body, start)
+
+    results = {}
+    for rle_div in (4, 2, 1):  # try different RLE interpretations
+        T = [T0, T1]
+        cur = T1
+        for blk in blocks:
+            if blk.tag_hi == 0x40:
+                break
+            if blk.tag_hi == 0x10:
+                for byte in blk.data:
+                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                        cur += s4(nib)
+                        T.append(cur)
+            elif blk.tag_hi == 0x20:
+                for byte in blk.data:
+                    cur += i8(byte)
+                    T.append(cur)
+            elif blk.tag_hi == 0x00:
+                # RLE of zero deltas
+                n_zeros = blk.tag_lo // rle_div
+                for _ in range(n_zeros):
+                    T.append(cur)
+            # 30 NN: skip for now
+        results[rle_div] = T
+    return results, T0, T1
+
+
+def main():
+    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        truth_T = [round(v*200) for v in samples["Tran"]]
+
+        results, T0, T1 = decode_with_rle(body)
+        print(f"\n=== {stem} (T[0]={T0}, T[1]={T1}) ===")
+        for rle_div, T in results.items():
+            n = min(len(T), len(truth_T))
+            matches = sum(1 for i in range(n) if T[i] == truth_T[i])
+            # Find first divergence
+            div_at = -1
+            for i in range(n):
+                if T[i] != truth_T[i]:
+                    div_at = i
+                    break
+            print(f"  rle_div={rle_div}: decoded {len(T)}, matches {matches}/{n}, first div at sample {div_at}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,71 @@
+"""Test: does the second '20 NN' block in SS0 continue Tran samples?"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def main():
+    stem = "M529LL1A.SS0"
+    path = f"tests/fixtures/5-11-26/{stem}"
+    with open(path, "rb") as f:
+        body = f.read()[43:-26]
+    _, samples = _parse_txt(path + ".TXT")
+    truth_T_16 = [round(v * 200) for v in samples["Tran"]]
+
+    # Preamble
+    T0 = int.from_bytes(body[3:5], "big", signed=True)
+    T1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    # Walk blocks
+    start = find_data_start(body)
+    blocks = walk_body(body, start)
+
+    print(f"=== {stem} ===  T[0]={T0} T[1]={T1}")
+
+    # Hypothesis: Tran continues through ALL 10 NN and 20 NN blocks
+    # in order, until the next 40 02 segment header (which resets).
+    T = [T0, T1]
+    cur = T1
+    decoded_count = 2  # T[0], T[1] from preamble
+    for bi, blk in enumerate(blocks):
+        if blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur += s4(nib)
+                    T.append(cur)
+                    decoded_count += 1
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur += i8(byte)
+                T.append(cur)
+                decoded_count += 1
+        elif blk.tag_hi == 0x40:
+            # Segment header — stop here for this test
+            break
+        # 00 and 30 NN don't contribute to Tran (in this hypothesis)
+
+    # Compare to truth
+    print(f"  Decoded {len(T)} T samples up to first 40 02")
+    matches = sum(1 for i in range(min(len(T), len(truth_T_16))) if T[i] == truth_T_16[i])
+    print(f"  Matches in first {min(len(T), len(truth_T_16))}: {matches}")
+    # Print first divergence
+    for i in range(min(len(T), len(truth_T_16))):
+        if T[i] != truth_T_16[i]:
+            print(f"  First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
+            # Show context
+            print(f"    pred  [{max(0,i-3)}:{i+5}]: {T[max(0,i-3):i+5]}")
+            print(f"    truth [{max(0,i-3)}:{i+5}]: {truth_T_16[max(0,i-3):i+5]}")
+            break
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,67 @@
+"""Try various nibble-level channel interleavings to find which one matches truth."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def run_decoder(body, layout, skip, n_channels=4):
+    """layout: function nibble_index -> channel_index. Returns list-of-lists per channel."""
+    out = [[] for _ in range(n_channels)]
+    cur = [0] * n_channels
+    nibbles = []
+    for byte in body[skip:]:
+        nibbles.append((byte >> 4) & 0xF)
+        nibbles.append(byte & 0xF)
+    for i, n in enumerate(nibbles):
+        ch = layout(i)
+        cur[ch] += s4(n)
+        out[ch].append(cur[ch])
+    return out
+
+
+def cmp(pred, truth, n=24):
+    n = min(n, len(pred), len(truth))
+    return [(pred[i], truth[i]) for i in range(n)]
+
+
+def main():
+    b = load_bundle("event-c")
+    truth_T = [round(v * 200) for v in b.samples["Tran"]]
+    truth_V = [round(v * 200) for v in b.samples["Vert"]]
+    truth_L = [round(v * 200) for v in b.samples["Long"]]
+    print(f"T truth[0:10]: {truth_T[:10]}")
+    print(f"V truth[0:10]: {truth_V[:10]}")
+    print(f"L truth[0:10]: {truth_L[:10]}")
+
+    # Try several nibble->channel layouts (4 channels)
+    layouts = {
+        "interleaved TVLM (0,1,2,3,0,1,2,3,...)": lambda i: i % 4,
+        "interleaved VLMT": lambda i: (i + 3) % 4,
+        "interleaved LMTV": lambda i: (i + 2) % 4,
+        "interleaved MTVL": lambda i: (i + 1) % 4,
+        "byte-based TV LM TV LM (high T low V byte0; high L low M byte1)": lambda i: i % 4,
+        # "chunks of 8 nibbles per channel": each channel gets 8 nibbles in a row
+        "chunks-8 TVLM": lambda i: (i // 8) % 4,
+        "chunks-16 TVLM": lambda i: (i // 16) % 4,
+        # planar (full channel sequential)
+        "planar T(0..N) V(N..2N) L(2N..3N) M(3N..4N)": None,  # special
+    }
+
+    for label, layout_fn in layouts.items():
+        if layout_fn is None:
+            continue
+        for skip in (0, 4, 7, 8, 9, 11, 14):
+            out = run_decoder(b.body, layout_fn, skip)
+            # Print the first 10 cumulative values on each geophone channel
+            print(f"  skip={skip:2}  {label}")
+            print(f"    T_cum[0:10]: {out[0][:10]}")
+            print(f"    V_cum[0:10]: {out[1][:10]}")
+            print(f"    L_cum[0:10]: {out[2][:10]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,73 @@
+"""Try decoding body as 4-bit signed nibble deltas, 4-channel round-robin."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+def s4(n):
+    """Sign-extend a 4-bit unsigned to int (0..7 → 0..7, 8..F → -8..-1)."""
+    return n if n < 8 else n - 16
+
+
+def decode_nibbles(body: bytes, skip_bytes: int = 7, n_channels: int = 4):
+    """Read body as 2 nibbles per byte; accumulate as deltas for n_channels round-robin."""
+    out = [[] for _ in range(n_channels)]
+    cur = [0] * n_channels
+    ch = 0
+    nibbles = []
+    for byte in body[skip_bytes:]:
+        nibbles.append((byte >> 4) & 0xF)
+        nibbles.append(byte & 0xF)
+    for n in nibbles:
+        cur[ch] += s4(n)
+        out[ch].append(cur[ch])
+        ch = (ch + 1) % n_channels
+    return out
+
+
+def cmp_to_truth(pred, truth, scale=16):
+    """Compare predicted ints (in 16-count units) to truth (in 16-count units = txt * 200).
+    Return (max_abs_err, mean_abs_err, n_compared).
+    """
+    n = min(len(pred), len(truth))
+    errs = []
+    for i in range(n):
+        p = pred[i]
+        t = truth[i]
+        errs.append(abs(p - t))
+    if not errs:
+        return None
+    return (max(errs), sum(errs) / len(errs), n)
+
+
+def main():
+    for name in ("event-a", "event-c"):
+        b = load_bundle(name)
+        # Convert TXT samples (in/s) to 16-count units: multiply by 200, since 0.005 in/s = 1 unit.
+        # Note: 0.005 in/s is roughly 16 ADC counts (1 count ≈ 0.000305 in/s),
+        # so in 1-count units: count = txt * (1/0.0003052) ≈ txt * 3276.7.
+        # The TXT only has 0.005 in/s resolution, so we compare in 16-count units (txt * 200).
+        truth_in_16 = {ch: [round(v * 200) for v in b.samples[ch]] for ch in CHANNELS[:3]}
+        # MicL is in dB, skip for now
+
+        # Try decoder with skip_bytes = 7
+        decoded = decode_nibbles(b.body, skip_bytes=7, n_channels=4)
+        print(f"\n=== {name} ===")
+        print(f"  body={len(b.body)}, nibbles={2*(len(b.body)-7)}, samples_per_ch={len(decoded[0])}")
+        print(f"  truth samples per ch: {len(truth_in_16['Tran'])}")
+        # Print first 24 of each
+        for i, chan in enumerate(CHANNELS):
+            pred_first = decoded[i][:24]
+            if chan in truth_in_16:
+                truth_first = truth_in_16[chan][:24]
+                print(f"  {chan} pred: {pred_first}")
+                print(f"  {chan} truth: {truth_first}")
+            else:
+                print(f"  {chan} pred: {pred_first}  (truth in dB, skipped)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,32 @@
+"""Verify decode_waveform_v2 against BW ASCII truth for all fixtures."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import decode_waveform_v2
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
+                 "M529LL1L.JQ0", "M529LL1L.V70"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            body = f.read()[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        decoded = decode_waveform_v2(body)
+        if decoded is None:
+            print(f"{stem}: decoder returned None")
+            continue
+
+        print(f"\n=== {stem} ===")
+        for ch in ("Tran", "Vert", "Long"):
+            truth = [round(v * 200) for v in samples[ch]]
+            pred = decoded[ch]
+            n = min(len(pred), len(truth))
+            matches = sum(1 for i in range(n) if pred[i] == truth[i])
+            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
+            print(f"  {ch}: decoded={len(pred):>5}  truth={len(truth):>5}  "
+                  f"matches={matches:>5}/{n:<5}  first div={div}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,55 @@
+"""Run decode_waveform_v2 against the 5-8-26 quiet bundle to test the
+'quiet events should decode fully' hypothesis."""
+import os, sys
+sys.path.insert(0, ".")
+from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
+from analysis.load_bundle import _parse_txt
+
+
+def main():
+    base = "tests/fixtures/decode-re-5-8-26"
+    for evt in sorted(os.listdir(base)):
+        folder = os.path.join(base, evt)
+        if not os.path.isdir(folder):
+            continue
+        # Find the binary (not .TXT)
+        bin_name = next(
+            (f for f in os.listdir(folder) if not f.endswith(".TXT")),
+            None,
+        )
+        if not bin_name:
+            continue
+        bin_path = os.path.join(folder, bin_name)
+        txt_path = bin_path + ".TXT"
+        if not os.path.exists(txt_path):
+            # Sometimes the TXT name differs slightly
+            for f in os.listdir(folder):
+                if f.endswith(".TXT"):
+                    txt_path = os.path.join(folder, f)
+                    break
+        with open(bin_path, "rb") as f:
+            body = f.read()[43:-26]
+        decoded = decode_waveform_v2(body)
+        _, samples = _parse_txt(txt_path)
+
+        # Count 30 NN blocks
+        blocks = walk_body(body, find_data_start(body))
+        n_30 = sum(1 for b in blocks if b.tag_hi == 0x30)
+        n_40 = sum(1 for b in blocks if b.tag_hi == 0x40)
+
+        print(f"\n=== {evt} === body={len(body)}  segments={n_40}  '30 NN' blocks={n_30}")
+        if decoded is None:
+            print("  decoder returned None")
+            continue
+        for ch in ("Tran", "Vert", "Long"):
+            truth = [round(v * 200) for v in samples[ch]]
+            pred = decoded[ch]
+            n = min(len(pred), len(truth))
+            matches = sum(1 for i in range(n) if pred[i] == truth[i])
+            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
+            print(f"  {ch}: decoded={len(pred):>5}  truth={len(truth):>5}  "
+                  f"matches={matches:>5}/{n:<5}  first div={div}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,71 @@
+"""Verify: preamble[3:7] = Tran[0], Tran[1] as int16 BE in 16-count units.
+And first 20/10 NN block = Tran deltas starting at sample 2.
+"""
+import os, sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import _parse_txt
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def s4(n):
+    return n if n < 8 else n - 16
+
+
+def i8(b):
+    return b if b < 128 else b - 256
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        path = f"tests/fixtures/5-11-26/{stem}"
+        with open(path, "rb") as f:
+            raw = f.read()
+        body = raw[43:-26]
+        _, samples = _parse_txt(path + ".TXT")
+        truth_T_16 = [round(v * 200) for v in samples["Tran"]]
+
+        # Preamble parse
+        T0_pre = int.from_bytes(body[3:5], "big", signed=True)
+        T1_pre = int.from_bytes(body[5:7], "big", signed=True)
+        print(f"\n=== {stem} ===")
+        print(f"  Preamble T[0]={T0_pre} (truth {truth_T_16[0]})  T[1]={T1_pre} (truth {truth_T_16[1]})  match={T0_pre==truth_T_16[0] and T1_pre==truth_T_16[1]}")
+
+        # First block
+        start = find_data_start(body)
+        blocks = walk_body(body, start)
+        if not blocks:
+            print("  no blocks found")
+            continue
+
+        # Assume first block = Tran deltas from sample 2
+        first = blocks[0]
+        T = [T0_pre, T1_pre]
+        cur_T = T1_pre
+        if first.tag_hi == 0x10:
+            # Nibble pairs
+            for byte in first.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur_T += s4(nib)
+                    T.append(cur_T)
+        elif first.tag_hi == 0x20:
+            # int8 per byte
+            for byte in first.data:
+                cur_T += i8(byte)
+                T.append(cur_T)
+
+        # Compare against truth
+        n_check = min(len(T), len(truth_T_16))
+        match_count = sum(1 for i in range(n_check) if T[i] == truth_T_16[i])
+        print(f"  First block type=0x{first.tag_hi:02x} NN=0x{first.tag_lo:02x} len={len(first.data)} → {len(T)} T samples decoded")
+        print(f"  Tran predicted[0:10]: {T[:10]}")
+        print(f"  Tran truth    [0:10]: {truth_T_16[:10]}")
+        print(f"  Matches in first {n_check}: {match_count} / {n_check}")
+        # Show where it diverges
+        for i in range(n_check):
+            if T[i] != truth_T_16[i]:
+                print(f"  First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
+                break
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,20 @@
+"""Walk blocks of the new 5-11-26 events and look at what comes after Tran block."""
+import sys
+sys.path.insert(0, ".")
+from minimateplus.waveform_codec import walk_body, find_data_start
+
+
+def main():
+    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
+        with open(f"tests/fixtures/5-11-26/{stem}", "rb") as f:
+            raw = f.read()
+        body = raw[43:-26]
+        start = find_data_start(body)
+        blocks = walk_body(body, start)
+        print(f"\n=== {stem} === body={len(body)} start={start} blocks walked={len(blocks)}")
+        for i, b in enumerate(blocks[:20]):
+            print(f"  block[{i:>2}] @ {b.offset:>5} tag={b.tag_hi:02x} NN=0x{b.tag_lo:02x}({b.tag_lo}) len={b.length} data[:24]={b.data[:24].hex(' ')}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,44 @@
+"""Walk the body assuming chunks delimited by 0x10 NN tags. Print each chunk's structure."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def walk(body: bytes, start_offset: int = 7, max_chunks: int = 30):
+    """Return every offset where byte 0x10 is followed by a multiple-of-4 byte (a candidate `10 NN` tag)."""
+    chunks = []
+    i = start_offset
+    while i < len(body) - 1:
+        # Candidate `10 NN` tag: NN must be a multiple of 4. A 0x10 byte inside chunk data can also match; it is not filtered out here.
+        if body[i] == 0x10 and (body[i+1] % 4 == 0):
+            chunks.append(i)
+        i += 1
+    return chunks
+
+
+def main():
+    for name in ("event-c", "event-d"):
+        b = load_bundle(name)
+        body = b.body
+        positions = []
+        i = 7  # skip 7-byte preamble
+        while i < len(body) - 1:
+            if body[i] == 0x10 and body[i+1] % 4 == 0 and body[i+1] > 0:
+                positions.append(i)
+                i += 2  # skip past tag
+            else:
+                i += 1
+        print(f"\n=== {name} ===  body={len(body)}, total `10 NN` (NN%4==0, NN>0) tags: {len(positions)}")
+        # Print first 30 chunks: show position, NN, gap to next tag
+        for k in range(min(30, len(positions))):
+            pos = positions[k]
+            NN = body[pos + 1]
+            next_pos = positions[k+1] if k+1 < len(positions) else len(body)
+            gap = next_pos - pos
+            data_bytes = body[pos+2 : next_pos]
+            print(f"  chunk[{k:>3}] @ {pos:>5}  NN=0x{NN:02x} ({NN:>3}, NN/2={NN//2})  gap={gap:>3}  "
+                  f"data={data_bytes[:24].hex(' ')}{'...' if len(data_bytes) > 24 else ''}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,50 @@
+"""Deterministic chunk walker: each chunk = [10 NN][NN/2 bytes data][2 bytes trailer]."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def walk_chunks(body: bytes, start: int = 7):
+    """Yield (offset, NN, data_bytes, trailer_bytes) tuples."""
+    i = start
+    while i + 1 < len(body):
+        if body[i] != 0x10:
+            break
+        NN = body[i + 1]
+        if NN == 0 or NN > 0x80 or NN % 4 != 0:
+            break
+        chunk_len = NN // 2 + 4
+        if i + chunk_len > len(body):
+            break
+        data = bytes(body[i + 2 : i + 2 + NN // 2])
+        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
+        yield (i, NN, data, trailer)
+        i += chunk_len
+
+
+def main():
+    for name in ("event-c", "event-d", "event-a", "event-b"):
+        b = load_bundle(name)
+        body = b.body
+        chunks = list(walk_chunks(body))
+        print(f"\n=== {name} ===  body={len(body)}  N_samples={len(b.samples['Tran'])}")
+        print(f"  chunks parsed: {len(chunks)}")
+        if chunks:
+            last = chunks[-1]
+            end_of_walk = last[0] + last[1] // 2 + 4
+            print(f"  walk ended at offset {end_of_walk} (= {len(body) - end_of_walk} bytes from end)")
+            # Stats
+            total_data_bytes = sum(len(c[2]) for c in chunks)
+            print(f"  total data bytes: {total_data_bytes}, total nibbles: {2*total_data_bytes}")
+            if name in ("event-c", "event-d"):
+                ratio = (2 * total_data_bytes) / (len(b.samples['Tran']) * 4)
+                print(f"  nibbles per (sample × channel): {ratio:.3f}")
+            # Last trailer byte of each chunk (collected for inspection; not printed below)
+            trailer_sums = [c[3][-1] if c[3] else None for c in chunks]
+            print(f"  first 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:10]]}")
+            # Print last 10 chunks (likely transition to trailer)
+            print(f"  last 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-10:]]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,51 @@
+"""Walk chunks; auto-detect preamble length by finding first 10 NN."""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def walk_chunks(body, start, max_NN=0x80):
+    chunks = []
+    i = start
+    while i + 1 < len(body):
+        if body[i] != 0x10:
+            break
+        NN = body[i + 1]
+        if NN == 0 or NN > max_NN or NN % 4 != 0:
+            break
+        chunk_len = NN // 2 + 4
+        if i + chunk_len > len(body):
+            break
+        data = bytes(body[i + 2 : i + 2 + NN // 2])
+        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
+        chunks.append((i, NN, data, trailer))
+        i += chunk_len
+    return chunks, i
+
+
+def find_first_chunk_start(body):
+    """Locate first byte that begins a `10 NN` chunk (NN ∈ multiples of 4, 4..0x7C)."""
+    for i in range(min(20, len(body) - 1)):
+        if body[i] == 0x10 and body[i + 1] % 4 == 0 and 0 < body[i + 1] <= 0x7C:
+            return i
+    return -1
+
+
+def main():
+    for name in ("event-c", "event-d", "event-a", "event-b"):
+        b = load_bundle(name)
+        body = b.body
+        start = find_first_chunk_start(body)
+        chunks, end = walk_chunks(body, start)
+        print(f"\n=== {name} ===  body={len(body)}  N_samples={len(b.samples['Tran'])}  start={start}")
+        print(f"  chunks parsed: {len(chunks)}, walk ended at {end}")
+        if chunks:
+            print(f"  first 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:5]]}")
+            print(f"  last 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-5:]]}")
+            print(f"  bytes around end of walk: {body[end-4:end+12].hex(' ')}")
+        else:
+            print(f"  bytes at start: {body[start:start+16].hex(' ')}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,75 @@
+"""
+Walker v4: alternate [10 NN] data chunks and [00 NN] (or other) marker tags.
+
+Hypothesis:
+- [10 NN]: data block, length NN/2 + 2 bytes (2-byte tag + NN/2 bytes data)
+- [00 NN]: 2-byte marker block (no data)
+- [20/30/40 NN]: special blocks with type-dependent length
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+
+
+def walk(body, start):
+    i = start
+    blocks = []
+    while i + 1 < len(body):
+        t0 = body[i]
+        t1 = body[i + 1]
+        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0x80:
+            # data chunk: length NN/2 + 2
+            length = t1 // 2 + 2
+            blocks.append((i, "10", t1, bytes(body[i + 2 : i + length]), length))
+            i += length
+        elif t0 == 0x00 and t1 % 4 == 0:
+            # 2-byte marker
+            blocks.append((i, "00", t1, b"", 2))
+            i += 2
+        elif t0 == 0x20 and t1 % 4 == 0:
+            # type 2 — try length 2+t1/2 (similar to 10) OR fixed
+            length = t1 // 2 + 2
+            blocks.append((i, "20", t1, bytes(body[i + 2 : i + length]), length))
+            i += length
+        elif t0 == 0x30 and t1 % 4 == 0:
+            length = t1 // 2 + 2
+            blocks.append((i, "30", t1, bytes(body[i + 2 : i + length]), length))
+            i += length
+        elif t0 == 0x40 and t1 == 0x02:
+            # Special "footer transition" block — try fixed 22 bytes
+            length = 22
+            blocks.append((i, "40", t1, bytes(body[i + 2 : i + length]), length))
+            i += length
+        else:
+            # Unknown tag — stop
+            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
+            break
+    return blocks, i
+
+
+def main():
+    for name in ("event-c", "event-d", "event-a", "event-b"):
+        b = load_bundle(name)
+        body = b.body
+        # Auto-detect start
+        for s in range(15):
+            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0x80:
+                start = s
+                break
+        else:
+            start = 7
+        blocks, end = walk(body, start)
+        # Categorize
+        from collections import Counter
+        types = Counter(bb[1] for bb in blocks)
+        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
+        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
+        print(f"  type counts: {dict(types)}")
+        # Print last 5 blocks
+        print(f"  last 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-5:]]}")
+        if end < len(body):
+            print(f"  bytes at end: {body[end:end+24].hex(' ')}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,83 @@
+"""
+Walker v5: flexible NN range and multiple block-type lengths.
+
+Hypothesis:
+- [10 NN]: 4-bit-delta data block, length = NN/2 + 2
+- [20 NN]: 8-bit-literal data block, length = NN + 2
+- [00 NN]: 2-byte marker (no payload)
+- [30 NN]: trailer/summary block, length = NN*4
+- [40 NN]: footer-marker block, fixed 22 bytes
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+from collections import Counter
+
+
+def walk(body, start, max_blocks=10000):
+    i = start
+    blocks = []
+    while i + 1 < len(body) and len(blocks) < max_blocks:
+        t0 = body[i]
+        t1 = body[i + 1]
+        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 // 2 + 2
+            if i + length > len(body):
+                break
+            data = bytes(body[i + 2 : i + length])
+            blocks.append((i, "10", t1, data, length))
+            i += length
+        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 + 2
+            if i + length > len(body):
+                break
+            data = bytes(body[i + 2 : i + length])
+            blocks.append((i, "20", t1, data, length))
+            i += length
+        elif t0 == 0x00 and t1 % 4 == 0:
+            # 2-byte marker
+            blocks.append((i, "00", t1, b"", 2))
+            i += 2
+        elif t0 == 0x30 and t1 % 4 == 0 and t1 > 0:
+            length = t1 * 4
+            if i + length > len(body):
+                break
+            data = bytes(body[i + 2 : i + length])
+            blocks.append((i, "30", t1, data, length))
+            i += length
+        elif t0 == 0x40 and t1 == 0x02:
+            length = 22
+            if i + length > len(body):
+                break
+            data = bytes(body[i + 2 : i + length])
+            blocks.append((i, "40", t1, data, length))
+            i += length
+        else:
+            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
+            break
+    return blocks, i
+
+
+def main():
+    for name in ("event-c", "event-d", "event-a", "event-b"):
+        b = load_bundle(name)
+        body = b.body
+        for s in range(15):
+            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
+                start = s; break
+        else:
+            start = 7
+        blocks, end = walk(body, start)
+        types = Counter(bb[1] for bb in blocks)
+        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
+        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
+        print(f"  type counts: {dict(types)}")
+        if blocks and blocks[-1][1] == "??":
+            print(f"  stopped at byte: 0x{blocks[-1][2]:02x}, prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
+        # Sum payload sizes by type
+        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
+        print(f"  payload bytes by type: {payload_sizes}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,68 @@
+"""
+Walker v6: handle 40 02 blocks correctly (length 20).
+
+Block formats:
+- [10 NN]: 4-bit nibble delta data, length = NN/2 + 2
+- [20 NN]: int8 literal data, length = NN + 2
+- [00 NN]: 2-byte marker
+- [30 NN]: trailer/summary block, length = NN*4
+- [40 02]: segment header, fixed length 20
+"""
+import sys
+sys.path.insert(0, ".")
+from analysis.load_bundle import load_bundle
+from collections import Counter
+
+
+def walk(body, start, max_blocks=10000):
+    i = start
+    blocks = []
+    while i + 1 < len(body) and len(blocks) < max_blocks:
+        t0 = body[i]
+        t1 = body[i + 1]
+        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 // 2 + 2
+        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 + 2
+        elif t0 == 0x00 and t1 % 4 == 0:
+            length = 2
+        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
+            length = t1 * 4
+        elif t0 == 0x40 and t1 == 0x02:
+            length = 20
+        else:
+            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
+            break
+        if i + length > len(body):
+            break
+        data = bytes(body[i + 2 : i + length])
+        blocks.append((i, f"{t0:02x}", t1, data, length))
+        i += length
+    return blocks, i
+
+
+def main():
+    for name in ("event-c", "event-d", "event-a", "event-b"):
+        b = load_bundle(name)
+        body = b.body
+        for s in range(15):
+            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
+                start = s; break
+        else:
+            start = 7
+        blocks, end = walk(body, start)
+        types = Counter(bb[1] for bb in blocks)
+        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
+        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
+        print(f"  type counts: {dict(types)}")
+        if blocks and blocks[-1][1] == "??":
+            print(f"  stopped at byte: 0x{blocks[-1][2]:02x} at offset {blocks[-1][0]}")
+            print(f"  prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
+            print(f"  bytes around stop: {body[end-4:end+24].hex(' ')}")
+        # Sum payload bytes per block type
+        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
+        print(f"  payload bytes by type: {payload_sizes}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,65 @@
+"""Run read_idf_file across the corpus and report per-channel accuracy vs sidecars."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def sidecar_path(idfw: Path) -> Path:
+    return idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
+    files.sort()
+    GEO_LSB = 0.0003
+
+    n_ok = n_skip = 0
+    overall = {"Tran": [], "Vert": [], "Long": []}
+
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            n_skip += 1
+            continue
+        sc_path = sidecar_path(f)
+        if not sc_path.exists():
+            n_skip += 1
+            continue
+        try:
+            sc = load_sidecar_samples(sc_path)
+        except Exception:
+            n_skip += 1
+            continue
+
+        per_file = {}
+        for ch in ("Tran", "Vert", "Long"):
+            sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+            dec = res.samples.get(ch, [])
+            n = min(len(sc_counts), len(dec))
+            if n == 0:
+                per_file[ch] = 0.0
+                continue
+            exact = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+            pct = 100.0 * exact / n
+            per_file[ch] = pct
+            overall[ch].append(pct)
+        n_ok += 1
+
+    print(f"Processed {n_ok} files (skipped {n_skip})")
+    print("Per-channel exact-match % (mean / min / max):")
+    for ch, vals in overall.items():
+        if vals:
+            avg = sum(vals) / len(vals)
+            print(f"  {ch}: mean={avg:.2f}%  min={min(vals):.2f}%  max={max(vals):.2f}%  n={len(vals)}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,49 @@
+"""Find where decoded-vs-sidecar diverges for each channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Find the first index where decoded and sidecar counts disagree
+        first_diff = next((i for i in range(len(dec)) if dec[i] != sc_counts[i]), None)
+        if first_diff is None:
+            print(f"{ch}: NO MISMATCHES")
+            continue
+        print(f"{ch}: first diff at idx {first_diff}")
+        # Show 3 samples before and 7 after the first mismatch
+        for i in range(max(0, first_diff - 3), min(len(dec), first_diff + 8)):
+            mark = "  " if dec[i] == sc_counts[i] else "**"
+            print(f"  {mark} idx {i:4d}: sc={sc_counts[i]:6d}  dec={dec[i]:6d}  diff={dec[i]-sc_counts[i]:+d}")
+        # Count total mismatches and track the longest run of exact matches
+        cum_match_run = 0
+        max_match_run = 0
+        match_run_start = 0
+        diff_count = 0
+        for i in range(len(dec)):
+            if dec[i] == sc_counts[i]:
+                cum_match_run += 1
+                max_match_run = max(max_match_run, cum_match_run)
+            else:
+                cum_match_run = 0
+                diff_count += 1
+        print(f"  total mismatches: {diff_count}/{len(dec)}, longest run of matches: {max_match_run}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""End-to-end IDFH ingest verification."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== save_imported_idf (IDFH) ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  filename:      {rec['filename']}")
+        print(f"  filesize:      {rec['filesize']}")
+        print(f"  h5:            {rec['hdf5_filename']}")  # expect None for histogram
+        print(f"  sidecar:       {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print()
+        # Inspect sidecar to confirm intervals were stashed
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        intervals = sc.get("extensions", {}).get("idf_intervals", [])
+        print(f"  sidecar intervals: {len(intervals)}")
+        if intervals:
+            print(f"  first interval:    {intervals[0]}")
+            print(f"  last interval:     {intervals[-1]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Verify the had_report=False path: ingest IDFW with no .txt."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=None,        # ← no .txt!
+        )
+        print("=== IDFW without .txt ingest ===")
+        print(f"  serial:        {rec['serial']}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        nT = len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0
+        nV = len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0
+        nL = len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0
+        nM = len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0
+        print(f"  raw_samples:   Tran={nT} Vert={nV} Long={nL} MicL={nM}")
+        if ev.peak_values:
+            print(f"  peak_values:   tran={ev.peak_values.tran} vert={ev.peak_values.vert} long={ev.peak_values.long}")
+        print(f"  h5 written:    {rec['hdf5_filename']}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,102 @@
+"""End-to-end Thor report PDF rendering.
+
+Ingests an IDFW + .txt via save_imported_idf, runs gather_report_data
+(faking a minimal DB row), and renders the PDF to disk.
+"""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    """Stand-in for SeismoDb.get_event(); the renderer only needs a few cols."""
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        # Verify sidecar has bw_report block
+        sc_path = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        print(f"  bw_report.available: {bw.get('available')}")
+        print(f"  bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+        print(f"  bw_report.mic.pspl_dbl: {bw.get('mic', {}).get('pspl_dbl')}")
+        print(f"  bw_report.histogram.n_intervals: {bw.get('histogram', {}).get('n_intervals')}")
+
+        # Build a DB-row-shaped dict from the Event for gather_report_data
+        import datetime
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM11719",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Waveform",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
+        print()
+        print("=== ReportData ===")
+        print(f"  event_id:           {rd.event_id}")
+        print(f"  serial:             {rd.serial}")
+        print(f"  record_type:        {rd.record_type}")
+        print(f"  event_datetime:     {rd.event_datetime_str}")
+        print(f"  trigger:            {rd.trigger_source}")
+        print(f"  geo_range:          {rd.geo_range_str}")
+        print(f"  sample_rate:        {rd.sample_rate_str}")
+        print(f"  firmware:           {rd.firmware}")
+        print(f"  calibration:        {rd.calibration_date} by {rd.calibration_by}")
+        print(f"  battery:            {rd.battery_volts}")
+        print(f"  PVS:                {rd.peak_vector_sum_ips} in/s at {rd.peak_vector_sum_time_s} sec")
+        print(f"  mic_pspl_dbl:       {rd.mic_pspl_dbl}")
+        print(f"  mic_zc_freq_hz:     {rd.mic_zc_freq_hz}")
+        print(f"  channel_stats:      {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} ToP={cs['time_of_peak_s']} Acc={cs['peak_accel_g']} Disp={cs['peak_disp_in']} Test={cs['sensor_check']}")
+
+        # Render the PDF
+        out_path = REPO / "analysis_idf" / "thor_report.pdf"
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,91 @@
+"""End-to-end Thor IDFH histogram report PDF rendering."""
+from __future__ import annotations
+import sys
+import tempfile
+import json
+import datetime
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+from sfm import report_pdf
+
+
+class FakeDb:
+    def __init__(self, event):
+        self.event = event
+
+    def get_event(self, _id):
+        return self.event
+
+
+def main():
+    # Use the multi-interval IDFH (81 + trigger row)
+    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    txt  = idfh.parent / "TXT" / f"{idfh.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfh.read_bytes(),
+            idfh,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")
+
+        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
+        sc = json.loads(sc_path.read_text())
+        bw = sc.get("bw_report", {})
+        hist = bw.get("histogram", {})
+        print(f"  bw_report.histogram.start:           {hist.get('start')}")
+        print(f"  bw_report.histogram.stop:            {hist.get('stop')}")
+        print(f"  bw_report.histogram.n_intervals:     {hist.get('n_intervals')}")
+        print(f"  bw_report.histogram.interval_size:   {hist.get('interval_size')}")
+        print(f"  bw_report.histogram.interval_size_s: {hist.get('interval_size_s')}")
+        print(f"  bw_report.peaks.tran.ppv_ips:        {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
+
+        ts = ev.timestamp
+        ts_iso = None
+        if ts is not None:
+            try:
+                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
+            except Exception:
+                pass
+        fake_row = {
+            "serial":              "UM13981",
+            "blastware_filename":  rec["filename"],
+            "record_type":         "Histogram",
+            "timestamp":           ts_iso,
+            "sample_rate":         ev.sample_rate,
+            "project":             ev.project_info.project if ev.project_info else None,
+            "client":              ev.project_info.client  if ev.project_info else None,
+            "operator":            ev.project_info.operator if ev.project_info else None,
+            "sensor_location":     ev.project_info.sensor_location if ev.project_info else None,
+            "created_at":          None,
+        }
+        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="hist-1")
+
+        print()
+        print("=== ReportData (histogram) ===")
+        print(f"  is_histogram:           {rd.is_histogram}")
+        print(f"  histogram_start:        {rd.histogram_start_str}")
+        print(f"  histogram_stop:         {rd.histogram_stop_str}")
+        print(f"  histogram_n_intervals:  {rd.histogram_n_intervals}")
+        print(f"  histogram_interval_size:{rd.histogram_interval_size}")
+        print(f"  histogram_interval_times[:3]: {rd.histogram_interval_times[:3]}")
+        print(f"  histogram_interval_times[-2:]: {rd.histogram_interval_times[-2:]}")
+        print(f"  channel_stats: {len(rd.channel_stats)} rows")
+        for cs in rd.channel_stats:
+            print(f"    {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} peak_date={cs['peak_date']} peak_time={cs['peak_time']}")
+
+        pdf_bytes = report_pdf.render_event_report_pdf(rd)
+        out_path = REPO / "analysis_idf" / "thor_report_idfh.pdf"
+        out_path.write_bytes(pdf_bytes)
+        print()
+        print(f"  PDF written: {out_path} ({len(pdf_bytes)} bytes)")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,52 @@
+"""End-to-end ingest test: feed an IDFW + .txt to save_imported_idf in a tmp store."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+import tempfile
+import shutil
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from sfm.waveform_store import WaveformStore
+
+
+def main():
+    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    txt  = idfw.parent / "TXT" / f"{idfw.name}.txt"
+
+    with tempfile.TemporaryDirectory() as td:
+        store = WaveformStore(Path(td))
+        ev, rec = store.save_imported_idf(
+            idfw.read_bytes(),
+            idfw,
+            serial_hint=None,
+            idf_report_text=txt.read_text(errors="replace"),
+        )
+        print("=== Save result ===")
+        print(f"  serial:    {rec['serial']}")
+        print(f"  filename:  {rec['filename']}")
+        print(f"  filesize:  {rec['filesize']}")
+        print(f"  h5:        {rec['hdf5_filename']}")
+        print(f"  sidecar:   {rec['sidecar_filename']}")
+        print()
+        print("=== Event ===")
+        print(f"  serial:        {ev.serial if hasattr(ev,'serial') else '(n/a)'}")
+        print(f"  timestamp:     {ev.timestamp}")
+        print(f"  sample_rate:   {ev.sample_rate}")
+        print(f"  record_type:   {ev.record_type}")
+        print(f"  rectime_sec:   {ev.rectime_seconds}")
+        print(f"  raw_samples:   Tran={len(ev.raw_samples.get('Tran', [])) if ev.raw_samples else 0}, Vert={len(ev.raw_samples.get('Vert', [])) if ev.raw_samples else 0}, Long={len(ev.raw_samples.get('Long', [])) if ev.raw_samples else 0}, MicL={len(ev.raw_samples.get('MicL', [])) if ev.raw_samples else 0}")
+        if ev.peak_values:
+            print(f"  peaks (txt):   Tran={ev.peak_values.tran} Vert={ev.peak_values.vert} Long={ev.peak_values.long}")
+        print()
+
+        # Verify the h5 file actually got written
+        h5path = Path(td) / "UM11719" / f"{idfw.name}.h5"
+        print(f"  h5 exists:     {h5path.exists()}  size={h5path.stat().st_size if h5path.exists() else 0}")
+        sidecar = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
+        print(f"  sidecar exists:{sidecar.exists()}  size={sidecar.stat().st_size if sidecar.exists() else 0}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,137 @@
+"""Decode IDFH histogram intervals + verify against sidecar."""
+from __future__ import annotations
+import sys
+import struct
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+SEGMENT_MAGIC = b"\x02\xda\x0a\x00\x00\x00"
+SEGMENT_SIZE = 732   # = 10-byte header + 10 × 72-byte intervals + 2-byte tail
+INTERVAL_SIZE = 72
+CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+def decode_interval(buf72: bytes) -> dict:
+    """Decode one 72-byte interval into per-channel min/max/halfp."""
+    out = {}
+    for i, ch in enumerate(CHANNELS):
+        block = buf72[i*16 : (i+1)*16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        sb = struct.unpack_from(">h", block, 4)[0]
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        f10 = struct.unpack_from(">H", block, 10)[0]
+        f14 = struct.unpack_from(">H", block, 14)[0]
+        peak_count = max(abs(mn), abs(mx))
+        out[ch] = {
+            "min":     mn,
+            "max":     mx,
+            "field4":  sb,
+            "halfp":   halfp,
+            "field10": f10,
+            "field14": f14,
+            "peak":    peak_count,
+            "freq_hz": (512.0 / halfp) if halfp > 5 else None,
+        }
+    out["_tail"] = buf72[64:].hex(" ")
+    return out
+
+
+def walk_idfh(buf: bytes) -> list:
+    """Walk all interval records in an IDFH file."""
+    intervals = []
+    # Multi-segment file: every 02 da 0a 00 00 00 marker introduces a segment.
+    # Single-interval file: just one body header at 0xf96 of form ?? ?? 0a 00 00 00.
+    # Find them all.
+    i = 0
+    while True:
+        j = buf.find(b"\x0a\x00\x00\x00", i)
+        if j < 0:
+            break
+        # Validate the candidate: a real header is preceded by a 2-byte
+        # big-endian length and followed by 00 NN 05 3f.  Anything else is a
+        # coincidental 0a 00 00 00 inside the data and is skipped.
+        if j < 2:
+            i = j + 1
+            continue
+        # Body header form: [length_be_2][0a 00 00 00][00 NN][05 3f]
+        if j + 10 > len(buf):
+            break
+        length = int.from_bytes(buf[j-2:j], "big")
+        # Verify the segment-marker shape: [length_be][0a 00 00 00][00 NN][05 3f]
+        if buf[j+4] != 0x00:
+            i = j + 1
+            continue
+        if buf[j+6:j+8] != b"\x05\x3f":
+            i = j + 1
+            continue
+        # Header layout (10 bytes): [length_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+        # Followed by N interval records of 72 bytes each, then 2 tail bytes.
+        # length value = (N × 72) + 10, i.e. the 10-byte header (length field included) plus the interval data; the 2 tail bytes are not counted.
+        header_start = j - 2
+        n_intervals = (length - 10) // INTERVAL_SIZE
+        interval_start = header_start + 10
+        for k in range(n_intervals):
+            off = interval_start + k * INTERVAL_SIZE
+            if off + INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off:off + INTERVAL_SIZE]
+            intervals.append({"offset": off, **decode_interval(chunk)})
+        i = max(j + 1, header_start + length + 2)  # always advance, even if length is 0
+    return intervals
+
+
+def main():
+    # Test against multi-segment IDFH
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    sc_path = target.parent / "TXT" / f"{target.name}.txt"
+    buf = target.read_bytes()
+    intervals = walk_idfh(buf)
+    print(f"=== {target.name} ===")
+    print(f"  file size: {len(buf)}")
+    print(f"  decoded intervals: {len(intervals)}")
+    # Collect sidecar interval rows (lines that start with a 2022/2023 date)
+    sc_rows = []
+    for line in sc_path.read_text(errors="replace").splitlines():
+        if line.startswith("2022-") or line.startswith("2023-"):
+            sc_rows.append(line)
+    print(f"  sidecar rows: {len(sc_rows)}")
+
+    print()
+    for k in [0, 1, 78, 79, 80]:
+        if k >= len(intervals):
+            continue
+        iv = intervals[k]
+        print(f"--- interval {k} @0x{iv['offset']:04x} ---")
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        # sidecar row
+        if k < len(sc_rows):
+            print(f"  SC: {sc_rows[k]}")
+
+    # Test single-interval IDFH
+    print()
+    target2 = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    sc2 = target2.parent / "TXT" / f"{target2.name}.txt"
+    buf2 = target2.read_bytes()
+    intervals2 = walk_idfh(buf2)
+    print(f"=== {target2.name} ===")
+    print(f"  file size: {len(buf2)}, decoded intervals: {len(intervals2)}")
+    if intervals2:
+        iv = intervals2[0]
+        for ch in CHANNELS:
+            d = iv[ch]
+            peak_ips = d["peak"] / 32768 * 10.0
+            print(f"  {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s)  halfp={d['halfp']:5d}  freq={d['freq_hz']}")
+        sc_rows2 = [l for l in sc2.read_text(errors='replace').splitlines() if l.startswith("2023-")]
+        if sc_rows2:
+            print(f"  SC: {sc_rows2[0]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,41 @@
+"""Find the IDFH interval period by scoring consistently-zero byte columns."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+from collections import Counter
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
+    buf = target.read_bytes()
+    body_start = 0xF96
+    body_end   = 0x270C
+    body = buf[body_start:body_end]
+    print(f"body size: {len(body)} bytes (file {len(buf)} bytes)")
+
+    # For each candidate interval size, count how many bytes at fixed offsets within
+    # each interval are zero (consistent column-zero pattern indicates correct size).
+    print()
+    print("=== zero-column score by interval size (higher = more likely) ===")
+    best = []
+    for sz in range(16, 100):
+        n = len(body) // sz
+        if n < 30:
+            continue
+        # For each column position within an interval, count how many of n intervals have zero
+        score = 0
+        for col in range(sz):
+            zeros = sum(1 for i in range(n) if body[i*sz + col] == 0)
+            if zeros >= n * 0.9:
+                score += 1
+        best.append((score, sz, n))
+    best.sort(reverse=True)
+    for score, sz, n in best[:10]:
+        print(f"  size={sz:3d}  n_intervals={n}  consistently-zero-cols={score}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Per-file accuracy + sample-count details."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+from analysis_idf.recon import load_sidecar_samples
+
+
+def main():
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = sorted([f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")])
+    GEO_LSB = 0.0003
+    # Limit to first 20 successful files for detail.
+    shown = 0
+    for f in files:
+        try:
+            res = read_idf_file(f)
+        except Exception:
+            continue
+        sc_path = f.parent / "TXT" / f"{f.name}.txt"
+        if not sc_path.exists():
+            continue
+        sc = load_sidecar_samples(sc_path)
+        sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+        dec = res.samples.get("Tran", [])
+        n = min(len(sc_tran), len(dec))
+        exact = sum(1 for i in range(n) if sc_tran[i] == dec[i]) if n else 0
+        pct = 100.0 * exact / n if n else 0.0
+        print(f"{f.name:40s}  size={f.stat().st_size:6d}  sc_n={len(sc_tran):4d}  dec_n={len(dec):4d}  exact={pct:.1f}%")
+        shown += 1
+        if shown >= 20:
+            break
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,64 @@
+"""Look at what's at the divergence boundary."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import walk_body, find_data_start, parse_segment_header
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    body = buf[0x0f1f:]
+    start = find_data_start(body)
+    print(f"data_start: {start}  (= file offset 0x{0x0f1f + start:04x})")
+
+    blocks = walk_body(body, start)
+    print(f"{len(blocks)} blocks total")
+    print()
+
+    # First 30 blocks
+    print("=== first 30 blocks ===")
+    for i, b in enumerate(blocks[:30]):
+        body_off = 0x0f1f + b.offset
+        if b.tag_hi == 0x40:
+            hdr = parse_segment_header(b)
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  (segment header)  counter={hdr['counter'] if hdr else '?'}  field2={hdr['field2'].hex() if hdr else '?'}  anchor={hdr['anchor_bytes'].hex() if hdr else '?'}  tail={hdr['tail'].hex() if hdr else '?'}")
+        else:
+            print(f"  [{i:3d}] @0x{body_off:04x}  {b.kind}  len={b.length}  data={b.data[:16].hex()}")
+    print()
+
+    # Cumulative sample counts per block to find which block contains sample 254
+    print("=== cumulative samples through blocks ===")
+    cur_ch = "Tran"
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    seg_count = 0
+    samples_in_curseg = 2  # preamble Tran[0], Tran[1]
+    for i, b in enumerate(blocks[:30]):
+        if b.tag_hi == 0x40:
+            seg_count += 1
+            prev_ch = cur_ch
+            cur_ch = rotation[(seg_count - 1) % 4]
+            print(f"  [{i:3d}] 40 02 -> end of {prev_ch} segment, start {cur_ch} (segment {seg_count})")
+            samples_in_curseg = 2  # anchors
+        elif (b.tag_hi & 0xF0) == 0x10:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} nibble: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif (b.tag_hi & 0xF0) == 0x20:
+            nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
+            samples_in_curseg += nn
+            print(f"  [{i:3d}] {b.kind} int8: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x00:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} RLE: +{b.tag_lo}, ch={cur_ch}, ch_total~{samples_in_curseg}")
+        elif b.tag_hi == 0x30:
+            samples_in_curseg += b.tag_lo
+            print(f"  [{i:3d}] {b.kind} packed12: +{b.tag_lo} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,89 @@
+"""Reconnaissance helpers for cracking the Thor IDFW binary."""
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+TARGET = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+TXT = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/TXT/UM11719_20231219162723.IDFW.txt"
+
+
+def hex_at(buf: bytes, off: int, n: int = 32) -> str:
+    chunk = buf[off : off + n]
+    hexs = " ".join(f"{b:02x}" for b in chunk)
+    asc = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
+    return f"{off:04x}: {hexs}  {asc}"
+
+
+def find_all(buf: bytes, needle: bytes) -> list[int]:
+    out: list[int] = []
+    i = 0
+    while True:
+        j = buf.find(needle, i)
+        if j < 0:
+            break
+        out.append(j)
+        i = j + 1
+    return out
+
+
+def load_sidecar_samples(path: Path) -> dict[str, list[float]]:
+    """Parse the txt sample table — Tran/Vert/Long/MicL."""
+    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    in_block = False
+    for line in path.read_text(errors="replace").splitlines():
+        if not in_block:
+            if line.strip() == "Waveform Data Channels":
+                in_block = True
+            continue
+        if line.startswith("Waveform Data USB Channels"):
+            break
+        parts = line.split("\t")
+        # First row is the header "\tTran\tVert\tLong\tMicL"
+        if len(parts) >= 5 and parts[1] == "Tran":
+            continue
+        if len(parts) < 5:
+            continue
+        try:
+            out["Tran"].append(float(parts[1]))
+            out["Vert"].append(float(parts[2]))
+            out["Long"].append(float(parts[3]))
+            out["MicL"].append(float(parts[4]))
+        except ValueError:
+            continue
+    return out
+
+
+def main():
+    buf = TARGET.read_bytes()
+    samples = load_sidecar_samples(TXT)
+    print(f"file size: {len(buf)} bytes")
+    print(f"sample rows: Tran={len(samples['Tran'])} Vert={len(samples['Vert'])} Long={len(samples['Long'])} MicL={len(samples['MicL'])}")
+    print(f"first 6 Tran samples: {samples['Tran'][:6]}")
+    print(f"first 6 Vert samples: {samples['Vert'][:6]}")
+    print(f"first 6 Long samples: {samples['Long'][:6]}")
+    print(f"first 6 MicL samples: {samples['MicL'][:6]}")
+
+    print()
+    print("=== BW magic '00 02 00' positions ===")
+    hits = find_all(buf, b"\x00\x02\x00")
+    print(f"{len(hits)} hits")
+    for h in hits[:20]:
+        print(hex_at(buf, h, 24))
+
+    print()
+    print("=== '40 02' segment-header positions ===")
+    hits = find_all(buf, b"\x40\x02")
+    print(f"{len(hits)} hits")
+    for h in hits:
+        ctx_pre = buf[max(0, h - 4): h].hex()
+        ctx_post = buf[h: h + 20].hex()
+        # Show the 4 preceding bytes to help tell real headers from coincidental matches
+        print(f"  0x{h:04x}  pre={ctx_pre}  post={ctx_post}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,40 @@
+"""Find each segment boundary in the channel and check if errors reset there."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    decoded = decode_waveform_v2(buf[0x0f1f:])
+    GEO_LSB = 0.0003
+
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        # Record every index where the decoder flips between matching the sidecar
+        # exactly (RESYNC) and disagreeing with it (DIVERGE).
+        n = min(len(sc_counts), len(dec))
+        events = []
+        prev_match = True
+        for i in range(n):
+            match = sc_counts[i] == dec[i]
+            if match != prev_match:
+                kind = "RESYNC" if match else "DIVERGE"
+                events.append((i, kind, sc_counts[i], dec[i]))
+                prev_match = match
+        print(f"{ch}: {len(events)} transitions")
+        for i, kind, sc_v, dec_v in events[:20]:
+            print(f"  idx {i:4d}  {kind:8s}  sc={sc_v:6d}  dec={dec_v:6d}  diff={dec_v-sc_v:+d}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,46 @@
+"""Smoke-test read_idf_file on IDFH across the corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature:   {result.signature}")
+    print(f"  serial:      {ev.serial}")
+    print(f"  timestamp:   {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  kind:        {ev.kind}")
+    print(f"  intervals:   {len(result.intervals or [])}")
+    print(f"  peaks:       T={ev.peaks.transverse_ips:.4f} V={ev.peaks.vertical_ips:.4f} L={ev.peaks.longitudinal_ips:.4f}")
+    print()
+
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = list(root.rglob("*.IDFH"))
+    ok = fail = nyi = 0
+    total_intervals = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+            total_intervals += len(r.intervals or [])
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 3:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print(f"Corpus: {len(files)} IDFH files | ok={ok} fail={fail} nyi={nyi}")
+    print(f"Total intervals decoded: {total_intervals}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,48 @@
+"""Smoke-test read_idf_file across the sample corpus."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_file import read_idf_file, geo_count_to_ips, mic_count_to_psi
+
+
+def main():
+    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
+    result = read_idf_file(target)
+    ev = result.event
+    print(f"=== {target.name} ===")
+    print(f"  signature: {result.signature}")
+    print(f"  serial:    {ev.serial}")
+    print(f"  timestamp: {ev.timestamp}")
+    print(f"  sample_rate: {ev.sample_rate}")
+    print(f"  record_time: {ev.record_time_sec}")
+    print(f"  calibration: {result.binary_metadata.calibration_date}")
+    print(f"  Tran samples: {len(result.samples['Tran'])}, peak_ips={ev.peaks.transverse_ips:.4f}")
+    print(f"  Vert samples: {len(result.samples['Vert'])}, peak_ips={ev.peaks.vertical_ips:.4f}")
+    print(f"  Long samples: {len(result.samples['Long'])}, peak_ips={ev.peaks.longitudinal_ips:.4f}")
+    print(f"  MicL samples: {len(result.samples['MicL'])}")
+    print()
+
+    # Corpus sweep
+    root = REPO / "tests/fixtures/THORDATA_example"
+    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
+    ok = fail = nyi = 0
+    for f in files:
+        try:
+            r = read_idf_file(f)
+            ok += 1
+        except NotImplementedError:
+            nyi += 1
+        except Exception as exc:
+            fail += 1
+            if fail <= 5:
+                print(f"  FAIL: {f.name}: {type(exc).__name__}: {exc}")
+    print()
+    print(f"Corpus: {len(files)} IDFW files | ok={ok} fail={fail} not-implemented={nyi}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,47 @@
+"""Verify build_bw_report_from_idf against a known sidecar."""
+from __future__ import annotations
+import json
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from micromate.idf_ascii_report import parse_idf_report
+from micromate.idf_to_bw_report import build_bw_report_from_idf
+from micromate.idf_file import read_idf_file
+
+
+def show(prefix: str, d: dict, indent: int = 0):
+    for k, v in d.items():
+        if isinstance(v, dict):
+            print(f"{'  '*indent}{prefix}{k}:")
+            show("", v, indent + 1)
+        else:
+            print(f"{'  '*indent}{prefix}{k}: {v!r}")
+
+
+def main():
+    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
+    idfw = base / "UM11719_20231219162723.IDFW"
+    txt  = base / "TXT" / f"{idfw.name}.txt"
+
+    report_dict = parse_idf_report(txt.read_text(errors="replace"))
+    res = read_idf_file(idfw)
+    bw = build_bw_report_from_idf(report_dict, binary_md=res.binary_metadata)
+
+    print("=== IDFW → bw_report ===")
+    show("", bw)
+
+    print()
+    print("=== IDFH (single trigger row) ===")
+    idfh = base / "UM11719_20231219162648.IDFH"
+    txt_h = base / "TXT" / f"{idfh.name}.txt"
+    rh = parse_idf_report(txt_h.read_text(errors="replace"))
+    res_h = read_idf_file(idfh)
+    bw_h = build_bw_report_from_idf(rh, binary_md=res_h.binary_metadata, intervals=res_h.intervals)
+    show("", bw_h)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,73 @@
+"""Trace Tran sample-by-sample to find exactly where the codec drifts."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def s4(n: int) -> int:
+    return n if n < 8 else n - 16
+
+
+def i8(b: int) -> int:
+    return b if b < 128 else b - 256
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    GEO_LSB = 0.0003
+    sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
+
+    body = buf[0x0f1f:]
+    # Tran[0], Tran[1] from preamble
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+    print(f"preamble Tran[0]={t0}  Tran[1]={t1}  (sidecar: {sc_tran[0]}, {sc_tran[1]})")
+
+    # Block 0: 10 f8 at body[7:9]
+    print(f"block 0: tag {body[7]:02x} {body[8]:02x}")
+    print(f"  block 0 first 10 data bytes: {body[9:19].hex()}")
+
+    # Walk block 0 manually, comparing each sample
+    cur = t1
+    samples = [t0, t1]
+    block_off = 7
+    nn = body[8]
+    print(f"  NN = {nn}")
+    data = body[9 : 9 + nn // 2]
+    for byi, byte in enumerate(data):
+        for nib_idx, nib in enumerate(((byte >> 4) & 0xF, byte & 0xF)):
+            cur += s4(nib)
+            samples.append(cur)
+            idx = len(samples) - 1
+            if 0 <= idx < len(sc_tran):
+                sc_v = sc_tran[idx]
+                match = "✓" if sc_v == cur else "✗"
+                if idx < 12 or 240 <= idx <= 260:
+                    print(f"    idx {idx:3d}: nibble byte={byte:02x} nib={nib:x} delta={s4(nib):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+    print(f"end of block 0: cur={cur}, len(samples)={len(samples)}, decoder expected 250 here")
+    # Block 1 (tag 20 28) starts at body offset 9 + NN // 2 = 9 + 124 = 133
+    block1_off = 9 + nn // 2
+    print(f"block 1: tag {body[block1_off]:02x} {body[block1_off+1]:02x} (expecting 20 28)")
+    nn1 = body[block1_off + 1]
+    print(f"  block 1 NN = {nn1}")
+    data1 = body[block1_off + 2 : block1_off + 2 + nn1]
+    for byi, byte in enumerate(data1):
+        cur += i8(byte)
+        samples.append(cur)
+        idx = len(samples) - 1
+        if idx < len(sc_tran):
+            sc_v = sc_tran[idx]
+            match = "✓" if sc_v == cur else "✗"
+            if 248 <= idx <= 295:
+                print(f"    idx {idx:3d}: int8 byte={byte:02x} delta={i8(byte):+d}  cur={cur:+d}  sc={sc_v:+d}  {match}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,42 @@
+"""Feed candidate body offsets to the BW codec and compare with sidecar."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    # Sidecar samples in 0.0003 counts (Thor geo LSB).
+    sc_tran = [int(round(v / 0.0003)) for v in sc["Tran"][:30]]
+    sc_vert = [int(round(v / 0.0003)) for v in sc["Vert"][:30]]
+    sc_long = [int(round(v / 0.0003)) for v in sc["Long"][:30]]
+    sc_micl = [int(round(v / 1e-6)) for v in sc["MicL"][:30]]  # provisional mic scale (1e-6 per count); unverified
+    print(f"sidecar Tran (counts): {sc_tran}")
+    print(f"sidecar Vert (counts): {sc_vert}")
+    print(f"sidecar Long (counts): {sc_long}")
+    print(f"sidecar MicL (×1e-6):  {sc_micl}")
+    print()
+
+    # Try candidate body start offsets.
+    for off in (0x0f1f, 0x1057, 0x11f1, 0x1333, 0x1bde, 0x0d30):
+        print(f"=== body @ 0x{off:04x} ===")
+        body = buf[off:]
+        decoded = decode_waveform_v2(body)
+        if not decoded:
+            print("  decode_waveform_v2 returned no data")
+            continue
+        for ch in ("Tran", "Vert", "Long", "MicL"):
+            arr = decoded.get(ch, [])
+            print(f"  {ch}[{len(arr)}]: {arr[:20]}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,51 @@
+"""Verify decode_waveform_v2 against sidecar across all 2304 samples per channel."""
+from __future__ import annotations
+import sys
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(REPO))
+
+from minimateplus.waveform_codec import decode_waveform_v2
+from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
+
+
+def main():
+    buf = TARGET.read_bytes()
+    sc = load_sidecar_samples(TXT)
+    body = buf[0x0f1f:]
+    decoded = decode_waveform_v2(body)
+
+    print(f"Sidecar lengths: Tran={len(sc['Tran'])} Vert={len(sc['Vert'])} Long={len(sc['Long'])} MicL={len(sc['MicL'])}")
+    print(f"Decoded lengths: Tran={len(decoded['Tran'])} Vert={len(decoded['Vert'])} Long={len(decoded['Long'])} MicL={len(decoded['MicL'])}")
+    print()
+
+    GEO_LSB = 0.0003  # in/s per count
+    for ch in ("Tran", "Vert", "Long"):
+        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
+        dec = decoded[ch]
+        n = min(len(sc_counts), len(dec))
+        matches = sum(1 for i in range(n) if sc_counts[i] == dec[i])
+        first_mismatch = next((i for i in range(n) if sc_counts[i] != dec[i]), None)
+        print(f"{ch}: compared {n}, exact matches {matches} ({100*matches/max(n, 1):.2f}%)")
+        if first_mismatch is not None:
+            i = first_mismatch
+            print(f"  first mismatch at idx {i}: sidecar={sc_counts[i]} ({sc[ch][i]}), decoded={dec[i]}")
+            print(f"  context sidecar[{max(0,i-2)}..{i+4}]: {sc_counts[max(0,i-2):i+5]}")
+            print(f"  context decoded[{max(0,i-2)}..{i+4}]: {dec[max(0,i-2):i+5]}")
+
+    # MicL: find the multiplicative factor that fits
+    print()
+    print("=== MicL scale analysis ===")
+    sc_micl = sc["MicL"]
+    dec_micl = decoded["MicL"]
+    # Skip zero values when computing ratio
+    ratios = [sc_micl[i] / dec_micl[i] for i in range(min(50, len(sc_micl), len(dec_micl))) if dec_micl[i] != 0]
+    if ratios:
+        avg = sum(ratios) / len(ratios)
+        print(f"  avg ratio sidecar/decoded over nonzero samples in first 50: {avg:.4e} (n={len(ratios)})")
+        print(f"  ratios sample: {[f'{r:.4e}' for r in ratios[:6]]}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,627 @@
+#!/usr/bin/env python3
+"""
+ach_bridge.py — Transparent TCP bridge / splitter for Instantel MiniMate Plus
+                call-home (ACH) traffic.
+
+Modes
+-----
+  standalone   Accept connection, capture frames, do NOT forward anywhere.
+               Good for initial discovery with a test unit.
+
+  bridge       Forward to one upstream server while capturing.
+               Use this for the initial discovery phase with your test server.
+
+  splitter     Forward to the PRIMARY upstream (production ACH server) AND
+               mirror a copy to a SECONDARY server simultaneously.
+               The device never knows — it talks to the primary the whole time.
+               If the mirror fails, the primary connection is unaffected.
+
+               Think of it like a headphone splitter: one input, two outputs.
+               Primary → authoritative responses back to device.
+               Mirror  → gets all device bytes, its responses are discarded.
+
+Usage
+-----
+  # Standalone capture (test/discovery — no forwarding)
+  python bridges/ach_bridge.py --standalone [--port 12345]
+
+  # Bridge mode (forward to one server, e.g. your test server)
+  python bridges/ach_bridge.py --upstream HOST:PORT [--port 12345]
+
+  # Splitter mode (production: forward to prod + mirror to your server)
+  python bridges/ach_bridge.py --upstream PROD_HOST:PORT --mirror MY_HOST:PORT [--port 12345]
+
+Setup for discovery (test server, don't touch prod)
+----------------------------------------------------
+1. Stand up your test ACH server, note its IP and port (e.g. 192.168.1.50:12345).
+2. Take ONE test unit.  In ACEmanager → Call Home, point it at:
+     <this machine's LAN IP> : <--port>
+3. Run:  python bridges/ach_bridge.py --upstream TEST_SERVER:12345 --port 12345
+4. Trigger the unit.  Raw frames are saved to bridges/captures/ach_<ts>/.
+5. Revert the unit's ACEmanager setting when done.
+
+Setup for production splitter (when you're ready)
+-------------------------------------------------
+This does NOT touch the units.  Instead you re-route traffic at the network
+layer so that call-home packets arrive at a machine running this script first.
+Typical approach: update the DNS entry / host record your prod ACH server is
+registered under to point at this machine.  The units keep their existing
+ACEmanager settings.
+
+  python bridges/ach_bridge.py \\
+      --upstream PROD_ACH_HOST:12345 \\
+      --mirror   MY_NEW_SERVER:12345 \\
+      --port     12345
+
+Output (each connection gets its own timestamped sub-directory)
+---------------------------------------------------------------
+  bridges/captures/ach_<ts>/
+    raw_client_<ts>.bin   — raw bytes from the device (S3 side)
+    raw_server_<ts>.bin   — raw bytes from the primary upstream (BW side)
+    raw_mirror_<ts>.bin   — raw bytes from the mirror upstream (splitter mode only)
+    session_<ts>.log      — human-readable frame parse log
+    session_<ts>.jsonl    — JSON-lines frame log
+
+raw_client / raw_server are byte-for-byte compatible with parse_capture.py.
+"""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+import datetime
+import json
+import logging
+import os
+import sys
+from pathlib import Path
+from typing import List, Optional
+
+# Add project root to path
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from minimateplus.framing import S3FrameParser, S3Frame
+
+log = logging.getLogger("ach_bridge")
+
+
+# ── Frame label helpers ──────────────────────────────────────────────────────
+
+_KNOWN_RSP_SUBS = {
+    0xA4: "POLL_RSP",
+    0xA5: "BULK_WAVEFORM_RSP",
+    0xE0: "ADVANCE_EVENT_RSP",
+    0xE1: "EVENT_INDEX_FIRST_RSP",
+    0xE3: "MONITOR_STATUS_RSP",
+    0xEA: "SERIAL_NUM_RSP",
+    0xF3: "WAVEFORM_RECORD_RSP",
+    0xF5: "WAVEFORM_HEADER_RSP",
+    0xF7: "EVENT_INDEX_RSP",
+    0xF9: "UNK_06_RSP",
+    0xFE: "DEVICE_INFO_RSP",
+    # Write acks
+    0x97: "EVT_IDX_WRITE_ACK",
+    0x8C: "CONFIRM_B_ACK",
+    0x8E: "COMPLIANCE_WRITE_ACK",
+    0x8D: "CONFIRM_A_ACK",
+    0x7D: "TRIGGER_WRITE_ACK",
+    0x7C: "TRIGGER_CONFIRM_ACK",
+    0x96: "WAVEFORM_WRITE_ACK",
+    0x8B: "CONFIRM_C_ACK",
+    0x69: "START_MONITOR_ACK",
+    0x68: "STOP_MONITOR_ACK",
+}
+
+_KNOWN_REQ_SUBS = {
+    0x5B: "POLL",
+    0x5A: "BULK_WAVEFORM",
+    0x1F: "ADVANCE_EVENT",
+    0x1E: "EVENT_INDEX_FIRST",
+    0x1C: "MONITOR_STATUS",
+    0x15: "SERIAL_NUM",
+    0x0C: "WAVEFORM_RECORD",
+    0x0A: "WAVEFORM_HEADER",
+    0x08: "EVENT_INDEX",
+    0x06: "UNK_06",
+    0x01: "DEVICE_INFO",
+    # Write commands
+    0x68: "EVT_IDX_WRITE",
+    0x73: "CONFIRM_B",
+    0x71: "COMPLIANCE_WRITE",
+    0x72: "CONFIRM_A",
+    0x82: "TRIGGER_WRITE",
+    0x83: "TRIGGER_CONFIRM",
+    0x69: "WAVEFORM_WRITE",
+    0x74: "CONFIRM_C",
+    0x96: "START_MONITOR",
+    0x97: "STOP_MONITOR",
+}
+
+
+def _label_s3_frame(frame: S3Frame) -> str:
+    name = _KNOWN_RSP_SUBS.get(frame.sub, f"UNK_0x{frame.sub:02X}")
+    chk = "✓" if frame.checksum_valid else "✗CHK"
+    return (
+        f"S3→  SUB=0x{frame.sub:02X} ({name}) "
+        f"page=0x{frame.page_key:04X} data={len(frame.data)}B {chk}"
+    )
+
+
+def _label_bw_frame(data: bytes, prefix: str = "  →BW") -> str:
+    """Best-effort label for a raw BW request frame (wire bytes)."""
+    # Wire layout: 41 02 10 10 00 sub ...
+    if len(data) < 6:
+        return f"{prefix}  (short {len(data)}B)"
+    sub = data[5]
+    name = _KNOWN_REQ_SUBS.get(sub, f"UNK_0x{sub:02X}")
+    return f"{prefix}  SUB=0x{sub:02X} ({name}) {len(data)}B"
+
+
+# ── Per-session capture writer ─────────────────────────────────────────────────
+
+class CaptureSession:
+    """Writes raw bytes + parsed log for one TCP connection."""
+
+    def __init__(self, capture_dir: Path, peer: str, *, has_mirror: bool = False):
+        ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S_%f")  # µs suffix avoids same-second collisions
+        self.dir = capture_dir / f"ach_{ts}"
+        self.dir.mkdir(parents=True, exist_ok=True)
+        self.peer = peer
+
+        self._raw_client = open(self.dir / f"raw_client_{ts}.bin", "wb")
+        self._raw_server = open(self.dir / f"raw_server_{ts}.bin", "wb")
+        self._raw_mirror = (
+            open(self.dir / f"raw_mirror_{ts}.bin", "wb") if has_mirror else None
+        )
+        self._log_fh   = open(self.dir / f"session_{ts}.log",   "w")
+        self._jsonl_fh = open(self.dir / f"session_{ts}.jsonl", "w")
+
+        self._s3_parser = S3FrameParser()
+        self._frame_count       = 0
+        self._byte_count_client = 0
+        self._byte_count_server = 0
+        self._byte_count_mirror = 0
+
+        self._log(
+            f"# ACH capture — peer={peer}  "
+            f"mirror={'yes' if has_mirror else 'no'}  "
+            f"started={datetime.datetime.now().isoformat()}"
+        )
+        self._log(f"# Output dir: {self.dir}")
+        log.info("Capture session opened: %s (peer=%s)", self.dir, peer)
+
+    # ── public API ────────────────────────────────────────────────────────────
+
+    def feed_client(self, data: bytes) -> None:
+        """Bytes FROM the device (S3 response frames)."""
+        self._raw_client.write(data)
+        self._raw_client.flush()
+        self._byte_count_client += len(data)
+
+        for byte in data:
+            frame = self._s3_parser.feed(bytes([byte]))
+            if frame:
+                frames = frame if isinstance(frame, list) else [frame]
+                for f in frames:
+                    self._frame_count += 1
+                    label = _label_s3_frame(f)
+                    self._log(f"[{self._frame_count:04d}] {label}")
+                    self._log(
+                        f"       hex: {f.data[:64].hex()}"
+                        + (" ..." if len(f.data) > 64 else "")
+                    )
+                    self._emit_json("s3", f)
+
+    def feed_server(self, data: bytes) -> None:
+        """Bytes FROM the primary upstream server (BW request frames)."""
+        self._raw_server.write(data)
+        self._raw_server.flush()
+        self._byte_count_server += len(data)
+        label = _label_bw_frame(data, prefix="  →BW[primary]")
+        self._log(f"       {label}")
+
+    def feed_mirror(self, data: bytes) -> None:
+        """Bytes FROM the mirror server (logged, not forwarded to device)."""
+        if self._raw_mirror:
+            self._raw_mirror.write(data)
+            self._raw_mirror.flush()
+        self._byte_count_mirror += len(data)
+        label = _label_bw_frame(data, prefix="  →BW[mirror] ")
+        self._log(f"       {label}  [MIRROR — not sent to device]")
+
+    def close(self, reason: str = "connection closed") -> None:
+        self._log(f"# Session ended: {reason}")
+        self._log(
+            f"# Totals — client={self._byte_count_client}B  "
+            f"server={self._byte_count_server}B  "
+            f"mirror={self._byte_count_mirror}B  "
+            f"s3_frames={self._frame_count}"
+        )
+        handles = [self._raw_client, self._raw_server, self._log_fh, self._jsonl_fh]
+        if self._raw_mirror:
+            handles.append(self._raw_mirror)
+        for fh in handles:
+            try:
+                fh.close()
+            except Exception:
+                pass
+        log.info(
+            "Session closed (%s): %dB client, %dB server, %dB mirror, %d S3 frames → %s",
+            reason,
+            self._byte_count_client, self._byte_count_server,
+            self._byte_count_mirror, self._frame_count,
+            self.dir,
+        )
+
+    # ── internals ─────────────────────────────────────────────────────────────
+
+    def _log(self, msg: str) -> None:
+        print(msg, file=self._log_fh, flush=True)
+        print(msg)
+
+    def _emit_json(self, direction: str, frame: S3Frame) -> None:
+        record = {
+            "dir":            direction,
+            "sub":            frame.sub,
+            "page_key":       frame.page_key,
+            "data_len":       len(frame.data),
+            "data_hex":       frame.data.hex(),
+            "checksum_valid": frame.checksum_valid,
+        }
+        print(json.dumps(record), file=self._jsonl_fh, flush=True)
+
+
+# ── Bridge / splitter connection handler ──────────────────────────────────────
+
+class BridgeHandler:
+    """
+    Handles inbound device connections.
+
+    Modes (determined by which upstreams are configured):
+      standalone  — no upstream_host / no mirror_host
+      bridge      — upstream_host set, no mirror_host
+      splitter    — upstream_host AND mirror_host both set
+    """
+
+    def __init__(
+        self,
+        capture_dir:   Path,
+        upstream_host: Optional[str],
+        upstream_port: Optional[int],
+        mirror_host:   Optional[str] = None,
+        mirror_port:   Optional[int] = None,
+    ):
+        self.capture_dir   = capture_dir
+        self.upstream_host = upstream_host
+        self.upstream_port = upstream_port
+        self.mirror_host   = mirror_host
+        self.mirror_port   = mirror_port
+
+    async def handle(
+        self,
+        client_reader: asyncio.StreamReader,
+        client_writer: asyncio.StreamWriter,
+    ) -> None:
+        peer = client_writer.get_extra_info("peername", ("?", 0))
+        peer_str = f"{peer[0]}:{peer[1]}"
+        log.info("Inbound connection from %s", peer_str)
+
+        has_mirror = bool(self.mirror_host)
+        session = CaptureSession(self.capture_dir, peer_str, has_mirror=has_mirror)
+
+        if not self.upstream_host:
+            # ── Standalone mode ──────────────────────────────────────────────
+            log.info("Standalone mode — recording inbound traffic only")
+            try:
+                while True:
+                    data = await client_reader.read(4096)
+                    if not data:
+                        break
+                    session.feed_client(data)
+            except asyncio.CancelledError:
+                pass
+            except Exception as exc:
+                log.warning("Standalone read error: %s", exc)
+            finally:
+                session.close("standalone capture ended")
+                try:
+                    client_writer.close()
+                    await client_writer.wait_closed()
+                except Exception:
+                    pass
+            return
+
+        # ── Bridge / splitter mode ───────────────────────────────────────────
+        # Connect to primary upstream (required)
+        try:
+            up_reader, up_writer = await asyncio.open_connection(
+                self.upstream_host, self.upstream_port
+            )
+            log.info("Connected to primary %s:%s", self.upstream_host, self.upstream_port)
+        except Exception as exc:
+            log.error("Failed to connect to primary upstream: %s", exc)
+            session.close(f"primary connect failed: {exc}")
+            client_writer.close()
+            return
+
+        # Connect to mirror upstream (optional — failure is non-fatal)
+        mir_reader: Optional[asyncio.StreamReader] = None
+        mir_writer: Optional[asyncio.StreamWriter] = None
+        if self.mirror_host:
+            try:
+                mir_reader, mir_writer = await asyncio.open_connection(
+                    self.mirror_host, self.mirror_port
+                )
+                log.info("Connected to mirror %s:%s", self.mirror_host, self.mirror_port)
+            except Exception as exc:
+                log.warning(
+                    "Mirror connect failed — continuing without mirror: %s", exc
+                )
+                session._log(f"# WARNING: mirror connect failed: {exc}")
+
+        # Build relay tasks
+        #
+        # ┌──────────┐  device bytes  ┌─────────────┐
+        # │  Device  │ ─────────────► │   PRIMARY   │  responses ──► device
+        # └──────────┘                └─────────────┘
+        #      │
+        #      │ device bytes (copy)
+        #      ▼
+        # ┌─────────────┐
+        # │   MIRROR    │  responses discarded (logged only)
+        # └─────────────┘
+        #
+        tasks = [
+            asyncio.create_task(
+                self._relay_device(client_reader, up_writer, mir_writer, session),
+                name="device→upstreams",
+            ),
+            asyncio.create_task(
+                self._relay_simple(up_reader, client_writer, session, "server"),
+                name="primary→device",
+            ),
+        ]
+        if mir_reader is not None:
+            tasks.append(asyncio.create_task(
+                self._relay_drain(mir_reader, session),
+                name="mirror→drain",
+            ))
+
+        try:
+            # Wait on the two primary relays only, so a mirror disconnect
+            # (mirror→drain exiting) cannot tear down the primary session.
+            await asyncio.wait(
+                tasks[:2],
+                return_when=asyncio.FIRST_COMPLETED,
+            )
+            for t in tasks:
+                if not t.done():
+                    t.cancel()
+                try:
+                    await t
+                except (asyncio.CancelledError, Exception):
+                    pass
+        except Exception as exc:
+            log.warning("Bridge relay error: %s", exc)
+        finally:
+            session.close("relay ended")
+            for writer in filter(None, [client_writer, up_writer, mir_writer]):
+                try:
+                    writer.close()
+                    await writer.wait_closed()
+                except Exception:
+                    pass
+
+    # ── Relay helpers ─────────────────────────────────────────────────────────
+
+    async def _relay_device(
+        self,
+        reader:       asyncio.StreamReader,
+        primary_writer: asyncio.StreamWriter,
+        mirror_writer:  Optional[asyncio.StreamWriter],
+        session:      CaptureSession,
+    ) -> None:
+        """
+        Read bytes from the device, write to the primary server, and also
+        write a copy to the mirror server (if connected).  Mirror write
+        failures are non-fatal — we log and continue.
+        """
+        try:
+            while True:
+                data = await reader.read(4096)
+                if not data:
+                    break
+                session.feed_client(data)
+
+                # Primary write — failure IS fatal (lose primary = lose prod)
+                primary_writer.write(data)
+                await primary_writer.drain()
+
+                # Mirror write — failure is non-fatal
+                if mirror_writer is not None:
+                    try:
+                        mirror_writer.write(data)
+                        await mirror_writer.drain()
+                    except Exception as exc:
+                        log.warning("Mirror write failed (non-fatal): %s", exc)
+                        session._log(f"# WARNING: mirror write failed: {exc}")
+                        mirror_writer = None  # stop trying
+
+        except (asyncio.IncompleteReadError, ConnectionResetError, BrokenPipeError):
+            pass
+
+    async def _relay_simple(
+        self,
+        reader:    asyncio.StreamReader,
+        writer:    asyncio.StreamWriter,
+        session:   CaptureSession,
+        direction: str,
+    ) -> None:
+        """Standard single-pipe relay (primary→device or vice-versa)."""
+        try:
+            while True:
+                data = await reader.read(4096)
+                if not data:
+                    break
+                if direction == "server":
+                    session.feed_server(data)
+                else:
+                    session.feed_client(data)
+                writer.write(data)
+                await writer.drain()
+        except (asyncio.IncompleteReadError, ConnectionResetError, BrokenPipeError):
+            pass
+
+    async def _relay_drain(
+        self,
+        reader:  asyncio.StreamReader,
+        session: CaptureSession,
+    ) -> None:
+        """
+        Read mirror server responses, log them to session, do NOT forward to
+        device.  The device only ever sees primary server responses.
+        """
+        try:
+            while True:
+                data = await reader.read(4096)
+                if not data:
+                    break
+                session.feed_mirror(data)
+        except (asyncio.IncompleteReadError, ConnectionResetError, BrokenPipeError):
+            pass
+
+
+# ── Main ───────────────────────────────────────────────────────────────────────
+
+async def main(args: argparse.Namespace) -> None:
+    capture_dir = Path(__file__).parent / "captures"
+    capture_dir.mkdir(parents=True, exist_ok=True)
+
+    upstream_host: Optional[str] = None
+    upstream_port: Optional[int] = None
+    mirror_host:   Optional[str] = None
+    mirror_port:   Optional[int] = None
+
+    if not args.standalone:
+        if not args.upstream:
+            print("ERROR: --upstream HOST:PORT is required unless --standalone is set.")
+            sys.exit(1)
+        parts = args.upstream.rsplit(":", 1)
+        if len(parts) != 2:
+            print("ERROR: --upstream must be HOST:PORT (e.g. 203.0.113.5:12345)")
+            sys.exit(1)
+        upstream_host = parts[0]
+        upstream_port = int(parts[1])
+
+    if args.mirror:
+        parts = args.mirror.rsplit(":", 1)
+        if len(parts) != 2:
+            print("ERROR: --mirror must be HOST:PORT (e.g. 192.168.1.50:12345)")
+            sys.exit(1)
+        mirror_host = parts[0]
+        mirror_port = int(parts[1])
+
+    handler = BridgeHandler(
+        capture_dir,
+        upstream_host, upstream_port,
+        mirror_host,   mirror_port,
+    )
+
+    server = await asyncio.start_server(
+        handler.handle,
+        host="0.0.0.0",
+        port=args.port,
+    )
+
+    # ── Startup banner ────────────────────────────────────────────────────────
+    if args.standalone:
+        mode = "STANDALONE capture (no forwarding)"
+    elif mirror_host:
+        mode = f"SPLITTER  primary={upstream_host}:{upstream_port}  mirror={mirror_host}:{mirror_port}"
+    else:
+        mode = f"BRIDGE → {upstream_host}:{upstream_port}"
+
+    addrs = ", ".join(str(s.getsockname()) for s in server.sockets)
+    print(f"\n{'='*70}")
+    print(f"  ACH bridge/splitter  listening on {addrs}")
+    print(f"  Mode:  {mode}")
+    print(f"  Captures: {capture_dir}/ach_<timestamp>/")
+    print(f"{'='*70}")
+
+    if upstream_host and not mirror_host:
+        print(f"\n  DISCOVERY PHASE")
+        print(f"  Point your TEST unit's ACEmanager call-home destination to:")
+        print(f"    <this machine's LAN IP> : {args.port}")
+        print(f"  All traffic will be forwarded to {upstream_host}:{upstream_port}")
+    elif mirror_host:
+        print(f"\n  SPLITTER MODE — PRODUCTION SAFE")
+        print(f"  Units connect as normal.  Every byte is forwarded to:")
+        print(f"    PRIMARY (authoritative): {upstream_host}:{upstream_port}")
+        print(f"    MIRROR  (your server):   {mirror_host}:{mirror_port}")
+        print(f"  Only PRIMARY responses reach the device.")
+        print(f"  Mirror failures are logged and do not affect the device.")
+    else:
+        print(f"\n  STANDALONE MODE — capture only, nothing forwarded")
+        print(f"  Point a unit at <this machine's LAN IP> : {args.port}")
+
+    print(f"\n  Waiting for inbound connections...  (Ctrl-C to stop)\n")
+
+    async with server:
+        await server.serve_forever()
+
+
+def parse_args() -> argparse.Namespace:
+    p = argparse.ArgumentParser(
+        description=(
+            "Transparent TCP bridge / splitter for Instantel MiniMate Plus "
+            "call-home (ACH) traffic."
+        ),
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog=__doc__,
+    )
+    p.add_argument(
+        "--upstream", "-u",
+        metavar="HOST:PORT",
+        help=(
+            "Primary upstream ACH server to forward to "
+            "(e.g. 203.0.113.5:12345). "
+            "Omit with --standalone for capture-only mode."
+        ),
+    )
+    p.add_argument(
+        "--mirror", "-m",
+        metavar="HOST:PORT",
+        help=(
+            "Mirror / secondary server to receive a copy of all device bytes "
+            "(splitter mode).  Mirror responses are logged but NOT forwarded "
+            "to the device.  Mirror failures are non-fatal."
+        ),
+    )
+    p.add_argument(
+        "--port", "-p",
+        type=int,
+        default=12345,
+        help="Local port to listen on (default: 12345).",
+    )
+    p.add_argument(
+        "--standalone", "-s",
+        action="store_true",
+        help="Capture-only mode: accept connection, do not forward anywhere.",
+    )
+    p.add_argument(
+        "--verbose", "-v",
+        action="store_true",
+        help="Enable debug logging.",
+    )
+    return p.parse_args()
+
+
+if __name__ == "__main__":
+    args = parse_args()
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+    )
+    try:
+        asyncio.run(main(args))
+    except KeyboardInterrupt:
+        print("\nStopped.")
@@ -0,0 +1,177 @@
+#!/usr/bin/env python3
+"""
+ach_mitm.py — TCP man-in-the-middle proxy for capturing Blastware ACH sessions.
+
+The unit calls home to THIS proxy instead of directly to Blastware.  The proxy
+forwards every byte in both directions to the real Blastware ACH server and saves
+the traffic to separate raw capture files that the Analyzer can load directly.
+
+Setup
+-----
+  1.  Start Blastware's ACH server on the BW PC as normal (it listens on its port).
+  2.  Run this proxy on any machine the unit can reach:
+
+        python bridges/ach_mitm.py --bw-host 192.168.1.50 --bw-port 9999
+
+  3.  Point the unit's ACEmanager call-home destination to THIS machine's IP and
+      the --listen-port (default 9999).
+  4.  Trigger a call-home (or wait for the unit to call in).
+  5.  The proxy transparently forwards everything and saves two files per session:
+
+        ach_mitm_<ts>/raw_bw_<ts>.bin   -- bytes Blastware sent to unit (BW TX)
+        ach_mitm_<ts>/raw_s3_<ts>.bin   -- bytes unit sent to Blastware (S3 TX)
+
+  Both files load directly in the Analyzer (File > Open Capture).
+
+  The proxy exits cleanly when either side drops the connection.
+
+Use case: capturing Blastware operations we haven't reverse-engineered yet,
+e.g. event deletion, factory reset, firmware update.
+"""
+
+from __future__ import annotations
+
+import argparse
+import datetime
+import logging
+import socket
+import sys
+import threading
+from pathlib import Path
+
+log = logging.getLogger("ach_mitm")
+
+
+def _pipe(src: socket.socket, dst: socket.socket, label: str, outfile) -> None:
+    """Forward bytes from src to dst, writing everything to outfile."""
+    try:
+        while True:
+            data = src.recv(4096)
+            if not data:
+                break
+            dst.sendall(data)
+            outfile.write(data)
+            outfile.flush()
+            log.debug("%s  %d bytes", label, len(data))
+    except OSError:
+        pass
+    finally:
+        log.info("%s  pipe closed", label)
+        # Signal the other direction to stop by shutting down our end.
+        try:
+            dst.shutdown(socket.SHUT_WR)
+        except OSError:
+            pass
+
+
+def handle(unit_sock: socket.socket, peer: str, bw_host: str, bw_port: int,
+           output_dir: Path) -> None:
+    ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+    session_dir = output_dir / f"ach_mitm_{ts}"
+    session_dir.mkdir(parents=True, exist_ok=True)
+
+    log.info("Session %s  unit=%s  forwarding to %s:%d", ts, peer, bw_host, bw_port)
+
+    # Connect upstream to Blastware.
+    bw_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    try:
+        bw_sock.connect((bw_host, bw_port))
+    except OSError as exc:
+        log.error("Cannot reach Blastware at %s:%d: %s", bw_host, bw_port, exc)
+        unit_sock.close()
+        return
+
+    log.info("Connected to Blastware at %s:%d", bw_host, bw_port)
+
+    bw_path   = session_dir / f"raw_bw_{ts}.bin"   # Blastware → unit  (BW TX)
+    s3_path   = session_dir / f"raw_s3_{ts}.bin"   # unit → Blastware  (S3 TX)
+
+    with open(bw_path, "wb") as bw_fh, open(s3_path, "wb") as s3_fh:
+        # Two threads: one per direction.
+        t_bw = threading.Thread(
+            target=_pipe, args=(bw_sock, unit_sock, "BW->unit", bw_fh), daemon=True
+        )
+        t_s3 = threading.Thread(
+            target=_pipe, args=(unit_sock, bw_sock, "unit->BW", s3_fh), daemon=True
+        )
+        t_bw.start()
+        t_s3.start()
+        t_bw.join()
+        t_s3.join()
+
+    bw_bytes = bw_path.stat().st_size
+    s3_bytes = s3_path.stat().st_size
+    log.info(
+        "Session %s done  BW->unit: %d bytes  unit->BW: %d bytes  -> %s",
+        ts, bw_bytes, s3_bytes, session_dir,
+    )
+
+    unit_sock.close()
+    bw_sock.close()
+
+
+def serve(args: argparse.Namespace) -> None:
+    output_dir = Path(args.output)
+    output_dir.mkdir(parents=True, exist_ok=True)
+
+    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+    server.bind(("0.0.0.0", args.listen_port))
+    server.listen(5)
+    server.settimeout(1.0)
+
+    print(f"\n{'='*60}")
+    print(f"  ACH MITM proxy")
+    print(f"  Listening on     0.0.0.0:{args.listen_port}")
+    print(f"  Forwarding to    {args.bw_host}:{args.bw_port}")
+    print(f"  Captures in      {output_dir.resolve()}/ach_mitm_<ts>/")
+    print(f"{'='*60}")
+    print(f"\n  Point the unit's ACEmanager call-home to this machine on port {args.listen_port}")
+    print(f"  Ctrl-C to stop\n")
+
+    try:
+        while True:
+            try:
+                client_sock, addr = server.accept()
+            except socket.timeout:
+                continue
+            peer = f"{addr[0]}:{addr[1]}"
+            log.info("Accepted connection from %s", peer)
+            t = threading.Thread(
+                target=handle,
+                args=(client_sock, peer, args.bw_host, args.bw_port, output_dir),
+                daemon=True,
+            )
+            t.start()
+    except KeyboardInterrupt:
+        print("\nStopping.")
+    finally:
+        server.close()
+
+
+def main() -> None:
+    ap = argparse.ArgumentParser(description=__doc__,
+                                 formatter_class=argparse.RawDescriptionHelpFormatter)
+    ap.add_argument("--bw-host",     required=True,
+                    help="IP or hostname of the Blastware ACH server")
+    ap.add_argument("--bw-port",     type=int, default=9999,
+                    help="Port Blastware is listening on (default: 9999)")
+    ap.add_argument("--listen-port", type=int, default=9999,
+                    help="Port this proxy listens on (default: 9999)")
+    ap.add_argument("--output",      default="bridges/captures/mitm",
+                    help="Directory for capture files")
+    ap.add_argument("--log-level",   default="INFO",
+                    choices=["DEBUG", "INFO", "WARNING", "ERROR"])
+    args = ap.parse_args()
+
+    logging.basicConfig(
+        level=getattr(logging, args.log_level),
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+        stream=sys.stdout,
+    )
+
+    serve(args)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,904 @@
+#!/usr/bin/env python3
+"""
+ach_server.py — Minimal inbound ACH (Auto Call Home) server for MiniMate Plus.
+
+This IS your test server.  Run it on any machine on the same network, point a
+unit's ACEmanager call-home destination at it, and it will speak the full BW
+protocol to the device: handshake, pull device info, download all events, save
+everything as JSON.
+
+The key thing this script tells you that no amount of packet sniffing can:
+  - Does the device speak first (push) or wait for us to send POLL (pull)?
+
+If startup() completes normally → it's pull protocol, same as Blastware.
+If startup() times out → the device sent something first; check raw_rx.bin.
+
+Usage
+-----
+  python bridges/ach_server.py [--port 12345] [--output bridges/captures/]
+
+Setup
+-----
+  1.  Run this script on a machine on your local network.
+  2.  In ACEmanager → Application → ALEOS Application Framework  (or equivalent)
+      find the Call Home / ACH settings.  Set:
+        Remote Host: <this machine's LAN IP>
+        Remote Port: 12345
+  3.  Trigger the unit (wait for a vibration event, or use the manual call-home
+      button if your firmware version has one).
+  4.  The unit connects.  This script handshakes, downloads all events,
+      and saves a timestamped session directory.
+
+Output per session
+------------------
+  bridges/captures/ach_inbound_<ts>/
+    device_info.json   — serial number, firmware version, calibration date, etc.
+    events.json        — all events: timestamp, PPV per channel, peaks, metadata
+    raw_rx_<ts>.bin    — raw bytes from the device (S3 side) for Analyzer
+    raw_tx_<ts>.bin    — raw bytes we sent to the device (BW side) for Analyzer
+    session_<ts>.log   — detailed protocol log
+
+What to look for
+----------------
+  Push vs pull: Check session_<ts>.log.  If the first line after "Connected"
+    shows bytes arriving BEFORE the POLL probe was sent, it's push.  If POLL
+    gets a clean response, it's pull.
+
+  Frequency: Look at raw_rx.bin in the Analyzer.  SUB 5A (0xA5 responses) carry
+    bulk waveform data — if frequency is sent pre-computed there will be float32
+    values before the ADC sample blocks.
+
+  ACH-specific framing: Does the unit send anything extra before the DLE+STX
+    framing starts?  raw_rx.bin will show raw bytes including any preamble.
+"""
+
+from __future__ import annotations
+
+import argparse
+import datetime
+import json
+import logging
+import socket
+import sys
+import threading
+from pathlib import Path
+from typing import Optional
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from minimateplus.transport import SocketTransport
+from minimateplus.client import MiniMateClient
+from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
+from sfm.database import SeismoDb
+from sfm.waveform_store import WaveformStore
+
+log = logging.getLogger("ach_server")
+
+# ── Per-unit state (downloaded events index) ──────────────────────────────────
+# Persisted as <output_dir>/ach_state.json
+# Format (current — v2):
+#   {
+#     "BE11529": {
+#       "downloaded_events": {              # key_hex → ISO timestamp string
+#         "01110000": "2026-04-11T00:42:17",
+#         "0111245a": "2026-04-11T01:04:30"
+#       },
+#       "max_downloaded_key": "0111245a",
+#       "last_seen": "2026-04-11T01:04:36",
+#       "serial":    "BE11529",
+#       "peer":      "63.43.212.232:51920"
+#     }
+#   }
+#
+# Why (key, timestamp) and not key alone:
+#   The device's event-key counter resets to 0x01110000 after every memory
+#   erase (internal or external).  A bare-key dedup (the v1 format) cannot
+#   distinguish a re-recorded event with the same key from one we already
+#   downloaded.  The 0C waveform record's timestamp IS unique per physical
+#   event, so we pair (key, timestamp) and treat a key with a different
+#   timestamp as a new event regardless of `max_downloaded_key`.
+#
+# Legacy v1 format (`downloaded_keys: list[str]` only) is auto-migrated on
+# read: the keys are kept under a sentinel of "" (empty string) timestamp so
+# the (key, timestamp) compare always sees a mismatch and forces a one-time
+# re-download.  After that pass the state is rewritten in v2 form.
+
+_state_lock = threading.Lock()
+
+
+def _load_state(state_path: Path) -> dict:
+    """
+    Load ach_state.json, transparently migrating any legacy
+    `downloaded_keys: list` entries into the v2 `downloaded_events: dict`
+    schema.  Returns the migrated state.
+    """
+    if not state_path.exists():
+        return {}
+    try:
+        with open(state_path) as f:
+            state = json.load(f)
+    except Exception:
+        return {}
+
+    # Per-unit migration: legacy list → dict-with-empty-timestamps
+    for unit_key, unit_state in list(state.items()):
+        if not isinstance(unit_state, dict):
+            continue
+        if "downloaded_events" in unit_state:
+            continue
+        legacy_keys = unit_state.get("downloaded_keys")
+        if isinstance(legacy_keys, list):
+            unit_state["downloaded_events"] = {k: "" for k in legacy_keys}
+            log.info(
+                "ach_state: migrated %s from v1 (downloaded_keys list) → v2 "
+                "(downloaded_events dict, %d keys with empty timestamps; "
+                "they will re-validate on next session)",
+                unit_key, len(legacy_keys),
+            )
+        else:
+            unit_state["downloaded_events"] = {}
+        # Drop the legacy field now that it has been migrated to v2.
+        unit_state.pop("downloaded_keys", None)
+
+    return state
+
+
+def _save_state(state_path: Path, state: dict) -> None:
+    with _state_lock:
+        with open(state_path, "w") as f:
+            json.dump(state, f, indent=2)
+
+
+# ── Per-session handler ────────────────────────────────────────────────────────
+
+class AchSession:
+    """
+    Handles one inbound unit connection in its own thread.
+    Wraps the socket in a SocketTransport → MiniMateClient, then runs the
+    standard connect → get_device_info → get_events sequence.
+
+    State tracking (ach_state.json in output_dir):
+      On each successful download we record a (key, timestamp) pair per event.
+      On the next call-home we compare per event: a key whose timestamp
+      matches is skipped; a new key, or a known key with a different timestamp
+      (device wiped and re-recorded), is downloaded and saved.
+    """
+
+    def __init__(
+        self,
+        sock: socket.socket,
+        peer: str,
+        output_dir: Path,
+        timeout: float,
+        events_only: bool,
+        max_events: Optional[int],
+        state_path: Path,
+        db: "SeismoDb",
+        store: "WaveformStore",
+        clear_after_download: bool = False,
+        restart_monitoring: bool = False,
+        force_redownload: bool = False,
+    ) -> None:
+        self.sock                 = sock
+        self.peer                 = peer
+        self.output_dir           = output_dir
+        self.timeout              = timeout
+        self.events_only          = events_only
+        self.max_events           = max_events
+        self.state_path           = state_path
+        self.db                   = db
+        self.store                = store
+        self.clear_after_download = clear_after_download
+        self.restart_monitoring   = restart_monitoring
+        # `force_redownload` tells this session to ignore ach_state and
+        # re-download every event currently on the device, regardless of any
+        # (key, timestamp) match.  Useful as a manual override when state has
+        # become inconsistent with what's actually on disk / in the DB.
+        self.force_redownload     = force_redownload
+
+    def run(self) -> None:
+        ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+
+        # Session dir and file handler are created lazily — only after startup
+        # succeeds.  This prevents internet scanners and dropped connections from
+        # littering the output directory with empty session folders.
+        try:
+            self._run_inner(ts)
+        except Exception as exc:
+            log.error("Session failed (%s): %s", self.peer, exc, exc_info=True)
+        finally:
+            try:
+                self.sock.close()
+            except Exception:
+                pass
+
+    def _run_inner(self, ts: str) -> None:
+        transport = SocketTransport(self.sock, peer=self.peer)
+
+        # Collect raw bytes in memory until startup succeeds, then flush to disk.
+        raw_rx_buf: list[bytes] = []   # device → us  (S3 side)
+        raw_tx_buf: list[bytes] = []   # us → device  (BW side)
+        _orig_read  = transport.read
+        _orig_write = transport.write
+
+        def tapped_read(n: int) -> bytes:
+            data = _orig_read(n)
+            if data:
+                raw_rx_buf.append(data)
+            return data
+
+        def tapped_write(data: bytes) -> None:
+            _orig_write(data)
+            if data:
+                raw_tx_buf.append(data)
+
+        transport.read  = tapped_read   # type: ignore[method-assign]
+        transport.write = tapped_write  # type: ignore[method-assign]
+
+        serial: Optional[str] = None
+
+        # ── Step 1: startup handshake ─────────────────────────────────────────
+        # Do this BEFORE creating the session directory so that scanner probes
+        # and dropped connections leave no trace on disk.
+        try:
+            from minimateplus.protocol import MiniMateProtocol
+            client = MiniMateClient(transport=transport, timeout=self.timeout)
+            client.open()
+            proto = MiniMateProtocol(transport, recv_timeout=self.timeout)
+            proto.startup()
+        except Exception as exc:
+            log.warning("Startup failed from %s: %s -- ignoring", self.peer, exc)
+            return  # no session dir created
+
+        # Startup succeeded — this is a real unit.  Create session dir now.
+        session_dir = self.output_dir / f"ach_inbound_{ts}"
+        session_dir.mkdir(parents=True, exist_ok=True)
+        log_path   = session_dir / f"session_{ts}.log"
+        raw_rx_path = session_dir / f"raw_rx_{ts}.bin"   # device → us  (S3 side)
+        raw_tx_path = session_dir / f"raw_tx_{ts}.bin"   # us → device  (BW side)
+
+        # Flush buffered bytes to files and switch to direct file writes.
+        raw_rx_fh = open(raw_rx_path, "wb")
+        raw_tx_fh = open(raw_tx_path, "wb")
+        for chunk in raw_rx_buf:
+            raw_rx_fh.write(chunk)
+        for chunk in raw_tx_buf:
+            raw_tx_fh.write(chunk)
+        raw_rx_buf.clear()
+        raw_tx_buf.clear()
+
+        def tapped_read_file(n: int) -> bytes:
+            data = _orig_read(n)
+            if data:
+                raw_rx_fh.write(data)
+                raw_rx_fh.flush()
+            return data
+
+        def tapped_write_file(data: bytes) -> None:
+            _orig_write(data)
+            if data:
+                raw_tx_fh.write(data)
+                raw_tx_fh.flush()
+
+        transport.read  = tapped_read_file   # type: ignore[method-assign]
+        transport.write = tapped_write_file  # type: ignore[method-assign]
+
+        # Wire up file handler now that the session dir exists.
+        fh = logging.FileHandler(log_path, encoding="utf-8")
+        fh.setFormatter(logging.Formatter("%(asctime)s  %(levelname)-7s  %(name)s  %(message)s"))
+        root_logger = logging.getLogger()
+        root_logger.addHandler(fh)
+
+        try:
+            # ── Step 2: device info ───────────────────────────────────────────
+            device_info = None
+            if not self.events_only:
+                log.info("Step 2/3: reading device info")
+                try:
+                    device_info = client.connect()
+                    serial = device_info.serial
+                    _save_json(session_dir / "device_info.json", _device_info_to_dict(device_info))
+                    log.info(
+                        "  [OK] Device: serial=%s  firmware=%s  model=%s  events=%d",
+                        serial,
+                        device_info.firmware_version,
+                        device_info.model,
+                        device_info.event_count or 0,
+                    )
+                except Exception as exc:
+                    log.error("  [FAIL] Device info failed: %s", exc)
+            else:
+                log.info("Step 2/3: skipping device info (--events-only)")
+
+            # ── Step 3: check for new events against (key, timestamp) state ───
+            log.info("Step 3/3: checking for new events")
+
+            state = _load_state(self.state_path)
+            unit_key = serial or self.peer  # fall back to peer address if no serial
+            unit_state = state.get(unit_key, {})
+
+            # downloaded_events is the v2 (key_hex → timestamp_iso) dict.
+            # Empty-string timestamps are migrated v1 entries — they force a
+            # one-time re-download because the (key, timestamp) compare always
+            # mismatches against any non-empty timestamp from a fresh 0C read.
+            seen_events: dict[str, str] = dict(unit_state.get("downloaded_events", {}))
+            max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
+
+            if self.force_redownload:
+                log.info("  --force-redownload-all set — ignoring %d cached "
+                         "(key, timestamp) entries for this session",
+                         len(seen_events))
+                seen_events = {}
+
+            # Walk the event index (browse-mode, no 5A) to get the actual current
+            # key list.  The SUB 08 event_count field is a lifetime "total events
+            # ever recorded" counter that does NOT decrement on erase — confirmed
+            # 2026-04-13.  list_event_keys() via the 1E/1F chain is the only
+            # reliable way to know what is actually stored on the device right now.
+            log.info("  Checking device key list (browse walk, no waveform download)...")
+            try:
+                device_keys = client.list_event_keys()
+            except Exception as exc:
+                log.warning("  list_event_keys failed: %s -- falling back to full download", exc)
+                device_keys = None
+
+            current_count = len(device_keys) if device_keys is not None else 0
+
+            log.info("  Unit has %d stored event(s); %d (key, ts) entr(ies) previously downloaded",
+                     current_count, len(seen_events))
+
+            if device_keys is not None and current_count == 0:
+                log.info("  [OK] No events on device -- nothing to download")
+                log.info("Session complete (no events) -> %s", session_dir)
+                return
+
+            if device_keys is not None:
+                # ── Post-erase detection (best-effort, key-only signal) ───────
+                # After erase the device's key counter resets to 01110000.
+                # If the device's current max key is below our high-water mark
+                # we know erase happened.  This catches the cleanest case but
+                # does NOT catch erase-then-record-many-events (where the new
+                # max may climb past the old max).  The (key, timestamp) check
+                # in get_events() is what handles those.
+                if device_keys and max_seen_key != "00000000":
+                    max_device_key = max(device_keys)
+                    if max_device_key < max_seen_key:
+                        log.info(
+                            "  Post-erase reset detected: "
+                            "device max key %s < historical max %s "
+                            "-- discarding stale (key, ts) state for this session",
+                            max_device_key, max_seen_key,
+                        )
+                        seen_events = {}
+
+                # Note: no early-exit "all already downloaded" short-circuit
+                # here.  Without per-event timestamps we cannot tell whether
+                # device_keys ⊆ seen_events.keys() actually means we have
+                # those physical events.  get_events() will read 0C on its
+                # skip path and decide per event.
+
+            # Apply max_events cap
+            # stop_idx: when we know the count from list_event_keys, use it as
+            # an upper bound.  When list_event_keys failed (device_keys is None),
+            # pass None — get_events will run until the null sentinel naturally.
+            stop_idx: Optional[int] = (current_count - 1) if device_keys is not None else None
+            if self.max_events is not None:
+                cap = self.max_events - 1
+                stop_idx = cap if stop_idx is None else min(stop_idx, cap)
+                if device_keys is not None and self.max_events < current_count:
+                    log.warning(
+                        "  max_events=%d cap: will download events 0-%d only "
+                        "(unit has %d total)",
+                        self.max_events, stop_idx, current_count,
+                    )
+
+            try:
+                # Pass `seen_events` (key → ISO timestamp) so the client can
+                # read 0C on its skip path and only skip 5A when the per-event
+                # timestamp matches what we already have on disk.  When
+                # force_redownload is set, seen_events was already cleared above.
+                #
+                # Filter out empty-string timestamps (legacy v1 entries) — the
+                # client's 0C-on-skip-path only trusts entries with a
+                # populated timestamp; otherwise it falls through to a full
+                # 5A download.
+                skip_dict = {k: ts for k, ts in seen_events.items() if ts}
+
+                all_events = client.get_events(
+                    full_waveform=True,
+                    stop_after_index=stop_idx,
+                    skip_waveform_for_events=skip_dict if skip_dict else None,
+                )
+
+                # New events are those that came back with _a5_frames populated
+                # (= 5A actually ran on this session).  Skipped events have
+                # _a5_frames = None because the client matched (key, timestamp)
+                # against skip_dict and bypassed 5A.
+                new_events = [
+                    e for e in all_events
+                    if getattr(e, "_a5_frames", None)
+                ]
+                skipped = len(all_events) - len(new_events)
+
+                log.info("  [OK] Walked %d event(s): %d downloaded, %d skipped (matched (key, ts) in state)",
+                         len(all_events), len(new_events), skipped)
+
+                # ── Persist event file + A5 sidecar to the waveform store ──
+                # Saves ride alongside the existing JSON dump so the on-disk
+                # event file and events.json reference the same set of events.
+                waveform_records: dict[str, dict] = {}
+                for ev in new_events:
+                    if not ev._a5_frames:
+                        continue
+                    try:
+                        rec = self.store.save(
+                            ev,
+                            serial=serial or "UNKNOWN",
+                            a5_frames=ev._a5_frames,
+                        )
+                        if ev._waveform_key is not None:
+                            waveform_records[ev._waveform_key.hex()] = rec
+                        log.info(
+                            "  [WAVE] saved %s (%d bytes)",
+                            rec["filename"], rec["filesize"],
+                        )
+                    except Exception as exc:
+                        key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
+                        log.warning(
+                            "  [WARN] Waveform store save failed for %s: %s",
+                            key_hex, exc,
+                        )
+
+                if new_events:
+                    _save_json(
+                        session_dir / "events.json",
+                        [_event_to_dict(e, waveform_records) for e in new_events],
+                    )
+
+                    for ev in new_events:
+                        pv = ev.peak_values
+                        pi = ev.project_info
+                        key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
+                        log.info(
+                            "  NEW [%s] %s  Tran=%.4f  Vert=%.4f  Long=%.4f  VS=%.4f  project=%r",
+                            key_hex,
+                            str(ev.timestamp) if ev.timestamp else "?",
+                            pv.tran            if pv else 0,
+                            pv.vert            if pv else 0,
+                            pv.long            if pv else 0,
+                            pv.peak_vector_sum if pv else 0,
+                            pi.project         if pi else "",
+                        )
+                else:
+                    log.info("  [OK] No new events since last call-home -- nothing to save")
+
+                # ── Monitor log entries (partial records / continuous monitoring) ──
+                # Browse walk (0A + 1F only) to collect monitor log entries for
+                # recording intervals where no threshold was crossed.  This is a
+                # second 1E-based pass over the device's record list, separate from
+                # the get_events() download loop above.
+                log.info("  Collecting monitor log entries (browse walk)...")
+                new_monitor_entries: list[MonitorLogEntry] = []
+                try:
+                    new_monitor_entries = client.get_monitor_log_entries(
+                        skip_keys=seen_keys if seen_keys else None,
+                    )
+                    if new_monitor_entries:
+                        _save_json(
+                            session_dir / "monitor_log.json",
+                            [_monitor_log_entry_to_dict(e) for e in new_monitor_entries],
+                        )
+                        log.info(
+                            "  [OK] %d new monitor log entry(s) saved",
+                            len(new_monitor_entries),
+                        )
+                        for ml in new_monitor_entries:
+                            log.info(
+                                "  MONLOG [%s] %s → %s (%s)",
+                                ml.key,
+                                ml.start_time.isoformat() if ml.start_time else "?",
+                                ml.stop_time.isoformat()  if ml.stop_time  else "?",
+                                f"{ml.duration_seconds:.0f}s" if ml.duration_seconds is not None else "?s",
+                            )
+                    else:
+                        log.info("  [OK] No new monitor log entries")
+                except Exception as exc:
+                    log.warning(
+                        "  [WARN] Monitor log collection failed: %s -- continuing",
+                        exc,
+                    )
+
+                # ── Persist to SQLite DB ─────────────────────────────────────
+                _session_start = datetime.datetime.now()  # NOTE: taken after the download, so duration_seconds below covers only the DB writes
+                try:
+                    _ev_ins, _ev_skip = self.db.insert_events(
+                        new_events,
+                        serial=serial or self.peer,
+                        session_id=None,
+                        waveform_records=waveform_records,
+                        device_family="series3",
+                    )
+                    _ml_ins, _ml_skip = self.db.insert_monitor_log(
+                        new_monitor_entries, session_id=None
+                    )
+                    _session_id = self.db.insert_ach_session(
+                        serial=serial or self.peer,
+                        peer=self.peer,
+                        events_downloaded=_ev_ins,
+                        monitor_entries=_ml_ins,
+                        duration_seconds=(datetime.datetime.now() - _session_start).total_seconds(),
+                        session_time=_session_start,
+                    )
+                    log.info(
+                        "  [DB] session=%s  events +%d (skip %d)  monitor +%d (skip %d)",
+                        _session_id[:8], _ev_ins, _ev_skip, _ml_ins, _ml_skip,
+                    )
+                except Exception as exc:
+                    log.warning("  [WARN] DB write failed: %s -- continuing", exc)
+
+                # ── Optional: erase device memory after successful download ────
+                erased_successfully = False
+                if self.clear_after_download and new_events:
+                    log.info("  Clearing device memory (--clear-after-download)...")
+                    try:
+                        client.delete_all_events()
+                        log.info("  [OK] Device memory cleared")
+                        erased_successfully = True
+                    except Exception as exc:
+                        log.error(
+                            "  [FAIL] Event deletion failed: %s -- events NOT cleared",
+                            exc,
+                        )
+
+                # ── Update persistent state ───────────────────────────────────
+                # Build a fresh (key → ISO timestamp) map from THIS session's
+                # results.  For each event currently on the device, prefer the
+                # timestamp we just observed (from 0C); fall back to whatever
+                # was already in seen_events for that key (so we don't lose an
+                # entry just because get_events skipped it on the (key, ts)
+                # match path).
+                def _ts_iso(ev) -> str:
+                    ts = getattr(ev, "timestamp", None)
+                    if ts is None:
+                        return ""
+                    try:
+                        return datetime.datetime(
+                            ts.year, ts.month, ts.day,
+                            ts.hour or 0, ts.minute or 0, ts.second or 0,
+                        ).isoformat()
+                    except Exception:
+                        return str(ts)
+
+                current_events_map: dict[str, str] = {}
+                for ev in all_events:
+                    if ev._waveform_key is None:
+                        continue
+                    key_hex = ev._waveform_key.hex()
+                    ts_iso  = _ts_iso(ev) or seen_events.get(key_hex, "")
+                    current_events_map[key_hex] = ts_iso
+
+                # Monitor-log entries don't have a 0C-style timestamp, but
+                # they DO have a start_time; use that so the monitor-log keys
+                # are properly entered into the (key, ts) map.
+                for ml in new_monitor_entries:
+                    key_hex = ml.key
+                    ts = ml.start_time
+                    ts_iso = ts.isoformat() if ts else seen_events.get(key_hex, "")
+                    # If a triggered event already populated this key, keep
+                    # whichever has a non-empty timestamp.
+                    if key_hex not in current_events_map or not current_events_map[key_hex]:
+                        current_events_map[key_hex] = ts_iso
+
+                if erased_successfully:
+                    updated_events: dict[str, str] = {}
+                    new_max_key  = "00000000"
+                    log.info(
+                        "  State reset after erase -- next session will download "
+                        "from key 0 (device counter resets after erase)"
+                    )
+                else:
+                    # Merge: keep prior (key, ts) entries we still have evidence
+                    # of (for survivors of any partial failure), plus this
+                    # session's authoritative (key, ts) pairs.
+                    updated_events = dict(seen_events)
+                    updated_events.update(current_events_map)
+                    new_max_key  = (
+                        max(updated_events.keys())
+                        if updated_events else max_seen_key
+                    )
+
+                state[unit_key] = {
+                    "downloaded_events":  updated_events,
+                    "max_downloaded_key": new_max_key,
+                    "last_seen":          datetime.datetime.now().isoformat(),
+                    "serial":             serial,
+                    "peer":               self.peer,
+                }
+                _save_state(self.state_path, state)
+
+            except Exception as exc:
+                log.error("  [FAIL] Event download failed: %s", exc, exc_info=True)
+
+            # ── Optional: restart monitoring (runs even if the download above failed) ──
+            if self.restart_monitoring:
+                log.info("  Restarting monitoring on device (--restart-monitoring)...")
+                try:
+                    client.start_monitoring()
+                    log.info("  [OK] Monitoring restarted")
+                except Exception as exc:
+                    log.warning("  [WARN] Failed to restart monitoring: %s", exc)
+
+        finally:
+            raw_rx_fh.close()
+            raw_tx_fh.close()
+            client.close()  # closes transport / socket cleanly
+            root_logger.removeHandler(fh)
+            fh.close()
+
+        log.info("Session complete -> %s", session_dir)
+        log.info("="*60)
+
+
+# ── JSON helpers ───────────────────────────────────────────────────────────────
+
+def _save_json(path: Path, obj: object) -> None:
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(obj, f, indent=2, default=str)
+    log.debug("Saved %s", path)
+
+
+def _device_info_to_dict(d: DeviceInfo) -> dict:
+    cc = d.compliance_config
+    return {
+        "serial":            d.serial,
+        "firmware_version":  d.firmware_version,
+        "dsp_version":       d.dsp_version,
+        "model":             d.model,
+        "event_count":       d.event_count,
+        # compliance config fields (None if 1A read failed)
+        "setup_name":        cc.setup_name         if cc else None,
+        "sample_rate":       cc.sample_rate        if cc else None,
+        "record_time":       cc.record_time        if cc else None,
+        "trigger_level_geo": cc.trigger_level_geo  if cc else None,
+        "alarm_level_geo":   cc.alarm_level_geo    if cc else None,
+        "geo_adc_scale":     cc.geo_adc_scale      if cc else None,  # hw scale factor (in/s)/V
+        "geo_range":         cc.geo_range          if cc else None,  # 0x01=Normal 10in/s, 0x00=Sensitive 1.25in/s (unconfirmed)
+        "project":           cc.project            if cc else None,
+        "client":            cc.client             if cc else None,
+        "operator":          cc.operator           if cc else None,
+        "sensor_location":   cc.sensor_location    if cc else None,
+    }
+
+
+def _event_to_dict(
+    e: Event,
+    waveform_records: Optional[dict[str, dict]] = None,
+) -> dict:
+    pv = e.peak_values
+    pi = e.project_info
+    peaks = {}
+    if pv:
+        peaks = {
+            "transverse":    pv.tran,
+            "vertical":      pv.vert,
+            "longitudinal":  pv.long,
+            "vector_sum":    pv.peak_vector_sum,
+            "mic":           pv.micl,
+        }
+    samples = {}
+    if e.raw_samples:
+        samples = {
+            ch: vals[:20]  # first 20 sample-sets to keep the file sane
+            for ch, vals in e.raw_samples.items()
+        }
+        samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
+
+    rec: dict = {}
+    if waveform_records and e._waveform_key is not None:
+        rec = waveform_records.get(e._waveform_key.hex(), {}) or {}
+
+    return {
+        "timestamp":       str(e.timestamp) if e.timestamp else None,
+        "project":         pi.project         if pi else None,
+        "client":          pi.client          if pi else None,
+        "operator":        pi.operator        if pi else None,
+        "sensor_location": pi.sensor_location if pi else None,
+        "peaks":           peaks,
+        "raw_samples_preview": samples,
+        "blastware_filename":   rec.get("filename"),
+        "blastware_filesize":   rec.get("filesize"),
+        "a5_pickle_filename":   rec.get("a5_pickle_filename"),
+    }
+
+
+def _monitor_log_entry_to_dict(e: MonitorLogEntry) -> dict:
+    return {
+        "key":               e.key,
+        "start_time":        e.start_time.isoformat() if e.start_time else None,
+        "stop_time":         e.stop_time.isoformat()  if e.stop_time  else None,
+        "duration_seconds":  e.duration_seconds,
+        "serial":            e.serial,
+        "geo_threshold_ips": e.geo_threshold_ips,
+    }
+
+
+# ── Main server loop ───────────────────────────────────────────────────────────
+
+def serve(args: argparse.Namespace) -> None:
+    output_dir = Path(args.output)
+    output_dir.mkdir(parents=True, exist_ok=True)
+    state_path = output_dir / "ach_state.json"
+    db = SeismoDb(output_dir / "seismo_relay.db")
+    store = WaveformStore(output_dir / "waveforms")
+
+    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+    server_sock.bind(("0.0.0.0", args.port))
+    server_sock.listen(5)
+    # Wake up every second so Ctrl-C is handled promptly on Windows.
+    # Without this, accept() blocks indefinitely and ignores KeyboardInterrupt.
+    server_sock.settimeout(1.0)
+
+    max_ev = args.max_events
+    print(f"\n{'='*60}")
+    print(f"  ACH inbound server  listening on 0.0.0.0:{args.port}")
+    print(f"  Output:     {output_dir.resolve()}/ach_inbound_<timestamp>/")
+    print(f"  State file: {state_path}")
+    print(f"  Max events per session: {max_ev if max_ev else 'unlimited'}")
+    print(f"  Clear device after download: {'YES' if args.clear_after_download else 'no'}")
+    print(f"  Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
+    print(f"  Force re-download all (ignore state): {'YES' if args.force_redownload_all else 'no'}")
+    print(f"{'='*60}")
+    print("\n  Point your test unit's ACEmanager call-home settings to:")
+    print("    Remote Host: <this machine's LAN IP>")
+    print(f"    Remote Port: {args.port}")
+
+    allow_ips = set(args.allow_ips)
+    if allow_ips:
+        print(f"  Allowlist:  {', '.join(sorted(allow_ips))}")
+    else:
+        print("  Allowlist:  NONE -- accepting all IPs (add --allow-ip to restrict)")
+    print("\n  Waiting for inbound connections...  (Ctrl-C to stop)\n")
+
+    try:
+        while True:
+            try:
+                client_sock, addr = server_sock.accept()
+            except socket.timeout:
+                continue  # no connection this second; loop back and check for Ctrl-C
+            try:
+                peer_ip = addr[0]
+                peer = f"{addr[0]}:{addr[1]}"
+
+                if allow_ips and peer_ip not in allow_ips:
+                    log.info("Rejected connection from %s (not in allowlist)", peer)
+                    client_sock.close()
+                    continue
+
+                log.info("Accepted connection from %s", peer)
+                session = AchSession(
+                    sock=client_sock,
+                    peer=peer,
+                    output_dir=output_dir,
+                    timeout=args.timeout,
+                    events_only=args.events_only,
+                    max_events=max_ev,
+                    state_path=state_path,
+                    db=db,
+                    store=store,
+                    clear_after_download=args.clear_after_download,
+                    restart_monitoring=args.restart_monitoring,
+                    force_redownload=args.force_redownload_all,
+                )
+                t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
+                t.start()
+            except KeyboardInterrupt:
+                raise
+            except Exception as exc:
+                log.error("Accept error: %s", exc)
+    finally:
+        server_sock.close()
+        print("\nServer stopped.")
+
+
+def parse_args() -> argparse.Namespace:
+    p = argparse.ArgumentParser(
+        description="Minimal inbound ACH server — speak BW protocol to calling MiniMate Plus units.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog=__doc__,
+    )
+    p.add_argument(
+        "--port", "-p",
+        type=int,
+        default=12345,
+        help="Port to listen on (default: 12345).",
+    )
+    p.add_argument(
+        "--output", "-o",
+        default=str(Path(__file__).parent / "captures"),
+        metavar="DIR",
+        help="Directory to write session captures (default: bridges/captures/).",
+    )
+    p.add_argument(
+        "--timeout", "-t",
+        type=float,
+        default=30.0,
+        help="Protocol receive timeout in seconds (default: 30.0).",
+    )
+    p.add_argument(
+        "--events-only",
+        action="store_true",
+        help="Skip the device-info step and go straight to event download.",
+    )
+    p.add_argument(
+        "--max-events",
+        type=int,
+        default=None,
+        metavar="N",
+        help=(
+            "Safety cap: download at most N events per session (default: unlimited). "
+            "Useful if a unit has many old events stored — prevents a very long first run."
+        ),
+    )
+    p.add_argument(
+        "--allow-ip",
+        metavar="IP",
+        action="append",
+        dest="allow_ips",
+        default=[],
+        help=(
+            "Only accept connections from this IP address (repeat for multiple). "
+            "Example: --allow-ip 63.43.212.232  "
+            "If not specified, all IPs are accepted (not recommended for public servers)."
+        ),
+    )
+    p.add_argument(
+        "--restart-monitoring",
+        action="store_true",
+        default=False,
+        help=(
+            "After downloading events, send SUB 0x96 (start monitoring) before "
+            "disconnecting. Required for RV55 units whose firmware does not assert "
+            "DCD on disconnect — without this the unit stays idle after a call-home."
+        ),
+    )
+    p.add_argument(
+        "--clear-after-download",
+        action="store_true",
+        default=False,
+        help=(
+            "After successfully downloading new events, erase all events from the "
+            "device memory (SUB 0xA3 → 0x1C → 0x06 → 0xA2 sequence, confirmed from "
+            "4-11-26 MITM capture). Only fires when at least one new event was saved. "
+            "This mirrors the standard Blastware ACH workflow."
+        ),
+    )
+    p.add_argument(
+        "--force-redownload-all",
+        action="store_true",
+        default=False,
+        help=(
+            "Manual override: ignore ach_state.json's downloaded_events map "
+            "for this session and re-download every event currently on the "
+            "device, regardless of (key, timestamp) match.  Useful when state "
+            "has become inconsistent with the on-disk waveform store / DB."
+        ),
+    )
+    p.add_argument(
+        "--verbose", "-v",
+        action="store_true",
+        help="Enable debug logging.",
+    )
+    return p.parse_args()
+
+
+if __name__ == "__main__":
+    args = parse_args()
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+    )
+    try:
+        serve(args)
+    except KeyboardInterrupt:
+        print("\nStopped.")
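The persistent-state update in the session handler above either resets the per-unit map after a successful erase or merges this session's (key, ISO timestamp) pairs over the previously seen ones. A minimal standalone sketch of that rule follows. `merge_state` is a hypothetical helper name used only for illustration (it is not part of this change), and the sketch assumes keys are fixed-width lowercase hex strings, so that a lexicographic `max()` agrees with numeric order.

```python
def merge_state(seen, current, erased, max_seen_key="00000000"):
    """Return (updated_events, new_max_key).

    Hypothetical illustration of the state-update rule; assumes
    fixed-width lowercase hex keys.
    """
    if erased:
        # The device counter restarts after an erase, so forget everything.
        return {}, "00000000"
    updated = dict(seen)      # keep prior (key, ts) entries
    updated.update(current)   # this session's pairs win on conflict
    new_max = max(updated) if updated else max_seen_key
    return updated, new_max


seen = {"0000000a": "2026-05-30T04:37:43"}
current = {"0000000b": "2026-06-01T18:27:24"}
updated, new_max = merge_state(seen, current, erased=False)
```

If keys were ever stored with varying width or mixed case, the string `max()` would no longer track the numeric maximum; comparing `int(key, 16)` would be the safer choice in that situation.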
@@ -58,16 +58,24 @@ class BridgeGUI(tk.Tk):
        tk.Entry(self, textvariable=self.logdir_var, width=24).grid(row=1, column=3, sticky="we", **pad)
        tk.Button(self, text="Browse", command=self._choose_dir).grid(row=1, column=4, sticky="w", **pad)

-        # Row 2: Raw taps
-        self.raw_bw_var = tk.StringVar(value="")
-        self.raw_s3_var = tk.StringVar(value="")
-        tk.Checkbutton(self, text="Save BW->S3 raw", command=self._toggle_raw_bw, onvalue="1", offvalue="").grid(row=2, column=0, sticky="w", **pad)
-        tk.Entry(self, textvariable=self.raw_bw_var, width=28).grid(row=2, column=1, columnspan=3, sticky="we", **pad)
-        tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_var, "bw")).grid(row=2, column=4, **pad)
+        # Rows 2-3: Raw taps, ON by default; empty path = "auto" (timestamped name); unchecked = disabled
+        self.raw_bw_enabled = tk.IntVar(value=1)
+        self.raw_s3_enabled = tk.IntVar(value=1)
+        # Path fields: empty means "auto" (bridge picks a timestamped name)
+        self.raw_bw_path_var = tk.StringVar(value="")
+        self.raw_s3_path_var = tk.StringVar(value="")

-        tk.Checkbutton(self, text="Save S3->BW raw", command=self._toggle_raw_s3, onvalue="1", offvalue="").grid(row=3, column=0, sticky="w", **pad)
-        tk.Entry(self, textvariable=self.raw_s3_var, width=28).grid(row=3, column=1, columnspan=3, sticky="we", **pad)
-        tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_var, "s3")).grid(row=3, column=4, **pad)
+        tk.Checkbutton(self, text="BW→S3 raw (auto)", variable=self.raw_bw_enabled,
+                       command=self._toggle_raw_bw).grid(row=2, column=0, sticky="w", **pad)
+        tk.Entry(self, textvariable=self.raw_bw_path_var, width=28,
+                 fg="grey").grid(row=2, column=1, columnspan=3, sticky="we", **pad)
+        tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_path_var, "bw")).grid(row=2, column=4, **pad)
+
+        tk.Checkbutton(self, text="S3→BW raw (auto)", variable=self.raw_s3_enabled,
+                       command=self._toggle_raw_s3).grid(row=3, column=0, sticky="w", **pad)
+        tk.Entry(self, textvariable=self.raw_s3_path_var, width=28,
+                 fg="grey").grid(row=3, column=1, columnspan=3, sticky="we", **pad)
+        tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_path_var, "s3")).grid(row=3, column=4, **pad)

        # Row 4: Status + buttons
        self.status_var = tk.StringVar(value="Idle")
@@ -102,13 +110,11 @@ class BridgeGUI(tk.Tk):
            var.set(filename)

    def _toggle_raw_bw(self) -> None:
-        if not self.raw_bw_var.get():
-            # default name
-            self.raw_bw_var.set(os.path.join(self.logdir_var.get(), "raw_bw.bin"))
+        # Checkbox toggled — no path action needed; enabled state drives the flag.
+        pass

    def _toggle_raw_s3(self) -> None:
-        if not self.raw_s3_var.get():
-            self.raw_s3_var.set(os.path.join(self.logdir_var.get(), "raw_s3.bin"))
+        pass

    def start_bridge(self) -> None:
        if self.process and self.process.poll() is None:
@@ -126,23 +132,22 @@ class BridgeGUI(tk.Tk):

        args = [sys.executable, BRIDGE_PATH, "--bw", bw, "--s3", s3, "--baud", baud, "--logdir", logdir]

-        ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
+        # Raw tap flags.
+        # Checkbox on + empty path → pass "auto" (bridge generates timestamped name).
+        # Checkbox on + explicit path → pass that path.
+        # Checkbox off → pass "" to disable (overrides bridge's auto default).
+        raw_bw_explicit = self.raw_bw_path_var.get().strip()
+        raw_s3_explicit = self.raw_s3_path_var.get().strip()

-        raw_bw = self.raw_bw_var.get().strip()
-        raw_s3 = self.raw_s3_var.get().strip()
+        if self.raw_bw_enabled.get():
+            args += ["--raw-bw", raw_bw_explicit if raw_bw_explicit else "auto"]
+        else:
+            args += ["--raw-bw", ""]   # explicit disable

-        # If the user left the default generic name, replace with a timestamped one
-        # so each session gets its own file.
-        if raw_bw:
-            if os.path.basename(raw_bw) in ("raw_bw.bin", "raw_bw"):
-                raw_bw = os.path.join(os.path.dirname(raw_bw) or logdir, f"raw_bw_{ts}.bin")
-                self.raw_bw_var.set(raw_bw)
-            args += ["--raw-bw", raw_bw]
-        if raw_s3:
-            if os.path.basename(raw_s3) in ("raw_s3.bin", "raw_s3"):
-                raw_s3 = os.path.join(os.path.dirname(raw_s3) or logdir, f"raw_s3_{ts}.bin")
-                self.raw_s3_var.set(raw_s3)
-            args += ["--raw-s3", raw_s3]
+        if self.raw_s3_enabled.get():
+            args += ["--raw-s3", raw_s3_explicit if raw_s3_explicit else "auto"]
+        else:
+            args += ["--raw-s3", ""]   # explicit disable

        try:
            self.process = subprocess.Popen(
@@ -93,8 +93,11 @@ class SessionLogger:
        self._bin_fh = open(bin_path, "ab", buffering=0)
        self._lock = threading.Lock()
        # Optional pure-byte taps (no headers). BW=Blastware tx, S3=device tx.
+        # These can be opened/closed on demand via start_raw_capture/stop_raw_capture.
        self._raw_bw = open(raw_bw_path, "ab", buffering=0) if raw_bw_path else None
        self._raw_s3 = open(raw_s3_path, "ab", buffering=0) if raw_s3_path else None
+        self._cap_bw_path: Optional[str] = raw_bw_path
+        self._cap_s3_path: Optional[str] = raw_s3_path

    def log_line(self, line: str) -> None:
        with self._lock:
@@ -124,6 +127,43 @@ class SessionLogger:
        self.log_line(f"[{ts}] [INFO] {msg}")
        self.bin_write_record(REC_INFO, msg.encode("utf-8", errors="replace"))

+    def start_raw_capture(self, label: str, logdir: str) -> tuple:
+        """Open new raw tap files for a named capture.  Returns (bw_path, s3_path)."""
+        ts = _dt.datetime.now().strftime("%Y%m%d_%H%M%S")
+        safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in label)[:40] if label else ""
+        suffix = f"_{safe}" if safe else ""
+        bw_path = os.path.join(logdir, f"raw_bw_{ts}{suffix}.bin")
+        s3_path = os.path.join(logdir, f"raw_s3_{ts}{suffix}.bin")
+        with self._lock:
+            # Close any previously open taps first
+            if self._raw_bw:
+                self._raw_bw.close()
+            if self._raw_s3:
+                self._raw_s3.close()
+            self._raw_bw = open(bw_path, "ab", buffering=0)
+            self._raw_s3 = open(s3_path, "ab", buffering=0)
+            self._cap_bw_path = bw_path
+            self._cap_s3_path = s3_path
+        self.log_info(f"raw capture started: label={label!r} bw={bw_path} s3={s3_path}")
+        return bw_path, s3_path
+
+    def stop_raw_capture(self) -> tuple:
+        """Close raw tap files.  Returns (bw_path, s3_path) for the capture just closed."""
+        with self._lock:
+            bw = self._cap_bw_path
+            s3 = self._cap_s3_path
+            if self._raw_bw:
+                self._raw_bw.close()
+                self._raw_bw = None
+            if self._raw_s3:
+                self._raw_s3.close()
+                self._raw_s3 = None
+            self._cap_bw_path = None
+            self._cap_s3_path = None
+        if bw:
+            self.log_info(f"raw capture stopped: bw={bw} s3={s3}")
+        return bw, s3
+
    def close(self) -> None:
        with self._lock:
            try:
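`start_raw_capture` in the hunk above builds its file names from a timestamp plus a sanitised label. The sketch below isolates that naming rule so it can be checked on its own. `sanitize_label` and `capture_filename` are hypothetical names introduced here for illustration; the timestamp is passed in as a string rather than read from the clock.

```python
def sanitize_label(label: str, max_len: int = 40) -> str:
    """Replace every character other than alphanumerics, '-' and '_' with '_',
    then truncate to max_len characters."""
    return "".join(c if c.isalnum() or c in "-_" else "_" for c in label)[:max_len]


def capture_filename(prefix: str, ts: str, label: str) -> str:
    """Build '<prefix>_<ts>[_<label>].bin'; an empty label adds no suffix."""
    safe = sanitize_label(label) if label else ""
    suffix = f"_{safe}" if safe else ""
    return f"{prefix}_{ts}{suffix}.bin"


name = capture_filename("raw_bw", "20260601_200254", "compliance read")
```

Note that `str.isalnum()` is also true for non-ASCII letters and digits, so such characters pass through unchanged; if strictly ASCII file names are required, an explicit ASCII check would be needed.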
@@ -291,8 +331,18 @@ def forward_loop(
            time.sleep(0.002)


-def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
-    print("[MARK] Type 'm' + Enter to annotate the capture. Ctrl+C to stop.")
+def annotation_loop(logger: SessionLogger, logdir: str, stop: threading.Event) -> None:
+    """
+    Reads stdin commands while the bridge runs.
+
+    Commands:
+      m                      — prompt for a mark label (interactive)
+      CAP_START:<label>      — begin a raw tap capture with the given label
+      CAP_STOP               — stop the current raw tap capture
+    Responses (printed to stdout, parsed by the GUI):
+      [CAP_START] <bw_path>\\t<s3_path>
+      [CAP_STOP]  <bw_path>\\t<s3_path>
+    """
    while not stop.is_set():
        try:
            line = input()
@@ -303,7 +353,21 @@ def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
        if not line:
            continue

-        if line.lower() == "m":
+        if line.startswith("CAP_START:"):
+            label = line[len("CAP_START:"):].strip()
+            bw_path, s3_path = logger.start_raw_capture(label, logdir)
+            print(f"[CAP_START] {bw_path}\t{s3_path}")
+            sys.stdout.flush()
+
+        elif line == "CAP_STOP":
+            bw_path, s3_path = logger.stop_raw_capture()
+            if bw_path:
+                print(f"[CAP_STOP] {bw_path}\t{s3_path}")
+            else:
+                print("[CAP_STOP] no active capture")
+            sys.stdout.flush()
+
+        elif line.lower() == "m":
            try:
                sys.stdout.write("  Label: ")
                sys.stdout.flush()
@@ -315,8 +379,9 @@ def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
                print(f"  [MARK written] {label}")
            else:
                print("  (empty label — mark cancelled)")
+
        else:
-            print("  (type 'm' + Enter to annotate)")
+            print(f"  (unknown command: {line!r})")


 def main() -> int:
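The `annotation_loop` docstring above says the `[CAP_START]` and `[CAP_STOP]` responses are parsed by the GUI, but the GUI-side parser is not shown in this diff. A minimal sketch of what such a parser could look like, assuming the tab-separated format documented in the docstring; `parse_cap_response` is a hypothetical name, not an existing function:

```python
from typing import Optional, Tuple


def parse_cap_response(line: str) -> Optional[Tuple[str, str, str]]:
    """Parse '[CAP_START] <bw_path><TAB><s3_path>' or the CAP_STOP equivalent.

    Returns (kind, bw_path, s3_path), or None for any other line,
    including '[CAP_STOP] no active capture'.
    """
    for kind in ("CAP_START", "CAP_STOP"):
        tag = f"[{kind}]"
        if line.startswith(tag):
            rest = line[len(tag):].strip()
            bw, sep, s3 = rest.partition("\t")
            if not sep:
                return None  # no tab separator, so no path pair to report
            return kind, bw.strip(), s3.strip()
    return None


result = parse_cap_response("[CAP_START] logs/raw_bw_x.bin\tlogs/raw_s3_x.bin")
```

Splitting on the tab rather than on whitespace keeps paths that contain spaces intact, which is presumably why the bridge uses a tab between the two paths.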
@@ -325,8 +390,14 @@ def main() -> int:
    ap.add_argument("--s3", default="COM5", help="S3-side COM port (default: COM5)")
    ap.add_argument("--baud", type=int, default=38400, help="Baud rate (default: 38400)")
    ap.add_argument("--logdir", default=".", help="Directory to write session logs into (default: .)")
-    ap.add_argument("--raw-bw", default=None, help="Optional file to append raw bytes sent from BW->S3 (no headers)")
-    ap.add_argument("--raw-s3", default=None, help="Optional file to append raw bytes sent from S3->BW (no headers)")
+    ap.add_argument("--raw-bw", default="auto",
+                    help="File to append raw bytes sent from BW->S3 (no headers). "
+                         "Default 'auto' generates a timestamped name in --logdir. "
+                         "Pass an empty string to disable.")
+    ap.add_argument("--raw-s3", default="auto",
+                    help="File to append raw bytes sent from S3->BW (no headers). "
+                         "Default 'auto' generates a timestamped name in --logdir. "
+                         "Pass an empty string to disable.")
    ap.add_argument("--quiet", action="store_true", help="No console heartbeat output")
    ap.add_argument("--status-every", type=float, default=0.0, help="Seconds between console heartbeat lines (default: 0 = off)")
    args = ap.parse_args()
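The `--raw-bw` / `--raw-s3` help text above describes three cases: the default `auto`, an explicit path, and an empty string to disable the tap. The sketch below shows one way to resolve those cases and confirms that argparse passes an empty-string value through unchanged. `resolve_raw_path` is a hypothetical helper, and the timestamp is supplied by the caller.

```python
import argparse
import os
from typing import Optional


def resolve_raw_path(value: Optional[str], logdir: str, prefix: str, ts: str) -> Optional[str]:
    """Map a --raw-* argument value to a file path, or None when disabled."""
    if not value:
        return None  # None or "" disables the tap
    if value == "auto":
        return os.path.join(logdir, f"{prefix}_{ts}.bin")
    return value  # explicit path, used verbatim


ap = argparse.ArgumentParser()
ap.add_argument("--raw-bw", default="auto")

default_value = ap.parse_args([]).raw_bw                 # flag omitted
disabled_value = ap.parse_args(["--raw-bw", ""]).raw_bw  # explicit disable
```

Because the default is `auto`, omitting the flag enables capture; a caller that wants no raw tap has to pass the empty string explicitly, as the GUI does when the checkbox is cleared.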
@@ -349,12 +420,16 @@ def main() -> int:
    # If raw tap flags were passed without a path (bare --raw-bw / --raw-s3),
    # or if the sentinel value "auto" is used, generate a timestamped name.
    # If a specific path was provided, use it as-is (caller's responsibility).
-    raw_bw_path = args.raw_bw
-    raw_s3_path = args.raw_s3
-    if raw_bw_path in (None, "", "auto"):
-        raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin") if args.raw_bw is not None else None
-    if raw_s3_path in (None, "", "auto"):
-        raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin") if args.raw_s3 is not None else None
+    # Resolve raw tap paths.
+    # "auto" (default) → timestamped file in logdir (always captured).
+    # Explicit path     → use verbatim.
+    # None or ""        → disabled (pass --raw-bw "" to suppress capture).
+    raw_bw_path: Optional[str] = args.raw_bw if args.raw_bw else None
+    raw_s3_path: Optional[str] = args.raw_s3 if args.raw_s3 else None
+    if raw_bw_path == "auto":
+        raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin")
+    if raw_s3_path == "auto":
+        raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin")

    logger = SessionLogger(log_path, bin_path, raw_bw_path=raw_bw_path, raw_s3_path=raw_s3_path)

@@ -391,7 +466,7 @@ def main() -> int:
    t_ann = threading.Thread(
        target=annotation_loop,
        name="Annotator",
-        args=(logger, stop),
+        args=(logger, args.logdir, stop),
        daemon=True,
    )

@@ -0,0 +1,435 @@
+#!/usr/bin/env python3
+"""
+serial_watch.py — Instantel Series-3 serial monitor with S3 frame parsing.
+
+Taps the RS-232 line between the MiniMate Plus and its modem (RV50/RV55).
+Saves raw binary captures compatible with the rest of the analysis toolchain,
+plus a human-readable frame log.
+
+Usage
+-----
+  python bridges/serial_watch.py                        # interactive COM picker
+  python bridges/serial_watch.py --port COM3            # specify port
+  python bridges/serial_watch.py --port COM3 --ack-ok   # reply OK to AT commands
+                                                         # (useful if modem is absent
+                                                         #  and you want the device to
+                                                         #  proceed past AT negotiation)
+  python bridges/serial_watch.py --list                  # list available ports
+
+Output
+------
+  bridges/captures/serial_<ISO-timestamp>/
+    raw_s3_<ts>.bin    — raw bytes from device (feeds directly into S3FrameParser)
+    session_<ts>.log   — human-readable frame + control-line log
+    session_<ts>.jsonl — JSON-lines frame log
+
+The raw_s3_*.bin file is byte-for-byte compatible with the existing capture
+format used by bridges/parse_capture.py and all analysis scripts.
+
+What to look for in a call-home capture
+----------------------------------------
+1. Does the device talk first after CONNECT, or does it wait?
+   - If raw_s3_*.bin has bytes before any AT/POLL exchange → PUSH protocol
+   - If it stays silent → PULL protocol (same as Blastware manual download)
+
+2. Look for "Operating System" ASCII at the start — the device sends this 16-byte
+   boot string on cold start before entering DLE-framed mode.
+
+3. RING/CONNECT from the modem appear as ASCII before the DLE frames — the parser
+   handles these automatically (scans forward to DLE+STX).
+"""
+
+from __future__ import annotations
+
+import argparse
+import sys
+import threading
+import time
+from datetime import datetime
+from pathlib import Path
+
+try:
+    import serial
+    from serial.tools import list_ports
+except ModuleNotFoundError:
+    print(
+        "pyserial not found. Install with:\n  python -m pip install pyserial",
+        file=sys.stderr,
+    )
+    sys.exit(1)
+
+# Add project root so we can import the frame parser
+sys.path.insert(0, str(Path(__file__).parent.parent))
+from minimateplus.framing import S3FrameParser, S3Frame
+
+import json
+
+
+# ── Helpers ───────────────────────────────────────────────────────────────────
+
+def _ts() -> str:
+    return datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
+
+
+def _hexdump(b: bytes) -> str:
+    return " ".join(f"{x:02X}" for x in b)
+
+
+def _printable(b: bytes) -> str:
+    return b.decode("latin1", errors="replace")
+
+
+_KNOWN_SUBS = {
+    0xA4: "POLL_RSP",        0xA5: "BULK_WAVEFORM_RSP", 0xE0: "ADVANCE_EVENT_RSP",
+    0xE1: "EVENT_IDX_FIRST_RSP", 0xE3: "MONITOR_STATUS_RSP", 0xEA: "SERIAL_NUM_RSP",
+    0xF3: "WAVEFORM_RECORD_RSP", 0xF5: "WAVEFORM_HEADER_RSP", 0xF7: "EVENT_INDEX_RSP",
+    0xF9: "UNK_06_RSP",      0xFE: "DEVICE_INFO_RSP",
+    0x69: "START_MONITOR_ACK", 0x68: "STOP_MONITOR_ACK",
+    0x97: "EVT_IDX_WRITE_ACK", 0x8C: "CONFIRM_B_ACK",  0x8E: "COMPLIANCE_WRITE_ACK",
+    0x8D: "CONFIRM_A_ACK",   0x7D: "TRIGGER_WRITE_ACK", 0x7C: "TRIGGER_CONFIRM_ACK",
+    0x96: "WAVEFORM_WRITE_ACK", 0x8B: "CONFIRM_C_ACK",
+}
+
+
+def _label_frame(frame: S3Frame) -> str:
+    name = _KNOWN_SUBS.get(frame.sub, f"UNK_0x{frame.sub:02X}")
+    chk  = "✓" if frame.checksum_valid else "✗ BAD_CHK"
+    peek = frame.data[:24].hex() + ("…" if len(frame.data) > 24 else "")
+    return (
+        f"S3 SUB=0x{frame.sub:02X} ({name:<22})  "
+        f"page=0x{frame.page_key:04X}  data={len(frame.data):4d}B  {chk}  {peek}"
+    )
+
+
+# ── Logger ────────────────────────────────────────────────────────────────────
+
+class Logger:
+    def __init__(self, log_path: Path, jsonl_path: Path, raw_path: Path) -> None:
+        self._log  = log_path.open("a", encoding="utf-8", newline="")
+        self._jl   = jsonl_path.open("a", encoding="utf-8", newline="")
+        self._raw  = raw_path.open("ab")
+        self._lock = threading.Lock()
+        self._frame_count = 0
+
+    def info(self, msg: str) -> None:
+        line = f"[{_ts()}] INFO  | {msg}"
+        with self._lock:
+            print(line)
+            print(line, file=self._log, flush=True)
+
+    def ctrl(self, msg: str) -> None:
+        line = f"[{_ts()}] CTRL  | {msg}"
+        with self._lock:
+            print(line)
+            print(line, file=self._log, flush=True)
+
+    def data_hex(self, msg: str) -> None:
+        line = f"[{_ts()}] HEX   | {msg}"
+        with self._lock:
+            print(line)
+            print(line, file=self._log, flush=True)
+
+    def data_ascii(self, msg: str) -> None:
+        line = f"[{_ts()}] DATA  | {msg}"
+        with self._lock:
+            print(line)
+            print(line, file=self._log, flush=True)
+
+    def frame(self, f: S3Frame) -> None:
+        with self._lock:
+            self._frame_count += 1
+            label = f"[{_ts()}] FRAME | #{self._frame_count:04d}  {_label_frame(f)}"
+            print(label)
+            print(label, file=self._log, flush=True)
+            record = {
+                "frame": self._frame_count,
+                "sub": f.sub,
+                "page_key": f.page_key,
+                "data_len": len(f.data),
+                "data_hex": f.data.hex(),
+                "checksum_valid": f.checksum_valid,
+            }
+            print(json.dumps(record), file=self._jl, flush=True)
+
+    def write_raw(self, data: bytes) -> None:
+        with self._lock:
+            self._raw.write(data)
+            self._raw.flush()
+
+    def close(self) -> None:
+        with self._lock:
+            for fh in (self._log, self._jl, self._raw):
+                try:
+                    fh.flush()
+                    fh.close()
+                except Exception:
+                    pass
+
+
+# ── Control-line monitor thread ───────────────────────────────────────────────
+
+def _monitor_control_lines(
+    ser: serial.Serial,
+    logger: Logger,
+    stop: threading.Event,
+    interval: float,
+) -> None:
+    prev = dict(CTS=None, DSR=None, DCD=None, RI=None)
+    try:
+        prev.update(CTS=ser.cts, DSR=ser.dsr, DCD=ser.cd)
+        try:
+            prev["RI"] = ser.ri
+        except Exception:
+            pass
+    except Exception as exc:
+        logger.ctrl(f"Init error: {exc}")
+        return
+
+    logger.ctrl(
+        f"Initial: CTS={prev['CTS']} DSR={prev['DSR']} DCD={prev['DCD']} RI={prev['RI']}"
+    )
+    while not stop.is_set():
+        try:
+            cur = dict(CTS=ser.cts, DSR=ser.dsr, DCD=ser.cd, RI=None)
+            try:
+                cur["RI"] = ser.ri
+            except Exception:
+                pass
+            for name, val in cur.items():
+                if val != prev[name]:
+                    logger.ctrl(f"{name} → {val}")
+                    prev[name] = val
+        except serial.SerialException as exc:
+            logger.ctrl(f"Poll error: {exc}")
+            break
+        stop.wait(interval)
+
+
+# ── Serial open ───────────────────────────────────────────────────────────────
+
+_PARITY = {
+    "N": serial.PARITY_NONE, "E": serial.PARITY_EVEN, "O": serial.PARITY_ODD,
+    "M": serial.PARITY_MARK, "S": serial.PARITY_SPACE,
+}
+_STOPBITS = {
+    1: serial.STOPBITS_ONE, 1.5: serial.STOPBITS_ONE_POINT_FIVE, 2: serial.STOPBITS_TWO,
+}
+
+
+def _open_serial(args: argparse.Namespace, logger: Logger) -> serial.Serial | None:
+    for attempt in range(1, args.open_retries + 2):
+        logger.info(
+            f"Opening {args.port} @ {args.baud},{args.bytesize}{args.parity}{args.stopbits} "
+            f"rtscts={args.rtscts} xonxoff={args.xonxoff} dsrdtr={args.dsrdtr} "
+            f"(attempt {attempt})"
+        )
+        try:
+            ser = serial.Serial(
+                port=args.port,
+                baudrate=args.baud,
+                bytesize=args.bytesize,
+                parity=_PARITY[args.parity],
+                stopbits=_STOPBITS[args.stopbits],
+                timeout=args.timeout,
+                xonxoff=args.xonxoff,
+                rtscts=args.rtscts,
+                dsrdtr=args.dsrdtr,
+                write_timeout=0,
+            )
+            try:
+                # pyserial 3.x deprecates setDTR()/setRTS() in favour of properties.
+                ser.dtr = args.dtr == "on"
+                ser.rts = args.rts == "on"
+                logger.ctrl(f"Set DTR={args.dtr} RTS={args.rts}")
+            except Exception as exc:
+                logger.ctrl(f"DTR/RTS set failed: {exc}")
+
+            if args.send_break > 0:
+                try:
+                    ser.break_condition = True
+                    time.sleep(args.send_break / 1000.0)
+                    ser.break_condition = False
+                    logger.ctrl(f"BREAK held {args.send_break} ms")
+                except Exception as exc:
+                    logger.ctrl(f"BREAK failed: {exc}")
+
+            return ser
+
+        except serial.SerialException as exc:
+            logger.info(f"Open failed: {exc}")
+            if attempt <= args.open_retries:
+                time.sleep(args.open_retry_delay)
+
+    return None
+
+
+# ── Port picker ───────────────────────────────────────────────────────────────
+
+def _list_ports() -> list:
+    ports = list(list_ports.comports())
+    if not ports:
+        print("No serial ports found.")
+        return []
+    print("Available serial ports:")
+    for i, p in enumerate(ports, 1):
+        print(f"  {i:2d})  {p.device:<12}  {p.description or ''}")
+    return ports
+
+
+def _pick_port() -> str:
+    ports = _list_ports()
+    if not ports:
+        sys.exit(1)
+    if len(ports) == 1:
+        print(f"Auto-selecting: {ports[0].device}")
+        return ports[0].device
+    while True:
+        sel = input("Select port (number or name, e.g. COM3): ").strip()
+        if sel.isdigit() and 1 <= int(sel) <= len(ports):
+            return ports[int(sel) - 1].device
+        for p in ports:
+            if p.device.upper() == sel.upper():
+                return p.device
+        print("Not recognised. Enter list number or exact port name.")
+
+
+# ── Main loop ─────────────────────────────────────────────────────────────────
+
+def main() -> None:
+    ap = argparse.ArgumentParser(
+        description="Monitor Instantel Series-3 serial traffic with S3 frame parsing."
+    )
+    ap.add_argument("--port", "-p",
+                    help="COM port (e.g. COM3). Omit to be prompted.")
+    ap.add_argument("--baud", "-b", type=int, default=38400)
+    ap.add_argument("--bytesize", type=int, choices=[5, 6, 7, 8], default=8)
+    ap.add_argument("--parity", choices=["N", "E", "O", "M", "S"], default="N")
+    ap.add_argument("--stopbits", type=float, choices=[1, 1.5, 2], default=1)
+    ap.add_argument("--rtscts", action="store_true")
+    ap.add_argument("--xonxoff", action="store_true")
+    ap.add_argument("--dsrdtr", action="store_true")
+    ap.add_argument("--dtr", choices=["on", "off"], default="on")
+    ap.add_argument("--rts", choices=["on", "off"], default="on")
+    ap.add_argument("--send-break", type=int, default=0,
+                    help="Hold BREAK for N ms after open.")
+    ap.add_argument("--show", choices=["ascii", "hex", "both", "frames"],
+                    default="frames",
+                    help="'frames' (default) shows only parsed S3 frames. "
+                         "'ascii'/'hex'/'both' also show raw bytes.")
+    ap.add_argument("--encoding", default="latin1")
+    ap.add_argument("--read-chunk", type=int, default=4096)
+    ap.add_argument("--timeout", type=float, default=0.05)
+    ap.add_argument("--poll-lines-interval", type=float, default=0.2)
+    ap.add_argument("--open-retries", type=int, default=0)
+    ap.add_argument("--open-retry-delay", type=float, default=0.8)
+    ap.add_argument("--ack-ok", action="store_true",
+                    help="Auto-reply OK to AT* commands (except ATDT). "
+                         "Useful for testing without a real modem.")
+    ap.add_argument("--list", action="store_true",
+                    help="List available serial ports and exit.")
+    args = ap.parse_args()
+
+    if args.list:
+        _list_ports()
+        return
+
+    args.port = args.port or _pick_port()
+
+    # Build output paths
+    ts_str = datetime.now().strftime("%Y%m%d_%H%M%S")
+    out_dir = Path(__file__).parent / "captures" / f"serial_{ts_str}"
+    out_dir.mkdir(parents=True, exist_ok=True)
+
+    log_path   = out_dir / f"session_{ts_str}.log"
+    jsonl_path = out_dir / f"session_{ts_str}.jsonl"
+    raw_path   = out_dir / f"raw_s3_{ts_str}.bin"
+
+    logger = Logger(log_path, jsonl_path, raw_path)
+    logger.info(f"Output directory: {out_dir}")
+    logger.info(f"raw_s3 → {raw_path.name}  (compatible with parse_capture.py)")
+
+    ser = _open_serial(args, logger)
+    if ser is None:
+        logger.info("Could not open serial port. Exiting.")
+        logger.close()
+        sys.exit(1)
+
+    s3_parser = S3FrameParser()
+    rx_buf    = bytearray()
+    stop_evt  = threading.Event()
+
+    ctrl_thread = threading.Thread(
+        target=_monitor_control_lines,
+        args=(ser, logger, stop_evt, args.poll_lines_interval),
+        daemon=True,
+    )
+    ctrl_thread.start()
+    logger.info("Monitoring started. Waiting for call-home. Press Ctrl+C to stop.")
+
+    try:
+        while True:
+            try:
+                data = ser.read(args.read_chunk)
+            except serial.SerialException as exc:
+                logger.info(f"Read error: {exc}")
+                break
+
+            if not data:
+                continue
+
+            # 1. Save raw bytes
+            logger.write_raw(data)
+
+            # 2. Optional raw display
+            if args.show in ("ascii", "both"):
+                txt = _printable(data)
+                for line in txt.splitlines():
+                    logger.data_ascii(line)
+            if args.show in ("hex", "both"):
+                logger.data_hex(_hexdump(data))
+
+            # 3. Parse S3 frames
+            for byte in data:
+                result = s3_parser.feed(bytes([byte]))
+                if result:
+                    frames = result if isinstance(result, list) else [result]
+                    for f in frames:
+                        logger.frame(f)
+
+            # 4. AT command handling for --ack-ok
+            if args.ack_ok:
+                rx_buf.extend(data)
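+                while True:
+                    # Split at whichever terminator (CR or LF) comes first, so a
+                    # buffer such as b"AT\nATZ\r" yields two lines, not one.
+                    hits = [i for i in (rx_buf.find(b"\r"), rx_buf.find(b"\n")) if i != -1]
+                    if not hits:
+                        break
+                    idx = min(hits)
+                    line_bytes = bytes(rx_buf[:idx])
+                    del rx_buf[:idx + 1]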
+
+                    line_str = line_bytes.decode("latin1", errors="ignore").strip().upper()
+                    if line_str.startswith("AT") and not line_str.startswith("ATDT"):
+                        try:
+                            ser.write(b"\r\nOK\r\n")
+                            ser.flush()
+                            logger.info(f"AT ack: {line_str!r} → OK")
+                        except Exception as exc:
+                            logger.info(f"AT ack write failed: {exc}")
+
+    except KeyboardInterrupt:
+        logger.info("Ctrl+C — stopping.")
+
+    finally:
+        stop_evt.set()
+        try:
+            ser.close()
+        except Exception:
+            pass
+        ctrl_thread.join(timeout=1.0)
+        logger.info(f"Capture saved to: {out_dir}")
+        logger.close()
+
+
+if __name__ == "__main__":
+    main()
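
The `session_<ts>.jsonl` file written by `Logger.frame()` can be read back for offline analysis. The following is a minimal sketch, assuming only the record keys that method emits; the `load_frames` helper and the sample record are hypothetical and not part of the repo.

```python
import json
from typing import Iterable, Iterator


def load_frames(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line, adding the payload as bytes."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # data_hex is the frame payload as a hex string; decode it once here.
        rec["data"] = bytes.fromhex(rec["data_hex"])
        yield rec


# Hypothetical record in the shape Logger.frame() writes (sub 164 == 0xA4).
sample = ('{"frame": 1, "sub": 164, "page_key": 0, "data_len": 2, '
          '"data_hex": "0a1e", "checksum_valid": true}')
frames = list(load_frames([sample, ""]))
bad_frames = [f["frame"] for f in frames if not f["checksum_valid"]]
```

In practice the lines would come from iterating over the opened `.jsonl` file; filtering on `checksum_valid` is one quick way to spot corrupted frames in a capture.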
@@ -0,0 +1,185 @@
+# Histogram body codec — FULLY DECODED (2026-05-20)
+
+Clean working status doc for the MiniMate Plus histogram-mode event
+body codec.  Companion to `waveform_codec_re_status.md`.  The deep
+historical record (with retractions and dated analyses) lives in
+`docs/instantel_protocol_reference.md §7.6.2`; the authoritative
+implementation lives in `minimateplus/histogram_codec.py`.
+
+## TL;DR
+
+**The codec is fully decoded.**  Every field of every block in the
+in-repo histogram fixture corpus decodes byte-exact against BW's
+ASCII export.
+
+26 regression tests pass against ~3,500 blocks across 5 in-repo
+fixtures, plus a synthetic regression block taken from a real
+BE9558 prod event to lock in the uint8-peak interpretation.
+
+**Important correction (2026-05-21):** the per-channel peak count
+is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
+etc.  The N844 fixture corpus the original RE was done against has
+zero values in bytes [7]/[11]/[15]/[19] for every block, so the
+two interpretations happened to be equivalent.  Cross-correlating
+non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
+against BW's per-interval ASCII export (4 channels × ~1400 blocks
+per event × multiple events) gives a 100% byte-exact match only
+when the peak is read as uint8.  Reading it as uint16 LE produced
+peaks up to 268 in/s per channel and 35× inflated PVS sums when
+first deployed to prod (rolled back, root-caused, and fixed in
+commit 7183b95+1).
+
+## Body format
+
+```
+body = [stream of 32-byte data blocks] + [small trailing remnant]
+```
+
+Each block represents one histogram interval.  Block layout:
+
+```
+[0]    0x00                      always-zero tag
+[1]    segment_id (uint8)        0x00..0x03 — 256 blocks per segment
+[2:4]  block_ctr (uint16 LE)     resets each segment (0x0100, 0x0101, …)
+[4:6]  0x000a (uint16 LE)        constant marker (= 10)
+[6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s at Normal,
+                                  max 1.275 in/s — fits in uint8)
+[7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                  or unmeasurable freq; meaning not fully RE'd
+[8:10] T_halfperiod   uint16 LE  Tran half-period in samples
+                                  (freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
+[10]   V_peak_count   uint8      Vert peak
+[11]   V_annotation   uint8
+[12:14] V_halfperiod  uint16 LE  Vert freq half-period
+[14]   L_peak_count   uint8      Long peak
+[15]   L_annotation   uint8
+[16:18] L_halfperiod  uint16 LE  Long freq half-period
+[18]   M_peak_count   uint8      MicL peak count
+                                  (dB via waveform_codec.mic_count_to_db)
+[19]   M_annotation   uint8
+[20:22] M_halfperiod  uint16 LE  MicL freq half-period
+[22:24] 0x00 0x00                constant
+[24:28] 4-byte variable          purpose unknown — possibly CRC,
+                                  timestamp delta, or psi(L) numeric;
+                                  not needed for waveform reconstruction
+[28:32] 0x1e 0x0a 0x00 0x00      constant block-end signature
+```
+
+Reliable block-identification anchor:
+```python
+block[22:24] == b"\x00\x00" and block[28:32] == b"\x1e\x0a\x00\x00"
+```
+(The `1e 0a 00 00` constant tail is the most distinctive signature.)
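
The block layout and anchor check above can be sketched with `struct`. This is an illustration under the stated layout, not the code in `minimateplus/histogram_codec.py`; `HistBlock`, `parse_block`, and the demo block (including its `block_ctr` value and zeroed bytes 24:28) are hypothetical.

```python
import struct
from typing import NamedTuple, Optional

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"


class HistBlock(NamedTuple):
    segment_id: int
    block_ctr: int
    peaks: tuple        # (Tran, Vert, Long, MicL) uint8 peak counts
    annotations: tuple  # per-channel annotation bytes
    halfperiods: tuple  # per-channel uint16 LE half-periods


def parse_block(block: bytes) -> Optional[HistBlock]:
    """Return the decoded fields, or None if the anchor bytes do not match."""
    if len(block) != BLOCK_SIZE:
        return None
    if block[22:24] != b"\x00\x00" or block[28:32] != BLOCK_TAIL:
        return None
    segment_id = block[1]
    (block_ctr,) = struct.unpack_from("<H", block, 2)
    # Four (uint8 peak, uint8 annotation, uint16 LE halfperiod) groups from byte 6.
    fields = struct.unpack_from("<" + "BBH" * 4, block, 6)
    return HistBlock(segment_id, block_ctr, fields[0::3], fields[1::3], fields[2::3])


# Synthetic block carrying the peak/half-period values of the N844L6Z8 example.
demo = (
    bytes([0x00, 0x00]) + struct.pack("<H", 130) + struct.pack("<H", 10)
    + struct.pack("<BBH", 6, 0, 24) + struct.pack("<BBH", 4, 0, 18)
    + struct.pack("<BBH", 5, 0, 21) + struct.pack("<BBH", 5, 0, 9)
    + b"\x00\x00" + b"\x00\x00\x00\x00" + BLOCK_TAIL
)
parsed = parse_block(demo)
```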
+
+## Per-channel encoding
+
+| Channel | Peak encoding | Frequency encoding |
+|---|---|---|
+| Tran | count × 0.005 = in/s at Normal range | `freq_Hz = 512 / halfperiod` |
+| Vert | same | same |
+| Long | same | same |
+| MicL | count → dB via `mic_count_to_db(count)` (same formula as waveform codec) | same |
+
+**`>100 Hz` sentinel**: when halfperiod ≤ 5 (giving ≥100 Hz from the
+512/halfp formula), BW displays `>100 Hz`.  Codec's `half_period_to_hz`
+returns `None` in this range.
+
+## Verified facts (cross-checked against fixture corpus)
+
+Example: N844L6Z8.ZR0H block 130 → all 8 decoded fields byte-exact:
+
+```
+binary samples [10, 6, 24, 4, 18, 5, 21, 5, 9]
+TXT row        [0.030, 21, 0.020, 28, 0.025, 24, 0.040, 0.000, 95.92, 57]
+
+slot[0] = 10                                  marker
+slot[1] = 6  × 0.005 = 0.030 in/s         ✓ T_peak
+slot[2] = 24 → 512/24 = 21.3 → 21 Hz      ✓ T_freq
+slot[3] = 4  × 0.005 = 0.020 in/s         ✓ V_peak
+slot[4] = 18 → 512/18 = 28.4 → 28 Hz      ✓ V_freq
+slot[5] = 5  × 0.005 = 0.025 in/s         ✓ L_peak
+slot[6] = 21 → 512/21 = 24.4 → 24 Hz      ✓ L_freq
+slot[7] = 5  → 81.94 + 20·log10(5) = 95.92 dB  ✓ M_peak
+slot[8] = 9  → 512/9 = 56.9 → 57 Hz       ✓ M_freq
+```
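
The slot arithmetic in this worked example can be reproduced in a few lines. A minimal sketch, assuming the formulas quoted in this doc (0.005 in/s per count, `512 / halfperiod`, and `81.94 + 20·log10(count)`); the function names are illustrative and are not the codec's API, and the handling of a zero mic count is an assumption.

```python
import math
from typing import Optional

GEO_LSB_IN_S = 0.005   # in/s per count at Normal range
MIC_DB_OFFSET = 81.94  # dB constant taken from the worked example


def geo_peak_in_s(count: int) -> float:
    return count * GEO_LSB_IN_S


def halfperiod_to_hz(halfp: int) -> Optional[float]:
    """None stands in for the '>100 Hz' sentinel (halfp <= 5)."""
    if halfp <= 5:
        return None
    return 512.0 / halfp


def mic_db(count: int) -> Optional[float]:
    """dB from a mic peak count; None for count 0 (assumption: log undefined)."""
    if count <= 0:
        return None
    return MIC_DB_OFFSET + 20.0 * math.log10(count)


slots = [10, 6, 24, 4, 18, 5, 21, 5, 9]  # slot[0] is the constant marker
row = [
    round(geo_peak_in_s(slots[1]), 3), round(halfperiod_to_hz(slots[2])),
    round(geo_peak_in_s(slots[3]), 3), round(halfperiod_to_hz(slots[4])),
    round(geo_peak_in_s(slots[5]), 3), round(halfperiod_to_hz(slots[6])),
    round(mic_db(slots[7]), 2), round(halfperiod_to_hz(slots[8])),
]
```

`row` holds the eight decoded fields in slot order, which can be compared against the matching TXT columns (the TXT row also carries PVS and psi columns that are not stored in the block).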
+
+## Verified test coverage
+
+`tests/test_histogram_codec.py` (24 tests):
+
+- Block walking: yields one record per `.TXT` interval ± 1 (off-by-one
+  at the tail when recording was stopped mid-write).  Segment-ID
+  groups of 256 blocks confirmed.
+- Geo peaks: every block of N844L20G, N844L6Z8, N844L6XE, N844L23B
+  matches `.TXT` within the 0.0005 in/s quantization step.
+- Geo freqs: every block of N844L6Z8 and N844L6XE matches `.TXT`
+  within 1 Hz (BW display rounds).  `>100 Hz` sentinel handled correctly.
+- Mic dB: every block of N844L6XE, N844L23B, N844L6Z8 matches `.TXT`
+  within 0.1 dB (BW display precision).
+- Mic freq: matches `.TXT` within 1 Hz across active blocks.
+
+## What's NOT yet decoded
+
+- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**.  Empirically
+  non-zero on intervals where the per-channel ZC frequency comes
+  out as `N/A` or sub-Hz (`<1.0`, `1.X`).  Hypothesis tested in the
+  RE session: byte != 0 ↔ sub-Hz freq.  Only ~50% correlation
+  across the K558 corpus, so the relationship is more complex.
+  Possibilities: time-of-peak-within-interval, halfp extension for
+  very-long-period signals, or a debug/diagnostic field the firmware
+  writes opportunistically.  Doesn't affect peak amplitudes or
+  waveform reconstruction.  Captured as `record["annotations"]` for
+  future RE.
+- **4-byte variable metadata field (bytes 24:28)**.  Not needed for
+  waveform reconstruction.  Speculation: per-block CRC, sub-second
+  timestamp offset, or a Mic psi(L) count not in the 9 samples.
+  Punt until something needs it.
+- **Geo PVS (TXT col 7, e.g. "0.040 in/s")**.  Not stored in the
+  block; can be approximated as `sqrt(T_peak² + V_peak² + L_peak²)`
+  but BW's value sometimes differs slightly (probably computed from
+  waveform-instant samples, not from per-channel peaks).  Punt — the
+  `.h5` consumers don't need PVS as a sample channel.
+- **Mic psi(L) value (TXT col 8)**.  TXT shows it as a small psi value
+  derived from the dB measurement.  Not in the 9 samples.  Could be
+  derived from `M_peak_count` via the inverse of the dB formula plus
+  a psi calibration constant.  Defer.
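
The PVS approximation mentioned above is a plain vector sum of the per-channel peaks. A small sketch, using the block 130 values from the worked example; `approx_pvs` is a hypothetical helper name.

```python
import math


def approx_pvs(t_peak: float, v_peak: float, l_peak: float) -> float:
    """Vector sum of per-channel peaks, in the same units as the inputs."""
    return math.sqrt(t_peak ** 2 + v_peak ** 2 + l_peak ** 2)


# Block 130 peaks (in/s) from the N844L6Z8 example.
pvs = approx_pvs(0.030, 0.020, 0.025)
```

For that block the approximation comes out near 0.044 in/s while the TXT row reports 0.040 in/s. Because the three channel peaks need not occur at the same instant, the vector sum of peaks can only equal or exceed a PVS computed from simultaneous samples, which is consistent with the discrepancy noted above.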
+
+## Output shape
+
+`decode_histogram_body` returns the standard 4-channel dict that
+mirrors `waveform_codec.decode_waveform_v2`'s output:
+
+```python
+{
+    "Tran": [peak_count_per_interval, ...],   # 16-count units (LSB = 0.005 in/s)
+    "Vert": [..., ...],
+    "Long": [..., ...],
+    "MicL": [..., ...],                       # raw ADC counts
+}
+```
+
+Run through `waveform_codec.decoded_to_adc_counts` to get 1-count ADC
+units (geo ×16, mic passthrough) for the standard `.h5` writer.
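
The scaling step described here (geo channels ×16, mic unchanged) is simple enough to sketch. This is an illustration of the stated rule only, not the body of `waveform_codec.decoded_to_adc_counts`; the function name below is hypothetical.

```python
from typing import Dict, List

GEO_CHANNELS = ("Tran", "Vert", "Long")


def to_adc_counts_sketch(decoded: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Scale geo channels from 16-count units to 1-count units; pass mic through."""
    out = {}
    for name, samples in decoded.items():
        scale = 16 if name in GEO_CHANNELS else 1
        out[name] = [s * scale for s in samples]
    return out


adc = to_adc_counts_sketch({"Tran": [6], "Vert": [4], "Long": [5], "MicL": [5]})
```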
+
+For the full per-interval record with frequencies + metadata, use
+`decode_histogram_body_full()`.
+
+## Where it's wired
+
+- `minimateplus/event_file_io.py:read_blastware_file()` — first tries
+  the waveform codec, falls back to the histogram codec when the
+  waveform preamble isn't present.  Same output shape, same
+  downstream pipeline.
+- `scripts/backfill_sidecars.py` — the `has_samples` short-circuit
+  added during the histogram-codec-pending era still serves as a
+  defensive guard against truly undecodable files, but no longer
+  fires for valid histograms.
+
+## Companion reference
+
+- `docs/waveform_codec_re_status.md` — sibling status doc for the
+  much-more-complex waveform-mode codec.
+- `docs/instantel_protocol_reference.md §7.6.2` — historical
+  protocol-reference entry.  Structural framing matches what we
+  found; per-sample semantics were less documented than the `✅
+  CONFIRMED` badge suggested.  This doc supersedes §7.6.2 where they
+  conflict on confidence level.
@@ -0,0 +1,341 @@
+# IDF Protocol Reference — Thor / Micromate Series IV
+
+Starting-point reference for reverse-engineering Instantel's Micromate
+Series IV event-file format.  Sibling to
+[instantel_protocol_reference.md](instantel_protocol_reference.md) (the
+Series III "Rosetta Stone") — this doc holds what we know so far and
+the open questions still to crack.
+
+**Status (2026-05-28):** ASCII text sidecar fully decoded (1,014
+sample files round-trip).  **Thor IDFW** binary now decodes via
+`micromate.idf_file.read_idf_file()` — reuses the BW segment-rotated
+block codec verbatim at fixed body offset `0x0f1f`; metadata (serial,
+timestamp, sample_rate, record_time, calibration_date) extracted from
+the binary header.  Sample fidelity is 87–99% byte-exact on quiet
+events; loud events hit the BW codec's known walker-stops-early
+limitation.  Residual ~3% drift on per-sample deltas (likely a
+Thor-specific 12-bit delta refinement not yet modelled).
+
+**Thor IDFH histograms also decoded.**  Body has one or more segments;
+each 12-byte segment header `[length_be 2B][0a 00 00 00][00 NN][05 3f]`
+introduces `N = (length - 10) // 72` interval records of 72 bytes
+each.  Each interval = 4 × 16-byte per-channel records:
+`[int16 min][int16 max][int16 ??][uint16 halfp][2B 00][uint16 ??][2B 00][uint16 ??]`.
+Geo peak `= max(|min|, |max|) / 32768 × 10` in/s (matches the sidecar
+to within ~1.8%); freq `= 512 / halfp` Hz (None for halfp ≤ 5 → ">100"
+sentinel).  Corpus: **all 859 Thor IDFH files decode, 181,071
+intervals**.  Wired through `read_idf_file()` →
+`save_imported_idf()` → sidecar's `extensions.idf_intervals`.
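
The interval-count and geo-peak formulas in this paragraph reduce to two small functions. A minimal sketch of the arithmetic only; it does not parse the binary records (field endianness is not stated here), and the function names and sample values are hypothetical.

```python
INTERVAL_SIZE = 72       # bytes per interval record
FULL_SCALE_IN_S = 10.0   # in/s at the int16 full-scale magnitude of 32768


def interval_count(segment_length: int) -> int:
    """Number of 72-byte interval records announced by a segment header."""
    return (segment_length - 10) // INTERVAL_SIZE


def idfh_geo_peak_in_s(min_count: int, max_count: int) -> float:
    """Peak velocity from the per-channel int16 min/max pair."""
    return max(abs(min_count), abs(max_count)) / 32768 * FULL_SCALE_IN_S


n = interval_count(10 + 250 * INTERVAL_SIZE)  # hypothetical single-segment length
peak = idfh_geo_peak_in_s(-100, 250)          # hypothetical min/max counts
```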
+
+**Note on the BE9439 outliers in the example corpus:** Two files
+(`BE9439_20200713131747.IDFW` and `BE9439_20200713124251.IDFH`) are
+**Series III Blastware** binaries, not Thor.  Provenance: TMI tried
+to use Thor to manage auto-call-homes for Series III units; the
+experiment didn't work out, but it did leave a few BW event files
+in Thor's per-serial directory structure with `.IDFW`/`.IDFH`
+extensions — Thor's forwarder applied its own naming convention to
+the BW bodies it was relaying.  Their header `10 00 01 80 00 00
+Instantel STRT ff fe <end_key> <start_key>` is the BW SUB 5A STRT
+record, not a Thor body preamble.  The reader detects them by
+signature and raises `NotImplementedError` pointing callers at
+`read_blastware_file()`, which extracts BW-format peaks from them.
+
+**Still NYI for Thor IDFH:** per-channel `int16 field4` (possibly
+time-of-peak); the two uint16 fields (probably PVS contributions);
+8-byte interval tail (PVS data); mic dB(L) exact conversion constant.
+
+### Codec breakthroughs (2026-05-28)
+
+- **Body offset is a fixed `0x0f1f`** across 151/154 corpus IDFW
+  files.  Preceded by a 4-byte record-type marker (`46 00 00 00`)
+  + magic preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`.
+- **Sample stream is BW's segment-rotated block codec verbatim.**
+  Thor reuses `10 NN` (nibble), `20 NN` (int8), `00 NN` (RLE),
+  `30 NN` (packed12), `40 02` (segment header) tags with the same
+  semantics.  Channel rotation Tran→Vert→Long→MicL.
+- **Geo LSB = 0.0003 in/s** (not BW's 0.005), because Thor's 16-bit
+  ADC range maps to 10 in/s without the 16-count BW quantization step.
+- **Mic ≈ 2.14×10⁻⁶ psi/count** (rough scale; refine after channel
+  block calibration constants are decoded).
+- **BW compliance anchor `\xbe\x80\x00\x00\x00\x00` reappears at
+  IDFW offset 0x952** — sample_rate at anchor−6 (uint16 BE),
+  record_time at anchor+6 (float32 BE), same layout as BW.
+- **Event timestamp at offset 0x97A** — 8 bytes `[day][month]
+  [year_be][unk][hour][min][sec]`.  Stop-time mirrors at 0x982.
+- **Serial as null-terminated ASCII at 0x14E**.
+- **Calibration date** at 0x194–0x197 (day, month, year_be).
+- Per-sample residual drift of ~3% suggests Thor encodes int8/nibble
+  deltas with an extra refinement bit that BW doesn't carry —
+  unsolved; errors resync within a few samples so cumulative impact
+  is small.
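
Two of the header fields listed above can be pulled out with `struct`. A sketch under stated assumptions: "anchor−6" and "anchor+6" are taken relative to the first byte of the anchor, the function names are hypothetical, and the buffer below is synthetic with made-up sample rate, record time, and timestamp values.

```python
import struct
from datetime import datetime

COMPLIANCE_ANCHOR = b"\xbe\x80\x00\x00\x00\x00"
TIMESTAMP_OFFSET = 0x97A


def read_rate_and_record_time(buf: bytes):
    """Return (sample_rate, record_time) located relative to the anchor."""
    pos = buf.find(COMPLIANCE_ANCHOR)
    if pos < 6:  # also covers find() returning -1
        raise ValueError("compliance anchor not found")
    (sample_rate,) = struct.unpack_from(">H", buf, pos - 6)
    (record_time,) = struct.unpack_from(">f", buf, pos + 6)
    return sample_rate, record_time


def read_event_timestamp(buf: bytes, offset: int = TIMESTAMP_OFFSET) -> datetime:
    """Decode [day][month][year_be][unk][hour][min][sec] (8 bytes)."""
    day, month, year, _unk, hour, minute, second = struct.unpack_from(
        ">BBHBBBB", buf, offset
    )
    return datetime(year, month, day, hour, minute, second)


# Synthetic header: all zeros except the fields under test.
buf = bytearray(0x990)
buf[0x952:0x958] = COMPLIANCE_ANCHOR
struct.pack_into(">H", buf, 0x952 - 6, 1024)
struct.pack_into(">f", buf, 0x952 + 6, 2.0)
struct.pack_into(">BBHBBBB", buf, TIMESTAMP_OFFSET, 4, 8, 2025, 0, 19, 0, 47)

rate, rec_time = read_rate_and_record_time(bytes(buf))
stamp = read_event_timestamp(bytes(buf))
```

Searching for the anchor rather than hard-coding 0x952 mirrors how the anchor is described for BW files; a real reader might prefer the fixed offset if the anchor bytes can also occur earlier in the header.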
+
+---
+
+## File model
+
+### Filename convention
+
+```
+<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>
+```
+
+- **SERIAL** — literal device serial, two-letter prefix + numeric
+  suffix.  Examples seen: `UM11719`, `UM13981`, `UM20147`, `BE9439`.
+  Unlike Series III BW filenames (`M529LK44.AB0`, base-36 stem),
+  Series IV filenames carry the serial in plain text.
+- **YYYYMMDDHHMMSS** — 14-char ASCII timestamp in **device local
+  time** (no timezone marker).
+- **KIND** — `IDFH` for histograms, `IDFW` for waveforms.
+
+The `.IDFH.txt` / `.IDFW.txt` ASCII sidecar lives in a `TXT/`
+**subfolder** of the unit's directory, not alongside the binary.
+This pairing convention is encoded in
+`event_forwarder.idf_report_path()`.
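
The filename convention and the `TXT/` pairing rule can be sketched as follows. This is an illustration of the rules as described, not the implementation of `event_forwarder.idf_report_path()`; the helper names and the regex are assumptions.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Two-letter prefix + digits, 14-digit timestamp, IDFH or IDFW extension.
_NAME_RE = re.compile(r"(?P<serial>[A-Z]{2}\d+)_(?P<ts>\d{14})\.(?P<kind>IDFH|IDFW)")


class IdfName(NamedTuple):
    serial: str
    timestamp: datetime  # device local time, no timezone
    kind: str


def parse_idf_name(name: str) -> Optional[IdfName]:
    m = _NAME_RE.fullmatch(name)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y%m%d%H%M%S")
    return IdfName(m.group("serial"), ts, m.group("kind"))


def sidecar_path_sketch(binary: PurePosixPath) -> PurePosixPath:
    """ASCII sidecar lives in a TXT/ subfolder next to the binary."""
    return binary.parent / "TXT" / (binary.name + ".txt")


info = parse_idf_name("UM12345_20260520100000.IDFW")
txt = sidecar_path_sketch(PurePosixPath("proj/UM12345/UM12345_20260520100000.IDFW"))
```

Using `fullmatch` means `.IDFW.CDB` names are rejected rather than mistaken for events.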
+
+### Directory layout
+
+```
+C:\THORDATA\
+└── <Project>\
+    └── <UM####>\                  ← unit serial dir
+        ├── UM12345_20260520100000.MLG     ← monitor log (not events)
+        ├── UM12345_20260520100000.IDFH    ← histogram event (binary)
+        ├── UM12345_20260520100000.IDFW    ← waveform event (binary)
+        ├── UM12345_20260520100000.IDFW.CDB ← cache-DB variant (skip)
+        ├── TXT\
+        │   ├── UM12345_20260520100000.IDFH.txt    ← histogram ASCII sidecar
+        │   └── UM12345_20260520100000.IDFW.txt    ← waveform  ASCII sidecar
+        ├── CSV\, HTML\, PDF\, XML\        ← operator-facing derived exports
+        └── ...
+```
+
+The `.IDFW.CDB` files share the binary's basename but appear to be a
+separate cache/database variant.  Their first 8 bytes match the
+**old**-firmware Thor signature (see below) regardless of which
+signature the paired `.IDFW` uses.  Purpose unknown; sizes vary
+wildly (observed 123 B → 40,491 B).  Thor-watcher's forwarder
+deliberately skips them.
+
+### Sample corpus
+
+The `thor-watcher/example-data/THORDATA_example/` tree carries
+**1,014 paired .IDFW / .IDFH + .txt files** spanning 2020–2023
+across nine units (UM11719, UM13981, UM20147, …, plus BE9439 from
+2020).  This is the reverse-engineering ground truth.
+
+---
+
+## ASCII sidecar (`.IDFW.txt` / `.IDFH.txt`) — fully decoded
+
+Shape: plain text, one `"Key : Value"` line per metadata field,
+followed for waveforms by a tab-separated sample table headed by
+the literal line `Waveform Data Channels`.  Parsed by
+[`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py).
+See [`micromate/models.py`](../micromate/models.py) for the typed
+`IdfReport` shape.
+
+### Notable conventions
+
+- **Units are native to Thor** — geophone in **in/s**, microphone in
+  **dB(L)** (not psi like Series III BW reports), frequency in Hz,
+  acceleration in g, displacement in in.
+- **Below-threshold readings** appear as the literal string
+  `<0.005 in/s` (155 occurrences in the sample corpus) — the parser
+  strips the `<` and treats the numeric remainder as the value.
+- **Out-of-range / not-measured** values appear as `N/A` — parser
+  drops the field rather than letting the string leak into a numeric
+  column.
+- **Firmware string** observed: `Micromate ISEE 11.0AK`.
+- **TitleString1..4** are operator-defined free-text slots; Thor's
+  default labels map them to Location / Client / Company / Notes,
+  which the parser surfaces as `project` / `client` / `operator` /
+  `notes`.
+- **Histogram sidecars** use `HistogramStartDate` / `HistogramStartTime`
+  in place of waveform's `EventDate` / `EventTime`.  Parser falls
+  through to either.
+- **Histogram tabular block** lacks the `Waveform Data Channels`
+  marker; instead it's a multi-line column header followed by
+  per-interval rows (`<date> <time> <tran-ppv> <freq> ...`).  Parser
+  silently ignores lines after the metadata block since they lack a
+  colon-separated `key : value` shape (the timestamps DO contain
+  colons but produce garbage keys that don't collide with any
+  recognised field).
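
The conventions above (`Key : Value` lines, the `<` prefix on below-threshold readings, `N/A` for missing values) can be sketched as a small parser. This is not the code in `micromate/idf_ascii_report.py`; the function names are hypothetical, and `TranPPV` / `MicFreq` in the sample are made-up key names used only for illustration.

```python
import re
from typing import Dict, Optional

_NUM_RE = re.compile(r"<?\s*(-?\d+(?:\.\d+)?)")


def parse_numeric(value: str) -> Optional[float]:
    """Strip a leading '<' and any unit suffix; None for 'N/A' or non-numeric text."""
    value = value.strip()
    if value == "N/A":
        return None
    m = _NUM_RE.match(value)
    return float(m.group(1)) if m else None


def parse_metadata(text: str) -> Dict[str, str]:
    """Split each line at the first colon; lines without a colon are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


sample = "EventTime : 19:00:47\nTranPPV : <0.005 in/s\nMicFreq : N/A\n"
meta = parse_metadata(sample)
tran = parse_numeric(meta["TranPPV"])
mic_freq = parse_numeric(meta["MicFreq"])
```

Splitting at the first colon keeps time values such as `19:00:47` intact, and it also shows why histogram rows yield harmless garbage keys: their first colon sits inside a timestamp.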
+
+---
+
+## Binary header signatures (observed)
+
+Hex dumps of the first 32 bytes across 1,014 sample files reveal
+**two distinct file signatures**, both anchored by the literal
+ASCII string `"\x00Instantel\x00"` at offsets 5–15:
+
+### Signature A — newer firmware (1,012 files, 99.8% of corpus)
+
+```
+00000000: 0012 0100 0000 496e 7374 616e 7465 6c00   ......Instantel.
+00000010: 0000 a695 002e b500 4f70 6572 6174 6f72   ........Operator
+                                ^^^^^^^^^^^^^^^^
+                                operator/title string starts at 0x18
+```
+
+Header bytes 0–5: `00 12 01 00 00 00`.  Followed immediately by the
+9-byte ASCII tag `Instantel` and its NUL terminator (offsets 6–15),
+then 8 unknown bytes (0x10–0x17), then ASCII operator-supplied
+strings (Operator name, etc.) from 0x18 and on through the project /
+client / title strings.  No `STRT` record observed in this layout.
+
+### Signature B — older firmware (2 files: BE9439 from 2020)
+
+```
+00000000: 1000 0180 0000 496e 7374 616e 7465 6c00   ......Instantel.
+00000010: 072c 0012 0300 5354 5254 fffe 0111 2340   .,....STRT....#@
+                          ^^^^^^^^^                ^^^^^^^^^
+                          STRT magic               4-byte end_key
+00000020: 0111 0000 2e5f 00ac 4600 0000 0200 0000   ....._..F.......
+          ^^^^^^^^^             ^^^
+          4-byte start_key      0x46 (BW WAVEHDR record-type marker)
+```
+
+Header bytes 0–5: `10 00 01 80 00 00`.  The structure after the
+`Instantel` magic is **byte-for-byte identical to a BW SUB 5A
+probe-response STRT record** as documented in
+[instantel_protocol_reference.md → "SUB 5A — STRT record encodes
+end_offset"](instantel_protocol_reference.md).  Specifically:
+
+| Offset | Bytes               | Meaning (per BW reference)          |
+|--------|---------------------|--------------------------------------|
+| 0x16   | `53 54 52 54`       | `STRT` magic                         |
+| 0x1A   | `ff fe`             | STRT sentinel                        |
+| 0x1C   | `01 11 23 40`       | `end_key` (4 bytes)                  |
+| 0x20   | `01 11 00 00`       | `start_key` (4 bytes)                |
+| 0x28   | `46`                | `0x46` waveform-record type marker   |
+
+**Hypothesis:** Older Micromate firmware writes a wrapped BW-format
+event into the `.IDFW` file — essentially the same on-disk shape as
+a Series III device, with the new filename convention applied at
+export time.  Newer firmware (signature A) abandoned the
+BW-compatible layout for an Instantel-specific format.
+
+If that hypothesis holds, the 2 signature-B files can already be
+parsed via `minimateplus/event_file_io.read_blastware_file()` — worth
+testing.  The 1,012 signature-A files are the real reverse-engineering
+target.
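+
+A minimal sketch of how the two signatures could be told apart from the
+first 16 bytes, using only the byte values visible in the hex dumps
+above.  The function name `classify_idf_header` and the `"unknown"`
+fallback are hypothetical, not part of the existing codebase.
+
+```python
+SIG_A = bytes.fromhex("001201000000")   # newer firmware (dump A, bytes 0..5)
+SIG_B = bytes.fromhex("100001800000")   # older firmware / .CDB (dump B, bytes 0..5)
+MAGIC = b"Instantel\x00"                # bytes 6..15 in both dumps
+
+
+def classify_idf_header(head: bytes) -> str:
+    """Return "A", "B", or "unknown" for the leading bytes of a file."""
+    if len(head) < 16 or head[6:16] != MAGIC:
+        return "unknown"
+    if head[:6] == SIG_A:
+        return "A"
+    if head[:6] == SIG_B:
+        return "B"
+    return "unknown"
+```
+
+Checking the magic before the 6-byte prefix means a file from some
+third, not-yet-seen firmware would be reported as `"unknown"` instead of
+being misrouted to either parser.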
+
+### `.IDFW.CDB` cache files
+
+Always carry signature B (`10 00 01 80 ...`), even when the paired
+`.IDFW` carries signature A.  Plausible explanation: the CDB is an
+internal Thor cache-database export that retains the legacy BW-style
+record layout regardless of the user-facing `.IDFW` format version.
+Not currently consumed by the forwarder.
+
+---
+
+## File-size patterns (Signature A, the main target)
+
+Survey of 1,012 signature-A files:
+
+| Event type   | Typical size      | Source of variance                           |
+|--------------|-------------------|----------------------------------------------|
+| `.IDFW` 2-sec | 9,200 – 10,500 B | Operator-supplied strings (TitleString1..4) of varying length |
+| `.IDFH`       | 2,944 – 4,076 B  | Histogram interval count (record duration / interval) |
+
+**Naive arithmetic for 2-sec waveform:**
+- 4 channels × 2 sec × 1024 sps = 8,192 samples
+- At 2 bytes/sample (int16) = 16,384 sample bytes → file would be > 16 KB
+- Observed: ~9–10 KB
+- → samples are likely **1 byte each** (int8 quantised), **or** stored
+  with bit-packing / delta encoding, **or** only one channel's
+  full-rate samples are stored with the others reconstructed
+  arithmetically.  Verifying this is the **first RE milestone**.
+
+Project-string–length variance (~1 KB across the corpus) is consistent
+with the file carrying a single copy of each TitleString1..4 plus
+operator + setup-name as null-padded ASCII regions.
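+
+The arithmetic above can be checked in a few lines.  This is only a
+back-of-envelope sketch: it treats the whole file as sample data, so the
+real per-sample budget is lower once header and strings are subtracted.
+The helper name `bytes_per_sample_budget` is made up for illustration.
+
+```python
+CHANNELS = 4
+RECORD_SECONDS = 2
+SAMPLE_RATE = 1024  # samples per second per channel
+
+TOTAL_SAMPLES = CHANNELS * RECORD_SECONDS * SAMPLE_RATE
+
+
+def bytes_per_sample_budget(file_size: int) -> float:
+    """Upper bound on bytes per sample if the whole file were sample data."""
+    return file_size / TOTAL_SAMPLES
+
+
+if __name__ == "__main__":
+    for size in (9_200, 10_500):        # observed .IDFW size range
+        print(size, round(bytes_per_sample_budget(size), 2))
+```
+
+Both ends of the observed range come out between 1 and 2 bytes per
+sample, which is why plain int16 storage of every channel is ruled out.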
+
+---
+
+## Open questions
+
+The reverse-engineering targets, roughly in dependency order:
+
+1. **Sample encoding (signature A)** — int8? int16 LE/BE? Bit-packed?
+   Delta-coded?  Per-channel interleaved or sequential blocks?
+2. **Header field layout (signature A)** — where do sample_rate,
+   record_time, channel count, and per-channel peaks live in the
+   binary?  The ASCII sidecar gives the device-authoritative values,
+   so binary fields can be confirmed by diff.
+3. **Operator-string offsets** — `Operator` at 0x18 is the first
+   visible string in signature-A files; the rest (project, client,
+   notes, setup) follow.  Need to map exact offsets and null-padding
+   conventions.
+4. **Signature-B → BW codec compatibility** — does
+   `minimateplus/event_file_io.read_blastware_file()` actually parse
+   the 2 BE9439 signature-B files as-is?  If yes, the OLD-format
+   ingest is free.
+5. **`.IDFW.CDB` purpose** — is it an internal Thor cache, a
+   ring-buffer dump, or something else?  Worth a single small effort
+   to characterise so we know what we're skipping.
+6. **Footer / checksum** — every BW event file has a footer; does
+   IDF?  Where does the per-channel sample block end?
+
+---
+
+## Reverse-engineering playbook (when we start)
+
+The Series III BW codec took ~2 months of MITM wire captures
+because we didn't have ground-truth metadata.  Thor's situation is
+**substantially better**:
+
+- **Ground truth is on disk.**  Every binary in `example-data/`
+  has a paired `.IDFW.txt` carrying the full decoded sample table
+  (`Waveform Data Channels` block — see any sample file in
+  `thor-watcher/example-data/.../TXT/`).  Aligning binary bytes
+  to the table's float-per-row values gives an immediate per-byte
+  hypothesis test.
+- **Cross-event diffing.**  1,012 signature-A samples from 9 units
+  spanning 4 years means any field that varies between events is
+  immediately localisable.  Fields that are constant across all
+  files (firmware ID, channel labels, format-version word) are also
+  immediately localisable by complementary search.
+- **No protocol surface.**  Files at rest, not a wire dialect.  No
+  DLE stuffing, no inner-frame parsing, no probe/data two-step.
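+
+The cross-event diffing idea can be sketched as a function that reports
+which byte offsets hold the same value in every file.  This is a
+hypothetical helper (`constant_offsets` does not exist in the repo), and
+it only compares raw offsets, so it is meaningful for the fixed-position
+part of the header before the variable-length strings shift everything.
+
+```python
+from __future__ import annotations
+
+
+def constant_offsets(blobs: list[bytes]) -> list[int]:
+    """Offsets, within the shortest blob, whose byte is identical in all blobs."""
+    if not blobs:
+        return []
+    limit = min(len(b) for b in blobs)
+    first = blobs[0]
+    return [i for i in range(limit) if all(b[i] == first[i] for b in blobs)]
+```
+
+The complement of the returned list is the set of offsets that vary
+between events, which are the candidates for timestamps, peaks, and
+other per-event fields.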
+
+Suggested first session (2-4 hours): hand-decode `UM11719_20231219162723.IDFW`
+(10,290 bytes) against its `TXT/UM11719_20231219162723.IDFW.txt`
+sample table (the 2-sec waveform at 1024 sps × 4 channels = 2,048 rows,
+8,192 sample values).  Find the first per-channel sample value (`0.0003` in
+the Tran column at t=0) in the binary.  Locating it confirms the sample
+encoding; everything else flows from there.
+
+---
+
+## Code seams ready to receive the codec
+
+When the codec lands, it goes into
+[`micromate/idf_file.py`](../micromate/idf_file.py) (currently a
+stub raising `NotImplementedError`).  Public API:
+
+```python
+from pathlib import Path
+
+from micromate import IdfEvent
+from micromate.idf_file import read_idf_file
+
+event: IdfEvent = read_idf_file(Path("UM11719_20231219163444.IDFW"))
+# event.peaks.transverse_ips, event.timestamp, event.raw_samples, ...
+```
+
+The ingest pipeline (`WaveformStore.save_imported_idf`) currently
+builds the `IdfEvent` from the `.txt` parser only.  Once
+`read_idf_file()` works, the binary becomes authoritative; the
+`.txt` parser drops to fast-path metadata cross-check.  Operators
+who don't enable Thor's TXT exporter still get fully populated
+events.
+
+---
+
+## See also
+
+- [instantel_protocol_reference.md](instantel_protocol_reference.md) — Series III BW protocol reference (the Rosetta Stone).  STRT record format, DLE framing, BW filename encoding.
+- [`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py) — `.txt` sidecar parser.
+- [`micromate/models.py`](../micromate/models.py) — `IdfEvent`, `IdfReport` typed dataclasses.
+- [`micromate/idf_file.py`](../micromate/idf_file.py) — placeholder for the binary codec.
+- [`thor-watcher/example-data/THORDATA_example/`](../../thor-watcher/example-data/) — 1,014 paired binary + .txt files for codec validation.
@@ -0,0 +1,255 @@
+# Runbook — Recovering a wedged unit stuck in a call-home loop
+
+**Original incident:** BE9558H at `166.246.130.1:9034`, recovered 2026-05-17.
+
+A field unit with a stuck-triggered geophone (or any hardware fault causing
+constant event triggering) will record events back-to-back, and if Auto Call
+Home is set to "After Event Recorded" the device will dial the office BW
+ACH server in a tight loop. Combined with a Sierra Wireless modem in
+bidirectional serial-TCP mode, this makes the unit effectively unreachable
+from SFM — every TCP connection we open gets killed when the modem flips
+from server-mode to client-mode to honor the device's next AT dial command.
+
+This runbook describes how to break the loop and recover control.
+
+---
+
+## Symptoms
+
+- Terra-View / SFM `/device/info` either hangs or fails on `count_events()`.
+- `/device/monitor/status` and `/device/rescue` return 502 (protocol timeout
+  waiting for POLL response) or 503 (TCP connect refused).
+- ACEmanager serial log shows repeating
+  `Connect to IP: <BW_IP> Port: <BW_PORT>` → `Shutdown TCP socket` cycles
+  every 30-60 seconds.
+- Spam-mode endpoints (`/device/stop_monitoring_spam`) report many
+  `sent_ok` but the device's monitoring state never changes.
+- `slow_drip` reports `[Errno 32] Broken pipe` after sending the preamble
+  but before completing the drip loop.
+
+If you see *all* of these, the unit is in this exact failure mode.
+
+---
+
+## Quick reference — how to recover
+
+You need **ACEmanager access** to the unit's modem.
+
+### Step 1: stop the modem's mode-flipping
+
+In ACEmanager → **Serial → Port Configuration**:
+
+| Field | Set to |
+|---|---|
+| **Destination Address** | clear (blank) |
+| **Destination Port** | `0` |
+
+Click **Apply**. This removes the modem's auto-dial-out target. The device's
+AT dial commands now error back at the modem instead of triggering a
+mode-flip, so the modem stays in TCP-server mode permanently and our inbound
+TCP sessions stay alive.
+
+*(Optional belt-and-suspenders: also add the BW server's port to
+**Security → Port Filtering - Outbound** as a blocked port, with
+Outbound Port Filtering Mode = Blocked Ports.)*
+
+### Step 2: stop monitoring on the device (slow drip)
+
+From the SFM host:
+
+```bash
+/home/serversdown/seismo-relay/scripts/slow_drip.sh <DEVICE_IP> <PORT>
+```
+
+Defaults are 120s duration with a drip every 3s. Watch the response:
+
+- `duration_s ≈ 120` and `drips_sent ≈ 40` → session held the full duration ✓
+- `bytes_received > 0` → device is responding ✓ (this is the success signal)
+
+If `duration_s` is small or the response carries
+`send_error: "Broken pipe"`, Step 1 didn't take hold. Re-check
+ACEmanager; the modem may need a reboot after Apply.
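+
+The checks above can be written down as a small triage function. This is
+a sketch under assumptions: it takes the response as an already-parsed
+dict with the field names quoted above, and the 90% threshold for "held
+the full duration" is an arbitrary choice, not something the endpoint
+defines. The name `slow_drip_verdict` is hypothetical.
+
+```python
+def slow_drip_verdict(resp: dict, expected_s: float = 120.0) -> str:
+    """Classify a slow-drip response using the signals listed in Step 2."""
+    if resp.get("send_error"):
+        return "modem still flipping"      # e.g. "Broken pipe": redo Step 1
+    if resp.get("duration_s", 0) < 0.9 * expected_s:
+        return "session dropped early"     # TCP did not survive the drip
+    if resp.get("bytes_received", 0) > 0:
+        return "ok"                        # device answered: success signal
+    return "held but silent"               # session held, device never replied
+```
+
+Only the `"ok"` verdict means it is safe to move on to Step 3.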
+
+### Step 3: confirm monitoring stopped
+
+```bash
+curl 'http://localhost:8200/device/monitor/status?host=<DEVICE_IP>&tcp_port=<PORT>&force=true'
+# expect: {"is_monitoring": false, ...}
+```
+
+### Step 4: disable ACH at the device level + erase corrupted events
+
+Either fire the rescue endpoint:
+
+```bash
+/home/serversdown/seismo-relay/scripts/rescue_device.sh <DEVICE_IP> <PORT>
+```
+
+Or do the two steps manually:
+
+```bash
+# Disable ACH in the device's compliance config
+curl -X POST 'http://localhost:8200/device/call_home?host=<DEVICE_IP>&tcp_port=<PORT>' \
+  -H 'Content-Type: application/json' \
+  -d '{"auto_call_home_enabled": false}'
+
+# Erase corrupted event chain
+curl -X POST 'http://localhost:8200/device/events/erase?host=<DEVICE_IP>&tcp_port=<PORT>'
+```
+
+You can also do this via the SFM standalone UI → **Call Home** tab → set
+`Enable Auto Call Home` to `Disabled` → **Write to Device**.
+
+### Step 5: restore modem config (housekeeping)
+
+Once the device-side ACH is disabled, restore the modem's Destination
+Address and Port to the original values (e.g. `50.197.32.92` / `12345`) in
+ACEmanager. The modem will resume normal bidirectional behavior, but the
+unit won't issue any dial commands until ACH is explicitly re-enabled on
+the device.
+
+### Step 6: do NOT re-enable ACH until the hardware fault is repaired
+
+Leave ACH disabled on this unit until the underlying hardware fault is
+repaired. If you re-enable it earlier, the call-home loop starts again
+immediately and you'll be running this runbook a second time.
+
+---
+
+## Why this works — the failure mode explained
+
+The Sierra Wireless RV50/RV55 serial port operates in one of two TCP modes
+at any moment:
+
+- **Server mode** — listens on `Device Port` (e.g. 9034), bridges inbound
+  TCP to the device's serial port. This is what we need to interact with
+  the device.
+- **Client mode** — when the device sends an AT dial command on its serial
+  TX line, the modem opens an outbound TCP to `Destination Address:Port`
+  and bridges that to serial.
+
+A serial port in this configuration is **bidirectional**: the modem flips
+between server and client modes on demand. When the device's firmware is
+healthy and only dials occasionally, this works fine.
+
+When the unit is constantly triggering events and ACH is set to "After
+Event Recorded", the device sends an AT dial command every few seconds.
+Each one causes the modem to:
+
+1. Drop any active inbound TCP session
+2. Flip to client mode
+3. Attempt outbound TCP to `Destination Address:Port`
+4. Hang for up to a minute waiting for it to succeed/fail
+5. Drop back to server mode
+
+**During the entire hang, no inbound TCP can establish.** Even between
+hangs, the modem closes any existing inbound session before flipping. So
+any tool that needs more than a few seconds of held TCP (e.g. POLL +
+config read + write) gets repeatedly kicked off.
+
+Clearing `Destination Address` removes steps 3 and 4 from the cycle: the modem
+has nowhere to dial, so it doesn't flip modes when it receives an AT dial
+command. The serial port effectively becomes server-only, and inbound TCP
+sessions can stay open as long as needed.
+
+**This is a modem-layer issue, not a device firmware issue.** The device
+is alive and responsive the whole time — confirmed in the BE9558H
+recovery by 990 bytes of S3 responses received over a 120s slow-drip
+session once the modem was no longer mode-flipping.
+
+---
+
+## Why simpler approaches don't work
+
+| Approach | Why it fails |
+|---|---|
+| Standard `/device/info` | Triggers `count_events()` 1E/1F walk, takes 90s+ and hits corrupted event chain in this scenario |
+| `/device/rescue` race loop | Gets 502 (protocol timeout) because the modem closes the TCP before the POLL handshake can complete |
+| `/device/stop_monitoring_blind` (single frame) | Even if the bytes leave the wire, the device's protocol parser ignores write commands without a preceding POLL handshake (early-version bug, now fixed by including POLL preamble in blind sends) |
+| `/device/stop_monitoring_spam` (sub-second cadence) | Each session is killed by the modem's mode-flip before the device can drain its UART RX buffer; high-rate spam also risks UART FIFO overrun on the device side |
+| Outbound port firewall block alone | Stops the outbound TCP from succeeding, but doesn't stop the modem from *trying* and mode-flipping. Reduces but doesn't eliminate the contention. |
+| Modem reboot | Temporary — as soon as the device starts triggering again, the loop resumes within seconds |
+
+The combination of `slow_drip` + cleared `Destination Address` works because:
+
+1. The modem stops mode-flipping → TCP session stays open for the full
+   drip duration
+2. Slow drip rate → device's UART RX FIFO never overflows even if
+   firmware is busy with event recording
+3. The drip is `SESSION_RESET + STOP_MONITORING` every 3s → many
+   independent chances for the parser to land one valid frame
+4. Once one Stop Monitoring is parsed, event recording halts → firmware
+   has CPU to spare → subsequent operations are trivially easy
+
+---
+
+## Tooling reference
+
+All endpoints live in `seismo-relay/sfm/server.py`. All scripts live in
+`seismo-relay/scripts/` and default to SFM direct (`http://localhost:8200`),
+overridable via `SFM_BASE_URL`.
+
+### Endpoints added during BE9558H recovery
+
+| Endpoint | Purpose |
+|---|---|
+| `GET /device/events/storage_range` | SUB 0x06 — first/last event keys, `is_empty` flag. ~2s, no event walk. |
+| `GET /device/events/index` | SUB 0x08 — lifetime event counter (does NOT decrement on erase). ~2s. |
+| `POST /device/events/erase` | Full erase sequence 0xA3 → 0x1C → 0x06 → 0xA2. |
+| `POST /device/rescue` | Disable ACH + erase in one TCP session. Short timeouts for race-loop usage. |
+| `POST /device/stop_monitoring_blind` | Fire-and-forget Stop with full POLL preamble (single attempt). |
+| `POST /device/stop_monitoring_spam` | Server-side tight retry loop, sub-second cadence, duration-bounded. |
+| `POST /device/stop_monitoring_slow_drip` | One held TCP session, slow trickle of stop frames. **The endpoint that saved BE9558H.** |
+
+Also changed: default protocol recv timeout dropped from 30s → 10s in
+`_build_client`. Added `connect_timeout` knob to same. Cleaned up
+unhandled-exception path in `/device/monitor/status` so it returns 502
+instead of 500 on protocol timeouts.
+
+### Scripts
+
+| Script | Purpose |
+|---|---|
+| `scripts/rescue_device.sh` | Race-loop wrapper around `/device/rescue` |
+| `scripts/blind_stop.sh` | Race-loop wrapper around `/device/stop_monitoring_blind` |
+| `scripts/spam_stop.sh` | Single-call burst hammer (`/device/stop_monitoring_spam`) |
+| `scripts/slow_drip.sh` | Single-call held-session drip (`/device/stop_monitoring_slow_drip`) |
+| `scripts/watch_unit.sh` | Passive periodic reachability check, logs to file |
+
+---
+
+## Incident log — BE9558H, 2026-05-16/17
+
+What was wrong: Long-axis geophone developed an offset, constantly above
+trigger threshold → constant event recording → after-event ACH set →
+modem dialing office BW server (`50.197.32.92:12345`) every 30-60s.
+Local event chain corrupted (`next_boundary 0x100EE exceeds uint16`).
+
+Diagnostic path:
+
+1. `/device/info` slow, choked on event walk
+2. Built lightweight probe endpoints (`storage_range`, `index`) — useful
+   but didn't reach the wedged unit
+3. Built `/device/rescue` with short timeouts — got 502 (POLL no response)
+4. Built `/device/stop_monitoring_blind` — first version was a false
+   positive (no POLL preamble); fixed by including
+   `SESSION_RESET+POLL_PROBE+SESSION_RESET+POLL_DATA` in the dump
+5. Verified blind stop works on bench unit
+6. Built `/device/stop_monitoring_spam` — 420 successful sends over
+   5 min, zero behavior change on field unit
+7. Inspected ACEmanager logs → saw outbound dial-out attempts every ~30s,
+   confirmed device was not fully locked up
+8. Added outbound port-12345 firewall block → outbound attempts now fail
+   instantly but contention persisted
+9. Built `/device/stop_monitoring_slow_drip` — session died at 3s with
+   broken pipe (modem closing on us)
+10. Looked at full ACEmanager Port Configuration → **found
+    `Destination Address: 50.197.32.92` configured**, realized every AT
+    dial command was triggering a modem mode-flip that killed our inbound
+    TCP session
+11. Cleared Destination Address + Port → slow_drip held 120s, device
+    responded with 990 bytes, 39 stop commands acked
+12. Disabled ACH at device level via `/device/call_home`, erased events
+
+Final state: device IDLE, memory 958.1 / 960 KB free, ACH disabled at
+device level, modem destination cleared (to be restored after physical
+service).
+
+Total time from the first attempt ("i was wondering if its possible to")
+to recovery: ~7 hours of intermittent debugging across one evening.
@@ -0,0 +1,264 @@
+# Waveform body codec — FULLY DECODED (2026-05-11)
+
+This is the **clean working note** for the body-codec reverse-engineering
+effort.  It supersedes scattered claims elsewhere when they conflict.
+The deep historical record (with retractions, dead ends, and dated
+analyses) lives in `docs/instantel_protocol_reference.md §7.6.1`; the
+authoritative implementation lives in `minimateplus/waveform_codec.py`.
+
+## TL;DR
+
+**The codec is fully decoded.**  Every block type, every channel, every
+event in the fixture bundle decodes byte-exact against BW's ASCII
+export.
+
+| Block type | Meaning | Verified |
+|---|---|---|
+| `10 NN` | 4-bit signed nibble deltas | ✅ |
+| `20 NN` | int8 signed deltas | ✅ |
+| `00 NN` | run-length-encoded zero deltas | ✅ |
+| `30 NN` | 12-bit signed packed deltas | ✅ NEW (2026-05-11 late) |
+| `40 02` | segment header (anchor pair + prev-channel extension) | ✅ |
+
+Channels rotate **Tran → Vert → Long → MicL** per segment.  Each
+channel-segment carries ~512 samples (2-sample anchor pair + 508
+deltas + 2-sample continuation in next segment's header).
+
+## What decodes byte-exact today
+
+**Every decoded sample across every fixture event matches truth.  Zero
+divergences.**
+
+| Event | Description | Tran | Vert | Long | Total |
+|---|---|---|---|---|---|
+| event-a (5-8) | quiet, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
+| event-c (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
+| event-d (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
+| JQ0 (5-11) | Vert-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
+| V70 (5-11) | Mic-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
+| SP0 (5-11) | loud all, 3 sec | 2048 ✓ | 1538 ✓ | 1536 ✓ | 5122 |
+| SS0 (5-11) | loud-from-start | 734 ✓ | 512 ✓ | 512 ✓ | 1758 |
+| SV0 (5-11) | loud-from-start | 1024 ✓ | 578 ✓ | 512 ✓ | 2114 |
+| event-b (5-8) | quiet, 2 sec | 512 ✓ | 226 ✓ | 0 | 738 |
+
+That's **47,364 ADC samples decoded byte-exact, zero errors.**
+
+Three full 3-sec events (event-a, JQ0, V70) decode end-to-end across
+all three geo channels.
+
+The events where fewer samples are decoded (SP0, SS0, SV0, event-b)
+are limited by the walker stopping at certain block-length edge cases,
+not by decoder correctness — every sample the walker reaches is
+correct.
+
+## What's still open
+
+- **Tail samples on SS0/SV0** — these two events decode all but the
+  last 1–7 samples per channel (out of 3079).  Likely the same
+  "last segment is truncated" pattern.  Minor; doesn't affect the
+  bulk of the data.
+
+## Sample counts (72,972 byte-exact total)
+
+| Event | Tran | Vert | Long | Status |
+|---|---|---|---|---|
+| event-a | 3328 | 3328 | 3328 | full |
+| event-b | 2304 | 2304 | 2304 | full |
+| event-c | 1280 | 1280 | 1280 | full |
+| event-d | 1280 | 1280 | 1280 | full |
+| JQ0 | 3328 | 3328 | 3328 | full |
+| V70 | 3328 | 3328 | 3328 | full |
+| SP0 | 3328 | 3328 | 3328 | full |
+| SS0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
+| SV0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
+
+## What's now wired into production (2026-05-11 late)
+
+- **`client.py:_decode_a5_waveform`** — now uses
+  `decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
+  `event.raw_samples` is populated with int16 ADC counts that flow
+  through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
+  Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
+  reference but is not called.
+
+- **MicL → dB(L) conversion** — exposed as
+  `waveform_codec.mic_count_to_db(count)`.  Verified against BW
+  display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
+  the V70 mic-heavy fixture exactly).
+
+- **`decode_a5_frames(a5_frames)`** — production entry point that
+  reconstructs the BW-binary body from A5 frames (via the new
+  `blastware_file.extract_body_bytes` helper) and runs the verified
+  codec.  Returns the same `raw_samples` dict shape the consumers
+  already expect.
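+
+The two calibration points quoted above (count=1 → 81.94 dB,
+count=813 → 140.14 dB) are consistent with a plain 20·log10 law.  The
+sketch below reproduces them; it is an inference from those two numbers,
+not a copy of `waveform_codec.mic_count_to_db`, and how the real
+function treats zero or negative counts is not stated here.
+
+```python
+import math
+
+DB_AT_ONE_COUNT = 81.94  # value quoted above for count=1
+
+
+def mic_count_to_db_sketch(count: int) -> float:
+    """dB(L) for a positive mic ADC count, assuming a 20*log10 law."""
+    if count <= 0:
+        raise ValueError("log scale is undefined for count <= 0")
+    return DB_AT_ONE_COUNT + 20.0 * math.log10(count)
+```
+
+Under this law a doubling of the count adds about 6 dB, which fits the
+~6 dB quantization steps visible at low counts in the BW ASCII export.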
+
+## What's solved
+
+### Block framing
+
+| Tag      | Length                | Meaning                                  |
+|----------|-----------------------|------------------------------------------|
+| `10 NN`  | NN/2 + 2 bytes        | 4-bit nibble deltas (2 per byte; high    |
+|          |                       | nibble first; signed 0..7 / 8..F = -8..-1)|
+| `20 NN`  | NN + 2 bytes          | int8 signed deltas (1 per byte)          |
+| `00 NN`  | 2 bytes               | RLE: append NN copies of current value   |
+| `30 NN`  | NN*3/2 + 2 bytes      | 12-bit signed packed deltas (NN/4 groups |
+|          |                       | of 6 bytes; see `30 NN` block format)    |
+| `40 02`  | 20 bytes (fixed)      | Segment header                           |
+
+NN is always a multiple of 4.
+
+Implementation: `walk_body()` in `minimateplus/waveform_codec.py`.
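+
+As a rough illustration of the framing rules, the function below maps a
+two-byte tag to a total block length.  It is a sketch, not the
+`walk_body()` implementation; for `30 NN` it uses the `NN × 1.5 + 2`
+length given in the "`30 NN` block format" section of this note.  The
+name `block_length` is hypothetical.
+
+```python
+def block_length(tag: int, nn: int) -> int:
+    """Total length in bytes (tag included) of a block starting `tag nn`."""
+    if tag == 0x40:
+        return 20                 # segment header, fixed size
+    if tag == 0x00:
+        return 2                  # RLE: the tag pair is the whole block
+    if tag == 0x10:
+        return nn // 2 + 2        # two 4-bit deltas per byte
+    if tag == 0x20:
+        return nn + 2             # one int8 delta per byte
+    if tag == 0x30:
+        return nn * 3 // 2 + 2    # NN/4 groups of 6 bytes
+    raise ValueError(f"unknown block tag 0x{tag:02x}")
+```
+
+Because NN is always a multiple of 4, the integer divisions never
+truncate.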
+
+### 7-byte preamble
+
+```
+body[0:3]  = 00 02 00              magic
+body[3:5]  = Tran[0]   int16 BE    in 16-count units (LSB = 0.005 in/s)
+body[5:7]  = Tran[1]   int16 BE    in 16-count units
+```
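+
+A minimal sketch of reading the preamble with `struct`, following the
+layout above.  The function name `parse_preamble` is hypothetical; the
+returned values stay in the 16-count units described above.
+
+```python
+from __future__ import annotations
+
+import struct
+
+PREAMBLE_MAGIC = b"\x00\x02\x00"
+
+
+def parse_preamble(body: bytes) -> tuple[int, int]:
+    """Return (Tran[0], Tran[1]) from the 7-byte body preamble."""
+    if len(body) < 7 or body[:3] != PREAMBLE_MAGIC:
+        raise ValueError("body does not start with the 00 02 00 preamble")
+    tran0, tran1 = struct.unpack(">hh", body[3:7])   # two int16, big-endian
+    return tran0, tran1
+```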
+
+### Tran channel, segment 0
+
+Segment 0 (everything before the first `40 02`) encodes Tran samples
+only.  Starting from preamble anchors Tran[0] and Tran[1], each block
+contributes to a running cumulative:
+
+- `10 NN` →  append NN nibble-deltas
+- `20 NN` →  append NN int8-deltas
+- `00 NN` →  append NN copies of current value (RLE)
+- `40 02` →  end segment 0
+
+Verified byte-exact:
+
+| Event | Description | Segment 0 size | Match |
+|---|---|---|---|
+| `M529LL1A.SP0` | Loud, 0.25 s pretrig | 510 | 510/510 ✓ |
+| `M529LL1A.SV0` | Loud from sample 0 | 58 | 58/58 ✓ (stops at first `30 NN`) |
+| `M529LL1A.SS0` | Loud from sample 0 | 42 | 42/42 ✓ (stops at first `30 04`) |
+| `M529LL1L.JQ0` | Vert-heavy | 510 | 510/510 ✓ |
+| `M529LL1L.V70` | Mic-heavy (140 dB) | 510 | 510/510 ✓ |
+
+Implementation: `decode_tran_initial()`.
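+
+The delta rules for `10 NN` and `20 NN` can be sketched as below.  This
+is not `decode_tran_initial()`: the helper names are hypothetical, and
+the sketch assumes deltas are added directly to the running value in the
+same units as the anchors, which this note does not spell out.
+
+```python
+from __future__ import annotations
+
+
+def nibble_deltas(payload: bytes) -> list[int]:
+    """Signed 4-bit deltas, high nibble first (0..7 positive, 8..F = -8..-1)."""
+    out = []
+    for byte in payload:
+        for nib in (byte >> 4, byte & 0x0F):
+            out.append(nib - 16 if nib >= 8 else nib)
+    return out
+
+
+def int8_deltas(payload: bytes) -> list[int]:
+    """Signed 8-bit deltas, one per byte."""
+    return [b - 256 if b >= 128 else b for b in payload]
+
+
+def accumulate(anchors: list[int], deltas: list[int]) -> list[int]:
+    """Extend the anchor samples with a running sum of the deltas."""
+    samples = list(anchors)
+    for d in deltas:
+        samples.append(samples[-1] + d)
+    return samples
+```
+
+A `00 NN` block fits the same picture as NN zero deltas, so it can be
+fed through `accumulate` as `[0] * nn`.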
+
+### Segment header (`40 02`, 20 bytes total) — REWRITTEN 2026-05-11
+
+| Payload offset | Field | Status |
+|---|---|---|
+| [0:2] | Previous-channel delta — 1st extension sample (int16 BE) | ✅ confirmed |
+| [2:4] | Previous-channel delta — 2nd extension sample (int16 BE) | ✅ confirmed |
+| [4:6] | Unknown (likely checksum) | ❓ open |
+| [6:8] | Byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
+| [8:12] | Monotonic uint32 LE counter (starts ~0x47) | ✅ confirmed |
+| [12:14] | Constant `02 00` | ✅ confirmed |
+| [14:16] | THIS segment's channel — sample 0 anchor (int16 BE, 16-count units) | ✅ confirmed |
+| [16:18] | THIS segment's channel — sample 1 anchor (int16 BE, 16-count units) | ✅ confirmed |
+
+**Key insight (2026-05-11 late):** every segment carries 510 main
+samples (2 anchor + 508 deltas) PLUS 2 continuation samples that live
+in the NEXT segment header.  So each channel-segment effectively spans
+512 sample-sets.  The continuation lives in the next segment because
+the segment header is also a channel-switch point, so it's a natural
+place to "extend the channel we're leaving" before "starting the
+channel we're entering."
+
+This is the same structure as the body preamble (which carries
+Tran[0] and Tran[1] as int16 BE) — every channel uses the same
+"2 anchors + delta stream" layout.
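+
+The header table can be turned into a small parser.  This is a sketch
+with hypothetical names (`SegmentHeader`, `parse_segment_header`), not
+the production code; it keeps the unexplained bytes at payload `[4:6]`
+as raw bytes rather than guessing at a checksum.
+
+```python
+from __future__ import annotations
+
+import struct
+from dataclasses import dataclass
+
+
+@dataclass
+class SegmentHeader:
+    prev_ext: tuple[int, int]     # previous channel's two extension values
+    unknown: bytes                # payload [4:6], meaning still open
+    len_to_next_minus_2: int      # payload [6:8], uint16 BE
+    counter: int                  # payload [8:12], uint32 LE
+    anchors: tuple[int, int]      # this channel's samples 0 and 1
+
+
+def parse_segment_header(block: bytes) -> SegmentHeader:
+    """Parse one 20-byte `40 02` block (2-byte tag + 18-byte payload)."""
+    if len(block) != 20 or block[:2] != b"\x40\x02":
+        raise ValueError("not a 20-byte 40 02 segment header")
+    p = block[2:]
+    if p[12:14] != b"\x02\x00":
+        raise ValueError("constant 02 00 missing at payload [12:14]")
+    return SegmentHeader(
+        prev_ext=struct.unpack(">hh", p[0:4]),
+        unknown=bytes(p[4:6]),
+        len_to_next_minus_2=struct.unpack(">H", p[6:8])[0],
+        counter=struct.unpack("<I", p[8:12])[0],
+        anchors=struct.unpack(">hh", p[14:18]),
+    )
+```
+
+The counter is the only little-endian field, so it gets its own
+`struct.unpack` call instead of sharing a format string with the
+big-endian fields.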
+
+## Channel rotation — VERIFIED 2026-05-11
+
+```
+(initial body)  →  Tran samples 0..509       (preamble + delta blocks)
+segment 0 hdr  ext+anchor →  Vert samples 0..511   ← anchor in hdr [14:18]
+segment 1 hdr  ext+anchor →  Long samples 0..511
+segment 2 hdr  ext+anchor →  Mic  samples 0..511
+segment 3 hdr  ext+anchor →  Tran samples 510..1021 (continuation)
+segment 4 hdr  ext+anchor →  Vert samples 512..1023
+segment 5 hdr  ext+anchor →  Long samples 512..1023
+segment 6 hdr  ext+anchor →  Mic  samples 512..1023
+segment 7 hdr  ext+anchor →  Tran samples 1022..1533
+...
+```
+
+Implementation: `decode_waveform_v2()` returns
+`{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}` with
+each channel's samples in 16-count units.  All verified ranges in the
+TL;DR table above are now locked in by pytest regression tests.
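+
+The rotation listing above follows a simple pattern, sketched here as a
+function from segment-header index to channel and sample range.  The
+name `segment_span` is hypothetical, and the formula is only fitted to
+the rows shown above (Tran is offset by 510 because its first 510
+samples come from the initial body).
+
+```python
+from __future__ import annotations
+
+ROTATION = ("Vert", "Long", "MicL", "Tran")
+
+
+def segment_span(k: int) -> tuple[str, int, int]:
+    """Channel and inclusive sample range that segment header k introduces."""
+    channel = ROTATION[k % 4]
+    cycle = k // 4
+    first = 512 * cycle + (510 if channel == "Tran" else 0)
+    return channel, first, first + 511
+```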
+
+## What was still open earlier (resolved later on 2026-05-11)
+
+1. **`30 NN` block content.**  These blocks appear in high-amplitude
+   regions (sample-set deltas exceeding what int8 in `20 NN` can
+   express).  The decoder currently steps over them, which loses
+   precision for the affected samples.  Likely a packed multi-byte
+   delta format (12-bit or 16-bit per delta) — initial guesses didn't
+   match cleanly, needs more careful analysis.
+
+2. **MicL decoding.**  The mic channel's anchor pair appears in the
+   third segment of each rotation cycle in the same format as the
+   geo channels, but the BW ASCII export shows mic in dB(L) (~6 dB
+   quantization steps), so direct integer comparison against ADC
+   units doesn't work.  Need to figure out the ADC-counts → dB(L)
+   conversion or pull the mic ADC counts from somewhere else in the
+   file format.
+
+3. **Walker fix for event-b.**  The original quiet bundle's event-b
+   still bails out partway through.  Lower priority since the other
+   7 events walk cleanly.
+
+## `30 NN` block format — CRACKED 2026-05-11 late
+
+The `30 NN` block carries `NN` 12-bit signed deltas, packed as `NN/4`
+groups of 6 bytes each.  Within each 6-byte group:
+
+```
+bytes [0:2]  = 16 bits = 4 × 4-bit "high nibbles" (MSB-first)
+bytes [2:6]  = 4 × uint8 "low bytes" (unsigned; the sign comes from the high nibble)
+
+For k in 0..3:
+    high_nibble = (header_word >> (12 - 4*k)) & 0xF
+    raw_12 = (high_nibble << 8) | low_byte[k]
+    delta[k] = raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12
+```
+
+The block's total length is `NN × 1.5 + 2` bytes (tag included).  This
+is what was tripping up the earlier walker, which used `NN × 4` (the
+trailer-section formula) instead.
+
+Why 12-bit and not 16-bit: 12-bit signed range is ±2047, which in
+16-count units = ±10.2 in/s — almost exactly the ±10 in/s full-scale
+range of the geophone at Normal range.  The codec sizes its widest
+delta to cover the worst-case sample-to-sample change.
+
+Verified against all 14 `30 NN` blocks across the bundled fixture
+events.  Every delta decodes byte-exact against BW's ASCII export.
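+
+The group layout above translates directly into Python.  This is a
+sketch rather than the code in `minimateplus/waveform_codec.py`; the
+name `decode_30_group` is hypothetical, and reading the 16-bit header
+word as big-endian is an assumption drawn from the "MSB-first" wording.
+
+```python
+from __future__ import annotations
+
+
+def decode_30_group(group: bytes) -> list[int]:
+    """Decode one 6-byte group into four signed 12-bit deltas."""
+    if len(group) != 6:
+        raise ValueError("a 30 NN group is exactly 6 bytes")
+    header_word = int.from_bytes(group[0:2], "big")   # assumed big-endian
+    deltas = []
+    for k in range(4):
+        high_nibble = (header_word >> (12 - 4 * k)) & 0xF
+        raw_12 = (high_nibble << 8) | group[2 + k]    # low byte is unsigned
+        deltas.append(raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12)
+    return deltas
+```
+
+A full `30 NN` payload is `NN/4` such groups back to back, so decoding a
+block amounts to slicing the payload into 6-byte chunks and
+concatenating the results.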
+
+## Test fixtures
+
+Committed under `tests/fixtures/`:
+
+- `decode-re-5-8-26/event-a..event-d/`: original quiet bundle (4 events,
+  PPV < 1 in/s).  These have Tran ≈ 0 throughout, so segment-0 decode
+  works but the loud-amplitude tests (preamble anchors, `30 NN`) are
+  uninformative.
+- `5-11-26/M529LL1A.{SP0,SS0,SV0}`: loud bundle (PPV 6-7 in/s on all
+  channels).  These cracked the Tran codec.
+- `5-11-26/M529LL1L.{JQ0,V70}`: targeted captures.  JQ0 is Vert-heavy,
+  V70 is Mic-heavy (140 dB).  These cracked the `00 NN` RLE rule.
+
+Each fixture has a `.TXT` Blastware ASCII export as ground truth.
+
+## Tests
+
+`tests/test_waveform_codec.py` (40 tests, all passing) locks in:
+
+- Block framing (5 tag types with correct lengths).
+- Walker contiguity (no gaps or overlaps).
+- Segment header parsing (counter monotonicity, fixed-pattern check).
+- `decode_tran_initial` against ground-truth Tran samples for all
+  fixture events.
+
+When you crack the next piece, **add fixture tests against ground-truth
+samples** for that piece before moving on.  Don't let unverified code
+ship without a regression lock-in.
@@ -0,0 +1,634 @@
+#!/usr/bin/env python3
+"""
+experiments.py — Protocol minimization experiments for MiniMate Plus.
+
+Goal: figure out which steps in Blastware's sequences are truly required vs.
+cargo-culted, so we can build a faster, smarter client.
+
+Each experiment is self-contained (opens its own TCP connection) and reports
+PASS / FAIL / INCONCLUSIVE with timing and notes.
+
+Usage:
+    python experiments.py [--host IP] [--port PORT] [exp1 exp2 ...]
+
+    Run all:       python experiments.py
+    Run specific:  python experiments.py cold_status fast_event_count no_5a
+
+Available experiments
+---------------------
+  cold_status       EXP1  Monitor status (1C) with NO prior POLL
+  fast_event_count  EXP2  Event count via POLL+08 only — skip identity reads
+  no_5a             EXP3  Event record (0C) without bulk waveform stream (5A)
+  skip_1e           EXP4  0A/0C directly with cached key — skip initial 1E
+  fewer_polls       EXP5  Only 1 POLL before 5A instead of Blastware's 3
+  compliance_only   EXP6  Write compliance ONLY (71x3→72), skip event index+trigger+waveform
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+import struct
+import sys
+import time
+from dataclasses import dataclass, field
+from typing import Optional
+
+logging.basicConfig(
+    level=logging.WARNING,          # experiment output is via print(); set DEBUG for wire trace
+    format="%(asctime)s  %(levelname)-7s  %(name)-20s  %(message)s",
+    datefmt="%H:%M:%S",
+)
+log = logging.getLogger("experiments")
+
+# ── Imports ───────────────────────────────────────────────────────────────────
+
+from minimateplus.transport import TcpTransport
+from minimateplus.protocol import (
+    MiniMateProtocol,
+    ProtocolError,
+    TimeoutError as ProtoTimeout,
+    SUB_MONITOR_STATUS,
+    SUB_SERIAL_NUMBER,
+    SUB_FULL_CONFIG,
+    SUB_EVENT_INDEX,
+    SUB_COMPLIANCE,
+    SUB_WRITE_CONFIRM_A,
+    SUB_WRITE_CONFIRM_B,
+)
+from minimateplus.framing import build_bw_frame, SESSION_RESET
+from minimateplus.client import (
+    MiniMateClient,
+    _decode_compliance_config_into,
+    _encode_compliance_config,
+)
+from minimateplus.models import DeviceInfo
+
+
+DEFAULT_HOST = "63.43.212.232"
+DEFAULT_PORT = 9034
+
+
+# ── Result container ──────────────────────────────────────────────────────────
+
+@dataclass
+class Result:
+    name:     str
+    outcome:  str        # "PASS" | "FAIL" | "INCONCLUSIVE"
+    elapsed:  float = 0.0
+    notes:    str   = ""
+    details:  dict  = field(default_factory=dict)
+
+    def __str__(self) -> str:
+        sym = {"PASS": "✅", "FAIL": "❌", "INCONCLUSIVE": "⚠️ "}.get(self.outcome, "?")
+        lines = [f"  {sym}  {self.outcome:13s}  {self.name}  ({self.elapsed:.1f}s)"]
+        if self.notes:
+            lines.append(f"       {self.notes}")
+        for k, v in self.details.items():
+            lines.append(f"       {k}: {v}")
+        return "\n".join(lines)
+
+
+# ── Connection helpers ────────────────────────────────────────────────────────
+
+def connect_proto(host: str, port: int, timeout: float = 15.0) -> tuple[TcpTransport, MiniMateProtocol]:
+    """Open a raw TCP connection and return (transport, proto) without any handshake."""
+    t = TcpTransport(host, port)
+    t.connect()
+    proto = MiniMateProtocol(t, recv_timeout=timeout)
+    return t, proto
+
+
+def connect_client(host: str, port: int, timeout: float = 30.0) -> tuple[MiniMateClient, DeviceInfo]:
+    """Open a MiniMateClient and run the full connect() handshake."""
+    transport = TcpTransport(host, port)
+    client = MiniMateClient(transport=transport, timeout=timeout)
+    client.open()
+    info = client.connect()
+    return client, info
+
+
+# ── Experiment runner ─────────────────────────────────────────────────────────
+
+def run(name: str, fn, *args, **kwargs) -> Result:
+    print(f"\n{'─'*60}")
+    print(f"  Running: {name}")
+    print(f"{'─'*60}")
+    t0 = time.monotonic()
+    try:
+        outcome, notes, details = fn(*args, **kwargs)
+    except Exception as exc:
+        outcome = "FAIL"
+        notes   = f"Uncaught exception: {exc}"
+        details = {}
+        log.exception("Experiment %s raised:", name)
+    elapsed = time.monotonic() - t0
+    r = Result(name=name, outcome=outcome, elapsed=elapsed, notes=notes, details=details)
+    print(str(r))
+    return r
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP1 — Monitor status (1C) with NO prior POLL
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  Blastware always does a full POLL handshake before any other command.
+#  We want to know: can we query SUB 1C (battery, memory, monitoring state)
+#  cold, with only a SESSION_RESET signal and no POLL at all?
+#
+#  If PASS: status checks become near-instant (no ~1s POLL round-trip).
+#  If FAIL: we need POLL first, but maybe we can cache it.
+
+def exp_cold_status(host: str, port: int) -> tuple[str, str, dict]:
+    """SUB 1C without any POLL — just SESSION_RESET + 1C probe + 1C data."""
+    t, proto = connect_proto(host, port)
+    try:
+        print("  Sending SESSION_RESET only (no POLL)")
+        t.write(SESSION_RESET)
+        time.sleep(0.1)
+
+        print("  Sending SUB 1C probe (no POLL first)…")
+        rsp_sub = (0xFF - SUB_MONITOR_STATUS) & 0xFF   # 0xE3
+        t.write(build_bw_frame(SUB_MONITOR_STATUS, 0x00))
+        probe = proto._recv_one(expected_sub=rsp_sub, timeout=8.0)
+        print(f"  1C probe OK  page_key=0x{probe.page_key:04X}  data={probe.data.hex()}")
+
+        t.write(build_bw_frame(SUB_MONITOR_STATUS, 0x2C))
+        data_rsp = proto._recv_one(expected_sub=rsp_sub, timeout=8.0)
+
+        section = data_rsp.data
+        print(f"  1C data OK  {len(section)} bytes  hex: {section.hex()}")
+
+        # Decode battery + memory from the end of the section
+        details = {"raw_bytes": len(section)}
+        if len(section) >= 10:
+            batt_raw = struct.unpack_from(">H", section, len(section) - 10)[0]
+            mem_total = struct.unpack_from(">I", section, len(section) - 8)[0]
+            mem_free  = struct.unpack_from(">I", section, len(section) - 4)[0]
+            is_monitoring = (section[1] == 0x10)
+            details["battery_v"]     = f"{batt_raw / 100:.2f} V"
+            details["memory_total"]  = f"{mem_total:,} bytes"
+            details["memory_free"]   = f"{mem_free:,} bytes"
+            details["monitoring"]    = is_monitoring
+            print(f"  battery={batt_raw/100:.2f}V  mem_free={mem_free:,}  monitoring={is_monitoring}")
+
+        return "PASS", "SUB 1C responded without any POLL — cold status read works!", details
+
+    except ProtoTimeout:
+        return "FAIL", "Device did not respond to 1C without POLL (timeout)", {}
+    except ProtocolError as exc:
+        return "FAIL", f"Protocol error: {exc}", {}
+    finally:
+        t.disconnect()
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP2 — Fast event count: POLL + SUB 08 only (skip identity reads)
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  Blastware's connect() does: POLL → 15 → 01 → 1A → 08
+#  We want to know: can we skip 15/01/1A and go straight from POLL to 08?
+#
+#  Reading identity (15, 01) and full compliance (1A, ~2126 bytes over TCP)
+#  takes several seconds on each connect. If we only need the event count, skipping
+#  them would be a huge win.
+#
+#  If PASS: fast status poll = POLL + 08 only (~2 round trips vs ~8+).
+
+def exp_fast_event_count(host: str, port: int) -> tuple[str, str, dict]:
+    """POLL startup → SUB 08 only, skip serial/config/compliance reads."""
+    t, proto = connect_proto(host, port)
+    try:
+        print("  Running startup (POLL only)…")
+        proto.startup()
+        print("  POLL OK — now reading SUB 08 (event index) directly…")
+
+        idx_raw = proto.read_event_index()
+        print(f"  SUB 08 OK  {len(idx_raw)} bytes")
+
+        # Try to decode event count from SUB 08 payload
+        # The raw block is 88 bytes; bytes [3:7] may be a count (uint32 BE)
+        details = {"idx_raw_len": len(idx_raw)}
+        if len(idx_raw) >= 7:
+            count_candidate = struct.unpack_from(">I", idx_raw, 3)[0]
+            details["count_candidate"] = count_candidate
+            print(f"  idx[3:7] as uint32 BE = {count_candidate} (may or may not be event count)")
+
+        # Also verify we can read 1E without the identity reads having been done
+        print("  Reading 1E (event header) to confirm event access works…")
+        key4, data8 = proto.read_event_first()
+        is_empty = data8[4:8] == b"\x00\x00\x00\x00"
+        details["first_key"] = key4.hex()
+        details["is_empty"]  = is_empty
+        print(f"  1E OK  key={key4.hex()}  empty={is_empty}")
+
+        return "PASS", "POLL+08+1E all work without identity reads (15/01/1A skipped)", details
+
+    except ProtocolError as exc:
+        return "FAIL", f"Protocol error: {exc}", {}
+    finally:
+        t.disconnect()
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP3 — Get event record (0C) without bulk waveform stream (5A)
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  Blastware's event download = 1E → 0A → 1E-arm → 0C → 1F(dl) → POLL×3 → 5A → 1F(browse)
+#
+#  The 5A bulk stream is the slow part (several large frames, ~1s+ per event).
+#  We only need 5A for: client, operator, seis_loc, notes (not in 0C).
+#  If you don't need those fields, can we do: 1E → 0A → 0C → 1F(browse) ?
+#
+#  Two variants tested:
+#    3a: Skip 1E-arm AND 5A — just 0A → 0C → 1F(browse)
+#    3b: Include 1E-arm but skip 5A+POLL — 0A → 1E-arm → 0C → 1F(browse)
+#
+#  If PASS: event peak values available without the slow bulk stream.
+#  If FAIL on 3a but PASS on 3b: 1E-arm required even without 5A.
+
+def exp_no_5a(host: str, port: int) -> tuple[str, str, dict]:
+    """Event record via 0A→0C without 5A or POLL×3. Tests both with and without 1E-arm."""
+    t, proto = connect_proto(host, port)
+    try:
+        print("  Startup (POLL)…")
+        proto.startup()
+
+        # Get the first event key via 1E
+        key4, data8 = proto.read_event_first()
+        if data8[4:8] == b"\x00\x00\x00\x00":
+            return "INCONCLUSIVE", "Device has no stored events — cannot test", {}
+        print(f"  First event key: {key4.hex()}")
+
+        details: dict = {"key": key4.hex()}
+
+        # ── Variant 3a: 0A → 0C → 1F(browse), no 1E-arm ─────────────────────
+        print("\n  [3a] 0A → 0C → 1F(browse)  (NO 1E-arm, NO 5A)")
+        try:
+            _hdr, rec_len = proto.read_waveform_header(key4)
+            print(f"  0A OK  rec_len=0x{rec_len:02X}")
+            record_3a = proto.read_waveform_record(key4)
+            print(f"  0C OK  {len(record_3a)} bytes")
+            # Check for recognizable content
+            has_tran = b"Tran" in record_3a
+            has_vert = b"Vert" in record_3a
+            has_long = b"Long" in record_3a
+            print(f"  0C content check: Tran={has_tran}  Vert={has_vert}  Long={has_long}")
+            details["3a_0c_bytes"] = len(record_3a)
+            details["3a_has_peaks"] = has_tran and has_vert and has_long
+
+            # Now try browse 1F without any 5A
+            key4_next, data8_next = proto.advance_event(browse=True)
+            null_sentinel = data8_next[4:8] == b"\x00\x00\x00\x00"
+            print(f"  1F(browse) → key={key4_next.hex()}  null={null_sentinel}")
+            details["3a_1f_ok"]   = True
+            details["3a_outcome"] = "PASS"
+        except ProtocolError as exc:
+            print(f"  3a FAILED: {exc}")
+            details["3a_outcome"] = f"FAIL: {exc}"
+            # Try to recover by reconnecting for 3b
+            t.disconnect()
+            # Rebind t/proto directly so the outer finally closes the new socket
+            t, proto = connect_proto(host, port)
+            proto.startup()
+            key4, data8 = proto.read_event_first()
+            if data8[4:8] == b"\x00\x00\x00\x00":
+                return "FAIL", f"3a failed and device empty on retry: {exc}", details
+
+        # ── Variant 3b: 0A → 1E-arm → 0C → 1F(browse), no 5A ───────────────
+        print("\n  [3b] 0A → 1E-arm(0xFE) → 0C → 1F(browse)  (NO POLL×3, NO 5A)")
+        try:
+            _hdr, rec_len = proto.read_waveform_header(key4)
+            print(f"  0A OK  rec_len=0x{rec_len:02X}")
+
+            # 1E download-arm (token=0xFE) between 0A and 0C
+            proto.read_event_first(token=0xFE)
+            print("  1E-arm OK")
+
+            record_3b = proto.read_waveform_record(key4)
+            print(f"  0C OK  {len(record_3b)} bytes")
+            has_tran = b"Tran" in record_3b
+            print(f"  0C content check: Tran={has_tran}  Vert={b'Vert' in record_3b}")
+            details["3b_0c_bytes"] = len(record_3b)
+            details["3b_has_peaks"] = has_tran and b"Vert" in record_3b and b"Long" in record_3b
+
+            # Browse 1F without 5A / POLL×3
+            key4_next2, data8_next2 = proto.advance_event(browse=True)
+            null_sentinel2 = data8_next2[4:8] == b"\x00\x00\x00\x00"
+            print(f"  1F(browse) → key={key4_next2.hex()}  null={null_sentinel2}")
+            details["3b_1f_ok"]   = True
+            details["3b_outcome"] = "PASS"
+        except ProtocolError as exc:
+            print(f"  3b FAILED: {exc}")
+            details["3b_outcome"] = f"FAIL: {exc}"
+
+        # Summarize
+        a_ok = details.get("3a_outcome") == "PASS"
+        b_ok = details.get("3b_outcome") == "PASS"
+        if a_ok:
+            return "PASS", "3a: 0A→0C works with NO 1E-arm and NO 5A. Huge speedup possible!", details
+        elif b_ok:
+            return "PASS", "3b: 0A→1E-arm→0C works without 5A (1E-arm still needed before 0C)", details
+        else:
+            return "FAIL", "Both 3a and 3b failed — 5A may be required for device state", details
+
+    except ProtocolError as exc:
+        return "FAIL", f"Protocol error during setup: {exc}", {}
+    finally:
+        try:
+            t.disconnect()
+        except Exception:
+            pass
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP4 — Skip initial 1E if we already know the event key
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  In Blastware, every session starts with 1E to discover the first key.
+#  But if we already fetched and cached the event keys from a previous session,
+#  can we skip 1E entirely and go straight to 0A(cached_key)?
+#
+#  Practical use case: we poll the device every N minutes. We already know
+#  all the event keys from last time. On re-connect, can we go direct to 0A?
+#
+#  If PASS: subsequent polls that don't add new events can skip 1E discovery.
+
+def exp_skip_1e(host: str, port: int) -> tuple[str, str, dict]:
+    """Get the first event key, disconnect, reconnect, go straight to 0A (skip 1E)."""
+    # Phase 1: get the key
+    t, proto = connect_proto(host, port)
+    try:
+        proto.startup()
+        key4, data8 = proto.read_event_first()
+        if data8[4:8] == b"\x00\x00\x00\x00":
+            return "INCONCLUSIVE", "No events stored — cannot test", {}
+        print(f"  Phase 1: got event key = {key4.hex()}")
+    finally:
+        t.disconnect()
+    time.sleep(0.5)
+
+    # Phase 2: fresh connection, skip 1E, go straight to 0A with cached key
+    t2, proto2 = connect_proto(host, port)
+    try:
+        print("  Phase 2: fresh connection — startup + 0A directly (no 1E)")
+        proto2.startup()
+
+        _hdr, rec_len = proto2.read_waveform_header(key4)
+        print(f"  0A OK  rec_len=0x{rec_len:02X}")
+
+        record = proto2.read_waveform_record(key4)
+        has_peaks = b"Tran" in record
+        print(f"  0C OK  {len(record)} bytes  has_peaks={has_peaks}")
+
+        details = {
+            "cached_key":  key4.hex(),
+            "0c_bytes":    len(record),
+            "has_peaks":   has_peaks,
+        }
+        return "PASS", "0A works with cached key — 1E discovery can be skipped on known sessions", details
+
+    except ProtocolError as exc:
+        return "FAIL", f"0A failed with cached key (device needs 1E first?): {exc}", {"key": key4.hex()}
+    finally:
+        t2.disconnect()
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP5 — Fewer POLLs before 5A (try POLL×1 instead of Blastware's POLL×3)
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  Blastware always sends 3 full POLL probe+data cycles between 1F and 5A.
+#  Each POLL is a round trip. Can we get away with just 1?
+#
+#  WARNING: If POLL×1 fails, the device may be in a bad state. We try to
+#  recover with an extra POLL×2 and a fresh 5A attempt. Even on failure we
+#  try to leave the device in a usable state.
+#
+#  Strategy: run the full event sequence up to 1F(download), then try 5A
+#  with only 1 POLL. If 5A responds → PASS. If timeout → try 2 more POLLs
+#  and check if the device recovers.
+
+def exp_fewer_polls(host: str, port: int) -> tuple[str, str, dict]:
+    """Full sequence to 1F, then only 1 POLL before 5A (Blastware does 3)."""
+    t, proto = connect_proto(host, port)
+    try:
+        proto.startup()
+
+        key4, data8 = proto.read_event_first()
+        if data8[4:8] == b"\x00\x00\x00\x00":
+            return "INCONCLUSIVE", "No events stored — cannot test", {}
+        print(f"  Event key: {key4.hex()}")
+
+        # Full setup: 0A → 1E-arm → 0C → 1F(download)
+        _hdr, rec_len = proto.read_waveform_header(key4)
+        print(f"  0A OK  rec_len=0x{rec_len:02X}")
+        proto.read_event_first(token=0xFE)   # 1E-arm
+        print("  1E-arm OK")
+        proto.read_waveform_record(key4)
+        print("  0C OK")
+        arm_key4, _ = proto.advance_event(browse=False)  # 1F(download) — arms 5A
+        print(f"  1F(download) OK  arm_key={arm_key4.hex()}")
+
+        # Only 1 POLL (Blastware does 3)
+        print("  Sending 1 POLL (instead of 3)…")
+        proto.poll()
+        print("  POLL ok — now probing 5A…")
+
+        try:
+            frames = proto.read_bulk_waveform_stream(key4, stop_after_metadata=True, max_chunks=12)
+            print(f"  5A OK after 1 POLL  — {len(frames)} frames received")
+            details = {"poll_count": 1, "frames": len(frames)}
+            return "PASS", "5A works with only 1 POLL (saved 2 round-trips per event)!", details
+
+        except ProtoTimeout:
+            print("  5A timed out after 1 POLL — device needs more POLLs")
+            # Attempt recovery: send 2 more POLLs and see if 5A then works
+            print("  Attempting recovery: 2 more POLLs…")
+            try:
+                proto.poll()
+                proto.poll()
+                frames2 = proto.read_bulk_waveform_stream(key4, stop_after_metadata=True, max_chunks=12)
+                print(f"  5A worked after total 3 POLLs ({len(frames2)} frames)")
+                return "FAIL", "5A needs 3 POLLs — 1 is not enough (recovery confirmed 3 still works)", {
+                    "poll_count_tried": 1, "recovery_polls": 2, "total_polls": 3, "recovery_frames": len(frames2)
+                }
+            except ProtocolError as exc2:
+                return "FAIL", f"5A failed even after 3 total POLLs — device may need reconnect: {exc2}", {}
+
+    except ProtocolError as exc:
+        return "FAIL", f"Setup failed: {exc}", {}
+    finally:
+        t.disconnect()
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  EXP6 — Compliance-only write (71×3→72), skip event index + trigger + waveform
+# ══════════════════════════════════════════════════════════════════════════════
+#
+#  Blastware's full write sequence: 68→73 | 71×3→72 | 82→83 | 69→74→72
+#  We want to know: can we write ONLY the compliance block (71×3→72)?
+#
+#  Test procedure:
+#    1. Read current compliance config (SUB 1A)
+#    2. Patch the "notes" field to a test marker
+#    3. Write ONLY 71×3→72 (skip 68, 73, 82, 83, 69, 74, final 72)
+#    4. Read back (SUB 1A) and verify the change was written
+#    5. Restore original value
+#
+#  If PASS: we can push individual config fields without touching event index,
+#           trigger config, or waveform data — huge simplification.
+#  If FAIL: the device needs the full write sequence (may reject partial write).
+#
+#  SAFETY: We restore original data in a finally block. If the restore write
+#  fails, the device will have the test marker in "notes" — harmless but visible.
+
+_EXP6_MARKER = "[exp6-test]"
+
+def exp_compliance_only(host: str, port: int) -> tuple[str, str, dict]:
+    """Write compliance block alone (71×3→72), verify, and restore."""
+    client, info = connect_client(host, port)
+    original_raw: Optional[bytes] = None
+    try:
+        proto = client._proto
+        if proto is None:
+            return "FAIL", "Could not get protocol handle from client", {}
+
+        # 1. Read current compliance
+        print("  Reading current compliance config (SUB 1A)…")
+        original_raw = proto.read_compliance_config()
+        print(f"  Got {len(original_raw)} bytes of compliance config")
+
+        # Find current notes value for display
+        info_obj = DeviceInfo()
+        _decode_compliance_config_into(original_raw, info_obj)
+        cc = info_obj.compliance_config
+        orig_notes = cc.notes if cc else "(unknown)"
+        print(f"  Current notes field: {orig_notes!r}")
+
+        # 2. Build modified payload with test marker in notes
+        test_notes = _EXP6_MARKER
+        modified_raw = _encode_compliance_config(
+            original_raw,
+            notes=test_notes,
+        )
+        print(f"  Encoded modified compliance payload ({len(modified_raw)} bytes)")
+        print(f"  Patching notes: {orig_notes!r} → {test_notes!r}")
+
+        # 3. Write ONLY the compliance block: 71×3 → 72
+        print("  Writing compliance ONLY (71×3→72) — skipping 68/73/82/83/69/74…")
+        proto.write_compliance_config_raw(modified_raw)
+        print("  Write complete — device acked 71×3→72")
+
+        # 4. Read back and verify
+        print("  Reading back compliance config to verify…")
+        readback_raw = proto.read_compliance_config()
+        readback_info = DeviceInfo()
+        _decode_compliance_config_into(readback_raw, readback_info)
+        rb_cc = readback_info.compliance_config
+        readback_notes = rb_cc.notes if rb_cc else "(decode failed)"
+        print(f"  Read-back notes: {readback_notes!r}")
+
+        write_worked = (readback_notes == test_notes)
+        print(f"  Write verified: {write_worked}")
+
+        details = {
+            "original_notes":  orig_notes,
+            "written_notes":   test_notes,
+            "readback_notes":  readback_notes,
+            "write_verified":  write_worked,
+        }
+
+        if write_worked:
+            return "PASS", "Compliance-only write works! No event index or trigger writes needed.", details
+        else:
+            return "FAIL", f"Write was not reflected in read-back (got {readback_notes!r})", details
+
+    except ProtocolError as exc:
+        return "FAIL", f"Protocol error: {exc}", {}
+
+    finally:
+        # Restore original compliance data regardless of outcome
+        if original_raw is not None:
+            print("  Restoring original compliance config…")
+            try:
+                proto2 = client._proto
+                if proto2:
+                    proto2.write_compliance_config_raw(
+                        _encode_compliance_config(original_raw)  # no-op patch = verbatim
+                    )
+                    print("  Restore complete")
+                else:
+                    print("  WARNING: protocol handle gone — could not restore")
+            except Exception as exc_r:
+                print(f"  WARNING: restore failed: {exc_r}")
+        client.close()
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+#  Registry + main
+# ══════════════════════════════════════════════════════════════════════════════
+
+EXPERIMENTS = {
+    "cold_status":        ("EXP1", exp_cold_status,        "Monitor status (1C) with no POLL"),
+    "fast_event_count":   ("EXP2", exp_fast_event_count,   "Event count via POLL+08, skip identity reads"),
+    "no_5a":              ("EXP3", exp_no_5a,              "Event record (0C) without bulk waveform (5A)"),
+    "skip_1e":            ("EXP4", exp_skip_1e,            "0A/0C with cached key — skip initial 1E"),
+    "fewer_polls":        ("EXP5", exp_fewer_polls,        "1 POLL before 5A instead of Blastware's 3"),
+    "compliance_only":    ("EXP6", exp_compliance_only,    "Compliance-only write (71×3→72), no other blocks"),
+}
+
+
+def main() -> None:
+    ap = argparse.ArgumentParser(description="MiniMate Plus protocol minimization experiments")
+    ap.add_argument("--host", default=DEFAULT_HOST)
+    ap.add_argument("--port", type=int, default=DEFAULT_PORT)
+    ap.add_argument("--debug", action="store_true", help="Enable DEBUG wire logging")
+    ap.add_argument("experiments", nargs="*",
+                    help=f"Which to run (default: all). Choices: {', '.join(EXPERIMENTS)}")
+    args = ap.parse_args()
+
+    if args.debug:
+        logging.getLogger().setLevel(logging.DEBUG)
+
+    which = args.experiments or list(EXPERIMENTS.keys())
+    unknown = [e for e in which if e not in EXPERIMENTS]
+    if unknown:
+        print(f"Unknown experiments: {unknown}")
+        print(f"Available: {', '.join(EXPERIMENTS)}")
+        sys.exit(1)
+
+    print(f"\n{'═'*60}")
+    print("  MiniMate Plus Protocol Minimization Experiments")
+    print(f"  Target: {args.host}:{args.port}")
+    print(f"  Running: {', '.join(which)}")
+    print(f"{'═'*60}")
+
+    results: list[Result] = []
+    for key in which:
+        tag, fn, desc = EXPERIMENTS[key]
+        label = f"{tag}: {desc}"
+        r = run(label, fn, args.host, args.port)
+        results.append(r)
+        time.sleep(1.5)   # brief pause between experiments — let device settle
+
+    print(f"\n\n{'═'*60}")
+    print("  SUMMARY")
+    print(f"{'═'*60}")
+    for r in results:
+        sym = {"PASS": "✅", "FAIL": "❌", "INCONCLUSIVE": "⚠️ "}.get(r.outcome, "?")
+        print(f"  {sym}  {r.outcome:13s}  {r.name}")
+    print(f"{'═'*60}")
+
+    passed = sum(1 for r in results if r.outcome == "PASS")
+    failed = sum(1 for r in results if r.outcome == "FAIL")
+    inconclusive = sum(1 for r in results if r.outcome == "INCONCLUSIVE")
+    print(f"  {passed} passed  {failed} failed  {inconclusive} inconclusive")
+
+
+if __name__ == "__main__":
+    try:
+        main()
+    except KeyboardInterrupt:
+        print("\nInterrupted.")
+        sys.exit(0)
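The tail decode used in `exp_cold_status` can be exercised offline, without a device. The sketch below assumes the layout that experiment reads (battery as uint16 BE in hundredths of a volt, then total and free memory as uint32 BE, in the last 10 bytes, with byte 1 equal to 0x10 while monitoring). That layout is taken from the experiment code and is not independently confirmed; `response_sub` and `decode_status_tail` are hypothetical helper names.

```python
import struct


def response_sub(request_sub: int) -> int:
    """Response SUB as computed in exp_cold_status: 0xFF minus the request SUB."""
    return (0xFF - request_sub) & 0xFF


def decode_status_tail(section: bytes) -> dict:
    """Decode battery and memory from the last 10 bytes of a SUB 1C section.

    Assumed layout (from exp_cold_status): >H battery*100, >I total, >I free.
    """
    if len(section) < 10:
        raise ValueError(f"section too short: {len(section)} bytes")
    batt_raw, mem_total, mem_free = struct.unpack_from(
        ">HII", section, len(section) - 10
    )
    return {
        "battery_v": batt_raw / 100,
        "memory_total": mem_total,
        "memory_free": mem_free,
        "monitoring": section[1] == 0x10,
    }
```

Keeping the decode in a pure function like this would let a captured 1C payload serve as a regression fixture, independent of the TCP session.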
@@ -0,0 +1,48 @@
+"""
+micromate — Instantel Micromate (Series IV) device library.
+
+Sibling of ``minimateplus`` (the Series III library).  Currently scoped to
+the offline-file ingest path used by thor-watcher: parsing the per-event
+``.IDFH``/``.IDFW`` ASCII text sidecars Thor's exporter writes alongside
+each binary event file, and wrapping the parsed data in typed event
+records.
+
+Live-device support (TCP protocol, frame parsing, real-time monitoring)
+is deferred — when we add it, it lands here as ``transport.py`` /
+``framing.py`` / ``protocol.py`` / ``client.py``, mirroring the
+``minimateplus`` package layout.
+
+Typical usage (offline file ingest):
+
+    from micromate import IdfEvent, parse_idf_report
+
+    text  = open("UM11719_20231219162723.IDFW.txt").read()
+    rep   = parse_idf_report(text)                       # dict
+    event = IdfEvent.from_report(rep, "UM11719_20231219162723.IDFW")
+    print(event.serial, event.peaks.transverse_ips, event.mic_pspl_dbl)
+"""
+
+from .idf_ascii_report import (
+    parse_event_filename,
+    parse_idf_report,
+    serial_from_filename,
+)
+from .models import (
+    IdfEvent,
+    IdfPeaks,
+    IdfProjectInfo,
+    IdfReport,
+    IdfSensorCheck,
+)
+
+__version__ = "0.1.0"
+__all__ = [
+    "IdfEvent",
+    "IdfPeaks",
+    "IdfProjectInfo",
+    "IdfReport",
+    "IdfSensorCheck",
+    "parse_event_filename",
+    "parse_idf_report",
+    "serial_from_filename",
+]
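The event filenames in the usage example above follow a `<serial>_<YYYYMMDDHHMMSS>.IDFW` / `.IDFH` shape. A standalone sketch of splitting such a name is shown below. It is not the package's `parse_event_filename` (whose implementation is not shown here); `split_event_filename` is a hypothetical name, and the serial pattern (letters followed by digits) is an assumption based on the sample names.

```python
import datetime
import re
from typing import Optional, Tuple

# Assumed shape: letters+digits serial, 14-digit timestamp, IDFW/IDFH kind,
# with an optional ".txt" suffix for the ASCII sidecar.
_EVENT_NAME_RE = re.compile(
    r"^(?P<serial>[A-Za-z]+\d+)_(?P<stamp>\d{14})\.(?P<kind>IDF[WH])(?:\.txt)?$",
    re.IGNORECASE,
)


def split_event_filename(
    name: str,
) -> Optional[Tuple[str, datetime.datetime, str]]:
    """Return (serial, timestamp, kind) or None if the name does not match."""
    m = _EVENT_NAME_RE.match(name.strip())
    if m is None:
        return None
    try:
        when = datetime.datetime.strptime(m.group("stamp"), "%Y%m%d%H%M%S")
    except ValueError:
        # 14 digits that do not form a valid calendar timestamp.
        return None
    return m.group("serial"), when, m.group("kind").upper()
```

Returning `None` instead of raising mirrors the permissive parsing style described for the sidecar parser; a stricter caller could raise on a failed match instead.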
@@ -0,0 +1,330 @@
+"""
+micromate/idf_ascii_report.py — parse Thor (Micromate Series IV) IDF ASCII reports.
+
+Thor exports a `.IDFW.txt` or `.IDFH.txt` sidecar next to each `.IDFW`
+(waveform) or `.IDFH` (histogram) event binary.  Each sidecar is a
+plain-text file with `"Key : Value"` lines covering the full device-
+authoritative event metadata — PPV per channel, ZC Freq, Time of Peak,
+Peak Acceleration / Displacement, sensor self-check results, project
+strings, calibration date, battery level, etc. — followed by a raw
+waveform-samples block headed by the literal line "Waveform Data Channels".
+
+This is the Thor analogue of `minimateplus/bw_ascii_report.py` for the
+Blastware (Series III) report format.  The parser is intentionally
+permissive: we extract everything we recognise into a flat dict and
+silently ignore anything we don't.  Downstream callers parse units
+(`"0.2119 in/s"` → 0.2119) only on the fields they need.
+
+Example input (truncated):
+
+    "EventType : Full Waveform"
+    "SampleRate : 1024 sps"
+    "EventTime : 16:27:23"
+    "EventDate : 2023-12-19"
+    "TranPPV : 0.0251 in/s"
+    "VertPPV : 0.2119 in/s"
+    "LongPPV : 0.0282 in/s"
+    "PeakVectorSum : 0.2131 in/s"
+    "MicPSPL : 99.4 dB(L)"
+    "TranZCFreq : 6.5 Hz"
+    "SerialNumber : UM11719"
+    "Version : Micromate ISEE 11.0AK"
+    "FileName : UM11719_20231219162723.IDFW"
+    "BatteryLevel : 3.8 volts"
+    "Calibration : November 22, 2023 by Instantel"
+    "TranTestResults : Passed"
+    "TitleString1 : UPMC Presby-Loc 3-Level1-1R Elevator Rm"
+    Waveform Data Channels
+        Tran    Vert    Long    MicL
+        0.0003  -0.0003  0.0003  0.00013
+        ...
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any, Dict, Optional, Tuple, Union
+
+
+# Lines look like:  "Key : Value"   (quotes literal, single ":" separator)
+_LINE_RE = re.compile(r'^\s*"?([^":]+?)"?\s*:\s*"?(.*?)"?\s*$')
+
+# Marker that ends the metadata block — everything after is raw sample data.
+_WAVEFORM_BLOCK_MARKER = "waveform data channels"
+
+
+def _normalize_key(raw: str) -> str:
+    """Convert "TranPPV" / "PreTriggerLength" → snake_case."""
+    s = raw.strip()
+    # Insert "_" at lower/digit→upper transitions and at acronym→word boundaries
+    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
+    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
+    s = s.replace("-", "_").replace(" ", "_")
+    return s.lower()
+
+
+def _strip_unit_suffix(value: str) -> str:
+    """Return the numeric part of values like "0.2119 in/s" → "0.2119".
+
+    Also strips Thor's below/above-threshold prefixes:
+      "<0.005 in/s"  → "0.005"   (below-noise-floor reading)
+      ">100 Hz"      → "100"     (above-measurement-range reading)
+    """
+    parts = value.strip().split()
+    token = parts[0] if parts else value.strip()
+    if token.startswith("<") or token.startswith(">"):
+        token = token[1:]
+    return token
+
+
+def _parse_float(value: str) -> Optional[float]:
+    try:
+        return float(_strip_unit_suffix(value))
+    except (ValueError, TypeError):
+        return None
+
+
+def _parse_int(value: str) -> Optional[int]:
+    try:
+        return int(float(_strip_unit_suffix(value)))
+    except (ValueError, TypeError):
+        return None
+
+
+def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
+    """
+    Parse a Thor IDFW.txt / IDFH.txt sidecar.
+
+    Returns a flat dict with two kinds of entries:
+
+      - **Raw fields** — every `Key : Value` line, keyed by snake_case
+        of the original key, value as a string (unit suffix preserved).
+        Lets callers grab any field we haven't explicitly normalised.
+
+      - **Derived fields** — a curated set with parsed types:
+          * `serial_number`     str
+          * `event_type`        str  ("Full Waveform" / "Full Histogram")
+          * `event_datetime`    ISO-8601 string ("YYYY-MM-DDTHH:MM:SS") when
+                                 both EventDate and EventTime are present
+          * `sample_rate`       int  (samples/sec)
+          * `tran_ppv`,`vert_ppv`,`long_ppv` float (in/s)
+          * `mic_ppv`           float (dB or psi — same units as MicPSPL)
+          * `peak_vector_sum`   float (in/s)
+          * `tran_zc_freq`,`vert_zc_freq`,`long_zc_freq` float (Hz)
+          * `record_time_sec`   float (seconds)
+          * `pre_trigger_sec`   float (seconds)
+          * `project`           str  (from TitleString1 — Thor's location)
+          * `client`            str  (TitleString2)
+          * `operator`          str  (TitleString3 — company/operator)
+          * `notes`             str  (TitleString4)
+          * `setup`             str
+          * `version`           str  (firmware)
+          * `battery_volts`     float
+          * `calibration_text`  str  (e.g. "November 22, 2023 by Instantel")
+          * `tran_test_passed`, `vert_test_passed`, `long_test_passed`,
+            `mic_test_passed`  bool  ("Passed" → True; anything else → False)
+          * `filename`          str  (FileName line — useful sanity check)
+
+    Stops parsing at the literal "Waveform Data Channels" line; the
+    raw-samples block is left to whoever wants to decode the binary.
+
+    Input may be `str` or `bytes` (`utf-8`/`latin-1` tolerant).
+    """
+    if isinstance(text, bytes):
+        try:
+            text = text.decode("utf-8")
+        except UnicodeDecodeError:
+            text = text.decode("latin-1", errors="replace")
+
+    raw: Dict[str, str] = {}
+
+    for line in text.splitlines():
+        stripped = line.strip()
+        if not stripped:
+            continue
+        if stripped.lower().startswith(_WAVEFORM_BLOCK_MARKER):
+            break
+        m = _LINE_RE.match(stripped)
+        if not m:
+            continue
+        key = _normalize_key(m.group(1))
+        value = m.group(2).strip()
+        # Multi-value lines (Channel, Units, etc.) — coalesce by appending.
+        if key in raw:
+            raw[key] = raw[key] + "; " + value
+        else:
+            raw[key] = value
+
+    out: Dict[str, Any] = dict(raw)  # keep all raw fields
+
+    # ── Derived fields ───────────────────────────────────────────────────────
+
+    def _take(*candidates: str) -> Optional[str]:
+        for c in candidates:
+            if c in raw:
+                return raw[c]
+        return None
+
+    # Event identity
+    if "serial_number" in raw:
+        out["serial_number"] = raw["serial_number"]
+    if "event_type" in raw:
+        out["event_type"] = raw["event_type"]
+    if "file_name" in raw:
+        out["filename"] = raw["file_name"]
+
+    # Combined date+time.  Waveform sidecars use "EventDate" / "EventTime";
+    # histogram sidecars use "HistogramStartDate" / "HistogramStartTime".
+    # Prefer the event_* names when both are present.
+    ed = raw.get("event_date") or raw.get("histogram_start_date")
+    et = raw.get("event_time") or raw.get("histogram_start_time")
+    if ed and et:
+        try:
+            dt = datetime.datetime.strptime(f"{ed} {et}", "%Y-%m-%d %H:%M:%S")
+            out["event_datetime"] = dt.isoformat()
+        except ValueError:
+            pass
+
+    # Numeric scalars.  For every field we typify here, we MUST drop the
+    # raw string copy from `out` when parsing fails — Thor writes things
+    # like "<0.005 in/s" (below threshold) and "N/A" (not measured) that
+    # would otherwise linger in `out` as strings, sneak into SQLite REAL
+    # columns via permissive type affinity, and then crash the JS
+    # frontend on `.toFixed(...)`.
+    int_fields = ("sample_rate",)
+    for key in int_fields:
+        v = raw.get(key)
+        if v is None:
+            continue
+        iv = _parse_int(v)
+        if iv is not None:
+            out[key] = iv
+        else:
+            out.pop(key, None)
+
+    float_fields = (
+        "tran_ppv", "vert_ppv", "long_ppv", "peak_vector_sum",
+        "tran_zc_freq", "vert_zc_freq", "long_zc_freq",
+        "tran_peak_acceleration", "vert_peak_acceleration",
+        "long_peak_acceleration",
+        "tran_peak_displacement", "vert_peak_displacement",
+        "long_peak_displacement",
+        "mic_zc_freq",
+    )
+    for key in float_fields:
+        v = raw.get(key)
+        if v is None:
+            continue
+        fv = _parse_float(v)
+        if fv is not None:
+            out[key] = fv
+        else:
+            out.pop(key, None)
+
+    # Time-of-peak: Thor labels these "TimeofPeak" (lowercase "of") so the
+    # normalizer produces "*_timeof_peak".  Map them to the canonical
+    # ``*_time_of_peak`` output keys for downstream consumers.
+    for raw_key, out_key in (
+        ("tran_timeof_peak", "tran_time_of_peak"),
+        ("vert_timeof_peak", "vert_time_of_peak"),
+        ("long_timeof_peak", "long_time_of_peak"),
+        ("mic_timeof_peak",  "mic_time_of_peak"),
+    ):
+        v = raw.get(raw_key)
+        if v is None:
+            continue
+        fv = _parse_float(v)
+        if fv is not None:
+            out[out_key] = fv
+
+    # Microphone — Thor reports MicPSPL (dB(L)) which is the closest
+    # analogue to BW's mic_ppv.  The raw "99.4 dB(L)" string stays in
+    # `out` under the original `mic_pspl` key for display; the parsed
+    # float goes in `mic_ppv`.
+    mic = raw.get("mic_pspl")
+    if mic is not None:
+        fv = _parse_float(mic)
+        if fv is not None:
+            out["mic_ppv"] = fv
+
+    # Record / pre-trigger duration.  The typed ``*_sec`` keys are only
+    # set when parsing succeeds, so no unparsed string lands under them.
+    rt = raw.get("record_time")
+    if rt is not None:
+        fv = _parse_float(rt)
+        if fv is not None:
+            out["record_time_sec"] = fv
+    pt = raw.get("pre_trigger_length")
+    if pt is not None:
+        fv = _parse_float(pt)
+        if fv is not None:
+            out["pre_trigger_sec"] = fv
+
+    # Project / client / operator / location strings.  Thor's title
+    # strings are operator-defined; conventional mapping (per Thor's
+    # default TitleNote labels in the example data):
+    #   TitleString1 = Location  → project (sensor location identifier)
+    #   TitleString2 = Client    → client
+    #   TitleString3 = Company   → operator (the monitoring company)
+    #   TitleString4 = Notes     → notes
+    out["project"]  = _take("title_string1")
+    out["client"]   = _take("title_string2")
+    out["operator"] = _take("title_string3", "operator")
+    out["notes"]    = _take("title_string4", "post_event_note")
+
+    if "setup" in raw:
+        out["setup"] = raw["setup"]
+    if "version" in raw:
+        out["version"] = raw["version"]
+
+    # Battery (e.g. "3.8 volts" → 3.8)
+    bl = raw.get("battery_level")
+    if bl is not None:
+        fv = _parse_float(bl)
+        if fv is not None:
+            out["battery_volts"] = fv
+
+    # Calibration line is free-form (e.g. "November 22, 2023 by Instantel").
+    if "calibration" in raw:
+        out["calibration_text"] = raw["calibration"]
+
+    # Sensor self-check results — bool flags
+    for key, out_key in (
+        ("tran_test_results", "tran_test_passed"),
+        ("vert_test_results", "vert_test_passed"),
+        ("long_test_results", "long_test_passed"),
+        ("mic_test_results",  "mic_test_passed"),
+    ):
+        v = raw.get(key)
+        if v is not None:
+            out[out_key] = v.strip().lower() == "passed"
+
+    return out
+
+
+def serial_from_filename(name: str) -> Optional[str]:
+    """Convenience: pull the serial prefix from a Thor event filename.
+
+    Thor uses the literal serial as the filename prefix:
+      UM11719_20231219163444.IDFW  →  "UM11719"
+      BE9439_20200713124251.IDFH   →  "BE9439"
+    """
+    m = re.match(r"^([A-Z]{2}\d+)_\d{14}\.(IDFH|IDFW)(?:\.txt)?$",
+                 name, re.IGNORECASE)
+    return m.group(1).upper() if m else None
+
+
+def parse_event_filename(name: str) -> Optional[Tuple[str, datetime.datetime, str]]:
+    """Parse `<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>` → (serial, datetime, kind).
+
+    `kind` is "IDFH" or "IDFW" (upper-case).  Returns None on no match.
+    """
+    m = re.match(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$",
+                 name, re.IGNORECASE)
+    if not m:
+        return None
+    try:
+        ts = datetime.datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
+    except ValueError:
+        return None
+    return m.group(1).upper(), ts, m.group(3).upper()
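The drop-on-failure rule that `parse_idf_report` applies to its numeric fields can be illustrated in isolation. This is a minimal sketch, not the module's code: `parse_float_prefix` and `typify` are hypothetical names introduced here, and the regex is an assumption (the real `_parse_float` is defined outside this hunk and may accept more formats). It shows why a value such as `"<0.005 in/s"` or `"N/A"` should disappear from the typed key rather than survive as a string.

```python
import re
from typing import Any, Dict, Iterable, Optional

# Hypothetical stand-in for _parse_float: accept only a leading number.
_FLOAT_PREFIX_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)")


def parse_float_prefix(value: str) -> Optional[float]:
    """Return the leading number of e.g. '0.1500 in/s', or None."""
    m = _FLOAT_PREFIX_RE.match(value)
    return float(m.group(1)) if m else None


def typify(raw: Dict[str, str], float_keys: Iterable[str]) -> Dict[str, Any]:
    """Copy raw fields, replacing numeric ones with floats.

    A numeric key whose value fails to parse is removed, so a string
    can never reach a consumer that expects a number.
    """
    out: Dict[str, Any] = dict(raw)
    for key in float_keys:
        if key not in raw:
            continue
        fv = parse_float_prefix(raw[key])
        if fv is None:
            out.pop(key, None)   # drop the unparsable raw string
        else:
            out[key] = fv
    return out


raw = {
    "tran_ppv": "0.1500 in/s",
    "vert_ppv": "<0.005 in/s",   # below threshold
    "long_ppv": "N/A",           # not measured
    "setup": "Default",          # untyped field, passed through
}
typed = typify(raw, ("tran_ppv", "vert_ppv", "long_ppv"))
print(typed)
```

Only `tran_ppv` survives as a float; the two unparsable readings are absent, so a caller can treat a missing key as "no numeric value" without type checks.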
@@ -0,0 +1,530 @@
+"""
+micromate/idf_file.py — Thor IDF binary codec.
+
+Decodes the Instantel Micromate Series IV ``.IDFW`` (waveform) and
+``.IDFH`` (histogram) binary on-disk format.  Sister module to
+``minimateplus/event_file_io.py``.
+
+Status (2026-05-28):
+
+- **Genuine Series IV / Thor binaries** are all signed
+  ``00 12 01 00 00 00 Instantel\\0`` (sig-A in earlier notes).  Two
+  Series III (Blastware) binaries appear in the example corpus
+  (``BE9439_*``) — they share the ``.IDFW``/``.IDFH`` extension by
+  filing convention but carry a BW STRT header (``10 00 01 80 00 00
+  Instantel STRT...``) and are NOT Thor data.  The reader detects
+  them by signature and raises NotImplementedError pointing callers
+  at ``minimateplus.event_file_io.read_blastware_file()``.
+- **IDFW waveform body** reuses the BW segment-rotated block codec
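+  verbatim.  Body most often starts at file offset ``0x0f1f`` (about
+  half of production events); other offsets are located by the
+  ``00 02 00`` scan in ``_find_waveform_body_offset``.  Samples
+  decoded via ``minimateplus.waveform_codec.decode_waveform_v2``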
+  with 87–99% byte-exact match against ``.IDFW.txt`` sidecar (quiet
+  events).  Loud events hit the BW codec's known walker-stops-early
+  limit.  Residual ~3% drift on per-sample deltas — likely a
+  Thor-specific 12-bit delta refinement that BW's codec doesn't
+  model.  Geo LSB = 0.0003 in/s; mic factor ~2.14e-6 psi/count.
+- **IDFH histogram body**: 10-byte segment header
+  ``[len_be 2B] 0a 00 00 00 [00 NN_counter] 05 3f`` introduces a
+  segment of ``N`` 72-byte interval records (``N = (len - 10) // 72``).
+  Each record holds 4 × 16-byte per-channel min/max/halfp + 8-byte
+  tail.  Geo peaks via ``max(|min|, |max|) / 32768 × 10`` in/s
+  (matches sidecar within ~1.8%), freq via ``512 / halfp`` Hz.
+  **All 859 Thor IDFH files in the corpus decode (181,071 intervals).**
+- Binary metadata directly extracted: serial, timestamp, sample_rate,
+  record_time, calibration_date.  Other fields fall back to the paired
+  ``.IDFW.txt`` / ``.IDFH.txt`` sidecar (consumed by
+  ``WaveformStore.save_imported_idf``).
+
+The full reverse-engineering writeup lives in
+``docs/idf_protocol_reference.md``.
+"""
+
+from __future__ import annotations
+
+import datetime
+import struct
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Optional, Union
+
+from minimateplus.waveform_codec import decode_waveform_v2
+
+from .models import IdfEvent, IdfPeaks, IdfReport
+
+
+# Genuine Series IV / Thor IDF binary signature: 6 bytes, then ASCII "Instantel".
+_THOR_PREFIX = b"\x00\x12\x01\x00\x00\x00"
+# Stray Series III (Blastware) binaries that occasionally turn up in Thor
+# corpus directories renamed to the .IDFW/.IDFH convention.  Their header
+# (`10 00 01 80 00 00 Instantel STRT ...`) is byte-for-byte a BW SUB 5A
+# STRT record, not a Thor binary.  Detected so we can refuse-and-route
+# rather than mis-parse.
+_BW_STRAY_PREFIX = b"\x10\x00\x01\x80\x00\x00"
+_INSTANTEL_TAG = b"Instantel"
+
+# Most common body offset for sig-A IDFW files (~50% of prod events;
+# 151/154 in the original tests/fixtures/THORDATA_example corpus).  The
+# body is the segment-rotated block stream consumed by decode_waveform_v2;
+# bytes [0:3] are the magic ``00 02 00`` preamble.  Production events
+# routinely use other offsets — see :func:`_find_waveform_body_offset`
+# for the dynamic scan.  This constant survives only as the priority hint.
+_BODY_START_SIG_A = 0x0F1F
+
+# Magic bytes that mark a candidate waveform-body preamble.
+_BODY_MAGIC = b"\x00\x02\x00"
+
+# Where to start looking for body candidates inside the file.  Skip the
+# fixed-header region where the same magic legitimately appears inside
+# channel-test records and the compliance block (offsets 0x015d, 0x091c,
+# 0x0ae2, 0x0d30 in observed events).
+_BODY_SCAN_FLOOR = 0x0E00
+
+# Geophone count → in/s, derived from sidecar ground truth: the smallest
+# non-zero sample in the 1,014-file corpus is 0.0003 in/s.
+_GEO_LSB_IPS = 0.0003
+
+# Microphone count → psi, derived from sidecar regression on 50 sample
+# pairs from UM11719_20231219162723.IDFW (mic-heavy event).
+_MIC_LSB_PSI = 2.14e-6
+
+# IDFH histogram constants.
+_IDFH_INTERVAL_SIZE = 72        # bytes per per-interval record
+_IDFH_SEGMENT_HEADER = 10       # bytes: [len_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
+_IDFH_SEGMENT_TAIL   = 2        # bytes after the interval data block, before next marker
+_IDFH_HALFP_FREQ_NUM = 512.0    # freq_hz = NUM / halfp; halfp ≤ 5 means ">100 Hz" sentinel
+_IDFH_GEO_FULL_SCALE = 10.0     # in/s — Normal range
+_IDFH_INT16_FS = 32768.0
+_IDFH_CHANNELS = ("Tran", "Vert", "Long", "MicL")
+
+
+# ─── Binary metadata extraction ─────────────────────────────────────────────
+
+
+@dataclass
+class IdfBinaryMetadata:
+    """Fields recoverable from the sig-A binary header (no .txt needed)."""
+    serial:           Optional[str] = None
+    event_datetime:   Optional[datetime.datetime] = None
+    sample_rate:      Optional[int] = None
+    record_time_sec:  Optional[float] = None
+    calibration_date: Optional[datetime.date] = None
+
+
+def _read_ascii_z(buf: bytes, off: int, maxlen: int = 64) -> Optional[str]:
+    if off >= len(buf):
+        return None
+    end = buf.find(b"\x00", off, off + maxlen)
+    if end < 0:
+        end = min(off + maxlen, len(buf))
+    s = buf[off:end].decode("ascii", errors="replace").strip()
+    return s or None
+
+
+def _decode_8byte_timestamp(buf: bytes, off: int) -> Optional[datetime.datetime]:
+    """Layout: ``[day][month][year_hi][year_lo][unknown][hour][min][sec]``."""
+    if off + 8 > len(buf):
+        return None
+    day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
+    year = (yh << 8) | yl
+    if not (2015 <= year <= 2050 and 1 <= mon <= 12 and 1 <= day <= 31
+            and 0 <= hr < 24 and 0 <= mn < 60 and 0 <= sc < 60):
+        return None
+    try:
+        return datetime.datetime(year, mon, day, hr, mn, sc)
+    except ValueError:
+        return None
+
+
+def extract_binary_metadata(buf: bytes) -> IdfBinaryMetadata:
+    """Pull serial/timestamp/sample_rate/record_time/calibration from the
+    sig-A binary header.
+
+    Field positions confirmed against UM11719_20231219162723.IDFW; stable
+    across the 151-file sig-A corpus.
+    """
+    md = IdfBinaryMetadata()
+
+    # Serial: null-terminated ASCII at 0x14E.
+    md.serial = _read_ascii_z(buf, 0x14E, maxlen=16)
+
+    # Sample rate + record time live in a BW-compatible compliance block.
+    # Locate the 6-byte anchor `be 80 00 00 00 00` and read offsets relative
+    # to it: anchor-6 = sample_rate uint16 BE; anchor+6 = record_time float32 BE.
+    anchor = buf.find(b"\xbe\x80\x00\x00\x00\x00", 0x800, 0xA00)
+    if anchor > 0:
+        sr_bytes = buf[anchor - 6 : anchor - 4]
+        if len(sr_bytes) == 2:
+            sr = int.from_bytes(sr_bytes, "big")
+            if sr in (256, 512, 1024, 2048, 4096):
+                md.sample_rate = sr
+        rt_bytes = buf[anchor + 6 : anchor + 10]
+        if len(rt_bytes) == 4:
+            try:
+                rt = struct.unpack(">f", rt_bytes)[0]
+                if 0.1 <= rt <= 600.0:
+                    md.record_time_sec = float(rt)
+            except struct.error:
+                pass
+
+    # Event timestamp: 8 bytes.  Position differs between IDFW (0x97A) and
+    # IDFH (0x9F8); try both known offsets and accept the first valid decode.
+    for off in (0x97A, 0x9F8):
+        ts = _decode_8byte_timestamp(buf, off)
+        if ts is not None:
+            md.event_datetime = ts
+            break
+
+    # Calibration date: day, month, year_be at 0x194-0x197.
+    if len(buf) > 0x197:
+        day, mon = buf[0x194], buf[0x195]
+        year = int.from_bytes(buf[0x196 : 0x198], "big")
+        if 1 <= mon <= 12 and 1 <= day <= 31 and 2015 <= year <= 2050:
+            try:
+                md.calibration_date = datetime.date(year, mon, day)
+            except ValueError:
+                pass
+
+    return md
+
+
+# ─── Sample decoder + unit conversion ───────────────────────────────────────
+
+
+def _find_waveform_body_offset(buf: bytes) -> Optional[int]:
+    """Pick the file offset of the waveform body by trial-decoding every
+    ``00 02 00`` magic position past the fixed-header region.
+
+    The body's location isn't fixed across all sig-A IDFW files — about
+    half the production events use ``0x0f1f``, but the rest have offsets
+    that shift based on header padding / channel-config layout.  We
+    auto-detect by:
+
+      1. Find every ``00 02 00`` occurrence past ``_BODY_SCAN_FLOOR``.
+      2. Try ``decode_waveform_v2()`` on each candidate.
+      3. Pick the offset whose decoded sample count is largest.
+
+    Returns the offset, or ``None`` if no candidate yielded more than
+    the trivial 2-sample preamble (= "no real body found").
+
+    Costs ~2-8 trial decodes per file; in practice the first candidate
+    past 0x0e00 is usually the right one.
+    """
+    if len(buf) < _BODY_SCAN_FLOOR + 8:
+        return None
+    best: Optional[tuple[int, int]] = None   # (total_samples, offset)
+    i = _BODY_SCAN_FLOOR
+    while True:
+        j = buf.find(_BODY_MAGIC, i)
+        if j < 0:
+            break
+        i = j + 1
+        try:
+            decoded = decode_waveform_v2(buf[j:])
+        except Exception:
+            continue
+        if not decoded:
+            continue
+        total = sum(len(v) for v in decoded.values())
+        # A "real" body has more than just the 2-sample preamble.
+        if total <= 2:
+            continue
+        if best is None or total > best[0]:
+            best = (total, j)
+    return best[1] if best else None
+
+
+def _decode_waveform_samples(buf: bytes) -> Optional[dict]:
+    """Decode samples from the sig-A waveform body.
+
+    Returns the raw decoder counts dict — geo LSB = 0.0003 in/s, mic in
+    its own count unit (see :func:`mic_count_to_psi`).  Returns None if
+    no usable body is found.
+
+    Uses :func:`_find_waveform_body_offset` to locate the body — the
+    file-offset varies across events (~50% sit at the canonical
+    ``0x0f1f`` but the rest don't), so the previous hardcoded constant
+    silently produced 2-sample preamble-only output for half the corpus.
+    """
+    off = _find_waveform_body_offset(buf)
+    if off is None:
+        return None
+    return decode_waveform_v2(buf[off:])
+
+
+def geo_count_to_ips(count: int) -> float:
+    """Convert a Thor geo decoder count to in/s.  LSB = 0.0003 in/s."""
+    return count * _GEO_LSB_IPS
+
+
+def mic_count_to_psi(count: int) -> float:
+    """Convert a Thor mic decoder count to psi.  Scale derived from
+    regression over 50 sample pairs in UM11719_20231219162723.IDFW;
+    consistent to ~5%.  Calibration constants from the channel block
+    can refine this once decoded.
+    """
+    return count * _MIC_LSB_PSI
+
+
+# ─── IDFH histogram decoder ─────────────────────────────────────────────────
+
+
+@dataclass
+class IdfhInterval:
+    """One decoded histogram interval (typically one minute of monitoring)."""
+    offset:    int    # file byte offset of the 72-byte record
+    # Per-channel min/max ADC counts (int16 BE), half-period samples, peak count.
+    # Peak = max(|min|, |max|).  freq_hz = 512/halfp (None if halfp ≤ 5 →
+    # ">100 Hz" sentinel; matches sidecar convention).
+    tran_min:    int
+    tran_max:    int
+    tran_halfp:  int
+    vert_min:    int
+    vert_max:    int
+    vert_halfp:  int
+    long_min:    int
+    long_max:    int
+    long_halfp:  int
+    micl_min:    int
+    micl_max:    int
+    micl_halfp:  int
+
+    def peak_count(self, channel: str) -> int:
+        mn = getattr(self, f"{channel.lower()}_min")
+        mx = getattr(self, f"{channel.lower()}_max")
+        return max(abs(mn), abs(mx))
+
+    def peak_ips(self, channel: str) -> float:
+        """Convert peak count to in/s (geo channels only)."""
+        return self.peak_count(channel) / _IDFH_INT16_FS * _IDFH_GEO_FULL_SCALE
+
+    def freq_hz(self, channel: str) -> Optional[float]:
+        halfp = getattr(self, f"{channel.lower()}_halfp")
+        if halfp <= 5:
+            return None
+        return _IDFH_HALFP_FREQ_NUM / halfp
+
+
+def _decode_idfh_interval(buf72: bytes, offset: int) -> IdfhInterval:
+    """Decode one 72-byte interval record into per-channel min/max/halfp."""
+    fields = []
+    for i in range(4):
+        block = buf72[i * 16 : (i + 1) * 16]
+        mn = struct.unpack_from(">h", block, 0)[0]
+        mx = struct.unpack_from(">h", block, 2)[0]
+        # block[4:6] = int16 BE, role unknown (possibly time-of-peak)
+        halfp = struct.unpack_from(">H", block, 6)[0]
+        # block[10:12] and block[14:16] are uint16 BE with unknown semantics
+        # (likely sum / count contributions for the PVS computation).
+        fields.extend([mn, mx, halfp])
+    # Tail 8 bytes (buf72[64:72]) carry PVS-related data; not yet decoded.
+    return IdfhInterval(
+        offset=offset,
+        tran_min=fields[0], tran_max=fields[1], tran_halfp=fields[2],
+        vert_min=fields[3], vert_max=fields[4], vert_halfp=fields[5],
+        long_min=fields[6], long_max=fields[7], long_halfp=fields[8],
+        micl_min=fields[9], micl_max=fields[10], micl_halfp=fields[11],
+    )
+
+
+def decode_idfh_body(buf: bytes) -> list:
+    """Walk an IDFH file and decode every interval record.
+
+    The body has one or more segments; each segment header is 10 bytes:
+    ``[length_be 2B][0a 00 00 00][00 NN_counter][05 3f]`` where ``length``
+    counts from the length field through the end of the interval block
+    (= 10 + 72 × n_intervals).  Segments are separated by a 2-byte tail
+    + next-segment 2-byte prefix (the bytes before the next length field).
+    Confirmed against the 859-file corpus (181,071 intervals decoded; 1
+    failure is the sig-B BE9439 file).
+    """
+    intervals: list = []
+    i = 0
+    while True:
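+        j = buf.find(b"\x0a\x00\x00\x00", i)
+        if j < 0:
+            break
+        if j < 2:
+            # No room for the 2-byte length prefix; keep scanning.
+            i = j + 1
+            continue
+        if j + 8 > len(buf):
+            break  # truncated header at end of file
+        # Validate: [length_be][0a 00 00 00][00 NN][05 3f]
+        if buf[j + 4] != 0x00 or buf[j + 6 : j + 8] != b"\x05\x3f":
+            i = j + 1
+            continue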
+        length = int.from_bytes(buf[j - 2 : j], "big")
+        n = (length - _IDFH_SEGMENT_HEADER) // _IDFH_INTERVAL_SIZE
+        if n <= 0:
+            i = j + 1
+            continue
+        header_start = j - 2
+        interval_start = header_start + _IDFH_SEGMENT_HEADER
+        for k in range(n):
+            off = interval_start + k * _IDFH_INTERVAL_SIZE
+            if off + _IDFH_INTERVAL_SIZE > len(buf):
+                break
+            chunk = buf[off : off + _IDFH_INTERVAL_SIZE]
+            intervals.append(_decode_idfh_interval(chunk, off))
+        # Advance past this segment + the 2-byte tail.
+        i = header_start + length + _IDFH_SEGMENT_TAIL
+    return intervals
+
+
+# ─── Top-level reader ───────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfReadResult:
+    """Return type for :func:`read_idf_file`.
+
+    For waveforms (``.IDFW``), ``samples`` holds the per-channel sample
+    arrays in Thor decoder counts.  For histograms (``.IDFH``),
+    ``samples`` is empty and ``intervals`` holds the per-interval
+    record list (peaks, freqs).
+    """
+    event:           IdfEvent
+    samples:         dict   # {"Tran": [...], ...} for IDFW; empty for IDFH
+    binary_metadata: IdfBinaryMetadata
+    signature:       str    # always "thor" for now (sig-A genuine Thor)
+    intervals:       Optional[list] = None  # list[IdfhInterval] for IDFH; None for IDFW
+
+
+def read_idf_file(
+    path: Union[str, Path],
+    *,
+    data: Optional[bytes] = None,
+) -> IdfReadResult:
+    """Parse a Thor ``.IDFW`` binary into an ``IdfEvent`` + decoded samples.
+
+    Implements signature-A (genuine Thor) files: ``.IDFW`` waveforms and
+    ``.IDFH`` histograms, selected by the ``path`` suffix.  Stray Series
+    III (Blastware) binaries raise NotImplementedError (route them through
+    ``minimateplus.event_file_io.read_blastware_file()``); any other
+    signature raises ValueError.
+
+    Returns an :class:`IdfReadResult`.  The caller converts int sample
+    counts to physical units via :func:`geo_count_to_ips` /
+    :func:`mic_count_to_psi`.
+
+    ``path`` is used for filename in error messages and ``.IDFH`` vs
+    ``.IDFW`` suffix detection.  When ``data`` is supplied the disk
+    read is skipped — useful for ingest paths that already have the
+    bytes in memory and where the file may not exist on disk yet.
+    """
+    p = Path(path)
+    buf = data if data is not None else p.read_bytes()
+
+    if len(buf) < 16 or buf[6:16] != _INSTANTEL_TAG + b"\x00":
+        raise ValueError(f"{p.name}: not an IDF file (missing Instantel magic)")
+
+    sig_prefix = buf[:6]
+    if sig_prefix == _THOR_PREFIX:
+        signature = "thor"
+    elif sig_prefix == _BW_STRAY_PREFIX:
+        raise NotImplementedError(
+            f"{p.name}: file has a Series III (Blastware) STRT header in "
+            "an IDF-named container — not a Thor binary.  Route through "
+            "minimateplus.event_file_io.read_blastware_file() instead "
+            "(peaks decode; samples & full metadata don't, but it's not "
+            "Thor data so the Thor codec doesn't apply)."
+        )
+    else:
+        raise ValueError(f"{p.name}: unknown IDF signature {sig_prefix.hex()}")
+
+    is_histogram = p.suffix.upper() == ".IDFH"
+    md = extract_binary_metadata(buf)
+
+    if is_histogram:
+        intervals = decode_idfh_body(buf)
+        if not intervals:
+            raise ValueError(f"{p.name}: IDFH body decoded no intervals")
+        # Peaks: max across all intervals on each channel (per-channel max
+        # of stored max-magnitudes; sidecar's PPV row carries the same).
+        peak_tran = max((iv.peak_ips("Tran") for iv in intervals), default=0.0)
+        peak_vert = max((iv.peak_ips("Vert") for iv in intervals), default=0.0)
+        peak_long = max((iv.peak_ips("Long") for iv in intervals), default=0.0)
+        # Mic peak in psi — Thor stores per-interval mic ADC counts in the
+        # binary; convert the max count to psi via the per-count factor.
+        mic_peak_count = max((iv.peak_count("MicL") for iv in intervals), default=0)
+        mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+        rep = IdfReport(
+            serial_number=md.serial,
+            event_type="Full Histogram",
+            event_datetime=md.event_datetime,
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+        )
+        peaks = IdfPeaks(
+            transverse_ips=peak_tran,
+            vertical_ips=peak_vert,
+            longitudinal_ips=peak_long,
+            peak_vector_sum_ips=None,
+            mic_pspl_dbl=None,         # IDFH binary doesn't carry the dB(L) value
+            mic_pspl_psi=mic_peak_psi,
+        )
+        event = IdfEvent(
+            serial=md.serial or "UNKNOWN",
+            timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+            kind="Histogram",
+            filename=p.name,
+            sample_rate=md.sample_rate,
+            record_time_sec=md.record_time_sec,
+            peaks=peaks,
+            report=rep,
+        )
+        return IdfReadResult(
+            event=event,
+            samples={},
+            binary_metadata=md,
+            signature=signature,
+            intervals=intervals,
+        )
+
+    # Waveform path.
+    decoded = _decode_waveform_samples(buf)
+    if decoded is None:
+        raise ValueError(f"{p.name}: waveform body codec failed")
+
+    rep = IdfReport(
+        serial_number=md.serial,
+        event_type="Full Waveform",
+        event_datetime=md.event_datetime,
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+    )
+
+    def _peak_ips(ch: str) -> float:
+        arr = decoded.get(ch, [])
+        return geo_count_to_ips(max((abs(v) for v in arr), default=0))
+
+    # Mic peak psi from binary: max absolute MicL ADC count × 2.14e-6 psi/count.
+    mic_arr = decoded.get("MicL", [])
+    mic_peak_count = max((abs(v) for v in mic_arr), default=0)
+    mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
+
+    peaks = IdfPeaks(
+        transverse_ips=_peak_ips("Tran"),
+        vertical_ips=_peak_ips("Vert"),
+        longitudinal_ips=_peak_ips("Long"),
+        # PVS requires aligned per-sample √(T²+V²+L²); leave None — the
+        # sidecar carries it and the bridge picks it up if present.
+        peak_vector_sum_ips=None,
+        mic_pspl_dbl=None,             # binary IDFW doesn't carry the dB(L) value;
+                                       # sidecar .txt fills it via IdfReport.from_dict
+        mic_pspl_psi=mic_peak_psi,
+    )
+
+    event = IdfEvent(
+        serial=md.serial or "UNKNOWN",
+        timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
+        kind="Waveform",
+        filename=p.name,
+        sample_rate=md.sample_rate,
+        record_time_sec=md.record_time_sec,
+        peaks=peaks,
+        report=rep,
+    )
+
+    return IdfReadResult(
+        event=event,
+        samples=decoded,
+        binary_metadata=md,
+        signature=signature,
+    )
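The IDFH layout described in `idf_file.py` (10-byte segment header, 72-byte interval records, four 16-byte channel blocks plus an 8-byte tail) can be exercised without a real file. The sketch below is an illustration under stated assumptions: `pack_channel`, `build_segment`, and `decode_record` are hypothetical helpers written for this example, the record is synthetic, and bytes whose meaning is unknown are simply zero-filled. The offsets and the conversions (`peak / 32768 × 10` in/s, `512 / halfp` Hz, `halfp ≤ 5` as the ">100 Hz" sentinel) are the ones stated in the module.

```python
import struct

INTERVAL_SIZE = 72      # bytes per interval record
HEADER_SIZE = 10        # [len 2B][0a 00 00 00][00 NN][05 3f]
GEO_CHANNELS = ("Tran", "Vert", "Long")
CHANNELS = GEO_CHANNELS + ("MicL",)


def pack_channel(mn: int, mx: int, halfp: int) -> bytes:
    """Synthetic 16-byte block: min, max, 2 unknown bytes, halfp, 8 unknown bytes."""
    return struct.pack(">hhhH", mn, mx, 0, halfp) + bytes(8)


def build_segment(records: list, counter: int = 1) -> bytes:
    """Wrap records in the 10-byte header plus a 2-byte tail (test data only)."""
    body = b"".join(records)
    length = HEADER_SIZE + len(body)   # counted from the length field itself
    header = (struct.pack(">H", length) + b"\x0a\x00\x00\x00"
              + bytes([0, counter]) + b"\x05\x3f")
    return header + body + b"\x00\x00"


def decode_record(rec: bytes) -> dict:
    """Decode min/max/halfp per channel and derive peak and frequency."""
    out = {}
    for i, ch in enumerate(CHANNELS):
        mn, mx = struct.unpack_from(">hh", rec, i * 16)
        halfp = struct.unpack_from(">H", rec, i * 16 + 6)[0]
        peak = max(abs(mn), abs(mx))
        entry = {
            "peak_count": peak,
            "freq_hz": None if halfp <= 5 else 512.0 / halfp,
        }
        if ch in GEO_CHANNELS:
            entry["peak_ips"] = peak / 32768.0 * 10.0
        out[ch] = entry
    return out


record = (pack_channel(-1638, 1200, 32) + pack_channel(-100, 328, 16)
          + pack_channel(0, 0, 3) + pack_channel(-13, 7, 64) + bytes(8))
segment = build_segment([record, record])

length = int.from_bytes(segment[:2], "big")
n_intervals = (length - HEADER_SIZE) // INTERVAL_SIZE
first = decode_record(segment[HEADER_SIZE:HEADER_SIZE + INTERVAL_SIZE])
print(n_intervals, first["Tran"], first["Long"])
```

Building the segment from the same constants the decoder uses makes the header arithmetic easy to check: two records give `length = 10 + 2 × 72`, and the interval count falls out of the integer division.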
@@ -0,0 +1,323 @@
+"""
+micromate/idf_to_bw_report.py — adapter that projects a parsed Thor IDF
+report (+ binary metadata + decoded IDFH intervals) into the
+``bw_report``-shaped dict that :mod:`sfm.report_pdf.gather_report_data`
+consumes.
+
+Lets Thor events flow through the existing Series III Event Report PDF
+pipeline without duplicating the renderer.  Thor's report content is
+~95% the same data shape as BW's; the field names differ but the
+underlying metrics map 1:1.
+
+Caveats
+───────
+
+- **Mic units** — Thor records ``MicPSPL`` natively in dB(L).  This
+  adapter sets ``bw_report.mic.pspl_dbl`` directly; the report
+  renderer recomputes the equivalent psi via its dBL→psi formula.
+- **Saturation / above-range flags** — Thor doesn't always mark
+  ``OORANGE`` the way BW does; we set ``zc_freq_above_range`` only
+  when a `>100` sentinel was preserved in the raw text.
+- **Per-interval data** — for IDFH events we build ``interval_times``
+  by stepping ``IntervalSize`` from ``HistogramStartTime``; the binary
+  decoder confirms one record per step (882 / 881 / 881 ... across
+  the corpus).
+- **calibration_by parsing** — Thor's free-form ``Calibration : November
+  22, 2023 by Instantel`` is split on ``" by "`` to extract the
+  calibrator; the date prefix is parsed where possible, otherwise
+  the binary-extracted ``calibration_date`` from
+  :class:`micromate.idf_file.IdfBinaryMetadata` wins.
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any, Dict, List, Optional
+
+
+# ─── Helpers ────────────────────────────────────────────────────────────────
+
+
+_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")
+
+
+def _parse_first_number(s: Optional[str]) -> Optional[float]:
+    """Pull the first numeric token from a string like ``"0.1500 in/s"``."""
+    if s is None:
+        return None
+    m = _NUM_RE.search(str(s))
+    if not m:
+        return None
+    try:
+        return float(m.group(0))
+    except ValueError:
+        return None
+
+
+def _parse_interval_size_s(s: Optional[str]) -> Optional[float]:
+    """``"60 sec"`` → 60.0, ``"5 min"`` → 300.0, ``"1 hour"`` → 3600."""
+    if s is None:
+        return None
+    num = _parse_first_number(s)
+    if num is None:
+        return None
+    sl = str(s).lower()
+    if "hour" in sl or "hr" in sl:
+        return num * 3600.0
+    if "min" in sl:
+        return num * 60.0
+    return num   # default to seconds
+
+
+def _parse_calibration(text: Optional[str]) -> tuple[Optional[str], Optional[str]]:
+    """Split ``"November 22, 2023 by Instantel"`` → (ISO date, calibrator).
+
+    Returns ``(None, None)`` if neither half parses.
+    """
+    if not text:
+        return None, None
+    parts = str(text).split(" by ", 1)
+    date_part = parts[0].strip() if parts else None
+    by_part = parts[1].strip() if len(parts) > 1 else None
+    iso_date: Optional[str] = None
+    if date_part:
+        for fmt in ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y"):
+            try:
+                iso_date = datetime.datetime.strptime(date_part, fmt).date().isoformat()
+                break
+            except ValueError:
+                continue
+    return iso_date, by_part
+
+
+def _channel_peaks(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    """Map ``tran_ppv`` / ``tran_zc_freq`` / ... → bw_report.peaks.tran shape."""
+    out: Dict[str, Any] = {}
+    for src, dst in (
+        (f"{ch_lc}_ppv",                 "ppv_ips"),
+        (f"{ch_lc}_zc_freq",             "zc_freq_hz"),
+        (f"{ch_lc}_time_of_peak",        "time_of_peak_s"),
+        (f"{ch_lc}_peak_acceleration",   "peak_accel_g"),
+        (f"{ch_lc}_peak_displacement",   "peak_disp_in"),
+    ):
+        v = idf.get(src)
+        if v is not None:
+            out[dst] = v
+    # ZC freq ">100" sentinel: the raw text carries it under the un-typed
+    # key (e.g. ``raw["tran_zc_freq"]`` would be ``">100"``), and our parser
+    # dropped the typed entry.  Detect that case and flag.
+    raw_zc = idf.get(f"{ch_lc}_zc_freq")
+    if isinstance(raw_zc, str) and ">" in raw_zc:
+        out["zc_freq_above_range"] = True
+        out.pop("zc_freq_hz", None)
+    return out
+
+
+def _sensor_check(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
+    out: Dict[str, Any] = {}
+    fr = idf.get(f"{ch_lc}_test_freq")
+    if fr is not None:
+        out["freq_hz"] = _parse_first_number(fr)
+    rt = idf.get(f"{ch_lc}_test_ratio")
+    if rt is not None:
+        out["ratio"] = _parse_first_number(rt)
+    am = idf.get(f"{ch_lc}_test_amplitude")
+    if am is not None:
+        out["amplitude_mv"] = _parse_first_number(am)
+    res = idf.get(f"{ch_lc}_test_results")
+    if res is not None:
+        out["result"] = str(res).strip()
+    return {k: v for k, v in out.items() if v is not None}
+
+
+def _interval_times(idf: Dict[str, Any], n_intervals: Optional[int]) -> List[str]:
+    """Synthesise per-interval end timestamps: start + interval_size × (k + 1).
+
+    Returns ``[]`` when start time or interval size is unknown.
+    """
+    if not n_intervals:
+        return []
+    start_date = idf.get("histogram_start_date") or idf.get("event_date")
+    start_time = idf.get("histogram_start_time") or idf.get("event_time")
+    iv_str = idf.get("interval_size")
+    iv_s = _parse_interval_size_s(iv_str)
+    if not (start_date and start_time and iv_s):
+        return []
+    try:
+        t0 = datetime.datetime.strptime(f"{start_date} {start_time}", "%Y-%m-%d %H:%M:%S")
+    except ValueError:
+        return []
+    out = []
+    for k in range(int(n_intervals)):
+        t = t0 + datetime.timedelta(seconds=iv_s * (k + 1))
+        out.append(t.isoformat())
+    return out
+
+
+# ─── Top-level adapter ──────────────────────────────────────────────────────
+
+
+def build_bw_report_from_idf(
+    idf_report: Dict[str, Any],
+    *,
+    binary_md=None,
+    intervals: Optional[list] = None,
+    is_histogram: Optional[bool] = None,
+) -> Dict[str, Any]:
+    """Project a parsed IDF report dict (and optional binary metadata +
+    decoded IDFH intervals) into the BW report sidecar shape.
+
+    The returned dict is structurally identical to what
+    ``minimateplus.event_file_io._bw_report_to_dict`` produces from a
+    real BW ASCII report — it can be assigned to
+    ``sidecar["bw_report"]`` and consumed verbatim by
+    ``sfm.report_pdf.gather_report_data``.
+
+    ``intervals`` is the list of :class:`micromate.idf_file.IdfhInterval`
+    objects from :func:`micromate.idf_file.decode_idfh_body`; only used
+    for histogram events, where its length overrides the interval count
+    reported in the text sidecar.
+    """
+    if is_histogram is None:
+        et = str(idf_report.get("event_type", ""))
+        is_histogram = et.lower().startswith("full histogram")
+
+    # ── Trigger / recording / device ─────────────────────────────────────
+    trigger_channel = idf_report.get("trigger")
+    trigger_level   = _parse_first_number(idf_report.get("geo_trigger_level"))
+    geo_range_ips   = _parse_first_number(idf_report.get("geo_range"))
+
+    cal_iso, cal_by = _parse_calibration(idf_report.get("calibration"))
+    # Fall back to the binary-extracted calibration_date when our text
+    # parse fell through; the binary date is unambiguous.
+    if cal_iso is None and binary_md is not None and binary_md.calibration_date:
+        cal_iso = binary_md.calibration_date.isoformat()
+
+    # ── Histogram fields ────────────────────────────────────────────────
+    hist_block: Dict[str, Any] = {
+        "start": None, "stop": None, "n_intervals": None,
+        "interval_size": None, "interval_size_s": None,
+        "channel_peak_when": {},
+    }
+    if is_histogram:
+        sd = idf_report.get("histogram_start_date")
+        st = idf_report.get("histogram_start_time")
+        if sd and st:
+            try:
+                hist_block["start"] = datetime.datetime.strptime(
+                    f"{sd} {st}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        ed = idf_report.get("histogram_stop_date")
+        et_ = idf_report.get("histogram_stop_time")
+        if ed and et_:
+            try:
+                hist_block["stop"] = datetime.datetime.strptime(
+                    f"{ed} {et_}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+        n_raw = idf_report.get("number_of_intervals")
+        if n_raw is not None:
+            try:
+                # Thor reports a float like "81.04"; truncate to int (the BW
+                # report uses an int for the column).
+                hist_block["n_intervals"] = int(float(str(n_raw)))
+            except ValueError:
+                pass
+        # When the binary decoder gave us the actual interval count, prefer it.
+        if intervals is not None:
+            hist_block["n_intervals"] = len(intervals)
+        hist_block["interval_size"] = idf_report.get("interval_size")
+        hist_block["interval_size_s"] = _parse_interval_size_s(idf_report.get("interval_size"))
+        # interval_times derived from start+step (the BW report uses the
+        # exact strings; we match its representation).
+        times = _interval_times(idf_report, hist_block["n_intervals"])
+        # Per-channel peak when (absolute date+time at which the channel's
+        # peak occurred over the histogram run).  Thor splits this into
+        # ``TranPeakDate`` / ``TranPeakTime`` etc.
+        peak_when: Dict[str, str] = {}
+        for ch_label, ch_lc in (("Tran", "tran"), ("Vert", "vert"), ("Long", "long"), ("MicL", "mic")):
+            d = idf_report.get(f"{ch_lc}_peak_date")
+            t = idf_report.get(f"{ch_lc}_peak_time")
+            if d and t:
+                try:
+                    peak_when[ch_label] = datetime.datetime.strptime(
+                        f"{d} {t}", "%Y-%m-%d %H:%M:%S"
+                    ).isoformat()
+                except ValueError:
+                    continue
+        if peak_when:
+            hist_block["channel_peak_when"] = peak_when
+
+    # ── Mic block ────────────────────────────────────────────────────────
+    mic_block = {
+        "weighting":           "L",                   # Thor mic is ISEE Linear
+        "pspl_dbl":            idf_report.get("mic_ppv"),  # the dB(L) float
+        "pspl_saturated":      False,
+        "zc_freq_hz":          idf_report.get("mic_zc_freq"),
+        "zc_freq_above_range": isinstance(idf_report.get("mic_zc_freq"), str)
+                               and ">" in str(idf_report.get("mic_zc_freq")),
+        "time_of_peak_s":      idf_report.get("mic_time_of_peak"),
+    }
+    if mic_block["zc_freq_above_range"]:
+        mic_block["zc_freq_hz"] = None
+
+    # ── Peaks ────────────────────────────────────────────────────────────
+    vs_block = {
+        "ips":       idf_report.get("peak_vector_sum"),
+        "time_s":    _parse_first_number(idf_report.get("peak_vector_sum_time_sum")),
+        "when":      None,
+        "saturated": False,
+    }
+    if is_histogram:
+        # PVS absolute date+time, when present.
+        vs_d = idf_report.get("peak_vector_sum_date")
+        vs_t = idf_report.get("peak_vector_sum_time")
+        if vs_d and vs_t:
+            try:
+                vs_block["when"] = datetime.datetime.strptime(
+                    f"{vs_d} {vs_t}", "%Y-%m-%d %H:%M:%S"
+                ).isoformat()
+            except ValueError:
+                pass
+
+    return {
+        "available":  True,
+        "event_type": idf_report.get("event_type"),
+        "version":    idf_report.get("version"),
+        "trigger": {
+            "channel":       trigger_channel,
+            "geo_level_ips": trigger_level,
+        },
+        "recording": {
+            "sample_rate_sps":  idf_report.get("sample_rate"),
+            "record_time_s":    idf_report.get("record_time_sec"),
+            "pretrig_s":        idf_report.get("pre_trigger_sec"),
+            "stop_mode":        idf_report.get("record_stop_mode"),
+            "geo_range_ips":    geo_range_ips,
+            "units":            idf_report.get("units"),
+        },
+        "device": {
+            "battery_volts":    idf_report.get("battery_volts"),
+            "calibration_date": cal_iso,
+            "calibration_by":   cal_by,
+        },
+        "peaks": {
+            "tran":       _channel_peaks(idf_report, "tran"),
+            "vert":       _channel_peaks(idf_report, "vert"),
+            "long":       _channel_peaks(idf_report, "long"),
+            "vector_sum": vs_block,
+        },
+        "mic":          mic_block,
+        "sensor_check": {
+            "tran": _sensor_check(idf_report, "tran"),
+            "vert": _sensor_check(idf_report, "vert"),
+            "long": _sensor_check(idf_report, "long"),
+            "mic":  _sensor_check(idf_report, "mic"),
+        },
+        "histogram":    hist_block,
+        "monitor_log":  [],
+        "pc_sw_version": None,
+    }
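A minimal standalone sketch of the interval handling in the adapter above, for reviewers who want to check the arithmetic without importing the module. The names `interval_size_seconds` and `interval_end_times` are hypothetical stand-ins for the private helpers `_parse_interval_size_s` and `_interval_times`; the sketch assumes, as the diff does, that each timestamp marks the end of its interval.

```python
import datetime
import re
from typing import List, Optional

_NUM = re.compile(r"-?\d+(?:\.\d+)?")


def interval_size_seconds(text: str) -> Optional[float]:
    """Convert '60 sec' / '5 min' / '1 hour' to seconds (hypothetical helper)."""
    m = _NUM.search(text)
    if m is None:
        return None
    value = float(m.group(0))
    lowered = text.lower()
    if "hour" in lowered or "hr" in lowered:
        return value * 3600.0
    if "min" in lowered:
        return value * 60.0
    return value  # anything else is treated as seconds


def interval_end_times(start: datetime.datetime, step_s: float, n: int) -> List[str]:
    """ISO timestamps for the end of each of n intervals (start + step * (k + 1))."""
    return [
        (start + datetime.timedelta(seconds=step_s * (k + 1))).isoformat()
        for k in range(n)
    ]


# Example values are illustrative only, not taken from a real event.
step = interval_size_seconds("5 min")
times = interval_end_times(datetime.datetime(2025, 8, 4, 19, 0, 47), step, 3)
print(step, times)
```

Because the first timestamp is one full step after the start, a histogram with N intervals ends at start + N × step, which should line up with the reported stop time when the run was not cut short.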
@@ -0,0 +1,398 @@
+"""
+Micromate (Series IV / Thor) native data models.
+
+These are the right-shaped dataclasses for Thor data — Thor measures
+the microphone in dB(L) directly, so this model carries
+``mic_pspl_dbl`` rather than the pseudo-``psi`` shoehorn that
+``minimateplus.PeakValues`` uses for Series III BW data.
+
+The ingest pipeline today goes:
+
+    .IDFW.txt  →  parse_idf_report()  →  dict
+    dict       →  IdfEvent.from_report()  →  IdfEvent  (typed)
+    IdfEvent   →  IdfEvent.to_minimateplus_event()  →  shape DB / sidecar
+                                                     machinery expects
+
+The ``to_minimateplus_event()`` bridge is a temporary boundary — when we
+crack the binary IDF codec and have richer per-event data to store, the
+DB schema will grow Series-IV-specific columns and the bridge will
+shrink or disappear.
+"""
+
+from __future__ import annotations
+
+import datetime
+from dataclasses import dataclass, field
+from typing import Any, Dict, Optional, Tuple
+
+
+# ── IdfReport ─────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfReport:
+    """Typed wrapper around the dict returned by ``parse_idf_report``.
+
+    All fields optional — Thor's exporter is permissive and some IDF .txt
+    files (especially histograms) omit fields that waveform sidecars
+    include.  Use ``.raw`` for any field this dataclass hasn't surfaced
+    yet (the parser keeps every recognised key in the raw dict).
+    """
+
+    # Identity / kind
+    serial_number:     Optional[str] = None
+    event_type:        Optional[str] = None      # "Full Waveform" | "Full Histogram"
+    event_datetime:    Optional[datetime.datetime] = None
+    filename:          Optional[str] = None      # echoed by Thor's exporter
+
+    # Sampling / timing
+    sample_rate:       Optional[int]   = None    # samples/sec
+    record_time_sec:   Optional[float] = None
+    pre_trigger_sec:   Optional[float] = None
+
+    # Geophone peaks (in/s)
+    tran_ppv:          Optional[float] = None
+    vert_ppv:          Optional[float] = None
+    long_ppv:          Optional[float] = None
+    peak_vector_sum:   Optional[float] = None
+
+    # Microphone — Thor's native unit is dB(L), NOT psi.
+    mic_pspl_dbl:      Optional[float] = None
+
+    # Zero-crossing frequencies (Hz)
+    tran_zc_freq:      Optional[float] = None
+    vert_zc_freq:      Optional[float] = None
+    long_zc_freq:      Optional[float] = None
+    mic_zc_freq:       Optional[float] = None
+
+    # Per-channel time of peak (sec, since event start)
+    tran_time_of_peak: Optional[float] = None
+    vert_time_of_peak: Optional[float] = None
+    long_time_of_peak: Optional[float] = None
+    mic_time_of_peak:  Optional[float] = None
+
+    # Derived per-channel motion
+    tran_peak_acceleration: Optional[float] = None    # g
+    vert_peak_acceleration: Optional[float] = None
+    long_peak_acceleration: Optional[float] = None
+    tran_peak_displacement: Optional[float] = None    # in
+    vert_peak_displacement: Optional[float] = None
+    long_peak_displacement: Optional[float] = None
+
+    # Operator-supplied strings (Thor's TitleString1..4 → semantic slots)
+    project:           Optional[str] = None    # TitleString1
+    client:            Optional[str] = None    # TitleString2
+    operator:          Optional[str] = None    # TitleString3
+    notes:             Optional[str] = None    # TitleString4 / PostEventNote
+    setup:             Optional[str] = None    # setup file name
+
+    # Sensor self-check results
+    tran_test_passed:  Optional[bool] = None
+    vert_test_passed:  Optional[bool] = None
+    long_test_passed:  Optional[bool] = None
+    mic_test_passed:   Optional[bool] = None
+
+    # Device-fixed metadata
+    firmware_version:  Optional[str]   = None
+    calibration_text:  Optional[str]   = None
+    battery_volts:     Optional[float] = None
+
+    # Original parser dict — preserves every recognised key (including
+    # raw unit-suffixed strings) for forward-compatible field access.
+    raw: Dict[str, Any] = field(default_factory=dict, repr=False)
+
+    @classmethod
+    def from_dict(cls, d: Dict[str, Any]) -> "IdfReport":
+        """Build an IdfReport from the dict returned by ``parse_idf_report``."""
+        ed = d.get("event_datetime")
+        if isinstance(ed, str):
+            try:
+                ed = datetime.datetime.fromisoformat(ed)
+            except ValueError:
+                ed = None
+
+        return cls(
+            serial_number     = d.get("serial_number"),
+            event_type        = d.get("event_type"),
+            event_datetime    = ed if isinstance(ed, datetime.datetime) else None,
+            filename          = d.get("filename"),
+            sample_rate       = d.get("sample_rate"),
+            record_time_sec   = d.get("record_time_sec"),
+            pre_trigger_sec   = d.get("pre_trigger_sec"),
+            tran_ppv          = d.get("tran_ppv"),
+            vert_ppv          = d.get("vert_ppv"),
+            long_ppv          = d.get("long_ppv"),
+            peak_vector_sum   = d.get("peak_vector_sum"),
+            mic_pspl_dbl      = d.get("mic_ppv"),       # parser names it mic_ppv (legacy)
+            tran_zc_freq      = d.get("tran_zc_freq"),
+            vert_zc_freq      = d.get("vert_zc_freq"),
+            long_zc_freq      = d.get("long_zc_freq"),
+            mic_zc_freq       = d.get("mic_zc_freq"),
+            tran_time_of_peak = d.get("tran_time_of_peak"),
+            vert_time_of_peak = d.get("vert_time_of_peak"),
+            long_time_of_peak = d.get("long_time_of_peak"),
+            mic_time_of_peak  = d.get("mic_time_of_peak"),
+            tran_peak_acceleration = d.get("tran_peak_acceleration"),
+            vert_peak_acceleration = d.get("vert_peak_acceleration"),
+            long_peak_acceleration = d.get("long_peak_acceleration"),
+            tran_peak_displacement = d.get("tran_peak_displacement"),
+            vert_peak_displacement = d.get("vert_peak_displacement"),
+            long_peak_displacement = d.get("long_peak_displacement"),
+            project           = d.get("project"),
+            client            = d.get("client"),
+            operator          = d.get("operator"),
+            notes             = d.get("notes"),
+            setup             = d.get("setup"),
+            tran_test_passed  = d.get("tran_test_passed"),
+            vert_test_passed  = d.get("vert_test_passed"),
+            long_test_passed  = d.get("long_test_passed"),
+            mic_test_passed   = d.get("mic_test_passed"),
+            firmware_version  = d.get("version"),
+            calibration_text  = d.get("calibration_text"),
+            battery_volts     = d.get("battery_volts"),
+            raw               = d,
+        )
+
+
+# ── IdfPeaks / IdfProjectInfo / IdfSensorCheck (narrow grouping types) ───────
+
+
+@dataclass
+class IdfPeaks:
+    """Geophone + mic peak values for one Thor event.  Native Thor units.
+
+    Thor stores the mic peak in two parallel forms — ``mic_pspl_dbl`` is
+    what the sidecar's top-level ``MicPSPL`` header field carries (dB(L)),
+    used in the report header.  ``mic_pspl_psi`` is the psi value derived
+    either from the IDFW sample table / IDFH interval column 9, or from
+    the binary mic counts (~2.14e-6 psi/count).  Needed because the
+    BW-shaped ``PeakValues.micl`` consumed by ``event_hdf5.write_event_hdf5``
+    expects psi — feeding it dB(L) makes the h5 mic-chart scale factor
+    blow up.
+    """
+    transverse_ips:    Optional[float] = None    # in/s
+    vertical_ips:      Optional[float] = None    # in/s
+    longitudinal_ips:  Optional[float] = None    # in/s
+    peak_vector_sum_ips: Optional[float] = None  # in/s
+    mic_pspl_dbl:      Optional[float] = None    # dB(L)
+    mic_pspl_psi:      Optional[float] = None    # psi
+
+
+@dataclass
+class IdfProjectInfo:
+    """Operator-supplied strings from Thor's TitleString1..4."""
+    project:  Optional[str] = None
+    client:   Optional[str] = None
+    operator: Optional[str] = None
+    notes:    Optional[str] = None
+    setup:    Optional[str] = None
+
+
+@dataclass
+class IdfSensorCheck:
+    """Per-channel pass/fail from Thor's self-test."""
+    tran: Optional[bool] = None
+    vert: Optional[bool] = None
+    long: Optional[bool] = None
+    mic:  Optional[bool] = None
+
+
+# ── IdfEvent ─────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfEvent:
+    """A single Thor / Micromate Series IV event.
+
+    Built from a parsed .IDFW.txt or .IDFH.txt sidecar via
+    ``IdfEvent.from_report()``.  The filename is the authoritative
+    source for kind and the fallback for serial + timestamp (the
+    report's values win when present); the .txt provides
+    device-authoritative peak values, frequencies, project strings,
+    sensor self-check, firmware, calibration.
+    """
+
+    # Identity
+    serial:    str
+    timestamp: datetime.datetime
+    kind:      str                  # "Waveform" | "Histogram"
+    filename:  str                  # device-native binary filename, e.g. "UM11719_20231219163444.IDFW"
+
+    # Sampling / timing
+    sample_rate:     Optional[int]   = None
+    record_time_sec: Optional[float] = None
+    pre_trigger_sec: Optional[float] = None
+
+    # Peaks
+    peaks: IdfPeaks = field(default_factory=IdfPeaks)
+
+    # Per-channel frequencies (Hz)
+    tran_zc_freq: Optional[float] = None
+    vert_zc_freq: Optional[float] = None
+    long_zc_freq: Optional[float] = None
+    mic_zc_freq:  Optional[float] = None
+
+    # Project strings
+    project_info: IdfProjectInfo = field(default_factory=IdfProjectInfo)
+
+    # Sensor self-check
+    sensor_check: IdfSensorCheck = field(default_factory=IdfSensorCheck)
+
+    # Device-fixed
+    firmware_version: Optional[str]   = None
+    calibration_text: Optional[str]   = None
+    battery_volts:    Optional[float] = None
+
+    # The full parsed report — preserves anything not surfaced as a typed field
+    report: IdfReport = field(default_factory=IdfReport)
+
+    @classmethod
+    def from_report(
+        cls,
+        report: Any,
+        filename: str,
+    ) -> "IdfEvent":
+        """Build an IdfEvent from a parsed report (dict or IdfReport) and
+        the device-native binary filename.
+
+        The filename is authoritative for kind: Thor's filenames are
+        literal ``<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>``.  For serial and
+        timestamp the report wins when it carries a value (it has
+        finer-grained device-reported time-of-trigger semantics); the
+        filename values are the fallback.
+        """
+        from .idf_ascii_report import parse_event_filename
+
+        # Normalise input to IdfReport
+        if isinstance(report, IdfReport):
+            rep = report
+        elif isinstance(report, dict):
+            rep = IdfReport.from_dict(report)
+        else:
+            raise TypeError(
+                f"report must be IdfReport or dict; got {type(report).__name__}"
+            )
+
+        # Filename → (serial, timestamp, kind).  When filename parsing
+        # fails, derive all three from the report (with placeholders).
+        parsed = parse_event_filename(filename)
+        if parsed is not None:
+            fn_serial, fn_ts, fn_kind = parsed
+            kind = "Histogram" if fn_kind == "IDFH" else "Waveform"
+        else:
+            fn_serial = rep.serial_number or "UNKNOWN"
+            fn_ts     = rep.event_datetime or datetime.datetime(1970, 1, 1)
+            kind      = "Waveform" if (rep.event_type or "").lower().startswith("full waveform") else "Histogram"
+
+        # Prefer the report's event_datetime and serial (device-authoritative) over the filename.
+        ts = rep.event_datetime or fn_ts
+        serial = rep.serial_number or fn_serial
+
+        return cls(
+            serial=serial,
+            timestamp=ts,
+            kind=kind,
+            filename=filename,
+            sample_rate=rep.sample_rate,
+            record_time_sec=rep.record_time_sec,
+            pre_trigger_sec=rep.pre_trigger_sec,
+            peaks=IdfPeaks(
+                transverse_ips      = rep.tran_ppv,
+                vertical_ips        = rep.vert_ppv,
+                longitudinal_ips    = rep.long_ppv,
+                peak_vector_sum_ips = rep.peak_vector_sum,
+                mic_pspl_dbl        = rep.mic_pspl_dbl,
+            ),
+            tran_zc_freq=rep.tran_zc_freq,
+            vert_zc_freq=rep.vert_zc_freq,
+            long_zc_freq=rep.long_zc_freq,
+            mic_zc_freq=rep.mic_zc_freq,
+            project_info=IdfProjectInfo(
+                project=rep.project,
+                client=rep.client,
+                operator=rep.operator,
+                notes=rep.notes,
+                setup=rep.setup,
+            ),
+            sensor_check=IdfSensorCheck(
+                tran=rep.tran_test_passed,
+                vert=rep.vert_test_passed,
+                long=rep.long_test_passed,
+                mic=rep.mic_test_passed,
+            ),
+            firmware_version=rep.firmware_version,
+            calibration_text=rep.calibration_text,
+            battery_volts=rep.battery_volts,
+            report=rep,
+        )
+
+    # ── Bridge to minimateplus shape (for the existing DB / sidecar paths) ──
+
+    def to_minimateplus_event(self, waveform_key: bytes) -> Any:
+        """Project this Thor event into the shape ``minimateplus.Event``
+        carries, so it can flow through the existing
+        ``SeismoDb.insert_events()`` and ``event_to_sidecar_dict()``
+        machinery without those code paths needing to know about Thor.
+
+        Caveats of the bridge:
+          - ``PeakValues.micl`` carries the mic peak in **psi** (matching
+            BW's convention) — set from :attr:`IdfPeaks.mic_pspl_psi`,
+            with a dB(L)→psi fallback when only the dB(L) value is
+            available.  This is what the h5 writer's mic-scale-factor
+            logic needs.  The dB(L) value still flows through
+            ``bw_report.mic.pspl_dbl`` (set by the
+            ``idf_to_bw_report`` adapter) and the renderer reads it
+            from there for the report header.
+          - Many Thor-specific fields (Peak Acceleration / Displacement,
+            sensor self-check, calibration) don't have a slot in
+            ``Event``.  The full IdfReport is preserved on the
+            ``.sfm.json`` sidecar under ``extensions.idf_report`` via
+            ``save_imported_idf`` — that's the source of truth for them.
+        """
+        from minimateplus.models import (
+            Event, PeakValues, ProjectInfo, Timestamp,
+        )
+
+        ts_obj = Timestamp(
+            raw=bytes(9),
+            flag=0,
+            year=self.timestamp.year,
+            unknown_byte=0,
+            month=self.timestamp.month,
+            day=self.timestamp.day,
+            hour=self.timestamp.hour,
+            minute=self.timestamp.minute,
+            second=self.timestamp.second,
+        )
+        # Resolve mic peak as psi.  Priority: binary-derived mic_pspl_psi
+        # (set by read_idf_file) > dB(L)→psi fallback via standard formula
+        # (psi = 2.9e-9 × 10^(dBL/20)) > None.
+        mic_psi = self.peaks.mic_pspl_psi
+        if mic_psi is None and self.peaks.mic_pspl_dbl is not None:
+            mic_psi = 2.9e-9 * (10.0 ** (self.peaks.mic_pspl_dbl / 20.0))
+        pv = PeakValues(
+            tran=self.peaks.transverse_ips,
+            vert=self.peaks.vertical_ips,
+            long=self.peaks.longitudinal_ips,
+            micl=mic_psi,   # psi, matching BW's convention (h5 scaling depends on this)
+            peak_vector_sum=self.peaks.peak_vector_sum_ips,
+        )
+        pi = ProjectInfo(
+            setup_name=self.project_info.setup,
+            project=self.project_info.project,
+            client=self.project_info.client,
+            operator=self.project_info.operator,
+            sensor_location=None,           # Thor folds location into project string
+            notes=self.project_info.notes,
+        )
+        ev = Event(
+            index=0,
+            timestamp=ts_obj,
+            sample_rate=self.sample_rate,
+            peak_values=pv,
+            project_info=pi,
+            record_type=self.kind,
+            rectime_seconds=self.record_time_sec,
+        )
+        ev._waveform_key = waveform_key
+        return ev
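A short worked sketch of the dB(L)→psi fallback used by the bridge above, assuming the same reference constant (2.9e-9 psi, roughly 20 µPa). The function names are hypothetical, and the inverse `psi_to_dbl` is not part of the diff; it is included only to show that the two forms round-trip.

```python
import math

PSI_REF = 2.9e-9  # approx. 20 µPa expressed in psi, as in the bridge fallback


def dbl_to_psi(dbl: float) -> float:
    """Fallback conversion used when no binary-derived psi peak is available."""
    return PSI_REF * 10.0 ** (dbl / 20.0)


def psi_to_dbl(psi: float) -> float:
    """Inverse of dbl_to_psi (assumption: added for illustration only)."""
    return 20.0 * math.log10(psi / PSI_REF)


# Every +20 dB multiplies the pressure by 10.
for level in (80.0, 100.0, 120.0):
    print(level, dbl_to_psi(level))
```

The point of the conversion is that a dB(L) number such as 107.7 must never be passed through as if it were psi; converted properly it is a small fraction of one psi.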
@@ -20,8 +20,16 @@ Typical usage (TCP / modem):
 """

 from .client import MiniMateClient
-from .models import DeviceInfo, Event
-from .transport import SerialTransport, TcpTransport
+from .models import DeviceInfo, Event, MonitorLogEntry
+from .transport import CapturingTransport, SerialTransport, TcpTransport

 __version__ = "0.1.0"
-__all__ = ["MiniMateClient", "DeviceInfo", "Event", "SerialTransport", "TcpTransport"]
+__all__ = [
+    "MiniMateClient",
+    "DeviceInfo",
+    "Event",
+    "MonitorLogEntry",
+    "SerialTransport",
+    "TcpTransport",
+    "CapturingTransport",
+]
@@ -0,0 +1,738 @@
+"""
+minimateplus/bw_ascii_report.py — parser for Blastware's per-event ASCII
+report (the .TXT file BW writes alongside each saved event binary).
+
+The ASCII export is the authoritative source for every "rich" per-event
+field that BW computes from the waveform but never persists in the BW
+binary itself:
+
+  - Per-channel PPV (Tran / Vert / Long / MicL)
+  - Peak Vector Sum + Peak Vector Sum Time
+  - Per-channel ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement
+  - MicL PSPL, MicL Time of Peak, MicL ZC Freq
+  - Per-channel Sensor Self-Check (Test Freq / Test Ratio / Test Results)
+  - MicL Test Amplitude (mV)
+  - Battery, calibration date, monitor-log timestamps
+
+Persisting these values into the SFM database lets the monthly-summary
+review workflow ("show me events at Location X with PVS > 0.5") work
+without depending on the (still-undecoded) waveform body codec.
+
+Format (verified against decode-re/5-8-26 4-event bundle):
+
+  - One field per line, wrapped in double quotes:   `"Field Name : Value"`
+  - Field/value separator: literal ` : ` (space-colon-space).
+  - Some field names contain an internal `:` already (e.g. `"Project:"`),
+    so we split on the FIRST ` : ` only.
+  - Some fields have unit suffixes:  `"0.500 in/s"` / `"7.5 Hz"` / `"533 mv"`.
+  - A `"Monitor Log(s)"` marker line is followed by tab-separated rows
+    of `start_time<TAB>stop_time<TAB>description`.
+  - Final `"PC SW Version : ..."` line ends the metadata block.
+  - A blank line separates metadata from the sample table.
+  - Sample table starts with `   Tran   <TAB>   Vert   <TAB>...`, then
+    one row per sample (tab-separated, right-padded numeric values).
+  - Geo channel values are in in/s; MicL in dB(L) (or 0.000 below threshold).
+
+Because some metadata fields have whitespace quirks ("MicL  Time of
+Peak" has two spaces; the leading "Project:" value has its own colon),
+we normalise whitespace in the key before lookup.
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Dict, List, Optional, Tuple, Union
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Output dataclasses
+# ─────────────────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class ChannelStats:
+    """Per-channel derived stats, populated from an event report."""
+    ppv_ips:           Optional[float] = None      # in/s            (geo channels only)
+    zc_freq_hz:        Optional[float] = None      # Hz
+    time_of_peak_s:    Optional[float] = None      # seconds (relative to trigger; can be negative)
+    peak_accel_g:      Optional[float] = None      # g               (geo channels only)
+    peak_disp_in:      Optional[float] = None      # in              (geo channels only)
+    # When BW writes "OORANGE" (Out Of Range — truncated) for a PPV
+    # value, the true peak exceeded the channel's full-scale range.
+    # We substitute the range max (e.g. 10.000 in/s for Normal range)
+    # as a lower bound, and flag here so downstream UI / alerts know
+    # to render "> 10 in/s" or "saturated" instead of trusting the
+    # value as an exact measurement.
+    ppv_saturated:     bool = False
+    # Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
+    # algorithm's peak frequency exceeded the device's reporting
+    # ceiling (typically 100 Hz on V10.72).  zc_freq_hz gets the
+    # threshold (100.0) as a lower bound; downstream UI renders ">100".
+    zc_freq_above_range: bool = False
+
+
+@dataclass
+class MicStats:
+    """MicL-specific stats."""
+    weighting:         Optional[str]   = None      # e.g. "Linear Weighting"
+    pspl_dbl:          Optional[float] = None      # dB(L)
+    zc_freq_hz:        Optional[float] = None
+    time_of_peak_s:    Optional[float] = None
+    # Set when BW writes "OORANGE" for PSPL: the mic exceeded its
+    # measurement range.  pspl_dbl gets the range ceiling 140 dBL as a
+    # lower bound (typical NL-43 max; some units cap at 148).  Consumers
+    # should render "> 140 dB(L)" or similar when this flag is set.
+    pspl_saturated:    bool = False
+    # Same semantics as ChannelStats.zc_freq_above_range — mic ZC
+    # peak exceeded device reporting ceiling.
+    zc_freq_above_range: bool = False
+
+
+@dataclass
+class SensorCheck:
+    """Per-channel sensor self-check result.
+
+    Geo channels report a frequency + ratio; MicL reports a frequency +
+    amplitude (mV).  All channels also have a Pass/Fail string.
+    """
+    test_freq_hz:      Optional[float] = None
+    test_ratio:        Optional[float] = None      # geo channels only
+    test_amplitude_mv: Optional[float] = None      # MicL only
+    test_results:      Optional[str]   = None      # "Passed" / "Failed"
+
+
+@dataclass
+class MonitorLogEntry:
+    """One row of the trailing Monitor Log(s) block."""
+    start_time:  Optional[datetime.datetime] = None
+    stop_time:   Optional[datetime.datetime] = None
+    description: Optional[str] = None
+
+
+# BW saturation marker — appears in PPV / Peak Vector Sum / similar
+# numeric fields when the underlying measurement exceeded the
+# channel's full-scale range (e.g., a geophone reading > 10 in/s at
+# Normal range, or a mic exceeding its sensitivity ceiling).  Treated
+# as "≥ range_max" + a saturated flag rather than discarded.
+# Appears as: ``"Tran PPV : OORANGE in/s"``
+_OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")
+
+
+def _is_oorange(value: str) -> bool:
+    """True when a BW numeric field is an Out-Of-Range saturation marker."""
+    s = value.strip().upper()
+    return any(m in s for m in _OORANGE_MARKERS)
+
+
+def _parse_above_range(value: str) -> Optional[float]:
+    """For BW "above-range" markers like ">100 Hz", return the threshold.
+
+    BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
+    a peak too fast to count (device cuts off at 100 Hz).  Returns the
+    numeric portion after the '>' (e.g. 100.0), or None if `value` is
+    not an above-range marker.
+    """
+    s = value.strip()
+    if not s.startswith(">"):
+        return None
+    return _parse_number(s[1:])
+
+
+@dataclass
+class BwAsciiReport:
+    """Structured representation of one BW per-event ASCII export."""
+    # ── Identity ─────────────────────────────────────────────────────────────
+    event_type:        Optional[str] = None         # e.g. "Full Waveform"
+    serial:            Optional[str] = None         # e.g. "BE11529"
+    version:           Optional[str] = None         # firmware version line
+    file_name:         Optional[str] = None         # e.g. "M529LK44.AB0"
+    event_datetime:    Optional[datetime.datetime] = None  # parsed from Event Time + Event Date
+
+    # ── Trigger / recording config ──────────────────────────────────────────
+    trigger_channel:        Optional[str]   = None  # e.g. "Vert" or "From Unit"
+    geo_trigger_level_ips:  Optional[float] = None
+    pretrig_s:              Optional[float] = None  # negative seconds
+    record_time_s:          Optional[float] = None
+    record_stop_mode:       Optional[str]   = None
+    sample_rate_sps:        Optional[int]   = None
+    battery_volts:          Optional[float] = None
+    calibration_date:       Optional[datetime.date] = None
+    calibration_by:         Optional[str]   = None  # e.g. "Instantel"
+    units:                  Optional[str]   = None  # e.g. "in/s and dB(L)"
+
+    # ── Operator-supplied metadata ──────────────────────────────────────────
+    # Parsed by POSITION from the 4-line "User Notes" block BW writes
+    # between the `Units :` and `Geo Range :` lines.  Position-based so
+    # the values populate correctly even when an operator renames the
+    # labels in Blastware's Compliance Setup → Notes tab (the 4 labels
+    # are user-editable, e.g. "Seis Loc:" → "Building:" → "Site Address:").
+    # The original labels BW wrote are preserved in `user_note_labels`
+    # so terra-view can render them as the operator named them.
+    project:           Optional[str] = None     # position 1 (BW default label "Project:")
+    client:            Optional[str] = None     # position 2 (BW default label "Client:")
+    operator:          Optional[str] = None     # position 3 (BW default label "User Name:")
+    sensor_location:   Optional[str] = None     # position 4 (BW default label "Seis Loc:")
+
+    # Maps canonical slot name → the literal label BW wrote in the ASCII
+    # export.  Empty if the User Notes block wasn't present.  Example
+    # when the operator renamed slot 4 to "Building:":
+    #     {"project": "Project:", "client": "Client:",
+    #      "operator": "User Name:", "sensor_location": "Building:"}
+    user_note_labels:  Dict[str, str] = field(default_factory=dict)
+
+    # ── Geo channel scaling ─────────────────────────────────────────────────
+    geo_range_ips:     Optional[float] = None       # 10.000 / 1.250
+
+    # ── Per-channel derived stats (geo + mic) ───────────────────────────────
+    channels:          Dict[str, ChannelStats] = field(default_factory=dict)
+    mic:               MicStats = field(default_factory=MicStats)
+
+    # ── Vector sum ──────────────────────────────────────────────────────────
+    peak_vector_sum_ips:    Optional[float] = None
+    peak_vector_sum_time_s: Optional[float] = None
+    # Saturation flag — set when BW writes "OORANGE" for the PVS.  We
+    # then substitute sqrt(3) * geo_range_ips as a conservative upper
+    # bound (the theoretical maximum PVS when all 3 geo channels are
+    # simultaneously at full-scale).  Consumers should display this as
+    # ">{value} in/s" or similar.
+    peak_vector_sum_saturated: bool = False
+    # Histograms additionally have an absolute date+time for the PVS
+    # (it occurred at a specific interval).  Waveform reports show
+    # only the relative-time value above.
+    peak_vector_sum_when:   Optional[datetime.datetime] = None
+
+    # ── Histogram-specific fields (populated only when Event Type starts
+    # with 'Histogram' / 'Full Histogram' / 'Histogram + Continuous') ──
+    histogram_start:        Optional[datetime.datetime] = None
+    histogram_stop:         Optional[datetime.datetime] = None
+    histogram_n_intervals:  Optional[int]   = None      # e.g. 4, 1436
+    histogram_interval_size_str: Optional[str]   = None  # "1 minute" / "5 minutes" / "15 seconds"
+    histogram_interval_size_s:   Optional[float] = None  # parsed to seconds
+    # Per-channel absolute peak time+date (histogram-specific).  For
+    # waveform events these are None — those reports use the channel's
+    # time_of_peak_s (relative to trigger) instead.  Keyed by channel
+    # name ("Tran", "Vert", "Long", "MicL").
+    channel_peak_when:      Dict[str, datetime.datetime] = field(default_factory=dict)
+
+    # ── Sensor self-check (per channel) ─────────────────────────────────────
+    sensor_check:      Dict[str, SensorCheck] = field(default_factory=dict)
+
+    # ── Monitor log + tooling version ───────────────────────────────────────
+    monitor_log:       List[MonitorLogEntry] = field(default_factory=list)
+    pc_sw_version:     Optional[str] = None
+
+    # ── Sample table (optional; only parsed if requested) ───────────────────
+    # Each entry: (Tran, Vert, Long, MicL) in the report's units (geo
+    # channels in in/s, MicL in dB(L)).  None when parse_samples=False.
+    samples:           Optional[List[Tuple[float, float, float, float]]] = None
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Helpers
+# ─────────────────────────────────────────────────────────────────────────────
+
+
+_KEY_NORMALISE_RE = re.compile(r"\s+")
+_NUMERIC_RE       = re.compile(r"^-?\d+(?:\.\d+)?")
+
+
+def _normalise_key(k: str) -> str:
+    """Collapse whitespace runs (incl. tabs) and strip — handles BW's
+    "MicL  Time of Peak" double-space and leading-colon quirks."""
+    return _KEY_NORMALISE_RE.sub(" ", k).strip()
+
+
+def _strip_quotes(line: str) -> str:
+    line = line.rstrip("\r\n")
+    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
+        return line[1:-1]
+    return line
+
+
+def _parse_number(value: str) -> Optional[float]:
+    """Pull the leading numeric portion out of a value like "0.500 in/s"."""
+    m = _NUMERIC_RE.match(value.strip())
+    if not m:
+        return None
+    try:
+        return float(m.group(0))
+    except ValueError:
+        return None
+
+
+def _parse_int(value: str) -> Optional[int]:
+    n = _parse_number(value)
+    return None if n is None else int(round(n))
+
+
+# Months exactly as BW writes them.
+_MONTHS = {
+    "January": 1, "February": 2, "March": 3, "April": 4,
+    "May": 5, "June": 6, "July": 7, "August": 8,
+    "September": 9, "October": 10, "November": 11, "December": 12,
+    # Short forms used in monitor-log rows ("Apr 23 /26").
+    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "Jun": 6, "Jul": 7,
+    "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
+}
+
+
+def _parse_event_date(s: str) -> Optional[datetime.date]:
+    """Parse "April 23, 2026" or "May 8, 2026" → date."""
+    s = s.strip()
+    parts = s.replace(",", " ").split()
+    if len(parts) < 3:
+        return None
+    month_name, day_str, year_str = parts[0], parts[1], parts[2]
+    month = _MONTHS.get(month_name)
+    if month is None:
+        return None
+    try:
+        return datetime.date(int(year_str), month, int(day_str))
+    except ValueError:
+        return None
+
+
+def _parse_iso_date(s: str) -> Optional[datetime.date]:
+    """Parse "2026-05-16" → date.  Histograms use ISO format for their
+    Start Date / Stop Date / Peak Date fields; waveforms use the
+    "May 8, 2026" long form which `_parse_event_date` handles."""
+    s = s.strip()
+    try:
+        return datetime.date.fromisoformat(s)
+    except ValueError:
+        return None
+
+
+_INTERVAL_UNIT_SECONDS = {
+    "second": 1, "seconds": 1, "sec": 1, "secs": 1,
+    "minute": 60, "minutes": 60, "min": 60, "mins": 60,
+    "hour": 3600, "hours": 3600, "hr": 3600, "hrs": 3600,
+}
+
+
+def _parse_interval_size(s: str) -> Optional[float]:
+    """Parse "1 minute" / "5 minutes" / "15 seconds" / "2 seconds" → seconds.
+
+    Handles the BW Compliance Setup → Histogram Interval values verbatim
+    ("2 seconds", "5 seconds", "15 seconds", "1 minute", "5 minutes",
+    "15 minutes") plus a few defensive variants.
+    """
+    if not s:
+        return None
+    parts = s.strip().split()
+    if len(parts) < 2:
+        return None
+    try:
+        n = float(parts[0])
+    except ValueError:
+        return None
+    seconds_per_unit = _INTERVAL_UNIT_SECONDS.get(parts[1].lower())
+    if seconds_per_unit is None:
+        return None
+    return n * seconds_per_unit
+
+
+def _parse_event_time(s: str) -> Optional[datetime.time]:
+    """Parse "15:56:35" → time."""
+    s = s.strip()
+    try:
+        h, m, sec = s.split(":")
+        return datetime.time(int(h), int(m), int(sec))
+    except (ValueError, IndexError):
+        return None
+
+
+def _parse_calibration(value: str) -> Tuple[Optional[datetime.date], Optional[str]]:
+    """Parse "April 29, 2025 by Instantel" → (date, "Instantel")."""
+    parts = value.split(" by ", 1)
+    date = _parse_event_date(parts[0])
+    by = parts[1].strip() if len(parts) > 1 else None
+    return date, by
+
+
+def _parse_monitor_row(line: str) -> Optional[MonitorLogEntry]:
+    """Parse a tab-separated monitor log row.
+
+    Format: `<start>\t<stop>\t<desc>` where each timestamp is BW's
+    short form "Mon DD /YY HH:MM:SS" (e.g. "Apr 23 /26 15:46:16").
+    Year is encoded as a 2-digit suffix; we expand "/26" → 2026.
+    """
+    parts = line.split("\t")
+    if len(parts) < 2:
+        return None
+    start = _parse_monitor_ts(parts[0])
+    stop  = _parse_monitor_ts(parts[1])
+    desc  = parts[2].strip() if len(parts) > 2 else None
+    if start is None and stop is None and not desc:
+        return None
+    return MonitorLogEntry(start_time=start, stop_time=stop, description=desc)
+
+
+def _parse_monitor_ts(s: str) -> Optional[datetime.datetime]:
+    """Parse "Apr 23 /26 15:46:16" → datetime."""
+    s = s.strip()
+    parts = s.split()
+    if len(parts) < 4:
+        return None
+    month = _MONTHS.get(parts[0])
+    if month is None:
+        return None
+    try:
+        day = int(parts[1])
+        # parts[2] looks like "/26" → century-flip to 2026
+        yy = int(parts[2].lstrip("/"))
+        year = 2000 + yy if yy < 80 else 1900 + yy
+        h, m, sec = (int(x) for x in parts[3].split(":"))
+        return datetime.datetime(year, month, day, h, m, sec)
+    except (ValueError, IndexError):
+        return None
+
+
+# ── User-notes positional slot map ──────────────────────────────────────────
+#
+# Blastware's Compliance Setup → Notes tab shows four operator-supplied
+# fields whose LABELS the operator can rename (see screenshot in
+# project archive).  Defaults are "Project:" / "Client:" /
+# "User Name:" / "Seis Loc:", but an operator using a different
+# convention can rename them to anything ("Building:", "Site:",
+# "Address:", etc.).  The ASCII export reflects whatever the operator
+# typed, so label-based matching is fragile.
+#
+# What IS reliable: BW always writes the 4 user-notes lines in the
+# same order, contiguously between the `Units :` line and the
+# `Geo Range :` line.  We parse them by POSITION and preserve the
+# operator's labels in `report.user_note_labels` so terra-view can
+# render them as the operator intended.
+
+_USER_NOTE_SLOTS = ("project", "client", "operator", "sensor_location")
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Top-level parser
+# ─────────────────────────────────────────────────────────────────────────────
+
+
+def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwAsciiReport:
+    """Parse a BW per-event ASCII export into a structured BwAsciiReport.
+
+    Set ``parse_samples=True`` to also populate ``report.samples`` with
+    the trailing sample table.  Default False because the table is
+    huge and most callers only want metadata for indexing.
+    """
+    if isinstance(text, bytes):
+        text = text.decode("ascii", errors="replace")
+
+    report = BwAsciiReport()
+    # Pre-create channel stat slots so callers can rely on them existing.
+    for ch in ("Tran", "Vert", "Long", "MicL"):
+        report.channels.setdefault(ch, ChannelStats())
+        report.sensor_check.setdefault(ch, SensorCheck())
+
+    lines = text.splitlines()
+    i = 0
+    n = len(lines)
+
+    in_monitor_log_section = False
+    event_time_str: Optional[str] = None
+    event_date: Optional[datetime.date] = None
+
+    # User-notes block detection.  We enter the block after parsing
+    # the "Units :" line and exit on the "Geo Range :" line.  Inside,
+    # the first 4 unmatched `<label> : <value>` lines are assigned to
+    # the 4 canonical operator-supplied slots by POSITION (project,
+    # client, operator, sensor_location) regardless of what the
+    # operator named the labels in BW's Compliance Setup → Notes tab.
+    in_user_notes_block = False
+    user_note_position = 0
+
+    # Histogram-field staging — BW writes <Channel> Peak Time and
+    # <Channel> Peak Date on separate lines (and similarly Histogram
+    # Start Time / Date).  We stash the partial value when the time
+    # line arrives and combine it when the matching date line arrives.
+    _hist_start_time: Optional[datetime.time] = None
+    _hist_stop_time:  Optional[datetime.time] = None
+    _pending_peak_time: Dict[str, Optional[datetime.time]] = {}
+    _pvs_time_raw: Optional[str] = None  # last Peak Vector Sum Time value, raw
+
+    while i < n:
+        raw_line = lines[i]
+        i += 1
+        # Blank line marks the start of the sample table.
+        if raw_line.strip() == "":
+            break
+
+        line = _strip_quotes(raw_line)
+
+        # Monitor log section: "Monitor Log(s)" header followed by N rows
+        # (still inside double-quoted lines), terminated by a non-row line
+        # like "PC SW Version : ..." or a blank line.
+        if not in_monitor_log_section and line.strip() == "Monitor Log(s)":
+            in_monitor_log_section = True
+            continue
+        if in_monitor_log_section:
+            # Heuristic: monitor rows contain a tab; the next "Field : Value"
+            # line ends the section.
+            if "\t" in line:
+                entry = _parse_monitor_row(line)
+                if entry:
+                    report.monitor_log.append(entry)
+                continue
+            # Falls through to the field parser below; clear the flag.
+            in_monitor_log_section = False
+
+        # "Field : Value" — split on FIRST occurrence of " : "
+        idx = line.find(" : ")
+        if idx < 0:
+            continue
+        key = _normalise_key(line[:idx])
+        value = line[idx + 3 :].strip()
+
+        # ── Identity / config ────────────────────────────────────────────────
+        if   key == "Event Type":           report.event_type = value
+        elif key == "Serial Number":        report.serial = value
+        elif key == "Version":              report.version = value
+        elif key == "File Name":            report.file_name = value
+        elif key == "Event Time":           event_time_str = value
+        elif key == "Event Date":           event_date = _parse_event_date(value)
+
+        elif key == "Trigger":              report.trigger_channel = value
+        elif key == "Geo Trigger Level":    report.geo_trigger_level_ips = _parse_number(value)
+        elif key == "Pre-trigger Length":   report.pretrig_s = _parse_number(value)
+        elif key == "Record Time":          report.record_time_s = _parse_number(value)
+        elif key == "Record Stop Mode":     report.record_stop_mode = value
+        elif key == "Sample Rate":          report.sample_rate_sps = _parse_int(value)
+        elif key == "Battery Level":        report.battery_volts = _parse_number(value)
+        elif key == "Calibration":
+            report.calibration_date, report.calibration_by = _parse_calibration(value)
+        elif key == "Units":
+            report.units = value
+            # Entering the user-notes block.  Next ~4 lines until
+            # "Geo Range :" are the operator-supplied notes.
+            in_user_notes_block = True
+            user_note_position = 0
+
+        elif key == "Geo Range":
+            # Exiting the user-notes block.
+            in_user_notes_block = False
+            report.geo_range_ips = _parse_number(value)
+
+        # User-notes block: assign by position (operator may have
+        # renamed the labels, so we don't trust them).  Preserve the
+        # original labels in `user_note_labels` for downstream UIs
+        # (terra-view) that want to display them as the operator
+        # named them.
+        elif in_user_notes_block and user_note_position < len(_USER_NOTE_SLOTS):
+            slot = _USER_NOTE_SLOTS[user_note_position]
+            setattr(report, slot, value)
+            report.user_note_labels[slot] = key
+            user_note_position += 1
+
+        # ── Per-channel stats ────────────────────────────────────────────────
+        # All match the pattern "{Channel} <stat-name>"
+        elif key in (
+            "Tran PPV", "Vert PPV", "Long PPV",
+            "Tran ZC Freq", "Vert ZC Freq", "Long ZC Freq",
+            "Tran Time of Peak", "Vert Time of Peak", "Long Time of Peak",
+            "Tran Peak Acceleration", "Vert Peak Acceleration", "Long Peak Acceleration",
+            "Tran Peak Displacement", "Vert Peak Displacement", "Long Peak Displacement",
+        ):
+            ch_name, stat = key.split(" ", 1)
+            cs = report.channels.setdefault(ch_name, ChannelStats())
+            if stat == "PPV":
+                if _is_oorange(value):
+                    # Channel saturated — substitute range max as lower
+                    # bound; flag so downstream UI can render "> 10 in/s".
+                    cs.ppv_ips       = report.geo_range_ips
+                    cs.ppv_saturated = True
+                else:
+                    cs.ppv_ips = _parse_number(value)
+            elif stat == "ZC Freq":
+                # ">100 Hz" → store threshold + flag; numeric → parse normally
+                threshold = _parse_above_range(value)
+                if threshold is not None:
+                    cs.zc_freq_hz = threshold
+                    cs.zc_freq_above_range = True
+                else:
+                    cs.zc_freq_hz = _parse_number(value)
+            else:
+                num = _parse_number(value)
+                if   stat == "Time of Peak":        cs.time_of_peak_s = num
+                elif stat == "Peak Acceleration":   cs.peak_accel_g   = num
+                elif stat == "Peak Displacement":   cs.peak_disp_in   = num
+
+        # ── Histogram-specific fields ────────────────────────────────────────
+        # Histograms have Start/Stop time+date pairs + an interval count
+        # and size, plus per-channel absolute Peak Time/Date instead of
+        # the waveform's relative Time of Peak.
+        elif key == "Histogram Start Time":
+            _hist_start_time = _parse_event_time(value)
+        elif key == "Histogram Start Date":
+            _d = _parse_iso_date(value)
+            if _d and _hist_start_time:
+                report.histogram_start = datetime.datetime.combine(_d, _hist_start_time)
+        elif key == "Histogram Stop Time":
+            _hist_stop_time = _parse_event_time(value)
+        elif key == "Histogram Stop Date":
+            _d = _parse_iso_date(value)
+            if _d and _hist_stop_time:
+                report.histogram_stop = datetime.datetime.combine(_d, _hist_stop_time)
+        elif key == "Number of Intervals":
+            try:
+                report.histogram_n_intervals = int(float(value.strip()))
+            except ValueError:
+                pass
+        elif key == "Interval Size":
+            report.histogram_interval_size_str = value.strip()
+            report.histogram_interval_size_s   = _parse_interval_size(value)
+
+        # ── Per-channel histogram Peak Date / Peak Time ──
+        # Lines like "Tran Peak Time : 22:31:38" + "Tran Peak Date : 2026-05-16"
+        elif key in ("Tran Peak Time", "Vert Peak Time", "Long Peak Time", "MicL Time"):
+            ch_name = "MicL" if key == "MicL Time" else key.split(" ", 1)[0]
+            _pending_peak_time[ch_name] = _parse_event_time(value)
+        elif key in ("Tran Peak Date", "Vert Peak Date", "Long Peak Date", "MicL Date"):
+            ch_name = "MicL" if key == "MicL Date" else key.split(" ", 1)[0]
+            _d = _parse_iso_date(value)
+            _t = _pending_peak_time.get(ch_name)
+            if _d and _t:
+                report.channel_peak_when[ch_name] = datetime.datetime.combine(_d, _t)
+
+        # ── Vector Sum ───────────────────────────────────────────────────────
+        elif key == "Peak Vector Sum":
+            if _is_oorange(value):
+                # PVS saturated — conservative upper bound is
+                # sqrt(3) * geo_range_ips (all 3 channels at full-scale).
+                # Real PVS could be lower (channels rarely peak
+                # simultaneously) but never higher within the range.
+                if report.geo_range_ips is not None:
+                    import math as _math
+                    report.peak_vector_sum_ips = _math.sqrt(3) * report.geo_range_ips
+                report.peak_vector_sum_saturated = True
+            else:
+                report.peak_vector_sum_ips = _parse_number(value)
+        # BW writes the PVS-time label with a typo: "Peak Vector Sum TimeSum"
+        # (looks like Sum got appended twice).  Accept both forms.  Confirmed
+        # against actual BW output on 2026-05-27 — every PVS-time line in
+        # the field examples (T190, T438, K557) uses the typo'd label.
+        elif key in ("Peak Vector Sum Time", "Peak Vector Sum TimeSum"):
+            report.peak_vector_sum_time_s = _parse_number(value)
+            _pvs_time_raw = value
+        elif key == "Peak Vector Sum Date":
+            # Histogram-mode PVS is paired with a date.  Waveform reports
+            # give "Peak Vector Sum Time" as relative seconds, but
+            # histogram reports give a clock time, e.g.
+            # "Peak Vector Sum Time : 22:33:52", which _parse_number
+            # truncates to 22.0 (the minutes and seconds are lost).  So
+            # when the Peak Vector Sum Date line arrives, re-parse the
+            # raw PVS time value as a clock time and combine the two
+            # into an absolute datetime.
+            _d = _parse_iso_date(value)
+            if _d and _pvs_time_raw is not None:
+                _t = _parse_event_time(_pvs_time_raw)
+                if _t:
+                    report.peak_vector_sum_when = datetime.datetime.combine(_d, _t)
+                    # The earlier seconds parse was bogus for histograms;
+                    # clear it so consumers don't think it's a real offset.
+                    report.peak_vector_sum_time_s = None
+
+        # ── Microphone block ────────────────────────────────────────────────
+        elif key == "Microphone":
+            report.mic.weighting = value
+        elif key == "MicL PSPL":
+            if _is_oorange(value):
+                # Mic saturated — substitute conservative upper bound 140 dBL.
+                report.mic.pspl_dbl       = 140.0
+                report.mic.pspl_saturated = True
+            else:
+                report.mic.pspl_dbl = _parse_number(value)
+            # Not mirrored onto `channels["MicL"].ppv_ips`: the value is
+            # dB(L), not in/s, so it lives only in MicStats (unlike Time
+            # of Peak and ZC Freq below, which are copied to the channel).
+        elif key == "MicL Time of Peak":
+            report.mic.time_of_peak_s = _parse_number(value)
+            cs = report.channels.setdefault("MicL", ChannelStats())
+            cs.time_of_peak_s = report.mic.time_of_peak_s
+        elif key == "MicL ZC Freq":
+            threshold = _parse_above_range(value)
+            if threshold is not None:
+                report.mic.zc_freq_hz         = threshold
+                report.mic.zc_freq_above_range = True
+            else:
+                report.mic.zc_freq_hz = _parse_number(value)
+            cs = report.channels.setdefault("MicL", ChannelStats())
+            cs.zc_freq_hz          = report.mic.zc_freq_hz
+            cs.zc_freq_above_range = report.mic.zc_freq_above_range
+
+        # ── Sensor self-check ────────────────────────────────────────────────
+        elif key in (
+            "Tran Test Freq", "Vert Test Freq", "Long Test Freq", "MicL Test Freq",
+            "Tran Test Ratio", "Vert Test Ratio", "Long Test Ratio",
+            "MicL Test Amplitude",
+            "Tran Test Results", "Vert Test Results", "Long Test Results", "MicL Test Results",
+        ):
+            ch_name, stat = key.split(" ", 1)
+            sc = report.sensor_check.setdefault(ch_name, SensorCheck())
+            if   stat == "Test Freq":      sc.test_freq_hz      = _parse_number(value)
+            elif stat == "Test Ratio":     sc.test_ratio        = _parse_number(value)
+            elif stat == "Test Amplitude": sc.test_amplitude_mv = _parse_number(value)
+            elif stat == "Test Results":   sc.test_results      = value
+
+        # ── Trailer ─────────────────────────────────────────────────────────
+        elif key == "PC SW Version":
+            report.pc_sw_version = value
+
+        # Unknown keys are silently dropped — forward-compat for future
+        # BW versions that may add fields.
+
+    # Combine event date + time into a datetime
+    if event_date is not None and event_time_str is not None:
+        t = _parse_event_time(event_time_str)
+        if t is not None:
+            report.event_datetime = datetime.datetime.combine(event_date, t)
+
+    if parse_samples:
+        report.samples = _parse_sample_table(lines, i)
+
+    return report
+
+
+def _parse_sample_table(
+    lines: List[str], start: int,
+) -> List[Tuple[float, float, float, float]]:
+    """Parse the trailing sample table.
+
+    The table starts with a header row ("   Tran   <TAB>...") and continues
+    until EOF.  Each data row is a tab-separated quartet of numeric values.
+    """
+    samples: List[Tuple[float, float, float, float]] = []
+    seen_header = False
+    for line in lines[start:]:
+        line = line.rstrip("\r\n")
+        if not line.strip():
+            continue
+        cols = [c.strip() for c in line.split("\t") if c.strip()]
+        if not seen_header:
+            # Header row contains channel names; numeric rows don't.
+            if any(c in ("Tran", "Vert", "Long", "MicL") for c in cols):
+                seen_header = True
+            continue
+        if len(cols) < 4:
+            continue
+        try:
+            samples.append((
+                float(cols[0]), float(cols[1]),
+                float(cols[2]), float(cols[3]),
+            ))
+        except ValueError:
+            continue
+    return samples
+
+
+def parse_report_file(
+    path: Union[str, Path], *, parse_samples: bool = False,
+) -> BwAsciiReport:
+    """Convenience: read a .TXT file from disk and parse it."""
+    return parse_report(Path(path).read_bytes(), parse_samples=parse_samples)
@@ -0,0 +1,927 @@
+"""
+minimateplus/event_file_io.py — modern event-file (.sfm.json sidecar) IO.
+
+This module is the single home for event-file conversion code that doesn't
+fit cleanly inside `blastware_file.py` (which is the BW binary codec):
+
+  - sidecar JSON read/write (the modern per-event metadata file)
+  - read_blastware_file() — reverse of write_blastware_file, used by
+    the BW-importer flow when SFM is ingesting files produced by
+    Blastware's own ACH (where the source A5 frames aren't available).
+
+Sidecar schema v1 layout — see docs in the project plan or the schema
+declared in `event_to_sidecar_dict()`.
+"""
+
+from __future__ import annotations
+
+import datetime
+import hashlib
+import json
+import logging
+import os
+import struct
+from pathlib import Path
+from typing import Optional, Union
+
+from .models import Event, PeakValues, ProjectInfo, Timestamp
+from . import blastware_file as _bw  # avoid circular reference at module load
+from .bw_ascii_report import BwAsciiReport
+from .waveform_codec import decode_waveform_v2, decoded_to_adc_counts
+from .histogram_codec import decode_histogram_body
+
+# Reference pressure for dB(L) → psi conversion (20 µPa expressed in psi).
+# Same constant as sfm/sfm_webapp.html so server-side and browser-side
+# conversions agree.
+_DBL_REF_PSI = 2.9e-9
+
+log = logging.getLogger(__name__)
+
+# Schema version for the sidecar JSON.  Bump when fields change shape.
+# Older readers must reject anything > SCHEMA_VERSION; newer fields added
+# inside `extensions` are forward-compatible without a bump.
+SCHEMA_VERSION = 1
+SIDECAR_KIND   = "sfm.event"
+
+# Default tool_version stamp; callers can override.  Hard-coded here
+# rather than read via importlib.metadata because the latter reflects the
+# *installed* dist-info, which doesn't update when pyproject.toml is
+# bumped without a `pip install` re-run — leading to confusing stale
+# version stamps in sidecars.  Bump this constant and CHANGELOG.md
+# together at release time.
+TOOL_VERSION = "0.21.1"
+
+try:
+    # Best-effort: prefer the installed metadata when it's NEWER than the
+    # baked-in constant (e.g. a downstream packager bumped the wheel
+    # without editing this file).  Otherwise fall back to TOOL_VERSION.
+    from importlib.metadata import version as _pkg_version
+    _meta_v = _pkg_version("seismo-relay")
+    def _vtuple(s):
+        try:
+            return tuple(int(p) for p in s.split(".")[:3])
+        except Exception:
+            return (0, 0, 0)
+    _TOOL_VERSION_DEFAULT = (
+        _meta_v if _vtuple(_meta_v) > _vtuple(TOOL_VERSION) else TOOL_VERSION
+    )
+except Exception:
+    _TOOL_VERSION_DEFAULT = TOOL_VERSION
+
+
+# ── Sidecar dict construction ─────────────────────────────────────────────────
+
+
+def _ts_iso(ts: Optional[Timestamp]) -> Optional[str]:
+    if ts is None:
+        return None
+    try:
+        return datetime.datetime(
+            ts.year, ts.month, ts.day,
+            ts.hour or 0, ts.minute or 0, ts.second or 0,
+        ).isoformat()
+    except Exception:
+        return str(ts)
+
+
+def _peak_values_to_dict(pv: Optional[PeakValues]) -> dict:
+    if pv is None:
+        return {
+            "transverse":   None,
+            "vertical":     None,
+            "longitudinal": None,
+            "vector_sum":   None,
+            "mic_psi":      None,
+        }
+    return {
+        "transverse":   pv.tran,
+        "vertical":     pv.vert,
+        "longitudinal": pv.long,
+        "vector_sum":   pv.peak_vector_sum,
+        "mic_psi":      pv.micl,
+    }
+
+
+def _bw_report_to_dict(report: BwAsciiReport) -> dict:
+    """Project a parsed BW ASCII report into the sidecar's `bw_report` block.
+
+    All fields are rendered as plain JSON-compatible types (no datetime
+    objects).  Channels are uniformly lowercased for stable JSON keys.
+    """
+    def _ch(ch_name: str) -> dict:
+        cs = report.channels.get(ch_name)
+        if cs is None:
+            return {}
+        out = {
+            "ppv_ips":         cs.ppv_ips,
+            "zc_freq_hz":      cs.zc_freq_hz,
+            "time_of_peak_s":  cs.time_of_peak_s,
+            "peak_accel_g":    cs.peak_accel_g,
+            "peak_disp_in":    cs.peak_disp_in,
+        }
+        # Drop None-valued entries to keep the JSON tidy for partial reports.
+        out = {k: v for k, v in out.items() if v is not None}
+        # Saturation flag (only present when True) — signals that ppv_ips
+        # is the channel range max (a lower bound), not an exact reading.
+        if getattr(cs, "ppv_saturated", False):
+            out["ppv_saturated"] = True
+        # ZC Freq above device reporting ceiling (BW ">100 Hz") — value
+        # in zc_freq_hz is the threshold, not an exact measurement.
+        if getattr(cs, "zc_freq_above_range", False):
+            out["zc_freq_above_range"] = True
+        return out
+
+    def _sc(ch_name: str) -> dict:
+        sc = report.sensor_check.get(ch_name)
+        if sc is None:
+            return {}
+        out = {
+            "freq_hz":      sc.test_freq_hz,
+            "ratio":        sc.test_ratio,
+            "amplitude_mv": sc.test_amplitude_mv,
+            "result":       sc.test_results,
+        }
+        return {k: v for k, v in out.items() if v is not None}
+
+    monitor_log = []
+    for entry in report.monitor_log:
+        e = {
+            "start":       entry.start_time.isoformat() if entry.start_time else None,
+            "stop":        entry.stop_time.isoformat()  if entry.stop_time  else None,
+            "description": entry.description,
+        }
+        monitor_log.append({k: v for k, v in e.items() if v is not None})
+
+    return {
+        "available":   True,
+        "event_type":  report.event_type,
+        "version":     report.version,
+        "trigger": {
+            "channel":       report.trigger_channel,
+            "geo_level_ips": report.geo_trigger_level_ips,
+        },
+        "recording": {
+            "sample_rate_sps":  report.sample_rate_sps,
+            "record_time_s":    report.record_time_s,
+            "pretrig_s":        report.pretrig_s,
+            "stop_mode":        report.record_stop_mode,
+            "geo_range_ips":    report.geo_range_ips,
+            "units":            report.units,
+        },
+        "device": {
+            "battery_volts":    report.battery_volts,
+            "calibration_date": report.calibration_date.isoformat() if report.calibration_date else None,
+            "calibration_by":   report.calibration_by,
+        },
+        "peaks": {
+            "tran":         _ch("Tran"),
+            "vert":         _ch("Vert"),
+            "long":         _ch("Long"),
+            "vector_sum": {
+                "ips":       report.peak_vector_sum_ips,
+                "time_s":    report.peak_vector_sum_time_s,
+                # Histogram events have an absolute date+time for the PVS
+                # (the interval at which it occurred); waveform events
+                # only have the time_s offset.
+                "when":      report.peak_vector_sum_when.isoformat() if report.peak_vector_sum_when else None,
+                # Set when BW reported the PVS as OORANGE — value is the
+                # conservative upper bound sqrt(3) * geo_range_ips, not
+                # an exact peak.
+                "saturated": bool(getattr(report, "peak_vector_sum_saturated", False)),
+            },
+        },
+        "mic": {
+            "weighting":             report.mic.weighting,
+            "pspl_dbl":              report.mic.pspl_dbl,
+            "pspl_saturated":        bool(getattr(report.mic, "pspl_saturated", False)),
+            "zc_freq_hz":            report.mic.zc_freq_hz,
+            "zc_freq_above_range":   bool(getattr(report.mic, "zc_freq_above_range", False)),
+            "time_of_peak_s":        report.mic.time_of_peak_s,
+        },
+        "sensor_check": {
+            "tran": _sc("Tran"),
+            "vert": _sc("Vert"),
+            "long": _sc("Long"),
+            "mic":  _sc("MicL"),
+        },
+        # Histogram-specific fields (None on waveform-mode events).
+        # Per-channel absolute peak time/date for histograms — for
+        # waveforms see channels[ch]["time_of_peak_s"] instead.
+        "histogram": {
+            "start":               report.histogram_start.isoformat() if report.histogram_start else None,
+            "stop":                report.histogram_stop.isoformat()  if report.histogram_stop  else None,
+            "n_intervals":         report.histogram_n_intervals,
+            "interval_size":       report.histogram_interval_size_str,
+            "interval_size_s":     report.histogram_interval_size_s,
+            "channel_peak_when":   {ch: dt.isoformat() for ch, dt in report.channel_peak_when.items()},
+        },
+        "monitor_log":   monitor_log,
+        "pc_sw_version": report.pc_sw_version,
+    }
+
+
+def _dbl_to_psi(pspl_dbl: float) -> float:
+    """Convert dB(L) sound pressure level back to psi.  Uses the same
+    20 µPa reference (= 2.9e-9 psi) as the webapp so server-side and
+    browser-side conversions agree."""
+    return _DBL_REF_PSI * (10.0 ** (pspl_dbl / 20.0))
+
+
+def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
+    """Overlay device-authoritative fields from a parsed BW ASCII report
+    onto an in-memory Event, IN-PLACE.
+
+    Why this exists
+    ───────────────
+    `read_blastware_file()` parses the BW binary and fills `Event.peak_values`
+    via `_peaks_from_samples()` — which runs the (still-undecoded) BW body
+    codec assuming raw int16 LE and produces ±32K-shaped noise on every
+    channel.  Result: peak values land in the SeismoDb event row as
+    ~10 in/s on every event regardless of the actual signal.
+
+    When a paired BW ASCII report is available, the report carries the
+    device's own authoritative peak / project / sample-rate / record-time
+    values.  This helper folds those onto the Event before it flows to
+    `SeismoDb.insert_events()`, so the DB columns reflect the report
+    rather than the broken-codec output.
+
+    Fields overlaid (only when the report supplies a non-None value):
+      - peak_values.tran / .vert / .long              (from report.channels)
+      - peak_values.peak_vector_sum                   (from report.peak_vector_sum_ips)
+      - peak_values.micl  (psi)                       (from report.mic.pspl_dbl → psi)
+      - project_info.project / .client / .operator / .sensor_location
+      - sample_rate                                   (from report.sample_rate_sps)
+      - rectime_seconds                               (from report.record_time_s)
+
+    Fields NOT touched (operator-edit / parser-output preserved):
+      - timestamp, raw_samples, record_type, total_samples,
+        pretrig_samples, _waveform_key, _a5_frames, _raw_record
+      - false_trigger and review state (those live on the sidecar, not on Event)
+    """
+    if event.peak_values is None:
+        event.peak_values = PeakValues()
+    pv = event.peak_values
+    ch = report.channels
+    if (t := ch.get("Tran")) and t.ppv_ips is not None: pv.tran = t.ppv_ips
+    if (v := ch.get("Vert")) and v.ppv_ips is not None: pv.vert = v.ppv_ips
+    if (l := ch.get("Long")) and l.ppv_ips is not None: pv.long = l.ppv_ips
+    if report.peak_vector_sum_ips is not None:
+        pv.peak_vector_sum = report.peak_vector_sum_ips
+    if report.mic.pspl_dbl is not None and report.mic.pspl_dbl > 0:
+        pv.micl = _dbl_to_psi(report.mic.pspl_dbl)
+
+    if event.project_info is None:
+        event.project_info = ProjectInfo()
+    pi = event.project_info
+    if report.project:         pi.project         = report.project
+    if report.client:          pi.client          = report.client
+    if report.operator:        pi.operator        = report.operator
+    if report.sensor_location: pi.sensor_location = report.sensor_location
+
+    if report.sample_rate_sps:
+        event.sample_rate = report.sample_rate_sps
+    if report.record_time_s is not None:
+        event.rectime_seconds = report.record_time_s
+
+
+def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
+    """Mirror of ``apply_report_to_event`` for the projected sidecar
+    dict shape (as produced by ``_bw_report_to_dict``).
+
+    Why this exists
+    ───────────────
+    The ingest path holds a live ``BwAsciiReport`` parsed straight from
+    the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
+    device-authoritative peaks onto the codec output before insert.
+
+    The backfill path doesn't have the original ``.TXT`` (it's not
+    retained in the waveform store), but it does have the preserved
+    ``bw_report`` block from the sidecar — which contains the same
+    projected fields.  Re-overlaying those during a backfill keeps the
+    DB peak columns aligned with what BW reports rather than letting
+    the codec output (which may be incomplete for unhandled formats or
+    walker edge cases) win by default.
+
+    No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
+    any particular sub-field — only fields with a concrete value get
+    written.  Mirrors ``apply_report_to_event``'s "report wins where
+    present" semantics.
+    """
+    if not bw_report:
+        return
+    if event.peak_values is None:
+        event.peak_values = PeakValues()
+    pv = event.peak_values
+
+    peaks = bw_report.get("peaks") or {}
+    tran = (peaks.get("tran") or {}).get("ppv_ips")
+    vert = (peaks.get("vert") or {}).get("ppv_ips")
+    long = (peaks.get("long") or {}).get("ppv_ips")
+    if tran is not None: pv.tran = tran
+    if vert is not None: pv.vert = vert
+    if long is not None: pv.long = long
+    vs_ips = (peaks.get("vector_sum") or {}).get("ips")
+    if vs_ips is not None:
+        pv.peak_vector_sum = vs_ips
+
+    mic = bw_report.get("mic") or {}
+    pspl = mic.get("pspl_dbl")
+    if pspl is not None and pspl > 0:
+        pv.micl = _dbl_to_psi(pspl)
+
+    rec = bw_report.get("recording") or {}
+    sr = rec.get("sample_rate_sps")
+    if sr:
+        event.sample_rate = sr
+    rt = rec.get("record_time_s")
+    if rt is not None:
+        event.rectime_seconds = rt
+
+
+def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
+    if pi is None:
+        return {
+            "project":         None,
+            "client":          None,
+            "operator":        None,
+            "sensor_location": None,
+        }
+    return {
+        "project":         pi.project,
+        "client":          pi.client,
+        "operator":        pi.operator,
+        "sensor_location": pi.sensor_location,
+    }
+
+
+def event_to_sidecar_dict(
+    event: Event,
+    *,
+    serial: str,
+    blastware_filename: str,
+    blastware_filesize: int,
+    blastware_sha256: str,
+    source_kind: str = "sfm-live",
+    txt_filename: Optional[str] = None,
+    a5_pickle_filename: Optional[str] = None,
+    tool_version: str = _TOOL_VERSION_DEFAULT,
+    captured_at: Optional[datetime.datetime] = None,
+    review: Optional[dict] = None,
+    extensions: Optional[dict] = None,
+    bw_report: Optional[BwAsciiReport] = None,
+) -> dict:
+    """
+    Build a v1 sidecar dict from an Event + the surrounding metadata.
+
+    Pure helper — no file I/O.  Callers stitch the result into a sidecar
+    via `write_sidecar()` (or send it back via the PATCH endpoint).
+
+    When *bw_report* is supplied (e.g. by the ACH-forwarded import path
+    where Blastware writes a per-event ASCII report alongside the binary),
+    its decoded fields are folded into the sidecar:
+
+      - A new top-level ``bw_report`` block carries the rich derived
+        per-channel stats (Peak Acceleration, Peak Displacement, ZC Freq,
+        Time of Peak), the Peak Vector Sum + time, the per-channel sensor
+        self-check results, and monitor-log timestamps.
+      - ``peak_values`` is overlaid from the report (the report's PPV/PVS
+        values are computed by the device firmware and are authoritative;
+        anything ``read_blastware_file()`` derived from samples is
+        approximate at best until the body codec is decoded).
+      - ``project_info`` is overlaid from the report when the report
+        supplies a non-empty value (the report mirrors the device's
+        compliance config, which is what BW shows in its event report).
+      - ``event.timestamp`` is overlaid from the report's Event Date +
+        Event Time (BW's report timestamps are second-resolution and
+        match the binary's footer; we prefer the report value because
+        the BW-binary footer timestamp can drift on some firmware).
+    """
+    if source_kind not in {"sfm-live", "sfm-ach", "bw-import", "idf-import"}:
+        raise ValueError(f"unknown source_kind: {source_kind!r}")
+
+    captured_at = captured_at or datetime.datetime.utcnow()
+
+    # ── Overlay event fields from the report when present ───────────────────
+    timestamp_iso = _ts_iso(event.timestamp)
+    if bw_report and bw_report.event_datetime:
+        timestamp_iso = bw_report.event_datetime.isoformat()
+
+    # Build peak_values, optionally overlaid from the report.  The report
+    # stores Mic peak as PSPL (dB(L)); we convert to psi to match the
+    # existing peak_values.mic_psi field.
+    peak_dict = _peak_values_to_dict(event.peak_values)
+    if bw_report:
+        ch = bw_report.channels
+        if (t := ch.get("Tran")) and t.ppv_ips is not None: peak_dict["transverse"]   = t.ppv_ips
+        if (v := ch.get("Vert")) and v.ppv_ips is not None: peak_dict["vertical"]     = v.ppv_ips
+        if (l := ch.get("Long")) and l.ppv_ips is not None: peak_dict["longitudinal"] = l.ppv_ips
+        if bw_report.peak_vector_sum_ips is not None:
+            peak_dict["vector_sum"] = bw_report.peak_vector_sum_ips
+        if bw_report.mic.pspl_dbl is not None and bw_report.mic.pspl_dbl > 0:
+            peak_dict["mic_psi"] = _dbl_to_psi(bw_report.mic.pspl_dbl)
+
+    # Project info: overlay from report (the report mirrors the
+    # session-start compliance config that BW renders in event reports).
+    proj_dict = _project_info_to_dict(event.project_info)
+    if bw_report:
+        if bw_report.project:         proj_dict["project"]         = bw_report.project
+        if bw_report.client:          proj_dict["client"]          = bw_report.client
+        if bw_report.operator:        proj_dict["operator"]        = bw_report.operator
+        if bw_report.sensor_location: proj_dict["sensor_location"] = bw_report.sensor_location
+
+    # Event-block fields: overlay from report where available.
+    event_block = {
+        "serial":           serial,
+        "timestamp":        timestamp_iso,
+        "waveform_key":     event._waveform_key.hex() if event._waveform_key else None,
+        "record_type":      event.record_type,
+        "sample_rate":      event.sample_rate,
+        "rectime_seconds":  event.rectime_seconds,
+        "total_samples":    event.total_samples,
+        "pretrig_samples":  event.pretrig_samples,
+    }
+    if bw_report:
+        # Report values are authoritative — they're the user-configured
+        # values BW reads back, not STRT-derived guesses.  In particular
+        # `event.rectime_seconds` from `read_blastware_file()` reads
+        # STRT[18] which is actually the `0x46` record-type marker (= 70)
+        # rather than the user's Record Time setting.  Always overwrite.
+        if bw_report.sample_rate_sps:
+            event_block["sample_rate"] = bw_report.sample_rate_sps
+        if bw_report.record_time_s is not None:
+            event_block["rectime_seconds"] = bw_report.record_time_s
+        # Derive total_samples + pretrig_samples per channel from the
+        # report's sample_rate × times.  These match the row count of
+        # the report's sample table (verified: event-c reports 1024 sps
+        # × (1.0 + 0.25) = 1280 rows).
+        if (sr := bw_report.sample_rate_sps) and bw_report.record_time_s is not None:
+            pretrig_s = abs(bw_report.pretrig_s) if bw_report.pretrig_s is not None else 0.0
+            event_block["total_samples"]   = int(round(sr * (bw_report.record_time_s + pretrig_s)))
+            event_block["pretrig_samples"] = int(round(sr * pretrig_s))
+
+    out = {
+        "schema_version": SCHEMA_VERSION,
+        "kind":           SIDECAR_KIND,
+
+        "event":        event_block,
+        "peak_values":  peak_dict,
+        "project_info": proj_dict,
+
+        "blastware": {
+            "filename":  blastware_filename,
+            "filesize":  blastware_filesize,
+            "sha256":    blastware_sha256,
+            "available": True,
+        },
+
+        "source": {
+            "kind":               source_kind,
+            "captured_at":        captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
+            "tool_version":       tool_version,
+            "a5_pickle_filename": a5_pickle_filename,
+            "txt_filename":       txt_filename,
+        },
+
+        "review": review or {
+            "false_trigger": False,
+            "reviewer":      None,
+            "reviewed_at":   None,
+            "notes":         "",
+        },
+
+        "extensions": extensions or {},
+    }
+
+    if bw_report:
+        out["bw_report"] = _bw_report_to_dict(bw_report)
+
+    return out
+
+
+# ── Sidecar IO ────────────────────────────────────────────────────────────────
+
+
+def write_sidecar(path: Union[str, Path], data: dict) -> None:
+    """
+    Atomic write of a sidecar dict to <path>.
+
+    Validates schema_version is supported before writing so we don't
+    silently drop a future-format sidecar over the wire.
+    """
+    path = Path(path)
+    sv = data.get("schema_version")
+    if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
+        raise ValueError(
+            f"write_sidecar: unsupported schema_version={sv!r} "
+            f"(this build supports 1..{SCHEMA_VERSION})"
+        )
+
+    tmp = path.with_suffix(path.suffix + ".tmp")
+    with tmp.open("w", encoding="utf-8") as f:
+        json.dump(data, f, indent=2, sort_keys=False, default=str)
+        f.write("\n")
+        f.flush()
+        os.fsync(f.fileno())
+    os.replace(tmp, path)
+
+
+def read_sidecar(path: Union[str, Path]) -> dict:
+    """
+    Load a sidecar JSON file.
+
+    Raises FileNotFoundError if missing, ValueError on bad shape /
+    unsupported schema_version.  Unknown keys at the top level are
+    preserved in the returned dict (forward-compat).
+    """
+    path = Path(path)
+    with path.open("r", encoding="utf-8") as f:
+        data = json.load(f)
+    if not isinstance(data, dict):
+        raise ValueError(f"sidecar at {path}: top-level is not a JSON object")
+    sv = data.get("schema_version")
+    if not isinstance(sv, int) or sv < 1:
+        raise ValueError(f"sidecar at {path}: missing or invalid schema_version")
+    if sv > SCHEMA_VERSION:
+        raise ValueError(
+            f"sidecar at {path}: schema_version={sv} > supported {SCHEMA_VERSION}; "
+            "upgrade seismo-relay to read this file"
+        )
+    if data.get("kind") != SIDECAR_KIND:
+        raise ValueError(f"sidecar at {path}: unexpected kind={data.get('kind')!r}")
+    return data
+
+
+def patch_sidecar(
+    path: Union[str, Path],
+    *,
+    review: Optional[dict] = None,
+    extensions: Optional[dict] = None,
+    reviewer_now: bool = True,
+) -> dict:
+    """
+    Atomically apply a JSON-merge-patch to a sidecar file's `review`
+    and/or `extensions` blocks.  Other top-level keys are untouched.
+
+    `reviewer_now`: when True (default) and `review` is non-empty, stamps
+    `review.reviewed_at` with the current UTC time so the review-time is
+    auditable without the caller having to pass it.
+
+    Returns the new full sidecar dict.
+    """
+    path = Path(path)
+    data = read_sidecar(path)
+
+    if review:
+        merged = dict(data.get("review") or {})
+        merged.update({k: v for k, v in review.items() if v is not None or k in merged})
+        if reviewer_now:
+            merged["reviewed_at"] = datetime.datetime.utcnow().isoformat() + "Z"
+        data["review"] = merged
+
+    if extensions:
+        merged_ext = dict(data.get("extensions") or {})
+        merged_ext.update(extensions)
+        data["extensions"] = merged_ext
+
+    write_sidecar(path, data)
+    return data
+
+
+def sidecar_path_for(blastware_path: Union[str, Path]) -> Path:
+    """Convention: <bw_path>.sfm.json sits next to the BW binary."""
+    p = Path(blastware_path)
+    return p.with_name(p.name + ".sfm.json")
+
+
+def file_sha256(path: Union[str, Path], chunk_size: int = 65536) -> str:
+    """Compute SHA-256 of a file as a hex string."""
+    h = hashlib.sha256()
+    with open(path, "rb") as f:
+        while True:
+            chunk = f.read(chunk_size)
+            if not chunk:
+                break
+            h.update(chunk)
+    return h.hexdigest()
+
+
+# ── Blastware-file reader ─────────────────────────────────────────────────────
+#
+# Reverse of `blastware_file.write_blastware_file`.  Used by the BW-import
+# flow to ingest files produced by Blastware's own ACH (where the source
+# A5 frames are not available).
+#
+# File structure (recap):
+#   [22B header] [21B STRT record] [body bytes] [26B footer]
+#
+# The body holds:
+#   - 6B preamble (00 00 ff ff ff ff) immediately after the STRT
+#   - Codec-encoded sample data (not raw int16 LE; see read_blastware_file)
+#   - Embedded ASCII metadata strings (Project: / Client: / User Name: /
+#     Seis Loc: / Extended Notes) from the device's session-start config
+#
+# The 0C waveform record (per-event peaks, project name) is NOT in the
+# BW file — those are computed by the device firmware and only carried
+# in the live SUB 0C response.  read_blastware_file() therefore computes
+# peaks from the raw samples assuming Normal-range (10 in/s full-scale)
+# geophone sensitivity.  Imported events surface that assumption via the
+# sidecar's `peak_values.computed_from_samples` flag.
+
+
+# Geophone scale factor, in/s per ADC unit, for Normal range (10 in/s FS).
+# Confirmed from CLAUDE.md (geo_hardware_constant = 6.206053 in/s per V,
+# ADC full-scale = 1.61133 V, i.e. Normal range = 10.0 in/s peak; per-count
+# resolution ≈ 10.0 / 32768).
+_GEO_NORMAL_FS_INS  = 10.0
+_GEO_SENSITIVE_FS_INS = 1.250
+_INT16_FS = 32768.0
+
+# Microphone scale factor, psi per ADC count.  Approximate — exact factor
+# depends on the geophone-vs-mic ADC scaling and the firmware reference.
+# We mark mic_psi as "computed approximate" in the sidecar.
+_MIC_FS_PSI = 0.0125 / _INT16_FS   # assumes 0.0125 psi full-scale
+
+
+def _decode_strt(strt: bytes) -> dict:
+    """
+    Decode the 21-byte STRT record from a BW file.
+
+    Returns a dict with waveform_key (4B, hex string), total_samples,
+    pretrig_samples, rectime_seconds; returns {} if truncated or not STRT.
+    """
+    if len(strt) < 21 or strt[0:4] != b"STRT":
+        return {}
+    return {
+        "waveform_key":    strt[6:10].hex(),
+        "total_samples":   struct.unpack_from(">H", strt, 8)[0],
+        "pretrig_samples": struct.unpack_from(">H", strt, 16)[0],
+        "rectime_seconds": strt[18],
+    }
+
+
+def _find_first_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
+    """
+    Search `buf` for `label` (e.g. b"Project:") and return the
+    null-terminated ASCII string that follows, stripped.
+    """
+    pos = buf.find(label)
+    if pos < 0:
+        return None
+    start = pos + len(label)
+    end = buf.find(b"\x00", start, start + max_len)
+    if end < 0:
+        end = start + max_len
+    text = buf[start:end].decode("ascii", errors="replace").strip()
+    return text or None
+
+
+def _decode_samples_4ch_int16_le(stream: bytes) -> dict[str, list[int]]:
+    """
+    Decode a 4-channel interleaved int16 LE byte stream into per-channel
+    lists.  Channels are [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
+    Truncates to a multiple of 8 bytes (one full sample-set).
+    """
+    n_complete = (len(stream) // 8) * 8
+    if n_complete == 0:
+        return {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    fmt = f"<{n_complete // 2}h"
+    flat = list(struct.unpack(fmt, stream[:n_complete]))
+    return {
+        "Tran": flat[0::4],
+        "Vert": flat[1::4],
+        "Long": flat[2::4],
+        "MicL": flat[3::4],
+    }
+
+
+def _peaks_from_samples(samples: dict[str, list[int]]) -> PeakValues:
+    """
+    Compute approximate peaks from raw int16 samples assuming Normal-range
+    geophone sensitivity.  Used by the BW-importer when the 0C waveform
+    record (the device's authoritative peaks) is unavailable.
+    """
+    def _peak_ins(ch: list[int]) -> float:
+        if not ch:
+            return 0.0
+        m = max(abs(int(v)) for v in ch)
+        return m / _INT16_FS * _GEO_NORMAL_FS_INS
+
+    tran = _peak_ins(samples.get("Tran", []))
+    vert = _peak_ins(samples.get("Vert", []))
+    long_ = _peak_ins(samples.get("Long", []))
+
+    # Mic in psi (approximate)
+    mic_ch = samples.get("MicL", []) or []
+    mic = max((abs(int(v)) for v in mic_ch), default=0) * _MIC_FS_PSI
+
+    # Peak vector sum: max over time of sqrt(T^2 + V^2 + L^2)
+    pvs = 0.0
+    n = min(len(samples.get("Tran", [])), len(samples.get("Vert", [])), len(samples.get("Long", [])))
+    if n:
+        scale = _GEO_NORMAL_FS_INS / _INT16_FS
+        T = samples["Tran"]; V = samples["Vert"]; L = samples["Long"]
+        for i in range(n):
+            t = T[i] * scale
+            v = V[i] * scale
+            l = L[i] * scale
+            mag = (t*t + v*v + l*l) ** 0.5
+            if mag > pvs:
+                pvs = mag
+
+    return PeakValues(
+        tran=tran, vert=vert, long=long_,
+        peak_vector_sum=pvs, micl=mic,
+    )
+
+
+_RECORD_TYPE_BY_EXT_SUFFIX = {
+    'H': 'Histogram',
+    'W': 'Waveform',
+    'M': 'Manual',
+    'E': 'Event',
+    'C': 'Combo',
+}
+
+
+def derive_record_type_from_filename(filename, default: str = "Waveform") -> str:
+    """Derive a BW Event's record_type from its filename's extension suffix.
+
+    V10.72+ MiniMate Plus firmware encodes the event type as the LAST
+    character of the extension (the `T` in BW's `AB0T` scheme):
+
+        ``M529LKIQ.G10H``  →  H  →  ``"Histogram"``
+        ``T350L385.VY0W``  →  W  →  ``"Waveform"``
+        ``...M``           →  M  →  ``"Manual"``
+        ``...E``           →  E  →  ``"Event"``
+        ``...C``           →  C  →  ``"Combo"``
+
+    Old S338 firmware uses 3-char extensions ending in ``0`` whose
+    encoding is not yet known — those fall through to ``default``.
+    Micromate Series 4 uses a different scheme entirely (observed:
+    ``IDFH``, ``IDFW``) but the LAST-char convention (H / W) still holds
+    for the type code, so it works for both families.
+
+    Returns ``default`` if filename is empty, has no extension, or the
+    suffix char isn't a recognized type code.
+    """
+    if not filename:
+        return default
+    try:
+        name = Path(filename).name
+    except (TypeError, ValueError):
+        return default
+    if '.' not in name:
+        return default
+    ext = name.rsplit('.', 1)[1]
+    if not ext:
+        return default
+    return _RECORD_TYPE_BY_EXT_SUFFIX.get(ext[-1].upper(), default)
+
+
+def read_blastware_file(path: Union[str, Path]) -> Event:
+    """
+    Parse a Blastware waveform file into an Event.
+
+    Recovers:
+      - waveform_key, rectime_seconds, total_samples, pretrig_samples
+        (from the STRT record)
+      - timestamp (from the footer's start-time field)
+      - project_info (from ASCII labels embedded in the body)
+      - raw_samples (Tran/Vert/Long/MicL int16 lists)
+      - peak_values (computed from raw_samples; approximate — see notes
+        on _peaks_from_samples about Normal-range assumption)
+
+    Does NOT recover the source A5 frames (they aren't in the BW file).
+    The returned Event has `_a5_frames = None`, signalling that
+    byte-for-byte regeneration of the BW file from this Event alone is
+    not possible — the on-disk BW file IS the byte-for-byte source.
+    """
+    path = Path(path)
+    raw = path.read_bytes()
+    if len(raw) < _bw._WAVEFORM_HEADER_SIZE + 21 + 26:
+        raise ValueError(f"{path}: file too short ({len(raw)} bytes) to be a BW event")
+
+    # Header: validate magic prefix.
+    header = raw[:_bw._WAVEFORM_HEADER_SIZE]
+    if not header.startswith(_bw._FILE_HEADER_PREFIX):
+        raise ValueError(f"{path}: not a Blastware file (bad header prefix)")
+
+    # STRT record: 21 bytes immediately after the header.
+    strt_raw = raw[_bw._WAVEFORM_HEADER_SIZE : _bw._WAVEFORM_HEADER_SIZE + 21]
+    strt_fields = _decode_strt(strt_raw)
+    if not strt_fields:
+        raise ValueError(f"{path}: STRT record missing or malformed")
+
+    # Footer: locate the 0e 08 marker, validating the year is in a sane range.
+    body_start = _bw._WAVEFORM_HEADER_SIZE + 21
+    footer_pos = -1
+    pos = body_start
+    while True:
+        pos = raw.find(b"\x0e\x08", pos)
+        if pos < 0 or pos + 26 > len(raw):
+            break
+        yr = (raw[pos + 4] << 8) | raw[pos + 5]
+        if 2015 <= yr <= 2050:
+            footer_pos = pos
+            break
+        pos += 1
+
+    if footer_pos < 0 and len(raw) >= 26:
+        footer_pos = len(raw) - 26
+    if footer_pos < body_start:
+        raise ValueError(f"{path}: footer not found")
+
+    body   = raw[body_start : footer_pos]
+    footer = raw[footer_pos : footer_pos + 26]
+
+    # Footer layout:
+    #   [0:2]   0e 08  marker
+    #   [2:10]  ts1 (start) BE 8B
+    #   [10:18] ts2 (stop)  BE 8B
+    #   [18:24] 00 01 00 02 00 00
+    #   [24:26] crc
+    ts1 = _bw._decode_ts_be(footer[2:10])
+    ts2 = _bw._decode_ts_be(footer[10:18])
+
+    # Body: decode via the verified body codecs.  Two formats coexist:
+    #
+    #   1. Waveform-mode (.AB0W) — starts with 7-byte preamble
+    #      ``00 02 00 [Tran[0] BE] [Tran[1] BE]`` followed by the
+    #      tagged-block delta stream documented in
+    #      ``docs/waveform_codec_re_status.md`` and §7.6.1 of the
+    #      protocol reference.  Decoded by ``waveform_codec.decode_waveform_v2``.
+    #
+    #   2. Histogram-mode (.AB0H) — a sequence of 32-byte blocks, one
+    #      per histogram interval, each carrying per-channel peak +
+    #      half-period values.  Decoded by
+    #      ``histogram_codec.decode_histogram_body``.  Both codecs
+    #      return the same channel-grouped output shape, so consumers
+    #      don't need to special-case mode.
+    #
+    # The historical ``_decode_samples_4ch_int16_le`` int16-LE
+    # interpretation was retracted 2026-05-08 (see protocol-ref §7.6.1
+    # retraction box) — it produced ±32K noise on every event.
+    #
+    # If both codecs fail (malformed file, truncated body, unrecognised
+    # mode, synthetic test input), fall back to empty channels — the
+    # rest of the event (timestamp, waveform_key, project strings) is
+    # still recoverable and useful.
+    decoded = decode_waveform_v2(body)
+    if decoded is None:
+        decoded = decode_histogram_body(body)
+    if decoded is None:
+        log.warning(
+            "%s: body codec failed to decode (body starts %s) — "
+            "raw_samples will be empty", path, body[:8].hex(" "),
+        )
+        samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
+    else:
+        samples = decoded_to_adc_counts(decoded)
+
+    # Metadata strings (label-anchored search across the body).
+    project = _find_first_string(body, b"Project:")
+    client  = _find_first_string(body, b"Client:")
+    user    = _find_first_string(body, b"User Name:")
+    seisloc = _find_first_string(body, b"Seis Loc:")
+
+    # Build the Event.
+    ev = Event(index=-1)
+    if strt_fields.get("waveform_key"):
+        ev._waveform_key = bytes.fromhex(strt_fields["waveform_key"])
+    # Derive record_type from the filename's extension suffix (H/W/M/E/C).
+    # When called from save_imported_bw the path here is a tmp file with a
+    # ".bw" suffix, so the derivation falls back to "Waveform" and the
+    # caller overrides ev.record_type using the original filename — see
+    # waveform_store.save_imported_bw.
+    ev.record_type     = derive_record_type_from_filename(path.name)
+    ev.rectime_seconds = strt_fields.get("rectime_seconds")
+    ev.total_samples   = strt_fields.get("total_samples")
+    ev.pretrig_samples = strt_fields.get("pretrig_samples")
+
+    if ts1 is not None:
+        ev.timestamp = Timestamp(
+            raw=footer[2:10],
+            flag=0x10,
+            year=ts1.year, unknown_byte=0, month=ts1.month, day=ts1.day,
+            hour=ts1.hour, minute=ts1.minute, second=ts1.second,
+        )
+
+    ev.project_info = ProjectInfo(
+        project=project, client=client, operator=user, sensor_location=seisloc,
+    )
+    ev.raw_samples = samples
+    # Only compute peaks from samples when we actually have samples.
+    # For events neither body codec could decode (malformed, truncated,
+    # or unrecognised bodies), every channel list in ``samples`` is empty
+    # and ``_peaks_from_samples`` would return PeakValues(0, 0, 0, 0, 0).
+    # That would then OVERWRITE existing good DB peak values (e.g. from
+    # paired BW ASCII reports) during the backfill UPSERT path.
+    # Leaving peak_values=None signals "we don't know" to downstream
+    # consumers; the backfill script seeds from the DB row when it sees
+    # None, and ``apply_report_to_event`` overlays from a paired ASCII
+    # report when one is supplied.
+    has_samples = any(samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL"))
+    ev.peak_values = _peaks_from_samples(samples) if has_samples else None
+    ev._a5_frames = None  # not recoverable from BW file
+
+    return ev
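
The footer search at the top of this hunk can be sketched in isolation. This is a minimal illustration, not the module's own code: `find_footer`, `FOOTER_LEN`, and the synthetic file bytes (including the CRC value) are hypothetical. Only the `0e 08` marker, the 26-byte footer length, the year position at marker+4, the 2015-2050 plausibility window, and the last-26-bytes fallback are taken from the hunk.

```python
from typing import Optional

FOOTER_LEN = 26
FOOTER_MARKER = b"\x0e\x08"


def find_footer(raw: bytes, body_start: int = 0) -> Optional[int]:
    """Return the footer offset, or None when no usable footer exists."""
    pos = body_start
    while True:
        pos = raw.find(FOOTER_MARKER, pos)
        if pos < 0 or pos + FOOTER_LEN > len(raw):
            break
        # The big-endian year sits 4 bytes after the marker; a plausible
        # year separates the real footer from stray `0e 08` pairs in the body.
        year = (raw[pos + 4] << 8) | raw[pos + 5]
        if 2015 <= year <= 2050:
            return pos
        pos += 1
    # Fallback: assume the footer is simply the last 26 bytes of the file.
    fallback = len(raw) - FOOTER_LEN
    if fallback >= 0 and fallback >= body_start:
        return fallback
    return None


# Synthetic file: a body with a decoy marker (year = 1), then a footer.
footer = bytes.fromhex(
    "0e 08 "                      # marker
    "01 05 07 ea 00 0d 15 25 "    # ts1 (year bytes 07 ea = 2026)
    "01 05 07 ea 00 0d 15 28 "    # ts2
    "00 01 00 02 00 00 "          # constant field
    "12 34"                       # made-up CRC, illustration only
)
body = b"\xaa" * 10 + b"\x0e\x08\x00\x00\x00\x01" + b"\xbb" * 20
raw = body + footer
footer_pos = find_footer(raw)
```

The decoy at offset 10 is rejected by the year check, so the search continues to the real footer at offset 36. The sketch returns `None` where the function in the hunk raises `ValueError`.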
@@ -111,20 +111,24 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
    verified against this algorithm on 2026-04-02).

    Args:
-        offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
-        raw_params:  10 or 11 params bytes (from bulk_waveform_params or
-                     bulk_waveform_term_params). 0x10 bytes in params are
-                     written RAW — NOT DLE-stuffed. Confirmed 2026-04-06 by
-                     comparing wire bytes: BW sends bare `10 04` for chunk 1
-                     (counter=0x1004), not stuffed `10 10 04`. Device reads
-                     params at fixed byte positions; stuffing shifts the bytes
-                     and corrupts the counter, causing device to ignore the frame.
+        offset_word: 16-bit offset.  For probe/chunks/metadata pages this is
+                     `0x1002`.  For the proper TERM frame this is computed by
+                     `bulk_waveform_term_v2()` from the STRT-derived
+                     `end_offset`.
+        raw_params:  10, 11, or 12 params bytes (from `bulk_waveform_params`
+                     for probes/samples, `bulk_waveform_term_v2` for TERM, or
+                     a manually-built 12-byte block for the metadata pages
+                     0x1002 / 0x1004).  See gotcha #3 below — params region
+                     uses partial DLE stuffing of 0x10 bytes.

    Returns:
        Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
    """
-    if len(raw_params) not in (10, 11):
-        raise ValueError(f"raw_params must be 10 or 11 bytes, got {len(raw_params)}")
+    if len(raw_params) not in (10, 11, 12):
+        # 10 = termination params; 11 = regular probe / chunk params;
+        # 12 = metadata-page params (extra trailing 0x00 — BW byte-perfect quirk
+        # for the two fixed metadata reads at counter=0x1002 and 0x1004).
+        raise ValueError(f"raw_params must be 10/11/12 bytes, got {len(raw_params)}")

    # Build stuffed section between STX and checksum
    s = bytearray()
@@ -134,8 +138,40 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
    s += b"\x00"                                 # field3
    s += bytes([(offset_word >> 8) & 0xFF,       # offset_hi — raw, NOT stuffed
                 offset_word & 0xFF])            # offset_lo
-    for b in raw_params:                         # params — NOT DLE-stuffed (raw bytes, match BW wire format)
+    # Params — partial DLE stuffing of 0x10 bytes (CONFIRMED 2026-05-05).
+    #
+    # The device's de-stuffing rule for params is:
+    #   • `10 10`        → de-stuffs to `10`
+    #   • `10 02/03/04`  → kept literal (these are inner-frame markers)
+    #   • `10 X` other   → de-stuffs to just `X`  (drops the 0x10)
+    #
+    # So for any 0x10 byte in the *logical* params that is followed by a
+    # byte NOT in {0x02, 0x03, 0x04, 0x10}, we must double the 0x10 on the
+    # wire (`10 X` → `10 10 X`) so the device's de-stuffer reproduces the
+    # original `10 X` pair.  Without this, counter values with `0x10` in
+    # the high byte (e.g. counter=0x1000 has params bytes `10 00`) are
+    # silently corrupted to `0x__00` on the device side, and the device
+    # responds for the wrong address — for counter=0x1000 it returns the
+    # probe response (counter=0x0000), which contains the file header +
+    # STRT.  That STRT block then lands in the assembled file body and
+    # Blastware rejects the file as malformed.
+    #
+    # Confirmed against BW capture 5-1-26 / bwcap3sec frame 20: params
+    # logical bytes `00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000)
+    # are encoded on the wire as `00 01 11 10 10 00 00 00 00 00 00 00`.
+    # BW frames 13/14 (meta @ 0x1002 / 0x1004) leave `10 02` and `10 04`
+    # raw — the device handles those literal pairs correctly.
+    i = 0
+    while i < len(raw_params):
+        b = raw_params[i]
        s.append(b)
+        if (
+            b == 0x10
+            and i + 1 < len(raw_params)
+            and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10)
+        ):
+            s.append(0x10)   # double the 0x10 so it survives device de-stuffing
+        i += 1

    # DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
    chk, i = 0, 0
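
The partial DLE-stuffing rule described in the comment block above can be exercised on its own. The sketch below is a standalone re-statement of the loop, assuming only what the comment says; `stuff_params` and `_LITERAL_AFTER_DLE` are hypothetical names, and the sample bytes are the logical/wire pair quoted from the bwcap3sec frame 20 note.

```python
_LITERAL_AFTER_DLE = (0x02, 0x03, 0x04, 0x10)


def stuff_params(raw_params: bytes) -> bytes:
    """Double any 0x10 whose successor is not in {0x02, 0x03, 0x04, 0x10}."""
    out = bytearray()
    for i, b in enumerate(raw_params):
        out.append(b)
        has_next = i + 1 < len(raw_params)
        if b == 0x10 and has_next and raw_params[i + 1] not in _LITERAL_AFTER_DLE:
            out.append(0x10)  # extra 0x10 survives the device's de-stuffer
    return bytes(out)


# counter=0x1000: the `10 00` pair must go out as `10 10 00`.
logical = bytes.fromhex("00 01 11 10 00 00 00 00 00 00 00")
wire = stuff_params(logical)

# Metadata page at 0x1002: `10 02` is left raw.
meta_logical = bytes.fromhex("00 01 11 10 02 00 00 00 00 00 00 00")
meta_wire = stuff_params(meta_logical)
```

As in the hunk, a 0x10 in the final position is never doubled because it has no successor to inspect.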
@@ -398,28 +434,26 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -

 def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
    """
-    Build the 10-byte params block for the SUB 5A termination request.
+    ⛔ DEPRECATED — DO NOT USE IN NEW CODE.

-    The termination request uses offset=0x005A and a DIFFERENT params layout —
-    the leading 0x00 byte is dropped, key4[0:2] shifts to params[0:2], and the
-    counter high byte is at params[2]:
+    This is the v1 termination params helper, paired with the broken
+    `_BULK_TERM_OFFSET = 0x005A` magic offset_word.  Together they produce a
+    ~100-byte device-side terminator response that does NOT contain the
+    partial-last-chunk waveform tail or the 26-byte file footer.  Files
+    reconstructed using this terminator are missing their last ~512 bytes of
+    waveform data and have a synthesized footer that disagrees with what BW
+    would have written.

-      params[0]   = key4[0]
-      params[1]   = key4[1]
-      params[2]   = (counter >> 8) & 0xFF
-      params[3:]  = zeros
+    **For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
+    which computes the correct offset_word + params from the STRT-derived
+    `end_offset`.  v2 produces wire bytes that match BW exactly across all
+    tested events (4-27-26 / 5-1-26 / 5-4-26 captures).

-    Counter for the termination request = last_regular_counter + 0x0400.
-
-    Confirmed from 1-2-26 BW TX capture: final request (frame 83) uses
-    offset=0x005A, params[0:3] = key4[0:2] + term_counter_hi.
-
-    Args:
-        key4:    4-byte waveform key.
-        counter: Termination counter (= last regular counter + 0x0400).
-
-    Returns:
-        10-byte params block.
+    This function is retained ONLY for the defensive fallback path in
+    `read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
+    chunks are fetched (= a malformed event or an unexpected device state).
+    The fallback already logs a WARNING when it activates; if you see that
+    warning, the bug is upstream — STRT should have been parseable.
    """
    if len(key4) != 4:
        raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
@@ -430,6 +464,123 @@ def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
    return bytes(p)


+def bulk_waveform_term_v2(
+    key4: bytes,
+    end_offset: int,
+    last_chunk_counter: int,
+) -> tuple[int, bytes]:
+    """
+    Compute the SUB 5A TERM frame's offset_word and 10-byte params block.
+
+    Confirmed across 3 events (4-27-26 + 5-1-26 captures):
+
+      next_boundary = last_chunk_counter + 0x0200
+      offset_word   = end_offset - next_boundary    (residual byte count)
+      params[0]     = key4[0]                       (= 0x01 on every observed device)
+      params[1]     = key4[1]                       (= 0x11)
+      params[2]     = (next_boundary >> 8) & 0xFF
+      params[3]     = next_boundary & 0xFF
+      params[4:10]  = zeros
+
+    Verification:
+      | end_offset | last_chunk | next_boundary | offset_word | params[2:4] |
+      | 0x1ABE     | 0x1800     | 0x1A00        | 0x00BE      | 1A 00       |
+      | 0x21F2     | 0x1E00     | 0x2000        | 0x01F2      | 20 00       |
+      | 0x417E     | 0x3E38     | 0x4038        | 0x0146      | 40 38       |
+
+    The device receives `requested_address = (params[2] << 8) | offset_word`
+    and replies with `(end_offset - next_boundary)` bytes of waveform tail
+    starting at `next_boundary` — including the 26-byte file footer.
+
+    Args:
+        key4:               4-byte waveform key for this event.
+        end_offset:         Event-end pointer (= `(end_key[2] << 8) | end_key[3]`
+                            from the STRT record at data[23:27] of A5[0]).
+        last_chunk_counter: Counter of the last full 0x0200-byte chunk fetched
+                            (the chunk that covers [last_chunk_counter,
+                            last_chunk_counter + 0x0200)).
+
+    Returns:
+        (offset_word, params10) tuple.  Pass as
+        `build_5a_frame(offset_word, params)`.
+
+    Raises:
+        ValueError: on inconsistent inputs.
+    """
+    if len(key4) != 4:
+        raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
+    next_boundary = last_chunk_counter + 0x0200
+    if next_boundary > 0xFFFF:
+        raise ValueError(
+            f"next_boundary 0x{next_boundary:04X} exceeds uint16; check inputs"
+        )
+    if end_offset <= last_chunk_counter:
+        raise ValueError(
+            f"end_offset 0x{end_offset:04X} must be > "
+            f"last_chunk_counter 0x{last_chunk_counter:04X}"
+        )
+    offset_word = end_offset - next_boundary
+    if offset_word < 0:
+        # Last chunk overshot end_offset; caller should have stopped one chunk
+        # earlier.  Treat as zero residual.
+        offset_word = 0
+    if offset_word > 0xFFFF:
+        raise ValueError(
+            f"offset_word 0x{offset_word:04X} exceeds uint16"
+        )
+    p = bytearray(10)
+    p[0] = key4[0]
+    p[1] = key4[1]
+    p[2] = (next_boundary >> 8) & 0xFF
+    p[3] = next_boundary & 0xFF
+    return offset_word, bytes(p)
+
+
+# ── End-offset extraction from STRT record ────────────────────────────────────
+
+STRT_MARKER = b"STRT"
+
+
+def parse_strt_end_offset(a5_data: bytes) -> Optional[int]:
+    """
+    Extract the event-end offset from the STRT record in an A5 response payload.
+
+    The first A5 response (the probe response, or the first chunk for events
+    with non-zero start_key[2:4]) contains a STRT record at byte offset 17 of
+    `data`.  Layout:
+
+      data[17:21]  "STRT"
+      data[21:23]  ff fe        sentinel
+      data[23:27]  end_key      ← 4-byte key of where this event ENDS
+      data[27:31]  start_key
+      ...
+
+    Returns `(end_key[2] << 8) | end_key[3]` — the absolute device-buffer
+    address where the event ends.  Use this to bound the chunk loop and to
+    compute the TERM frame.
+
+    Verified end_offset values:
+      | event start_key | end_key      | end_offset |
+      | 01110000        | 01111ABE     | 0x1ABE     |
+      | 01110000        | 011121F2     | 0x21F2     |
+      | 011121F2        | 0111417E     | 0x417E     |
+
+    Args:
+        a5_data: The `data` field of an A5 response frame (frame.data).
+
+    Returns:
+        The end_offset (uint16) if STRT is found, else None.
+    """
+    pos = a5_data.find(STRT_MARKER)
+    if pos < 0 or pos + 10 > len(a5_data):
+        return None
+    # data[pos+4:pos+6] is "ff fe"; data[pos+6:pos+10] is end_key.
+    end_key = a5_data[pos + 6 : pos + 10]
+    if len(end_key) < 4:
+        return None
+    return (end_key[2] << 8) | end_key[3]
+
+
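
The two helpers above chain together: the STRT record yields `end_offset`, and `end_offset` plus the last chunk counter yields the TERM frame fields. The sketch below re-implements just the arithmetic so the three verification rows can be checked without the rest of the module. `strt_end_offset` and `term_fields` are hypothetical stand-ins, the input validation is omitted, and the A5 payload is synthetic (zero filler in front of the documented STRT layout).

```python
from typing import Optional, Tuple


def strt_end_offset(a5_data: bytes) -> Optional[int]:
    """Low 16 bits of end_key, located 6 bytes after the STRT marker."""
    pos = a5_data.find(b"STRT")
    if pos < 0 or pos + 10 > len(a5_data):
        return None
    end_key = a5_data[pos + 6 : pos + 10]
    return (end_key[2] << 8) | end_key[3]


def term_fields(key4: bytes, end_offset: int, last_chunk: int) -> Tuple[int, bytes]:
    """Return (offset_word, 10-byte params) for the TERM frame."""
    next_boundary = last_chunk + 0x0200
    offset_word = max(end_offset - next_boundary, 0)  # residual byte count
    params = bytearray(10)
    params[0] = key4[0]
    params[1] = key4[1]
    params[2] = (next_boundary >> 8) & 0xFF
    params[3] = next_boundary & 0xFF
    return offset_word, bytes(params)


# Synthetic A5 payload: 17 filler bytes, "STRT", ff fe, end_key, start_key.
a5 = bytes(17) + b"STRT" + bytes.fromhex("ff fe 01 11 1a be 01 11 00 00")
end_offset = strt_end_offset(a5)

key4 = bytes.fromhex("01 11 00 00")
rows = [
    term_fields(key4, 0x1ABE, 0x1800),
    term_fields(key4, 0x21F2, 0x1E00),
    term_fields(key4, 0x417E, 0x3E38),
]
```

Each entry in `rows` corresponds to one row of the verification table in the `bulk_waveform_term_v2` docstring.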
 # ── Pre-built POLL frames ─────────────────────────────────────────────────────
 #
 # POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
@@ -457,6 +608,11 @@ class S3Frame:
    page_lo: int        # PAGE_LO from header
    data: bytes         # payload data section (payload[5:], checksum already stripped)
    checksum_valid: bool
+    chk_byte: int = 0   # actual checksum byte received from wire (body[-1])
+                        # needed for waveform file reconstruction: when the last data byte
+                        # is 0x10 and chk_byte ∈ {0x02, 0x03, 0x04}, the DLE+chk pair
+                        # must be included in the DLE-strip operation to correctly
+                        # reconstruct the Blastware binary body.

    @property
    def page_key(self) -> int:
@@ -465,7 +621,6 @@ class S3Frame:


 # ── Streaming S3 frame parser ─────────────────────────────────────────────────
-
 class S3FrameParser:
    """
    Incremental byte-stream parser for S3→BW response frames.
@@ -592,9 +747,10 @@ class S3FrameParser:
            return None

        return S3Frame(
-            sub           = raw_payload[2],
-            page_hi       = raw_payload[3],
-            page_lo       = raw_payload[4],
-            data          = raw_payload[5:],
+            sub            = raw_payload[2],
+            page_hi        = raw_payload[3],
+            page_lo        = raw_payload[4],
+            data           = raw_payload[5:],
            checksum_valid = (chk_received == chk_computed),
+            chk_byte       = chk_received,
        )
@@ -0,0 +1,283 @@
+"""
+histogram_codec.py — decoder for MiniMate Plus histogram-mode event bodies.
+
+FULLY DECODED 2026-05-20.  Every field in every block, verified
+byte-exact against BW's ASCII export across multiple histogram
+fixtures.
+
+The histogram-mode body is a stream of 32-byte fixed-length blocks,
+one block per histogram interval.  Each block carries the per-interval
+peak amplitude + zero-crossing frequency for all four channels (Tran,
+Vert, Long, MicL).
+
+────────────────────────────────────────────────────────────────────────────
+Body layout (CONFIRMED 2026-05-20)
+────────────────────────────────────────────────────────────────────────────
+
+    [stream of 32-byte blocks]
+
+Body length is approximately ``n_intervals * 32`` bytes plus a small
+trailing remnant (typically 1-9 bytes) at the very end.  The walker
+iterates in 32-byte strides and stops before the tail.
+
+────────────────────────────────────────────────────────────────────────────
+32-byte block layout
+────────────────────────────────────────────────────────────────────────────
+
+    [0]    0x00                      always-zero tag
+    [1]    segment_id  (uint8)       0x00..0x03 — 256 blocks per segment
+    [2:4]  block_ctr  (uint16 LE)    resets each segment (0x0100, 0x0101, …)
+    [4:6]  0x000a (uint16 LE)        constant marker (= 10)
+    [6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s, max 1.275 in/s)
+    [7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                     or unmeasurable Tran freq; meaning not fully RE'd
+    [8:10] T_halfperiod   uint16 LE  Tran half-period in samples (freq = 512 / halfp Hz)
+    [10]   V_peak_count   uint8
+    [11]   V_annotation   uint8
+    [12:14] V_halfperiod  uint16 LE
+    [14]   L_peak_count   uint8
+    [15]   L_annotation   uint8
+    [16:18] L_halfperiod  uint16 LE
+    [18]   M_peak_count   uint8      MicL peak (count → dB via mic_count_to_db)
+    [19]   M_annotation   uint8
+    [20:22] M_halfperiod  uint16 LE  MicL half-period in samples (freq = 512 / halfp Hz)
+    [22:24] 0x00 0x00                constant
+    [24:28] 4-byte variable          purpose unknown (possibly CRC or timestamp delta)
+    [28:32] 0x1e 0x0a 0x00 0x00      constant block-end signature
+
+NOTE on peak-count width: an earlier interpretation treated the peak
+fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
+That happened to be byte-exact against the N844 fixture corpus only
+because every annotation byte in those fixtures was zero, making
+``uint16 LE == uint8``.  Cross-correlating BE9558 (K558) Tran-drift
+and BE18003 (T003) Histogram+Continuous events against the BW ASCII
+export proved peak is uint8 alone — see test_histogram_codec.py
+and docs/histogram_codec_re_status.md.
+
+Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
+``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``.  This is the reliable
+distinguisher from non-block content in the file.
+
+────────────────────────────────────────────────────────────────────────────
+Per-channel encoding
+────────────────────────────────────────────────────────────────────────────
+
+Geophone channels (Tran, Vert, Long):
+  - peak_count × 0.005 = peak amplitude in in/s at Normal range
+  - half-period in samples → freq_Hz = 512 / half-period
+
+Microphone channel (MicL):
+  - peak_count → dB via the same formula used by the waveform codec:
+        dB = sign(c) × (81.94 + 20·log10(|c|))    for |c| ≥ 1
+        dB = 0                                    for c == 0
+  - half-period → freq_Hz = 512 / half-period (same as geo)
+
+Frequency `>100 Hz` sentinel: the device emits half-period ≤ 5 when the
+measured zero-crossing rate exceeds the geophone's measurement range
+(since 512/5 = 102.4 Hz; the BW display rounds anything > 100 to ">100").
+
+────────────────────────────────────────────────────────────────────────────
+Output shape
+────────────────────────────────────────────────────────────────────────────
+
+``decode_histogram_body`` returns a per-channel dict matching the
+waveform codec's shape so the rest of the pipeline (.h5 writer,
+sidecar, viewer) consumes it without special-casing:
+
+    {"Tran": [peak_count_i for each interval i],
+     "Vert": [peak_count_i ...],
+     "Long": [peak_count_i ...],
+     "MicL": [peak_count_i ...]}
+
+Values are in **16-count units for geo** (LSB = 0.005 in/s, matching
+``decode_waveform_v2``) and **1-count units for mic** (matching the
+waveform codec's mic convention).  Run through
+``waveform_codec.decoded_to_adc_counts`` to scale geo to 1-count ADC.
+
+Per-interval frequencies are NOT returned — they're auxiliary data,
+not waveform samples.  Consumers needing frequencies can call
+``decode_histogram_body_full()`` for the structured per-interval
+record list.
+"""
+
+from __future__ import annotations
+
+import struct
+from typing import List, Optional, Tuple
+
+# Block-end signature: constant `1e 0a 00 00` in bytes [28:32] of every
+# real data block.  More distinctive than the byte-22 `00 00` (which
+# matches many false positives), so we anchor on this.
+_BLOCK_TAIL = b"\x1e\x0a\x00\x00"
+_BLOCK_SIZE = 32
+
+# Marker word (uint16 LE, value 10) at block[4:6] of every histogram data
+# block.  Used as additional validation that we're looking at a real block.
+_BLOCK_MARKER = 10
+
+# Geo peak scaling: stored as "count × 0.005 in/s" where 1 count = one
+# 0.005 in/s display quantum.  Equivalent to the waveform codec's
+# 16-count-unit output (1 unit = 0.005 in/s = 16 ADC counts).
+_GEO_LSB_INS = 0.005
+
+# Frequency formula: freq_Hz = _FREQ_NUMERATOR / half_period_samples.
+# Empirically determined to be 512 (= sample_rate / 2, where sample rate
+# is 1024 sps for the standard MiniMate Plus configuration).
+_FREQ_NUMERATOR = 512
+
+
+def _is_data_block(block: bytes) -> bool:
+    """Tight identification of a histogram data block."""
+    if len(block) < _BLOCK_SIZE:
+        return False
+    if block[28:32] != _BLOCK_TAIL:
+        return False
+    if block[22:24] != b"\x00\x00":
+        return False
+    if block[0] != 0x00:
+        return False
+    marker = block[4] | (block[5] << 8)
+    if marker != _BLOCK_MARKER:
+        return False
+    return True
+
+
+def _decode_block(block: bytes) -> Optional[dict]:
+    """Decode one 32-byte histogram block.  Caller must have validated
+    with ``_is_data_block`` first.
+
+    Returns a record with per-channel peak counts (uint8) and
+    half-periods (uint16 LE).
+    """
+    # Peak counts are uint8 at bytes [6] / [10] / [14] / [18].  The
+    # adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
+    # whose meaning isn't fully understood (empirically non-zero in
+    # intervals with sub-Hz or unmeasurable geo frequencies, mostly
+    # zero otherwise — see test fixtures from BE9558/BE18003 corpora).
+    # Crucially, those annotation bytes are NOT the high byte of the
+    # peak count: cross-correlating against BW's per-interval ASCII
+    # export proves the peak is uint8 alone.
+    #
+    # Reading the peak as uint16 LE (the original interpretation) was
+    # accidentally correct only because every block in the N844 fixture
+    # corpus had a zero annotation byte; non-N844 events with non-zero
+    # annotation bytes decoded to physically impossible peaks (e.g.
+    # 268 in/s per channel) and produced 35× inflated PVS sums when
+    # first run against prod data.  See histogram_codec_re_status.md.
+    t_peak = block[6]
+    v_peak = block[10]
+    l_peak = block[14]
+    m_peak = block[18]
+    t_halfp = block[8]  | (block[9]  << 8)
+    v_halfp = block[12] | (block[13] << 8)
+    l_halfp = block[16] | (block[17] << 8)
+    m_halfp = block[20] | (block[21] << 8)
+    segment_id = block[1]
+    block_ctr  = block[2] | (block[3] << 8)
+    var_meta   = bytes(block[24:28])
+    annotations = (block[7], block[11], block[15], block[19])
+    return {
+        "segment_id":  segment_id,
+        "block_ctr":   block_ctr,
+        "t_peak":      t_peak,
+        "t_halfp":     t_halfp,
+        "v_peak":      v_peak,
+        "v_halfp":     v_halfp,
+        "l_peak":      l_peak,
+        "l_halfp":     l_halfp,
+        "m_peak":      m_peak,
+        "m_halfp":     m_halfp,
+        "meta_var":    var_meta,
+        "annotations": annotations,
+    }
+
+
+def walk_body(body: bytes) -> List[dict]:
+    """Walk the body and return one dict per histogram interval.
+
+    Iterates 32-byte strides from offset 0.  Collects a decoded record
+    for every block that passes ``_is_data_block`` validation.  Stops
+    when the remaining bytes are too short to form a complete block.
+
+    In Histogram+Continuous mode the body interleaves data blocks with
+    other 32-byte content (likely continuous-mode waveform blocks) that
+    fail the data-block validation; the walker naturally skips them
+    without losing 32-byte alignment.  Use ``block_ctr`` from each
+    returned record to map back to the original interval index — the
+    record list is sparse when other block types are interleaved.
+    """
+    records: List[dict] = []
+    for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
+        blk = body[off:off + _BLOCK_SIZE]
+        if not _is_data_block(blk):
+            # Hit non-block content (likely a sync or stream marker).
+            # Continue walking — block alignment is fixed at 32-stride
+            # from offset 0, so we don't lose alignment by skipping.
+            continue
+        decoded = _decode_block(blk)
+        if decoded is None:
+            # Defensive guard: ``_decode_block`` currently always returns
+            # a record, but its Optional return type leaves room for
+            # future plausibility checks.  Skip anything it rejects.
+            continue
+        records.append(decoded)
+    return records
+
+
+def decode_histogram_body(body: bytes) -> Optional[dict]:
+    """Decode a histogram-mode body into per-channel peak-sample arrays.
+
+    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
+    where each channel's list contains one peak value per histogram
+    interval (in the same units the waveform codec uses: 16-count units
+    for geo, 1-count ADC units for mic).  Returns ``None`` if the body
+    doesn't contain any valid histogram blocks.
+
+    To convert to physical units:
+      - Geo channels: ``count * 0.005`` = peak in in/s at Normal range
+        (or run through ``waveform_codec.decoded_to_adc_counts`` first
+         to get 1-count ADC values, then ``count / 32767 * 10.0`` for in/s)
+      - Mic channel:  use ``waveform_codec.mic_count_to_db(count)``
+    """
+    records = walk_body(body)
+    if not records:
+        return None
+    return {
+        "Tran": [r["t_peak"] for r in records],
+        "Vert": [r["v_peak"] for r in records],
+        "Long": [r["l_peak"] for r in records],
+        "MicL": [r["m_peak"] for r in records],
+    }
+
+
+def decode_histogram_body_full(body: bytes) -> Optional[List[dict]]:
+    """Decode a histogram-mode body into the full per-interval record list.
+
+    Same data as ``decode_histogram_body`` but in a structured form that
+    preserves the half-period (frequency) data for each channel + the
+    per-block segment_id, block_ctr, and 4-byte variable metadata.
+    Useful for diagnostic tools, sidecar enrichment, and future-codec
+    work.
+
+    Returns ``None`` if the body has no valid blocks.
+    """
+    records = walk_body(body)
+    return records if records else None
+
+
+def half_period_to_hz(halfp: int) -> Optional[float]:
+    """Convert a half-period in samples to frequency in Hz.
+
+    Returns ``None`` for half-period ≤ 5 — the device emits values in
+    that range when the measured zero-crossing rate exceeds 100 Hz
+    (the BW display reports `>100 Hz` for such cases).  Callers can
+    treat ``None`` as the `>100 Hz` sentinel.
+    """
+    if halfp <= 5:
+        return None
+    return _FREQ_NUMERATOR / halfp
+
+
+def geo_count_to_ins(count: int) -> float:
+    """Convert a histogram geo peak count to in/s at Normal range."""
+    return count * _GEO_LSB_INS
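
The block layout and the uint8-versus-uint16 pitfall described in the module docstring can be reproduced with a synthetic block. This is a sketch under stated assumptions: `make_block`, `peak_uint8`, `peak_uint16_le`, `half_period_hz`, and `mic_db` are hypothetical helpers, the block contents are made up, and `mic_db` re-derives the formula quoted in the docstring rather than calling the real `waveform_codec.mic_count_to_db`.

```python
import math
from typing import Dict, Optional

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"
CHANNEL_OFFSETS = {"Tran": 6, "Vert": 10, "Long": 14, "MicL": 18}


def make_block(peaks: Dict[str, int], halfp: int, annotation: int = 0) -> bytes:
    """Build a synthetic 32-byte histogram block (test data only)."""
    blk = bytearray(BLOCK_SIZE)
    blk[2:4] = (0x0100).to_bytes(2, "little")   # block_ctr
    blk[4:6] = (10).to_bytes(2, "little")       # constant marker
    for name, off in CHANNEL_OFFSETS.items():
        blk[off] = peaks[name]                  # uint8 peak count
        blk[off + 1] = annotation               # annotation byte
        blk[off + 2 : off + 4] = halfp.to_bytes(2, "little")
    blk[28:32] = BLOCK_TAIL
    return bytes(blk)


def peak_uint8(blk: bytes, name: str) -> int:
    """Correct reading: the peak is the single byte at the channel offset."""
    return blk[CHANNEL_OFFSETS[name]]


def peak_uint16_le(blk: bytes, name: str) -> int:
    """Retracted reading: folds the annotation byte into the peak."""
    off = CHANNEL_OFFSETS[name]
    return int.from_bytes(blk[off : off + 2], "little")


def half_period_hz(halfp: int) -> Optional[float]:
    """freq = 512 / half-period; None stands for the >100 Hz sentinel."""
    return None if halfp <= 5 else 512 / halfp


def mic_db(count: int) -> float:
    """dB = sign(c) * (81.94 + 20*log10(|c|)), with 0 mapped to 0."""
    if count == 0:
        return 0.0
    return math.copysign(81.94 + 20 * math.log10(abs(count)), count)


block = make_block({"Tran": 20, "Vert": 4, "Long": 8, "MicL": 10},
                   halfp=32, annotation=1)
tran_ins = peak_uint8(block, "Tran") * 0.005    # in/s at Normal range
tran_wrong = peak_uint16_le(block, "Tran")      # inflated by the annotation
```

With a non-zero annotation byte the uint16 reading adds 256 per annotation unit to the count, which is the inflation the NOTE in the docstring describes.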
@@ -14,6 +14,7 @@ Notes on certainty:

 from __future__ import annotations

+import datetime
 import struct
 from dataclasses import dataclass, field
 from typing import Optional
@@ -200,6 +201,58 @@ class Timestamp:
            second=second,
        )

+    @classmethod
+    def from_short_record(cls, data: bytes) -> "Timestamp":
+        """
+        Decode an 8-byte timestamp header from a 210-byte waveform record.
+
+        Wire layout (✅ CONFIRMED 2026-05-01 against live SFM run on BE11529 in
+        Continuous mode, day-of-month = 1 May, raw: 01 05 07 ea 00 0d 15 25):
+          byte[0]:    day                 (uint8)
+          byte[1]:    month               (uint8)
+          bytes[2-3]: year                (big-endian uint16)
+          byte[4]:    unknown             (0x00 in observed sample)
+          byte[5]:    hour                (uint8)
+          byte[6]:    minute              (uint8)
+          byte[7]:    second              (uint8)
+
+        This is a third format observed in the wild — distinct from the 9-byte
+        (single-shot, sub_code=0x10 at [1]) and 10-byte (continuous, 0x10 at
+        [0] AND [2]) layouts.  No marker bytes; disambiguated by where the
+        year lands when scanned at byte 2/3/4.
+
+        Args:
+            data: at least 8 bytes; only the first 8 are consumed.
+
+        Returns:
+            Decoded Timestamp.
+
+        Raises:
+            ValueError: if data is fewer than 8 bytes.
+        """
+        if len(data) < 8:
+            raise ValueError(
+                f"Short record timestamp requires at least 8 bytes, got {len(data)}"
+            )
+        day          = data[0]
+        month        = data[1]
+        year         = struct.unpack_from(">H", data, 2)[0]
+        unknown_byte = data[4]
+        hour         = data[5]
+        minute       = data[6]
+        second       = data[7]
+        return cls(
+            raw=bytes(data[:8]),
+            flag=0,
+            year=year,
+            unknown_byte=unknown_byte,
+            month=month,
+            day=day,
+            hour=hour,
+            minute=minute,
+            second=second,
+        )
+
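
The 8-byte layout documented in `from_short_record` can be checked against the raw sample quoted in its docstring. The sketch below is a standalone illustration that returns a plain `datetime.datetime` instead of the module's `Timestamp`; `decode_short_record_ts` is a hypothetical name.

```python
import datetime
import struct


def decode_short_record_ts(data: bytes) -> datetime.datetime:
    """Decode day, month, BE year, unknown, hour, minute, second."""
    if len(data) < 8:
        raise ValueError(f"need at least 8 bytes, got {len(data)}")
    day, month = data[0], data[1]
    year = struct.unpack_from(">H", data, 2)[0]   # big-endian uint16
    hour, minute, second = data[5], data[6], data[7]
    return datetime.datetime(year, month, day, hour, minute, second)


# Raw bytes from the docstring: 01 05 07 ea 00 0d 15 25
sample = bytes.fromhex("01 05 07 ea 00 0d 15 25")
decoded = decode_short_record_ts(sample)
```

Byte 4 is skipped here because its meaning is unknown; the real method preserves it as `unknown_byte`.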
    @property
    def clock_set(self) -> bool:
        """False when year == 1995 (factory default / battery-lost state)."""
@@ -268,7 +321,7 @@ class ChannelConfig:
    label: str                  # e.g. "Tran", "Vert", "Long", "MicL" ✅
    trigger_level: float        # in/s (geo) or psi (MicL) ✅
    alarm_level: float          # in/s (geo) or psi (MicL) ✅
-    max_range: float            # full-scale calibration constant (e.g. 6.206) 🔶
+    max_range: float            # hardware/firmware sensitivity constant (e.g. 6.206053) ✅ confirmed same on all units
    unit_label: str             # e.g. "in./s" or "psi" ✅


@@ -337,15 +390,34 @@ class ComplianceConfig:
    raw: Optional[bytes] = None            # full 2090-byte payload (for debugging)

    # Recording parameters (✅ CONFIRMED from §7.6)
-    record_time: Optional[float] = None    # seconds (7.0, 10.0, 13.0, etc.)
-    sample_rate: Optional[int] = None      # sps (1024, 2048, 4096, etc.) — NOT YET FOUND ❓
+    recording_mode: Optional[int] = None  # uint8: 0x00=Single Shot, 0x01=Continuous,
+                                           # 0x03=Histogram, 0x04=Histogram+Continuous ✅ confirmed 2026-04-20
+                                           # Read (E5): data[anchor_pos - 8] (6-byte anchor)
+                                           # Write (SUB 71): data[anchor_pos - 7]
+    sample_rate: Optional[int] = None     # sps (1024, 2048, 4096)
+    histogram_interval_sec: Optional[int] = None  # uint16 BE, seconds ✅ confirmed 2026-04-20
+                                                   # anchor_pos - 4 (same offset in read & write)
+                                                   # Valid values: 2, 5, 15, 60, 300, 900
+                                                   # Mode-gated: only active in Histogram/Histogram+Continuous
+    record_time: Optional[float] = None   # seconds (e.g. 3.0, 5.0, 8.0, 10.0)

    # Trigger/alarm levels (✅ CONFIRMED per-channel at §7.6)
    # For now we store the first geo channel (Transverse) as representatives;
    # full per-channel data would require structured Channel objects.
-    trigger_level_geo: Optional[float] = None    # in/s (first geo channel)
-    alarm_level_geo: Optional[float] = None      # in/s (first geo channel)
-    max_range_geo: Optional[float] = None        # in/s full-scale range
+    trigger_level_geo: Optional[float] = None    # in/s (first geo channel) ✅
+    alarm_level_geo: Optional[float] = None      # in/s (first geo channel) ✅
+    geo_adc_scale: Optional[float] = None        # ADC-to-velocity scale factor (float32 at Tran+28) ✅
+                                                 # = inverse sensitivity = 1/sensitivity (in/s per V)
+                                                 # Formula (Interface Handbook §4.5): Range = 1.61133 V × scale_factor
+                                                 #   → 1.61133 × 6.206053 = 10.000 in/s (Normal range) ✅
+                                                 # Firmware uses: PPV (in/s) = ADC_voltage (V) × 6.206053
+                                                 # Identical on BE11529 and BE18189 — same Instantel geophone hardware.
+                                                 # NOT a user-configurable setting. Must NOT be written.
+    geo_range: Optional[int] = None             # range/sensitivity selector — CONFIRMED 2026-04-20
+                                                 # 0x00 = Normal    10.000 in/s  (standard gain)
+                                                 # 0x01 = Sensitive  1.250 in/s  (high gain)
+                                                 # Offset: Tran+33 in both E5 read and SUB 71 write payloads
+                                                 # (same 2126-byte buffer is round-tripped; applied to Tran/Vert/Long)
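The geo_adc_scale comment above states the range formula in words. As a quick arithmetic check, the sketch below multiplies the two quoted constants; the helper name `ppv_from_voltage` is hypothetical and does not appear in the source.

```python
# Constants quoted in the geo_adc_scale comment (Interface Handbook §4.5).
RANGE_VOLTAGE_V = 1.61133       # voltage term in "Range = 1.61133 V x scale_factor"
GEO_ADC_SCALE = 6.206053        # in/s per volt (Normal range)


def ppv_from_voltage(adc_voltage: float, scale: float = GEO_ADC_SCALE) -> float:
    """Hypothetical helper: convert an ADC voltage (V) to velocity (in/s)."""
    return adc_voltage * scale


full_scale_ips = ppv_from_voltage(RANGE_VOLTAGE_V)
print(f"Normal range full scale: {full_scale_ips:.3f} in/s")
```

The product rounds to the 10.000 in/s Normal-range figure given in the comment.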

    # Project/setup strings (sourced from E5 / SUB 71 write payload)
    # These are the FULL project metadata from compliance config,
@@ -358,6 +430,78 @@ class ComplianceConfig:
    notes: Optional[str] = None            # extended notes / additional info


+# ── Call Home Config ──────────────────────────────────────────────────────────
+
+@dataclass
+class CallHomeConfig:
+    """
+    Auto Call Home (ACH) configuration from SUB 0x2C (response 0xD3).
+
+    Read with a standard two-step protocol (probe offset=0x00, data offset=0x7C).
+    Written via SUB 0x7E (write, 127-byte payload) + SUB 0x7F (confirm).
+
+    Confirmed from 4-20-26 call home settings captures (11 BW + S3 capture pairs).
+
+    Raw payload layout (data[11:] from S3 response, 125 bytes):
+      [0]     0x00           header byte
+      [1]     0x7C = 124     inner length (= offset for SUB 0x7E write - 2)
+      [2]     0xDC           constant
+      [3:5]   0x00 0x00      padding
+      [5]     auto_call_home_enabled  (0x00=off, 0x01=on) ✅
+      [6:46]  dial_string             40-byte null-padded ASCII ✅
+      [46:87] auto_answer_raw         AT command strings (not decoded) ✅ present
+      [87]    after_event_recorded    (0x01=on, 0x00=off) ✅
+      [91]    at_specified_times      (0x01=on, 0x00=off) ✅
+      [93]    time1_enabled           (0x01=on, 0x00=off) ✅
+      [95]    time2_enabled           (0x01=on, 0x00=off) ✅
+      [101]   time1_hour              uint8 decimal 0-23 ✅
+      [102]   time1_min               uint8 decimal 0-59 ✅
+      [105]   time2_hour              uint8 decimal 0-23 ✅
+      [106]   time2_min               uint8 decimal 0-59 ✅
+      [117]   DLE prefix (0x10)       ┐ DLE-escaped num_retries=3 (0x03)
+      [118]   0x03                    ┘ device stores/returns 0x03 DLE-escaped ✅
+      [120]   time_between_retries_sec uint8 (= 0x0F = 15 s default) ✅
+      [122]   wait_for_connection_sec  uint8 (= 0x3C = 60 s default) ✅
+      [124]   warm_up_time_sec         uint8 (= 0x3C = 60 s default) ✅
+
+    Write payload = raw 125 bytes + b'\\x00\\x00' (2 trailing zeros) = 127 bytes.
+    Offset for SUB 0x7E: data[1] + 2 = 0x7C + 2 = 0x7E = 126.
+
+    Note on DLE-escaped 0x03:  The device's S3 response DLE-escapes ETX (0x03)
+    bytes as \\x10\\x03.  The S3FrameParser preserves both bytes in frame.data.
+    Subsequent fields after offset 117 are therefore at raw_offset = logical+1.
+    The raw payload must be round-tripped verbatim in write; do NOT reapply DLE
+    destuffing or stripping.
+    """
+    raw: Optional[bytes] = None          # raw 125-byte read payload (for round-trip write)
+
+    # ── Main enable ──────────────────────────────────────────────────────────
+    auto_call_home_enabled: Optional[bool] = None    # raw[5] ✅
+
+    # ── Dial string ──────────────────────────────────────────────────────────
+    dial_string: Optional[str] = None                # raw[6:46] 40-byte null-padded ASCII ✅
+
+    # ── When to call ─────────────────────────────────────────────────────────
+    after_event_recorded: Optional[bool] = None      # raw[87] ✅
+    at_specified_times: Optional[bool] = None        # raw[91] ✅
+
+    # ── Time slot 1 ──────────────────────────────────────────────────────────
+    time1_enabled: Optional[bool] = None             # raw[93] ✅
+    time1_hour: Optional[int] = None                 # raw[101]  0-23 ✅
+    time1_min: Optional[int] = None                  # raw[102]  0-59 ✅
+
+    # ── Time slot 2 ──────────────────────────────────────────────────────────
+    time2_enabled: Optional[bool] = None             # raw[95] ✅
+    time2_hour: Optional[int] = None                 # raw[105]  0-23 ✅
+    time2_min: Optional[int] = None                  # raw[106]  0-59 ✅
+
+    # ── Retry / timeout settings (read-only; not writable via set_call_home_config) ──
+    num_retries: Optional[int] = None                # raw[117:119]=10 03 → value 3 ✅
+    time_between_retries_sec: Optional[int] = None   # raw[120] (shifted +1 by DLE) ✅
+    wait_for_connection_sec: Optional[int] = None    # raw[122] ✅
+    warm_up_time_sec: Optional[int] = None           # raw[124] ✅
+
+
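The offsets documented in the CallHomeConfig docstring can be exercised with a small decoder. This is a minimal sketch under stated assumptions: `decode_call_home` is a hypothetical name, the payload is synthetic (the dial string "5551234" is made up), and the real parser in this change may differ.

```python
def decode_call_home(raw: bytes) -> dict:
    """Hypothetical decoder for the 125-byte raw call-home payload."""
    if len(raw) < 125:
        raise ValueError(f"expected at least 125 bytes, got {len(raw)}")
    # Retries are documented as a DLE-escaped pair (0x10 0x03); this sketch
    # reads the value byte only when the DLE prefix is present.
    num_retries = raw[118] if raw[117] == 0x10 else None
    return {
        "auto_call_home_enabled": raw[5] == 0x01,
        "dial_string": raw[6:46].split(b"\x00", 1)[0].decode("ascii", "replace"),
        "after_event_recorded": raw[87] == 0x01,
        "at_specified_times": raw[91] == 0x01,
        "time1": (raw[93] == 0x01, raw[101], raw[102]),
        "time2": (raw[95] == 0x01, raw[105], raw[106]),
        "num_retries": num_retries,
        "time_between_retries_sec": raw[120],
        "wait_for_connection_sec": raw[122],
        "warm_up_time_sec": raw[124],
    }


# Synthetic payload for illustration only (not a real capture).
sample = bytearray(125)
sample[1], sample[2], sample[5] = 0x7C, 0xDC, 0x01
sample[6:13] = b"5551234"
sample[87] = 0x01
sample[117:119] = b"\x10\x03"
sample[120], sample[122], sample[124] = 0x0F, 0x3C, 0x3C

config = decode_call_home(bytes(sample))
write_payload = bytes(sample) + b"\x00\x00"   # raw 125 bytes + 2 trailing zeros
write_offset = sample[1] + 2                  # 0x7C + 2 = 0x7E
print(config["dial_string"], config["num_retries"], len(write_payload), hex(write_offset))
```

The raw bytes are appended to unmodified, which matches the docstring's instruction to round-trip the payload verbatim.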
 # ── Event ─────────────────────────────────────────────────────────────────────

@dataclass
@@ -401,6 +545,10 @@ class Event:
    # Set by get_events(); required by download_waveform().
    _waveform_key: Optional[bytes] = field(default=None, repr=False)

+    # Raw A5 frames from the full bulk waveform download (full_waveform=True).
+    # Populated by get_events() when full_waveform=True; used by write_blastware_file().
+    _a5_frames: Optional[list] = field(default=None, repr=False)
+
    def __str__(self) -> str:
        ts = str(self.timestamp) if self.timestamp else "no timestamp"
        ppv = ""
@@ -419,6 +567,65 @@ class Event:
        return f"Event#{self.index} {ts}{ppv}"


+# ── MonitorLogEntry ───────────────────────────────────────────────────────────
+
+@dataclass
+class MonitorLogEntry:
+    """
+    A monitor log entry decoded from a SUB 0x0A (WAVEFORM_HEADER) response
+    whose first byte is 0x2C (partial record, recording mode = continuous
+    monitoring without a triggered event).
+
+    These are the "partial bins" that Blastware stores between triggered events.
+    Each entry represents one monitoring interval — the span of time during
+    which the unit was actively monitoring but no threshold crossing occurred.
+
+    Confirmed from 4-11-26 MITM capture analysis (2026-04-11):
+
+    Header layout (full response data[0:]):
+      data[0]    = 0x2C  (partial record type / data length in probe response)
+      data[1:5]  = 0x00 × 4
+      data[5:9]  = event key (4 bytes, big-endian hex)
+      data[9:11] = 0x00 × 2
+      data[11:]  = timestamp_start (9 or 10 bytes depending on recording mode)
+                   + timestamp_stop (same format)
+                   + separator (4–5 bytes, variable)
+                   + serial null-terminated (e.g. "BE11529\\0")
+                   + "Geo: X.XXX in/s\\0"  (trigger threshold string)
+
+    Timestamp format detection:
+      data[11] == 0x10  →  10-byte sub_code=0x03 (continuous) format
+      data[12] == 0x10  →  9-byte sub_code=0x10 (single-shot) format
+
+    In contrast to Event (triggered records, type 0x46), MonitorLogEntry
+    records do NOT have a waveform record (SUB 0x0C) or bulk waveform stream
+    (SUB 5A).  All available metadata is in the 0x0A header alone.
+    """
+    index: int                                      # 0-based position in device record list
+    key: str                                        # 8-hex event key (e.g. "01114290") ✅
+
+    start_time: Optional[datetime.datetime] = None  # monitoring session start ✅
+    stop_time: Optional[datetime.datetime] = None   # monitoring session stop ✅
+    serial: Optional[str] = None                    # device serial (e.g. "BE11529") ✅
+    geo_threshold_ips: Optional[float] = None       # trigger level from "Geo: X.XXX in/s" ✅
+
+    # Raw bytes for debugging / future decoding
+    raw_header: Optional[bytes] = field(default=None, repr=False)
+
+    @property
+    def duration_seconds(self) -> Optional[float]:
+        """Duration of monitoring interval in seconds, or None if times unavailable."""
+        if self.start_time and self.stop_time:
+            return (self.stop_time - self.start_time).total_seconds()
+        return None
+
+    def __str__(self) -> str:
+        start = self.start_time.isoformat() if self.start_time else "?"
+        stop  = self.stop_time.isoformat()  if self.stop_time  else "?"
+        dur   = f" ({self.duration_seconds:.0f}s)" if self.duration_seconds is not None else ""
+        return f"MonitorLog#{self.index} key={self.key} {start}→{stop}{dur}"
+
+
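The header layout and timestamp-format rule in the MonitorLogEntry docstring can be sketched as follows. The function name `describe_monitor_header` and the sample bytes are hypothetical; checking data[11] before data[12] is an assumption of this sketch, not something the source specifies.

```python
def describe_monitor_header(data: bytes) -> dict:
    """Hypothetical helper: pull the key and timestamp width from a 0x0A header."""
    if len(data) < 13 or data[0] != 0x2C:
        raise ValueError("not a partial (0x2C) monitor-log header")
    key = data[5:9].hex()            # 4-byte event key as 8 hex digits
    if data[11] == 0x10:
        ts_len = 10                  # sub_code=0x03 (continuous) format
    elif data[12] == 0x10:
        ts_len = 9                   # sub_code=0x10 (single-shot) format
    else:
        ts_len = None                # unknown layout; leave undecoded
    return {"key": key, "timestamp_len": ts_len}


# Synthetic header for illustration only (not captured data).
header = bytes([0x2C, 0, 0, 0, 0, 0x01, 0x11, 0x42, 0x90, 0, 0, 0x10, 0x03]) + bytes(20)
info = describe_monitor_header(header)
print(info)
```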
 # ── MonitorStatus ─────────────────────────────────────────────────────────────

@dataclass
@@ -35,6 +35,8 @@ from .framing import (
    token_params,
    bulk_waveform_params,
    bulk_waveform_term_params,
+    bulk_waveform_term_v2,
+    parse_strt_end_offset,
    POLL_PROBE,
    POLL_DATA,
    SESSION_RESET,
@@ -57,7 +59,7 @@ SUB_POLL             = 0x5B
 SUB_SERIAL_NUMBER    = 0x15
 SUB_FULL_CONFIG      = 0x01
 SUB_EVENT_INDEX      = 0x08
-SUB_CHANNEL_CONFIG   = 0x06
+SUB_CHANNEL_CONFIG   = 0x06   # Event storage range read (first/last key) ✅
 SUB_MONITOR_STATUS   = 0x1C   # Monitoring status read (battery, memory, mode) ✅
 SUB_EVENT_HEADER     = 0x1E
 SUB_EVENT_ADVANCE    = 0x1F
@@ -65,6 +67,7 @@ SUB_WAVEFORM_HEADER  = 0x0A
 SUB_WAVEFORM_RECORD  = 0x0C
 SUB_BULK_WAVEFORM    = 0x5A
 SUB_COMPLIANCE       = 0x1A
+SUB_CALL_HOME        = 0x2C   # Call home config read  → response 0xD3 ✅
 SUB_UNKNOWN_2E       = 0x2E

 # Write command SUBs (= Read SUB + 0x60, confirmed from BW captures 3-11-26)
@@ -78,10 +81,20 @@ SUB_WRITE_CONFIRM_C      = 0x74   # Confirm C — sent after 69 ✅
 SUB_TRIGGER_CONFIG_WRITE = 0x82   # Write trigger config (0x22 + 0x60) ✅
 SUB_TRIGGER_CONFIRM      = 0x83   # Confirm trigger write ✅

+# Call home write SUBs (confirmed from 4-20-26 call home settings captures)
+SUB_CALL_HOME_WRITE      = 0x7E   # Write call home config  → response 0x81 ✅
+SUB_CALL_HOME_CONFIRM    = 0x7F   # Confirm call home write → response 0x80 ✅
+
 # Monitoring control SUBs (confirmed from 4-8-26/2ndtry BW TX capture)
 SUB_START_MONITORING     = 0x96   # Start monitoring  → response 0x69 ✅
 SUB_STOP_MONITORING      = 0x97   # Stop monitoring   → response 0x68 ✅

+# Erase-all SUBs (confirmed from 4-11-26 MITM capture)
+# Both use token=0xFE at params[7] and return minimal 11-byte acks.
+# Standard response formula applies: 0xFF - SUB.
+SUB_ERASE_ALL_BEGIN      = 0xA3   # Begin erase all events    → response 0x5C ✅
+SUB_ERASE_ALL_CONFIRM    = 0xA2   # Confirm erase all events  → response 0x5D ✅
+
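The response codes annotated on the SUB constants above follow the 0xFF - SUB rule mentioned in the erase-all comment. The sketch below checks the pairs listed in this diff; `expected_response_sub` is a hypothetical stand-in, since the module's own `_expected_rsp_sub` is not shown here.

```python
def expected_response_sub(sub: int) -> int:
    """Hypothetical stand-in for the response rule: response = 0xFF - SUB."""
    return 0xFF - sub


# Request SUB -> response code, as annotated in the constants above.
documented_pairs = {
    0x2C: 0xD3,   # call home read
    0x7E: 0x81,   # call home write
    0x7F: 0x80,   # call home confirm
    0x96: 0x69,   # start monitoring
    0x97: 0x68,   # stop monitoring
    0xA3: 0x5C,   # erase all, begin
    0xA2: 0x5D,   # erase all, confirm
}

mismatches = {
    sub: rsp for sub, rsp in documented_pairs.items()
    if expected_response_sub(sub) != rsp
}
print("mismatches:", mismatches)
```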
 # Hardcoded data lengths for the two-step read protocol.
 #
 # The S3 probe response page_key is always 0x0000 — it does NOT carry the
@@ -96,12 +109,14 @@ DATA_LENGTHS: dict[int, int] = {
    SUB_SERIAL_NUMBER:  0x0A,   # 10-byte serial number block ✅
    SUB_FULL_CONFIG:    0x98,   # 152-byte full config block ✅
    SUB_EVENT_INDEX:    0x58,   # 88-byte event index ✅
+    SUB_CHANNEL_CONFIG: 0x24,   # 36-byte event storage range (first/last key) ✅
    SUB_MONITOR_STATUS: 0x2C,   # 44-byte monitor status block (idle) ✅
    SUB_EVENT_HEADER:   0x08,   # 8-byte event header (waveform key + event data) ✅
    SUB_EVENT_ADVANCE:  0x08,   # 8-byte next-key response ✅
    # SUB_WAVEFORM_HEADER (0x0A) is VARIABLE — length read from probe response
    # data[4].  Do NOT add it here; use read_waveform_header() instead. ✅
    SUB_WAVEFORM_RECORD: 0xD2,  # 210-byte waveform/histogram record ✅
+    SUB_CALL_HOME:      0x7C,   # 124-byte call home config ✅ (confirmed 4-20-26)
    SUB_UNKNOWN_2E:     0x1A,   # 26 bytes, purpose TBD 🔶
    0x09:               0xCA,   # 202 bytes, purpose TBD 🔶
    # SUB_COMPLIANCE (0x1A) uses a multi-step sequence with a 2090-byte total;
@@ -109,14 +124,22 @@ DATA_LENGTHS: dict[int, int] = {
 }

 # SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
-# Confirmed from 1-2-26 BW TX capture analysis (2026-04-02).
-_BULK_CHUNK_OFFSET = 0x1004   # offset field for probe + all regular chunk requests ✅
-_BULK_TERM_OFFSET  = 0x005A   # offset field for termination request ✅
-_BULK_COUNTER_STEP = 0x0400   # chunk counter increment per chunk ✅
-# Chunk counter formula: chunk_num * 0x0400 for ALL chunks including chunk 1.
-# Earlier captures showed 0x1004 for chunk 1 — that was a Blastware artifact, not a
-# protocol requirement.  Confirmed 2026-04-06: 0x0400 for chunk 1 works; 0x1004
-# causes a 120-second device timeout.  Formula n * 0x0400 is used for all chunks.
+#
+# 2026-05-01 minimal-fix: the chunk-counter walk is now bounded by the event's
+# `end_offset` extracted from the STRT record at data[23:27] of the probe
+# response.  Without this bound the loop kept asking for chunks past the event
+# end and the device responded with post-event circular-buffer garbage,
+# corrupting reconstructed Blastware files for events ≥ 2 sec.
+#
+# That minimal fix kept the OLD 0x0400 chunk step (BW actually uses 0x0200; see
+# §7.8.5 of the protocol reference for the corrected understanding) because
+# the blastware_file.py builder relied on the 0x0400-step frame structure to
+# produce valid files.  The v0.14.0 constants below supersede it and use BW's
+# 0x0200 step.
+# BW-exact protocol values (v0.14.0).  Verified against 4-27-26 + 5-1-26 captures.
+_BULK_CHUNK_OFFSET = 0x1002   # offset_word for probe + all chunk requests
+_BULK_TERM_OFFSET  = 0x005A   # offset_word for the legacy terminator (fallback only)
+_BULK_COUNTER_STEP = 0x0200   # chunk counter increment (matches chunk payload size)
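To illustrate how the end_offset bound limits the chunk walk, the sketch below plans the full-chunk counters requested after the probe. It assumes the rules given elsewhere in this diff (first sample chunk at 0x0600 for an event at offset 0, otherwise probe counter + 0x0200, stop when the next chunk would pass end_offset); `plan_chunk_counters` is a hypothetical name, and the example values come from the 0x2238 / 0x417E capture cited in the probe comments.

```python
from typing import List

STEP = 0x0200                  # chunk counter increment, per the constants above
EVENT1_FIRST_CHUNK = 0x0600    # first sample chunk for an event at offset 0


def plan_chunk_counters(start_offset: int, end_offset: int) -> List[int]:
    """Hypothetical planner: full-chunk counters requested after the probe."""
    counter = EVENT1_FIRST_CHUNK if start_offset == 0 else start_offset + STEP
    counters = []
    # Stop as soon as the next full chunk would run past the event end.
    while counter + STEP <= end_offset:
        counters.append(counter)
        counter += STEP
    return counters


counters = plan_chunk_counters(0x2238, 0x417E)
# Bytes left for the TERM response: end_offset minus the next chunk boundary.
residual = 0x417E - (counters[-1] + STEP)
print(len(counters), hex(counters[0]), hex(counters[-1]), hex(residual))
```

For this event the plan covers 14 full chunks and leaves 0x146 bytes for the TERM response.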

 # Default timeout values (seconds).
 # MiniMate Plus is a slow device — keep these generous.
@@ -387,23 +410,32 @@ class MiniMateProtocol:
        Send the SUB 0A (WAVEFORM_HEADER) two-step read for *key4*.

        The data length for 0A is VARIABLE and must be read from the probe
-        response at data[4].  Two known values:
-          0x30 — full histogram bin (has a waveform record to follow)
-          0x26 — partial histogram bin (no waveform record)
+        response at data[4].  Two confirmed values:
+          0x46 (70)  — full triggered event (has 0C waveform record to follow)
+          0x2C (44)  — partial / monitor-log entry (no 0C record; 0A header only)

        Args:
            key4: 4-byte waveform record address from 1E or 1F.

        Returns:
-            (header_bytes, record_length) where:
-              header_bytes  — raw data section starting at data[11]
-              record_length — DATA_LENGTH read from probe (0x30 or 0x26)
+            (raw_data, record_length) where:
+              raw_data      — complete data_rsp.data bytes (full response payload)
+              record_length — DATA_LENGTH read from probe (0x46 for full, 0x2C for partial)
+
+        The raw_data layout:
+          raw_data[0]    = record type (0x46 = full triggered event, 0x2C = partial/monitor)
+          raw_data[1:5]  = 0x00 × 4
+          raw_data[5:9]  = event key (4 bytes)
+          raw_data[9:11] = 0x00 × 2
+          raw_data[11:]  = timestamps + separator + serial + channel strings
+                           (see MonitorLogEntry in models.py for full layout)

        Raises:
            ProtocolError: on timeout, bad checksum, or wrong response SUB.

-        Confirmed from 3-31-26 capture: 0A probe response data[4] carries
+        Confirmed from 4-11-26 MITM capture: 0A probe response data[4] carries
        the variable length; data-request uses that length as the offset byte.
+        record_length == data[0] in virtually all cases (confirmed empirically).
        """
        rsp_sub = _expected_rsp_sub(SUB_WAVEFORM_HEADER)
        params  = waveform_key_params(key4)
@@ -413,7 +445,7 @@ class MiniMateProtocol:
        probe_rsp = self._recv_one(expected_sub=rsp_sub)

        # Variable length — read from probe response data[4]
-        length = probe_rsp.data[4] if len(probe_rsp.data) > 4 else 0x30
+        length = probe_rsp.data[4] if len(probe_rsp.data) > 4 else 0x46
        log.debug("read_waveform_header: 0A data request offset=0x%02X", length)

        if length == 0:
@@ -422,12 +454,11 @@ class MiniMateProtocol:
        self._send(build_bw_frame(SUB_WAVEFORM_HEADER, length, params))
        data_rsp = self._recv_one(expected_sub=rsp_sub)

-        header_bytes = data_rsp.data[11:11 + length]
        log.debug(
            "read_waveform_header: key=%s length=0x%02X is_full=%s",
-            key4.hex(), length, length == 0x30,
+            key4.hex(), length, length >= 0x40,
        )
-        return header_bytes, length
+        return data_rsp.data, length

    def read_waveform_data_raw(self) -> bytes:
        """
@@ -503,142 +534,270 @@ class MiniMateProtocol:
        self,
        key4: bytes,
        *,
-        stop_after_metadata: bool = True,
-        max_chunks: int = 32,
-    ) -> list[bytes]:
+        stop_after_metadata: bool = True,   # DEPRECATED — no-op under BW-exact walk
+        max_chunks: int = 256,              # safety cap only; loop is bounded by end_offset
+        include_terminator: bool = False,
+        extra_chunks_after_metadata: int = 1,   # DEPRECATED — no-op
+    ) -> list[S3Frame]:
        """
-        Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event.
+        Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event using
+        Blastware's exact protocol.  REWRITTEN 2026-05-02 (v0.14.0).

-        The bulk waveform stream carries both raw ADC samples (large) and
-        event-time metadata strings ("Project:", "Client:", "User Name:",
-        "Seis Loc:", "Extended Notes") embedded in one of the middle frames
-        (confirmed: A5[7] of 9 for 1-2-26 capture).
+        Algorithm (matches BW captures across 2-sec / 3-sec / event-2):

-        Protocol is request-per-chunk, NOT a continuous stream:
-          1. Probe  (offset=_BULK_CHUNK_OFFSET, is_probe=True,  counter=0x0000)
-          2. Chunks (offset=_BULK_CHUNK_OFFSET, is_probe=False, counter+=0x0400)
-          3. Loop until metadata found (stop_after_metadata=True) or max_chunks
-          4. Termination (offset=_BULK_TERM_OFFSET, counter=last+_BULK_COUNTER_STEP)
-             Device responds with a final A5 frame (page_key=0x0000).
+          1. Probe
+               - For events at start_key[2:4] = 0x0000 (first event after erase
+                 / wrap): probe at counter=0x0000 with full key in params.
+               - For continuation events (start_key[2:4] != 0): first chunk at
+                 counter = start_key[2:4] (1F's off=0x46 key; no extra +0x0046);
+                 acts as both probe and first sample chunk; response carries STRT.

-        The termination frame (page_key=0x0000) is NOT included in the returned list.
+          2. Parse end_offset from STRT record at data[23:27] of the probe response.

-        Args:
-            key4:                4-byte waveform key from EVENT_HEADER (1E).
-            stop_after_metadata: If True (default), send termination as soon as
-                                 b"Project:" is found in a frame's data — avoids
-                                 downloading the full ADC waveform payload (several
-                                 hundred KB).  Set False to download everything.
-            max_chunks:          Safety cap on the number of chunk requests sent
-                                 (default 32; a typical event uses 9 large frames).
+          3. Read two fixed metadata pages at counter=0x1002 and counter=0x1004
+             — global session metadata (Project / Client / User Name / Seis Loc
+             / Extended Notes ASCII strings).  Event 1 only; continuation
+             events skip these (BW caches them across the session).
+
+          4. Walk sample chunks at 0x0200 increments, starting from 0x0600 for
+             event 1 or `start_key[2:4] + 0x0200` for continuation events.
+             Stop when `next_chunk + 0x0200 > end_offset`.
+
+          5. Send TERM frame with offset_word and params computed by
+             `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`.
+             The TERM response contains the partial last chunk (residual =
+             end_offset - next_boundary) including the 26-byte 0e 08 file
+             footer.

        Returns:
-            List of raw data bytes from each A5 response frame (not including
-            the terminator frame).  Frame indices match the request sequence:
-            index 0 = probe response, index 1 = first chunk, etc.
+            List of S3Frame objects from each A5 response (probe, metadata
+            pages, sample chunks, optional TERM response).  Caller passes
+            `include_terminator=True` (e.g. write_blastware_file) to keep the
+            TERM response in the list — it's required to reconstruct the
+            file footer.
+
+        Deprecated kwargs:
+            stop_after_metadata: legacy "Project:"-string-based stop condition.
+                                 No-op under the BW-exact walk; the loop is
+                                 deterministically bounded by end_offset from
+                                 STRT.  Accepted for backward compat.
+            extra_chunks_after_metadata: same.

        Raises:
-            ProtocolError: on timeout, bad checksum, or unexpected SUB.
-
-        Confirmed from 1-2-26 BW TX/RX captures (2026-04-02):
-          - probe + 8 regular chunks + 1 termination = 10 TX frames
-          - 9 large A5 responses + 1 terminator A5  = 10 RX frames
-          - page_key=0x0010 on large frames; page_key=0x0000 on terminator ✅
-          - "Project:" metadata at A5[7].data[626] ✅
+            ProtocolError: on timeout / bad checksum / unexpected SUB.
        """
        if len(key4) != 4:
            raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")

-        rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM)   # 0xFF - 0x5A = 0xA5
-        frames_data: list[bytes] = []
-        counter = 0
+        # Accept deprecated kwargs; log at debug level when a non-default value is passed.
+        if not stop_after_metadata:
+            log.debug("5A: stop_after_metadata=False is no-op under BW-exact walk")
+        if extra_chunks_after_metadata not in (0, 1):
+            log.debug("5A: extra_chunks_after_metadata=%d is no-op under BW-exact walk",
+                      extra_chunks_after_metadata)

-        # ── Step 1: probe ────────────────────────────────────────────────────
-        log.debug("5A probe  key=%s", key4.hex())
-        params = bulk_waveform_params(key4, 0, is_probe=True)
-        self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
-        self._parser.reset()        # reset bytes_fed counter before probe recv
+        rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM)   # 0xA5
+        frames_data: list[S3Frame] = []
+
+        start_offset = (key4[2] << 8) | key4[3]
+        is_event_1   = (start_offset == 0)
+
+        # ── Step 1: probe / first chunk ──────────────────────────────────────
+        if is_event_1:
+            probe_counter = 0
+            probe_params  = bulk_waveform_params(key4, 0, is_probe=True)
+            log.debug("5A probe (event-1)  key=%s  counter=0x0000", key4.hex())
+        else:
+            # Continuation events: first 5A request lands at counter = key[2:4]
+            # (i.e. the address of the off=0x46 WAVEHDR record returned by 1F).
+            # The probe response carries STRT at byte 17 with end_offset.
+            #
+            # Confirmed 2026-05-04 from 5-1-26 "copy 2nd address" capture
+            # (BW probes counter=0x2238 with key=01112238, STRT@17 end=0x417E)
+            # and 5-4-26 BW captures (2-sec event probes counter=0x2238).
+            #
+            # The earlier "+0x46" formula in the doc came from calling
+            # start_key the BOUNDARY (off=0x2C) key, but the iteration walk
+            # uses 1F's off=0x46 key as cur_key, which already incorporates
+            # the +0x46 offset relative to the boundary.  Adding it again
+            # caused the probe to overshoot, miss STRT, and run uncapped.
+            probe_counter = start_offset
+            probe_params  = bulk_waveform_params(key4, probe_counter)
+            log.debug(
+                "5A probe (event-N)  key=%s  counter=0x%04X",
+                key4.hex(), probe_counter,
+            )
+
+        self._send(build_5a_frame(_BULK_CHUNK_OFFSET, probe_params))
+        self._parser.reset()
        try:
            rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False)
        except TimeoutError:
            log.warning(
-                "5A probe TIMED OUT for key=%s — "
-                "%d raw bytes received (no complete A5 frame assembled)",
+                "5A probe TIMED OUT for key=%s — %d raw bytes received",
                key4.hex(), self._parser.bytes_fed,
            )
            raise
-        frames_data.append(rsp.data)
-        log.debug("5A A5[0]  page_key=0x%04X  %d bytes", rsp.page_key, len(rsp.data))

-        # ── Step 2: chunk loop ───────────────────────────────────────────────
-        # Chunk counters are monotonic: chunk_num * 0x0400 for all chunks.
-        # The 4-2-26 BW TX capture showed 0x1004 for chunk 1, but this is a
-        # Blastware artifact — the device accepts any counter value and streams
-        # data regardless.  Empirically confirmed 2026-04-06: 0x0400 for chunk 1
-        # works; 0x1004 causes the device to ignore the frame (timeout).
-        for chunk_num in range(1, max_chunks + 1):
-            counter = chunk_num * _BULK_COUNTER_STEP
-            params  = bulk_waveform_params(key4, counter)
-            log.debug("5A chunk %d  counter=0x%04X", chunk_num, counter)
+        frames_data.append(rsp)
+        log.debug("5A A5[0] (probe)  page_key=0x%04X  %d bytes",
+                  rsp.page_key, len(rsp.data))
+
+        # ── Step 2: parse STRT end_offset from probe response ────────────────
+        end_offset = parse_strt_end_offset(rsp.data)
+        if end_offset is None:
+            log.warning(
+                "5A probe response did not contain a STRT record; "
+                "cannot bound chunk loop — falling back to max_chunks=%d cap",
+                max_chunks,
+            )
+            end_offset = 0xFFFF   # sentinel: no STRT bound; loop ends at max_chunks or near 0xFFFF
+        else:
+            log.info(
+                "5A STRT  start_offset=0x%04X  end_offset=0x%04X  size=0x%04X",
+                start_offset, end_offset, end_offset - start_offset,
+            )
+
+        # ── Step 3: metadata pages 0x1002 + 0x1004 (event 1 only) ────────────
+        # Confirmed from BW captures: BW reads these two fixed device-buffer
+        # pages immediately after the probe for events at start_key[2:4]=0.
+        # Continuation events skip them (BW caches across the session).
+        # Their content is global compliance-setup metadata: Project, Client,
+        # User Name, Seis Loc, Extended Notes.
+        if is_event_1:
+            for meta_counter in (0x1002, 0x1004):
+                # Metadata page params have an extra trailing 0x00 byte
+                # (12-byte params instead of 11) — empirical from BW captures.
+                # Checksum-neutral but matches BW byte-for-byte.
+                meta_params = bytes([
+                    0x00,
+                    key4[0], key4[1],
+                    (meta_counter >> 8) & 0xFF,
+                    meta_counter & 0xFF,
+                    0, 0, 0, 0, 0, 0, 0,
+                ])
+                log.debug("5A metadata page  counter=0x%04X", meta_counter)
+                self._send(build_5a_frame(_BULK_CHUNK_OFFSET, meta_params))
+                self._parser.reset()
+                try:
+                    meta_rsp = self._recv_one(
+                        expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
+                    )
+                except TimeoutError:
+                    log.warning(
+                        "5A metadata page 0x%04X TIMED OUT — continuing",
+                        meta_counter,
+                    )
+                    continue
+                frames_data.append(meta_rsp)
+                log.debug(
+                    "5A meta@0x%04X  page_key=0x%04X  %d bytes",
+                    meta_counter, meta_rsp.page_key, len(meta_rsp.data),
+                )
+
+        # ── Step 4: sample chunk loop, bounded by end_offset ─────────────────
+        # Sample chunks start at:
+        #   event 1:        counter = 0x0600
+        #   event N (>0):   counter = probe_counter + 0x0200
+        #                            (probe was the first sample chunk)
+        if is_event_1:
+            counter = 0x0600
+        else:
+            counter = probe_counter + _BULK_COUNTER_STEP
+
+        last_chunk_counter: Optional[int] = (
+            probe_counter if not is_event_1 else None
+        )
+        chunks_fetched = 0
+
+        while chunks_fetched < max_chunks:
+            # Stop when next chunk would straddle the event end.
+            if counter + _BULK_COUNTER_STEP > end_offset:
+                log.debug(
+                    "5A chunk loop done at counter=0x%04X (end=0x%04X); "
+                    "%d chunks fetched",
+                    counter, end_offset, chunks_fetched,
+                )
+                break
+
+            params = bulk_waveform_params(key4, counter)
+            log.debug("5A chunk #%d  counter=0x%04X", chunks_fetched + 1, counter)
            self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
-            self._parser.reset()        # reset bytes_fed for accurate per-chunk count
+            self._parser.reset()
            try:
-                rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False, timeout=10.0)
+                rsp = self._recv_one(
+                    expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
+                )
            except TimeoutError:
                raw = self._parser.bytes_fed
                log.warning(
                    "5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
-                    chunk_num, counter, raw,
+                    chunks_fetched + 1, counter, raw,
                )
                if raw > 0 and frames_data:
-                    # Device sent a partial byte (likely a bare DLE/ETX end-of-stream
-                    # signal) but never completed a full frame.  Treat as graceful
-                    # stream end and fall through to the termination step.
                    log.warning(
-                        "5A end-of-stream detected at chunk=%d (raw_bytes=%d, "
-                        "frames_collected=%d) — proceeding to termination",
-                        chunk_num, raw, len(frames_data),
+                        "5A unexpected end-of-stream — proceeding to TERM",
                    )
                    break
                raise

-            log.warning(
-                "5A RX chunk=%d page_key=0x%04X data_len=%d contains_Project=%s",
-                chunk_num, rsp.page_key, len(rsp.data), b"Project:" in rsp.data,
+            log.debug(
+                "5A RX chunk=%d page_key=0x%04X data_len=%d",
+                chunks_fetched + 1, rsp.page_key, len(rsp.data),
            )

            if rsp.page_key == 0x0000:
-                # Device unexpectedly terminated mid-stream (no termination needed).
-                log.debug("5A A5[%d] page_key=0x0000 — device terminated early", chunk_num)
+                # Device terminated mid-stream unexpectedly.
+                log.warning(
+                    "5A unexpected page_key=0x0000 mid-stream at counter=0x%04X",
+                    counter,
+                )
+                if include_terminator:
+                    frames_data.append(rsp)
                return frames_data

-            frames_data.append(rsp.data)
-
-            if stop_after_metadata and b"Project:" in rsp.data:
-                log.debug("5A A5[%d] metadata found — stopping early", chunk_num)
-                break
+            frames_data.append(rsp)
+            last_chunk_counter = counter
+            counter += _BULK_COUNTER_STEP
+            chunks_fetched += 1
        else:
            log.warning(
-                "5A reached max_chunks=%d without end-of-stream; sending termination",
-                max_chunks,
+                "5A reached max_chunks=%d at counter=0x%04X (end=0x%04X)",
+                max_chunks, counter, end_offset,
            )

-        # ── Step 3: termination ──────────────────────────────────────────────
-        term_counter = counter + _BULK_COUNTER_STEP
-        term_params  = bulk_waveform_term_params(key4, term_counter)
-        log.debug(
-            "5A termination  term_counter=0x%04X  offset=0x%04X",
-            term_counter, _BULK_TERM_OFFSET,
-        )
-        self._send(build_5a_frame(_BULK_TERM_OFFSET, term_params))
-        try:
-            term_rsp = self._recv_one(expected_sub=rsp_sub)
+        # ── Step 5: TERM with proper end_offset-derived formula ──────────────
+        if last_chunk_counter is None or end_offset == 0xFFFF:
+            # No STRT or no chunks fetched — fall back to legacy TERM.
+            log.warning(
+                "5A using legacy TERM (offset_word=0x005A); "
+                "end_offset unavailable or no chunks fetched",
+            )
+            legacy_counter = (last_chunk_counter or probe_counter) + _BULK_COUNTER_STEP
+            term_offset_word = _BULK_TERM_OFFSET   # 0x005A
+            term_params = bulk_waveform_term_params(key4, legacy_counter)
+        else:
+            term_offset_word, term_params = bulk_waveform_term_v2(
+                key4, end_offset, last_chunk_counter,
+            )
            log.debug(
-                "5A termination response  page_key=0x%04X  %d bytes",
+                "5A TERM  offset_word=0x%04X  params[2:4]=%s  end=0x%04X  "
+                "last_chunk=0x%04X",
+                term_offset_word, term_params[2:4].hex(),
+                end_offset, last_chunk_counter,
+            )
+
+        self._send(build_5a_frame(term_offset_word, term_params))
+        try:
+            term_rsp = self._recv_one(expected_sub=rsp_sub, timeout=10.0)
+            log.info(
+                "5A TERM response  page_key=0x%04X  %d bytes",
                term_rsp.page_key, len(term_rsp.data),
            )
+            if include_terminator:
+                frames_data.append(term_rsp)
        except TimeoutError:
-            log.debug("5A no termination response — device may have already closed")
+            log.warning("5A no TERM response (timeout)")

        return frames_data

@@ -778,7 +937,7 @@ class MiniMateProtocol:
                continue

            chunk = data_rsp.data[11:]
-            log.warning(
+            log.debug(
                "read_compliance_config: frame %s  page=0x%04X  data=%d  cfg_chunk=%d  running_total=%d",
                step_name, data_rsp.page_key, len(data_rsp.data),
                len(chunk), len(config) + len(chunk),
@@ -798,17 +957,18 @@ class MiniMateProtocol:
        except TimeoutError:
            pass

-        log.warning(
+        log.info(
            "read_compliance_config: done — %d cfg bytes total",
            len(config),
        )

-        # Hex dump first 128 bytes for field mapping
-        for row in range(0, min(len(config), 128), 16):
-            row_bytes = bytes(config[row:row + 16])
-            hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
-            asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
-            log.warning("  cfg[%04x]: %-48s  %s", row, hex_part, asc_part)
+        # Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
+        if log.isEnabledFor(logging.DEBUG):
+            for row in range(0, min(len(config), 128), 16):
+                row_bytes = bytes(config[row:row + 16])
+                hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
+                asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
+                log.debug("  cfg[%04x]: %-48s  %s", row, hex_part, asc_part)

        return bytes(config)

@@ -1072,6 +1232,89 @@ class MiniMateProtocol:
        self._send(frame)
        return self.recv_write_ack(expected_sub=rsp_sub)

+    # ── Call home config (SUBs 0x2C / 0x7E / 0x7F) ──────────────────────────
+
+    def read_call_home_config(self) -> bytes:
+        """
+        Read the auto call home configuration (SUB 0x2C → response 0xD3).
+
+        Standard two-step read: probe (offset=0x00) then data (offset=0x7C=124).
+        Returns the raw 125-byte payload (data[11:] of the data response).
+
+        Confirmed from 4-20-26 call home settings capture:
+          - Probe response: data[4]=0x7C (confirms data length = 124)
+          - Data response: 136 bytes total (11-byte echo header + 125 bytes payload)
+          - Payload[0:3] = 0x00 0x7C 0xDC (header: zero, inner-length, constant)
+          - Payload[5]  = auto_call_home_enabled
+          - Payload[6:46] = dial_string (40-byte null-padded ASCII "RADIO RING")
+
+        Returns:
+            Raw 125-byte call home config payload (data[11:]).
+            Suitable for round-trip write (append \\x00\\x00 → 127-byte write payload).
+
+        Raises:
+            ProtocolError: on timeout or wrong response SUB.
+        """
+        rsp_sub = _expected_rsp_sub(SUB_CALL_HOME)   # 0xFF - 0x2C = 0xD3
+        length  = DATA_LENGTHS[SUB_CALL_HOME]         # 0x7C = 124
+
+        log.debug("read_call_home_config: 0x2C probe")
+        self._send(build_bw_frame(SUB_CALL_HOME, 0))
+        self._recv_one(expected_sub=rsp_sub)
+
+        log.debug("read_call_home_config: 0x2C data request offset=0x%02X", length)
+        self._send(build_bw_frame(SUB_CALL_HOME, length))
+        data_rsp = self._recv_one(expected_sub=rsp_sub)
+
+        payload = data_rsp.data[11:]
+        log.debug("read_call_home_config: received %d payload bytes", len(payload))
+        return payload
+
+    def write_call_home_config(self, data: bytes) -> None:
+        """
+        Write the auto call home configuration (SUB 0x7E → 0x7F confirm).
+
+        Write sequence (confirmed from 4-20-26 call home settings captures):
+          SUB 0x7E  write 127-byte payload  → device acks SUB 0x81
+          SUB 0x7F  confirm (no data)       → device acks SUB 0x80
+
+        The 127-byte write payload = 125-byte read payload + b'\\x00\\x00'.
+        The offset field = data[1] + 2 = 0x7C + 2 = 0x7E = 126.
+
+        Write frame format: build_bw_write_frame (minimal DLE stuffing — only
+        BW_CMD is doubled; all other bytes are RAW).  The \\x10\\x03 sequence
+        within the payload is preserved as-is (device interprets DLE+ETX as the
+        literal value 0x03 per the inner-frame terminator convention).
+
+        Args:
+            data: 127-byte write payload (read payload + \\x00\\x00 footer).
+                  Must start with [0x00][0x7C][...] (standard header).
+
+        Raises:
+            ValueError:    if data is shorter than 2 bytes (data[1] is needed for the offset).
+            ProtocolError: on timeout or wrong response SUB.
+        """
+        if len(data) < 2:
+            raise ValueError(f"call home write payload must be at least 2 bytes, got {len(data)}")
+        rsp_sub_write   = _expected_rsp_sub(SUB_CALL_HOME_WRITE)    # 0xFF - 0x7E = 0x81
+        rsp_sub_confirm = _expected_rsp_sub(SUB_CALL_HOME_CONFIRM)  # 0xFF - 0x7F = 0x80
+
+        # Offset formula: data[1] + 2 (same pattern as other single-chunk writes)
+        offset = data[1] + 2   # 0x7C + 2 = 0x7E = 126
+        frame = build_bw_write_frame(SUB_CALL_HOME_WRITE, data, offset=offset)
+        log.debug(
+            "write_call_home_config: %d bytes  data[1]=0x%02X  offset=0x%04X",
+            len(data), data[1], offset,
+        )
+        self._send(frame)
+        self.recv_write_ack(expected_sub=rsp_sub_write)
+        log.debug("write_call_home_config: write acked; sending confirm 0x7F")
+
+        confirm_frame = build_bw_write_frame(SUB_CALL_HOME_CONFIRM, b"")
+        self._send(confirm_frame)
+        self.recv_write_ack(expected_sub=rsp_sub_confirm)
+        log.debug("write_call_home_config: confirm acked — done")
+
    # ── Monitoring ────────────────────────────────────────────────────────────

    def read_monitor_status(self) -> S3Frame:
@@ -1137,6 +1380,78 @@ class MiniMateProtocol:
        self._send(frame)
        return self.recv_write_ack(expected_sub=rsp_sub)

+    def read_event_storage_range(self) -> S3Frame:
+        """
+        Read event storage range (SUB 0x06 → response 0xF9).
+
+        Two-step read: probe (offset=0x00) then data (offset=0x24 = 36 bytes).
+        Uses token=0xFE at params[7] — same as the erase sequence.
+
+        The 36-byte response ends with two 4-byte event keys (first and last
+        stored event key).  After a successful erase, both keys are 0x01110000
+        (device-empty sentinel).  Confirmed from 4-11-26 MITM capture.
+
+        Returns:
+            S3Frame with 36 bytes of storage range data.
+
+        Raises:
+            ProtocolError: on timeout or wrong response SUB.
+        """
+        rsp_sub = _expected_rsp_sub(SUB_CHANNEL_CONFIG)   # 0xFF - 0x06 = 0xF9
+        params  = token_params(0xFE)
+        log.debug("read_event_storage_range: probe step  rsp_sub=0x%02X", rsp_sub)
+        self._send(build_bw_frame(SUB_CHANNEL_CONFIG, offset=0x00, params=params))
+        self._recv_one(expected_sub=rsp_sub)
+
+        log.debug(
+            "read_event_storage_range: data step  offset=0x%02X",
+            DATA_LENGTHS[SUB_CHANNEL_CONFIG],
+        )
+        self._send(build_bw_frame(SUB_CHANNEL_CONFIG,
+                                  offset=DATA_LENGTHS[SUB_CHANNEL_CONFIG],
+                                  params=params))
+        return self._recv_one(expected_sub=rsp_sub)
+
+    def begin_erase_all(self) -> S3Frame:
+        """
+        Send Begin-Erase-All command (SUB 0xA3 → response 0x5C).
+
+        Single frame with token=0xFE at params[7].  The device acknowledges with
+        a minimal ack and begins the erase process.  Follow up with
+        read_monitor_status() + read_event_storage_range() + confirm_erase_all()
+        to complete the sequence.  Confirmed from 4-11-26 MITM capture.
+
+        Returns:
+            S3Frame ack from device (SUB 0x5C).
+
+        Raises:
+            ProtocolError: on timeout or wrong response SUB.
+        """
+        rsp_sub = _expected_rsp_sub(SUB_ERASE_ALL_BEGIN)   # 0xFF - 0xA3 = 0x5C
+        log.debug("begin_erase_all: rsp_sub=0x%02X", rsp_sub)
+        self._send(build_bw_frame(SUB_ERASE_ALL_BEGIN, params=token_params(0xFE)))
+        return self._recv_one(expected_sub=rsp_sub)
+
+    def confirm_erase_all(self) -> S3Frame:
+        """
+        Send Confirm-Erase-All command (SUB 0xA2 → response 0x5D).
+
+        Single frame with token=0xFE at params[7].  Must be preceded by
+        begin_erase_all() + read_monitor_status() + read_event_storage_range().
+        After this call the device memory is cleared.  Confirmed from 4-11-26
+        MITM capture.
+
+        Returns:
+            S3Frame ack from device (SUB 0x5D).
+
+        Raises:
+            ProtocolError: on timeout or wrong response SUB.
+        """
+        rsp_sub = _expected_rsp_sub(SUB_ERASE_ALL_CONFIRM)  # 0xFF - 0xA2 = 0x5D
+        log.debug("confirm_erase_all: rsp_sub=0x%02X", rsp_sub)
+        self._send(build_bw_frame(SUB_ERASE_ALL_CONFIRM, params=token_params(0xFE)))
+        return self._recv_one(expected_sub=rsp_sub)
+
    # ── Internal helpers ──────────────────────────────────────────────────────

    def _send(self, frame: bytes) -> None:
@@ -418,3 +418,138 @@ class TcpTransport(BaseTransport):
    def __repr__(self) -> str:
        state = "connected" if self.is_connected else "disconnected"
        return f"TcpTransport({self.host!r}, port={self.port}, {state})"
+
+
+# ── Inbound / accepted-socket transport ───────────────────────────────────────
+
+class SocketTransport(TcpTransport):
+    """
+    Like TcpTransport but wraps an already-accepted inbound socket.
+
+    Used by the ACH inbound server (bridges/ach_server.py) — the device dials
+    IN to us, so by the time we create this transport the socket is already live.
+    connect() is a no-op; everything else (read, write, read_until_idle, …) is
+    inherited unchanged from TcpTransport.
+
+    Args:
+        sock: An already-connected socket.socket returned by server_socket.accept().
+        peer: Human-readable peer label for repr / logging (e.g. "203.0.113.5:54321").
+    """
+
+    def __init__(self, sock: socket.socket, peer: str = "inbound") -> None:
+        # Bypass TcpTransport.__init__ — we already have a live socket.
+        self.host            = peer
+        self.port            = 0
+        self.connect_timeout = 0.0
+        self._sock           = sock
+        sock.settimeout(self._RECV_TIMEOUT)
+
+    def connect(self) -> None:
+        """No-op — socket was already accepted inbound."""
+        pass  # Already have a live socket; nothing to open.
+
+    @property
+    def is_connected(self) -> bool:
+        return self._sock is not None
+
+    def __repr__(self) -> str:
+        return f"SocketTransport(peer={self.host!r})"
+
+
+# ── Capturing transport (MITM-style raw byte mirror) ──────────────────────────
+
+class CapturingTransport(BaseTransport):
+    """
+    Wraps another BaseTransport and mirrors every byte to two raw capture files:
+
+        raw_bw_<...>.bin  — bytes WE wrote to the device (BW-side TX)
+        raw_s3_<...>.bin  — bytes the device wrote back  (S3-side TX)
+
+    The file naming and on-wire byte layout are identical to the captures
+    produced by `bridges/ach_mitm.py`, so the resulting `.bin` files can be
+    loaded directly by the Analyzer (File > Open Capture) and parsed by the
+    same tooling used for genuine Blastware MITM captures.
+
+    All BaseTransport methods are forwarded to the inner transport; the only
+    side-effect is that successful read/write byte streams are appended to the
+    two open binary files.
+
+    Args:
+        inner:   An already-built BaseTransport (SerialTransport / TcpTransport).
+        bw_path: File path for the "BW TX" stream (bytes we send).  Opened "wb".
+        s3_path: File path for the "S3 TX" stream (bytes the device sends).
+                 Opened "wb".
+
+    Example:
+        with CapturingTransport(TcpTransport("1.2.3.4", 9034),
+                                "raw_bw.bin", "raw_s3.bin") as t:
+            client = MiniMateClient(transport=t)
+            client.connect()
+            client.get_events()
+        # both .bin files now hold the full bidirectional capture.
+    """
+
+    def __init__(self, inner: BaseTransport, bw_path: str, s3_path: str) -> None:
+        self._inner = inner
+        self._bw_path = bw_path
+        self._s3_path = s3_path
+        self._bw_fh = None
+        self._s3_fh = None
+        # Forward inner attrs so callers can introspect (e.g. .host, .port).
+        self.host = getattr(inner, "host", None)
+        self.port = getattr(inner, "port", None)
+
+    # ── BaseTransport interface ───────────────────────────────────────────────
+
+    def connect(self) -> None:
+        if self._bw_fh is None:
+            self._bw_fh = open(self._bw_path, "wb", buffering=0)
+        if self._s3_fh is None:
+            self._s3_fh = open(self._s3_path, "wb", buffering=0)
+        self._inner.connect()
+
+    def disconnect(self) -> None:
+        try:
+            self._inner.disconnect()
+        finally:
+            for fh_attr in ("_bw_fh", "_s3_fh"):
+                fh = getattr(self, fh_attr)
+                if fh is not None:
+                    try:
+                        fh.flush()
+                        fh.close()
+                    except Exception:
+                        pass
+                    setattr(self, fh_attr, None)
+
+    @property
+    def is_connected(self) -> bool:
+        return self._inner.is_connected
+
+    def write(self, data: bytes) -> None:
+        self._inner.write(data)
+        if data and self._bw_fh is not None:
+            try:
+                self._bw_fh.write(data)
+            except Exception:
+                pass
+
+    def read(self, n: int) -> bytes:
+        got = self._inner.read(n)
+        if got and self._s3_fh is not None:
+            try:
+                self._s3_fh.write(got)
+            except Exception:
+                pass
+        return got
+
+    @property
+    def bw_path(self) -> str:
+        return self._bw_path
+
+    @property
+    def s3_path(self) -> str:
+        return self._s3_path
+
+    def __repr__(self) -> str:
+        return f"CapturingTransport({self._inner!r}, bw={self._bw_path!r}, s3={self._s3_path!r})"
@@ -0,0 +1,578 @@
+"""
+waveform_codec.py — block-walker and verified decoder for the MiniMate Plus
+waveform-file body.
+
+FULLY DECODED 2026-05-11.  Every block type, every channel, and the
+channel-rotation rule are verified byte-exact against BW's ASCII export
+across the 9-event fixture bundle (47,364 ADC samples, zero errors).
+
+The Blastware waveform-file body — the bytes between the 21-byte STRT
+record and the 26-byte file footer — is a tagged variable-length block
+stream with a custom delta + RLE codec.  (Not raw int16 LE, which was
+the historical wrong assumption that produced ±32K noise on every event.)
+
+Current status:
+
+- Block framing: ✅ solved (5 block types and lengths all confirmed)
+- Per-channel decode: ✅ solved (Tran / Vert / Long / MicL all byte-exact)
+- Channel rotation: ✅ Tran → Vert → Long → MicL per segment
+- Segment header: ✅ fully decoded (anchor pair + prev-channel extension)
+- 30 NN packed-delta block: ✅ NN × 12-bit signed deltas in NN/4 groups
+- MicL → dB(L) conversion: ✅ ``mic_count_to_db`` matches BW display
+- Production wiring: ✅ ``client.py:_decode_a5_waveform`` uses the new
+  codec (via ``decode_a5_frames``).  ``.h5`` sidecars now render
+  correctly.
+
+Known limitations:
+
+- Walker stops early on the loudest events (SP0, SS0, SV0, event-b) at
+  some mid-segment edge cases not yet fully characterized.  Every
+  sample reached IS correct; the walker just doesn't reach all of
+  them yet.  The cleanly-decoded subset is still ~5000–15000 samples
+  per loud event.
+
+────────────────────────────────────────────────────────────────────────────
+Body layout (CONFIRMED 2026-05-11 against 8 fixture events)
+────────────────────────────────────────────────────────────────────────────
+
+    [7-byte preamble] [stream of tagged blocks] [trailer]
+
+The preamble is always exactly 7 bytes:
+
+    body[0:3]  = 00 02 00              magic
+    body[3:5]  = Tran[0]   int16 BE    in 16-count units (LSB = 0.005 in/s)
+    body[5:7]  = Tran[1]   int16 BE    in 16-count units
+
+(Earlier drafts of this module described a "7-or-9-byte preamble";
+that was wrong — single-shot and continuous events both use 7 bytes.
+The "extra 2 bytes" on continuous events were the first ``00 NN`` RLE
+marker, not part of the preamble.)
+
+Block types and lengths (all confirmed):
+
+| Tag      | Length                | Meaning                                |
+|----------|-----------------------|----------------------------------------|
+| ``10 NN``| NN/2 + 2 bytes        | 4-bit nibble deltas (2 per byte; high  |
+|          |                       | nibble first; 0..7 → 0..+7, 8..F →     |
+|          |                       | -8..-1)                                |
+| ``20 NN``| NN + 2 bytes          | int8 signed deltas (1 per byte)        |
+| ``00 NN``| 2 bytes               | RLE: append NN copies of current value |
+| ``30 NN``| NN*3/2 + 2 in data,   | NN packed 12-bit signed deltas in NN/4 |
+|          | NN*4 in trailer       | groups.  Only in loud events.          |
+| ``40 02``| 20 bytes (fixed)      | Segment header                         |
+
+NN is always a multiple of 4.
+
+────────────────────────────────────────────────────────────────────────────
+Tran channel, segment 0 (CONFIRMED 2026-05-11)
+────────────────────────────────────────────────────────────────────────────
+
+Segment 0 — everything before the first ``40 02`` segment header — encodes
+Tran samples only.  Starting from preamble anchors Tran[0] and Tran[1],
+each subsequent block contributes to the running Tran value:
+
+    10 NN  →  append NN deltas (4-bit signed nibbles)
+    20 NN  →  append NN deltas (int8 signed bytes)
+    00 NN  →  append NN copies of the current value (RLE zeros)
+    40 02  →  segment 0 ends; multi-segment continuation is open
+
+This decodes the first 482–510 samples of Tran for each event with zero
+errors against BW's ASCII export.  The exact segment-0 sample count
+varies per event (it's bounded by a fixed device-flash byte budget, not
+a fixed sample count — quiet events fit more samples because zero
+deltas pack into ``00 NN`` markers compactly).
+
+Implementation: :func:`decode_tran_initial`.
+
+────────────────────────────────────────────────────────────────────────────
+Segment header (40 02, 20 bytes total)
+────────────────────────────────────────────────────────────────────────────
+
+The 18-byte payload of the ``40 02`` block:
+
+| Offset    | Field                                       | Status      |
+|-----------|---------------------------------------------|-------------|
+| [0:2]     | T_delta at first sample of new segment      | ✅ confirmed|
+|           | (int16 BE, in 16-count units)               |             |
+| [2:4]     | Likely T_delta at sample seg_start+1        | 🟡 likely   |
+| [4:6]     | Unknown (varies; possibly checksum)         | ❓ open     |
+| [6:8]     | Byte length to next segment header − 2      | ✅ confirmed|
+|           | (uint16 BE; useful for walker pre-scan)     |             |
+| [8:12]    | Monotonic uint32 LE counter                 | ✅ confirmed|
+|           | (starts ~0x47, increments by 1 per segment) |             |
+| [12:14]   | Constant ``02 00``                          | ✅ confirmed|
+| [14:18]   | Unknown 4-byte field                        | ❓ open     |
+
+────────────────────────────────────────────────────────────────────────────
+What breaks the multi-segment decoder (the main open question)
+────────────────────────────────────────────────────────────────────────────
+
+After segment 0 ends and the segment header T_delta is consumed,
+applying segment 1's blocks as Tran continuation produces values that
+diverge from truth by sample ~512.  The block structure inside segment
+1 is IDENTICAL to segment 0 (same alternating 10 NN / 00 NN pattern),
+and the delta budget matches the segment size exactly (V70 segment 1
+has 264 nibble-deltas + 244 RLE zeros = 508 = the segment's sample
+count).  But the cumulative is wrong.
+
+The explanation, originally a hypothesis and since confirmed (see the
+status list above), is that segments rotate channels:
+
+    segment 0  →  Tran samples 0..509
+    segment 1  →  Vert samples 0..507
+    segment 2  →  Long samples 0..507
+    segment 3  →  Mic  samples 0..507
+    segment 4  →  Tran samples 510..N (continuation)
+    ...
+
+This is consistent with the segment-1 block sums netting to near zero in
+V70 (where all 4 channels are near zero) and with the per-segment delta
+budget matching the segment size for a single channel.  When this
+section was first written the hypothesis was unverified because the
+per-segment channel anchor had not been pinned down in the segment
+header; bytes [4:6] and [14:18] were suspected of encoding the V/L/M
+anchors.
+
+See ``docs/waveform_codec_re_status.md`` for the current working notes
+and the suggested next experiment ("segment-channel scoring analyzer").
+"""
+
+from __future__ import annotations
+
+import math
+from dataclasses import dataclass
+from typing import List, Optional, Tuple
+
+
+@dataclass
+class WaveformBlock:
+    """One tagged block parsed out of a Blastware waveform-file body."""
+    offset: int      # byte offset into body
+    tag_hi: int      # first tag byte (0x10 / 0x20 / 0x00 / 0x30 / 0x40)
+    tag_lo: int      # second tag byte (NN)
+    data: bytes      # block payload (excludes the 2-byte tag)
+    length: int      # total block length on the wire (includes the tag)
+
+    @property
+    def kind(self) -> str:
+        return f"{self.tag_hi:02x} {self.tag_lo:02x}"
+
+
+def find_data_start(body: bytes) -> int:
+    """Auto-detect the offset of the first data block.
+
+    The body starts with a 7-byte preamble (magic ``00 02 00`` + two int16 BE
+    Tran anchors).  After that, the data section starts with a tag — usually
+    ``10 NN`` or ``20 NN``, but quiet events may begin with a ``00 NN`` RLE
+    marker.  We return the offset of the first recognized tag.
+    """
+    # Try fixed offset 7 first (canonical preamble length).
+    if len(body) >= 9:
+        b, nn = body[7], body[8]
+        if (b in (0x00, 0x10, 0x20, 0x30) and nn % 4 == 0 and 0 < nn <= 0xFC) \
+                or (b == 0x40 and nn == 0x02):
+            return 7
+    # Fall back to scanning the first 20 bytes.
+    for i in range(min(20, len(body) - 1)):
+        b = body[i]
+        nn = body[i + 1]
+        if b in (0x10, 0x20) and nn % 4 == 0 and 0 < nn <= 0xFC:
+            return i
+    return -1
+
+
+def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
+    """Walk the tagged-block sequence starting at *start* (auto-detected by default).
+
+    Stops when an unrecognized tag is encountered or end of body is reached.
+    Returned blocks are in stream order.
+    """
+    if start is None:
+        start = find_data_start(body)
+        if start < 0:
+            return []
+
+    blocks: List[WaveformBlock] = []
+    i = start
+    while i + 1 < len(body):
+        t0 = body[i]
+        t1 = body[i + 1]
+        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 // 2 + 2
+        elif (t0 & 0xF0) == 0x10 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
+            # Wide-NN nibble block: ``1X NN`` where X is the high nibble of a
+            # 12-bit NN value.  NN = ((t0 & 0x0F) << 8) | t1.  Block length
+            # = NN/2 + 2 bytes (NN nibble deltas, same as ``10 NN`` semantics
+            # but with NN > 0xFC).  Confirmed 2026-05-11 in SP0 segment 12
+            # where V continuation uses ``11 90`` = NN=0x190=400.
+            wide_nn = ((t0 & 0x0F) << 8) | t1
+            length = wide_nn // 2 + 2
+        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
+            length = t1 + 2
+        elif (t0 & 0xF0) == 0x20 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
+            # Wide-NN int8 block: ``2X NN`` extends NN to 12 bits the same way.
+            wide_nn = ((t0 & 0x0F) << 8) | t1
+            length = wide_nn + 2
+        elif t0 == 0x00 and t1 % 4 == 0:
+            length = 2
+        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
+            # Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
+            # as NN/4 groups of (2-byte high-nibble field + 4 × int8 low byte).
+            # Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
+            # 20 for NN=12, etc.).  Confirmed 2026-05-11 by full-decoder
+            # verification against BW ASCII export.
+            #
+            # Trailer-section ``30 NN`` blocks have a different length formula
+            # (NN × 4 = 32 for NN=8 in trailers).  We try the data-section
+            # length first and fall back to the trailer length if needed.
+            cand_data = t1 * 3 // 2 + 2
+            cand_trailer = t1 * 4
+            if (i + cand_data < len(body) - 1
+                    and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
+                length = cand_data
+            else:
+                length = cand_trailer
+        elif t0 == 0x40 and t1 == 0x02:
+            length = 20
+        else:
+            # Unknown tag; stop.  The last block's offset + length marks where.
+            break
+
+        if i + length > len(body):
+            break
+
+        data = bytes(body[i + 2 : i + length])
+        blocks.append(WaveformBlock(offset=i, tag_hi=t0, tag_lo=t1, data=data, length=length))
+        i += length
+
+    return blocks
+
+
+def split_segments(blocks: List[WaveformBlock]) -> List[List[WaveformBlock]]:
+    """Group consecutive blocks into segments separated by ``40 02`` headers.
+
+    The first segment is whatever runs before the first ``40 02`` header
+    (typically the "segment 0" data blocks that follow the body preamble).
+    Subsequent segments start with a ``40 02`` block, then have their
+    own data blocks until the next ``40 02``.
+    """
+    segments: List[List[WaveformBlock]] = []
+    current: List[WaveformBlock] = []
+    for b in blocks:
+        if b.tag_hi == 0x40 and b.tag_lo == 0x02:
+            if current:
+                segments.append(current)
+            current = [b]
+        else:
+            current.append(b)
+    if current:
+        segments.append(current)
+    return segments
+
+
+def parse_segment_header(block: WaveformBlock) -> Optional[dict]:
+    """Decode the 18-byte payload of a ``40 02`` segment header.
+
+    Returns a dict with the labelled fields, or None if *block* is not
+    a ``40 02`` header.
+    """
+    if not (block.tag_hi == 0x40 and block.tag_lo == 0x02):
+        return None
+    if len(block.data) < 18:
+        return None
+    p = block.data
+    counter = int.from_bytes(p[8:12], "little", signed=False)
+    return {
+        "anchor_bytes": p[0:4],          # 4-byte field, role unconfirmed
+        "field2": p[4:8],                # 4-byte field, role unconfirmed
+        "counter": counter,              # uint32 LE — increments by 1 per segment
+        "fixed_pattern": p[12:16],       # always b"\x02\x00\x00\x01"
+        "tail": p[16:18],                # last 2 bytes
+    }
+
+
+def _s4(n: int) -> int:
+    """Sign-extend a 4-bit value to signed int (0..7 → 0..7; 8..F → -8..-1)."""
+    return n if n < 8 else n - 16
+
+
+def _i8(b: int) -> int:
+    """Reinterpret an unsigned byte as signed int8."""
+    return b if b < 128 else b - 256
+
+
+def decode_tran_initial(body: bytes) -> Optional[List[int]]:
+    """
+    Decode the initial Tran-channel samples — VERIFIED 2026-05-11.
+
+    Returns Tran samples in **16-count units** (LSB = 0.005 in/s at Normal
+    range — the same quantization BW uses for its ASCII export).  Returns
+    ``None`` if the body cannot be parsed.
+
+    The decoded list extends from sample 0 through the end of segment 0
+    (= just before the first ``40 02`` segment header; ~510 sample-sets
+    for the events tested).  Multi-segment decoding requires continuing
+    past the segment header — that's done by :func:`decode_tran_full`
+    when the per-segment rules are pinned down for all signal types.
+
+    Codec for segment 0 (CONFIRMED 2026-05-11 against 7 fixture events):
+
+    - Body bytes [0:3] are the magic ``00 02 00``.
+    - Body bytes [3:5] = ``Tran[0]`` as int16 BE in 16-count units.
+    - Body bytes [5:7] = ``Tran[1]`` as int16 BE in 16-count units.
+    - Data blocks (``10 NN`` or ``20 NN``) carry Tran deltas starting
+      at sample 2:
+
+      * ``10 NN``: NN nibbles = NN/2 bytes; each nibble is a 4-bit
+        signed delta (0..7 → 0..+7; 8..F → -8..-1).  High nibble of
+        each byte comes first.
+      * ``20 NN``: NN int8 signed deltas (one delta per byte).
+
+    - ``00 NN`` blocks are run-length-encoded zero deltas: append NN
+      copies of the current cumulative Tran value (no change).
+
+    - ``30 NN`` blocks are not decoded by this function.  They appear in
+      segment 0 of loud-from-start events (SS0, SV0) and carry 12-bit
+      deltas (see :func:`decode_waveform_v2`).  The walker steps over
+      them here, so their samples are missing from the returned list.
+
+    The walk stops at the first ``40 02`` segment header.
+    """
+    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
+        return None
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+
+    start = find_data_start(body)
+    if start < 0:
+        return [t0, t1]
+
+    out = [t0, t1]
+    cur = t1
+    for blk in walk_body(body, start):
+        if blk.tag_hi == 0x40:
+            # Segment boundary — stop.  Multi-segment decode is decode_tran_full.
+            break
+        if blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur += _s4(nib)
+                    out.append(cur)
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur += _i8(byte)
+                out.append(cur)
+        elif blk.tag_hi == 0x00:
+            # RLE zero deltas: append NN copies of current Tran value.
+            for _ in range(blk.tag_lo):
+                out.append(cur)
+        # 30 NN: not decoded here (see decode_waveform_v2); skipped.
+    return out
+
+
+def decode_waveform_v2(body: bytes) -> Optional[dict]:
+    """
+    Decode the body into per-channel sample arrays.
+
+    Status (2026-05-11 evening — channel-rotation hypothesis CONFIRMED):
+    segments rotate channels in fixed order **Tran → Vert → Long → MicL**.
+    Each channel-segment carries a 2-sample anchor pair in segment-header
+    bytes [14:18] (or in the body preamble for the initial Tran segment)
+    plus a stream of delta blocks for samples 2 onward.
+
+    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
+    with each channel's decoded samples in 16-count units (LSB = 0.005
+    in/s at Normal range).  Returns ``None`` if the body cannot be
+    parsed.
+    """
+    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
+        return None
+
+    channels = ["Tran", "Vert", "Long", "MicL"]
+    out: dict = {ch: [] for ch in channels}
+
+    # Initial Tran segment: preamble anchor pair + delta blocks before first 40 02.
+    t0 = int.from_bytes(body[3:5], "big", signed=True)
+    t1 = int.from_bytes(body[5:7], "big", signed=True)
+    out["Tran"].extend([t0, t1])
+
+    start = find_data_start(body)
+    if start < 0:
+        return out
+
+    blocks = walk_body(body, start)
+    seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
+
+    def apply_blocks(channel: str, anchor: int,
+                     block_start: int, block_end: int) -> int:
+        """Apply delta blocks [block_start, block_end) to *channel*'s sample
+        list, starting from *anchor*.  Returns the final cumulative value."""
+        cur = anchor
+        for bi in range(block_start, block_end):
+            blk = blocks[bi]
+            if (blk.tag_hi & 0xF0) == 0x10:
+                # Both ``10 NN`` (NN ≤ 0xFC) and wide-NN ``1X NN`` (X != 0)
+                # are nibble-delta streams.  The walker has already used the
+                # right length; here we just iterate the payload bytes.
+                for byte in blk.data:
+                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                        cur += _s4(nib)
+                        out[channel].append(cur)
+            elif (blk.tag_hi & 0xF0) == 0x20:
+                # ``20 NN`` and wide ``2X NN`` both carry int8 deltas.
+                for byte in blk.data:
+                    cur += _i8(byte)
+                    out[channel].append(cur)
+            elif blk.tag_hi == 0x00:
+                for _ in range(blk.tag_lo):
+                    out[channel].append(cur)
+            elif blk.tag_hi == 0x30:
+                # 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
+                #   bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
+                #   bytes [2:6] = 4 × int8 low bytes
+                # Each delta = sign_extend_12((high_nibble << 8) | low_byte).
+                # Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
+                # bundled fixtures.
+                n_groups = blk.tag_lo // 4
+                for g in range(n_groups):
+                    grp = blk.data[g * 6 : (g + 1) * 6]
+                    if len(grp) < 6:
+                        break
+                    high_word = (grp[0] << 8) | grp[1]
+                    for k in range(4):
+                        nib = (high_word >> (12 - 4 * k)) & 0xF
+                        v = (nib << 8) | grp[2 + k]
+                        if v >= 0x800:
+                            v -= 0x1000
+                        cur += v
+                        out[channel].append(cur)
+            # 40 02: should not occur in segment data.
+        return cur
+
+    # Initial Tran segment: deltas from start of body up to first 40 02 (or end).
+    first_seg = seg_idx[0] if seg_idx else len(blocks)
+    last_tran_value = apply_blocks("Tran", t1, 0, first_seg)
+
+    # Subsequent segments rotate channels.  Each segment header carries:
+    #   bytes [0:2] and [2:4] = 2 deltas extending the PREVIOUS channel
+    #   bytes [14:16] and [16:18] = anchor pair for THIS segment's channel
+    #
+    # Rotation: V, L, M, T, V, L, M, T, ...  (initial Tran segment is the
+    # implicit T in the cycle.)
+    rotation = ["Vert", "Long", "MicL", "Tran"]
+    # Track each channel's "running cumulative value" so we can apply the
+    # previous-channel extension deltas at every segment boundary.
+    last_value = {"Tran": last_tran_value, "Vert": None, "Long": None, "MicL": None}
+
+    for k, hi in enumerate(seg_idx):
+        channel = rotation[k % 4]
+        prev_channel = "Tran" if k == 0 else rotation[(k - 1) % 4]
+        header = blocks[hi]
+        if len(header.data) < 18:
+            continue
+        # Validate: real segment headers have bytes [12:14] = `02 00`.
+        # Trailer/footer "40 02" markers contain ASCII serial bytes or other
+        # non-header data there and would otherwise be mis-interpreted as
+        # segment headers, adding spurious samples at the tail.
+        if header.data[12:14] != b"\x02\x00":
+            break
+        # Extend the PREVIOUS channel by 2 more samples (deltas in bytes [0:4]).
+        prev_d0 = int.from_bytes(header.data[0:2], "big", signed=True)
+        prev_d1 = int.from_bytes(header.data[2:4], "big", signed=True)
+        if last_value[prev_channel] is not None:
+            v = last_value[prev_channel] + prev_d0
+            out[prev_channel].append(v)
+            v += prev_d1
+            out[prev_channel].append(v)
+            last_value[prev_channel] = v
+        # Anchor pair for THIS segment's channel.
+        c0 = int.from_bytes(header.data[14:16], "big", signed=True)
+        c1 = int.from_bytes(header.data[16:18], "big", signed=True)
+        out[channel].extend([c0, c1])
+        # Apply delta blocks for this segment.
+        next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
+        last_value[channel] = apply_blocks(channel, c1, hi + 1, next_hi)
+
+    return out
+
+
+# ── ADC-scale conversion helpers ────────────────────────────────────────────
+
+
+# Scaling factor: decode_waveform_v2 produces geo-channel samples in the BW
+# display quantization (16-count units, LSB = 0.005 in/s at Normal range).
+# The legacy consumer pipeline (sfm/event_hdf5.py) expects raw_samples in
+# 1-count ADC units (× full_scale / 32768 → physical).  To plug the new
+# decoder in without rewriting consumers, multiply geo values by 16.
+#
+# Mic samples are already in raw ADC counts (decoded value 1 = 1 mic ADC count
+# = -81.94 dB on the BW display).  Mic values pass through unchanged.
+_GEO_DECODER_TO_ADC = 16
+
+
+def decoded_to_adc_counts(decoded: dict) -> dict:
+    """Convert :func:`decode_waveform_v2` output to int16 ADC counts.
+
+    Geo channels are scaled by ×16 (decoder produces 16-count units,
+    consumer expects 1-count ADC).  Mic is passed through as raw counts.
+    """
+    if not decoded:
+        return {}
+    return {
+        "Tran": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Tran", [])],
+        "Vert": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Vert", [])],
+        "Long": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Long", [])],
+        "MicL": list(decoded.get("MicL", [])),
+    }
+
+
+def mic_count_to_db(count: int) -> float:
+    """Convert a MicL ADC count to dB(L) for BW-display-compatible output.
+
+    Empirical formula (confirmed 2026-05-11 against V70 fixture: count=813
+    → 140.1 dB; count=±1 → ±81.94 dB; count=±24 → ±109.5 dB):
+
+        dB = sign(count) × (81.94 + 20 × log10(|count|))    for |count| ≥ 1
+        dB = 0.0                                            for count == 0
+
+    The constant 81.94 is the level of a single count: 10^(81.94/20) ≈
+    12500, so one mic ADC count is about 12500× the dB(L) reference
+    level.  It is almost certainly a calibration constant of the mic.
+    """
+    if count == 0:
+        return 0.0
+    sign = 1.0 if count > 0 else -1.0
+    return sign * (81.94 + 20.0 * math.log10(abs(count)))
+
+
+# ── A5-frame entry point ────────────────────────────────────────────────────
+
+
+def decode_a5_frames(a5_frames) -> Optional[dict]:
+    """Decode a list of A5 (BULK_WAVEFORM_STREAM) frames into per-channel
+    int16 ADC samples.
+
+    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
+    with each channel's samples in **1-count ADC units** (the legacy
+    ``event.raw_samples`` convention — multiply by ``full_scale / 32768``
+    to convert to physical units; for mic, use :func:`mic_count_to_db` or
+    a per-count psi factor).
+
+    Returns ``None`` if the frames cannot be parsed.
+
+    This is the wired-up production entry point.  It:
+      1. Reconstructs the BW-binary body bytes from the A5 frames
+         (``blastware_file.extract_body_bytes``).
+      2. Runs the verified codec (``decode_waveform_v2``) on the body.
+      3. Converts to int16 ADC counts via :func:`decoded_to_adc_counts`.
+    """
+    # Local import to avoid a cycle: blastware_file imports models and
+    # ultimately client.py imports waveform_codec.
+    from .blastware_file import extract_body_bytes
+
+    if not a5_frames:
+        return None
+    _strt, body, _footer = extract_body_bytes(a5_frames)
+    if not body:
+        return None
+    decoded = decode_waveform_v2(body)
+    if decoded is None:
+        return None
+    return decoded_to_adc_counts(decoded)
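The delta blocks handled by `apply_blocks` above can be illustrated in isolation. The following is a minimal standalone sketch, not part of the patch: `s4`, `i8`, `unpack_12bit_group`, and `accumulate` are hypothetical names, and the payload bytes are made up for illustration rather than taken from a fixture.

```python
from typing import List


def s4(n: int) -> int:
    """Sign-extend a 4-bit value (0..7 stay as-is, 8..F map to -8..-1)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Reinterpret an unsigned byte as a signed int8."""
    return b if b < 128 else b - 256


def unpack_12bit_group(grp: bytes) -> List[int]:
    """Unpack one 6-byte ``30 NN`` group into four signed 12-bit deltas.

    Bytes [0:2] hold four 4-bit high nibbles (MSB first); bytes [2:6]
    hold the four matching low bytes.
    """
    if len(grp) != 6:
        raise ValueError("a 30 NN group is exactly 6 bytes")
    high_word = (grp[0] << 8) | grp[1]
    deltas = []
    for k in range(4):
        nib = (high_word >> (12 - 4 * k)) & 0xF
        v = (nib << 8) | grp[2 + k]
        deltas.append(v - 0x1000 if v >= 0x800 else v)  # sign-extend 12 bits
    return deltas


def accumulate(anchor: int, deltas: List[int]) -> List[int]:
    """Turn deltas into cumulative sample values starting from *anchor*."""
    out, cur = [], anchor
    for d in deltas:
        cur += d
        out.append(cur)
    return out


# Illustrative ``10 NN`` payload: high nibble of each byte comes first.
nibble_deltas = [s4(n) for byte in b"\x1f\x80" for n in (byte >> 4, byte & 0xF)]
samples = accumulate(100, nibble_deltas)

# Illustrative ``30 NN`` group (hypothetical bytes).
wide_deltas = unpack_12bit_group(bytes([0x0F, 0x80, 0x01, 0xFF, 0x00, 0x10]))
```

Keeping the unpacking separate from the accumulation makes each packing rule checkable on its own before it is wired into a channel walk.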
@@ -53,7 +53,9 @@ SUB_TABLE: dict[int, tuple[str, str, str]] = {
    0x82: ("TRIGGER_CONFIG_WRITE",      "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
    0x83: ("TRIGGER_WRITE_CONFIRM",     "BW→S3", "Short frame; commit step after 0x82"),
    # S3→BW responses
+    0x5A: ("BULK_WAVEFORM_STREAM",      "BW→S3", "Bulk waveform chunk request; response is A5 stream"),
    0xA4: ("POLL_RESPONSE",             "S3→BW", "Response to SUB 5B poll"),
+    0xA5: ("BULK_WAVEFORM_RESPONSE",    "S3→BW", "Response to SUB 5A; waveform chunks + metadata"),
    0xFE: ("FULL_CONFIG_RESPONSE",      "S3→BW", "Response to SUB 01"),
    0xF9: ("CHANNEL_CONFIG_RESPONSE",   "S3→BW", "Response to SUB 06"),
    0xF7: ("EVENT_INDEX_RESPONSE",      "S3→BW", "Response to SUB 08; contains backlight/power-save"),
@@ -33,7 +33,7 @@ STX = 0x02
 ETX = 0x03
 ACK = 0x41

-__version__ = "0.2.2"
+__version__ = "0.2.5"


@dataclass
@@ -184,9 +184,9 @@ def validate_bw_body_auto(body: bytes) -> Optional[Tuple[bytes, bytes, str]]:
 def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
    frames: List[Frame] = []

-    IDLE = 0
-    IN_FRAME = 1
-    AFTER_DLE = 2
+    IDLE         = 0
+    IN_FRAME     = 1
+    IN_FRAME_DLE = 2   # saw DLE inside frame, waiting for next byte

    state = IDLE
    body = bytearray()
@@ -206,28 +206,26 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
                state = IN_FRAME
                i += 2
                continue
+            # ACK bytes, boot strings, garbage — silently ignored

        elif state == IN_FRAME:
            if b == DLE:
-                state = AFTER_DLE
+                state = IN_FRAME_DLE
                i += 1
                continue
-            body.append(b)
-
-        else:  # AFTER_DLE
-            if b == DLE:
-                body.append(DLE)
-                state = IN_FRAME
-                i += 1
-                continue
-
            if b == ETX:
+                # Bare ETX = real S3 frame terminator (confirmed from S3FrameParser)
                end_offset = i + 1
                trailer_start = i + 1
                trailer_end = trailer_start + trailer_len
                trailer = blob[trailer_start:trailer_end]

-                # For S3 mode we don't assume checksum type here yet.
+                # S3 checksums are deliberately not validated here.
+                # Large S3 responses (A5 bulk waveform, E5 compliance) embed
+                # inner DLE+ETX sub-frame terminators whose trailing 0x03 byte
+                # lands where the parser would expect the SUM8 checksum, causing
+                # false failures.  The live protocol (protocol.py _validate_frame)
+                # also skips S3 checksum enforcement for the same reason.
                frames.append(Frame(
                    index=idx,
                    start_offset=start_offset,
@@ -244,13 +242,27 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
                state = IDLE
                i = trailer_end
                continue
+            body.append(b)

+        else:  # IN_FRAME_DLE
+            if b == DLE:
+                # DLE DLE → literal 0x10 in payload
+                body.append(DLE)
+                state = IN_FRAME
+                i += 1
+                continue
+            if b == ETX:
+                # DLE+ETX inside a frame = inner-frame terminator (A4/E5 sub-frames).
+                # Treat as literal data, NOT the outer frame end.
+                body.append(DLE)
+                body.append(ETX)
+                state = IN_FRAME
+                i += 1
+                continue
            # Unexpected DLE + byte → treat as literal data
            body.append(DLE)
            body.append(b)
            state = IN_FRAME
-            i += 1
-            continue

        i += 1
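The in-frame DLE handling in this hunk can be sketched as a small standalone function. This is an illustration under stated assumptions, not the parser itself: `unstuff_s3_body` is a hypothetical name, `DLE = 0x10` is inferred from the "literal 0x10" comment, and frame-start detection and trailer handling are left out.

```python
DLE = 0x10  # assumed value, per the "DLE DLE -> literal 0x10" comment
ETX = 0x03


def unstuff_s3_body(data: bytes) -> tuple[bytes, int]:
    """Unstuff in-frame bytes up to the first bare ETX.

    Returns (body, etx_index); etx_index is -1 if no bare ETX was found.
    """
    body = bytearray()
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b == DLE:
            if i + 1 >= n:
                break  # truncated input: DLE with no following byte
            nxt = data[i + 1]
            if nxt == DLE:
                body.append(DLE)        # DLE DLE -> one literal 0x10
            else:
                body.append(DLE)        # DLE ETX (inner terminator) and any
                body.append(nxt)        # other DLE pair stay literal data
            i += 2
            continue
        if b == ETX:
            return bytes(body), i       # bare ETX ends the outer frame
        body.append(b)
        i += 1
    return bytes(body), -1


raw = bytes([0x01, 0x10, 0x10, 0x10, 0x03, 0x05, 0x03, 0xAA])
body, etx_at = unstuff_s3_body(raw)
```

In this example the `10 03` pair at offsets 3 and 4 is kept as payload, and only the unescaped `03` at offset 6 terminates the frame.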

@@ -298,10 +310,13 @@ def parse_bw(blob: bytes, trailer_len: int, validate_checksum: bool) -> List[Fra

            if b == ETX:
                # Candidate end-of-frame.
-                # Accept ETX if the next bytes look like a real next-frame start (ACK+STX),
-                # or we're at EOF. This prevents chopping on in-payload 0x03.
-                next_is_start = (i + 2 < n and blob[i + 1] == ACK and blob[i + 2] == STX)
-                at_eof = (i == n - 1)
+                # Skip any SESSION_RESET (41 03) sequences — sent before POLL to wake
+                # monitoring units — to find the real next frame start (ACK+STX).
+                j = i + 1
+                while j + 1 < n and blob[j] == ACK and blob[j + 1] == ETX:
+                    j += 2
+                next_is_start = (j + 1 < n and blob[j] == ACK and blob[j + 1] == STX)
+                at_eof = (i == n - 1) or (j >= n)

                if not (next_is_start or at_eof):
                    # Not a real boundary -> payload byte
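The lookahead that decides whether an ETX is a real frame boundary can be exercised on its own. The sketch below mirrors the logic in this hunk; `etx_is_frame_end` is a hypothetical helper name and the byte strings are invented test inputs.

```python
ACK = 0x41
STX = 0x02
ETX = 0x03


def etx_is_frame_end(blob: bytes, i: int) -> bool:
    """Return True if the ETX at blob[i] looks like a real end-of-frame.

    Any run of SESSION_RESET (41 03) pairs after the ETX is skipped before
    checking for the next frame start (ACK + STX) or end of input.
    """
    n = len(blob)
    j = i + 1
    while j + 1 < n and blob[j] == ACK and blob[j + 1] == ETX:
        j += 2
    next_is_start = j + 1 < n and blob[j] == ACK and blob[j + 1] == STX
    at_eof = (i == n - 1) or (j >= n)
    return next_is_start or at_eof


# ETX, two SESSION_RESET pairs, then ACK+STX of the next frame.
followed_by_frame = etx_is_frame_end(b"\x03\x41\x03\x41\x03\x41\x02\x99", 0)
# ETX followed only by a SESSION_RESET pair, then end of input.
followed_by_reset_only = etx_is_frame_end(b"\x03\x41\x03", 0)
# ETX followed by ordinary payload bytes.
inside_payload = etx_is_frame_end(b"\x03\x07\x08", 0)
```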
@@ -0,0 +1,24 @@
+[build-system]
+requires = ["setuptools>=68", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "seismo-relay"
+version = "0.21.1"
+description = "Python client and REST server for MiniMate Plus seismographs"
+requires-python = ">=3.10"
+dependencies = [
+    "fastapi>=0.104",
+    "uvicorn[standard]>=0.24",
+    "pyserial>=3.5",
+    "sqlalchemy>=2.0",
+    "python-multipart>=0.0.7",
+    "h5py>=3.10",
+    "numpy>=1.24",
+    "matplotlib>=3.8",
+]
+
+[tool.setuptools.packages.find]
+# Auto-discovers minimateplus/, micromate/, sfm/, bridges/ as packages
+where = ["."]
+include = ["minimateplus*", "micromate*", "sfm*", "bridges*"]
@@ -0,0 +1,8 @@
+fastapi
+uvicorn
+sqlalchemy
+pyserial
+python-multipart
+h5py
+numpy
+matplotlib
@@ -0,0 +1,360 @@
+"""
+scratch/next_experiment_skeleton.py — segment-channel scoring analyzer.
+
+This is the suggested NEXT EXPERIMENT for cracking the waveform body codec.
+The goal is to figure out what segments 1+ contain, since segment 0 = Tran
+is solved but multi-segment continuation diverges from truth at sample ~512.
+
+────────────────────────────────────────────────────────────────────────────
+The hypothesis to test
+────────────────────────────────────────────────────────────────────────────
+
+Segments rotate through channels:
+
+    segment 0  →  Tran samples 0..509
+    segment 1  →  Vert samples 0..507
+    segment 2  →  Long samples 0..507
+    segment 3  →  Mic  samples 0..507
+    segment 4  →  Tran samples 510..N (continuation)
+    ...
+
+This would explain why segment 0 works perfectly (it's pure Tran) and why
+applying segment 1's blocks as Tran continuation gives wrong values
+(it's actually Vert).
+
+────────────────────────────────────────────────────────────────────────────
+What the analyzer should do
+────────────────────────────────────────────────────────────────────────────
+
+For each segment in each fixture event:
+
+1. Run the segment-0 block-walker + RLE decode (the same algorithm that
+   ``decode_tran_initial`` uses) over the segment's blocks.  Start from
+   some anchor value and produce a cumulative trajectory of length =
+   number-of-deltas-in-segment.
+
+2. For each candidate channel C ∈ {Tran, Vert, Long, MicL}:
+   For each candidate anchor location in the segment-header payload
+   (try [0:2], [2:4], [4:6], [14:16], [16:18] as int16 BE):
+       Compare the decoded trajectory against truth[C] starting from
+       the segment's first sample index.
+       Score = number of matches (or sum of squared errors).
+
+3. Report the best (channel, anchor-location) combination per segment.
+
+If the rotation hypothesis is correct, you'll see:
+    segment 0  →  best score for (Tran, preamble bytes [3:5])    ✓ already known
+    segment 1  →  best score for (Vert, <some-header-byte>)
+    segment 2  →  best score for (Long, <some-header-byte>)
+    segment 3  →  best score for (MicL, <some-header-byte>)
+    segment 4  →  best score for (Tran, continuing from segment 0's end)
+
+If the rotation hypothesis is NOT correct, the scorer will at least narrow
+down what segment 1 actually carries.  Maybe channels interleave at finer
+granularity, or maybe segments alternate by something other than channel.
+
+────────────────────────────────────────────────────────────────────────────
+Why this is a scoring analyzer, not a hand-written decoder
+────────────────────────────────────────────────────────────────────────────
+
+Direct hand-coding ("assume segment 1 is Vert with anchor at byte X") gets
+stuck when the assumption is wrong because the failure mode is silent —
+you get plausible-looking-but-wrong samples and have to manually diff
+against truth to debug.
+
+The scorer is brute-force but cheap: every fixture event × every segment ×
+4 channels × 5 anchor-byte candidates is only ~hundreds of comparisons.
+The winning combination jumps out by score.
+
+────────────────────────────────────────────────────────────────────────────
+Skeleton
+────────────────────────────────────────────────────────────────────────────
+"""
+from __future__ import annotations
+
+import os
+import re
+import sys
+from dataclasses import dataclass
+from typing import List, Optional, Tuple
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+
+from minimateplus.waveform_codec import walk_body, find_data_start, WaveformBlock
+
+
+# ── Reusable pieces ──────────────────────────────────────────────────────────
+
+
+CHANNELS = ("Tran", "Vert", "Long", "MicL")
+LSB_INV = 200  # 1 in/s / 0.005 in/s/LSB; multiply BW-export floats by this
+               # to get 16-count units (the body's native quantization).
+
+
+@dataclass
+class FixtureEvent:
+    name: str           # e.g. "M529LL1A.SP0"
+    bin_path: str
+    txt_path: str
+    body: bytes
+    truth: dict         # {channel: list of int16-quantized samples}
+    blocks: List[WaveformBlock]
+    segment_starts: List[int]  # block indices of each 40 02 segment header
+    segment_sample_starts: List[int]  # for each segment, the truth sample index it starts at
+
+
+def s4(n: int) -> int:
+    """4-bit signed nibble decode."""
+    return n if n < 8 else n - 16
+
+
+def i8(b: int) -> int:
+    """int8 reinterpret of unsigned byte."""
+    return b if b < 128 else b - 256
+
+
+def load_fixture(name: str) -> FixtureEvent:
+    """Load a fixture event with its truth values and parsed block stream."""
+    # Find the fixture (search both subdirs of tests/fixtures/).
+    base = os.path.join(os.path.dirname(__file__), "..", "tests", "fixtures")
+    candidates = [
+        os.path.join(base, "5-11-26", name),
+        os.path.join(base, "decode-re-5-8-26", "event-a", name),  # not used directly
+    ]
+    bin_path = next((c for c in candidates if os.path.exists(c)), None)
+    if bin_path is None:
+        # Try a glob walk for the 5-8 fixtures (they're in subdirs).
+        for root, _, files in os.walk(base):
+            if name in files:
+                bin_path = os.path.join(root, name)
+                break
+    if bin_path is None:
+        raise FileNotFoundError(name)
+
+    txt_path = bin_path + ".TXT"
+    with open(bin_path, "rb") as f:
+        raw = f.read()
+    body = raw[43:-26]
+    truth = _parse_txt(txt_path)
+    blocks = walk_body(body, find_data_start(body))
+
+    seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
+    # Segment 0 starts at sample 0; subsequent segments start at the
+    # cumulative sample count from previous segment(s).  Tran's segment 0
+    # is N samples; if rotation hypothesis is correct, segment 1's data
+    # starts at sample 0 for a *different* channel.  The analyzer should
+    # try both "continues from previous segment" and "starts at sample 0
+    # of a different channel."
+    seg_sample_starts = _compute_segment_sample_starts(blocks, seg_idx)
+
+    return FixtureEvent(
+        name=name, bin_path=bin_path, txt_path=txt_path,
+        body=body, truth=truth, blocks=blocks,
+        segment_starts=seg_idx, segment_sample_starts=seg_sample_starts,
+    )
+
+
+def _parse_txt(path: str) -> dict:
+    """Parse BW ASCII TXT export into {channel: [int_samples_in_16_count_units]}."""
+    with open(path, "r", encoding="utf-8", errors="replace") as f:
+        lines = f.read().splitlines()
+    header_idx = next(
+        (i for i, line in enumerate(lines)
+         if all(c in line for c in CHANNELS)),
+        None,
+    )
+    if header_idx is None:
+        return {ch: [] for ch in CHANNELS}
+    out = {ch: [] for ch in CHANNELS}
+    for line in lines[header_idx + 1:]:
+        parts = re.split(r"\s+", line.strip())
+        if len(parts) < 4:
+            continue
+        try:
+            vals = [float(p) for p in parts[:4]]
+        except ValueError:
+            continue
+        for ch, v in zip(CHANNELS, vals):
+            # Multiply by LSB_INV; geo channels are in in/s, MicL is in dB(L)
+            # (which doesn't quantize the same way — leaving raw for MicL is fine,
+            # the scorer should treat MicL specially).
+            out[ch].append(round(v * LSB_INV) if ch != "MicL" else v)
+    return out
+
+
+def _compute_segment_sample_starts(
+    blocks: List[WaveformBlock], seg_idx: List[int]
+) -> List[int]:
+    """Cumulative sample-count up to each segment header (if all blocks treated
+    as Tran continuation).  Useful as one candidate for segment-1-Tran tests.
+
+    The scorer should ALSO try "segment 1 starts at sample 0 of a new channel"
+    as the rotation hypothesis predicts.
+    """
+    starts = []
+    cum = 2  # T[0] + T[1] from preamble
+    for i, b in enumerate(blocks):
+        if i in seg_idx:
+            starts.append(cum)
+        if b.tag_hi == 0x10:
+            cum += b.tag_lo
+        elif b.tag_hi == 0x20:
+            cum += b.tag_lo
+        elif b.tag_hi == 0x00:
+            cum += b.tag_lo
+        # 30 NN and 40 02 don't contribute samples (for this hypothesis)
+    return starts
+
+
+# ── The core algorithm: decode a segment's blocks as deltas ─────────────────
+
+
+def decode_segment_as_channel(
+    blocks: List[WaveformBlock],
+    seg_start_block_idx: int,
+    seg_end_block_idx: int,
+    anchor: int,
+) -> List[int]:
+    """Apply the segment-0 codec rules to a range of blocks, starting from *anchor*.
+
+    Returns a list of cumulative sample values (one per delta).  Does NOT include
+    the anchor itself in the output — the first returned value is anchor + first_delta.
+    """
+    out = []
+    cur = anchor
+    for bi in range(seg_start_block_idx, seg_end_block_idx):
+        blk = blocks[bi]
+        if blk.tag_hi == 0x10:
+            for byte in blk.data:
+                for nib in ((byte >> 4) & 0xF, byte & 0xF):
+                    cur += s4(nib)
+                    out.append(cur)
+        elif blk.tag_hi == 0x20:
+            for byte in blk.data:
+                cur += i8(byte)
+                out.append(cur)
+        elif blk.tag_hi == 0x00:
+            for _ in range(blk.tag_lo):
+                out.append(cur)
+        # 30 NN: skip (content unknown)
+        # 40 02: shouldn't appear in segment data (it's the segment header)
+    return out
+
+
+def score_against_truth(
+    decoded: List[int],
+    truth: List[int],
+    truth_start: int,
+) -> Tuple[int, int]:
+    """Compare *decoded* to truth[truth_start : truth_start + len(decoded)].
+
+    Returns (n_matches, n_compared).
+    """
+    n = min(len(decoded), len(truth) - truth_start)
+    if n <= 0:
+        return (0, 0)
+    matches = sum(1 for i in range(n) if decoded[i] == truth[truth_start + i])
+    return (matches, n)
+
+
+# ── TODO for the next pass ──────────────────────────────────────────────────
+
+
+def score_segment_against_all_channels(
+    event: FixtureEvent,
+    segment_index: int,
+) -> List[Tuple[str, int, int, int]]:
+    """For segment *segment_index* of *event*, find the best (channel, start_sample)
+    fit.
+
+    For each candidate channel C and each candidate starting truth-sample index s,
+    we take truth[C][s] as the anchor, apply the segment's cumulative deltas to it,
+    and score the N decoded values against truth[C][s+1 : s+N+1].
+
+    Returns rows of (channel_name, start_sample, n_matches, n_compared)
+    sorted by match-count descending.
+    """
+    # Block range of this segment: from the segment header (inclusive) up to
+    # the next segment header (exclusive), or end-of-blocks.
+    seg_header_idx = event.segment_starts[segment_index]
+    next_header_idx = (
+        event.segment_starts[segment_index + 1]
+        if segment_index + 1 < len(event.segment_starts)
+        else len(event.blocks)
+    )
+
+    # Decode the segment's data blocks (skip the segment-header block itself).
+    # Use anchor=0 — we'll re-anchor when scoring against each channel.
+    deltas_trajectory = decode_segment_as_channel(
+        event.blocks, seg_header_idx + 1, next_header_idx, anchor=0
+    )
+    if not deltas_trajectory:
+        return []
+
+    n = len(deltas_trajectory)
+    results = []
+
+    for ch in ("Tran", "Vert", "Long"):  # MicL skipped: its truth is dB(L), not counts
+        truth = event.truth.get(ch)
+        if not truth or len(truth) < n + 1:
+            continue
+        # For each candidate starting sample s in truth, check if applying
+        # the deltas starting from truth[s] reproduces truth[s+1:s+n+1].
+        best = (0, -1)
+        for s in range(len(truth) - n):
+            anchor = truth[s]
+            # deltas_trajectory was decoded from anchor=0, so it holds
+            # cumulative deltas.  Re-anchoring at truth[s] gives
+            # trajectory[i] = anchor + deltas_trajectory[i], which is
+            # compared against truth[s + 1 + i].
+            matches = 0
+            for i in range(n):
+                if truth[s + i + 1] == anchor + deltas_trajectory[i]:
+                    matches += 1
+                # Note: we could break early on first mismatch for "matches start",
+                # but counting total matches gives a more robust score.
+            if matches > best[0]:
+                best = (matches, s)
+        results.append((ch, best[1], best[0], n))
+
+    results.sort(key=lambda r: -r[2])
+    return results
+
+
+# ── Driver ──────────────────────────────────────────────────────────────────
+
+
+def main():
+    """Run the analyzer on all loud-bundle events and print best scores."""
+    events = ["M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
+              "M529LL1L.JQ0", "M529LL1L.V70"]
+    for name in events:
+        try:
+            event = load_fixture(name)
+        except FileNotFoundError:
+            print(f"{name}: fixture not found")
+            continue
+
+        print(f"\n=== {name} ===")
+        print(f"  body bytes: {len(event.body)}")
+        print(f"  blocks: {len(event.blocks)}")
+        print(f"  segments: {len(event.segment_starts)}")
+        print("  segment sample-starts (if all blocks are 1 channel):")
+        for si, sample_start in enumerate(event.segment_sample_starts):
+            print(f"    seg {si}: sample {sample_start}")
+
+        for si in range(len(event.segment_starts)):
+            results = score_segment_against_all_channels(event, si)
+            if not results:
+                print(f"  seg {si}: (no scorable data)")
+                continue
+            tag = "✓" if results[0][2] / max(results[0][3], 1) > 0.9 else " "
+            top = results[0]
+            print(f"  seg {si}: best fit {tag} = {top[0]:<5} "
+                  f"starting at sample {top[1]:>5}, {top[2]:>4}/{top[3]:<4} match"
+                  + (f"  (next: {results[1][0]} @{results[1][1]} {results[1][2]}/{results[1][3]})"
+                     if len(results) > 1 else ""))
+
+
+if __name__ == "__main__":
+    main()
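The re-anchoring search at the heart of the scorer can be reduced to a few lines. This is a simplified sketch with a hypothetical function name (`best_alignment`) and toy data; it omits the per-channel loop and fixture loading done by the script.

```python
from typing import List, Tuple


def best_alignment(trajectory: List[int], truth: List[int]) -> Tuple[int, int]:
    """Find the truth index whose value best anchors *trajectory*.

    *trajectory* holds cumulative deltas decoded from anchor=0.  For each
    start index s, truth[s] is used as the anchor and the re-anchored
    values are compared against truth[s+1 : s+1+len(trajectory)].

    Returns (n_matches, start_index); start_index is -1 if nothing matched.
    """
    n = len(trajectory)
    best = (0, -1)
    for s in range(len(truth) - n):
        anchor = truth[s]
        matches = sum(
            1 for i in range(n) if truth[s + 1 + i] == anchor + trajectory[i]
        )
        if matches > best[0]:
            best = (matches, s)
    return best


# Toy data: deltas +1, -2, 0 give the cumulative trajectory [1, -1, -1].
toy_truth = [0, 5, 6, 4, 4, 9, 9]
toy_result = best_alignment([1, -1, -1], toy_truth)
```

Counting total matches rather than stopping at the first mismatch keeps the score meaningful when only part of a segment follows the assumed codec rules.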
@@ -0,0 +1,150 @@
+"""
+scripts/backfill_record_type.py — fix `record_type` on legacy event
+rows whose value was hardcoded to "Waveform" regardless of actual type.
+
+Why this is needed
+──────────────────
+Pre-v0.16.1 the BW file importer (`event_file_io.read_blastware_file`)
+hardcoded `ev.record_type = "Waveform"` for every imported event.  Fixed
+in commit aac1c8e.  New ingests now derive the type from the last
+character of the Blastware filename's extension (H=Histogram, W=Waveform,
+M=Manual, E=Event, C=Combo) per the V10.72+ MiniMate Plus AB0T filename scheme.
+
+Effect on a server that imported events under the old code: every
+events row has `record_type = "Waveform"`, even for histograms,
+manuals, etc.  Visible in terra-view's event-detail modal under the
+"Record Type" field.  Terra-view also has a client-side workaround
+that derives the type from the filename for display purposes, so
+operators see the correct type in the UI even before this backfill.
+This script makes the DB column match what the UI is already showing,
+which matters for reporting and any downstream consumer that reads
+events.record_type directly.
+
+This script
+───────────
+Walks the `events` table and updates each row's `record_type` to the
+derived value from its `blastware_filename`.  Old S338 firmware files
+(3-char extensions ending in `0`) and any unrecognized suffix fall
+back to the `--default` value ("Waveform" unless overridden).
+
+Idempotent: re-running after a successful backfill finds zero rows
+needing updates and exits cleanly (it always re-derives but only
+writes when the value would change).
+
+Usage
+─────
+  # Dry-run (default): print what would change, don't touch the DB
+  python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db
+
+  # Apply the backfill
+  python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db --apply
+"""
+
+from __future__ import annotations
+
+import argparse
+import sqlite3
+import sys
+from collections import Counter
+from pathlib import Path
+
+
+# Must stay in sync with minimateplus.event_file_io._RECORD_TYPE_BY_EXT_SUFFIX.
+_TYPE_FROM_SUFFIX = {
+    "H": "Histogram",
+    "W": "Waveform",
+    "M": "Manual",
+    "E": "Event",
+    "C": "Combo",
+}
+
+
+def derive_record_type(filename: str | None, default: str = "Waveform") -> str:
+    """Mirror of minimateplus.event_file_io.derive_record_type_from_filename.
+
+    Vendored here so this script runs without needing the seismo-relay
+    package on the Python path (useful on prod where you might be
+    running it via `docker exec` against a container's DB volume).
+    """
+    if not filename:
+        return default
+    name = Path(filename).name
+    if "." not in name:
+        return default
+    ext = name.rsplit(".", 1)[1]
+    if not ext:
+        return default
+    return _TYPE_FROM_SUFFIX.get(ext[-1].upper(), default)
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(description=__doc__)
+    ap.add_argument("--db", required=True, help="Path to seismo_relay.db")
+    ap.add_argument("--apply", action="store_true",
+                    help="Actually write changes (default is dry-run).")
+    ap.add_argument("--default", default="Waveform",
+                    help="Fallback record_type when filename doesn't encode one. "
+                         "Default: Waveform (matches the pre-fix bug's behavior).")
+    args = ap.parse_args()
+
+    db_path = Path(args.db)
+    if not db_path.exists():
+        print(f"ERROR: database not found at {db_path}", file=sys.stderr)
+        return 1
+
+    conn = sqlite3.connect(str(db_path))
+    conn.row_factory = sqlite3.Row
+    cur = conn.cursor()
+
+    cur.execute("""
+        SELECT id, blastware_filename, record_type
+        FROM events
+        WHERE blastware_filename IS NOT NULL
+          AND blastware_filename != ''
+    """)
+    rows = cur.fetchall()
+    total = len(rows)
+    print(f"Scanning {total:,} event rows…")
+    print()
+
+    # Tally proposed changes.
+    transitions: Counter[tuple[str, str]] = Counter()
+    update_ids: list[tuple[str, str]] = []
+    unrecognized = 0
+
+    for row in rows:
+        derived = derive_record_type(row["blastware_filename"], default=args.default)
+        current = row["record_type"] or ""
+        if derived == current:
+            continue
+        transitions[(current, derived)] += 1
+        update_ids.append((row["id"], derived))
+
+    if not update_ids:
+        print("Nothing to update — all rows already match.")
+        conn.close()
+        return 0
+
+    print(f"{len(update_ids):,} row(s) need updating:")
+    for (old, new), count in sorted(transitions.items(), key=lambda x: -x[1]):
+        print(f"  {count:>6,}  {old!r:14s} → {new!r}")
+    print()
+
+    if not args.apply:
+        print("(dry-run — re-run with --apply to write changes)")
+        conn.close()
+        return 0
+
+    print("Applying changes…")
+    cur.executemany(
+        "UPDATE events SET record_type = ? WHERE id = ?",
+        [(new, eid) for eid, new in update_ids],
+    )
+    conn.commit()
+    print(f"Done. Updated {cur.rowcount:,} row(s).")
+    conn.close()
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
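Illustrative sketch, not part of the patch: the suffix lookup described in the docstring above can be exercised standalone as shown here. The function body mirrors `derive_record_type`; the sample filenames are hypothetical and were chosen only to hit each branch.

```python
from pathlib import Path

# Same mapping as _TYPE_FROM_SUFFIX in backfill_record_type.py.
TYPE_FROM_SUFFIX = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def derive_record_type(filename, default="Waveform"):
    """Return the record type encoded by the last character of the extension."""
    if not filename:
        return default
    name = Path(filename).name
    if "." not in name:
        return default
    ext = name.rsplit(".", 1)[1]
    if not ext:
        return default
    return TYPE_FROM_SUFFIX.get(ext[-1].upper(), default)


# Hypothetical filenames (assumptions for illustration only).
examples = ["M529LK2Q.AB0H", "m529lk2q.ab0w", "OLDFILE.630", "noextension", None]
results = {str(name): derive_record_type(name) for name in examples}
for name, rtype in results.items():
    print(f"{name:>14} -> {rtype}")
```

Note that an unrecognized suffix (such as the trailing `0` above) resolves to the default rather than to a sentinel, so the script cannot distinguish "derived Waveform" from "fell back to Waveform".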
@@ -0,0 +1,466 @@
+"""
+scripts/backfill_sidecars.py — generate .sfm.json sidecars AND .h5
+clean-waveform files for existing events already in the waveform store
+that predate those features.
+
+Walks `<store_root>/<serial>/<filename>` and for each BW event file:
+
+  Sidecar (.sfm.json):
+    - Skip when an existing sidecar's blastware.sha256 matches the
+      current BW file's sha256.
+    - Else regenerate: prefer .a5.pkl (full fidelity); fall back to
+      parsing the BW binary directly (peaks computed from samples).
+
+  Clean waveform (.h5):
+    - Regenerated whenever the sidecar is regenerated (sha mismatch
+      OR sidecar.source.tool_version < current TOOL_VERSION OR --force).
+      The .h5 and the sidecar both come from the same decoder output,
+      so if the sidecar is stale the .h5 is too.
+    - Written when missing.
+    - --skip-hdf5 turns off all .h5 writes.
+
+Typical use after a decoder upgrade:
+    1. Pull the new seismo-relay code (which bumped TOOL_VERSION).
+    2. Run this script — every sidecar with an older tool_version
+       stamp regenerates, and the associated .h5 cascade-regenerates.
+    3. Operator review state (review.false_trigger, notes, reviewer)
+       and the sidecar's extensions block are preserved across the
+       regen.
+
+Usage:
+    python scripts/backfill_sidecars.py [--store-root PATH]
+                                        [--db-path PATH]
+                                        [--dry-run] [--force]
+                                        [--skip-hdf5] [--reparse-txt]
+                                        [-v]
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+import sys
+from pathlib import Path
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+from sfm import event_hdf5
+from sfm.waveform_store import WaveformStore, _frame_to_dict, _dict_to_frame  # noqa: F401
+from sfm.database import SeismoDb
+
+log = logging.getLogger("backfill_sidecars")
+
+
+def _looks_like_event_file(path: Path) -> bool:
+    """Same heuristic as the importer CLI.
+
+    Filters to BW (Series III) event files only — Thor (Series IV)
+    `.IDFW` / `.IDFH` files share the store but have their own ingest
+    path (`WaveformStore.save_imported_idf`) and are NOT decodable by
+    `event_file_io.read_blastware_file`.  Their sidecars are populated
+    at ingest from the paired `.IDFW.txt` ASCII report; nothing the
+    backfill regenerates would improve on them, so we exclude them
+    from scope.
+    """
+    if not path.is_file():
+        return False
+    if path.name.endswith((".a5.pkl", ".sfm.json", ".h5")):
+        return False
+    ext = path.suffix.lstrip(".")
+    if not (3 <= len(ext) <= 4):
+        return False
+    # Thor IDF files share the .{W,H}-suffix shape but aren't BW.
+    if ext.upper() in ("IDFW", "IDFH"):
+        return False
+    if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
+        return False
+    try:
+        return path.stat().st_size >= 70
+    except OSError:
+        return False
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__)
+    p.add_argument(
+        "--db-path",
+        default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
+    )
+    p.add_argument("--store-root", default=None)
+    p.add_argument("--dry-run", action="store_true")
+    p.add_argument(
+        "--skip-hdf5", action="store_true",
+        help="Don't generate .h5 clean-waveform files (only sidecars).",
+    )
+    p.add_argument(
+        "--force", action="store_true",
+        help=(
+            "Regenerate sidecars + .h5 even when an existing sidecar's "
+            "blastware.sha256 matches the current BW file.  Use this after "
+            "upgrading seismo-relay to pull in decoder bug fixes (e.g. the "
+            "STRT-rectime byte-offset fix in v0.15.x)."
+        ),
+    )
+    p.add_argument(
+        "--reparse-txt", action="store_true",
+        help=(
+            "Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
+            "current bw_ascii_report parser and overwrite the sidecar's "
+            "bw_report block.  Use this after upgrading the ASCII parser to "
+            "pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
+            "ZC peaks).  No-op for events without a preserved .TXT; safely "
+            "idempotent when the parser hasn't changed."
+        ),
+    )
+    p.add_argument("-v", "--verbose", action="store_true")
+    args = p.parse_args(argv)
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+        datefmt="%H:%M:%S",
+    )
+
+    db_path = Path(args.db_path).expanduser().resolve()
+    store_root = (
+        Path(args.store_root).expanduser().resolve()
+        if args.store_root else db_path.parent / "waveforms"
+    )
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+
+    store = WaveformStore(store_root)
+    db    = SeismoDb(db_path)
+
+    written = skipped = errors = 0
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        serial = serial_dir.name
+        for path in sorted(serial_dir.iterdir()):
+            if not _looks_like_event_file(path):
+                continue
+            sidecar_path = store.sidecar_path_for(serial, path.name)
+            try:
+                bw_sha = event_file_io.file_sha256(path)
+            except Exception as exc:
+                log.error("sha256 failed for %s: %s", path, exc)
+                errors += 1
+                continue
+
+            # Skip when an up-to-date sidecar already exists.
+            #
+            # Two-part freshness check:
+            #   1. blastware.sha256 must match the current BW file (proves
+            #      the sidecar describes THIS file).
+            #   2. source.tool_version must be ≥ current TOOL_VERSION (proves
+            #      the sidecar was written by a build that includes any
+            #      decoder fixes shipped since).
+            # Either part failing → regenerate.  --force bypasses both.
+            #
+            # Tracks whether we're regenerating the sidecar this iteration
+            # so the .h5 logic below knows to refresh that too — staleness
+            # of the sidecar implies staleness of the derived .h5 (both
+            # come out of the same decoder).
+            sidecar_stale = True
+            if sidecar_path.exists() and not args.force and not args.reparse_txt:
+                try:
+                    existing = event_file_io.read_sidecar(sidecar_path)
+                    sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
+                    src_ver = existing.get("source", {}).get("tool_version", "")
+                    def _vt(s):
+                        try:
+                            return tuple(int(p) for p in str(s).split(".")[:3])
+                        except Exception:
+                            return (0, 0, 0)
+                    ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
+                    if sha_ok and ver_ok:
+                        skipped += 1
+                        sidecar_stale = False
+                        continue
+                    if sha_ok and not ver_ok:
+                        log.info(
+                            "regenerating %s (sidecar tool_version=%s < current %s)",
+                            sidecar_path.name, src_ver or "(none)",
+                            event_file_io.TOOL_VERSION,
+                        )
+                except Exception:
+                    pass  # fall through to rewrite
+
+            # Decide path: A5-based (high-fidelity) or BW-only.
+            a5_path = serial_dir / f"{path.name}.a5.pkl"
+            try:
+                if a5_path.exists():
+                    frames = store.load_a5(serial, path.name)
+                    if not frames:
+                        raise RuntimeError("a5_pickle present but unreadable")
+                    # Build an Event by replaying the A5 decoders.  Note:
+                    # the .a5.pkl alone CANNOT recover timestamp /
+                    # record_type / waveform_key / per-channel peaks —
+                    # those live in the 0C record, which isn't saved
+                    # separately.  We seed those from the DB row + the
+                    # existing sidecar below so a re-backfill doesn't
+                    # nuke fields the original save populated.
+                    from minimateplus.client import (
+                        _decode_a5_metadata_into,
+                        _decode_a5_waveform,
+                    )
+                    from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
+                    ev = Event(index=-1)
+                    _decode_a5_metadata_into(frames, ev)
+                    _decode_a5_waveform(frames, ev)
+                    source_kind = "sfm-live"
+                    a5_filename = a5_path.name
+                else:
+                    ev = event_file_io.read_blastware_file(path)
+                    source_kind = "bw-import"
+                    a5_filename = None
+                    from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
+
+                # ── Seed missing fields from the SeismoDb events row ──
+                # The DB row was populated at original save time with peaks,
+                # project info, timestamp, record_type, sample_rate, etc.
+                # All of those survive intact in SQLite; pull them onto the
+                # rebuilt Event so the regenerated sidecar matches what was
+                # there before the backfill ran.
+                db_row = None
+                try:
+                    import sqlite3 as _sql
+                    with _sql.connect(str(db.db_path)) as _conn:
+                        _conn.row_factory = _sql.Row
+                        db_row = _conn.execute(
+                            "SELECT * FROM events "
+                            "WHERE serial=? AND blastware_filename=? "
+                            "LIMIT 1",
+                            (serial, path.name),
+                        ).fetchone()
+                except Exception as exc:
+                    log.debug("DB lookup failed for %s: %s", path.name, exc)
+
+                if db_row is not None:
+                    if ev.sample_rate is None and db_row["sample_rate"]:
+                        ev.sample_rate = int(db_row["sample_rate"])
+                    if not ev.record_type and db_row["record_type"]:
+                        ev.record_type = db_row["record_type"]
+                    if ev._waveform_key is None and db_row["waveform_key"]:
+                        try:
+                            ev._waveform_key = bytes.fromhex(db_row["waveform_key"])
+                        except Exception:
+                            pass
+                    # Timestamp from the ISO-8601 string in the DB row.
+                    if ev.timestamp is None and db_row["timestamp"]:
+                        try:
+                            import datetime as _dt
+                            _t = _dt.datetime.fromisoformat(db_row["timestamp"])
+                            ev.timestamp = Timestamp(
+                                raw=b"", flag=0x10,
+                                year=_t.year, unknown_byte=0,
+                                month=_t.month, day=_t.day,
+                                hour=_t.hour, minute=_t.minute, second=_t.second,
+                            )
+                        except Exception:
+                            pass
+                    # Peaks from the DB row when the A5 decode didn't supply them.
+                    if ev.peak_values is None:
+                        ev.peak_values = PeakValues(
+                            tran=db_row["tran_ppv"],
+                            vert=db_row["vert_ppv"],
+                            long=db_row["long_ppv"],
+                            peak_vector_sum=db_row["peak_vector_sum"],
+                            micl=db_row["mic_ppv"],
+                        )
+                    # Project info from the DB row when the A5 metadata-page
+                    # decode didn't pick it up.
+                    if ev.project_info is None or all(
+                        v in (None, "")
+                        for v in (
+                            (ev.project_info.project          if ev.project_info else None),
+                            (ev.project_info.client           if ev.project_info else None),
+                            (ev.project_info.operator         if ev.project_info else None),
+                            (ev.project_info.sensor_location  if ev.project_info else None),
+                        )
+                    ):
+                        ev.project_info = ProjectInfo(
+                            project=db_row["project"],
+                            client=db_row["client"],
+                            operator=db_row["operator"],
+                            sensor_location=db_row["sensor_location"],
+                        )
+
+                # Derive total_samples when we have both rectime + sample_rate.
+                # The decoder's STRT-derived value can be a buffer offset
+                # rather than a sample count — drop it in that case.
+                if ev.sample_rate and ev.rectime_seconds:
+                    derived = int(round(ev.sample_rate * ev.rectime_seconds))
+                    if (ev.total_samples is None
+                            or ev.total_samples > derived * 2
+                            or ev.total_samples < derived // 4):
+                        ev.total_samples = derived
+
+                # Preserve user-edited review state + extensions + the
+                # bw_report block from the existing sidecar so a backfill
+                # never wipes them out.  The bw_report block originates
+                # from the paired .TXT ASCII report parsed at ORIGINAL
+                # import time (ach forward / direct upload); for older
+                # events the .TXT is not in the waveform store, so we
+                # can't re-derive it from disk (see --reparse-txt below
+                # for events that do have one).  event_to_sidecar_dict
+                # takes a BwAsciiReport dataclass (not a dict), so we
+                # overlay the existing block after regen, not as a kwarg.
+                preserved_review     = None
+                preserved_ext        = None
+                preserved_bw_report  = None
+                preserved_txt_fn     = None
+                if sidecar_path.exists():
+                    try:
+                        _existing = event_file_io.read_sidecar(sidecar_path)
+                        preserved_review    = _existing.get("review")
+                        preserved_ext       = _existing.get("extensions")
+                        preserved_bw_report = _existing.get("bw_report")
+                        # Preserve txt_filename so backfills don't blank out the
+                        # pointer to the saved raw .TXT (events ingested after
+                        # 2026-05-27 have this).
+                        preserved_txt_fn    = (_existing.get("source") or {}).get("txt_filename")
+                    except Exception:
+                        pass
+
+                # --reparse-txt: if a .TXT is preserved on disk, run the
+                # current parser against it and overwrite the bw_report
+                # block.  Picks up post-ingest parser fixes (e.g. the
+                # 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
+                if args.reparse_txt and preserved_txt_fn:
+                    try:
+                        from minimateplus import bw_ascii_report
+                        txt_path = store.txt_path_for(serial, path.name)
+                        if txt_path.exists():
+                            refreshed = bw_ascii_report.parse_report_file(txt_path)
+                            preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
+                            log.debug("reparsed bw_report from %s", txt_path.name)
+                        else:
+                            log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
+                                      txt_path, preserved_txt_fn)
+                    except Exception as exc:
+                        log.warning("--reparse-txt failed for %s: %s", path.name, exc)
+
+                # Overlay BW ASCII report fields onto the rebuilt Event
+                # BEFORE the sidecar + DB write.  Mirrors what the ingest
+                # path does — BW's reported peaks (and sample_rate /
+                # record_time) win over codec output where present.
+                #
+                # Without this step, --force backfill silently overwrites
+                # the bw_report-overlaid DB columns with codec-derived
+                # values, which is wrong for events the codec doesn't
+                # fully decode (e.g. waveform walker edge cases on
+                # SP0/SS0/SV0-style events, or histogram sub-formats with
+                # byte[5]!=0 that aren't yet RE'd).  Net effect was PVS=0
+                # on three top-10 events on 2026-05-22.
+                if preserved_bw_report:
+                    event_file_io.apply_bw_report_dict_to_event(
+                        ev, preserved_bw_report,
+                    )
+
+                sidecar = event_file_io.event_to_sidecar_dict(
+                    ev,
+                    serial=serial,
+                    blastware_filename=path.name,
+                    blastware_filesize=path.stat().st_size,
+                    blastware_sha256=bw_sha,
+                    source_kind=source_kind,
+                    a5_pickle_filename=a5_filename,
+                    txt_filename=preserved_txt_fn,
+                    review=preserved_review,
+                    extensions=preserved_ext,
+                )
+                if preserved_bw_report is not None:
+                    sidecar["bw_report"] = preserved_bw_report
+
+                # Also emit the .h5 clean-waveform file when:
+                #   - it's missing, OR
+                #   - --force was passed, OR
+                #   - the sidecar is being regenerated this iteration
+                #     (sha mismatch / tool_version too old).  The .h5 and
+                #     the sidecar are both derived from the same decoder
+                #     output, so if the sidecar is stale, so is the .h5.
+                #
+                # Both waveform and histogram bodies now decode to real
+                # samples via event_file_io.read_blastware_file → either
+                # waveform_codec.decode_waveform_v2 or histogram_codec.
+                # decode_histogram_body.  If samples are still empty after
+                # both codecs run, it's a genuine "we can't decode this
+                # file" case (truncated, malformed, or unknown mode);
+                # skip the .h5 write so we don't replace whatever's
+                # there with an empty placeholder.
+                has_samples = bool(
+                    ev.raw_samples and any(
+                        ev.raw_samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL")
+                    )
+                )
+                hdf5_path = store.hdf5_path_for(serial, path.name)
+                hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
+                hdf5_action = "kept"
+                need_h5 = (
+                    not args.skip_hdf5
+                    and (args.force or not hdf5_path.exists() or sidecar_stale)
+                    and has_samples
+                )
+                if not has_samples and not args.skip_hdf5:
+                    hdf5_action = "skipped-undecodable"
+                if need_h5:
+                    if args.dry_run:
+                        hdf5_action = "would (re)write"
+                    else:
+                        try:
+                            event_hdf5.write_event_hdf5(
+                                hdf5_path, ev,
+                                serial=serial,
+                                geo_range="normal",
+                                source_kind=source_kind,
+                            )
+                            hdf5_action = "rewrote" if hdf5_filename else "wrote"  # pre-write state
+                            hdf5_filename = hdf5_path.name
+                        except Exception as exc:
+                            log.warning("HDF5 write failed for %s: %s", path.name, exc)
+                            hdf5_action = "FAILED"
+
+                if args.dry_run:
+                    print(f"  [DRY ] would write {sidecar_path.name} "
+                          f"+ .h5 ({hdf5_action})  source={source_kind}")
+                    written += 1
+                    continue
+
+                event_file_io.write_sidecar(sidecar_path, sidecar)
+
+                # Best-effort: keep the SQL row's sidecar_filename in sync
+                # by upserting via insert_events (it dedups on serial+ts).
+                try:
+                    db.insert_events(
+                        [ev], serial=serial,
+                        waveform_records=(
+                            {ev._waveform_key.hex(): {
+                                "filename":           path.name,
+                                "filesize":           path.stat().st_size,
+                                "a5_pickle_filename": a5_filename,
+                                "sidecar_filename":   sidecar_path.name,
+                            }}
+                            if ev._waveform_key else None
+                        ),
+                        device_family="series3",
+                    )
+                except Exception as exc:
+                    log.warning("DB upsert failed for %s: %s", path.name, exc)
+
+                print(f"  [OK  ] {path.name}  → {sidecar_path.name} "
+                      f"+ h5 ({hdf5_action})  source={source_kind}")
+                written += 1
+
+            except Exception as exc:
+                log.error("backfill failed for %s: %s", path, exc, exc_info=args.verbose)
+                errors += 1
+
+    print(f"\nDone.  written={written}  skipped(uptodate)={skipped}  errors={errors}")
+    return 0 if errors == 0 else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
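Illustrative sketch, not part of the patch: the two-part freshness check in backfill_sidecars.py compares versions as integer tuples, which matters once a component reaches two digits. The helper names `version_tuple` and `is_fresh` and the sample sidecar dicts are assumptions made for this example; only the parsing logic mirrors `_vt` in the script.

```python
def version_tuple(s):
    """Parse 'MAJOR.MINOR.PATCH' into ints; (0, 0, 0) when unparseable."""
    try:
        return tuple(int(p) for p in str(s).split(".")[:3])
    except Exception:
        return (0, 0, 0)


def is_fresh(sidecar, file_sha256, current_version):
    """True only when the sidecar describes this file AND is new enough."""
    sha_ok = sidecar.get("blastware", {}).get("sha256") == file_sha256
    src_ver = sidecar.get("source", {}).get("tool_version", "")
    ver_ok = version_tuple(src_ver) >= version_tuple(current_version)
    return sha_ok and ver_ok


# Plain string comparison would rank "0.9.0" above "0.21.1"; tuples do not.
print("0.9.0" > "0.21.1")                                # string ordering
print(version_tuple("0.9.0") > version_tuple("0.21.1"))  # numeric ordering

# Hypothetical sidecars for illustration.
old = {"blastware": {"sha256": "abc"}, "source": {"tool_version": "0.9.0"}}
new = {"blastware": {"sha256": "abc"}, "source": {"tool_version": "0.21.1"}}
print(is_fresh(old, "abc", "0.21.1"), is_fresh(new, "abc", "0.21.1"))
```

A sidecar with a missing or malformed `tool_version` parses to `(0, 0, 0)`, so it is always treated as stale and regenerated.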
@@ -0,0 +1,331 @@
+"""
+scripts/backfill_thor_events.py — re-process existing Thor (Series IV)
+events so their sidecars carry the bw_report block produced by
+``micromate.idf_to_bw_report.build_bw_report_from_idf`` and their .h5
+clean-waveform files are regenerated (both IDFW and IDFH events).
+
+Why this exists
+───────────────
+
+Thor events ingested before v0.21.0 (or during the v0.21.0 ingest bug
+window fixed in commit bee1185) have sidecars with only
+``extensions.idf_report`` — no ``bw_report`` block.  Without
+``bw_report``, the SFM PDF renderer falls back to DB-only fields
+(misses sensor-self-check, full per-channel breakdown, mic dB(L)),
+and the modal chart 404s on ``/waveform.json`` for IDFW events
+because no .h5 was written when the codec failed at ingest.
+
+Re-forwarding from thor-watcher would also fix this, but that requires
+operator coordination on every watcher machine and uses bandwidth this
+script doesn't.
+
+What this does
+──────────────
+
+Walks ``<store>/<serial>/<filename>`` for ``.IDFW`` / ``.IDFH`` files
+and, for each one:
+
+  1. Reads the existing sidecar (preserving review state + captured_at).
+  2. Re-runs ``micromate.idf_file.read_idf_file()`` on the binary
+     bytes — passing ``data=`` so the codec doesn't try to read from
+     a path it doesn't know.
+  3. Pulls ``extensions.idf_report`` (the raw parsed Thor dict the
+     v0.18.0+ ingest path already stashed) and runs the v0.21.0
+     ``build_bw_report_from_idf`` adapter against it.
+  4. Writes the refreshed sidecar with the new ``bw_report``,
+     bumped ``source.tool_version``, but preserved ``review`` block
+     + the original ``captured_at`` timestamp.
+  5. Regenerates the .h5 waveform file via the existing
+     ``event_hdf5`` writer.  For IDFW that's the decoded per-sample
+     stream; for IDFH it's a 1-sample-per-interval synthesised array
+     (peak ADC count per channel) so the renderer's bar-chart code
+     has data to group on.  Mic peak psi from the binary is merged
+     onto the IdfEvent before the bridge so the h5 writer's per-count
+     mic scale factor lands on a sensible value (without this the
+     mic chart on Thor events plots dB(L)-as-pseudo-psi and shows
+     bomb-level numbers).
+
+Idempotent.  Re-running it after a parser/adapter change just re-writes
+sidecars and .h5 files; no DB writes, no thor-watcher coordination.
+
+Usage
+─────
+
+    python scripts/backfill_thor_events.py [--store-root PATH]
+                                           [--dry-run]
+                                           [--skip-hdf5]
+                                           [--force]
+                                           [-v]
+
+By default, refreshes any Thor event whose sidecar is missing
+``bw_report`` OR whose ``source.tool_version`` is older than the
+current ``TOOL_VERSION``.  ``--force`` refreshes every Thor event
+regardless.
+"""
+
+from __future__ import annotations
+
+import argparse
+import logging
+import sys
+from pathlib import Path
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+from sfm.waveform_store import WaveformStore
+
+log = logging.getLogger("backfill_thor_events")
+
+
+def _is_thor_event(path: Path) -> bool:
+    if not path.is_file():
+        return False
+    if path.name.endswith((".sfm.json", ".h5", "_ASCII.TXT")):
+        return False
+    return path.suffix.upper() in (".IDFW", ".IDFH")
+
+
+def _vtuple(s: str) -> tuple:
+    try:
+        return tuple(int(p) for p in str(s).split(".")[:3])
+    except Exception:
+        return (0, 0, 0)
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__)
+    p.add_argument(
+        "--db-path",
+        default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
+        help="Used only to derive the default --store-root.",
+    )
+    p.add_argument("--store-root", default=None)
+    p.add_argument("--dry-run", action="store_true")
+    p.add_argument("--skip-hdf5", action="store_true",
+                   help="Don't regenerate .h5 files (IDFW or IDFH); sidecars only.")
+    p.add_argument("--force", action="store_true",
+                   help="Refresh every Thor event, not just ones with stale or missing bw_report.")
+    p.add_argument("-v", "--verbose", action="store_true")
+    args = p.parse_args(argv)
+
+    logging.basicConfig(
+        level=logging.DEBUG if args.verbose else logging.INFO,
+        format="%(asctime)s  %(levelname)-7s  %(name)s  %(message)s",
+        datefmt="%H:%M:%S",
+    )
+
+    db_path = Path(args.db_path).expanduser().resolve()
+    store_root = (
+        Path(args.store_root).expanduser().resolve()
+        if args.store_root else db_path.parent / "waveforms"
+    )
+    if not store_root.exists():
+        log.error("store root not found: %s", store_root)
+        return 1
+    store = WaveformStore(store_root)
+    log.info("store root: %s", store_root)
+    log.info("current TOOL_VERSION: %s", event_file_io.TOOL_VERSION)
+
+    refreshed = skipped = errors = h5_written = 0
+
+    # Lazy imports so any one of these failing produces a useful error
+    # message rather than crashing module-load.
+    from micromate.idf_file import read_idf_file
+    from micromate.idf_to_bw_report import build_bw_report_from_idf
+
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        serial = serial_dir.name
+        for path in sorted(serial_dir.iterdir()):
+            if not _is_thor_event(path):
+                continue
+
+            sidecar_path = store.sidecar_path_for(serial, path.name)
+            if not sidecar_path.exists():
+                log.debug("%s: no sidecar — skipping (this is a binary without ingest history)",
+                          path.name)
+                skipped += 1
+                continue
+
+            try:
+                existing = event_file_io.read_sidecar(sidecar_path)
+            except Exception as exc:
+                log.warning("%s: failed to read sidecar — %s", path.name, exc)
+                errors += 1
+                continue
+
+            has_bw_report = bool(existing.get("bw_report"))
+            existing_version = (existing.get("source") or {}).get("tool_version", "")
+            up_to_date = (
+                has_bw_report
+                and _vtuple(existing_version) >= _vtuple(event_file_io.TOOL_VERSION)
+            )
+            if up_to_date and not args.force:
+                skipped += 1
+                continue
+
+            # Re-decode the binary.  Catch + log; continue with .txt-only
+            # data if it fails (matches the live ingest path's behavior).
+            res = None
+            idf_samples = idf_intervals = None
+            binary_md = None
+            is_histogram = path.suffix.upper() == ".IDFH"
+            try:
+                binary_bytes = path.read_bytes()
+                res = read_idf_file(path, data=binary_bytes)
+                idf_samples = res.samples or None
+                idf_intervals = res.intervals
+                binary_md = res.binary_metadata
+                is_histogram = res.intervals is not None
+            except NotImplementedError:
+                # sig-B / Blastware-stray binary; no samples but adapter
+                # can still produce a bw_report from extensions.idf_report.
+                log.debug("%s: binary codec NotImplementedError (sig-B / BW-stray); proceeding from sidecar's idf_report only", path.name)
+            except Exception as exc:
+                log.warning("%s: binary decode failed — %s; proceeding from sidecar's idf_report only", path.name, exc)
+
+            # Run the adapter.  Pull report_dict from
+            # extensions.idf_report (the v0.18.0+ ingest preserved it).
+            report_dict = (existing.get("extensions") or {}).get("idf_report") or {}
+            if not report_dict and binary_md is None:
+                log.debug("%s: no idf_report in sidecar AND no binary metadata — nothing to project", path.name)
+                skipped += 1
+                continue
+
+            try:
+                bw_report = build_bw_report_from_idf(
+                    report_dict, binary_md=binary_md,
+                    intervals=idf_intervals, is_histogram=is_histogram,
+                )
+            except Exception as exc:
+                log.warning("%s: adapter failed — %s", path.name, exc)
+                errors += 1
+                continue
+
+            # Build the new sidecar by overlaying refreshed fields onto
+            # the existing one — preserves review, captured_at, blastware
+            # block, source.kind, etc.
+            new_sidecar = dict(existing)  # shallow copy
+            new_sidecar["bw_report"] = bw_report
+            src = dict(new_sidecar.get("source") or {})
+            src["tool_version"] = event_file_io.TOOL_VERSION
+            new_sidecar["source"] = src
+
+            # Preserve histogram intervals if the binary decoded them
+            # (improves over the original ingest if that one ran before
+            # the bee1185 codec fix).
+            if idf_intervals is not None:
+                ext = dict(new_sidecar.get("extensions") or {})
+                ext["idf_intervals"] = [
+                    {
+                        "offset":     iv.offset,
+                        "tran_peak":  iv.peak_count("Tran"),
+                        "tran_halfp": iv.tran_halfp,
+                        "tran_freq":  iv.freq_hz("Tran"),
+                        "vert_peak":  iv.peak_count("Vert"),
+                        "vert_halfp": iv.vert_halfp,
+                        "vert_freq":  iv.freq_hz("Vert"),
+                        "long_peak":  iv.peak_count("Long"),
+                        "long_halfp": iv.long_halfp,
+                        "long_freq":  iv.freq_hz("Long"),
+                        "mic_peak":   iv.peak_count("MicL"),
+                        "mic_halfp":  iv.micl_halfp,
+                        "mic_freq":   iv.freq_hz("MicL"),
+                    }
+                    for iv in idf_intervals
+                ]
+                new_sidecar["extensions"] = ext
+
+            if args.dry_run:
+                will_write_h5 = (idf_samples or idf_intervals) and not args.skip_hdf5
+                log.info("[DRY] %s/%s — would refresh sidecar (bw_report=%s, h5=%s)",
+                         serial, path.name,
+                         "added" if not has_bw_report else "refreshed",
+                         "would write" if will_write_h5 else "skipped")
+            else:
+                event_file_io.write_sidecar(sidecar_path, new_sidecar)
+                log.info("%s/%s — sidecar refreshed (bw_report=%s, intervals=%d)",
+                         serial, path.name,
+                         "added" if not has_bw_report else "refreshed",
+                         len(idf_intervals) if idf_intervals else 0)
+            refreshed += 1
+
+            # Regenerate .h5 by replaying the same IdfEvent → Event bridge
+            # save_imported_idf uses.  For IDFW we write the decoded per-
+            # sample arrays.  For IDFH we synthesise a 1-sample-per-interval
+            # array (peak ADC count per channel per interval) so the
+            # renderer's bar-chart code has something to group on.
+            # Pre-condition: either real samples (IDFW) or decoded intervals
+            # (IDFH).  Skip otherwise.
+            have_data = bool(idf_samples) or bool(idf_intervals)
+            if have_data and not args.skip_hdf5:
+                from sfm import event_hdf5
+                hdf5_path = store.hdf5_path_for(serial, path.name)
+                if args.dry_run:
+                    log.debug("[DRY] would write %s", hdf5_path.name)
+                else:
+                    try:
+                        from micromate import IdfEvent
+                        from minimateplus.event_file_io import file_sha256
+                        idf_event = IdfEvent.from_report(report_dict, path.name)
+
+                        # Merge the binary-derived mic peak psi (only the
+                        # binary path knows the proper psi value; the .txt
+                        # carries dB(L)).  Without this, the h5 writer's
+                        # per-count mic factor is computed against the
+                        # dB(L) value-as-pseudo-psi and the mic chart
+                        # scales wildly.
+                        if (binary_md is not None and res is not None
+                                and res.event.peaks.mic_pspl_psi is not None):
+                            idf_event.peaks.mic_pspl_psi = res.event.peaks.mic_pspl_psi
+
+                        sha256 = file_sha256(path)
+                        waveform_key = bytes.fromhex(sha256)[:16]
+                        ev = idf_event.to_minimateplus_event(waveform_key)
+
+                        if is_histogram and idf_intervals:
+                            # 1 sample per interval per channel — same
+                            # synthesis save_imported_idf uses.  The h5
+                            # writer's count×geo_fs/32768 conversion turns
+                            # each peak-ADC-count into the bar's physical
+                            # value.
+                            ev.raw_samples = {
+                                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
+                                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
+                                "Long": [iv.peak_count("Long") for iv in idf_intervals],
+                                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
+                            }
+                            ev.total_samples = ev.total_samples or len(idf_intervals)
+                        elif idf_samples:
+                            ev.raw_samples = idf_samples
+                            n_samp = max(
+                                (len(idf_samples.get(ch, []))
+                                 for ch in ("Tran", "Vert", "Long", "MicL")),
+                                default=0,
+                            )
+                            ev.total_samples = ev.total_samples or n_samp
+
+                        event_hdf5.write_event_hdf5(
+                            hdf5_path, ev,
+                            serial=serial,
+                            geo_range="normal",
+                            source_kind="idf-import",
+                            tool_version=event_file_io.TOOL_VERSION,
+                        )
+                        h5_written += 1
+                        log.debug("%s/%s — .h5 written (%s)",
+                                  serial, path.name,
+                                  f"{len(idf_intervals)} intervals" if is_histogram and idf_intervals
+                                  else f"{sum(len(v) for v in (idf_samples or {}).values())} samples")
+                    except Exception as exc:
+                        log.warning("%s/%s — .h5 write failed: %s",
+                                    serial, path.name, exc)
+
+    log.info("Done.  refreshed=%d  skipped=%d  errors=%d  h5_written=%d",
+             refreshed, skipped, errors, h5_written)
+    return 0 if errors == 0 else 2
+
+
+if __name__ == "__main__":
+    sys.exit(main())
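The IDFH branch of the .h5 regeneration above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `Interval` is a hypothetical stand-in for the decoder's interval objects (only `peak_count()` is modelled), and `GEO_FS` is an assumed placeholder, since the geophone full-scale value the h5 writer uses is not shown in this diff. Only the count×geo_fs/32768 shape of the conversion is taken from the comment in the script.

```python
from dataclasses import dataclass

CHANNELS = ("Tran", "Vert", "Long", "MicL")
GEO_FS = 10.0  # ASSUMPTION: placeholder full-scale value, not the project's constant


@dataclass
class Interval:
    """Hypothetical stand-in for one decoded histogram interval."""
    peaks: dict  # channel name -> peak ADC count for this interval

    def peak_count(self, channel: str) -> int:
        return self.peaks[channel]


def synthesise_raw_samples(intervals):
    """Build one sample per interval per channel from the interval peaks."""
    return {ch: [iv.peak_count(ch) for iv in intervals] for ch in CHANNELS}


def counts_to_physical(counts, full_scale=GEO_FS):
    """Scale ADC counts to physical units (count * full_scale / 32768).

    Only the geophone-style scaling is sketched here; the mic channel
    uses its own per-count factor in the real writer.
    """
    return [c * full_scale / 32768 for c in counts]


# Two made-up intervals, purely for illustration.
intervals = [
    Interval({"Tran": 16384, "Vert": 8192, "Long": 0, "MicL": 13}),
    Interval({"Tran": 32768, "Vert": 4096, "Long": 2048, "MicL": 7}),
]
raw_samples = synthesise_raw_samples(intervals)
tran_bars = counts_to_physical(raw_samples["Tran"])
```

Each channel ends up with exactly `len(intervals)` samples, which is what lets a bar-chart renderer group one bar per interval.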
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+# Fire-and-forget Stop Monitoring loop — for wedged or constantly-triggering units.
+#
+# Hammers POST /device/stop_monitoring_blind in a tight loop.  The endpoint
+# opens TCP, dumps SESSION_RESET + a few copies of the SUB 0x97 frame, and
+# closes — without ever reading an S3 response.  Each TCP-won attempt is
+# ~50ms of wire activity instead of the multi-frame handshake the regular
+# rescue endpoint does, so windows that are too small for the full rescue
+# can still land a stop-monitoring command.
+#
+# Usage:
+#   ./blind_stop.sh <host> [tcp_port]
+#
+# Env:
+#   SFM_BASE_URL    Default: http://localhost:8200 (SFM direct).
+#                   Set to http://localhost:8001/api/sfm to route through
+#                   Terra-View's proxy.
+#   MAX_ATTEMPTS    Default: 600
+#   SLEEP_S         Default: 0  (no backoff — hammer it)
+#   MAX_TIME_S      Default: 15
+#   CONNECT_TIMEOUT Default: 5
+#   REPEAT          Frames per TCP session (default 3 — increases hit rate
+#                   if the device is busy reading its own buffer).
+#   STOP_ON_OK      Default: 1.  Set to 0 to keep hammering indefinitely
+#                   even after successful sends (every 503 means the device
+#                   is in *another* session, every 200 means our bytes got
+#                   through — but the device may not have processed them).
+
+set -u
+
+host="${1:-}"
+tcp_port="${2:-9034}"
+if [[ -z "$host" ]]; then
+  echo "usage: $0 <host> [tcp_port]" >&2
+  exit 2
+fi
+
+base="${SFM_BASE_URL:-http://localhost:8200}"
+max_attempts="${MAX_ATTEMPTS:-600}"
+sleep_s="${SLEEP_S:-0}"
+max_time_s="${MAX_TIME_S:-15}"
+connect_timeout="${CONNECT_TIMEOUT:-5}"
+repeat="${REPEAT:-3}"
+stop_on_ok="${STOP_ON_OK:-1}"
+
+url="${base}/device/stop_monitoring_blind?host=${host}&tcp_port=${tcp_port}&connect_timeout=${connect_timeout}&repeat=${repeat}"
+
+echo "blind_stop: target ${host}:${tcp_port}  connect_timeout=${connect_timeout}s  repeat=${repeat}"
+echo "blind_stop: POST ${url}"
+echo "blind_stop: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
+echo "blind_stop: stop_on_ok=${stop_on_ok}"
+echo
+
+ok_count=0
+busy_count=0
+err_count=0
+started=$(date +%s)
+
+for ((i=1; i<=max_attempts; i++)); do
+  printf "[%4d] %s  " "$i" "$(date +%H:%M:%S)"
+  http_code=$(curl -sS -o /tmp/blind_resp.$$ -w "%{http_code}" \
+    --max-time "$max_time_s" \
+    -X POST "$url" || echo "000")
+  body=$(cat /tmp/blind_resp.$$ 2>/dev/null || true)
+  rm -f /tmp/blind_resp.$$
+
+  case "$http_code" in
+    200|201)
+      ok_count=$((ok_count + 1))
+      echo "SENT  $body"
+      if [[ "$stop_on_ok" == "1" ]]; then
+        elapsed=$(( $(date +%s) - started ))
+        echo
+        echo "blind_stop: success after ${i} attempts (${elapsed}s).  ok=${ok_count} busy=${busy_count} err=${err_count}"
+        echo "blind_stop: NEXT — wait ~10s, then try the full rescue:"
+        echo "  /home/serversdown/seismo-relay/scripts/rescue_device.sh ${host} ${tcp_port}"
+        exit 0
+      fi
+      ;;
+    503)
+      busy_count=$((busy_count + 1))
+      echo "busy (503)"
+      ;;
+    000*)  # curl failed; -w prints 000 and the "|| echo" fallback can append another
+      err_count=$((err_count + 1))
+      echo "curl error"
+      ;;
+    *)
+      err_count=$((err_count + 1))
+      echo "HTTP $http_code  $body" | head -c 400
+      echo
+      ;;
+  esac
+  [[ "$sleep_s" != "0" ]] && sleep "$sleep_s"
+done
+
+elapsed=$(( $(date +%s) - started ))
+echo
+echo "blind_stop: gave up after ${max_attempts} attempts (${elapsed}s).  ok=${ok_count} busy=${busy_count} err=${err_count}" >&2
+exit 1
@@ -0,0 +1,185 @@
+"""
+scripts/check_bw_report_preservation.py — verify that running backfill_sidecars
+doesn't wipe the `bw_report` block from sidecars that already had one.
+
+Two-step workflow:
+
+  # Before running backfill — capture a baseline snapshot:
+  python scripts/check_bw_report_preservation.py snapshot \
+      --store-root /path/to/waveforms \
+      --out before.json
+
+  # Run backfill:
+  python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force
+
+  # After backfill — diff against the baseline:
+  python scripts/check_bw_report_preservation.py diff \
+      --store-root /path/to/waveforms \
+      --baseline before.json
+
+The diff classifies every sidecar into one of:
+
+  PRESERVED      had bw_report before, has same hash now  ← GOOD
+  CHANGED        had bw_report before, has different hash now  ← suspicious
+                 (backfill should only ever copy the block verbatim)
+  WIPED          had bw_report before, doesn't now  ← BUG — data loss
+  STILL_MISSING  didn't have bw_report before, still doesn't  ← expected
+  NEW            didn't have bw_report before, has one now
+                 (only possible if a re-ingest happened between snapshots;
+                  shouldn't happen during backfill)
+  REMOVED        sidecar existed in baseline, file is gone now
+  ADDED          sidecar didn't exist in baseline, exists now
+
+Exit code is 0 if no WIPED or CHANGED entries are found, 1 if any are, and 2 on a missing store root or baseline.
+"""
+
+from __future__ import annotations
+
+import argparse
+import hashlib
+import json
+import sys
+from pathlib import Path
+from typing import Optional
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+
+
+def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
+    """Canonical-JSON hash of the bw_report block, or None if absent."""
+    br = sidecar_data.get("bw_report")
+    if not br:
+        return None
+    # sort_keys for stable hashing across dict-ordering differences
+    blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(blob.encode()).hexdigest()
+
+
+def _scan_store(store_root: Path) -> dict:
+    """Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.
+
+    Relpath is `<serial>/<filename>` — stable across machines/snapshots.
+    """
+    out: dict[str, Optional[str]] = {}
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        for sidecar in sorted(serial_dir.glob("*.sfm.json")):
+            relpath = f"{serial_dir.name}/{sidecar.name}"
+            try:
+                data = event_file_io.read_sidecar(sidecar)
+            except Exception as exc:
+                print(f"  WARN: failed to read {relpath}: {exc}", file=sys.stderr)
+                continue
+            out[relpath] = _bw_report_hash(data)
+    return out
+
+
+def cmd_snapshot(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    out_path = Path(args.out).expanduser().resolve()
+
+    print(f"Scanning {store_root} …")
+    snapshot = _scan_store(store_root)
+
+    with_bw    = sum(1 for v in snapshot.values() if v is not None)
+    without_bw = sum(1 for v in snapshot.values() if v is None)
+    print(f"  total sidecars:     {len(snapshot)}")
+    print(f"  with bw_report:     {with_bw}")
+    print(f"  without bw_report:  {without_bw}")
+
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    with open(out_path, "w") as f:
+        json.dump({
+            "store_root":  str(store_root),
+            "total":       len(snapshot),
+            "with_bw":     with_bw,
+            "sidecars":    snapshot,
+        }, f, indent=2, sort_keys=True)
+    print(f"Wrote baseline → {out_path}")
+    return 0
+
+
+def cmd_diff(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    baseline_path = Path(args.baseline).expanduser().resolve()
+    if not baseline_path.exists():
+        print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
+        return 2
+
+    with open(baseline_path) as f:
+        baseline = json.load(f)
+    before = baseline["sidecars"]
+    print(f"Scanning {store_root} for comparison against {baseline_path.name} …")
+    after = _scan_store(store_root)
+
+    classes = {k: [] for k in (
+        "PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
+    )}
+    all_keys = set(before) | set(after)
+    for key in sorted(all_keys):
+        b = before.get(key, "__MISSING__")
+        a = after.get(key, "__MISSING__")
+        if b == "__MISSING__":
+            classes["ADDED"].append(key)
+        elif a == "__MISSING__":
+            classes["REMOVED"].append(key)
+        elif b is None and a is None:
+            classes["STILL_MISSING"].append(key)
+        elif b is None and a is not None:
+            classes["NEW"].append(key)
+        elif b is not None and a is None:
+            classes["WIPED"].append(key)
+        elif b == a:
+            classes["PRESERVED"].append(key)
+        else:
+            classes["CHANGED"].append(key)
+
+    print()
+    print(f"{'class':16s} {'count':>7s}")
+    print("-" * 24)
+    for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
+              "NEW", "ADDED", "REMOVED"):
+        print(f"{k:16s} {len(classes[k]):>7d}")
+
+    # Show samples of the concerning classes
+    for k in ("WIPED", "CHANGED"):
+        if classes[k]:
+            print(f"\n=== {k} samples (up to 10) ===")
+            for key in classes[k][:10]:
+                print(f"  {key}")
+
+    if classes["WIPED"] or classes["CHANGED"]:
+        print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
+        return 1
+    print("\nbw_report preservation looks intact.")
+    return 0
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    sub = p.add_subparsers(dest="cmd", required=True)
+
+    p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
+    p_snap.add_argument("--store-root", required=True)
+    p_snap.add_argument("--out", required=True, help="output JSON path")
+    p_snap.set_defaults(func=cmd_snapshot)
+
+    p_diff = sub.add_parser("diff", help="diff current store against a baseline")
+    p_diff.add_argument("--store-root", required=True)
+    p_diff.add_argument("--baseline",   required=True, help="JSON from `snapshot`")
+    p_diff.set_defaults(func=cmd_diff)
+
+    args = p.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
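The hashing and classification steps of the preservation check can be sketched on their own. This is a minimal illustration under stated assumptions: the sidecar dicts below are made-up sample data, and `classify` is a hypothetical helper that covers only sidecars present in both snapshots (the ADDED and REMOVED cases are left out).

```python
import hashlib
import json


def bw_report_hash(sidecar):
    """Return a SHA-256 over canonical JSON of bw_report, or None if absent."""
    report = sidecar.get("bw_report")
    if not report:
        return None
    # sort_keys + fixed separators give the same bytes for any key order.
    blob = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()


def classify(before, after):
    """Classify one sidecar by its before/after bw_report hash."""
    if before is None and after is None:
        return "STILL_MISSING"
    if before is None:
        return "NEW"
    if after is None:
        return "WIPED"
    return "PRESERVED" if before == after else "CHANGED"


# Made-up sidecars, purely for illustration.
a = {"bw_report": {"peak": 1.5, "unit": "in/s"}}
b = {"bw_report": {"unit": "in/s", "peak": 1.5}}  # same content, other key order
c = {"bw_report": {"peak": 2.0, "unit": "in/s"}}
d = {}  # no bw_report at all

same_order_independent = bw_report_hash(a) == bw_report_hash(b)
verdicts = [
    classify(bw_report_hash(a), bw_report_hash(b)),
    classify(bw_report_hash(a), bw_report_hash(c)),
    classify(bw_report_hash(a), bw_report_hash(d)),
    classify(bw_report_hash(d), bw_report_hash(d)),
]
```

Hashing canonical JSON rather than the raw file bytes means a rewrite that only reorders keys or changes whitespace is still reported as PRESERVED.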
@@ -0,0 +1,151 @@
+"""
+scripts/repair_unknown_serials.py — re-attribute events stuck under
+`serial = 'UNKNOWN'` to their correct serial by decoding the BW filename.
+
+Why this is needed
+──────────────────
+The /db/import/blastware_file endpoint had a bug (fixed in commit a032fa5+1
+on the ach-report-ingestion branch) where every forwarded event was inserted
+with serial='UNKNOWN' because the endpoint's `_serial_from_event(ev)` stub
+returned None and never consulted the BW-filename serial that
+`WaveformStore.save_imported_bw()` had already decoded.
+
+Effect on a server that ran a buggy version: every forwarded event's
+SeismoDb row has `serial='UNKNOWN'`, even though the on-disk waveform
+store has correctly bucketed the files into `BE<NNNN>/` folders.  So
+the BW binaries / sidecars / HDF5s are fine, but `/db/units` and
+`/db/events?serial=...` queries don't surface the events.
+
+This script
+───────────
+Walks the events table looking for rows with `serial='UNKNOWN'` and
+re-attributes each one to the serial decoded from its
+`blastware_filename` column.  If the row's serial would collide with
+an existing row (already-correct duplicate from a later re-forward),
+the UNKNOWN row is deleted.  Otherwise the row's `serial` column is
+updated in-place.
+
+Idempotent: re-running after a successful repair finds zero matching
+rows and exits cleanly.
+
+Usage
+─────
+  # Dry-run (default): print what would change, don't touch the DB
+  python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db
+
+  # Apply the repair
+  python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db --apply
+"""
+
+from __future__ import annotations
+
+import argparse
+import sqlite3
+import sys
+from pathlib import Path
+
+# Reach into sfm.waveform_store for the serial decoder.  This script
+# is run from the repo root via `python -m scripts.repair_unknown_serials`.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+from sfm.waveform_store import _serial_from_bw_filename
+
+
+def main(argv: list[str] | None = None) -> int:
+    p = argparse.ArgumentParser(
+        description="Re-attribute events stuck under serial='UNKNOWN'.",
+    )
+    p.add_argument(
+        "--db", required=True, type=Path,
+        help="Path to seismo_relay.db (e.g. bridges/captures/seismo_relay.db)",
+    )
+    p.add_argument(
+        "--apply", action="store_true",
+        help="Apply the repair.  Without this flag the script runs in "
+             "dry-run mode and only reports what would change.",
+    )
+    args = p.parse_args(argv)
+
+    if not args.db.exists():
+        print(f"DB not found: {args.db}", file=sys.stderr)
+        return 2
+
+    conn = sqlite3.connect(str(args.db))
+    conn.row_factory = sqlite3.Row
+
+    rows = list(conn.execute(
+        "SELECT id, serial, timestamp, blastware_filename "
+        "  FROM events "
+        " WHERE serial = 'UNKNOWN' "
+        " ORDER BY timestamp",
+    ))
+    print(f"Found {len(rows)} UNKNOWN-serial rows in events table.")
+    if not rows:
+        return 0
+
+    updated   = 0
+    deleted   = 0
+    unresolved = 0
+    by_serial: dict[str, int] = {}
+
+    for row in rows:
+        rid       = row["id"]
+        ts        = row["timestamp"]
+        bw_name   = row["blastware_filename"]
+        new_serial = _serial_from_bw_filename(bw_name) if bw_name else None
+        if not new_serial:
+            print(f"  ⚠ id={rid[:8]} ts={ts} filename={bw_name!r} — "
+                  f"cannot decode serial from filename; skipping")
+            unresolved += 1
+            continue
+
+        # Check for an existing row at the target (serial, timestamp).
+        existing = conn.execute(
+            "SELECT id FROM events WHERE serial = ? AND timestamp = ?",
+            (new_serial, ts),
+        ).fetchone()
+        action: str
+        if existing is None:
+            # Safe to UPDATE in place.
+            if args.apply:
+                conn.execute(
+                    "UPDATE events SET serial = ? WHERE id = ?",
+                    (new_serial, rid),
+                )
+            action = "UPDATE"
+            updated += 1
+        else:
+            # A correctly-attributed row already exists.  Drop the
+            # UNKNOWN duplicate.
+            if args.apply:
+                conn.execute("DELETE FROM events WHERE id = ?", (rid,))
+            action = "DELETE (dup)"
+            deleted += 1
+
+        by_serial[new_serial] = by_serial.get(new_serial, 0) + 1
+        print(f"  {action:14s}  id={rid[:8]}  ts={ts}  "
+              f"filename={bw_name}  →  {new_serial}")
+
+    if args.apply:
+        conn.commit()
+    conn.close()
+
+    print()
+    print("Summary:")
+    print(f"  UNKNOWN rows scanned:       {len(rows)}")
+    print(f"  Updated to real serial:     {updated}")
+    print("  Deleted (duplicate of an")
+    print(f"   already-correct row):      {deleted}")
+    print(f"  Unresolved (bad filename):  {unresolved}")
+    print()
+    if by_serial:
+        print("Per-serial breakdown of repaired rows:")
+        for serial, count in sorted(by_serial.items()):
+            print(f"  {serial:12s}  {count}")
+    if not args.apply:
+        print()
+        print("(dry-run — re-run with --apply to commit)")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
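The update-or-delete rule the repair script applies can be tried against an in-memory database. This is a sketch under stated assumptions, not the project's code: the table holds only the four columns the script queries, and `decode_serial` is a hypothetical decoder that treats the text before the first underscore as the serial, which is not how real BW filenames encode it (the real logic lives in `_serial_from_bw_filename`).

```python
import sqlite3


def decode_serial(filename):
    """HYPOTHETICAL decoder: text before the first '_' is the serial."""
    head, sep, _ = (filename or "").partition("_")
    return head if sep and head else None


def repair(conn):
    """Re-attribute UNKNOWN rows; return (updated, deleted, unresolved)."""
    updated = deleted = unresolved = 0
    rows = conn.execute(
        "SELECT id, timestamp, blastware_filename FROM events "
        "WHERE serial = 'UNKNOWN' ORDER BY timestamp"
    ).fetchall()
    for rid, ts, name in rows:
        serial = decode_serial(name)
        if not serial:
            unresolved += 1
            continue
        clash = conn.execute(
            "SELECT 1 FROM events WHERE serial = ? AND timestamp = ?",
            (serial, ts),
        ).fetchone()
        if clash is None:
            # No correctly attributed row yet: fix this one in place.
            conn.execute("UPDATE events SET serial = ? WHERE id = ?", (serial, rid))
            updated += 1
        else:
            # A correct row already exists: drop the UNKNOWN duplicate.
            conn.execute("DELETE FROM events WHERE id = ?", (rid,))
            deleted += 1
    conn.commit()
    return updated, deleted, unresolved


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, serial TEXT, "
    "timestamp TEXT, blastware_filename TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("a1", "UNKNOWN", "2025-01-01T00:00:00", "BE1234_0001.bin"),
        ("a2", "BE1234", "2025-01-02T00:00:00", "BE1234_0002.bin"),
        ("a3", "UNKNOWN", "2025-01-02T00:00:00", "BE1234_0002.bin"),
        ("a4", "UNKNOWN", "2025-01-03T00:00:00", "garbage"),
    ],
)
result = repair(conn)
remaining = conn.execute("SELECT id, serial FROM events ORDER BY id").fetchall()
second_pass = repair(conn)
```

Running `repair` a second time touches only the row whose filename cannot be decoded, which mirrors the idempotence the docstring describes.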
@@ -0,0 +1,99 @@
+#!/usr/bin/env bash
+# Rescue an uncooperative MiniMate that's busy with another ACH session.
+#
+# Hammers POST /device/rescue in a tight loop with a short timeout.  When the
+# device is in an ACH session our SYN either gets refused or silently dropped
+# (5s connect timeout inside the endpoint) and we retry immediately.  When the
+# device is between sessions, our TCP wins, the endpoint disables Auto Call
+# Home and erases events inside the same session, then returns success.
+#
+# Usage:
+#   ./rescue_device.sh <host> [tcp_port] [--no-erase] [--no-disable-ach]
+#
+# Examples:
+#   ./rescue_device.sh 166.246.130.1 9034
+#   ./rescue_device.sh 166.246.130.1 9034 --no-erase     # just silence it
+#
+# Environment:
+#   SFM_BASE_URL    Defaults to http://localhost:8200 (SFM direct).
+#                   Set to http://localhost:8001/api/sfm to route through
+#                   Terra-View's proxy.  Direct mode avoids the proxy's
+#                   60s timeout, which matters for long-running endpoints.
+#   MAX_ATTEMPTS    Cap on retries (default 600 ≈ 30+ min).
+#   SLEEP_S         Backoff between attempts (default 1).
+#   MAX_TIME_S      Per-request timeout (default 60).
+#   CONNECT_TIMEOUT TCP connect timeout (default 5).
+#   RECV_TIMEOUT    Per-frame S3 recv timeout (default 5).  If POLL or any
+#                   subsequent frame doesn't respond within this window, the
+#                   rescue endpoint bails and this script retries.
+
+set -u
+
+host="${1:-}"
+tcp_port="${2:-9034}"
+shift 2 2>/dev/null || shift $# 2>/dev/null
+
+if [[ -z "$host" ]]; then
+  echo "usage: $0 <host> [tcp_port] [--no-erase] [--no-disable-ach]" >&2
+  exit 2
+fi
+
+disable_ach="true"
+erase="true"
+for arg in "$@"; do
+  case "$arg" in
+    --no-erase)        erase="false" ;;
+    --no-disable-ach)  disable_ach="false" ;;
+    *) echo "unknown flag: $arg" >&2; exit 2 ;;
+  esac
+done
+
+base="${SFM_BASE_URL:-http://localhost:8200}"
+max_attempts="${MAX_ATTEMPTS:-600}"
+sleep_s="${SLEEP_S:-1}"
+max_time_s="${MAX_TIME_S:-60}"
+connect_timeout="${CONNECT_TIMEOUT:-5}"
+recv_timeout="${RECV_TIMEOUT:-5}"
+
+url="${base}/device/rescue?host=${host}&tcp_port=${tcp_port}&disable_ach=${disable_ach}&erase=${erase}&connect_timeout=${connect_timeout}&recv_timeout=${recv_timeout}"
+
+echo "rescue: target ${host}:${tcp_port}  disable_ach=${disable_ach}  erase=${erase}"
+echo "rescue: connect_timeout=${connect_timeout}s  recv_timeout=${recv_timeout}s"
+echo "rescue: POST ${url}"
+echo "rescue: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
+echo
+
+started=$(date +%s)
+for ((i=1; i<=max_attempts; i++)); do
+  printf "[%3d] %s  " "$i" "$(date +%H:%M:%S)"
+  http_code=$(curl -sS -o /tmp/rescue_resp.$$ -w "%{http_code}" \
+    --max-time "$max_time_s" \
+    -X POST "$url" || echo "000")
+  body=$(cat /tmp/rescue_resp.$$ 2>/dev/null || true)
+  rm -f /tmp/rescue_resp.$$
+
+  case "$http_code" in
+    200|201)
+      elapsed=$(( $(date +%s) - started ))
+      echo "OK  (${elapsed}s total)"
+      echo "$body"
+      exit 0
+      ;;
+    503)
+      # Connection refused / timeout — device busy in another session.  Retry fast.
+      echo "busy (503)"
+      ;;
+    000*)  # curl failed; -w prints 000 and the "|| echo" fallback can append another
+      echo "curl error (network)"
+      ;;
+    *)
+      echo "HTTP $http_code"
+      echo "  $body" | head -c 400
+      echo
+      ;;
+  esac
+  sleep "$sleep_s"
+done
+
+echo "rescue: gave up after ${max_attempts} attempts" >&2
+exit 1
@@ -0,0 +1,44 @@
+#!/usr/bin/env bash
+# Hold a single TCP session open and drip stop-monitoring frames at a slow
+# rate, so the device's UART RX FIFO has time to drain between sends.
+#
+# Use when high-rate spam isn't landing — typically because the device's
+# firmware is too busy to drain its serial buffer fast enough and bytes
+# are being lost to UART overrun.
+#
+# Usage:
+#   ./slow_drip.sh <host> [tcp_port] [duration_s]
+#
+# Env:
+#   DURATION         Default: 120 (seconds; arg 3 overrides). Clamped 1..600.
+#   INTERVAL         Seconds between drip sends (default 3).  Lower = more
+#                    aggressive, more risk of FIFO overrun.  Higher = safer
+#                    but fewer total drips per duration.
+#   CONNECT_TIMEOUT  Default: 5
+#   SFM_BASE_URL     Default: http://localhost:8200 (SFM direct).
+
+set -u
+
+host="${1:-}"
+tcp_port="${2:-9034}"
+duration="${3:-${DURATION:-120}}"
+if [[ -z "$host" ]]; then
+  echo "usage: $0 <host> [tcp_port] [duration_s]" >&2
+  exit 2
+fi
+
+base="${SFM_BASE_URL:-http://localhost:8200}"
+interval="${INTERVAL:-3}"
+connect_timeout="${CONNECT_TIMEOUT:-5}"
+
+url="${base}/device/stop_monitoring_slow_drip?host=${host}&tcp_port=${tcp_port}&duration_s=${duration}&interval_s=${interval}&connect_timeout=${connect_timeout}"
+
+echo "slow_drip: target ${host}:${tcp_port}  duration=${duration}s  interval=${interval}s  connect_timeout=${connect_timeout}s"
+echo "slow_drip: POST ${url}"
+echo
+
+# Give curl enough slack to wait out the duration plus a buffer
+max_time=$(awk -v d="$duration" 'BEGIN { printf "%d", d + 30 }')
+
+curl -sS --max-time "$max_time" -X POST "$url"
+echo