From ecc935482bf61d913b8114b6cab38caecb809f47 Mon Sep 17 00:00:00 2001 From: serversdown Date: Wed, 20 May 2026 15:19:49 +0000 Subject: [PATCH] =?UTF-8?q?seismo-relay=20v0.19.0=20=E2=80=94=20device-fam?= =?UTF-8?q?ily=20separation=20+=20micromate/=20package?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Tighten the Series III / Series IV boundary so UI and storage dispatch on a clean signal instead of sniffing filenames or applying magnitude heuristics. Phase 1 — events.device_family column ("series3" | "series4"): self-applying migration with filename-based backfill of existing rows (1,132 backfilled on prod 2026-05-20); plumbed through every import path (BW endpoint, IDF endpoint, ACH server, BW CLI, sidecar backfill); UPSERT preserves via COALESCE; UI dispatches on it. Phase 2 — extract micromate/ package alongside minimateplus/: native IdfEvent / IdfReport / IdfPeaks / IdfProjectInfo / IdfSensorCheck (mic in dB(L), not pseudo-psi); moved idf_ascii_report.py from sfm/ to micromate/; refactored save_imported_idf to use IdfEvent and bridge to minimateplus.Event at the SQL-insert boundary; idf_file.py stub for the future binary codec. Phase 3 prep — docs/idf_protocol_reference.md captures the two observed Thor binary header signatures (1,012 newer-firmware files vs 2 old files whose layout is byte-for-byte BW-STRT-compatible), file-size hints suggesting int8 sample encoding, open questions in dependency order, and a concrete first-session plan for cracking the codec. Also rolled in the v0.18.1 hotfixes that motivated this work: - idf_ascii_report parser now handles "<0.005 in/s" (below-threshold) and "N/A" markers without leaving raw strings in numeric DB columns. - sfm_webapp.html: defensive _ppvFmt / mic formatter so future data-shape drift can't kill the whole events table render. All 1,014 example-data sidecars round-trip through the new package. See CHANGELOG.md for full notes. 
--- CHANGELOG.md | 27 ++ README.md | 152 ++++++++-- docs/idf_protocol_reference.md | 284 +++++++++++++++++++ micromate/__init__.py | 48 ++++ {sfm => micromate}/idf_ascii_report.py | 2 +- micromate/idf_file.py | 64 +++++ micromate/models.py | 377 +++++++++++++++++++++++++ pyproject.toml | 6 +- sfm/server.py | 2 +- sfm/waveform_store.py | 121 +++----- tests/test_idf_ascii_report.py | 2 +- 11 files changed, 966 insertions(+), 119 deletions(-) create mode 100644 docs/idf_protocol_reference.md create mode 100644 micromate/__init__.py rename {sfm => micromate}/idf_ascii_report.py (99%) create mode 100644 micromate/idf_file.py create mode 100644 micromate/models.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 7e7ceae..f2d4f95 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,33 @@ All notable changes to seismo-relay are documented here. --- +## v0.19.0 — 2026-05-20 + +The "device-family separation" release. Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics. + +### Added — Phase 1: `device_family` column on `events` + +- **`events.device_family TEXT`** — new column carrying `"series3"` or `"series4"`. Populated by every import path (`/db/import/blastware_file`, `/db/import/idf_file`, ACH server, BW CLI, sidecar backfill script). Returned through `/db/events` since `query_events` uses `SELECT *`. +- **Self-applying migration** — on startup, `ALTER TABLE ... ADD COLUMN` lands the new column; a follow-on `UPDATE` backfills existing rows from the binary filename extension (`.IDFH`/`.IDFW` → `series4`, everything else → `series3`). No manual SQL needed. +- **UPSERT preserves family** — re-imports without an explicit family don't blank existing rows (`COALESCE(?, device_family)`). 
+- **UI dispatches on the column** — `sfm_webapp.html` events-table mic formatter now branches on `ev.device_family === 'series4'` (Thor stores native dB(L); BW stores psi). Modal uses `source.kind === 'idf-import'` from the sidecar (sidecars don't carry the DB column). Source-files section labels changed from "BW filename / BW filesize / BW sha256" to format-neutral "Event file / File size / File sha256". + +### Added — Phase 2: `micromate/` package alongside `minimateplus/` + +- **`micromate/`** — new sibling package for the Thor / Micromate Series IV device. Currently scoped to offline-file ingest; live-device support (TCP transport, framing, protocol, client) will land here when reverse-engineering happens. + - `micromate/idf_ascii_report.py` — moved from `sfm/idf_ascii_report.py`. No behaviour change. + - `micromate/models.py` — typed `IdfReport`, `IdfEvent`, `IdfPeaks`, `IdfProjectInfo`, `IdfSensorCheck`. Stores mic in native `mic_pspl_dbl` (dB(L)) instead of the pseudo-psi shoehorn that the BW-shaped model uses. `IdfEvent.from_report()` constructs from a parsed dict + filename; `IdfEvent.to_minimateplus_event(waveform_key)` bridges to the existing sidecar / DB-insert machinery. + - `micromate/idf_file.py` — placeholder for the binary codec (`.IDFH` / `.IDFW`). Stubbed `read_idf_file()` raises `NotImplementedError`; documents the planned reverse-engineering path. +- **`WaveformStore.save_imported_idf`** refactored to use the native `IdfEvent` and bridge at the SQL-insert boundary. Cleaner separation of "parse a Thor event" (in `micromate/`) from "store it on disk + write a sidecar" (in `sfm/waveform_store.py`). +- **Tests** — `tests/test_idf_ascii_report.py` imports updated to `micromate.idf_ascii_report`. All 1,014 example-data sidecars round-trip through `IdfEvent.from_report()` without errors. + +### Companion releases + +- **thor-watcher** unaffected — it talks to the relay over HTTP only. No version bump needed. 
+- **terra-view** unaffected today; can use `device_family` in its event-detail rendering when convenient. + +--- + ## v0.18.0 — 2026-05-19 The "Thor / Series IV ingest adapter" release. Seismo-relay can now accept event files from Instantel Micromate Series IV (Thor) units alongside the existing MiniMate Plus (Series III) Blastware pipeline. diff --git a/README.md b/README.md index 79534c8..c057f68 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,11 @@ -# seismo-relay `v0.17.0` +# seismo-relay `v0.19.0` A ground-up replacement for **Blastware** — Instantel's aging Windows-only -software for managing MiniMate Plus seismographs. +software for managing seismographs. Supports both the **MiniMate Plus +(Series III)** and the **Micromate (Series IV / "Thor")** families: +Series III via the live RS-232 / TCP wire protocol *and* Blastware ACH file +ingest; Series IV currently via Thor TXT-paired IDF file ingest, with the +binary codec on the roadmap. Built in Python. Runs on Windows, Linux, or macOS. Connects to instruments over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55). @@ -19,6 +23,18 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55). > every Blastware ACH event lands in SeismoDb with device-authoritative > peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data, > without depending on the still-undecoded waveform body codec. +> **v0.18.0 (2026-05-19)** adds Thor / Micromate Series IV ingest at +> `/db/import/idf_file` — paired with **thor-watcher v0.3.0**, every +> `.IDFH` / `.IDFW` event file (plus its `.txt` sidecar) lands in +> SeismoDb the same way BW events do. See +> [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for +> the IDF format reference and reverse-engineering plan. 
+> **v0.19.0 (2026-05-20)** separates Series III and Series IV at the +> code level: new `micromate/` package alongside `minimateplus/`, new +> `events.device_family` DB column ("series3" / "series4") so the UI +> and storage layer dispatch deterministically instead of sniffing +> filenames. Self-applying migration backfills existing rows from the +> binary filename extension. > See [CHANGELOG.md](CHANGELOG.md) for full version history. --- @@ -29,17 +45,25 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55). seismo-relay/ ├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Download + Console tabs) │ -├── minimateplus/ ← MiniMate Plus client library +├── minimateplus/ ← Series III (MiniMate Plus) client library │ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport │ ├── protocol.py ← DLE frame layer, SUB command dispatch │ ├── client.py ← High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …) │ ├── framing.py ← Frame builders, DLE codec, S3FrameParser │ ├── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig, … +│ ├── bw_ascii_report.py ← Parse BW per-event ASCII reports (.TXT sidecars) +│ ├── event_file_io.py ← Read BW binaries, write .sfm.json sidecars │ └── blastware_file.py ← Write events to Blastware-compatible .AB0 files │ +├── micromate/ ← Series IV (Micromate / Thor) client library (NEW v0.19) +│ ├── models.py ← IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L)) +│ ├── idf_ascii_report.py ← Parse Thor .IDFW.txt / .IDFH.txt event sidecars +│ └── idf_file.py ← Stub for the .IDFW / .IDFH binary codec (reverse-engineering pending) +│ ├── sfm/ ← SFM REST API server (FastAPI, port 8200) -│ ├── server.py ← Live device endpoints + DB query endpoints + caching -│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions, sessions table) +│ ├── server.py ← Live device endpoints + DB query + ingest endpoints + 
caching +│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions) +│ ├── waveform_store.py ← On-disk store for BW + IDF event binaries + .sfm.json sidecars │ └── sfm_webapp.html ← Embedded web UI with Call Home config tab │ ├── bridges/ @@ -56,7 +80,8 @@ seismo-relay/ │ └── frame_db.py ← SQLite frame database │ └── docs/ - └── instantel_protocol_reference.md ← Reverse-engineered protocol spec + ├── instantel_protocol_reference.md ← Series III protocol spec (the Rosetta Stone) + └── idf_protocol_reference.md ← Series IV (Thor IDF) format reference + codec RE plan ``` --- @@ -148,11 +173,23 @@ Query the SQLite database written by `ach_server.py`. All read-only except | Method | URL | Description | |--------|-----|-------------| | `GET` | `/db/units` | All known serials with summary stats | -| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger) | +| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger). Response rows include `device_family` ("series3" / "series4") so clients dispatch on unit type without sniffing filenames. | | `GET` | `/db/monitor_log` | Monitoring intervals | | `GET` | `/db/sessions` | ACH call-home session history | | `PATCH` | `/db/events/{id}/false_trigger?value=true` | Flag / unflag false triggers | +### File ingest endpoints + +Used by watcher daemons to push field-collected event files into the SFM DB ++ waveform store. Both accept multipart uploads of binary event files +optionally paired with their ASCII sidecar reports; both dedup by +`(serial, timestamp)` and UPSERT device-authoritative fields on re-import. + +| Method | URL | Description | +|--------|-----|-------------| +| `POST` | `/db/import/blastware_file` | Series III: `.AB0*` / `.N00` binaries + paired `_ASCII.TXT`. Source: `series3-watcher`. | +| `POST` | `/db/import/idf_file` | Series IV: `.IDFH` / `.IDFW` binaries + paired `.IDFW.txt` / `.IDFH.txt`. Source: `thor-watcher`. 
| + --- ## minimateplus library @@ -214,22 +251,77 @@ not per individual event). --- +## micromate library + +Series IV / Thor support, sibling to `minimateplus`. Currently scoped to +offline-file ingest from Thor's TXT exporter; live-device protocol is +deferred until the binary codec is cracked. + +```python +from micromate import IdfEvent, parse_idf_report + +# Parse a .IDFW.txt / .IDFH.txt sidecar (1014 example files round-trip cleanly) +text = open("UM11719_20231219162723.IDFW.txt").read() +report_dict = parse_idf_report(text) # permissive dict + +# Wrap into a typed event using the device-native binary filename +event = IdfEvent.from_report(report_dict, "UM11719_20231219162723.IDFW") + +event.serial # "UM11719" +event.kind # "Waveform" or "Histogram" +event.peaks.transverse_ips # 0.0251 (in/s, native unit) +event.peaks.mic_pspl_dbl # 99.4 (dB(L), Thor's native mic unit — NOT psi) +event.project_info.project # "UPMC Presby-Loc 3-Level1-1R Elevator Rm" +event.sensor_check.tran # True (passed self-check) +event.firmware_version # "Micromate ISEE 11.0AK" +event.calibration_text # "November 22, 2023 by Instantel" + +# Bridge to the existing minimateplus.Event shape for the DB / sidecar paths +# (waveform_key is a 16-byte sha256 prefix when ingesting from a binary file) +bridged_event = event.to_minimateplus_event(waveform_key=b"\x00" * 16) +``` + +The binary codec (`.IDFW` / `.IDFH` event files themselves) is on the +roadmap — see [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) +for everything known so far, the two observed file signatures, and the +reverse-engineering plan. The `micromate/idf_file.py` stub is where +`read_idf_file()` will land. + +--- + ## Database -`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode) using the -`SeismoDb` persistence layer. 
Four tables, all unit-keyed by serial number: +`ach_server.py` and the file-ingest endpoints write to +`bridges/captures/seismo_relay.db` (SQLite, WAL mode) via the `SeismoDb` +persistence layer. Three tables, all unit-keyed by serial number: | Table | Key | Contents | |-------|-----|----------| | `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds | -| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag | -| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips | -| `events.false_trigger` | Boolean flag | PATCH endpoint to mark/unmark false triggers for review | +| `events` | UUID, UNIQUE(serial, timestamp) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag, **`device_family`** ("series3" / "series4"), `blastware_filename` (binary at-rest in `waveforms/`), sidecar references | +| `monitor_log` | UUID, UNIQUE(serial, start_time) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips | -Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs never -produce duplicate rows. Post-erase key reuse is handled automatically via the -high-water mark in `ach_state.json`. Key-based state tracking allows correct -handling of device erasures (external or post-download). +**Deduplication is by `(serial, timestamp)`** — the device clock is the +stable natural key. Repeat call-homes or re-runs UPSERT the row in place, +refreshing every device-authoritative field (peaks, project strings, +sample_rate, file references) so the latest writer wins. 
`false_trigger` +and `device_family` are preserved across UPSERTs. Earlier versions used +`(serial, waveform_key)` for dedup, but the device's event-key counter +resets to `0x01110000` after every erase, so timestamps are the correct +dedup field. Migration handles the transition transparently on first +startup. + +**`device_family` (added v0.19.0)** discriminates Series III from Series +IV at the SQL level. Set by every import path; the UI dispatches on it +to render mic units correctly (Series III: psi → dBL conversion; Series +IV: native dBL passthrough). Existing rows are backfilled at first +startup of v0.19.0+ by sniffing the binary filename extension. + +The on-disk waveform store lives at `bridges/captures/waveforms//` +and holds the original event binaries (BW `.AB0*` / `.N00` for Series III, +`.IDFH` / `.IDFW` for Series IV) plus their `.sfm.json` review/metadata +sidecars. Series III events also produce `.a5.pkl` source-frame pickles +and `.h5` clean-waveform exports; Series IV doesn't yet (pending codec). --- @@ -311,18 +403,27 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows. 
## Key Features -**Device support:** -- [x] Full read/write/erase pipelines +**Series III (MiniMate Plus) device support:** +- [x] Full read/write/erase pipelines over RS-232 or TCP/cellular - [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings) - [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries) - [x] Monitor control (start/stop, status polling, battery/memory) - [x] Monitor log entries (continuous monitoring intervals without full waveform download) +- [x] Blastware file ingest at `/db/import/blastware_file` (paired with `series3-watcher`) + +**Series IV (Micromate / Thor) device support:** +- [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+) +- [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version +- [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/` +- [ ] Binary `.IDFW` / `.IDFH` codec — pending (see Roadmap + [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md)) +- [ ] Live-device protocol — pending codec **Data persistence:** -- [x] SQLite database (`seismo_relay.db`) with 4 tables: ach_sessions, events, monitor_log, plus false_trigger flag -- [x] Deduplication by waveform key (handles re-runs and repeat call-homes) -- [x] Post-erase key-reuse detection (tracks high-water mark) -- [x] Session state (`ach_state.json`) with downloaded keys and max key +- [x] SQLite database (`seismo_relay.db`) with `events`, `monitor_log`, `ach_sessions` tables +- [x] Per-row `device_family` column ("series3" / "series4") for clean UI / unit-of-measurement dispatch (v0.19.0+) +- [x] Deduplication by `(serial, timestamp)` — natural key handles post-erase counter resets +- [x] UPSERT on re-import refreshes every device-authoritative field (peaks, project, sample_rate); preserves operator review state (`false_trigger`) +- 
[x] Post-erase key-reuse detection (tracks high-water mark in `ach_state.json`) **REST API:** - [x] Live device endpoints with in-memory caching (`_LiveCache`) @@ -330,6 +431,7 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows. - [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH) - [x] Call Home config read/write endpoints - [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`) +- [x] Import endpoints for both device families (`/db/import/blastware_file`, `/db/import/idf_file`) **File output (v0.7+, byte-perfect as of v0.14.3):** - [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata) @@ -359,8 +461,10 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows. ### High-impact (unblocks product features) -- [ ] **Waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec. -- [ ] **In-app waveform viewer accuracy.** Depends on codec decode. Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples. +- [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). 
Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec. +- [ ] **Series IV (Thor IDF) binary codec reverse-engineering.** `.IDFH` / `.IDFW` files are currently stored opaquely by `WaveformStore.save_imported_idf`, with all metadata sourced from the paired `.txt` sidecar. This works because thor-watcher forwards both files together, but operators who haven't enabled Thor's TXT exporter get rows with NULL peaks. Cracking the binary closes that gap and unlocks waveform display. Starting-point reference at [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) — two observed file signatures (1,012 newer-firmware files + 2 old files whose layout matches the Series III STRT-record format), suggested first-session plan (~2-4 hrs), 1,014 paired binary+txt files available as ground truth in `thor-watcher/example-data/`. Code seam ready at `micromate/idf_file.py`. +- [ ] **In-app waveform viewer accuracy.** Depends on Series III codec decode. Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples. Series IV waveforms come online when the IDF codec lands. +- [ ] **Series IV live-device support.** Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD). - [ ] **Terra-view integration** — seismo-relay router, unit detail page, VISON-style event listing. 
- [ ] **Vibration summary reports** — highest legit PPV per project → Word doc (false-trigger filtering first). diff --git a/docs/idf_protocol_reference.md b/docs/idf_protocol_reference.md new file mode 100644 index 0000000..643de53 --- /dev/null +++ b/docs/idf_protocol_reference.md @@ -0,0 +1,284 @@ +# IDF Protocol Reference — Thor / Micromate Series IV + +Starting-point reference for reverse-engineering Instantel's Micromate +Series IV event-file format. Sibling to +[instantel_protocol_reference.md](instantel_protocol_reference.md) (the +Series III "Rosetta Stone") — this doc holds what we know so far and +the open questions still to crack. + +**Status (2026-05-20):** ASCII text sidecar fully decoded (1,014 +sample files round-trip). Binary `.IDFH` / `.IDFW` codec +**not yet implemented** — binaries are stored opaquely by +`WaveformStore.save_imported_idf`, with metadata sourced from the +paired `.txt` sidecar. + +--- + +## File model + +### Filename convention + +``` +<SERIAL>_<YYYYMMDDHHMMSS>.<KIND> +``` + +- **SERIAL** — literal device serial, two-letter prefix + numeric + suffix. Examples seen: `UM11719`, `UM13981`, `UM20147`, `BE9439`. + Unlike Series III BW filenames (`M529LK44.AB0`, base-36 stem), + Series IV filenames carry the serial in plain text. +- **YYYYMMDDHHMMSS** — 14-char ASCII timestamp in **device local + time** (no timezone marker). +- **KIND** — `IDFH` for histograms, `IDFW` for waveforms. + +The `.IDFH.txt` / `.IDFW.txt` ASCII sidecar lives in a `TXT/` +**subfolder** of the unit's directory, not alongside the binary. +This pairing convention is encoded in +`event_forwarder.idf_report_path()`. 
+ +### Directory layout + +``` +C:\THORDATA\ +└── \ + └── \ ← unit serial dir + ├── UM12345_20260520100000.MLG ← monitor log (not events) + ├── UM12345_20260520100000.IDFH ← histogram event (binary) + ├── UM12345_20260520100000.IDFW ← waveform event (binary) + ├── UM12345_20260520100000.IDFW.CDB ← cache-DB variant (skip) + ├── TXT\ + │ ├── UM12345_20260520100000.IDFH.txt ← histogram ASCII sidecar + │ └── UM12345_20260520100000.IDFW.txt ← waveform ASCII sidecar + ├── CSV\, HTML\, PDF\, XML\ ← operator-facing derived exports + └── ... +``` + +The `.IDFW.CDB` files share the binary's basename but appear to be a +separate cache/database variant. Their first 8 bytes match the +**old**-firmware Thor signature (see below) regardless of which +signature the paired `.IDFW` uses. Purpose unknown; sizes vary +wildly (observed 123 B → 40,491 B). Thor-watcher's forwarder +deliberately skips them. + +### Sample corpus + +The `thor-watcher/example-data/THORDATA_example/` tree carries +**1,014 paired .IDFW / .IDFH + .txt files** spanning 2020–2023 +across nine units (UM11719, UM13981, UM20147, …, plus BE9439 from +2020). This is the reverse-engineering ground truth. + +--- + +## ASCII sidecar (`.IDFW.txt` / `.IDFH.txt`) — fully decoded + +Shape: plain text, one `"Key : Value"` line per metadata field, +followed for waveforms by a tab-separated sample table headed by +the literal line `Waveform Data Channels`. Parsed by +[`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py). +See [`micromate/models.py`](../micromate/models.py) for the typed +`IdfReport` shape. + +### Notable conventions + +- **Units are native to Thor** — geophone in **in/s**, microphone in + **dB(L)** (not psi like Series III BW reports), frequency in Hz, + acceleration in g, displacement in in. +- **Below-threshold readings** appear as the literal string + `<0.005 in/s` (155 occurrences in the sample corpus) — the parser + strips the `<` and treats the numeric remainder as the value. 
+- **Out-of-range / not-measured** values appear as `N/A` — parser + drops the field rather than letting the string leak into a numeric + column. +- **Firmware string** observed: `Micromate ISEE 11.0AK`. +- **TitleString1..4** are operator-defined free-text slots; Thor's + default labels map them to Location / Client / Company / Notes, + which the parser surfaces as `project` / `client` / `operator` / + `notes`. +- **Histogram sidecars** use `HistogramStartDate` / `HistogramStartTime` + in place of waveform's `EventDate` / `EventTime`. Parser falls + through to either. +- **Histogram tabular block** lacks the `Waveform Data Channels` + marker; instead it's a multi-line column header followed by + per-interval rows (`