seismo-relay v0.19.0 — device-family separation + micromate/ package

Tighten the Series III / Series IV boundary so UI and storage dispatch on a clean signal instead of sniffing filenames or applying magnitude heuristics.

Phase 1: events.device_family column ("series3" | "series4")
- self-applying migration with filename-based backfill of existing rows (1,132 backfilled on prod 2026-05-20)
- plumbed through every import path (BW endpoint, IDF endpoint, ACH server, BW CLI, sidecar backfill)
- UPSERT preserves via COALESCE
- UI dispatches on it

Phase 2: extract micromate/ package alongside minimateplus/
- native IdfEvent / IdfReport / IdfPeaks / IdfProjectInfo / IdfSensorCheck (mic in dB(L), not pseudo-psi)
- moved idf_ascii_report.py from sfm/ to micromate/
- refactored save_imported_idf to use IdfEvent and bridge to minimateplus.Event at the SQL-insert boundary
- idf_file.py stub for the future binary codec

Phase 3 prep: docs/idf_protocol_reference.md captures
- the two observed Thor binary header signatures (1,012 newer-firmware files vs 2 old files whose layout is byte-for-byte BW-STRT-compatible)
- file-size hints suggesting int8 sample encoding
- open questions in dependency order
- a concrete first-session plan for cracking the codec

Also rolled in the v0.18.1 hotfixes that motivated this work:
- idf_ascii_report parser now handles "<0.005 in/s" (below-threshold) and "N/A" markers without leaving raw strings in numeric DB columns.
- sfm_webapp.html: defensive _ppvFmt / mic formatter so future data-shape drift can't kill the whole events table render.

All 1,014 example-data sidecars round-trip through the new package. See CHANGELOG.md for full notes.
2026-05-20 15:19:49 +00:00
parent e95ac692ee
commit ecc935482b
11 changed files with 966 additions and 119 deletions
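The "UPSERT preserves via COALESCE" item in the commit message can be illustrated with a minimal sketch. Only the `events.device_family` column name comes from the commit; the rest of the schema, the `event_key` and `peak` columns, the helper names, and the direction of the COALESCE (a new non-NULL value wins, a NULL keeps the stored value) are assumptions made for illustration, not the project's actual SQL.

```python
import sqlite3

# Hypothetical minimal schema; only device_family is named in the commit.
SCHEMA = """
CREATE TABLE events (
    event_key     TEXT PRIMARY KEY,
    peak          REAL,
    device_family TEXT
)
"""

# Assumed direction: an incoming non-NULL family overwrites, an incoming
# NULL leaves the stored family untouched.
UPSERT = """
INSERT INTO events (event_key, peak, device_family)
VALUES (?, ?, ?)
ON CONFLICT(event_key) DO UPDATE SET
    peak          = excluded.peak,
    device_family = COALESCE(excluded.device_family, events.device_family)
"""


def upsert_event(conn, event_key, peak, device_family=None):
    """Insert or update one row; device_family may be omitted on re-import."""
    conn.execute(UPSERT, (event_key, peak, device_family))


def row_for(conn, event_key):
    """Return (peak, device_family) for the key, or None if absent."""
    return conn.execute(
        "SELECT peak, device_family FROM events WHERE event_key = ?",
        (event_key,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
upsert_event(conn, "UM11719_20231219162723", 0.2119, "series4")
upsert_event(conn, "UM11719_20231219162723", 0.2131)  # re-import, no family
print(row_for(conn, "UM11719_20231219162723"))
```

The second call updates the peak but cannot erase the family, which is the property an import path without family information would rely on.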
@@ -0,0 +1,48 @@
+"""
+micromate — Instantel Micromate (Series IV) device library.
+
+Sibling of ``minimateplus`` (the Series III library).  Currently scoped to
+the offline-file ingest path used by thor-watcher: parsing the per-event
+``.IDFH``/``.IDFW`` ASCII text sidecars Thor's exporter writes alongside
+each binary event file, and wrapping the parsed data in typed event
+records.
+
+Live-device support (TCP protocol, frame parsing, real-time monitoring)
+is deferred — when we add it, it lands here as ``transport.py`` /
+``framing.py`` / ``protocol.py`` / ``client.py``, mirroring the
+``minimateplus`` package layout.
+
+Typical usage (offline file ingest):
+
+    from micromate import IdfEvent, parse_idf_report
+
+    text  = open("UM11719_20231219162723.IDFW.txt").read()
+    rep   = parse_idf_report(text)                       # dict
+    event = IdfEvent.from_report(rep, "UM11719_20231219162723.IDFW")
+    print(event.serial, event.peaks.transverse_ips, event.peaks.mic_pspl_dbl)
+"""
+
+from .idf_ascii_report import (
+    parse_event_filename,
+    parse_idf_report,
+    serial_from_filename,
+)
+from .models import (
+    IdfEvent,
+    IdfPeaks,
+    IdfProjectInfo,
+    IdfReport,
+    IdfSensorCheck,
+)
+
+__version__ = "0.1.0"
+__all__ = [
+    "IdfEvent",
+    "IdfPeaks",
+    "IdfProjectInfo",
+    "IdfReport",
+    "IdfSensorCheck",
+    "parse_event_filename",
+    "parse_idf_report",
+    "serial_from_filename",
+]
@@ -0,0 +1,315 @@
+"""
+micromate/idf_ascii_report.py — parse Thor (Micromate Series IV) IDF ASCII reports.
+
+Thor exports a `.IDFW.txt` or `.IDFH.txt` sidecar next to each `.IDFW`
+(waveform) or `.IDFH` (histogram) event binary.  Each sidecar is a
+plain-text file with `"Key : Value"` lines covering the full device-
+authoritative event metadata — PPV per channel, ZC Freq, Time of Peak,
+Peak Acceleration / Displacement, sensor self-check results, project
+strings, calibration date, battery level, etc. — followed by a raw
+waveform-samples block headed by the literal line "Waveform Data Channels".
+
+This is the Thor analogue of `minimateplus/bw_ascii_report.py` for the
+Blastware (Series III) report format.  The parser is intentionally
+permissive: we extract everything we recognise into a flat dict and
+silently ignore anything we don't.  Downstream callers parse units
+(`"0.2119 in/s"` → 0.2119) only on the fields they need.
+
+Example input (truncated):
+
+    "EventType : Full Waveform"
+    "SampleRate : 1024 sps"
+    "EventTime : 16:27:23"
+    "EventDate : 2023-12-19"
+    "TranPPV : 0.0251 in/s"
+    "VertPPV : 0.2119 in/s"
+    "LongPPV : 0.0282 in/s"
+    "PeakVectorSum : 0.2131 in/s"
+    "MicPSPL : 99.4 dB(L)"
+    "TranZCFreq : 6.5 Hz"
+    "SerialNumber : UM11719"
+    "Version : Micromate ISEE 11.0AK"
+    "FileName : UM11719_20231219162723.IDFW"
+    "BatteryLevel : 3.8 volts"
+    "Calibration : November 22, 2023 by Instantel"
+    "TranTestResults : Passed"
+    "TitleString1 : UPMC Presby-Loc 3-Level1-1R Elevator Rm"
+    Waveform Data Channels
+        Tran    Vert    Long    MicL
+        0.0003  -0.0003  0.0003  0.00013
+        ...
+"""
+
+from __future__ import annotations
+
+import datetime
+import re
+from typing import Any, Dict, Optional, Tuple, Union
+
+
+# Lines look like:  "Key : Value"   (quotes literal, single ":" separator)
+_LINE_RE = re.compile(r'^\s*"?([^":]+?)"?\s*:\s*"?(.*?)"?\s*$')
+
+# Marker that ends the metadata block — everything after is raw sample data.
+_WAVEFORM_BLOCK_MARKER = "waveform data channels"
+
+
+def _normalize_key(raw: str) -> str:
+    """Convert "TranPPV" / "PreTriggerLength" → snake_case."""
+    s = raw.strip()
+    # Underscore at lower/digit→upper boundaries, then split acronym runs ("ZCFreq" → "ZC_Freq")
+    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
+    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
+    s = s.replace("-", "_").replace(" ", "_")
+    return s.lower()
+
+
+def _strip_unit_suffix(value: str) -> str:
+    """Return the numeric part of values like "0.2119 in/s" → "0.2119".
+
+    Also strips Thor's below/above-threshold prefixes:
+      "<0.005 in/s"  → "0.005"   (below-noise-floor reading)
+      ">100 Hz"      → "100"     (above-measurement-range reading)
+    """
+    parts = value.strip().split()
+    token = parts[0] if parts else value.strip()
+    if token.startswith("<") or token.startswith(">"):
+        token = token[1:]
+    return token
+
+
+def _parse_float(value: str) -> Optional[float]:
+    try:
+        return float(_strip_unit_suffix(value))
+    except (ValueError, TypeError):
+        return None
+
+
+def _parse_int(value: str) -> Optional[int]:
+    try:
+        return int(float(_strip_unit_suffix(value)))
+    except (ValueError, TypeError):
+        return None
+
+
+def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
+    """
+    Parse a Thor IDFW.txt / IDFH.txt sidecar.
+
+    Returns a flat dict with two kinds of entries:
+
+      - **Raw fields** — every `Key : Value` line, keyed by snake_case
+        of the original key, value as a string (unit suffix preserved).
+        Lets callers grab any field we haven't explicitly normalised.
+
+      - **Derived fields** — a curated set with parsed types:
+          * `serial_number`     str
+          * `event_type`        str  ("Full Waveform" / "Full Histogram")
+          * `event_datetime`    ISO-8601 string ("YYYY-MM-DDTHH:MM:SS") when
+                                 both EventDate and EventTime are present
+          * `sample_rate`       int  (samples/sec)
+          * `tran_ppv`,`vert_ppv`,`long_ppv` float (in/s)
+          * `mic_ppv`           float (dB or psi — same units as MicPSPL)
+          * `peak_vector_sum`   float (in/s)
+          * `tran_zc_freq`,`vert_zc_freq`,`long_zc_freq` float (Hz)
+          * `record_time_sec`   float (seconds)
+          * `pre_trigger_sec`   float (seconds)
+          * `project`           str  (from TitleString1 — Thor's location)
+          * `client`            str  (TitleString2)
+          * `operator`          str  (TitleString3 — company/operator)
+          * `notes`             str  (TitleString4)
+          * `setup`             str
+          * `version`           str  (firmware)
+          * `battery_volts`     float
+          * `calibration_text`  str  (e.g. "November 22, 2023 by Instantel")
+          * `tran_test_passed`, `vert_test_passed`, `long_test_passed`,
+            `mic_test_passed`  bool  ("Passed" → True; anything else → False)
+          * `filename`          str  (FileName line — useful sanity check)
+
+    Stops parsing at the literal "Waveform Data Channels" line; the
+    tabular sample block that follows it is not parsed here.
+
+    Input may be `str` or `bytes` (`utf-8`/`latin-1` tolerant).
+    """
+    if isinstance(text, bytes):
+        try:
+            text = text.decode("utf-8")
+        except UnicodeDecodeError:
+            text = text.decode("latin-1", errors="replace")
+
+    raw: Dict[str, str] = {}
+
+    for line in text.splitlines():
+        stripped = line.strip()
+        if not stripped:
+            continue
+        if stripped.lower().startswith(_WAVEFORM_BLOCK_MARKER):
+            break
+        m = _LINE_RE.match(stripped)
+        if not m:
+            continue
+        key = _normalize_key(m.group(1))
+        value = m.group(2).strip()
+        # Multi-value lines (Channel, Units, etc.) — coalesce by appending.
+        if key in raw:
+            raw[key] = raw[key] + "; " + value
+        else:
+            raw[key] = value
+
+    out: Dict[str, Any] = dict(raw)  # keep all raw fields
+
+    # ── Derived fields ───────────────────────────────────────────────────────
+
+    def _take(*candidates: str) -> Optional[str]:
+        for c in candidates:
+            if c in raw:
+                return raw[c]
+        return None
+
+    # Event identity
+    if "serial_number" in raw:
+        out["serial_number"] = raw["serial_number"]
+    if "event_type" in raw:
+        out["event_type"] = raw["event_type"]
+    if "file_name" in raw:
+        out["filename"] = raw["file_name"]
+
+    # Combined date+time.  Waveform sidecars use "EventDate" / "EventTime";
+    # histogram sidecars use "HistogramStartDate" / "HistogramStartTime".
+    # Prefer the event_* names when both are present.
+    ed = raw.get("event_date") or raw.get("histogram_start_date")
+    et = raw.get("event_time") or raw.get("histogram_start_time")
+    if ed and et:
+        try:
+            dt = datetime.datetime.strptime(f"{ed} {et}", "%Y-%m-%d %H:%M:%S")
+            out["event_datetime"] = dt.isoformat()
+        except ValueError:
+            pass
+
+    # Numeric scalars.  For every field we typify here, we MUST drop the
+    # raw string copy from `out` when parsing fails — Thor writes things
+    # like "<0.005 in/s" (below threshold) and "N/A" (not measured) that
+    # would otherwise linger in `out` as strings, sneak into SQLite REAL
+    # columns via permissive type affinity, and then crash the JS
+    # frontend on `.toFixed(...)`.
+    int_fields = ("sample_rate",)
+    for key in int_fields:
+        v = raw.get(key)
+        if v is None:
+            continue
+        iv = _parse_int(v)
+        if iv is not None:
+            out[key] = iv
+        else:
+            out.pop(key, None)
+
+    float_fields = (
+        "tran_ppv", "vert_ppv", "long_ppv", "peak_vector_sum",
+        "tran_zc_freq", "vert_zc_freq", "long_zc_freq",
+        "tran_peak_acceleration", "vert_peak_acceleration",
+        "long_peak_acceleration",
+        "tran_peak_displacement", "vert_peak_displacement",
+        "long_peak_displacement",
+        "tran_time_of_peak", "vert_time_of_peak", "long_time_of_peak",
+        "mic_time_of_peak", "mic_zc_freq",
+    )
+    for key in float_fields:
+        v = raw.get(key)
+        if v is None:
+            continue
+        fv = _parse_float(v)
+        if fv is not None:
+            out[key] = fv
+        else:
+            out.pop(key, None)
+
+    # Microphone — Thor reports MicPSPL (dB(L)) which is the closest
+    # analogue to BW's mic_ppv.  The raw "99.4 dB(L)" string stays in
+    # `out` under the original `mic_pspl` key for display; the parsed
+    # float goes in `mic_ppv`.
+    mic = raw.get("mic_pspl")
+    if mic is not None:
+        fv = _parse_float(mic)
+        if fv is not None:
+            out["mic_ppv"] = fv
+
+    # Record / pre-trigger duration: typed keys are set only when parsing succeeds.
+    rt = raw.get("record_time")
+    if rt is not None:
+        fv = _parse_float(rt)
+        if fv is not None:
+            out["record_time_sec"] = fv
+    pt = raw.get("pre_trigger_length")
+    if pt is not None:
+        fv = _parse_float(pt)
+        if fv is not None:
+            out["pre_trigger_sec"] = fv
+
+    # Project / client / operator / location strings.  Thor's title
+    # strings are operator-defined; conventional mapping (per Thor's
+    # default TitleNote labels in the example data):
+    #   TitleString1 = Location  → project (sensor location identifier)
+    #   TitleString2 = Client    → client
+    #   TitleString3 = Company   → operator (the monitoring company)
+    #   TitleString4 = Notes     → notes
+    out["project"]  = _take("title_string1")
+    out["client"]   = _take("title_string2")
+    out["operator"] = _take("title_string3", "operator")
+    out["notes"]    = _take("title_string4", "post_event_note")
+
+    if "setup" in raw:
+        out["setup"] = raw["setup"]
+    if "version" in raw:
+        out["version"] = raw["version"]
+
+    # Battery (e.g. "3.8 volts" → 3.8)
+    bl = raw.get("battery_level")
+    if bl is not None:
+        fv = _parse_float(bl)
+        if fv is not None:
+            out["battery_volts"] = fv
+
+    # Calibration line is free-form (e.g. "November 22, 2023 by Instantel").
+    if "calibration" in raw:
+        out["calibration_text"] = raw["calibration"]
+
+    # Sensor self-check results — bool flags
+    for key, out_key in (
+        ("tran_test_results", "tran_test_passed"),
+        ("vert_test_results", "vert_test_passed"),
+        ("long_test_results", "long_test_passed"),
+        ("mic_test_results",  "mic_test_passed"),
+    ):
+        v = raw.get(key)
+        if v is not None:
+            out[out_key] = v.strip().lower() == "passed"
+
+    return out
+
+
+def serial_from_filename(name: str) -> Optional[str]:
+    """Convenience: pull the serial prefix from a Thor event filename.
+
+    Thor uses the literal serial as the filename prefix:
+      UM11719_20231219163444.IDFW  →  "UM11719"
+      BE9439_20200713124251.IDFH   →  "BE9439"
+    """
+    m = re.match(r"^([A-Z]{2}\d+)_\d{14}\.(IDFH|IDFW)(?:\.txt)?$",
+                 name, re.IGNORECASE)
+    return m.group(1).upper() if m else None
+
+
+def parse_event_filename(name: str) -> Optional[Tuple[str, datetime.datetime, str]]:
+    """Parse `<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>` → (serial, datetime, kind).
+
+    `kind` is "IDFH" or "IDFW" (upper-case).  Returns None on no match.
+    """
+    m = re.match(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$",
+                 name, re.IGNORECASE)
+    if not m:
+        return None
+    try:
+        ts = datetime.datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
+    except ValueError:
+        return None
+    return m.group(1).upper(), ts, m.group(3).upper()
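The key normalisation and threshold-marker handling in `idf_ascii_report.py` can be checked in isolation. The sketch below re-declares the line regex and the two helpers under public names (`LINE_RE`, `normalize_key`, `parse_float`, chosen here only so the snippet runs outside the package) and feeds them a few sidecar-style lines; the sample lines are made up in the style of the docstring example.

```python
import re
from typing import Optional

# Same pattern as _LINE_RE in the module: optional quotes, first ":" splits.
LINE_RE = re.compile(r'^\s*"?([^":]+?)"?\s*:\s*"?(.*?)"?\s*$')


def normalize_key(raw: str) -> str:
    """CamelCase key -> snake_case, keeping acronym runs together."""
    s = raw.strip()
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)      # "TranPPV" -> "Tran_PPV"
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)    # "ZCFreq"  -> "ZC_Freq"
    return s.replace("-", "_").replace(" ", "_").lower()


def parse_float(value: str) -> Optional[float]:
    """First whitespace-separated token as a float, minus a leading < or >."""
    parts = value.strip().split()
    token = parts[0] if parts else value.strip()
    if token.startswith(("<", ">")):
        token = token[1:]
    try:
        return float(token)
    except ValueError:
        return None


SAMPLE_LINES = (
    '"TranPPV : <0.005 in/s"',   # below-threshold marker
    '"TranZCFreq : >100 Hz"',    # above-range marker
    '"MicPSPL : N/A"',           # not measured
    '"EventTime : 16:27:23"',    # value itself contains colons
)

for line in SAMPLE_LINES:
    key, value = LINE_RE.match(line).groups()
    print(normalize_key(key), repr(value), parse_float(value))
```

Values that cannot be parsed come back as `None`, which is what lets the caller drop the key instead of passing a raw string on to a numeric column.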
@@ -0,0 +1,64 @@
+"""
+micromate/idf_file.py — placeholder for the Thor IDF binary codec.
+
+Thor's ``.IDFH`` (histogram) and ``.IDFW`` (waveform) event files are an
+Instantel proprietary binary format that has not yet been reverse-
+engineered.  Today seismo-relay treats them as opaque blobs:
+``WaveformStore.save_imported_idf`` stores the bytes verbatim and reads
+all device-authoritative metadata from the paired ``.IDFW.txt`` /
+``.IDFH.txt`` ASCII sidecar (parsed by ``idf_ascii_report.py``).
+
+When we crack the binary codec — same reverse-engineering playbook we
+used to byte-perfect-parse Series III BW files (see
+``docs/instantel_protocol_reference.md`` and ``minimateplus/event_file_io.py``)
+— this module will grow:
+
+  - ``read_idf_file(path) -> IdfEvent``
+        Parse a ``.IDFW``/``.IDFH`` binary and return a fully populated
+        ``IdfEvent`` whose waveform-sample arrays come from the binary
+        (the .txt sidecar's tabular sample block being a best-effort
+        check).  Lets us ingest Thor events even when the operator
+        hasn't enabled the .txt exporter — closing the
+        ``had_report=False`` gap that the thor-watcher forwarder
+        currently tolerates as a known limitation.
+
+  - ``write_idf_file(path, event)`` (eventually)
+        Round-trip event reconstruction, used for verifying the codec
+        against captured device files the way ``write_blastware_file``
+        verifies the Series III codec.
+
+  - Helpers for decoding the binary's per-channel sample arrays into
+    physical units, the per-event flash buffer's monitor-log records,
+    etc.
+
+The reverse-engineering path: pair every ``.IDFW`` binary in
+``thor-watcher/example-data/`` with its sibling ``.IDFW.txt``, treating
+the txt's "Waveform Data Channels" block as ground-truth, and align
+the binary's per-channel int16-or-similar arrays against it.  Header
+fields (sample rate, channel count, record time, timestamps) sit before
+the sample block — same approach as the BW codec where ASCII strings
+inside the binary (``Project:``, ``Client:``, etc.) anchored field
+discovery.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Union
+
+from .models import IdfEvent
+
+
+def read_idf_file(path: Union[str, Path]) -> "IdfEvent":
+    """Parse a Thor ``.IDFW``/``.IDFH`` binary into an ``IdfEvent``.
+
+    Not yet implemented.  When implemented, this will be the canonical
+    entry point for reading Thor binaries — the ASCII sidecar parser
+    becomes an optional fast-path metadata supplement rather than the
+    sole source of device-authoritative data.
+    """
+    raise NotImplementedError(
+        "IDF binary codec not yet implemented; the .IDFW/.IDFH binary format "
+        "is undecoded.  Use parse_idf_report() on the paired .txt sidecar "
+        "for device-authoritative metadata."
+    )
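The reverse-engineering path described in the `idf_file.py` docstring (treat the sidecar's sample block as ground truth and align the binary against it) could start from a brute-force search like the one below. Everything here is an assumption for illustration: the candidate encodings, the idea that one channel is stored contiguously with a linear scale factor, and the names `CANDIDATES`, `decode`, `pearson` and `best_alignment`. The demo blob is synthetic, not a real Thor file.

```python
import math
import struct
from typing import Sequence, Tuple

# Assumed candidate encodings; the real sample format is still unknown.
CANDIDATES = {"int8": "b", "int16-le": "<h", "int16-be": ">h"}


def decode(blob: bytes, fmt: str, offset: int, count: int) -> Tuple[int, ...]:
    """Unpack `count` consecutive samples of one assumed encoding."""
    order, code = (fmt[0], fmt[1]) if len(fmt) == 2 else ("", fmt)
    return struct.unpack_from(f"{order}{count}{code}", blob, offset)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_alignment(blob: bytes, truth: Sequence[float]) -> Tuple[float, str, int]:
    """Try every encoding at every byte offset; return (score, label, offset)."""
    best = (-2.0, "", -1)
    count = len(truth)
    for label, fmt in CANDIDATES.items():
        width = struct.calcsize(fmt)
        for offset in range(len(blob) - width * count + 1):
            score = pearson(decode(blob, fmt, offset, count), truth)
            if score > best[0]:
                best = (score, label, offset)
    return best


# Synthetic stand-in: 16 header bytes, ten int16-LE samples, 4 trailer bytes.
counts = [3, -3, 3, 120, -250, 400, -90, 15, -7, 2]
truth = [c * 0.0001 for c in counts]          # as a sidecar column would read
blob = bytes(16) + struct.pack("<10h", *counts) + bytes(4)

score, label, offset = best_alignment(blob, truth)
print(label, offset, round(score, 6))
```

Correlation is used rather than equality because the scale factor from raw counts to physical units is unknown at this stage; a real file would also need interleaved-channel layouts added to the search.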
@@ -0,0 +1,377 @@
+"""
+Micromate (Series IV / Thor) native data models.
+
+These are the right-shaped dataclasses for Thor data — Thor measures
+the microphone in dB(L) directly, so this model carries
+``mic_pspl_dbl`` rather than the pseudo-``psi`` shoehorn that
+``minimateplus.PeakValues`` uses for Series III BW data.
+
+The ingest pipeline today goes:
+
+    .IDFW.txt  →  parse_idf_report()  →  dict
+    dict       →  IdfEvent.from_report()  →  IdfEvent  (typed)
+    IdfEvent   →  IdfEvent.to_minimateplus_event()  →  shape DB / sidecar
+                                                     machinery expects
+
+The ``to_minimateplus_event()`` bridge is a temporary boundary — when we
+crack the binary IDF codec and have richer per-event data to store, the
+DB schema will grow Series-IV-specific columns and the bridge will
+shrink or disappear.
+"""
+
+from __future__ import annotations
+
+import datetime
+from dataclasses import dataclass, field
+from typing import Any, Dict, Optional
+
+
+# ── IdfReport ─────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfReport:
+    """Typed wrapper around the dict returned by ``parse_idf_report``.
+
+    All fields optional — Thor's exporter is permissive and some IDF .txt
+    files (especially histograms) omit fields that waveform sidecars
+    include.  Use ``.raw`` for any field this dataclass hasn't surfaced
+    yet (the parser keeps every recognised key in the raw dict).
+    """
+
+    # Identity / kind
+    serial_number:     Optional[str] = None
+    event_type:        Optional[str] = None      # "Full Waveform" | "Full Histogram"
+    event_datetime:    Optional[datetime.datetime] = None
+    filename:          Optional[str] = None      # echoed by Thor's exporter
+
+    # Sampling / timing
+    sample_rate:       Optional[int]   = None    # samples/sec
+    record_time_sec:   Optional[float] = None
+    pre_trigger_sec:   Optional[float] = None
+
+    # Geophone peaks (in/s)
+    tran_ppv:          Optional[float] = None
+    vert_ppv:          Optional[float] = None
+    long_ppv:          Optional[float] = None
+    peak_vector_sum:   Optional[float] = None
+
+    # Microphone — Thor's native unit is dB(L), NOT psi.
+    mic_pspl_dbl:      Optional[float] = None
+
+    # Zero-crossing frequencies (Hz)
+    tran_zc_freq:      Optional[float] = None
+    vert_zc_freq:      Optional[float] = None
+    long_zc_freq:      Optional[float] = None
+    mic_zc_freq:       Optional[float] = None
+
+    # Per-channel time of peak (sec, since event start)
+    tran_time_of_peak: Optional[float] = None
+    vert_time_of_peak: Optional[float] = None
+    long_time_of_peak: Optional[float] = None
+    mic_time_of_peak:  Optional[float] = None
+
+    # Derived per-channel motion
+    tran_peak_acceleration: Optional[float] = None    # g
+    vert_peak_acceleration: Optional[float] = None
+    long_peak_acceleration: Optional[float] = None
+    tran_peak_displacement: Optional[float] = None    # in
+    vert_peak_displacement: Optional[float] = None
+    long_peak_displacement: Optional[float] = None
+
+    # Operator-supplied strings (Thor's TitleString1..4 → semantic slots)
+    project:           Optional[str] = None    # TitleString1
+    client:            Optional[str] = None    # TitleString2
+    operator:          Optional[str] = None    # TitleString3
+    notes:             Optional[str] = None    # TitleString4 / PostEventNote
+    setup:             Optional[str] = None    # setup file name
+
+    # Sensor self-check results
+    tran_test_passed:  Optional[bool] = None
+    vert_test_passed:  Optional[bool] = None
+    long_test_passed:  Optional[bool] = None
+    mic_test_passed:   Optional[bool] = None
+
+    # Device-fixed metadata
+    firmware_version:  Optional[str]   = None
+    calibration_text:  Optional[str]   = None
+    battery_volts:     Optional[float] = None
+
+    # Original parser dict — preserves every recognised key (including
+    # raw unit-suffixed strings) for forward-compatible field access.
+    raw: Dict[str, Any] = field(default_factory=dict, repr=False)
+
+    @classmethod
+    def from_dict(cls, d: Dict[str, Any]) -> "IdfReport":
+        """Build an IdfReport from the dict returned by ``parse_idf_report``."""
+        ed = d.get("event_datetime")
+        if isinstance(ed, str):
+            try:
+                ed = datetime.datetime.fromisoformat(ed)
+            except ValueError:
+                ed = None
+
+        return cls(
+            serial_number     = d.get("serial_number"),
+            event_type        = d.get("event_type"),
+            event_datetime    = ed if isinstance(ed, datetime.datetime) else None,
+            filename          = d.get("filename"),
+            sample_rate       = d.get("sample_rate"),
+            record_time_sec   = d.get("record_time_sec"),
+            pre_trigger_sec   = d.get("pre_trigger_sec"),
+            tran_ppv          = d.get("tran_ppv"),
+            vert_ppv          = d.get("vert_ppv"),
+            long_ppv          = d.get("long_ppv"),
+            peak_vector_sum   = d.get("peak_vector_sum"),
+            mic_pspl_dbl      = d.get("mic_ppv"),       # parser names it mic_ppv (legacy)
+            tran_zc_freq      = d.get("tran_zc_freq"),
+            vert_zc_freq      = d.get("vert_zc_freq"),
+            long_zc_freq      = d.get("long_zc_freq"),
+            mic_zc_freq       = d.get("mic_zc_freq"),
+            tran_time_of_peak = d.get("tran_time_of_peak"),
+            vert_time_of_peak = d.get("vert_time_of_peak"),
+            long_time_of_peak = d.get("long_time_of_peak"),
+            mic_time_of_peak  = d.get("mic_time_of_peak"),
+            tran_peak_acceleration = d.get("tran_peak_acceleration"),
+            vert_peak_acceleration = d.get("vert_peak_acceleration"),
+            long_peak_acceleration = d.get("long_peak_acceleration"),
+            tran_peak_displacement = d.get("tran_peak_displacement"),
+            vert_peak_displacement = d.get("vert_peak_displacement"),
+            long_peak_displacement = d.get("long_peak_displacement"),
+            project           = d.get("project"),
+            client            = d.get("client"),
+            operator          = d.get("operator"),
+            notes             = d.get("notes"),
+            setup             = d.get("setup"),
+            tran_test_passed  = d.get("tran_test_passed"),
+            vert_test_passed  = d.get("vert_test_passed"),
+            long_test_passed  = d.get("long_test_passed"),
+            mic_test_passed   = d.get("mic_test_passed"),
+            firmware_version  = d.get("version"),
+            calibration_text  = d.get("calibration_text"),
+            battery_volts     = d.get("battery_volts"),
+            raw               = d,
+        )
+
+
+# ── IdfPeaks / IdfProjectInfo / IdfSensorCheck (narrow grouping types) ───────
+
+
+@dataclass
+class IdfPeaks:
+    """Geophone + mic peak values for one Thor event.  Native Thor units."""
+    transverse_ips:    Optional[float] = None    # in/s
+    vertical_ips:      Optional[float] = None    # in/s
+    longitudinal_ips:  Optional[float] = None    # in/s
+    peak_vector_sum_ips: Optional[float] = None  # in/s
+    mic_pspl_dbl:      Optional[float] = None    # dB(L)
+
+
+@dataclass
+class IdfProjectInfo:
+    """Operator-supplied strings from Thor's TitleString1..4."""
+    project:  Optional[str] = None
+    client:   Optional[str] = None
+    operator: Optional[str] = None
+    notes:    Optional[str] = None
+    setup:    Optional[str] = None
+
+
+@dataclass
+class IdfSensorCheck:
+    """Per-channel pass/fail from Thor's self-test."""
+    tran: Optional[bool] = None
+    vert: Optional[bool] = None
+    long: Optional[bool] = None
+    mic:  Optional[bool] = None
+
+
+# ── IdfEvent ─────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class IdfEvent:
+    """A single Thor / Micromate Series IV event.
+
+    Built from a parsed .IDFW.txt or .IDFH.txt sidecar via
+    ``IdfEvent.from_report()``.  The filename determines the kind and
+    supplies serial + timestamp when the report lacks them; the .txt
+    provides device-authoritative peak values, frequencies, project
+    strings, sensor self-check, firmware, calibration.
+    """
+
+    # Identity
+    serial:    str
+    timestamp: datetime.datetime
+    kind:      str                  # "Waveform" | "Histogram"
+    filename:  str                  # device-native binary filename, e.g. "UM11719_20231219163444.IDFW"
+
+    # Sampling / timing
+    sample_rate:     Optional[int]   = None
+    record_time_sec: Optional[float] = None
+    pre_trigger_sec: Optional[float] = None
+
+    # Peaks
+    peaks: IdfPeaks = field(default_factory=IdfPeaks)
+
+    # Per-channel frequencies (Hz)
+    tran_zc_freq: Optional[float] = None
+    vert_zc_freq: Optional[float] = None
+    long_zc_freq: Optional[float] = None
+    mic_zc_freq:  Optional[float] = None
+
+    # Project strings
+    project_info: IdfProjectInfo = field(default_factory=IdfProjectInfo)
+
+    # Sensor self-check
+    sensor_check: IdfSensorCheck = field(default_factory=IdfSensorCheck)
+
+    # Device-fixed
+    firmware_version: Optional[str]   = None
+    calibration_text: Optional[str]   = None
+    battery_volts:    Optional[float] = None
+
+    # The full parsed report — preserves anything not surfaced as a typed field
+    report: IdfReport = field(default_factory=IdfReport)
+
+    @classmethod
+    def from_report(
+        cls,
+        report: Any,
+        filename: str,
+    ) -> "IdfEvent":
+        """Build an IdfEvent from a parsed report (dict or IdfReport) and
+        the device-native binary filename.
+
+        The filename is authoritative for kind: Thor's filenames are
+        literal ``<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>``.  Serial and
+        timestamp are taken from the report when it carries them (the
+        report has finer-grained device-reported time-of-trigger
+        semantics) and from the filename otherwise.  If the filename
+        does not parse, kind falls back to the report's ``event_type``.
+        """
+        from .idf_ascii_report import parse_event_filename
+
+        # Normalise input to IdfReport
+        if isinstance(report, IdfReport):
+            rep = report
+        elif isinstance(report, dict):
+            rep = IdfReport.from_dict(report)
+        else:
+            raise TypeError(
+                f"report must be IdfReport or dict; got {type(report).__name__}"
+            )
+
+        # Filename → (serial, timestamp, kind).  When the filename does not
+        # parse, substitute report values (or sentinels) for all three.
+        parsed = parse_event_filename(filename)
+        if parsed is not None:
+            fn_serial, fn_ts, fn_kind = parsed
+            kind = "Histogram" if fn_kind == "IDFH" else "Waveform"
+        else:
+            fn_serial = rep.serial_number or "UNKNOWN"
+            fn_ts     = rep.event_datetime or datetime.datetime(1970, 1, 1)
+            kind      = "Waveform" if (rep.event_type or "").lower().startswith("full waveform") else "Histogram"
+
+        # Prefer report's event_datetime (device-authoritative) over the filename.
+        ts = rep.event_datetime or fn_ts
+        serial = rep.serial_number or fn_serial
+
+        return cls(
+            serial=serial,
+            timestamp=ts,
+            kind=kind,
+            filename=filename,
+            sample_rate=rep.sample_rate,
+            record_time_sec=rep.record_time_sec,
+            pre_trigger_sec=rep.pre_trigger_sec,
+            peaks=IdfPeaks(
+                transverse_ips      = rep.tran_ppv,
+                vertical_ips        = rep.vert_ppv,
+                longitudinal_ips    = rep.long_ppv,
+                peak_vector_sum_ips = rep.peak_vector_sum,
+                mic_pspl_dbl        = rep.mic_pspl_dbl,
+            ),
+            tran_zc_freq=rep.tran_zc_freq,
+            vert_zc_freq=rep.vert_zc_freq,
+            long_zc_freq=rep.long_zc_freq,
+            mic_zc_freq=rep.mic_zc_freq,
+            project_info=IdfProjectInfo(
+                project=rep.project,
+                client=rep.client,
+                operator=rep.operator,
+                notes=rep.notes,
+                setup=rep.setup,
+            ),
+            sensor_check=IdfSensorCheck(
+                tran=rep.tran_test_passed,
+                vert=rep.vert_test_passed,
+                long=rep.long_test_passed,
+                mic=rep.mic_test_passed,
+            ),
+            firmware_version=rep.firmware_version,
+            calibration_text=rep.calibration_text,
+            battery_volts=rep.battery_volts,
+            report=rep,
+        )
+
+    # ── Bridge to minimateplus shape (for the existing DB / sidecar paths) ──
+
+    def to_minimateplus_event(self, waveform_key: bytes) -> Any:
+        """Project this Thor event into the shape ``minimateplus.Event``
+        carries, so it can flow through the existing
+        ``SeismoDb.insert_events()`` and ``event_to_sidecar_dict()``
+        machinery without those code paths needing to know about Thor.
+
+        Caveats of the bridge:
+          - ``mic_ppv`` on the produced Event carries Thor's dB(L) value
+            verbatim — the UI distinguishes via the ``device_family``
+            column (Phase 1).  Don't run the BW psi→dBL converter on
+            Series IV rows.
+          - Many Thor-specific fields (Peak Acceleration / Displacement,
+            sensor self-check, calibration) don't have a slot in
+            ``Event``.  The full IdfReport is preserved on the
+            ``.sfm.json`` sidecar under ``extensions.idf_report`` via
+            ``save_imported_idf`` — that's the source of truth for them.
+        """
+        from minimateplus.models import (
+            Event, PeakValues, ProjectInfo, Timestamp,
+        )
+
+        ts_obj = Timestamp(
+            raw=bytes(9),
+            flag=0,
+            year=self.timestamp.year,
+            unknown_byte=0,
+            month=self.timestamp.month,
+            day=self.timestamp.day,
+            hour=self.timestamp.hour,
+            minute=self.timestamp.minute,
+            second=self.timestamp.second,
+        )
+        pv = PeakValues(
+            tran=self.peaks.transverse_ips,
+            vert=self.peaks.vertical_ips,
+            long=self.peaks.longitudinal_ips,
+            micl=self.peaks.mic_pspl_dbl,   # dB(L) — see caveat above
+            peak_vector_sum=self.peaks.peak_vector_sum_ips,
+        )
+        pi = ProjectInfo(
+            setup_name=self.project_info.setup,
+            project=self.project_info.project,
+            client=self.project_info.client,
+            operator=self.project_info.operator,
+            sensor_location=None,           # Thor folds location into project string
+            notes=self.project_info.notes,
+        )
+        ev = Event(
+            index=0,
+            timestamp=ts_obj,
+            sample_rate=self.sample_rate,
+            peak_values=pv,
+            project_info=pi,
+            record_type=self.kind,
+            rectime_seconds=self.record_time_sec,
+        )
+        ev._waveform_key = waveform_key
+        return ev
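The bridge caveat above (the mic value on a Series IV row is already dB(L), so the psi converter must not run on it) amounts to a dispatch on `device_family`. A minimal sketch follows, assuming a hypothetical helper name `mic_db_linear` and the standard sound-pressure-level formula with a 20 µPa reference; whether the project's own BW converter uses exactly this formula is not stated in the commit.

```python
import math
from typing import Optional

PSI_TO_PA = 6894.757        # pascals per psi
REF_PRESSURE_PA = 20e-6     # 20 µPa, the usual airborne-sound reference


def mic_db_linear(value: Optional[float], device_family: str) -> Optional[float]:
    """Return the mic peak in dB(L) for display, dispatching on family.

    series4: the stored value is already dB(L) and is passed through.
    series3: the stored value is treated as psi and converted (assumed formula).
    """
    if value is None:
        return None
    if device_family == "series4":
        return value
    if device_family == "series3":
        if value <= 0:
            return None  # log10 is undefined; treat as "no reading"
        return 20.0 * math.log10(value * PSI_TO_PA / REF_PRESSURE_PA)
    raise ValueError(f"unknown device_family: {device_family!r}")


print(mic_db_linear(99.4, "series4"))
print(round(mic_db_linear(0.001, "series3"), 1))
```

Raising on an unknown family, rather than guessing from the magnitude of the value, keeps the dispatch on the explicit column that Phase 1 introduced.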