diff --git a/CLAUDE.md b/CLAUDE.md
index 5dd6629..e46b30b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -8,6 +8,84 @@ When new information about the protocol is discovered, please update the instant
 
 ---
 
+## Architecture: three-tier conceptual model
+
+seismo-relay is a **suite of cooperating components**, not a single app.
+The three tiers below are the canonical mental model — the current
+directory layout doesn't fully reflect them yet (some of what is
+conceptually SDM lives under `sfm/` today), but new code should be
+placed and named according to this model.
+
+### 1. SFM — the device-side (active connection to physical units)
+
+Replaces Blastware's *talk-to-the-meter* role. Lives where a connection
+to a physical seismograph is open.
+
+In scope:
+- `minimateplus/{transport,framing,protocol,client}.py` — wire protocol
+- `seismo_lab.py` — diagnostic GUI (a thick client for SFM)
+- The `/device/*` HTTP endpoints in `sfm/server.py` —
+  `/device/info`, `/device/events`, `/device/monitor/*`, `/device/call_home`,
+  etc. Anything that opens a connection at the moment of the request.
+- Future: a Thor / Micromate live client (mirror `minimateplus/`)
+- Future: a control surface Terra-View can launch into — see the
+  README's Roadmap.
+
+Does NOT own a database. Outputs `Event` objects. Has a "spun up when
+needed" runtime profile rather than "always on".
+
+### 2. SDM — the data-side (storage, ingest, and serving)
+
+The new name for the receiving-and-storing role. Originally called SFM
+because the FastAPI service started life as a thin device proxy, but
+the actual role has migrated heavily toward data management. **For now
+the directory remains `sfm/`** — renaming requires touching ~30-50
+files in seismo-relay + ~10-15 in terra-view + a Docker volume
+migration; deferred until the codebase is quiet enough to do it as a
+clean refactor.
+
+In scope:
+- `sfm/database.py` (`SeismoDb`)
+- `sfm/waveform_store.py`, `sfm/event_hdf5.py`
+- The `/db/*` HTTP endpoints — `events`, `units`, `monitor_log`,
+  `sessions`, `false_trigger` mutations
+- The `/db/import/*` ingest endpoints — `blastware_file` (series3),
+  `idf_file` (series4); anything that receives events FROM somewhere
+- `scripts/backfill_sidecars.py`, `scripts/check_bw_report_preservation.py`,
+  and similar data-maintenance tools
+- The `.sfm.json` sidecars and `.h5` files in the waveform store
+- The shape that Terra-View consumes (Terra-View should never need to
+  reach into SFM/device-side endpoints to populate its UI)
+
+Always-on, scaled for storage/serving, has the DB and waveform store.
+
+### 3. Codec library — pure data interpretation (used by both sides)
+
+Neither SFM nor SDM — a shared library both depend on.
+
+In scope:
+- `minimateplus/{waveform_codec,histogram_codec,event_file_io,bw_ascii_report,blastware_file}.py`
+- `micromate/{idf_ascii_report,idf_file}.py`
+
+These modules take bytes (off the wire on the SFM side, or from a
+forwarded file on the SDM side) and return `Event` objects. They
+should not import from `sfm/`, must not touch a DB, and have no I/O
+beyond reading files passed as arguments. Keep them pure — both
+tiers can then depend on them without circularity.
+
+### Practical consequences
+
+When deciding where new code goes, ask:
+- *Does it need a connection to a device?* → SFM
+- *Does it operate on stored events / sidecars / DB rows?* → SDM
+- *Does it interpret bytes into structured data, with no I/O of its own?* → codec lib
+
+Terra-View is downstream of SDM for data, and (per the roadmap) will
+eventually invoke into SFM's device-control endpoints to provide a
+"connect to unit" experience.
+
+---
+
 ## Project layout
 
 ```
diff --git a/README.md b/README.md
index c057f68..6433158 100644
--- a/README.md
+++ b/README.md
@@ -459,6 +459,72 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
 
 ## Roadmap (Future)
 
+### Strategic direction — where this is going
+
+seismo-relay is being built as a **suite of cooperating components**
+that together replace and improve on Blastware's role. Three logical
+tiers:
+
+1. **SFM** (device-side) — owns the active connection to a physical
+   unit. Today: `minimateplus/`, `/device/*` HTTP endpoints,
+   `seismo_lab.py`. Future: live Thor / Micromate support.
+2. **SDM** (data-side) — owns the database, waveform store, ingest
+   pipelines, and the read-API that Terra-View consumes. Today this
+   code lives under `sfm/` for historical reasons; the role has
+   migrated and the eventual rename is on the long-tail cleanup list.
+3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
+   `bw_ascii_report.py`, `micromate/idf_*.py`. Used by both SFM and
+   SDM, depends on neither.
+
+Terra-View is downstream of SDM for fleet listings, event detail, etc.
+The long-term vision adds a **second link** from Terra-View → SFM for
+direct device interaction (see below).
+
+The codec work in this repo isn't trying to replace BW's network
+layer — BW's ACH file forwarding and Thor's IDF call-home are
+battle-tested. The value is in the receiving and processing side: turn
+the stream of binary+ASCII pairs into something users can search,
+filter, alert on, and report from.
+
+### Terra-View ↔ SFM device control (the long-term vision)
+
+Today Terra-View only reads from SDM (event listings, dashboards,
+project reports). When a unit goes missing — operator notices in the
+Terra-View dashboard — there's no way to *do* anything from the UI.
+The path of least resistance is to RDP into a Windows box and open
+Blastware, which defeats the purpose of having Terra-View.
+
+Target experience:
+- Operator notices a unit in Terra-View dashboard hasn't called in.
+- Clicks unit detail → "Connect to Device" button.
+- Terra-View opens an embedded view (modal or side-panel) that talks
+  to SFM's `/device/*` endpoints over the network.
+- Live view: device clock, battery, memory, current monitor status.
+- Actions: start/stop monitoring, push compliance config changes, pull
+  fresh events, run a sensor self-check, change call-home settings.
+- Audit log: every connect / action recorded in SDM for the unit
+  history.
+
+Implementation steps (concrete):
+- [ ] **SFM authentication & authorization layer.** Today `/device/*`
+      endpoints are unauthenticated — anyone on the network can call
+      them. Need at minimum a token-based auth, ideally with a "who
+      can connect to which units" mapping. Hard prerequisite for
+      letting Terra-View users into the control surface.
+- [ ] **Terra-View "Connect to Device" entry point** on the unit
+      detail page. Renders only when unit has connection info on file
+      and the user has permission.
+- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
+      `seismo_lab.py`'s Bridge tab, but in the browser. Polls SFM's
+      `/device/monitor/status` on an interval; sends start/stop via
+      `/device/monitor/{start,stop}`.
+- [ ] **Action history** — every connect / push / action call records
+      a row in `unit_history`, viewable on the unit detail page.
+- [ ] **Series IV live-device support in SFM** — currently `/device/*`
+      only supports MiniMate Plus. Blocks "Connect to Device" for
+      Thor units until done. Depends on Thor wire-protocol capture
+      and a `micromate/` parallel of the `minimateplus/` modules.
+
 ### High-impact (unblocks product features)
 
 - [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
diff --git a/minimateplus/event_file_io.py b/minimateplus/event_file_io.py
index 6e5674d..66a4b68 100644
--- a/minimateplus/event_file_io.py
+++ b/minimateplus/event_file_io.py
@@ -254,6 +254,60 @@ def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
         event.rectime_seconds = report.record_time_s
 
 
+def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
+    """Mirror of ``apply_report_to_event`` for the projected sidecar
+    dict shape (as produced by ``_bw_report_to_dict``).
+
+    Why this exists
+    ───────────────
+    The ingest path holds a live ``BwAsciiReport`` parsed straight from
+    the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
+    device-authoritative peaks onto the codec output before insert.
+
+    The backfill path doesn't have the original ``.TXT`` (it's not
+    retained in the waveform store), but it does have the preserved
+    ``bw_report`` block from the sidecar — which contains the same
+    projected fields. Re-overlaying those during a backfill keeps the
+    DB peak columns aligned with what BW reports rather than letting
+    the codec output (which may be incomplete for unhandled formats or
+    walker edge cases) win by default.
+
+    No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
+    any particular sub-field — only fields with a concrete value get
+    written. Mirrors ``apply_report_to_event``'s "report wins where
+    present" semantics.
+    """
+    if not bw_report:
+        return
+    if event.peak_values is None:
+        event.peak_values = PeakValues()
+    pv = event.peak_values
+
+    peaks = bw_report.get("peaks") or {}
+    tran = (peaks.get("tran") or {}).get("ppv_ips")
+    vert = (peaks.get("vert") or {}).get("ppv_ips")
+    long = (peaks.get("long") or {}).get("ppv_ips")
+    if tran is not None: pv.tran = tran
+    if vert is not None: pv.vert = vert
+    if long is not None: pv.long = long
+    vs_ips = (peaks.get("vector_sum") or {}).get("ips")
+    if vs_ips is not None:
+        pv.peak_vector_sum = vs_ips
+
+    mic = bw_report.get("mic") or {}
+    pspl = mic.get("pspl_dbl")
+    if pspl is not None and pspl > 0:
+        pv.micl = _dbl_to_psi(pspl)
+
+    rec = bw_report.get("recording") or {}
+    sr = rec.get("sample_rate_sps")
+    if sr:
+        event.sample_rate = sr
+    rt = rec.get("record_time_s")
+    if rt is not None:
+        event.rectime_seconds = rt
+
+
 def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
     if pi is None:
         return {
diff --git a/scripts/backfill_sidecars.py b/scripts/backfill_sidecars.py
index bbe0d0f..9c4bf5d 100644
--- a/scripts/backfill_sidecars.py
+++ b/scripts/backfill_sidecars.py
@@ -309,6 +309,23 @@ def main(argv=None) -> int:
             except Exception:
                 pass
 
+            # Overlay BW ASCII report fields onto the rebuilt Event
+            # BEFORE the sidecar + DB write. Mirrors what the ingest
+            # path does — BW's reported peaks (and sample_rate /
+            # record_time) win over codec output where present.
+            #
+            # Without this step, --force backfill silently overwrites
+            # the bw_report-overlaid DB columns with codec-derived
+            # values, which is wrong for events the codec doesn't
+            # fully decode (e.g. waveform walker edge cases on
+            # SP0/SS0/SV0-style events, or histogram sub-formats with
+            # byte[5]!=0 that aren't yet RE'd). Net effect was PVS=0
+            # on three top-10 events on 2026-05-22.
+            if preserved_bw_report:
+                event_file_io.apply_bw_report_dict_to_event(
+                    ev, preserved_bw_report,
+                )
+
             sidecar = event_file_io.event_to_sidecar_dict(
                 ev,
                 serial=serial,
diff --git a/tests/test_event_file_io.py b/tests/test_event_file_io.py
index 6e08dae..0e043e8 100644
--- a/tests/test_event_file_io.py
+++ b/tests/test_event_file_io.py
@@ -529,6 +529,77 @@ def test_save_imported_bw_round_trip(tmp_path: Path):
     assert stored_path.read_bytes() == src.read_bytes()
 
 
+# ── apply_bw_report_dict_to_event ────────────────────────────────────────────
+
+
+def test_apply_bw_report_dict_overlays_peaks_and_recording():
+    """Verbatim mirror of the data shape produced by `_bw_report_to_dict`
+    when projecting a parsed `BwAsciiReport` into the sidecar. Confirms
+    each field overlays onto Event correctly so the backfill path
+    matches ingest behavior."""
+    from minimateplus.models import PeakValues
+    ev = Event(index=0)
+    bw_report = {
+        "peaks": {
+            "tran": {"ppv_ips": 9.84375},
+            "vert": {"ppv_ips": 0.305},
+            "long": {"ppv_ips": 0.405},
+            "vector_sum": {"ips": 14.86736},
+        },
+        "mic": {"pspl_dbl": 115.9},
+        "recording": {"sample_rate_sps": 1024, "record_time_s": 3.0},
+    }
+    event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
+    assert ev.peak_values is not None
+    assert ev.peak_values.tran == 9.84375
+    assert ev.peak_values.vert == 0.305
+    assert ev.peak_values.long == 0.405
+    assert ev.peak_values.peak_vector_sum == 14.86736
+    # MicL is converted dB → psi via _dbl_to_psi — just confirm non-zero
+    assert ev.peak_values.micl is not None and ev.peak_values.micl > 0
+    assert ev.sample_rate == 1024
+    assert ev.rectime_seconds == 3.0
+
+
+def test_apply_bw_report_dict_overwrites_codec_peaks():
+    """The whole point of this helper: bw_report wins over whatever the
+    codec produced. This is what the 2026-05-22 prod backfill missed —
+    DB peaks got overwritten with codec output (incl. PVS=0 on the
+    three top events) when they should have stayed bw_report-overlaid."""
+    from minimateplus.models import PeakValues
+    ev = Event(index=0)
+    # Simulate codec output that's clearly wrong (incomplete decode):
+    ev.peak_values = PeakValues(
+        tran=2.09, vert=0.0, long=0.0, peak_vector_sum=0.0,
+    )
+    bw_report = {
+        "peaks": {
+            "tran": {"ppv_ips": 9.84},
+            "vert": {"ppv_ips": 4.95},
+            "long": {"ppv_ips": 8.05},
+            "vector_sum": {"ips": 14.95},
+        },
+    }
+    event_file_io.apply_bw_report_dict_to_event(ev, bw_report)
+    assert ev.peak_values.tran == 9.84
+    assert ev.peak_values.vert == 4.95
+    assert ev.peak_values.long == 8.05
+    assert ev.peak_values.peak_vector_sum == 14.95
+
+
+def test_apply_bw_report_dict_no_op_on_empty():
+    """None / empty dict / missing keys should leave Event untouched."""
+    from minimateplus.models import PeakValues
+    for empty in (None, {}, {"peaks": {}}, {"peaks": {"tran": {}}}):
+        ev = Event(index=0)
+        ev.peak_values = PeakValues(tran=1.0, vert=2.0, long=3.0)
+        event_file_io.apply_bw_report_dict_to_event(ev, empty)
+        # Unchanged
+        assert ev.peak_values.tran == 1.0
+        assert ev.peak_values.vert == 2.0
+        assert ev.peak_values.long == 3.0
+
+
 if __name__ == "__main__":
     if pytest is not None:
         pytest.main([__file__, "-v"])
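
Review note: the three tests above cover a full overlay, a full overwrite, and the empty cases, but not a partial report in which some peaks are present and others are absent. The sketch below illustrates the "report wins where present" semantics for that case. It is a standalone approximation, not the repository code: the `PeakValues` and `Event` dataclasses are stand-ins whose fields are assumed from the names used in the diff, `overlay_bw_report` is a hypothetical name, and the mic conversion is omitted because `_dbl_to_psi` is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeakValues:
    """Stand-in for minimateplus.models.PeakValues (fields assumed)."""
    tran: Optional[float] = None
    vert: Optional[float] = None
    long: Optional[float] = None
    peak_vector_sum: Optional[float] = None


@dataclass
class Event:
    """Stand-in for the real Event model (fields assumed)."""
    index: int
    peak_values: Optional[PeakValues] = None
    sample_rate: Optional[int] = None
    rectime_seconds: Optional[float] = None


def overlay_bw_report(event: Event, bw_report: Optional[dict]) -> None:
    """Hypothetical mirror of the helper: write only fields that have a value."""
    if not bw_report:
        return
    if event.peak_values is None:
        event.peak_values = PeakValues()
    pv = event.peak_values

    peaks = bw_report.get("peaks") or {}
    for axis in ("tran", "vert", "long"):
        value = (peaks.get(axis) or {}).get("ppv_ips")
        if value is not None:  # absent axis: keep the codec value
            setattr(pv, axis, value)
    vs_ips = (peaks.get("vector_sum") or {}).get("ips")
    if vs_ips is not None:
        pv.peak_vector_sum = vs_ips

    rec = bw_report.get("recording") or {}
    if rec.get("sample_rate_sps"):
        event.sample_rate = rec["sample_rate_sps"]
    if rec.get("record_time_s") is not None:
        event.rectime_seconds = rec["record_time_s"]


# Partial report: only tran and the vector sum are present.
ev = Event(
    index=0,
    peak_values=PeakValues(tran=2.09, vert=0.5, long=0.25, peak_vector_sum=0.0),
)
overlay_bw_report(
    ev,
    {"peaks": {"tran": {"ppv_ips": 9.84}, "vector_sum": {"ips": 14.95}}},
)
print(ev.peak_values)
```

In this sketch `tran` and `peak_vector_sum` take the report values while `vert` and `long` keep the codec values. One behavior that may be worth pinning down with a test: a non-empty dict with no usable fields, such as `{"peaks": {}}`, still passes the `if not bw_report` guard, so an event whose `peak_values` is `None` ends up with an empty `PeakValues()` rather than staying untouched. The no-op test in the diff does not exercise this because it presets `peak_values`.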