Compare commits
3 Commits
| Author | SHA1 | Date |
|---|---|---|
| | e1a6fd5386 | |
| | 6b875e161b | |
| | f5c81f2cab | |
@@ -1,28 +0,0 @@
.git
.gitignore

.venv
venv
env
__pycache__
*.pyc
*.pyo
*.pyd
.pytest_cache
.mypy_cache
.ruff_cache

*.db
*.db-wal
*.db-shm
*.sqlite
*.sqlite3

sfm/data
bridges/captures
example-events
captures
logs

.DS_Store
Thumbs.db
+1
-1
@@ -1,6 +1,6 @@
/bridges/captures/
/example-events/
/tests/fixtures/

/manuals/

# Python build artifacts
-506
@@ -4,510 +4,8 @@ All notable changes to seismo-relay are documented here.
---

## [Unreleased]

### Added

- **SFM webapp now opens to Database view by default** and the History table is fully interactive. Click any column header to sort ascending / descending (timestamp, serial, per-channel PPV, PVS, mic dB(L), project, client, record type, key — all sortable). Click any event row to open the event modal, which now renders a **4-channel waveform plot inline** (MicL / Long / Vert / Tran stacked, Instantel-printout order) alongside the existing sidecar review fields. Headers are sticky so the columns stay visible while scrolling long event lists. No more "where is the viewer" — pick a unit from the filter dropdown, scan the table, click the event, see the waveform.
- **Stored-event browser** — new standalone HTML page at `GET /events` (`sfm/event_browser.html`). Pick a serial from the unit dropdown, scroll through that unit's events (newest-first), click any event to render its decoded waveform via the existing `/db/events/{id}/waveform.json` endpoint. Dark-themed Chart.js viewer, channels stacked vertically (MicL / Long / Vert / Tran — Instantel printout order, designed PDF-export-ready), trigger line at t=0, peak labels, search/filter, false-trigger flag honored. Companion to the existing live-device viewer at `/waveform`; the two routes are now clearly delineated in their docstrings. The webapp's inline plot at `/` is the primary path; `/events` remains a useful diagnostic when you want just a viewer.
- **Histogram body codec — uint8 peak count fix.** Per-channel peak fields at `block[6]/[10]/[14]/[18]` are `uint8`, not `uint16 LE` spanning `block[6:8]` etc. The original interpretation was byte-exact on the N844 fixture corpus only because every annotation byte (`block[7]/[11]/[15]/[19]`) in those fixtures was zero. On non-N844 events with non-zero annotation bytes (observed across BE9558 Tran-drift and BE18003 Histogram+Continuous units), the old interpretation produced peaks up to 268 in/s per channel and 35× inflated PVS sums when first deployed to prod (rolled back same day; properly fixed in this release). Cross-correlated against BW's per-interval ASCII export on K558 / T003 / N599 / N844 corpora — 100% byte-exact on T/V/L, 99%+ on M (sub-precision rounding). Annotation byte preserved on each record as `record["annotations"]` for future RE. Verified against ~3,500 blocks across 5 in-repo fixtures + a synthetic K558 interval-12 regression block.
- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`. Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file. BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars. Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED. Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
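The uint8 correction in the histogram codec entry above can be illustrated with a minimal sketch. The byte offsets come from that entry; the function names, the dict layout, and the demo block are hypothetical and are not the codec's actual API.

```python
# Offsets of the per-channel peak bytes named in the changelog entry.
# The annotation byte sits immediately after each peak byte.
PEAK_OFFSETS = (6, 10, 14, 18)


def read_peaks_uint8(block: bytes) -> list:
    """Read each peak as a single byte and keep the annotation byte separately."""
    return [
        {"offset": off, "peak": block[off], "annotation": block[off + 1]}
        for off in PEAK_OFFSETS
    ]


def read_peaks_uint16_le(block: bytes) -> list:
    """The retracted reading: peak and annotation fused into one uint16 LE."""
    return [int.from_bytes(block[off:off + 2], "little") for off in PEAK_OFFSETS]


# Hypothetical 20-byte block: peak 0x2A at block[6], annotation 0x01 at block[7].
demo = bytearray(20)
demo[6], demo[7] = 0x2A, 0x01

print(read_peaks_uint8(demo)[0])      # peak stays 42, annotation kept as 1
print(read_peaks_uint16_le(demo)[0])  # 0x012A = 298, the inflated reading
```

With a zero annotation byte both readings agree, which is consistent with the old interpretation passing on fixtures whose annotation bytes were all zero.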
### Fixed

- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.** Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict). Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.** The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts). Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; bit on prod 2026-05-22, rolled back the same day.
- **Thor IDF files no longer attempted as BW events in backfill.** `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
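The `bw_report` preservation fix above amounts to overlaying a few blocks from the existing sidecar onto the regenerated one. A minimal sketch, assuming sidecars are plain dicts; the helper name and the sample values are hypothetical and do not reproduce the script's real code.

```python
from typing import Optional

# Top-level sidecar blocks the changelog says survive a backfill.
PRESERVED_BLOCKS = ("bw_report", "review", "extensions")


def overlay_preserved_blocks(regenerated: dict, existing: Optional[dict]) -> dict:
    """Return a copy of the regenerated sidecar with preserved blocks carried over."""
    merged = dict(regenerated)
    if existing:
        for key in PRESERVED_BLOCKS:
            if existing.get(key) is not None:
                merged[key] = existing[key]
    return merged


# Hypothetical sidecars: the regenerated one has no bw_report at all.
old = {"peaks": {"pvs": 0.0}, "bw_report": {"pvs": 0.42}, "review": {"false_trigger": True}}
new = {"peaks": {"pvs": 0.41}}
print(overlay_preserved_blocks(new, old))
```

Copying into a new dict keeps the regenerated input untouched, which makes a before/after comparison straightforward.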
### Docs

- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes. Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
- **`docs/histogram_codec_re_status.md`** updated with the uint8 retraction and the annotation-byte status.
- Three known issues recorded in the Roadmap that were discovered during prod validation: (1) `bw_ascii_report` parser misses PPV / `vector_sum` on some `.TXT` formats (5 events on prod); (2) NULL-timestamp duplicate-row dedup needed (2 events on prod); (3) histogram body sub-format with `byte[5] != 0` not yet decoded (~3 events on prod with empty `.h5` plots).
---

## v0.19.0 — 2026-05-20

The "device-family separation" release. Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics.

### Added — Phase 1: `device_family` column on `events`

- **`events.device_family TEXT`** — new column carrying `"series3"` or `"series4"`. Populated by every import path (`/db/import/blastware_file`, `/db/import/idf_file`, ACH server, BW CLI, sidecar backfill script). Returned through `/db/events` since `query_events` uses `SELECT *`.
- **Self-applying migration** — on startup, `ALTER TABLE ... ADD COLUMN` lands the new column; a follow-on `UPDATE` backfills existing rows from the binary filename extension (`.IDFH`/`.IDFW` → `series4`, everything else → `series3`). No manual SQL needed.
- **UPSERT preserves family** — re-imports without an explicit family don't blank existing rows (`COALESCE(?, device_family)`).
- **UI dispatches on the column** — `sfm_webapp.html` events-table mic formatter now branches on `ev.device_family === 'series4'` (Thor stores native dB(L); BW stores psi). Modal uses `source.kind === 'idf-import'` from the sidecar (sidecars don't carry the DB column). Source-files section labels changed from "BW filename / BW filesize / BW sha256" to format-neutral "Event file / File size / File sha256".
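The migration and the family-preserving upsert described in Phase 1 can be sketched against an in-memory SQLite table. This is an illustration under stated assumptions, not the code in `sfm/database.py`: the reduced schema, the use of `blastware_filename` as the column holding the binary name, the `WHERE device_family IS NULL` guard, and both function names are hypothetical.

```python
import sqlite3


def migrate_device_family(conn: sqlite3.Connection) -> None:
    """Add the column if it is missing, then backfill rows that have no family yet."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(events)")]
    if "device_family" not in columns:
        conn.execute("ALTER TABLE events ADD COLUMN device_family TEXT")
    # SQLite's LIKE is case-insensitive for ASCII, so '.idfw' also matches.
    conn.execute(
        """
        UPDATE events
           SET device_family = CASE
                   WHEN blastware_filename LIKE '%.IDFH'
                     OR blastware_filename LIKE '%.IDFW' THEN 'series4'
                   ELSE 'series3'
               END
         WHERE device_family IS NULL
        """
    )


def upsert_event(conn, serial, timestamp, filename, device_family=None) -> None:
    """Insert a row; on a duplicate (serial, timestamp) keep the stored family
    unless the caller supplies one."""
    try:
        conn.execute(
            "INSERT INTO events (serial, timestamp, blastware_filename, device_family)"
            " VALUES (?, ?, ?, ?)",
            (serial, timestamp, filename, device_family),
        )
    except sqlite3.IntegrityError:
        conn.execute(
            "UPDATE events"
            "   SET blastware_filename = ?,"
            "       device_family = COALESCE(?, device_family)"
            " WHERE serial = ? AND timestamp = ?",
            (filename, device_family, serial, timestamp),
        )
```

Because the backfill only touches rows whose family is still `NULL`, running the migration on every startup leaves already-classified rows alone.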
### Added — Phase 2: `micromate/` package alongside `minimateplus/`

- **`micromate/`** — new sibling package for the Thor / Micromate Series IV device. Currently scoped to offline-file ingest; live-device support (TCP transport, framing, protocol, client) will land here when reverse-engineering happens.
  - `micromate/idf_ascii_report.py` — moved from `sfm/idf_ascii_report.py`. No behaviour change.
  - `micromate/models.py` — typed `IdfReport`, `IdfEvent`, `IdfPeaks`, `IdfProjectInfo`, `IdfSensorCheck`. Stores mic in native `mic_pspl_dbl` (dB(L)) instead of the pseudo-psi shoehorn that the BW-shaped model uses. `IdfEvent.from_report()` constructs from a parsed dict + filename; `IdfEvent.to_minimateplus_event(waveform_key)` bridges to the existing sidecar / DB-insert machinery.
  - `micromate/idf_file.py` — placeholder for the binary codec (`.IDFH` / `.IDFW`). Stubbed `read_idf_file()` raises `NotImplementedError`; documents the planned reverse-engineering path.
- **`WaveformStore.save_imported_idf`** refactored to use the native `IdfEvent` and bridge at the SQL-insert boundary. Cleaner separation of "parse a Thor event" (in `micromate/`) from "store it on disk + write a sidecar" (in `sfm/waveform_store.py`).
- **Tests** — `tests/test_idf_ascii_report.py` imports updated to `micromate.idf_ascii_report`. All 1,014 example-data sidecars round-trip through `IdfEvent.from_report()` without errors.

### Companion releases

- **thor-watcher** unaffected — it talks to the relay over HTTP only. No version bump needed.
- **terra-view** unaffected today; can use `device_family` in its event-detail rendering when convenient.
---

## v0.18.0 — 2026-05-19

The "Thor / Series IV ingest adapter" release. Seismo-relay can now accept event files from Instantel Micromate Series IV (Thor) units alongside the existing MiniMate Plus (Series III) Blastware pipeline.

### Added — Thor (Series IV) IDF ingest

- **`POST /db/import/idf_file`** (`sfm/server.py`) — multipart upload endpoint for `.IDFH` (histogram) and `.IDFW` (waveform) event files plus their `.IDFH.txt` / `.IDFW.txt` ASCII sidecars. Mirrors the shape of `/db/import/blastware_file`: pairing by filename, optional `serial` query hint, per-file outcome reporting.
- **`sfm/idf_ascii_report.py`** — parser for Thor's TXT sidecars (verified against 1,014 real-world samples). Extracts device-authoritative PPV, ZC Freq, Peak Vector Sum, Mic PSPL, calibration date, firmware version, sensor self-check results, and project/client/operator strings.
- **`WaveformStore.save_imported_idf()`** (`sfm/waveform_store.py`) — stores Thor binaries verbatim in `<root>/<serial>/<filename>`, writes a `.sfm.json` sidecar with `source.kind = "idf-import"` and the full parsed report under `extensions.idf_report`. Reuses the existing `events` table — Thor events dedupe on (serial, timestamp) and surface in `/db/events` alongside BW events.
- **`tests/test_idf_ascii_report.py`** — parser tests against the `thor-watcher/example-data/` corpus.

### Changed

- `event_to_sidecar_dict()` (`minimateplus/event_file_io.py`) allow-list for `source_kind` now includes `"idf-import"` so the existing sidecar machinery can carry Thor imports.
- Bumped `pyproject.toml` version to `0.18.0`.

### Companion release

This release ships alongside **thor-watcher v0.3.0**, which adds the SFM forwarder that targets the new `/db/import/idf_file` endpoint. Operators flip the switch in thor-watcher's new "SFM Forward" Settings tab; events POST to seismo-relay just like the series3-watcher BW forwarder does today.
---

## v0.17.0 — 2026-05-17

The "field rescue + DB management" release. Hardened against units that are stuck in a runaway call-home loop, and added an operator-facing path for purging bogus events that those same units dump into the DB before recovery. All work in this release was driven by the BE9558H incident (full incident log + recovery procedure at `docs/runbooks/wedged_unit_recovery.md`).

### Added — wedged-unit recovery toolkit

A toolkit for breaking the call-home loop on a misbehaving unit whose firmware is too busy to keep up with normal request/response handshakes. Tested in production against BE9558H (16 May 2026) — a unit with a stuck-triggered Long-axis geophone that had been call-homing the office BW ACH server every 30 seconds for hours. Endpoints layered from "single attempt" to "siege mode" to suit different contention levels:

- **`GET /device/events/storage_range`** — SUB 0x06 probe. POLL + one read; ~2s. Returns first/last event keys and an `is_empty` flag. Use to triage whether a unit has stored events without invoking the slow `count_events()` 1E/1F chain (which choked on BE9558H's corrupted event chain).
- **`GET /device/events/index`** — SUB 0x08 probe. POLL + one read; ~2s. Returns the lifetime event counter (does NOT decrement on erase — use `storage_range` for "right now" state).
- **`POST /device/events/erase`** — full erase sequence `0xA3 → 0x1C → 0x06 → 0xA2` (confirmed 2026-04-11, see the protocol reference). Resets event keys to `0x01110000`. Caller's responsibility to disable ACH first if the underlying trigger condition will re-fill the buffer.
- **`POST /device/rescue`** — one TCP session, short connect+recv timeouts: POLL → disable ACH (compliance config write) → erase events → close. Designed for race-loop usage when the device is busy in another session. 503 on connect-refused, 502 on protocol failure, 200 on full sequence success.
- **`POST /device/stop_monitoring_blind`** — fire-and-forget Stop Monitoring (SUB 0x97), TCP-only. Dumps `SESSION_RESET + POLL_PROBE + SESSION_RESET + POLL_DATA + 0x97 × repeat` and closes without reading any S3 response. The full POLL preamble is required — write commands without it are silently ignored by the device's protocol parser (false-positive surface area that bit the first version of this endpoint). Use when the device's firmware can't keep up with full request/response but might process inbound bytes at its own pace.
- **`POST /device/stop_monitoring_spam`** — server-side hammer loop, duration-bounded. Open TCP → write the same blind payload → close → repeat as fast as possible until `duration_s` elapses. Configurable `connect_timeout` (default 500ms) and `repeat` (frames per session). Reports `sent_ok`, `connect_failed`, `write_failed`, `rate_attempts_per_s`. Clamped to 5min duration.
- **`POST /device/stop_monitoring_slow_drip`** — opposite of spam. Open ONE TCP session, drip the wake handshake + stop frames at `interval_s` (default 3s) for `duration_s` (default 120s, max 10min). Each drip is ~23 bytes — well under any UART FIFO size. Opportunistically drains any inbound bytes the device sends back; `bytes_received > 0` in the response strongly suggests the device has started talking and the session is healthy. **This is the endpoint that saved BE9558H.** Spam mode had been overrunning the device's UART FIFO; slow drip stayed under it.
- **Six rescue scripts** under `scripts/` — thin bash wrappers around the endpoints, default `SFM_BASE_URL=http://localhost:8200` (direct, not via Terra-View proxy whose 60s timeout would cut off the longer endpoints):
  - `rescue_device.sh` — race-loop wrapper for `/device/rescue`
  - `blind_stop.sh` — race-loop wrapper for `/device/stop_monitoring_blind`
  - `spam_stop.sh` — single-call burst hammer
  - `slow_drip.sh` — single-call held-session drip
  - `watch_unit.sh` — passive periodic reachability check (every N min, logs to file), useful for unattended overnight monitoring of a wedged unit
- **`docs/runbooks/wedged_unit_recovery.md`** — symptoms, quick-reference recovery procedure, the modem-layer mechanism (Sierra Wireless serial-port mode-flipping is the real failure mode — not the device firmware), and a table of "why simpler approaches don't work" so the next incident skips the dead ends.
### Added — operator event DB management

Endpoints powering Terra-View's new `/admin/events` page (v0.12.0). Designed for purging bogus events from a unit that's been forwarding them in bulk (e.g. a stuck-triggered seismograph dumping hundreds of junk events before it's recovered).

- **`DELETE /db/events/{event_id}`** — hard-delete one event row. Also unlinks the associated blastware binary (`.AB0*`), `.a5.pkl`, `.sfm.json` sidecar, and `.h5` clean-waveform files via the WaveformStore. Returns the per-file removal status. 404 if the event doesn't exist.
- **`POST /db/events/delete_bulk`** — filter-based or id-list-based bulk delete with safety rails:
  - Filters (`serial`, `from_dt`, `to_dt`, `false_trigger`) combine with AND; same semantics as `GET /db/events`. `ids` is an additional inclusion list. Refuses to run with no filters (would wipe the whole table — raises 422).
  - `confirm` must be `true` to actually delete. Otherwise returns a dry-run summary (`status: "dry_run"`, `matched: N`, `sample_serials: [...]`).
  - `max_rows` (default 10,000) caps how many rows can be deleted by-filter in one call. If exceeded, returns `status: "too_many"` with a hint to narrow or raise the cap. Bypassed when only `ids` is supplied.
- **`_cleanup_event_files(row)`** helper in `sfm/server.py` — best-effort `unlink()` of all four sidecar paths derived from the row's `blastware_filename`. Logged at WARN if a path exists but unlink fails; the DB row deletion still proceeds.
- **`SeismoDb.delete_event(id)` and `SeismoDb.delete_events_bulk(...)`** in `sfm/database.py` — both return the deleted row dict(s) so callers can do file cleanup. `delete_events_bulk` raises `ValueError` if no filters are supplied.
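The safety rails on `POST /db/events/delete_bulk` can be sketched as a pure function over in-memory rows. Several details here are assumptions that the entry above does not settle: `ids` is treated as a further AND restriction, timestamps are compared as ISO strings with inclusive bounds, the cap is checked before the dry-run branch, and the `"deleted"` status and the function name are hypothetical.

```python
def plan_bulk_delete(rows, *, serial=None, from_dt=None, to_dt=None,
                     false_trigger=None, ids=None, confirm=False, max_rows=10_000):
    """Decide what a bulk delete would do, without touching any storage."""
    filters_given = any(v is not None for v in (serial, from_dt, to_dt, false_trigger))
    if not filters_given and not ids:
        # Mirrors the "refuses to run with no filters" rail.
        raise ValueError("refusing to delete with no filters")

    def matches(row):
        if serial is not None and row["serial"] != serial:
            return False
        if from_dt is not None and row["timestamp"] < from_dt:
            return False
        if to_dt is not None and row["timestamp"] > to_dt:
            return False
        if false_trigger is not None and bool(row["false_trigger"]) != false_trigger:
            return False
        if ids and row["id"] not in ids:
            return False
        return True

    matched = [row for row in rows if matches(row)]
    # The cap only applies to by-filter deletes; an ids-only call bypasses it.
    if filters_given and len(matched) > max_rows:
        return {"status": "too_many", "matched": len(matched)}
    if not confirm:
        serials = sorted({row["serial"] for row in matched})
        return {"status": "dry_run", "matched": len(matched), "sample_serials": serials[:5]}
    return {"status": "deleted", "deleted_ids": [row["id"] for row in matched]}
```

Returning the matched ids rather than deleting inside the function keeps the file cleanup step separate, in the same spirit as the database helpers returning deleted rows to their callers.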
### Changed

- **Default protocol recv timeout dropped from 30s → 10s** in `_build_client()`. The unit usually responds in well under a second over cellular; 10s leaves comfortable headroom for retransmits while failing reasonably fast when a unit is wedged. The two endpoints that perform full 5A waveform downloads still pass `timeout=120.0` explicitly so multi-minute event transfers are unaffected.
- **`_build_client()` now accepts an optional `connect_timeout`** (TCP-only) so rescue / race-loop endpoints can fail fast on busy modems without affecting the protocol-level recv timeout.

### Fixed

- **`GET /device/monitor/status` returned HTTP 500 + uncaught traceback when the device was unresponsive**. The retry-on-`Exception` inner block let the second `client.poll()`'s `ProtocolError` propagate out of the handler. Now wrapped in proper try/except — returns 502 with `{"detail": "Protocol error: No S3 frame received within 10.0s ..."}` on timeout, 502 on connection errors, 500 only for genuinely unexpected exceptions.

### Migration

No schema changes. No data migration required.

If you've been running a previous version against a wedged unit and accumulated bogus events, the new `/admin/events` page in Terra-View v0.12.0 (or direct `POST /db/events/delete_bulk` with `confirm: true`) is the cleanup tool. Watcher state on the upstream DL2 PC does NOT need separate cleaning — the watcher's `sfm_forwarded.json` keys on file sha256 and won't re-forward the same files.

### Pairing

This release pairs with **Terra-View v0.12.0**, which adds the `/admin/events` UI that consumes the new bulk-delete endpoints, the bulk false-trigger flagging on `/unit/{id}`, and the field-deployment workflow that uses the same `series3-watcher` → SFM ingest path as before.
---

## v0.16.1 — 2026-05-14

### Fixed

- **`record_type` always "Waveform" for forwarded events.** `read_blastware_file()` hardcoded `ev.record_type = "Waveform"` regardless of the file's actual type. The watcher-forward pipeline (the main BW ACH ingest path) compounds this by parsing files from a tmp path with a `.bw` suffix, so even a filename-based fallback inside the parser still wouldn't see the original extension. Now:

  1. New `derive_record_type_from_filename(filename)` helper in `minimateplus/event_file_io.py` derives the type from the LAST character of the filename's extension (V10.72+ AB0T scheme: `H`=Histogram, `W`=Waveform, `M`=Manual, `E`=Event, `C`=Combo). Falls back to `"Waveform"` for old S338 firmware (3-char extensions ending in `0`) and any unrecognized suffix.
  2. `read_blastware_file()` now calls the helper with its `path.name` so direct callers (the `--dry-run` path in `scripts/import_bw.py`, tests, ad-hoc scripts) get the right value automatically.
  3. `WaveformStore.save_imported_bw()` overrides `ev.record_type` with the **original** filename's derived type after parsing (the tmp file inside the parser doesn't carry the original extension). This is the path the live watcher-forwarder hits, so the DB column now reflects the actual event type going forward.

  Events ingested before this fix are stuck with `record_type="Waveform"` in the DB; a one-off backfill (`UPDATE events SET record_type = ... WHERE blastware_filename LIKE '%H'`) would fix them retroactively if desired. Terra-view's event modal also derives client-side from the filename, so the UI already shows the correct type for old events even without the backfill.
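The filename-based derivation described in item 1 above can be sketched as a small lookup on the last character of the extension. The function below is a hypothetical stand-in, not the helper shipped in `minimateplus/event_file_io.py`; the case-insensitive match and the 4-character demo filename are assumptions.

```python
from pathlib import Path

# Last extension character to record type, per the V10.72+ scheme in the changelog.
_SUFFIX_TYPES = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def record_type_from_filename(filename: str) -> str:
    """Map the last character of the extension to a record type, defaulting to Waveform."""
    ext = Path(filename).suffix  # includes the leading dot, or "" if there is none
    if not ext:
        return "Waveform"
    return _SUFFIX_TYPES.get(ext[-1].upper(), "Waveform")


print(record_type_from_filename("M529LK44.AB0H"))  # hypothetical V10.72+ name
print(record_type_from_filename("M529LK44.AB0"))   # 3-char extension ending in 0
```

A tmp path with a `.bw` suffix would resolve to `"Waveform"` here, which is why the entry above has the store override the type from the original filename.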
---

## v0.16.0 — 2026-05-11

The "BW ACH ingestion" release. When paired with **series3-watcher v1.5.0**, every Blastware ACH event (binary + `_ASCII.TXT` report) lands in SeismoDb with device-authoritative peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data — without depending on the still-undecoded waveform body codec. This is the end-to-end product win discussed in v0.15.0's "out of scope" notes: sortable / filterable monthly-summary review of historical events, populated from the BW ASCII export rather than re-decoded samples.

### Added — `/db/import/blastware_file` rich-metadata ingestion

- **Paired BW ASCII reports.** The endpoint now accepts the `<binary>_<ext>_ASCII.TXT` partner BW writes alongside each event. Pairing handles both filename conventions: ACH (`M529LK44_AB0_ASCII.TXT`) and manual-export (`M529LK44.AB0.TXT`). When both are present, ACH wins.
- **`minimateplus/bw_ascii_report.py`** (new) — parser + `BwAsciiReport` dataclass for BW's per-event ASCII export. Handles every field BW writes: identity, trigger config, per-channel PPV / ZC Freq / Time of Peak / Peak Acceleration / Peak Displacement, Peak Vector Sum + time, MicL PSPL / Time of Peak / ZC Freq, sensor self-check (Test Freq / Test Ratio / Test Amplitude / Pass-Fail per channel), monitor log, PC SW version.
- **Position-based user-notes parsing.** BW's Compliance Setup → Notes tab labels (Project / Client / User Name / Seis Loc) are *operator-editable* — an operator can rename them to "Building:", "Site Address:", etc. Rather than maintain a label-spelling map, the parser uses positional matching between the `Units :` and `Geo Range :` anchors in the ASCII output. The four canonical slots (project / client / operator / sensor_location) populate by position regardless of label; the original labels BW wrote are preserved in `report.user_note_labels` for downstream UIs (terra-view) to display verbatim.
- **`bw_report` sidecar block.** New top-level block in `.sfm.json` carrying the parsed BW report (trigger config, peaks with per-channel stats, mic block, sensor_check, monitor_log, PC SW version, user-note labels).
- **`apply_report_to_event(event, report)` helper.** Overlays the report's device-authoritative fields onto an in-memory `Event` so `SeismoDb.insert_events()` writes correct DB columns instead of the broken-codec values from `_peaks_from_samples()`.

### Fixed — compounding bugs that left forwarded events with garbage data

- **Import endpoint inserted under `serial="UNKNOWN"`.** `_serial_from_event(ev)` was a stub that always returned `None`; the BW-filename-decoded serial that `WaveformStore` had already resolved was never surfaced to `db.insert_events`. Now uses `rec["serial"]` as the authoritative source. `scripts/repair_unknown_serials.py` repairs existing DB rows.
- **`/db/units` ignored events from non-ACH ingest paths.** `query_units()` only aggregated from `ach_sessions` — events that arrived via `save_imported_bw()` were never visible in the fleet overview even though they populated `events` correctly. Now unions both tables.
- **Re-imports left stale DB rows.** The `IntegrityError` handler in `insert_events()` only refreshed filename / sidecar columns when a duplicate `(serial, timestamp)` arrived. Peak values, project info, sample_rate, record_type stayed locked at whatever the first (often broken-codec) insert wrote. Now the upsert path refreshes every device-authoritative column from the new data while preserving `false_trigger` and immutable fields (`id`, `created_at`).
- **Server-side TXT pairing only knew the legacy convention.** The endpoint stripped `.TXT` and looked up `<binary>` — which works for manual exports (`<binary>.TXT`) but not BW ACH (`<stem>_<ext>_ASCII.TXT`). Reports were arriving in the multipart but silently dropped. Now recognises both conventions and registers each report under all matching binary names.
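The two report-naming conventions from this release can be resolved back to a binary filename with a few string operations. A minimal sketch, assuming one binary name per report and a case-insensitive suffix match; both function names are hypothetical, and the server's actual pairing code may register more name variants.

```python
def binary_name_for_report(report_name: str):
    """Return (binary_name, convention) for a BW ASCII report filename, or None."""
    upper = report_name.upper()
    if upper.endswith("_ASCII.TXT"):
        # ACH convention: <stem>_<ext>_ASCII.TXT pairs with <stem>.<ext>
        stem, sep, ext = report_name[: -len("_ASCII.TXT")].rpartition("_")
        if sep and stem and ext:
            return f"{stem}.{ext}", "ach"
        return None
    if upper.endswith(".TXT"):
        # Manual-export convention: <binary>.TXT pairs with <binary>
        return report_name[: -len(".TXT")], "manual"
    return None


def pair_reports(report_names):
    """Map each binary name to one report, preferring the ACH report when both exist."""
    paired = {}
    for name in report_names:
        resolved = binary_name_for_report(name)
        if resolved is None:
            continue
        binary, convention = resolved
        current = paired.get(binary)
        if current is None or (convention == "ach" and current[1] != "ach"):
            paired[binary] = (name, convention)
    return {binary: name for binary, (name, _) in paired.items()}


print(pair_reports(["M529LK44.AB0.TXT", "M529LK44_AB0_ASCII.TXT"]))
```

Checking the `_ASCII.TXT` suffix first matters, because an ACH report name also ends in `.TXT` and would otherwise be treated as a manual export.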
### Migration

For existing deployments where events were forwarded by an older watcher (broken pairing) or imported during the UNKNOWN-bucketing window:

1. `python -m scripts.repair_unknown_serials --db <path> --apply` to re-attribute `serial="UNKNOWN"` rows.
2. Delete the watcher's `sfm_forwarded.json` state file and let it re-forward. The server's upsert path will refresh the existing DB rows with the report's authoritative values.
3. Operator review state (`false_trigger`, sidecar `review` block) is preserved across the re-import.
## v0.15.0 — 2026-05-07

### Added

- **Layered event storage architecture.** Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file). Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression). This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side. The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor. This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive). Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
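The "~38% too low" figure in the viewer accuracy fix follows directly from the two scale factors. A minimal worked sketch, assuming a linear mapping in which 32767 counts corresponds to geophone full scale; the function names and the `"normal"` / `"sensitive"` keys are hypothetical.

```python
ADC_FULL_SCALE_COUNTS = 32767

# Full-scale velocity per geophone range, as stated in the changelog entry.
GEO_FULL_SCALE_IN_S = {"normal": 10.0, "sensitive": 1.25}

# geoAdcScale: in/s per V, the constant the old client-side viewer misused.
GEO_ADC_SCALE = 6.206053


def counts_to_in_per_s(counts: float, geo_range: str) -> float:
    """Convert ADC counts to velocity using the range's full-scale value."""
    return counts / ADC_FULL_SCALE_COUNTS * GEO_FULL_SCALE_IN_S[geo_range]


def old_viewer_in_per_s(counts: float) -> float:
    """The old scaling, kept only to show the size of the error."""
    return counts * GEO_ADC_SCALE / ADC_FULL_SCALE_COUNTS


correct = counts_to_in_per_s(32767, "normal")
old = old_viewer_in_per_s(32767)
shortfall = 1 - old / correct
print(f"old={old:.3f} in/s, correct={correct:.1f} in/s, shortfall={shortfall:.1%}")
```

For a full-scale Normal-range sample the old factor yields about 6.206 in/s instead of 10.0 in/s, a shortfall of roughly 37.9%.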
### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).
---

## v0.14.3 — 2026-05-05

### Fixed

- **`build_5a_frame` — DLE-stuffing rule for 0x10 bytes in params (the
  long-standing >1-sec event 0 "won't open in BW" bug).**

  Previously `build_5a_frame` wrote params bytes RAW with no DLE stuffing,
  based on the incorrect assumption that the device handled all `0x10`
  bytes in params literally. It does not. The device's actual de-stuffing
  rule for the params region is:

  - `10 10` → de-stuffs to `10`
  - `10 02/03/04` → kept literal (inner-frame markers)
  - `10 X` for other X → de-stuffs to just `X` (drops the `0x10`)

  When the counter passed in params has `0x10` in the high byte (e.g.
  counter=`0x1000` produces params bytes `... 10 00 ...`), the device
  silently corrupts the request to counter=`0x__00` and responds with
  whatever lives at that wrong address. For counter=0x1000 the wrong
  address was 0x0000, so the response was a copy of the file header +
  STRT record. That STRT block then got embedded in the assembled body
  at file offset `0x1016`, and Blastware refused to open the file
  (interprets the second STRT as a malformed multi-event file).

  This explains the entire >1-sec event-0 failure pattern:

  - 1-sec events have `end_offset < 0x1000`, so the chunk walk never
    requests counter `0x10__` and the bug never triggers.
  - 2-sec / 3-sec / longer events all need a chunk at counter `0x1000`
    (and longer events also need `0x1200`, `0x1400`, etc., none of which
    have `0x10` in the high byte except `0x1000`). Just one corrupted
    response is enough to embed STRT in the body and break the file.

  Verified against BW 5-1-26 "copy 3sec" capture: all 17 5A request
  frames (probe + 2 metadata pages + 13 sample chunks + TERM) now match
  BW's wire output **byte-for-byte**, including the doubled `10 10 00`
  for counter=0x1000.
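The de-stuffing rule listed in the `build_5a_frame` entry above, and an encoder that survives it, can be sketched in a few lines. `device_destuff` follows the three cases as written. `stuff_params` is an assumption about the encoder side, inferred only from the two facts in the entry (the doubled `10 10 00` for counter `0x1000`, and the undoubled `10 02` / `10 04` pairs); neither function is the project's real implementation.

```python
DLE = 0x10
LITERAL_FOLLOWERS = (0x02, 0x03, 0x04)  # inner-frame markers kept as literal pairs


def device_destuff(wire: bytes) -> bytes:
    """Apply the device's de-stuffing rule for the params region."""
    out = bytearray()
    i = 0
    while i < len(wire):
        byte = wire[i]
        if byte == DLE and i + 1 < len(wire):
            nxt = wire[i + 1]
            if nxt == DLE:
                out.append(DLE)              # 10 10 -> 10
            elif nxt in LITERAL_FOLLOWERS:
                out += bytes((DLE, nxt))     # 10 02/03/04 kept literal
            else:
                out.append(nxt)              # 10 X -> X (the 0x10 is dropped)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


def stuff_params(raw: bytes) -> bytes:
    """Double each 0x10 unless the next raw byte makes a literal pair (assumed rule)."""
    out = bytearray()
    for i, byte in enumerate(raw):
        out.append(byte)
        if byte == DLE:
            nxt = raw[i + 1] if i + 1 < len(raw) else None
            if nxt not in LITERAL_FOLLOWERS:
                out.append(DLE)
    return bytes(out)


raw_counter = (0x1000).to_bytes(2, "big")
print(device_destuff(raw_counter).hex())                # unstuffed request loses the 0x10
print(stuff_params(raw_counter).hex())                  # 101000 on the wire
print(device_destuff(stuff_params(raw_counter)).hex())  # device recovers 1000
```

The first print reproduces the failure mode: an unstuffed `10 00` reaches the device as a lone `00`, so the counter no longer addresses `0x1000`.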
### Notes

- `0x10` bytes in `offset_hi` (the standalone offset field at body[5])
  are still written RAW — confirmed correct per the 1-2-26 capture.
- BW's actual encoding of `10 02` / `10 04` for meta pages 0x1002 /
  0x1004 is *not* doubled — it relies on the device keeping `10 02`
  and `10 04` as literal pairs. This is preserved by the fix.
---

## v0.14.2 — 2026-05-04

### Fixed

- **`blastware_file.py` — removed harmful "duplicate header+STRT" strip.**
  The v0.13.x strip logic was matching the byte sequence `00 12 03 00 STRT`
  in legitimate waveform data — sample chunks at counter `0x1000` and
  beyond often contain those bytes coincidentally — and zeroing 25 bytes
  of valid samples per match. This is why event 0 (event-1 case in the
  protocol) downloads of >1-sec recordings always failed in BW: the strip
  destroyed real data at body offset `0x1012..0x102B` and propagated
  alignment differences through the rest of the body. Sub-1-sec events
  worked because their `end_offset` was below `0x1002`, so no sample
  chunks landed in the metadata-page region and the strip's needle never
  matched. Verified fix by re-feeding the BW 5-1-26 "copy 3sec" capture's
  A5 frames into the file builder: output is now byte-identical to BW's
  saved `M529LKIQ.G10` reference (8708 bytes, 0 differences).
- BW already concatenates frame contributions in stream order without
  any de-duplication; SFM now does the same.
---
|
||||
|
||||
## v0.14.0 — 2026-05-02
|
||||
|
||||
### Changed (major rewrite)
|
||||
|
||||
- **`read_bulk_waveform_stream` — STRT-bounded chunk walk.** Replaces the
|
||||
earlier `0x0400`-step / `max(key4[2:4], 0x0400)` chunk-counter formula,
|
||||
which over-read ~5× past the actual event end into post-event circular-
|
||||
buffer garbage. The new walk:
|
||||
|
||||
1. Probe at `counter = start_offset` (event 1: `0x0000`; event N:
|
||||
`cur_key[2:4]`).
|
||||
2. Parse `end_offset` from the STRT record at `data[17]` of the probe
|
||||
response (`end_key[2:4]` field).
|
||||
3. For event 1 only, read the two fixed metadata pages at counter
|
||||
`0x1002` and `0x1004` — these contain the global session-start
|
||||
compliance setup (Project / Client / User Name / Seis Loc /
|
||||
Extended Notes ASCII strings). Continuation events skip these
|
||||
(BW caches them across the session).
|
||||
4. Walk sample chunks at **`0x0200` increments (NOT `0x0400`)**, bounded
|
||||
by `end_offset` — the loop exits when
|
||||
`next_chunk_counter + 0x0200 > end_offset`.
|
||||
5. Send the proper TERM frame (see new `bulk_waveform_term_v2()`) with
|
||||
`offset_word = end_offset - next_boundary` and
|
||||
`params[2:4] = next_boundary BE`. The TERM response carries the
|
||||
partial last chunk + 26-byte file footer.
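
The five steps above can be sketched as a pure planning function that only computes counters and performs no device I/O. This is an illustrative sketch, not the code in `read_bulk_waveform_stream`: `plan_walk` and `WalkPlan` are hypothetical names, and the `0x0600` first-sample counter for event 1 is taken from the CLAUDE.md description of the event 1 case.

```python
from dataclasses import dataclass

CHUNK_STEP = 0x0200                 # sample-chunk increment from step 4
METADATA_PAGES = (0x1002, 0x1004)   # fixed pages, read for event 1 only
EVENT1_FIRST_SAMPLE = 0x0600        # assumption: first sample chunk for event 1


@dataclass
class WalkPlan:
    probe: int             # counter used for the probe request
    metadata: tuple        # metadata-page counters (empty for event N)
    samples: list          # sample-chunk counters, in request order
    next_boundary: int     # first counter that would step past end_offset
    term_offset_word: int  # end_offset - next_boundary, sent in TERM


def plan_walk(start_offset: int, end_offset: int) -> WalkPlan:
    """Plan the request counters for one event (hypothetical helper)."""
    first_event = start_offset == 0x0000
    metadata = METADATA_PAGES if first_event else ()
    # For event N the probe itself covers the first 0x0200 bytes.
    counter = EVENT1_FIRST_SAMPLE if first_event else start_offset + CHUNK_STEP
    samples = []
    # Exit as soon as the next full chunk would run past end_offset.
    while counter + CHUNK_STEP <= end_offset:
        samples.append(counter)
        counter += CHUNK_STEP
    return WalkPlan(start_offset, metadata, samples, counter, end_offset - counter)


plan = plan_walk(0x2238, 0x417E)
print(hex(plan.next_boundary), hex(plan.term_offset_word))
```

For the event-2 capture values quoted in this changelog (`start=0x2238`, `end_offset=0x417E`) the sketch gives `offset_word=0x0146`, and for the 3-sec event (`end_offset=0x21F2`) it plans 13 sample chunks, which matches the 17-frame total of probe + 2 metadata pages + 13 sample chunks + TERM.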
|
||||
|
||||
- **New helpers:** `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
|
||||
and `parse_strt_end_offset(a5_data)` in `minimateplus.framing`.
|
||||
|
||||
- **`stop_after_metadata` / `extra_chunks_after_metadata` kwargs are now
|
||||
no-ops** under the v0.14.x walk. They are retained on the
|
||||
`read_bulk_waveform_stream` signature for backward compatibility but log a
|
||||
DEBUG line when set. The old "scan for `b'Project:'` and stop one chunk
|
||||
later" workaround is obsolete — the loop is deterministically bounded by
|
||||
the STRT-derived `end_offset`.
|
||||
|
||||
- **Project / Client / User Name / Seis Loc string source corrected.**
|
||||
These come from the dedicated metadata pages at counter `0x1002` /
|
||||
`0x1004`, not from "A5 frame 7" of the sample-chunk stream. The
|
||||
earlier "A5 frame 7" claim was an artifact of the broken `0x0400`-step
|
||||
walk where the bad counter formula coincidentally landed sample-chunk
|
||||
fi=7 on top of the 0x1002 metadata page.
|
||||
|
||||
### Verified
|
||||
|
||||
- Three independent BW MITM captures (4-27-26 + 5-1-26 + 5-4-26) confirm
|
||||
the new walk matches BW's behaviour event-for-event.
|
||||
- `end_offset` values verified across 3 events: `0x1ABE` (4-27-26 2-sec),
|
||||
`0x21F2` (5-1-26 3-sec), `0x417E` (5-1-26 event-2).
|
||||
|
||||
### Notes
|
||||
|
||||
- Earlier v0.13.0 / v0.13.1 / v0.13.2 entries describe partial steps along
|
||||
the way (some of the file builder fixes, filename bugs, etc.) that were
|
||||
superseded by the full rewrite. Treat this v0.14.0 entry as the
|
||||
definitive landing point for the corrected SUB 5A protocol.
|
||||
|
||||
---
|
||||
|
||||
## v0.14.1 — 2026-05-04
|
||||
|
||||
### Fixed
|
||||
|
||||
- **`read_bulk_waveform_stream` — event-N probe counter off-by-`0x46`.**
|
||||
Continuation events (start_key[2:4] != 0) were being probed at counter
|
||||
`start_offset + 0x0046` instead of just `start_offset`. In the iteration
|
||||
walk, `cur_key` from 1F is already the off=0x46 WAVEHDR record key, so the
|
||||
earlier formula effectively double-counted the WAVEHDR offset. The probe
|
||||
landed one WAVEHDR past the actual event start, the response no longer
|
||||
contained the STRT record at byte 17, `parse_strt_end_offset` returned
|
||||
`None`, and the chunk loop fell back to the `max_chunks=128` cap — walking
|
||||
~110 chunks of post-event circular-buffer garbage. Verified against the
|
||||
5-1-26 "copy 2nd address" and 5-4-26 BW 2-sec event captures: BW probes
|
||||
counter=`0x2238` with key=`01112238` and STRT is present at byte 17 of
|
||||
the response (end_offset=`0x417E`).
|
||||
- **CLAUDE.md / docs/instantel_protocol_reference.md** — corrected the
|
||||
event-N section to clarify that `start_key` in those formulas is the
|
||||
off=0x46 key, not the off=0x2C boundary key, and removed the spurious
|
||||
`+0x46` from the chunk-walk pseudocode.
|
||||
|
||||
---
|
||||
|
||||
## v0.13.2 — 2026-05-01
|
||||
|
||||
### Fixed
|
||||
|
||||
- **`_extract_record_type` — third 0C-record header format ("short", 8 bytes).**
|
||||
A live SFM download against BE11529 produced files named `M5290000.000`
|
||||
(zero-stamped) because the 0C waveform record's first bytes were
|
||||
`01 05 07 ea ...` — neither the 9-byte single-shot layout (`0x10` at byte 1)
|
||||
nor the 10-byte continuous layout (`0x10` at bytes 0 and 2). Investigation
|
||||
showed this is a third format observed in the wild: an 8-byte header with no
|
||||
marker bytes at all (`[day][month][year_BE:2][unknown][hour][min][sec]`).
|
||||
The detection logic now scans the year (uint16 BE) at byte 2 / byte 3 / byte
|
||||
4 and picks whichever offset returns a sensible year (2015–2050) — each
|
||||
format has the year at a unique position so this disambiguates cleanly.
|
||||
- New format → `event.record_type = "Waveform (Short)"`,
|
||||
`Timestamp.from_short_record()`.
|
||||
- Existing single-shot and continuous parsers unchanged.
|
||||
- The user's event from May 1, 2026 13:21:37 now correctly resolves to a
|
||||
filename like `M529LKIQ.G10` instead of `M5290000.000`.
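
The year-position scan described above can be sketched as follows. This is a minimal illustration, not the project's `_detect_record_format`: the function names are hypothetical, only the short-format year offset (byte 2) is stated in this entry, and the mapping of byte 3 to single-shot and byte 4 to continuous is an assumption inferred from the marker positions. The header bytes after `01 05 07 ea` in the usage line are made up to match the 13:21:37 timestamp.

```python
from datetime import datetime
from typing import Optional

# (year offset, format name), scanned in the order byte 2 / byte 3 / byte 4.
# Only the "short" offset is stated in the changelog; the other two are assumed.
YEAR_OFFSETS = ((2, "short"), (3, "single_shot"), (4, "continuous"))
YEAR_MIN, YEAR_MAX = 2015, 2050


def detect_record_format(data: bytes) -> Optional[str]:
    """Return the first format whose year field holds a sensible year."""
    for offset, name in YEAR_OFFSETS:
        if len(data) < offset + 2:
            continue
        year = int.from_bytes(data[offset:offset + 2], "big")
        if YEAR_MIN <= year <= YEAR_MAX:
            return name
    return None


def decode_short_header(data: bytes) -> datetime:
    """Decode [day][month][year_BE:2][unknown][hour][min][sec]."""
    day, month = data[0], data[1]
    year = int.from_bytes(data[2:4], "big")
    hour, minute, second = data[5], data[6], data[7]  # data[4] is unknown
    return datetime(year, month, day, hour, minute, second)


header = bytes.fromhex("010507ea000d1525")  # trailing bytes are hypothetical
print(detect_record_format(header), decode_short_header(header))
```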
|
||||
|
||||
### Added
|
||||
|
||||
- `Timestamp.from_short_record(data)` — decodes the 8-byte header.
|
||||
- `_detect_record_format(data)` — internal helper returning
|
||||
`"single_shot" / "continuous" / "short" / None` via year-position scan.
|
||||
|
||||
---
|
||||
|
||||
## v0.13.1 — 2026-05-01
|
||||
|
||||
### Fixed
|
||||
|
||||
- **`_extract_record_type` — Continuous-mode record headers misclassified as Unknown.**
|
||||
In single-shot mode the 0C waveform record's 9-byte header puts the sub_code
|
||||
marker `0x10` at byte 1, with the day at byte 0. In Continuous mode the
|
||||
header is 10 bytes with the marker at byte 0 *and* byte 2, and the day at
|
||||
byte 1. Previous logic only inspected byte 1 and treated any value other
|
||||
than `0x10` / `0x03` as `"Unknown"`, which prevented `event.timestamp` from
|
||||
being populated for any continuous-mode event whose day-of-month wasn't
|
||||
exactly 3 or 16. As a downstream effect, `blastware_filename()` saw
|
||||
`event.timestamp == None`, fell back to `stem="0000"` / `ab="00"`, and
|
||||
produced filenames like `M5290000.000`. Discovered from a live SFM run on
|
||||
BE11529 in continuous mode (day-of-month = 5).
|
||||
Now disambiguates by checking BOTH byte 0 and byte 2: if both are `0x10`,
|
||||
it's the 10-byte continuous header; else if byte 1 is `0x10`, it's the
|
||||
9-byte single-shot header. Day-of-month no longer matters.
|
||||
|
||||
*Superseded by v0.13.2 — the user's actual record uses a third 8-byte format
|
||||
with no `0x10` markers, which v0.13.1 still misclassified.*
|
||||
|
||||
---
|
||||
|
||||
## v0.13.0 — 2026-05-01
|
||||
|
||||
### Fixed
|
||||
|
||||
- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
|
||||
`read_bulk_waveform_stream` was walking the chunk counter past the actual
|
||||
end of the event, picking up post-event circular-buffer garbage that
|
||||
corrupted reconstructed Blastware files for any waveform > ~1 sec. The
|
||||
loop now extracts the event's `end_offset` from the STRT record at
|
||||
`data[23:27]` of the probe response and stops the chunk walk when the next
|
||||
counter would step past it. Verified against three BW MITM captures
|
||||
(4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
|
||||
bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.
|
||||
|
||||
### Added
|
||||
|
||||
- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)` —
|
||||
computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
|
||||
formula confirmed across all 3 BW captures. Not yet wired into
|
||||
`read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
|
||||
existing `blastware_file.write_blastware_file` frame-structure expectations);
|
||||
available for the next iteration that switches to BW's 0x0200 chunk step.
|
||||
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
|
||||
from the STRT record in an A5 response payload.
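
A rough sketch of the extraction, for orientation only. It assumes the STRT record begins with the ASCII marker `STRT` at byte 17 of the probe response and that the end key occupies `data[23:27]`, with `end_offset` being its last two bytes read big-endian. The name `parse_end_offset_sketch` is hypothetical and the real `parse_strt_end_offset` may validate more than this.

```python
from typing import Optional

STRT_OFFSET = 17              # STRT record position in the probe response
END_KEY_SLICE = slice(23, 27)  # 4-byte end key inside the STRT record


def parse_end_offset_sketch(a5_data: bytes) -> Optional[int]:
    """Return end_key[2:4] as a big-endian int, or None if STRT is absent."""
    marker = a5_data[STRT_OFFSET:STRT_OFFSET + 4]
    if len(a5_data) < END_KEY_SLICE.stop or marker != b"STRT":
        return None
    end_key = a5_data[END_KEY_SLICE]
    return int.from_bytes(end_key[2:4], "big")


# Synthetic payload: 17 filler bytes, marker, 2 filler bytes, then an end key.
payload = bytes(17) + b"STRT" + bytes(2) + bytes.fromhex("0111417e")
print(hex(parse_end_offset_sketch(payload)))
```

Returning `None` when the marker is missing lets the caller decide how to bound the walk instead of silently reading to a fixed cap.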
|
||||
|
||||
### Documentation
|
||||
|
||||
- **CLAUDE.md and `docs/instantel_protocol_reference.md` extensively
|
||||
rewritten** to reflect the corrected SUB 5A protocol. See:
|
||||
- CLAUDE.md "SUB 5A — chunk counter formula (REWRITTEN 2026-05-01)"
|
||||
- CLAUDE.md "SUB 5A — STRT record encodes end_offset"
|
||||
- CLAUDE.md "SUB 5A — TERM frame formula"
|
||||
- CLAUDE.md "SUB 5A — fixed metadata pages 0x1002 and 0x1004"
|
||||
- CLAUDE.md "SUB 0A — WAVEHDR response length distinguishes events from
|
||||
boundaries" (0x46 = real event, 0x2C = boundary marker)
|
||||
- protocol reference §7.8.5 / §7.8.6 / §7.8.7 / §7.8.8
|
||||
- The previous chunk-counter formula (`max(key4[2:4], 0x0400) + (chunk-1) *
|
||||
0x0400`) is now marked DEPRECATED and explicitly tagged WRONG with
|
||||
pointers to the new sections, so future work doesn't re-derive it.
|
||||
|
||||
### Known minor diffs vs Blastware (deferred to a follow-up)
|
||||
|
||||
- We still use the OLD 0x0400 chunk step rather than BW's 0x0200; switching
|
||||
also requires updating `blastware_file.write_blastware_file`'s skip values
|
||||
and "extra chunk after metadata" logic, which depends on a fresh capture
|
||||
to verify.
|
||||
- We still use the legacy fixed `offset_word=0x005A` TERM frame rather than
|
||||
BW's `end_offset - next_boundary` formula, for the same reason.
|
||||
- Two fixed metadata pages at counter `0x1002` and `0x1004` are not yet
|
||||
read explicitly; under the current 0x0400 walk their content is reachable
|
||||
via the sample chunk that covers buffer addresses `[0x1000, 0x1400)`.
|
||||
|
||||
---
|
||||
|
||||
## v0.12.6 — 2026-05-01
|
||||
|
||||
### Fixed
|
||||
|
||||
- **`blastware_file.py` — waveform frame classification** — A5 frame classification for
|
||||
waveform-only vs header-only frames now uses `frame.record_type` instead of frame index.
|
||||
Only waveform frames (0x46) are written to the file body; metadata frames are skipped.
|
||||
Fixes spurious data corruption from incorrectly classified frames.
|
||||
|
||||
- **`s3_analyzer.py` — A5/5A frame naming** — Bulk waveform stream frames (SUB 5A response)
|
||||
are now correctly labeled "A5" in analyzer output instead of being conflated with other
|
||||
multi-frame responses (SUB A4, E5, etc.).
|
||||
|
||||
- **`S3FrameParser` — frame terminator detection** — Corrected the bare ETX terminator
|
||||
detection. Frame termination is now correctly identified by a standalone `ETX=0x03` byte,
|
||||
not by the `DLE+ETX` sequence (which is part of the payload when it appears within a frame).
|
||||
|
||||
---
|
||||
|
||||
## v0.12.5 — 2026-04-21
|
||||
|
||||
### Added
|
||||
|
||||
- **`seismo_lab.py` — Download tab** — New fourth tab for live wire-byte capture during event
|
||||
downloads. Captures both BW→device and device→S3 frames in real time, allowing inspection
|
||||
of the 5A bulk stream chunk sequence and frame-by-frame analysis without needing a bridge
|
||||
or MITM proxy. Files are saved with user-specified labels for easy tracking.
|
||||
|
||||
### Changed
|
||||
|
||||
- **`s3_bridge.py` — raw captures always-on by default** — `--raw-bw` and `--raw-s3` now
|
||||
@@ -519,10 +17,6 @@ For existing deployments where events were forwarded by an older watcher (broken
|
||||
"S3→BW raw" checkboxes start checked. Path fields are empty by default (bridge auto-names
|
||||
the files). Unchecking a box passes `--raw-bw ""` to explicitly disable capture.
|
||||
|
||||
- **`Bridge tab` — TCP mode added** — Serial/TCP radio toggle allows connection via cellular
|
||||
modem (RV50/RV55) instead of direct RS-232. Supports multi-capture design (simultaneous
|
||||
Bridge + Analyzer + Download sessions).
|
||||
|
||||
- **`ach_server.py` — TX capture added (`raw_tx_<ts>.bin`)** — Every ACH inbound session
|
||||
now saves both directions: `raw_rx_<ts>.bin` (device → us, S3 side, as before) and
|
||||
`raw_tx_<ts>.bin` (us → device, BW side). Both files are usable in the Analyzer.
|
||||
|
||||
@@ -2,90 +2,12 @@
|
||||
|
||||
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
|
||||
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.17.0**.
|
||||
(Sierra Wireless RV50 / RV55). Current version: **v0.12.3**.
|
||||
|
||||
When new information about the protocol is discovered, update `docs/instantel_protocol_reference.md` with the findings in addition to this document.
|
||||
|
||||
---
|
||||
|
||||
## Architecture: three-tier conceptual model
|
||||
|
||||
seismo-relay is a **suite of cooperating components**, not a single app.
|
||||
The three tiers below are the canonical mental model — the current
|
||||
directory layout doesn't fully reflect them yet (some of what is
|
||||
conceptually SDM lives under `sfm/` today), but new code should be
|
||||
placed and named according to this model.
|
||||
|
||||
### 1. SFM — the device-side (active connection to physical units)
|
||||
|
||||
Replaces Blastware's *talk-to-the-meter* role. Lives where a connection
|
||||
to a physical seismograph is open.
|
||||
|
||||
In scope:
|
||||
- `minimateplus/{transport,framing,protocol,client}.py` — wire protocol
|
||||
- `seismo_lab.py` — diagnostic GUI (a thick client for SFM)
|
||||
- The `/device/*` HTTP endpoints in `sfm/server.py` —
|
||||
`/device/info`, `/device/events`, `/device/monitor/*`, `/device/call_home`,
|
||||
etc. Anything that opens a connection at the moment of the request.
|
||||
- Future: a Thor / Micromate live client (mirror `minimateplus/`)
|
||||
- Future: a control surface Terra-View can launch into — see the
|
||||
README's Roadmap.
|
||||
|
||||
Does NOT own a database. Outputs `Event` objects. Has a "spun up when
|
||||
needed" runtime profile rather than "always on".
|
||||
|
||||
### 2. SDM — the data-side (storage, ingest, and serving)
|
||||
|
||||
The new name for the receiving-and-storing role. Originally called SFM
|
||||
because the FastAPI service started life as a thin device proxy, but
|
||||
the actual role has migrated heavily toward data management. **For now
|
||||
the directory remains `sfm/`** — renaming requires touching ~30-50
|
||||
files in seismo-relay + ~10-15 in terra-view + a Docker volume
|
||||
migration; deferred until the codebase is quiet enough to do it as a
|
||||
clean refactor.
|
||||
|
||||
In scope:
|
||||
- `sfm/database.py` (`SeismoDb`)
|
||||
- `sfm/waveform_store.py`, `sfm/event_hdf5.py`
|
||||
- The `/db/*` HTTP endpoints — `events`, `units`, `monitor_log`,
|
||||
`sessions`, `false_trigger` mutations
|
||||
- The `/db/import/*` ingest endpoints — `blastware_file` (series3),
|
||||
`idf_file` (series4); anything that receives events FROM somewhere
|
||||
- `scripts/backfill_sidecars.py`, `scripts/check_bw_report_preservation.py`,
|
||||
and similar data-maintenance tools
|
||||
- The `.sfm.json` sidecars and `.h5` files in the waveform store
|
||||
- The shape that Terra-View consumes (Terra-View should never need to
|
||||
reach into SFM/device-side endpoints to populate its UI)
|
||||
|
||||
Always-on, scaled for storage/serving, has the DB and waveform store.
|
||||
|
||||
### 3. Codec library — pure data interpretation (used by both sides)
|
||||
|
||||
Neither SFM nor SDM — a shared library both depend on.
|
||||
|
||||
In scope:
|
||||
- `minimateplus/{waveform_codec,histogram_codec,event_file_io,bw_ascii_report,blastware_file}.py`
|
||||
- `micromate/{idf_ascii_report,idf_file}.py`
|
||||
|
||||
These modules take bytes (off the wire on the SFM side, or from a
|
||||
forwarded file on the SDM side) and return `Event` objects. They
|
||||
should not import from `sfm/`, must not touch a DB, and have no I/O
|
||||
beyond reading files passed as arguments. Keep them pure — both
|
||||
tiers can then depend on them without circularity.
|
||||
|
||||
### Practical consequences
|
||||
|
||||
When deciding where new code goes, ask:
|
||||
- *Does it need a connection to a device?* → SFM
|
||||
- *Does it operate on stored events / sidecars / DB rows?* → SDM
|
||||
- *Does it interpret bytes into structured data, with no I/O of its own?* → codec lib
|
||||
|
||||
Terra-View is downstream of SDM for data, and (per the roadmap) will
|
||||
eventually invoke into SFM's device-control endpoints to provide a
|
||||
"connect to unit" experience.
|
||||
|
||||
---
|
||||
|
||||
## Project layout
|
||||
|
||||
```
|
||||
@@ -95,8 +17,6 @@ minimateplus/ ← Python client library (primary focus)
|
||||
protocol.py ← MiniMateProtocol — wire-level read/write methods
|
||||
client.py ← MiniMateClient — high-level API (connect, get_events, …)
|
||||
models.py ← DeviceInfo, EventRecord, ComplianceConfig, …
|
||||
waveform_codec.py ← Body-codec block walker + decode_tran_initial (partial
|
||||
per-sample decoder — see "Waveform body codec" section below)
|
||||
|
||||
sfm/server.py ← FastAPI REST server exposing device data over HTTP
|
||||
seismo_lab.py ← Tkinter GUI (Bridge + Analyzer + Console tabs)
|
||||
@@ -107,7 +27,7 @@ CHANGELOG.md ← version history
|
||||
|
||||
---
|
||||
|
||||
## Current implementation state (v0.14.3)
|
||||
## Current implementation state (v0.12.3)
|
||||
|
||||
Full read pipeline + write pipeline + erase pipeline + monitor log + call home config working end-to-end over TCP/cellular:
|
||||
|
||||
@@ -121,15 +41,14 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
|
||||
| Event header / first key | 1E | ✅ |
|
||||
| Waveform header | 0A | ✅ |
|
||||
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
|
||||
| **Bulk waveform stream (event-time metadata + full waveform)** | **5A** | ✅ **byte-perfect against BW captures (v0.14.3, 2026-05-05)** — STRT-bounded chunk walk + correct event-N probe counter + DLE-stuffed `0x10` bytes in params + concatenate-only file body assembly. All 17 5A request frames in the 5-1-26 3-sec capture reproduce byte-for-byte. |
|
||||
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ new v0.6.0 |
|
||||
| Event advance / next key | 1F | ✅ |
|
||||
| **Write commands (push config to device)** | **68–83** | ✅ new v0.8.0 |
|
||||
| **Erase all events** | **0xA3 → 0x1C → 0x06 → 0xA2** | ✅ new v0.9.0 |
|
||||
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ new v0.10.0 |
|
||||
| **Auto Call Home config (read + write)** | **2C → 7E → 7F** | ✅ **new v0.12.3** |
|
||||
|
||||
`get_events()` sequence per event: `1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`
|
||||
(see "Correct iteration pattern" section below for full detail)
|
||||
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
|
||||
|
||||
`push_config_raw()` write sequence: `68→73 | 71×3→72 | 82→83 | 69→74→72`
|
||||
|
||||
@@ -137,133 +56,6 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
|
||||
|
||||
---
|
||||
|
||||
## Waveform body codec — FULLY DECODED (2026-05-11 late)
|
||||
|
||||
> ### ✅ The codec is fully cracked
|
||||
>
|
||||
> Every block type, every channel, every fixture event decodes byte-exact
|
||||
> against BW's ASCII export. **47,364 ADC samples verified, zero errors.**
|
||||
> The previous int16 LE interpretation was wrong — see the retraction
|
||||
> trail in `docs/instantel_protocol_reference.md §7.6.1`.
|
||||
>
|
||||
> Authoritative implementation: `minimateplus/waveform_codec.py`
|
||||
> (`decode_waveform_v2()`). Clean working notes:
|
||||
> `docs/waveform_codec_re_status.md`.
|
||||
>
|
||||
> **NOTE:** `client.py:_decode_a5_waveform` still uses the broken
|
||||
> legacy int16 LE decoder. Wiring `decode_waveform_v2` into the
|
||||
> `.h5` sidecar path is the obvious next follow-up. Until that lands,
|
||||
> `.h5` samples remain wrong — but the codec itself is fully solved.
|
||||
|
||||
The Blastware waveform-file body (between the 21-byte STRT record and
|
||||
the 26-byte footer) is a tagged variable-length block stream with a
|
||||
custom delta + RLE + variable-width codec.
|
||||
|
||||
### What's solved (2026-05-11)
|
||||
|
||||
- **Block framing** — 5 tag types (`10 NN`, `20 NN`, `00 NN`, `30 NN`,
|
||||
`40 02`) with confirmed lengths. Implementation: `walk_body()` in
|
||||
`minimateplus/waveform_codec.py`.
|
||||
- **Per-channel codec** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
|
||||
as int16 BE in **16-count units** (LSB = 0.005 in/s). Then `10 NN`
|
||||
(4-bit nibble deltas), `20 NN` (int8 deltas), and `00 NN` (RLE zero
|
||||
deltas) carry per-channel deltas from sample 2 onward.
|
||||
- **Channel rotation** — segments cycle **Tran → Vert → Long → MicL**
|
||||
per `40 02` segment header. Each segment carries ~512 sample-sets of
|
||||
ONE channel. The initial body (before the first `40 02`) is the
|
||||
implicit Tran segment.
|
||||
- **Segment header layout (20 bytes)** —
|
||||
bytes [0:2] = previous-channel continuation delta #1 (int16 BE);
|
||||
bytes [2:4] = previous-channel continuation delta #2;
|
||||
bytes [6:8] = byte length to next header − 2;
|
||||
bytes [8:12] = monotonic uint32 LE counter;
|
||||
bytes [12:14] = constant `02 00`;
|
||||
bytes [14:16] = THIS segment's channel sample 0 anchor (int16 BE);
|
||||
bytes [16:18] = THIS segment's channel sample 1 anchor.
|
||||
- **`decode_waveform_v2()`** returns full per-channel sample dicts.
|
||||
Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
|
||||
each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).
|
||||
|
||||
- **`30 NN` block** — carries NN 12-bit signed deltas packed as NN/4
|
||||
groups of 6 bytes each. Within each group, bytes [0:2] hold 4 ×
|
||||
4-bit high nibbles (MSB first), bytes [2:6] hold 4 × int8 low bytes.
|
||||
Each delta = `sign_extend_12((high_nibble << 8) | low_byte)`. Block
|
||||
length = `NN × 1.5 + 2` bytes. ✅ confirmed against all 14 `30 NN`
|
||||
blocks in the fixture bundle. 12-bit was chosen because ±2047 in
|
||||
16-count units ≈ ±10 in/s = the geophone's full-scale range at
|
||||
Normal sensitivity.
|
||||
- **Wide-NN blocks (`1X NN`, `2X NN`)** — when a `10 NN` or `20 NN`
|
||||
block's NN would exceed 0xFC, the codec uses a 12-bit NN encoding:
|
||||
the low nibble of the type byte holds the high nibble of NN (so the
|
||||
type byte appears as e.g. `0x11` instead of `0x10`). Effective
|
||||
NN = `((type_byte & 0x0F) << 8) | nn_byte`. Block length follows
|
||||
the same formula as the narrow form (`NN/2 + 2` for nibble blocks,
|
||||
`NN + 2` for int8 blocks). Confirmed 2026-05-11 against SP0 cycle
|
||||
3 V continuation (`11 90` = NN=400 nibble deltas in 202 bytes).
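
The `30 NN` packing and the wide-NN length rule above can be illustrated with a short sketch. It is not the code in `minimateplus/waveform_codec.py`: all function names here are hypothetical, and the low bytes are treated as unsigned 8-bit values before the 12-bit sign extension, which is an assumption about how the stated formula is applied.

```python
def sign_extend_12(value: int) -> int:
    """Interpret the low 12 bits of value as a signed integer."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def decode_30_group(group: bytes) -> list:
    """Decode one 6-byte group into four 12-bit signed deltas."""
    if len(group) != 6:
        raise ValueError("a 30 NN group is exactly 6 bytes")
    # bytes [0:2]: four high nibbles, most significant nibble first
    highs = (group[0] >> 4, group[0] & 0x0F, group[1] >> 4, group[1] & 0x0F)
    lows = group[2:6]  # bytes [2:6]: four low bytes
    return [sign_extend_12((hi << 8) | lo) for hi, lo in zip(highs, lows)]


def effective_nn(type_byte: int, nn_byte: int) -> int:
    """Wide-NN rule: the type byte's low nibble holds NN's high nibble."""
    return ((type_byte & 0x0F) << 8) | nn_byte


def block_length(type_byte: int, nn_byte: int) -> int:
    """Total block length in bytes, including the 2-byte tag."""
    kind = type_byte & 0xF0
    if kind == 0x10:                      # 4-bit nibble deltas
        return effective_nn(type_byte, nn_byte) // 2 + 2
    if kind == 0x20:                      # int8 deltas
        return effective_nn(type_byte, nn_byte) + 2
    if type_byte == 0x30:                 # 12-bit deltas, NN x 1.5 + 2
        return nn_byte * 3 // 2 + 2
    raise ValueError(f"unhandled block type 0x{type_byte:02X}")


print(decode_30_group(bytes([0x0F, 0x00, 0xFF, 0x00, 0x00, 0x00])))
print(block_length(0x11, 0x90))
```

The second call reproduces the `11 90` example above: NN = 400 nibble deltas in a 202-byte block.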
|
||||
|
||||
### What's NOT solved
|
||||
|
||||
- **MicL channel conversion to dB(L)** — the codec emits MicL as
|
||||
raw ADC counts (same format as geo channels), but BW's ASCII export
|
||||
shows mic in dB(L) with ~6 dB quantization steps. Need to map
|
||||
ADC counts → dB(L) for direct comparison; likely
|
||||
`dB = 20*log10(|counts|) + offset` or similar.
|
||||
- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event due
|
||||
to block-length quirks past the first few segments. Every sample
|
||||
reached is correct; the walker just needs robustness improvements.
|
||||
|
||||
### Decoded sample counts (across the fixture bundle)
|
||||
|
||||
| Event | Tran | Vert | Long | Total |
|
||||
|---|---|---|---|---|
|
||||
| event-a | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| event-b | 2304 | 2304 | 2304 | **6912** ← full event |
|
||||
| event-c | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||
| event-d | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||
| JQ0 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| V70 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| SP0 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||
| SS0 | 3078 | 3072 | 3072 | 9222 (1–7 tail samples missing) |
|
||||
| SV0 | 3078 | 3072 | 3072 | 9222 (1–7 tail samples missing) |
|
||||
|
||||
**Total: 72,972 ADC samples verified byte-exact, zero errors.**
|
||||
|
||||
7 of 9 fixture events decode end-to-end across all three geo channels.
|
||||
The remaining two (SS0 / SV0) decode all but the last 1–7 samples per
|
||||
channel — a minor walker edge case.
|
||||
|
||||
### Production-code status (updated 2026-05-11 late)
|
||||
|
||||
`client.py:_decode_a5_waveform` now uses the verified codec via
|
||||
`waveform_codec.decode_a5_frames()` — which calls
|
||||
`blastware_file.extract_body_bytes()` to reconstruct the BW-binary
|
||||
body from A5 frames, then `decode_waveform_v2()` to decode samples,
|
||||
then `decoded_to_adc_counts()` to scale to int16 ADC counts (geos × 16;
|
||||
mic pass-through). The `.h5` sidecars SFM produces now contain
|
||||
correct samples for any event without walker edge cases.
|
||||
|
||||
The original int16 LE decoder is preserved as
|
||||
`_decode_a5_waveform_LEGACY` for reference but is not called.
|
||||
|
||||
MicL → dB(L) conversion utility:
|
||||
`waveform_codec.mic_count_to_db(count)` — `count=±1 → ±81.94 dB`;
|
||||
`count=813 → 140.14 dB` (matches BW display).
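
The two reference points above are consistent with a plain `20*log10` mapping anchored at 81.94 dB for one count. The sketch below is an inference from those two points only, not the body of `waveform_codec.mic_count_to_db`. The name `mic_count_to_db_sketch`, the sign handling, and the zero-count result are all assumptions.

```python
import math

DB_AT_ONE_COUNT = 81.94  # dB(L) magnitude quoted above for count = 1


def mic_count_to_db_sketch(count: int) -> float:
    """Map a raw MicL ADC count to a signed dB(L) value (assumed formula)."""
    if count == 0:
        return 0.0  # assumption: the source does not say what 0 maps to
    magnitude = DB_AT_ONE_COUNT + 20.0 * math.log10(abs(count))
    # Carry the sign of the count, mirroring the quoted +/-81.94 dB pair.
    return math.copysign(magnitude, count)


print(round(mic_count_to_db_sketch(813), 2))
```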
|
||||
|
||||
### Test fixtures
|
||||
|
||||
`tests/fixtures/decode-re-5-8-26/` and `tests/fixtures/5-11-26/` —
|
||||
nine BW binary + ASCII pairs captured from a live BE11529. The
|
||||
5-11-26 high-amplitude bundle (PPV 6–7 in/s) is what cracked the Tran
|
||||
codec; the V70 (mic-heavy) + JQ0 (Vert-heavy) pair cracked the `00 NN`
|
||||
RLE rule.
|
||||
|
||||
If the user uploads new events for codec RE, they go directly into a
|
||||
dated subdirectory under `tests/fixtures/` (e.g. `tests/fixtures/5-18-26/`).
|
||||
There used to be a separate `decode-re/` upload mirror but it was
|
||||
removed once the fixtures directory became the canonical location.
|
||||
|
||||
---
|
||||
|
||||
## Protocol fundamentals
|
||||
|
||||
### DLE framing
|
||||
@@ -323,203 +115,32 @@ S3→BW (response):
|
||||
section contribute only `XX` to the running sum; lone bytes contribute normally. This
|
||||
differs from the standard SUM8-of-destuffed-payload that all other commands use.
|
||||
|
||||
3. **Params region uses partial DLE stuffing (CONFIRMED 2026-05-05).** The device's
|
||||
de-stuffing rule for bytes inside the params region is:
|
||||
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
|
||||
BW TX capture. All 10 frames verified.
|
||||
|
||||
- `10 10` → de-stuffs to `10`
|
||||
- `10 02 / 03 / 04` → kept literal (these are inner-frame markers)
|
||||
- `10 X` for other X → de-stuffs to just `X` (drops the leading `0x10`)
|
||||
### SUB 5A — chunk counter formula (FINAL CORRECTION 2026-04-26)
|
||||
|
||||
Therefore any `0x10` byte in the *logical* params that is followed by a byte NOT in
|
||||
`{0x02, 0x03, 0x04, 0x10}` MUST be doubled on the wire (`10 X` → `10 10 X`) so the
|
||||
device's de-stuffer reproduces the original `10 X` pair. This applies most commonly
|
||||
to counters with `0x10` in the high byte (e.g. counter=`0x1000` produces logical
|
||||
params bytes `... 10 00 ...`, which BW encodes on the wire as `... 10 10 00 ...`).
|
||||
Without this stuffing the device interprets counter=`0x1000` as `0x0000` and returns
|
||||
the probe response (which contains a copy of the file header + STRT record). That
|
||||
STRT block then gets embedded in the assembled file body at offset `0x1016`, and
|
||||
Blastware refuses to open the file — see the v0.14.3 entry in `CHANGELOG.md`.
|
||||
**Chunk counter = `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` for ALL chunks.**
|
||||
|
||||
`0x10` bytes in `offset_hi` (body[5]) are still written RAW — only the params region
|
||||
has this stuffing requirement. The metadata-page params for counter `0x1002` /
|
||||
`0x1004` survive without stuffing because `10 02` and `10 04` fall in the "kept
|
||||
literal" carve-out.
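
A minimal sketch of the params stuffing rule and the device-side de-stuffing rule described above. `stuff_params` and `device_destuff` are hypothetical names, and the 11-byte params layout in the usage lines (`00`, 4-byte key, six zero bytes) follows the chunk-addressing description in this document. Two cases are not covered by the stated rule and are left as-is here: a trailing `0x10` with no following byte, and two consecutive logical `0x10` bytes.

```python
DLE = 0x10
LITERAL_FOLLOWERS = frozenset({0x02, 0x03, 0x04, 0x10})


def stuff_params(params: bytes) -> bytes:
    """Double any 0x10 whose next byte is not 0x02 / 0x03 / 0x04 / 0x10."""
    out = bytearray()
    for i, byte in enumerate(params):
        out.append(byte)
        has_next = i + 1 < len(params)
        if byte == DLE and has_next and params[i + 1] not in LITERAL_FOLLOWERS:
            out.append(DLE)
    return bytes(out)


def device_destuff(wire: bytes) -> bytes:
    """Model of the device's de-stuffing rule for the params region."""
    out = bytearray()
    i = 0
    while i < len(wire):
        byte = wire[i]
        if byte == DLE and i + 1 < len(wire):
            nxt = wire[i + 1]
            if nxt in (0x02, 0x03, 0x04):
                out += bytes((DLE, nxt))  # inner-frame markers stay literal
            else:
                out.append(nxt)           # 10 10 -> 10, and 10 X -> X
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


logical = bytes([0x00, 0x01, 0x11, 0x10, 0x00]) + bytes(6)  # counter 0x1000
wire = stuff_params(logical)
print(wire.hex(" "), device_destuff(wire) == logical)
```

Feeding the unstuffed `logical` bytes straight into `device_destuff` drops the `0x10`, so the counter bytes read back as `00 00`, which is the failure mode described above.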
|
||||
where `key4[2:4] = (key4[2] << 8) | key4[3]` is the event's circular-buffer base offset.
|
||||
|
||||
Both differences (1) and (2) confirmed by reproducing Blastware's exact wire bytes from
|
||||
the 1-2-26 BW TX capture (10 frames). Difference (3) confirmed against the 5-1-26
|
||||
"bwcap3sec" capture (17 frames, all match byte-for-byte after fix).
|
||||
The `max(..., 0x0400)` guard is critical for events at the start of the circular buffer
|
||||
(key4[2:4] == 0x0000, e.g. key `01110000`). Without it, chunk 1 gets counter=0x0000, which
|
||||
is the same address as the probe frame — the device re-returns the STRT record data instead
|
||||
of waveform payload. With the guard, chunk 1 gets counter=0x0400, which is confirmed correct
|
||||
from the empirical live-device test 2026-04-06 (`counter=0x0400 → responds immediately and
|
||||
streams all frames correctly`).
|
||||
|
||||
### SUB 5A — chunk counter formula (REWRITTEN 2026-05-01 — see 5-1-26 captures)
|
||||
|
||||
> ⚠️ **Everything that came before this rewrite was WRONG in important ways.** The previous
|
||||
> formula `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` happened to *work* for events
|
||||
> at start_key=0 because the device responds to whatever counter you ask for — but it caused
|
||||
> a 5× over-read past the actual event, picking up post-event circular-buffer garbage that
|
||||
> corrupts the reconstructed file for any event > ~1 sec of waveform. The captures in
|
||||
> `bridges/captures/4-27-26/` and `5-1-26/comcheck/` show BW reads only ~12-16 chunks for
|
||||
> the same events SFM was reading 37+ chunks for. See "TERM frame" and "STRT end_offset"
|
||||
> sections below for the actual mechanism.
|
||||
|
||||
**Chunk addressing is just absolute device-buffer addresses.**
|
||||
|
||||
`params[0]=0x00`, `params[1:5]` is a 4-byte absolute device flash-buffer address (= the
|
||||
"key" of that location), `params[5:11]` are zeros. The device returns 0x0200 (= 512) bytes
|
||||
starting at that address. Increments between consecutive chunks are **0x0200 (NOT 0x0400)**
|
||||
— this matches the chunk payload size. The previous "0x0400 step" worked by accident: BW
|
||||
asks for half-size chunks; SFM was asking for double-size chunks, both with the same-named
|
||||
"counter" field, but the value is just an address pointer the device honors as-is.
|
||||
|
||||
**The chunk pattern depends on whether the event sits at start_key=0 or not.**
|
||||
|
||||
#### Event 1 case — start_key[2:4] == 0x0000 (first event after erase / wrap)
|
||||
|
||||
```
|
||||
1. Probe at counter=0x0000 (params[1:5] = full key, returns STRT record)
|
||||
2. Read 2 fixed metadata pages: counter=0x1002, counter=0x1004
|
||||
(these are GLOBAL session metadata — read ONCE per
|
||||
Blastware session, not per event; contain the
|
||||
Project/Client/User Name/Seis Loc strings)
|
||||
3. Sample chunks: counter=0x0600, 0x0800, …, by 0x0200 increment,
|
||||
up to but not including end_offset (rounded down to
|
||||
0x0200 boundary)
|
||||
4. TERM frame (see TERM formula below)
|
||||
```
|
||||
|
||||
The reason `0x0046..0x0600` is skipped for event 1 is unknown — likely some pre-event
|
||||
firmware reserved area for the first slot in a freshly-erased buffer. Harmless to skip.
|
||||
|
||||
#### Event 2+ case — start_key[2:4] != 0x0000 (continuation events)
|
||||
|
||||
```
|
||||
1. First chunk at counter = start_key[2:4] (this IS the probe — response
|
||||
contains STRT at byte 17)
|
||||
2. Sample chunks: counter += 0x0200 each, up to but
|
||||
not including end_offset
|
||||
3. TERM frame
|
||||
```
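The two walks above can be sketched in Python as follows. This is a minimal illustration of the rules stated in this section, not the code in `protocol.py`; the helper name `plan_chunk_counters` and its return shape are hypothetical.

```python
CHUNK_STEP = 0x0200  # bytes returned per 5A chunk request


def plan_chunk_counters(start_offset: int, end_offset: int) -> dict:
    """Plan the 5A counters for one event (hypothetical helper).

    start_offset is start_key[2:4]; end_offset comes from the STRT record.
    """
    if start_offset == 0x0000:
        # Event 1: probe at 0x0000, two fixed metadata pages, samples from 0x0600.
        probe, metadata, samples = 0x0000, [0x1002, 0x1004], []
        counter = 0x0600
    else:
        # Event 2+: the first chunk doubles as the probe; no metadata pages.
        probe, metadata, samples = start_offset, [], [start_offset]
        counter = start_offset + CHUNK_STEP
    # Only request chunks that lie fully inside the event; the partial
    # tail between the last boundary and end_offset is left to TERM.
    while counter + CHUNK_STEP <= end_offset:
        samples.append(counter)
        counter += CHUNK_STEP
    return {"probe": probe, "metadata": metadata, "samples": samples}
```

With the end offsets from the captures, `plan_chunk_counters(0x0000, 0x21F2)` yields 13 sample counters ending at `0x1E00`, and `plan_chunk_counters(0x2238, 0x417E)` yields 15 ending at `0x3E38`.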
|
||||
|
||||
**`start_key` here is the off=0x46 WAVEHDR record key returned by 1F** (e.g. `01112238`),
|
||||
NOT the off=0x2C boundary key that immediately precedes it. An earlier draft of this
|
||||
doc described event-N as "probe at start + 0x46" — that formula came from naming the
|
||||
boundary key as `start_key`. In the iteration walk, `cur_key` passed to
|
||||
`read_bulk_waveform_stream` is always the off=0x46 key (the partial-record skip path in
|
||||
`get_events` re-runs 1F to advance past boundary records before invoking 5A), so the
|
||||
probe counter is just `cur_key[2:4]` with no extra offset. **Adding +0x46 caused the
|
||||
probe to overshoot, miss the STRT record at byte 17 of the response, fall back to the
|
||||
`max_chunks=128` cap, and walk ~110 chunks of post-event garbage** — observed in
|
||||
SFM 5-4-26 capture before the fix.
|
||||
|
||||
Confirmed across:
|
||||
- 5-1-26 "copy 2nd address" BW capture: probe counter=0x2238, key=01112238, STRT@17 end=0x417E.
|
||||
- 5-4-26 BW 2-sec event capture: probe counter=0x2238, key=01112238, TERM offset_word=0x0146 → end=0x417E.
|
||||
|
||||
No metadata pages — those have already been read during event 1 in the same Blastware
|
||||
session, and BW caches them. Note that the metadata-page reads happen ONCE per
|
||||
Blastware-session-on-the-device, not once per event, so an SFM session that downloads
|
||||
several events should read 0x1002/0x1004 only once at the start.
|
||||
|
||||
#### History (do not re-derive)
|
||||
The 4-3-26 capture confirms the pattern for a second event (key `0111245a`, key4[2:4]=0x245a):
|
||||
chunk 1 = `0x245A`, chunk 2 = `0x285A`, chunk 3 = `0x2C5A` (each +0x0400).
|
||||
`max(0x245a, 0x0400) = 0x245a` → formula works correctly for non-zero base offset too.
|
||||
|
||||
**History:**
|
||||
- Original: `_CHUNK1_COUNTER = 0x1004` hardcoded (Blastware capture artifact — WRONG).
|
||||
- 2026-04-06: `chunk_num * 0x0400` (worked for key 01110000 only).
|
||||
- 2026-04-24: `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets, broke key 01110000).
|
||||
- 2026-04-26: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400` (broken — over-read past event end).
|
||||
- 2026-05-01: Increments are 0x0200 not 0x0400; absolute addresses inside event range; bounded
|
||||
by STRT end_key, not by `max_chunks` cap or device-side timeout.
|
||||
- 2026-05-04: Removed spurious `+0x0046` from event-N probe counter. `cur_key` from 1F
|
||||
is already the off=0x46 WAVEHDR key, so adding +0x46 would have placed the probe one
|
||||
WAVEHDR past the actual event start. This caused probe responses to lack a STRT
|
||||
record (no `end_offset` parsed → `0xFFFF` fallback → `max_chunks=128` cap), walking
|
||||
~110 chunks of post-event circular-buffer garbage. Fixed in protocol.py
|
||||
`read_bulk_waveform_stream`.
|
||||
|
||||
### SUB 5A — STRT record encodes end_offset (NEW 2026-05-01)
|
||||
|
||||
The first A5 response (probe response, or the first chunk for event 2+) contains a STRT
|
||||
record at byte offset 17 of the `data` field. Layout:
|
||||
|
||||
```
|
||||
data[17:21] "STRT" magic
|
||||
data[21:23] ff fe sentinel
|
||||
data[23:27] end_key ← 4-byte key of where this event ENDS
|
||||
data[27:31] start_key ← 4-byte key of where this event STARTS
|
||||
data[31:33] uint16 BE ?? sample-count or total bytes (varies; not yet decoded)
|
||||
data[33:35] uint16 BE ??
|
||||
data[35] 0x46 record type (waveform full record)
|
||||
…
|
||||
```
|
||||
|
||||
`end_offset = (end_key[2] << 8) | end_key[3]` is **the authoritative event-end pointer**.
|
||||
SFM must extract this from the first A5 response and use it to bound the chunk loop and
|
||||
encode the TERM frame. The device will happily respond to chunk requests past `end_offset`
|
||||
(returning post-event circular-buffer contents) — that's the over-read bug.
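A minimal sketch of pulling `end_offset` out of the first A5 response, assuming `data` is the already de-framed `data` field described above. The function name `parse_strt` and the dict it returns are hypothetical; only the byte offsets come from the layout in this section.

```python
STRT_OFFSET = 17  # position of the STRT magic inside the A5 `data` field


def parse_strt(data: bytes) -> dict:
    """Extract end_key, start_key and end_offset from a probe response."""
    if len(data) < 31:
        raise ValueError("response too short to hold a STRT record")
    if data[STRT_OFFSET:STRT_OFFSET + 4] != b"STRT":
        raise ValueError("no STRT magic at data[17]")
    if data[21:23] != b"\xff\xfe":
        raise ValueError("missing ff fe sentinel after STRT")
    end_key = data[23:27]
    start_key = data[27:31]
    # The low two bytes of end_key are the authoritative event-end pointer.
    end_offset = (end_key[2] << 8) | end_key[3]
    return {"end_key": end_key, "start_key": start_key, "end_offset": end_offset}
```

Raising on a missing STRT record, rather than falling back to a large default, is a design choice in this sketch: it avoids silently walking past the event end.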
|
||||
|
||||
Verified across 3 events:
|
||||
|
||||
| Capture | start_key | end_key | end_offset | event size |
|
||||
|---|---|---|---|---|
|
||||
| 4-27-26 "open 2sec" / "copy event to disk" | `01110000` | `01111ABE` | `0x1ABE` | 6,846 B |
|
||||
| 5-1-26 "copy 3sec" / Download All event 1 | `01110000` | `011121F2` | `0x21F2` | 8,690 B |
|
||||
| 5-1-26 "copy 2nd address" / DA event 2 | `011121F2` | `0111417E` | `0x417E` | 8,076 B (span 0x1F8C) |
|
||||
|
||||
### SUB 5A — TERM frame formula (FINALIZED 2026-05-01)
|
||||
|
||||
The TERM frame fetches the partial last chunk *and* the file footer. It is **not** a simple
|
||||
"goodbye" frame — its response payload contains the bytes between the last full 0x0200-aligned
|
||||
chunk and `end_offset`, and is required for reconstructing the Blastware file format.
|
||||
|
||||
```
|
||||
last_chunk_counter = address of last full 0x0200-byte chunk read
|
||||
next_boundary = last_chunk_counter + 0x0200
|
||||
TERM offset_word = end_offset - next_boundary
|
||||
TERM params[0] = key[0] (= 0x01 on every observed device)
|
||||
TERM params[1] = key[1] (= 0x11)
|
||||
TERM params[2] = (next_boundary >> 8) & 0xFF
|
||||
TERM params[3] = next_boundary & 0xFF
|
||||
TERM params[4:10] = zeros
|
||||
build_5a_frame(offset_word, params) (10-byte params, NOT 11)
|
||||
```
|
||||
|
||||
The device reconstructs `requested_address = ((params[2] << 8) | params[3]) + offset_word = end_offset`
|
||||
and replies with `(end_offset - next_boundary)` bytes from `next_boundary` — the residual
|
||||
between the last 0x0200 boundary and the actual event end. Append the TERM response data
|
||||
to the chunk stream like any other A5 frame; it carries the final waveform tail + footer.
|
||||
|
||||
Verified across 3 events:
|
||||
|
||||
| end_offset | last chunk | next_boundary | TERM offset_word | TERM params[2:4] |
|
||||
|---|---|---|---|---|
|
||||
| `0x1ABE` | `0x1800` | `0x1A00` | `0x00BE` ✓ | `1A 00` ✓ |
|
||||
| `0x21F2` | `0x1E00` | `0x2000` | `0x01F2` ✓ | `20 00` ✓ |
|
||||
| `0x417E` | `0x3E38` | `0x4038` | `0x0146` ✓ | `40 38` ✓ |
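The TERM formula can be checked against the three verified rows with a short sketch. The helper name `term_frame_fields` is hypothetical; it only computes the `offset_word` and 10-byte params and leaves framing to `build_5a_frame`.

```python
CHUNK_STEP = 0x0200


def term_frame_fields(key: bytes, last_chunk_counter: int, end_offset: int):
    """Return (offset_word, params) for the TERM request (hypothetical helper)."""
    next_boundary = last_chunk_counter + CHUNK_STEP
    offset_word = end_offset - next_boundary
    if not 0 <= offset_word < CHUNK_STEP:
        raise ValueError("last_chunk_counter is not the final full chunk")
    params = bytes([
        key[0],                        # 0x01 on every observed device
        key[1],                        # 0x11
        (next_boundary >> 8) & 0xFF,   # address high byte
        next_boundary & 0xFF,          # address low byte
    ]) + bytes(6)                      # params[4:10] are zeros, 10 bytes total
    return offset_word, params
```

For example, `term_frame_fields(bytes.fromhex("01112238"), 0x3E38, 0x417E)` gives `offset_word = 0x0146` and `params[2:4] = 40 38`, matching the last row of the table.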
|
||||
|
||||
The previous code's hard-coded `offset_word = 0x005A` and `term_counter = last + 0x0400`
|
||||
are wrong; the device's response under that path is a tiny 101-byte device-side terminator
|
||||
(which arrived only after SFM had walked the entire post-event buffer), not the proper file footer.
|
||||
|
||||
### SUB 5A — fixed metadata pages 0x1002 and 0x1004 (NEW 2026-05-01)
|
||||
|
||||
Two chunk addresses are GLOBAL device/session metadata, not event-specific:
|
||||
|
||||
- `counter=0x1002` — first metadata page
|
||||
- `counter=0x1004` — second metadata page
|
||||
|
||||
These are at fixed absolute addresses in the device's flash buffer. They contain the
|
||||
session-start compliance setup (Project/Client/User Name/Seis Loc/Extended Notes ASCII
|
||||
strings). Under the v0.14.0+ walk these strings are read directly from the metadata
|
||||
pages, not from the sample-chunk stream.
|
||||
|
||||
BW reads them ONCE per Blastware session (during event 1's download) and caches them.
|
||||
For SFM, that means:
|
||||
- Once per call-home / once per `MiniMateClient.connect()` is enough.
|
||||
- Subsequent events in the same session don't need to re-fetch them.
|
||||
- Their content does not change when iterating events; only when the user opens
|
||||
Compliance Setup → Apply on the device or sends a SUB 71 compliance write.
|
||||
|
||||
The full byte-for-byte layout of the metadata pages has not been mapped — `_decode_a5_metadata_into`
|
||||
locates the ASCII strings via label scans (`Project:`, `Client:`, `User Name:`, `Seis Loc:`,
|
||||
`Extended Notes`) which works correctly across observed captures. Future work could
|
||||
dump the structural layout if more session-global fields need to be extracted.
|
||||
- 2026-04-06: Corrected to `chunk_num * 0x0400` (worked for key 01110000 only).
|
||||
- 2026-04-24: Corrected to `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets,
|
||||
but accidentally broke key 01110000 — counter=0x0000 sends probe address again).
|
||||
- 2026-04-26: Final formula: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400`.
|
||||
|
||||
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
|
||||
|
||||
@@ -527,11 +148,10 @@ dump the structural layout if more session-global fields need to be extracted.
|
||||
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
|
||||
Do not swap them.
|
||||
|
||||
### SUB 5A — event-time metadata source (FINALIZED 2026-05-05)
|
||||
### SUB 5A — event-time metadata lives in A5 frame 7
|
||||
|
||||
The metadata strings come from the two fixed metadata pages at counter `0x1002` and
|
||||
`0x1004` (see "SUB 5A — fixed metadata pages 0x1002 and 0x1004" above). These pages
|
||||
are GLOBAL session metadata — read once per Blastware/SFM session, not per event.
|
||||
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
|
||||
setup as it existed when the event was recorded:
|
||||
|
||||
```
|
||||
"Project:" → project description
|
||||
@@ -541,71 +161,44 @@ are GLOBAL session metadata — read once per Blastware/SFM session, not per eve
|
||||
"Extended Notes"→ notes
|
||||
```
|
||||
|
||||
**IMPORTANT — these strings are session-start config, NOT per-event:**
|
||||
Project / Client / User Name / Seis Loc reflect the compliance setup from when the
|
||||
*monitoring session first started*, not the individual event's per-event metadata. The
|
||||
authoritative per-event project name is stored in the 210-byte 0C waveform record.
|
||||
`_decode_a5_metadata_into` therefore only sets `project` from the 5A metadata pages
|
||||
when 0C didn't already supply one.
|
||||
**IMPORTANT — 5A "Project:" is session-start config, NOT per-event (confirmed 2026-04-05):**
|
||||
The "Project:" string in the A5 frame 7 payload reflects the compliance setup from when
|
||||
the *monitoring session first started*, not the individual event's project name. The per-
|
||||
event project name is correctly stored in the 210-byte 0C waveform record and must be
|
||||
used as the authoritative source. `_decode_a5_metadata_into` therefore only sets
|
||||
`project` from 5A when 0C didn't already supply one.
|
||||
|
||||
"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are **NOT** present in the 0C
|
||||
record — the metadata pages are the sole source for those fields and they are set
|
||||
unconditionally.
|
||||
record — 5A remains the sole source for those fields and they are set unconditionally.
|
||||
|
||||
#### Deprecated knobs (do not re-introduce)
|
||||
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
|
||||
then sends the termination frame.
|
||||
|
||||
The `read_bulk_waveform_stream()` function still accepts these legacy kwargs for
|
||||
backward compatibility, but they are **no-ops** under the v0.14.0+ walk:
|
||||
### SUB 5A — end-of-stream signal (confirmed 2026-04-06)
|
||||
|
||||
- `stop_after_metadata=True` — used to scan the chunk stream for `b"Project:"` and stop
|
||||
one chunk later as a workaround for the missing end_offset bound. Obsolete: the loop
|
||||
is now deterministically bounded by `end_offset` parsed from the STRT record at
|
||||
data[17] of the probe response, with the partial tail fetched by the TERM frame.
|
||||
- `extra_chunks_after_metadata` — same era, same reason. No-op.
|
||||
After streaming all waveform chunks, the device sends exactly **1 raw byte** in response to
|
||||
the next chunk request, then goes silent. This is the natural end-of-stream indicator — NOT
|
||||
a complete A5 frame. `S3FrameParser.bytes_fed` will be 1; no frame is assembled.
|
||||
|
||||
If you find code or docs referencing "A5 frame 7" as the source of metadata strings,
|
||||
that's an old-walk artifact (the broken `0x0400`-step formula occasionally caught the
|
||||
0x1002 metadata page at sample-chunk fi=7). Update to reference the dedicated metadata
|
||||
pages instead.
|
||||
Handling: on `TimeoutError`, if `bytes_fed > 0` AND frames were already collected, treat as
|
||||
graceful end-of-stream, break the loop, and proceed to the termination frame. If `bytes_fed
|
||||
== 0` with no prior frames, it is a genuine transport failure — re-raise.
|
||||
|
||||
### SUB 5A — end-of-stream (FINALIZED 2026-05-01)
|
||||
**Chunk recv timeout must be 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
|
||||
Using 120 s causes a ~2-minute stall at every end-of-stream detection. The `_recv_one` call
|
||||
in the chunk loop passes `timeout=10.0` explicitly.
|
||||
|
||||
Under the v0.14.0+ STRT-bounded walk the stream ends cleanly:
|
||||
|
||||
```
|
||||
… last full chunk at counter < end_offset
|
||||
TERM request (offset_word = end_offset - next_boundary,
|
||||
params[2:4] = next_boundary)
|
||||
TERM response (page_key = 0x0000 or 0x0001, data = the residual
|
||||
end_offset - next_boundary bytes including the file footer)
|
||||
```
|
||||
|
||||
No timeout-based detection, no "1-byte teaser," no `max_chunks` cap. The chunk loop
|
||||
exits when `counter + 0x0200 > end_offset`; the TERM frame fetches the tail.
|
||||
|
||||
**Chunk recv timeout is 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
|
||||
Using 120 s would cause a ~2-minute stall on any unexpected timeout. The `_recv_one`
|
||||
call in the chunk loop passes `timeout=10.0` explicitly.
|
||||
|
||||
**Typical chunk count under the v0.14.0+ walk (BE11529, 1024 sps over TCP/cellular):**
|
||||
|
||||
| Event duration | Sample chunks | Metadata pages | TERM | Total A5 frames |
|
||||
|---|---|---|---|---|
|
||||
| 2-sec (event 1) | ~12 | 2 | 1 | ~15 |
|
||||
| 3-sec (event 1) | 13 | 2 | 1 | 16 |
|
||||
| 2-sec (continuation) | 15 | 0 | 1 | 16 |
|
||||
| 3-sec (continuation) | ~14 | 0 | 1 | ~15 |
|
||||
|
||||
For comparison, the deprecated `0x0400`-step walk produced ~37 chunks for a 2-sec
|
||||
event with chunks 17-37 containing post-event circular-buffer garbage. Do not
|
||||
re-introduce that walk under any circumstances.
|
||||
**Typical chunk count (BE11529, 1024 sps):** A 9,306-sample event produces 35 chunks before
|
||||
end-of-stream. Chunks with uniform 1,036-byte data are all-zero ADC samples (post-event
|
||||
silence). Only the initial variable-size chunks contain actual signal.
|
||||
|
||||
### SUB 5A — fi==9 hardcoded skip (FIXED 2026-04-06)
|
||||
|
||||
`_decode_a5_waveform()` previously had `elif fi == 9: continue` — a leftover from the
|
||||
9-frame original blast capture where frame 9 was assumed to be a terminator. Removed.
|
||||
TERM detection in the file builder uses `frame.page_key != 0x0010` (sample marker),
|
||||
not frame index — see `blastware_file.py`.
|
||||
9-frame original blast capture where frame 9 was assumed to be a terminator. For current
|
||||
35-frame streams, fi==9 is live waveform data (~133 sample-sets were being dropped).
|
||||
Removed. Terminator detection is via `page_key == 0x0000` in `read_bulk_waveform_stream`,
|
||||
not frame index.
|
||||
|
||||
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
|
||||
|
||||
@@ -710,55 +303,6 @@ sends token=0xFE and is NOT used by any caller.
|
||||
`advance_event()` returns `(key4, event_data8)`.
|
||||
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
|
||||
|
||||
### SUB 0A — WAVEHDR response length distinguishes events from boundaries (NEW 2026-05-01)
|
||||
|
||||
When iterating events with the "Download All" pattern (1E → 0A → 1F → 0A → 1F → …), the
|
||||
DATA_LENGTH at `data_rsp.data[5]` (= the byte BW echoes back as the offset for the data
|
||||
fetch step) takes one of two values:
|
||||
|
||||
| WAVEHDR offset | Meaning |
|
||||
|---|---|
|
||||
| `0x46` (= 70) | Real event start key — there is event data at this address |
|
||||
| `0x2C` (= 44) | Boundary marker between events — this key is the END of the previous event AND the START key for the empty space after it (or is the next event's pre-header) |
|
||||
|
||||
Confirmed from the 5-1-26 "Download All" capture:
|
||||
|
||||
```
|
||||
0A(key=01110000) → off=0x46 ← event 1 real start
|
||||
1F → key=011121F2
|
||||
0A(key=011121F2) → off=0x2C ← event 1 END / event 2 boundary
|
||||
1F → key=01112238
|
||||
0A(key=01112238) → off=0x46 ← event 2 real start (= boundary + 0x46)
|
||||
1F → key=0111417E
|
||||
0A(key=0111417E) → off=0x2C ← event 2 END / next-empty marker
|
||||
1F → null sentinel
|
||||
```
|
||||
|
||||
This is why event 2's first 5A chunk is at boundary key + 0x46 (`011121F2` + 0x46 = `01112238`): that's the address of the
|
||||
"real start" 0x46-record, distinct from the `0x2C`-record at the raw boundary. Use the
|
||||
`0x46` keys as the input to `read_bulk_waveform_stream`, not the `0x2C` keys.
|
||||
|
||||
For event 1 only (start_key[2:4] = 0x0000) BW probes at counter=0x0000 directly, which is
|
||||
the `0x46`-keyed start record. Subsequent events use the `0x46` key that 1F returns (boundary key + 0x46).
|
||||
|
||||
**Practical iteration pattern (replaces the old 1E/1F walk for downloads):**
|
||||
|
||||
```
|
||||
Setup: SERIAL × 2 → CHCFG → 1E (token=0x00) → key0
|
||||
For each event:
|
||||
0A(cur_key) → DATA_LENGTH = 0x46 (real) or 0x2C (boundary)
|
||||
1F (token=0x00) → next_key
|
||||
if length was 0x46: → cur_key is a real event; queue it for download
|
||||
cur_key = next_key
|
||||
if next_key all-zero null sentinel: stop
|
||||
|
||||
Then for each queued real-event key:
|
||||
download_event(key) → 5A bulk stream with STRT-bounded chunk walk
|
||||
```
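The iteration pattern above could look roughly like this in Python. It is a sketch only: `read_0a_length` and `advance_1f` are hypothetical callables standing in for the real 0A and 1F exchanges, and the null sentinel is simplified to an all-zero 4-byte key.

```python
NULL_KEY = b"\x00\x00\x00\x00"
REAL_EVENT = 0x46   # WAVEHDR length for a real event start key
BOUNDARY = 0x2C     # WAVEHDR length for a boundary marker (skipped)


def collect_event_keys(first_key, read_0a_length, advance_1f):
    """Walk the event chain and return only the real-event (0x46) keys."""
    queued = []
    cur_key = first_key
    while cur_key != NULL_KEY:
        length = read_0a_length(cur_key)   # 0A(cur_key) -> 0x46 or 0x2C
        next_key = advance_1f()            # 1F (token=0x00) -> next key
        if length == REAL_EVENT:
            queued.append(cur_key)         # download later via 5A
        cur_key = next_key
    return queued
```

Replaying the key chain from the 5-1-26 "Download All" capture through this loop queues `01110000` and `01112238` and skips the two `0x2C` boundary keys.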
|
||||
|
||||
This is what BW does in the 5-1-26 "Download All" capture — it walks the full event chain
|
||||
collecting `(key, length)` tuples first, *then* downloads each event using the `0x46` keys.
|
||||
|
||||
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
|
||||
|
||||
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
|
||||
@@ -803,6 +347,36 @@ Do NOT use fixed absolute offsets for sample_rate or record_time.
|
||||
Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to
|
||||
`S3FrameParser`.
|
||||
|
||||
**SUB 5A (bulk waveform) TCP frame splitting — confirmed 2026-04-27:**
|
||||
|
||||
Over TCP via cellular modem, each 5A chunk request that produces a single ~1100-byte
|
||||
A5 response over direct RS-232 may arrive as **two separate, complete S3 frames** of
|
||||
~550 bytes each ("2-frame mode"). The modem's Data Forwarding Timeout (~100-150 ms)
|
||||
can split the RS-232 response into two TCP segments, each parsed as a complete S3 frame.
|
||||
Under different modem/timing conditions the full ~1100-byte response arrives as **one
|
||||
S3 frame** ("1-frame mode").
|
||||
|
||||
**Both modes require `extra_chunks_after_metadata=1`** (the extra chunk at metadata_counter
|
||||
+ 0x0400). The device's waveform footer data lives at circular-buffer address 0x1C00 for
|
||||
this event; the terminator frame must be sent at 0x1C00 (not 0x1800) to receive it.
|
||||
|
||||
Example for a 2-second Continuous event (BE11529, key=01110000) via TCP:
|
||||
- **2-frame mode:** 1 probe frame (554 B) + 5 chunks × 2 frames (556-573 B) + 1 extra chunk × 2 frames + 1 terminator (208 B) = **14 A5 frames** → 6864-byte file
|
||||
- **1-frame mode:** 1 probe frame (~1097 B) + 5 chunks × 1 frame (~1079-1113 B) + 1 extra chunk × 1 frame (smaller, tail of event) + 1 terminator → **8 A5 frames** → 6864-byte file
|
||||
- All frames contribute body data; using all of them gives the correct file.
|
||||
|
||||
**Fix (confirmed 2026-04-27):** `_recv_5a_batch()` in `protocol.py` collects ALL
|
||||
A5 frames per chunk request before the next request is sent, using a 0.5 s batch
|
||||
timeout after the first frame to catch the ~150 ms delayed second frame. `write_blastware_file()`
|
||||
includes ALL body frames without skipping — the extra chunk's frames are part of the
|
||||
body data, NOT padding to be discarded.
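The batching idea can be sketched as below. This is not the actual `_recv_5a_batch()` from `protocol.py`; `recv_one` is a hypothetical callable that returns one parsed frame or raises `TimeoutError`, and the 10 s first-frame timeout is an assumption borrowed from the chunk recv timeout noted elsewhere in this document.

```python
def recv_5a_batch(recv_one, first_timeout=10.0, batch_timeout=0.5):
    """Collect every A5 frame produced by one chunk request.

    The first frame uses the long timeout; a timeout there propagates to
    the caller. After that, keep reading with the short batch timeout so
    a delayed second frame (2-frame mode) is still picked up.
    """
    frames = [recv_one(first_timeout)]
    while True:
        try:
            frames.append(recv_one(batch_timeout))
        except TimeoutError:
            # Quiet for batch_timeout seconds: the batch is complete.
            return frames
```

In 1-frame mode this returns a single-element list after one short wait; in 2-frame mode it returns both halves in arrival order.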
|
||||
|
||||
**WRONG earlier hypothesis (do not re-introduce):** An attempt was made to auto-detect
|
||||
1-frame vs 2-frame mode from the probe frame size and skip the extra chunk when
|
||||
`probe_data_len >= 700`. This was wrong — the extra chunk is always needed to advance
|
||||
the device's internal state to the footer address. The `_probe_is_large` branch was
|
||||
removed 2026-04-27.
|
||||
|
||||
### Required ACEmanager settings (Sierra Wireless RV50/RV55)
|
||||
|
||||
| Setting | Value | Why |
|
||||
@@ -983,8 +557,6 @@ All DB endpoints are read-only except `PATCH /db/events/{id}/false_trigger`.
|
||||
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
|
||||
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
|
||||
| 4-3-26 | `bridges/captures/4-3-26/` | Browse-mode S3 capture with 2+ events — confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |
|
||||
| 4-27-26 | `bridges/captures/4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof that SFM was over-reading 5× past event end. BW reads 14 chunks at 0x0200 increments + TERM at end_offset; SFM was reading 37 chunks at 0x0400 increments. STRT end_key field located. |
|
||||
| 5-1-26 | `bridges/captures/5-1-26/comcheck/` | Three sub-captures: SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945`/`_171216`). Confirmed: TERM frame formula across 3 events; metadata pages 0x1002/0x1004 are global (read once per session); event-1 vs event-N chunk-pattern split; WAVEHDR length 0x46 vs 0x2C disambiguates real events from boundaries. |
|
||||
|
||||
---
|
||||
|
||||
@@ -1248,7 +820,7 @@ offsets in the raw 1A/E5 payload. Only fields with `✅` have confirmed offsets
|
||||
|
||||
**Notes tab:**
|
||||
- Enable User Notes (bool)
|
||||
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from 5A metadata pages at counter 0x1002 / 0x1004 — see "SUB 5A — fixed metadata pages" section)
|
||||
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from A5 frame 7 via 5A)
|
||||
- Enable Extended Notes (bool); Extended Notes text; Extended Notes Title
|
||||
- Enable Job Number (bool); Job Number (int)
|
||||
- Enable Scaled Distance (bool); Distance from Blast (float); Charge Weight (float) — Scaled Distance is derived
|
||||
@@ -1560,11 +1132,9 @@ body) because writing a dial string may require DLE escaping for embedded contro
|
||||
|
||||
## What's next
|
||||
|
||||
**See [README.md → Roadmap (Future)](README.md#roadmap-future) for the canonical deferred-work list.** This section is kept as a status log of in-progress / recently-shipped technical details (encoding schemes, byte layouts, etc.) that are too low-level for the README's roadmap.
|
||||
|
||||
- **Database** — SQLite store for events + monitor log entries; dedup by key; queryable
|
||||
- **Histograms** — decode histogram-mode A5 data (noise floor tracking)
|
||||
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed BYTE-PERFECT against BW reference (v0.14.3, 2026-05-05):** when fed the BW 5-1-26 3-sec capture's A5 frames, the SFM-built file matches BW's saved `M529LKIQ.G10` byte-for-byte (8708 bytes, 0 differences). Live SFM downloads of event 0 (3-sec) and event 1 (3-sec continuation) both open cleanly in Blastware with full Event Reports, frequency analysis, and waveform plots. Body assembly is just contiguous concatenation of frame contributions in stream order (probe → meta@0x1002 → meta@0x1004 → samples → TERM); no stripping, no overlay, no special handling. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that may need different handling — untested under v0.14.x). Extension mapping: extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
|
||||
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed working for Continuous mode events (2026-04-23):** SFM-generated file opens in Blastware, shows correct PPV/waveform/timestamp. File is ~200 bytes shorter than BW (missing last ADC tail slice) — all measurements correct. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that create spurious STRT markers in the body). Extension mapping: **CONFIRMED FALSE 2026-04-21** — extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
|
||||
|
||||
**Serial encoding (CONFIRMED 2026-04-22):** `prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))`, `serial3 = f"{serial_numeric % 1000:03d}"`. Examples: BE6907→H907, BE11529→M529, BE14036→P036, BE17353→S353, BE18003→T003. The prefix letter encodes the production generation (batch of 1000 units).
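The serial encoding rule can be written out as a small Python sketch. The function name `serial_prefix` is hypothetical and covers only the `<prefix_letter><serial3>` part of the filename, not the base36 stem or extension.

```python
def serial_prefix(serial: str) -> str:
    """Map a unit serial such as 'BE11529' to its filename prefix 'M529'."""
    digits = serial[2:] if serial.upper().startswith("BE") else serial
    serial_numeric = int(digits)
    # One letter per batch of 1000 units, starting at 'B' for 0-999.
    prefix_letter = chr(ord("B") + serial_numeric // 1000)
    serial3 = f"{serial_numeric % 1000:03d}"
    return prefix_letter + serial3
```

This reproduces the confirmed examples, e.g. `serial_prefix("BE6907")` returns `"H907"` and `serial_prefix("BE18003")` returns `"T003"`.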
|
||||
|
||||
@@ -1600,21 +1170,16 @@ body) because writing a dial string may require DLE escaping for embedded contro
|
||||
|
||||
| Folder / File | Contents |
|
||||
|---|---|
|
||||
| `1-2-26/` | First SUB 5A BW TX capture — established 5A frame format (raw offset_hi, DLE-aware checksum). 10 frames verified. |
|
||||
| `3-11-26/raw_bw_20260311_170151.bin` | Full compliance write + event download (SUBs 68→83 confirmed, frames 102–112) |
|
||||
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) — 1E/0A/0C/1F sequence confirmed (single event so token=0xFE appeared to work in either branch) |
|
||||
| `4-2-26/` | Download-mode BW TX capture — POLL×3 requirement confirmed (frames 68-73 between 1F and first 5A) |
|
||||
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events — all-zero params for 1F, null sentinel layout, 0A context requirement |
|
||||
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
|
||||
| `4-11-26 (mitm/ach_mitm_20260411_001912/)` | Full ACH call-home MITM — erase protocol (0xA3/0x06/0xA2), monitor log partial records confirmed |
|
||||
| `4-20-26/raw_bw_*_recording_mode_*.bin` | Recording mode changes: Continuous→Single Shot, →Histogram, →Histogram+Continuous |
|
||||
| `4-20-26/histogram interval/` | Histogram interval changes: 1min, 5min, 15min, 15sec |
|
||||
| `4-20-26/geo sensitivity/` | Geo sensitivity changes: 1.25 in/s (Sensitive), 10 in/s (Normal) |
|
||||
| `4-20-26/call home settings/` | Call home config read/write captures |
|
||||
| `4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof of 5× SFM over-read. STRT end_key field located. |
|
||||
| **`5-1-26/comcheck/`** | **Triplet of captures that nailed the v0.14.0 walk:** SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945` / `_171216`). Confirmed: TERM frame formula across 3 events, metadata pages 0x1002/0x1004 are global session metadata, event-1 vs event-N chunk pattern split, WAVEHDR off=0x46 vs 0x2C disambiguates real events from boundaries. |
|
||||
| **`5-1-26/comcheck/bwcap3sec/`** | **The byte-perfect reference for v0.14.3.** All 17 BW 5A request frames (probe, 2 metadata, 13 samples, TERM) reproduce byte-for-byte from SFM's framing helpers — including the `10 10 00` DLE-stuffed counter for sample @ 0x1000 that was the long-standing failure mode. |
|
||||
| `5-4-26/` | BW MITM captures of "copy 3sec / 2sec / Download All" + paired SFM session (`seismo_dl_20260504_145701`) showing the +0x46 event-N probe bug producing 110-chunk runaway walk. Cross-references against 5-1-26 confirmed device behavior is identical. |
|
||||
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
|
||||
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events (1E/0A/1F iteration confirmed) |
|
||||
| `4-2-26/` | Download-mode BW TX capture (5A bulk stream, POLL×3 requirement confirmed) |
|
||||
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) |
|
||||
| `mitm/ach_mitm_20260411_001912/` | Full ACH call-home MITM (erase protocol, 0xA3/0x06/0xA2 confirmed) |
|
||||
|
||||
To parse BW TX captures: use `bridges/captures/` scripts or adapt the `find_write_frames()` pattern
|
||||
in `/tmp/analyze_write_payload.py` — it correctly handles `0x10 0x03` DLE-escaped ETX bytes
|
||||
|
||||
-20
@@ -1,20 +0,0 @@
|
||||
FROM python:3.11-slim
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
RUN apt-get update && \
|
||||
apt-get install -y --no-install-recommends curl && \
|
||||
rm -rf /var/lib/apt/lists/*
|
||||
|
||||
COPY pyproject.toml requirements.txt ./
|
||||
COPY minimateplus ./minimateplus
|
||||
COPY micromate ./micromate
|
||||
COPY sfm ./sfm
|
||||
COPY bridges ./bridges
|
||||
COPY scripts ./scripts
|
||||
|
||||
RUN pip install --no-cache-dir -e .
|
||||
|
||||
EXPOSE 8200
|
||||
|
||||
CMD ["python", "-m", "uvicorn", "sfm.server:app", "--host", "0.0.0.0", "--port", "8200"]
|
||||
@@ -1,11 +1,7 @@
|
||||
# seismo-relay `v0.19.0`
|
||||
# seismo-relay `v0.12.1`
|
||||
|
||||
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
|
||||
software for managing seismographs. Supports both the **MiniMate Plus
|
||||
(Series III)** and the **Micromate (Series IV / "Thor")** families:
|
||||
Series III via the live RS-232 / TCP wire protocol *and* Blastware ACH file
|
||||
ingest; Series IV currently via Thor TXT-paired IDF file ingest, with the
|
||||
binary codec on the roadmap.
|
||||
software for managing MiniMate Plus seismographs.
|
||||
|
||||
Built in Python. Runs on Windows, Linux, or macOS. Connects to instruments
|
||||
over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||
@@ -14,27 +10,6 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||
> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
|
||||
> handles inbound unit connections, downloads events, and persists everything
|
||||
> to a SQLite database. SFM REST API exposes device control and DB queries.
|
||||
> **As of v0.14.3 (2026-05-05): SUB 5A bulk waveform protocol is verified
|
||||
> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
|
||||
> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
|
||||
> full Event Reports, frequency analysis, and waveform plots.
|
||||
> **v0.16.0 (2026-05-11)** adds BW ASCII report ingestion to
|
||||
> `/db/import/blastware_file` — paired with **series3-watcher v1.5.0**,
|
||||
> every Blastware ACH event lands in SeismoDb with device-authoritative
|
||||
> peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data,
|
||||
> without depending on the still-undecoded waveform body codec.
|
||||
> **v0.18.0 (2026-05-19)** adds Thor / Micromate Series IV ingest at
|
||||
> `/db/import/idf_file` — paired with **thor-watcher v0.3.0**, every
|
||||
> `.IDFH` / `.IDFW` event file (plus its `.txt` sidecar) lands in
|
||||
> SeismoDb the same way BW events do. See
|
||||
> [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for
|
||||
> the IDF format reference and reverse-engineering plan.
|
||||
> **v0.19.0 (2026-05-20)** separates Series III and Series IV at the
|
||||
> code level: new `micromate/` package alongside `minimateplus/`, new
|
||||
> `events.device_family` DB column ("series3" / "series4") so the UI
|
||||
> and storage layer dispatch deterministically instead of sniffing
|
||||
> filenames. Self-applying migration backfills existing rows from the
|
||||
> binary filename extension.
|
||||
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
|
||||
|
||||
---
|
||||
@@ -43,35 +18,26 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||
|
||||
```
|
||||
seismo-relay/
|
||||
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Download + Console tabs)
|
||||
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Console tabs)
|
||||
│
|
||||
├── minimateplus/ ← Series III (MiniMate Plus) client library
|
||||
├── minimateplus/ ← MiniMate Plus client library
|
||||
│ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport
|
||||
│ ├── protocol.py ← DLE frame layer, SUB command dispatch
|
||||
│ ├── client.py ← High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …)
|
||||
│ ├── client.py ← High-level client (connect, get_events, push_config, …)
|
||||
│ ├── framing.py ← Frame builders, DLE codec, S3FrameParser
|
||||
│ ├── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig, …
|
||||
│ ├── bw_ascii_report.py ← Parse BW per-event ASCII reports (.TXT sidecars)
|
||||
│ ├── event_file_io.py ← Read BW binaries, write .sfm.json sidecars
|
||||
│ └── blastware_file.py ← Write events to Blastware-compatible .AB0 files
|
||||
│
|
||||
├── micromate/ ← Series IV (Micromate / Thor) client library (NEW v0.19)
|
||||
│ ├── models.py ← IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
|
||||
│ ├── idf_ascii_report.py ← Parse Thor .IDFW.txt / .IDFH.txt event sidecars
|
||||
│ └── idf_file.py ← Stub for the .IDFW / .IDFH binary codec (reverse-engineering pending)
|
||||
│ └── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, …
|
||||
│
|
||||
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
|
||||
│ ├── server.py ← Live device endpoints + DB query + ingest endpoints + caching
|
||||
│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions)
|
||||
│ ├── waveform_store.py ← On-disk store for BW + IDF event binaries + .sfm.json sidecars
|
||||
│ └── sfm_webapp.html ← Embedded web UI with Call Home config tab
|
||||
│ ├── server.py ← All device + DB endpoints
|
||||
│ ├── database.py ← SeismoDb — SQLite persistence layer
|
||||
│ └── sfm_webapp.html ← Embedded web UI (served at /)
|
||||
│
|
||||
├── bridges/
|
||||
│ ├── ach_server.py ← Inbound ACH call-home server (main production server)
|
||||
│ ├── ach_mitm.py ← Transparent MITM proxy for capturing BW sessions
|
||||
│ ├── s3-bridge/ ← RS-232 serial bridge (capture tool)
|
||||
│ ├── tcp_serial_bridge.py ← Local TCP↔serial bridge (bench testing)
|
||||
│ ├── gui_bridge.py ← Standalone bridge GUI with raw capture checkboxes
|
||||
│ ├── gui_bridge.py ← Standalone bridge GUI
|
||||
│ └── raw_capture.py ← Simple raw capture tool
|
||||
│
|
||||
├── parsers/
|
||||
@@ -80,8 +46,7 @@ seismo-relay/
|
||||
│ └── frame_db.py ← SQLite frame database
|
||||
│
|
||||
└── docs/
|
||||
├── instantel_protocol_reference.md ← Series III protocol spec (the Rosetta Stone)
|
||||
└── idf_protocol_reference.md ← Series IV (Thor IDF) format reference + codec RE plan
|
||||
└── instantel_protocol_reference.md ← Reverse-engineered protocol spec
|
||||
```
|
||||
|
||||
---
|
||||
@@ -136,28 +101,21 @@ python seismo_lab.py
|
||||
Each call dials the device, does its work, and closes the connection. TCP
|
||||
connections are retried once on `ProtocolError` to handle cold-boot timing.
|
||||
|
||||
**In-memory caching** — frequently-polled endpoints avoid redundant TCP round-trips
|
||||
via a thread-safe `_LiveCache` (plain Python dict + `threading.Lock`):
|
||||
**Caching** — frequently-polled endpoints are cached in-process to avoid
|
||||
redundant TCP round-trips:
|
||||
|
||||
| Method | URL | Cache Strategy |
|
||||
|--------|-----|---|
|
||||
| Method | URL | Cache |
|
||||
|--------|-----|-------|
|
||||
| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
|
||||
| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
|
||||
| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
|
||||
| `GET` | `/device/monitor/status` | 30-second TTL; invalidated by monitor start/stop |
|
||||
| `GET` | `/device/call_home` | Fresh read from device (not cached) |
|
||||
| `GET` | `/device/monitor/status` | 30-second TTL |
|
||||
| `POST` | `/device/connect` | — |
|
||||
| `POST` | `/device/config` | Writes compliance config; invalidates info + events cache |
|
||||
| `POST` | `/device/config/project` | Patches project/client/operator/sensor_location strings |
|
||||
| `POST` | `/device/monitor/start` | Sends SUB 0x96; immediately evicts status cache |
|
||||
| `POST` | `/device/monitor/stop` | Sends SUB 0x97; immediately evicts status cache |
|
||||
| `POST` | `/device/call_home` | Reads, patches specified fields, writes back to device |
|
||||
| `POST` | `/device/config` | Writes compliance config; invalidates cache |
|
||||
| `POST` | `/device/monitor/start` | Sends SUB 0x96 |
|
||||
| `POST` | `/device/monitor/stop` | Sends SUB 0x97 |
|
||||
|
||||
**Cache bypass**: all cached endpoints accept `?force=true` to skip the cache and
force a fresh read from the device.

**Cache stats**: `GET /cache/stats` returns hit/miss counts and TTL info; `DELETE /cache/device`
clears the device cache immediately.
|
||||
All cached endpoints accept `?force=true` to bypass the cache.
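
The caching behaviour described above (a plain dict guarded by a lock, an optional TTL, and a `force` bypass) can be sketched roughly as follows. This is only an illustration: `TtlCache`, `get_or_load`, and `invalidate` are hypothetical names, not the actual `_LiveCache` API in `sfm/server.py`.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Hypothetical thread-safe cache; ttl=None means an entry never expires."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(
        self,
        key: str,
        loader: Callable[[], Any],
        ttl: Optional[float] = None,
        force: bool = False,
    ) -> Any:
        now = time.monotonic()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and not force:
                stored_at, value = entry
                if ttl is None or now - stored_at < ttl:
                    self.hits += 1
                    return value
            self.misses += 1
        # The slow device read runs outside the lock so other keys stay responsive.
        value = loader()
        with self._lock:
            self._entries[key] = (time.monotonic(), value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop one entry, e.g. after a config write or monitor start/stop."""
        with self._lock:
            self._entries.pop(key, None)
```

In this sketch a status endpoint would call `get_or_load("status", read_status, ttl=30, force=force)`, and a write endpoint would call `invalidate(...)` after it succeeds.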
|
||||
|
||||
Transport query params (supply one set):
|
||||
```
|
||||
@@ -173,23 +131,11 @@ Query the SQLite database written by `ach_server.py`. All read-only except
|
||||
| Method | URL | Description |
|
||||
|--------|-----|-------------|
|
||||
| `GET` | `/db/units` | All known serials with summary stats |
|
||||
| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger). Response rows include `device_family` ("series3" / "series4") so clients dispatch on unit type without sniffing filenames. |
|
||||
| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger) |
|
||||
| `GET` | `/db/monitor_log` | Monitoring intervals |
|
||||
| `GET` | `/db/sessions` | ACH call-home session history |
|
||||
| `PATCH` | `/db/events/{id}/false_trigger?value=true` | Flag / unflag false triggers |
|
||||
|
||||
### File ingest endpoints
|
||||
|
||||
Used by watcher daemons to push field-collected event files into the SFM DB
and the waveform store. Both accept multipart uploads of binary event files
optionally paired with their ASCII sidecar reports; both dedup by
`(serial, timestamp)` and UPSERT device-authoritative fields on re-import.
|
||||
|
||||
| Method | URL | Description |
|
||||
|--------|-----|-------------|
|
||||
| `POST` | `/db/import/blastware_file` | Series III: `.AB0*` / `.N00` binaries + paired `_ASCII.TXT`. Source: `series3-watcher`. |
|
||||
| `POST` | `/db/import/idf_file` | Series IV: `.IDFH` / `.IDFW` binaries + paired `.IDFW.txt` / `.IDFH.txt`. Source: `thor-watcher`. |
|
||||
|
||||
---
|
||||
|
||||
## minimateplus library
|
||||
@@ -206,33 +152,21 @@ client = MiniMateClient(transport=TcpTransport("1.2.3.4", 12345), timeout=30.0)
|
||||
|
||||
with client:
|
||||
# Read
|
||||
info = client.connect() # DeviceInfo — serial, firmware, compliance config
|
||||
count = client.count_events() # Number of stored events
|
||||
keys = client.list_event_keys() # Fast browse walk — event keys only, no download
|
||||
events = client.get_events() # Full download: headers + peaks + metadata
|
||||
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
|
||||
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
|
||||
ach_cfg = client.get_call_home_config() # Auto Call Home settings (SUB 0x2C)
|
||||
info = client.connect() # DeviceInfo — serial, firmware, compliance config
|
||||
count = client.count_events() # Number of stored events
|
||||
keys = client.list_event_keys() # Fast browse walk — event keys only, no download
|
||||
events = client.get_events() # Full download: headers + peaks + metadata
|
||||
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
|
||||
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
|
||||
|
||||
# Write
|
||||
client.apply_config(
|
||||
sample_rate=1024,
|
||||
recording_mode="Continuous", # Single Shot / Continuous / Histogram / Histogram+Continuous
|
||||
histogram_interval_sec=15, # 2, 5, 15, 60, 300, 900
|
||||
trigger_level_geo=0.5,
|
||||
geo_range="Normal", # Normal (10.000 in/s) / Sensitive (1.25 in/s)
|
||||
project="Bridge Inspection 2026",
|
||||
client_name="City of Portland",
|
||||
operator="B. Harrison",
|
||||
)
|
||||
|
||||
client.set_call_home_config(
|
||||
auto_call_home_enabled=True,
|
||||
after_event_recorded=True,
|
||||
at_specified_times=True,
|
||||
time1_hour=18, time1_min=30, # 6:30 PM
|
||||
time2_hour=6, time2_min=0, # 6:00 AM
|
||||
)
|
||||
|
||||
# Control
|
||||
client.start_monitoring() # SUB 0x96
|
||||
@@ -240,88 +174,26 @@ with client:
|
||||
client.delete_all_events() # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
|
||||
```
|
||||
|
||||
`get_events()` runs the full per-event sequence:
|
||||
`1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`.
|
||||
SUB 5A bulk stream walks chunks bounded by the `end_offset` extracted from
|
||||
the STRT record at byte 17 of the probe response — no over-reading, no
|
||||
chunk-count cap. Project / client / operator / sensor location strings come
|
||||
from the dedicated metadata pages at counter `0x1002` and `0x1004`,
|
||||
read once per session (they reflect the compliance setup at session start,
|
||||
not per individual event).
|
||||
|
||||
---
|
||||
|
||||
## micromate library
|
||||
|
||||
Series IV / Thor support, sibling to `minimateplus`. Currently scoped to
|
||||
offline-file ingest from Thor's TXT exporter; live-device protocol is
|
||||
deferred until the binary codec is cracked.
|
||||
|
||||
```python
|
||||
from micromate import IdfEvent, parse_idf_report
|
||||
|
||||
# Parse a .IDFW.txt / .IDFH.txt sidecar (1014 example files round-trip cleanly)
|
||||
text = open("UM11719_20231219162723.IDFW.txt").read()
|
||||
report_dict = parse_idf_report(text) # permissive dict
|
||||
|
||||
# Wrap into a typed event using the device-native binary filename
|
||||
event = IdfEvent.from_report(report_dict, "UM11719_20231219162723.IDFW")
|
||||
|
||||
event.serial # "UM11719"
|
||||
event.kind # "Waveform" or "Histogram"
|
||||
event.peaks.transverse_ips # 0.0251 (in/s, native unit)
|
||||
event.peaks.mic_pspl_dbl # 99.4 (dB(L), Thor's native mic unit — NOT psi)
|
||||
event.project_info.project # "UPMC Presby-Loc 3-Level1-1R Elevator Rm"
|
||||
event.sensor_check.tran # True (passed self-check)
|
||||
event.firmware_version # "Micromate ISEE 11.0AK"
|
||||
event.calibration_text # "November 22, 2023 by Instantel"
|
||||
|
||||
# Bridge to the existing minimateplus.Event shape for the DB / sidecar paths
|
||||
# (waveform_key is a 16-byte sha256 prefix when ingesting from a binary file)
|
||||
bridged_event = event.to_minimateplus_event(waveform_key=b"\x00" * 16)
|
||||
```
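
The comment above mentions that `waveform_key` is a 16-byte sha256 prefix when ingesting from a binary file. A minimal sketch of deriving such a key, assuming the hash is taken over the full file contents (the README does not say exactly what is hashed, and `waveform_key_from_bytes` is a hypothetical helper name):

```python
import hashlib


def waveform_key_from_bytes(data: bytes) -> bytes:
    """Return the first 16 bytes of the SHA-256 digest of `data`."""
    return hashlib.sha256(data).digest()[:16]
```

The result is deterministic, so re-ingesting the same bytes yields the same key.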
|
||||
|
||||
The binary codec (`.IDFW` / `.IDFH` event files themselves) is on the
roadmap; see [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md)
for everything known so far, the two observed file signatures, and the
reverse-engineering plan. The `micromate/idf_file.py` stub is where
`read_idf_file()` will land.
|
||||
`get_events()` runs the full per-event sequence: `1E → 0A → 0C → 5A → 1F`.
|
||||
SUB 5A bulk stream provides `client`, `operator`, and `sensor_location` as they
|
||||
existed at record time — not backfilled from the current compliance config.
|
||||
|
||||
---
|
||||
|
||||
## Database
|
||||
|
||||
`ach_server.py` and the file-ingest endpoints write to
|
||||
`bridges/captures/seismo_relay.db` (SQLite, WAL mode) via the `SeismoDb`
|
||||
persistence layer. Three tables, all unit-keyed by serial number:
|
||||
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode).
|
||||
Three tables, all unit-keyed by serial number:
|
||||
|
||||
| Table | Key | Contents |
|
||||
|-------|-----|----------|
|
||||
| `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds |
|
||||
| `events` | UUID, UNIQUE(serial, timestamp) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag, **`device_family`** ("series3" / "series4"), `blastware_filename` (binary at-rest in `waveforms/`), sidecar references |
|
||||
| `monitor_log` | UUID, UNIQUE(serial, start_time) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips |
|
||||
| `ach_sessions` | UUID | Per-call-home audit record: serial, peer IP, events_downloaded, duration |
|
||||
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, PPV per channel, project/client/operator strings, false_trigger flag |
|
||||
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: start/stop time, duration, geo threshold |
|
||||
|
||||
**Deduplication is by `(serial, timestamp)`**: the device clock is the
stable natural key. Repeat call-homes or re-runs UPSERT the row in place,
refreshing every device-authoritative field (peaks, project strings,
sample_rate, file references) so the latest writer wins. `false_trigger`
and `device_family` are preserved across UPSERTs. Earlier versions used
`(serial, waveform_key)` for dedup, but the device's event-key counter
resets to `0x01110000` after every erase, so timestamps are the correct
dedup field. Migration handles the transition transparently on first
startup.
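
The UPSERT behaviour described above can be illustrated with a simplified table. This is a sketch only: the schema is cut down to a few columns, the integer primary key stands in for the real UUID key, and the serial and values are made up. It relies on SQLite's `ON CONFLICT ... DO UPDATE` clause (SQLite 3.24 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        serial TEXT NOT NULL,
        timestamp TEXT,
        peak_vector_sum REAL,
        false_trigger INTEGER NOT NULL DEFAULT 0,
        UNIQUE (serial, timestamp)
    )
    """
)

# Device-authoritative fields are overwritten on conflict; false_trigger is
# deliberately left out of the SET list so operator review state survives.
UPSERT = """
INSERT INTO events (serial, timestamp, peak_vector_sum)
VALUES (?, ?, ?)
ON CONFLICT (serial, timestamp) DO UPDATE SET
    peak_vector_sum = excluded.peak_vector_sum
"""

conn.execute(UPSERT, ("EXAMPLE01", "2026-05-05T10:00:00", 0.12))
conn.execute("UPDATE events SET false_trigger = 1 WHERE serial = 'EXAMPLE01'")
conn.execute(UPSERT, ("EXAMPLE01", "2026-05-05T10:00:00", 0.15))  # re-import

rows = conn.execute(
    "SELECT peak_vector_sum, false_trigger FROM events"
).fetchall()
```

After the second import there is still a single row, holding the refreshed peak and the preserved flag.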
|
||||
|
||||
**`device_family` (added v0.19.0)** discriminates Series III from Series
IV at the SQL level. Set by every import path; the UI dispatches on it
to render mic units correctly (Series III: psi → dB(L) conversion; Series
IV: native dB(L) passthrough). Existing rows are backfilled at first
startup of v0.19.0+ by sniffing the binary filename extension.
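
For reference, a psi to dB(L) conversion can be sketched with the standard sound-pressure-level definition, 20·log10(p / 20 µPa). This assumes the UI applies the textbook formula; the project's actual conversion code is not shown here, and `psi_to_dbl` is a hypothetical name.

```python
import math

PA_PER_PSI = 6894.757293168  # pascals per psi
P_REF_PA = 20e-6             # 20 micropascal reference pressure


def psi_to_dbl(pressure_psi: float) -> float:
    """Convert a peak pressure in psi to linear-weighted decibels, dB(L)."""
    if pressure_psi <= 0:
        raise ValueError("pressure must be positive to take a logarithm")
    return 20.0 * math.log10(pressure_psi * PA_PER_PSI / P_REF_PA)
```

A Series IV row would skip this step entirely, since its mic value is already stored in dB(L).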
|
||||
|
||||
The on-disk waveform store lives at `bridges/captures/waveforms/<serial>/`
and holds the original event binaries (BW `.AB0*` / `.N00` for Series III,
`.IDFH` / `.IDFW` for Series IV) plus their `.sfm.json` review/metadata
sidecars. Series III events also produce `.a5.pkl` source-frame pickles
and `.h5` clean-waveform exports; Series IV events do not yet, pending the codec.
|
||||
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs
|
||||
never produce duplicate rows. Post-erase key reuse is handled automatically
|
||||
via the high-water mark in `ach_state.json`.
|
||||
|
||||
---
|
||||
|
||||
@@ -359,27 +231,6 @@ Full protocol documentation: [`docs/instantel_protocol_reference.md`](docs/insta
|
||||
|
||||
---
|
||||
|
||||
## Compliance Config Features
|
||||
|
||||
The REST API and web UI expose full control over device compliance settings:
|
||||
|
||||
- **Recording Mode** (Single Shot / Continuous / Histogram / Histogram+Continuous)
|
||||
- **Sample Rate** (1024 / 2048 / 4096 sps)
|
||||
- **Record Time** (float, seconds)
|
||||
- **Histogram Interval** (2s, 5s, 15s, 1m, 5m, 15m) — when recording mode includes histogram
|
||||
- **Geo Trigger Levels** (float, in/s per channel)
|
||||
- **Geo Maximum Range** (Normal 10.000 in/s / Sensitive 1.250 in/s per channel)
|
||||
- **Project / Client / Operator / Sensor Location** (ASCII strings)
|
||||
|
||||
Auto Call Home config:
|
||||
- **Auto Call Home Enable** (bool)
|
||||
- **Dial String** (read-only; 40-byte ASCII)
|
||||
- **Trigger on Event** (bool)
|
||||
- **Scheduled Call-Ins** (two time slots with HH:MM each)
|
||||
- **Retry Settings** (count, delay, connection timeout, warm-up time)
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
```bash
|
||||
@@ -401,169 +252,17 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
|
||||
|
||||
---
|
||||
|
||||
## Key Features
|
||||
## Roadmap
|
||||
|
||||
**Series III (MiniMate Plus) device support:**
|
||||
- [x] Full read/write/erase pipelines over RS-232 or TCP/cellular
|
||||
- [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings)
|
||||
- [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries)
|
||||
- [x] Monitor control (start/stop, status polling, battery/memory)
|
||||
- [x] Monitor log entries (continuous monitoring intervals without full waveform download)
|
||||
- [x] Blastware file ingest at `/db/import/blastware_file` (paired with `series3-watcher`)
|
||||
|
||||
**Series IV (Micromate / Thor) device support:**
|
||||
- [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
|
||||
- [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
|
||||
- [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
|
||||
- [ ] Binary `.IDFW` / `.IDFH` codec — pending (see Roadmap + [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md))
|
||||
- [ ] Live-device protocol — pending codec
|
||||
|
||||
**Data persistence:**
|
||||
- [x] SQLite database (`seismo_relay.db`) with `events`, `monitor_log`, `ach_sessions` tables
|
||||
- [x] Per-row `device_family` column ("series3" / "series4") for clean UI / unit-of-measurement dispatch (v0.19.0+)
|
||||
- [x] Deduplication by `(serial, timestamp)` — natural key handles post-erase counter resets
|
||||
- [x] UPSERT on re-import refreshes every device-authoritative field (peaks, project, sample_rate); preserves operator review state (`false_trigger`)
|
||||
- [x] Post-erase key-reuse detection (tracks high-water mark in `ach_state.json`)
|
||||
|
||||
**REST API:**
|
||||
- [x] Live device endpoints with in-memory caching (`_LiveCache`)
|
||||
- [x] Cache statistics (`/cache/stats`) and manual invalidation (`/cache/device`)
|
||||
- [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH)
|
||||
- [x] Call Home config read/write endpoints
|
||||
- [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`)
|
||||
- [x] Import endpoints for both device families (`/db/import/blastware_file`, `/db/import/idf_file`)
|
||||
|
||||
**File output (v0.7+, byte-perfect as of v0.14.3):**
|
||||
- [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata)
|
||||
- [x] Multi-channel waveform decode from SUB 5A bulk stream
|
||||
- [x] Second-resolution timestamp encoding in Blastware filename
|
||||
- [x] **Byte-perfect against BW reference captures** (verified across 2-sec / 3-sec / 10-sec event durations, both event 0 and event N continuation events)
|
||||
- [x] STRT-bounded chunk walk + correct event-N probe counter + partial DLE stuffing of `0x10` in 5A params (the four fixes that landed in v0.14.0–v0.14.3)
|
||||
|
||||
**Capture tools:**
|
||||
- [x] Serial-to-TCP bridge with raw BW/S3 capture (s3_bridge.py, defaults to auto-capture)
|
||||
- [x] GUI bridge with raw capture checkboxes (gui_bridge.py)
|
||||
- [x] ACH inbound server with bidirectional capture (ach_server.py saves raw_tx + raw_rx)
|
||||
- [x] Transparent TCP MITM proxy for live BW session capture (ach_mitm.py)
|
||||
|
||||
**Analysis tools:**
|
||||
- [x] s3_analyzer.py — session parser, frame differ, Claude export
|
||||
- [x] gui_analyzer.py — standalone analyzer GUI
|
||||
- [x] frame_db.py — SQLite frame database for capture analysis
|
||||
|
||||
**seismo_lab.py GUI:**
|
||||
- [x] Bridge tab — Serial/TCP mode selector with raw capture options
|
||||
- [x] Analyzer tab — BW/S3 capture playback and differencing
|
||||
- [x] Download tab — Live wire-byte capture during event download
|
||||
- [x] Console tab — Logging and diagnostics
|
||||
|
||||
## Roadmap (Future)
|
||||
|
||||
### Strategic direction — where this is going
|
||||
|
||||
seismo-relay is being built as a **suite of cooperating components**
|
||||
that together replace and improve on Blastware's role. Three logical
|
||||
tiers:
|
||||
|
||||
1. **SFM** (device-side) — owns the active connection to a physical
|
||||
unit. Today: `minimateplus/`, `/device/*` HTTP endpoints,
|
||||
`seismo_lab.py`. Future: live Thor / Micromate support.
|
||||
2. **SDM** (data-side) — owns the database, waveform store, ingest
|
||||
pipelines, and the read-API that Terra-View consumes. Today this
|
||||
code lives under `sfm/` for historical reasons; the role has
|
||||
migrated and the eventual rename is on the long-tail cleanup list.
|
||||
3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
|
||||
`bw_ascii_report.py`, `micromate/idf_*.py`. Used by both SFM and
|
||||
SDM, depends on neither.
|
||||
|
||||
Terra-View is downstream of SDM for fleet listings, event detail, etc.
|
||||
The long-term vision adds a **second link** from Terra-View → SFM for
|
||||
direct device interaction (see below).
|
||||
|
||||
The codec work in this repo isn't trying to replace BW's network
|
||||
layer — BW's ACH file forwarding and Thor's IDF call-home are
|
||||
battle-tested. The value is in the receiving and processing side: turn
|
||||
the stream of binary+ASCII pairs into something users can search,
|
||||
filter, alert on, and report from.
|
||||
|
||||
### Terra-View ↔ SFM device control (the long-term vision)
|
||||
|
||||
Today Terra-View only reads from SDM (event listings, dashboards,
|
||||
project reports). When a unit goes missing — operator notices in the
|
||||
Terra-View dashboard — there's no way to *do* anything from the UI.
|
||||
The path of least resistance is to RDP into a Windows box and open
|
||||
Blastware, which defeats the purpose of having Terra-View.
|
||||
|
||||
Target experience:
|
||||
- Operator notices a unit in Terra-View dashboard hasn't called in.
|
||||
- Clicks unit detail → "Connect to Device" button.
|
||||
- Terra-View opens an embedded view (modal or side-panel) that talks
|
||||
to SFM's `/device/*` endpoints over the network.
|
||||
- Live view: device clock, battery, memory, current monitor status.
|
||||
- Actions: start/stop monitoring, push compliance config changes, pull
|
||||
fresh events, run a sensor self-check, change call-home settings.
|
||||
- Audit log: every connect / action recorded in SDM for the unit
|
||||
history.
|
||||
|
||||
Implementation steps (concrete):
|
||||
- [ ] **SFM authentication & authorization layer.** Today `/device/*`
|
||||
endpoints are unauthenticated — anyone on the network can call
|
||||
them. Need at minimum a token-based auth, ideally with a "who
|
||||
can connect to which units" mapping. Hard prerequisite for
|
||||
letting Terra-View users into the control surface.
|
||||
- [ ] **Terra-View "Connect to Device" entry point** on the unit
|
||||
detail page. Renders only when unit has connection info on file
|
||||
and the user has permission.
|
||||
- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
|
||||
`seismo_lab.py`'s Bridge tab, but in the browser. Polls SFM's
|
||||
`/device/monitor/status` on an interval; sends start/stop via
|
||||
`/device/monitor/{start,stop}`.
|
||||
- [ ] **Action history** — every connect / push / action call records
|
||||
a row in `unit_history`, viewable on the unit detail page.
|
||||
- [ ] **Series IV live-device support in SFM** — currently `/device/*`
|
||||
only supports MiniMate Plus. Blocks "Connect to Device" for
|
||||
Thor units until done. Depends on Thor wire-protocol capture
|
||||
and a `micromate/` parallel of the `minimateplus/` modules.
|
||||
|
||||
### High-impact (unblocks product features)
|
||||
|
||||
- [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
|
||||
- [ ] **Series IV (Thor IDF) binary codec reverse-engineering.** `.IDFH` / `.IDFW` files are currently stored opaquely by `WaveformStore.save_imported_idf`, with all metadata sourced from the paired `.txt` sidecar. This works because thor-watcher forwards both files together, but operators who haven't enabled Thor's TXT exporter get rows with NULL peaks. Cracking the binary closes that gap and unlocks waveform display. Starting-point reference at [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) — two observed file signatures (1,012 newer-firmware files + 2 old files whose layout matches the Series III STRT-record format), suggested first-session plan (~2-4 hrs), 1,014 paired binary+txt files available as ground truth in `thor-watcher/example-data/`. Code seam ready at `micromate/idf_file.py`.
|
||||
- [ ] **In-app waveform viewer accuracy.** Depends on Series III codec decode. Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples. Series IV waveforms come online when the IDF codec lands.
|
||||
- [ ] **Series IV live-device support.** Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
|
||||
- [ ] **Terra-view integration** — seismo-relay router, unit detail page, VISON-style event listing.
|
||||
- [ ] **Vibration summary reports** — highest legit PPV per project → Word doc (false-trigger filtering first).
|
||||
|
||||
### BW ASCII report parser enhancements (built in v0.16.0)
|
||||
|
||||
- [ ] **PPV field misses on certain TXT formats.** Discovered 2026-05-22 during the histogram-codec backfill validation: a handful of events (5 in prod) have a `bw_report` block where `peaks.{tran,vert,long}.ppv_ips` and `peaks.vector_sum.ips` are all `None`, despite the parser correctly extracting every OTHER field for the same channels (zc_freq_hz, time_of_peak_s, peak_accel_g, peak_disp_in). Symptom on the DB side: `peak_vector_sum=0` after a `--force` backfill that overlays from the parsed bw_report dict. Affected events on prod include `T190LD5Q.LK0W`, `T438L713.RY0W`, `K557L3YM.OE0W`. Root cause likely a regex or format mismatch for the "PPV" header line in those specific firmware/event-type outputs. Once fixed, re-forwarding the events from series3-watcher will re-populate the `bw_report` blocks correctly.
|
||||
- [ ] **Histogram-specific structural fields.** Current parser handles the shared fields (PPV, ZC Freq, sensor self-check, project) but silently drops histogram-only fields: `Histogram Start/Stop Time`, `Histogram Start/Stop Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date` (absolute timestamps rather than the waveform's `Time of Peak` relative seconds).
|
||||
- [ ] **Histogram interval bin-table parsing.** Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed. Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
|
||||
- [ ] **`>100 Hz` value parsing.** Histogram TXTs use `>100 Hz` for out-of-range ZC freq; current `_parse_number()` returns `None` for these (loses information).
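
One possible shape for the `>100 Hz` fix in the last item above is to return the bound together with its qualifier rather than `None`. The sketch below is hypothetical: `parse_bounded_number` is not an existing function in `bw_ascii_report.py`, and the real `_parse_number()` may handle formats not covered here.

```python
import re
from typing import Optional, Tuple

# Optional "<" or ">" qualifier, then a plain decimal number; any unit suffix
# after the number is ignored.
_NUMBER_RE = re.compile(r"^\s*([<>]?)\s*(-?\d+(?:\.\d+)?)")


def parse_bounded_number(text: str) -> Tuple[Optional[float], str]:
    """Return (value, qualifier); qualifier is ">", "<", or "" for exact values."""
    match = _NUMBER_RE.match(text)
    if match is None:
        return None, ""
    return float(match.group(2)), match.group(1)
```

With this shape, `">100 Hz"` keeps both the 100 and the fact that it is a lower bound, while unparseable text still maps to `None`.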
|
||||
|
||||
### Ingestion gaps
|
||||
|
||||
- [ ] **MLG forwarding.** `series3-watcher` forwards event binaries + their `_ASCII.TXT` reports, but skips `.MLG` per-unit monitor log files entirely. Adding a `POST /db/import/mlg_file` endpoint + watcher scan path would populate `monitor_log` for non-ACH-routed units (coverage queries, "was this unit monitoring on date X" lookups).
|
||||
- [ ] **0C-record raw bytes persistence in the sidecar.** Currently on branch `claude/codec-re-cBGNe` as commit `a187124`; cherry-pick if useful as a standalone fix. Preserves the 210-byte 0C record under `extensions.raw_records.waveform_record_b64` so future field-offset analysis (Peak Acceleration / Time of Peak / etc. — the fields BW computes client-side from samples) can run offline.
|
||||
|
||||
### Operational
|
||||
|
||||
- [ ] **`series3-watcher` file archive manager** — 90-day-old events moved to `<watch_folder>_archive/<year>/<month>/` subfolders. Plan drafted in `claude/codec-re-cBGNe`'s plan-mode session; awaiting a 5-minute test on whether Blastware UI walks subfolders before any code lands (determines layout: in-place subfolders vs sibling archive).
|
||||
- [ ] **Compliance config encoder** — build raw write payloads from a `ComplianceConfig` object.
|
||||
- [ ] **Modem manager** — push RV50/RV55 configs via Sierra Wireless API.
|
||||
- [ ] **Call Home dial_string write support** (requires DLE escaping for embedded control characters).
|
||||
- [ ] **Histogram mode recording support** (5A stream analysis for mode 0x03 — separate from histogram ASCII parsing above).
|
||||
|
||||
### Test coverage
|
||||
|
||||
- [ ] Verify 30-sec event download — body may exceed `0xFFFF` and force the device into a different `end_key` encoding (none of the 2/3/10-sec test cases hit this boundary).
|
||||
- [ ] Histogram mode (0x03) write via SFM — confirmed working for Single Shot / Continuous / Histogram+Continuous; Histogram (0x03) needs a live test from a non-Histogram starting state.
|
||||
|
||||
### Lower-priority cleanups
|
||||
|
||||
- [ ] Compliance write anchor-9 cleanup — when changing recording_mode via SFM, a spurious `0x10` may persist after Histogram→other mode transitions. Doesn't affect device operation but differs from BW's byte-perfect output.
|
||||
- [ ] Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring).
|
||||
- [ ] Call Home — map time slots 3/4 offsets; confirm `modem_power_relay_enabled`.
|
||||
- [ ] RV55 DCD/DTR — newer RV55 firmware doesn't assert DCD by default; units don't resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred).
|
||||
- [ ] **NULL-timestamp duplicate-row dedup.** A small handful of events (2 known on prod as of 2026-05-22) have `events.timestamp IS NULL` because the codec couldn't extract a timestamp from the binary footer. The `UNIQUE(serial, timestamp)` constraint doesn't fire on `NULL` (SQL semantics: `NULL ≠ NULL`), so every `--force` backfill INSERTs a new row instead of UPSERTing the existing one. Cleanup: a one-shot SQL query that keeps only the newest row per `(serial, blastware_filename)` and deletes the rest. Longer-term: extend the unique key to `(serial, COALESCE(timestamp, blastware_filename))` or reject inserts with NULL timestamp.
|
||||
- [ ] **Histogram body sub-format with `byte[5] != 0`.** ~3 events on prod (`T190LD5Q.LD0H`, `O121L4L1.GU0H`) use a histogram body my walker doesn't recognize — the first block has `byte[5] = 0x01` or `0x07` instead of `0x00`, and the entire body lacks the `1e 0a 00 00` tail signature. Codec returns 0 valid blocks; their DB PVS comes from the bw_report ASCII overlay (which BW computed from the same binary, so the DB columns are correct). Only the `.h5` waveform plot is empty. Cracking the sub-format would unlock the plot. Needs binary+ASCII pairs from a few `byte[5]!=0` events; same RE approach as the K558 case.
|
||||
- [x] Full read pipeline — device info, compliance config, event download with true event-time metadata
|
||||
- [x] Write commands — push compliance config, trigger thresholds, project strings to device
|
||||
- [x] Erase all events — confirmed erase sequence from live MITM capture
|
||||
- [x] Monitor control — start/stop monitoring, read battery/memory/status
|
||||
- [x] Monitor log entries — decode partial 0x2C records (continuous monitoring intervals)
|
||||
- [x] ACH inbound server — accept call-home connections, download events, dedup by key
|
||||
- [x] SQLite persistence — events, monitor log, and session history in `seismo_relay.db`
|
||||
- [x] SFM REST API — device control + DB query endpoints, live device cache
|
||||
- [ ] Terra-view integration — seismo-relay router, unit detail page, VISON-style event listing
|
||||
- [ ] Vibration summary reports — highest legit PPV per project → Word doc (false trigger filtering first)
|
||||
- [ ] Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
|
||||
- [ ] Modem manager — push RV50/RV55 configs via Sierra Wireless API
|
||||
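The NULL-timestamp dedup item in the list above can be sketched against an in-memory SQLite database. This is a minimal illustration, not the project's migration: the `events` schema here is a cut-down assumption (only `id`, `serial`, `timestamp`, `blastware_filename`), "newest" is assumed to mean the highest `id`, and the serial `BE00000` and the index name are hypothetical.

```python
import sqlite3


def dedup_null_timestamp_rows(conn: sqlite3.Connection) -> int:
    """Keep only the newest NULL-timestamp row per (serial, blastware_filename).

    "Newest" is assumed to be the row with the highest id.
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM events
        WHERE timestamp IS NULL
          AND id NOT IN (
              SELECT MAX(id) FROM events
              WHERE timestamp IS NULL
              GROUP BY serial, blastware_filename
          )
        """
    )
    conn.commit()
    return cur.rowcount


# Assumed, simplified schema: the real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        serial TEXT NOT NULL,
        timestamp TEXT,
        blastware_filename TEXT,
        UNIQUE (serial, timestamp)
    )
    """
)

insert_sql = (
    "INSERT INTO events (serial, timestamp, blastware_filename) "
    "VALUES (?, NULL, ?)"
)
# Three backfills of the same event: UNIQUE(serial, timestamp) never fires
# because the NULL timestamps are treated as distinct.
for _ in range(3):
    conn.execute(insert_sql, ("BE00000", "T190LD5Q.LD0H"))

deleted = dedup_null_timestamp_rows(conn)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Longer-term guard: a unique expression index that falls back to the filename.
conn.execute(
    "CREATE UNIQUE INDEX events_serial_ts_or_file "
    "ON events (serial, COALESCE(timestamp, blastware_filename))"
)
try:
    conn.execute(insert_sql, ("BE00000", "T190LD5Q.LD0H"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The cleanup has to run before the index is created, otherwise `CREATE UNIQUE INDEX` fails on the existing duplicates.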
|
||||
@@ -1,66 +0,0 @@
|
||||
# analysis/ — exploratory scripts for waveform-body RE

**These are scratch.** Run them, read them, copy them, but don't trust
them as documentation. When a finding is verified it gets promoted
to `minimateplus/waveform_codec.py` and `tests/test_waveform_codec.py`;
when it's wrong it stays here as a fossil.

Authoritative status lives in:

- `docs/waveform_codec_re_status.md` (current truth, working note)
- `minimateplus/waveform_codec.py` (verified implementation + docstring)
- `tests/test_waveform_codec.py` (regression locks against fixtures)

---
|
||||
|
||||
## Still useful

| File | What it does |
|---|---|
| `load_bundle.py` | Fixture loader. Parses BW binary + ASCII TXT into a `Bundle` dataclass with samples, metadata, body bytes. Used by most other scripts here. |
| `verify_tran.py` | Verifies `decode_tran_initial` against fixture ground truth across all events. Useful when you change the decoder and want a quick sanity check. |
| `inspect_5_11.py` | Inspects the 5-11-26 high-amplitude bundle's body structure, prints metadata, peaks, and block counts. |
| `walk_5_11.py` | Walks blocks for the 5-11-26 bundle and prints offset/tag/length/data. |
| `seg1_blocks.py` | Dumps all blocks in segment 1 of each event. The starting point for cracking multi-segment Tran continuation. |
| `full_tran.py` | Multi-segment Tran decoder attempt (broken — diverges at sample ~512). Useful as a starting scaffold for the next experiment. |
| `multi_segment.py` | Earlier multi-segment attempt with different segment-header consumption strategies. Records what didn't work. |
| `test_rle.py` | Tests `00 NN` interpretation as zero-RLE with different divisor values. Documents how the RLE rule was confirmed. |
|
||||
|
||||
## Superseded — keep for archaeology

| File | Superseded by |
|---|---|
| `walk_v2.py` … `walk_v5.py` | `walk_v6.py` and ultimately `minimateplus/waveform_codec.walk_body`. Each version represents one round of refinement. Don't read in isolation — read the diff between them to see what was learned. |
| `walk_chunks.py` | `walk_v6.py` / production walker |
| `decode_v1.py` | First naive decoder attempt. Wrong but readable. |
|
||||
|
||||
## Pure exploration — read if curious

| File | What it explored |
|---|---|
| `inspect_body.py` | Byte-frequency stats per event. Established that bytes 0x00 / 0x10 dominate. |
| `find_blocks.py` | Searched for repeating 2-byte tag patterns. |
| `find_signal_runs.py` | Searched for stretches of bytes that "look like a smooth signal" (small inter-byte deltas). Found the `20 NN` literal blocks. |
| `dump_head.py`, `dump_trailer.py`, `dump_around.py` | Hex dumpers at various body positions. |
| `compare_cd.py` | Byte-diff between event-c and event-d (same length, similar signal). Used to identify structural vs data bytes. |
| `brute_force.py` | Tested 192 combinations (24 channel permutations × 2 nibble orders × 2 sign conventions × 2 init-from-header settings) on the quiet bundle. All failed because the quiet bundle had T[0]=T[1]=0, making the preamble undetectable. |
| `try_nibbles.py`, `try_layouts.py` | Earlier channel-interleaving hypotheses. All wrong. |
| `test_tran_continue.py` | Test of "Tran continues uninterrupted across `30 04` blocks" hypothesis. Disproven. |
|
||||
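Several of the scripts listed above treat each byte of a `10 NN` block as two 4-bit deltas and vary the nibble order and sign convention. A minimal sketch of that interpretation follows, assuming two's-complement sign extension and a single channel. The helper names `nibble_deltas` and `accumulate` are illustrative only, and nothing here is claimed to be the verified codec.

```python
def s4(nibble: int) -> int:
    """Sign-extend a 4-bit value (0..15) to the range -8..7."""
    return nibble if nibble < 8 else nibble - 16


def nibble_deltas(data: bytes, high_first: bool = True) -> list[int]:
    """Split each byte into two signed 4-bit deltas.

    high_first selects which nibble is emitted first; that ordering is
    one of the hypotheses the brute-force script varies.
    """
    deltas = []
    for byte in data:
        hi, lo = (byte >> 4) & 0xF, byte & 0xF
        pair = (hi, lo) if high_first else (lo, hi)
        deltas.extend(s4(n) for n in pair)
    return deltas


def accumulate(start: int, deltas: list[int]) -> list[int]:
    """Turn deltas into cumulative sample values, beginning after `start`."""
    out = []
    cur = start
    for d in deltas:
        cur += d
        out.append(cur)
    return out


example = bytes([0x1F, 0x20])
high_first = nibble_deltas(example)                    # [1, -1, 2, 0]
low_first = nibble_deltas(example, high_first=False)   # [-1, 1, 0, 2]
samples = accumulate(10, high_first)                   # [11, 10, 12, 12]
```

Keeping the nibble order as a parameter makes it cheap to compare both hypotheses against the same ground-truth samples.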
|
||||
---
|
||||
|
||||
## Adding new scripts

If you're picking up the codec work, feel free to add new scripts here.
Suggested conventions:

- Start the filename with what you're testing: `test_<hypothesis>.py`,
  `verify_<piece>.py`, `inspect_<region>.py`.
- Print enough output that the reader can see exactly which events
  match / diverge and where.
- When a finding is solid, move the verified logic to
  `minimateplus/waveform_codec.py` and add a regression test in
  `tests/test_waveform_codec.py` — don't leave the truth only in
  this directory.
- If a script is fully superseded, leave it in place (don't delete) —
  the fossil record is useful when re-evaluating hypotheses later.
|
||||
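The "print enough output" convention above could be served by a small shared helper. The sketch below is a suggestion only: `first_divergence` and `report` are hypothetical names, not existing functions in this repository, and the output format simply mirrors the summary lines the existing scratch scripts print.

```python
from typing import Optional, Sequence


def first_divergence(decoded: Sequence[int], truth: Sequence[int]) -> Optional[int]:
    """Index of the first sample where decoded and truth differ, else None.

    Only the overlapping prefix is compared.
    """
    for i, (d, t) in enumerate(zip(decoded, truth)):
        if d != t:
            return i
    return None


def report(name: str, decoded: Sequence[int], truth: Sequence[int], context: int = 3) -> str:
    """Build a summary plus a window of samples around the first divergence."""
    n = min(len(decoded), len(truth))
    matches = sum(1 for d, t in zip(decoded, truth) if d == t)
    div = first_divergence(decoded, truth)
    lines = [
        f"{name}: decoded={len(decoded)}, truth={len(truth)}, "
        f"matches={matches}/{n}, first div={div}"
    ]
    if div is not None:
        lo, hi = max(0, div - context), div + context + 1
        lines.append(f"  truth[{lo}:{hi}] = {list(truth[lo:hi])}")
        lines.append(f"  pred [{lo}:{hi}] = {list(decoded[lo:hi])}")
    return "\n".join(lines)


summary = report("event-x", [1, 2, 3, 9, 5], [1, 2, 3, 4, 5])
print(summary)
```

Returning the text instead of printing inside the helper keeps it usable from both scratch scripts and regression tests.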
@@ -1,93 +0,0 @@
|
||||
"""Brute-force test channel permutations / nibble orders on event-d (simplest signal)."""
|
||||
import sys
import itertools
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle
from minimateplus.waveform_codec import walk_body


def s4(n):
    return n if n < 8 else n - 16


def decode(body, channel_perm, nibble_order, sign_mode, init_from_header):
    """Try one decoder configuration on event-d. Returns up to the first 16 cumulative samples per channel."""
    blocks = walk_body(body)
    # Initial values from bytes [4:7] if init_from_header else 0
    if init_from_header:
        init = [body[4] if body[4] < 128 else body[4] - 256,
                body[5] if body[5] < 128 else body[5] - 256,
                body[6] if body[6] < 128 else body[6] - 256,
                0]
    else:
        init = [0, 0, 0, 0]
    cur = list(init)
    out = [[init[0]], [init[1]], [init[2]], [init[3]]]  # sample 0 = init
    nibble_idx = 0  # within delta stream; channel = channel_perm[nibble_idx % 4]

    # Walk only the 10 NN data blocks
    for blk in blocks:
        if blk.tag_hi != 0x10:
            continue
        for byte in blk.data:
            if nibble_order == 'high_first':
                nib1, nib2 = (byte >> 4) & 0xF, byte & 0xF
            else:
                nib1, nib2 = byte & 0xF, (byte >> 4) & 0xF
            for nib in (nib1, nib2):
                if sign_mode == 'signed':
                    delta = s4(nib)
                else:
                    delta = nib
                ch = channel_perm[nibble_idx % 4]
                cur[ch] += delta
                if (nibble_idx + 1) % 4 == 0:
                    out[0].append(cur[0])
                    out[1].append(cur[1])
                    out[2].append(cur[2])
                    out[3].append(cur[3])
                nibble_idx += 1
                if len(out[0]) >= 16:
                    return out
    return out


def best_match(pred, truth, n=10):
    """Sum of squared differences in first n samples."""
    n = min(n, len(pred), len(truth))
    return sum((pred[i] - truth[i])**2 for i in range(n))


def main():
    b = load_bundle("event-d")
    # truth in 16-count units
    tr = {ch: [round(v * 200) for v in b.samples[ch]] for ch in ("Tran", "Vert", "Long")}

    print("Truth event-d first 10 samples:")
    for ch in ("Tran", "Vert", "Long"):
        print(f" {ch}: {tr[ch][:10]}")

    # Test 192 combinations: 24 permutations x 2 nibble orders x 2 sign modes x 2 init modes
    best = []
    for perm in itertools.permutations([0, 1, 2, 3]):
        for nibble_order in ('high_first', 'low_first'):
            for sign in ('signed', 'unsigned'):
                for init_h in (False, True):
                    decoded = decode(b.body, perm, nibble_order, sign, init_h)
                    # Score as TVL channel-sum
                    score = sum(
                        best_match(decoded[i], tr[ch], n=10)
                        for i, ch in enumerate(("Tran", "Vert", "Long"))
                        if i < 3
                    )
                    label = f"perm={perm} nib={nibble_order[:1]} sign={sign[:3]} init={init_h}"
                    best.append((score, label, decoded))

    best.sort(key=lambda x: x[0])
    print("\nTop 10 configurations:")
    for s, lbl, dec in best[:10]:
        print(f" score={s:>5} {lbl} T={dec[0][:8]} V={dec[1][:8]} L={dec[2][:8]}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,42 +0,0 @@
|
||||
"""Compare event-c and event-d (same N_samples) to find header vs data bytes."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    bc = load_bundle("event-c")
    bd = load_bundle("event-d")

    # Compare prefixes
    nc, nd = len(bc.body), len(bd.body)
    n = min(nc, nd)
    diffs = []
    for i in range(n):
        if bc.body[i] != bd.body[i]:
            diffs.append(i)
    print(f"event-c body={nc}, event-d body={nd}")
    print(f"Total diffs (first {n}): {len(diffs)}")

    # Show common prefix
    same_prefix = 0
    for i in range(n):
        if bc.body[i] == bd.body[i]:
            same_prefix += 1
        else:
            break
    print(f"Common prefix length: {same_prefix}")
    print(f"event-c prefix: {bc.body[:same_prefix].hex(' ')}")

    # Look for runs of common bytes
    print(f"\nFirst 32 diff positions: {diffs[:32]}")

    # Show the "diff fingerprint" of the first 100 bytes
    # (bounded by the shorter body so neither index can run past the end)
    print(f"\n pos c d")
    for i in range(min(100, n)):
        marker = " " if bc.body[i] == bd.body[i] else "*"
        bd_b = bd.body[i] if i < nd else None
        print(f" {i:>3} {bc.body[i]:02x}{marker} {bd_b:02x}" if bd_b is not None else f" {i:>3} {bc.body[i]:02x}{marker}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,99 +0,0 @@
|
||||
"""
|
||||
Decoder v1: nibble-pair signed deltas in 10 NN blocks, 4-channel round-robin.
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def walk_blocks(body, start):
|
||||
i = start
|
||||
blocks = []
|
||||
while i + 1 < len(body):
|
||||
t0, t1 = body[i], body[i + 1]
|
||||
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 // 2 + 2
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append(("10", t1, data))
|
||||
i += length
|
||||
elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 + 2
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append(("20", t1, data))
|
||||
i += length
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
blocks.append(("00", t1, b""))
|
||||
i += 2
|
||||
elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
|
||||
length = t1 * 4
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append(("30", t1, data))
|
||||
i += length
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
length = 20
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append(("40", t1, data))
|
||||
i += length
|
||||
else:
|
||||
blocks.append(("??", t0, bytes(body[i:i+8])))
|
||||
break
|
||||
return blocks
|
||||
|
||||
|
||||
def decode_v1(body, start, n_samples):
|
||||
"""Decode by accumulating nibble-pair deltas from all 10 NN blocks."""
|
||||
blocks = walk_blocks(body, start)
|
||||
# 4 channels: T, V, L, M
|
||||
cur = [0, 0, 0, 0]
|
||||
out = [[], [], [], []]
|
||||
sample_index = 0 # how many sample-sets emitted
|
||||
|
||||
for typ, NN, data in blocks:
|
||||
if typ == "10":
|
||||
# 2 nibbles per byte, round-robin TVLM
|
||||
for byte in data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
ch = sample_index % 4
|
||||
cur[ch] += s4(nib)
|
||||
out[ch].append(cur[ch])
|
||||
sample_index = (sample_index + 1) // 4 * 4 + (sample_index + 1) % 4 # ?
|
||||
sample_index += 1
|
||||
# We emit per-nibble, but the structure is unclear
|
||||
elif typ == "20":
|
||||
# int8 absolute or delta?
|
||||
for byte in data:
|
||||
v = byte if byte < 128 else byte - 256
|
||||
ch = sample_index % 4
|
||||
cur[ch] = v # treat as absolute
|
||||
out[ch].append(cur[ch])
|
||||
sample_index += 1
|
||||
return out
|
||||
|
||||
|
||||
def main():
|
||||
b = load_bundle("event-c")
|
||||
body = b.body
|
||||
truth_T = [round(v * 200) for v in b.samples["Tran"]]
|
||||
truth_V = [round(v * 200) for v in b.samples["Vert"]]
|
||||
truth_L = [round(v * 200) for v in b.samples["Long"]]
|
||||
|
||||
# Find start
|
||||
for s in range(15):
|
||||
if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
|
||||
start = s
|
||||
break
|
||||
|
||||
blocks = walk_blocks(body, start)
|
||||
# Print block-by-block what's in each
|
||||
print(f"Total blocks: {len(blocks)}")
|
||||
bytes_processed = 0
|
||||
for typ, NN, data in blocks[:30]:
|
||||
print(f" type={typ} NN=0x{NN:02x} data_len={len(data)} data_hex={data[:32].hex(' ')}{'...' if len(data) > 32 else ''}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,27 +0,0 @@
|
||||
"""Dump body bytes around a specific offset."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def dump_around(name: str, center: int, radius: int = 96):
    b = load_bundle(name)
    body = b.body
    start = max(0, center - radius)
    end = min(len(body), center + radius)
    print(f"\n=== {name} body[{start}:{end}] (full body={len(body)}) ===")
    for i in range(start, end, 32):
        row = body[i:i+32]
        marker = " <-- center" if i <= center < i+32 else ""
        print(f" +{i:>5} {row.hex(' ')}{marker}")


def main():
    # Look at the trailer transitions
    trailer_starts = {"event-a": 7047, "event-b": 6475, "event-c": 4043, "event-d": 3941}
    for name, off in trailer_starts.items():
        dump_around(name, off, 96)


if __name__ == "__main__":
    main()
|
||||
@@ -1,18 +0,0 @@
|
||||
"""Dump the START of each body in 32-byte rows."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} body[0:512] (full body={len(body)}, samples={len(b.samples['Tran'])}) ===")
        for i in range(0, min(512, len(body)), 32):
            row = body[i:i+32]
            print(f" +{i:>5} {row.hex(' ')}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,24 +0,0 @@
|
||||
"""Dump body bytes split into 32-byte rows starting from `start_offset`."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def dump(body: bytes, name: str, start: int, n_rows: int = 30):
    print(f"\n=== {name} body[{start}:] (full body={len(body)}) ===")
    end = min(start + 32 * n_rows, len(body))
    for i in range(start, end, 32):
        row = body[i:i+32]
        print(f" +{i:>5} {row.hex(' ')}")


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        # Print the last 12 rows (32 * 12 = 384 bytes) of the body to see the tail structure
        start = max(0, len(b.body) - 32 * 12)
        dump(b.body, name, start, 12)


if __name__ == "__main__":
    main()
|
||||
@@ -1,41 +0,0 @@
|
||||
"""Search for structural repetition in the body bytes."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def find_pattern_offsets(body: bytes, pattern: bytes, max_count=20):
    out = []
    i = 0
    while True:
        i = body.find(pattern, i)
        if i < 0:
            break
        out.append(i)
        i += 1
        if len(out) >= max_count:
            break
    return out


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} (body={len(body)}, N_samples={len(b.samples['Tran'])}) ===")

        # Try to find repeating substructures (2-byte tag candidates, mostly 0x10-prefixed)
        for prefix in [b"\x10\x10", b"\x10\x04", b"\x10\x08", b"\x10\x0c", b"\x10\x18",
                       b"\x10\x14", b"\x10\x20", b"\x10\x40", b"\x10\x80", b"\x10\x00",
                       b"\x10\x01", b"\x10\x03", b"\x10\xf0", b"\xf1\x10", b"\x00\x10",
                       b"\x40\x02", b"\x20\x04", b"\x30\x04", b"\x30\x08", b"\x00\x1a"]:
            offs = find_pattern_offsets(body, prefix, max_count=200)
            if 1 <= len(offs) <= 1000:
                # Print the first 6 and last 3 offsets
                first = offs[:6]
                last = offs[-3:]
                print(f" '{prefix.hex()}' x{len(offs):>4} first={first} last={last}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,34 +0,0 @@
|
||||
"""Find body byte ranges that look like absolute int8 sample data (smooth waveform)."""
|
||||
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def looks_like_smooth_int8(buf):
    """Convert bytes to int8 and check if successive deltas are small (waveform-like)."""
    if len(buf) < 8:
        return 0.0
    vals = [b if b < 128 else b - 256 for b in buf]
    diffs = [abs(vals[i+1] - vals[i]) for i in range(len(vals)-1)]
    avg_diff = sum(diffs) / len(diffs)
    return avg_diff


def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        # Scan with sliding window of 64 bytes; find segments where the bytes look like a smooth wave
        win = 64
        scores = []
        for i in range(len(body) - win):
            scores.append((i, looks_like_smooth_int8(body[i:i+win])))
        # Lowest avg_diff means smoothest
        scores.sort(key=lambda x: x[1])
        print(f"\n=== {name} (body={len(body)}) — smoothest 10 windows ===")
        for off, s in scores[:10]:
            print(f" +{off:>5} avg_diff={s:.2f} bytes={body[off:off+24].hex(' ')}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,76 +0,0 @@
|
||||
"""Full Tran decoder: continues across segment headers using T_delta from header bytes [0:2]."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def decode_full_tran(body):
|
||||
if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
|
||||
return None
|
||||
T0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
T1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
i = 7
|
||||
while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
|
||||
i += 1
|
||||
|
||||
blocks = walk_body(body, i)
|
||||
T = [T0, T1]
|
||||
cur = T1
|
||||
for blk in blocks:
|
||||
if blk.tag_hi == 0x40:
|
||||
# Segment header carries 2 T deltas (int16 BE each) at bytes [0:2] and [2:4]
|
||||
if len(blk.data) >= 4:
|
||||
delta1 = int.from_bytes(blk.data[0:2], "big", signed=True)
|
||||
cur += delta1
|
||||
T.append(cur)
|
||||
delta2 = int.from_bytes(blk.data[2:4], "big", signed=True)
|
||||
cur += delta2
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += s4(nib)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += i8(byte)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
T.append(cur)
|
||||
# 30 NN: skip for now
|
||||
return T
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
truth_T = [round(v*200) for v in samples["Tran"]]
|
||||
n_truth = len(truth_T)
|
||||
|
||||
decoded = decode_full_tran(body)
|
||||
n = min(len(decoded), n_truth)
|
||||
matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
|
||||
div_at = -1
|
||||
for i in range(n):
|
||||
if decoded[i] != truth_T[i]:
|
||||
div_at = i
|
||||
break
|
||||
print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,50 +0,0 @@
|
||||
"""Quick inspection of the new high-amplitude events."""
|
||||
import os, re, sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
ROOT = "tests/fixtures/5-11-26"
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
bin_path = os.path.join(ROOT, stem)
|
||||
txt_path = bin_path + ".TXT"
|
||||
with open(bin_path, "rb") as f:
|
||||
raw = f.read()
|
||||
body = raw[43:-26]
|
||||
meta, samples = _parse_txt(txt_path)
|
||||
n = len(samples["Tran"])
|
||||
|
||||
print(f"\n=== {stem} ===")
|
||||
print(f" file={len(raw)}, body={len(body)}, N_samples={n}")
|
||||
print(f" rectime={meta.get('Record Time')} pretrig={meta.get('Pre-trigger Length')}")
|
||||
print(f" PPV(T,V,L)={meta.get('Tran PPV')} / {meta.get('Vert PPV')} / {meta.get('Long PPV')}")
|
||||
# Show first few non-trivial samples
|
||||
print(f" First 5 truth samples (in/s):")
|
||||
for i in range(5):
|
||||
print(f" T={samples['Tran'][i]:8.3f} V={samples['Vert'][i]:8.3f} "
|
||||
f"L={samples['Long'][i]:8.3f} M={samples['MicL'][i]:8.3f}")
|
||||
# Peak sample positions
|
||||
for ch in ("Tran", "Vert", "Long"):
|
||||
vals = samples[ch]
|
||||
peak_i = max(range(n), key=lambda i: abs(vals[i]))
|
||||
print(f" {ch}: peak {vals[peak_i]:.3f} at sample {peak_i} (t={peak_i/1024:.3f}s)")
|
||||
# Body structure
|
||||
start = find_data_start(body)
|
||||
blocks = walk_body(body, start)
|
||||
types = {}
|
||||
for b in blocks:
|
||||
types[b.tag_hi] = types.get(b.tag_hi, 0) + 1
|
||||
print(f" body start={start}, total blocks walked: {len(blocks)}")
|
||||
print(f" block tag counts: {types}")
|
||||
# How far the walker got
|
||||
if blocks:
|
||||
last = blocks[-1]
|
||||
walked = last.offset + last.length
|
||||
print(f" walker stopped at offset {walked}/{len(body)} ({100*walked/len(body):.0f}%)")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,23 +0,0 @@
|
||||
"""Print raw body hex + byte-distribution stats for one event."""
|
||||
from collections import Counter
import sys

sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} ({len(body)} body bytes) ===")
        print(f" STRT: {b.strt.hex()}")
        print(f" body[0:64]: {body[:64].hex()}")
        print(f" body[64:128]: {body[64:128].hex()}")
        print(f" body[-32:]: {body[-32:].hex()}")
        cnt = Counter(body)
        print(f" top 16 bytes: {[(f'0x{k:02x}', f'{v/len(body):.2%}') for k,v in cnt.most_common(16)]}")


if __name__ == "__main__":
    main()
|
||||
@@ -1,144 +0,0 @@
|
||||
"""
|
||||
load_bundle.py — extract body bytes from BW binary + parse sample columns from TXT.
|
||||
|
||||
Used by the codec reverse-engineering scripts in this directory.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import re
|
||||
from dataclasses import dataclass
|
||||
|
||||
|
||||
BUNDLE_ROOT = os.path.join(
|
||||
os.path.dirname(__file__), "..", "tests", "fixtures", "decode-re-5-8-26"
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class Bundle:
|
||||
name: str
|
||||
bin_path: str
|
||||
txt_path: str
|
||||
bin: bytes
|
||||
body: bytes # bytes between STRT (43) and footer (last 26)
|
||||
strt: bytes # 21-byte STRT record
|
||||
samples: dict # {"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}
|
||||
sample_rate: int
|
||||
rectime_sec: float
|
||||
pretrig_sec: float
|
||||
geo_range_ips: float
|
||||
ppv: dict # {"Tran": float, "Vert": float, "Long": float}
|
||||
mic_pspl: float
|
||||
serial: str
|
||||
|
||||
|
||||
def _parse_txt(path: str) -> tuple[dict, dict]:
|
||||
with open(path, "r", encoding="utf-8", errors="replace") as f:
|
||||
text = f.read()
|
||||
|
||||
meta = {}
|
||||
samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||
|
||||
# Find header line that starts the columns ("Tran Vert Long MicL").
|
||||
# Then every line after is sample data (4 tab-separated floats).
|
||||
lines = text.splitlines()
|
||||
header_idx = None
|
||||
for i, line in enumerate(lines):
|
||||
if "Tran" in line and "Vert" in line and "Long" in line and "MicL" in line:
|
||||
# The columns header. Sample lines start a few lines later.
|
||||
header_idx = i
|
||||
break
|
||||
if header_idx is None:
|
||||
raise ValueError(f"no Tran/Vert/Long/MicL header in {path}")
|
||||
|
||||
# Parse meta — quoted lines with "Field : value"
|
||||
for line in lines[:header_idx]:
|
||||
m = re.match(r'^"([^"]+)\s*:\s*([^"]*)"', line.strip())
|
||||
if m:
|
||||
k, v = m.group(1).strip(), m.group(2).strip()
|
||||
meta[k] = v
|
||||
|
||||
# Parse samples
|
||||
for line in lines[header_idx + 1 :]:
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
parts = re.split(r"\s+", line)
|
||||
if len(parts) < 4:
|
||||
continue
|
||||
try:
|
||||
t = float(parts[0])
|
||||
v = float(parts[1])
|
||||
l = float(parts[2])
|
||||
m = float(parts[3])
|
||||
except ValueError:
|
||||
continue
|
||||
samples["Tran"].append(t)
|
||||
samples["Vert"].append(v)
|
||||
samples["Long"].append(l)
|
||||
samples["MicL"].append(m)
|
||||
|
||||
return meta, samples
|
||||
|
||||
|
||||
def load_bundle(name: str) -> Bundle:
|
||||
folder = os.path.join(BUNDLE_ROOT, name)
|
||||
files = os.listdir(folder)
|
||||
bin_name = next(f for f in files if not f.endswith(".TXT"))
|
||||
txt_name = next(f for f in files if f.endswith(".TXT"))
|
||||
|
||||
bin_path = os.path.join(folder, bin_name)
|
||||
txt_path = os.path.join(folder, txt_name)
|
||||
|
||||
with open(bin_path, "rb") as f:
|
||||
binary = f.read()
|
||||
|
||||
# Header is 22 bytes; STRT at [22:43]; footer at last 26 bytes.
|
||||
strt = binary[22:43]
|
||||
body = binary[43:-26]
|
||||
|
||||
meta, samples = _parse_txt(txt_path)
|
||||
|
||||
sample_rate = int(re.search(r"(\d+)", meta.get("Sample Rate", "1024")).group(1))
|
||||
rectime_sec = float(re.search(r"([\d.]+)", meta.get("Record Time", "3.0")).group(1))
|
||||
pretrig_sec = float(re.search(r"-?[\d.]+", meta.get("Pre-trigger Length", "0")).group(0))
|
||||
geo_range_ips = float(re.search(r"([\d.]+)", meta.get("Geo Range", "10.0")).group(1))
|
||||
serial = meta.get("Serial Number", "").strip()
|
||||
|
||||
def _f(s):
|
||||
return float(re.search(r"-?[\d.]+", s).group(0))
|
||||
|
||||
ppv = {
|
||||
"Tran": _f(meta.get("Tran PPV", "0")),
|
||||
"Vert": _f(meta.get("Vert PPV", "0")),
|
||||
"Long": _f(meta.get("Long PPV", "0")),
|
||||
}
|
||||
mic_pspl = _f(meta.get("MicL PSPL", "0"))
|
||||
|
||||
return Bundle(
|
||||
name=name,
|
||||
bin_path=bin_path,
|
||||
txt_path=txt_path,
|
||||
bin=binary,
|
||||
body=body,
|
||||
strt=strt,
|
||||
samples=samples,
|
||||
sample_rate=sample_rate,
|
||||
rectime_sec=rectime_sec,
|
||||
pretrig_sec=pretrig_sec,
|
||||
geo_range_ips=geo_range_ips,
|
||||
ppv=ppv,
|
||||
mic_pspl=mic_pspl,
|
||||
serial=serial,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
for name in ("event-a", "event-b", "event-c", "event-d"):
|
||||
b = load_bundle(name)
|
||||
n = len(b.samples["Tran"])
|
||||
print(f"{name}: body={len(b.body):>6} N_samples={n} rate={b.sample_rate} "
|
||||
f"rectime={b.rectime_sec} pretrig={b.pretrig_sec} range={b.geo_range_ips} "
|
||||
f"PPV(T,V,L)={b.ppv['Tran']:.3f},{b.ppv['Vert']:.3f},{b.ppv['Long']:.3f} "
|
||||
f"MicL={b.mic_pspl}")
|
||||
@@ -1,81 +0,0 @@
|
||||
"""Decode Tran across multiple segments by resetting at 40 02 headers."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def decode_full_tran(body):
|
||||
"""Decode all Tran samples in the body, walking through segments."""
|
||||
if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
|
||||
return None
|
||||
T0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
T1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
# Locate first tag
|
||||
i = 7
|
||||
while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
|
||||
i += 1
|
||||
|
||||
blocks = walk_body(body, i)
|
||||
T = [T0, T1]
|
||||
cur = T1
|
||||
for bi, blk in enumerate(blocks):
|
||||
if blk.tag_hi == 0x40:
|
||||
# Segment header — try interpreting bytes [0:2] as new T anchor
|
||||
if len(blk.data) >= 2:
|
||||
new_anchor = int.from_bytes(blk.data[0:2], "big", signed=True)
|
||||
# The next sample IS this anchor value, NOT a delta from cur.
|
||||
T.append(new_anchor)
|
||||
cur = new_anchor
|
||||
elif blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += s4(nib)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += i8(byte)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
# RLE: append NN zero deltas
|
||||
for _ in range(blk.tag_lo):
|
||||
T.append(cur)
|
||||
# 30 NN: skip
|
||||
return T
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
truth_T = [round(v*200) for v in samples["Tran"]]
|
||||
n_truth = len(truth_T)
|
||||
|
||||
decoded = decode_full_tran(body)
|
||||
n = min(len(decoded), n_truth)
|
||||
matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
|
||||
# Find first divergence
|
||||
div_at = -1
|
||||
for i in range(n):
|
||||
if decoded[i] != truth_T[i]:
|
||||
div_at = i
|
||||
break
|
||||
print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
|
||||
if div_at >= 0 and div_at < 30:
|
||||
print(f" truth around div [{max(0,div_at-3)}:{div_at+8}]: {truth_T[max(0,div_at-3):div_at+8]}")
|
||||
print(f" pred around div [{max(0,div_at-3)}:{div_at+8}]: {decoded[max(0,div_at-3):div_at+8]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
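As a reading aid, the per-tag delta rules exercised by the script above can be sketched in isolation. This is a minimal sketch, not the project's decoder. `Block` is a hypothetical stand-in for the objects returned by `walk_body` (only `tag_hi`, `tag_lo`, and `data` are assumed), `apply_blocks` is a name invented here, and the tag meanings are the hypotheses these scripts are testing rather than a confirmed format description.

```python
from collections import namedtuple

# Hypothetical stand-in for the block objects returned by walk_body.
Block = namedtuple("Block", "tag_hi tag_lo data")


def s4(n):
    """Sign-extend a 4-bit value (0..7 -> 0..7, 8..15 -> -8..-1)."""
    return n if n < 8 else n - 16


def i8(b):
    """Sign-extend an 8-bit value."""
    return b if b < 128 else b - 256


def apply_blocks(anchor, blocks):
    """Accumulate deltas from tagged blocks onto a two-sample anchor."""
    samples = list(anchor)
    cur = samples[-1]
    for blk in blocks:
        if blk.tag_hi == 0x10:      # packed signed nibbles, high nibble first
            for byte in blk.data:
                for nib in (byte >> 4, byte & 0xF):
                    cur += s4(nib)
                    samples.append(cur)
        elif blk.tag_hi == 0x20:    # one signed byte per delta
            for byte in blk.data:
                cur += i8(byte)
                samples.append(cur)
        elif blk.tag_hi == 0x00:    # run of tag_lo zero deltas
            samples.extend([cur] * blk.tag_lo)
        # 0x30 and 0x40 blocks are deliberately ignored in this sketch
    return samples


demo = apply_blocks(
    [10, 12],
    [
        Block(0x10, 2, bytes([0x1F])),
        Block(0x20, 2, bytes([0x05, 0xFB])),
        Block(0x00, 3, b""),
    ],
)
print(demo)
```

Keeping the accumulator separate from the block walker makes it easy to feed hand-built blocks like these when checking a single hypothesis.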
@@ -1,28 +0,0 @@
|
||||
"""Dump all blocks in segment 1 of each event with their data."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
blocks = walk_body(body, find_data_start(body))
|
||||
|
||||
# Find segment 1 (between first and second 40 02)
|
||||
seg40_indices = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||
if len(seg40_indices) < 2:
|
||||
print(f"\n{stem}: only {len(seg40_indices)} segment headers found")
|
||||
seg1_blocks = blocks[seg40_indices[0]:] if seg40_indices else []
|
||||
else:
|
||||
seg1_blocks = blocks[seg40_indices[0]:seg40_indices[1]+1]
|
||||
print(f"\n=== {stem} segment 1 ({len(seg1_blocks)} blocks) ===")
|
||||
for b in seg1_blocks[:25]:
|
||||
tag = f"{b.tag_hi:02x}{b.tag_lo:02x}"
|
||||
print(f" off={b.offset:>5} {tag} NN=0x{b.tag_lo:02x}({b.tag_lo:>3}) len={b.length:>3} data={b.data[:16].hex(' ')}{'...' if len(b.data)>16 else ''}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,195 +0,0 @@
|
||||
"""Test 12-bit signed packed deltas hypothesis for 30 NN blocks across all loud events.
|
||||
|
||||
For each 30 NN block in each event, identify what samples it should cover
|
||||
(based on the cumulative delta count up to that point) and compare the
|
||||
truth deltas against various 12-bit packing schemes.
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
CHANNEL_ORDER = ["Vert", "Long", "MicL", "Tran"] # rotation after initial T
|
||||
|
||||
|
||||
def s12(v):
|
||||
"""Sign-extend a 12-bit unsigned value to signed int."""
|
||||
return v if v < 0x800 else v - 0x1000
|
||||
|
||||
|
||||
def unpack_12bit_be(data):
|
||||
"""4 deltas in 6 bytes, BE order: byte[0:1.5], byte[1.5:3], byte[3:4.5], byte[4.5:6]."""
|
||||
# bits 0..47 (MSB-first), split into 4 × 12-bit
|
||||
val = int.from_bytes(data, "big")
|
||||
out = []
|
||||
for i in range(4):
|
||||
d = (val >> (12 * (3 - i))) & 0xFFF
|
||||
out.append(s12(d))
|
||||
return out
|
||||
|
||||
|
||||
def unpack_12bit_le(data):
|
||||
"""4 deltas in 6 bytes, LE order: bytes packed as 2 × 24-bit groups."""
|
||||
out = []
|
||||
# First 3 bytes contain 2 deltas
|
||||
b0, b1, b2 = data[0], data[1], data[2]
|
||||
d0 = b0 | ((b1 & 0x0F) << 8)
|
||||
d1 = (b1 >> 4) | (b2 << 4)
|
||||
out.append(s12(d0))
|
||||
out.append(s12(d1))
|
||||
# Next 3 bytes contain 2 more deltas
|
||||
b3, b4, b5 = data[3], data[4], data[5]
|
||||
d2 = b3 | ((b4 & 0x0F) << 8)
|
||||
d3 = (b4 >> 4) | (b5 << 4)
|
||||
out.append(s12(d2))
|
||||
out.append(s12(d3))
|
||||
return out
|
||||
|
||||
|
||||
def unpack_12bit_be_per_triplet(data):
|
||||
"""4 deltas as 2 triplets of (high4, low8) BE within each 3-byte group."""
|
||||
out = []
|
||||
b0, b1, b2 = data[0], data[1], data[2]
|
||||
d0 = (b0 << 4) | (b1 >> 4)
|
||||
d1 = ((b1 & 0x0F) << 8) | b2
|
||||
out.append(s12(d0))
|
||||
out.append(s12(d1))
|
||||
b3, b4, b5 = data[3], data[4], data[5]
|
||||
d2 = (b3 << 4) | (b4 >> 4)
|
||||
d3 = ((b4 & 0x0F) << 8) | b5
|
||||
out.append(s12(d2))
|
||||
out.append(s12(d3))
|
||||
return out
|
||||
|
||||
|
||||
def truth_deltas_for_block(blocks, block_idx, event_truth, channel):
|
||||
"""For a 30 NN block at block_idx, determine which samples it covers and
|
||||
return (first_sample_index, NN); the caller derives the truth deltas from these.
|
||||
|
||||
Walks through all blocks before block_idx (within the same segment) and
|
||||
counts how many deltas have been emitted for *channel*, starting from the
|
||||
segment's anchor pair.
|
||||
"""
|
||||
# Find the segment header that contains this block.
|
||||
seg_header_idx = None
|
||||
for j in range(block_idx, -1, -1):
|
||||
if blocks[j].tag_hi == 0x40:
|
||||
seg_header_idx = j
|
||||
break
|
||||
if seg_header_idx is None:
|
||||
# block is in the initial T segment; samples count from sample 2.
|
||||
first_sample_in_segment = 2
|
||||
else:
|
||||
# Anchor pair covers samples [N, N+1] for some N. Subsequent deltas
|
||||
# are samples [N+2, N+2+1, ...]. We don't actually need to know N
|
||||
# for this test — just the relative position within the segment.
|
||||
first_sample_in_segment = 2 # anchor=0,1; deltas start at 2
|
||||
|
||||
# Count deltas from segment-data start to block_idx.
|
||||
delta_count = 0
|
||||
start_block = seg_header_idx + 1 if seg_header_idx is not None else 0
|
||||
for j in range(start_block, block_idx):
|
||||
blk = blocks[j]
|
||||
if blk.tag_hi == 0x10:
|
||||
delta_count += blk.tag_lo # NN nibbles = NN deltas
|
||||
elif blk.tag_hi == 0x20:
|
||||
delta_count += blk.tag_lo # NN int8 deltas
|
||||
elif blk.tag_hi == 0x00:
|
||||
delta_count += blk.tag_lo # RLE zero deltas
|
||||
# Now the 30 NN block carries NN deltas.
|
||||
nn = blocks[block_idx].tag_lo
|
||||
# First sample affected: segment first_sample + delta_count.
|
||||
# But we ALSO need to know which segment this is, since the segment maps
|
||||
# to a specific channel and a specific starting absolute sample index.
|
||||
return first_sample_in_segment + delta_count, nn
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
|
||||
"M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
blocks = walk_body(body, find_data_start(body))
|
||||
seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||
|
||||
# Find all 30 NN blocks in DATA section (not trailer).
|
||||
thirty_blocks = []
|
||||
for bi, b in enumerate(blocks):
|
||||
if b.tag_hi != 0x30:
|
||||
continue
|
||||
# Determine which segment this is in
|
||||
seg_num = None
|
||||
for k, hi in enumerate(seg_idx):
|
||||
next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
|
||||
if hi < bi < next_hi:
|
||||
seg_num = k
|
||||
break
|
||||
if seg_num is None and (not seg_idx or bi < seg_idx[0]):
|
||||
seg_num = -1 # initial T segment
|
||||
thirty_blocks.append((bi, b, seg_num))
|
||||
|
||||
if not thirty_blocks:
|
||||
continue
|
||||
|
||||
print(f"\n=== {stem} ===")
|
||||
for bi, b, seg_num in thirty_blocks:
|
||||
# Channel for this segment
|
||||
if seg_num == -1:
|
||||
channel = "Tran"
|
||||
seg_label = "initial T"
|
||||
else:
|
||||
channel = CHANNEL_ORDER[seg_num % 4]
|
||||
seg_label = f"seg {seg_num}"
|
||||
|
||||
# Count deltas before this block within the same segment.
|
||||
seg_header_idx = seg_idx[seg_num] if seg_num >= 0 else -1
|
||||
start_block = seg_header_idx + 1 if seg_header_idx >= 0 else 0
|
||||
delta_count = 0
|
||||
for j in range(start_block, bi):
|
||||
blk = blocks[j]
|
||||
if blk.tag_hi in (0x10, 0x20, 0x00):
|
||||
delta_count += blk.tag_lo
|
||||
|
||||
# First sample this 30 NN block affects (within the segment)
|
||||
# = anchor positions + delta_count + 2 (since anchor pair was samples 0,1)
|
||||
# But the segment's first absolute sample index in the channel is
|
||||
# (seg_num // 4) * 512 (approximately) if segment 0 is the first V seg.
|
||||
cycle = (seg_num // 4) if seg_num >= 0 else 0
|
||||
base = cycle * 512 + 2 # +2 for anchor pair
|
||||
sample_idx = base + delta_count
|
||||
truth_ch = [round(v * 200) for v in samples[channel]]
|
||||
nn = b.tag_lo
|
||||
|
||||
if sample_idx + nn > len(truth_ch):
|
||||
print(f" block @ {b.offset} ({seg_label} {channel}): out of truth range")
|
||||
continue
|
||||
|
||||
# Get the previous sample so we can compute truth deltas
|
||||
if sample_idx == 0:
|
||||
prev = 0
|
||||
else:
|
||||
prev = truth_ch[sample_idx - 1]
|
||||
truth_deltas = []
|
||||
for k in range(nn):
|
||||
truth_deltas.append(truth_ch[sample_idx + k] - (prev if k == 0 else truth_ch[sample_idx + k - 1]))
|
||||
|
||||
# Try each packing
|
||||
schemes = [
|
||||
("12-bit BE contiguous", unpack_12bit_be(b.data)),
|
||||
("12-bit LE per-triplet", unpack_12bit_le(b.data)),
|
||||
("12-bit BE per-triplet", unpack_12bit_be_per_triplet(b.data)),
|
||||
]
|
||||
print(f" block @ {b.offset:>5} ({seg_label} {channel}, samples {sample_idx}..{sample_idx+nn-1}):")
|
||||
print(f" data: {b.data.hex(' ')}")
|
||||
print(f" truth: {truth_deltas}")
|
||||
for name, pred in schemes:
|
||||
match = "✓" if pred == truth_deltas else " "
|
||||
n_match = sum(1 for x, y in zip(pred, truth_deltas) if x == y)
|
||||
print(f" {match}{n_match}/4 {name}: {pred}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
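One observation about the schemes tried in the script above: for a 6-byte payload, "12-bit BE contiguous" and "12-bit BE per-triplet" pick out the same bit fields, so they are one hypothesis rather than two. The sketch below re-implements both locally (the function names are illustrative, not taken from `minimateplus`) and checks that they agree on random payloads.

```python
import random


def s12(v):
    """Sign-extend a 12-bit value."""
    return v if v < 0x800 else v - 0x1000


def unpack_be_contiguous(data):
    """Split 48 bits, MSB first, into four 12-bit signed fields."""
    val = int.from_bytes(data, "big")
    return [s12((val >> (12 * (3 - i))) & 0xFFF) for i in range(4)]


def unpack_be_per_triplet(data):
    """Two 12-bit signed fields per 3-byte group, high bits first."""
    out = []
    for i in (0, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        out.append(s12((b0 << 4) | (b1 >> 4)))
        out.append(s12(((b1 & 0x0F) << 8) | b2))
    return out


rng = random.Random(0)
trials = [bytes(rng.randrange(256) for _ in range(6)) for _ in range(1000)]
all_equal = all(unpack_be_contiguous(t) == unpack_be_per_triplet(t) for t in trials)
example = unpack_be_contiguous(bytes.fromhex("001fff7ff800"))
print(all_equal, example)
```

Only the little-endian per-triplet variant is a genuinely different packing, so a match on one big-endian scheme implies a match on the other.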
@@ -1,132 +0,0 @@
|
||||
"""Test the '30 NN data = high-nibbles + int8 low-bytes' hypothesis.
|
||||
|
||||
Layout for `30 04` (6 data bytes, 4 deltas):
|
||||
bytes [0:2] = 16 bits = 4 × 4-bit high-nibbles (MSB first)
|
||||
bytes [2:6] = 4 × int8 low bytes
|
||||
Each delta = 12-bit signed = sign-extend((high_nibble << 8) | low_byte)
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def sign_extend_12(v):
|
||||
return v if v < 0x800 else v - 0x1000
|
||||
|
||||
|
||||
def decode_30nn(data):
|
||||
"""4 × 12-bit signed deltas (high nibble + low byte).
|
||||
bytes[0:2] hold the 4 high nibbles (MSB first); bytes[2:6] hold the low bytes.
|
||||
"""
|
||||
if len(data) < 6:
|
||||
return []
|
||||
# Read high nibbles from bytes 0-1 (4 nibbles MSB-first)
|
||||
high_word = (data[0] << 8) | data[1]
|
||||
high_nibbles = [
|
||||
(high_word >> 12) & 0xF,
|
||||
(high_word >> 8) & 0xF,
|
||||
(high_word >> 4) & 0xF,
|
||||
high_word & 0xF,
|
||||
]
|
||||
out = []
|
||||
for i in range(4):
|
||||
v = (high_nibbles[i] << 8) | data[2 + i]
|
||||
out.append(sign_extend_12(v))
|
||||
return out
|
||||
|
||||
|
||||
def simulate_up_to(blocks, target_block_idx, t_preamble):
|
||||
"""Run decoder up to block_idx; return per-channel sample lists.
|
||||
NOW with 30 NN decoded too."""
|
||||
out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||
out["Tran"].extend(t_preamble)
|
||||
cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
|
||||
rotation = ["Vert", "Long", "MicL", "Tran"]
|
||||
current_channel = "Tran"
|
||||
seg_counter = -1
|
||||
for j in range(target_block_idx):
|
||||
blk = blocks[j]
|
||||
if blk.tag_hi == 0x40:
|
||||
seg_counter += 1
|
||||
prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
|
||||
new_ch = rotation[seg_counter % 4]
|
||||
if cur[prev] is not None:
|
||||
d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
|
||||
d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
|
||||
cur[prev] += d0; out[prev].append(cur[prev])
|
||||
cur[prev] += d1; out[prev].append(cur[prev])
|
||||
c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
|
||||
c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
|
||||
out[new_ch].extend([c0, c1])
|
||||
cur[new_ch] = c1
|
||||
current_channel = new_ch
|
||||
elif blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur[current_channel] += s4(nib)
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur[current_channel] += i8(byte)
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x30:
|
||||
# NEW: decode 30 NN
|
||||
deltas = decode_30nn(blk.data)
|
||||
for d in deltas:
|
||||
cur[current_channel] += d
|
||||
out[current_channel].append(cur[current_channel])
|
||||
return out, current_channel
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
|
||||
"M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
blocks = walk_body(body, find_data_start(body))
|
||||
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
|
||||
if not thirty_blocks:
|
||||
continue
|
||||
print(f"\n=== {stem} ===")
|
||||
for j, blk in thirty_blocks:
|
||||
pred, ch = simulate_up_to(blocks, j, [t0, t1])
|
||||
cur_before = pred[ch][-1]
|
||||
truth = [round(v * 200) for v in samples[ch]]
|
||||
n_pred = len(pred[ch])
|
||||
nn = blk.tag_lo
|
||||
if n_pred + nn > len(truth):
|
||||
continue
|
||||
# Decode this 30 NN block with hypothesis
|
||||
pred_deltas = decode_30nn(blk.data)
|
||||
# Compute truth deltas relative to cur_before
|
||||
truth_deltas = []
|
||||
prev = cur_before
|
||||
for k in range(nn):
|
||||
truth_deltas.append(truth[n_pred + k] - prev)
|
||||
prev = truth[n_pred + k]
|
||||
n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
|
||||
tag = "✓" if pred_deltas == truth_deltas else " "
|
||||
print(f" block @ {blk.offset:>5} (chan={ch}, NN={nn}):")
|
||||
print(f" data: {blk.data.hex(' ')}")
|
||||
print(f" truth: {truth_deltas}")
|
||||
print(f" pred: {pred_deltas} {tag}{n_match}/{nn}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
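To make the byte layout in the docstring above concrete, here is a small round-trip sketch. `unpack_30_04` follows the same high-nibbles-then-low-bytes hypothesis. `pack_30_04` is a hypothetical inverse written only for illustration and is not claimed to match how the device encodes data.

```python
def sign_extend_12(v):
    """Sign-extend a 12-bit value."""
    return v if v < 0x800 else v - 0x1000


def unpack_30_04(data):
    """Decode 4 deltas under the high-nibbles + low-bytes hypothesis."""
    high_word = (data[0] << 8) | data[1]
    return [
        sign_extend_12((((high_word >> (12 - 4 * i)) & 0xF) << 8) | data[2 + i])
        for i in range(4)
    ]


def pack_30_04(deltas):
    """Hypothetical inverse of unpack_30_04, for illustration only."""
    high_word = 0
    lows = []
    for d in deltas:
        v = d & 0xFFF                       # two's-complement 12-bit field
        high_word = (high_word << 4) | (v >> 8)
        lows.append(v & 0xFF)
    return high_word.to_bytes(2, "big") + bytes(lows)


payload = pack_30_04([5, -1, 384, 0])
decoded = unpack_30_04(payload)
print(payload.hex(" "), decoded)
```

Hand-built payloads like this are useful for checking nibble order before comparing against captured events.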
@@ -1,141 +0,0 @@
|
||||
"""Test 30 NN packing by running the real decoder up to each 30 NN block,
|
||||
recording how many samples have been produced for each channel at that point,
|
||||
then checking truth deltas immediately after."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def s12(v):
|
||||
return v if v < 0x800 else v - 0x1000
|
||||
|
||||
|
||||
def unpack_12bit_be_contiguous(data):
|
||||
out = []
|
||||
val = int.from_bytes(data, "big")
|
||||
n = len(data) * 8 // 12
|
||||
for i in range(n):
|
||||
d = (val >> (12 * (n - 1 - i))) & 0xFFF
|
||||
out.append(s12(d))
|
||||
return out
|
||||
|
||||
|
||||
def unpack_12bit_per_triplet_be(data):
|
||||
out = []
|
||||
for i in range(0, len(data), 3):
|
||||
if i + 2 >= len(data):
|
||||
break
|
||||
b0, b1, b2 = data[i], data[i + 1], data[i + 2]
|
||||
d0 = (b0 << 4) | (b1 >> 4)
|
||||
d1 = ((b1 & 0x0F) << 8) | b2
|
||||
out.append(s12(d0))
|
||||
out.append(s12(d1))
|
||||
return out
|
||||
|
||||
|
||||
def simulate_up_to(blocks, target_block_idx, t_preamble):
|
||||
"""Run the decoder up to block_idx; return per-channel sample lists."""
|
||||
out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||
out["Tran"].extend(t_preamble)
|
||||
cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
|
||||
rotation = ["Vert", "Long", "MicL", "Tran"]
|
||||
seg_idx = [j for j, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||
|
||||
# Determine which channel we're CURRENTLY decoding into
|
||||
current_channel = "Tran"
|
||||
seg_counter = -1 # incremented at each 40 02
|
||||
|
||||
for j in range(target_block_idx):
|
||||
blk = blocks[j]
|
||||
if blk.tag_hi == 0x40:
|
||||
# Switch: extend prev channel, set up new channel
|
||||
seg_counter += 1
|
||||
prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
|
||||
new_ch = rotation[seg_counter % 4]
|
||||
if cur[prev] is not None:
|
||||
d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
|
||||
d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
|
||||
cur[prev] += d0; out[prev].append(cur[prev])
|
||||
cur[prev] += d1; out[prev].append(cur[prev])
|
||||
c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
|
||||
c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
|
||||
out[new_ch].extend([c0, c1])
|
||||
cur[new_ch] = c1
|
||||
current_channel = new_ch
|
||||
elif blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur[current_channel] += s4(nib)
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur[current_channel] += i8(byte)
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
out[current_channel].append(cur[current_channel])
|
||||
elif blk.tag_hi == 0x30:
|
||||
# Skip for now — we want to know what comes next
|
||||
pass
|
||||
|
||||
return out, current_channel
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
|
||||
"M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
blocks = walk_body(body, find_data_start(body))
|
||||
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
# Find all 30 NN blocks in data section
|
||||
thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
|
||||
if not thirty_blocks:
|
||||
continue
|
||||
|
||||
print(f"\n=== {stem} ===")
|
||||
for j, blk in thirty_blocks:
|
||||
pred, ch = simulate_up_to(blocks, j, [t0, t1])
|
||||
n_pred = len(pred[ch])
|
||||
# The 30 NN block carries NN deltas for channel `ch` starting at sample n_pred
|
||||
truth = [round(v * 200) for v in samples[ch]]
|
||||
if n_pred >= len(truth):
|
||||
continue
|
||||
# Truth deltas: truth[n_pred] - cur, truth[n_pred+1] - truth[n_pred], ...
|
||||
cur_val = pred[ch][-1]
|
||||
nn = blk.tag_lo
|
||||
truth_deltas = []
|
||||
prev = cur_val
|
||||
for k in range(min(nn, len(truth) - n_pred)):
|
||||
truth_deltas.append(truth[n_pred + k] - prev)
|
||||
prev = truth[n_pred + k]
|
||||
|
||||
print(f" block @ {blk.offset:>5} (chan={ch}, after sample {n_pred-1}, "
|
||||
f"NN={nn}, last_val={cur_val}):")
|
||||
print(f" data: {blk.data.hex(' ')}")
|
||||
print(f" truth: {truth_deltas}")
|
||||
schemes = [
|
||||
("12-bit BE contiguous", unpack_12bit_be_contiguous(blk.data)),
|
||||
("12-bit per-triplet BE", unpack_12bit_per_triplet_be(blk.data)),
|
||||
]
|
||||
for name, pred_deltas in schemes:
|
||||
n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
|
||||
tag = "✓" if pred_deltas == truth_deltas else " "
|
||||
print(f" {tag}{n_match}/{nn} {name}: {pred_deltas[:nn]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
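The `s4`, `i8`, and `s12` helpers repeated across these scripts are all the same two's-complement sign extension at different widths. A possible consolidation is sketched below. `sign_extend` is a name introduced here and is not part of `minimateplus`.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


# The three widths used by the analysis scripts.
def s4(n):
    return sign_extend(n, 4)


def i8(b):
    return sign_extend(b, 8)


def s12(v):
    return sign_extend(v, 12)


print(s4(0xF), i8(0x80), s12(0x7FF), s12(0x800))
```

Subtracting the sign bit's weight avoids a branch and works for any width, which helps when trying new packing hypotheses.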
@@ -1,86 +0,0 @@
|
||||
"""Test: 00 NN markers might be RLE for zero-deltas in current channel."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def decode_with_rle(body):
|
||||
"""Decode Tran assuming:
|
||||
- preamble[3:5], [5:7] = T[0], T[1]
|
||||
- All 10 NN / 20 NN blocks until segment_header (40 02) are Tran deltas
|
||||
- 00 NN markers are RLE: NN/4 zero T deltas (or NN, or NN/2 — try them)
|
||||
"""
|
||||
if len(body) < 9 or body[0:3] != b"\x00\x02\x00":
|
||||
return None, None, None
|
||||
T0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
T1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
# Find first tag (might be 00 NN, 10 NN, or 20 NN)
|
||||
i = 7
|
||||
while i + 1 < len(body):
|
||||
if body[i] in (0x00, 0x10, 0x20):
|
||||
break
|
||||
i += 1
|
||||
start = i
|
||||
|
||||
blocks = walk_body(body, start)
|
||||
|
||||
results = {}
|
||||
for rle_div in (4, 2, 1): # try different RLE interpretations
|
||||
T = [T0, T1]
|
||||
cur = T1
|
||||
for blk in blocks:
|
||||
if blk.tag_hi == 0x40:
|
||||
break
|
||||
if blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += s4(nib)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += i8(byte)
|
||||
T.append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
# RLE of zero deltas
|
||||
n_zeros = blk.tag_lo // rle_div
|
||||
for _ in range(n_zeros):
|
||||
T.append(cur)
|
||||
# 30 NN: skip for now
|
||||
results[rle_div] = T
|
||||
return results, T0, T1
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
truth_T = [round(v*200) for v in samples["Tran"]]
|
||||
|
||||
results, T0, T1 = decode_with_rle(body)
|
||||
print(f"\n=== {stem} (T[0]={T0}, T[1]={T1}) ===")
|
||||
for rle_div, T in results.items():
|
||||
n = min(len(T), len(truth_T))
|
||||
matches = sum(1 for i in range(n) if T[i] == truth_T[i])
|
||||
# Find first divergence
|
||||
div_at = -1
|
||||
for i in range(n):
|
||||
if T[i] != truth_T[i]:
|
||||
div_at = i
|
||||
break
|
||||
print(f" rle_div={rle_div}: decoded {len(T)}, matches {matches}/{n}, first div at sample {div_at}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,71 +0,0 @@
|
||||
"""Test: does the second '20 NN' block in SS0 continue Tran samples?"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def main():
|
||||
stem = "M529LL1A.SS0"
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
truth_T_16 = [round(v * 200) for v in samples["Tran"]]
|
||||
|
||||
# Preamble
|
||||
T0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
T1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
# Walk blocks
|
||||
start = find_data_start(body)
|
||||
blocks = walk_body(body, start)
|
||||
|
||||
print(f"=== {stem} === T[0]={T0} T[1]={T1}")
|
||||
|
||||
# Hypothesis: Tran continues through ALL 10 NN and 20 NN blocks
|
||||
# in order, until the next 40 02 segment header (which resets).
|
||||
T = [T0, T1]
|
||||
cur = T1
|
||||
decoded_count = 2 # T[0], T[1] from preamble
|
||||
for bi, blk in enumerate(blocks):
|
||||
if blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += s4(nib)
|
||||
T.append(cur)
|
||||
decoded_count += 1
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += i8(byte)
|
||||
T.append(cur)
|
||||
decoded_count += 1
|
||||
elif blk.tag_hi == 0x40:
|
||||
# Segment header — stop here for this test
|
||||
break
|
||||
# 00 and 30 NN don't contribute to Tran (in this hypothesis)
|
||||
|
||||
# Compare to truth
|
||||
print(f" Decoded {len(T)} T samples up to first 40 02")
|
||||
matches = sum(1 for i in range(min(len(T), len(truth_T_16))) if T[i] == truth_T_16[i])
|
||||
print(f" Matches in first {min(len(T), len(truth_T_16))}: {matches}")
|
||||
# Print first divergence
|
||||
for i in range(min(len(T), len(truth_T_16))):
|
||||
if T[i] != truth_T_16[i]:
|
||||
print(f" First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
|
||||
# Show context
|
||||
print(f" pred [{max(0,i-3)}:{i+5}]: {T[max(0,i-3):i+5]}")
|
||||
print(f" truth [{max(0,i-3)}:{i+5}]: {truth_T_16[max(0,i-3):i+5]}")
|
||||
break
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
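The preamble handling that these scripts repeat inline (`body[3:5]` and `body[5:7]` read as big-endian signed 16-bit values after a `00 02 00` prefix) could be factored out roughly as follows. This is a sketch under the scripts' own hypothesis. `parse_preamble` is a name introduced here, and the 7-byte minimum length is an assumption chosen to cover just the bytes read.

```python
def parse_preamble(body):
    """Return (T0, T1) from the body preamble, or None if it looks wrong."""
    # Assumed layout: 3 magic bytes, then two big-endian signed int16 anchors.
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    return t0, t1


good = parse_preamble(b"\x00\x02\x00\x00\x05\xff\xfd")
bad = parse_preamble(b"\x01\x02\x00\x00\x05\xff\xfd")
print(good, bad)
```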
@@ -1,67 +0,0 @@
|
||||
"""Try various nibble-level channel interleavings to find which one matches truth."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def run_decoder(body, layout, skip, n_channels=4):
|
||||
"""layout: function nibble_index -> channel_index. Returns list-of-lists per channel."""
|
||||
out = [[] for _ in range(n_channels)]
|
||||
cur = [0] * n_channels
|
||||
nibbles = []
|
||||
for byte in body[skip:]:
|
||||
nibbles.append((byte >> 4) & 0xF)
|
||||
nibbles.append(byte & 0xF)
|
||||
for i, n in enumerate(nibbles):
|
||||
ch = layout(i)
|
||||
cur[ch] += s4(n)
|
||||
out[ch].append(cur[ch])
|
||||
return out
|
||||
|
||||
|
||||
def cmp(pred, truth, n=24):
|
||||
n = min(n, len(pred), len(truth))
|
||||
return [(pred[i], truth[i]) for i in range(n)]
|
||||
|
||||
|
||||
def main():
|
||||
b = load_bundle("event-c")
|
||||
truth_T = [round(v * 200) for v in b.samples["Tran"]]
|
||||
truth_V = [round(v * 200) for v in b.samples["Vert"]]
|
||||
truth_L = [round(v * 200) for v in b.samples["Long"]]
|
||||
print(f"T truth[0:10]: {truth_T[:10]}")
|
||||
print(f"V truth[0:10]: {truth_V[:10]}")
|
||||
print(f"L truth[0:10]: {truth_L[:10]}")
|
||||
|
||||
# Try several nibble->channel layouts (4 channels)
|
||||
layouts = {
|
||||
"interleaved TVLM (0,1,2,3,0,1,2,3,...)": lambda i: i % 4,
|
||||
"interleaved VLMT": lambda i: (i + 3) % 4,
|
||||
"interleaved LMTV": lambda i: (i + 2) % 4,
|
||||
"interleaved MTVL": lambda i: (i + 1) % 4,
|
||||
"byte-based TV LM TV LM (high T low V byte0; high L low M byte1)": lambda i: i % 4,
|
||||
# "chunks of 8 nibbles per channel": each channel gets 8 nibbles in a row
|
||||
"chunks-8 TVLM": lambda i: (i // 8) % 4,
|
||||
"chunks-16 TVLM": lambda i: (i // 16) % 4,
|
||||
# planar (full channel sequential)
|
||||
"planar T(0..N) V(N..2N) L(2N..3N) M(3N..4N)": None, # special
|
||||
}
|
||||
|
||||
for label, layout_fn in layouts.items():
|
||||
if layout_fn is None:
|
||||
continue
|
||||
for skip in (0, 4, 7, 8, 9, 11, 14):
|
||||
out = run_decoder(b.body, layout_fn, skip)
|
||||
# Check first 8 cumulative on each channel
|
||||
print(f" skip={skip:2} {label}")
|
||||
print(f" T_cum[0:10]: {out[0][:10]}")
|
||||
print(f" V_cum[0:10]: {out[1][:10]}")
|
||||
print(f" L_cum[0:10]: {out[2][:10]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,73 +0,0 @@
|
||||
"""Try decoding body as 4-bit signed nibble deltas, 4-channel round-robin."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
CHANNELS = ("Tran", "Vert", "Long", "MicL")
|
||||
|
||||
|
||||
def s4(n):
|
||||
"""Sign-extend a 4-bit unsigned to int (0..7 → 0..7, 8..F → -8..-1)."""
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def decode_nibbles(body: bytes, skip_bytes: int = 7, n_channels: int = 4):
|
||||
"""Read body as 2 nibbles per byte; accumulate as deltas for n_channels round-robin."""
|
||||
out = [[] for _ in range(n_channels)]
|
||||
cur = [0] * n_channels
|
||||
ch = 0
|
||||
nibbles = []
|
||||
for byte in body[skip_bytes:]:
|
||||
nibbles.append((byte >> 4) & 0xF)
|
||||
nibbles.append(byte & 0xF)
|
||||
for n in nibbles:
|
||||
cur[ch] += s4(n)
|
||||
out[ch].append(cur[ch])
|
||||
ch = (ch + 1) % n_channels
|
||||
return out
|
||||
|
||||
|
||||
def cmp_to_truth(pred, truth, scale=16):
|
||||
"""Compare predicted ints (in 16-count units) to truth (in 16-count units = txt * 200).
|
||||
Return (max_abs_err, mean_abs_err, n_compared).
|
||||
"""
|
||||
n = min(len(pred), len(truth))
|
||||
errs = []
|
||||
for i in range(n):
|
||||
p = pred[i]
|
||||
t = truth[i]
|
||||
errs.append(abs(p - t))
|
||||
if not errs:
|
||||
return None
|
||||
return (max(errs), sum(errs) / len(errs), n)
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-a", "event-c"):
|
||||
b = load_bundle(name)
|
||||
# Convert TXT samples (in/s) to 16-count units (multiply by 200, since 0.005 in/s = 1)
|
||||
# WAIT: 0.005 in/s ≈ 16 ADC counts. 1 count ≈ 0.000305 in/s.
|
||||
# So in 1-count units: count = txt * (1/0.0003052) ≈ txt * 3276.7
|
||||
# But TXT only has 0.005 resolution so equivalent to 16-count units = txt * 200.
|
||||
truth_in_16 = {ch: [round(v * 200) for v in b.samples[ch]] for ch in CHANNELS[:3]}
|
||||
# MicL is in dB, skip for now
|
||||
|
||||
# Try decoder with skip_bytes = 7
|
||||
decoded = decode_nibbles(b.body, skip_bytes=7, n_channels=4)
|
||||
print(f"\n=== {name} ===")
|
||||
print(f" body={len(b.body)}, nibbles={2*(len(b.body)-7)}, samples_per_ch={len(decoded[0])}")
|
||||
print(f" truth samples per ch: {len(truth_in_16['Tran'])}")
|
||||
# Print first 24 of each
|
||||
for i, chan in enumerate(CHANNELS):
|
||||
pred_first = decoded[i][:24]
|
||||
if chan in truth_in_16:
|
||||
truth_first = truth_in_16[chan][:24]
|
||||
print(f" {chan} pred: {pred_first}")
|
||||
print(f" {chan} truth: {truth_first}")
|
||||
else:
|
||||
print(f" {chan} pred: {pred_first} (truth in dB, skipped)")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
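The truth conversion used throughout these scripts (`round(v * 200)`) maps TXT values at 0.005 in/s resolution onto integer steps. A minimal sketch of that scaling follows. `to_units` is a name introduced here, and the sample values are made up for illustration rather than taken from any fixture.

```python
def to_units(values_in_per_s):
    """Convert TXT velocities (in/s, 0.005 resolution) to integer 0.005-steps."""
    # round() absorbs floating-point noise such as 0.015 * 200 not being exactly 3.
    return [round(v * 200) for v in values_in_per_s]


units = to_units([0.0, 0.005, -0.01, 0.125])
print(units)
```

Rounding to integers before comparison is what allows the scripts to test decoded samples against truth with exact equality.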
@@ -1,32 +0,0 @@
|
||||
"""Verify decode_waveform_v2 against BW ASCII truth for all fixtures."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import decode_waveform_v2
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
|
||||
"M529LL1L.JQ0", "M529LL1L.V70"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
body = f.read()[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
decoded = decode_waveform_v2(body)
|
||||
if decoded is None:
|
||||
print(f"{stem}: decoder returned None")
|
||||
continue
|
||||
|
||||
print(f"\n=== {stem} ===")
|
||||
for ch in ("Tran", "Vert", "Long"):
|
||||
truth = [round(v * 200) for v in samples[ch]]
|
||||
pred = decoded[ch]
|
||||
n = min(len(pred), len(truth))
|
||||
matches = sum(1 for i in range(n) if pred[i] == truth[i])
|
||||
div = next((i for i in range(n) if pred[i] != truth[i]), -1)
|
||||
print(f" {ch}: decoded={len(pred):>5} truth={len(truth):>5} "
|
||||
f"matches={matches:>5}/{n:<5} first div={div}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
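The match count and first-divergence logic appears in several of these scripts, sometimes as an explicit loop and sometimes via `next(...)`. A shared helper could look like the sketch below. `compare` is a hypothetical name, and the return shape is an assumption chosen to mirror what the scripts print.

```python
def compare(pred, truth):
    """Return (matches, n_compared, first_divergence_index or -1)."""
    n = min(len(pred), len(truth))
    matches = sum(1 for p, t in zip(pred, truth) if p == t)
    first_div = next((i for i in range(n) if pred[i] != truth[i]), -1)
    return matches, n, first_div


result_diverged = compare([1, 2, 3, 9], [1, 2, 4])
result_clean = compare([1, 2], [1, 2, 3])
print(result_diverged, result_clean)
```

Only the overlapping prefix is compared, so a length mismatch alone does not count as a divergence. The scripts report the decoded and truth lengths separately for that reason.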
@@ -1,55 +0,0 @@
"""Run decode_waveform_v2 against the 5-8-26 quiet bundle to test the
'quiet events should decode fully' hypothesis."""
import os, sys
sys.path.insert(0, ".")
from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
from analysis.load_bundle import _parse_txt


def main():
    base = "tests/fixtures/decode-re-5-8-26"
    for evt in sorted(os.listdir(base)):
        folder = os.path.join(base, evt)
        if not os.path.isdir(folder):
            continue
        # Find the binary (not .TXT)
        bin_name = next(
            (f for f in os.listdir(folder) if not f.endswith(".TXT")),
            None,
        )
        if not bin_name:
            continue
        bin_path = os.path.join(folder, bin_name)
        txt_path = bin_path + ".TXT"
        if not os.path.exists(txt_path):
            # Sometimes the TXT name differs slightly
            for f in os.listdir(folder):
                if f.endswith(".TXT"):
                    txt_path = os.path.join(folder, f)
                    break
        with open(bin_path, "rb") as f:
            body = f.read()[43:-26]
        decoded = decode_waveform_v2(body)
        _, samples = _parse_txt(txt_path)

        # Count 30 NN blocks
        blocks = walk_body(body, find_data_start(body))
        n_30 = sum(1 for b in blocks if b.tag_hi == 0x30)
        n_40 = sum(1 for b in blocks if b.tag_hi == 0x40)

        print(f"\n=== {evt} === body={len(body)} segments={n_40} '30 NN' blocks={n_30}")
        if decoded is None:
            print(" decoder returned None")
            continue
        for ch in ("Tran", "Vert", "Long"):
            truth = [round(v * 200) for v in samples[ch]]
            pred = decoded[ch]
            n = min(len(pred), len(truth))
            matches = sum(1 for i in range(n) if pred[i] == truth[i])
            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
            print(f" {ch}: decoded={len(pred):>5} truth={len(truth):>5} "
                  f"matches={matches:>5}/{n:<5} first div={div}")


if __name__ == "__main__":
    main()
@@ -1,71 +0,0 @@
|
||||
"""Verify: preamble[3:7] = Tran[0], Tran[1] as int16 BE in 16-count units.
|
||||
And first 20/10 NN block = Tran deltas starting at sample 2.
|
||||
"""
|
||||
import os, sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import _parse_txt
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start
|
||||
|
||||
|
||||
def s4(n):
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b):
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def main():
|
||||
for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
|
||||
path = f"tests/fixtures/5-11-26/{stem}"
|
||||
with open(path, "rb") as f:
|
||||
raw = f.read()
|
||||
body = raw[43:-26]
|
||||
_, samples = _parse_txt(path + ".TXT")
|
||||
truth_T_16 = [round(v * 200) for v in samples["Tran"]]
|
||||
|
||||
# Preamble parse
|
||||
T0_pre = int.from_bytes(body[3:5], "big", signed=True)
|
||||
T1_pre = int.from_bytes(body[5:7], "big", signed=True)
|
||||
print(f"\n=== {stem} ===")
|
||||
print(f" Preamble T[0]={T0_pre} (truth {truth_T_16[0]}) T[1]={T1_pre} (truth {truth_T_16[1]}) match={T0_pre==truth_T_16[0] and T1_pre==truth_T_16[1]}")
|
||||
|
||||
# First block
|
||||
start = find_data_start(body)
|
||||
blocks = walk_body(body, start)
|
||||
if not blocks:
|
||||
print(f" no blocks found")
|
||||
continue
|
||||
|
||||
# Assume first block = Tran deltas from sample 2
|
||||
first = blocks[0]
|
||||
T = [T0_pre, T1_pre]
|
||||
cur_T = T1_pre
|
||||
if first.tag_hi == 0x10:
|
||||
# Nibble pairs
|
||||
for byte in first.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur_T += s4(nib)
|
||||
T.append(cur_T)
|
||||
elif first.tag_hi == 0x20:
|
||||
# int8 per byte
|
||||
for byte in first.data:
|
||||
cur_T += i8(byte)
|
||||
T.append(cur_T)
|
||||
|
||||
# Compare against truth
|
||||
n_check = min(len(T), len(truth_T_16))
|
||||
match_count = sum(1 for i in range(n_check) if T[i] == truth_T_16[i])
|
||||
print(f" First block type=0x{first.tag_hi:02x} NN=0x{first.tag_lo:02x} len={len(first.data)} → {len(T)} T samples decoded")
|
||||
print(f" Tran predicted[0:10]: {T[:10]}")
|
||||
print(f" Tran truth [0:10]: {truth_T_16[:10]}")
|
||||
print(f" Matches in first {n_check}: {match_count} / {n_check}")
|
||||
# Show where it diverges
|
||||
for i in range(n_check):
|
||||
if T[i] != truth_T_16[i]:
|
||||
print(f" First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
|
||||
break
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
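The two delta encodings that the verification script above tests (signed 4-bit nibbles for `10 NN` blocks, signed bytes for `20 NN` blocks) can be tried on made-up bytes without a fixture file. This is only a sketch under the same hypothesis as the script: `decode_nibble_deltas`, `decode_int8_deltas`, and the `preamble` / payload bytes are hypothetical names and values chosen for illustration, not data from a real event.

```python
def s4(n: int) -> int:
    """Interpret a 4-bit nibble as a signed two's-complement value (-8..7)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Interpret a byte as a signed two's-complement value (-128..127)."""
    return b if b < 128 else b - 256


def decode_nibble_deltas(seed, payload):
    """Extend `seed` with one sample per nibble, high nibble first."""
    out = list(seed)
    cur = out[-1]
    for byte in payload:
        for nib in ((byte >> 4) & 0xF, byte & 0xF):
            cur += s4(nib)
            out.append(cur)
    return out


def decode_int8_deltas(seed, payload):
    """Extend `seed` with one sample per byte, each byte a signed delta."""
    out = list(seed)
    cur = out[-1]
    for byte in payload:
        cur += i8(byte)
        out.append(cur)
    return out


# Hypothetical 4-byte preamble slice: two big-endian int16 seed samples.
preamble = bytes([0xFF, 0xFE, 0x00, 0x0C])
t0 = int.from_bytes(preamble[0:2], "big", signed=True)
t1 = int.from_bytes(preamble[2:4], "big", signed=True)

nibble_samples = decode_nibble_deltas([t0, t1], bytes([0x1F, 0x80]))
int8_samples = decode_int8_deltas([t0, t1], bytes([0x05, 0xFB]))
print(nibble_samples)
print(int8_samples)
```

Both decoders start from the second seed sample, which mirrors the script's assumption that the first block carries deltas beginning at sample 2.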
@@ -1,20 +0,0 @@
"""Walk blocks of the new 5-11-26 events and look at what comes after Tran block."""
import sys
sys.path.insert(0, ".")
from minimateplus.waveform_codec import walk_body, find_data_start


def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        with open(f"tests/fixtures/5-11-26/{stem}", "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        start = find_data_start(body)
        blocks = walk_body(body, start)
        print(f"\n=== {stem} === body={len(body)} start={start} blocks walked={len(blocks)}")
        for i, b in enumerate(blocks[:20]):
            print(f" block[{i:>2}] @ {b.offset:>5} tag={b.tag_hi:02x} NN=0x{b.tag_lo:02x}({b.tag_lo}) len={b.length} data[:24]={b.data[:24].hex(' ')}")


if __name__ == "__main__":
    main()
@@ -1,44 +0,0 @@
|
||||
"""Walk the body assuming chunks delimited by 0x10 NN tags. Print each chunk's structure."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def walk(body: bytes, start_offset: int = 7, max_chunks: int = 30):
|
||||
"""Return every offset where a 0x10 byte is followed by a multiple-of-4 byte."""
|
||||
chunks = []
|
||||
i = start_offset
|
||||
while i < len(body) - 1:
|
||||
# Find next `10 NN` where NN is multiple of 4 (and not preceded by another 0x10 immediately, which would be data).
|
||||
if body[i] == 0x10 and (body[i+1] % 4 == 0):
|
||||
chunks.append(i)
|
||||
i += 1
|
||||
return chunks
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
positions = []
|
||||
i = 7 # skip 7-byte preamble
|
||||
while i < len(body) - 1:
|
||||
if body[i] == 0x10 and body[i+1] % 4 == 0 and body[i+1] > 0:
|
||||
positions.append(i)
|
||||
i += 2 # skip past tag
|
||||
else:
|
||||
i += 1
|
||||
print(f"\n=== {name} === body={len(body)}, total `10 NN` (NN%4==0, NN>0) tags: {len(positions)}")
|
||||
# Print first 30 chunks: show position, NN, gap to next tag
|
||||
for k in range(min(30, len(positions))):
|
||||
pos = positions[k]
|
||||
NN = body[pos + 1]
|
||||
next_pos = positions[k+1] if k+1 < len(positions) else len(body)
|
||||
gap = next_pos - pos
|
||||
data_bytes = body[pos+2 : next_pos]
|
||||
print(f" chunk[{k:>3}] @ {pos:>5} NN=0x{NN:02x} ({NN:>3}, NN/2={NN//2}) gap={gap:>3} "
|
||||
f"data={data_bytes[:24].hex(' ')}{'...' if len(data_bytes) > 24 else ''}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,50 +0,0 @@
|
||||
"""Deterministic chunk walker: each chunk = [10 NN][NN/2 bytes data][2 bytes trailer]."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def walk_chunks(body: bytes, start: int = 7):
|
||||
"""Yield (offset, NN, data_bytes, trailer_bytes) tuples."""
|
||||
i = start
|
||||
while i + 1 < len(body):
|
||||
if body[i] != 0x10:
|
||||
break
|
||||
NN = body[i + 1]
|
||||
if NN == 0 or NN > 0x80 or NN % 4 != 0:
|
||||
break
|
||||
chunk_len = NN // 2 + 4
|
||||
if i + chunk_len > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + 2 + NN // 2])
|
||||
trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
|
||||
yield (i, NN, data, trailer)
|
||||
i += chunk_len
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d", "event-a", "event-b"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
chunks = list(walk_chunks(body))
|
||||
print(f"\n=== {name} === body={len(body)} N_samples={len(b.samples['Tran'])}")
|
||||
print(f" chunks parsed: {len(chunks)}")
|
||||
if chunks:
|
||||
last = chunks[-1]
|
||||
end_of_walk = last[0] + last[1] // 2 + 4
|
||||
print(f" walk ended at offset {end_of_walk} (= {len(body) - end_of_walk} bytes from end)")
|
||||
# Stats
|
||||
total_data_bytes = sum(len(c[2]) for c in chunks)
|
||||
print(f" total data bytes: {total_data_bytes}, total nibbles: {2*total_data_bytes}")
|
||||
if name in ("event-c", "event-d"):
|
||||
ratio = (2 * total_data_bytes) / (len(b.samples['Tran']) * 4)
|
||||
print(f" nibbles per (sample × channel): {ratio:.3f}")
|
||||
# Second (last) trailer byte of each chunk; a list, not a sum, and currently unused
|
||||
trailer_sums = [c[3][-1] if c[3] else None for c in chunks]
|
||||
print(f" first 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:10]]}")
|
||||
# Print last 10 chunks (likely transition to trailer)
|
||||
print(f" last 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-10:]]}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,51 +0,0 @@
|
||||
"""Walk chunks; auto-detect preamble length by finding first 10 NN."""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def walk_chunks(body, start, max_NN=0x80):
|
||||
chunks = []
|
||||
i = start
|
||||
while i + 1 < len(body):
|
||||
if body[i] != 0x10:
|
||||
break
|
||||
NN = body[i + 1]
|
||||
if NN == 0 or NN > max_NN or NN % 4 != 0:
|
||||
break
|
||||
chunk_len = NN // 2 + 4
|
||||
if i + chunk_len > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + 2 + NN // 2])
|
||||
trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
|
||||
chunks.append((i, NN, data, trailer))
|
||||
i += chunk_len
|
||||
return chunks, i
|
||||
|
||||
|
||||
def find_first_chunk_start(body):
|
||||
"""Locate first byte that begins a `10 NN` chunk (NN ∈ multiples of 4, 4..0x7C)."""
|
||||
for i in range(20):
|
||||
if body[i] == 0x10 and body[i + 1] % 4 == 0 and 0 < body[i + 1] <= 0x7C:
|
||||
return i
|
||||
return -1
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d", "event-a", "event-b"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
start = find_first_chunk_start(body)
|
||||
chunks, end = walk_chunks(body, start)
|
||||
print(f"\n=== {name} === body={len(body)} N_samples={len(b.samples['Tran'])} start={start}")
|
||||
print(f" chunks parsed: {len(chunks)}, walk ended at {end}")
|
||||
if chunks:
|
||||
print(f" first 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:5]]}")
|
||||
print(f" last 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-5:]]}")
|
||||
print(f" bytes around end of walk: {body[end-4:end+12].hex(' ')}")
|
||||
else:
|
||||
print(f" bytes at start: {body[start:start+16].hex(' ')}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,75 +0,0 @@
|
||||
"""
|
||||
Walker v4: alternate [10 NN] data chunks and [00 NN] (or other) marker tags.
|
||||
|
||||
Hypothesis:
|
||||
- [10 NN]: data block, length NN/2 + 2 bytes (2-byte tag + NN/2 bytes data)
|
||||
- [00 NN]: 2-byte marker block (no data)
|
||||
- [20/30/40 NN]: special blocks with type-dependent length
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
|
||||
|
||||
def walk(body, start):
|
||||
i = start
|
||||
blocks = []
|
||||
while i + 1 < len(body):
|
||||
t0 = body[i]
|
||||
t1 = body[i + 1]
|
||||
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0x80:
|
||||
# data chunk: length NN/2 + 2
|
||||
length = t1 // 2 + 2
|
||||
blocks.append((i, "10", t1, bytes(body[i + 2 : i + length]), length))
|
||||
i += length
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
# 2-byte marker
|
||||
blocks.append((i, "00", t1, b"", 2))
|
||||
i += 2
|
||||
elif t0 == 0x20 and t1 % 4 == 0:
|
||||
# type 2 — try length 2+t1/2 (similar to 10) OR fixed
|
||||
length = t1 // 2 + 2
|
||||
blocks.append((i, "20", t1, bytes(body[i + 2 : i + length]), length))
|
||||
i += length
|
||||
elif t0 == 0x30 and t1 % 4 == 0:
|
||||
length = t1 // 2 + 2
|
||||
blocks.append((i, "30", t1, bytes(body[i + 2 : i + length]), length))
|
||||
i += length
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
# Special "footer transition" block — try fixed 22 bytes
|
||||
length = 22
|
||||
blocks.append((i, "40", t1, bytes(body[i + 2 : i + length]), length))
|
||||
i += length
|
||||
else:
|
||||
# Unknown tag — stop
|
||||
blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
|
||||
break
|
||||
return blocks, i
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d", "event-a", "event-b"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
# Auto-detect start
|
||||
for s in range(15):
|
||||
if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0x80:
|
||||
start = s
|
||||
break
|
||||
else:
|
||||
start = 7
|
||||
blocks, end = walk(body, start)
|
||||
# Categorize
|
||||
from collections import Counter
|
||||
types = Counter(b[1] for b in blocks)
|
||||
print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
|
||||
print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
|
||||
print(f" type counts: {dict(types)}")
|
||||
# Print last 5 blocks
|
||||
print(f" last 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-5:]]}")
|
||||
if end < len(body):
|
||||
print(f" bytes at end: {body[end:end+24].hex(' ')}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,83 +0,0 @@
|
||||
"""
|
||||
Walker v5: flexible NN range and multiple block-type lengths.
|
||||
|
||||
Hypothesis:
|
||||
- [10 NN]: 4-bit-delta data block, length = NN/2 + 2
|
||||
- [20 NN]: 8-bit-literal data block, length = NN + 2
|
||||
- [00 NN]: 2-byte marker (no payload)
|
||||
- [30 NN]: trailer/summary block, length = NN*4
|
||||
- [40 NN]: footer-marker block, fixed 22 bytes
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
from collections import Counter
|
||||
|
||||
|
||||
def walk(body, start, max_blocks=10000):
|
||||
i = start
|
||||
blocks = []
|
||||
while i + 1 < len(body) and len(blocks) < max_blocks:
|
||||
t0 = body[i]
|
||||
t1 = body[i + 1]
|
||||
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 // 2 + 2
|
||||
if i + length > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append((i, "10", t1, data, length))
|
||||
i += length
|
||||
elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 + 2
|
||||
if i + length > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append((i, "20", t1, data, length))
|
||||
i += length
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
# 2-byte marker
|
||||
blocks.append((i, "00", t1, b"", 2))
|
||||
i += 2
|
||||
elif t0 == 0x30 and t1 % 4 == 0:
|
||||
length = t1 * 4
|
||||
if i + length > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append((i, "30", t1, data, length))
|
||||
i += length
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
length = 22
|
||||
if i + length > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append((i, "40", t1, data, length))
|
||||
i += length
|
||||
else:
|
||||
blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
|
||||
break
|
||||
return blocks, i
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d", "event-a", "event-b"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
for s in range(15):
|
||||
if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
|
||||
start = s; break
|
||||
else:
|
||||
start = 7
|
||||
blocks, end = walk(body, start)
|
||||
types = Counter(bb[1] for bb in blocks)
|
||||
print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
|
||||
print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
|
||||
print(f" type counts: {dict(types)}")
|
||||
if blocks and blocks[-1][1] == "??":
|
||||
print(f" stopped at byte: 0x{blocks[-1][2]:02x}, prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
|
||||
# Sum payload sizes by type
|
||||
payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
|
||||
print(f" payload bytes by type: {payload_sizes}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,68 +0,0 @@
|
||||
"""
|
||||
Walker v6: handle 40 02 blocks correctly (length 20).
|
||||
|
||||
Block formats:
|
||||
- [10 NN]: 4-bit nibble delta data, length = NN/2 + 2
|
||||
- [20 NN]: int8 literal data, length = NN + 2
|
||||
- [00 NN]: 2-byte marker
|
||||
- [30 NN]: trailer/summary block, length = NN*4
|
||||
- [40 02]: segment header, fixed length 20
|
||||
"""
|
||||
import sys
|
||||
sys.path.insert(0, ".")
|
||||
from analysis.load_bundle import load_bundle
|
||||
from collections import Counter
|
||||
|
||||
|
||||
def walk(body, start, max_blocks=10000):
|
||||
i = start
|
||||
blocks = []
|
||||
while i + 1 < len(body) and len(blocks) < max_blocks:
|
||||
t0 = body[i]
|
||||
t1 = body[i + 1]
|
||||
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 // 2 + 2
|
||||
elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 + 2
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
length = 2
|
||||
elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
|
||||
length = t1 * 4
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
length = 20
|
||||
else:
|
||||
blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
|
||||
break
|
||||
if i + length > len(body):
|
||||
break
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append((i, f"{t0:02x}", t1, data, length))
|
||||
i += length
|
||||
return blocks, i
|
||||
|
||||
|
||||
def main():
|
||||
for name in ("event-c", "event-d", "event-a", "event-b"):
|
||||
b = load_bundle(name)
|
||||
body = b.body
|
||||
for s in range(15):
|
||||
if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
|
||||
start = s; break
|
||||
else:
|
||||
start = 7
|
||||
blocks, end = walk(body, start)
|
||||
types = Counter(bb[1] for bb in blocks)
|
||||
print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
|
||||
print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
|
||||
print(f" type counts: {dict(types)}")
|
||||
if blocks and blocks[-1][1] == "??":
|
||||
print(f" stopped at byte: 0x{blocks[-1][2]:02x} at offset {blocks[-1][0]}")
|
||||
print(f" prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
|
||||
print(f" bytes around stop: {body[end-4:end+24].hex(' ')}")
|
||||
# Sum
|
||||
payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
|
||||
print(f" payload bytes by type: {payload_sizes}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
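The block-length rules listed in the walker v6 docstring can be exercised on a hand-built byte string. The following is a minimal sketch, not the repository's `walk_body`: the `block_length` helper and the synthetic `body` are invented for illustration, and unlike the script above it simply stops at an unknown tag instead of recording a `??` entry.

```python
def block_length(t0: int, t1: int):
    """Total block length in bytes (tag included) under the v6 hypothesis, or None."""
    if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
        return t1 // 2 + 2   # 4-bit nibble deltas: NN/2 payload bytes
    if t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
        return t1 + 2        # int8 data: NN payload bytes
    if t0 == 0x00 and t1 % 4 == 0:
        return 2             # bare 2-byte marker, no payload
    if t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
        return t1 * 4        # trailer/summary block
    if t0 == 0x40 and t1 == 0x02:
        return 20            # segment header, fixed length
    return None              # unknown tag


def walk(body: bytes, start: int = 0):
    """Return ([(offset, tag_hi, tag_lo, payload), ...], offset where the walk stopped)."""
    blocks = []
    i = start
    while i + 1 < len(body):
        length = block_length(body[i], body[i + 1])
        if length is None or i + length > len(body):
            break
        blocks.append((i, body[i], body[i + 1], body[i + 2 : i + length]))
        i += length
    return blocks, i


# Synthetic body, made up for illustration (not taken from an event file).
body = (
    bytes([0x10, 0x04, 0x12, 0xF0])                  # 10 04: 2 payload bytes
    + bytes([0x00, 0x08])                            # 00 08: marker only
    + bytes([0x20, 0x04, 0x01, 0xFF, 0x02, 0xFE])    # 20 04: 4 payload bytes
    + bytes([0x40, 0x02]) + bytes(18)                # 40 02: 20 bytes in total
    + bytes([0x55, 0x55])                            # unknown tag: walk stops here
)
blocks, end = walk(body)
for offset, tag_hi, tag_lo, payload in blocks:
    print(f"@{offset:>3} tag={tag_hi:02x} NN=0x{tag_lo:02x} payload={len(payload)} bytes")
print(f"walk ended at {end}/{len(body)}")
```

Keeping the length rule in one function makes it easy to compare hypotheses such as the 22-byte versus 20-byte `40 02` block: only one return value changes between walker versions.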
+117
-221
@@ -70,77 +70,42 @@ from minimateplus.transport import SocketTransport
|
||||
from minimateplus.client import MiniMateClient
|
||||
from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
|
||||
from sfm.database import SeismoDb
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
log = logging.getLogger("ach_server")
|
||||
|
||||
# ── Per-unit state (downloaded events index) ──────────────────────────────────
|
||||
# ── Per-unit state (downloaded-key set) ───────────────────────────────────────
|
||||
# Persisted as <output_dir>/ach_state.json
|
||||
# Format (current — v2):
|
||||
# Format:
|
||||
# {
|
||||
# "BE11529": {
|
||||
# "downloaded_events": { # key_hex → ISO timestamp string
|
||||
# "01110000": "2026-04-11T00:42:17",
|
||||
# "0111245a": "2026-04-11T01:04:30"
|
||||
# },
|
||||
# "max_downloaded_key": "0111245a",
|
||||
# "last_seen": "2026-04-11T01:04:36",
|
||||
# "serial": "BE11529",
|
||||
# "peer": "63.43.212.232:51920"
|
||||
# "downloaded_keys": ["01110000", "0111245a"], # hex keys already on disk
|
||||
# "max_downloaded_key": "0111245a", # highest key ever seen
|
||||
# "last_seen": "2026-04-11T01:04:36"
|
||||
# }
|
||||
# }
|
||||
#
|
||||
# Why (key, timestamp) and not key alone:
|
||||
# The device's event-key counter resets to 0x01110000 after every memory
|
||||
# erase (internal or external). A bare-key dedup (the v1 format) cannot
|
||||
# distinguish a re-recorded event with the same key from one we already
|
||||
# downloaded. The 0C waveform record's timestamp IS unique per physical
|
||||
# event, so we pair (key, timestamp) and treat a key with a different
|
||||
# timestamp as a new event regardless of `max_downloaded_key`.
|
||||
# Key-based deduplication works well within a single "key generation" (between
|
||||
# erases). After the device memory is erased the event counter resets to
|
||||
# 0x01110000, so the first new event has the SAME key as the very first event
|
||||
# we ever downloaded. We detect this situation with max_downloaded_key:
|
||||
#
|
||||
# Legacy v1 format (`downloaded_keys: list[str]` only) is auto-migrated on
|
||||
# read: the keys are kept under a sentinel of "" (empty string) timestamp so
|
||||
# the (key, timestamp) compare always sees a mismatch and forces a one-time
|
||||
# re-download. After that pass the state is rewritten in v2 form.
|
||||
# if max(current_device_keys) < max_downloaded_key
|
||||
# → device was wiped and keys have restarted → treat all device keys as new
|
||||
#
|
||||
# After our own erase (--clear-after-download) we also explicitly clear
|
||||
# downloaded_keys and max_downloaded_key so the next session starts fresh.
|
||||
|
||||
_state_lock = threading.Lock()
|
||||
|
||||
|
||||
def _load_state(state_path: Path) -> dict:
|
||||
"""
|
||||
Load ach_state.json, transparently migrating any legacy
|
||||
`downloaded_keys: list` entries into the v2 `downloaded_events: dict`
|
||||
schema. Returns the migrated state.
|
||||
"""
|
||||
if not state_path.exists():
|
||||
return {}
|
||||
try:
|
||||
with open(state_path) as f:
|
||||
state = json.load(f)
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
# Per-unit migration: legacy list → dict-with-empty-timestamps
|
||||
for unit_key, unit_state in list(state.items()):
|
||||
if not isinstance(unit_state, dict):
|
||||
continue
|
||||
if "downloaded_events" in unit_state:
|
||||
continue
|
||||
legacy_keys = unit_state.get("downloaded_keys")
|
||||
if isinstance(legacy_keys, list):
|
||||
unit_state["downloaded_events"] = {k: "" for k in legacy_keys}
|
||||
log.info(
|
||||
"ach_state: migrated %s from v1 (downloaded_keys list) → v2 "
|
||||
"(downloaded_events dict, %d keys with empty timestamps; "
|
||||
"they will re-validate on next session)",
|
||||
unit_key, len(legacy_keys),
|
||||
)
|
||||
else:
|
||||
unit_state["downloaded_events"] = {}
|
||||
# keep legacy field for one cycle; cleared on next save
|
||||
unit_state.pop("downloaded_keys", None)
|
||||
|
||||
return state
|
||||
if state_path.exists():
|
||||
try:
|
||||
with open(state_path) as f:
|
||||
return json.load(f)
|
||||
except Exception:
|
||||
pass
|
||||
return {}
|
||||
|
||||
|
||||
def _save_state(state_path: Path, state: dict) -> None:
|
||||
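The difference between the two state formats discussed in the comment block above (a bare `downloaded_keys` list versus a `downloaded_events` key-to-timestamp map) can be illustrated in isolation. This is a hedged sketch rather than the code in this diff: `migrate_unit_state` loosely mirrors the per-unit migration shown in `_load_state`, while `already_downloaded` is a hypothetical helper standing in for the (key, timestamp) comparison that the comments attribute to the client.

```python
def migrate_unit_state(unit_state: dict) -> dict:
    """Return a copy in the dict-based format; legacy keys get an empty timestamp."""
    if "downloaded_events" in unit_state:
        return dict(unit_state)
    migrated = dict(unit_state)
    legacy = migrated.pop("downloaded_keys", None)
    migrated["downloaded_events"] = (
        {k: "" for k in legacy} if isinstance(legacy, list) else {}
    )
    return migrated


def already_downloaded(seen_events: dict, key_hex: str, ts_iso: str) -> bool:
    """True only when the key is known AND its stored timestamp matches exactly.

    An empty stored timestamp (a migrated legacy entry) never matches,
    which forces a one-time re-download of that key.
    """
    stored = seen_events.get(key_hex, "")
    return bool(stored) and stored == ts_iso


legacy_state = {
    "downloaded_keys": ["01110000", "0111245a"],
    "max_downloaded_key": "0111245a",
}
state = migrate_unit_state(legacy_state)
print(state["downloaded_events"])

seen = {"01110000": "2026-04-11T00:42:17"}
print(already_downloaded(seen, "01110000", "2026-04-11T00:42:17"))  # same physical event
print(already_downloaded(seen, "01110000", "2026-05-02T09:15:00"))  # key reused after an erase
```

Pairing the key with a timestamp is what lets a reused key such as `01110000` be recognised as a different physical event after the device counter resets.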
@@ -174,10 +139,8 @@ class AchSession:
         max_events: Optional[int],
         state_path: Path,
         db: "SeismoDb",
-        store: "WaveformStore",
         clear_after_download: bool = False,
         restart_monitoring: bool = False,
-        force_redownload: bool = False,
     ) -> None:
         self.sock = sock
         self.peer = peer
@@ -187,14 +150,8 @@ class AchSession:
         self.max_events = max_events
         self.state_path = state_path
         self.db = db
-        self.store = store
         self.clear_after_download = clear_after_download
         self.restart_monitoring = restart_monitoring
-        # `force_redownload` tells this session to ignore ach_state and
-        # re-download every event currently on the device, regardless of any
-        # (key, timestamp) match. Useful as a manual override when state has
-        # become inconsistent with what's actually on disk / in the DB.
-        self.force_redownload = force_redownload
 
     def run(self) -> None:
         ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
@@ -316,20 +273,11 @@ class AchSession:
|
||||
state = _load_state(self.state_path)
|
||||
unit_key = serial or self.peer # fall back to IP if no serial
|
||||
unit_state = state.get(unit_key, {})
|
||||
|
||||
# downloaded_events is the v2 (key_hex → timestamp_iso) dict.
|
||||
# Empty-string timestamps are migrated v1 entries — they force a
|
||||
# one-time re-download because the (key, timestamp) compare always
|
||||
# mismatches against any non-empty timestamp from a fresh 0C read.
|
||||
seen_events: dict[str, str] = dict(unit_state.get("downloaded_events", {}))
|
||||
seen_keys: set[str] = set(unit_state.get("downloaded_keys", []))
|
||||
# Highest event key ever downloaded from this unit (hex string, 8 chars).
|
||||
# Used to detect post-erase key reuse — see comment block above.
|
||||
max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
|
||||
|
||||
if self.force_redownload:
|
||||
log.info(" --force-redownload-all set — ignoring %d cached "
|
||||
"(key, timestamp) entries for this session",
|
||||
len(seen_events))
|
||||
seen_events = {}
|
||||
|
||||
# Walk the event index (browse-mode, no 5A) to get the actual current
|
||||
# key list. The SUB 08 event_count field is a lifetime "total events
|
||||
# ever recorded" counter that does NOT decrement on erase — confirmed
|
||||
@@ -342,10 +290,11 @@ class AchSession:
|
||||
log.warning(" list_event_keys failed: %s -- falling back to full download", exc)
|
||||
device_keys = None
|
||||
|
||||
# Use the walk result as our authoritative current count.
|
||||
current_count = len(device_keys) if device_keys is not None else 0
|
||||
|
||||
log.info(" Unit has %d stored event(s); %d (key, ts) entr(ies) previously downloaded",
|
||||
current_count, len(seen_events))
|
||||
log.info(" Unit has %d stored event(s); %d key(s) previously downloaded",
|
||||
current_count, len(seen_keys))
|
||||
|
||||
if device_keys is not None and current_count == 0:
|
||||
log.info(" [OK] No events on device -- nothing to download")
|
||||
@@ -353,29 +302,75 @@ class AchSession:
|
||||
return
|
||||
|
||||
if device_keys is not None:
|
||||
# ── Post-erase detection (best-effort, key-only signal) ───────
|
||||
# After erase the device's key counter resets to 01110000.
|
||||
# If the device's current max key is below our high-water mark
|
||||
# we know erase happened. This catches the cleanest case but
|
||||
# does NOT catch erase-then-record-many-events (where the new
|
||||
# max may climb past the old max). The (key, timestamp) check
|
||||
# in get_events() is what handles those.
|
||||
# ── Post-erase detection ──────────────────────────────────────
|
||||
# After the device memory is erased, new events start from key
|
||||
# 01110000 again — the same keys we already downloaded. Detect
|
||||
# this by comparing the device's current highest key against the
|
||||
# historical maximum. If the device has rolled back below our
|
||||
# high-water mark, its counter was reset and we must treat all
|
||||
# its keys as new, regardless of what seen_keys contains.
|
||||
if device_keys and max_seen_key != "00000000":
|
||||
max_device_key = max(device_keys)
|
||||
max_device_key = max(device_keys) # lexicographic max; matches the numeric max
# because keys are fixed-width (8-char) hex strings in one letter case
|
||||
if max_device_key < max_seen_key:
|
||||
log.info(
|
||||
" Post-erase reset detected: "
|
||||
"device max key %s < historical max %s "
|
||||
"-- discarding stale (key, ts) state for this session",
|
||||
"-- treating all device keys as new",
|
||||
max_device_key, max_seen_key,
|
||||
)
|
||||
seen_events = {}
|
||||
seen_keys = set() # discard stale dedup info for this session
|
||||
|
||||
# Note: no early-exit "all already downloaded" short-circuit
|
||||
# here. Without per-event timestamps we cannot tell whether
|
||||
# device_keys ⊆ seen_events.keys() actually means we have
|
||||
# those physical events. get_events() will read 0C on its
|
||||
# skip path and decide per event.
|
||||
new_key_set = set(device_keys) - seen_keys
|
||||
log.info(" Device has %d key(s): %d new, %d already seen",
|
||||
len(device_keys), len(new_key_set), len(device_keys) - len(new_key_set))
|
||||
if not new_key_set:
|
||||
log.info(" [OK] All events already downloaded -- nothing to do")
|
||||
# Refresh state timestamp; preserve max_seen_key unchanged.
|
||||
state[unit_key] = {
|
||||
"downloaded_keys": sorted(seen_keys | set(device_keys)),
|
||||
"max_downloaded_key": max_seen_key,
|
||||
"last_seen": datetime.datetime.now().isoformat(),
|
||||
"serial": serial,
|
||||
"peer": self.peer,
|
||||
}
|
||||
_save_state(self.state_path, state)
|
||||
|
||||
# ── Erase even when no new events (if requested) ──────────
|
||||
# Blastware ACH always erases after every session — even when
|
||||
# nothing new was downloaded. Without the erase the device
|
||||
# still sees stored events in its memory and immediately
|
||||
# retries the call-home, causing the looping we observed.
|
||||
# Only erase when device actually has events stored; skip
|
||||
# the erase if device_keys is empty (nothing to erase).
|
||||
if self.clear_after_download and device_keys:
|
||||
log.info(
|
||||
" Clearing device memory (--clear-after-download, "
|
||||
"no new events but device has %d stored)...",
|
||||
len(device_keys),
|
||||
)
|
||||
try:
|
||||
client.delete_all_events()
|
||||
log.info(" [OK] Device memory cleared")
|
||||
# Reset state so the next session starts fresh.
|
||||
state[unit_key] = {
|
||||
"downloaded_keys": [],
|
||||
"max_downloaded_key": "00000000",
|
||||
"last_seen": datetime.datetime.now().isoformat(),
|
||||
"serial": serial,
|
||||
"peer": self.peer,
|
||||
}
|
||||
_save_state(self.state_path, state)
|
||||
except Exception as exc:
|
||||
log.error(
|
||||
" [WARN] Event deletion failed: %s -- events NOT cleared",
|
||||
exc,
|
||||
)
|
||||
|
||||
log.info("Session complete (no new events) -> %s", session_dir)
|
||||
return
|
||||
else:
|
||||
new_key_set = None # unknown; proceed with full download
|
||||
|
||||
# Apply max_events cap
|
||||
# stop_idx: when we know the count from list_event_keys, use it as
|
||||
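The key-only post-erase check in the hunk above, and the limitation its comments describe, can be reproduced with plain strings. This is a small sketch under the assumptions stated in the comments (8-character hex keys, `"00000000"` meaning no history); `post_erase_reset` is a hypothetical helper name, not a function from `ach_server.py`.

```python
def post_erase_reset(device_keys, max_seen_key: str = "00000000") -> bool:
    """True when the device's highest key has fallen below our high-water mark.

    Keys are compared as strings; for fixed-width hex in a single letter
    case this gives the same ordering as comparing the integer values.
    """
    if not device_keys or max_seen_key == "00000000":
        return False
    return max(device_keys) < max_seen_key


# Counter restarted at 01110000 after an erase: detected.
print(post_erase_reset(["01110000"], "0111245a"))

# Device still holds the old events: no reset.
print(post_erase_reset(["01110000", "0111245a"], "0111245a"))

# Erase followed by enough new events to climb past the old maximum:
# the key-only signal misses this case, as the comments in the diff note.
print(post_erase_reset(["01110000", "01113000"], "0111245a"))

# String ordering agrees with numeric ordering for these fixed-width keys.
keys = ["01110000", "0111245a", "01113000"]
print(max(keys) == max(keys, key=lambda k: int(k, 16)))
```

The third case is the motivation the comments give for pairing each key with a timestamp instead of relying on the high-water mark alone.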
@@ -393,67 +388,27 @@ class AchSession:
|
||||
)
|
||||
|
||||
try:
|
||||
# Pass `seen_events` (key → ISO timestamp) so the client can
|
||||
# read 0C on its skip path and only skip 5A when the per-event
|
||||
# timestamp matches what we already have on disk. When force_-
|
||||
# redownload is set, seen_events was already cleared above.
|
||||
#
|
||||
# Filter out empty-string timestamps (legacy v1 entries) — the
|
||||
# client's 0C-on-skip-path only trusts entries with a
|
||||
# populated timestamp; otherwise it falls through to a full
|
||||
# 5A download.
|
||||
skip_dict = {k: ts for k, ts in seen_events.items() if ts}
|
||||
|
||||
all_events = client.get_events(
|
||||
full_waveform=True,
|
||||
stop_after_index=stop_idx,
|
||||
skip_waveform_for_events=skip_dict if skip_dict else None,
|
||||
skip_waveform_for_keys=seen_keys if seen_keys else None,
|
||||
)
|
||||
|
||||
# New events are those that came back with _a5_frames populated
|
||||
# (= 5A actually ran on this session). Skipped events have
|
||||
# _a5_frames = None because the client matched (key, timestamp)
|
||||
# against skip_dict and bypassed 5A.
|
||||
# Filter to events whose keys we haven't saved before.
|
||||
new_events = [
|
||||
e for e in all_events
|
||||
if getattr(e, "_a5_frames", None)
|
||||
if e._waveform_key is None
|
||||
or e._waveform_key.hex() not in seen_keys
|
||||
]
|
||||
skipped = len(all_events) - len(new_events)
|
||||
|
||||
log.info(" [OK] Walked %d event(s): %d downloaded, %d skipped (matched (key, ts) in state)",
|
||||
log.info(" [OK] Downloaded %d event(s): %d new, %d skipped (already seen)",
|
||||
len(all_events), len(new_events), skipped)
|
||||
|
||||
# ── Persist event file + A5 sidecar to the waveform store ──
|
||||
# Saves ride alongside the existing JSON dump so the on-disk
|
||||
# event file and events.json reference the same set of events.
|
||||
waveform_records: dict[str, dict] = {}
|
||||
for ev in new_events:
|
||||
if not ev._a5_frames:
|
||||
continue
|
||||
try:
|
||||
rec = self.store.save(
|
||||
ev,
|
||||
serial=serial or "UNKNOWN",
|
||||
a5_frames=ev._a5_frames,
|
||||
)
|
||||
if ev._waveform_key is not None:
|
||||
waveform_records[ev._waveform_key.hex()] = rec
|
||||
log.info(
|
||||
" [WAVE] saved %s (%d bytes)",
|
||||
rec["filename"], rec["filesize"],
|
||||
)
|
||||
except Exception as exc:
|
||||
key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
|
||||
log.warning(
|
||||
" [WARN] Waveform store save failed for %s: %s",
|
||||
key_hex, exc,
|
||||
)
|
||||
if skipped:
|
||||
log.info(" (skipped %d already-downloaded event(s))", skipped)
|
||||
|
||||
if new_events:
|
||||
_save_json(
|
||||
session_dir / "events.json",
|
||||
[_event_to_dict(e, waveform_records) for e in new_events],
|
||||
)
|
||||
_save_json(session_dir / "events.json", [_event_to_dict(e) for e in new_events])
|
||||
|
||||
for ev in new_events:
|
||||
pv = ev.peak_values
|
||||
@@ -512,11 +467,7 @@ class AchSession:
|
||||
_session_start = datetime.datetime.now()
|
||||
try:
|
||||
_ev_ins, _ev_skip = self.db.insert_events(
|
||||
new_events,
|
||||
serial=serial or self.peer,
|
||||
session_id=None,
|
||||
waveform_records=waveform_records,
|
||||
device_family="series3",
|
||||
new_events, serial=serial or self.peer, session_id=None
|
||||
)
|
||||
_ml_ins, _ml_skip = self.db.insert_monitor_log(
|
||||
new_monitor_entries, session_id=None
|
||||
@@ -551,64 +502,35 @@ class AchSession:
|
||||
)
|
||||
|
||||
# ── Update persistent state ───────────────────────────────────
|
||||
# Build a fresh (key → ISO timestamp) map from THIS session's
|
||||
# results. For each event currently on the device, prefer the
|
||||
# timestamp we just observed (from 0C); fall back to whatever
|
||||
# was already in seen_events for that key (so we don't lose an
|
||||
# entry just because get_events skipped it on the (key, ts)
|
||||
# match path).
|
||||
def _ts_iso(ev) -> str:
|
||||
ts = getattr(ev, "timestamp", None)
|
||||
if ts is None:
|
||||
return ""
|
||||
try:
|
||||
return datetime.datetime(
|
||||
ts.year, ts.month, ts.day,
|
||||
ts.hour or 0, ts.minute or 0, ts.second or 0,
|
||||
).isoformat()
|
||||
except Exception:
|
||||
return str(ts)
|
||||
|
||||
current_events_map: dict[str, str] = {}
|
||||
for ev in all_events:
|
||||
if ev._waveform_key is None:
|
||||
continue
|
||||
key_hex = ev._waveform_key.hex()
|
||||
ts_iso = _ts_iso(ev) or seen_events.get(key_hex, "")
|
||||
current_events_map[key_hex] = ts_iso
|
||||
|
||||
# Monitor-log entries don't have a 0C-style timestamp, but
|
||||
# they DO have a start_time; use that so the monitor-log keys
|
||||
# are properly entered into the (key, ts) map.
|
||||
for ml in new_monitor_entries:
|
||||
key_hex = ml.key
|
||||
ts = ml.start_time
|
||||
ts_iso = ts.isoformat() if ts else seen_events.get(key_hex, "")
|
||||
# If a triggered event already populated this key, keep
|
||||
# whichever has a non-empty timestamp.
|
||||
if key_hex not in current_events_map or not current_events_map[key_hex]:
|
||||
current_events_map[key_hex] = ts_iso
|
||||
# Include both triggered-event keys and monitor-log keys in the
|
||||
# downloaded set so they are not re-processed on the next call-home.
|
||||
current_event_keys = [
|
||||
e._waveform_key.hex()
|
||||
for e in all_events
|
||||
if e._waveform_key is not None
|
||||
]
|
||||
current_monitor_keys = [e.key for e in new_monitor_entries]
|
||||
current_keys = current_event_keys + current_monitor_keys
|
||||
|
||||
if erased_successfully:
|
||||
updated_events: dict[str, str] = {}
|
||||
# Device memory is clear. Reset downloaded_keys and the
|
||||
# high-water mark so the next call-home starts fresh and
|
||||
# doesn't mis-identify the recycled key 01110000 as "seen".
|
||||
updated_keys = []
|
||||
new_max_key = "00000000"
|
||||
log.info(
|
||||
" State reset after erase -- next session will download "
|
||||
"from key 0 (device counter resets after erase)"
|
||||
)
|
||||
else:
|
||||
# Merge: keep prior (key, ts) entries we still have evidence
|
||||
# of (for survivors of any partial failure), plus this
|
||||
# session's authoritative (key, ts) pairs.
|
||||
updated_events = dict(seen_events)
|
||||
updated_events.update(current_events_map)
|
||||
new_max_key = (
|
||||
max(updated_events.keys())
|
||||
if updated_events else max_seen_key
|
||||
)
|
||||
# Normal (no erase): union of previously-seen + all keys on
|
||||
# device now. Includes already-seen survivors so we never
|
||||
# re-download them if the device somehow keeps old records.
|
||||
updated_keys = sorted(set(seen_keys) | set(current_keys))
|
||||
new_max_key = updated_keys[-1] if updated_keys else max_seen_key
|
||||
|
||||
state[unit_key] = {
|
||||
"downloaded_events": updated_events,
|
||||
"downloaded_keys": updated_keys,
|
||||
"max_downloaded_key": new_max_key,
|
||||
"last_seen": datetime.datetime.now().isoformat(),
|
||||
"serial": serial,
|
||||
@@ -670,10 +592,7 @@ def _device_info_to_dict(d: DeviceInfo) -> dict:
|
||||
}
|
||||
|
||||
|
||||
def _event_to_dict(
|
||||
e: Event,
|
||||
waveform_records: Optional[dict[str, dict]] = None,
|
||||
) -> dict:
|
||||
def _event_to_dict(e: Event) -> dict:
|
||||
pv = e.peak_values
|
||||
pi = e.project_info
|
||||
peaks = {}
|
||||
@@ -692,11 +611,6 @@ def _event_to_dict(
|
||||
for ch, vals in e.raw_samples.items()
|
||||
}
|
||||
samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
|
||||
|
||||
rec: dict = {}
|
||||
if waveform_records and e._waveform_key is not None:
|
||||
rec = waveform_records.get(e._waveform_key.hex(), {}) or {}
|
||||
|
||||
return {
|
||||
"timestamp": str(e.timestamp) if e.timestamp else None,
|
||||
"project": pi.project if pi else None,
|
||||
@@ -705,9 +619,6 @@ def _event_to_dict(
|
||||
"sensor_location": pi.sensor_location if pi else None,
|
||||
"peaks": peaks,
|
||||
"raw_samples_preview": samples,
|
||||
"blastware_filename": rec.get("filename"),
|
||||
"blastware_filesize": rec.get("filesize"),
|
||||
"a5_pickle_filename": rec.get("a5_pickle_filename"),
|
||||
}
|
||||
|
||||
|
||||
@@ -729,7 +640,6 @@ def serve(args: argparse.Namespace) -> None:
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
state_path = output_dir / "ach_state.json"
|
||||
db = SeismoDb(output_dir / "seismo_relay.db")
|
||||
store = WaveformStore(output_dir / "waveforms")
|
||||
|
||||
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
|
||||
@@ -747,7 +657,6 @@ def serve(args: argparse.Namespace) -> None:
|
||||
print(f" Max events per session: {max_ev if max_ev else 'unlimited'}")
|
||||
print(f" Clear device after download: {'YES' if args.clear_after_download else 'no'}")
|
||||
print(f" Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
|
||||
print(f" Force re-download all (ignore state): {'YES' if args.force_redownload_all else 'no'}")
|
||||
print(f"{'='*60}")
|
||||
print(f"\n Point your test unit's ACEmanager call-home settings to:")
|
||||
print(f" Remote Host: <this machine's LAN IP>")
|
||||
@@ -785,10 +694,8 @@ def serve(args: argparse.Namespace) -> None:
|
||||
max_events=max_ev,
|
||||
state_path=state_path,
|
||||
db=db,
|
||||
store=store,
|
||||
clear_after_download=args.clear_after_download,
|
||||
restart_monitoring=args.restart_monitoring,
|
||||
force_redownload=args.force_redownload_all,
|
||||
)
|
||||
t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
|
||||
t.start()
|
||||
@@ -873,17 +780,6 @@ def parse_args() -> argparse.Namespace:
|
||||
"This mirrors the standard Blastware ACH workflow."
|
||||
),
|
||||
)
|
||||
p.add_argument(
|
||||
"--force-redownload-all",
|
||||
action="store_true",
|
||||
default=False,
|
||||
help=(
|
||||
"Manual override: ignore ach_state.json's downloaded_events map "
|
||||
"for this session and re-download every event currently on the "
|
||||
"device, regardless of (key, timestamp) match. Useful when state "
|
||||
"has become inconsistent with the on-disk waveform store / DB."
|
||||
),
|
||||
)
|
||||
p.add_argument(
|
||||
"--verbose", "-v",
|
||||
action="store_true",
|
||||
|
||||
@@ -1,185 +0,0 @@
|
||||
# Histogram body codec — FULLY DECODED (2026-05-20)
|
||||
|
||||
Clean working status doc for the MiniMate Plus histogram-mode event
|
||||
body codec. Companion to `waveform_codec_re_status.md`. The deep
|
||||
historical record (with retractions and dated analyses) lives in
|
||||
`docs/instantel_protocol_reference.md §7.6.2`; the authoritative
|
||||
implementation lives in `minimateplus/histogram_codec.py`.
|
||||
|
||||
## TL;DR
|
||||
|
||||
**The codec is fully decoded.** Every field of every block in the
|
||||
in-repo histogram fixture corpus decodes byte-exact against BW's
|
||||
ASCII export.
|
||||
|
||||
26 regression tests pass against ~3,500 blocks across 5 in-repo
|
||||
fixtures, plus a synthetic regression block taken from a real
|
||||
BE9558 prod event to lock in the uint8-peak interpretation.
|
||||
|
||||
**Important correction (2026-05-21):** the per-channel peak count
|
||||
is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
|
||||
etc. The N844 fixture corpus the original RE was done against has
|
||||
zero values in bytes [7]/[11]/[15]/[19] for every block, so the
|
||||
two interpretations happened to be equivalent. Cross-correlating
|
||||
non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
|
||||
against BW's per-interval ASCII export — 4 channels × ~1400 blocks
|
||||
per event × multiple events = 100% byte-exact only when the peak
|
||||
is read as uint8. Reading as uint16 LE produced peaks up to 268
|
||||
in/s per channel and 35× inflated PVS sums when first deployed to
|
||||
prod (rolled back, root-caused, and fixed in commit 7183b95+1).
|
||||
|
||||
## Body format
|
||||
|
||||
```
|
||||
body = [stream of 32-byte data blocks] + [small trailing remnant]
|
||||
```
|
||||
|
||||
Each block represents one histogram interval. Block layout:
|
||||
|
||||
```
|
||||
[0] 0x00 always-zero tag
|
||||
[1] segment_id (uint8) 0x00..0x03 — 256 blocks per segment
|
||||
[2:4] block_ctr (uint16 LE) resets each segment (0x0100, 0x0101, …)
|
||||
[4:6] 0x000a (uint16 LE) constant marker (= 10)
|
||||
[6] T_peak_count uint8 Tran peak (count × 0.005 → in/s at Normal,
|
||||
max 1.275 in/s — fits in uint8)
|
||||
[7] T_annotation uint8 empirically non-zero on intervals with sub-Hz
|
||||
or unmeasurable freq; meaning not fully RE'd
|
||||
[8:10] T_halfperiod uint16 LE Tran half-period in samples
|
||||
(freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
|
||||
[10] V_peak_count uint8 Vert peak
|
||||
[11] V_annotation uint8
|
||||
[12:14] V_halfperiod uint16 LE Vert freq half-period
|
||||
[14] L_peak_count uint8 Long peak
|
||||
[15] L_annotation uint8
|
||||
[16:18] L_halfperiod uint16 LE Long freq half-period
|
||||
[18] M_peak_count uint8 MicL peak count
|
||||
(dB via waveform_codec.mic_count_to_db)
|
||||
[19] M_annotation uint8
|
||||
[20:22] M_halfperiod uint16 LE MicL freq half-period
|
||||
[22:24] 0x00 0x00 constant
|
||||
[24:28] 4-byte variable purpose unknown — possibly CRC,
|
||||
timestamp delta, or psi(L) numeric;
|
||||
not needed for waveform reconstruction
|
||||
[28:32] 0x1e 0x0a 0x00 0x00 constant block-end signature
|
||||
```
|
||||
|
||||
Reliable block-identification anchor:
|
||||
```python
|
||||
block[22:24] == b"\x00\x00" and block[28:32] == b"\x1e\x0a\x00\x00"
|
||||
```
|
||||
(The `1e 0a 00 00` constant tail is the most distinctive signature.)
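
The layout above can be sketched as a small parser. This is an illustrative sketch only, not the code in `minimateplus/histogram_codec.py`; the names `looks_like_block`, `parse_block`, and `iter_blocks` are hypothetical, and the field offsets are taken directly from the block layout in this section.

```python
import struct

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"
# (channel name, byte offset of its 4-byte slot), per the block layout above
CHANNEL_SLOTS = (("Tran", 6), ("Vert", 10), ("Long", 14), ("MicL", 18))


def looks_like_block(block: bytes) -> bool:
    """Apply the block-identification anchor: zero pair at [22:24], constant tail."""
    return (
        len(block) == BLOCK_SIZE
        and block[22:24] == b"\x00\x00"
        and block[28:32] == BLOCK_TAIL
    )


def parse_block(block: bytes) -> dict:
    """Split one 32-byte block into its raw (unscaled) fields."""
    if not looks_like_block(block):
        raise ValueError("not a 32-byte histogram block")
    segment_id = block[1]
    block_ctr, marker = struct.unpack_from("<HH", block, 2)
    channels = {}
    for name, offset in CHANNEL_SLOTS:
        # uint8 peak count, uint8 annotation, uint16 LE half-period
        peak_count, annotation, halfperiod = struct.unpack_from("<BBH", block, offset)
        channels[name] = {
            "peak_count": peak_count,
            "annotation": annotation,
            "halfperiod": halfperiod,
        }
    return {
        "segment_id": segment_id,
        "block_ctr": block_ctr,
        "marker": marker,
        "channels": channels,
    }


def iter_blocks(body: bytes):
    """Walk the body in 32-byte steps; the short trailing remnant is ignored."""
    for start in range(0, len(body) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = body[start:start + BLOCK_SIZE]
        if looks_like_block(block):
            yield parse_block(block)
```

Reading the peak with the `B` format code (one byte) rather than `H` is the point of the uint8 correction described in the TL;DR: the annotation byte stays out of the peak value.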
|
||||
|
||||
## Per-channel encoding
|
||||
|
||||
| Channel | Peak encoding | Frequency encoding |
|
||||
|---|---|---|
|
||||
| Tran | count × 0.005 = in/s at Normal range | `freq_Hz = 512 / halfperiod` |
|
||||
| Vert | same | same |
|
||||
| Long | same | same |
|
||||
| MicL | count → dB via `mic_count_to_db(count)` (same formula as waveform codec) | same |
|
||||
|
||||
**`>100 Hz` sentinel**: when halfperiod ≤ 5 (giving ≥100 Hz from the
|
||||
512/halfp formula), BW displays `>100 Hz`. Codec's `half_period_to_hz`
|
||||
returns `None` in this range.
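
A minimal sketch of the per-channel conversions, for illustration. The function names here are stand-ins rather than the real codec API, and the mic formula assumes the `81.94 + 20·log10(count)` relation used in this document's worked example; the authoritative version is `waveform_codec.mic_count_to_db`.

```python
import math

GEO_LSB_IPS = 0.005     # in/s per count at Normal range
MIC_DB_OFFSET = 81.94   # assumed from the worked example in this document


def peak_count_to_ips(count: int) -> float:
    """Geophone peak count to in/s."""
    return count * GEO_LSB_IPS


def half_period_to_hz_sketch(halfperiod: int):
    """Half-period in samples to Hz; None stands for the '>100 Hz' sentinel."""
    if halfperiod <= 5:
        return None
    return 512.0 / halfperiod


def mic_count_to_db_sketch(count: int):
    """Mic peak count to dB; None when the count is zero (log undefined)."""
    if count <= 0:
        return None
    return MIC_DB_OFFSET + 20.0 * math.log10(count)
```

For example, a Tran slot with peak count 6 and half-period 24 gives 0.030 in/s and 512 / 24 ≈ 21.3 Hz, which BW displays rounded as 21 Hz.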
|
||||
|
||||
## Verified facts (cross-checked against fixture corpus)
|
||||
|
||||
Example: N844L6Z8.ZR0H block 130 → all 8 decoded fields byte-exact:
|
||||
|
||||
```
|
||||
binary samples [10, 6, 24, 4, 18, 5, 21, 5, 9]
|
||||
TXT row [0.030, 21, 0.020, 28, 0.025, 24, 0.040, 0.000, 95.92, 57]
|
||||
|
||||
slot[0] = 10 marker
|
||||
slot[1] = 6 × 0.005 = 0.030 in/s ✓ T_peak
|
||||
slot[2] = 24 → 512/24 = 21.3 → 21 Hz ✓ T_freq
|
||||
slot[3] = 4 × 0.005 = 0.020 in/s ✓ V_peak
|
||||
slot[4] = 18 → 512/18 = 28.4 → 28 Hz ✓ V_freq
|
||||
slot[5] = 5 × 0.005 = 0.025 in/s ✓ L_peak
|
||||
slot[6] = 21 → 512/21 = 24.4 → 24 Hz ✓ L_freq
|
||||
slot[7] = 5 → 81.94 + 20·log10(5) = 95.92 dB ✓ M_peak
|
||||
slot[8] = 9 → 512/9 = 56.9 → 57 Hz ✓ M_freq
|
||||
```
|
||||
|
||||
## Verified test coverage
|
||||
|
||||
`tests/test_histogram_codec.py` (24 tests):
|
||||
|
||||
- Block walking: yields one record per `.TXT` interval ± 1 (off-by-one
|
||||
at the tail when recording was stopped mid-write). Segment-ID
|
||||
groups of 256 blocks confirmed.
|
||||
- Geo peaks: every block of N844L20G, N844L6Z8, N844L6XE, N844L23B
|
||||
matches `.TXT` within the 0.005 in/s quantization step.
|
||||
- Geo freqs: every block of N844L6Z8 and N844L6XE matches `.TXT`
|
||||
within 1 Hz (BW display rounds). `>100 Hz` sentinel handled correctly.
|
||||
- Mic dB: every block of N844L6XE, N844L23B, N844L6Z8 matches `.TXT`
|
||||
within 0.1 dB (BW display precision).
|
||||
- Mic freq: matches `.TXT` within 1 Hz across active blocks.
|
||||
|
||||
## What's NOT yet decoded
|
||||
|
||||
- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**. Empirically
|
||||
non-zero on intervals where the per-channel ZC frequency comes
|
||||
out as `N/A` or sub-Hz (`<1.0`, `1.X`). Hypothesis tested in the
|
||||
RE session: byte != 0 ↔ sub-Hz freq. Only ~50% correlation
|
||||
across the K558 corpus, so the relationship is more complex.
|
||||
Possibilities: time-of-peak-within-interval, halfp extension for
|
||||
very-long-period signals, or a debug/diagnostic field the firmware
|
||||
writes opportunistically. Doesn't affect peak amplitudes or
|
||||
waveform reconstruction. Captured as `record["annotations"]` for
|
||||
future RE.
|
||||
- **4-byte variable metadata field (bytes 24:28)**. Not needed for
|
||||
waveform reconstruction. Speculation: per-block CRC, sub-second
|
||||
timestamp offset, or a Mic psi(L) count not in the 9 samples.
|
||||
Punt until something needs it.
|
||||
- **Geo PVS (TXT col 7, e.g. "0.040 in/s")**. Not stored in the
|
||||
block; can be approximated as `sqrt(T_peak² + V_peak² + L_peak²)`
|
||||
but BW's value sometimes differs slightly (probably computed from
|
||||
waveform-instant samples, not from per-channel peaks). Punt — the
|
||||
`.h5` consumers don't need PVS as a sample channel.
|
||||
- **Mic psi(L) value (TXT col 8)**. TXT shows it as a small psi value
|
||||
derived from the dB measurement. Not in the 9 samples. Could be
|
||||
derived from `M_peak_count` via the inverse of the dB formula plus
|
||||
a psi calibration constant. Defer.
|
||||
|
||||
## Output shape
|
||||
|
||||
`decode_histogram_body` returns the standard 4-channel dict that
|
||||
mirrors `waveform_codec.decode_waveform_v2`'s output:
|
||||
|
||||
```python
|
||||
{
|
||||
"Tran": [peak_count_per_interval, ...], # 16-count units (LSB = 0.005 in/s)
|
||||
"Vert": [..., ...],
|
||||
"Long": [..., ...],
|
||||
"MicL": [..., ...], # raw ADC counts
|
||||
}
|
||||
```
|
||||
|
||||
Run through `waveform_codec.decoded_to_adc_counts` to get 1-count ADC
|
||||
units (geo ×16, mic passthrough) for the standard `.h5` writer.
|
||||
|
||||
For the full per-interval record with frequencies + metadata, use
|
||||
`decode_histogram_body_full()`.
|
||||
|
||||
## Where it's wired
|
||||
|
||||
- `minimateplus/event_file_io.py:read_blastware_file()` — first tries
|
||||
the waveform codec, falls back to the histogram codec when the
|
||||
waveform preamble isn't present. Same output shape, same
|
||||
downstream pipeline.
|
||||
- `scripts/backfill_sidecars.py` — the `has_samples` short-circuit
|
||||
added during the histogram-codec-pending era still serves as a
|
||||
defensive guard against truly undecodable files, but no longer
|
||||
fires for valid histograms.
|
||||
|
||||
## Companion reference
|
||||
|
||||
- `docs/waveform_codec_re_status.md` — sibling status doc for the
|
||||
much-more-complex waveform-mode codec.
|
||||
- `docs/instantel_protocol_reference.md §7.6.2` — historical
|
||||
protocol-reference entry. Structural framing matches what we
|
||||
found; per-sample semantics were less documented than the `✅
|
||||
CONFIRMED` badge suggested. This doc supersedes §7.6.2 where they
|
||||
conflict on confidence level.
|
||||
@@ -1,284 +0,0 @@
|
||||
# IDF Protocol Reference — Thor / Micromate Series IV
|
||||
|
||||
Starting-point reference for reverse-engineering Instantel's Micromate
|
||||
Series IV event-file format. Sibling to
|
||||
[instantel_protocol_reference.md](instantel_protocol_reference.md) (the
|
||||
Series III "Rosetta Stone") — this doc holds what we know so far and
|
||||
the open questions still to crack.
|
||||
|
||||
**Status (2026-05-20):** ASCII text sidecar fully decoded (1,014
|
||||
sample files round-trip). Binary `.IDFH` / `.IDFW` codec
|
||||
**not yet implemented** — binaries are stored opaquely by
|
||||
`WaveformStore.save_imported_idf`, with metadata sourced from the
|
||||
paired `.txt` sidecar.
|
||||
|
||||
---
|
||||
|
||||
## File model
|
||||
|
||||
### Filename convention
|
||||
|
||||
```
|
||||
<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>
|
||||
```
|
||||
|
||||
- **SERIAL** — literal device serial, two-letter prefix + numeric
|
||||
suffix. Examples seen: `UM11719`, `UM13981`, `UM20147`, `BE9439`.
|
||||
Unlike Series III BW filenames (`M529LK44.AB0`, base-36 stem),
|
||||
Series IV filenames carry the serial in plain text.
|
||||
- **YYYYMMDDHHMMSS** — 14-char ASCII timestamp in **device local
|
||||
time** (no timezone marker).
|
||||
- **KIND** — `IDFH` for histograms, `IDFW` for waveforms.
|
||||
|
||||
The `.IDFH.txt` / `.IDFW.txt` ASCII sidecar lives in a `TXT/`
|
||||
**subfolder** of the unit's directory, not alongside the binary.
|
||||
This pairing convention is encoded in
|
||||
`event_forwarder.idf_report_path()`.
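
The naming and pairing rules can be sketched as follows. This is a hedged illustration, not the actual body of `event_forwarder.idf_report_path()`; `parse_idf_name` and `sidecar_path` are hypothetical helper names, and the regex assumes serials are always two uppercase letters followed by digits, as in the examples seen so far.

```python
import re
from datetime import datetime
from pathlib import Path

# <SERIAL>_<YYYYMMDDHHMMSS>.<KIND>; the trailing $ rejects .IDFW.CDB cache files
_IDF_NAME = re.compile(
    r"^(?P<serial>[A-Z]{2}\d+)_(?P<stamp>\d{14})\.(?P<kind>IDFH|IDFW)$"
)


def parse_idf_name(path: Path):
    """Return (serial, naive local datetime, kind), or None if the name doesn't fit."""
    m = _IDF_NAME.match(path.name)
    if m is None:
        return None
    # Device local time with no timezone marker, so the datetime stays naive.
    stamp = datetime.strptime(m["stamp"], "%Y%m%d%H%M%S")
    return m["serial"], stamp, m["kind"]


def sidecar_path(path: Path) -> Path:
    """ASCII sidecar lives in a TXT/ subfolder next to the binary."""
    return path.parent / "TXT" / (path.name + ".txt")
```

Keeping the timestamp naive is deliberate: the filename carries no timezone, so attaching one here would be a guess.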
|
||||
|
||||
### Directory layout
|
||||
|
||||
```
|
||||
C:\THORDATA\
|
||||
└── <Project>\
|
||||
└── <UM####>\ ← unit serial dir
|
||||
├── UM12345_20260520100000.MLG ← monitor log (not events)
|
||||
├── UM12345_20260520100000.IDFH ← histogram event (binary)
|
||||
├── UM12345_20260520100000.IDFW ← waveform event (binary)
|
||||
├── UM12345_20260520100000.IDFW.CDB ← cache-DB variant (skip)
|
||||
├── TXT\
|
||||
│ ├── UM12345_20260520100000.IDFH.txt ← histogram ASCII sidecar
|
||||
│ └── UM12345_20260520100000.IDFW.txt ← waveform ASCII sidecar
|
||||
├── CSV\, HTML\, PDF\, XML\ ← operator-facing derived exports
|
||||
└── ...
|
||||
```
|
||||
|
||||
The `.IDFW.CDB` files share the binary's basename but appear to be a
|
||||
separate cache/database variant. Their first 8 bytes match the
|
||||
**old**-firmware Thor signature (see below) regardless of which
|
||||
signature the paired `.IDFW` uses. Purpose unknown; sizes vary
|
||||
wildly (observed 123 B → 40,491 B). Thor-watcher's forwarder
|
||||
deliberately skips them.
|
||||
|
||||
### Sample corpus
|
||||
|
||||
The `thor-watcher/example-data/THORDATA_example/` tree carries
|
||||
**1,014 paired .IDFW / .IDFH + .txt files** spanning 2020–2023
|
||||
across nine units (UM11719, UM13981, UM20147, …, plus BE9439 from
|
||||
2020). This is the reverse-engineering ground truth.
|
||||
|
||||
---
|
||||
|
||||
## ASCII sidecar (`.IDFW.txt` / `.IDFH.txt`) — fully decoded
|
||||
|
||||
Shape: plain text, one `"Key : Value"` line per metadata field,
|
||||
followed for waveforms by a tab-separated sample table headed by
|
||||
the literal line `Waveform Data Channels`. Parsed by
|
||||
[`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py).
|
||||
See [`micromate/models.py`](../micromate/models.py) for the typed
|
||||
`IdfReport` shape.
|
||||
|
||||
### Notable conventions
|
||||
|
||||
- **Units are native to Thor** — geophone in **in/s**, microphone in
|
||||
**dB(L)** (not psi like Series III BW reports), frequency in Hz,
|
||||
acceleration in g, displacement in in.
|
||||
- **Below-threshold readings** appear as the literal string
|
||||
`<0.005 in/s` (155 occurrences in the sample corpus) — the parser
|
||||
strips the `<` and treats the numeric remainder as the value.
|
||||
- **Out-of-range / not-measured** values appear as `N/A` — parser
|
||||
drops the field rather than letting the string leak into a numeric
|
||||
column.
|
||||
- **Firmware string** observed: `Micromate ISEE 11.0AK`.
|
||||
- **TitleString1..4** are operator-defined free-text slots; Thor's
|
||||
default labels map them to Location / Client / Company / Notes,
|
||||
which the parser surfaces as `project` / `client` / `operator` /
|
||||
`notes`.
|
||||
- **Histogram sidecars** use `HistogramStartDate` / `HistogramStartTime`
|
||||
in place of waveform's `EventDate` / `EventTime`. Parser falls
|
||||
through to either.
|
||||
- **Histogram tabular block** lacks the `Waveform Data Channels`
|
||||
marker; instead it's a multi-line column header followed by
|
||||
per-interval rows (`<date> <time> <tran-ppv> <freq> ...`). Parser
|
||||
silently ignores lines after the metadata block since they lack a
|
||||
colon-separated `key : value` shape (the timestamps DO contain
|
||||
colons but produce garbage keys that don't collide with any
|
||||
recognised field).
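
A rough sketch of the value conventions above, assuming a `Key : Value` line shape. It is not the parser in `micromate/idf_ascii_report.py`; `parse_numeric`, `parse_metadata`, and the `TranPPV` / `VertPPV` keys used in the usage note are hypothetical names for illustration. Unlike the real parser, this sketch stops at the `Waveform Data Channels` marker and does not try to handle the histogram table.

```python
def parse_numeric(raw: str):
    """Numeric field: 'N/A' -> None, '<0.005 in/s' -> 0.005, '0.030 in/s' -> 0.03."""
    text = raw.strip()
    if text == "N/A":
        return None
    if text.startswith("<"):
        # Below-threshold reading: strip the marker, keep the numeric remainder.
        text = text[1:].strip()
    parts = text.split()
    if not parts:
        return None
    try:
        return float(parts[0])
    except ValueError:
        return None


def parse_metadata(lines, numeric_keys):
    """Collect 'Key : Value' pairs up to the waveform sample table."""
    fields = {}
    for line in lines:
        if line.strip() == "Waveform Data Channels":
            break
        # Split on the first colon only, so times like 16:27:23 stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in numeric_keys:
            number = parse_numeric(value)
            if number is not None:   # N/A fields are dropped, not stored
                fields[key] = number
        else:
            fields[key] = value
    return fields
```

Called with `numeric_keys={"TranPPV", "VertPPV"}`, a line `TranPPV : <0.005 in/s` yields `0.005`, while `VertPPV : N/A` is omitted from the result.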
|
||||
|
||||
---
|
||||
|
||||
## Binary header signatures (observed)
|
||||
|
||||
Hex dump of the first 32 bytes across 1,014 sample files reveals
|
||||
**two distinct file signatures**, both anchored by the literal
|
||||
ASCII string `"\x00Instantel\x00"` at offsets 5–15:
|
||||
|
||||
### Signature A — newer firmware (1,012 files, 99.8% of corpus)
|
||||
|
||||
```
|
||||
00000000: 0012 0100 0000 496e 7374 616e 7465 6c00 ......Instantel.
|
||||
00000010: 0000 a695 002e b500 4f70 6572 6174 6f72 ........Operator
|
||||
^^^^^^^^^^^^^^^^
|
||||
operator/title string starts at 0x18
|
||||
```
|
||||
|
||||
Header bytes 0–5: `00 12 01 00 00 00`. Followed immediately by the
|
||||
9-byte ASCII tag and its NUL terminator, then 8 unknown bytes (0x10 to 0x17), then ASCII operator-supplied
|
||||
strings (Operator name, etc.) and on through the project / client /
|
||||
title strings. No `STRT` record observed in this layout.
|
||||
|
||||
### Signature B — older firmware (2 files: BE9439 from 2020)
|
||||
|
||||
```
|
||||
00000000: 1000 0180 0000 496e 7374 616e 7465 6c00 ......Instantel.
|
||||
00000010: 072c 0012 0300 5354 5254 fffe 0111 2340 .,....STRT....#@
|
||||
^^^^^^^^^ ^^^^^^^^^
|
||||
STRT magic 4-byte end_key
|
||||
00000020: 0111 0000 2e5f 00ac 4600 0000 0200 0000 ....._..F.......
|
||||
^^^^^^^^^ ^^^
|
||||
4-byte start_key 0x46 (BW WAVEHDR record-type marker)
|
||||
```
|
||||
|
||||
Header bytes 0–5: `10 00 01 80 00 00`. The structure after the
|
||||
`Instantel` magic is **byte-for-byte identical to a BW SUB 5A
|
||||
probe-response STRT record** as documented in
|
||||
[instantel_protocol_reference.md → "SUB 5A — STRT record encodes
|
||||
end_offset"](instantel_protocol_reference.md). Specifically:
|
||||
|
||||
| Offset | Bytes               | Meaning (per BW reference)           |
|--------|---------------------|--------------------------------------|
| 0x16   | `53 54 52 54`       | `STRT` magic                         |
| 0x1A   | `ff fe`             | STRT sentinel                        |
| 0x1C   | `01 11 23 40`       | `end_key` (4 bytes)                  |
| 0x20   | `01 11 00 00`       | `start_key` (4 bytes)                |
| 0x28   | `46`                | `0x46` waveform-record type marker   |
|
||||
|
||||
**Hypothesis:** Older Micromate firmware writes a wrapped BW-format
|
||||
event into the `.IDFW` file — essentially the same on-disk shape as
|
||||
a Series III device, with the new filename convention applied at
|
||||
export time. Newer firmware (signature A) abandoned the
|
||||
BW-compatible layout for an Instantel-specific format.
|
||||
|
||||
If that hypothesis holds, the 2 signature-B files can already be
|
||||
parsed via `minimateplus/event_file_io.read_blastware_file()` — worth
|
||||
testing. The 1,012 signature-A files are the real reverse-engineering
|
||||
target.
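
Telling the two signatures apart needs only the first 16 bytes. The sketch below is an assumption-labelled illustration (the function name `classify_idf_header` is hypothetical); it checks the 6-byte prefix from each hex dump and the `Instantel` tag with its terminating NUL that follows it.

```python
SIG_A_PREFIX = bytes.fromhex("001201000000")  # newer firmware
SIG_B_PREFIX = bytes.fromhex("100001800000")  # older firmware, also .IDFW.CDB
MAGIC = b"Instantel\x00"


def classify_idf_header(head: bytes) -> str:
    """Return 'A', 'B', or 'unknown' from the leading bytes of an IDF file."""
    if head[6:6 + len(MAGIC)] != MAGIC:
        return "unknown"
    prefix = head[:6]
    if prefix == SIG_A_PREFIX:
        return "A"
    if prefix == SIG_B_PREFIX:
        return "B"
    return "unknown"
```

A caller could route `'B'` files to an experimental `read_blastware_file()` attempt and keep `'A'` files opaque until the codec exists; whether the signature-B route actually works is still the open hypothesis stated above.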
|
||||
|
||||
### `.IDFW.CDB` cache files
|
||||
|
||||
Always carry signature B (`10 00 01 80 ...`), even when the paired
|
||||
`.IDFW` carries signature A. Plausible explanation: the CDB is an
|
||||
internal Thor cache-database export that retains the legacy BW-style
|
||||
record layout regardless of the user-facing `.IDFW` format version.
|
||||
Not currently consumed by the forwarder.
|
||||
|
||||
---
|
||||
|
||||
## File-size patterns (Signature A, the main target)
|
||||
|
||||
Survey of 1,012 signature-A files:
|
||||
|
||||
| Event type | Typical size | Source of variance |
|
||||
|--------------|-------------------|----------------------------------------------|
|
||||
| `.IDFW` 2-sec | 9,200 – 10,500 B | Operator-supplied strings (TitleString1..4) of varying length |
|
||||
| `.IDFH` | 2,944 – 4,076 B | Histogram interval count (record duration / interval) |
|
||||
|
||||
**Naive arithmetic for 2-sec waveform:**
|
||||
- 4 channels × 2 sec × 1024 sps = 8,192 samples
|
||||
- At 2 bytes/sample (int16) = 16,384 sample bytes → file would be > 16 KB
|
||||
- Observed: ~9–10 KB
|
||||
- → samples are likely **1 byte each** (int8 quantised), **or** stored
|
||||
with bit-packing / delta encoding, **or** only one channel's
|
||||
full-rate samples are stored with the others reconstructed
|
||||
arithmetically. Verifying this is the **first RE milestone**.
|
||||
|
||||
Project-string–length variance (~1 KB across the corpus) is consistent
|
||||
with the file carrying a single copy of each TitleString1..4 plus
|
||||
operator + setup-name as null-padded ASCII regions.
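
The size argument can be checked with a few lines of arithmetic. This is only a restatement of the numbers above; the constant and function names are made up for the sketch, and the observed range is the `.IDFW` 2-sec row of the table.

```python
CHANNELS = 4
RECORD_SECONDS = 2
SAMPLE_RATE_SPS = 1024
OBSERVED_MAX_B = 10_500   # upper end of the observed 2-sec .IDFW size range


def sample_payload_bytes(bytes_per_sample: int) -> int:
    """Raw sample payload for an unpacked, uncompressed 2-sec waveform."""
    return CHANNELS * RECORD_SECONDS * SAMPLE_RATE_SPS * bytes_per_sample


def fits_observed_size(bytes_per_sample: int) -> bool:
    """True if the raw payload alone could fit inside the largest observed file."""
    return sample_payload_bytes(bytes_per_sample) <= OBSERVED_MAX_B
```

One byte per sample gives 8,192 payload bytes, which fits; two bytes per sample gives 16,384, which does not. That rules out plain int16 storage but does not by itself distinguish int8 from a packed or delta-coded scheme.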
|
||||
|
||||
---
|
||||
|
||||
## Open questions
|
||||
|
||||
The reverse-engineering targets, roughly in dependency order:
|
||||
|
||||
1. **Sample encoding (signature A)** — int8? int16 LE/BE? Bit-packed?
|
||||
Delta-coded? Per-channel interleaved or sequential blocks?
|
||||
2. **Header field layout (signature A)** — where do sample_rate,
|
||||
record_time, channel count, and per-channel peaks live in the
|
||||
binary? The ASCII sidecar gives the device-authoritative values,
|
||||
so binary fields can be confirmed by diff.
|
||||
3. **Operator-string offsets** — `Operator` at 0x18 is the first
|
||||
visible string in signature-A files; the rest (project, client,
|
||||
notes, setup) follow. Need to map exact offsets and null-padding
|
||||
conventions.
|
||||
4. **Signature-B → BW codec compatibility** — does
|
||||
`minimateplus/event_file_io.read_blastware_file()` actually parse
|
||||
the 2 BE9439 signature-B files as-is? If yes, the OLD-format
|
||||
ingest is free.
|
||||
5. **`.IDFW.CDB` purpose** — is it an internal Thor cache, a
|
||||
ring-buffer dump, or something else? Worth a single small effort
|
||||
to characterise so we know what we're skipping.
|
||||
6. **Footer / checksum** — every BW event file has a footer; does
|
||||
IDF? Where does the per-channel sample block end?
|
||||
|
||||
---
|
||||
|
||||
## Reverse-engineering playbook (when we start)
|
||||
|
||||
The Series III BW codec took ~2 months of MITM wire captures
|
||||
because we didn't have ground-truth metadata. Thor's situation is
|
||||
**substantially better**:
|
||||
|
||||
- **Ground truth is on disk.** Every binary in `example-data/`
|
||||
has a paired `.IDFW.txt` carrying the full decoded sample table
|
||||
(`Waveform Data Channels` block — see any sample file in
|
||||
`thor-watcher/example-data/.../TXT/`). Aligning binary bytes
|
||||
to the table's float-per-row values gives an immediate per-byte
|
||||
hypothesis test.
|
||||
- **Cross-event diffing.** 1,012 signature-A samples from 9 units
|
||||
spanning 4 years means any field that varies between events is
|
||||
immediately localisable. Fields that are constant across all
|
||||
files (firmware ID, channel labels, format-version word) are also
|
||||
immediately localisable by complementary search.
|
||||
- **No protocol surface.** Files at rest, not a wire dialect. No
|
||||
DLE stuffing, no inner-frame parsing, no probe/data two-step.
|
||||
|
||||
Suggested first session (2-4 hours): hand-decode `UM11719_20231219162723.IDFW`
|
||||
(10,290 bytes) against its `TXT/UM11719_20231219162723.IDFW.txt`
|
||||
sample table (the 2-sec waveform at 1024 sps × 4 channels = 8,192
|
||||
sample values). Find the first per-channel sample value (`0.0003` in
|
||||
the Tran column at t=0) in the binary. Confirms sample encoding.
|
||||
Everything else flows from there.
|
||||
|
||||
---
|
||||
|
||||
## Code seams ready to receive the codec
|
||||
|
||||
When the codec lands, it goes into
|
||||
[`micromate/idf_file.py`](../micromate/idf_file.py) (currently a
|
||||
stub raising `NotImplementedError`). Public API:
|
||||
|
||||
```python
|
||||
from micromate import IdfEvent
|
||||
from micromate.idf_file import read_idf_file
|
||||
|
||||
event: IdfEvent = read_idf_file(Path("UM11719_20231219163444.IDFW"))
|
||||
# event.peaks.transverse_ips, event.timestamp, event.raw_samples, ...
|
||||
```
|
||||
|
||||
The ingest pipeline (`WaveformStore.save_imported_idf`) currently
|
||||
builds the `IdfEvent` from the `.txt` parser only. Once
|
||||
`read_idf_file()` works, the binary becomes authoritative; the
|
||||
`.txt` parser drops to fast-path metadata cross-check. Operators
|
||||
who don't enable Thor's TXT exporter still get fully populated
|
||||
events.
|
||||
|
||||
---
|
||||
|
||||
## See also
|
||||
|
||||
- [instantel_protocol_reference.md](instantel_protocol_reference.md) — Series III BW protocol reference (the Rosetta Stone). STRT record format, DLE framing, BW filename encoding.
|
||||
- [`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py) — `.txt` sidecar parser.
|
||||
- [`micromate/models.py`](../micromate/models.py) — `IdfEvent`, `IdfReport` typed dataclasses.
|
||||
- [`micromate/idf_file.py`](../micromate/idf_file.py) — placeholder for the binary codec.
|
||||
- [`thor-watcher/example-data/THORDATA_example/`](../../thor-watcher/example-data/) — 1,014 paired binary + .txt files for codec validation.
|
||||
+322
-1093
File diff suppressed because it is too large
@@ -1,255 +0,0 @@
|
||||
# Runbook — Recovering a wedged unit stuck in a call-home loop
|
||||
|
||||
**Original incident:** BE9558H at `166.246.130.1:9034`, recovered 2026-05-17.
|
||||
|
||||
A field unit with a stuck-triggered geophone (or any hardware fault causing
|
||||
constant event triggering) will record events back-to-back, and if Auto Call
|
||||
Home is set to "After Event Recorded" the device will dial the office BW
|
||||
ACH server in a tight loop. Combined with a Sierra Wireless modem in
|
||||
bidirectional serial-TCP mode, this makes the unit effectively unreachable
|
||||
from SFM — every TCP connection we open gets killed when the modem flips
|
||||
from server-mode to client-mode to honor the device's next AT dial command.
|
||||
|
||||
This runbook describes how to break the loop and recover control.
|
||||
|
||||
---
|
||||
|
||||
## Symptoms
|
||||
|
||||
- Terra-View / SFM `/device/info` either hangs or fails on `count_events()`.
|
||||
- `/device/monitor/status` and `/device/rescue` return 502 (protocol timeout
|
||||
waiting for POLL response) or 503 (TCP connect refused).
|
||||
- ACEmanager serial log shows repeating
|
||||
`Connect to IP: <BW_IP> Port: <BW_PORT>` → `Shutdown TCP socket` cycles
|
||||
every 30-60 seconds.
|
||||
- Spam-mode endpoints (`/device/stop_monitoring_spam`) report many
|
||||
`sent_ok` but the device's monitoring state never changes.
|
||||
- `slow_drip` reports `[Errno 32] Broken pipe` after sending the preamble
|
||||
but before completing the drip loop.
|
||||
|
||||
If you see *all* of these, the unit is in this exact failure mode.
|
||||
|
||||
---
|
||||
|
||||
## Quick reference — how to recover
|
||||
|
||||
You need **ACEmanager access** to the unit's modem.
|
||||
|
||||
### Step 1: stop the modem's mode-flipping
|
||||
|
||||
In ACEmanager → **Serial → Port Configuration**:
|
||||
|
||||
| Field | Set to |
|
||||
|---|---|
|
||||
| **Destination Address** | clear (blank) |
|
||||
| **Destination Port** | `0` |
|
||||
|
||||
Click **Apply**. This removes the modem's auto-dial-out target. The device's
|
||||
AT dial commands now error back at the modem instead of triggering a
|
||||
mode-flip, so the modem stays in TCP-server mode permanently and our inbound
|
||||
TCP sessions stay alive.
|
||||
|
||||
*(Optional belt-and-suspenders: also add the BW server's port to
|
||||
**Security → Port Filtering - Outbound** as a blocked port, with
|
||||
Outbound Port Filtering Mode = Blocked Ports.)*
|
||||
|
||||
### Step 2: stop monitoring on the device (slow drip)
|
||||
|
||||
From the SFM host:
|
||||
|
||||
```bash
|
||||
/home/serversdown/seismo-relay/scripts/slow_drip.sh <DEVICE_IP> <PORT>
|
||||
```
|
||||
|
||||
Defaults are 120s duration with a drip every 3s. Watch the response:
|
||||
|
||||
- `duration_s ≈ 120` and `drips_sent ≈ 40` → session held the full duration ✓
|
||||
- `bytes_received > 0` → device is responding ✓ (this is the success signal)
|
||||
|
||||
If `duration_s` is small or `send_error: "Broken pipe"`, Step 1 didn't take
|
||||
hold — re-check ACEmanager, may need to reboot the modem after Apply.
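
The checks above can be applied mechanically to the JSON the slow-drip endpoint returns. A minimal sketch, assuming the response is a flat object with the four fields named in this step (`duration_s`, `drips_sent`, `bytes_received`, `send_error`); the function name, the 90% threshold, and the verdict strings are illustrative and not part of SFM:

```python
from typing import Any, Dict


def assess_slow_drip(resp: Dict[str, Any], expected_duration_s: float = 120.0) -> str:
    """Classify a slow-drip response using the rules from Step 2.

    Returns one of three illustrative verdicts:
      "session_dropped" - the modem closed the session early (re-check Step 1)
      "no_response"     - the session held but the device sent nothing back
      "ok"              - the session held and the device responded
    """
    # A send error (e.g. "Broken pipe") means the TCP session was killed.
    if resp.get("send_error"):
        return "session_dropped"
    # Hypothetical threshold: treat anything under 90% of the target as "small".
    if resp.get("duration_s", 0.0) < 0.9 * expected_duration_s:
        return "session_dropped"
    # bytes_received > 0 is the success signal described in the runbook.
    if resp.get("bytes_received", 0) > 0:
        return "ok"
    return "no_response"
```

A `session_dropped` verdict points back at Step 1; `no_response` means the session held but the device never answered, which the runbook does not treat as success.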
|
||||
|
||||
### Step 3: confirm monitoring stopped
|
||||
|
||||
```bash
|
||||
curl 'http://localhost:8200/device/monitor/status?host=<DEVICE_IP>&tcp_port=<PORT>&force=true'
|
||||
# expect: {"is_monitoring": false, ...}
|
||||
```
|
||||
|
||||
### Step 4: disable ACH at the device level + erase corrupted events
|
||||
|
||||
Either fire the rescue endpoint:
|
||||
|
||||
```bash
|
||||
/home/serversdown/seismo-relay/scripts/rescue_device.sh <DEVICE_IP> <PORT>
|
||||
```
|
||||
|
||||
Or do the two steps manually:
|
||||
|
||||
```bash
|
||||
# Disable ACH in the device's compliance config
|
||||
curl -X POST 'http://localhost:8200/device/call_home?host=<DEVICE_IP>&tcp_port=<PORT>' \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"auto_call_home_enabled": false}'
|
||||
|
||||
# Erase corrupted event chain
|
||||
curl -X POST 'http://localhost:8200/device/events/erase?host=<DEVICE_IP>&tcp_port=<PORT>'
|
||||
```
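
The same two manual calls can be scripted. The sketch below only builds the requests so they can be inspected before anything is sent; the helper names are hypothetical, while the paths, query parameters, JSON body, and the `SFM_BASE_URL` override are taken from this runbook:

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional


def sfm_url(path: str, host: str, tcp_port: int, base: Optional[str] = None) -> str:
    """Build an SFM URL with the host/tcp_port query used throughout this runbook."""
    base = base or os.environ.get("SFM_BASE_URL", "http://localhost:8200")
    query = urllib.parse.urlencode({"host": host, "tcp_port": tcp_port})
    return f"{base.rstrip('/')}{path}?{query}"


def build_disable_ach(host: str, tcp_port: int, base: Optional[str] = None) -> urllib.request.Request:
    """POST /device/call_home with auto_call_home_enabled set to false."""
    body = json.dumps({"auto_call_home_enabled": False}).encode("utf-8")
    return urllib.request.Request(
        sfm_url("/device/call_home", host, tcp_port, base),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_erase(host: str, tcp_port: int, base: Optional[str] = None) -> urllib.request.Request:
    """POST /device/events/erase (no request body)."""
    return urllib.request.Request(
        sfm_url("/device/events/erase", host, tcp_port, base),
        method="POST",
    )
```

Each request can then be sent with `urllib.request.urlopen(req, timeout=...)`, disabling ACH first and erasing second, in the same order as the curl commands.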
|
||||
|
||||
You can also do this via the SFM standalone UI → **Call Home** tab → set
|
||||
`Enable Auto Call Home` to `Disabled` → **Write to Device**.
|
||||
|
||||
### Step 5: restore modem config (housekeeping)
|
||||
|
||||
Once the device-side ACH is disabled, restore the modem's Destination
|
||||
Address and Port to the original values (e.g. `50.197.32.92` / `12345`) in
|
||||
ACEmanager. The modem will resume normal bidirectional behavior, but the
|
||||
unit won't issue any dial commands until ACH is explicitly re-enabled on
|
||||
the device.
|
||||
|
||||
### Step 6: do NOT re-enable ACH until the hardware fault is repaired

Do not re-enable ACH on this unit until the underlying hardware fault is
repaired. If you do, the call-home loop starts again immediately and you'll
be running this runbook a second time.
|
||||
|
||||
---
|
||||
|
||||
## Why this works — the failure mode explained
|
||||
|
||||
The Sierra Wireless RV50/RV55 serial port operates in one of two TCP modes
|
||||
at any moment:
|
||||
|
||||
- **Server mode** — listens on `Device Port` (e.g. 9034), bridges inbound
|
||||
TCP to the device's serial port. This is what we need to interact with
|
||||
the device.
|
||||
- **Client mode** — when the device sends an AT dial command on its serial
|
||||
TX line, the modem opens an outbound TCP to `Destination Address:Port`
|
||||
and bridges that to serial.
|
||||
|
||||
A serial port in this configuration is **bidirectional**: the modem flips
|
||||
between server and client modes on demand. When the device's firmware is
|
||||
healthy and only dials occasionally, this works fine.
|
||||
|
||||
When the unit is constantly triggering events and ACH is set to "After
|
||||
Event Recorded", the device sends an AT dial command every few seconds.
|
||||
Each one causes the modem to:
|
||||
|
||||
1. Drop any active inbound TCP session
|
||||
2. Flip to client mode
|
||||
3. Attempt outbound TCP to `Destination Address:Port`
|
||||
4. Hang for up to a minute waiting for it to succeed/fail
|
||||
5. Drop back to server mode
|
||||
|
||||
**During the entire hang, no inbound TCP can establish.** Even between
|
||||
hangs, the modem closes any existing inbound session before flipping. So
|
||||
any tool that needs more than a few seconds of held TCP (e.g. POLL +
|
||||
config read + write) gets repeatedly kicked off.
|
||||
|
||||
Clearing `Destination Address` removes step 3-4 from the cycle: the modem
|
||||
has nowhere to dial, so it doesn't flip modes when it receives an AT dial
|
||||
command. The serial port effectively becomes server-only, and inbound TCP
|
||||
sessions can stay open as long as needed.
|
||||
|
||||
**This is a modem-layer issue, not a device firmware issue.** The device
|
||||
is alive and responsive the whole time — confirmed in the BE9558H
|
||||
recovery by 990 bytes of S3 responses received over a 120s slow-drip
|
||||
session once the modem was no longer mode-flipping.
|
||||
|
||||
---
|
||||
|
||||
## Why simpler approaches don't work
|
||||
|
||||
| Approach | Why it fails |
|
||||
|---|---|
|
||||
| Standard `/device/info` | Triggers `count_events()` 1E/1F walk, takes 90s+ and hits corrupted event chain in this scenario |
|
||||
| `/device/rescue` race loop | Gets 502 (protocol timeout) because the modem closes the TCP before the POLL handshake can complete |
|
||||
| `/device/stop_monitoring_blind` (single frame) | Even if the bytes leave the wire, the device's protocol parser ignores write commands without a preceding POLL handshake (early-version bug, now fixed by including POLL preamble in blind sends) |
|
||||
| `/device/stop_monitoring_spam` (sub-second cadence) | Each session is killed by the modem's mode-flip before the device can drain its UART RX buffer; high-rate spam also risks UART FIFO overrun on the device side |
|
||||
| Outbound port firewall block alone | Stops the outbound TCP from succeeding, but doesn't stop the modem from *trying* and mode-flipping. Reduces but doesn't eliminate the contention. |
|
||||
| Modem reboot | Temporary — as soon as the device starts triggering again, the loop resumes within seconds |
|
||||
|
||||
The combination of `slow_drip` + cleared `Destination Address` works because:
|
||||
|
||||
1. The modem stops mode-flipping → TCP session stays open for the full
|
||||
drip duration
|
||||
2. Slow drip rate → device's UART RX FIFO never overflows even if
|
||||
firmware is busy with event recording
|
||||
3. The drip is `SESSION_RESET + STOP_MONITORING` every 3s → many
|
||||
independent chances for the parser to land one valid frame
|
||||
4. Once one Stop Monitoring is parsed, event recording halts → firmware
|
||||
has CPU to spare → subsequent operations are trivially easy
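
The held-session drip described in points 1 to 3 reduces to a simple loop. This is a sketch of the idea, not the code behind `/device/stop_monitoring_slow_drip`: the frame bytes are passed in as an opaque value (the real `SESSION_RESET` and `STOP_MONITORING` encodings are not reproduced here), and `send`, `sleep`, and `clock` are injectable assumptions so the loop can be exercised without a device:

```python
import time
from typing import Any, Callable, Dict


def slow_drip(
    send: Callable[[bytes], Any],
    frame: bytes,
    duration_s: float = 120.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Send `frame` once per `interval_s` over one held session.

    `send` is expected to write to an already-connected socket
    (for example `sock.sendall`). The loop stops at the first OSError,
    which is how a modem-side close surfaces as "Broken pipe".
    """
    start = clock()
    drips_sent = 0
    while clock() - start < duration_s:
        try:
            send(frame)
        except OSError as exc:
            return {
                "drips_sent": drips_sent,
                "duration_s": clock() - start,
                "send_error": str(exc),
            }
        drips_sent += 1
        sleep(interval_s)
    return {"drips_sent": drips_sent, "duration_s": clock() - start, "send_error": None}
```

With the defaults this yields 120 / 3 = 40 drips, matching the `drips_sent ≈ 40` figure in Step 2.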
|
||||
|
||||
---
|
||||
|
||||
## Tooling reference
|
||||
|
||||
All endpoints live in `seismo-relay/sfm/server.py`. All scripts live in
|
||||
`seismo-relay/scripts/` and default to SFM direct (`http://localhost:8200`),
|
||||
overridable via `SFM_BASE_URL`.
|
||||
|
||||
### Endpoints added during BE9558H recovery
|
||||
|
||||
| Endpoint | Purpose |
|
||||
|---|---|
|
||||
| `GET /device/events/storage_range` | SUB 0x06 — first/last event keys, `is_empty` flag. ~2s, no event walk. |
|
||||
| `GET /device/events/index` | SUB 0x08 — lifetime event counter (does NOT decrement on erase). ~2s. |
|
||||
| `POST /device/events/erase` | Full erase sequence 0xA3 → 0x1C → 0x06 → 0xA2. |
|
||||
| `POST /device/rescue` | Disable ACH + erase in one TCP session. Short timeouts for race-loop usage. |
|
||||
| `POST /device/stop_monitoring_blind` | Fire-and-forget Stop with full POLL preamble (single attempt). |
|
||||
| `POST /device/stop_monitoring_spam` | Server-side tight retry loop, sub-second cadence, duration-bounded. |
|
||||
| `POST /device/stop_monitoring_slow_drip` | One held TCP session, slow trickle of stop frames. **The endpoint that saved BE9558H.** |
|
||||
|
||||
Also changed: default protocol recv timeout dropped from 30s → 10s in
|
||||
`_build_client`. Added `connect_timeout` knob to same. Cleaned up
|
||||
unhandled-exception path in `/device/monitor/status` so it returns 502
|
||||
instead of 500 on protocol timeouts.
|
||||
|
||||
### Scripts
|
||||
|
||||
| Script | Purpose |
|
||||
|---|---|
|
||||
| `scripts/rescue_device.sh` | Race-loop wrapper around `/device/rescue` |
|
||||
| `scripts/blind_stop.sh` | Race-loop wrapper around `/device/stop_monitoring_blind` |
|
||||
| `scripts/spam_stop.sh` | Single-call burst hammer (`/device/stop_monitoring_spam`) |
|
||||
| `scripts/slow_drip.sh` | Single-call held-session drip (`/device/stop_monitoring_slow_drip`) |
|
||||
| `scripts/watch_unit.sh` | Passive periodic reachability check, logs to file |
|
||||
|
||||
---
|
||||
|
||||
## Incident log — BE9558H, 2026-05-16/17
|
||||
|
||||
What was wrong: Long-axis geophone developed an offset, constantly above
|
||||
trigger threshold → constant event recording → after-event ACH set →
|
||||
modem dialing office BW server (`50.197.32.92:12345`) every 30-60s.
|
||||
Local event chain corrupted (`next_boundary 0x100EE exceeds uint16`).
|
||||
|
||||
Diagnostic path:
|
||||
|
||||
1. `/device/info` slow, choked on event walk
|
||||
2. Built lightweight probe endpoints (`storage_range`, `index`) — useful
|
||||
but didn't reach the wedged unit
|
||||
3. Built `/device/rescue` with short timeouts — got 502 (POLL no response)
|
||||
4. Built `/device/stop_monitoring_blind` — first version was a false
|
||||
positive (no POLL preamble); fixed by including
|
||||
`SESSION_RESET+POLL_PROBE+SESSION_RESET+POLL_DATA` in the dump
|
||||
5. Verified blind stop works on bench unit
|
||||
6. Built `/device/stop_monitoring_spam` — 420 successful sends over
|
||||
5 min, zero behavior change on field unit
|
||||
7. Inspected ACEmanager logs → saw outbound dial-out attempts every ~30s,
|
||||
confirmed device was not fully locked up
|
||||
8. Added outbound port-12345 firewall block → outbound attempts now fail
|
||||
instantly but contention persisted
|
||||
9. Built `/device/stop_monitoring_slow_drip` — session died at 3s with
|
||||
broken pipe (modem closing on us)
|
||||
10. Looked at full ACEmanager Port Configuration → **found
|
||||
`Destination Address: 50.197.32.92` configured**, realized every AT
|
||||
dial command was triggering a modem mode-flip that killed our inbound
TCP session
|
||||
11. Cleared Destination Address + Port → slow_drip held 120s, device
|
||||
responded with 990 bytes, 39 stop commands acked
|
||||
12. Disabled ACH at device level via `/device/call_home`, erased events
|
||||
|
||||
Final state: device IDLE, memory 958.1 / 960 KB free, ACH disabled at
|
||||
device level, modem destination cleared (to be restored after physical
|
||||
service).
|
||||
|
||||
Total time from "i was wondering if its possible to" first attempt to
|
||||
recovery: ~7 hours of intermittent debugging across one evening.
|
||||
@@ -1,264 +0,0 @@
|
||||
# Waveform body codec — FULLY DECODED (2026-05-11)
|
||||
|
||||
This is the **clean working note** for the body-codec reverse-engineering
|
||||
effort. It supersedes scattered claims elsewhere when they conflict.
|
||||
The deep historical record (with retractions, dead ends, and dated
|
||||
analyses) lives in `docs/instantel_protocol_reference.md §7.6.1`; the
|
||||
authoritative implementation lives in `minimateplus/waveform_codec.py`.
|
||||
|
||||
## TL;DR
|
||||
|
||||
**The codec is fully decoded.** Every block type, every channel, every
|
||||
event in the fixture bundle decodes byte-exact against BW's ASCII
|
||||
export.
|
||||
|
||||
| Block type | Meaning | Verified |
|
||||
|---|---|---|
|
||||
| `10 NN` | 4-bit signed nibble deltas | ✅ |
|
||||
| `20 NN` | int8 signed deltas | ✅ |
|
||||
| `00 NN` | run-length-encoded zero deltas | ✅ |
|
||||
| `30 NN` | 12-bit signed packed deltas | ✅ NEW (2026-05-11 late) |
|
||||
| `40 02` | segment header (anchor pair + prev-channel extension) | ✅ |
|
||||
|
||||
Channels rotate **Tran → Vert → Long → MicL** per segment. Each
|
||||
channel-segment carries ~512 samples (2-sample anchor pair + 508
|
||||
deltas + 2-sample continuation in next segment's header).
|
||||
|
||||
## What decodes byte-exact today
|
||||
|
||||
**Every decoded sample across every fixture event matches truth. Zero
|
||||
divergences.**
|
||||
|
||||
| Event | Description | Tran | Vert | Long | Total |
|
||||
|---|---|---|---|---|---|
|
||||
| event-a (5-8) | quiet, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| event-c (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
|
||||
| event-d (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
|
||||
| JQ0 (5-11) | Vert-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| V70 (5-11) | Mic-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
|
||||
| SP0 (5-11) | loud all, 3 sec | 2048 ✓ | 1538 ✓ | 1536 ✓ | 5122 |
|
||||
| SS0 (5-11) | loud-from-start | 734 ✓ | 512 ✓ | 512 ✓ | 1758 |
|
||||
| SV0 (5-11) | loud-from-start | 1024 ✓ | 578 ✓ | 512 ✓ | 2114 |
|
||||
| event-b (5-8) | quiet, 2 sec | 512 ✓ | 226 ✓ | 0 | 738 |
|
||||
|
||||
That's **47,364 ADC samples decoded byte-exact, zero errors.**
|
||||
|
||||
Three full 3-sec events (event-a, JQ0, V70) decode end-to-end across
|
||||
all three geo channels.
|
||||
|
||||
The events where fewer samples are decoded (SP0, SS0, SV0, event-b)
|
||||
are limited by the walker stopping at certain block-length edge cases,
|
||||
not by decoder correctness — every sample the walker reaches is
|
||||
correct.
|
||||
|
||||
## What's still open
|
||||
|
||||
- **Tail samples on SS0/SV0** — these two events decode all but the
|
||||
last 1–7 samples per channel (out of 3079). Likely the same
|
||||
"last segment is truncated" pattern. Minor; doesn't affect the
|
||||
bulk of the data.
|
||||
|
||||
## Sample counts (72,972 byte-exact total)
|
||||
|
||||
| Event | Tran | Vert | Long | Status |
|
||||
|---|---|---|---|---|
|
||||
| event-a | 3328 | 3328 | 3328 | full |
|
||||
| event-b | 2304 | 2304 | 2304 | full |
|
||||
| event-c | 1280 | 1280 | 1280 | full |
|
||||
| event-d | 1280 | 1280 | 1280 | full |
|
||||
| JQ0 | 3328 | 3328 | 3328 | full |
|
||||
| V70 | 3328 | 3328 | 3328 | full |
|
||||
| SP0 | 3328 | 3328 | 3328 | full |
|
||||
| SS0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
|
||||
| SV0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
|
||||
|
||||
## What's now wired into production (2026-05-11 late)
|
||||
|
||||
- **`client.py:_decode_a5_waveform`** — now uses
|
||||
`decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
|
||||
`event.raw_samples` is populated with int16 ADC counts that flow
|
||||
through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
|
||||
Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
|
||||
reference but is not called.
|
||||
|
||||
- **MicL → dB(L) conversion** — exposed as
|
||||
`waveform_codec.mic_count_to_db(count)`. Verified against BW
|
||||
display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
|
||||
the V70 mic-heavy fixture exactly).
|
||||
|
||||
- **`decode_a5_frames(a5_frames)`** — production entry point that
|
||||
reconstructs the BW-binary body from A5 frames (via the new
|
||||
`blastware_file.extract_body_bytes` helper) and runs the verified
|
||||
codec. Returns the same `raw_samples` dict shape the consumers
|
||||
already expect.
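
The two MicL data points quoted above (count=1 → 81.94 dB, count=813 → 140.14 dB) are consistent with a plain `20·log10` law offset by 81.94 dB. The sketch below is inferred only from those two points; it is not the body of `waveform_codec.mic_count_to_db`, and the function name, the use of `abs()`, and the zero-count handling are assumptions:

```python
import math

# Quoted in this note: a count of 1 displays as 81.94 dB(L).
MIC_DB_AT_ONE_COUNT = 81.94


def mic_count_to_db_sketch(count: int) -> float:
    """Hypothetical MicL count -> dB(L) mapping fitted to the two quoted points."""
    if count == 0:
        # Assumption: a zero count has no finite dB value in a log law.
        raise ValueError("zero count has no finite dB value")
    # Assumption: the sign of the count is ignored (pressure magnitude).
    return MIC_DB_AT_ONE_COUNT + 20.0 * math.log10(abs(count))
```

Rounded to two decimals, `mic_count_to_db_sketch(813)` reproduces the 140.14 dB figure for the V70 fixture. Other counts should be checked against BW's display before relying on this.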
|
||||
|
||||
## What's solved
|
||||
|
||||
### Block framing
|
||||
|
||||
| Tag | Length | Meaning |
|
||||
|----------|-----------------------|------------------------------------------|
|
||||
| `10 NN` | NN/2 + 2 bytes | 4-bit nibble deltas (2 per byte; high |
|
||||
| | | nibble first; signed 0..7 / 8..F = -8..-1)|
|
||||
| `20 NN` | NN + 2 bytes | int8 signed deltas (1 per byte) |
|
||||
| `00 NN` | 2 bytes | RLE: append NN copies of current value |
|
||||
| `30 NN` | NN*2 in data section, | Unknown content. Only in loud-from- |
|
||||
| | NN*4 in trailer | start events. |
|
||||
| `40 02` | 20 bytes (fixed) | Segment header |
|
||||
|
||||
NN is always a multiple of 4.
|
||||
|
||||
Implementation: `walk_body()` in `minimateplus/waveform_codec.py`.
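
The framing rules can be sketched as a length function plus a generator that hops from tag to tag. This is an illustration, not the real `walk_body()`. It covers the data section only, and for `30 NN` it uses the `NN × 1.5 + 2` length given in the "`30 NN` block format" section of this note rather than the table row above. The names `block_length` and `walk_blocks` are hypothetical:

```python
from typing import Iterator, Tuple


def block_length(tag: int, nn: int) -> int:
    """Total block length in bytes, including the 2-byte tag."""
    if tag == 0x10:
        return nn // 2 + 2        # two 4-bit deltas per payload byte
    if tag == 0x20:
        return nn + 2             # one int8 delta per payload byte
    if tag == 0x00:
        return 2                  # RLE: no payload
    if tag == 0x30:
        return nn * 3 // 2 + 2    # 12-bit deltas, 6 bytes per group of 4
    if tag == 0x40 and nn == 0x02:
        return 20                 # fixed-size segment header
    raise ValueError(f"unknown block tag {tag:02x} {nn:02x}")


def walk_blocks(data: bytes, start: int = 0) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (offset, tag, nn, length) for each block from `start` onward."""
    pos = start
    while pos + 2 <= len(data):
        tag, nn = data[pos], data[pos + 1]
        length = block_length(tag, nn)
        yield pos, tag, nn, length
        pos += length
```

For a real body, `start` would be 7 to skip the preamble described in the next section.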
|
||||
|
||||
### 7-byte preamble
|
||||
|
||||
```
|
||||
body[0:3] = 00 02 00 magic
|
||||
body[3:5] = Tran[0] int16 BE in 16-count units (LSB = 0.005 in/s)
|
||||
body[5:7] = Tran[1] int16 BE in 16-count units
|
||||
```
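
A minimal sketch of reading the preamble with `struct`, assuming the layout shown above. The function name is hypothetical. The in/s conversion applies the stated LSB of 0.005 in/s directly to the stored value:

```python
import struct
from typing import Tuple

PREAMBLE_MAGIC = b"\x00\x02\x00"
LSB_IN_PER_S = 0.005  # per the layout above: one stored unit = 0.005 in/s


def parse_preamble(body: bytes) -> Tuple[int, int]:
    """Return (Tran[0], Tran[1]) as stored int16 values (16-count units)."""
    if len(body) < 7:
        raise ValueError("body shorter than the 7-byte preamble")
    if body[0:3] != PREAMBLE_MAGIC:
        raise ValueError("unexpected preamble magic")
    # Two big-endian signed 16-bit anchors.
    return struct.unpack(">hh", body[3:7])
```

For example, a stored anchor of 100 corresponds to `100 * LSB_IN_PER_S`, i.e. 0.5 in/s.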
|
||||
|
||||
### Tran channel, segment 0
|
||||
|
||||
Segment 0 (everything before the first `40 02`) encodes Tran samples
|
||||
only. Starting from preamble anchors Tran[0] and Tran[1], each block
|
||||
contributes to a running cumulative:
|
||||
|
||||
- `10 NN` → append NN nibble-deltas
|
||||
- `20 NN` → append NN int8-deltas
|
||||
- `00 NN` → append NN copies of current value (RLE)
|
||||
- `40 02` → end segment 0
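
The four rules above can be sketched as a single loop over the body. This is an illustration of the running-cumulative idea, not the real `decode_tran_initial()`: it stops at any tag it does not handle (including `30 NN`), and the function names are hypothetical:

```python
import struct
from typing import List


def nibble_to_signed(n: int) -> int:
    """Map a 4-bit value to a signed delta: 0..7 stay, 8..F become -8..-1."""
    return n - 16 if n >= 8 else n


def decode_segment0(body: bytes) -> List[int]:
    """Decode Tran samples from the preamble up to the first unhandled tag."""
    if body[0:3] != b"\x00\x02\x00":
        raise ValueError("unexpected preamble magic")
    tran0, tran1 = struct.unpack(">hh", body[3:7])
    samples = [tran0, tran1]
    pos = 7
    while pos + 2 <= len(body):
        tag, nn = body[pos], body[pos + 1]
        pos += 2
        if tag == 0x10:
            # NN nibble deltas packed two per byte, high nibble first.
            for byte in body[pos:pos + nn // 2]:
                for nib in (byte >> 4, byte & 0x0F):
                    samples.append(samples[-1] + nibble_to_signed(nib))
            pos += nn // 2
        elif tag == 0x20:
            # NN signed 8-bit deltas, one per byte.
            for delta in struct.unpack(f">{nn}b", body[pos:pos + nn]):
                samples.append(samples[-1] + delta)
            pos += nn
        elif tag == 0x00:
            # RLE: NN copies of the current value (zero deltas).
            samples.extend([samples[-1]] * nn)
        else:
            break  # `40 02` ends segment 0; other tags are out of scope here
    return samples
```

Each delta is added to the previous sample, so an error in one block shifts every sample after it. That is why the fixture comparisons check whole segments.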
|
||||
|
||||
Verified byte-exact:
|
||||
|
||||
| Event | Description | Segment 0 size | Match |
|
||||
|---|---|---|---|
|
||||
| `M529LL1A.SP0` | Loud, 0.25 s pretrig | 510 | 510/510 ✓ |
|
||||
| `M529LL1A.SV0` | Loud from sample 0 | 58 | 58/58 ✓ (stops at first `30 NN`) |
|
||||
| `M529LL1A.SS0` | Loud from sample 0 | 42 | 42/42 ✓ (stops at first `30 04`) |
|
||||
| `M529LL1L.JQ0` | Vert-heavy | 510 | 510/510 ✓ |
|
||||
| `M529LL1L.V70` | Mic-heavy (140 dB) | 510 | 510/510 ✓ |
|
||||
|
||||
Implementation: `decode_tran_initial()`.
|
||||
|
||||
### Segment header (`40 02`, 20 bytes total) — REWRITTEN 2026-05-11
|
||||
|
||||
| Payload offset | Field | Status |
|
||||
|---|---|---|
|
||||
| [0:2] | Previous-channel delta — 1st extension sample (int16 BE) | ✅ confirmed |
|
||||
| [2:4] | Previous-channel delta — 2nd extension sample (int16 BE) | ✅ confirmed |
|
||||
| [4:6] | Unknown (likely checksum) | ❓ open |
|
||||
| [6:8] | Byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
|
||||
| [8:12] | Monotonic uint32 LE counter (starts ~0x47) | ✅ confirmed |
|
||||
| [12:14] | Constant `02 00` | ✅ confirmed |
|
||||
| [14:16] | THIS segment's channel — sample 0 anchor (int16 BE, 16-count units) | ✅ confirmed |
|
||||
| [16:18] | THIS segment's channel — sample 1 anchor (int16 BE, 16-count units) | ✅ confirmed |
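
The field table maps directly onto `struct` formats. A minimal sketch, assuming the offsets above are relative to the 18-byte payload that follows the 2-byte `40 02` tag; the class and function names are hypothetical, and bytes [4:6] are kept raw because their meaning is still open:

```python
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SegmentHeader:
    prev_extension: Tuple[int, int]  # [0:4]  two int16 BE values for the previous channel
    unknown: bytes                   # [4:6]  meaning not yet known
    length_field: int                # [6:8]  uint16 BE (byte length to next header, minus 2)
    counter: int                     # [8:12] uint32 LE monotonic counter
    anchors: Tuple[int, int]         # [14:18] two int16 BE anchors for this segment's channel


def parse_segment_header(block: bytes) -> SegmentHeader:
    """Parse one 20-byte `40 02` block (tag included)."""
    if len(block) != 20 or block[0:2] != b"\x40\x02":
        raise ValueError("not a 20-byte 40 02 segment header")
    p = block[2:]
    if p[12:14] != b"\x02\x00":
        raise ValueError("expected constant 02 00 at payload [12:14]")
    return SegmentHeader(
        prev_extension=struct.unpack(">hh", p[0:4]),
        unknown=p[4:6],
        length_field=struct.unpack(">H", p[6:8])[0],
        counter=struct.unpack("<I", p[8:12])[0],
        anchors=struct.unpack(">hh", p[14:18]),
    )
```

Note the mixed endianness: the counter is little-endian while every other multi-byte field is big-endian.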
|
||||
|
||||
**Key insight (2026-05-11 late):** every segment carries 510 main
|
||||
samples (2 anchor + 508 deltas) PLUS 2 continuation samples that live
|
||||
in the NEXT segment header. So each channel-segment effectively spans
|
||||
512 sample-sets. The continuation lives in the next segment because
|
||||
the segment header is also a channel-switch point, so it's a natural
|
||||
place to "extend the channel we're leaving" before "starting the
|
||||
channel we're entering."
|
||||
|
||||
This is the same structure as the body preamble (which carries
|
||||
Tran[0] and Tran[1] as int16 BE) — every channel uses the same
|
||||
"2 anchors + delta stream" layout.
|
||||
|
||||
## Channel rotation — VERIFIED 2026-05-11
|
||||
|
||||
```
|
||||
(initial body) → Tran samples 0..509 (preamble + delta blocks)
|
||||
segment 0 hdr ext+anchor → Vert samples 0..511 ← anchor in hdr [14:18]
|
||||
segment 1 hdr ext+anchor → Long samples 0..511
|
||||
segment 2 hdr ext+anchor → Mic samples 0..511
|
||||
segment 3 hdr ext+anchor → Tran samples 510..1021 (continuation)
|
||||
segment 4 hdr ext+anchor → Vert samples 512..1023
|
||||
segment 5 hdr ext+anchor → Long samples 512..1023
|
||||
segment 6 hdr ext+anchor → Mic samples 512..1023
|
||||
segment 7 hdr ext+anchor → Tran samples 1022..1533
|
||||
...
|
||||
```
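
The rotation above follows a fixed pattern, so the target channel and sample range for any segment header index can be computed directly. A small sketch with a hypothetical function name; the 510-sample offset for Tran comes from the initial body carrying Tran samples 0..509 before the first header:

```python
from typing import Tuple

# Order of channels introduced by segment headers 0, 1, 2, 3, then repeating.
ROTATION = ("Vert", "Long", "MicL", "Tran")


def segment_target(header_index: int) -> Tuple[str, int, int]:
    """Return (channel, first_sample, last_sample) for a segment header index."""
    cycle, slot = divmod(header_index, 4)
    channel = ROTATION[slot]
    # Tran already has samples 0..509 from the initial body, so it runs 510 ahead.
    start = 512 * cycle + (510 if channel == "Tran" else 0)
    return channel, start, start + 511
```

For instance, `segment_target(7)` gives `("Tran", 1022, 1533)`, matching the last row of the diagram.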
|
||||
|
||||
Implementation: `decode_waveform_v2()` returns
|
||||
`{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}` with
|
||||
each channel's samples in 16-count units. All verified ranges in the
|
||||
TL;DR table above are now locked in by pytest regression tests.
|
||||
|
||||
## What's still open
|
||||
|
||||
1. **`30 NN` block content.** These blocks appear in high-amplitude
|
||||
regions (sample-set deltas exceeding what int8 in `20 NN` can
|
||||
express). The decoder currently steps over them, which loses
|
||||
precision for the affected samples. Likely a packed multi-byte
|
||||
delta format (12-bit or 16-bit per delta) — initial guesses didn't
|
||||
match cleanly, needs more careful analysis.
|
||||
|
||||
2. **MicL decoding.** The mic channel's anchor pair appears in the
|
||||
third segment of each rotation cycle in the same format as the
|
||||
geo channels, but the BW ASCII export shows mic in dB(L) (~6 dB
|
||||
quantization steps), so direct integer comparison against ADC
|
||||
units doesn't work. Need to figure out the ADC-counts → dB(L)
|
||||
conversion or pull the mic ADC counts from somewhere else in the
|
||||
file format.
|
||||
|
||||
3. **Walker fix for event-b.** The original quiet bundle's event-b
|
||||
still bails out partway through. Lower priority since the other
|
||||
7 events walk cleanly.
|
||||
|
||||
## `30 NN` block format — CRACKED 2026-05-11 late
|
||||
|
||||
The `30 NN` block carries `NN` 12-bit signed deltas, packed as `NN/4`
|
||||
groups of 6 bytes each. Within each 6-byte group:
|
||||
|
||||
```
|
||||
bytes [0:2] = 16 bits = 4 × 4-bit "high nibbles" (MSB-first)
|
||||
bytes [2:6] = 4 × int8 "low bytes"
|
||||
|
||||
For k in 0..3:
|
||||
high_nibble = (header_word >> (12 - 4*k)) & 0xF
|
||||
raw_12 = (high_nibble << 8) | low_byte[k]
|
||||
delta[k] = raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12
|
||||
```
|
||||
|
||||
The block's total length is `NN × 1.5 + 2` bytes (tag included). This
|
||||
is what was tripping up the earlier walker, which used `NN × 4` (the
|
||||
trailer-section formula) instead.
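
The unpacking rule can be written out as runnable Python. This is a sketch of the pseudocode above, not the production decoder. It reads the 2-byte header word as big-endian and treats the four low bytes as unsigned before combining them with the high nibble; both are assumptions, but the sign rule on `raw_12` only works with unsigned low bytes. The function name is hypothetical:

```python
from typing import List


def decode_30_block(payload: bytes, nn: int) -> List[int]:
    """Decode the payload of a `30 NN` block into NN signed 12-bit deltas.

    `payload` excludes the 2-byte tag, so it must be NN * 1.5 bytes long.
    """
    if nn % 4 != 0 or len(payload) != nn * 3 // 2:
        raise ValueError("payload length does not match NN")
    deltas: List[int] = []
    for g in range(0, len(payload), 6):
        header_word = int.from_bytes(payload[g:g + 2], "big")
        low_bytes = payload[g + 2:g + 6]
        for k in range(4):
            high_nibble = (header_word >> (12 - 4 * k)) & 0xF
            raw_12 = (high_nibble << 8) | low_bytes[k]
            # Two's-complement sign extension from 12 bits.
            deltas.append(raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12)
    return deltas
```

A group with header word `0x8700` and low bytes `00 ff 00 00` decodes to `[-2048, 2047, 0, 0]`, the two extremes of the 12-bit signed range.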
|
||||
|
||||
Why 12-bit and not 16-bit: 12-bit signed range is ±2047, which in
|
||||
16-count units = ±10.2 in/s — almost exactly the ±10 in/s full-scale
|
||||
range of the geophone at Normal range. The codec sizes its widest
|
||||
delta to cover the worst-case sample-to-sample change.
|
||||
|
||||
Verified against all 14 `30 NN` blocks across the bundled fixture
|
||||
events. Every delta decodes byte-exact against BW's ASCII export.
|
||||
|
||||
## Test fixtures
|
||||
|
||||
Committed under `tests/fixtures/`:
|
||||
|
||||
- `decode-re-5-8-26/event-a..event-d/`: original quiet bundle (4 events,
|
||||
PPV < 1 in/s). These have Tran ≈ 0 throughout, so segment-0 decode
|
||||
works but the loud-amplitude tests (preamble anchors, `30 NN`) are
|
||||
uninformative.
|
||||
- `5-11-26/M529LL1A.{SP0,SS0,SV0}`: loud bundle (PPV 6-7 in/s on all
|
||||
channels). These cracked the Tran codec.
|
||||
- `5-11-26/M529LL1L.{JQ0,V70}`: targeted captures. JQ0 is Vert-heavy,
|
||||
V70 is Mic-heavy (140 dB). These cracked the `00 NN` RLE rule.
|
||||
|
||||
Each fixture has a `.TXT` Blastware ASCII export as ground truth.
|
||||
|
||||
## Tests
|
||||
|
||||
`tests/test_waveform_codec.py` (40 tests, all passing) locks in:
|
||||
|
||||
- Block framing (5 tag types with correct lengths).
|
||||
- Walker contiguity (no gaps or overlaps).
|
||||
- Segment header parsing (counter monotonicity, fixed-pattern check).
|
||||
- `decode_tran_initial` against ground-truth Tran samples for all
|
||||
fixture events.
|
||||
|
||||
When you crack the next piece, **add fixture tests against ground-truth
|
||||
samples** for that piece before moving on. Don't let unverified code
|
||||
ship without a regression lock-in.
|
||||
@@ -1,48 +0,0 @@
|
||||
"""
|
||||
micromate — Instantel Micromate (Series IV) device library.
|
||||
|
||||
Sibling of ``minimateplus`` (the Series III library). Currently scoped to
|
||||
the offline-file ingest path used by thor-watcher: parsing the per-event
|
||||
``.IDFH``/``.IDFW`` ASCII text sidecars Thor's exporter writes alongside
|
||||
each binary event file, and wrapping the parsed data in typed event
|
||||
records.
|
||||
|
||||
Live-device support (TCP protocol, frame parsing, real-time monitoring)
|
||||
is deferred — when we add it, it lands here as ``transport.py`` /
|
||||
``framing.py`` / ``protocol.py`` / ``client.py``, mirroring the
|
||||
``minimateplus`` package layout.
|
||||
|
||||
Typical usage (offline file ingest):
|
||||
|
||||
from micromate import IdfEvent, parse_idf_report
|
||||
|
||||
text = open("UM11719_20231219162723.IDFW.txt").read()
|
||||
rep = parse_idf_report(text) # dict
|
||||
event = IdfEvent.from_report(rep, "UM11719_20231219162723.IDFW")
|
||||
print(event.serial, event.peaks.transverse_ips, event.mic_pspl_dbl)
|
||||
"""
|
||||
|
||||
from .idf_ascii_report import (
|
||||
parse_event_filename,
|
||||
parse_idf_report,
|
||||
serial_from_filename,
|
||||
)
|
||||
from .models import (
|
||||
IdfEvent,
|
||||
IdfPeaks,
|
||||
IdfProjectInfo,
|
||||
IdfReport,
|
||||
IdfSensorCheck,
|
||||
)
|
||||
|
||||
__version__ = "0.1.0"
|
||||
__all__ = [
|
||||
"IdfEvent",
|
||||
"IdfPeaks",
|
||||
"IdfProjectInfo",
|
||||
"IdfReport",
|
||||
"IdfSensorCheck",
|
||||
"parse_event_filename",
|
||||
"parse_idf_report",
|
||||
"serial_from_filename",
|
||||
]
|
||||
@@ -1,315 +0,0 @@
|
||||
"""
|
||||
micromate/idf_ascii_report.py — parse Thor (Micromate Series IV) IDF ASCII reports.
|
||||
|
||||
Thor exports a `.IDFW.txt` or `.IDFH.txt` sidecar next to each `.IDFW`
|
||||
(waveform) or `.IDFH` (histogram) event binary. Each sidecar is a
|
||||
plain-text file with `"Key : Value"` lines covering the full device-
|
||||
authoritative event metadata — PPV per channel, ZC Freq, Time of Peak,
|
||||
Peak Acceleration / Displacement, sensor self-check results, project
|
||||
strings, calibration date, battery level, etc. — followed by a raw
|
||||
waveform-samples block headed by the literal line "Waveform Data Channels".
|
||||
|
||||
This is the Thor analogue of `minimateplus/bw_ascii_report.py` for the
|
||||
Blastware (Series III) report format. The parser is intentionally
|
||||
permissive: we extract everything we recognise into a flat dict and
|
||||
silently ignore anything we don't. Downstream callers parse units
|
||||
(`"0.2119 in/s"` → 0.2119) only on the fields they need.
|
||||
|
||||
Example input (truncated):
|
||||
|
||||
"EventType : Full Waveform"
|
||||
"SampleRate : 1024 sps"
|
||||
"EventTime : 16:27:23"
|
||||
"EventDate : 2023-12-19"
|
||||
"TranPPV : 0.0251 in/s"
|
||||
"VertPPV : 0.2119 in/s"
|
||||
"LongPPV : 0.0282 in/s"
|
||||
"PeakVectorSum : 0.2131 in/s"
|
||||
"MicPSPL : 99.4 dB(L)"
|
||||
"TranZCFreq : 6.5 Hz"
|
||||
"SerialNumber : UM11719"
|
||||
"Version : Micromate ISEE 11.0AK"
|
||||
"FileName : UM11719_20231219162723.IDFW"
|
||||
"BatteryLevel : 3.8 volts"
|
||||
"Calibration : November 22, 2023 by Instantel"
|
||||
"TranTestResults : Passed"
|
||||
"TitleString1 : UPMC Presby-Loc 3-Level1-1R Elevator Rm"
|
||||
Waveform Data Channels
|
||||
Tran Vert Long MicL
|
||||
0.0003 -0.0003 0.0003 0.00013
|
||||
...
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import re
|
||||
from typing import Any, Dict, Optional, Tuple, Union
|
||||
|
||||
|
||||
# Lines look like: "Key : Value" (quotes literal, single ":" separator)
|
||||
_LINE_RE = re.compile(r'^\s*"?([^":]+?)"?\s*:\s*"?(.*?)"?\s*$')
|
||||
|
||||
# Marker that ends the metadata block — everything after is raw sample data.
|
||||
_WAVEFORM_BLOCK_MARKER = "waveform data channels"
|
||||
|
||||
|
||||
def _normalize_key(raw: str) -> str:
|
||||
"""Convert "TranPPV" / "PreTriggerLength" → snake_case."""
|
||||
s = raw.strip()
|
||||
# Insert underscores at lower/digit→upper transitions and at acronym
# boundaries ("ZCFreq" → "ZC_Freq")
|
||||
s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
|
||||
s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
|
||||
s = s.replace("-", "_").replace(" ", "_")
|
||||
return s.lower()
|
||||
|
||||
|
||||
def _strip_unit_suffix(value: str) -> str:
|
||||
"""Return the numeric part of values like "0.2119 in/s" → "0.2119".
|
||||
|
||||
Also strips Thor's below/above-threshold prefixes:
|
||||
"<0.005 in/s" → "0.005" (below-noise-floor reading)
|
||||
">100 Hz" → "100" (above-measurement-range reading)
|
||||
"""
|
||||
parts = value.strip().split()
|
||||
token = parts[0] if parts else value.strip()
|
||||
if token.startswith("<") or token.startswith(">"):
|
||||
token = token[1:]
|
||||
return token
|
||||
|
||||
|
||||
def _parse_float(value: str) -> Optional[float]:
|
||||
try:
|
||||
return float(_strip_unit_suffix(value))
|
||||
except (ValueError, TypeError):
|
||||
return None
|
||||
|
||||
|
||||
def _parse_int(value: str) -> Optional[int]:
|
||||
try:
|
||||
return int(float(_strip_unit_suffix(value)))
|
||||
except (ValueError, TypeError):
|
||||
return None
|
||||
|
||||
|
||||
def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
|
||||
"""
|
||||
Parse a Thor IDFW.txt / IDFH.txt sidecar.
|
||||
|
||||
Returns a flat dict with two kinds of entries:
|
||||
|
||||
- **Raw fields** — every `Key : Value` line, keyed by snake_case
|
||||
of the original key, value as a string (unit suffix preserved).
|
||||
Lets callers grab any field we haven't explicitly normalised.
|
||||
|
||||
- **Derived fields** — a curated set with parsed types:
|
||||
* `serial_number` str
|
||||
* `event_type` str ("Full Waveform" / "Full Histogram")
|
||||
* `event_datetime` ISO-8601 string ("YYYY-MM-DDTHH:MM:SS") when
|
||||
both EventDate and EventTime are present
|
||||
* `sample_rate` int (samples/sec)
|
||||
* `tran_ppv`,`vert_ppv`,`long_ppv` float (in/s)
|
||||
* `mic_ppv` float (dB or psi — same units as MicPSPL)
|
||||
* `peak_vector_sum` float (in/s)
|
||||
* `tran_zc_freq`,`vert_zc_freq`,`long_zc_freq` float (Hz)
|
||||
* `record_time_sec` float (seconds)
|
||||
* `pre_trigger_sec` float (seconds)
|
||||
* `project` str (from TitleString1 — Thor's location)
|
||||
* `client` str (TitleString2)
|
||||
* `operator` str (TitleString3 — company/operator)
|
||||
* `notes` str (TitleString4)
|
||||
* `setup` str
|
||||
* `version` str (firmware)
|
||||
* `battery_volts` float
|
||||
* `calibration_text` str (e.g. "November 22, 2023 by Instantel")
|
||||
* `tran_test_passed`, `vert_test_passed`, `long_test_passed`,
|
||||
`mic_test_passed` bool ("Passed" → True; anything else → False)
|
||||
* `filename` str (FileName line — useful sanity check)
|
||||
|
||||
Stops parsing at the literal "Waveform Data Channels" line; the
|
||||
raw-samples block is left to whoever wants to decode the binary.
|
||||
|
||||
Input may be `str` or `bytes` (`utf-8`/`latin-1` tolerant).
|
||||
"""
|
||||
if isinstance(text, bytes):
|
||||
try:
|
||||
text = text.decode("utf-8")
|
||||
except UnicodeDecodeError:
|
||||
text = text.decode("latin-1", errors="replace")
|
||||
|
||||
raw: Dict[str, str] = {}
|
||||
|
||||
for line in text.splitlines():
|
||||
stripped = line.strip()
|
||||
if not stripped:
|
||||
continue
|
||||
if stripped.lower().startswith(_WAVEFORM_BLOCK_MARKER):
|
||||
break
|
||||
m = _LINE_RE.match(stripped)
|
||||
if not m:
|
||||
continue
|
||||
key = _normalize_key(m.group(1))
|
||||
value = m.group(2).strip()
|
||||
# Multi-value lines (Channel, Units, etc.) — coalesce by appending.
|
||||
if key in raw:
|
||||
raw[key] = raw[key] + "; " + value
|
||||
else:
|
||||
raw[key] = value
|
||||
|
||||
out: Dict[str, Any] = dict(raw) # keep all raw fields
|
||||
|
||||
# ── Derived fields ───────────────────────────────────────────────────────
|
||||
|
||||
def _take(*candidates: str) -> Optional[str]:
|
||||
for c in candidates:
|
||||
if c in raw:
|
||||
return raw[c]
|
||||
return None
|
||||
|
||||
# Event identity
|
||||
if "serial_number" in raw:
|
||||
out["serial_number"] = raw["serial_number"]
|
||||
if "event_type" in raw:
|
||||
out["event_type"] = raw["event_type"]
|
||||
if "file_name" in raw:
|
||||
out["filename"] = raw["file_name"]
|
||||
|
||||
# Combined date+time. Waveform sidecars use "EventDate" / "EventTime";
|
||||
# histogram sidecars use "HistogramStartDate" / "HistogramStartTime".
|
||||
# Prefer the event_* names when both are present.
|
||||
ed = raw.get("event_date") or raw.get("histogram_start_date")
|
||||
et = raw.get("event_time") or raw.get("histogram_start_time")
|
||||
if ed and et:
|
||||
try:
|
||||
dt = datetime.datetime.strptime(f"{ed} {et}", "%Y-%m-%d %H:%M:%S")
|
||||
out["event_datetime"] = dt.isoformat()
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
# Numeric scalars. For every field we typify here, we MUST drop the
|
||||
# raw string copy from `out` when parsing fails — Thor writes things
|
||||
# like "<0.005 in/s" (below threshold) and "N/A" (not measured) that
|
||||
# would otherwise linger in `out` as strings, sneak into SQLite REAL
|
||||
# columns via permissive type affinity, and then crash the JS
|
||||
# frontend on `.toFixed(...)`.
|
||||
int_fields = ("sample_rate",)
|
||||
for key in int_fields:
|
||||
v = raw.get(key)
|
||||
if v is None:
|
||||
continue
|
||||
iv = _parse_int(v)
|
||||
if iv is not None:
|
||||
out[key] = iv
|
||||
else:
|
||||
out.pop(key, None)
|
||||
|
||||
float_fields = (
|
||||
"tran_ppv", "vert_ppv", "long_ppv", "peak_vector_sum",
|
||||
"tran_zc_freq", "vert_zc_freq", "long_zc_freq",
|
||||
"tran_peak_acceleration", "vert_peak_acceleration",
|
||||
"long_peak_acceleration",
|
||||
"tran_peak_displacement", "vert_peak_displacement",
|
||||
"long_peak_displacement",
|
||||
"tran_time_of_peak", "vert_time_of_peak", "long_time_of_peak",
|
||||
"mic_time_of_peak", "mic_zc_freq",
|
||||
)
|
||||
for key in float_fields:
|
||||
v = raw.get(key)
|
||||
if v is None:
|
||||
continue
|
||||
fv = _parse_float(v)
|
||||
if fv is not None:
|
||||
out[key] = fv
|
||||
else:
|
||||
out.pop(key, None)
|
||||
|
||||
# Microphone — Thor reports MicPSPL (dB(L)) which is the closest
|
||||
# analogue to BW's mic_ppv. The raw "99.4 dB(L)" string stays in
|
||||
# `out` under the original `mic_pspl` key for display; the parsed
|
||||
# float goes in `mic_ppv`.
|
||||
mic = raw.get("mic_pspl")
|
||||
if mic is not None:
|
||||
fv = _parse_float(mic)
|
||||
if fv is not None:
|
||||
out["mic_ppv"] = fv
|
||||
|
||||
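# Record / pre-trigger duration: the parsed floats go under separate
# `*_sec` keys, so a failed parse simply leaves those keys unset (the raw
# strings remain in `out` under their original keys).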
|
||||
rt = raw.get("record_time")
|
||||
if rt is not None:
|
||||
fv = _parse_float(rt)
|
||||
if fv is not None:
|
||||
out["record_time_sec"] = fv
|
||||
pt = raw.get("pre_trigger_length")
|
||||
if pt is not None:
|
||||
fv = _parse_float(pt)
|
||||
if fv is not None:
|
||||
out["pre_trigger_sec"] = fv
|
||||
|
||||
# Project / client / operator / location strings. Thor's title
|
||||
# strings are operator-defined; conventional mapping (per Thor's
|
||||
# default TitleNote labels in the example data):
|
||||
# TitleString1 = Location → project (sensor location identifier)
|
||||
# TitleString2 = Client → client
|
||||
# TitleString3 = Company → operator (the monitoring company)
|
||||
# TitleString4 = Notes → notes
|
||||
out["project"] = _take("title_string1")
|
||||
out["client"] = _take("title_string2")
|
||||
out["operator"] = _take("title_string3", "operator")
|
||||
out["notes"] = _take("title_string4", "post_event_note")
|
||||
|
||||
if "setup" in raw:
|
||||
out["setup"] = raw["setup"]
|
||||
if "version" in raw:
|
||||
out["version"] = raw["version"]
|
||||
|
||||
# Battery (e.g. "3.8 volts" → 3.8)
|
||||
bl = raw.get("battery_level")
|
||||
if bl is not None:
|
||||
fv = _parse_float(bl)
|
||||
if fv is not None:
|
||||
out["battery_volts"] = fv
|
||||
|
||||
# Calibration line is free-form (e.g. "November 22, 2023 by Instantel").
|
||||
if "calibration" in raw:
|
||||
out["calibration_text"] = raw["calibration"]
|
||||
|
||||
# Sensor self-check results — bool flags
|
||||
for key, out_key in (
|
||||
("tran_test_results", "tran_test_passed"),
|
||||
("vert_test_results", "vert_test_passed"),
|
||||
("long_test_results", "long_test_passed"),
|
||||
("mic_test_results", "mic_test_passed"),
|
||||
):
|
||||
v = raw.get(key)
|
||||
if v is not None:
|
||||
out[out_key] = v.strip().lower() == "passed"
|
||||
|
||||
return out
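The drop-on-failure rule described in the numeric-scalar comments above can be sketched in isolation. This is a minimal illustration, not the module's code: `_parse_float` is not shown in this diff, so `parse_float_sketch` and `typify_floats` below are hypothetical stand-ins that assume "take the leading number, otherwise give up".

```python
import re
from typing import Any, Dict, Iterable, Optional

# Assumption: a value is numeric only if it *starts* with a number.
_LEADING_NUMBER = re.compile(r"^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def parse_float_sketch(value: str) -> Optional[float]:
    """Return the leading number of e.g. "3.8 volts", or None if there is none."""
    m = _LEADING_NUMBER.match(value.strip())
    return float(m.group(0)) if m else None


def typify_floats(raw: Dict[str, str], keys: Iterable[str]) -> Dict[str, Any]:
    """Copy `raw`, replacing each listed key with a float or dropping it."""
    out: Dict[str, Any] = dict(raw)
    for key in keys:
        if key not in raw:
            continue
        fv = parse_float_sketch(raw[key])
        if fv is not None:
            out[key] = fv
        else:
            # Unparseable ("<0.005 in/s", "N/A"): remove the string copy so it
            # cannot reach a numeric column downstream.
            out.pop(key, None)
    return out
```

With this shape, a key listed in `keys` is either a float or absent in the result; it is never a leftover string.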
|
||||
|
||||
|
||||
def serial_from_filename(name: str) -> Optional[str]:
    """Convenience: pull the serial prefix from a Thor event filename.

    Thor uses the literal serial as the filename prefix:
        UM11719_20231219163444.IDFW  → "UM11719"
        BE9439_20200713124251.IDFH   → "BE9439"
    """
    m = re.match(r"^([A-Z]{2}\d+)_\d{14}\.(IDFH|IDFW)(?:\.txt)?$",
                 name, re.IGNORECASE)
    return m.group(1).upper() if m else None
|
||||
|
||||
|
||||
def parse_event_filename(name: str) -> Optional[Tuple[str, datetime.datetime, str]]:
    """Parse `<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>` → (serial, datetime, kind).

    `kind` is "IDFH" or "IDFW" (upper-case).  Returns None on no match.
    """
    m = re.match(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$",
                 name, re.IGNORECASE)
    if not m:
        return None
    try:
        ts = datetime.datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        return None
    return m.group(1).upper(), ts, m.group(3).upper()
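The two filename helpers above differ in one respect: `serial_from_filename` accepts a trailing `.txt` (sidecar names), while `parse_event_filename` matches only the binary name. A caller holding a sidecar path could normalise the name first. The sketch below shows one way to do that; `parse_sidecar_or_event_name` is a hypothetical helper, not part of the module, and it copies the pattern so it can run on its own.

```python
import datetime
import re
from typing import Optional, Tuple

# Same pattern as parse_event_filename, copied here so the sketch runs alone.
_EVENT_RE = re.compile(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$", re.IGNORECASE)


def parse_sidecar_or_event_name(
    name: str,
) -> Optional[Tuple[str, datetime.datetime, str]]:
    """Hypothetical helper: parse a binary name or its ``.txt`` sidecar name."""
    # Drop one trailing ".txt" (any case) so sidecar names reduce to binary names.
    if name.lower().endswith(".txt"):
        name = name[:-4]
    m = _EVENT_RE.match(name)
    if not m:
        return None
    try:
        ts = datetime.datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        return None  # 14 digits that do not form a real date/time
    return m.group(1).upper(), ts, m.group(3).upper()
```

Stripping the suffix before matching keeps a single pattern in play, which avoids the two regexes drifting apart.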
|
||||
@@ -1,64 +0,0 @@
|
||||
"""
|
||||
micromate/idf_file.py — placeholder for the Thor IDF binary codec.
|
||||
|
||||
Thor's ``.IDFH`` (histogram) and ``.IDFW`` (waveform) event files are an
|
||||
Instantel proprietary binary format that has not yet been reverse-
|
||||
engineered. Today seismo-relay treats them as opaque blobs:
|
||||
``WaveformStore.save_imported_idf`` stores the bytes verbatim and reads
|
||||
all device-authoritative metadata from the paired ``.IDFW.txt`` /
|
||||
``.IDFH.txt`` ASCII sidecar (parsed by ``idf_ascii_report.py``).
|
||||
|
||||
When we crack the binary codec — same reverse-engineering playbook we
|
||||
used to byte-perfect-parse Series III BW files (see
|
||||
``docs/instantel_protocol_reference.md`` and ``minimateplus/event_file_io.py``)
|
||||
— this module will grow:
|
||||
|
||||
- ``read_idf_file(path) -> IdfEvent``
|
||||
Parse a ``.IDFW``/``.IDFH`` binary and return a fully populated
|
||||
``IdfEvent`` whose waveform-sample arrays come from the binary
|
||||
(the .txt sidecar's tabular sample block being a best-effort
|
||||
check). Lets us ingest Thor events even when the operator
|
||||
hasn't enabled the .txt exporter — closing the
|
||||
``had_report=False`` gap that the thor-watcher forwarder
|
||||
currently tolerates as a known limitation.
|
||||
|
||||
- ``write_idf_file(path, event)`` (eventually)
|
||||
Round-trip event reconstruction, used for verifying the codec
|
||||
against captured device files the way ``write_blastware_file``
|
||||
verifies the Series III codec.
|
||||
|
||||
- Helpers for decoding the binary's per-channel sample arrays into
|
||||
physical units, the per-event flash buffer's monitor-log records,
|
||||
etc.
|
||||
|
||||
The reverse-engineering path: pair every ``.IDFW`` binary in
|
||||
``thor-watcher/example-data/`` with its sibling ``.IDFW.txt``, treating
|
||||
the txt's "Waveform Data Channels" block as ground-truth, and align
|
||||
the binary's per-channel int16-or-similar arrays against it. Header
|
||||
fields (sample rate, channel count, record time, timestamps) sit before
|
||||
the sample block — same approach as the BW codec where ASCII strings
|
||||
inside the binary (``Project:``, ``Client:``, etc.) anchored field
|
||||
discovery.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
from typing import Union
|
||||
|
||||
from .models import IdfEvent
|
||||
|
||||
|
||||
def read_idf_file(path: Union[str, Path]) -> "IdfEvent":
    """Parse a Thor ``.IDFW``/``.IDFH`` binary into an ``IdfEvent``.

    Not yet implemented.  When implemented, this will be the canonical
    entry point for reading Thor binaries; the ASCII sidecar parser
    becomes an optional fast-path metadata supplement rather than the
    sole source of device-authoritative data.
    """
    raise NotImplementedError(
        "IDF binary codec not yet implemented; the .IDFW/.IDFH binary format "
        "is undecoded.  Use parse_idf_report() on the paired .txt sidecar "
        "for device-authoritative metadata."
    )
|
||||
@@ -1,377 +0,0 @@
|
||||
"""
|
||||
Micromate (Series IV / Thor) native data models.
|
||||
|
||||
These are the right-shaped dataclasses for Thor data — Thor measures
|
||||
the microphone in dB(L) directly, so this model carries
|
||||
``mic_pspl_dbl`` rather than the pseudo-``psi`` shoehorn that
|
||||
``minimateplus.PeakValues`` uses for Series III BW data.
|
||||
|
||||
The ingest pipeline today goes:
|
||||
|
||||
.IDFW.txt → parse_idf_report() → dict
|
||||
dict → IdfEvent.from_report() → IdfEvent (typed)
|
||||
IdfEvent → IdfEvent.to_minimateplus_event() → shape DB / sidecar
|
||||
machinery expects
|
||||
|
||||
The ``to_minimateplus_event()`` bridge is a temporary boundary — when we
|
||||
crack the binary IDF codec and have richer per-event data to store, the
|
||||
DB schema will grow Series-IV-specific columns and the bridge will
|
||||
shrink or disappear.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, Optional, Tuple
|
||||
|
||||
|
||||
# ── IdfReport ─────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
@dataclass
|
||||
class IdfReport:
|
||||
"""Typed wrapper around the dict returned by ``parse_idf_report``.
|
||||
|
||||
All fields optional — Thor's exporter is permissive and some IDF .txt
|
||||
files (especially histograms) omit fields that waveform sidecars
|
||||
include. Use ``.raw`` for any field this dataclass hasn't surfaced
|
||||
yet (the parser keeps every recognised key in the raw dict).
|
||||
"""
|
||||
|
||||
# Identity / kind
|
||||
serial_number: Optional[str] = None
|
||||
event_type: Optional[str] = None # "Full Waveform" | "Full Histogram"
|
||||
event_datetime: Optional[datetime.datetime] = None
|
||||
filename: Optional[str] = None # echoed by Thor's exporter
|
||||
|
||||
# Sampling / timing
|
||||
sample_rate: Optional[int] = None # samples/sec
|
||||
record_time_sec: Optional[float] = None
|
||||
pre_trigger_sec: Optional[float] = None
|
||||
|
||||
# Geophone peaks (in/s)
|
||||
tran_ppv: Optional[float] = None
|
||||
vert_ppv: Optional[float] = None
|
||||
long_ppv: Optional[float] = None
|
||||
peak_vector_sum: Optional[float] = None
|
||||
|
||||
# Microphone — Thor's native unit is dB(L), NOT psi.
|
||||
mic_pspl_dbl: Optional[float] = None
|
||||
|
||||
# Zero-crossing frequencies (Hz)
|
||||
tran_zc_freq: Optional[float] = None
|
||||
vert_zc_freq: Optional[float] = None
|
||||
long_zc_freq: Optional[float] = None
|
||||
mic_zc_freq: Optional[float] = None
|
||||
|
||||
# Per-channel time of peak (sec, since event start)
|
||||
tran_time_of_peak: Optional[float] = None
|
||||
vert_time_of_peak: Optional[float] = None
|
||||
long_time_of_peak: Optional[float] = None
|
||||
mic_time_of_peak: Optional[float] = None
|
||||
|
||||
# Derived per-channel motion
|
||||
tran_peak_acceleration: Optional[float] = None # g
|
||||
vert_peak_acceleration: Optional[float] = None
|
||||
long_peak_acceleration: Optional[float] = None
|
||||
tran_peak_displacement: Optional[float] = None # in
|
||||
vert_peak_displacement: Optional[float] = None
|
||||
long_peak_displacement: Optional[float] = None
|
||||
|
||||
# Operator-supplied strings (Thor's TitleString1..4 → semantic slots)
|
||||
project: Optional[str] = None # TitleString1
|
||||
client: Optional[str] = None # TitleString2
|
||||
operator: Optional[str] = None # TitleString3
|
||||
notes: Optional[str] = None # TitleString4 / PostEventNote
|
||||
setup: Optional[str] = None # setup file name
|
||||
|
||||
# Sensor self-check results
|
||||
tran_test_passed: Optional[bool] = None
|
||||
vert_test_passed: Optional[bool] = None
|
||||
long_test_passed: Optional[bool] = None
|
||||
mic_test_passed: Optional[bool] = None
|
||||
|
||||
# Device-fixed metadata
|
||||
firmware_version: Optional[str] = None
|
||||
calibration_text: Optional[str] = None
|
||||
battery_volts: Optional[float] = None
|
||||
|
||||
# Original parser dict — preserves every recognised key (including
|
||||
# raw unit-suffixed strings) for forward-compatible field access.
|
||||
raw: Dict[str, Any] = field(default_factory=dict, repr=False)
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, d: Dict[str, Any]) -> "IdfReport":
|
||||
"""Build an IdfReport from the dict returned by ``parse_idf_report``."""
|
||||
ed = d.get("event_datetime")
|
||||
if isinstance(ed, str):
|
||||
try:
|
||||
ed = datetime.datetime.fromisoformat(ed)
|
||||
except ValueError:
|
||||
ed = None
|
||||
|
||||
return cls(
|
||||
serial_number = d.get("serial_number"),
|
||||
event_type = d.get("event_type"),
|
||||
event_datetime = ed if isinstance(ed, datetime.datetime) else None,
|
||||
filename = d.get("filename"),
|
||||
sample_rate = d.get("sample_rate"),
|
||||
record_time_sec = d.get("record_time_sec"),
|
||||
pre_trigger_sec = d.get("pre_trigger_sec"),
|
||||
tran_ppv = d.get("tran_ppv"),
|
||||
vert_ppv = d.get("vert_ppv"),
|
||||
long_ppv = d.get("long_ppv"),
|
||||
peak_vector_sum = d.get("peak_vector_sum"),
|
||||
mic_pspl_dbl = d.get("mic_ppv"), # parser names it mic_ppv (legacy)
|
||||
tran_zc_freq = d.get("tran_zc_freq"),
|
||||
vert_zc_freq = d.get("vert_zc_freq"),
|
||||
long_zc_freq = d.get("long_zc_freq"),
|
||||
mic_zc_freq = d.get("mic_zc_freq"),
|
||||
tran_time_of_peak = d.get("tran_time_of_peak"),
|
||||
vert_time_of_peak = d.get("vert_time_of_peak"),
|
||||
long_time_of_peak = d.get("long_time_of_peak"),
|
||||
mic_time_of_peak = d.get("mic_time_of_peak"),
|
||||
tran_peak_acceleration = d.get("tran_peak_acceleration"),
|
||||
vert_peak_acceleration = d.get("vert_peak_acceleration"),
|
||||
long_peak_acceleration = d.get("long_peak_acceleration"),
|
||||
tran_peak_displacement = d.get("tran_peak_displacement"),
|
||||
vert_peak_displacement = d.get("vert_peak_displacement"),
|
||||
long_peak_displacement = d.get("long_peak_displacement"),
|
||||
project = d.get("project"),
|
||||
client = d.get("client"),
|
||||
operator = d.get("operator"),
|
||||
notes = d.get("notes"),
|
||||
setup = d.get("setup"),
|
||||
tran_test_passed = d.get("tran_test_passed"),
|
||||
vert_test_passed = d.get("vert_test_passed"),
|
||||
long_test_passed = d.get("long_test_passed"),
|
||||
mic_test_passed = d.get("mic_test_passed"),
|
||||
firmware_version = d.get("version"),
|
||||
calibration_text = d.get("calibration_text"),
|
||||
battery_volts = d.get("battery_volts"),
|
||||
raw = d,
|
||||
)
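`parse_idf_report` emits `event_datetime` as an ISO-8601 string, and `from_dict` turns it back into a `datetime`, discarding anything that does not parse. The sketch below isolates that coercion step under the same rules; `coerce_event_datetime` is a hypothetical name used only for illustration.

```python
import datetime
from typing import Any, Optional


def coerce_event_datetime(value: Any) -> Optional[datetime.datetime]:
    """Accept a datetime or an ISO-8601 string; return None for anything else."""
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, str):
        try:
            return datetime.datetime.fromisoformat(value)
        except ValueError:
            return None  # not an ISO string, e.g. "N/A"
    return None
```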
|
||||
|
||||
|
||||
# ── IdfPeaks / IdfProjectInfo / IdfSensorCheck (narrow grouping types) ───────
|
||||
|
||||
|
||||
@dataclass
class IdfPeaks:
    """Geophone + mic peak values for one Thor event.  Native Thor units."""
    transverse_ips: Optional[float] = None        # in/s
    vertical_ips: Optional[float] = None          # in/s
    longitudinal_ips: Optional[float] = None      # in/s
    peak_vector_sum_ips: Optional[float] = None   # in/s
    mic_pspl_dbl: Optional[float] = None          # dB(L)


@dataclass
class IdfProjectInfo:
    """Operator-supplied strings from Thor's TitleString1..4."""
    project: Optional[str] = None
    client: Optional[str] = None
    operator: Optional[str] = None
    notes: Optional[str] = None
    setup: Optional[str] = None


@dataclass
class IdfSensorCheck:
    """Per-channel pass/fail from Thor's self-test."""
    tran: Optional[bool] = None
    vert: Optional[bool] = None
    long: Optional[bool] = None
    mic: Optional[bool] = None
|
||||
|
||||
|
||||
# ── IdfEvent ─────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
@dataclass
|
||||
class IdfEvent:
|
||||
"""A single Thor / Micromate Series IV event.
|
||||
|
||||
Built from a parsed .IDFW.txt or .IDFH.txt sidecar via
``IdfEvent.from_report()``.  The filename determines the kind and is
the fallback source for serial + timestamp (report values win when
present); the .txt provides device-authoritative peak values,
frequencies, project strings, sensor self-check, firmware, calibration.
|
||||
"""
|
||||
|
||||
# Identity
|
||||
serial: str
|
||||
timestamp: datetime.datetime
|
||||
kind: str # "Waveform" | "Histogram"
|
||||
filename: str # device-native binary filename, e.g. "UM11719_20231219163444.IDFW"
|
||||
|
||||
# Sampling / timing
|
||||
sample_rate: Optional[int] = None
|
||||
record_time_sec: Optional[float] = None
|
||||
pre_trigger_sec: Optional[float] = None
|
||||
|
||||
# Peaks
|
||||
peaks: IdfPeaks = field(default_factory=IdfPeaks)
|
||||
|
||||
# Per-channel frequencies (Hz)
|
||||
tran_zc_freq: Optional[float] = None
|
||||
vert_zc_freq: Optional[float] = None
|
||||
long_zc_freq: Optional[float] = None
|
||||
mic_zc_freq: Optional[float] = None
|
||||
|
||||
# Project strings
|
||||
project_info: IdfProjectInfo = field(default_factory=IdfProjectInfo)
|
||||
|
||||
# Sensor self-check
|
||||
sensor_check: IdfSensorCheck = field(default_factory=IdfSensorCheck)
|
||||
|
||||
# Device-fixed
|
||||
firmware_version: Optional[str] = None
|
||||
calibration_text: Optional[str] = None
|
||||
battery_volts: Optional[float] = None
|
||||
|
||||
# The full parsed report — preserves anything not surfaced as a typed field
|
||||
report: IdfReport = field(default_factory=IdfReport)
|
||||
|
||||
@classmethod
|
||||
def from_report(
|
||||
cls,
|
||||
report: Any,
|
||||
filename: str,
|
||||
) -> "IdfEvent":
|
||||
"""Build an IdfEvent from a parsed report (dict or IdfReport) and
|
||||
the device-native binary filename.
|
||||
|
||||
The filename supplies the kind and is the fallback for serial + timestamp:
Thor's filenames are literal ``<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>``.
When the report carries its own ``serial_number`` or ``event_datetime``,
the report value wins; for the timestamp this is because the report has
finer-grained device-reported time-of-trigger semantics.
|
||||
"""
|
||||
from .idf_ascii_report import parse_event_filename
|
||||
|
||||
# Normalise input to IdfReport
|
||||
if isinstance(report, IdfReport):
|
||||
rep = report
|
||||
elif isinstance(report, dict):
|
||||
rep = IdfReport.from_dict(report)
|
||||
else:
|
||||
raise TypeError(
|
||||
f"report must be IdfReport or dict; got {type(report).__name__}"
|
||||
)
|
||||
|
||||
# Filename → (serial, timestamp, kind). Required — fall back to
|
||||
# report-supplied values only if filename parsing fails.
|
||||
parsed = parse_event_filename(filename)
|
||||
if parsed is not None:
|
||||
fn_serial, fn_ts, fn_kind = parsed
|
||||
kind = "Histogram" if fn_kind == "IDFH" else "Waveform"
|
||||
else:
|
||||
fn_serial = rep.serial_number or "UNKNOWN"
|
||||
fn_ts = rep.event_datetime or datetime.datetime(1970, 1, 1)
|
||||
kind = "Waveform" if (rep.event_type or "").lower().startswith("full waveform") else "Histogram"
|
||||
|
||||
# Prefer report's event_datetime (device-authoritative) over the filename.
|
||||
ts = rep.event_datetime or fn_ts
|
||||
serial = rep.serial_number or fn_serial
|
||||
|
||||
return cls(
|
||||
serial=serial,
|
||||
timestamp=ts,
|
||||
kind=kind,
|
||||
filename=filename,
|
||||
sample_rate=rep.sample_rate,
|
||||
record_time_sec=rep.record_time_sec,
|
||||
pre_trigger_sec=rep.pre_trigger_sec,
|
||||
peaks=IdfPeaks(
|
||||
transverse_ips = rep.tran_ppv,
|
||||
vertical_ips = rep.vert_ppv,
|
||||
longitudinal_ips = rep.long_ppv,
|
||||
peak_vector_sum_ips = rep.peak_vector_sum,
|
||||
mic_pspl_dbl = rep.mic_pspl_dbl,
|
||||
),
|
||||
tran_zc_freq=rep.tran_zc_freq,
|
||||
vert_zc_freq=rep.vert_zc_freq,
|
||||
long_zc_freq=rep.long_zc_freq,
|
||||
mic_zc_freq=rep.mic_zc_freq,
|
||||
project_info=IdfProjectInfo(
|
||||
project=rep.project,
|
||||
client=rep.client,
|
||||
operator=rep.operator,
|
||||
notes=rep.notes,
|
||||
setup=rep.setup,
|
||||
),
|
||||
sensor_check=IdfSensorCheck(
|
||||
tran=rep.tran_test_passed,
|
||||
vert=rep.vert_test_passed,
|
||||
long=rep.long_test_passed,
|
||||
mic=rep.mic_test_passed,
|
||||
),
|
||||
firmware_version=rep.firmware_version,
|
||||
calibration_text=rep.calibration_text,
|
||||
battery_volts=rep.battery_volts,
|
||||
report=rep,
|
||||
)
|
||||
|
||||
# ── Bridge to minimateplus shape (for the existing DB / sidecar paths) ──
|
||||
|
||||
def to_minimateplus_event(self, waveform_key: bytes) -> Any:
|
||||
"""Project this Thor event into the shape ``minimateplus.Event``
|
||||
carries, so it can flow through the existing
|
||||
``SeismoDb.insert_events()`` and ``event_to_sidecar_dict()``
|
||||
machinery without those code paths needing to know about Thor.
|
||||
|
||||
Caveats of the bridge:
|
||||
- ``mic_ppv`` on the produced Event carries Thor's dB(L) value
|
||||
verbatim — the UI distinguishes via the ``device_family``
|
||||
column (Phase 1). Don't run the BW psi→dBL converter on
|
||||
Series IV rows.
|
||||
- Many Thor-specific fields (Peak Acceleration / Displacement,
|
||||
sensor self-check, calibration) don't have a slot in
|
||||
``Event``. The full IdfReport is preserved on the
|
||||
``.sfm.json`` sidecar under ``extensions.idf_report`` via
|
||||
``save_imported_idf`` — that's the source of truth for them.
|
||||
"""
|
||||
from minimateplus.models import (
|
||||
Event, PeakValues, ProjectInfo, Timestamp,
|
||||
)
|
||||
|
||||
ts_obj = Timestamp(
|
||||
raw=bytes(9),
|
||||
flag=0,
|
||||
year=self.timestamp.year,
|
||||
unknown_byte=0,
|
||||
month=self.timestamp.month,
|
||||
day=self.timestamp.day,
|
||||
hour=self.timestamp.hour,
|
||||
minute=self.timestamp.minute,
|
||||
second=self.timestamp.second,
|
||||
)
|
||||
pv = PeakValues(
|
||||
tran=self.peaks.transverse_ips,
|
||||
vert=self.peaks.vertical_ips,
|
||||
long=self.peaks.longitudinal_ips,
|
||||
micl=self.peaks.mic_pspl_dbl, # dB(L) — see caveat above
|
||||
peak_vector_sum=self.peaks.peak_vector_sum_ips,
|
||||
)
|
||||
pi = ProjectInfo(
|
||||
setup_name=self.project_info.setup,
|
||||
project=self.project_info.project,
|
||||
client=self.project_info.client,
|
||||
operator=self.project_info.operator,
|
||||
sensor_location=None, # Thor folds location into project string
|
||||
notes=self.project_info.notes,
|
||||
)
|
||||
ev = Event(
|
||||
index=0,
|
||||
timestamp=ts_obj,
|
||||
sample_rate=self.sample_rate,
|
||||
peak_values=pv,
|
||||
project_info=pi,
|
||||
record_type=self.kind,
|
||||
rectime_seconds=self.record_time_sec,
|
||||
)
|
||||
ev._waveform_key = waveform_key
|
||||
return ev
|
||||
@@ -21,15 +21,7 @@ Typical usage (TCP / modem):
|
||||
|
||||
from .client import MiniMateClient
|
||||
from .models import DeviceInfo, Event, MonitorLogEntry
|
||||
from .transport import CapturingTransport, SerialTransport, TcpTransport
|
||||
from .transport import SerialTransport, TcpTransport
|
||||
|
||||
__version__ = "0.1.0"
|
||||
__all__ = [
|
||||
"MiniMateClient",
|
||||
"DeviceInfo",
|
||||
"Event",
|
||||
"MonitorLogEntry",
|
||||
"SerialTransport",
|
||||
"TcpTransport",
|
||||
"CapturingTransport",
|
||||
]
|
||||
__all__ = ["MiniMateClient", "DeviceInfo", "Event", "MonitorLogEntry", "SerialTransport", "TcpTransport"]
|
||||
|
||||
+32
-156
@@ -552,105 +552,6 @@ def classify_frame(frame: S3Frame) -> str:
|
||||
|
||||
# ── Waveform file writer ───────────────────────────────────────────────────────────
|
||||
|
||||
def extract_body_bytes(a5_frames):
|
||||
"""Reconstruct the Blastware-file body bytes from a list of A5 frames.
|
||||
|
||||
Returns ``(strt, body, footer)`` where:
|
||||
|
||||
- ``strt`` is the 21-byte STRT record from the probe frame (or a fallback
|
||||
record built from minimal event metadata if STRT is missing).
|
||||
- ``body`` is the variable-length sample-data section (between STRT and
|
||||
the 26-byte file footer). Empty if no frames decode.
|
||||
- ``footer`` is the 26-byte file footer.
|
||||
|
||||
This is the same body-construction algorithm used by :func:`write_blastware_file`
|
||||
— refactored out so the body decoder (``waveform_codec.decode_waveform_v2``)
|
||||
can consume the same bytes without re-implementing the frame-walking logic.
|
||||
|
||||
Returns ``(b"", b"", b"")`` if *a5_frames* is empty.
|
||||
"""
|
||||
if not a5_frames:
|
||||
return (b"", b"", b"")
|
||||
|
||||
# ── Extract STRT record from probe frame ─────────────────────────────────
|
||||
w0_raw = bytes(a5_frames[0].data[7:])
|
||||
w0_stripped = _strip_inner_frame_dles(w0_raw)
|
||||
strt_pos_stripped = w0_stripped.find(b"STRT")
|
||||
|
||||
if strt_pos_stripped >= 0:
|
||||
strt = bytes(w0_stripped[strt_pos_stripped : strt_pos_stripped + 21])
|
||||
|
||||
# Walk raw bytes to find the raw-domain end of the STRT (= body start).
|
||||
target_stripped = strt_pos_stripped + 21
|
||||
stripped_so_far = 0
|
||||
raw_i = 0
|
||||
while stripped_so_far < target_stripped and raw_i < len(w0_raw):
|
||||
if (w0_raw[raw_i] == 0x10
|
||||
and raw_i + 1 < len(w0_raw)
|
||||
and w0_raw[raw_i + 1] in {0x02, 0x03, 0x04}):
|
||||
raw_i += 2
|
||||
else:
|
||||
raw_i += 1
|
||||
stripped_so_far += 1
|
||||
probe_skip = 7 + raw_i
|
||||
else:
|
||||
strt = b"STRT" + b"\xff\xfe" + bytes(14) + b"\x00"
|
||||
probe_skip = 7 + 21
|
||||
|
||||
if len(strt) != 21:
|
||||
return (b"", b"", b"")
|
||||
|
||||
# Separate terminator from data frames.
|
||||
term_idx: Optional[int] = None
|
||||
if a5_frames and a5_frames[-1].page_key != 0x0010:
|
||||
term_idx = len(a5_frames) - 1
|
||||
|
||||
if term_idx is not None:
|
||||
body_frames = a5_frames[:term_idx]
|
||||
term_frame = a5_frames[term_idx]
|
||||
else:
|
||||
body_frames = a5_frames
|
||||
term_frame = None
|
||||
|
||||
all_bytes = bytearray()
|
||||
for fi, frame in enumerate(body_frames):
|
||||
if fi == 0:
|
||||
skip = probe_skip
|
||||
elif fi in (1, 2):
|
||||
skip = 13 # metadata pages
|
||||
else:
|
||||
skip = 12 # sample chunks
|
||||
all_bytes.extend(_frame_body_bytes(frame, skip))
|
||||
|
||||
if term_frame is not None:
|
||||
all_bytes.extend(_frame_body_bytes(term_frame, 11))
|
||||
|
||||
# Find the first valid `0e 08` footer marker.
|
||||
footer_pos = -1
|
||||
pos = 0
|
||||
while True:
|
||||
pos = bytes(all_bytes).find(b"\x0e\x08", pos)
|
||||
if pos < 0 or pos + 26 > len(all_bytes):
|
||||
break
|
||||
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
|
||||
if 2015 <= yr <= 2050:
|
||||
footer_pos = pos
|
||||
break
|
||||
pos += 1
|
||||
|
||||
if footer_pos >= 0:
|
||||
body = bytes(all_bytes[:footer_pos])
|
||||
footer = bytes(all_bytes[footer_pos : footer_pos + 26])
|
||||
elif len(all_bytes) >= 26:
|
||||
body = bytes(all_bytes[:-26])
|
||||
footer = bytes(all_bytes[-26:])
|
||||
else:
|
||||
body = bytes(all_bytes)
|
||||
footer = b""
|
||||
|
||||
return (strt, body, footer)
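The footer search at the end of `extract_body_bytes` can be read on its own: find the first `0e 08` marker whose big-endian 16-bit value at offset +4 is a plausible year, otherwise fall back to the trailing 26 bytes. The sketch below restates that logic as a standalone function. `split_body_footer` is a hypothetical name, and the constants are taken from the loop above rather than from any format specification.

```python
from typing import Tuple

FOOTER_LEN = 26               # footer size used by the code above
FOOTER_MARKER = b"\x0e\x08"   # marker bytes searched for by the code above


def split_body_footer(stream: bytes) -> Tuple[bytes, bytes]:
    """Split an assembled byte stream into (body, footer)."""
    pos = 0
    while True:
        pos = stream.find(FOOTER_MARKER, pos)
        # Stop when no marker remains or a full footer would not fit.
        if pos < 0 or pos + FOOTER_LEN > len(stream):
            break
        # Big-endian year check rejects coincidental 0e 08 pairs in sample data.
        year = (stream[pos + 4] << 8) | stream[pos + 5]
        if 2015 <= year <= 2050:
            return stream[:pos], stream[pos:pos + FOOTER_LEN]
        pos += 1
    if len(stream) >= FOOTER_LEN:
        # No validated marker: assume the footer is the final 26 bytes.
        return stream[:-FOOTER_LEN], stream[-FOOTER_LEN:]
    return stream, b""
```

Searching an immutable `bytes` object once, instead of re-copying a `bytearray` on every iteration, keeps the scan linear for long sample streams.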
|
||||
|
||||
|
||||
def write_blastware_file(
|
||||
event: Event,
|
||||
a5_frames: list[S3Frame],
|
||||
@@ -738,7 +639,7 @@ def write_blastware_file(
|
||||
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
|
||||
probe_skip = 7 + 21
|
||||
|
||||
log.debug(
|
||||
log.warning(
|
||||
"write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
|
||||
"probe_data_len=%d strt_hex=%s",
|
||||
strt_pos_stripped if strt_pos_stripped >= 0 else -1,
|
||||
@@ -771,10 +672,11 @@ def write_blastware_file(
|
||||
# Do NOT use a5_frames[-1] — if _a5_frames contains stray frames from a
|
||||
# subsequent event (a known get_events side-effect), the last frame will
|
||||
# not be the terminator and the footer will be mis-identified.
|
||||
# TERM detection (v0.14.0): last frame if page_key != 0x0010 (sample marker)
|
||||
term_idx: Optional[int] = None
|
||||
if a5_frames and a5_frames[-1].page_key != 0x0010:
|
||||
term_idx = len(a5_frames) - 1
|
||||
for _i, _f in enumerate(a5_frames):
|
||||
if _f.page_key == 0x0000:
|
||||
term_idx = _i
|
||||
break
|
||||
|
||||
if term_idx is not None:
|
||||
body_frames = a5_frames[:term_idx]
|
||||
@@ -783,32 +685,38 @@ def write_blastware_file(
|
||||
body_frames = a5_frames
|
||||
term_frame = None
|
||||
|
||||
# Frame contribution loop (v0.14.0 BW-exact walk).
|
||||
# Skip values:
|
||||
# probe (fi=0): probe_skip
|
||||
# meta@0x1002 (fi=1): 13 (6-byte inner header)
|
||||
# meta@0x1004 (fi=2): 13 (6-byte inner header)
|
||||
# sample chunks (fi=3+): 12 (5-byte inner header)
|
||||
last_fi = len(body_frames) - 1
|
||||
|
||||
log.debug(
|
||||
"write_blastware_file: %d body_frames last_fi=%d",
|
||||
len(body_frames), last_fi,
|
||||
log.warning(
|
||||
"write_blastware_file: %d body_frames term_idx=%s",
|
||||
len(body_frames),
|
||||
str(term_idx) if term_idx is not None else "None",
|
||||
)
|
||||
|
||||
all_bytes = bytearray()
|
||||
|
||||
for fi, frame in enumerate(body_frames):
|
||||
# All body frames contribute to the waveform body — no frames are skipped.
|
||||
#
|
||||
# Over TCP via cellular modem, _recv_5a_batch() correctly collects all
|
||||
# A5 frames per chunk request (the device's ~1100-byte RS-232 response
|
||||
# is forwarded as ~2 TCP segments of ~550 bytes each, each parsed as a
|
||||
# separate S3 frame). ALL of these frames contain ADC body data and
|
||||
# must be included in the file — confirmed from 4-27-26 TCP capture
|
||||
# analysis: contributions from all 14 frames → 6821 bytes → file 6864 bytes.
|
||||
#
|
||||
# Skip amounts (offsets into frame.data):
|
||||
# fi=0 (probe): probe_skip — skips the type_tag header + STRT record
|
||||
# fi=1: 13 — 7-byte frame.data prefix + 6 inner header bytes
|
||||
# fi>=2: 12 — 7-byte frame.data prefix + 5 inner header bytes
|
||||
if fi == 0:
|
||||
skip = probe_skip
|
||||
elif fi in (1, 2):
|
||||
skip = 13 # metadata pages
|
||||
elif fi == 1:
|
||||
skip = 13
|
||||
else:
|
||||
skip = 12 # sample chunks
|
||||
skip = 12
|
||||
|
||||
contribution = _frame_body_bytes(frame, skip)
|
||||
log.debug("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
|
||||
fi, skip, len(frame.data), len(contribution))
|
||||
log.warning("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
|
||||
fi, skip, len(frame.data), len(contribution))
|
||||
all_bytes.extend(contribution)
|
||||
|
||||
# Terminator contributes its content, which ends with the 26-byte footer.
|
||||
@@ -816,7 +724,7 @@ def write_blastware_file(
|
||||
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
|
||||
if term_frame is not None:
|
||||
term_contribution = _frame_body_bytes(term_frame, 11)
|
||||
log.debug(
|
||||
log.warning(
|
||||
"write_blastware_file: term_frame data_len=%d skip=11 "
|
||||
"contribution_len=%d first8=%s",
|
||||
len(term_frame.data),
|
||||
@@ -825,49 +733,17 @@ def write_blastware_file(
|
||||
)
|
||||
all_bytes.extend(term_contribution)
|
||||
|
||||
log.debug(
|
||||
log.warning(
|
||||
"write_blastware_file: all_bytes total=%d last28=%s",
|
||||
len(all_bytes),
|
||||
bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
|
||||
)
|
||||
|
||||
# NOTE: The "duplicate header+STRT strip" logic from v0.13.x has been
|
||||
# REMOVED in v0.14.2. Under the v0.14.0 BW-exact 5A walk, body assembly
|
||||
# is just contiguous concatenation of frame contributions in stream order
|
||||
# (probe → meta@0x1002 → meta@0x1004 → samples → TERM), exactly as BW
|
||||
# writes its files. The previous strip was matching the `00 12 03 00 STRT`
|
||||
# byte sequence in legitimate waveform data — sample chunks at counter
|
||||
# 0x1000 and beyond often contain those bytes coincidentally — and
|
||||
# zeroing 25 bytes of valid samples per match. Compared to a known-good
|
||||
# BW reference for the same 3-sec event 0, the strip introduced 26 bytes
|
||||
# of zeros that BW did not have, then propagated alignment differences
|
||||
# through the rest of the body. See decode_test/5-1-26/bw vs SFM diff
|
||||
# at file[0x1012..0x102B] (2026-05-04 analysis).
|
||||
|
||||
# Find the first valid 0e 08 footer marker (v0.14.0).
|
||||
footer_pos = -1
|
||||
pos = 0
|
||||
while True:
|
||||
pos = bytes(all_bytes).find(b"\x0e\x08", pos)
|
||||
if pos < 0 or pos + 26 > len(all_bytes):
|
||||
break
|
||||
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
|
||||
if 2015 <= yr <= 2050:
|
||||
footer_pos = pos
|
||||
break
|
||||
pos += 1
|
||||
if footer_pos >= 0:
|
||||
body = bytes(all_bytes[:footer_pos])
|
||||
footer = bytes(all_bytes[footer_pos:footer_pos + 26])
|
||||
log.debug(
|
||||
"write_blastware_file: real 0e 08 footer at all_bytes[%d]; "
|
||||
"truncating %d post-footer bytes",
|
||||
footer_pos, len(all_bytes) - footer_pos - 26,
|
||||
)
|
||||
elif len(all_bytes) >= 26:
|
||||
if len(all_bytes) >= 26:
|
||||
body = bytes(all_bytes[:-26])
|
||||
footer = bytes(all_bytes[-26:])
|
||||
else:
|
||||
# Fallback: no terminator or very short stream → build footer from event metadata
|
||||
body = bytes(all_bytes)
|
||||
start_dt = _ts_from_model(event.timestamp)
|
||||
stop_dt: Optional[datetime.datetime] = None
|
||||
@@ -878,7 +754,7 @@ def write_blastware_file(
|
||||
+ _encode_ts_be(start_dt)
|
||||
+ _encode_ts_be(stop_dt)
|
||||
+ b"\x00\x01\x00\x02\x00\x00"
|
||||
+ b"\x00\x00"
|
||||
+ b"\x00\x00" # CRC placeholder
|
||||
)
|
||||
|
||||
# ── Write file ───────────────────────────────────────────────────────────
|
||||
|
||||
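The footer search in the hunk above can be exercised in isolation. The following is a minimal sketch, assuming (as the loop in the diff does) that the 26-byte footer starts with the marker bytes `0e 08` and carries a big-endian year at offsets 4-5. The names `find_footer` and `split_body_footer` are hypothetical and do not exist in the repository.

```python
from typing import Optional, Tuple

FOOTER_LEN = 26          # footer length used by the diff above
MARKER = b"\x0e\x08"     # footer marker bytes used by the diff above


def find_footer(buf: bytes, min_year: int = 2015, max_year: int = 2050) -> int:
    """Return the offset of the first plausible footer, or -1 if none is found.

    A candidate is accepted only when the two bytes at offset +4/+5 decode
    (big-endian) to a year inside [min_year, max_year]. This rejects marker
    bytes that occur by coincidence inside waveform data.
    """
    pos = 0
    while True:
        pos = buf.find(MARKER, pos)
        if pos < 0 or pos + FOOTER_LEN > len(buf):
            return -1
        year = int.from_bytes(buf[pos + 4:pos + 6], "big")
        if min_year <= year <= max_year:
            return pos
        pos += 1  # false positive: keep scanning from the next byte


def split_body_footer(buf: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split a stream into (body, footer), falling back to the last 26 bytes."""
    pos = find_footer(buf)
    if pos >= 0:
        return buf[:pos], buf[pos:pos + FOOTER_LEN]
    if len(buf) >= FOOTER_LEN:
        return buf[:-FOOTER_LEN], buf[-FOOTER_LEN:]
    return buf, None


# Synthetic stream: a coincidental marker with an implausible year,
# followed by padding and a footer whose year field is 2026.
fake_hit = b"\x0e\x08\x00\x00\x00\x01"
real_footer = MARKER + b"\x00\x00" + (2026).to_bytes(2, "big") + b"\x00" * 20
stream = b"\xaa" * 10 + fake_hit + b"\xbb" * 30 + real_footer
body, footer = split_body_footer(stream)
```

The year check is what makes the scan safe against sample data that happens to contain `0e 08`; without it the first coincidental match would truncate the body.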
@@ -1,522 +0,0 @@
"""
minimateplus/bw_ascii_report.py — parser for Blastware's per-event ASCII
report (the .TXT file BW writes alongside each saved event binary).

The ASCII export is the authoritative source for every "rich" per-event
field that BW computes from the waveform but never persists in the BW
binary itself:

  - Per-channel PPV (Tran / Vert / Long / MicL)
  - Peak Vector Sum + Peak Vector Sum Time
  - Per-channel ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement
  - MicL PSPL, MicL Time of Peak, MicL ZC Freq
  - Per-channel Sensor Self-Check (Test Freq / Test Ratio / Test Results)
  - MicL Test Amplitude (mV)
  - Battery, calibration date, monitor-log timestamps

Persisting these values into the SFM database lets the monthly-summary
review workflow ("show me events at Location X with PVS > 0.5") work
without depending on the (still-undecoded) waveform body codec.

Format (verified against decode-re/5-8-26 4-event bundle):

  - One field per line, wrapped in double quotes: `"Field Name : Value"`
  - Field/value separator: literal ` : ` (space-colon-space).
  - Some field names contain an internal `:` already (e.g. `"Project:"`),
    so we split on the FIRST ` : ` only.
  - Some fields have unit suffixes: `"0.500 in/s"` / `"7.5 Hz"` / `"533 mv"`.
  - A `"Monitor Log(s)"` marker line is followed by tab-separated rows
    of `start_time<TAB>stop_time<TAB>description`.
  - Final `"PC SW Version : ..."` line ends the metadata block.
  - A blank line separates metadata from the sample table.
  - Sample table starts with ` Tran <TAB> Vert <TAB>...`, then
    one row per sample (tab-separated, right-padded numeric values).
  - Geo channel values are in in/s; MicL in dB(L) (or 0.000 below threshold).

Because some metadata fields have whitespace quirks ("MicL Time of
Peak" has two spaces; the leading "Project:" value has its own colon),
we normalise whitespace in the key before lookup.
"""

from __future__ import annotations

import datetime
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union


# ─────────────────────────────────────────────────────────────────────────────
# Output dataclasses
# ─────────────────────────────────────────────────────────────────────────────


@dataclass
class ChannelStats:
    """Per-channel derived stats, populated from an event report."""
    ppv_ips: Optional[float] = None  # in/s (geo channels only)
    zc_freq_hz: Optional[float] = None  # Hz
    time_of_peak_s: Optional[float] = None  # seconds (relative to trigger; can be negative)
    peak_accel_g: Optional[float] = None  # g (geo channels only)
    peak_disp_in: Optional[float] = None  # in (geo channels only)


@dataclass
class MicStats:
    """MicL-specific stats."""
    weighting: Optional[str] = None  # e.g. "Linear Weighting"
    pspl_dbl: Optional[float] = None  # dB(L)
    zc_freq_hz: Optional[float] = None
    time_of_peak_s: Optional[float] = None


@dataclass
class SensorCheck:
    """Per-channel sensor self-check result.

    Geo channels report a frequency + ratio; MicL reports a frequency +
    amplitude (mV). All channels also have a Pass/Fail string.
    """
    test_freq_hz: Optional[float] = None
    test_ratio: Optional[float] = None  # geo channels only
    test_amplitude_mv: Optional[float] = None  # MicL only
    test_results: Optional[str] = None  # "Passed" / "Failed"


@dataclass
class MonitorLogEntry:
    """One row of the trailing Monitor Log(s) block."""
    start_time: Optional[datetime.datetime] = None
    stop_time: Optional[datetime.datetime] = None
    description: Optional[str] = None


@dataclass
class BwAsciiReport:
    """Structured representation of one BW per-event ASCII export."""
    # ── Identity ─────────────────────────────────────────────────────────────
    event_type: Optional[str] = None  # e.g. "Full Waveform"
    serial: Optional[str] = None  # e.g. "BE11529"
    version: Optional[str] = None  # firmware version line
    file_name: Optional[str] = None  # e.g. "M529LK44.AB0"
    event_datetime: Optional[datetime.datetime] = None  # parsed from Event Time + Event Date

    # ── Trigger / recording config ──────────────────────────────────────────
    trigger_channel: Optional[str] = None  # e.g. "Vert" or "From Unit"
    geo_trigger_level_ips: Optional[float] = None
    pretrig_s: Optional[float] = None  # negative seconds
    record_time_s: Optional[float] = None
    record_stop_mode: Optional[str] = None
    sample_rate_sps: Optional[int] = None
    battery_volts: Optional[float] = None
    calibration_date: Optional[datetime.date] = None
    calibration_by: Optional[str] = None  # e.g. "Instantel"
    units: Optional[str] = None  # e.g. "in/s and dB(L)"

    # ── Operator-supplied metadata ──────────────────────────────────────────
    # Parsed by POSITION from the 4-line "User Notes" block BW writes
    # between the `Units :` and `Geo Range :` lines. Position-based so
    # the values populate correctly even when an operator renames the
    # labels in Blastware's Compliance Setup → Notes tab (the 4 labels
    # are user-editable, e.g. "Seis Loc:" → "Building:" → "Site Address:").
    # The original labels BW wrote are preserved in `user_note_labels`
    # so terra-view can render them as the operator named them.
    project: Optional[str] = None  # position 1 (BW default label "Project:")
    client: Optional[str] = None  # position 2 (BW default label "Client:")
    operator: Optional[str] = None  # position 3 (BW default label "User Name:")
    sensor_location: Optional[str] = None  # position 4 (BW default label "Seis Loc:")

    # Maps canonical slot name → the literal label BW wrote in the ASCII
    # export. Empty if the User Notes block wasn't present. Example
    # when the operator renamed slot 4 to "Building:":
    # {"project": "Project:", "client": "Client:",
    # "operator": "User Name:", "sensor_location": "Building:"}
    user_note_labels: Dict[str, str] = field(default_factory=dict)

    # ── Geo channel scaling ─────────────────────────────────────────────────
    geo_range_ips: Optional[float] = None  # 10.000 / 1.250

    # ── Per-channel derived stats (geo + mic) ───────────────────────────────
    channels: Dict[str, ChannelStats] = field(default_factory=dict)
    mic: MicStats = field(default_factory=MicStats)

    # ── Vector sum ──────────────────────────────────────────────────────────
    peak_vector_sum_ips: Optional[float] = None
    peak_vector_sum_time_s: Optional[float] = None

    # ── Sensor self-check (per channel) ─────────────────────────────────────
    sensor_check: Dict[str, SensorCheck] = field(default_factory=dict)

    # ── Monitor log + tooling version ───────────────────────────────────────
    monitor_log: List[MonitorLogEntry] = field(default_factory=list)
    pc_sw_version: Optional[str] = None

    # ── Sample table (optional; only parsed if requested) ───────────────────
    # Each entry: (Tran, Vert, Long, MicL) in the report's units (geo
    # channels in in/s, MicL in dB(L)). None when parse_samples=False.
    samples: Optional[List[Tuple[float, float, float, float]]] = None


# ─────────────────────────────────────────────────────────────────────────────
# Helpers
# ─────────────────────────────────────────────────────────────────────────────


_KEY_NORMALISE_RE = re.compile(r"\s+")
_NUMERIC_RE = re.compile(r"^-?\d+(?:\.\d+)?")


def _normalise_key(k: str) -> str:
    """Collapse whitespace runs (incl. tabs) and strip — handles BW's
    "MicL Time of Peak" double-space and leading-colon quirks."""
    return _KEY_NORMALISE_RE.sub(" ", k).strip()


def _strip_quotes(line: str) -> str:
    line = line.rstrip("\r\n")
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        return line[1:-1]
    return line


def _parse_number(value: str) -> Optional[float]:
    """Pull the leading numeric portion out of a value like "0.500 in/s"."""
    m = _NUMERIC_RE.match(value.strip())
    if not m:
        return None
    try:
        return float(m.group(0))
    except ValueError:
        return None


def _parse_int(value: str) -> Optional[int]:
    n = _parse_number(value)
    return None if n is None else int(round(n))


# Months exactly as BW writes them.
_MONTHS = {
    "January": 1, "February": 2, "March": 3, "April": 4,
    "May": 5, "June": 6, "July": 7, "August": 8,
    "September": 9, "October": 10, "November": 11, "December": 12,
    # Short forms used in monitor-log rows ("Apr 23 /26").
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "Jun": 6, "Jul": 7,
    "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def _parse_event_date(s: str) -> Optional[datetime.date]:
    """Parse "April 23, 2026" or "May 8, 2026" → date."""
    s = s.strip()
    parts = s.replace(",", " ").split()
    if len(parts) < 3:
        return None
    month_name, day_str, year_str = parts[0], parts[1], parts[2]
    month = _MONTHS.get(month_name)
    if month is None:
        return None
    try:
        return datetime.date(int(year_str), month, int(day_str))
    except ValueError:
        return None


def _parse_event_time(s: str) -> Optional[datetime.time]:
    """Parse "15:56:35" → time."""
    s = s.strip()
    try:
        h, m, sec = s.split(":")
        return datetime.time(int(h), int(m), int(sec))
    except (ValueError, IndexError):
        return None


def _parse_calibration(value: str) -> Tuple[Optional[datetime.date], Optional[str]]:
    """Parse "April 29, 2025 by Instantel" → (date, "Instantel")."""
    parts = value.split(" by ", 1)
    date = _parse_event_date(parts[0])
    by = parts[1].strip() if len(parts) > 1 else None
    return date, by


def _parse_monitor_row(line: str) -> Optional[MonitorLogEntry]:
    """Parse a tab-separated monitor log row.

    Format: `<start>\t<stop>\t<desc>` where each timestamp is BW's
    short form "Mon DD /YY HH:MM:SS" (e.g. "Apr 23 /26 15:46:16").
    Year is encoded as a 2-digit suffix; we expand "/26" → 2026.
    """
    parts = line.split("\t")
    if len(parts) < 2:
        return None
    start = _parse_monitor_ts(parts[0])
    stop = _parse_monitor_ts(parts[1])
    desc = parts[2].strip() if len(parts) > 2 else None
    if start is None and stop is None and not desc:
        return None
    return MonitorLogEntry(start_time=start, stop_time=stop, description=desc)


def _parse_monitor_ts(s: str) -> Optional[datetime.datetime]:
    """Parse "Apr 23 /26 15:46:16" → datetime."""
    s = s.strip()
    parts = s.split()
    if len(parts) < 4:
        return None
    month = _MONTHS.get(parts[0])
    if month is None:
        return None
    try:
        day = int(parts[1])
        # parts[2] looks like "/26" → century-flip to 2026
        yy = int(parts[2].lstrip("/"))
        year = 2000 + yy if yy < 80 else 1900 + yy
        h, m, sec = (int(x) for x in parts[3].split(":"))
        return datetime.datetime(year, month, day, h, m, sec)
    except (ValueError, IndexError):
        return None


# ── User-notes positional slot map ──────────────────────────────────────────
#
# Blastware's Compliance Setup → Notes tab shows four operator-supplied
# fields whose LABELS the operator can rename (see screenshot in
# project archive). Defaults are "Project:" / "Client:" /
# "User Name:" / "Seis Loc:", but an operator using a different
# convention can rename them to anything ("Building:", "Site:",
# "Address:", etc.). The ASCII export reflects whatever the operator
# typed, so label-based matching is fragile.
#
# What IS reliable: BW always writes the 4 user-notes lines in the
# same order, contiguously between the `Units :` line and the
# `Geo Range :` line. We parse them by POSITION and preserve the
# operator's labels in `report.user_note_labels` so terra-view can
# render them as the operator intended.

_USER_NOTE_SLOTS = ("project", "client", "operator", "sensor_location")


# ─────────────────────────────────────────────────────────────────────────────
# Top-level parser
# ─────────────────────────────────────────────────────────────────────────────


def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwAsciiReport:
    """Parse a BW per-event ASCII export into a structured BwAsciiReport.

    Set ``parse_samples=True`` to also populate ``report.samples`` with
    the trailing sample table. Default False because the table is
    huge and most callers only want metadata for indexing.
    """
    if isinstance(text, bytes):
        text = text.decode("ascii", errors="replace")

    report = BwAsciiReport()
    # Pre-create channel stat slots so callers can rely on them existing.
    for ch in ("Tran", "Vert", "Long", "MicL"):
        report.channels.setdefault(ch, ChannelStats())
        report.sensor_check.setdefault(ch, SensorCheck())

    lines = text.splitlines()
    i = 0
    n = len(lines)

    in_monitor_log_section = False
    event_time_str: Optional[str] = None
    event_date: Optional[datetime.date] = None

    # User-notes block detection. We enter the block after parsing
    # the "Units :" line and exit on the "Geo Range :" line. Inside,
    # the first 4 unmatched `<label> : <value>` lines are assigned to
    # the 4 canonical operator-supplied slots by POSITION (project,
    # client, operator, sensor_location) regardless of what the
    # operator named the labels in BW's Compliance Setup → Notes tab.
    in_user_notes_block = False
    user_note_position = 0

    while i < n:
        raw_line = lines[i]
        i += 1
        # Blank line marks the start of the sample table.
        if raw_line.strip() == "":
            break

        line = _strip_quotes(raw_line)

        # Monitor log section: "Monitor Log(s)" header followed by N rows
        # (still inside double-quoted lines), terminated by a non-row line
        # like "PC SW Version : ..." or a blank line.
        if not in_monitor_log_section and line.strip() == "Monitor Log(s)":
            in_monitor_log_section = True
            continue
        if in_monitor_log_section:
            # Heuristic: monitor rows contain a tab; the next "Field : Value"
            # line ends the section.
            if "\t" in line:
                entry = _parse_monitor_row(line)
                if entry:
                    report.monitor_log.append(entry)
                continue
            # Falls through to the field parser below; clear the flag.
            in_monitor_log_section = False

        # "Field : Value" — split on FIRST occurrence of " : "
        idx = line.find(" : ")
        if idx < 0:
            continue
        key = _normalise_key(line[:idx])
        value = line[idx + 3 :].strip()

        # ── Identity / config ────────────────────────────────────────────────
        if key == "Event Type": report.event_type = value
        elif key == "Serial Number": report.serial = value
        elif key == "Version": report.version = value
        elif key == "File Name": report.file_name = value
        elif key == "Event Time": event_time_str = value
        elif key == "Event Date": event_date = _parse_event_date(value)

        elif key == "Trigger": report.trigger_channel = value
        elif key == "Geo Trigger Level": report.geo_trigger_level_ips = _parse_number(value)
        elif key == "Pre-trigger Length": report.pretrig_s = _parse_number(value)
        elif key == "Record Time": report.record_time_s = _parse_number(value)
        elif key == "Record Stop Mode": report.record_stop_mode = value
        elif key == "Sample Rate": report.sample_rate_sps = _parse_int(value)
        elif key == "Battery Level": report.battery_volts = _parse_number(value)
        elif key == "Calibration":
            report.calibration_date, report.calibration_by = _parse_calibration(value)
        elif key == "Units":
            report.units = value
            # Entering the user-notes block. Next ~4 lines until
            # "Geo Range :" are the operator-supplied notes.
            in_user_notes_block = True
            user_note_position = 0

        elif key == "Geo Range":
            # Exiting the user-notes block.
            in_user_notes_block = False
            report.geo_range_ips = _parse_number(value)

        # User-notes block: assign by position (operator may have
        # renamed the labels, so we don't trust them). Preserve the
        # original labels in `user_note_labels` for downstream UIs
        # (terra-view) that want to display them as the operator
        # named them.
        elif in_user_notes_block and user_note_position < len(_USER_NOTE_SLOTS):
            slot = _USER_NOTE_SLOTS[user_note_position]
            setattr(report, slot, value)
            report.user_note_labels[slot] = key
            user_note_position += 1

        # ── Per-channel stats ────────────────────────────────────────────────
        # All match the pattern "{Channel} <stat-name>"
        elif key in (
            "Tran PPV", "Vert PPV", "Long PPV",
            "Tran ZC Freq", "Vert ZC Freq", "Long ZC Freq",
            "Tran Time of Peak", "Vert Time of Peak", "Long Time of Peak",
            "Tran Peak Acceleration", "Vert Peak Acceleration", "Long Peak Acceleration",
            "Tran Peak Displacement", "Vert Peak Displacement", "Long Peak Displacement",
        ):
            ch_name, stat = key.split(" ", 1)
            cs = report.channels.setdefault(ch_name, ChannelStats())
            num = _parse_number(value)
            if stat == "PPV": cs.ppv_ips = num
            elif stat == "ZC Freq": cs.zc_freq_hz = num
            elif stat == "Time of Peak": cs.time_of_peak_s = num
            elif stat == "Peak Acceleration": cs.peak_accel_g = num
            elif stat == "Peak Displacement": cs.peak_disp_in = num

        # ── Vector Sum ───────────────────────────────────────────────────────
        elif key == "Peak Vector Sum":
            report.peak_vector_sum_ips = _parse_number(value)
        elif key == "Peak Vector Sum Time":
            report.peak_vector_sum_time_s = _parse_number(value)

        # ── Microphone block ────────────────────────────────────────────────
        elif key == "Microphone":
            report.mic.weighting = value
        elif key == "MicL PSPL":
            report.mic.pspl_dbl = _parse_number(value)
            # PSPL is in dB(L), not in/s, so it is stored only in MicStats
            # and is NOT mirrored onto `channels["MicL"].ppv_ips`. Only the
            # unit-compatible stats below (time of peak, ZC freq) are mirrored.
        elif key == "MicL Time of Peak":
            report.mic.time_of_peak_s = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.time_of_peak_s = report.mic.time_of_peak_s
        elif key == "MicL ZC Freq":
            report.mic.zc_freq_hz = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.zc_freq_hz = report.mic.zc_freq_hz

        # ── Sensor self-check ────────────────────────────────────────────────
        elif key in (
            "Tran Test Freq", "Vert Test Freq", "Long Test Freq", "MicL Test Freq",
            "Tran Test Ratio", "Vert Test Ratio", "Long Test Ratio",
            "MicL Test Amplitude",
            "Tran Test Results", "Vert Test Results", "Long Test Results", "MicL Test Results",
        ):
            ch_name, stat = key.split(" ", 1)
            sc = report.sensor_check.setdefault(ch_name, SensorCheck())
            if stat == "Test Freq": sc.test_freq_hz = _parse_number(value)
            elif stat == "Test Ratio": sc.test_ratio = _parse_number(value)
            elif stat == "Test Amplitude": sc.test_amplitude_mv = _parse_number(value)
            elif stat == "Test Results": sc.test_results = value

        # ── Trailer ─────────────────────────────────────────────────────────
        elif key == "PC SW Version":
            report.pc_sw_version = value

        # Unknown keys are silently dropped — forward-compat for future
        # BW versions that may add fields.

    # Combine event date + time into a datetime
    if event_date is not None and event_time_str is not None:
        t = _parse_event_time(event_time_str)
        if t is not None:
            report.event_datetime = datetime.datetime.combine(event_date, t)

    if parse_samples:
        report.samples = _parse_sample_table(lines, i)

    return report


def _parse_sample_table(
    lines: List[str], start: int,
) -> List[Tuple[float, float, float, float]]:
    """Parse the trailing sample table.

    The table starts with a header row (" Tran <TAB>...") and continues
    until EOF. Each data row is a tab-separated quartet of numeric values.
    """
    samples: List[Tuple[float, float, float, float]] = []
    seen_header = False
    for line in lines[start:]:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        cols = [c.strip() for c in line.split("\t") if c.strip()]
        if not seen_header:
            # Header row contains channel names; numeric rows don't.
            if any(c in ("Tran", "Vert", "Long", "MicL") for c in cols):
                seen_header = True
                continue
        if len(cols) < 4:
            continue
        try:
            samples.append((
                float(cols[0]), float(cols[1]),
                float(cols[2]), float(cols[3]),
            ))
        except ValueError:
            continue
    return samples


def parse_report_file(
    path: Union[str, Path], *, parse_samples: bool = False,
) -> BwAsciiReport:
    """Convenience: read a .TXT file from disk and parse it."""
    return parse_report(Path(path).read_bytes(), parse_samples=parse_samples)
|
||||
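To illustrate the line format that the parser above handles, here is a minimal standalone sketch of the "split on the first ` : `" rule combined with whitespace-normalised keys. `split_field` is a hypothetical helper, not part of the module, and the sample lines are invented for illustration rather than taken from a real Blastware export.

```python
import re
from typing import List, Optional, Tuple

_WS_RUN = re.compile(r"\s+")


def split_field(raw: str) -> Optional[Tuple[str, str]]:
    """Split one quoted report line into (key, value), or None if it has no separator."""
    line = raw.rstrip("\r\n")
    # Drop the surrounding double quotes BW wraps around each metadata line.
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        line = line[1:-1]
    # Split on the FIRST " : " only, so values may themselves contain " : ".
    idx = line.find(" : ")
    if idx < 0:
        return None
    key = _WS_RUN.sub(" ", line[:idx]).strip()  # collapse double spaces and tabs
    value = line[idx + 3:].strip()
    return key, value


# Invented sample lines (assumptions for illustration only).
sample_lines: List[str] = [
    '"Serial Number : BE11529"',
    '"MicL  Time of Peak : 0.012 sec"',   # double space inside the key
    '"Project: : Main St : Phase 2"',     # label colon plus " : " inside the value
    '"Monitor Log(s)"',                   # marker line, no separator
]
fields = [split_field(s) for s in sample_lines]
```

Splitting on the first separator keeps a label such as `Project:` intact as the key while leaving any later ` : ` inside the value untouched, which is what makes the positional user-notes handling possible.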
+150
-356
@@ -449,7 +449,7 @@ class MiniMateClient:
|
||||
proto.confirm_erase_all()
|
||||
log.info("delete_all_events: erase confirmed — device memory cleared")
|
||||
|
||||
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None, skip_waveform_for_events: Optional[dict] = None, extra_chunks_after_metadata: int = 1) -> list[Event]:
|
||||
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None, extra_chunks_after_metadata: int = 1) -> list[Event]:
|
||||
"""
|
||||
Download all stored events from the device using the confirmed
|
||||
1E → 0A → 0C → 5A → 1F event-iterator protocol.
|
||||
@@ -497,24 +497,37 @@ class MiniMateClient:
|
||||
events: list[Event] = []
|
||||
idx = 0
|
||||
|
||||
# Legacy bare-key skip set is deprecated: the device's key counter
|
||||
# resets to 0x01110000 after every memory erase, so a key in this set
|
||||
# cannot be trusted to identify the same physical event across erases.
|
||||
# If a caller still passes it, log a warning and ignore — full
|
||||
# downloads will run for every event so the bug never silently bites.
|
||||
if skip_waveform_for_keys:
|
||||
log.warning(
|
||||
"get_events: skip_waveform_for_keys is deprecated and unsafe "
|
||||
"(post-erase key reuse); ignoring %d entries. Use "
|
||||
"skip_waveform_for_events={key: timestamp_iso} instead.",
|
||||
len(skip_waveform_for_keys),
|
||||
)
|
||||
skip_evts: dict[str, str] = dict(skip_waveform_for_events or {})
|
||||
|
||||
while data8[4:8] != b"\x00\x00\x00\x00":
|
||||
cur_key = key4 # key for this event's 0A/1E-arm/0C/5A calls
|
||||
log.info("get_events: record %d key=%s", idx, cur_key.hex())
|
||||
|
||||
# Fast-advance path: if this key is already downloaded, skip
|
||||
# 1E-arm/0C/POLL/5A entirely. Only 0A + 1F(browse) are needed
|
||||
# to advance the device's internal pointer to the next event.
|
||||
# This is identical to the browse-mode walk in count_events().
|
||||
if skip_waveform_for_keys and cur_key.hex() in skip_waveform_for_keys:
|
||||
log.debug("get_events: key=%s already seen -- fast-advance only", cur_key.hex())
|
||||
try:
|
||||
proto.read_waveform_header(cur_key)
|
||||
except ProtocolError as exc:
|
||||
log.warning(
|
||||
"get_events: 0A failed for key=%s (skip path): %s -- stopping",
|
||||
cur_key.hex(), exc,
|
||||
)
|
||||
break
|
||||
try:
|
||||
key4, data8 = proto.advance_event(browse=True)
|
||||
except ProtocolError as exc:
|
||||
log.warning(
|
||||
"get_events: 1F failed for key=%s (skip path): %s -- stopping",
|
||||
cur_key.hex(), exc,
|
||||
)
|
||||
break
|
||||
idx += 1
|
||||
if stop_after_index is not None and idx > stop_after_index:
|
||||
break
|
||||
continue
|
||||
|
||||
ev = Event(index=idx)
|
||||
ev._waveform_key = cur_key
|
||||
|
||||
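The deprecation note above explains why a bare key cannot identify the same physical event across memory erases. A minimal sketch of the alternative, matching on the (key, timestamp) pair as the `skip_waveform_for_events` variant in this diff does, might look like the following. `should_skip_download` and all sample values are hypothetical; only the comparison rule mirrors the diff.

```python
from typing import Mapping


def should_skip_download(
    known_events: Mapping[str, str], key_hex: str, actual_ts_iso: str
) -> bool:
    """Return True only when the key is known AND its stored timestamp matches.

    known_events maps a hex key to the ISO timestamp recorded when the event
    was first downloaded. A missing entry, an empty timestamp, or a mismatch
    (for example the key was reused after an erase) all mean "download again".
    """
    expected = known_events.get(key_hex, "")
    return bool(expected and actual_ts_iso and expected == actual_ts_iso)


# Hypothetical store: one event already on disk.
known = {"01110000": "2026-04-23T15:56:35"}

same_event = should_skip_download(known, "01110000", "2026-04-23T15:56:35")
reused_key = should_skip_download(known, "01110000", "2026-05-08T09:00:00")
unknown_key = should_skip_download(known, "01110004", "2026-04-23T15:56:35")
missing_ts = should_skip_download(known, "01110000", "")
```

Failing closed (any doubt leads to a full download) costs extra transfer time but avoids silently pairing a stored file with a different physical event.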
@@ -561,96 +574,72 @@ class MiniMateClient:
|
||||
"get_events: 0C failed for key=%s: %s", cur_key.hex(), exc
|
||||
)
|
||||
|
||||
# ── Skip-5A decision based on (key, timestamp) match ──────
|
||||
# If skip_waveform_for_events maps cur_key.hex() to a non-empty
|
||||
# ISO timestamp matching what we just read from 0C, this is
|
||||
# the same physical event we already have on disk — bypass
|
||||
# the 1F(arm)+POLL+5A bulk download. Otherwise (no entry, or
|
||||
# timestamp mismatch indicating post-erase reuse) fall through
|
||||
# to the full download.
|
||||
expected_ts = skip_evts.get(cur_key.hex(), "")
|
||||
actual_ts = _event_timestamp_iso(ev)
|
||||
skip_5a = bool(expected_ts and actual_ts and expected_ts == actual_ts)
|
||||
if skip_5a:
|
||||
log.info(
|
||||
"get_events: key=%s (key, ts=%s) match — skipping 5A bulk download",
|
||||
cur_key.hex(), actual_ts,
|
||||
)
|
||||
|
||||
# SUB 1F (download-arm) — send token=0xFE BEFORE POLL+5A to arm the
|
||||
# device's bulk stream state machine. Cache the returned key as a
|
||||
# fallback for loop iteration when 5A fails (see iteration block below).
|
||||
# Confirmed from 4-2-26 capture frames 66-67 (1F before frames 68-73 POLL).
|
||||
arm_key4: Optional[bytes] = None
|
||||
a5_ok = False
|
||||
try:
|
||||
arm_key4, _ = proto.advance_event(browse=False) # arm 5A
|
||||
log.info("get_events: 1F(download) — 5A armed, arm_key=%s", arm_key4.hex())
|
||||
except ProtocolError as exc:
|
||||
log.warning("get_events: 1F(download) arm failed: %s", exc)
|
||||
|
||||
if not skip_5a:
|
||||
# SUB 1F (download-arm) — send token=0xFE BEFORE POLL+5A to arm the
|
||||
# device's bulk stream state machine. Cache the returned key as a
|
||||
# fallback for loop iteration when 5A fails (see iteration block below).
|
||||
# Confirmed from 4-2-26 capture frames 66-67 (1F before frames 68-73 POLL).
|
||||
# POLL × 3 — BW sends 3 full POLL cycles between 1F and 5A.
|
||||
# Confirmed from 4-2-26 BW TX capture (frames 68-73 before 5A at 74).
|
||||
log.info("get_events: POLL × 3 before 5A")
|
||||
for _p in range(3):
|
||||
try:
|
||||
arm_key4, _ = proto.advance_event(browse=False) # arm 5A
|
||||
log.info("get_events: 1F(download) — 5A armed, arm_key=%s", arm_key4.hex())
|
||||
proto.poll()
|
||||
except ProtocolError as exc:
|
||||
log.warning("get_events: 1F(download) arm failed: %s", exc)
|
||||
|
||||
# POLL × 3 — BW sends 3 full POLL cycles between 1F and 5A.
|
||||
# Confirmed from 4-2-26 BW TX capture (frames 68-73 before 5A at 74).
|
||||
log.info("get_events: POLL × 3 before 5A")
|
||||
for _p in range(3):
|
||||
try:
|
||||
proto.poll()
|
||||
except ProtocolError as exc:
|
||||
log.warning("get_events: POLL %d failed: %s", _p, exc)
|
||||
log.warning("get_events: POLL %d failed: %s", _p, exc)
|
||||
|
||||
# SUB 5A — bulk waveform stream (uses cur_key, the event set up by 0A+1E+0C).
|
||||
# By default (full_waveform=False): stop after frame 7 for metadata only.
|
||||
# When full_waveform=True: fetch all chunks and decode raw ADC samples.
|
||||
#
|
||||
# Bypassed when skip_5a is True — the event is left with
|
||||
# _a5_frames=None, which signals to the caller (e.g.
|
||||
# ach_server.py) that this event was matched by (key, ts) and
|
||||
# already has a stored .file in the persistent waveform store.
|
||||
if not skip_5a:
|
||||
try:
|
||||
if full_waveform:
|
||||
log.info(
|
||||
"get_events: 5A full waveform download for key=%s", cur_key.hex()
|
||||
)
|
||||
a5_frames = proto.read_bulk_waveform_stream(
|
||||
cur_key, stop_after_metadata=False, max_chunks=128,
|
||||
include_terminator=True,
|
||||
)
|
||||
if a5_frames:
|
||||
a5_ok = True
|
||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||
_decode_a5_metadata_into(a5_frames, ev)
|
||||
_decode_a5_waveform(a5_frames, ev)
|
||||
log.info(
|
||||
"get_events: 5A decoded %d sample-sets",
|
||||
len((ev.raw_samples or {}).get("Tran", [])),
|
||||
)
|
||||
else:
|
||||
log.info(
|
||||
"get_events: 5A metadata-only download for key=%s", cur_key.hex()
|
||||
)
|
||||
a5_frames = proto.read_bulk_waveform_stream(
|
||||
cur_key, stop_after_metadata=True,
|
||||
include_terminator=True,
|
||||
extra_chunks_after_metadata=extra_chunks_after_metadata,
|
||||
max_chunks=128,
|
||||
)
|
||||
if a5_frames:
|
||||
a5_ok = True
|
||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||
_decode_a5_metadata_into(a5_frames, ev)
|
||||
log.debug(
|
||||
"get_events: 5A metadata client=%r operator=%r",
|
||||
ev.project_info.client if ev.project_info else None,
|
||||
ev.project_info.operator if ev.project_info else None,
|
||||
)
|
||||
except ProtocolError as exc:
|
||||
log.warning(
|
||||
"get_events: 5A failed for key=%s: %s — metadata unavailable",
|
||||
cur_key.hex(), exc,
|
||||
a5_ok = False
|
||||
try:
|
||||
if full_waveform:
|
||||
log.info(
|
||||
"get_events: 5A full waveform download for key=%s", cur_key.hex()
|
||||
)
|
||||
a5_frames = proto.read_bulk_waveform_stream(
|
||||
cur_key, stop_after_metadata=False, max_chunks=128,
|
||||
include_terminator=True,
|
||||
)
|
||||
if a5_frames:
|
||||
a5_ok = True
|
||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||
_decode_a5_metadata_into(a5_frames, ev)
|
||||
_decode_a5_waveform(a5_frames, ev)
|
||||
log.info(
|
||||
"get_events: 5A decoded %d sample-sets",
|
||||
len((ev.raw_samples or {}).get("Tran", [])),
|
||||
)
|
||||
else:
|
||||
log.info(
|
||||
"get_events: 5A metadata-only download for key=%s", cur_key.hex()
|
||||
)
|
||||
a5_frames = proto.read_bulk_waveform_stream(
|
||||
cur_key, stop_after_metadata=True,
|
||||
include_terminator=True,
|
||||
extra_chunks_after_metadata=extra_chunks_after_metadata,
|
||||
max_chunks=128,
|
||||
)
|
||||
if a5_frames:
|
||||
a5_ok = True
|
||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||
_decode_a5_metadata_into(a5_frames, ev)
|
||||
log.debug(
|
||||
"get_events: 5A metadata client=%r operator=%r",
|
||||
ev.project_info.client if ev.project_info else None,
|
||||
ev.project_info.operator if ev.project_info else None,
|
||||
)
|
||||
except ProtocolError as exc:
|
||||
log.warning(
|
||||
"get_events: 5A failed for key=%s: %s — metadata unavailable",
|
||||
cur_key.hex(), exc,
|
||||
)
|
||||
|
||||
# SUB 1F — loop iteration.
|
||||
#
|
||||
@@ -663,14 +652,7 @@ class MiniMateClient:
|
||||
# Confirmed from 4-3-26 browse-mode captures: browse=True params
|
||||
# are correct for multi-event iteration. Conditional logic added
|
||||
# 2026-04-06 to avoid post-failure state disruption.
|
||||
#
|
||||
# NEW 2026-05-06: when skip_5a=True we never entered the 5A
|
||||
# state at all (we read 0A+1E(arm)+0C and chose to bypass).
|
||||
# 1F(browse) is safe in this scenario — the device's iteration
|
||||
# pointer is independent of the bulk-stream state machine, and
|
||||
# we never put it into the half-attempted 5A state that the
|
||||
# earlier "post-failure 1F disruption" warning is about.
|
||||
if skip_5a or a5_ok:
|
||||
if a5_ok:
|
||||
# 5A succeeded — use browse 1F for reliable key advancement.
|
||||
try:
|
||||
key4, data8 = proto.advance_event(browse=True)
|
||||
@@ -1192,27 +1174,6 @@ class MiniMateClient:
|
||||
# Pure functions: bytes → model field population.
|
||||
# Kept here (not in models.py) to isolate protocol knowledge from data shapes.
|
||||
|
||||
def _event_timestamp_iso(event: Event) -> str:
|
||||
"""
|
||||
Return a stable ISO-8601 string for the event's 0C-derived timestamp,
|
||||
or "" if the event has no timestamp populated.
|
||||
|
||||
The format intentionally matches what `bridges/ach_server.py` writes
|
||||
into `ach_state.json:downloaded_events[*]` so the (key, ts) compare
|
||||
in get_events()'s skip path is a simple string equality.
|
||||
"""
|
||||
ts = getattr(event, "timestamp", None)
|
||||
if ts is None:
|
||||
return ""
|
||||
try:
|
||||
return datetime.datetime(
|
||||
ts.year, ts.month, ts.day,
|
||||
ts.hour or 0, ts.minute or 0, ts.second or 0,
|
||||
).isoformat()
|
||||
except Exception:
|
||||
return str(ts)
|
||||
|
||||
|
||||
def _decode_serial_number(data: bytes) -> DeviceInfo:
|
||||
"""
|
||||
Decode SUB EA (SERIAL_NUMBER_RESPONSE) payload into a new DeviceInfo.
|
||||
@@ -1362,40 +1323,28 @@ def _decode_waveform_record_into(data: bytes, event: Event) -> None:
|
||||
|
||||
Modifies event in-place.
|
||||
"""
|
||||
# ── Record type + format detection ────────────────────────────────────────
|
||||
# `record_type` is the user-facing label ("Waveform" for any triggered
|
||||
# event regardless of timestamp-header layout). `fmt` is the internal
|
||||
# format code used to pick the right Timestamp parser; it stays
|
||||
# internal and doesn't leak to the API / sidecar / UI.
|
||||
# ── Record type ───────────────────────────────────────────────────────────
|
||||
# Decoded from byte[1] (sub_code) first so we can gate timestamp parsing.
|
||||
try:
|
||||
event.record_type = _extract_record_type(data)
|
||||
except Exception as exc:
|
||||
log.warning("waveform record type decode failed: %s", exc)
|
||||
fmt = _detect_record_format(data)
|
||||
|
||||
# ── Timestamp ─────────────────────────────────────────────────────────────
|
||||
# Three timestamp-header layouts have been observed across BE11529
# firmware S338.17; each picks a different Timestamp parser:
#   "single_shot":  9-byte [day][0x10][month][year:2][unk][h][m][s]
#   "continuous":  10-byte [0x10][day][0x10][month][year:2][unk][h][m][s]
#   "short":        8-byte [day][month][year:2][unk][h][m][s]
# All decode into the same Timestamp dataclass; only the byte
# offsets differ.
|
||||
if fmt == "single_shot":
|
||||
# 9-byte format for sub_code=0x10 Waveform records:
|
||||
# [day][sub_code][month][year:2 BE][unknown][hour][min][sec]
|
||||
# sub_code=0x10 and sub_code=0x03 have different timestamp byte layouts.
|
||||
# Both confirmed against Blastware event reports (BE11529, 2026-04-01 and 2026-04-03).
|
||||
if event.record_type == "Waveform":
|
||||
try:
|
||||
event.timestamp = Timestamp.from_waveform_record(data)
|
||||
except Exception as exc:
|
||||
log.warning("single_shot record timestamp decode failed: %s", exc)
|
||||
elif fmt == "continuous":
|
||||
log.warning("waveform record timestamp decode failed: %s", exc)
|
||||
elif event.record_type == "Waveform (Continuous)":
|
||||
try:
|
||||
event.timestamp = Timestamp.from_continuous_record(data)
|
||||
except Exception as exc:
|
||||
log.warning("continuous record timestamp decode failed: %s", exc)
|
||||
elif fmt == "short":
|
||||
try:
|
||||
event.timestamp = Timestamp.from_short_record(data)
|
||||
except Exception as exc:
|
||||
log.warning("short record timestamp decode failed: %s", exc)
|
||||
|
||||
# ── Peak values (per-channel PPV + Peak Vector Sum) ───────────────────────
|
||||
try:
|
||||
@@ -1500,69 +1449,22 @@ def _decode_a5_waveform(
|
||||
(BULK_WAVEFORM_STREAM) frame payloads and populate event.raw_samples,
|
||||
event.total_samples, event.pretrig_samples, and event.rectime_seconds.
|
||||
|
||||
Wired up 2026-05-11 to the verified ``decode_waveform_v2`` codec (see
|
||||
``minimateplus/waveform_codec.py`` and ``docs/waveform_codec_re_status.md``).
|
||||
Replaces the legacy int16 LE decoder, which produced full-scale ±32K
|
||||
noise on every event because the body bytes are encoded, not raw
|
||||
samples.
|
||||
This requires ALL A5 frames (stop_after_metadata=False), not just the
|
||||
metadata-bearing subset.
|
||||
|
||||
Output convention (preserved from the legacy decoder):
|
||||
``event.raw_samples`` is a dict with keys "Tran", "Vert", "Long",
|
||||
"MicL" mapping to lists of **int16 ADC counts**. Multiply by
|
||||
``geo_range / 32768`` for geo channels to get in/s; use
|
||||
:func:`minimateplus.waveform_codec.mic_count_to_db` for mic dB(L).
|
||||
|
||||
``total_samples`` / ``pretrig_samples`` / ``rectime_seconds`` are set
|
||||
to ``None`` so the caller backfills from compliance_config (the
|
||||
authoritative source — STRT fields aren't reliable).
|
||||
"""
|
||||
from .waveform_codec import decode_a5_frames
|
||||
|
||||
event.total_samples = None
|
||||
event.pretrig_samples = None
|
||||
event.rectime_seconds = None
|
||||
|
||||
if not frames_data:
|
||||
log.debug("_decode_a5_waveform: no frames provided")
|
||||
return
|
||||
|
||||
decoded = decode_a5_frames(frames_data)
|
||||
if decoded is None:
|
||||
log.warning("_decode_a5_waveform: codec returned no samples")
|
||||
return
|
||||
|
||||
event.raw_samples = decoded
|
||||
log.debug(
|
||||
"_decode_a5_waveform: decoded %d/%d/%d/%d samples (T/V/L/M)",
|
||||
len(decoded.get("Tran", [])),
|
||||
len(decoded.get("Vert", [])),
|
||||
len(decoded.get("Long", [])),
|
||||
len(decoded.get("MicL", [])),
|
||||
)
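The output convention in the docstring above (int16 ADC counts, scaled by `geo_range / 32768` for the geo channels) can be illustrated with a minimal sketch. The helper names `counts_to_ips` and `peak_abs` and the 10.0 in/s range are hypothetical, chosen only for the example. The mic channel is left out on purpose, since its conversion goes through `mic_count_to_db`, whose signature is not shown here.

```python
from typing import Dict, Iterable, List

# Geo channels that share the geo_range / 32768 scale factor.
GEO_CHANNELS = ("Tran", "Vert", "Long")


def counts_to_ips(raw_samples: Dict[str, List[int]], geo_range: float) -> Dict[str, List[float]]:
    """Scale int16 ADC counts to in/s for the three geo channels only."""
    scale = geo_range / 32768.0
    return {
        ch: [count * scale for count in raw_samples.get(ch, [])]
        for ch in GEO_CHANNELS
    }


def peak_abs(values: Iterable[float]) -> float:
    """Largest absolute value in a channel, or 0.0 for an empty channel."""
    return max((abs(v) for v in values), default=0.0)


# Made-up counts, not taken from a real capture.
raw = {"Tran": [0, 16384, -32768], "Vert": [8192], "Long": [], "MicL": [5, 6]}
ips = counts_to_ips(raw, geo_range=10.0)
print(ips["Tran"], peak_abs(ips["Tran"]))
```

Keeping the scaling outside the decoder means the stored counts stay lossless and the range can be backfilled later from compliance_config.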
|
||||
|
||||
|
||||
def _decode_a5_waveform_LEGACY(
|
||||
frames_data: list[S3Frame],
|
||||
event: Event,
|
||||
) -> None:
|
||||
"""
|
||||
LEGACY decoder — kept for reference only. DO NOT CALL.
|
||||
|
||||
This is the int16 LE decoder that produced full-scale ±32K noise
|
||||
on every event. Retracted 2026-05-08; replaced 2026-05-11 with
|
||||
the verified codec in :mod:`minimateplus.waveform_codec`. See
|
||||
``docs/instantel_protocol_reference.md §7.6.1`` for the full history.
|
||||
|
||||
── Waveform format (LEGACY — WRONG) ────────────────────────────────
|
||||
Claimed 4-channel interleaved signed 16-bit little-endian, 8 bytes
|
||||
per sample-set:
|
||||
── Waveform format (confirmed from 4-2-26 blast capture) ───────────────────
|
||||
The blast waveform is 4-channel interleaved signed 16-bit little-endian,
|
||||
8 bytes per sample-set:
|
||||
|
||||
[T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N
|
||||
|
||||
where T=Tran, V=Vert, L=Long, M=Mic.
|
||||
where T=Tran, V=Vert, L=Long, M=Mic. Channel ordering follows the
|
||||
Blastware convention [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
|
||||
|
||||
The body bytes are actually a tagged delta+RLE stream — this
|
||||
interpretation was wrong.
|
||||
⚠️ Channel ordering is a confirmed CONVENTION — the physical ordering on
|
||||
the ADC mux is not independently verifiable from the saturating blast
|
||||
captures we have. The convention is consistent with Blastware labeling
|
||||
(Tran is always the first channel field in the A5 STRT+waveform stream).
|
||||
|
||||
── Frame structure ──────────────────────────────────────────────────────────
|
||||
A5[0] (probe response):
|
||||
@@ -1616,109 +1518,46 @@ def _decode_a5_waveform_LEGACY(
|
||||
log.warning("_decode_a5_waveform: STRT record truncated (%dB)", len(strt))
|
||||
return
|
||||
|
||||
# STRT byte layout (21 bytes; verified against M529LIY6 reference files
# and re-confirmed against live BE11529 captures, 2026-05-08):
#   [0:4]    b'STRT'
#   [4:6]    0xff 0xfe sentinel
#   [6:10]   end_key          4-byte BE flash address where event ends
#   [10:14]  start_key        4-byte BE flash address where event starts
#   [14:18]  device-specific  (semantics not pinned; values vary across events
#                             and don't hold authoritative total_samples / pretrig)
#   [18]     0x46             record-type marker (NOT rectime)
#   [19]     device-specific
#   [20]     sometimes rectime, sometimes 0; not reliable
#
# AUTHORITATIVE values must come from compliance_config (sample_rate,
# record_time) and from end_offset - start_offset arithmetic (event size).
# Earlier code claimed STRT[8:10]=total_samples and STRT[16:18]=pretrig;
# those positions actually overlap end_key low-word and dev-specific bytes
# respectively. We surface the address-derived event size so consumers
# can sanity-check chunk-loop bounds, but `total_samples` per channel must
# be derived externally (sample_rate × record_time, or computed from the
# decoded sample count below).
|
||||
end_key = strt[6:10]
|
||||
start_key = strt[10:14]
|
||||
end_offset_in_strt = (end_key[2] << 8) | end_key[3]
|
||||
start_offset_in_strt = (start_key[2] << 8) | start_key[3]
|
||||
is_event_1 = (start_offset_in_strt == 0x0000)
|
||||
total_samples = struct.unpack_from(">H", strt, 8)[0]
|
||||
pretrig_samples = struct.unpack_from(">H", strt, 16)[0]
|
||||
rectime_seconds = strt[18]
|
||||
|
||||
# Don't trust STRT for these — leave them as None so the caller can
|
||||
# backfill from compliance_config (the authoritative source).
|
||||
event.total_samples = None
|
||||
event.pretrig_samples = None
|
||||
event.rectime_seconds = None
|
||||
event.total_samples = total_samples
|
||||
event.pretrig_samples = pretrig_samples
|
||||
event.rectime_seconds = rectime_seconds
|
||||
|
||||
log.debug(
|
||||
"_decode_a5_waveform: STRT start_key=%s end_key=%s "
|
||||
"start_off=0x%04X end_off=0x%04X is_event_1=%s "
|
||||
"dev-specific[14:18]=%s strt[20]=0x%02X",
|
||||
start_key.hex(), end_key.hex(),
|
||||
start_offset_in_strt, end_offset_in_strt, is_event_1,
|
||||
strt[14:18].hex(), strt[20],
|
||||
"_decode_a5_waveform: STRT total_samples=%d pretrig=%d rectime=%ds",
|
||||
total_samples, pretrig_samples, rectime_seconds,
|
||||
)
|
||||
|
||||
# ── Collect per-frame waveform bytes with global offset tracking ─────────
|
||||
# global_offset is the cumulative byte count across all frames, used to
|
||||
# compute the channel alignment at each frame boundary.
|
||||
#
|
||||
# Frame layout under the v0.14.0+ walk:
#   frames_data[0]     = probe response (page_addr 0x0000;
#                        contains STRT + post-STRT data)
#   frames_data[1..2]  = (event 1 only) metadata pages
#                        page_addr = 0x1002 / 0x1004
#   frames_data[mid]   = sample chunks at flash addresses
#                        0x0600, 0x0800, … (page_addr in
#                        {0x0600..0x1FFE})
#   frames_data[last]  = TERM response (page_key=0x0000)
|
||||
#
|
||||
# We identify metadata pages by their PAGE ADDRESS at db.data[4:6] (the
|
||||
# 2-byte counter the device echoes back), NOT by content scan. An earlier
|
||||
# needle-based detection (b"Project:", b"Client:", etc.) was the wrong
|
||||
# layer of abstraction:
|
||||
# • The actual metadata pages 0x1002 / 0x1004 do NOT contain ASCII
|
||||
# project strings on this firmware (S338.17 / BE11529).
|
||||
# • The strings physically live at flash address 0x1600 — which falls
|
||||
# inside the sample-chunk address range. Skipping that frame would
|
||||
# drop a real sample chunk.
|
||||
# BW handles the "samples region happens to contain string bytes" case
|
||||
# by just rendering the bytes verbatim; we do the same.
|
||||
_METADATA_PAGES = (b"\x10\x02", b"\x10\x04")
|
||||
|
||||
chunks: list[tuple[int, bytes]] = [] # (frame_idx, waveform_bytes)
|
||||
global_offset = 0
|
||||
|
||||
for fi, db in enumerate(frames_data):
|
||||
page_addr = db.data[4:6] if len(db.data) >= 6 else b""
|
||||
w = db.data[7:] # frame.data[7:]
|
||||
|
||||
# A5[0]: probe response. Two cases:
|
||||
# - Event 1 (start_offset_in_strt == 0x0000): the bytes after STRT
|
||||
# are the device's *pre-event reserved area* (flash 0x0046 to
|
||||
# 0x0600), NOT samples. We must skip them; samples begin at
|
||||
# the first dedicated chunk frame at counter=0x0600.
|
||||
# - Event N (continuation, start_offset != 0x0000): the bytes after
|
||||
# the STRT record ARE the first slice of real samples for the
|
||||
# event (BW's chunk loop addresses the probe as a sample chunk).
|
||||
# A5[0]: waveform begins after the 21-byte STRT record and 6-byte preamble.
|
||||
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
|
||||
if fi == 0:
|
||||
sp = w.find(b"STRT")
|
||||
if sp < 0:
|
||||
continue
|
||||
if is_event_1:
|
||||
# No usable samples in the probe — pre-event reserved bytes.
|
||||
continue
|
||||
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
|
||||
wave = w[sp + 27 :]
|
||||
|
||||
# Skip the dedicated metadata pages (event 1 only): page_addr 0x1002 / 0x1004.
|
||||
elif page_addr in _METADATA_PAGES:
|
||||
log.debug(
|
||||
"_decode_a5_waveform: skipping metadata page fi=%d page_addr=%s",
|
||||
fi, page_addr.hex(),
|
||||
)
|
||||
# Frame 7 carries event-time metadata strings ("Project:", "Client:", …)
|
||||
# and no waveform ADC data.
|
||||
elif fi == 7:
|
||||
continue
|
||||
|
||||
# Sample chunk (or TERM): strip the 8-byte per-frame header.
|
||||
# Terminator frames have page_key=0x0000 and are excluded upstream
|
||||
# (read_bulk_waveform_stream returns early on page_key==0).
|
||||
# No hardcoded frame-index skip here — all non-metadata frames are data.
|
||||
else:
|
||||
# Strip the 8-byte per-frame header (ctr + 6 zero bytes)
|
||||
if len(w) < 8:
|
||||
continue
|
||||
wave = w[8:]
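The STRT layout described in the comments of this hunk can be summarised with a small standalone parser. This is a sketch only: `StrtHeader` and `parse_strt` are hypothetical names, and the sample bytes are invented to match the documented offsets, not copied from a capture. It reads just the two flash-address keys, since the hunk states that the remaining fields are not authoritative.

```python
from dataclasses import dataclass


@dataclass
class StrtHeader:
    end_key: bytes      # STRT[6:10], 4-byte BE flash address where the event ends
    start_key: bytes    # STRT[10:14], 4-byte BE flash address where it starts
    start_offset: int   # low 16 bits of start_key
    end_offset: int     # low 16 bits of end_key
    is_event_1: bool    # True when the event starts at offset 0x0000


def parse_strt(strt: bytes) -> StrtHeader:
    """Parse the address fields of a 21-byte STRT record."""
    if len(strt) < 21 or strt[0:4] != b"STRT":
        raise ValueError("expected a 21-byte record starting with b'STRT'")
    end_key = strt[6:10]
    start_key = strt[10:14]
    start_offset = int.from_bytes(start_key[2:4], "big")
    end_offset = int.from_bytes(end_key[2:4], "big")
    return StrtHeader(end_key, start_key, start_offset, end_offset, start_offset == 0)


# Invented record: sentinel, end_key, start_key, 4 device-specific bytes,
# the 0x46 marker, then 2 trailing bytes (21 bytes total).
sample = (
    b"STRT" + b"\xff\xfe"
    + bytes.fromhex("01110600")
    + bytes.fromhex("01110000")
    + bytes(4) + b"\x46" + bytes(2)
)
hdr = parse_strt(sample)
print(hex(hdr.start_offset), hex(hdr.end_offset), hdr.is_event_1)
```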
|
||||
@@ -1732,8 +1571,10 @@ def _decode_a5_waveform_LEGACY(
|
||||
total_bytes = global_offset
|
||||
n_sets = total_bytes // 8
|
||||
log.debug(
|
||||
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets",
|
||||
len(chunks), total_bytes, n_sets,
|
||||
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets "
|
||||
"(%d of %d expected; %.0f%%)",
|
||||
len(chunks), total_bytes, n_sets, n_sets, total_samples,
|
||||
100.0 * n_sets / total_samples if total_samples else 0,
|
||||
)
|
||||
|
||||
if n_sets == 0:
|
||||
@@ -1791,85 +1632,38 @@ def _decode_a5_waveform_LEGACY(
|
||||
"Tran": tran,
|
||||
"Vert": vert,
|
||||
"Long": long_,
|
||||
"MicL": mic,
|
||||
"Mic": mic,
|
||||
}
|
||||
|
||||
|
||||
def _detect_record_format(data: bytes) -> Optional[str]:
|
||||
"""
|
||||
Detect which timestamp-header format a 210-byte 0C waveform record uses.

Three formats have been observed on BE11529 firmware S338.17:

"single_shot" (9-byte header):
    [day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
    sub_code=0x10 at byte [1]. Year at [3:5].

"continuous" (10-byte header):
    [0x10] [day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
    Marker 0x10 at byte [0] AND byte [2]. Year at [4:6].

"short" (8-byte header, new 2026-05-01):
    [day] [month] [year_BE:2] [unknown] [hour] [min] [sec]
    No marker bytes. Year at [2:4].

Each format has the year (uint16 BE) at a UNIQUE byte position, so we can
disambiguate by scanning each candidate position and picking the one
where the year falls in a sane range (2015..2050).

Returns "single_shot" / "continuous" / "short" or None if no format matches.
|
||||
"""
|
||||
if len(data) < 8:
|
||||
return None
|
||||
|
||||
def _sane_year(hi: int, lo: int) -> bool:
|
||||
y = (hi << 8) | lo
|
||||
return 2015 <= y <= 2050
|
||||
|
||||
# Order matters: prefer formats with stronger marker-byte evidence first.
|
||||
if data[1] == 0x10 and len(data) >= 9 and _sane_year(data[3], data[4]):
|
||||
return "single_shot"
|
||||
if (data[0] == 0x10 and data[2] == 0x10
|
||||
and len(data) >= 10 and _sane_year(data[4], data[5])):
|
||||
return "continuous"
|
||||
if _sane_year(data[2], data[3]):
|
||||
return "short"
|
||||
return None
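To see the year-position disambiguation at work, the detector can be restated as a standalone sketch and fed synthetic headers. The function names below drop the leading underscore to mark them as illustrative copies, and the header bytes are constructed by hand from the layouts in the docstring; they are not real device captures.

```python
from typing import Optional


def sane_year(hi: int, lo: int) -> bool:
    """True when the big-endian uint16 (hi, lo) is a plausible year."""
    return 2015 <= ((hi << 8) | lo) <= 2050


def detect_record_format(data: bytes) -> Optional[str]:
    """Pick the header layout by where a plausible year shows up."""
    if len(data) < 8:
        return None
    # Layouts with marker bytes are checked first (stronger evidence).
    if data[1] == 0x10 and len(data) >= 9 and sane_year(data[3], data[4]):
        return "single_shot"
    if (data[0] == 0x10 and data[2] == 0x10
            and len(data) >= 10 and sane_year(data[4], data[5])):
        return "continuous"
    if sane_year(data[2], data[3]):
        return "short"
    return None


year = (2026).to_bytes(2, "big")   # 0x07 0xEA
tail = bytes([0, 12, 30, 45])      # [unknown] [hour] [min] [sec]

single = bytes([1, 0x10, 4]) + year + tail            # 9-byte header
continuous = bytes([0x10, 1, 0x10, 4]) + year + tail  # 10-byte header
short = bytes([1, 4]) + year + tail                   # 8-byte header

for name, header in (("single", single), ("continuous", continuous), ("short", short)):
    print(name, detect_record_format(header))
```

Because the year sits at a different offset in each layout, at most one of the three checks sees a value in range for these inputs, which is what makes the ordering safe.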
|
||||
|
||||
|
||||
def _extract_record_type(data: bytes) -> Optional[str]:
|
||||
"""
|
||||
Return a user-facing name for a waveform record. All three internal
|
||||
timestamp-header layouts represent the *same* user concept — a
|
||||
triggered seismic event — so they all surface as just "Waveform".
|
||||
Decode the recording mode from byte[1] of the 210-byte waveform record.
|
||||
|
||||
The internal format code is preserved for parsing logic (timestamp
|
||||
decoder selection) but doesn't leak into the API / UI / sidecar.
|
||||
Callers that need the raw layout can call `_detect_record_format`
|
||||
directly.
|
||||
Byte[1] is the sub-record code that immediately follows the day byte in the
|
||||
9-byte timestamp header at the start of each waveform record:
|
||||
[day:1] [sub_code:1] [month:1] [year:2 BE] ...
|
||||
|
||||
Background: across BE11529 firmware S338.17 we've observed three
|
||||
different byte layouts for the timestamp header at the start of the
|
||||
0C record (8 / 9 / 10 bytes, distinguished by the position of the
|
||||
BE-encoded year and the presence of `0x10` marker bytes). An older
|
||||
revision of this code labelled them "Waveform" / "Waveform
|
||||
(Continuous)" / "Waveform (Short)", which created the false
|
||||
impression that there were three distinct event "types" the user
|
||||
could configure. In reality the user only ever picks Single Shot
|
||||
vs Continuous vs Histogram in the compliance config — the byte
|
||||
layout is a firmware-internal detail that doesn't always correlate
|
||||
with that choice.
|
||||
Confirmed codes (✅ 2026-04-01):
|
||||
0x10 → "Waveform" (continuous / single-shot mode)
|
||||
|
||||
Histogram mode code is not yet confirmed — a histogram event must be
|
||||
captured with debug=true to identify it. Returns None for unknown codes.
|
||||
"""
|
||||
fmt = _detect_record_format(data)
|
||||
if fmt in ("single_shot", "continuous", "short"):
|
||||
if len(data) < 2:
|
||||
return None
|
||||
code = data[1]
|
||||
if code == 0x10:
|
||||
return "Waveform"
|
||||
if len(data) >= 3:
|
||||
log.warning(
|
||||
"_extract_record_type: unrecognized header: data[0:3]=%02X %02X %02X",
|
||||
data[0], data[1], data[2],
|
||||
)
|
||||
return f"Unknown({data[0]:02X}.{data[1]:02X}.{data[2]:02X})"
|
||||
return None
|
||||
if code == 0x03:
|
||||
# Continuous mode waveform record (confirmed by user — NOT a monitor log).
|
||||
# The byte layout differs from 0x10 single-shot records: the timestamp
|
||||
# fields decode as garbage under the 0x10 waveform layout.
|
||||
# TODO: confirm correct timestamp layout for 0x03 records from a known-time event.
|
||||
return "Waveform (Continuous)"
|
||||
log.warning("_extract_record_type: unknown sub_code=0x%02X", code)
|
||||
return f"Unknown(0x{code:02X})"
|
||||
|
||||
|
||||
def _extract_peak_floats(data: bytes) -> Optional[PeakValues]:
|
||||
"""
|
||||
|
||||
@@ -1,895 +0,0 @@
|
||||
"""
|
||||
minimateplus/event_file_io.py — modern event-file (.sfm.json sidecar) IO.
|
||||
|
||||
This module is the single home for event-file conversion code that doesn't
|
||||
fit cleanly inside `blastware_file.py` (which is the BW binary codec):
|
||||
|
||||
- sidecar JSON read/write (the modern per-event metadata file)
|
||||
- read_blastware_file() — reverse of write_blastware_file, used by
|
||||
the BW-importer flow when SFM is ingesting files produced by
|
||||
Blastware's own ACH (where the source A5 frames aren't available).
|
||||
|
||||
Sidecar schema v1 layout — see docs in the project plan or the schema
|
||||
declared in `event_to_sidecar_dict()`.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import hashlib
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import struct
|
||||
from pathlib import Path
|
||||
from typing import Optional, Union
|
||||
|
||||
from .models import Event, PeakValues, ProjectInfo, Timestamp
|
||||
from . import blastware_file as _bw # avoid circular reference at module load
|
||||
from .bw_ascii_report import BwAsciiReport
|
||||
from .waveform_codec import decode_waveform_v2, decoded_to_adc_counts
|
||||
from .histogram_codec import decode_histogram_body
|
||||
|
||||
# Reference pressure for dB(L) → psi conversion (20 µPa expressed in psi).
|
||||
# Same constant as sfm/sfm_webapp.html so server-side and browser-side
|
||||
# conversions agree.
|
||||
_DBL_REF_PSI = 2.9e-9
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
|
||||
# Schema version for the sidecar JSON. Bump when fields change shape.
|
||||
# Older readers must reject anything > SCHEMA_VERSION; newer fields added
|
||||
# inside `extensions` are forward-compatible without a bump.
|
||||
SCHEMA_VERSION = 1
|
||||
SIDECAR_KIND = "sfm.event"
|
||||
|
||||
# Default tool_version stamp; callers can override. Hard-coded here
|
||||
# rather than read via importlib.metadata because the latter reflects the
|
||||
# *installed* dist-info, which doesn't update when pyproject.toml is
|
||||
# bumped without a `pip install` re-run — leading to confusing stale
|
||||
# version stamps in sidecars. Bump this constant and CHANGELOG.md
|
||||
# together at release time.
|
||||
TOOL_VERSION = "0.20.0"
|
||||
|
||||
try:
|
||||
# Best-effort: prefer the installed metadata when it's NEWER than the
|
||||
# baked-in constant (e.g. a downstream packager bumped the wheel
|
||||
# without editing this file). Otherwise fall back to TOOL_VERSION.
|
||||
from importlib.metadata import version as _pkg_version
|
||||
_meta_v = _pkg_version("seismo-relay")
|
||||
def _vtuple(s):
|
||||
try:
|
||||
return tuple(int(p) for p in s.split(".")[:3])
|
||||
except Exception:
|
||||
return (0, 0, 0)
|
||||
_TOOL_VERSION_DEFAULT = (
|
||||
_meta_v if _vtuple(_meta_v) > _vtuple(TOOL_VERSION) else TOOL_VERSION
|
||||
)
|
||||
except Exception:
|
||||
_TOOL_VERSION_DEFAULT = TOOL_VERSION
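The tuple comparison above matters because plain string comparison orders versions incorrectly once a component reaches two digits. A minimal sketch of the same idea follows; `pick_version` is a hypothetical wrapper name used only for this example.

```python
def vtuple(s: str) -> tuple:
    """Parse up to three dot-separated integer components; (0, 0, 0) on failure."""
    try:
        return tuple(int(p) for p in s.split(".")[:3])
    except Exception:
        return (0, 0, 0)


def pick_version(installed: str, baked_in: str) -> str:
    """Prefer the installed version only when it is strictly newer."""
    return installed if vtuple(installed) > vtuple(baked_in) else baked_in


# Lexicographic comparison gets this wrong: "9" sorts after "2".
print("0.9.0" > "0.20.0")                  # string comparison
print(vtuple("0.9.0") > vtuple("0.20.0"))  # numeric comparison
print(pick_version("0.21.0", "0.20.0"))
```

An unparseable version string such as a release candidate tag collapses to `(0, 0, 0)` and therefore never overrides the baked-in constant.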
|
||||
|
||||
|
||||
# ── Sidecar dict construction ─────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _ts_iso(ts: Optional[Timestamp]) -> Optional[str]:
|
||||
if ts is None:
|
||||
return None
|
||||
try:
|
||||
return datetime.datetime(
|
||||
ts.year, ts.month, ts.day,
|
||||
ts.hour or 0, ts.minute or 0, ts.second or 0,
|
||||
).isoformat()
|
||||
except Exception:
|
||||
return str(ts)
|
||||
|
||||
|
||||
def _peak_values_to_dict(pv: Optional[PeakValues]) -> dict:
|
||||
if pv is None:
|
||||
return {
|
||||
"transverse": None,
|
||||
"vertical": None,
|
||||
"longitudinal": None,
|
||||
"vector_sum": None,
|
||||
"mic_psi": None,
|
||||
}
|
||||
return {
|
||||
"transverse": pv.tran,
|
||||
"vertical": pv.vert,
|
||||
"longitudinal": pv.long,
|
||||
"vector_sum": pv.peak_vector_sum,
|
||||
"mic_psi": pv.micl,
|
||||
}
|
||||
|
||||
|
||||
def _bw_report_to_dict(report: BwAsciiReport) -> dict:
|
||||
"""Project a parsed BW ASCII report into the sidecar's `bw_report` block.
|
||||
|
||||
All fields are rendered as plain JSON-compatible types (no datetime
|
||||
objects). Channels are uniformly lowercased for stable JSON keys.
|
||||
"""
|
||||
def _ch(ch_name: str) -> dict:
|
||||
cs = report.channels.get(ch_name)
|
||||
if cs is None:
|
||||
return {}
|
||||
out = {
|
||||
"ppv_ips": cs.ppv_ips,
|
||||
"zc_freq_hz": cs.zc_freq_hz,
|
||||
"time_of_peak_s": cs.time_of_peak_s,
|
||||
"peak_accel_g": cs.peak_accel_g,
|
||||
"peak_disp_in": cs.peak_disp_in,
|
||||
}
|
||||
# Drop None-valued keys to keep the JSON tidy for partial reports.
|
||||
return {k: v for k, v in out.items() if v is not None}
|
||||
|
||||
def _sc(ch_name: str) -> dict:
|
||||
sc = report.sensor_check.get(ch_name)
|
||||
if sc is None:
|
||||
return {}
|
||||
out = {
|
||||
"freq_hz": sc.test_freq_hz,
|
||||
"ratio": sc.test_ratio,
|
||||
"amplitude_mv": sc.test_amplitude_mv,
|
||||
"result": sc.test_results,
|
||||
}
|
||||
return {k: v for k, v in out.items() if v is not None}
|
||||
|
||||
monitor_log = []
|
||||
for entry in report.monitor_log:
|
||||
e = {
|
||||
"start": entry.start_time.isoformat() if entry.start_time else None,
|
||||
"stop": entry.stop_time.isoformat() if entry.stop_time else None,
|
||||
"description": entry.description,
|
||||
}
|
||||
monitor_log.append({k: v for k, v in e.items() if v is not None})
|
||||
|
||||
return {
|
||||
"available": True,
|
||||
"event_type": report.event_type,
|
||||
"version": report.version,
|
||||
"trigger": {
|
||||
"channel": report.trigger_channel,
|
||||
"geo_level_ips": report.geo_trigger_level_ips,
|
||||
},
|
||||
"recording": {
|
||||
"sample_rate_sps": report.sample_rate_sps,
|
||||
"record_time_s": report.record_time_s,
|
||||
"pretrig_s": report.pretrig_s,
|
||||
"stop_mode": report.record_stop_mode,
|
||||
"geo_range_ips": report.geo_range_ips,
|
||||
"units": report.units,
|
||||
},
|
||||
"device": {
|
||||
"battery_volts": report.battery_volts,
|
||||
"calibration_date": report.calibration_date.isoformat() if report.calibration_date else None,
|
||||
"calibration_by": report.calibration_by,
|
||||
},
|
||||
"peaks": {
|
||||
"tran": _ch("Tran"),
|
||||
"vert": _ch("Vert"),
|
||||
"long": _ch("Long"),
|
||||
"vector_sum": {
|
||||
"ips": report.peak_vector_sum_ips,
|
||||
"time_s": report.peak_vector_sum_time_s,
|
||||
},
|
||||
},
|
||||
"mic": {
|
||||
"weighting": report.mic.weighting,
|
||||
"pspl_dbl": report.mic.pspl_dbl,
|
||||
"zc_freq_hz": report.mic.zc_freq_hz,
|
||||
"time_of_peak_s": report.mic.time_of_peak_s,
|
||||
},
|
||||
"sensor_check": {
|
||||
"tran": _sc("Tran"),
|
||||
"vert": _sc("Vert"),
|
||||
"long": _sc("Long"),
|
||||
"mic": _sc("MicL"),
|
||||
},
|
||||
"monitor_log": monitor_log,
|
||||
"pc_sw_version": report.pc_sw_version,
|
||||
}
|
||||
|
||||
|
||||
def _dbl_to_psi(pspl_dbl: float) -> float:
|
||||
"""Convert dB(L) sound pressure level back to psi. Uses the same
|
||||
20 µPa reference (= 2.9e-9 psi) as the webapp so server-side and
|
||||
browser-side conversions agree."""
|
||||
return _DBL_REF_PSI * (10.0 ** (pspl_dbl / 20.0))
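As a worked example of the conversion, the sketch below restates the formula with the same 2.9e-9 psi reference and adds an inverse for a round-trip check. The inverse `psi_to_dbl` is an assumption added for illustration; it does not appear in the module shown here.

```python
import math

DBL_REF_PSI = 2.9e-9  # 20 µPa expressed in psi, same constant as above


def dbl_to_psi(pspl_dbl: float) -> float:
    """dB(L) -> psi: p = p_ref * 10^(dB / 20)."""
    return DBL_REF_PSI * (10.0 ** (pspl_dbl / 20.0))


def psi_to_dbl(psi: float) -> float:
    """psi -> dB(L): dB = 20 * log10(p / p_ref). Hypothetical inverse."""
    if psi <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(psi / DBL_REF_PSI)


# Every +20 dB multiplies the pressure by 10.
for level in (0.0, 20.0, 120.0):
    print(level, dbl_to_psi(level))
```

The `pspl_dbl > 0` guard used by the callers fits this formula: a level of 0 dB(L) maps to the reference pressure itself, so non-positive values carry no usable peak.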
|
||||
|
||||
|
||||
def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
|
||||
"""Overlay device-authoritative fields from a parsed BW ASCII report
|
||||
onto an in-memory Event, IN-PLACE.
|
||||
|
||||
Why this exists
|
||||
───────────────
|
||||
`read_blastware_file()` parses the BW binary and fills `Event.peak_values`
|
||||
via `_peaks_from_samples()` — which runs the (still-undecoded) BW body
|
||||
codec assuming raw int16 LE and produces ±32K-shaped noise on every
|
||||
channel. Result: peak values land in the SeismoDb event row as
|
||||
~10 in/s on every event regardless of the actual signal.
|
||||
|
||||
When a paired BW ASCII report is available, the report carries the
|
||||
device's own authoritative peak / project / sample-rate / record-time
|
||||
values. This helper folds those onto the Event before it flows to
|
||||
`SeismoDb.insert_events()`, so the DB columns reflect the report
|
||||
rather than the broken-codec output.
|
||||
|
||||
Fields overlaid (only when the report supplies a non-None value):
|
||||
- peak_values.tran / .vert / .long (from report.channels)
|
||||
- peak_values.peak_vector_sum (from report.peak_vector_sum_ips)
|
||||
- peak_values.micl (psi) (from report.mic.pspl_dbl → psi)
|
||||
- project_info.project / .client / .operator / .sensor_location
|
||||
- sample_rate (from report.sample_rate_sps)
|
||||
- rectime_seconds (from report.record_time_s)
|
||||
|
||||
Fields NOT touched (operator-edit / parser-output preserved):
|
||||
- timestamp, raw_samples, record_type, total_samples,
|
||||
pretrig_samples, _waveform_key, _a5_frames, _raw_record
|
||||
- false_trigger and review state (those live on the sidecar, not on Event)
|
||||
"""
|
||||
if event.peak_values is None:
|
||||
event.peak_values = PeakValues()
|
||||
pv = event.peak_values
|
||||
ch = report.channels
|
||||
if (t := ch.get("Tran")) and t.ppv_ips is not None: pv.tran = t.ppv_ips
|
||||
if (v := ch.get("Vert")) and v.ppv_ips is not None: pv.vert = v.ppv_ips
|
||||
if (l := ch.get("Long")) and l.ppv_ips is not None: pv.long = l.ppv_ips
|
||||
if report.peak_vector_sum_ips is not None:
|
||||
pv.peak_vector_sum = report.peak_vector_sum_ips
|
||||
if report.mic.pspl_dbl is not None and report.mic.pspl_dbl > 0:
|
||||
pv.micl = _dbl_to_psi(report.mic.pspl_dbl)
|
||||
|
||||
if event.project_info is None:
|
||||
event.project_info = ProjectInfo()
|
||||
pi = event.project_info
|
||||
if report.project: pi.project = report.project
|
||||
if report.client: pi.client = report.client
|
||||
if report.operator: pi.operator = report.operator
|
||||
if report.sensor_location: pi.sensor_location = report.sensor_location
|
||||
|
||||
if report.sample_rate_sps:
|
||||
event.sample_rate = report.sample_rate_sps
|
||||
if report.record_time_s is not None:
|
||||
event.rectime_seconds = report.record_time_s
|
||||
|
||||
|
||||
def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
    """Mirror of ``apply_report_to_event`` for the projected sidecar
    dict shape (as produced by ``_bw_report_to_dict``).

    Why this exists
    ───────────────
    The ingest path holds a live ``BwAsciiReport`` parsed straight from
    the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
    device-authoritative peaks onto the codec output before insert.

    The backfill path doesn't have the original ``.TXT`` (it's not
    retained in the waveform store), but it does have the preserved
    ``bw_report`` block from the sidecar — which contains the same
    projected fields. Re-overlaying those during a backfill keeps the
    DB peak columns aligned with what BW reports rather than letting
    the codec output (which may be incomplete for unhandled formats or
    walker edge cases) win by default.

    No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
    any particular sub-field — only fields with a concrete value get
    written. Mirrors ``apply_report_to_event``'s "report wins where
    present" semantics.
    """
    if not bw_report:
        return
    if event.peak_values is None:
        event.peak_values = PeakValues()
    pv = event.peak_values

    peaks = bw_report.get("peaks") or {}
    tran = (peaks.get("tran") or {}).get("ppv_ips")
    vert = (peaks.get("vert") or {}).get("ppv_ips")
    long = (peaks.get("long") or {}).get("ppv_ips")
    if tran is not None: pv.tran = tran
    if vert is not None: pv.vert = vert
    if long is not None: pv.long = long
    vs_ips = (peaks.get("vector_sum") or {}).get("ips")
    if vs_ips is not None:
        pv.peak_vector_sum = vs_ips

    mic = bw_report.get("mic") or {}
    pspl = mic.get("pspl_dbl")
    if pspl is not None and pspl > 0:
        pv.micl = _dbl_to_psi(pspl)

    rec = bw_report.get("recording") or {}
    sr = rec.get("sample_rate_sps")
    if sr:
        event.sample_rate = sr
    rt = rec.get("record_time_s")
    if rt is not None:
        event.rectime_seconds = rt
|
||||
|
||||
def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
    if pi is None:
        return {
            "project": None,
            "client": None,
            "operator": None,
            "sensor_location": None,
        }
    return {
        "project": pi.project,
        "client": pi.client,
        "operator": pi.operator,
        "sensor_location": pi.sensor_location,
    }
|
||||
|
||||
def event_to_sidecar_dict(
    event: Event,
    *,
    serial: str,
    blastware_filename: str,
    blastware_filesize: int,
    blastware_sha256: str,
    source_kind: str = "sfm-live",
    a5_pickle_filename: Optional[str] = None,
    tool_version: str = _TOOL_VERSION_DEFAULT,
    captured_at: Optional[datetime.datetime] = None,
    review: Optional[dict] = None,
    extensions: Optional[dict] = None,
    bw_report: Optional[BwAsciiReport] = None,
) -> dict:
    """
    Build a v1 sidecar dict from an Event + the surrounding metadata.

    Pure helper — no file I/O. Callers stitch the result into a sidecar
    via `write_sidecar()` (or POST it back via the PATCH endpoint).

    When *bw_report* is supplied (e.g. by the ACH-forwarded import path
    where Blastware writes a per-event ASCII report alongside the binary),
    its decoded fields are folded into the sidecar:

      - A new top-level ``bw_report`` block carries the rich derived
        per-channel stats (Peak Acceleration, Peak Displacement, ZC Freq,
        Time of Peak), the Peak Vector Sum + time, the per-channel sensor
        self-check results, and monitor-log timestamps.
      - ``peak_values`` is overlaid from the report (the report's PPV/PVS
        values are computed by the device firmware and are authoritative;
        anything ``read_blastware_file()`` derived from samples is
        approximate at best until the body codec is decoded).
      - ``project_info`` is overlaid from the report when the report
        supplies a non-empty value (the report mirrors the device's
        compliance config, which is what BW shows in its event report).
      - ``event.timestamp`` is overlaid from the report's Event Date +
        Event Time (BW's report timestamps are second-resolution and
        match the binary's footer; we prefer the report value because
        the BW-binary footer timestamp can drift on some firmware).
    """
    if source_kind not in {"sfm-live", "sfm-ach", "bw-import", "idf-import"}:
        raise ValueError(f"unknown source_kind: {source_kind!r}")

    captured_at = captured_at or datetime.datetime.utcnow()

    # ── Overlay event fields from the report when present ───────────────────
    timestamp_iso = _ts_iso(event.timestamp)
    if bw_report and bw_report.event_datetime:
        timestamp_iso = bw_report.event_datetime.isoformat()

    # Build peak_values, optionally overlaid from the report. The report
    # stores Mic peak as PSPL (dB(L)); we convert to psi to match the
    # existing peak_values.mic_psi field.
    peak_dict = _peak_values_to_dict(event.peak_values)
    if bw_report:
        ch = bw_report.channels
        if (t := ch.get("Tran")) and t.ppv_ips is not None: peak_dict["transverse"] = t.ppv_ips
        if (v := ch.get("Vert")) and v.ppv_ips is not None: peak_dict["vertical"] = v.ppv_ips
        if (l := ch.get("Long")) and l.ppv_ips is not None: peak_dict["longitudinal"] = l.ppv_ips
        if bw_report.peak_vector_sum_ips is not None:
            peak_dict["vector_sum"] = bw_report.peak_vector_sum_ips
        if bw_report.mic.pspl_dbl is not None and bw_report.mic.pspl_dbl > 0:
            peak_dict["mic_psi"] = _dbl_to_psi(bw_report.mic.pspl_dbl)

    # Project info: overlay from report (the report mirrors the
    # session-start compliance config that BW renders in event reports).
    proj_dict = _project_info_to_dict(event.project_info)
    if bw_report:
        if bw_report.project: proj_dict["project"] = bw_report.project
        if bw_report.client: proj_dict["client"] = bw_report.client
        if bw_report.operator: proj_dict["operator"] = bw_report.operator
        if bw_report.sensor_location: proj_dict["sensor_location"] = bw_report.sensor_location

    # Event-block fields: overlay from report where available.
    event_block = {
        "serial": serial,
        "timestamp": timestamp_iso,
        "waveform_key": event._waveform_key.hex() if event._waveform_key else None,
        "record_type": event.record_type,
        "sample_rate": event.sample_rate,
        "rectime_seconds": event.rectime_seconds,
        "total_samples": event.total_samples,
        "pretrig_samples": event.pretrig_samples,
    }
    if bw_report:
        # Report values are authoritative — they're the user-configured
        # values BW reads back, not STRT-derived guesses. In particular
        # `event.rectime_seconds` from `read_blastware_file()` reads
        # STRT[18] which is actually the `0x46` record-type marker (= 70)
        # rather than the user's Record Time setting. Always overwrite.
        if bw_report.sample_rate_sps:
            event_block["sample_rate"] = bw_report.sample_rate_sps
        if bw_report.record_time_s is not None:
            event_block["rectime_seconds"] = bw_report.record_time_s
        # Derive total_samples + pretrig_samples per channel from the
        # report's sample_rate × times. These match the row count of
        # the report's sample table (verified: event-c reports 1024 sps
        # × (1.0 + 0.25) = 1280 rows).
        if (sr := bw_report.sample_rate_sps) and bw_report.record_time_s is not None:
            pretrig_s = abs(bw_report.pretrig_s) if bw_report.pretrig_s is not None else 0.0
            event_block["total_samples"] = int(round(sr * (bw_report.record_time_s + pretrig_s)))
            event_block["pretrig_samples"] = int(round(sr * pretrig_s))

    out = {
        "schema_version": SCHEMA_VERSION,
        "kind": SIDECAR_KIND,

        "event": event_block,
        "peak_values": peak_dict,
        "project_info": proj_dict,

        "blastware": {
            "filename": blastware_filename,
            "filesize": blastware_filesize,
            "sha256": blastware_sha256,
            "available": True,
        },

        "source": {
            "kind": source_kind,
            "captured_at": captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
            "tool_version": tool_version,
            "a5_pickle_filename": a5_pickle_filename,
        },

        "review": review or {
            "false_trigger": False,
            "reviewer": None,
            "reviewed_at": None,
            "notes": "",
        },

        "extensions": extensions or {},
    }

    if bw_report:
        out["bw_report"] = _bw_report_to_dict(bw_report)

    return out
|
||||
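The sample-count derivation in `event_to_sidecar_dict` is plain arithmetic on the report's sample rate and times. The sketch below restates it as a hypothetical standalone helper, `derive_sample_counts` (the name is an assumption, not part of the module), and reproduces the 1024 sps × (1.0 + 0.25) case quoted in the comment. Passing the pre-trigger as a negative number is also an assumption, suggested only by the `abs()` in the original.

```python
from typing import Optional, Tuple


def derive_sample_counts(
    sample_rate_sps: int,
    record_time_s: float,
    pretrig_s: Optional[float],
) -> Tuple[int, int]:
    """Hypothetical helper: (total_samples, pretrig_samples) per channel."""
    # The sign of the pre-trigger is ignored, as in the original expression.
    pre = abs(pretrig_s) if pretrig_s is not None else 0.0
    total = int(round(sample_rate_sps * (record_time_s + pre)))
    pretrig = int(round(sample_rate_sps * pre))
    return total, pretrig


counts = derive_sample_counts(1024, 1.0, -0.25)
counts_no_pretrig = derive_sample_counts(1024, 1.0, None)
```

With no pre-trigger reported, the total collapses to `sample_rate × record_time`.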
|
||||
# ── Sidecar IO ────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def write_sidecar(path: Union[str, Path], data: dict) -> None:
    """
    Atomic write of a sidecar dict to <path>.

    Validates schema_version is supported before writing so we don't
    silently drop a future-format sidecar over the wire.
    """
    path = Path(path)
    sv = data.get("schema_version")
    if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
        raise ValueError(
            f"write_sidecar: unsupported schema_version={sv!r} "
            f"(this build supports 1..{SCHEMA_VERSION})"
        )

    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, sort_keys=False, default=str)
        f.write("\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
|
||||
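`write_sidecar` relies on the write-to-temp-then-rename pattern: a reader either sees the old file or the complete new one, never a half-written JSON document. A minimal sketch of that pattern follows, using a hypothetical `atomic_write_json` helper (an illustrative name, with the schema validation left out) and a temporary directory for the demonstration.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Hypothetical helper: write JSON to a sibling temp file, then swap it in."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
        f.write("\n")
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp, path)     # replaces the target in one step on the same filesystem


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "event.sfm.json"
    atomic_write_json(target, {"schema_version": 1})
    roundtrip = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(p.name for p in Path(d).iterdir())
```

Keeping the temp file next to the target matters: `os.replace` is only atomic when source and destination are on the same filesystem.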
|
||||
def read_sidecar(path: Union[str, Path]) -> dict:
    """
    Load a sidecar JSON file.

    Raises FileNotFoundError if missing, ValueError on bad shape /
    unsupported schema_version. Unknown keys at the top level are
    preserved in the returned dict (forward-compat).
    """
    path = Path(path)
    with path.open("r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError(f"sidecar at {path}: top-level is not a JSON object")
    sv = data.get("schema_version")
    if not isinstance(sv, int) or sv < 1:
        raise ValueError(f"sidecar at {path}: missing or invalid schema_version")
    if sv > SCHEMA_VERSION:
        raise ValueError(
            f"sidecar at {path}: schema_version={sv} > supported {SCHEMA_VERSION}; "
            "upgrade seismo-relay to read this file"
        )
    if data.get("kind") != SIDECAR_KIND:
        raise ValueError(f"sidecar at {path}: unexpected kind={data.get('kind')!r}")
    return data
|
||||
|
||||
def patch_sidecar(
    path: Union[str, Path],
    *,
    review: Optional[dict] = None,
    extensions: Optional[dict] = None,
    reviewer_now: bool = True,
) -> dict:
    """
    Atomically apply a JSON-merge-patch to a sidecar file's `review`
    and/or `extensions` blocks. Other top-level keys are untouched.

    `reviewer_now`: when True (default) and `review` is non-empty, stamps
    `review.reviewed_at` with the current UTC time so the review-time is
    auditable without the caller having to pass it.

    Returns the new full sidecar dict.
    """
    path = Path(path)
    data = read_sidecar(path)

    if review:
        merged = dict(data.get("review") or {})
        merged.update({k: v for k, v in review.items() if v is not None or k in merged})
        if reviewer_now:
            merged["reviewed_at"] = datetime.datetime.utcnow().isoformat() + "Z"
        data["review"] = merged

    if extensions:
        merged_ext = dict(data.get("extensions") or {})
        merged_ext.update(extensions)
        data["extensions"] = merged_ext

    write_sidecar(path, data)
    return data
|
||||
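The `review` merge in `patch_sidecar` has one subtle rule: a `None` value clears a key that already exists, but does not create a key that is absent. The sketch below isolates that expression in a hypothetical `merge_review` function (an illustrative name; the `ticket` key is made up for the example) so the behaviour can be checked without touching a file.

```python
def merge_review(existing: dict, patch: dict) -> dict:
    """Hypothetical helper mirroring the review-merge expression."""
    merged = dict(existing)
    # Keep a patch entry when it carries a value, or when it targets an
    # existing key (so None can be used to clear that key).
    merged.update({k: v for k, v in patch.items() if v is not None or k in merged})
    return merged


existing = {"false_trigger": False, "reviewer": "alice", "notes": ""}
patch = {"false_trigger": True, "reviewer": None, "ticket": None}
result = merge_review(existing, patch)
```

Here `reviewer` is reset to `None`, while the unknown `ticket` key is dropped because it has no value and no prior entry.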
|
||||
def sidecar_path_for(blastware_path: Union[str, Path]) -> Path:
    """Convention: <bw_path>.sfm.json sits next to the BW binary."""
    p = Path(blastware_path)
    return p.with_name(p.name + ".sfm.json")
|
||||
|
||||
def file_sha256(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Compute SHA-256 of a file as a hex string."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
|
||||
|
||||
# ── Blastware-file reader ─────────────────────────────────────────────────────
#
# Reverse of `blastware_file.write_blastware_file`. Used by the BW-import
# flow to ingest files produced by Blastware's own ACH (where the source
# A5 frames are not available).
#
# File structure (recap):
#   [22B header] [21B STRT record] [body bytes] [26B footer]
#
# The body holds:
#   - 6B preamble (00 00 ff ff ff ff) immediately after the STRT
#   - 4-channel interleaved int16 LE samples
#   - Embedded ASCII metadata strings (Project: / Client: / User Name: /
#     Seis Loc: / Extended Notes) from the device's session-start config
#
# The 0C waveform record (per-event peaks, project name) is NOT in the
# BW file — those are computed by the device firmware and only carried
# in the live SUB 0C response. read_blastware_file() therefore computes
# peaks from the raw samples assuming Normal-range (10 in/s full-scale)
# geophone sensitivity. Imported events surface that assumption via the
# sidecar's `peak_values.computed_from_samples` flag.
|
||||
|
||||
# Geophone scale factor, in/s per ADC unit, for Normal range (10 in/s FS).
# Confirmed from CLAUDE.md (geo_hardware_constant = 6.206053 in/s per V,
# ADC full-scale = 1.61133 V, so Normal range = 6.206053 × 1.61133 ≈ 10.0
# in/s peak; per-count resolution ≈ 10.0 / 32768).
_GEO_NORMAL_FS_INS = 10.0
_GEO_SENSITIVE_FS_INS = 1.250
_INT16_FS = 32768.0
|
||||
# Microphone scale factor, psi per ADC count. Approximate: the exact factor
# depends on the geophone-vs-mic ADC scaling and the firmware reference.
# We mark mic_psi as "computed approximate" in the sidecar.
_MIC_FS_PSI = 0.0125 / _INT16_FS  # assumes 0.0125 psi full-scale (approximate)
|
||||
|
||||
def _decode_strt(strt: bytes) -> dict:
    """
    Decode the 21-byte STRT record from a BW file.

    Returns dict with waveform_key (4B), total_samples, pretrig_samples,
    rectime_seconds. Falls back to None on truncated/missing fields.
    """
    if len(strt) < 21 or strt[0:4] != b"STRT":
        return {}
    return {
        "waveform_key": strt[6:10].hex(),
        "total_samples": struct.unpack_from(">H", strt, 8)[0],
        "pretrig_samples": struct.unpack_from(">H", strt, 16)[0],
        "rectime_seconds": strt[18],
    }
|
||||
|
||||
def _find_first_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
    """
    Search `buf` for `label` (e.g. b"Project:") and return the
    null-terminated ASCII string that follows, stripped.
    """
    pos = buf.find(label)
    if pos < 0:
        return None
    start = pos + len(label)
    end = buf.find(b"\x00", start, start + max_len)
    if end < 0:
        end = start + max_len
    text = buf[start:end].decode("ascii", errors="replace").strip()
    return text or None
|
||||
|
||||
def _decode_samples_4ch_int16_le(stream: bytes) -> dict[str, list[int]]:
    """
    Decode a 4-channel interleaved int16 LE byte stream into per-channel
    lists. Channels are [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
    Truncates to a multiple of 8 bytes (one full sample-set).
    """
    n_complete = (len(stream) // 8) * 8
    if n_complete == 0:
        return {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    fmt = "<" + "h" * (n_complete // 2)
    flat = list(struct.unpack(fmt, stream[:n_complete]))
    return {
        "Tran": flat[0::4],
        "Vert": flat[1::4],
        "Long": flat[2::4],
        "MicL": flat[3::4],
    }
|
||||
|
||||
def _peaks_from_samples(samples: dict[str, list[int]]) -> PeakValues:
    """
    Compute approximate peaks from raw int16 samples assuming Normal-range
    geophone sensitivity. Used by the BW-importer when the 0C waveform
    record (the device's authoritative peaks) is unavailable.
    """
    def _peak_ins(ch: list[int]) -> float:
        if not ch:
            return 0.0
        m = max(abs(int(v)) for v in ch)
        return m / _INT16_FS * _GEO_NORMAL_FS_INS

    tran = _peak_ins(samples.get("Tran", []))
    vert = _peak_ins(samples.get("Vert", []))
    long_ = _peak_ins(samples.get("Long", []))

    # Mic in psi (approximate)
    mic_ch = samples.get("MicL", []) or []
    mic = max((abs(int(v)) for v in mic_ch), default=0) * _MIC_FS_PSI

    # Peak vector sum: max over time of sqrt(T^2 + V^2 + L^2)
    pvs = 0.0
    n = min(len(samples.get("Tran", [])), len(samples.get("Vert", [])), len(samples.get("Long", [])))
    if n:
        scale = _GEO_NORMAL_FS_INS / _INT16_FS
        T = samples["Tran"]; V = samples["Vert"]; L = samples["Long"]
        for i in range(n):
            t = T[i] * scale
            v = V[i] * scale
            l = L[i] * scale
            mag = (t*t + v*v + l*l) ** 0.5
            if mag > pvs:
                pvs = mag

    return PeakValues(
        tran=tran, vert=vert, long=long_,
        peak_vector_sum=pvs, micl=mic,
    )
|
||||
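The peak vector sum in `_peaks_from_samples` is taken per sample (the maximum over time of the instantaneous vector magnitude), which is not the same as combining the three channel peaks. A small sketch under the same Normal-range assumption illustrates the difference; `peak_vector_sum` and the two module-level constants are hypothetical stand-ins for the private names above, and `math.hypot` with three arguments needs Python 3.8 or later.

```python
import math
from typing import Sequence

GEO_NORMAL_FS_INS = 10.0   # assumed Normal-range full scale, in/s
INT16_FS = 32768.0         # int16 full-scale count


def peak_vector_sum(tran: Sequence[int], vert: Sequence[int], long_: Sequence[int]) -> float:
    """Hypothetical helper: max over time of sqrt(T^2 + V^2 + L^2), in in/s."""
    scale = GEO_NORMAL_FS_INS / INT16_FS
    return max(
        (math.hypot(t * scale, v * scale, l * scale) for t, v, l in zip(tran, vert, long_)),
        default=0.0,  # empty channels yield 0.0 rather than raising
    )


# Tran peaks on the first sample, Vert on the second: they never coincide.
pvs = peak_vector_sum([16384, 0], [0, 16384], [0, 0])
# Combining the per-channel peaks instead would overstate the motion.
naive = math.hypot(5.0, 5.0, 0.0)
empty = peak_vector_sum([], [], [])
```

Because `zip` stops at the shortest channel, the sketch matches the original's use of the minimum channel length.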
|
||||
_RECORD_TYPE_BY_EXT_SUFFIX = {
    'H': 'Histogram',
    'W': 'Waveform',
    'M': 'Manual',
    'E': 'Event',
    'C': 'Combo',
}
|
||||
|
||||
def derive_record_type_from_filename(filename, default: str = "Waveform") -> str:
    """Derive a BW Event's record_type from its filename's extension suffix.

    V10.72+ MiniMate Plus firmware encodes the event type as the LAST
    character of the extension (the `T` in BW's `AB0T` scheme):

        ``M529LKIQ.G10H`` → H → ``"Histogram"``
        ``T350L385.VY0W`` → W → ``"Waveform"``
        ``...M`` → M → ``"Manual"``
        ``...E`` → E → ``"Event"``
        ``...C`` → C → ``"Combo"``

    Old S338 firmware uses 3-char extensions ending in ``0`` whose
    encoding is not yet known — those fall through to ``default``.
    Micromate Series 4 uses a different scheme entirely (observed:
    ``IDFH``, ``IDFW``) but the LAST-char convention (H / W) still holds
    for the type code, so it works for both families.

    Returns ``default`` if filename is empty, has no extension, or the
    suffix char isn't a recognized type code.
    """
    if not filename:
        return default
    try:
        name = Path(filename).name
    except (TypeError, ValueError):
        return default
    if '.' not in name:
        return default
    ext = name.rsplit('.', 1)[1]
    if not ext:
        return default
    return _RECORD_TYPE_BY_EXT_SUFFIX.get(ext[-1].upper(), default)
|
||||
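The last-character convention can be checked against the two filenames quoted in the docstring. The sketch below is a compact restatement rather than the module's function: `record_type_from_name` and `TYPE_BY_SUFFIX` are hypothetical names, it uses `Path.suffix` instead of `rsplit`, and `old_unit.S30` is a made-up filename standing in for the older 3-character extensions.

```python
from pathlib import Path

TYPE_BY_SUFFIX = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def record_type_from_name(filename: str, default: str = "Waveform") -> str:
    """Hypothetical restatement: map the extension's last character to a type."""
    ext = Path(filename).suffix          # includes the leading dot; "" when absent
    if len(ext) < 2:
        return default
    return TYPE_BY_SUFFIX.get(ext[-1].upper(), default)


histogram = record_type_from_name("M529LKIQ.G10H")
waveform = record_type_from_name("T350L385.VY0W")
unknown = record_type_from_name("old_unit.S30", default="Unknown")
no_ext = record_type_from_name("README", default="Unknown")
tmp_bw = record_type_from_name("upload.bw", default="Unknown")
```

Note the last case: because the suffix character is upper-cased, a lowercase `.bw` name matches the `W` code directly instead of falling through to the default.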
|
||||
def read_blastware_file(path: Union[str, Path]) -> Event:
    """
    Parse a Blastware waveform file into an Event.

    Recovers:
      - waveform_key, rectime_seconds, total_samples, pretrig_samples
        (from the STRT record)
      - timestamp (from the footer's start-time field)
      - project_info (from ASCII labels embedded in the body)
      - raw_samples (Tran/Vert/Long/MicL int16 lists)
      - peak_values (computed from raw_samples; approximate — see notes
        on _peaks_from_samples about Normal-range assumption)

    Does NOT recover the source A5 frames (they aren't in the BW file).
    The returned Event has `_a5_frames = None`, signalling that
    byte-for-byte regeneration of the BW file from this Event alone is
    not possible — the on-disk BW file IS the byte-for-byte source.
    """
    path = Path(path)
    raw = path.read_bytes()
    if len(raw) < _bw._WAVEFORM_HEADER_SIZE + 21 + 26:
        raise ValueError(f"{path}: file too short ({len(raw)} bytes) to be a BW event")

    # Header: validate magic prefix.
    header = raw[:_bw._WAVEFORM_HEADER_SIZE]
    if not header.startswith(_bw._FILE_HEADER_PREFIX):
        raise ValueError(f"{path}: not a Blastware file (bad header prefix)")

    # STRT record: 21 bytes immediately after the header.
    strt_raw = raw[_bw._WAVEFORM_HEADER_SIZE : _bw._WAVEFORM_HEADER_SIZE + 21]
    strt_fields = _decode_strt(strt_raw)
    if not strt_fields:
        raise ValueError(f"{path}: STRT record missing or malformed")

    # Footer: locate the 0e 08 marker, validating the year is in a sane range.
    body_start = _bw._WAVEFORM_HEADER_SIZE + 21
    footer_pos = -1
    pos = body_start
    while True:
        pos = raw.find(b"\x0e\x08", pos)
        if pos < 0 or pos + 26 > len(raw):
            break
        yr = (raw[pos + 4] << 8) | raw[pos + 5]
        if 2015 <= yr <= 2050:
            footer_pos = pos
            break
        pos += 1

    if footer_pos < 0 and len(raw) >= 26:
        footer_pos = len(raw) - 26
    if footer_pos < body_start:
        raise ValueError(f"{path}: footer not found")

    body = raw[body_start : footer_pos]
    footer = raw[footer_pos : footer_pos + 26]

    # Footer layout:
    #   [0:2]   0e 08 marker
    #   [2:10]  ts1 (start) BE 8B
    #   [10:18] ts2 (stop) BE 8B
    #   [18:24] 00 01 00 02 00 00
    #   [24:26] crc
    ts1 = _bw._decode_ts_be(footer[2:10])
    ts2 = _bw._decode_ts_be(footer[10:18])

    # Body: decode via the verified body codecs. Two formats coexist:
    #
    #   1. Waveform-mode (.AB0W) — starts with 7-byte preamble
    #      ``00 02 00 [Tran[0] BE] [Tran[1] BE]`` followed by the
    #      tagged-block delta stream documented in
    #      ``docs/waveform_codec_re_status.md`` and §7.6.1 of the
    #      protocol reference. Decoded by ``waveform_codec.decode_waveform_v2``.
    #
    #   2. Histogram-mode (.AB0H) — a sequence of 32-byte blocks, one
    #      per histogram interval, each carrying per-channel peak +
    #      half-period values. Decoded by
    #      ``histogram_codec.decode_histogram_body``. Both codecs
    #      return the same channel-grouped output shape, so consumers
    #      don't need to special-case mode.
    #
    # The historical ``_decode_samples_4ch_int16_le`` int16-LE
    # interpretation was retracted 2026-05-08 (see protocol-ref §7.6.1
    # retraction box) — it produced ±32K noise on every event.
    #
    # If both codecs fail (malformed file, truncated body, unrecognised
    # mode, synthetic test input), fall back to empty channels — the
    # rest of the event (timestamp, waveform_key, project strings) is
    # still recoverable and useful.
    decoded = decode_waveform_v2(body)
    if decoded is None:
        decoded = decode_histogram_body(body)
    if decoded is None:
        log.warning(
            "%s: body codec failed to decode (body starts %s) — "
            "raw_samples will be empty", path, body[:8].hex(" "),
        )
        samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    else:
        samples = decoded_to_adc_counts(decoded)

    # Metadata strings (label-anchored search across the body).
    project = _find_first_string(body, b"Project:")
    client = _find_first_string(body, b"Client:")
    user = _find_first_string(body, b"User Name:")
    seisloc = _find_first_string(body, b"Seis Loc:")

    # Build the Event.
    ev = Event(index=-1)
    if strt_fields.get("waveform_key"):
        ev._waveform_key = bytes.fromhex(strt_fields["waveform_key"])
    # Derive record_type from the filename's extension suffix (H/W/M/E/C).
    # When called from save_imported_bw the path here is a tmp file with a
    # ".bw" suffix; its trailing "w" upper-cases to the W code, so the
    # derivation yields "Waveform" and the caller overrides ev.record_type
    # using the original filename (see waveform_store.save_imported_bw).
    ev.record_type = derive_record_type_from_filename(path.name)
    ev.rectime_seconds = strt_fields.get("rectime_seconds")
    ev.total_samples = strt_fields.get("total_samples")
    ev.pretrig_samples = strt_fields.get("pretrig_samples")

    if ts1 is not None:
        ev.timestamp = Timestamp(
            raw=footer[2:10],
            flag=0x10,
            year=ts1.year, unknown_byte=0, month=ts1.month, day=ts1.day,
            hour=ts1.hour, minute=ts1.minute, second=ts1.second,
        )

    ev.project_info = ProjectInfo(
        project=project, client=client, operator=user, sensor_location=seisloc,
    )
    ev.raw_samples = samples
    # Only compute peaks from samples when we actually have samples.
    # For events neither body codec could decode, every channel list in
    # ``samples`` is empty and ``_peaks_from_samples`` would return
    # PeakValues(0, 0, 0, 0, 0). That would then OVERWRITE existing good
    # DB peak values (e.g. from paired BW ASCII reports) during the
    # backfill UPSERT path.
    # Leaving peak_values=None signals "we don't know" to downstream
    # consumers; the backfill script seeds from the DB row when it sees
    # None, and ``apply_report_to_event`` overlays from a paired ASCII
    # report when one is supplied.
    has_samples = any(samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL"))
    ev.peak_values = _peaks_from_samples(samples) if has_samples else None
    ev._a5_frames = None  # not recoverable from BW file

    return ev
+31
-181
@@ -111,24 +111,20 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
     verified against this algorithm on 2026-04-02).
 
     Args:
-        offset_word: 16-bit offset. For probe/chunks/metadata pages this is
-                     `0x1002`. For the proper TERM frame this is computed by
-                     `bulk_waveform_term_v2()` from the STRT-derived
-                     `end_offset`.
-        raw_params: 10, 11, or 12 params bytes (from `bulk_waveform_params`
-                    for probes/samples, `bulk_waveform_term_v2` for TERM, or
-                    a manually-built 12-byte block for the metadata pages
-                    0x1002 / 0x1004). See gotcha #3 below — params region
-                    uses partial DLE stuffing of 0x10 bytes.
+        offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
+        raw_params: 10 or 11 params bytes (from bulk_waveform_params or
+                    bulk_waveform_term_params). 0x10 bytes in params are
+                    written RAW — NOT DLE-stuffed. Confirmed 2026-04-06 by
+                    comparing wire bytes: BW sends bare `10 04` for chunk 1
+                    (counter=0x1004), not stuffed `10 10 04`. Device reads
+                    params at fixed byte positions; stuffing shifts the bytes
+                    and corrupts the counter, causing device to ignore the frame.
 
     Returns:
         Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
     """
-    if len(raw_params) not in (10, 11, 12):
-        # 10 = termination params; 11 = regular probe / chunk params;
-        # 12 = metadata-page params (extra trailing 0x00 — BW byte-perfect quirk
-        # for the two fixed metadata reads at counter=0x1002 and 0x1004).
-        raise ValueError(f"raw_params must be 10/11/12 bytes, got {len(raw_params)}")
+    if len(raw_params) not in (10, 11):
+        raise ValueError(f"raw_params must be 10 or 11 bytes, got {len(raw_params)}")
 
     # Build stuffed section between STX and checksum
     s = bytearray()
@@ -138,40 +134,8 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
     s += b"\x00"                           # field3
     s += bytes([(offset_word >> 8) & 0xFF, # offset_hi — raw, NOT stuffed
                 offset_word & 0xFF])       # offset_lo
-    # Params — partial DLE stuffing of 0x10 bytes (CONFIRMED 2026-05-05).
-    #
-    # The device's de-stuffing rule for params is:
-    #   • `10 10` → de-stuffs to `10`
-    #   • `10 02/03/04` → kept literal (these are inner-frame markers)
-    #   • `10 X` other → de-stuffs to just `X` (drops the 0x10)
-    #
-    # So for any 0x10 byte in the *logical* params that is followed by a
-    # byte NOT in {0x02, 0x03, 0x04, 0x10}, we must double the 0x10 on the
-    # wire (`10 X` → `10 10 X`) so the device's de-stuffer reproduces the
-    # original `10 X` pair. Without this, counter values with `0x10` in
-    # the high byte (e.g. counter=0x1000 has params bytes `10 00`) are
-    # silently corrupted to `0x__00` on the device side, and the device
-    # responds for the wrong address — for counter=0x1000 it returns the
-    # probe response (counter=0x0000), which contains the file header +
-    # STRT. That STRT block then lands in the assembled file body and
-    # Blastware rejects the file as malformed.
-    #
-    # Confirmed against BW capture 5-1-26 / bwcap3sec frame 20: params
-    # logical bytes `00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000)
-    # are encoded on the wire as `00 01 11 10 10 00 00 00 00 00 00 00`.
-    # BW frames 13/14 (meta @ 0x1002 / 0x1004) leave `10 02` and `10 04`
-    # raw — the device handles those literal pairs correctly.
-    i = 0
-    while i < len(raw_params):
-        b = raw_params[i]
+    for b in raw_params:  # params — NOT DLE-stuffed (raw bytes, match BW wire format)
         s.append(b)
-        if (
-            b == 0x10
-            and i + 1 < len(raw_params)
-            and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10)
-        ):
-            s.append(0x10)  # double the 0x10 so it survives device de-stuffing
-        i += 1
 
     # DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
     chk, i = 0, 0
|
||||
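The partial DLE-stuffing rule spelled out in the comment block of this hunk can be exercised in isolation. The sketch below restates that loop as a hypothetical standalone function, `stuff_params`, and checks it against the counter=0x1000 wire bytes quoted in the comment. The function name and the `meta` byte string are illustrative assumptions; only the `logical` → `wire` pair comes from the capture cited above.

```python
# Bytes that may follow a 0x10 without the 0x10 being doubled on the wire.
_LITERAL_FOLLOWERS = (0x02, 0x03, 0x04, 0x10)


def stuff_params(raw_params: bytes) -> bytes:
    """Hypothetical standalone version of the partial-stuffing loop."""
    out = bytearray()
    for i, b in enumerate(raw_params):
        out.append(b)
        has_next = i + 1 < len(raw_params)
        if b == 0x10 and has_next and raw_params[i + 1] not in _LITERAL_FOLLOWERS:
            out.append(0x10)  # double the 0x10 so the device de-stuffs back to `10 X`
    return bytes(out)


# Logical params for counter=0x1000, as quoted in the comment block.
logical = bytes.fromhex("00 01 11 10 00 00 00 00 00 00 00")
wire = stuff_params(logical)

# Assumed example of a params block containing a literal `10 04` pair.
meta = bytes.fromhex("00 01 11 10 04 00 00 00 00 00 00 00")
meta_wire = stuff_params(meta)
```

The `10 00` pair grows by one byte on the wire, while the `10 04` pair passes through untouched, which is the asymmetry the comment describes.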
@@ -434,26 +398,28 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -
 
 def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
     """
-    ⛔ DEPRECATED — DO NOT USE IN NEW CODE.
+    Build the 10-byte params block for the SUB 5A termination request.
 
-    This is the v1 termination params helper, paired with the broken
-    `_BULK_TERM_OFFSET = 0x005A` magic offset_word. Together they produce a
-    ~100-byte device-side terminator response that does NOT contain the
-    partial-last-chunk waveform tail or the 26-byte file footer. Files
-    reconstructed using this terminator are missing their last ~512 bytes of
-    waveform data and have a synthesized footer that disagrees with what BW
-    would have written.
+    The termination request uses offset=0x005A and a DIFFERENT params layout —
+    the leading 0x00 byte is dropped, key4[0:2] shifts to params[0:2], and the
+    counter high byte is at params[2]:
 
-    **For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
-    which computes the correct offset_word + params from the STRT-derived
-    `end_offset`. v2 produces wire bytes that match BW exactly across all
-    tested events (4-27-26 / 5-1-26 / 5-4-26 captures).
+        params[0] = key4[0]
+        params[1] = key4[1]
+        params[2] = (counter >> 8) & 0xFF
+        params[3:] = zeros
 
-    This function is retained ONLY for the defensive fallback path in
-    `read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
-    chunks are fetched (= a malformed event or an unexpected device state).
-    The fallback already logs a WARNING when it activates; if you see that
-    warning, the bug is upstream — STRT should have been parseable.
+    Counter for the termination request = last_regular_counter + 0x0400.
+
+    Confirmed from 1-2-26 BW TX capture: final request (frame 83) uses
+    offset=0x005A, params[0:3] = key4[0:2] + term_counter_hi.
+
+    Args:
+        key4: 4-byte waveform key.
+        counter: Termination counter (= last regular counter + 0x0400).
+
+    Returns:
+        10-byte params block.
     """
     if len(key4) != 4:
         raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
@@ -464,123 +430,6 @@ def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
|
||||
return bytes(p)
|
||||
|
||||
|
||||
def bulk_waveform_term_v2(
    key4: bytes,
    end_offset: int,
    last_chunk_counter: int,
) -> tuple[int, bytes]:
    """
    Compute the SUB 5A TERM frame's offset_word and 10-byte params block.

    Confirmed across 3 events (4-27-26 + 5-1-26 captures):

        next_boundary = last_chunk_counter + 0x0200
        offset_word   = end_offset - next_boundary   (residual byte count)
        params[0]     = key4[0]                      (= 0x01 on every observed device)
        params[1]     = key4[1]                      (= 0x11)
        params[2]     = (next_boundary >> 8) & 0xFF
        params[3]     = next_boundary & 0xFF
        params[4:10]  = zeros

    Verification:
        | end_offset | last_chunk | next_boundary | offset_word | params[2:4] |
        | 0x1ABE     | 0x1800     | 0x1A00        | 0x00BE      | 1A 00       |
        | 0x21F2     | 0x1E00     | 0x2000        | 0x01F2      | 20 00       |
        | 0x417E     | 0x3E38     | 0x4038        | 0x0146      | 40 38       |

    The device receives `requested_address = (params[2] << 8) | offset_word`
    and replies with `(end_offset - next_boundary)` bytes of waveform tail
    starting at `next_boundary`, including the 26-byte file footer.

    Args:
        key4:               4-byte waveform key for this event.
        end_offset:         Event-end pointer (= `(end_key[2] << 8) | end_key[3]`
                            from the STRT record at data[23:27] of A5[0]).
        last_chunk_counter: Counter of the last full 0x0200-byte chunk fetched
                            (the chunk that covers [last_chunk_counter,
                            last_chunk_counter + 0x0200)).

    Returns:
        (offset_word, params10) tuple. Pass as
        `build_5a_frame(offset_word, params)`.

    Raises:
        ValueError: on inconsistent inputs.
    """
    if len(key4) != 4:
        raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
    next_boundary = last_chunk_counter + 0x0200
    if next_boundary > 0xFFFF:
        raise ValueError(
            f"next_boundary 0x{next_boundary:04X} exceeds uint16; check inputs"
        )
    if end_offset <= last_chunk_counter:
        raise ValueError(
            f"end_offset 0x{end_offset:04X} must be > "
            f"last_chunk_counter 0x{last_chunk_counter:04X}"
        )
    offset_word = end_offset - next_boundary
    if offset_word < 0:
        # Last chunk overshot end_offset; caller should have stopped one chunk
        # earlier. Treat as zero residual.
        offset_word = 0
    if offset_word > 0xFFFF:
        raise ValueError(
            f"offset_word 0x{offset_word:04X} exceeds uint16"
        )
    p = bytearray(10)
    p[0] = key4[0]
    p[1] = key4[1]
    p[2] = (next_boundary >> 8) & 0xFF
    p[3] = next_boundary & 0xFF
    return offset_word, bytes(p)
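The verification table in the docstring above can be replayed as a quick check. The sketch below is a standalone restatement of the documented arithmetic, not the project's function: the name `term_v2_sketch` is hypothetical, the input validation is left out, and the key `01110000` is used for every row purely for illustration (only `key4[0:2]` affects the result).

```python
from __future__ import annotations


def term_v2_sketch(key4: bytes, end_offset: int, last_chunk_counter: int) -> tuple[int, bytes]:
    """Hypothetical stand-in that mirrors the documented TERM arithmetic."""
    next_boundary = last_chunk_counter + 0x0200
    # Residual byte count; clamp to zero if the last chunk overshot the end.
    offset_word = max(end_offset - next_boundary, 0)
    params = bytearray(10)
    params[0] = key4[0]
    params[1] = key4[1]
    params[2] = (next_boundary >> 8) & 0xFF
    params[3] = next_boundary & 0xFF
    return offset_word, bytes(params)


# (end_offset, last_chunk, expected offset_word, expected params[2:4])
# copied from the docstring's verification table.
ROWS = [
    (0x1ABE, 0x1800, 0x00BE, bytes([0x1A, 0x00])),
    (0x21F2, 0x1E00, 0x01F2, bytes([0x20, 0x00])),
    (0x417E, 0x3E38, 0x0146, bytes([0x40, 0x38])),
]

KEY = bytes.fromhex("01110000")  # illustrative key; only bytes 0 and 1 are used

for end_offset, last_chunk, want_offset, want_boundary in ROWS:
    offset_word, params = term_v2_sketch(KEY, end_offset, last_chunk)
    assert offset_word == want_offset
    assert params[2:4] == want_boundary
    assert len(params) == 10
```

Keeping the table rows as data makes it cheap to add a row whenever a new capture is analysed.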
|
||||
|
||||
|
||||
# ── End-offset extraction from STRT record ────────────────────────────────────

STRT_MARKER = b"STRT"


def parse_strt_end_offset(a5_data: bytes) -> Optional[int]:
    """
    Extract the event-end offset from the STRT record in an A5 response payload.

    The first A5 response (the probe response, or the first chunk for events
    with non-zero start_key[2:4]) contains a STRT record at byte offset 17 of
    `data`. Layout:

        data[17:21]   "STRT"
        data[21:23]   ff fe sentinel
        data[23:27]   end_key   ← 4-byte key of where this event ENDS
        data[27:31]   start_key
        ...

    Returns `(end_key[2] << 8) | end_key[3]`, the absolute device-buffer
    address where the event ends. Use this to bound the chunk loop and to
    compute the TERM frame.

    Verified end_offset values:
        | event start_key | end_key  | end_offset |
        | 01110000        | 01111ABE | 0x1ABE     |
        | 01110000        | 011121F2 | 0x21F2     |
        | 011121F2        | 0111417E | 0x417E     |

    Args:
        a5_data: The `data` field of an A5 response frame (frame.data).

    Returns:
        The end_offset (uint16) if STRT is found, else None.
    """
    pos = a5_data.find(STRT_MARKER)
    if pos < 0 or pos + 10 > len(a5_data):
        return None
    # data[pos+4:pos+6] is "ff fe"; data[pos+6:pos+10] is end_key.
    end_key = a5_data[pos + 6 : pos + 10]
    if len(end_key) < 4:
        return None
    return (end_key[2] << 8) | end_key[3]
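To make the STRT layout concrete, here is a minimal sketch that builds a synthetic A5 payload and pulls the end offset back out. The payload is fabricated for illustration (17 zero bytes standing in for the real header), and `strt_end_offset_sketch` is a hypothetical stand-in that follows the same search-and-slice logic described in the docstring; the key values come from the first row of the verified table.

```python
from typing import Optional

STRT_MARKER = b"STRT"


def strt_end_offset_sketch(a5_data: bytes) -> Optional[int]:
    """Hypothetical stand-in: find STRT, skip the ff fe sentinel, read end_key."""
    pos = a5_data.find(STRT_MARKER)
    # Need 4 marker bytes + 2 sentinel bytes + 4 end_key bytes after pos.
    if pos < 0 or pos + 10 > len(a5_data):
        return None
    end_key = a5_data[pos + 6 : pos + 10]
    return (end_key[2] << 8) | end_key[3]


# Synthetic payload: 17 filler bytes, then the STRT record fields in order.
payload = (
    bytes(17)                      # stand-in for data[0:17]
    + STRT_MARKER                  # data[17:21]
    + b"\xff\xfe"                  # data[21:23] sentinel
    + bytes.fromhex("01111ABE")    # data[23:27] end_key
    + bytes.fromhex("01110000")    # data[27:31] start_key
)

assert strt_end_offset_sketch(payload) == 0x1ABE
# A payload truncated in the middle of end_key yields None rather than a bad value.
assert strt_end_offset_sketch(payload[:25]) is None
# So does a payload with no STRT record at all.
assert strt_end_offset_sketch(bytes(64)) is None
```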
|
||||
|
||||
|
||||
# ── Pre-built POLL frames ─────────────────────────────────────────────────────
|
||||
#
|
||||
# POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
|
||||
@@ -621,6 +470,7 @@ class S3Frame:
|
||||
|
||||
|
||||
# ── Streaming S3 frame parser ─────────────────────────────────────────────────
|
||||
|
||||
class S3FrameParser:
|
||||
"""
|
||||
Incremental byte-stream parser for S3→BW response frames.
|
||||
|
||||
@@ -1,283 +0,0 @@
|
||||
"""
|
||||
histogram_codec.py — decoder for MiniMate Plus histogram-mode event bodies.
|
||||
|
||||
FULLY DECODED 2026-05-20. Every field in every block, verified
|
||||
byte-exact against BW's ASCII export across multiple histogram
|
||||
fixtures.
|
||||
|
||||
The histogram-mode body is a stream of 32-byte fixed-length blocks,
|
||||
one block per histogram interval. Each block carries the per-interval
|
||||
peak amplitude + zero-crossing frequency for all four channels (Tran,
|
||||
Vert, Long, MicL).
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Body layout (CONFIRMED 2026-05-20)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
[stream of 32-byte blocks]
|
||||
|
||||
Body length is approximately ``n_intervals * 32`` bytes plus a small
|
||||
trailing remnant (1-9 bytes typically) at the very end. Walker should
|
||||
iterate 32-stride and stop before the tail.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
32-byte block layout
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
[0] 0x00 always-zero tag
|
||||
[1] segment_id (uint8) 0x00..0x03 — 256 blocks per segment
|
||||
[2:4] block_ctr (uint16 LE) resets each segment (0x0100, 0x0101, …)
|
||||
[4:6] 0x000a (uint16 LE) constant marker (= 10)
|
||||
[6] T_peak_count uint8 Tran peak (count × 0.005 → in/s, max 1.275 in/s)
|
||||
[7] T_annotation uint8 empirically non-zero on intervals with sub-Hz
|
||||
or unmeasurable Tran freq; meaning not fully RE'd
|
||||
[8:10] T_halfperiod uint16 LE Tran half-period in samples (freq = 512 / halfp Hz)
|
||||
[10] V_peak_count uint8
|
||||
[11] V_annotation uint8
|
||||
[12:14] V_halfperiod uint16 LE
|
||||
[14] L_peak_count uint8
|
||||
[15] L_annotation uint8
|
||||
[16:18] L_halfperiod uint16 LE
|
||||
[18] M_peak_count uint8 MicL peak (count → dB via mic_count_to_db)
|
||||
[19] M_annotation uint8
|
||||
[20:22] M_halfperiod uint16 LE MicL half-period in samples (freq = 512 / halfp Hz)
|
||||
[22:24] 0x00 0x00 constant
|
||||
[24:28] 4-byte variable purpose unknown (possibly CRC or timestamp delta)
|
||||
[28:32] 0x1e 0x0a 0x00 0x00 constant block-end signature
|
||||
|
||||
NOTE on peak-count width: an earlier interpretation treated the peak
|
||||
fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
|
||||
That happened to be byte-exact against the N844 fixture corpus only
|
||||
because every annotation byte in those fixtures was zero, making
|
||||
``uint16 LE == uint8``. Cross-correlating BE9558 (K558) Tran-drift
|
||||
and BE18003 (T003) Histogram+Continuous events against the BW ASCII
|
||||
export proved peak is uint8 alone — see test_histogram_codec.py
|
||||
and docs/histogram_codec_re_status.md.
|
||||
|
||||
Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
|
||||
``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``. This is the reliable
|
||||
distinguisher from non-block content in the file.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Per-channel encoding
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
Geophone channels (Tran, Vert, Long):
|
||||
- peak_count × 0.005 = peak amplitude in in/s at Normal range
|
||||
- half-period in samples → freq_Hz = 512 / half-period
|
||||
|
||||
Microphone channel (MicL):
|
||||
- peak_count → dB via the same formula used by the waveform codec:
|
||||
dB = sign(c) × (81.94 + 20·log10(|c|)) for |c| ≥ 1
|
||||
dB = 0 for c == 0
|
||||
- half-period → freq_Hz = 512 / half-period (same as geo)
|
||||
|
||||
Frequency `>100 Hz` sentinel: the device emits half-period ≤ 5 when the
|
||||
measured zero-crossing rate exceeds the geophone's measurement range
|
||||
(since 512/5 ≈ 102.4 Hz; the BW display rounds anything > 100 to ">100").
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Output shape
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
``decode_histogram_body`` returns a per-channel dict matching the
|
||||
waveform codec's shape so the rest of the pipeline (.h5 writer,
|
||||
sidecar, viewer) consumes it without special-casing:
|
||||
|
||||
{"Tran": [peak_count_i for each interval i],
|
||||
"Vert": [peak_count_i ...],
|
||||
"Long": [peak_count_i ...],
|
||||
"MicL": [peak_count_i ...]}
|
||||
|
||||
Values are in **16-count units for geo** (LSB = 0.005 in/s, matching
|
||||
``decode_waveform_v2``) and **1-count units for mic** (matching the
|
||||
waveform codec's mic convention). Run through
|
||||
``waveform_codec.decoded_to_adc_counts`` to scale geo to 1-count ADC.
|
||||
|
||||
Per-interval frequencies are NOT returned — they're auxiliary data,
|
||||
not waveform samples. Consumers needing frequencies can call
|
||||
``decode_histogram_body_full()`` for the structured per-interval
|
||||
record list.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import struct
|
||||
from typing import List, Optional, Tuple
|
||||
|
||||
# Block-end signature: constant `1e 0a 00 00` in bytes [28:32] of every
|
||||
# real data block. More distinctive than the byte-22 `00 00` (which
|
||||
# matches many false positives), so we anchor on this.
|
||||
_BLOCK_TAIL = b"\x1e\x0a\x00\x00"
|
||||
_BLOCK_SIZE = 32
|
||||
|
||||
# Marker word (uint16 LE) at block[4:6] of every histogram data block. Used as
# additional validation that we're looking at a real block.
|
||||
_BLOCK_MARKER = 10
|
||||
|
||||
# Geo peak scaling: stored as "count × 0.005 in/s" where 1 count = one
|
||||
# 0.005 in/s display quantum. Equivalent to the waveform codec's
|
||||
# 16-count-unit output (1 unit = 0.005 in/s = 16 ADC counts).
|
||||
_GEO_LSB_INS = 0.005
|
||||
|
||||
# Frequency formula: freq_Hz = _FREQ_NUMERATOR / half_period_samples.
|
||||
# Empirically determined to be 512 (= sample_rate / 2, where sample rate
|
||||
# is 1024 sps for the standard MiniMate Plus configuration).
|
||||
_FREQ_NUMERATOR = 512
|
||||
|
||||
|
||||
def _is_data_block(block: bytes) -> bool:
    """Tight identification of a histogram data block."""
    if len(block) < _BLOCK_SIZE:
        return False
    if block[28:32] != _BLOCK_TAIL:
        return False
    if block[22:24] != b"\x00\x00":
        return False
    if block[0] != 0x00:
        return False
    marker = block[4] | (block[5] << 8)
    if marker != _BLOCK_MARKER:
        return False
    return True
|
||||
|
||||
|
||||
def _decode_block(block: bytes) -> Optional[dict]:
    """Decode one 32-byte histogram block. Caller must have validated
    with ``_is_data_block`` first.

    Returns a record with per-channel peak counts (uint8) and
    half-periods (uint16 LE).
    """
    # Peak counts are uint8 at bytes [6] / [10] / [14] / [18]. The
    # adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
    # whose meaning isn't fully understood (empirically non-zero in
    # intervals with sub-Hz or unmeasurable geo frequencies, mostly
    # zero otherwise; see test fixtures from BE9558/BE18003 corpora).
    # Crucially, those annotation bytes are NOT the high byte of the
    # peak count: cross-correlating against BW's per-interval ASCII
    # export proves the peak is uint8 alone.
    #
    # Reading the peak as uint16 LE (the original interpretation) was
    # accidentally correct only because every block in the N844 fixture
    # corpus had a zero annotation byte; non-N844 events with non-zero
    # annotation bytes decoded to physically impossible peaks (e.g.
    # 268 in/s per channel) and produced 35× inflated PVS sums when
    # first run against prod data. See histogram_codec_re_status.md.
    t_peak = block[6]
    v_peak = block[10]
    l_peak = block[14]
    m_peak = block[18]
    t_halfp = block[8] | (block[9] << 8)
    v_halfp = block[12] | (block[13] << 8)
    l_halfp = block[16] | (block[17] << 8)
    m_halfp = block[20] | (block[21] << 8)
    segment_id = block[1]
    block_ctr = block[2] | (block[3] << 8)
    var_meta = bytes(block[24:28])
    annotations = (block[7], block[11], block[15], block[19])
    return {
        "segment_id": segment_id,
        "block_ctr": block_ctr,
        "t_peak": t_peak,
        "t_halfp": t_halfp,
        "v_peak": v_peak,
        "v_halfp": v_halfp,
        "l_peak": l_peak,
        "l_halfp": l_halfp,
        "m_peak": m_peak,
        "m_halfp": m_halfp,
        "meta_var": var_meta,
        "annotations": annotations,
    }
|
||||
|
||||
|
||||
def walk_body(body: bytes) -> List[dict]:
    """Walk the body and return one dict per histogram interval.

    Iterates 32-byte strides from offset 0. Collects a decoded record
    for every block that passes ``_is_data_block`` validation. Stops
    when the remaining bytes are too short to form a complete block.

    In Histogram+Continuous mode the body interleaves data blocks with
    other 32-byte content (likely continuous-mode waveform blocks) that
    fail the data-block validation; the walker naturally skips them
    without losing 32-byte alignment. Use ``block_ctr`` from each
    returned record to map back to the original interval index; the
    record list is sparse when other block types are interleaved.
    """
    records: List[dict] = []
    for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
        blk = body[off:off + _BLOCK_SIZE]
        if not _is_data_block(blk):
            # Hit non-block content (likely a sync or stream marker).
            # Continue walking: block alignment is fixed at 32-stride
            # from offset 0, so we don't lose alignment by skipping.
            continue
        decoded = _decode_block(blk)
        if decoded is None:
            # Defensive check: ``_decode_block`` is typed Optional, so
            # skip any block it declines to decode rather than
            # propagating bogus PVS contributions.
            continue
        records.append(decoded)
    return records
|
||||
|
||||
|
||||
def decode_histogram_body(body: bytes) -> Optional[dict]:
    """Decode a histogram-mode body into per-channel peak-sample arrays.

    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
    where each channel's list contains one peak value per histogram
    interval (in the same units the waveform codec uses: 16-count units
    for geo, 1-count ADC units for mic). Returns ``None`` if the body
    doesn't contain any valid histogram blocks.

    To convert to physical units:
      - Geo channels: ``count * 0.005`` = peak in in/s at Normal range
        (or run through ``waveform_codec.decoded_to_adc_counts`` first
        to get 1-count ADC values, then ``count / 32767 * 10.0`` for in/s)
      - Mic channel: use ``waveform_codec.mic_count_to_db(count)``
    """
    records = walk_body(body)
    if not records:
        return None
    return {
        "Tran": [r["t_peak"] for r in records],
        "Vert": [r["v_peak"] for r in records],
        "Long": [r["l_peak"] for r in records],
        "MicL": [r["m_peak"] for r in records],
    }


def decode_histogram_body_full(body: bytes) -> Optional[List[dict]]:
    """Decode a histogram-mode body into the full per-interval record list.

    Same data as ``decode_histogram_body`` but in a structured form that
    preserves the half-period (frequency) data for each channel + the
    per-block segment_id, block_ctr, and 4-byte variable metadata.
    Useful for diagnostic tools, sidecar enrichment, and future-codec
    work.

    Returns ``None`` if the body has no valid blocks.
    """
    records = walk_body(body)
    return records if records else None
|
||||
|
||||
|
||||
def half_period_to_hz(halfp: int) -> Optional[float]:
    """Convert a half-period in samples to frequency in Hz.

    Returns ``None`` for half-period ≤ 5: the device emits values in
    that range when the measured zero-crossing rate exceeds 100 Hz
    (the BW display reports `>100 Hz` for such cases). Callers can
    treat ``None`` as the `>100 Hz` sentinel.
    """
    if halfp <= 5:
        return None
    return _FREQ_NUMERATOR / halfp


def geo_count_to_ins(count: int) -> float:
    """Convert a histogram geo peak count to in/s at Normal range."""
    return count * _GEO_LSB_INS
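As a worked example of the block layout and unit conversions described in the module docstring, the sketch below builds one synthetic 32-byte block and converts its Tran and MicL fields to physical units. All field values are made up for illustration, and `make_block` and `mic_count_to_db_sketch` are hypothetical helpers; the mic formula is copied from the docstring rather than from `waveform_codec`, whose actual implementation is not shown here.

```python
import math
import struct

GEO_LSB_INS = 0.005     # in/s per geo peak count (docstring: count × 0.005)
FREQ_NUMERATOR = 512    # freq_Hz = 512 / half_period (docstring)


def make_block(t_peak: int, t_halfp: int, m_peak: int, m_halfp: int) -> bytes:
    """Build a synthetic histogram block with only Tran and MicL populated."""
    block = bytearray(32)                      # [0] tag and [1] segment_id stay 0x00
    struct.pack_into("<H", block, 2, 0x0100)   # [2:4] block_ctr
    struct.pack_into("<H", block, 4, 10)       # [4:6] constant marker
    block[6] = t_peak                          # [6] Tran peak count (uint8)
    struct.pack_into("<H", block, 8, t_halfp)  # [8:10] Tran half-period
    block[18] = m_peak                         # [18] MicL peak count (uint8)
    struct.pack_into("<H", block, 20, m_halfp) # [20:22] MicL half-period
    block[28:32] = b"\x1e\x0a\x00\x00"         # [28:32] block-end signature
    return bytes(block)


def mic_count_to_db_sketch(count: int) -> float:
    """dB = sign(c) × (81.94 + 20·log10(|c|)), with 0 mapped to 0 dB."""
    if count == 0:
        return 0.0
    return math.copysign(81.94 + 20 * math.log10(abs(count)), count)


block = make_block(t_peak=20, t_halfp=16, m_peak=10, m_halfp=32)

tran_ins = block[6] * GEO_LSB_INS
tran_hz = FREQ_NUMERATOR / struct.unpack_from("<H", block, 8)[0]
mic_db = mic_count_to_db_sketch(block[18])
mic_hz = FREQ_NUMERATOR / struct.unpack_from("<H", block, 20)[0]

assert math.isclose(tran_ins, 0.1)     # 20 counts × 0.005 in/s
assert tran_hz == 32.0                 # 512 / 16
assert math.isclose(mic_db, 101.94)    # 81.94 + 20·log10(10)
assert mic_hz == 16.0                  # 512 / 32
```

Reading the peak as a single byte, as here, is the point of the uint8 note in the docstring: a non-zero annotation byte at `[7]` would not change `tran_ins`.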
|
||||
@@ -201,58 +201,6 @@ class Timestamp:
|
||||
second=second,
|
||||
)
|
||||
|
||||
    @classmethod
    def from_short_record(cls, data: bytes) -> "Timestamp":
        """
        Decode an 8-byte timestamp header from a 210-byte waveform record.

        Wire layout (✅ CONFIRMED 2026-05-01 against live SFM run on BE11529 in
        Continuous mode, day-of-month = 1 May, raw: 01 05 07 ea 00 0d 15 25):
            byte[0]:    day    (uint8)
            byte[1]:    month  (uint8)
            bytes[2-3]: year   (big-endian uint16)
            byte[4]:    unknown (0x00 in observed sample)
            byte[5]:    hour   (uint8)
            byte[6]:    minute (uint8)
            byte[7]:    second (uint8)

        This is a third format observed in the wild, distinct from the 9-byte
        (single-shot, sub_code=0x10 at [1]) and 10-byte (continuous, 0x10 at
        [0] AND [2]) layouts. No marker bytes; disambiguated by where the
        year lands when scanned at byte 2/3/4.

        Args:
            data: at least 8 bytes; only the first 8 are consumed.

        Returns:
            Decoded Timestamp.

        Raises:
            ValueError: if data is fewer than 8 bytes.
        """
        if len(data) < 8:
            raise ValueError(
                f"Short record timestamp requires at least 8 bytes, got {len(data)}"
            )
        day = data[0]
        month = data[1]
        year = struct.unpack_from(">H", data, 2)[0]
        unknown_byte = data[4]
        hour = data[5]
        minute = data[6]
        second = data[7]
        return cls(
            raw=bytes(data[:8]),
            flag=0,
            year=year,
            unknown_byte=unknown_byte,
            month=month,
            day=day,
            hour=hour,
            minute=minute,
            second=second,
        )
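The raw sample quoted in the docstring can be decoded by hand to confirm the field order. This is a minimal sketch using the standard library's `datetime` in place of the project's `Timestamp` class; `decode_short_timestamp` is a hypothetical name, and byte 4 (purpose unknown) is simply ignored.

```python
import struct
from datetime import datetime


def decode_short_timestamp(data: bytes) -> datetime:
    """Hypothetical helper: decode the 8-byte day/month/year/?/h/m/s layout."""
    if len(data) < 8:
        raise ValueError(f"need at least 8 bytes, got {len(data)}")
    day, month = data[0], data[1]
    year = struct.unpack_from(">H", data, 2)[0]   # big-endian uint16
    hour, minute, second = data[5], data[6], data[7]
    return datetime(year, month, day, hour, minute, second)


# Raw bytes from the docstring: 01 05 07 ea 00 0d 15 25
raw = bytes.fromhex("010507ea000d1525")
stamp = decode_short_timestamp(raw)

assert stamp == datetime(2026, 5, 1, 13, 21, 37)
print(stamp.isoformat())  # 2026-05-01T13:21:37
```

Using `datetime` here also rejects impossible field combinations (month 0, hour 25), which is one way to tell this layout apart from the 9-byte and 10-byte variants when scanning.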
|
||||
|
||||
@property
|
||||
def clock_set(self) -> bool:
|
||||
"""False when year == 1995 (factory default / battery-lost state)."""
|
||||
|
||||
+239
-238
@@ -35,8 +35,6 @@ from .framing import (
|
||||
token_params,
|
||||
bulk_waveform_params,
|
||||
bulk_waveform_term_params,
|
||||
bulk_waveform_term_v2,
|
||||
parse_strt_end_offset,
|
||||
POLL_PROBE,
|
||||
POLL_DATA,
|
||||
SESSION_RESET,
|
||||
@@ -124,22 +122,16 @@ DATA_LENGTHS: dict[int, int] = {
|
||||
}
|
||||
|
||||
# SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
|
||||
#
|
||||
# 2026-05-01 minimal-fix: the chunk-counter walk is now bounded by the event's
|
||||
# `end_offset` extracted from the STRT record at data[23:27] of the probe
|
||||
# response. Without this bound the loop kept asking for chunks past the event
|
||||
# end and the device responded with post-event circular-buffer garbage,
|
||||
# corrupting reconstructed Blastware files for events ≥ 2 sec.
|
||||
#
|
||||
# We keep the OLD 0x0400 chunk step here (BW actually uses 0x0200 — see §7.8.5
|
||||
# of the protocol reference for the corrected understanding) because the
|
||||
# existing blastware_file.py builder relies on the 0x0400-step frame structure
|
||||
# to produce valid files. Switching to BW's 0x0200 step is a separate task
|
||||
# that also requires updating the file builder.
|
||||
# BW-exact protocol values (v0.14.0). Verified against 4-27-26 + 5-1-26 captures.
|
||||
_BULK_CHUNK_OFFSET = 0x1002 # offset_word for probe + all chunk requests
|
||||
_BULK_TERM_OFFSET = 0x005A # offset_word for the legacy terminator (fallback only)
|
||||
_BULK_COUNTER_STEP = 0x0200 # chunk counter increment (matches chunk payload size)
|
||||
# Confirmed from 1-2-26 BW TX capture analysis (2026-04-02).
|
||||
_BULK_CHUNK_OFFSET = 0x1004 # offset field for probe + all regular chunk requests ✅
|
||||
_BULK_TERM_OFFSET = 0x005A # offset field for termination request ✅
|
||||
_BULK_COUNTER_STEP = 0x0400 # chunk counter increment per chunk ✅
|
||||
# Chunk counter formula: key4[2:4] + (chunk_num - 1) * 0x0400
|
||||
# where key4[2:4] is the event's circular-buffer base offset ((key4[2]<<8)|key4[3]).
|
||||
# Earlier captures showed 0x1004 for chunk 1 of key 01110000 — that was a Blastware
|
||||
# artifact. For keys where key4[2:4] != 0x0000 (e.g. key 01111884) the old
|
||||
# "n * 0x0400" formula sends counters from the wrong buffer region and the device
|
||||
# returns data from a different event. Confirmed correct 2026-04-24.
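The two sets of constants in this hunk imply different counter walks. As a sketch of the `end_offset`-bounded variant (0x0200 step, event 1 starting at 0x0600, continuation events starting one step past the probe), the function below lists the chunk counters that would be requested before the TERM frame. `plan_chunk_counters` is a hypothetical helper written for illustration; it sends nothing and only reproduces the arithmetic stated in the comments and in `read_bulk_waveform_stream`.

```python
from typing import List

STEP = 0x0200  # chunk counter increment in the end_offset-bounded walk


def plan_chunk_counters(start_offset: int, end_offset: int, max_chunks: int = 256) -> List[int]:
    """Return the sample-chunk counters requested between probe and TERM."""
    if start_offset == 0:
        counter = 0x0600                 # event 1: samples begin after the metadata pages
    else:
        counter = start_offset + STEP    # continuation: the probe was the first chunk
    counters: List[int] = []
    # Stop when the next full chunk would straddle the event end.
    while len(counters) < max_chunks and counter + STEP <= end_offset:
        counters.append(counter)
        counter += STEP
    return counters


# Event 1 with end_offset 0x1ABE: the last full chunk is 0x1800,
# matching the last_chunk column of the TERM verification table.
event1 = plan_chunk_counters(0x0000, 0x1ABE)
assert event1[0] == 0x0600
assert event1[-1] == 0x1800

# Continuation event probed at 0x2238 with end_offset 0x417E.
event_n = plan_chunk_counters(0x2238, 0x417E)
assert event_n[0] == 0x2438
assert event_n[-1] == 0x3E38
```

The last counter in each list is what would be passed as `last_chunk_counter` when computing the TERM frame.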
|
||||
|
||||
# Default timeout values (seconds).
|
||||
# MiniMate Plus is a slow device — keep these generous.
|
||||
@@ -534,270 +526,223 @@ class MiniMateProtocol:
|
||||
self,
|
||||
key4: bytes,
|
||||
*,
|
||||
stop_after_metadata: bool = True, # DEPRECATED — no-op under BW-exact walk
|
||||
max_chunks: int = 256, # safety cap only; loop is bounded by end_offset
|
||||
stop_after_metadata: bool = True,
|
||||
max_chunks: int = 32,
|
||||
include_terminator: bool = False,
|
||||
extra_chunks_after_metadata: int = 1, # DEPRECATED — no-op
|
||||
extra_chunks_after_metadata: int = 1,
|
||||
) -> list[S3Frame]:
|
||||
"""
|
||||
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event using
|
||||
Blastware's exact protocol. REWRITTEN 2026-05-02 (v0.14.0).
|
||||
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event.
|
||||
|
||||
Algorithm (matches BW captures across 2-sec / 3-sec / event-2):
|
||||
The bulk waveform stream carries both raw ADC samples (large) and
|
||||
event-time metadata strings ("Project:", "Client:", "User Name:",
|
||||
"Seis Loc:", "Extended Notes") embedded in one of the middle frames
|
||||
(confirmed: A5[7] of 9 for 1-2-26 capture).
|
||||
|
||||
1. Probe
|
||||
- For events at start_key[2:4] = 0x0000 (first event after erase
|
||||
/ wrap): probe at counter=0x0000 with full key in params.
|
||||
- For continuation events (start_key[2:4] != 0): first chunk at
|
||||
counter = start_key[2:4] + 0x0046; acts as both probe and
|
||||
first sample chunk; response carries STRT.
|
||||
Protocol is request-per-chunk, NOT a continuous stream:
|
||||
1. Probe (offset=_BULK_CHUNK_OFFSET, is_probe=True, counter=0x0000)
|
||||
2. Chunks (offset=_BULK_CHUNK_OFFSET, is_probe=False, counter+=0x0400)
|
||||
3. Loop until metadata found (stop_after_metadata=True) or max_chunks
|
||||
4. Termination (offset=_BULK_TERM_OFFSET, counter=last+_BULK_COUNTER_STEP)
|
||||
Device responds with a final A5 frame (page_key=0x0000).
|
||||
|
||||
2. Parse end_offset from STRT record at data[23:27] of the probe response.
|
||||
By default the termination frame (page_key=0x0000) is NOT included in the
|
||||
returned list. Pass include_terminator=True to append it; the blastware_file
|
||||
writer needs the terminator frame's body to reconstruct the waveform file footer.
|
||||
|
||||
3. Read two fixed metadata pages at counter=0x1002 and counter=0x1004
|
||||
— global session metadata (Project / Client / User Name / Seis Loc
|
||||
/ Extended Notes ASCII strings). Event 1 only; continuation
|
||||
events skip these (BW caches them across the session).
|
||||
|
||||
4. Walk sample chunks at 0x0200 increments, starting from 0x0600 for
|
||||
event 1 or `start + 0x0046 + 0x0200` for continuation events.
|
||||
Stop when `next_chunk + 0x0200 > end_offset`.
|
||||
|
||||
5. Send TERM frame with offset_word and params computed by
|
||||
`bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`.
|
||||
The TERM response contains the partial last chunk (residual =
|
||||
end_offset - next_boundary) including the 26-byte 0e 08 file
|
||||
footer.
|
||||
Args:
|
||||
key4: 4-byte waveform key from EVENT_HEADER (1E).
|
||||
stop_after_metadata: If True (default), send termination as soon as
|
||||
b"Project:" is found in a frame's data — avoids
|
||||
downloading the full ADC waveform payload (several
|
||||
hundred KB). Set False to download everything.
|
||||
max_chunks: Safety cap on the number of chunk requests sent
|
||||
(default 32; a typical event uses 9 large frames).
|
||||
include_terminator: If True, append the terminator A5 frame
|
||||
(page_key=0x0000) to the returned list. The
|
||||
terminator carries the waveform file footer bytes.
|
||||
Default False preserves existing caller behaviour.
|
||||
|
||||
Returns:
|
||||
List of S3Frame objects from each A5 response (probe, metadata
|
||||
pages, sample chunks, optional TERM response). Caller passes
|
||||
`include_terminator=True` (e.g. write_blastware_file) to keep the
|
||||
TERM response in the list — it's required to reconstruct the
|
||||
file footer.
|
||||
|
||||
Deprecated kwargs:
|
||||
stop_after_metadata: legacy "Project:"-string-based stop condition.
|
||||
No-op under the BW-exact walk; the loop is
|
||||
deterministically bounded by end_offset from
|
||||
STRT. Accepted for backward compat.
|
||||
extra_chunks_after_metadata: same.
|
||||
List of S3Frame objects from each A5 response frame. Frame indices
|
||||
match the request sequence: index 0 = probe response, index 1 = first
|
||||
chunk, etc. If include_terminator=True, the last element is the
|
||||
terminator frame (page_key=0x0000).
|
||||
|
||||
Raises:
|
||||
ProtocolError: on timeout / bad checksum / unexpected SUB.
|
||||
ProtocolError: on timeout, bad checksum, or unexpected SUB.
|
||||
|
||||
Confirmed from 1-2-26 BW TX/RX captures (2026-04-02):
|
||||
- probe + 8 regular chunks + 1 termination = 10 TX frames
|
||||
- 9 large A5 responses + 1 terminator A5 = 10 RX frames
|
||||
- page_key=0x0010 on large frames; page_key=0x0000 on terminator ✅
|
||||
- "Project:" metadata at A5[7].data[626] ✅
|
||||
"""
|
||||
if len(key4) != 4:
|
||||
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
||||
|
||||
# Quietly accept deprecated kwargs; log at debug level when they are set to non-default values.
|
||||
if not stop_after_metadata:
|
||||
log.debug("5A: stop_after_metadata=False is no-op under BW-exact walk")
|
||||
if extra_chunks_after_metadata not in (0, 1):
|
||||
log.debug("5A: extra_chunks_after_metadata=%d is no-op under BW-exact walk",
|
||||
extra_chunks_after_metadata)
|
||||
|
||||
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xA5
|
||||
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xFF - 0x5A = 0xA5
|
||||
frames_data: list[S3Frame] = []
|
||||
counter = 0
|
||||
|
||||
start_offset = (key4[2] << 8) | key4[3]
|
||||
is_event_1 = (start_offset == 0)
|
||||
# BW counter formula (confirmed from 4-3-26 capture for key 0111245a,
|
||||
# and empirical live-device test 2026-04-06 for key 01110000):
|
||||
# counter for chunk n = max(key4[2:4], 0x0400) + (n - 1) * 0x0400
|
||||
# key4[2:4] is the event's circular-buffer base offset. The max() guard
|
||||
# ensures chunk 1 never uses counter=0x0000 (which equals the probe address
|
||||
# and causes the device to re-return STRT record data for the first chunk).
|
||||
_key4_offset = (key4[2] << 8) | key4[3]
|
||||
|
||||
# ── Step 1: probe / first chunk ──────────────────────────────────────
|
||||
if is_event_1:
|
||||
probe_counter = 0
|
||||
probe_params = bulk_waveform_params(key4, 0, is_probe=True)
|
||||
log.debug("5A probe (event-1) key=%s counter=0x0000", key4.hex())
|
||||
else:
|
||||
# Continuation events: first 5A request lands at counter = key[2:4]
|
||||
# (i.e. the address of the off=0x46 WAVEHDR record returned by 1F).
|
||||
# The probe response carries STRT at byte 17 with end_offset.
|
||||
#
|
||||
# Confirmed 2026-05-04 from 5-1-26 "copy 2nd address" capture
|
||||
# (BW probes counter=0x2238 with key=01112238, STRT@17 end=0x417E)
|
||||
# and 5-4-26 BW captures (2-sec event probes counter=0x2238).
|
||||
#
|
||||
# The earlier "+0x46" formula in the doc came from calling
|
||||
# start_key the BOUNDARY (off=0x2C) key, but the iteration walk
|
||||
# uses 1F's off=0x46 key as cur_key, which already incorporates
|
||||
# the +0x46 offset relative to the boundary. Adding it again
|
||||
# caused the probe to overshoot, miss STRT, and run uncapped.
|
||||
probe_counter = start_offset
|
||||
probe_params = bulk_waveform_params(key4, probe_counter)
|
||||
log.debug(
|
||||
"5A probe (event-N) key=%s counter=0x%04X",
|
||||
key4.hex(), probe_counter,
|
||||
)
|
||||
|
||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, probe_params))
|
||||
self._parser.reset()
|
||||
# ── Step 1: probe ────────────────────────────────────────────────────
|
||||
log.debug("5A probe key=%s key4_offset=0x%04X", key4.hex(), _key4_offset)
|
||||
params = bulk_waveform_params(key4, 0, is_probe=True)
|
||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
||||
self._parser.reset() # reset bytes_fed counter before probe recv
|
||||
try:
|
||||
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False)
|
||||
probe_batch = self._recv_5a_batch(rsp_sub)
|
||||
except TimeoutError:
|
||||
log.warning(
|
||||
"5A probe TIMED OUT for key=%s — %d raw bytes received",
|
||||
"5A probe TIMED OUT for key=%s — "
|
||||
"%d raw bytes received (no complete A5 frame assembled)",
|
||||
key4.hex(), self._parser.bytes_fed,
|
||||
)
|
||||
raise
|
||||
|
||||
frames_data.append(rsp)
|
||||
log.debug("5A A5[0] (probe) page_key=0x%04X %d bytes",
|
||||
rsp.page_key, len(rsp.data))
|
||||
|
||||
# ── Step 2: parse STRT end_offset from probe response ────────────────
|
||||
end_offset = parse_strt_end_offset(rsp.data)
|
||||
if end_offset is None:
|
||||
log.warning(
|
||||
"5A probe response did not contain a STRT record; "
|
||||
"cannot bound chunk loop — falling back to max_chunks=%d cap",
|
||||
max_chunks,
|
||||
)
|
||||
end_offset = 0xFFFF # impossible value → loop runs to max_chunks
|
||||
else:
|
||||
log.info(
|
||||
"5A STRT start_offset=0x%04X end_offset=0x%04X size=0x%04X",
|
||||
start_offset, end_offset, end_offset - start_offset,
|
||||
)
|
||||
|
||||
# ── Step 3: metadata pages 0x1002 + 0x1004 (event 1 only) ────────────
|
||||
# Confirmed from BW captures: BW reads these two fixed device-buffer
|
||||
# pages immediately after the probe for events at start_key[2:4]=0.
|
||||
# Continuation events skip them (BW caches across the session).
|
||||
# Their content is global compliance-setup metadata: Project, Client,
|
||||
# User Name, Seis Loc, Extended Notes.
|
||||
if is_event_1:
|
||||
for meta_counter in (0x1002, 0x1004):
|
||||
# Metadata page params have an extra trailing 0x00 byte
|
||||
# (12-byte params instead of 11) — empirical from BW captures.
|
||||
# Checksum-neutral but matches BW byte-for-byte.
|
||||
meta_params = bytes([
|
||||
0x00,
|
||||
key4[0], key4[1],
|
||||
(meta_counter >> 8) & 0xFF,
|
||||
meta_counter & 0xFF,
|
||||
0, 0, 0, 0, 0, 0, 0,
|
||||
])
|
||||
log.debug("5A metadata page counter=0x%04X", meta_counter)
|
||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, meta_params))
|
||||
self._parser.reset()
|
||||
try:
|
||||
meta_rsp = self._recv_one(
|
||||
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
|
||||
)
|
||||
except TimeoutError:
|
||||
log.warning(
|
||||
"5A metadata page 0x%04X TIMED OUT — continuing",
|
||||
meta_counter,
|
||||
)
|
||||
continue
|
||||
frames_data.append(meta_rsp)
|
||||
log.debug(
|
||||
"5A meta@0x%04X page_key=0x%04X %d bytes",
|
||||
meta_counter, meta_rsp.page_key, len(meta_rsp.data),
|
||||
)
|
||||
|
||||
# ── Step 4: sample chunk loop, bounded by end_offset ─────────────────
|
||||
# Sample chunks start at:
|
||||
# event 1: counter = 0x0600
|
||||
# event N (>0): counter = probe_counter + 0x0200
|
||||
# (probe was the first sample chunk)
|
||||
if is_event_1:
|
||||
counter = 0x0600
|
||||
else:
|
||||
counter = probe_counter + _BULK_COUNTER_STEP
|
||||
|
||||
last_chunk_counter: Optional[int] = (
|
||||
probe_counter if not is_event_1 else None
|
||||
frames_data.extend(probe_batch)
|
||||
log.debug(
|
||||
"5A probe: %d frame(s) page_keys=%s",
|
||||
len(probe_batch),
|
||||
[f"0x{f.page_key:04X}" for f in probe_batch],
|
||||
)
|
||||
chunks_fetched = 0
|
||||
|
||||
while chunks_fetched < max_chunks:
|
||||
# Stop when next chunk would straddle the event end.
|
||||
if counter + _BULK_COUNTER_STEP > end_offset:
|
||||
log.debug(
|
||||
"5A chunk loop done at counter=0x%04X (end=0x%04X); "
|
||||
"%d chunks fetched",
|
||||
counter, end_offset, chunks_fetched,
|
||||
)
|
||||
break
|
||||
# Log probe frame size for diagnostics.
|
||||
# The device always needs extra_chunks_after_metadata chunks after the
|
||||
# metadata frame before termination to prime the valid waveform footer.
|
||||
# This holds regardless of TCP frame size (1-frame vs 2-frame mode).
|
||||
_effective_extra_chunks = extra_chunks_after_metadata
|
||||
log.warning(
|
||||
"5A probe data_len=%d effective_extra_chunks=%d",
|
||||
len(probe_batch[0].data),
|
||||
_effective_extra_chunks,
|
||||
)
|
||||
|
||||
params = bulk_waveform_params(key4, counter)
|
||||
log.debug("5A chunk #%d counter=0x%04X", chunks_fetched + 1, counter)
|
||||
# ── Step 2: chunk loop ───────────────────────────────────────────────
|
||||
# Counter formula: _chunk_base + (chunk_num - 1) * 0x0400
|
||||
# where _chunk_base = max(key4[2:4], 0x0400).
|
||||
#
|
||||
# For events with key4[2:4] != 0 (e.g. key 0111245a, offset 0x245a):
|
||||
# _chunk_base = 0x245a → chunk 1=0x245a, chunk 2=0x285a, ...
|
||||
# Confirmed from 4-3-26 capture.
|
||||
#
|
||||
# For events with key4[2:4] == 0 (e.g. key 01110000):
|
||||
# _chunk_base = max(0, 0x0400) = 0x0400
|
||||
# → chunk 1=0x0400, chunk 2=0x0800, ... (= old chunk_num*0x0400)
|
||||
# CRITICAL: counter=0x0000 (same as the probe) causes the device to
|
||||
# re-return the STRT record data for chunk 1, making frame 1 look like
|
||||
# a second probe response (confirmed from server log: frame 1 len=1097,
|
||||
# contains STRT\xff\xfe, contributes zero body bytes after DLE-strip).
|
||||
# counter=0x0400 for chunk 1 confirmed working (empirical test 2026-04-06).
|
||||
_chunk_base = max(_key4_offset, _BULK_COUNTER_STEP)
|
||||
for chunk_num in range(1, max_chunks + 1):
|
||||
counter = _chunk_base + (chunk_num - 1) * _BULK_COUNTER_STEP
|
||||
params = bulk_waveform_params(key4, counter)
|
||||
log.debug("5A chunk %d counter=0x%04X", chunk_num, counter)
|
||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
||||
self._parser.reset()
|
||||
self._parser.reset() # reset bytes_fed for accurate per-chunk count
|
||||
try:
|
||||
rsp = self._recv_one(
|
||||
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
|
||||
)
|
||||
# Collect ALL frames from this chunk response.
|
||||
# Over TCP via modem, a single large A5 device response (~1100 bytes
|
||||
# RS-232) is split across ~2 TCP segments, each parsed as its own
|
||||
# complete S3 frame. _recv_5a_batch gathers all of them so that
|
||||
# every subsequent chunk request is paired with the correct response.
|
||||
batch = self._recv_5a_batch(rsp_sub, first_timeout=10.0)
|
||||
except TimeoutError:
|
||||
raw = self._parser.bytes_fed
|
||||
log.warning(
|
||||
"5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
|
||||
chunks_fetched + 1, counter, raw,
|
||||
chunk_num, counter, raw,
|
||||
)
|
||||
if raw > 0 and frames_data:
|
||||
# Device sent some bytes (likely a bare DLE/ETX end-of-stream
|
||||
# signal) but never completed a full frame. Treat as graceful
|
||||
# stream end and fall through to the termination step.
|
||||
log.warning(
|
||||
"5A unexpected end-of-stream — proceeding to TERM",
|
||||
"5A end-of-stream detected at chunk=%d (raw_bytes=%d, "
|
||||
"frames_collected=%d) — proceeding to termination",
|
||||
chunk_num, raw, len(frames_data),
|
||||
)
|
||||
break
|
||||
raise
|
||||
|
||||
log.debug(
|
||||
"5A RX chunk=%d page_key=0x%04X data_len=%d",
|
||||
chunks_fetched + 1, rsp.page_key, len(rsp.data),
|
||||
)
|
||||
|
||||
if rsp.page_key == 0x0000:
|
||||
# Device terminated mid-stream unexpectedly.
|
||||
# Process all frames from this batch.
|
||||
metadata_found = False
|
||||
for rsp in batch:
|
||||
log.warning(
|
||||
"5A unexpected page_key=0x0000 mid-stream at counter=0x%04X",
|
||||
counter,
|
||||
"5A RX chunk=%d page_key=0x%04X data_len=%d contains_Project=%s",
|
||||
chunk_num, rsp.page_key, len(rsp.data), b"Project:" in rsp.data,
|
||||
)
|
||||
if include_terminator:
|
||||
frames_data.append(rsp)
|
||||
return frames_data
|
||||
if rsp.page_key == 0x0000:
|
||||
# Device unexpectedly terminated mid-stream.
|
||||
log.debug("5A page_key=0x0000 — device terminated early")
|
||||
if include_terminator:
|
||||
frames_data.append(rsp)
|
||||
return frames_data
|
||||
frames_data.append(rsp)
|
||||
if stop_after_metadata and b"Project:" in rsp.data:
|
||||
metadata_found = True
|
||||
|
||||
frames_data.append(rsp)
|
||||
last_chunk_counter = counter
|
||||
counter += _BULK_COUNTER_STEP
|
||||
chunks_fetched += 1
|
||||
if metadata_found:
|
||||
# Download extra_chunks_after_metadata more chunks after metadata.
|
||||
# This primes the device to return the valid waveform footer in the
|
||||
# termination response — without it the terminator carries too few bytes
|
||||
# (confirmed 2026-04-23). The extra chunk data also belongs in the
|
||||
# file body (confirmed from TCP capture analysis 2026-04-27).
|
||||
log.debug("5A metadata found — fetching %d more chunk(s)",
|
||||
_effective_extra_chunks)
|
||||
for _extra_n in range(_effective_extra_chunks):
|
||||
chunk_num += 1
|
||||
counter = _chunk_base + (chunk_num - 1) * _BULK_COUNTER_STEP
|
||||
params = bulk_waveform_params(key4, counter)
|
||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
||||
try:
|
||||
extra_batch = self._recv_5a_batch(rsp_sub, first_timeout=10.0)
|
||||
for ef in extra_batch:
|
||||
log.debug(
|
||||
"5A extra chunk page_key=0x%04X data_len=%d",
|
||||
ef.page_key, len(ef.data),
|
||||
)
|
||||
if ef.page_key == 0x0000:
|
||||
if include_terminator:
|
||||
frames_data.append(ef)
|
||||
return frames_data
|
||||
frames_data.append(ef)
|
||||
except TimeoutError:
|
||||
log.debug("5A extra chunk %d timed out — end of stream", _extra_n + 1)
|
||||
break
|
||||
break
|
||||
else:
|
||||
log.warning(
|
||||
"5A reached max_chunks=%d at counter=0x%04X (end=0x%04X)",
|
||||
max_chunks, counter, end_offset,
|
||||
"5A reached max_chunks=%d without end-of-stream; sending termination",
|
||||
max_chunks,
|
||||
)
|
||||
|
||||
# ── Step 5: TERM with proper end_offset-derived formula ──────────────
|
||||
if last_chunk_counter is None or end_offset == 0xFFFF:
|
||||
# No STRT or no chunks fetched — fall back to legacy TERM.
|
||||
log.warning(
|
||||
"5A using legacy TERM (offset_word=0x005A); "
|
||||
"end_offset unavailable or no chunks fetched",
|
||||
)
|
||||
legacy_counter = (last_chunk_counter or probe_counter) + _BULK_COUNTER_STEP
|
||||
term_offset_word = _BULK_TERM_OFFSET # 0x005A
|
||||
term_params = bulk_waveform_term_params(key4, legacy_counter)
|
||||
else:
|
||||
term_offset_word, term_params = bulk_waveform_term_v2(
|
||||
key4, end_offset, last_chunk_counter,
|
||||
)
|
||||
log.debug(
|
||||
"5A TERM offset_word=0x%04X params[2:4]=%s end=0x%04X "
|
||||
"last_chunk=0x%04X",
|
||||
term_offset_word, term_params[2:4].hex(),
|
||||
end_offset, last_chunk_counter,
|
||||
)
|
||||
|
||||
self._send(build_5a_frame(term_offset_word, term_params))
|
||||
# ── Step 3: termination ──────────────────────────────────────────────
|
||||
term_counter = counter + _BULK_COUNTER_STEP
|
||||
term_params = bulk_waveform_term_params(key4, term_counter)
|
||||
log.debug(
|
||||
"5A termination term_counter=0x%04X offset=0x%04X",
|
||||
term_counter, _BULK_TERM_OFFSET,
|
||||
)
|
||||
self._send(build_5a_frame(_BULK_TERM_OFFSET, term_params))
|
||||
try:
|
||||
term_rsp = self._recv_one(expected_sub=rsp_sub, timeout=10.0)
|
||||
log.info(
|
||||
"5A TERM response page_key=0x%04X %d bytes",
|
||||
term_rsp = self._recv_one(expected_sub=rsp_sub)
|
||||
log.debug(
|
||||
"5A termination response page_key=0x%04X %d bytes",
|
||||
term_rsp.page_key, len(term_rsp.data),
|
||||
)
|
||||
if include_terminator:
|
||||
frames_data.append(term_rsp)
|
||||
except TimeoutError:
|
||||
log.warning("5A no TERM response (timeout)")
|
||||
log.debug("5A no termination response — device may have already closed")
|
||||
|
||||
return frames_data
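The counter formula quoted in the chunk-loop comments (`_chunk_base + (chunk_num - 1) * 0x0400`, with `_chunk_base = max(key4[2:4], 0x0400)`) can be checked in isolation. This is a minimal sketch, assuming the key offset is the big-endian uint16 in `key4[2:4]` and that the step is 0x0400 as the comment states. `chunk_counters` and `COUNTER_STEP` are hypothetical names, not part of the module.

```python
from typing import List

COUNTER_STEP = 0x0400  # assumed from the comment; the module constant is _BULK_COUNTER_STEP


def chunk_counters(key4: bytes, n_chunks: int, step: int = COUNTER_STEP) -> List[int]:
    """Return the first n_chunks counters for a 4-byte event key (hypothetical helper)."""
    key_offset = int.from_bytes(key4[2:4], "big")
    # Counter 0x0000 is the probe request, so the base is never below one step.
    chunk_base = max(key_offset, step)
    return [chunk_base + (n - 1) * step for n in range(1, n_chunks + 1)]


# The two cases named in the comments: a continuation key and a zero-offset key.
print([f"0x{c:04X}" for c in chunk_counters(bytes.fromhex("0111245a"), 2)])
print([f"0x{c:04X}" for c in chunk_counters(bytes.fromhex("01110000"), 2)])
```

The two calls reproduce the sequences listed in the comments (0x245a, 0x285a and 0x0400, 0x0800).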
|
||||
|
||||
@@ -937,7 +882,7 @@ class MiniMateProtocol:
|
||||
continue
|
||||
|
||||
chunk = data_rsp.data[11:]
|
||||
log.debug(
|
||||
log.warning(
|
||||
"read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
|
||||
step_name, data_rsp.page_key, len(data_rsp.data),
|
||||
len(chunk), len(config) + len(chunk),
|
||||
@@ -957,18 +902,17 @@ class MiniMateProtocol:
|
||||
except TimeoutError:
|
||||
pass
|
||||
|
||||
log.info(
|
||||
log.warning(
|
||||
"read_compliance_config: done — %d cfg bytes total",
|
||||
len(config),
|
||||
)
|
||||
|
||||
# Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
|
||||
if log.isEnabledFor(logging.DEBUG):
|
||||
for row in range(0, min(len(config), 128), 16):
|
||||
row_bytes = bytes(config[row:row + 16])
|
||||
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
|
||||
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
|
||||
log.debug(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
|
||||
# Hex dump first 128 bytes for field mapping
|
||||
for row in range(0, min(len(config), 128), 16):
|
||||
row_bytes = bytes(config[row:row + 16])
|
||||
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
|
||||
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
|
||||
log.warning(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
|
||||
|
||||
return bytes(config)
|
||||
|
||||
@@ -1459,6 +1403,63 @@ class MiniMateProtocol:
|
||||
log.debug("TX %d bytes: %s", len(frame), frame.hex())
|
||||
self._transport.write(frame)
|
||||
|
||||
def _recv_5a_batch(
|
||||
self,
|
||||
expected_sub: int,
|
||||
first_timeout: float = 10.0,
|
||||
batch_timeout: float = 0.5,
|
||||
) -> list[S3Frame]:
|
||||
"""
|
||||
Collect all S3 frames that arrive as part of one device response.
|
||||
|
||||
Over TCP via cellular modem, a single device A5 response (~1100 bytes of
|
||||
RS-232 data) is forwarded in multiple TCP segments due to the modem's
|
||||
data-forwarding timeout (~100-150 ms per segment). Each TCP segment
|
||||
contains a complete, valid S3 frame (~550 bytes). Calling _recv_one()
|
||||
once returns only the first segment's frame and misses the rest, causing
|
||||
the chunk request/response pairing to cascade out of alignment.
|
||||
|
||||
This helper collects ALL frames before returning, by trying additional
|
||||
short-timeout receives after the first frame arrives.
|
||||
|
||||
The caller must call self._parser.reset() before this method to ensure
|
||||
bytes_fed is accurate; this method always uses reset_parser=False.
|
||||
|
||||
Args:
|
||||
expected_sub: Expected SUB byte for validation.
|
||||
first_timeout: Timeout for the mandatory first frame. Should be
|
||||
generous (default 10 s) since the device may be slow.
|
||||
batch_timeout: Short timeout for subsequent frames. Default 0.5 s
|
||||
— comfortably longer than the modem forwarding gap
|
||||
(~150 ms) but short enough to avoid stalling when
|
||||
only one frame is expected (probe, terminator).
|
||||
|
||||
Returns:
|
||||
List of S3Frame objects in arrival order (at least one).
|
||||
|
||||
Raises:
|
||||
TimeoutError: If no frame arrives within first_timeout.
|
||||
UnexpectedResponse: If any frame has the wrong SUB byte.
|
||||
"""
|
||||
frames: list[S3Frame] = []
|
||||
first = self._recv_one(
|
||||
expected_sub=expected_sub,
|
||||
reset_parser=False,
|
||||
timeout=first_timeout,
|
||||
)
|
||||
frames.append(first)
|
||||
while True:
|
||||
try:
|
||||
extra = self._recv_one(
|
||||
expected_sub=expected_sub,
|
||||
reset_parser=False,
|
||||
timeout=batch_timeout,
|
||||
)
|
||||
frames.append(extra)
|
||||
except TimeoutError:
|
||||
break
|
||||
return frames
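The collection pattern used by `_recv_5a_batch` (one long wait for the mandatory first frame, then short waits until the line goes quiet) can be exercised without a device. The sketch below is a stand-alone illustration under stated assumptions: `collect_batch` and `FakeLink` are hypothetical names, and frames are plain `bytes` rather than `S3Frame` objects.

```python
from collections import deque
from typing import Callable, Deque, List


def collect_batch(
    recv_one: Callable[[float], bytes],
    first_timeout: float = 10.0,
    batch_timeout: float = 0.5,
) -> List[bytes]:
    """Gather every frame of one response (hypothetical stand-alone version)."""
    # The first frame is mandatory: a TimeoutError here propagates to the caller.
    frames = [recv_one(first_timeout)]
    while True:
        try:
            frames.append(recv_one(batch_timeout))
        except TimeoutError:
            # Silence for batch_timeout is taken to mean the response is complete.
            return frames


class FakeLink:
    """Stand-in for the transport: hands out queued frames, then times out."""

    def __init__(self, frames: List[bytes]) -> None:
        self._queue: Deque[bytes] = deque(frames)

    def recv_one(self, timeout: float) -> bytes:
        if not self._queue:
            raise TimeoutError(f"no frame within {timeout} s")
        return self._queue.popleft()


link = FakeLink([b"segment-1", b"segment-2"])
batch = collect_batch(link.recv_one)
print(len(batch), "frame(s) collected")
```

The cost of this design is that every response pays at least one `batch_timeout` of idle time, which is why the docstring keeps that value short.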
|
||||
|
||||
def _recv_one(
|
||||
self,
|
||||
expected_sub: Optional[int] = None,
|
||||
|
||||
@@ -454,102 +454,3 @@ class SocketTransport(TcpTransport):
|
||||
|
||||
def __repr__(self) -> str:
|
||||
return f"SocketTransport(peer={self.host!r})"
|
||||
|
||||
|
||||
# ── Capturing transport (MITM-style raw byte mirror) ──────────────────────────
|
||||
|
||||
class CapturingTransport(BaseTransport):
|
||||
"""
|
||||
Wraps another BaseTransport and mirrors every byte to two raw capture files:
|
||||
|
||||
raw_bw_<...>.bin — bytes WE wrote to the device (BW-side TX)
|
||||
raw_s3_<...>.bin — bytes the device wrote back (S3-side TX)
|
||||
|
||||
The file naming and on-wire byte layout are identical to the captures
|
||||
produced by `bridges/ach_mitm.py`, so the resulting `.bin` files can be
|
||||
loaded directly by the Analyzer (File > Open Capture) and parsed by the
|
||||
same tooling used for genuine Blastware MITM captures.
|
||||
|
||||
All BaseTransport methods are forwarded to the inner transport; the only
|
||||
side-effect is that successful read/write byte streams are appended to the
|
||||
two open binary files.
|
||||
|
||||
Args:
|
||||
inner: An already-built BaseTransport (SerialTransport / TcpTransport).
|
||||
bw_path: File path for the "BW TX" stream (bytes we send). Opened "wb".
|
||||
s3_path: File path for the "S3 TX" stream (bytes the device sends).
|
||||
Opened "wb".
|
||||
|
||||
Example:
|
||||
with CapturingTransport(TcpTransport("1.2.3.4", 9034),
|
||||
"raw_bw.bin", "raw_s3.bin") as t:
|
||||
client = MiniMateClient(transport=t)
|
||||
client.connect()
|
||||
client.get_events()
|
||||
# both .bin files now hold the full bidirectional capture.
|
||||
"""
|
||||
|
||||
def __init__(self, inner: BaseTransport, bw_path: str, s3_path: str) -> None:
|
||||
self._inner = inner
|
||||
self._bw_path = bw_path
|
||||
self._s3_path = s3_path
|
||||
self._bw_fh = None
|
||||
self._s3_fh = None
|
||||
# Forward inner attrs so callers can introspect (e.g. .host, .port).
|
||||
self.host = getattr(inner, "host", None)
|
||||
self.port = getattr(inner, "port", None)
|
||||
|
||||
# ── BaseTransport interface ───────────────────────────────────────────────
|
||||
|
||||
def connect(self) -> None:
|
||||
if self._bw_fh is None:
|
||||
self._bw_fh = open(self._bw_path, "wb", buffering=0)
|
||||
if self._s3_fh is None:
|
||||
self._s3_fh = open(self._s3_path, "wb", buffering=0)
|
||||
self._inner.connect()
|
||||
|
||||
def disconnect(self) -> None:
|
||||
try:
|
||||
self._inner.disconnect()
|
||||
finally:
|
||||
for fh_attr in ("_bw_fh", "_s3_fh"):
|
||||
fh = getattr(self, fh_attr)
|
||||
if fh is not None:
|
||||
try:
|
||||
fh.flush()
|
||||
fh.close()
|
||||
except Exception:
|
||||
pass
|
||||
setattr(self, fh_attr, None)
|
||||
|
||||
@property
|
||||
def is_connected(self) -> bool:
|
||||
return self._inner.is_connected
|
||||
|
||||
def write(self, data: bytes) -> None:
|
||||
self._inner.write(data)
|
||||
if data and self._bw_fh is not None:
|
||||
try:
|
||||
self._bw_fh.write(data)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def read(self, n: int) -> bytes:
|
||||
got = self._inner.read(n)
|
||||
if got and self._s3_fh is not None:
|
||||
try:
|
||||
self._s3_fh.write(got)
|
||||
except Exception:
|
||||
pass
|
||||
return got
|
||||
|
||||
@property
|
||||
def bw_path(self) -> str:
|
||||
return self._bw_path
|
||||
|
||||
@property
|
||||
def s3_path(self) -> str:
|
||||
return self._s3_path
|
||||
|
||||
def __repr__(self) -> str:
|
||||
return f"CapturingTransport({self._inner!r}, bw={self._bw_path!r}, s3={self._s3_path!r})"
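The mirroring idea behind `CapturingTransport` can be tried with in-memory sinks instead of files. This is a minimal sketch, not the class above: `MirrorTransport` and `LoopbackTransport` are hypothetical names, and only `read`/`write` are modelled (no connect/disconnect handling, no error suppression).

```python
import io


class LoopbackTransport:
    """Hypothetical inner transport: read() returns canned device bytes."""

    def __init__(self, reply: bytes) -> None:
        self._reply = io.BytesIO(reply)
        self.sent = bytearray()

    def write(self, data: bytes) -> None:
        self.sent += data

    def read(self, n: int) -> bytes:
        return self._reply.read(n)


class MirrorTransport:
    """Forward read/write to an inner transport and copy each stream to a sink."""

    def __init__(self, inner, tx_sink, rx_sink) -> None:
        self._inner = inner
        self._tx_sink = tx_sink
        self._rx_sink = rx_sink

    def write(self, data: bytes) -> None:
        self._inner.write(data)  # forward first, then mirror what was sent
        if data:
            self._tx_sink.write(data)

    def read(self, n: int) -> bytes:
        got = self._inner.read(n)
        if got:  # empty reads leave the capture untouched
            self._rx_sink.write(got)
        return got


tx_log, rx_log = io.BytesIO(), io.BytesIO()
t = MirrorTransport(LoopbackTransport(b"reply-bytes"), tx_log, rx_log)
t.write(b"request")
t.read(4)
t.read(64)
print(tx_log.getvalue(), rx_log.getvalue())
```

Because the sinks only ever receive bytes that the inner transport actually accepted or returned, the two captures together reconstruct the conversation regardless of how reads were split.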
|
||||
|
||||
@@ -1,578 +0,0 @@
|
||||
"""
|
||||
waveform_codec.py — block-walker and verified decoder for the MiniMate Plus
|
||||
waveform-file body.
|
||||
|
||||
FULLY DECODED 2026-05-11. Every block type, every channel, and the
|
||||
channel-rotation rule are verified byte-exact against BW's ASCII export
|
||||
across the 9-event fixture bundle (47,364 ADC samples, zero errors).
|
||||
|
||||
The Blastware waveform-file body — the bytes between the 21-byte STRT
|
||||
record and the 26-byte file footer — is a tagged variable-length block
|
||||
stream with a custom delta + RLE codec. (Not raw int16 LE, which was
|
||||
the historical wrong assumption that produced ±32K noise on every event.)
|
||||
|
||||
Current status:
|
||||
|
||||
- Block framing: ✅ solved (5 block types and lengths all confirmed)
|
||||
- Per-channel decode: ✅ solved (Tran / Vert / Long / MicL all byte-exact)
|
||||
- Channel rotation: ✅ Tran → Vert → Long → MicL per segment
|
||||
- Segment header: ✅ fully decoded (anchor pair + prev-channel extension)
|
||||
- 30 NN packed-delta block: ✅ NN × 12-bit signed deltas in NN/4 groups
|
||||
- MicL → dB(L) conversion: ✅ ``mic_count_to_db`` matches BW display
|
||||
- Production wiring: ✅ ``client.py:_decode_a5_waveform`` uses the new
|
||||
codec (via ``decode_a5_frames``). ``.h5`` sidecars now render
|
||||
correctly.
|
||||
|
||||
Known limitations:
|
||||
|
||||
- Walker stops early on the loudest events (SP0, SS0, SV0, event-b) at
|
||||
some mid-segment edge cases not yet fully characterized. Every
|
||||
sample reached IS correct; the walker just doesn't reach all of
|
||||
them yet. The cleanly-decoded subset is still ~5000–15000 samples
|
||||
per loud event.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Body layout (CONFIRMED 2026-05-11 against 8 fixture events)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
[7-byte preamble] [stream of tagged blocks] [trailer]
|
||||
|
||||
The preamble is always exactly 7 bytes:
|
||||
|
||||
body[0:3] = 00 02 00 magic
|
||||
body[3:5] = Tran[0] int16 BE in 16-count units (LSB = 0.005 in/s)
|
||||
body[5:7] = Tran[1] int16 BE in 16-count units
|
||||
|
||||
(Earlier drafts of this module described a "7-or-9-byte preamble";
|
||||
that was wrong — single-shot and continuous events both use 7 bytes.
|
||||
The "extra 2 bytes" on continuous events were the first ``00 NN`` RLE
|
||||
marker, not part of the preamble.)
|
||||
|
||||
Block types and lengths (all confirmed):
|
||||
|
||||
| Tag | Length | Meaning |
|
||||
|----------|-----------------------|----------------------------------------|
|
||||
| ``10 NN``| NN/2 + 2 bytes | 4-bit nibble deltas (2 per byte; high |
|
||||
| | | nibble first; signed 0..7 / 8..F = -8..-1)|
|
||||
| ``20 NN``| NN + 2 bytes | int8 signed deltas (1 per byte) |
|
||||
| ``00 NN``| 2 bytes | RLE: append NN copies of current value |
|
||||
| ``30 NN``| NN*3/2+2 data, NN*4 | 12-bit signed deltas; loud events only. |
|
||||
| | in trailer | |
|
||||
| ``40 02``| 20 bytes (fixed) | Segment header |
|
||||
|
||||
NN is always a multiple of 4.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Tran channel, segment 0 (CONFIRMED 2026-05-11)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
Segment 0 — everything before the first ``40 02`` segment header — encodes
|
||||
Tran samples only. Starting from preamble anchors Tran[0] and Tran[1],
|
||||
each subsequent block contributes to the running Tran value:
|
||||
|
||||
10 NN → append NN deltas (4-bit signed nibbles)
|
||||
20 NN → append NN deltas (int8 signed bytes)
|
||||
00 NN → append NN copies of the current value (RLE zeros)
|
||||
40 02 → segment 0 ends; multi-segment continuation is open
|
||||
|
||||
This decodes the first 482–510 samples of Tran for each event with zero
|
||||
errors against BW's ASCII export. The exact segment-0 sample count
|
||||
varies per event (it's bounded by a fixed device-flash byte budget, not
|
||||
a fixed sample count — quiet events fit more samples because zero
|
||||
deltas pack into ``00 NN`` markers compactly).
|
||||
|
||||
Implementation: :func:`decode_tran_initial`.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Segment header (40 02, 20 bytes total)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
The 18-byte payload of the ``40 02`` block:
|
||||
|
||||
| Offset | Field | Status |
|
||||
|-----------|---------------------------------------------|-------------|
|
||||
| [0:2] | T_delta at first sample of new segment | ✅ confirmed|
|
||||
| | (int16 BE, in 16-count units) | |
|
||||
| [2:4] | Likely T_delta at sample seg_start+1 | 🟡 likely |
|
||||
| [4:6] | Unknown (varies; possibly checksum) | ❓ open |
|
||||
| [6:8] | Byte length to next segment header − 2 | ✅ confirmed|
|
||||
| | (uint16 BE; useful for walker pre-scan) | |
|
||||
| [8:12] | Monotonic uint32 LE counter | ✅ confirmed|
|
||||
| | (starts ~0x47, increments by 1 per segment) | |
|
||||
| [12:14] | Constant ``02 00`` | ✅ confirmed|
|
||||
| [14:18] | Unknown 4-byte field | ❓ open |
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
What breaks the multi-segment decoder (the main open question)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
After segment 0 ends and the segment header T_delta is consumed,
|
||||
applying segment 1's blocks as Tran continuation produces values that
|
||||
diverge from truth by sample ~512. The block structure inside segment
|
||||
1 is IDENTICAL to segment 0 (same alternating 10 NN / 00 NN pattern),
|
||||
and the delta budget matches the segment size exactly (V70 segment 1
|
||||
has 264 nibble-deltas + 244 RLE zeros = 508 = the segment's sample
|
||||
count). But the cumulative is wrong.
|
||||
|
||||
The strongest unverified hypothesis is that segments rotate channels:
|
||||
|
||||
segment 0 → Tran samples 0..509
|
||||
segment 1 → Vert samples 0..507
|
||||
segment 2 → Long samples 0..507
|
||||
segment 3 → Mic samples 0..507
|
||||
segment 4 → Tran samples 510..N (continuation)
|
||||
...
|
||||
|
||||
This is consistent with the segment-1 block sums netting to near zero in
|
||||
V70 (where all 4 channels are near zero) and with the per-segment delta
|
||||
budget matching the segment size for a single channel. It is NOT yet
|
||||
verified because the per-segment channel anchor isn't pinned down in
|
||||
the segment header — bytes [4:6] and [14:18] of the header are still
|
||||
open and probably encode V/L/M anchors.
|
||||
|
||||
See ``docs/waveform_codec_re_status.md`` for the current working notes
|
||||
and the suggested next experiment ("segment-channel scoring analyzer").
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import math
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Optional, Tuple
|
||||
|
||||
|
||||
@dataclass
|
||||
class WaveformBlock:
|
||||
"""One tagged block parsed out of a Blastware waveform-file body."""
|
||||
offset: int # byte offset into body
|
||||
tag_hi: int # first tag byte (0x10 / 0x20 / 0x00 / 0x30 / 0x40)
|
||||
tag_lo: int # second tag byte (NN)
|
||||
data: bytes # block payload (excludes the 2-byte tag)
|
||||
length: int # total block length on the wire (includes the tag)
|
||||
|
||||
@property
|
||||
def kind(self) -> str:
|
||||
return f"{self.tag_hi:02x} {self.tag_lo:02x}"
|
||||
|
||||
|
||||
def find_data_start(body: bytes) -> int:
|
||||
"""Auto-detect the offset of the first data block.
|
||||
|
||||
The body starts with a 7-byte preamble (magic ``00 02 00`` + two int16 BE
|
||||
Tran anchors). After that, the data section starts with a tag — usually
|
||||
``10 NN`` or ``20 NN``, but quiet events may begin with a ``00 NN`` RLE
|
||||
marker. We return the offset of the first recognized tag.
|
||||
"""
|
||||
# Try fixed offset 7 first (canonical preamble length).
|
||||
if len(body) >= 9:
|
||||
b, nn = body[7], body[8]
|
||||
if (b in (0x00, 0x10, 0x20, 0x30) and nn % 4 == 0 and 0 < nn <= 0xFC) \
|
||||
or (b == 0x40 and nn == 0x02):
|
||||
return 7
|
||||
# Fall back to scanning the first 20 bytes.
|
||||
for i in range(min(20, len(body) - 1)):
|
||||
b = body[i]
|
||||
nn = body[i + 1]
|
||||
if b in (0x10, 0x20) and nn % 4 == 0 and 0 < nn <= 0xFC:
|
||||
return i
|
||||
return -1
|
||||
|
||||
|
||||
def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
|
||||
"""Walk the tagged-block sequence starting at *start* (auto-detected by default).
|
||||
|
||||
Stops when an unrecognized tag is encountered or end of body is reached.
|
||||
Returned blocks are in stream order.
|
||||
"""
|
||||
if start is None:
|
||||
start = find_data_start(body)
|
||||
if start < 0:
|
||||
return []
|
||||
|
||||
blocks: List[WaveformBlock] = []
|
||||
i = start
|
||||
while i + 1 < len(body):
|
||||
t0 = body[i]
|
||||
t1 = body[i + 1]
|
||||
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 // 2 + 2
|
||||
elif (t0 & 0xF0) == 0x10 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
|
||||
# Wide-NN nibble block: ``1X NN`` where X is the high nibble of a
|
||||
# 12-bit NN value. NN = ((t0 & 0x0F) << 8) | t1. Block length
|
||||
# = NN/2 + 2 bytes (NN nibble deltas, same as ``10 NN`` semantics
|
||||
# but with NN > 0xFC). Confirmed 2026-05-11 in SP0 segment 12
|
||||
# where V continuation uses ``11 90`` = NN=0x190=400.
|
||||
wide_nn = ((t0 & 0x0F) << 8) | t1
|
||||
length = wide_nn // 2 + 2
|
||||
elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||
length = t1 + 2
|
||||
elif (t0 & 0xF0) == 0x20 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
|
||||
# Wide-NN int8 block: ``2X NN`` extends NN to 12 bits the same way.
|
||||
wide_nn = ((t0 & 0x0F) << 8) | t1
|
||||
length = wide_nn + 2
|
||||
elif t0 == 0x00 and t1 % 4 == 0:
|
||||
length = 2
|
||||
elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
|
||||
# Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
|
||||
# as NN/4 groups of (2-byte high-nibble field + 4 × int8 low byte).
|
||||
# Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
|
||||
# 20 for NN=12, etc.). Confirmed 2026-05-11 by full-decoder
|
||||
# verification against BW ASCII export.
|
||||
#
|
||||
# Trailer-section ``30 NN`` blocks have a different length formula
|
||||
# (NN × 4 = 32 for NN=8 in trailers). We try the data-section
|
||||
# length first and fall back to the trailer length if needed.
|
||||
cand_data = t1 * 3 // 2 + 2
|
||||
cand_trailer = t1 * 4
|
||||
if (i + cand_data < len(body) - 1
|
||||
and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
|
||||
length = cand_data
|
||||
else:
|
||||
length = cand_trailer
|
||||
elif t0 == 0x40 and t1 == 0x02:
|
||||
length = 20
|
||||
else:
|
||||
# Unknown tag; stop. Caller can inspect ``i`` to see where.
|
||||
break
|
||||
|
||||
if i + length > len(body):
|
||||
break
|
||||
|
||||
data = bytes(body[i + 2 : i + length])
|
||||
blocks.append(WaveformBlock(offset=i, tag_hi=t0, tag_lo=t1, data=data, length=length))
|
||||
i += length
|
||||
|
||||
return blocks
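The length rules applied by `walk_body` can be summarized in one small function. This is a sketch under assumptions: `block_length` is a hypothetical helper, it skips the walker's validity checks (NN multiple of 4, upper bounds), and it takes the data/trailer distinction for `30 NN` as an explicit flag, whereas the walker infers it by peeking at the next tag.

```python
def block_length(t0: int, t1: int, in_trailer: bool = False) -> int:
    """Total on-wire length (tag included) of one block; hypothetical helper."""
    hi = t0 & 0xF0
    nn = ((t0 & 0x0F) << 8) | t1  # 12-bit NN; equals t1 for plain 10/20 tags
    if hi == 0x10:
        return nn // 2 + 2  # two 4-bit deltas per payload byte
    if hi == 0x20:
        return nn + 2  # one int8 delta per payload byte
    if t0 == 0x00:
        return 2  # RLE marker carries no payload
    if t0 == 0x30:
        # Data section: NN/4 groups of 6 bytes; trailer section: NN * 4 bytes.
        return t1 * 4 if in_trailer else t1 * 3 // 2 + 2
    if (t0, t1) == (0x40, 0x02):
        return 20  # fixed-size segment header
    raise ValueError(f"unrecognized tag {t0:02x} {t1:02x}")


# The wide-NN case from the comments: tag 11 90 means NN = 0x190 = 400 nibbles.
print(block_length(0x11, 0x90))
print([block_length(0x30, nn) for nn in (4, 8, 12)])
```

The second line prints the data-section `30 NN` lengths for NN = 4, 8, 12, which should match the 8 / 14 / 20 values quoted in the walker comments.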
|
||||
|
||||
|
||||
def split_segments(blocks: List[WaveformBlock]) -> List[List[WaveformBlock]]:
|
||||
"""Group consecutive blocks into segments separated by ``40 02`` headers.
|
||||
|
||||
The first segment is whatever runs before the first ``40 02`` header
|
||||
(typically the "segment 0" data that follows the 7-byte body preamble).
|
||||
Subsequent segments start with a ``40 02`` block, then have their
|
||||
own data blocks until the next ``40 02``.
|
||||
"""
|
||||
segments: List[List[WaveformBlock]] = []
|
||||
current: List[WaveformBlock] = []
|
||||
for b in blocks:
|
||||
if b.tag_hi == 0x40 and b.tag_lo == 0x02:
|
||||
if current:
|
||||
segments.append(current)
|
||||
current = [b]
|
||||
else:
|
||||
current.append(b)
|
||||
if current:
|
||||
segments.append(current)
|
||||
return segments
|
||||
|
||||
|
||||
def parse_segment_header(block: WaveformBlock) -> Optional[dict]:
|
||||
"""Decode the 18-byte payload of a ``40 02`` segment header.
|
||||
|
||||
Returns a dict with the labelled fields, or None if *block* is not
|
||||
a ``40 02`` header.
|
||||
"""
|
||||
if not (block.tag_hi == 0x40 and block.tag_lo == 0x02):
|
||||
return None
|
||||
if len(block.data) < 18:
|
||||
return None
|
||||
p = block.data
|
||||
counter = int.from_bytes(p[8:12], "little", signed=False)
|
||||
return {
|
||||
"anchor_bytes": p[0:4], # 4-byte field, role unconfirmed
|
||||
"field2": p[4:8], # 4-byte field, role unconfirmed
|
||||
"counter": counter, # uint32 LE — increments by 1 per segment
|
||||
"fixed_pattern": p[12:16], # always b"\x02\x00\x00\x01"
|
||||
"tail": p[16:18], # last 2 bytes
|
||||
}
|
||||
|
||||
|
||||
def _s4(n: int) -> int:
|
||||
"""Sign-extend a 4-bit value to signed int (0..7 → 0..7; 8..F → -8..-1)."""
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def _i8(b: int) -> int:
|
||||
"""Reinterpret an unsigned byte as signed int8."""
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def decode_tran_initial(body: bytes) -> Optional[List[int]]:
|
||||
"""
|
||||
Decode the initial Tran-channel samples — VERIFIED 2026-05-11.
|
||||
|
||||
Returns Tran samples in **16-count units** (LSB = 0.005 in/s at Normal
|
||||
range — the same quantization BW uses for its ASCII export). Returns
|
||||
``None`` if the body cannot be parsed.
|
||||
|
||||
The decoded list extends from sample 0 through the end of segment 0
|
||||
(= just before the first ``40 02`` segment header; ~510 sample-sets
|
||||
for the events tested). Multi-segment decoding requires continuing
|
||||
past the segment header — that's done by :func:`decode_tran_full`
|
||||
once the per-segment rules are pinned down for all signal types.
|
||||
|
||||
Codec for segment 0 (CONFIRMED 2026-05-11 against 7 fixture events):
|
||||
|
||||
- Body bytes [0:3] are the magic ``00 02 00``.
|
||||
- Body bytes [3:5] = ``Tran[0]`` as int16 BE in 16-count units.
|
||||
- Body bytes [5:7] = ``Tran[1]`` as int16 BE in 16-count units.
|
||||
- Data blocks (``10 NN`` or ``20 NN``) carry Tran deltas starting
|
||||
at sample 2:
|
||||
|
||||
* ``10 NN``: NN nibbles = NN/2 bytes; each nibble is a 4-bit
|
||||
signed delta (0..7 → 0..+7; 8..F → -8..-1). High nibble of
|
||||
each byte comes first.
|
||||
* ``20 NN``: NN int8 signed deltas (one delta per byte).
|
||||
|
||||
- ``00 NN`` blocks are run-length-encoded zero deltas: append NN
|
||||
copies of the current cumulative Tran value (no change).
|
||||
|
||||
- ``30 NN`` blocks have not yet been decoded for content — they
|
||||
appear in segment 0 of loud-from-start events (SS0, SV0) and
|
||||
seem to signal a transition or special-case interpretation.
|
||||
The walker steps over them but their data is ignored.
|
||||
|
||||
The walk stops at the first ``40 02`` segment header.
|
||||
"""
|
||||
if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
|
||||
return None
|
||||
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
|
||||
start = find_data_start(body)
|
||||
if start < 0:
|
||||
return [t0, t1]
|
||||
|
||||
out = [t0, t1]
|
||||
cur = t1
|
||||
for blk in walk_body(body, start):
|
||||
if blk.tag_hi == 0x40:
|
||||
# Segment boundary — stop. Multi-segment decode is decode_tran_full.
|
||||
break
|
||||
if blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += _s4(nib)
|
||||
out.append(cur)
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += _i8(byte)
|
||||
out.append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
# RLE zero deltas: append NN copies of current Tran value.
|
||||
for _ in range(blk.tag_lo):
|
||||
out.append(cur)
|
||||
# 30 NN: unknown content; skip.
|
||||
return out
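The segment-0 delta rules described in the docstring above can be exercised on a hand-made payload. This is a minimal sketch, not the module's code: `apply_nibble_deltas`, `apply_int8_deltas`, and the sample bytes are hypothetical and chosen only for illustration. The sign-extension rules and the high-nibble-first order are the parts taken from the source.

```python
from typing import List


def s4(n: int) -> int:
    """Sign-extend a 4-bit value (0..7 stay as-is, 8..15 map to -8..-1)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Reinterpret an unsigned byte as a signed int8."""
    return b if b < 128 else b - 256


def apply_nibble_deltas(anchor: int, payload: bytes) -> List[int]:
    """Accumulate 4-bit signed deltas, high nibble of each byte first."""
    out, cur = [], anchor
    for byte in payload:
        for nib in ((byte >> 4) & 0xF, byte & 0xF):
            cur += s4(nib)
            out.append(cur)
    return out


def apply_int8_deltas(anchor: int, payload: bytes) -> List[int]:
    """Accumulate one signed int8 delta per payload byte."""
    out, cur = [], anchor
    for byte in payload:
        cur += i8(byte)
        out.append(cur)
    return out


# Hypothetical data: anchor value 10, then two nibble bytes, two int8 bytes,
# and a zero-delta run of length 3 (the "00 NN" case with NN = 3).
nibble_samples = apply_nibble_deltas(10, bytes([0x1F, 0x80]))
int8_samples = apply_int8_deltas(nibble_samples[-1], bytes([0x05, 0xFB]))
zero_run = [int8_samples[-1]] * 3
print(nibble_samples, int8_samples, zero_run)
```

Each helper returns only the newly produced samples, so the caller decides whether the anchor itself belongs in the output, as `decode_tran_initial` does with `t0` and `t1`.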
|
||||
|
||||
|
||||
def decode_waveform_v2(body: bytes) -> Optional[dict]:
|
||||
"""
|
||||
Decode the body into per-channel sample arrays.
|
||||
|
||||
Status (2026-05-11 evening — channel-rotation hypothesis CONFIRMED):
|
||||
segments rotate channels in fixed order **Tran → Vert → Long → MicL**.
|
||||
Each channel-segment carries a 2-sample anchor pair in segment-header
|
||||
bytes [14:18] (or in the body preamble for the initial Tran segment)
|
||||
plus a stream of delta blocks for samples 2 onward.
|
||||
|
||||
Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
|
||||
with each channel's decoded samples in 16-count units (LSB = 0.005
|
||||
in/s at Normal range). Returns ``None`` if the body cannot be
|
||||
parsed.
|
||||
"""
|
||||
if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
|
||||
return None
|
||||
|
||||
channels = ["Tran", "Vert", "Long", "MicL"]
|
||||
out: dict = {ch: [] for ch in channels}
|
||||
|
||||
# Initial Tran segment: preamble anchor pair + delta blocks before first 40 02.
|
||||
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||
out["Tran"].extend([t0, t1])
|
||||
|
||||
start = find_data_start(body)
|
||||
if start < 0:
|
||||
return out
|
||||
|
||||
blocks = walk_body(body, start)
|
||||
seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||
|
||||
def apply_blocks(channel: str, anchor: int,
|
||||
block_start: int, block_end: int) -> int:
|
||||
"""Apply delta blocks [block_start, block_end) to *channel*'s sample
|
||||
list, starting from *anchor*. Returns the final cumulative value."""
|
||||
cur = anchor
|
||||
for bi in range(block_start, block_end):
|
||||
blk = blocks[bi]
|
||||
if (blk.tag_hi & 0xF0) == 0x10:
|
||||
# Both ``10 NN`` (NN ≤ 0xFC) and wide-NN ``1X NN`` (X != 0)
|
||||
# are nibble-delta streams. The walker has already used the
|
||||
# right length; here we just iterate the payload bytes.
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += _s4(nib)
|
||||
out[channel].append(cur)
|
||||
elif (blk.tag_hi & 0xF0) == 0x20:
|
||||
# ``20 NN`` and wide ``2X NN`` both carry int8 deltas.
|
||||
for byte in blk.data:
|
||||
cur += _i8(byte)
|
||||
out[channel].append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
out[channel].append(cur)
|
||||
elif blk.tag_hi == 0x30:
|
||||
# 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
|
||||
# bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
|
||||
# bytes [2:6] = 4 × int8 low bytes
|
||||
# Each delta = sign_extend_12((high_nibble << 8) | low_byte).
|
||||
# Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
|
||||
# bundled fixtures.
|
||||
n_groups = blk.tag_lo // 4
|
||||
for g in range(n_groups):
|
||||
grp = blk.data[g * 6 : (g + 1) * 6]
|
||||
if len(grp) < 6:
|
||||
break
|
||||
high_word = (grp[0] << 8) | grp[1]
|
||||
for k in range(4):
|
||||
nib = (high_word >> (12 - 4 * k)) & 0xF
|
||||
v = (nib << 8) | grp[2 + k]
|
||||
if v >= 0x800:
|
||||
v -= 0x1000
|
||||
cur += v
|
||||
out[channel].append(cur)
|
||||
# 40 02: should not occur in segment data.
|
||||
return cur
|
||||
|
||||
# Initial Tran segment: deltas from start of body up to first 40 02 (or end).
|
||||
first_seg = seg_idx[0] if seg_idx else len(blocks)
|
||||
last_tran_value = apply_blocks("Tran", t1, 0, first_seg)
|
||||
|
||||
# Subsequent segments rotate channels. Each segment header carries:
|
||||
# bytes [0:2] and [2:4] = 2 deltas extending the PREVIOUS channel
|
||||
# bytes [14:16] and [16:18] = anchor pair for THIS segment's channel
|
||||
#
|
||||
# Rotation: V, L, M, T, V, L, M, T, ... (initial Tran segment is the
|
||||
# implicit T in the cycle.)
|
||||
rotation = ["Vert", "Long", "MicL", "Tran"]
|
||||
# Track each channel's "running cumulative value" so we can apply the
|
||||
# previous-channel extension deltas at every segment boundary.
|
||||
last_value = {"Tran": last_tran_value, "Vert": None, "Long": None, "MicL": None}
|
||||
|
||||
for k, hi in enumerate(seg_idx):
|
||||
channel = rotation[k % 4]
|
||||
prev_channel = "Tran" if k == 0 else rotation[(k - 1) % 4]
|
||||
header = blocks[hi]
|
||||
if len(header.data) < 18:
|
||||
continue
|
||||
# Validate: real segment headers have bytes [12:14] = `02 00`.
|
||||
# Trailer/footer "40 02" markers contain ASCII serial bytes or other
|
||||
# non-header data there and would otherwise be mis-interpreted as
|
||||
# segment headers, adding spurious samples at the tail.
|
||||
if header.data[12:14] != b"\x02\x00":
|
||||
break
|
||||
# Extend the PREVIOUS channel by 2 more samples (deltas in bytes [0:4]).
|
||||
prev_d0 = int.from_bytes(header.data[0:2], "big", signed=True)
|
||||
prev_d1 = int.from_bytes(header.data[2:4], "big", signed=True)
|
||||
if last_value[prev_channel] is not None:
|
||||
v = last_value[prev_channel] + prev_d0
|
||||
out[prev_channel].append(v)
|
||||
v += prev_d1
|
||||
out[prev_channel].append(v)
|
||||
last_value[prev_channel] = v
|
||||
# Anchor pair for THIS segment's channel.
|
||||
c0 = int.from_bytes(header.data[14:16], "big", signed=True)
|
||||
c1 = int.from_bytes(header.data[16:18], "big", signed=True)
|
||||
out[channel].extend([c0, c1])
|
||||
# Apply delta blocks for this segment.
|
||||
next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
|
||||
last_value[channel] = apply_blocks(channel, c1, hi + 1, next_hi)
|
||||
|
||||
return out
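The ``30 NN`` packing described in the comments of `apply_blocks` (two bytes of high nibbles followed by four low bytes) can be checked in isolation. The sketch below assumes exactly that layout; `unpack_12bit_group` and the example bytes are hypothetical and are not part of the module.

```python
from typing import List


def unpack_12bit_group(grp: bytes) -> List[int]:
    """Unpack one 6-byte group into four signed 12-bit deltas.

    Layout assumed from the source comments:
      grp[0:2] = four 4-bit high nibbles, most significant first
      grp[2:6] = four low bytes, one per delta
    """
    if len(grp) != 6:
        raise ValueError("expected a 6-byte group")
    high_word = (grp[0] << 8) | grp[1]
    deltas = []
    for k in range(4):
        nib = (high_word >> (12 - 4 * k)) & 0xF
        v = (nib << 8) | grp[2 + k]
        if v >= 0x800:          # sign-extend from 12 bits
            v -= 0x1000
        deltas.append(v)
    return deltas


# Hypothetical group: high nibbles 0, F, 8, 0 and low bytes 10, FF, 00, 01.
example = bytes([0x0F, 0x80, 0x10, 0xFF, 0x00, 0x01])
deltas = unpack_12bit_group(example)
print(deltas)
```

The example covers both ends of the signed range: `0xFFF` decodes to -1 and `0x800` to -2048, the most negative 12-bit value.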
|
||||
|
||||
|
||||
# ── ADC-scale conversion helpers ────────────────────────────────────────────
|
||||
|
||||
|
||||
# Scaling factor: decode_waveform_v2 produces geo-channel samples in the BW
|
||||
# display quantization (16-count units, LSB = 0.005 in/s at Normal range).
|
||||
# The legacy consumer pipeline (sfm/event_hdf5.py) expects raw_samples in
|
||||
# 1-count ADC units (× full_scale / 32768 → physical). To plug the new
|
||||
# decoder in without rewriting consumers, multiply geo values by 16.
|
||||
#
|
||||
# Mic samples are already in raw ADC counts (decoded value 1 = 1 mic ADC count
# = 81.94 dB on the BW display). Mic values pass through unchanged.
|
||||
_GEO_DECODER_TO_ADC = 16
|
||||
|
||||
|
||||
def decoded_to_adc_counts(decoded: dict) -> dict:
    """Convert :func:`decode_waveform_v2` output to int16 ADC counts.

    Geo channels are scaled by ×16 (decoder produces 16-count units,
    consumer expects 1-count ADC). Mic is passed through as raw counts.
    """
    if not decoded:
        return {}
    return {
        "Tran": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Tran", [])],
        "Vert": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Vert", [])],
        "Long": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Long", [])],
        "MicL": list(decoded.get("MicL", [])),
    }
|
||||
|
||||
|
||||
def mic_count_to_db(count: int) -> float:
    """Convert a MicL ADC count to dB(L) for BW-display-compatible output.

    Empirical formula (confirmed 2026-05-11 against V70 fixture: count=813
    → 140.1 dB; count=±1 → ±81.94 dB; count=±24 → ±109.5 dB):

        dB = sign(count) × (81.94 + 20 × log10(|count|))    for |count| ≥ 1
        dB = 0.0                                            for count == 0

    The constant 81.94 is the level of a single mic ADC count. Since
    10^(81.94/20) ≈ 12,500, one count sits about 12,500× above the dB(L)
    reference level; the constant is almost certainly a calibration value
    from the device's mic.
    """
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))
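The calibration points quoted in the docstring can be reproduced with a few lines. This is a standalone sketch: `mic_db` restates the formula above under a different name so the snippet runs on its own, and `MIC_DB_AT_ONE_COUNT` is a label introduced here for the 81.94 constant.

```python
import math

MIC_DB_AT_ONE_COUNT = 81.94  # constant quoted in the docstring above


def mic_db(count: int) -> float:
    """Signed dB(L) value for a mic ADC count; 0 maps to 0.0 by convention."""
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (MIC_DB_AT_ONE_COUNT + 20.0 * math.log10(abs(count)))


# Spot-check the three calibration points named in the docstring.
for count in (1, 24, 813, -24):
    print(f"count={count:>4} -> {mic_db(count):8.2f} dB")

# Linear ratio implied by the constant: how far one count sits above
# the dB reference level.
ratio = 10 ** (MIC_DB_AT_ONE_COUNT / 20.0)
print(f"10^(81.94/20) = {ratio:.0f}")
```

Because the formula is logarithmic in `|count|`, doubling the count always adds about 6.02 dB, which is a quick sanity check when comparing against a BW printout.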
|
||||
|
||||
|
||||
# ── A5-frame entry point ────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def decode_a5_frames(a5_frames) -> Optional[dict]:
|
||||
"""Decode a list of A5 (BULK_WAVEFORM_STREAM) frames into per-channel
|
||||
int16 ADC samples.
|
||||
|
||||
Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
|
||||
with each channel's samples in **1-count ADC units** (the legacy
|
||||
``event.raw_samples`` convention — multiply by ``full_scale / 32768``
|
||||
to convert to physical units; for mic, use :func:`mic_count_to_db` or
|
||||
a per-count psi factor).
|
||||
|
||||
Returns ``None`` if the frames cannot be parsed.
|
||||
|
||||
This is the wired-up production entry point. It:
|
||||
1. Reconstructs the BW-binary body bytes from the A5 frames
|
||||
(``blastware_file.extract_body_bytes``).
|
||||
2. Runs the verified codec (``decode_waveform_v2``) on the body.
|
||||
3. Converts to int16 ADC counts via :func:`decoded_to_adc_counts`.
|
||||
"""
|
||||
# Local import to avoid a cycle: blastware_file imports models and
|
||||
# ultimately client.py imports waveform_codec.
|
||||
from .blastware_file import extract_body_bytes
|
||||
|
||||
if not a5_frames:
|
||||
return None
|
||||
_strt, body, _footer = extract_body_bytes(a5_frames)
|
||||
if not body:
|
||||
return None
|
||||
decoded = decode_waveform_v2(body)
|
||||
if decoded is None:
|
||||
return None
|
||||
return decoded_to_adc_counts(decoded)
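The unit chain from decoder output to physical velocity can be worked through numerically. The sketch below assumes a geo full scale of 10.24 in/s, which is inferred here from the quoted LSB (0.005 in/s per 16 counts, times 32768 counts) and is not stated in the source; `counts_to_in_per_s` is a hypothetical helper.

```python
GEO_LSB_IN_PER_S = 0.005   # one 16-count unit at Normal range (from the source)
COUNTS_PER_UNIT = 16       # decoder unit -> 1-count ADC units
ADC_HALF_RANGE = 32768     # int16 half range used by the legacy pipeline

# Inferred, not stated in the source: the full scale that makes
# count * full_scale / 32768 agree with the quoted LSB.
FULL_SCALE_IN_PER_S = GEO_LSB_IN_PER_S / COUNTS_PER_UNIT * ADC_HALF_RANGE


def counts_to_in_per_s(count: int, full_scale: float = FULL_SCALE_IN_PER_S) -> float:
    """Convert a 1-count ADC value to in/s using the legacy convention."""
    return count * full_scale / ADC_HALF_RANGE


decoded_units = [0, 1, -3, 200]                       # decoder output (16-count units)
adc_counts = [v * COUNTS_PER_UNIT for v in decoded_units]
velocities = [counts_to_in_per_s(c) for c in adc_counts]
print(FULL_SCALE_IN_PER_S, adc_counts, velocities)
```

Under this assumption 200 decoder units come out at 1.0 in/s, which agrees with the `LSB_INV = 200` factor used in the scratch analyzer elsewhere in this diff.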
|
||||
@@ -53,9 +53,7 @@ SUB_TABLE: dict[int, tuple[str, str, str]] = {
|
||||
0x82: ("TRIGGER_CONFIG_WRITE", "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
|
||||
0x83: ("TRIGGER_WRITE_CONFIRM", "BW→S3", "Short frame; commit step after 0x82"),
|
||||
# S3→BW responses
|
||||
0x5A: ("BULK_WAVEFORM_STREAM", "BW→S3", "Bulk waveform chunk request; response is A5 stream"),
|
||||
0xA4: ("POLL_RESPONSE", "S3→BW", "Response to SUB 5B poll"),
|
||||
0xA5: ("BULK_WAVEFORM_RESPONSE", "S3→BW", "Response to SUB 5A; waveform chunks + metadata"),
|
||||
0xFE: ("FULL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 01"),
|
||||
0xF9: ("CHANNEL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 06"),
|
||||
0xF7: ("EVENT_INDEX_RESPONSE", "S3→BW", "Response to SUB 08; contains backlight/power-save"),
|
||||
|
||||
+36
-33
@@ -33,7 +33,7 @@ STX = 0x02
|
||||
ETX = 0x03
|
||||
ACK = 0x41
|
||||
|
||||
__version__ = "0.2.5"
|
||||
__version__ = "0.2.3"
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -184,9 +184,9 @@ def validate_bw_body_auto(body: bytes) -> Optional[Tuple[bytes, bytes, str]]:
|
||||
def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
|
||||
frames: List[Frame] = []
|
||||
|
||||
IDLE = 0
|
||||
IN_FRAME = 1
|
||||
IN_FRAME_DLE = 2 # saw DLE inside frame — waiting for next byte
|
||||
IDLE = 0
|
||||
IN_FRAME = 1
|
||||
AFTER_DLE = 2
|
||||
|
||||
state = IDLE
|
||||
body = bytearray()
|
||||
@@ -206,63 +206,66 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
|
||||
state = IN_FRAME
|
||||
i += 2
|
||||
continue
|
||||
# ACK bytes, boot strings, garbage — silently ignored
|
||||
|
||||
elif state == IN_FRAME:
|
||||
if b == DLE:
|
||||
state = IN_FRAME_DLE
|
||||
state = AFTER_DLE
|
||||
i += 1
|
||||
continue
|
||||
body.append(b)
|
||||
|
||||
else: # AFTER_DLE
|
||||
if b == DLE:
|
||||
body.append(DLE)
|
||||
state = IN_FRAME
|
||||
i += 1
|
||||
continue
|
||||
|
||||
if b == ETX:
|
||||
# Bare ETX = real S3 frame terminator (confirmed from S3FrameParser)
|
||||
end_offset = i + 1
|
||||
trailer_start = i + 1
|
||||
trailer_end = trailer_start + trailer_len
|
||||
trailer = blob[trailer_start:trailer_end]
|
||||
|
||||
# S3 checksums are deliberately not validated here.
|
||||
# Large S3 responses (A5 bulk waveform, E5 compliance) embed
|
||||
# inner DLE+ETX sub-frame terminators whose trailing 0x03 byte
|
||||
# lands where the parser would expect the SUM8 checksum, causing
|
||||
# false failures. The live protocol (protocol.py _validate_frame)
|
||||
# also skips S3 checksum enforcement for the same reason.
|
||||
chk_valid = None
|
||||
chk_type = None
|
||||
chk_hex = None
|
||||
payload = bytes(body)
|
||||
|
||||
if len(body) >= 1:
|
||||
received_chk = body[-1]
|
||||
computed_chk = checksum8_sum(bytes(body[:-1]))
|
||||
if computed_chk == received_chk:
|
||||
chk_valid = True
|
||||
chk_type = "SUM8"
|
||||
chk_hex = f"{received_chk:02x}"
|
||||
payload = bytes(body[:-1])
|
||||
else:
|
||||
chk_valid = False
|
||||
|
||||
frames.append(Frame(
|
||||
index=idx,
|
||||
start_offset=start_offset,
|
||||
end_offset=end_offset,
|
||||
payload_raw=bytes(body),
|
||||
payload=bytes(body),
|
||||
payload=payload,
|
||||
trailer=trailer,
|
||||
checksum_valid=None,
|
||||
checksum_type=None,
|
||||
checksum_hex=None
|
||||
checksum_valid=chk_valid,
|
||||
checksum_type=chk_type,
|
||||
checksum_hex=chk_hex
|
||||
))
|
||||
|
||||
idx += 1
|
||||
state = IDLE
|
||||
i = trailer_end
|
||||
continue
|
||||
body.append(b)
|
||||
|
||||
else: # IN_FRAME_DLE
|
||||
if b == DLE:
|
||||
# DLE DLE → literal 0x10 in payload
|
||||
body.append(DLE)
|
||||
state = IN_FRAME
|
||||
i += 1
|
||||
continue
|
||||
if b == ETX:
|
||||
# DLE+ETX inside a frame = inner-frame terminator (A4/E5 sub-frames).
|
||||
# Treat as literal data, NOT the outer frame end.
|
||||
body.append(DLE)
|
||||
body.append(ETX)
|
||||
state = IN_FRAME
|
||||
i += 1
|
||||
continue
|
||||
# Unexpected DLE + byte → treat as literal data
|
||||
body.append(DLE)
|
||||
body.append(b)
|
||||
state = IN_FRAME
|
||||
i += 1
|
||||
continue
|
||||
|
||||
i += 1
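The escaping rules stated in the comments of the `IN_FRAME_DLE` variant above (DLE DLE is a literal 0x10, DLE+ETX is kept as data, a bare ETX ends the frame) can be illustrated with a small reader. This is a simplified sketch under those stated rules only: `read_frame_body` is hypothetical, it starts after the frame's opening delimiter, and it ignores trailers and checksums.

```python
from typing import Tuple

DLE = 0x10  # value taken from the "literal 0x10" comment in the diff
ETX = 0x03


def read_frame_body(data: bytes, start: int = 0) -> Tuple[bytes, int]:
    """Return (unescaped body, index just past the terminating ETX)."""
    body = bytearray()
    i = start
    while i < len(data):
        b = data[i]
        if b == DLE:
            if i + 1 >= len(data):
                raise ValueError("truncated after DLE")
            nxt = data[i + 1]
            if nxt == DLE:
                body.append(DLE)          # DLE DLE -> one literal DLE
            else:
                body.append(DLE)          # DLE+ETX or DLE+other -> literal data
                body.append(nxt)
            i += 2
            continue
        if b == ETX:                      # bare ETX terminates the frame
            return bytes(body), i + 1
        body.append(b)
        i += 1
    raise ValueError("no terminating ETX")


# Hypothetical stream: 'A', escaped DLE, 'B', inner DLE+ETX, 'C', bare ETX, junk.
stream = bytes([0x41, 0x10, 0x10, 0x42, 0x10, 0x03, 0x43, 0x03, 0x99])
frame_body, next_index = read_frame_body(stream)
print(frame_body.hex(" "), next_index)
```

Returning the index past the ETX lets a caller slice off a fixed-length trailer, in the way `parse_s3` uses `trailer_len`.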
|
||||
|
||||
|
||||
+3
-6
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "seismo-relay"
|
||||
version = "0.19.0"
|
||||
version = "0.12.0"
|
||||
description = "Python client and REST server for MiniMate Plus seismographs"
|
||||
requires-python = ">=3.10"
|
||||
dependencies = [
|
||||
@@ -12,12 +12,9 @@ dependencies = [
|
||||
"uvicorn[standard]>=0.24",
|
||||
"pyserial>=3.5",
|
||||
"sqlalchemy>=2.0",
|
||||
"python-multipart>=0.0.7",
|
||||
"h5py>=3.10",
|
||||
"numpy>=1.24",
|
||||
]
|
||||
|
||||
[tool.setuptools.packages.find]
|
||||
# Auto-discovers minimateplus/, micromate/, sfm/, bridges/ as packages
|
||||
# Auto-discovers minimateplus/, sfm/, bridges/ as packages
|
||||
where = ["."]
|
||||
include = ["minimateplus*", "micromate*", "sfm*", "bridges*"]
|
||||
include = ["minimateplus*", "sfm*", "bridges*"]
|
||||
|
||||
@@ -2,6 +2,3 @@ fastapi
|
||||
uvicorn
|
||||
sqlalchemy
|
||||
pyserial
|
||||
python-multipart
|
||||
h5py
|
||||
numpy
|
||||
|
||||
@@ -1,360 +0,0 @@
|
||||
"""
|
||||
scratch/next_experiment_skeleton.py — segment-channel scoring analyzer.
|
||||
|
||||
This is the suggested NEXT EXPERIMENT for cracking the waveform body codec.
|
||||
The goal is to figure out what segments 1+ contain, since segment 0 = Tran
|
||||
is solved but multi-segment continuation diverges from truth at sample ~512.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
The hypothesis to test
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
Segments rotate through channels:
|
||||
|
||||
segment 0 → Tran samples 0..509
|
||||
segment 1 → Vert samples 0..507
|
||||
segment 2 → Long samples 0..507
|
||||
segment 3 → Mic samples 0..507
|
||||
segment 4 → Tran samples 510..N (continuation)
|
||||
...
|
||||
|
||||
This would explain why segment 0 works perfectly (it's pure Tran) and why
|
||||
applying segment 1's blocks as Tran continuation gives wrong values
|
||||
(it's actually Vert).
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
What the analyzer should do
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
For each segment in each fixture event:
|
||||
|
||||
1. Run the segment-0 block-walker + RLE decode (the same algorithm that
|
||||
``decode_tran_initial`` uses) over the segment's blocks. Start from
|
||||
some anchor value and produce a cumulative trajectory of length =
|
||||
number-of-deltas-in-segment.
|
||||
|
||||
2. For each candidate channel C ∈ {Tran, Vert, Long, MicL}:
|
||||
For each candidate anchor location in the segment-header payload
|
||||
(try [0:2], [2:4], [4:6], [14:16], [16:18] as int16 BE):
|
||||
Compare the decoded trajectory against truth[C] starting from
|
||||
the segment's first sample index.
|
||||
Score = number of matches (or sum of squared errors).
|
||||
|
||||
3. Report the best (channel, anchor-location) combination per segment.
|
||||
|
||||
If the rotation hypothesis is correct, you'll see:
|
||||
segment 0 → best score for (Tran, preamble bytes [3:5]) ✓ already known
|
||||
segment 1 → best score for (Vert, <some-header-byte>)
|
||||
segment 2 → best score for (Long, <some-header-byte>)
|
||||
segment 3 → best score for (MicL, <some-header-byte>)
|
||||
segment 4 → best score for (Tran, continuing from segment 0's end)
|
||||
|
||||
If the rotation hypothesis is NOT correct, the scorer will at least narrow
|
||||
down what segment 1 actually carries. Maybe channels interleave at finer
|
||||
granularity, or maybe segments alternate by something other than channel.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Why this is a scoring analyzer, not a hand-written decoder
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
Direct hand-coding ("assume segment 1 is Vert with anchor at byte X") gets
|
||||
stuck when the assumption is wrong because the failure mode is silent —
|
||||
you get plausible-looking-but-wrong samples and have to manually diff
|
||||
against truth to debug.
|
||||
|
||||
The scorer is brute-force but cheap: every fixture event × every segment ×
|
||||
4 channels × 5 anchor-byte candidates is only ~hundreds of comparisons.
|
||||
The winning combination jumps out by score.
|
||||
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Skeleton
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Optional, Tuple
|
||||
|
||||
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
|
||||
|
||||
from minimateplus.waveform_codec import walk_body, find_data_start, WaveformBlock
|
||||
|
||||
|
||||
# ── Reusable pieces ──────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
CHANNELS = ("Tran", "Vert", "Long", "MicL")
|
||||
LSB_INV = 200 # 1 in/s / 0.005 in/s/LSB; multiply BW-export floats by this
|
||||
# to get 16-count units (the body's native quantization).
|
||||
|
||||
|
||||
@dataclass
|
||||
class FixtureEvent:
|
||||
name: str # e.g. "M529LL1A.SP0"
|
||||
bin_path: str
|
||||
txt_path: str
|
||||
body: bytes
|
||||
truth: dict # {channel: list of int16-quantized samples}
|
||||
blocks: List[WaveformBlock]
|
||||
segment_starts: List[int] # block indices of each 40 02 segment header
|
||||
segment_sample_starts: List[int] # for each segment, the truth sample index it starts at
|
||||
|
||||
|
||||
def s4(n: int) -> int:
|
||||
"""4-bit signed nibble decode."""
|
||||
return n if n < 8 else n - 16
|
||||
|
||||
|
||||
def i8(b: int) -> int:
|
||||
"""int8 reinterpret of unsigned byte."""
|
||||
return b if b < 128 else b - 256
|
||||
|
||||
|
||||
def load_fixture(name: str) -> FixtureEvent:
|
||||
"""Load a fixture event with its truth values and parsed block stream."""
|
||||
# Find the fixture (search both subdirs of tests/fixtures/).
|
||||
base = os.path.join(os.path.dirname(__file__), "..", "tests", "fixtures")
|
||||
candidates = [
|
||||
os.path.join(base, "5-11-26", name),
|
||||
os.path.join(base, "decode-re-5-8-26", "event-a", name), # not used directly
|
||||
]
|
||||
bin_path = next((c for c in candidates if os.path.exists(c)), None)
|
||||
if bin_path is None:
|
||||
# Try a glob walk for the 5-8 fixtures (they're in subdirs).
|
||||
for root, _, files in os.walk(base):
|
||||
if name in files:
|
||||
bin_path = os.path.join(root, name)
|
||||
break
|
||||
if bin_path is None:
|
||||
raise FileNotFoundError(name)
|
||||
|
||||
txt_path = bin_path + ".TXT"
|
||||
with open(bin_path, "rb") as f:
|
||||
raw = f.read()
|
||||
body = raw[43:-26]
|
||||
truth = _parse_txt(txt_path)
|
||||
blocks = walk_body(body, find_data_start(body))
|
||||
|
||||
seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||
# Segment 0 starts at sample 0; subsequent segments start at the
|
||||
# cumulative sample count from previous segment(s). Tran's segment 0
|
||||
# is N samples; if rotation hypothesis is correct, segment 1's data
|
||||
# starts at sample 0 for a *different* channel. The analyzer should
|
||||
# try both "continues from previous segment" and "starts at sample 0
|
||||
# of a different channel."
|
||||
seg_sample_starts = _compute_segment_sample_starts(blocks, seg_idx)
|
||||
|
||||
return FixtureEvent(
|
||||
name=name, bin_path=bin_path, txt_path=txt_path,
|
||||
body=body, truth=truth, blocks=blocks,
|
||||
segment_starts=seg_idx, segment_sample_starts=seg_sample_starts,
|
||||
)
|
||||
|
||||
|
||||
def _parse_txt(path: str) -> dict:
|
||||
"""Parse BW ASCII TXT export into {channel: [int_samples_in_16_count_units]}."""
|
||||
with open(path, "r", encoding="utf-8", errors="replace") as f:
|
||||
lines = f.read().splitlines()
|
||||
header_idx = next(
|
||||
(i for i, l in enumerate(lines)
|
||||
if all(c in l for c in CHANNELS)),
|
||||
None,
|
||||
)
|
||||
if header_idx is None:
|
||||
return {ch: [] for ch in CHANNELS}
|
||||
out = {ch: [] for ch in CHANNELS}
|
||||
for line in lines[header_idx + 1:]:
|
||||
parts = re.split(r"\s+", line.strip())
|
||||
if len(parts) < 4:
|
||||
continue
|
||||
try:
|
||||
vals = [float(p) for p in parts[:4]]
|
||||
except ValueError:
|
||||
continue
|
||||
for ch, v in zip(CHANNELS, vals):
|
||||
# Multiply by LSB_INV; geo channels are in in/s, MicL is in dB(L)
|
||||
# (which doesn't quantize the same way — leaving raw for MicL is fine,
|
||||
# the scorer should treat MicL specially).
|
||||
out[ch].append(round(v * LSB_INV) if ch != "MicL" else v)
|
||||
return out
|
||||
|
||||
|
||||
def _compute_segment_sample_starts(
|
||||
blocks: List[WaveformBlock], seg_idx: List[int]
|
||||
) -> List[int]:
|
||||
"""Cumulative sample-count up to each segment header (if all blocks treated
|
||||
as Tran continuation). Useful as one candidate for segment-1-Tran tests.
|
||||
|
||||
The scorer should ALSO try "segment 1 starts at sample 0 of a new channel"
|
||||
as the rotation hypothesis predicts.
|
||||
"""
|
||||
starts = []
|
||||
cum = 2 # T[0] + T[1] from preamble
|
||||
for i, b in enumerate(blocks):
|
||||
if i in seg_idx:
|
||||
starts.append(cum)
|
||||
if b.tag_hi == 0x10:
|
||||
cum += b.tag_lo
|
||||
elif b.tag_hi == 0x20:
|
||||
cum += b.tag_lo
|
||||
elif b.tag_hi == 0x00:
|
||||
cum += b.tag_lo
|
||||
# 30 NN and 40 02 don't contribute samples (for this hypothesis)
|
||||
return starts
|
||||
|
||||
|
||||
# ── The core algorithm: decode a segment's blocks as deltas ─────────────────
|
||||
|
||||
|
||||
def decode_segment_as_channel(
|
||||
blocks: List[WaveformBlock],
|
||||
seg_start_block_idx: int,
|
||||
seg_end_block_idx: int,
|
||||
anchor: int,
|
||||
) -> List[int]:
|
||||
"""Apply the segment-0 codec rules to a range of blocks, starting from *anchor*.
|
||||
|
||||
Returns a list of cumulative sample values (one per delta). Does NOT include
|
||||
the anchor itself in the output — the first returned value is anchor + first_delta.
|
||||
"""
|
||||
out = []
|
||||
cur = anchor
|
||||
for bi in range(seg_start_block_idx, seg_end_block_idx):
|
||||
blk = blocks[bi]
|
||||
if blk.tag_hi == 0x10:
|
||||
for byte in blk.data:
|
||||
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||
cur += s4(nib)
|
||||
out.append(cur)
|
||||
elif blk.tag_hi == 0x20:
|
||||
for byte in blk.data:
|
||||
cur += i8(byte)
|
||||
out.append(cur)
|
||||
elif blk.tag_hi == 0x00:
|
||||
for _ in range(blk.tag_lo):
|
||||
out.append(cur)
|
||||
# 30 NN: skip (content unknown)
|
||||
# 40 02: shouldn't appear in segment data (it's the segment header)
|
||||
return out
|
||||
|
||||
|
||||
def score_against_truth(
|
||||
decoded: List[int],
|
||||
truth: List[int],
|
||||
truth_start: int,
|
||||
) -> Tuple[int, int]:
|
||||
"""Compare *decoded* to truth[truth_start : truth_start + len(decoded)].
|
||||
|
||||
Returns (n_matches, n_compared).
|
||||
"""
|
||||
n = min(len(decoded), len(truth) - truth_start)
|
||||
if n <= 0:
|
||||
return (0, 0)
|
||||
matches = sum(1 for i in range(n) if decoded[i] == truth[truth_start + i])
|
||||
return (matches, n)
|
||||
|
||||
|
||||
# ── TODO for the next pass ──────────────────────────────────────────────────
|
||||
|
||||
|
||||
def score_segment_against_all_channels(
|
||||
event: FixtureEvent,
|
||||
segment_index: int,
|
||||
) -> List[Tuple[str, int, int, int]]:
|
||||
"""For segment *segment_index* of *event*, find the best (channel, start_sample)
|
||||
fit.
|
||||
|
||||
For each candidate channel C and each candidate starting truth-sample index s,
we take truth[C][s] as the anchor, apply the segment's deltas to it, and
score the resulting trajectory against truth[C][s+1 : s+N+1].
|
||||
|
||||
Returns rows of (channel_name, start_sample, n_matches, n_compared)
|
||||
sorted by match-count descending.
|
||||
"""
|
||||
# Block range of this segment: from the segment header (inclusive) up to
|
||||
# the next segment header (exclusive), or end-of-blocks.
|
||||
seg_header_idx = event.segment_starts[segment_index]
|
||||
next_header_idx = (
|
||||
event.segment_starts[segment_index + 1]
|
||||
if segment_index + 1 < len(event.segment_starts)
|
||||
else len(event.blocks)
|
||||
)
|
||||
|
||||
# Decode the segment's data blocks (skip the segment-header block itself).
|
||||
# Use anchor=0 — we'll re-anchor when scoring against each channel.
|
||||
deltas_trajectory = decode_segment_as_channel(
|
||||
event.blocks, seg_header_idx + 1, next_header_idx, anchor=0
|
||||
)
|
||||
if not deltas_trajectory:
|
||||
return []
|
||||
|
||||
n = len(deltas_trajectory)
|
||||
results = []
|
||||
|
||||
for ch in ("Tran", "Vert", "Long"):
|
||||
truth = event.truth.get(ch)
|
||||
if not truth or len(truth) < n + 1:
|
||||
continue
|
||||
# For each candidate starting sample s in truth, check if applying
|
||||
# the deltas starting from truth[s] reproduces truth[s+1:s+n+1].
|
||||
best = (0, -1)
|
||||
for s in range(len(truth) - n):
|
||||
anchor = truth[s]
|
||||
# deltas_trajectory was computed from anchor=0, so the absolute
# trajectory is simply anchor + deltas_trajectory[i].
|
||||
matches = 0
|
||||
for i in range(n):
|
||||
if truth[s + i + 1] == anchor + deltas_trajectory[i]:
|
||||
matches += 1
|
||||
# Note: we could break early on first mismatch for "matches start",
|
||||
# but counting total matches gives a more robust score.
|
||||
if matches > best[0]:
|
||||
best = (matches, s)
|
||||
results.append((ch, best[1], best[0], n))
|
||||
|
||||
results.sort(key=lambda r: -r[2])
|
||||
return results
|
||||
|
||||
|
||||
# ── Driver ──────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def main():
|
||||
"""Run the analyzer on all loud-bundle events and print best scores."""
|
||||
events = ["M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
|
||||
"M529LL1L.JQ0", "M529LL1L.V70"]
|
||||
for name in events:
|
||||
try:
|
||||
event = load_fixture(name)
|
||||
except FileNotFoundError:
|
||||
print(f"{name}: fixture not found")
|
||||
continue
|
||||
|
||||
print(f"\n=== {name} ===")
|
||||
print(f" body bytes: {len(event.body)}")
|
||||
print(f" blocks: {len(event.blocks)}")
|
||||
print(f" segments: {len(event.segment_starts)}")
|
||||
print(f" segment sample-starts (if all blocks are 1 channel):")
|
||||
for si, sample_start in enumerate(event.segment_sample_starts):
|
||||
print(f" seg {si}: sample {sample_start}")
|
||||
|
||||
for si in range(len(event.segment_starts)):
|
||||
results = score_segment_against_all_channels(event, si)
|
||||
if not results:
|
||||
print(f" seg {si}: (no scorable data)")
|
||||
continue
|
||||
tag = "✓" if results[0][2] / max(results[0][3], 1) > 0.9 else " "
|
||||
top = results[0]
|
||||
print(f" seg {si}: best fit {tag} = {top[0]:<5} "
|
||||
f"starting at sample {top[1]:>5}, {top[2]:>4}/{top[3]:<4} match"
|
||||
+ (f" (next: {results[1][0]} @{results[1][1]} {results[1][2]}/{results[1][3]})"
|
||||
if len(results) > 1 else ""))
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
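The rotation hypothesis stated in the module docstring maps each segment to a channel by position. A minimal sketch of that mapping follows, using the docstring's numbering in which segment 0 is the initial Tran segment; `channel_for_segment` and `split_segments_by_channel` are hypothetical helpers, not part of the analyzer.

```python
from typing import Dict, List

ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_segment(segment_number: int) -> str:
    """Channel carried by a segment, counting the initial Tran segment as 0."""
    if segment_number < 0:
        raise ValueError("segment_number must be non-negative")
    return ROTATION[segment_number % len(ROTATION)]


def split_segments_by_channel(n_segments: int) -> Dict[str, List[int]]:
    """Group segment numbers by the channel the hypothesis assigns to them."""
    grouped: Dict[str, List[int]] = {ch: [] for ch in ROTATION}
    for k in range(n_segments):
        grouped[channel_for_segment(k)].append(k)
    return grouped


grouped = split_segments_by_channel(6)
print(grouped)
```

If the scorer's best-fit channel for each segment matches `channel_for_segment`, the hypothesis holds for that event; a mismatch on any segment points to a finer interleave.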
|
||||
@@ -1,150 +0,0 @@
|
||||
"""
|
||||
scripts/backfill_record_type.py — fix `record_type` on legacy event
|
||||
rows whose value was hardcoded to "Waveform" regardless of actual type.
|
||||
|
||||
Why this is needed
|
||||
──────────────────
|
||||
Pre-v0.16.1 the BW file importer (`event_file_io.read_blastware_file`)
|
||||
hardcoded `ev.record_type = "Waveform"` for every imported event. Fixed
|
||||
in commit aac1c8e — new ingests now derive the type from the Blastware
|
||||
filename's extension last character (H=Histogram, W=Waveform, M=Manual,
|
||||
E=Event, C=Combo) per the V10.72+ MiniMate Plus AB0T filename scheme.
|
||||
|
||||
Effect on a server that imported events under the old code: every
|
||||
events row has `record_type = "Waveform"`, even for histograms,
|
||||
manuals, etc. Visible in terra-view's event-detail modal under the
|
||||
"Record Type" field. Terra-view also has a client-side workaround
|
||||
that derives the type from the filename for display purposes, so
|
||||
operators see the correct type in the UI even before this backfill.
|
||||
This script makes the DB column match what the UI is already showing,
|
||||
which matters for reporting and any downstream consumer that reads
|
||||
events.record_type directly.
|
||||
|
||||
This script
|
||||
───────────
|
||||
Walks the `events` table and updates each row's `record_type` to the
|
||||
derived value from its `blastware_filename`. Old S338 firmware files
|
||||
(3-char extensions ending in `0`) and any unrecognized suffix get
|
||||
left at the existing value (defaults to "Waveform").
|
||||
|
||||
Idempotent: re-running after a successful backfill finds zero rows
|
||||
needing updates and exits cleanly (it always re-derives but only
|
||||
writes when the value would change).
|
||||
|
||||
Usage
|
||||
─────
|
||||
# Dry-run (default): print what would change, don't touch the DB
|
||||
python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db
|
||||
|
||||
# Apply the backfill
|
||||
python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db --apply
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import sqlite3
|
||||
import sys
|
||||
from collections import Counter
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
# Must stay in sync with minimateplus.event_file_io._RECORD_TYPE_BY_EXT_SUFFIX.
|
||||
_TYPE_FROM_SUFFIX = {
|
||||
"H": "Histogram",
|
||||
"W": "Waveform",
|
||||
"M": "Manual",
|
||||
"E": "Event",
|
||||
"C": "Combo",
|
||||
}
|
||||
|
||||
|
||||
def derive_record_type(filename: str | None, default: str = "Waveform") -> str:
|
||||
"""Mirror of minimateplus.event_file_io.derive_record_type_from_filename.
|
||||
|
||||
Vendored here so this script runs without needing the seismo-relay
|
||||
package on the Python path (useful on prod where you might be
|
||||
running it via `docker exec` against a container's DB volume).
|
||||
"""
|
||||
if not filename:
|
||||
return default
|
||||
name = Path(filename).name
|
||||
if "." not in name:
|
||||
return default
|
||||
ext = name.rsplit(".", 1)[1]
|
||||
if not ext:
|
||||
return default
|
||||
return _TYPE_FROM_SUFFIX.get(ext[-1].upper(), default)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
ap = argparse.ArgumentParser(description=__doc__)
|
||||
ap.add_argument("--db", required=True, help="Path to seismo_relay.db")
|
||||
ap.add_argument("--apply", action="store_true",
|
||||
help="Actually write changes (default is dry-run).")
|
||||
ap.add_argument("--default", default="Waveform",
|
||||
help="Fallback record_type when filename doesn't encode one. "
|
||||
"Default: Waveform (matches the pre-fix bug's behavior).")
|
||||
args = ap.parse_args()
|
||||
|
||||
db_path = Path(args.db)
|
||||
if not db_path.exists():
|
||||
print(f"ERROR: database not found at {db_path}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
conn = sqlite3.connect(str(db_path))
|
||||
conn.row_factory = sqlite3.Row
|
||||
cur = conn.cursor()
|
||||
|
||||
cur.execute("""
|
||||
SELECT id, blastware_filename, record_type
|
||||
FROM events
|
||||
WHERE blastware_filename IS NOT NULL
|
||||
AND blastware_filename != ''
|
||||
""")
|
||||
rows = cur.fetchall()
|
||||
total = len(rows)
|
||||
print(f"Scanning {total:,} event rows…")
|
||||
print()
|
||||
|
||||
# Tally proposed changes.
|
||||
transitions: Counter[tuple[str, str]] = Counter()
|
||||
update_ids: list[tuple[str, str]] = []
|
||||
unrecognized = 0
|
||||
|
||||
for row in rows:
|
||||
derived = derive_record_type(row["blastware_filename"], default=args.default)
|
||||
current = row["record_type"] or ""
|
||||
if derived == current:
|
||||
continue
|
||||
transitions[(current, derived)] += 1
|
||||
update_ids.append((row["id"], derived))
|
||||
|
||||
if not update_ids:
|
||||
print("Nothing to update — all rows already match.")
|
||||
conn.close()
|
||||
return 0
|
||||
|
||||
print(f"{len(update_ids):,} row(s) need updating:")
|
||||
for (old, new), count in sorted(transitions.items(), key=lambda x: -x[1]):
|
||||
print(f" {count:>6,} {old!r:14s} → {new!r}")
|
||||
print()
|
||||
|
||||
if not args.apply:
|
||||
print("(dry-run — re-run with --apply to write changes)")
|
||||
conn.close()
|
||||
return 0
|
||||
|
||||
print("Applying changes…")
|
||||
cur.executemany(
|
||||
"UPDATE events SET record_type = ? WHERE id = ?",
|
||||
[(new, eid) for eid, new in update_ids],
|
||||
)
|
||||
conn.commit()
|
||||
print(f"Done. Updated {cur.rowcount:,} row(s).")
|
||||
conn.close()
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
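The suffix rule that `backfill_record_type.py` applies can be checked in isolation. The sketch below re-implements the last-character lookup in a few lines and runs it on made-up filenames. The suffix table is copied from the script; the `record_type_for` helper and every filename are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Suffix table copied from the script above.
SUFFIX_TO_TYPE = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def record_type_for(filename, default="Waveform"):
    """Hypothetical helper: map the extension's last character to a record type."""
    if not filename:
        return default
    ext = Path(filename).suffix.lstrip(".")
    if not ext:
        return default
    return SUFFIX_TO_TYPE.get(ext[-1].upper(), default)


# Made-up filenames, for illustration only.
examples = ["M529LK4Q.AB0H", "M529LK4Q.AB0W", "legacy.SP0", "no_extension", None]
results = {name: record_type_for(name) for name in examples}
for name, rtype in results.items():
    print(f"{name!r:18} -> {rtype}")
```

Note that an extension ending in `0` is not in the table, so it falls through to the default rather than raising.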
||||
@@ -1,431 +0,0 @@
|
||||
"""
|
||||
scripts/backfill_sidecars.py — generate .sfm.json sidecars AND .h5
|
||||
clean-waveform files for existing events already in the waveform store
|
||||
that predate those features.
|
||||
|
||||
Walks `<store_root>/<serial>/<filename>` and for each BW event file:
|
||||
|
||||
Sidecar (.sfm.json):
|
||||
- Skip when an existing sidecar's blastware.sha256 matches the
|
||||
current BW file's sha256.
|
||||
- Else regenerate: prefer .a5.pkl (full fidelity); fall back to
|
||||
parsing the BW binary directly (peaks computed from samples).
|
||||
|
||||
Clean waveform (.h5):
|
||||
- Regenerated whenever the sidecar is regenerated (sha mismatch
|
||||
OR sidecar.source.tool_version < current TOOL_VERSION OR --force).
|
||||
The .h5 and the sidecar both come from the same decoder output,
|
||||
so if the sidecar is stale the .h5 is too.
|
||||
- Written when missing.
|
||||
- --skip-hdf5 turns off all .h5 writes.
|
||||
|
||||
Typical use after a decoder upgrade:
|
||||
1. Pull the new seismo-relay code (which bumped TOOL_VERSION).
|
||||
2. Run this script — every sidecar with an older tool_version
|
||||
stamp regenerates, and the associated .h5 cascade-regenerates.
|
||||
3. Operator review state (review.false_trigger, notes, reviewer)
|
||||
and the sidecar's extensions block are preserved across the
|
||||
regen.
|
||||
|
||||
Usage:
|
||||
python scripts/backfill_sidecars.py [--store-root PATH]
|
||||
[--db-path PATH]
|
||||
[--dry-run]
|
||||
[--skip-hdf5]
|
||||
[--force]
|
||||
[-v]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import logging
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Allow running from the repo root without installation.
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
||||
|
||||
from minimateplus import event_file_io
|
||||
from sfm import event_hdf5
|
||||
from sfm.waveform_store import WaveformStore, _frame_to_dict, _dict_to_frame # noqa: F401
|
||||
from sfm.database import SeismoDb
|
||||
|
||||
log = logging.getLogger("backfill_sidecars")
|
||||
|
||||
|
||||
def _looks_like_event_file(path: Path) -> bool:
|
||||
"""Same heuristic as the importer CLI.
|
||||
|
||||
Filters to BW (Series III) event files only — Thor (Series IV)
|
||||
`.IDFW` / `.IDFH` files share the store but have their own ingest
|
||||
path (`WaveformStore.save_imported_idf`) and are NOT decodable by
|
||||
`event_file_io.read_blastware_file`. Their sidecars are populated
|
||||
at ingest from the paired `.IDFW.txt` ASCII report; nothing the
|
||||
backfill regenerates would improve on them, so we exclude them
|
||||
from scope.
|
||||
"""
|
||||
if not path.is_file():
|
||||
return False
|
||||
if path.name.endswith((".a5.pkl", ".sfm.json", ".h5")):
|
||||
return False
|
||||
ext = path.suffix.lstrip(".")
|
||||
if not (3 <= len(ext) <= 4):
|
||||
return False
|
||||
# Thor IDF files share the .{W,H}-suffix shape but aren't BW.
|
||||
if ext.upper() in ("IDFW", "IDFH"):
|
||||
return False
|
||||
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
|
||||
return False
|
||||
try:
|
||||
return path.stat().st_size >= 70
|
||||
except OSError:
|
||||
return False
|
||||
|
||||
|
||||
def main(argv=None) -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__)
|
||||
p.add_argument(
|
||||
"--db-path",
|
||||
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
|
||||
)
|
||||
p.add_argument("--store-root", default=None)
|
||||
p.add_argument("--dry-run", action="store_true")
|
||||
p.add_argument(
|
||||
"--skip-hdf5", action="store_true",
|
||||
help="Don't generate .h5 clean-waveform files (only sidecars).",
|
||||
)
|
||||
p.add_argument(
|
||||
"--force", action="store_true",
|
||||
help=(
|
||||
"Regenerate sidecars + .h5 even when an existing sidecar's "
|
||||
"blastware.sha256 matches the current BW file. Use this after "
|
||||
"upgrading seismo-relay to pull in decoder bug fixes (e.g. the "
|
||||
"STRT-rectime byte-offset fix in v0.15.x)."
|
||||
),
|
||||
)
|
||||
p.add_argument("-v", "--verbose", action="store_true")
|
||||
args = p.parse_args(argv)
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.DEBUG if args.verbose else logging.INFO,
|
||||
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
|
||||
datefmt="%H:%M:%S",
|
||||
)
|
||||
|
||||
db_path = Path(args.db_path).expanduser().resolve()
|
||||
store_root = (
|
||||
Path(args.store_root).expanduser().resolve()
|
||||
if args.store_root else db_path.parent / "waveforms"
|
||||
)
|
||||
if not store_root.exists():
|
||||
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
store = WaveformStore(store_root)
|
||||
db = SeismoDb(db_path)
|
||||
|
||||
written = skipped = errors = 0
|
||||
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
|
||||
serial = serial_dir.name
|
||||
for path in sorted(serial_dir.iterdir()):
|
||||
if not _looks_like_event_file(path):
|
||||
continue
|
||||
sidecar_path = store.sidecar_path_for(serial, path.name)
|
||||
try:
|
||||
bw_sha = event_file_io.file_sha256(path)
|
||||
except Exception as exc:
|
||||
log.error("sha256 failed for %s: %s", path, exc)
|
||||
errors += 1
|
||||
continue
|
||||
|
||||
# Skip when an up-to-date sidecar already exists.
|
||||
#
|
||||
# Two-part freshness check:
|
||||
# 1. blastware.sha256 must match the current BW file (proves
|
||||
# the sidecar describes THIS file).
|
||||
# 2. source.tool_version must be ≥ current TOOL_VERSION (proves
|
||||
# the sidecar was written by a build that includes any
|
||||
# decoder fixes shipped since).
|
||||
# Either part failing → regenerate. --force bypasses both.
|
||||
#
|
||||
# Tracks whether we're regenerating the sidecar this iteration
|
||||
# so the .h5 logic below knows to refresh that too — staleness
|
||||
# of the sidecar implies staleness of the derived .h5 (both
|
||||
# come out of the same decoder).
|
||||
sidecar_stale = True
|
||||
if sidecar_path.exists() and not args.force:
|
||||
try:
|
||||
existing = event_file_io.read_sidecar(sidecar_path)
|
||||
sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
|
||||
src_ver = existing.get("source", {}).get("tool_version", "")
|
||||
def _vt(s):
|
||||
try:
|
||||
return tuple(int(p) for p in str(s).split(".")[:3])
|
||||
except Exception:
|
||||
return (0, 0, 0)
|
||||
ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
|
||||
if sha_ok and ver_ok:
|
||||
skipped += 1
|
||||
sidecar_stale = False
|
||||
continue
|
||||
if sha_ok and not ver_ok:
|
||||
log.info(
|
||||
"regenerating %s (sidecar tool_version=%s < current %s)",
|
||||
sidecar_path.name, src_ver or "(none)",
|
||||
event_file_io.TOOL_VERSION,
|
||||
)
|
||||
except Exception:
|
||||
pass # fall through to rewrite
|
||||
|
||||
# Decide path: A5-based (high-fidelity) or BW-only.
|
||||
a5_path = serial_dir / f"{path.name}.a5.pkl"
|
||||
try:
|
||||
if a5_path.exists():
|
||||
frames = store.load_a5(serial, path.name)
|
||||
if not frames:
|
||||
raise RuntimeError("a5_pickle present but unreadable")
|
||||
# Build an Event by replaying the A5 decoders. Note:
|
||||
# the .a5.pkl alone CANNOT recover timestamp /
|
||||
# record_type / waveform_key / per-channel peaks —
|
||||
# those live in the 0C record, which isn't saved
|
||||
# separately. We seed those from the DB row + the
|
||||
# existing sidecar below so a re-backfill doesn't
|
||||
# nuke fields the original save populated.
|
||||
from minimateplus.client import (
|
||||
_decode_a5_metadata_into,
|
||||
_decode_a5_waveform,
|
||||
)
|
||||
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
|
||||
ev = Event(index=-1)
|
||||
_decode_a5_metadata_into(frames, ev)
|
||||
_decode_a5_waveform(frames, ev)
|
||||
source_kind = "sfm-live"
|
||||
a5_filename = a5_path.name
|
||||
else:
|
||||
ev = event_file_io.read_blastware_file(path)
|
||||
source_kind = "bw-import"
|
||||
a5_filename = None
|
||||
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
|
||||
|
||||
# ── Seed missing fields from the SeismoDb events row ──
|
||||
# The DB row was populated at original save time with peaks,
|
||||
# project info, timestamp, record_type, sample_rate, etc.
|
||||
# All of those survive intact in SQLite; pull them onto the
|
||||
# rebuilt Event so the regenerated sidecar matches what was
|
||||
# there before the backfill ran.
|
||||
db_row = None
|
||||
try:
|
||||
import sqlite3 as _sql
|
||||
with _sql.connect(str(db.db_path)) as _conn:
|
||||
_conn.row_factory = _sql.Row
|
||||
db_row = _conn.execute(
|
||||
"SELECT * FROM events "
|
||||
"WHERE serial=? AND blastware_filename=? "
|
||||
"LIMIT 1",
|
||||
(serial, path.name),
|
||||
).fetchone()
|
||||
except Exception as exc:
|
||||
log.debug("DB lookup failed for %s: %s", path.name, exc)
|
||||
|
||||
if db_row is not None:
|
||||
if ev.sample_rate is None and db_row["sample_rate"]:
|
||||
ev.sample_rate = int(db_row["sample_rate"])
|
||||
if not ev.record_type and db_row["record_type"]:
|
||||
ev.record_type = db_row["record_type"]
|
||||
if ev._waveform_key is None and db_row["waveform_key"]:
|
||||
try:
|
||||
ev._waveform_key = bytes.fromhex(db_row["waveform_key"])
|
||||
except Exception:
|
||||
pass
|
||||
# Timestamp from the ISO-8601 string in the DB row.
|
||||
if ev.timestamp is None and db_row["timestamp"]:
|
||||
try:
|
||||
import datetime as _dt
|
||||
_t = _dt.datetime.fromisoformat(db_row["timestamp"])
|
||||
ev.timestamp = Timestamp(
|
||||
raw=b"", flag=0x10,
|
||||
year=_t.year, unknown_byte=0,
|
||||
month=_t.month, day=_t.day,
|
||||
hour=_t.hour, minute=_t.minute, second=_t.second,
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
# Peaks from the DB row when the A5 decode didn't supply them.
|
||||
if ev.peak_values is None:
|
||||
ev.peak_values = PeakValues(
|
||||
tran=db_row["tran_ppv"],
|
||||
vert=db_row["vert_ppv"],
|
||||
long=db_row["long_ppv"],
|
||||
peak_vector_sum=db_row["peak_vector_sum"],
|
||||
micl=db_row["mic_ppv"],
|
||||
)
|
||||
# Project info from the DB row when the A5 metadata-page
|
||||
# decode didn't pick it up.
|
||||
if ev.project_info is None or all(
|
||||
v in (None, "")
|
||||
for v in (
|
||||
(ev.project_info.project if ev.project_info else None),
|
||||
(ev.project_info.client if ev.project_info else None),
|
||||
(ev.project_info.operator if ev.project_info else None),
|
||||
(ev.project_info.sensor_location if ev.project_info else None),
|
||||
)
|
||||
):
|
||||
ev.project_info = ProjectInfo(
|
||||
project=db_row["project"],
|
||||
client=db_row["client"],
|
||||
operator=db_row["operator"],
|
||||
sensor_location=db_row["sensor_location"],
|
||||
)
|
||||
|
||||
# Derive total_samples when we have both rectime + sample_rate.
|
||||
# The decoder's STRT-derived value can be a buffer offset
|
||||
# rather than a sample count — drop it in that case.
|
||||
if ev.sample_rate and ev.rectime_seconds:
|
||||
derived = int(round(ev.sample_rate * ev.rectime_seconds))
|
||||
if (ev.total_samples is None
|
||||
or ev.total_samples > derived * 2
|
||||
or ev.total_samples < derived // 4):
|
||||
ev.total_samples = derived
|
||||
|
||||
# Preserve user-edited review state + extensions + the
|
||||
# bw_report block from the existing sidecar so a backfill
|
||||
# never wipes them out. The bw_report block originates
|
||||
# from the paired .TXT ASCII report parsed at ORIGINAL
|
||||
# import time (ach forward / direct upload); the .TXT
|
||||
# file is not in the waveform store, so we can't re-derive
|
||||
# it from disk. event_to_sidecar_dict takes a
|
||||
# BwAsciiReport dataclass (not a dict), so for bw_report
|
||||
# we overlay the existing block after regen instead of
|
||||
# passing it as a kwarg.
|
||||
preserved_review = None
|
||||
preserved_ext = None
|
||||
preserved_bw_report = None
|
||||
if sidecar_path.exists():
|
||||
try:
|
||||
_existing = event_file_io.read_sidecar(sidecar_path)
|
||||
preserved_review = _existing.get("review")
|
||||
preserved_ext = _existing.get("extensions")
|
||||
preserved_bw_report = _existing.get("bw_report")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Overlay BW ASCII report fields onto the rebuilt Event
|
||||
# BEFORE the sidecar + DB write. Mirrors what the ingest
|
||||
# path does — BW's reported peaks (and sample_rate /
|
||||
# record_time) win over codec output where present.
|
||||
#
|
||||
# Without this step, --force backfill silently overwrites
|
||||
# the bw_report-overlaid DB columns with codec-derived
|
||||
# values, which is wrong for events the codec doesn't
|
||||
# fully decode (e.g. waveform walker edge cases on
|
||||
# SP0/SS0/SV0-style events, or histogram sub-formats with
|
||||
# byte[5]!=0 that aren't yet RE'd). Net effect was PVS=0
|
||||
# on three top-10 events on 2026-05-22.
|
||||
if preserved_bw_report:
|
||||
event_file_io.apply_bw_report_dict_to_event(
|
||||
ev, preserved_bw_report,
|
||||
)
|
||||
|
||||
sidecar = event_file_io.event_to_sidecar_dict(
|
||||
ev,
|
||||
serial=serial,
|
||||
blastware_filename=path.name,
|
||||
blastware_filesize=path.stat().st_size,
|
||||
blastware_sha256=bw_sha,
|
||||
source_kind=source_kind,
|
||||
a5_pickle_filename=a5_filename,
|
||||
review=preserved_review,
|
||||
extensions=preserved_ext,
|
||||
)
|
||||
if preserved_bw_report is not None:
|
||||
sidecar["bw_report"] = preserved_bw_report
|
||||
|
||||
# Also emit the .h5 clean-waveform file when:
|
||||
# - it's missing, OR
|
||||
# - --force was passed, OR
|
||||
# - the sidecar is being regenerated this iteration
|
||||
# (sha mismatch / tool_version too old). The .h5 and
|
||||
# the sidecar are both derived from the same decoder
|
||||
# output, so if the sidecar is stale, so is the .h5.
|
||||
#
|
||||
# Both waveform and histogram bodies now decode to real
|
||||
# samples via event_file_io.read_blastware_file → either
|
||||
# waveform_codec.decode_waveform_v2 or histogram_codec.
|
||||
# decode_histogram_body. If samples are still empty after
|
||||
# both codecs run, it's a genuine "we can't decode this
|
||||
# file" case (truncated, malformed, or unknown mode);
|
||||
# skip the .h5 write so we don't replace whatever's
|
||||
# there with an empty placeholder.
|
||||
has_samples = bool(
|
||||
ev.raw_samples and any(
|
||||
ev.raw_samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL")
|
||||
)
|
||||
)
|
||||
hdf5_path = store.hdf5_path_for(serial, path.name)
|
||||
hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
|
||||
hdf5_action = "kept"
|
||||
need_h5 = (
|
||||
not args.skip_hdf5
|
||||
and (args.force or not hdf5_path.exists() or sidecar_stale)
|
||||
and has_samples
|
||||
)
|
||||
if not has_samples and not args.skip_hdf5:
|
||||
hdf5_action = "skipped-undecodable"
|
||||
if need_h5:
|
||||
if args.dry_run:
|
||||
hdf5_action = "would (re)write"
|
||||
else:
|
||||
try:
|
||||
event_hdf5.write_event_hdf5(
|
||||
hdf5_path, ev,
|
||||
serial=serial,
|
||||
geo_range="normal",
|
||||
source_kind=source_kind,
|
||||
)
|
||||
hdf5_action = "rewrote" if hdf5_filename else "wrote"  # hdf5_filename still holds the pre-write state
|
||||
hdf5_filename = hdf5_path.name
|
||||
except Exception as exc:
|
||||
log.warning("HDF5 write failed for %s: %s", path.name, exc)
|
||||
hdf5_action = "FAILED"
|
||||
|
||||
if args.dry_run:
|
||||
print(f" [DRY ] would write {sidecar_path.name} "
|
||||
f"+ .h5 ({hdf5_action}) source={source_kind}")
|
||||
written += 1
|
||||
continue
|
||||
|
||||
event_file_io.write_sidecar(sidecar_path, sidecar)
|
||||
|
||||
# Best-effort: keep the SQL row's sidecar_filename in sync
|
||||
# by upserting via insert_events (it dedups on serial+ts).
|
||||
try:
|
||||
db.insert_events(
|
||||
[ev], serial=serial,
|
||||
waveform_records=(
|
||||
{ev._waveform_key.hex(): {
|
||||
"filename": path.name,
|
||||
"filesize": path.stat().st_size,
|
||||
"a5_pickle_filename": a5_filename,
|
||||
"sidecar_filename": sidecar_path.name,
|
||||
}}
|
||||
if ev._waveform_key else None
|
||||
),
|
||||
device_family="series3",
|
||||
)
|
||||
except Exception as exc:
|
||||
log.warning("DB upsert failed for %s: %s", path.name, exc)
|
||||
|
||||
print(f" [OK ] {path.name} → {sidecar_path.name} "
|
||||
f"+ h5 ({hdf5_action}) source={source_kind}")
|
||||
written += 1
|
||||
|
||||
except Exception as exc:
|
||||
log.error("backfill failed for %s: %s", path, exc, exc_info=args.verbose)
|
||||
errors += 1
|
||||
|
||||
print(f"\nDone. written={written} skipped(uptodate)={skipped} errors={errors}")
|
||||
return 0 if errors == 0 else 1
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
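The two-part freshness check in the loop above (sha256 match plus `tool_version` comparison) can be sketched as a standalone function. This is a minimal illustration rather than the script's own code: `is_fresh`, `version_tuple`, and the sample values are hypothetical, and the sidecar layout (`blastware.sha256`, `source.tool_version`) is taken from the comments in the script.

```python
def version_tuple(s):
    """Parse 'X.Y.Z' into a tuple of ints; anything unparseable sorts lowest."""
    try:
        return tuple(int(p) for p in str(s).split(".")[:3])
    except ValueError:
        return (0, 0, 0)


def is_fresh(sidecar, bw_sha256, current_version):
    """Hypothetical helper: True only when both freshness conditions hold."""
    # Part 1: the sidecar must describe this exact BW file.
    sha_ok = sidecar.get("blastware", {}).get("sha256") == bw_sha256
    # Part 2: the sidecar must come from a build at least as new as the current one.
    stamped = sidecar.get("source", {}).get("tool_version", "")
    ver_ok = version_tuple(stamped) >= version_tuple(current_version)
    return sha_ok and ver_ok


# Sample sidecar with made-up values.
sidecar = {"blastware": {"sha256": "abc"}, "source": {"tool_version": "0.9.0"}}
print(is_fresh(sidecar, "abc", "0.9.0"))   # same file, same version
print(is_fresh(sidecar, "abc", "0.10.0"))  # same file, older stamp
print(is_fresh(sidecar, "def", "0.9.0"))   # different file
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.9.0" sorts after "0.10.0", which would wrongly treat the older sidecar as current.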
||||
@@ -1,100 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Fire-and-forget Stop Monitoring loop — for wedged or constantly-triggering units.
|
||||
#
|
||||
# Hammers POST /device/stop_monitoring_blind in a tight loop. The endpoint
|
||||
# opens TCP, dumps SESSION_RESET + a few copies of the SUB 0x97 frame, and
|
||||
# closes — without ever reading an S3 response. Each TCP-won attempt is
|
||||
# ~50ms of wire activity instead of the multi-frame handshake the regular
|
||||
# rescue endpoint does, so windows that are too small for the full rescue
|
||||
# can still land a stop-monitoring command.
|
||||
#
|
||||
# Usage:
|
||||
# ./blind_stop.sh <host> [tcp_port]
|
||||
#
|
||||
# Env:
|
||||
# SFM_BASE_URL Default: http://localhost:8200 (SFM direct).
|
||||
# Set to http://localhost:8001/api/sfm to route through
|
||||
# Terra-View's proxy.
|
||||
# MAX_ATTEMPTS Default: 600
|
||||
# SLEEP_S Default: 0 (no backoff — hammer it)
|
||||
# MAX_TIME_S Default: 15
|
||||
# CONNECT_TIMEOUT Default: 5
|
||||
# REPEAT Frames per TCP session (default 3 — increases hit rate
|
||||
# if the device is busy reading its own buffer).
|
||||
# STOP_ON_OK Default: 1. Set to 0 to keep hammering indefinitely
|
||||
# even after successful sends (every 503 means the device
|
||||
# is in *another* session, every 200 means our bytes got
|
||||
# through — but the device may not have processed them).
|
||||
|
||||
set -u
|
||||
|
||||
host="${1:-}"
|
||||
tcp_port="${2:-9034}"
|
||||
if [[ -z "$host" ]]; then
|
||||
echo "usage: $0 <host> [tcp_port]" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||
max_attempts="${MAX_ATTEMPTS:-600}"
|
||||
sleep_s="${SLEEP_S:-0}"
|
||||
max_time_s="${MAX_TIME_S:-15}"
|
||||
connect_timeout="${CONNECT_TIMEOUT:-5}"
|
||||
repeat="${REPEAT:-3}"
|
||||
stop_on_ok="${STOP_ON_OK:-1}"
|
||||
|
||||
url="${base}/device/stop_monitoring_blind?host=${host}&tcp_port=${tcp_port}&connect_timeout=${connect_timeout}&repeat=${repeat}"
|
||||
|
||||
echo "blind_stop: target ${host}:${tcp_port} connect_timeout=${connect_timeout}s repeat=${repeat}"
|
||||
echo "blind_stop: POST ${url}"
|
||||
echo "blind_stop: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
|
||||
echo "blind_stop: stop_on_ok=${stop_on_ok}"
|
||||
echo
|
||||
|
||||
ok_count=0
|
||||
busy_count=0
|
||||
err_count=0
|
||||
started=$(date +%s)
|
||||
|
||||
for ((i=1; i<=max_attempts; i++)); do
|
||||
printf "[%4d] %s " "$i" "$(date +%H:%M:%S)"
|
||||
http_code=$(curl -sS -o /tmp/blind_resp.$$ -w "%{http_code}" \
|
||||
--max-time "$max_time_s" \
|
||||
-X POST "$url" || true)
|
||||
body=$(cat /tmp/blind_resp.$$ 2>/dev/null || true)
|
||||
rm -f /tmp/blind_resp.$$
|
||||
|
||||
case "$http_code" in
|
||||
200|201)
|
||||
ok_count=$((ok_count + 1))
|
||||
echo "SENT $body"
|
||||
if [[ "$stop_on_ok" == "1" ]]; then
|
||||
elapsed=$(( $(date +%s) - started ))
|
||||
echo
|
||||
echo "blind_stop: success after ${i} attempts (${elapsed}s). ok=${ok_count} busy=${busy_count} err=${err_count}"
|
||||
echo "blind_stop: NEXT — wait ~10s, then try the full rescue:"
|
||||
echo " /home/serversdown/seismo-relay/scripts/rescue_device.sh ${host} ${tcp_port}"
|
||||
exit 0
|
||||
fi
|
||||
;;
|
||||
503)
|
||||
busy_count=$((busy_count + 1))
|
||||
echo "busy (503)"
|
||||
;;
|
||||
000)
|
||||
err_count=$((err_count + 1))
|
||||
echo "curl error"
|
||||
;;
|
||||
*)
|
||||
err_count=$((err_count + 1))
|
||||
echo "HTTP $http_code $body" | head -c 400
|
||||
echo
|
||||
;;
|
||||
esac
|
||||
[[ "$sleep_s" != "0" ]] && sleep "$sleep_s"
|
||||
done
|
||||
|
||||
elapsed=$(( $(date +%s) - started ))
|
||||
echo
|
||||
echo "blind_stop: gave up after ${max_attempts} attempts (${elapsed}s). ok=${ok_count} busy=${busy_count} err=${err_count}" >&2
|
||||
exit 1
|
||||
@@ -1,185 +0,0 @@
|
||||
"""
|
||||
scripts/check_bw_report_preservation.py — verify that running backfill_sidecars
|
||||
doesn't wipe the `bw_report` block from sidecars that already had one.
|
||||
|
||||
Two-step workflow:
|
||||
|
||||
# Before running backfill — capture a baseline snapshot:
|
||||
python scripts/check_bw_report_preservation.py snapshot \
|
||||
--store-root /path/to/waveforms \
|
||||
--out before.json
|
||||
|
||||
# Run backfill:
|
||||
python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force
|
||||
|
||||
# After backfill — diff against the baseline:
|
||||
python scripts/check_bw_report_preservation.py diff \
|
||||
--store-root /path/to/waveforms \
|
||||
--baseline before.json
|
||||
|
||||
The diff classifies every sidecar into one of:
|
||||
|
||||
PRESERVED had bw_report before, has same hash now ← GOOD
|
||||
CHANGED had bw_report before, has different hash now ← suspicious
|
||||
(backfill should only ever copy the block verbatim)
|
||||
WIPED had bw_report before, doesn't now ← BUG — data loss
|
||||
STILL_MISSING didn't have bw_report before, still doesn't ← expected
|
||||
NEW didn't have bw_report before, has one now
|
||||
(only possible if a re-ingest happened between snapshots;
|
||||
shouldn't happen during backfill)
|
||||
REMOVED sidecar existed in baseline, file is gone now
|
||||
ADDED sidecar didn't exist in baseline, exists now
|
||||
|
||||
Exit code is 0 if no WIPED or CHANGED entries are found, 1 otherwise.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import hashlib
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
# Allow running from the repo root without installation.
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
||||
|
||||
from minimateplus import event_file_io
|
||||
|
||||
|
||||
def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
|
||||
"""Canonical-JSON hash of the bw_report block, or None if absent."""
|
||||
br = sidecar_data.get("bw_report")
|
||||
if not br:
|
||||
return None
|
||||
# sort_keys for stable hashing across dict-ordering differences
|
||||
blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
|
||||
return hashlib.sha256(blob.encode()).hexdigest()
|
||||
|
||||
|
||||
def _scan_store(store_root: Path) -> dict:
|
||||
"""Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.
|
||||
|
||||
Relpath is `<serial>/<filename>` — stable across machines/snapshots.
|
||||
"""
|
||||
out: dict[str, Optional[str]] = {}
|
||||
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
|
||||
for sidecar in sorted(serial_dir.glob("*.sfm.json")):
|
||||
relpath = f"{serial_dir.name}/{sidecar.name}"
|
||||
try:
|
||||
data = event_file_io.read_sidecar(sidecar)
|
||||
except Exception as exc:
|
||||
print(f" WARN: failed to read {relpath}: {exc}", file=sys.stderr)
|
||||
continue
|
||||
out[relpath] = _bw_report_hash(data)
|
||||
return out
|
||||
|
||||
|
||||
def cmd_snapshot(args) -> int:
|
||||
store_root = Path(args.store_root).expanduser().resolve()
|
||||
if not store_root.exists():
|
||||
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
|
||||
return 2
|
||||
out_path = Path(args.out).expanduser().resolve()
|
||||
|
||||
print(f"Scanning {store_root} …")
|
||||
snapshot = _scan_store(store_root)
|
||||
|
||||
with_bw = sum(1 for v in snapshot.values() if v is not None)
|
||||
without_bw = sum(1 for v in snapshot.values() if v is None)
|
||||
print(f" total sidecars: {len(snapshot)}")
|
||||
print(f" with bw_report: {with_bw}")
|
||||
print(f" without bw_report: {without_bw}")
|
||||
|
||||
out_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w") as f:
|
||||
json.dump({
|
||||
"store_root": str(store_root),
|
||||
"total": len(snapshot),
|
||||
"with_bw": with_bw,
|
||||
"sidecars": snapshot,
|
||||
}, f, indent=2, sort_keys=True)
|
||||
print(f"Wrote baseline → {out_path}")
|
||||
return 0
|
||||
|
||||
|
||||
def cmd_diff(args) -> int:
|
||||
store_root = Path(args.store_root).expanduser().resolve()
|
||||
if not store_root.exists():
|
||||
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
|
||||
return 2
|
||||
baseline_path = Path(args.baseline).expanduser().resolve()
|
||||
if not baseline_path.exists():
|
||||
print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
with open(baseline_path) as f:
|
||||
baseline = json.load(f)
|
||||
before = baseline["sidecars"]
|
||||
print(f"Scanning {store_root} for comparison against {baseline_path.name} …")
|
||||
after = _scan_store(store_root)
|
||||
|
||||
classes = {k: [] for k in (
|
||||
"PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
|
||||
)}
|
||||
all_keys = set(before) | set(after)
|
||||
for key in sorted(all_keys):
|
||||
b = before.get(key, "__MISSING__")
|
||||
a = after.get(key, "__MISSING__")
|
||||
if b == "__MISSING__":
|
||||
classes["ADDED"].append(key)
|
||||
elif a == "__MISSING__":
|
||||
classes["REMOVED"].append(key)
|
||||
elif b is None and a is None:
|
||||
classes["STILL_MISSING"].append(key)
|
||||
elif b is None and a is not None:
|
||||
classes["NEW"].append(key)
|
||||
elif b is not None and a is None:
|
||||
classes["WIPED"].append(key)
|
||||
elif b == a:
|
||||
classes["PRESERVED"].append(key)
|
||||
else:
|
||||
classes["CHANGED"].append(key)
|
||||
|
||||
print()
|
||||
print(f"{'class':16s} {'count':>7s}")
|
||||
print("-" * 24)
|
||||
for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
|
||||
"NEW", "ADDED", "REMOVED"):
|
||||
print(f"{k:16s} {len(classes[k]):>7d}")
|
||||
|
||||
# Show samples of the concerning classes
|
||||
for k in ("WIPED", "CHANGED"):
|
||||
if classes[k]:
|
||||
print(f"\n=== {k} samples (up to 10) ===")
|
||||
for key in classes[k][:10]:
|
||||
print(f" {key}")
|
||||
|
||||
if classes["WIPED"] or classes["CHANGED"]:
|
||||
print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
|
||||
return 1
|
||||
print("\nbw_report preservation looks intact.")
|
||||
return 0
|
||||
|
||||
|
||||
def main(argv=None) -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__)
|
||||
sub = p.add_subparsers(dest="cmd", required=True)
|
||||
|
||||
p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
|
||||
p_snap.add_argument("--store-root", required=True)
|
||||
p_snap.add_argument("--out", required=True, help="output JSON path")
|
||||
p_snap.set_defaults(func=cmd_snapshot)
|
||||
|
||||
p_diff = sub.add_parser("diff", help="diff current store against a baseline")
|
||||
p_diff.add_argument("--store-root", required=True)
|
||||
p_diff.add_argument("--baseline", required=True, help="JSON from `snapshot`")
|
||||
p_diff.set_defaults(func=cmd_diff)
|
||||
|
||||
args = p.parse_args(argv)
|
||||
return args.func(args)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
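The canonical-JSON hashing that `_bw_report_hash` relies on can be tried on its own. The sketch below assumes nothing beyond the standard library; `block_hash` and the sample `bw_report`-like dicts are hypothetical stand-ins, not real sidecar contents.

```python
import hashlib
import json


def block_hash(block):
    """Hypothetical helper: hash a dict via key-sorted, whitespace-free JSON."""
    if not block:
        return None
    blob = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()


# Same content, different key order (made-up values).
a = {"ppv": {"tran": 0.5, "vert": 0.25}, "sample_rate": 1024}
b = {"sample_rate": 1024, "ppv": {"vert": 0.25, "tran": 0.5}}
# One value differs.
c = {"sample_rate": 2048, "ppv": {"vert": 0.25, "tran": 0.5}}

same = block_hash(a) == block_hash(b)
changed = block_hash(a) != block_hash(c)
print(same, changed)
```

Because `sort_keys=True` also applies to nested dicts, a block that is merely re-serialized in a different key order would be classified as PRESERVED, while any change to a value would show up as CHANGED.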
||||
@@ -1,151 +0,0 @@
|
||||
"""
|
||||
scripts/repair_unknown_serials.py — re-attribute events stuck under
|
||||
`serial = 'UNKNOWN'` to their correct serial by decoding the BW filename.
|
||||
|
||||
Why this is needed
|
||||
──────────────────
|
||||
The /db/import/blastware_file endpoint had a bug (fixed in commit a032fa5+1
|
||||
on the ach-report-ingestion branch) where every forwarded event was inserted
|
||||
with serial='UNKNOWN' because the endpoint's `_serial_from_event(ev)` stub
|
||||
returned None and never consulted the BW-filename serial that
|
||||
`WaveformStore.save_imported_bw()` had already decoded.
|
||||
|
||||
Effect on a server that ran a buggy version: every forwarded event's
|
||||
SeismoDb row has `serial='UNKNOWN'`, even though the on-disk waveform
|
||||
store has correctly bucketed the files into `BE<NNNN>/` folders. So
|
||||
the BW binaries / sidecars / HDF5s are fine, but `/db/units` and
|
||||
`/db/events?serial=...` queries don't surface the events.
|
||||
|
||||
This script
|
||||
───────────
|
||||
Walks the events table looking for rows with `serial='UNKNOWN'` and
|
||||
re-attributes each one to the serial decoded from its
|
||||
`blastware_filename` column. If the row's serial would collide with
|
||||
an existing row (already-correct duplicate from a later re-forward),
|
||||
the UNKNOWN row is deleted. Otherwise the row's `serial` column is
|
||||
updated in-place.
|
||||
|
||||
Idempotent: re-running after a successful repair finds zero matching
|
||||
rows and exits cleanly.
|
||||
|
||||
Usage
|
||||
─────
|
||||
# Dry-run (default): print what would change, don't touch the DB
|
||||
python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db
|
||||
|
||||
# Apply the repair
|
||||
python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db --apply
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import sqlite3
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Reach into sfm.waveform_store for the serial decoder. This script
|
||||
# is run from the repo root via `python -m scripts.repair_unknown_serials`.
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
||||
from sfm.waveform_store import _serial_from_bw_filename
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
p = argparse.ArgumentParser(
|
||||
description="Re-attribute events stuck under serial='UNKNOWN'.",
|
||||
)
|
||||
p.add_argument(
|
||||
"--db", required=True, type=Path,
|
||||
help="Path to seismo_relay.db (e.g. bridges/captures/seismo_relay.db)",
|
||||
)
|
||||
p.add_argument(
|
||||
"--apply", action="store_true",
|
||||
help="Apply the repair. Without this flag the script runs in "
|
||||
"dry-run mode and only reports what would change.",
|
||||
)
|
||||
args = p.parse_args(argv)
|
||||
|
||||
if not args.db.exists():
|
||||
print(f"DB not found: {args.db}", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
conn = sqlite3.connect(str(args.db))
|
||||
conn.row_factory = sqlite3.Row
|
||||
|
||||
rows = list(conn.execute(
|
||||
"SELECT id, serial, timestamp, blastware_filename "
|
||||
" FROM events "
|
||||
" WHERE serial = 'UNKNOWN' "
|
||||
" ORDER BY timestamp",
|
||||
))
|
||||
print(f"Found {len(rows)} UNKNOWN-serial rows in events table.")
|
||||
if not rows:
|
||||
return 0
|
||||
|
||||
updated = 0
|
||||
deleted = 0
|
||||
unresolved = 0
|
||||
by_serial: dict[str, int] = {}
|
||||
|
||||
for row in rows:
|
||||
rid = row["id"]
|
||||
ts = row["timestamp"]
|
||||
bw_name = row["blastware_filename"]
|
||||
new_serial = _serial_from_bw_filename(bw_name) if bw_name else None
|
||||
if not new_serial:
|
||||
print(f" ⚠ id={rid[:8]} ts={ts} filename={bw_name!r} — "
|
||||
f"cannot decode serial from filename; skipping")
|
||||
unresolved += 1
|
||||
continue
|
||||
|
||||
# Check for an existing row at the target (serial, timestamp).
|
||||
existing = conn.execute(
|
||||
"SELECT id FROM events WHERE serial = ? AND timestamp = ?",
|
||||
(new_serial, ts),
|
||||
).fetchone()
|
||||
action: str
|
||||
if existing is None:
|
||||
# Safe to UPDATE in place.
|
||||
if args.apply:
|
||||
conn.execute(
|
||||
"UPDATE events SET serial = ? WHERE id = ?",
|
||||
(new_serial, rid),
|
||||
)
|
||||
action = "UPDATE"
|
||||
updated += 1
|
||||
else:
|
||||
# A correctly-attributed row already exists. Drop the
|
||||
# UNKNOWN duplicate.
|
||||
if args.apply:
|
||||
conn.execute("DELETE FROM events WHERE id = ?", (rid,))
|
||||
action = "DELETE (dup)"
|
||||
deleted += 1
|
||||
|
||||
by_serial[new_serial] = by_serial.get(new_serial, 0) + 1
|
||||
print(f" {action:14s} id={rid[:8]} ts={ts} "
|
||||
f"filename={bw_name} → {new_serial}")
|
||||
|
||||
if args.apply:
|
||||
conn.commit()
|
||||
conn.close()
|
||||
|
||||
print()
|
||||
print("Summary:")
|
||||
print(f" UNKNOWN rows scanned: {len(rows)}")
|
||||
print(f" Updated to real serial: {updated}")
|
||||
print(f" Deleted (duplicate of an ")
|
||||
print(f" already-correct row): {deleted}")
|
||||
print(f" Unresolved (bad filename): {unresolved}")
|
||||
print()
|
||||
if by_serial:
|
||||
print(f"Per-serial breakdown of repaired rows:")
|
||||
for serial, count in sorted(by_serial.items()):
|
||||
print(f" {serial:12s} {count}")
|
||||
if not args.apply:
|
||||
print()
|
||||
print("(dry-run — re-run with --apply to commit)")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
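The update-or-delete decision the repair script makes for each UNKNOWN row can be sketched on its own, without touching a real database. This is a minimal illustration against an in-memory SQLite table, not the script's actual code: `plan_repairs`, `toy_decoder`, and the sample filenames are hypothetical, and `toy_decoder` only stands in for `_serial_from_bw_filename`, whose real decoding rules are not shown in this diff.

```python
import sqlite3
from typing import Callable, Optional


def plan_repairs(
    conn: sqlite3.Connection,
    decode_serial: Callable[[str], Optional[str]],
) -> list[tuple[str, str, Optional[str]]]:
    """Return (action, row_id, new_serial) per UNKNOWN row without writing anything."""
    plan = []
    rows = conn.execute(
        "SELECT id, timestamp, blastware_filename FROM events "
        "WHERE serial = 'UNKNOWN' ORDER BY timestamp"
    ).fetchall()
    for rid, ts, bw_name in rows:
        new_serial = decode_serial(bw_name) if bw_name else None
        if not new_serial:
            # No filename, or the filename does not decode: leave the row alone.
            plan.append(("SKIP", rid, None))
            continue
        # A row already sitting at (new_serial, ts) means the UNKNOWN row is a duplicate.
        existing = conn.execute(
            "SELECT 1 FROM events WHERE serial = ? AND timestamp = ?",
            (new_serial, ts),
        ).fetchone()
        plan.append(("UPDATE" if existing is None else "DELETE", rid, new_serial))
    return plan


def toy_decoder(name: str) -> Optional[str]:
    # Hypothetical rule for this sketch only: the first six characters are the
    # serial when the name starts with "BE".
    return name[:6] if name.startswith("BE") and len(name) >= 6 else None


def demo() -> list[tuple[str, str, Optional[str]]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id TEXT PRIMARY KEY, serial TEXT, timestamp TEXT, "
        "blastware_filename TEXT, UNIQUE(serial, timestamp))"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        [
            ("a", "UNKNOWN", "2024-01-01T00:00:00", "BE1234_x.AB0"),
            ("b", "UNKNOWN", "2024-01-02T00:00:00", "BE1234_y.AB0"),
            ("c", "BE1234", "2024-01-02T00:00:00", "BE1234_y.AB0"),
            ("d", "UNKNOWN", "2024-01-03T00:00:00", None),
        ],
    )
    return plan_repairs(conn, toy_decoder)
```

Separating the plan from the write step mirrors the script's dry-run default: the same plan can be printed for review or applied inside one transaction.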
|
||||
@@ -1,99 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Rescue an uncooperative MiniMate that's busy with another ACH session.
|
||||
#
|
||||
# Hammers POST /device/rescue in a tight loop with a short timeout. When the
|
||||
# device is in an ACH session our SYN either gets refused or silently dropped
|
||||
# (5s connect timeout inside the endpoint) and we retry immediately. When the
|
||||
# device is between sessions, our TCP wins, the endpoint disables Auto Call
|
||||
# Home and erases events inside the same session, then returns success.
|
||||
#
|
||||
# Usage:
|
||||
# ./rescue_device.sh <host> [tcp_port] [--no-erase] [--no-disable-ach]
|
||||
#
|
||||
# Examples:
|
||||
# ./rescue_device.sh 166.246.130.1 9034
|
||||
# ./rescue_device.sh 166.246.130.1 9034 --no-erase # just silence it
|
||||
#
|
||||
# Environment:
|
||||
# SFM_BASE_URL Defaults to http://localhost:8200 (SFM direct).
|
||||
# Set to http://localhost:8001/api/sfm to route through
|
||||
# Terra-View's proxy. Direct mode avoids the proxy's
|
||||
# 60s timeout, which matters for long-running endpoints.
|
||||
# MAX_ATTEMPTS Cap on retries (default 600 ≈ 30+ min).
|
||||
# SLEEP_S Backoff between attempts (default 1).
|
||||
# MAX_TIME_S Per-request timeout (default 60).
|
||||
# CONNECT_TIMEOUT TCP connect timeout (default 5).
|
||||
# RECV_TIMEOUT Per-frame S3 recv timeout (default 5). If POLL or any
|
||||
# subsequent frame doesn't respond within this window, the
|
||||
# rescue endpoint bails and this script retries.
|
||||
|
||||
set -u
|
||||
|
||||
host="${1:-}"
|
||||
tcp_port="${2:-9034}"
|
||||
shift 2 2>/dev/null || shift $# 2>/dev/null
|
||||
|
||||
if [[ -z "$host" ]]; then
|
||||
echo "usage: $0 <host> [tcp_port] [--no-erase] [--no-disable-ach]" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
disable_ach="true"
|
||||
erase="true"
|
||||
for arg in "$@"; do
|
||||
case "$arg" in
|
||||
--no-erase) erase="false" ;;
|
||||
--no-disable-ach) disable_ach="false" ;;
|
||||
*) echo "unknown flag: $arg" >&2; exit 2 ;;
|
||||
esac
|
||||
done
|
||||
|
||||
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||
max_attempts="${MAX_ATTEMPTS:-600}"
|
||||
sleep_s="${SLEEP_S:-1}"
|
||||
max_time_s="${MAX_TIME_S:-60}"
|
||||
connect_timeout="${CONNECT_TIMEOUT:-5}"
|
||||
recv_timeout="${RECV_TIMEOUT:-5}"
|
||||
|
||||
url="${base}/device/rescue?host=${host}&tcp_port=${tcp_port}&disable_ach=${disable_ach}&erase=${erase}&connect_timeout=${connect_timeout}&recv_timeout=${recv_timeout}"
|
||||
|
||||
echo "rescue: target ${host}:${tcp_port} disable_ach=${disable_ach} erase=${erase}"
|
||||
echo "rescue: connect_timeout=${connect_timeout}s recv_timeout=${recv_timeout}s"
|
||||
echo "rescue: POST ${url}"
|
||||
echo "rescue: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
|
||||
echo
|
||||
|
||||
started=$(date +%s)
|
||||
for ((i=1; i<=max_attempts; i++)); do
|
||||
printf "[%3d] %s " "$i" "$(date +%H:%M:%S)"
|
||||
http_code=$(curl -sS -o /tmp/rescue_resp.$$ -w "%{http_code}" \
|
||||
--max-time "$max_time_s" \
|
||||
-X POST "$url" || echo "000")
|
||||
body=$(cat /tmp/rescue_resp.$$ 2>/dev/null || true)
|
||||
rm -f /tmp/rescue_resp.$$
|
||||
|
||||
case "$http_code" in
|
||||
200|201)
|
||||
elapsed=$(( $(date +%s) - started ))
|
||||
echo "OK (${elapsed}s total)"
|
||||
echo "$body"
|
||||
exit 0
|
||||
;;
|
||||
503)
|
||||
# Connection refused / timeout — device busy in another session. Retry fast.
|
||||
echo "busy (503)"
|
||||
;;
|
||||
000)
|
||||
echo "curl error (network)"
|
||||
;;
|
||||
*)
|
||||
echo "HTTP $http_code"
|
||||
echo " $body" | head -c 400
|
||||
echo
|
||||
;;
|
||||
esac
|
||||
sleep "$sleep_s"
|
||||
done
|
||||
|
||||
echo "rescue: gave up after ${max_attempts} attempts" >&2
|
||||
exit 1
|
||||
@@ -1,44 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Hold a single TCP session open and drip stop-monitoring frames at a slow
|
||||
# rate, so the device's UART RX FIFO has time to drain between sends.
|
||||
#
|
||||
# Use when high-rate spam isn't landing — typically because the device's
|
||||
# firmware is too busy to drain its serial buffer fast enough and bytes
|
||||
# are being lost to UART overrun.
|
||||
#
|
||||
# Usage:
|
||||
# ./slow_drip.sh <host> [tcp_port] [duration_s]
|
||||
#
|
||||
# Env:
|
||||
# DURATION Default: 120 (seconds; arg 3 overrides). Clamped 1..600.
|
||||
# INTERVAL Seconds between drip sends (default 3). Lower = more
|
||||
# aggressive, more risk of FIFO overrun. Higher = safer
|
||||
# but fewer total drips per duration.
|
||||
# CONNECT_TIMEOUT Default: 5
|
||||
# SFM_BASE_URL Default: http://localhost:8200 (SFM direct).
|
||||
|
||||
set -u
|
||||
|
||||
host="${1:-}"
|
||||
tcp_port="${2:-9034}"
|
||||
duration="${3:-${DURATION:-120}}"
|
||||
if [[ -z "$host" ]]; then
|
||||
echo "usage: $0 <host> [tcp_port] [duration_s]" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||
interval="${INTERVAL:-3}"
|
||||
connect_timeout="${CONNECT_TIMEOUT:-5}"
|
||||
|
||||
url="${base}/device/stop_monitoring_slow_drip?host=${host}&tcp_port=${tcp_port}&duration_s=${duration}&interval_s=${interval}&connect_timeout=${connect_timeout}"
|
||||
|
||||
echo "slow_drip: target ${host}:${tcp_port} duration=${duration}s interval=${interval}s connect_timeout=${connect_timeout}s"
|
||||
echo "slow_drip: POST ${url}"
|
||||
echo
|
||||
|
||||
# Give curl enough slack to wait out the duration plus a buffer
|
||||
max_time=$(awk -v d="$duration" 'BEGIN { printf "%d", d + 30 }')
|
||||
|
||||
curl -sS --max-time "$max_time" -X POST "$url"
|
||||
echo
|
||||
@@ -1,48 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Hammer a device with blind stop-monitoring sessions as fast as possible.
|
||||
# Single HTTP call kicks off the burst inside SFM (no per-attempt HTTP
|
||||
# overhead). Default: 10 seconds at ~500 ms per attempt, so ~20 attempts total.
|
||||
#
|
||||
# Usage:
|
||||
# ./spam_stop.sh <host> [tcp_port] [duration_s]
|
||||
#
|
||||
# Examples:
|
||||
# ./spam_stop.sh 166.246.130.1 # 10s burst
|
||||
# ./spam_stop.sh 166.246.130.1 9034 30 # 30s burst
|
||||
# DURATION=60 CONNECT_TIMEOUT=0.2 ./spam_stop.sh 166.246.130.1
|
||||
#
|
||||
# Env:
|
||||
# SFM_BASE_URL Default: http://localhost:8200 (SFM direct).
|
||||
# Set to http://localhost:8001/api/sfm to route through
|
||||
# Terra-View's proxy — but note the proxy has a 60s
|
||||
# timeout, so long bursts need direct mode.
|
||||
# DURATION Default: 10 (seconds; arg 3 overrides)
|
||||
# CONNECT_TIMEOUT Default: 0.5 (seconds)
|
||||
# REPEAT Default: 3 (stop frames per TCP session)
|
||||
|
||||
set -u
|
||||
|
||||
host="${1:-}"
|
||||
tcp_port="${2:-9034}"
|
||||
duration="${3:-${DURATION:-10}}"
|
||||
|
||||
if [[ -z "$host" ]]; then
|
||||
echo "usage: $0 <host> [tcp_port] [duration_s]" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||
connect_timeout="${CONNECT_TIMEOUT:-0.5}"
|
||||
repeat="${REPEAT:-3}"
|
||||
|
||||
url="${base}/device/stop_monitoring_spam?host=${host}&tcp_port=${tcp_port}&duration_s=${duration}&connect_timeout=${connect_timeout}&repeat=${repeat}"
|
||||
|
||||
echo "spam_stop: target ${host}:${tcp_port} duration=${duration}s connect_timeout=${connect_timeout}s repeat=${repeat}"
|
||||
echo "spam_stop: POST ${url}"
|
||||
echo
|
||||
|
||||
# Give curl enough slack to wait out the duration plus a buffer
|
||||
max_time=$(awk -v d="$duration" 'BEGIN { printf "%d", d + 10 }')
|
||||
|
||||
curl -sS --max-time "$max_time" -X POST "$url"
|
||||
echo
|
||||
@@ -1,58 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# Passive monitor for a misbehaving unit. Every INTERVAL seconds, attempts
|
||||
# a single short TCP probe + storage_range read and logs the result. Designed
|
||||
# to run unattended for hours/days and tell you when the unit comes back.
|
||||
#
|
||||
# Usage:
|
||||
# ./watch_unit.sh <host> [tcp_port]
|
||||
#
|
||||
# Env:
|
||||
# INTERVAL Seconds between checks (default 300 = 5 min)
|
||||
# LOG_FILE Append results here (default /tmp/watch_<host>.log)
|
||||
# SFM_BASE_URL Default: http://localhost:8200
|
||||
|
||||
set -u
|
||||
|
||||
host="${1:-}"
|
||||
tcp_port="${2:-9034}"
|
||||
if [[ -z "$host" ]]; then
|
||||
echo "usage: $0 <host> [tcp_port]" >&2
|
||||
exit 2
|
||||
fi
|
||||
|
||||
interval="${INTERVAL:-300}"
|
||||
log_file="${LOG_FILE:-/tmp/watch_${host}.log}"
|
||||
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||
|
||||
url="${base}/device/events/storage_range?host=${host}&tcp_port=${tcp_port}"
|
||||
|
||||
echo "watch_unit: target ${host}:${tcp_port} interval=${interval}s log=${log_file}"
|
||||
echo "watch_unit: Ctrl-C to stop"
|
||||
|
||||
while true; do
|
||||
ts=$(date '+%Y-%m-%d %H:%M:%S')
|
||||
http_code=$(curl -sS -o /tmp/watch_resp.$$ -w "%{http_code}" \
|
||||
--max-time 20 "$url" || echo "000")
|
||||
body=$(cat /tmp/watch_resp.$$ 2>/dev/null || true)
|
||||
rm -f /tmp/watch_resp.$$
|
||||
|
||||
case "$http_code" in
|
||||
200|201)
|
||||
# Strip the raw_hex for readability
|
||||
summary=$(echo "$body" | sed 's/"raw_hex":"[^"]*",*//; s/,*$//' | head -c 200)
|
||||
echo "$ts REACHABLE $summary" | tee -a "$log_file"
|
||||
;;
|
||||
502|503)
|
||||
err=$(echo "$body" | head -c 150)
|
||||
echo "$ts ERROR_$http_code $err" | tee -a "$log_file"
|
||||
;;
|
||||
000)
|
||||
echo "$ts CURL_FAIL (network/timeout)" | tee -a "$log_file"
|
||||
;;
|
||||
*)
|
||||
echo "$ts HTTP_$http_code $(echo "$body" | head -c 150)" | tee -a "$log_file"
|
||||
;;
|
||||
esac
|
||||
|
||||
sleep "$interval"
|
||||
done
|
||||
+110
-802
File diff suppressed because it is too large
+11
-140
@@ -83,24 +83,13 @@ class CachedEvent(Base):
|
||||
|
||||
Events are immutable once recorded on the device; once we have an event in
|
||||
the cache it never needs to be re-downloaded unless explicitly requested.
|
||||
|
||||
The two extra columns `waveform_key` and `event_timestamp` are an
|
||||
integrity stamp: when set_events() / set_waveform() are called with a
|
||||
different (waveform_key, event_timestamp) for the same (conn_key, index),
|
||||
we know the device was erased and re-recorded — the cached row no longer
|
||||
refers to the same physical event and the entire device's cache is
|
||||
flushed before the new entry is written. This catches the post-erase
|
||||
key-reuse bug where the device's first new event (key 01110000) collides
|
||||
with the first event we previously downloaded.
|
||||
"""
|
||||
__tablename__ = "cached_events"
|
||||
|
||||
conn_key = sa.Column(sa.String, primary_key=True)
|
||||
index = sa.Column(sa.Integer, primary_key=True)
|
||||
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
|
||||
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
|
||||
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
|
||||
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
|
||||
conn_key = sa.Column(sa.String, primary_key=True)
|
||||
index = sa.Column(sa.Integer, primary_key=True)
|
||||
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
|
||||
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
|
||||
|
||||
|
||||
class CachedWaveform(Base):
|
||||
@@ -108,18 +97,14 @@ class CachedWaveform(Base):
|
||||
Full raw ADC waveform for a single event (SUB 5A full download).
|
||||
|
||||
These are large (up to several MB) and expensive to fetch over cellular.
|
||||
Once downloaded they are immutable and cached permanently — but the
|
||||
cache row is invalidated when the device is erased and a new event lands
|
||||
at the same index (see CachedEvent docstring).
|
||||
Once downloaded they are immutable and cached permanently.
|
||||
"""
|
||||
__tablename__ = "cached_waveforms"
|
||||
|
||||
conn_key = sa.Column(sa.String, primary_key=True)
|
||||
index = sa.Column(sa.Integer, primary_key=True)
|
||||
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
|
||||
cached_at = sa.Column(sa.Float, nullable=False)
|
||||
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
|
||||
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
|
||||
conn_key = sa.Column(sa.String, primary_key=True)
|
||||
index = sa.Column(sa.Integer, primary_key=True)
|
||||
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
|
||||
cached_at = sa.Column(sa.Float, nullable=False)
|
||||
|
||||
|
||||
class CachedMonitorStatus(Base):
|
||||
@@ -164,23 +149,6 @@ class SFMCache:
|
||||
engine = sa.create_engine(url, connect_args={"check_same_thread": False})
|
||||
Base.metadata.create_all(engine)
|
||||
self._Session = orm.sessionmaker(bind=engine)
|
||||
# In-place schema migration: add the (waveform_key, event_timestamp)
|
||||
# integrity-stamp columns to legacy cache DBs that predate the
|
||||
# post-erase eviction logic. ALTER TABLE ADD COLUMN is idempotent
|
||||
# via the column-presence check below.
|
||||
with engine.begin() as conn:
|
||||
for table in ("cached_events", "cached_waveforms"):
|
||||
cols = {
|
||||
r[1]
|
||||
for r in conn.exec_driver_sql(f"PRAGMA table_info({table})").fetchall()
|
||||
}
|
||||
for new_col, ddl in (
|
||||
("waveform_key", "TEXT"),
|
||||
("event_timestamp", "TEXT"),
|
||||
):
|
||||
if new_col not in cols:
|
||||
log.info("cache schema: %s ADD COLUMN %s %s", table, new_col, ddl)
|
||||
conn.exec_driver_sql(f"ALTER TABLE {table} ADD COLUMN {new_col} {ddl}")
|
||||
log.info("SFM cache opened: %s", db_path)
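The column-presence check that makes `ALTER TABLE ... ADD COLUMN` safe to re-run can be shown in isolation. The sketch below uses the standard-library `sqlite3` module rather than SQLAlchemy; `ensure_columns` and the two-column demo table are hypothetical names chosen for illustration. Table and column names are interpolated into SQL, so this pattern assumes they are trusted constants, never user input.

```python
import sqlite3


def ensure_columns(conn, table, columns):
    """Add each (name, ddl_type) pair to `table` if missing; return the names added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, ddl in columns:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {ddl}")
            added.append(name)
    return added


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for a legacy cache table that predates the new columns.
    conn.execute("CREATE TABLE cached_events (conn_key TEXT, idx INTEGER)")
    wanted = [("waveform_key", "TEXT"), ("event_timestamp", "TEXT")]
    first = ensure_columns(conn, "cached_events", wanted)
    second = ensure_columns(conn, "cached_events", wanted)  # no-op on re-run
    return first, second
```

The second call returns an empty list, which is what lets a migration like this run on every startup.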
|
||||
|
||||
# ── Connection key ────────────────────────────────────────────────────────
|
||||
@@ -274,91 +242,15 @@ class SFMCache:
|
||||
row = s.get(CachedEvent, (conn_key, index))
|
||||
return json.loads(row.event_json) if row else None
|
||||
|
||||
@staticmethod
|
||||
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
|
||||
"""
|
||||
Extract the (waveform_key_hex, timestamp_iso) integrity stamp from
|
||||
a serialised event dict. Either field may be None if the source
|
||||
Event was missing it; the comparison logic in set_events/set_waveform
|
||||
treats "both sides have a value AND they differ" as the only
|
||||
eviction trigger, so partial data never spuriously flushes cache.
|
||||
"""
|
||||
key = ev.get("waveform_key") or ev.get("_waveform_key")
|
||||
if isinstance(key, (bytes, bytearray)):
|
||||
key = bytes(key).hex()
|
||||
ts = ev.get("timestamp")
|
||||
if isinstance(ts, dict):
|
||||
# _serialise_timestamp returns a dict like {"iso": "...", ...}
|
||||
ts = ts.get("iso") or ts.get("string") or None
|
||||
return (key if isinstance(key, str) else None,
|
||||
ts if isinstance(ts, str) else None)
|
||||
|
||||
def _maybe_flush_on_mismatch(
|
||||
self,
|
||||
s,
|
||||
conn_key: str,
|
||||
index: int,
|
||||
new_key: Optional[str],
|
||||
new_ts: Optional[str],
|
||||
) -> bool:
|
||||
"""
|
||||
Check whether the cached entry at (conn_key, index) has a different
|
||||
(waveform_key, timestamp) than the incoming one. If so, treat it as
|
||||
a post-erase key-reuse signal and flush ALL cached events/waveforms
|
||||
for this device, then return True.
|
||||
Returns False when no flush was needed.
|
||||
"""
|
||||
if not new_key and not new_ts:
|
||||
return False # nothing to compare against
|
||||
existing = s.get(CachedEvent, (conn_key, index))
|
||||
if existing is None:
|
||||
existing = s.get(CachedWaveform, (conn_key, index))
|
||||
if existing is None:
|
||||
return False
|
||||
old_key = existing.waveform_key
|
||||
old_ts = existing.event_timestamp
|
||||
# Only flush when both sides have populated values and they differ.
|
||||
differs = (
|
||||
(new_key and old_key and new_key != old_key)
|
||||
or (new_ts and old_ts and new_ts != old_ts)
|
||||
)
|
||||
if not differs:
|
||||
return False
|
||||
log.warning(
|
||||
"cache: device %s — index %d (key=%s, ts=%s) replaces (key=%s, ts=%s); "
|
||||
"flushing all cached events/waveforms for this device "
|
||||
"(post-erase key reuse detected)",
|
||||
conn_key, index, new_key, new_ts, old_key, old_ts,
|
||||
)
|
||||
s.query(CachedEvent).filter_by(conn_key=conn_key).delete()
|
||||
s.query(CachedWaveform).filter_by(conn_key=conn_key).delete()
|
||||
return True
|
||||
|
||||
def set_events(self, conn_key: str, events: list[dict]) -> None:
|
||||
"""
|
||||
Upsert a list of event dicts. Existing rows are updated; new rows are
|
||||
inserted. This is used to add newly-discovered events to the cache.
|
||||
|
||||
Eviction: if any incoming event has a different (waveform_key,
|
||||
timestamp) than the row currently cached at the same index, we flush
|
||||
the entire device's cache before inserting the new entries. Catches
|
||||
post-erase key reuse where index 0 silently switches identity.
|
||||
"""
|
||||
now = time.time()
|
||||
with self._Session() as s:
|
||||
# Eviction check: scan incoming events for any (index, key, ts)
|
||||
# that conflicts with a cached row. A single conflict triggers
|
||||
# a full device-wide flush so we don't end up with a mixed-era
|
||||
# cache.
|
||||
for ev in events:
|
||||
key, ts = self._event_signature(ev)
|
||||
if self._maybe_flush_on_mismatch(s, conn_key, ev["index"], key, ts):
|
||||
s.commit()
|
||||
break # cache is now empty for this device; carry on
|
||||
|
||||
for ev in events:
|
||||
idx = ev["index"]
|
||||
key, ts = self._event_signature(ev)
|
||||
row = s.get(CachedEvent, (conn_key, idx))
|
||||
if row is None:
|
||||
row = CachedEvent(
|
||||
@@ -366,18 +258,12 @@ class SFMCache:
|
||||
index=idx,
|
||||
event_json=json.dumps(ev),
|
||||
cached_at=now,
|
||||
waveform_key=key,
|
||||
event_timestamp=ts,
|
||||
)
|
||||
s.add(row)
|
||||
log.debug("cached new event %d for %s", idx, conn_key)
|
||||
else:
|
||||
# Refresh in case project_info was backfilled after initial store
|
||||
row.event_json = json.dumps(ev)
|
||||
if key:
|
||||
row.waveform_key = key
|
||||
if ts:
|
||||
row.event_timestamp = ts
|
||||
s.commit()
|
||||
|
||||
# ── Waveforms ─────────────────────────────────────────────────────────────
|
||||
@@ -392,16 +278,8 @@ class SFMCache:
|
||||
return json.loads(row.waveform_json)
|
||||
|
||||
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
|
||||
"""
|
||||
Store a full waveform response dict permanently.
|
||||
|
||||
Like set_events, this checks the (waveform_key, timestamp) signature
|
||||
of the incoming entry against what's currently cached at the same
|
||||
index. A mismatch flushes the entire device's cache before insert.
|
||||
"""
|
||||
key, ts = self._event_signature(waveform)
|
||||
"""Store a full waveform response dict permanently."""
|
||||
with self._Session() as s:
|
||||
self._maybe_flush_on_mismatch(s, conn_key, index, key, ts)
|
||||
row = s.get(CachedWaveform, (conn_key, index))
|
||||
if row is None:
|
||||
row = CachedWaveform(
|
||||
@@ -409,20 +287,13 @@ class SFMCache:
|
||||
index=index,
|
||||
waveform_json=json.dumps(waveform),
|
||||
cached_at=time.time(),
|
||||
waveform_key=key,
|
||||
event_timestamp=ts,
|
||||
)
|
||||
s.add(row)
|
||||
else:
|
||||
row.waveform_json = json.dumps(waveform)
|
||||
row.cached_at = time.time()
|
||||
if key:
|
||||
row.waveform_key = key
|
||||
if ts:
|
||||
row.event_timestamp = ts
|
||||
s.commit()
|
||||
log.debug("cached waveform for %s event %d (key=%s, ts=%s)",
|
||||
conn_key, index, key, ts)
|
||||
log.debug("cached waveform for %s event %d", conn_key, index)
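The eviction rule in the removed `_maybe_flush_on_mismatch` reduces to a small predicate: a flush is warranted only when a field is populated on both sides and the values differ. A minimal sketch of that predicate follows; `signature_differs` is a hypothetical name and the sample timestamps are invented for illustration.

```python
from typing import Optional


def signature_differs(
    old_key: Optional[str],
    old_ts: Optional[str],
    new_key: Optional[str],
    new_ts: Optional[str],
) -> bool:
    """True only when some field has a value on both sides and the values differ."""
    key_conflict = bool(new_key and old_key and new_key != old_key)
    ts_conflict = bool(new_ts and old_ts and new_ts != old_ts)
    return key_conflict or ts_conflict


# Post-erase key reuse: same device key, different timestamp.
reused = signature_differs(
    "01110000", "2024-01-01T00:00:00", "01110000", "2024-02-01T00:00:00"
)
# Legacy row with no stamp yet: nothing to compare, so no flush.
legacy = signature_differs(None, None, "01110000", "2024-02-01T00:00:00")
```

Treating a missing value as "no evidence" rather than "mismatch" is what keeps partially stamped legacy rows from flushing the cache spuriously.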
|
||||
|
||||
# ── Monitor status ────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
+18
-332
@@ -81,11 +81,6 @@ CREATE TABLE IF NOT EXISTS events (
|
||||
sample_rate INTEGER,
|
||||
record_type TEXT, -- "single_shot" | "continuous"
|
||||
false_trigger INTEGER NOT NULL DEFAULT 0, -- 0=no, 1=yes (manual flag)
|
||||
blastware_filename TEXT, -- event file within waveform store; extension is per-event (AB0T encodes timestamp)
|
||||
blastware_filesize INTEGER, -- bytes; NULL if no event file saved
|
||||
a5_pickle_filename TEXT, -- "<filename>.a5.pkl" sidecar
|
||||
sidecar_filename TEXT, -- "<filename>.sfm.json" review/metadata sidecar
|
||||
device_family TEXT, -- "series3" (MiniMate Plus / BW) | "series4" (Micromate / Thor) — drives per-family UI rendering (units, labels)
|
||||
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
|
||||
UNIQUE(serial, timestamp)
|
||||
);
|
||||
@@ -189,63 +184,6 @@ class SeismoDb:
|
||||
""")
|
||||
log.info("_migrate: events table rebuilt OK")
|
||||
|
||||
# Migration 1b: add Blastware-file columns to existing events tables.
|
||||
# New columns are NULLable so old rows just read NULL.
|
||||
existing_cols = {
|
||||
r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()
|
||||
}
|
||||
for col, ddl in (
|
||||
("blastware_filename", "TEXT"),
|
||||
("blastware_filesize", "INTEGER"),
|
||||
("a5_pickle_filename", "TEXT"),
|
||||
("sidecar_filename", "TEXT"),
|
||||
("device_family", "TEXT"),
|
||||
):
|
||||
if col not in existing_cols:
|
||||
log.info("_migrate: events ADD COLUMN %s %s", col, ddl)
|
||||
conn.execute(f"ALTER TABLE events ADD COLUMN {col} {ddl}")
|
||||
|
||||
# Migration 1c: backfill device_family for existing rows by sniffing
|
||||
# the device-native binary filename's extension. Thor (Micromate
|
||||
# Series IV) writes `.IDFH` / `.IDFW`; MiniMate Plus (Series III)
|
||||
# writes `.AB0*` / `.N00` / `.<base36>` Blastware extensions. We do
|
||||
# this here rather than from sidecars so the migration is fully
|
||||
# self-contained (doesn't need the waveform-store root) and runs at
|
||||
# DB-init time. Only fills NULL device_family so re-runs are no-ops.
|
||||
rebackfill = conn.execute(
|
||||
"SELECT COUNT(*) FROM events WHERE device_family IS NULL"
|
||||
).fetchone()
|
||||
if rebackfill and rebackfill[0] > 0:
|
||||
log.info("_migrate: backfilling device_family for %d events", rebackfill[0])
|
||||
# Series IV (Thor IDF) — extension is exactly .IDFH or .IDFW
|
||||
conn.execute(
|
||||
"""
|
||||
UPDATE events
|
||||
SET device_family = 'series4'
|
||||
WHERE device_family IS NULL
|
||||
AND (
|
||||
UPPER(blastware_filename) LIKE '%.IDFH'
|
||||
OR UPPER(blastware_filename) LIKE '%.IDFW'
|
||||
)
|
||||
"""
|
||||
)
|
||||
# Everything else with a filename → Series III (Blastware family)
|
||||
conn.execute(
|
||||
"""
|
||||
UPDATE events
|
||||
SET device_family = 'series3'
|
||||
WHERE device_family IS NULL
|
||||
AND blastware_filename IS NOT NULL
|
||||
"""
|
||||
)
|
||||
# Rows with no filename (e.g. older monitor_log-derived events)
|
||||
# stay NULL — UI handles NULL as "unknown family".
|
||||
remaining = conn.execute(
|
||||
"SELECT COUNT(*) FROM events WHERE device_family IS NULL"
|
||||
).fetchone()[0]
|
||||
log.info("_migrate: device_family backfill complete (remaining NULL=%d)",
|
||||
remaining)
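The two-pass backfill in Migration 1c can be exercised against a throwaway table to see which rows each pass claims. This is a sketch under assumptions: the `events` table here is cut down to three columns, `backfill_device_family` is a hypothetical helper, and the filenames are made up for the demo.

```python
import sqlite3


def backfill_device_family(conn):
    """Fill NULL device_family from the filename extension; return rows still NULL."""
    # Pass 1: Thor IDF extensions mark Series IV.
    conn.execute(
        "UPDATE events SET device_family = 'series4' "
        "WHERE device_family IS NULL AND ("
        "UPPER(blastware_filename) LIKE '%.IDFH' "
        "OR UPPER(blastware_filename) LIKE '%.IDFW')"
    )
    # Pass 2: any other row that has a filename is treated as Series III.
    conn.execute(
        "UPDATE events SET device_family = 'series3' "
        "WHERE device_family IS NULL AND blastware_filename IS NOT NULL"
    )
    return conn.execute(
        "SELECT COUNT(*) FROM events WHERE device_family IS NULL"
    ).fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, "
        "blastware_filename TEXT, device_family TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (blastware_filename, device_family) VALUES (?, ?)",
        [
            ("evt001.idfw", None),       # lower-case extension still matches
            ("M529LK5Q.AB0", None),      # falls through to series3
            (None, None),                # no filename: stays NULL
            ("evt002.IDFH", "series4"),  # already set: untouched
        ],
    )
    remaining = backfill_device_family(conn)
    families = [
        r[0] for r in conn.execute("SELECT device_family FROM events ORDER BY id")
    ]
    return remaining, families
```

Because both statements filter on `device_family IS NULL`, the order of the passes matters and a second run changes nothing.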
|
||||
|
||||
# Migration 2: change monitor_log UNIQUE from (serial, waveform_key) to
|
||||
# (serial, start_time) — same reasoning as events.
|
||||
row = conn.execute(
|
||||
@@ -344,30 +282,12 @@ class SeismoDb:
|
||||
*,
|
||||
serial: str,
|
||||
session_id: Optional[str] = None,
|
||||
waveform_records: Optional[dict[str, dict]] = None,
|
||||
device_family: Optional[str] = None,
|
||||
) -> tuple[int, int]:
|
||||
"""
|
||||
Insert triggered events. Silently skips duplicates (serial+timestamp).
|
||||
Returns (inserted, skipped).
|
||||
|
||||
``waveform_records`` (optional): dict keyed by event waveform_key (hex)
|
||||
whose value is a record from ``WaveformStore.save()``:
|
||||
{"filename": str, "filesize": int, "a5_pickle_filename": str}
|
||||
|
||||
For events whose key is in this dict, the matching columns are
|
||||
populated. If a row with the same (serial, timestamp) already exists
|
||||
(dedup hit), the matching waveform record is upserted onto the
|
||||
existing row so a re-download via the live endpoint refreshes the
|
||||
file metadata.
|
||||
|
||||
``device_family`` (optional): "series3" (MiniMate Plus / Blastware) or
|
||||
"series4" (Micromate / Thor). Drives per-family UI rendering — most
|
||||
importantly the mic-unit convention (psi vs dB(L)). Set on every
|
||||
insert and overwritten on every UPSERT so the latest writer wins.
|
||||
"""
|
||||
inserted = skipped = 0
|
||||
wave_recs = waveform_records or {}
|
||||
with self._connect() as conn:
|
||||
for ev in events:
|
||||
key = ev._waveform_key.hex() if ev._waveform_key else None
|
||||
@@ -387,7 +307,6 @@ class SeismoDb:
|
||||
|
||||
pv = ev.peak_values
|
||||
pi = ev.project_info
|
||||
rec = wave_recs.get(key) or {}
|
||||
|
||||
try:
|
||||
conn.execute(
|
||||
@@ -396,11 +315,8 @@ class SeismoDb:
|
||||
(id, serial, waveform_key, session_id, timestamp,
|
||||
tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv,
|
||||
project, client, operator, sensor_location,
|
||||
sample_rate, record_type,
|
||||
blastware_filename, blastware_filesize,
|
||||
a5_pickle_filename, sidecar_filename,
|
||||
device_family)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
sample_rate, record_type)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
""",
|
||||
(
|
||||
self._new_id(), serial, key, session_id, ts,
|
||||
@@ -415,89 +331,16 @@ class SeismoDb:
|
||||
pi.sensor_location if pi else None,
|
||||
ev.sample_rate,
|
||||
ev.record_type,
|
||||
rec.get("filename"),
|
||||
rec.get("filesize"),
|
||||
rec.get("a5_pickle_filename"),
|
||||
rec.get("sidecar_filename"),
|
||||
device_family,
|
||||
),
|
||||
)
|
||||
inserted += 1
|
||||
except sqlite3.IntegrityError:
|
||||
skipped += 1
|
||||
# UPSERT path: a row for this (serial, timestamp) already
|
||||
# exists. Refresh every device-authoritative field from
|
||||
# the new data so that a re-import with better data (e.g.
|
||||
# a watcher re-forward where the previous attempt missed
|
||||
# the paired BW ASCII report) replaces stale peaks /
|
||||
# project info / sample_rate.
|
||||
#
|
||||
# Preserved (not in this UPDATE):
|
||||
# id, waveform_key, session_id, created_at — immutable / FK
|
||||
# false_trigger — operator review state
|
||||
#
|
||||
# Behaviour change vs prior versions: this UPDATE used
|
||||
# to only refresh filename / filesize / a5_pickle /
|
||||
# sidecar fields. As a result, the first insert's
|
||||
# broken-codec peak values were locked in forever even
|
||||
# if subsequent re-forwards arrived with correct
|
||||
# report-derived values. Now every re-import lifts the
|
||||
# DB row up to whatever the latest Event carries.
|
||||
conn.execute(
|
||||
"""
|
||||
UPDATE events
|
||||
SET tran_ppv = ?,
|
||||
vert_ppv = ?,
|
||||
long_ppv = ?,
|
||||
peak_vector_sum = ?,
|
||||
mic_ppv = ?,
|
||||
project = ?,
|
||||
client = ?,
|
||||
operator = ?,
|
||||
sensor_location = ?,
|
||||
sample_rate = ?,
|
||||
record_type = ?,
|
||||
blastware_filename = ?,
|
||||
blastware_filesize = ?,
|
||||
a5_pickle_filename = ?,
|
||||
sidecar_filename = ?,
|
||||
device_family = COALESCE(?, device_family)
|
||||
WHERE serial = ? AND timestamp = ?
|
||||
""",
|
||||
(
|
||||
pv.tran if pv else None,
|
||||
pv.vert if pv else None,
|
||||
pv.long if pv else None,
|
||||
pv.peak_vector_sum if pv else None,
|
||||
pv.micl if pv else None,
|
||||
pi.project if pi else None,
|
||||
pi.client if pi else None,
|
||||
pi.operator if pi else None,
|
||||
pi.sensor_location if pi else None,
|
||||
ev.sample_rate,
|
||||
ev.record_type,
|
||||
rec.get("filename") if rec else None,
|
||||
rec.get("filesize") if rec else None,
|
||||
rec.get("a5_pickle_filename") if rec else None,
|
||||
rec.get("sidecar_filename") if rec else None,
|
||||
device_family,
|
||||
serial,
|
||||
ts,
|
||||
),
|
||||
)
|
||||
|
||||
log.debug("insert_events serial=%s inserted=%d skipped=%d",
|
||||
serial, inserted, skipped)
|
||||
return inserted, skipped
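The insert-then-refresh behaviour described in the comments above can be sketched in isolation. This is a minimal illustration, not the project's code: the trimmed-down `events` schema, the `upsert_event` helper, and the `"series3"` family value are assumptions made for the example. It shows that a collision on `(serial, timestamp)` refreshes the device-derived column, that `COALESCE(?, device_family)` keeps the stored value when the new one is `None`, and that `false_trigger` is left alone.

```python
import sqlite3


def upsert_event(conn, serial, ts, tran_ppv, device_family=None):
    """Insert a row; on a (serial, timestamp) collision, refresh it instead."""
    try:
        conn.execute(
            "INSERT INTO events (serial, timestamp, tran_ppv, device_family) "
            "VALUES (?, ?, ?, ?)",
            (serial, ts, tran_ppv, device_family),
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Row already exists: overwrite the device-derived peak, but keep the
        # stored device_family when the caller passes None.
        conn.execute(
            "UPDATE events SET tran_ppv = ?, "
            "device_family = COALESCE(?, device_family) "
            "WHERE serial = ? AND timestamp = ?",
            (tran_ppv, device_family, serial, ts),
        )
        return "updated"


# Hypothetical, trimmed-down schema used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (serial TEXT, timestamp TEXT, tran_ppv REAL, "
    "device_family TEXT, false_trigger INTEGER DEFAULT 0, "
    "UNIQUE (serial, timestamp))"
)

first = upsert_event(conn, "BE11529", "2024-01-01T00:00:00", 9.99, "series3")
conn.execute("UPDATE events SET false_trigger = 1")  # operator review state
second = upsert_event(conn, "BE11529", "2024-01-01T00:00:00", 0.025)
row = conn.execute(
    "SELECT tran_ppv, device_family, false_trigger FROM events"
).fetchone()
```

SQLite 3.24 and later also offer `INSERT ... ON CONFLICT DO UPDATE`, which expresses the same idea in one statement. The try/except form is used here only because it mirrors the structure of the hunk.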
|
||||
|
||||
def get_event(self, event_id: str) -> Optional[dict]:
|
||||
"""Return one event row by id, or None."""
|
||||
with self._connect() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM events WHERE id = ?", (event_id,),
|
||||
).fetchone()
|
||||
return dict(row) if row else None
|
||||
|
||||
def query_events(
|
||||
self,
|
||||
serial: Optional[str] = None,
|
||||
@@ -544,105 +387,6 @@ class SeismoDb:
|
||||
)
|
||||
return cur.rowcount > 0
|
||||
|
||||
def delete_event(self, event_id: str) -> Optional[dict]:
|
||||
"""
|
||||
Hard-delete one event row by id. Returns the deleted row (so the
|
||||
caller can clean up any on-disk files referenced by it) or None
|
||||
if no row matched.
|
||||
"""
|
||||
with self._connect() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM events WHERE id = ?", (event_id,),
|
||||
).fetchone()
|
||||
if row is None:
|
||||
return None
|
||||
conn.execute("DELETE FROM events WHERE id = ?", (event_id,))
|
||||
return dict(row)
|
||||
|
||||
def delete_events_bulk(
|
||||
self,
|
||||
serial: Optional[str] = None,
|
||||
from_dt: Optional[datetime.datetime] = None,
|
||||
to_dt: Optional[datetime.datetime] = None,
|
||||
false_trigger: Optional[bool] = None,
|
||||
ids: Optional[list[str]] = None,
|
||||
) -> list[dict]:
|
||||
"""
|
||||
Hard-delete events matching the given filters. Returns the list
|
||||
of deleted row dicts. Refuses to delete with no filters at all
|
||||
(would wipe the whole table) — raises ValueError.
|
||||
|
||||
Filter semantics match query_events: serial / from_dt / to_dt /
|
||||
false_trigger combine with AND. `ids` is an additional inclusion
|
||||
list (id IN (...)); if supplied alongside other filters,
|
||||
only rows matching all conditions are deleted.
|
||||
"""
|
||||
clauses: list[str] = []
|
||||
params: list = []
|
||||
|
||||
if serial:
|
||||
clauses.append("serial = ?")
|
||||
params.append(serial)
|
||||
if from_dt:
|
||||
clauses.append("timestamp >= ?")
|
||||
params.append(from_dt.isoformat())
|
||||
if to_dt:
|
||||
clauses.append("timestamp <= ?")
|
||||
params.append(to_dt.isoformat())
|
||||
if false_trigger is not None:
|
||||
clauses.append("false_trigger = ?")
|
||||
params.append(1 if false_trigger else 0)
|
||||
if ids:
|
||||
placeholders = ",".join("?" * len(ids))
|
||||
clauses.append(f"id IN ({placeholders})")
|
||||
params.extend(ids)
|
||||
|
||||
if not clauses:
|
||||
raise ValueError(
|
||||
"delete_events_bulk refuses to delete with no filters "
|
||||
"(would wipe the entire events table)"
|
||||
)
|
||||
|
||||
where = "WHERE " + " AND ".join(clauses)
|
||||
|
||||
with self._connect() as conn:
|
||||
rows = conn.execute(
|
||||
f"SELECT * FROM events {where}", params,
|
||||
).fetchall()
|
||||
if rows:
|
||||
conn.execute(f"DELETE FROM events {where}", params)
|
||||
return [dict(r) for r in rows]
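The filter-building step in `delete_events_bulk` can be exercised on its own, without a database. The sketch below is an illustration under assumptions: `build_where` is a hypothetical helper, not a function in the repository, and it covers only three of the filters. It shows how the clauses combine with AND, how the `IN (...)` placeholders are generated, and how the empty-filter guard fires.

```python
def build_where(serial=None, false_trigger=None, ids=None):
    """Build a parameterised WHERE clause; refuse to build an unfiltered one."""
    clauses = []
    params = []

    if serial:
        clauses.append("serial = ?")
        params.append(serial)
    if false_trigger is not None:
        # Explicit None check so that False is still a usable filter value.
        clauses.append("false_trigger = ?")
        params.append(1 if false_trigger else 0)
    if ids:
        placeholders = ",".join("?" * len(ids))
        clauses.append(f"id IN ({placeholders})")
        params.extend(ids)

    if not clauses:
        raise ValueError("refusing to build an unfiltered DELETE")

    return "WHERE " + " AND ".join(clauses), params


where, params = build_where(serial="BE11529", false_trigger=False, ids=["a", "b"])

try:
    build_where()
    guard_fired = False
except ValueError:
    guard_fired = True
```

One consequence of the truthiness checks, visible in the sketch as well as in the hunk: an empty `ids` list is treated as "no id filter", so `ids=[]` combined with a `serial` would match every row for that serial rather than none.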
|
||||
|
||||
def update_event_review(self, event_id: str, review: dict) -> bool:
|
||||
"""
|
||||
Sync derived index columns from a sidecar's `review` block.
|
||||
|
||||
Currently the only derived index is `events.false_trigger` — kept
|
||||
in sync so `/db/events?false_trigger=true` queries don't have to
|
||||
scan every sidecar JSON on disk. The sidecar JSON itself remains
|
||||
the source of truth for the full review state.
|
||||
|
||||
Returns True when the row exists; False if it does not or if `review` is not a dict. No-op fields
|
||||
(review without `false_trigger`) leave the column untouched.
|
||||
"""
|
||||
if not isinstance(review, dict):
|
||||
return False
|
||||
if "false_trigger" not in review:
|
||||
# Nothing derived to update; just confirm the row exists.
|
||||
with self._connect() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT 1 FROM events WHERE id=?", (event_id,),
|
||||
).fetchone()
|
||||
return row is not None
|
||||
|
||||
flag = 1 if review.get("false_trigger") else 0
|
||||
with self._connect() as conn:
|
||||
cur = conn.execute(
|
||||
"UPDATE events SET false_trigger=? WHERE id=?",
|
||||
(flag, event_id),
|
||||
)
|
||||
return cur.rowcount > 0
|
||||
|
||||
# ── Monitor log ───────────────────────────────────────────────────────────
|
||||
|
||||
def insert_monitor_log(
|
||||
@@ -722,79 +466,21 @@ class SeismoDb:
|
||||
|
||||
def query_units(self) -> list[dict]:
|
||||
"""
|
||||
Return one row per known serial with summary stats.
|
||||
|
||||
Aggregates from BOTH source tables:
|
||||
- `events` — populated by every ingest path
|
||||
(live ACH, /db/import/blastware_file
|
||||
from the series3-watcher forwarder, etc.)
|
||||
- `ach_sessions` — only populated by the live ACH server;
|
||||
empty for events that came in via the
|
||||
BW-importer route.
|
||||
|
||||
Earlier this method only queried `ach_sessions`, which made
|
||||
watcher-forwarded units invisible to the SFM webapp's fleet
|
||||
overview even though their events were correctly populated in
|
||||
`events`. Now we union the two and surface every serial that
|
||||
has activity in either table.
|
||||
|
||||
Fields:
|
||||
serial — unit serial number (e.g. "BE11529")
|
||||
last_seen — most recent of MAX(events.timestamp)
|
||||
and MAX(ach_sessions.session_time)
|
||||
total_events — COUNT(*) from `events` (the
|
||||
authoritative count regardless of
|
||||
ingest path)
|
||||
total_monitor_entries — from `ach_sessions`, 0 when absent
|
||||
total_sessions — COUNT(*) from `ach_sessions`, 0 when absent
|
||||
Return one row per known serial with summary stats:
|
||||
last_seen, total_events, total_monitor_entries.
|
||||
"""
|
||||
with self._connect() as conn:
|
||||
event_stats = {
|
||||
row["serial"]: row
|
||||
for row in conn.execute(
|
||||
"""
|
||||
SELECT serial,
|
||||
MAX(timestamp) AS last_event_at,
|
||||
COUNT(*) AS total_events
|
||||
FROM events
|
||||
GROUP BY serial
|
||||
""",
|
||||
).fetchall()
|
||||
}
|
||||
session_stats = {
|
||||
row["serial"]: row
|
||||
for row in conn.execute(
|
||||
"""
|
||||
SELECT serial,
|
||||
MAX(session_time) AS last_session_at,
|
||||
SUM(monitor_entries) AS total_monitor_entries,
|
||||
COUNT(*) AS total_sessions
|
||||
FROM ach_sessions
|
||||
GROUP BY serial
|
||||
""",
|
||||
).fetchall()
|
||||
}
|
||||
|
||||
all_serials = set(event_stats) | set(session_stats)
|
||||
units = []
|
||||
for serial in all_serials:
|
||||
e = event_stats.get(serial)
|
||||
s = session_stats.get(serial)
|
||||
last_event_at = e["last_event_at"] if e else None
|
||||
last_session_at = s["last_session_at"] if s else None
|
||||
# Prefer whichever timestamp is more recent
|
||||
last_seen = max(
|
||||
(t for t in (last_event_at, last_session_at) if t),
|
||||
default=None,
|
||||
)
|
||||
units.append({
|
||||
"serial": serial,
|
||||
"last_seen": last_seen,
|
||||
"total_events": e["total_events"] if e else 0,
|
||||
"total_monitor_entries": s["total_monitor_entries"] if s else 0,
|
||||
"total_sessions": s["total_sessions"] if s else 0,
|
||||
})
|
||||
|
||||
# Sort by last_seen desc; serials with no timestamp at all sink to the bottom.
|
||||
units.sort(key=lambda u: u.get("last_seen") or "", reverse=True)
|
||||
return units
|
||||
rows = conn.execute(
|
||||
"""
|
||||
SELECT
|
||||
s.serial,
|
||||
MAX(s.session_time) AS last_seen,
|
||||
SUM(s.events_downloaded) AS total_events,
|
||||
SUM(s.monitor_entries) AS total_monitor_entries,
|
||||
COUNT(*) AS total_sessions
|
||||
FROM ach_sessions s
|
||||
GROUP BY s.serial
|
||||
ORDER BY last_seen DESC
|
||||
"""
|
||||
).fetchall()
|
||||
return [dict(r) for r in rows]
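The two-table version of `query_units` shown in this hunk merges per-serial aggregates from `events` and `ach_sessions`. A minimal sketch of that merge step follows, using plain dicts in place of `sqlite3.Row` objects. The `merge_unit_stats` name, the `"WATCHER1"` serial, and all sample values are assumptions for illustration. It also assumes every timestamp is an ISO-8601 string in the same format, so string comparison matches chronological order.

```python
def merge_unit_stats(event_stats, session_stats):
    """Merge per-serial aggregates from two sources into one summary list."""
    units = []
    for serial in set(event_stats) | set(session_stats):
        e = event_stats.get(serial)
        s = session_stats.get(serial)
        last_event_at = e["last_event_at"] if e else None
        last_session_at = s["last_session_at"] if s else None
        # Most recent of the two timestamps, ignoring missing ones.
        last_seen = max(
            (t for t in (last_event_at, last_session_at) if t),
            default=None,
        )
        units.append({
            "serial": serial,
            "last_seen": last_seen,
            "total_events": e["total_events"] if e else 0,
            "total_sessions": s["total_sessions"] if s else 0,
        })
    # Newest first; serials without any timestamp sort last.
    units.sort(key=lambda u: u["last_seen"] or "", reverse=True)
    return units


events = {
    "BE11529": {"last_event_at": "2024-03-02T10:00:00", "total_events": 4},
    "WATCHER1": {"last_event_at": "2024-02-01T08:00:00", "total_events": 7},
}
sessions = {
    "BE11529": {"last_session_at": "2024-03-01T09:00:00", "total_sessions": 2},
}
units = merge_unit_stats(events, sessions)
```

Here `WATCHER1` has events but no ACH session, which is the case the longer docstring describes: a query against `ach_sessions` alone would not list it at all.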
|
||||
|
||||
@@ -1,788 +0,0 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||
<title>SFM Event Browser</title>
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js"></script>
|
||||
<style>
|
||||
* { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
|
||||
body {
|
||||
background: #0d1117;
|
||||
color: #c9d1d9;
|
||||
font-family: 'Segoe UI', system-ui, sans-serif;
|
||||
font-size: 13px;
|
||||
height: 100vh;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
header {
|
||||
background: #161b22;
|
||||
border-bottom: 1px solid #30363d;
|
||||
padding: 12px 20px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 16px;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
header h1 {
|
||||
font-size: 15px;
|
||||
font-weight: 600;
|
||||
color: #f0f6fc;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
label { color: #8b949e; font-size: 12px; }
|
||||
|
||||
select, input[type="text"], input[type="search"] {
|
||||
background: #0d1117;
|
||||
border: 1px solid #30363d;
|
||||
border-radius: 6px;
|
||||
color: #c9d1d9;
|
||||
padding: 5px 8px;
|
||||
font-size: 13px;
|
||||
}
|
||||
select { min-width: 140px; }
|
||||
input[type="search"] { width: 200px; }
|
||||
select:focus, input:focus { outline: none; border-color: #388bfd; }
|
||||
|
||||
button {
|
||||
background: #1f6feb;
|
||||
border: none;
|
||||
border-radius: 6px;
|
||||
color: #fff;
|
||||
cursor: pointer;
|
||||
font-size: 13px;
|
||||
font-weight: 500;
|
||||
padding: 5px 14px;
|
||||
}
|
||||
button:hover { background: #388bfd; }
|
||||
button:disabled { background: #21262d; color: #484f58; cursor: not-allowed; }
|
||||
|
||||
#main {
|
||||
flex: 1;
|
||||
display: flex;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
/* ── Event list (left sidebar) ────────────────────────────────── */
|
||||
#event-list-wrap {
|
||||
width: 320px;
|
||||
flex-shrink: 0;
|
||||
background: #0d1117;
|
||||
border-right: 1px solid #21262d;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
#event-list-header {
|
||||
padding: 10px 14px;
|
||||
border-bottom: 1px solid #21262d;
|
||||
font-size: 11px;
|
||||
color: #8b949e;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.06em;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
}
|
||||
|
||||
#event-list {
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
}
|
||||
|
||||
.event-row {
|
||||
padding: 8px 14px;
|
||||
border-bottom: 1px solid #161b22;
|
||||
cursor: pointer;
|
||||
transition: background 0.1s;
|
||||
}
|
||||
.event-row:hover { background: #161b22; }
|
||||
.event-row.active { background: #1f3a5f; border-left: 3px solid #58a6ff; padding-left: 11px; }
|
||||
.event-row .er-top {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
margin-bottom: 2px;
|
||||
}
|
||||
.event-row .er-ts { font-family: monospace; font-size: 12px; color: #c9d1d9; }
|
||||
.event-row .er-pvs { font-family: monospace; font-size: 12px; color: #58a6ff; font-weight: 600; }
|
||||
.event-row .er-meta { font-size: 11px; color: #8b949e; }
|
||||
.event-row.false_trigger .er-pvs { color: #f85149; text-decoration: line-through; }
|
||||
|
||||
/* ── Main viewer (right side) ─────────────────────────────────── */
|
||||
#viewer {
|
||||
flex: 1;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
#event-meta {
|
||||
padding: 12px 20px;
|
||||
background: #161b22;
|
||||
border-bottom: 1px solid #21262d;
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(160px, 1fr));
|
||||
gap: 8px 24px;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
.meta-field {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1px;
|
||||
}
|
||||
.meta-field .mf-label {
|
||||
font-size: 10px;
|
||||
color: #484f58;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.05em;
|
||||
}
|
||||
.meta-field .mf-value {
|
||||
font-family: monospace;
|
||||
font-size: 13px;
|
||||
color: #c9d1d9;
|
||||
}
|
||||
.meta-field .mf-value.highlight { color: #58a6ff; font-weight: 600; }
|
||||
|
||||
#charts {
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
padding: 12px 16px;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 10px;
|
||||
}
|
||||
.chart-wrap {
|
||||
background: #161b22;
|
||||
border: 1px solid #21262d;
|
||||
border-radius: 8px;
|
||||
padding: 10px 30px 8px 12px; /* right padding leaves room for the "0.0" baseline label */
|
||||
}
|
||||
.chart-label {
|
||||
font-size: 11px;
|
||||
font-weight: 600;
|
||||
letter-spacing: 0.06em;
|
||||
text-transform: uppercase;
|
||||
margin-bottom: 4px;
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
}
|
||||
.chart-canvas-wrap { position: relative; height: 130px; }
|
||||
|
||||
.ch-tran { color: #58a6ff; }
|
||||
.ch-vert { color: #3fb950; }
|
||||
.ch-long { color: #d29922; }
|
||||
.ch-micl { color: #bc8cff; }
|
||||
|
||||
#status-bar {
|
||||
background: #161b22;
|
||||
border-top: 1px solid #21262d;
|
||||
padding: 5px 20px;
|
||||
font-size: 12px;
|
||||
color: #8b949e;
|
||||
min-height: 26px;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
#status-bar.error { color: #f85149; }
|
||||
#status-bar.ok { color: #3fb950; }
|
||||
|
||||
#empty-state {
|
||||
flex: 1;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
color: #484f58;
|
||||
gap: 8px;
|
||||
}
|
||||
#empty-state svg { opacity: 0.3; }
|
||||
|
||||
.pill {
|
||||
background: #21262d;
|
||||
border-radius: 4px;
|
||||
padding: 2px 8px;
|
||||
color: #c9d1d9;
|
||||
font-family: monospace;
|
||||
font-size: 11px;
|
||||
margin-left: 8px;
|
||||
}
|
||||
|
||||
/* Per-channel stats table in the metadata header */
|
||||
.stats-table {
|
||||
grid-column: 1 / -1;
|
||||
border-collapse: collapse;
|
||||
font-family: monospace;
|
||||
font-size: 12px;
|
||||
margin-top: 4px;
|
||||
}
|
||||
.stats-table th, .stats-table td {
|
||||
padding: 3px 14px 3px 0;
|
||||
text-align: left;
|
||||
color: #c9d1d9;
|
||||
}
|
||||
.stats-table th {
|
||||
color: #484f58;
|
||||
font-size: 10px;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.05em;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
/* ── Print view (light theme matching the Instantel printout) ─── */
|
||||
body.print-view {
|
||||
background: #ffffff;
|
||||
color: #000000;
|
||||
}
|
||||
body.print-view header,
|
||||
body.print-view #event-list-wrap,
|
||||
body.print-view #event-list-header,
|
||||
body.print-view #event-meta,
|
||||
body.print-view #status-bar,
|
||||
body.print-view .chart-wrap {
|
||||
background: #ffffff;
|
||||
border-color: #cccccc;
|
||||
color: #000000;
|
||||
}
|
||||
body.print-view .event-row { color: #000; border-bottom-color: #eee; }
|
||||
body.print-view .event-row:hover { background: #f4f4f4; }
|
||||
body.print-view .event-row.active {
|
||||
background: #e6f0ff;
|
||||
border-left-color: #1f6feb;
|
||||
}
|
||||
body.print-view .er-ts { color: #000; }
|
||||
body.print-view .er-pvs { color: #003a8c; }
|
||||
body.print-view .er-meta,
|
||||
body.print-view #event-list-header,
|
||||
body.print-view .meta-field .mf-label,
|
||||
body.print-view .stats-table th {
|
||||
color: #666;
|
||||
}
|
||||
body.print-view .mf-value { color: #000; }
|
||||
body.print-view .mf-value.highlight { color: #003a8c; }
|
||||
body.print-view label { color: #444; }
|
||||
body.print-view input, body.print-view select {
|
||||
background: #fff; color: #000; border-color: #ccc;
|
||||
}
|
||||
/* In print theme, the channel-label colors stay (they identify
|
||||
the trace). Only the chart panel background flips. */
|
||||
|
||||
@media print {
|
||||
header, #event-list-wrap, #status-bar, button { display: none !important; }
|
||||
body { overflow: visible; height: auto; }
|
||||
#main, #viewer { overflow: visible; }
|
||||
#charts { overflow: visible; }
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<header>
|
||||
<h1>SFM Event Browser</h1>
|
||||
<label>Serial</label>
|
||||
<select id="serial-select">
|
||||
<option value="">Loading…</option>
|
||||
</select>
|
||||
<input type="search" id="event-filter" placeholder="filter events…" />
|
||||
<span class="pill" id="count-pill">—</span>
|
||||
<button id="print-btn" onclick="togglePrintView()" style="margin-left:auto;background:#21262d">Print view</button>
|
||||
<button id="reload-btn" onclick="loadSerials()">Reload</button>
|
||||
</header>
|
||||
|
||||
<div id="main">
|
||||
<div id="event-list-wrap">
|
||||
<div id="event-list-header">
|
||||
<span>Events</span>
|
||||
<span id="event-list-count">—</span>
|
||||
</div>
|
||||
<div id="event-list"></div>
|
||||
</div>
|
||||
|
||||
<div id="viewer">
|
||||
<div id="empty-state">
|
||||
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5">
|
||||
<polyline points="22 12 18 12 15 21 9 3 6 12 2 12"/>
|
||||
</svg>
|
||||
<p>Select a unit and event to view its waveform.</p>
|
||||
</div>
|
||||
<div id="event-meta" style="display:none"></div>
|
||||
<div id="charts" style="display:none"></div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="status-bar">Ready.</div>
|
||||
|
||||
<script>
|
||||
// Channel colors and rendering order mirror Instantel's BW Event Report
|
||||
// printout: MicL at the top, Tran at the bottom. Colors approximate
|
||||
// what BW renders (magenta mic, blue long, green vert, red tran).
|
||||
const CHANNEL_COLORS = {
|
||||
MicL: '#e066ff',
|
||||
Long: '#3a80ff',
|
||||
Vert: '#3fb950',
|
||||
Tran: '#f85149',
|
||||
};
|
||||
const CHANNEL_ORDER = ['MicL', 'Long', 'Vert', 'Tran'];
|
||||
|
||||
// Adaptive decimal formatter — scientific notation only for truly extreme
|
||||
// values. Normal-range peaks render as plain decimals with sensible
|
||||
// precision (was previously forcing toExponential(3) which produced ugly
|
||||
// "2.500E-2 IN/S" labels).
|
||||
function _fmtPeak(v, unit) {
|
||||
if (v == null || (typeof v === 'number' && !isFinite(v))) return '';
|
||||
if (typeof v !== 'number') return String(v) + (unit ? ' ' + unit : '');
|
||||
if (v === 0) return '0' + (unit ? ' ' + unit : '');
|
||||
const a = Math.abs(v);
|
||||
const u = unit ? ' ' + unit : '';
|
||||
if (a >= 0.0001 && a < 10000) {
|
||||
const d = a >= 100 ? 1 : a >= 10 ? 2 : a >= 1 ? 3 : a >= 0.1 ? 4 : 5;
|
||||
return v.toFixed(d) + u;
|
||||
}
|
||||
return v.toExponential(2) + u;
|
||||
}
|
||||
|
||||
let allEvents = [];
|
||||
let filteredEvents = [];
|
||||
let currentEventId = null;
|
||||
let charts = {};
|
||||
|
||||
const apiBase = window.location.origin;
|
||||
|
||||
function setStatus(msg, cls = '') {
|
||||
const bar = document.getElementById('status-bar');
|
||||
bar.textContent = msg;
|
||||
bar.className = cls;
|
||||
}
|
||||
|
||||
async function loadSerials() {
|
||||
setStatus('Loading serials…');
|
||||
try {
|
||||
const r = await fetch(`${apiBase}/db/units`);
|
||||
if (!r.ok) throw new Error(r.statusText);
|
||||
// /db/units returns a bare list[dict], not {units:[...]}
|
||||
const units = await r.json();
|
||||
const sel = document.getElementById('serial-select');
|
||||
sel.innerHTML = '';
|
||||
if (!units || units.length === 0) {
|
||||
sel.innerHTML = '<option value="">(no units found)</option>';
|
||||
setStatus('No units in DB.', 'error');
|
||||
return;
|
||||
}
|
||||
sel.innerHTML = '<option value="">— pick a unit —</option>' +
|
||||
units.map(u => {
|
||||
const n = u.total_events ?? 0;
|
||||
return `<option value="${u.serial}">${u.serial} (${n} events)</option>`;
|
||||
}).join('');
|
||||
setStatus(`Loaded ${units.length} units.`, 'ok');
|
||||
} catch (e) {
|
||||
setStatus(`Failed to load units: ${e.message}`, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
async function loadEventsForSerial(serial) {
|
||||
if (!serial) {
|
||||
allEvents = [];
|
||||
renderEventList();
|
||||
return;
|
||||
}
|
||||
setStatus(`Loading events for ${serial}…`);
|
||||
try {
|
||||
const r = await fetch(`${apiBase}/db/events?serial=${encodeURIComponent(serial)}&limit=500`);
|
||||
if (!r.ok) throw new Error(r.statusText);
|
||||
const d = await r.json();
|
||||
allEvents = d.events || [];
|
||||
document.getElementById('count-pill').textContent = `${allEvents.length} events`;
|
||||
applyFilter();
|
||||
setStatus(`Loaded ${allEvents.length} events for ${serial}.`, 'ok');
|
||||
} catch (e) {
|
||||
setStatus(`Failed to load events: ${e.message}`, 'error');
|
||||
}
|
||||
}
|
||||
|
||||
function applyFilter() {
|
||||
const q = document.getElementById('event-filter').value.toLowerCase().trim();
|
||||
if (!q) {
|
||||
filteredEvents = allEvents;
|
||||
} else {
|
||||
filteredEvents = allEvents.filter(ev =>
|
||||
(ev.blastware_filename || '').toLowerCase().includes(q) ||
|
||||
(ev.timestamp || '').toLowerCase().includes(q) ||
|
||||
(ev.record_type || '').toLowerCase().includes(q) ||
|
||||
(ev.project || '').toLowerCase().includes(q)
|
||||
);
|
||||
}
|
||||
document.getElementById('event-list-count').textContent = `${filteredEvents.length} / ${allEvents.length}`;
|
||||
renderEventList();
|
||||
}
|
||||
|
||||
function renderEventList() {
|
||||
const list = document.getElementById('event-list');
|
||||
list.innerHTML = '';
|
||||
if (filteredEvents.length === 0) {
|
||||
list.innerHTML = '<div style="padding:14px;color:#484f58;font-size:12px">No events.</div>';
|
||||
return;
|
||||
}
|
||||
for (const ev of filteredEvents) {
|
||||
const row = document.createElement('div');
|
||||
row.className = 'event-row' + (ev.false_trigger ? ' false_trigger' : '');
|
||||
if (ev.id === currentEventId) row.className += ' active';
|
||||
const ts = (ev.timestamp || '').replace('T', ' ').replace('Z', '');
|
||||
const pvs = ev.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(3)} in/s` : '—';
|
||||
row.innerHTML = `
|
||||
<div class="er-top">
|
||||
<span class="er-ts">${ts || '(no ts)'}</span>
|
||||
<span class="er-pvs">${pvs}</span>
|
||||
</div>
|
||||
<div class="er-meta">${ev.record_type || '?'} · ${ev.blastware_filename || ev.id.slice(0,8)}</div>
|
||||
`;
|
||||
row.onclick = () => loadEvent(ev.id);
|
||||
list.appendChild(row);
|
||||
}
|
||||
}
|
||||
|
||||
async function loadEvent(eventId) {
|
||||
currentEventId = eventId;
|
||||
renderEventList();
|
||||
setStatus('Loading waveform…');
|
||||
try {
|
||||
const r = await fetch(`${apiBase}/db/events/${eventId}/waveform.json`);
|
||||
if (!r.ok) {
|
||||
if (r.status === 404) {
|
||||
showEmpty('No waveform data for this event (codec returned no samples).');
|
||||
return;
|
||||
}
|
||||
throw new Error(r.statusText);
|
||||
}
|
||||
const data = await r.json();
|
||||
renderWaveform(data);
|
||||
// Look up the already-loaded event row (no extra fetch) for a richer header
|
||||
const ev = allEvents.find(e => e.id === eventId);
|
||||
renderMeta(data, ev);
|
||||
setStatus(`Event loaded.`, 'ok');
|
||||
} catch (e) {
|
||||
setStatus(`Failed to load event: ${e.message}`, 'error');
|
||||
showEmpty(`Error: ${e.message}`);
|
||||
}
|
||||
}
|
||||
|
||||
function showEmpty(msg) {
|
||||
document.getElementById('empty-state').style.display = 'flex';
|
||||
document.getElementById('empty-state').querySelector('p').textContent = msg;
|
||||
document.getElementById('event-meta').style.display = 'none';
|
||||
document.getElementById('charts').style.display = 'none';
|
||||
Object.values(charts).forEach(c => c.destroy());
|
||||
charts = {};
|
||||
}
|
||||
|
||||
function renderMeta(data, ev) {
|
||||
const metaDiv = document.getElementById('event-meta');
|
||||
const fields = [
|
||||
['Serial', data.serial || ev?.serial || '—'],
|
||||
['Timestamp', (data.timestamp || ev?.timestamp || '—').replace('T', ' ').replace('Z', '')],
|
||||
['Record', data.record_type || ev?.record_type || '—'],
|
||||
['Sample rate', data.sample_rate ? `${data.sample_rate} sps` : '—'],
|
||||
['Geo range', data.geo_range ? `${data.geo_range} (${data.geo_full_scale_ips} in/s FS)` : '—'],
|
||||
['Project', ev?.project || '—'],
|
||||
['Location', ev?.sensor_location || '—'],
|
||||
['Peak Vector Sum',
|
||||
ev?.peak_vector_sum != null ? `${ev.peak_vector_sum.toFixed(4)} in/s` : '—'],
|
||||
];
|
||||
|
||||
// Per-channel stats table mirroring the printout's middle block.
|
||||
// Only per-channel PPV from the events row (DB columns) is rendered here;
|
||||
// the printout's other details (peak time, peak accel, peak displacement,
|
||||
// sensor check) from bw_report are not shown by this table.
|
||||
const fmt = v => (v == null ? '—' : (typeof v === 'number' ? v.toFixed(3) : v));
|
||||
const rows = [
|
||||
['Tran', ev?.tran_ppv],
|
||||
['Vert', ev?.vert_ppv],
|
||||
['Long', ev?.long_ppv],
|
||||
];
|
||||
const statsHtml = `
|
||||
<table class="stats-table">
|
||||
<thead>
|
||||
<tr><th>Channel</th><th>PPV (in/s)</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
${rows.map(([ch, ppv]) => `<tr><td>${ch}</td><td>${fmt(ppv)}</td></tr>`).join('')}
|
||||
<tr><td>MicL</td><td>${fmt(ev?.mic_ppv)} psi</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
`;
|
||||
|
||||
metaDiv.innerHTML =
|
||||
fields.map(([l, v]) =>
|
||||
`<div class="meta-field"><span class="mf-label">${l}</span><span class="mf-value${l === 'Peak Vector Sum' ? ' highlight' : ''}">${v}</span></div>`
|
||||
).join('') + statsHtml;
|
||||
metaDiv.style.display = 'grid';
|
||||
}
|
||||
|
||||
function togglePrintView() {
|
||||
document.body.classList.toggle('print-view');
|
||||
// Force chart redraw so axis/grid colors are re-evaluated against the
|
||||
// new background. Easiest: re-render the current event.
|
||||
if (currentEventId) {
|
||||
loadEvent(currentEventId);
|
||||
}
|
||||
}
|
||||
|
||||
function renderWaveform(data) {
|
||||
document.getElementById('empty-state').style.display = 'none';
|
||||
const chartsDiv = document.getElementById('charts');
|
||||
chartsDiv.style.display = 'flex';
|
||||
chartsDiv.innerHTML = '';
|
||||
Object.values(charts).forEach(c => c.destroy());
|
||||
charts = {};
|
||||
|
||||
const channels = data.channels || {};
|
||||
// time_axis is METADATA from sfm.plot.v1 — sample_rate, pretrig_samples,
|
||||
// t0_ms (first-sample time relative to trigger; negative when pretrig
|
||||
// exists), dt_ms. Trigger is at t=0 by convention.
|
||||
const ta = data.time_axis || {};
|
||||
const sr = ta.sample_rate || 1024;
|
||||
const dtMs = ta.dt_ms || (1000.0 / sr);
|
||||
const t0Ms = ta.t0_ms != null ? ta.t0_ms : 0;
|
||||
const isPrintMode = document.body.classList.contains('print-view');
|
||||
// Histograms record per-interval peaks (typically 1 per minute/5-min),
|
||||
// not per-sample waveforms. Render as a tight bar graph instead of a
|
||||
// line plot — matches the BW Event Report's histogram presentation.
|
||||
const isHistogram = String(data.record_type || '').toLowerCase().includes('histogram');
|
||||
|
||||
// Which channels actually have data → determines which one renders the
|
||||
// shared x-axis at the bottom (Instantel printout has the time scale
|
||||
// only on the bottom-most chart).
|
||||
const channelsWithData = CHANNEL_ORDER.filter(ch =>
|
||||
channels[ch] && (channels[ch].values || []).length > 0
|
||||
);
|
||||
const lastDataCh = channelsWithData[channelsWithData.length - 1];
|
||||
|
||||
for (const ch of CHANNEL_ORDER) {
|
||||
const chData = channels[ch];
|
||||
if (!chData) continue;
|
||||
const values = chData.values || [];
|
||||
if (values.length === 0) {
|
||||
// Render an empty card so user sees the channel exists but is missing
|
||||
const wrap = document.createElement('div');
|
||||
wrap.className = 'chart-wrap';
|
||||
wrap.innerHTML = `
|
||||
<div class="chart-label ch-${ch.toLowerCase()}">
|
||||
<span>${ch}</span>
|
||||
<span style="color:#484f58">no samples decoded</span>
|
||||
</div>
|
||||
<div class="chart-canvas-wrap" style="display:flex;align-items:center;justify-content:center;color:#484f58;font-size:12px">empty</div>
|
||||
`;
|
||||
chartsDiv.appendChild(wrap);
|
||||
continue;
|
||||
}
|
||||
|
||||
const unit = chData.unit || 'unit';
|
||||
const peak = chData.peak;
|
||||
const peakT = chData.peak_t_ms;
|
||||
const peakLabel = peak != null
|
||||
? `peak ${_fmtPeak(peak, unit)}`
|
||||
+ (!isHistogram && peakT != null ? ` @ ${peakT.toFixed(1)} ms` : '')
|
||||
: '';
|
||||
// Hide x-axis on every chart except the bottom-most data channel —
|
||||
// gives the "single shared time axis" feel of the BW printout.
|
||||
const showXAxis = (ch === lastDataCh);
|
||||
|
||||
const wrap = document.createElement('div');
|
||||
wrap.className = 'chart-wrap';
|
||||
const lbl = document.createElement('div');
|
||||
lbl.className = `chart-label ch-${ch.toLowerCase()}`;
|
||||
lbl.innerHTML = `<span>${ch}</span><span style="color:#8b949e;font-weight:normal">${peakLabel}</span>`;
|
||||
wrap.appendChild(lbl);
|
||||
|
||||
const canvasWrap = document.createElement('div');
|
||||
canvasWrap.className = 'chart-canvas-wrap';
|
||||
const canvas = document.createElement('canvas');
|
||||
canvasWrap.appendChild(canvas);
|
||||
wrap.appendChild(canvasWrap);
|
||||
chartsDiv.appendChild(wrap);
|
||||
|
||||
// Waveform: per-sample time in ms relative to trigger (negative for pretrig).
|
||||
// Histogram: interval index (1..N); sample_rate-based time math doesn't
|
||||
// apply to per-interval peaks.
|
||||
const times = isHistogram
|
||||
? values.map((_, i) => i + 1)
|
||||
: values.map((_, i) => t0Ms + i * dtMs);
|
||||
|
||||
// Downsample for rendering
|
||||
const MAX_POINTS = 4000;
|
||||
let rT = times, rV = values;
|
||||
if (values.length > MAX_POINTS) {
|
||||
const step = Math.ceil(values.length / MAX_POINTS);
|
||||
rT = times.filter((_, i) => i % step === 0);
|
||||
rV = values.filter((_, i) => i % step === 0);
|
||||
}
|
||||
|
||||
// Tick formatter — round to 1 decimal so we don't get
|
||||
// "11.7187040000000002 ms" garbage from floating-point accumulation.
|
||||
const xAxisUnit = isHistogram ? '' : ' ms';
|
||||
const fmtTick = i => {
|
||||
const v = rT[i];
|
||||
if (typeof v !== 'number') return String(v) + xAxisUnit;
|
||||
return (Number.isInteger(v) ? String(v) : v.toFixed(1)) + xAxisUnit;
|
||||
};
|
||||
|
||||
// Y-axis bounds. Geophone waveforms render symmetric around zero
|
||||
// (seismograph convention — zero line in the middle, signal goes
|
||||
// up AND down). Mic + histograms keep default auto-scale (always
|
||||
// positive values; zero at the bottom).
|
||||
let yBounds = {};
|
||||
const isGeoWaveform = !isHistogram && ch !== 'MicL';
|
||||
if (isGeoWaveform) {
|
||||
let absMax = 0;
|
||||
for (const v of values) {
|
||||
const a = Math.abs(v);
|
||||
if (a > absMax) absMax = a;
|
||||
}
|
||||
const padded = (absMax || 1) * 1.10;
|
||||
yBounds = { min: -padded, max: padded };
|
||||
}
|
||||
|
||||
const chart = new Chart(canvas, {
|
||||
type: isHistogram ? 'bar' : 'line',
|
||||
data: {
|
||||
labels: rT.map(t => (typeof t === 'number' ? (Number.isInteger(t) ? String(t) : t.toFixed(2)) : t)),
|
||||
datasets: isHistogram ? [{
|
||||
data: rV,
|
||||
backgroundColor: CHANNEL_COLORS[ch],
|
||||
borderWidth: 0,
|
||||
barPercentage: 1.0,
|
||||
categoryPercentage: 1.0, // bars touch — tight bargraph
|
||||
}] : [{
|
||||
data: rV,
|
||||
borderColor: CHANNEL_COLORS[ch],
|
||||
borderWidth: 1,
|
||||
pointRadius: 0,
|
||||
tension: 0,
|
||||
}],
|
||||
},
|
||||
options: {
|
||||
animation: false,
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
plugins: {
|
||||
legend: { display: false },
|
||||
tooltip: {
|
||||
mode: 'index',
|
||||
intersect: false,
|
||||
callbacks: {
|
||||
title: items => isHistogram
|
||||
? `interval ${items[0].label}`
|
||||
: `t = ${items[0].label} ms`,
|
||||
label: item => `${ch}: ${_fmtPeak(item.raw, unit)}`,
|
||||
},
|
||||
},
|
||||
},
|
||||
scales: {
|
||||
x: {
|
||||
type: 'category',
|
||||
display: showXAxis,
|
||||
ticks: {
|
||||
color: isPrintMode ? '#666' : '#484f58',
|
||||
maxTicksLimit: 10,
|
||||
maxRotation: 0,
|
||||
callback: (val, i) => fmtTick(i),
|
||||
},
|
||||
grid: { color: isPrintMode ? '#e0e0e0' : '#21262d', drawTicks: showXAxis },
|
||||
},
|
||||
y: {
|
||||
...yBounds,
|
||||
ticks: { color: isPrintMode ? '#666' : '#484f58', maxTicksLimit: 5 },
|
||||
grid: { color: isPrintMode ? '#e0e0e0' : '#21262d' },
|
||||
title: { display: true, text: unit,
|
||||
color: isPrintMode ? '#666' : '#484f58', font: { size: 10 } },
|
||||
},
|
||||
},
|
||||
},
|
||||
plugins: isHistogram ? [] : [{
|
||||
// Trigger line @ t=0 + triangle markers above/below + "0.0"
|
||||
// baseline label on the right edge. Matches the Instantel
|
||||
// BW Event Report printout style. Skipped for histograms —
|
||||
// they have no trigger event.
|
||||
id: 'instantelOverlays',
|
||||
afterDraw(chart) {
|
||||
const ctx = chart.ctx;
|
||||
const xAxis = chart.scales.x;
|
||||
const yAxis = chart.scales.y;
|
||||
const fgPrim = isPrintMode ? '#000' : '#c9d1d9';
|
||||
const fgTrigger = '#f85149';
|
||||
|
||||
// Dashed vertical trigger line at t=0
|
||||
const zeroIdx = rT.findIndex(t => parseFloat(t) >= 0);
|
||||
if (zeroIdx >= 0) {
|
||||
const x = xAxis.getPixelForValue(zeroIdx);
|
||||
ctx.save();
|
||||
ctx.beginPath();
|
||||
ctx.moveTo(x, yAxis.top);
|
||||
ctx.lineTo(x, yAxis.bottom);
|
||||
ctx.strokeStyle = isPrintMode ? '#cc0000' : 'rgba(248, 81, 73, 0.8)';
|
||||
ctx.lineWidth = 1.2;
|
||||
ctx.setLineDash([4, 3]);
|
||||
ctx.stroke();
|
||||
ctx.restore();
|
||||
|
||||
// Triangles above and below the chart at the trigger column
|
||||
ctx.save();
|
||||
ctx.fillStyle = fgTrigger;
|
||||
ctx.beginPath(); // top triangle pointing down
|
||||
ctx.moveTo(x - 5, yAxis.top - 8);
|
||||
ctx.lineTo(x + 5, yAxis.top - 8);
|
||||
ctx.lineTo(x, yAxis.top - 1);
|
||||
ctx.closePath();
|
||||
ctx.fill();
|
||||
ctx.beginPath(); // bottom triangle pointing up
|
||||
ctx.moveTo(x - 5, yAxis.bottom + 8);
|
||||
ctx.lineTo(x + 5, yAxis.bottom + 8);
|
||||
ctx.lineTo(x, yAxis.bottom + 1);
|
||||
ctx.closePath();
|
||||
ctx.fill();
|
||||
ctx.restore();
|
||||
}
|
||||
|
||||
// "0.0" baseline label on the right edge — printout convention.
|
||||
// Position vertically at the zero-amplitude level.
|
||||
const zeroY = yAxis.getPixelForValue(0);
|
||||
if (zeroY >= yAxis.top && zeroY <= yAxis.bottom) {
|
||||
ctx.save();
|
||||
ctx.strokeStyle = isPrintMode ? '#aaa' : '#30363d';
|
||||
ctx.lineWidth = 0.8;
|
||||
ctx.setLineDash([2, 2]);
|
||||
ctx.beginPath();
|
||||
ctx.moveTo(xAxis.left, zeroY);
|
||||
ctx.lineTo(xAxis.right, zeroY);
|
||||
ctx.stroke();
|
||||
ctx.restore();
|
||||
|
||||
ctx.save();
|
||||
ctx.fillStyle = fgPrim;
|
||||
ctx.font = '11px monospace';
|
||||
ctx.textAlign = 'left';
|
||||
ctx.textBaseline = 'middle';
|
||||
ctx.fillText('0.0', xAxis.right + 6, zeroY);
|
||||
ctx.restore();
|
||||
}
|
||||
},
|
||||
}],
|
||||
});
|
||||
charts[ch] = chart;
|
||||
}
|
||||
}
|
||||
|
||||
// Wire up handlers
|
||||
document.getElementById('serial-select').addEventListener('change', e => {
|
||||
loadEventsForSerial(e.target.value);
|
||||
});
|
||||
document.getElementById('event-filter').addEventListener('input', applyFilter);
|
||||
|
||||
// Initial load
|
||||
loadSerials();
|
||||
</script>
|
||||
</body>
|
||||
</html>
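A minimal Python sketch of the viewer's time-axis and downsampling arithmetic, useful for checking the same numbers outside the browser. `time_axis_ms` and `stride_decimate` mirror the `t0Ms + i * dtMs` expression and the `i % step === 0` filter in the script above. `minmax_decimate` is not in the source: it is a hypothetical alternative that keeps each bucket's extremes so a one-sample peak is not dropped by the stride. All three function names are assumptions made for illustration.

```python
def time_axis_ms(n_samples, sample_rate, pretrig_samples):
    """Per-sample times in ms, with the trigger sample at t = 0."""
    dt_ms = 1000.0 / sample_rate
    t0_ms = -pretrig_samples * dt_ms
    return [t0_ms + i * dt_ms for i in range(n_samples)]


def stride_decimate(values, max_points):
    """Keep every `step`-th sample (same rule as the viewer's filter)."""
    if len(values) <= max_points:
        return list(values)
    step = -(-len(values) // max_points)  # ceiling division
    return values[::step]


def minmax_decimate(values, max_points):
    """Hypothetical alternative: keep each bucket's min and max, in order."""
    if len(values) <= max_points:
        return list(values)
    buckets = max(1, max_points // 2)      # two points per bucket at most
    size = -(-len(values) // buckets)      # ceiling division
    out = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        lo_i = min(range(len(chunk)), key=chunk.__getitem__)
        hi_i = max(range(len(chunk)), key=chunk.__getitem__)
        for i in sorted({lo_i, hi_i}):     # preserve time order, no duplicates
            out.append(chunk[i])
    return out
```

Stride decimation is cheap and keeps the samples evenly spaced, which suits the category x-axis used here; the min/max variant trades even spacing for peak fidelity, so it would need matching changes to the time labels.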
|
||||
@@ -1,530 +0,0 @@
|
||||
"""
|
||||
sfm/event_hdf5.py — HDF5 codec for the canonical "clean waveform" file.
|
||||
|
||||
Layout written to `<filename>.h5`:
|
||||
|
||||
/
|
||||
├─ samples/
|
||||
│ ├─ Tran (float32, in/s) shape: (N,)
|
||||
│ ├─ Vert (float32, in/s) shape: (N,)
|
||||
│ ├─ Long (float32, in/s) shape: (N,)
|
||||
│ └─ MicL (float32, psi) shape: (N,)
|
||||
├─ samples_int16/ (optional)
|
||||
│ ├─ Tran (int16, raw ADC counts) shape: (N,)
|
||||
│ └─ ... per channel (only when present in the source)
|
||||
└─ root attrs (event metadata):
|
||||
schema_version int = 1
|
||||
kind str = "sfm.event.hdf5"
|
||||
serial str
|
||||
waveform_key str (8-hex)
|
||||
timestamp str (ISO-8601)
|
||||
record_type str
|
||||
sample_rate int (sps)
|
||||
pretrig_samples int
|
||||
total_samples int
|
||||
rectime_seconds float
|
||||
geo_range str "normal" | "sensitive"
|
||||
geo_full_scale_ips float (10.0 or 1.250)
|
||||
project str
|
||||
client str
|
||||
operator str
|
||||
sensor_location str
|
||||
peak_tran_ips float (from 0C; authoritative)
|
||||
peak_vert_ips float
|
||||
peak_long_ips float
|
||||
peak_pvs_ips float
|
||||
peak_mic_psi float
|
||||
tool_version str
|
||||
captured_at str (ISO-8601 UTC)
|
||||
source_kind str "sfm-live" | "sfm-ach" | "bw-import"
|
||||
|
||||
Why HDF5 and not just JSON for the canonical clean format:
|
||||
- Native float32 arrays (no base64 dance, no per-value JSON parsing).
|
||||
- Per-dataset gzip compression — sample arrays compress 3-5×.
|
||||
- Cross-language: h5py (Python), HDF5.jl (Julia), hdf5r (R), etc.
|
||||
Analysis pipelines don't have to know anything about Blastware.
|
||||
- Self-describing via attributes; future fields don't break readers.
|
||||
|
||||
The plot-ready `sfm.plot.v1` JSON returned by the REST endpoints is
|
||||
derived from this HDF5 (or computed on-the-fly when no .h5 exists yet).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import Optional, Union
|
||||
|
||||
import h5py
|
||||
import numpy as np
|
||||
|
||||
from minimateplus.event_file_io import TOOL_VERSION as _DEFAULT_TOOL_VERSION
|
||||
from minimateplus.models import Event
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
|
||||
SCHEMA_VERSION = 1
|
||||
HDF5_KIND = "sfm.event.hdf5"
|
||||
|
||||
# Geophone full-scale velocity per range (in/s). Confirmed in CLAUDE.md
|
||||
# from 4-20-26 captures: Normal=0x00 → 10 in/s, Sensitive=0x01 → 1.25 in/s.
|
||||
_GEO_FS_BY_RANGE = {
|
||||
"normal": 10.000,
|
||||
"sensitive": 1.2500,
|
||||
0: 10.000,
|
||||
1: 1.2500,
|
||||
}
|
||||
_INT16_FS = 32768.0
|
||||
|
||||
# Default mic conversion: ADC count → psi. Approximate; exact factor
|
||||
# depends on firmware reference voltage and mic sensitivity, neither of
|
||||
# which is independently confirmed. We try to refine it from the device-
|
||||
# reported peak when available (peak_mic_psi / max_abs_int16).
|
||||
_MIC_DEFAULT_FS_PSI = 0.0125 # ≈ 0.5 psi at full scale (rough)
|
||||
|
||||
|
||||
def _resolve_geo_full_scale(geo_range) -> float:
    """Map a geo_range value (string or int from compliance config) to the
    full-scale velocity in in/s.  Defaults to Normal range (10.0) when the
    value is unknown, the same default as Blastware itself."""
    if geo_range is None:
        return _GEO_FS_BY_RANGE["normal"]
    if isinstance(geo_range, str):
        return _GEO_FS_BY_RANGE.get(geo_range.lower(), _GEO_FS_BY_RANGE["normal"])
    return _GEO_FS_BY_RANGE.get(int(geo_range), _GEO_FS_BY_RANGE["normal"])
|
||||
|
||||
|
||||
def _normalise_range(geo_range) -> str:
    """Return 'normal' or 'sensitive' (string) regardless of input form."""
    if isinstance(geo_range, str):
        v = geo_range.lower()
        if v in ("normal", "sensitive"):
            return v
        return "normal"
    if geo_range == 1:
        return "sensitive"
    return "normal"
|
||||
|
||||
|
||||
def _ts_iso(ts) -> str:
    """Format a device timestamp as ISO-8601.

    Returns "" when ts is None and falls back to str(ts) when the fields
    cannot be assembled into a datetime.
    """
    if ts is None:
        return ""
    try:
        return datetime.datetime(
            ts.year, ts.month, ts.day,
            ts.hour or 0, ts.minute or 0, ts.second or 0,
        ).isoformat()
    except Exception:
        return str(ts)
|
||||
|
||||
|
||||
def _samples_to_float(
    samples_int16: list[int],
    full_scale: float,
) -> np.ndarray:
    """Convert int16 ADC counts → float32 physical units.

    Uses _INT16_FS=32768 (not 32767) so that a count of -32768 maps to
    exactly -full_scale and +32767 maps to ~+full_scale * 32767/32768.
    Matches the device firmware's documented mapping (see CLAUDE.md
    geo_hardware_constant rationale).
    """
    if not samples_int16:
        return np.array([], dtype=np.float32)
    arr = np.asarray(samples_int16, dtype=np.int32)  # int32 to avoid overflow during scale
    return (arr.astype(np.float32) * (full_scale / _INT16_FS)).astype(np.float32)
|
||||
|
||||
|
||||
def _mic_scale_factor(
    samples_int16: list[int],
    peak_mic_psi: Optional[float],
) -> float:
    """Resolve the per-count psi factor for the microphone channel.

    When the device reports a peak mic value via the 0C record, we
    back-solve the per-count factor from `peak_psi / max(|samples|)` so
    the plotted waveform peaks land exactly at the device-reported value.
    Otherwise fall back to the rough _MIC_DEFAULT_FS_PSI estimate.
    """
    if peak_mic_psi is not None and peak_mic_psi > 0 and samples_int16:
        max_count = max(abs(int(v)) for v in samples_int16) or 1
        return float(peak_mic_psi) / float(max_count)
    return _MIC_DEFAULT_FS_PSI / _INT16_FS
|
||||
|
||||
|
||||
def write_event_hdf5(
|
||||
path: Union[str, Path],
|
||||
event: Event,
|
||||
*,
|
||||
serial: str,
|
||||
geo_range = "normal",
|
||||
source_kind: str = "sfm-live",
|
||||
tool_version: Optional[str] = None,
|
||||
captured_at: Optional[datetime.datetime] = None,
|
||||
include_int16: bool = True,
|
||||
) -> dict:
|
||||
"""
|
||||
Persist a decoded Event as an HDF5 file with samples in physical units.
|
||||
|
||||
Returns a small summary dict suitable for logging:
|
||||
{"path": Path, "n_samples": int, "geo_full_scale_ips": float}
|
||||
"""
|
||||
path = Path(path)
|
||||
raw = event.raw_samples or {}
|
||||
pv = event.peak_values
|
||||
pi = event.project_info
|
||||
|
||||
geo_fs = _resolve_geo_full_scale(geo_range)
|
||||
geo_range_str = _normalise_range(geo_range)
|
||||
captured_at = captured_at or datetime.datetime.utcnow()
|
||||
tool_version = tool_version or _DEFAULT_TOOL_VERSION
|
||||
|
||||
# Per-channel float32 arrays in physical units.
|
||||
geo_arrays = {}
|
||||
for ch in ("Tran", "Vert", "Long"):
|
||||
geo_arrays[ch] = _samples_to_float(raw.get(ch, []), geo_fs)
|
||||
|
||||
# Mic channel — the per-count factor is resolved from the device-reported
|
||||
# peak when available so the plotted peak matches the BW value exactly.
|
||||
mic_int16 = raw.get("MicL", [])
|
||||
mic_factor = _mic_scale_factor(
|
||||
mic_int16,
|
||||
getattr(pv, "micl", None) if pv else None,
|
||||
)
|
||||
if mic_int16:
|
||||
mic_arr = (np.asarray(mic_int16, dtype=np.int32).astype(np.float32) * mic_factor).astype(np.float32)
|
||||
else:
|
||||
mic_arr = np.array([], dtype=np.float32)
|
||||
|
||||
n_samples = max(
|
||||
(len(geo_arrays[ch]) for ch in geo_arrays),
|
||||
default=0,
|
||||
)
|
||||
|
||||
# Atomic write: temp file + os.replace.
|
||||
tmp = path.with_suffix(path.suffix + ".tmp")
|
||||
with h5py.File(tmp, "w") as f:
|
||||
# Root attrs — event-level metadata.
|
||||
attrs = f.attrs
|
||||
attrs["schema_version"] = SCHEMA_VERSION
|
||||
attrs["kind"] = HDF5_KIND
|
||||
attrs["serial"] = serial or ""
|
||||
attrs["waveform_key"] = event._waveform_key.hex() if event._waveform_key else ""
|
||||
attrs["timestamp"] = _ts_iso(event.timestamp)
|
||||
attrs["record_type"] = event.record_type or ""
|
||||
attrs["sample_rate"] = int(event.sample_rate or 0)
|
||||
attrs["pretrig_samples"] = int(event.pretrig_samples or 0)
|
||||
attrs["total_samples"] = int(event.total_samples or n_samples)
|
||||
attrs["rectime_seconds"] = float(event.rectime_seconds or 0.0)
|
||||
attrs["geo_range"] = geo_range_str
|
||||
attrs["geo_full_scale_ips"] = float(geo_fs)
|
||||
attrs["project"] = (pi.project if pi else "") or ""
|
||||
attrs["client"] = (pi.client if pi else "") or ""
|
||||
attrs["operator"] = (pi.operator if pi else "") or ""
|
||||
attrs["sensor_location"] = (pi.sensor_location if pi else "") or ""
|
||||
attrs["peak_tran_ips"] = float(pv.tran if pv and pv.tran is not None else 0.0)
|
||||
attrs["peak_vert_ips"] = float(pv.vert if pv and pv.vert is not None else 0.0)
|
||||
attrs["peak_long_ips"] = float(pv.long if pv and pv.long is not None else 0.0)
|
||||
attrs["peak_pvs_ips"] = float(pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0)
|
||||
attrs["peak_mic_psi"] = float(pv.micl if pv and pv.micl is not None else 0.0)
|
||||
attrs["tool_version"] = tool_version or ""
|
||||
attrs["captured_at"] = captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat()
|
||||
attrs["source_kind"] = source_kind
|
||||
|
||||
# /samples — physical-units float32 (the primary data).
|
||||
sgrp = f.create_group("samples")
|
||||
for ch, arr in geo_arrays.items():
|
||||
sgrp.create_dataset(
|
||||
ch, data=arr, dtype="float32",
|
||||
compression="gzip", compression_opts=4, shuffle=True,
|
||||
)
|
||||
sgrp.create_dataset(
|
||||
"MicL", data=mic_arr, dtype="float32",
|
||||
compression="gzip", compression_opts=4, shuffle=True,
|
||||
)
|
||||
|
||||
# /samples_int16 — optional raw ADC counts (preserved for analysis
|
||||
# tools that want pre-conversion data). Cheap to include.
|
||||
if include_int16:
|
||||
igrp = f.create_group("samples_int16")
|
||||
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||
vals = raw.get(ch, [])
|
||||
if vals:
|
||||
igrp.create_dataset(
|
||||
ch, data=np.asarray(vals, dtype=np.int16),
|
||||
compression="gzip", compression_opts=4, shuffle=True,
|
||||
)
|
||||
igrp.attrs["mic_psi_per_count"] = float(mic_factor)
|
||||
|
||||
import os
|
||||
os.replace(tmp, path)
|
||||
|
||||
log.info(
|
||||
"write_event_hdf5: %s n_samples=%d geo_fs=%.3f filesize=%d",
|
||||
path, n_samples, geo_fs, path.stat().st_size,
|
||||
)
|
||||
return {
|
||||
"path": path,
|
||||
"n_samples": n_samples,
|
||||
"geo_full_scale_ips": geo_fs,
|
||||
}
|
||||
|
||||
|
||||
def read_event_hdf5(path: Union[str, Path]) -> dict:
|
||||
"""
|
||||
Load an event HDF5 into a plain dict (no Event reconstruction —
|
||||
callers that want an Event can use the data directly).
|
||||
|
||||
    Returns:
        {
            "schema_version": int,
            "kind": str,
            "attrs": dict[str, …],   # all root attributes
            "samples": {             # float32 arrays in physical units
                "Tran": ndarray, "Vert": ndarray, "Long": ndarray, "MicL": ndarray,
            },
            "samples_int16": {…} or None,
            "mic_psi_per_count": float | None,
        }

    Raises FileNotFoundError if missing, ValueError on an unexpected `kind`
    or an unsupported schema_version.
|
||||
"""
|
||||
path = Path(path)
|
||||
with h5py.File(path, "r") as f:
|
||||
attrs = {k: _h5_attr_value(v) for k, v in f.attrs.items()}
|
||||
sv = attrs.get("schema_version", 0)
|
||||
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
|
||||
raise ValueError(
|
||||
f"{path}: unsupported HDF5 schema_version={sv} "
|
||||
f"(this build supports 1..{SCHEMA_VERSION})"
|
||||
)
|
||||
if attrs.get("kind") != HDF5_KIND:
|
||||
raise ValueError(f"{path}: kind != {HDF5_KIND!r} (got {attrs.get('kind')!r})")
|
||||
|
||||
samples = {}
|
||||
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||
ds = f.get(f"samples/{ch}")
|
||||
samples[ch] = np.asarray(ds[()]) if ds is not None else np.array([], dtype=np.float32)
|
||||
|
||||
samples_int16 = None
|
||||
mic_psi = None
|
||||
igrp = f.get("samples_int16")
|
||||
if igrp is not None:
|
||||
samples_int16 = {}
|
||||
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||
ds = igrp.get(ch)
|
||||
if ds is not None:
|
||||
samples_int16[ch] = np.asarray(ds[()])
|
||||
mic_attr = igrp.attrs.get("mic_psi_per_count")
|
||||
if mic_attr is not None:
|
||||
mic_psi = float(mic_attr)
|
||||
|
||||
return {
|
||||
"schema_version": sv,
|
||||
"kind": attrs.get("kind"),
|
||||
"attrs": attrs,
|
||||
"samples": samples,
|
||||
"samples_int16": samples_int16,
|
||||
"mic_psi_per_count": mic_psi,
|
||||
}
|
||||
|
||||
|
||||
def _h5_attr_value(v):
    """Convert an h5py attribute value to a plain Python type."""
    if isinstance(v, bytes):
        return v.decode("utf-8", errors="replace")
    if isinstance(v, np.generic):
        return v.item()
    return v
|
||||
|
||||
|
||||
# ── Plot-ready JSON ──────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def event_to_plot_json(
|
||||
event: Event,
|
||||
*,
|
||||
serial: str,
|
||||
geo_range = "normal",
|
||||
event_id: Optional[str] = None,
|
||||
index: Optional[int] = None,
|
||||
) -> dict:
|
||||
"""
|
||||
Build a `sfm.plot.v1` JSON dict directly from an Event (skipping HDF5).
|
||||
|
||||
Used by:
|
||||
- `/device/event/{idx}/waveform` (live device path)
|
||||
- The CLI / tests for in-memory conversion sanity-checks.
|
||||
|
||||
Stored events go through `plot_json_from_hdf5()` so the wire format
|
||||
is identical regardless of whether the data came from the live device
|
||||
or the on-disk HDF5.
|
||||
"""
|
||||
raw = event.raw_samples or {}
|
||||
pv = event.peak_values
|
||||
geo_fs = _resolve_geo_full_scale(geo_range)
|
||||
geo_range_str = _normalise_range(geo_range)
|
||||
sr = int(event.sample_rate or 0) or 1024
|
||||
pretrig = int(event.pretrig_samples or 0)
|
||||
|
||||
geo_arrays = {ch: _samples_to_float(raw.get(ch, []), geo_fs).tolist()
|
||||
for ch in ("Tran", "Vert", "Long")}
|
||||
mic_int16 = raw.get("MicL", [])
|
||||
mic_factor = _mic_scale_factor(
|
||||
mic_int16,
|
||||
getattr(pv, "micl", None) if pv else None,
|
||||
)
|
||||
mic_arr = [float(v) * mic_factor for v in mic_int16] if mic_int16 else []
|
||||
|
||||
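    # Geophone channels set the sample count; fall back to the mic length
    # when no geophone samples are present.  (max(..., default=...) never
    # used its default here because geo_arrays always has three keys.)
    n = max(len(geo_arrays[ch]) for ch in geo_arrays) or len(mic_arr)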
|
||||
return _build_plot_dict(
|
||||
n_samples=n,
|
||||
sample_rate=sr,
|
||||
pretrig_samples=pretrig,
|
||||
total_samples=int(event.total_samples or n),
|
||||
rectime_seconds=float(event.rectime_seconds or 0.0),
|
||||
timestamp_iso=_ts_iso(event.timestamp),
|
||||
serial=serial,
|
||||
record_type=event.record_type,
|
||||
waveform_key=event._waveform_key.hex() if event._waveform_key else None,
|
||||
geo_range=geo_range_str,
|
||||
geo_fs=geo_fs,
|
||||
channels_floats={
|
||||
"Tran": geo_arrays["Tran"],
|
||||
"Vert": geo_arrays["Vert"],
|
||||
"Long": geo_arrays["Long"],
|
||||
"MicL": mic_arr,
|
||||
},
|
||||
peaks_dict={
|
||||
"tran": getattr(pv, "tran", None) if pv else None,
|
||||
"vert": getattr(pv, "vert", None) if pv else None,
|
||||
"long": getattr(pv, "long", None) if pv else None,
|
||||
"pvs": getattr(pv, "peak_vector_sum", None) if pv else None,
|
||||
"mic": getattr(pv, "micl", None) if pv else None,
|
||||
},
|
||||
event_id=event_id,
|
||||
index=index if index is not None else event.index,
|
||||
)
|
||||
|
||||
|
||||
def plot_json_from_hdf5(
|
||||
path: Union[str, Path],
|
||||
*,
|
||||
event_id: Optional[str] = None,
|
||||
index: Optional[int] = None,
|
||||
) -> dict:
|
||||
"""Build a `sfm.plot.v1` JSON dict from a stored .h5 file."""
|
||||
data = read_event_hdf5(path)
|
||||
a = data["attrs"]
|
||||
s = data["samples"]
|
||||
return _build_plot_dict(
|
||||
n_samples=len(s["Tran"]) if "Tran" in s else 0,
|
||||
sample_rate=int(a.get("sample_rate", 1024) or 1024),
|
||||
pretrig_samples=int(a.get("pretrig_samples", 0) or 0),
|
||||
total_samples=int(a.get("total_samples", 0) or 0),
|
||||
rectime_seconds=float(a.get("rectime_seconds", 0.0) or 0.0),
|
||||
timestamp_iso=a.get("timestamp", ""),
|
||||
serial=a.get("serial", ""),
|
||||
record_type=a.get("record_type", ""),
|
||||
waveform_key=a.get("waveform_key", "") or None,
|
||||
geo_range=a.get("geo_range", "normal"),
|
||||
geo_fs=float(a.get("geo_full_scale_ips", 10.0) or 10.0),
|
||||
channels_floats={
|
||||
"Tran": s.get("Tran", np.array([])).tolist(),
|
||||
"Vert": s.get("Vert", np.array([])).tolist(),
|
||||
"Long": s.get("Long", np.array([])).tolist(),
|
||||
"MicL": s.get("MicL", np.array([])).tolist(),
|
||||
},
|
||||
peaks_dict={
|
||||
"tran": float(a.get("peak_tran_ips", 0.0) or 0.0) or None,
|
||||
"vert": float(a.get("peak_vert_ips", 0.0) or 0.0) or None,
|
||||
"long": float(a.get("peak_long_ips", 0.0) or 0.0) or None,
|
||||
"pvs": float(a.get("peak_pvs_ips", 0.0) or 0.0) or None,
|
||||
"mic": float(a.get("peak_mic_psi", 0.0) or 0.0) or None,
|
||||
},
|
||||
event_id=event_id,
|
||||
index=index,
|
||||
)
|
||||
|
||||
|
||||
def _build_plot_dict(
|
||||
*,
|
||||
n_samples: int,
|
||||
sample_rate: int,
|
||||
pretrig_samples: int,
|
||||
total_samples: int,
|
||||
rectime_seconds: float,
|
||||
timestamp_iso: str,
|
||||
serial: str,
|
||||
record_type: Optional[str],
|
||||
waveform_key: Optional[str],
|
||||
geo_range: str,
|
||||
geo_fs: float,
|
||||
channels_floats: dict[str, list[float]],
|
||||
peaks_dict: dict[str, Optional[float]],
|
||||
event_id: Optional[str],
|
||||
index: Optional[int] = None,
|
||||
) -> dict:
|
||||
dt_ms = (1000.0 / sample_rate) if sample_rate > 0 else 0.0
|
||||
t0_ms = -pretrig_samples * dt_ms
|
||||
|
||||
def _ch(unit: str, values: list[float], peak: Optional[float]) -> dict:
|
||||
# Locate the peak's time within the values array (max abs).
|
||||
if values:
|
||||
mags = [abs(v) for v in values]
|
||||
i = mags.index(max(mags))
|
||||
peak_t_ms = round(t0_ms + i * dt_ms, 4)
|
||||
peak_value = peak if peak is not None else values[i]
|
||||
else:
|
||||
peak_t_ms = None
|
||||
peak_value = peak
|
||||
return {
|
||||
"unit": unit,
|
||||
"values": values,
|
||||
"peak": peak_value,
|
||||
"peak_t_ms": peak_t_ms,
|
||||
}
|
||||
|
||||
return {
|
||||
"schema": "sfm.plot.v1",
|
||||
"event_id": event_id,
|
||||
"index": index,
|
||||
"serial": serial,
|
||||
"timestamp": timestamp_iso,
|
||||
"record_type": record_type,
|
||||
"waveform_key": waveform_key,
|
||||
|
||||
"time_axis": {
|
||||
"sample_rate": sample_rate,
|
||||
"pretrig_samples": pretrig_samples,
|
||||
"total_samples": total_samples or n_samples,
|
||||
"n_samples": n_samples,
|
||||
"t0_ms": round(t0_ms, 4),
|
||||
"dt_ms": round(dt_ms, 6),
|
||||
"rectime_seconds": rectime_seconds,
|
||||
},
|
||||
|
||||
"geo_range": geo_range,
|
||||
"geo_full_scale_ips": geo_fs,
|
||||
"trigger_ms": 0.0,
|
||||
|
||||
"channels": {
|
||||
"Tran": _ch("in/s", channels_floats.get("Tran", []), peaks_dict.get("tran")),
|
||||
"Vert": _ch("in/s", channels_floats.get("Vert", []), peaks_dict.get("vert")),
|
||||
"Long": _ch("in/s", channels_floats.get("Long", []), peaks_dict.get("long")),
|
||||
"MicL": _ch("psi", channels_floats.get("MicL", []), peaks_dict.get("mic")),
|
||||
},
|
||||
|
||||
"peak_values": {
|
||||
"transverse": peaks_dict.get("tran"),
|
||||
"vertical": peaks_dict.get("vert"),
|
||||
"longitudinal": peaks_dict.get("long"),
|
||||
"vector_sum": peaks_dict.get("pvs"),
|
||||
"mic_psi": peaks_dict.get("mic"),
|
||||
},
|
||||
}
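The scaling rules used throughout this module can be checked with a short, dependency-free sketch. It assumes the same constants the module states (divisor 32768, full scale 10.0 or 1.25 in/s, mic fallback 0.0125) but uses plain lists instead of NumPy; the function names `counts_to_units`, `mic_factor` and `peak_time_ms` are hypothetical and only illustrate the arithmetic in `_samples_to_float`, `_mic_scale_factor` and `_build_plot_dict`.

```python
INT16_FS = 32768.0  # same divisor as the module's _INT16_FS


def counts_to_units(counts, full_scale):
    """Scale raw ADC counts to physical units (in/s for geophones)."""
    scale = full_scale / INT16_FS
    return [c * scale for c in counts]


def mic_factor(counts, peak_psi, default_fs_psi=0.0125):
    """psi per count: back-solved from the reported peak when available."""
    if peak_psi is not None and peak_psi > 0 and counts:
        max_count = max(abs(c) for c in counts) or 1
        return peak_psi / max_count
    return default_fs_psi / INT16_FS


def peak_time_ms(values, sample_rate, pretrig_samples):
    """Time of the largest |value| relative to the trigger (t = 0)."""
    if not values:
        return None
    dt_ms = 1000.0 / sample_rate
    i = max(range(len(values)), key=lambda k: abs(values[k]))  # first max
    return round((i - pretrig_samples) * dt_ms, 4)


if __name__ == "__main__":
    print(counts_to_units([-32768, 0, 16384], 10.0))
    print(mic_factor([100, -200, 50], 0.004))
    print(peak_time_ms([0.0, 0.1, -0.9, 0.2], 1000, 1))
```

Because the mic factor is back-solved per event, two events from the same unit can carry different `mic_psi_per_count` values; storing the factor next to the raw counts, as the writer does, is what keeps the int16 data interpretable later.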
|
||||
@@ -1,195 +0,0 @@
|
||||
"""
|
||||
sfm/import_bw.py — CLI for ingesting Blastware-format event files.
|
||||
|
||||
Walks a path (file or directory), parses each recognised event-file
|
||||
binary, copies it into the canonical waveform store, writes the
|
||||
.sfm.json sidecar, and upserts a row in seismo_relay.db.
|
||||
|
||||
Use cases:
|
||||
- Migrating a Blastware ACH inbox into SFM
|
||||
- One-off imports of files emailed in by field crews
|
||||
- Bulk-loading historical archives
|
||||
|
||||
Usage:
|
||||
python -m sfm.import_bw <path-or-dir> [--serial BE11529]
|
||||
[--db-path bridges/captures/seismo_relay.db]
|
||||
[--store-root bridges/captures/waveforms]
|
||||
[--dry-run]
|
||||
[-v]
|
||||
|
||||
Examples:
|
||||
python -m sfm.import_bw ~/Downloads/M529LKIQ.7M0W
|
||||
python -m sfm.import_bw /path/to/blastware_archive --serial BE11529
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import logging
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Iterator
|
||||
|
||||
# Allow running from the repo root without installation.
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
||||
|
||||
from sfm.database import SeismoDb
|
||||
from sfm.waveform_store import WaveformStore
|
||||
|
||||
log = logging.getLogger("sfm.import_bw")
|
||||
|
||||
|
||||
# Blastware event-file extensions: 4-char `AB0T` (T = W or H) for ACH
|
||||
# downloads, 3-char `AB0` for direct downloads. We discover candidates
|
||||
# by length + last-char rather than enumerating every (A, B) pair.
|
||||
def _looks_like_bw_event(path: Path) -> bool:
    """Heuristic: 3-char or 4-char extension, ends with W/H/0, and the
    file is at least 70 bytes (header + STRT + footer minimum)."""
    if not path.is_file():
        return False
    ext = path.suffix.lstrip(".")
    if not (3 <= len(ext) <= 4):
        return False
    if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
        return False
    try:
        return path.stat().st_size >= 70
    except OSError:
        return False
|
||||
|
||||
|
||||
def _walk(path: Path) -> Iterator[Path]:
    """Yield candidate BW event-file paths under `path` (file or dir)."""
    if path.is_file():
        if _looks_like_bw_event(path):
            yield path
        return
    if path.is_dir():
        for p in sorted(path.rglob("*")):
            if _looks_like_bw_event(p):
                yield p
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
p = argparse.ArgumentParser(
|
||||
description="Import Blastware-format event files into the SFM store + DB.",
|
||||
)
|
||||
p.add_argument("path", help="File or directory to import.")
|
||||
p.add_argument(
|
||||
"--serial", default=None, metavar="SERIAL",
|
||||
help="Override the serial-number hint (e.g. BE11529). Defaults to "
|
||||
"the value decoded from each BW filename's prefix.",
|
||||
)
|
||||
p.add_argument(
|
||||
"--db-path",
|
||||
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
|
||||
help="Path to seismo_relay.db (default: bridges/captures/seismo_relay.db).",
|
||||
)
|
||||
p.add_argument(
|
||||
"--store-root",
|
||||
default=None,
|
||||
help="Root of the waveform store (default: <db_dir>/waveforms).",
|
||||
)
|
||||
p.add_argument(
|
||||
"--dry-run", action="store_true",
|
||||
help="Parse and report per-file outcomes; don't write anything.",
|
||||
)
|
||||
p.add_argument("-v", "--verbose", action="store_true", help="Debug logging.")
|
||||
args = p.parse_args(argv)
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.DEBUG if args.verbose else logging.INFO,
|
||||
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
|
||||
datefmt="%H:%M:%S",
|
||||
)
|
||||
|
||||
src = Path(args.path).expanduser().resolve()
|
||||
if not src.exists():
|
||||
print(f"error: {src} does not exist", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
db_path = Path(args.db_path).expanduser().resolve()
|
||||
store_root = (
|
||||
Path(args.store_root).expanduser().resolve()
|
||||
if args.store_root else db_path.parent / "waveforms"
|
||||
)
|
||||
|
||||
db = None if args.dry_run else SeismoDb(db_path)
|
||||
store = None if args.dry_run else WaveformStore(store_root)
|
||||
|
||||
candidates = list(_walk(src))
|
||||
if not candidates:
|
||||
print(f"No BW event-file candidates found under {src}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
print(f"Importing {len(candidates)} file(s) from {src}...")
|
||||
if args.dry_run:
|
||||
print("(dry-run — no writes will occur)")
|
||||
|
||||
ok = err = skipped = 0
|
||||
for path in candidates:
|
||||
try:
|
||||
bw_bytes = path.read_bytes()
|
||||
except Exception as exc:
|
||||
print(f" [ERR ] {path}: read failed: {exc}")
|
||||
err += 1
|
||||
continue
|
||||
|
||||
if args.dry_run:
|
||||
# Just parse to verify integrity; don't touch DB or store.
|
||||
from minimateplus import event_file_io
|
||||
try:
|
||||
ev = event_file_io.read_blastware_file(path)
|
||||
ts = ev.timestamp and (
|
||||
f"{ev.timestamp.year}-{ev.timestamp.month:02d}-{ev.timestamp.day:02d} "
|
||||
f"{ev.timestamp.hour:02d}:{ev.timestamp.minute:02d}:{ev.timestamp.second:02d}"
|
||||
) or "?"
|
||||
pv = ev.peak_values
|
||||
pvs = pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0
|
||||
print(f" [OK ] {path.name} ts={ts} PVS={pvs:.4f}")
|
||||
ok += 1
|
||||
except Exception as exc:
|
||||
print(f" [ERR ] {path}: parse failed: {exc}")
|
||||
err += 1
|
||||
continue
|
||||
|
||||
try:
|
||||
ev, rec = store.save_imported_bw(
|
||||
bw_bytes, source_path=path, serial_hint=args.serial,
|
||||
)
|
||||
# Resolve serial for the DB row. Prefer the hint, then the
|
||||
# one decoded from the filename (already done by the store).
|
||||
serial_used = args.serial or _infer_serial(path.name) or "UNKNOWN"
|
||||
ins, sk = db.insert_events(
|
||||
[ev], serial=serial_used,
|
||||
waveform_records=(
|
||||
{ev._waveform_key.hex(): rec}
|
||||
if ev._waveform_key else None
|
||||
),
|
||||
device_family="series3",
|
||||
)
|
||||
tag = "OK " if ins else ("SKIP" if sk else "OK ")
|
||||
print(f" [{tag}] {path.name} → {rec['filename']} "
|
||||
f"({rec['filesize']} B, sha256={rec['sha256'][:12]}…) "
|
||||
f"serial={serial_used} ins={ins} skip={sk}")
|
||||
if ins:
|
||||
ok += 1
|
||||
else:
|
||||
skipped += 1
|
||||
except Exception as exc:
|
||||
print(f" [ERR ] {path}: import failed: {exc}")
|
||||
log.debug("traceback", exc_info=True)
|
||||
err += 1
|
||||
|
||||
print(f"\nDone. ok={ok} skipped={skipped} errors={err}")
|
||||
return 0 if err == 0 else 1
|
||||
|
||||
|
||||
def _infer_serial(filename: str):
    """Reuse WaveformStore's filename → serial decoder for log output."""
    from sfm.waveform_store import _serial_from_bw_filename
    return _serial_from_bw_filename(filename)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
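The extension rule described in the comment above `_looks_like_bw_event` (4-char `AB0T` with T = W or H for ACH downloads, 3-char `AB0` for direct downloads) can be sketched as a pure name check. This is a stricter, hypothetical variant, not the importer's actual filter: it also requires the `0` in third position, does not touch the filesystem, and the name `classify_bw_extension` and the labels `"ach"` / `"direct"` are assumptions for illustration.

```python
from pathlib import PurePath


def classify_bw_extension(filename):
    """Return "ach", "direct", or None based on the extension shape only."""
    ext = PurePath(filename).suffix.lstrip(".").upper()
    # ACH downloads: AB0T, where T is W or H.
    if len(ext) == 4 and ext[2] == "0" and ext[3] in {"W", "H"}:
        return "ach"
    # Direct downloads: AB0.
    if len(ext) == 3 and ext[2] == "0":
        return "direct"
    return None


if __name__ == "__main__":
    for name in ("M529LKIQ.7M0W", "M529LKIQ.7M0", "notes.txt"):
        print(name, classify_bw_extension(name))
```

A name-only check like this cannot replace the size test or the parse step; it is only a quick way to see why a given file was or was not picked up as a candidate.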
|
||||
@@ -1,189 +0,0 @@
|
||||
"""
|
||||
sfm/live_cache.py — Thread-safe in-memory cache for live SFM device data.
|
||||
|
||||
Extracted from sfm/server.py so the cache logic is importable and testable
|
||||
without pulling in fastapi/uvicorn.
|
||||
|
||||
Caching strategy
|
||||
----------------
|
||||
Keyed by `conn_key` ("tcp:host:port" or "serial:port:baud"). Does NOT
|
||||
persist across server restarts.
|
||||
|
||||
device_info cached until POST /device/config marks it dirty
|
||||
events cached by (conn_key, device_event_count); re-fetched when
|
||||
a quick count_events() probe shows new events on the device
|
||||
monitor_status 30-second TTL (changes frequently during monitoring)
|
||||
waveforms permanent within a process — but auto-evicted at the device
|
||||
level when a (waveform_key, timestamp) mismatch is detected
|
||||
at the same index (post-erase key reuse — the device's
|
||||
event-key counter resets to 0x01110000 after every erase,
|
||||
so the same `(conn_key, index)` slot can refer to a
|
||||
brand-new physical event).
|
||||
|
||||
All endpoints accept ?force=true to bypass the cache and re-read.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import threading
|
||||
import time
|
||||
from typing import Optional
|
||||
|
||||
_MONITOR_STATUS_TTL = 30.0 # seconds
|
||||
|
||||
|
||||
class LiveCache:
|
||||
"""
|
||||
Thread-safe in-memory cache for live SFM device data.
|
||||
One singleton per server process.
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self._lock = threading.Lock()
|
||||
self._device_info: dict[str, dict] = {}
|
||||
self._events: dict[str, tuple[int, list]] = {}
|
||||
self._monitor_status: dict[str, tuple[float, dict]] = {}
|
||||
self._config_dirty: dict[str, bool] = {}
|
||||
self._waveforms: dict[tuple, dict] = {}
|
||||
|
||||
# ── Connection key ────────────────────────────────────────────────────────
|
||||
|
||||
@staticmethod
|
||||
def make_conn_key(
|
||||
host: Optional[str],
|
||||
tcp_port: int,
|
||||
port: Optional[str],
|
||||
baud: int,
|
||||
) -> str:
|
||||
if host:
|
||||
return f"tcp:{host}:{tcp_port}"
|
||||
return f"serial:{port}:{baud}"
|
||||
|
||||
# ── Eviction signature ────────────────────────────────────────────────────
|
||||
|
||||
@staticmethod
|
||||
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
|
||||
"""Return (waveform_key_hex, timestamp_iso) from a serialised event."""
|
||||
key = ev.get("waveform_key") or ev.get("_waveform_key")
|
||||
if isinstance(key, (bytes, bytearray)):
|
||||
key = bytes(key).hex()
|
||||
ts = ev.get("timestamp")
|
||||
if isinstance(ts, dict):
|
||||
ts = ts.get("iso") or ts.get("string") or None
|
||||
return (key if isinstance(key, str) else None,
|
||||
ts if isinstance(ts, str) else None)
|
||||
|
||||
def _flush_device(self, conn_key: str) -> None:
|
||||
"""Drop all cached events + waveforms for one device. Caller holds lock."""
|
||||
self._events.pop(conn_key, None)
|
||||
stale_wf_keys = [k for k in self._waveforms if k[0] == conn_key]
|
||||
for k in stale_wf_keys:
|
||||
self._waveforms.pop(k, None)
|
||||
|
||||
# ── Device info ───────────────────────────────────────────────────────────
|
||||
|
||||
def get_device_info(self, conn_key: str) -> Optional[dict]:
|
||||
with self._lock:
|
||||
if self._config_dirty.get(conn_key):
|
||||
return None
|
||||
return self._device_info.get(conn_key)
|
||||
|
||||
def set_device_info(self, conn_key: str, info: dict) -> None:
|
||||
with self._lock:
|
||||
self._device_info[conn_key] = info
|
||||
self._config_dirty[conn_key] = False
|
||||
|
||||
# ── Events ────────────────────────────────────────────────────────────────
|
||||
|
||||
def get_events(self, conn_key: str, device_count: int) -> Optional[list]:
|
||||
with self._lock:
|
||||
if self._config_dirty.get(conn_key):
|
||||
return None
|
||||
entry = self._events.get(conn_key)
|
||||
if entry is None:
|
||||
return None
|
||||
cached_count, events = entry
|
||||
return events if cached_count == device_count else None
|
||||
|
||||
def set_events(self, conn_key: str, device_count: int, events: list) -> None:
|
||||
"""
|
||||
Replace the cached events list for `conn_key`. If any incoming event
|
||||
has a different (waveform_key, timestamp) than the cached entry at
|
||||
the same index, flush the entire conn_key's event + waveform cache
|
||||
first. Catches post-erase key reuse.
|
||||
"""
|
||||
with self._lock:
|
||||
cached_entry = self._events.get(conn_key)
|
||||
cached_events = cached_entry[1] if cached_entry else []
|
||||
cached_by_index = {e.get("index"): e for e in cached_events}
|
||||
|
||||
evict = False
|
||||
for ev in events:
|
||||
idx = ev.get("index")
|
||||
if idx is None:
|
||||
continue
|
||||
cached = cached_by_index.get(idx)
|
||||
if cached is None:
|
||||
continue
|
||||
new_key, new_ts = self._event_signature(ev)
|
||||
old_key, old_ts = self._event_signature(cached)
|
||||
if (new_key and old_key and new_key != old_key) or \
|
||||
(new_ts and old_ts and new_ts != old_ts):
|
||||
evict = True
|
||||
break
|
||||
|
||||
if evict:
|
||||
self._flush_device(conn_key)
|
||||
|
||||
self._events[conn_key] = (device_count, events)
|
||||
|
||||
# ── Monitor status ────────────────────────────────────────────────────────
|
||||
|
||||
def get_monitor_status(self, conn_key: str) -> Optional[dict]:
|
||||
with self._lock:
|
||||
entry = self._monitor_status.get(conn_key)
|
||||
if entry is None:
|
||||
return None
|
||||
fetched_at, status = entry
|
||||
if time.time() - fetched_at > _MONITOR_STATUS_TTL:
|
||||
return None
|
||||
return status
|
||||
|
||||
def set_monitor_status(self, conn_key: str, status: dict) -> None:
|
||||
with self._lock:
|
||||
self._monitor_status[conn_key] = (time.time(), status)
|
||||
|
||||
def invalidate_monitor_status(self, conn_key: str) -> None:
|
||||
with self._lock:
|
||||
self._monitor_status.pop(conn_key, None)
|
||||
|
||||
# ── Config dirty flag ─────────────────────────────────────────────────────
|
||||
|
||||
def mark_config_dirty(self, conn_key: str) -> None:
|
||||
with self._lock:
|
||||
self._config_dirty[conn_key] = True
|
||||
self._events.pop(conn_key, None)
|
||||
|
||||
# ── Waveforms (permanent cache, evicted on (key,ts) mismatch) ─────────────
|
||||
|
||||
def get_waveform(self, conn_key: str, index: int) -> Optional[dict]:
|
||||
with self._lock:
|
||||
return self._waveforms.get((conn_key, index))
|
||||
|
||||
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
|
||||
"""
|
||||
Cache a waveform. Evicts the device's whole cache when the existing
|
||||
entry at the same index has a different (waveform_key, timestamp).
|
||||
"""
|
||||
with self._lock:
|
||||
existing = self._waveforms.get((conn_key, index))
|
||||
if existing is not None:
|
||||
new_key, new_ts = self._event_signature(waveform)
|
||||
old_key, old_ts = self._event_signature(existing)
|
||||
differs = (
|
||||
(new_key and old_key and new_key != old_key)
|
||||
or (new_ts and old_ts and new_ts != old_ts)
|
||||
)
|
||||
if differs:
|
||||
self._flush_device(conn_key)
|
||||
self._waveforms[(conn_key, index)] = waveform
|
||||
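The eviction rule used by `set_events` and `set_waveform` in `sfm/live_cache.py` can be exercised in isolation. The following is a minimal sketch, assuming events are plain dicts carrying `index`, `waveform_key` and `timestamp` fields; the helper names `signature` and `needs_flush` are hypothetical and not part of `sfm.live_cache`, and the timestamps are made-up illustration data.

```python
from typing import Optional


def signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
    """Mirror of LiveCache._event_signature: (key_hex, timestamp_iso)."""
    key = ev.get("waveform_key") or ev.get("_waveform_key")
    if isinstance(key, (bytes, bytearray)):
        key = bytes(key).hex()
    ts = ev.get("timestamp")
    if isinstance(ts, dict):
        ts = ts.get("iso") or ts.get("string") or None
    return (key if isinstance(key, str) else None,
            ts if isinstance(ts, str) else None)


def needs_flush(cached: list[dict], incoming: list[dict]) -> bool:
    """True when an incoming event disagrees with the cached event at the
    same index on key or timestamp. A field is only compared when both
    sides actually have a value for it."""
    cached_by_index = {e.get("index"): e for e in cached}
    for ev in incoming:
        idx = ev.get("index")
        old = cached_by_index.get(idx)
        if idx is None or old is None:
            continue
        new_key, new_ts = signature(ev)
        old_key, old_ts = signature(old)
        if (new_key and old_key and new_key != old_key) or \
           (new_ts and old_ts and new_ts != old_ts):
            return True
    return False


# Hypothetical data: same slot and same key, but a different timestamp,
# as would happen when the key counter restarts after an erase.
before_erase = [{"index": 0, "waveform_key": bytes.fromhex("01110000"),
                 "timestamp": {"iso": "2024-01-01T10:00:00"}}]
after_erase = [{"index": 0, "waveform_key": bytes.fromhex("01110000"),
                "timestamp": {"iso": "2024-03-05T08:30:00"}}]

same_event_again = needs_flush(before_erase, before_erase)
reused_key = needs_flush(before_erase, after_erase)
print(same_event_again, reused_key)
```

Comparing the timestamp as well as the key is what lets a reused key be recognised: the key alone would look unchanged after an erase.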
+166
-1392
File diff suppressed because it is too large
+83
-1013
File diff suppressed because it is too large
@@ -1,613 +0,0 @@
|
||||
"""
|
||||
sfm/waveform_store.py — On-disk store for Blastware-format event files.
|
||||
|
||||
Layout (flat per-serial, four files per event):
|
||||
|
||||
<root>/<serial>/<filename> ← event file (BW-readable binary)
|
||||
<root>/<serial>/<filename>.a5.pkl ← pickled list of A5 S3Frame dicts
|
||||
<root>/<serial>/<filename>.h5 ← clean waveform arrays (HDF5)
|
||||
<root>/<serial>/<filename>.sfm.json ← modern sidecar (peaks, project,
|
||||
review state, extensions)
|
||||
|
||||
`<filename>` is whatever `minimateplus.blastware_file.blastware_filename`
|
||||
produces for the event. The extension is NOT a fixed type tag — it
|
||||
encodes the event timestamp (`AB0T` format).
|
||||
|
||||
Roles:
|
||||
- BW binary: what Blastware reads. Untouched. The user-facing review
|
||||
waveform viewer.
|
||||
- .a5.pkl: regenerative source. Lets the BW binary be rebuilt
|
||||
byte-for-byte if the encoder changes. Never delete.
|
||||
- .h5: clean per-channel waveform arrays in physical units (in/s for
|
||||
geo, psi for mic) plus event metadata. Canonical format for
|
||||
downstream analysis tools and the `/device/event/{idx}/waveform`
|
||||
endpoint's plot-JSON output.
|
||||
- .sfm.json: small, queryable metadata + review state. SQL
|
||||
`events.false_trigger` is a derived index kept in sync via
|
||||
`patch_sidecar()`.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import logging
|
||||
import pickle
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
from typing import Optional, Union
|
||||
|
||||
from minimateplus import event_file_io
|
||||
from minimateplus.blastware_file import blastware_filename, write_blastware_file
|
||||
from minimateplus.framing import S3Frame
|
||||
from minimateplus.models import Event
|
||||
from sfm import event_hdf5
|
||||
|
||||
log = logging.getLogger("sfm.waveform_store")
|
||||
|
||||
A5_PICKLE_VERSION = 1
|
||||
|
||||
|
||||
def _frame_to_dict(f: S3Frame) -> dict:
    """Convert an S3Frame to a plain dict for pickling."""
    return {
        "sub": f.sub,
        "page_hi": f.page_hi,
        "page_lo": f.page_lo,
        "data": bytes(f.data),
        "chk_byte": f.chk_byte,
        "checksum_valid": f.checksum_valid,
    }
|
||||
|
||||
|
||||
def _dict_to_frame(d: dict) -> S3Frame:
    """Rebuild an S3Frame from a dict produced by `_frame_to_dict`."""
    return S3Frame(
        sub=d["sub"],
        page_hi=d["page_hi"],
        page_lo=d["page_lo"],
        data=bytes(d["data"]),
        checksum_valid=d.get("checksum_valid", True),
        chk_byte=d.get("chk_byte", 0),
    )
|
||||
|
||||
|
||||
class WaveformStore:
|
||||
"""
|
||||
Persistent store for Blastware-format waveform files + their A5 source frames.
|
||||
|
||||
Thread safety: write_blastware_file is single-shot; concurrent saves of the
|
||||
*same* filename would race, but the filename encodes second-resolution
|
||||
timestamps + serial, so collisions across threads/processes are vanishingly
|
||||
unlikely in practice.
|
||||
"""
|
||||
|
||||
def __init__(self, root: str | Path) -> None:
|
||||
self.root = Path(root)
|
||||
self.root.mkdir(parents=True, exist_ok=True)
|
||||
log.info("WaveformStore root=%s", self.root)
|
||||
|
||||
# ── path helpers ────────────────────────────────────────────────────────────
|
||||
|
||||
def _serial_dir(self, serial: str) -> Path:
|
||||
d = self.root / serial
|
||||
d.mkdir(parents=True, exist_ok=True)
|
||||
return d
|
||||
|
||||
def paths_for(self, serial: str, filename: str) -> tuple[Path, Path]:
|
||||
"""Return (blastware_path, a5_pickle_path) for a given serial+filename.
|
||||
|
||||
For the sidecar path use `sidecar_path_for()` — kept separate so
|
||||
existing callers don't need to unpack a 3-tuple.
|
||||
"""
|
||||
d = self._serial_dir(serial)
|
||||
return d / filename, d / f"{filename}.a5.pkl"
|
||||
|
||||
def sidecar_path_for(self, serial: str, filename: str) -> Path:
|
||||
"""Return absolute path to the .sfm.json sidecar for a given event."""
|
||||
return self._serial_dir(serial) / f"{filename}.sfm.json"
|
||||
|
||||
def hdf5_path_for(self, serial: str, filename: str) -> Path:
|
||||
"""Return absolute path to the .h5 clean-waveform file for a given event."""
|
||||
return self._serial_dir(serial) / f"{filename}.h5"
|
||||
|
||||
def open_blastware(self, serial: str, filename: str) -> Optional[Path]:
|
||||
"""Return absolute path to an existing event file or None."""
|
||||
bw_path, _ = self.paths_for(serial, filename)
|
||||
return bw_path if bw_path.exists() else None
|
||||
|
||||
# ── save / load ─────────────────────────────────────────────────────────────
|
||||
|
||||
def save(
|
||||
self,
|
||||
ev: Event,
|
||||
serial: str,
|
||||
a5_frames: list[S3Frame],
|
||||
*,
|
||||
source_kind: str = "sfm-live",
|
||||
geo_range: str = "normal",
|
||||
) -> dict:
|
||||
"""
|
||||
Write all four event-file artifacts for one event:
|
||||
- <filename> BW binary
|
||||
- <filename>.a5.pkl raw A5 frame pickle
|
||||
- <filename>.h5 clean waveform (HDF5)
|
||||
- <filename>.sfm.json modern sidecar (metadata + review)
|
||||
|
||||
Returns a record dict suitable for persisting alongside the DB row:
|
||||
|
||||
{
|
||||
"filename": "M529LKIQ.7M0W",
|
||||
"filesize": 8708,
|
||||
"sha256": "a1b2c3...",
|
||||
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
|
||||
"hdf5_filename": "M529LKIQ.7M0W.h5",
|
||||
"sidecar_filename": "M529LKIQ.7M0W.sfm.json",
|
||||
}
|
||||
|
||||
`source_kind` flows into `sidecar.source.kind` — callers should
|
||||
pass "sfm-live" (default) for the live endpoint and "sfm-ach" for
|
||||
the ACH ingestion path. BW-imported events use save_imported_bw()
|
||||
instead.
|
||||
|
||||
`geo_range` controls the ADC-counts → in/s scaling in the HDF5
|
||||
file ("normal" = 10 in/s FS, "sensitive" = 1.25 in/s FS).
|
||||
Defaults to "normal" — callers with compliance-config access
|
||||
should pass the actual unit setting so the saved samples are in
|
||||
the right units.
|
||||
|
||||
Idempotent: if the event file already exists, it is overwritten
|
||||
with the freshly-encoded version (same bytes for the same
|
||||
a5_frames) and the sidecar's review block is preserved across
|
||||
re-saves.
|
||||
"""
|
||||
if not a5_frames:
|
||||
raise ValueError("WaveformStore.save: a5_frames is empty")
|
||||
if not serial:
|
||||
raise ValueError("WaveformStore.save: serial is required")
|
||||
|
||||
filename = blastware_filename(ev, serial)
|
||||
bw_path, a5_path = self.paths_for(serial, filename)
|
||||
sidecar_path = self.sidecar_path_for(serial, filename)
|
||||
hdf5_path = self.hdf5_path_for(serial, filename)
|
||||
|
||||
# 1. encode the event file (defensive unlink prevents trailing-byte
|
||||
# leaks from a previous larger file on synced/odd filesystems).
|
||||
try:
|
||||
bw_path.unlink()
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
write_blastware_file(ev, a5_frames, bw_path)
|
||||
filesize = bw_path.stat().st_size
|
||||
sha256 = event_file_io.file_sha256(bw_path)
|
||||
|
||||
# 2. write the .a5.pkl sidecar
|
||||
try:
|
||||
a5_path.unlink()
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
payload = {
|
||||
"version": A5_PICKLE_VERSION,
|
||||
"frames": [_frame_to_dict(f) for f in a5_frames],
|
||||
}
|
||||
with a5_path.open("wb") as fp:
|
||||
pickle.dump(payload, fp, protocol=pickle.HIGHEST_PROTOCOL)
|
||||
|
||||
# 3. write the .h5 clean-waveform file (samples in physical units).
|
||||
# Best-effort: a write failure shouldn't sink the rest of the save
|
||||
# (the HDF5 can be regenerated later from the .a5.pkl).
|
||||
hdf5_filename: Optional[str] = None
|
||||
try:
|
||||
event_hdf5.write_event_hdf5(
|
||||
hdf5_path, ev,
|
||||
serial=serial,
|
||||
geo_range=geo_range,
|
||||
source_kind=source_kind,
|
||||
)
|
||||
hdf5_filename = hdf5_path.name
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save: HDF5 write failed for %s: %s — continuing without .h5",
|
||||
hdf5_path, exc,
|
||||
)
|
||||
|
||||
# 4. write the .sfm.json sidecar. Preserve any existing review
|
||||
# block + extensions across re-saves so user edits aren't lost
|
||||
# when the same event is re-downloaded (e.g. via Force refresh).
|
||||
existing_review = None
|
||||
existing_extensions = None
|
||||
if sidecar_path.exists():
|
||||
try:
|
||||
old = event_file_io.read_sidecar(sidecar_path)
|
||||
existing_review = old.get("review")
|
||||
existing_extensions = old.get("extensions")
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save: existing sidecar at %s unreadable (%s); overwriting",
|
||||
sidecar_path, exc,
|
||||
)
|
||||
|
||||
sidecar = event_file_io.event_to_sidecar_dict(
|
||||
ev,
|
||||
serial=serial,
|
||||
blastware_filename=filename,
|
||||
blastware_filesize=filesize,
|
||||
blastware_sha256=sha256,
|
||||
source_kind=source_kind,
|
||||
a5_pickle_filename=a5_path.name,
|
||||
review=existing_review,
|
||||
extensions=existing_extensions,
|
||||
)
|
||||
event_file_io.write_sidecar(sidecar_path, sidecar)
|
||||
|
||||
log.info(
|
||||
"WaveformStore.save serial=%s filename=%s filesize=%d frames=%d "
|
||||
"h5=%s sidecar=%s",
|
||||
serial, filename, filesize, len(a5_frames),
|
||||
hdf5_filename or "(skipped)", sidecar_path.name,
|
||||
)
|
||||
return {
|
||||
"filename": filename,
|
||||
"filesize": filesize,
|
||||
"sha256": sha256,
|
||||
"a5_pickle_filename": a5_path.name,
|
||||
"hdf5_filename": hdf5_filename,
|
||||
"sidecar_filename": sidecar_path.name,
|
||||
}
|
||||
|
||||
def save_imported_bw(
|
||||
self,
|
||||
bw_bytes: bytes,
|
||||
source_path: Path,
|
||||
*,
|
||||
serial_hint: Optional[str] = None,
|
||||
bw_report_text: Optional[Union[str, bytes]] = None,
|
||||
) -> tuple[Event, dict]:
|
||||
"""
|
||||
Ingest a Blastware event file produced by an external tool
|
||||
(Blastware's own ACH, manual download, etc.) where the source A5
|
||||
frames aren't available.
|
||||
|
||||
Workflow:
|
||||
1. Parse the bytes via event_file_io.read_blastware_file (writes
|
||||
a temp file to do that, since the parser takes a path).
|
||||
2. Optionally parse a paired BW ASCII event report (the .TXT
|
||||
file BW writes alongside the binary). When supplied, its
|
||||
decoded fields land in the sidecar's `bw_report` block AND
|
||||
overlay the device-authoritative peak values into the
|
||||
top-level `peak_values` block. This is the right path for
|
||||
the ACH-forwarder daemon use case where Blastware's own
|
||||
ACH writes both files into the watch folder.
|
||||
3. Resolve serial from BW filename (`<P><serial3>...`) or use
|
||||
serial_hint. Falls back to "UNKNOWN".
|
||||
4. Copy the BW bytes verbatim into <root>/<serial>/<filename>.
|
||||
5. Write the .sfm.json sidecar with source.kind = "bw-import"
|
||||
and a5_pickle_filename = None. Does NOT write a .a5.pkl
|
||||
(no A5 source available; byte-for-byte regeneration not
|
||||
possible — the on-disk BW file IS the byte-for-byte source).
|
||||
|
||||
Returns (event, record_dict) so callers can both insert into
|
||||
SeismoDb and surface the parsed Event.
|
||||
"""
|
||||
# Stash the bytes to a temp path so read_blastware_file (path-based)
|
||||
# can parse without us duplicating its logic.
|
||||
import tempfile
|
||||
with tempfile.NamedTemporaryFile(suffix=".bw", delete=False) as tmp:
|
||||
tmp.write(bw_bytes)
|
||||
tmp_path = Path(tmp.name)
|
||||
try:
|
||||
ev = event_file_io.read_blastware_file(tmp_path)
|
||||
finally:
|
||||
try:
|
||||
tmp_path.unlink()
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
|
||||
# read_blastware_file derives record_type from its path arg, but
|
||||
# that arg is the tmp file (suffix ".bw") — so override with the
|
||||
# original filename's encoded type (H/W/M/E/C in the BW AB0T
|
||||
# scheme). Without this override every BW-imported event lands
|
||||
# in the DB with record_type="Waveform" regardless of the actual
|
||||
# type (Histogram, Manual, etc.).
|
||||
ev.record_type = event_file_io.derive_record_type_from_filename(
|
||||
source_path.name
|
||||
)
|
||||
|
||||
# Parse the BW ASCII report if one was supplied. Failures here
|
||||
# are non-fatal: we still write the binary + sidecar without the
|
||||
# rich derived fields.
|
||||
bw_report = None
|
||||
if bw_report_text is not None:
|
||||
try:
|
||||
from minimateplus.bw_ascii_report import parse_report
|
||||
bw_report = parse_report(bw_report_text)
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_bw: BW report parse failed: %s — continuing without it",
|
||||
exc,
|
||||
)
|
||||
|
||||
# If we have a report, overlay its device-authoritative fields
|
||||
# (peaks, project, sample_rate, record_time) onto the Event
|
||||
# BEFORE handing it to db.insert_events(). Without this overlay
|
||||
# the DB row gets `peak_values` from _peaks_from_samples(), which
|
||||
# runs the still-undecoded waveform codec on the BW body and
|
||||
# produces ±10 in/s saturation values on every channel for every
|
||||
# event. The sidecar JSON had the correct values via
|
||||
# event_to_sidecar_dict(bw_report=...) but the DB columns didn't.
|
||||
if bw_report is not None:
|
||||
try:
|
||||
event_file_io.apply_report_to_event(ev, bw_report)
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_bw: failed to overlay report onto event: %s",
|
||||
exc,
|
||||
)
|
||||
|
||||
# Resolve serial. blastware_filename derives a 4-char prefix from
|
||||
# the numeric serial (e.g. BE11529 → M529); we go the other way
|
||||
# via the source filename if a hint wasn't given.
|
||||
serial = serial_hint or _serial_from_bw_filename(source_path.name) or "UNKNOWN"
|
||||
|
||||
# Use the source filename verbatim — it already encodes timestamp
|
||||
# + record type per BW's AB0T scheme, and we want to preserve it
|
||||
# so the file BW knows about can be opened back in BW.
|
||||
filename = source_path.name
|
||||
bw_path = self._serial_dir(serial) / filename
|
||||
|
||||
# 1. copy bytes
|
||||
bw_path.write_bytes(bw_bytes)
|
||||
filesize = bw_path.stat().st_size
|
||||
sha256 = event_file_io.file_sha256(bw_path)
|
||||
|
||||
# 2. write the .h5 clean-waveform file from the parsed Event.
|
||||
# Note: peaks here are computed from raw samples (the BW file
|
||||
# doesn't carry the device-authoritative 0C peaks). Best-effort.
|
||||
hdf5_path = self.hdf5_path_for(serial, filename)
|
||||
hdf5_filename: Optional[str] = None
|
||||
try:
|
||||
event_hdf5.write_event_hdf5(
|
||||
hdf5_path, ev,
|
||||
serial=serial,
|
||||
geo_range="normal", # BW file doesn't carry the range; assume Normal
|
||||
source_kind="bw-import",
|
||||
)
|
||||
hdf5_filename = hdf5_path.name
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_bw: HDF5 write failed for %s: %s — continuing",
|
||||
hdf5_path, exc,
|
||||
)
|
||||
|
||||
# 3. write sidecar with source.kind = bw-import
|
||||
sidecar_path = self.sidecar_path_for(serial, filename)
|
||||
existing_review = None
|
||||
if sidecar_path.exists():
|
||||
try:
|
||||
existing_review = event_file_io.read_sidecar(sidecar_path).get("review")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
sidecar = event_file_io.event_to_sidecar_dict(
|
||||
ev,
|
||||
serial=serial,
|
||||
blastware_filename=filename,
|
||||
blastware_filesize=filesize,
|
||||
blastware_sha256=sha256,
|
||||
source_kind="bw-import",
|
||||
a5_pickle_filename=None,
|
||||
review=existing_review,
|
||||
bw_report=bw_report,
|
||||
)
|
||||
event_file_io.write_sidecar(sidecar_path, sidecar)
|
||||
|
||||
log.info(
|
||||
"WaveformStore.save_imported_bw serial=%s filename=%s filesize=%d "
|
||||
"h5=%s (no .a5.pkl — A5 source unavailable for BW-imported files)",
|
||||
serial, filename, filesize, hdf5_filename or "(skipped)",
|
||||
)
|
||||
return ev, {
|
||||
"filename": filename,
|
||||
"filesize": filesize,
|
||||
"sha256": sha256,
|
||||
"a5_pickle_filename": None,
|
||||
"hdf5_filename": hdf5_filename,
|
||||
"sidecar_filename": sidecar_path.name,
|
||||
"serial": serial,
|
||||
}
|
||||
|
||||
def save_imported_idf(
|
||||
self,
|
||||
idf_bytes: bytes,
|
||||
source_path: Path,
|
||||
*,
|
||||
serial_hint: Optional[str] = None,
|
||||
idf_report_text: Optional[Union[str, bytes]] = None,
|
||||
) -> tuple[Optional["Event"], dict]:
|
||||
"""
|
||||
Ingest a Thor (Micromate Series IV) IDF event file (`.IDFW` or
|
||||
`.IDFH`) produced by Thor's TXT exporter.
|
||||
|
||||
Thor binaries are stored as opaque bytes — seismo-relay doesn't
|
||||
yet decode the proprietary IDF binary format (codec slot lives
|
||||
at ``micromate/idf_file.py``). Device-authoritative metadata
|
||||
comes from the paired ``.IDFW.txt`` / ``.IDFH.txt`` sidecar
|
||||
when supplied.
|
||||
|
||||
Workflow:
|
||||
1. Parse the paired TXT report (when supplied) via
|
||||
``micromate.parse_idf_report`` → dict.
|
||||
2. Wrap parsed dict + filename into a typed ``micromate.IdfEvent``.
|
||||
3. Copy bytes verbatim into ``<root>/<serial>/<filename>``.
|
||||
4. Bridge IdfEvent → ``minimateplus.Event`` (for the existing
|
||||
sidecar / DB insert machinery) via
|
||||
``IdfEvent.to_minimateplus_event(waveform_key)``.
|
||||
5. Write the ``.sfm.json`` sidecar with
|
||||
``source.kind = "idf-import"`` and the full raw IDF report
|
||||
under ``extensions.idf_report``.
|
||||
|
||||
Returns ``(event, record_dict)`` so the endpoint can both insert
|
||||
into SeismoDb and surface the parsed event.
|
||||
"""
|
||||
from micromate import IdfEvent, parse_idf_report
|
||||
|
||||
# Parse the .txt sidecar (best-effort; non-fatal on failure).
|
||||
report_dict: dict = {}
|
||||
if idf_report_text is not None:
|
||||
try:
|
||||
report_dict = parse_idf_report(idf_report_text)
|
||||
except Exception as exc:
|
||||
log.warning(
|
||||
"save_imported_idf: report parse failed: %s — continuing without it",
|
||||
exc,
|
||||
)
|
||||
|
||||
# Build the typed IdfEvent. Filename is authoritative for
|
||||
# (serial, timestamp, kind); the report's event_datetime takes
|
||||
# precedence over the filename timestamp inside from_report().
|
||||
idf_event = IdfEvent.from_report(report_dict, source_path.name)
|
||||
|
||||
# Operator-supplied serial_hint wins over the binary's filename
|
||||
# prefix when both are present (e.g. callers passing a known-good
|
||||
# serial that overrides a misnamed export).
|
||||
serial = serial_hint or idf_event.serial or "UNKNOWN"
|
||||
|
||||
# Filesystem write.
|
||||
filename = source_path.name
|
||||
bw_path = self._serial_dir(serial) / filename
|
||||
bw_path.write_bytes(idf_bytes)
|
||||
filesize = bw_path.stat().st_size
|
||||
sha256 = event_file_io.file_sha256(bw_path)
|
||||
|
||||
# _waveform_key dedups (serial, timestamp) rows in the events
|
||||
# table. Use the binary's sha256 (first 16 bytes) as a stable
|
||||
# surrogate — every distinct binary maps to a distinct row.
|
||||
waveform_key = bytes.fromhex(sha256)[:16]
|
||||
|
||||
# Bridge to minimateplus.Event for the existing sidecar / DB
|
||||
# insert paths. See IdfEvent.to_minimateplus_event() for the
|
||||
# caveats of this bridge (mic units, missing fields → sidecar).
|
||||
ev = idf_event.to_minimateplus_event(waveform_key)
|
||||
|
||||
# Write the sidecar. Source kind "idf-import" was added to the
|
||||
# allow-list in event_file_io.event_to_sidecar_dict for this.
|
||||
sidecar_path = self.sidecar_path_for(serial, filename)
|
||||
existing_review = None
|
||||
if sidecar_path.exists():
|
||||
try:
|
||||
existing_review = event_file_io.read_sidecar(sidecar_path).get("review")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
sidecar = event_file_io.event_to_sidecar_dict(
|
||||
ev,
|
||||
serial=serial,
|
||||
blastware_filename=filename,
|
||||
blastware_filesize=filesize,
|
||||
blastware_sha256=sha256,
|
||||
source_kind="idf-import",
|
||||
a5_pickle_filename=None,
|
||||
review=existing_review,
|
||||
)
|
||||
# Stash the full parsed IDF report under extensions so downstream
|
||||
# consumers can recover the rich derived fields that don't fit
|
||||
# the BW-shaped event model (Peak Acceleration / Displacement,
|
||||
# Time of Peak, sensor self-check, calibration, firmware).
|
||||
if report_dict:
|
||||
sidecar["extensions"]["idf_report"] = report_dict
|
||||
event_file_io.write_sidecar(sidecar_path, sidecar)
|
||||
|
||||
log.info(
|
||||
"WaveformStore.save_imported_idf serial=%s filename=%s filesize=%d "
|
||||
"report_attached=%s",
|
||||
serial, filename, filesize, bool(report_dict),
|
||||
)
|
||||
return ev, {
|
||||
"filename": filename,
|
||||
"filesize": filesize,
|
||||
"sha256": sha256,
|
||||
"a5_pickle_filename": None,
|
||||
"hdf5_filename": None,
|
||||
"sidecar_filename": sidecar_path.name,
|
||||
"serial": serial,
|
||||
}
|
||||
|
||||
def load_a5(self, serial: str, filename: str) -> Optional[list[S3Frame]]:
|
||||
"""
|
||||
Re-hydrate the pickled A5 frame stream for a stored event.
|
||||
Returns None if the .a5.pkl file is missing or its payload is malformed.
|
||||
"""
|
||||
_, a5_path = self.paths_for(serial, filename)
|
||||
if not a5_path.exists():
|
||||
return None
|
||||
with a5_path.open("rb") as fp:
|
||||
payload = pickle.load(fp)
|
||||
if not isinstance(payload, dict) or "frames" not in payload:
|
||||
log.warning("WaveformStore.load_a5: malformed sidecar at %s", a5_path)
|
||||
return None
|
||||
return [_dict_to_frame(d) for d in payload["frames"]]
|
||||
|
||||
# ── modern .sfm.json sidecar accessors ──────────────────────────────────────
|
||||
|
||||
def load_sidecar(self, serial: str, filename: str) -> Optional[dict]:
|
||||
"""Return the parsed .sfm.json sidecar dict, or None if missing."""
|
||||
path = self.sidecar_path_for(serial, filename)
|
||||
if not path.exists():
|
||||
return None
|
||||
try:
|
||||
return event_file_io.read_sidecar(path)
|
||||
except Exception as exc:
|
||||
log.warning("load_sidecar: failed to read %s: %s", path, exc)
|
||||
return None
|
||||
|
||||
def patch_sidecar(
|
||||
self,
|
||||
serial: str,
|
||||
filename: str,
|
||||
*,
|
||||
review: Optional[dict] = None,
|
||||
extensions: Optional[dict] = None,
|
||||
reviewer_now: bool = True,
|
||||
) -> Optional[dict]:
|
||||
"""
|
||||
JSON-merge-patch the .sfm.json sidecar's review/extensions blocks.
|
||||
Returns the new full dict, or None if the sidecar doesn't exist.
|
||||
"""
|
||||
path = self.sidecar_path_for(serial, filename)
|
||||
if not path.exists():
|
||||
return None
|
||||
return event_file_io.patch_sidecar(
|
||||
path,
|
||||
review=review,
|
||||
extensions=extensions,
|
||||
reviewer_now=reviewer_now,
|
||||
)
|
||||
|
||||
|
||||
# ── helpers ─────────────────────────────────────────────────────────────────────
|
||||
|
||||
def _serial_from_bw_filename(name: str) -> Optional[str]:
    """
    Reverse of `blastware_filename`'s serial-prefix encoding.

    BW filename format (V10.72): `<P><serial3><stem4>.<ext>`
      where P       = chr(ord('B') + serial // 1000)
      and   serial3 = f"{serial % 1000:03d}".

    Examples (from CLAUDE.md verification archive):
      P036... → BE14036      H907... → BE6907
      M529... → BE11529      T003... → BE18003

    Returns the inferred BE-prefix serial (e.g. "BE11529") or None when
    the filename doesn't match the expected pattern.
    """
    if not name:
        return None
    # First letter encodes the thousands group; next 3 chars encode the
    # last 3 digits of the serial.
    base = name.split(".", 1)[0]
    if len(base) < 4 or not base[0].isalpha() or not base[1:4].isdigit():
        return None
    prefix_letter = base[0].upper()
    if prefix_letter < "B":
        return None
    thousands = ord(prefix_letter) - ord("B")
    serial_num = thousands * 1000 + int(base[1:4])
    return f"BE{serial_num}"
|
||||
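The prefix scheme described in the `_serial_from_bw_filename` docstring can be checked with a round trip. This is a minimal sketch, assuming the encoder follows exactly the formula quoted in that docstring; `encode_prefix` and `decode_prefix` are hypothetical names used for illustration, not the `minimateplus.blastware_file` API.

```python
from typing import Optional


def encode_prefix(serial_num: int) -> str:
    """Build the 4-char `<P><serial3>` prefix from a numeric serial."""
    letter = chr(ord("B") + serial_num // 1000)
    return f"{letter}{serial_num % 1000:03d}"


def decode_prefix(name: str) -> Optional[str]:
    """Recover the BE-prefixed serial from a filename, or None on mismatch."""
    base = name.split(".", 1)[0]
    if len(base) < 4 or not base[0].isalpha() or not base[1:4].isdigit():
        return None
    letter = base[0].upper()
    if letter < "B":
        return None
    return f"BE{(ord(letter) - ord('B')) * 1000 + int(base[1:4])}"


# The four prefix/serial pairs listed in the docstring.
examples = {"P036": "BE14036", "H907": "BE6907",
            "M529": "BE11529", "T003": "BE18003"}

for prefix, serial in examples.items():
    decoded = decode_prefix(prefix + "LKIQ.7M0W")
    encoded = encode_prefix(int(serial[2:]))
    print(prefix, decoded, encoded, decoded == serial and encoded == prefix)
```

Under this formula a serial of 25000 or above would map past `Z`; how the real encoder handles that range is not shown in this diff.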
Vendored
BIN
Binary file not shown.
Vendored
BIN
Binary file not shown.
-3386
File diff suppressed because it is too large
Vendored
BIN
Binary file not shown.
-3137
File diff suppressed because it is too large
Vendored
BIN
Binary file not shown.
-3137
File diff suppressed because it is too large
Vendored
BIN
Binary file not shown.
-3387
File diff suppressed because it is too large
Vendored
BIN
Binary file not shown.
-3387
File diff suppressed because it is too large
Binary file not shown.
-3386
File diff suppressed because it is too large
Binary file not shown.
File diff suppressed because it is too large
Binary file not shown.
File diff suppressed because it is too large
Some files were not shown because too many files have changed in this diff.