66 Commits

Author SHA1 Message Date
serversdown a18712442f feat: preserve and encode raw 0C record in sidecar extensions for offline analysis 2026-05-08 21:50:01 +00:00
serversdown 8aea46b8a0 doc(fix): retracts raw int16 LE sample set assumptions. 2026-05-08 19:26:25 +00:00
serversdown 9123269b1f feat(protocol): implement v0.14.0 SUB 5A protocol rewrite with enhanced chunk handling and new helpers
test: add regression tests for v0.14.x SUB 5A protocol fixes
refactor(logging): change warning logs to debug for less verbosity in write_blastware_file
2026-05-08 19:11:55 +00:00
serversdown 9400f59167 doc: update readme to 0.15.0 2026-05-08 19:06:26 +00:00
serversdown bbed85f7e2 fix: update channel keys to include 'MicL' in device_event_waveform documentation 2026-05-08 18:48:06 +00:00
serversdown c641d5fc10 feat: v0.15.0
### Added

- **Layered event storage architecture.**  Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file).  Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression).  This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).**  The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side.  The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.**  The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor.  This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive).  Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.

### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).
2026-05-08 04:39:51 +00:00
serversdown 9afa3484f4 feat(cache): implement integrity checks for cached events and waveforms
- Added `waveform_key` and `event_timestamp` columns to `CachedEvent` and `CachedWaveform` for integrity verification.
- Implemented logic to flush the cache when a mismatch in (waveform_key, event_timestamp) is detected during event and waveform updates.
- Enhanced `set_events` and `set_waveform` methods to check for mismatches and trigger cache eviction as necessary.
- Introduced a new `LiveCache` class to manage in-memory caching of live device data, separating it from the server logic for better testability.
- Added tests to verify the correctness of cache invalidation logic, particularly for post-erase key reuse scenarios.
- Updated web application to include a "Force refresh" toggle, allowing users to bypass the cache and re-fetch data from the device.
2026-05-07 04:42:00 +00:00
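A minimal sketch of the invalidation rule this commit describes, assuming the cache is keyed by `waveform_key` and that a reused key with a different `event_timestamp` flushes everything. The class name `LiveCacheSketch` and its method signatures are hypothetical and are not the project's actual `LiveCache` API.

```python
from typing import Any, Dict, Optional


class LiveCacheSketch:
    """Hypothetical in-memory cache guarded by (waveform_key, event_timestamp)."""

    def __init__(self) -> None:
        self._stamps: Dict[str, int] = {}      # waveform_key -> event_timestamp
        self._waveforms: Dict[str, Any] = {}   # waveform_key -> cached waveform

    def set_waveform(self, waveform_key: str, event_timestamp: int, waveform: Any) -> bool:
        """Store a waveform. Returns True if a mismatch forced a full flush."""
        known = self._stamps.get(waveform_key)
        flushed = known is not None and known != event_timestamp
        if flushed:
            # Same key, different timestamp: the device was probably erased and
            # the key reused, so nothing cached earlier can be trusted.
            self._stamps.clear()
            self._waveforms.clear()
        self._stamps[waveform_key] = event_timestamp
        self._waveforms[waveform_key] = waveform
        return flushed

    def get_waveform(self, waveform_key: str, event_timestamp: int) -> Optional[Any]:
        """Return the cached waveform only if both identity fields match."""
        if self._stamps.get(waveform_key) != event_timestamp:
            return None
        return self._waveforms.get(waveform_key)
```

A "Force refresh" path would simply skip `get_waveform` and call `set_waveform` with the freshly fetched data.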
serversdown 0484680c89 fix(docs/comments): rename refs to 'event files' to reflect their timestamp extension names. 2026-05-06 19:08:38 +00:00
serversdown 3711b11bda feat: add waveform store handling 2026-05-06 19:03:38 +00:00
serversdown 52c6e7b618 Merge pull request 'v0.14.3 - Full waveform DL pipeline tested and working.' (#15) from protocol-fix into main
Reviewed-on: #15
2026-05-05 20:49:47 -04:00
serversdown 29ebc75656 doc: update readme v0.14.3 2026-05-05 20:48:58 -04:00
claude ebfe9877fa doc: update changelog to 0.14.3 2026-05-05 20:39:47 -04:00
claude c914a15e12 docs: update for v0.14.3 - Full continuous waveform download successful! 2026-05-05 20:37:52 -04:00
claude a27693242d fix(protocol): implement partial DLE stuffing for 0x10 bytes in params to prevent request corruption 2026-05-05 18:28:28 -04:00
claude eefec0bd64 fix(blastware_file): remove harmful "duplicate header+STRT" strip logic to preserve valid waveform data 2026-05-05 17:48:40 -04:00
claude 7444738883 debug(protocol): event-N probe is now at counter = start_offset instead of start_offset + 0x46 2026-05-05 16:46:35 -04:00
claude 6b76934a04 Merge branch 'main' into protocol-fix 2026-05-04 14:43:05 -04:00
claude 7b62c790a9 fix(seismo-lab): remove duplicate capture history list 2026-05-04 14:30:46 -04:00
claude b66cc9d075 fix(blastware_file): update TERM detection logic and strip duplicate header blocks for accurate file writing 2026-05-04 14:28:11 -04:00
serversdown 4ab604eff1 Merge pull request 'v0.12.6' (#10) from seismo-lab-new into main
Reviewed-on: #10
2026-05-04 13:22:54 -04:00
serversdown e15f1567ef Doc: Update docs for 0.12.6 2026-05-04 17:18:28 +00:00
serversdown bb33ad3837 doc: update to v0.12.5 2026-05-04 17:13:37 +00:00
claude 45e61fbcaf big refactor of waveform protocol. 2026-05-03 01:20:21 -04:00
claude d758825c67 fix(protocol): correct continuous-mode record header classification for accurate timestamp extraction 2026-05-01 20:28:55 -04:00
claude 0fbb39c21a Big event bugfix. see details:
## v0.13.0 — 2026-05-01

### Fixed

- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
  `read_bulk_waveform_stream` was walking the chunk counter past the actual
  end of the event, picking up post-event circular-buffer garbage that
  corrupted reconstructed Blastware files for any waveform > ~1 sec.  The
  loop now extracts the event's `end_offset` from the STRT record at
  `data[23:27]` of the probe response and stops the chunk walk when the next
  counter would step past it.  Verified against three BW MITM captures
  (4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
  bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.

### Added

- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)` —
  computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
  formula confirmed across all 3 BW captures.  Not yet wired into
  `read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
  existing `blastware_file.write_blastware_file` frame-structure expectations);
  available for the next iteration that switches to BW's 0x0200 chunk step.
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
  from the STRT record in an A5 response payload.
2026-05-01 18:37:34 -04:00
claude 1ef55521b1 Fix: Removed duplicates from merge botch. Stable version of seismo_lab.py 2026-05-01 17:34:41 -04:00
claude 738b39f3cb Manually Merged seismo lab persistent connection branch into the new direct download branch, creating a new branch called seismo-lab-new 2026-05-01 15:13:50 -04:00
Claude 625b0a4dfc feat(seismo_lab): add Download tab that captures wire bytes during event download
Adds a new CapturingTransport wrapper in minimateplus.transport that mirrors
every TX/RX byte to two raw .bin files using the same on-wire format as
bridges/ach_mitm.py, so the resulting captures are byte-for-byte compatible
with the existing Blastware MITM captures and load directly in the Analyzer.

A new "Download" tab in seismo_lab.py lets the user connect to a device over
TCP or serial and run connect / list-keys / download-events while the wrapper
saves raw_bw_<ts>.bin (our TX) and raw_s3_<ts>.bin (device TX) into a
seismo_dl_<ts>[_<label>]/ session directory. On completion, the panel hands
both files to the Analyzer and switches tabs, mirroring the UX of the
existing Bridge capture flow.
2026-05-01 00:12:02 +00:00
Claude b14f31f3b0 Include capture label in TCP raw filename
Matches serial bridge naming: raw_bw_{ts}_{label}.bin / raw_s3_{ts}_{label}.bin

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:48:10 +00:00
Claude b9ab368934 Fix TCP capture: write files only when capture is active
Previously every Blastware connection auto-created files.
Now TCP mode works the same as serial mode:
- Start Bridge: proxy listens and forwards silently, no files written
- New Capture: opens raw_bw/raw_s3 files; pipe threads write to them
- Stop Capture: flushes and closes files, fires Analyzer callback
- No connection = no file; multiple captures per bridge session work correctly

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:26:31 +00:00
Claude 9004241846 Restore multi-capture Bridge design + TCP mode
Brings back the protocol-exp BridgePanel design:
- Single bridge session stays up; New Capture / Stop Capture create
  labelled raw-file segments on demand (no files created at bridge start)
- Capture history listbox shows all segments; double-click reloads in Analyzer
- On capture complete: Analyzer auto-populates and runs analysis

TCP mode integrated into same tab (Serial/TCP radio toggle):
- Each incoming Blastware connection is automatically a capture segment
- Session appears in history list; Analyzer wires up live on connect
- Stop Capture disconnects current TCP session

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:20:43 +00:00
Claude 6861d9ed97 Merge TCP mode into Bridge tab (Serial/TCP radio toggle)
Removes the separate 'TCP Capture' tab and folds TCP MITM capture directly
into the existing Bridge tab.  A Serial/TCP radio selector at the top swaps
the connection fields (COM ports vs. listen port + device host:port) while
keeping the same Start Bridge / Stop Bridge / Add Mark buttons, capture
checkboxes, log dir, and live log — identical UX for both modes.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 23:01:45 +00:00
claude 5cd5652560 Merge branch 'seismo-lab' of https://github.com/serversdwn/seismo-relay into seismo-lab 2026-04-26 18:16:52 -04:00
Claude 897ac8a3f3 Add TCP MITM capture tab (TcpBridgePanel)
New 'TCP Capture' tab in seismo_lab.py: listens on a configurable local
port for an incoming Blastware connection, transparently forwards all
traffic to the real seismograph device, and saves both directions to
raw_bw_<ts>.bin / raw_s3_<ts>.bin in the same format the Analyzer already
understands.  Session start wires up Analyzer live mode automatically via
the same on_bridge_started callback as the COM-port bridge.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 22:10:48 +00:00
serversdown 310fc5986c Merge pull request 'seismo-lab2' (#7) from seismo-lab2 into seismo-lab
Reviewed-on: #7
2026-04-26 16:49:28 -04:00
Claude e1150b30aa fix(analyzer): name A5/5A frames; revert S3 checksum validation
Add 0x5A (BULK_WAVEFORM_STREAM) and 0xA5 (BULK_WAVEFORM_RESPONSE) to
SUB_TABLE so they display with real names instead of UNKNOWN_5A/A5.

Revert S3 checksum validation to checksum_valid=None (the original
intentional behavior). Large S3 frames (A5 bulk waveform, E5 compliance
config) embed inner DLE+ETX sub-frame delimiters; the trailing 0x03 of
the last inner delimiter can land where the parser expects the SUM8
checksum byte, causing false BAD CHK on every valid A5 frame.
protocol.py _validate_frame documents and ignores exactly this issue.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 20:40:45 +00:00
claude a7585cb5e0 fix(blastware_file, server): implement logic to skip extra chunks after metadata for accurate file writing 2026-04-26 16:32:32 -04:00
Claude 9bbecea70f fix(parser): correct S3 frame terminator — bare ETX, not DLE+ETX
parse_s3 had the S3 terminator logic inverted vs the real S3FrameParser
in framing.py. It was terminating on DLE+ETX and treating bare ETX as
payload, which caused every bare 0x03 to be swallowed — bundling multiple
real S3 frames into one giant body until a DLE+ETX sequence happened to
appear. Result: 583-byte POLL_RESPONSE 'frames' containing many real
frames concatenated, all showing BAD CHK.

Fix: mirror S3FrameParser exactly —
  - Bare ETX (0x03) = real frame terminator
  - DLE+ETX (0x10 0x03) = inner-frame literal data (A4/E5 sub-frames),
    appended to body and parsing continues

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 20:23:18 +00:00
claude ae30a02898 fix(blastware_file, server): enhance logging and correct chunk handling for accurate data processing 2026-04-26 16:03:07 -04:00
claude 2f084ed105 fix(protocol): update chunk counter formula to use max(key4[2:4], 0x0400) for accurate data streaming 2026-04-26 01:28:47 -04:00
claude 7976b544ed fix(blastware_file): never skip A5 frames based on classification at fi>0
Frame 0 is always the probe; frames 1+ are always data (waveform ADC
chunks, compliance config, compliance continuation).  Gating on
classify_frame() at fi>0 produces false positives: ADC binary data
can coincidentally contain b"STRT\xff\xfe", causing frames 1 and 5
to be silently dropped from the body (confirmed from live capture on
event key=01110000).  Remove all type-based filtering; include every
frame unconditionally with the standard index-based skip amounts.
2026-04-26 00:59:36 -04:00
claude 0415af19b4 fix(blastware_file): remove seen_metadata flag and adjust frame processing logic 2026-04-24 20:21:03 -04:00
claude 35c3f4f945 fix(protocol): correct A5 frame classification and chunk counter formula 2026-04-24 17:25:29 -04:00
claude 43c8158493 feat(blastware_file): classify A5 frames, only write waveform frames to body
Add classify_frame() which categorises each A5 frame by content:
  terminator    — page_key == 0x0000
  probe_or_strt — contains b"STRT"
  metadata      — contains compliance-config ASCII markers
                  (Project:, Client:, Standard Recording Setup, …)
  waveform      — binary-heavy (< 20% printable ASCII), i.e. raw ADC data
  unknown       — fallback

Update write_blastware_file() body loop: frame 0 (probe) is still
always processed; frames 1+ are only included when classify_frame
returns "waveform".  Metadata frames (compliance config block with
Project:/Client:/etc.) and any stray STRT-bearing frames are skipped
with a warning/debug log.  Terminator frame handling is unchanged.

Adds temporary print() diagnostics so each frame's classification is
visible in the server log to aid debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 15:48:37 -04:00
claude 242666f358 fix(protocol): correct chunk counter formula for accurate data streaming 2026-04-24 12:52:02 -04:00
claude 03540fdc00 fix: raise max_chunks to 128 for metadata-only 5A download
For 2-second events at 1024 sps the "Project:" metadata frame appears
beyond chunk 32 (the old default cap), causing the safety limit to be
hit and ~34 KB of waveform data to be downloaded instead of stopping
at the metadata frame.  Raising max_chunks to 128 ensures
stop_after_metadata=True can locate the metadata frame for record
times up to ~4 seconds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 02:19:27 -04:00
claude f83fd880c0 fix(protocol): update device_event_blastware_file to include extra chunk for accurate data retrieval 2026-04-24 00:35:34 -04:00
claude ab2c11e9a9 fix(protocol): refine extra chunk fetching logic for accurate termination response 2026-04-23 20:30:07 -04:00
claude fa887b85d9 fix(protocol): update extra chunk fetching logic to stop at silence detection 2026-04-23 18:28:14 -04:00
claude ecd980d345 fix(protocol): enhance extra chunk fetching logic to ensure footer detection 2026-04-23 18:22:27 -04:00
claude bc9f16e503 fix(protocol): adjust extra_chunks calculation to use integer conversion of record_time 2026-04-23 17:39:28 -04:00
claude aa2b02535b fix(protocol): add record_time based chunk scaling for longer event record times 2026-04-23 17:33:16 -04:00
claude 2a2031c3a9 fix(protocol): fetch additional chunk after metadata to ensure valid termination response 2026-04-23 17:08:36 -04:00
claude 9e7e0bce2a fix(protocol): adjust full_waveform setting for event downloads to end when it should. 2026-04-23 16:43:59 -04:00
claude 5e2f3bf2a1 fix(protocol): enable full_waveform for continuous mode. 2026-04-23 16:24:39 -04:00
claude 39ebd4bdaa fix(protocol): revert endpoint back to stop_after_metadata=True 2026-04-23 15:11:56 -04:00
claude 84c87d0b57 fix(protocol): adjust waveform download to use full_waveform for accurate event streaming 2026-04-23 13:02:55 -04:00
claude ec6362cb8e fix(protocol): include terminator in waveform stream downloads 2026-04-23 12:45:59 -04:00
claude 3eeafd24aa fix(protocol): improve terminator frame detection in write_blastware_file.
fix: rename .n00 to just blastware file (.n00 was false positive)
2026-04-23 01:33:44 -04:00
claude 8cb8b86192 fix(server): add error logging for device event handling 2026-04-22 23:48:59 -04:00
claude 6dcca4da79 feat(protocol): fully decode Blastware filename encoding and update related documentation 2026-04-22 23:43:31 -04:00
claude c47e3a3af0 feat(protocol): update Blastware file format documentation and encoding details 2026-04-22 19:16:05 -04:00
claude dfbc9f29c5 feat: first try at building waveform binary files. 2026-04-21 22:57:53 -04:00
claude 4331215e23 feat(protocol): enhance raw capture functionality and documentation updates
- Update `s3_bridge.py` to default raw capture file paths to "auto" for timestamped naming.
- Modify `gui_bridge.py` to pre-check raw capture options and streamline path handling.
- Extend `ach_server.py` to save both incoming and outgoing raw bytes for analysis.
- Revise `CHANGELOG.md` and `instantel_protocol_reference.md` to reflect changes in recording mode handling and compliance data encoding.
2026-04-21 16:07:24 -04:00
claude b3dcfe7239 fix(client): correct recording_mode anchor position in compliance config encoding 2026-04-21 01:17:45 -04:00
claude 9b5cdfd857 feat(logging): add detailed logging for anchor position in compliance config encoding/decoding 2026-04-21 00:23:15 -04:00
35 changed files with 10045 additions and 976 deletions
+388
@@ -4,6 +4,394 @@ All notable changes to seismo-relay are documented here.
---
## v0.15.0 — 2026-05-07
### Added
- **Layered event storage architecture.** Each event now lands as four
files in the per-serial waveform store, each with a clear role:
- `<filename>` — the Blastware-readable binary (BW file). Untouched.
- `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
- `<filename>.h5` — clean per-channel waveform arrays in physical
units (in/s for geo, psi for mic) plus event metadata (HDF5 with
gzip compression). This is the canonical format for downstream
analysis tools.
- `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
project, source provenance, review state, extensions).
SQLite (`seismo_relay.db`) is the searchable index over all four.
- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
and `/db/events/{id}/waveform.json` endpoints now return samples in
physical units with explicit time-axis metadata, peak markers, and
per-channel unit hints — no more guessing the ADC-to-velocity scale
client-side. The webapp waveform viewer was rewritten to consume
this shape.
- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
(≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
*in/s per V* hardware constant — not the ADC-counts-to-velocity
factor. This silently scaled every plot ~38% too low for Normal-range
geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
Sensitive). Conversion is now done server-side using the geo_range
from compliance config; the client just plots.
- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
`read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
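The size of the viewer scaling error can be checked with a short calculation. This is a sketch under stated assumptions: the `/ 32767` divisor and the 10.0 / 1.25 in/s full-scale values are taken from the entry above, while the function names and the `geo_range` strings are hypothetical and are not the `sfm/event_hdf5.py` API.

```python
COUNTS_FULL_SCALE = 32767          # divisor quoted in the changelog entry
GEO_ADC_SCALE = 6.206053           # in/s per V hardware constant (not a counts factor)

# Full-scale velocity per geophone range, as stated in the changelog.
GEO_FULL_SCALE_IN_S = {"normal": 10.0, "sensitive": 1.25}


def counts_to_in_per_s(counts: int, geo_range: str) -> float:
    """Server-side conversion: scale counts by the range's full-scale velocity."""
    return counts * GEO_FULL_SCALE_IN_S[geo_range] / COUNTS_FULL_SCALE


def old_viewer_in_per_s(counts: int) -> float:
    """The old client-side scaling that misused geoAdcScale."""
    return counts * GEO_ADC_SCALE / COUNTS_FULL_SCALE


peak_counts = 16384                                  # arbitrary illustrative sample
correct = counts_to_in_per_s(peak_counts, "normal")
old = old_viewer_in_per_s(peak_counts)
shortfall = 1.0 - old / correct                      # fraction by which old plots read low
print(f"old plot read {shortfall:.1%} low for a Normal-range geophone")
```

The shortfall is independent of the sample value: it is `1 - 6.206053 / 10.0`, which is where the "~38% too low" figure comes from.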
### Dependencies
- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
`/db/import/blastware_file` endpoint introduced in this release).
---
## v0.14.3 — 2026-05-05
### Fixed
- **`build_5a_frame` — DLE-stuffing rule for 0x10 bytes in params (the
long-standing >1-sec event 0 "won't open in BW" bug).**
Previously `build_5a_frame` wrote params bytes RAW with no DLE stuffing,
based on the incorrect assumption that the device handled all `0x10`
bytes in params literally. It does not. The device's actual de-stuffing
rule for the params region is:
- `10 10` → de-stuffs to `10`
- `10 02/03/04` → kept literal (inner-frame markers)
- `10 X` for other X → de-stuffs to just `X` (drops the `0x10`)
When the counter passed in params has `0x10` in the high byte (e.g.
counter=`0x1000` produces params bytes `... 10 00 ...`), the device
silently corrupts the request to counter=`0x__00` and responds with
whatever lives at that wrong address. For counter=0x1000 the wrong
address was 0x0000, so the response was a copy of the file header +
STRT record. That STRT block then got embedded in the assembled body
at file offset `0x1016`, and Blastware refused to open the file
(interprets the second STRT as a malformed multi-event file).
This explains the entire >1-sec event-0 failure pattern:
- 1-sec events have `end_offset < 0x1000`, so the chunk walk never
requests counter `0x10__` and the bug never triggers.
- 2-sec / 3-sec / longer events all need a chunk at counter `0x1000`
(and longer events also need `0x1200`, `0x1400`, etc., none of which
have `0x10` in the high byte except `0x1000`). Just one corrupted
response is enough to embed STRT in the body and break the file.
Verified against BW 5-1-26 "copy 3sec" capture: all 17 5A request
frames (probe + 2 metadata pages + 13 sample chunks + TERM) now match
BW's wire output **byte-for-byte**, including the doubled `10 10 00`
for counter=0x1000.
### Notes
- `0x10` bytes in `offset_hi` (the standalone offset field at body[5])
are still written RAW — confirmed correct per the 1-2-26 capture.
- BW's actual encoding of `10 02` / `10 04` for meta pages 0x1002 /
0x1004 is *not* doubled — it relies on the device keeping `10 02`
and `10 04` as literal pairs. This is preserved by the fix.
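The de-stuffing rule and the matching encoder can be sketched as follows. This is an illustration of the rules listed in this entry, not the project's `build_5a_frame`; the function names are hypothetical, and the handling of a `0x10` that is the last byte of params (doubled here) is an assumption the changelog does not cover.

```python
DLE = 0x10
KEEP_AFTER_DLE = {0x02, 0x03, 0x04}   # pairs the device keeps literal


def stuff_params(params: bytes) -> bytes:
    """Double each 0x10 unless it is followed by 02/03/04 (left as a literal pair)."""
    out = bytearray()
    for i, b in enumerate(params):
        out.append(b)
        if b == DLE:
            nxt = params[i + 1] if i + 1 < len(params) else None
            if nxt not in KEEP_AFTER_DLE:
                out.append(DLE)       # assumption: a trailing 0x10 is doubled too
    return bytes(out)


def device_destuff(wire: bytes) -> bytes:
    """Model of the device-side rule for the params region."""
    out = bytearray()
    i = 0
    while i < len(wire):
        b = wire[i]
        if b != DLE or i + 1 == len(wire):
            out.append(b)
            i += 1
            continue
        nxt = wire[i + 1]
        if nxt == DLE:
            out.append(DLE)                 # 10 10 -> 10
        elif nxt in KEEP_AFTER_DLE:
            out += bytes((DLE, nxt))        # 10 02/03/04 kept literal
        else:
            out.append(nxt)                 # 10 X -> X (the 0x10 is dropped)
        i += 2
    return bytes(out)


# The bug: counter 0x1000 sent raw loses its high byte on the device side.
raw_counter = (0x1000).to_bytes(2, "big")
corrupted = device_destuff(raw_counter)
fixed_wire = stuff_params(raw_counter)
```

With the encoder applied, `device_destuff(stuff_params(p)) == p` holds for the counter and metadata-page values mentioned above.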
---
## v0.14.2 — 2026-05-04
### Fixed
- **`blastware_file.py` — removed harmful "duplicate header+STRT" strip.**
The v0.13.x strip logic was matching the byte sequence `00 12 03 00 STRT`
in legitimate waveform data — sample chunks at counter `0x1000` and
beyond often contain those bytes coincidentally — and zeroing 25 bytes
of valid samples per match. This is why event 0 (event-1 case in the
protocol) downloads of >1-sec recordings always failed in BW: the strip
destroyed real data at body offset `0x1012..0x102B` and propagated
alignment differences through the rest of the body. Sub-1-sec events
worked because their `end_offset` was below `0x1002`, so no sample
chunks landed in the metadata-page region and the strip's needle never
matched. Verified fix by re-feeding the BW 5-1-26 "copy 3sec" capture's
A5 frames into the file builder: output is now byte-identical to BW's
saved `M529LKIQ.G10` reference (8708 bytes, 0 differences).
- BW already concatenates frame contributions in stream order without
any de-duplication; SFM now does the same.
---
## v0.14.0 — 2026-05-02
### Changed (major rewrite)
- **`read_bulk_waveform_stream` — STRT-bounded chunk walk.** Replaces the
earlier `0x0400`-step / `max(key4[2:4], 0x0400)` chunk-counter formula,
which over-read ~5× past the actual event end into post-event circular-
buffer garbage. The new walk:
1. Probe at `counter = start_offset` (event 1: `0x0000`; event N:
`cur_key[2:4]`).
2. Parse `end_offset` from the STRT record at `data[17]` of the probe
response (`end_key[2:4]` field).
3. For event 1 only, read the two fixed metadata pages at counter
`0x1002` and `0x1004` — these contain the global session-start
compliance setup (Project / Client / User Name / Seis Loc /
Extended Notes ASCII strings). Continuation events skip these
(BW caches them across the session).
4. Walk sample chunks at **`0x0200` increments (NOT `0x0400`)**, bounded
by `end_offset` — the loop exits when
`next_chunk_counter + 0x0200 > end_offset`.
5. Send the proper TERM frame (see new `bulk_waveform_term_v2()`) with
`offset_word = end_offset - next_boundary` and
`params[2:4] = next_boundary BE`. The TERM response carries the
partial last chunk + 26-byte file footer.
- **New helpers:** `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
and `parse_strt_end_offset(a5_data)` in `minimateplus.framing`.
- **`stop_after_metadata` / `extra_chunks_after_metadata` kwargs are now
no-ops** under the v0.14.x walk. They are retained on the
`read_bulk_waveform_stream` signature for backward compatibility but log a
DEBUG line when set. The old "scan for `b'Project:'` and stop one chunk
later" workaround is obsolete — the loop is deterministically bounded by
the STRT-derived `end_offset`.
- **Project / Client / User Name / Seis Loc string source corrected.**
These come from the dedicated metadata pages at counter `0x1002` /
`0x1004`, not from "A5 frame 7" of the sample-chunk stream. The
earlier "A5 frame 7" claim was an artifact of the broken `0x0400`-step
walk where the bad counter formula coincidentally landed sample-chunk
fi=7 on top of the 0x1002 metadata page.
### Verified
- Three independent BW MITM captures (4-27-26 + 5-1-26 + 5-4-26) confirm
the new walk matches BW's behaviour event-for-event.
- `end_offset` values verified across 3 events: `0x1ABE` (4-27-26 2-sec),
`0x21F2` (5-1-26 3-sec), `0x417E` (5-1-26 event-2).
### Notes
- Earlier v0.13.0 / v0.13.1 / v0.13.2 entries describe partial steps along
the way (some of the file builder fixes, filename bugs, etc.) that were
superseded by the full rewrite. Treat this v0.14.0 entry as the
definitive landing point for the corrected SUB 5A protocol.
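The STRT-bounded walk and TERM formula above can be sketched as a pure planning function. This is a sketch only: `plan_walk` and `WalkPlan` are hypothetical names, and the counter of the first sample chunk is passed in as a parameter because this entry does not state it. The chunk list in the example is therefore illustrative and is not expected to match the chunk count seen in the captures.

```python
from typing import List, NamedTuple, Tuple

CHUNK_STEP = 0x0200
META_PAGES = (0x1002, 0x1004)     # fixed metadata pages, event 1 only


class WalkPlan(NamedTuple):
    probe: int                    # counter used for the probe request
    meta_pages: Tuple[int, ...]   # metadata page counters (empty for event N)
    chunks: List[int]             # sample-chunk counters, in request order
    next_boundary: int            # first counter NOT requested as a full chunk
    term_offset_word: int         # end_offset - next_boundary
    term_params_2_4: bytes        # next_boundary, big-endian


def plan_walk(start_offset: int, end_offset: int, first_chunk: int) -> WalkPlan:
    chunks: List[int] = []
    counter = first_chunk
    # Exit as soon as a full step would run past the STRT-derived end_offset.
    while counter + CHUNK_STEP <= end_offset:
        chunks.append(counter)
        counter += CHUNK_STEP
    next_boundary = counter
    return WalkPlan(
        probe=start_offset,
        meta_pages=META_PAGES if start_offset == 0 else (),
        chunks=chunks,
        next_boundary=next_boundary,
        term_offset_word=end_offset - next_boundary,
        term_params_2_4=next_boundary.to_bytes(2, "big"),
    )


# Event 1 with the 3-sec end_offset from "Verified"; first_chunk is an assumption.
plan = plan_walk(start_offset=0x0000, end_offset=0x21F2, first_chunk=0x0200)
```

Under these assumptions the TERM frame asks for the `0x01F2` bytes between the last `0x0200` boundary (`0x2000`) and `end_offset`, which matches the description of the TERM response carrying the partial last chunk.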
---
## v0.14.1 — 2026-05-04
### Fixed
- **`read_bulk_waveform_stream` — event-N probe counter off-by-`0x46`.**
Continuation events (start_key[2:4] != 0) were being probed at counter
`start_offset + 0x0046` instead of just `start_offset`. In the iteration
walk, `cur_key` from 1F is already the off=0x46 WAVEHDR record key, so the
earlier formula effectively double-counted the WAVEHDR offset. The probe
landed one WAVEHDR past the actual event start, the response no longer
contained the STRT record at byte 17, `parse_strt_end_offset` returned
`None`, and the chunk loop fell back to the `max_chunks=128` cap — walking
~110 chunks of post-event circular-buffer garbage. Verified against the
5-1-26 "copy 2nd address" and 5-4-26 BW 2-sec event captures: BW probes
counter=`0x2238` with key=`01112238` and STRT is present at byte 17 of
the response (end_offset=`0x417E`).
- **CLAUDE.md / docs/instantel_protocol_reference.md** — corrected the
event-N section to clarify that `start_key` in those formulas is the
off=0x46 key, not the off=0x2C boundary key, and removed the spurious
`+0x46` from the chunk-walk pseudocode.
---
## v0.13.2 — 2026-05-01
### Fixed
- **`_extract_record_type` — third 0C-record header format ("short", 8 bytes).**
A live SFM download against BE11529 produced files named `M5290000.000`
(zero-stamped) because the 0C waveform record's first bytes were
`01 05 07 ea ...` — neither the 9-byte single-shot layout (`0x10` at byte 1)
nor the 10-byte continuous layout (`0x10` at bytes 0 and 2). Investigation
showed this is a third format observed in the wild: an 8-byte header with no
marker bytes at all (`[day][month][year_BE:2][unknown][hour][min][sec]`).
The detection logic now scans the year (uint16 BE) at byte 2 / byte 3 / byte
4 and picks whichever offset returns a sensible year (2015–2050); each
format has the year at a unique position, so this disambiguates cleanly.
- New format → `event.record_type = "Waveform (Short)"`,
`Timestamp.from_short_record()`.
- Existing single-shot and continuous parsers unchanged.
- The user's event from May 1, 2026 13:21:37 now correctly resolves to a
filename like `M529LKIQ.G10` instead of `M5290000.000`.
### Added
- `Timestamp.from_short_record(data)` — decodes the 8-byte header.
- `_detect_record_format(data)` — internal helper returning
`"single_shot" / "continuous" / "short" / None` via year-position scan.
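A minimal sketch of the year-position scan, assuming the year sits at byte 2 for the short header (as the layout above shows), byte 3 for single-shot, and byte 4 for continuous. The last two offsets are inferred from the 9- and 10-byte header descriptions rather than stated outright, and the function name is hypothetical (it is not the project's `_detect_record_format`).

```python
from typing import Optional

YEAR_MIN, YEAR_MAX = 2015, 2050

# (byte offset of the uint16 BE year, format name). Offsets 3 and 4 are assumptions.
YEAR_OFFSETS = ((2, "short"), (3, "single_shot"), (4, "continuous"))


def detect_record_format_sketch(data: bytes) -> Optional[str]:
    """Return the first format whose year position holds a sensible year."""
    for offset, name in YEAR_OFFSETS:
        if len(data) < offset + 2:
            continue
        year = int.from_bytes(data[offset:offset + 2], "big")
        if YEAR_MIN <= year <= YEAR_MAX:
            return name
    return None


# Short header from this entry: 01 05 07 ea ... (day 1, month 5, year 0x07EA = 2026).
short_hdr = bytes([0x01, 0x05, 0x07, 0xEA, 0x00, 13, 21, 37])
# Hypothetical single-shot and continuous headers built from the marker positions.
single_hdr = bytes([0x01, 0x10, 0x05, 0x07, 0xEA, 0x00, 13, 21, 37])
cont_hdr = bytes([0x10, 0x01, 0x10, 0x05, 0x07, 0xEA, 0x00, 13, 21, 37])
```

Because the scan looks only at the year, the day-of-month value can no longer collide with a marker byte, which was the failure mode in v0.13.1.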
---
## v0.13.1 — 2026-05-01
### Fixed
- **`_extract_record_type` — Continuous-mode record headers misclassified as Unknown.**
In single-shot mode the 0C waveform record's 9-byte header puts the sub_code
marker `0x10` at byte 1, with the day at byte 0. In Continuous mode the
header is 10 bytes with the marker at byte 0 *and* byte 2, and the day at
byte 1. Previous logic only inspected byte 1 and treated any value other
than `0x10` / `0x03` as `"Unknown"`, which prevented `event.timestamp` from
being populated for any continuous-mode event whose day-of-month wasn't
exactly 3 or 16. As a downstream effect, `blastware_filename()` saw
`event.timestamp == None`, fell back to `stem="0000"` / `ab="00"`, and
produced filenames like `M5290000.000`. Discovered from a live SFM run on
BE11529 in continuous mode (day-of-month = 5).
Now disambiguates by checking BOTH byte 0 and byte 2: if both are `0x10`,
it's the 10-byte continuous header; else if byte 1 is `0x10`, it's the
9-byte single-shot header. Day-of-month no longer matters.
*Superseded by v0.13.2 — the user's actual record uses a third 8-byte format
with no `0x10` markers, which v0.13.1 still misclassified.*
---
## v0.13.0 — 2026-05-01
### Fixed
- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
`read_bulk_waveform_stream` was walking the chunk counter past the actual
end of the event, picking up post-event circular-buffer garbage that
corrupted reconstructed Blastware files for any waveform > ~1 sec. The
loop now extracts the event's `end_offset` from the STRT record at
`data[23:27]` of the probe response and stops the chunk walk when the next
counter would step past it. Verified against three BW MITM captures
(4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.
### Added
- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
formula confirmed across all 3 BW captures. Not yet wired into
`read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
existing `blastware_file.write_blastware_file` frame-structure expectations);
available for the next iteration that switches to BW's 0x0200 chunk step.
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
from the STRT record in an A5 response payload.
### Documentation
- **CLAUDE.md and `docs/instantel_protocol_reference.md` extensively
rewritten** to reflect the corrected SUB 5A protocol. See:
- CLAUDE.md "SUB 5A — chunk counter formula (REWRITTEN 2026-05-01)"
- CLAUDE.md "SUB 5A — STRT record encodes end_offset"
- CLAUDE.md "SUB 5A — TERM frame formula"
- CLAUDE.md "SUB 5A — fixed metadata pages 0x1002 and 0x1004"
- CLAUDE.md "SUB 0A — WAVEHDR response length distinguishes events from
boundaries" (0x46 = real event, 0x2C = boundary marker)
- protocol reference §7.8.5 / §7.8.6 / §7.8.7 / §7.8.8
- The previous chunk-counter formula (`max(key4[2:4], 0x0400) + (chunk-1) *
0x0400`) is now marked DEPRECATED and explicitly tagged WRONG with
pointers to the new sections, so future work doesn't re-derive it.
### Known minor diffs vs Blastware (deferred to a follow-up)
- We still use the OLD 0x0400 chunk step rather than BW's 0x0200; switching
also requires updating `blastware_file.write_blastware_file`'s skip values
and "extra chunk after metadata" logic, which depends on a fresh capture
to verify.
- We still use the legacy fixed `offset_word=0x005A` TERM frame rather than
BW's `end_offset - next_boundary` formula, for the same reason.
- Two fixed metadata pages at counter `0x1002` and `0x1004` are not yet
read explicitly; under the current 0x0400 walk their content is reachable
via the sample chunk that covers buffer addresses `[0x1000, 0x1400)`.
---
## v0.12.6 — 2026-05-01
### Fixed
- **`blastware_file.py` — waveform frame classification** — A5 frame classification for
waveform-only vs header-only frames now uses `frame.record_type` instead of frame index.
Only waveform frames (0x46) are written to the file body; metadata frames are skipped.
Fixes spurious data corruption from incorrectly classified frames.
- **`s3_analyzer.py` — A5/5A frame naming** — Bulk waveform stream frames (SUB 5A response)
are now correctly labeled "A5" in analyzer output instead of being conflated with other
multi-frame responses (SUB A4, E5, etc.).
- **`S3FrameParser` — frame terminator detection** — Corrected the bare ETX terminator
detection. Frame termination is now correctly identified by a standalone `ETX=0x03` byte,
not by the `DLE+ETX` sequence (which is part of the payload when it appears within a frame).
---
## v0.12.5 — 2026-04-21
### Added
- **`seismo_lab.py` — Download tab** — New fourth tab for live wire-byte capture during event
downloads. Captures both BW→device and device→S3 frames in real time, allowing inspection
of the 5A bulk stream chunk sequence and frame-by-frame analysis without needing a bridge
or MITM proxy. Files are saved with user-specified labels for easy tracking.
### Changed
- **`s3_bridge.py` — raw captures always-on by default** — `--raw-bw` and `--raw-s3` now
default to `"auto"` instead of `None`. Every bridge session automatically generates
timestamped `raw_bw_<ts>.bin` and `raw_s3_<ts>.bin` files alongside the `.bin`/`.log`
session files. Pass `--raw-bw ""` (explicit empty string) to disable if needed.
- **`gui_bridge.py` — raw capture checkboxes pre-checked** — Both "BW→S3 raw" and
"S3→BW raw" checkboxes start checked. Path fields are empty by default (bridge auto-names
the files). Unchecking a box passes `--raw-bw ""` to explicitly disable capture.
- **`Bridge tab` — TCP mode added** — Serial/TCP radio toggle allows connection via cellular
modem (RV50/RV55) instead of direct RS-232. Supports multi-capture design (simultaneous
Bridge + Analyzer + Download sessions).
- **`ach_server.py` — TX capture added (`raw_tx_<ts>.bin`)** — Every ACH inbound session
now saves both directions: `raw_rx_<ts>.bin` (device → us, S3 side, as before) and
`raw_tx_<ts>.bin` (us → device, BW side). Both files are usable in the Analyzer.
TX bytes are buffered in memory until startup handshake succeeds (same as RX), preventing
scanner probes from creating empty files.
---
## v0.12.4 — 2026-04-21 (protocol analysis / docs only — no code changes)
### Discovered
- **compliance_raw is wire-encoded, not logical bytes** — `read_compliance_config()` returns
bytes that include DLE prefix bytes (`0x10`) before any `0x03` values (because S3FrameParser
preserves DLE+ETX inner-frame pairs as two literal bytes). The previous CLAUDE.md claim that
"S3FrameParser handles this transparently so compliance_raw contains logical bytes" was wrong.
- **anchor-9 behavior per recording mode** (confirmed from 4-20-26 BW write captures):
- Single Shot (0x00) / Continuous (0x01): anchor-9 = `0x00`
- Histogram (0x03): anchor-9 = `0x10` — the E5 DLE prefix for the `0x03` recording_mode byte
- Histogram+Continuous (0x04): anchor-9 = `0x10` — an actual stored config byte for this mode
Anchor position shifts by ±1 when recording_mode = `0x03` due to the extra DLE byte; the
dynamic anchor search (`buf.find(ANCHOR, 0, 150)`) handles this correctly without code changes.
- **Write frame ETX escaping** — BW escapes `0x03` bytes in write frame data as `0x10 0x03`
on the wire. Our `build_bw_write_frame` sends data bytes raw without ETX escaping. Device
accepts our raw writes for all tested modes. Hypothesis: device write parser uses the
offset/length field for frame boundaries, not ETX scanning, making ETX escaping optional.
Histogram mode (recording_mode = 0x03) write via SFM from a non-Histogram starting state
not yet tested.
- **BW write payload vs E5 read payload are byte-identical** around the anchor region (confirmed
by comparing 3-11-26 BW TX and S3 captures). BW does NOT strip DLE prefix bytes before writing;
it round-trips the wire-encoded bytes verbatim with only the modified fields changed.
- **Capture folder content catalogued** — see CLAUDE.md "BW capture reference" table for a
summary of all available protocol captures and their contents.
---
## v0.12.3 — 2026-04-20
### Added
+395 -58
@@ -2,7 +2,7 @@
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
(Sierra Wireless RV50 / RV55). Current version: **v0.12.3**.
(Sierra Wireless RV50 / RV55). Current version: **v0.14.3**.
When new information about the protocol is discovered, please update `instantel_protocol_reference.md` with the findings in addition to this document.
@@ -27,7 +27,7 @@ CHANGELOG.md ← version history
---
## Current implementation state (v0.12.3)
## Current implementation state (v0.14.3)
Full read pipeline + write pipeline + erase pipeline + monitor log + call home config working end-to-end over TCP/cellular:
@@ -41,14 +41,15 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
| Event header / first key | 1E | ✅ |
| Waveform header | 0A | ✅ |
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ new v0.6.0 |
| **Bulk waveform stream (event-time metadata + full waveform)** | **5A** | ✅ **byte-perfect against BW captures (v0.14.3, 2026-05-05)** — STRT-bounded chunk walk + correct event-N probe counter + DLE-stuffed `0x10` bytes in params + concatenate-only file body assembly. All 17 5A request frames in the 5-1-26 3-sec capture reproduce byte-for-byte. |
| Event advance / next key | 1F | ✅ |
| **Write commands (push config to device)** | **6883** | ✅ new v0.8.0 |
| **Erase all events** | **0xA3 → 0x1C → 0x06 → 0xA2** | ✅ new v0.9.0 |
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ new v0.10.0 |
| **Auto Call Home config (read + write)** | **2C → 7E → 7F** | ✅ **new v0.12.3** |
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
`get_events()` sequence per event: `1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`
(see "Correct iteration pattern" section below for full detail)
`push_config_raw()` write sequence: `68→73 | 71×3→72 | 82→83 | 69→74→72`
@@ -115,24 +116,203 @@ S3→BW (response):
section contribute only `XX` to the running sum; lone bytes contribute normally. This
differs from the standard SUM8-of-destuffed-payload that all other commands use.
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
BW TX capture. All 10 frames verified.
3. **Params region uses partial DLE stuffing (CONFIRMED 2026-05-05).** The device's
de-stuffing rule for bytes inside the params region is:
### SUB 5A — chunk counter is monotonic (CORRECTED 2026-04-06)
- `10 10` → de-stuffs to `10`
- `10 02 / 03 / 04` → kept literal (these are inner-frame markers)
- `10 X` for other X → de-stuffs to just `X` (drops the leading `0x10`)
**Chunk counters are `chunk_num * 0x0400` for ALL chunks including chunk 1.**
Therefore any `0x10` byte in the *logical* params that is followed by a byte NOT in
`{0x02, 0x03, 0x04, 0x10}` MUST be doubled on the wire (`10 X` → `10 10 X`) so the
device's de-stuffer reproduces the original `10 X` pair. This applies most commonly
to counters with `0x10` in the high byte (e.g. counter=`0x1000` produces logical
params bytes `... 10 00 ...`, which BW encodes on the wire as `... 10 10 00 ...`).
Without this stuffing the device interprets counter=`0x1000` as `0x0000` and returns
the probe response (which contains a copy of the file header + STRT record). That
STRT block then gets embedded in the assembled file body at offset `0x1016`, and
Blastware refuses to open the file — see the v0.14.3 entry in `CHANGELOG.md`.
The 4-2-26 BW TX capture showed `counter=0x1004` for chunk 1 of event key `01110000`, which
led to `_CHUNK1_COUNTER = 0x1004` being hardcoded as a special case. This was a Blastware
artifact, not a protocol requirement. Empirical test 2026-04-06: with `counter=0x1004` for
chunk 1 the device times out (120 s); with `counter=0x0400` (= `1 * 0x0400`) it responds
immediately and streams all frames correctly.
`0x10` bytes in `offset_hi` (body[5]) are still written RAW — only the params region
has this stuffing requirement. The metadata-page params for counter `0x1002` /
`0x1004` survive without stuffing because `10 02` and `10 04` fall in the "kept
literal" carve-out.
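The stuffing rule above can be sketched as a pair of helpers. This is a minimal illustration only, not the code in the repository: the names `stuff_params` and `destuff_params` are hypothetical, and the handling of a trailing `0x10` or of a logical `10 10` pair is an assumption, since neither case is covered by the rule as stated.

```python
# Bytes that the device keeps literal after a 0x10 (inner-frame markers).
KEPT_LITERAL = {0x02, 0x03, 0x04}


def stuff_params(logical: bytes) -> bytes:
    """Double any 0x10 whose follower is not in {0x02, 0x03, 0x04, 0x10}."""
    out = bytearray()
    for i, b in enumerate(logical):
        out.append(b)
        # Assumption: a trailing 0x10 (no follower) is sent unchanged.
        if b == 0x10 and i + 1 < len(logical):
            nxt = logical[i + 1]
            if nxt not in KEPT_LITERAL and nxt != 0x10:
                out.append(0x10)
    return bytes(out)


def destuff_params(wire: bytes) -> bytes:
    """Model of the device-side de-stuffing rule described above."""
    out = bytearray()
    i = 0
    while i < len(wire):
        b = wire[i]
        if b == 0x10 and i + 1 < len(wire):
            nxt = wire[i + 1]
            if nxt == 0x10:
                out.append(0x10)            # 10 10 -> 10
            elif nxt in KEPT_LITERAL:
                out += bytes((0x10, nxt))   # 10 02/03/04 kept literal
            else:
                out.append(nxt)             # 10 X -> X (leading 0x10 dropped)
            i += 2
            continue
        out.append(b)
        i += 1
    return bytes(out)
```

With logical params `00 01 11 10 00 ...` (counter `0x1000`), `stuff_params` yields `00 01 11 10 10 00 ...`, and `destuff_params` of that wire form returns the original bytes. Feeding the unstuffed bytes to `destuff_params` instead collapses the address to `01 11 00 00`, which is the counter=`0x0000` misread described above.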
The 4-3-26 capture confirms the pattern for a second event (key `0111245a`):
chunk 1 = `0x245A`, chunk 2 = `0x285A`, chunk 3 = `0x2C5A` (each +0x0400). Blastware's
true formula is `key4[2:4] + n * 0x0400` — but since `key4[2:4]` of the first event is
`0x0000`, `n * 0x0400` produces the right result. The device does not strictly validate the
counter and streams data for any valid 5A request; using `chunk_num * 0x0400` is correct.
Both differences (1) and (2) confirmed by reproducing Blastware's exact wire bytes from
the 1-2-26 BW TX capture (10 frames). Difference (3) confirmed against the 5-1-26
"bwcap3sec" capture (17 frames, all match byte-for-byte after fix).
### SUB 5A — chunk counter formula (REWRITTEN 2026-05-01 — see 5-1-26 captures)
> ⚠️ **Everything that came before this rewrite was WRONG in important ways.** The previous
> formula `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` happened to *work* for events
> at start_key=0 because the device responds to whatever counter you ask for — but it caused
> a 5× over-read past the actual event, picking up post-event circular-buffer garbage that
> corrupts the reconstructed file for any event > ~1 sec of waveform. The captures in
> `bridges/captures/4-27-26/` and `5-1-26/comcheck/` show BW reads only ~12-16 chunks for
> the same events SFM was reading 37+ chunks for. See "TERM frame" and "STRT end_offset"
> sections below for the actual mechanism.
**Chunk addressing is just absolute device-buffer addresses.**
`params[0]=0x00`, `params[1:5]` is a 4-byte absolute device flash-buffer address (= the
"key" of that location), `params[5:11]` are zeros. The device returns 0x0200 (= 512) bytes
starting at that address. Increments between consecutive chunks are **0x0200 (NOT 0x0400)**
— this matches the chunk payload size. The previous "0x0400 step" worked by accident: BW
asks for half-size chunks; SFM was asking for double-size chunks, both with the same-named
"counter" field, but the value is just an address pointer the device honors as-is.
**The chunk pattern depends on whether the event sits at start_key=0 or not.**
#### Event 1 case — start_key[2:4] == 0x0000 (first event after erase / wrap)
```
1. Probe at counter=0x0000 (params[1:5] = full key, returns STRT record)
2. Read 2 fixed metadata pages: counter=0x1002, counter=0x1004
(these are GLOBAL session metadata — read ONCE per
Blastware session, not per event; contain the
Project/Client/User Name/Seis Loc strings)
3. Sample chunks: counter=0x0600, 0x0800, …, by 0x0200 increment,
up to but not including end_offset (rounded down to
0x0200 boundary)
4. TERM frame (see TERM formula below)
```
The reason `0x0046..0x0600` is skipped for event 1 is unknown — likely some pre-event
firmware reserved area for the first slot in a freshly-erased buffer. Harmless to skip.
#### Event 2+ case — start_key[2:4] != 0x0000 (continuation events)
```
1. First chunk at counter = start_key[2:4] (this IS the probe — response
contains STRT at byte 17)
2. Sample chunks: counter += 0x0200 each, up to but
not including end_offset
3. TERM frame
```
**`start_key` here is the off=0x46 WAVEHDR record key returned by 1F** (e.g. `01112238`),
NOT the off=0x2C boundary key that immediately precedes it. An earlier draft of this
doc described event-N as "probe at start + 0x46" — that formula came from naming the
boundary key as `start_key`. In the iteration walk, `cur_key` passed to
`read_bulk_waveform_stream` is always the off=0x46 key (the partial-record skip path in
`get_events` re-runs 1F to advance past boundary records before invoking 5A), so the
probe counter is just `cur_key[2:4]` with no extra offset. **Adding +0x46 caused the
probe to overshoot, miss the STRT record at byte 17 of the response, fall back to the
`max_chunks=128` cap, and walk ~110 chunks of post-event garbage** — observed in
SFM 5-4-26 capture before the fix.
Confirmed across:
- 5-1-26 "copy 2nd address" BW capture: probe counter=0x2238, key=01112238, STRT@17 end=0x417E.
- 5-4-26 BW 2-sec event capture: probe counter=0x2238, key=01112238, TERM offset_word=0x0146 → end=0x417E.
No metadata pages — those have already been read during event 1 in the same Blastware
session, and BW caches them. Note that the metadata-page reads happen ONCE per
Blastware-session-on-the-device, not once per event, so an SFM session that downloads
several events should read 0x1002/0x1004 only once at the start.
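The two chunk patterns above can be expressed as a small planning function. This is a sketch under stated assumptions, not the logic of `read_bulk_waveform_stream`: the names `plan_walk`, `sample_counters`, and `metadata_cached` are hypothetical, and the function takes `end_offset` as an input even though a real download has to send the probe first to learn it from the STRT record.

```python
from __future__ import annotations

CHUNK_STEP = 0x0200                 # chunk payload size and counter increment
METADATA_PAGES = (0x1002, 0x1004)   # fixed global metadata pages
EVENT1_FIRST_SAMPLE = 0x0600        # first sample chunk for an event at offset 0


def sample_counters(first: int, end_offset: int) -> list[int]:
    """Counters of every full 0x0200 chunk that fits before end_offset."""
    counters = []
    counter = first
    while counter + CHUNK_STEP <= end_offset:
        counters.append(counter)
        counter += CHUNK_STEP
    return counters


def plan_walk(start_offset: int, end_offset: int,
              metadata_cached: bool = False) -> list[int]:
    """Request counters for one event, excluding the TERM frame."""
    if start_offset == 0x0000:
        # Event 1 case: probe, optional metadata pages, then samples.
        plan = [0x0000]
        if not metadata_cached:
            plan += METADATA_PAGES
        plan += sample_counters(EVENT1_FIRST_SAMPLE, end_offset)
        return plan
    # Event 2+ case: the first chunk doubles as the probe.
    return sample_counters(start_offset, end_offset)
```

For the 5-1-26 3-sec event (`end_offset=0x21F2`) this gives the probe, two metadata pages, and 13 sample chunks ending at `0x1E00`; adding the TERM frame gives 17 requests, which agrees with the 17-frame capture. For the continuation event (`0x2238` to `0x417E`) it gives 15 chunks ending at `0x3E38`.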
#### History (do not re-derive)
- Original: `_CHUNK1_COUNTER = 0x1004` hardcoded (Blastware capture artifact — WRONG).
- 2026-04-06: `chunk_num * 0x0400` (worked for key 01110000 only).
- 2026-04-24: `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets, broke key 01110000).
- 2026-04-26: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400` (broken — over-read past event end).
- 2026-05-01: Increments are 0x0200 not 0x0400; absolute addresses inside event range; bounded
by STRT end_key, not by `max_chunks` cap or device-side timeout.
- 2026-05-04: Removed spurious `+0x0046` from event-N probe counter. `cur_key` from 1F
is already the off=0x46 WAVEHDR key, so adding +0x46 would have placed the probe one
WAVEHDR past the actual event start. This caused probe responses to lack a STRT
record (no `end_offset` parsed → `0xFFFF` fallback → `max_chunks=128` cap), walking
~110 chunks of post-event circular-buffer garbage. Fixed in protocol.py
`read_bulk_waveform_stream`.
### SUB 5A — STRT record encodes end_offset (NEW 2026-05-01)
The first A5 response (probe response, or the first chunk for event 2+) contains a STRT
record at byte offset 17 of the `data` field. Layout:
```
data[17:21] "STRT" magic
data[21:23] ff fe sentinel
data[23:27] end_key ← 4-byte key of where this event ENDS
data[27:31] start_key ← 4-byte key of where this event STARTS
data[31:33] uint16 BE ?? sample-count or total bytes (varies; not yet decoded)
data[33:35] uint16 BE ??
data[35] 0x46 record type (waveform full record)
```
`end_offset = (end_key[2] << 8) | end_key[3]` is **the authoritative event-end pointer**.
SFM must extract this from the first A5 response and use it to bound the chunk loop and
encode the TERM frame. The device will happily respond to chunk requests past `end_offset`
(returning post-event circular-buffer contents) — that's the over-read bug.
Verified across 3 events:
| Capture | start_key | end_key | end_offset | event size |
|---|---|---|---|---|
| 4-27-26 "open 2sec" / "copy event to disk" | `01110000` | `01111ABE` | `0x1ABE` | 6,846 B |
| 5-1-26 "copy 3sec" / Download All event 1 | `01110000` | `011121F2` | `0x21F2` | 8,690 B |
| 5-1-26 "copy 2nd address" / DA event 2 | `011121F2` | `0111417E` | `0x417E` | 8,076 B (event 2 span 0x1F8C) |
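A minimal sketch of reading the STRT layout above from the `data` field of the first A5 response. The function name `parse_strt_record` and the returned dictionary shape are hypothetical; only the byte offsets come from the layout documented here, and the two undecoded `uint16` fields are deliberately ignored.

```python
STRT_OFFSET = 17        # STRT magic starts at data[17]
STRT_MIN_LENGTH = 36    # layout runs through data[35]


def parse_strt_record(data: bytes) -> dict:
    """Extract the start/end keys and end_offset from an A5 probe response."""
    if len(data) < STRT_MIN_LENGTH:
        raise ValueError("payload too short to hold a STRT record")
    if data[STRT_OFFSET:STRT_OFFSET + 4] != b"STRT":
        raise ValueError("no STRT magic at data[17:21]")
    end_key = data[23:27]
    start_key = data[27:31]
    return {
        "end_key": end_key,
        "start_key": start_key,
        # end_offset = (end_key[2] << 8) | end_key[3]
        "end_offset": int.from_bytes(end_key[2:4], "big"),
        "record_type": data[35],
    }
```

Raising on a missing magic, rather than falling back to a default such as `0xFFFF`, is a design choice for this sketch: it makes an overshooting probe fail loudly instead of silently walking past the event end.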
### SUB 5A — TERM frame formula (FINALIZED 2026-05-01)
The TERM frame fetches the partial last chunk *and* the file footer. It is **not** a simple
"goodbye" frame — its response payload contains the bytes between the last full 0x0200-aligned
chunk and `end_offset`, and is required for reconstructing the Blastware file format.
```
last_chunk_counter = address of last full 0x0200-byte chunk read
next_boundary = last_chunk_counter + 0x0200
TERM offset_word = end_offset - next_boundary
TERM params[0] = key[0] (= 0x01 on every observed device)
TERM params[1] = key[1] (= 0x11)
TERM params[2] = (next_boundary >> 8) & 0xFF
TERM params[3] = next_boundary & 0xFF
TERM params[4:10] = zeros
build_5a_frame(offset_word, params) (10-byte params, NOT 11)
```
The device reconstructs `requested_address = ((params[2] << 8) | params[3]) + offset_word = end_offset`
and replies with `(end_offset - next_boundary)` bytes from `next_boundary` — the residual
between the last 0x0200 boundary and the actual event end. Append the TERM response data
to the chunk stream like any other A5 frame; it carries the final waveform tail + footer.
Verified across 3 events:
| end_offset | last chunk | next_boundary | TERM offset_word | TERM params[2:4] |
|---|---|---|---|---|
| `0x1ABE` | `0x1800` | `0x1A00` | `0x00BE` ✓ | `1A 00` ✓ |
| `0x21F2` | `0x1E00` | `0x2000` | `0x01F2` ✓ | `20 00` ✓ |
| `0x417E` | `0x3E38` | `0x4038` | `0x0146` ✓ | `40 38` ✓ |
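The TERM formula and the three verified rows above can be checked with a short helper. This is an illustrative sketch, not the repository's `framing.bulk_waveform_term_v2`: the name `term_frame_fields` and the range check on the residual are assumptions made for the example.

```python
from __future__ import annotations

CHUNK_STEP = 0x0200


def term_frame_fields(key4: bytes, end_offset: int,
                      last_chunk_counter: int) -> tuple[int, bytes]:
    """Return (offset_word, 10-byte params) for the TERM frame."""
    next_boundary = last_chunk_counter + CHUNK_STEP
    offset_word = end_offset - next_boundary
    # Assumption: after a bounded walk the residual is always a partial chunk.
    if not 0 <= offset_word < CHUNK_STEP:
        raise ValueError("end_offset is not within one chunk of next_boundary")
    params = bytes([
        key4[0],
        key4[1],
        (next_boundary >> 8) & 0xFF,
        next_boundary & 0xFF,
    ]) + bytes(6)
    return offset_word, params
```

For example, `term_frame_fields(bytes.fromhex("01112238"), 0x417E, 0x3E38)` returns `offset_word=0x0146` and params beginning `01 11 40 38`, matching the third table row.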
The previous code's hard-coded `offset_word = 0x005A` and `term_counter = last + 0x0400`
are wrong; the device's response under that path is a tiny 101-byte device-side terminator
(arrived only after we walked the entire post-event buffer), not the proper file footer.
### SUB 5A — fixed metadata pages 0x1002 and 0x1004 (NEW 2026-05-01)
Two chunk addresses are GLOBAL device/session metadata, not event-specific:
- `counter=0x1002` — first metadata page
- `counter=0x1004` — second metadata page
These are at fixed absolute addresses in the device's flash buffer. They contain the
session-start compliance setup (Project/Client/User Name/Seis Loc/Extended Notes ASCII
strings). Under the v0.14.0+ walk these strings are read directly from the metadata
pages, not from the sample-chunk stream.
BW reads them ONCE per Blastware session (during event 1's download) and caches them.
For SFM, that means:
- Once per call-home / once per `MiniMateClient.connect()` is enough.
- Subsequent events in the same session don't need to re-fetch them.
- Their content does not change when iterating events; only when the user opens
Compliance Setup → Apply on the device or sends a SUB 71 compliance write.
The full byte-for-byte layout of the metadata pages has not been mapped — `_decode_a5_metadata_into`
locates the ASCII strings via label scans (`Project:`, `Client:`, `User Name:`, `Seis Loc:`,
`Extended Notes`) which works correctly across observed captures. Future work could
dump the structural layout if more session-global fields need to be extracted.
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
@@ -140,10 +320,11 @@ counter and streams data for any valid 5A request; using `chunk_num * 0x0400` is
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
Do not swap them.
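For contrast with the 10-byte TERM params, the 11-byte chunk params layout described earlier (`params[0]=0x00`, `params[1:5]` = absolute address, `params[5:11]` = zeros) might be built as follows. The helper name `chunk_params` is hypothetical, and taking the two high address bytes from `key4[0:2]` is an assumption based on the observed keys (for example `01112238` with counter `0x2238`). The result is the logical form, before any DLE stuffing.

```python
def chunk_params(key4: bytes, counter: int) -> bytes:
    """Return the 11 logical params bytes for one 5A chunk request."""
    if len(key4) != 4:
        raise ValueError("key4 must be exactly 4 bytes")
    if not 0 <= counter <= 0xFFFF:
        raise ValueError("counter must fit in 16 bits")
    # Assumption: address = key prefix (2 bytes) + counter (2 bytes, big-endian).
    address = key4[:2] + counter.to_bytes(2, "big")
    return b"\x00" + address + bytes(6)
```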
### SUB 5A — event-time metadata lives in A5 frame 7
### SUB 5A — event-time metadata source (FINALIZED 2026-05-05)
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
setup as it existed when the event was recorded:
The metadata strings come from the two fixed metadata pages at counter `0x1002` and
`0x1004` (see "SUB 5A — fixed metadata pages 0x1002 and 0x1004" above). These pages
are GLOBAL session metadata — read once per Blastware/SFM session, not per event.
```
"Project:" → project description
@@ -153,44 +334,71 @@ setup as it existed when the event was recorded:
"Extended Notes"→ notes
```
**IMPORTANT — 5A "Project:" is session-start config, NOT per-event (confirmed 2026-04-05):**
The "Project:" string in the A5 frame 7 payload reflects the compliance setup from when
the *monitoring session first started*, not the individual event's project name. The per-
event project name is correctly stored in the 210-byte 0C waveform record and must be
used as the authoritative source. `_decode_a5_metadata_into` therefore only sets
`project` from 5A when 0C didn't already supply one.
**IMPORTANT — these strings are session-start config, NOT per-event:**
Project / Client / User Name / Seis Loc reflect the compliance setup from when the
*monitoring session first started*, not the individual event's per-event metadata. The
authoritative per-event project name is stored in the 210-byte 0C waveform record.
`_decode_a5_metadata_into` therefore only sets `project` from the 5A metadata pages
when 0C didn't already supply one.
"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are **NOT** present in the 0C
record — 5A remains the sole source for those fields and they are set unconditionally.
record — the metadata pages are the sole source for those fields and they are set
unconditionally.
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
then sends the termination frame.
#### Deprecated knobs (do not re-introduce)
### SUB 5A — end-of-stream signal (confirmed 2026-04-06)
The `read_bulk_waveform_stream()` function still accepts these legacy kwargs for
backward compatibility, but they are **no-ops** under the v0.14.0+ walk:
After streaming all waveform chunks, the device sends exactly **1 raw byte** in response to
the next chunk request, then goes silent. This is the natural end-of-stream indicator — NOT
a complete A5 frame. `S3FrameParser.bytes_fed` will be 1; no frame is assembled.
- `stop_after_metadata=True` — used to scan the chunk stream for `b"Project:"` and stop
one chunk later as a workaround for the missing end_offset bound. Obsolete: the loop
is now deterministically bounded by `end_offset` parsed from the STRT record at
data[17] of the probe response, with the partial tail fetched by the TERM frame.
- `extra_chunks_after_metadata` — same era, same reason. No-op.
Handling: on `TimeoutError`, if `bytes_fed > 0` AND frames were already collected, treat as
graceful end-of-stream, break the loop, and proceed to the termination frame. If `bytes_fed
== 0` with no prior frames, it is a genuine transport failure — re-raise.
If you find code or docs referencing "A5 frame 7" as the source of metadata strings,
that's an old-walk artifact (the broken `0x0400`-step formula occasionally caught the
0x1002 metadata page at sample-chunk fi=7). Update to reference the dedicated metadata
pages instead.
**Chunk recv timeout must be 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
Using 120 s causes a ~2-minute stall at every end-of-stream detection. The `_recv_one` call
in the chunk loop passes `timeout=10.0` explicitly.
### SUB 5A — end-of-stream (FINALIZED 2026-05-01)
**Typical chunk count (BE11529, 1024 sps):** A 9,306-sample event produces 35 chunks before
end-of-stream. Chunks with uniform 1,036-byte data are all-zero ADC samples (post-event
silence). Only the initial variable-size chunks contain actual signal.
Under the v0.14.0+ STRT-bounded walk the stream ends cleanly:
```
… last full chunk at counter < end_offset
TERM request (offset_word = end_offset - next_boundary,
params address (next_boundary))
TERM response (page_key = 0x0000 or 0x0001, data = the residual
end_offset - next_boundary bytes including the file footer)
```
No timeout-based detection, no "1-byte teaser," no `max_chunks` cap. The chunk loop
exits when `counter + 0x0200 > end_offset`; the TERM frame fetches the tail.
**Chunk recv timeout is 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
Using 120 s would cause a ~2-minute stall on any unexpected timeout. The `_recv_one`
call in the chunk loop passes `timeout=10.0` explicitly.
**Typical chunk count under the v0.14.0+ walk (BE11529, 1024 sps over TCP/cellular):**
| Event duration | Sample chunks | Metadata pages | TERM | Total A5 frames |
|---|---|---|---|---|
| 2-sec (event 1) | ~12 | 2 | 1 | ~15 |
| 3-sec (event 1) | 13 | 2 | 1 | 16 |
| 2-sec (continuation) | 15 | 0 | 1 | 16 |
| 3-sec (continuation) | ~14 | 0 | 1 | ~15 |
For comparison, the deprecated `0x0400`-step walk produced ~37 chunks for a 2-sec
event with chunks 17-37 containing post-event circular-buffer garbage. Do not
re-introduce that walk under any circumstances.
### SUB 5A — fi==9 hardcoded skip (FIXED 2026-04-06)
`_decode_a5_waveform()` previously had `elif fi == 9: continue` — a leftover from the
9-frame original blast capture where frame 9 was assumed to be a terminator. For current
35-frame streams, fi==9 is live waveform data (~133 sample-sets were being dropped).
Removed. Terminator detection is via `page_key == 0x0000` in `read_bulk_waveform_stream`,
not frame index.
9-frame original blast capture where frame 9 was assumed to be a terminator. Removed.
TERM detection in the file builder uses `frame.page_key != 0x0010` (sample marker),
not frame index — see `blastware_file.py`.
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
@@ -295,6 +503,55 @@ sends token=0xFE and is NOT used by any caller.
`advance_event()` returns `(key4, event_data8)`.
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
### SUB 0A — WAVEHDR response length distinguishes events from boundaries (NEW 2026-05-01)
When iterating events with the "Download All" pattern (1E → 0A → 1F → 0A → 1F → …), the
DATA_LENGTH at `data_rsp.data[5]` (= the byte BW echoes back as the offset for the data
fetch step) takes one of two values:
| WAVEHDR offset | Meaning |
|---|---|
| `0x46` (= 70) | Real event start key — there is event data at this address |
| `0x2C` (= 44) | Boundary marker between events — this key is the END of the previous event AND the START key for the empty space after it (or is the next event's pre-header) |
Confirmed from the 5-1-26 "Download All" capture:
```
0A(key=01110000) → off=0x46 ← event 1 real start
1F → key=011121F2
0A(key=011121F2) → off=0x2C ← event 1 END / event 2 boundary
1F → key=01112238
0A(key=01112238) → off=0x46 ← event 2 real start (= boundary + 0x46)
1F → key=0111417E
0A(key=0111417E) → off=0x2C ← event 2 END / next-empty marker
1F → null sentinel
```
This is why event 2's first 5A chunk is at the boundary key + 0x46 (`0x21F2 + 0x46 = 0x2238`): that is the address of the
"real start" 0x46-record, distinct from the `0x2C`-record at the raw boundary. Use the
`0x46` keys as the input to `read_bulk_waveform_stream`, not the `0x2C` keys.
For event 1 only (start_key[2:4] = 0x0000) BW probes at counter=0x0000 directly, which is
the `0x46`-keyed start record. Subsequent events probe at the preceding `0x2C` boundary key + 0x46,
which is exactly the `0x46` key that 1F returns, so no further offset is added to that key.
**Practical iteration pattern (replaces the old 1E/1F walk for downloads):**
```
Setup: SERIAL × 2 → CHCFG → 1E (token=0x00) → key0
For each event:
0A(cur_key) → DATA_LENGTH = 0x46 (real) or 0x2C (boundary)
1F (token=0x00) → next_key
if length was 0x46: → cur_key is a real event; queue it for download
cur_key = next_key
if next_key all-zero null sentinel: stop
Then for each queued real-event key:
download_event(key) → 5A bulk stream with STRT-bounded chunk walk
```
This is what BW does in the 5-1-26 "Download All" capture — it walks the full event chain
collecting `(key, length)` tuples first, *then* downloads each event using the `0x46` keys.
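The collection pass of that pattern can be sketched with the transport abstracted away. Everything named here is hypothetical: `collect_event_keys`, the two callables standing in for the 0A and 1F exchanges, and the `max_steps` guard. The sketch also simplifies the null sentinel to an all-zero next key.

```python
from __future__ import annotations

from typing import Callable

REAL_EVENT = 0x46   # WAVEHDR length for a real event start
NULL_KEY = bytes(4)


def collect_event_keys(first_key: bytes,
                       wavehdr_length: Callable[[bytes], int],
                       advance: Callable[[bytes], bytes],
                       max_steps: int = 1024) -> list[bytes]:
    """Walk the event chain and return only the 0x46 (real event) keys."""
    real_keys = []
    cur = first_key
    for _ in range(max_steps):
        if cur == NULL_KEY:
            return real_keys
        if wavehdr_length(cur) == REAL_EVENT:
            real_keys.append(cur)
        cur = advance(cur)
    raise RuntimeError("event chain did not terminate within max_steps")
```

Replaying the 5-1-26 "Download All" walk through this function with dictionary-backed stubs yields `01110000` and `01112238`, the two keys that would then be passed to the 5A download.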
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
@@ -386,7 +643,9 @@ bytes `\x01\x2c` = 300 (5-minute default histogram interval); changes when inter
| Offset | Field | Format | Notes |
|---|---|---|---|
| anchor 7 (write) / anchor 8 (read) | recording_mode | uint8 | E5 read has extra `0x10` at anchor7 |
| anchor 9 | mode_prefix | uint8 | `0x00` for Single Shot / Continuous; `0x10` for Histogram (DLE prefix in E5 encoding) and Histogram+Continuous (actual config byte). See "compliance_raw DLE encoding" note below. |
| anchor 8 | recording_mode | uint8 | **Same offset for both read and write** — confirmed 2026-04-21. `_encode_compliance_config` writes `buf[anc-8]`. NOTE: for Histogram (0x03), E5 encodes the value as `0x10 0x03` so compliance_raw[anc-9]=0x10, compliance_raw[anc-8]=0x03. |
| anchor 7 | constant | `0x10` | Always `0x10` in both E5 read and BW write payloads (not a DLE marker — it is part of the sample_rate field area). Do NOT overwrite. |
| anchor 6 | sample_rate | uint16 BE | same in read & write |
| anchor 4 | histogram_interval_sec | uint16 BE | seconds; same in read & write ✅ 2026-04-20 |
| anchor 2 | `0x00 0x00` | padding | |
@@ -395,15 +654,42 @@ bytes `\x01\x2c` = 300 (5-minute default histogram interval); changes when inter
**recording_mode enum** (confirmed 2026-04-20 from 4-20-26 captures):
| Value | Mode |
|---|---|
| `0x00` | Single Shot |
| `0x01` | Continuous |
| `0x02` | ❓ not observed |
| `0x03` | Histogram |
| `0x04` | Histogram + Continuous |
| Value | Mode | anchor-9 in compliance_raw |
|---|---|---|
| `0x00` | Single Shot | `0x00` |
| `0x01` | Continuous | `0x00` |
| `0x02` | ❓ not observed | ❓ |
| `0x03` | Histogram | `0x10` (DLE prefix from E5 wire encoding of 0x03) |
| `0x04` | Histogram + Continuous | `0x10` (actual config byte for this mode) |
**DLE escaping in write frames — CONFIRMED 2026-04-20:** Write frame data payloads DO escape `0x03` (ETX) bytes with a `0x10` DLE prefix. For histogram_interval = 900 (0x0384), the wire carries `10 03 84` — the `0x03` high byte is preceded by a DLE escape. After DLE destuffing (`10 XX → XX`), the logical field value is correctly `03 84` = 900. The CLAUDE.md claim that write frame data is "written RAW" was incorrect; at minimum ETX (0x03) bytes are escaped. S3FrameParser handles this transparently so the decoded `compliance_raw` always contains logical (destuffed) bytes.
**compliance_raw DLE encoding — IMPORTANT (confirmed 2026-04-21 from 4-20-26 captures):**
`compliance_raw` (returned by `read_compliance_config()`) is NOT purely logical bytes — it is
the wire-encoded representation where `0x03` bytes in the config are preceded by a `0x10` DLE
prefix (because S3FrameParser preserves DLE+ETX inner-frame pairs as two literal bytes).
Consequences:
- When recording_mode = `0x03` (Histogram), `compliance_raw[anc-9] = 0x10` (DLE prefix) and
`compliance_raw[anc-8] = 0x03` (the value). The anchor position is +1 compared to modes
without `0x03` bytes before the anchor.
- For Histogram+Continuous (`0x04`), `compliance_raw[anc-9] = 0x10` for a different reason:
it is an actual stored config byte, not a DLE prefix.
- The anchor search (`buf.find(b'\xbe\x80\x00\x00\x00\x00', 0, 150)`) correctly locates
the anchor regardless of these mode-dependent shifts.
- When SFM writes recording_mode and round-trips the rest verbatim, the byte at `anc-9` is
preserved from the previous read. This means transitioning Histogram→other modes via SFM
leaves a `0x10` at `anc-9`. The device stores it as a literal byte; it does not affect
recording mode operation (which is at `anc-8`), but differs from what BW writes. This is a
known minor discrepancy that does not impact device behavior.
- **Histogram recording mode (0x03) write via SFM**: untested. When starting from a mode with
`anc-9 = 0x00`, SFM writes bare `0x03` at anc-8. BW would write `0x10 0x03`. Device likely
accepts both (write frames probably use offset/length for framing, not ETX scanning).
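As a rough illustration of the consequences listed above, the anchor-relative lookup could look like this. The function name `read_recording_mode` is hypothetical; the anchor bytes, the 150-byte search window, and the `anc-8` / `anc-9` offsets are taken from this section.

```python
from __future__ import annotations

ANCHOR = b"\xbe\x80\x00\x00\x00\x00"


def read_recording_mode(compliance_raw: bytes) -> tuple[int, int]:
    """Return (recording_mode, anchor-9 byte) from wire-encoded compliance_raw."""
    anc = compliance_raw.find(ANCHOR, 0, 150)
    if anc < 9:
        # Covers both "not found" (-1) and an anchor too close to the start.
        raise ValueError("anchor not found at a usable position")
    return compliance_raw[anc - 8], compliance_raw[anc - 9]
```

Because the offsets are relative to the located anchor, the extra DLE byte in Histogram mode shifts the anchor and the fields together, so the same lookup returns `(0x03, 0x10)` for Histogram and `(0x00, 0x00)` for Single Shot.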
**DLE escaping in write frames — confirmed 2026-04-20:** Blastware escapes `0x03` bytes in
write frame data as `0x10 0x03` on the wire (defensive ETX escaping). Our `build_bw_write_frame`
does NOT do this escaping — it sends data bytes raw. Device acceptance of bare `0x03` bytes
in write frame data is confirmed for the tested modes (Single Shot, Continuous, Histogram+Continuous
where `0x10 0x03` already appears from round-tripping). Histogram mode (bare `0x03` write from
non-Histogram starting state) has not been directly tested.
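For reference, the escaping that Blastware applies could be reproduced with a one-line helper. This is a sketch only: `escape_etx` is a hypothetical name, `build_bw_write_frame` does not currently do this, and whether other bytes (such as a literal `0x10`) also need escaping in write frames is not established here.

```python
def escape_etx(data: bytes) -> bytes:
    """Prefix every 0x03 (ETX) byte with a 0x10 (DLE), as BW does on the wire."""
    return data.replace(b"\x03", b"\x10\x03")
```

A histogram interval of 900 seconds (`0x0384`) encodes as `10 03 84` under this rule, matching the wire bytes observed in the BW write captures.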
### SUB 0C — Waveform Record (210 bytes = data[11:11+0xD2])
@@ -490,6 +776,8 @@ All DB endpoints are read-only except `PATCH /db/events/{id}/false_trigger`.
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
| 4-3-26 | `bridges/captures/4-3-26/` | Browse-mode S3 capture with 2+ events — confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |
| 4-27-26 | `bridges/captures/4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof that SFM was over-reading 5× past event end. BW reads 14 chunks at 0x0200 increments + TERM at end_offset; SFM was reading 37 chunks at 0x0400 increments. STRT end_key field located. |
| 5-1-26 | `bridges/captures/5-1-26/comcheck/` | Three sub-captures: SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945`/`_171216`). Confirmed: TERM frame formula across 3 events; metadata pages 0x1002/0x1004 are global (read once per session); event-1 vs event-N chunk-pattern split; WAVEHDR length 0x46 vs 0x2C disambiguates real events from boundaries. |
---
@@ -753,7 +1041,7 @@ offsets in the raw 1A/E5 payload. Only fields with `✅` have confirmed offsets
**Notes tab:**
- Enable User Notes (bool)
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from A5 frame 7 via 5A)
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from 5A metadata pages at counter 0x1002 / 0x1004 — see "SUB 5A — fixed metadata pages" section)
- Enable Extended Notes (bool); Extended Notes text; Extended Notes Title
- Enable Job Number (bool); Job Number (int)
- Enable Scaled Distance (bool); Distance from Blast (float); Charge Weight (float) — Scaled Distance is derived
@@ -1067,9 +1355,58 @@ body) because writing a dial string may require DLE escaping for embedded contro
- **Database** — SQLite store for events + monitor log entries; dedup by key; queryable
- **Histograms** — decode histogram-mode A5 data (noise floor tracking)
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed BYTE-PERFECT against BW reference (v0.14.3, 2026-05-05):** when fed the BW 5-1-26 3-sec capture's A5 frames, the SFM-built file matches BW's saved `M529LKIQ.G10` byte-for-byte (8708 bytes, 0 differences). Live SFM downloads of event 0 (3-sec) and event 1 (3-sec continuation) both open cleanly in Blastware with full Event Reports, frequency analysis, and waveform plots. Body assembly is just contiguous concatenation of frame contributions in stream order (probe → meta@0x1002 → meta@0x1004 → samples → TERM); no stripping, no overlay, no special handling. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that may need different handling — untested under v0.14.x). Extension mapping: extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
**Serial encoding (CONFIRMED 2026-04-22):** `prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))`, `serial3 = f"{serial_numeric % 1000:03d}"`. Examples: BE6907→H907, BE11529→M529, BE14036→P036, BE17353→S353, BE18003→T003. The prefix letter encodes the production generation (batch of 1000 units).
**Stem encoding (FULLY CONFIRMED 2026-04-22):** stem = 4-char base-36 of `floor(total_seconds / 1296)` where `total_seconds = (event_local_time - 1985-01-01T00:00:00_local)` in seconds. Epoch = `1985-01-01 00:00:00` device local time — confirmed against 3,248 files from 10-year production archive with zero errors. Decode: `event_time = datetime(1985,1,1) + timedelta(seconds=stem_int*1296 + ab_int)`. Example: P036L318.C80H → BE14036, 2025-05-26 15:00:08, Full Histogram.
- **Blastware filename extension — NEW FIRMWARE FULLY DECODED (confirmed 2026-04-21, further confirmed 2026-04-22 from 10-year production archive frequency analysis):**
Extension format = `AB0T` (4 chars):
- `AB` = 2-char base-36 encoding of `total_seconds % 1296` (seconds within the 21.6-min window, 0–1295); `A = value // 36`, `B = value % 36`
- `0` = always literal digit zero (third character, invariant)
- `T` = event type: `W` = Full Waveform, `H` = Full Histogram
Combined with the 4-char stem, the full filename encodes a complete second-resolution timestamp. Verified against three S353L4H0.{3M0W,8S0H,9X0W} events (all match to the second) plus large-scale frequency analysis of a 10-year archive.
**3-day cycle property (confirmed 2026-04-22):** A unit recording at a fixed daily time cycles through exactly **3 extensions** with a 3-day period. Each calendar day shifts `total_seconds % 1296` by 864 (since `86400 % 1296 = 864`). The cycle repeats every 3 days because `gcd(1296, 864) = 432` and `1296 / 432 = 3`. The three extension values are spaced 432 seconds apart. Confirmed from 10-year archive: the top 3 extensions overall were `CE0H` (95 files), `0E0H` (93), `OE0H` (91) — all three are the 3-day cycle of a 06:00:14 daily call-in time (seconds-in-window = 14, 446, 878; all three have `E` as second character because `14 = E` in base-36 and adding 864 never changes `value % 36` since `864 = 24 × 36`).
**B character invariance:** For a unit recording at a fixed time of day, the second character `B` of the extension (`value % 36`) **never changes** — only the first character `A` cycles through 3 values. This means same-time-of-day files from different dates all share the same `B` character.
**Old firmware (S338, 3-char extensions ending in `0`):** encoding unknown. Extension is NOT recording mode. `blastware_filename()` returns `.N00` as a placeholder for old-firmware units.
**Micromate Series 4** uses a different extension format entirely (observed: `IDFH`, `IDFW`). The `AB0T` formula applies only to MiniMate Plus / V10.72 firmware.
- Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
- **Test Histogram recording mode (0x03) write via SFM** — confirmed working for Single Shot / Continuous / Histogram+Continuous; Histogram (0x03) needs a live test from a non-Histogram starting state (bare 0x03 in write vs BW's DLE-escaped `10 03`)
- **Compliance write anchor-9 cleanup** — when changing recording_mode via SFM, the byte at anchor-9 is not explicitly managed. A spurious `0x10` may persist after Histogram→other mode transitions. Does not affect device operation but differs from BW's byte-perfect output.
- Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring)
- Call Home — map time slots 3/4 offsets; add dial_string write support; confirm `modem_power_relay_enabled`
- Modem manager — push RV50/RV55 configs via Sierra Wireless API
- RV55 DCD/DTR issue — newer RV55 firmware doesn't assert DCD by default; units don't
resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred)
resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred)
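The new-firmware filename encoding from the "Blastware-compatible file output" item above (serial prefix, base-36 stem, `AB0T` extension) can be sketched as follows. This is an illustration built only from the formulas stated there, not the project's `blastware_filename()` implementation. The names `encode_filename`, `decode_filename`, and `daily_extensions` are hypothetical, and the old-firmware and Micromate Series 4 formats are out of scope.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1985, 1, 1)   # device local time
WINDOW = 1296                  # 36 * 36 seconds per stem step
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(value: int, width: int) -> str:
    """Fixed-width, upper-case base-36 encoding."""
    chars = []
    for _ in range(width):
        value, rem = divmod(value, 36)
        chars.append(DIGITS[rem])
    if value:
        raise ValueError("value does not fit in the requested width")
    return "".join(reversed(chars))


def encode_filename(serial_numeric: int, event_time: datetime, event_type: str) -> str:
    """Build <prefix><serial3><stem>.<AB>0<T> from a local event time."""
    total = int((event_time - EPOCH).total_seconds())
    prefix = chr(ord("B") + serial_numeric // 1000)
    stem = to_base36(total // WINDOW, 4)
    ab = to_base36(total % WINDOW, 2)
    return f"{prefix}{serial_numeric % 1000:03d}{stem}.{ab}0{event_type}"


def decode_filename(name: str):
    """Return (serial_numeric, event_time, event_type) for an AB0T filename."""
    base, ext = name.split(".")
    serial = (ord(base[0]) - ord("B")) * 1000 + int(base[1:4])
    total = int(base[4:8], 36) * WINDOW + int(ext[:2], 36)
    return serial, EPOCH + timedelta(seconds=total), ext[3]


def daily_extensions(first: datetime, days: int, event_type: str) -> list:
    """Extensions produced by one event per day at a fixed time of day."""
    return [
        encode_filename(0, first + timedelta(days=d), event_type).split(".")[1]
        for d in range(days)
    ]
```

Decoding `P036L318.C80H` with this sketch gives serial 14036, 2025-05-26 15:00:08, type `H`, which agrees with the documented example. Calling `daily_extensions` for a 06:00:14 daily event yields only `0E0H`, `CE0H`, and `OE0H`, consistent with the 3-day cycle and `B` invariance notes.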
## BW capture reference
`bridges/captures/` contains the following BW TX + S3 response captures for protocol analysis:
| Folder / File | Contents |
|---|---|
| `1-2-26/` | First SUB 5A BW TX capture — established 5A frame format (raw offset_hi, DLE-aware checksum). 10 frames verified. |
| `3-11-26/raw_bw_20260311_170151.bin` | Full compliance write + event download (SUBs 68→83 confirmed, frames 102–112) |
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) — 1E/0A/0C/1F sequence confirmed (single event so token=0xFE appeared to work in either branch) |
| `4-2-26/` | Download-mode BW TX capture — POLL×3 requirement confirmed (frames 68-73 between 1F and first 5A) |
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events — all-zero params for 1F, null sentinel layout, 0A context requirement |
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
| `4-11-26 (mitm/ach_mitm_20260411_001912/)` | Full ACH call-home MITM — erase protocol (0xA3/0x06/0xA2), monitor log partial records confirmed |
| `4-20-26/raw_bw_*_recording_mode_*.bin` | Recording mode changes: Continuous→Single Shot, →Histogram, →Histogram+Continuous |
| `4-20-26/histogram interval/` | Histogram interval changes: 1min, 5min, 15min, 15sec |
| `4-20-26/geo sensitivity/` | Geo sensitivity changes: 1.25 in/s (Sensitive), 10 in/s (Normal) |
| `4-20-26/call home settings/` | Call home config read/write captures |
| `4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof of 5× SFM over-read. STRT end_key field located. |
| **`5-1-26/comcheck/`** | **Triplet of captures that nailed the v0.14.0 walk:** SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945` / `_171216`). Confirmed: TERM frame formula across 3 events, metadata pages 0x1002/0x1004 are global session metadata, event-1 vs event-N chunk pattern split, WAVEHDR off=0x46 vs 0x2C disambiguates real events from boundaries. |
| **`5-1-26/comcheck/bwcap3sec/`** | **The byte-perfect reference for v0.14.3.** All 17 BW 5A request frames (probe, 2 metadata, 13 samples, TERM) reproduce byte-for-byte from SFM's framing helpers — including the `10 10 00` DLE-stuffed counter for sample @ 0x1000 that was the long-standing failure mode. |
| `5-4-26/` | BW MITM captures of "copy 3sec / 2sec / Download All" + paired SFM session (`seismo_dl_20260504_145701`) showing the +0x46 event-N probe bug producing 110-chunk runaway walk. Cross-references against 5-1-26 confirmed device behavior is identical. |
To parse BW TX captures: use `bridges/captures/` scripts or adapt the `find_write_frames()` pattern
in `/tmp/analyze_write_payload.py` — it correctly handles `0x10 0x03` DLE-escaped ETX bytes
inside write frame data (the naive parser terminates early at the escaped `0x03`).
+135 -38
@@ -1,4 +1,4 @@
# seismo-relay `v0.12.1`
# seismo-relay `v0.15.0`
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
software for managing MiniMate Plus seismographs.
@@ -10,7 +10,15 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
> handles inbound unit connections, downloads events, and persists everything
> to a SQLite database. SFM REST API exposes device control and DB queries.
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
> **As of v0.14.3 (2026-05-05): SUB 5A bulk waveform protocol is verified
> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
> full Event Reports, frequency analysis, and waveform plots.
> **v0.15.0 (2026-05-07)** adds layered per-event storage (BW binary +
> raw 5A pickle + HDF5 + `.sfm.json` sidecar), a plot-ready
> `sfm.plot.v1` JSON shape with server-side ADC-to-physical-units
> conversion, and a BW-file importer for ingesting externally-produced
> events. See [CHANGELOG.md](CHANGELOG.md) for full version history.
---
@@ -18,26 +26,27 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
```
seismo-relay/
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Console tabs)
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Download + Console tabs)
├── minimateplus/ ← MiniMate Plus client library
│ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport
│ ├── protocol.py ← DLE frame layer, SUB command dispatch
│ ├── client.py ← High-level client (connect, get_events, push_config, …)
│ ├── client.py ← High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …)
│ ├── framing.py ← Frame builders, DLE codec, S3FrameParser
│ ├── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, …
│ ├── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig,
│ └── blastware_file.py ← Write events to Blastware-compatible .AB0 files
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
│ ├── server.py ← All device + DB endpoints
│ ├── database.py ← SeismoDb — SQLite persistence layer
│ └── sfm_webapp.html ← Embedded web UI (served at /)
│ ├── server.py ← Live device endpoints + DB query endpoints + caching
│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions, sessions table)
│ └── sfm_webapp.html ← Embedded web UI with Call Home config tab
├── bridges/
│ ├── ach_server.py ← Inbound ACH call-home server (main production server)
│ ├── ach_mitm.py ← Transparent MITM proxy for capturing BW sessions
│ ├── s3-bridge/ ← RS-232 serial bridge (capture tool)
│ ├── tcp_serial_bridge.py ← Local TCP↔serial bridge (bench testing)
│ ├── gui_bridge.py ← Standalone bridge GUI
│ ├── gui_bridge.py ← Standalone bridge GUI with raw capture checkboxes
│ └── raw_capture.py ← Simple raw capture tool
├── parsers/
@@ -101,21 +110,28 @@ python seismo_lab.py
Each call dials the device, does its work, and closes the connection. TCP
connections are retried once on `ProtocolError` to handle cold-boot timing.
**Caching** — frequently-polled endpoints are cached in-process to avoid
redundant TCP round-trips:
**In-memory caching** — frequently-polled endpoints avoid redundant TCP round-trips
via a thread-safe `_LiveCache` (plain Python dict + `threading.Lock`):
| Method | URL | Cache |
|--------|-----|-------|
| Method | URL | Cache Strategy |
|--------|-----|---|
| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
| `GET` | `/device/monitor/status` | 30-second TTL |
| `GET` | `/device/monitor/status` | 30-second TTL; invalidated by monitor start/stop |
| `GET` | `/device/call_home` | Fresh read from device (not cached) |
| `POST` | `/device/connect` | — |
| `POST` | `/device/config` | Writes compliance config; invalidates cache |
| `POST` | `/device/monitor/start` | Sends SUB 0x96 |
| `POST` | `/device/monitor/stop` | Sends SUB 0x97 |
| `POST` | `/device/config` | Writes compliance config; invalidates info + events cache |
| `POST` | `/device/config/project` | Patches project/client/operator/sensor_location strings |
| `POST` | `/device/monitor/start` | Sends SUB 0x96; immediately evicts status cache |
| `POST` | `/device/monitor/stop` | Sends SUB 0x97; immediately evicts status cache |
| `POST` | `/device/call_home` | Reads, patches specified fields, writes back to device |
All cached endpoints accept `?force=true` to bypass the cache.
**Cache bypass** – All cached endpoints accept `?force=true` to skip the cache and
force a fresh read from the device.
**Cache stats** – `GET /cache/stats` returns hit/miss counts and TTL info; `DELETE /cache/device`
clears the device cache immediately.
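The caching strategies in the table above (indefinite entries, a 30-second TTL, explicit invalidation, and `?force=true` bypass) can be sketched with a dict guarded by a `threading.Lock`. This is a hypothetical stand-in, not the actual `_LiveCache` code. The class name `TtlCache`, its method names, and the injectable `clock` parameter are assumptions made for illustration.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Hypothetical dict + Lock cache with optional per-entry TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        # key -> (expiry time or None for "never expires", cached value)
        self._entries: Dict[str, Tuple[Optional[float], Any]] = {}
        self._clock = clock
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[], Any],
                    ttl: Optional[float] = None, force: bool = False) -> Any:
        """Return the cached value, or call loader() on miss, expiry, or force."""
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and not force:
                expires, value = entry
                if expires is None or now < expires:
                    self.hits += 1
                    return value
            self.misses += 1
        # The slow device read runs outside the lock so other keys stay usable.
        value = loader()
        with self._lock:
            self._entries[key] = (None if ttl is None else now + ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Evict one entry, e.g. the status entry after monitor start/stop."""
        with self._lock:
            self._entries.pop(key, None)
```

Under these assumptions, a status handler would call `get_or_load("status", read_status, ttl=30, force=force)`, and the monitor start/stop handlers would call `invalidate("status")`. Running the loader outside the lock means two concurrent misses on the same key can both hit the device. Whether the real cache accepts that trade-off is not stated in the source.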
Transport query params (supply one set):
```
@@ -158,42 +174,61 @@ with client:
events = client.get_events() # Full download: headers + peaks + metadata
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
ach_cfg = client.get_call_home_config() # Auto Call Home settings (SUB 0x2C)
# Write
client.apply_config(
sample_rate=1024,
recording_mode="Continuous", # Single Shot / Continuous / Histogram / Histogram+Continuous
histogram_interval_sec=15, # 2, 5, 15, 60, 300, 900
trigger_level_geo=0.5,
geo_range="Normal", # Normal (10.000 in/s) / Sensitive (1.25 in/s)
project="Bridge Inspection 2026",
client_name="City of Portland",
operator="B. Harrison",
)
client.set_call_home_config(
auto_call_home_enabled=True,
after_event_recorded=True,
at_specified_times=True,
time1_hour=18, time1_min=30, # 6:30 PM
time2_hour=6, time2_min=0, # 6:00 AM
)
# Control
client.start_monitoring() # SUB 0x96
client.stop_monitoring() # SUB 0x97
client.delete_all_events() # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
```
`get_events()` runs the full per-event sequence: `1E → 0A → 0C → 5A → 1F`.
SUB 5A bulk stream provides `client`, `operator`, and `sensor_location` as they
existed at record time — not backfilled from the current compliance config.
`get_events()` runs the full per-event sequence:
`1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`.
SUB 5A bulk stream walks chunks bounded by the `end_offset` extracted from
the STRT record at byte 17 of the probe response — no over-reading, no
chunk-count cap. Project / client / operator / sensor location strings come
from the dedicated metadata pages at counter `0x1002` and `0x1004`,
read once per session (they reflect the compliance setup at session start,
not per individual event).
---
## Database
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode).
Three tables, all unit-keyed by serial number:
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode) using the
`SeismoDb` persistence layer. Four tables, all unit-keyed by serial number:
| Table | Key | Contents |
|-------|-----|----------|
| `ach_sessions` | UUID | Per-call-home audit record: serial, peer IP, events_downloaded, duration |
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, PPV per channel, project/client/operator strings, false_trigger flag |
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: start/stop time, duration, geo threshold |
| `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds |
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag |
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips |
| `events.false_trigger` | Boolean flag | PATCH endpoint to mark/unmark false triggers for review |
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs
never produce duplicate rows. Post-erase key reuse is handled automatically
via the high-water mark in `ach_state.json`.
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs never
produce duplicate rows. Post-erase key reuse is handled automatically via the
high-water mark in `ach_state.json`. Key-based state tracking allows correct
handling of device erasures (external or post-download).
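Deduplication by `(serial, waveform_key)` can be illustrated with a SQLite `UNIQUE` constraint and `INSERT OR IGNORE`. This is a minimal sketch with a deliberately reduced column set. The real `SeismoDb` schema and its insert statements are not shown in this document, so the table definition and the `insert_event` helper below are assumptions.

```python
import sqlite3
import uuid

# Reduced, hypothetical schema: only the columns needed to show the dedup key.
SCHEMA = """
CREATE TABLE events (
    id           TEXT PRIMARY KEY,
    serial       TEXT NOT NULL,
    waveform_key TEXT NOT NULL,
    timestamp    TEXT,
    UNIQUE (serial, waveform_key)
)
"""


def insert_event(conn: sqlite3.Connection, serial: str,
                 waveform_key: str, timestamp: str) -> bool:
    """Insert one event row; return False if (serial, waveform_key) already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO events (id, serial, waveform_key, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), serial, waveform_key, timestamp),
    )
    conn.commit()
    # rowcount is 0 when the UNIQUE constraint caused the insert to be ignored.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = insert_event(conn, "BE11529", "01110000", "2026-04-11T00:42:17")
repeat = insert_event(conn, "BE11529", "01110000", "2026-04-11T00:42:17")
other_unit = insert_event(conn, "BE14036", "01110000", "2026-04-11T00:50:00")
```

A repeat call-home that re-sends the same key is silently ignored, while the same key from a different serial is stored. This sketch does not cover post-erase key reuse on the same unit, which the text above attributes to the state kept in `ach_state.json`.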
---
@@ -231,6 +266,27 @@ Full protocol documentation: [`docs/instantel_protocol_reference.md`](docs/insta
---
## Compliance Config Features
The REST API and web UI expose full control over device compliance settings:
- **Recording Mode** (Single Shot / Continuous / Histogram / Histogram+Continuous)
- **Sample Rate** (1024 / 2048 / 4096 sps)
- **Record Time** (float, seconds)
- **Histogram Interval** (2s, 5s, 15s, 1m, 5m, 15m) — when recording mode includes histogram
- **Geo Trigger Levels** (float, in/s per channel)
- **Geo Maximum Range** (Normal 10.000 in/s / Sensitive 1.250 in/s per channel)
- **Project / Client / Operator / Sensor Location** (ASCII strings)
Auto Call Home config:
- **Auto Call Home Enable** (bool)
- **Dial String** (read-only; 40-byte ASCII)
- **Trigger on Event** (bool)
- **Scheduled Call-Ins** (two time slots with HH:MM each)
- **Retry Settings** (count, delay, connection timeout, warm-up time)
---
## Requirements
```bash
@@ -252,17 +308,58 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
---
## Roadmap
## Key Features
- [x] Full read pipeline — device info, compliance config, event download with true event-time metadata
- [x] Write commands — push compliance config, trigger thresholds, project strings to device
- [x] Erase all events — confirmed erase sequence from live MITM capture
- [x] Monitor control — start/stop monitoring, read battery/memory/status
- [x] Monitor log entries — decode partial 0x2C records (continuous monitoring intervals)
- [x] ACH inbound server — accept call-home connections, download events, dedup by key
- [x] SQLite persistence — events, monitor log, and session history in `seismo_relay.db`
- [x] SFM REST API — device control + DB query endpoints, live device cache
**Device support:**
- [x] Full read/write/erase pipelines
- [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings)
- [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries)
- [x] Monitor control (start/stop, status polling, battery/memory)
- [x] Monitor log entries (continuous monitoring intervals without full waveform download)
**Data persistence:**
- [x] SQLite database (`seismo_relay.db`) with 4 tables: ach_sessions, events, monitor_log, plus false_trigger flag
- [x] Deduplication by waveform key (handles re-runs and repeat call-homes)
- [x] Post-erase key-reuse detection (tracks high-water mark)
- [x] Session state (`ach_state.json`) with downloaded keys and max key
**REST API:**
- [x] Live device endpoints with in-memory caching (`_LiveCache`)
- [x] Cache statistics (`/cache/stats`) and manual invalidation (`/cache/device`)
- [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH)
- [x] Call Home config read/write endpoints
- [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`)
**File output (v0.7+, byte-perfect as of v0.14.3):**
- [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata)
- [x] Multi-channel waveform decode from SUB 5A bulk stream
- [x] Second-resolution timestamp encoding in Blastware filename
- [x] **Byte-perfect against BW reference captures** (verified across 2-sec / 3-sec / 10-sec event durations, both event 0 and event N continuation events)
- [x] STRT-bounded chunk walk + correct event-N probe counter + partial DLE stuffing of `0x10` in 5A params (the four fixes that landed in v0.14.0–v0.14.3)
**Capture tools:**
- [x] Serial-to-TCP bridge with raw BW/S3 capture (s3_bridge.py, defaults to auto-capture)
- [x] GUI bridge with raw capture checkboxes (gui_bridge.py)
- [x] ACH inbound server with bidirectional capture (ach_server.py saves raw_tx + raw_rx)
- [x] Transparent TCP MITM proxy for live BW session capture (ach_mitm.py)
**Analysis tools:**
- [x] s3_analyzer.py — session parser, frame differ, Claude export
- [x] gui_analyzer.py — standalone analyzer GUI
- [x] frame_db.py — SQLite frame database for capture analysis
**seismo_lab.py GUI:**
- [x] Bridge tab — Serial/TCP mode selector with raw capture options
- [x] Analyzer tab — BW/S3 capture playback and differencing
- [x] Download tab — Live wire-byte capture during event download
- [x] Console tab — Logging and diagnostics
## Roadmap (Future)
- [ ] Verify 30-sec event download — body may exceed `0xFFFF` and force the device into a different `end_key` encoding (none of 2/3/10-sec test cases hit this boundary)
- [ ] Terra-view integration — seismo-relay router, unit detail page, VISON-style event listing
- [ ] Vibration summary reports — highest legit PPV per project → Word doc (false trigger filtering first)
- [ ] Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
- [ ] Modem manager — push RV50/RV55 configs via Sierra Wireless API
- [ ] Histogram mode recording support (5A stream analysis for mode 0x03)
- [ ] Call Home dial_string write support (requires DLE escaping for embedded control characters)
+249 -124
@@ -35,6 +35,7 @@ Output per session
device_info.json — serial number, firmware version, calibration date, etc.
events.json — all events: timestamp, PPV per channel, peaks, metadata
raw_rx_<ts>.bin — raw bytes from the device (S3 side) for Analyzer
raw_tx_<ts>.bin — raw bytes we sent to the device (BW side) for Analyzer
session_<ts>.log — detailed protocol log
What to look for
@@ -69,43 +70,78 @@ from minimateplus.transport import SocketTransport
from minimateplus.client import MiniMateClient
from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
from sfm.database import SeismoDb
from sfm.waveform_store import WaveformStore
log = logging.getLogger("ach_server")
# ── Per-unit state (downloaded-key set) ───────────────────────────────────────
# ── Per-unit state (downloaded events index) ──────────────────────────────────
# Persisted as <output_dir>/ach_state.json
# Format:
# Format (current — v2):
# {
# "BE11529": {
# "downloaded_keys": ["01110000", "0111245a"], # hex keys already on disk
# "max_downloaded_key": "0111245a", # highest key ever seen
# "last_seen": "2026-04-11T01:04:36"
# "downloaded_events": { # key_hex → ISO timestamp string
# "01110000": "2026-04-11T00:42:17",
# "0111245a": "2026-04-11T01:04:30"
# },
# "max_downloaded_key": "0111245a",
# "last_seen": "2026-04-11T01:04:36",
# "serial": "BE11529",
# "peer": "63.43.212.232:51920"
# }
# }
#
# Key-based deduplication works well within a single "key generation" (between
# erases). After the device memory is erased the event counter resets to
# 0x01110000, so the first new event has the SAME key as the very first event
# we ever downloaded. We detect this situation with max_downloaded_key:
# Why (key, timestamp) and not key alone:
# The device's event-key counter resets to 0x01110000 after every memory
# erase (internal or external). A bare-key dedup (the v1 format) cannot
# distinguish a re-recorded event with the same key from one we already
# downloaded. The 0C waveform record's timestamp IS unique per physical
# event, so we pair (key, timestamp) and treat a key with a different
# timestamp as a new event regardless of `max_downloaded_key`.
#
# if max(current_device_keys) < max_downloaded_key
# → device was wiped and keys have restarted → treat all device keys as new
#
# After our own erase (--clear-after-download) we also explicitly clear
# downloaded_keys and max_downloaded_key so the next session starts fresh.
# Legacy v1 format (`downloaded_keys: list[str]` only) is auto-migrated on
# read: the keys are kept under a sentinel of "" (empty string) timestamp so
# the (key, timestamp) compare always sees a mismatch and forces a one-time
# re-download. After that pass the state is rewritten in v2 form.
_state_lock = threading.Lock()
def _load_state(state_path: Path) -> dict:
if state_path.exists():
"""
Load ach_state.json, transparently migrating any legacy
`downloaded_keys: list` entries into the v2 `downloaded_events: dict`
schema. Returns the migrated state.
"""
if not state_path.exists():
return {}
try:
with open(state_path) as f:
return json.load(f)
state = json.load(f)
except Exception:
pass
return {}
# Per-unit migration: legacy list → dict-with-empty-timestamps
for unit_key, unit_state in list(state.items()):
if not isinstance(unit_state, dict):
continue
if "downloaded_events" in unit_state:
continue
legacy_keys = unit_state.get("downloaded_keys")
if isinstance(legacy_keys, list):
unit_state["downloaded_events"] = {k: "" for k in legacy_keys}
log.info(
"ach_state: migrated %s from v1 (downloaded_keys list) → v2 "
"(downloaded_events dict, %d keys with empty timestamps; "
"they will re-validate on next session)",
unit_key, len(legacy_keys),
)
else:
unit_state["downloaded_events"] = {}
# the legacy field has been migrated above; drop it so the next save writes pure v2
unit_state.pop("downloaded_keys", None)
return state
def _save_state(state_path: Path, state: dict) -> None:
with _state_lock:
@@ -138,8 +174,10 @@ class AchSession:
max_events: Optional[int],
state_path: Path,
db: "SeismoDb",
store: "WaveformStore",
clear_after_download: bool = False,
restart_monitoring: bool = False,
force_redownload: bool = False,
) -> None:
self.sock = sock
self.peer = peer
@@ -149,8 +187,14 @@ class AchSession:
self.max_events = max_events
self.state_path = state_path
self.db = db
self.store = store
self.clear_after_download = clear_after_download
self.restart_monitoring = restart_monitoring
# `force_redownload` tells this session to ignore ach_state and
# re-download every event currently on the device, regardless of any
# (key, timestamp) match. Useful as a manual override when state has
# become inconsistent with what's actually on disk / in the DB.
self.force_redownload = force_redownload
def run(self) -> None:
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
@@ -172,16 +216,24 @@ class AchSession:
transport = SocketTransport(self.sock, peer=self.peer)
# Collect raw bytes in memory until startup succeeds, then flush to disk.
raw_buf: list[bytes] = []
raw_rx_buf: list[bytes] = [] # device → us (S3 side)
raw_tx_buf: list[bytes] = [] # us → device (BW side)
_orig_read = transport.read
_orig_write = transport.write
def tapped_read(n: int) -> bytes:
data = _orig_read(n)
if data:
raw_buf.append(data)
raw_rx_buf.append(data)
return data
def tapped_write(data: bytes) -> None:
_orig_write(data)
if data:
raw_tx_buf.append(data)
transport.read = tapped_read # type: ignore[method-assign]
transport.write = tapped_write # type: ignore[method-assign]
serial: Optional[str] = None
@@ -202,22 +254,34 @@ class AchSession:
session_dir = self.output_dir / f"ach_inbound_{ts}"
session_dir.mkdir(parents=True, exist_ok=True)
log_path = session_dir / f"session_{ts}.log"
raw_path = session_dir / f"raw_rx_{ts}.bin"
raw_rx_path = session_dir / f"raw_rx_{ts}.bin" # device → us (S3 side)
raw_tx_path = session_dir / f"raw_tx_{ts}.bin" # us → device (BW side)
# Flush buffered raw bytes to file and switch to direct file writes.
raw_fh = open(raw_path, "wb")
for chunk in raw_buf:
raw_fh.write(chunk)
raw_buf.clear()
# Flush buffered bytes to files and switch to direct file writes.
raw_rx_fh = open(raw_rx_path, "wb")
raw_tx_fh = open(raw_tx_path, "wb")
for chunk in raw_rx_buf:
raw_rx_fh.write(chunk)
for chunk in raw_tx_buf:
raw_tx_fh.write(chunk)
raw_rx_buf.clear()
raw_tx_buf.clear()
def tapped_read_file(n: int) -> bytes:
data = _orig_read(n)
if data:
raw_fh.write(data)
raw_fh.flush()
raw_rx_fh.write(data)
raw_rx_fh.flush()
return data
def tapped_write_file(data: bytes) -> None:
_orig_write(data)
if data:
raw_tx_fh.write(data)
raw_tx_fh.flush()
transport.read = tapped_read_file # type: ignore[method-assign]
transport.write = tapped_write_file # type: ignore[method-assign]
# Wire up file handler now that the session dir exists.
fh = logging.FileHandler(log_path, encoding="utf-8")
@@ -252,11 +316,20 @@ class AchSession:
state = _load_state(self.state_path)
unit_key = serial or self.peer # fall back to IP if no serial
unit_state = state.get(unit_key, {})
seen_keys: set[str] = set(unit_state.get("downloaded_keys", []))
# Highest event key ever downloaded from this unit (hex string, 8 chars).
# Used to detect post-erase key reuse — see comment block above.
# downloaded_events is the v2 (key_hex → timestamp_iso) dict.
# Empty-string timestamps are migrated v1 entries — they force a
# one-time re-download because the (key, timestamp) compare always
# mismatches against any non-empty timestamp from a fresh 0C read.
seen_events: dict[str, str] = dict(unit_state.get("downloaded_events", {}))
max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
if self.force_redownload:
log.info(" --force-redownload-all set — ignoring %d cached "
"(key, timestamp) entries for this session",
len(seen_events))
seen_events = {}
# Walk the event index (browse-mode, no 5A) to get the actual current
# key list. The SUB 08 event_count field is a lifetime "total events
# ever recorded" counter that does NOT decrement on erase — confirmed
@@ -269,11 +342,10 @@ class AchSession:
log.warning(" list_event_keys failed: %s -- falling back to full download", exc)
device_keys = None
# Use the walk result as our authoritative current count.
current_count = len(device_keys) if device_keys is not None else 0
log.info(" Unit has %d stored event(s); %d key(s) previously downloaded",
current_count, len(seen_keys))
log.info(" Unit has %d stored event(s); %d (key, ts) entr(ies) previously downloaded",
current_count, len(seen_events))
if device_keys is not None and current_count == 0:
log.info(" [OK] No events on device -- nothing to download")
@@ -281,75 +353,29 @@ class AchSession:
return
if device_keys is not None:
# ── Post-erase detection ──────────────────────────────────────
# After the device memory is erased, new events start from key
# 01110000 again — the same keys we already downloaded. Detect
# this by comparing the device's current highest key against the
# historical maximum. If the device has rolled back below our
# high-water mark, its counter was reset and we must treat all
# its keys as new, regardless of what seen_keys contains.
# ── Post-erase detection (best-effort, key-only signal) ───────
# After erase the device's key counter resets to 01110000.
# If the device's current max key is below our high-water mark
# we know erase happened. This catches the cleanest case but
# does NOT catch erase-then-record-many-events (where the new
# max may climb past the old max). The (key, timestamp) check
# in get_events() is what handles those.
if device_keys and max_seen_key != "00000000":
max_device_key = max(device_keys) # lexicographic; safe because
# keys share the same 4-char prefix
max_device_key = max(device_keys)
if max_device_key < max_seen_key:
log.info(
" Post-erase reset detected: "
"device max key %s < historical max %s "
"-- treating all device keys as new",
"-- discarding stale (key, ts) state for this session",
max_device_key, max_seen_key,
)
seen_keys = set() # discard stale dedup info for this session
seen_events = {}
new_key_set = set(device_keys) - seen_keys
log.info(" Device has %d key(s): %d new, %d already seen",
len(device_keys), len(new_key_set), len(device_keys) - len(new_key_set))
if not new_key_set:
log.info(" [OK] All events already downloaded -- nothing to do")
# Refresh state timestamp; preserve max_seen_key unchanged.
state[unit_key] = {
"downloaded_keys": sorted(seen_keys | set(device_keys)),
"max_downloaded_key": max_seen_key,
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
"peer": self.peer,
}
_save_state(self.state_path, state)
# ── Erase even when no new events (if requested) ──────────
# Blastware ACH always erases after every session — even when
# nothing new was downloaded. Without the erase the device
# still sees stored events in its memory and immediately
# retries the call-home, causing the looping we observed.
# Only erase when device actually has events stored; skip
# the erase if device_keys is empty (nothing to erase).
if self.clear_after_download and device_keys:
log.info(
" Clearing device memory (--clear-after-download, "
"no new events but device has %d stored)...",
len(device_keys),
)
try:
client.delete_all_events()
log.info(" [OK] Device memory cleared")
# Reset state so the next session starts fresh.
state[unit_key] = {
"downloaded_keys": [],
"max_downloaded_key": "00000000",
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
"peer": self.peer,
}
_save_state(self.state_path, state)
except Exception as exc:
log.error(
" [WARN] Event deletion failed: %s -- events NOT cleared",
exc,
)
log.info("Session complete (no new events) -> %s", session_dir)
return
else:
new_key_set = None # unknown; proceed with full download
# Note: no early-exit "all already downloaded" short-circuit
# here. Without per-event timestamps we cannot tell whether
# device_keys ⊆ seen_events.keys() actually means we have
# those physical events. get_events() will read 0C on its
# skip path and decide per event.
# Apply max_events cap
# stop_idx: when we know the count from list_event_keys, use it as
@@ -367,27 +393,67 @@ class AchSession:
)
try:
# Pass `seen_events` (key → ISO timestamp) so the client can
# read 0C on its skip path and only skip 5A when the per-event
# timestamp matches what we already have on disk. When force_-
# redownload is set, seen_events was already cleared above.
#
# Filter out empty-string timestamps (legacy v1 entries) — the
# client's 0C-on-skip-path only trusts entries with a
# populated timestamp; otherwise it falls through to a full
# 5A download.
skip_dict = {k: ts for k, ts in seen_events.items() if ts}
all_events = client.get_events(
full_waveform=True,
stop_after_index=stop_idx,
skip_waveform_for_keys=seen_keys if seen_keys else None,
skip_waveform_for_events=skip_dict if skip_dict else None,
)
# Filter to events whose keys we haven't saved before.
# New events are those that came back with _a5_frames populated
# (= 5A actually ran on this session). Skipped events have
# _a5_frames = None because the client matched (key, timestamp)
# against skip_dict and bypassed 5A.
new_events = [
e for e in all_events
if e._waveform_key is None
or e._waveform_key.hex() not in seen_keys
if getattr(e, "_a5_frames", None)
]
skipped = len(all_events) - len(new_events)
log.info(" [OK] Downloaded %d event(s): %d new, %d skipped (already seen)",
log.info(" [OK] Walked %d event(s): %d downloaded, %d skipped (matched (key, ts) in state)",
len(all_events), len(new_events), skipped)
if skipped:
log.info(" (skipped %d already-downloaded event(s))", skipped)
# ── Persist event file + A5 sidecar to the waveform store ──
# Saves ride alongside the existing JSON dump so the on-disk
# event file and events.json reference the same set of events.
waveform_records: dict[str, dict] = {}
for ev in new_events:
if not ev._a5_frames:
continue
try:
rec = self.store.save(
ev,
serial=serial or "UNKNOWN",
a5_frames=ev._a5_frames,
)
if ev._waveform_key is not None:
waveform_records[ev._waveform_key.hex()] = rec
log.info(
" [WAVE] saved %s (%d bytes)",
rec["filename"], rec["filesize"],
)
except Exception as exc:
key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
log.warning(
" [WARN] Waveform store save failed for %s: %s",
key_hex, exc,
)
if new_events:
_save_json(session_dir / "events.json", [_event_to_dict(e) for e in new_events])
_save_json(
session_dir / "events.json",
[_event_to_dict(e, waveform_records) for e in new_events],
)
for ev in new_events:
pv = ev.peak_values
@@ -446,7 +512,10 @@ class AchSession:
_session_start = datetime.datetime.now()
try:
_ev_ins, _ev_skip = self.db.insert_events(
new_events, serial=serial or self.peer, session_id=None
new_events,
serial=serial or self.peer,
session_id=None,
waveform_records=waveform_records,
)
_ml_ins, _ml_skip = self.db.insert_monitor_log(
new_monitor_entries, session_id=None
@@ -481,35 +550,64 @@ class AchSession:
)
# ── Update persistent state ───────────────────────────────────
# Include both triggered-event keys and monitor-log keys in the
# downloaded set so they are not re-processed on the next call-home.
current_event_keys = [
e._waveform_key.hex()
for e in all_events
if e._waveform_key is not None
]
current_monitor_keys = [e.key for e in new_monitor_entries]
current_keys = current_event_keys + current_monitor_keys
# Build a fresh (key → ISO timestamp) map from THIS session's
# results. For each event currently on the device, prefer the
# timestamp we just observed (from 0C); fall back to whatever
# was already in seen_events for that key (so we don't lose an
# entry just because get_events skipped it on the (key, ts)
# match path).
def _ts_iso(ev) -> str:
ts = getattr(ev, "timestamp", None)
if ts is None:
return ""
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
current_events_map: dict[str, str] = {}
for ev in all_events:
if ev._waveform_key is None:
continue
key_hex = ev._waveform_key.hex()
ts_iso = _ts_iso(ev) or seen_events.get(key_hex, "")
current_events_map[key_hex] = ts_iso
# Monitor-log entries don't have a 0C-style timestamp, but
# they DO have a start_time; use that so the monitor-log keys
# are properly entered into the (key, ts) map.
for ml in new_monitor_entries:
key_hex = ml.key
ts = ml.start_time
ts_iso = ts.isoformat() if ts else seen_events.get(key_hex, "")
# If a triggered event already populated this key, keep
# whichever has a non-empty timestamp.
if key_hex not in current_events_map or not current_events_map[key_hex]:
current_events_map[key_hex] = ts_iso
if erased_successfully:
# Device memory is clear. Reset downloaded_keys and the
# high-water mark so the next call-home starts fresh and
# doesn't mis-identify the recycled key 01110000 as "seen".
updated_keys = []
updated_events: dict[str, str] = {}
new_max_key = "00000000"
log.info(
" State reset after erase -- next session will download "
"from key 0 (device counter resets after erase)"
)
else:
# Normal (no erase): union of previously-seen + all keys on
# device now. Includes already-seen survivors so we never
# re-download them if the device somehow keeps old records.
updated_keys = sorted(set(seen_keys) | set(current_keys))
new_max_key = updated_keys[-1] if updated_keys else max_seen_key
# Merge: keep prior (key, ts) entries we still have evidence
# of (for survivors of any partial failure), plus this
# session's authoritative (key, ts) pairs.
updated_events = dict(seen_events)
updated_events.update(current_events_map)
new_max_key = (
max(updated_events.keys())
if updated_events else max_seen_key
)
state[unit_key] = {
"downloaded_keys": updated_keys,
"downloaded_events": updated_events,
"max_downloaded_key": new_max_key,
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
@@ -530,7 +628,8 @@ class AchSession:
log.warning(" [WARN] Failed to restart monitoring: %s", exc)
finally:
raw_fh.close()
raw_rx_fh.close()
raw_tx_fh.close()
client.close() # closes transport / socket cleanly
root_logger.removeHandler(fh)
fh.close()
@@ -570,7 +669,10 @@ def _device_info_to_dict(d: DeviceInfo) -> dict:
}
def _event_to_dict(e: Event) -> dict:
def _event_to_dict(
e: Event,
waveform_records: Optional[dict[str, dict]] = None,
) -> dict:
pv = e.peak_values
pi = e.project_info
peaks = {}
@@ -589,6 +691,11 @@ def _event_to_dict(e: Event) -> dict:
for ch, vals in e.raw_samples.items()
}
samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
rec: dict = {}
if waveform_records and e._waveform_key is not None:
rec = waveform_records.get(e._waveform_key.hex(), {}) or {}
return {
"timestamp": str(e.timestamp) if e.timestamp else None,
"project": pi.project if pi else None,
@@ -597,6 +704,9 @@ def _event_to_dict(e: Event) -> dict:
"sensor_location": pi.sensor_location if pi else None,
"peaks": peaks,
"raw_samples_preview": samples,
"blastware_filename": rec.get("filename"),
"blastware_filesize": rec.get("filesize"),
"a5_pickle_filename": rec.get("a5_pickle_filename"),
}
@@ -618,6 +728,7 @@ def serve(args: argparse.Namespace) -> None:
output_dir.mkdir(parents=True, exist_ok=True)
state_path = output_dir / "ach_state.json"
db = SeismoDb(output_dir / "seismo_relay.db")
store = WaveformStore(output_dir / "waveforms")
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
@@ -635,6 +746,7 @@ def serve(args: argparse.Namespace) -> None:
print(f" Max events per session: {max_ev if max_ev else 'unlimited'}")
print(f" Clear device after download: {'YES' if args.clear_after_download else 'no'}")
print(f" Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
print(f" Force re-download all (ignore state): {'YES' if args.force_redownload_all else 'no'}")
print(f"{'='*60}")
print(f"\n Point your test unit's ACEmanager call-home settings to:")
print(f" Remote Host: <this machine's LAN IP>")
@@ -672,8 +784,10 @@ def serve(args: argparse.Namespace) -> None:
max_events=max_ev,
state_path=state_path,
db=db,
store=store,
clear_after_download=args.clear_after_download,
restart_monitoring=args.restart_monitoring,
force_redownload=args.force_redownload_all,
)
t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
t.start()
@@ -758,6 +872,17 @@ def parse_args() -> argparse.Namespace:
"This mirrors the standard Blastware ACH workflow."
),
)
p.add_argument(
"--force-redownload-all",
action="store_true",
default=False,
help=(
"Manual override: ignore ach_state.json's downloaded_events map "
"for this session and re-download every event currently on the "
"device, regardless of (key, timestamp) match. Useful when state "
"has become inconsistent with the on-disk waveform store / DB."
),
)
p.add_argument(
"--verbose", "-v",
action="store_true",
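Note on the dedup change above: the (key, timestamp) rule can be condensed into a small standalone sketch. This is not the code in `AchSession` or in the client's `get_events()`. `plan_downloads` and its parameters are hypothetical names, and the per-event 0C timestamps are modelled as plain ISO strings keyed by the 8-character lowercase hex key.

```python
def plan_downloads(
    device_events: dict[str, str],
    seen_events: dict[str, str],
    max_seen_key: str = "00000000",
    force_redownload: bool = False,
) -> tuple[list[str], list[str]]:
    """Return (to_download, to_skip) key lists for one session.

    device_events: key_hex -> ISO timestamp currently on the device (from 0C).
    seen_events:   key_hex -> ISO timestamp persisted by earlier sessions.
    """
    if force_redownload:
        # Manual override: ignore all cached state for this session.
        seen_events = {}

    # Best-effort post-erase check: the device's highest key fell below the
    # historical high-water mark, so the key counter must have been reset.
    if device_events and max_seen_key != "00000000":
        if max(device_events) < max_seen_key:
            seen_events = {}

    # Legacy v1 entries carry an empty timestamp and never qualify for a skip.
    skip = {k: ts for k, ts in seen_events.items() if ts}

    to_download: list[str] = []
    to_skip: list[str] = []
    for key in sorted(device_events):
        # Skip only when both the key and its timestamp match cached state.
        if skip.get(key) == device_events[key]:
            to_skip.append(key)
        else:
            to_download.append(key)
    return to_download, to_skip
```

The timestamp comparison is what covers the erase-then-record-many case: the recycled keys are still present in the cached map, but their timestamps no longer match, so every event is downloaded again.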
+34 -29
@@ -58,16 +58,24 @@ class BridgeGUI(tk.Tk):
tk.Entry(self, textvariable=self.logdir_var, width=24).grid(row=1, column=3, sticky="we", **pad)
tk.Button(self, text="Browse", command=self._choose_dir).grid(row=1, column=4, sticky="w", **pad)
# Row 2: Raw taps
self.raw_bw_var = tk.StringVar(value="")
self.raw_s3_var = tk.StringVar(value="")
tk.Checkbutton(self, text="Save BW->S3 raw", command=self._toggle_raw_bw, onvalue="1", offvalue="").grid(row=2, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_bw_var, width=28).grid(row=2, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_var, "bw")).grid(row=2, column=4, **pad)
# Row 2: Raw taps — ON by default; "auto" = timestamped name; blank checkbox = disabled
self.raw_bw_enabled = tk.IntVar(value=1)
self.raw_s3_enabled = tk.IntVar(value=1)
# Path fields: empty means "auto" (bridge picks a timestamped name)
self.raw_bw_path_var = tk.StringVar(value="")
self.raw_s3_path_var = tk.StringVar(value="")
tk.Checkbutton(self, text="Save S3->BW raw", command=self._toggle_raw_s3, onvalue="1", offvalue="").grid(row=3, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_s3_var, width=28).grid(row=3, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_var, "s3")).grid(row=3, column=4, **pad)
tk.Checkbutton(self, text="BW→S3 raw (auto)", variable=self.raw_bw_enabled,
command=self._toggle_raw_bw).grid(row=2, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_bw_path_var, width=28,
fg="grey").grid(row=2, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_path_var, "bw")).grid(row=2, column=4, **pad)
tk.Checkbutton(self, text="S3→BW raw (auto)", variable=self.raw_s3_enabled,
command=self._toggle_raw_s3).grid(row=3, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_s3_path_var, width=28,
fg="grey").grid(row=3, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_path_var, "s3")).grid(row=3, column=4, **pad)
# Row 4: Status + buttons
self.status_var = tk.StringVar(value="Idle")
@@ -102,13 +110,11 @@ class BridgeGUI(tk.Tk):
var.set(filename)
def _toggle_raw_bw(self) -> None:
if not self.raw_bw_var.get():
# default name
self.raw_bw_var.set(os.path.join(self.logdir_var.get(), "raw_bw.bin"))
# Checkbox toggled — no path action needed; enabled state drives the flag.
pass
def _toggle_raw_s3(self) -> None:
if not self.raw_s3_var.get():
self.raw_s3_var.set(os.path.join(self.logdir_var.get(), "raw_s3.bin"))
pass
def start_bridge(self) -> None:
if self.process and self.process.poll() is None:
@@ -126,23 +132,22 @@ class BridgeGUI(tk.Tk):
args = [sys.executable, BRIDGE_PATH, "--bw", bw, "--s3", s3, "--baud", baud, "--logdir", logdir]
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
# Raw tap flags.
# Checkbox on + empty path → pass "auto" (bridge generates timestamped name).
# Checkbox on + explicit path → pass that path.
# Checkbox off → pass "" to disable (overrides bridge's auto default).
raw_bw_explicit = self.raw_bw_path_var.get().strip()
raw_s3_explicit = self.raw_s3_path_var.get().strip()
raw_bw = self.raw_bw_var.get().strip()
raw_s3 = self.raw_s3_var.get().strip()
if self.raw_bw_enabled.get():
args += ["--raw-bw", raw_bw_explicit if raw_bw_explicit else "auto"]
else:
args += ["--raw-bw", ""] # explicit disable
# If the user left the default generic name, replace with a timestamped one
# so each session gets its own file.
if raw_bw:
if os.path.basename(raw_bw) in ("raw_bw.bin", "raw_bw"):
raw_bw = os.path.join(os.path.dirname(raw_bw) or logdir, f"raw_bw_{ts}.bin")
self.raw_bw_var.set(raw_bw)
args += ["--raw-bw", raw_bw]
if raw_s3:
if os.path.basename(raw_s3) in ("raw_s3.bin", "raw_s3"):
raw_s3 = os.path.join(os.path.dirname(raw_s3) or logdir, f"raw_s3_{ts}.bin")
self.raw_s3_var.set(raw_s3)
args += ["--raw-s3", raw_s3]
if self.raw_s3_enabled.get():
args += ["--raw-s3", raw_s3_explicit if raw_s3_explicit else "auto"]
else:
args += ["--raw-s3", ""] # explicit disable
try:
self.process = subprocess.Popen(
+18 -8
@@ -390,8 +390,14 @@ def main() -> int:
ap.add_argument("--s3", default="COM5", help="S3-side COM port (default: COM5)")
ap.add_argument("--baud", type=int, default=38400, help="Baud rate (default: 38400)")
ap.add_argument("--logdir", default=".", help="Directory to write session logs into (default: .)")
ap.add_argument("--raw-bw", default=None, help="Optional file to append raw bytes sent from BW->S3 (no headers)")
ap.add_argument("--raw-s3", default=None, help="Optional file to append raw bytes sent from S3->BW (no headers)")
ap.add_argument("--raw-bw", default="auto",
help="File to append raw bytes sent from BW->S3 (no headers). "
"Default 'auto' generates a timestamped name in --logdir. "
"Pass an empty string to disable.")
ap.add_argument("--raw-s3", default="auto",
help="File to append raw bytes sent from S3->BW (no headers). "
"Default 'auto' generates a timestamped name in --logdir. "
"Pass an empty string to disable.")
ap.add_argument("--quiet", action="store_true", help="No console heartbeat output")
ap.add_argument("--status-every", type=float, default=0.0, help="Seconds between console heartbeat lines (default: 0 = off)")
args = ap.parse_args()
@@ -414,12 +420,16 @@ def main() -> int:
# If raw tap flags were passed without a path (bare --raw-bw / --raw-s3),
# or if the sentinel value "auto" is used, generate a timestamped name.
# If a specific path was provided, use it as-is (caller's responsibility).
raw_bw_path = args.raw_bw
raw_s3_path = args.raw_s3
if raw_bw_path in (None, "", "auto"):
raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin") if args.raw_bw is not None else None
if raw_s3_path in (None, "", "auto"):
raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin") if args.raw_s3 is not None else None
# Resolve raw tap paths.
# "auto" (default) → timestamped file in logdir (always captured).
# Explicit path → use verbatim.
# None or "" → disabled (pass --raw-bw "" to suppress capture).
raw_bw_path: Optional[str] = args.raw_bw if args.raw_bw else None
raw_s3_path: Optional[str] = args.raw_s3 if args.raw_s3 else None
if raw_bw_path == "auto":
raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin")
if raw_s3_path == "auto":
raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin")
logger = SessionLogger(log_path, bin_path, raw_bw_path=raw_bw_path, raw_s3_path=raw_s3_path)
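Note on the raw-tap change above: the three-way resolution ("auto", explicit path, disabled) can be exercised in isolation with a helper like the one below. `resolve_raw_tap` and its `stem` parameter are hypothetical names introduced for illustration. The bridge itself resolves the two paths inline in `main()`.

```python
import os
from typing import Optional


def resolve_raw_tap(value: Optional[str], logdir: str, ts: str, stem: str) -> Optional[str]:
    """Map a --raw-bw / --raw-s3 argument value to a capture path or None."""
    if not value:
        # None or "" means the tap is disabled.
        return None
    if value == "auto":
        # Default: timestamped file inside the log directory.
        return os.path.join(logdir, f"{stem}_{ts}.bin")
    # Any other value is an explicit path and is used verbatim.
    return value
```

With this shape, the GUI's "checkbox off" case (`--raw-bw ""`) and an omitted value both resolve to `None`, while the argparse default of `"auto"` always yields a per-session file.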
+683 -65
@@ -11,6 +11,7 @@
| Date | Section | Change |
|---|---|---|
| 2026-05-08 | §7.6.1 (RETRACTION) | **❌ RETRACTED — "raw int16 LE 8 bytes/sample-set" body codec was never validated.** The original 4-2-26 confirmation was based on misreading broken-decoder output (full-scale ±32K noise) as evidence the signal had saturated. BW's own 0C peaks for that capture (Tran=0.420 / Vert=3.870 / Long=0.495 in/s) prove the signal was NOT saturated — none of those exceed 13K ADC counts. No event in the project's archive has ever come close to saturation, yet the decoder consistently produces ±32K noise on every event. Conclusion: the body codec is not raw int16 LE; the actual encoding is open. Body byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`, lots of `10 XX` pairs) — likely a delta encoding with `0x10` as escape, but unverified. Retraction box added at top of §7.6.1; "fully-saturating event" claim removed from channel-identification note. The histogram codec in §7.6.2 IS verified and decoded correctly (different recording mode, 32-byte blocks); use it as a structural hint when reverse-engineering the waveform codec. |
| 2026-02-26 | Initial | Document created from first hex dump analysis |
| 2026-02-26 | §2 Frame Structure | **CORRECTED:** Frame uses DLE-STX (`0x10 0x02`) and DLE-ETX (`0x10 0x03`), not bare `0x02`/`0x03`. `0x41` confirmed as ACK not STX. DLE stuffing rule added. |
| 2026-02-26 | §8 Timestamp | **UPDATED:** Year `0x07CB = 1995` confirmed as MiniMate hardware default date when RTC battery is disconnected. Not an encoding error. Confidence upgraded from ❓ to 🔶. |
@@ -104,9 +105,16 @@
| 2026-04-11 | §7.11 (NEW) | **NEW — §7.11 Erase-All Protocol added.** Full wire sequence, SUB 0x06 storage range payload layout, post-erase key counter reset (resets to `0x01110000`). Confirmed from 4-11-26 MITM capture of live Blastware ACH session. |
| 2026-04-11 | §14.6 | **RESOLVED — ACH Session Lifecycle is no longer "Future".** `bridges/ach_server.py` fully implements inbound ACH: POLL handshake, device info, event download. State tracked via `ach_state.json` (key-based, with `max_downloaded_key` for post-erase detection). `--clear-after-download` flag added for the standard delete-after-upload workflow. |
| 2026-04-17 | §7.6.2, §14 | **RESOLVED — Float 6.206053 at channel_label+28 is the ADC-to-velocity scale factor.** Confirmed from Series III Interface Handbook §4.5 formula: `Range (×1) = 1.61133 V / Sensitivity (V/unit)`. For the standard Instantel geophone at Normal range (10.000 in/s): Sensitivity = 1.61133 / 10 = 0.161133 V/(in/s). The stored value is the **inverse sensitivity** = 1/0.161133 = **6.206053 (in/s)/V**. Cross-check: 1.61133 V × 6.206053 = 10.000 in/s ✅. The firmware uses it as: `PPV (in/s) = ADC_voltage (V) × 6.206053`. Value is identical on all Instantel standard geophones — it is a hardware/firmware constant, NOT a user-configurable setting. Do NOT write this field. Open question §14 item "Max Geo Range float 6.2061" is now **RESOLVED**. |
| 2026-04-20 | §7.6.4 (NEW), §7.9, Appendix B | **CONFIRMED — Recording Mode byte location.** Three targeted captures (4-20-26) confirmed `recording_mode` at **`cfg[5]`** in the SUB 71 write payload (3-chunk compliance write). Method: single Blastware session, one initial E5 config pull, then three sequential "Send to unit" writes changing Recording Mode only. Diff of SUB 71 chunk-1 payloads: only `cfg[5]` and `cfg[1024]` changed; `cfg[1024]` delta exactly equals `cfg[5]` delta (chunk running checksum). In the E5 read response (sub-frame 1, page=0x0010), the field is at **`data[17]`** (= **anchor 4** from the 10-byte anchor), one position earlier than in the write payload due to an extra `0x10` byte at `data[18]` present only in the read format. Enum: `0x00`=Single Shot, `0x01`=Continuous, `0x03`=Histogram, `0x04`=Histogram+Continuous. `0x02` value not yet observed. See §7.6.4 for full details. |
| 2026-04-20 | §7.6.4 (NEW), §7.9, Appendix B | **CONFIRMED — Recording Mode byte location.** Three targeted captures (4-20-26) confirmed `recording_mode` at anchor8 in both the E5 read payload and the BW write payload (6-byte anchor `\xbe\x80\x00\x00\x00\x00`). BW write payload and E5 read payload are **byte-identical** around the anchor region — Blastware round-trips the wire-encoded E5 bytes verbatim with only the target field modified. Anchor position varies by ±1 depending on whether recording_mode = 0x03 (Histogram), because E5 wire-encodes `0x03` as the inner DLE+ETX pair `\x10\x03` (2 bytes), which S3FrameParser preserves as two literal bytes in `compliance_raw`. Enum: `0x00`=Single Shot, `0x01`=Continuous, `0x03`=Histogram, `0x04`=Histogram+Continuous. `0x02` value not yet observed. The byte at anchor9 is `0x00` for Single Shot / Continuous, and `0x10` for Histogram (DLE prefix from E5 encoding) and Histogram+Continuous (actual config byte). See §7.6.4 for full details. |
| 2026-04-21 | Appendix D (NEW) | **NEW — Blastware .N00 and .MLG file formats fully decoded.** `minimateplus/blastware_file.py` implements `write_n00()` and `write_mlg()`. N00 file format confirmed: 22B header + 21B STRT record + variable body + 26B footer. Body reconstructed from A5 bulk waveform stream frames with per-frame skip amounts (probe=7+strt_pos+21, A5[1]=13, A5[2+]=12, terminator=11) and DLE strip rule (strip `0x10` before `{0x02,0x03,0x04}`, keep following byte). Footer extracted verbatim from terminator frame's last 26 bytes. Split-pair edge case: when `frame.data[-1]==0x10` and `chk_byte∈{0x02,0x03,0x04}`, reunite both bytes before stripping and always remove trailing chk_byte (`stripped[:-1]`) — chk_byte is checksum, not payload. STRT record must be copied verbatim from A5[0]; bytes [10:20] are device-specific and cannot be reconstructed from Event fields. `write_n00` verified byte-perfect against `M529LIY6.N00` from 4-3-26-multi_event capture. MLG format: 308B header + N×292B records; CRC algorithm unknown (write as 0x0000). |
| 2026-04-21 | Appendix D §D.5 (NEW) | **NEW — Blastware filename encoding fully decoded.** Serial prefix: `chr(ord('B') + floor(serial/1000))` + last 3 digits zero-padded. Stem: 4-char base-36 of `floor(total_seconds/1296)`. Extension: `AB0` for manual/direct downloads (3 chars), `AB0W` or `AB0H` for ACH/call-home downloads (4 chars), where `AB` = 2-char base-36 of `total_seconds % 1296` and W/H = waveform/histogram. Epoch = 1985-01-01 00:00:00 device local time. Confirmed against 3,248 files from 10-year production archive with zero errors. 3-day cycle property: same daily recording time cycles through 3 extensions (864s/day shift, period=3 days). `blastware_filename(event, serial, ach=False)` implements full formula. |
| 2026-04-21 | §7.6.2, §5.3 | **CORRECTED — compliance_raw contains wire-encoded bytes, NOT logical bytes.** S3FrameParser appends DLE+ETX inner-frame pairs as two literal bytes to the frame body. Any `0x03` values in the compliance config appear in `compliance_raw` as `\x10\x03` (two bytes), not as a single `0x03`. The previous claim "S3FrameParser handles this transparently so compliance_raw contains logical (destuffed) bytes" was wrong. Consequence: `compliance_raw` is the wire-encoded E5 payload; anchor-relative reads work correctly because the anchor position automatically accounts for any DLE-encoded bytes before it. For write-back, round-tripping `compliance_raw` verbatim sends the correct wire bytes to the device. **DLE ETX escaping in write frames:** Blastware escapes `0x03` bytes in write frame data as `\x10\x03` on wire; our `build_bw_write_frame` does not (writes data raw). Device is confirmed to accept raw writes for all tested modes — likely uses the offset/length field for write frame framing, not ETX scanning. |
| 2026-04-20 | §7.6.2, §7.9, Appendix B | **CONFIRMED — Geophone maximum range / sensitivity selector byte location.** Two targeted captures (4-20-26, geo sensitivity folder): one at Normal 10.000 in/s, one at Sensitive 1.250 in/s. E5 read payload diff: exactly 3 bytes differ at channel_label+33 for Tran/Vert/Long. Values: `0x00`=Normal 10.000 in/s, `0x01`=Sensitive 1.250 in/s. Same offset applies to the SUB 71 write payload (which is the same 2126-byte E5-format buffer round-tripped verbatim). **`channel_label+20` reads `0x01` in ALL captures regardless of range setting — it is NOT this field.** Previous hypothesis (uint8 at Tran+20, 0x01=Normal) was WRONG. Stored as `geo_range` in `ComplianceConfig`. Encoded to all three geo channel blocks (Tran/Vert/Long) at label+33. |
| 2026-04-20 | §5.1, §5.3, §7.12 (NEW) | **NEW — Auto Call Home config protocol confirmed from 4-20-26 call home settings captures.** SUB 0x2C (Call Home Config READ, response 0xD3, data offset 0x7C=124) and SUB 0x7E/0x7F (WRITE + CONFIRM, response 0x81/0x80) confirmed. Write payload = read payload (125 bytes) + `\x00\x00` (127 bytes total). **DLE-escaped ETX at raw[117:119]:** the device returns logical value 0x03 (num_retries=3) as `\x10\x03` on the wire — S3FrameParser preserves both bytes as two literals, causing a +1 byte shift for all subsequent fields. Write frame sends these bytes verbatim (device interprets `\x10\x03` as literal value 3). Field map confirmed from 10-frame BW TX diff. See §7.12 for full layout. |
| 2026-05-01 | §7.8.2, §7.8.5 (NEW), §7.8.6 (NEW), §7.8.7 (NEW) | **REWRITTEN — SUB 5A bulk waveform stream protocol.** Five BW MITM captures (4-27-26 "open 2sec waveform" + "copy event to disk", 5-1-26 BW 3-sec + 2nd-event + Download All) prove that the previous chunk-counter formula `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400` over-reads 5× past the actual event end. BW reads ~12-16 chunks per event at **0x0200 increments (NOT 0x0400)**, bounded by `end_offset` extracted from the STRT record at `data[23:27]` of the first A5 response. **TERM frame formula corrected:** `offset_word = end_offset - next_boundary`, `params[2:4] = next_boundary BE` where `next_boundary = last_chunk_counter + 0x0200`. Verified across 3 events (offsets 0x1ABE, 0x21F2, 0x417E). **Metadata pages 0x1002 / 0x1004** are global, fixed-address device pages containing Project/Client/User Name/Seis Loc/Extended Notes — read ONCE per Blastware session (not per event). **Event-1 vs event-N split:** events at start_key[2:4]=0 use probe@0x0000 + metadata pages + sample chunks at 0x0600 onward; continuation events skip metadata and start at start_key+0x0046. **WAVEHDR length 0x46 vs 0x2C disambiguates real events from boundary markers** — the "Download All" pattern walks 1E/0A/1F to map all event keys+lengths upfront, then downloads each `0x46`-keyed event in turn. Old `stop_after_metadata=True` knob is a workaround for the missing end_offset bound and becomes obsolete under the new walk. See new §7.8.5 / §7.8.6 / §7.8.7 for full details. |
| 2026-05-04 | §7.8.5, §7.8.8 | **CORRECTED — Event-N probe counter is just `start_offset`, NOT `start_offset + 0x0046`.** The `+0x46` formula in the original §7.8.5 was based on calling the off=0x2C boundary key the "start_key", but in the iteration walk `cur_key` passed into `read_bulk_waveform_stream` is always the off=0x46 WAVEHDR record key from 1F (the partial-record skip path in `get_events` re-runs 1F to advance past 0x2C boundary records). Adding +0x46 placed the probe one WAVEHDR past the actual event start; the response no longer contained STRT at byte 17, `parse_strt_end_offset` returned None, and the chunk loop fell back to the `max_chunks=128` cap, walking ~110 chunks of post-event circular-buffer garbage. Confirmed against both the 5-1-26 "copy 2nd address" capture (probe at counter=0x2238 with key=01112238) and the 5-4-26 BW 2-sec event capture. Fixed in protocol.py `read_bulk_waveform_stream` v0.14.1. |
| 2026-05-05 | §7.8.1 (rule #3 added) | **CONFIRMED — Partial DLE stuffing of `0x10` bytes in 5A params region.** The device's de-stuffing rule for the SUB 5A params region is: `10 10` → `10`, `10 02/03/04` → kept literal (inner-frame markers), `10 X` for any other X → de-stuffs to just `X` (drops the `0x10`). Therefore any `0x10` byte in the logical params followed by a byte NOT in {0x02, 0x03, 0x04, 0x10} MUST be doubled on the wire. This affects counters with `0x10` in the high byte — most importantly counter=`0x1000`, where logical params bytes `... 10 00 ...` were being sent raw and the device de-stuffed `10 00` to just `00`, returning the response for counter=0x0000 (= the file header + STRT). That STRT block then ended up embedded in the assembled file body at file offset `0x1016` and Blastware refused to open the file. This was the root cause of the long-standing ">1-sec event 0 won't open in BW" pattern (1-sec events worked because their `end_offset < 0x1000`, so no chunk request ever needed counter `0x10__`). All 17 5A request frames in the 5-1-26 bwcap3sec capture (probe + 2 meta + 13 samples + TERM) now match BW byte-for-byte after the fix. Fixed in framing.py `build_5a_frame` v0.14.3. |
| 2026-05-05 | §7.8 / Blastware file format | **CONFIRMED — File body assembly is contiguous concatenation, no de-duplication.** The "duplicate header+STRT strip" hack from v0.13.x was actively destroying valid waveform data — sample chunks at counter `0x1000` and beyond often coincidentally contain the byte sequence `00 12 03 00 STRT` in their delta-encoded ADC stream, and the strip was zeroing 25 bytes per match. Removed in v0.14.2. The correct file body is: probe contribution + meta@0x1002 + meta@0x1004 + sample contributions in stream order + TERM contribution. Verified byte-perfect against BW reference `M529LKIQ.G10` (8708 bytes, 0 differences) when fed the same A5 frames as the BW capture. |
---
@@ -257,7 +265,7 @@ Step 4 — Device sends actual data payload:
| `0A` | **WAVEFORM HEADER READ** | Checks record type for a given waveform key. Variable DATA_LENGTH: 0x30=full bin, 0x26=partial bin. Key at params[4..7]. Required before every 1F call to establish device waveform context. | ✅ CONFIRMED 2026-03-31 |
| `0C` | **FULL WAVEFORM RECORD** | Downloads 210-byte waveform/histogram record. Sub_code at byte[1]: 0x10=Waveform (9-byte timestamp hdr), 0x03=Waveform-continuous (10-byte hdr, 1-byte shift). PPV floats at label+6 (search "Tran"/"Vert"/"Long"/"MicL"). Peak Vector Sum at tran_label12 (NOT fixed offset). Key at params[4..7], DATA_LENGTH=0xD2. | ✅ CONFIRMED 2026-04-03 |
| `1F` | **EVENT ADVANCE** | Advances to next waveform key. Token byte at params[7] (⚠️ NOT params[6]): 0x00=browse (all-zero params), 0xFE=download (arm 5A state machine). Returns next key at data[11:15]; null sentinel when data[15:19]=0x00000000. Requires preceding 0A to establish context. Browse 1F must ONLY be called after successful 5A — calling it after a failed 5A disrupts device state for the next event's 5A probe. | ✅ CONFIRMED 2026-04-06 |
| `5A` | **BULK WAVEFORM STREAM** | Bulk download of raw ADC sample data. Non-standard frame format: offset_hi=0x10 sent raw (not DLE-stuffed), DLE-aware checksum. Requires 1E-arm + 0C + 1F(0xFE) + POLL×3 before first probe. A5[7] contains event-time metadata (Project:/Client:/User Name:/Seis Loc:). 9+ A5 frames for full waveform; stop_after_metadata=True exits after A5[7]. | ✅ CONFIRMED 2026-04-06 |
| `5A` | **BULK WAVEFORM STREAM** | Bulk download of raw ADC sample data. Non-standard frame format: offset_hi=0x10 sent raw (not DLE-stuffed), DLE-aware checksum, **partial DLE stuffing of 0x10 in params** (`10 X` where X∉{02,03,04,10} must be doubled to `10 10 X` — see §7.8). Requires 1E-arm + 0C + 1F(0xFE) + POLL×3 before first probe. Walk: probe at counter=`start_offset` (event 1: 0x0000) → metadata pages 0x1002 + 0x1004 (event 1 only) → sample chunks at 0x0600, 0x0800, …, step 0x0200, bounded by `end_offset` parsed from STRT@data[17] of probe response → TERM frame at residual offset_word. Project:/Client:/User Name:/Seis Loc: live in the metadata pages, NOT in the sample-chunk stream. | ✅ CONFIRMED 2026-05-05 (BYTE-PERFECT vs BW capture) |
| `24` | **WAVEFORM PAGE A?** | Paged waveform read, possibly channel group A. | 🔶 INFERRED |
| `25` | **WAVEFORM PAGE B?** | Paged waveform read, possibly channel group B. | 🔶 INFERRED |
| `09` | **UNKNOWN READ A** | Read command, response (`F6`) returns 0xCA (202) bytes. Purpose unknown. | 🔶 INFERRED |
@@ -833,11 +841,70 @@ MicL: 39 64 1D AA = 0.0000875 psi
### 7.6 Bulk Waveform Stream (SUB A5) — Raw ADC Sample Records
> ⛔ **§7.6 below describes the deprecated `0x0400`-step walk and is RETAINED FOR HISTORICAL CONTEXT ONLY.**
> The "A5[7] is metadata", "A5[9] is terminator", and chunk-counter frame-index claims in this section
> are all artifacts of the broken walk that was overrunning past event end by ~5×.
>
> **For the corrected protocol (v0.14.0+), use:**
> - **§7.8.5** — chunk addressing (probe at `start_offset`, samples step 0x0200, bounded by STRT `end_offset`)
> - **§7.8.6** — TERM frame formula
> - **§7.8.7** — fixed metadata pages 0x1002 / 0x1004 (this is where Project / Client / User Name / Seis Loc
> strings actually live — NOT in any sample-chunk frame)
> - **§7.8.8** — multi-event "Download All" sequence
>
> The waveform sample encoding described in §7.6.1 below (4-channel interleaved s16 LE, 8 bytes
> per sample-set) is **NOT actually verified** — see the retraction note at the top of §7.6.1.
> The frame-indexing claims and metadata-source claims in §7.6 are also wrong; use §7.8.5–§7.8.8.
**Two distinct formats exist depending on recording mode. Both confirmed from captures.**
---
#### 7.6.1 Blast / Waveform mode — ✅ CONFIRMED (4-2-26 capture)
#### 7.6.1 Blast / Waveform mode — ❌ NOT VERIFIED (retracted 2026-05-08)
> ## ⚠️ RETRACTION (2026-05-08)
>
> The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim
> below was **never actually validated**. It got into this document
> because the decoder built around that assumption produced full-scale
> ±32K counts on every channel of the 4-2-26 capture, and the
> ±32K-shaped output was misread as "the signal must have saturated."
>
> Cross-checking the BW-reported peaks proves the opposite:
>
> | Channel | BW PPV (in/s) | Expected ADC counts at 10 in/s FS |
> |---|---|---|
> | Tran | 0.420 | **1,376** |
> | Vert | 3.870 | **12,686** |
> | Long | 0.495 | **1,622** |
>
> None of these are anywhere near ±32K saturation. No event in the
> project's archive (across all captures from 1-2-26 onward) has
> ever come close to saturation either. Yet the decoder has
> consistently produced ±32K-shaped noise on every event. The right
> conclusion is that the byte-to-sample interpretation has been wrong
> the whole time, NOT that every event happened to saturate.
>
> What's actually known about the body bytes:
>
> - The byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`,
> plus high frequencies of `0x01 / 0x04 / 0x0F / 0xF0 / 0xF1`). Lots
> of `10 XX` pairs. Reading them as LE int16 produces uniform ±32K
> noise — the signature of mis-aligned or encoded data.
> - The CHANGELOG note for v0.14.2 calls the body a "delta-encoded
> ADC stream" — that hint plus the byte distribution points toward
> a delta encoding with `0x10` as an escape marker, but no decoder
> has been worked out yet.
> - The histogram-mode codec in §7.6.2 IS verified and decoded
> correctly (different format: 32-byte blocks with 9× int16 LE
> samples + metadata). The same firmware emits both formats, so
> §7.6.2 may share encoding primitives with the waveform codec
> and is worth using as a structural hint when reverse-engineering.
>
> **Treat the spec below as a starting hypothesis to disprove, not
> ground truth.** The frame-layout pieces (STRT location, preamble,
> chunk header) appear correct; the per-byte sample interpretation
> is the open question.
4-channel interleaved signed 16-bit little-endian, 8 bytes per sample-set:
@@ -902,11 +969,18 @@ Total: 7633B → 954 naive sample-sets, 948 alignment-corrected
Only 948 of 9306 sample-sets captured (10%) — `stop_after_metadata=True` terminated
download after A5[7] was received.
**Channel identification note:** The 4-2-26 blast saturated all four geophone channels
to near-maximum ADC output (~32000–32617 counts). Channel ordering [Tran, Vert, Long, Mic]
= [ch0, ch1, ch2, ch3] is the Blastware convention and is consistent with per-channel PPV
values (Tran=0.420, Vert=3.870, Long=0.495 in/s from 0C record), but cannot be
independently confirmed from a fully-saturating event alone.
**Channel identification note:** Channel ordering [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3]
is the Blastware convention. This ordering has not been independently verified end-to-end,
since no decoder yet produces samples that match BW's own rendering of the same event (see
the retraction at the top of §7.6.1). Once the body codec is decoded, the per-channel PPV
values from the 0C record (Tran=0.420, Vert=3.870, Long=0.495 in/s for the 4-2-26 capture)
provide the cross-check that pins down channel order.
> **Historical note:** earlier revisions of this section claimed the 4-2-26 blast had
> "saturated all four channels to ~32000–32617 counts," citing that as evidence the s16 LE
> interpretation was correct. That claim was wrong — the ±32K values were the broken
> decoder's output, not the actual signal amplitude (which the 0C peaks above show was
> nowhere near saturation). Retracted 2026-05-08.
---
@@ -1115,20 +1189,26 @@ Near-ambient: 0x3C75C28F = 0.015 in/s (histogram event, near-zero ambient)
**Project strings** — ASCII label-value pairs (search for label, read null-terminated value):
```
"Project:" → project description (in 0C record ✅)
"Client:" → client name (in SUB 5A / A5 frame 7 ✅ — NOT in 0C)
"User Name:" → operator / user (in SUB 5A / A5 frame 7 ✅ — NOT in 0C)
"Seis Loc:" → sensor location (in SUB 5A / A5 frame 7 ✅ — NOT in 0C)
"Extended Notes"→ notes field (in SUB 5A / A5 frame 7 ✅)
"Project:" → project description (in 0C record ✅, also mirrored in metadata pages)
"Client:" → client name (in SUB 5A metadata pages ✅ — NOT in 0C)
"User Name:" → operator / user (in SUB 5A metadata pages ✅ — NOT in 0C)
"Seis Loc:" → sensor location (in SUB 5A metadata pages ✅ — NOT in 0C)
"Extended Notes"→ notes field (in SUB 5A metadata pages ✅)
```
> ✅ **2026-04-02 — CONFIRMED:** `Client:`, `User Name:`, and `Seis Loc:` are sourced from
> **SUB 5A (bulk waveform stream)**, specifically A5 frame 7 of the multi-frame response.
> They are NOT present in the 210-byte SUB 0C waveform record. The strings reflect the
> compliance setup that was active when the event was recorded on the device — making SUB 5A
> the authoritative source for true event-time metadata. The `get_events()` client method
> now issues a SUB 5A request after each 0C download (`stop_after_metadata=True`) and
> overwrites `event.project_info` with the decoded fields.
> ✅ **UPDATED 2026-05-05:** `Client:`, `User Name:`, and `Seis Loc:` come from the
> dedicated **SUB 5A metadata pages at counter `0x1002` and `0x1004`** — see §7.8.7.
> They are NOT present in the 210-byte SUB 0C waveform record.
>
> An earlier draft of this doc claimed they came from "A5 frame 7" of the bulk waveform
> stream — that was an artifact of the deprecated `0x0400`-step walk where the broken
> chunk counter formula happened to land sample-chunk fi=7 on top of the 0x1002 metadata
> page. Under the corrected v0.14.0+ walk (§7.8.5), sample chunks at `0x1000` / `0x1200`
> contain ordinary waveform data, and the metadata pages are read separately.
>
> The strings reflect the compliance setup that was active when the *monitoring session*
> first started (not per-event). `get_events()` reads the metadata pages once at the start
> of the SFM session and the decoded values are stamped onto every event in that session.
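
The "search for label, read null-terminated value" rule above can be sketched as follows. This is a minimal illustration, not the project's parser: `extract_label_values` is a hypothetical name, and it assumes the value begins immediately after the label text and runs to the next `0x00` byte. The byte layout of the metadata pages has not been decoded (§7.8.7), so treat both assumptions as unverified.

```python
# Label anchors taken from the list above.
LABELS = (b"Project:", b"Client:", b"User Name:", b"Seis Loc:", b"Extended Notes")


def extract_label_values(blob: bytes) -> dict:
    """Return {label: value} for every label found in blob.

    Assumption (not confirmed by the spec): the value starts right after
    the label and is terminated by the first 0x00 byte that follows.
    """
    found = {}
    for label in LABELS:
        pos = blob.find(label)
        if pos < 0:
            continue  # label absent from this page
        start = pos + len(label)
        end = blob.find(b"\x00", start)
        if end < 0:
            end = len(blob)  # unterminated value: read to end of buffer
        name = label.decode("ascii").rstrip(":")
        found[name] = blob[start:end].decode("ascii", errors="replace").strip()
    return found


# Made-up sample bytes, for illustration only (not from a capture).
sample = b"\x00\x01Project:Quarry 7\x00\xffClient:ACME\x00"
print(extract_label_values(sample))
```

Running the same search over the concatenated `0x1002` and `0x1004` page payloads would be the natural first check once a fresh dump of those pages is available.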
---
@@ -1162,7 +1242,9 @@ return events
### 7.7.7 Updated Download Loop with SUB 5A Metadata
> **Added 2026-04-02.** Confirmed working on BE11529 over TCP/cellular.
> **The loop in this subsection is DEPRECATED — it uses the broken `stop_after_metadata=True`
> hack and the wrong sequence ordering.** See §7.8.5–§7.8.8 for the corrected protocol.
> The pseudocode below is preserved as historical record only.
```python
key4, _ = proto.read_event_first() # SUB 1E
@@ -1197,13 +1279,25 @@ return events
### 7.8 SUB 5A — Bulk Waveform Stream (event-time metadata)
> ✅ **Added 2026-04-02.** Frame format confirmed by reproducing Blastware wire bytes
> byte-for-byte from the 1-2-26 BW capture.
> ✅ **§7.8.1 (frame format) — added 2026-04-02; v0.14.3 partial DLE stuffing finalized 2026-05-05.**
> Frame format confirmed by reproducing Blastware wire bytes byte-for-byte across the 1-2-26
> capture (10 frames) and the 5-1-26 bwcap3sec capture (17 frames, all match including the
> DLE-stuffed `10 10 00` for counter=0x1000).
SUB 5A initiates a bulk transfer of the raw sample data for a stored event. The response is a
sequence of A5 frames. Frame 7 (0-indexed) contains the full compliance setup as it existed
when the event was recorded — including `Client:`, `User Name:`, `Seis Loc:`, and
`Extended Notes` ASCII label-value pairs.
SUB 5A initiates a bulk transfer of the raw sample data for a stored event. The response is
a sequence of A5 frames. Project-info ASCII strings (`Project:`, `Client:`, `User Name:`,
`Seis Loc:`, `Extended Notes`) live in the dedicated metadata pages at counter `0x1002`
and `0x1004` (see §7.8.7), not in the sample-chunk stream.
**For the corrected protocol read in order:**
- §7.8.1 — frame format (raw `offset_hi`, DLE-aware checksum, partial DLE stuffing of params)
- §7.8.5 — chunk addressing (probe → metadata pages → samples → TERM, all bounded by `end_offset`)
- §7.8.6 — TERM frame formula
- §7.8.7 — fixed metadata pages 0x1002 / 0x1004
- §7.8.8 — multi-event "Download All" sequence
§7.8.2–§7.8.4 are retained as historical record of earlier (incorrect) understandings —
do not implement against them.
#### 7.8.1 Frame Format
@@ -1214,7 +1308,7 @@ SUB 5A uses a **non-standard frame layout** that differs from all other BW→S3
41 02 10 10 00 5A 00 ^^raw^^ ^^raw^^ ^^stuffed^^
```
Two critical differences from `build_bw_frame`:
Three critical differences from `build_bw_frame`:
1. **`offset_hi` is sent raw, not DLE-stuffed.** When `offset_hi = 0x10`, the wire carries
a bare `0x10` — NOT the stuffed `10 10` that `build_bw_frame` would produce. The device
@@ -1223,36 +1317,85 @@ Two critical differences from `build_bw_frame`:
2. **DLE-aware checksum.** Walking the full frame byte sequence: when a `10 XX` pair is seen,
only `XX` is added to the running sum; lone bytes are added normally.
#### 7.8.2 Request Sequence
3. **Partial DLE stuffing of `0x10` bytes in the params region** (CONFIRMED 2026-05-05).
The device's de-stuffing rule for the params region is:
- `10 10` → de-stuffs to `10`
- `10 02 / 03 / 04` → kept literal (these are inner-frame markers)
- `10 X` for other X → de-stuffs to just `X` (drops the leading `0x10`)
Therefore any `0x10` byte in the *logical* params that is followed by a byte NOT in
`{0x02, 0x03, 0x04, 0x10}` MUST be doubled on the wire (`10 X` → `10 10 X`) so the
device's de-stuffer reproduces the original `10 X` pair. This applies most commonly
to counters with `0x10` in the high byte (e.g. counter=`0x1000` produces logical
params bytes `... 10 00 ...`, which BW encodes on the wire as `... 10 10 00 ...`).
Without this stuffing the device interprets counter=`0x1000` as `0x0000` and returns
the probe response (= a copy of the file header + STRT record); that STRT block then
ends up embedded in the assembled file body and Blastware refuses to open the file.
`0x10` bytes in `offset_hi` are still written RAW per (1) above — only the params
region has this stuffing requirement. Metadata-page params for counter `0x1002` /
`0x1004` survive without stuffing because `10 02` / `10 04` fall in the "kept literal"
carve-out.
Verified against BW 5-1-26 bwcap3sec frame 20: params logical bytes
`00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000) are encoded on the wire as
`00 01 11 10 10 00 00 00 00 00 00 00` (12 wire bytes for 11 logical bytes).
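
The stuffing rule in item 3 can be sketched as a pair of helpers. This is an illustration under stated assumptions, not the code in `framing.py`: `stuff_5a_params` and `destuff_5a_params` are hypothetical names, and the behaviour for a trailing `0x10` or a logical `10 10` pair is not specified above, so the sketch leaves those bytes undoubled.

```python
DLE = 0x10
KEPT_LITERAL = {0x02, 0x03, 0x04}   # inner-frame markers, passed through
NO_DOUBLE = KEPT_LITERAL | {DLE}    # followers that do not trigger doubling


def stuff_5a_params(logical: bytes) -> bytes:
    """Encode logical params for the wire (host side)."""
    out = bytearray()
    for i, b in enumerate(logical):
        out.append(b)
        if b != DLE or i + 1 >= len(logical):
            continue  # trailing 0x10 is unspecified; left as-is here
        if logical[i + 1] not in NO_DOUBLE:
            out.append(DLE)  # 10 X -> 10 10 X
    return bytes(out)


def destuff_5a_params(wire: bytes) -> bytes:
    """Model of the device-side de-stuffing rule described above."""
    out = bytearray()
    i = 0
    while i < len(wire):
        b = wire[i]
        if b == DLE and i + 1 < len(wire):
            nxt = wire[i + 1]
            if nxt == DLE:
                out.append(DLE)            # 10 10 -> 10
            elif nxt in KEPT_LITERAL:
                out += bytes([DLE, nxt])   # 10 02/03/04 kept literal
            else:
                out.append(nxt)            # 10 X -> X (0x10 dropped)
            i += 2
            continue
        out.append(b)
        i += 1
    return bytes(out)


logical = bytes.fromhex("00 01 11 10 00 00 00 00 00 00 00")  # counter=0x1000
wire = stuff_5a_params(logical)
print(wire.hex(" "))                         # 12 wire bytes
print(destuff_5a_params(wire) == logical)    # round-trips
print(destuff_5a_params(logical).hex(" "))   # unstuffed: the 0x10 is lost
```

The last line reproduces the original failure: sending the logical bytes raw makes the modelled device drop the `0x10`, which is how counter `0x1000` was being read as `0x0000`.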
#### 7.8.2 Request Sequence — DEPRECATED 2026-05-01 (see §7.8.5–§7.8.7 for the corrected protocol)
> ⛔ **The 0x0400-step / max(key4[2:4], 0x0400) formula in this section is WRONG.** Five new
> BW MITM captures (4-27-26 + 5-1-26) prove the actual chunk increment is **0x0200**, the
> chunk loop is bounded by `end_offset` from the STRT record (not by chunk count or by a
> device-side timeout), and the TERM frame's `offset_word=0x005A` magic is incorrect — the
> real TERM offset_word is computed from `end_offset` and the last chunk address. Under the
> deprecated formula SFM over-reads roughly 5× past the actual event end into post-event
> circular-buffer garbage, corrupting reconstructed Blastware files for any waveform ≥ 2 sec.
>
> The whole "stop_after_metadata + one extra chunk + 0e 08 footer" workaround in this
> section was compensating for the missing end_offset bound. It is obsoleted by the
> STRT-bounded walk in §7.8.5.
>
> **Read this section for historical context only.** For the correct protocol, jump to:
> - §7.8.5 — chunk addressing and the STRT end_offset
> - §7.8.6 — TERM frame formula
> - §7.8.7 — fixed metadata pages 0x1002 and 0x1004
| Frame | offset_word | counter | params | Purpose |
|---|---|---|---|---|
| Probe | `0x1004` | `0x0000` | 10 bytes (`bulk_waveform_params(0)`) | Initiate transfer |
| Chunk 1 | `0x1004` | `0x0400` | 11 bytes | First data chunk |
| Chunk 2 | `0x1004` | `0x0800` | 11 bytes | Second chunk |
| Chunk N | `0x1004` | `N * 0x0400` | 11 bytes | Nth chunk |
| Chunk 1 | `0x1004` | `max(key4[2:4], 0x0400)` | 11 bytes | First data chunk |
| Chunk 2 | `0x1004` | `max(key4[2:4], 0x0400) + 0x0400` | 11 bytes | Second chunk |
| Chunk N | `0x1004` | `max(key4[2:4], 0x0400) + (N-1) * 0x0400` | 11 bytes | Nth chunk |
| … | … | … | … | … |
| Termination | `0x005A` | `last + 0x0400` | 10 bytes | End transfer |
| Termination | `0x005A` | `max(key4[2:4], 0x0400) + N * 0x0400` | 10 bytes | End transfer |
> ⚠️ **2026-04-06 CORRECTED — chunk counter is monotonic for ALL chunks.**
> The 4-2-26 BW TX capture showed counter=0x1004 for chunk 1, which was hardcoded as a
> special case. This was a Blastware artifact. Empirically confirmed: counter=0x0400 for
> chunk 1 works correctly; counter=0x1004 causes the device to time out. The device does
> NOT strictly validate the counter value — it streams data for any valid 5A request for
> the given key. Use `chunk_num * 0x0400` (monotonic) for all chunks.
> BW's true internal formula is `key4[2:4] + n * 0x0400`. For event 1 (key `01110000`)
> this equals `n * 0x0400` since `key4[2:4] = 0x0000`. The monotonic formula is correct
> for all keys encountered on this device.
> Historical correction notes (left in place to deter re-derivation of the same wrong formula):
> the table above was the result of three iterative "corrections" between 2026-04-06 and
> 2026-04-26 that progressively narrowed in on the wrong answer because every test was on
> events with `key4[2:4]=0` and the device responds to whatever counter you ask for. The
> 5-1-26 captures with a non-zero start_key event (`01112238`) finally exposed the bug.
The `stop_after_metadata=True` flag causes the loop to stop as soon as `b"Project:"` is
found in the accumulated A5 frame data, typically after 7–9 chunks. A termination frame
is always sent before returning.
The `stop_after_metadata=True` flag (deprecated as a primary loop-exit) scanned for
`b"Project:"` in the chunk stream because the metadata strings happened to be reachable
when the broken 0x0400-step walk passed the global metadata pages at 0x1002/0x1004. Under
the corrected walk, those strings come from explicit reads at counter=0x1002 and 0x1004,
not from the sample-chunk stream — see §7.8.7.
#### 7.8.3 A5 Frame Layout
#### 7.8.3 A5 Frame Layout — DEPRECATED 2026-05-01
Each A5 response frame contains a chunk of raw bulk data. Frame 7 of the stream carries the
compliance text block with all project-info label-value pairs. The `client` layer searches
for ASCII labels with a null-terminated value read:
> ⛔ **The "Frame 7 carries the compliance text block" claim below is WRONG.** It was
> an artifact of the deprecated `0x0400`-step walk where the broken counter formula
> happened to land sample-chunk fi=7 on top of the 0x1002 metadata page in flash.
> Under the corrected v0.14.0+ walk (§7.8.5), Frame 7 of the sample-chunk sequence is
> just sample-chunk #5 (counter=0x1000), and contains either ordinary waveform data or —
> critically when DLE-stuffing of params is wrong (§7.8.1.3) — a duplicate file header +
> STRT block when the device misinterprets counter=0x1000 as 0x0000. See §7.8.7 for the
> actual source of these strings.
Historical claim (NOT TO BE IMPLEMENTED): each A5 response frame contains a chunk of raw
bulk data; Frame 7 of the stream carries the compliance text block with all project-info
label-value pairs:
```
"Project:" → null-terminated project name
@@ -1262,17 +1405,23 @@ for ASCII labels with a null-terminated value read:
"Extended Notes" → null-terminated notes
```
All five fields reflect the **setup at event-record time**, not the current device config.
All five fields do reflect the **setup at event-record time**, not the current device
config. But the source is the metadata pages (§7.8.7), not "Frame 7" of the sample
stream.
#### 7.8.4 End-of-Stream Behaviour and Chunk Timing
#### 7.8.4 End-of-Stream Behaviour and Chunk Timing — REINTERPRETED 2026-05-01
> ✅ **Confirmed 2026-04-06** — empirical observation on BE11529 (S338.17) over TCP/cellular.
> The "1 raw byte then silence" pattern documented below was originally interpreted as
> "the device's natural end-of-event signal." The 5-1-26 captures show this is actually
> the device's response when the requester has walked **past** the addressable buffer
> region (i.e. ~5× past the actual event end under the deprecated 0x0400-step walk).
> Under the corrected STRT-bounded walk (§7.8.5), the stream ends cleanly with the TERM
> frame's response — no timeout, no 1-byte teaser. The fallback below remains useful as
> defensive handling for malformed events but should not be the primary loop-exit.
**End-of-stream signal:** After sending all waveform chunks, the device sends exactly **1 raw byte** in response to the next chunk request, then goes silent. This byte is not a complete DLE-framed A5 response — `S3FrameParser.bytes_fed` reports 1 and no frame is ever assembled. This is the device's natural end-of-stream indicator.
Handling logic in `read_bulk_waveform_stream`:
**Defensive fallback handling in `read_bulk_waveform_stream`:**
```
TimeoutError caught:
TimeoutError caught (rare under corrected walk):
if bytes_fed > 0 AND frames already collected:
→ graceful end-of-stream; break loop; proceed to termination frame
else (bytes_fed == 0, no prior frames):
@@ -1284,14 +1433,15 @@ TimeoutError caught:
| Metric | Observed value |
|---|---|
| Chunk response time | ~1 s per chunk |
| Chunks for a 9,306-sample event | 35 chunks |
| Data per chunk (active signal) | 1,036–1,123 bytes |
| Data per chunk (post-event silence) | 1,036 bytes (uniform) |
| Chunks for a 2-sec event (corrected walk) | 14 (12 sample chunks + 2 metadata pages) + TERM |
| Chunks for a 3-sec event (corrected walk) | 18 (16 sample chunks + 2 metadata pages) + TERM |
| Chunks for a continuation event (corrected walk) | ~15 sample chunks + TERM (no metadata reread) |
| Chunks under deprecated walk for 2-3 sec event | 37 (over-reads ~5×) |
| Data per chunk (corrected, 0x0200 size) | ~540–575 bytes wire (= 0x0200 payload + framing) |
| Data per chunk (deprecated 0x0400 step) | 1,036–1,123 bytes wire (= 0x0400 payload + framing) |
| Safe recv timeout per chunk | **10 s** (10× typical) |
| Default transport timeout | 120 s → ~2-min stall at end-of-stream |
Chunks with uniform 1,036-byte payload (chunks 17–35 in the observed event) contain all-zero ADC samples — the device continues recording silence until the configured record time expires before terminating the stream.
**ADC count-to-physical conversion — ✅ CONFIRMED 2026-04-17:**
Raw samples are signed 16-bit integers (-32,768 to +32,767). Source: Interface Handbook §4.5.
@@ -1310,6 +1460,201 @@ where `geo_range = 1.61133 V × 6.206053 = 10.000 in/s` is the Normal (Gain=1) f
`_decode_a5_waveform()` contains `elif fi == 9: continue` from an earlier assumption that frame index 9 is always the device terminator. For streams with more than 9 frames, frame 9 is live waveform data. The skip discards ~1,070 bytes (~133 sample-sets) per event. Terminator detection should use `page_key == 0x0000`, not frame index. This skip should be removed.
#### 7.8.5 Chunk addressing and the STRT end_offset (NEW 2026-05-01) ✅
> ✅ Confirmed across 3 events (4-27-26 + 5-1-26 captures).
`params[0]` is always `0x00`. `params[1:5]` is a 4-byte absolute device flash-buffer
address — equivalently, "the key of the page being requested." The device returns 0x0200
(= 512) bytes starting at that address. Increments between consecutive sample chunks are
**0x0200, NOT 0x0400** (the previous 0x0400 figure was a Blastware-side artifact / our
implementation's bug — see §7.8.2).
##### STRT record (data layout in the first A5 response)
The first A5 response (the probe response, or the first chunk for continuation events)
contains a **STRT record** at byte offset 17 of `data`:
```
data[ 0:14] echoes request: [chunk_size_hi=0x02 / 0x04 ...] [00] [01 11] [counter_hi counter_lo] [00 × 8] [00 12]
data[14:17] 10 03 00 ← inner DLE+ETX frame separator (preserved literally)
data[17:21] "STRT" ← magic
data[21:23] ff fe ← sentinel
data[23:27] end_key ← 4-byte key of where this event ENDS
data[27:31] start_key ← 4-byte key of where this event STARTS
data[31:33] uint16 BE ← ?? sample count or byte count, varies (not yet decoded)
data[33:35] uint16 BE ← ??
data[35] 0x46 ← record type marker (waveform full record)
data[36:] additional pointers / first sample bytes — content varies by event
```
`end_offset = (end_key[2] << 8) | end_key[3]` is **the authoritative event-end pointer**.
Use it to bound the chunk loop and to compute the TERM frame.
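
The STRT layout above can be read with a few slices. The sketch below is illustrative only: `parse_strt` and `StrtRecord` are hypothetical names (the project's own helper is `parse_strt_end_offset`), and the two undecoded uint16 fields at `data[31:35]` are ignored.

```python
from typing import NamedTuple, Optional

STRT_AT = 17  # byte offset of the "STRT" magic inside the A5 data


class StrtRecord(NamedTuple):
    end_key: bytes     # data[23:27]
    start_key: bytes   # data[27:31]
    end_offset: int    # low 16 bits of end_key, big-endian


def parse_strt(data: bytes) -> Optional[StrtRecord]:
    """Return the STRT fields, or None if the magic is not at byte 17."""
    if len(data) < 31 or data[STRT_AT:STRT_AT + 4] != b"STRT":
        return None
    end_key = data[23:27]
    start_key = data[27:31]
    end_offset = (end_key[2] << 8) | end_key[3]
    return StrtRecord(end_key, start_key, end_offset)


# Synthetic probe response built from the layout above, using the keys
# of the 5-1-26 "copy 3sec" event (start 01110000, end 011121F2).
data = (
    bytes(14)                       # request echo (contents irrelevant here)
    + bytes.fromhex("10 03 00")     # inner DLE+ETX separator
    + b"STRT"
    + bytes.fromhex("ff fe")        # sentinel
    + bytes.fromhex("01 11 21 F2")  # end_key
    + bytes.fromhex("01 11 00 00")  # start_key
    + bytes(5)
)
rec = parse_strt(data)
print(hex(rec.end_offset))
```

Returning `None` when the magic is missing lets the caller distinguish a misplaced probe from a valid event instead of silently falling back to a chunk cap.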
##### Chunk pattern by event location in buffer
**Event 1 / start_key[2:4] = 0x0000** (first event after erase or wrap):
```
1. Probe at counter = 0x0000 (params[1:5] = full key)
2. Read fixed metadata pages counter = 0x1002, then 0x1004
3. Walk sample chunks counter = 0x0600, 0x0800, …, by 0x0200,
up to but not including end_offset & 0xFE00
4. TERM (see §7.8.6)
```
The range `[0x0046, 0x0600)` is skipped — likely some pre-event firmware-reserved area for
the first slot in a freshly-erased buffer. Harmless to skip; BW does the same.
**Event 2+ / start_key[2:4] != 0x0000** (continuation events in a populated buffer):
```
1. First chunk at counter = start_key[2:4] ← acts as both probe and first
sample chunk; response carries STRT at byte 17
2. Walk sample chunks counter += 0x0200 each
3. TERM
```
**`start_key` here is the off=0x46 WAVEHDR record key returned by 1F** (e.g. `01112238`),
NOT the off=0x2C boundary key that immediately precedes it. An earlier draft of this
spec described event-N as "probe at start + 0x46" — that formula was correct only if
"start" meant the boundary key (0x21F2 in the 5-1-26 event 2 case). In the iteration
walk used by SFM and BW, `cur_key` passed into the 5A flow is always the off=0x46 key,
so the probe counter equals `cur_key[2:4]` with no extra offset. Adding +0x46 places
the probe one WAVEHDR past the actual event start, the response no longer contains
STRT at byte 17, and the chunk loop falls back to the `max_chunks` cap.
Confirmed:
- 5-1-26 "copy 2nd address" BW capture: probe counter=0x2238 with key=01112238; A5[0]
has STRT@17 with end_offset=0x417E.
- 5-4-26 BW 2-sec event capture: same probe counter=0x2238, same end_offset=0x417E.
**No metadata-page reads.** Pages 0x1002/0x1004 are session-global and were already read
during event 1 in the same Blastware session. In SFM, treat metadata pages as a once-
per-`MiniMateClient.connect()` (or once-per-call-home) read, not per-event.
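
Both chunk patterns can be written as one request plan once `end_offset` is known. The following is a sketch, not the logic in `protocol.py`: `plan_walk` is a hypothetical helper, and it assumes a full chunk at counter `c` is requested only while `c + 0x0200 <= end_offset`, which matches the last-chunk addresses in the captures listed in this section. In a live download `end_offset` is only available after the first response, so a real loop would parse STRT first and then continue.

```python
from typing import List, Tuple

CHUNK = 0x0200
METADATA_PAGES = (0x1002, 0x1004)
EVENT1_FIRST_SAMPLE = 0x0600  # [0x0046, 0x0600) is skipped, as BW does


def plan_walk(start_offset: int, end_offset: int) -> List[Tuple[str, int]]:
    """List (kind, counter) requests that precede the TERM frame."""
    plan: List[Tuple[str, int]] = []
    if start_offset == 0x0000:
        # Event 1: separate probe, then the session-global metadata pages.
        plan.append(("probe", 0x0000))
        plan.extend(("meta", page) for page in METADATA_PAGES)
        counter = EVENT1_FIRST_SAMPLE
    else:
        # Continuation event: first chunk doubles as the probe.
        plan.append(("probe+sample", start_offset))
        counter = start_offset + CHUNK
    # Assumed bound: only request chunks that fit entirely below end_offset.
    while counter + CHUNK <= end_offset:
        plan.append(("sample", counter))
        counter += CHUNK
    return plan


for start, end in ((0x0000, 0x21F2), (0x2238, 0x417E)):
    plan = plan_walk(start, end)
    print(hex(start), len(plan), "requests, last chunk at", hex(plan[-1][1]))
```

The counter of the final entry is the `last_chunk_counter` that feeds the TERM computation in §7.8.6.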
##### Verified end_offset values
| Capture | start_key | end_key | end_offset | event size | sample-chunk start |
|---|---|---|---|---|---|
| 4-27-26 "open 2sec" / "copy event to disk" | `01110000` | `01111ABE` | `0x1ABE` | 6,846 B | 0x0600 (event-1 case) |
| 5-1-26 "copy 3sec" / Download All event 1 | `01110000` | `011121F2` | `0x21F2` | 8,690 B | 0x0600 (event-1 case) |
| 5-1-26 "copy 2nd address" / DA event 2 | `01112238` (= 1F result) | `0111417E` | `0x417E` | span 0x1F8C = 8,076 B | 0x2238 (= cur_key[2:4]) |
| 5-4-26 BW 2-sec event | `01112238` | `0111417E` | `0x417E` | (not recorded) | 0x2238 (= cur_key[2:4]) |
#### 7.8.6 TERM Frame Formula (NEW 2026-05-01) ✅
> ✅ Confirmed across 3 events. Replaces the deprecated `offset_word=0x005A` / `params[2] = key4[2]` formula in §7.8.2.
The TERM frame fetches the partial last chunk and the file footer. Its response payload
contains the bytes between the last full 0x0200-aligned chunk and `end_offset` — typically
20–520 B — and is **required for reconstructing the Blastware waveform file**. Append the
TERM response data to the chunk stream like any other A5 frame.
```
last_chunk_counter = address of last full 0x0200-byte chunk read
next_boundary = last_chunk_counter + 0x0200
TERM offset_word = end_offset - next_boundary
TERM params[0] = key[0] (= 0x01 on every observed device)
TERM params[1] = key[1] (= 0x11)
TERM params[2] = (next_boundary >> 8) & 0xFF
TERM params[3] = next_boundary & 0xFF
TERM params[4:10] = zeros ← 10-byte params (not 11)
Frame = build_5a_frame(offset_word, params)
```
The device can recover `end_offset` as `next_boundary + offset_word`, where
`next_boundary = (params[2] << 8) | params[3]` and `offset_word` is the 16-bit
`offset_hi:offset_lo` field of the 5A frame. When `next_boundary` is 0x0200-aligned
(event-1 walks) this reduces to `(params[2] << 8) | offset_word`, with bit 0 of `offset_hi`
carrying `bit 0 of (end_offset >> 8)`. The device replies with `(end_offset - next_boundary)`
bytes of waveform tail starting at `next_boundary`.
##### Verification
| Event | end_offset | last chunk | next_boundary | TERM offset_word | TERM params[2:4] | TERM response size |
|---|---|---|---|---|---|---|
| 2-sec | `0x1ABE` | `0x1800` | `0x1A00` | `0x00BE` ✓ | `1A 00` ✓ | 208 B |
| 3-sec | `0x21F2` | `0x1E00` | `0x2000` | `0x01F2` ✓ | `20 00` ✓ | 520 B |
| Event-2 | `0x417E` | `0x3E38` | `0x4038` | `0x0146` ✓ | `40 38` ✓ | (not measured directly; same pattern) |
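Equivalent way to write the formula when every sample chunk sits on a 0x0200 boundary
(the event-1 walk that starts at `0x0600`):
- `offset_word = end_offset & 0x01FF` (low 9 bits of end_offset)
- `params[2:4] = (end_offset & 0xFE00) BE` (high 7 bits of end_offset, low byte zeroed)
(The two forms are arithmetically identical to `end_offset - next_boundary` and
`next_boundary` because `next_boundary = end_offset & 0xFE00` whenever the chunk loop
stopped at the last full 0x0200 boundary below end_offset. This shortcut does not hold for
continuation events, whose chunk addresses are offset from the 0x0200 grid: for Event-2
above, `0x417E & 0x01FF = 0x017E`, whereas the verified TERM offset_word is `0x0146`. Use
the subtraction form there.)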
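
The TERM computation can be checked against the verification table with a few lines of Python. This is a sketch under assumptions: `term_frame_fields` is a hypothetical helper that returns only the two inputs to `build_5a_frame`, and the range check on `offset_word` is an added sanity guard, not something the captures confirm.

```python
from typing import Tuple

CHUNK = 0x0200


def term_frame_fields(key: bytes, end_offset: int,
                      last_chunk_counter: int) -> Tuple[int, bytes]:
    """Return (offset_word, params) for the TERM frame."""
    next_boundary = last_chunk_counter + CHUNK
    offset_word = end_offset - next_boundary
    if not 0 <= offset_word < CHUNK:
        # Guard (assumption): the residual should be less than one chunk.
        raise ValueError(f"residual {offset_word:#x} outside [0, 0x200)")
    params = bytes([
        key[0],                        # 0x01 on every observed device
        key[1],                        # 0x11
        (next_boundary >> 8) & 0xFF,
        next_boundary & 0xFF,
    ]) + bytes(6)                      # 10-byte params (not 11)
    return offset_word, params


key = bytes.fromhex("01110000")
for end, last in ((0x1ABE, 0x1800), (0x21F2, 0x1E00), (0x417E, 0x3E38)):
    word, params = term_frame_fields(key, end, last)
    print(f"end={end:#06x} offset_word={word:#06x} params={params.hex(' ')}")
```

Using the subtraction form throughout keeps the helper valid for continuation events, where chunk addresses do not sit on the 0x0200 grid.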
#### 7.8.7 Fixed Metadata Pages 0x1002 / 0x1004 (NEW 2026-05-01) 🔶
> 🔶 Inferred — observed in BW captures but page contents not yet byte-decoded.
Two chunk addresses are **GLOBAL** device/session metadata, not event-specific:
- `counter = 0x1002` — first metadata page
- `counter = 0x1004` — second metadata page
These are at fixed absolute addresses in the device's flash buffer. They contain the
session-start compliance-setup ASCII strings — **Project**, **Client**, **User Name**,
**Seis Loc**, **Extended Notes** — that under the deprecated 0x0400-step walk used to be
discoverable in the sample-chunk stream as "A5 frame 7" content. Under the corrected
0x0200-step walk these strings come exclusively from the dedicated metadata-page reads,
not from sample chunks.
##### Caching strategy
BW reads them ONCE per Blastware session, during event 1's download, and caches them.
For SFM:
- Read once per `MiniMateClient.connect()` / once per call-home session.
- Subsequent events in the same session don't need to re-fetch them.
- Their content does not change while iterating events. They DO change when the user
applies a new compliance setup (SUB 71 write) — invalidate the cache then.
##### TODO — content layout
The byte-for-byte layout of pages 0x1002 and 0x1004 has not been decoded. First task on
the implementation side: dump both pages from a fresh capture and verify they include all
the strings currently extracted from the deprecated A5 frame 7 of the chunk stream.
Compare to the existing `_decode_a5_metadata_into` parser — same string-search anchors
(`b"Project:"`, `b"Client:"`, `b"User Name:"`, `b"Seis Loc:"`, `b"Extended Notes"`) likely
apply directly.
#### 7.8.8 "Download All" Sequence (NEW 2026-05-01) ✅
> ✅ Confirmed from 5-1-26 "Download All" capture (`raw_*_171216_download_all_2events.bin`).
Before any 5A traffic, BW's "Download All" pre-walks the entire event chain to map keys
and event boundaries:
```
SERIAL × 2 → CHCFG → EVT_KEY (1E, all-zero) → key0
→ WAVEHDR (0A, key0) → off=0x46 (real event start)
→ EVT_NEXT (1F, all-zero) → key1
→ WAVEHDR (0A, key1) → off=0x2C (boundary)
→ EVT_NEXT → key2
→ WAVEHDR (0A, key2) → off=0x46 (real event start)
→ EVT_NEXT → key3
→ WAVEHDR (0A, key3) → off=0x2C (boundary)
→ EVT_NEXT → null sentinel
```
The DATA_LENGTH at `data_rsp.data[5]` (echoed BW offset for the data fetch step)
disambiguates real events from boundary markers:
| WAVEHDR offset | Meaning |
|---|---|
| `0x46` (= 70) | Real event start key — this key has event data behind it |
| `0x2C` (= 44) | Boundary marker — this key is the END of the previous event AND the start of the empty/header gap before the next event |
Pairs: each real event spans `[real_key, boundary_key)` in the buffer, where `boundary_key` is the `0x2C` key that follows it in the walk. In the example
above: event 1 = `[01110000, 011121F2)`, event 2 = `[01112238, 0111417E)`. Note that the
"end of event 1" key (`011121F2`) is also the "boundary key" that comes BEFORE event 2's
real start key (`01112238`); the two differ by exactly 0x46 bytes (the event header size).
After the pre-walk completes, BW downloads each `0x46`-keyed event in turn using the 5A
bulk stream protocol from §7.8.5. Use the `0x46` keys, not the `0x2C` keys, as input to
`read_bulk_waveform_stream`.
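
A minimal sketch of the key classification, assuming the pre-walk results have already been collected as `(key4, wavehdr_offset)` pairs in chain order. The function names and the `walk` structure are illustrative assumptions; only the `0x46` / `0x2C` meanings and the example keys come from the capture described above.

```python
from typing import List, Optional, Tuple

REAL_EVENT_OFFSET = 0x46  # WAVEHDR offset reported for a real event start key
BOUNDARY_OFFSET = 0x2C    # WAVEHDR offset reported for a boundary marker key

Walk = List[Tuple[bytes, int]]  # assumed shape: (key4, wavehdr_offset) per key


def real_event_keys(walk: Walk) -> List[bytes]:
    """Keys that should be handed to the 5A bulk download."""
    return [key for key, offset in walk if offset == REAL_EVENT_OFFSET]


def event_spans(walk: Walk) -> List[Tuple[bytes, Optional[bytes]]]:
    """Pair each real key with the boundary key that directly follows it."""
    spans = []
    for i, (key, offset) in enumerate(walk):
        if offset != REAL_EVENT_OFFSET:
            continue
        end = None  # stays None if the walk ended without a boundary key
        if i + 1 < len(walk) and walk[i + 1][1] == BOUNDARY_OFFSET:
            end = walk[i + 1][0]
        spans.append((key, end))
    return spans
```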
---
## 7.9 Compliance Config Field Inventory (Blastware UI, 2026-04-08) ✅
@@ -1341,10 +1686,10 @@ Fields visible in the Blastware "Compliance Setup" dialog. ✅ = byte offset co
| Field | Values / Type | Status |
|---|---|---|
| Enable User Notes | bool | ❓ |
| Project | ASCII string | ✅ (sourced from A5 frame 7 via SUB 5A) |
| Client | ASCII string | ✅ (sourced from A5 frame 7) |
| User Name | ASCII string | ✅ (sourced from A5 frame 7) |
| Seis Loc | ASCII string | ✅ (sourced from A5 frame 7) |
| Project | ASCII string | ✅ (sourced from SUB 5A metadata pages at counter `0x1002` / `0x1004` — see §7.8.7) |
| Client | ASCII string | ✅ (sourced from SUB 5A metadata pages — see §7.8.7) |
| User Name | ASCII string | ✅ (sourced from SUB 5A metadata pages — see §7.8.7) |
| Seis Loc | ASCII string | ✅ (sourced from SUB 5A metadata pages — see §7.8.7) |
| Enable Extended Notes | bool | ❓ |
| Extended Notes | ASCII text | ❓ |
| Extended Notes Title | ASCII string | ❓ |
@@ -2244,6 +2589,279 @@ Semantic Interpretation <- settings, events, responses
---
---
## Appendix D — Blastware Binary File Formats (.N00 / .MLG / others)
> ✅ CONFIRMED 2026-04-21 — all fields verified by binary diff of reconstructed vs reference
> files from the 4-3-26-multi_event capture (M529LIY6.N00, BE11529.MLG).
>
> ⚠️ EXTENSION MAPPING REFUTED 2026-04-21 — earlier assumption that extension encodes
> recording mode is **FALSE**. A continuous-mode event produced `.EI0`, not `.9T0`.
> Extension encoding algorithm is unknown. Do not use extension to infer recording mode.
### D.1 Common File Header (22 bytes)
All Blastware files (regardless of type) share an 18-byte prefix followed by a 4-byte type tag.
| Offset | Length | Value | Description |
|---|---|---|---|
| 0x00 | 6 | `10 00 01 80 00 00` | Fixed prefix |
| 0x06 | 10 | `Instantel\x00` | ASCII string |
| 0x10 | 2 | `07 2c` | Fixed suffix |
| 0x12 | 4 | varies | File type tag (see below) |
**Total header: 22 bytes.**
**Type tags:**
| Extension | Type tag | Description |
|---|---|---|
| `.N00` | `00 12 03 00` | Waveform event (confirmed) |
| `.9T0` | `00 12 03 00` | Waveform event — same type tag as .N00 (assumed; not independently confirmed) |
| `.EI0` | `00 12 03 00` | Waveform event — same type tag (assumed; continuous-mode event observed 2026-04-21) |
| `.MLG` | `22 01 0e a0` | Monitor log |
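
As an illustration of the header layout above, the following sketch validates the 18-byte prefix and classifies the 4-byte type tag. `classify_header` and the label strings are hypothetical; the byte values are the ones listed in the tables.

```python
# 6 fixed bytes + "Instantel\0" (10 bytes) + 2 fixed bytes = 18-byte prefix.
FILE_HEADER_PREFIX = b"\x10\x00\x01\x80\x00\x00" + b"Instantel\x00" + b"\x07\x2c"

# Type tags from the table above; the label strings are illustrative only.
TYPE_TAGS = {
    b"\x00\x12\x03\x00": "waveform",
    b"\x22\x01\x0e\xa0": "monitor_log",
}


def classify_header(blob: bytes) -> str:
    """Return a label for the file type, based on the 22-byte common header."""
    if len(blob) < 22:
        raise ValueError("need at least 22 bytes for the common header")
    if blob[:18] != FILE_HEADER_PREFIX:
        raise ValueError("not a Blastware file: prefix mismatch")
    return TYPE_TAGS.get(blob[18:22], "unknown")
```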
**Extension encoding — new firmware (V10.72+) FULLY DECODED (confirmed 2026-04-22):**
The extension differs depending on how the file was saved:
| Download method | Extension format | Example |
|---|---|---|
| Manual / direct (Blastware connected to unit) | `AB0` (3 chars) | `.CE0` |
| Call-home / ACH | `AB0W` or `AB0H` (4 chars) | `.CE0H` |
Where:
- `AB` = 2-char base-36 of `total_seconds % 1296`; `A = value // 36`, `B = value % 36`
- `total_seconds = (event_local_time - 1985-01-01T00:00:00_local)` in seconds
- `0` = always literal digit zero
- `W` = Full Waveform, `H` = Full Histogram (ACH only)
Base-36 alphabet: `0`-`9` = 0-9, `A`-`Z` = 10-35.
The 10-year production archive contains only ACH files (all end in W or H). Manual Blastware downloads produce the same `AB0` prefix but without the trailing type character.
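
The rules above can be sketched as a small encoder. `ab0_extension` and its `ach_type` parameter are illustrative names, not the project's API; the example timestamp is the one decoded in §D.5.2.

```python
import datetime
from typing import Optional

EPOCH = datetime.datetime(1985, 1, 1)  # device local time, no UTC conversion
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def ab0_extension(event_local_time: datetime.datetime,
                  ach_type: Optional[str] = None) -> str:
    """Return 'AB0' (manual download) or 'AB0W' / 'AB0H' (call-home)."""
    total_seconds = int((event_local_time - EPOCH).total_seconds())
    value = total_seconds % 1296          # seconds into the 1296 s window
    ext = ALPHABET[value // 36] + ALPHABET[value % 36] + "0"
    if ach_type is not None:
        if ach_type not in ("W", "H"):
            raise ValueError("ach_type must be 'W', 'H', or None")
        ext += ach_type
    return ext


# 2025-05-26 15:00:08 local, Full Histogram via call-home -> "C80H"
print(ab0_extension(datetime.datetime(2025, 5, 26, 15, 0, 8), "H"))
```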
**3-day cycle property (confirmed 2026-04-22):** A unit recording at a fixed daily time cycles through exactly **3 different extensions** with a 3-day period. Each calendar day shifts `total_seconds % 1296` by 864 (since `86400 % 1296 = 864`). The cycle repeats every 3 days because `gcd(1296, 864) = 432`, so the daily shift visits only `1296 / 432 = 3` distinct residues. Confirmed from archive: top 3 extensions `CE0H` (95), `0E0H` (93), `OE0H` (91) are the 3-day cycle of a 06:00:14 daily call-in (seconds-in-window = 446, 14, 878).
**B character invariance:** `864 = 24 × 36`, so adding one day never changes `value % 36` — the second extension character is invariant for a fixed daily recording time. Only the first character cycles through 3 values.
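
The arithmetic behind both properties can be checked directly. The helper name `daily_values` is illustrative; the numbers are those quoted above.

```python
import math

STEM_UNIT = 1296                       # seconds per stem window (36 ** 2)
DAILY_SHIFT = 86400 % STEM_UNIT        # how far the in-window value moves per day
CYCLE_DAYS = STEM_UNIT // math.gcd(STEM_UNIT, DAILY_SHIFT)


def daily_values(first_value: int, days: int) -> list:
    """In-window seconds for the same wall-clock time on consecutive days."""
    return [(first_value + d * DAILY_SHIFT) % STEM_UNIT for d in range(days)]


values = daily_values(446, 4)          # 446 = "CE" from the archive example
second_chars = {v % 36 for v in values}
print(DAILY_SHIFT, CYCLE_DAYS, values, second_chars)
```

Because `DAILY_SHIFT` is a multiple of 36, the set of second characters always has exactly one element.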
**Old firmware (S338):** 3-char extensions observed (`.N00`, `.EI0`, etc.) — may simply be manual downloads under the same AB0 scheme, or a different encoding. Not yet confirmed.
**Micromate Series 4** uses a different extension format (observed: `IDFH`, `IDFW`). This formula does NOT apply to Micromate units.
All waveform files share the same `00 12 03 00` type tag regardless of extension. Blastware identifies file type by extension, not by type tag alone.
### D.2 Timestamp Encoding (Blastware files)
All timestamps in N00 and MLG files use an **8-byte big-endian format**:
| Byte | Field |
|---|---|
| 0 | day (uint8) |
| 1 | month (uint8) |
| 2-3 | year (uint16 BE) |
| 4 | `0x00` (reserved) |
| 5 | hour (uint8) |
| 6 | minute (uint8) |
| 7 | second (uint8) |
Example: `01 04 07 ea 00 00 1c 08` → April 1, 2026, 00:28:08.
Note: this differs from the 9-byte protocol timestamp (`[day][sub_code][month][year_HI][year_LO][0x00][hour][min][sec]`) used in the device's on-wire 0C waveform records. The file format uses a compact 8-byte layout without the `sub_code` byte.
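
A short decoding sketch for the 8-byte file timestamp, using `struct` with a big-endian format string. `decode_file_timestamp` is an illustrative name; the project's own helper may differ.

```python
import datetime
import struct


def decode_file_timestamp(raw: bytes) -> datetime.datetime:
    """Decode [day][month][year BE][0x00][hour][min][sec] into a datetime."""
    if len(raw) != 8:
        raise ValueError("file timestamps are exactly 8 bytes")
    # ">BBHBBBB" = day, month, uint16 big-endian year, reserved, hour, min, sec
    day, month, year, _reserved, hour, minute, second = struct.unpack(">BBHBBBB", raw)
    return datetime.datetime(year, month, day, hour, minute, second)


# The example above: 01 04 07 ea 00 00 1c 08 -> 2026-04-01 00:28:08
print(decode_file_timestamp(bytes.fromhex("010407ea00001c08")))
```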
### D.3 N00 File Format — Single-Shot Waveform Event
**File layout:** `[22B header] [21B STRT record] [body bytes] [26B footer]`
#### D.3.1 STRT Record (21 bytes)
The STRT record immediately follows the 22-byte header.
| Offset | Length | Field | Notes |
|---|---|---|---|
| 0 | 4 | `STRT` | ASCII literal |
| 4 | 2 | `ff fe` | Fixed |
| 6 | 4 | event key (key4) | 4-byte waveform key |
| 10 | 4 | device-specific | NOT a repeat of key4 — device-internal field |
| 14 | 6 | device-specific | NOT zero-padded — device-internal fields |
| 20 | 1 | rectime | uint8 seconds |
**Critical:** The STRT record must be copied verbatim from A5[0].data[7+strt_pos:] — bytes [10:20] contain device-specific values that cannot be reconstructed from protocol-level Event fields alone.
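
A minimal parsing sketch for the 21-byte STRT record, following the offsets in the table. `StrtRecord` and `parse_strt` are hypothetical names; the record is kept in `raw` so a writer can copy it verbatim, as required above.

```python
from dataclasses import dataclass

STRT_SIZE = 21


@dataclass
class StrtRecord:
    event_key: bytes      # bytes [6:10], the 4-byte waveform key
    device_fields: bytes  # bytes [10:20], opaque device-internal values
    rectime_s: int        # byte [20], record time in seconds
    raw: bytes            # the untouched record, for verbatim copying


def parse_strt(record: bytes) -> StrtRecord:
    """Split a STRT record into its fields without interpreting bytes [10:20]."""
    if len(record) != STRT_SIZE:
        raise ValueError("STRT record must be exactly 21 bytes")
    if record[:4] != b"STRT" or record[4:6] != b"\xff\xfe":
        raise ValueError("missing STRT literal or ff fe marker")
    return StrtRecord(
        event_key=bytes(record[6:10]),
        device_fields=bytes(record[10:20]),
        rectime_s=record[20],
        raw=bytes(record),
    )
```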
#### D.3.2 Body Bytes (variable)
The body is reconstructed from the raw A5 bulk waveform stream frames by stripping DLE framing markers and taking the appropriate slice of each frame's data section.
**Per-frame contribution (from `frame.data`):**
| Frame | Skip amount | Notes |
|---|---|---|
| A5[0] (probe) | `7 + strt_pos_in_w0 + 21` | Skip frame.data prefix + STRT record |
| A5[1] | 13 | 7-byte prefix + 6-byte first-chunk header |
| A5[2..N] | 12 | 7-byte prefix + 5-byte chunk header |
| Terminator (page_key=0x0000) | 11 | 7-byte prefix + 4-byte terminator header |
**DLE strip rule:** For each frame's contribution (`frame.data[skip:]`), strip any `0x10` byte immediately followed by `0x02`, `0x03`, or `0x04`. Only the `0x10` is stripped; the following byte is kept as payload.
**Split-pair edge case:** When `frame.data[-1] == 0x10` AND `frame.chk_byte ∈ {0x02, 0x03, 0x04}`, the S3FrameParser split a DLE+XX pair at the payload/checksum boundary. Reunite the bytes before stripping (`relevant + bytes([chk_byte])`), then always remove the trailing chk_byte from the result (`stripped[:-1]`) — chk_byte is the wire checksum, never payload.
**Body/footer split:** Accumulate all frame contributions (data frames + terminator) into `all_bytes`. Then:
- `body = all_bytes[:-26]` (variable length)
- `footer = all_bytes[-26:]` (always 26 bytes — extracted from terminator content)
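
The strip rule, the split-pair edge case, and the body/footer split can be sketched together. The function names here are illustrative and take plain `bytes` rather than the project's `S3Frame` objects; the skip amounts are passed in by the caller.

```python
from typing import Iterable, Tuple

DLE = 0x10
DLE_FOLLOWERS = (0x02, 0x03, 0x04)
FOOTER_SIZE = 26


def strip_dle(data: bytes) -> bytes:
    """Drop each 0x10 that is immediately followed by 0x02, 0x03, or 0x04."""
    out = bytearray()
    for i, b in enumerate(data):
        if b == DLE and i + 1 < len(data) and data[i + 1] in DLE_FOLLOWERS:
            continue  # the following byte is kept on the next iteration
        out.append(b)
    return bytes(out)


def frame_contribution(data: bytes, chk_byte: int, skip: int) -> bytes:
    """Body bytes contributed by one frame (data[skip:], DLE-stripped)."""
    relevant = data[skip:]
    if relevant and relevant[-1] == DLE and chk_byte in DLE_FOLLOWERS:
        # Reunite the split pair, strip it, then drop the checksum byte again.
        return strip_dle(relevant + bytes([chk_byte]))[:-1]
    return strip_dle(relevant)


def split_body_footer(contributions: Iterable[bytes]) -> Tuple[bytes, bytes]:
    """Concatenate all contributions and split off the 26-byte footer."""
    all_bytes = b"".join(contributions)
    if len(all_bytes) < FOOTER_SIZE:
        raise ValueError("not enough bytes for a footer")
    return all_bytes[:-FOOTER_SIZE], all_bytes[-FOOTER_SIZE:]
```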
#### D.3.3 Footer (26 bytes)
The footer terminates the N00 file. Its bytes come directly from the terminator A5 frame's inner content — do NOT reconstruct from event metadata.
| Offset | Length | Field | Notes |
|---|---|---|---|
| 0 | 2 | `0e 08` | Fixed marker |
| 2 | 8 | ts1 | Start timestamp (8B big-endian) |
| 10 | 8 | ts2 | Stop timestamp (8B big-endian) |
| 18 | 6 | `00 01 00 02 00 00` | Fixed |
| 24 | 2 | CRC | 2-byte CRC — algorithm unconfirmed |
**CRC:** The 2-byte CRC at footer[24:26] has an unconfirmed algorithm. In M529LIY6.N00 it reads `fe da`. Attempts to match CRC-16/CCITT, CRC-16/IBM, CRC-32 (truncated), and 40+ polynomial/init combinations all failed. The writer copies it verbatim from the terminator frame.
### D.4 MLG File Format — Monitor Log
**File layout:** `[308B header] [N × 292B records]`
#### D.4.1 MLG Header (308 bytes)
| Offset | Length | Field | Notes |
|---|---|---|---|
| 0x00 | 22 | common header | prefix + `22 01 0e a0` type tag |
| 0x16 | 16 | unknown | observed as zeros in BE11529.MLG |
| 0x2A | 8 | serial number | null-padded ASCII (e.g. `"BE11529"`) |
| 0x32 | remainder | zero pad | pads to 308 bytes total |
#### D.4.2 MLG Record (292 bytes each)
| Offset | Length | Field | Notes |
|---|---|---|---|
| 0 | 2 | CRC | 2-byte CRC — algorithm unconfirmed; write as `00 00` |
| 2 | 4 | `22 01 0e 80` | Record marker |
| 6 | 8 | ts1 | Start timestamp (8B big-endian) |
| 14 | 8 | ts2 | Stop timestamp (8B big-endian); zeros if no stop |
| 22 | 4 | flags | Record type flags (see below) |
| 26 | 10 | serial | Null-padded ASCII serial number |
| 36 | variable | text | Type-dependent content |
| — | remainder | zero pad | pads to 292 bytes total |
**Record flags:**
| Value | Meaning |
|---|---|
| `ff ff 00 00` | Monitoring start with no stop recorded |
| `01 00 02 00` | Triggered event (has ts1 + ts2) |
| `02 00 00 00` | Monitoring interval (has ts1 + ts2) |
**Text content for triggered events (`flags = 01 00 02 00`):**
| Byte | Field |
|---|---|
| 0 | `0x08` |
| 1-8 | ts1 copy (8B big-endian) |
| 9+ | `"Geo: X.XXX in/s\x00"` ASCII geo threshold |
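
To make the record layout concrete, here is a sketch that assembles one triggered-event record. `build_trigger_record` is a hypothetical helper, the three-decimal threshold formatting is an assumption based on the `X.XXX` pattern, and the CRC is written as `00 00` as described above.

```python
import datetime

MLG_RECORD_SIZE = 292
MLG_RECORD_MARKER = b"\x22\x01\x0e\x80"
MLG_FLAGS_TRIGGER = b"\x01\x00\x02\x00"


def encode_ts(ts: datetime.datetime) -> bytes:
    """8-byte file timestamp: day, month, year (big-endian), 0x00, h, m, s."""
    return bytes([ts.day, ts.month, ts.year >> 8, ts.year & 0xFF,
                  0x00, ts.hour, ts.minute, ts.second])


def build_trigger_record(ts1: datetime.datetime, ts2: datetime.datetime,
                         serial: str, geo_threshold: float) -> bytes:
    """Assemble a 292-byte triggered-event record with a placeholder CRC."""
    serial_bytes = serial.encode("ascii")
    if len(serial_bytes) > 10:
        raise ValueError("serial field is 10 bytes")
    text = (b"\x08" + encode_ts(ts1)
            + f"Geo: {geo_threshold:.3f} in/s".encode("ascii") + b"\x00")
    record = (b"\x00\x00"                       # CRC placeholder
              + MLG_RECORD_MARKER
              + encode_ts(ts1) + encode_ts(ts2)
              + MLG_FLAGS_TRIGGER
              + serial_bytes.ljust(10, b"\x00")
              + text)
    return record.ljust(MLG_RECORD_SIZE, b"\x00")  # zero-pad to 292 bytes
```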
#### D.4.3 MLG CRC
The 2-byte CRC at record[0:2] uses an unconfirmed algorithm. Tested against CRC-16/CCITT, CRC-16/IBM, CRC-32 (truncated), word sums, XOR variants, and 40+ polynomial/init combinations — none matched. The writer emits `00 00`. Blastware may reject files with incorrect CRCs (impact on import unknown — TODO: test).
### D.5 Filename Encoding ✅ PARTIALLY CONFIRMED 2026-04-22
Blastware assigns waveform filenames of the form `<prefix_letter><serial3><stem><ext>`, where:
#### D.5.1 Serial Prefix ✅ CONFIRMED 2026-04-22
The first 4 characters of the filename encode the full device serial number:
```
prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))
serial3 = f"{serial_numeric % 1000:03d}" (last 3 digits, zero-padded)
```
Where `serial_numeric` is the integer after the "BE" device-type prefix.
Examples (all confirmed from archive):
| Serial | serial_numeric / 1000 | prefix_letter | serial3 | Filename prefix |
|--------|----------------------|---------------|---------|-----------------|
| BE6907 | 6 | H | 907 | H907 |
| BE7145 | 7 | I | 145 | I145 |
| BE11529 | 11 | M | 529 | M529 |
| BE14036 | 14 | P | 036 | P036 |
| BE17353 | 17 | S | 353 | S353 |
| BE18003 | 18 | T | 003 | T003 |
| BE18191 | 18 | T | 191 | T191 |
| BE18676 | 18 | T | 676 | T676 |
**Interpretation:** The prefix letter encodes the production generation (batch of 1000 units). B=generation 0 (serials 0-999), C=generation 1 (1000-1999), etc. No units with prefix A have been observed — the earliest known units start around serial 2000+ (prefix D).
**Note:** The "BE" device-type prefix is implicit. The filename only encodes the numeric part of the serial. Other Instantel device types (Micromate, Blastmate) may use a different scheme.
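
The prefix formula can be exercised against the table above. `serial_prefix` is an illustrative name, and rejecting non-"BE" serials is an assumption of this sketch.

```python
def serial_prefix(serial: str) -> str:
    """Return the 4-character filename prefix for a 'BE' serial number."""
    if not serial.startswith("BE") or not serial[2:].isdigit():
        raise ValueError("expected a serial of the form BE<digits>")
    serial_numeric = int(serial[2:])
    prefix_letter = chr(ord("B") + serial_numeric // 1000)
    serial3 = f"{serial_numeric % 1000:03d}"
    return prefix_letter + serial3


for s in ("BE6907", "BE11529", "BE14036", "BE18003"):
    print(s, serial_prefix(s))
```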
#### D.5.2 Stem + Extension — full timestamp encoding ✅ FULLY CONFIRMED 2026-04-22
The stem (4 chars) and AB extension (2 chars) together form a 6-digit base-36 number encoding a complete second-resolution timestamp:
```python
total_seconds = stem_int * 1296 + ab_int
event_local_time = datetime(1985, 1, 1) + timedelta(seconds=total_seconds)
```
- **Epoch:** `1985-01-01 00:00:00` **device local time** ✅ CONFIRMED — verified against 3,248 files from a 10-year production archive; zero errors (only 2 mismatches were Micromate `IDFH`/`IDFW` files which use a completely different naming scheme)
- **Unit:** 1296 seconds = 36² ≈ 21.6 minutes per stem increment
- **Alphabet:** `"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"` (digits then uppercase letters)
- **Collision:** Events within the same 21.6-minute window share a stem; extension distinguishes them
**Decoding example — `P036L318.C80H` (BE14036, Full Histogram):**
```
stem L318 = 21×36³ + 3×36² + 1×36 + 8 = 983,708
AB C8 = 12×36 + 8 = 440
total_sec = 983,708 × 1296 + 440 = 1,274,886,008
event_time = 1985-01-01 + 1,274,886,008s = 2025-05-26 15:00:08 local
```
**Note on local time:** The device's onboard clock is set to the local timezone of the deployment site. The epoch and all timestamps are in that same local time — there is no UTC conversion. Files moved between timezones will decode to the original deployment timezone.
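
The decoding example can be reproduced with Python's built-in base-36 parsing. `decode_filename_time` is an illustrative helper that assumes the `<prefix_letter><serial3><stem>.<AB>0[T]` layout described above and does no validation beyond length checks.

```python
import datetime

EPOCH = datetime.datetime(1985, 1, 1)  # device local time


def decode_filename_time(filename: str) -> datetime.datetime:
    """Recover the event's local timestamp from a Blastware filename."""
    name, _, ext = filename.partition(".")
    if len(name) != 8 or len(ext) < 3:
        raise ValueError("expected 8-char name and an extension of 3+ chars")
    stem_int = int(name[4:8], 36)   # chars 5-8: 1296-second window index
    ab_int = int(ext[:2], 36)       # first two extension chars: seconds in window
    return EPOCH + datetime.timedelta(seconds=stem_int * 1296 + ab_int)


print(decode_filename_time("P036L318.C80H"))  # 2025-05-26 15:00:08
```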
#### D.5.3 Extension taxonomy
Third character of extension is always `'0'`. File type is identified by extension, not by the type tag in the header (all waveform extensions share type tag `00 12 03 00`).
| Extension | Recording mode | Sample rate | Status |
|---|---|---|---|
| `.N00` | Single Shot (0x00) | 1024 sps | ✅ CONFIRMED |
| `.9T0` | Continuous (0x01) | 1024 sps | ✅ CONFIRMED |
| `.490` | ? | ? | ❓ observed from M529LJ8V.490 |
| `.5K0` | ? | ? | ❓ observed from M529LJDY.5K0 |
| `.980` | ? | ? | ❓ observed from M529LJDY.980 |
| `.ML0` | ? | ? | ❓ observed from M529LJDY.ML0 (167s duration; possibly Histogram) |
**Why 5 extensions for "Continuous"?** Binary analysis of all 6 example files shows that `.9T0`, `.490`, `.5K0`, `.980`, `.ML0` are byte-for-byte identical in all metadata regions (compliance anchor block, channel descriptor blocks `Tran/Vert/Long/MicL`). The A5 frame 7 body reflects the **session-start** compliance config, not the per-event capture config. All 5 files show recording_mode=0x01 and sample_rate=1024 in the body. The extension must therefore encode the **capture-time** compliance state — likely a combination of recording mode, sample rate, and possibly mic units or other options. This cannot be determined from file body alone without capture-time compliance data from the 0C record sub_code and the actual waveform sample count.
**DLE-shift offset note for reading recording_mode from N00/9T0 body:**
The compliance block in the file body has been through `_strip_inner_frame_dles`. The 0x10 constant at logical `anchor-7` (between recording_mode and sample_rate_HI) gets stripped when sample_rate_HI = `0x04` (1024 sps), because `0x10` precedes `0x04 ∈ {0x02,0x03,0x04}`. After stripping, the anchor shifts left by 1, so:
| 1024 sps (strip occurs) | 2048 or 4096 sps (no strip) |
|---|---|
| `file[anc-7]` = recording_mode | `file[anc-8]` = recording_mode |
| `file[anc-6:anc-4]` = sample_rate | `file[anc-6:anc-4]` = sample_rate |
For 1024 sps files, the expected file bytes around the anchor are:
```
file[anc-9]: mode_prefix (0x00 for Single Shot/Continuous; 0x10 for Histogram)
file[anc-8]: 0x00 (was recording_mode, but shifted away; now reads 0x00 for mode_prefix)
file[anc-7]: recording_mode (0x00=Single Shot, 0x01=Continuous, etc.)
file[anc-6]: 0x04 (sample_rate_HI for 1024 sps)
file[anc-5]: 0x00 (sample_rate_LO)
file[anc-4]: histogram_interval_HI
file[anc-3]: histogram_interval_LO
```
---
*All findings reverse-engineered from live RS-232 bridge captures.*
*Cross-referenced from 2026-03-02 with Instantel MiniMate Plus Operator Manual (716U0101 Rev 15).*
*This is a living document — append changelog entries and timestamps as new findings are confirmed or corrected.*
+10 -2
@@ -21,7 +21,15 @@ Typical usage (TCP / modem):
from .client import MiniMateClient
from .models import DeviceInfo, Event, MonitorLogEntry
from .transport import SerialTransport, TcpTransport
from .transport import CapturingTransport, SerialTransport, TcpTransport
__version__ = "0.1.0"
__all__ = ["MiniMateClient", "DeviceInfo", "Event", "MonitorLogEntry", "SerialTransport", "TcpTransport"]
__all__ = [
"MiniMateClient",
"DeviceInfo",
"Event",
"MonitorLogEntry",
"SerialTransport",
"TcpTransport",
"CapturingTransport",
]
+974
@@ -0,0 +1,974 @@
"""
blastware_file.py: Blastware binary file codec for bidirectional interoperability.
Reads and writes the proprietary Instantel/Blastware file formats:
Waveform events (.CE0W, .VM0H, .440, .7M0, etc.) (extension encoding: see below)
.MLG Monitor log (monitoring session history)
All waveform formats share a common 22-byte file header prefix and identical
internal binary structure (same type tag 00 12 03 00, same STRT record layout).
Blastware identifies the file type by extension, not by a magic marker.
EXTENSION ENCODING (V10.72 firmware), FULLY CONFIRMED 2026-04-22:
Direct / manual download: AB0 (3-char, no type character)
Call-home (ACH) download: AB0W or AB0H (4-char, W=waveform H=histogram)
AB = 2-char base-36 of (total_seconds % 1296), where
total_seconds = (event_local_time - 1985-01-01T00:00:00_local).
0 = always literal digit zero.
Verified against 3,248 call-home files from a 10-year production archive.
The 10-year archive contains only ACH files (all end in W or H).
Manual Blastware downloads produce 3-char AB0 extensions: same encoding
but without the trailing type character.
Old firmware (S338, 3-char extensions): encoding unknown / same as manual?
Micromate Series 4 uses a different scheme (literal datetime in filename).
File structure overview
Waveform file structure (confirmed from example-events/4-3-26-multi/M529LIY6 (example event)):
[22B header] [21B STRT record] [body bytes] [26B footer]
Header (22 bytes):
10 00 01 80 00 00 fixed prefix
49 6e 73 74 61 6e 74 65 6c 00 b'Instantel\x00'
07 2c fixed
00 12 03 00 waveform file type tag (shared by all waveform extensions)
STRT record (21 bytes, immediately follows header):
53 54 52 54 b'STRT'
ff fe fixed (2 bytes)
[key4] 4-byte waveform event key
[device] 4 bytes, device-specific (NOT a repeat of key4; copy verbatim from A5[0])
[device] 6 bytes, device-specific (NOT zero padding; copy verbatim from A5[0])
[rectime] uint8 record time in seconds
Body (variable; reconstructed from A5 frame data):
The body bytes are derived from the raw A5 frame wire content, specifically
from the DLE-decoded representation of each frame's contribution. See the
_frame_body_bytes() helper for the exact algorithm.
Footer (26 bytes):
0e 08
[ts1: 8B big-endian timestamp] start timestamp
[ts2: 8B big-endian timestamp] stop timestamp
00 01 00 02 00 00
[crc: 2B] CRC (algorithm unconfirmed; written as 0x00 0x00 placeholder)
Timestamp format (big-endian, 8 bytes):
[day] [month] [year_HI] [year_LO] [0x00] [hour] [min] [sec]
MLG (monitor log, confirmed from example-events/4-3-26-multi/BE11529.MLG):
[308B header] [N × 292B records]
Header (308 bytes):
Offset 0x00: 10 00 01 80 00 00 Instantel\x00 07 2c 22 01 0e a0 fixed (22B)
Offset 0x16: ... (unknown structure, written as zeros + serial)
Offset 0x2A: serial number (8 bytes, null-padded ASCII, e.g. "BE11529")
... zero-padded to 308 bytes total
Record (292 bytes each):
[2B CRC] unknown algorithm; written as 0x00 0x00
22 01 0e 80 record marker
[ts1: 8B big-endian timestamp] start time
[ts2: 8B big-endian timestamp] stop time (zeros if no stop)
[4B flags] see MLG_FLAGS_* constants below
[10B serial] null-padded serial number ASCII
[text] for trigger records: [0x08][8B ts1_copy] then ASCII "Geo: X.XXX in/s"
for monitoring records: b'' (or minimal separator)
[zero-padded to 292 bytes]
Critical implementation notes
Waveform body reconstruction algorithm (confirmed 2026-04-21 from verification against
M529LIY6 (example event) using raw_s3_20260403_153508.bin capture):
The waveform body bytes come from the A5 frame content, stripped of DLE-framing
artifacts. Each A5 frame contributes a different slice of its data section,
with DLE+{0x02,0x03,0x04} byte pairs stripped.
Skip amounts per frame index (offsets into frame.data):
A5[0] (probe): data[strt_pos + 21 + 7] (skip header + STRT record)
strt_pos found by searching frame.data[7:] for b'STRT';
the contribution starts at strt_pos + 21 within data[7:]
which equals strt_pos + 21 + 7 within frame.data.
A5[1]: data[13] (skip 7-byte frame.data prefix + 6 header bytes)
A5[2..N]: data[12] (skip 7-byte frame.data prefix + 5 header bytes)
Terminator A5: data[11] (1 byte less than chunk frames; terminator inner header
is 4 bytes instead of 5; confirmed 2026-04-21)
DLE strip rule (applied AFTER slicing):
Strip any 0x10 byte that is immediately followed by 0x02, 0x03, or 0x04.
This undoes the DLE-escape that S3FrameParser preserves as literal pairs.
Applied to frame.data[skip:] + bytes([frame.chk_byte]) together, then
conditionally exclude the trailing chk_byte from the output.
chk_byte absorption:
When frame.data[-1] == 0x10 AND frame.chk_byte in {0x02, 0x03, 0x04},
the last byte of frame.data is the DLE prefix of a split DLE+chk pair.
Including chk_byte in the strip buffer allows the pair to be stripped as
a unit. After stripping, the trailing chk_byte is ALWAYS removed because
_strip_inner_frame_dles keeps the byte after the DLE (the chk_byte value),
and that value is the checksum, never payload. This applies to all three
cases (chk in {0x02, 0x03, 0x04}) identically.
MLG CRC:
The algorithm that produces the 2-byte CRC at the start of each MLG record
is unknown. All examined records use non-zero values that do not match
CRC-16/CCITT, CRC-16/IBM, CRC-32 (truncated), word sums, XOR variants, or
any of the 40+ polynomial/init combinations tested. The writer emits 0x0000.
This produces files that Blastware may reject or display without the CRC check;
the exact impact on BW import is unknown (TODO: test).
Public API
blastware_filename(event, serial)
Return the correct Blastware filename for an event (e.g. "M529LIY6.CE0W").
Full AB0T extension encoding confirmed 2026-04-22 against 3,248 archive files.
Extension matches what Blastware itself would generate for the same event.
write_blastware_file(event, a5_frames, path)
Create a Blastware waveform file from an Event and the full A5 frame list.
All waveform extensions share the same binary format; the extension is set
by blastware_filename() based on the event timestamp and type.
read_blastware_file(path) -> Event
Parse a Blastware waveform file into an Event object with waveform data populated.
(Not yet implemented; placeholder raises NotImplementedError.)
write_mlg(entries, serial, path)
Create a .MLG file from a list of MonitorLogEntry objects.
read_mlg(path) -> list[MonitorLogEntry]
Parse a .MLG file into MonitorLogEntry objects.
(Not yet implemented; placeholder raises NotImplementedError.)
"""
from __future__ import annotations
import datetime
import logging
import struct
from pathlib import Path
from typing import Optional, Union
from .framing import S3Frame
from .models import Event, MonitorLogEntry, Timestamp
log = logging.getLogger(__name__)
# ── File header constants ─────────────────────────────────────────────────────
# Common 18-byte prefix shared by waveform files and MLG (confirmed from binary inspection):
#   10 00 01 80 00 00 (6 bytes, fixed) + b"Instantel\x00" (10 bytes) + 07 2c (2 bytes, fixed)
_FILE_HEADER_PREFIX = b"\x10\x00\x01\x80\x00\x00Instantel\x00\x07\x2c"  # 18 bytes
# Waveform file type tag (4 bytes after common prefix), shared by ALL waveform extensions
_WAVEFORM_TYPE_TAG = b"\x00\x12\x03\x00"  # confirmed from M529LIY6 (example event); same tag for .CE0W, .VM0H, etc.
# MLG type tag (4 bytes after common prefix)
_MLG_TYPE_TAG = b"\x22\x01\x0e\xa0"  # confirmed from BE11529.MLG offset 0x12..0x15
# Total header sizes
_WAVEFORM_HEADER_SIZE = 22  # 18-byte common prefix + 4-byte type tag; STRT starts at byte 22
_MLG_HEADER_SIZE = 308 # confirmed from BE11529.MLG
# MLG record marker (4 bytes after 2-byte CRC at start of each record)
_MLG_RECORD_MARKER = b"\x22\x01\x0e\x80"
_MLG_RECORD_SIZE = 292 # bytes per record (confirmed from BE11529.MLG)
# MLG record flags (4 bytes at record[22:26])
# Confirmed from BE11529.MLG binary inspection:
MLG_FLAGS_START_ONLY = b"\xff\xff\x00\x00" # monitoring start with no stop
MLG_FLAGS_TRIGGER = b"\x01\x00\x02\x00" # triggered event (has ts1 + ts2)
MLG_FLAGS_INTERVAL = b"\x02\x00\x00\x00" # monitoring interval (has ts1 + ts2)
# ── Timestamp helpers ─────────────────────────────────────────────────────────
def _encode_ts_be(ts: Optional[datetime.datetime]) -> bytes:
    """
    Encode a datetime as an 8-byte big-endian Blastware timestamp.

    Format (waveform file and MLG record timestamps):
        [day][month][year_HI][year_LO][0x00][hour][min][sec]

    Big-endian year confirmed from M529LIY6 (example event) footer:
        footer bytes [2..9] = 01 04 07 ea 00 00 1c 08
        day=1 month=4 year=0x07ea=2026 hour=0 min=28 sec=8

    Returns 8 zero bytes if ts is None.
    """
    if ts is None:
        return bytes(8)
    return bytes([
        ts.day,
        ts.month,
        (ts.year >> 8) & 0xFF,
        ts.year & 0xFF,
        0x00,
        ts.hour,
        ts.minute,
        ts.second,
    ])


def _decode_ts_be(raw: bytes) -> Optional[datetime.datetime]:
    """
    Decode an 8-byte big-endian Blastware timestamp.

    Returns None if the bytes are all zero or structurally invalid.
    """
    if len(raw) < 8 or raw == bytes(8):
        return None
    day = raw[0]
    month = raw[1]
    year = (raw[2] << 8) | raw[3]
    hour = raw[5]
    minute = raw[6]
    sec = raw[7]
    try:
        return datetime.datetime(year, month, day, hour, minute, sec)
    except ValueError:
        return None


def _ts_from_model(ts: Optional[Timestamp]) -> Optional[datetime.datetime]:
    """Convert a models.Timestamp to datetime.datetime, or None."""
    if ts is None:
        return None
    try:
        return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)
    except (ValueError, TypeError):
        return None
# ── DLE strip helper ──────────────────────────────────────────────────────────
def _strip_inner_frame_dles(data: bytes) -> bytes:
    """
    Strip DLE (0x10) framing markers from A5 inner-frame content.

    The A5 (bulk waveform stream) response body contains DLE-encoded sub-frame
    structure. S3FrameParser preserves DLE+XX pairs as two literal bytes in
    frame.data. Only the DLE marker byte needs to be removed; the following
    byte is actual payload content.

    Rule: when 0x10 is immediately followed by {0x02, 0x03, 0x04}, strip the
    0x10 (DLE marker) and keep the following byte as payload.
    Lone 0x10 bytes not followed by {0x02, 0x03, 0x04} are kept as-is.

    Confirmed correct by verifying reconstructed waveform body against M529LIY6 (example event):
      - 0x10 0x02 in terminator -> 0x02 kept
      - 0x10 0x04 in terminator (month byte) -> 0x04 kept
    """
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x10 and i + 1 < len(data) and data[i + 1] in {0x02, 0x03, 0x04}:
            # Strip the DLE marker; the next byte is payload and will be appended
            # in the next loop iteration.
            i += 1
            continue
        out.append(b)
        i += 1
    return bytes(out)
def _frame_body_bytes(frame: S3Frame, skip: int) -> bytes:
    """
    Extract the waveform body contribution from one A5 S3Frame.

    The contribution is frame.data[skip:] with inner-frame DLE pairs stripped
    per _strip_inner_frame_dles(). The chk_byte is temporarily appended before
    stripping to handle the split-pair edge case where a DLE at the end of
    frame.data is paired with chk_byte.

    Split-pair edge case (confirmed for A5[8] of M529LIY6 (example event), 2026-04-21):
    S3FrameParser appends DLE+XX pairs as two literal bytes when XX is not in {DLE, ETX}.
    When the LAST occurrence of such a pair straddles the payload/checksum boundary
    (i.e., DLE is the last byte of raw_payload and XX is the checksum), the parser
    splits them:
      - DLE ends up as the last byte of frame.data (frame.data[-1] == 0x10)
      - XX is stored as frame.chk_byte
    To strip the pair correctly, we reunite the bytes before calling the strip
    function. Since chk_byte is the checksum (not payload data), it is excluded
    from the final output regardless of whether it was part of a pair.

    Post-strip chk_byte removal (ALL cases):
    _strip_inner_frame_dles strips the 0x10 and KEEPS chk_byte in all cases.
    Chk_byte is always the checksum (not payload), so always strip it off.

    Args:
        frame: S3Frame with frame.data and frame.chk_byte populated.
        skip: Number of leading bytes in frame.data to exclude (frame header).

    Returns:
        bytes: the waveform body contribution for this frame.
    """
    if skip >= len(frame.data):
        return b""
    relevant = frame.data[skip:]
    # Detect split DLE+chk pair at the frame boundary.
    has_split_pair = (
        len(relevant) > 0
        and relevant[-1] == 0x10
        and frame.chk_byte in {0x02, 0x03, 0x04}
    )
    if has_split_pair:
        # Reunite the split pair so the strip function sees both bytes together.
        buf = relevant + bytes([frame.chk_byte])
        stripped = _strip_inner_frame_dles(buf)
        # _strip_inner_frame_dles strips the DLE (0x10) and KEEPS chk_byte.
        # chk_byte is the received checksum, never payload, so remove it.
        # This is correct for all values in {0x02, 0x03, 0x04}.
        if stripped:
            stripped = stripped[:-1]
        return stripped
    else:
        return _strip_inner_frame_dles(relevant)
# ── Filename helper ───────────────────────────────────────────────────────────
_INSTANTEL_EPOCH = datetime.datetime(1985, 1, 1, 0, 0, 0)
"""
Instantel timestamp epoch: January 1, 1985, 00:00:00 local time.
Confirmed 2026-04-21: stem values for 6 independent events (April 19, 2026)
all converge to this epoch when decoded as floor(seconds_since_epoch / 1296).
1985 is the year Instantel was founded.
"""
_STEM_UNIT_SEC = 1296 # = 36^2 seconds ≈ 21.6 minutes per stem unit
_STEM_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# ── Waveform file extension encoding ─────────────────────────────────────────
#
# NEW FIRMWARE (V10.72+) — FULLY DECODED (confirmed 2026-04-21, 10-year archive):
#
# Extension format: AB0T (4 characters)
# AB = 2-char base-36 encoding of (seconds_since_epoch % 1296)
# i.e. the number of seconds into the current 21.6-minute stem window
# Range: 0 ("00") to 1295 ("ZZ")
# 0 = always literal '0'
# T = event type: 'W' = Full Waveform, 'H' = Full Histogram
#
# Combined with the 4-char stem (which encodes seconds_since_epoch // 1296),
# the FULL filename gives a second-resolution timestamp:
# total_seconds = stem_val * 1296 + ab_val
# timestamp = EPOCH + timedelta(seconds=total_seconds)
#
# Verified against three S353L4H0 events (all three match to the second):
# S353L4H0.3M0W Full Waveform 2025-06-23 13:57:22 AB=3M=130 ✓
# S353L4H0.8S0H Full Histogram 2025-06-23 14:00:28 AB=8S=316 ✓
# S353L4H0.9X0W Full Waveform 2025-06-23 14:01:09 AB=9X=357 ✓
#
# OLD FIRMWARE (S338, 3-char extensions ending in '0') — UNKNOWN:
# Observed (old firmware / manual downloads): .440, .470, .7M0, .9T0, .EI0, etc.
# The V10.72 formula does NOT apply to these.
# Extension is NOT recording mode (refuted 2026-04-21: continuous → .EI0, not .9T0).
# blastware_filename() computes the correct AB0 extension for V10.72 firmware.
#
# WRONG earlier assumption (do not re-introduce):
# Extension was believed to encode recording mode × sample rate.
# Refuted by continuous-mode event producing .EI0 instead of .9T0.
def _make_stem(ts_local: datetime.datetime) -> str:
"""
Encode a local timestamp as a 4-character uppercase base-36 stem.
Algorithm (confirmed 2026-04-21 from 6 known file/timestamp pairs):
stem_int = floor((ts_local - Jan_1_1985_midnight_local) / 1296_seconds)
stem = 4-char uppercase base-36 encoding of stem_int
Unit = 36² = 1296 seconds ≈ 21.6 minutes. Events within the same 1296-second
window receive the same stem; their extension distinguishes them.
"""
delta_sec = int((ts_local - _INSTANTEL_EPOCH).total_seconds())
n = delta_sec // _STEM_UNIT_SEC
s = ""
for _ in range(4):
s = _STEM_CHARS[n % 36] + s
n //= 36
return s
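# Illustrative sketch (not part of this module): the stem/AB encoding described
# above can be inverted to recover a second-resolution timestamp. The names
# EPOCH, STEM_UNIT_SEC and decode_stem_and_ab are hypothetical; the epoch is
# assumed to be a naive local datetime, as the comments above state.

```python
import datetime

# Assumed values, mirroring the constants described in the comments above.
EPOCH = datetime.datetime(1985, 1, 1)
STEM_UNIT_SEC = 1296  # 36^2 seconds per stem unit


def decode_stem_and_ab(stem: str, ab: str) -> datetime.datetime:
    """Invert the encoding: total_seconds = stem_val * 1296 + ab_val."""
    if len(stem) != 4 or len(ab) != 2:
        raise ValueError("expected a 4-char stem and a 2-char AB field")
    stem_val = int(stem, 36)  # base-36 window index
    ab_val = int(ab, 36)      # seconds into the window, 0 to 1295
    total_seconds = stem_val * STEM_UNIT_SEC + ab_val
    return EPOCH + datetime.timedelta(seconds=total_seconds)


# The three S353L4H0 events listed in the verification table above.
for ab in ("3M", "8S", "9X"):
    print(ab, decode_stem_and_ab("L4H0", ab))
```

# Decoding stem "L4H0" with AB "3M" gives 2025-06-23 13:57:22, matching the
# first verified event above.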
def blastware_filename(event: Event, serial: str, ach: bool = False) -> str:
"""
Return the correct Blastware filename for an event.
CONFIRMED 2026-04-22: verified against 3,248 files from a 10-year archive.
Filename format: <prefix_letter><serial3><stem><AB>0[T]
where:
prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))
encodes the production generation (batch of 1000 units)
e.g. BE6907H, BE11529M, BE14036P, BE18003T
serial3 = f"{serial_numeric % 1000:03d}"
last 3 digits of numeric serial, zero-padded
stem = 4-char base-36 of floor(total_seconds / 1296)
encodes which 21.6-minute window the event fell in
AB = 2-char base-36 of (total_seconds % 1296)
encodes seconds within the window (0 to 1295)
0 = always literal digit zero
T = 'W' or 'H'; ONLY appended for call-home (ACH) downloads (ach=True).
Manual / direct downloads produce a 3-char extension (AB0) with no type char.
Call-home downloads produce a 4-char extension (AB0W or AB0H).
total_seconds = (event_local_time - 1985-01-01T00:00:00_local) in seconds
The 10-year production archive contains only call-home files (all end in W or H).
Manual Blastware downloads produce 3-char extensions: the same AB0 prefix but
without the trailing type character.
Micromate Series 4 uses a completely different naming scheme (literal datetime
in filename); this function does not apply to Micromate units.
Args:
event: Event object with timestamp set.
serial: Device serial number string (e.g. "BE11529").
ach: If True, append W/H type character (call-home style).
If False (default), omit type character (direct download style).
Returns:
Filename string, e.g. "M529LIY6.CE0" (direct) or "M529LIY6.CE0H" (ACH).
"""
# ── Serial prefix ──────────────────────────────────────────────────────────
serial_digits = "".join(c for c in serial if c.isdigit())
if len(serial_digits) >= 1:
serial_numeric = int(serial_digits)
generation = serial_numeric // 1000
prefix_letter = chr(ord('B') + generation)
serial3 = f"{serial_numeric % 1000:03d}"
else:
prefix_letter = "M" # fallback
serial3 = "000"
prefix = prefix_letter + serial3
# ── Stem + AB extension from timestamp ────────────────────────────────────
if event.timestamp is not None:
try:
ts_local = datetime.datetime(
event.timestamp.year, event.timestamp.month, event.timestamp.day,
event.timestamp.hour, event.timestamp.minute, event.timestamp.second,
)
delta_sec = int((ts_local - _INSTANTEL_EPOCH).total_seconds())
stem = _make_stem(ts_local)
ab_val = delta_sec % _STEM_UNIT_SEC
ab_str = _STEM_CHARS[ab_val // 36] + _STEM_CHARS[ab_val % 36]
except (ValueError, TypeError, AttributeError):
stem = "0000"
ab_str = "00"
else:
stem = "0000"
ab_str = "00"
# ── Type character (ACH only) ─────────────────────────────────────────────
if ach:
if getattr(event, 'recording_mode', None) in (3, 4): # Histogram / Hist+Cont
type_char = 'H'
else:
type_char = 'W'
ext = f".{ab_str}0{type_char}"
else:
ext = f".{ab_str}0"
return prefix + stem + ext
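# Illustrative sketch (not part of this module): a standalone version of the
# filename construction described in the docstring above, without the Event
# dependency. base36 and sketch_filename are hypothetical names, and the serial
# "BE17353" is an assumed example chosen only because it yields the "S353"
# prefix seen in the verified S353L4H0 files.

```python
import datetime

EPOCH = datetime.datetime(1985, 1, 1)  # assumed naive local epoch
CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base36(n: int, width: int) -> str:
    """Fixed-width uppercase base-36 encoding."""
    out = ""
    for _ in range(width):
        out = CHARS[n % 36] + out
        n //= 36
    return out


def sketch_filename(serial: str, ts: datetime.datetime, type_char: str = "") -> str:
    """<prefix_letter><serial3><stem>.<AB>0[T] as described above."""
    num = int("".join(c for c in serial if c.isdigit()))
    prefix = chr(ord("B") + num // 1000) + f"{num % 1000:03d}"
    total = int((ts - EPOCH).total_seconds())
    stem = base36(total // 1296, 4)  # which 21.6-minute window
    ab = base36(total % 1296, 2)     # seconds within the window
    return f"{prefix}{stem}.{ab}0{type_char}"


ts = datetime.datetime(2025, 6, 23, 13, 57, 22)
print(sketch_filename("BE17353", ts, "W"))  # call-home style, 4-char extension
print(sketch_filename("BE17353", ts))       # direct download, 3-char extension
```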
# ── A5 frame classifier ───────────────────────────────────────────────────────────
# ASCII markers that identify a compliance-config / metadata frame.
# These strings appear in the A5 bulk stream as part of the device's
# compliance setup payload. They should NEVER appear in raw ADC waveform
# frames (which are binary-heavy, < 20 % printable ASCII).
_METADATA_FRAME_MARKERS = (
b"Project:",
b"Client:",
b"Standard Recording Setup",
b"Extended Notes",
b"User Name:",
b"Seis Loc:",
)
def classify_frame(frame: S3Frame) -> str:
"""
Classify an A5 bulk waveform stream frame by its content.
Returns one of:
"terminator" - page_key == 0x0000
"probe_or_strt" - data contains b"STRT\xff\xfe" (the initial probe response)
"metadata" - data contains ASCII compliance-config markers
"waveform" - predominantly binary (< 20 % printable ASCII)
"unknown" - none of the above criteria matched
Used by write_blastware_file() to filter non-waveform frames out of
the reconstructed body so that metadata blocks (Project:, Client:, ...)
and spurious STRT records do not corrupt the output file.
"""
if frame.page_key == 0x0000:
return "terminator"
data = bytes(frame.data)
if b"STRT\xff\xfe" in data:
return "probe_or_strt"
if any(m in data for m in _METADATA_FRAME_MARKERS):
return "metadata"
if len(data) > 0:
printable = sum(1 for b in data if 32 <= b < 127)
if printable / len(data) < 0.20:
return "waveform"
return "unknown"
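# Illustrative sketch (not part of this module): the classification order above,
# exercised on synthetic frames. FakeFrame is a hypothetical stand-in for
# S3Frame with only the two attributes the classifier reads, and MARKERS is a
# shortened copy of the marker tuple.

```python
from dataclasses import dataclass


@dataclass
class FakeFrame:
    """Hypothetical stand-in for S3Frame (page_key and data only)."""
    page_key: int
    data: bytes


MARKERS = (b"Project:", b"Client:")  # subset of the real marker list


def printable_fraction(data: bytes) -> float:
    """Share of bytes in the printable ASCII range 0x20..0x7E."""
    if not data:
        return 0.0
    return sum(1 for b in data if 32 <= b < 127) / len(data)


def classify(frame: FakeFrame) -> str:
    # Checks run in priority order; the first match wins.
    if frame.page_key == 0x0000:
        return "terminator"
    if b"STRT\xff\xfe" in frame.data:
        return "probe_or_strt"
    if any(m in frame.data for m in MARKERS):
        return "metadata"
    if frame.data and printable_fraction(frame.data) < 0.20:
        return "waveform"
    return "unknown"


samples = [
    FakeFrame(0x0000, b""),
    FakeFrame(0x0010, b"\x00\x00STRT\xff\xfe\x01\x11\x00\x00"),
    FakeFrame(0x1002, b"Project: Quarry\x00"),
    FakeFrame(0x0010, bytes([0x80, 0x01, 0xFF, 0x00] * 8)),
    FakeFrame(0x0010, b"hello world"),
]
for s in samples:
    print(hex(s.page_key), classify(s))
```

# Note that a mostly-ASCII frame without any marker falls through to "unknown"
# rather than "metadata".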
# ── Waveform file writer ───────────────────────────────────────────────────────────
def write_blastware_file(
event: Event,
a5_frames: list[S3Frame],
path: Union[str, Path],
) -> None:
"""
Write a Blastware waveform file from a downloaded event.
Args:
event: Event object (populated by get_events() or download_waveform()).
Used for the STRT record (key, rectime) and footer timestamps.
a5_frames: Complete A5 frame list INCLUDING the terminator frame
(page_key=0x0000). Pass include_terminator=True to
read_bulk_waveform_stream() when collecting frames.
Must have at least 2 frames (probe + terminator).
path: Destination file path. Parent directory must exist.
Extension should be set via blastware_filename().
File layout:
[22B header] [21B STRT] [body bytes] [26B footer]
Raises:
ValueError: if a5_frames is empty or the extracted STRT record is not 21 bytes.
OSError: if the file cannot be written.
Waveform body reconstruction confirmed correct against M529LIY6 (example event), 2026-04-21.
"""
if not a5_frames:
raise ValueError("a5_frames must not be empty")
path = Path(path)
# ── Extract STRT record from probe frame ────────────────────────────────
# The STRT record (21 bytes) lives verbatim inside A5[0].data[7:].
# It is stored as-is in the waveform file — do NOT reconstruct it from Event
# fields, as bytes [10:14] and [14:20] contain device-specific values
# (not simply key4 repeated or zero-padded). Confirmed 2026-04-21.
#
# STRT layout (21 bytes, observed in M529LIY6 files):
# [0:4] b'STRT'
# [4:6] 0xff 0xfe (fixed)
# [6:10] key4 (event key)
# [10:14] device-specific field (NOT a key4 repeat)
# [14:20] device-specific fields (NOT zeros)
# [20] rectime uint8 seconds
# Extract STRT from the DLE-stripped probe frame.
#
# frame.data[7:] is the raw wire representation; it may contain DLE+{02,03,04}
# inner-frame pairs that S3FrameParser preserves as two literal bytes. The
# Blastware file stores the stripped form, so we must strip before extracting.
#
# Example (M529LK0Y, 2026-04-21): STRT contains value 0x02 encoded as [10 02]
# on the wire. Without stripping, STRT is 22 raw bytes → write_blastware_file writes the
# DLE prefix into the file AND begins the body 1 byte too early (probe_skip off
# by 1). Stripping fixes both.
#
# probe_skip must be computed in the RAW frame.data domain (it is used as the
# `skip` argument to _frame_body_bytes which operates on raw frame.data).
# We walk the raw bytes counting stripped bytes until we have passed
# strt_pos + 21 stripped bytes, giving the raw offset of the first body byte.
w0_raw = bytes(a5_frames[0].data[7:])
w0_stripped = _strip_inner_frame_dles(w0_raw)
strt_pos_stripped = w0_stripped.find(b"STRT")
if strt_pos_stripped >= 0:
strt = bytes(w0_stripped[strt_pos_stripped : strt_pos_stripped + 21])
# Walk raw bytes to find the raw-domain end of the STRT (= body start).
target_stripped = strt_pos_stripped + 21
stripped_so_far = 0
raw_i = 0
while stripped_so_far < target_stripped and raw_i < len(w0_raw):
if (w0_raw[raw_i] == 0x10
and raw_i + 1 < len(w0_raw)
and w0_raw[raw_i + 1] in {0x02, 0x03, 0x04}):
raw_i += 2 # DLE pair → 1 stripped byte, 2 raw bytes
else:
raw_i += 1 # normal byte → 1 stripped byte, 1 raw byte
stripped_so_far += 1
probe_skip = 7 + raw_i # raw bytes to skip: 7 header + raw STRT length
else:
# Fallback: construct a minimal STRT if probe frame lacks it
key4 = getattr(event, "_waveform_key", None) or bytes(4)
rectime = event.rectime_seconds if event.rectime_seconds is not None else 0
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
probe_skip = 7 + 21
log.debug(
"write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
"probe_data_len=%d strt_hex=%s",
strt_pos_stripped,  # bytes.find() already returns -1 when STRT is absent
probe_skip,
len(a5_frames[0].data),
strt.hex() if len(strt) >= 4 else "(short)",
)
if len(strt) != 21:
raise ValueError(f"STRT record must be 21 bytes, got {len(strt)}")
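# Illustrative sketch (not part of this module): the stripped-to-raw offset walk
# used for probe_skip above, on a synthetic buffer. strip_inner_dles is a
# hypothetical stand-in for _strip_inner_frame_dles and assumes that helper
# simply drops a 0x10 byte that precedes 0x02, 0x03 or 0x04.

```python
DLE = 0x10
ESCAPED = {0x02, 0x03, 0x04}


def strip_inner_dles(raw: bytes) -> bytes:
    """Assumed behaviour: [10 02] -> [02], [10 03] -> [03], [10 04] -> [04]."""
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i] == DLE and i + 1 < len(raw) and raw[i + 1] in ESCAPED:
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return bytes(out)


def raw_offset_for(raw: bytes, stripped_count: int) -> int:
    """Raw index reached after consuming `stripped_count` stripped bytes."""
    raw_i = 0
    seen = 0
    while seen < stripped_count and raw_i < len(raw):
        if raw[raw_i] == DLE and raw_i + 1 < len(raw) and raw[raw_i + 1] in ESCAPED:
            raw_i += 2  # DLE pair: 2 raw bytes, 1 stripped byte
        else:
            raw_i += 1
        seen += 1
    return raw_i


# Synthetic 21-byte STRT whose key contains 0x02, sent as [10 02] on the wire.
raw = b"STRT\xff\xfe" + b"\x01\x11\x10\x02\x00" + bytes(11) + b"BODY"
stripped = strip_inner_dles(raw)
body_start = raw_offset_for(raw, 21)
print(len(stripped[:21]), body_start, raw[body_start:])
```

# Here the STRT occupies 22 raw bytes but 21 stripped bytes, so the body starts
# at raw offset 22; skipping only 21 raw bytes would start the body one byte
# early, which is the failure mode described in the comments above.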
# ── Build waveform file header ─────────────────────────────────────────────────────
header = _FILE_HEADER_PREFIX + _WAVEFORM_TYPE_TAG
assert len(header) == _WAVEFORM_HEADER_SIZE, f"Waveform header must be {_WAVEFORM_HEADER_SIZE} bytes"
# ── Build body from A5 frames ────────────────────────────────────────────
# The waveform body is reconstructed from ALL A5 frames (data + terminator).
# The terminator frame's contribution includes the 26-byte footer at its end.
#
# Reconstruction layout (confirmed from M529LIY6 captures, 2026-04-21):
# all_bytes = contributions from A5[0..N] + terminator_contribution
# body = all_bytes[:-26] (everything except the last 26 bytes)
# footer = all_bytes[-26:] (last 26 bytes = the waveform file footer)
#
# The footer bytes come directly from the terminator frame's inner content —
# using them verbatim ensures timestamps match the device's recorded values.
# Separate terminator from data frames.
# TERM detection (v0.14.0): the last frame is treated as the terminator
# unless its page_key is 0x0010 (sample marker).
# NOTE: the earlier guidance here was to search from the FRONT for the first
# frame with page_key == 0x0000 and never to use a5_frames[-1], because
# stray frames from a subsequent event (a known get_events side-effect)
# could leave a non-terminator in last position. The code below does not
# apply that guard; it relies on the last-frame rule only.
term_idx: Optional[int] = None
if a5_frames and a5_frames[-1].page_key != 0x0010:
term_idx = len(a5_frames) - 1
if term_idx is not None:
body_frames = a5_frames[:term_idx]
term_frame = a5_frames[term_idx]
else:
body_frames = a5_frames
term_frame = None
# Frame contribution loop (v0.14.0 BW-exact walk).
# Skip values:
# probe (fi=0): probe_skip
# meta@0x1002 (fi=1): 13 (6-byte inner header)
# meta@0x1004 (fi=2): 13 (6-byte inner header)
# sample chunks (fi=3+): 12 (5-byte inner header)
last_fi = len(body_frames) - 1
log.debug(
"write_blastware_file: %d body_frames last_fi=%d",
len(body_frames), last_fi,
)
all_bytes = bytearray()
for fi, frame in enumerate(body_frames):
if fi == 0:
skip = probe_skip
elif fi in (1, 2):
skip = 13 # metadata pages
else:
skip = 12 # sample chunks
contribution = _frame_body_bytes(frame, skip)
log.debug("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
fi, skip, len(frame.data), len(contribution))
all_bytes.extend(contribution)
# Terminator contributes its content, which ends with the 26-byte footer.
# skip=11 (not 12) because the terminator's inner frame header is 4 bytes,
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
if term_frame is not None:
term_contribution = _frame_body_bytes(term_frame, 11)
log.debug(
"write_blastware_file: term_frame data_len=%d skip=11 "
"contribution_len=%d first8=%s",
len(term_frame.data),
len(term_contribution),
term_contribution[:8].hex() if len(term_contribution) >= 8 else term_contribution.hex(),
)
all_bytes.extend(term_contribution)
log.debug(
"write_blastware_file: all_bytes total=%d last28=%s",
len(all_bytes),
bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
)
# NOTE: The "duplicate header+STRT strip" logic from v0.13.x has been
# REMOVED in v0.14.2. Under the v0.14.0 BW-exact 5A walk, body assembly
# is just contiguous concatenation of frame contributions in stream order
# (probe → meta@0x1002 → meta@0x1004 → samples → TERM), exactly as BW
# writes its files. The previous strip was matching the `00 12 03 00 STRT`
# byte sequence in legitimate waveform data — sample chunks at counter
# 0x1000 and beyond often contain those bytes coincidentally — and
# zeroing 25 bytes of valid samples per match. Compared to a known-good
# BW reference for the same 3-sec event 0, the strip introduced 26 bytes
# of zeros that BW did not have, then propagated alignment differences
# through the rest of the body. See decode_test/5-1-26/bw vs SFM diff
# at file[0x1012..0x102B] (2026-05-04 analysis).
# Find the first valid 0e 08 footer marker (v0.14.0).
footer_pos = -1
pos = 0
while True:
pos = all_bytes.find(b"\x0e\x08", pos)  # bytearray.find avoids a full copy per iteration
if pos < 0 or pos + 26 > len(all_bytes):
break
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
if 2015 <= yr <= 2050:
footer_pos = pos
break
pos += 1
if footer_pos >= 0:
body = bytes(all_bytes[:footer_pos])
footer = bytes(all_bytes[footer_pos:footer_pos + 26])
log.debug(
"write_blastware_file: real 0e 08 footer at all_bytes[%d]; "
"truncating %d post-footer bytes",
footer_pos, len(all_bytes) - footer_pos - 26,
)
elif len(all_bytes) >= 26:
body = bytes(all_bytes[:-26])
footer = bytes(all_bytes[-26:])
else:
body = bytes(all_bytes)
start_dt = _ts_from_model(event.timestamp)
stop_dt: Optional[datetime.datetime] = None
if start_dt is not None and event.rectime_seconds:
stop_dt = start_dt + datetime.timedelta(seconds=event.rectime_seconds)
footer = (
b"\x0e\x08"
+ _encode_ts_be(start_dt)
+ _encode_ts_be(stop_dt)
+ b"\x00\x01\x00\x02\x00\x00"
+ b"\x00\x00"
)
# ── Write file ───────────────────────────────────────────────────────────
with open(path, "wb") as f:
f.write(header)
f.write(strt)
f.write(body)
f.write(footer)
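# Illustrative sketch (not part of this module): the footer scan above as a
# standalone function. find_footer is a hypothetical name; the 26-byte size,
# the 0e 08 marker, the year position (bytes 4..5 after the marker, big-endian)
# and the 2015..2050 plausibility window are all taken from the loop above.

```python
def find_footer(buf: bytes, size: int = 26, year_range=(2015, 2050)) -> int:
    """Return the offset of the first plausible 0e 08 footer, or -1."""
    pos = 0
    while True:
        pos = buf.find(b"\x0e\x08", pos)
        if pos < 0 or pos + size > len(buf):
            return -1  # no marker left, or not enough room for a full footer
        year = (buf[pos + 4] << 8) | buf[pos + 5]
        if year_range[0] <= year <= year_range[1]:
            return pos
        pos += 1  # marker bytes occurred by chance in sample data; keep looking


# A decoy marker followed by zeros (year 0) must be skipped.
decoy = b"\x0e\x08" + bytes(24)
real = b"\x0e\x08\x00\x00" + (2026).to_bytes(2, "big") + bytes(20)
buf = b"\xaa" * 5 + decoy + real + b"\xbb" * 3
print(find_footer(buf))
```

# The year check is what lets the scan ignore 0e 08 byte pairs that appear
# coincidentally inside waveform samples.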
def read_blastware_file(path: Union[str, Path]) -> Event:
"""
Parse a Blastware waveform file into an Event object.
NOT YET IMPLEMENTED.
Args:
path: Path to the waveform file.
Returns:
Event object with waveform data populated.
Raises:
NotImplementedError: always (pending implementation).
"""
raise NotImplementedError("read_blastware_file() is not yet implemented")
# ── MLG file writer ───────────────────────────────────────────────────────────
def _build_mlg_header(serial: str) -> bytes:
"""
Build the 308-byte MLG file header.
Header structure (confirmed from BE11529.MLG binary inspection):
Offset 0x00: 10 00 01 80 00 00 Instantel\x00 07 2c 22 01 0e a0 (22B)
Offset 0x16: ... (16B unknown; observed as zeros in BE11529.MLG)
Offset 0x2A: serial number (8 bytes, null-padded ASCII)
... rest zero-padded to 308 bytes
The serial string "BE11529" appears at offset 0x2A (42 decimal).
"""
buf = bytearray(_MLG_HEADER_SIZE)
# Common prefix + MLG type tag
prefix = _FILE_HEADER_PREFIX + _MLG_TYPE_TAG # 22 bytes
buf[0:len(prefix)] = prefix
# Serial number at offset 0x2A
serial_bytes = serial.encode("ascii", errors="replace")[:8]
serial_padded = serial_bytes.ljust(8, b"\x00")
buf[0x2A : 0x2A + 8] = serial_padded
return bytes(buf)
def _build_mlg_record(
entry: MonitorLogEntry,
serial: str,
) -> bytes:
"""
Build one 292-byte MLG record from a MonitorLogEntry.
Record layout (confirmed from BE11529.MLG binary inspection):
[0:2] CRC 2-byte CRC (algorithm unknown; written as 0x0000)
[2:6] marker 22 01 0e 80
[6:14] ts1 8B big-endian start timestamp
[14:22] ts2 8B big-endian stop timestamp
[22:26] flags 4B record flags (see MLG_FLAGS_* constants)
[26:36] serial 10B null-padded serial number
[36:] text for triggered events: [0x08][8B ts1_copy]["Geo: X.XXX in/s"]
for monitoring intervals: b"" or minimal separator
[... zero-padded to 292 bytes]
Flags based on entry type:
- MonitorLogEntry with start_time only (no stop_time): MLG_FLAGS_START_ONLY
- MonitorLogEntry with both times and geo_threshold_ips set: MLG_FLAGS_TRIGGER
- MonitorLogEntry with both times (monitoring interval): MLG_FLAGS_INTERVAL
The triggered-event text block (flags = MLG_FLAGS_TRIGGER):
[0x08] [ts1: 8B] [ASCII "Geo: X.XXX in/s\x00"]
Confirmed from BE11529.MLG records at offset 0x0134 and 0x0258.
"""
buf = bytearray(_MLG_RECORD_SIZE)
start_dt = (
datetime.datetime(
entry.start_time.year, entry.start_time.month, entry.start_time.day,
entry.start_time.hour, entry.start_time.minute, entry.start_time.second,
)
if entry.start_time else None
)
stop_dt = (
datetime.datetime(
entry.stop_time.year, entry.stop_time.month, entry.stop_time.day,
entry.stop_time.hour, entry.stop_time.minute, entry.stop_time.second,
)
if entry.stop_time else None
)
# [0:2] CRC placeholder
buf[0:2] = b"\x00\x00"
# [2:6] Record marker
buf[2:6] = _MLG_RECORD_MARKER
# [6:14] ts1
buf[6:14] = _encode_ts_be(start_dt)
# [14:22] ts2
buf[14:22] = _encode_ts_be(stop_dt)
# [22:26] flags
if stop_dt is None:
flags = MLG_FLAGS_START_ONLY
elif entry.geo_threshold_ips is not None:
flags = MLG_FLAGS_TRIGGER
else:
flags = MLG_FLAGS_INTERVAL
buf[22:26] = flags
# [26:36] serial (10B null-padded)
serial_bytes = serial.encode("ascii", errors="replace")[:10]
buf[26 : 26 + len(serial_bytes)] = serial_bytes
# [36:] text content
pos = 36
if flags == MLG_FLAGS_TRIGGER:
# Extra ts1 copy: [0x08][ts1: 8B]
buf[pos] = 0x08
pos += 1
buf[pos : pos + 8] = _encode_ts_be(start_dt)
pos += 8
if entry.geo_threshold_ips is not None:
geo_text = f"Geo: {entry.geo_threshold_ips:.3f} in/s\x00".encode("ascii")
buf[pos : pos + len(geo_text)] = geo_text
pos += len(geo_text)
return bytes(buf)
def write_mlg(
entries: list[MonitorLogEntry],
serial: str,
path: Union[str, Path],
) -> None:
"""
Write a Blastware .MLG monitor log file.
Args:
entries: List of MonitorLogEntry objects (from get_monitor_log_entries()).
Each entry produces one 292-byte record in the file.
serial: Device serial number string (e.g. "BE11529").
Written to the file header and each record.
path: Destination file path. Extension is not enforced; use ".MLG".
File layout:
[308B header] [N × 292B records]
Note: The 2-byte CRC at the start of each record is written as 0x0000.
The CRC algorithm is unknown (see module docstring).
Raises:
OSError: if the file cannot be written.
"""
path = Path(path)
header = _build_mlg_header(serial)
with open(path, "wb") as f:
f.write(header)
for entry in entries:
record = _build_mlg_record(entry, serial)
f.write(record)
def read_mlg(path: Union[str, Path]) -> list[MonitorLogEntry]:
"""
Parse a Blastware .MLG file into a list of MonitorLogEntry objects.
NOT YET IMPLEMENTED.
Args:
path: Path to the .MLG file.
Returns:
List of MonitorLogEntry objects.
Raises:
NotImplementedError: always (pending implementation).
"""
raise NotImplementedError("read_mlg() is not yet implemented")
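# Illustrative sketch (not the planned read_mlg implementation): slicing an MLG
# blob into its fixed-size records using only the layout documented in
# _build_mlg_record above. split_mlg_records is a hypothetical name, the marker
# bytes are copied from the docstring, and timestamps are left as raw bytes
# because the timestamp encoding is not shown in this section.

```python
HEADER_SIZE = 308   # from the file layout above
RECORD_SIZE = 292
RECORD_MARKER = bytes.fromhex("22010e80")  # per the record-layout docstring


def split_mlg_records(blob: bytes) -> list:
    """Return one dict of raw fields per record that carries the marker."""
    body = blob[HEADER_SIZE:]
    records = []
    for off in range(0, len(body) - RECORD_SIZE + 1, RECORD_SIZE):
        rec = body[off:off + RECORD_SIZE]
        if rec[2:6] != RECORD_MARKER:
            continue  # not a recognisable record; skip rather than guess
        records.append({
            "crc": rec[0:2],
            "ts1_raw": rec[6:14],
            "ts2_raw": rec[14:22],
            "flags": rec[22:26],
            "serial": rec[26:36].rstrip(b"\x00").decode("ascii", errors="replace"),
        })
    return records


# Synthetic file: header, one marked record, one all-zero block.
rec = bytearray(RECORD_SIZE)
rec[2:6] = RECORD_MARKER
rec[26:33] = b"BE11529"
blob = bytes(HEADER_SIZE) + bytes(rec) + bytes(RECORD_SIZE)
print(split_mlg_records(blob))
```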
+361 -98
@@ -449,7 +449,7 @@ class MiniMateClient:
proto.confirm_erase_all()
log.info("delete_all_events: erase confirmed — device memory cleared")
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None) -> list[Event]:
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None, skip_waveform_for_events: Optional[dict] = None, extra_chunks_after_metadata: int = 1) -> list[Event]:
"""
Download all stored events from the device using the confirmed
1E 0A 0C 5A 1F event-iterator protocol.
@@ -497,37 +497,24 @@ class MiniMateClient:
events: list[Event] = []
idx = 0
# Legacy bare-key skip set is deprecated: the device's key counter
# resets to 0x01110000 after every memory erase, so a key in this set
# cannot be trusted to identify the same physical event across erases.
# If a caller still passes it, log a warning and ignore — full
# downloads will run for every event so the bug never silently bites.
if skip_waveform_for_keys:
log.warning(
"get_events: skip_waveform_for_keys is deprecated and unsafe "
"(post-erase key reuse); ignoring %d entries. Use "
"skip_waveform_for_events={key: timestamp_iso} instead.",
len(skip_waveform_for_keys),
)
skip_evts: dict[str, str] = dict(skip_waveform_for_events or {})
while data8[4:8] != b"\x00\x00\x00\x00":
cur_key = key4 # key for this event's 0A/1E-arm/0C/5A calls
log.info("get_events: record %d key=%s", idx, cur_key.hex())
# Fast-advance path: if this key is already downloaded, skip
# 1E-arm/0C/POLL/5A entirely. Only 0A + 1F(browse) are needed
# to advance the device's internal pointer to the next event.
# This is identical to the browse-mode walk in count_events().
if skip_waveform_for_keys and cur_key.hex() in skip_waveform_for_keys:
log.debug("get_events: key=%s already seen -- fast-advance only", cur_key.hex())
try:
proto.read_waveform_header(cur_key)
except ProtocolError as exc:
log.warning(
"get_events: 0A failed for key=%s (skip path): %s -- stopping",
cur_key.hex(), exc,
)
break
try:
key4, data8 = proto.advance_event(browse=True)
except ProtocolError as exc:
log.warning(
"get_events: 1F failed for key=%s (skip path): %s -- stopping",
cur_key.hex(), exc,
)
break
idx += 1
if stop_after_index is not None and idx > stop_after_index:
break
continue
ev = Event(index=idx)
ev._waveform_key = cur_key
@@ -574,11 +561,30 @@ class MiniMateClient:
"get_events: 0C failed for key=%s: %s", cur_key.hex(), exc
)
# ── Skip-5A decision based on (key, timestamp) match ──────
# If skip_waveform_for_events maps cur_key.hex() to a non-empty
# ISO timestamp matching what we just read from 0C, this is
# the same physical event we already have on disk — bypass
# the 1F(arm)+POLL+5A bulk download. Otherwise (no entry, or
# timestamp mismatch indicating post-erase reuse) fall through
# to the full download.
expected_ts = skip_evts.get(cur_key.hex(), "")
actual_ts = _event_timestamp_iso(ev)
skip_5a = bool(expected_ts and actual_ts and expected_ts == actual_ts)
if skip_5a:
log.info(
"get_events: key=%s (key, ts=%s) match — skipping 5A bulk download",
cur_key.hex(), actual_ts,
)
arm_key4: Optional[bytes] = None
a5_ok = False
if not skip_5a:
# SUB 1F (download-arm) — send token=0xFE BEFORE POLL+5A to arm the
# device's bulk stream state machine. Cache the returned key as a
# fallback for loop iteration when 5A fails (see iteration block below).
# Confirmed from 4-2-26 capture frames 66-67 (1F before frames 68-73 POLL).
arm_key4: Optional[bytes] = None
try:
arm_key4, _ = proto.advance_event(browse=False) # arm 5A
log.info("get_events: 1F(download) — 5A armed, arm_key=%s", arm_key4.hex())
@@ -597,17 +603,24 @@ class MiniMateClient:
# SUB 5A — bulk waveform stream (uses cur_key, the event set up by 0A+1E+0C).
# By default (full_waveform=False): stop after frame 7 for metadata only.
# When full_waveform=True: fetch all chunks and decode raw ADC samples.
a5_ok = False
#
# Bypassed when skip_5a is True — the event is left with
# _a5_frames=None, which signals to the caller (e.g.
# ach_server.py) that this event was matched by (key, ts) and
# already has a stored .file in the persistent waveform store.
if not skip_5a:
try:
if full_waveform:
log.info(
"get_events: 5A full waveform download for key=%s", cur_key.hex()
)
a5_frames = proto.read_bulk_waveform_stream(
cur_key, stop_after_metadata=False, max_chunks=128
cur_key, stop_after_metadata=False, max_chunks=128,
include_terminator=True,
)
if a5_frames:
a5_ok = True
ev._a5_frames = a5_frames # store for write_blastware_file
_decode_a5_metadata_into(a5_frames, ev)
_decode_a5_waveform(a5_frames, ev)
log.info(
@@ -619,10 +632,14 @@ class MiniMateClient:
"get_events: 5A metadata-only download for key=%s", cur_key.hex()
)
a5_frames = proto.read_bulk_waveform_stream(
cur_key, stop_after_metadata=True
cur_key, stop_after_metadata=True,
include_terminator=True,
extra_chunks_after_metadata=extra_chunks_after_metadata,
max_chunks=128,
)
if a5_frames:
a5_ok = True
ev._a5_frames = a5_frames # store for write_blastware_file
_decode_a5_metadata_into(a5_frames, ev)
log.debug(
"get_events: 5A metadata client=%r operator=%r",
@@ -646,7 +663,14 @@ class MiniMateClient:
# Confirmed from 4-3-26 browse-mode captures: browse=True params
# are correct for multi-event iteration. Conditional logic added
# 2026-04-06 to avoid post-failure state disruption.
if a5_ok:
#
# NEW 2026-05-06: when skip_5a=True we never entered the 5A
# state at all (we read 0A+1E(arm)+0C and chose to bypass).
# 1F(browse) is safe in this scenario — the device's iteration
# pointer is independent of the bulk-stream state machine, and
# we never put it into the half-attempted 5A state that the
# earlier "post-failure 1F disruption" warning is about.
if skip_5a or a5_ok:
# 5A succeeded — use browse 1F for reliable key advancement.
try:
key4, data8 = proto.advance_event(browse=True)
@@ -776,6 +800,39 @@ class MiniMateClient:
else:
log.warning("download_waveform: waveform decode produced no samples")
return a5_frames
def save_blastware_file(self, event: "Event", path: "Union[str, Path]", serial: str) -> None:
"""
Download the full waveform for *event* and save it as a
Blastware-compatible waveform file at *path*.
This is a convenience wrapper that calls download_waveform() (which
performs the complete SUB 5A BULK_WAVEFORM_STREAM download) and then
calls write_blastware_file() from blastware_file.py to encode the result.
Args:
event: Event object with waveform key populated (from get_events()).
path: Destination file path. Caller should use blastware_filename()
to pick the correct name and extension.
serial: Device serial number (e.g. "BE11529"). Intended for use with
blastware_filename(); not used by this method, because the
caller supplies the final path.
"""
from pathlib import Path as _Path
from .blastware_file import write_blastware_file as _write_blastware_file
a5_frames = self.download_waveform(event)
if not a5_frames:
raise RuntimeError(
f"save_blastware_file: no A5 frames received for event#{event.index}"
)
_write_blastware_file(event, a5_frames, path)
log.info(
"save_blastware_file: wrote %s (%d A5 frames)",
path, len(a5_frames),
)
# ── Write commands ────────────────────────────────────────────────────────
def push_config_raw(
@@ -1135,6 +1192,27 @@ class MiniMateClient:
# Pure functions: bytes → model field population.
# Kept here (not in models.py) to isolate protocol knowledge from data shapes.
def _event_timestamp_iso(event: Event) -> str:
"""
Return a stable ISO-8601 string for the event's 0C-derived timestamp,
or "" if the event has no timestamp populated.
The format intentionally matches what `bridges/ach_server.py` writes
into `ach_state.json:downloaded_events[*]` so the (key, ts) compare
in get_events()'s skip path is a simple string equality.
"""
ts = getattr(event, "timestamp", None)
if ts is None:
return ""
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
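# Illustrative sketch (not part of this module): the (key, timestamp) skip
# decision from get_events() as a pure function. should_skip_bulk_download is a
# hypothetical name; the key and timestamps below are made-up example values.

```python
def should_skip_bulk_download(known: dict, key_hex: str, actual_ts_iso: str) -> bool:
    """Skip only when the stored timestamp for this key matches exactly."""
    expected = known.get(key_hex, "")
    # Empty strings on either side mean "unknown", which never counts as a match.
    return bool(expected and actual_ts_iso and expected == actual_ts_iso)


known = {"01110000": "2026-05-06T14:03:11"}

# Same key, same timestamp: same physical event, already on disk.
print(should_skip_bulk_download(known, "01110000", "2026-05-06T14:03:11"))
# Same key, different timestamp: the key was reused after a memory erase.
print(should_skip_bulk_download(known, "01110000", "2026-05-07T09:00:00"))
# Key never seen, or timestamp not decoded: download in full.
print(should_skip_bulk_download(known, "01110400", "2026-05-06T14:03:11"))
print(should_skip_bulk_download(known, "01110000", ""))
```

# Pairing the key with the timestamp is what makes the check safe against the
# post-erase key reset described in the deprecation warning above.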
def _decode_serial_number(data: bytes) -> DeviceInfo:
"""
Decode SUB EA (SERIAL_NUMBER_RESPONSE) payload into a new DeviceInfo.
@@ -1284,28 +1362,54 @@ def _decode_waveform_record_into(data: bytes, event: Event) -> None:
Modifies event in-place.
"""
# ── Record type ───────────────────────────────────────────────────────────
# Decoded from byte[1] (sub_code) first so we can gate timestamp parsing.
# ── Always preserve the raw 210 bytes ─────────────────────────────────────
# The 0C record carries far more than just peaks + project strings:
# ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement, Vector
# Sum Time, MicL Time of Peak, and the per-channel sensor self-check
# results (Test Freq / Ratio / Pass-Fail) all live somewhere in this
# 210-byte block. Their byte offsets are not yet mapped — keeping the
# raw bytes lets us decode those fields offline once we have a paired
# (raw 0C, BW-report) sample to fit against. Cheap to keep around
# (210 bytes per event).
try:
event._raw_record = bytes(data[:210])
except Exception:
pass
# ── Record type + format detection ────────────────────────────────────────
# `record_type` is the user-facing label ("Waveform" for any triggered
# event regardless of timestamp-header layout). `fmt` is the internal
# format code used to pick the right Timestamp parser; it stays
# internal and doesn't leak to the API / sidecar / UI.
try:
event.record_type = _extract_record_type(data)
except Exception as exc:
log.warning("waveform record type decode failed: %s", exc)
fmt = _detect_record_format(data)
# ── Timestamp ─────────────────────────────────────────────────────────────
# 9-byte format for sub_code=0x10 Waveform records:
# [day][sub_code][month][year:2 BE][unknown][hour][min][sec]
# sub_code=0x10 and sub_code=0x03 have different timestamp byte layouts.
# Both confirmed against Blastware event reports (BE11529, 2026-04-01 and 2026-04-03).
if event.record_type == "Waveform":
# Three timestamp-header layouts have been observed across BE11529
# firmware S338.17 — each picks a different Timestamp parser:
# "single_shot": 9-byte [day][0x10][month][year:2][unk][h][m][s]
# "continuous": 10-byte [0x10][day][0x10][month][year:2][unk][h][m][s]
# "short": 8-byte [day][month][year:2][unk][h][m][s]
# All decoded into the same Timestamp dataclass — only the byte
# offsets differ.
if fmt == "single_shot":
try:
event.timestamp = Timestamp.from_waveform_record(data)
except Exception as exc:
log.warning("waveform record timestamp decode failed: %s", exc)
elif event.record_type == "Waveform (Continuous)":
log.warning("single_shot record timestamp decode failed: %s", exc)
elif fmt == "continuous":
try:
event.timestamp = Timestamp.from_continuous_record(data)
except Exception as exc:
log.warning("continuous record timestamp decode failed: %s", exc)
elif fmt == "short":
try:
event.timestamp = Timestamp.from_short_record(data)
except Exception as exc:
log.warning("short record timestamp decode failed: %s", exc)
# ── Peak values (per-channel PPV + Peak Vector Sum) ───────────────────────
try:
@@ -1324,7 +1428,7 @@ def _decode_waveform_record_into(data: bytes, event: Event) -> None:
log.warning("waveform record project strings decode failed: %s", exc)
def _decode_a5_metadata_into(frames_data: list[bytes], event: Event) -> None:
def _decode_a5_metadata_into(frames_data: list[S3Frame], event: Event) -> None:
"""
Search A5 (BULK_WAVEFORM_STREAM) frame data for event-time metadata strings
and populate event.project_info.
@@ -1352,7 +1456,7 @@ def _decode_a5_metadata_into(frames_data: list[bytes], event: Event) -> None:
Modifies event in-place.
"""
combined = b"".join(frames_data)
combined = b"".join(f.data for f in frames_data)
def _find_string_after(needle: bytes, max_len: int = 64) -> Optional[str]:
pos = combined.find(needle)
@@ -1376,7 +1480,7 @@ def _decode_a5_metadata_into(frames_data: list[bytes], event: Event) -> None:
notes = _find_string_after(b"Extended Notes")
if not any([project, client, operator, location, notes]):
log.debug("a5 metadata: no project strings found in %d frames", len(frames_data))
log.debug("a5 metadata: no project strings found in %d frames (%d bytes)", len(frames_data), len(combined))
return
if event.project_info is None:
@@ -1402,7 +1506,7 @@ def _decode_a5_metadata_into(frames_data: list[bytes], event: Event) -> None:
def _decode_a5_waveform(
frames_data: list[bytes],
frames_data: list[S3Frame],
event: Event,
) -> None:
"""
@@ -1463,7 +1567,7 @@ def _decode_a5_waveform(
return
# ── Parse STRT record from A5[0] ────────────────────────────────────────
w0 = frames_data[0][7:] # db[7:] for A5[0]
w0 = frames_data[0].data[7:] # frame.data[7:] for A5[0]
strt_pos = w0.find(b"STRT")
if strt_pos < 0:
log.warning("_decode_a5_waveform: STRT record not found in A5[0]")
@@ -1479,46 +1583,109 @@ def _decode_a5_waveform(
log.warning("_decode_a5_waveform: STRT record truncated (%dB)", len(strt))
return
total_samples = struct.unpack_from(">H", strt, 8)[0]
pretrig_samples = struct.unpack_from(">H", strt, 16)[0]
rectime_seconds = strt[18]
# STRT byte layout (21 bytes; verified against M529LIY6 reference files
# and re-confirmed against live BE11529 captures, 2026-05-08):
# [0:4] b'STRT'
# [4:6] 0xff 0xfe sentinel
# [6:10] end_key 4-byte BE flash address where event ends
# [10:14] start_key 4-byte BE flash address where event starts
# [14:18] device-specific (semantics not pinned; values vary across events
# and don't hold authoritative total_samples / pretrig)
# [18] 0x46 record-type marker (NOT rectime)
# [19] device-specific
# [20] sometimes rectime, sometimes 0 — not reliable
#
# AUTHORITATIVE values must come from compliance_config (sample_rate,
# record_time) and from end_offset - start_offset arithmetic (event size).
# Earlier code claimed STRT[8:10]=total_samples and STRT[16:18]=pretrig;
# those positions actually overlap end_key low-word and dev-specific bytes
# respectively. We surface the address-derived event size so consumers
# can sanity-check chunk-loop bounds, but `total_samples` per channel must
# be derived externally (sample_rate × record_time, or computed from the
# decoded sample count below).
end_key = strt[6:10]
start_key = strt[10:14]
end_offset_in_strt = (end_key[2] << 8) | end_key[3]
start_offset_in_strt = (start_key[2] << 8) | start_key[3]
is_event_1 = (start_offset_in_strt == 0x0000)
event.total_samples = total_samples
event.pretrig_samples = pretrig_samples
event.rectime_seconds = rectime_seconds
# Don't trust STRT for these — leave them as None so the caller can
# backfill from compliance_config (the authoritative source).
event.total_samples = None
event.pretrig_samples = None
event.rectime_seconds = None
log.debug(
"_decode_a5_waveform: STRT total_samples=%d pretrig=%d rectime=%ds",
total_samples, pretrig_samples, rectime_seconds,
"_decode_a5_waveform: STRT start_key=%s end_key=%s "
"start_off=0x%04X end_off=0x%04X is_event_1=%s "
"dev-specific[14:18]=%s strt[20]=0x%02X",
start_key.hex(), end_key.hex(),
start_offset_in_strt, end_offset_in_strt, is_event_1,
strt[14:18].hex(), strt[20],
)
# ── Collect per-frame waveform bytes with global offset tracking ─────────
# global_offset is the cumulative byte count across all frames, used to
# compute the channel alignment at each frame boundary.
#
# Frame layout under the v0.14.0+ walk:
# frames_data[0] = probe response (page_addr 0x0000;
# contains STRT + post-STRT data)
# frames_data[1..2] = (event 1 only) metadata pages
# page_addr = 0x1002 / 0x1004
# frames_data[mid] = sample chunks at flash addresses
# 0x0600, 0x0800, … (page_addr in
# {0x0600..0x1FFE})
# frames_data[last] = TERM response (page_key=0x0000)
#
# We identify metadata pages by their PAGE ADDRESS at db.data[4:6] (the
# 2-byte counter the device echoes back), NOT by content scan. An earlier
# needle-based detection (b"Project:", b"Client:", etc.) was the wrong
# layer of abstraction:
# • The actual metadata pages 0x1002 / 0x1004 do NOT contain ASCII
# project strings on this firmware (S338.17 / BE11529).
# • The strings physically live at flash address 0x1600 — which falls
# inside the sample-chunk address range. Skipping that frame would
# drop a real sample chunk.
# BW handles the "samples region happens to contain string bytes" case
# by just rendering the bytes verbatim; we do the same.
_METADATA_PAGES = (b"\x10\x02", b"\x10\x04")
chunks: list[tuple[int, bytes]] = [] # (frame_idx, waveform_bytes)
global_offset = 0
for fi, db in enumerate(frames_data):
w = db[7:]
page_addr = db.data[4:6] if len(db.data) >= 6 else b""
w = db.data[7:] # frame.data[7:]
# A5[0]: waveform begins after the 21-byte STRT record and 6-byte preamble.
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
# A5[0]: probe response. Two cases:
# - Event 1 (start_offset_in_strt == 0x0000): the bytes after STRT
# are the device's *pre-event reserved area* (flash 0x0046 to
# 0x0600), NOT samples. We must skip them; samples begin at
# the first dedicated chunk frame at counter=0x0600.
# - Event N (continuation, start_offset != 0x0000): the bytes after
# the STRT record ARE the first slice of real samples for the
# event (BW's chunk loop addresses the probe as a sample chunk).
if fi == 0:
sp = w.find(b"STRT")
if sp < 0:
continue
if is_event_1:
# No usable samples in the probe — pre-event reserved bytes.
continue
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
wave = w[sp + 27 :]
# Frame 7 carries event-time metadata strings ("Project:", "Client:", …)
# and no waveform ADC data.
elif fi == 7:
# Skip the dedicated metadata pages (event 1 only): page_addr 0x1002 / 0x1004.
elif page_addr in _METADATA_PAGES:
log.debug(
"_decode_a5_waveform: skipping metadata page fi=%d page_addr=%s",
fi, page_addr.hex(),
)
continue
# Terminator frames have page_key=0x0000 and are excluded upstream
# (read_bulk_waveform_stream returns early on page_key==0).
# No hardcoded frame-index skip here — all non-metadata frames are data.
# Sample chunk (or TERM): strip the 8-byte per-frame header.
else:
# Strip the 8-byte per-frame header (ctr + 6 zero bytes)
if len(w) < 8:
continue
wave = w[8:]
@@ -1532,10 +1699,8 @@ def _decode_a5_waveform(
total_bytes = global_offset
n_sets = total_bytes // 8
log.debug(
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets "
"(%d of %d expected; %.0f%%)",
len(chunks), total_bytes, n_sets, n_sets, total_samples,
100.0 * n_sets / total_samples if total_samples else 0,
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets",
len(chunks), total_bytes, n_sets,
)
if n_sets == 0:
@@ -1593,38 +1758,85 @@ def _decode_a5_waveform(
"Tran": tran,
"Vert": vert,
"Long": long_,
"Mic": mic,
"MicL": mic,
}
def _detect_record_format(data: bytes) -> Optional[str]:
"""
Detect which timestamp-header format a 210-byte 0C waveform record uses.
THREE formats observed on BE11529 firmware S338.17:
"single_shot" 9-byte header:
[day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
sub_code=0x10 at byte [1]. Year at [3:5].
"continuous" 10-byte header:
[0x10] [day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
marker 0x10 at byte [0] AND byte [2]. Year at [4:6].
"short" 8-byte header (NEW 2026-05-01):
[day] [month] [year_BE:2] [unknown] [hour] [min] [sec]
No marker bytes. Year at [2:4].
Each format has the year (uint16 BE) at a UNIQUE byte position, so we can
disambiguate by scanning each candidate position and picking the one
where the year falls in a sane range (2015..2050).
Returns "single_shot" / "continuous" / "short" or None if no format matches.
"""
if len(data) < 8:
return None
def _sane_year(hi: int, lo: int) -> bool:
y = (hi << 8) | lo
return 2015 <= y <= 2050
# Order matters: prefer formats with stronger marker-byte evidence first.
if data[1] == 0x10 and len(data) >= 9 and _sane_year(data[3], data[4]):
return "single_shot"
if (data[0] == 0x10 and data[2] == 0x10
and len(data) >= 10 and _sane_year(data[4], data[5])):
return "continuous"
if _sane_year(data[2], data[3]):
return "short"
return None
def _extract_record_type(data: bytes) -> Optional[str]:
"""
Decode the recording mode from byte[1] of the 210-byte waveform record.
Return a user-facing name for a waveform record. All three internal
timestamp-header layouts represent the *same* user concept (a
triggered seismic event), so they all surface as just "Waveform".
Byte[1] is the sub-record code that immediately follows the day byte in the
9-byte timestamp header at the start of each waveform record:
[day:1] [sub_code:1] [month:1] [year:2 BE] ...
The internal format code is preserved for parsing logic (timestamp
decoder selection) but doesn't leak into the API / UI / sidecar.
Callers that need the raw layout can call `_detect_record_format`
directly.
Confirmed codes (2026-04-01):
0x10 "Waveform" (continuous / single-shot mode)
Histogram mode code is not yet confirmed; a histogram event must be
captured with debug=true to identify it. Returns None for unknown codes.
Background: across BE11529 firmware S338.17 we've observed three
different byte layouts for the timestamp header at the start of the
0C record (8 / 9 / 10 bytes, distinguished by the position of the
BE-encoded year and the presence of `0x10` marker bytes). An older
revision of this code labelled them "Waveform" / "Waveform
(Continuous)" / "Waveform (Short)", which created the false
impression that there were three distinct event "types" the user
could configure. In reality the user only ever picks Single Shot
vs Continuous vs Histogram in the compliance config; the byte
layout is a firmware-internal detail that doesn't always correlate
with that choice.
"""
if len(data) < 2:
return None
code = data[1]
if code == 0x10:
fmt = _detect_record_format(data)
if fmt in ("single_shot", "continuous", "short"):
return "Waveform"
if code == 0x03:
# Continuous mode waveform record (confirmed by user — NOT a monitor log).
# The byte layout differs from 0x10 single-shot records: the timestamp
# fields decode as garbage under the 0x10 waveform layout.
# TODO: confirm correct timestamp layout for 0x03 records from a known-time event.
return "Waveform (Continuous)"
log.warning("_extract_record_type: unknown sub_code=0x%02X", code)
return f"Unknown(0x{code:02X})"
if len(data) >= 3:
log.warning(
"_extract_record_type: unrecognized header: data[0:3]=%02X %02X %02X",
data[0], data[1], data[2],
)
return f"Unknown({data[0]:02X}.{data[1]:02X}.{data[2]:02X})"
return None
def _extract_peak_floats(data: bytes) -> Optional[PeakValues]:
"""
@@ -1770,10 +1982,13 @@ def _encode_compliance_config(
DLE-jitter shifts):
Anchor: b'\\xbe\\x80\\x00\\x00\\x00\\x00' (confirmed stable, both BE11529 and BE18189)
recording_mode uint8 at anchor_pos - 7 (write payload)
recording_mode uint8 at anchor_pos - 8 (BOTH read and write)
Values: 0x00=Single Shot, 0x01=Continuous, 0x03=Histogram, 0x04=Histogram+Continuous
NOTE: In the E5 read response (decode) field is at anchor_pos - 8 due to an
extra 0x10 byte at read anchor_pos - 7. Write payload has no extra byte.
NOTE: The byte at anchor_pos - 7 is always 0x10 (a DLE marker regenerated by
device firmware in every E5 response). It must NOT be overwritten during
write; doing so causes anchor drift (+1 per write cycle).
CORRECTION 2026-04-21: previous doc stated anchor-7 for write; empirically
confirmed wrong: writing to anchor-7 shifts the anchor by 1 on every cycle.
sample_rate uint16 BE at anchor_pos - 6
histogram_interval_sec uint16 BE at anchor_pos - 4 (seconds; mode-gated to Histogram/Histogram+Continuous)
Valid values: 2, 5, 15, 60, 300, 900 (= 2s, 5s, 15s, 1m, 5m, 15m)
@@ -1833,13 +2048,40 @@ def _encode_compliance_config(
_ANC = b'\xbe\x80\x00\x00\x00\x00'
_anc = buf.find(_ANC, 0, 150)
# Log anchor position every time so we can detect unexpected shifts due to
# DLE jitter or firmware differences. Expected position is ~15.
if _anc < 0:
log.warning(
"_encode_compliance_config: anchor NOT FOUND in cfg[0:150] "
"(buf len=%d) — all anchor-relative writes will be skipped",
len(buf),
)
else:
log.info(
"_encode_compliance_config: anchor at cfg[%d] buf_len=%d "
"(recording_mode@%d DLE_marker@%d sample_rate@%d:%d "
"histogram_interval@%d:%d record_time@%d:%d)",
_anc, len(buf),
_anc - 8,
_anc - 7,
_anc - 6, _anc - 4,
_anc - 4, _anc - 2,
_anc + 6, _anc + 10,
)
if recording_mode is not None:
if _anc < 7:
if _anc < 8:
log.warning("_encode_compliance_config: anchor not found — cannot write recording_mode")
else:
buf[_anc - 7] = recording_mode & 0xFF
# Write to anchor-8, same physical position as the E5 read format.
# The byte at anchor-7 is a DLE marker (0x10) that the device firmware
# regenerates in every E5 response — it must NOT be overwritten.
# Writing to anchor-7 causes the device to add an extra byte on the
# next read-back, drifting the anchor by +1 on every write cycle.
# (CLAUDE.md "anchor-7 write" was incorrect — confirmed 2026-04-21)
buf[_anc - 8] = recording_mode & 0xFF
log.debug("_encode_compliance_config: recording_mode=0x%02X -> offset %d",
recording_mode, _anc - 7)
recording_mode, _anc - 8)
if sample_rate is not None:
if _anc < 6:
@@ -2001,6 +2243,27 @@ def _decode_compliance_config_into(data: bytes, info: DeviceInfo) -> None:
# _anchor + 6 : record_time (float32 BE)
_ANCHOR = b'\xbe\x80\x00\x00\x00\x00'
_anchor = data.find(_ANCHOR, 0, 150)
# Log anchor position on every decode so we can compare read vs write and
# catch unexpected shifts from DLE jitter or firmware differences.
# Expected position is ~15 for the E5 read payload (anchor - 8 = recording_mode).
if _anchor < 0:
log.warning(
"_decode_compliance_config_into: anchor NOT FOUND in data[0:150] (len=%d)",
len(data),
)
else:
log.info(
"_decode_compliance_config_into: anchor at data[%d] data_len=%d "
"(expected ~15; recording_mode@%d sample_rate@%d:%d "
"histogram_interval@%d:%d record_time@%d:%d)",
_anchor, len(data),
_anchor - 8,
_anchor - 6, _anchor - 4,
_anchor - 4, _anchor - 2,
_anchor + 6, _anchor + 10,
)
if _anchor >= 8 and _anchor + 10 <= len(data):
try:
config.recording_mode = data[_anchor - 8]
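The anchor-relative layout described in these hunks can be exercised offline. The following is a minimal sketch, not the project's decoder: `decode_anchor_fields` and the synthetic `payload` are hypothetical names, built only from the offsets documented above (recording_mode at anchor-8, the 0x10 DLE marker at anchor-7, sample_rate at anchor-6, histogram_interval at anchor-4, record_time as float32 BE at anchor+6). The two bytes at anchor-2 are not described in the source and are zero-filled here as an assumption.

```python
import struct

# Anchor bytes quoted from the diff above.
ANCHOR = b"\xbe\x80\x00\x00\x00\x00"


def decode_anchor_fields(data: bytes):
    """Hypothetical helper: read the anchor-relative fields, or return None
    when the anchor is missing or too close to either end of the buffer."""
    pos = data.find(ANCHOR, 0, 150)
    if pos < 8 or pos + 10 > len(data):
        return None
    return {
        "anchor_pos": pos,
        "recording_mode": data[pos - 8],      # same offset for read and write
        "dle_marker": data[pos - 7],          # 0x10, must not be overwritten
        "sample_rate": struct.unpack_from(">H", data, pos - 6)[0],
        "histogram_interval_sec": struct.unpack_from(">H", data, pos - 4)[0],
        "record_time": struct.unpack_from(">f", data, pos + 6)[0],
    }


# Synthetic payload (an assumption for illustration, not a captured E5 frame).
payload = (
    bytes(7)                     # leading bytes, contents irrelevant here
    + bytes([0x01])              # recording_mode = Continuous
    + bytes([0x10])              # DLE marker at anchor-7
    + struct.pack(">H", 1024)    # sample_rate
    + struct.pack(">H", 60)      # histogram_interval_sec
    + bytes(2)                   # anchor-2..anchor-1: undocumented, zero-filled
    + ANCHOR
    + struct.pack(">f", 3.0)     # record_time
)

fields = decode_anchor_fields(payload)
```

With this payload the anchor lands at index 15, which matches the "expected ~15" position logged by the decoder; a shifted anchor after a write cycle would show up as a different `anchor_pos`.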
@@ -0,0 +1,533 @@
"""
minimateplus/event_file_io.py: modern event-file (.sfm.json sidecar) IO.
This module is the single home for event-file conversion code that doesn't
fit cleanly inside `blastware_file.py` (which is the BW binary codec):
- sidecar JSON read/write (the modern per-event metadata file)
- read_blastware_file(): reverse of write_blastware_file, used by
the BW-importer flow when SFM is ingesting files produced by
Blastware's own ACH (where the source A5 frames aren't available).
Sidecar schema v1 layout: see docs in the project plan or the schema
declared in `event_to_sidecar_dict()`.
"""
from __future__ import annotations
import base64
import datetime
import hashlib
import json
import logging
import os
import struct
from pathlib import Path
from typing import Optional, Union
from .models import Event, PeakValues, ProjectInfo, Timestamp
from . import blastware_file as _bw # avoid circular reference at module load
log = logging.getLogger(__name__)
# Schema version for the sidecar JSON. Bump when fields change shape.
# Older readers must reject anything > SCHEMA_VERSION; newer fields added
# inside `extensions` are forward-compatible without a bump.
SCHEMA_VERSION = 1
SIDECAR_KIND = "sfm.event"
# Default tool_version stamp; callers can override. Hard-coded here
# rather than read via importlib.metadata because the latter reflects the
# *installed* dist-info, which doesn't update when pyproject.toml is
# bumped without a `pip install` re-run — leading to confusing stale
# version stamps in sidecars. Bump this constant and CHANGELOG.md
# together at release time.
TOOL_VERSION = "0.15.0"
try:
# Best-effort: prefer the installed metadata when it's NEWER than the
# baked-in constant (e.g. a downstream packager bumped the wheel
# without editing this file). Otherwise fall back to TOOL_VERSION.
from importlib.metadata import version as _pkg_version
_meta_v = _pkg_version("seismo-relay")
def _vtuple(s):
try:
return tuple(int(p) for p in s.split(".")[:3])
except Exception:
return (0, 0, 0)
_TOOL_VERSION_DEFAULT = (
_meta_v if _vtuple(_meta_v) > _vtuple(TOOL_VERSION) else TOOL_VERSION
)
except Exception:
_TOOL_VERSION_DEFAULT = TOOL_VERSION
# ── Sidecar dict construction ─────────────────────────────────────────────────
def _ts_iso(ts: Optional[Timestamp]) -> Optional[str]:
if ts is None:
return None
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
def _peak_values_to_dict(pv: Optional[PeakValues]) -> dict:
if pv is None:
return {
"transverse": None,
"vertical": None,
"longitudinal": None,
"vector_sum": None,
"mic_psi": None,
}
return {
"transverse": pv.tran,
"vertical": pv.vert,
"longitudinal": pv.long,
"vector_sum": pv.peak_vector_sum,
"mic_psi": pv.micl,
}
def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
if pi is None:
return {
"project": None,
"client": None,
"operator": None,
"sensor_location": None,
}
return {
"project": pi.project,
"client": pi.client,
"operator": pi.operator,
"sensor_location": pi.sensor_location,
}
def event_to_sidecar_dict(
event: Event,
*,
serial: str,
blastware_filename: str,
blastware_filesize: int,
blastware_sha256: str,
source_kind: str = "sfm-live",
a5_pickle_filename: Optional[str] = None,
tool_version: str = _TOOL_VERSION_DEFAULT,
captured_at: Optional[datetime.datetime] = None,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
) -> dict:
"""
Build a v1 sidecar dict from an Event + the surrounding metadata.
Pure helper, no file I/O. Callers stitch the result into a sidecar
via `write_sidecar()` (or POST it back via the PATCH endpoint).
"""
if source_kind not in {"sfm-live", "sfm-ach", "bw-import"}:
raise ValueError(f"unknown source_kind: {source_kind!r}")
captured_at = captured_at or datetime.datetime.utcnow()
# Stash raw 0C record bytes in `extensions.raw_records` so future
# field-decoding work (Peak Acceleration, ZC Freq, Time of Peak,
# sensor self-check results, etc.) can run offline against committed
# sidecars without a live device. Cheap (~280 bytes base64) and
# forward-compatible (older readers ignore unknown extensions keys).
ext_dict: dict = dict(extensions) if extensions else {}
raw_0c = getattr(event, "_raw_record", None)
if raw_0c:
rr = ext_dict.setdefault("raw_records", {})
# Don't clobber a raw_0c that callers explicitly passed in via
# `extensions=...` (e.g. round-trip preservation in patch_sidecar).
rr.setdefault("waveform_record_b64", base64.b64encode(raw_0c).decode("ascii"))
rr.setdefault("waveform_record_len", len(raw_0c))
return {
"schema_version": SCHEMA_VERSION,
"kind": SIDECAR_KIND,
"event": {
"serial": serial,
"timestamp": _ts_iso(event.timestamp),
"waveform_key": event._waveform_key.hex() if event._waveform_key else None,
"record_type": event.record_type,
"sample_rate": event.sample_rate,
"rectime_seconds": event.rectime_seconds,
"total_samples": event.total_samples,
"pretrig_samples": event.pretrig_samples,
},
"peak_values": _peak_values_to_dict(event.peak_values),
"project_info": _project_info_to_dict(event.project_info),
"blastware": {
"filename": blastware_filename,
"filesize": blastware_filesize,
"sha256": blastware_sha256,
"available": True,
},
"source": {
"kind": source_kind,
"captured_at": captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
"tool_version": tool_version,
"a5_pickle_filename": a5_pickle_filename,
},
"review": review or {
"false_trigger": False,
"reviewer": None,
"reviewed_at": None,
"notes": "",
},
"extensions": ext_dict,
}
# ── Sidecar IO ────────────────────────────────────────────────────────────────
def write_sidecar(path: Union[str, Path], data: dict) -> None:
"""
Atomic write of a sidecar dict to <path>.
Validates schema_version is supported before writing so we don't
silently drop a future-format sidecar over the wire.
"""
path = Path(path)
sv = data.get("schema_version")
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
raise ValueError(
f"write_sidecar: unsupported schema_version={sv!r} "
f"(this build supports 1..{SCHEMA_VERSION})"
)
tmp = path.with_suffix(path.suffix + ".tmp")
with tmp.open("w", encoding="utf-8") as f:
json.dump(data, f, indent=2, sort_keys=False, default=str)
f.write("\n")
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
def read_sidecar(path: Union[str, Path]) -> dict:
"""
Load a sidecar JSON file.
Raises FileNotFoundError if missing, ValueError on bad shape /
unsupported schema_version. Unknown keys at the top level are
preserved in the returned dict (forward-compat).
"""
path = Path(path)
with path.open("r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
raise ValueError(f"sidecar at {path}: top-level is not a JSON object")
sv = data.get("schema_version")
if not isinstance(sv, int) or sv < 1:
raise ValueError(f"sidecar at {path}: missing or invalid schema_version")
if sv > SCHEMA_VERSION:
raise ValueError(
f"sidecar at {path}: schema_version={sv} > supported {SCHEMA_VERSION}; "
"upgrade seismo-relay to read this file"
)
if data.get("kind") != SIDECAR_KIND:
raise ValueError(f"sidecar at {path}: unexpected kind={data.get('kind')!r}")
return data
def patch_sidecar(
path: Union[str, Path],
*,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
reviewer_now: bool = True,
) -> dict:
"""
Atomically apply a JSON-merge-patch to a sidecar file's `review`
and/or `extensions` blocks. Other top-level keys are untouched.
`reviewer_now`: when True (default) and `review` is non-empty, stamps
`review.reviewed_at` with the current UTC time so the review-time is
auditable without the caller having to pass it.
Returns the new full sidecar dict.
"""
path = Path(path)
data = read_sidecar(path)
if review:
merged = dict(data.get("review") or {})
merged.update({k: v for k, v in review.items() if v is not None or k in merged})
if reviewer_now:
merged["reviewed_at"] = datetime.datetime.utcnow().isoformat() + "Z"
data["review"] = merged
if extensions:
merged_ext = dict(data.get("extensions") or {})
merged_ext.update(extensions)
data["extensions"] = merged_ext
write_sidecar(path, data)
return data
def sidecar_path_for(blastware_path: Union[str, Path]) -> Path:
"""Convention: <bw_path>.sfm.json sits next to the BW binary."""
p = Path(blastware_path)
return p.with_name(p.name + ".sfm.json")
def file_sha256(path: Union[str, Path], chunk_size: int = 65536) -> str:
"""Compute SHA-256 of a file as a hex string."""
h = hashlib.sha256()
with open(path, "rb") as f:
while True:
chunk = f.read(chunk_size)
if not chunk:
break
h.update(chunk)
return h.hexdigest()
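The write-then-rename pattern used by `write_sidecar()` can be tried in isolation. This is a minimal sketch under stated assumptions, not the module itself: `atomic_write_json` is a hypothetical stand-in that skips the schema_version check, `EVENT.BW0` is a made-up filename, and `sidecar_path_for` is repeated only so the block runs on its own.

```python
import json
import os
import tempfile
from pathlib import Path


def sidecar_path_for(blastware_path) -> Path:
    """Same convention as above: <bw_path>.sfm.json next to the BW binary."""
    p = Path(blastware_path)
    return p.with_name(p.name + ".sfm.json")


def atomic_write_json(path, data: dict) -> None:
    """Hypothetical helper: write to a sibling .tmp file, fsync, then rename.
    os.replace() swaps the file in one step, so a reader sees either the old
    sidecar or the new one, never a half-written file."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
        f.write("\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


with tempfile.TemporaryDirectory() as d:
    target = sidecar_path_for(Path(d) / "EVENT.BW0")   # made-up filename
    atomic_write_json(target, {"schema_version": 1, "kind": "sfm.event"})
    written_name = target.name
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftovers = [p.name for p in Path(d).iterdir() if p.name.endswith(".tmp")]
```

The temporary file lives in the same directory as the target so the rename stays on one filesystem, which is what lets `os.replace()` behave atomically.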
# ── Blastware-file reader ─────────────────────────────────────────────────────
#
# Reverse of `blastware_file.write_blastware_file`. Used by the BW-import
# flow to ingest files produced by Blastware's own ACH (where the source
# A5 frames are not available).
#
# File structure (recap):
# [22B header] [21B STRT record] [body bytes] [26B footer]
#
# The body holds:
# - 6B preamble (00 00 ff ff ff ff) immediately after the STRT
# - 4-channel interleaved int16 LE samples
# - Embedded ASCII metadata strings (Project: / Client: / User Name: /
# Seis Loc: / Extended Notes) from the device's session-start config
#
# The 0C waveform record (per-event peaks, project name) is NOT in the
# BW file — those are computed by the device firmware and only carried
# in the live SUB 0C response. read_blastware_file() therefore computes
# peaks from the raw samples assuming Normal-range (10 in/s full-scale)
# geophone sensitivity. Imported events surface that assumption via the
# sidecar's `peak_values.computed_from_samples` flag.
# Geophone scale factor, in/s per ADC unit, for Normal range (10 in/s FS).
# Confirmed from CLAUDE.md (geo_hardware_constant = 6.206053 in/s per V,
# ADC full-scale = 1.61133 V, so Normal range = 10.0 in/s peak; per-count
# resolution ≈ 10.0 / 32768).
_GEO_NORMAL_FS_INS = 10.0
_GEO_SENSITIVE_FS_INS = 1.250
_INT16_FS = 32768.0
# Microphone scale factor, psi per ADC count. Approximate — exact factor
# depends on the geophone-vs-mic ADC scaling and the firmware reference.
# We mark mic_psi as "computed approximate" in the sidecar.
_MIC_FS_PSI = 0.0125 / _INT16_FS # ~0.5 psi full-scale assumption
def _decode_strt(strt: bytes) -> dict:
"""
Decode the 21-byte STRT record from a BW file.
Returns dict with waveform_key (4B), total_samples, pretrig_samples,
rectime_seconds. Falls back to None on truncated/missing fields.
"""
if len(strt) < 21 or strt[0:4] != b"STRT":
return {}
return {
"waveform_key": strt[6:10].hex(),
"total_samples": struct.unpack_from(">H", strt, 8)[0],
"pretrig_samples": struct.unpack_from(">H", strt, 16)[0],
"rectime_seconds": strt[18],
}
def _find_first_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
"""
Search `buf` for `label` (e.g. b"Project:") and return the
null-terminated ASCII string that follows, stripped.
"""
pos = buf.find(label)
if pos < 0:
return None
start = pos + len(label)
end = buf.find(b"\x00", start, start + max_len)
if end < 0:
end = start + max_len
text = buf[start:end].decode("ascii", errors="replace").strip()
return text or None
def _decode_samples_4ch_int16_le(stream: bytes) -> dict[str, list[int]]:
"""
Decode a 4-channel interleaved int16 LE byte stream into per-channel
lists. Channels are [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
Truncates to a multiple of 8 bytes (one full sample-set).
"""
n_complete = (len(stream) // 8) * 8
if n_complete == 0:
return {"Tran": [], "Vert": [], "Long": [], "MicL": []}
fmt = "<" + "h" * (n_complete // 2)
flat = list(struct.unpack(fmt, stream[:n_complete]))
return {
"Tran": flat[0::4],
"Vert": flat[1::4],
"Long": flat[2::4],
"MicL": flat[3::4],
}
def _peaks_from_samples(samples: dict[str, list[int]]) -> PeakValues:
"""
Compute approximate peaks from raw int16 samples assuming Normal-range
geophone sensitivity. Used by the BW-importer when the 0C waveform
record (the device's authoritative peaks) is unavailable.
"""
def _peak_ins(ch: list[int]) -> float:
if not ch:
return 0.0
m = max(abs(int(v)) for v in ch)
return m / _INT16_FS * _GEO_NORMAL_FS_INS
tran = _peak_ins(samples.get("Tran", []))
vert = _peak_ins(samples.get("Vert", []))
long_ = _peak_ins(samples.get("Long", []))
# Mic in psi (approximate)
mic_ch = samples.get("MicL", []) or []
mic = max((abs(int(v)) for v in mic_ch), default=0) * _MIC_FS_PSI
# Peak vector sum: max over time of sqrt(T^2 + V^2 + L^2)
pvs = 0.0
n = min(len(samples.get("Tran", [])), len(samples.get("Vert", [])), len(samples.get("Long", [])))
if n:
scale = _GEO_NORMAL_FS_INS / _INT16_FS
T = samples["Tran"]; V = samples["Vert"]; L = samples["Long"]
for i in range(n):
t = T[i] * scale
v = V[i] * scale
l = L[i] * scale
mag = (t*t + v*v + l*l) ** 0.5
if mag > pvs:
pvs = mag
return PeakValues(
tran=tran, vert=vert, long=long_,
peak_vector_sum=pvs, micl=mic,
)
def read_blastware_file(path: Union[str, Path]) -> Event:
"""
Parse a Blastware waveform file into an Event.
Recovers:
- waveform_key, rectime_seconds, total_samples, pretrig_samples
(from the STRT record)
- timestamp (from the footer's start-time field)
- project_info (from ASCII labels embedded in the body)
- raw_samples (Tran/Vert/Long/MicL int16 lists)
- peak_values (computed from raw_samples; approximate, see notes
on _peaks_from_samples about Normal-range assumption)
Does NOT recover the source A5 frames (they aren't in the BW file).
The returned Event has `_a5_frames = None`, signalling that
byte-for-byte regeneration of the BW file from this Event alone is
not possible; the on-disk BW file IS the byte-for-byte source.
"""
path = Path(path)
raw = path.read_bytes()
if len(raw) < _bw._WAVEFORM_HEADER_SIZE + 21 + 26:
raise ValueError(f"{path}: file too short ({len(raw)} bytes) to be a BW event")
# Header: validate magic prefix.
header = raw[:_bw._WAVEFORM_HEADER_SIZE]
if not header.startswith(_bw._FILE_HEADER_PREFIX):
raise ValueError(f"{path}: not a Blastware file (bad header prefix)")
# STRT record: 21 bytes immediately after the header.
strt_raw = raw[_bw._WAVEFORM_HEADER_SIZE : _bw._WAVEFORM_HEADER_SIZE + 21]
strt_fields = _decode_strt(strt_raw)
if not strt_fields:
raise ValueError(f"{path}: STRT record missing or malformed")
# Footer: locate the 0e 08 marker, validating the year is in a sane range.
body_start = _bw._WAVEFORM_HEADER_SIZE + 21
footer_pos = -1
pos = body_start
while True:
pos = raw.find(b"\x0e\x08", pos)
if pos < 0 or pos + 26 > len(raw):
break
yr = (raw[pos + 4] << 8) | raw[pos + 5]
if 2015 <= yr <= 2050:
footer_pos = pos
break
pos += 1
if footer_pos < 0 and len(raw) >= 26:
footer_pos = len(raw) - 26
if footer_pos < body_start:
raise ValueError(f"{path}: footer not found")
body = raw[body_start : footer_pos]
footer = raw[footer_pos : footer_pos + 26]
# Footer layout:
# [0:2] 0e 08 marker
# [2:10] ts1 (start) BE 8B
# [10:18] ts2 (stop) BE 8B
# [18:24] 00 01 00 02 00 00
# [24:26] crc
ts1 = _bw._decode_ts_be(footer[2:10])
ts2 = _bw._decode_ts_be(footer[10:18])
# Body: first 6 bytes are the preamble (00 00 ff ff ff ff). Strip
# them before decoding samples. Any trailing tail past the last
# full sample-set is silently truncated by _decode_samples_4ch_int16_le.
sample_bytes = body[6:] if body[:6].hex() in ("0000ffffffff", "0000FFFFFFFF") else body
samples = _decode_samples_4ch_int16_le(sample_bytes)
# Metadata strings (label-anchored search across the body).
project = _find_first_string(body, b"Project:")
client = _find_first_string(body, b"Client:")
user = _find_first_string(body, b"User Name:")
seisloc = _find_first_string(body, b"Seis Loc:")
# Build the Event.
ev = Event(index=-1)
if strt_fields.get("waveform_key"):
ev._waveform_key = bytes.fromhex(strt_fields["waveform_key"])
ev.record_type = "Waveform"
ev.rectime_seconds = strt_fields.get("rectime_seconds")
ev.total_samples = strt_fields.get("total_samples")
ev.pretrig_samples = strt_fields.get("pretrig_samples")
if ts1 is not None:
ev.timestamp = Timestamp(
raw=footer[2:10],
flag=0x10,
year=ts1.year, unknown_byte=0, month=ts1.month, day=ts1.day,
hour=ts1.hour, minute=ts1.minute, second=ts1.second,
)
ev.project_info = ProjectInfo(
project=project, client=client, operator=user, sensor_location=seisloc,
)
ev.raw_samples = samples
ev.peak_values = _peaks_from_samples(samples)
ev._a5_frames = None # not recoverable from BW file
return ev
@@ -111,20 +111,24 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
verified against this algorithm on 2026-04-02).
Args:
offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
raw_params: 10 or 11 params bytes (from bulk_waveform_params or
bulk_waveform_term_params). 0x10 bytes in params are
written RAW, NOT DLE-stuffed. Confirmed 2026-04-06 by
comparing wire bytes: BW sends bare `10 04` for chunk 1
(counter=0x1004), not stuffed `10 10 04`. Device reads
params at fixed byte positions; stuffing shifts the bytes
and corrupts the counter, causing device to ignore the frame.
offset_word: 16-bit offset. For probe/chunks/metadata pages this is
`0x1002`. For the proper TERM frame this is computed by
`bulk_waveform_term_v2()` from the STRT-derived
`end_offset`.
raw_params: 10, 11, or 12 params bytes (from `bulk_waveform_params`
for probes/samples, `bulk_waveform_term_v2` for TERM, or
a manually-built 12-byte block for the metadata pages
0x1002 / 0x1004). See gotcha #3 below — params region
uses partial DLE stuffing of 0x10 bytes.
Returns:
Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
"""
if len(raw_params) not in (10, 11):
raise ValueError(f"raw_params must be 10 or 11 bytes, got {len(raw_params)}")
if len(raw_params) not in (10, 11, 12):
# 10 = termination params; 11 = regular probe / chunk params;
# 12 = metadata-page params (extra trailing 0x00 — BW byte-perfect quirk
# for the two fixed metadata reads at counter=0x1002 and 0x1004).
raise ValueError(f"raw_params must be 10/11/12 bytes, got {len(raw_params)}")
# Build stuffed section between STX and checksum
s = bytearray()
@@ -134,8 +138,40 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
s += b"\x00" # field3
s += bytes([(offset_word >> 8) & 0xFF, # offset_hi — raw, NOT stuffed
offset_word & 0xFF]) # offset_lo
for b in raw_params: # params — NOT DLE-stuffed (raw bytes, match BW wire format)
# Params — partial DLE stuffing of 0x10 bytes (CONFIRMED 2026-05-05).
#
# The device's de-stuffing rule for params is:
# • `10 10` → de-stuffs to `10`
# • `10 02/03/04` → kept literal (these are inner-frame markers)
# • `10 X` other → de-stuffs to just `X` (drops the 0x10)
#
# So for any 0x10 byte in the *logical* params that is followed by a
# byte NOT in {0x02, 0x03, 0x04, 0x10}, we must double the 0x10 on the
# wire (`10 X` → `10 10 X`) so the device's de-stuffer reproduces the
# original `10 X` pair. Without this, counter values with `0x10` in
# the high byte (e.g. counter=0x1000 has params bytes `10 00`) are
# silently corrupted to `0x__00` on the device side, and the device
# responds for the wrong address — for counter=0x1000 it returns the
# probe response (counter=0x0000), which contains the file header +
# STRT. That STRT block then lands in the assembled file body and
# Blastware rejects the file as malformed.
#
# Confirmed against BW capture 5-1-26 / bwcap3sec frame 20: params
# logical bytes `00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000)
# are encoded on the wire as `00 01 11 10 10 00 00 00 00 00 00 00`.
# BW frames 13/14 (meta @ 0x1002 / 0x1004) leave `10 02` and `10 04`
# raw — the device handles those literal pairs correctly.
i = 0
while i < len(raw_params):
b = raw_params[i]
s.append(b)
if (
b == 0x10
and i + 1 < len(raw_params)
and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10)
):
s.append(0x10) # double the 0x10 so it survives device de-stuffing
i += 1
# DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
chk, i = 0, 0
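The partial DLE-stuffing rule described in the comment above can be exercised in isolation. The following is a minimal sketch, not the project's implementation: `stuff_params` and `KEEP_AFTER_DLE` are hypothetical names, and the only test vectors used are the two wire encodings quoted in the comment (counter=0x1000 and the metadata page at 0x1002).

```python
DLE = 0x10
# Bytes that, per the comment above, the device keeps literal after a 0x10.
KEEP_AFTER_DLE = {0x02, 0x03, 0x04, 0x10}


def stuff_params(raw: bytes) -> bytes:
    """Double a 0x10 only when the byte after it is not in KEEP_AFTER_DLE."""
    out = bytearray()
    for i, b in enumerate(raw):
        out.append(b)
        has_next = i + 1 < len(raw)
        if b == DLE and has_next and raw[i + 1] not in KEEP_AFTER_DLE:
            out.append(DLE)  # extra 0x10 so the pair survives de-stuffing
    return bytes(out)


# counter=0x1000: logical `10 00` must go out as `10 10 00`.
logical_1000 = bytes.fromhex("00 01 11 10 00 00 00 00 00 00 00")
wire_1000 = stuff_params(logical_1000)

# Metadata page at 0x1002: `10 02` is left raw.
logical_1002 = bytes.fromhex("00 01 11 10 02 00 00 00 00 00 00 00")
wire_1002 = stuff_params(logical_1002)
```

A trailing 0x10 with no following byte is left untouched here, which mirrors the `i + 1 < len(raw_params)` guard in the loop above.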
@@ -398,28 +434,26 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -
def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
"""
Build the 10-byte params block for the SUB 5A termination request.
DEPRECATED: DO NOT USE IN NEW CODE.
The termination request uses offset=0x005A and a DIFFERENT params layout
the leading 0x00 byte is dropped, key4[0:2] shifts to params[0:2], and the
counter high byte is at params[2]:
This is the v1 termination params helper, paired with the broken
`_BULK_TERM_OFFSET = 0x005A` magic offset_word. Together they produce a
~100-byte device-side terminator response that does NOT contain the
partial-last-chunk waveform tail or the 26-byte file footer. Files
reconstructed using this terminator are missing their last ~512 bytes of
waveform data and have a synthesized footer that disagrees with what BW
would have written.
params[0] = key4[0]
params[1] = key4[1]
params[2] = (counter >> 8) & 0xFF
params[3:] = zeros
**For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
which computes the correct offset_word + params from the STRT-derived
`end_offset`. v2 produces wire bytes that match BW exactly across all
tested events (4-27-26 / 5-1-26 / 5-4-26 captures).
Counter for the termination request = last_regular_counter + 0x0400.
Confirmed from 1-2-26 BW TX capture: final request (frame 83) uses
offset=0x005A, params[0:3] = key4[0:2] + term_counter_hi.
Args:
key4: 4-byte waveform key.
counter: Termination counter (= last regular counter + 0x0400).
Returns:
10-byte params block.
This function is retained ONLY for the defensive fallback path in
`read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
chunks are fetched (i.e. a malformed event or an unexpected device state).
The fallback already logs a WARNING when it activates; if you see that
warning, the bug is upstream: STRT should have been parseable.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
@@ -430,6 +464,123 @@ def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
return bytes(p)
def bulk_waveform_term_v2(
key4: bytes,
end_offset: int,
last_chunk_counter: int,
) -> tuple[int, bytes]:
"""
Compute the SUB 5A TERM frame's offset_word and 10-byte params block.
Confirmed across 3 events (4-27-26 + 5-1-26 captures):
next_boundary = last_chunk_counter + 0x0200
offset_word = end_offset - next_boundary (residual byte count)
params[0] = key4[0] (= 0x01 on every observed device)
params[1] = key4[1] (= 0x11)
params[2] = (next_boundary >> 8) & 0xFF
params[3] = next_boundary & 0xFF
params[4:10] = zeros
Verification:
| end_offset | last_chunk | next_boundary | offset_word | params[2:4] |
| 0x1ABE | 0x1800 | 0x1A00 | 0x00BE | 1A 00 |
| 0x21F2 | 0x1E00 | 0x2000 | 0x01F2 | 20 00 |
| 0x417E | 0x3E38 | 0x4038 | 0x0146 | 40 38 |
The device receives `requested_address = (params[2] << 8) | offset_word`
and replies with `(end_offset - next_boundary)` bytes of waveform tail
starting at `next_boundary`, including the 26-byte file footer.
Args:
key4: 4-byte waveform key for this event.
end_offset: Event-end pointer (= `(end_key[2] << 8) | end_key[3]`
from the STRT record at data[23:27] of A5[0]).
last_chunk_counter: Counter of the last full 0x0200-byte chunk fetched
(the chunk that covers [last_chunk_counter,
last_chunk_counter + 0x0200)).
Returns:
(offset_word, params10) tuple. Pass as
`build_5a_frame(offset_word, params)`.
Raises:
ValueError: on inconsistent inputs.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
next_boundary = last_chunk_counter + 0x0200
if next_boundary > 0xFFFF:
raise ValueError(
f"next_boundary 0x{next_boundary:04X} exceeds uint16; check inputs"
)
if end_offset <= last_chunk_counter:
raise ValueError(
f"end_offset 0x{end_offset:04X} must be > "
f"last_chunk_counter 0x{last_chunk_counter:04X}"
)
offset_word = end_offset - next_boundary
if offset_word < 0:
# Last chunk overshot end_offset; caller should have stopped one chunk
# earlier. Treat as zero residual.
offset_word = 0
if offset_word > 0xFFFF:
raise ValueError(
f"offset_word 0x{offset_word:04X} exceeds uint16"
)
p = bytearray(10)
p[0] = key4[0]
p[1] = key4[1]
p[2] = (next_boundary >> 8) & 0xFF
p[3] = next_boundary & 0xFF
return offset_word, bytes(p)
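The verification table in the docstring can be replayed with a few lines of arithmetic. This is a standalone sketch under the formulas stated above; `term_v2_sketch` is a hypothetical stand-in that omits the range checks of `bulk_waveform_term_v2`, and the `key4` values are only illustrative (just the first two bytes are used).

```python
def term_v2_sketch(key4: bytes, end_offset: int, last_chunk_counter: int) -> tuple[int, bytes]:
    """Return (offset_word, 10-byte params) using the docstring's formulas."""
    next_boundary = last_chunk_counter + 0x0200
    offset_word = max(end_offset - next_boundary, 0)  # residual byte count
    params = bytearray(10)
    params[0], params[1] = key4[0], key4[1]
    params[2] = (next_boundary >> 8) & 0xFF
    params[3] = next_boundary & 0xFF
    return offset_word, bytes(params)


# (end_offset, last_chunk_counter, expected offset_word, expected params[2:4])
TABLE = [
    (0x1ABE, 0x1800, 0x00BE, "1a00"),
    (0x21F2, 0x1E00, 0x01F2, "2000"),
    (0x417E, 0x3E38, 0x0146, "4038"),
]

key4 = bytes.fromhex("01110000")
results = [term_v2_sketch(key4, end, last) for end, last, _, _ in TABLE]
```

In every row `next_boundary + offset_word` equals `end_offset`, which is a convenient sanity check when adding new captures to the table.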
# ── End-offset extraction from STRT record ────────────────────────────────────
STRT_MARKER = b"STRT"
def parse_strt_end_offset(a5_data: bytes) -> Optional[int]:
"""
Extract the event-end offset from the STRT record in an A5 response payload.
The first A5 response (the probe response, or the first chunk for events
with non-zero start_key[2:4]) contains a STRT record at byte offset 17 of
`data`. Layout:
data[17:21] "STRT"
data[21:23] ff fe sentinel
data[23:27] end_key 4-byte key of where this event ENDS
data[27:31] start_key
...
Returns `(end_key[2] << 8) | end_key[3]`, the absolute device-buffer
address where the event ends. Use this to bound the chunk loop and to
compute the TERM frame.
Verified end_offset values:
| event start_key | end_key | end_offset |
| 01110000 | 01111ABE | 0x1ABE |
| 01110000 | 011121F2 | 0x21F2 |
| 011121F2 | 0111417E | 0x417E |
Args:
a5_data: The `data` field of an A5 response frame (frame.data).
Returns:
The end_offset (uint16) if STRT is found, else None.
"""
pos = a5_data.find(STRT_MARKER)
if pos < 0 or pos + 10 > len(a5_data):
return None
# data[pos+4:pos+6] is "ff fe"; data[pos+6:pos+10] is end_key.
end_key = a5_data[pos + 6 : pos + 10]
if len(end_key) < 4:
return None
return (end_key[2] << 8) | end_key[3]
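To illustrate the STRT layout described in the docstring, here is a small sketch that builds a synthetic payload and extracts the end offset from it. The payload is fabricated for illustration only (17 zero bytes of filler, then the record); `make_strt_payload` and `end_offset_from_strt` are hypothetical helper names, and the key values come from the first row of the docstring table.

```python
from typing import Optional

STRT = b"STRT"


def make_strt_payload(end_key: bytes, start_key: bytes) -> bytes:
    """Synthetic A5 `data`: filler, "STRT", ff fe sentinel, end_key, start_key."""
    return bytes(17) + STRT + b"\xff\xfe" + end_key + start_key


def end_offset_from_strt(data: bytes) -> Optional[int]:
    """Low 16 bits of end_key, or None when no complete STRT record is present."""
    pos = data.find(STRT)
    if pos < 0 or pos + 10 > len(data):
        return None
    end_key = data[pos + 6 : pos + 10]  # skip marker (4) + sentinel (2)
    return (end_key[2] << 8) | end_key[3]


payload = make_strt_payload(bytes.fromhex("01111ABE"), bytes.fromhex("01110000"))
end_offset = end_offset_from_strt(payload)
truncated = end_offset_from_strt(payload[:24])  # record cut off mid end_key
```

Searching for the marker rather than indexing `data[17:21]` directly keeps the sketch tolerant of a shifted record, at the cost of a false match if "STRT" ever appeared in sample data.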
# ── Pre-built POLL frames ─────────────────────────────────────────────────────
#
# POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
@@ -457,6 +608,11 @@ class S3Frame:
page_lo: int # PAGE_LO from header
data: bytes # payload data section (payload[5:], checksum already stripped)
checksum_valid: bool
chk_byte: int = 0 # actual checksum byte received from wire (body[-1])
# needed for waveform file reconstruction: when the last data byte
# is 0x10 and chk_byte ∈ {0x02, 0x03, 0x04}, the DLE+chk pair
# must be included in the DLE-strip operation to correctly
# reconstruct the Blastware binary body.
@property
def page_key(self) -> int:
@@ -465,7 +621,6 @@ class S3Frame:
# ── Streaming S3 frame parser ─────────────────────────────────────────────────
class S3FrameParser:
"""
Incremental byte-stream parser for S3BW response frames.
@@ -597,4 +752,5 @@ class S3FrameParser:
page_lo = raw_payload[4],
data = raw_payload[5:],
checksum_valid = (chk_received == chk_computed),
chk_byte = chk_received,
)
+56
@@ -201,6 +201,58 @@ class Timestamp:
second=second,
)
@classmethod
def from_short_record(cls, data: bytes) -> "Timestamp":
"""
Decode an 8-byte timestamp header from a 210-byte waveform record.
Wire layout (CONFIRMED 2026-05-01 against live SFM run on BE11529 in
Continuous mode, day-of-month = 1 May, raw: 01 05 07 ea 00 0d 15 25):
byte[0]: day (uint8)
byte[1]: month (uint8)
bytes[2-3]: year (big-endian uint16)
byte[4]: unknown (0x00 in observed sample)
byte[5]: hour (uint8)
byte[6]: minute (uint8)
byte[7]: second (uint8)
This is a third format observed in the wild, distinct from the 9-byte
(single-shot, sub_code=0x10 at [1]) and 10-byte (continuous, 0x10 at
[0] AND [2]) layouts. No marker bytes; disambiguated by where the
year lands when scanned at byte 2/3/4.
Args:
data: at least 8 bytes; only the first 8 are consumed.
Returns:
Decoded Timestamp.
Raises:
ValueError: if data is fewer than 8 bytes.
"""
if len(data) < 8:
raise ValueError(
f"Short record timestamp requires at least 8 bytes, got {len(data)}"
)
day = data[0]
month = data[1]
year = struct.unpack_from(">H", data, 2)[0]
unknown_byte = data[4]
hour = data[5]
minute = data[6]
second = data[7]
return cls(
raw=bytes(data[:8]),
flag=0,
year=year,
unknown_byte=unknown_byte,
month=month,
day=day,
hour=hour,
minute=minute,
second=second,
)
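As a worked example of the 8-byte layout, the raw bytes quoted in the docstring can be decoded with the standard library alone. This sketch returns a `datetime` rather than the project's `Timestamp` class, so `decode_short_timestamp` is a hypothetical helper; unlike `from_short_record`, it would raise on out-of-range field values because `datetime` validates them.

```python
import struct
from datetime import datetime


def decode_short_timestamp(data: bytes) -> datetime:
    """Decode day, month, big-endian year, (unknown), hour, minute, second."""
    if len(data) < 8:
        raise ValueError(f"need at least 8 bytes, got {len(data)}")
    day, month = data[0], data[1]
    year = struct.unpack_from(">H", data, 2)[0]  # bytes 2-3, big-endian
    # data[4] is the unknown byte (0x00 in the observed sample); ignored here.
    hour, minute, second = data[5], data[6], data[7]
    return datetime(year, month, day, hour, minute, second)


raw = bytes.fromhex("01 05 07 ea 00 0d 15 25")
decoded = decode_short_timestamp(raw)
```

Here `0x07ea` is 2026, so the sample decodes to 1 May 2026 at 13:21:37, consistent with the capture date given in the docstring.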
@property
def clock_set(self) -> bool:
"""False when year == 1995 (factory default / battery-lost state)."""
@@ -493,6 +545,10 @@ class Event:
# Set by get_events(); required by download_waveform().
_waveform_key: Optional[bytes] = field(default=None, repr=False)
# Raw A5 frames from the full bulk waveform download (full_waveform=True).
# Populated by get_events() when full_waveform=True; used by write_blastware_file().
_a5_frames: Optional[list] = field(default=None, repr=False)
def __str__(self) -> str:
ts = str(self.timestamp) if self.timestamp else "no timestamp"
ppv = ""
+239 -100
@@ -35,6 +35,8 @@ from .framing import (
token_params,
bulk_waveform_params,
bulk_waveform_term_params,
bulk_waveform_term_v2,
parse_strt_end_offset,
POLL_PROBE,
POLL_DATA,
SESSION_RESET,
@@ -122,14 +124,22 @@ DATA_LENGTHS: dict[int, int] = {
}
# SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
# Confirmed from 1-2-26 BW TX capture analysis (2026-04-02).
_BULK_CHUNK_OFFSET = 0x1004 # offset field for probe + all regular chunk requests ✅
_BULK_TERM_OFFSET = 0x005A # offset field for termination request ✅
_BULK_COUNTER_STEP = 0x0400 # chunk counter increment per chunk ✅
# Chunk counter formula: chunk_num * 0x0400 for ALL chunks including chunk 1.
# Earlier captures showed 0x1004 for chunk 1 — that was a Blastware artifact, not a
# protocol requirement. Confirmed 2026-04-06: 0x0400 for chunk 1 works; 0x1004
# causes a 120-second device timeout. Formula n * 0x0400 is used for all chunks.
#
# 2026-05-01 minimal-fix: the chunk-counter walk is now bounded by the event's
# `end_offset` extracted from the STRT record at data[23:27] of the probe
# response. Without this bound the loop kept asking for chunks past the event
# end and the device responded with post-event circular-buffer garbage,
# corrupting reconstructed Blastware files for events ≥ 2 sec.
#
# We keep the OLD 0x0400 chunk step here (BW actually uses 0x0200 — see §7.8.5
# of the protocol reference for the corrected understanding) because the
# existing blastware_file.py builder relies on the 0x0400-step frame structure
# to produce valid files. Switching to BW's 0x0200 step is a separate task
# that also requires updating the file builder.
# BW-exact protocol values (v0.14.0). Verified against 4-27-26 + 5-1-26 captures.
_BULK_CHUNK_OFFSET = 0x1002 # offset_word for probe + all chunk requests
_BULK_TERM_OFFSET = 0x005A # offset_word for the legacy terminator (fallback only)
_BULK_COUNTER_STEP = 0x0200 # chunk counter increment (matches chunk payload size)
# Default timeout values (seconds).
# MiniMate Plus is a slow device — keep these generous.
@@ -524,142 +534,270 @@ class MiniMateProtocol:
self,
key4: bytes,
*,
stop_after_metadata: bool = True,
max_chunks: int = 32,
) -> list[bytes]:
stop_after_metadata: bool = True, # DEPRECATED — no-op under BW-exact walk
max_chunks: int = 256, # safety cap only; loop is bounded by end_offset
include_terminator: bool = False,
extra_chunks_after_metadata: int = 1, # DEPRECATED — no-op
) -> list[S3Frame]:
"""
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event.
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event using
Blastware's exact protocol. REWRITTEN 2026-05-02 (v0.14.0).
The bulk waveform stream carries both raw ADC samples (large) and
event-time metadata strings ("Project:", "Client:", "User Name:",
"Seis Loc:", "Extended Notes") embedded in one of the middle frames
(confirmed: A5[7] of 9 for 1-2-26 capture).
Algorithm (matches BW captures across 2-sec / 3-sec / event-2):
Protocol is request-per-chunk, NOT a continuous stream:
1. Probe (offset=_BULK_CHUNK_OFFSET, is_probe=True, counter=0x0000)
2. Chunks (offset=_BULK_CHUNK_OFFSET, is_probe=False, counter+=0x0400)
3. Loop until metadata found (stop_after_metadata=True) or max_chunks
4. Termination (offset=_BULK_TERM_OFFSET, counter=last+_BULK_COUNTER_STEP)
Device responds with a final A5 frame (page_key=0x0000).
1. Probe
- For events at start_key[2:4] = 0x0000 (first event after erase
/ wrap): probe at counter=0x0000 with full key in params.
- For continuation events (start_key[2:4] != 0): first chunk at
counter = start_key[2:4] + 0x0046; acts as both probe and
first sample chunk; response carries STRT.
The termination frame (page_key=0x0000) is NOT included in the returned list.
2. Parse end_offset from STRT record at data[23:27] of the probe response.
Args:
key4: 4-byte waveform key from EVENT_HEADER (1E).
stop_after_metadata: If True (default), send termination as soon as
b"Project:" is found in a frame's data — avoids
downloading the full ADC waveform payload (several
hundred KB). Set False to download everything.
max_chunks: Safety cap on the number of chunk requests sent
(default 32; a typical event uses 9 large frames).
3. Read two fixed metadata pages at counter=0x1002 and counter=0x1004:
global session metadata (Project / Client / User Name / Seis Loc
/ Extended Notes ASCII strings). Event 1 only; continuation
events skip these (BW caches them across the session).
4. Walk sample chunks at 0x0200 increments, starting from 0x0600 for
event 1 or `start + 0x0046 + 0x0200` for continuation events.
Stop when `next_chunk + 0x0200 > end_offset`.
5. Send TERM frame with offset_word and params computed by
`bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`.
The TERM response contains the partial last chunk (residual =
end_offset - next_boundary) including the 26-byte 0e 08 file
footer.
Returns:
List of raw data bytes from each A5 response frame (not including
the terminator frame). Frame indices match the request sequence:
index 0 = probe response, index 1 = first chunk, etc.
List of S3Frame objects from each A5 response (probe, metadata
pages, sample chunks, optional TERM response). Callers that need the
file footer (e.g. write_blastware_file) pass `include_terminator=True`
to keep the TERM response in the list; it is required to reconstruct
the file footer.
Deprecated kwargs:
stop_after_metadata: legacy "Project:"-string-based stop condition.
No-op under the BW-exact walk; the loop is
deterministically bounded by end_offset from
STRT. Accepted for backward compat.
extra_chunks_after_metadata: same.
Raises:
ProtocolError: on timeout, bad checksum, or unexpected SUB.
Confirmed from 1-2-26 BW TX/RX captures (2026-04-02):
- probe + 8 regular chunks + 1 termination = 10 TX frames
- 9 large A5 responses + 1 terminator A5 = 10 RX frames
- page_key=0x0010 on large frames; page_key=0x0000 on terminator
- "Project:" metadata at A5[7].data[626]
ProtocolError: on timeout / bad checksum / unexpected SUB.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xFF - 0x5A = 0xA5
frames_data: list[bytes] = []
counter = 0
# Accept deprecated kwargs quietly; log at debug level when a non-default value is passed.
if not stop_after_metadata:
log.debug("5A: stop_after_metadata=False is no-op under BW-exact walk")
if extra_chunks_after_metadata not in (0, 1):
log.debug("5A: extra_chunks_after_metadata=%d is no-op under BW-exact walk",
extra_chunks_after_metadata)
# ── Step 1: probe ────────────────────────────────────────────────────
log.debug("5A probe key=%s", key4.hex())
params = bulk_waveform_params(key4, 0, is_probe=True)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset() # reset bytes_fed counter before probe recv
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xA5
frames_data: list[S3Frame] = []
start_offset = (key4[2] << 8) | key4[3]
is_event_1 = (start_offset == 0)
# ── Step 1: probe / first chunk ──────────────────────────────────────
if is_event_1:
probe_counter = 0
probe_params = bulk_waveform_params(key4, 0, is_probe=True)
log.debug("5A probe (event-1) key=%s counter=0x0000", key4.hex())
else:
# Continuation events: first 5A request lands at counter = key[2:4]
# (i.e. the address of the off=0x46 WAVEHDR record returned by 1F).
# The probe response carries STRT at byte 17 with end_offset.
#
# Confirmed 2026-05-04 from 5-1-26 "copy 2nd address" capture
# (BW probes counter=0x2238 with key=01112238, STRT@17 end=0x417E)
# and 5-4-26 BW captures (2-sec event probes counter=0x2238).
#
# The earlier "+0x46" formula in the doc came from calling
# start_key the BOUNDARY (off=0x2C) key, but the iteration walk
# uses 1F's off=0x46 key as cur_key, which already incorporates
# the +0x46 offset relative to the boundary. Adding it again
# caused the probe to overshoot, miss STRT, and run uncapped.
probe_counter = start_offset
probe_params = bulk_waveform_params(key4, probe_counter)
log.debug(
"5A probe (event-N) key=%s counter=0x%04X",
key4.hex(), probe_counter,
)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, probe_params))
self._parser.reset()
try:
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False)
except TimeoutError:
log.warning(
"5A probe TIMED OUT for key=%s, "
"%d raw bytes received (no complete A5 frame assembled)",
"5A probe TIMED OUT for key=%s, %d raw bytes received",
key4.hex(), self._parser.bytes_fed,
)
raise
frames_data.append(rsp.data)
log.debug("5A A5[0] page_key=0x%04X %d bytes", rsp.page_key, len(rsp.data))
# ── Step 2: chunk loop ───────────────────────────────────────────────
# Chunk counters are monotonic: chunk_num * 0x0400 for all chunks.
# The 4-2-26 BW TX capture showed 0x1004 for chunk 1, but this is a
# Blastware artifact — the device accepts any counter value and streams
# data regardless. Empirically confirmed 2026-04-06: 0x0400 for chunk 1
# works; 0x1004 causes the device to ignore the frame (timeout).
for chunk_num in range(1, max_chunks + 1):
counter = chunk_num * _BULK_COUNTER_STEP
params = bulk_waveform_params(key4, counter)
log.debug("5A chunk %d counter=0x%04X", chunk_num, counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset() # reset bytes_fed for accurate per-chunk count
frames_data.append(rsp)
log.debug("5A A5[0] (probe) page_key=0x%04X %d bytes",
rsp.page_key, len(rsp.data))
# ── Step 2: parse STRT end_offset from probe response ────────────────
end_offset = parse_strt_end_offset(rsp.data)
if end_offset is None:
log.warning(
"5A probe response did not contain a STRT record; "
"cannot bound chunk loop — falling back to max_chunks=%d cap",
max_chunks,
)
end_offset = 0xFFFF # impossible value → loop runs to max_chunks
else:
log.info(
"5A STRT start_offset=0x%04X end_offset=0x%04X size=0x%04X",
start_offset, end_offset, end_offset - start_offset,
)
# ── Step 3: metadata pages 0x1002 + 0x1004 (event 1 only) ────────────
# Confirmed from BW captures: BW reads these two fixed device-buffer
# pages immediately after the probe for events at start_key[2:4]=0.
# Continuation events skip them (BW caches across the session).
# Their content is global compliance-setup metadata: Project, Client,
# User Name, Seis Loc, Extended Notes.
if is_event_1:
for meta_counter in (0x1002, 0x1004):
# Metadata page params have an extra trailing 0x00 byte
# (12-byte params instead of 11) — empirical from BW captures.
# Checksum-neutral but matches BW byte-for-byte.
meta_params = bytes([
0x00,
key4[0], key4[1],
(meta_counter >> 8) & 0xFF,
meta_counter & 0xFF,
0, 0, 0, 0, 0, 0, 0,
])
log.debug("5A metadata page counter=0x%04X", meta_counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, meta_params))
self._parser.reset()
try:
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False, timeout=10.0)
meta_rsp = self._recv_one(
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
)
except TimeoutError:
log.warning(
"5A metadata page 0x%04X TIMED OUT — continuing",
meta_counter,
)
continue
frames_data.append(meta_rsp)
log.debug(
"5A meta@0x%04X page_key=0x%04X %d bytes",
meta_counter, meta_rsp.page_key, len(meta_rsp.data),
)
# ── Step 4: sample chunk loop, bounded by end_offset ─────────────────
# Sample chunks start at:
# event 1: counter = 0x0600
# event N (>0): counter = probe_counter + 0x0200
# (probe was the first sample chunk)
if is_event_1:
counter = 0x0600
else:
counter = probe_counter + _BULK_COUNTER_STEP
last_chunk_counter: Optional[int] = (
probe_counter if not is_event_1 else None
)
chunks_fetched = 0
while chunks_fetched < max_chunks:
# Stop when next chunk would straddle the event end.
if counter + _BULK_COUNTER_STEP > end_offset:
log.debug(
"5A chunk loop done at counter=0x%04X (end=0x%04X); "
"%d chunks fetched",
counter, end_offset, chunks_fetched,
)
break
params = bulk_waveform_params(key4, counter)
log.debug("5A chunk #%d counter=0x%04X", chunks_fetched + 1, counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset()
try:
rsp = self._recv_one(
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
)
except TimeoutError:
raw = self._parser.bytes_fed
log.warning(
"5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
chunk_num, counter, raw,
chunks_fetched + 1, counter, raw,
)
if raw > 0 and frames_data:
# Device sent a partial byte (likely a bare DLE/ETX end-of-stream
# signal) but never completed a full frame. Treat as graceful
# stream end and fall through to the termination step.
log.warning(
"5A end-of-stream detected at chunk=%d (raw_bytes=%d, "
"frames_collected=%d) — proceeding to termination",
chunk_num, raw, len(frames_data),
"5A unexpected end-of-stream — proceeding to TERM",
)
break
raise
log.warning(
"5A RX chunk=%d page_key=0x%04X data_len=%d contains_Project=%s",
chunk_num, rsp.page_key, len(rsp.data), b"Project:" in rsp.data,
log.debug(
"5A RX chunk=%d page_key=0x%04X data_len=%d",
chunks_fetched + 1, rsp.page_key, len(rsp.data),
)
if rsp.page_key == 0x0000:
# Device unexpectedly terminated mid-stream (no termination needed).
log.debug("5A A5[%d] page_key=0x0000 — device terminated early", chunk_num)
# Device terminated mid-stream unexpectedly.
log.warning(
"5A unexpected page_key=0x0000 mid-stream at counter=0x%04X",
counter,
)
if include_terminator:
frames_data.append(rsp)
return frames_data
frames_data.append(rsp.data)
if stop_after_metadata and b"Project:" in rsp.data:
log.debug("5A A5[%d] metadata found — stopping early", chunk_num)
break
frames_data.append(rsp)
last_chunk_counter = counter
counter += _BULK_COUNTER_STEP
chunks_fetched += 1
else:
log.warning(
"5A reached max_chunks=%d without end-of-stream; sending termination",
max_chunks,
"5A reached max_chunks=%d at counter=0x%04X (end=0x%04X)",
max_chunks, counter, end_offset,
)
# ── Step 3: termination ──────────────────────────────────────────────
term_counter = counter + _BULK_COUNTER_STEP
term_params = bulk_waveform_term_params(key4, term_counter)
log.debug(
"5A termination term_counter=0x%04X offset=0x%04X",
term_counter, _BULK_TERM_OFFSET,
# ── Step 5: TERM with proper end_offset-derived formula ──────────────
if last_chunk_counter is None or end_offset == 0xFFFF:
# No STRT or no chunks fetched — fall back to legacy TERM.
log.warning(
"5A using legacy TERM (offset_word=0x005A); "
"end_offset unavailable or no chunks fetched",
)
legacy_counter = (last_chunk_counter or probe_counter) + _BULK_COUNTER_STEP
term_offset_word = _BULK_TERM_OFFSET # 0x005A
term_params = bulk_waveform_term_params(key4, legacy_counter)
else:
term_offset_word, term_params = bulk_waveform_term_v2(
key4, end_offset, last_chunk_counter,
)
self._send(build_5a_frame(_BULK_TERM_OFFSET, term_params))
try:
term_rsp = self._recv_one(expected_sub=rsp_sub)
log.debug(
"5A termination response page_key=0x%04X %d bytes",
"5A TERM offset_word=0x%04X params[2:4]=%s end=0x%04X "
"last_chunk=0x%04X",
term_offset_word, term_params[2:4].hex(),
end_offset, last_chunk_counter or 0, # None on the legacy path; %04X needs an int
)
self._send(build_5a_frame(term_offset_word, term_params))
try:
term_rsp = self._recv_one(expected_sub=rsp_sub, timeout=10.0)
log.info(
"5A TERM response page_key=0x%04X %d bytes",
term_rsp.page_key, len(term_rsp.data),
)
if include_terminator:
frames_data.append(term_rsp)
except TimeoutError:
log.debug("5A no termination response — device may have already closed")
log.warning("5A no TERM response (timeout)")
return frames_data
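The end_offset-bounded chunk walk in step 4 can be checked without a device by planning the counters up front. The sketch below is an illustration under the rules stated in the docstring and comments above (0x0200 step, first sample chunk at 0x0600 for event 1, probe counter plus 0x0200 for continuation events); `plan_chunk_counters` is a hypothetical helper, not part of the module.

```python
CHUNK_STEP = 0x0200
EVENT_1_FIRST_CHUNK = 0x0600


def plan_chunk_counters(start_offset: int, end_offset: int, max_chunks: int = 256) -> list[int]:
    """List the full-chunk counters requested before the TERM frame."""
    if start_offset == 0:
        counter = EVENT_1_FIRST_CHUNK
    else:
        # Continuation event: the probe at start_offset was the first chunk.
        counter = start_offset + CHUNK_STEP
    planned: list[int] = []
    # Stop as soon as the next chunk would straddle the event end.
    while len(planned) < max_chunks and counter + CHUNK_STEP <= end_offset:
        planned.append(counter)
        counter += CHUNK_STEP
    return planned


event_1 = plan_chunk_counters(0x0000, 0x1ABE)
event_2 = plan_chunk_counters(0x0000, 0x21F2)
event_n = plan_chunk_counters(0x2238, 0x417E)
```

The last planned counter in each case is the `last_chunk_counter` that feeds the TERM computation, so these three plans line up with the three rows of the `bulk_waveform_term_v2` verification table.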
@@ -799,7 +937,7 @@ class MiniMateProtocol:
continue
chunk = data_rsp.data[11:]
log.warning(
log.debug(
"read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
step_name, data_rsp.page_key, len(data_rsp.data),
len(chunk), len(config) + len(chunk),
@@ -819,17 +957,18 @@ class MiniMateProtocol:
except TimeoutError:
pass
log.warning(
log.info(
"read_compliance_config: done — %d cfg bytes total",
len(config),
)
# Hex dump first 128 bytes for field mapping
# Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
if log.isEnabledFor(logging.DEBUG):
for row in range(0, min(len(config), 128), 16):
row_bytes = bytes(config[row:row + 16])
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
log.warning(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
log.debug(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
return bytes(config)
+99
@@ -454,3 +454,102 @@ class SocketTransport(TcpTransport):
def __repr__(self) -> str:
return f"SocketTransport(peer={self.host!r})"
# ── Capturing transport (MITM-style raw byte mirror) ──────────────────────────
class CapturingTransport(BaseTransport):
"""
Wraps another BaseTransport and mirrors every byte to two raw capture files:
raw_bw_<...>.bin bytes WE wrote to the device (BW-side TX)
raw_s3_<...>.bin bytes the device wrote back (S3-side TX)
The file naming and on-wire byte layout are identical to the captures
produced by `bridges/ach_mitm.py`, so the resulting `.bin` files can be
loaded directly by the Analyzer (File > Open Capture) and parsed by the
same tooling used for genuine Blastware MITM captures.
All BaseTransport methods are forwarded to the inner transport; the only
side-effect is that successful read/write byte streams are appended to the
two open binary files.
Args:
inner: An already-built BaseTransport (SerialTransport / TcpTransport).
bw_path: File path for the "BW TX" stream (bytes we send). Opened "wb".
s3_path: File path for the "S3 TX" stream (bytes the device sends).
Opened "wb".
Example:
with CapturingTransport(TcpTransport("1.2.3.4", 9034),
"raw_bw.bin", "raw_s3.bin") as t:
client = MiniMateClient(transport=t)
client.connect()
client.get_events()
# both .bin files now hold the full bidirectional capture.
"""
def __init__(self, inner: BaseTransport, bw_path: str, s3_path: str) -> None:
self._inner = inner
self._bw_path = bw_path
self._s3_path = s3_path
self._bw_fh = None
self._s3_fh = None
# Forward inner attrs so callers can introspect (e.g. .host, .port).
self.host = getattr(inner, "host", None)
self.port = getattr(inner, "port", None)
# ── BaseTransport interface ───────────────────────────────────────────────
def connect(self) -> None:
if self._bw_fh is None:
self._bw_fh = open(self._bw_path, "wb", buffering=0)
if self._s3_fh is None:
self._s3_fh = open(self._s3_path, "wb", buffering=0)
self._inner.connect()
def disconnect(self) -> None:
try:
self._inner.disconnect()
finally:
for fh_attr in ("_bw_fh", "_s3_fh"):
fh = getattr(self, fh_attr)
if fh is not None:
try:
fh.flush()
fh.close()
except Exception:
pass
setattr(self, fh_attr, None)
@property
def is_connected(self) -> bool:
return self._inner.is_connected
def write(self, data: bytes) -> None:
self._inner.write(data)
if data and self._bw_fh is not None:
try:
self._bw_fh.write(data)
except Exception:
pass
def read(self, n: int) -> bytes:
got = self._inner.read(n)
if got and self._s3_fh is not None:
try:
self._s3_fh.write(got)
except Exception:
pass
return got
@property
def bw_path(self) -> str:
return self._bw_path
@property
def s3_path(self) -> str:
return self._s3_path
def __repr__(self) -> str:
return f"CapturingTransport({self._inner!r}, bw={self._bw_path!r}, s3={self._s3_path!r})"
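The mirroring pattern used by `CapturingTransport` can be tried without hardware. The following is a minimal sketch, assuming only `write`/`read` need forwarding: `FakeTransport` and `TeeTransport` are hypothetical classes, and in-memory `io.BytesIO` buffers stand in for the two capture files.

```python
import io


class FakeTransport:
    """Hypothetical in-memory transport that replays canned device bytes."""

    def __init__(self, canned: bytes) -> None:
        self._rx = io.BytesIO(canned)
        self.sent = bytearray()

    def write(self, data: bytes) -> None:
        self.sent += data

    def read(self, n: int) -> bytes:
        return self._rx.read(n)


class TeeTransport:
    """Forward to an inner transport and mirror each direction to a log."""

    def __init__(self, inner, tx_log, rx_log) -> None:
        self._inner, self._tx_log, self._rx_log = inner, tx_log, rx_log

    def write(self, data: bytes) -> None:
        self._inner.write(data)  # forward first, then mirror
        if data:
            self._tx_log.write(data)

    def read(self, n: int) -> bytes:
        got = self._inner.read(n)
        if got:  # empty reads (timeouts) leave no trace in the capture
            self._rx_log.write(got)
        return got


tx_log, rx_log = io.BytesIO(), io.BytesIO()
tee = TeeTransport(FakeTransport(b"\x10\x02abc"), tx_log, rx_log)
tee.write(b"AB")
first, rest, empty = tee.read(3), tee.read(10), tee.read(1)
```

Mirroring after the inner call means a failed write or read never appears in the capture, which keeps the logs faithful to what actually crossed the wire.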
+2
@@ -53,7 +53,9 @@ SUB_TABLE: dict[int, tuple[str, str, str]] = {
0x82: ("TRIGGER_CONFIG_WRITE", "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
0x83: ("TRIGGER_WRITE_CONFIRM", "BW→S3", "Short frame; commit step after 0x82"),
# S3→BW responses
0x5A: ("BULK_WAVEFORM_STREAM", "BW→S3", "Bulk waveform chunk request; response is A5 stream"),
0xA4: ("POLL_RESPONSE", "S3→BW", "Response to SUB 5B poll"),
0xA5: ("BULK_WAVEFORM_RESPONSE", "S3→BW", "Response to SUB 5A; waveform chunks + metadata"),
0xFE: ("FULL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 01"),
0xF9: ("CHANNEL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 06"),
0xF7: ("EVENT_INDEX_RESPONSE", "S3→BW", "Response to SUB 08; contains backlight/power-save"),
+31 -34
@@ -33,7 +33,7 @@ STX = 0x02
ETX = 0x03
ACK = 0x41
__version__ = "0.2.3"
__version__ = "0.2.5"
@dataclass
@@ -186,7 +186,7 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
IDLE = 0
IN_FRAME = 1
AFTER_DLE = 2
IN_FRAME_DLE = 2 # saw DLE inside frame — waiting for next byte
state = IDLE
body = bytearray()
@@ -206,66 +206,63 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
state = IN_FRAME
i += 2
continue
# ACK bytes, boot strings, garbage — silently ignored
elif state == IN_FRAME:
if b == DLE:
state = AFTER_DLE
state = IN_FRAME_DLE
i += 1
continue
body.append(b)
else: # AFTER_DLE
if b == DLE:
body.append(DLE)
state = IN_FRAME
i += 1
continue
if b == ETX:
# Bare ETX = real S3 frame terminator (confirmed from S3FrameParser)
end_offset = i + 1
trailer_start = i + 1
trailer_end = trailer_start + trailer_len
trailer = blob[trailer_start:trailer_end]
chk_valid = None
chk_type = None
chk_hex = None
payload = bytes(body)
if len(body) >= 1:
received_chk = body[-1]
computed_chk = checksum8_sum(bytes(body[:-1]))
if computed_chk == received_chk:
chk_valid = True
chk_type = "SUM8"
chk_hex = f"{received_chk:02x}"
payload = bytes(body[:-1])
else:
chk_valid = False
# S3 checksums are deliberately not validated here.
# Large S3 responses (A5 bulk waveform, E5 compliance) embed
# inner DLE+ETX sub-frame terminators whose trailing 0x03 byte
# lands where the parser would expect the SUM8 checksum, causing
# false failures. The live protocol (protocol.py _validate_frame)
# also skips S3 checksum enforcement for the same reason.
frames.append(Frame(
index=idx,
start_offset=start_offset,
end_offset=end_offset,
payload_raw=bytes(body),
payload=payload,
payload=bytes(body),
trailer=trailer,
checksum_valid=chk_valid,
checksum_type=chk_type,
checksum_hex=chk_hex
checksum_valid=None,
checksum_type=None,
checksum_hex=None
))
idx += 1
state = IDLE
i = trailer_end
continue
body.append(b)
else: # IN_FRAME_DLE
if b == DLE:
# DLE DLE → literal 0x10 in payload
body.append(DLE)
state = IN_FRAME
i += 1
continue
if b == ETX:
# DLE+ETX inside a frame = inner-frame terminator (A4/E5 sub-frames).
# Treat as literal data, NOT the outer frame end.
body.append(DLE)
body.append(ETX)
state = IN_FRAME
i += 1
continue
# Unexpected DLE + byte → treat as literal data
body.append(DLE)
body.append(b)
state = IN_FRAME
i += 1
continue
i += 1
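The revised DLE handling in `parse_s3` (bare ETX ends the frame, `DLE DLE` collapses to one 0x10, any other `DLE X` pair is kept literally) can be sketched as a small standalone splitter. This is an illustration only: `split_s3_frames` is a hypothetical name, the `DLE STX` start marker is an assumption because the IDLE branch is not fully visible in this hunk, and trailer handling is omitted.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03


def split_s3_frames(blob: bytes) -> list[bytes]:
    """Return frame bodies; checksums are not validated, as in the new parser."""
    frames: list[bytes] = []
    body = bytearray()
    in_frame = False
    i = 0
    while i < len(blob):
        b = blob[i]
        if not in_frame:
            # Assumed start marker: DLE followed by STX. Anything else is skipped.
            if b == DLE and i + 1 < len(blob) and blob[i + 1] == STX:
                body = bytearray()
                in_frame = True
                i += 2
            else:
                i += 1
            continue
        if b == DLE and i + 1 < len(blob):
            nxt = blob[i + 1]
            if nxt == DLE:
                body.append(DLE)           # DLE DLE is a literal 0x10
            else:
                body += bytes([DLE, nxt])  # DLE+ETX and other pairs stay literal
            i += 2
            continue
        if b == ETX:
            frames.append(bytes(body))     # bare ETX is the outer frame end
            in_frame = False
        else:
            body.append(b)
        i += 1
    return frames


blob = b"\x41" + b"\x10\x02\xa5\x10\x10\x10\x03\x07\x03" + b"\x10\x02\x01\x03"
frames = split_s3_frames(blob)
```

In the first frame the inner `10 03` pair survives as data instead of ending the frame, which is the behaviour the rewritten `IN_FRAME_DLE` branch is meant to provide.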
+4 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "seismo-relay"
version = "0.12.0"
version = "0.15.0"
description = "Python client and REST server for MiniMate Plus seismographs"
requires-python = ">=3.10"
dependencies = [
@@ -12,6 +12,9 @@ dependencies = [
"uvicorn[standard]>=0.24",
"pyserial>=3.5",
"sqlalchemy>=2.0",
"python-multipart>=0.0.7",
"h5py>=3.10",
"numpy>=1.24",
]
[tool.setuptools.packages.find]
+3
@@ -2,3 +2,6 @@ fastapi
uvicorn
sqlalchemy
pyserial
python-multipart
h5py
numpy
+346
@@ -0,0 +1,346 @@
"""
scripts/backfill_sidecars.py: generate .sfm.json sidecars AND .h5
clean-waveform files for existing events already in the waveform store
that predate those features.
Walks `<store_root>/<serial>/<filename>` and for each BW event file:
Sidecar (.sfm.json):
- Skip when an existing sidecar's blastware.sha256 matches the
current BW file's sha256 and its source.tool_version is at least
the current TOOL_VERSION.
- Else regenerate: prefer .a5.pkl (full fidelity); fall back to
parsing the BW binary directly (peaks computed from samples).
Clean waveform (.h5):
- Skip when <filename>.h5 already exists (idempotent), unless --force.
- Else write from .a5.pkl (preferred) or BW binary parse (fallback).
Usage:
python scripts/backfill_sidecars.py [--store-root PATH]
[--db-path PATH]
[--dry-run]
[--skip-hdf5]
[--force]
[-v]
"""
from __future__ import annotations
import argparse
import logging
import sys
from pathlib import Path
# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from minimateplus import event_file_io
from sfm import event_hdf5
from sfm.waveform_store import WaveformStore, _frame_to_dict, _dict_to_frame # noqa: F401
from sfm.database import SeismoDb
log = logging.getLogger("backfill_sidecars")
def _looks_like_event_file(path: Path) -> bool:
"""Same heuristic as the importer CLI."""
if not path.is_file():
return False
if path.name.endswith((".a5.pkl", ".sfm.json")):
return False
ext = path.suffix.lstrip(".")
if not (3 <= len(ext) <= 4):
return False
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
return False
try:
return path.stat().st_size >= 70
except OSError:
return False
def main(argv=None) -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument(
"--db-path",
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
)
p.add_argument("--store-root", default=None)
p.add_argument("--dry-run", action="store_true")
p.add_argument(
"--skip-hdf5", action="store_true",
help="Don't generate .h5 clean-waveform files (only sidecars).",
)
p.add_argument(
"--force", action="store_true",
help=(
"Regenerate sidecars + .h5 even when an existing sidecar's "
"blastware.sha256 matches the current BW file. Use this after "
"upgrading seismo-relay to pull in decoder bug fixes (e.g. the "
"STRT-rectime byte-offset fix in v0.15.x)."
),
)
p.add_argument("-v", "--verbose", action="store_true")
args = p.parse_args(argv)
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
datefmt="%H:%M:%S",
)
db_path = Path(args.db_path).expanduser().resolve()
store_root = (
Path(args.store_root).expanduser().resolve()
if args.store_root else db_path.parent / "waveforms"
)
if not store_root.exists():
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
return 2
store = WaveformStore(store_root)
db = SeismoDb(db_path)
written = skipped = errors = 0
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
serial = serial_dir.name
for path in sorted(serial_dir.iterdir()):
if not _looks_like_event_file(path):
continue
sidecar_path = store.sidecar_path_for(serial, path.name)
try:
bw_sha = event_file_io.file_sha256(path)
except Exception as exc:
log.error("sha256 failed for %s: %s", path, exc)
errors += 1
continue
# Skip when an up-to-date sidecar already exists.
#
# Two-part freshness check:
# 1. blastware.sha256 must match the current BW file (proves
# the sidecar describes THIS file).
# 2. source.tool_version must be ≥ current TOOL_VERSION (proves
# the sidecar was written by a build that includes any
# decoder fixes shipped since).
# Either part failing → regenerate. --force bypasses both.
if sidecar_path.exists() and not args.force:
try:
existing = event_file_io.read_sidecar(sidecar_path)
sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
src_ver = existing.get("source", {}).get("tool_version", "")
def _vt(s):
try:
return tuple(int(p) for p in str(s).split(".")[:3])
except Exception:
return (0, 0, 0)
ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
if sha_ok and ver_ok:
skipped += 1
continue
if sha_ok and not ver_ok:
log.info(
"regenerating %s (sidecar tool_version=%s < current %s)",
sidecar_path.name, src_ver or "(none)",
event_file_io.TOOL_VERSION,
)
except Exception:
pass # fall through to rewrite
# Decide path: A5-based (high-fidelity) or BW-only.
a5_path = serial_dir / f"{path.name}.a5.pkl"
try:
if a5_path.exists():
frames = store.load_a5(serial, path.name)
if not frames:
raise RuntimeError("a5_pickle present but unreadable")
# Build an Event by replaying the A5 decoders. Note:
# the .a5.pkl alone CANNOT recover timestamp /
# record_type / waveform_key / per-channel peaks —
# those live in the 0C record, which isn't saved
# separately. We seed those from the DB row + the
# existing sidecar below so a re-backfill doesn't
# nuke fields the original save populated.
from minimateplus.client import (
_decode_a5_metadata_into,
_decode_a5_waveform,
)
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
ev = Event(index=-1)
_decode_a5_metadata_into(frames, ev)
_decode_a5_waveform(frames, ev)
source_kind = "sfm-live"
a5_filename = a5_path.name
else:
ev = event_file_io.read_blastware_file(path)
source_kind = "bw-import"
a5_filename = None
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
# ── Seed missing fields from the SeismoDb events row ──
# The DB row was populated at original save time with peaks,
# project info, timestamp, record_type, sample_rate, etc.
# All of those survive intact in SQLite; pull them onto the
# rebuilt Event so the regenerated sidecar matches what was
# there before the backfill ran.
db_row = None
try:
import sqlite3 as _sql
with _sql.connect(str(db.db_path)) as _conn:
_conn.row_factory = _sql.Row
db_row = _conn.execute(
"SELECT * FROM events "
"WHERE serial=? AND blastware_filename=? "
"LIMIT 1",
(serial, path.name),
).fetchone()
except Exception as exc:
log.debug("DB lookup failed for %s: %s", path.name, exc)
if db_row is not None:
if ev.sample_rate is None and db_row["sample_rate"]:
ev.sample_rate = int(db_row["sample_rate"])
if not ev.record_type and db_row["record_type"]:
ev.record_type = db_row["record_type"]
if ev._waveform_key is None and db_row["waveform_key"]:
try:
ev._waveform_key = bytes.fromhex(db_row["waveform_key"])
except Exception:
pass
# Timestamp from the ISO-8601 string in the DB row.
if ev.timestamp is None and db_row["timestamp"]:
try:
import datetime as _dt
_t = _dt.datetime.fromisoformat(db_row["timestamp"])
ev.timestamp = Timestamp(
raw=b"", flag=0x10,
year=_t.year, unknown_byte=0,
month=_t.month, day=_t.day,
hour=_t.hour, minute=_t.minute, second=_t.second,
)
except Exception:
pass
# Peaks from the DB row when the A5 decode didn't supply them.
if ev.peak_values is None:
ev.peak_values = PeakValues(
tran=db_row["tran_ppv"],
vert=db_row["vert_ppv"],
long=db_row["long_ppv"],
peak_vector_sum=db_row["peak_vector_sum"],
micl=db_row["mic_ppv"],
)
# Project info from the DB row when the A5 metadata-page
# decode didn't pick it up.
if ev.project_info is None or all(
v in (None, "")
for v in (
(ev.project_info.project if ev.project_info else None),
(ev.project_info.client if ev.project_info else None),
(ev.project_info.operator if ev.project_info else None),
(ev.project_info.sensor_location if ev.project_info else None),
)
):
ev.project_info = ProjectInfo(
project=db_row["project"],
client=db_row["client"],
operator=db_row["operator"],
sensor_location=db_row["sensor_location"],
)
# Derive total_samples when we have both rectime + sample_rate.
# The decoder's STRT-derived value can be a buffer offset
# rather than a sample count — drop it in that case.
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
# Preserve user-edited review state + extensions from the
# existing sidecar (false_trigger flag, notes, etc.) so a
# backfill never wipes them out.
preserved_review = None
preserved_ext = None
if sidecar_path.exists():
try:
_existing = event_file_io.read_sidecar(sidecar_path)
preserved_review = _existing.get("review")
preserved_ext = _existing.get("extensions")
except Exception:
pass
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=path.name,
blastware_filesize=path.stat().st_size,
blastware_sha256=bw_sha,
source_kind=source_kind,
a5_pickle_filename=a5_filename,
review=preserved_review,
extensions=preserved_ext,
)
# Also emit the .h5 clean-waveform file when missing OR when
# --force was passed (so a re-backfill picks up decoder fixes).
hdf5_path = store.hdf5_path_for(serial, path.name)
hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
hdf5_action = "kept"
need_h5 = not args.skip_hdf5 and (args.force or not hdf5_path.exists())
if need_h5:
if args.dry_run:
hdf5_action = "would (re)write"
else:
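# Record whether the file existed BEFORE writing; checking
# afterwards would always report "rewrote".
h5_existed = hdf5_path.exists()
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range="normal",
source_kind=source_kind,
)
hdf5_filename = hdf5_path.name
hdf5_action = "rewrote" if h5_existed else "wrote"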
except Exception as exc:
log.warning("HDF5 write failed for %s: %s", path.name, exc)
hdf5_action = "FAILED"
if args.dry_run:
print(f" [DRY ] would write {sidecar_path.name} "
f"+ .h5 ({hdf5_action}) source={source_kind}")
written += 1
continue
event_file_io.write_sidecar(sidecar_path, sidecar)
# Best-effort: keep the SQL row's sidecar_filename in sync
# by upserting via insert_events (it dedups on serial+ts).
try:
db.insert_events(
[ev], serial=serial,
waveform_records=(
{ev._waveform_key.hex(): {
"filename": path.name,
"filesize": path.stat().st_size,
"a5_pickle_filename": a5_filename,
"sidecar_filename": sidecar_path.name,
}}
if ev._waveform_key else None
),
)
except Exception as exc:
log.warning("DB upsert failed for %s: %s", path.name, exc)
print(f" [OK ] {path.name} -> {sidecar_path.name} "
f"+ h5 ({hdf5_action}) source={source_kind}")
written += 1
except Exception as exc:
log.error("backfill failed for %s: %s", path, exc, exc_info=args.verbose)
errors += 1
print(f"\nDone. written={written} skipped(uptodate)={skipped} errors={errors}")
return 0 if errors == 0 else 1
if __name__ == "__main__":
sys.exit(main())
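Note on the two-part freshness check in the script above: it compares versions as integer tuples rather than strings. The sketch below shows why that matters. `version_tuple` and `sidecar_is_fresh` are hypothetical stand-ins for the inline `_vt` helper and the `sha_ok and ver_ok` test, and the hash values are placeholders.

```python
def version_tuple(s) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into ints; unparsable input sorts lowest."""
    try:
        return tuple(int(part) for part in str(s).split(".")[:3])
    except ValueError:
        return (0, 0, 0)


def sidecar_is_fresh(sidecar_sha, file_sha, sidecar_ver, current_ver) -> bool:
    """Fresh only if the hash matches AND the writer was not older."""
    same_file = sidecar_sha == file_sha
    new_enough = version_tuple(sidecar_ver) >= version_tuple(current_ver)
    return same_file and new_enough


# String comparison gets multi-digit components wrong ...
print("0.9.0" >= "0.15.0")
# ... while tuple comparison orders them numerically.
print(version_tuple("0.9.0") >= version_tuple("0.15.0"))

print(sidecar_is_fresh("abc", "abc", "0.9.0", "0.15.0"))   # stale writer
print(sidecar_is_fresh("abc", "abc", "0.15.1", "0.15.0"))  # up to date
```

A sidecar with no `tool_version` parses to `(0, 0, 0)`, so it is always treated as stale and regenerated.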
+793 -101
File diff suppressed because it is too large
+132 -3
@@ -83,6 +83,15 @@ class CachedEvent(Base):
Events are immutable once recorded on the device; once we have an event in
the cache it never needs to be re-downloaded unless explicitly requested.
The two extra columns `waveform_key` and `event_timestamp` are an
integrity stamp: when set_events() / set_waveform() are called with a
different (waveform_key, event_timestamp) for the same (conn_key, index),
we know the device was erased and re-recorded. The cached row no longer
refers to the same physical event, so the entire device's cache is
flushed before the new entry is written. This catches the post-erase
key-reuse bug where the device's first new event (key 01110000) collides
with the first event we previously downloaded.
"""
__tablename__ = "cached_events"
@@ -90,6 +99,8 @@ class CachedEvent(Base):
index = sa.Column(sa.Integer, primary_key=True)
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
class CachedWaveform(Base):
@@ -97,7 +108,9 @@ class CachedWaveform(Base):
Full raw ADC waveform for a single event (SUB 5A full download).
These are large (up to several MB) and expensive to fetch over cellular.
Once downloaded they are immutable and cached permanently.
Once downloaded they are immutable and cached permanently, but the
cache row is invalidated when the device is erased and a new event lands
at the same index (see CachedEvent docstring).
"""
__tablename__ = "cached_waveforms"
@@ -105,6 +118,8 @@ class CachedWaveform(Base):
index = sa.Column(sa.Integer, primary_key=True)
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
cached_at = sa.Column(sa.Float, nullable=False)
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
class CachedMonitorStatus(Base):
@@ -149,6 +164,23 @@ class SFMCache:
engine = sa.create_engine(url, connect_args={"check_same_thread": False})
Base.metadata.create_all(engine)
self._Session = orm.sessionmaker(bind=engine)
# In-place schema migration: add the (waveform_key, event_timestamp)
# integrity-stamp columns to legacy cache DBs that predate the
# post-erase eviction logic. ALTER TABLE ADD COLUMN is idempotent
# via the column-presence check below.
with engine.begin() as conn:
for table in ("cached_events", "cached_waveforms"):
cols = {
r[1]
for r in conn.exec_driver_sql(f"PRAGMA table_info({table})").fetchall()
}
for new_col, ddl in (
("waveform_key", "TEXT"),
("event_timestamp", "TEXT"),
):
if new_col not in cols:
log.info("cache schema: %s ADD COLUMN %s %s", table, new_col, ddl)
conn.exec_driver_sql(f"ALTER TABLE {table} ADD COLUMN {new_col} {ddl}")
log.info("SFM cache opened: %s", db_path)
# ── Connection key ────────────────────────────────────────────────────────
@@ -242,15 +274,91 @@ class SFMCache:
row = s.get(CachedEvent, (conn_key, index))
return json.loads(row.event_json) if row else None
@staticmethod
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
"""
Extract the (waveform_key_hex, timestamp_iso) integrity stamp from
a serialised event dict. Either field may be None if the source
Event was missing it; the comparison logic in set_events/set_waveform
treats "both sides have a value AND they differ" as the only
eviction trigger, so partial data never spuriously flushes cache.
"""
key = ev.get("waveform_key") or ev.get("_waveform_key")
if isinstance(key, (bytes, bytearray)):
key = bytes(key).hex()
ts = ev.get("timestamp")
if isinstance(ts, dict):
# _serialise_timestamp returns a dict like {"iso": "...", ...}
ts = ts.get("iso") or ts.get("string") or None
return (key if isinstance(key, str) else None,
ts if isinstance(ts, str) else None)
def _maybe_flush_on_mismatch(
self,
s,
conn_key: str,
index: int,
new_key: Optional[str],
new_ts: Optional[str],
) -> bool:
"""
Check whether the cached entry at (conn_key, index) has a different
(waveform_key, timestamp) than the incoming one. If so, treat it as
a post-erase key-reuse signal and flush ALL cached events/waveforms
for this device, then return True.
Returns False when no flush was needed.
"""
if not new_key and not new_ts:
return False # nothing to compare against
existing = s.get(CachedEvent, (conn_key, index))
if existing is None:
existing = s.get(CachedWaveform, (conn_key, index))
if existing is None:
return False
old_key = existing.waveform_key
old_ts = existing.event_timestamp
# Only flush when both sides have populated values and they differ.
differs = (
(new_key and old_key and new_key != old_key)
or (new_ts and old_ts and new_ts != old_ts)
)
if not differs:
return False
log.warning(
"cache: device %s — index %d (key=%s, ts=%s) replaces (key=%s, ts=%s); "
"flushing all cached events/waveforms for this device "
"(post-erase key reuse detected)",
conn_key, index, new_key, new_ts, old_key, old_ts,
)
s.query(CachedEvent).filter_by(conn_key=conn_key).delete()
s.query(CachedWaveform).filter_by(conn_key=conn_key).delete()
return True
def set_events(self, conn_key: str, events: list[dict]) -> None:
"""
Upsert a list of event dicts. Existing rows are updated; new rows are
inserted. This is used to add newly-discovered events to the cache.
Eviction: if any incoming event has a different (waveform_key,
timestamp) than the row currently cached at the same index, we flush
the entire device's cache before inserting the new entries. Catches
post-erase key reuse where index 0 silently switches identity.
"""
now = time.time()
with self._Session() as s:
# Eviction check: scan incoming events for any (index, key, ts)
# that conflicts with a cached row. A single conflict triggers
# a full device-wide flush so we don't end up with a mixed-era
# cache.
for ev in events:
key, ts = self._event_signature(ev)
if self._maybe_flush_on_mismatch(s, conn_key, ev["index"], key, ts):
s.commit()
break # cache is now empty for this device; carry on
for ev in events:
idx = ev["index"]
key, ts = self._event_signature(ev)
row = s.get(CachedEvent, (conn_key, idx))
if row is None:
row = CachedEvent(
@@ -258,12 +366,18 @@ class SFMCache:
index=idx,
event_json=json.dumps(ev),
cached_at=now,
waveform_key=key,
event_timestamp=ts,
)
s.add(row)
log.debug("cached new event %d for %s", idx, conn_key)
else:
# Refresh in case project_info was backfilled after initial store
row.event_json = json.dumps(ev)
if key:
row.waveform_key = key
if ts:
row.event_timestamp = ts
s.commit()
# ── Waveforms ─────────────────────────────────────────────────────────────
@@ -278,8 +392,16 @@ class SFMCache:
return json.loads(row.waveform_json)
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
"""Store a full waveform response dict permanently."""
"""
Store a full waveform response dict permanently.
Like set_events, this checks the (waveform_key, timestamp) signature
of the incoming entry against what's currently cached at the same
index. A mismatch flushes the entire device's cache before insert.
"""
key, ts = self._event_signature(waveform)
with self._Session() as s:
self._maybe_flush_on_mismatch(s, conn_key, index, key, ts)
row = s.get(CachedWaveform, (conn_key, index))
if row is None:
row = CachedWaveform(
@@ -287,13 +409,20 @@ class SFMCache:
index=index,
waveform_json=json.dumps(waveform),
cached_at=time.time(),
waveform_key=key,
event_timestamp=ts,
)
s.add(row)
else:
row.waveform_json = json.dumps(waveform)
row.cached_at = time.time()
if key:
row.waveform_key = key
if ts:
row.event_timestamp = ts
s.commit()
log.debug("cached waveform for %s event %d", conn_key, index)
log.debug("cached waveform for %s event %d (key=%s, ts=%s)",
conn_key, index, key, ts)
# ── Monitor status ────────────────────────────────────────────────────────
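Note on the eviction rule in `_maybe_flush_on_mismatch`: it can be read as a pure predicate over the old and new stamps. The sketch below restates it in isolation; `stamps_conflict` is a hypothetical name, and the timestamps are invented for illustration.

```python
from typing import Optional


def stamps_conflict(
    new_key: Optional[str],
    new_ts: Optional[str],
    old_key: Optional[str],
    old_ts: Optional[str],
) -> bool:
    """True only when a field is populated on BOTH sides and differs."""
    key_differs = bool(new_key and old_key and new_key != old_key)
    ts_differs = bool(new_ts and old_ts and new_ts != old_ts)
    return key_differs or ts_differs


# Same key reused after an erase, but a different timestamp: flush.
print(stamps_conflict("01110000", "2026-05-08T10:00:00",
                      "01110000", "2026-04-01T09:00:00"))

# Legacy row without a stamp: nothing to compare, so no flush.
print(stamps_conflict("01110000", "2026-05-08T10:00:00", None, None))

# Identical stamps: the same physical event, no flush.
print(stamps_conflict("01110000", "2026-05-08T10:00:00",
                      "01110000", "2026-05-08T10:00:00"))
```

Requiring both sides to be populated is what keeps rows written before the migration (stamp columns NULL) from triggering a device-wide flush.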
+100 -2
@@ -81,6 +81,10 @@ CREATE TABLE IF NOT EXISTS events (
sample_rate INTEGER,
record_type TEXT, -- "single_shot" | "continuous"
false_trigger INTEGER NOT NULL DEFAULT 0, -- 0=no, 1=yes (manual flag)
blastware_filename TEXT, -- event file within waveform store; extension is per-event (AB0T encodes timestamp)
blastware_filesize INTEGER, -- bytes; NULL if no event file saved
a5_pickle_filename TEXT, -- "<filename>.a5.pkl" sidecar
sidecar_filename TEXT, -- "<filename>.sfm.json" review/metadata sidecar
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
UNIQUE(serial, timestamp)
);
@@ -184,6 +188,21 @@ class SeismoDb:
""")
log.info("_migrate: events table rebuilt OK")
# Migration 1b: add Blastware-file columns to existing events tables.
# New columns are NULLable so old rows just read NULL.
existing_cols = {
r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()
}
for col, ddl in (
("blastware_filename", "TEXT"),
("blastware_filesize", "INTEGER"),
("a5_pickle_filename", "TEXT"),
("sidecar_filename", "TEXT"),
):
if col not in existing_cols:
log.info("_migrate: events ADD COLUMN %s %s", col, ddl)
conn.execute(f"ALTER TABLE events ADD COLUMN {col} {ddl}")
# Migration 2: change monitor_log UNIQUE from (serial, waveform_key) to
# (serial, start_time) — same reasoning as events.
row = conn.execute(
@@ -282,12 +301,24 @@ class SeismoDb:
*,
serial: str,
session_id: Optional[str] = None,
waveform_records: Optional[dict[str, dict]] = None,
) -> tuple[int, int]:
"""
Insert triggered events. Silently skips duplicates (serial+timestamp).
Returns (inserted, skipped).
``waveform_records`` (optional): dict keyed by event waveform_key (hex)
whose value is a record from ``WaveformStore.save()``:
{"filename": str, "filesize": int, "a5_pickle_filename": str}
plus, optionally, "sidecar_filename": str.
For events whose key is in this dict, the matching columns are
populated. If a row with the same (serial, timestamp) already exists
(dedup hit), the matching waveform record is upserted onto the
existing row so a re-download via the live endpoint refreshes the
file metadata.
"""
inserted = skipped = 0
wave_recs = waveform_records or {}
with self._connect() as conn:
for ev in events:
key = ev._waveform_key.hex() if ev._waveform_key else None
@@ -307,6 +338,7 @@ class SeismoDb:
pv = ev.peak_values
pi = ev.project_info
rec = wave_recs.get(key) or {}
try:
conn.execute(
@@ -315,8 +347,10 @@ class SeismoDb:
(id, serial, waveform_key, session_id, timestamp,
tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv,
project, client, operator, sensor_location,
sample_rate, record_type)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
sample_rate, record_type,
blastware_filename, blastware_filesize,
a5_pickle_filename, sidecar_filename)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
(
self._new_id(), serial, key, session_id, ts,
@@ -331,16 +365,50 @@ class SeismoDb:
pi.sensor_location if pi else None,
ev.sample_rate,
ev.record_type,
rec.get("filename"),
rec.get("filesize"),
rec.get("a5_pickle_filename"),
rec.get("sidecar_filename"),
),
)
inserted += 1
except sqlite3.IntegrityError:
skipped += 1
# Upsert waveform fields onto the existing dedup row so a
# re-download via the live endpoint refreshes filename /
# size / sidecar without churning the rest of the row.
if rec and ts:
conn.execute(
"""
UPDATE events
SET blastware_filename = ?,
blastware_filesize = ?,
a5_pickle_filename = ?,
sidecar_filename = ?
WHERE serial = ? AND timestamp = ?
""",
(
rec.get("filename"),
rec.get("filesize"),
rec.get("a5_pickle_filename"),
rec.get("sidecar_filename"),
serial,
ts,
),
)
log.debug("insert_events serial=%s inserted=%d skipped=%d",
serial, inserted, skipped)
return inserted, skipped
def get_event(self, event_id: str) -> Optional[dict]:
"""Return one event row by id, or None."""
with self._connect() as conn:
row = conn.execute(
"SELECT * FROM events WHERE id = ?", (event_id,),
).fetchone()
return dict(row) if row else None
def query_events(
self,
serial: Optional[str] = None,
@@ -387,6 +455,36 @@ class SeismoDb:
)
return cur.rowcount > 0
def update_event_review(self, event_id: str, review: dict) -> bool:
"""
Sync derived index columns from a sidecar's `review` block.
Currently the only derived index is `events.false_trigger`, kept
in sync so `/db/events?false_trigger=true` queries don't have to
scan every sidecar JSON on disk. The sidecar JSON itself remains
the source of truth for the full review state.
Returns True when the row exists, False otherwise. No-op fields
(review without `false_trigger`) leave the column untouched.
"""
if not isinstance(review, dict):
return False
if "false_trigger" not in review:
# Nothing derived to update; just confirm the row exists.
with self._connect() as conn:
row = conn.execute(
"SELECT 1 FROM events WHERE id=?", (event_id,),
).fetchone()
return row is not None
flag = 1 if review.get("false_trigger") else 0
with self._connect() as conn:
cur = conn.execute(
"UPDATE events SET false_trigger=? WHERE id=?",
(flag, event_id),
)
return cur.rowcount > 0
# ── Monitor log ───────────────────────────────────────────────────────────
def insert_monitor_log(
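Note on Migration 1b above (and the equivalent cache migration): SQLite has no `ADD COLUMN IF NOT EXISTS`, so the column-presence check is what makes the migration safe to run on every start-up. A minimal sketch against an in-memory database follows; `add_missing_columns` is a hypothetical helper and the table is trimmed to two columns.

```python
import sqlite3


def add_missing_columns(conn, table, columns):
    """Add each (name, ddl) column that PRAGMA table_info does not list."""
    # Row layout of table_info: (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, ddl in columns:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {ddl}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, serial TEXT)")
conn.execute("INSERT INTO events VALUES ('e1', 'BE11529')")

wanted = [("blastware_filename", "TEXT"), ("blastware_filesize", "INTEGER")]
first = add_missing_columns(conn, "events", wanted)
second = add_missing_columns(conn, "events", wanted)  # no-op on re-run

print(first, second)
# Pre-existing rows simply read NULL for the new columns.
print(conn.execute("SELECT blastware_filename FROM events").fetchone())
```

The table and column names are interpolated into the SQL because SQLite cannot bind identifiers as parameters; that is only safe because they come from a fixed list in the code, never from user input.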
+216
@@ -0,0 +1,216 @@
"""
sfm.dump_0c: inspect the raw 210-byte SUB 0C waveform record stored in a
sidecar JSON's `extensions.raw_records.waveform_record_b64`.
Usage:
python -m sfm.dump_0c <sidecar.sfm.json> [<sidecar.sfm.json> ...]
Prints, for each input:
- A header summarising the sidecar's metadata-block claims (peaks,
project, timestamp): the "what BW says this event measured" view.
- A 16-byte-wide hex dump of the raw 0C record, annotated with known
field anchors (STRT, channel labels, project strings).
- A "candidate float regions" scan that brute-forces every byte
position as a float32 BE and prints any that yield a value in a
plausible range (1e-5 to 50); useful for hunting where Peak
Acceleration / Peak Displacement / ZC Freq / Time of Peak live.
Pairing the printed candidates with the BW Event Report values lets
us nail down byte offsets for the missing fields without a live
device.
"""
from __future__ import annotations
import argparse
import base64
import json
import struct
import sys
from pathlib import Path
# ── Annotations for known anchors in a 210-byte 0C record ──────────────────
# Anchors we look for and label inline in the hex dump. Each is a needle
# (bytes to find) and a short label. Found via .find() — the first
# occurrence wins.
_ANCHORS = [
(b"Tran", "Tran label (PPV @ +6, PVS @ -12)"),
(b"Vert", "Vert label (PPV @ +6)"),
(b"Long", "Long label (PPV @ +6)"),
(b"MicL", "MicL label (peak psi @ +6)"),
(b"Project:", "Project: label"),
(b"Client:", "Client: label"),
(b"User Name:", "User Name: label"),
(b"Seis Loc:", "Seis Loc: label"),
(b"Extended Notes", "Extended Notes label"),
]
def _hex_dump(data: bytes, anchors: dict[int, str]) -> str:
"""Return a 16-byte-wide hex+ASCII dump, with anchor labels printed
on the line that contains the anchor's start byte."""
lines = []
for off in range(0, len(data), 16):
chunk = data[off : off + 16]
hex_part = " ".join(f"{b:02x}" for b in chunk)
ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
line = f" {off:04x} {hex_part:<47} |{ascii_part}|"
# If any anchor lands on a byte in this row, append a tag
tags = [
f"[{a:#04x}: {label}]"
for a, label in anchors.items()
if off <= a < off + 16
]
if tags:
line += " " + " ".join(tags)
lines.append(line)
return "\n".join(lines)
def _scan_float32_be(data: bytes, lo: float, hi: float) -> list[tuple[int, float]]:
"""Brute-force every offset where data[off:off+4] is a float32 BE in
(lo, hi). Includes negatives in the symmetric range."""
hits = []
for i in range(len(data) - 3):
try:
v = struct.unpack_from(">f", data, i)[0]
except struct.error:
continue
if v != v: # NaN
continue
if abs(v) < 1e-30 or abs(v) > 1e10: # crap range
continue
a = abs(v)
if lo <= a <= hi:
hits.append((i, v))
return hits
def _scan_uint16_be(data: bytes, lo: int, hi: int) -> list[tuple[int, int]]:
"""Find every offset where uint16 BE is in [lo, hi]."""
hits = []
for i in range(len(data) - 1):
v = (data[i] << 8) | data[i + 1]
if lo <= v <= hi:
hits.append((i, v))
return hits
def _summarize_sidecar(side: dict) -> str:
    ev = side.get("event", {})
    pv = side.get("peak_values", {})
    pi = side.get("project_info", {})
    bw = side.get("blastware", {})
    peak_keys = ("transverse", "vertical", "longitudinal", "vector_sum", "mic_psi")
    if all(pv.get(k) is not None for k in peak_keys):
        peaks = (
            f"Tran={pv['transverse']:.5f} "
            f"Vert={pv['vertical']:.5f} "
            f"Long={pv['longitudinal']:.5f} "
            f"PVS={pv['vector_sum']:.5f} in/s "
            f"Mic={pv['mic_psi']:.6e} psi"
        )
    else:
        # Partial peaks: print the raw dict instead of formatting None.
        peaks = str(pv)
    # The header is built unconditionally. Previously the conditional
    # expression swallowed every preceding f-string, so a missing peak
    # dropped the serial/timestamp/waveform lines from the output.
    return (
        f" serial: {ev.get('serial')}\n"
        f" timestamp: {ev.get('timestamp')}\n"
        f" waveform: {ev.get('waveform_key')} ({ev.get('record_type')})\n"
        f" sample_rate:{ev.get('sample_rate')} sps rectime:{ev.get('rectime_seconds')}s\n"
        f" bw file: {bw.get('filename')} ({bw.get('filesize')} B)\n"
        f" peaks: {peaks}\n"
        f" project: {pi.get('project')!r} / {pi.get('client')!r} / "
        f"operator={pi.get('operator')!r} loc={pi.get('sensor_location')!r}"
    )
def dump_one(path: Path) -> int:
side = json.loads(path.read_text(encoding="utf-8"))
raw_b64 = (
side.get("extensions", {})
.get("raw_records", {})
.get("waveform_record_b64")
)
if not raw_b64:
print(f"\n=== {path} ===")
print(" ! no extensions.raw_records.waveform_record_b64 — sidecar")
print(" pre-dates raw-0C persistence (added in v0.15.x). Re-save")
print(" the event from the device to capture the bytes.")
return 1
raw = base64.b64decode(raw_b64)
# Build anchor map
anchors: dict[int, str] = {}
for needle, label in _ANCHORS:
i = raw.find(needle)
if i >= 0:
anchors[i] = label
print(f"\n=== {path} ===")
print("metadata claimed by sidecar:")
print(_summarize_sidecar(side))
print(f"\nraw 0C record ({len(raw)} bytes):")
print(_hex_dump(raw, anchors))
# Float32 BE candidates in geo-relevant ranges
geo_hits = _scan_float32_be(raw, 1e-5, 50.0)
# Annotate (not filter) hits that coincide with the already-documented
# per-channel label+6 PPV floats; those turn up in any sweep too.
print("\nfloat32 BE candidates (1e-5 .. 50.0):")
for off, v in geo_hits:
annotation = ""
for needle, _ in _ANCHORS[:4]: # geo + mic labels
i = raw.find(needle)
if i >= 0 and off == i + 6:
annotation = f"  <- {needle.decode()} PPV (label+6)"
break
print(f" {off:#04x} ({off:3d}) {v:>+15.6f}{annotation}")
print("\nuint16 BE candidates ZC-Freq-ish (1..200):")
for off, v in _scan_uint16_be(raw, 1, 200):
if v < 5: # too noisy at very low end
continue
print(f" {off:#04x} ({off:3d}) = {v}")
print("\nuint16 BE candidates Time-of-Peak-ish if stored as ms (1..30000):")
for off, v in _scan_uint16_be(raw, 1, 30000):
if v < 100: # noise filter
continue
# Only offsets up to 80 are worth showing; too many hits otherwise
if off > 80:
break
print(f" {off:#04x} ({off:3d}) = {v} ms ?")
print()
return 0
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(
description="Inspect a saved 0C waveform record from a sidecar JSON.",
)
p.add_argument(
"sidecars",
nargs="+",
type=Path,
help="Path(s) to <event>.sfm.json sidecar file(s).",
)
args = p.parse_args(argv)
rc = 0
for path in args.sidecars:
try:
rc |= dump_one(path)
except Exception as exc:
print(f"\n=== {path} ===\n ERROR: {exc}", file=sys.stderr)
rc |= 2
return rc
if __name__ == "__main__":
sys.exit(main())
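Note on the float32 sweep in `sfm.dump_0c`: the sketch below runs the same idea on a synthetic buffer. The 14-byte record is invented; only the "label, then float32 BE at label+6" layout is taken from the `_ANCHORS` comments above.

```python
import struct


def scan_float32_be(data: bytes, lo: float, hi: float):
    """Return (offset, value) for every big-endian float32 with lo <= |v| <= hi."""
    hits = []
    for off in range(len(data) - 3):
        (v,) = struct.unpack_from(">f", data, off)
        if v == v and lo <= abs(v) <= hi:  # v == v rejects NaN
            hits.append((off, v))
    return hits


# Synthetic record: a "Tran" label, two filler bytes, then a PPV of
# 0.5 in/s stored as float32 BE at label+6, then zero padding.
record = b"Tran" + b"\x00\x00" + struct.pack(">f", 0.5) + b"\x00" * 4

label_at = record.find(b"Tran")
hits = dict(scan_float32_be(record, 1e-5, 50.0))

print(label_at + 6 in hits, hits.get(label_at + 6))
```

Because every byte offset is tried, unaligned reads that straddle a real field also get decoded; the range filter is what discards most of those, which is why matching candidates against values from a Blastware Event Report is still needed to confirm an offset.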
+530
@@ -0,0 +1,530 @@
"""
sfm/event_hdf5.py: HDF5 codec for the canonical "clean waveform" file.
Layout written to `<filename>.h5`:
/
samples/
Tran (float32, in/s) shape: (N,)
Vert (float32, in/s) shape: (N,)
Long (float32, in/s) shape: (N,)
MicL (float32, psi) shape: (N,)
samples_int16/ (optional)
Tran (int16, raw ADC counts) shape: (N,)
... per channel (only when present in the source)
root attrs (event metadata):
schema_version int = 1
kind str = "sfm.event.hdf5"
serial str
waveform_key str (8-hex)
timestamp str (ISO-8601)
record_type str
sample_rate int (sps)
pretrig_samples int
total_samples int
rectime_seconds float
geo_range str "normal" | "sensitive"
geo_full_scale_ips float (10.0 or 1.250)
project str
client str
operator str
sensor_location str
peak_tran_ips float (from 0C; authoritative)
peak_vert_ips float
peak_long_ips float
peak_pvs_ips float
peak_mic_psi float
tool_version str
captured_at str (ISO-8601 UTC)
source_kind str "sfm-live" | "sfm-ach" | "bw-import"
Why HDF5 and not just JSON for the canonical clean format:
- Native float32 arrays (no base64 dance, no per-value JSON parsing).
- Per-dataset gzip compression; sample arrays compress 3-5×.
- Cross-language: h5py (Python), HDF5.jl (Julia), rhdf5 (R), etc.
Analysis pipelines don't have to know anything about Blastware.
- Self-describing via attributes; future fields don't break readers.
The plot-ready `sfm.plot.v1` JSON returned by the REST endpoints is
derived from this HDF5 (or computed on-the-fly when no .h5 exists yet).
"""
from __future__ import annotations
import datetime
import logging
from pathlib import Path
from typing import Optional, Union
import h5py
import numpy as np
from minimateplus.event_file_io import TOOL_VERSION as _DEFAULT_TOOL_VERSION
from minimateplus.models import Event
log = logging.getLogger(__name__)
SCHEMA_VERSION = 1
HDF5_KIND = "sfm.event.hdf5"
# Geophone full-scale velocity per range (in/s). Confirmed in CLAUDE.md
# from 4-20-26 captures: Normal=0x00 → 10 in/s, Sensitive=0x01 → 1.25 in/s.
_GEO_FS_BY_RANGE = {
"normal": 10.000,
"sensitive": 1.2500,
0: 10.000,
1: 1.2500,
}
_INT16_FS = 32768.0
# Default mic conversion: ADC count → psi. Approximate; exact factor
# depends on firmware reference voltage and mic sensitivity, neither of
# which is independently confirmed. We try to refine it from the device-
# reported peak when available (peak_mic_psi / max_abs_int16).
_MIC_DEFAULT_FS_PSI = 0.0125 # ≈ 0.5 psi at full scale (rough)
def _resolve_geo_full_scale(geo_range) -> float:
"""Map a geo_range value (string or int from compliance config) to the
full-scale velocity in in/s. Defaults to Normal range (10.0) when the
value is unknown (the same default as Blastware itself)."""
if geo_range is None:
return _GEO_FS_BY_RANGE["normal"]
if isinstance(geo_range, str):
return _GEO_FS_BY_RANGE.get(geo_range.lower(), _GEO_FS_BY_RANGE["normal"])
return _GEO_FS_BY_RANGE.get(int(geo_range), _GEO_FS_BY_RANGE["normal"])
def _normalise_range(geo_range) -> str:
"""Return 'normal' or 'sensitive' (string) regardless of input form."""
if isinstance(geo_range, str):
v = geo_range.lower()
if v in ("normal", "sensitive"):
return v
return "normal"
if geo_range == 1:
return "sensitive"
return "normal"
def _ts_iso(ts) -> str:
if ts is None:
return ""
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
def _samples_to_float(
samples_int16: list[int],
full_scale: float,
) -> np.ndarray:
"""Convert int16 ADC counts → float32 physical units.
Uses _INT16_FS=32768 (not 32767) so that a count of -32768 maps to
exactly -full_scale and +32767 maps to ~+full_scale * 32767/32768.
Matches the device firmware's documented mapping (see CLAUDE.md
geo_hardware_constant rationale).
"""
if not samples_int16:
return np.array([], dtype=np.float32)
arr = np.asarray(samples_int16, dtype=np.int32) # int32 to avoid overflow during scale
return (arr.astype(np.float32) * (full_scale / _INT16_FS)).astype(np.float32)
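The count-to-velocity mapping described in `_samples_to_float` can be sketched without NumPy. This is a minimal illustration rather than the module's implementation: `counts_to_physical` is a hypothetical name, and the 10.0 in/s full scale is the Normal-range value quoted earlier in this file.

```python
INT16_FS = 32768.0  # same divisor as above, so -32768 maps to exactly -full_scale


def counts_to_physical(counts, full_scale):
    """Scale signed int16 ADC counts to physical units (hypothetical helper)."""
    per_count = full_scale / INT16_FS
    return [c * per_count for c in counts]


# Normal geophone range: 10.0 in/s full scale.
velocities = counts_to_physical([-32768, 0, 16384, 32767], 10.0)
```

With a divisor of 32768 the negative rail lands exactly on `-full_scale`, while the positive rail stops one count short of `+full_scale`.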
def _mic_scale_factor(
samples_int16: list[int],
peak_mic_psi: Optional[float],
) -> float:
"""Resolve the per-count psi factor for the microphone channel.
When the device reports a peak mic value via the 0C record, we
back-solve the per-count factor from `peak_psi / max(|samples|)` so
the plotted waveform peaks land exactly at the device-reported value.
Otherwise fall back to the rough _MIC_DEFAULT_FS_PSI estimate.
"""
if peak_mic_psi is not None and peak_mic_psi > 0 and samples_int16:
max_count = max(abs(int(v)) for v in samples_int16) or 1
return float(peak_mic_psi) / float(max_count)
return _MIC_DEFAULT_FS_PSI / _INT16_FS
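The back-solving step in `_mic_scale_factor` can be tried on a toy input. The sketch below is a standalone restatement under assumptions: `mic_psi_per_count` is a hypothetical name, the sample values and the 0.02 psi peak are invented, and the defaults simply copy the constants defined above.

```python
def mic_psi_per_count(samples, peak_psi, default_fs_psi=0.0125, int16_fs=32768.0):
    """Back-solve psi-per-count from a reported peak, else use the rough default."""
    if peak_psi is not None and peak_psi > 0 and samples:
        max_count = max(abs(int(v)) for v in samples) or 1  # guard all-zero input
        return peak_psi / max_count
    return default_fs_psi / int16_fs


samples = [100, -400, 200]                      # invented ADC counts
factor = mic_psi_per_count(samples, peak_psi=0.02)
scaled = [s * factor for s in samples]          # largest magnitude is the reported peak
fallback = mic_psi_per_count(samples, peak_psi=None)
```

Because the factor is derived from the largest sample, the scaled waveform always peaks at the reported value; any disagreement between the device peak and the true ADC scale is absorbed into the factor instead of showing up in the plot.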
def write_event_hdf5(
path: Union[str, Path],
event: Event,
*,
serial: str,
geo_range = "normal",
source_kind: str = "sfm-live",
tool_version: Optional[str] = None,
captured_at: Optional[datetime.datetime] = None,
include_int16: bool = True,
) -> dict:
"""
Persist a decoded Event as an HDF5 file with samples in physical units.
Returns a small summary dict suitable for logging:
{"path": Path, "n_samples": int, "geo_full_scale_ips": float}
"""
path = Path(path)
raw = event.raw_samples or {}
pv = event.peak_values
pi = event.project_info
geo_fs = _resolve_geo_full_scale(geo_range)
geo_range_str = _normalise_range(geo_range)
captured_at = captured_at or datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None) # naive UTC; utcnow() is deprecated since Python 3.12
tool_version = tool_version or _DEFAULT_TOOL_VERSION
# Per-channel float32 arrays in physical units.
geo_arrays = {}
for ch in ("Tran", "Vert", "Long"):
geo_arrays[ch] = _samples_to_float(raw.get(ch, []), geo_fs)
# Mic channel — the per-count factor is resolved from the device-reported
# peak when available so the plot peaks at the BW value exactly.
mic_int16 = raw.get("MicL", [])
mic_factor = _mic_scale_factor(
mic_int16,
getattr(pv, "micl", None) if pv else None,
)
if mic_int16:
mic_arr = (np.asarray(mic_int16, dtype=np.int32).astype(np.float32) * mic_factor).astype(np.float32)
else:
mic_arr = np.array([], dtype=np.float32)
n_samples = max(
(len(geo_arrays[ch]) for ch in geo_arrays),
default=0,
)
# Atomic write: temp file + os.replace.
tmp = path.with_suffix(path.suffix + ".tmp")
with h5py.File(tmp, "w") as f:
# Root attrs — event-level metadata.
attrs = f.attrs
attrs["schema_version"] = SCHEMA_VERSION
attrs["kind"] = HDF5_KIND
attrs["serial"] = serial or ""
attrs["waveform_key"] = event._waveform_key.hex() if event._waveform_key else ""
attrs["timestamp"] = _ts_iso(event.timestamp)
attrs["record_type"] = event.record_type or ""
attrs["sample_rate"] = int(event.sample_rate or 0)
attrs["pretrig_samples"] = int(event.pretrig_samples or 0)
attrs["total_samples"] = int(event.total_samples or n_samples)
attrs["rectime_seconds"] = float(event.rectime_seconds or 0.0)
attrs["geo_range"] = geo_range_str
attrs["geo_full_scale_ips"] = float(geo_fs)
attrs["project"] = (pi.project if pi else "") or ""
attrs["client"] = (pi.client if pi else "") or ""
attrs["operator"] = (pi.operator if pi else "") or ""
attrs["sensor_location"] = (pi.sensor_location if pi else "") or ""
attrs["peak_tran_ips"] = float(pv.tran if pv and pv.tran is not None else 0.0)
attrs["peak_vert_ips"] = float(pv.vert if pv and pv.vert is not None else 0.0)
attrs["peak_long_ips"] = float(pv.long if pv and pv.long is not None else 0.0)
attrs["peak_pvs_ips"] = float(pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0)
attrs["peak_mic_psi"] = float(pv.micl if pv and pv.micl is not None else 0.0)
attrs["tool_version"] = tool_version or ""
attrs["captured_at"] = captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat()
attrs["source_kind"] = source_kind
# /samples — physical-units float32 (the primary data).
sgrp = f.create_group("samples")
for ch, arr in geo_arrays.items():
sgrp.create_dataset(
ch, data=arr, dtype="float32",
compression="gzip", compression_opts=4, shuffle=True,
)
sgrp.create_dataset(
"MicL", data=mic_arr, dtype="float32",
compression="gzip", compression_opts=4, shuffle=True,
)
# /samples_int16 — optional raw ADC counts (preserved for analysis
# tools that want pre-conversion data). Cheap to include.
if include_int16:
igrp = f.create_group("samples_int16")
for ch in ("Tran", "Vert", "Long", "MicL"):
vals = raw.get(ch, [])
if vals:
igrp.create_dataset(
ch, data=np.asarray(vals, dtype=np.int16),
compression="gzip", compression_opts=4, shuffle=True,
)
igrp.attrs["mic_psi_per_count"] = float(mic_factor)
import os
os.replace(tmp, path)
log.info(
"write_event_hdf5: %s n_samples=%d geo_fs=%.3f filesize=%d",
path, n_samples, geo_fs, path.stat().st_size,
)
return {
"path": path,
"n_samples": n_samples,
"geo_full_scale_ips": geo_fs,
}
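The temp-file-plus-`os.replace` pattern used by `write_event_hdf5` can be shown in isolation with plain bytes. This is a sketch under assumptions: `atomic_write_bytes` is a hypothetical helper, and the `os.fsync` call is an extra durability step that the function above does not perform.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path, payload):
    """Write payload to a sibling temp file, then rename it over the target."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    with open(tmp, "wb") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())  # optional durability step; an assumption, not in the code above
    os.replace(tmp, path)      # atomic when tmp and path are on the same filesystem
    return path


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "event.h5"
    atomic_write_bytes(target, b"first")
    atomic_write_bytes(target, b"second")
    final_bytes = target.read_bytes()
    leftovers = sorted(p.name for p in Path(workdir).iterdir())
```

A reader that opens `event.h5` during the second write sees either the old or the new content, never a partially written file.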
def read_event_hdf5(path: Union[str, Path]) -> dict:
"""
Load an event HDF5 into a plain dict (no Event reconstruction;
callers that want an Event can use the data directly).
Returns:
{
"schema_version": int,
"kind": str,
"attrs": dict[str, Any], # all root attributes
"samples": { # float32 arrays in physical units
"Tran": ndarray, "Vert": ndarray, "Long": ndarray, "MicL": ndarray,
},
"samples_int16": {} or None,
"mic_psi_per_count": float | None,
}
Raises FileNotFoundError if missing, ValueError on bad shape /
unsupported schema_version.
"""
path = Path(path)
with h5py.File(path, "r") as f:
attrs = {k: _h5_attr_value(v) for k, v in f.attrs.items()}
sv = attrs.get("schema_version", 0)
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
raise ValueError(
f"{path}: unsupported HDF5 schema_version={sv} "
f"(this build supports 1..{SCHEMA_VERSION})"
)
if attrs.get("kind") != HDF5_KIND:
raise ValueError(f"{path}: kind != {HDF5_KIND!r} (got {attrs.get('kind')!r})")
samples = {}
for ch in ("Tran", "Vert", "Long", "MicL"):
ds = f.get(f"samples/{ch}")
samples[ch] = np.asarray(ds[()]) if ds is not None else np.array([], dtype=np.float32)
samples_int16 = None
mic_psi = None
igrp = f.get("samples_int16")
if igrp is not None:
samples_int16 = {}
for ch in ("Tran", "Vert", "Long", "MicL"):
ds = igrp.get(ch)
if ds is not None:
samples_int16[ch] = np.asarray(ds[()])
mic_attr = igrp.attrs.get("mic_psi_per_count")
if mic_attr is not None:
mic_psi = float(mic_attr)
return {
"schema_version": sv,
"kind": attrs.get("kind"),
"attrs": attrs,
"samples": samples,
"samples_int16": samples_int16,
"mic_psi_per_count": mic_psi,
}
def _h5_attr_value(v):
"""Convert an h5py attribute value to a plain Python type."""
if isinstance(v, bytes):
return v.decode("utf-8", errors="replace")
if isinstance(v, np.generic):
return v.item()
return v
# ── Plot-ready JSON ──────────────────────────────────────────────────────────
def event_to_plot_json(
event: Event,
*,
serial: str,
geo_range = "normal",
event_id: Optional[str] = None,
index: Optional[int] = None,
) -> dict:
"""
Build a `sfm.plot.v1` JSON dict directly from an Event (skipping HDF5).
Used by:
- `/device/event/{idx}/waveform` (live device path)
- The CLI / tests for in-memory conversion sanity-checks.
Stored events go through `plot_json_from_hdf5()` so the wire format
is identical regardless of whether the data came from the live device
or the on-disk HDF5.
"""
raw = event.raw_samples or {}
pv = event.peak_values
geo_fs = _resolve_geo_full_scale(geo_range)
geo_range_str = _normalise_range(geo_range)
sr = int(event.sample_rate or 0) or 1024
pretrig = int(event.pretrig_samples or 0)
geo_arrays = {ch: _samples_to_float(raw.get(ch, []), geo_fs).tolist()
for ch in ("Tran", "Vert", "Long")}
mic_int16 = raw.get("MicL", [])
mic_factor = _mic_scale_factor(
mic_int16,
getattr(pv, "micl", None) if pv else None,
)
mic_arr = [float(v) * mic_factor for v in mic_int16] if mic_int16 else []
n = max(
(len(geo_arrays[ch]) for ch in geo_arrays),
default=len(mic_arr),
)
return _build_plot_dict(
n_samples=n,
sample_rate=sr,
pretrig_samples=pretrig,
total_samples=int(event.total_samples or n),
rectime_seconds=float(event.rectime_seconds or 0.0),
timestamp_iso=_ts_iso(event.timestamp),
serial=serial,
record_type=event.record_type,
waveform_key=event._waveform_key.hex() if event._waveform_key else None,
geo_range=geo_range_str,
geo_fs=geo_fs,
channels_floats={
"Tran": geo_arrays["Tran"],
"Vert": geo_arrays["Vert"],
"Long": geo_arrays["Long"],
"MicL": mic_arr,
},
peaks_dict={
"tran": getattr(pv, "tran", None) if pv else None,
"vert": getattr(pv, "vert", None) if pv else None,
"long": getattr(pv, "long", None) if pv else None,
"pvs": getattr(pv, "peak_vector_sum", None) if pv else None,
"mic": getattr(pv, "micl", None) if pv else None,
},
event_id=event_id,
index=index if index is not None else event.index,
)
def plot_json_from_hdf5(
path: Union[str, Path],
*,
event_id: Optional[str] = None,
index: Optional[int] = None,
) -> dict:
"""Build a `sfm.plot.v1` JSON dict from a stored .h5 file."""
data = read_event_hdf5(path)
a = data["attrs"]
s = data["samples"]
return _build_plot_dict(
n_samples=len(s["Tran"]) if "Tran" in s else 0,
sample_rate=int(a.get("sample_rate", 1024) or 1024),
pretrig_samples=int(a.get("pretrig_samples", 0) or 0),
total_samples=int(a.get("total_samples", 0) or 0),
rectime_seconds=float(a.get("rectime_seconds", 0.0) or 0.0),
timestamp_iso=a.get("timestamp", ""),
serial=a.get("serial", ""),
record_type=a.get("record_type", ""),
waveform_key=a.get("waveform_key", "") or None,
geo_range=a.get("geo_range", "normal"),
geo_fs=float(a.get("geo_full_scale_ips", 10.0) or 10.0),
channels_floats={
"Tran": s.get("Tran", np.array([])).tolist(),
"Vert": s.get("Vert", np.array([])).tolist(),
"Long": s.get("Long", np.array([])).tolist(),
"MicL": s.get("MicL", np.array([])).tolist(),
},
peaks_dict={
"tran": float(a.get("peak_tran_ips", 0.0) or 0.0) or None,
"vert": float(a.get("peak_vert_ips", 0.0) or 0.0) or None,
"long": float(a.get("peak_long_ips", 0.0) or 0.0) or None,
"pvs": float(a.get("peak_pvs_ips", 0.0) or 0.0) or None,
"mic": float(a.get("peak_mic_psi", 0.0) or 0.0) or None,
},
event_id=event_id,
index=index,
)
def _build_plot_dict(
*,
n_samples: int,
sample_rate: int,
pretrig_samples: int,
total_samples: int,
rectime_seconds: float,
timestamp_iso: str,
serial: str,
record_type: Optional[str],
waveform_key: Optional[str],
geo_range: str,
geo_fs: float,
channels_floats: dict[str, list[float]],
peaks_dict: dict[str, Optional[float]],
event_id: Optional[str],
index: Optional[int] = None,
) -> dict:
dt_ms = (1000.0 / sample_rate) if sample_rate > 0 else 0.0
t0_ms = -pretrig_samples * dt_ms
def _ch(unit: str, values: list[float], peak: Optional[float]) -> dict:
# Locate the peak's time within the values array (max abs).
if values:
mags = [abs(v) for v in values]
i = mags.index(max(mags))
peak_t_ms = round(t0_ms + i * dt_ms, 4)
peak_value = peak if peak is not None else values[i]
else:
peak_t_ms = None
peak_value = peak
return {
"unit": unit,
"values": values,
"peak": peak_value,
"peak_t_ms": peak_t_ms,
}
return {
"schema": "sfm.plot.v1",
"event_id": event_id,
"index": index,
"serial": serial,
"timestamp": timestamp_iso,
"record_type": record_type,
"waveform_key": waveform_key,
"time_axis": {
"sample_rate": sample_rate,
"pretrig_samples": pretrig_samples,
"total_samples": total_samples or n_samples,
"n_samples": n_samples,
"t0_ms": round(t0_ms, 4),
"dt_ms": round(dt_ms, 6),
"rectime_seconds": rectime_seconds,
},
"geo_range": geo_range,
"geo_full_scale_ips": geo_fs,
"trigger_ms": 0.0,
"channels": {
"Tran": _ch("in/s", channels_floats.get("Tran", []), peaks_dict.get("tran")),
"Vert": _ch("in/s", channels_floats.get("Vert", []), peaks_dict.get("vert")),
"Long": _ch("in/s", channels_floats.get("Long", []), peaks_dict.get("long")),
"MicL": _ch("psi", channels_floats.get("MicL", []), peaks_dict.get("mic")),
},
"peak_values": {
"transverse": peaks_dict.get("tran"),
"vertical": peaks_dict.get("vert"),
"longitudinal": peaks_dict.get("long"),
"vector_sum": peaks_dict.get("pvs"),
"mic_psi": peaks_dict.get("mic"),
},
}
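The time-axis arithmetic in `_build_plot_dict` (sample 0 sits `pretrig_samples` steps before the trigger, which is at 0 ms) can be checked with a short standalone sketch. `time_axis` and `peak_time_ms` are hypothetical names; 1024 sps is the fallback rate used above, while the 256 pre-trigger samples and the channel values are invented for illustration.

```python
def time_axis(sample_rate, pretrig_samples):
    """Return (t0_ms, dt_ms): the time of sample 0 and the sample spacing."""
    dt_ms = 1000.0 / sample_rate if sample_rate > 0 else 0.0
    return -pretrig_samples * dt_ms, dt_ms


def peak_time_ms(values, t0_ms, dt_ms):
    """Time of the largest-magnitude sample, or None for an empty channel."""
    if not values:
        return None
    mags = [abs(v) for v in values]
    return round(t0_ms + mags.index(max(mags)) * dt_ms, 4)


t0_ms, dt_ms = time_axis(sample_rate=1024, pretrig_samples=256)
peak_ms = peak_time_ms([0.0, 0.1, -0.9, 0.3], t0_ms, dt_ms)
```

Here `t0_ms` is -250.0 and `dt_ms` is 0.9765625, so the peak at index 2 falls roughly 248 ms before the trigger.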
+194
@@ -0,0 +1,194 @@
"""
sfm/import_bw.py: CLI for ingesting Blastware-format event files.
Walks a path (file or directory), parses each recognised event-file
binary, copies it into the canonical waveform store, writes the
.sfm.json sidecar, and upserts a row in seismo_relay.db.
Use cases:
- Migrating a Blastware ACH inbox into SFM
- One-off imports of files emailed in by field crews
- Bulk-loading historical archives
Usage:
python -m sfm.import_bw <path-or-dir> [--serial BE11529]
[--db-path bridges/captures/seismo_relay.db]
[--store-root bridges/captures/waveforms]
[--dry-run]
[-v]
Examples:
python -m sfm.import_bw ~/Downloads/M529LKIQ.7M0W
python -m sfm.import_bw /path/to/blastware_archive --serial BE11529
"""
from __future__ import annotations
import argparse
import logging
import sys
from pathlib import Path
from typing import Iterator
# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from sfm.database import SeismoDb
from sfm.waveform_store import WaveformStore
log = logging.getLogger("sfm.import_bw")
# Blastware event-file extensions: 4-char `AB0T` (T = W or H) for ACH
# downloads, 3-char `AB0` for direct downloads. We discover candidates
# by length + last-char rather than enumerating every (A, B) pair.
def _looks_like_bw_event(path: Path) -> bool:
"""Heuristic: 3-char or 4-char extension, ends with W/H/0, and the
file is at least 70 bytes (header + STRT + footer minimum)."""
if not path.is_file():
return False
ext = path.suffix.lstrip(".")
if not (3 <= len(ext) <= 4):
return False
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
return False
try:
return path.stat().st_size >= 70
except OSError:
return False
def _walk(path: Path) -> Iterator[Path]:
"""Yield candidate BW event-file paths under `path` (file or dir)."""
if path.is_file():
if _looks_like_bw_event(path):
yield path
return
if path.is_dir():
for p in sorted(path.rglob("*")):
if _looks_like_bw_event(p):
yield p
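The name-only part of the `_looks_like_bw_event` heuristic can be exercised without touching the filesystem. The sketch below is an illustration under assumptions: `has_bw_event_extension` is a hypothetical helper, the size check is deliberately left out, and the 3-character filename is invented by shortening the example from the usage text.

```python
from pathlib import PurePath


def has_bw_event_extension(name):
    """Name-only check: 3-4 character extension ending in W, H or 0."""
    ext = PurePath(name).suffix.lstrip(".")
    if not 3 <= len(ext) <= 4:
        return False
    return ext[-1].upper() in {"W", "H"} or ext.endswith("0")


names = ["M529LKIQ.7M0W", "M529LKIQ.7M0", "notes.txt", "archive.tar.gz", "README"]
matches = [n for n in names if has_bw_event_extension(n)]
loose_match = has_bw_event_extension("script.pyw")  # unrelated file, still passes
```

The check is intentionally loose (an unrelated `.pyw` file passes it), so the parse step in `main()` remains the real validation; the heuristic only narrows the candidate list.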
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(
description="Import Blastware-format event files into the SFM store + DB.",
)
p.add_argument("path", help="File or directory to import.")
p.add_argument(
"--serial", default=None, metavar="SERIAL",
help="Override the serial-number hint (e.g. BE11529). Defaults to "
"the value decoded from each BW filename's prefix.",
)
p.add_argument(
"--db-path",
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
help="Path to seismo_relay.db (default: bridges/captures/seismo_relay.db).",
)
p.add_argument(
"--store-root",
default=None,
help="Root of the waveform store (default: <db_dir>/waveforms).",
)
p.add_argument(
"--dry-run", action="store_true",
help="Parse and report per-file outcomes; don't write anything.",
)
p.add_argument("-v", "--verbose", action="store_true", help="Debug logging.")
args = p.parse_args(argv)
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
datefmt="%H:%M:%S",
)
src = Path(args.path).expanduser().resolve()
if not src.exists():
print(f"error: {src} does not exist", file=sys.stderr)
return 2
db_path = Path(args.db_path).expanduser().resolve()
store_root = (
Path(args.store_root).expanduser().resolve()
if args.store_root else db_path.parent / "waveforms"
)
db = None if args.dry_run else SeismoDb(db_path)
store = None if args.dry_run else WaveformStore(store_root)
candidates = list(_walk(src))
if not candidates:
print(f"No BW event-file candidates found under {src}", file=sys.stderr)
return 1
print(f"Importing {len(candidates)} file(s) from {src}...")
if args.dry_run:
print("(dry-run — no writes will occur)")
ok = err = skipped = 0
for path in candidates:
try:
bw_bytes = path.read_bytes()
except Exception as exc:
print(f" [ERR ] {path}: read failed: {exc}")
err += 1
continue
if args.dry_run:
# Just parse to verify integrity; don't touch DB or store.
from minimateplus import event_file_io
try:
ev = event_file_io.read_blastware_file(path)
ts = ev.timestamp and (
f"{ev.timestamp.year}-{ev.timestamp.month:02d}-{ev.timestamp.day:02d} "
f"{ev.timestamp.hour:02d}:{ev.timestamp.minute:02d}:{ev.timestamp.second:02d}"
) or "?"
pv = ev.peak_values
pvs = pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0
print(f" [OK ] {path.name} ts={ts} PVS={pvs:.4f}")
ok += 1
except Exception as exc:
print(f" [ERR ] {path}: parse failed: {exc}")
err += 1
continue
try:
ev, rec = store.save_imported_bw(
bw_bytes, source_path=path, serial_hint=args.serial,
)
# Resolve serial for the DB row. Prefer the hint, then the
# one decoded from the filename (already done by the store).
serial_used = args.serial or _infer_serial(path.name) or "UNKNOWN"
ins, sk = db.insert_events(
[ev], serial=serial_used,
waveform_records=(
{ev._waveform_key.hex(): rec}
if ev._waveform_key else None
),
)
tag = "OK " if ins else ("SKIP" if sk else "OK ")
print(f" [{tag}] {path.name} → {rec['filename']} "
f"({rec['filesize']} B, sha256={rec['sha256'][:12]}…) "
f"serial={serial_used} ins={ins} skip={sk}")
if ins:
ok += 1
else:
skipped += 1
except Exception as exc:
print(f" [ERR ] {path}: import failed: {exc}")
log.debug("traceback", exc_info=True)
err += 1
print(f"\nDone. ok={ok} skipped={skipped} errors={err}")
return 0 if err == 0 else 1
def _infer_serial(filename: str):
"""Reuse WaveformStore's filename → serial decoder for log output."""
from sfm.waveform_store import _serial_from_bw_filename
return _serial_from_bw_filename(filename)
if __name__ == "__main__":
sys.exit(main())
+189
@@ -0,0 +1,189 @@
"""
sfm/live_cache.py: Thread-safe in-memory cache for live SFM device data.
Extracted from sfm/server.py so the cache logic is importable and testable
without pulling in fastapi/uvicorn.
Caching strategy
----------------
Keyed by `conn_key` ("tcp:host:port" or "serial:port:baud"). Does NOT
persist across server restarts.
device_info: cached until POST /device/config marks it dirty
events: cached by (conn_key, device_event_count); re-fetched when
a quick count_events() probe shows new events on the device
monitor_status: 30-second TTL (changes frequently during monitoring)
waveforms: permanent within a process, but auto-evicted at the device
level when a (waveform_key, timestamp) mismatch is detected
at the same index. This handles post-erase key reuse: the device's
event-key counter resets to 0x01110000 after every erase,
so the same `(conn_key, index)` slot can refer to a
brand-new physical event.
All endpoints accept ?force=true to bypass the cache and re-read.
"""
from __future__ import annotations
import threading
import time
from typing import Optional
_MONITOR_STATUS_TTL = 30.0 # seconds
class LiveCache:
"""
Thread-safe in-memory cache for live SFM device data.
One singleton per server process.
"""
def __init__(self) -> None:
self._lock = threading.Lock()
self._device_info: dict[str, dict] = {}
self._events: dict[str, tuple[int, list]] = {}
self._monitor_status: dict[str, tuple[float, dict]] = {}
self._config_dirty: dict[str, bool] = {}
self._waveforms: dict[tuple, dict] = {}
# ── Connection key ────────────────────────────────────────────────────────
@staticmethod
def make_conn_key(
host: Optional[str],
tcp_port: int,
port: Optional[str],
baud: int,
) -> str:
if host:
return f"tcp:{host}:{tcp_port}"
return f"serial:{port}:{baud}"
# ── Eviction signature ────────────────────────────────────────────────────
@staticmethod
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
"""Return (waveform_key_hex, timestamp_iso) from a serialised event."""
key = ev.get("waveform_key") or ev.get("_waveform_key")
if isinstance(key, (bytes, bytearray)):
key = bytes(key).hex()
ts = ev.get("timestamp")
if isinstance(ts, dict):
ts = ts.get("iso") or ts.get("string") or None
return (key if isinstance(key, str) else None,
ts if isinstance(ts, str) else None)
def _flush_device(self, conn_key: str) -> None:
"""Drop all cached events + waveforms for one device. Caller holds lock."""
self._events.pop(conn_key, None)
stale_wf_keys = [k for k in self._waveforms if k[0] == conn_key]
for k in stale_wf_keys:
self._waveforms.pop(k, None)
# ── Device info ───────────────────────────────────────────────────────────
def get_device_info(self, conn_key: str) -> Optional[dict]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
return self._device_info.get(conn_key)
def set_device_info(self, conn_key: str, info: dict) -> None:
with self._lock:
self._device_info[conn_key] = info
self._config_dirty[conn_key] = False
# ── Events ────────────────────────────────────────────────────────────────
def get_events(self, conn_key: str, device_count: int) -> Optional[list]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
entry = self._events.get(conn_key)
if entry is None:
return None
cached_count, events = entry
return events if cached_count == device_count else None
def set_events(self, conn_key: str, device_count: int, events: list) -> None:
"""
Replace the cached events list for `conn_key`. If any incoming event
has a different (waveform_key, timestamp) than the cached entry at
the same index, flush the entire conn_key's event + waveform cache
first. Catches post-erase key reuse.
"""
with self._lock:
cached_entry = self._events.get(conn_key)
cached_events = cached_entry[1] if cached_entry else []
cached_by_index = {e.get("index"): e for e in cached_events}
evict = False
for ev in events:
idx = ev.get("index")
if idx is None:
continue
cached = cached_by_index.get(idx)
if cached is None:
continue
new_key, new_ts = self._event_signature(ev)
old_key, old_ts = self._event_signature(cached)
if (new_key and old_key and new_key != old_key) or \
(new_ts and old_ts and new_ts != old_ts):
evict = True
break
if evict:
self._flush_device(conn_key)
self._events[conn_key] = (device_count, events)
# ── Monitor status ────────────────────────────────────────────────────────
def get_monitor_status(self, conn_key: str) -> Optional[dict]:
with self._lock:
entry = self._monitor_status.get(conn_key)
if entry is None:
return None
fetched_at, status = entry
if time.time() - fetched_at > _MONITOR_STATUS_TTL:
return None
return status
def set_monitor_status(self, conn_key: str, status: dict) -> None:
with self._lock:
self._monitor_status[conn_key] = (time.time(), status)
def invalidate_monitor_status(self, conn_key: str) -> None:
with self._lock:
self._monitor_status.pop(conn_key, None)
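The monitor-status TTL logic is easier to test when the clock is injectable. The sketch below is a hypothetical single-purpose class, not part of `LiveCache`: `TtlSlot` and its `clock` parameter are invented names, it defaults to `time.monotonic` where the code above uses `time.time()`, and the connection key is a made-up address.

```python
import threading
import time


class TtlSlot:
    """Keyed TTL cache (hypothetical; mirrors the monitor-status logic above)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}

    def set(self, key, value):
        with self._lock:
            self._entries[key] = (self._clock(), value)

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            stored_at, value = entry
            if self._clock() - stored_at > self._ttl:
                return None  # expired entries are reported as a miss
            return value


now = [0.0]  # fake clock so the example is deterministic
slot = TtlSlot(ttl_seconds=30.0, clock=lambda: now[0])
slot.set("tcp:192.0.2.10:12345", {"monitoring": True})
fresh = slot.get("tcp:192.0.2.10:12345")
now[0] = 31.0
expired = slot.get("tcp:192.0.2.10:12345")
```

A monotonic clock avoids spurious expiry or immortality when the wall clock is adjusted, which is the reason for the swap here.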
# ── Config dirty flag ─────────────────────────────────────────────────────
def mark_config_dirty(self, conn_key: str) -> None:
with self._lock:
self._config_dirty[conn_key] = True
self._events.pop(conn_key, None)
# ── Waveforms (permanent cache, evicted on (key,ts) mismatch) ─────────────
def get_waveform(self, conn_key: str, index: int) -> Optional[dict]:
with self._lock:
return self._waveforms.get((conn_key, index))
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
"""
Cache a waveform. Evicts the device's whole cache when the existing
entry at the same index has a different (waveform_key, timestamp).
"""
with self._lock:
existing = self._waveforms.get((conn_key, index))
if existing is not None:
new_key, new_ts = self._event_signature(waveform)
old_key, old_ts = self._event_signature(existing)
differs = (
(new_key and old_key and new_key != old_key)
or (new_ts and old_ts and new_ts != old_ts)
)
if differs:
self._flush_device(conn_key)
self._waveforms[(conn_key, index)] = waveform
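The eviction rule shared by `set_events` and `set_waveform` reduces to one comparison: two entries at the same index are treated as different physical events only when a field is present on both sides and disagrees. A minimal standalone sketch follows; `signature` and `is_different_event` are hypothetical names, `signature` is a simplified version of `_event_signature`, and the event dicts are invented.

```python
def signature(ev):
    """(waveform_key_hex, timestamp) from an event dict; None when absent."""
    key = ev.get("waveform_key")
    if isinstance(key, (bytes, bytearray)):
        key = bytes(key).hex()
    ts = ev.get("timestamp")
    return (key if isinstance(key, str) else None,
            ts if isinstance(ts, str) else None)


def is_different_event(new, old):
    """True only when a field is present on both sides and disagrees."""
    new_key, new_ts = signature(new)
    old_key, old_ts = signature(old)
    return bool((new_key and old_key and new_key != old_key)
                or (new_ts and old_ts and new_ts != old_ts))


cached = {"index": 0, "waveform_key": "01110000", "timestamp": "2026-05-01T10:00:00"}
same = {"index": 0, "waveform_key": bytes.fromhex("01110000"),
        "timestamp": "2026-05-01T10:00:00"}
reused = {"index": 0, "waveform_key": "01110000", "timestamp": "2026-05-08T09:30:00"}
partial = {"index": 0, "waveform_key": "01110000"}  # no timestamp available
```

The `reused` entry models the post-erase case: the key matches because the counter restarted, but the timestamp shows a new event, so the device's cache would be flushed. A missing field, as in `partial`, never triggers eviction on its own.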
+533 -132
@@ -37,6 +37,7 @@ from __future__ import annotations
import datetime
import logging
import sys
import tempfile
import threading
import time
from pathlib import Path
@@ -44,9 +45,9 @@ from typing import Optional
# FastAPI / Pydantic
try:
from fastapi import Body, FastAPI, HTTPException, Query
from fastapi import Body, FastAPI, File, HTTPException, Query, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, JSONResponse
from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
from pydantic import BaseModel
import uvicorn
except ImportError:
@@ -61,8 +62,13 @@ from minimateplus import MiniMateClient
from minimateplus.protocol import ProtocolError
from minimateplus.models import CallHomeConfig, ComplianceConfig, DeviceInfo, Event, PeakValues, ProjectInfo, Timestamp
from minimateplus.transport import TcpTransport, DEFAULT_TCP_PORT
from minimateplus.blastware_file import write_blastware_file, blastware_filename
from minimateplus.client import _decode_a5_metadata_into, _decode_a5_waveform
from sfm import event_hdf5
from sfm.cache import SFMCache, get_cache
from sfm.database import SeismoDb
from sfm.live_cache import LiveCache as _LiveCache
from sfm.waveform_store import WaveformStore
logging.basicConfig(
level=logging.INFO,
@@ -99,6 +105,7 @@ app.add_middleware(
_DEFAULT_DB_PATH = Path(__file__).parent.parent / "bridges" / "captures" / "seismo_relay.db"
_db: Optional[SeismoDb] = None
_store: Optional[WaveformStore] = None
def _get_db() -> SeismoDb:
@@ -108,6 +115,18 @@ def _get_db() -> SeismoDb:
return _db
def _get_store() -> WaveformStore:
"""
Persistent event-file + A5-sidecar store, rooted at <db_dir>/waveforms/.
Mirrors the layout used by bridges/ach_server.py so files saved by ACH
ingestion and by live SFM downloads share one canonical location.
"""
global _store
if _store is None:
_store = WaveformStore(_get_db().db_path.parent / "waveforms")
return _store
# ── Live device cache ─────────────────────────────────────────────────────────
# In-memory cache for live device data. Avoids re-dialing the device on every
# request when the data hasn't changed.
@@ -125,116 +144,6 @@ def _get_db() -> SeismoDb:
#
# All endpoints accept ?force=true to bypass the cache and re-read from device.
_MONITOR_STATUS_TTL = 30.0 # seconds
class _LiveCache:
"""
Thread-safe in-memory cache for live SFM device data.
One singleton per server process.
"""
def __init__(self) -> None:
self._lock = threading.Lock()
# conn_key → serialised device info dict
self._device_info: dict[str, dict] = {}
# conn_key → (device_event_count_when_cached, [event dicts])
self._events: dict[str, tuple[int, list]] = {}
# conn_key → (fetched_at_unix, status_dict)
self._monitor_status: dict[str, tuple[float, dict]] = {}
# conn_key → bool (True = re-read device on next /device/info)
self._config_dirty: dict[str, bool] = {}
# (conn_key, event_index) → waveform dict (permanent)
self._waveforms: dict[tuple, dict] = {}
# ── Connection key ────────────────────────────────────────────────────────
@staticmethod
def make_conn_key(
host: Optional[str],
tcp_port: int,
port: Optional[str],
baud: int,
) -> str:
if host:
return f"tcp:{host}:{tcp_port}"
return f"serial:{port}:{baud}"
# ── Device info ───────────────────────────────────────────────────────────
def get_device_info(self, conn_key: str) -> Optional[dict]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
return self._device_info.get(conn_key)
def set_device_info(self, conn_key: str, info: dict) -> None:
with self._lock:
self._device_info[conn_key] = info
self._config_dirty[conn_key] = False
# ── Events ────────────────────────────────────────────────────────────────
def get_events(self, conn_key: str, device_count: int) -> Optional[list]:
"""
Return cached events if the device's current event count matches what
we had when we last fetched. Returns None (cache miss) otherwise.
"""
with self._lock:
if self._config_dirty.get(conn_key):
return None
entry = self._events.get(conn_key)
if entry is None:
return None
cached_count, events = entry
return events if cached_count == device_count else None
def set_events(self, conn_key: str, device_count: int, events: list) -> None:
with self._lock:
self._events[conn_key] = (device_count, events)
# ── Monitor status ────────────────────────────────────────────────────────
def get_monitor_status(self, conn_key: str) -> Optional[dict]:
with self._lock:
entry = self._monitor_status.get(conn_key)
if entry is None:
return None
fetched_at, status = entry
if time.time() - fetched_at > _MONITOR_STATUS_TTL:
return None
return status
def set_monitor_status(self, conn_key: str, status: dict) -> None:
with self._lock:
self._monitor_status[conn_key] = (time.time(), status)
def invalidate_monitor_status(self, conn_key: str) -> None:
with self._lock:
self._monitor_status.pop(conn_key, None)
# ── Config dirty flag ─────────────────────────────────────────────────────
def mark_config_dirty(self, conn_key: str) -> None:
"""
Called after a successful POST /device/config write.
Forces next /device/info and /device/events to re-read from the device.
"""
with self._lock:
self._config_dirty[conn_key] = True
self._events.pop(conn_key, None)
# ── Waveforms (permanent cache) ───────────────────────────────────────────
def get_waveform(self, conn_key: str, index: int) -> Optional[dict]:
with self._lock:
return self._waveforms.get((conn_key, index))
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
with self._lock:
self._waveforms[(conn_key, index)] = waveform
_live_cache = _LiveCache()
@@ -781,7 +690,7 @@ def device_event_waveform(
if the device is not storing all frames yet, or the capture was partial)
- **sample_rate**: samples per second (from compliance config)
- **channels**: dict of channel name → list of signed int16 ADC counts
(keys: "Tran", "Vert", "Long", "Mic")
(keys: "Tran", "Vert", "Long", "MicL")
**Caching**: full waveforms are cached permanently after the first download;
they are immutable once recorded on the device. Subsequent requests for the
@@ -824,30 +733,194 @@ def device_event_waveform(
detail=f"Event index {index} not found on device",
)
raw = getattr(ev, "raw_samples", None) or {}
samples_decoded = len(raw.get("Tran", []))
# Backfill from compliance_config: sample_rate, record_time, and
# derived total_samples. These are user-set authoritative values; the
# corresponding STRT-derived guesses in `_decode_a5_waveform` can be
# off (e.g. rectime used to read the 0x46 record-type marker = 70s).
cc = info.compliance_config
if cc:
if ev.sample_rate is None and cc.sample_rate:
ev.sample_rate = cc.sample_rate
if cc.record_time:
ev.rectime_seconds = cc.record_time
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
geo_range = getattr(cc, "geo_range", None) if cc else None
# Resolve sample_rate from compliance config if not on the event itself
sample_rate = ev.sample_rate
if sample_rate is None and info.compliance_config:
sample_rate = info.compliance_config.sample_rate
result = {
"index": ev.index,
"record_type": ev.record_type,
"timestamp": _serialise_timestamp(ev.timestamp),
"total_samples": ev.total_samples,
"pretrig_samples": ev.pretrig_samples,
"rectime_seconds": ev.rectime_seconds,
"samples_decoded": samples_decoded,
"sample_rate": sample_rate,
"peak_values": _serialise_peak_values(ev.peak_values),
"channels": raw,
}
# Build the plot.v1 JSON: samples in physical units (in/s for geo, psi
# for mic), explicit time axis, peak markers — the shape clients should
# consume directly without doing any ADC scaling.
serial = getattr(info, "serial", None) or ""
result = event_hdf5.event_to_plot_json(
ev, serial=serial,
geo_range=geo_range or "normal",
index=index,
)
cache.set_waveform(conn_key, index, result)
return result
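The backfill in this hunk keeps the STRT-derived sample count only when it falls inside a plausible window around `sample_rate × record_time`. A minimal sketch of that rule in isolation follows; `reconcile_total_samples` is a hypothetical helper name used for illustration, not a function in this repository.

```python
from typing import Optional


def reconcile_total_samples(
    total_samples: Optional[int],
    sample_rate: Optional[int],
    rectime_seconds: Optional[float],
) -> Optional[int]:
    """Hypothetical helper mirroring the endpoint's backfill rule."""
    # Without both inputs there is nothing to derive from; keep what we have.
    if not sample_rate or not rectime_seconds:
        return total_samples
    derived = int(round(sample_rate * rectime_seconds))
    # Replace the decoded value when it is missing, more than 2x the
    # derived count, or less than a quarter of it.
    implausible = (
        total_samples is None
        or total_samples > derived * 2
        or total_samples < derived // 4
    )
    return derived if implausible else total_samples
```

With a 1024 sps unit and a 3 s record time, a decoded count of 3000 is kept, while a missing or wildly large count is replaced by the derived 3072.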
@app.get("/device/event/{index}/blastware_file")
def device_event_blastware_file(
index: int,
port: Optional[str] = Query(None, description="Serial port (e.g. COM5)"),
baud: int = Query(38400, description="Serial baud rate"),
host: Optional[str] = Query(None, description="TCP host — modem IP or ACH relay"),
tcp_port: int = Query(DEFAULT_TCP_PORT, description=f"TCP port (default {DEFAULT_TCP_PORT})"),
force: bool = Query(False, description="Bypass any cached/dedup'd state and re-download from device"),
) -> FileResponse:
"""
Download the waveform for a single event (0-based index) and return it
as a Blastware-compatible binary file with a correct Blastware filename.
Supply either *port* (serial) or *host* (TCP/modem).
The file is written to the OS temp directory and streamed back as a binary
download. Blastware can open it directly; the filename encodes serial + timestamp.
Filename format: <prefix><serial3><stem><AB>0<W|H>
- prefix letter = chr(ord('B') + floor(serial_numeric / 1000))
- stem + AB = second-resolution timestamp since 1985-01-01 local
- W / H = Full Waveform / Full Histogram (defaults to W for
triggered events; histogram requires recording_mode
to be populated from compliance config)
Performs: POLL startup -> get_events(full_waveform=True,
stop_after_index=index) -> write_blastware_file() -> FileResponse +
persistent store + DB upsert.
"""
log.info(
"GET /device/event/%d/blastware_file port=%s host=%s force=%s",
index, port, host, force,
)
# This endpoint always re-downloads from the device and never
# short-circuits via cache, so `force` changes nothing here; it is
# accepted only for parity with the other live endpoints.
try:
def _do():
with _build_client(port, baud, host, tcp_port, timeout=120.0) as client:
info = client.connect()
# full_waveform=True pulls the complete 5A stream so the
# client populates STRT-derived fields (total_samples,
# pretrig_samples, rectime_seconds) AND raw_samples on the
# Event. Required for the .h5 + .sfm.json sidecar to be
# filled in correctly — without it, those land as nulls.
events = client.get_events(
full_waveform=True,
stop_after_index=index,
)
matching = [ev for ev in events if ev.index == index]
return matching[0] if matching else None, info
ev, info = _run_with_retry(_do, is_tcp=_is_tcp(host))
except HTTPException:
raise
except ProtocolError as exc:
log.error("blastware_file: protocol error: %s", exc, exc_info=True)
raise HTTPException(status_code=502, detail=f"Protocol error: {exc}") from exc
except OSError as exc:
log.error("blastware_file: connection error: %s", exc, exc_info=True)
raise HTTPException(status_code=502, detail=f"Connection error: {exc}") from exc
except Exception as exc:
log.error("blastware_file: unexpected error: %s", exc, exc_info=True)
raise HTTPException(status_code=500, detail=f"Device error: {exc}") from exc
if ev is None:
raise HTTPException(
status_code=404,
detail=f"Event index {index} not found on device",
)
a5_frames = getattr(ev, "_a5_frames", None)
if not a5_frames:
raise HTTPException(
status_code=502,
detail=f"No waveform data received for event index {index} — 5A download failed",
)
# Determine serial number from device info
serial = getattr(info, "serial", None) or "UNKNOWN"
# Build filename using the same algorithm Blastware uses
filename = blastware_filename(ev, serial)
# Write to OS temp dir (cross-platform: /tmp on Linux/macOS,
# %TEMP% on Windows) so FastAPI can stream it back via FileResponse.
out_path = Path(tempfile.gettempdir()) / filename
# Delete any stale file at this path before writing. On Windows we have
# observed the new (smaller) file getting trailing zero-bytes from the
# previous (larger) file when filesystem semantics around open(...,"wb")
# don't truncate cleanly (e.g. through a synced folder). Explicit unlink
# eliminates that ambiguity.
try:
out_path.unlink()
except FileNotFoundError:
pass
write_blastware_file(ev, a5_frames, out_path)
log.info(
"blastware_file: wrote %s (%d A5 frames, serial=%s)",
out_path, len(a5_frames), serial,
)
# Promote to canonical persistent store + DB row so this event is
# queryable via /db/events afterwards (matches the ACH ingestion path).
if serial != "UNKNOWN" and ev._waveform_key is not None:
try:
cc = info.compliance_config
# Backfill authoritative compliance-config values onto the
# Event before persisting. These supersede whatever
# _decode_a5_waveform read from the STRT bytes (some of which
# have ambiguous semantics — e.g. STRT[20] is rectime but
# STRT[8:10] / STRT[16:18] are device-specific scratch fields
# that aren't reliable sample/pretrig counts).
if cc:
if ev.sample_rate is None and cc.sample_rate:
ev.sample_rate = cc.sample_rate
if cc.record_time:
# record_time from compliance is authoritative — the
# user-set value the device followed when recording.
ev.rectime_seconds = cc.record_time
# Derive total_samples from sample_rate × rectime when
# we can; the STRT-derived value can land at a buffer-
# offset rather than a sample count.
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
geo_range = getattr(cc, "geo_range", None) if cc else None
rec = _get_store().save(
ev, serial=serial, a5_frames=a5_frames,
geo_range=geo_range if geo_range is not None else "normal",
)
_get_db().insert_events(
[ev],
serial=serial,
waveform_records={ev._waveform_key.hex(): rec},
)
log.info(
"blastware_file: persisted to store (%s, %d bytes)",
rec["filename"], rec["filesize"],
)
except Exception as exc:
log.warning(
"blastware_file: persistent store save failed: %s "
"— temp file still served",
exc,
)
return FileResponse(
path=str(out_path),
filename=filename,
media_type="application/octet-stream",
)
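The docstring above gives the prefix rule as `chr(ord('B') + floor(serial_numeric / 1000))` followed by a three-digit serial fragment. The sketch below works through that arithmetic. It assumes that `serial_numeric` is the digits of the serial string and that `serial3` is its last three digits; neither detail is confirmed by this diff, and `blastware_prefix` is a hypothetical name, not the repository's `blastware_filename`.

```python
import re


def blastware_prefix(serial: str) -> str:
    """Hypothetical sketch: prefix letter + 3-digit serial fragment."""
    # Assumption: the numeric serial is whatever digits the string contains.
    digits = re.sub(r"\D", "", serial)
    if not digits:
        raise ValueError(f"serial has no digits: {serial!r}")
    serial_numeric = int(digits)
    # Prefix letter advances one step from 'B' per thousand units.
    letter = chr(ord("B") + serial_numeric // 1000)
    # Assumption: serial3 is the last three digits, zero-padded.
    return f"{letter}{serial_numeric % 1000:03d}"
```

Under these assumptions the serial `BE11529` yields `M529`, which agrees with the `M529LKIQ.7M0W` example filename quoted in the webapp code later in this diff.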
# ── Write endpoints ───────────────────────────────────────────────────────────
class DeviceConfigBody(BaseModel):
@@ -1330,6 +1403,334 @@ def db_set_false_trigger(
return {"status": "ok", "event_id": event_id, "false_trigger": value}
# ── /db/events/{id} — waveform file accessors ─────────────────────────────────
#
# These endpoints serve files from the persistent WaveformStore, so a Blastware
# file or its decoded JSON for a previously-ingested ACH event can be fetched
# without re-dialing the device.
@app.get("/db/events/{event_id}/blastware_file")
def db_event_blastware_file(event_id: str) -> FileResponse:
"""
Return the Blastware-format event file for a previously-ingested
event. Filename extension is per-event (timestamp-encoded
`AB0T` for ACH downloads, 3-char `AB0` for direct downloads).
404 if the event is unknown or has no event file in the store
(events ingested before the store was wired will show this;
re-download via the live endpoint to populate).
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=(
f"Event {event_id} has no Blastware file in the store. "
"Re-download via the live endpoint to populate."
),
)
bw_path = _get_store().open_blastware(serial, filename)
if bw_path is None:
raise HTTPException(
status_code=410,
detail=f"Stored file missing on disk: {filename}",
)
return FileResponse(
path=str(bw_path),
filename=filename,
media_type="application/octet-stream",
)
@app.get("/db/events/{event_id}/waveform.json")
def db_event_waveform_json(event_id: str) -> dict:
"""
Return the plot-ready JSON (`sfm.plot.v1`) for a stored event.
Resolution order (cheapest first):
1. If `<filename>.h5` exists, serve it via `plot_json_from_hdf5`.
Samples are already in physical units; no decode work needed.
2. Else if `<filename>.a5.pkl` exists, replay the A5 decoders to
rebuild an Event and serialise via `event_to_plot_json`.
3. Else 404: the event has no waveform data on disk.
The shape is identical regardless of source, so clients (the SFM
webapp, Terra-View, etc.) consume the same `sfm.plot.v1` payload.
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
store = _get_store()
# Path 1: HDF5 (canonical clean format).
h5_path = store.hdf5_path_for(serial, filename)
if h5_path.exists():
try:
return event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
except Exception as exc:
log.warning("HDF5 read failed (%s); falling back to A5 path", exc)
# Path 2: A5 pickle replay.
a5_frames = store.load_a5(serial, filename)
if not a5_frames:
raise HTTPException(
status_code=404,
detail=(
f"Event {event_id} has no waveform data on disk "
"(no .h5 and no .a5.pkl). Run the backfill script or "
"re-download via the live endpoint to populate."
),
)
ev = Event(index=-1)
try:
_decode_a5_metadata_into(a5_frames, ev)
except Exception as exc:
log.warning("db_event_waveform_json: metadata decode failed: %s", exc)
try:
_decode_a5_waveform(a5_frames, ev)
except Exception as exc:
log.error("db_event_waveform_json: waveform decode failed: %s", exc, exc_info=True)
raise HTTPException(status_code=500, detail=f"Waveform decode failed: {exc}") from exc
# Carry over fields from the DB row when the A5 replay didn't fill them.
if ev.sample_rate is None and row.get("sample_rate"):
ev.sample_rate = row.get("sample_rate")
return event_hdf5.event_to_plot_json(
ev, serial=serial, geo_range="normal", event_id=event_id,
)
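The resolution order described in the docstring (HDF5 first, A5 pickle second, otherwise 404) can be sketched as a plain file-existence check. This is an illustration only: `pick_waveform_source` and the flat `store_dir` layout are assumptions, and the real endpoint additionally falls through to the A5 path when the HDF5 read raises.

```python
from pathlib import Path
from typing import Optional


def pick_waveform_source(store_dir: Path, filename: str) -> Optional[str]:
    """Hypothetical sketch of the cheapest-first source selection."""
    # 1. Clean per-channel arrays already in physical units.
    if (store_dir / f"{filename}.h5").exists():
        return "h5"
    # 2. Raw 5A frames that can be replayed through the decoders.
    if (store_dir / f"{filename}.a5.pkl").exists():
        return "a5"
    # 3. Nothing on disk; the caller would answer with a 404.
    return None
```

Because both branches end in the same `sfm.plot.v1` payload, a caller does not need to know which source was chosen.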
# ── /db/events/{id}/sidecar — modern .sfm.json review/metadata accessors ──────
class SidecarPatchBody(BaseModel):
"""Body for PATCH /db/events/{id}/sidecar.
JSON-merge-patch semantics: only the keys you include get updated.
`review` is the editable block for monthly-summary workflows
(false_trigger flag, reviewer notes, etc.); `extensions` is the
forward-compat namespace for vendor / future fields.
"""
review: Optional[dict] = None
extensions: Optional[dict] = None
@app.get("/db/events/{event_id}/sidecar")
def db_event_sidecar(event_id: str) -> dict:
"""
Return the .sfm.json sidecar for a stored event. 404 if the event
is unknown or has no sidecar in the store (events ingested before
the sidecar feature landed will show this until backfilled).
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
sidecar = _get_store().load_sidecar(serial, filename)
if sidecar is None:
raise HTTPException(
status_code=404,
detail=(
f"No .sfm.json sidecar on disk for {filename}. "
"Run scripts/backfill_sidecars.py to generate one."
),
)
return sidecar
@app.patch("/db/events/{event_id}/sidecar")
def db_event_sidecar_patch(event_id: str, body: SidecarPatchBody) -> dict:
"""
JSON-merge-patch the sidecar's `review` and/or `extensions` blocks.
The sidecar JSON is the source of truth for review state. When
`review.false_trigger` is updated, the SQL `events.false_trigger`
column is kept in sync as a derived index for fast filtering.
Returns the new full sidecar. 404 if the event or sidecar is missing.
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
if not (body.review or body.extensions):
raise HTTPException(
status_code=400,
detail="PATCH body must include `review` and/or `extensions`",
)
new_sidecar = _get_store().patch_sidecar(
serial, filename,
review=body.review,
extensions=body.extensions,
)
if new_sidecar is None:
raise HTTPException(
status_code=404,
detail=f"No .sfm.json sidecar on disk for {filename}",
)
# Mirror false_trigger from review block into the SQL index column.
if body.review is not None:
_get_db().update_event_review(event_id, new_sidecar.get("review", {}))
return new_sidecar
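The PATCH endpoint is documented as having JSON-merge-patch semantics: only the keys present in the body are updated. A minimal sketch of a recursive merge in the style of RFC 7396 is shown below. Whether `patch_sidecar` merges recursively or treats `null` as deletion is not visible in this diff, so both behaviours here are assumptions, and `merge_patch` is a hypothetical name.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Hypothetical RFC 7396-style merge; returns a new value."""
    # A non-object patch replaces the target wholesale.
    if not isinstance(patch, dict):
        return patch
    # Copy so the caller's target is left unmodified.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # Assumption: null removes the key, as in RFC 7396.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

Patching `{"review": {"false_trigger": True}}` onto a sidecar whose review block also holds reviewer notes would flip the flag and leave the notes intact.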
# ── /db/import/blastware_file — ingest BW-only event files ────────────────────
@app.post("/db/import/blastware_file")
async def db_import_blastware_file(
files: list[UploadFile] = File(...),
serial: Optional[str] = Query(None, description="Optional serial-number hint (e.g. BE11529); falls back to the BW filename's encoded prefix when omitted"),
) -> dict:
"""
Multipart upload of one or more Blastware event file binaries
(typically produced by Blastware's own ACH). For each file:
1. Parse the bytes via WaveformStore.save_imported_bw, which produces
a parsed Event, copies the file into the persistent store, and
writes a .sfm.json sidecar with source.kind = "bw-import".
2. Upsert a row into `events` (dedup'd on serial+timestamp).
Response includes per-file outcomes so the caller can see which
landed cleanly and which failed (e.g. malformed file, unknown
serial, etc.).
"""
store = _get_store()
db = _get_db()
results: list[dict] = []
for upload in files:
try:
content = await upload.read()
except Exception as exc:
results.append({
"filename": upload.filename, "status": "error",
"detail": f"read failed: {exc}",
})
continue
try:
ev, rec = store.save_imported_bw(
content,
source_path=Path(upload.filename or "imported.bw"),
serial_hint=serial,
)
inserted, skipped = db.insert_events(
[ev],
serial=(serial or _serial_from_event(ev) or "UNKNOWN"),
waveform_records=(
{ev._waveform_key.hex(): rec}
if ev._waveform_key else None
),
)
results.append({
"filename": upload.filename,
"status": "ok",
"stored_filename": rec["filename"],
"filesize": rec["filesize"],
"sha256": rec["sha256"],
"inserted": inserted,
"skipped": skipped,
})
except Exception as exc:
log.error("import failed for %s: %s", upload.filename, exc, exc_info=True)
results.append({
"filename": upload.filename, "status": "error",
"detail": str(exc),
})
return {"count": len(results), "results": results}
def _serial_from_event(ev) -> Optional[str]:
"""Fallback serial resolver — currently relies on the BW filename
decoder via WaveformStore.save_imported_bw, so this is just a
placeholder for future enhancement (e.g. inferring from project_info)."""
return None
@app.get("/db/units/{serial}/waveforms.zip")
def db_unit_waveforms_zip(
serial: str,
from_dt: Optional[str] = Query(None, description="ISO-8601 start datetime (inclusive)"),
to_dt: Optional[str] = Query(None, description="ISO-8601 end datetime (inclusive)"),
limit: int = Query(5000, description="Hard cap on events bundled (default 5000)"),
) -> StreamingResponse:
"""
Stream a ZIP of all event files for a serial in the optional date range.
Events without a stored event file are silently skipped.
"""
import io
import zipfile
from_parsed = datetime.datetime.fromisoformat(from_dt) if from_dt else None
to_parsed = datetime.datetime.fromisoformat(to_dt) if to_dt else None
rows = _get_db().query_events(
serial=serial,
from_dt=from_parsed,
to_dt=to_parsed,
limit=limit,
offset=0,
)
store = _get_store()
buf = io.BytesIO()
written = 0
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
for row in rows:
fn = row.get("blastware_filename")
if not fn:
continue
bw_path = store.open_blastware(serial, fn)
if bw_path is None:
continue
zf.write(bw_path, arcname=fn)
written += 1
if written == 0:
raise HTTPException(
status_code=404,
detail=f"No stored Blastware files found for serial {serial} in range",
)
buf.seek(0)
safe_serial = serial.replace("/", "_")
headers = {
"Content-Disposition": f'attachment; filename="{safe_serial}_waveforms.zip"',
"X-Waveform-Count": str(written),
}
return StreamingResponse(buf, media_type="application/zip", headers=headers)
@app.get("/db/monitor_log")
def db_monitor_log(
serial: Optional[str] = Query(None, description="Filter by unit serial"),
+606 -51
@@ -609,6 +609,147 @@
.section-btn:hover { color: var(--text); }
.section-btn.active { background: var(--blue); color: #fff; }
/* ── Force-refresh toggle ── */
.force-toggle {
display: flex;
align-items: center;
gap: 6px;
padding: 4px 10px;
border: 1px solid var(--border);
border-radius: 6px;
background: var(--bg);
cursor: pointer;
font-size: 11px;
font-weight: 600;
color: var(--text-dim);
user-select: none;
white-space: nowrap;
transition: background 0.12s, color 0.12s, border-color 0.12s;
}
.force-toggle input { margin: 0; cursor: pointer; }
.force-toggle:hover { color: var(--text); }
.force-toggle.active {
background: rgba(248, 81, 73, 0.18);
border-color: #f85149;
color: #ff7b72;
}
.force-toggle .ft-dot {
width: 6px; height: 6px; border-radius: 50%;
background: var(--text-mute);
}
.force-toggle.active .ft-dot { background: #f85149; box-shadow: 0 0 6px #f85149; }
/* ── Sidecar review modal ── */
.sc-overlay {
position: fixed; inset: 0;
background: rgba(0,0,0,0.55);
display: none;
align-items: center;
justify-content: center;
z-index: 100;
}
.sc-overlay.visible { display: flex; }
.sc-modal {
background: var(--surface2);
border: 1px solid var(--border);
border-radius: 8px;
width: min(720px, 92vw);
max-height: 88vh;
display: flex;
flex-direction: column;
box-shadow: 0 8px 32px rgba(0,0,0,0.5);
}
.sc-header {
display: flex; align-items: center; justify-content: space-between;
padding: 14px 18px;
border-bottom: 1px solid var(--border);
}
.sc-header h3 {
margin: 0; font-size: 14px; font-weight: 600;
color: var(--text); font-family: monospace;
}
.sc-close {
background: none; border: none; cursor: pointer;
color: var(--text-mute); font-size: 18px; line-height: 1;
padding: 4px 8px; border-radius: 4px;
}
.sc-close:hover { background: var(--surface); color: var(--text); }
.sc-body {
flex: 1; overflow-y: auto;
padding: 16px 18px;
display: flex; flex-direction: column; gap: 14px;
}
.sc-section {
display: flex; flex-direction: column; gap: 6px;
}
.sc-section h4 {
margin: 0 0 4px;
font-size: 11px; font-weight: 600;
color: var(--text-mute); text-transform: uppercase;
letter-spacing: 0.6px;
}
.sc-grid {
display: grid;
grid-template-columns: 130px 1fr;
gap: 4px 12px;
font-size: 12px;
}
.sc-grid dt { color: var(--text-mute); }
.sc-grid dd { margin: 0; color: var(--text); font-family: monospace; word-break: break-all; }
.sc-row { display: flex; align-items: center; gap: 8px; font-size: 13px; }
.sc-row label { color: var(--text-dim); }
.sc-row input[type="checkbox"] { cursor: pointer; }
.sc-row input[type="text"], .sc-body textarea {
flex: 1;
background: var(--bg);
border: 1px solid var(--border);
border-radius: 5px;
padding: 6px 9px;
font-size: 12px;
color: var(--text);
font-family: monospace;
}
.sc-body textarea {
width: 100%;
min-height: 80px;
resize: vertical;
font-family: inherit;
}
.sc-raw {
border: 1px solid var(--border);
border-radius: 5px;
background: var(--bg);
}
.sc-raw summary {
padding: 6px 10px;
cursor: pointer;
font-size: 11px;
color: var(--text-dim);
user-select: none;
}
.sc-raw pre {
margin: 0;
padding: 8px 12px;
max-height: 240px;
overflow: auto;
font-size: 11px;
color: var(--text);
border-top: 1px solid var(--border);
}
.sc-footer {
display: flex; justify-content: flex-end; gap: 8px;
padding: 12px 18px;
border-top: 1px solid var(--border);
}
.sc-status {
flex: 1; align-self: center;
font-size: 11px; color: var(--text-mute);
}
.sc-status.error { color: #f85149; }
.sc-status.ok { color: #56d364; }
table.db-table tbody tr.clickable { cursor: pointer; }
table.db-table tbody tr.clickable:hover { background: var(--surface2); }
/* ── Section containers ── */
#section-live, #section-db {
display: flex;
@@ -654,6 +795,13 @@
<button class="section-btn active" onclick="switchSection('live')">Live Device</button>
<button class="section-btn" onclick="switchSection('db')">Database</button>
</div>
<div class="hdr-sep"></div>
<label class="force-toggle" id="force-toggle"
title="Bypass server cache and dedup. Forces a fresh download from the device on every live request — useful when the device has been erased and the cache is showing stale events.">
<input type="checkbox" id="force-cb" onchange="onForceToggle()">
<span class="ft-dot"></span>
<span>Force refresh</span>
</label>
</header>
<!-- ════════════════════════════════════════════════════════════════
@@ -769,6 +917,14 @@
<div class="event-toolbar">
<button class="btn btn-ghost" id="load-btn" onclick="loadWaveform()" disabled>Load Waveform</button>
<button class="btn btn-ghost" id="save-btn" onclick="saveEventToDb()" disabled
title="Download the full waveform from the device and save it to the SFM database + waveform store. Honors the Force refresh toggle.">
💾 Save to DB
</button>
<button class="btn btn-ghost" id="download-btn" onclick="downloadEventFile()" disabled
title="Download the Blastware-format event file to your computer (also saves it to the server's database + store).">
⬇ Download
</button>
<button class="btn btn-ghost" id="prev-btn" onclick="stepEvent(-1)" disabled></button>
<button class="btn btn-ghost" id="next-btn" onclick="stepEvent(+1)" disabled></button>
<div class="event-chips" id="event-chips"></div>
@@ -1187,7 +1343,7 @@ let currentEvent = 0;
let charts = {};
let geoAdcScale = 6.206;
const DBL_REF = 2.9e-9; // 20 µPa in psi — reference pressure for dBL
const CHANNEL_COLORS = { Tran:'#58a6ff', Vert:'#3fb950', Long:'#d29922', Mic:'#bc8cff' };
const CHANNEL_COLORS = { Tran:'#58a6ff', Vert:'#3fb950', Long:'#d29922', MicL:'#bc8cff' };
// ── Helpers ────────────────────────────────────────────────────────────────────
function api() { return document.getElementById('api-base').value.replace(/\/$/, ''); }
@@ -1214,8 +1370,21 @@ function setCfgStatus(msg, cls = '') {
el.className = cls;
}
// "Force refresh" override — when enabled, every live-device request is
// sent with ?force=true so the server bypasses its in-memory + persistent
// caches and re-reads from the device. Manual escape hatch for cases where
// the cache has gone stale (e.g. post-erase key reuse — see ach_server.py
// and sfm/cache.py for the eviction logic).
let forceRefresh = false;
function onForceToggle() {
forceRefresh = document.getElementById('force-cb').checked;
document.getElementById('force-toggle').classList.toggle('active', forceRefresh);
}
function deviceParams() {
return `host=${encodeURIComponent(devHost())}&tcp_port=${devPort()}`;
const base = `host=${encodeURIComponent(devHost())}&tcp_port=${devPort()}`;
return forceRefresh ? `${base}&force=true` : base;
}
// ── Section switching ─────────────────────────────────────────────────────────
@@ -1306,6 +1475,8 @@ async function connectUnit() {
document.getElementById('device-bar').style.display = 'flex';
document.getElementById('monitor-panel').style.display = 'flex';
document.getElementById('load-btn').disabled = eventList.length === 0;
document.getElementById('save-btn').disabled = eventList.length === 0;
document.getElementById('download-btn').disabled = eventList.length === 0;
document.getElementById('prev-btn').disabled = true;
document.getElementById('next-btn').disabled = eventList.length <= 1;
document.getElementById('cfg-read-btn').disabled = false;
@@ -1807,11 +1978,104 @@ async function loadWaveform() {
document.getElementById('load-btn').disabled = false;
}
// ── Persist current event to the SFM database + waveform store ──────────────
//
// Calls /device/event/{idx}/blastware_file, which on the server side:
// 1. Downloads the full waveform from the device (5A bulk stream)
// 2. Writes the Blastware-format event file into <db_dir>/waveforms/<serial>/
// 3. Writes the .a5.pkl sidecar next to it (so the file can be regenerated)
// 4. Upserts a row into seismo_relay.db `events` table (dedup'd on serial+timestamp)
//
// We discard the response body — the side effects are what we want. The
// filename comes back in the Content-Disposition header for confirmation.
async function saveEventToDb() {
if (!devHost()) { setStatus('Enter device host first.', 'error'); return; }
const idx = currentEvent;
const btn = document.getElementById('save-btn');
btn.disabled = true;
const orig = btn.textContent;
btn.textContent = '⏳ Saving…';
setStatus(`Downloading event #${idx} and saving to DB…`, 'loading');
try {
const r = await fetch(`${api()}/device/event/${idx}/blastware_file?${deviceParams()}`);
if (!r.ok) {
const e = await r.json().catch(() => ({}));
throw new Error(e.detail || r.statusText);
}
// Pull the body to completion so the connection releases promptly,
// then drop it on the floor — we just want the server-side persist.
await r.blob();
const filename = parseFilenameFromContentDisposition(r.headers.get('Content-Disposition'))
|| `event ${idx}`;
setStatus(`Saved ${filename} to database + waveform store`, 'ok');
} catch (e) {
setStatus(`Save error: ${e.message}`, 'error');
} finally {
btn.disabled = false;
btn.textContent = orig;
}
}
// ── Download the event file to the user's computer ──────────────────────────
//
// Uses a transient anchor + click trick so the browser surfaces its native
// "Save As" / Downloads behaviour. Same backend endpoint as Save to DB —
// the file is also persisted to the server store as a side effect.
function downloadEventFile() {
if (!devHost()) { setStatus('Enter device host first.', 'error'); return; }
const idx = currentEvent;
const url = `${api()}/device/event/${idx}/blastware_file?${deviceParams()}`;
setStatus(`Downloading event #${idx}…`, 'loading');
// A hidden anchor avoids navigating away from the SPA. FastAPI's FileResponse
// sets Content-Disposition: attachment so the browser saves rather than displays.
const a = document.createElement('a');
a.href = url;
a.style.display = 'none';
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
// We can't reliably detect when the browser finishes downloading; show a
// soft confirmation immediately. Errors will surface as a download failure
// dialog from the browser itself.
setTimeout(() => setStatus(`Download started for event #${idx} (also saved server-side)`, 'ok'), 250);
}
function parseFilenameFromContentDisposition(header) {
if (!header) return null;
// RFC 6266: `attachment; filename="M529LKIQ.7M0W"` (or filename*=UTF-8''…)
const m = /filename\*?=(?:UTF-8'')?["']?([^"';]+)["']?/i.exec(header);
return m ? decodeURIComponent(m[1]) : null;
}
// renderWaveform consumes the `sfm.plot.v1` JSON shape:
// {
// schema: "sfm.plot.v1",
// time_axis: { sample_rate, pretrig_samples, t0_ms, dt_ms, n_samples, ... },
// channels: { Tran|Vert|Long|MicL: { unit, values, peak, peak_t_ms } },
// geo_range, geo_full_scale_ips, trigger_ms, peak_values, ...
// }
//
// All sample arrays are already in PHYSICAL UNITS (in/s for geo, psi for
// mic) — the server applied the right scaling for the unit's geo_range.
// The viewer used to multiply ADC ints by `geoAdcScale / 32767` here,
// which silently scaled every plot ~38% too low because `geoAdcScale` is
// the in/s-per-V hardware constant, not the ADC-counts-to-velocity
// factor. No scaling happens client-side now.
function renderWaveform(data) {
const sr = data.sample_rate || 1024;
const pretrig = data.pretrig_samples || 0;
const decoded = data.samples_decoded || 0;
const total = data.total_samples || decoded;
// Backward-compat shim: if we ever get the legacy shape from a stale
// cache, normalise it on the client so the viewer still works.
if (!data.schema && data.channels && Array.isArray(data.channels.Tran)) {
data = _legacyWaveformToPlotV1(data);
}
const t = data.time_axis || {};
const sr = t.sample_rate || 1024;
const pretrig = t.pretrig_samples || 0;
const total = t.total_samples || t.n_samples || 0;
const decoded = t.n_samples || 0;
const t0 = t.t0_ms ?? -(pretrig / sr * 1000);
const dt = t.dt_ms ?? (1000 / sr);
const channels = data.channels || {};
// Status bar
@@ -1819,70 +2083,83 @@ function renderWaveform(data) {
bar.innerHTML = '';
bar.className = 'ok';
const ts = data.timestamp;
bar.textContent = ts ? `Event #${data.index} — ${ts.display} ` : `Event #${data.index} `;
// Title prefers `index` (live device, 0-based slot on the unit) and
// falls back to event_id (DB lookup) when index is absent.
const eventLabel = (data.index != null) ? `#${data.index}` : (data.event_id || '');
bar.textContent = ts ? `Event ${eventLabel} — ${ts} ` : `Event ${eventLabel} `;
addPill(`${data.record_type || '?'}`);
addPill(`${sr} sps`);
addPill(`${decoded.toLocaleString()} / ${total.toLocaleString()} samples`);
addPill(`pretrig ${pretrig}`);
addPill(`${data.rectime_seconds ?? '?'} s`);
addPill(`${t.rectime_seconds ?? '?'} s`);
if (data.geo_range) addPill(`geo: ${data.geo_range} (${data.geo_full_scale_ips} in/s FS)`);
// Any record_type starting with "Waveform" is a viewable triggered
// event (the timestamp-header byte layout varies across firmware but
// doesn't change the sample stream). Only block when there's actually
// no waveform payload to plot.
const isWaveformLike = !!(data.record_type || '').match(/^Waveform/i);
if (decoded === 0) {
document.getElementById('empty-state').style.display = 'flex';
document.getElementById('empty-state').querySelector('p').textContent =
data.record_type === 'Waveform'
isWaveformLike
? 'No samples decoded — check server logs'
: `Record type "${data.record_type}" — waveform not supported yet`;
: `Record type "${data.record_type}" — not a waveform event`;
document.getElementById('charts').style.display = 'none';
Object.values(charts).forEach(c => c.destroy()); charts = {};
return;
}
const times = Array.from({length: decoded}, (_, i) => ((i - pretrig) / sr * 1000).toFixed(2));
// Time axis: explicit ms values from t0_ms + i*dt_ms. More precise
// than the old (i - pretrig) / sr * 1000 since dt_ms came from the
// server with full float precision.
const times = Array.from({length: decoded}, (_, i) => (t0 + i * dt).toFixed(2));
document.getElementById('empty-state').style.display = 'none';
const chartsDiv = document.getElementById('charts');
chartsDiv.style.display = 'flex';
chartsDiv.innerHTML = '';
Object.values(charts).forEach(c => c.destroy()); charts = {};
const micPeakPsi = data.peak_values?.micl_psi ?? null;
for (const [ch, color] of Object.entries(CHANNEL_COLORS)) {
const samples = channels[ch];
if (!samples || samples.length === 0) continue;
const chData = channels[ch];
if (!chData || !chData.values || chData.values.length === 0) continue;
const isGeo = ch !== 'Mic';
let plotData, peakLabel, yUnit, ttFmt, tickFmt;
const plotData = chData.values;
const unit = chData.unit || (ch === 'MicL' ? 'psi' : 'in/s');
const peak = chData.peak;
const peakTms = chData.peak_t_ms;
if (isGeo) {
const scale = geoAdcScale / 32767;
plotData = samples.map(s => s * scale);
// Use the device-recorded peak from the 0C waveform record — authoritative
// and matches Blastware. Computing from raw samples can catch rogue
// near-full-scale values from decoding artifacts.
const peakKey = { Tran:'tran_in_s', Vert:'vert_in_s', Long:'long_in_s' }[ch];
const devicePeak = data.peak_values?.[peakKey] ?? null;
peakLabel = devicePeak != null ? `${devicePeak.toFixed(5)} in/s` : `${Math.max(...plotData.map(Math.abs)).toFixed(5)} in/s`;
yUnit = 'in/s';
ttFmt = v => `${ch}: ${v.toFixed(5)} in/s`;
tickFmt = v => v.toFixed(4);
} else {
const peakCounts = Math.max(...samples.map(Math.abs));
const micScale = (micPeakPsi !== null && peakCounts > 0) ? Math.abs(micPeakPsi) / peakCounts : 1.0;
plotData = samples.map(s => s * micScale);
const peakPsi = Math.max(...plotData.map(Math.abs));
const peakDbl = peakPsi > 0 ? 20 * Math.log10(peakPsi / DBL_REF) : -Infinity;
peakLabel = `${peakDbl.toFixed(1)} dBL`;
yUnit = 'psi';
let peakLabel, ttFmt, tickFmt;
if (unit === 'psi') {
const peakDbl = (peak != null && peak > 0)
? 20 * Math.log10(peak / DBL_REF) : -Infinity;
peakLabel = `${peakDbl.toFixed(1)} dBL (${peak != null ? peak.toExponential(2) : '—'} psi)`;
ttFmt = v => `${v.toExponential(3)} psi`;
tickFmt = v => v.toExponential(1);
} else {
peakLabel = peak != null ? `${peak.toFixed(5)} in/s` : '—';
ttFmt = v => `${ch}: ${v.toFixed(5)} in/s`;
tickFmt = v => v.toFixed(4);
}
// Downsample for display when the chart would otherwise have to
// rasterise tens of thousands of points. Uses every-Nth — fine for
// monthly-summary glance work; analysis tools should use the .h5 file.
const MAX_PTS = 4000;
let rTimes = times, rData = plotData;
let rTimes = times, rData = plotData, peakPlotIdx = -1;
if (plotData.length > MAX_PTS) {
const step = Math.ceil(plotData.length / MAX_PTS);
rTimes = times.filter((_, i) => i % step === 0);
rData = plotData.filter((_, i) => i % step === 0);
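// Map the peak to the retained sample at or just before it so the marker
// lands close to the peak. Every-Nth decimation can still drop the true
// peak sample, so the marker position is approximate.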
if (peakTms != null) {
const exactIdx = Math.round((peakTms - t0) / dt);
if (exactIdx >= 0 && exactIdx < plotData.length) {
peakPlotIdx = Math.floor(exactIdx / step);
}
}
} else if (peakTms != null) {
peakPlotIdx = Math.round((peakTms - t0) / dt);
}
const wrap = document.createElement('div');
@@ -1910,27 +2187,94 @@ function renderWaveform(data) {
},
scales: {
x: { type: 'category', ticks: { color:'#484f58', maxTicksLimit:10, maxRotation:0, callback:(v,i) => rTimes[i]+' ms' }, grid: { color:'#21262d' } },
y: { ticks: { color:'#484f58', maxTicksLimit:5, callback: v => tickFmt(v) }, grid: { color:'#21262d' }, title: { display:true, text:yUnit, color:'#484f58', font:{size:10} } },
y: { ticks: { color:'#484f58', maxTicksLimit:5, callback: v => tickFmt(v) }, grid: { color:'#21262d' }, title: { display:true, text:unit, color:'#484f58', font:{size:10} } },
},
},
plugins: [{
id: 'triggerLine',
id: 'triggerAndPeakMarkers',
afterDraw(chart) {
const zeroIdx = rTimes.findIndex(t => parseFloat(t) >= 0);
if (zeroIdx < 0) return;
const { ctx, scales: {x, y} } = chart;
// Trigger line at t = trigger_ms (typically 0).
const triggerMs = data.trigger_ms ?? 0;
const zeroIdx = rTimes.findIndex(s => parseFloat(s) >= triggerMs);
if (zeroIdx >= 0) {
const px = x.getPixelForValue(zeroIdx);
ctx.save();
ctx.beginPath();
ctx.moveTo(px, y.top); ctx.lineTo(px, y.bottom);
ctx.strokeStyle = 'rgba(248,81,73,0.7)'; ctx.lineWidth = 1.5;
ctx.setLineDash([4, 3]); ctx.stroke(); ctx.restore();
}
// Peak marker (dot at the channel's peak sample).
if (peakPlotIdx >= 0 && peakPlotIdx < rData.length) {
const px = x.getPixelForValue(peakPlotIdx);
const py = y.getPixelForValue(rData[peakPlotIdx]);
ctx.save();
ctx.beginPath();
ctx.arc(px, py, 3.2, 0, Math.PI * 2);
ctx.fillStyle = color;
ctx.strokeStyle = '#0d1117';
ctx.lineWidth = 1.5;
ctx.fill(); ctx.stroke();
ctx.restore();
}
},
}],
});
}
}
// One-time normaliser for the legacy /device/event/{idx}/waveform shape
// (samples as int16 ADC counts in `channels.{ch}: [...]`). Bridges the
// gap if a stale cache or non-upgraded server returns the old format.
function _legacyWaveformToPlotV1(data) {
const sr = data.sample_rate || 1024;
const pretrig = data.pretrig_samples || 0;
const decoded = data.samples_decoded || 0;
const total = data.total_samples || decoded;
const dt = 1000 / sr;
const t0 = -pretrig * dt;
// Apply the CORRECT scale: 10 in/s full-scale for Normal range.
const geoFs = 10.0;
const geoScale = geoFs / 32768;
const ch = data.channels || {};
const micPeak = data.peak_values?.micl_psi ?? null;
const micPeakCounts = (ch.MicL || ch.Mic || []).reduce((m, v) => Math.max(m, Math.abs(v)), 0);
const micScale = (micPeak != null && micPeakCounts > 0) ? micPeak / micPeakCounts : 1.0;
const mkGeo = (counts) => {
if (!counts || !counts.length) return [];
return counts.map(c => c * geoScale);
};
const mkMic = (counts) => {
if (!counts || !counts.length) return [];
return counts.map(c => c * micScale);
};
return {
schema: 'sfm.plot.v1',
event_id: data.event_id || null,
serial: data.serial || '',
timestamp: data.timestamp?.display || data.timestamp || '',
record_type: data.record_type,
waveform_key: null,
time_axis: {
sample_rate: sr, pretrig_samples: pretrig, total_samples: total,
n_samples: decoded, t0_ms: t0, dt_ms: dt,
rectime_seconds: data.rectime_seconds || 0,
},
geo_range: 'normal', geo_full_scale_ips: geoFs, trigger_ms: 0,
channels: {
Tran: { unit:'in/s', values: mkGeo(ch.Tran), peak: data.peak_values?.tran_in_s ?? null, peak_t_ms: null },
Vert: { unit:'in/s', values: mkGeo(ch.Vert), peak: data.peak_values?.vert_in_s ?? null, peak_t_ms: null },
Long: { unit:'in/s', values: mkGeo(ch.Long), peak: data.peak_values?.long_in_s ?? null, peak_t_ms: null },
MicL: { unit:'psi', values: mkMic(ch.MicL || ch.Mic), peak: micPeak, peak_t_ms: null },
},
peak_values: data.peak_values || {},
};
}
// ── DB tabs ────────────────────────────────────────────────────────────────────
let histLoaded = false;
let unitsLoaded = false;
@@ -2032,7 +2376,9 @@ async function loadHistory() {
for (const ev of events) {
const tr = document.createElement('tr');
const pvs = ev.peak_vector_sum;
const maxPPV = Math.max(ev.tran_ppv ?? 0, ev.vert_ppv ?? 0, ev.long_ppv ?? 0);
tr.classList.add('clickable');
tr.title = 'Click to review (open sidecar editor)';
tr.dataset.eventId = ev.id;
tr.innerHTML = `
<td>${_fmtTs(ev.timestamp)}</td>
<td class="td-key">${ev.serial ?? '—'}</td>
@@ -2045,24 +2391,157 @@ async function loadHistory() {
<td class="td-text">${ev.client ?? '—'}</td>
<td class="td-dim">${ev.record_type ?? '—'}</td>
<td class="td-dim" style="font-size:10px">${ev.waveform_key ?? '—'}</td>
<td>${ev.false_trigger ? '<span class="ft-badge">FALSE</span>' : `<button class="ft-toggle-btn" onclick="toggleFalseTrigger(${ev.id}, this)" title="Flag as false trigger">Flag</button>`}</td>
<td>${ev.false_trigger ? '<span class="ft-badge">FALSE</span>' : ''}</td>
`;
tr.addEventListener('click', () => openSidecarModal(ev.id));
tbody.appendChild(tr);
}
}
async function toggleFalseTrigger(id, btn) {
btn.disabled = true;
// ── Sidecar review modal ───────────────────────────────────────────────────────
//
// Opens on row click in the History table. Loads the .sfm.json sidecar
// for the event via GET /db/events/{id}/sidecar, lets the user toggle
// false_trigger / edit notes / set reviewer, and saves via PATCH on the
// same URL. This mirrors the workflow used by the monthly vibration
// summary process — most of the rich review UX lives in Terra-View;
// this is the SFM-standalone equivalent for testing / direct edits.
let _scCurrentEventId = null;
let _scCurrentSidecar = null;
async function openSidecarModal(eventId) {
_scCurrentEventId = eventId;
_scCurrentSidecar = null;
document.getElementById('sc-status').textContent = 'Loading sidecar…';
document.getElementById('sc-status').className = 'sc-status';
document.getElementById('sc-overlay').classList.add('visible');
// Reset edit fields
document.getElementById('sc-edit-ft').checked = false;
document.getElementById('sc-edit-reviewer').value = '';
document.getElementById('sc-edit-notes').value = '';
try {
const r = await fetch(`${api()}/db/events/${id}/false_trigger?value=true`, { method: 'PATCH' });
if (!r.ok) throw new Error(r.statusText);
btn.outerHTML = '<span class="ft-badge">FALSE</span>';
const r = await fetch(`${api()}/db/events/${eventId}/sidecar`);
if (!r.ok) {
const e = await r.json().catch(() => ({}));
throw new Error(e.detail || r.statusText);
}
const data = await r.json();
_scCurrentSidecar = data;
_renderSidecar(data);
document.getElementById('sc-status').textContent = '';
} catch (e) {
btn.disabled = false;
alert(`Failed to flag: ${e.message}`);
document.getElementById('sc-status').className = 'sc-status error';
document.getElementById('sc-status').textContent = `Load failed: ${e.message}`;
}
}
function _renderSidecar(data) {
const ev = data.event || {};
const pv = data.peak_values || {};
const pi = data.project_info || {};
const bw = data.blastware || {};
const src = data.source || {};
const rev = data.review || {};
document.getElementById('sc-title').textContent = `Event — ${bw.filename || ev.waveform_key || 'unknown'}`;
const fmtPpv = v => (v == null ? '—' : Number(v).toFixed(5) + ' in/s');
const fmtMic = v => {
if (v == null || v <= 0) return '—';
const dbl = 20 * Math.log10(v / DBL_REF);
return `${dbl.toFixed(1)} dBL (${v.toExponential(2)} psi)`;
};
document.getElementById('sc-f-serial').textContent = ev.serial || '—';
document.getElementById('sc-f-ts').textContent = ev.timestamp || '—';
document.getElementById('sc-f-rt').textContent = ev.record_type || '—';
document.getElementById('sc-f-sr').textContent = (ev.sample_rate ?? '—') + (ev.sample_rate ? ' sps' : '');
document.getElementById('sc-f-key').textContent = ev.waveform_key || '—';
document.getElementById('sc-f-tran').textContent = fmtPpv(pv.transverse);
document.getElementById('sc-f-vert').textContent = fmtPpv(pv.vertical);
document.getElementById('sc-f-long').textContent = fmtPpv(pv.longitudinal);
document.getElementById('sc-f-pvs').textContent = fmtPpv(pv.vector_sum);
document.getElementById('sc-f-mic').textContent = fmtMic(pv.mic_psi);
document.getElementById('sc-f-project').textContent = pi.project || '—';
document.getElementById('sc-f-client').textContent = pi.client || '—';
document.getElementById('sc-f-operator').textContent = pi.operator || '—';
document.getElementById('sc-f-loc').textContent = pi.sensor_location || '—';
document.getElementById('sc-f-bw').textContent = bw.filename || '—';
document.getElementById('sc-f-bwsize').textContent = bw.filesize != null ? `${bw.filesize} bytes` : '—';
document.getElementById('sc-f-sha').textContent = bw.sha256 || '—';
document.getElementById('sc-f-src').textContent = src.kind || '—';
document.getElementById('sc-f-cap').textContent = src.captured_at || '—';
document.getElementById('sc-edit-ft').checked = !!rev.false_trigger;
document.getElementById('sc-edit-reviewer').value = rev.reviewer || '';
document.getElementById('sc-edit-notes').value = rev.notes || '';
document.getElementById('sc-raw-json').textContent = JSON.stringify(data, null, 2);
}
function closeSidecarModal() {
document.getElementById('sc-overlay').classList.remove('visible');
_scCurrentEventId = null;
_scCurrentSidecar = null;
}
function onSidecarOverlayClick(e) {
// Click on the dimmed backdrop (but NOT on the modal itself) closes.
if (e.target.id === 'sc-overlay') closeSidecarModal();
}
async function saveSidecarReview() {
if (!_scCurrentEventId) return;
const btn = document.getElementById('sc-save-btn');
const status = document.getElementById('sc-status');
btn.disabled = true;
status.className = 'sc-status';
status.textContent = 'Saving…';
const review = {
false_trigger: document.getElementById('sc-edit-ft').checked,
reviewer: document.getElementById('sc-edit-reviewer').value.trim() || null,
notes: document.getElementById('sc-edit-notes').value,
};
try {
const r = await fetch(`${api()}/db/events/${_scCurrentEventId}/sidecar`, {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ review }),
});
if (!r.ok) {
const e = await r.json().catch(() => ({}));
throw new Error(e.detail || r.statusText);
}
const updated = await r.json();
_scCurrentSidecar = updated;
_renderSidecar(updated);
status.className = 'sc-status ok';
status.textContent = 'Saved.';
// Refresh the History table so the false_trigger badge reflects the change.
if (typeof loadHistory === 'function') loadHistory();
setTimeout(closeSidecarModal, 600);
} catch (e) {
status.className = 'sc-status error';
status.textContent = `Save failed: ${e.message}`;
} finally {
btn.disabled = false;
}
}
// Esc closes the modal.
document.addEventListener('keydown', (e) => {
if (e.key === 'Escape' && document.getElementById('sc-overlay').classList.contains('visible')) {
closeSidecarModal();
}
});
// ── Units tab ──────────────────────────────────────────────────────────────────
async function loadUnits() {
unitsLoaded = true;
@@ -2224,5 +2703,81 @@ document.getElementById('api-base').value = window.location.origin;
document.getElementById(id)?.addEventListener('keydown', e => { if (e.key === 'Enter') connectUnit(); });
});
</script>
<!-- ════════════════════════════════════════════════════════════════
Sidecar review modal (Database events table → row click)
═══════════════════════════════════════════════════════════════════ -->
<div class="sc-overlay" id="sc-overlay" onclick="onSidecarOverlayClick(event)">
<div class="sc-modal" id="sc-modal">
<div class="sc-header">
<h3 id="sc-title">Event</h3>
<button class="sc-close" onclick="closeSidecarModal()">×</button>
</div>
<div class="sc-body">
<div class="sc-section">
<h4>Event</h4>
<dl class="sc-grid">
<dt>Serial</dt> <dd id="sc-f-serial"></dd>
<dt>Timestamp</dt> <dd id="sc-f-ts"></dd>
<dt>Record type</dt> <dd id="sc-f-rt"></dd>
<dt>Sample rate</dt> <dd id="sc-f-sr"></dd>
<dt>Waveform key</dt> <dd id="sc-f-key"></dd>
</dl>
</div>
<div class="sc-section">
<h4>Peaks</h4>
<dl class="sc-grid">
<dt>Tran</dt> <dd id="sc-f-tran"></dd>
<dt>Vert</dt> <dd id="sc-f-vert"></dd>
<dt>Long</dt> <dd id="sc-f-long"></dd>
<dt>PVS</dt> <dd id="sc-f-pvs"></dd>
<dt>Mic</dt> <dd id="sc-f-mic"></dd>
</dl>
</div>
<div class="sc-section">
<h4>Project</h4>
<dl class="sc-grid">
<dt>Project</dt> <dd id="sc-f-project"></dd>
<dt>Client</dt> <dd id="sc-f-client"></dd>
<dt>Operator</dt> <dd id="sc-f-operator"></dd>
<dt>Location</dt> <dd id="sc-f-loc"></dd>
</dl>
</div>
<div class="sc-section">
<h4>Source / files</h4>
<dl class="sc-grid">
<dt>BW filename</dt> <dd id="sc-f-bw"></dd>
<dt>BW filesize</dt> <dd id="sc-f-bwsize"></dd>
<dt>BW sha256</dt> <dd id="sc-f-sha"></dd>
<dt>Source kind</dt> <dd id="sc-f-src"></dd>
<dt>Captured at</dt> <dd id="sc-f-cap"></dd>
</dl>
</div>
<div class="sc-section">
<h4>Review (editable)</h4>
<div class="sc-row">
<input type="checkbox" id="sc-edit-ft" />
<label for="sc-edit-ft">False trigger</label>
</div>
<div class="sc-row">
<label for="sc-edit-reviewer" style="min-width:60px">Reviewer</label>
<input type="text" id="sc-edit-reviewer" placeholder="e.g. brian" />
</div>
<label for="sc-edit-notes" style="font-size:11px;color:var(--text-mute)">Notes</label>
<textarea id="sc-edit-notes" placeholder="e.g. truck thump near sensor 14:23 — false trigger"></textarea>
</div>
<details class="sc-raw">
<summary>Raw sidecar JSON (read-only peek)</summary>
<pre id="sc-raw-json"></pre>
</details>
</div>
<div class="sc-footer">
<span class="sc-status" id="sc-status"></span>
<button class="btn btn-ghost" onclick="closeSidecarModal()">Cancel</button>
<button class="btn" id="sc-save-btn" onclick="saveSidecarReview()">Save</button>
</div>
</div>
</div>
</body>
</html>
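The time-axis and downsampling arithmetic in the waveform viewer script above can be sketched outside the browser. The following is a minimal Python sketch, not the project's implementation: the function names are hypothetical, while the constants (10 in/s full scale, the 32768 divisor, the 4000-point display cap) are taken from the viewer and the legacy normaliser shown in the diff.

```python
import math


def time_axis_ms(sample_rate: float, pretrig_samples: int, n_samples: int):
    """Return (t0_ms, dt_ms, times) so that times[i] = t0_ms + i * dt_ms."""
    dt_ms = 1000.0 / sample_rate
    t0_ms = -pretrig_samples * dt_ms  # trigger sits at t = 0 ms
    return t0_ms, dt_ms, [t0_ms + i * dt_ms for i in range(n_samples)]


def counts_to_ips(counts, full_scale_ips: float = 10.0):
    """Scale int16 ADC counts to in/s (Normal range assumed: 10 in/s full scale)."""
    scale = full_scale_ips / 32768
    return [c * scale for c in counts]


def downsample_every_nth(values, max_pts: int = 4000):
    """Every-Nth decimation, as in the viewer. Returns (kept_values, step)."""
    if len(values) <= max_pts:
        return list(values), 1
    step = math.ceil(len(values) / max_pts)
    return list(values[::step]), step


def peak_plot_index(peak_t_ms: float, t0_ms: float, dt_ms: float,
                    n_samples: int, step: int) -> int:
    """Index into the decimated series at or just before the peak, or -1."""
    # floor(x + 0.5) mirrors JavaScript's Math.round (Python's round() is banker's).
    exact = math.floor((peak_t_ms - t0_ms) / dt_ms + 0.5)
    if not 0 <= exact < n_samples:
        return -1
    return exact // step
```

Because the decimation keeps only every `step`-th sample, the returned index points at a retained neighbour of the peak rather than the peak itself; tools that need the exact value should read the `.h5` file instead.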
@@ -0,0 +1,446 @@
"""
sfm/waveform_store.py: on-disk store for Blastware-format event files.
Layout (flat per-serial, four files per event):
<root>/<serial>/<filename> event file (BW-readable binary)
<root>/<serial>/<filename>.a5.pkl pickled list of A5 S3Frame dicts
<root>/<serial>/<filename>.h5 clean waveform arrays (HDF5)
<root>/<serial>/<filename>.sfm.json modern sidecar (peaks, project,
review state, extensions)
`<filename>` is whatever `minimateplus.blastware_file.blastware_filename`
produces for the event. The extension is NOT a fixed type tag; it
encodes the event timestamp (`AB0T` format).
Roles:
- BW binary: what Blastware reads. Untouched. The user-facing review
waveform viewer.
- .a5.pkl: regenerative source. Lets the BW binary be rebuilt
byte-for-byte if the encoder changes. Never delete.
- .h5: clean per-channel waveform arrays in physical units (in/s for
geo, psi for mic) plus event metadata. Canonical format for
downstream analysis tools and the `/device/event/{idx}/waveform`
endpoint's plot-JSON output.
- .sfm.json: small, queryable metadata + review state. SQL
`events.false_trigger` is a derived index kept in sync via
`patch_sidecar()`.
"""
from __future__ import annotations
import datetime
import logging
import pickle
import shutil
from pathlib import Path
from typing import Optional
from minimateplus import event_file_io
from minimateplus.blastware_file import blastware_filename, write_blastware_file
from minimateplus.framing import S3Frame
from minimateplus.models import Event
from sfm import event_hdf5
log = logging.getLogger("sfm.waveform_store")
A5_PICKLE_VERSION = 1
def _frame_to_dict(f: S3Frame) -> dict:
return {
"sub": f.sub,
"page_hi": f.page_hi,
"page_lo": f.page_lo,
"data": bytes(f.data),
"chk_byte": f.chk_byte,
"checksum_valid": f.checksum_valid,
}
def _dict_to_frame(d: dict) -> S3Frame:
return S3Frame(
sub=d["sub"],
page_hi=d["page_hi"],
page_lo=d["page_lo"],
data=bytes(d["data"]),
checksum_valid=d.get("checksum_valid", True),
chk_byte=d.get("chk_byte", 0),
)
class WaveformStore:
"""
Persistent store for Blastware-format waveform files + their A5 source frames.
Thread safety: write_blastware_file is single-shot; concurrent saves of the
*same* filename would race, but the filename encodes second-resolution
timestamps + serial, so collisions across threads/processes are vanishingly
unlikely in practice.
"""
def __init__(self, root: str | Path) -> None:
self.root = Path(root)
self.root.mkdir(parents=True, exist_ok=True)
log.info("WaveformStore root=%s", self.root)
# ── path helpers ────────────────────────────────────────────────────────────
def _serial_dir(self, serial: str) -> Path:
d = self.root / serial
d.mkdir(parents=True, exist_ok=True)
return d
def paths_for(self, serial: str, filename: str) -> tuple[Path, Path]:
"""Return (blastware_path, a5_pickle_path) for a given serial+filename.
For the sidecar path use `sidecar_path_for()`, which is kept separate so
existing callers don't need to unpack a 3-tuple.
"""
d = self._serial_dir(serial)
return d / filename, d / f"{filename}.a5.pkl"
def sidecar_path_for(self, serial: str, filename: str) -> Path:
"""Return absolute path to the .sfm.json sidecar for a given event."""
return self._serial_dir(serial) / f"{filename}.sfm.json"
def hdf5_path_for(self, serial: str, filename: str) -> Path:
"""Return absolute path to the .h5 clean-waveform file for a given event."""
return self._serial_dir(serial) / f"{filename}.h5"
def open_blastware(self, serial: str, filename: str) -> Optional[Path]:
"""Return absolute path to an existing event file or None."""
bw_path, _ = self.paths_for(serial, filename)
return bw_path if bw_path.exists() else None
# ── save / load ─────────────────────────────────────────────────────────────
def save(
self,
ev: Event,
serial: str,
a5_frames: list[S3Frame],
*,
source_kind: str = "sfm-live",
geo_range: str = "normal",
) -> dict:
"""
Write all four event-file artifacts for one event:
- <filename> BW binary
- <filename>.a5.pkl raw A5 frame pickle
- <filename>.h5 clean waveform (HDF5)
- <filename>.sfm.json modern sidecar (metadata + review)
Returns a record dict suitable for persisting alongside the DB row:
{
"filename": "M529LKIQ.7M0W",
"filesize": 8708,
"sha256": "a1b2c3...",
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
"hdf5_filename": "M529LKIQ.7M0W.h5",
"sidecar_filename": "M529LKIQ.7M0W.sfm.json",
}
`source_kind` flows into `sidecar.source.kind`; callers should
pass "sfm-live" (default) for the live endpoint and "sfm-ach" for
the ACH ingestion path. BW-imported events use save_imported_bw()
instead.
`geo_range` controls the ADC-counts-to-in/s scaling in the HDF5
file ("normal" = 10 in/s FS, "sensitive" = 1.25 in/s FS).
Defaults to "normal"; callers with compliance-config access
should pass the actual unit setting so the saved samples are in
the right units.
Idempotent: if the event file already exists, it is overwritten
with the freshly-encoded version (same bytes for the same
a5_frames) and the sidecar's review block is preserved across
re-saves.
"""
if not a5_frames:
raise ValueError("WaveformStore.save: a5_frames is empty")
if not serial:
raise ValueError("WaveformStore.save: serial is required")
filename = blastware_filename(ev, serial)
bw_path, a5_path = self.paths_for(serial, filename)
sidecar_path = self.sidecar_path_for(serial, filename)
hdf5_path = self.hdf5_path_for(serial, filename)
# 1. encode the event file (defensive unlink prevents trailing-byte
# leaks from a previous larger file on synced/odd filesystems).
try:
bw_path.unlink()
except FileNotFoundError:
pass
write_blastware_file(ev, a5_frames, bw_path)
filesize = bw_path.stat().st_size
sha256 = event_file_io.file_sha256(bw_path)
# 2. write the .a5.pkl sidecar
try:
a5_path.unlink()
except FileNotFoundError:
pass
payload = {
"version": A5_PICKLE_VERSION,
"frames": [_frame_to_dict(f) for f in a5_frames],
}
with a5_path.open("wb") as fp:
pickle.dump(payload, fp, protocol=pickle.HIGHEST_PROTOCOL)
# 3. write the .h5 clean-waveform file (samples in physical units).
# Best-effort: a write failure shouldn't sink the rest of the save
# (the HDF5 can be regenerated later from the .a5.pkl).
hdf5_filename: Optional[str] = None
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range=geo_range,
source_kind=source_kind,
)
hdf5_filename = hdf5_path.name
except Exception as exc:
log.warning(
"save: HDF5 write failed for %s: %s — continuing without .h5",
hdf5_path, exc,
)
# 4. write the .sfm.json sidecar. Preserve any existing review
# block + extensions across re-saves so user edits aren't lost
# when the same event is re-downloaded (e.g. via Force refresh).
existing_review = None
existing_extensions = None
if sidecar_path.exists():
try:
old = event_file_io.read_sidecar(sidecar_path)
existing_review = old.get("review")
existing_extensions = old.get("extensions")
except Exception as exc:
log.warning(
"save: existing sidecar at %s unreadable (%s); overwriting",
sidecar_path, exc,
)
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=filename,
blastware_filesize=filesize,
blastware_sha256=sha256,
source_kind=source_kind,
a5_pickle_filename=a5_path.name,
review=existing_review,
extensions=existing_extensions,
)
event_file_io.write_sidecar(sidecar_path, sidecar)
log.info(
"WaveformStore.save serial=%s filename=%s filesize=%d frames=%d "
"h5=%s sidecar=%s",
serial, filename, filesize, len(a5_frames),
hdf5_filename or "(skipped)", sidecar_path.name,
)
return {
"filename": filename,
"filesize": filesize,
"sha256": sha256,
"a5_pickle_filename": a5_path.name,
"hdf5_filename": hdf5_filename,
"sidecar_filename": sidecar_path.name,
}
def save_imported_bw(
self,
bw_bytes: bytes,
source_path: Path,
*,
serial_hint: Optional[str] = None,
) -> tuple[Event, dict]:
"""
Ingest a Blastware event file produced by an external tool
(Blastware's own ACH, manual download, etc.) where the source A5
frames aren't available.
Workflow:
1. Parse the bytes via event_file_io.read_blastware_file (writes
a temp file to do that, since the parser takes a path).
2. Resolve serial: use serial_hint if given, else infer it from
the BW filename (`<P><serial3>...`). Falls back to "UNKNOWN".
3. Copy the BW bytes verbatim into <root>/<serial>/<filename>.
4. Write the .h5 clean-waveform file from the parsed Event
(best-effort; assumes geo_range = "normal").
5. Write the .sfm.json sidecar with source.kind = "bw-import"
and a5_pickle_filename = None. Does NOT write a .a5.pkl:
no A5 source is available, so the on-disk BW file IS the
byte-for-byte source.
Returns (event, record_dict) so callers can both insert into
SeismoDb and surface the parsed Event.
"""
# Stash the bytes to a temp path so read_blastware_file (path-based)
# can parse without us duplicating its logic.
import tempfile
with tempfile.NamedTemporaryFile(suffix=".bw", delete=False) as tmp:
tmp.write(bw_bytes)
tmp_path = Path(tmp.name)
try:
ev = event_file_io.read_blastware_file(tmp_path)
finally:
try:
tmp_path.unlink()
except FileNotFoundError:
pass
# Resolve serial. blastware_filename derives a 4-char prefix from
# the numeric serial (e.g. BE11529 → M529); we go the other way
# via the source filename if a hint wasn't given.
serial = serial_hint or _serial_from_bw_filename(source_path.name) or "UNKNOWN"
# Use the source filename verbatim — it already encodes timestamp
# + record type per BW's AB0T scheme, and we want to preserve it
# so the file BW knows about can be opened back in BW.
filename = source_path.name
bw_path = self._serial_dir(serial) / filename
# 1. copy bytes
bw_path.write_bytes(bw_bytes)
filesize = bw_path.stat().st_size
sha256 = event_file_io.file_sha256(bw_path)
# 2. write the .h5 clean-waveform file from the parsed Event.
# Note: peaks here are computed from raw samples (the BW file
# doesn't carry the device-authoritative 0C peaks). Best-effort.
hdf5_path = self.hdf5_path_for(serial, filename)
hdf5_filename: Optional[str] = None
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range="normal", # BW file doesn't carry the range; assume Normal
source_kind="bw-import",
)
hdf5_filename = hdf5_path.name
except Exception as exc:
log.warning(
"save_imported_bw: HDF5 write failed for %s: %s — continuing",
hdf5_path, exc,
)
# 3. write sidecar with source.kind = bw-import
sidecar_path = self.sidecar_path_for(serial, filename)
existing_review = None
if sidecar_path.exists():
try:
existing_review = event_file_io.read_sidecar(sidecar_path).get("review")
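except Exception as exc:
log.warning(
"save_imported_bw: existing sidecar at %s unreadable (%s); overwriting",
sidecar_path, exc,
)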
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=filename,
blastware_filesize=filesize,
blastware_sha256=sha256,
source_kind="bw-import",
a5_pickle_filename=None,
review=existing_review,
)
event_file_io.write_sidecar(sidecar_path, sidecar)
log.info(
"WaveformStore.save_imported_bw serial=%s filename=%s filesize=%d "
"h5=%s (no .a5.pkl — A5 source unavailable for BW-imported files)",
serial, filename, filesize, hdf5_filename or "(skipped)",
)
return ev, {
"filename": filename,
"filesize": filesize,
"sha256": sha256,
"a5_pickle_filename": None,
"hdf5_filename": hdf5_filename,
"sidecar_filename": sidecar_path.name,
}
def load_a5(self, serial: str, filename: str) -> Optional[list[S3Frame]]:
"""
Re-hydrate the pickled A5 frame stream for a stored event.
Returns None if the .a5.pkl file is missing or malformed.
"""
_, a5_path = self.paths_for(serial, filename)
if not a5_path.exists():
return None
with a5_path.open("rb") as fp:
payload = pickle.load(fp)
if not isinstance(payload, dict) or "frames" not in payload:
log.warning("WaveformStore.load_a5: malformed sidecar at %s", a5_path)
return None
return [_dict_to_frame(d) for d in payload["frames"]]
# ── modern .sfm.json sidecar accessors ──────────────────────────────────────
def load_sidecar(self, serial: str, filename: str) -> Optional[dict]:
"""Return the parsed .sfm.json sidecar dict, or None if missing."""
path = self.sidecar_path_for(serial, filename)
if not path.exists():
return None
try:
return event_file_io.read_sidecar(path)
except Exception as exc:
log.warning("load_sidecar: failed to read %s: %s", path, exc)
return None
def patch_sidecar(
self,
serial: str,
filename: str,
*,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
reviewer_now: bool = True,
) -> Optional[dict]:
"""
JSON-merge-patch the .sfm.json sidecar's review/extensions blocks.
Returns the new full dict, or None if the sidecar doesn't exist.
"""
path = self.sidecar_path_for(serial, filename)
if not path.exists():
return None
return event_file_io.patch_sidecar(
path,
review=review,
extensions=extensions,
reviewer_now=reviewer_now,
)
# ── helpers ─────────────────────────────────────────────────────────────────────
def _serial_from_bw_filename(name: str) -> Optional[str]:
"""
Reverse of `blastware_filename`'s serial-prefix encoding.
BW filename format (V10.72): `<P><serial3><stem4>.<ext>`
where P = chr(ord('B') + serial // 1000)
and serial3 = f"{serial % 1000:03d}".
Examples (from CLAUDE.md verification archive):
P036... -> BE14036    H907... -> BE6907
M529... -> BE11529    T003... -> BE18003
Returns the inferred BE-prefix serial (e.g. "BE11529") or None when
the filename doesn't match the expected pattern.
"""
if not name:
return None
# First letter encodes the thousands group; next 3 chars encode the
# last 3 digits of the serial.
base = name.split(".", 1)[0]
if len(base) < 4 or not base[0].isalpha() or not base[1:4].isdigit():
return None
prefix_letter = base[0].upper()
if prefix_letter < "B":
return None
thousands = ord(prefix_letter) - ord("B")
serial_num = thousands * 1000 + int(base[1:4])
return f"BE{serial_num}"
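The prefix encoding described in the `_serial_from_bw_filename` docstring can be checked as a round trip. This is a minimal sketch under stated assumptions: `encode_serial_prefix` and `decode_serial_prefix` are hypothetical names, not functions from `minimateplus`, and the decoder is stricter than the original (ASCII letters and digits only).

```python
import re
from typing import Optional

# One letter (thousands group) followed by three digits (serial % 1000).
_PREFIX_RE = re.compile(r"^([A-Za-z])([0-9]{3})")


def encode_serial_prefix(serial_num: int) -> str:
    """Forward direction: numeric serial -> 4-char filename prefix."""
    thousands, last3 = divmod(serial_num, 1000)
    letter = chr(ord("B") + thousands)
    if not "B" <= letter <= "Z":
        raise ValueError(f"serial {serial_num} is outside the B..Z prefix range")
    return f"{letter}{last3:03d}"


def decode_serial_prefix(filename: str) -> Optional[str]:
    """Reverse direction: filename -> 'BE<serial>' or None if it doesn't match."""
    m = _PREFIX_RE.match(filename)
    if m is None:
        return None
    letter = m.group(1).upper()
    if letter < "B":
        return None
    thousands = ord(letter) - ord("B")
    return f"BE{thousands * 1000 + int(m.group(2))}"


if __name__ == "__main__":
    # Examples taken from the docstring above.
    for serial in (14036, 6907, 11529, 18003):
        prefix = encode_serial_prefix(serial)
        print(prefix, decode_serial_prefix(prefix + "XXXX.EXT"))
```

The single prefix letter limits this scheme to serials below 25000 (letters B through Z), which is why the encoder raises rather than emitting a non-letter character.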
@@ -0,0 +1,252 @@
"""
test_5a_protocol.py: regression test for the v0.14.x SUB 5A protocol fixes.
Verifies that SFM's framing helpers reproduce Blastware's exact wire bytes
for every 5A request frame in the 5-1-26 "bwcap3sec" capture, AND that the
file builder produces a byte-identical file when fed the BW capture's A5
responses.
Together these two tests protect all four v0.14.x fixes:
v0.14.0: STRT-bounded chunk walk (probe @ 0, metadata pages @ 0x1002 +
0x1004, samples @ 0x0600..0x1E00 step 0x0200, TERM at residual)
v0.14.1: event-N probe counter is `start_offset`, not `start_offset+0x46`
(covered by the multi-event captures, not this 3-sec event-1
capture, but the helpers are the same code path)
v0.14.2: file body assembly is contiguous concatenation, no de-duplication
v0.14.3: partial DLE stuffing of `0x10` bytes in 5A params (counter=0x1000
wire bytes are `10 10 00`, not `10 00`)
If any of these fixes regresses, this test fails immediately with a clear
byte-level diff.
Run:
python -m pytest tests/test_5a_protocol.py -v
or:
python tests/test_5a_protocol.py
"""
from __future__ import annotations
import os
import sys
import pytest
# Allow running from the project root without installation
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import (
S3FrameParser,
build_5a_frame,
bulk_waveform_params,
bulk_waveform_term_v2,
)
# ── Capture loading ────────────────────────────────────────────────────────────
ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Reference BW MITM capture: BW saving a 3-sec event 0 (start_key=01110000,
# end_offset=0x21F2). 17 5A frames: probe + 2 metadata pages + 13 samples + TERM.
BW_TX_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_bw_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
BW_S3_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_s3_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
# BW's saved Blastware file for the same event (used for file-builder verification).
BW_SAVED_FILE = os.path.join(
ROOT, "example-events/decode_test/5-1-26/bw/M529LKIQ.G10",
)
def _split_bw_frames(data: bytes) -> list[bytes]:
"""Split BW TX bytes into individual frames (ACK STX … bare ETX)."""
frames: list[bytes] = []
i = 0
while i < len(data):
if data[i] != 0x41 or i + 1 >= len(data) or data[i + 1] != 0x02:
i += 1
continue
j = i + 2
while j < len(data):
if data[j] == 0x03:
break
if data[j] == 0x10 and j + 1 < len(data):
j += 2
continue
j += 1
if j >= len(data):
break
frames.append(data[i : j + 1])
i = j + 1
return frames
@pytest.fixture(scope="module")
def bw_5a_frames() -> list[bytes]:
"""All 5A frames from the BW TX capture, in wire order."""
if not os.path.exists(BW_TX_PATH):
pytest.skip(f"BW capture not found: {BW_TX_PATH}")
raw = open(BW_TX_PATH, "rb").read()
frames = [
f for f in _split_bw_frames(raw)
if len(f) >= 6 and f[5] == 0x5A # body[3] == 0x5A (SUB)
]
assert len(frames) == 17, f"expected 17 5A frames in capture, got {len(frames)}"
return frames
@pytest.fixture(scope="module")
def bw_a5_frames():
"""All A5 (response) frames from the matching S3 capture."""
if not os.path.exists(BW_S3_PATH):
pytest.skip(f"BW S3 capture not found: {BW_S3_PATH}")
raw = open(BW_S3_PATH, "rb").read()
p = S3FrameParser()
p.feed(raw)
a5 = [f for f in p.frames if f.sub == 0xA5]
assert len(a5) == 17, f"expected 17 A5 frames in capture, got {len(a5)}"
return a5
# ── 5A request frame byte-perfect verification ────────────────────────────────
KEY4 = bytes.fromhex("01110000") # start_key for the 3-sec event 0
END_OFFSET = 0x21F2 # parsed from STRT in the BW capture
LAST_CHUNK_COUNTER = 0x1E00 # last full 0x0200-byte chunk before TERM
SAMPLE_COUNTERS = (
0x0600, 0x0800, 0x0A00, 0x0C00, 0x0E00,
0x1000, 0x1200, 0x1400, 0x1600, 0x1800,
0x1A00, 0x1C00, 0x1E00,
)
def _meta_params(key: bytes, counter: int) -> bytes:
"""Build the 12-byte metadata-page params block (matches BW for 0x1002 / 0x1004)."""
return bytes(
[
0x00, key[0], key[1],
(counter >> 8) & 0xFF, counter & 0xFF,
0, 0, 0, 0, 0, 0, 0,
]
)
def test_probe_frame_byte_perfect(bw_5a_frames):
"""Probe @ counter=0x0000 (frame 0)."""
sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, 0, is_probe=True))
assert sfm == bw_5a_frames[0], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[0].hex()}"
)
@pytest.mark.parametrize("idx,counter", [(1, 0x1002), (2, 0x1004)])
def test_metadata_page_frames_byte_perfect(bw_5a_frames, idx, counter):
"""Metadata pages @ counter=0x1002 and 0x1004 (frames 1 and 2)."""
sfm = build_5a_frame(0x1002, _meta_params(KEY4, counter))
assert sfm == bw_5a_frames[idx], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[idx].hex()}"
)
@pytest.mark.parametrize("i,counter", list(enumerate(SAMPLE_COUNTERS)))
def test_sample_chunk_frames_byte_perfect(bw_5a_frames, i, counter):
    """
    Sample chunks @ counter=0x0600..0x1E00, step 0x0200 (frames 3..15).

    Critically, frame 8 (counter=0x1000) requires the v0.14.3 partial DLE
    stuffing fix: wire params include `10 10 00` for the counter, not `10 00`.
    """
    sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, counter))
    bw_idx = 3 + i
    assert sfm == bw_5a_frames[bw_idx], (
        f"\ncounter=0x{counter:04X}"
        f"\nSFM: {sfm.hex()}"
        f"\nBW: {bw_5a_frames[bw_idx].hex()}"
    )
def test_term_frame_byte_perfect(bw_5a_frames):
"""TERM frame at residual (frame 16)."""
offset_word, params = bulk_waveform_term_v2(KEY4, END_OFFSET, LAST_CHUNK_COUNTER)
sfm = build_5a_frame(offset_word, params)
assert sfm == bw_5a_frames[16], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[16].hex()}"
)
def test_strt_end_offset_parsing(bw_a5_frames):
"""The probe response (A5[0]) carries STRT at byte 17 with end_offset=0x21F2."""
from minimateplus.framing import parse_strt_end_offset
end_offset = parse_strt_end_offset(bw_a5_frames[0].data)
assert end_offset == END_OFFSET, (
f"expected end_offset=0x{END_OFFSET:04X}, got "
f"{f'0x{end_offset:04X}' if end_offset is not None else 'None'}"
)
# ── File builder byte-perfect verification ────────────────────────────────────
def test_blastware_file_builder_byte_perfect(bw_a5_frames):
"""
Feed the BW capture's A5 frames into write_blastware_file() and verify the
output is byte-identical to BW's saved M529LKIQ.G10 reference file.
This protects the v0.14.2 strip-removal fix and the file-builder skip
values (probe=38, meta=13, samples=12, TERM=11).
"""
if not os.path.exists(BW_SAVED_FILE):
pytest.skip(f"BW saved file not found: {BW_SAVED_FILE}")
import tempfile
from minimateplus.blastware_file import write_blastware_file
from minimateplus.models import Event
ev = Event(index=0)
ev._waveform_key = KEY4
ev.rectime_seconds = 3
ev.timestamp = None # let the builder pull the footer from the TERM frame
with tempfile.NamedTemporaryFile(suffix=".G10", delete=False) as tf:
tmp_path = tf.name
try:
write_blastware_file(ev, bw_a5_frames, tmp_path)
sfm_bytes = open(tmp_path, "rb").read()
finally:
os.unlink(tmp_path)
bw_bytes = open(BW_SAVED_FILE, "rb").read()
assert len(sfm_bytes) == len(bw_bytes), (
f"file size mismatch: SFM={len(sfm_bytes)} BW={len(bw_bytes)}"
)
if sfm_bytes != bw_bytes:
# Find first diff for actionable error message
for i in range(len(bw_bytes)):
if bw_bytes[i] != sfm_bytes[i]:
ctx_start = max(0, i - 8)
ctx_end = min(len(bw_bytes), i + 16)
pytest.fail(
f"file diverges at byte 0x{i:04X}\n"
f" BW : {bw_bytes[ctx_start:ctx_end].hex()}\n"
f" SFM: {sfm_bytes[ctx_start:ctx_end].hex()}\n"
f" {' ' * (i - ctx_start)}^^"
)
# ── Standalone runner ─────────────────────────────────────────────────────────
if __name__ == "__main__":
sys.exit(pytest.main([__file__, "-v"]))
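The sample-chunk test above notes that counter 0x1000 must appear on the wire as `10 10 00` rather than `10 00`. The following is a minimal sketch of generic DLE byte stuffing, assuming 0x10 is the escape byte and is escaped by doubling it. The helper names `dle_stuff` and `dle_unstuff` are hypothetical and are not the internals of `build_5a_frame`; the project's "partial" stuffing rule may cover only some fields.

```python
DLE = 0x10  # assumed escape byte, matching the 0x10 skip in _split_bw_frames


def dle_stuff(payload: bytes) -> bytes:
    """Double every DLE byte so it cannot be mistaken for a control prefix."""
    out = bytearray()
    for b in payload:
        out.append(b)
        if b == DLE:
            out.append(DLE)
    return bytes(out)


def dle_unstuff(wire: bytes) -> bytes:
    """Reverse dle_stuff: a DLE escapes the byte that follows it."""
    out = bytearray()
    i = 0
    while i < len(wire):
        b = wire[i]
        if b == DLE and i + 1 < len(wire):
            i += 1
            b = wire[i]
        out.append(b)
        i += 1
    return bytes(out)


# Counter 0x1000 as two big-endian bytes, then stuffed for the wire.
raw = (0x1000).to_bytes(2, "big")
wire = dle_stuff(raw)
print(raw.hex(), "->", wire.hex())
```

Counters such as 0x0E00 or 0x1200 contain no 0x10 byte and pass through unchanged, which would explain why only one sample frame in this capture exercises the escape path.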
@@ -0,0 +1,209 @@
"""
test_cache_invalidation.py: verify post-erase key-reuse correctness.
The device's event-key counter resets to 0x01110000 after every memory erase,
so a bare-key dedup (the old behaviour) silently treats a freshly-recorded
event 0 as if it were the previously-downloaded one. These tests exercise
the (key, timestamp)-based eviction logic in:
- bridges/ach_server.py (state-file migration + force flag)
- sfm/server.py (_LiveCache.set_events / set_waveform)
Run:
python tests/test_cache_invalidation.py
"""
from __future__ import annotations
import json
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# ── ACH state migration ───────────────────────────────────────────────────────
def test_ach_state_legacy_migration(tmp_path: Path):
"""
Legacy v1 state with a `downloaded_keys` list is migrated on _load_state
to the v2 `downloaded_events` dict. All legacy keys come back with empty
timestamps so the (key, ts) compare in get_events() always falls through
to a fresh download.
"""
from bridges.ach_server import _load_state
state_path = tmp_path / "ach_state.json"
legacy = {
"BE11529": {
"downloaded_keys": ["01110000", "0111245a"],
"max_downloaded_key": "0111245a",
"last_seen": "2026-04-11T01:04:36",
"serial": "BE11529",
"peer": "63.43.212.232:51920",
},
}
state_path.write_text(json.dumps(legacy))
migrated = _load_state(state_path)
unit = migrated["BE11529"]
assert "downloaded_keys" not in unit
assert unit["downloaded_events"] == {
"01110000": "",
"0111245a": "",
}
# max_downloaded_key is preserved verbatim
assert unit["max_downloaded_key"] == "0111245a"
def test_ach_state_v2_passes_through(tmp_path: Path):
"""A v2 state file is returned verbatim — no migration touches it."""
from bridges.ach_server import _load_state
state_path = tmp_path / "ach_state.json"
v2 = {
"BE11529": {
"downloaded_events": {
"01110000": "2026-04-15T14:23:45",
"0111245a": "2026-04-16T09:01:12",
},
"max_downloaded_key": "0111245a",
"serial": "BE11529",
},
}
state_path.write_text(json.dumps(v2))
loaded = _load_state(state_path)
assert loaded["BE11529"]["downloaded_events"] == v2["BE11529"]["downloaded_events"]
def test_ach_state_missing_returns_empty(tmp_path: Path):
"""Nonexistent state path → empty dict (not an error)."""
from bridges.ach_server import _load_state
assert _load_state(tmp_path / "absent.json") == {}
# ── _LiveCache eviction ───────────────────────────────────────────────────────
def _ev(index: int, key: str, ts: str) -> dict:
return {"index": index, "waveform_key": key, "timestamp": ts}
def test_live_cache_set_events_no_eviction_when_keys_match():
"""No flush when incoming events match the cached (key, ts) at each index."""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
c.set_events(conn, 2, [_ev(0, "01110000", "2026-04-15T14:23:45"),
_ev(1, "0111245a", "2026-04-16T09:01:12")])
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
# Same events again — must not flush.
c.set_events(conn, 2, [_ev(0, "01110000", "2026-04-15T14:23:45"),
_ev(1, "0111245a", "2026-04-16T09:01:12")])
assert c._events[conn][0] == 2
assert (conn, 0) in c._waveforms
def test_live_cache_set_events_flushes_on_post_erase_collision():
    """
    Index 0 keeps the same key (01110000 reuses) but the timestamp differs
    → device was erased + re-recorded → flush all events + waveforms for the
    device.
    """
    from sfm.live_cache import LiveCache as _LiveCache
    c = _LiveCache()
    conn = "tcp:1.2.3.4:12345"
    # First "session": index 0 key=01110000 ts=2026-04-15.
    c.set_events(conn, 1, [_ev(0, "01110000", "2026-04-15T14:23:45")])
    c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
    assert (conn, 0) in c._waveforms
    # Second "session" after erase: index 0 still key=01110000 but new ts.
    c.set_events(conn, 1, [_ev(0, "01110000", "2026-05-06T12:34:56")])
    # Stale waveform for index 0 must have been flushed by the eviction path
    # before the new event was inserted. The new events list IS in cache but
    # the cached waveform from the prior session is gone.
    assert (conn, 0) not in c._waveforms
    assert c._events[conn][1][0]["timestamp"] == "2026-05-06T12:34:56"
def test_live_cache_set_waveform_flushes_on_mismatch():
"""set_waveform alone should also evict when (key, ts) differs."""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
c.set_waveform(conn, 1, _ev(1, "0111245a", "2026-04-16T09:01:12"))
# Index 0 swap: same key, new timestamp.
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-05-06T12:34:56"))
# Index 1's stale waveform must be flushed — keeping it would mix eras.
assert (conn, 1) not in c._waveforms
# The newly-inserted index 0 entry is what's there.
assert c._waveforms[(conn, 0)]["timestamp"] == "2026-05-06T12:34:56"
def test_live_cache_partial_signature_does_not_flush():
    """
    If incoming event lacks waveform_key OR timestamp, we cannot prove a
    mismatch → eviction must NOT trigger. Avoids spurious flushes from
    legacy / partial event shapes.
    """
    from sfm.live_cache import LiveCache as _LiveCache
    c = _LiveCache()
    conn = "tcp:1.2.3.4:12345"
    c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
    # Incoming entry missing the timestamp — cannot prove a mismatch.
    c.set_waveform(conn, 0, {"index": 0, "waveform_key": "01110000"})
    # Cache should contain the new entry; the implementation overwrites
    # the index-0 row but does NOT flush other indices. Since there are no
    # other indices in this test, just check the entry exists.
    assert (conn, 0) in c._waveforms
if __name__ == "__main__":
    if pytest is not None:
        pytest.main([__file__, "-v"])
    else:
        import inspect
        import traceback as _tb
        passed = failed = 0
        for _name, _fn in sorted(globals().items()):
            if not _name.startswith("test_") or not callable(_fn):
                continue
            try:
                _sig = inspect.signature(_fn)
                if "tmp_path" in _sig.parameters:
                    with tempfile.TemporaryDirectory() as _td:
                        _fn(Path(_td))
                else:
                    _fn()
                print(f"PASS {_name}")
                passed += 1
            except Exception:
                print(f"FAIL {_name}")
                _tb.print_exc()
                failed += 1
        print(f"\n{passed} passed, {failed} failed")
        sys.exit(0 if failed == 0 else 1)
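The eviction rule these tests pin down can be sketched as follows. This is a hypothetical `SketchCache`, not the project's `LiveCache`; it only mirrors the behaviour the assertions above describe, assuming that a provable (key, timestamp) mismatch at any index flushes everything cached for that connection and that an incomplete signature never triggers a flush.

```python
from typing import Dict, List, Optional, Tuple


def _signature(ev: Optional[dict]) -> Optional[Tuple[str, str]]:
    """Return (waveform_key, timestamp), or None if either part is missing."""
    if not ev:
        return None
    key, ts = ev.get("waveform_key"), ev.get("timestamp")
    return (key, ts) if key and ts else None


class SketchCache:
    """Illustrative stand-in for a per-connection event/waveform cache."""

    def __init__(self) -> None:
        self._events: Dict[str, Tuple[int, List[dict]]] = {}
        self._waveforms: Dict[Tuple[str, int], dict] = {}

    def _known(self, conn: str, index: int) -> Optional[dict]:
        """Best cached record for an index: waveform first, then event list."""
        wf = self._waveforms.get((conn, index))
        if wf is not None:
            return wf
        _, events = self._events.get(conn, (0, []))
        return next((e for e in events if e.get("index") == index), None)

    def _mismatch(self, conn: str, index: int, incoming: dict) -> bool:
        """True only when both signatures are complete and differ."""
        new_sig = _signature(incoming)
        old_sig = _signature(self._known(conn, index))
        return new_sig is not None and old_sig is not None and new_sig != old_sig

    def _flush(self, conn: str) -> None:
        self._events.pop(conn, None)
        for k in [k for k in self._waveforms if k[0] == conn]:
            del self._waveforms[k]

    def set_events(self, conn: str, count: int, events: List[dict]) -> None:
        if any(self._mismatch(conn, e.get("index"), e) for e in events):
            self._flush(conn)
        self._events[conn] = (count, list(events))

    def set_waveform(self, conn: str, index: int, ev: dict) -> None:
        if self._mismatch(conn, index, ev):
            self._flush(conn)
        self._waveforms[(conn, index)] = ev
```

Flushing the whole connection rather than the single colliding index is the conservative choice: after an erase, every cached index belongs to the previous recording era, so keeping any of them would mix eras.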
@@ -0,0 +1,401 @@
"""
test_event_file_io.py: sidecar write/read/patch round-trips,
WaveformStore sidecar integration, and the BW-import path.
Run:
python tests/test_event_file_io.py
"""
from __future__ import annotations
import json
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus import event_file_io
from minimateplus.framing import S3Frame
from minimateplus.models import Event, Timestamp
# ── Fixtures shared with test_waveform_store.py ───────────────────────────────
def _make_synthetic_event() -> tuple[Event, list[S3Frame]]:
"""Same shape as tests/test_waveform_store.py — minimum viable Event +
A5 stream that makes write_blastware_file emit a parseable file.
STRT is exactly 21 bytes; rectime_seconds lands at byte 18 to match
`_decode_a5_waveform`'s expected layout (which is also what
`read_blastware_file()` reads back)."""
key4 = bytes.fromhex("01110000")
rectime = 3
strt = bytearray(21)
strt[0:4] = b"STRT"
strt[4:6] = b"\xff\xfe"
strt[6:10] = key4 # end_key (per data[23:27] in CLAUDE.md)
strt[10:14] = key4 # start_key (per data[27:31])
strt[18] = rectime
strt = bytes(strt)
probe_data = bytes(7) + strt + bytes(32)
probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00, data=probe_data,
checksum_valid=True, chk_byte=0x00)
sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
data=bytes(7) + bytes(0x0200), checksum_valid=True,
chk_byte=0x00)
# Build a valid 26-byte footer (0e 08 + ts1 + ts2 + 6 const + 2 crc)
# and embed it at the END of the terminator's contribution so
# write_blastware_file finds the real `0e 08` marker rather than
# falling back to slicing the last 26 bytes of zero garbage.
# ts byte order: [day][month][year_HI][year_LO][0x00][hour][min][sec]
footer = (
b"\x0e\x08"
+ bytes([6, 5, 0x07, 0xea, 0, 12, 34, 56]) # ts1 = 2026-05-06 12:34:56
+ bytes([6, 5, 0x07, 0xea, 0, 12, 35, 6]) # ts2 = ts1 + ~10s
+ b"\x00\x01\x00\x02\x00\x00"
+ b"\x00\x00"
)
assert len(footer) == 26
term_data = bytes(11) + bytes(38) + footer # 11 prefix + 38 pad + 26 footer = 75
term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
data=term_data, checksum_valid=True, chk_byte=0x00)
ev = Event(index=0)
ev._waveform_key = key4
ev.timestamp = Timestamp(
raw=b"", flag=0x10, year=2026, unknown_byte=0,
month=5, day=6, hour=12, minute=34, second=56,
)
ev.rectime_seconds = rectime
ev.record_type = "Waveform"
ev._a5_frames = [probe, sample, term]
return ev, [probe, sample, term]
# ── Sidecar write/read round-trip ─────────────────────────────────────────────
def test_event_to_sidecar_dict_shape():
ev, _ = _make_synthetic_event()
d = event_file_io.event_to_sidecar_dict(
ev,
serial="BE11529",
blastware_filename="M529LKIQ.7M0W",
blastware_filesize=1024,
blastware_sha256="abcd" * 16,
source_kind="sfm-live",
a5_pickle_filename="M529LKIQ.7M0W.a5.pkl",
)
assert d["schema_version"] == event_file_io.SCHEMA_VERSION
assert d["kind"] == event_file_io.SIDECAR_KIND
assert d["event"]["serial"] == "BE11529"
assert d["event"]["timestamp"] == "2026-05-06T12:34:56"
assert d["event"]["waveform_key"] == "01110000"
assert d["blastware"]["sha256"] == "abcd" * 16
assert d["source"]["kind"] == "sfm-live"
assert d["review"] == {
"false_trigger": False, "reviewer": None,
"reviewed_at": None, "notes": "",
}
assert d["extensions"] == {}
def test_sidecar_write_and_read_round_trip(tmp_path: Path):
ev, _ = _make_synthetic_event()
path = tmp_path / "M529LKIQ.7M0W.sfm.json"
src = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
blastware_sha256="x" * 64, source_kind="sfm-ach",
)
event_file_io.write_sidecar(path, src)
loaded = event_file_io.read_sidecar(path)
assert loaded["event"] == src["event"]
assert loaded["blastware"] == src["blastware"]
assert loaded["source"]["kind"] == "sfm-ach"
def test_sidecar_persists_raw_0c_record_in_extensions(tmp_path: Path):
"""An Event with _raw_record populated should land its 210 bytes
base64-encoded in extensions.raw_records.waveform_record_b64, so
later analysis (e.g. mapping Peak Acceleration / Time of Peak / ZC
Freq byte offsets) can run offline against the saved sidecar."""
import base64
ev, _ = _make_synthetic_event()
# Synthesize a 210-byte 0C record with embedded label needles so
# the dump tool's anchor scan has something to find.
raw = bytearray(210)
raw[10:14] = b"Tran"
raw[60:64] = b"Vert"
raw[110:114] = b"Long"
raw[160:164] = b"MicL"
ev._raw_record = bytes(raw)
d = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
blastware_sha256="x" * 64, source_kind="sfm-live",
)
rr = d["extensions"]["raw_records"]
assert rr["waveform_record_len"] == 210
decoded = base64.b64decode(rr["waveform_record_b64"])
assert decoded == ev._raw_record
# Round-trip through write/read
path = tmp_path / "raw0c.sfm.json"
event_file_io.write_sidecar(path, d)
loaded = event_file_io.read_sidecar(path)
assert (
base64.b64decode(loaded["extensions"]["raw_records"]["waveform_record_b64"])
== ev._raw_record
)
def test_sidecar_omits_raw_records_when_event_has_no_0c(tmp_path: Path):
    """Events without a _raw_record (e.g. constructed by importers that
    never see 0C) should NOT add an empty raw_records block; this keeps the
    sidecar clean for those flows."""
    ev, _ = _make_synthetic_event()
    assert ev._raw_record is None
    d = event_file_io.event_to_sidecar_dict(
        ev, serial="BE11529",
        blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
        blastware_sha256="x" * 64, source_kind="bw-import",
    )
    assert d["extensions"] == {}
def test_sidecar_rejects_unsupported_schema_version(tmp_path: Path):
path = tmp_path / "future.sfm.json"
path.write_text(json.dumps({
"schema_version": event_file_io.SCHEMA_VERSION + 1,
"kind": event_file_io.SIDECAR_KIND,
}))
try:
event_file_io.read_sidecar(path)
except ValueError as exc:
assert "schema_version" in str(exc)
return
raise AssertionError("read_sidecar should have rejected unsupported version")
def test_sidecar_extensions_survive_round_trip(tmp_path: Path):
"""Forward-compat: unknown keys inside `extensions` survive a r/w cycle."""
ev, _ = _make_synthetic_event()
path = tmp_path / "x.sfm.json"
d = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="X", blastware_filesize=0, blastware_sha256="",
source_kind="sfm-live",
extensions={"vendor.acme.gps": {"lat": 40.7, "lon": -74.0}},
)
event_file_io.write_sidecar(path, d)
back = event_file_io.read_sidecar(path)
assert back["extensions"]["vendor.acme.gps"]["lat"] == 40.7
def test_sidecar_patch_review_stamps_reviewed_at(tmp_path: Path):
ev, _ = _make_synthetic_event()
path = tmp_path / "patch.sfm.json"
event_file_io.write_sidecar(
path,
event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="X", blastware_filesize=0, blastware_sha256="",
source_kind="sfm-live",
),
)
new = event_file_io.patch_sidecar(
path,
review={"false_trigger": True, "notes": "truck thump", "reviewer": "brian"},
)
assert new["review"]["false_trigger"] is True
assert new["review"]["notes"] == "truck thump"
assert new["review"]["reviewer"] == "brian"
assert new["review"]["reviewed_at"], "reviewed_at must be auto-stamped"
on_disk = event_file_io.read_sidecar(path)
assert on_disk["review"]["false_trigger"] is True
# ── WaveformStore integration ─────────────────────────────────────────────────
def test_waveform_store_save_writes_sidecar(tmp_path: Path):
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec = store.save(ev, serial="BE11529", a5_frames=frames, source_kind="sfm-live")
assert rec["sidecar_filename"].endswith(".sfm.json")
assert rec["sha256"] and len(rec["sha256"]) == 64
sc = store.load_sidecar("BE11529", rec["filename"])
assert sc is not None
assert sc["blastware"]["filename"] == rec["filename"]
assert sc["blastware"]["sha256"] == rec["sha256"]
assert sc["source"]["kind"] == "sfm-live"
# The .a5.pkl reference should match the actual filename on disk.
assert sc["source"]["a5_pickle_filename"] == rec["a5_pickle_filename"]
def test_waveform_store_save_preserves_review_across_resave(tmp_path: Path):
"""Re-saving the same event must preserve a user's prior review edits."""
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec = store.save(ev, serial="BE11529", a5_frames=frames)
# User flips false_trigger and adds a note.
store.patch_sidecar(
"BE11529", rec["filename"],
review={"false_trigger": True, "notes": "hello"},
)
# A second save (e.g. Force refresh re-download) must keep those edits.
store.save(ev, serial="BE11529", a5_frames=frames)
sc = store.load_sidecar("BE11529", rec["filename"])
assert sc["review"]["false_trigger"] is True
assert sc["review"]["notes"] == "hello"
def test_waveform_store_patch_sidecar_returns_none_when_missing(tmp_path: Path):
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
out = store.patch_sidecar("BE99999", "no.such.W", review={"notes": "x"})
assert out is None
# ── DB integration: sidecar_filename column + update_event_review ─────────────
def test_seismodb_persists_sidecar_filename_and_review_sync(tmp_path: Path):
from sfm.database import SeismoDb
db = SeismoDb(tmp_path / "seismo_relay.db")
ev, _ = _make_synthetic_event()
rec = {
"filename": "M529LKIQ.7M0W",
"filesize": 8708,
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
"sidecar_filename": "M529LKIQ.7M0W.sfm.json",
}
inserted, _ = db.insert_events(
[ev], serial="BE11529",
waveform_records={ev._waveform_key.hex(): rec},
)
assert inserted == 1
rows = db.query_events(serial="BE11529")
row = rows[0]
assert row["sidecar_filename"] == rec["sidecar_filename"]
# update_event_review keeps false_trigger column in sync with sidecar.
assert db.update_event_review(row["id"], {"false_trigger": True}) is True
again = db.get_event(row["id"])
assert again["false_trigger"] == 1
# Review block with no false_trigger key → no-op, but the row exists.
assert db.update_event_review(row["id"], {"notes": "x"}) is True
# ── BW-file reader (read_blastware_file) ─────────────────────────────────────
def test_read_blastware_file_round_trip(tmp_path: Path):
"""write → read → key/timestamp/rectime survive."""
from minimateplus.blastware_file import write_blastware_file, blastware_filename
ev, frames = _make_synthetic_event()
bw_path = tmp_path / blastware_filename(ev, "BE11529")
write_blastware_file(ev, frames, bw_path)
parsed = event_file_io.read_blastware_file(bw_path)
assert parsed._waveform_key == ev._waveform_key
assert parsed.rectime_seconds == ev.rectime_seconds
# Timestamp lands via the footer; year/month/day/hour/min/sec all survive.
assert parsed.timestamp is not None
assert parsed.timestamp.year == ev.timestamp.year
assert parsed.timestamp.month == ev.timestamp.month
assert parsed.timestamp.day == ev.timestamp.day
assert parsed.timestamp.hour == ev.timestamp.hour
assert parsed.timestamp.minute == ev.timestamp.minute
assert parsed.timestamp.second == ev.timestamp.second
# No A5 source recoverable.
assert parsed._a5_frames is None
# Peaks computed from samples (synthetic = zero samples → zero peaks).
assert parsed.peak_values is not None
assert parsed.peak_values.peak_vector_sum == 0.0
def test_save_imported_bw_round_trip(tmp_path: Path):
"""save_imported_bw stores a copy + sidecar with source.kind = bw-import."""
from minimateplus.blastware_file import write_blastware_file, blastware_filename
from sfm.waveform_store import WaveformStore
# Produce a BW file outside the store.
ev, frames = _make_synthetic_event()
fname = blastware_filename(ev, "BE11529")
src = tmp_path / fname
write_blastware_file(ev, frames, src)
store = WaveformStore(tmp_path / "waveforms")
parsed_ev, rec = store.save_imported_bw(src.read_bytes(), source_path=src)
assert rec["filename"] == fname
assert rec["a5_pickle_filename"] is None # no A5 source for BW imports
sc = store.load_sidecar("BE11529", fname)
assert sc is not None
assert sc["source"]["kind"] == "bw-import"
assert sc["source"]["a5_pickle_filename"] is None
# The stored binary should match the source byte-for-byte (we just copied).
stored_path = store.open_blastware("BE11529", fname)
assert stored_path is not None
assert stored_path.read_bytes() == src.read_bytes()
if __name__ == "__main__":
    if pytest is not None:
        pytest.main([__file__, "-v"])
    else:
        import inspect
        import traceback as _tb
        passed = failed = 0
        for _name, _fn in sorted(globals().items()):
            if not _name.startswith("test_") or not callable(_fn):
                continue
            try:
                _sig = inspect.signature(_fn)
                if "tmp_path" in _sig.parameters:
                    with tempfile.TemporaryDirectory() as _td:
                        _fn(Path(_td))
                else:
                    _fn()
                print(f"PASS {_name}")
                passed += 1
            except Exception:
                print(f"FAIL {_name}")
                _tb.print_exc()
                failed += 1
        print(f"\n{passed} passed, {failed} failed")
        sys.exit(0 if failed == 0 else 1)
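The synthetic footer in `_make_synthetic_event` relies on the timestamp byte order documented in its comment: `[day][month][year_HI][year_LO][0x00][hour][min][sec]`. A minimal sketch of that 8-byte layout is shown below. The function names `encode_footer_ts` and `decode_footer_ts` are hypothetical, and the fifth byte is treated as padding only because the fixture sets it to zero.

```python
from datetime import datetime


def encode_footer_ts(dt: datetime) -> bytes:
    """Pack a datetime into the 8-byte footer layout described above."""
    return bytes([
        dt.day,
        dt.month,
        (dt.year >> 8) & 0xFF,  # year, high byte
        dt.year & 0xFF,         # year, low byte
        0x00,                   # assumed padding byte
        dt.hour,
        dt.minute,
        dt.second,
    ])


def decode_footer_ts(raw: bytes) -> datetime:
    """Unpack the 8-byte layout back into a datetime."""
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    day, month, year_hi, year_lo, _pad, hour, minute, second = raw
    return datetime((year_hi << 8) | year_lo, month, day, hour, minute, second)


ts1 = encode_footer_ts(datetime(2026, 5, 6, 12, 34, 56))
print(ts1.hex())
```

The year 2026 is 0x07EA, which is why the fixture writes `0x07, 0xea` in the third and fourth positions.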
@@ -0,0 +1,296 @@
"""
test_event_hdf5.py: HDF5 codec round-trip + plot.v1 JSON shape sanity.
Run:
python tests/test_event_hdf5.py
"""
from __future__ import annotations
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import S3Frame
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
from sfm import event_hdf5
# ── Fixtures ──────────────────────────────────────────────────────────────────
def _make_event_with_samples(n: int = 256) -> Event:
    """An Event with synthetic int16 ADC samples on all four channels.

    Channel content:
      - Tran: ramp from -16384 to +16383 (peak 5 in/s for Normal range)
      - Vert: full-scale dirac at index n//2 (peak = 10 in/s)
      - Long: zeros
      - MicL: small ramp

    Peak values are set on the event the way the device's 0C record
    would supply them; they are used by the HDF5 writer for the mic
    per-count factor.
    """
    tran = [int((i / max(n - 1, 1)) * 32767 - 16384) for i in range(n)]
    vert = [0] * n
    if n:
        vert[n // 2] = 32767
    long_ = [0] * n
    mic = [int((i / max(n - 1, 1)) * 5000) for i in range(n)]
    ev = Event(index=0)
    ev._waveform_key = bytes.fromhex("01110000")
    ev.timestamp = Timestamp(
        raw=b"", flag=0x10,
        year=2026, unknown_byte=0, month=5, day=7,
        hour=10, minute=0, second=0,
    )
    ev.record_type = "Waveform"
    ev.sample_rate = 1024
    ev.pretrig_samples = n // 4
    ev.total_samples = n
    ev.rectime_seconds = n / 1024.0
    ev.raw_samples = {"Tran": tran, "Vert": vert, "Long": long_, "MicL": mic}
    ev.peak_values = PeakValues(
        tran=5.0, vert=10.0, long=0.0,
        peak_vector_sum=10.0, micl=0.001,
    )
    ev.project_info = ProjectInfo(
        project="TestProj", client="TestClient",
        operator="brian", sensor_location="loc-A",
    )
    return ev
# ── HDF5 round-trip ───────────────────────────────────────────────────────────
def test_hdf5_round_trip_preserves_metadata(tmp_path: Path):
ev = _make_event_with_samples()
h5 = tmp_path / "test.h5"
event_hdf5.write_event_hdf5(
h5, ev, serial="BE11529", geo_range="normal",
)
data = event_hdf5.read_event_hdf5(h5)
a = data["attrs"]
assert a["schema_version"] == event_hdf5.SCHEMA_VERSION
assert a["kind"] == event_hdf5.HDF5_KIND
assert a["serial"] == "BE11529"
assert a["waveform_key"] == "01110000"
assert a["sample_rate"] == 1024
assert a["pretrig_samples"] == 64
assert a["geo_range"] == "normal"
assert a["geo_full_scale_ips"] == 10.0
assert a["project"] == "TestProj"
assert a["client"] == "TestClient"
assert a["operator"] == "brian"
# Float attrs may round-trip with tiny precision noise.
assert abs(a["peak_tran_ips"] - 5.0) < 1e-6
assert abs(a["peak_vert_ips"] - 10.0) < 1e-6
def test_hdf5_samples_in_physical_units_normal_range(tmp_path: Path):
"""Vert hits ADC full-scale (32767) → with Normal range FS=10 in/s,
the HDF5 sample value should be 10 * 32767/32768 in/s."""
ev = _make_event_with_samples()
h5 = tmp_path / "n.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="normal")
data = event_hdf5.read_event_hdf5(h5)
vert = data["samples"]["Vert"]
assert vert.dtype.name == "float32"
assert max(abs(v) for v in vert) > 9.99 # full-scale ≈ 10.0
# The dirac was at n//2 → 32767 ADC counts.
expected_peak = 10.0 * 32767 / 32768
assert abs(max(vert) - expected_peak) < 1e-3
def test_hdf5_samples_in_physical_units_sensitive_range(tmp_path: Path):
"""Same fixture but Sensitive range → full-scale 1.250 in/s."""
ev = _make_event_with_samples()
h5 = tmp_path / "s.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="sensitive")
data = event_hdf5.read_event_hdf5(h5)
vert = data["samples"]["Vert"]
expected_peak = 1.250 * 32767 / 32768
assert abs(max(vert) - expected_peak) < 1e-4
def test_hdf5_includes_int16_samples(tmp_path: Path):
ev = _make_event_with_samples()
h5 = tmp_path / "i.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529")
data = event_hdf5.read_event_hdf5(h5)
assert data["samples_int16"] is not None
assert "Tran" in data["samples_int16"]
assert data["samples_int16"]["Vert"].dtype.name == "int16"
def test_hdf5_rejects_unsupported_schema(tmp_path: Path):
"""Round-tripping with a tampered schema_version raises ValueError."""
import h5py
h5 = tmp_path / "future.h5"
with h5py.File(h5, "w") as f:
f.attrs["schema_version"] = 99
f.attrs["kind"] = event_hdf5.HDF5_KIND
try:
event_hdf5.read_event_hdf5(h5)
except ValueError as exc:
assert "schema_version" in str(exc)
return
raise AssertionError("read_event_hdf5 should reject unsupported schema_version")
# ── plot.v1 JSON shape ────────────────────────────────────────────────────────
def test_event_to_plot_json_shape():
ev = _make_event_with_samples()
j = event_hdf5.event_to_plot_json(ev, serial="BE11529", geo_range="normal")
assert j["schema"] == "sfm.plot.v1"
assert j["serial"] == "BE11529"
assert j["geo_range"] == "normal"
assert j["geo_full_scale_ips"] == 10.0
assert j["trigger_ms"] == 0.0
t = j["time_axis"]
assert t["sample_rate"] == 1024
assert t["pretrig_samples"] == 64
assert t["n_samples"] == 256
# t0_ms = -pretrig * dt_ms = -64 * (1000/1024) = -62.5
assert abs(t["t0_ms"] - (-64 * 1000 / 1024)) < 1e-3
assert abs(t["dt_ms"] - (1000 / 1024)) < 1e-6
chans = j["channels"]
for name in ("Tran", "Vert", "Long", "MicL"):
assert name in chans, f"missing channel: {name}"
assert chans[name]["unit"] in ("in/s", "psi")
assert "values" in chans[name]
assert "peak" in chans[name]
assert "peak_t_ms" in chans[name]
# Values are in physical units: Vert peak ≈ 10 in/s.
assert max(chans["Vert"]["values"]) > 9.99
def test_event_to_plot_json_peak_t_ms_locates_dirac():
"""The Vert channel's full-scale dirac at sample n//2 should produce
peak_t_ms = (n//2 - pretrig) * dt_ms."""
ev = _make_event_with_samples(n=256)
j = event_hdf5.event_to_plot_json(ev, serial="BE11529")
expected = (128 - 64) * (1000 / 1024) # = 62.5 ms
assert abs(j["channels"]["Vert"]["peak_t_ms"] - expected) < 1e-2
def test_plot_json_from_hdf5_round_trip(tmp_path: Path):
"""plot_json_from_hdf5 produces the same shape as event_to_plot_json."""
ev = _make_event_with_samples()
h5 = tmp_path / "rt.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="normal")
j_disk = event_hdf5.plot_json_from_hdf5(h5, event_id="abc-123")
j_mem = event_hdf5.event_to_plot_json(ev, serial="BE11529", geo_range="normal", event_id="abc-123")
# Top-level shape parity
for k in ("schema", "serial", "geo_range", "geo_full_scale_ips",
"trigger_ms", "record_type", "waveform_key", "event_id"):
assert j_disk.get(k) == j_mem.get(k), f"mismatch on {k}"
assert j_disk["time_axis"]["sample_rate"] == j_mem["time_axis"]["sample_rate"]
assert j_disk["time_axis"]["n_samples"] == j_mem["time_axis"]["n_samples"]
# Sample values must match within float32 precision.
for ch in ("Tran", "Vert", "Long", "MicL"):
a = j_disk["channels"][ch]["values"]
b = j_mem["channels"][ch]["values"]
assert len(a) == len(b)
if a:
mx = max(abs(x - y) for x, y in zip(a, b))
assert mx < 1e-3, f"{ch}: max diff {mx}"
# ── WaveformStore integration with HDF5 ───────────────────────────────────────
def _make_synthetic_event_for_save() -> tuple[Event, list[S3Frame]]:
    """Same flavour as test_event_file_io.py but ensures _make_event_with_samples
    is also wired into the BW write path so we can exercise WaveformStore.save."""
    ev = _make_event_with_samples(n=128)
    # Build a minimum 3-frame A5 stream (probe + sample + term) — same
    # shape used in the other test files. The encoder only really needs
    # the STRT in the probe + a non-zero body and a footer in the term.
    key4 = ev._waveform_key
    rectime = int(ev.rectime_seconds or 0) or 1
    strt = bytearray(21)
    strt[0:4] = b"STRT"
    strt[4:6] = b"\xff\xfe"
    strt[6:10] = key4
    strt[10:14] = key4
    strt[18] = rectime
    probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00,
                    data=bytes(7) + bytes(strt) + bytes(32),
                    checksum_valid=True, chk_byte=0x00)
    sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
                     data=bytes(7) + bytes(0x0200), checksum_valid=True, chk_byte=0x00)
    footer = (
        b"\x0e\x08"
        + bytes([7, 5, 0x07, 0xea, 0, 10, 0, 0])
        + bytes([7, 5, 0x07, 0xea, 0, 10, 0, 1])
        + b"\x00\x01\x00\x02\x00\x00\x00\x00"
    )
    term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
                   data=bytes(11) + bytes(38) + footer, checksum_valid=True, chk_byte=0x00)
    ev._a5_frames = [probe, sample, term]
    return ev, [probe, sample, term]


def test_waveform_store_save_emits_hdf5(tmp_path: Path):
    from sfm.waveform_store import WaveformStore

    store = WaveformStore(tmp_path / "waveforms")
    ev, frames = _make_synthetic_event_for_save()
    rec = store.save(ev, serial="BE11529", a5_frames=frames, geo_range="normal")
    assert rec["hdf5_filename"], "hdf5_filename should be present in save() record"
    h5 = store.hdf5_path_for("BE11529", rec["filename"])
    assert h5.exists(), "WaveformStore.save should produce a .h5 file"

    # The HDF5 round-trip should match the event's metadata.
    data = event_hdf5.read_event_hdf5(h5)
    assert data["attrs"]["serial"] == "BE11529"
    assert data["attrs"]["geo_range"] == "normal"


if __name__ == "__main__":
    if pytest is not None:
        pytest.main([__file__, "-v"])
    else:
        import inspect
        import traceback as _tb

        passed = failed = 0
        for _name, _fn in sorted(globals().items()):
            if not _name.startswith("test_") or not callable(_fn):
                continue
            try:
                _sig = inspect.signature(_fn)
                if "tmp_path" in _sig.parameters:
                    with tempfile.TemporaryDirectory() as _td:
                        _fn(Path(_td))
                else:
                    _fn()
                print(f"PASS {_name}")
                passed += 1
            except Exception:
                print(f"FAIL {_name}")
                _tb.print_exc()
                failed += 1
        print(f"\n{passed} passed, {failed} failed")
        sys.exit(0 if failed == 0 else 1)
@@ -0,0 +1,302 @@
"""
test_waveform_store.py: unit tests for sfm/waveform_store.py and the
SeismoDb columns + insert_events upsert path that the store depends on.

These tests exercise the *store + DB plumbing* in isolation; they do not
re-test write_blastware_file (covered separately) and do not require a live
device or a wire capture.

Run:
    python -m pytest tests/test_waveform_store.py -v
"""
from __future__ import annotations
import os
import sys
import datetime
from pathlib import Path
try:
    import pytest
except ImportError:  # allow running standalone without pytest installed
    pytest = None  # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import S3Frame
from minimateplus.models import Event, Timestamp
# ── Test fixtures ──────────────────────────────────────────────────────────────
def _make_synthetic_event() -> tuple[Event, list[S3Frame]]:
    """
    Build a minimal Event + a 3-frame A5 stream that satisfies
    write_blastware_file's STRT-extraction path.

    Frame 0 (probe): contains a STRT record at the canonical position so
        write_blastware_file finds it without falling back.
    Frame 1 (sample): 0x0200 bytes of zeros at page_key=0x0010 (sample marker).
    Frame 2 (TERM): page_key=0x0000 marks the terminator.
    """
    key4 = bytes.fromhex("01110000")
    rectime = 3
    strt = b"STRT" + b"\xff\xfe" + key4 + key4 + bytes(7) + bytes([rectime])
    # Probe payload prefix: 7 zero bytes then STRT (matches blastware_file._strip
    # logic which looks for STRT in data[7:]). Tail with 32 zero bytes of fake
    # body so reconstruction has something to slice.
    probe_data = bytes(7) + strt + bytes(32)
    probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00, data=probe_data,
                    checksum_valid=True, chk_byte=0x00)
    sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
                     data=bytes(7) + bytes(0x0200), checksum_valid=True,
                     chk_byte=0x00)
    term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
                   data=bytes(7) + bytes(64), checksum_valid=True,
                   chk_byte=0x00)

    ev = Event(index=0)
    ev._waveform_key = key4
    ev.timestamp = Timestamp(
        raw=b"",
        flag=0x10,
        year=2026,
        unknown_byte=0,
        month=5,
        day=6,
        hour=12,
        minute=34,
        second=56,
    )
    ev.rectime_seconds = rectime
    ev.record_type = "Waveform"
    ev._a5_frames = [probe, sample, term]
    return ev, [probe, sample, term]
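
The probe payload built by this fixture can be read back with a few slices. The sketch below is illustrative only: `build_strt` and `parse_strt` are hypothetical helpers, not part of `minimateplus` or `sfm`, and the 22-byte layout (magic, `ff fe` marker, key twice, 7 padding bytes, one record-time byte) is taken from this fixture alone, not from a protocol specification.

```python
STRT_MAGIC = b"STRT"
STRT_LEN = 22  # length of the record as built by the fixture above


def build_strt(key4: bytes, rectime: int) -> bytes:
    """Build a STRT record with the same byte layout as the test fixture."""
    if len(key4) != 4:
        raise ValueError("key4 must be exactly 4 bytes")
    return STRT_MAGIC + b"\xff\xfe" + key4 + key4 + bytes(7) + bytes([rectime])


def parse_strt(data: bytes, search_from: int = 7) -> dict:
    """Find STRT at or after `search_from` and split out the fixture's fields."""
    pos = data.find(STRT_MAGIC, search_from)
    if pos < 0:
        raise ValueError("no STRT record found")
    rec = data[pos:pos + STRT_LEN]
    if len(rec) < STRT_LEN:
        raise ValueError("truncated STRT record")
    return {
        "offset": pos,
        "marker": rec[4:6],
        "key": rec[6:10],
        "key_repeat": rec[10:14],
        "rectime": rec[21],  # last byte of the record in this fixture
    }


key4 = bytes.fromhex("01110000")
probe_data = bytes(7) + build_strt(key4, 3) + bytes(32)
fields = parse_strt(probe_data)
```

Searching from offset 7 mirrors the comment in the fixture about looking for STRT in `data[7:]`; the meaning of the padding bytes is unknown here and they are left untouched.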
# ── Frame round-trip ───────────────────────────────────────────────────────────
def test_frame_dict_round_trip():
    """_frame_to_dict and _dict_to_frame must round-trip every field."""
    from sfm.waveform_store import _dict_to_frame, _frame_to_dict

    f = S3Frame(
        sub=0xA5, page_hi=0x12, page_lo=0x34,
        data=b"\x10\x02\x00\xab\xcd",
        checksum_valid=False,
        chk_byte=0x42,
    )
    d = _frame_to_dict(f)
    g = _dict_to_frame(d)
    assert g.sub == f.sub
    assert g.page_hi == f.page_hi
    assert g.page_lo == f.page_lo
    assert g.data == f.data
    assert g.checksum_valid == f.checksum_valid
    assert g.chk_byte == f.chk_byte
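
For readers without the source of `sfm.waveform_store` at hand, the round-trip property checked above can be sketched with a stand-in dataclass. Everything here is an assumption for illustration: `DemoFrame`, `demo_frame_to_dict` and `demo_dict_to_frame` are hypothetical names, and hex-encoding the payload is one possible choice, not necessarily what `_frame_to_dict` does.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DemoFrame:
    """Stand-in with the same six fields the test compares."""
    sub: int
    page_hi: int
    page_lo: int
    data: bytes
    checksum_valid: bool
    chk_byte: int


def demo_frame_to_dict(frame: DemoFrame) -> dict:
    """Convert to plain types; bytes become a hex string."""
    d = asdict(frame)
    d["data"] = frame.data.hex()
    return d


def demo_dict_to_frame(d: dict) -> DemoFrame:
    """Inverse of demo_frame_to_dict; does not mutate the input dict."""
    fields = dict(d)
    fields["data"] = bytes.fromhex(fields["data"])
    return DemoFrame(**fields)


f = DemoFrame(0xA5, 0x12, 0x34, b"\x10\x02\x00\xab\xcd", False, 0x42)
d = demo_frame_to_dict(f)
g = demo_dict_to_frame(d)
```

Comparing field by field, as the test does, also catches a converter that silently drops `checksum_valid` or `chk_byte`.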
# ── Store save/load round-trip ─────────────────────────────────────────────────
def test_waveform_store_save_load_round_trip(tmp_path: Path):
    """save() writes both files; load_a5() returns equivalent frames."""
    from sfm.waveform_store import WaveformStore

    store = WaveformStore(tmp_path / "waveforms")
    ev, frames = _make_synthetic_event()
    rec = store.save(ev, serial="BE11529", a5_frames=frames)
    assert rec["filename"].startswith("M529")
    assert rec["filesize"] > 0
    assert rec["a5_pickle_filename"] == rec["filename"] + ".a5.pkl"

    bw_path = store.open_blastware("BE11529", rec["filename"])
    assert bw_path is not None
    assert bw_path.exists()
    assert bw_path.stat().st_size == rec["filesize"]

    # Sidecar exists and round-trips
    loaded = store.load_a5("BE11529", rec["filename"])
    assert loaded is not None
    assert len(loaded) == len(frames)
    for orig, got in zip(frames, loaded):
        assert got.sub == orig.sub
        assert got.page_hi == orig.page_hi
        assert got.page_lo == orig.page_lo
        assert got.data == orig.data


def test_waveform_store_missing_returns_none(tmp_path: Path):
    """open_blastware / load_a5 return None for nonexistent entries."""
    from sfm.waveform_store import WaveformStore

    store = WaveformStore(tmp_path / "waveforms")
    assert store.open_blastware("BE99999", "no_such.7M0W") is None
    assert store.load_a5("BE99999", "no_such.7M0W") is None


def test_waveform_store_idempotent_save(tmp_path: Path):
    """Saving the same event twice produces the same event-file bytes."""
    from sfm.waveform_store import WaveformStore

    store = WaveformStore(tmp_path / "waveforms")
    ev, frames = _make_synthetic_event()
    rec1 = store.save(ev, serial="BE11529", a5_frames=frames)
    bw_path = store.open_blastware("BE11529", rec1["filename"])
    bytes1 = bw_path.read_bytes()
    rec2 = store.save(ev, serial="BE11529", a5_frames=frames)
    bytes2 = bw_path.read_bytes()
    assert rec1["filename"] == rec2["filename"]
    assert bytes1 == bytes2
# ── DB integration ────────────────────────────────────────────────────────────
def test_seismodb_persists_waveform_columns(tmp_path: Path):
    """insert_events writes the new columns when waveform_records is supplied."""
    from sfm.database import SeismoDb

    db = SeismoDb(tmp_path / "seismo_relay.db")
    ev, _ = _make_synthetic_event()
    rec = {
        "filename": "M529LKIQ.7M0W",
        "filesize": 8708,
        "a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
    }
    inserted, skipped = db.insert_events(
        [ev],
        serial="BE11529",
        waveform_records={ev._waveform_key.hex(): rec},
    )
    assert inserted == 1
    assert skipped == 0

    rows = db.query_events(serial="BE11529")
    assert len(rows) == 1
    row = rows[0]
    assert row["blastware_filename"] == rec["filename"]
    assert row["blastware_filesize"] == rec["filesize"]
    assert row["a5_pickle_filename"] == rec["a5_pickle_filename"]

    # get_event by id returns the same fields
    row2 = db.get_event(row["id"])
    assert row2 is not None
    assert row2["blastware_filename"] == rec["filename"]


def test_seismodb_dedup_upserts_waveform_fields(tmp_path: Path):
    """Re-inserting the same (serial, timestamp) refreshes waveform fields."""
    from sfm.database import SeismoDb

    db = SeismoDb(tmp_path / "seismo_relay.db")
    ev, _ = _make_synthetic_event()
    db.insert_events([ev], serial="BE11529")  # no waveform record yet
    rows = db.query_events(serial="BE11529")
    assert rows[0]["blastware_filename"] is None

    rec = {
        "filename": "M529LKIQ.7M0W",
        "filesize": 4242,
        "a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
    }
    inserted, skipped = db.insert_events(
        [ev],
        serial="BE11529",
        waveform_records={ev._waveform_key.hex(): rec},
    )
    assert inserted == 0  # dedup'd
    assert skipped == 1

    rows = db.query_events(serial="BE11529")
    assert rows[0]["blastware_filename"] == rec["filename"]
    assert rows[0]["blastware_filesize"] == 4242
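
The behaviour this test pins down (a duplicate `(serial, timestamp)` counts as skipped, yet its waveform fields are still refreshed) can be reproduced against a bare SQLite table. This is a minimal sketch under stated assumptions: `insert_or_refresh` is a hypothetical helper and the three-column table is a cut-down stand-in, so it says nothing about how `SeismoDb.insert_events` is actually written.

```python
import sqlite3


def insert_or_refresh(conn, serial, timestamp, filename=None, filesize=None):
    """Insert a new row, or refresh file fields on an existing one.

    Returns "inserted" for a new (serial, timestamp) pair and "skipped"
    for a duplicate, mirroring the counters asserted in the test.
    """
    exists = conn.execute(
        "SELECT 1 FROM events WHERE serial = ? AND timestamp = ?",
        (serial, timestamp),
    ).fetchone()
    if exists is None:
        conn.execute(
            "INSERT INTO events (serial, timestamp, blastware_filename,"
            " blastware_filesize) VALUES (?, ?, ?, ?)",
            (serial, timestamp, filename, filesize),
        )
        return "inserted"
    if filename is not None:
        # Duplicate key: keep the row, refresh only the waveform fields.
        conn.execute(
            "UPDATE events SET blastware_filename = ?, blastware_filesize = ?"
            " WHERE serial = ? AND timestamp = ?",
            (filename, filesize, serial, timestamp),
        )
    return "skipped"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (serial TEXT NOT NULL, timestamp TEXT,"
    " blastware_filename TEXT, blastware_filesize INTEGER,"
    " UNIQUE(serial, timestamp))"
)
first = insert_or_refresh(conn, "BE11529", "2026-05-06T12:34:56")
second = insert_or_refresh(conn, "BE11529", "2026-05-06T12:34:56",
                           "M529LKIQ.7M0W", 4242)
stored = conn.execute(
    "SELECT blastware_filename, blastware_filesize FROM events"
).fetchall()
```

An `INSERT ... ON CONFLICT DO UPDATE` statement would do the same in one round trip, but the explicit lookup makes the inserted/skipped distinction easy to report.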


def test_seismodb_migration_adds_columns(tmp_path: Path):
    """An existing DB without the new columns gets them added on init."""
    import sqlite3

    db_path = tmp_path / "old.db"
    # Build a "v0" events table without the new columns.
    with sqlite3.connect(str(db_path)) as conn:
        conn.executescript("""
            CREATE TABLE events (
                id              TEXT PRIMARY KEY,
                serial          TEXT NOT NULL,
                waveform_key    TEXT NOT NULL,
                session_id      TEXT,
                timestamp       TEXT,
                tran_ppv        REAL,
                vert_ppv        REAL,
                long_ppv        REAL,
                peak_vector_sum REAL,
                mic_ppv         REAL,
                project         TEXT,
                client          TEXT,
                operator        TEXT,
                sensor_location TEXT,
                sample_rate     INTEGER,
                record_type     TEXT,
                false_trigger   INTEGER NOT NULL DEFAULT 0,
                created_at      TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
                UNIQUE(serial, timestamp)
            );
            INSERT INTO events
                (id, serial, waveform_key, timestamp)
            VALUES
                ('legacy-id', 'BE11529', '01110000',
                 '2026-04-01T12:00:00');
        """)

    # Initialise SeismoDb against the old DB — migration should run.
    from sfm.database import SeismoDb

    db = SeismoDb(db_path)
    rows = db.query_events(serial="BE11529")
    assert len(rows) == 1
    assert rows[0]["blastware_filename"] is None
    assert "blastware_filesize" in rows[0]
    assert "a5_pickle_filename" in rows[0]
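
One common way to get the add-columns-on-init behaviour this test expects is to compare `PRAGMA table_info` against a list of wanted columns. The sketch below is an assumption about the general technique, not a copy of `SeismoDb`: `add_missing_columns` is a hypothetical helper, and the column types in `NEW_COLUMNS` are guesses based on the values the tests store.

```python
import sqlite3

# Assumed types; the real schema lives in sfm/database.py.
NEW_COLUMNS = {
    "blastware_filename": "TEXT",
    "blastware_filesize": "INTEGER",
    "a5_pickle_filename": "TEXT",
}


def add_missing_columns(conn, table, columns):
    """Add any column from `columns` that `table` lacks; return names added.

    `table` and the column names are interpolated into SQL, so they must
    come from trusted constants, never from user input.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, serial TEXT NOT NULL)")
conn.execute("INSERT INTO events (id, serial) VALUES ('legacy-id', 'BE11529')")
first_run = add_missing_columns(conn, "events", NEW_COLUMNS)
second_run = add_missing_columns(conn, "events", NEW_COLUMNS)
legacy = conn.execute("SELECT blastware_filename FROM events").fetchone()
```

Running the helper twice is safe because the second pass finds every column already present, and legacy rows read back `NULL` for the new columns, which is what the test asserts for `blastware_filename`.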


if __name__ == "__main__":
    if pytest is not None:
        pytest.main([__file__, "-v"])
    else:
        # Standalone runner — does not require pytest.
        import inspect
        import tempfile
        import traceback as _tb

        passed = failed = 0
        for _name, _fn in sorted(globals().items()):
            if not _name.startswith("test_") or not callable(_fn):
                continue
            try:
                _sig = inspect.signature(_fn)
                if "tmp_path" in _sig.parameters:
                    with tempfile.TemporaryDirectory() as _td:
                        _fn(Path(_td))
                else:
                    _fn()
                print(f"PASS {_name}")
                passed += 1
            except Exception:
                print(f"FAIL {_name}")
                _tb.print_exc()
                failed += 1
        print(f"\n{passed} passed, {failed} failed")
        sys.exit(0 if failed == 0 else 1)