80 Commits

Author SHA1 Message Date
serversdown a18712442f feat: preserve and encode raw 0C record in sidecar extensions for offline analysis 2026-05-08 21:50:01 +00:00
serversdown 8aea46b8a0 doc(fix): retracts raw int16 LE sample set assumptions. 2026-05-08 19:26:25 +00:00
serversdown 9123269b1f feat(protocol): implement v0.14.0 SUB 5A protocol rewrite with enhanced chunk handling and new helpers
test: add regression tests for v0.14.x SUB 5A protocol fixes
refactor(logging): change warning logs to debug for less verbosity in write_blastware_file
2026-05-08 19:11:55 +00:00
serversdown 9400f59167 doc: update readme to 0.15.0 2026-05-08 19:06:26 +00:00
serversdown bbed85f7e2 fix: update channel keys to include 'MicL' in device_event_waveform documentation 2026-05-08 18:48:06 +00:00
serversdown c641d5fc10 feat: v0.15.0
### Added

- **Layered event storage architecture.**  Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file).  Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression).  This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).**  The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side.  The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.**  The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor.  This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive).  Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.

### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).
2026-05-08 04:39:51 +00:00
serversdown 9afa3484f4 feat(cache): implement integrity checks for cached events and waveforms
- Added `waveform_key` and `event_timestamp` columns to `CachedEvent` and `CachedWaveform` for integrity verification.
- Implemented logic to flush the cache when a mismatch in (waveform_key, event_timestamp) is detected during event and waveform updates.
- Enhanced `set_events` and `set_waveform` methods to check for mismatches and trigger cache eviction as necessary.
- Introduced a new `LiveCache` class to manage in-memory caching of live device data, separating it from the server logic for better testability.
- Added tests to verify the correctness of cache invalidation logic, particularly for post-erase key reuse scenarios.
- Updated web application to include a "Force refresh" toggle, allowing users to bypass the cache and re-fetch data from the device.
2026-05-07 04:42:00 +00:00
serversdown 0484680c89 fix(docs/comments): rename refs to 'event files' to reflect their timestamp extenion names. 2026-05-06 19:08:38 +00:00
serversdown 3711b11bda feat: add waveform store handling 2026-05-06 19:03:38 +00:00
serversdown 52c6e7b618 Merge pull request 'v0.14.3 - Full waveform DL pipeline tested and working.' (#15) from protocol-fix into main
Reviewed-on: #15
2026-05-05 20:49:47 -04:00
serversdown 29ebc75656 doc: update readme v0.14.3 2026-05-05 20:48:58 -04:00
claude ebfe9877fa doc: update changelog to 0.14.3 2026-05-05 20:39:47 -04:00
claude c914a15e12 docs: update for v0.14.3 - Full continuous waveform download successful! 2026-05-05 20:37:52 -04:00
claude a27693242d fix(protocol): implement partial DLE stuffing for 0x10 bytes in params to prevent request corruption 2026-05-05 18:28:28 -04:00
claude eefec0bd64 fix(blastware_file): remove harmful "duplicate header+STRT" strip logic to preserve valid waveform data 2026-05-05 17:48:40 -04:00
claude 7444738883 debug(protocol): event-N probe is now at counter = start_offset instead of start_offset + 0x46 2026-05-05 16:46:35 -04:00
claude 6b76934a04 Merge branch 'main' into protocol-fix 2026-05-04 14:43:05 -04:00
claude 7b62c790a9 fix(seismo-lab): remove duplicate capture history list 2026-05-04 14:30:46 -04:00
claude b66cc9d075 fix(blastware_file): update TERM detection logic and strip duplicate header blocks for accurate file writing 2026-05-04 14:28:11 -04:00
serversdown 4ab604eff1 Merge pull request 'v0.12.6' (#10) from seismo-lab-new into main
Reviewed-on: #10
2026-05-04 13:22:54 -04:00
serversdown e15f1567ef Doc: Update docs for 0.12.6 2026-05-04 17:18:28 +00:00
serversdown bb33ad3837 doc: update to v0.12.5 2026-05-04 17:13:37 +00:00
claude 45e61fbcaf big refactor of waveform protocol. 2026-05-03 01:20:21 -04:00
claude d758825c67 fix(protocol): correct continuous-mode record header classification for accurate timestamp extraction 2026-05-01 20:28:55 -04:00
claude 0fbb39c21a Big event bugfix. see details:
## v0.13.0 — 2026-05-01

### Fixed

- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
  `read_bulk_waveform_stream` was walking the chunk counter past the actual
  end of the event, picking up post-event circular-buffer garbage that
  corrupted reconstructed Blastware files for any waveform > ~1 sec.  The
  loop now extracts the event's `end_offset` from the STRT record at
  `data[23:27]` of the probe response and stops the chunk walk when the next
  counter would step past it.  Verified against three BW MITM captures
  (4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
  bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.

### Added

- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)` —
  computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
  formula confirmed across all 3 BW captures.  Not yet wired into
  `read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
  existing `blastware_file.write_blastware_file` frame-structure expectations);
  available for the next iteration that switches to BW's 0x0200 chunk step.
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
  from the STRT record in an A5 response payload.
2026-05-01 18:37:34 -04:00
claude 1ef55521b1 Fix: Removed duplicates from merge botch. Stable version of seismo_lab.py 2026-05-01 17:34:41 -04:00
claude 738b39f3cb Manually Merged seismo lab persistent connection branch into the new direct download branch, creating a new branch called seismo-lab-new 2026-05-01 15:13:50 -04:00
Claude 625b0a4dfc feat(seismo_lab): add Download tab that captures wire bytes during event download
Adds a new CapturingTransport wrapper in minimateplus.transport that mirrors
every TX/RX byte to two raw .bin files using the same on-wire format as
bridges/ach_mitm.py, so the resulting captures are byte-for-byte compatible
with the existing Blastware MITM captures and load directly in the Analyzer.

A new "Download" tab in seismo_lab.py lets the user connect to a device over
TCP or serial and run connect / list-keys / download-events while the wrapper
saves raw_bw_<ts>.bin (our TX) and raw_s3_<ts>.bin (device TX) into a
seismo_dl_<ts>[_<label>]/ session directory. On completion, the panel hands
both files to the Analyzer and switches tabs, mirroring the UX of the
existing Bridge capture flow.
2026-05-01 00:12:02 +00:00
Claude b14f31f3b0 Include capture label in TCP raw filename
Matches serial bridge naming: raw_bw_{ts}_{label}.bin / raw_s3_{ts}_{label}.bin

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:48:10 +00:00
Claude b9ab368934 Fix TCP capture: write files only when capture is active
Previously every Blastware connection auto-created files.
Now TCP mode works the same as serial mode:
- Start Bridge: proxy listens and forwards silently, no files written
- New Capture: opens raw_bw/raw_s3 files; pipe threads write to them
- Stop Capture: flushes and closes files, fires Analyzer callback
- No connection = no file; multiple captures per bridge session work correctly

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:26:31 +00:00
Claude 9004241846 Restore multi-capture Bridge design + TCP mode
Brings back the protocol-exp BridgePanel design:
- Single bridge session stays up; New Capture / Stop Capture create
  labelled raw-file segments on demand (no files created at bridge start)
- Capture history listbox shows all segments; double-click reloads in Analyzer
- On capture complete: Analyzer auto-populates and runs analysis

TCP mode integrated into same tab (Serial/TCP radio toggle):
- Each incoming Blastware connection is automatically a capture segment
- Session appears in history list; Analyzer wires up live on connect
- Stop Capture disconnects current TCP session

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-27 20:20:43 +00:00
Claude 6861d9ed97 Merge TCP mode into Bridge tab (Serial/TCP radio toggle)
Removes the separate 'TCP Capture' tab and folds TCP MITM capture directly
into the existing Bridge tab.  A Serial/TCP radio selector at the top swaps
the connection fields (COM ports vs. listen port + device host:port) while
keeping the same Start Bridge / Stop Bridge / Add Mark buttons, capture
checkboxes, log dir, and live log — identical UX for both modes.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 23:01:45 +00:00
claude 5cd5652560 Merge branch 'seismo-lab' of https://github.com/serversdwn/seismo-relay into seismo-lab 2026-04-26 18:16:52 -04:00
Claude 897ac8a3f3 Add TCP MITM capture tab (TcpBridgePanel)
New 'TCP Capture' tab in seismo_lab.py: listens on a configurable local
port for an incoming Blastware connection, transparently forwards all
traffic to the real seismograph device, and saves both directions to
raw_bw_<ts>.bin / raw_s3_<ts>.bin in the same format the Analyzer already
understands.  Session start wires up Analyzer live mode automatically via
the same on_bridge_started callback as the COM-port bridge.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 22:10:48 +00:00
serversdown 310fc5986c Merge pull request 'seismo-lab2' (#7) from seismo-lab2 into seismo-lab
Reviewed-on: #7
2026-04-26 16:49:28 -04:00
Claude e1150b30aa fix(analyzer): name A5/5A frames; revert S3 checksum validation
Add 0x5A (BULK_WAVEFORM_STREAM) and 0xA5 (BULK_WAVEFORM_RESPONSE) to
SUB_TABLE so they display with real names instead of UNKNOWN_5A/A5.

Revert S3 checksum validation to checksum_valid=None (the original
intentional behavior). Large S3 frames (A5 bulk waveform, E5 compliance
config) embed inner DLE+ETX sub-frame delimiters; the trailing 0x03 of
the last inner delimiter can land where the parser expects the SUM8
checksum byte, causing false BAD CHK on every valid A5 frame.
protocol.py _validate_frame documents and ignores exactly this issue.

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 20:40:45 +00:00
claude a7585cb5e0 fix(blastware_file, server): implement logic to skip extra chunks after metadata for accurate file writing 2026-04-26 16:32:32 -04:00
Claude 9bbecea70f fix(parser): correct S3 frame terminator — bare ETX, not DLE+ETX
parse_s3 had the S3 terminator logic inverted vs the real S3FrameParser
in framing.py. It was terminating on DLE+ETX and treating bare ETX as
payload, which caused every bare 0x03 to be swallowed — bundling multiple
real S3 frames into one giant body until a DLE+ETX sequence happened to
appear. Result: 583-byte POLL_RESPONSE 'frames' containing many real
frames concatenated, all showing BAD CHK.

Fix: mirror S3FrameParser exactly —
  - Bare ETX (0x03) = real frame terminator
  - DLE+ETX (0x10 0x03) = inner-frame literal data (A4/E5 sub-frames),
    appended to body and parsing continues

https://claude.ai/code/session_014NczSHUz9uTzCAf4cVASTJ
2026-04-26 20:23:18 +00:00
claude ae30a02898 fix(blastware_file, server): enhance logging and correct chunk handling for accurate data processing 2026-04-26 16:03:07 -04:00
claude 2f084ed105 fix(protocol): update chunk counter formula to use max(key4[2:4], 0x0400) for accurate data streaming 2026-04-26 01:28:47 -04:00
claude 7976b544ed fix(blastware_file): never skip A5 frames based on classification at fi>0
Frame 0 is always the probe; frames 1+ are always data (waveform ADC
chunks, compliance config, compliance continuation).  Gating on
classify_frame() at fi>0 produces false positives: ADC binary data
can coincidentally contain b"STRT\xff\xfe", causing frames 1 and 5
to be silently dropped from the body (confirmed from live capture on
event key=01110000).  Remove all type-based filtering; include every
frame unconditionally with the standard index-based skip amounts.
2026-04-26 00:59:36 -04:00
claude 0415af19b4 fix(blastware_file): remove seen_metadata flag and adjust frame processing logic 2026-04-24 20:21:03 -04:00
claude 35c3f4f945 fix(protocol): correct A5 frame classification and chunk counter formula 2026-04-24 17:25:29 -04:00
claude 43c8158493 feat(blastware_file): classify A5 frames, only write waveform frames to body
Add classify_frame() which categorises each A5 frame by content:
  terminator    — page_key == 0x0000
  probe_or_strt — contains b"STRT"
  metadata      — contains compliance-config ASCII markers
                  (Project:, Client:, Standard Recording Setup, …)
  waveform      — binary-heavy (< 20% printable ASCII), i.e. raw ADC data
  unknown       — fallback

Update write_blastware_file() body loop: frame 0 (probe) is still
always processed; frames 1+ are only included when classify_frame
returns "waveform".  Metadata frames (compliance config block with
Project:/Client:/etc.) and any stray STRT-bearing frames are skipped
with a warning/debug log.  Terminator frame handling is unchanged.

Adds temporary print() diagnostics so each frame's classification is
visible in the server log to aid debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 15:48:37 -04:00
claude 242666f358 fix(protocol): correct chunk counter formula for accurate data streaming 2026-04-24 12:52:02 -04:00
claude 03540fdc00 fix: raise max_chunks to 128 for metadata-only 5A download
For 2-second events at 1024 sps the "Project:" metadata frame appears
beyond chunk 32 (the old default cap), causing the safety limit to be
hit and ~34 KB of waveform data to be downloaded instead of stopping
at the metadata frame.  Raising max_chunks to 128 ensures
stop_after_metadata=True can locate the metadata frame for record
times up to ~4 seconds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 02:19:27 -04:00
claude f83fd880c0 fix(protocol): update device_event_blastware_file to include extra chunk for accurate data retrieval 2026-04-24 00:35:34 -04:00
claude ab2c11e9a9 fix(protocol): refine extra chunk fetching logic for accurate termination response 2026-04-23 20:30:07 -04:00
claude fa887b85d9 fix(protocol): update extra chunk fetching logic to stop at silence detection 2026-04-23 18:28:14 -04:00
claude ecd980d345 fix(protocol): enhance extra chunk fetching logic to ensure footer detection 2026-04-23 18:22:27 -04:00
claude bc9f16e503 fix(protocol): adjust extra_chunks calculation to use integer conversion of record_time 2026-04-23 17:39:28 -04:00
claude aa2b02535b fix(protocol): add record_time based chunk scaling for longer event record times 2026-04-23 17:33:16 -04:00
claude 2a2031c3a9 fix(protocol): fetch additional chunk after metadata to ensure valid termination response 2026-04-23 17:08:36 -04:00
claude 9e7e0bce2a fix(protocol): adjust full_waveform setting for event downloads to end when it should. 2026-04-23 16:43:59 -04:00
claude 5e2f3bf2a1 fix(protocol): enable full_waveform for continuous mode. 2026-04-23 16:24:39 -04:00
claude 39ebd4bdaa fix(protocol): revert endpoint back to stop_after_metadata=True 2026-04-23 15:11:56 -04:00
claude 84c87d0b57 fix(protocol): adjust waveform download to use full_waveform for accurate event streaming 2026-04-23 13:02:55 -04:00
claude ec6362cb8e fix(protocol): include terminator in waveform stream downloads 2026-04-23 12:45:59 -04:00
claude 3eeafd24aa fix(protocol): improve terminator frame detection in write_blastware_file.
fix: rename .n00 to just blastware file (.n00 was false positive)
2026-04-23 01:33:44 -04:00
claude 8cb8b86192 fix(server): add error logging for device event handling 2026-04-22 23:48:59 -04:00
claude 6dcca4da79 feat(protocol): fully decode Blastware filename encoding and update related documentation 2026-04-22 23:43:31 -04:00
claude c47e3a3af0 feat(protocol): update Blastware file format documentation and encoding details 2026-04-22 19:16:05 -04:00
claude dfbc9f29c5 feat: first try at building waveform binary files. 2026-04-21 22:57:53 -04:00
claude 4331215e23 feat(protocol): enhance raw capture functionality and documentation updates
- Update `s3_bridge.py` to default raw capture file paths to "auto" for timestamped naming.
- Modify `gui_bridge.py` to pre-check raw capture options and streamline path handling.
- Extend `ach_server.py` to save both incoming and outgoing raw bytes for analysis.
- Revise `CHANGELOG.md` and `instantel_protocol_reference.md` to reflect changes in recording mode handling and compliance data encoding.
2026-04-21 16:07:24 -04:00
claude b3dcfe7239 fix(client): correct recording_mode anchor position in compliance config encoding 2026-04-21 01:17:45 -04:00
claude 9b5cdfd857 feat(logging): add detailed logging for anchor position in compliance config encoding/decoding 2026-04-21 00:23:15 -04:00
serversdown 4a0c9b6da5 Merge pull request 'merge protocol-exp 0.12.3 to main' (#5) from protocol-exp into main
Reviewed-on: #5
2026-04-21 00:22:24 -04:00
claude 7129aae279 fix(client): update compliance data size handling (less strict now) 2026-04-21 00:09:30 -04:00
claude 2186bc238b fix: call home settings tab display 2026-04-20 21:15:16 -04:00
claude 3fb24e1895 feat(call-home): Implement Auto Call Home configuration management
- Added `CallHomeConfig` model to represent the Auto Call Home settings.
- Introduced methods in `MiniMateClient` for reading (`get_call_home_config`) and writing (`set_call_home_config`) the call home configuration.
- Updated `MiniMateProtocol` with new commands for call home operations (SUB 0x2C for read, SUB 0x7E for write, and SUB 0x7F for confirm).
- Created API endpoints for retrieving and updating call home settings in the server.
- Enhanced the web interface with a new "Call Home" tab for user interaction with call home settings.
- Implemented JavaScript functions for reading and writing call home configurations from the web app.
2026-04-20 18:23:48 -04:00
claude 7bdd7c92f2 Merge branch 'protocol-exp' of https://gitea.serversdown.net/serversdown/seismo-relay into protocol-exp 2026-04-20 17:04:00 -04:00
claude b6ffdcfa87 feat: implement geophone sensitivity and recording mode settings in compliance config 2026-04-20 17:03:58 -04:00
serversdown a7aec31915 Merge pull request 'fix(parser): resolve BAD CHK for BW frames caused by SESSION_RESET bytes' (#4) from seismo-lab into protocol-exp
Reviewed-on: #4
2026-04-20 17:01:34 -04:00
claude eec6c3dc6a feat: add histogram_interval setting and update UI with new field. 2026-04-20 16:25:56 -04:00
claude 702e06873e fix: add recording_mode option in html 2026-04-20 15:56:52 -04:00
claude 94767f5a9d feat: add recording_mode to config editor in sfm webapp 2026-04-20 15:54:08 -04:00
claude e04114fd6c feat: mapped record_mode protocol 2026-04-20 15:49:31 -04:00
claude f10c5c1b86 feat: add persistent bridge and streamlined capture pipeline to seismo_lab.py 2026-04-20 15:09:55 -04:00
claude aa28495a43 fix: rename max_geo_range to ADC scale, and make it so its not user configurable.
fix: change max_geo_range_enum to geo_range with two options (normal and sensitive)
2026-04-19 18:15:23 -04:00
claude b23cf4bb50 fix: max_geo_range correctly identified as ADC Scale factor number. 2026-04-17 19:43:45 -04:00
36 changed files with 11711 additions and 1053 deletions
+467
View File
@@ -4,6 +4,473 @@ All notable changes to seismo-relay are documented here.
---
## v0.15.0 — 2026-05-07
### Added
- **Layered event storage architecture.** Each event now lands as four
files in the per-serial waveform store, each with a clear role:
- `<filename>` — the Blastware-readable binary (BW file). Untouched.
- `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
- `<filename>.h5` — clean per-channel waveform arrays in physical
units (in/s for geo, psi for mic) plus event metadata (HDF5 with
gzip compression). This is the canonical format for downstream
analysis tools.
- `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
project, source provenance, review state, extensions).
SQLite (`seismo_relay.db`) is the searchable index over all four.
- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
and `/db/events/{id}/waveform.json` endpoints now return samples in
physical units with explicit time-axis metadata, peak markers, and
per-channel unit hints — no more guessing the ADC-to-velocity scale
client-side. The webapp waveform viewer was rewritten to consume
this shape.
- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
(≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
*in/s per V* hardware constant — not the ADC-counts-to-velocity
factor. This silently scaled every plot ~38% too low for Normal-range
geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
Sensitive). Conversion is now done server-side using the geo_range
from compliance config; the client just plots.
- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
`read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
### Dependencies
- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
`/db/import/blastware_file` endpoint introduced in this release).
---
## v0.14.3 — 2026-05-05
### Fixed
- **`build_5a_frame` — DLE-stuffing rule for 0x10 bytes in params (the
long-standing >1-sec event 0 "won't open in BW" bug).**
Previously `build_5a_frame` wrote params bytes RAW with no DLE stuffing,
based on the incorrect assumption that the device handled all `0x10`
bytes in params literally. It does not. The device's actual de-stuffing
rule for the params region is:
- `10 10` → de-stuffs to `10`
- `10 02/03/04` → kept literal (inner-frame markers)
- `10 X` for other X → de-stuffs to just `X` (drops the `0x10`)
When the counter passed in params has `0x10` in the high byte (e.g.
counter=`0x1000` produces params bytes `... 10 00 ...`), the device
silently corrupts the request to counter=`0x__00` and responds with
whatever lives at that wrong address. For counter=0x1000 the wrong
address was 0x0000, so the response was a copy of the file header +
STRT record. That STRT block then got embedded in the assembled body
at file offset `0x1016`, and Blastware refused to open the file
(interprets the second STRT as a malformed multi-event file).
This explains the entire >1-sec event-0 failure pattern:
- 1-sec events have `end_offset < 0x1000`, so the chunk walk never
requests counter `0x10__` and the bug never triggers.
- 2-sec / 3-sec / longer events all need a chunk at counter `0x1000`
(and longer events also need `0x1200`, `0x1400`, etc., none of which
have `0x10` in the high byte except `0x1000`). Just one corrupted
response is enough to embed STRT in the body and break the file.
Verified against BW 5-1-26 "copy 3sec" capture: all 17 5A request
frames (probe + 2 metadata pages + 13 sample chunks + TERM) now match
BW's wire output **byte-for-byte**, including the doubled `10 10 00`
for counter=0x1000.
### Notes
- `0x10` bytes in `offset_hi` (the standalone offset field at body[5])
are still written RAW — confirmed correct per the 1-2-26 capture.
- BW's actual encoding of `10 02` / `10 04` for meta pages 0x1002 /
0x1004 is *not* doubled — it relies on the device keeping `10 02`
and `10 04` as literal pairs. This is preserved by the fix.
---
## v0.14.2 — 2026-05-04
### Fixed
- **`blastware_file.py` — removed harmful "duplicate header+STRT" strip.**
The v0.13.x strip logic was matching the byte sequence `00 12 03 00 STRT`
in legitimate waveform data — sample chunks at counter `0x1000` and
beyond often contain those bytes coincidentally — and zeroing 25 bytes
of valid samples per match. This is why event 0 (event-1 case in the
protocol) downloads of >1-sec recordings always failed in BW: the strip
destroyed real data at body offset `0x1012..0x102B` and propagated
alignment differences through the rest of the body. Sub-1-sec events
worked because their `end_offset` was below `0x1002`, so no sample
chunks landed in the metadata-page region and the strip's needle never
matched. Verified fix by re-feeding the BW 5-1-26 "copy 3sec" capture's
A5 frames into the file builder: output is now byte-identical to BW's
saved `M529LKIQ.G10` reference (8708 bytes, 0 differences).
- BW already concatenates frame contributions in stream order without
any de-duplication; SFM now does the same.
---
## v0.14.0 — 2026-05-02
### Changed (major rewrite)
- **`read_bulk_waveform_stream` — STRT-bounded chunk walk.** Replaces the
earlier `0x0400`-step / `max(key4[2:4], 0x0400)` chunk-counter formula,
which over-read ~5× past the actual event end into post-event circular-
buffer garbage. The new walk:
1. Probe at `counter = start_offset` (event 1: `0x0000`; event N:
`cur_key[2:4]`).
2. Parse `end_offset` from the STRT record at `data[17]` of the probe
response (`end_key[2:4]` field).
3. For event 1 only, read the two fixed metadata pages at counter
`0x1002` and `0x1004` — these contain the global session-start
compliance setup (Project / Client / User Name / Seis Loc /
Extended Notes ASCII strings). Continuation events skip these
(BW caches them across the session).
4. Walk sample chunks at **`0x0200` increments (NOT `0x0400`)**, bounded
by `end_offset` — the loop exits when
`next_chunk_counter + 0x0200 > end_offset`.
5. Send the proper TERM frame (see new `bulk_waveform_term_v2()`) with
`offset_word = end_offset - next_boundary` and
`params[2:4] = next_boundary BE`. The TERM response carries the
partial last chunk + 26-byte file footer.
- **New helpers:** `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
and `parse_strt_end_offset(a5_data)` in `minimateplus.framing`.
- **`stop_after_metadata` / `extra_chunks_after_metadata` kwargs are now
no-ops** under the v0.14.x walk. They are retained on the
`read_bulk_waveform_stream` signature for backward compatibility but log a
DEBUG line when set. The old "scan for `b'Project:'` and stop one chunk
later" workaround is obsolete — the loop is deterministically bounded by
the STRT-derived `end_offset`.
- **Project / Client / User Name / Seis Loc string source corrected.**
These come from the dedicated metadata pages at counter `0x1002` /
`0x1004`, not from "A5 frame 7" of the sample-chunk stream. The
earlier "A5 frame 7" claim was an artifact of the broken `0x0400`-step
walk where the bad counter formula coincidentally landed sample-chunk
fi=7 on top of the 0x1002 metadata page.
### Verified
- Three independent BW MITM captures (4-27-26 + 5-1-26 + 5-4-26) confirm
the new walk matches BW's behaviour event-for-event.
- `end_offset` values verified across 3 events: `0x1ABE` (4-27-26 2-sec),
`0x21F2` (5-1-26 3-sec), `0x417E` (5-1-26 event-2).
### Notes
- Earlier v0.13.0 / v0.13.1 / v0.13.2 entries describe partial steps along
the way (some of the file builder fixes, filename bugs, etc.) that were
superseded by the full rewrite. Treat this v0.14.0 entry as the
definitive landing point for the corrected SUB 5A protocol.
---
## v0.14.1 — 2026-05-04
### Fixed
- **`read_bulk_waveform_stream` — event-N probe counter off-by-`0x46`.**
Continuation events (start_key[2:4] != 0) were being probed at counter
`start_offset + 0x0046` instead of just `start_offset`. In the iteration
walk, `cur_key` from 1F is already the off=0x46 WAVEHDR record key, so the
earlier formula effectively double-counted the WAVEHDR offset. The probe
landed one WAVEHDR past the actual event start, the response no longer
contained the STRT record at byte 17, `parse_strt_end_offset` returned
`None`, and the chunk loop fell back to the `max_chunks=128` cap — walking
~110 chunks of post-event circular-buffer garbage. Verified against the
5-1-26 "copy 2nd address" and 5-4-26 BW 2-sec event captures: BW probes
counter=`0x2238` with key=`01112238` and STRT is present at byte 17 of
the response (end_offset=`0x417E`).
- **CLAUDE.md / docs/instantel_protocol_reference.md** — corrected the
event-N section to clarify that `start_key` in those formulas is the
off=0x46 key, not the off=0x2C boundary key, and removed the spurious
`+0x46` from the chunk-walk pseudocode.
---
## v0.13.2 — 2026-05-01
### Fixed
- **`_extract_record_type` — third 0C-record header format ("short", 8 bytes).**
A live SFM download against BE11529 produced files named `M5290000.000`
(zero-stamped) because the 0C waveform record's first bytes were
`01 05 07 ea ...` — neither the 9-byte single-shot layout (`0x10` at byte 1)
nor the 10-byte continuous layout (`0x10` at bytes 0 and 2). Investigation
showed this is a third format observed in the wild: an 8-byte header with no
marker bytes at all (`[day][month][year_BE:2][unknown][hour][min][sec]`).
The detection logic now scans the year (uint16 BE) at byte 2 / byte 3 / byte
4 and picks whichever offset returns a sensible year (20152050) — each
format has the year at a unique position so this disambiguates cleanly.
- New format → `event.record_type = "Waveform (Short)"`,
`Timestamp.from_short_record()`.
- Existing single-shot and continuous parsers unchanged.
- The user's event from May 1, 2026 13:21:37 now correctly resolves to a
filename like `M529LKIQ.G10` instead of `M5290000.000`.
### Added
- `Timestamp.from_short_record(data)` — decodes the 8-byte header.
- `_detect_record_format(data)` — internal helper returning
`"single_shot" / "continuous" / "short" / None` via year-position scan.
---
## v0.13.1 — 2026-05-01
### Fixed
- **`_extract_record_type` — Continuous-mode record headers misclassified as Unknown.**
In single-shot mode the 0C waveform record's 9-byte header puts the sub_code
marker `0x10` at byte 1, with the day at byte 0. In Continuous mode the
header is 10 bytes with the marker at byte 0 *and* byte 2, and the day at
byte 1. Previous logic only inspected byte 1 and treated any value other
than `0x10` / `0x03` as `"Unknown"`, which prevented `event.timestamp` from
being populated for any continuous-mode event whose day-of-month wasn't
exactly 3 or 16. As a downstream effect, `blastware_filename()` saw
`event.timestamp == None`, fell back to `stem="0000"` / `ab="00"`, and
produced filenames like `M5290000.000`. Discovered from a live SFM run on
BE11529 in continuous mode (day-of-month = 5).
Now disambiguates by checking BOTH byte 0 and byte 2: if both are `0x10`,
it's the 10-byte continuous header; else if byte 1 is `0x10`, it's the
9-byte single-shot header. Day-of-month no longer matters.
*Superseded by v0.13.2 — the user's actual record uses a third 8-byte format
with no `0x10` markers, which v0.13.1 still misclassified.*
---
## v0.13.0 — 2026-05-01
### Fixed
- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
`read_bulk_waveform_stream` was walking the chunk counter past the actual
end of the event, picking up post-event circular-buffer garbage that
corrupted reconstructed Blastware files for any waveform > ~1 sec. The
loop now extracts the event's `end_offset` from the STRT record at
`data[23:27]` of the probe response and stops the chunk walk when the next
counter would step past it. Verified against three BW MITM captures
(4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.
### Added
- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
formula confirmed across all 3 BW captures. Not yet wired into
`read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
existing `blastware_file.write_blastware_file` frame-structure expectations);
available for the next iteration that switches to BW's 0x0200 chunk step.
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
from the STRT record in an A5 response payload.
### Documentation
- **CLAUDE.md and `docs/instantel_protocol_reference.md` extensively
rewritten** to reflect the corrected SUB 5A protocol. See:
- CLAUDE.md "SUB 5A — chunk counter formula (REWRITTEN 2026-05-01)"
- CLAUDE.md "SUB 5A — STRT record encodes end_offset"
- CLAUDE.md "SUB 5A — TERM frame formula"
- CLAUDE.md "SUB 5A — fixed metadata pages 0x1002 and 0x1004"
- CLAUDE.md "SUB 0A — WAVEHDR response length distinguishes events from
boundaries" (0x46 = real event, 0x2C = boundary marker)
- protocol reference §7.8.5 / §7.8.6 / §7.8.7 / §7.8.8
- The previous chunk-counter formula (`max(key4[2:4], 0x0400) + (chunk-1) *
0x0400`) is now marked DEPRECATED and explicitly tagged WRONG with
pointers to the new sections, so future work doesn't re-derive it.
### Known minor diffs vs Blastware (deferred to a follow-up)
- We still use the OLD 0x0400 chunk step rather than BW's 0x0200; switching
also requires updating `blastware_file.write_blastware_file`'s skip values
and "extra chunk after metadata" logic, which depends on a fresh capture
to verify.
- We still use the legacy fixed `offset_word=0x005A` TERM frame rather than
BW's `end_offset - next_boundary` formula, for the same reason.
- Two fixed metadata pages at counter `0x1002` and `0x1004` are not yet
read explicitly; under the current 0x0400 walk their content is reachable
via the sample chunk that covers buffer addresses `[0x1000, 0x1400)`.
---
## v0.12.6 — 2026-05-01
### Fixed
- **`blastware_file.py` — waveform frame classification** — A5 frame classification for
waveform-only vs header-only frames now uses `frame.record_type` instead of frame index.
Only waveform frames (0x46) are written to the file body; metadata frames are skipped.
Fixes spurious data corruption from incorrectly classified frames.
- **`s3_analyzer.py` — A5/5A frame naming** — Bulk waveform stream frames (SUB 5A response)
are now correctly labeled "A5" in analyzer output instead of being conflated with other
multi-frame responses (SUB A4, E5, etc.).
- **`S3FrameParser` — frame terminator detection** — Corrected the bare ETX terminator
detection. Frame termination is now correctly identified by a standalone `ETX=0x03` byte,
not by the `DLE+ETX` sequence (which is part of the payload when it appears within a frame).
---
## v0.12.5 — 2026-04-21
### Added
- **`seismo_lab.py` — Download tab** — New fourth tab for live wire-byte capture during event
downloads. Captures both BW→device and device→S3 frames in real time, allowing inspection
of the 5A bulk stream chunk sequence and frame-by-frame analysis without needing a bridge
or MITM proxy. Files are saved with user-specified labels for easy tracking.
### Changed
- **`s3_bridge.py` — raw captures always-on by default** — `--raw-bw` and `--raw-s3` now
default to `"auto"` instead of `None`. Every bridge session automatically generates
timestamped `raw_bw_<ts>.bin` and `raw_s3_<ts>.bin` files alongside the `.bin`/`.log`
session files. Pass `--raw-bw ""` (explicit empty string) to disable if needed.
- **`gui_bridge.py` — raw capture checkboxes pre-checked** — Both "BW→S3 raw" and
"S3→BW raw" checkboxes start checked. Path fields are empty by default (bridge auto-names
the files). Unchecking a box passes `--raw-bw ""` to explicitly disable capture.
- **`Bridge tab` — TCP mode added** — Serial/TCP radio toggle allows connection via cellular
modem (RV50/RV55) instead of direct RS-232. Supports multi-capture design (simultaneous
Bridge + Analyzer + Download sessions).
- **`ach_server.py` — TX capture added (`raw_tx_<ts>.bin`)** — Every ACH inbound session
now saves both directions: `raw_rx_<ts>.bin` (device → us, S3 side, as before) and
`raw_tx_<ts>.bin` (us → device, BW side). Both files are usable in the Analyzer.
TX bytes are buffered in memory until startup handshake succeeds (same as RX), preventing
scanner probes from creating empty files.
---
## v0.12.4 — 2026-04-21 (protocol analysis / docs only — no code changes)
### Discovered
- **compliance_raw is wire-encoded, not logical bytes** — `read_compliance_config()` returns
bytes that include DLE prefix bytes (`0x10`) before any `0x03` values (because S3FrameParser
preserves DLE+ETX inner-frame pairs as two literal bytes). The previous CLAUDE.md claim that
"S3FrameParser handles this transparently so compliance_raw contains logical bytes" was wrong.
- **anchor-9 behavior per recording mode** (confirmed from 4-20-26 BW write captures):
- Single Shot (0x00) / Continuous (0x01): anchor-9 = `0x00`
- Histogram (0x03): anchor-9 = `0x10` — the E5 DLE prefix for the `0x03` recording_mode byte
- Histogram+Continuous (0x04): anchor-9 = `0x10` — an actual stored config byte for this mode
Anchor position shifts by ±1 when recording_mode = `0x03` due to the extra DLE byte; the
dynamic anchor search (`buf.find(ANCHOR, 0, 150)`) handles this correctly without code changes.
- **Write frame ETX escaping** — BW escapes `0x03` bytes in write frame data as `0x10 0x03`
on the wire. Our `build_bw_write_frame` sends data bytes raw without ETX escaping. Device
accepts our raw writes for all tested modes. Hypothesis: device write parser uses the
offset/length field for frame boundaries, not ETX scanning, making ETX escaping optional.
Histogram mode (recording_mode = 0x03) write via SFM from a non-Histogram starting state
not yet tested.
- **BW write payload vs E5 read payload are byte-identical** around the anchor region (confirmed
by comparing 3-11-26 BW TX and S3 captures). BW does NOT strip DLE prefix bytes before writing;
it round-trips the wire-encoded bytes verbatim with only the modified fields changed.
- **Capture folder content catalogued** — see CLAUDE.md "BW capture reference" table for a
summary of all available protocol captures and their contents.
---
## v0.12.3 — 2026-04-20
### Added
- **Auto Call Home config protocol** — Full read/write/decode/encode pipeline for the
device's Remote Access → Setup Unit ACH settings, confirmed from 4-20-26 call home
settings captures.
**Protocol (new):**
- `SUB 0x2C` — Call Home Config READ (response `0xD3`); two-step read; data offset
`0x7C` = 124; raw payload 125 bytes (1-byte longer than DATA_LENGTH due to DLE-escaped
`\x10\x03` at raw[117:119] representing num_retries = 3)
- `SUB 0x7E` — Call Home Config WRITE (response `0x81`); 127-byte payload (125-byte read
payload + `\x00\x00`); offset = `data[1]+2 = 0x7E`; write format (DLE-aware checksum)
- `SUB 0x7F` — Call Home WRITE CONFIRM (response `0x80`); no data
**Field map (confirmed from 10-frame BW TX diff):**
- `raw[5]` — auto_call_home_enabled (bool)
- `raw[6:46]` — dial_string (40-byte null-padded ASCII)
- `raw[87]` — after_event_recorded (bool)
- `raw[91]` — at_specified_times (bool)
- `raw[93]` — time1_enabled / `raw[101]` — time1_hour / `raw[102]` — time1_min
- `raw[95]` — time2_enabled / `raw[105]` — time2_hour / `raw[106]` — time2_min
- `raw[117:119]` — `\x10\x03` (DLE-escaped 0x03 = num_retries value 3)
- `raw[120]` — time_between_retries_sec / `raw[122]` — wait_for_connection_sec / `raw[124]` — warm_up_time_sec
**Library (`minimateplus/`):**
- `models.py` — `CallHomeConfig` dataclass (14 fields; `raw` bytes preserved for
round-trip writes)
- `protocol.py` — `SUB_CALL_HOME = 0x2C`, `SUB_CALL_HOME_WRITE = 0x7E`,
`SUB_CALL_HOME_CONFIRM = 0x7F`; `read_call_home_config()`, `write_call_home_config()`
- `client.py` — `get_call_home_config()`, `set_call_home_config()`,
`_decode_call_home_config()` (handles DLE prefix at raw[117]),
`_encode_call_home_config()` (patches in-place; raises `ValueError` if hour/min = 3)
**REST API (`sfm/server.py`):**
- `GET /device/call_home` — reads and decodes call home config from device
- `POST /device/call_home` — reads, patches specified fields, writes back to device
- `CallHomeConfigBody` Pydantic model with 9 optional writable fields
**Web UI (`sfm/sfm_webapp.html`):**
- New "Call Home" tab with enable flag, dial string (read-only), after-event trigger,
at-specified-times flag, two time slots (enable + HH:MM each), and read-only retry
settings (num_retries, time_between_retries_sec, wait_for_connection_sec,
warm_up_time_sec)
- "Read from Device", "Write to Device", "Clear Form" action buttons
- Client-side guard: rejects hour or minute value equal to 3 with a clear message
explaining the DLE-encoding limitation
---
## v0.12.2 — 2026-04-20
### Added / Fixed
- **Geophone sensitivity / maximum range field confirmed** — 4-20-26 geo sensitivity
captures (1.25 in/s vs 10 in/s) diffed across all three SUB 71 write chunks and both
E5 read payloads. The `geo_range` uint8 field per channel is now fully confirmed:
- E5 read offset: `channel_label + 33`; SUB 71 write offset: `channel_label + 29`
- `0x00` = Normal 10.000 in/s (standard gain); `0x01` = Sensitive 1.250 in/s (high gain)
- **Correction:** previous hypothesis (`channel_label+20`, `0x01`=Normal) was wrong.
`channel_label+20` reads `0x01` on ALL captures regardless of range — not this field.
- `_decode_compliance_config_into`: read offset corrected from `tran_pos+20` → `tran_pos+33`
- `_encode_compliance_config`: added `geo_range` parameter; writes to Tran/Vert/Long at `+29`
- `apply_config`: added `geo_range` parameter
- `POST /device/config`: added `geo_range` to `DeviceConfigBody`
- Web UI Config tab: added "Maximum Range — Geo" select (Normal / Sensitive)
- Web UI Device tab: added "Max Range (geo)" row to compliance table
- **`recording_mode` + `histogram_interval_sec` confirmed and implemented** (4-20-26 captures)
- `recording_mode`: uint8 at anchor8 (E5 read) / anchor7 (write); enum: 0x00=Single Shot,
0x01=Continuous, 0x03=Histogram, 0x04=Histogram+Continuous
- `histogram_interval_sec`: uint16 BE seconds at anchor4; same offset in read & write;
valid: 2, 5, 15, 60, 300, 900 (matching Blastware dropdown: 2s, 5s, 15s, 1m, 5m, 15m)
- Both fields added to `ComplianceConfig`, `_decode_compliance_config_into`,
`_encode_compliance_config`, `apply_config`, REST API body, and web UI
---
## v0.12.1 — 2026-04-16
### Added
+525 -59
View File
@@ -2,7 +2,9 @@
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
(Sierra Wireless RV50 / RV55). Current version: **v0.12.1**.
(Sierra Wireless RV50 / RV55). Current version: **v0.14.3**.
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
---
@@ -25,9 +27,9 @@ CHANGELOG.md ← version history
---
## Current implementation state (v0.10.0)
## Current implementation state (v0.14.3)
Full read pipeline + write pipeline + erase pipeline + monitor log working end-to-end over TCP/cellular:
Full read pipeline + write pipeline + erase pipeline + monitor log + call home config working end-to-end over TCP/cellular:
| Step | SUB | Status |
|---|---|---|
@@ -39,13 +41,15 @@ Full read pipeline + write pipeline + erase pipeline + monitor log working end-t
| Event header / first key | 1E | ✅ |
| Waveform header | 0A | ✅ |
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ new v0.6.0 |
| **Bulk waveform stream (event-time metadata + full waveform)** | **5A** | ✅ **byte-perfect against BW captures (v0.14.3, 2026-05-05)** — STRT-bounded chunk walk + correct event-N probe counter + DLE-stuffed `0x10` bytes in params + concatenate-only file body assembly. All 17 5A request frames in the 5-1-26 3-sec capture reproduce byte-for-byte. |
| Event advance / next key | 1F | ✅ |
| **Write commands (push config to device)** | **6883** | ✅ new v0.8.0 |
| **Erase all events** | **0xA3 → 0x1C → 0x06 → 0xA2** | ✅ new v0.9.0 |
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ **new v0.10.0** |
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ new v0.10.0 |
| **Auto Call Home config (read + write)** | **2C → 7E → 7F** | ✅ **new v0.12.3** |
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
`get_events()` sequence per event: `1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`
(see "Correct iteration pattern" section below for full detail)
`push_config_raw()` write sequence: `68→73 | 71×3→72 | 82→83 | 69→74→72`
@@ -112,24 +116,203 @@ S3→BW (response):
section contribute only `XX` to the running sum; lone bytes contribute normally. This
differs from the standard SUM8-of-destuffed-payload that all other commands use.
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
BW TX capture. All 10 frames verified.
3. **Params region uses partial DLE stuffing (CONFIRMED 2026-05-05).** The device's
de-stuffing rule for bytes inside the params region is:
### SUB 5A — chunk counter is monotonic (CORRECTED 2026-04-06)
- `10 10` → de-stuffs to `10`
- `10 02 / 03 / 04` → kept literal (these are inner-frame markers)
- `10 X` for other X → de-stuffs to just `X` (drops the leading `0x10`)
**Chunk counters are `chunk_num * 0x0400` for ALL chunks including chunk 1.**
Therefore any `0x10` byte in the *logical* params that is followed by a byte NOT in
`{0x02, 0x03, 0x04, 0x10}` MUST be doubled on the wire (`10 X``10 10 X`) so the
device's de-stuffer reproduces the original `10 X` pair. This applies most commonly
to counters with `0x10` in the high byte (e.g. counter=`0x1000` produces logical
params bytes `... 10 00 ...`, which BW encodes on the wire as `... 10 10 00 ...`).
Without this stuffing the device interprets counter=`0x1000` as `0x0000` and returns
the probe response (which contains a copy of the file header + STRT record). That
STRT block then gets embedded in the assembled file body at offset `0x1016`, and
Blastware refuses to open the file — see the v0.14.3 entry in `CHANGELOG.md`.
The 4-2-26 BW TX capture showed `counter=0x1004` for chunk 1 of event key `01110000`, which
led to `_CHUNK1_COUNTER = 0x1004` being hardcoded as a special case. This was a Blastware
artifact, not a protocol requirement. Empirical test 2026-04-06: with `counter=0x1004` for
chunk 1 the device times out (120 s); with `counter=0x0400` (= `1 * 0x0400`) it responds
immediately and streams all frames correctly.
`0x10` bytes in `offset_hi` (body[5]) are still written RAW — only the params region
has this stuffing requirement. The metadata-page params for counter `0x1002` /
`0x1004` survive without stuffing because `10 02` and `10 04` fall in the "kept
literal" carve-out.
The 4-3-26 capture confirms the pattern for a second event (key `0111245a`):
chunk 1 = `0x245A`, chunk 2 = `0x285A`, chunk 3 = `0x2C5A` (each +0x0400). Blastware's
true formula is `key4[2:4] + n * 0x0400` — but since `key4[2:4]` of the first event is
`0x0000`, `n * 0x0400` produces the right result. The device does not strictly validate the
counter and streams data for any valid 5A request; using `chunk_num * 0x0400` is correct.
Both differences (1) and (2) confirmed by reproducing Blastware's exact wire bytes from
the 1-2-26 BW TX capture (10 frames). Difference (3) confirmed against the 5-1-26
"bwcap3sec" capture (17 frames, all match byte-for-byte after fix).
### SUB 5A — chunk counter formula (REWRITTEN 2026-05-01 — see 5-1-26 captures)
> ⚠️ **Everything that came before this rewrite was WRONG in important ways.** The previous
> formula `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` happened to *work* for events
> at start_key=0 because the device responds to whatever counter you ask for — but it caused
> a 5× over-read past the actual event, picking up post-event circular-buffer garbage that
> corrupts the reconstructed file for any event > ~1 sec of waveform. The captures in
> `bridges/captures/4-27-26/` and `5-1-26/comcheck/` show BW reads only ~12-16 chunks for
> the same events SFM was reading 37+ chunks for. See "TERM frame" and "STRT end_offset"
> sections below for the actual mechanism.
**Chunk addressing is just absolute device-buffer addresses.**
`params[0]=0x00`, `params[1:5]` is a 4-byte absolute device flash-buffer address (= the
"key" of that location), `params[5:11]` are zeros. The device returns 0x0200 (= 512) bytes
starting at that address. Increments between consecutive chunks are **0x0200 (NOT 0x0400)**
— this matches the chunk payload size. The previous "0x0400 step" worked by accident: BW
asks for half-size chunks; SFM was asking for double-size chunks, both with the same-named
"counter" field, but the value is just an address pointer the device honors as-is.
**The chunk pattern depends on whether the event sits at start_key=0 or not.**
#### Event 1 case — start_key[2:4] == 0x0000 (first event after erase / wrap)
```
1. Probe at counter=0x0000 (params[1:5] = full key, returns STRT record)
2. Read 2 fixed metadata pages: counter=0x1002, counter=0x1004
(these are GLOBAL session metadata — read ONCE per
Blastware session, not per event; contain the
Project/Client/User Name/Seis Loc strings)
3. Sample chunks: counter=0x0600, 0x0800, …, by 0x0200 increment,
up to but not including end_offset (rounded down to
0x0200 boundary)
4. TERM frame (see TERM formula below)
```
The reason `0x0046..0x0600` is skipped for event 1 is unknown — likely some pre-event
firmware reserved area for the first slot in a freshly-erased buffer. Harmless to skip.
#### Event 2+ case — start_key[2:4] != 0x0000 (continuation events)
```
1. First chunk at counter = start_key[2:4] (this IS the probe — response
contains STRT at byte 17)
2. Sample chunks: counter += 0x0200 each, up to but
not including end_offset
3. TERM frame
```
**`start_key` here is the off=0x46 WAVEHDR record key returned by 1F** (e.g. `01112238`),
NOT the off=0x2C boundary key that immediately precedes it. An earlier draft of this
doc described event-N as "probe at start + 0x46" — that formula came from naming the
boundary key as `start_key`. In the iteration walk, `cur_key` passed to
`read_bulk_waveform_stream` is always the off=0x46 key (the partial-record skip path in
`get_events` re-runs 1F to advance past boundary records before invoking 5A), so the
probe counter is just `cur_key[2:4]` with no extra offset. **Adding +0x46 caused the
probe to overshoot, miss the STRT record at byte 17 of the response, fall back to the
`max_chunks=128` cap, and walk ~110 chunks of post-event garbage** — observed in
SFM 5-4-26 capture before the fix.
Confirmed across:
- 5-1-26 "copy 2nd address" BW capture: probe counter=0x2238, key=01112238, STRT@17 end=0x417E.
- 5-4-26 BW 2-sec event capture: probe counter=0x2238, key=01112238, TERM offset_word=0x0146 → end=0x417E.
No metadata pages — those have already been read during event 1 in the same Blastware
session, and BW caches them. Note that the metadata-page reads happen ONCE per
Blastware-session-on-the-device, not once per event, so an SFM session that downloads
several events should read 0x1002/0x1004 only once at the start.
#### History (do not re-derive)
- Original: `_CHUNK1_COUNTER = 0x1004` hardcoded (Blastware capture artifact — WRONG).
- 2026-04-06: `chunk_num * 0x0400` (worked for key 01110000 only).
- 2026-04-24: `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets, broke key 01110000).
- 2026-04-26: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400` (broken — over-read past event end).
- 2026-05-01: Increments are 0x0200 not 0x0400; absolute addresses inside event range; bounded
by STRT end_key, not by `max_chunks` cap or device-side timeout.
- 2026-05-04: Removed spurious `+0x0046` from event-N probe counter. `cur_key` from 1F
is already the off=0x46 WAVEHDR key, so adding +0x46 would have placed the probe one
WAVEHDR past the actual event start. This caused probe responses to lack a STRT
record (no `end_offset` parsed → `0xFFFF` fallback → `max_chunks=128` cap), walking
~110 chunks of post-event circular-buffer garbage. Fixed in protocol.py
`read_bulk_waveform_stream`.
### SUB 5A — STRT record encodes end_offset (NEW 2026-05-01)
The first A5 response (probe response, or the first chunk for event 2+) contains a STRT
record at byte offset 17 of the `data` field. Layout:
```
data[17:21] "STRT" magic
data[21:23] ff fe sentinel
data[23:27] end_key ← 4-byte key of where this event ENDS
data[27:31] start_key ← 4-byte key of where this event STARTS
data[31:33] uint16 BE ?? sample-count or total bytes (varies; not yet decoded)
data[33:35] uint16 BE ??
data[35] 0x46 record type (waveform full record)
```
`end_offset = (end_key[2] << 8) | end_key[3]` is **the authoritative event-end pointer**.
SFM must extract this from the first A5 response and use it to bound the chunk loop and
encode the TERM frame. The device will happily respond to chunk requests past `end_offset`
(returning post-event circular-buffer contents) — that's the over-read bug.
Verified across 3 events:
| Capture | start_key | end_key | end_offset | event size |
|---|---|---|---|---|
| 4-27-26 "open 2sec" / "copy event to disk" | `01110000` | `01111ABE` | `0x1ABE` | 6,846 B |
| 5-1-26 "copy 3sec" / Download All event 1 | `01110000` | `011121F2` | `0x21F2` | 8,690 B |
| 5-1-26 "copy 2nd address" / DA event 2 | `011121F2` | `0111417E` | `0x417E` (event 2 span 0x1F8C = 8,076 B) |
### SUB 5A — TERM frame formula (FINALIZED 2026-05-01)
The TERM frame fetches the partial last chunk *and* the file footer. It is **not** a simple
"goodbye" frame — its response payload contains the bytes between the last full 0x0200-aligned
chunk and `end_offset`, and is required for reconstructing the Blastware file format.
```
last_chunk_counter = address of last full 0x0200-byte chunk read
next_boundary = last_chunk_counter + 0x0200
TERM offset_word = end_offset - next_boundary
TERM params[0] = key[0] (= 0x01 on every observed device)
TERM params[1] = key[1] (= 0x11)
TERM params[2] = (next_boundary >> 8) & 0xFF
TERM params[3] = next_boundary & 0xFF
TERM params[4:10] = zeros
build_5a_frame(offset_word, params) (10-byte params, NOT 11)
```
The device reconstructs `requested_address = (params[2] << 8) | offset_word = end_offset`
and replies with `(end_offset - next_boundary)` bytes from `next_boundary` — the residual
between the last 0x0200 boundary and the actual event end. Append the TERM response data
to the chunk stream like any other A5 frame; it carries the final waveform tail + footer.
Verified across 3 events:
| end_offset | last chunk | next_boundary | TERM offset_word | TERM params[2:4] |
|---|---|---|---|---|
| `0x1ABE` | `0x1800` | `0x1A00` | `0x00BE` ✓ | `1A 00` ✓ |
| `0x21F2` | `0x1E00` | `0x2000` | `0x01F2` ✓ | `20 00` ✓ |
| `0x417E` | `0x3E38` | `0x4038` | `0x0146` ✓ | `40 38` ✓ |
The previous code's hard-coded `offset_word = 0x005A` and `term_counter = last + 0x0400`
are wrong; the device's response under that path is a tiny 101-byte device-side terminator
(arrived only after we walked the entire post-event buffer), not the proper file footer.
### SUB 5A — fixed metadata pages 0x1002 and 0x1004 (NEW 2026-05-01)
Two chunk addresses are GLOBAL device/session metadata, not event-specific:
- `counter=0x1002` — first metadata page
- `counter=0x1004` — second metadata page
These are at fixed absolute addresses in the device's flash buffer. They contain the
session-start compliance setup (Project/Client/User Name/Seis Loc/Extended Notes ASCII
strings). Under the v0.14.0+ walk these strings are read directly from the metadata
pages, not from the sample-chunk stream.
BW reads them ONCE per Blastware session (during event 1's download) and caches them.
For SFM, that means:
- Once per call-home / once per `MiniMateClient.connect()` is enough.
- Subsequent events in the same session don't need to re-fetch them.
- Their content does not change when iterating events; only when the user opens
Compliance Setup → Apply on the device or sends a SUB 71 compliance write.
The full byte-for-byte layout of the metadata pages has not been mapped — `_decode_a5_metadata_into`
locates the ASCII strings via label scans (`Project:`, `Client:`, `User Name:`, `Seis Loc:`,
`Extended Notes`) which works correctly across observed captures. Future work could
dump the structural layout if more session-global fields need to be extracted.
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
@@ -137,10 +320,11 @@ counter and streams data for any valid 5A request; using `chunk_num * 0x0400` is
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
Do not swap them.
### SUB 5A — event-time metadata lives in A5 frame 7
### SUB 5A — event-time metadata source (FINALIZED 2026-05-05)
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
setup as it existed when the event was recorded:
The metadata strings come from the two fixed metadata pages at counter `0x1002` and
`0x1004` (see "SUB 5A — fixed metadata pages 0x1002 and 0x1004" above). These pages
are GLOBAL session metadata — read once per Blastware/SFM session, not per event.
```
"Project:" → project description
@@ -150,44 +334,71 @@ setup as it existed when the event was recorded:
"Extended Notes"→ notes
```
**IMPORTANT — 5A "Project:" is session-start config, NOT per-event (confirmed 2026-04-05):**
The "Project:" string in the A5 frame 7 payload reflects the compliance setup from when
the *monitoring session first started*, not the individual event's project name. The per-
event project name is correctly stored in the 210-byte 0C waveform record and must be
used as the authoritative source. `_decode_a5_metadata_into` therefore only sets
`project` from 5A when 0C didn't already supply one.
**IMPORTANT — these strings are session-start config, NOT per-event:**
Project / Client / User Name / Seis Loc reflect the compliance setup from when the
*monitoring session first started*, not the individual event's per-event metadata. The
authoritative per-event project name is stored in the 210-byte 0C waveform record.
`_decode_a5_metadata_into` therefore only sets `project` from the 5A metadata pages
when 0C didn't already supply one.
"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are **NOT** present in the 0C
record — 5A remains the sole source for those fields and they are set unconditionally.
record — the metadata pages are the sole source for those fields and they are set
unconditionally.
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
then sends the termination frame.
#### Deprecated knobs (do not re-introduce)
### SUB 5A — end-of-stream signal (confirmed 2026-04-06)
The `read_bulk_waveform_stream()` function still accepts these legacy kwargs for
backward compatibility, but they are **no-ops** under the v0.14.0+ walk:
After streaming all waveform chunks, the device sends exactly **1 raw byte** in response to
the next chunk request, then goes silent. This is the natural end-of-stream indicator — NOT
a complete A5 frame. `S3FrameParser.bytes_fed` will be 1; no frame is assembled.
- `stop_after_metadata=True` — used to scan the chunk stream for `b"Project:"` and stop
one chunk later as a workaround for the missing end_offset bound. Obsolete: the loop
is now deterministically bounded by `end_offset` parsed from the STRT record at
data[17] of the probe response, with the partial tail fetched by the TERM frame.
- `extra_chunks_after_metadata` — same era, same reason. No-op.
Handling: on `TimeoutError`, if `bytes_fed > 0` AND frames were already collected, treat as
graceful end-of-stream, break the loop, and proceed to the termination frame. If `bytes_fed
== 0` with no prior frames, it is a genuine transport failure — re-raise.
If you find code or docs referencing "A5 frame 7" as the source of metadata strings,
that's an old-walk artifact (the broken `0x0400`-step formula occasionally caught the
0x1002 metadata page at sample-chunk fi=7). Update to reference the dedicated metadata
pages instead.
**Chunk recv timeout must be 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
Using 120 s causes a ~2-minute stall at every end-of-stream detection. The `_recv_one` call
in the chunk loop passes `timeout=10.0` explicitly.
### SUB 5A — end-of-stream (FINALIZED 2026-05-01)
**Typical chunk count (BE11529, 1024 sps):** A 9,306-sample event produces 35 chunks before
end-of-stream. Chunks with uniform 1,036-byte data are all-zero ADC samples (post-event
silence). Only the initial variable-size chunks contain actual signal.
Under the v0.14.0+ STRT-bounded walk the stream ends cleanly:
```
… last full chunk at counter < end_offset
TERM request (offset_word = end_offset - next_boundary,
params address (next_boundary))
TERM response (page_key = 0x0000 or 0x0001, data = the residual
end_offset - next_boundary bytes including the file footer)
```
No timeout-based detection, no "1-byte teaser," no `max_chunks` cap. The chunk loop
exits when `counter + 0x0200 > end_offset`; the TERM frame fetches the tail.
**Chunk recv timeout is 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
Using 120 s would cause a ~2-minute stall on any unexpected timeout. The `_recv_one`
call in the chunk loop passes `timeout=10.0` explicitly.
**Typical chunk count under the v0.14.0+ walk (BE11529, 1024 sps over TCP/cellular):**
| Event duration | Sample chunks | Metadata pages | TERM | Total A5 frames |
|---|---|---|---|---|
| 2-sec (event 1) | ~12 | 2 | 1 | ~15 |
| 3-sec (event 1) | 13 | 2 | 1 | 16 |
| 2-sec (continuation) | 15 | 0 | 1 | 16 |
| 3-sec (continuation) | ~14 | 0 | 1 | ~15 |
For comparison, the deprecated `0x0400`-step walk produced ~37 chunks for a 2-sec
event with chunks 17-37 containing post-event circular-buffer garbage. Do not
re-introduce that walk under any circumstances.
### SUB 5A — fi==9 hardcoded skip (FIXED 2026-04-06)
`_decode_a5_waveform()` previously had `elif fi == 9: continue` — a leftover from the
9-frame original blast capture where frame 9 was assumed to be a terminator. For current
35-frame streams, fi==9 is live waveform data (~133 sample-sets were being dropped).
Removed. Terminator detection is via `page_key == 0x0000` in `read_bulk_waveform_stream`,
not frame index.
9-frame original blast capture where frame 9 was assumed to be a terminator. Removed.
TERM detection in the file builder uses `frame.page_key != 0x0010` (sample marker),
not frame index — see `blastware_file.py`.
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
@@ -292,6 +503,55 @@ sends token=0xFE and is NOT used by any caller.
`advance_event()` returns `(key4, event_data8)`.
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
### SUB 0A — WAVEHDR response length distinguishes events from boundaries (NEW 2026-05-01)
When iterating events with the "Download All" pattern (1E → 0A → 1F → 0A → 1F → …), the
DATA_LENGTH at `data_rsp.data[5]` (= the byte BW echoes back as the offset for the data
fetch step) takes one of two values:
| WAVEHDR offset | Meaning |
|---|---|
| `0x46` (= 70) | Real event start key — there is event data at this address |
| `0x2C` (= 44) | Boundary marker between events — this key is the END of the previous event AND the START key for the empty space after it (or is the next event's pre-header) |
Confirmed from the 5-1-26 "Download All" capture:
```
0A(key=01110000) → off=0x46 ← event 1 real start
1F → key=011121F2
0A(key=011121F2) → off=0x2C ← event 1 END / event 2 boundary
1F → key=01112238
0A(key=01112238) → off=0x46 ← event 2 real start (= boundary + 0x46)
1F → key=0111417E
0A(key=0111417E) → off=0x2C ← event 2 END / next-empty marker
1F → null sentinel
```
This is why event 2's first 5A chunk is at `start_key + 0x46` — that's the address of the
"real start" 0x46-record, distinct from the `0x2C`-record at the raw boundary. Use the
`0x46` keys as the input to `read_bulk_waveform_stream`, not the `0x2C` keys.
For event 1 only (start_key[2:4] = 0x0000) BW probes at counter=0x0000 directly, which is
the `0x46`-keyed start record. Subsequent events use `start_key + 0x46`.
**Practical iteration pattern (replaces the old 1E/1F walk for downloads):**
```
Setup: SERIAL × 2 → CHCFG → 1E (token=0x00) → key0
For each event:
0A(cur_key) → DATA_LENGTH = 0x46 (real) or 0x2C (boundary)
1F (token=0x00) → next_key
if length was 0x46: → cur_key is a real event; queue it for download
cur_key = next_key
if next_key all-zero null sentinel: stop
Then for each queued real-event key:
download_event(key) → 5A bulk stream with STRT-bounded chunk walk
```
This is what BW does in the 5-1-26 "Download All" capture — it walks the full event chain
collecting `(key, length)` tuples first, *then* downloads each event using the `0x46` keys.
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
@@ -305,10 +565,16 @@ producing only ~1071 bytes instead of ~2126.
### SUB 1A — anchor search range
`_decode_compliance_config_into()` locates sample_rate and record_time via the anchor
`b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`. Search range is `cfg[0:150]`.
`_decode_compliance_config_into()` locates fields via the **6-byte stable anchor**
`b'\xbe\x80\x00\x00\x00\x00'`. Search range is `cfg[0:150]`.
Do not narrow this to `cfg[40:100]` — the old range was only accidentally correct because
**IMPORTANT — the "10-byte anchor" `\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00` is NOT fully constant.**
The first 2 bytes (`\x01\x2c` = 300) are the `histogram_interval_sec` field (uint16 BE, seconds) —
the value 300 is just the 5-minute default. When histogram interval is set to a different value
(e.g. 15min = 0x0384 = `\x03\x84`), those bytes change. Only the 6-byte suffix
`\xbe\x80\x00\x00\x00\x00` is truly constant. The code already uses the 6-byte anchor.
Do not narrow the search range to `cfg[40:100]` — the old range was only accidentally correct because
the orphaned-send bug was prepending a 44-byte spurious header, pushing the anchor from
its real position (cfg[11]) into the 40100 window.
@@ -358,15 +624,72 @@ Do NOT use fixed absolute offsets for sample_rate or record_time.
| Field | How to find it |
|---|---|
| **recording_mode** | **uint8 at anchor 3 (write) / anchor 4 (read)** ✅ confirmed 2026-04-20 |
| sample_rate | uint16 BE at anchor 2 |
| **histogram_interval_sec** | **uint16 BE at anchor 4 (seconds); same offset in read & write** ✅ confirmed 2026-04-20 |
| record_time | float32 BE at anchor + 10 |
| trigger_level_geo | float32 BE, located in channel block |
| alarm_level_geo | float32 BE, adjacent to trigger_level_geo |
| max_range_geo | float32 BE, adjacent to alarm_level_geo — **⚠ value and units UNKNOWN** (reads 6.206053 but doesn't match either UI range option; may not be the ADC full-scale — see GitHub issue) |
| geo_hardware_constant (adc_scale_factor) | float32 BE at **channel_label+28** in both read (E5) and write (SUB 71) payloads — reads **6.206053** on BOTH tested units (BE11529 and BE18189); identical across all geo channels (Tran/Vert/Long) and all captures. **Confirmed 2026-04-17 from Interface Handbook §4.5**: this is the **ADC-to-velocity scale factor** = 1/sensitivity = (in/s per V). Firmware uses it as: `PPV (in/s) = ADC_voltage × 6.206053`. Cross-check: `1.61133 V (ADC full-scale) × 6.206053 = 10.000 in/s` (Normal range ✅). Do NOT write this field — it is a hardware/firmware constant. |
| geo_range (sensitivity selector) | **uint8 at channel_label+33** in both read (E5) and write (SUB 71) payloads — **CONFIRMED 2026-04-20** from 4-20-26 geo sensitivity captures: `0x00` = Normal 10.000 in/s (standard gain), `0x01` = Sensitive 1.250 in/s (high gain). Present in all three geo channel blocks (Tran, Vert, Long). **NOTE: `channel_label+20` reads `0x01` on ALL captures regardless of range setting — it is NOT this field.** Note: the "SUB 71 write offset = +29" that appears in earlier analysis was an artifact of incorrect BW-style destuffing applied to write frame data — write frame data is RAW, so the literal `0x10` bytes in the channel block header are preserved, and the offset is the same as in the E5 read payload. |
| setup_name | ASCII, null-padded, in cfg body |
| project / client / operator / sensor_location | ASCII, label-value pairs |
Anchor: `b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`, search `cfg[0:150]`
**True stable anchor: `b'\xbe\x80\x00\x00\x00\x00'` (6-byte suffix), search `cfg[0:150]`.**
The old "10-byte anchor" `b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'` is partially variable:
bytes `\x01\x2c` = 300 (5-minute default histogram interval); changes when interval changes.
**Field layout relative to the 6-byte anchor (write payload / E5 read — noted where different):**
| Offset | Field | Format | Notes |
|---|---|---|---|
| anchor 9 | mode_prefix | uint8 | `0x00` for Single Shot / Continuous; `0x10` for Histogram (DLE prefix in E5 encoding) and Histogram+Continuous (actual config byte). See "compliance_raw DLE encoding" note below. |
| anchor 8 | recording_mode | uint8 | **Same offset for both read and write** — confirmed 2026-04-21. `_encode_compliance_config` writes `buf[anc-8]`. NOTE: for Histogram (0x03), E5 encodes the value as `0x10 0x03` so compliance_raw[anc-9]=0x10, compliance_raw[anc-8]=0x03. |
| anchor 7 | constant | `0x10` | Always `0x10` in both E5 read and BW write payloads (not a DLE marker — it is part of the sample_rate field area). Do NOT overwrite. |
| anchor 6 | sample_rate | uint16 BE | same in read & write |
| anchor 4 | histogram_interval_sec | uint16 BE | seconds; same in read & write ✅ 2026-04-20 |
| anchor 2 | `0x00 0x00` | padding | |
| anchor | `\xbe\x80\x00\x00\x00\x00` | anchor | |
| anchor + 6 | record_time | float32 BE | same in read & write |
**recording_mode enum** (confirmed 2026-04-20 from 4-20-26 captures):
| Value | Mode | anchor-9 in compliance_raw |
|---|---|---|
| `0x00` | Single Shot | `0x00` |
| `0x01` | Continuous | `0x00` |
| `0x02` | ❓ not observed | ❓ |
| `0x03` | Histogram | `0x10` (DLE prefix from E5 wire encoding of 0x03) |
| `0x04` | Histogram + Continuous | `0x10` (actual config byte for this mode) |
**compliance_raw DLE encoding — IMPORTANT (confirmed 2026-04-21 from 4-20-26 captures):**
`compliance_raw` (returned by `read_compliance_config()`) is NOT purely logical bytes — it is
the wire-encoded representation where `0x03` bytes in the config are preceded by a `0x10` DLE
prefix (because S3FrameParser preserves DLE+ETX inner-frame pairs as two literal bytes).
Consequences:
- When recording_mode = `0x03` (Histogram), `compliance_raw[anc-9] = 0x10` (DLE prefix) and
`compliance_raw[anc-8] = 0x03` (the value). The anchor position is +1 compared to modes
without `0x03` bytes before the anchor.
- For Histogram+Continuous (`0x04`), `compliance_raw[anc-9] = 0x10` for a different reason:
it is an actual stored config byte, not a DLE prefix.
- The anchor search (`buf.find(b'\xbe\x80\x00\x00\x00\x00', 0, 150)`) correctly locates
the anchor regardless of these mode-dependent shifts.
- When SFM writes recording_mode and round-trips the rest verbatim, the byte at `anc-9` is
preserved from the previous read. This means transitioning Histogram→other modes via SFM
leaves a `0x10` at `anc-9`. The device stores it as a literal byte; it does not affect
recording mode operation (which is at `anc-8`), but differs from what BW writes. This is a
known minor discrepancy that does not impact device behavior.
- **Histogram recording mode (0x03) write via SFM**: untested. When starting from a mode with
`anc-9 = 0x00`, SFM writes bare `0x03` at anc-8. BW would write `0x10 0x03`. Device likely
accepts both (write frames probably use offset/length for framing, not ETX scanning).
**DLE escaping in write frames — confirmed 2026-04-20:** Blastware escapes `0x03` bytes in
write frame data as `0x10 0x03` on the wire (defensive ETX escaping). Our `build_bw_write_frame`
does NOT do this escaping — it sends data bytes raw. Device acceptance of bare `0x03` bytes
in write frame data is confirmed for the tested modes (Single Shot, Continuous, Histogram+Continuous
where `0x10 0x03` already appears from round-tripping). Histogram mode (bare `0x03` write from
non-Histogram starting state) has not been directly tested.
### SUB 0C — Waveform Record (210 bytes = data[11:11+0xD2])
@@ -453,6 +776,8 @@ All DB endpoints are read-only except `PATCH /db/events/{id}/false_trigger`.
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
| 4-3-26 | `bridges/captures/4-3-26/` | Browse-mode S3 capture with 2+ events — confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |
| 4-27-26 | `bridges/captures/4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof that SFM was over-reading 5× past event end. BW reads 14 chunks at 0x0200 increments + TERM at end_offset; SFM was reading 37 chunks at 0x0400 increments. STRT end_key field located. |
| 5-1-26 | `bridges/captures/5-1-26/comcheck/` | Three sub-captures: SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945`/`_171216`). Confirmed: TERM frame formula across 3 events; metadata pages 0x1002/0x1004 are global (read once per session); event-1 vs event-N chunk-pattern split; WAVEHDR length 0x46 vs 0x2C disambiguates real events from boundaries. |
---
@@ -701,22 +1026,22 @@ Fields visible in the Blastware Compliance Setup dialog — most are NOT YET dec
offsets in the raw 1A/E5 payload. Only fields with `✅` have confirmed offsets in the code.
**Recording Setup tab:**
- Recording Mode: Continuous / Single Shot / Histogram (enum)
- Record Stop Mode: Fixed Record Time / Auto / Manual Stop (enum)
- Recording Mode: Continuous / Single Shot / Histogram / Histogram+Continuous ✅ (uint8 at anchor3 in write, anchor4 in read; 0x00=Single Shot, 0x01=Continuous, 0x03=Histogram, 0x04=Histogram+Continuous) — confirmed 2026-04-20
- Record Stop Mode: Fixed Record Time / Auto / Manual Stop (enum) ❓ (byte near recording_mode; data[40] in E5 sf1 changed 0x01→0x00 alongside Continuous→Single Shot — may be this field)
- Sample Rate: Standard 1024 / Fast 2048 / Faster 4096 sps ✅ (anchor2)
- Record Time: float, seconds ✅ (anchor+10)
- Histogram Interval: 5 / 15 / 30 / 60 minutes (enum, mode-gated)
- Histogram Interval: 2s / 5s / 15s / 1m / 5m / 15m ✅ (uint16 BE seconds at anchor4, same in read & write; mode-gated to Histogram/Histogram+Continuous) — confirmed 2026-04-20
- Storage Mode: Save All Data / Save Triggered (enum)
- Geophone Type: Standard Triaxial / 4.5 Hz (bool/enum)
- Geophone Channels: Enable all geophones (bool), Trigger Source (bool)
- Chan 1-3 Trigger Level (float, in/s) ✅ (`trigger_level_geo`)
- Chan 1-3 Maximum Range: Normal 10.000 / 1.25 in/s (enum) (`max_range_geo` — offset found, reads 6.206053 which matches neither UI value; units and meaning unknown — do NOT use as ADC full-scale)
- Chan 1-3 Maximum Range: Normal 10.000 / 1.25 in/s (enum) (`geo_range` uint8; **CONFIRMED 2026-04-20** from 4-20-26 geo sensitivity captures: offset = `channel_label+33` in both E5 read and SUB 71 write payloads (same bytes, round-tripped verbatim); `0x00` = Normal 10.000 in/s, `0x01` = Sensitive 1.250 in/s; applied to Tran/Vert/Long channel blocks). **IMPORTANT: `channel_label+20` reads `0x01` on ALL captures and is NOT this field** — it is a constant flag. The float32 at `channel_label+28` = 6.206053 is the ADC-to-velocity scale factor (hardware constant, do NOT write).
- Microphone Channels: Enable all microphones (bool), Trigger Source (bool)
- Chan 4 Trigger Level (dB or psi depending on units)
**Notes tab:**
- Enable User Notes (bool)
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from A5 frame 7 via 5A)
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from 5A metadata pages at counter 0x1002 / 0x1004 — see "SUB 5A — fixed metadata pages" section)
- Enable Extended Notes (bool); Extended Notes text; Extended Notes Title
- Enable Job Number (bool); Job Number (int)
- Enable Scaled Distance (bool); Distance from Blast (float); Charge Weight (float) — Scaled Distance is derived
@@ -935,12 +1260,153 @@ call-home.
---
## Auto Call Home config (SUBs 0x2C / 0x7E / 0x7F) — confirmed 2026-04-20
Full read/write pipeline confirmed from `bridges/captures/4-20-26/call home settings/`
(10 BW TX write frames diffed against the S3 read response).
Accessible in Blastware: **Remote Access → Setup Unit**.
### Protocol
**SUB 0x2C — Call Home Config READ (response 0xD3)**
Standard two-step read: probe offset `0x0000`, data offset `0x007C` (124).
Returns 125 raw bytes (one more than DATA_LENGTH) because the device encodes
num_retries value `3` as `\x10\x03` on the wire — S3FrameParser preserves both
bytes literally, shifting all subsequent field positions by +1.
**SUB 0x7E — Call Home Config WRITE (response 0x81)**
Write format (only BW_CMD `0x10` doubled on wire; DLE-aware checksum).
Payload = 125-byte read payload + `\x00\x00` = 127 bytes.
Offset = `data[1] + 2 = 0x7C + 2 = 0x7E`.
**SUB 0x7F — Call Home WRITE CONFIRM (response 0x80)**
Confirm frame, no data payload. Required after SUB 0x7E.
### Field map (raw 125-byte array from `data_rsp.data[11:]`)
| Raw Offset | Field | Notes |
|---|---|---|
| `[5]` | `auto_call_home_enabled` | `0x00`=off, `0x01`=on |
| `[6:46]` | `dial_string` | 40-byte null-padded ASCII |
| `[87]` | `after_event_recorded` | bool |
| `[91]` | `at_specified_times` | bool |
| `[93]` | `time1_enabled` | bool |
| `[101]` | `time1_hour` | 023 |
| `[102]` | `time1_min` | 059 |
| `[95]` | `time2_enabled` | bool |
| `[105]` | `time2_hour` | 023 |
| `[106]` | `time2_min` | 059 |
| `[117]` | DLE prefix `0x10` | Part of `\x10\x03` (DLE-escaped ETX encoding value 3) |
| `[118]` | `num_retries` | Value = 3; detect via `raw[117] == 0x10` |
| `[120]` | `time_between_retries_sec` | Shifted +1 from logical 119 |
| `[122]` | `wait_for_connection_sec` | Shifted +1 from logical 121 |
| `[124]` | `warm_up_time_sec` | Shifted +1 from logical 123 |
**DLE-escaped 0x03 at raw[117:119]:** The byte value `0x03` is indistinguishable from the
frame ETX terminator, so the device encodes it as `\x10\x03` (DLE + ETX inner-terminator).
S3FrameParser in `STATE_AFTER_DLE` on ETX appends both bytes as literal payload. The write
frame sends them verbatim — device accepts `\x10\x03` and interprets it as value 3.
**Unconfirmed fields:** time slots 3 and 4 (offsets unknown), `modem_power_relay_enabled`.
### `CallHomeConfig` model — models.py
```python
@dataclass
class CallHomeConfig:
raw: Optional[bytes] = None # 125-byte raw read payload
auto_call_home_enabled: Optional[bool] = None # raw[5]
dial_string: Optional[str] = None # raw[6:46]
after_event_recorded: Optional[bool] = None # raw[87]
at_specified_times: Optional[bool] = None # raw[91]
time1_enabled: Optional[bool] = None # raw[93]
time1_hour: Optional[int] = None # raw[101]
time1_min: Optional[int] = None # raw[102]
time2_enabled: Optional[bool] = None # raw[95]
time2_hour: Optional[int] = None # raw[105]
time2_min: Optional[int] = None # raw[106]
num_retries: Optional[int] = None # raw[118] (DLE-prefixed)
time_between_retries_sec: Optional[int] = None # raw[120] (shifted +1)
wait_for_connection_sec: Optional[int] = None # raw[122] (shifted +1)
warm_up_time_sec: Optional[int] = None # raw[124] (shifted +1)
```
### SFM REST API — sfm/server.py
```
GET /device/call_home?host=1.2.3.4&tcp_port=9034 ← read call home config
POST /device/call_home?host=1.2.3.4&tcp_port=9034 ← write call home config
```
POST body fields (all optional): `auto_call_home_enabled`, `after_event_recorded`,
`at_specified_times`, `time1_enabled`, `time1_hour`, `time1_min`, `time2_enabled`,
`time2_hour`, `time2_min`.
**Note:** `dial_string` is read-only in the current implementation (omitted from POST
body) because writing a dial string may require DLE escaping for embedded control characters.
---
## What's next
- **Database** — SQLite store for events + monitor log entries; dedup by key; queryable
- **Histograms** — decode histogram-mode A5 data (noise floor tracking)
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed BYTE-PERFECT against BW reference (v0.14.3, 2026-05-05):** when fed the BW 5-1-26 3-sec capture's A5 frames, the SFM-built file matches BW's saved `M529LKIQ.G10` byte-for-byte (8708 bytes, 0 differences). Live SFM downloads of event 0 (3-sec) and event 1 (3-sec continuation) both open cleanly in Blastware with full Event Reports, frequency analysis, and waveform plots. Body assembly is just contiguous concatenation of frame contributions in stream order (probe → meta@0x1002 → meta@0x1004 → samples → TERM); no stripping, no overlay, no special handling. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that may need different handling — untested under v0.14.x). Extension mapping: extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
**Serial encoding (CONFIRMED 2026-04-22):** `prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))`, `serial3 = f"{serial_numeric % 1000:03d}"`. Examples: BE6907→H907, BE11529→M529, BE14036→P036, BE17353→S353, BE18003→T003. The prefix letter encodes the production generation (batch of 1000 units).
**Stem encoding (FULLY CONFIRMED 2026-04-22):** stem = 4-char base-36 of `floor(total_seconds / 1296)` where `total_seconds = (event_local_time 1985-01-01T00:00:00_local)` in seconds. Epoch = `1985-01-01 00:00:00` device local time — confirmed against 3,248 files from 10-year production archive with zero errors. Decode: `event_time = datetime(1985,1,1) + timedelta(seconds=stem_int*1296 + ab_int)`. Example: P036L318.C80H → BE14036, 2025-05-26 15:00:08, Full Histogram.
- **Blastware filename extension — NEW FIRMWARE FULLY DECODED (confirmed 2026-04-21, further confirmed 2026-04-22 from 10-year production archive frequency analysis):**
Extension format = `AB0T` (4 chars):
- `AB` = 2-char base-36 encoding of `total_seconds % 1296` (seconds within the 21.6-min window, 01295); `A = value // 36`, `B = value % 36`
- `0` = always literal digit zero (third character, invariant)
- `T` = event type: `W` = Full Waveform, `H` = Full Histogram
Combined with the 4-char stem, the full filename encodes a complete second-resolution timestamp. Verified against three S353L4H0.{3M0W,8S0H,9X0W} events (all match to the second) plus large-scale frequency analysis of a 10-year archive.
**3-day cycle property (confirmed 2026-04-22):** A unit recording at a fixed daily time cycles through exactly **3 extensions** with a 3-day period. Each calendar day shifts `total_seconds % 1296` by 864 (since `86400 % 1296 = 864`). The cycle repeats every 3 days because `gcd(1296, 864) = 432` and `1296 / 432 = 3`. The three extension values are spaced 432 seconds apart. Confirmed from 10-year archive: the top 3 extensions overall were `CE0H` (95 files), `0E0H` (93), `OE0H` (91) — all three are the 3-day cycle of a 06:00:14 daily call-in time (seconds-in-window = 14, 446, 878; all three have `E` as second character because `14 = E` in base-36 and adding 864 never changes `value % 36` since `864 = 24 × 36`).
**B character invariance:** For a unit recording at a fixed time of day, the second character `B` of the extension (`value % 36`) **never changes** — only the first character `A` cycles through 3 values. This means same-time-of-day files from different dates all share the same `B` character.
**Old firmware (S338, 3-char extensions ending in `0`):** encoding unknown. Extension is NOT recording mode. `blastware_filename()` returns `.N00` as a placeholder for old-firmware units.
**Micromate Series 4** uses a different extension format entirely (observed: `IDFH`, `IDFW`). The `AB0T` formula applies only to MiniMate Plus / V10.72 firmware.
- Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
- **Test Histogram recording mode (0x03) write via SFM** — confirmed working for Single Shot / Continuous / Histogram+Continuous; Histogram (0x03) needs a live test from a non-Histogram starting state (bare 0x03 in write vs BW's DLE-escaped `10 03`)
- **Compliance write anchor-9 cleanup** — when changing recording_mode via SFM, the byte at anchor-9 is not explicitly managed. A spurious `0x10` may persist after Histogram→other mode transitions. Does not affect device operation but differs from BW's byte-perfect output.
- Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring)
- Call Home — map time slots 3/4 offsets; add dial_string write support; confirm `modem_power_relay_enabled`
- Modem manager — push RV50/RV55 configs via Sierra Wireless API
- RV55 DCD/DTR issue — newer RV55 firmware doesn't assert DCD by default; units don't
resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred)
## BW capture reference
`bridges/captures/` contains the following BW TX + S3 response captures for protocol analysis:
| Folder / File | Contents |
|---|---|
| `1-2-26/` | First SUB 5A BW TX capture — established 5A frame format (raw offset_hi, DLE-aware checksum). 10 frames verified. |
| `3-11-26/raw_bw_20260311_170151.bin` | Full compliance write + event download (SUBs 68→83 confirmed, frames 102112) |
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) — 1E/0A/0C/1F sequence confirmed (single event so token=0xFE appeared to work in either branch) |
| `4-2-26/` | Download-mode BW TX capture — POLL×3 requirement confirmed (frames 68-73 between 1F and first 5A) |
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events — all-zero params for 1F, null sentinel layout, 0A context requirement |
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
| `4-11-26 (mitm/ach_mitm_20260411_001912/)` | Full ACH call-home MITM — erase protocol (0xA3/0x06/0xA2), monitor log partial records confirmed |
| `4-20-26/raw_bw_*_recording_mode_*.bin` | Recording mode changes: Continuous→Single Shot, →Histogram, →Histogram+Continuous |
| `4-20-26/histogram interval/` | Histogram interval changes: 1min, 5min, 15min, 15sec |
| `4-20-26/geo sensitivity/` | Geo sensitivity changes: 1.25 in/s (Sensitive), 10 in/s (Normal) |
| `4-20-26/call home settings/` | Call home config read/write captures |
| `4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof of 5× SFM over-read. STRT end_key field located. |
| **`5-1-26/comcheck/`** | **Triplet of captures that nailed the v0.14.0 walk:** SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945` / `_171216`). Confirmed: TERM frame formula across 3 events, metadata pages 0x1002/0x1004 are global session metadata, event-1 vs event-N chunk pattern split, WAVEHDR off=0x46 vs 0x2C disambiguates real events from boundaries. |
| **`5-1-26/comcheck/bwcap3sec/`** | **The byte-perfect reference for v0.14.3.** All 17 BW 5A request frames (probe, 2 metadata, 13 samples, TERM) reproduce byte-for-byte from SFM's framing helpers — including the `10 10 00` DLE-stuffed counter for sample @ 0x1000 that was the long-standing failure mode. |
| `5-4-26/` | BW MITM captures of "copy 3sec / 2sec / Download All" + paired SFM session (`seismo_dl_20260504_145701`) showing the +0x46 event-N probe bug producing 110-chunk runaway walk. Cross-references against 5-1-26 confirmed device behavior is identical. |
To parse BW TX captures: use `bridges/captures/` scripts or adapt the `find_write_frames()` pattern
in `/tmp/analyze_write_payload.py` — it correctly handles `0x10 0x03` DLE-escaped ETX bytes
inside write frame data (the naive parser terminates early at the escaped `0x03`).
+135 -38
View File
@@ -1,4 +1,4 @@
# seismo-relay `v0.12.1`
# seismo-relay `v0.15.0`
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
software for managing MiniMate Plus seismographs.
@@ -10,7 +10,15 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
> handles inbound unit connections, downloads events, and persists everything
> to a SQLite database. SFM REST API exposes device control and DB queries.
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
> **As of v0.14.3 (2026-05-05): SUB 5A bulk waveform protocol is verified
> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
> full Event Reports, frequency analysis, and waveform plots.
> **v0.15.0 (2026-05-07)** adds layered per-event storage (BW binary +
> raw 5A pickle + HDF5 + `.sfm.json` sidecar), a plot-ready
> `sfm.plot.v1` JSON shape with server-side ADC-to-physical-units
> conversion, and a BW-file importer for ingesting externally-produced
> events. See [CHANGELOG.md](CHANGELOG.md) for full version history.
---
@@ -18,26 +26,27 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
```
seismo-relay/
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Console tabs)
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Download + Console tabs)
├── minimateplus/ ← MiniMate Plus client library
│ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport
│ ├── protocol.py ← DLE frame layer, SUB command dispatch
│ ├── client.py ← High-level client (connect, get_events, push_config, …)
│ ├── client.py ← High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …)
│ ├── framing.py ← Frame builders, DLE codec, S3FrameParser
── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, …
── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig,
│ └── blastware_file.py ← Write events to Blastware-compatible .AB0 files
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
│ ├── server.py ← All device + DB endpoints
│ ├── database.py ← SeismoDb — SQLite persistence layer
│ └── sfm_webapp.html ← Embedded web UI (served at /)
│ ├── server.py ← Live device endpoints + DB query endpoints + caching
│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions, sessions table)
│ └── sfm_webapp.html ← Embedded web UI with Call Home config tab
├── bridges/
│ ├── ach_server.py ← Inbound ACH call-home server (main production server)
│ ├── ach_mitm.py ← Transparent MITM proxy for capturing BW sessions
│ ├── s3-bridge/ ← RS-232 serial bridge (capture tool)
│ ├── tcp_serial_bridge.py ← Local TCP↔serial bridge (bench testing)
│ ├── gui_bridge.py ← Standalone bridge GUI
│ ├── gui_bridge.py ← Standalone bridge GUI with raw capture checkboxes
│ └── raw_capture.py ← Simple raw capture tool
├── parsers/
@@ -101,21 +110,28 @@ python seismo_lab.py
Each call dials the device, does its work, and closes the connection. TCP
connections are retried once on `ProtocolError` to handle cold-boot timing.
**Caching** — frequently-polled endpoints are cached in-process to avoid
redundant TCP round-trips:
**In-memory caching** — frequently-polled endpoints avoid redundant TCP round-trips
via a thread-safe `_LiveCache` (plain Python dict + `threading.Lock`):
| Method | URL | Cache |
|--------|-----|-------|
| Method | URL | Cache Strategy |
|--------|-----|---|
| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
| `GET` | `/device/monitor/status` | 30-second TTL |
| `GET` | `/device/monitor/status` | 30-second TTL; invalidated by monitor start/stop |
| `GET` | `/device/call_home` | Fresh read from device (not cached) |
| `POST` | `/device/connect` | — |
| `POST` | `/device/config` | Writes compliance config; invalidates cache |
| `POST` | `/device/monitor/start` | Sends SUB 0x96 |
| `POST` | `/device/monitor/stop` | Sends SUB 0x97 |
| `POST` | `/device/config` | Writes compliance config; invalidates info + events cache |
| `POST` | `/device/config/project` | Patches project/client/operator/sensor_location strings |
| `POST` | `/device/monitor/start` | Sends SUB 0x96; immediately evicts status cache |
| `POST` | `/device/monitor/stop` | Sends SUB 0x97; immediately evicts status cache |
| `POST` | `/device/call_home` | Reads, patches specified fields, writes back to device |
All cached endpoints accept `?force=true` to bypass the cache.
**Cache bypass**All cached endpoints accept `?force=true` to skip the cache and
force a fresh read from the device.
**Cache stats**`GET /cache/stats` returns hit/miss counts and TTL info; `DELETE /cache/device`
clears the device cache immediately.
Transport query params (supply one set):
```
@@ -158,42 +174,61 @@ with client:
events = client.get_events() # Full download: headers + peaks + metadata
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
ach_cfg = client.get_call_home_config() # Auto Call Home settings (SUB 0x2C)
# Write
client.apply_config(
sample_rate=1024,
recording_mode="Continuous", # Single Shot / Continuous / Histogram / Histogram+Continuous
histogram_interval_sec=15, # 2, 5, 15, 60, 300, 900
trigger_level_geo=0.5,
geo_range="Normal", # Normal (10.000 in/s) / Sensitive (1.25 in/s)
project="Bridge Inspection 2026",
client_name="City of Portland",
operator="B. Harrison",
)
client.set_call_home_config(
auto_call_home_enabled=True,
after_event_recorded=True,
at_specified_times=True,
time1_hour=18, time1_min=30, # 6:30 PM
time2_hour=6, time2_min=0, # 6:00 AM
)
# Control
client.start_monitoring() # SUB 0x96
client.stop_monitoring() # SUB 0x97
client.delete_all_events() # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
```
`get_events()` runs the full per-event sequence: `1E → 0A → 0C → 5A → 1F`.
SUB 5A bulk stream provides `client`, `operator`, and `sensor_location` as they
existed at record time — not backfilled from the current compliance config.
`get_events()` runs the full per-event sequence:
`1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`.
SUB 5A bulk stream walks chunks bounded by the `end_offset` extracted from
the STRT record at byte 17 of the probe response — no over-reading, no
chunk-count cap. Project / client / operator / sensor location strings come
from the dedicated metadata pages at counter `0x1002` and `0x1004`,
read once per session (they reflect the compliance setup at session start,
not per individual event).
---
## Database
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode).
Three tables, all unit-keyed by serial number:
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode) using the
`SeismoDb` persistence layer. Four tables, all unit-keyed by serial number:
| Table | Key | Contents |
|-------|-----|----------|
| `ach_sessions` | UUID | Per-call-home audit record: serial, peer IP, events_downloaded, duration |
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, PPV per channel, project/client/operator strings, false_trigger flag |
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: start/stop time, duration, geo threshold |
| `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds |
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag |
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips |
| `events.false_trigger` | Boolean flag | PATCH endpoint to mark/unmark false triggers for review |
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs
never produce duplicate rows. Post-erase key reuse is handled automatically
via the high-water mark in `ach_state.json`.
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs never
produce duplicate rows. Post-erase key reuse is handled automatically via the
high-water mark in `ach_state.json`. Key-based state tracking allows correct
handling of device erasures (external or post-download).
---
@@ -231,6 +266,27 @@ Full protocol documentation: [`docs/instantel_protocol_reference.md`](docs/insta
---
## Compliance Config Features
The REST API and web UI expose full control over device compliance settings:
- **Recording Mode** (Single Shot / Continuous / Histogram / Histogram+Continuous)
- **Sample Rate** (1024 / 2048 / 4096 sps)
- **Record Time** (float, seconds)
- **Histogram Interval** (2s, 5s, 15s, 1m, 5m, 15m) — when recording mode includes histogram
- **Geo Trigger Levels** (float, in/s per channel)
- **Geo Maximum Range** (Normal 10.000 in/s / Sensitive 1.250 in/s per channel)
- **Project / Client / Operator / Sensor Location** (ASCII strings)
Auto Call Home config:
- **Auto Call Home Enable** (bool)
- **Dial String** (read-only; 40-byte ASCII)
- **Trigger on Event** (bool)
- **Scheduled Call-Ins** (two time slots with HH:MM each)
- **Retry Settings** (count, delay, connection timeout, warm-up time)
---
## Requirements
```bash
@@ -252,17 +308,58 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
---
## Roadmap
## Key Features
- [x] Full read pipeline — device info, compliance config, event download with true event-time metadata
- [x] Write commands — push compliance config, trigger thresholds, project strings to device
- [x] Erase all events — confirmed erase sequence from live MITM capture
- [x] Monitor control — start/stop monitoring, read battery/memory/status
- [x] Monitor log entries — decode partial 0x2C records (continuous monitoring intervals)
- [x] ACH inbound server — accept call-home connections, download events, dedup by key
- [x] SQLite persistence — events, monitor log, and session history in `seismo_relay.db`
- [x] SFM REST API — device control + DB query endpoints, live device cache
**Device support:**
- [x] Full read/write/erase pipelines
- [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings)
- [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries)
- [x] Monitor control (start/stop, status polling, battery/memory)
- [x] Monitor log entries (continuous monitoring intervals without full waveform download)
**Data persistence:**
- [x] SQLite database (`seismo_relay.db`) with 4 tables: ach_sessions, events, monitor_log, plus false_trigger flag
- [x] Deduplication by waveform key (handles re-runs and repeat call-homes)
- [x] Post-erase key-reuse detection (tracks high-water mark)
- [x] Session state (`ach_state.json`) with downloaded keys and max key
**REST API:**
- [x] Live device endpoints with in-memory caching (`_LiveCache`)
- [x] Cache statistics (`/cache/stats`) and manual invalidation (`/cache/device`)
- [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH)
- [x] Call Home config read/write endpoints
- [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`)
**File output (v0.7+, byte-perfect as of v0.14.3):**
- [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata)
- [x] Multi-channel waveform decode from SUB 5A bulk stream
- [x] Second-resolution timestamp encoding in Blastware filename
- [x] **Byte-perfect against BW reference captures** (verified across 2-sec / 3-sec / 10-sec event durations, both event 0 and event N continuation events)
- [x] STRT-bounded chunk walk + correct event-N probe counter + partial DLE stuffing of `0x10` in 5A params (the four fixes that landed in v0.14.0v0.14.3)
**Capture tools:**
- [x] Serial-to-TCP bridge with raw BW/S3 capture (s3_bridge.py, defaults to auto-capture)
- [x] GUI bridge with raw capture checkboxes (gui_bridge.py)
- [x] ACH inbound server with bidirectional capture (ach_server.py saves raw_tx + raw_rx)
- [x] Transparent TCP MITM proxy for live BW session capture (ach_mitm.py)
**Analysis tools:**
- [x] s3_analyzer.py — session parser, frame differ, Claude export
- [x] gui_analyzer.py — standalone analyzer GUI
- [x] frame_db.py — SQLite frame database for capture analysis
**seismo_lab.py GUI:**
- [x] Bridge tab — Serial/TCP mode selector with raw capture options
- [x] Analyzer tab — BW/S3 capture playback and differencing
- [x] Download tab — Live wire-byte capture during event download
- [x] Console tab — Logging and diagnostics
## Roadmap (Future)
- [ ] Verify 30-sec event download — body may exceed `0xFFFF` and force the device into a different `end_key` encoding (none of 2/3/10-sec test cases hit this boundary)
- [ ] Terra-view integration — seismo-relay router, unit detail page, VISON-style event listing
- [ ] Vibration summary reports — highest legit PPV per project → Word doc (false trigger filtering first)
- [ ] Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
- [ ] Modem manager — push RV50/RV55 configs via Sierra Wireless API
- [ ] Histogram mode recording support (5A stream analysis for mode 0x03)
- [ ] Call Home dial_string write support (requires DLE escaping for embedded control characters)
+251 -125
View File
@@ -35,6 +35,7 @@ Output per session
device_info.json — serial number, firmware version, calibration date, etc.
events.json — all events: timestamp, PPV per channel, peaks, metadata
raw_rx_<ts>.bin — raw bytes from the device (S3 side) for Analyzer
raw_tx_<ts>.bin — raw bytes we sent to the device (BW side) for Analyzer
session_<ts>.log — detailed protocol log
What to look for
@@ -69,43 +70,78 @@ from minimateplus.transport import SocketTransport
from minimateplus.client import MiniMateClient
from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
from sfm.database import SeismoDb
from sfm.waveform_store import WaveformStore
log = logging.getLogger("ach_server")
# ── Per-unit state (downloaded-key set) ───────────────────────────────────────
# ── Per-unit state (downloaded events index) ──────────────────────────────────
# Persisted as <output_dir>/ach_state.json
# Format:
# Format (current — v2):
# {
# "BE11529": {
# "downloaded_keys": ["01110000", "0111245a"], # hex keys already on disk
# "max_downloaded_key": "0111245a", # highest key ever seen
# "last_seen": "2026-04-11T01:04:36"
# "downloaded_events": { # key_hex → ISO timestamp string
# "01110000": "2026-04-11T00:42:17",
# "0111245a": "2026-04-11T01:04:30"
# },
# "max_downloaded_key": "0111245a",
# "last_seen": "2026-04-11T01:04:36",
# "serial": "BE11529",
# "peer": "63.43.212.232:51920"
# }
# }
#
# Key-based deduplication works well within a single "key generation" (between
# erases). After the device memory is erased the event counter resets to
# 0x01110000, so the first new event has the SAME key as the very first event
# we ever downloaded. We detect this situation with max_downloaded_key:
# Why (key, timestamp) and not key alone:
# The device's event-key counter resets to 0x01110000 after every memory
# erase (internal or external). A bare-key dedup (the v1 format) cannot
# distinguish a re-recorded event with the same key from one we already
# downloaded. The 0C waveform record's timestamp IS unique per physical
# event, so we pair (key, timestamp) and treat a key with a different
# timestamp as a new event regardless of `max_downloaded_key`.
#
# if max(current_device_keys) < max_downloaded_key
# → device was wiped and keys have restarted → treat all device keys as new
#
# After our own erase (--clear-after-download) we also explicitly clear
# downloaded_keys and max_downloaded_key so the next session starts fresh.
# Legacy v1 format (`downloaded_keys: list[str]` only) is auto-migrated on
# read: the keys are kept under a sentinel of "" (empty string) timestamp so
# the (key, timestamp) compare always sees a mismatch and forces a one-time
# re-download. After that pass the state is rewritten in v2 form.
_state_lock = threading.Lock()
def _load_state(state_path: Path) -> dict:
if state_path.exists():
"""
Load ach_state.json, transparently migrating any legacy
`downloaded_keys: list` entries into the v2 `downloaded_events: dict`
schema. Returns the migrated state.
"""
if not state_path.exists():
return {}
try:
with open(state_path) as f:
return json.load(f)
state = json.load(f)
except Exception:
pass
return {}
# Per-unit migration: legacy list → dict-with-empty-timestamps
for unit_key, unit_state in list(state.items()):
if not isinstance(unit_state, dict):
continue
if "downloaded_events" in unit_state:
continue
legacy_keys = unit_state.get("downloaded_keys")
if isinstance(legacy_keys, list):
unit_state["downloaded_events"] = {k: "" for k in legacy_keys}
log.info(
"ach_state: migrated %s from v1 (downloaded_keys list) → v2 "
"(downloaded_events dict, %d keys with empty timestamps; "
"they will re-validate on next session)",
unit_key, len(legacy_keys),
)
else:
unit_state["downloaded_events"] = {}
# keep legacy field for one cycle; cleared on next save
unit_state.pop("downloaded_keys", None)
return state
def _save_state(state_path: Path, state: dict) -> None:
with _state_lock:
@@ -138,8 +174,10 @@ class AchSession:
max_events: Optional[int],
state_path: Path,
db: "SeismoDb",
store: "WaveformStore",
clear_after_download: bool = False,
restart_monitoring: bool = False,
force_redownload: bool = False,
) -> None:
self.sock = sock
self.peer = peer
@@ -149,8 +187,14 @@ class AchSession:
self.max_events = max_events
self.state_path = state_path
self.db = db
self.store = store
self.clear_after_download = clear_after_download
self.restart_monitoring = restart_monitoring
# `force_redownload` tells this session to ignore ach_state and
# re-download every event currently on the device, regardless of any
# (key, timestamp) match. Useful as a manual override when state has
# become inconsistent with what's actually on disk / in the DB.
self.force_redownload = force_redownload
def run(self) -> None:
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
@@ -172,16 +216,24 @@ class AchSession:
transport = SocketTransport(self.sock, peer=self.peer)
# Collect raw bytes in memory until startup succeeds, then flush to disk.
raw_buf: list[bytes] = []
raw_rx_buf: list[bytes] = [] # device → us (S3 side)
raw_tx_buf: list[bytes] = [] # us → device (BW side)
_orig_read = transport.read
_orig_write = transport.write
def tapped_read(n: int) -> bytes:
data = _orig_read(n)
if data:
raw_buf.append(data)
raw_rx_buf.append(data)
return data
def tapped_write(data: bytes) -> None:
_orig_write(data)
if data:
raw_tx_buf.append(data)
transport.read = tapped_read # type: ignore[method-assign]
transport.write = tapped_write # type: ignore[method-assign]
serial: Optional[str] = None
@@ -202,22 +254,34 @@ class AchSession:
session_dir = self.output_dir / f"ach_inbound_{ts}"
session_dir.mkdir(parents=True, exist_ok=True)
log_path = session_dir / f"session_{ts}.log"
raw_path = session_dir / f"raw_rx_{ts}.bin"
raw_rx_path = session_dir / f"raw_rx_{ts}.bin" # device → us (S3 side)
raw_tx_path = session_dir / f"raw_tx_{ts}.bin" # us → device (BW side)
# Flush buffered raw bytes to file and switch to direct file writes.
raw_fh = open(raw_path, "wb")
for chunk in raw_buf:
raw_fh.write(chunk)
raw_buf.clear()
# Flush buffered bytes to files and switch to direct file writes.
raw_rx_fh = open(raw_rx_path, "wb")
raw_tx_fh = open(raw_tx_path, "wb")
for chunk in raw_rx_buf:
raw_rx_fh.write(chunk)
for chunk in raw_tx_buf:
raw_tx_fh.write(chunk)
raw_rx_buf.clear()
raw_tx_buf.clear()
def tapped_read_file(n: int) -> bytes:
data = _orig_read(n)
if data:
raw_fh.write(data)
raw_fh.flush()
raw_rx_fh.write(data)
raw_rx_fh.flush()
return data
def tapped_write_file(data: bytes) -> None:
_orig_write(data)
if data:
raw_tx_fh.write(data)
raw_tx_fh.flush()
transport.read = tapped_read_file # type: ignore[method-assign]
transport.write = tapped_write_file # type: ignore[method-assign]
# Wire up file handler now that the session dir exists.
fh = logging.FileHandler(log_path, encoding="utf-8")
@@ -252,11 +316,20 @@ class AchSession:
state = _load_state(self.state_path)
unit_key = serial or self.peer # fall back to IP if no serial
unit_state = state.get(unit_key, {})
seen_keys: set[str] = set(unit_state.get("downloaded_keys", []))
# Highest event key ever downloaded from this unit (hex string, 8 chars).
# Used to detect post-erase key reuse — see comment block above.
# downloaded_events is the v2 (key_hex → timestamp_iso) dict.
# Empty-string timestamps are migrated v1 entries — they force a
# one-time re-download because the (key, timestamp) compare always
# mismatches against any non-empty timestamp from a fresh 0C read.
seen_events: dict[str, str] = dict(unit_state.get("downloaded_events", {}))
max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
if self.force_redownload:
log.info(" --force-redownload-all set — ignoring %d cached "
"(key, timestamp) entries for this session",
len(seen_events))
seen_events = {}
# Walk the event index (browse-mode, no 5A) to get the actual current
# key list. The SUB 08 event_count field is a lifetime "total events
# ever recorded" counter that does NOT decrement on erase — confirmed
@@ -269,11 +342,10 @@ class AchSession:
log.warning(" list_event_keys failed: %s -- falling back to full download", exc)
device_keys = None
# Use the walk result as our authoritative current count.
current_count = len(device_keys) if device_keys is not None else 0
log.info(" Unit has %d stored event(s); %d key(s) previously downloaded",
current_count, len(seen_keys))
log.info(" Unit has %d stored event(s); %d (key, ts) entr(ies) previously downloaded",
current_count, len(seen_events))
if device_keys is not None and current_count == 0:
log.info(" [OK] No events on device -- nothing to download")
@@ -281,75 +353,29 @@ class AchSession:
return
if device_keys is not None:
# ── Post-erase detection ──────────────────────────────────────
# After the device memory is erased, new events start from key
# 01110000 again — the same keys we already downloaded. Detect
# this by comparing the device's current highest key against the
# historical maximum. If the device has rolled back below our
# high-water mark, its counter was reset and we must treat all
# its keys as new, regardless of what seen_keys contains.
# ── Post-erase detection (best-effort, key-only signal) ───────
# After erase the device's key counter resets to 01110000.
# If the device's current max key is below our high-water mark
# we know erase happened. This catches the cleanest case but
# does NOT catch erase-then-record-many-events (where the new
# max may climb past the old max). The (key, timestamp) check
# in get_events() is what handles those.
if device_keys and max_seen_key != "00000000":
max_device_key = max(device_keys) # lexicographic; safe because
# keys share the same 4-char prefix
max_device_key = max(device_keys)
if max_device_key < max_seen_key:
log.info(
" Post-erase reset detected: "
"device max key %s < historical max %s "
"-- treating all device keys as new",
"-- discarding stale (key, ts) state for this session",
max_device_key, max_seen_key,
)
seen_keys = set() # discard stale dedup info for this session
seen_events = {}
new_key_set = set(device_keys) - seen_keys
log.info(" Device has %d key(s): %d new, %d already seen",
len(device_keys), len(new_key_set), len(device_keys) - len(new_key_set))
if not new_key_set:
log.info(" [OK] All events already downloaded -- nothing to do")
# Refresh state timestamp; preserve max_seen_key unchanged.
state[unit_key] = {
"downloaded_keys": sorted(seen_keys | set(device_keys)),
"max_downloaded_key": max_seen_key,
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
"peer": self.peer,
}
_save_state(self.state_path, state)
# ── Erase even when no new events (if requested) ──────────
# Blastware ACH always erases after every session — even when
# nothing new was downloaded. Without the erase the device
# still sees stored events in its memory and immediately
# retries the call-home, causing the looping we observed.
# Only erase when device actually has events stored; skip
# the erase if device_keys is empty (nothing to erase).
if self.clear_after_download and device_keys:
log.info(
" Clearing device memory (--clear-after-download, "
"no new events but device has %d stored)...",
len(device_keys),
)
try:
client.delete_all_events()
log.info(" [OK] Device memory cleared")
# Reset state so the next session starts fresh.
state[unit_key] = {
"downloaded_keys": [],
"max_downloaded_key": "00000000",
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
"peer": self.peer,
}
_save_state(self.state_path, state)
except Exception as exc:
log.error(
" [WARN] Event deletion failed: %s -- events NOT cleared",
exc,
)
log.info("Session complete (no new events) -> %s", session_dir)
return
else:
new_key_set = None # unknown; proceed with full download
# Note: no early-exit "all already downloaded" short-circuit
# here. Without per-event timestamps we cannot tell whether
# device_keys ⊆ seen_events.keys() actually means we have
# those physical events. get_events() will read 0C on its
# skip path and decide per event.
# Apply max_events cap
# stop_idx: when we know the count from list_event_keys, use it as
@@ -367,27 +393,67 @@ class AchSession:
)
try:
# Pass `seen_events` (key → ISO timestamp) so the client can
# read 0C on its skip path and only skip 5A when the per-event
# timestamp matches what we already have on disk. When force_-
# redownload is set, seen_events was already cleared above.
#
# Filter out empty-string timestamps (legacy v1 entries) — the
# client's 0C-on-skip-path only trusts entries with a
# populated timestamp; otherwise it falls through to a full
# 5A download.
skip_dict = {k: ts for k, ts in seen_events.items() if ts}
all_events = client.get_events(
full_waveform=True,
stop_after_index=stop_idx,
skip_waveform_for_keys=seen_keys if seen_keys else None,
skip_waveform_for_events=skip_dict if skip_dict else None,
)
# Filter to events whose keys we haven't saved before.
# New events are those that came back with _a5_frames populated
# (= 5A actually ran on this session). Skipped events have
# _a5_frames = None because the client matched (key, timestamp)
# against skip_dict and bypassed 5A.
new_events = [
e for e in all_events
if e._waveform_key is None
or e._waveform_key.hex() not in seen_keys
if getattr(e, "_a5_frames", None)
]
skipped = len(all_events) - len(new_events)
log.info(" [OK] Downloaded %d event(s): %d new, %d skipped (already seen)",
log.info(" [OK] Walked %d event(s): %d downloaded, %d skipped (matched (key, ts) in state)",
len(all_events), len(new_events), skipped)
if skipped:
log.info(" (skipped %d already-downloaded event(s))", skipped)
# ── Persist event file + A5 sidecar to the waveform store ──
# Saves ride alongside the existing JSON dump so the on-disk
# event file and events.json reference the same set of events.
waveform_records: dict[str, dict] = {}
for ev in new_events:
if not ev._a5_frames:
continue
try:
rec = self.store.save(
ev,
serial=serial or "UNKNOWN",
a5_frames=ev._a5_frames,
)
if ev._waveform_key is not None:
waveform_records[ev._waveform_key.hex()] = rec
log.info(
" [WAVE] saved %s (%d bytes)",
rec["filename"], rec["filesize"],
)
except Exception as exc:
key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
log.warning(
" [WARN] Waveform store save failed for %s: %s",
key_hex, exc,
)
if new_events:
_save_json(session_dir / "events.json", [_event_to_dict(e) for e in new_events])
_save_json(
session_dir / "events.json",
[_event_to_dict(e, waveform_records) for e in new_events],
)
for ev in new_events:
pv = ev.peak_values
@@ -446,7 +512,10 @@ class AchSession:
_session_start = datetime.datetime.now()
try:
_ev_ins, _ev_skip = self.db.insert_events(
new_events, serial=serial or self.peer, session_id=None
new_events,
serial=serial or self.peer,
session_id=None,
waveform_records=waveform_records,
)
_ml_ins, _ml_skip = self.db.insert_monitor_log(
new_monitor_entries, session_id=None
@@ -481,35 +550,64 @@ class AchSession:
)
# ── Update persistent state ───────────────────────────────────
# Include both triggered-event keys and monitor-log keys in the
# downloaded set so they are not re-processed on the next call-home.
current_event_keys = [
e._waveform_key.hex()
for e in all_events
if e._waveform_key is not None
]
current_monitor_keys = [e.key for e in new_monitor_entries]
current_keys = current_event_keys + current_monitor_keys
# Build a fresh (key → ISO timestamp) map from THIS session's
# results. For each event currently on the device, prefer the
# timestamp we just observed (from 0C); fall back to whatever
# was already in seen_events for that key (so we don't lose an
# entry just because get_events skipped it on the (key, ts)
# match path).
def _ts_iso(ev) -> str:
ts = getattr(ev, "timestamp", None)
if ts is None:
return ""
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
current_events_map: dict[str, str] = {}
for ev in all_events:
if ev._waveform_key is None:
continue
key_hex = ev._waveform_key.hex()
ts_iso = _ts_iso(ev) or seen_events.get(key_hex, "")
current_events_map[key_hex] = ts_iso
# Monitor-log entries don't have a 0C-style timestamp, but
# they DO have a start_time; use that so the monitor-log keys
# are properly entered into the (key, ts) map.
for ml in new_monitor_entries:
key_hex = ml.key
ts = ml.start_time
ts_iso = ts.isoformat() if ts else seen_events.get(key_hex, "")
# If a triggered event already populated this key, keep
# whichever has a non-empty timestamp.
if key_hex not in current_events_map or not current_events_map[key_hex]:
current_events_map[key_hex] = ts_iso
if erased_successfully:
# Device memory is clear. Reset downloaded_keys and the
# high-water mark so the next call-home starts fresh and
# doesn't mis-identify the recycled key 01110000 as "seen".
updated_keys = []
updated_events: dict[str, str] = {}
new_max_key = "00000000"
log.info(
" State reset after erase -- next session will download "
"from key 0 (device counter resets after erase)"
)
else:
# Normal (no erase): union of previously-seen + all keys on
# device now. Includes already-seen survivors so we never
# re-download them if the device somehow keeps old records.
updated_keys = sorted(set(seen_keys) | set(current_keys))
new_max_key = updated_keys[-1] if updated_keys else max_seen_key
# Merge: keep prior (key, ts) entries we still have evidence
# of (for survivors of any partial failure), plus this
# session's authoritative (key, ts) pairs.
updated_events = dict(seen_events)
updated_events.update(current_events_map)
new_max_key = (
max(updated_events.keys())
if updated_events else max_seen_key
)
state[unit_key] = {
"downloaded_keys": updated_keys,
"downloaded_events": updated_events,
"max_downloaded_key": new_max_key,
"last_seen": datetime.datetime.now().isoformat(),
"serial": serial,
@@ -530,7 +628,8 @@ class AchSession:
log.warning(" [WARN] Failed to restart monitoring: %s", exc)
finally:
raw_fh.close()
raw_rx_fh.close()
raw_tx_fh.close()
client.close() # closes transport / socket cleanly
root_logger.removeHandler(fh)
fh.close()
@@ -561,7 +660,8 @@ def _device_info_to_dict(d: DeviceInfo) -> dict:
"record_time": cc.record_time if cc else None,
"trigger_level_geo": cc.trigger_level_geo if cc else None,
"alarm_level_geo": cc.alarm_level_geo if cc else None,
"max_range_geo": cc.max_range_geo if cc else None,
"geo_adc_scale": cc.geo_adc_scale if cc else None, # hw scale factor (in/s)/V
"geo_range": cc.geo_range if cc else None, # 0x01=Normal 10in/s, 0x00=Sensitive 1.25in/s (unconfirmed)
"project": cc.project if cc else None,
"client": cc.client if cc else None,
"operator": cc.operator if cc else None,
@@ -569,7 +669,10 @@ def _device_info_to_dict(d: DeviceInfo) -> dict:
}
def _event_to_dict(e: Event) -> dict:
def _event_to_dict(
e: Event,
waveform_records: Optional[dict[str, dict]] = None,
) -> dict:
pv = e.peak_values
pi = e.project_info
peaks = {}
@@ -588,6 +691,11 @@ def _event_to_dict(e: Event) -> dict:
for ch, vals in e.raw_samples.items()
}
samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
rec: dict = {}
if waveform_records and e._waveform_key is not None:
rec = waveform_records.get(e._waveform_key.hex(), {}) or {}
return {
"timestamp": str(e.timestamp) if e.timestamp else None,
"project": pi.project if pi else None,
@@ -596,6 +704,9 @@ def _event_to_dict(e: Event) -> dict:
"sensor_location": pi.sensor_location if pi else None,
"peaks": peaks,
"raw_samples_preview": samples,
"blastware_filename": rec.get("filename"),
"blastware_filesize": rec.get("filesize"),
"a5_pickle_filename": rec.get("a5_pickle_filename"),
}
@@ -617,6 +728,7 @@ def serve(args: argparse.Namespace) -> None:
output_dir.mkdir(parents=True, exist_ok=True)
state_path = output_dir / "ach_state.json"
db = SeismoDb(output_dir / "seismo_relay.db")
store = WaveformStore(output_dir / "waveforms")
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
@@ -634,6 +746,7 @@ def serve(args: argparse.Namespace) -> None:
print(f" Max events per session: {max_ev if max_ev else 'unlimited'}")
print(f" Clear device after download: {'YES' if args.clear_after_download else 'no'}")
print(f" Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
print(f" Force re-download all (ignore state): {'YES' if args.force_redownload_all else 'no'}")
print(f"{'='*60}")
print(f"\n Point your test unit's ACEmanager call-home settings to:")
print(f" Remote Host: <this machine's LAN IP>")
@@ -671,8 +784,10 @@ def serve(args: argparse.Namespace) -> None:
max_events=max_ev,
state_path=state_path,
db=db,
store=store,
clear_after_download=args.clear_after_download,
restart_monitoring=args.restart_monitoring,
force_redownload=args.force_redownload_all,
)
t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
t.start()
@@ -757,6 +872,17 @@ def parse_args() -> argparse.Namespace:
"This mirrors the standard Blastware ACH workflow."
),
)
p.add_argument(
"--force-redownload-all",
action="store_true",
default=False,
help=(
"Manual override: ignore ach_state.json's downloaded_events map "
"for this session and re-download every event currently on the "
"device, regardless of (key, timestamp) match. Useful when state "
"has become inconsistent with the on-disk waveform store / DB."
),
)
p.add_argument(
"--verbose", "-v",
action="store_true",
+34 -29
View File
@@ -58,16 +58,24 @@ class BridgeGUI(tk.Tk):
tk.Entry(self, textvariable=self.logdir_var, width=24).grid(row=1, column=3, sticky="we", **pad)
tk.Button(self, text="Browse", command=self._choose_dir).grid(row=1, column=4, sticky="w", **pad)
# Row 2: Raw taps
self.raw_bw_var = tk.StringVar(value="")
self.raw_s3_var = tk.StringVar(value="")
tk.Checkbutton(self, text="Save BW->S3 raw", command=self._toggle_raw_bw, onvalue="1", offvalue="").grid(row=2, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_bw_var, width=28).grid(row=2, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_var, "bw")).grid(row=2, column=4, **pad)
# Row 2: Raw taps — ON by default; "auto" = timestamped name; blank checkbox = disabled
self.raw_bw_enabled = tk.IntVar(value=1)
self.raw_s3_enabled = tk.IntVar(value=1)
# Path fields: empty means "auto" (bridge picks a timestamped name)
self.raw_bw_path_var = tk.StringVar(value="")
self.raw_s3_path_var = tk.StringVar(value="")
tk.Checkbutton(self, text="Save S3->BW raw", command=self._toggle_raw_s3, onvalue="1", offvalue="").grid(row=3, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_s3_var, width=28).grid(row=3, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_var, "s3")).grid(row=3, column=4, **pad)
tk.Checkbutton(self, text="BW→S3 raw (auto)", variable=self.raw_bw_enabled,
command=self._toggle_raw_bw).grid(row=2, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_bw_path_var, width=28,
fg="grey").grid(row=2, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_bw_path_var, "bw")).grid(row=2, column=4, **pad)
tk.Checkbutton(self, text="S3→BW raw (auto)", variable=self.raw_s3_enabled,
command=self._toggle_raw_s3).grid(row=3, column=0, sticky="w", **pad)
tk.Entry(self, textvariable=self.raw_s3_path_var, width=28,
fg="grey").grid(row=3, column=1, columnspan=3, sticky="we", **pad)
tk.Button(self, text="...", command=lambda: self._choose_file(self.raw_s3_path_var, "s3")).grid(row=3, column=4, **pad)
# Row 4: Status + buttons
self.status_var = tk.StringVar(value="Idle")
@@ -102,13 +110,11 @@ class BridgeGUI(tk.Tk):
var.set(filename)
def _toggle_raw_bw(self) -> None:
if not self.raw_bw_var.get():
# default name
self.raw_bw_var.set(os.path.join(self.logdir_var.get(), "raw_bw.bin"))
# Checkbox toggled — no path action needed; enabled state drives the flag.
pass
def _toggle_raw_s3(self) -> None:
if not self.raw_s3_var.get():
self.raw_s3_var.set(os.path.join(self.logdir_var.get(), "raw_s3.bin"))
pass
def start_bridge(self) -> None:
if self.process and self.process.poll() is None:
@@ -126,23 +132,22 @@ class BridgeGUI(tk.Tk):
args = [sys.executable, BRIDGE_PATH, "--bw", bw, "--s3", s3, "--baud", baud, "--logdir", logdir]
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
# Raw tap flags.
# Checkbox on + empty path → pass "auto" (bridge generates timestamped name).
# Checkbox on + explicit path → pass that path.
# Checkbox off → pass "" to disable (overrides bridge's auto default).
raw_bw_explicit = self.raw_bw_path_var.get().strip()
raw_s3_explicit = self.raw_s3_path_var.get().strip()
raw_bw = self.raw_bw_var.get().strip()
raw_s3 = self.raw_s3_var.get().strip()
if self.raw_bw_enabled.get():
args += ["--raw-bw", raw_bw_explicit if raw_bw_explicit else "auto"]
else:
args += ["--raw-bw", ""] # explicit disable
# If the user left the default generic name, replace with a timestamped one
# so each session gets its own file.
if raw_bw:
if os.path.basename(raw_bw) in ("raw_bw.bin", "raw_bw"):
raw_bw = os.path.join(os.path.dirname(raw_bw) or logdir, f"raw_bw_{ts}.bin")
self.raw_bw_var.set(raw_bw)
args += ["--raw-bw", raw_bw]
if raw_s3:
if os.path.basename(raw_s3) in ("raw_s3.bin", "raw_s3"):
raw_s3 = os.path.join(os.path.dirname(raw_s3) or logdir, f"raw_s3_{ts}.bin")
self.raw_s3_var.set(raw_s3)
args += ["--raw-s3", raw_s3]
if self.raw_s3_enabled.get():
args += ["--raw-s3", raw_s3_explicit if raw_s3_explicit else "auto"]
else:
args += ["--raw-s3", ""] # explicit disable
try:
self.process = subprocess.Popen(
+88 -13
View File
@@ -93,8 +93,11 @@ class SessionLogger:
self._bin_fh = open(bin_path, "ab", buffering=0)
self._lock = threading.Lock()
# Optional pure-byte taps (no headers). BW=Blastware tx, S3=device tx.
# These can be opened/closed on demand via start_raw_capture/stop_raw_capture.
self._raw_bw = open(raw_bw_path, "ab", buffering=0) if raw_bw_path else None
self._raw_s3 = open(raw_s3_path, "ab", buffering=0) if raw_s3_path else None
self._cap_bw_path: Optional[str] = raw_bw_path
self._cap_s3_path: Optional[str] = raw_s3_path
def log_line(self, line: str) -> None:
with self._lock:
@@ -124,6 +127,43 @@ class SessionLogger:
self.log_line(f"[{ts}] [INFO] {msg}")
self.bin_write_record(REC_INFO, msg.encode("utf-8", errors="replace"))
def start_raw_capture(self, label: str, logdir: str) -> tuple:
"""Open new raw tap files for a named capture. Returns (bw_path, s3_path)."""
ts = _dt.datetime.now().strftime("%Y%m%d_%H%M%S")
safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in label)[:40] if label else ""
suffix = f"_{safe}" if safe else ""
bw_path = os.path.join(logdir, f"raw_bw_{ts}{suffix}.bin")
s3_path = os.path.join(logdir, f"raw_s3_{ts}{suffix}.bin")
with self._lock:
# Close any previously open taps first
if self._raw_bw:
self._raw_bw.close()
if self._raw_s3:
self._raw_s3.close()
self._raw_bw = open(bw_path, "ab", buffering=0)
self._raw_s3 = open(s3_path, "ab", buffering=0)
self._cap_bw_path = bw_path
self._cap_s3_path = s3_path
self.log_info(f"raw capture started: label={label!r} bw={bw_path} s3={s3_path}")
return bw_path, s3_path
def stop_raw_capture(self) -> tuple:
"""Close raw tap files. Returns (bw_path, s3_path) for the capture just closed."""
with self._lock:
bw = self._cap_bw_path
s3 = self._cap_s3_path
if self._raw_bw:
self._raw_bw.close()
self._raw_bw = None
if self._raw_s3:
self._raw_s3.close()
self._raw_s3 = None
self._cap_bw_path = None
self._cap_s3_path = None
if bw:
self.log_info(f"raw capture stopped: bw={bw} s3={s3}")
return bw, s3
def close(self) -> None:
with self._lock:
try:
@@ -291,8 +331,18 @@ def forward_loop(
time.sleep(0.002)
def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
print("[MARK] Type 'm' + Enter to annotate the capture. Ctrl+C to stop.")
def annotation_loop(logger: SessionLogger, logdir: str, stop: threading.Event) -> None:
"""
Reads stdin commands while the bridge runs.
Commands:
m — prompt for a mark label (interactive)
CAP_START:<label> — begin a raw tap capture with the given label
CAP_STOP — stop the current raw tap capture
Responses (printed to stdout, parsed by the GUI):
[CAP_START] <bw_path>\\t<s3_path>
[CAP_STOP] <bw_path>\\t<s3_path>
"""
while not stop.is_set():
try:
line = input()
@@ -303,7 +353,21 @@ def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
if not line:
continue
if line.lower() == "m":
if line.startswith("CAP_START:"):
label = line[10:].strip()
bw_path, s3_path = logger.start_raw_capture(label, logdir)
print(f"[CAP_START] {bw_path}\t{s3_path}")
sys.stdout.flush()
elif line == "CAP_STOP":
bw_path, s3_path = logger.stop_raw_capture()
if bw_path:
print(f"[CAP_STOP] {bw_path}\t{s3_path}")
else:
print("[CAP_STOP] no active capture")
sys.stdout.flush()
elif line.lower() == "m":
try:
sys.stdout.write(" Label: ")
sys.stdout.flush()
@@ -315,8 +379,9 @@ def annotation_loop(logger: SessionLogger, stop: threading.Event) -> None:
print(f" [MARK written] {label}")
else:
print(" (empty label — mark cancelled)")
else:
print(" (type 'm' + Enter to annotate)")
print(f" (unknown command: {line!r})")
def main() -> int:
@@ -325,8 +390,14 @@ def main() -> int:
ap.add_argument("--s3", default="COM5", help="S3-side COM port (default: COM5)")
ap.add_argument("--baud", type=int, default=38400, help="Baud rate (default: 38400)")
ap.add_argument("--logdir", default=".", help="Directory to write session logs into (default: .)")
ap.add_argument("--raw-bw", default=None, help="Optional file to append raw bytes sent from BW->S3 (no headers)")
ap.add_argument("--raw-s3", default=None, help="Optional file to append raw bytes sent from S3->BW (no headers)")
ap.add_argument("--raw-bw", default="auto",
help="File to append raw bytes sent from BW->S3 (no headers). "
"Default 'auto' generates a timestamped name in --logdir. "
"Pass an empty string to disable.")
ap.add_argument("--raw-s3", default="auto",
help="File to append raw bytes sent from S3->BW (no headers). "
"Default 'auto' generates a timestamped name in --logdir. "
"Pass an empty string to disable.")
ap.add_argument("--quiet", action="store_true", help="No console heartbeat output")
ap.add_argument("--status-every", type=float, default=0.0, help="Seconds between console heartbeat lines (default: 0 = off)")
args = ap.parse_args()
@@ -349,12 +420,16 @@ def main() -> int:
# If raw tap flags were passed without a path (bare --raw-bw / --raw-s3),
# or if the sentinel value "auto" is used, generate a timestamped name.
# If a specific path was provided, use it as-is (caller's responsibility).
raw_bw_path = args.raw_bw
raw_s3_path = args.raw_s3
if raw_bw_path in (None, "", "auto"):
raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin") if args.raw_bw is not None else None
if raw_s3_path in (None, "", "auto"):
raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin") if args.raw_s3 is not None else None
# Resolve raw tap paths.
# "auto" (default) → timestamped file in logdir (always captured).
# Explicit path → use verbatim.
# None or "" → disabled (pass --raw-bw "" to suppress capture).
raw_bw_path: Optional[str] = args.raw_bw if args.raw_bw else None
raw_s3_path: Optional[str] = args.raw_s3 if args.raw_s3 else None
if raw_bw_path == "auto":
raw_bw_path = os.path.join(args.logdir, f"raw_bw_{ts}.bin")
if raw_s3_path == "auto":
raw_s3_path = os.path.join(args.logdir, f"raw_s3_{ts}.bin")
logger = SessionLogger(log_path, bin_path, raw_bw_path=raw_bw_path, raw_s3_path=raw_s3_path)
@@ -391,7 +466,7 @@ def main() -> int:
t_ann = threading.Thread(
target=annotation_loop,
name="Annotator",
args=(logger, stop),
args=(logger, args.logdir, stop),
daemon=True,
)
File diff suppressed because it is too large Load Diff
+10 -2
View File
@@ -21,7 +21,15 @@ Typical usage (TCP / modem):
from .client import MiniMateClient
from .models import DeviceInfo, Event, MonitorLogEntry
from .transport import SerialTransport, TcpTransport
from .transport import CapturingTransport, SerialTransport, TcpTransport
__version__ = "0.1.0"
__all__ = ["MiniMateClient", "DeviceInfo", "Event", "MonitorLogEntry", "SerialTransport", "TcpTransport"]
__all__ = [
"MiniMateClient",
"DeviceInfo",
"Event",
"MonitorLogEntry",
"SerialTransport",
"TcpTransport",
"CapturingTransport",
]
+974
View File
@@ -0,0 +1,974 @@
"""
blastware_file.py Blastware binary file codec for bidirectional interoperability.
Reads and writes the proprietary Instantel/Blastware file formats:
Waveform events (.CE0W, .VM0H, .440, .7M0, etc.) (extension encoding UNKNOWN see below)
.MLG Monitor log (monitoring session history)
All waveform formats share a common 22-byte file header prefix and identical
internal binary structure (same type tag 00 12 03 00, same STRT record layout).
Blastware identifies the file type by extension, not by a magic marker.
EXTENSION ENCODING V10.72 firmware FULLY CONFIRMED 2026-04-22:
Direct / manual download: AB0 (3-char, no type character)
Call-home (ACH) download: AB0W or AB0H (4-char, W=waveform H=histogram)
AB = 2-char base-36 of (total_seconds % 1296), where
total_seconds = (event_local_time 1985-01-01T00:00:00_local).
0 = always literal digit zero.
Verified against 3,248 call-home files from a 10-year production archive.
The 10-year archive contains only ACH files (all end in W or H).
Manual Blastware downloads produce 3-char AB0 extensions same encoding
but without the trailing type character.
Old firmware (S338, 3-char extensions): encoding unknown / same as manual?
Micromate Series 4 uses a different scheme (literal datetime in filename).
File structure overview
Waveform file structure (confirmed from example-events/4-3-26-multi/M529LIY6 (example event)):
[22B header] [21B STRT record] [body bytes] [26B footer]
Header (22 bytes):
10 00 01 80 00 00 fixed prefix
49 6e 73 74 61 6e 74 65 6c 00 b'Instantel\x00'
07 2c fixed
00 12 03 00 waveform file type tag (shared by all waveform extensions)
STRT record (21 bytes, immediately follows header):
53 54 52 54 b'STRT'
ff fe fixed (2 bytes)
[key4] 4-byte waveform event key
[key4] 4-byte waveform event key (repeated)
[zeros] 7 bytes padding
[rectime] uint8 record time in seconds
Body (variable reconstructed from A5 frame data):
The body bytes are derived from the raw A5 frame wire content, specifically
from the DLE-decoded representation of each frame's contribution. See the
_frame_body_bytes() helper for the exact algorithm.
Footer (26 bytes):
0e 08
[ts1: 8B big-endian timestamp] start timestamp
[ts2: 8B big-endian timestamp] stop timestamp
00 01 00 02 00 00
[crc: 2B] CRC (algorithm unconfirmed; written as 0x00 0x00 placeholder)
Timestamp format (big-endian, 8 bytes):
[day] [month] [year_HI] [year_LO] [0x00] [hour] [min] [sec]
MLG (monitor log, confirmed from example-events/4-3-26-multi/BE11529.MLG):
[308B header] [N × 292B records]
Header (308 bytes):
Offset 0x00: 10 00 01 80 00 00 Instantel\x00 07 2c 22 01 0e a0 fixed (16B)
Offset 0x10: ... (unknown structure, written as zeros + serial)
Offset 0x2A: serial number (8 bytes, null-padded ASCII, e.g. "BE11529")
... zero-padded to 308 bytes total
Record (292 bytes each):
[2B CRC] unknown algorithm; written as 0x00 0x00
22 01 0e 80 record marker
[ts1: 8B big-endian timestamp] start time
[ts2: 8B big-endian timestamp] stop time (zeros if no stop)
[4B flags] see MLG_FLAGS_* constants below
[10B serial] null-padded serial number ASCII
[text] for trigger records: [0x08][8B ts1_copy] then ASCII "Geo: X.XXX in/s"
for monitoring records: b'' (or minimal separator)
[zero-padded to 292 bytes]
Critical implementation notes
Waveform body reconstruction algorithm (confirmed 2026-04-21 from verification against
M529LIY6 (example event) using raw_s3_20260403_153508.bin capture):
The waveform body bytes come from the A5 frame content, stripped of DLE-framing
artifacts. Each A5 frame contributes a different slice of its data section,
with DLE+{0x02,0x03,0x04} byte pairs stripped.
Skip amounts per frame index (offsets into frame.data):
A5[0] (probe): data[strt_pos + 21 + 7] (skip header + STRT record)
strt_pos found by searching frame.data[7:] for b'STRT';
the contribution starts at strt_pos + 21 within data[7:]
which equals strt_pos + 21 + 7 within frame.data.
A5[1]: data[13] (skip 7-byte frame.data prefix + 6 header bytes)
A5[2..N]: data[12] (skip 7-byte frame.data prefix + 5 header bytes)
Terminator A5: data[11] (1 byte less than chunk frames; terminator inner header
is 4 bytes instead of 5 confirmed 2026-04-21)
DLE strip rule (applied AFTER slicing):
Strip any 0x10 byte that is immediately followed by 0x02, 0x03, or 0x04.
This undoes the DLE-escape that S3FrameParser preserves as literal pairs.
Applied to frame.data[skip:] + bytes([frame.chk_byte]) together, then
conditionally exclude the trailing chk_byte from the output.
chk_byte absorption:
When frame.data[-1] == 0x10 AND frame.chk_byte {0x02, 0x03, 0x04},
the last byte of frame.data is the DLE prefix of a split DLE+chk pair.
Including chk_byte in the strip buffer allows the pair to be stripped as
a unit. After stripping, the trailing chk_byte is ALWAYS removed because
_strip_inner_frame_dles keeps the byte after the DLE (the chk_byte value),
and that value is the checksum, never payload. This applies to all three
cases (chk {0x02, 0x03, 0x04}) identically.
MLG CRC:
The algorithm that produces the 2-byte CRC at the start of each MLG record
is unknown. All examined records use non-zero values that do not match
CRC-16/CCITT, CRC-16/IBM, CRC-32 (truncated), word sums, XOR variants, or
any of the 40+ polynomial/init combinations tested. The writer emits 0x0000.
This produces files that Blastware may reject or display without the CRC check
the exact impact on BW import is unknown (TODO: test).
Public API
blastware_filename(event, serial)
Return the correct Blastware filename for an event (e.g. "M529LIY6.CE0W").
Full AB0T extension encoding confirmed 2026-04-22 against 3,248 archive files.
Extension matches what Blastware itself would generate for the same event.
write_blastware_file(event, a5_frames, path)
Create a Blastware waveform file from an Event and the full A5 frame list.
All waveform extensions share the same binary format the extension is set
by blastware_filename() based on the event timestamp and type.
read_blastware_file(path) Event
Parse a Blastware waveform file into an Event object with waveform data populated.
(Not yet implemented placeholder raises NotImplementedError.)
write_mlg(entries, serial, path)
Create a .MLG file from a list of MonitorLogEntry objects.
read_mlg(path) list[MonitorLogEntry]
Parse a .MLG file into MonitorLogEntry objects.
(Not yet implemented placeholder raises NotImplementedError.)
"""
from __future__ import annotations
import datetime
import logging
import struct
from pathlib import Path
from typing import Optional, Union
from .framing import S3Frame
from .models import Event, MonitorLogEntry, Timestamp
log = logging.getLogger(__name__)
# ── File header constants ─────────────────────────────────────────────────────
# Common 16-byte prefix shared by waveform files and MLG (confirmed from binary inspection).
_FILE_HEADER_PREFIX = bytes.fromhex("1000018000004973") + b"tantel\x00\x07\x2c"
# = 10 00 01 80 00 00 49 73 74 61 6e 74 65 6c 00 07 2c (17 bytes)
# Confirmed breakdown: 10 00 01 80 00 00 = fixed; "Instantel\x00" = 10B; 07 2c = fixed
# Simpler construction:
_FILE_HEADER_PREFIX = b"\x10\x00\x01\x80\x00\x00Instantel\x00\x07\x2c" # 17 bytes
# Waveform file type tag (4 bytes after common prefix) — shared by ALL waveform extensions
_WAVEFORM_TYPE_TAG = b"\x00\x12\x03\x00" # confirmed from M529LIY6 (example event) — same tag for .CE0W, .VM0H, etc.
# MLG type tag (4 bytes after common prefix)
_MLG_TYPE_TAG = b"\x22\x01\x0e\xa0" # confirmed from BE11529.MLG offset 0x11..0x14
# Total header sizes
_WAVEFORM_HEADER_SIZE = 22 # 17 + 4 = 21... wait. Let me recalculate.
# From binary: first 22 bytes = header, then STRT at byte 22.
# 17-byte common prefix + 4-byte type tag = 21 bytes. But observed header is 22B.
# Checking: 6 fixed + 10 "Instantel\x00" + 2 "07 2c" = 18B prefix, then 4B type tag = 22B.
# Re-count: b"\x10\x00\x01\x80\x00\x00" = 6B + b"Instantel\x00" = 10B + b"\x07\x2c" = 2B = 18B prefix.
_FILE_HEADER_PREFIX = b"\x10\x00\x01\x80\x00\x00Instantel\x00\x07\x2c" # 18 bytes
_WAVEFORM_HEADER_SIZE = 22 # 18 + 4 = 22 bytes ✅
_MLG_HEADER_SIZE = 308 # confirmed from BE11529.MLG
# MLG record marker (4 bytes after 2-byte CRC at start of each record)
_MLG_RECORD_MARKER = b"\x22\x01\x0e\x80"
_MLG_RECORD_SIZE = 292 # bytes per record (confirmed from BE11529.MLG)
# MLG record flags (4 bytes at record[22:26])
# Confirmed from BE11529.MLG binary inspection:
MLG_FLAGS_START_ONLY = b"\xff\xff\x00\x00" # monitoring start with no stop
MLG_FLAGS_TRIGGER = b"\x01\x00\x02\x00" # triggered event (has ts1 + ts2)
MLG_FLAGS_INTERVAL = b"\x02\x00\x00\x00" # monitoring interval (has ts1 + ts2)
# ── Timestamp helpers ─────────────────────────────────────────────────────────
def _encode_ts_be(ts: Optional[datetime.datetime]) -> bytes:
"""
Encode a datetime as an 8-byte big-endian Blastware timestamp.
Format (waveform file and MLG record timestamps):
[day][month][year_HI][year_LO][0x00][hour][min][sec]
Big-endian year confirmed from M529LIY6 (example event) footer:
footer bytes [2..9] = 01 04 07 ea 00 00 1c 08
day=1 month=4 year=0x07ea=2026 hour=0 min=28 sec=8
Returns 8 zero bytes if ts is None.
"""
if ts is None:
return bytes(8)
return bytes([
ts.day,
ts.month,
(ts.year >> 8) & 0xFF,
ts.year & 0xFF,
0x00,
ts.hour,
ts.minute,
ts.second,
])
def _decode_ts_be(raw: bytes) -> Optional[datetime.datetime]:
"""
Decode an 8-byte big-endian Blastware timestamp.
Returns None if the bytes are all zero or structurally invalid.
"""
if len(raw) < 8 or raw == bytes(8):
return None
day = raw[0]
month = raw[1]
year = (raw[2] << 8) | raw[3]
hour = raw[5]
minute = raw[6]
sec = raw[7]
try:
return datetime.datetime(year, month, day, hour, minute, sec)
except ValueError:
return None
def _ts_from_model(ts: Optional[Timestamp]) -> Optional[datetime.datetime]:
"""Convert a models.Timestamp to datetime.datetime, or None."""
if ts is None:
return None
try:
return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)
except (ValueError, TypeError):
return None
# ── DLE strip helper ──────────────────────────────────────────────────────────
def _strip_inner_frame_dles(data: bytes) -> bytes:
"""
Strip DLE (0x10) framing markers from A5 inner-frame content.
The A5 (bulk waveform stream) response body contains DLE-encoded sub-frame
structure. S3FrameParser preserves DLE+XX pairs as two literal bytes in
frame.data. Only the DLE marker byte needs to be removed; the following
byte is actual payload content.
Rule: when 0x10 is immediately followed by {0x02, 0x03, 0x04}, strip the
0x10 (DLE marker) and keep the following byte as payload.
Lone 0x10 bytes not followed by {0x02, 0x03, 0x04} are kept as-is.
Confirmed correct by verifying reconstructed waveform body against M529LIY6 (example event):
- 0x10 0x02 in terminator 0x02 kept
- 0x10 0x04 in terminator (month byte) 0x04 kept
"""
out = bytearray()
i = 0
while i < len(data):
b = data[i]
if b == 0x10 and i + 1 < len(data) and data[i + 1] in {0x02, 0x03, 0x04}:
# Strip the DLE marker; the next byte is payload and will be appended
# in the next loop iteration.
i += 1
continue
out.append(b)
i += 1
return bytes(out)
def _frame_body_bytes(frame: S3Frame, skip: int) -> bytes:
"""
Extract the waveform body contribution from one A5 S3Frame.
The contribution is frame.data[skip:] with inner-frame DLE pairs stripped
per _strip_inner_frame_dles(). The chk_byte is temporarily appended before
stripping to handle the split-pair edge case where a DLE at the end of
frame.data is paired with chk_byte.
Split-pair edge case (confirmed for A5[8] of M529LIY6 (example event), 2026-04-21):
S3FrameParser appends DLE+XX pairs as two literal bytes when XX {DLE, ETX}.
When the LAST occurrence of such a pair straddles the payload/checksum boundary
(i.e., DLE is the last byte of raw_payload and XX is the checksum), the parser
splits them:
- DLE ends up as the last byte of frame.data (frame.data[-1] == 0x10)
- XX is stored as frame.chk_byte
To strip the pair correctly, we reunite the bytes before calling the strip
function. Since chk_byte is the checksum (not payload data), it is excluded
from the final output regardless of whether it was part of a pair.
Post-strip chk_byte removal (ALL cases):
_strip_inner_frame_dles strips the 0x10 and KEEPS chk_byte in all cases.
Chk_byte is always the checksum (not payload), so always strip it off.
Args:
frame: S3Frame with frame.data and frame.chk_byte populated.
skip: Number of leading bytes in frame.data to exclude (frame header).
Returns:
bytes the waveform body contribution for this frame.
"""
if skip >= len(frame.data):
return b""
relevant = frame.data[skip:]
# Detect split DLE+chk pair at the frame boundary.
has_split_pair = (
len(relevant) > 0
and relevant[-1] == 0x10
and frame.chk_byte in {0x02, 0x03, 0x04}
)
if has_split_pair:
# Reunite the split pair so the strip function sees both bytes together.
buf = relevant + bytes([frame.chk_byte])
stripped = _strip_inner_frame_dles(buf)
# _strip_inner_frame_dles strips the DLE (0x10) and KEEPS chk_byte.
# chk_byte is the received checksum — never payload — so remove it.
# This is correct for all values in {0x02, 0x03, 0x04}.
if stripped:
stripped = stripped[:-1]
return stripped
else:
return _strip_inner_frame_dles(relevant)
# ── Filename helper ───────────────────────────────────────────────────────────
_INSTANTEL_EPOCH = datetime.datetime(1985, 1, 1, 0, 0, 0)
"""
Instantel timestamp epoch January 1, 1985, 00:00:00 local time.
Confirmed 2026-04-21: stem values for 6 independent events (April 19, 2026)
all converge to this epoch when decoded as floor(seconds_since_epoch / 1296).
1985 is the year Instantel was founded.
"""
_STEM_UNIT_SEC = 1296 # = 36^2 seconds ≈ 21.6 minutes per stem unit
_STEM_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# ── Waveform file extension encoding ─────────────────────────────────────────
#
# NEW FIRMWARE (V10.72+) — FULLY DECODED (confirmed 2026-04-21, 10-year archive):
#
# Extension format: AB0T (4 characters)
# AB = 2-char base-36 encoding of (seconds_since_epoch % 1296)
# i.e. the number of seconds into the current 21.6-minute stem window
# Range: 0 ("00") to 1295 ("ZZ")
# 0 = always literal '0'
# T = event type: 'W' = Full Waveform, 'H' = Full Histogram
#
# Combined with the 4-char stem (which encodes seconds_since_epoch // 1296),
# the FULL filename gives a second-resolution timestamp:
# total_seconds = stem_val * 1296 + ab_val
# timestamp = EPOCH + timedelta(seconds=total_seconds)
#
# Verified against three S353L4H0 events (all three match to the second):
# S353L4H0.3M0W Full Waveform 2025-06-23 13:57:22 AB=3M=130 ✓
# S353L4H0.8S0H Full Histogram 2025-06-23 14:00:28 AB=8S=316 ✓
# S353L4H0.9X0W Full Waveform 2025-06-23 14:01:09 AB=9X=357 ✓
#
# OLD FIRMWARE (S338, 3-char extensions ending in '0') — UNKNOWN:
# Observed (old firmware / manual downloads): .440, .470, .7M0, .9T0, .EI0, etc.
# The V10.72 formula does NOT apply to these.
# Extension is NOT recording mode (refuted 2026-04-21: continuous → .EI0, not .9T0).
# blastware_filename() computes the correct AB0 extension for V10.72 firmware.
#
# WRONG earlier assumption (do not re-introduce):
# Extension was believed to encode recording mode × sample rate.
# Refuted by continuous-mode event producing .EI0 instead of .9T0.
def _make_stem(ts_local: datetime.datetime) -> str:
"""
Encode a local timestamp as a 4-character uppercase base-36 stem.
Algorithm (confirmed 2026-04-21 from 6 known file/timestamp pairs):
stem_int = floor((ts_local - Jan_1_1985_midnight_local) / 1296_seconds)
stem = 4-char uppercase base-36 encoding of stem_int
Unit = 36² = 1296 seconds 21.6 minutes. Events within the same 1296-second
window receive the same stem; their extension distinguishes them.
"""
delta_sec = int((ts_local - _INSTANTEL_EPOCH).total_seconds())
n = delta_sec // _STEM_UNIT_SEC
s = ""
for _ in range(4):
s = _STEM_CHARS[n % 36] + s
n //= 36
return s
def blastware_filename(event: Event, serial: str, ach: bool = False) -> str:
"""
Return the correct Blastware filename for an event.
CONFIRMED 2026-04-22 verified against 3,248 files from a 10-year archive.
Filename format: <prefix_letter><serial3><stem><AB>0[T]
where:
prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))
encodes the production generation (batch of 1000 units)
e.g. BE6907H, BE11529M, BE14036P, BE18003T
serial3 = f"{serial_numeric % 1000:03d}"
last 3 digits of numeric serial, zero-padded
stem = 4-char base-36 of floor(total_seconds / 1296)
encodes which 21.6-minute window the event fell in
AB = 2-char base-36 of (total_seconds % 1296)
encodes seconds within the window (01295)
0 = always literal digit zero
T = 'W' or 'H' ONLY appended for call-home (ACH) downloads (ach=True).
Manual / direct downloads produce a 3-char extension (AB0) with no type char.
Call-home downloads produce a 4-char extension (AB0W or AB0H).
total_seconds = (event_local_time 1985-01-01T00:00:00_local) in seconds
The 10-year production archive contains only call-home files (all end in W or H).
Manual Blastware downloads produce 3-char extensions the same AB0 prefix but
without the trailing type character.
Micromate Series 4 uses a completely different naming scheme (literal datetime
in filename); this function does not apply to Micromate units.
Args:
event: Event object with timestamp set.
serial: Device serial number string (e.g. "BE11529").
ach: If True, append W/H type character (call-home style).
If False (default), omit type character (direct download style).
Returns:
Filename string, e.g. "M529LIY6.CE0" (direct) or "M529LIY6.CE0H" (ACH).
"""
# ── Serial prefix ──────────────────────────────────────────────────────────
serial_digits = "".join(c for c in serial if c.isdigit())
if len(serial_digits) >= 1:
serial_numeric = int(serial_digits)
generation = serial_numeric // 1000
prefix_letter = chr(ord('B') + generation)
serial3 = f"{serial_numeric % 1000:03d}"
else:
prefix_letter = "M" # fallback
serial3 = "000"
prefix = prefix_letter + serial3
# ── Stem + AB extension from timestamp ────────────────────────────────────
if event.timestamp is not None:
try:
ts_local = datetime.datetime(
event.timestamp.year, event.timestamp.month, event.timestamp.day,
event.timestamp.hour, event.timestamp.minute, event.timestamp.second,
)
delta_sec = int((ts_local - _INSTANTEL_EPOCH).total_seconds())
stem = _make_stem(ts_local)
ab_val = delta_sec % _STEM_UNIT_SEC
ab_str = _STEM_CHARS[ab_val // 36] + _STEM_CHARS[ab_val % 36]
except (ValueError, TypeError, AttributeError):
stem = "0000"
ab_str = "00"
else:
stem = "0000"
ab_str = "00"
# ── Type character (ACH only) ─────────────────────────────────────────────
if ach:
if getattr(event, 'recording_mode', None) in (3, 4): # Histogram / Hist+Cont
type_char = 'H'
else:
type_char = 'W'
ext = f".{ab_str}0{type_char}"
else:
ext = f".{ab_str}0"
return prefix + stem + ext
# ── A5 frame classifier ───────────────────────────────────────────────────────────
# ASCII markers that identify a compliance-config / metadata frame.
# These strings appear in the A5 bulk stream as part of the device's
# compliance setup payload. They should NEVER appear in raw ADC waveform
# frames (which are binary-heavy, < 20 % printable ASCII).
_METADATA_FRAME_MARKERS = (
b"Project:",
b"Client:",
b"Standard Recording Setup",
b"Extended Notes",
b"User Name:",
b"Seis Loc:",
)
def classify_frame(frame: S3Frame) -> str:
"""
Classify an A5 bulk waveform stream frame by its content.
Returns one of:
"terminator" page_key == 0x0000
"probe_or_strt" data contains b"STRT\xff\xfe" (the initial probe response)
"metadata" data contains ASCII compliance-config markers
"waveform" predominantly binary (< 20 % printable ASCII)
"unknown" none of the above criteria matched
Used by write_blastware_file() to filter non-waveform frames out of
the reconstructed body so that metadata blocks (Project:, Client:, )
and spurious STRT records do not corrupt the output file.
"""
if frame.page_key == 0x0000:
return "terminator"
data = bytes(frame.data)
if b"STRT\xff\xfe" in data:
return "probe_or_strt"
if any(m in data for m in _METADATA_FRAME_MARKERS):
return "metadata"
if len(data) > 0:
printable = sum(1 for b in data if 32 <= b < 127)
if printable / len(data) < 0.20:
return "waveform"
return "unknown"
# ── Waveform file writer ───────────────────────────────────────────────────────────
def write_blastware_file(
event: Event,
a5_frames: list[S3Frame],
path: Union[str, Path],
) -> None:
"""
Write a Blastware waveform file from a downloaded event.
Args:
event: Event object (populated by get_events() or download_waveform()).
Used for the STRT record (key, rectime) and footer timestamps.
a5_frames: Complete A5 frame list INCLUDING the terminator frame
(page_key=0x0000). Pass include_terminator=True to
read_bulk_waveform_stream() when collecting frames.
Must have at least 2 frames (probe + terminator).
path: Destination file path. Parent directory must exist.
Extension should be set via blastware_filename().
File layout:
[22B header] [21B STRT] [body bytes] [26B footer]
Raises:
ValueError: if a5_frames is empty or has no terminator (page_key=0).
OSError: if the file cannot be written.
Confirmed correct waveform body reconstruction against M529LIY6 (example event) (2026-04-21).
"""
if not a5_frames:
raise ValueError("a5_frames must not be empty")
path = Path(path)
# ── Extract STRT record from probe frame ────────────────────────────────
# The STRT record (21 bytes) lives verbatim inside A5[0].data[7:].
# It is stored as-is in the waveform file — do NOT reconstruct it from Event
# fields, as bytes [10:14] and [14:20] contain device-specific values
# (not simply key4 repeated or zero-padded). Confirmed 2026-04-21.
#
# STRT layout (21 bytes, observed in M529LIY6 files):
# [0:4] b'STRT'
# [4:6] 0xff 0xfe (fixed)
# [6:10] key4 (event key)
# [10:14] device-specific field (NOT a key4 repeat)
# [14:20] device-specific fields (NOT zeros)
# [20] rectime uint8 seconds
# Extract STRT from the DLE-stripped probe frame.
#
# frame.data[7:] is the raw wire representation; it may contain DLE+{02,03,04}
# inner-frame pairs that S3FrameParser preserves as two literal bytes. The
# Blastware file stores the stripped form, so we must strip before extracting.
#
# Example (M529LK0Y, 2026-04-21): STRT contains value 0x02 encoded as [10 02]
# on the wire. Without stripping, STRT is 22 raw bytes → write_blastware_file writes the
# DLE prefix into the file AND begins the body 1 byte too early (probe_skip off
# by 1). Stripping fixes both.
#
# probe_skip must be computed in the RAW frame.data domain (it is used as the
# `skip` argument to _frame_body_bytes which operates on raw frame.data).
# We walk the raw bytes counting stripped bytes until we have passed
# strt_pos + 21 stripped bytes, giving the raw offset of the first body byte.
w0_raw = bytes(a5_frames[0].data[7:])
w0_stripped = _strip_inner_frame_dles(w0_raw)
strt_pos_stripped = w0_stripped.find(b"STRT")
if strt_pos_stripped >= 0:
strt = bytes(w0_stripped[strt_pos_stripped : strt_pos_stripped + 21])
# Walk raw bytes to find the raw-domain end of the STRT (= body start).
target_stripped = strt_pos_stripped + 21
stripped_so_far = 0
raw_i = 0
while stripped_so_far < target_stripped and raw_i < len(w0_raw):
if (w0_raw[raw_i] == 0x10
and raw_i + 1 < len(w0_raw)
and w0_raw[raw_i + 1] in {0x02, 0x03, 0x04}):
raw_i += 2 # DLE pair → 1 stripped byte, 2 raw bytes
else:
raw_i += 1 # normal byte → 1 stripped byte, 1 raw byte
stripped_so_far += 1
probe_skip = 7 + raw_i # raw bytes to skip: 7 header + raw STRT length
else:
# Fallback: construct a minimal STRT if probe frame lacks it
key4 = event._waveform_key if hasattr(event, '_waveform_key') and event._waveform_key else bytes(4)
rectime = event.rectime_seconds if event.rectime_seconds is not None else 0
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
probe_skip = 7 + 21
log.debug(
"write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
"probe_data_len=%d strt_hex=%s",
strt_pos_stripped if strt_pos_stripped >= 0 else -1,
probe_skip,
len(a5_frames[0].data),
strt.hex() if len(strt) >= 4 else "(short)",
)
if len(strt) != 21:
raise ValueError(f"STRT record must be 21 bytes, got {len(strt)}")
# ── Build waveform file header ─────────────────────────────────────────────────────
header = _FILE_HEADER_PREFIX + _WAVEFORM_TYPE_TAG
assert len(header) == _WAVEFORM_HEADER_SIZE, f"Waveform header must be {_WAVEFORM_HEADER_SIZE} bytes"
# ── Build body from A5 frames ────────────────────────────────────────────
# The waveform body is reconstructed from ALL A5 frames (data + terminator).
# The terminator frame's contribution includes the 26-byte footer at its end.
#
# Reconstruction layout (confirmed from M529LIY6 captures, 2026-04-21):
# all_bytes = contributions from A5[0..N] + terminator_contribution
# body = all_bytes[:-26] (everything except the last 26 bytes)
# footer = all_bytes[-26:] (last 26 bytes = the waveform file footer)
#
# The footer bytes come directly from the terminator frame's inner content —
# using them verbatim ensures timestamps match the device's recorded values.
# Separate terminator from data frames.
# Search from the FRONT for the first terminator (page_key == 0x0000).
# Do NOT use a5_frames[-1] — if _a5_frames contains stray frames from a
# subsequent event (a known get_events side-effect), the last frame will
# not be the terminator and the footer will be mis-identified.
# TERM detection (v0.14.0): last frame if page_key != 0x0010 (sample marker)
term_idx: Optional[int] = None
if a5_frames and a5_frames[-1].page_key != 0x0010:
term_idx = len(a5_frames) - 1
if term_idx is not None:
body_frames = a5_frames[:term_idx]
term_frame = a5_frames[term_idx]
else:
body_frames = a5_frames
term_frame = None
# Frame contribution loop (v0.14.0 BW-exact walk).
# Skip values:
# probe (fi=0): probe_skip
# meta@0x1002 (fi=1): 13 (6-byte inner header)
# meta@0x1004 (fi=2): 13 (6-byte inner header)
# sample chunks (fi=3+): 12 (5-byte inner header)
last_fi = len(body_frames) - 1
log.debug(
"write_blastware_file: %d body_frames last_fi=%d",
len(body_frames), last_fi,
)
all_bytes = bytearray()
for fi, frame in enumerate(body_frames):
if fi == 0:
skip = probe_skip
elif fi in (1, 2):
skip = 13 # metadata pages
else:
skip = 12 # sample chunks
contribution = _frame_body_bytes(frame, skip)
log.debug("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
fi, skip, len(frame.data), len(contribution))
all_bytes.extend(contribution)
# Terminator contributes its content, which ends with the 26-byte footer.
# skip=11 (not 12) because the terminator's inner frame header is 4 bytes,
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
if term_frame is not None:
term_contribution = _frame_body_bytes(term_frame, 11)
log.debug(
"write_blastware_file: term_frame data_len=%d skip=11 "
"contribution_len=%d first8=%s",
len(term_frame.data),
len(term_contribution),
term_contribution[:8].hex() if len(term_contribution) >= 8 else term_contribution.hex(),
)
all_bytes.extend(term_contribution)
log.debug(
"write_blastware_file: all_bytes total=%d last28=%s",
len(all_bytes),
bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
)
# NOTE: The "duplicate header+STRT strip" logic from v0.13.x has been
# REMOVED in v0.14.2. Under the v0.14.0 BW-exact 5A walk, body assembly
# is just contiguous concatenation of frame contributions in stream order
# (probe → meta@0x1002 → meta@0x1004 → samples → TERM), exactly as BW
# writes its files. The previous strip was matching the `00 12 03 00 STRT`
# byte sequence in legitimate waveform data — sample chunks at counter
# 0x1000 and beyond often contain those bytes coincidentally — and
# zeroing 25 bytes of valid samples per match. Compared to a known-good
# BW reference for the same 3-sec event 0, the strip introduced 26 bytes
# of zeros that BW did not have, then propagated alignment differences
# through the rest of the body. See decode_test/5-1-26/bw vs SFM diff
# at file[0x1012..0x102B] (2026-05-04 analysis).
# Find the first valid 0e 08 footer marker (v0.14.0).
footer_pos = -1
pos = 0
while True:
pos = bytes(all_bytes).find(b"\x0e\x08", pos)
if pos < 0 or pos + 26 > len(all_bytes):
break
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
if 2015 <= yr <= 2050:
footer_pos = pos
break
pos += 1
if footer_pos >= 0:
body = bytes(all_bytes[:footer_pos])
footer = bytes(all_bytes[footer_pos:footer_pos + 26])
log.debug(
"write_blastware_file: real 0e 08 footer at all_bytes[%d]; "
"truncating %d post-footer bytes",
footer_pos, len(all_bytes) - footer_pos - 26,
)
elif len(all_bytes) >= 26:
body = bytes(all_bytes[:-26])
footer = bytes(all_bytes[-26:])
else:
body = bytes(all_bytes)
start_dt = _ts_from_model(event.timestamp)
stop_dt: Optional[datetime.datetime] = None
if start_dt is not None and event.rectime_seconds:
stop_dt = start_dt + datetime.timedelta(seconds=event.rectime_seconds)
footer = (
b"\x0e\x08"
+ _encode_ts_be(start_dt)
+ _encode_ts_be(stop_dt)
+ b"\x00\x01\x00\x02\x00\x00"
+ b"\x00\x00"
)
# ── Write file ───────────────────────────────────────────────────────────
with open(path, "wb") as f:
f.write(header)
f.write(strt)
f.write(body)
f.write(footer)
def read_blastware_file(path: Union[str, Path]) -> Event:
"""
Parse a Blastware waveform file into an Event object.
NOT YET IMPLEMENTED.
Args:
path: Path to the waveform file.
Returns:
Event object with waveform data populated.
Raises:
NotImplementedError: always (pending implementation).
"""
raise NotImplementedError("read_blastware_file() is not yet implemented")
# ── MLG file writer ───────────────────────────────────────────────────────────
def _build_mlg_header(serial: str) -> bytes:
"""
Build the 308-byte MLG file header.
Header structure (confirmed from BE11529.MLG binary inspection):
Offset 0x00: 10 00 01 80 00 00 Instantel\x00 07 2c 22 01 0e a0 (22B)
Offset 0x16: ... (16B unknown observed as zeros in BE11529.MLG)
Offset 0x2A: serial number (8 bytes, null-padded ASCII)
... rest zero-padded to 308 bytes
The serial string "BE11529" appears at offset 0x2A (42 decimal).
"""
buf = bytearray(_MLG_HEADER_SIZE)
# Common prefix + MLG type tag
prefix = _FILE_HEADER_PREFIX + _MLG_TYPE_TAG # 22 bytes
buf[0:len(prefix)] = prefix
# Serial number at offset 0x2A
serial_bytes = serial.encode("ascii", errors="replace")[:8]
serial_padded = serial_bytes.ljust(8, b"\x00")
buf[0x2A : 0x2A + 8] = serial_padded
return bytes(buf)
def _build_mlg_record(
entry: MonitorLogEntry,
serial: str,
) -> bytes:
"""
Build one 292-byte MLG record from a MonitorLogEntry.
Record layout (confirmed from BE11529.MLG binary inspection):
[0:2] CRC 2-byte CRC (algorithm unknown; written as 0x0000)
[2:6] marker 22 01 0e 80
[6:14] ts1 8B big-endian start timestamp
[14:22] ts2 8B big-endian stop timestamp
[22:26] flags 4B record flags (see MLG_FLAGS_* constants)
[26:36] serial 10B null-padded serial number
[36:] text for triggered events: [0x08][8B ts1_copy]["Geo: X.XXX in/s"]
for monitoring intervals: b"" or minimal separator
[... zero-padded to 292 bytes]
Flags based on entry type:
- MonitorLogEntry with start_time only (no stop_time): MLG_FLAGS_START_ONLY
- MonitorLogEntry with both times and geo_threshold_ips set: MLG_FLAGS_TRIGGER
- MonitorLogEntry with both times (monitoring interval): MLG_FLAGS_INTERVAL
The triggered-event text block (flags = MLG_FLAGS_TRIGGER):
[0x08] [ts1: 8B] [ASCII "Geo: X.XXX in/s\x00"]
Confirmed from BE11529.MLG records at offset 0x0134 and 0x0258.
"""
buf = bytearray(_MLG_RECORD_SIZE)
start_dt = (
datetime.datetime(
entry.start_time.year, entry.start_time.month, entry.start_time.day,
entry.start_time.hour, entry.start_time.minute, entry.start_time.second,
)
if entry.start_time else None
)
stop_dt = (
datetime.datetime(
entry.stop_time.year, entry.stop_time.month, entry.stop_time.day,
entry.stop_time.hour, entry.stop_time.minute, entry.stop_time.second,
)
if entry.stop_time else None
)
# [0:2] CRC placeholder
buf[0:2] = b"\x00\x00"
# [2:6] Record marker
buf[2:6] = _MLG_RECORD_MARKER
# [6:14] ts1
buf[6:14] = _encode_ts_be(start_dt)
# [14:22] ts2
buf[14:22] = _encode_ts_be(stop_dt)
# [22:26] flags
if stop_dt is None:
flags = MLG_FLAGS_START_ONLY
elif entry.geo_threshold_ips is not None:
flags = MLG_FLAGS_TRIGGER
else:
flags = MLG_FLAGS_INTERVAL
buf[22:26] = flags
# [26:36] serial (10B null-padded)
serial_bytes = serial.encode("ascii", errors="replace")[:10]
buf[26 : 26 + len(serial_bytes)] = serial_bytes
# [36:] text content
pos = 36
if flags == MLG_FLAGS_TRIGGER:
# Extra ts1 copy: [0x08][ts1: 8B]
buf[pos] = 0x08
pos += 1
buf[pos : pos + 8] = _encode_ts_be(start_dt)
pos += 8
if entry.geo_threshold_ips is not None:
geo_text = f"Geo: {entry.geo_threshold_ips:.3f} in/s\x00".encode("ascii")
buf[pos : pos + len(geo_text)] = geo_text
pos += len(geo_text)
return bytes(buf)
def write_mlg(
entries: list[MonitorLogEntry],
serial: str,
path: Union[str, Path],
) -> None:
"""
Write a Blastware .MLG monitor log file.
Args:
entries: List of MonitorLogEntry objects (from get_monitor_log_entries()).
Each entry produces one 292-byte record in the file.
serial: Device serial number string (e.g. "BE11529").
Written to the file header and each record.
path: Destination file path. Extension is not enforced use ".MLG".
File layout:
[308B header] [N × 292B records]
Note: The 2-byte CRC at the start of each record is written as 0x0000.
The CRC algorithm is unknown (see module docstring).
Raises:
OSError: if the file cannot be written.
"""
path = Path(path)
header = _build_mlg_header(serial)
with open(path, "wb") as f:
f.write(header)
for entry in entries:
record = _build_mlg_record(entry, serial)
f.write(record)
def read_mlg(path: Union[str, Path]) -> list[MonitorLogEntry]:
"""
Parse a Blastware .MLG file into a list of MonitorLogEntry objects.
NOT YET IMPLEMENTED.
Args:
path: Path to the .MLG file.
Returns:
List of MonitorLogEntry objects.
Raises:
NotImplementedError: always (pending implementation).
"""
raise NotImplementedError("read_mlg() is not yet implemented")
+752 -140
View File
File diff suppressed because it is too large Load Diff
+533
View File
@@ -0,0 +1,533 @@
"""
minimateplus/event_file_io.py modern event-file (.sfm.json sidecar) IO.
This module is the single home for event-file conversion code that doesn't
fit cleanly inside `blastware_file.py` (which is the BW binary codec):
- sidecar JSON read/write (the modern per-event metadata file)
- read_blastware_file() reverse of write_blastware_file, used by
the BW-importer flow when SFM is ingesting files produced by
Blastware's own ACH (where the source A5 frames aren't available).
Sidecar schema v1 layout see docs in the project plan or the schema
declared in `event_to_sidecar_dict()`.
"""
from __future__ import annotations
import base64
import datetime
import hashlib
import json
import logging
import os
import struct
from pathlib import Path
from typing import Optional, Union
from .models import Event, PeakValues, ProjectInfo, Timestamp
from . import blastware_file as _bw # avoid circular reference at module load
log = logging.getLogger(__name__)
# Schema version for the sidecar JSON. Bump when fields change shape.
# Older readers must reject anything > SCHEMA_VERSION; newer fields added
# inside `extensions` are forward-compatible without a bump.
SCHEMA_VERSION = 1
SIDECAR_KIND = "sfm.event"
# Default tool_version stamp; callers can override. Hard-coded here
# rather than read via importlib.metadata because the latter reflects the
# *installed* dist-info, which doesn't update when pyproject.toml is
# bumped without a `pip install` re-run — leading to confusing stale
# version stamps in sidecars. Bump this constant and CHANGELOG.md
# together at release time.
TOOL_VERSION = "0.15.0"
try:
# Best-effort: prefer the installed metadata when it's NEWER than the
# baked-in constant (e.g. a downstream packager bumped the wheel
# without editing this file). Otherwise fall back to TOOL_VERSION.
from importlib.metadata import version as _pkg_version
_meta_v = _pkg_version("seismo-relay")
def _vtuple(s):
try:
return tuple(int(p) for p in s.split(".")[:3])
except Exception:
return (0, 0, 0)
_TOOL_VERSION_DEFAULT = (
_meta_v if _vtuple(_meta_v) > _vtuple(TOOL_VERSION) else TOOL_VERSION
)
except Exception:
_TOOL_VERSION_DEFAULT = TOOL_VERSION
# ── Sidecar dict construction ─────────────────────────────────────────────────
def _ts_iso(ts: Optional[Timestamp]) -> Optional[str]:
if ts is None:
return None
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
def _peak_values_to_dict(pv: Optional[PeakValues]) -> dict:
if pv is None:
return {
"transverse": None,
"vertical": None,
"longitudinal": None,
"vector_sum": None,
"mic_psi": None,
}
return {
"transverse": pv.tran,
"vertical": pv.vert,
"longitudinal": pv.long,
"vector_sum": pv.peak_vector_sum,
"mic_psi": pv.micl,
}
def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
if pi is None:
return {
"project": None,
"client": None,
"operator": None,
"sensor_location": None,
}
return {
"project": pi.project,
"client": pi.client,
"operator": pi.operator,
"sensor_location": pi.sensor_location,
}
def event_to_sidecar_dict(
event: Event,
*,
serial: str,
blastware_filename: str,
blastware_filesize: int,
blastware_sha256: str,
source_kind: str = "sfm-live",
a5_pickle_filename: Optional[str] = None,
tool_version: str = _TOOL_VERSION_DEFAULT,
captured_at: Optional[datetime.datetime] = None,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
) -> dict:
"""
Build a v1 sidecar dict from an Event + the surrounding metadata.
Pure helper no file I/O. Callers stitch the result into a sidecar
via `write_sidecar()` (or POST it back via the PATCH endpoint).
"""
if source_kind not in {"sfm-live", "sfm-ach", "bw-import"}:
raise ValueError(f"unknown source_kind: {source_kind!r}")
captured_at = captured_at or datetime.datetime.utcnow()
# Stash raw 0C record bytes in `extensions.raw_records` so future
# field-decoding work (Peak Acceleration, ZC Freq, Time of Peak,
# sensor self-check results, etc.) can run offline against committed
# sidecars without a live device. Cheap (~280 bytes base64) and
# forward-compatible (older readers ignore unknown extensions keys).
ext_dict: dict = dict(extensions) if extensions else {}
raw_0c = getattr(event, "_raw_record", None)
if raw_0c:
rr = ext_dict.setdefault("raw_records", {})
# Don't clobber a raw_0c that callers explicitly passed in via
# `extensions=...` (e.g. round-trip preservation in patch_sidecar).
rr.setdefault("waveform_record_b64", base64.b64encode(raw_0c).decode("ascii"))
rr.setdefault("waveform_record_len", len(raw_0c))
return {
"schema_version": SCHEMA_VERSION,
"kind": SIDECAR_KIND,
"event": {
"serial": serial,
"timestamp": _ts_iso(event.timestamp),
"waveform_key": event._waveform_key.hex() if event._waveform_key else None,
"record_type": event.record_type,
"sample_rate": event.sample_rate,
"rectime_seconds": event.rectime_seconds,
"total_samples": event.total_samples,
"pretrig_samples": event.pretrig_samples,
},
"peak_values": _peak_values_to_dict(event.peak_values),
"project_info": _project_info_to_dict(event.project_info),
"blastware": {
"filename": blastware_filename,
"filesize": blastware_filesize,
"sha256": blastware_sha256,
"available": True,
},
"source": {
"kind": source_kind,
"captured_at": captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
"tool_version": tool_version,
"a5_pickle_filename": a5_pickle_filename,
},
"review": review or {
"false_trigger": False,
"reviewer": None,
"reviewed_at": None,
"notes": "",
},
"extensions": ext_dict,
}
# ── Sidecar IO ────────────────────────────────────────────────────────────────
def write_sidecar(path: Union[str, Path], data: dict) -> None:
"""
Atomic write of a sidecar dict to <path>.
Validates schema_version is supported before writing so we don't
silently drop a future-format sidecar over the wire.
"""
path = Path(path)
sv = data.get("schema_version")
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
raise ValueError(
f"write_sidecar: unsupported schema_version={sv!r} "
f"(this build supports 1..{SCHEMA_VERSION})"
)
tmp = path.with_suffix(path.suffix + ".tmp")
with tmp.open("w", encoding="utf-8") as f:
json.dump(data, f, indent=2, sort_keys=False, default=str)
f.write("\n")
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
def read_sidecar(path: Union[str, Path]) -> dict:
"""
Load a sidecar JSON file.
Raises FileNotFoundError if missing, ValueError on bad shape /
unsupported schema_version. Unknown keys at the top level are
preserved in the returned dict (forward-compat).
"""
path = Path(path)
with path.open("r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
raise ValueError(f"sidecar at {path}: top-level is not a JSON object")
sv = data.get("schema_version")
if not isinstance(sv, int) or sv < 1:
raise ValueError(f"sidecar at {path}: missing or invalid schema_version")
if sv > SCHEMA_VERSION:
raise ValueError(
f"sidecar at {path}: schema_version={sv} > supported {SCHEMA_VERSION}; "
"upgrade seismo-relay to read this file"
)
if data.get("kind") != SIDECAR_KIND:
raise ValueError(f"sidecar at {path}: unexpected kind={data.get('kind')!r}")
return data
def patch_sidecar(
path: Union[str, Path],
*,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
reviewer_now: bool = True,
) -> dict:
"""
Atomically apply a JSON-merge-patch to a sidecar file's `review`
and/or `extensions` blocks. Other top-level keys are untouched.
`review_now`: when True (default) and `review` is non-empty, stamps
`review.reviewed_at` with the current UTC time so the review-time is
auditable without the caller having to pass it.
Returns the new full sidecar dict.
"""
path = Path(path)
data = read_sidecar(path)
if review:
merged = dict(data.get("review") or {})
merged.update({k: v for k, v in review.items() if v is not None or k in merged})
if reviewer_now:
merged["reviewed_at"] = datetime.datetime.utcnow().isoformat() + "Z"
data["review"] = merged
if extensions:
merged_ext = dict(data.get("extensions") or {})
merged_ext.update(extensions)
data["extensions"] = merged_ext
write_sidecar(path, data)
return data
def sidecar_path_for(blastware_path: Union[str, Path]) -> Path:
"""Convention: <bw_path>.sfm.json sits next to the BW binary."""
p = Path(blastware_path)
return p.with_name(p.name + ".sfm.json")
def file_sha256(path: Union[str, Path], chunk_size: int = 65536) -> str:
"""Compute SHA-256 of a file as a hex string."""
h = hashlib.sha256()
with open(path, "rb") as f:
while True:
chunk = f.read(chunk_size)
if not chunk:
break
h.update(chunk)
return h.hexdigest()
# ── Blastware-file reader ─────────────────────────────────────────────────────
#
# Reverse of `blastware_file.write_blastware_file`. Used by the BW-import
# flow to ingest files produced by Blastware's own ACH (where the source
# A5 frames are not available).
#
# File structure (recap):
# [22B header] [21B STRT record] [body bytes] [26B footer]
#
# The body holds:
# - 6B preamble (00 00 ff ff ff ff) immediately after the STRT
# - 4-channel interleaved int16 LE samples
# - Embedded ASCII metadata strings (Project: / Client: / User Name: /
# Seis Loc: / Extended Notes) from the device's session-start config
#
# The 0C waveform record (per-event peaks, project name) is NOT in the
# BW file — those are computed by the device firmware and only carried
# in the live SUB 0C response. read_blastware_file() therefore computes
# peaks from the raw samples assuming Normal-range (10 in/s full-scale)
# geophone sensitivity. Imported events surface that assumption via the
# sidecar's `peak_values.computed_from_samples` flag.
# Geophone scale factor, in/s per ADC unit, for Normal range (10 in/s FS).
# Confirmed from CLAUDE.md (geo_hardware_constant = 6.206053 in/s per V,
# ADC full-scale = 1.61133 V Normal range = 10.0 in/s peak; per-count
# resolution ≈ 10.0 / 32768).
_GEO_NORMAL_FS_INS = 10.0
_GEO_SENSITIVE_FS_INS = 1.250
_INT16_FS = 32768.0
# Microphone scale factor, psi per ADC count. Approximate — exact factor
# depends on the geophone-vs-mic ADC scaling and the firmware reference.
# We mark mic_psi as "computed approximate" in the sidecar.
_MIC_FS_PSI = 0.0125 / _INT16_FS # ~0.5 psi full-scale assumption
def _decode_strt(strt: bytes) -> dict:
"""
Decode the 21-byte STRT record from a BW file.
Returns dict with waveform_key (4B), total_samples, pretrig_samples,
rectime_seconds. Falls back to None on truncated/missing fields.
"""
if len(strt) < 21 or strt[0:4] != b"STRT":
return {}
return {
"waveform_key": strt[6:10].hex(),
"total_samples": struct.unpack_from(">H", strt, 8)[0],
"pretrig_samples": struct.unpack_from(">H", strt, 16)[0],
"rectime_seconds": strt[18],
}
def _find_first_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
"""
Search `buf` for `label` (e.g. b"Project:") and return the
null-terminated ASCII string that follows, stripped.
"""
pos = buf.find(label)
if pos < 0:
return None
start = pos + len(label)
end = buf.find(b"\x00", start, start + max_len)
if end < 0:
end = start + max_len
text = buf[start:end].decode("ascii", errors="replace").strip()
return text or None
def _decode_samples_4ch_int16_le(stream: bytes) -> dict[str, list[int]]:
"""
Decode a 4-channel interleaved int16 LE byte stream into per-channel
lists. Channels are [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
Truncates to a multiple of 8 bytes (one full sample-set).
"""
n_complete = (len(stream) // 8) * 8
if n_complete == 0:
return {"Tran": [], "Vert": [], "Long": [], "MicL": []}
fmt = "<" + "h" * (n_complete // 2)
flat = list(struct.unpack(fmt, stream[:n_complete]))
return {
"Tran": flat[0::4],
"Vert": flat[1::4],
"Long": flat[2::4],
"MicL": flat[3::4],
}
def _peaks_from_samples(samples: dict[str, list[int]]) -> PeakValues:
"""
Compute approximate peaks from raw int16 samples assuming Normal-range
geophone sensitivity. Used by the BW-importer when the 0C waveform
record (the device's authoritative peaks) is unavailable.
"""
def _peak_ins(ch: list[int]) -> float:
if not ch:
return 0.0
m = max(abs(int(v)) for v in ch)
return m / _INT16_FS * _GEO_NORMAL_FS_INS
tran = _peak_ins(samples.get("Tran", []))
vert = _peak_ins(samples.get("Vert", []))
long_ = _peak_ins(samples.get("Long", []))
# Mic in psi (approximate)
mic_ch = samples.get("MicL", []) or []
mic = max((abs(int(v)) for v in mic_ch), default=0) * _MIC_FS_PSI
# Peak vector sum: max over time of sqrt(T^2 + V^2 + L^2)
pvs = 0.0
n = min(len(samples.get("Tran", [])), len(samples.get("Vert", [])), len(samples.get("Long", [])))
if n:
scale = _GEO_NORMAL_FS_INS / _INT16_FS
T = samples["Tran"]; V = samples["Vert"]; L = samples["Long"]
for i in range(n):
t = T[i] * scale
v = V[i] * scale
l = L[i] * scale
mag = (t*t + v*v + l*l) ** 0.5
if mag > pvs:
pvs = mag
return PeakValues(
tran=tran, vert=vert, long=long_,
peak_vector_sum=pvs, micl=mic,
)
def read_blastware_file(path: Union[str, Path]) -> Event:
"""
Parse a Blastware waveform file into an Event.
Recovers:
- waveform_key, rectime_seconds, total_samples, pretrig_samples
(from the STRT record)
- timestamp (from the footer's start-time field)
- project_info (from ASCII labels embedded in the body)
- raw_samples (Tran/Vert/Long/MicL int16 lists)
- peak_values (computed from raw_samples; approximate see notes
on _peaks_from_samples about Normal-range assumption)
Does NOT recover the source A5 frames (they aren't in the BW file).
The returned Event has `_a5_frames = None`, signalling that
byte-for-byte regeneration of the BW file from this Event alone is
not possible the on-disk BW file IS the byte-for-byte source.
"""
path = Path(path)
raw = path.read_bytes()
if len(raw) < _bw._WAVEFORM_HEADER_SIZE + 21 + 26:
raise ValueError(f"{path}: file too short ({len(raw)} bytes) to be a BW event")
# Header: validate magic prefix.
header = raw[:_bw._WAVEFORM_HEADER_SIZE]
if not header.startswith(_bw._FILE_HEADER_PREFIX):
raise ValueError(f"{path}: not a Blastware file (bad header prefix)")
# STRT record: 21 bytes immediately after the header.
strt_raw = raw[_bw._WAVEFORM_HEADER_SIZE : _bw._WAVEFORM_HEADER_SIZE + 21]
strt_fields = _decode_strt(strt_raw)
if not strt_fields:
raise ValueError(f"{path}: STRT record missing or malformed")
# Footer: locate the 0e 08 marker, validating the year is in a sane range.
body_start = _bw._WAVEFORM_HEADER_SIZE + 21
footer_pos = -1
pos = body_start
while True:
pos = raw.find(b"\x0e\x08", pos)
if pos < 0 or pos + 26 > len(raw):
break
yr = (raw[pos + 4] << 8) | raw[pos + 5]
if 2015 <= yr <= 2050:
footer_pos = pos
break
pos += 1
if footer_pos < 0 and len(raw) >= 26:
footer_pos = len(raw) - 26
if footer_pos < body_start:
raise ValueError(f"{path}: footer not found")
body = raw[body_start : footer_pos]
footer = raw[footer_pos : footer_pos + 26]
# Footer layout:
# [0:2] 0e 08 marker
# [2:10] ts1 (start) BE 8B
# [10:18] ts2 (stop) BE 8B
# [18:24] 00 01 00 02 00 00
# [24:26] crc
ts1 = _bw._decode_ts_be(footer[2:10])
ts2 = _bw._decode_ts_be(footer[10:18])
# Body: first 6 bytes are the preamble (00 00 ff ff ff ff). Strip
# them before decoding samples. Any trailing tail past the last
# full sample-set is silently truncated by _decode_samples_4ch.
sample_bytes = body[6:] if body[:6].hex() in ("0000ffffffff", "0000FFFFFFFF") else body
samples = _decode_samples_4ch_int16_le(sample_bytes)
# Metadata strings (label-anchored search across the body).
project = _find_first_string(body, b"Project:")
client = _find_first_string(body, b"Client:")
user = _find_first_string(body, b"User Name:")
seisloc = _find_first_string(body, b"Seis Loc:")
# Build the Event.
ev = Event(index=-1)
if strt_fields.get("waveform_key"):
ev._waveform_key = bytes.fromhex(strt_fields["waveform_key"])
ev.record_type = "Waveform"
ev.rectime_seconds = strt_fields.get("rectime_seconds")
ev.total_samples = strt_fields.get("total_samples")
ev.pretrig_samples = strt_fields.get("pretrig_samples")
if ts1 is not None:
ev.timestamp = Timestamp(
raw=footer[2:10],
flag=0x10,
year=ts1.year, unknown_byte=0, month=ts1.month, day=ts1.day,
hour=ts1.hour, minute=ts1.minute, second=ts1.second,
)
ev.project_info = ProjectInfo(
project=project, client=client, operator=user, sensor_location=seisloc,
)
ev.raw_samples = samples
ev.peak_values = _peaks_from_samples(samples)
ev._a5_frames = None # not recoverable from BW file
return ev
+187 -31
View File
@@ -111,20 +111,24 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
verified against this algorithm on 2026-04-02).
Args:
offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
raw_params: 10 or 11 params bytes (from bulk_waveform_params or
bulk_waveform_term_params). 0x10 bytes in params are
written RAW NOT DLE-stuffed. Confirmed 2026-04-06 by
comparing wire bytes: BW sends bare `10 04` for chunk 1
(counter=0x1004), not stuffed `10 10 04`. Device reads
params at fixed byte positions; stuffing shifts the bytes
and corrupts the counter, causing device to ignore the frame.
offset_word: 16-bit offset. For probe/chunks/metadata pages this is
`0x1002`. For the proper TERM frame this is computed by
`bulk_waveform_term_v2()` from the STRT-derived
`end_offset`.
raw_params: 10, 11, or 12 params bytes (from `bulk_waveform_params`
for probes/samples, `bulk_waveform_term_v2` for TERM, or
a manually-built 12-byte block for the metadata pages
0x1002 / 0x1004). See gotcha #3 below — params region
uses partial DLE stuffing of 0x10 bytes.
Returns:
Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
"""
if len(raw_params) not in (10, 11):
raise ValueError(f"raw_params must be 10 or 11 bytes, got {len(raw_params)}")
if len(raw_params) not in (10, 11, 12):
# 10 = termination params; 11 = regular probe / chunk params;
# 12 = metadata-page params (extra trailing 0x00 — BW byte-perfect quirk
# for the two fixed metadata reads at counter=0x1002 and 0x1004).
raise ValueError(f"raw_params must be 10/11/12 bytes, got {len(raw_params)}")
# Build stuffed section between STX and checksum
s = bytearray()
@@ -134,8 +138,40 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
s += b"\x00" # field3
s += bytes([(offset_word >> 8) & 0xFF, # offset_hi — raw, NOT stuffed
offset_word & 0xFF]) # offset_lo
for b in raw_params: # params — NOT DLE-stuffed (raw bytes, match BW wire format)
# Params — partial DLE stuffing of 0x10 bytes (CONFIRMED 2026-05-05).
#
# The device's de-stuffing rule for params is:
# • `10 10` → de-stuffs to `10`
# • `10 02/03/04` → kept literal (these are inner-frame markers)
# • `10 X` other → de-stuffs to just `X` (drops the 0x10)
#
# So for any 0x10 byte in the *logical* params that is followed by a
# byte NOT in {0x02, 0x03, 0x04, 0x10}, we must double the 0x10 on the
# wire (`10 X` → `10 10 X`) so the device's de-stuffer reproduces the
# original `10 X` pair. Without this, counter values with `0x10` in
# the high byte (e.g. counter=0x1000 has params bytes `10 00`) are
# silently corrupted to `0x__00` on the device side, and the device
# responds for the wrong address — for counter=0x1000 it returns the
# probe response (counter=0x0000), which contains the file header +
# STRT. That STRT block then lands in the assembled file body and
# Blastware rejects the file as malformed.
#
# Confirmed against BW capture 5-1-26 / bwcap3sec frame 20: params
# logical bytes `00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000)
# are encoded on the wire as `00 01 11 10 10 00 00 00 00 00 00 00`.
# BW frames 13/14 (meta @ 0x1002 / 0x1004) leave `10 02` and `10 04`
# raw — the device handles those literal pairs correctly.
i = 0
while i < len(raw_params):
b = raw_params[i]
s.append(b)
if (
b == 0x10
and i + 1 < len(raw_params)
and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10)
):
s.append(0x10) # double the 0x10 so it survives device de-stuffing
i += 1
# DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
chk, i = 0, 0
@@ -398,28 +434,26 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -
def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
"""
Build the 10-byte params block for the SUB 5A termination request.
DEPRECATED DO NOT USE IN NEW CODE.
The termination request uses offset=0x005A and a DIFFERENT params layout
the leading 0x00 byte is dropped, key4[0:2] shifts to params[0:2], and the
counter high byte is at params[2]:
This is the v1 termination params helper, paired with the broken
`_BULK_TERM_OFFSET = 0x005A` magic offset_word. Together they produce a
~100-byte device-side terminator response that does NOT contain the
partial-last-chunk waveform tail or the 26-byte file footer. Files
reconstructed using this terminator are missing their last ~512 bytes of
waveform data and have a synthesized footer that disagrees with what BW
would have written.
params[0] = key4[0]
params[1] = key4[1]
params[2] = (counter >> 8) & 0xFF
params[3:] = zeros
**For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
which computes the correct offset_word + params from the STRT-derived
`end_offset`. v2 produces wire bytes that match BW exactly across all
tested events (4-27-26 / 5-1-26 / 5-4-26 captures).
Counter for the termination request = last_regular_counter + 0x0400.
Confirmed from 1-2-26 BW TX capture: final request (frame 83) uses
offset=0x005A, params[0:3] = key4[0:2] + term_counter_hi.
Args:
key4: 4-byte waveform key.
counter: Termination counter (= last regular counter + 0x0400).
Returns:
10-byte params block.
This function is retained ONLY for the defensive fallback path in
`read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
chunks are fetched (= a malformed event or an unexpected device state).
The fallback already logs a WARNING when it activates; if you see that
warning, the bug is upstream STRT should have been parseable.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
@@ -430,6 +464,123 @@ def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
return bytes(p)
def bulk_waveform_term_v2(
key4: bytes,
end_offset: int,
last_chunk_counter: int,
) -> tuple[int, bytes]:
"""
Compute the SUB 5A TERM frame's offset_word and 10-byte params block.
Confirmed across 3 events (4-27-26 + 5-1-26 captures):
next_boundary = last_chunk_counter + 0x0200
offset_word = end_offset - next_boundary (residual byte count)
params[0] = key4[0] (= 0x01 on every observed device)
params[1] = key4[1] (= 0x11)
params[2] = (next_boundary >> 8) & 0xFF
params[3] = next_boundary & 0xFF
params[4:10] = zeros
Verification:
| end_offset | last_chunk | next_boundary | offset_word | params[2:4] |
| 0x1ABE | 0x1800 | 0x1A00 | 0x00BE | 1A 00 |
| 0x21F2 | 0x1E00 | 0x2000 | 0x01F2 | 20 00 |
| 0x417E | 0x3E38 | 0x4038 | 0x0146 | 40 38 |
The device receives `requested_address = (params[2] << 8) | offset_word`
and replies with `(end_offset - next_boundary)` bytes of waveform tail
starting at `next_boundary` including the 26-byte file footer.
Args:
key4: 4-byte waveform key for this event.
end_offset: Event-end pointer (= `(end_key[2] << 8) | end_key[3]`
from the STRT record at data[23:27] of A5[0]).
last_chunk_counter: Counter of the last full 0x0200-byte chunk fetched
(the chunk that covers [last_chunk_counter,
last_chunk_counter + 0x0200)).
Returns:
(offset_word, params10) tuple. Pass as
`build_5a_frame(offset_word, params)`.
Raises:
ValueError: on inconsistent inputs.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
next_boundary = last_chunk_counter + 0x0200
if next_boundary > 0xFFFF:
raise ValueError(
f"next_boundary 0x{next_boundary:04X} exceeds uint16; check inputs"
)
if end_offset <= last_chunk_counter:
raise ValueError(
f"end_offset 0x{end_offset:04X} must be > "
f"last_chunk_counter 0x{last_chunk_counter:04X}"
)
offset_word = end_offset - next_boundary
if offset_word < 0:
# Last chunk overshot end_offset; caller should have stopped one chunk
# earlier. Treat as zero residual.
offset_word = 0
if offset_word > 0xFFFF:
raise ValueError(
f"offset_word 0x{offset_word:04X} exceeds uint16"
)
p = bytearray(10)
p[0] = key4[0]
p[1] = key4[1]
p[2] = (next_boundary >> 8) & 0xFF
p[3] = next_boundary & 0xFF
return offset_word, bytes(p)
# ── End-offset extraction from STRT record ────────────────────────────────────
STRT_MARKER = b"STRT"
def parse_strt_end_offset(a5_data: bytes) -> Optional[int]:
"""
Extract the event-end offset from the STRT record in an A5 response payload.
The first A5 response (the probe response, or the first chunk for events
with non-zero start_key[2:4]) contains a STRT record at byte offset 17 of
`data`. Layout:
data[17:21] "STRT"
data[21:23] ff fe sentinel
data[23:27] end_key 4-byte key of where this event ENDS
data[27:31] start_key
...
Returns `(end_key[2] << 8) | end_key[3]` the absolute device-buffer
address where the event ends. Use this to bound the chunk loop and to
compute the TERM frame.
Verified end_offset values:
| event start_key | end_key | end_offset |
| 01110000 | 01111ABE | 0x1ABE |
| 01110000 | 011121F2 | 0x21F2 |
| 011121F2 | 0111417E | 0x417E |
Args:
a5_data: The `data` field of an A5 response frame (frame.data).
Returns:
The end_offset (uint16) if STRT is found, else None.
"""
pos = a5_data.find(STRT_MARKER)
if pos < 0 or pos + 10 > len(a5_data):
return None
# data[pos+4:pos+6] is "ff fe"; data[pos+6:pos+10] is end_key.
end_key = a5_data[pos + 6 : pos + 10]
if len(end_key) < 4:
return None
return (end_key[2] << 8) | end_key[3]
# ── Pre-built POLL frames ─────────────────────────────────────────────────────
#
# POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
@@ -457,6 +608,11 @@ class S3Frame:
page_lo: int # PAGE_LO from header
data: bytes # payload data section (payload[5:], checksum already stripped)
checksum_valid: bool
chk_byte: int = 0 # actual checksum byte received from wire (body[-1])
# needed for waveform file reconstruction: when the last data byte
# is 0x10 and chk_byte ∈ {0x02, 0x03, 0x04}, the DLE+chk pair
# must be included in the DLE-strip operation to correctly
# reconstruct the Blastware binary body.
@property
def page_key(self) -> int:
@@ -465,7 +621,6 @@ class S3Frame:
# ── Streaming S3 frame parser ─────────────────────────────────────────────────
class S3FrameParser:
"""
Incremental byte-stream parser for S3BW response frames.
@@ -597,4 +752,5 @@ class S3FrameParser:
page_lo = raw_payload[4],
data = raw_payload[5:],
checksum_valid = (chk_received == chk_computed),
chk_byte = chk_received,
)
+153 -6
View File
@@ -201,6 +201,58 @@ class Timestamp:
second=second,
)
@classmethod
def from_short_record(cls, data: bytes) -> "Timestamp":
"""
Decode an 8-byte timestamp header from a 210-byte waveform record.
Wire layout ( CONFIRMED 2026-05-01 against live SFM run on BE11529 in
Continuous mode, day-of-month = 1 May, raw: 01 05 07 ea 00 0d 15 25):
byte[0]: day (uint8)
byte[1]: month (uint8)
bytes[2-3]: year (big-endian uint16)
byte[4]: unknown (0x00 in observed sample)
byte[5]: hour (uint8)
byte[6]: minute (uint8)
byte[7]: second (uint8)
This is a third format observed in the wild distinct from the 9-byte
(single-shot, sub_code=0x10 at [1]) and 10-byte (continuous, 0x10 at
[0] AND [2]) layouts. No marker bytes; disambiguated by where the
year lands when scanned at byte 2/3/4.
Args:
data: at least 8 bytes; only the first 8 are consumed.
Returns:
Decoded Timestamp.
Raises:
ValueError: if data is fewer than 8 bytes.
"""
if len(data) < 8:
raise ValueError(
f"Short record timestamp requires at least 8 bytes, got {len(data)}"
)
day = data[0]
month = data[1]
year = struct.unpack_from(">H", data, 2)[0]
unknown_byte = data[4]
hour = data[5]
minute = data[6]
second = data[7]
return cls(
raw=bytes(data[:8]),
flag=0,
year=year,
unknown_byte=unknown_byte,
month=month,
day=day,
hour=hour,
minute=minute,
second=second,
)
@property
def clock_set(self) -> bool:
"""False when year == 1995 (factory default / battery-lost state)."""
@@ -269,7 +321,7 @@ class ChannelConfig:
label: str # e.g. "Tran", "Vert", "Long", "MicL" ✅
trigger_level: float # in/s (geo) or psi (MicL) ✅
alarm_level: float # in/s (geo) or psi (MicL) ✅
max_range: float # full-scale calibration constant (e.g. 6.206) 🔶
max_range: float # hardware/firmware sensitivity constant (e.g. 6.206053) ✅ confirmed same on all units
unit_label: str # e.g. "in./s" or "psi" ✅
@@ -338,15 +390,34 @@ class ComplianceConfig:
raw: Optional[bytes] = None # full 2090-byte payload (for debugging)
# Recording parameters (✅ CONFIRMED from §7.6)
record_time: Optional[float] = None # seconds (7.0, 10.0, 13.0, etc.)
sample_rate: Optional[int] = None # sps (1024, 2048, 4096, etc.) — NOT YET FOUND ❓
recording_mode: Optional[int] = None # uint8: 0x00=Single Shot, 0x01=Continuous,
# 0x03=Histogram, 0x04=Histogram+Continuous ✅ confirmed 2026-04-20
# Read (E5): data[anchor_pos - 8] (6-byte anchor)
# Write (SUB 71): data[anchor_pos - 7]
sample_rate: Optional[int] = None # sps (1024, 2048, 4096)
histogram_interval_sec: Optional[int] = None # uint16 BE, seconds ✅ confirmed 2026-04-20
# anchor_pos - 4 (same offset in read & write)
# Valid values: 2, 5, 15, 60, 300, 900
# Mode-gated: only active in Histogram/Histogram+Continuous
record_time: Optional[float] = None # seconds (e.g. 3.0, 5.0, 8.0, 10.0)
# Trigger/alarm levels (✅ CONFIRMED per-channel at §7.6)
# For now we store the first geo channel (Transverse) as representatives;
# full per-channel data would require structured Channel objects.
trigger_level_geo: Optional[float] = None # in/s (first geo channel)
alarm_level_geo: Optional[float] = None # in/s (first geo channel)
max_range_geo: Optional[float] = None # in/s full-scale range
trigger_level_geo: Optional[float] = None # in/s (first geo channel)
alarm_level_geo: Optional[float] = None # in/s (first geo channel)
geo_adc_scale: Optional[float] = None # ADC-to-velocity scale factor (float32 at Tran+28) ✅
# = inverse sensitivity = 1/sensitivity (in/s per V)
# Formula (Interface Handbook §4.5): Range = 1.61133 V × scale_factor
# → 1.61133 × 6.206053 = 10.000 in/s (Normal range) ✅
# Firmware uses: PPV (in/s) = ADC_voltage (V) × 6.206053
# Identical on BE11529 and BE18189 — same Instantel geophone hardware.
# NOT a user-configurable setting. Must NOT be written.
geo_range: Optional[int] = None # range/sensitivity selector — CONFIRMED 2026-04-20
# 0x00 = Normal 10.000 in/s (standard gain)
# 0x01 = Sensitive 1.250 in/s (high gain)
# Offset: Tran+33 in both E5 read and SUB 71 write payloads
# (same 2126-byte buffer is round-tripped; applied to Tran/Vert/Long)
# Project/setup strings (sourced from E5 / SUB 71 write payload)
# These are the FULL project metadata from compliance config,
@@ -359,6 +430,78 @@ class ComplianceConfig:
notes: Optional[str] = None # extended notes / additional info
# ── Call Home Config ──────────────────────────────────────────────────────────
@dataclass
class CallHomeConfig:
"""
Auto Call Home (ACH) configuration from SUB 0x2C (response 0xD3).
Read with a standard two-step protocol (probe offset=0x00, data offset=0x7C).
Written via SUB 0x7E (write, 127-byte payload) + SUB 0x7F (confirm).
Confirmed from 4-20-26 call home settings captures (11 BW + S3 capture pairs).
Raw payload layout (data[11:] from S3 response, 125 bytes):
[0] 0x00 header byte
[1] 0x7C = 124 inner length (= offset for SUB 0x7E write - 2)
[2] 0xDC constant
[3:5] 0x00 0x00 padding
[5] auto_call_home_enabled (0x00=off, 0x01=on)
[6:46] dial_string 40-byte null-padded ASCII
[46:87] auto_answer_raw AT command strings (not decoded) present
[87] after_event_recorded (0x01=on, 0x00=off)
[91] at_specified_times (0x01=on, 0x00=off)
[93] time1_enabled (0x01=on, 0x00=off)
[95] time2_enabled (0x01=on, 0x00=off)
[101] time1_hour uint8 decimal 0-23
[102] time1_min uint8 decimal 0-59
[105] time2_hour uint8 decimal 0-23
[106] time2_min uint8 decimal 0-59
[117] DLE prefix (0x10) DLE-escaped num_retries=3 (0x03)
[118] 0x03 device stores/returns 0x03 DLE-escaped
[120] time_between_retries_sec uint8 (= 0x0F = 15 s default)
[122] wait_for_connection_sec uint8 (= 0x3C = 60 s default)
[124] warm_up_time_sec uint8 (= 0x3C = 60 s default)
Write payload = raw 125 bytes + b'\\x00\\x00' (2 trailing zeros) = 127 bytes.
Offset for SUB 0x7E: data[1] + 2 = 0x7C + 2 = 0x7E = 126.
Note on DLE-escaped 0x03: The device's S3 response DLE-escapes ETX (0x03)
bytes as \\x10\\x03. The S3FrameParser preserves both bytes in frame.data.
Subsequent fields after offset 117 are therefore at raw_offset = logical+1.
The raw payload must be round-tripped verbatim in write; do NOT reapply DLE
destuffing or stripping.
"""
raw: Optional[bytes] = None # raw 125-byte read payload (for round-trip write)
# ── Main enable ──────────────────────────────────────────────────────────
auto_call_home_enabled: Optional[bool] = None # raw[5] ✅
# ── Dial string ──────────────────────────────────────────────────────────
dial_string: Optional[str] = None # raw[6:46] 40-byte null-padded ASCII ✅
# ── When to call ─────────────────────────────────────────────────────────
after_event_recorded: Optional[bool] = None # raw[87] ✅
at_specified_times: Optional[bool] = None # raw[91] ✅
# ── Time slot 1 ──────────────────────────────────────────────────────────
time1_enabled: Optional[bool] = None # raw[93] ✅
time1_hour: Optional[int] = None # raw[101] 0-23 ✅
time1_min: Optional[int] = None # raw[102] 0-59 ✅
# ── Time slot 2 ──────────────────────────────────────────────────────────
time2_enabled: Optional[bool] = None # raw[95] ✅
time2_hour: Optional[int] = None # raw[105] 0-23 ✅
time2_min: Optional[int] = None # raw[106] 0-59 ✅
# ── Retry / timeout settings (read-only; not writable via set_call_home_config) ──
num_retries: Optional[int] = None # raw[117:119]=10 03 → value 3 ✅
time_between_retries_sec: Optional[int] = None # raw[120] (shifted +1 by DLE) ✅
wait_for_connection_sec: Optional[int] = None # raw[122] ✅
warm_up_time_sec: Optional[int] = None # raw[124] ✅
# ── Event ─────────────────────────────────────────────────────────────────────
@dataclass
@@ -402,6 +545,10 @@ class Event:
# Set by get_events(); required by download_waveform().
_waveform_key: Optional[bytes] = field(default=None, repr=False)
# Raw A5 frames from the full bulk waveform download (full_waveform=True).
# Populated by get_events() when full_waveform=True; used by write_blastware_file().
_a5_frames: Optional[list] = field(default=None, repr=False)
def __str__(self) -> str:
ts = str(self.timestamp) if self.timestamp else "no timestamp"
ppv = ""
+328 -100
View File
@@ -35,6 +35,8 @@ from .framing import (
token_params,
bulk_waveform_params,
bulk_waveform_term_params,
bulk_waveform_term_v2,
parse_strt_end_offset,
POLL_PROBE,
POLL_DATA,
SESSION_RESET,
@@ -65,6 +67,7 @@ SUB_WAVEFORM_HEADER = 0x0A
SUB_WAVEFORM_RECORD = 0x0C
SUB_BULK_WAVEFORM = 0x5A
SUB_COMPLIANCE = 0x1A
SUB_CALL_HOME = 0x2C # Call home config read → response 0xD3 ✅
SUB_UNKNOWN_2E = 0x2E
# Write command SUBs (= Read SUB + 0x60, confirmed from BW captures 3-11-26)
@@ -78,6 +81,10 @@ SUB_WRITE_CONFIRM_C = 0x74 # Confirm C — sent after 69 ✅
SUB_TRIGGER_CONFIG_WRITE = 0x82 # Write trigger config (0x22 + 0x60) ✅
SUB_TRIGGER_CONFIRM = 0x83 # Confirm trigger write ✅
# Call home write SUBs (confirmed from 4-20-26 call home settings captures)
SUB_CALL_HOME_WRITE = 0x7E # Write call home config → response 0x81 ✅
SUB_CALL_HOME_CONFIRM = 0x7F # Confirm call home write → response 0x80 ✅
# Monitoring control SUBs (confirmed from 4-8-26/2ndtry BW TX capture)
SUB_START_MONITORING = 0x96 # Start monitoring → response 0x69 ✅
SUB_STOP_MONITORING = 0x97 # Stop monitoring → response 0x68 ✅
@@ -109,6 +116,7 @@ DATA_LENGTHS: dict[int, int] = {
# SUB_WAVEFORM_HEADER (0x0A) is VARIABLE — length read from probe response
# data[4]. Do NOT add it here; use read_waveform_header() instead. ✅
SUB_WAVEFORM_RECORD: 0xD2, # 210-byte waveform/histogram record ✅
SUB_CALL_HOME: 0x7C, # 124-byte call home config ✅ (confirmed 4-20-26)
SUB_UNKNOWN_2E: 0x1A, # 26 bytes, purpose TBD 🔶
0x09: 0xCA, # 202 bytes, purpose TBD 🔶
# SUB_COMPLIANCE (0x1A) uses a multi-step sequence with a 2090-byte total;
@@ -116,14 +124,22 @@ DATA_LENGTHS: dict[int, int] = {
}
# SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
# Confirmed from 1-2-26 BW TX capture analysis (2026-04-02).
_BULK_CHUNK_OFFSET = 0x1004 # offset field for probe + all regular chunk requests ✅
_BULK_TERM_OFFSET = 0x005A # offset field for termination request ✅
_BULK_COUNTER_STEP = 0x0400 # chunk counter increment per chunk ✅
# Chunk counter formula: chunk_num * 0x0400 for ALL chunks including chunk 1.
# Earlier captures showed 0x1004 for chunk 1 — that was a Blastware artifact, not a
# protocol requirement. Confirmed 2026-04-06: 0x0400 for chunk 1 works; 0x1004
# causes a 120-second device timeout. Formula n * 0x0400 is used for all chunks.
#
# 2026-05-01 minimal-fix: the chunk-counter walk is now bounded by the event's
# `end_offset` extracted from the STRT record at data[23:27] of the probe
# response. Without this bound the loop kept asking for chunks past the event
# end and the device responded with post-event circular-buffer garbage,
# corrupting reconstructed Blastware files for events ≥ 2 sec.
#
# We keep the OLD 0x0400 chunk step here (BW actually uses 0x0200 — see §7.8.5
# of the protocol reference for the corrected understanding) because the
# existing blastware_file.py builder relies on the 0x0400-step frame structure
# to produce valid files. Switching to BW's 0x0200 step is a separate task
# that also requires updating the file builder.
# BW-exact protocol values (v0.14.0). Verified against 4-27-26 + 5-1-26 captures.
_BULK_CHUNK_OFFSET = 0x1002 # offset_word for probe + all chunk requests
_BULK_TERM_OFFSET = 0x005A # offset_word for the legacy terminator (fallback only)
_BULK_COUNTER_STEP = 0x0200 # chunk counter increment (matches chunk payload size)
# Default timeout values (seconds).
# MiniMate Plus is a slow device — keep these generous.
@@ -518,142 +534,270 @@ class MiniMateProtocol:
self,
key4: bytes,
*,
stop_after_metadata: bool = True,
max_chunks: int = 32,
) -> list[bytes]:
stop_after_metadata: bool = True, # DEPRECATED — no-op under BW-exact walk
max_chunks: int = 256, # safety cap only; loop is bounded by end_offset
include_terminator: bool = False,
extra_chunks_after_metadata: int = 1, # DEPRECATED — no-op
) -> list[S3Frame]:
"""
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event.
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event using
Blastware's exact protocol. REWRITTEN 2026-05-02 (v0.14.0).
The bulk waveform stream carries both raw ADC samples (large) and
event-time metadata strings ("Project:", "Client:", "User Name:",
"Seis Loc:", "Extended Notes") embedded in one of the middle frames
(confirmed: A5[7] of 9 for 1-2-26 capture).
Algorithm (matches BW captures across 2-sec / 3-sec / event-2):
Protocol is request-per-chunk, NOT a continuous stream:
1. Probe (offset=_BULK_CHUNK_OFFSET, is_probe=True, counter=0x0000)
2. Chunks (offset=_BULK_CHUNK_OFFSET, is_probe=False, counter+=0x0400)
3. Loop until metadata found (stop_after_metadata=True) or max_chunks
4. Termination (offset=_BULK_TERM_OFFSET, counter=last+_BULK_COUNTER_STEP)
Device responds with a final A5 frame (page_key=0x0000).
1. Probe
- For events at start_key[2:4] = 0x0000 (first event after erase
/ wrap): probe at counter=0x0000 with full key in params.
- For continuation events (start_key[2:4] != 0): first chunk at
counter = start_key[2:4] + 0x0046; acts as both probe and
first sample chunk; response carries STRT.
The termination frame (page_key=0x0000) is NOT included in the returned list.
2. Parse end_offset from STRT record at data[23:27] of the probe response.
Args:
key4: 4-byte waveform key from EVENT_HEADER (1E).
stop_after_metadata: If True (default), send termination as soon as
b"Project:" is found in a frame's data — avoids
downloading the full ADC waveform payload (several
hundred KB). Set False to download everything.
max_chunks: Safety cap on the number of chunk requests sent
(default 32; a typical event uses 9 large frames).
3. Read two fixed metadata pages at counter=0x1002 and counter=0x1004
global session metadata (Project / Client / User Name / Seis Loc
/ Extended Notes ASCII strings). Event 1 only; continuation
events skip these (BW caches them across the session).
4. Walk sample chunks at 0x0200 increments, starting from 0x0600 for
event 1 or `start + 0x0046 + 0x0200` for continuation events.
Stop when `next_chunk + 0x0200 > end_offset`.
5. Send TERM frame with offset_word and params computed by
`bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`.
The TERM response contains the partial last chunk (residual =
end_offset - next_boundary) including the 26-byte 0e 08 file
footer.
Returns:
List of raw data bytes from each A5 response frame (not including
the terminator frame). Frame indices match the request sequence:
index 0 = probe response, index 1 = first chunk, etc.
List of S3Frame objects from each A5 response (probe, metadata
pages, sample chunks, optional TERM response). Caller passes
`include_terminator=True` (e.g. write_blastware_file) to keep the
TERM response in the list it's required to reconstruct the
file footer.
Deprecated kwargs:
stop_after_metadata: legacy "Project:"-string-based stop condition.
No-op under the BW-exact walk; the loop is
deterministically bounded by end_offset from
STRT. Accepted for backward compat.
extra_chunks_after_metadata: same.
Raises:
ProtocolError: on timeout, bad checksum, or unexpected SUB.
Confirmed from 1-2-26 BW TX/RX captures (2026-04-02):
- probe + 8 regular chunks + 1 termination = 10 TX frames
- 9 large A5 responses + 1 terminator A5 = 10 RX frames
- page_key=0x0010 on large frames; page_key=0x0000 on terminator
- "Project:" metadata at A5[7].data[626]
ProtocolError: on timeout / bad checksum / unexpected SUB.
"""
if len(key4) != 4:
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xFF - 0x5A = 0xA5
frames_data: list[bytes] = []
counter = 0
# Quietly accept and warn on deprecated kwargs.
if not stop_after_metadata:
log.debug("5A: stop_after_metadata=False is no-op under BW-exact walk")
if extra_chunks_after_metadata not in (0, 1):
log.debug("5A: extra_chunks_after_metadata=%d is no-op under BW-exact walk",
extra_chunks_after_metadata)
# ── Step 1: probe ────────────────────────────────────────────────────
log.debug("5A probe key=%s", key4.hex())
params = bulk_waveform_params(key4, 0, is_probe=True)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset() # reset bytes_fed counter before probe recv
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xA5
frames_data: list[S3Frame] = []
start_offset = (key4[2] << 8) | key4[3]
is_event_1 = (start_offset == 0)
# ── Step 1: probe / first chunk ──────────────────────────────────────
if is_event_1:
probe_counter = 0
probe_params = bulk_waveform_params(key4, 0, is_probe=True)
log.debug("5A probe (event-1) key=%s counter=0x0000", key4.hex())
else:
# Continuation events: first 5A request lands at counter = key[2:4]
# (i.e. the address of the off=0x46 WAVEHDR record returned by 1F).
# The probe response carries STRT at byte 17 with end_offset.
#
# Confirmed 2026-05-04 from 5-1-26 "copy 2nd address" capture
# (BW probes counter=0x2238 with key=01112238, STRT@17 end=0x417E)
# and 5-4-26 BW captures (2-sec event probes counter=0x2238).
#
# The earlier "+0x46" formula in the doc came from calling
# start_key the BOUNDARY (off=0x2C) key, but the iteration walk
# uses 1F's off=0x46 key as cur_key, which already incorporates
# the +0x46 offset relative to the boundary. Adding it again
# caused the probe to overshoot, miss STRT, and run uncapped.
probe_counter = start_offset
probe_params = bulk_waveform_params(key4, probe_counter)
log.debug(
"5A probe (event-N) key=%s counter=0x%04X",
key4.hex(), probe_counter,
)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, probe_params))
self._parser.reset()
try:
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False)
except TimeoutError:
log.warning(
"5A probe TIMED OUT for key=%s"
"%d raw bytes received (no complete A5 frame assembled)",
"5A probe TIMED OUT for key=%s%d raw bytes received",
key4.hex(), self._parser.bytes_fed,
)
raise
frames_data.append(rsp.data)
log.debug("5A A5[0] page_key=0x%04X %d bytes", rsp.page_key, len(rsp.data))
# ── Step 2: chunk loop ───────────────────────────────────────────────
# Chunk counters are monotonic: chunk_num * 0x0400 for all chunks.
# The 4-2-26 BW TX capture showed 0x1004 for chunk 1, but this is a
# Blastware artifact — the device accepts any counter value and streams
# data regardless. Empirically confirmed 2026-04-06: 0x0400 for chunk 1
# works; 0x1004 causes the device to ignore the frame (timeout).
for chunk_num in range(1, max_chunks + 1):
counter = chunk_num * _BULK_COUNTER_STEP
params = bulk_waveform_params(key4, counter)
log.debug("5A chunk %d counter=0x%04X", chunk_num, counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset() # reset bytes_fed for accurate per-chunk count
frames_data.append(rsp)
log.debug("5A A5[0] (probe) page_key=0x%04X %d bytes",
rsp.page_key, len(rsp.data))
# ── Step 2: parse STRT end_offset from probe response ────────────────
end_offset = parse_strt_end_offset(rsp.data)
if end_offset is None:
log.warning(
"5A probe response did not contain a STRT record; "
"cannot bound chunk loop — falling back to max_chunks=%d cap",
max_chunks,
)
end_offset = 0xFFFF # impossible value → loop runs to max_chunks
else:
log.info(
"5A STRT start_offset=0x%04X end_offset=0x%04X size=0x%04X",
start_offset, end_offset, end_offset - start_offset,
)
# ── Step 3: metadata pages 0x1002 + 0x1004 (event 1 only) ────────────
# Confirmed from BW captures: BW reads these two fixed device-buffer
# pages immediately after the probe for events at start_key[2:4]=0.
# Continuation events skip them (BW caches across the session).
# Their content is global compliance-setup metadata: Project, Client,
# User Name, Seis Loc, Extended Notes.
if is_event_1:
for meta_counter in (0x1002, 0x1004):
# Metadata page params have an extra trailing 0x00 byte
# (12-byte params instead of 11) — empirical from BW captures.
# Checksum-neutral but matches BW byte-for-byte.
meta_params = bytes([
0x00,
key4[0], key4[1],
(meta_counter >> 8) & 0xFF,
meta_counter & 0xFF,
0, 0, 0, 0, 0, 0, 0,
])
log.debug("5A metadata page counter=0x%04X", meta_counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, meta_params))
self._parser.reset()
try:
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False, timeout=10.0)
meta_rsp = self._recv_one(
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
)
except TimeoutError:
log.warning(
"5A metadata page 0x%04X TIMED OUT — continuing",
meta_counter,
)
continue
frames_data.append(meta_rsp)
log.debug(
"5A meta@0x%04X page_key=0x%04X %d bytes",
meta_counter, meta_rsp.page_key, len(meta_rsp.data),
)
# ── Step 4: sample chunk loop, bounded by end_offset ─────────────────
# Sample chunks start at:
# event 1: counter = 0x0600
# event N (>0): counter = probe_counter + 0x0200
# (probe was the first sample chunk)
if is_event_1:
counter = 0x0600
else:
counter = probe_counter + _BULK_COUNTER_STEP
last_chunk_counter: Optional[int] = (
probe_counter if not is_event_1 else None
)
chunks_fetched = 0
while chunks_fetched < max_chunks:
# Stop when next chunk would straddle the event end.
if counter + _BULK_COUNTER_STEP > end_offset:
log.debug(
"5A chunk loop done at counter=0x%04X (end=0x%04X); "
"%d chunks fetched",
counter, end_offset, chunks_fetched,
)
break
params = bulk_waveform_params(key4, counter)
log.debug("5A chunk #%d counter=0x%04X", chunks_fetched + 1, counter)
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
self._parser.reset()
try:
rsp = self._recv_one(
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
)
except TimeoutError:
raw = self._parser.bytes_fed
log.warning(
"5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
chunk_num, counter, raw,
chunks_fetched + 1, counter, raw,
)
if raw > 0 and frames_data:
# Device sent a partial byte (likely a bare DLE/ETX end-of-stream
# signal) but never completed a full frame. Treat as graceful
# stream end and fall through to the termination step.
log.warning(
"5A end-of-stream detected at chunk=%d (raw_bytes=%d, "
"frames_collected=%d) — proceeding to termination",
chunk_num, raw, len(frames_data),
"5A unexpected end-of-stream — proceeding to TERM",
)
break
raise
log.warning(
"5A RX chunk=%d page_key=0x%04X data_len=%d contains_Project=%s",
chunk_num, rsp.page_key, len(rsp.data), b"Project:" in rsp.data,
log.debug(
"5A RX chunk=%d page_key=0x%04X data_len=%d",
chunks_fetched + 1, rsp.page_key, len(rsp.data),
)
if rsp.page_key == 0x0000:
# Device unexpectedly terminated mid-stream (no termination needed).
log.debug("5A A5[%d] page_key=0x0000 — device terminated early", chunk_num)
# Device terminated mid-stream unexpectedly.
log.warning(
"5A unexpected page_key=0x0000 mid-stream at counter=0x%04X",
counter,
)
if include_terminator:
frames_data.append(rsp)
return frames_data
frames_data.append(rsp.data)
if stop_after_metadata and b"Project:" in rsp.data:
log.debug("5A A5[%d] metadata found — stopping early", chunk_num)
break
frames_data.append(rsp)
last_chunk_counter = counter
counter += _BULK_COUNTER_STEP
chunks_fetched += 1
else:
log.warning(
"5A reached max_chunks=%d without end-of-stream; sending termination",
max_chunks,
"5A reached max_chunks=%d at counter=0x%04X (end=0x%04X)",
max_chunks, counter, end_offset,
)
# ── Step 3: termination ──────────────────────────────────────────────
term_counter = counter + _BULK_COUNTER_STEP
term_params = bulk_waveform_term_params(key4, term_counter)
log.debug(
"5A termination term_counter=0x%04X offset=0x%04X",
term_counter, _BULK_TERM_OFFSET,
# ── Step 5: TERM with proper end_offset-derived formula ──────────────
if last_chunk_counter is None or end_offset == 0xFFFF:
# No STRT or no chunks fetched — fall back to legacy TERM.
log.warning(
"5A using legacy TERM (offset_word=0x005A); "
"end_offset unavailable or no chunks fetched",
)
legacy_counter = (last_chunk_counter or probe_counter) + _BULK_COUNTER_STEP
term_offset_word = _BULK_TERM_OFFSET # 0x005A
term_params = bulk_waveform_term_params(key4, legacy_counter)
else:
term_offset_word, term_params = bulk_waveform_term_v2(
key4, end_offset, last_chunk_counter,
)
self._send(build_5a_frame(_BULK_TERM_OFFSET, term_params))
try:
term_rsp = self._recv_one(expected_sub=rsp_sub)
log.debug(
"5A termination response page_key=0x%04X %d bytes",
"5A TERM offset_word=0x%04X params[2:4]=%s end=0x%04X "
"last_chunk=0x%04X",
term_offset_word, term_params[2:4].hex(),
end_offset, last_chunk_counter,
)
self._send(build_5a_frame(term_offset_word, term_params))
try:
term_rsp = self._recv_one(expected_sub=rsp_sub, timeout=10.0)
log.info(
"5A TERM response page_key=0x%04X %d bytes",
term_rsp.page_key, len(term_rsp.data),
)
if include_terminator:
frames_data.append(term_rsp)
except TimeoutError:
log.debug("5A no termination response — device may have already closed")
log.warning("5A no TERM response (timeout)")
return frames_data
@@ -793,7 +937,7 @@ class MiniMateProtocol:
continue
chunk = data_rsp.data[11:]
log.warning(
log.debug(
"read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
step_name, data_rsp.page_key, len(data_rsp.data),
len(chunk), len(config) + len(chunk),
@@ -813,17 +957,18 @@ class MiniMateProtocol:
except TimeoutError:
pass
log.warning(
log.info(
"read_compliance_config: done — %d cfg bytes total",
len(config),
)
# Hex dump first 128 bytes for field mapping
# Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
if log.isEnabledFor(logging.DEBUG):
for row in range(0, min(len(config), 128), 16):
row_bytes = bytes(config[row:row + 16])
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
log.warning(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
log.debug(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
return bytes(config)
@@ -1087,6 +1232,89 @@ class MiniMateProtocol:
self._send(frame)
return self.recv_write_ack(expected_sub=rsp_sub)
# ── Call home config (SUBs 0x2C / 0x7E / 0x7F) ──────────────────────────
def read_call_home_config(self) -> bytes:
"""
Read the auto call home configuration (SUB 0x2C response 0xD3).
Standard two-step read: probe (offset=0x00) then data (offset=0x7C=124).
Returns the raw 125-byte payload (data[11:] of the data response).
Confirmed from 4-20-26 call home settings capture:
- Probe response: data[4]=0x7C (confirms data length = 124)
- Data response: 136 bytes total (11-byte echo header + 125 bytes payload)
- Payload[0:3] = 0x00 0x7C 0xDC (header: zero, inner-length, constant)
- Payload[5] = auto_call_home_enabled
- Payload[6:46] = dial_string (40-byte null-padded ASCII "RADIO RING")
Returns:
Raw 125-byte call home config payload (data[11:]).
Suitable for round-trip write (append \\x00\\x00 127-byte write payload).
Raises:
ProtocolError: on timeout or wrong response SUB.
"""
rsp_sub = _expected_rsp_sub(SUB_CALL_HOME) # 0xFF - 0x2C = 0xD3
length = DATA_LENGTHS[SUB_CALL_HOME] # 0x7C = 124
log.debug("read_call_home_config: 0x2C probe")
self._send(build_bw_frame(SUB_CALL_HOME, 0))
self._recv_one(expected_sub=rsp_sub)
log.debug("read_call_home_config: 0x2C data request offset=0x%02X", length)
self._send(build_bw_frame(SUB_CALL_HOME, length))
data_rsp = self._recv_one(expected_sub=rsp_sub)
payload = data_rsp.data[11:]
log.debug("read_call_home_config: received %d payload bytes", len(payload))
return payload
def write_call_home_config(self, data: bytes) -> None:
"""
Write the auto call home configuration (SUB 0x7E 0x7F confirm).
Write sequence (confirmed from 4-20-26 call home settings captures):
SUB 0x7E write 127-byte payload device acks SUB 0x81
SUB 0x7F confirm (no data) device acks SUB 0x80
The 127-byte write payload = 125-byte read payload + b'\\x00\\x00'.
The offset field = data[1] + 2 = 0x7C + 2 = 0x7E = 126.
Write frame format: build_bw_write_frame (minimal DLE stuffing only
BW_CMD is doubled; all other bytes are RAW). The \\x10\\x03 sequence
within the payload is preserved as-is (device interprets DLE+ETX as the
literal value 0x03 per the inner-frame terminator convention).
Args:
data: 127-byte write payload (read payload + \\x00\\x00 footer).
Must start with [0x00][0x7C][...] (standard header).
Raises:
ValueError: if data is not exactly 127 bytes or lacks expected header.
ProtocolError: on timeout or wrong response SUB.
"""
if len(data) < 2:
raise ValueError(f"call home write payload must be at least 2 bytes, got {len(data)}")
rsp_sub_write = _expected_rsp_sub(SUB_CALL_HOME_WRITE) # 0xFF - 0x7E = 0x81
rsp_sub_confirm = _expected_rsp_sub(SUB_CALL_HOME_CONFIRM) # 0xFF - 0x7F = 0x80
# Offset formula: data[1] + 2 (same pattern as other single-chunk writes)
offset = data[1] + 2 # 0x7C + 2 = 0x7E = 126
frame = build_bw_write_frame(SUB_CALL_HOME_WRITE, data, offset=offset)
log.debug(
"write_call_home_config: %d bytes data[1]=0x%02X offset=0x%04X",
len(data), data[1], offset,
)
self._send(frame)
self.recv_write_ack(expected_sub=rsp_sub_write)
log.debug("write_call_home_config: write acked; sending confirm 0x7F")
confirm_frame = build_bw_write_frame(SUB_CALL_HOME_CONFIRM, b"")
self._send(confirm_frame)
self.recv_write_ack(expected_sub=rsp_sub_confirm)
log.debug("write_call_home_config: confirm acked — done")
# ── Monitoring ────────────────────────────────────────────────────────────
def read_monitor_status(self) -> S3Frame:
+99
View File
@@ -454,3 +454,102 @@ class SocketTransport(TcpTransport):
def __repr__(self) -> str:
return f"SocketTransport(peer={self.host!r})"
# ── Capturing transport (MITM-style raw byte mirror) ──────────────────────────
class CapturingTransport(BaseTransport):
"""
Wraps another BaseTransport and mirrors every byte to two raw capture files:
raw_bw_<...>.bin bytes WE wrote to the device (BW-side TX)
raw_s3_<...>.bin bytes the device wrote back (S3-side TX)
The file naming and on-wire byte layout are identical to the captures
produced by `bridges/ach_mitm.py`, so the resulting `.bin` files can be
loaded directly by the Analyzer (File > Open Capture) and parsed by the
same tooling used for genuine Blastware MITM captures.
All BaseTransport methods are forwarded to the inner transport; the only
side-effect is that successful read/write byte streams are appended to the
two open binary files.
Args:
inner: An already-built BaseTransport (SerialTransport / TcpTransport).
bw_path: File path for the "BW TX" stream (bytes we send). Opened "wb".
s3_path: File path for the "S3 TX" stream (bytes the device sends).
Opened "wb".
Example:
with CapturingTransport(TcpTransport("1.2.3.4", 9034),
"raw_bw.bin", "raw_s3.bin") as t:
client = MiniMateClient(transport=t)
client.connect()
client.get_events()
# both .bin files now hold the full bidirectional capture.
"""
def __init__(self, inner: BaseTransport, bw_path: str, s3_path: str) -> None:
self._inner = inner
self._bw_path = bw_path
self._s3_path = s3_path
self._bw_fh = None
self._s3_fh = None
# Forward inner attrs so callers can introspect (e.g. .host, .port).
self.host = getattr(inner, "host", None)
self.port = getattr(inner, "port", None)
# ── BaseTransport interface ───────────────────────────────────────────────
def connect(self) -> None:
if self._bw_fh is None:
self._bw_fh = open(self._bw_path, "wb", buffering=0)
if self._s3_fh is None:
self._s3_fh = open(self._s3_path, "wb", buffering=0)
self._inner.connect()
def disconnect(self) -> None:
try:
self._inner.disconnect()
finally:
for fh_attr in ("_bw_fh", "_s3_fh"):
fh = getattr(self, fh_attr)
if fh is not None:
try:
fh.flush()
fh.close()
except Exception:
pass
setattr(self, fh_attr, None)
@property
def is_connected(self) -> bool:
return self._inner.is_connected
def write(self, data: bytes) -> None:
self._inner.write(data)
if data and self._bw_fh is not None:
try:
self._bw_fh.write(data)
except Exception:
pass
def read(self, n: int) -> bytes:
got = self._inner.read(n)
if got and self._s3_fh is not None:
try:
self._s3_fh.write(got)
except Exception:
pass
return got
@property
def bw_path(self) -> str:
return self._bw_path
@property
def s3_path(self) -> str:
return self._s3_path
def __repr__(self) -> str:
return f"CapturingTransport({self._inner!r}, bw={self._bw_path!r}, s3={self._s3_path!r})"
+2
View File
@@ -53,7 +53,9 @@ SUB_TABLE: dict[int, tuple[str, str, str]] = {
0x82: ("TRIGGER_CONFIG_WRITE", "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
0x83: ("TRIGGER_WRITE_CONFIRM", "BW→S3", "Short frame; commit step after 0x82"),
# S3→BW responses
0x5A: ("BULK_WAVEFORM_STREAM", "BW→S3", "Bulk waveform chunk request; response is A5 stream"),
0xA4: ("POLL_RESPONSE", "S3→BW", "Response to SUB 5B poll"),
0xA5: ("BULK_WAVEFORM_RESPONSE", "S3→BW", "Response to SUB 5A; waveform chunks + metadata"),
0xFE: ("FULL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 01"),
0xF9: ("CHANNEL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 06"),
0xF7: ("EVENT_INDEX_RESPONSE", "S3→BW", "Response to SUB 08; contains backlight/power-save"),
+31 -34
View File
@@ -33,7 +33,7 @@ STX = 0x02
ETX = 0x03
ACK = 0x41
__version__ = "0.2.3"
__version__ = "0.2.5"
@dataclass
@@ -186,7 +186,7 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
IDLE = 0
IN_FRAME = 1
AFTER_DLE = 2
IN_FRAME_DLE = 2 # saw DLE inside frame — waiting for next byte
state = IDLE
body = bytearray()
@@ -206,66 +206,63 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
state = IN_FRAME
i += 2
continue
# ACK bytes, boot strings, garbage — silently ignored
elif state == IN_FRAME:
if b == DLE:
state = AFTER_DLE
state = IN_FRAME_DLE
i += 1
continue
body.append(b)
else: # AFTER_DLE
if b == DLE:
body.append(DLE)
state = IN_FRAME
i += 1
continue
if b == ETX:
# Bare ETX = real S3 frame terminator (confirmed from S3FrameParser)
end_offset = i + 1
trailer_start = i + 1
trailer_end = trailer_start + trailer_len
trailer = blob[trailer_start:trailer_end]
chk_valid = None
chk_type = None
chk_hex = None
payload = bytes(body)
if len(body) >= 1:
received_chk = body[-1]
computed_chk = checksum8_sum(bytes(body[:-1]))
if computed_chk == received_chk:
chk_valid = True
chk_type = "SUM8"
chk_hex = f"{received_chk:02x}"
payload = bytes(body[:-1])
else:
chk_valid = False
# S3 checksums are deliberately not validated here.
# Large S3 responses (A5 bulk waveform, E5 compliance) embed
# inner DLE+ETX sub-frame terminators whose trailing 0x03 byte
# lands where the parser would expect the SUM8 checksum, causing
# false failures. The live protocol (protocol.py _validate_frame)
# also skips S3 checksum enforcement for the same reason.
frames.append(Frame(
index=idx,
start_offset=start_offset,
end_offset=end_offset,
payload_raw=bytes(body),
payload=payload,
payload=bytes(body),
trailer=trailer,
checksum_valid=chk_valid,
checksum_type=chk_type,
checksum_hex=chk_hex
checksum_valid=None,
checksum_type=None,
checksum_hex=None
))
idx += 1
state = IDLE
i = trailer_end
continue
body.append(b)
else: # IN_FRAME_DLE
if b == DLE:
# DLE DLE → literal 0x10 in payload
body.append(DLE)
state = IN_FRAME
i += 1
continue
if b == ETX:
# DLE+ETX inside a frame = inner-frame terminator (A4/E5 sub-frames).
# Treat as literal data, NOT the outer frame end.
body.append(DLE)
body.append(ETX)
state = IN_FRAME
i += 1
continue
# Unexpected DLE + byte → treat as literal data
body.append(DLE)
body.append(b)
state = IN_FRAME
i += 1
continue
i += 1
+4 -1
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "seismo-relay"
version = "0.12.0"
version = "0.15.0"
description = "Python client and REST server for MiniMate Plus seismographs"
requires-python = ">=3.10"
dependencies = [
@@ -12,6 +12,9 @@ dependencies = [
"uvicorn[standard]>=0.24",
"pyserial>=3.5",
"sqlalchemy>=2.0",
"python-multipart>=0.0.7",
"h5py>=3.10",
"numpy>=1.24",
]
[tool.setuptools.packages.find]
+3
View File
@@ -2,3 +2,6 @@ fastapi
uvicorn
sqlalchemy
pyserial
python-multipart
h5py
numpy
+346
View File
@@ -0,0 +1,346 @@
"""
scripts/backfill_sidecars.py generate .sfm.json sidecars AND .h5
clean-waveform files for existing events already in the waveform store
that predate those features.
Walks `<store_root>/<serial>/<filename>` and for each BW event file:
Sidecar (.sfm.json):
- Skip when an existing sidecar's blastware.sha256 matches the
current BW file's sha256.
- Else regenerate: prefer .a5.pkl (full fidelity); fall back to
parsing the BW binary directly (peaks computed from samples).
Clean waveform (.h5):
- Skip when <filename>.h5 already exists (idempotent).
- Else write from .a5.pkl (preferred) or BW binary parse (fallback).
Usage:
python scripts/backfill_sidecars.py [--store-root PATH]
[--db-path PATH]
[--dry-run]
[--skip-hdf5]
[-v]
"""
from __future__ import annotations
import argparse
import logging
import sys
from pathlib import Path
# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from minimateplus import event_file_io
from sfm import event_hdf5
from sfm.waveform_store import WaveformStore, _frame_to_dict, _dict_to_frame # noqa: F401
from sfm.database import SeismoDb
log = logging.getLogger("backfill_sidecars")
def _looks_like_event_file(path: Path) -> bool:
"""Same heuristic as the importer CLI."""
if not path.is_file():
return False
if path.name.endswith((".a5.pkl", ".sfm.json")):
return False
ext = path.suffix.lstrip(".")
if not (3 <= len(ext) <= 4):
return False
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
return False
try:
return path.stat().st_size >= 70
except OSError:
return False
def main(argv=None) -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument(
"--db-path",
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
)
p.add_argument("--store-root", default=None)
p.add_argument("--dry-run", action="store_true")
p.add_argument(
"--skip-hdf5", action="store_true",
help="Don't generate .h5 clean-waveform files (only sidecars).",
)
p.add_argument(
"--force", action="store_true",
help=(
"Regenerate sidecars + .h5 even when an existing sidecar's "
"blastware.sha256 matches the current BW file. Use this after "
"upgrading seismo-relay to pull in decoder bug fixes (e.g. the "
"STRT-rectime byte-offset fix in v0.15.x)."
),
)
p.add_argument("-v", "--verbose", action="store_true")
args = p.parse_args(argv)
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
datefmt="%H:%M:%S",
)
db_path = Path(args.db_path).expanduser().resolve()
store_root = (
Path(args.store_root).expanduser().resolve()
if args.store_root else db_path.parent / "waveforms"
)
if not store_root.exists():
print(f"error: store root does not exist: {store_root}", file=sys.stderr)
return 2
store = WaveformStore(store_root)
db = SeismoDb(db_path)
written = skipped = errors = 0
for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
serial = serial_dir.name
for path in sorted(serial_dir.iterdir()):
if not _looks_like_event_file(path):
continue
sidecar_path = store.sidecar_path_for(serial, path.name)
try:
bw_sha = event_file_io.file_sha256(path)
except Exception as exc:
log.error("sha256 failed for %s: %s", path, exc)
errors += 1
continue
# Skip when an up-to-date sidecar already exists.
#
# Two-part freshness check:
# 1. blastware.sha256 must match the current BW file (proves
# the sidecar describes THIS file).
# 2. source.tool_version must be ≥ current TOOL_VERSION (proves
# the sidecar was written by a build that includes any
# decoder fixes shipped since).
# Either part failing → regenerate. --force bypasses both.
if sidecar_path.exists() and not args.force:
try:
existing = event_file_io.read_sidecar(sidecar_path)
sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
src_ver = existing.get("source", {}).get("tool_version", "")
def _vt(s):
try:
return tuple(int(p) for p in str(s).split(".")[:3])
except Exception:
return (0, 0, 0)
ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
if sha_ok and ver_ok:
skipped += 1
continue
if sha_ok and not ver_ok:
log.info(
"regenerating %s (sidecar tool_version=%s < current %s)",
sidecar_path.name, src_ver or "(none)",
event_file_io.TOOL_VERSION,
)
except Exception:
pass # fall through to rewrite
# Decide path: A5-based (high-fidelity) or BW-only.
a5_path = serial_dir / f"{path.name}.a5.pkl"
try:
if a5_path.exists():
frames = store.load_a5(serial, path.name)
if not frames:
raise RuntimeError("a5_pickle present but unreadable")
# Build an Event by replaying the A5 decoders. Note:
# the .a5.pkl alone CANNOT recover timestamp /
# record_type / waveform_key / per-channel peaks —
# those live in the 0C record, which isn't saved
# separately. We seed those from the DB row + the
# existing sidecar below so a re-backfill doesn't
# nuke fields the original save populated.
from minimateplus.client import (
_decode_a5_metadata_into,
_decode_a5_waveform,
)
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
ev = Event(index=-1)
_decode_a5_metadata_into(frames, ev)
_decode_a5_waveform(frames, ev)
source_kind = "sfm-live"
a5_filename = a5_path.name
else:
ev = event_file_io.read_blastware_file(path)
source_kind = "bw-import"
a5_filename = None
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
# ── Seed missing fields from the SeismoDb events row ──
# The DB row was populated at original save time with peaks,
# project info, timestamp, record_type, sample_rate, etc.
# All of those survive intact in SQLite; pull them onto the
# rebuilt Event so the regenerated sidecar matches what was
# there before the backfill ran.
db_row = None
try:
import sqlite3 as _sql
with _sql.connect(str(db.db_path)) as _conn:
_conn.row_factory = _sql.Row
db_row = _conn.execute(
"SELECT * FROM events "
"WHERE serial=? AND blastware_filename=? "
"LIMIT 1",
(serial, path.name),
).fetchone()
except Exception as exc:
log.debug("DB lookup failed for %s: %s", path.name, exc)
if db_row is not None:
if ev.sample_rate is None and db_row["sample_rate"]:
ev.sample_rate = int(db_row["sample_rate"])
if not ev.record_type and db_row["record_type"]:
ev.record_type = db_row["record_type"]
if ev._waveform_key is None and db_row["waveform_key"]:
try:
ev._waveform_key = bytes.fromhex(db_row["waveform_key"])
except Exception:
pass
# Timestamp from the ISO-8601 string in the DB row.
if ev.timestamp is None and db_row["timestamp"]:
try:
import datetime as _dt
_t = _dt.datetime.fromisoformat(db_row["timestamp"])
ev.timestamp = Timestamp(
raw=b"", flag=0x10,
year=_t.year, unknown_byte=0,
month=_t.month, day=_t.day,
hour=_t.hour, minute=_t.minute, second=_t.second,
)
except Exception:
pass
# Peaks from the DB row when the A5 decode didn't supply them.
if ev.peak_values is None:
ev.peak_values = PeakValues(
tran=db_row["tran_ppv"],
vert=db_row["vert_ppv"],
long=db_row["long_ppv"],
peak_vector_sum=db_row["peak_vector_sum"],
micl=db_row["mic_ppv"],
)
# Project info from the DB row when the A5 metadata-page
# decode didn't pick it up.
if ev.project_info is None or all(
v in (None, "")
for v in (
(ev.project_info.project if ev.project_info else None),
(ev.project_info.client if ev.project_info else None),
(ev.project_info.operator if ev.project_info else None),
(ev.project_info.sensor_location if ev.project_info else None),
)
):
ev.project_info = ProjectInfo(
project=db_row["project"],
client=db_row["client"],
operator=db_row["operator"],
sensor_location=db_row["sensor_location"],
)
# Derive total_samples when we have both rectime + sample_rate.
# The decoder's STRT-derived value can be a buffer offset
# rather than a sample count — drop it in that case.
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
# Preserve user-edited review state + extensions from the
# existing sidecar (false_trigger flag, notes, etc.) so a
# backfill never wipes them out.
preserved_review = None
preserved_ext = None
if sidecar_path.exists():
try:
_existing = event_file_io.read_sidecar(sidecar_path)
preserved_review = _existing.get("review")
preserved_ext = _existing.get("extensions")
except Exception:
pass
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=path.name,
blastware_filesize=path.stat().st_size,
blastware_sha256=bw_sha,
source_kind=source_kind,
a5_pickle_filename=a5_filename,
review=preserved_review,
extensions=preserved_ext,
)
# Also emit the .h5 clean-waveform file when missing OR when
# --force was passed (so a re-backfill picks up decoder fixes).
hdf5_path = store.hdf5_path_for(serial, path.name)
hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
hdf5_action = "kept"
need_h5 = not args.skip_hdf5 and (args.force or not hdf5_path.exists())
if need_h5:
if args.dry_run:
hdf5_action = "would (re)write"
else:
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range="normal",
source_kind=source_kind,
)
hdf5_filename = hdf5_path.name
hdf5_action = "rewrote" if hdf5_path.exists() else "wrote"
except Exception as exc:
log.warning("HDF5 write failed for %s: %s", path.name, exc)
hdf5_action = "FAILED"
if args.dry_run:
print(f" [DRY ] would write {sidecar_path.name} "
f"+ .h5 ({hdf5_action}) source={source_kind}")
written += 1
continue
event_file_io.write_sidecar(sidecar_path, sidecar)
# Best-effort: keep the SQL row's sidecar_filename in sync
# by upserting via insert_events (it dedups on serial+ts).
try:
db.insert_events(
[ev], serial=serial,
waveform_records=(
{ev._waveform_key.hex(): {
"filename": path.name,
"filesize": path.stat().st_size,
"a5_pickle_filename": a5_filename,
"sidecar_filename": sidecar_path.name,
}}
if ev._waveform_key else None
),
)
except Exception as exc:
log.warning("DB upsert failed for %s: %s", path.name, exc)
print(f" [OK ] {path.name}{sidecar_path.name} "
f"+ h5 ({hdf5_action}) source={source_kind}")
written += 1
except Exception as exc:
log.error("backfill failed for %s: %s", path, exc, exc_info=args.verbose)
errors += 1
print(f"\nDone. written={written} skipped(uptodate)={skipped} errors={errors}")
return 0 if errors == 0 else 1
if __name__ == "__main__":
sys.exit(main())
+912 -65
View File
File diff suppressed because it is too large Load Diff
+132 -3
View File
@@ -83,6 +83,15 @@ class CachedEvent(Base):
Events are immutable once recorded on the device; once we have an event in
the cache it never needs to be re-downloaded unless explicitly requested.
The two extra columns `waveform_key` and `event_timestamp` are an
integrity stamp: when set_event() / set_waveform() are called with a
different (waveform_key, event_timestamp) for the same (conn_key, index),
we know the device was erased and re-recorded the cached row no longer
refers to the same physical event and the entire device's cache is
flushed before the new entry is written. This catches the post-erase
key-reuse bug where the device's first new event (key 01110000) collides
with the first event we previously downloaded.
"""
__tablename__ = "cached_events"
@@ -90,6 +99,8 @@ class CachedEvent(Base):
index = sa.Column(sa.Integer, primary_key=True)
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
class CachedWaveform(Base):
@@ -97,7 +108,9 @@ class CachedWaveform(Base):
Full raw ADC waveform for a single event (SUB 5A full download).
These are large (up to several MB) and expensive to fetch over cellular.
Once downloaded they are immutable and cached permanently.
Once downloaded they are immutable and cached permanently but the
cache row is invalidated when the device is erased and a new event lands
at the same index (see CachedEvent docstring).
"""
__tablename__ = "cached_waveforms"
@@ -105,6 +118,8 @@ class CachedWaveform(Base):
index = sa.Column(sa.Integer, primary_key=True)
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
cached_at = sa.Column(sa.Float, nullable=False)
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
class CachedMonitorStatus(Base):
@@ -149,6 +164,23 @@ class SFMCache:
engine = sa.create_engine(url, connect_args={"check_same_thread": False})
Base.metadata.create_all(engine)
self._Session = orm.sessionmaker(bind=engine)
# In-place schema migration: add the (waveform_key, event_timestamp)
# integrity-stamp columns to legacy cache DBs that predate the
# post-erase eviction logic. ALTER TABLE ADD COLUMN is idempotent
# via the column-presence check below.
with engine.begin() as conn:
for table in ("cached_events", "cached_waveforms"):
cols = {
r[1]
for r in conn.exec_driver_sql(f"PRAGMA table_info({table})").fetchall()
}
for new_col, ddl in (
("waveform_key", "TEXT"),
("event_timestamp", "TEXT"),
):
if new_col not in cols:
log.info("cache schema: %s ADD COLUMN %s %s", table, new_col, ddl)
conn.exec_driver_sql(f"ALTER TABLE {table} ADD COLUMN {new_col} {ddl}")
log.info("SFM cache opened: %s", db_path)
# ── Connection key ────────────────────────────────────────────────────────
@@ -242,15 +274,91 @@ class SFMCache:
row = s.get(CachedEvent, (conn_key, index))
return json.loads(row.event_json) if row else None
@staticmethod
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
"""
Extract the (waveform_key_hex, timestamp_iso) integrity stamp from
a serialised event dict. Either field may be None if the source
Event was missing it; the comparison logic in set_events/set_waveform
treats "both sides have a value AND they differ" as the only
eviction trigger, so partial data never spuriously flushes cache.
"""
key = ev.get("waveform_key") or ev.get("_waveform_key")
if isinstance(key, (bytes, bytearray)):
key = bytes(key).hex()
ts = ev.get("timestamp")
if isinstance(ts, dict):
# _serialise_timestamp returns a dict like {"iso": "...", ...}
ts = ts.get("iso") or ts.get("string") or None
return (key if isinstance(key, str) else None,
ts if isinstance(ts, str) else None)
def _maybe_flush_on_mismatch(
self,
s,
conn_key: str,
index: int,
new_key: Optional[str],
new_ts: Optional[str],
) -> bool:
"""
Check whether the cached entry at (conn_key, index) has a different
(waveform_key, timestamp) than the incoming one. If so, treat it as
a post-erase key-reuse signal and flush ALL cached events/waveforms
for this device, then return True.
Returns False when no flush was needed.
"""
if not new_key and not new_ts:
return False # nothing to compare against
existing = s.get(CachedEvent, (conn_key, index))
if existing is None:
existing = s.get(CachedWaveform, (conn_key, index))
if existing is None:
return False
old_key = existing.waveform_key
old_ts = existing.event_timestamp
# Only flush when both sides have populated values and they differ.
differs = (
(new_key and old_key and new_key != old_key)
or (new_ts and old_ts and new_ts != old_ts)
)
if not differs:
return False
log.warning(
"cache: device %s — index %d (key=%s, ts=%s) replaces (key=%s, ts=%s); "
"flushing all cached events/waveforms for this device "
"(post-erase key reuse detected)",
conn_key, index, new_key, new_ts, old_key, old_ts,
)
s.query(CachedEvent).filter_by(conn_key=conn_key).delete()
s.query(CachedWaveform).filter_by(conn_key=conn_key).delete()
return True
def set_events(self, conn_key: str, events: list[dict]) -> None:
"""
Upsert a list of event dicts. Existing rows are updated; new rows are
inserted. This is used to add newly-discovered events to the cache.
Eviction: if any incoming event has a different (waveform_key,
timestamp) than the row currently cached at the same index, we flush
the entire device's cache before inserting the new entries. Catches
post-erase key reuse where index 0 silently switches identity.
"""
now = time.time()
with self._Session() as s:
# Eviction check: scan incoming events for any (index, key, ts)
# that conflicts with a cached row. A single conflict triggers
# a full device-wide flush so we don't end up with a mixed-era
# cache.
for ev in events:
key, ts = self._event_signature(ev)
if self._maybe_flush_on_mismatch(s, conn_key, ev["index"], key, ts):
s.commit()
break # cache is now empty for this device; carry on
for ev in events:
idx = ev["index"]
key, ts = self._event_signature(ev)
row = s.get(CachedEvent, (conn_key, idx))
if row is None:
row = CachedEvent(
@@ -258,12 +366,18 @@ class SFMCache:
index=idx,
event_json=json.dumps(ev),
cached_at=now,
waveform_key=key,
event_timestamp=ts,
)
s.add(row)
log.debug("cached new event %d for %s", idx, conn_key)
else:
# Refresh in case project_info was backfilled after initial store
row.event_json = json.dumps(ev)
if key:
row.waveform_key = key
if ts:
row.event_timestamp = ts
s.commit()
# ── Waveforms ─────────────────────────────────────────────────────────────
@@ -278,8 +392,16 @@ class SFMCache:
return json.loads(row.waveform_json)
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
"""Store a full waveform response dict permanently."""
"""
Store a full waveform response dict permanently.
Like set_events, this checks the (waveform_key, timestamp) signature
of the incoming entry against what's currently cached at the same
index. A mismatch flushes the entire device's cache before insert.
"""
key, ts = self._event_signature(waveform)
with self._Session() as s:
self._maybe_flush_on_mismatch(s, conn_key, index, key, ts)
row = s.get(CachedWaveform, (conn_key, index))
if row is None:
row = CachedWaveform(
@@ -287,13 +409,20 @@ class SFMCache:
index=index,
waveform_json=json.dumps(waveform),
cached_at=time.time(),
waveform_key=key,
event_timestamp=ts,
)
s.add(row)
else:
row.waveform_json = json.dumps(waveform)
row.cached_at = time.time()
if key:
row.waveform_key = key
if ts:
row.event_timestamp = ts
s.commit()
log.debug("cached waveform for %s event %d", conn_key, index)
log.debug("cached waveform for %s event %d (key=%s, ts=%s)",
conn_key, index, key, ts)
# ── Monitor status ────────────────────────────────────────────────────────
+100 -2
View File
@@ -81,6 +81,10 @@ CREATE TABLE IF NOT EXISTS events (
sample_rate INTEGER,
record_type TEXT, -- "single_shot" | "continuous"
false_trigger INTEGER NOT NULL DEFAULT 0, -- 0=no, 1=yes (manual flag)
blastware_filename TEXT, -- event file within waveform store; extension is per-event (AB0T encodes timestamp)
blastware_filesize INTEGER, -- bytes; NULL if no event file saved
a5_pickle_filename TEXT, -- "<filename>.a5.pkl" sidecar
sidecar_filename TEXT, -- "<filename>.sfm.json" review/metadata sidecar
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
UNIQUE(serial, timestamp)
);
@@ -184,6 +188,21 @@ class SeismoDb:
""")
log.info("_migrate: events table rebuilt OK")
# Migration 1b: add Blastware-file columns to existing events tables.
# New columns are NULLable so old rows just read NULL.
existing_cols = {
r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()
}
for col, ddl in (
("blastware_filename", "TEXT"),
("blastware_filesize", "INTEGER"),
("a5_pickle_filename", "TEXT"),
("sidecar_filename", "TEXT"),
):
if col not in existing_cols:
log.info("_migrate: events ADD COLUMN %s %s", col, ddl)
conn.execute(f"ALTER TABLE events ADD COLUMN {col} {ddl}")
# Migration 2: change monitor_log UNIQUE from (serial, waveform_key) to
# (serial, start_time) — same reasoning as events.
row = conn.execute(
@@ -282,12 +301,24 @@ class SeismoDb:
*,
serial: str,
session_id: Optional[str] = None,
waveform_records: Optional[dict[str, dict]] = None,
) -> tuple[int, int]:
"""
Insert triggered events. Silently skips duplicates (serial+timestamp).
Returns (inserted, skipped).
``waveform_records`` (optional): dict keyed by event waveform_key (hex)
whose value is a record from ``WaveformStore.save()``:
{"filename": str, "filesize": int, "a5_pickle_filename": str}
For events whose key is in this dict, the matching columns are
populated. If a row with the same (serial, timestamp) already exists
(dedup hit), the matching waveform record is upserted onto the
existing row so a re-download via the live endpoint refreshes the
file metadata.
"""
inserted = skipped = 0
wave_recs = waveform_records or {}
with self._connect() as conn:
for ev in events:
key = ev._waveform_key.hex() if ev._waveform_key else None
@@ -307,6 +338,7 @@ class SeismoDb:
pv = ev.peak_values
pi = ev.project_info
rec = wave_recs.get(key) or {}
try:
conn.execute(
@@ -315,8 +347,10 @@ class SeismoDb:
(id, serial, waveform_key, session_id, timestamp,
tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv,
project, client, operator, sensor_location,
sample_rate, record_type)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
sample_rate, record_type,
blastware_filename, blastware_filesize,
a5_pickle_filename, sidecar_filename)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
(
self._new_id(), serial, key, session_id, ts,
@@ -331,16 +365,50 @@ class SeismoDb:
pi.sensor_location if pi else None,
ev.sample_rate,
ev.record_type,
rec.get("filename"),
rec.get("filesize"),
rec.get("a5_pickle_filename"),
rec.get("sidecar_filename"),
),
)
inserted += 1
except sqlite3.IntegrityError:
skipped += 1
# Upsert waveform fields onto the existing dedup row so a
# re-download via the live endpoint refreshes filename /
# size / sidecar without churning the rest of the row.
if rec and ts:
conn.execute(
"""
UPDATE events
SET blastware_filename = ?,
blastware_filesize = ?,
a5_pickle_filename = ?,
sidecar_filename = ?
WHERE serial = ? AND timestamp = ?
""",
(
rec.get("filename"),
rec.get("filesize"),
rec.get("a5_pickle_filename"),
rec.get("sidecar_filename"),
serial,
ts,
),
)
log.debug("insert_events serial=%s inserted=%d skipped=%d",
serial, inserted, skipped)
return inserted, skipped
def get_event(self, event_id: str) -> Optional[dict]:
"""Return one event row by id, or None."""
with self._connect() as conn:
row = conn.execute(
"SELECT * FROM events WHERE id = ?", (event_id,),
).fetchone()
return dict(row) if row else None
def query_events(
self,
serial: Optional[str] = None,
@@ -387,6 +455,36 @@ class SeismoDb:
)
return cur.rowcount > 0
def update_event_review(self, event_id: str, review: dict) -> bool:
"""
Sync derived index columns from a sidecar's `review` block.
Currently the only derived index is `events.false_trigger` kept
in sync so `/db/events?false_trigger=true` queries don't have to
scan every sidecar JSON on disk. The sidecar JSON itself remains
the source of truth for the full review state.
Returns True when the row exists, False otherwise. No-op fields
(review without `false_trigger`) leave the column untouched.
"""
if not isinstance(review, dict):
return False
if "false_trigger" not in review:
# Nothing derived to update; just confirm the row exists.
with self._connect() as conn:
row = conn.execute(
"SELECT 1 FROM events WHERE id=?", (event_id,),
).fetchone()
return row is not None
flag = 1 if review.get("false_trigger") else 0
with self._connect() as conn:
cur = conn.execute(
"UPDATE events SET false_trigger=? WHERE id=?",
(flag, event_id),
)
return cur.rowcount > 0
# ── Monitor log ───────────────────────────────────────────────────────────
def insert_monitor_log(
+216
View File
@@ -0,0 +1,216 @@
"""
sfm.dump_0c inspect the raw 210-byte SUB 0C waveform record stored in a
sidecar JSON's `extensions.raw_records.waveform_record_b64`.
Usage:
python -m sfm.dump_0c <sidecar.sfm.json> [<sidecar.sfm.json> ...]
Prints, for each input:
- A header summarising the sidecar's metadata-block claims (peaks,
project, timestamp) the "what BW says this event measured" view.
- A 16-byte-wide hex dump of the raw 0C record, annotated with known
field anchors (STRT, channel labels, project strings).
- A "candidate float regions" scan that brute-forces every byte
position as a float32 BE and prints any that yield a value in a
plausible range (1e-7 to 1e3) useful for hunting where Peak
Acceleration / Peak Displacement / ZC Freq / Time of Peak live.
Pairing the printed candidates with the BW Event Report values lets
us nail down byte offsets for the missing fields without a live
device.
"""
from __future__ import annotations
import argparse
import base64
import json
import struct
import sys
from pathlib import Path
# ── Annotations for known anchors in a 210-byte 0C record ──────────────────
# Anchors we look for and label inline in the hex dump. Each is a needle
# (bytes to find) and a short label. Found via .find() — the first
# occurrence wins.
_ANCHORS = [
(b"Tran", "Tran label (PPV @ +6, PVS @ -12)"),
(b"Vert", "Vert label (PPV @ +6)"),
(b"Long", "Long label (PPV @ +6)"),
(b"MicL", "MicL label (peak psi @ +6)"),
(b"Project:", "Project: label"),
(b"Client:", "Client: label"),
(b"User Name:", "User Name: label"),
(b"Seis Loc:", "Seis Loc: label"),
(b"Extended Notes", "Extended Notes label"),
]
def _hex_dump(data: bytes, anchors: dict[int, str]) -> str:
"""Return a 16-byte-wide hex+ASCII dump, with anchor labels printed
on the line that contains the anchor's start byte."""
lines = []
for off in range(0, len(data), 16):
chunk = data[off : off + 16]
hex_part = " ".join(f"{b:02x}" for b in chunk)
ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
line = f" {off:04x} {hex_part:<47} |{ascii_part}|"
# If any anchor lands on a byte in this row, append a tag
tags = [
f"[{a:#04x}: {label}]"
for a, label in anchors.items()
if off <= a < off + 16
]
if tags:
line += " " + " ".join(tags)
lines.append(line)
return "\n".join(lines)
def _scan_float32_be(data: bytes, lo: float, hi: float) -> list[tuple[int, float]]:
"""Brute-force every offset where data[off:off+4] is a float32 BE in
(lo, hi). Includes negatives in the symmetric range."""
hits = []
for i in range(len(data) - 3):
try:
v = struct.unpack_from(">f", data, i)[0]
except struct.error:
continue
if v != v: # NaN
continue
if abs(v) < 1e-30 or abs(v) > 1e10: # crap range
continue
a = abs(v)
if lo <= a <= hi:
hits.append((i, v))
return hits
def _scan_uint16_be(data: bytes, lo: int, hi: int) -> list[tuple[int, int]]:
"""Find every offset where uint16 BE is in [lo, hi]."""
hits = []
for i in range(len(data) - 1):
v = (data[i] << 8) | data[i + 1]
if lo <= v <= hi:
hits.append((i, v))
return hits
def _summarize_sidecar(side: dict) -> str:
ev = side.get("event", {})
pv = side.get("peak_values", {})
pi = side.get("project_info", {})
bw = side.get("blastware", {})
return (
f" serial: {ev.get('serial')}\n"
f" timestamp: {ev.get('timestamp')}\n"
f" waveform: {ev.get('waveform_key')} ({ev.get('record_type')})\n"
f" sample_rate:{ev.get('sample_rate')} sps rectime:{ev.get('rectime_seconds')}s\n"
f" bw file: {bw.get('filename')} ({bw.get('filesize')} B)\n"
f" peaks: "
f"Tran={pv.get('transverse'):.5f} "
f"Vert={pv.get('vertical'):.5f} "
f"Long={pv.get('longitudinal'):.5f} "
f"PVS={pv.get('vector_sum'):.5f} in/s "
f"Mic={pv.get('mic_psi'):.6e} psi"
if all(pv.get(k) is not None for k in
("transverse", "vertical", "longitudinal", "vector_sum", "mic_psi"))
else f" peaks: {pv}\n project: {pi}"
) + (
f"\n project: {pi.get('project')!r} / {pi.get('client')!r} / "
f"operator={pi.get('operator')!r} loc={pi.get('sensor_location')!r}"
)
def dump_one(path: Path) -> int:
side = json.loads(path.read_text(encoding="utf-8"))
raw_b64 = (
side.get("extensions", {})
.get("raw_records", {})
.get("waveform_record_b64")
)
if not raw_b64:
print(f"\n=== {path} ===")
print(" ! no extensions.raw_records.waveform_record_b64 — sidecar")
print(" pre-dates raw-0C persistence (added in v0.15.x). Re-save")
print(" the event from the device to capture the bytes.")
return 1
raw = base64.b64decode(raw_b64)
# Build anchor map
anchors: dict[int, str] = {}
for needle, label in _ANCHORS:
i = raw.find(needle)
if i >= 0:
anchors[i] = label
print(f"\n=== {path} ===")
print("metadata claimed by sidecar:")
print(_summarize_sidecar(side))
print(f"\nraw 0C record ({len(raw)} bytes):")
print(_hex_dump(raw, anchors))
# Float32 BE candidates in geo-relevant ranges
geo_hits = _scan_float32_be(raw, 1e-5, 50.0)
# Filter: only show hits that are NOT trivially the per-channel labels'
# +6 PPV floats already documented (those will land in any sweep too).
print("\nfloat32 BE candidates (1e-5 .. 50.0):")
for off, v in geo_hits:
annotation = ""
for needle, _ in _ANCHORS[:4]: # geo + mic labels
i = raw.find(needle)
if i >= 0 and off == i + 6:
annotation = f"{needle.decode()} PPV (label+6)"
break
print(f" {off:#04x} ({off:3d}) {v:>+15.6f}{annotation}")
print("\nuint16 BE candidates ZC-Freq-ish (1..200):")
for off, v in _scan_uint16_be(raw, 1, 200):
if v < 5: # too noisy at very low end
continue
print(f" {off:#04x} ({off:3d}) = {v}")
print("\nuint16 BE candidates Time-of-Peak-ish if stored as ms (1..30000):")
for off, v in _scan_uint16_be(raw, 1, 30000):
if v < 100: # noise filter
continue
# Only the first ~80 are worth showing — too many hits otherwise
if off > 80:
break
print(f" {off:#04x} ({off:3d}) = {v} ms ?")
print()
return 0
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(
description="Inspect a saved 0C waveform record from a sidecar JSON.",
)
p.add_argument(
"sidecars",
nargs="+",
type=Path,
help="Path(s) to <event>.sfm.json sidecar file(s).",
)
args = p.parse_args(argv)
rc = 0
for path in args.sidecars:
try:
rc |= dump_one(path)
except Exception as exc:
print(f"\n=== {path} ===\n ERROR: {exc}", file=sys.stderr)
rc |= 2
return rc
if __name__ == "__main__":
sys.exit(main())
+530
View File
@@ -0,0 +1,530 @@
"""
sfm/event_hdf5.py HDF5 codec for the canonical "clean waveform" file.
Layout written to `<filename>.h5`:
/
samples/
Tran (float32, in/s) shape: (N,)
Vert (float32, in/s) shape: (N,)
Long (float32, in/s) shape: (N,)
MicL (float32, psi) shape: (N,)
samples_int16/ (optional)
Tran (int16, raw ADC counts) shape: (N,)
... per channel (only when present in the source)
root attrs (event metadata):
schema_version int = 1
kind str = "sfm.event.hdf5"
serial str
waveform_key str (8-hex)
timestamp str (ISO-8601)
record_type str
sample_rate int (sps)
pretrig_samples int
total_samples int
rectime_seconds float
geo_range str "normal" | "sensitive"
geo_full_scale_ips float (10.0 or 1.250)
project str
client str
operator str
sensor_location str
peak_tran_ips float (from 0C; authoritative)
peak_vert_ips float
peak_long_ips float
peak_pvs_ips float
peak_mic_psi float
tool_version str
captured_at str (ISO-8601 UTC)
source_kind str "sfm-live" | "sfm-ach" | "bw-import"
Why HDF5 and not just JSON for the canonical clean format:
- Native float32 arrays (no base64 dance, no per-value JSON parsing).
- Per-dataset gzip compression sample arrays compress 3-5×.
- Cross-language: h5py (Python), HDF5.jl (Julia), io.netcdf (R), etc.
Analysis pipelines don't have to know anything about Blastware.
- Self-describing via attributes; future fields don't break readers.
The plot-ready `sfm.plot.v1` JSON returned by the REST endpoints is
derived from this HDF5 (or computed on-the-fly when no .h5 exists yet).
"""
from __future__ import annotations
import datetime
import logging
from pathlib import Path
from typing import Optional, Union
import h5py
import numpy as np
from minimateplus.event_file_io import TOOL_VERSION as _DEFAULT_TOOL_VERSION
from minimateplus.models import Event
log = logging.getLogger(__name__)
SCHEMA_VERSION = 1
HDF5_KIND = "sfm.event.hdf5"
# Geophone full-scale velocity per range (in/s). Confirmed in CLAUDE.md
# from 4-20-26 captures: Normal=0x00 → 10 in/s, Sensitive=0x01 → 1.25 in/s.
_GEO_FS_BY_RANGE = {
"normal": 10.000,
"sensitive": 1.2500,
0: 10.000,
1: 1.2500,
}
_INT16_FS = 32768.0
# Default mic conversion: ADC count → psi. Approximate; exact factor
# depends on firmware reference voltage and mic sensitivity, neither of
# which is independently confirmed. We try to refine it from the device-
# reported peak when available (peak_mic_psi / max_abs_int16).
_MIC_DEFAULT_FS_PSI = 0.0125 # ≈ 0.5 psi at full scale (rough)
def _resolve_geo_full_scale(geo_range) -> float:
"""Map a geo_range value (string or int from compliance config) to the
full-scale velocity in in/s. Defaults to Normal range (10.0) when the
value is unknown same default as Blastware itself."""
if geo_range is None:
return _GEO_FS_BY_RANGE["normal"]
if isinstance(geo_range, str):
return _GEO_FS_BY_RANGE.get(geo_range.lower(), _GEO_FS_BY_RANGE["normal"])
return _GEO_FS_BY_RANGE.get(int(geo_range), _GEO_FS_BY_RANGE["normal"])
def _normalise_range(geo_range) -> str:
"""Return 'normal' or 'sensitive' (string) regardless of input form."""
if isinstance(geo_range, str):
v = geo_range.lower()
if v in ("normal", "sensitive"):
return v
return "normal"
if geo_range == 1:
return "sensitive"
return "normal"
def _ts_iso(ts) -> str:
if ts is None:
return ""
try:
return datetime.datetime(
ts.year, ts.month, ts.day,
ts.hour or 0, ts.minute or 0, ts.second or 0,
).isoformat()
except Exception:
return str(ts)
def _samples_to_float(
samples_int16: list[int],
full_scale: float,
) -> np.ndarray:
"""Convert int16 ADC counts → float32 physical units.
Uses _INT16_FS=32768 (not 32767) so that a count of -32768 maps to
exactly -full_scale and +32767 maps to ~+full_scale * 32767/32768.
Matches the device firmware's documented mapping (see CLAUDE.md
geo_hardware_constant rationale).
"""
if not samples_int16:
return np.array([], dtype=np.float32)
arr = np.asarray(samples_int16, dtype=np.int32) # int32 to avoid overflow during scale
return (arr.astype(np.float32) * (full_scale / _INT16_FS)).astype(np.float32)
def _mic_scale_factor(
samples_int16: list[int],
peak_mic_psi: Optional[float],
) -> float:
"""Resolve the per-count psi factor for the microphone channel.
When the device reports a peak mic value via the 0C record, we
back-solve the per-count factor from `peak_psi / max(|samples|)` so
the plotted waveform peaks land exactly at the device-reported value.
Otherwise fall back to the rough _MIC_DEFAULT_FS_PSI estimate.
"""
if peak_mic_psi is not None and peak_mic_psi > 0 and samples_int16:
max_count = max(abs(int(v)) for v in samples_int16) or 1
return float(peak_mic_psi) / float(max_count)
return _MIC_DEFAULT_FS_PSI / _INT16_FS
def write_event_hdf5(
path: Union[str, Path],
event: Event,
*,
serial: str,
geo_range = "normal",
source_kind: str = "sfm-live",
tool_version: Optional[str] = None,
captured_at: Optional[datetime.datetime] = None,
include_int16: bool = True,
) -> dict:
"""
Persist a decoded Event as an HDF5 file with samples in physical units.
Returns a small summary dict suitable for logging:
{"path": Path, "n_samples": int, "geo_full_scale_ips": float}
"""
path = Path(path)
raw = event.raw_samples or {}
pv = event.peak_values
pi = event.project_info
geo_fs = _resolve_geo_full_scale(geo_range)
geo_range_str = _normalise_range(geo_range)
captured_at = captured_at or datetime.datetime.utcnow()
tool_version = tool_version or _DEFAULT_TOOL_VERSION
# Per-channel float32 arrays in physical units.
geo_arrays = {}
for ch in ("Tran", "Vert", "Long"):
geo_arrays[ch] = _samples_to_float(raw.get(ch, []), geo_fs)
# Mic channel — the per-count factor is resolved from the device-reported
# peak when available so the plot peaks the BW value exactly.
mic_int16 = raw.get("MicL", [])
mic_factor = _mic_scale_factor(
mic_int16,
getattr(pv, "micl", None) if pv else None,
)
if mic_int16:
mic_arr = (np.asarray(mic_int16, dtype=np.int32).astype(np.float32) * mic_factor).astype(np.float32)
else:
mic_arr = np.array([], dtype=np.float32)
n_samples = max(
(len(geo_arrays[ch]) for ch in geo_arrays),
default=0,
)
# Atomic write: temp file + os.replace.
tmp = path.with_suffix(path.suffix + ".tmp")
with h5py.File(tmp, "w") as f:
# Root attrs — event-level metadata.
attrs = f.attrs
attrs["schema_version"] = SCHEMA_VERSION
attrs["kind"] = HDF5_KIND
attrs["serial"] = serial or ""
attrs["waveform_key"] = event._waveform_key.hex() if event._waveform_key else ""
attrs["timestamp"] = _ts_iso(event.timestamp)
attrs["record_type"] = event.record_type or ""
attrs["sample_rate"] = int(event.sample_rate or 0)
attrs["pretrig_samples"] = int(event.pretrig_samples or 0)
attrs["total_samples"] = int(event.total_samples or n_samples)
attrs["rectime_seconds"] = float(event.rectime_seconds or 0.0)
attrs["geo_range"] = geo_range_str
attrs["geo_full_scale_ips"] = float(geo_fs)
attrs["project"] = (pi.project if pi else "") or ""
attrs["client"] = (pi.client if pi else "") or ""
attrs["operator"] = (pi.operator if pi else "") or ""
attrs["sensor_location"] = (pi.sensor_location if pi else "") or ""
attrs["peak_tran_ips"] = float(pv.tran if pv and pv.tran is not None else 0.0)
attrs["peak_vert_ips"] = float(pv.vert if pv and pv.vert is not None else 0.0)
attrs["peak_long_ips"] = float(pv.long if pv and pv.long is not None else 0.0)
attrs["peak_pvs_ips"] = float(pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0)
attrs["peak_mic_psi"] = float(pv.micl if pv and pv.micl is not None else 0.0)
attrs["tool_version"] = tool_version or ""
attrs["captured_at"] = captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat()
attrs["source_kind"] = source_kind
# /samples — physical-units float32 (the primary data).
sgrp = f.create_group("samples")
for ch, arr in geo_arrays.items():
sgrp.create_dataset(
ch, data=arr, dtype="float32",
compression="gzip", compression_opts=4, shuffle=True,
)
sgrp.create_dataset(
"MicL", data=mic_arr, dtype="float32",
compression="gzip", compression_opts=4, shuffle=True,
)
# /samples_int16 — optional raw ADC counts (preserved for analysis
# tools that want pre-conversion data). Cheap to include.
if include_int16:
igrp = f.create_group("samples_int16")
for ch in ("Tran", "Vert", "Long", "MicL"):
vals = raw.get(ch, [])
if vals:
igrp.create_dataset(
ch, data=np.asarray(vals, dtype=np.int16),
compression="gzip", compression_opts=4, shuffle=True,
)
igrp.attrs["mic_psi_per_count"] = float(mic_factor)
import os
os.replace(tmp, path)
log.info(
"write_event_hdf5: %s n_samples=%d geo_fs=%.3f filesize=%d",
path, n_samples, geo_fs, path.stat().st_size,
)
return {
"path": path,
"n_samples": n_samples,
"geo_full_scale_ips": geo_fs,
}
def read_event_hdf5(path: Union[str, Path]) -> dict:
"""
Load an event HDF5 into a plain dict (no Event reconstruction
callers that want an Event can use the data directly).
Returns:
{
"schema_version": int,
"kind": str,
"attrs": dict[str, ], # all root attributes
"samples": { # float32 lists in physical units
"Tran": ndarray, "Vert": ndarray, "Long": ndarray, "MicL": ndarray,
},
"samples_int16": {} or None,
"mic_psi_per_count": float | None,
}
Raises FileNotFoundError if missing, ValueError on bad shape /
unsupported schema_version.
"""
path = Path(path)
with h5py.File(path, "r") as f:
attrs = {k: _h5_attr_value(v) for k, v in f.attrs.items()}
sv = attrs.get("schema_version", 0)
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
raise ValueError(
f"{path}: unsupported HDF5 schema_version={sv} "
f"(this build supports 1..{SCHEMA_VERSION})"
)
if attrs.get("kind") != HDF5_KIND:
raise ValueError(f"{path}: kind != {HDF5_KIND!r} (got {attrs.get('kind')!r})")
samples = {}
for ch in ("Tran", "Vert", "Long", "MicL"):
ds = f.get(f"samples/{ch}")
samples[ch] = np.asarray(ds[()]) if ds is not None else np.array([], dtype=np.float32)
samples_int16 = None
mic_psi = None
igrp = f.get("samples_int16")
if igrp is not None:
samples_int16 = {}
for ch in ("Tran", "Vert", "Long", "MicL"):
ds = igrp.get(ch)
if ds is not None:
samples_int16[ch] = np.asarray(ds[()])
mic_attr = igrp.attrs.get("mic_psi_per_count")
if mic_attr is not None:
mic_psi = float(mic_attr)
return {
"schema_version": sv,
"kind": attrs.get("kind"),
"attrs": attrs,
"samples": samples,
"samples_int16": samples_int16,
"mic_psi_per_count": mic_psi,
}
def _h5_attr_value(v):
"""Convert an h5py attribute value to a plain Python type."""
if isinstance(v, bytes):
return v.decode("utf-8", errors="replace")
if isinstance(v, np.generic):
return v.item()
return v
# ── Plot-ready JSON ──────────────────────────────────────────────────────────
def event_to_plot_json(
event: Event,
*,
serial: str,
geo_range = "normal",
event_id: Optional[str] = None,
index: Optional[int] = None,
) -> dict:
"""
Build a `sfm.plot.v1` JSON dict directly from an Event (skipping HDF5).
Used by:
- `/device/event/{idx}/waveform` (live device path)
- The CLI / tests for in-memory conversion sanity-checks.
Stored events go through `plot_json_from_hdf5()` so the wire format
is identical regardless of whether the data came from the live device
or the on-disk HDF5.
"""
raw = event.raw_samples or {}
pv = event.peak_values
geo_fs = _resolve_geo_full_scale(geo_range)
geo_range_str = _normalise_range(geo_range)
sr = int(event.sample_rate or 0) or 1024
pretrig = int(event.pretrig_samples or 0)
geo_arrays = {ch: _samples_to_float(raw.get(ch, []), geo_fs).tolist()
for ch in ("Tran", "Vert", "Long")}
mic_int16 = raw.get("MicL", [])
mic_factor = _mic_scale_factor(
mic_int16,
getattr(pv, "micl", None) if pv else None,
)
mic_arr = [float(v) * mic_factor for v in mic_int16] if mic_int16 else []
n = max(
(len(geo_arrays[ch]) for ch in geo_arrays),
default=len(mic_arr),
)
return _build_plot_dict(
n_samples=n,
sample_rate=sr,
pretrig_samples=pretrig,
total_samples=int(event.total_samples or n),
rectime_seconds=float(event.rectime_seconds or 0.0),
timestamp_iso=_ts_iso(event.timestamp),
serial=serial,
record_type=event.record_type,
waveform_key=event._waveform_key.hex() if event._waveform_key else None,
geo_range=geo_range_str,
geo_fs=geo_fs,
channels_floats={
"Tran": geo_arrays["Tran"],
"Vert": geo_arrays["Vert"],
"Long": geo_arrays["Long"],
"MicL": mic_arr,
},
peaks_dict={
"tran": getattr(pv, "tran", None) if pv else None,
"vert": getattr(pv, "vert", None) if pv else None,
"long": getattr(pv, "long", None) if pv else None,
"pvs": getattr(pv, "peak_vector_sum", None) if pv else None,
"mic": getattr(pv, "micl", None) if pv else None,
},
event_id=event_id,
index=index if index is not None else event.index,
)
def plot_json_from_hdf5(
path: Union[str, Path],
*,
event_id: Optional[str] = None,
index: Optional[int] = None,
) -> dict:
"""Build a `sfm.plot.v1` JSON dict from a stored .h5 file."""
data = read_event_hdf5(path)
a = data["attrs"]
s = data["samples"]
return _build_plot_dict(
n_samples=len(s["Tran"]) if "Tran" in s else 0,
sample_rate=int(a.get("sample_rate", 1024) or 1024),
pretrig_samples=int(a.get("pretrig_samples", 0) or 0),
total_samples=int(a.get("total_samples", 0) or 0),
rectime_seconds=float(a.get("rectime_seconds", 0.0) or 0.0),
timestamp_iso=a.get("timestamp", ""),
serial=a.get("serial", ""),
record_type=a.get("record_type", ""),
waveform_key=a.get("waveform_key", "") or None,
geo_range=a.get("geo_range", "normal"),
geo_fs=float(a.get("geo_full_scale_ips", 10.0) or 10.0),
channels_floats={
"Tran": s.get("Tran", np.array([])).tolist(),
"Vert": s.get("Vert", np.array([])).tolist(),
"Long": s.get("Long", np.array([])).tolist(),
"MicL": s.get("MicL", np.array([])).tolist(),
},
peaks_dict={
"tran": float(a.get("peak_tran_ips", 0.0) or 0.0) or None,
"vert": float(a.get("peak_vert_ips", 0.0) or 0.0) or None,
"long": float(a.get("peak_long_ips", 0.0) or 0.0) or None,
"pvs": float(a.get("peak_pvs_ips", 0.0) or 0.0) or None,
"mic": float(a.get("peak_mic_psi", 0.0) or 0.0) or None,
},
event_id=event_id,
index=index,
)
def _build_plot_dict(
*,
n_samples: int,
sample_rate: int,
pretrig_samples: int,
total_samples: int,
rectime_seconds: float,
timestamp_iso: str,
serial: str,
record_type: Optional[str],
waveform_key: Optional[str],
geo_range: str,
geo_fs: float,
channels_floats: dict[str, list[float]],
peaks_dict: dict[str, Optional[float]],
event_id: Optional[str],
index: Optional[int] = None,
) -> dict:
dt_ms = (1000.0 / sample_rate) if sample_rate > 0 else 0.0
t0_ms = -pretrig_samples * dt_ms
def _ch(unit: str, values: list[float], peak: Optional[float]) -> dict:
# Locate the peak's time within the values array (max abs).
if values:
mags = [abs(v) for v in values]
i = mags.index(max(mags))
peak_t_ms = round(t0_ms + i * dt_ms, 4)
peak_value = peak if peak is not None else values[i]
else:
peak_t_ms = None
peak_value = peak
return {
"unit": unit,
"values": values,
"peak": peak_value,
"peak_t_ms": peak_t_ms,
}
return {
"schema": "sfm.plot.v1",
"event_id": event_id,
"index": index,
"serial": serial,
"timestamp": timestamp_iso,
"record_type": record_type,
"waveform_key": waveform_key,
"time_axis": {
"sample_rate": sample_rate,
"pretrig_samples": pretrig_samples,
"total_samples": total_samples or n_samples,
"n_samples": n_samples,
"t0_ms": round(t0_ms, 4),
"dt_ms": round(dt_ms, 6),
"rectime_seconds": rectime_seconds,
},
"geo_range": geo_range,
"geo_full_scale_ips": geo_fs,
"trigger_ms": 0.0,
"channels": {
"Tran": _ch("in/s", channels_floats.get("Tran", []), peaks_dict.get("tran")),
"Vert": _ch("in/s", channels_floats.get("Vert", []), peaks_dict.get("vert")),
"Long": _ch("in/s", channels_floats.get("Long", []), peaks_dict.get("long")),
"MicL": _ch("psi", channels_floats.get("MicL", []), peaks_dict.get("mic")),
},
"peak_values": {
"transverse": peaks_dict.get("tran"),
"vertical": peaks_dict.get("vert"),
"longitudinal": peaks_dict.get("long"),
"vector_sum": peaks_dict.get("pvs"),
"mic_psi": peaks_dict.get("mic"),
},
}
+194
View File
@@ -0,0 +1,194 @@
"""
sfm/import_bw.py CLI for ingesting Blastware-format event files.
Walks a path (file or directory), parses each recognised event-file
binary, copies it into the canonical waveform store, writes the
.sfm.json sidecar, and upserts a row in seismo_relay.db.
Use cases:
- Migrating a Blastware ACH inbox into SFM
- One-off imports of files emailed in by field crews
- Bulk-loading historical archives
Usage:
python -m sfm.import_bw <path-or-dir> [--serial BE11529]
[--db-path bridges/captures/seismo_relay.db]
[--store-root bridges/captures/waveforms]
[--dry-run]
[-v]
Examples:
python -m sfm.import_bw ~/Downloads/M529LKIQ.7M0W
python -m sfm.import_bw /path/to/blastware_archive --serial BE11529
"""
from __future__ import annotations
import argparse
import logging
import sys
from pathlib import Path
from typing import Iterator
# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from sfm.database import SeismoDb
from sfm.waveform_store import WaveformStore
log = logging.getLogger("sfm.import_bw")
# Blastware event-file extensions: 4-char `AB0T` (T = W or H) for ACH
# downloads, 3-char `AB0` for direct downloads. We discover candidates
# by length + last-char rather than enumerating every (A, B) pair.
def _looks_like_bw_event(path: Path) -> bool:
"""Heuristic: 3-char or 4-char extension, ends with W/H/0, and the
file is at least 70 bytes (header + STRT + footer minimum)."""
if not path.is_file():
return False
ext = path.suffix.lstrip(".")
if not (3 <= len(ext) <= 4):
return False
if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
return False
try:
return path.stat().st_size >= 70
except OSError:
return False
def _walk(path: Path) -> Iterator[Path]:
"""Yield candidate BW event-file paths under `path` (file or dir)."""
if path.is_file():
if _looks_like_bw_event(path):
yield path
return
if path.is_dir():
for p in sorted(path.rglob("*")):
if _looks_like_bw_event(p):
yield p
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(
description="Import Blastware-format event files into the SFM store + DB.",
)
p.add_argument("path", help="File or directory to import.")
p.add_argument(
"--serial", default=None, metavar="SERIAL",
help="Override the serial-number hint (e.g. BE11529). Defaults to "
"the value decoded from each BW filename's prefix.",
)
p.add_argument(
"--db-path",
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
help="Path to seismo_relay.db (default: bridges/captures/seismo_relay.db).",
)
p.add_argument(
"--store-root",
default=None,
help="Root of the waveform store (default: <db_dir>/waveforms).",
)
p.add_argument(
"--dry-run", action="store_true",
help="Parse and report per-file outcomes; don't write anything.",
)
p.add_argument("-v", "--verbose", action="store_true", help="Debug logging.")
args = p.parse_args(argv)
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
datefmt="%H:%M:%S",
)
src = Path(args.path).expanduser().resolve()
if not src.exists():
print(f"error: {src} does not exist", file=sys.stderr)
return 2
db_path = Path(args.db_path).expanduser().resolve()
store_root = (
Path(args.store_root).expanduser().resolve()
if args.store_root else db_path.parent / "waveforms"
)
db = None if args.dry_run else SeismoDb(db_path)
store = None if args.dry_run else WaveformStore(store_root)
candidates = list(_walk(src))
if not candidates:
print(f"No BW event-file candidates found under {src}", file=sys.stderr)
return 1
print(f"Importing {len(candidates)} file(s) from {src}...")
if args.dry_run:
print("(dry-run — no writes will occur)")
ok = err = skipped = 0
for path in candidates:
try:
bw_bytes = path.read_bytes()
except Exception as exc:
print(f" [ERR ] {path}: read failed: {exc}")
err += 1
continue
if args.dry_run:
# Just parse to verify integrity; don't touch DB or store.
from minimateplus import event_file_io
try:
ev = event_file_io.read_blastware_file(path)
ts = ev.timestamp and (
f"{ev.timestamp.year}-{ev.timestamp.month:02d}-{ev.timestamp.day:02d} "
f"{ev.timestamp.hour:02d}:{ev.timestamp.minute:02d}:{ev.timestamp.second:02d}"
) or "?"
pv = ev.peak_values
pvs = pv.peak_vector_sum if pv and pv.peak_vector_sum is not None else 0.0
print(f" [OK ] {path.name} ts={ts} PVS={pvs:.4f}")
ok += 1
except Exception as exc:
print(f" [ERR ] {path}: parse failed: {exc}")
err += 1
continue
try:
ev, rec = store.save_imported_bw(
bw_bytes, source_path=path, serial_hint=args.serial,
)
# Resolve serial for the DB row. Prefer the hint, then the
# one decoded from the filename (already done by the store).
serial_used = args.serial or _infer_serial(path.name) or "UNKNOWN"
ins, sk = db.insert_events(
[ev], serial=serial_used,
waveform_records=(
{ev._waveform_key.hex(): rec}
if ev._waveform_key else None
),
)
tag = "OK " if ins else ("SKIP" if sk else "OK ")
print(f" [{tag}] {path.name}{rec['filename']} "
f"({rec['filesize']} B, sha256={rec['sha256'][:12]}…) "
f"serial={serial_used} ins={ins} skip={sk}")
if ins:
ok += 1
else:
skipped += 1
except Exception as exc:
print(f" [ERR ] {path}: import failed: {exc}")
log.debug("traceback", exc_info=True)
err += 1
print(f"\nDone. ok={ok} skipped={skipped} errors={err}")
return 0 if err == 0 else 1
def _infer_serial(filename: str):
"""Reuse WaveformStore's filename → serial decoder for log output."""
from sfm.waveform_store import _serial_from_bw_filename
return _serial_from_bw_filename(filename)
if __name__ == "__main__":
sys.exit(main())
+189
View File
@@ -0,0 +1,189 @@
"""
sfm/live_cache.py Thread-safe in-memory cache for live SFM device data.
Extracted from sfm/server.py so the cache logic is importable and testable
without pulling in fastapi/uvicorn.
Caching strategy
----------------
Keyed by `conn_key` ("tcp:host:port" or "serial:port:baud"). Does NOT
persist across server restarts.
device_info cached until POST /device/config marks it dirty
events cached by (conn_key, device_event_count); re-fetched when
a quick count_events() probe shows new events on the device
monitor_status 30-second TTL (changes frequently during monitoring)
waveforms permanent within a process but auto-evicted at the device
level when a (waveform_key, timestamp) mismatch is detected
at the same index (post-erase key reuse the device's
event-key counter resets to 0x01110000 after every erase,
so the same `(conn_key, index)` slot can refer to a
brand-new physical event).
All endpoints accept ?force=true to bypass the cache and re-read.
"""
from __future__ import annotations
import threading
import time
from typing import Optional
_MONITOR_STATUS_TTL = 30.0 # seconds
class LiveCache:
"""
Thread-safe in-memory cache for live SFM device data.
One singleton per server process.
"""
def __init__(self) -> None:
self._lock = threading.Lock()
self._device_info: dict[str, dict] = {}
self._events: dict[str, tuple[int, list]] = {}
self._monitor_status: dict[str, tuple[float, dict]] = {}
self._config_dirty: dict[str, bool] = {}
self._waveforms: dict[tuple, dict] = {}
# ── Connection key ────────────────────────────────────────────────────────
@staticmethod
def make_conn_key(
host: Optional[str],
tcp_port: int,
port: Optional[str],
baud: int,
) -> str:
if host:
return f"tcp:{host}:{tcp_port}"
return f"serial:{port}:{baud}"
# ── Eviction signature ────────────────────────────────────────────────────
@staticmethod
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
"""Return (waveform_key_hex, timestamp_iso) from a serialised event."""
key = ev.get("waveform_key") or ev.get("_waveform_key")
if isinstance(key, (bytes, bytearray)):
key = bytes(key).hex()
ts = ev.get("timestamp")
if isinstance(ts, dict):
ts = ts.get("iso") or ts.get("string") or None
return (key if isinstance(key, str) else None,
ts if isinstance(ts, str) else None)
def _flush_device(self, conn_key: str) -> None:
"""Drop all cached events + waveforms for one device. Caller holds lock."""
self._events.pop(conn_key, None)
stale_wf_keys = [k for k in self._waveforms if k[0] == conn_key]
for k in stale_wf_keys:
self._waveforms.pop(k, None)
# ── Device info ───────────────────────────────────────────────────────────
def get_device_info(self, conn_key: str) -> Optional[dict]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
return self._device_info.get(conn_key)
def set_device_info(self, conn_key: str, info: dict) -> None:
with self._lock:
self._device_info[conn_key] = info
self._config_dirty[conn_key] = False
# ── Events ────────────────────────────────────────────────────────────────
def get_events(self, conn_key: str, device_count: int) -> Optional[list]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
entry = self._events.get(conn_key)
if entry is None:
return None
cached_count, events = entry
return events if cached_count == device_count else None
def set_events(self, conn_key: str, device_count: int, events: list) -> None:
"""
Replace the cached events list for `conn_key`. If any incoming event
has a different (waveform_key, timestamp) than the cached entry at
the same index, flush the entire conn_key's event + waveform cache
first. Catches post-erase key reuse.
"""
with self._lock:
cached_entry = self._events.get(conn_key)
cached_events = cached_entry[1] if cached_entry else []
cached_by_index = {e.get("index"): e for e in cached_events}
evict = False
for ev in events:
idx = ev.get("index")
if idx is None:
continue
cached = cached_by_index.get(idx)
if cached is None:
continue
new_key, new_ts = self._event_signature(ev)
old_key, old_ts = self._event_signature(cached)
if (new_key and old_key and new_key != old_key) or \
(new_ts and old_ts and new_ts != old_ts):
evict = True
break
if evict:
self._flush_device(conn_key)
self._events[conn_key] = (device_count, events)
# ── Monitor status ────────────────────────────────────────────────────────
def get_monitor_status(self, conn_key: str) -> Optional[dict]:
with self._lock:
entry = self._monitor_status.get(conn_key)
if entry is None:
return None
fetched_at, status = entry
if time.time() - fetched_at > _MONITOR_STATUS_TTL:
return None
return status
def set_monitor_status(self, conn_key: str, status: dict) -> None:
with self._lock:
self._monitor_status[conn_key] = (time.time(), status)
def invalidate_monitor_status(self, conn_key: str) -> None:
with self._lock:
self._monitor_status.pop(conn_key, None)
# ── Config dirty flag ─────────────────────────────────────────────────────
def mark_config_dirty(self, conn_key: str) -> None:
with self._lock:
self._config_dirty[conn_key] = True
self._events.pop(conn_key, None)
# ── Waveforms (permanent cache, evicted on (key,ts) mismatch) ─────────────
def get_waveform(self, conn_key: str, index: int) -> Optional[dict]:
with self._lock:
return self._waveforms.get((conn_key, index))
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
"""
Cache a waveform. Evicts the device's whole cache when the existing
entry at the same index has a different (waveform_key, timestamp).
"""
with self._lock:
existing = self._waveforms.get((conn_key, index))
if existing is not None:
new_key, new_ts = self._event_signature(waveform)
old_key, old_ts = self._event_signature(existing)
differs = (
(new_key and old_key and new_key != old_key)
or (new_ts and old_ts and new_ts != old_ts)
)
if differs:
self._flush_device(conn_key)
self._waveforms[(conn_key, index)] = waveform
+710 -143
View File
@@ -37,6 +37,7 @@ from __future__ import annotations
import datetime
import logging
import sys
import tempfile
import threading
import time
from pathlib import Path
@@ -44,9 +45,9 @@ from typing import Optional
# FastAPI / Pydantic
try:
from fastapi import Body, FastAPI, HTTPException, Query
from fastapi import Body, FastAPI, File, HTTPException, Query, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, JSONResponse
from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
from pydantic import BaseModel
import uvicorn
except ImportError:
@@ -59,10 +60,15 @@ except ImportError:
from minimateplus import MiniMateClient
from minimateplus.protocol import ProtocolError
from minimateplus.models import ComplianceConfig, DeviceInfo, Event, PeakValues, ProjectInfo, Timestamp
from minimateplus.models import CallHomeConfig, ComplianceConfig, DeviceInfo, Event, PeakValues, ProjectInfo, Timestamp
from minimateplus.transport import TcpTransport, DEFAULT_TCP_PORT
from minimateplus.blastware_file import write_blastware_file, blastware_filename
from minimateplus.client import _decode_a5_metadata_into, _decode_a5_waveform
from sfm import event_hdf5
from sfm.cache import SFMCache, get_cache
from sfm.database import SeismoDb
from sfm.live_cache import LiveCache as _LiveCache
from sfm.waveform_store import WaveformStore
logging.basicConfig(
level=logging.INFO,
@@ -99,6 +105,7 @@ app.add_middleware(
_DEFAULT_DB_PATH = Path(__file__).parent.parent / "bridges" / "captures" / "seismo_relay.db"
_db: Optional[SeismoDb] = None
_store: Optional[WaveformStore] = None
def _get_db() -> SeismoDb:
@@ -108,6 +115,18 @@ def _get_db() -> SeismoDb:
return _db
def _get_store() -> WaveformStore:
"""
Persistent event-file + A5-sidecar store, rooted at <db_dir>/waveforms/.
Mirrors the layout used by bridges/ach_server.py so files saved by ACH
ingestion and by live SFM downloads share one canonical location.
"""
global _store
if _store is None:
_store = WaveformStore(_get_db().db_path.parent / "waveforms")
return _store
# ── Live device cache ─────────────────────────────────────────────────────────
# In-memory cache for live device data. Avoids re-dialing the device on every
# request when the data hasn't changed.
@@ -125,116 +144,6 @@ def _get_db() -> SeismoDb:
#
# All endpoints accept ?force=true to bypass the cache and re-read from device.
_MONITOR_STATUS_TTL = 30.0 # seconds
class _LiveCache:
"""
Thread-safe in-memory cache for live SFM device data.
One singleton per server process.
"""
def __init__(self) -> None:
self._lock = threading.Lock()
# conn_key → serialised device info dict
self._device_info: dict[str, dict] = {}
# conn_key → (device_event_count_when_cached, [event dicts])
self._events: dict[str, tuple[int, list]] = {}
# conn_key → (fetched_at_unix, status_dict)
self._monitor_status: dict[str, tuple[float, dict]] = {}
# conn_key → bool (True = re-read device on next /device/info)
self._config_dirty: dict[str, bool] = {}
# (conn_key, event_index) → waveform dict (permanent)
self._waveforms: dict[tuple, dict] = {}
# ── Connection key ────────────────────────────────────────────────────────
@staticmethod
def make_conn_key(
host: Optional[str],
tcp_port: int,
port: Optional[str],
baud: int,
) -> str:
if host:
return f"tcp:{host}:{tcp_port}"
return f"serial:{port}:{baud}"
# ── Device info ───────────────────────────────────────────────────────────
def get_device_info(self, conn_key: str) -> Optional[dict]:
with self._lock:
if self._config_dirty.get(conn_key):
return None
return self._device_info.get(conn_key)
def set_device_info(self, conn_key: str, info: dict) -> None:
with self._lock:
self._device_info[conn_key] = info
self._config_dirty[conn_key] = False
# ── Events ────────────────────────────────────────────────────────────────
def get_events(self, conn_key: str, device_count: int) -> Optional[list]:
"""
Return cached events if the device's current event count matches what
we had when we last fetched. Returns None (cache miss) otherwise.
"""
with self._lock:
if self._config_dirty.get(conn_key):
return None
entry = self._events.get(conn_key)
if entry is None:
return None
cached_count, events = entry
return events if cached_count == device_count else None
def set_events(self, conn_key: str, device_count: int, events: list) -> None:
with self._lock:
self._events[conn_key] = (device_count, events)
# ── Monitor status ────────────────────────────────────────────────────────
def get_monitor_status(self, conn_key: str) -> Optional[dict]:
with self._lock:
entry = self._monitor_status.get(conn_key)
if entry is None:
return None
fetched_at, status = entry
if time.time() - fetched_at > _MONITOR_STATUS_TTL:
return None
return status
def set_monitor_status(self, conn_key: str, status: dict) -> None:
with self._lock:
self._monitor_status[conn_key] = (time.time(), status)
def invalidate_monitor_status(self, conn_key: str) -> None:
with self._lock:
self._monitor_status.pop(conn_key, None)
# ── Config dirty flag ─────────────────────────────────────────────────────
def mark_config_dirty(self, conn_key: str) -> None:
"""
Called after a successful POST /device/config write.
Forces next /device/info and /device/events to re-read from the device.
"""
with self._lock:
self._config_dirty[conn_key] = True
self._events.pop(conn_key, None)
# ── Waveforms (permanent cache) ───────────────────────────────────────────
def get_waveform(self, conn_key: str, index: int) -> Optional[dict]:
with self._lock:
return self._waveforms.get((conn_key, index))
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
with self._lock:
self._waveforms[(conn_key, index)] = waveform
_live_cache = _LiveCache()
@@ -285,11 +194,14 @@ def _serialise_compliance_config(cc: Optional["ComplianceConfig"]) -> Optional[d
if cc is None:
return None
return {
"record_time": cc.record_time,
"recording_mode": cc.recording_mode, # 0x00=Single Shot, 0x01=Continuous, 0x03=Histogram, 0x04=Histogram+Continuous
"sample_rate": cc.sample_rate,
"histogram_interval_sec": cc.histogram_interval_sec, # seconds; None if not Histogram mode
"record_time": cc.record_time,
"trigger_level_geo": cc.trigger_level_geo,
"alarm_level_geo": cc.alarm_level_geo,
"max_range_geo": cc.max_range_geo,
"geo_adc_scale": cc.geo_adc_scale, # hw scale factor (in/s)/V — informational only, do not write
"geo_range": cc.geo_range, # CONFIRMED 2026-04-20: 0x00=Normal 10in/s, 0x01=Sensitive 1.25in/s
"setup_name": cc.setup_name,
"project": cc.project,
"client": cc.client,
@@ -299,6 +211,27 @@ def _serialise_compliance_config(cc: Optional["ComplianceConfig"]) -> Optional[d
}
def _serialise_call_home_config(ch: Optional["CallHomeConfig"]) -> Optional[dict]:
if ch is None:
return None
return {
"auto_call_home_enabled": ch.auto_call_home_enabled,
"dial_string": ch.dial_string,
"after_event_recorded": ch.after_event_recorded,
"at_specified_times": ch.at_specified_times,
"time1_enabled": ch.time1_enabled,
"time1_hour": ch.time1_hour,
"time1_min": ch.time1_min,
"time2_enabled": ch.time2_enabled,
"time2_hour": ch.time2_hour,
"time2_min": ch.time2_min,
"num_retries": ch.num_retries,
"time_between_retries_sec": ch.time_between_retries_sec,
"wait_for_connection_sec": ch.wait_for_connection_sec,
"warm_up_time_sec": ch.warm_up_time_sec,
}
def _serialise_device_info(info: DeviceInfo) -> dict:
return {
"serial": info.serial,
@@ -757,7 +690,7 @@ def device_event_waveform(
if the device is not storing all frames yet, or the capture was partial)
- **sample_rate**: samples per second (from compliance config)
- **channels**: dict of channel name list of signed int16 ADC counts
(keys: "Tran", "Vert", "Long", "Mic")
(keys: "Tran", "Vert", "Long", "MicL")
**Caching**: full waveforms are cached permanently after the first download
they are immutable once recorded on the device. Subsequent requests for the
@@ -800,30 +733,194 @@ def device_event_waveform(
detail=f"Event index {index} not found on device",
)
raw = getattr(ev, "raw_samples", None) or {}
samples_decoded = len(raw.get("Tran", []))
# Backfill from compliance_config: sample_rate, record_time, and
# derived total_samples. These are user-set authoritative values; the
# corresponding STRT-derived guesses in `_decode_a5_waveform` can be
# off (e.g. rectime used to read the 0x46 record-type marker = 70s).
cc = info.compliance_config
if cc:
if ev.sample_rate is None and cc.sample_rate:
ev.sample_rate = cc.sample_rate
if cc.record_time:
ev.rectime_seconds = cc.record_time
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
geo_range = getattr(cc, "geo_range", None) if cc else None
# Resolve sample_rate from compliance config if not on the event itself
sample_rate = ev.sample_rate
if sample_rate is None and info.compliance_config:
sample_rate = info.compliance_config.sample_rate
result = {
"index": ev.index,
"record_type": ev.record_type,
"timestamp": _serialise_timestamp(ev.timestamp),
"total_samples": ev.total_samples,
"pretrig_samples": ev.pretrig_samples,
"rectime_seconds": ev.rectime_seconds,
"samples_decoded": samples_decoded,
"sample_rate": sample_rate,
"peak_values": _serialise_peak_values(ev.peak_values),
"channels": raw,
}
# Build the plot.v1 JSON: samples in physical units (in/s for geo, psi
# for mic), explicit time axis, peak markers — the shape clients should
# consume directly without doing any ADC scaling.
serial = getattr(info, "serial", None) or ""
result = event_hdf5.event_to_plot_json(
ev, serial=serial,
geo_range=geo_range or "normal",
index=index,
)
cache.set_waveform(conn_key, index, result)
return result
@app.get("/device/event/{index}/blastware_file")
def device_event_blastware_file(
index: int,
port: Optional[str] = Query(None, description="Serial port (e.g. COM5)"),
baud: int = Query(38400, description="Serial baud rate"),
host: Optional[str] = Query(None, description="TCP host — modem IP or ACH relay"),
tcp_port: int = Query(DEFAULT_TCP_PORT, description=f"TCP port (default {DEFAULT_TCP_PORT})"),
force: bool = Query(False, description="Bypass any cached/dedup'd state and re-download from device"),
) -> FileResponse:
"""
Download the waveform for a single event (0-based index) and return it
as a Blastware-compatible binary file with a correct Blastware filename.
Supply either *port* (serial) or *host* (TCP/modem).
The file is written to the OS temp directory and streamed back as a binary
download. Blastware can open it directly filename encodes serial + timestamp.
Filename format: <prefix><serial3><stem><AB>0<W|H>
- prefix letter = chr(ord('B') + floor(serial_numeric / 1000))
- stem + AB = second-resolution timestamp since 1985-01-01 local
- W / H = Full Waveform / Full Histogram (defaults to W for
triggered events; histogram requires recording_mode
to be populated from compliance config)
Performs: POLL startup get_events(full_waveform=True,
stop_after_index=index) write_blastware_file() FileResponse +
persistent store + DB upsert.
"""
log.info(
"GET /device/event/%d/blastware_file port=%s host=%s force=%s",
index, port, host, force,
)
# `force` always re-downloads from the device. This endpoint already
# never short-circuits via cache, so `force` is reserved for parity with
# the other live endpoints.
try:
def _do():
with _build_client(port, baud, host, tcp_port, timeout=120.0) as client:
info = client.connect()
# full_waveform=True pulls the complete 5A stream so the
# client populates STRT-derived fields (total_samples,
# pretrig_samples, rectime_seconds) AND raw_samples on the
# Event. Required for the .h5 + .sfm.json sidecar to be
# filled in correctly — without it, those land as nulls.
events = client.get_events(
full_waveform=True,
stop_after_index=index,
)
matching = [ev for ev in events if ev.index == index]
return matching[0] if matching else None, info
ev, info = _run_with_retry(_do, is_tcp=_is_tcp(host))
except HTTPException:
raise
except ProtocolError as exc:
log.error("blastware_file: protocol error: %s", exc, exc_info=True)
raise HTTPException(status_code=502, detail=f"Protocol error: {exc}") from exc
except OSError as exc:
log.error("blastware_file: connection error: %s", exc, exc_info=True)
raise HTTPException(status_code=502, detail=f"Connection error: {exc}") from exc
except Exception as exc:
log.error("blastware_file: unexpected error: %s", exc, exc_info=True)
raise HTTPException(status_code=500, detail=f"Device error: {exc}") from exc
if ev is None:
raise HTTPException(
status_code=404,
detail=f"Event index {index} not found on device",
)
a5_frames = getattr(ev, "_a5_frames", None)
if not a5_frames:
raise HTTPException(
status_code=502,
detail=f"No waveform data received for event index {index} — 5A download failed",
)
# Determine serial number from device info
serial = getattr(info, "serial", None) or "UNKNOWN"
# Build filename using the same algorithm Blastware uses
filename = blastware_filename(ev, serial)
# Write to OS temp dir (cross-platform: /tmp on Linux/macOS,
# %TEMP% on Windows) so FastAPI can stream it back via FileResponse.
out_path = Path(tempfile.gettempdir()) / filename
# Delete any stale file at this path before writing. On Windows we have
# observed the new (smaller) file getting trailing zero-bytes from the
# previous (larger) file when filesystem semantics around open(...,"wb")
# don't truncate cleanly (e.g. through a synced folder). Explicit unlink
# eliminates that ambiguity.
try:
out_path.unlink()
except FileNotFoundError:
pass
write_blastware_file(ev, a5_frames, out_path)
log.info(
"blastware_file: wrote %s (%d A5 frames, serial=%s)",
out_path, len(a5_frames), serial,
)
# Promote to canonical persistent store + DB row so this event is
# queryable via /db/events afterwards (matches the ACH ingestion path).
if serial != "UNKNOWN" and ev._waveform_key is not None:
try:
cc = info.compliance_config
# Backfill authoritative compliance-config values onto the
# Event before persisting. These supersede whatever
# _decode_a5_waveform read from the STRT bytes (some of which
# have ambiguous semantics — e.g. STRT[20] is rectime but
# STRT[8:10] / STRT[16:18] are device-specific scratch fields
# that aren't reliable sample/pretrig counts).
if cc:
if ev.sample_rate is None and cc.sample_rate:
ev.sample_rate = cc.sample_rate
if cc.record_time:
# record_time from compliance is authoritative — the
# user-set value the device followed when recording.
ev.rectime_seconds = cc.record_time
# Derive total_samples from sample_rate × rectime when
# we can; the STRT-derived value can land at a buffer-
# offset rather than a sample count.
if ev.sample_rate and ev.rectime_seconds:
derived = int(round(ev.sample_rate * ev.rectime_seconds))
if (ev.total_samples is None
or ev.total_samples > derived * 2
or ev.total_samples < derived // 4):
ev.total_samples = derived
geo_range = getattr(cc, "geo_range", None) if cc else None
rec = _get_store().save(
ev, serial=serial, a5_frames=a5_frames,
geo_range=geo_range if geo_range is not None else "normal",
)
_get_db().insert_events(
[ev],
serial=serial,
waveform_records={ev._waveform_key.hex(): rec},
)
log.info(
"blastware_file: persisted to store (%s, %d bytes)",
rec["filename"], rec["filesize"],
)
except Exception as exc:
log.warning(
"blastware_file: persistent store save failed: %s "
"— temp file still served",
exc,
)
return FileResponse(
path=str(out_path),
filename=filename,
media_type="application/octet-stream",
)
# ── Write endpoints ───────────────────────────────────────────────────────────
class DeviceConfigBody(BaseModel):
@@ -835,16 +932,15 @@ class DeviceConfigBody(BaseModel):
Recording parameters
--------------------
recording_mode : Recording mode enum. Values: 0=Single Shot, 1=Continuous, 3=Histogram, 4=Histogram+Continuous.
sample_rate : Samples per second. Valid values: 1024, 2048, 4096.
record_time : Record duration in seconds (e.g. 1.0, 2.0, 3.0).
Trigger / alarm thresholds (geo channels, in/s)
------------------------------------------------
Trigger / alarm thresholds and range (geo channels)
----------------------------------------------------
trigger_level_geo : Trigger threshold in in/s (e.g. 0.5).
alarm_level_geo : Alarm threshold in in/s (e.g. 1.0).
max_range_geo : Full-scale calibration constant (e.g. 6.206).
Rarely changed only set if you know what you're doing.
geo_range : Geophone range/sensitivity. 0=Normal 10.000 in/s, 1=Sensitive 1.250 in/s.
Project / operator strings (max 41 ASCII characters each)
----------------------------
project : Project description.
@@ -854,12 +950,14 @@ class DeviceConfigBody(BaseModel):
notes : Extended notes.
"""
# Recording parameters
recording_mode: Optional[int] = None
sample_rate: Optional[int] = None
record_time: Optional[float] = None
# Threshold parameters
histogram_interval_sec: Optional[int] = None # seconds: 2, 5, 15, 60, 300, 900 (mode-gated)
# Threshold parameters / geo range
trigger_level_geo: Optional[float] = None
alarm_level_geo: Optional[float] = None
max_range_geo: Optional[float] = None
geo_range: Optional[int] = None # 0=Normal 10.000 in/s, 1=Sensitive 1.250 in/s
# Project / operator strings
project: Optional[str] = None
client_name: Optional[str] = None
@@ -887,6 +985,7 @@ def device_config(
Example body (all fields optional include only what you want to change):
{
"recording_mode": 1,
"sample_rate": 1024,
"record_time": 3.0,
"trigger_level_geo": 0.5,
@@ -914,11 +1013,13 @@ def device_config(
with _build_client(port, baud, host, tcp_port) as client:
client.connect()
client.apply_config(
recording_mode=body.recording_mode,
sample_rate=body.sample_rate,
record_time=body.record_time,
histogram_interval_sec=body.histogram_interval_sec,
trigger_level_geo=body.trigger_level_geo,
alarm_level_geo=body.alarm_level_geo,
max_range_geo=body.max_range_geo,
geo_range=body.geo_range,
project=body.project,
client_name=body.client_name,
operator=body.operator,
@@ -1068,6 +1169,144 @@ def device_monitor_stop(
return {"status": "stopped"}
# ── Call home config endpoints ───────────────────────────────────────────────
@app.get("/device/call_home")
def device_call_home_get(
port: Optional[str] = Query(None, description="Serial port (e.g. COM5)"),
baud: int = Query(38400, description="Serial baud rate"),
host: Optional[str] = Query(None, description="TCP host — modem IP or ACH relay"),
tcp_port: int = Query(DEFAULT_TCP_PORT, description=f"TCP port (default {DEFAULT_TCP_PORT})"),
) -> dict:
"""
Read the Auto Call Home (ACH) configuration from the device.
Sends SUB 0x2C (two-step read) and returns the decoded call home config.
Confirmed from 4-20-26 call home settings captures (BE11529).
Returns:
{
"auto_call_home_enabled": true/false,
"dial_string": "RADIO RING",
"after_event_recorded": true/false,
"at_specified_times": true/false,
"time1_enabled": true/false, "time1_hour": 19, "time1_min": 55,
"time2_enabled": false, "time2_hour": 0, "time2_min": 0,
"num_retries": 3,
"time_between_retries_sec": 15,
"wait_for_connection_sec": 60,
"warm_up_time_sec": 60
}
"""
try:
def _do():
with _build_client(port, baud, host, tcp_port) as client:
client.poll()
return client.get_call_home_config()
ch_config = _run_with_retry(_do, is_tcp=_is_tcp(host))
except HTTPException:
raise
except ProtocolError as exc:
raise HTTPException(status_code=502, detail=f"Protocol error: {exc}") from exc
except OSError as exc:
raise HTTPException(status_code=502, detail=f"Connection error: {exc}") from exc
except Exception as exc:
raise HTTPException(status_code=500, detail=f"Device error: {exc}") from exc
return _serialise_call_home_config(ch_config) or {}
class CallHomeConfigBody(BaseModel):
"""
Request body for POST /device/call_home.
All fields are optional only supplied (non-null) fields are modified.
All other call home config bytes are round-tripped verbatim from the device.
Confirmed writable fields (4-20-26 captures):
auto_call_home_enabled : bool master enable for auto call home
after_event_recorded : bool call home after each triggered event
at_specified_times : bool enable time-based scheduled calls
time1_enabled : bool enable time slot 1
time1_hour : int hour for slot 1 (0-23; avoid 3 DLE escape limitation)
time1_min : int minute for slot 1 (0-59; avoid 3)
time2_enabled : bool enable time slot 2
time2_hour : int hour for slot 2 (0-23; avoid 3)
time2_min : int minute for slot 2 (0-59; avoid 3)
Read-only fields (not writable via this endpoint):
dial_string, num_retries, time_between_retries_sec,
wait_for_connection_sec, warm_up_time_sec
"""
auto_call_home_enabled: Optional[bool] = None
after_event_recorded: Optional[bool] = None
at_specified_times: Optional[bool] = None
time1_enabled: Optional[bool] = None
time1_hour: Optional[int] = None
time1_min: Optional[int] = None
time2_enabled: Optional[bool] = None
time2_hour: Optional[int] = None
time2_min: Optional[int] = None
@app.post("/device/call_home")
def device_call_home_set(
body: CallHomeConfigBody,
port: Optional[str] = Query(None, description="Serial port (e.g. COM5)"),
baud: int = Query(38400, description="Serial baud rate"),
host: Optional[str] = Query(None, description="TCP host — modem IP or ACH relay"),
tcp_port: int = Query(DEFAULT_TCP_PORT, description=f"TCP port (default {DEFAULT_TCP_PORT})"),
) -> dict:
"""
Read the current call home config, apply supplied changes, and write back.
Only non-null fields are modified. All other bytes round-trip verbatim.
Write sequence (confirmed from 4-20-26 call home settings captures):
SUB 0x2C (read 2-step) 125-byte raw payload
patch fields
SUB 0x7E (write 127-byte payload) ack 0x81
SUB 0x7F (confirm) ack 0x80
Example body:
{ "auto_call_home_enabled": true, "after_event_recorded": true,
"time1_enabled": true, "time1_hour": 20, "time1_min": 0 }
"""
changed = body.model_dump(exclude_none=True)
log.info("POST /device/call_home port=%s host=%s fields=%s", port, host, list(changed.keys()))
try:
def _do():
with _build_client(port, baud, host, tcp_port) as client:
client.poll()
client.set_call_home_config(
auto_call_home_enabled=body.auto_call_home_enabled,
after_event_recorded=body.after_event_recorded,
at_specified_times=body.at_specified_times,
time1_enabled=body.time1_enabled,
time1_hour=body.time1_hour,
time1_min=body.time1_min,
time2_enabled=body.time2_enabled,
time2_hour=body.time2_hour,
time2_min=body.time2_min,
)
_run_with_retry(_do, is_tcp=_is_tcp(host))
except HTTPException:
raise
except ProtocolError as exc:
raise HTTPException(status_code=502, detail=f"Protocol error: {exc}") from exc
except OSError as exc:
raise HTTPException(status_code=502, detail=f"Connection error: {exc}") from exc
except ValueError as exc:
raise HTTPException(status_code=422, detail=str(exc)) from exc
except Exception as exc:
raise HTTPException(status_code=500, detail=f"Device error: {exc}") from exc
return {"status": "ok", "updated_fields": changed}
# ── Cache management endpoints ────────────────────────────────────────────────
@app.get("/cache/stats")
@@ -1164,6 +1403,334 @@ def db_set_false_trigger(
return {"status": "ok", "event_id": event_id, "false_trigger": value}
# ── /db/events/{id} — waveform file accessors ─────────────────────────────────
#
# These endpoints serve files from the persistent WaveformStore, so a Blastware
# file or its decoded JSON for a previously-ingested ACH event can be fetched
# without re-dialing the device.
@app.get("/db/events/{event_id}/blastware_file")
def db_event_blastware_file(event_id: str) -> FileResponse:
"""
Return the Blastware-format event file for a previously-ingested
event. Filename extension is per-event (timestamp-encoded
`AB0T` for ACH downloads, 3-char `AB0` for direct downloads).
404 if the event is unknown or has no event file in the store
(events ingested before the store was wired will show this
re-download via the live endpoint to populate).
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=(
f"Event {event_id} has no Blastware file in the store. "
"Re-download via the live endpoint to populate."
),
)
bw_path = _get_store().open_blastware(serial, filename)
if bw_path is None:
raise HTTPException(
status_code=410,
detail=f"Stored file missing on disk: {filename}",
)
return FileResponse(
path=str(bw_path),
filename=filename,
media_type="application/octet-stream",
)
@app.get("/db/events/{event_id}/waveform.json")
def db_event_waveform_json(event_id: str) -> dict:
"""
Return the plot-ready JSON (`sfm.plot.v1`) for a stored event.
Resolution order (cheapest first):
1. If `<filename>.h5` exists, serve it via `plot_json_from_hdf5`.
Samples are already in physical units; no decode work needed.
2. Else if `<filename>.a5.pkl` exists, replay the A5 decoders to
rebuild an Event and serialise via `event_to_plot_json`.
3. Else 404 the event has no waveform data on disk.
The shape is identical regardless of source, so clients (the SFM
webapp, Terra-View, etc.) consume the same `sfm.plot.v1` payload.
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
store = _get_store()
# Path 1: HDF5 (canonical clean format).
h5_path = store.hdf5_path_for(serial, filename)
if h5_path.exists():
try:
return event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
except Exception as exc:
log.warning("HDF5 read failed (%s); falling back to A5 path", exc)
# Path 2: A5 pickle replay.
a5_frames = store.load_a5(serial, filename)
if not a5_frames:
raise HTTPException(
status_code=404,
detail=(
f"Event {event_id} has no waveform data on disk "
"(no .h5 and no .a5.pkl). Run the backfill script or "
"re-download via the live endpoint to populate."
),
)
ev = Event(index=-1)
try:
_decode_a5_metadata_into(a5_frames, ev)
except Exception as exc:
log.warning("db_event_waveform_json: metadata decode failed: %s", exc)
try:
_decode_a5_waveform(a5_frames, ev)
except Exception as exc:
log.error("db_event_waveform_json: waveform decode failed: %s", exc, exc_info=True)
raise HTTPException(status_code=500, detail=f"Waveform decode failed: {exc}") from exc
# Carry over fields from the DB row when the A5 replay didn't fill them.
if ev.sample_rate is None and row.get("sample_rate"):
ev.sample_rate = row.get("sample_rate")
return event_hdf5.event_to_plot_json(
ev, serial=serial, geo_range="normal", event_id=event_id,
)
# ── /db/events/{id}/sidecar — modern .sfm.json review/metadata accessors ──────
class SidecarPatchBody(BaseModel):
"""Body for PATCH /db/events/{id}/sidecar.
JSON-merge-patch semantics: only the keys you include get updated.
`review` is the editable block for monthly-summary workflows
(false_trigger flag, reviewer notes, etc.); `extensions` is the
forward-compat namespace for vendor / future fields.
"""
review: Optional[dict] = None
extensions: Optional[dict] = None
@app.get("/db/events/{event_id}/sidecar")
def db_event_sidecar(event_id: str) -> dict:
"""
Return the .sfm.json sidecar for a stored event. 404 if the event
is unknown or has no sidecar in the store (events ingested before
the sidecar feature landed will show this until backfilled).
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
sidecar = _get_store().load_sidecar(serial, filename)
if sidecar is None:
raise HTTPException(
status_code=404,
detail=(
f"No .sfm.json sidecar on disk for {filename}. "
"Run scripts/backfill_sidecars.py to generate one."
),
)
return sidecar
@app.patch("/db/events/{event_id}/sidecar")
def db_event_sidecar_patch(event_id: str, body: SidecarPatchBody) -> dict:
"""
JSON-merge-patch the sidecar's `review` and/or `extensions` blocks.
The sidecar JSON is the source of truth for review state. When
`review.false_trigger` is updated, the SQL `events.false_trigger`
column is kept in sync as a derived index for fast filtering.
Returns the new full sidecar. 404 if the event or sidecar is missing.
"""
row = _get_db().get_event(event_id)
if row is None:
raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
serial = row.get("serial")
filename = row.get("blastware_filename")
if not serial or not filename:
raise HTTPException(
status_code=404,
detail=f"Event {event_id} has no event file in the store",
)
if not (body.review or body.extensions):
raise HTTPException(
status_code=400,
detail="PATCH body must include `review` and/or `extensions`",
)
new_sidecar = _get_store().patch_sidecar(
serial, filename,
review=body.review,
extensions=body.extensions,
)
if new_sidecar is None:
raise HTTPException(
status_code=404,
detail=f"No .sfm.json sidecar on disk for {filename}",
)
# Mirror false_trigger from review block into the SQL index column.
if body.review is not None:
_get_db().update_event_review(event_id, new_sidecar.get("review", {}))
return new_sidecar
# ── /db/import/blastware_file — ingest BW-only event files ────────────────────
@app.post("/db/import/blastware_file")
async def db_import_blastware_file(
files: list[UploadFile] = File(...),
serial: Optional[str] = Query(None, description="Optional serial-number hint (e.g. BE11529); falls back to the BW filename's encoded prefix when omitted"),
) -> dict:
"""
Multipart upload of one or more Blastware event file binaries
(typically produced by Blastware's own ACH). For each file:
1. Parse the bytes via WaveformStore.save_imported_bw produces
a parsed Event + copies the file into the persistent store +
writes a .sfm.json sidecar with source.kind = "bw-import".
2. Upsert a row into `events` (dedup'd on serial+timestamp).
Response includes per-file outcomes so the caller can see which
landed cleanly and which failed (e.g. malformed file, unknown
serial, etc.).
"""
store = _get_store()
db = _get_db()
results: list[dict] = []
for upload in files:
try:
content = await upload.read()
except Exception as exc:
results.append({
"filename": upload.filename, "status": "error",
"detail": f"read failed: {exc}",
})
continue
try:
ev, rec = store.save_imported_bw(
content,
source_path=Path(upload.filename or "imported.bw"),
serial_hint=serial,
)
inserted, skipped = db.insert_events(
[ev],
serial=(serial or _serial_from_event(ev) or "UNKNOWN"),
waveform_records={
ev._waveform_key.hex(): rec
if ev._waveform_key else None
} if ev._waveform_key else None,
)
results.append({
"filename": upload.filename,
"status": "ok",
"stored_filename": rec["filename"],
"filesize": rec["filesize"],
"sha256": rec["sha256"],
"inserted": inserted,
"skipped": skipped,
})
except Exception as exc:
log.error("import failed for %s: %s", upload.filename, exc, exc_info=True)
results.append({
"filename": upload.filename, "status": "error",
"detail": str(exc),
})
return {"count": len(results), "results": results}
def _serial_from_event(ev) -> Optional[str]:
"""Fallback serial resolver — currently relies on the BW filename
decoder via WaveformStore.save_imported_bw, so this is just a
placeholder for future enhancement (e.g. inferring from project_info)."""
return None
@app.get("/db/units/{serial}/waveforms.zip")
def db_unit_waveforms_zip(
serial: str,
from_dt: Optional[str] = Query(None, description="ISO-8601 start datetime (inclusive)"),
to_dt: Optional[str] = Query(None, description="ISO-8601 end datetime (inclusive)"),
limit: int = Query(5000, description="Hard cap on events bundled (default 5000)"),
) -> StreamingResponse:
"""
Stream a ZIP of all event files for a serial in the optional date range.
Events without a stored event file are silently skipped.
"""
import io
import zipfile
from_parsed = datetime.datetime.fromisoformat(from_dt) if from_dt else None
to_parsed = datetime.datetime.fromisoformat(to_dt) if to_dt else None
rows = _get_db().query_events(
serial=serial,
from_dt=from_parsed,
to_dt=to_parsed,
limit=limit,
offset=0,
)
store = _get_store()
buf = io.BytesIO()
written = 0
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
for row in rows:
fn = row.get("blastware_filename")
if not fn:
continue
bw_path = store.open_blastware(serial, fn)
if bw_path is None:
continue
zf.write(bw_path, arcname=fn)
written += 1
if written == 0:
raise HTTPException(
status_code=404,
detail=f"No stored Blastware files found for serial {serial} in range",
)
buf.seek(0)
safe_serial = serial.replace("/", "_")
headers = {
"Content-Disposition": f'attachment; filename="{safe_serial}_waveforms.zip"',
"X-Waveform-Count": str(written),
}
return StreamingResponse(buf, media_type="application/zip", headers=headers)
@app.get("/db/monitor_log")
def db_monitor_log(
serial: Optional[str] = Query(None, description="Filter by unit serial"),
+905 -61
View File
File diff suppressed because it is too large Load Diff
+446
View File
@@ -0,0 +1,446 @@
"""
sfm/waveform_store.py On-disk store for Blastware-format event files.
Layout (flat per-serial, four files per event):
<root>/<serial>/<filename> event file (BW-readable binary)
<root>/<serial>/<filename>.a5.pkl pickled list of A5 S3Frame dicts
<root>/<serial>/<filename>.h5 clean waveform arrays (HDF5)
<root>/<serial>/<filename>.sfm.json modern sidecar (peaks, project,
review state, extensions)
`<filename>` is whatever `minimateplus.blastware_file.blastware_filename`
produces for the event. The extension is NOT a fixed type tag it
encodes the event timestamp (`AB0T` format).
Roles:
- BW binary: what Blastware reads. Untouched. The user-facing review
waveform viewer.
- .a5.pkl: regenerative source. Lets the BW binary be rebuilt
byte-for-byte if the encoder changes. Never delete.
- .h5: clean per-channel waveform arrays in physical units (in/s for
geo, psi for mic) plus event metadata. Canonical format for
downstream analysis tools and the `/device/event/{idx}/waveform`
endpoint's plot-JSON output.
- .sfm.json: small, queryable metadata + review state. SQL
`events.false_trigger` is a derived index kept in sync via
`patch_sidecar()`.
"""
from __future__ import annotations
import datetime
import logging
import pickle
import shutil
from pathlib import Path
from typing import Optional
from minimateplus import event_file_io
from minimateplus.blastware_file import blastware_filename, write_blastware_file
from minimateplus.framing import S3Frame
from minimateplus.models import Event
from sfm import event_hdf5
log = logging.getLogger("sfm.waveform_store")
A5_PICKLE_VERSION = 1
def _frame_to_dict(f: S3Frame) -> dict:
return {
"sub": f.sub,
"page_hi": f.page_hi,
"page_lo": f.page_lo,
"data": bytes(f.data),
"chk_byte": f.chk_byte,
"checksum_valid": f.checksum_valid,
}
def _dict_to_frame(d: dict) -> S3Frame:
return S3Frame(
sub=d["sub"],
page_hi=d["page_hi"],
page_lo=d["page_lo"],
data=bytes(d["data"]),
checksum_valid=d.get("checksum_valid", True),
chk_byte=d.get("chk_byte", 0),
)
class WaveformStore:
"""
Persistent store for Blastware-format waveform files + their A5 source frames.
Thread safety: write_blastware_file is single-shot; concurrent saves of the
*same* filename would race, but the filename encodes second-resolution
timestamps + serial, so collisions across threads/processes are vanishingly
unlikely in practice.
"""
def __init__(self, root: str | Path) -> None:
self.root = Path(root)
self.root.mkdir(parents=True, exist_ok=True)
log.info("WaveformStore root=%s", self.root)
# ── path helpers ────────────────────────────────────────────────────────────
def _serial_dir(self, serial: str) -> Path:
d = self.root / serial
d.mkdir(parents=True, exist_ok=True)
return d
def paths_for(self, serial: str, filename: str) -> tuple[Path, Path]:
"""Return (blastware_path, a5_pickle_path) for a given serial+filename.
For the sidecar path use `sidecar_path_for()` kept separate so
existing callers don't need to unpack a 3-tuple.
"""
d = self._serial_dir(serial)
return d / filename, d / f"{filename}.a5.pkl"
def sidecar_path_for(self, serial: str, filename: str) -> Path:
"""Return absolute path to the .sfm.json sidecar for a given event."""
return self._serial_dir(serial) / f"{filename}.sfm.json"
def hdf5_path_for(self, serial: str, filename: str) -> Path:
"""Return absolute path to the .h5 clean-waveform file for a given event."""
return self._serial_dir(serial) / f"{filename}.h5"
def open_blastware(self, serial: str, filename: str) -> Optional[Path]:
"""Return absolute path to an existing event file or None."""
bw_path, _ = self.paths_for(serial, filename)
return bw_path if bw_path.exists() else None
# ── save / load ─────────────────────────────────────────────────────────────
def save(
self,
ev: Event,
serial: str,
a5_frames: list[S3Frame],
*,
source_kind: str = "sfm-live",
geo_range = "normal",
) -> dict:
"""
Write all four event-file artifacts for one event:
- <filename> BW binary
- <filename>.a5.pkl raw A5 frame pickle
- <filename>.h5 clean waveform (HDF5)
- <filename>.sfm.json modern sidecar (metadata + review)
Returns a record dict suitable for persisting alongside the DB row:
{
"filename": "M529LKIQ.7M0W",
"filesize": 8708,
"sha256": "a1b2c3...",
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
"hdf5_filename": "M529LKIQ.7M0W.h5",
"sidecar_filename": "M529LKIQ.7M0W.sfm.json",
}
`source_kind` flows into `sidecar.source.kind` callers should
pass "sfm-live" (default) for the live endpoint and "sfm-ach" for
the ACH ingestion path. BW-imported events use save_imported_bw()
instead.
`geo_range` controls the ADC-counts in/s scaling in the HDF5
file ("normal" = 10 in/s FS, "sensitive" = 1.25 in/s FS).
Defaults to "normal" callers with compliance-config access
should pass the actual unit setting so the saved samples are in
the right units.
Idempotent: if the event file already exists, it is overwritten
with the freshly-encoded version (same bytes for the same
a5_frames) and the sidecar's review block is preserved across
re-saves.
"""
if not a5_frames:
raise ValueError("WaveformStore.save: a5_frames is empty")
if not serial:
raise ValueError("WaveformStore.save: serial is required")
filename = blastware_filename(ev, serial)
bw_path, a5_path = self.paths_for(serial, filename)
sidecar_path = self.sidecar_path_for(serial, filename)
hdf5_path = self.hdf5_path_for(serial, filename)
# 1. encode the event file (defensive unlink prevents trailing-byte
# leaks from a previous larger file on synced/odd filesystems).
try:
bw_path.unlink()
except FileNotFoundError:
pass
write_blastware_file(ev, a5_frames, bw_path)
filesize = bw_path.stat().st_size
sha256 = event_file_io.file_sha256(bw_path)
# 2. write the .a5.pkl sidecar
try:
a5_path.unlink()
except FileNotFoundError:
pass
payload = {
"version": A5_PICKLE_VERSION,
"frames": [_frame_to_dict(f) for f in a5_frames],
}
with a5_path.open("wb") as fp:
pickle.dump(payload, fp, protocol=pickle.HIGHEST_PROTOCOL)
# 3. write the .h5 clean-waveform file (samples in physical units).
# Best-effort: a write failure shouldn't sink the rest of the save
# (the HDF5 can be regenerated later from the .a5.pkl).
hdf5_filename: Optional[str] = None
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range=geo_range,
source_kind=source_kind,
)
hdf5_filename = hdf5_path.name
except Exception as exc:
log.warning(
"save: HDF5 write failed for %s: %s — continuing without .h5",
hdf5_path, exc,
)
# 4. write the .sfm.json sidecar. Preserve any existing review
# block + extensions across re-saves so user edits aren't lost
# when the same event is re-downloaded (e.g. via Force refresh).
existing_review = None
existing_extensions = None
if sidecar_path.exists():
try:
old = event_file_io.read_sidecar(sidecar_path)
existing_review = old.get("review")
existing_extensions = old.get("extensions")
except Exception as exc:
log.warning(
"save: existing sidecar at %s unreadable (%s); overwriting",
sidecar_path, exc,
)
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=filename,
blastware_filesize=filesize,
blastware_sha256=sha256,
source_kind=source_kind,
a5_pickle_filename=a5_path.name,
review=existing_review,
extensions=existing_extensions,
)
event_file_io.write_sidecar(sidecar_path, sidecar)
log.info(
"WaveformStore.save serial=%s filename=%s filesize=%d frames=%d "
"h5=%s sidecar=%s",
serial, filename, filesize, len(a5_frames),
hdf5_filename or "(skipped)", sidecar_path.name,
)
return {
"filename": filename,
"filesize": filesize,
"sha256": sha256,
"a5_pickle_filename": a5_path.name,
"hdf5_filename": hdf5_filename,
"sidecar_filename": sidecar_path.name,
}
def save_imported_bw(
self,
bw_bytes: bytes,
source_path: Path,
*,
serial_hint: Optional[str] = None,
) -> tuple[Event, dict]:
"""
Ingest a Blastware event file produced by an external tool
(Blastware's own ACH, manual download, etc.) where the source A5
frames aren't available.
Workflow:
1. Parse the bytes via event_file_io.read_blastware_file (writes
a temp file to do that, since the parser takes a path).
2. Resolve serial from BW filename (`<P><serial3>...`) or use
serial_hint. Falls back to "UNKNOWN".
3. Copy the BW bytes verbatim into <root>/<serial>/<filename>.
4. Write the .sfm.json sidecar with source.kind = "bw-import"
and a5_pickle_filename = None. Does NOT write a .a5.pkl
(no A5 source available; byte-for-byte regeneration not
possible the on-disk BW file IS the byte-for-byte source).
Returns (event, record_dict) so callers can both insert into
SeismoDb and surface the parsed Event.
"""
# Stash the bytes to a temp path so read_blastware_file (path-based)
# can parse without us duplicating its logic.
import tempfile
with tempfile.NamedTemporaryFile(suffix=".bw", delete=False) as tmp:
tmp.write(bw_bytes)
tmp_path = Path(tmp.name)
try:
ev = event_file_io.read_blastware_file(tmp_path)
finally:
try:
tmp_path.unlink()
except FileNotFoundError:
pass
# Resolve serial. blastware_filename derives a 4-char prefix from
# the numeric serial (e.g. BE11529 → M529); we go the other way
# via the source filename if a hint wasn't given.
serial = serial_hint or _serial_from_bw_filename(source_path.name) or "UNKNOWN"
# Use the source filename verbatim — it already encodes timestamp
# + record type per BW's AB0T scheme, and we want to preserve it
# so the file BW knows about can be opened back in BW.
filename = source_path.name
bw_path = self._serial_dir(serial) / filename
# 1. copy bytes
bw_path.write_bytes(bw_bytes)
filesize = bw_path.stat().st_size
sha256 = event_file_io.file_sha256(bw_path)
# 2. write the .h5 clean-waveform file from the parsed Event.
# Note: peaks here are computed from raw samples (the BW file
# doesn't carry the device-authoritative 0C peaks). Best-effort.
hdf5_path = self.hdf5_path_for(serial, filename)
hdf5_filename: Optional[str] = None
try:
event_hdf5.write_event_hdf5(
hdf5_path, ev,
serial=serial,
geo_range="normal", # BW file doesn't carry the range; assume Normal
source_kind="bw-import",
)
hdf5_filename = hdf5_path.name
except Exception as exc:
log.warning(
"save_imported_bw: HDF5 write failed for %s: %s — continuing",
hdf5_path, exc,
)
# 3. write sidecar with source.kind = bw-import
sidecar_path = self.sidecar_path_for(serial, filename)
existing_review = None
if sidecar_path.exists():
try:
existing_review = event_file_io.read_sidecar(sidecar_path).get("review")
except Exception:
pass
sidecar = event_file_io.event_to_sidecar_dict(
ev,
serial=serial,
blastware_filename=filename,
blastware_filesize=filesize,
blastware_sha256=sha256,
source_kind="bw-import",
a5_pickle_filename=None,
review=existing_review,
)
event_file_io.write_sidecar(sidecar_path, sidecar)
log.info(
"WaveformStore.save_imported_bw serial=%s filename=%s filesize=%d "
"h5=%s (no .a5.pkl — A5 source unavailable for BW-imported files)",
serial, filename, filesize, hdf5_filename or "(skipped)",
)
return ev, {
"filename": filename,
"filesize": filesize,
"sha256": sha256,
"a5_pickle_filename": None,
"hdf5_filename": hdf5_filename,
"sidecar_filename": sidecar_path.name,
}
def load_a5(self, serial: str, filename: str) -> Optional[list[S3Frame]]:
"""
Re-hydrate the pickled A5 frame stream for a stored event.
Returns None if the sidecar is missing.
"""
_, a5_path = self.paths_for(serial, filename)
if not a5_path.exists():
return None
with a5_path.open("rb") as fp:
payload = pickle.load(fp)
if not isinstance(payload, dict) or "frames" not in payload:
log.warning("WaveformStore.load_a5: malformed sidecar at %s", a5_path)
return None
return [_dict_to_frame(d) for d in payload["frames"]]
# ── modern .sfm.json sidecar accessors ──────────────────────────────────────
def load_sidecar(self, serial: str, filename: str) -> Optional[dict]:
"""Return the parsed .sfm.json sidecar dict, or None if missing."""
path = self.sidecar_path_for(serial, filename)
if not path.exists():
return None
try:
return event_file_io.read_sidecar(path)
except Exception as exc:
log.warning("load_sidecar: failed to read %s: %s", path, exc)
return None
def patch_sidecar(
self,
serial: str,
filename: str,
*,
review: Optional[dict] = None,
extensions: Optional[dict] = None,
reviewer_now: bool = True,
) -> Optional[dict]:
"""
JSON-merge-patch the .sfm.json sidecar's review/extensions blocks.
Returns the new full dict, or None if the sidecar doesn't exist.
"""
path = self.sidecar_path_for(serial, filename)
if not path.exists():
return None
return event_file_io.patch_sidecar(
path,
review=review,
extensions=extensions,
reviewer_now=reviewer_now,
)
# ── helpers ─────────────────────────────────────────────────────────────────────
def _serial_from_bw_filename(name: str) -> Optional[str]:
"""
Reverse of `blastware_filename`'s serial-prefix encoding.
BW filename format (V10.72): `<P><serial3><stem4>.<ext>`
where P = chr(ord('B') + floor(serial // 1000))
and serial3 = f"{serial % 1000:03d}".
Examples (from CLAUDE.md verification archive):
P036... BE14036 H907... BE6907
M529... BE11529 T003... BE18003
Returns the inferred BE-prefix serial (e.g. "BE11529") or None when
the filename doesn't match the expected pattern.
"""
if not name:
return None
# First letter encodes the thousands group; next 3 chars encode the
# last 3 digits of the serial.
base = name.split(".", 1)[0]
if len(base) < 4 or not base[0].isalpha() or not base[1:4].isdigit():
return None
prefix_letter = base[0].upper()
if prefix_letter < "B":
return None
thousands = ord(prefix_letter) - ord("B")
serial_num = thousands * 1000 + int(base[1:4])
return f"BE{serial_num}"
+3 -3
View File
@@ -240,7 +240,7 @@
let charts = {};
let lastData = null;
let unitInfo = null;
let geoRange = 10.0; // in/s full-scale for geo channels; updated on connect
let geoAdcScale = 10.0; // in/s full-scale for geo channels; updated on connect
let eventList = []; // populated from /device/events after connect
let currentEventIndex = 0;
@@ -278,7 +278,7 @@
throw new Error(err.detail || resp.statusText);
}
unitInfo = await resp.json();
geoRange = unitInfo.compliance_config?.max_range_geo ?? 10.0;
geoAdcScale = unitInfo.compliance_config?.geo_adc_scale ?? 10.0;
} catch (e) {
setStatus(`Error: ${e.message}`, 'error');
btn.disabled = false;
@@ -457,7 +457,7 @@
if (isGeo) {
// Geo channels: counts × (range / 32767) → in/s
const scale = geoRange / 32767;
const scale = geoAdcScale / 32767;
plotSamples = samples.map(c => c * scale);
const peakIns = Math.max(...plotSamples.map(Math.abs));
peakLabel = `${peakIns.toFixed(5)} in/s`;
+252
View File
@@ -0,0 +1,252 @@
"""
test_5a_protocol.py Regression test for the v0.14.x SUB 5A protocol fixes.
Verifies that SFM's framing helpers reproduce Blastware's exact wire bytes
for every 5A request frame in the 5-1-26 "bwcap3sec" capture, AND that the
file builder produces a byte-identical file when fed the BW capture's A5
responses.
Together these two tests protect all four v0.14.x fixes:
v0.14.0 STRT-bounded chunk walk (probe @ 0, metadata pages @ 0x1002 +
0x1004, samples @ 0x0600..0x1E00 step 0x0200, TERM at residual)
v0.14.1 event-N probe counter is `start_offset`, not `start_offset+0x46`
(covered by the multi-event captures, not this 3-sec event-1
capture but the helpers are the same code path)
v0.14.2 file body assembly is contiguous concatenation, no de-duplication
v0.14.3 partial DLE stuffing of `0x10` bytes in 5A params (counter=0x1000
wire bytes are `10 10 00`, not `10 00`)
If any of these fixes regresses, this test fails immediately with a clear
byte-level diff.
Run:
python -m pytest tests/test_5a_protocol.py -v
or:
python tests/test_5a_protocol.py
"""
from __future__ import annotations
import os
import sys
import pytest
# Allow running from the project root without installation
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import (
S3FrameParser,
build_5a_frame,
bulk_waveform_params,
bulk_waveform_term_v2,
)
# ── Capture loading ────────────────────────────────────────────────────────────
ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Reference BW MITM capture: BW saving a 3-sec event 0 (start_key=01110000,
# end_offset=0x21F2). 17 5A frames: probe + 2 metadata pages + 13 samples + TERM.
BW_TX_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_bw_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
BW_S3_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_s3_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
# BW's saved Blastware file for the same event (used for file-builder verification).
BW_SAVED_FILE = os.path.join(
ROOT, "example-events/decode_test/5-1-26/bw/M529LKIQ.G10",
)
def _split_bw_frames(data: bytes) -> list[bytes]:
"""Split BW TX bytes into individual frames (ACK STX … bare ETX)."""
frames: list[bytes] = []
i = 0
while i < len(data):
if data[i] != 0x41 or i + 1 >= len(data) or data[i + 1] != 0x02:
i += 1
continue
j = i + 2
while j < len(data):
if data[j] == 0x03:
break
if data[j] == 0x10 and j + 1 < len(data):
j += 2
continue
j += 1
if j >= len(data):
break
frames.append(data[i : j + 1])
i = j + 1
return frames
@pytest.fixture(scope="module")
def bw_5a_frames() -> list[bytes]:
"""All 5A frames from the BW TX capture, in wire order."""
if not os.path.exists(BW_TX_PATH):
pytest.skip(f"BW capture not found: {BW_TX_PATH}")
raw = open(BW_TX_PATH, "rb").read()
frames = [
f for f in _split_bw_frames(raw)
if len(f) >= 6 and f[5] == 0x5A # body[3] == 0x5A (SUB)
]
assert len(frames) == 17, f"expected 17 5A frames in capture, got {len(frames)}"
return frames
@pytest.fixture(scope="module")
def bw_a5_frames():
"""All A5 (response) frames from the matching S3 capture."""
if not os.path.exists(BW_S3_PATH):
pytest.skip(f"BW S3 capture not found: {BW_S3_PATH}")
raw = open(BW_S3_PATH, "rb").read()
p = S3FrameParser()
p.feed(raw)
a5 = [f for f in p.frames if f.sub == 0xA5]
assert len(a5) == 17, f"expected 17 A5 frames in capture, got {len(a5)}"
return a5
# ── 5A request frame byte-perfect verification ────────────────────────────────
KEY4 = bytes.fromhex("01110000") # start_key for the 3-sec event 0
END_OFFSET = 0x21F2 # parsed from STRT in the BW capture
LAST_CHUNK_COUNTER = 0x1E00 # last full 0x0200-byte chunk before TERM
SAMPLE_COUNTERS = (
0x0600, 0x0800, 0x0A00, 0x0C00, 0x0E00,
0x1000, 0x1200, 0x1400, 0x1600, 0x1800,
0x1A00, 0x1C00, 0x1E00,
)
def _meta_params(key: bytes, counter: int) -> bytes:
"""Build the 12-byte metadata-page params block (matches BW for 0x1002 / 0x1004)."""
return bytes(
[
0x00, key[0], key[1],
(counter >> 8) & 0xFF, counter & 0xFF,
0, 0, 0, 0, 0, 0, 0,
]
)
def test_probe_frame_byte_perfect(bw_5a_frames):
"""Probe @ counter=0x0000 (frame 0)."""
sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, 0, is_probe=True))
assert sfm == bw_5a_frames[0], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[0].hex()}"
)
@pytest.mark.parametrize("idx,counter", [(1, 0x1002), (2, 0x1004)])
def test_metadata_page_frames_byte_perfect(bw_5a_frames, idx, counter):
"""Metadata pages @ counter=0x1002 and 0x1004 (frames 1 and 2)."""
sfm = build_5a_frame(0x1002, _meta_params(KEY4, counter))
assert sfm == bw_5a_frames[idx], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[idx].hex()}"
)
@pytest.mark.parametrize("i,counter", list(enumerate(SAMPLE_COUNTERS)))
def test_sample_chunk_frames_byte_perfect(bw_5a_frames, i, counter):
"""
Sample chunks @ counter=0x0600..0x1E00, step 0x0200 (frames 3..15).
Critically, frame 8 (counter=0x1000) requires the v0.14.3 partial DLE
stuffing fix wire params include `10 10 00` for the counter, not `10 00`.
"""
sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, counter))
bw_idx = 3 + i
assert sfm == bw_5a_frames[bw_idx], (
f"\ncounter=0x{counter:04X}"
f"\nSFM: {sfm.hex()}"
f"\nBW: {bw_5a_frames[bw_idx].hex()}"
)
def test_term_frame_byte_perfect(bw_5a_frames):
"""TERM frame at residual (frame 16)."""
offset_word, params = bulk_waveform_term_v2(KEY4, END_OFFSET, LAST_CHUNK_COUNTER)
sfm = build_5a_frame(offset_word, params)
assert sfm == bw_5a_frames[16], (
f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[16].hex()}"
)
def test_strt_end_offset_parsing(bw_a5_frames):
"""The probe response (A5[0]) carries STRT at byte 17 with end_offset=0x21F2."""
from minimateplus.framing import parse_strt_end_offset
end_offset = parse_strt_end_offset(bw_a5_frames[0].data)
assert end_offset == END_OFFSET, (
f"expected end_offset=0x{END_OFFSET:04X}, got "
f"{f'0x{end_offset:04X}' if end_offset is not None else 'None'}"
)
# ── File builder byte-perfect verification ────────────────────────────────────
def test_blastware_file_builder_byte_perfect(bw_a5_frames):
"""
Feed the BW capture's A5 frames into write_blastware_file() and verify the
output is byte-identical to BW's saved M529LKIQ.G10 reference file.
This protects the v0.14.2 strip-removal fix and the file-builder skip
values (probe=38, meta=13, samples=12, TERM=11).
"""
if not os.path.exists(BW_SAVED_FILE):
pytest.skip(f"BW saved file not found: {BW_SAVED_FILE}")
import tempfile
from minimateplus.blastware_file import write_blastware_file
from minimateplus.models import Event
ev = Event(index=0)
ev._waveform_key = KEY4
ev.rectime_seconds = 3
ev.timestamp = None # let the builder pull the footer from the TERM frame
with tempfile.NamedTemporaryFile(suffix=".G10", delete=False) as tf:
tmp_path = tf.name
try:
write_blastware_file(ev, bw_a5_frames, tmp_path)
sfm_bytes = open(tmp_path, "rb").read()
finally:
os.unlink(tmp_path)
bw_bytes = open(BW_SAVED_FILE, "rb").read()
assert len(sfm_bytes) == len(bw_bytes), (
f"file size mismatch: SFM={len(sfm_bytes)} BW={len(bw_bytes)}"
)
if sfm_bytes != bw_bytes:
# Find first diff for actionable error message
for i in range(len(bw_bytes)):
if bw_bytes[i] != sfm_bytes[i]:
ctx_start = max(0, i - 8)
ctx_end = min(len(bw_bytes), i + 16)
pytest.fail(
f"file diverges at byte 0x{i:04X}\n"
f" BW : {bw_bytes[ctx_start:ctx_end].hex()}\n"
f" SFM: {sfm_bytes[ctx_start:ctx_end].hex()}\n"
f" {' ' * (i - ctx_start)}^^"
)
# ── Standalone runner ─────────────────────────────────────────────────────────
if __name__ == "__main__":
sys.exit(pytest.main([__file__, "-v"]))
+209
View File
@@ -0,0 +1,209 @@
"""
test_cache_invalidation.py verify post-erase key-reuse correctness.
The device's event-key counter resets to 0x01110000 after every memory erase,
so a bare-key dedup (the old behaviour) silently treats a freshly-recorded
event 0 as if it were the previously-downloaded one. These tests exercise
the (key, timestamp)-based eviction logic in:
- bridges/ach_server.py (state-file migration + force flag)
- sfm/server.py (_LiveCache.set_events / set_waveform)
Run:
python tests/test_cache_invalidation.py
"""
from __future__ import annotations
import json
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# ── ACH state migration ───────────────────────────────────────────────────────
def test_ach_state_legacy_migration(tmp_path: Path):
"""
Legacy v1 state with a `downloaded_keys` list is migrated on _load_state
to the v2 `downloaded_events` dict. All legacy keys come back with empty
timestamps so the (key, ts) compare in get_events() always falls through
to a fresh download.
"""
from bridges.ach_server import _load_state
state_path = tmp_path / "ach_state.json"
legacy = {
"BE11529": {
"downloaded_keys": ["01110000", "0111245a"],
"max_downloaded_key": "0111245a",
"last_seen": "2026-04-11T01:04:36",
"serial": "BE11529",
"peer": "63.43.212.232:51920",
},
}
state_path.write_text(json.dumps(legacy))
migrated = _load_state(state_path)
unit = migrated["BE11529"]
assert "downloaded_keys" not in unit
assert unit["downloaded_events"] == {
"01110000": "",
"0111245a": "",
}
# max_downloaded_key is preserved verbatim
assert unit["max_downloaded_key"] == "0111245a"
def test_ach_state_v2_passes_through(tmp_path: Path):
"""A v2 state file is returned verbatim — no migration touches it."""
from bridges.ach_server import _load_state
state_path = tmp_path / "ach_state.json"
v2 = {
"BE11529": {
"downloaded_events": {
"01110000": "2026-04-15T14:23:45",
"0111245a": "2026-04-16T09:01:12",
},
"max_downloaded_key": "0111245a",
"serial": "BE11529",
},
}
state_path.write_text(json.dumps(v2))
loaded = _load_state(state_path)
assert loaded["BE11529"]["downloaded_events"] == v2["BE11529"]["downloaded_events"]
def test_ach_state_missing_returns_empty(tmp_path: Path):
"""Nonexistent state path → empty dict (not an error)."""
from bridges.ach_server import _load_state
assert _load_state(tmp_path / "absent.json") == {}
# ── _LiveCache eviction ───────────────────────────────────────────────────────
def _ev(index: int, key: str, ts: str) -> dict:
return {"index": index, "waveform_key": key, "timestamp": ts}
def test_live_cache_set_events_no_eviction_when_keys_match():
"""No flush when incoming events match the cached (key, ts) at each index."""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
c.set_events(conn, 2, [_ev(0, "01110000", "2026-04-15T14:23:45"),
_ev(1, "0111245a", "2026-04-16T09:01:12")])
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
# Same events again — must not flush.
c.set_events(conn, 2, [_ev(0, "01110000", "2026-04-15T14:23:45"),
_ev(1, "0111245a", "2026-04-16T09:01:12")])
assert c._events[conn][0] == 2
assert (conn, 0) in c._waveforms
def test_live_cache_set_events_flushes_on_post_erase_collision():
"""
Index 0 keeps the same key (01110000 reuses) but the timestamp differs
device was erased + re-recorded flush all events + waveforms for the
device.
"""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
# First "session": index 0 key=01110000 ts=2026-04-15.
c.set_events(conn, 1, [_ev(0, "01110000", "2026-04-15T14:23:45")])
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
assert (conn, 0) in c._waveforms
# Second "session" after erase: index 0 still key=01110000 but new ts.
c.set_events(conn, 1, [_ev(0, "01110000", "2026-05-06T12:34:56")])
# Stale waveform for index 0 must have been flushed by the eviction path
# before the new event was inserted. The new events list IS in cache but
# the cached waveform from the prior session is gone.
assert (conn, 0) not in c._waveforms
assert c._events[conn][1][0]["timestamp"] == "2026-05-06T12:34:56"
def test_live_cache_set_waveform_flushes_on_mismatch():
"""set_waveform alone should also evict when (key, ts) differs."""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
c.set_waveform(conn, 1, _ev(1, "0111245a", "2026-04-16T09:01:12"))
# Index 0 swap: same key, new timestamp.
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-05-06T12:34:56"))
# Index 1's stale waveform must be flushed — keeping it would mix eras.
assert (conn, 1) not in c._waveforms
# The newly-inserted index 0 entry is what's there.
assert c._waveforms[(conn, 0)]["timestamp"] == "2026-05-06T12:34:56"
def test_live_cache_partial_signature_does_not_flush():
"""
If incoming event lacks waveform_key OR timestamp, we cannot prove a
mismatch eviction must NOT trigger. Avoids spurious flushes from
legacy / partial event shapes.
"""
from sfm.live_cache import LiveCache as _LiveCache
c = _LiveCache()
conn = "tcp:1.2.3.4:12345"
c.set_waveform(conn, 0, _ev(0, "01110000", "2026-04-15T14:23:45"))
# Incoming entry missing the timestamp — cannot prove a mismatch.
c.set_waveform(conn, 0, {"index": 0, "waveform_key": "01110000"})
# Cache should contain the new entry; the implementation overwrites
# the index-0 row but does NOT flush other indices. Since there are no
# other indices in this test, just check the entry exists.
assert (conn, 0) in c._waveforms
if __name__ == "__main__":
if pytest is not None:
pytest.main([__file__, "-v"])
else:
import inspect
import traceback as _tb
passed = failed = 0
for _name, _fn in sorted(globals().items()):
if not _name.startswith("test_") or not callable(_fn):
continue
try:
_sig = inspect.signature(_fn)
if "tmp_path" in _sig.parameters:
with tempfile.TemporaryDirectory() as _td:
_fn(Path(_td))
else:
_fn()
print(f"PASS {_name}")
passed += 1
except Exception:
print(f"FAIL {_name}")
_tb.print_exc()
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)
+401
View File
@@ -0,0 +1,401 @@
"""
test_event_file_io.py sidecar write/read/patch round-trips,
WaveformStore sidecar integration, and the BW-import path.
Run:
python tests/test_event_file_io.py
"""
from __future__ import annotations
import json
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus import event_file_io
from minimateplus.framing import S3Frame
from minimateplus.models import Event, Timestamp
# ── Fixtures shared with test_waveform_store.py ───────────────────────────────
def _make_synthetic_event() -> tuple[Event, list[S3Frame]]:
"""Same shape as tests/test_waveform_store.py — minimum viable Event +
A5 stream that makes write_blastware_file emit a parseable file.
STRT is exactly 21 bytes; rectime_seconds lands at byte 18 to match
`_decode_a5_waveform`'s expected layout (which is also what
`read_blastware_file()` reads back)."""
key4 = bytes.fromhex("01110000")
rectime = 3
strt = bytearray(21)
strt[0:4] = b"STRT"
strt[4:6] = b"\xff\xfe"
strt[6:10] = key4 # end_key (per data[23:27] in CLAUDE.md)
strt[10:14] = key4 # start_key (per data[27:31])
strt[18] = rectime
strt = bytes(strt)
probe_data = bytes(7) + strt + bytes(32)
probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00, data=probe_data,
checksum_valid=True, chk_byte=0x00)
sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
data=bytes(7) + bytes(0x0200), checksum_valid=True,
chk_byte=0x00)
# Build a valid 26-byte footer (0e 08 + ts1 + ts2 + 6 const + 2 crc)
# and embed it at the END of the terminator's contribution so
# write_blastware_file finds the real `0e 08` marker rather than
# falling back to slicing the last 26 bytes of zero garbage.
# ts byte order: [day][month][year_HI][year_LO][0x00][hour][min][sec]
footer = (
b"\x0e\x08"
+ bytes([6, 5, 0x07, 0xea, 0, 12, 34, 56]) # ts1 = 2026-05-06 12:34:56
+ bytes([6, 5, 0x07, 0xea, 0, 12, 35, 6]) # ts2 = ts1 + ~10s
+ b"\x00\x01\x00\x02\x00\x00"
+ b"\x00\x00"
)
assert len(footer) == 26
term_data = bytes(11) + bytes(38) + footer # 11 prefix + 38 pad + 26 footer = 75
term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
data=term_data, checksum_valid=True, chk_byte=0x00)
ev = Event(index=0)
ev._waveform_key = key4
ev.timestamp = Timestamp(
raw=b"", flag=0x10, year=2026, unknown_byte=0,
month=5, day=6, hour=12, minute=34, second=56,
)
ev.rectime_seconds = rectime
ev.record_type = "Waveform"
ev._a5_frames = [probe, sample, term]
return ev, [probe, sample, term]
# ── Sidecar write/read round-trip ─────────────────────────────────────────────
def test_event_to_sidecar_dict_shape():
ev, _ = _make_synthetic_event()
d = event_file_io.event_to_sidecar_dict(
ev,
serial="BE11529",
blastware_filename="M529LKIQ.7M0W",
blastware_filesize=1024,
blastware_sha256="abcd" * 16,
source_kind="sfm-live",
a5_pickle_filename="M529LKIQ.7M0W.a5.pkl",
)
assert d["schema_version"] == event_file_io.SCHEMA_VERSION
assert d["kind"] == event_file_io.SIDECAR_KIND
assert d["event"]["serial"] == "BE11529"
assert d["event"]["timestamp"] == "2026-05-06T12:34:56"
assert d["event"]["waveform_key"] == "01110000"
assert d["blastware"]["sha256"] == "abcd" * 16
assert d["source"]["kind"] == "sfm-live"
assert d["review"] == {
"false_trigger": False, "reviewer": None,
"reviewed_at": None, "notes": "",
}
assert d["extensions"] == {}
def test_sidecar_write_and_read_round_trip(tmp_path: Path):
ev, _ = _make_synthetic_event()
path = tmp_path / "M529LKIQ.7M0W.sfm.json"
src = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
blastware_sha256="x" * 64, source_kind="sfm-ach",
)
event_file_io.write_sidecar(path, src)
loaded = event_file_io.read_sidecar(path)
assert loaded["event"] == src["event"]
assert loaded["blastware"] == src["blastware"]
assert loaded["source"]["kind"] == "sfm-ach"
def test_sidecar_persists_raw_0c_record_in_extensions(tmp_path: Path):
"""An Event with _raw_record populated should land its 210 bytes
base64-encoded in extensions.raw_records.waveform_record_b64, so
later analysis (e.g. mapping Peak Acceleration / Time of Peak / ZC
Freq byte offsets) can run offline against the saved sidecar."""
import base64
ev, _ = _make_synthetic_event()
# Synthesize a 210-byte 0C record with embedded label needles so
# the dump tool's anchor scan has something to find.
raw = bytearray(210)
raw[10:14] = b"Tran"
raw[60:64] = b"Vert"
raw[110:114] = b"Long"
raw[160:164] = b"MicL"
ev._raw_record = bytes(raw)
d = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
blastware_sha256="x" * 64, source_kind="sfm-live",
)
rr = d["extensions"]["raw_records"]
assert rr["waveform_record_len"] == 210
decoded = base64.b64decode(rr["waveform_record_b64"])
assert decoded == ev._raw_record
# Round-trip through write/read
path = tmp_path / "raw0c.sfm.json"
event_file_io.write_sidecar(path, d)
loaded = event_file_io.read_sidecar(path)
assert (
base64.b64decode(loaded["extensions"]["raw_records"]["waveform_record_b64"])
== ev._raw_record
)
def test_sidecar_omits_raw_records_when_event_has_no_0c(tmp_path: Path):
"""Events without a _raw_record (e.g. constructed by importers that
never see 0C) should NOT add an empty raw_records block keep the
sidecar clean for those flows."""
ev, _ = _make_synthetic_event()
assert ev._raw_record is None
d = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
blastware_sha256="x" * 64, source_kind="bw-import",
)
assert d["extensions"] == {}
def test_sidecar_rejects_unsupported_schema_version(tmp_path: Path):
path = tmp_path / "future.sfm.json"
path.write_text(json.dumps({
"schema_version": event_file_io.SCHEMA_VERSION + 1,
"kind": event_file_io.SIDECAR_KIND,
}))
try:
event_file_io.read_sidecar(path)
except ValueError as exc:
assert "schema_version" in str(exc)
return
raise AssertionError("read_sidecar should have rejected unsupported version")
def test_sidecar_extensions_survive_round_trip(tmp_path: Path):
"""Forward-compat: unknown keys inside `extensions` survive a r/w cycle."""
ev, _ = _make_synthetic_event()
path = tmp_path / "x.sfm.json"
d = event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="X", blastware_filesize=0, blastware_sha256="",
source_kind="sfm-live",
extensions={"vendor.acme.gps": {"lat": 40.7, "lon": -74.0}},
)
event_file_io.write_sidecar(path, d)
back = event_file_io.read_sidecar(path)
assert back["extensions"]["vendor.acme.gps"]["lat"] == 40.7
def test_sidecar_patch_review_stamps_reviewed_at(tmp_path: Path):
ev, _ = _make_synthetic_event()
path = tmp_path / "patch.sfm.json"
event_file_io.write_sidecar(
path,
event_file_io.event_to_sidecar_dict(
ev, serial="BE11529",
blastware_filename="X", blastware_filesize=0, blastware_sha256="",
source_kind="sfm-live",
),
)
new = event_file_io.patch_sidecar(
path,
review={"false_trigger": True, "notes": "truck thump", "reviewer": "brian"},
)
assert new["review"]["false_trigger"] is True
assert new["review"]["notes"] == "truck thump"
assert new["review"]["reviewer"] == "brian"
assert new["review"]["reviewed_at"], "reviewed_at must be auto-stamped"
on_disk = event_file_io.read_sidecar(path)
assert on_disk["review"]["false_trigger"] is True
# ── WaveformStore integration ─────────────────────────────────────────────────
def test_waveform_store_save_writes_sidecar(tmp_path: Path):
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec = store.save(ev, serial="BE11529", a5_frames=frames, source_kind="sfm-live")
assert rec["sidecar_filename"].endswith(".sfm.json")
assert rec["sha256"] and len(rec["sha256"]) == 64
sc = store.load_sidecar("BE11529", rec["filename"])
assert sc is not None
assert sc["blastware"]["filename"] == rec["filename"]
assert sc["blastware"]["sha256"] == rec["sha256"]
assert sc["source"]["kind"] == "sfm-live"
# The .a5.pkl reference should match the actual filename on disk.
assert sc["source"]["a5_pickle_filename"] == rec["a5_pickle_filename"]
def test_waveform_store_save_preserves_review_across_resave(tmp_path: Path):
"""Re-saving the same event must preserve a user's prior review edits."""
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec = store.save(ev, serial="BE11529", a5_frames=frames)
# User flips false_trigger and adds a note.
store.patch_sidecar(
"BE11529", rec["filename"],
review={"false_trigger": True, "notes": "hello"},
)
# A second save (e.g. Force refresh re-download) must keep those edits.
store.save(ev, serial="BE11529", a5_frames=frames)
sc = store.load_sidecar("BE11529", rec["filename"])
assert sc["review"]["false_trigger"] is True
assert sc["review"]["notes"] == "hello"
def test_waveform_store_patch_sidecar_returns_none_when_missing(tmp_path: Path):
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
out = store.patch_sidecar("BE99999", "no.such.W", review={"notes": "x"})
assert out is None
# ── DB integration: sidecar_filename column + update_event_review ─────────────
def test_seismodb_persists_sidecar_filename_and_review_sync(tmp_path: Path):
from sfm.database import SeismoDb
db = SeismoDb(tmp_path / "seismo_relay.db")
ev, _ = _make_synthetic_event()
rec = {
"filename": "M529LKIQ.7M0W",
"filesize": 8708,
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
"sidecar_filename": "M529LKIQ.7M0W.sfm.json",
}
inserted, _ = db.insert_events(
[ev], serial="BE11529",
waveform_records={ev._waveform_key.hex(): rec},
)
assert inserted == 1
rows = db.query_events(serial="BE11529")
row = rows[0]
assert row["sidecar_filename"] == rec["sidecar_filename"]
# update_event_review keeps false_trigger column in sync with sidecar.
assert db.update_event_review(row["id"], {"false_trigger": True}) is True
again = db.get_event(row["id"])
assert again["false_trigger"] == 1
# Empty review block (no false_trigger key) → no-op but row exists.
assert db.update_event_review(row["id"], {"notes": "x"}) is True
# ── BW-file reader (read_blastware_file) ─────────────────────────────────────
def test_read_blastware_file_round_trip(tmp_path: Path):
"""write → read → key/timestamp/rectime survive."""
from minimateplus.blastware_file import write_blastware_file, blastware_filename
ev, frames = _make_synthetic_event()
bw_path = tmp_path / blastware_filename(ev, "BE11529")
write_blastware_file(ev, frames, bw_path)
parsed = event_file_io.read_blastware_file(bw_path)
assert parsed._waveform_key == ev._waveform_key
assert parsed.rectime_seconds == ev.rectime_seconds
# Timestamp lands via the footer; year/month/day/hour/min/sec all survive.
assert parsed.timestamp is not None
assert parsed.timestamp.year == ev.timestamp.year
assert parsed.timestamp.month == ev.timestamp.month
assert parsed.timestamp.day == ev.timestamp.day
assert parsed.timestamp.hour == ev.timestamp.hour
assert parsed.timestamp.minute == ev.timestamp.minute
assert parsed.timestamp.second == ev.timestamp.second
# No A5 source recoverable.
assert parsed._a5_frames is None
# Peaks computed from samples (synthetic = zero samples → zero peaks).
assert parsed.peak_values is not None
assert parsed.peak_values.peak_vector_sum == 0.0
def test_save_imported_bw_round_trip(tmp_path: Path):
"""save_imported_bw stores a copy + sidecar with source.kind = bw-import."""
from minimateplus.blastware_file import write_blastware_file, blastware_filename
from sfm.waveform_store import WaveformStore
# Produce a BW file outside the store.
ev, frames = _make_synthetic_event()
fname = blastware_filename(ev, "BE11529")
src = tmp_path / fname
write_blastware_file(ev, frames, src)
store = WaveformStore(tmp_path / "waveforms")
parsed_ev, rec = store.save_imported_bw(src.read_bytes(), source_path=src)
assert rec["filename"] == fname
assert rec["a5_pickle_filename"] is None # no A5 source for BW imports
sc = store.load_sidecar("BE11529", fname)
assert sc is not None
assert sc["source"]["kind"] == "bw-import"
assert sc["source"]["a5_pickle_filename"] is None
# The stored binary should match the source byte-for-byte (we just copied).
stored_path = store.open_blastware("BE11529", fname)
assert stored_path is not None
assert stored_path.read_bytes() == src.read_bytes()
if __name__ == "__main__":
if pytest is not None:
pytest.main([__file__, "-v"])
else:
import inspect
import traceback as _tb
passed = failed = 0
for _name, _fn in sorted(globals().items()):
if not _name.startswith("test_") or not callable(_fn):
continue
try:
_sig = inspect.signature(_fn)
if "tmp_path" in _sig.parameters:
with tempfile.TemporaryDirectory() as _td:
_fn(Path(_td))
else:
_fn()
print(f"PASS {_name}")
passed += 1
except Exception:
print(f"FAIL {_name}")
_tb.print_exc()
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)
+296
View File
@@ -0,0 +1,296 @@
"""
test_event_hdf5.py HDF5 codec round-trip + plot.v1 JSON shape sanity.
Run:
python tests/test_event_hdf5.py
"""
from __future__ import annotations
import os
import sys
import tempfile
from pathlib import Path
try:
import pytest
except ImportError:
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import S3Frame
from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
from sfm import event_hdf5
# ── Fixtures ──────────────────────────────────────────────────────────────────
def _make_event_with_samples(n: int = 256) -> Event:
"""An Event with synthetic int16 ADC samples on all four channels.
Channel content:
- Tran: ramp from -16384 to +16383 (peak 5 in/s for Normal range)
- Vert: full-scale dirac at index n//2 (peak = 10 in/s)
- Long: zeros
- MicL: small ramp
Peak values are set on the event the way the device's 0C record
would supply them used by the HDF5 writer for the mic per-count
factor.
"""
tran = [int((i / max(n - 1, 1)) * 32767 - 16384) for i in range(n)]
vert = [0] * n
if n:
vert[n // 2] = 32767
long_ = [0] * n
mic = [int((i / max(n - 1, 1)) * 5000) for i in range(n)]
ev = Event(index=0)
ev._waveform_key = bytes.fromhex("01110000")
ev.timestamp = Timestamp(
raw=b"", flag=0x10,
year=2026, unknown_byte=0, month=5, day=7,
hour=10, minute=0, second=0,
)
ev.record_type = "Waveform"
ev.sample_rate = 1024
ev.pretrig_samples = n // 4
ev.total_samples = n
ev.rectime_seconds = n / 1024.0
ev.raw_samples = {"Tran": tran, "Vert": vert, "Long": long_, "MicL": mic}
ev.peak_values = PeakValues(
tran=5.0, vert=10.0, long=0.0,
peak_vector_sum=10.0, micl=0.001,
)
ev.project_info = ProjectInfo(
project="TestProj", client="TestClient",
operator="brian", sensor_location="loc-A",
)
return ev
# ── HDF5 round-trip ───────────────────────────────────────────────────────────
def test_hdf5_round_trip_preserves_metadata(tmp_path: Path):
ev = _make_event_with_samples()
h5 = tmp_path / "test.h5"
event_hdf5.write_event_hdf5(
h5, ev, serial="BE11529", geo_range="normal",
)
data = event_hdf5.read_event_hdf5(h5)
a = data["attrs"]
assert a["schema_version"] == event_hdf5.SCHEMA_VERSION
assert a["kind"] == event_hdf5.HDF5_KIND
assert a["serial"] == "BE11529"
assert a["waveform_key"] == "01110000"
assert a["sample_rate"] == 1024
assert a["pretrig_samples"] == 64
assert a["geo_range"] == "normal"
assert a["geo_full_scale_ips"] == 10.0
assert a["project"] == "TestProj"
assert a["client"] == "TestClient"
assert a["operator"] == "brian"
# Float attrs may round-trip with tiny precision noise.
assert abs(a["peak_tran_ips"] - 5.0) < 1e-6
assert abs(a["peak_vert_ips"] - 10.0) < 1e-6
def test_hdf5_samples_in_physical_units_normal_range(tmp_path: Path):
"""Vert hits ADC full-scale (32767) → with Normal range FS=10 in/s,
the HDF5 sample value should be 10 * 32767/32768 in/s."""
ev = _make_event_with_samples()
h5 = tmp_path / "n.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="normal")
data = event_hdf5.read_event_hdf5(h5)
vert = data["samples"]["Vert"]
assert vert.dtype.name == "float32"
assert max(abs(v) for v in vert) > 9.99 # full-scale ≈ 10.0
# The dirac was at n//2 → 32767 ADC counts.
expected_peak = 10.0 * 32767 / 32768
assert abs(max(vert) - expected_peak) < 1e-3
def test_hdf5_samples_in_physical_units_sensitive_range(tmp_path: Path):
"""Same fixture but Sensitive range → full-scale 1.250 in/s."""
ev = _make_event_with_samples()
h5 = tmp_path / "s.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="sensitive")
data = event_hdf5.read_event_hdf5(h5)
vert = data["samples"]["Vert"]
expected_peak = 1.250 * 32767 / 32768
assert abs(max(vert) - expected_peak) < 1e-4
def test_hdf5_includes_int16_samples(tmp_path: Path):
ev = _make_event_with_samples()
h5 = tmp_path / "i.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529")
data = event_hdf5.read_event_hdf5(h5)
assert data["samples_int16"] is not None
assert "Tran" in data["samples_int16"]
assert data["samples_int16"]["Vert"].dtype.name == "int16"
def test_hdf5_rejects_unsupported_schema(tmp_path: Path):
"""Round-tripping with a tampered schema_version raises ValueError."""
import h5py
h5 = tmp_path / "future.h5"
with h5py.File(h5, "w") as f:
f.attrs["schema_version"] = 99
f.attrs["kind"] = event_hdf5.HDF5_KIND
try:
event_hdf5.read_event_hdf5(h5)
except ValueError as exc:
assert "schema_version" in str(exc)
return
raise AssertionError("read_event_hdf5 should reject unsupported schema_version")
# ── plot.v1 JSON shape ────────────────────────────────────────────────────────
def test_event_to_plot_json_shape():
ev = _make_event_with_samples()
j = event_hdf5.event_to_plot_json(ev, serial="BE11529", geo_range="normal")
assert j["schema"] == "sfm.plot.v1"
assert j["serial"] == "BE11529"
assert j["geo_range"] == "normal"
assert j["geo_full_scale_ips"] == 10.0
assert j["trigger_ms"] == 0.0
t = j["time_axis"]
assert t["sample_rate"] == 1024
assert t["pretrig_samples"] == 64
assert t["n_samples"] == 256
# t0_ms = -pretrig * dt_ms = -64 * (1000/1024) ≈ -62.5
assert abs(t["t0_ms"] - (-64 * 1000 / 1024)) < 1e-3
assert abs(t["dt_ms"] - (1000 / 1024)) < 1e-6
chans = j["channels"]
for name in ("Tran", "Vert", "Long", "MicL"):
assert name in chans, f"missing channel: {name}"
assert chans[name]["unit"] in ("in/s", "psi")
assert "values" in chans[name]
assert "peak" in chans[name]
assert "peak_t_ms" in chans[name]
# Values are in physical units: Vert peak ≈ 10 in/s.
assert max(chans["Vert"]["values"]) > 9.99
def test_event_to_plot_json_peak_t_ms_locates_dirac():
"""The Vert channel's full-scale dirac at sample n//2 should produce
peak_t_ms = (n//2 - pretrig) * dt_ms."""
ev = _make_event_with_samples(n=256)
j = event_hdf5.event_to_plot_json(ev, serial="BE11529")
expected = (128 - 64) * (1000 / 1024) # = 62.5 ms
assert abs(j["channels"]["Vert"]["peak_t_ms"] - expected) < 1e-2
def test_plot_json_from_hdf5_round_trip(tmp_path: Path):
"""plot_json_from_hdf5 produces the same shape as event_to_plot_json."""
ev = _make_event_with_samples()
h5 = tmp_path / "rt.h5"
event_hdf5.write_event_hdf5(h5, ev, serial="BE11529", geo_range="normal")
j_disk = event_hdf5.plot_json_from_hdf5(h5, event_id="abc-123")
j_mem = event_hdf5.event_to_plot_json(ev, serial="BE11529", geo_range="normal", event_id="abc-123")
# Top-level shape parity
for k in ("schema", "serial", "geo_range", "geo_full_scale_ips",
"trigger_ms", "record_type", "waveform_key", "event_id"):
assert j_disk.get(k) == j_mem.get(k), f"mismatch on {k}"
assert j_disk["time_axis"]["sample_rate"] == j_mem["time_axis"]["sample_rate"]
assert j_disk["time_axis"]["n_samples"] == j_mem["time_axis"]["n_samples"]
# Sample values must match within float32 precision.
for ch in ("Tran", "Vert", "Long", "MicL"):
a = j_disk["channels"][ch]["values"]
b = j_mem["channels"][ch]["values"]
assert len(a) == len(b)
if a:
mx = max(abs(x - y) for x, y in zip(a, b))
assert mx < 1e-3, f"{ch}: max diff {mx}"
# ── WaveformStore integration with HDF5 ───────────────────────────────────────
def _make_synthetic_event_for_save() -> tuple[Event, list[S3Frame]]:
"""Same flavour as test_event_file_io.py but ensures _make_event_with_samples
is also wired into the BW write path so we can exercise WaveformStore.save."""
ev = _make_event_with_samples(n=128)
# Build a minimum 3-frame A5 stream (probe + sample + term) — same
# shape used in the other test files. The encoder only really needs
# the STRT in the probe + a non-zero body and a footer in the term.
key4 = ev._waveform_key
rectime = int(ev.rectime_seconds or 0) or 1
strt = bytearray(21)
strt[0:4] = b"STRT"
strt[4:6] = b"\xff\xfe"
strt[6:10] = key4
strt[10:14] = key4
strt[18] = rectime
probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00,
data=bytes(7) + bytes(strt) + bytes(32),
checksum_valid=True, chk_byte=0x00)
sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
data=bytes(7) + bytes(0x0200), checksum_valid=True, chk_byte=0x00)
footer = (
b"\x0e\x08"
+ bytes([7, 5, 0x07, 0xea, 0, 10, 0, 0])
+ bytes([7, 5, 0x07, 0xea, 0, 10, 0, 1])
+ b"\x00\x01\x00\x02\x00\x00\x00\x00"
)
term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
data=bytes(11) + bytes(38) + footer, checksum_valid=True, chk_byte=0x00)
ev._a5_frames = [probe, sample, term]
return ev, [probe, sample, term]
def test_waveform_store_save_emits_hdf5(tmp_path: Path):
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event_for_save()
rec = store.save(ev, serial="BE11529", a5_frames=frames, geo_range="normal")
assert rec["hdf5_filename"], "hdf5_filename should be present in save() record"
h5 = store.hdf5_path_for("BE11529", rec["filename"])
assert h5.exists(), "WaveformStore.save should produce a .h5 file"
# The HDF5 round-trip should match the event's metadata.
data = event_hdf5.read_event_hdf5(h5)
assert data["attrs"]["serial"] == "BE11529"
assert data["attrs"]["geo_range"] == "normal"
if __name__ == "__main__":
if pytest is not None:
pytest.main([__file__, "-v"])
else:
import inspect
import traceback as _tb
passed = failed = 0
for _name, _fn in sorted(globals().items()):
if not _name.startswith("test_") or not callable(_fn):
continue
try:
_sig = inspect.signature(_fn)
if "tmp_path" in _sig.parameters:
with tempfile.TemporaryDirectory() as _td:
_fn(Path(_td))
else:
_fn()
print(f"PASS {_name}")
passed += 1
except Exception:
print(f"FAIL {_name}")
_tb.print_exc()
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)
+302
View File
@@ -0,0 +1,302 @@
"""
test_waveform_store.py unit tests for sfm/waveform_store.py and the
SeismoDb columns + insert_events upsert path that the store depends on.
These tests exercise the *store + DB plumbing* in isolation they do not
re-test write_blastware_file (covered separately) and do not require a live
device or a wire capture.
Run:
python -m pytest tests/test_waveform_store.py -v
"""
from __future__ import annotations
import os
import sys
import datetime
from pathlib import Path
try:
import pytest
except ImportError: # allow running standalone without pytest installed
pytest = None # type: ignore
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import S3Frame
from minimateplus.models import Event, Timestamp
# ── Test fixtures ──────────────────────────────────────────────────────────────
def _make_synthetic_event() -> tuple[Event, list[S3Frame]]:
"""
Build a minimal Event + a 3-frame A5 stream that satisfies
write_blastware_file's STRT-extraction path.
Frame 0 (probe): contains a STRT record at the canonical position so
write_blastware_file finds it without falling back.
Frame 1 (sample): 0x0200 bytes of zeros at page_key=0x0010 (sample marker).
Frame 2 (TERM): page_key=0x0000 marks the terminator.
"""
key4 = bytes.fromhex("01110000")
rectime = 3
strt = b"STRT" + b"\xff\xfe" + key4 + key4 + bytes(7) + bytes([rectime])
# Probe payload prefix: 7 zero bytes then STRT (matches blastware_file._strip
# logic which looks for STRT in data[7:]). Tail with 32 zero bytes of fake
# body so reconstruction has something to slice.
probe_data = bytes(7) + strt + bytes(32)
probe = S3Frame(sub=0xA5, page_hi=0x10, page_lo=0x00, data=probe_data,
checksum_valid=True, chk_byte=0x00)
sample = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x10,
data=bytes(7) + bytes(0x0200), checksum_valid=True,
chk_byte=0x00)
term = S3Frame(sub=0xA5, page_hi=0x00, page_lo=0x00,
data=bytes(7) + bytes(64), checksum_valid=True,
chk_byte=0x00)
ev = Event(index=0)
ev._waveform_key = key4
ev.timestamp = Timestamp(
raw=b"",
flag=0x10,
year=2026,
unknown_byte=0,
month=5,
day=6,
hour=12,
minute=34,
second=56,
)
ev.rectime_seconds = rectime
ev.record_type = "Waveform"
ev._a5_frames = [probe, sample, term]
return ev, [probe, sample, term]
# ── Frame round-trip ───────────────────────────────────────────────────────────
def test_frame_dict_round_trip():
"""_frame_to_dict and _dict_to_frame must round-trip every field."""
from sfm.waveform_store import _dict_to_frame, _frame_to_dict
f = S3Frame(
sub=0xA5, page_hi=0x12, page_lo=0x34,
data=b"\x10\x02\x00\xab\xcd",
checksum_valid=False,
chk_byte=0x42,
)
d = _frame_to_dict(f)
g = _dict_to_frame(d)
assert g.sub == f.sub
assert g.page_hi == f.page_hi
assert g.page_lo == f.page_lo
assert g.data == f.data
assert g.checksum_valid == f.checksum_valid
assert g.chk_byte == f.chk_byte
# ── Store save/load round-trip ─────────────────────────────────────────────────
def test_waveform_store_save_load_round_trip(tmp_path: Path):
"""save() writes both files; load_a5() returns equivalent frames."""
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec = store.save(ev, serial="BE11529", a5_frames=frames)
assert rec["filename"].startswith("M529")
assert rec["filesize"] > 0
assert rec["a5_pickle_filename"] == rec["filename"] + ".a5.pkl"
bw_path = store.open_blastware("BE11529", rec["filename"])
assert bw_path is not None
assert bw_path.exists()
assert bw_path.stat().st_size == rec["filesize"]
# Sidecar exists and round-trips
loaded = store.load_a5("BE11529", rec["filename"])
assert loaded is not None
assert len(loaded) == len(frames)
for orig, got in zip(frames, loaded):
assert got.sub == orig.sub
assert got.page_hi == orig.page_hi
assert got.page_lo == orig.page_lo
assert got.data == orig.data
def test_waveform_store_missing_returns_none(tmp_path: Path):
"""open_blastware / load_a5 return None for nonexistent entries."""
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
assert store.open_blastware("BE99999", "no_such.7M0W") is None
assert store.load_a5("BE99999", "no_such.7M0W") is None
def test_waveform_store_idempotent_save(tmp_path: Path):
"""Saving the same event twice produces the same event-file bytes."""
from sfm.waveform_store import WaveformStore
store = WaveformStore(tmp_path / "waveforms")
ev, frames = _make_synthetic_event()
rec1 = store.save(ev, serial="BE11529", a5_frames=frames)
bw_path = store.open_blastware("BE11529", rec1["filename"])
bytes1 = bw_path.read_bytes()
rec2 = store.save(ev, serial="BE11529", a5_frames=frames)
bytes2 = bw_path.read_bytes()
assert rec1["filename"] == rec2["filename"]
assert bytes1 == bytes2
# ── DB integration ────────────────────────────────────────────────────────────
def test_seismodb_persists_waveform_columns(tmp_path: Path):
"""insert_events writes the new columns when waveform_records is supplied."""
from sfm.database import SeismoDb
db = SeismoDb(tmp_path / "seismo_relay.db")
ev, _ = _make_synthetic_event()
rec = {
"filename": "M529LKIQ.7M0W",
"filesize": 8708,
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
}
inserted, skipped = db.insert_events(
[ev],
serial="BE11529",
waveform_records={ev._waveform_key.hex(): rec},
)
assert inserted == 1
assert skipped == 0
rows = db.query_events(serial="BE11529")
assert len(rows) == 1
row = rows[0]
assert row["blastware_filename"] == rec["filename"]
assert row["blastware_filesize"] == rec["filesize"]
assert row["a5_pickle_filename"] == rec["a5_pickle_filename"]
# get_event by id returns the same fields
row2 = db.get_event(row["id"])
assert row2 is not None
assert row2["blastware_filename"] == rec["filename"]
def test_seismodb_dedup_upserts_waveform_fields(tmp_path: Path):
"""Re-inserting the same (serial, timestamp) refreshes waveform fields."""
from sfm.database import SeismoDb
db = SeismoDb(tmp_path / "seismo_relay.db")
ev, _ = _make_synthetic_event()
db.insert_events([ev], serial="BE11529") # no waveform record yet
rows = db.query_events(serial="BE11529")
assert rows[0]["blastware_filename"] is None
rec = {
"filename": "M529LKIQ.7M0W",
"filesize": 4242,
"a5_pickle_filename": "M529LKIQ.7M0W.a5.pkl",
}
inserted, skipped = db.insert_events(
[ev],
serial="BE11529",
waveform_records={ev._waveform_key.hex(): rec},
)
assert inserted == 0 # dedup'd
assert skipped == 1
rows = db.query_events(serial="BE11529")
assert rows[0]["blastware_filename"] == rec["filename"]
assert rows[0]["blastware_filesize"] == 4242
def test_seismodb_migration_adds_columns(tmp_path: Path):
"""An existing DB without the new columns gets them added on init."""
import sqlite3
db_path = tmp_path / "old.db"
# Build a "v0" events table without the new columns.
with sqlite3.connect(str(db_path)) as conn:
conn.executescript("""
CREATE TABLE events (
id TEXT PRIMARY KEY,
serial TEXT NOT NULL,
waveform_key TEXT NOT NULL,
session_id TEXT,
timestamp TEXT,
tran_ppv REAL,
vert_ppv REAL,
long_ppv REAL,
peak_vector_sum REAL,
mic_ppv REAL,
project TEXT,
client TEXT,
operator TEXT,
sensor_location TEXT,
sample_rate INTEGER,
record_type TEXT,
false_trigger INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
UNIQUE(serial, timestamp)
);
INSERT INTO events
(id, serial, waveform_key, timestamp)
VALUES
('legacy-id', 'BE11529', '01110000',
'2026-04-01T12:00:00');
""")
# Initialise SeismoDb against the old DB — migration should run.
from sfm.database import SeismoDb
db = SeismoDb(db_path)
rows = db.query_events(serial="BE11529")
assert len(rows) == 1
assert rows[0]["blastware_filename"] is None
assert "blastware_filesize" in rows[0]
assert "a5_pickle_filename" in rows[0]
if __name__ == "__main__":
if pytest is not None:
pytest.main([__file__, "-v"])
else:
# Standalone runner — does not require pytest.
import inspect
import tempfile
import traceback as _tb
passed = failed = 0
for _name, _fn in sorted(globals().items()):
if not _name.startswith("test_") or not callable(_fn):
continue
try:
_sig = inspect.signature(_fn)
if "tmp_path" in _sig.parameters:
with tempfile.TemporaryDirectory() as _td:
_fn(Path(_td))
else:
_fn()
print(f"PASS {_name}")
passed += 1
except Exception:
print(f"FAIL {_name}")
_tb.print_exc()
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)