4 Commits

Author SHA1 Message Date
serversdown a18712442f feat: preserve and encode raw 0C record in sidecar extensions for offline analysis 2026-05-08 21:50:01 +00:00
serversdown 8aea46b8a0 doc(fix): retracts raw int16 LE sample set assumptions. 2026-05-08 19:26:25 +00:00
serversdown 9123269b1f feat(protocol): implement v0.14.0 SUB 5A protocol rewrite with enhanced chunk handling and new helpers
test: add regression tests for v0.14.x SUB 5A protocol fixes
refactor(logging): change warning logs to debug for less verbosity in write_blastware_file
2026-05-08 19:11:55 +00:00
serversdown 9400f59167 doc: update readme to 0.15.0 2026-05-08 19:06:26 +00:00
11 changed files with 718 additions and 45 deletions
+59
@@ -121,6 +121,65 @@ All notable changes to seismo-relay are documented here.
---
## v0.14.0 — 2026-05-02
### Changed (major rewrite)
- **`read_bulk_waveform_stream` — STRT-bounded chunk walk.** Replaces the
earlier `0x0400`-step / `max(key4[2:4], 0x0400)` chunk-counter formula,
which over-read ~5× past the actual event end into post-event circular-
buffer garbage. The new walk:
1. Probe at `counter = start_offset` (event 1: `0x0000`; event N:
`cur_key[2:4]`).
2. Parse `end_offset` from the STRT record at `data[17]` of the probe
response (`end_key[2:4]` field).
3. For event 1 only, read the two fixed metadata pages at counter
`0x1002` and `0x1004` — these contain the global session-start
compliance setup (Project / Client / User Name / Seis Loc /
Extended Notes ASCII strings). Continuation events skip these
(BW caches them across the session).
4. Walk sample chunks at **`0x0200` increments (NOT `0x0400`)**, bounded
by `end_offset` — the loop exits when
`next_chunk_counter + 0x0200 > end_offset`.
5. Send the proper TERM frame (see new `bulk_waveform_term_v2()`) with
`offset_word = end_offset - next_boundary` and
`params[2:4] = next_boundary BE`. The TERM response carries the
partial last chunk + 26-byte file footer.
- **New helpers:** `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
and `parse_strt_end_offset(a5_data)` in `minimateplus.framing`.
- **`stop_after_metadata` / `extra_chunks_after_metadata` kwargs are now
no-ops** under the v0.14.x walk. They are retained on the
`read_bulk_waveform_stream` signature for backward compatibility but log a
DEBUG line when set. The old "scan for `b'Project:'` and stop one chunk
later" workaround is obsolete — the loop is deterministically bounded by
the STRT-derived `end_offset`.
- **Project / Client / User Name / Seis Loc string source corrected.**
These come from the dedicated metadata pages at counter `0x1002` /
`0x1004`, not from "A5 frame 7" of the sample-chunk stream. The
earlier "A5 frame 7" claim was an artifact of the broken `0x0400`-step
walk where the bad counter formula coincidentally landed sample-chunk
fi=7 on top of the 0x1002 metadata page.
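The STRT-bounded walk in steps 4 and 5 above can be sketched as a small counter planner. This is only an illustration of the loop bound and the TERM residual, not the project's `read_bulk_waveform_stream` code: the name `plan_sample_walk` and its `first_counter` parameter are hypothetical, and the first sample counter is passed in rather than derived (the value `0x0600` is taken from the 5-1-26 3-sec regression capture).

```python
CHUNK_STEP = 0x0200  # sample-chunk increment from the changelog (NOT 0x0400)


def plan_sample_walk(first_counter: int, end_offset: int, step: int = CHUNK_STEP):
    """Return (chunk_counters, next_boundary, term_offset_word).

    Hypothetical helper: walks full chunks while the *next* chunk still
    fits below end_offset, then reports the residual that the TERM frame
    is expected to cover (end_offset - next_boundary).
    """
    counters = []
    counter = first_counter
    # Loop exits when next_chunk_counter + step > end_offset.
    while counter + step <= end_offset:
        counters.append(counter)
        counter += step
    next_boundary = counter
    return counters, next_boundary, end_offset - next_boundary


if __name__ == "__main__":
    counters, boundary, residual = plan_sample_walk(0x0600, 0x21F2)
    print([f"0x{c:04X}" for c in counters])
    print(f"next_boundary=0x{boundary:04X} term_offset_word=0x{residual:04X}")
```

With the 3-sec event's `end_offset` of `0x21F2`, the planner yields the same thirteen sample counters that the regression test lists, which is a quick sanity check on the loop bound.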
### Verified
- Three independent BW MITM captures (4-27-26 + 5-1-26 + 5-4-26) confirm
the new walk matches BW's behaviour event-for-event.
- `end_offset` values verified across 3 events: `0x1ABE` (4-27-26 2-sec),
`0x21F2` (5-1-26 3-sec), `0x417E` (5-1-26 event-2).
### Notes
- Earlier v0.13.0 / v0.13.1 / v0.13.2 entries describe partial steps along
the way (some of the file builder fixes, filename bugs, etc.) that were
superseded by the full rewrite. Treat this v0.14.0 entry as the
definitive landing point for the corrected SUB 5A protocol.
---
## v0.14.1 — 2026-05-04
### Fixed
+6 -2
@@ -1,4 +1,4 @@
-# seismo-relay `v0.14.3`
+# seismo-relay `v0.15.0`
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
software for managing MiniMate Plus seismographs.
@@ -14,7 +14,11 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
> full Event Reports, frequency analysis, and waveform plots.
-> See [CHANGELOG.md](CHANGELOG.md) for full version history.
+> **v0.15.0 (2026-05-07)** adds layered per-event storage (BW binary +
+> raw 5A pickle + HDF5 + `.sfm.json` sidecar), a plot-ready
+> `sfm.plot.v1` JSON shape with server-side ADC-to-physical-units
+> conversion, and a BW-file importer for ingesting externally-produced
+> events. See [CHANGELOG.md](CHANGELOG.md) for full version history.
---
+61 -8
@@ -11,6 +11,7 @@
| Date | Section | Change |
|---|---|---|
| 2026-05-08 | §7.6.1 (RETRACTION) | **❌ RETRACTED — "raw int16 LE 8 bytes/sample-set" body codec was never validated.** The original 4-2-26 confirmation was based on misreading broken-decoder output (full-scale ±32K noise) as evidence the signal had saturated. BW's own 0C peaks for that capture (Tran=0.420 / Vert=3.870 / Long=0.495 in/s) prove the signal was NOT saturated — none of those exceed 13K ADC counts. No event in the project's archive has ever come close to saturation, yet the decoder consistently produces ±32K noise on every event. Conclusion: the body codec is not raw int16 LE; the actual encoding is open. Body byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`, lots of `10 XX` pairs) — likely a delta encoding with `0x10` as escape, but unverified. Retraction box added at top of §7.6.1; "fully-saturating event" claim removed from channel-identification note. The histogram codec in §7.6.2 IS verified and decoded correctly (different recording mode, 32-byte blocks); use it as a structural hint when reverse-engineering the waveform codec. |
| 2026-02-26 | Initial | Document created from first hex dump analysis |
| 2026-02-26 | §2 Frame Structure | **CORRECTED:** Frame uses DLE-STX (`0x10 0x02`) and DLE-ETX (`0x10 0x03`), not bare `0x02`/`0x03`. `0x41` confirmed as ACK not STX. DLE stuffing rule added. |
| 2026-02-26 | §8 Timestamp | **UPDATED:** Year `0x07CB = 1995` confirmed as MiniMate hardware default date when RTC battery is disconnected. Not an encoding error. Confidence upgraded from ❓ to 🔶. |
@@ -851,14 +852,59 @@ MicL: 39 64 1D AA = 0.0000875 psi
> strings actually live — NOT in any sample-chunk frame)
> - **§7.8.8** — multi-event "Download All" sequence
>
-> The waveform sample encoding (4-channel interleaved s16 LE, 8 bytes per sample-set) described in §7.6.1
-> below is still correct. Only the frame-indexing claims and metadata-source claims are wrong.
+> The waveform sample encoding described in §7.6.1 below (4-channel interleaved s16 LE, 8 bytes
+> per sample-set) is **NOT actually verified** — see the retraction note at the top of §7.6.1.
+> The frame-indexing claims and metadata-source claims in §7.6 are also wrong; use §7.8.5–§7.8.8.
**Two distinct formats exist depending on recording mode. Both confirmed from captures.**
---
-#### 7.6.1 Blast / Waveform mode — ✅ CONFIRMED (4-2-26 capture)
+#### 7.6.1 Blast / Waveform mode — ❌ NOT VERIFIED (retracted 2026-05-08)
> ## ⚠️ RETRACTION (2026-05-08)
>
> The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim
> below was **never actually validated**. It got into this document
> because the decoder built around that assumption produced full-scale
> ±32K counts on every channel of the 4-2-26 capture, and the
> ±32K-shaped output was misread as "the signal must have saturated."
>
> Cross-checking the BW-reported peaks proves the opposite:
>
> | Channel | BW PPV (in/s) | Expected ADC counts at 10 in/s FS |
> |---|---|---|
> | Tran | 0.420 | **1,376** |
> | Vert | 3.870 | **12,686** |
> | Long | 0.495 | **1,622** |
>
> None of these are anywhere near ±32K saturation. No event in the
> project's archive (across all captures from 1-2-26 onward) has
> ever come close to saturation either. Yet the decoder has
> consistently produced ±32K-shaped noise on every event. The right
> conclusion is that the byte-to-sample interpretation has been wrong
> the whole time, NOT that every event happened to saturate.
>
> What's actually known about the body bytes:
>
> - The byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`,
> plus high frequencies of `0x01 / 0x04 / 0x0F / 0xF0 / 0xF1`). Lots
> of `10 XX` pairs. Reading them as LE int16 produces uniform ±32K
> noise — the signature of mis-aligned or encoded data.
> - The CHANGELOG note for v0.14.2 calls the body a "delta-encoded
> ADC stream" — that hint plus the byte distribution points toward
> a delta encoding with `0x10` as an escape marker, but no decoder
> has been worked out yet.
> - The histogram-mode codec in §7.6.2 IS verified and decoded
> correctly (different format: 32-byte blocks with 9× int16 LE
> samples + metadata). The same firmware emits both formats, so
> §7.6.2 may share encoding primitives with the waveform codec
> and is worth using as a structural hint when reverse-engineering.
>
> **Treat the spec below as a starting hypothesis to disprove, not
> ground truth.** The frame-layout pieces (STRT location, preamble,
> chunk header) appear correct; the per-byte sample interpretation
> is the open question.
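The cross-check in the retraction table can be reproduced with a few lines of arithmetic. The sketch below assumes a plain linear mapping in which 10 in/s corresponds to 32768 counts (signed 16-bit full scale); that mapping is an assumption made here for illustration, not a documented device constant. Under it, the Tran and Long rows are reproduced exactly and the Vert row lands within a few counts of the table value. The helper name `expected_counts` is hypothetical.

```python
FULL_SCALE_IN_PER_S = 10.0  # assumed geophone range, from the table header
FULL_SCALE_COUNTS = 32768   # assumed: signed 16-bit full scale


def expected_counts(ppv_in_per_s: float,
                    full_scale: float = FULL_SCALE_IN_PER_S,
                    counts: int = FULL_SCALE_COUNTS) -> float:
    """Linear PPV -> ADC-count estimate under the assumptions above."""
    return ppv_in_per_s / full_scale * counts


# BW-reported peaks for the 4-2-26 capture (from the 0C record).
BW_PEAKS = {"Tran": 0.420, "Vert": 3.870, "Long": 0.495}

if __name__ == "__main__":
    for name, ppv in BW_PEAKS.items():
        est = expected_counts(ppv)
        share = est / FULL_SCALE_COUNTS
        print(f"{name}: ~{est:.0f} counts ({share:.1%} of full scale)")
```

Every estimate stays well under half of full scale, which is the point of the retraction: a decoder that emits ±32K on these channels is misreading the bytes.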
4-channel interleaved signed 16-bit little-endian, 8 bytes per sample-set:
@@ -923,11 +969,18 @@ Total: 7633B → 954 naive sample-sets, 948 alignment-corrected
Only 948 of 9306 sample-sets captured (10%) — `stop_after_metadata=True` terminated
download after A5[7] was received.
-**Channel identification note:** The 4-2-26 blast saturated all four geophone channels
-to near-maximum ADC output (~32000-32617 counts). Channel ordering [Tran, Vert, Long, Mic]
-= [ch0, ch1, ch2, ch3] is the Blastware convention and is consistent with per-channel PPV
-values (Tran=0.420, Vert=3.870, Long=0.495 in/s from 0C record), but cannot be
-independently confirmed from a fully-saturating event alone.
+**Channel identification note:** Channel ordering [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3]
+is the Blastware convention. This ordering has not been independently verified end-to-end,
+since no decoder yet produces samples that match BW's own rendering of the same event (see
+the retraction at the top of §7.6.1). Once the body codec is decoded, the per-channel PPV
+values from the 0C record (Tran=0.420, Vert=3.870, Long=0.495 in/s for the 4-2-26 capture)
+provide the cross-check that pins down channel order.
+> **Historical note:** earlier revisions of this section claimed the 4-2-26 blast had
+> "saturated all four channels to ~32000-32617 counts," citing that as evidence the s16 LE
+> interpretation was correct. That claim was wrong — the ±32K values were the broken
+> decoder's output, not the actual signal amplitude (which the 0C peaks above show was
+> nowhere near saturation). Retracted 2026-05-08.
---
+6 -6
@@ -639,7 +639,7 @@ def write_blastware_file(
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
probe_skip = 7 + 21
-log.warning(
+log.debug(
    "write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
    "probe_data_len=%d strt_hex=%s",
    strt_pos_stripped if strt_pos_stripped >= 0 else -1,
@@ -708,8 +708,8 @@ def write_blastware_file(
skip = 12 # sample chunks
contribution = _frame_body_bytes(frame, skip)
-log.warning("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
-            fi, skip, len(frame.data), len(contribution))
+log.debug("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
+          fi, skip, len(frame.data), len(contribution))
all_bytes.extend(contribution)
# Terminator contributes its content, which ends with the 26-byte footer.
@@ -717,7 +717,7 @@ def write_blastware_file(
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
if term_frame is not None:
    term_contribution = _frame_body_bytes(term_frame, 11)
-    log.warning(
+    log.debug(
        "write_blastware_file: term_frame data_len=%d skip=11 "
        "contribution_len=%d first8=%s",
        len(term_frame.data),
@@ -726,7 +726,7 @@ def write_blastware_file(
    )
    all_bytes.extend(term_contribution)
-log.warning(
+log.debug(
    "write_blastware_file: all_bytes total=%d last28=%s",
    len(all_bytes),
    bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
@@ -760,7 +760,7 @@ def write_blastware_file(
if footer_pos >= 0:
    body = bytes(all_bytes[:footer_pos])
    footer = bytes(all_bytes[footer_pos:footer_pos + 26])
-    log.warning(
+    log.debug(
        "write_blastware_file: real 0e 08 footer at all_bytes[%d]; "
        "truncating %d post-footer bytes",
        footer_pos, len(all_bytes) - footer_pos - 26,
+14
@@ -1362,6 +1362,20 @@ def _decode_waveform_record_into(data: bytes, event: Event) -> None:
Modifies event in-place.
"""
# ── Always preserve the raw 210 bytes ─────────────────────────────────────
# The 0C record carries far more than just peaks + project strings:
# ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement, Vector
# Sum Time, MicL Time of Peak, and the per-channel sensor self-check
# results (Test Freq / Ratio / Pass-Fail) all live somewhere in this
# 210-byte block. Their byte offsets are not yet mapped — keeping the
# raw bytes lets us decode those fields offline once we have a paired
# (raw 0C, BW-report) sample to fit against. Cheap to keep around
# (210 bytes per event).
try:
    event._raw_record = bytes(data[:210])
except Exception:
    pass
# ── Record type + format detection ────────────────────────────────────────
# `record_type` is the user-facing label ("Waveform" for any triggered
# event regardless of timestamp-header layout). `fmt` is the internal
+16 -1
@@ -15,6 +15,7 @@ declared in `event_to_sidecar_dict()`.
from __future__ import annotations
import base64
import datetime
import hashlib
import json
@@ -135,6 +136,20 @@ def event_to_sidecar_dict(
captured_at = captured_at or datetime.datetime.utcnow()
# Stash raw 0C record bytes in `extensions.raw_records` so future
# field-decoding work (Peak Acceleration, ZC Freq, Time of Peak,
# sensor self-check results, etc.) can run offline against committed
# sidecars without a live device. Cheap (~280 bytes base64) and
# forward-compatible (older readers ignore unknown extensions keys).
ext_dict: dict = dict(extensions) if extensions else {}
raw_0c = getattr(event, "_raw_record", None)
if raw_0c:
    rr = ext_dict.setdefault("raw_records", {})
    # Don't clobber a raw_0c that callers explicitly passed in via
    # `extensions=...` (e.g. round-trip preservation in patch_sidecar).
    rr.setdefault("waveform_record_b64", base64.b64encode(raw_0c).decode("ascii"))
    rr.setdefault("waveform_record_len", len(raw_0c))
return {
    "schema_version": SCHEMA_VERSION,
    "kind": SIDECAR_KIND,
@@ -174,7 +189,7 @@ def event_to_sidecar_dict(
        "notes": "",
    },
-    "extensions": extensions or {},
+    "extensions": ext_dict,
}
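The write side of the raw-0C persistence above pairs with a read side in the inspection tool further down. The round trip can be sketched in isolation as follows. Both function names (`stash_raw_record`, `load_raw_record`) are hypothetical stand-ins, not functions from the codebase; only the `extensions.raw_records.waveform_record_b64` / `waveform_record_len` key names come from the diff.

```python
import base64
from typing import Optional


def stash_raw_record(extensions: Optional[dict], raw: Optional[bytes]) -> dict:
    """Return an extensions dict carrying the raw record as base64.

    Uses setdefault so a value the caller already supplied is not clobbered.
    """
    ext = dict(extensions) if extensions else {}
    if raw:
        rr = ext.setdefault("raw_records", {})
        rr.setdefault("waveform_record_b64", base64.b64encode(raw).decode("ascii"))
        rr.setdefault("waveform_record_len", len(raw))
    return ext


def load_raw_record(sidecar: dict) -> Optional[bytes]:
    """Decode the stored record, or return None for older sidecars."""
    b64 = (
        sidecar.get("extensions", {})
        .get("raw_records", {})
        .get("waveform_record_b64")
    )
    return base64.b64decode(b64) if b64 else None


if __name__ == "__main__":
    raw = bytes(range(210))  # synthetic stand-in for a 210-byte 0C record
    sidecar = {"extensions": stash_raw_record(None, raw)}
    print(len(sidecar["extensions"]["raw_records"]["waveform_record_b64"]))
    print(load_raw_record(sidecar) == raw)
```

A 210-byte record encodes to 280 base64 characters, which matches the "~280 bytes" estimate in the code comment.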
+26 -20
@@ -111,14 +111,15 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
verified against this algorithm on 2026-04-02).
Args:
-    offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
-    raw_params: 10 or 11 params bytes (from bulk_waveform_params or
-                bulk_waveform_term_params). 0x10 bytes in params are
-                written RAW, NOT DLE-stuffed. Confirmed 2026-04-06 by
-                comparing wire bytes: BW sends bare `10 04` for chunk 1
-                (counter=0x1004), not stuffed `10 10 04`. Device reads
-                params at fixed byte positions; stuffing shifts the bytes
-                and corrupts the counter, causing device to ignore the frame.
+    offset_word: 16-bit offset. For probe/chunks/metadata pages this is
+                `0x1002`. For the proper TERM frame this is computed by
+                `bulk_waveform_term_v2()` from the STRT-derived
+                `end_offset`.
+    raw_params: 10, 11, or 12 params bytes (from `bulk_waveform_params`
+                for probes/samples, `bulk_waveform_term_v2` for TERM, or
+                a manually-built 12-byte block for the metadata pages
+                0x1002 / 0x1004). See gotcha #3 below — params region
+                uses partial DLE stuffing of 0x10 bytes.
Returns:
    Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
@@ -433,21 +434,26 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -
def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
    """
-    DEPRECATED 2026-05-01: see bulk_waveform_term_v2().
-
-    Build the 10-byte params block for the SUB 5A termination request, OLD layout
-    (used in conjunction with the fixed offset_word=0x005A). Kept for backward
-    compatibility; produces a tiny ~100-byte device-side terminator response
-    rather than the proper partial-last-chunk + footer payload that BW gets.
-
-    params[0] = key4[0]
-    params[1] = key4[1]
-    params[2] = (counter >> 8) & 0xFF
-    params[3:] = zeros
-
-    Use bulk_waveform_term_v2() for new code; it computes the verified
-    offset_word + params from end_offset (extracted from STRT) and the last
-    chunk counter.
+    DEPRECATED: DO NOT USE IN NEW CODE.
+
+    This is the v1 termination params helper, paired with the broken
+    `_BULK_TERM_OFFSET = 0x005A` magic offset_word. Together they produce a
+    ~100-byte device-side terminator response that does NOT contain the
+    partial-last-chunk waveform tail or the 26-byte file footer. Files
+    reconstructed using this terminator are missing their last ~512 bytes of
+    waveform data and have a synthesized footer that disagrees with what BW
+    would have written.
+
+    **For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
+    which computes the correct offset_word + params from the STRT-derived
+    `end_offset`. v2 produces wire bytes that match BW exactly across all
+    tested events (4-27-26 / 5-1-26 / 5-4-26 captures).
+
+    This function is retained ONLY for the defensive fallback path in
+    `read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
+    chunks are fetched (= a malformed event or an unexpected device state).
+    The fallback already logs a WARNING when it activates; if you see that
+    warning, the bug is upstream: STRT should have been parseable.
""" """
if len(key4) != 4:
    raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
+9 -8
@@ -937,7 +937,7 @@ class MiniMateProtocol:
    continue
chunk = data_rsp.data[11:]
-log.warning(
+log.debug(
    "read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
    step_name, data_rsp.page_key, len(data_rsp.data),
    len(chunk), len(config) + len(chunk),
@@ -957,17 +957,18 @@ class MiniMateProtocol:
except TimeoutError:
    pass
-log.warning(
+log.info(
    "read_compliance_config: done — %d cfg bytes total",
    len(config),
)
-# Hex dump first 128 bytes for field mapping
-for row in range(0, min(len(config), 128), 16):
-    row_bytes = bytes(config[row:row + 16])
-    hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
-    asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
-    log.warning(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
+# Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
+if log.isEnabledFor(logging.DEBUG):
+    for row in range(0, min(len(config), 128), 16):
+        row_bytes = bytes(config[row:row + 16])
+        hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
+        asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
+        log.debug(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
return bytes(config)
+216
@@ -0,0 +1,216 @@
"""
sfm.dump_0c: inspect the raw 210-byte SUB 0C waveform record stored in a
sidecar JSON's `extensions.raw_records.waveform_record_b64`.

Usage:
    python -m sfm.dump_0c <sidecar.sfm.json> [<sidecar.sfm.json> ...]

Prints, for each input:
  - A header summarising the sidecar's metadata-block claims (peaks,
    project, timestamp): the "what BW says this event measured" view.
  - A 16-byte-wide hex dump of the raw 0C record, annotated with known
    field anchors (channel labels, project strings).
  - A "candidate float regions" scan that brute-forces every byte
    position as a float32 BE and prints any that yield a value in a
    plausible range (1e-5 to 50.0); useful for hunting where Peak
    Acceleration / Peak Displacement / ZC Freq / Time of Peak live.

Pairing the printed candidates with the BW Event Report values lets
us nail down byte offsets for the missing fields without a live
device.
"""
from __future__ import annotations
import argparse
import base64
import json
import struct
import sys
from pathlib import Path
# ── Annotations for known anchors in a 210-byte 0C record ──────────────────
# Anchors we look for and label inline in the hex dump. Each is a needle
# (bytes to find) and a short label. Found via .find() — the first
# occurrence wins.
_ANCHORS = [
(b"Tran", "Tran label (PPV @ +6, PVS @ -12)"),
(b"Vert", "Vert label (PPV @ +6)"),
(b"Long", "Long label (PPV @ +6)"),
(b"MicL", "MicL label (peak psi @ +6)"),
(b"Project:", "Project: label"),
(b"Client:", "Client: label"),
(b"User Name:", "User Name: label"),
(b"Seis Loc:", "Seis Loc: label"),
(b"Extended Notes", "Extended Notes label"),
]
def _hex_dump(data: bytes, anchors: dict[int, str]) -> str:
    """Return a 16-byte-wide hex+ASCII dump, with anchor labels printed
    on the line that contains the anchor's start byte."""
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off : off + 16]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        line = f" {off:04x} {hex_part:<47} |{ascii_part}|"
        # If any anchor lands on a byte in this row, append a tag
        tags = [
            f"[{a:#04x}: {label}]"
            for a, label in anchors.items()
            if off <= a < off + 16
        ]
        if tags:
            line += " " + " ".join(tags)
        lines.append(line)
    return "\n".join(lines)


def _scan_float32_be(data: bytes, lo: float, hi: float) -> list[tuple[int, float]]:
    """Brute-force every offset where data[off:off+4] is a float32 BE whose
    magnitude lies in [lo, hi]. Negative values are included (symmetric range)."""
    hits = []
    for i in range(len(data) - 3):
        try:
            v = struct.unpack_from(">f", data, i)[0]
        except struct.error:
            continue
        if v != v:  # NaN
            continue
        if abs(v) < 1e-30 or abs(v) > 1e10:  # implausible magnitude
            continue
        a = abs(v)
        if lo <= a <= hi:
            hits.append((i, v))
    return hits


def _scan_uint16_be(data: bytes, lo: int, hi: int) -> list[tuple[int, int]]:
    """Find every offset where uint16 BE is in [lo, hi]."""
    hits = []
    for i in range(len(data) - 1):
        v = (data[i] << 8) | data[i + 1]
        if lo <= v <= hi:
            hits.append((i, v))
    return hits


def _summarize_sidecar(side: dict) -> str:
    ev = side.get("event", {})
    pv = side.get("peak_values", {})
    pi = side.get("project_info", {})
    bw = side.get("blastware", {})
    peak_keys = ("transverse", "vertical", "longitudinal", "vector_sum", "mic_psi")
    # Build the peaks text first. Putting the `if ... else` inside one long
    # run of adjacent string literals would make the conditional swallow the
    # header lines too, dropping them whenever a peak value is missing.
    if all(pv.get(k) is not None for k in peak_keys):
        peaks = (
            f"Tran={pv['transverse']:.5f} "
            f"Vert={pv['vertical']:.5f} "
            f"Long={pv['longitudinal']:.5f} "
            f"PVS={pv['vector_sum']:.5f} in/s "
            f"Mic={pv['mic_psi']:.6e} psi"
        )
    else:
        # Fall back to the raw dict when any peak is missing.
        peaks = f"{pv}"
    return (
        f" serial: {ev.get('serial')}\n"
        f" timestamp: {ev.get('timestamp')}\n"
        f" waveform: {ev.get('waveform_key')} ({ev.get('record_type')})\n"
        f" sample_rate:{ev.get('sample_rate')} sps rectime:{ev.get('rectime_seconds')}s\n"
        f" bw file: {bw.get('filename')} ({bw.get('filesize')} B)\n"
        f" peaks: {peaks}\n"
        f" project: {pi.get('project')!r} / {pi.get('client')!r} / "
        f"operator={pi.get('operator')!r} loc={pi.get('sensor_location')!r}"
    )


def dump_one(path: Path) -> int:
    side = json.loads(path.read_text(encoding="utf-8"))
    raw_b64 = (
        side.get("extensions", {})
        .get("raw_records", {})
        .get("waveform_record_b64")
    )
    if not raw_b64:
        print(f"\n=== {path} ===")
        print(" ! no extensions.raw_records.waveform_record_b64 — sidecar")
        print(" pre-dates raw-0C persistence (added in v0.15.x). Re-save")
        print(" the event from the device to capture the bytes.")
        return 1
    raw = base64.b64decode(raw_b64)

    # Build anchor map
    anchors: dict[int, str] = {}
    for needle, label in _ANCHORS:
        i = raw.find(needle)
        if i >= 0:
            anchors[i] = label

    print(f"\n=== {path} ===")
    print("metadata claimed by sidecar:")
    print(_summarize_sidecar(side))
    print(f"\nraw 0C record ({len(raw)} bytes):")
    print(_hex_dump(raw, anchors))

    # Float32 BE candidates in geo-relevant ranges
    geo_hits = _scan_float32_be(raw, 1e-5, 50.0)
    # Every hit is printed; hits that coincide with the already-documented
    # per-channel label+6 PPV floats are annotated so they can be skipped.
    print("\nfloat32 BE candidates (1e-5 .. 50.0):")
    for off, v in geo_hits:
        annotation = ""
        for needle, _ in _ANCHORS[:4]:  # geo + mic labels
            i = raw.find(needle)
            if i >= 0 and off == i + 6:
                annotation = f"  {needle.decode()} PPV (label+6)"
                break
        print(f" {off:#04x} ({off:3d}) {v:>+15.6f}{annotation}")

    print("\nuint16 BE candidates, ZC-Freq-ish (1..200):")
    for off, v in _scan_uint16_be(raw, 1, 200):
        if v < 5:  # too noisy at very low end
            continue
        print(f" {off:#04x} ({off:3d}) = {v}")

    print("\nuint16 BE candidates, Time-of-Peak-ish if stored as ms (1..30000):")
    for off, v in _scan_uint16_be(raw, 1, 30000):
        if v < 100:  # noise filter
            continue
        # Only offsets up to 80 are worth showing; too many hits otherwise
        if off > 80:
            break
        print(f" {off:#04x} ({off:3d}) = {v} ms ?")
    print()
    return 0


def main(argv: list[str] | None = None) -> int:
    p = argparse.ArgumentParser(
        description="Inspect a saved 0C waveform record from a sidecar JSON.",
    )
    p.add_argument(
        "sidecars",
        nargs="+",
        type=Path,
        help="Path(s) to <event>.sfm.json sidecar file(s).",
    )
    args = p.parse_args(argv)
    rc = 0
    for path in args.sidecars:
        try:
            rc |= dump_one(path)
        except Exception as exc:
            print(f"\n=== {path} ===\n ERROR: {exc}", file=sys.stderr)
            rc |= 2
    return rc


if __name__ == "__main__":
    sys.exit(main())
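The brute-force float scan in `sfm.dump_0c` can be tried without a real sidecar. The sketch below builds a synthetic 210-byte buffer, plants a `Tran` label with a big-endian float32 six bytes after the label start (the "label+6" convention from the `_ANCHORS` annotations), and scans it. Everything here is illustrative: `scan_float32_be` is a simplified stand-in for `_scan_float32_be`, and the label position and PPV value are made up.

```python
import struct


def scan_float32_be(data: bytes, lo: float, hi: float):
    """Return (offset, value) for every float32 BE whose magnitude is in [lo, hi]."""
    hits = []
    for off in range(len(data) - 3):
        (v,) = struct.unpack_from(">f", data, off)
        if v == v and lo <= abs(v) <= hi:  # v == v filters out NaN
            hits.append((off, v))
    return hits


def build_synthetic_record(label_at: int = 40, ppv: float = 0.42) -> bytes:
    """Synthetic stand-in for a 0C record: zeros + one label + one float."""
    record = bytearray(210)
    record[label_at:label_at + 4] = b"Tran"
    struct.pack_into(">f", record, label_at + 6, ppv)
    return bytes(record)


if __name__ == "__main__":
    record = build_synthetic_record()
    label = record.find(b"Tran")
    for off, value in scan_float32_be(record, 1e-5, 50.0):
        tag = "  Tran PPV (label+6)" if off == label + 6 else ""
        print(f"{off:#04x} ({off:3d}) {value:+.6f}{tag}")
```

Because the scan tests every byte offset, misaligned reads of the same bytes can also fall in range, so the output should be treated as a candidate list to pair against the BW Event Report rather than as decoded fields.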
+252
@@ -0,0 +1,252 @@
"""
test_5a_protocol.py: regression test for the v0.14.x SUB 5A protocol fixes.

Verifies that SFM's framing helpers reproduce Blastware's exact wire bytes
for every 5A request frame in the 5-1-26 "bwcap3sec" capture, AND that the
file builder produces a byte-identical file when fed the BW capture's A5
responses.

Together these two tests protect all four v0.14.x fixes:

  v0.14.0: STRT-bounded chunk walk (probe @ 0, metadata pages @ 0x1002 +
           0x1004, samples @ 0x0600..0x1E00 step 0x0200, TERM at residual)
  v0.14.1: event-N probe counter is `start_offset`, not `start_offset+0x46`
           (covered by the multi-event captures, not this 3-sec event-1
           capture, but the helpers are the same code path)
  v0.14.2: file body assembly is contiguous concatenation, no de-duplication
  v0.14.3: partial DLE stuffing of `0x10` bytes in 5A params (counter=0x1000
           wire bytes are `10 10 00`, not `10 00`)

If any of these fixes regresses, this test fails immediately with a clear
byte-level diff.

Run:
    python -m pytest tests/test_5a_protocol.py -v
or:
    python tests/test_5a_protocol.py
"""
from __future__ import annotations
import os
import sys
import pytest
# Allow running from the project root without installation
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from minimateplus.framing import (
S3FrameParser,
build_5a_frame,
bulk_waveform_params,
bulk_waveform_term_v2,
)
# ── Capture loading ────────────────────────────────────────────────────────────
ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Reference BW MITM capture: BW saving a 3-sec event 0 (start_key=01110000,
# end_offset=0x21F2). 17 5A frames: probe + 2 metadata pages + 13 samples + TERM.
BW_TX_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_bw_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
BW_S3_PATH = os.path.join(
ROOT,
"bridges/captures/5-1-26/comcheck/bwcap3sec/"
"raw_s3_20260501_165723_copy_3sec_waveform_to_disk.bin",
)
# BW's saved Blastware file for the same event (used for file-builder verification).
BW_SAVED_FILE = os.path.join(
ROOT, "example-events/decode_test/5-1-26/bw/M529LKIQ.G10",
)
def _split_bw_frames(data: bytes) -> list[bytes]:
    """Split BW TX bytes into individual frames (ACK STX … bare ETX)."""
    frames: list[bytes] = []
    i = 0
    while i < len(data):
        # A frame starts with ACK (0x41) followed by STX (0x02).
        if data[i] != 0x41 or i + 1 >= len(data) or data[i + 1] != 0x02:
            i += 1
            continue
        j = i + 2
        while j < len(data):
            if data[j] == 0x03:
                break
            # 0x10 escapes the next byte; skip the pair as a unit.
            if data[j] == 0x10 and j + 1 < len(data):
                j += 2
                continue
            j += 1
        if j >= len(data):
            break
        frames.append(data[i : j + 1])
        i = j + 1
    return frames


@pytest.fixture(scope="module")
def bw_5a_frames() -> list[bytes]:
    """All 5A frames from the BW TX capture, in wire order."""
    if not os.path.exists(BW_TX_PATH):
        pytest.skip(f"BW capture not found: {BW_TX_PATH}")
    with open(BW_TX_PATH, "rb") as fh:
        raw = fh.read()
    frames = [
        f for f in _split_bw_frames(raw)
        if len(f) >= 6 and f[5] == 0x5A  # body[3] == 0x5A (SUB)
    ]
    assert len(frames) == 17, f"expected 17 5A frames in capture, got {len(frames)}"
    return frames


@pytest.fixture(scope="module")
def bw_a5_frames():
    """All A5 (response) frames from the matching S3 capture."""
    if not os.path.exists(BW_S3_PATH):
        pytest.skip(f"BW S3 capture not found: {BW_S3_PATH}")
    with open(BW_S3_PATH, "rb") as fh:
        raw = fh.read()
    p = S3FrameParser()
    p.feed(raw)
    a5 = [f for f in p.frames if f.sub == 0xA5]
    assert len(a5) == 17, f"expected 17 A5 frames in capture, got {len(a5)}"
    return a5


# ── 5A request frame byte-perfect verification ────────────────────────────────
KEY4 = bytes.fromhex("01110000") # start_key for the 3-sec event 0
END_OFFSET = 0x21F2 # parsed from STRT in the BW capture
LAST_CHUNK_COUNTER = 0x1E00 # last full 0x0200-byte chunk before TERM
SAMPLE_COUNTERS = (
0x0600, 0x0800, 0x0A00, 0x0C00, 0x0E00,
0x1000, 0x1200, 0x1400, 0x1600, 0x1800,
0x1A00, 0x1C00, 0x1E00,
)
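The constants above can be cross-checked with a short arithmetic sketch of the STRT-bounded walk. It assumes the exit condition stated in the changelog (`next_chunk_counter + 0x0200 > end_offset`) and takes the first sample counter `0x0600` from the capture as given; `walk_sample_counters` is a hypothetical helper, not part of the library.

```python
CHUNK_STEP = 0x0200
END_OFFSET = 0x21F2
FIRST_SAMPLE_COUNTER = 0x0600  # assumption: first sample counter seen in the capture


def walk_sample_counters(first: int, end_offset: int, step: int = CHUNK_STEP):
    """Return the full-chunk counters and the boundary where the walk stops."""
    counters = []
    counter = first
    # Keep reading while a whole chunk still fits before end_offset.
    while counter + step <= end_offset:
        counters.append(counter)
        counter += step
    return counters, counter


counters, next_boundary = walk_sample_counters(FIRST_SAMPLE_COUNTER, END_OFFSET)
residual = END_OFFSET - next_boundary  # size of the partial last chunk
print(len(counters), hex(counters[-1]), hex(next_boundary), hex(residual))
# → 13 0x1e00 0x2000 0x1f2
```

Under these assumptions the walk yields 13 full chunks ending at `0x1E00`, and the TERM frame covers the remaining `0x01F2` bytes past the `0x2000` boundary.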
def _meta_params(key: bytes, counter: int) -> bytes:
    """Build the 12-byte metadata-page params block (matches BW for 0x1002 / 0x1004)."""
    return bytes(
        [
            0x00, key[0], key[1],
            (counter >> 8) & 0xFF, counter & 0xFF,
            0, 0, 0, 0, 0, 0, 0,
        ]
    )
def test_probe_frame_byte_perfect(bw_5a_frames):
    """Probe @ counter=0x0000 (frame 0)."""
    sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, 0, is_probe=True))
    assert sfm == bw_5a_frames[0], (
        f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[0].hex()}"
    )
@pytest.mark.parametrize("idx,counter", [(1, 0x1002), (2, 0x1004)])
def test_metadata_page_frames_byte_perfect(bw_5a_frames, idx, counter):
    """Metadata pages @ counter=0x1002 and 0x1004 (frames 1 and 2)."""
    sfm = build_5a_frame(0x1002, _meta_params(KEY4, counter))
    assert sfm == bw_5a_frames[idx], (
        f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[idx].hex()}"
    )
@pytest.mark.parametrize("i,counter", list(enumerate(SAMPLE_COUNTERS)))
def test_sample_chunk_frames_byte_perfect(bw_5a_frames, i, counter):
    """
    Sample chunks @ counter=0x0600..0x1E00, step 0x0200 (frames 3..15).
    Critically, frame 8 (counter=0x1000) requires the v0.14.3 partial DLE
    stuffing fix: wire params include `10 10 00` for the counter, not `10 00`.
    """
    sfm = build_5a_frame(0x1002, bulk_waveform_params(KEY4, counter))
    bw_idx = 3 + i
    assert sfm == bw_5a_frames[bw_idx], (
        f"\ncounter=0x{counter:04X}"
        f"\nSFM: {sfm.hex()}"
        f"\nBW: {bw_5a_frames[bw_idx].hex()}"
    )
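The `10 10 00` versus `10 00` distinction for counter `0x1000` can be illustrated with a small DLE-stuffing sketch. This assumes the simplest rule, that only the DLE byte `0x10` is doubled on the wire; which fields the real encoder stuffs (the "partial" part of the fix) is not shown in this file, so `dle_stuff` is a hypothetical stand-in rather than the library's implementation.

```python
DLE = 0x10


def dle_stuff(payload: bytes) -> bytes:
    """Double every DLE byte so the receiver can tell data from control."""
    out = bytearray()
    for b in payload:
        out.append(b)
        if b == DLE:
            out.append(DLE)  # escaped copy
    return bytes(out)


counter = 0x1000
raw = counter.to_bytes(2, "big")   # logical counter bytes: 10 00
wire = dle_stuff(raw)              # stuffed wire bytes:    10 10 00
print(raw.hex(" "), "->", wire.hex(" "))
# → 10 00 -> 10 10 00
```

Counters whose bytes contain no `0x10` (for example `0x0E00` or `0x1200`) pass through unchanged, which is why only the `0x1000` chunk exposes the difference in this capture.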
def test_term_frame_byte_perfect(bw_5a_frames):
    """TERM frame at residual (frame 16)."""
    offset_word, params = bulk_waveform_term_v2(KEY4, END_OFFSET, LAST_CHUNK_COUNTER)
    sfm = build_5a_frame(offset_word, params)
    assert sfm == bw_5a_frames[16], (
        f"\nSFM: {sfm.hex()}\nBW: {bw_5a_frames[16].hex()}"
    )
def test_strt_end_offset_parsing(bw_a5_frames):
    """The probe response (A5[0]) carries STRT at byte 17 with end_offset=0x21F2."""
    from minimateplus.framing import parse_strt_end_offset
    end_offset = parse_strt_end_offset(bw_a5_frames[0].data)
    assert end_offset == END_OFFSET, (
        f"expected end_offset=0x{END_OFFSET:04X}, got "
        f"{f'0x{end_offset:04X}' if end_offset is not None else 'None'}"
    )
# ── File builder byte-perfect verification ────────────────────────────────────
def test_blastware_file_builder_byte_perfect(bw_a5_frames):
    """
    Feed the BW capture's A5 frames into write_blastware_file() and verify the
    output is byte-identical to BW's saved M529LKIQ.G10 reference file.
    This protects the v0.14.2 strip-removal fix and the file-builder skip
    values (probe=38, meta=13, samples=12, TERM=11).
    """
    if not os.path.exists(BW_SAVED_FILE):
        pytest.skip(f"BW saved file not found: {BW_SAVED_FILE}")
    import tempfile
    from minimateplus.blastware_file import write_blastware_file
    from minimateplus.models import Event
    ev = Event(index=0)
    ev._waveform_key = KEY4
    ev.rectime_seconds = 3
    ev.timestamp = None  # let the builder pull the footer from the TERM frame
    with tempfile.NamedTemporaryFile(suffix=".G10", delete=False) as tf:
        tmp_path = tf.name
    try:
        write_blastware_file(ev, bw_a5_frames, tmp_path)
        sfm_bytes = open(tmp_path, "rb").read()
    finally:
        os.unlink(tmp_path)
    bw_bytes = open(BW_SAVED_FILE, "rb").read()
    assert len(sfm_bytes) == len(bw_bytes), (
        f"file size mismatch: SFM={len(sfm_bytes)} BW={len(bw_bytes)}"
    )
    if sfm_bytes != bw_bytes:
        # Find first diff for actionable error message
        for i in range(len(bw_bytes)):
            if bw_bytes[i] != sfm_bytes[i]:
                ctx_start = max(0, i - 8)
                ctx_end = min(len(bw_bytes), i + 16)
                pytest.fail(
                    f"file diverges at byte 0x{i:04X}\n"
                    f" BW : {bw_bytes[ctx_start:ctx_end].hex()}\n"
                    f" SFM: {sfm_bytes[ctx_start:ctx_end].hex()}\n"
                    f" {' ' * (i - ctx_start)}^^"
                )
# ── Standalone runner ─────────────────────────────────────────────────────────
if __name__ == "__main__":
    sys.exit(pytest.main([__file__, "-v"]))
@@ -127,6 +127,59 @@ def test_sidecar_write_and_read_round_trip(tmp_path: Path):
    assert loaded["source"]["kind"] == "sfm-ach"
def test_sidecar_persists_raw_0c_record_in_extensions(tmp_path: Path):
    """An Event with _raw_record populated should land its 210 bytes
    base64-encoded in extensions.raw_records.waveform_record_b64, so
    later analysis (e.g. mapping Peak Acceleration / Time of Peak / ZC
    Freq byte offsets) can run offline against the saved sidecar."""
    import base64
    ev, _ = _make_synthetic_event()
    # Synthesize a 210-byte 0C record with embedded label needles so
    # the dump tool's anchor scan has something to find.
    raw = bytearray(210)
    raw[10:14] = b"Tran"
    raw[60:64] = b"Vert"
    raw[110:114] = b"Long"
    raw[160:164] = b"MicL"
    ev._raw_record = bytes(raw)
    d = event_file_io.event_to_sidecar_dict(
        ev, serial="BE11529",
        blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
        blastware_sha256="x" * 64, source_kind="sfm-live",
    )
    rr = d["extensions"]["raw_records"]
    assert rr["waveform_record_len"] == 210
    decoded = base64.b64decode(rr["waveform_record_b64"])
    assert decoded == ev._raw_record
    # Round-trip through write/read
    path = tmp_path / "raw0c.sfm.json"
    event_file_io.write_sidecar(path, d)
    loaded = event_file_io.read_sidecar(path)
    assert (
        base64.b64decode(loaded["extensions"]["raw_records"]["waveform_record_b64"])
        == ev._raw_record
    )
def test_sidecar_omits_raw_records_when_event_has_no_0c(tmp_path: Path):
    """Events without a _raw_record (e.g. constructed by importers that
    never see 0C) should NOT add an empty raw_records block; keep the
    sidecar clean for those flows."""
    ev, _ = _make_synthetic_event()
    assert ev._raw_record is None
    d = event_file_io.event_to_sidecar_dict(
        ev, serial="BE11529",
        blastware_filename="M529LKIQ.7M0W", blastware_filesize=1024,
        blastware_sha256="x" * 64, source_kind="bw-import",
    )
    assert d["extensions"] == {}
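The two sidecar tests above pin down a simple contract: a raw 0C record is base64-encoded under `extensions.raw_records`, and the block is omitted entirely when no record exists. A minimal sketch of that contract, using only the key names visible in the tests; the helper `raw_record_extensions` is hypothetical and is not the actual `event_file_io` code.

```python
import base64
import json


def raw_record_extensions(raw_record):
    """Build the sidecar `extensions` dict for an optional raw 0C record."""
    if raw_record is None:
        return {}  # no empty raw_records block for importers that never see 0C
    return {
        "raw_records": {
            "waveform_record_b64": base64.b64encode(raw_record).decode("ascii"),
            "waveform_record_len": len(raw_record),
        }
    }


# Synthetic 210-byte record with one label needle, as in the test above.
raw = bytearray(210)
raw[10:14] = b"Tran"

ext = raw_record_extensions(bytes(raw))
restored = json.loads(json.dumps(ext))  # JSON round trip, as a sidecar would do
decoded = base64.b64decode(restored["raw_records"]["waveform_record_b64"])
```

Storing the length next to the base64 string lets an offline tool sanity-check the decoded record before scanning it for field offsets.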
def test_sidecar_rejects_unsupported_schema_version(tmp_path: Path):
    path = tmp_path / "future.sfm.json"
    path.write_text(json.dumps({