feat: implement raw ADC waveform decoding and download functionality

- Added `_decode_a5_waveform()` to parse SUB 5A frames into per-channel time-series data.
- Introduced `download_waveform(event)` method in `MiniMateClient` to fetch full waveform data.
- Updated `Event` model to include new fields: `total_samples`, `pretrig_samples`, `rectime_seconds`, and `_waveform_key`.
- Enhanced documentation in `CHANGELOG.md` and `instantel_protocol_reference.md` to reflect new features and confirmed protocol details.
Brian Harrison
2026-04-03 13:53:09 -04:00
parent 5d0f0855f2
commit 790e442a7a
5 changed files with 671 additions and 9 deletions


@@ -4,6 +4,50 @@ All notable changes to seismo-relay are documented here.
---
## v0.7.0 — 2026-04-03
### Added
- **Raw ADC waveform decode — `_decode_a5_waveform(frames_data, event)`** in `client.py`.
Parses the complete set of SUB 5A A5 response frames into per-channel time-series:
- Reads the STRT record from A5[0] (bytes 7+): extracts `total_samples` (BE uint16 at +8),
`pretrig_samples` (BE uint16 at +16), and `rectime_seconds` (uint8 at +18) into
`event.total_samples / pretrig_samples / rectime_seconds`.
- Skips the 6-byte preamble (`00 00 ff ff ff ff`) that follows the 21-byte STRT header;
waveform data begins at `strt_pos + 27`.
- Strips the 8-byte per-frame counter header from A5[1–6, 8] before appending waveform bytes.
- Skips A5[7] (metadata-only) and A5[9] (terminator).
- **Cross-frame alignment correction**: accumulates `running_offset % 8` across all frames
and discards `(8 - align) % 8` leading bytes per frame to re-align to a T/V/L/M boundary.
Required because individual frame waveform payloads are not always multiples of 8 bytes.
- Decodes as 4-channel interleaved signed 16-bit LE at 8 bytes per sample-set:
bytes 0–1 = Tran, 2–3 = Vert, 4–5 = Long, 6–7 = Mic.
- Stores result in `event.raw_samples = {"Tran": [...], "Vert": [...], "Long": [...], "Mic": [...]}`.
- **`download_waveform(event)` public method** on `MiniMateClient`.
Issues a full SUB 5A stream with `stop_after_metadata=False`, then calls
`_decode_a5_waveform()` to populate `event.raw_samples` and `event.total_samples /
pretrig_samples / rectime_seconds`. Previously only metadata frames were fetched during
`get_events()`; raw waveform data is now available on demand.
- **`Event` model new fields** (`models.py`): `total_samples`, `pretrig_samples`,
`rectime_seconds` (from STRT record), and `_waveform_key` (4-byte key stored during
`get_events()` for later use by `download_waveform()`).
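
The 8-byte sample-set layout listed above can be sketched as follows. This is a minimal illustration rather than the code in `client.py`: the helper name `deinterleave` is hypothetical, and the buffer is assumed to already start on a Tran boundary (no cross-frame correction is applied here).

```python
import struct


def deinterleave(buf: bytes) -> dict[str, list[int]]:
    """Split interleaved int16 LE sample-sets into per-channel lists."""
    usable = len(buf) - (len(buf) % 8)  # drop any trailing partial sample-set
    channels: dict[str, list[int]] = {"Tran": [], "Vert": [], "Long": [], "Mic": []}
    # "<4h" = four little-endian signed 16-bit values per 8-byte sample-set
    for tran, vert, long_, mic in struct.iter_unpack("<4h", buf[:usable]):
        channels["Tran"].append(tran)
        channels["Vert"].append(vert)
        channels["Long"].append(long_)
        channels["Mic"].append(mic)
    return channels
```

Truncating to a multiple of 8 bytes keeps `struct.iter_unpack` from raising on a trailing partial sample-set.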
### Protocol / Documentation
- **SUB 5A A5[0] STRT record layout confirmed** (✅ 2026-04-03, 4-2-26 blast capture):
- STRT header is 21 bytes: `b"STRT"` + length fields + `total_samples` (BE uint16 at +8) +
`pretrig_samples` (BE uint16 at +16) + `rectime_seconds` (uint8 at +18).
- Followed by 6-byte preamble: `00 00 ff ff ff ff`. Waveform begins at `strt_pos + 27`.
- Confirmed: 4-2-26 blast → `total_samples=9306`, `pretrig_samples=298`, `rectime_seconds=70`.
- **Blast/waveform mode A5 format confirmed** (✅ 2026-04-03, 4-2-26 blast capture):
4-channel interleaved int16 LE at 8 bytes per sample-set; cross-frame alignment correction
required. 948 of 9306 total sample-sets captured via `stop_after_metadata=True` (10 frames).
- **Noise/histogram mode A5 format — endianness corrected** (✅ 2026-04-03, 3-31-26 capture):
32-byte block samples are signed 16-bit **little-endian** (previously documented as BE).
`0a 00` → LE int16 = 10 (correct noise floor); BE would give 2560 (wrong).
- Protocol reference §7.6 rewritten — split into §7.6.1 (Blast/Waveform mode) and §7.6.2
(Noise/Histogram mode), each with confirmed field layouts and open questions noted.
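
The endianness correction above is easy to check by hand. A small sketch using only the standard library, with the two bytes taken from the note above:

```python
raw = bytes.fromhex("0a00")

# Same two bytes, interpreted both ways as a signed 16-bit integer.
le = int.from_bytes(raw, "little", signed=True)
be = int.from_bytes(raw, "big", signed=True)

print(le, be)  # 10 2560
```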
---
## v0.6.0 — 2026-04-02
### Added

CLAUDE.md Normal file

@@ -0,0 +1,260 @@
# CLAUDE.md — seismo-relay
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
(Sierra Wireless RV50 / RV55). Current version: **v0.6.0**.
---
## Project layout
```
minimateplus/ ← Python client library (primary focus)
transport.py ← SerialTransport, TcpTransport
framing.py ← DLE codec, frame builders, S3FrameParser
protocol.py ← MiniMateProtocol — wire-level read/write methods
client.py ← MiniMateClient — high-level API (connect, get_events, …)
models.py ← DeviceInfo, EventRecord, ComplianceConfig, …
sfm/server.py ← FastAPI REST server exposing device data over HTTP
seismo_lab.py ← Tkinter GUI (Bridge + Analyzer + Console tabs)
docs/
instantel_protocol_reference.md ← reverse-engineered protocol spec ("the Rosetta Stone")
CHANGELOG.md ← version history
```
---
## Current implementation state (v0.6.0)
Full read pipeline working end-to-end over TCP/cellular:
| Step | SUB | Status |
|---|---|---|
| POLL / startup handshake | 5B | ✅ |
| Serial number | 15 | ✅ |
| Full config (firmware, calibration date, etc.) | FE | ✅ |
| Compliance config (record time, sample rate, geo thresholds) | 1A | ✅ |
| Event index | 08 | ✅ |
| Event header / first key | 1E | ✅ |
| Waveform header | 0A | ✅ |
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ **new v0.6.0** |
| Event advance / next key | 1F | ✅ |
| Write commands (push config to device) | 68–83 | ❌ not yet implemented |
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
---
## Protocol fundamentals
### DLE framing
```
BW→S3 (our requests): [ACK=0x41] [STX=0x02] [stuffed payload+chk] [ETX=0x03]
S3→BW (device replies): [DLE=0x10] [STX=0x02] [stuffed payload+chk] [bare ETX=0x03]
```
- **DLE stuffing rule:** any literal `0x10` byte in the payload is doubled on the wire
(`0x10` → `0x10 0x10`). This includes the checksum byte.
- **Inner-frame terminators:** large S3 responses (A4, E5) contain embedded sub-frames
using `DLE+ETX` as inner terminators. The outer parser treats `DLE+ETX` inside a frame
as literal data — the bare ETX is the ONLY real frame terminator.
- **Response SUB rule:** `response_SUB = 0xFF - request_SUB`
(one known exception: SUB `1C` → response `6E`, not `0xE3`)
- **Two-step read pattern:** every read command is sent twice — probe step (`offset=0x00`,
get length) then data step (`offset=DATA_LENGTH`, get payload). All data lengths are
hardcoded constants, not read from the probe response.
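
The stuffing rule and the response SUB rule above can be sketched in a few lines. The helper names below are hypothetical and are not the `framing.py` API; the unstuffer is deliberately simplified and does not deal with frame delimiters or inner `DLE+ETX` terminators.

```python
DLE = 0x10


def dle_stuff(payload: bytes) -> bytes:
    """Double every literal 0x10 byte for transmission."""
    out = bytearray()
    for b in payload:
        out.append(b)
        if b == DLE:
            out.append(DLE)
    return bytes(out)


def dle_unstuff(wire: bytes) -> bytes:
    """Collapse each 0x10 0x10 pair back to a single 0x10."""
    out = bytearray()
    i = 0
    while i < len(wire):
        out.append(wire[i])
        if wire[i] == DLE and i + 1 < len(wire) and wire[i + 1] == DLE:
            i += 1  # skip the doubled byte
        i += 1
    return bytes(out)


def response_sub(request_sub: int) -> int:
    """Expected response SUB for a request SUB, with the one known exception."""
    if request_sub == 0x1C:
        return 0x6E
    return 0xFF - request_sub
```

For example, `response_sub(0x5A)` gives `0xA5`, which is why the bulk waveform replies are referred to as A5 frames.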
### De-stuffed payload header
```
BW→S3 (request):
[0] CMD 0x10
[1] flags 0x00
[2] SUB command byte
[3] 0x00 always zero
[4] 0x00 always zero
[5] OFFSET 0x00 for probe step; DATA_LENGTH for data step
[6-15] params (key, token, etc. — see helpers in framing.py)
S3→BW (response):
[0] CMD 0x00
[1] flags 0x10
[2] SUB response sub byte
[3] PAGE_HI
[4] PAGE_LO
[5+] data
```
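
A minimal sketch of splitting a de-stuffed S3 response according to the layout above. `ResponseHeader` and `parse_response` are hypothetical names, and combining `PAGE_HI`/`PAGE_LO` into one 16-bit value is an assumption based only on the field names.

```python
from typing import NamedTuple


class ResponseHeader(NamedTuple):
    cmd: int
    flags: int
    sub: int
    page: int    # assumed: (PAGE_HI << 8) | PAGE_LO
    data: bytes


def parse_response(payload: bytes) -> ResponseHeader:
    """Split a de-stuffed response payload into header fields and data."""
    if len(payload) < 5:
        raise ValueError("payload shorter than the 5-byte header")
    page = (payload[3] << 8) | payload[4]
    return ResponseHeader(payload[0], payload[1], payload[2], page, payload[5:])
```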
---
## Critical protocol gotchas (hard-won — do not re-derive)
### SUB 5A — bulk waveform stream — NON-STANDARD frame format
**Always use `build_5a_frame()` for SUB 5A. Never use `build_bw_frame()` for SUB 5A.**
`build_bw_frame` produces WRONG output for 5A for two reasons:
1. **`offset_hi = 0x10` must NOT be DLE-stuffed.** Blastware sends the offset field raw.
`build_bw_frame` would stuff it to `10 10` on the wire — the device silently ignores
the frame. `build_5a_frame` writes it as a bare `10`.
2. **DLE-aware checksum.** When computing the checksum, `10 XX` pairs in the stuffed
section contribute only `XX` to the running sum; lone bytes contribute normally. This
differs from the standard SUM8-of-destuffed-payload that all other commands use.
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
BW TX capture. All 10 frames verified.
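
The difference between the two checksum styles can be illustrated as below. This is one literal reading of the rule stated above, not the `build_5a_frame()` implementation: the function names are hypothetical, and which bytes of the frame are fed to the checksum is not specified here, so the input range is left to the caller.

```python
def sum8(destuffed: bytes) -> int:
    """Standard checksum: 8-bit sum over the de-stuffed payload."""
    return sum(destuffed) & 0xFF


def dle_aware_sum8(stuffed: bytes) -> int:
    """SUB 5A style: a `10 XX` pair contributes only XX to the sum."""
    total = 0
    i = 0
    while i < len(stuffed):
        if stuffed[i] == 0x10 and i + 1 < len(stuffed):
            total += stuffed[i + 1]  # count XX, ignore the leading 0x10
            i += 2
        else:
            total += stuffed[i]      # lone byte contributes normally
            i += 1
    return total & 0xFF
```

With the bytes `10 00 5A`, the standard sum is `0x6A` while the DLE-aware sum is `0x5A`, which shows how a bare `0x10` offset byte changes the result.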
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
`bulk_waveform_params()` returns 11 bytes (extra trailing `0x00`). The 11th byte was
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
Do not swap them.
### SUB 5A — event-time metadata lives in A5 frame 7
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
setup as it existed when the event was recorded:
```
"Project:" → project description
"Client:" → client name ← NOT in the 0C record
"User Name:" → operator name ← NOT in the 0C record
"Seis Loc:" → sensor location ← NOT in the 0C record
"Extended Notes"→ notes
```
These strings are **NOT** present in the 210-byte SUB 0C waveform record. They reflect
the setup at record time, not the current device config — this is why we fetch them from
5A instead of backfilling from the current compliance config.
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
then sends the termination frame.
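
The early-stop behaviour described above amounts to a loop that breaks once the marker shows up. A minimal sketch, assuming frames arrive as an iterable of payload bytes; `collect_until_metadata` is a hypothetical helper, not the protocol method itself, and sending the termination frame is left out.

```python
from typing import Iterable


def collect_until_metadata(
    frames: Iterable[bytes], marker: bytes = b"Project:"
) -> list[bytes]:
    """Collect frames up to and including the first one containing `marker`."""
    collected: list[bytes] = []
    for frame in frames:
        collected.append(frame)
        if marker in frame:
            break  # metadata frame reached; caller would now send the terminator
    return collected
```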
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
- Frame A is a probe (no `recv_one` needed — device ACKs but returns no data page)
- Frames B, C, D each need a `recv_one` to collect the response
**There must be NO extra `self._send(...)` call before the B/C/D recv loop without a
matching `recv_one()`.** An orphaned send shifts all receives one step behind, leaving
frame D's channel block (trigger_level_geo, alarm_level_geo, max_range_geo) unread and
producing only ~1071 bytes instead of ~2126.
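
The off-by-one effect of an orphaned send can be reproduced with a toy model. Everything below is hypothetical (`FakeDevice`, `read_pages`) and assumes a device that queues exactly one reply per send; it only illustrates why frame D goes unread.

```python
from collections import deque


class FakeDevice:
    """Toy device: every send queues one reply, recv_one pops the oldest."""

    def __init__(self) -> None:
        self._pending: deque[str] = deque()

    def send(self, name: str) -> None:
        self._pending.append(f"reply-to-{name}")

    def recv_one(self) -> str:
        return self._pending.popleft()


def read_pages(dev: FakeDevice, orphaned_send: bool) -> list[str]:
    if orphaned_send:
        dev.send("extra")  # the bug: a send with no matching recv_one()
    replies = []
    for name in ("B", "C", "D"):
        dev.send(name)
        replies.append(dev.recv_one())
    return replies
```

With `orphaned_send=True` the three receives return the replies to `extra`, `B`, and `C`, and the reply to `D` is never read.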
### SUB 1A — anchor search range
`_decode_compliance_config_into()` locates sample_rate and record_time via the anchor
`b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`. Search range is `cfg[0:150]`.
Do not narrow this to `cfg[40:100]` — the old range was only accidentally correct because
the orphaned-send bug was prepending a 44-byte spurious header, pushing the anchor from
its real position (cfg[11]) into the 40–100 window.
### Sample rate and DLE jitter in cfg data
Sample rate 4096 (`0x1000`) causes DLE jitter: the frame carries `10 10 00` on the wire,
which unstuffs to `10 00` — 2 bytes instead of 3. This makes frame C 1 byte shorter and
shifts all subsequent absolute offsets by 1. The anchor approach is immune to this.
Do NOT use fixed absolute offsets for sample_rate or record_time.
### TCP / cellular transport
- Protocol bytes over TCP are bit-for-bit identical to RS-232. No wrapping.
- The modem (RV50/RV55) forwards bytes with up to ~1s buffering. `TcpTransport` uses
`read_until_idle(idle_gap=1.5s)` to drain the buffer completely before parsing.
- Cold-boot: unit sends the 16-byte ASCII string `"Operating System"` before entering
DLE-framed mode. The parser discards it (scans for DLE+STX).
- RV50/RV55 sends `\r\nRING\r\n\r\nCONNECT\r\n` over TCP to the caller even with
Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to
`S3FrameParser`.
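
Discarding pre-frame noise such as the boot banner or modem result codes comes down to scanning for the first `DLE+STX`. A minimal sketch; `skip_to_frame_start` is a hypothetical helper and not part of `S3FrameParser`.

```python
def skip_to_frame_start(buf: bytes) -> bytes:
    """Drop everything before the first DLE+STX (0x10 0x02) pair."""
    idx = buf.find(b"\x10\x02")
    return buf[idx:] if idx >= 0 else b""
```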
### Required ACEmanager settings (Sierra Wireless RV50/RV55)
| Setting | Value | Why |
|---|---|---|
| Configure Serial Port | `38400,8N1` | Must match MiniMate baud |
| Flow Control | `None` | Hardware FC blocks TX if pins unconnected |
| **Quiet Mode** | **Enable** | **Critical.** Disabled injects `RING`/`CONNECT` onto serial, corrupting S3 handshake |
| Data Forwarding Timeout | `1` (= 0.1 s) | Lower latency |
| TCP Connect Response Delay | `0` | Non-zero silently drops first POLL frame |
| TCP Idle Timeout | `2` (minutes) | Prevents premature disconnect |
| DB9 Serial Echo | `Disable` | Echo corrupts the data stream |
---
## Key confirmed field locations
### SUB FE — Full Config (166 destuffed bytes)
| Offset | Field | Type | Notes |
|---|---|---|---|
| 0x34 | firmware version string | ASCII | e.g. `"S338.17"` |
| 0x56–0x57 | calibration year | uint16 BE | `0x07E9` = 2025 |
| 0x0109 | aux trigger enabled | uint8 | `0x00` = off, `0x01` = on |
### SUB 1A — Compliance Config (~2126 bytes total after 4-frame sequence)
| Field | How to find it |
|---|---|
| sample_rate | uint16 BE at anchor - 2 |
| record_time | float32 BE at anchor + 10 |
| trigger_level_geo | float32 BE, located in channel block |
| alarm_level_geo | float32 BE, adjacent to trigger_level_geo |
| max_range_geo | float32 BE, adjacent to alarm_level_geo |
| setup_name | ASCII, null-padded, in cfg body |
| project / client / operator / sensor_location | ASCII, label-value pairs |
Anchor: `b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`, search `cfg[0:150]`
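
A sketch of the anchor-relative lookup, assuming `sample_rate` sits in the 2 bytes immediately before the anchor and `record_time` in the 4 bytes immediately after it (anchor + 10). `find_rate_and_time` is a hypothetical name, not the `_decode_compliance_config_into()` implementation.

```python
import struct

ANCHOR = b"\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00"


def find_rate_and_time(cfg: bytes):
    """Return (sample_rate, record_time), or None if the anchor is not usable."""
    pos = cfg.find(ANCHOR, 0, 150)  # search window cfg[0:150]
    if pos < 2 or pos + 14 > len(cfg):
        return None
    sample_rate = struct.unpack_from(">H", cfg, pos - 2)[0]   # assumed: anchor - 2
    record_time = struct.unpack_from(">f", cfg, pos + 10)[0]  # anchor + 10
    return sample_rate, record_time
```

Because both fields are read relative to the anchor, a 1-byte shift in everything before it leaves the result unchanged.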
### SUB 0C — Waveform Record (210 bytes = data[11:11+0xD2])
| Offset | Field | Type |
|---|---|---|
| 0 | day | uint8 |
| 1 | sub_code | uint8 (`0x10` = Waveform) |
| 2 | month | uint8 |
| 3–4 | year | uint16 BE |
| 5 | unknown | uint8 (always 0) |
| 6 | hour | uint8 |
| 7 | minute | uint8 |
| 8 | second | uint8 |
| 87 | peak_vector_sum | float32 BE |
| label+6 | PPV per channel | float32 BE (search for `"Tran"`, `"Vert"`, `"Long"`, `"MicL"`) |
PPV labels are NOT 4-byte aligned. The label-offset+6 approach is the only reliable method.
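
The timestamp fields and the label-relative PPV lookup can be sketched as follows. The function names are hypothetical, and "label+6" is read here as "6 bytes after the first byte of the label"; the test data is synthetic, not taken from a capture.

```python
import struct
from datetime import datetime


def parse_0c_timestamp(rec: bytes) -> datetime:
    """Build a datetime from the fixed-offset fields of the 0C record."""
    day, month = rec[0], rec[2]
    year = struct.unpack_from(">H", rec, 3)[0]
    return datetime(year, month, day, rec[6], rec[7], rec[8])


def find_ppv(rec: bytes, label: bytes):
    """Return the float32 BE value 6 bytes after the label start, or None."""
    pos = rec.find(label)
    if pos < 0 or pos + 10 > len(rec):
        return None
    return struct.unpack_from(">f", rec, pos + 6)[0]
```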
---
## SFM REST API (sfm/server.py)
```
GET /device/info?port=COM5 ← serial
GET /device/info?host=1.2.3.4&tcp_port=9034 ← cellular
GET /device/events?host=1.2.3.4&tcp_port=9034&baud=38400
GET /device/event?host=1.2.3.4&tcp_port=9034&index=0
```
Server retries once on `ProtocolError` for TCP connections (handles cold-boot timing).
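
A small sketch of building the query URLs listed above with the standard library. The base URL `http://localhost:8000` is an assumption (the server address is not stated here), and `build_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_url(base: str, path: str, **params) -> str:
    """Join a base URL, an endpoint path, and query parameters."""
    return f"{base}{path}?{urlencode(params)}"


# Cellular example from the endpoint list above (base URL assumed).
url = build_url("http://localhost:8000", "/device/info", host="1.2.3.4", tcp_port=9034)
print(url)
```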
---
## Key wire captures (reference material)
| Capture | Location | Contents |
|---|---|---|
| 1-2-26 | `bridges/captures/1-2-26/` | SUB 5A BW TX frames — used to confirm 5A frame format, 11-byte params, DLE-aware checksum |
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence |
---
## What's next
- Write commands (SUBs 68–83) — push compliance config, channel config, trigger settings to device
- ACH inbound server — accept call-home connections from field units
- Modem manager — push RV50/RV55 configs via Sierra Wireless API


@@ -77,6 +77,9 @@
| 2026-04-02 | §7.7.5 | **CONFIRMED — Event-time metadata source.** `Client:`, `User Name:`, and `Seis Loc:` strings are present in **A5 frame 7** of the SUB 5A bulk waveform stream — they are NOT in the 210-byte SUB 0C waveform record. They reflect the compliance setup active when the event was stored on the device (not the current setup). `get_events()` now issues SUB 5A after each 0C download. Sequence: `1E → 0A → 0C → 5A → 1F`. |
| 2026-04-02 | §7.6.2 | **FIXED — Compliance config orphaned send bug.** An extra `self._send(SUB_COMPLIANCE / 0x2A / DATA_PARAMS)` before the B/C/D receive loop had no corresponding `recv_one()`. Every receive in the loop was consuming the previous send's response, leaving frame D's channel block unread. Bug removed. Total config bytes now ~2126 (was ~1071 due to truncation). `trigger_level_geo`, `alarm_level_geo`, `max_range_geo` are now correctly populated. |
| 2026-04-02 | §7.6.1 | **CORRECTED — Anchor search range.** Previous doc stated anchor search range `cfg[40:100]`. With the orphaned-send bug fixed, the 44-byte header padding is gone and the anchor now appears at `cfg[11]`. Corrected to `cfg[0:150]`. |
| 2026-04-03 | §7.6 | **CONFIRMED — Blast waveform format (4-2-26 capture).** Blast/waveform-mode SUB 5A stream uses 4-channel interleaved signed int16 LE, 8 bytes per sample-set [T,V,L,M]. NOT the 32-byte block format (which is noise/histogram mode only). Frame sizes are NOT multiples of 8 — cross-frame alignment correction required (track global byte offset mod 8; skip `(8-align)%8` bytes at each frame start). A5[0] STRT record confirmed: 21 bytes at db[7:]+11; waveform starts at strt_pos+27 (after 2-byte null pad + 4-byte 0xFF sentinel). Frame index 7 = metadata only, no ADC data. Full §7.6 rewritten. |
| 2026-04-03 | §7.6 | **CONFIRMED — Noise block format details.** 32-byte blocks: LE uint16 type + LE uint16 ctr + 9×int16 LE samples + 10B metadata. Samples are little-endian (previous doc said big-endian — WRONG). Type: 0x0016=sync (appears at start of each A5 frame), 0x0000=data. Noise floor ≈ 9–11 counts. Metadata fixed pattern `00 01 43 [2B var] 00 [pretrig] [rectime] 00 00` confirmed. |
| 2026-04-03 | client.py | **NEW — `_decode_a5_waveform()` and `download_waveform()` implemented.** `_decode_a5_waveform(frames_data, event)` decodes full A5 waveform stream into `event.raw_samples = {"Tran":[…], "Vert":[…], "Long":[…], "Mic":[…]}`. Populates `event.total_samples`, `event.pretrig_samples`, `event.rectime_seconds` from STRT record. Handles cross-frame alignment. `MiniMateClient.download_waveform(event)` calls `read_bulk_waveform_stream(stop_after_metadata=False)` then invokes the decoder. Waveform key stored on Event as `_waveform_key` during `get_events()`. |
---
@@ -722,20 +725,110 @@ MicL: 39 64 1D AA = 0.0000875 psi
### 7.6 Bulk Waveform Stream (SUB A5) — Raw ADC Sample Records
Each repeating record (🔶 INFERRED structure):
**Two distinct formats exist depending on recording mode. Both confirmed from captures.**
---
#### 7.6.1 Blast / Waveform mode — ✅ CONFIRMED (4-2-26 capture)
4-channel interleaved signed 16-bit little-endian, 8 bytes per sample-set:
```
[CH_ID] [S0_HI] [S0_LO] [S1_HI] [S1_LO] ... [S8_HI] [S8_LO] [00 00] [01] [PEAK × 3 bytes]
01 00 0A 00 0B 43 xx xx
[T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N sample-sets
```
- `CH_ID` — Channel identifier. `01` consistently observed. Full mapping unknown. 🔶 INFERRED
- 9× signed 16-bit big-endian ADC samples. Noise floor ≈ `0x000A`–`0x000B`
- `00 00` — separator / padding
- `01` — unknown flag byte
- 3-byte partial IEEE 754 float — peak value for this sample window. `0x43` prefix = range 130–260
- **T** = Transverse (Tran), **V** = Vertical (Vert), **L** = Longitudinal (Long), **M** = Microphone
- Channel order follows the Blastware convention: Tran is always first (ch[0]).
- Encoding: signed int16 little-endian. Full scale = ±32768 counts.
- Sample rate: set by compliance config (typical: 1024 Hz for blast monitoring).
- Each A5 frame chunk carries a different number of waveform bytes. Frame sizes
are NOT multiples of 8, so naive concatenation scrambles channel assignments at
frame boundaries. **Always track cumulative byte offset mod 8 to correct alignment.**
> ❓ SPECULATIVE: At 1024 sps, 9 samples ≈ 8.8ms per record. Sample rate unconfirmed from captured data alone.
**A5[0] frame layout:**
```
db[7:]: [11-byte header] [21-byte STRT record] [6-byte preamble] [waveform ...]
STRT: offset 11 in db[7:]
+0..3 b'STRT' magic
+8..9 uint16 BE total_samples (full-record expected sample-set count)
+16..17 uint16 BE pretrig_samples (pre-trigger window, in sample-sets)
+18 uint8 rectime_seconds
preamble: +21..22 0x00 0x00 null padding
+23..26 0xFF × 4 synchronisation sentinel
Waveform: starts at strt_pos + 27 within db[7:]
```
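
The A5[0] layout above can be exercised with a short parser. This is a sketch that follows the documented offsets; `parse_strt` is a hypothetical name, and the input is assumed to be `db[7:]` of the first A5 frame.

```python
import struct


def parse_strt(w0: bytes):
    """Extract STRT fields and the first waveform chunk from db[7:] of A5[0]."""
    strt_pos = w0.find(b"STRT")
    if strt_pos < 0 or len(w0) < strt_pos + 27:
        return None
    total_samples = struct.unpack_from(">H", w0, strt_pos + 8)[0]
    pretrig_samples = struct.unpack_from(">H", w0, strt_pos + 16)[0]
    rectime_seconds = w0[strt_pos + 18]
    return {
        "total_samples": total_samples,
        "pretrig_samples": pretrig_samples,
        "rectime_seconds": rectime_seconds,
        # 21-byte STRT record + 6-byte preamble = 27 bytes before the waveform
        "waveform": w0[strt_pos + 27:],
    }
```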
**A5[1..N] frame layout (non-metadata frames):**
```
db[7:]: [8-byte per-frame header] [waveform ...]
Header: [counter LE uint16, 0x00 × 6] — frame sequence counter (0, 8, 12, 16, 20, …×0x400)
Waveform: starts at byte 8 of db[7:]
```
**Special frames:**
| Frame index | Contents |
|---|---|
| A5[0] | Probe response: STRT record + first waveform chunk |
| A5[7] | Event-time metadata strings only (no waveform data) |
| A5[9] | Terminator frame (page_key=0x0000) — ignored |
| A5[1..6,8] | Waveform chunks |
**Confirmed from 4-2-26 blast capture (total_samples=9306, pretrig=298, rate=1024 Hz):**
```
Frame Waveform bytes Cumulative Align(mod 8)
A5[0] 933B 933B 0
A5[1] 963B 1896B 5
A5[2] 946B 2842B 0
A5[3] 960B 3802B 2
A5[4] 952B 4754B 2
A5[5] 946B 5700B 2
A5[6] 941B 6641B 4
A5[8] 992B 7633B 1
Total: 7633B → 954 naive sample-sets, 948 alignment-corrected
```
Only 948 of 9306 sample-sets captured (10%) — `stop_after_metadata=True` terminated
download after A5[7] was received.
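
The alignment column and both sample-set totals in the table above can be recomputed from the frame sizes alone. A minimal sketch; `alignment_report` is a hypothetical helper that applies the "skip `(8 - align) % 8` leading bytes" rule to byte counts only.

```python
# Waveform byte counts for A5[0..6] and A5[8], from the 4-2-26 table above.
FRAME_SIZES = [933, 963, 946, 960, 952, 946, 941, 992]


def alignment_report(sizes):
    """Return (start alignments, naive sample-sets, alignment-corrected sample-sets)."""
    offset = 0
    aligns = []
    corrected = 0
    for size in sizes:
        align = offset % 8          # position within the T,V,L,M cycle
        skip = (8 - align) % 8      # bytes discarded to reach the next T boundary
        aligns.append(align)
        if skip < size:
            corrected += (size - skip) // 8
        offset += size
    return aligns, offset // 8, corrected


aligns, naive, corrected = alignment_report(FRAME_SIZES)
print(aligns, naive, corrected)  # [0, 5, 0, 2, 2, 2, 4, 1] 954 948
```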
**Channel identification note:** The 4-2-26 blast saturated all four geophone channels
to near-maximum ADC output (~32000–32617 counts). Channel ordering [Tran, Vert, Long, Mic]
= [ch0, ch1, ch2, ch3] is the Blastware convention and is consistent with per-channel PPV
values (Tran=0.420, Vert=3.870, Long=0.495 in/s from 0C record), but cannot be
independently confirmed from a fully-saturating event alone.
---
#### 7.6.2 Noise monitoring / Histogram mode — ✅ CONFIRMED (3-31-26 capture)
32-byte blocks with the following layout:
```
Offset Size Type Description
0 2 uint16 LE block type: 0x0016=sync, 0x0000=data
2 2 uint16 LE block counter (ctr)
4 18 int16 LE × 9 ADC samples
22 10 bytes metadata: [00 01 43 VAR VAR 00 pretrig rectime 00 00]
```
- Sync blocks (type=0x0016) appear at the start of each A5 frame; ctr=0 in sync blocks.
- Data blocks (type=0x0000) carry actual sample data. First data block ctr=288 (empirical,
not yet decoded — likely related to a pre-trigger sample offset).
- Metadata fixed bytes: `00 01 43` then 2 variable bytes, then `00 [pretrig] [rectime] 00 00`.
Pretrig byte = 0x1E (30) and rectime byte = 0x0A (10) for the 3-31-26 capture.
- 9 samples per block (int16 LE, NOT big-endian). Noise floor ≈ 9–11 counts.
- **This is a different recording mode** from waveform/blast — the device firmware uses
32-byte blocks for histogram/noise monitoring and 4-channel continuous for waveform events.
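
The 32-byte block layout above maps directly onto a `struct` format string. A sketch under that layout; `parse_noise_block` is a hypothetical name, and the metadata byte positions follow the fixed pattern `00 01 43 VAR VAR 00 pretrig rectime 00 00`.

```python
import struct

# uint16 LE type, uint16 LE counter, 9 x int16 LE samples, 10 metadata bytes
BLOCK = struct.Struct("<HH9h10s")


def parse_noise_block(block: bytes) -> dict:
    """Decode one 32-byte noise/histogram block."""
    if len(block) != BLOCK.size:
        raise ValueError(f"expected {BLOCK.size} bytes, got {len(block)}")
    fields = BLOCK.unpack(block)
    meta = fields[11]
    return {
        "is_sync": fields[0] == 0x0016,
        "ctr": fields[1],
        "samples": list(fields[2:11]),
        "pretrig": meta[6],
        "rectime": meta[7],
    }
```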
> ❓ **Open:** The 9-sample-per-block structure does not divide evenly into 4 channels.
> Whether these represent a single channel, all channels in rotation, or downsampled
> aggregates is not yet determined. The first data block ctr=288 vs pretrig=30 is also
> unexplained — possibly counting in units other than sample-sets.
---
---


@@ -212,6 +212,7 @@ class MiniMateClient:
        while key4 != b"\x00\x00\x00\x00":
            log.info("get_events: record %d key=%s", idx, key4.hex())
            ev = Event(index=idx)
            ev._waveform_key = key4  # stored so download_waveform() can re-use it
            # First event: call 0A to verify it's a full record (0x30 length).
            # Subsequent keys come from 1F(0xFE) which guarantees full records,
@@ -280,6 +281,66 @@ class MiniMateClient:
        log.info("get_events: downloaded %d event(s)", len(events))
        return events

    def download_waveform(self, event: Event) -> None:
        """
        Download the full raw ADC waveform for a previously-retrieved event
        and populate event.raw_samples, event.total_samples,
        event.pretrig_samples, and event.rectime_seconds.

        This performs a complete SUB 5A (BULK_WAVEFORM_STREAM) download with
        stop_after_metadata=False, fetching all waveform frames (typically 9
        large A5 frames for a standard blast record). The download is large
        (up to several hundred KB for a 9-second, 4-channel, 1024-Hz record)
        and is intentionally not performed by get_events() by default.

        Args:
            event: An Event object returned by get_events(). Must carry the
                   4-byte waveform key that get_events() stores in
                   event._waveform_key.

        Raises:
            ValueError: if the event does not have a waveform key available.
            RuntimeError: if the client is not connected.
            ProtocolError: on communication failure.

        Confirmed format (4-2-26 blast capture, ✅):
            4-channel interleaved signed 16-bit LE, 8 bytes per sample-set.
            Total samples: 9306 (≈9.1 s at 1024 Hz), pretrig: 298 (≈0.29 s).
            Channel order: Tran, Vert, Long, Mic (Blastware convention).
        """
        proto = self._require_proto()
        if event._waveform_key is None:
            raise ValueError(
                f"Event#{event.index} has no waveform key — "
                "was it retrieved via get_events()?"
            )
        log.info(
            "download_waveform: starting full 5A download for event#%d (key=%s)",
            event.index, event._waveform_key.hex(),
        )
        a5_frames = proto.read_bulk_waveform_stream(
            event._waveform_key, stop_after_metadata=False
        )
        log.info(
            "download_waveform: received %d A5 frames; decoding waveform",
            len(a5_frames),
        )
        _decode_a5_waveform(a5_frames, event)
        if event.raw_samples is not None:
            n = len(event.raw_samples.get("Tran", []))
            log.info(
                "download_waveform: decoded %d sample-sets across 4 channels",
                n,
            )
        else:
            log.warning("download_waveform: waveform decode produced no samples")

    # ── Internal helpers ──────────────────────────────────────────────────────

    def _require_proto(self) -> MiniMateProtocol:
@@ -543,6 +604,203 @@ def _decode_a5_metadata_into(frames_data: list[bytes], event: Event) -> None:
)
def _decode_a5_waveform(
    frames_data: list[bytes],
    event: Event,
) -> None:
    """
    Decode the raw 4-channel ADC waveform from a complete set of SUB 5A
    (BULK_WAVEFORM_STREAM) frame payloads and populate event.raw_samples,
    event.total_samples, event.pretrig_samples, and event.rectime_seconds.

    This requires ALL A5 frames (stop_after_metadata=False), not just the
    metadata-bearing subset.

    ── Waveform format (confirmed from 4-2-26 blast capture) ───────────────────

    The blast waveform is 4-channel interleaved signed 16-bit little-endian,
    8 bytes per sample-set:

        [T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N

    where T=Tran, V=Vert, L=Long, M=Mic. Channel ordering follows the
    Blastware convention [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].

    ⚠️ Channel ordering is a confirmed CONVENTION — the physical ordering on
    the ADC mux is not independently verifiable from the saturating blast
    captures we have. The convention is consistent with Blastware labeling
    (Tran is always the first channel field in the A5 STRT+waveform stream).

    ── Frame structure ──────────────────────────────────────────────────────────

    A5[0] (probe response):
        db[7:] = [11-byte header] [21-byte STRT record] [6-byte preamble] [waveform ...]
        STRT: b'STRT' at offset 11, total 21 bytes
            +8  uint16 BE: total_samples (expected full-record sample-sets)
            +16 uint16 BE: pretrig_samples (pre-trigger sample count)
            +18 uint8: rectime_seconds (record duration)
        Preamble: 6 bytes after the STRT record (confirmed from 4-2-26 blast capture):
            bytes 21-22: 0x00 0x00 (null padding)
            bytes 23-26: 0xFF × 4 (sync sentinel / alignment marker)
        Waveform starts at strt_pos + 27 within db[7:].

    A5[1..N] (chunk responses):
        db[7:] = [8-byte per-frame header] [waveform bytes ...]
        Header: [ctr LE uint16, 0x00 × 6] — frame sequence counter
        Waveform starts at byte 8 of db[7:].

    ── Cross-frame alignment ────────────────────────────────────────────────────

    Frame waveform chunk sizes are NOT multiples of 8. Naive concatenation
    scrambles channel assignments at frame boundaries. Fix: track the
    cumulative global byte offset; at each new frame, the starting alignment
    within the T,V,L,M cycle is (global_offset % 8).

    Confirmed sizes from 4-2-26 (A5[0..8], skipping A5[7] metadata frame
    and A5[9] terminator):
        Frame 0: 933B  Frame 1: 963B  Frame 2: 946B  Frame 3: 960B
        Frame 4: 952B  Frame 5: 946B  Frame 6: 941B  Frame 8: 992B
    — none are multiples of 8.

    ── Modifies event in-place. ─────────────────────────────────────────────────
    """
    if not frames_data:
        log.debug("_decode_a5_waveform: no frames provided")
        return

    # ── Parse STRT record from A5[0] ────────────────────────────────────────
    w0 = frames_data[0][7:]  # db[7:] for A5[0]
    strt_pos = w0.find(b"STRT")
    if strt_pos < 0:
        log.warning("_decode_a5_waveform: STRT record not found in A5[0]")
        return

    # STRT record layout (21 bytes, offsets relative to b'STRT'):
    #   +0..3   magic b'STRT'
    #   +8..9   uint16 BE total_samples (full-record expected sample-set count)
    #   +16..17 uint16 BE pretrig_samples
    #   +18     uint8 rectime_seconds
    strt = w0[strt_pos : strt_pos + 21]
    if len(strt) < 21:
        log.warning("_decode_a5_waveform: STRT record truncated (%dB)", len(strt))
        return
    total_samples = struct.unpack_from(">H", strt, 8)[0]
    pretrig_samples = struct.unpack_from(">H", strt, 16)[0]
    rectime_seconds = strt[18]
    event.total_samples = total_samples
    event.pretrig_samples = pretrig_samples
    event.rectime_seconds = rectime_seconds
    log.debug(
        "_decode_a5_waveform: STRT total_samples=%d pretrig=%d rectime=%ds",
        total_samples, pretrig_samples, rectime_seconds,
    )

    # ── Collect per-frame waveform bytes with global offset tracking ─────────
    # global_offset is the cumulative byte count across all frames, used to
    # compute the channel alignment at each frame boundary.
    chunks: list[tuple[int, bytes]] = []  # (frame_idx, waveform_bytes)
    global_offset = 0
    for fi, db in enumerate(frames_data):
        w = db[7:]
        # A5[0]: waveform begins after the 21-byte STRT record and 6-byte preamble.
        # Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
        if fi == 0:
            sp = w.find(b"STRT")
            if sp < 0:
                continue
            wave = w[sp + 27 :]
        # Frame 7 carries event-time metadata strings ("Project:", "Client:", …)
        # and no waveform ADC data.
        elif fi == 7:
            continue
        # A5[9] is the device terminator frame (page_key=0x0000), also no data.
        elif fi == 9:
            continue
        else:
            # Strip the 8-byte per-frame header (ctr + 6 zero bytes)
            if len(w) < 8:
                continue
            wave = w[8:]
        if len(wave) < 2:
            continue
        chunks.append((fi, wave))
        global_offset += len(wave)

    total_bytes = global_offset
    n_sets = total_bytes // 8
    log.debug(
        "_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets "
        "(%d of %d expected; %.0f%%)",
        len(chunks), total_bytes, n_sets, n_sets, total_samples,
        100.0 * n_sets / total_samples if total_samples else 0,
    )
    if n_sets == 0:
        log.warning("_decode_a5_waveform: no complete sample-sets found")
        return

    # ── Decode with per-frame alignment correction ───────────────────────────
    # Rather than concatenating and then fixing up, we reconstruct the correct
    # channel-aligned stream by skipping misaligned partial sample-sets at each
    # frame start.
    #
    # At global byte offset G, the byte position within the T,V,L,M cycle is
    # G % 8. When a frame starts with align = G % 8 ≠ 0, the first
    # (8 - align) bytes of that frame complete a partial sample-set that
    # cannot be decoded cleanly, so we skip them and start from the next full
    # T-boundary.
    #
    # This produces a slightly smaller decoded set but preserves correct
    # channel alignment throughout.
    tran: list[int] = []
    vert: list[int] = []
    long_: list[int] = []
    mic: list[int] = []
    running_offset = 0
    for fi, wave in chunks:
        align = running_offset % 8  # byte position within T,V,L,M cycle
        skip = (8 - align) % 8      # bytes to discard to reach next T start
        if skip > 0 and skip < len(wave):
            usable = wave[skip:]
        elif align == 0:
            usable = wave
        else:
            running_offset += len(wave)
            continue  # entire frame is a partial sample-set
        n_usable = len(usable) // 8
        for i in range(n_usable):
            off = i * 8
            tran.append( struct.unpack_from("<h", usable, off)[0])
            vert.append( struct.unpack_from("<h", usable, off + 2)[0])
            long_.append(struct.unpack_from("<h", usable, off + 4)[0])
            mic.append(  struct.unpack_from("<h", usable, off + 6)[0])
        running_offset += len(wave)

    log.debug(
        "_decode_a5_waveform: decoded %d alignment-corrected sample-sets "
        "(skipped %d due to frame boundary misalignment)",
        len(tran), n_sets - len(tran),
    )
    event.raw_samples = {
        "Tran": tran,
        "Vert": vert,
        "Long": long_,
        "Mic": mic,
    }


def _extract_record_type(data: bytes) -> Optional[str]:
    """
    Decode the recording mode from byte[1] of the 210-byte waveform record.


@@ -327,12 +327,19 @@ class Event:
    # Raw ADC samples keyed by channel label. Not fetched unless explicitly
    # requested (large data transfer — up to several MB per event).
    raw_samples: Optional[dict] = None      # {"Tran": [...], "Vert": [...], ...}
    total_samples: Optional[int] = None     # from STRT record: expected total sample-sets
    pretrig_samples: Optional[int] = None   # from STRT record: pre-trigger sample count
    rectime_seconds: Optional[int] = None   # from STRT record: record duration (seconds)

    # ── Debug / introspection ─────────────────────────────────────────────────
    # Raw 210-byte waveform record bytes, set when debug mode is active.
    # Exposed by the SFM server via ?debug=true so field layouts can be verified.
    _raw_record: Optional[bytes] = field(default=None, repr=False)

    # 4-byte waveform key used to request this event via SUB 5A.
    # Set by get_events(); required by download_waveform().
    _waveform_key: Optional[bytes] = field(default=None, repr=False)

    def __str__(self) -> str:
        ts = str(self.timestamp) if self.timestamp else "no timestamp"
        ppv = ""