claude 9bef430451 docs: document 5A end-of-stream signal, chunk timing, fi==9 bug, ADC conversion
Adds §7.8.4 to protocol reference and corresponding CLAUDE.md sections:

- End-of-stream: device sends exactly 1 raw byte after last chunk; handled
  via TimeoutError + bytes_fed>0 check → graceful break to termination
- Chunk timing: ~1s per chunk, 35 chunks for a 9,306-sample event, safe
  timeout is 10s (not default 120s)
- fi==9 decoder bug: hardcoded skip drops ~133 sample-sets per event;
  noted as known issue pending fix
- ADC conversion: counts × (range/32767) → physical units (in/s for geo)

Changelog entries added for all four items (2026-04-06).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:28:26 -04:00


CLAUDE.md — seismo-relay

Ground-up Python replacement for Blastware, Instantel's Windows-only software for managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55). Current version: v0.7.0.


Project layout

minimateplus/         ← Python client library (primary focus)
  transport.py        ←   SerialTransport, TcpTransport
  framing.py          ←   DLE codec, frame builders, S3FrameParser
  protocol.py         ←   MiniMateProtocol — wire-level read/write methods
  client.py           ←   MiniMateClient — high-level API (connect, get_events, …)
  models.py           ←   DeviceInfo, EventRecord, ComplianceConfig, …

sfm/server.py         ← FastAPI REST server exposing device data over HTTP
seismo_lab.py         ← Tkinter GUI (Bridge + Analyzer + Console tabs)
docs/
  instantel_protocol_reference.md  ← reverse-engineered protocol spec ("the Rosetta Stone")
CHANGELOG.md          ← version history

Current implementation state (v0.7.0)

Full read pipeline working end-to-end over TCP/cellular:

| Step | SUB | Status |
|---|---|---|
| POLL / startup handshake | 5B | |
| Serial number | 15 | |
| Full config (firmware, calibration date, etc.) | FE | |
| Compliance config (record time, sample rate, geo thresholds) | 1A | |
| Event index | 08 | |
| Event header / first key | 1E | |
| Waveform header | 0A | |
| Waveform record (peaks, timestamp, project) | 0C | |
| Bulk waveform stream (event-time metadata) | 5A | new v0.6.0 |
| Event advance / next key | 1F | |
| Write commands (push config to device) | 68–83 | not yet implemented |

get_events() sequence per event: 1E → 0A → 0C → 5A → 1F


Protocol fundamentals

DLE framing

BW→S3 (our requests):   [ACK=0x41] [STX=0x02] [stuffed payload+chk] [ETX=0x03]
S3→BW (device replies): [DLE=0x10] [STX=0x02] [stuffed payload+chk] [bare ETX=0x03]
  • DLE stuffing rule: any literal 0x10 byte in the payload is doubled on the wire (0x10 → 0x10 0x10). This includes the checksum byte.
  • Inner-frame terminators: large S3 responses (A4, E5) contain embedded sub-frames using DLE+ETX as inner terminators. The outer parser treats DLE+ETX inside a frame as literal data — the bare ETX is the ONLY real frame terminator.
  • Response SUB rule: response_SUB = 0xFF - request_SUB (one known exception: SUB 1C → response 6E, not 0xE3)
  • Two-step read pattern: every read command is sent twice — probe step (offset=0x00, get length) then data step (offset=DATA_LENGTH, get payload). All data lengths are hardcoded constants, not read from the probe response.
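The stuffing rule and the response-SUB rule above can be sketched as follows. This is a minimal illustration, not the code in framing.py; the function names are hypothetical.

```python
DLE = 0x10


def dle_stuff(data: bytes) -> bytes:
    """Double every literal 0x10 so it survives the wire."""
    out = bytearray()
    for b in data:
        out.append(b)
        if b == DLE:
            out.append(DLE)
    return bytes(out)


def dle_unstuff(data: bytes) -> bytes:
    """Collapse each 0x10 0x10 pair back to a single 0x10.

    A 0x10 followed by any other byte is kept as literal data.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == DLE and i + 1 < len(data) and data[i + 1] == DLE:
            i += 1  # skip the duplicate
        i += 1
    return bytes(out)


def response_sub(request_sub: int) -> int:
    """response_SUB = 0xFF - request_SUB, with the one known exception."""
    if request_sub == 0x1C:
        return 0x6E
    return 0xFF - request_sub
```

For example, response_sub(0x5A) gives 0xA5, which matches the A5 frames returned by the bulk stream.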

De-stuffed payload header

BW→S3 (request):
  [0] CMD    0x10
  [1] flags  0x00
  [2] SUB    command byte
  [3] 0x00   always zero
  [4] 0x00   always zero
  [5] OFFSET 0x00 for probe step; DATA_LENGTH for data step
  [6-15]     params (key, token, etc. — see helpers in framing.py)

S3→BW (response):
  [0] CMD    0x00
  [1] flags  0x10
  [2] SUB    response sub byte
  [3] PAGE_HI
  [4] PAGE_LO
  [5+] data
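A minimal sketch of the two header layouts above, assuming a standard (non-5A) command with a single-byte OFFSET and exactly 10 param bytes. Both helper names are hypothetical and do not come from framing.py.

```python
def build_request_payload(sub: int, offset: int, params: bytes = bytes(10)) -> bytes:
    """Assemble the 16-byte de-stuffed BW->S3 payload (before checksum and stuffing)."""
    if len(params) != 10:
        raise ValueError("params must be exactly 10 bytes (payload bytes 6-15)")
    # [0] CMD, [1] flags, [2] SUB, [3] zero, [4] zero, [5] OFFSET
    return bytes([0x10, 0x00, sub, 0x00, 0x00, offset]) + params


def parse_response_header(payload: bytes) -> dict:
    """Split a de-stuffed S3->BW payload into header fields and data."""
    if len(payload) < 5:
        raise ValueError("response payload shorter than the 5-byte header")
    return {
        "cmd": payload[0],      # expected 0x00
        "flags": payload[1],    # expected 0x10
        "sub": payload[2],      # response SUB
        "page_hi": payload[3],
        "page_lo": payload[4],
        "data": payload[5:],
    }
```

The probe step would pass offset=0x00 and the data step would pass the hardcoded DATA_LENGTH for that SUB.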

Critical protocol gotchas (hard-won — do not re-derive)

SUB 5A — bulk waveform stream — NON-STANDARD frame format

Always use build_5a_frame() for SUB 5A. Never use build_bw_frame() for SUB 5A.

build_bw_frame produces WRONG output for 5A for two reasons:

  1. offset_hi = 0x10 must NOT be DLE-stuffed. Blastware sends the offset field raw. build_bw_frame would stuff it to 10 10 on the wire — the device silently ignores the frame. build_5a_frame writes it as a bare 10.

  2. DLE-aware checksum. When computing the checksum, 10 XX pairs in the stuffed section contribute only XX to the running sum; lone bytes contribute normally. This differs from the standard SUM8-of-destuffed-payload that all other commands use.

Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26 BW TX capture. All 10 frames verified.
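The difference between the two checksum rules can be sketched like this. It assumes both are 8-bit modular sums and does not attempt to reproduce which exact bytes build_5a_frame() covers; the function names are illustrative only.

```python
def sum8(destuffed: bytes) -> int:
    """Standard rule: 8-bit sum over the de-stuffed payload."""
    return sum(destuffed) & 0xFF


def dle_aware_sum8(stuffed: bytes) -> int:
    """5A rule: a 0x10 XX pair contributes only XX; lone bytes contribute normally."""
    total = 0
    i = 0
    while i < len(stuffed):
        if stuffed[i] == 0x10 and i + 1 < len(stuffed):
            total += stuffed[i + 1]  # the 0x10 itself is not counted
            i += 2
        else:
            total += stuffed[i]
            i += 1
    return total & 0xFF
```

The two rules agree when every 0x10 is a stuffed 0x10 0x10 pair and diverge as soon as a bare 0x10 (such as the raw offset_hi byte) is followed by a different byte.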

SUB 5A — chunk counter is monotonic (CORRECTED 2026-04-06)

Chunk counters are chunk_num * 0x0400 for ALL chunks including chunk 1.

The 4-2-26 BW TX capture showed counter=0x1004 for chunk 1 of event key 01110000, which led to _CHUNK1_COUNTER = 0x1004 being hardcoded as a special case. This was a Blastware artifact, not a protocol requirement. Empirical test 2026-04-06: with counter=0x1004 for chunk 1 the device times out (120 s); with counter=0x0400 (= 1 * 0x0400) it responds immediately and streams all frames correctly.

The 4-3-26 capture confirms the pattern for a second event (key 0111245a): chunk 1 = 0x245A, chunk 2 = 0x285A, chunk 3 = 0x2C5A (each +0x0400). Blastware's true formula is key4[2:4] + n * 0x0400 — but since key4[2:4] of the first event is 0x0000, n * 0x0400 produces the right result. The device does not strictly validate the counter and streams data for any valid 5A request; using chunk_num * 0x0400 is correct.
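As a quick reference, the corrected counter rule reduces to one multiplication. The helper name below is hypothetical, and chunk numbering is assumed to start at 1.

```python
CHUNK_STEP = 0x0400


def chunk_counter(chunk_num: int) -> int:
    """Counter for 5A chunk requests: chunk_num * 0x0400, with no special case for chunk 1."""
    if chunk_num < 1:
        raise ValueError("chunk numbers start at 1")
    return chunk_num * CHUNK_STEP
```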

SUB 5A — params are 11 bytes for chunk frames, 10 for termination

bulk_waveform_params() returns 11 bytes (extra trailing 0x00). The 11th byte was confirmed from the BW wire capture. bulk_waveform_term_params() returns 10 bytes. Do not swap them.

SUB 5A — event-time metadata lives in A5 frame 7

The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance setup as it existed when the event was recorded:

"Project:"      → project description
"Client:"       → client name        ← NOT in the 0C record
"User Name:"    → operator name      ← NOT in the 0C record
"Seis Loc:"     → sensor location    ← NOT in the 0C record
"Extended Notes"→ notes

IMPORTANT: 5A "Project:" is session-start config, NOT per-event (confirmed 2026-04-05). The "Project:" string in the A5 frame 7 payload reflects the compliance setup from when the monitoring session first started, not the individual event's project name. The per-event project name is correctly stored in the 210-byte 0C waveform record and must be used as the authoritative source. _decode_a5_metadata_into therefore only sets project from 5A when 0C didn't already supply one.

"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are NOT present in the 0C record — 5A remains the sole source for those fields and they are set unconditionally.

stop_after_metadata=True (default) stops the 5A loop as soon as b"Project:" appears, then sends the termination frame.

SUB 5A — end-of-stream signal (confirmed 2026-04-06)

After streaming all waveform chunks, the device sends exactly 1 raw byte in response to the next chunk request, then goes silent. This is the natural end-of-stream indicator — NOT a complete A5 frame. S3FrameParser.bytes_fed will be 1; no frame is assembled.

Handling: on TimeoutError, if bytes_fed > 0 AND frames were already collected, treat as graceful end-of-stream, break the loop, and proceed to the termination frame. If bytes_fed == 0 with no prior frames, it is a genuine transport failure — re-raise.

Chunk recv timeout must be 10 s, not the default 120 s. Chunks arrive within ~1 s each. Using 120 s causes a ~2-minute stall at every end-of-stream detection. The _recv_one call in the chunk loop passes timeout=10.0 explicitly.

Typical chunk count (BE11529, 1024 sps): A 9,306-sample event produces 35 chunks before end-of-stream. Chunks with uniform 1,036-byte data are all-zero ADC samples (post-event silence). Only the initial variable-size chunks contain actual signal.
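The end-of-stream handling described above can be sketched as a loop. This is an assumption-laden illustration, not the code in protocol.py: ParserStub stands in for S3FrameParser, and recv_one here is a hypothetical callable that sends the next chunk request and returns one frame or raises TimeoutError.

```python
from typing import Callable, List


class ParserStub:
    """Stand-in for S3FrameParser; only the bytes_fed counter matters here."""

    def __init__(self) -> None:
        self.bytes_fed = 0

    def reset(self) -> None:
        self.bytes_fed = 0


def collect_chunks(
    recv_one: Callable[[float], bytes],
    parser: ParserStub,
    max_chunks: int = 64,
) -> List[bytes]:
    """Collect A5 frames until the 1-byte end-of-stream signal shows up."""
    frames: List[bytes] = []
    for _ in range(max_chunks):
        parser.reset()
        try:
            frames.append(recv_one(10.0))  # 10 s, not the 120 s default
        except TimeoutError:
            if parser.bytes_fed > 0 and frames:
                break  # stray byte after real frames: graceful end-of-stream
            raise      # nothing received at all: genuine transport failure
    return frames
```

After the loop breaks, the caller would go on to send the termination frame.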

SUB 5A — known decoder issue: fi==9 hardcoded skip (not yet fixed)

_decode_a5_waveform() in client.py has elif fi == 9: continue — a leftover from the 9-frame original blast capture where frame 9 was assumed to be a terminator. For current 35-frame streams, fi==9 is live waveform data (~133 sample-sets dropped). Terminator detection is via page_key == 0x0000, not frame index. This skip should be removed.

SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)

token_params bug (FIXED): The token byte was at params[6] (wrong). Both 3-31-26 and 4-3-26 BW TX captures confirm it belongs at params[7] (raw: 00 00 00 00 00 00 00 fe 00 00). With the wrong position the device ignores the token and 1F returns null immediately.

1F token depends on context: In browse mode (no 5A), use all-zero params (browse=True). In download mode (get_events with 5A), use token=0xFE (browse=False) — this is required to arm the device's 5A bulk stream state machine. The earlier "empirical" test showing token=0xFE returns null was done WITHOUT the 1E(arm) step; that test is invalid. BW always uses 1F(0xFE) in download mode. count_events uses browse=True (no 5A needed).

0A context requirement: advance_event() (1F) only returns a valid next-event key when a preceding read_waveform_header() (0A) call has established device waveform context for the current key. Call 0A before every event in the loop, not just the first. Calling 1F cold (after only 1E, with no 0A) returns the null sentinel regardless of how many events are stored.

1F response layout: The next event's key IS at data_rsp.data[11:15] (= payload[16:20]). Confirmed from 4-3-26 browse-mode S3 captures:

1F after 0A(key0=01110000):  data[11:15]=0111245a  data[15:19]=00001e36  ← valid
1F after 0A(key1=0111245a):  data[11:15]=01114290  data[15:19]=00000046  ← valid
1F null sentinel:            data[11:15]=00000000  data[15:19]=00000000  ← done

Null sentinel: data8[4:8] == b"\x00\x00\x00\x00" (= data_rsp.data[15:19]) works for BOTH 1E trailing (offset to next event key) and 1F response (null key echo) — in both cases, all zeros means "no more events."

1E response layout: data_rsp.data[11:15] = event 0's actual key; data_rsp.data[15:19] = sample-count offset to the next event key (key1 = key0 + this offset). If offset == 0, there is only one event.
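A minimal sketch of the layout and sentinel rules above. The helper names are hypothetical, and the key arithmetic in next_key_from_1e assumes the key and offset are big-endian unsigned integers.

```python
from typing import Optional, Tuple

NULL4 = b"\x00\x00\x00\x00"


def split_key_and_data8(data: bytes) -> Tuple[bytes, bytes]:
    """Return (key4, data8) from data_rsp.data of a 1E or 1F response."""
    if len(data) < 19:
        raise ValueError("response data too short for key and trailing bytes")
    data8 = data[11:19]
    return data8[0:4], data8


def is_null_sentinel(data8: bytes) -> bool:
    """All-zero trailing bytes (data[15:19]) mean there are no more events."""
    return data8[4:8] == NULL4


def next_key_from_1e(key4: bytes, data8: bytes) -> Optional[bytes]:
    """1E only: key1 = key0 + offset; an offset of 0 means a single event."""
    offset = int.from_bytes(data8[4:8], "big")
    if offset == 0:
        return None
    nxt = (int.from_bytes(key4, "big") + offset) & 0xFFFFFFFF
    return nxt.to_bytes(4, "big")
```

A caller would keep iterating while is_null_sentinel(data8) is False.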

Correct iteration pattern (confirmed empirically with live device, 2+ events):

count_events (browse mode only, no 5A):

1E(all zeros) → key0, trailing0         ← trailing0 non-zero if event 1 exists
0A(key0)                                ← REQUIRED: establishes device context
1F(all zeros / browse=True) → key1      ← use all-zero params
0A(key1)                                ← REQUIRED before each advance
1F(all zeros) → null                    ← done

get_events (download mode, with 5A):

1E(all zeros) → key0, trailing0         ← trailing0 non-zero if event 1 exists
0A(key0)                                ← REQUIRED: establishes device context
1E(token=0xFE)                          ← REQUIRED: arms device for 5A; CONFIRMED 4-2-26 + 4-3-26
0C(key0)                                ← read waveform record
1F(token=0xFE) → [discard key]          ← REQUIRED: arms 5A bulk stream state machine
POLL × 3                                ← REQUIRED: 3 full POLL cycles before 5A (BW frames 68-73)
5A(key0)                                ← bulk stream; key0 used even though 1F already advanced
1F(all zeros / browse=True) → key1      ← USE THIS for loop iteration (browse=True returns correct key)
0A(key1)
1E(token=0xFE)                          ← re-arm for next event's 5A
0C(key1)
1F(token=0xFE) → [discard key]          ← arm 5A
POLL × 3
5A(key1)
1F(browse=True) → null                  ← done

IMPORTANT — conditional browse 1F (UPDATED 2026-04-06): 1F(token=0xFE) (browse=False) BEFORE POLL+5A arms the device's bulk stream state machine. Its returned key is cached as arm_key4 in get_events().

1F(browse=True) AFTER 5A is ONLY sent when 5A succeeded. If 5A timed out or failed, sending browse 1F disrupts the device's internal state — subsequent 5A probes for the next event get no response (confirmed empirically: calling browse 1F after a failed 5A causes the next event's 5A probe to also time out with 0 bytes received).

In the failure path, arm_key4 from 1F(download) is used as a best-effort next-key hint:

  • If arm_key4 != cur_key: use it to advance the loop without any 1F call
  • If arm_key4 == cur_key (device stuck, typical for second+ events when 5A fails): abort
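The success and failure paths above can be summarized as one decision function. This is a sketch with hypothetical names; browse_advance stands in for whatever performs the 1F(browse=True) call and returns its key.

```python
from typing import Callable, Optional


def choose_next_key(
    stream_ok: bool,
    cur_key: bytes,
    arm_key4: bytes,
    browse_advance: Callable[[], bytes],
) -> Optional[bytes]:
    """Pick the next event key after a 5A attempt; None means abort the loop."""
    if stream_ok:
        # Browse 1F is only safe after a successful 5A.
        return browse_advance()
    if arm_key4 != cur_key:
        # Best-effort hint cached from 1F(token=0xFE); no further 1F is sent.
        return arm_key4
    return None  # device stuck on the current key
```

The caller still has to check the null sentinel on the browse 1F response before treating the returned key as a real event.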

The diagnostic bytes_fed counter on S3FrameParser (incremented in every feed() call, reset by reset()) makes it possible to distinguish "no bytes at all" from "bytes received but no complete frame assembled" in 5A probe timeouts — both show up as 120s timeouts in the log but have very different root causes.

The 1E(token=0xFE) arm step is required (FIXED 2026-04-06): The device silently ignores all 5A probe frames unless a second SUB 1E with token=0xFE has been issued between 0A and 0C. This step is present in EVERY download cycle in both the 4-2-26 and 4-3-26 BW TX captures.

1F must come BEFORE 5A (FIXED 2026-04-06): BW always calls 1F (advance event) before starting the 5A bulk stream. 5A still uses the pre-advance key — the device streams the waveform for the key that was set up with 0A+1E-arm+0C even after 1F has moved the internal pointer to the next event.

POLL × 3 required before 5A (FIXED 2026-04-06): BW sends exactly 3 complete POLL (SUB 5B) probe+data cycles between the last 1F and the first 5A probe frame. Confirmed from 4-2-26 BW TX capture frames 68-73. Without these POLLs the device does not respond to the 5A probe. Use proto.poll(), not startup(): startup() drains the boot string, which is only needed on initial connect.

advance_event(browse=True) sends all-zero params; advance_event() default (browse=False) sends token=0xFE and is NOT used by any caller. advance_event() returns (key4, event_data8). Callers (count_events, get_events) loop while data8[4:8] != b"\x00\x00\x00\x00".

SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)

read_compliance_config() sends a 4-frame sequence (A, B, C, D) where:

  • Frame A is a probe (no recv_one needed — device ACKs but returns no data page)
  • Frames B, C, D each need a recv_one to collect the response

There must be NO extra self._send(...) call before the B/C/D recv loop without a matching recv_one(). An orphaned send shifts all receives one step behind, leaving frame D's channel block (trigger_level_geo, alarm_level_geo, max_range_geo) unread and producing only ~1071 bytes instead of ~2126.

SUB 1A — anchor search range

_decode_compliance_config_into() locates sample_rate and record_time via the anchor b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'. Search range is cfg[0:150].

Do not narrow this to cfg[40:100]. The old range was only accidentally correct because the orphaned-send bug was prepending a 44-byte spurious header, pushing the anchor from its real position (cfg[11]) into the 40–100 window.

Sample rate and DLE jitter in cfg data

Sample rate 4096 (0x1000) causes DLE jitter: the frame carries 10 10 00 on the wire, which unstuffs to 10 00 — 2 bytes instead of 3. This makes frame C 1 byte shorter and shifts all subsequent absolute offsets by 1. The anchor approach is immune to this. Do NOT use fixed absolute offsets for sample_rate or record_time.
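A minimal sketch of the anchor approach, showing why a 1-byte shift in the cfg data does not matter. It assumes sample_rate occupies the two bytes immediately before the anchor and record_time the four bytes immediately after it; the function name is hypothetical and this is not the body of _decode_compliance_config_into().

```python
import struct
from typing import Optional, Tuple

ANCHOR = b"\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00"


def find_rate_and_time(cfg: bytes) -> Optional[Tuple[int, float]]:
    """Locate the anchor within cfg[0:150] and read the fields relative to it."""
    pos = cfg.find(ANCHOR, 0, 150)
    if pos < 2 or pos + len(ANCHOR) + 4 > len(cfg):
        return None  # anchor missing, or not enough bytes around it
    (sample_rate,) = struct.unpack_from(">H", cfg, pos - 2)             # uint16 BE
    (record_time,) = struct.unpack_from(">f", cfg, pos + len(ANCHOR))   # float32 BE
    return sample_rate, record_time
```

Because every read is relative to pos, the result is the same whether the anchor lands at cfg[11] or one byte earlier or later.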

TCP / cellular transport

  • Protocol bytes over TCP are bit-for-bit identical to RS-232. No wrapping.
  • The modem (RV50/RV55) forwards bytes with up to ~1s buffering. TcpTransport uses read_until_idle(idle_gap=1.5s) to drain the buffer completely before parsing.
  • Cold-boot: unit sends the 16-byte ASCII string "Operating System" before entering DLE-framed mode. The parser discards it (scans for DLE+STX).
  • RV50/RV55 sends \r\nRING\r\n\r\nCONNECT\r\n over TCP to the caller even with Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to S3FrameParser.
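The idle-gap drain mentioned above could look roughly like this. It is a sketch of the general technique on a plain socket, not the actual TcpTransport implementation; the signature and the max_bytes cap are assumptions.

```python
import socket


def read_until_idle(sock: socket.socket, idle_gap: float = 1.5, max_bytes: int = 65536) -> bytes:
    """Keep reading until no new bytes arrive for idle_gap seconds."""
    sock.settimeout(idle_gap)
    buf = bytearray()
    while len(buf) < max_bytes:
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            break  # line went quiet for idle_gap seconds
        if not chunk:
            break  # peer closed the connection
        buf += chunk
    return bytes(buf)
```

The returned buffer may still begin with modem chatter or the boot string; per the notes above, that is left for the frame parser to skip.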

Required ACEmanager settings (Sierra Wireless RV50/RV55)

| Setting | Value | Why |
|---|---|---|
| Configure Serial Port | 38400,8N1 | Must match MiniMate baud |
| Flow Control | None | Hardware FC blocks TX if pins unconnected |
| Quiet Mode | Enable | Critical. Disabled injects RING/CONNECT onto serial, corrupting S3 handshake |
| Data Forwarding Timeout | 1 (= 0.1 s) | Lower latency |
| TCP Connect Response Delay | 0 | Non-zero silently drops first POLL frame |
| TCP Idle Timeout | 2 (minutes) | Prevents premature disconnect |
| DB9 Serial Echo | Disable | Echo corrupts the data stream |

Key confirmed field locations

SUB FE — Full Config (166 destuffed bytes)

| Offset | Field | Type | Notes |
|---|---|---|---|
| 0x34 | firmware version string | ASCII | e.g. "S338.17" |
| 0x56–0x57 | calibration year | uint16 BE | 0x07E9 = 2025 |
| 0x0109 | aux trigger enabled | uint8 | 0x00 = off, 0x01 = on |

SUB 1A — Compliance Config (~2126 bytes total after 4-frame sequence)

| Field | How to find it |
|---|---|
| sample_rate | uint16 BE at anchor − 2 |
| record_time | float32 BE at anchor + 10 |
| trigger_level_geo | float32 BE, located in channel block |
| alarm_level_geo | float32 BE, adjacent to trigger_level_geo |
| max_range_geo | float32 BE, adjacent to alarm_level_geo |
| setup_name | ASCII, null-padded, in cfg body |
| project / client / operator / sensor_location | ASCII, label-value pairs |

Anchor: b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00', search cfg[0:150]

SUB 0C — Waveform Record (210 bytes = data[11:11+0xD2])

sub_code=0x10 (Waveform single-shot) — 9-byte timestamp header:

| Offset | Field | Type |
|---|---|---|
| 0 | day | uint8 |
| 1 | sub_code | uint8 (0x10) |
| 2 | month | uint8 |
| 3–4 | year | uint16 BE |
| 5 | unknown | uint8 (always 0) |
| 6 | hour | uint8 |
| 7 | minute | uint8 |
| 8 | second | uint8 |

sub_code=0x03 (Waveform continuous) — 10-byte timestamp header (1-byte shift):

Confirmed 2026-04-03 against Blastware event report (15:20:17 Apr 3 2026). Raw wire bytes: 10 03 10 04 07 ea 00 0f 14 11

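| Offset | Field | Type | Notes |
|---|---|---|---|
| 0 | unknown_a | uint8 | 0x10 observed |
| 1 | day | uint8 | doubles as sub_code position in 0x10 layout |
| 2 | unknown_b | uint8 | 0x10 observed |
| 3 | month | uint8 | |
| 4–5 | year | uint16 BE | |
| 6 | unknown | uint8 | |
| 7 | hour | uint8 | |
| 8 | minute | uint8 | |
| 9 | second | uint8 | |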
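Both timestamp layouts can be decoded with one function, since the 0x03 layout is the 0x10 layout shifted by one byte. This is a sketch with a hypothetical function name; the caller is assumed to already know which sub_code applies.

```python
from datetime import datetime


def decode_0c_timestamp(header: bytes, sub_code: int) -> datetime:
    """Decode the 9-byte (sub_code 0x10) or 10-byte (sub_code 0x03) timestamp header."""
    if sub_code == 0x10:
        shift = 0
    elif sub_code == 0x03:
        shift = 1  # every field sits one byte later
    else:
        raise ValueError(f"unsupported sub_code 0x{sub_code:02x}")
    if len(header) < 9 + shift:
        raise ValueError("timestamp header too short")
    day = header[0 + shift]
    month = header[2 + shift]
    year = int.from_bytes(header[3 + shift:5 + shift], "big")
    hour, minute, second = header[6 + shift], header[7 + shift], header[8 + shift]
    return datetime(year, month, day, hour, minute, second)
```

Feeding it the captured bytes 10 03 10 04 07 ea 00 0f 14 11 with sub_code=0x03 yields 2026-04-03 15:20:17, which agrees with the Blastware event report.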

Peak values (both record types):

| Location | Field | Type |
|---|---|---|
| tran_pos - 12 | peak_vector_sum | float32 BE; label-relative, NOT fixed offset |
| label + 6 | PPV per channel | float32 BE (search for "Tran", "Vert", "Long", "MicL") |

PPV labels are NOT 4-byte aligned. The label-relative approach is the only reliable method. peak_vector_sum is exactly 12 bytes before the "Tran" label — confirmed for both sub_code=0x10 and sub_code=0x03. Do NOT use fixed offset 87 (only incidentally correct for 0x10 records).
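The label-relative lookup can be sketched as below. It assumes "label + 6" is measured from the first byte of the label and that each label appears once in the record; the function name is hypothetical.

```python
import struct
from typing import Dict

LABELS = (b"Tran", b"Vert", b"Long", b"MicL")


def decode_peaks(record: bytes) -> Dict[str, float]:
    """Read per-channel PPV and peak_vector_sum relative to the ASCII labels."""
    peaks: Dict[str, float] = {}
    for label in LABELS:
        pos = record.find(label)
        if pos >= 0 and pos + 10 <= len(record):
            # float32 BE starting 6 bytes after the start of the label
            peaks[label.decode("ascii")] = struct.unpack_from(">f", record, pos + 6)[0]
    tran_pos = record.find(b"Tran")
    if tran_pos >= 12:
        # float32 BE exactly 12 bytes before the "Tran" label
        peaks["peak_vector_sum"] = struct.unpack_from(">f", record, tran_pos - 12)[0]
    return peaks
```

Channels whose label is missing are simply left out of the result rather than guessed from a fixed offset.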


SFM REST API (sfm/server.py)

GET /device/info?port=COM5                              ← serial
GET /device/info?host=1.2.3.4&tcp_port=9034            ← cellular
GET /device/events?host=1.2.3.4&tcp_port=9034&baud=38400
GET /device/event?host=1.2.3.4&tcp_port=9034&index=0

Server retries once on ProtocolError for TCP connections (handles cold-boot timing).


Key wire captures (reference material)

| Capture | Location | Contents |
|---|---|---|
| 1-2-26 | bridges/captures/1-2-26/ | SUB 5A BW TX frames; confirmed 5A frame format, 11-byte params, DLE-aware checksum |
| 3-11-26 | bridges/captures/3-11-26/ | Full compliance setup write, Aux Trigger capture |
| 3-31-26 | bridges/captures/3-31-26/ | Complete event download cycle (148 BW / 147 S3 frames); confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
| 4-3-26 | bridges/captures/4-3-26/ | Browse-mode S3 capture with 2+ events; confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |

What's next

  • Write commands (SUBs 68–83): push compliance config, channel config, trigger settings to device
  • ACH inbound server — accept call-home connections from field units
  • Modem manager — push RV50/RV55 configs via Sierra Wireless API