# CLAUDE.md — seismo-relay
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
(Sierra Wireless RV50 / RV55). Current version: **v0.7.0**.
---
## Project layout
```
minimateplus/                        ← Python client library (primary focus)
    transport.py                     ← SerialTransport, TcpTransport
    framing.py                       ← DLE codec, frame builders, S3FrameParser
    protocol.py                      ← MiniMateProtocol — wire-level read/write methods
    client.py                        ← MiniMateClient — high-level API (connect, get_events, …)
    models.py                        ← DeviceInfo, EventRecord, ComplianceConfig, …
sfm/server.py                        ← FastAPI REST server exposing device data over HTTP
seismo_lab.py                        ← Tkinter GUI (Bridge + Analyzer + Console tabs)
docs/
    instantel_protocol_reference.md  ← reverse-engineered protocol spec ("the Rosetta Stone")
CHANGELOG.md                         ← version history
```
---
## Current implementation state (v0.6.0)
Full read pipeline working end-to-end over TCP/cellular:
| Step | SUB | Status |
|---|---|---|
| POLL / startup handshake | 5B | ✅ |
| Serial number | 15 | ✅ |
| Full config (firmware, calibration date, etc.) | FE | ✅ |
| Compliance config (record time, sample rate, geo thresholds) | 1A | ✅ |
| Event index | 08 | ✅ |
| Event header / first key | 1E | ✅ |
| Waveform header | 0A | ✅ |
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ **new v0.6.0** |
| Event advance / next key | 1F | ✅ |
| Write commands (push config to device) | 68–83 | ❌ not yet implemented |
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
---
## Protocol fundamentals
### DLE framing
```
BW→S3 (our requests): [ACK=0x41] [STX=0x02] [stuffed payload+chk] [ETX=0x03]
S3→BW (device replies): [DLE=0x10] [STX=0x02] [stuffed payload+chk] [bare ETX=0x03]
```
- **DLE stuffing rule:** any literal `0x10` byte in the payload is doubled on the wire
(`0x10` → `0x10 0x10`). This includes the checksum byte.
- **Inner-frame terminators:** large S3 responses (A4, E5) contain embedded sub-frames
using `DLE+ETX` as inner terminators. The outer parser treats `DLE+ETX` inside a frame
as literal data — the bare ETX is the ONLY real frame terminator.
- **Response SUB rule:** `response_SUB = 0xFF - request_SUB`
(one known exception: SUB `1C` → response `6E`, not `0xE3`)
- **Two-step read pattern:** every read command is sent twice — probe step (`offset=0x00`,
get length) then data step (`offset=DATA_LENGTH`, get payload). All data lengths are
hardcoded constants, not read from the probe response.
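The stuffing rule and the response SUB rule above can be sketched as follows. This is a minimal illustration, not the code in `framing.py`: the function names `dle_stuff`, `dle_unstuff`, and `expected_response_sub` are hypothetical, and the unstuffing helper assumes a lone `0x10` followed by any other byte (such as an inner `DLE+ETX`) is kept as literal data.

```python
DLE = 0x10

# Known exceptions to response_SUB = 0xFF - request_SUB (from the rule above).
SUB_RESPONSE_EXCEPTIONS = {0x1C: 0x6E}


def dle_stuff(payload: bytes) -> bytes:
    """Double every literal 0x10 byte (hypothetical helper name)."""
    out = bytearray()
    for b in payload:
        out.append(b)
        if b == DLE:
            out.append(DLE)
    return bytes(out)


def dle_unstuff(stuffed: bytes) -> bytes:
    """Collapse each 0x10 0x10 pair to one 0x10; other bytes pass through."""
    out = bytearray()
    i = 0
    while i < len(stuffed):
        out.append(stuffed[i])
        if stuffed[i] == DLE and i + 1 < len(stuffed) and stuffed[i + 1] == DLE:
            i += 2  # skip the doubled DLE
        else:
            i += 1
    return bytes(out)


def expected_response_sub(request_sub: int) -> int:
    """Apply the 0xFF - SUB rule, honouring the known 1C exception."""
    return SUB_RESPONSE_EXCEPTIONS.get(request_sub, 0xFF - request_sub)
```

For example, `expected_response_sub(0x5A)` gives `0xA5`, which matches the A5 frames returned by the bulk waveform stream.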
### De-stuffed payload header
```
BW→S3 (request):
[0] CMD 0x10
[1] flags 0x00
[2] SUB command byte
[3] 0x00 always zero
[4] 0x00 always zero
[5] OFFSET 0x00 for probe step; DATA_LENGTH for data step
[6-15] params (key, token, etc. — see helpers in framing.py)
S3→BW (response):
[0] CMD 0x00
[1] flags 0x10
[2] SUB response sub byte
[3] PAGE_HI
[4] PAGE_LO
[5+] data
```
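As a rough sketch of the layout above, the helpers below build a standard 16-byte request payload and split a response payload into its header fields. The names `build_request_payload`, `two_step_requests`, and `parse_response_payload` are hypothetical and are not the helpers in `framing.py`. Checksum, DLE stuffing, and the non-standard SUB 5A format are deliberately left out.

```python
from typing import NamedTuple, Tuple


def build_request_payload(sub: int, offset: int = 0x00,
                          params: bytes = bytes(10)) -> bytes:
    """De-stuffed BW→S3 payload: CMD, flags, SUB, 0, 0, OFFSET, params[10]."""
    if len(params) != 10:
        raise ValueError("standard requests carry exactly 10 param bytes")
    return bytes([0x10, 0x00, sub, 0x00, 0x00, offset]) + params


def two_step_requests(sub: int, data_length: int,
                      params: bytes = bytes(10)) -> Tuple[bytes, bytes]:
    """Probe step (offset 0x00) followed by data step (offset DATA_LENGTH)."""
    return (build_request_payload(sub, 0x00, params),
            build_request_payload(sub, data_length, params))


class ResponsePayload(NamedTuple):
    sub: int
    page_hi: int
    page_lo: int
    data: bytes


def parse_response_payload(payload: bytes) -> ResponsePayload:
    """Split a de-stuffed S3→BW payload into header fields and data."""
    if len(payload) < 5:
        raise ValueError("response payload shorter than its 5-byte header")
    if payload[0] != 0x00 or payload[1] != 0x10:
        raise ValueError("unexpected CMD/flags bytes in response")
    return ResponsePayload(payload[2], payload[3], payload[4], payload[5:])
```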
---
## Critical protocol gotchas (hard-won — do not re-derive)
### SUB 5A — bulk waveform stream — NON-STANDARD frame format
**Always use `build_5a_frame()` for SUB 5A. Never use `build_bw_frame()` for SUB 5A.**
`build_bw_frame` produces WRONG output for 5A for two reasons:
1. **`offset_hi = 0x10` must NOT be DLE-stuffed.** Blastware sends the offset field raw.
`build_bw_frame` would stuff it to `10 10` on the wire — the device silently ignores
the frame. `build_5a_frame` writes it as a bare `10`.
2. **DLE-aware checksum.** When computing the checksum, `10 XX` pairs in the stuffed
section contribute only `XX` to the running sum; lone bytes contribute normally. This
differs from the standard SUM8-of-destuffed-payload that all other commands use.
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
BW TX capture. All 10 frames verified.
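The pairing rule in point 2 can be illustrated with a small sketch. The function name `dle_aware_sum8` is hypothetical, and exactly which bytes of the frame feed into the sum is decided by `build_5a_frame`, not by this example. It only shows how a `10 XX` pair is counted differently from a plain byte sum.

```python
def sum8(data: bytes) -> int:
    """Plain SUM8: every byte contributes, result truncated to 8 bits."""
    return sum(data) & 0xFF


def dle_aware_sum8(stuffed: bytes) -> int:
    """SUM8 where each `10 XX` pair contributes only XX (illustrative)."""
    total = 0
    i = 0
    while i < len(stuffed):
        if stuffed[i] == 0x10 and i + 1 < len(stuffed):
            total += stuffed[i + 1]  # the leading 0x10 is not counted
            i += 2
        else:
            total += stuffed[i]      # lone byte contributes normally
            i += 1
    return total & 0xFF
```

The two sums agree when every `0x10` is doubled, and diverge as soon as a bare `0x10` is followed by another value, which is the situation the unstuffed `offset_hi` byte creates.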
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
`bulk_waveform_params()` returns 11 bytes (extra trailing `0x00`). The 11th byte was
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
Do not swap them.
### SUB 5A — event-time metadata lives in A5 frame 7
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
setup as it existed when the event was recorded:
```
"Project:" → project description
"Client:" → client name ← NOT in the 0C record
"User Name:" → operator name ← NOT in the 0C record
"Seis Loc:" → sensor location ← NOT in the 0C record
"Extended Notes"→ notes
```
These strings are **NOT** present in the 210-byte SUB 0C waveform record. They reflect
the setup at record time, not the current device config — this is why we fetch them from
5A instead of backfilling from the current compliance config.
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
then sends the termination frame.
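The early-stop behaviour can be sketched as below, assuming the A5 frame payloads are already available as an iterable of `bytes`. The function name `collect_until_metadata` is hypothetical. The real loop also sends a chunk request per frame and a termination frame afterwards, which this sketch omits.

```python
from typing import Iterable, List

METADATA_MARKER = b"Project:"


def collect_until_metadata(frames: Iterable[bytes],
                           stop_after_metadata: bool = True) -> List[bytes]:
    """Collect A5 frame payloads, stopping once the marker has been seen."""
    collected: List[bytes] = []
    for frame_data in frames:
        collected.append(frame_data)
        if stop_after_metadata and METADATA_MARKER in frame_data:
            break  # metadata frame reached; caller would now send termination
    return collected
```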
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
**token_params bug (FIXED):** The token byte was at `params[6]` (wrong). Both 3-31-26 and
4-3-26 BW TX captures confirm it belongs at **`params[7]`** (raw: `00 00 00 00 00 00 00 fe 00 00`).
With the wrong position the device ignores the token and 1F returns null immediately.
**all-zero params required (empirically confirmed):** Even with the correct token position,
sending `token=0xFE` causes the device to return null from 1F in multi-event sessions.
All callers (`count_events`, `get_events`) must use `advance_event(browse=True)` which
sends all-zero params. The 3-31-26 capture that "confirmed" token=0xFE had only one event
stored — 1F always returns null at end-of-events, so we never actually observed 1F
successfully returning a second key with token=0xFE. Empirical evidence from live device
testing with 2+ events is definitive: **always use all-zero params for 1F.**
**0A context requirement:** `advance_event()` (1F) only returns a valid next-event key
when a preceding `read_waveform_header()` (0A) call has established device waveform
context for the current key. Call 0A before every event in the loop, not just the first.
Calling 1F cold (after only 1E, with no 0A) returns the null sentinel regardless of how
many events are stored.
**1F response layout:** The next event's key IS at `data_rsp.data[11:15]` (= payload[16:20]).
Confirmed from 4-3-26 browse-mode S3 captures:
```
1F after 0A(key0=01110000): data[11:15]=0111245a data[15:19]=00001e36 ← valid
1F after 0A(key1=0111245a): data[11:15]=01114290 data[15:19]=00000046 ← valid
1F null sentinel: data[11:15]=00000000 data[15:19]=00000000 ← done
```
**Null sentinel:** `data8[4:8] == b"\x00\x00\x00\x00"` (= `data_rsp.data[15:19]`)
works for BOTH 1E trailing (offset to next event key) and 1F response (null key
echo) — in both cases, all zeros means "no more events."
**1E response layout:** `data_rsp.data[11:15]` = event 0's actual key; `data_rsp.data[15:19]`
= sample-count offset to the next event key (key1 = key0 + this offset). If offset == 0,
there is only one event.
**Correct iteration pattern (confirmed empirically with live device, 2+ events):**
```
1E(all zeros) → key0, trailing0 ← trailing0 non-zero if event 1 exists
0A(key0) ← REQUIRED: establishes device context
0C(key0) [+ 5A(key0) for get_events] ← read record data
1F(all zeros / browse=True) → key1 ← use all-zero params, NOT token=0xFE
0A(key1) ← REQUIRED before each advance
0C(key1) [+ 5A(key1) for get_events]
1F(all zeros) → null ← done
```
`advance_event(browse=True)` sends all-zero params; `advance_event()` default (browse=False)
sends token=0xFE and is NOT used by any caller.
`advance_event()` returns `(key4, event_data8)`.
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
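The iteration pattern above can be exercised against a toy model. `FakeProtocol` below is a hypothetical in-memory stand-in written only for this illustration. It is not `MiniMateProtocol`, and `read_event_first` is an assumed name for the 1E call. The fake models two behaviours described above: 1F returns the null sentinel when no 0A preceded it, and when `browse=False` is used.

```python
from typing import List, Optional, Tuple

NULL4 = b"\x00\x00\x00\x00"
NONZERO4 = b"\x00\x00\x00\x01"  # placeholder for any non-zero trailing field


class FakeProtocol:
    """Hypothetical stand-in for the device, for illustration only."""

    def __init__(self, keys: List[bytes]) -> None:
        self._keys = keys
        self._context: Optional[bytes] = None  # key set by the last 0A

    def read_event_first(self) -> Tuple[bytes, bytes]:
        """SUB 1E: first key plus trailing field (zero if only one event)."""
        trailing = NONZERO4 if len(self._keys) > 1 else NULL4
        return self._keys[0], self._keys[0] + trailing

    def read_waveform_header(self, key: bytes) -> None:
        """SUB 0A: establishes waveform context for `key`."""
        self._context = key

    def advance_event(self, browse: bool = False) -> Tuple[bytes, bytes]:
        """SUB 1F: next key, or the null sentinel when cold or at the end."""
        context, self._context = self._context, None  # context is consumed
        if not browse or context is None:
            return NULL4, NULL4 + NULL4
        nxt = self._keys.index(context) + 1
        if nxt >= len(self._keys):
            return NULL4, NULL4 + NULL4
        return self._keys[nxt], self._keys[nxt] + NONZERO4


def iterate_event_keys(proto: FakeProtocol) -> List[bytes]:
    """1E, then (0A, read, 1F) per event until data8[4:8] is all zeros."""
    keys: List[bytes] = []
    key, _data8 = proto.read_event_first()
    while True:
        proto.read_waveform_header(key)        # 0A: required before each 1F
        keys.append(key)                       # 0C / 5A reads would go here
        key, data8 = proto.advance_event(browse=True)
        if data8[4:8] == NULL4:
            return keys
```

Removing the `read_waveform_header` call from the loop makes the fake return the sentinel after the first event, mirroring the cold-1F behaviour described above.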
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
- Frame A is a probe (no `recv_one` needed — device ACKs but returns no data page)
- Frames B, C, D each need a `recv_one` to collect the response
**There must be NO extra `self._send(...)` call before the B/C/D recv loop without a
matching `recv_one()`.** An orphaned send shifts all receives one step behind, leaving
frame D's channel block (trigger_level_geo, alarm_level_geo, max_range_geo) unread and
producing only ~1071 bytes instead of ~2126.
### SUB 1A — anchor search range
`_decode_compliance_config_into()` locates sample_rate and record_time via the anchor
`b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`. Search range is `cfg[0:150]`.
Do not narrow this to `cfg[40:100]` — the old range was only accidentally correct because
the orphaned-send bug was prepending a 44-byte spurious header, pushing the anchor from
its real position (cfg[11]) into the 40–100 window.
### Sample rate and DLE jitter in cfg data
Sample rate 4096 (`0x1000`) causes DLE jitter: the frame carries `10 10 00` on the wire,
which unstuffs to `10 00` — 2 bytes instead of 3. This makes frame C 1 byte shorter and
shifts all subsequent absolute offsets by 1. The anchor approach is immune to this.
Do NOT use fixed absolute offsets for sample_rate or record_time.
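A minimal sketch of the anchor approach follows. It assumes sample_rate is the uint16 BE immediately before the anchor (anchor - 2) and record_time is the float32 BE immediately after the 10-byte anchor (anchor + 10). The function name is hypothetical and this is not the body of `_decode_compliance_config_into()`.

```python
import struct
from typing import Optional, Tuple

ANCHOR = b"\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00"
SEARCH_END = 150  # search window is cfg[0:150]


def find_sample_rate_and_record_time(cfg: bytes) -> Optional[Tuple[int, float]]:
    """Locate the anchor and read the two fields relative to it."""
    pos = cfg.find(ANCHOR, 0, SEARCH_END)
    if pos < 2 or pos + len(ANCHOR) + 4 > len(cfg):
        return None  # anchor missing, or fields would fall outside cfg
    (sample_rate,) = struct.unpack_from(">H", cfg, pos - 2)   # assumed: anchor - 2
    (record_time,) = struct.unpack_from(">f", cfg, pos + 10)  # anchor + 10
    return sample_rate, record_time
```

Because both reads are relative to `pos`, a one-byte shift in everything before the anchor (the DLE jitter case) does not change the decoded values.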
### TCP / cellular transport
- Protocol bytes over TCP are bit-for-bit identical to RS-232. No wrapping.
- The modem (RV50/RV55) forwards bytes with up to ~1s buffering. `TcpTransport` uses
`read_until_idle(idle_gap=1.5s)` to drain the buffer completely before parsing.
- Cold-boot: unit sends the 16-byte ASCII string `"Operating System"` before entering
DLE-framed mode. The parser discards it (scans for DLE+STX).
- RV50/RV55 sends `\r\nRING\r\n\r\nCONNECT\r\n` over TCP to the caller even with
Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to
`S3FrameParser`.
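The "scan for DLE+STX" behaviour can be illustrated with a small sketch. The function name `discard_preamble` is hypothetical and this is not the `S3FrameParser` implementation. It only shows how the boot banner and modem chatter fall away once the first frame start is found.

```python
DLE_STX = b"\x10\x02"


def discard_preamble(buffer: bytes) -> bytes:
    """Drop everything before the first DLE+STX (illustrative only)."""
    start = buffer.find(DLE_STX)
    if start >= 0:
        return buffer[start:]
    # No frame start yet: keep a trailing DLE in case STX arrives next read.
    return buffer[-1:] if buffer.endswith(b"\x10") else b""
```

Feeding it `b"Operating System"` followed by a frame, or the `RING`/`CONNECT` text followed by a frame, returns the buffer starting at `10 02` in both cases.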
### Required ACEmanager settings (Sierra Wireless RV50/RV55)
| Setting | Value | Why |
|---|---|---|
| Configure Serial Port | `38400,8N1` | Must match MiniMate baud |
| Flow Control | `None` | Hardware FC blocks TX if pins unconnected |
| **Quiet Mode** | **Enable** | **Critical.** Disabled injects `RING`/`CONNECT` onto serial, corrupting S3 handshake |
| Data Forwarding Timeout | `1` (= 0.1 s) | Lower latency |
| TCP Connect Response Delay | `0` | Non-zero silently drops first POLL frame |
| TCP Idle Timeout | `2` (minutes) | Prevents premature disconnect |
| DB9 Serial Echo | `Disable` | Echo corrupts the data stream |
---
## Key confirmed field locations
### SUB FE — Full Config (166 destuffed bytes)
| Offset | Field | Type | Notes |
|---|---|---|---|
| 0x34 | firmware version string | ASCII | e.g. `"S338.17"` |
| 0x56–0x57 | calibration year | uint16 BE | `0x07E9` = 2025 |
| 0x0109 | aux trigger enabled | uint8 | `0x00` = off, `0x01` = on |
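
Reading the first two fields from a destuffed FE payload might look like the sketch below. The function name is hypothetical, and the firmware string is assumed to be NUL-terminated and to end before offset 0x56. The table above does not state its length.

```python
import struct
from typing import Tuple

FIRMWARE_OFFSET = 0x34
CAL_YEAR_OFFSET = 0x56


def decode_firmware_and_cal_year(cfg: bytes) -> Tuple[str, int]:
    """Return (firmware version string, calibration year)."""
    # Assumption: the ASCII string ends at the first NUL before 0x56.
    end = cfg.find(b"\x00", FIRMWARE_OFFSET, CAL_YEAR_OFFSET)
    if end < 0:
        end = CAL_YEAR_OFFSET
    firmware = cfg[FIRMWARE_OFFSET:end].decode("ascii", errors="replace")
    (cal_year,) = struct.unpack_from(">H", cfg, CAL_YEAR_OFFSET)
    return firmware, cal_year
```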
### SUB 1A — Compliance Config (~2126 bytes total after 4-frame sequence)
| Field | How to find it |
|---|---|
| sample_rate | uint16 BE at anchor - 2 |
| record_time | float32 BE at anchor + 10 |
| trigger_level_geo | float32 BE, located in channel block |
| alarm_level_geo | float32 BE, adjacent to trigger_level_geo |
| max_range_geo | float32 BE, adjacent to alarm_level_geo |
| setup_name | ASCII, null-padded, in cfg body |
| project / client / operator / sensor_location | ASCII, label-value pairs |
Anchor: `b'\x01\x2c\x00\x00\xbe\x80\x00\x00\x00\x00'`, search `cfg[0:150]`
### SUB 0C — Waveform Record (210 bytes = data[11:11+0xD2])
| Offset | Field | Type |
|---|---|---|
| 0 | day | uint8 |
| 1 | sub_code | uint8 (`0x10` = Waveform single-shot, `0x03` = Waveform continuous) |
| 2 | month | uint8 |
| 3–4 | year | uint16 BE |
| 5 | unknown | uint8 (always 0) |
| 6 | hour | uint8 |
| 7 | minute | uint8 |
| 8 | second | uint8 |
| 87 | peak_vector_sum | float32 BE |
| label+6 | PPV per channel | float32 BE (search for `"Tran"`, `"Vert"`, `"Long"`, `"MicL"`) |
PPV labels are NOT 4-byte aligned. The label-offset+6 approach is the only reliable method.
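
A sketch of decoding the table above, assuming `rec` is the 210-byte record slice and that "label+6" means six bytes past the first byte of the label. The function name and the returned dictionary keys are hypothetical.

```python
import struct
from datetime import datetime
from typing import Any, Dict

PPV_LABELS = (b"Tran", b"Vert", b"Long", b"MicL")


def decode_waveform_record(rec: bytes) -> Dict[str, Any]:
    """Decode timestamp, peak vector sum, and per-channel PPV values."""
    day, sub_code, month = rec[0], rec[1], rec[2]
    (year,) = struct.unpack_from(">H", rec, 3)
    hour, minute, second = rec[6], rec[7], rec[8]
    (peak_vector_sum,) = struct.unpack_from(">f", rec, 87)

    ppv: Dict[str, float] = {}
    for label in PPV_LABELS:
        pos = rec.find(label)  # labels are not aligned, so search for them
        if pos >= 0 and pos + 10 <= len(rec):
            ppv[label.decode("ascii")] = struct.unpack_from(">f", rec, pos + 6)[0]

    return {
        "timestamp": datetime(year, month, day, hour, minute, second),
        "sub_code": sub_code,
        "peak_vector_sum": peak_vector_sum,
        "ppv": ppv,
    }
```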
---
## SFM REST API (sfm/server.py)
```
GET /device/info?port=COM5 ← serial
GET /device/info?host=1.2.3.4&tcp_port=9034 ← cellular
GET /device/events?host=1.2.3.4&tcp_port=9034&baud=38400
GET /device/event?host=1.2.3.4&tcp_port=9034&index=0
```
Server retries once on `ProtocolError` for TCP connections (handles cold-boot timing).
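
For scripting against these endpoints, the query strings can be assembled with the standard library. The helper name `sfm_url` and the base URL `http://localhost:8000` used in the example are assumptions. The actual host and port depend on how the server is launched.

```python
from urllib.parse import urlencode


def sfm_url(base: str, path: str, **params: object) -> str:
    """Build an SFM request URL, skipping parameters that are None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base.rstrip("/") + path
    return f"{url}?{query}" if query else url


# Example (assumed base URL): cellular event listing.
example = sfm_url("http://localhost:8000", "/device/events",
                  host="1.2.3.4", tcp_port=9034, baud=38400)
```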
---
## Key wire captures (reference material)
| Capture | Location | Contents |
|---|---|---|
| 1-2-26 | `bridges/captures/1-2-26/` | SUB 5A BW TX frames — used to confirm 5A frame format, 11-byte params, DLE-aware checksum |
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence |
---
## What's next
- Write commands (SUBs 68–83) — push compliance config, channel config, trigger settings to device
- ACH inbound server — accept call-home connections from field units
- Modem manager — push RV50/RV55 configs via Sierra Wireless API