From fa9873cf4ab0facae943f4159c28d9fd3fde07eb Mon Sep 17 00:00:00 2001
From: serversdwn
Date: Wed, 4 Mar 2026 17:42:15 -0500
Subject: [PATCH] =?UTF-8?q?doc:=20=C2=A72,=20=C2=A710,=20Appendix=20C=20|?=
 =?UTF-8?q?=20**MILESTONE=20=E2=80=94=20Link-layer=20grammar=20formally=20?=
 =?UTF-8?q?confirmed.**?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/instantel_protocol_reference.md | 87 +++++++++++++++++-----------
 1 file changed, 52 insertions(+), 35 deletions(-)

diff --git a/docs/instantel_protocol_reference.md b/docs/instantel_protocol_reference.md
index 678301d..b2a3aca 100644
--- a/docs/instantel_protocol_reference.md
+++ b/docs/instantel_protocol_reference.md
@@ -47,6 +47,7 @@
 | 2026-03-02 | Appendix A | **UPDATED:** New capture architecture: two flat raw wire dumps per session (`raw_s3.bin`, `raw_bw.bin`), one per direction, no record wrapper. Replaces structured `.bin` format for parser input. |
 | 2026-03-02 | Appendix A | **PARSER:** Deterministic DLE state machine implemented (`s3_parser.py`). Three states: `IDLE → IN_FRAME → AFTER_DLE`. Replaces heuristic global scanning. Properly handles DLE stuffing (`10 10` → literal `10`). Only complete STX→ETX pairs counted as frames. |
 | 2026-03-02 | Appendix A | **VALIDATED:** `raw_bw.bin` yields 7 complete frames via state machine. `raw_s3.bin` contains large structured responses (first frame payload ~3922 bytes). Both files confirmed lossless. BW bare `0x02` pattern confirmed as asymmetric framing (BW sends bare STX, S3 sends DLE+STX). |
+| 2026-03-03 | §2, §10, Appendix C | **MILESTONE — Link-layer grammar formally confirmed.** BW start marker is `41 02` (ACK+STX as a unit — bare `02` alone is not sufficient). BW frame boundary is structural sequence `03 41 02`. ETX lookahead: bare `03` only accepted as ETX when followed by `41 02` or at EOF. Checksum confirmed split: small frames use SUM8, large frames use unknown algorithm. Checksum is semantic, not a framing primitive. `s3_parser.py` v0.2.2 implements dual `--mode s3/bw`. |
 
 ---
 
@@ -71,22 +72,29 @@
 
 The two sides of the connection use **fully asymmetric framing**. DLE stuffing applies on both sides.
 
-| Direction | STX (frame start) | ETX (frame end) | Stuffing | Notes |
+| Direction | Start marker | End marker | Stuffing | Notes |
 |---|---|---|---|---|
-| S3 → BW (device) | `0x10 0x02` (DLE+STX) | `0x10 0x03` (DLE+ETX) | `0x10` → `0x10 0x10` | Full DLE framing |
-| BW → S3 (Blastware) | `0x02` (bare STX) | `0x03` (bare ETX) | `0x10` → `0x10 0x10` | Bare delimiters, DLE stuffing only |
+| S3 → BW (device) | `10 02` (DLE+STX) | `10 03` (DLE+ETX) | `10` → `10 10` | Full DLE framing |
+| BW → S3 (Blastware) | `41 02` (ACK+STX) | `03` (bare ETX) | `10` → `10 10` | ACK is part of start marker |
+
+**BW start marker:** `41 02` is treated as a single two-byte start signature. Bare `02` alone is **not** sufficient to start a BW frame — it must be preceded by `41` (ACK). This prevents false triggering on `02` bytes inside payload data.
+
+**BW ETX rule:** Bare `03` is accepted as frame end **only** when followed by `41 02` (next frame's ACK+STX) or at EOF. The structural boundary pattern is:
+```
+... [payload] 03 41 02 [next payload] ...
+```
+In-payload `03` bytes are preserved as data when not followed by `41 02`.
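As an illustration of the BW framing rules stated above, here is a minimal Python sketch of a frame extractor. It is not taken from `s3_parser.py`: the function name `extract_bw_frames` and the choice to return unstuffed bytes without the `41 02` and `03` markers are assumptions made for this example. Checksum validation is deliberately left out, since the rules treat it as a semantic concern rather than a framing one.

```python
from typing import List

ACK, STX, ETX, DLE = 0x41, 0x02, 0x03, 0x10


def extract_bw_frames(raw: bytes) -> List[bytes]:
    """Return the unstuffed contents of every complete BW frame in `raw`.

    Hypothetical helper: start on `41 02`, end on `03` only when it is
    followed by `41 02` or is the last byte of the capture.
    """
    frames: List[bytes] = []
    buf = bytearray()
    in_frame = False
    i, n = 0, len(raw)
    while i < n:
        b = raw[i]
        if not in_frame:
            # IDLE: only the two-byte signature 41 02 starts a frame.
            if b == ACK and i + 1 < n and raw[i + 1] == STX:
                in_frame = True
                buf.clear()
                i += 2
            else:
                i += 1
            continue
        if b == DLE:
            if i + 1 >= n:
                break  # capture ended mid-escape; the frame is incomplete
            nxt = raw[i + 1]
            if nxt == DLE:
                buf.append(DLE)            # stuffed literal: 10 10 -> 10
            else:
                buf += bytes((DLE, nxt))   # recovery: keep DLE + byte as data
            i += 2
            continue
        if b == ETX:
            at_eof = i + 1 == n
            next_frame_follows = raw[i + 1:i + 3] == bytes((ACK, STX))
            if at_eof or next_frame_follows:
                frames.append(bytes(buf))
                in_frame = False
                i += 1
                continue
        buf.append(b)  # ordinary payload byte, including in-payload 03
        i += 1
    return frames


if __name__ == "__main__":
    sample = bytes.fromhex("41025b03010341021010021003aa03")
    print([f.hex(" ") for f in extract_bw_frames(sample)])
```

In the sample, the first `03` is kept as data because `01 03` follows it, while the second `03` ends the frame because `41 02` follows. A frame still open at the end of the capture is dropped rather than emitted.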
 
 **Evidence:**
-- 91/98 BW frames validate checksum when parsed with bare `0x03` as ETX
-- All `10 03` sequences in `raw_bw.bin` are in-payload data — none are followed by `41 02` (next frame start)
-- `10 03` appearing in BW payload is always `10 10 03` origin (stuffed DLE + literal `03`) — the S3 device correctly parses this via its own state machine without false ETX detection
-- S3 captures consistently terminate with `10 03` confirmed via HxD
+- 98/98 BW frames extracted from `raw_bw.bin` using `41 02` start + `03 41 02` structural boundary
+- 91/98 small BW frames validate SUM8 checksum; 7 large config/write frames do not match any known checksum algorithm
+- All `10 03` sequences in `raw_bw.bin` confirmed as in-payload data (none followed by `41 02`)
+- `s3_parser.py v0.2.2` implements both modes; BW ETX lookahead confirmed working
 
-**Practical impact for parsers:**
-- Parser on `raw_s3.bin`: trigger on `10 02`, terminate on `10 03`
-- Parser on `raw_bw.bin`: trigger on bare `02`, terminate on bare `03`
-- Both parsers must handle `10 10` → literal `10` unstuffing
-- ETX detection must be state-machine-aware (not raw byte search) to avoid false matches on stuffed sequences
+**Checksum is NOT a framing primitive:**
+- Small frames (e.g. keepalive SUB `5B`): SUM8 validates consistently
+- Large frames (e.g. SUB `71` config writes): checksum algorithm unknown — does not match SUM8, CRC-16/IBM, CRC-16/CCITT-FALSE, or CRC-16/X25
+- Frame boundaries are determined structurally; checksum validation is a semantic-layer concern only
 
 ### Frame Structure by Direction
 
@@ -658,24 +666,27 @@ ESCAPE:
 
 ### Parser State Machine — BW→S3 direction (Blastware commands)
 
-Trigger on bare STX, terminate on bare ETX. DLE only appears in stuffing context.
+Trigger on `41 02` (ACK+STX as a unit). ETX accepted only when followed by `41 02` or at EOF.
 
 ```
 IDLE:
-  receive 0x41      → emit ACK event, stay IDLE
-  receive 0x02      → frame started, goto IN_FRAME
-  receive anything  → discard, stay IDLE
+  receive 0x41 + next==0x02   → frame started (consume both), goto IN_FRAME
+  receive anything            → discard, stay IDLE
 
 IN_FRAME:
-  receive 0x10      → goto ESCAPE
-  receive 0x03      → frame complete — validate checksum, process buffer, goto IDLE
-  receive any byte  → append to buffer, stay IN_FRAME
+  receive 0x10                → goto ESCAPE
+  receive 0x03 + lookahead==0x41 0x02, or EOF
+                              → frame complete — validate checksum, process buffer, goto IDLE
+  receive 0x03 (no lookahead) → append to buffer (in-payload 03), stay IN_FRAME
+  receive any byte            → append to buffer, stay IN_FRAME
 
 ESCAPE:
-  receive 0x10      → append single 0x10 to buffer, goto IN_FRAME (stuffed literal)
-  receive anything  → append DLE + byte to buffer (recovery), goto IN_FRAME
+  receive 0x10                → append single 0x10 to buffer, goto IN_FRAME (stuffed literal)
+  receive anything            → append DLE + byte to buffer (recovery), goto IN_FRAME
 ```
 
+**Architectural note:** Checksum validation is optional and informational only. Frame boundaries are determined structurally via the `03 41 02` sequence — never by checksum gating.
+
 ---
 
 ## 11. Checksum Reference Implementation
@@ -829,10 +840,10 @@ As of `s3_bridge v0.5.0`, captures are produced as **two flat raw wire dump file
 Every byte on the wire is written verbatim — no modification, no record headers, no timestamps. `0x10 0x03` (DLE+ETX) is preserved intact.
 
 **Practical impact for parsing:**
-- `raw_s3.bin`: trigger on `0x10 0x02`, terminate on `0x10 0x03` (DLE+ETX)
-- `raw_bw.bin`: trigger on bare `0x02`, terminate on bare `0x03`
-- Both: handle `0x10 0x10` → literal `0x10` unstuffing
-- ETX detection must be state-machine-aware on both sides to avoid false matches on stuffed sequences
+- `raw_s3.bin`: trigger on `10 02`, terminate on `10 03` (state-machine-aware)
+- `raw_bw.bin`: trigger on `41 02` (ACK+STX as a unit), terminate on `03` only when followed by `41 02` or at EOF
+- Both: handle `10 10` → literal `10` unstuffing
+- Use `s3_parser.py --mode s3` and `--mode bw` respectively
 
 ---
 
@@ -840,6 +851,7 @@ Every byte on the wire is written verbatim — no modification, no record header
 
 | Question | Priority | Added | Notes |
 |---|---|---|---|
+| **Large BW frame checksum algorithm** — Small frames (SUB `5B` keepalive etc.) validate with SUM8. Large config/write frames (SUB `71`, `68`, `69` etc.) do not match SUM8, CRC-16/IBM, CRC-16/CCITT-FALSE, or CRC-16/X25 in either endianness. Unknown whether it covers full payload or excludes header bytes, or whether different SUB types use different algorithms. | MEDIUM | 2026-03-03 | NEW |
 | Byte at timestamp offset 3 — hours, minutes, or padding? | MEDIUM | 2026-02-26 | |
 | `trail[0]` in serial number response — unit-specific byte, derivation unknown. `trail[1]` resolved as firmware minor version. | MEDIUM | 2026-02-26 | |
 | Full channel ID mapping in SUB `5A` stream (01/02/03/04 → which sensor?) | MEDIUM | 2026-02-26 | |
@@ -920,7 +932,7 @@ As of 2026-03-02 the capture pipeline produces two flat raw wire dump files per
 
 No record headers, no timestamps, no framing logic applied by the dumper. Files are flat concatenations of `serial.read()` chunks. Frame boundaries must be recovered by the parser.
 
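To make the S3-direction recovery concrete, the sketch below walks a flat `raw_s3.bin`-style byte string through the three states named in the changelog (`IDLE`, `IN_FRAME`, `AFTER_DLE`). It is an illustration under assumptions, not the `s3_parser.py` implementation: the function name `extract_s3_frames` is hypothetical, and the handling of an unexpected byte after DLE (keep DLE plus the byte as data) is assumed by analogy with the BW recovery rule.

```python
from typing import List

DLE, STX, ETX = 0x10, 0x02, 0x03
IDLE, IN_FRAME, AFTER_DLE = range(3)


def extract_s3_frames(raw: bytes) -> List[bytes]:
    """Return the unstuffed contents of every complete S3 frame in `raw`.

    Hypothetical helper: start on `10 02`, end on `10 03`, and turn the
    stuffed pair `10 10` back into a single literal `10`.
    """
    frames: List[bytes] = []
    buf = bytearray()
    state = IDLE
    i, n = 0, len(raw)
    while i < n:
        b = raw[i]
        if state == IDLE:
            # Only DLE immediately followed by STX opens a frame.
            if b == DLE and i + 1 < n and raw[i + 1] == STX:
                buf.clear()
                state = IN_FRAME
                i += 2
                continue
        elif state == IN_FRAME:
            if b == DLE:
                state = AFTER_DLE
            else:
                buf.append(b)  # bare 02 / 03 are ordinary data here
        else:  # AFTER_DLE
            if b == ETX:
                frames.append(bytes(buf))  # DLE+ETX closes the frame
                state = IDLE
            else:
                if b != DLE:
                    buf.append(DLE)  # assumed recovery: keep DLE + byte
                buf.append(b)        # for 10 10 this appends a single 10
                state = IN_FRAME
        i += 1
    return frames


if __name__ == "__main__":
    sample = bytes.fromhex("ff1002011010020310031002aa")
    print([f.hex(" ") for f in extract_s3_frames(sample)])
```

The sample ends in the middle of a second frame, which the sketch silently drops; only complete start-to-end pairs are returned.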
-### C.3 Parser Design — DLE State Machine
+### C.3 Parser Design — Dual-Mode State Machine (`s3_parser.py v0.2.2`)
 
 A deterministic state machine replaces all prior heuristic scanning.
 
@@ -948,14 +960,15 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
 
 **BW→S3 parser states:**
 
-| Current State | Byte | Action | Next State |
+| Current State | Condition | Action | Next State |
 |---|---|---|---|
-| IDLE | `02` | Begin new frame | IN_FRAME |
+| IDLE | byte==`41` AND next==`02` | Begin new frame (consume both) | IN_FRAME |
 | IDLE | any | Discard | IDLE |
-| IN_FRAME | `03` | Frame complete, emit | IDLE |
-| IN_FRAME | `10` | — | AFTER_DLE |
+| IN_FRAME | byte==`03` AND (next two==`41 02` OR at EOF) | Frame complete, emit | IDLE |
+| IN_FRAME | byte==`03` (no lookahead match) | Append `03` to payload | IN_FRAME |
+| IN_FRAME | byte==`10` | — | AFTER_DLE |
 | IN_FRAME | other | Append to payload | IN_FRAME |
-| AFTER_DLE | `10` | Append literal `0x10` | IN_FRAME |
+| AFTER_DLE | byte==`10` | Append literal `10` | IN_FRAME |
 | AFTER_DLE | other | Append DLE + byte (recovery) | IN_FRAME |
 
 **Properties:**
@@ -967,9 +980,10 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
 ### C.4 Observed Traffic (Validation Captures)
 
 **`raw_bw.bin`** (Blastware → S3):
-- 98 complete frames via state machine (bare STX + bare ETX mode)
-- 91/98 checksums validate; 7 failures are large frames containing in-payload `10 03` sequences that a naive scanner misreads as ETX
-- Bare `0x02` STX and bare `0x03` ETX confirmed; DLE used for stuffing only
+- 98 complete frames via `41 02` start + `03 41 02` structural boundary detection
+- 91/98 small frames validate SUM8 checksum; 7 large config/write frames fail all known checksum algorithms
+- `41 02` confirmed as two-byte start signature; bare `02` alone is insufficient
+- Bare `03` ETX confirmed; in-payload `03` bytes correctly preserved via lookahead rule
 - Contains project metadata strings: `"Standard Recording Setup.set"`, `"Claude test2"`, `"Location #1 - Brians House"`
 
 **`raw_s3.bin`** (S3 → Blastware):
@@ -983,7 +997,10 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
 1. **Global byte counting ≠ frame counting.** `0x10 0x02` appears inside payloads. Only state machine transitions produce valid frame boundaries.
 2. **STX count ≠ frame count.** Only STX→ETX pairs within proper state transitions count.
 3. **EOF mid-frame is normal.** Capture termination during active traffic produces an incomplete trailing frame. Not an error.
-4. **Layer separation.** The parser extracts frames only. Decoding block IDs, validating checksums, and interpreting semantics are responsibilities of a separate protocol decoder layer above it.
+4. **Start marker must be the full signature.** In BW mode, `41 02` is the start marker — not bare `02`. Bare `02` appears in payload data and would cause phantom frames.
+5. **ETX lookahead prevents false termination.** In BW mode, `03` is only a frame terminator when followed by `41 02` or at EOF. In-payload `03` bytes are common in large config frames.
+6. **Framing is structural. Checksum is semantic.** Frame boundaries are determined by grammar patterns — never by checksum validation. Checksum belongs to the protocol decoder layer, not the framing layer.
+7. **Layer separation.** The parser extracts frames only. Decoding block IDs, validating checksums, and interpreting semantics are responsibilities of a separate protocol decoder layer above it.
 
 ### C.6 Parser Layer Architecture