doc: §2, §10, Appendix C | **MILESTONE — Link-layer grammar formally confirmed.**
@@ -47,6 +47,7 @@
|
||||
| 2026-03-02 | Appendix A | **UPDATED:** New capture architecture: two flat raw wire dumps per session (`raw_s3.bin`, `raw_bw.bin`), one per direction, no record wrapper. Replaces structured `.bin` format for parser input. |
|
||||
| 2026-03-02 | Appendix A | **PARSER:** Deterministic DLE state machine implemented (`s3_parser.py`). Three states: `IDLE → IN_FRAME → AFTER_DLE`. Replaces heuristic global scanning. Properly handles DLE stuffing (`10 10` → literal `10`). Only complete STX→ETX pairs counted as frames. |
|
||||
| 2026-03-02 | Appendix A | **VALIDATED:** `raw_bw.bin` yields 7 complete frames via state machine. `raw_s3.bin` contains large structured responses (first frame payload ~3922 bytes). Both files confirmed lossless. BW bare `0x02` pattern confirmed as asymmetric framing (BW sends bare STX, S3 sends DLE+STX). |
|
||||
| 2026-03-03 | §2, §10, Appendix C | **MILESTONE — Link-layer grammar formally confirmed.** BW start marker is `41 02` (ACK+STX as a unit — bare `02` alone is not sufficient). BW frame boundary is structural sequence `03 41 02`. ETX lookahead: bare `03` only accepted as ETX when followed by `41 02` or at EOF. Checksum confirmed split: small frames use SUM8, large frames use unknown algorithm. Checksum is semantic, not a framing primitive. `s3_parser.py` v0.2.2 implements dual `--mode s3/bw`. |
|
||||
|
||||
---
|
||||
|
||||
@@ -71,22 +72,29 @@
|
||||
|
||||
The two sides of the connection use **fully asymmetric framing**. DLE stuffing applies on both sides.
|
||||
|
||||
| Direction | STX (frame start) | ETX (frame end) | Stuffing | Notes |
|
||||
| Direction | Start marker | End marker | Stuffing | Notes |
|
||||
|---|---|---|---|---|
|
||||
| S3 → BW (device) | `0x10 0x02` (DLE+STX) | `0x10 0x03` (DLE+ETX) | `0x10` → `0x10 0x10` | Full DLE framing |
|
||||
| BW → S3 (Blastware) | `0x02` (bare STX) | `0x03` (bare ETX) | `0x10` → `0x10 0x10` | Bare delimiters, DLE stuffing only |
|
||||
| S3 → BW (device) | `10 02` (DLE+STX) | `10 03` (DLE+ETX) | `10` → `10 10` | Full DLE framing |
|
||||
| BW → S3 (Blastware) | `41 02` (ACK+STX) | `03` (bare ETX) | `10` → `10 10` | ACK is part of start marker |
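
The stuffing column above (`10` → `10 10`) can be sketched as a round trip. This is an illustrative sketch only: the helper names `stuff` and `unstuff` are hypothetical and are not taken from `s3_parser.py`, and the handling of a lone `10` (passed through unchanged) is an assumption.

```python
DLE = 0x10

def stuff(payload: bytes) -> bytes:
    """Double every DLE byte so a literal 10 can travel inside a frame."""
    return payload.replace(b"\x10", b"\x10\x10")

def unstuff(wire: bytes) -> bytes:
    """Collapse each 10 10 pair back to a single literal 10."""
    out = bytearray()
    i = 0
    while i < len(wire):
        out.append(wire[i])
        # A DLE followed by another DLE is one stuffed byte: skip the second.
        if wire[i] == DLE and i + 1 < len(wire) and wire[i + 1] == DLE:
            i += 1
        i += 1
    return bytes(out)

payload = bytes.fromhex("01 10 03 10 10")
wire = stuff(payload)
restored = unstuff(wire)
```

Here `wire` is `01 10 10 03 10 10 10 10`, and `restored` equals the original payload.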
|
||||
|
||||
**BW start marker:** `41 02` is treated as a single two-byte start signature. Bare `02` alone is **not** sufficient to start a BW frame — it must be preceded by `41` (ACK). This prevents false triggering on `02` bytes inside payload data.
|
||||
|
||||
**BW ETX rule:** Bare `03` is accepted as frame end **only** when followed by `41 02` (next frame's ACK+STX) or at EOF. The structural boundary pattern is:
|
||||
```
... [payload] 03 41 02 [next payload] ...
```
|
||||
In-payload `03` bytes are preserved as data when not followed by `41 02`.
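
The lookahead rule can be expressed as a small predicate. The sketch below is hypothetical (`is_bw_etx` is not a name from `s3_parser.py`) and deliberately ignores DLE escape state, which a full state machine would also track. It also assumes that a `03` followed by a lone trailing `41` at EOF is not a terminator, which the text does not specify.

```python
ACK_STX = b"\x41\x02"  # BW start signature (ACK+STX)
ETX = 0x03

def is_bw_etx(data: bytes, i: int) -> bool:
    """True if data[i] is a 03 that ends a BW frame under the lookahead rule."""
    if data[i] != ETX:
        return False
    rest = data[i + 1 : i + 3]
    # Accept at EOF (nothing follows) or when the next frame's 41 02 follows.
    return rest == b"" or rest == ACK_STX

# Two frames; the first carries an in-payload 03 at index 3.
stream = bytes.fromhex("41 02 5B 03 07 03 41 02 5B 00 03")
etx_positions = [i for i in range(len(stream)) if is_bw_etx(stream, i)]
```

In this made-up stream, only the `03` bytes at indices 5 and 10 qualify as terminators; the `03` at index 3 stays in the payload.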
|
||||
|
||||
**Evidence:**
|
||||
- 91/98 BW frames validate checksum when parsed with bare `0x03` as ETX
|
||||
- All `10 03` sequences in `raw_bw.bin` are in-payload data — none are followed by `41 02` (next frame start)
|
||||
- `10 03` appearing in BW payload is always `10 10 03` origin (stuffed DLE + literal `03`) — the S3 device correctly parses this via its own state machine without false ETX detection
|
||||
- S3 captures consistently terminate with `10 03` confirmed via HxD
|
||||
- 98/98 BW frames extracted from `raw_bw.bin` using `41 02` start + `03 41 02` structural boundary
|
||||
- 91/98 small BW frames validate SUM8 checksum; 7 large config/write frames do not match any known checksum algorithm
|
||||
- All `10 03` sequences in `raw_bw.bin` confirmed as in-payload data (none followed by `41 02`)
|
||||
- `s3_parser.py v0.2.2` implements both modes; BW ETX lookahead confirmed working
|
||||
|
||||
**Practical impact for parsers:**
|
||||
- Parser on `raw_s3.bin`: trigger on `10 02`, terminate on `10 03`
|
||||
- Parser on `raw_bw.bin`: trigger on bare `02`, terminate on bare `03`
|
||||
- Both parsers must handle `10 10` → literal `10` unstuffing
|
||||
- ETX detection must be state-machine-aware (not raw byte search) to avoid false matches on stuffed sequences
|
||||
**Checksum is NOT a framing primitive:**
|
||||
- Small frames (e.g. keepalive SUB `5B`): SUM8 validates consistently
|
||||
- Large frames (e.g. SUB `71` config writes): checksum algorithm unknown — does not match SUM8, CRC-16/IBM, CRC-16/CCITT-FALSE, or CRC-16/X25
|
||||
- Frame boundaries are determined structurally; checksum validation is a semantic-layer concern only
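
One way to keep the checksum out of the framing layer is to treat it as an annotation on an already-extracted frame. The sketch below is hypothetical: SUM8 is taken here to mean the byte sum modulo 256, and the assumption that the last payload byte carries the checksum over all preceding bytes is for illustration only. The actual coverage is not stated in this section.

```python
def sum8(data: bytes) -> int:
    """Sum of all bytes, truncated to 8 bits."""
    return sum(data) & 0xFF

def checksum_status(payload: bytes) -> str:
    """Label a frame after extraction; never used to accept or reject it.

    ASSUMPTION: the final byte is a SUM8 over everything before it.
    """
    if len(payload) < 2:
        return "too-short"
    body, claimed = payload[:-1], payload[-1]
    return "sum8-ok" if sum8(body) == claimed else "sum8-mismatch"

body = bytes.fromhex("5B 00 01")
good = body + bytes([sum8(body)])
bad = body + bytes([(sum8(body) + 1) & 0xFF])
statuses = [checksum_status(good), checksum_status(bad)]
```

A frame labeled `sum8-mismatch` is still a valid frame under this design; it only signals that a different algorithm may apply, as with the large config/write frames.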
|
||||
|
||||
### Frame Structure by Direction
|
||||
|
||||
@@ -658,17 +666,18 @@ ESCAPE:
|
||||
|
||||
### Parser State Machine — BW→S3 direction (Blastware commands)
|
||||
|
||||
Trigger on bare STX, terminate on bare ETX. DLE only appears in stuffing context.
|
||||
Trigger on `41 02` (ACK+STX as a unit). ETX accepted only when followed by `41 02` or at EOF.
|
||||
|
||||
```
|
||||
IDLE:
|
||||
receive 0x41 → emit ACK event, stay IDLE
|
||||
receive 0x02 → frame started, goto IN_FRAME
|
||||
receive 0x41 + next==0x02 → frame started (consume both), goto IN_FRAME
|
||||
receive anything → discard, stay IDLE
|
||||
|
||||
IN_FRAME:
|
||||
receive 0x10 → goto ESCAPE
|
||||
receive 0x03 → frame complete — validate checksum, process buffer, goto IDLE
|
||||
receive 0x03 + lookahead==0x41 0x02, or EOF
|
||||
→ frame complete — validate checksum, process buffer, goto IDLE
|
||||
receive 0x03 (no lookahead) → append to buffer (in-payload 03), stay IN_FRAME
|
||||
receive any byte → append to buffer, stay IN_FRAME
|
||||
|
||||
ESCAPE:
|
||||
@@ -676,6 +685,8 @@ ESCAPE:
|
||||
receive anything → append DLE + byte to buffer (recovery), goto IN_FRAME
|
||||
```
|
||||
|
||||
**Architectural note:** Checksum validation is optional and informational only. Frame boundaries are determined structurally via the `03 41 02` sequence — never by checksum gating.
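
A minimal Python sketch of the BW-mode rules described above: `41 02` start, `03` accepted only with `41 02` lookahead or at EOF, and `10 10` unstuffing. It is not the code in `s3_parser.py`; the function name `parse_bw` and the choice to drop an incomplete trailing frame are assumptions made for illustration.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    IN_FRAME = auto()
    AFTER_DLE = auto()

ACK, STX, ETX, DLE = 0x41, 0x02, 0x03, 0x10

def parse_bw(data: bytes) -> list[bytes]:
    """Return unstuffed payloads of complete BW frames found in data."""
    frames: list[bytes] = []
    buf = bytearray()
    state = State.IDLE
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if state is State.IDLE:
            # Start only on the full 41 02 signature; consume both bytes.
            if b == ACK and i + 1 < n and data[i + 1] == STX:
                buf = bytearray()
                state = State.IN_FRAME
                i += 2
                continue
            # Anything else is discarded.
        elif state is State.IN_FRAME:
            if b == DLE:
                state = State.AFTER_DLE
            elif b == ETX and (i + 1 == n or data[i + 1 : i + 3] == bytes([ACK, STX])):
                frames.append(bytes(buf))  # structural boundary, no checksum gate
                state = State.IDLE
            else:
                buf.append(b)  # includes in-payload 03 bytes
        else:  # State.AFTER_DLE
            if b == DLE:
                buf.append(DLE)        # 10 10 -> literal 10
            else:
                buf.extend((DLE, b))   # recovery: keep DLE + byte
            state = State.IN_FRAME
        i += 1
    return frames

stream = bytes.fromhex("41 02 5B 03 07 10 10 03 41 02 5B 00 03 41 02 AA")
frames = parse_bw(stream)
```

For this made-up stream the result is two frames, `5B 03 07 10` and `5B 00`; the trailing `41 02 AA` has no terminator and is not emitted.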
|
||||
|
||||
---
|
||||
|
||||
## 11. Checksum Reference Implementation
|
||||
@@ -829,10 +840,10 @@ As of `s3_bridge v0.5.0`, captures are produced as **two flat raw wire dump file
|
||||
Every byte on the wire is written verbatim — no modification, no record headers, no timestamps. `0x10 0x03` (DLE+ETX) is preserved intact.
|
||||
|
||||
**Practical impact for parsing:**
|
||||
- `raw_s3.bin`: trigger on `0x10 0x02`, terminate on `0x10 0x03` (DLE+ETX)
|
||||
- `raw_bw.bin`: trigger on bare `0x02`, terminate on bare `0x03`
|
||||
- Both: handle `0x10 0x10` → literal `0x10` unstuffing
|
||||
- ETX detection must be state-machine-aware on both sides to avoid false matches on stuffed sequences
|
||||
- `raw_s3.bin`: trigger on `10 02`, terminate on `10 03` (state-machine-aware)
|
||||
- `raw_bw.bin`: trigger on `41 02` (ACK+STX as a unit), terminate on `03` only when followed by `41 02` or at EOF
|
||||
- Both: handle `10 10` → literal `10` unstuffing
|
||||
- Use `s3_parser.py --mode s3` and `--mode bw` respectively
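
For the `raw_s3.bin` side, the rules in the list above can be sketched as follows. This is an illustrative sketch, not the `s3_parser.py` implementation: `parse_s3` is a hypothetical name, and treating DLE followed by any byte other than `10` or `03` as recoverable payload is an assumption borrowed from the recovery rule in the BW state table.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03

def parse_s3(data: bytes) -> list[bytes]:
    """Return unstuffed payloads of complete DLE+STX ... DLE+ETX frames."""
    frames: list[bytes] = []
    buf = bytearray()
    state = "IDLE"
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if state == "IDLE":
            # Start only on the two-byte 10 02 marker; consume both bytes.
            if b == DLE and i + 1 < n and data[i + 1] == STX:
                buf = bytearray()
                state = "IN_FRAME"
                i += 2
                continue
        elif state == "IN_FRAME":
            if b == DLE:
                state = "AFTER_DLE"
            else:
                buf.append(b)  # a bare 03 here is ordinary data
        else:  # AFTER_DLE
            if b == ETX:
                frames.append(bytes(buf))  # 10 03 ends the frame
                state = "IDLE"
            elif b == DLE:
                buf.append(DLE)            # 10 10 -> literal 10
                state = "IN_FRAME"
            else:
                buf.extend((DLE, b))       # assumed recovery path
                state = "IN_FRAME"
        i += 1
    return frames

# Payload contains a stuffed DLE followed by a literal 03 (wire bytes 10 10 03).
stream = bytes.fromhex("10 02 A1 10 10 03 B2 10 03 10 02 C3")
frames = parse_s3(stream)
```

A raw search for `10 03` would end the first frame three bytes early on this input. Tracking state yields the single payload `A1 10 03 B2` and leaves the unterminated `10 02 C3` unemitted.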
|
||||
|
||||
---
|
||||
|
||||
@@ -840,6 +851,7 @@ Every byte on the wire is written verbatim — no modification, no record header
|
||||
|
||||
| Question | Priority | Added | Notes |
|
||||
|---|---|---|---|
|
||||
| **Large BW frame checksum algorithm** — Small frames (SUB `5B` keepalive etc.) validate with SUM8. Large config/write frames (SUB `71`, `68`, `69` etc.) do not match SUM8, CRC-16/IBM, CRC-16/CCITT-FALSE, or CRC-16/X25 in either endianness. Unknown whether it covers full payload or excludes header bytes, or whether different SUB types use different algorithms. | MEDIUM | 2026-03-03 | NEW |
|
||||
| Byte at timestamp offset 3 — hours, minutes, or padding? | MEDIUM | 2026-02-26 | |
|
||||
| `trail[0]` in serial number response — unit-specific byte, derivation unknown. `trail[1]` resolved as firmware minor version. | MEDIUM | 2026-02-26 | |
|
||||
| Full channel ID mapping in SUB `5A` stream (01/02/03/04 → which sensor?) | MEDIUM | 2026-02-26 | |
|
||||
@@ -920,7 +932,7 @@ As of 2026-03-02 the capture pipeline produces two flat raw wire dump files per
|
||||
|
||||
No record headers, no timestamps, no framing logic applied by the dumper. Files are flat concatenations of `serial.read()` chunks. Frame boundaries must be recovered by the parser.
|
||||
|
||||
### C.3 Parser Design — DLE State Machine
|
||||
### C.3 Parser Design — Dual-Mode State Machine (`s3_parser.py v0.2.2`)
|
||||
|
||||
A deterministic state machine replaces all prior heuristic scanning.
|
||||
|
||||
@@ -948,14 +960,15 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
|
||||
|
||||
**BW→S3 parser states:**
|
||||
|
||||
| Current State | Byte | Action | Next State |
|
||||
| Current State | Condition | Action | Next State |
|
||||
|---|---|---|---|
|
||||
| IDLE | `02` | Begin new frame | IN_FRAME |
|
||||
| IDLE | byte==`41` AND next==`02` | Begin new frame (consume both) | IN_FRAME |
|
||||
| IDLE | any | Discard | IDLE |
|
||||
| IN_FRAME | `03` | Frame complete, emit | IDLE |
|
||||
| IN_FRAME | `10` | — | AFTER_DLE |
|
||||
| IN_FRAME | byte==`03` AND (next two==`41 02` OR at EOF) | Frame complete, emit | IDLE |
|
||||
| IN_FRAME | byte==`03` (no lookahead match) | Append `03` to payload | IN_FRAME |
|
||||
| IN_FRAME | byte==`10` | — | AFTER_DLE |
|
||||
| IN_FRAME | other | Append to payload | IN_FRAME |
|
||||
| AFTER_DLE | `10` | Append literal `0x10` | IN_FRAME |
|
||||
| AFTER_DLE | byte==`10` | Append literal `10` | IN_FRAME |
|
||||
| AFTER_DLE | other | Append DLE + byte (recovery) | IN_FRAME |
|
||||
|
||||
**Properties:**
|
||||
@@ -967,9 +980,10 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
|
||||
### C.4 Observed Traffic (Validation Captures)
|
||||
|
||||
**`raw_bw.bin`** (Blastware → S3):
|
||||
- 98 complete frames via state machine (bare STX + bare ETX mode)
|
||||
- 91/98 checksums validate; 7 failures are large frames containing in-payload `10 03` sequences that a naive scanner misreads as ETX
|
||||
- Bare `0x02` STX and bare `0x03` ETX confirmed; DLE used for stuffing only
|
||||
- 98 complete frames via `41 02` start + `03 41 02` structural boundary detection
|
||||
- 91/98 small frames validate SUM8 checksum; 7 large config/write frames fail all known checksum algorithms
|
||||
- `41 02` confirmed as two-byte start signature; bare `02` alone is insufficient
|
||||
- Bare `03` ETX confirmed; in-payload `03` bytes correctly preserved via lookahead rule
|
||||
- Contains project metadata strings: `"Standard Recording Setup.set"`, `"Claude test2"`, `"Location #1 - Brians House"`
|
||||
|
||||
**`raw_s3.bin`** (S3 → Blastware):
|
||||
@@ -983,7 +997,10 @@ STATE_AFTER_DLE — last byte was 0x10, awaiting qualifier
|
||||
1. **Global byte counting ≠ frame counting.** `0x10 0x02` appears inside payloads. Only state machine transitions produce valid frame boundaries.
|
||||
2. **STX count ≠ frame count.** Only STX→ETX pairs within proper state transitions count.
|
||||
3. **EOF mid-frame is normal.** Capture termination during active traffic produces an incomplete trailing frame. Not an error.
|
||||
4. **Layer separation.** The parser extracts frames only. Decoding block IDs, validating checksums, and interpreting semantics are responsibilities of a separate protocol decoder layer above it.
|
||||
4. **Start marker must be the full signature.** In BW mode, `41 02` is the start marker — not bare `02`. Bare `02` appears in payload data and would cause phantom frames.
|
||||
5. **ETX lookahead prevents false termination.** In BW mode, `03` is only a frame terminator when followed by `41 02` or at EOF. In-payload `03` bytes are common in large config frames.
|
||||
6. **Framing is structural. Checksum is semantic.** Frame boundaries are determined by grammar patterns — never by checksum validation. Checksum belongs to the protocol decoder layer, not the framing layer.
|
||||
7. **Layer separation.** The parser extracts frames only. Decoding block IDs, validating checksums, and interpreting semantics are responsibilities of a separate protocol decoder layer above it.
|
||||
|
||||
### C.6 Parser Layer Architecture
|
||||
|
||||
|
||||