# s3_parser.py

## Purpose

`s3_parser.py` extracts complete DLE-framed packets from raw serial
capture files produced by the `s3_bridge` logger.

It operates strictly at the **framing layer**. It does **not** decode
higher-level protocol structures.

This parser is designed specifically for Instantel / Series 3-style
serial traffic using:

- `DLE STX` (`0x10 0x02`) to start a frame
- `DLE ETX` (`0x10 0x03`) to end a frame
- DLE byte stuffing (`0x10 0x10` → literal `0x10`)

------------------------------------------------------------------------

## Design Philosophy

This parser:

- Uses a deterministic state machine (no regex, no global scanning).
- Assumes raw wire framing is preserved (`DLE+ETX` is present).
- Does **not** attempt auto-detection of framing style.
- Extracts only complete `STX → ETX` frame pairs.
- Safely ignores incomplete trailing frames at EOF.

Separation of concerns is intentional:

- **Parser = framing extraction**
- **Decoder = protocol interpretation (future layer)**

Do not add message-level logic here.

------------------------------------------------------------------------

## Input

Raw binary `.bin` files captured from:

- `--raw-bw` tap (Blastware → S3)
- `--raw-s3` tap (S3 → Blastware)

These must preserve raw serial bytes.
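The framing rules and state-machine approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `s3_parser.py`: the names `Frame` and `extract_frames` are hypothetical, and two behaviors are assumptions made for the sketch (a DLE pair other than `DLE DLE` or `DLE ETX` inside a frame is kept in the payload verbatim, and scanning resumes after the trailer bytes).

``` python
from typing import List, NamedTuple

DLE, STX, ETX = 0x10, 0x02, 0x03


class Frame(NamedTuple):
    # Hypothetical container for this sketch, not the tool's output schema.
    start_offset: int  # offset of the opening DLE byte
    payload: bytes     # unescaped payload (DLE stuffing removed)
    trailer: bytes     # up to trailer_len bytes following DLE ETX


def extract_frames(data: bytes, trailer_len: int = 0) -> List[Frame]:
    """Return every complete DLE STX ... DLE ETX frame found in data."""
    frames: List[Frame] = []
    in_frame = False
    payload = bytearray()
    start = 0
    i = 0
    while i < len(data):
        byte = data[i]
        if not in_frame:
            # Idle state: only a DLE STX pair opens a frame.
            if byte == DLE and i + 1 < len(data) and data[i + 1] == STX:
                in_frame, start, payload = True, i, bytearray()
                i += 2
            else:
                i += 1
            continue
        if byte != DLE:
            payload.append(byte)
            i += 1
            continue
        if i + 1 >= len(data):
            break  # lone DLE at EOF: the frame is incomplete and is dropped
        nxt = data[i + 1]
        if nxt == ETX:
            end = i + 2
            frames.append(Frame(start, bytes(payload), data[end:end + trailer_len]))
            in_frame = False
            i = end + trailer_len  # assumption: resume scanning after the trailer
            continue
        if nxt == DLE:
            payload.append(DLE)  # stuffed pair becomes one literal 0x10
        else:
            # Assumption: other pairs (including DLE STX) stay in the payload.
            payload += bytes((DLE, nxt))
        i += 2
    return frames
```

Calling `extract_frames(open("raw_s3.bin", "rb").read(), trailer_len=2)` would return one `Frame` per complete frame; a frame still open at EOF is simply never appended.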
------------------------------------------------------------------------

## Usage

Basic frame extraction:

``` bash
python s3_parser.py raw_s3.bin --trailer-len 2
```

Options:

- `--trailer-len N`
  - Number of bytes to capture after `DLE ETX`
  - Often `2` (CRC16)
- `--crc`
  - Attempts CRC16 validation against first 2 trailer bytes
  - Tries several common CRC16 variants
- `--crc-endian {little|big}`
  - Endianness for interpreting trailer bytes (default: little)
- `--out frames.jsonl`
  - Writes full JSONL output instead of printing summary

------------------------------------------------------------------------

## Output Format

Each extracted frame produces:

``` json
{
  "index": 0,
  "start_offset": 20,
  "end_offset": 4033,
  "payload_len": 3922,
  "payload_hex": "...",
  "trailer_hex": "000f",
  "crc_match": null
}
```

Where:

- `payload_hex` = unescaped payload bytes (DLE stuffing removed)
- `trailer_hex` = bytes immediately following `DLE ETX`
- `crc_match` = matched CRC algorithm (if `--crc` enabled)

------------------------------------------------------------------------

## Known Behavior

- Frames that start but never receive a matching `DLE ETX` before EOF
  are discarded.
- Embedded `0x10 0x02` inside payload does not trigger a new frame
  (correct behavior).
- Embedded `0x10 0x10` is correctly unescaped to a single `0x10`.

------------------------------------------------------------------------

## What This Parser Does NOT Do

- It does not decode Instantel message structure.
- It does not interpret block IDs or message types.
- It does not validate protocol-level fields.
- It does not reconstruct multi-frame logical responses.

That is the responsibility of a higher-level decoder.

------------------------------------------------------------------------

## Status

Framing layer verified against:

- `raw_bw.bin` (command/control direction)
- `raw_s3.bin` (device response direction)

State machine validated via start/end instrumentation.