Compare commits
148 Commits
| SHA1 | Author | Date | |
|---|---|---|---|
| d0b66368d5 | |||
| 25386cab8b | |||
| 6cb619ecc4 | |||
| 1ed86244d0 | |||
| b2c565f217 | |||
| 43f440812a | |||
| 23e83908c2 | |||
| bee118506b | |||
| defd17d9c2 | |||
| e42956a20b | |||
| 9fd52ddabb | |||
| 9b71ead44b | |||
| 2eb1d25028 | |||
| 1bccc44b88 | |||
| a3cc44d30a | |||
| 6a73523e4d | |||
| 780b45a371 | |||
| f6abe3caa0 | |||
| ad2702d4bf | |||
| 86325b9bab | |||
| 6381dcb312 | |||
| 53c05d93e2 | |||
| a5888e1b5c | |||
| b9f8bbb220 | |||
| b59f886cb7 | |||
| 87aec3f4d1 | |||
| ace542cba5 | |||
| 8cbda09917 | |||
| 3457ed0072 | |||
| d21e3b5298 | |||
| ad2b553c7b | |||
| dfbc8b8520 | |||
| 411ef8139e | |||
| ed926de3f4 | |||
| 5d5441604b | |||
| 784f2cca36 | |||
| 6abfadae4f | |||
| fd0e28657d | |||
| c14a8c54db | |||
| 460006e5cd | |||
| 8710b8f327 | |||
| db657bcac9 | |||
| 35842ac50a | |||
| 49a524d0d4 | |||
| 9ef424d098 | |||
| cc821f9ee3 | |||
| ed6982c512 | |||
| d506ebc103 | |||
| e949232875 | |||
| bc5a2d3f19 | |||
| 88549bc659 | |||
| 76bce0b5a3 | |||
| 7183b953e4 | |||
| c3c7fe559c | |||
| fa9d3cdef2 | |||
| c4648c1959 | |||
| 0e89125495 | |||
| fffb363b2b | |||
| e8682d49ad | |||
| 31d691b40b | |||
| beca5de06e | |||
| d85df4c886 | |||
| 0466bb4f44 | |||
| 85f4bcfe86 | |||
| 2ff2762eec | |||
| d4cdce77fa | |||
| ce5dc640ba | |||
| 07675626dc | |||
| ae0e17b5dc | |||
| f68ee9f0f9 | |||
| 5bf5329369 | |||
| 9ed6f2a8d8 | |||
| a0c9a482c7 | |||
| 6ac126e05c | |||
| d3f77d1d96 | |||
| 7bd0f8badf | |||
| 8316a1bbd8 | |||
| 8f568b809b | |||
| ecc935482b | |||
| e95ac692ee | |||
| 3265ad6fa3 | |||
| 350f81f8b5 | |||
| cd20be2eff | |||
| f7c5c9fed3 | |||
| 512d82c720 | |||
| 57287a2ade | |||
| 1fff8179d6 | |||
| ae7edac83f | |||
| b6911009ff | |||
| aac1c8e06d | |||
| 84ee68f889 | |||
| 20519383fe | |||
| 87675ac2d8 | |||
| 83d69b9220 | |||
| 3e247e2182 | |||
| d2e48c62b5 | |||
| 3402b4d11a | |||
| 988d26c03d | |||
| 197c0630e2 | |||
| f83993ad1d | |||
| 6b2a44ff02 | |||
| cc57a8e618 | |||
| 082e5946bc | |||
| a032fa5451 | |||
| 6a7e8c6e86 | |||
| cdfe4ad3c8 | |||
| 510cec8395 | |||
| 7e13c2020f | |||
| 8aea46b8a0 | |||
| 0f7630c10d | |||
| 9123269b1f | |||
| 9400f59167 | |||
| e1a73b2c44 | |||
| bbed85f7e2 | |||
| c641d5fc10 | |||
| 9afa3484f4 | |||
| 0484680c89 | |||
| 3711b11bda | |||
| 429c6ac87a | |||
| 52c6e7b618 | |||
| 29ebc75656 | |||
| ebfe9877fa | |||
| c914a15e12 | |||
| a27693242d | |||
| eefec0bd64 | |||
| 7444738883 | |||
| 6b76934a04 | |||
| 7b62c790a9 | |||
| b66cc9d075 | |||
| 4ab604eff1 | |||
| e15f1567ef | |||
| bb33ad3837 | |||
| 45e61fbcaf | |||
| d758825c67 | |||
| 0fbb39c21a | |||
| 1ef55521b1 | |||
| 738b39f3cb | |||
| 625b0a4dfc | |||
| b14f31f3b0 | |||
| b9ab368934 | |||
| 9004241846 | |||
| 6861d9ed97 | |||
| 5cd5652560 | |||
| 897ac8a3f3 | |||
| 310fc5986c | |||
| e1150b30aa | |||
| 9bbecea70f | |||
| 4a0c9b6da5 |
@@ -0,0 +1,28 @@
.git
.gitignore

.venv
venv
env
__pycache__
*.pyc
*.pyo
*.pyd
.pytest_cache
.mypy_cache
.ruff_cache

*.db
*.db-wal
*.db-shm
*.sqlite
*.sqlite3

sfm/data
bridges/captures
example-events
captures
logs

.DS_Store
Thumbs.db
+1
-1
@@ -1,6 +1,6 @@
/bridges/captures/
/example-events/
/tests/fixtures/
/manuals/

# Python build artifacts
+665
@@ -4,8 +4,669 @@ All notable changes to seismo-relay are documented here.

---

## [Unreleased]

---

## v0.21.1 — 2026-06-01

Bug fixes against v0.21.0 surfaced after the first prod redeploy. Three
production-visible symptoms — blank waveform charts on most Thor events,
blank histogram charts on all Thor events, and a mic chart that
auto-scaled against a dB(L) value treated as psi — all root-caused and
fixed.

### Fixed

- **Dynamic IDFW body offset.** The v0.21.0 codec hardcoded the body
  at file offset `0x0f1f` based on the example corpus, but only ~52%
  of production IDFW events use that offset; the rest sit at offsets
  from `0x1033` up to `0x3082` depending on header padding. On those
  files the codec would find a coincidentally-matching `00 02 00`
  magic at `0x0f1f`, read the 2-byte Tran preamble, and return empty
  V/L/M arrays, producing near-empty .h5 files and blank charts.
  `micromate.idf_file._find_waveform_body_offset()` now scans every
  `00 02 00` magic position past `0x0E00`, trial-decodes each one,
  and picks the offset with the most samples. Validated across 483
  prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully
  decode, 126/483 partial (BW codec walker-stops-early on loud
  events; this is a pre-existing limitation, and the samples reached
  are correct).

- **IDFH histograms now render bar charts.** Histograms previously
  skipped the .h5 write because there are no per-sample arrays, but
  the renderer drives the per-interval bar chart from .h5 channel
  data + `bw_report.histogram.n_intervals`. `save_imported_idf` now
  synthesizes a 1-sample-per-interval array from the decoded
  `IdfhInterval` peak counts and writes an .h5 so the existing
  renderer works unchanged. Each "sample" is the per-interval peak
  ADC count, so the writer's `count × geo_fs/32768` conversion
  yields the right bar height.

- **Mic chart scaling on Thor events.** `PeakValues.micl` (consumed
  by the h5 writer's per-count mic scale factor) expects psi, but
  the Thor bridge was stuffing the dB(L) value (~99.4) into it,
  producing a per-count factor 5+ orders of magnitude too large and
  a flat-looking mic chart. Fixed by adding `IdfPeaks.mic_pspl_psi`
  alongside `mic_pspl_dbl`; `read_idf_file()` computes it from
  binary mic counts (`max(|MicL|) × 2.14e-6 psi/count`) for both
  IDFW and IDFH paths; `save_imported_idf` merges it onto the typed
  event after `IdfEvent.from_report`; the bridge feeds psi to
  `PeakValues.micl` with a dB(L)→psi formula fallback when only the
  dB(L) value is available. dB(L) for the report header still
  flows through `bw_report.mic.pspl_dbl` unchanged.

### Operator

After deploy, run `python scripts/backfill_thor_events.py` to refresh
every existing Thor event's sidecar + .h5 with the corrected codec
output. The script auto-skips events already at the current
`TOOL_VERSION`, so the bump from `0.21.0` → `0.21.1` is what triggers
the refresh.

---

## v0.21.0 — 2026-05-29

The "Thor / Series IV codec" release. Two big pieces landed: (1) the IDF binary codec actually decodes now, both IDFW and IDFH, and (2) a Thor→BW adapter lets Thor events flow through the existing Series III Event Report PDF pipeline. Combined effect: a Thor event ingested via `/db/import/idf_file` now lands in the DB with the same fidelity as a Blastware event, gets a per-event PDF on demand, and renders in Terra-View's modal chart with the same plotting code as a BW event.

### Added — Thor IDF binary codec (`micromate/idf_file.read_idf_file`)

- **IDFW (waveform)** — body sits at fixed file offset `0x0f1f`; reuses the verified `decode_waveform_v2()` walker from `minimateplus.waveform_codec`. Sample fidelity is **87–99% byte-exact** against the ASCII-sidecar reference values on quiet events; loud events hit the same walker-stops-early limitation as the BW codec on `SP0/SS0/SV0`-style events.
- **IDFH (histogram)** — dedicated segment-based decoder for the Thor histogram body format: `[len_be][0a 00 00 00][00 NN][05 3f]` framing plus N × 72-byte interval records (4 × 16-byte per-channel min/max/halfp). **All 859 Thor IDFH corpus files decode**, totalling **181,071 intervals**; per-channel peaks match the sidecar within **~1.8% (ADC quantization)**.
- **BW-aliased binary detection** — a small number of corpus files (e.g. `BE9439_*.IDFW/IDFH`) are actually Series III Blastware binaries that share the IDF filename convention by accident. `read_idf_file()` detects them via their BW `STRT` signature and raises `NotImplementedError` pointing the caller at `read_blastware_file()` instead of trying to decode them as IDF.
- Full field layouts in `docs/idf_protocol_reference.md`; supporting analysis scripts in `analysis_idf/` (decode validators, per-file detail dumps, corpus accuracy reports).

### Added — Thor → BW report adapter (`micromate/idf_to_bw_report.py`)

- **`build_bw_report_from_idf(report_dict, binary_md=, intervals=, is_histogram=)`** projects a parsed Thor `IdfReport` plus binary-extracted metadata plus decoded IDFH intervals into the `bw_report`-shaped dict that `sfm.report_pdf.gather_report_data` consumes. The renderer does not need to be duplicated: Thor data covers ~95% of the same metric set as BW, and the adapter handles the field-name mapping (`MicPSPL` → `pspl_dbl`, `>100` sentinel → `zc_freq_above_range`, free-form `Calibration : Nov 22, 2023 by Instantel` → `calibration_date` + `calibration_by`, etc.).
- For IDFH events the adapter derives `histogram.interval_times` by stepping `IntervalSize` from `HistogramStartTime`, matching what the BW pipeline expects from a histogram-mode event.
- **Wired into `WaveformStore.save_imported_idf`** — every Thor event ingested via `/db/import/idf_file` now gets a `bw_report` block in its sidecar in addition to the existing `extensions.idf_report` (the raw parsed Thor payload). Falls back gracefully (PDF renders from DB-only fields) if the adapter raises; the failure is logged as a warning rather than failing the ingest.
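
The `histogram.interval_times` derivation in the second bullet amounts to stepping a fixed interval from the start time. A minimal sketch, assuming one label per interval taken at the interval's start and formatted as `HH:MM:SS` (the adapter's real label convention is not stated here, and `derive_interval_times` is a hypothetical name):

```python
from datetime import datetime, timedelta
from typing import List


def derive_interval_times(start: datetime, interval_size_s: float,
                          n_intervals: int) -> List[str]:
    """Step `interval_size_s` from `start`, one HH:MM:SS label per interval."""
    return [
        (start + timedelta(seconds=i * interval_size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]


# Four 1-minute intervals starting at 22:33:52 (illustrative values).
times = derive_interval_times(datetime(2026, 5, 29, 22, 33, 52), 60, 4)
print(times)
```
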

### Companion releases

- **Terra-View v0.13.0** ships in parallel — closes Phase 1 of the SFM integration. The shared event-detail modal now renders the SFM event story (Chart.js waveform/histogram chart, inline PDF preview, `.TXT` download, FT/reviewer/notes review form) without operators needing to bounce to the standalone SFM webapp on port 8200. Uses only existing seismo-relay endpoints — no API changes here, just better consumption.

### Migration / Operations

No DB migration needed. Existing Thor events already in the store don't automatically pick up the new `bw_report` block — they'd need a re-ingest (post the IDF binary + paired `.TXT` back to `/db/import/idf_file`) for the adapter to run. Alternatively, run `scripts/backfill_sidecars.py --reparse-txt` after a small adapter change (the script currently only re-runs the BW ASCII parser; extending it to handle Thor would be a small follow-up).

```bash
cd /home/serversdown/terra-view
docker compose build sfm && docker compose up -d sfm
```

The bumped `TOOL_VERSION = "0.21.0"` in `minimateplus/event_file_io.py` means any subsequent `backfill_sidecars.py --force` pass will re-write sidecars with the new version stamp; that's expected and harmless.

---

## v0.20.0 — 2026-05-28

The "PDF + parser polish" release. Closes out the Event-Report PDF iteration started in v0.17.x: histogram layouts now render correctly against BW reference PDFs, the ASCII parser handles the real-world edge cases production events were tripping over (OORANGE, `>100 Hz`, histogram timestamps), and the `.TXT` preservation rollout lets parser fixes be applied retroactively to ingested events. Adds server-wide timezone support so operator-visible timestamps no longer drift into UTC. Rolls up the substantial "pre-v0.20" body of work that had accumulated under `[Unreleased]` (PDF generation, histogram codec fix, histogram parser fields, `.TXT` preservation, backfill safety) — see the trailing "pre-v0.20.0 work" section below for the full list.

### Added (2026-05-28)

- **Server-wide display timezone via `TZ` env var.** Both seismo-relay and terra-view now respect a `TZ` environment variable (default `America/New_York` on prod). Affects server log timestamps, the PDF report renderer's UTC→local conversions on the "Created" footer line, matplotlib's datetime axes, and any other naïve-vs-aware datetime rendering. DB columns (`created_at`, etc.) stay UTC regardless — this is a display-side fix, not a storage-side one. Dockerfile now installs `tzdata` (required for the env var to take effect under `python:slim`). Override per-deployment via the `TZ` line in `docker-compose.yml`.
- **ZC Freq "above-range" handling — render `>100 Hz` instead of `—`.** BW writes `">100 Hz"` literally when the zero-crossing algorithm sees a peak too fast to count (device cuts off at 100 Hz on V10.72). Previously `_parse_number(">100")` returned None and the PDF stats table rendered `—`. Now the parser mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` and sets a new `zc_freq_above_range` flag. Flag rides through the sidecar's `bw_report` block. Renders as `>100` in the PDF (per-channel + mic block), as `· >100 Hz` inline on the event modal's Peaks section, and as a dedicated column on the event-browser stats table. Verified against the real T190LD5Q.LK0W fixture from 2026-05-27 plus a synthetic test case.
- **Per-channel ZC Freq surfaced in event modals.** Neither the main webapp modal (`sfm_webapp.html`) nor the standalone event browser (`event_browser.html`) previously exposed ZC Freq. Now both do — webapp shows it inline alongside PPV (`0.04500 in/s · 47 Hz`); event-browser gets a dedicated column on its per-channel stats table. Required wiring a parallel sidecar fetch into the event-browser's `loadEvent()` (it was only fetching `waveform.json`). Falls back to `—` for events without a preserved `.TXT` (pre-2026-05-27 ingests).
- **`scripts/backfill_sidecars.py --reparse-txt` flag.** Before this, the backfill script preserved the `bw_report` block from existing sidecars verbatim — so parser-side fixes (like the `>100 Hz` addition above) couldn't reach old events. The new flag re-runs the current parser against the preserved `<serial>/<filename>_ASCII.TXT`, overwrites the bw_report block, and cascade-regenerates the sidecar. Implies sidecar regeneration on every event (bypasses the sha/version skip). No-op for events without a preserved .TXT (legacy ingests from before the 2026-05-27 .TXT-preservation rollout). Idempotent. Run with `--skip-hdf5` to skip waveform regen — recommended when only the bw_report needs refreshing. Validated end-to-end on prod: 9,999 events refreshed cleanly, ZC Freq + OORANGE flags now populated where the original .TXT had them.
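
The above-range handling for ZC Freq can be illustrated with a small parser. This is a sketch under assumptions, not the project's `_parse_number()`: `parse_zc_freq` is a hypothetical helper, and only the "store the numeric value, set an above-range flag" behaviour is taken from the entry above.

```python
import re
from typing import Optional, Tuple

_NUM = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, zc_freq_above_range) for a ZC Freq field.

    A leading '>' keeps the numeric value and raises the flag instead of
    discarding the field.
    """
    s = text.strip()
    match = _NUM.search(s)
    if match is None:
        return None, False            # nothing numeric to report
    return float(match.group()), s.startswith(">")


print(parse_zc_freq(">100 Hz"))
print(parse_zc_freq("47 Hz"))
```
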

### Fixed (2026-05-28)

- **Histogram PDFs no longer 500 on the missing `histogram_interval_size_s` attribute.** The histogram-interval-times derivation block in `gather_report_data` referenced `rd.histogram_interval_size_s`, but the field was never declared on the `ReportData` dataclass nor read from the sidecar projection (it was inlined into `gather_report_data` without the seconds-numeric counterpart making it onto the dataclass). Every histogram PDF render raised `AttributeError → 500`. Waveform PDFs were unaffected. Fix: add the field, read it from the projection's existing `bw_report.histogram.interval_size_s` key.
- **Histogram PDF geo channels now share a single nice-quantized y-axis.** Previously each geo subplot auto-scaled independently — Tran, Vert, and Long all showed different per-channel maxes, so bar heights weren't directly comparable across channels. The footer "Amplitude Geo: X in/s/div" label was also computed as `max(first_geo_channel) / 5` with no LSB quantization, producing nonsense values like `0.003 in/s/div` when the geophone LSB is 0.005. Fix: compute a single shared geo y-axis range from `max(Tran, Vert, Long)`, quantize the per-division step to BW's 1-2-5 sequence rounded to the 0.005 in/s LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same `ylim` + ticks to all three subplots, and use that step for the footer label. MicL stays on its own auto-scale (different units). Matches BW's chart styling.
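
The shared-axis quantization can be sketched as below. Several details are assumptions rather than the renderer's actual logic: the step list is generated from mantissas 1, 2.5 and 5 (which reproduces the values listed above), the axis is split into 5 divisions (the divisor the old label used), and `geo_step` / `shared_geo_ylim` are hypothetical names.

```python
import math
from typing import Tuple

LSB_IPS = 0.005               # geophone LSB quoted in the entry above
MANTISSAS = (1.0, 2.5, 5.0)   # yields 0.01, 0.025, 0.05, 0.1, 0.25, ...
N_DIVISIONS = 5               # assumption: same divisor as the old max/5 label


def geo_step(peak_ips: float) -> float:
    """Smallest allowed step such that N_DIVISIONS * step covers the peak."""
    target = max(peak_ips, 0.0) / N_DIVISIONS
    if target <= LSB_IPS:
        return LSB_IPS                      # never go below one LSB
    exp = math.floor(math.log10(target))
    for e in (exp, exp + 1):
        for m in MANTISSAS:
            step = round(m * 10.0 ** e, 6)
            if step >= target - 1e-12:
                return step
    raise AssertionError("unreachable: a candidate in the next decade always fits")


def shared_geo_ylim(tran: float, vert: float, long_: float) -> Tuple[float, float]:
    """One (0, top) range applied to all three geo subplots."""
    step = geo_step(max(tran, vert, long_))
    return 0.0, N_DIVISIONS * step


print(geo_step(0.015))                     # peak whose naive max/5 was 0.003
print(shared_geo_ylim(0.045, 0.12, 0.08))
```
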

### Docs (2026-05-28)

- **Roadmap entry for a second undecoded histogram body sub-format.** BE17353 (S353) events observed on 2026-05-28 use a histogram body where `byte[5] = 0x00` (looks like a valid block header by every prior signal) but the walker finds zero data blocks. Different from the existing `byte[5] != 0` roadmap entry (T190 / O121). Operationally identical impact — ingestion succeeds, DB peaks come from the bw_report overlay, only the chart is empty. Sample events captured in the roadmap entry for future RE work.

### Migration / Operations

- **Re-parse existing events to pick up the new parser fields.** Run on whichever box hosts the live waveform store:

  ```bash
  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
    --reparse-txt --skip-hdf5 --dry-run -v | tail
  # Looks reasonable? Run for real:
  docker exec terra-view-sfm-1 python /app/scripts/backfill_sidecars.py \
    --reparse-txt --skip-hdf5 -v | tee /tmp/reparse.log | tail -30
  ```

  Idempotent; safe to re-run. Only touches sidecars on disk — no DB writes.
- **terra-view docker-compose.yml**: add `TZ=America/New_York` (or your deployment's zone) to both the `terra-view` and `sfm` service `environment:` blocks. Without this, server-rendered timestamps stay in UTC even on the rebuilt SFM image.

### Pre-v0.20.0 work (rolled into this release)

The bullets below accumulated under `[Unreleased]` between v0.19.0 and v0.20.0; kept here so the historical narrative isn't lost.

#### Fixed

- **bw_ascii_report parser now handles `OORANGE` saturation marker.** BW writes `"OORANGE"` (truncation of "Out Of Range") in PPV / PVS / MicL PSPL fields when the underlying measurement exceeded the channel's full-scale. Previously our `_parse_number()` returned None → DB ended up with NULL peaks for legitimate high-amplitude events. Confirmed on real ASCII files pulled 2026-05-27 from the Windows watcher PC: T190LD5Q.LK0W (Vert saturated at Normal range 10 in/s), T438L713.RY0W (all three channels saturated at Sensitive range 1.25 in/s), K557L3YM.OE0W (Tran+Vert saturated + Mic PSPL OORANGE). New behavior:
  - Per-channel PPV: substitute `geo_range_ips` as a conservative lower bound + set `ppv_saturated` flag
  - Peak Vector Sum: substitute `sqrt(3) * geo_range_ips` (the theoretical max when all 3 channels are simultaneously at full-scale) + `peak_vector_sum_saturated` flag
  - MicL PSPL: substitute 140 dB(L) (conservative NL-43 max) + `pspl_saturated` flag
  - Saturation flags are propagated into the sidecar's `bw_report` block for downstream UI rendering (`> 10 in/s` or similar)
  - Five events on prod (T190 / T438 / K557 + 2 others matching the same fault pattern) will pick up correct DB peaks + saturation flags once re-forwarded
- **bw_ascii_report parser handles `Peak Vector Sum TimeSum` typo'd label.** Real BW output uses this misspelled label ("Sum" appears a second time where "Peak Vector Sum Time" was intended). Now accepted as an alias. Confirmed against all three OORANGE example files — every one has the typo.
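
The OORANGE substitutions listed above reduce to "use a stand-in value and set a flag". The sketch below assumes a hypothetical helper `parse_saturable`; the stand-in values (`geo_range_ips`, `sqrt(3) * geo_range_ips`, 140 dB(L)) are the ones stated in the entry, while the field strings are made up for illustration.

```python
import math
from typing import Optional, Tuple

OORANGE = "OORANGE"
MIC_MAX_DBL = 140.0   # conservative mic ceiling quoted in the entry above


def parse_saturable(text: str, substitute: float) -> Tuple[Optional[float], bool]:
    """Return (value, saturated); OORANGE yields the substitute and True."""
    s = text.strip()
    if s.upper().startswith(OORANGE):
        return substitute, True
    try:
        return float(s.split()[0]), False
    except (ValueError, IndexError):
        return None, False               # unparseable or empty field


geo_range_ips = 1.25                      # Sensitive range from the entry above
ppv = parse_saturable("OORANGE", geo_range_ips)
pvs = parse_saturable("OORANGE", math.sqrt(3) * geo_range_ips)
mic = parse_saturable("OORANGE", MIC_MAX_DBL)
normal = parse_saturable("0.045 in/s", geo_range_ips)
print(ppv, pvs, mic, normal)
```

The `sqrt(3)` factor follows from the vector sum `sqrt(T² + V² + L²)` with all three channels at the same full-scale value.
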

#### Added

- **Histogram per-interval aggregation in `waveform.json`.** Histogram events now render with one bar per BW-reported interval (matching the Blastware printout) instead of ~200 bars per event (the raw codec output). When the sidecar's `bw_report.histogram.n_intervals` is populated (events ingested with the new parser, see next bullet), the `/db/events/{id}/waveform.json` endpoint groups the codec samples into N intervals via max-per-group and returns the aggregated array. `time_axis` gains `histogram_aggregated: true`, `n_intervals`, `interval_size_s`, and `interval_times` (HH:MM:SS strings). Both the modal chart and the standalone event browser use those interval timestamps as x-axis labels when present. Defensive: no-op for events ingested before the parser extension landed (their sidecars lack `histogram.n_intervals`) — those continue to render with raw codec output.
- **`bw_ascii_report` parser now captures histogram-specific fields.** Previously the parser dropped these fields silently (Roadmap item closed):
  - `Histogram Start Time` / `Histogram Start Date` (combined into `histogram_start: datetime`)
  - `Histogram Stop Time` / `Histogram Stop Date` (combined into `histogram_stop: datetime`)
  - `Number of Intervals` (`histogram_n_intervals: int`)
  - `Interval Size` ("1 minute" string + parsed seconds: `histogram_interval_size_str`, `histogram_interval_size_s`)
  - `<Channel> Peak Time` + `<Channel> Peak Date` for histogram events (combined into `channel_peak_when: dict`; waveforms continue to use `time_of_peak_s` relative)
  - `Peak Vector Sum Date` (combined with PVS Time into `peak_vector_sum_when: datetime`; clears the previous bogus `peak_vector_sum_time_s` parse that interpreted "22:33:52" as 22.0 seconds)
  - All new fields land in the sidecar's `bw_report.histogram` block via `_bw_report_to_dict`. Tested against synthetic K558LLB7.V20H-shaped input.
- **Raw BW ASCII report (.TXT) preservation.** `save_imported_bw` now writes the paired `_ASCII.TXT` to `<store>/<serial>/<filename>_ASCII.TXT` alongside the binary at ingest time. Previously the .TXT was parsed into the sidecar's `bw_report` projection and then discarded — meaning parser bug fixes couldn't be applied retroactively without re-forwarding from the watcher PC. Now the raw .TXT lives in the waveform store permanently (~15 KB per event; ~210 MB total for a 14k-event store; negligible). Sidecar's `source.txt_filename` field records the saved path; backfill_sidecars preserves it across regens. New `GET /db/events/{id}/ascii_report.txt` endpoint serves the raw .TXT for any event ingested after this change. Events ingested before this change still return 404 from that endpoint until re-forwarded. Architectural rationale: with BW Mail / Forwarding Agent being phased out of the operator workflow, the XML/PDF/WMF that those tools produced are no longer available — the binary + .TXT (created by BW ACH itself) are our authoritative source for everything going forward.

- **Event Report PDF generation** — `GET /db/events/{id}/report.pdf` returns a single-page letter-portrait PDF for any event with waveform data on disk. Covers every field a Blastware Event Report includes: header metadata (date/time, trigger source, range, sample rate, project/client/operator/location, serial+firmware, battery, calibration, file name), microphone block (PSPL in dB(L) + psi, ZC freq, channel test), per-channel stats table (rows differ for waveform vs histogram), Peak Vector Sum, and the 4-channel plot. Iterated against real Blastware reference PDFs (uploaded to `example-events/pdfsnstuff/`):
  - **Waveform layout**: header shows Date/Time, Trigger Source, Range, Sample Rate; stats table has PPV / ZC Freq / Time (Rel. to Trig) / Peak Accel / Peak Disp / Sensor Check; bottom plot is 4-channel line waveform (MicL top → Tran bottom), shared time axis in seconds, dashed trigger line + triangle marker at t=0, symmetric Y on geo channels, zero-anchored on mic, "0.0" baseline label on right per BW convention; footer shows `Time X sec/div Amplitude Geo: Y in/s/div Mic: 0.001 psi(L)/div` and the trigger window `▶━━◀` marker. USBM RI8507/OSMRE compliance chart placeholder upper-right.
  - **Histogram layout**: header shows Start / Finish / Intervals At Size / Range / Sample Rate (no Trigger Source — histograms aren't triggered); NO USBM chart; stats table has PPV / ZC Freq / Date / Time / Sensor Check; bottom plot is per-interval bar chart, Y-axis 0-to-peak (never negative), 0.0 baseline at the bottom; footer shows `Time INTERVAL_SIZE /div Amplitude Geo: Y in/s/div Mic: 0.001 psi(L)/div`.
  - Backed by matplotlib (vector PDF, no headless-browser dep). Adds matplotlib>=3.8 to deps.
  - **Known gap**: histogram codec returns per-block granularity (~200 bars for a 4-interval event) instead of BW's per-interval aggregation. Visual difference vs BW's 4-bar display. XML-driven data source (parsing the structured `_XML.XML` files BW also exports) is the planned fix; that route also resolves the bw_ascii_report PPV-miss bug.
  - **Stubbed**: USBM RI8507 / OSMRE compliance chart curves (separate work item; requires coding the regulatory piecewise functions).
- **"Download PDF" button** in the event modal's footer — triggers the new endpoint; opens in a new tab so the browser handles save-or-display + surfaces any 404 / server errors visibly.

- **SFM webapp now opens to Database view by default** and the History table is fully interactive. Click any column header to sort ascending / descending (timestamp, serial, per-channel PPV, PVS, mic dB(L), project, client, record type, key — all sortable). Click any event row to open the event modal, which now renders a **4-channel waveform plot inline** (MicL / Long / Vert / Tran stacked, Instantel-printout order) alongside the existing sidecar review fields. Headers are sticky so the columns stay visible while scrolling long event lists. No more "where is the viewer" — pick a unit from the filter dropdown, scan the table, click the event, see the waveform.
- **Stored-event browser** — new standalone HTML page at `GET /events` (`sfm/event_browser.html`). Pick a serial from the unit dropdown, scroll through that unit's events (newest-first), click any event to render its decoded waveform via the existing `/db/events/{id}/waveform.json` endpoint. Dark-themed Chart.js viewer, channels stacked vertically (MicL / Long / Vert / Tran — Instantel printout order, designed PDF-export-ready), trigger line at t=0, peak labels, search/filter, false-trigger flag honored. Companion to the existing live-device viewer at `/waveform`; the two routes are now clearly delineated in their docstrings. The webapp's inline plot at `/` is the primary path; `/events` remains a useful diagnostic when you want just a viewer.
- **Histogram body codec — uint8 peak count fix.** Per-channel peak fields at `block[6]/[10]/[14]/[18]` are `uint8`, not `uint16 LE` spanning `block[6:8]` etc. The original interpretation was byte-exact on the N844 fixture corpus only because every annotation byte (`block[7]/[11]/[15]/[19]`) in those fixtures was zero. On non-N844 events with non-zero annotation bytes (observed across BE9558 Tran-drift and BE18003 Histogram+Continuous units), the old interpretation produced peaks up to 268 in/s per channel and 35× inflated PVS sums when first deployed to prod (rolled back same day; properly fixed in this release). Cross-correlated against BW's per-interval ASCII export on K558 / T003 / N599 / N844 corpora — 100% byte-exact on T/V/L, 99%+ on M (sub-precision rounding). Annotation byte preserved on each record as `record["annotations"]` for future RE. Verified against ~3,500 blocks across 5 in-repo fixtures + a synthetic K558 interval-12 regression block.
- **`apply_bw_report_dict_to_event` helper** in `minimateplus.event_file_io`. Mirror of `apply_report_to_event` for the projected sidecar dict shape — used by the backfill path, which has the preserved `bw_report` block but not the original `.TXT` file. BW's reported peaks (and `sample_rate` / `record_time`) now win over codec output during `--force` backfill, matching ingest-path behavior.
- **`scripts/check_bw_report_preservation.py`** — two-step snapshot/diff tool to verify that `backfill_sidecars.py` doesn't wipe the `bw_report` block from existing sidecars. Classifies every sidecar as PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED. Exit code 1 if any WIPED or CHANGED entries are found, so it can gate a CI step or deploy script.
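
To make the uint8 peak count fix concrete, the sketch below reads the same bytes both ways. The byte offsets are the ones given in the codec bullet; the channel-to-offset mapping, the 20-byte synthetic block and the name `decode_peaks` are assumptions for illustration, not the project's record layout.

```python
import struct
from typing import Dict

# Offsets from the changelog entry; the channel order is an assumption.
PEAK_OFFSETS = {"Tran": 6, "Vert": 10, "Long": 14, "MicL": 18}


def decode_peaks(block: bytes) -> Dict[str, Dict[str, int]]:
    """Read each peak as uint8 and show what the old uint16 LE read produced."""
    out = {}
    for name, off in PEAK_OFFSETS.items():
        out[name] = {
            "peak": block[off],                 # correct: single byte
            "annotation": block[off + 1],       # kept separately, meaning unknown
            "legacy_uint16": struct.unpack_from("<H", block, off)[0],
        }
    return out


block = bytearray(20)
block[6], block[7] = 0x09, 0x00     # zero annotation: both readings agree
block[10], block[11] = 0x09, 0xD1   # non-zero annotation inflates the old reading
peaks = decode_peaks(bytes(block))
print(peaks["Tran"], peaks["Vert"])
```

With a zero annotation byte the two interpretations coincide, which is consistent with the old reading having looked byte-exact on fixtures where every annotation byte was zero.
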

#### Fixed

- **`scripts/backfill_sidecars.py` no longer wipes `bw_report`.** Before this fix, `event_to_sidecar_dict` silently dropped the preserved `bw_report` block during every backfill, since the function only emits a `bw_report` when called with a live `BwAsciiReport` dataclass (which the backfill doesn't have — only the projected sidecar dict). Now we read the existing sidecar's `bw_report` and overlay it onto the regenerated sidecar, alongside the existing `review` and `extensions` preservation.
- **`scripts/backfill_sidecars.py --force` no longer overwrites BW-overlaid DB peaks with codec output.** The backfill path now calls `apply_bw_report_dict_to_event` before the DB upsert, mirroring what the ingest path does (`/db/import/blastware_file` parses the `.TXT` into a `BwAsciiReport`, calls `apply_report_to_event`, then upserts). Without this, events where the codec doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style events, histogram `byte[5]!=0` sub-format) ended up with PVS=0 in the DB after a `--force` backfill; this hit prod on 2026-05-22 and was rolled back the same day.
- **Thor IDF files no longer attempted as BW events in backfill.** `scripts/backfill_sidecars.py` now filters out `.IDFW` / `.IDFH` files in `_looks_like_event_file()`; they share the `.X0W` / `.X0H` suffix shape but use a separate ingest path (`WaveformStore.save_imported_idf`) and aren't decodable by `event_file_io.read_blastware_file`.
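
The overlay described in the first bullet can be pictured as a dictionary merge in which preserved blocks from the old sidecar win over the regenerated ones. A minimal sketch, assuming sidecars are plain dicts; `overlay_preserved` is a hypothetical name, and only the three block names come from the entry above.

```python
from typing import Any, Dict

PRESERVED_KEYS = ("bw_report", "review", "extensions")


def overlay_preserved(regenerated: Dict[str, Any],
                      existing: Dict[str, Any]) -> Dict[str, Any]:
    """Copy preserved blocks from the existing sidecar onto the new one."""
    merged = dict(regenerated)
    for key in PRESERVED_KEYS:
        if existing.get(key) is not None:
            merged[key] = existing[key]
    return merged


old = {"tool_version": "0.19.0", "bw_report": {"ppv": 0.045}, "review": {"ft": False}}
new = {"tool_version": "0.20.0", "bw_report": None}
merged = overlay_preserved(new, old)
print(merged)
```
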
|
||||||
|
|
||||||
|
#### Docs
|
||||||
|
|
||||||
|
- **CLAUDE.md** — added a three-tier conceptual architecture model (SFM / SDM / shared codec library) near the top of the file, with a placement rule for where new code goes. Documents that what is conceptually SDM (database, waveform store, ingest, `/db/*` endpoints) still lives under `sfm/` for historical reasons; rename deferred until the codebase is quiet enough for a clean refactor.
|
||||||
|
- **README.md** — added a "Strategic direction" lead-in to the Roadmap that frames seismo-relay as a suite of cooperating components (not a single app), and an explicit "Terra-View ↔ SFM device control" roadmap section with a concrete implementation checklist (auth as hard prerequisite, embedded live-monitor view, action history, Series IV live-device support).
|
||||||
|
- **`docs/histogram_codec_re_status.md`** updated with the uint8 retraction and the annotation-byte status.
|
||||||
|
- Three known issues recorded in the Roadmap that were discovered during prod validation: (1) `bw_ascii_report` parser misses PPV / `vector_sum` on some `.TXT` formats (5 events on prod); (2) NULL-timestamp duplicate-row dedup needed (2 events on prod); (3) histogram body sub-format with `byte[5] != 0` not yet decoded (~3 events on prod with empty `.h5` plots).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## v0.19.0 — 2026-05-20

The "device-family separation" release. Tightens the boundary between Series III (MiniMate Plus / Blastware) and Series IV (Micromate / Thor) so the UI and storage layer dispatch deterministically by family instead of sniffing filename extensions or magnitude heuristics.

### Added — Phase 1: `device_family` column on `events`

- **`events.device_family TEXT`** — new column carrying `"series3"` or `"series4"`. Populated by every import path (`/db/import/blastware_file`, `/db/import/idf_file`, ACH server, BW CLI, sidecar backfill script). Returned through `/db/events` since `query_events` uses `SELECT *`.
- **Self-applying migration** — on startup, `ALTER TABLE ... ADD COLUMN` lands the new column; a follow-on `UPDATE` backfills existing rows from the binary filename extension (`.IDFH`/`.IDFW` → `series4`, everything else → `series3`). No manual SQL needed.
- **UPSERT preserves family** — re-imports without an explicit family don't blank existing rows (`COALESCE(?, device_family)`).
- **UI dispatches on the column** — `sfm_webapp.html` events-table mic formatter now branches on `ev.device_family === 'series4'` (Thor stores native dB(L); BW stores psi). Modal uses `source.kind === 'idf-import'` from the sidecar (sidecars don't carry the DB column). Source-files section labels changed from "BW filename / BW filesize / BW sha256" to format-neutral "Event file / File size / File sha256".
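The family-preserving behaviour can be illustrated with a small in-memory SQLite sketch. Only the `COALESCE(?, device_family)` expression comes from the entry above; the two-column key, the serial `UM12345`, and the helper names `reimport` and `family_of` are hypothetical stand-ins, and the real upsert in `sfm/database.py` may be structured differently.

```python
import sqlite3

# Hypothetical minimal schema: only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " serial TEXT, timestamp TEXT, device_family TEXT,"
    " UNIQUE (serial, timestamp))"
)
conn.execute(
    "INSERT INTO events VALUES (?, ?, ?)",
    ("UM12345", "2026-05-20T10:00:00", "series4"),
)


def reimport(conn, serial, timestamp, device_family=None):
    """Refresh a row; a NULL family keeps whatever is already stored."""
    conn.execute(
        "UPDATE events SET device_family = COALESCE(?, device_family)"
        " WHERE serial = ? AND timestamp = ?",
        (device_family, serial, timestamp),
    )


def family_of(conn, serial, timestamp):
    """Return the stored family for one (serial, timestamp) row."""
    row = conn.execute(
        "SELECT device_family FROM events WHERE serial = ? AND timestamp = ?",
        (serial, timestamp),
    ).fetchone()
    return row[0]


# Re-import without an explicit family: the existing value survives.
reimport(conn, "UM12345", "2026-05-20T10:00:00")
kept = family_of(conn, "UM12345", "2026-05-20T10:00:00")

# Re-import with an explicit family: the new value wins.
reimport(conn, "UM12345", "2026-05-20T10:00:00", device_family="series3")
replaced = family_of(conn, "UM12345", "2026-05-20T10:00:00")
```

Binding `None` as the first parameter makes `COALESCE` fall through to the stored column, which is why a family-less re-import cannot blank the row.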

### Added — Phase 2: `micromate/` package alongside `minimateplus/`

- **`micromate/`** — new sibling package for the Thor / Micromate Series IV device. Currently scoped to offline-file ingest; live-device support (TCP transport, framing, protocol, client) will land here when reverse-engineering happens.
  - `micromate/idf_ascii_report.py` — moved from `sfm/idf_ascii_report.py`. No behaviour change.
  - `micromate/models.py` — typed `IdfReport`, `IdfEvent`, `IdfPeaks`, `IdfProjectInfo`, `IdfSensorCheck`. Stores mic in native `mic_pspl_dbl` (dB(L)) instead of the pseudo-psi shoehorn that the BW-shaped model uses. `IdfEvent.from_report()` constructs from a parsed dict + filename; `IdfEvent.to_minimateplus_event(waveform_key)` bridges to the existing sidecar / DB-insert machinery.
  - `micromate/idf_file.py` — placeholder for the binary codec (`.IDFH` / `.IDFW`). Stubbed `read_idf_file()` raises `NotImplementedError`; documents the planned reverse-engineering path.
- **`WaveformStore.save_imported_idf`** refactored to use the native `IdfEvent` and bridge at the SQL-insert boundary. Cleaner separation of "parse a Thor event" (in `micromate/`) from "store it on disk + write a sidecar" (in `sfm/waveform_store.py`).
- **Tests** — `tests/test_idf_ascii_report.py` imports updated to `micromate.idf_ascii_report`. All 1,014 example-data sidecars round-trip through `IdfEvent.from_report()` without errors.

### Companion releases

- **thor-watcher** unaffected — it talks to the relay over HTTP only. No version bump needed.
- **terra-view** unaffected today; can use `device_family` in its event-detail rendering when convenient.

---
## v0.18.0 — 2026-05-19

The "Thor / Series IV ingest adapter" release. Seismo-relay can now accept event files from Instantel Micromate Series IV (Thor) units alongside the existing MiniMate Plus (Series III) Blastware pipeline.

### Added — Thor (Series IV) IDF ingest

- **`POST /db/import/idf_file`** (`sfm/server.py`) — multipart upload endpoint for `.IDFH` (histogram) and `.IDFW` (waveform) event files plus their `.IDFH.txt` / `.IDFW.txt` ASCII sidecars. Mirrors the shape of `/db/import/blastware_file`: pairing by filename, optional `serial` query hint, per-file outcome reporting.
- **`sfm/idf_ascii_report.py`** — parser for Thor's TXT sidecars (verified against 1,014 real-world samples). Extracts device-authoritative PPV, ZC Freq, Peak Vector Sum, Mic PSPL, calibration date, firmware version, sensor self-check results, and project/client/operator strings.
- **`WaveformStore.save_imported_idf()`** (`sfm/waveform_store.py`) — stores Thor binaries verbatim in `<root>/<serial>/<filename>`, writes a `.sfm.json` sidecar with `source.kind = "idf-import"` and the full parsed report under `extensions.idf_report`. Reuses the existing `events` table — Thor events dedupe on (serial, timestamp) and surface in `/db/events` alongside BW events.
- **`tests/test_idf_ascii_report.py`** — parser tests against the `thor-watcher/example-data/` corpus.

### Changed

- `event_to_sidecar_dict()` (`minimateplus/event_file_io.py`) allow-list for `source_kind` now includes `"idf-import"` so the existing sidecar machinery can carry Thor imports.
- Bumped `pyproject.toml` version to `0.18.0`.

### Companion release

This release ships alongside **thor-watcher v0.3.0**, which adds the SFM forwarder that targets the new `/db/import/idf_file` endpoint. Operators flip the switch in thor-watcher's new "SFM Forward" Settings tab; events POST to seismo-relay just like the series3-watcher BW forwarder does today.

---
## v0.17.0 — 2026-05-17

The "field rescue + DB management" release. Hardened against units that are stuck in a runaway call-home loop, and added an operator-facing path for purging bogus events that those same units dump into the DB before recovery. All work in this release was driven by the BE9558H incident (full incident log + recovery procedure at `docs/runbooks/wedged_unit_recovery.md`).

### Added — wedged-unit recovery toolkit

A toolkit for breaking the call-home loop on a misbehaving unit whose firmware is too busy to keep up with normal request/response handshakes. Tested in production against BE9558H (16 May 2026) — a unit with a stuck-triggered Long-axis geophone that had been call-homing the office BW ACH server every 30 seconds for hours. Endpoints layered from "single attempt" to "siege mode" to suit different contention levels:

- **`GET /device/events/storage_range`** — SUB 0x06 probe. POLL + one read; ~2s. Returns first/last event keys and an `is_empty` flag. Use to triage whether a unit has stored events without invoking the slow `count_events()` 1E/1F chain (which choked on BE9558H's corrupted event chain).
- **`GET /device/events/index`** — SUB 0x08 probe. POLL + one read; ~2s. Returns the lifetime event counter (does NOT decrement on erase — use `storage_range` for "right now" state).
- **`POST /device/events/erase`** — full erase sequence `0xA3 → 0x1C → 0x06 → 0xA2` (confirmed 2026-04-11, see the protocol reference). Resets event keys to `0x01110000`. Caller's responsibility to disable ACH first if the underlying trigger condition will re-fill the buffer.
- **`POST /device/rescue`** — one TCP session, short connect+recv timeouts: POLL → disable ACH (compliance config write) → erase events → close. Designed for race-loop usage when the device is busy in another session. 503 on connect-refused, 502 on protocol failure, 200 on full sequence success.
- **`POST /device/stop_monitoring_blind`** — fire-and-forget Stop Monitoring (SUB 0x97), TCP-only. Dumps `SESSION_RESET + POLL_PROBE + SESSION_RESET + POLL_DATA + 0x97 × repeat` and closes without reading any S3 response. The full POLL preamble is required — write commands without it are silently ignored by the device's protocol parser (false-positive surface area that bit the first version of this endpoint). Use when the device's firmware can't keep up with full request/response but might process inbound bytes at its own pace.
- **`POST /device/stop_monitoring_spam`** — server-side hammer loop, duration-bounded. Open TCP → write the same blind payload → close → repeat as fast as possible until `duration_s` elapses. Configurable `connect_timeout` (default 500ms) and `repeat` (frames per session). Reports `sent_ok`, `connect_failed`, `write_failed`, `rate_attempts_per_s`. Clamped to 5min duration.
- **`POST /device/stop_monitoring_slow_drip`** — opposite of spam. Open ONE TCP session, drip the wake handshake + stop frames at `interval_s` (default 3s) for `duration_s` (default 120s, max 10min). Each drip is ~23 bytes — well under any UART FIFO size. Opportunistically drains any inbound bytes the device sends back; `bytes_received > 0` in the response strongly suggests the device has started talking and the session is healthy. **This is the endpoint that saved BE9558H.** Spam mode had been overrunning the device's UART FIFO; slow drip stayed under it.
- **Six rescue scripts** under `scripts/` — thin bash wrappers around the endpoints, default `SFM_BASE_URL=http://localhost:8200` (direct, not via Terra-View proxy whose 60s timeout would cut off the longer endpoints):
  - `rescue_device.sh` — race-loop wrapper for `/device/rescue`
  - `blind_stop.sh` — race-loop wrapper for `/device/stop_monitoring_blind`
  - `spam_stop.sh` — single-call burst hammer
  - `slow_drip.sh` — single-call held-session drip
  - `watch_unit.sh` — passive periodic reachability check (every N min, logs to file), useful for unattended overnight monitoring of a wedged unit
- **`docs/runbooks/wedged_unit_recovery.md`** — symptoms, quick-reference recovery procedure, the modem-layer mechanism (Sierra Wireless serial-port mode-flipping is the real failure mode — not the device firmware), and a table of "why simpler approaches don't work" so the next incident skips the dead ends.

### Added — operator event DB management

Endpoints powering Terra-View's new `/admin/events` page (v0.12.0). Designed for purging bogus events from a unit that's been forwarding them in bulk (e.g. a stuck-triggered seismograph dumping hundreds of junk events before it's recovered).

- **`DELETE /db/events/{event_id}`** — hard-delete one event row. Also unlinks the associated blastware binary (`.AB0*`), `.a5.pkl`, `.sfm.json` sidecar, and `.h5` clean-waveform files via the WaveformStore. Returns the per-file removal status. 404 if the event doesn't exist.
- **`POST /db/events/delete_bulk`** — filter-based or id-list-based bulk delete with safety rails:
  - Filters (`serial`, `from_dt`, `to_dt`, `false_trigger`) combine with AND; same semantics as `GET /db/events`. `ids` is an additional inclusion list. Refuses to run with no filters (would wipe the whole table — raises 422).
  - `confirm` must be `true` to actually delete. Otherwise returns a dry-run summary (`status: "dry_run"`, `matched: N`, `sample_serials: [...]`).
  - `max_rows` (default 10,000) caps how many rows can be deleted by-filter in one call. If exceeded, returns `status: "too_many"` with a hint to narrow or raise the cap. Bypassed when only `ids` is supplied.
- **`_cleanup_event_files(row)`** helper in `sfm/server.py` — best-effort `unlink()` of all four sidecar paths derived from the row's `blastware_filename`. Logged at WARN if a path exists but unlink fails; the DB row deletion still proceeds.
- **`SeismoDb.delete_event(id)` and `SeismoDb.delete_events_bulk(...)`** in `sfm/database.py` — both return the deleted row dict(s) so callers can do file cleanup. `delete_events_bulk` raises `ValueError` if no filters are supplied.
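The safety rails on `POST /db/events/delete_bulk` can be read as a small decision function. The sketch below is an illustration of the rules listed above, not the code in `sfm/server.py`: the name `plan_bulk_delete`, the `"deleted"` status string, and the order in which the cap and the dry-run check are evaluated are all assumptions.

```python
from typing import Dict, Optional, Sequence


def plan_bulk_delete(
    filters: Dict[str, object],
    ids: Optional[Sequence[int]],
    confirm: bool,
    matched: int,
    max_rows: int = 10_000,
) -> Dict[str, object]:
    """Decide what a bulk-delete call should do (hypothetical helper).

    `matched` is the number of rows the filters/ids select.
    """
    has_filters = any(value is not None for value in filters.values())

    # No filters and no ids would wipe the whole table: refuse outright.
    if not has_filters and not ids:
        raise ValueError("refusing to delete with no filters or ids")

    # The row cap applies to by-filter deletes; an ids-only call bypasses it.
    if has_filters and matched > max_rows:
        return {"status": "too_many", "matched": matched, "max_rows": max_rows}

    # Without confirm=True the call only reports what it would remove.
    if not confirm:
        return {"status": "dry_run", "matched": matched}

    return {"status": "deleted", "matched": matched}


dry = plan_bulk_delete({"serial": "BE9558H"}, None, confirm=False, matched=250)
capped = plan_bulk_delete({"serial": "BE9558H"}, None, confirm=True, matched=20_000)
by_ids = plan_bulk_delete({"serial": None}, [1, 2, 3], confirm=True, matched=3)
```

A client would typically call once without `confirm`, inspect the `matched` count, and only then repeat the call with `confirm: true`.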

### Changed

- **Default protocol recv timeout dropped from 30s → 10s** in `_build_client()`. The unit usually responds in well under a second over cellular; 10s leaves comfortable headroom for retransmits while failing reasonably fast when a unit is wedged. The two endpoints that perform full 5A waveform downloads still pass `timeout=120.0` explicitly so multi-minute event transfers are unaffected.
- **`_build_client()` now accepts an optional `connect_timeout`** (TCP-only) so rescue / race-loop endpoints can fail fast on busy modems without affecting the protocol-level recv timeout.

### Fixed

- **`GET /device/monitor/status` returned HTTP 500 + uncaught traceback when the device was unresponsive**. The retry-on-`Exception` inner block let the second `client.poll()`'s `ProtocolError` propagate out of the handler. Now wrapped in proper try/except — returns 502 with `{"detail": "Protocol error: No S3 frame received within 10.0s ..."}` on timeout, 502 on connection errors, 500 only for genuinely unexpected exceptions.

### Migration

No schema changes. No data migration required.

If you've been running a previous version against a wedged unit and accumulated bogus events, the new `/admin/events` page in Terra-View v0.12.0 (or direct `POST /db/events/delete_bulk` with `confirm: true`) is the cleanup tool. Watcher state on the upstream DL2 PC does NOT need separate cleaning — the watcher's `sfm_forwarded.json` keys on file sha256 and won't re-forward the same files.

### Pairing

This release pairs with **Terra-View v0.12.0**, which adds the `/admin/events` UI that consumes the new bulk-delete endpoints, the bulk false-trigger flagging on `/unit/{id}`, and the field-deployment workflow that uses the same `series3-watcher` → SFM ingest path as before.

---
## v0.16.1 — 2026-05-14

### Fixed

- **`record_type` always "Waveform" for forwarded events.** `read_blastware_file()` hardcoded `ev.record_type = "Waveform"` regardless of the file's actual type. The watcher-forward pipeline (the main BW ACH ingest path) compounds this by parsing files from a tmp path with a `.bw` suffix, so even a filename-based fallback inside the parser still wouldn't see the original extension. Now:

  1. New `derive_record_type_from_filename(filename)` helper in `minimateplus/event_file_io.py` derives the type from the LAST character of the filename's extension (V10.72+ AB0T scheme: `H`=Histogram, `W`=Waveform, `M`=Manual, `E`=Event, `C`=Combo). Falls back to `"Waveform"` for old S338 firmware (3-char extensions ending in `0`) and any unrecognized suffix.
  2. `read_blastware_file()` now calls the helper with its `path.name` so direct callers (the `--dry-run` path in `scripts/import_bw.py`, tests, ad-hoc scripts) get the right value automatically.
  3. `WaveformStore.save_imported_bw()` overrides `ev.record_type` with the **original** filename's derived type after parsing (the tmp file inside the parser doesn't carry the original extension). This is the path the live watcher-forwarder hits, so the DB column now reflects the actual event type going forward.
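The suffix rule in step 1 is compact enough to sketch directly. This is a minimal illustration of the mapping described above, not the helper shipped in `minimateplus/event_file_io.py`; the example filenames are illustrative and the handling of extension-less names is an assumption.

```python
from pathlib import PurePath

# Last character of the extension -> record type (V10.72+ AB0T scheme).
_SUFFIX_TO_TYPE = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def derive_record_type_from_filename(filename: str) -> str:
    """Map the final extension character to a record type.

    Anything unrecognized, including old 3-char extensions ending in "0"
    and names with no extension at all, falls back to "Waveform".
    """
    extension = PurePath(filename).suffix.lstrip(".")
    last_char = extension[-1:].upper()  # empty string when there is no extension
    return _SUFFIX_TO_TYPE.get(last_char, "Waveform")


histogram = derive_record_type_from_filename("M529LK44.AB0H")
legacy = derive_record_type_from_filename("M529LK44.AB0")
combo = derive_record_type_from_filename("M529LK44.AB0C")
```

Because the function only looks at the name it is given, callers that parse from a temporary path have to pass the original filename, which is what step 3 describes.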

Events ingested before this fix are stuck with `record_type="Waveform"` in the DB; a one-off backfill (`UPDATE events SET record_type = ... WHERE blastware_filename LIKE '%H'`) would fix them retroactively if desired. Terra-view's event modal also derives client-side from the filename, so the UI already shows the correct type for old events even without the backfill.

---

## v0.16.0 — 2026-05-11

The "BW ACH ingestion" release. When paired with **series3-watcher v1.5.0**, every Blastware ACH event (binary + `_ASCII.TXT` report) lands in SeismoDb with device-authoritative peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data — without depending on the still-undecoded waveform body codec. This is the end-to-end product win discussed in v0.15.0's "out of scope" notes: sortable / filterable monthly-summary review of historical events, populated from the BW ASCII export rather than re-decoded samples.

### Added — `/db/import/blastware_file` rich-metadata ingestion

- **Paired BW ASCII reports.** The endpoint now accepts the `<binary>_<ext>_ASCII.TXT` partner that BW writes alongside each event. Pairing handles both filename conventions: ACH (`M529LK44_AB0_ASCII.TXT`) and manual-export (`M529LK44.AB0.TXT`). When both are present, ACH wins.
- **`minimateplus/bw_ascii_report.py`** (new) — parser + `BwAsciiReport` dataclass for BW's per-event ASCII export. Handles every field BW writes: identity, trigger config, per-channel PPV / ZC Freq / Time of Peak / Peak Acceleration / Peak Displacement, Peak Vector Sum + time, MicL PSPL / Time of Peak / ZC Freq, sensor self-check (Test Freq / Test Ratio / Test Amplitude / Pass-Fail per channel), monitor log, PC SW version.
- **Position-based user-notes parsing.** BW's Compliance Setup → Notes tab labels (Project / Client / User Name / Seis Loc) are *operator-editable* — an operator can rename them to "Building:", "Site Address:", etc. Rather than maintain a label-spelling map, the parser uses positional matching between the `Units :` and `Geo Range :` anchors in the ASCII output. The four canonical slots (project / client / operator / sensor_location) populate by position regardless of label; the original labels BW wrote are preserved in `report.user_note_labels` for downstream UIs (terra-view) to display verbatim.
- **`bw_report` sidecar block.** New top-level block in `.sfm.json` carrying the parsed BW report (trigger config, peaks with per-channel stats, mic block, sensor_check, monitor_log, PC SW version, operator-edited note labels).
- **`apply_report_to_event(event, report)` helper.** Overlays the report's device-authoritative fields onto an in-memory `Event` so `SeismoDb.insert_events()` writes correct DB columns instead of the broken-codec values from `_peaks_from_samples()`.
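The two report-naming conventions can be mapped back to a binary filename with a few string operations. The sketch below is a hedged illustration using the filenames quoted above; `binary_name_for_report` and `pick_report` are hypothetical names, and the server's real pairing code may register reports differently.

```python
from typing import Iterable, Optional

_ACH_SUFFIX = "_ASCII.TXT"
_MANUAL_SUFFIX = ".TXT"


def binary_name_for_report(txt_name: str) -> Optional[str]:
    """Return the binary filename a report belongs to, or None if unrecognized."""
    upper = txt_name.upper()
    if upper.endswith(_ACH_SUFFIX):
        # ACH convention: <stem>_<ext>_ASCII.TXT -> <stem>.<ext>
        stem = txt_name[: -len(_ACH_SUFFIX)]
        base, separator, extension = stem.rpartition("_")
        return f"{base}.{extension}" if separator else None
    if upper.endswith(_MANUAL_SUFFIX):
        # Manual-export convention: <binary>.TXT -> <binary>
        return txt_name[: -len(_MANUAL_SUFFIX)]
    return None


def pick_report(binary_name: str, report_names: Iterable[str]) -> Optional[str]:
    """Choose the report for a binary; the ACH-style name wins when both exist."""
    ach, manual = [], []
    for name in report_names:
        if binary_name_for_report(name) != binary_name:
            continue
        (ach if name.upper().endswith(_ACH_SUFFIX) else manual).append(name)
    candidates = ach or manual
    return candidates[0] if candidates else None


from_ach = binary_name_for_report("M529LK44_AB0_ASCII.TXT")
from_manual = binary_name_for_report("M529LK44.AB0.TXT")
chosen = pick_report(
    "M529LK44.AB0", ["M529LK44.AB0.TXT", "M529LK44_AB0_ASCII.TXT"]
)
```

The ACH check has to run first because an ACH report name also ends in `.TXT`.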

### Fixed — three compounding bugs that left forwarded events with garbage data

- **Import endpoint inserted under `serial="UNKNOWN"`.** `_serial_from_event(ev)` was a stub that always returned `None`; the BW-filename-decoded serial that `WaveformStore` had already resolved was never surfaced to `db.insert_events`. Now uses `rec["serial"]` as the authoritative source. `scripts/repair_unknown_serials.py` repairs existing DB rows.
- **`/db/units` ignored events from non-ACH ingest paths.** `query_units()` only aggregated from `ach_sessions` — events that arrived via `save_imported_bw()` were never visible in the fleet overview even though they populated `events` correctly. Now unions both tables.
- **Re-imports left stale DB rows.** The `IntegrityError` handler in `insert_events()` only refreshed filename / sidecar columns when a duplicate `(serial, timestamp)` arrived. Peak values, project info, sample_rate, record_type stayed locked at whatever the first (often broken-codec) insert wrote. Now the upsert path refreshes every device-authoritative column from the new data while preserving `false_trigger` and immutable fields (`id`, `created_at`).
- **Server-side TXT pairing only knew the legacy convention.** The endpoint stripped `.TXT` and looked up `<binary>` — which works for manual exports (`<binary>.TXT`) but not BW ACH (`<stem>_<ext>_ASCII.TXT`). Reports were arriving in the multipart but silently dropped. Now recognises both conventions and registers each report under all matching binary names.

### Migration

For existing deployments where events were forwarded by an older watcher (broken pairing) or imported during the UNKNOWN-bucketing window:

1. `python -m scripts.repair_unknown_serials --db <path> --apply` to re-attribute `serial="UNKNOWN"` rows.
2. Delete the watcher's `sfm_forwarded.json` state file and let it re-forward. The server's upsert path will refresh the existing DB rows with the report's authoritative values.
3. Operator review state (`false_trigger`, sidecar `review` block) is preserved across the re-import.
## v0.15.0 — 2026-05-07

### Added

- **Layered event storage architecture.** Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file). Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression). This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side. The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor. This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive). Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
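The ~38% figure in the viewer fix follows from the ratio of the two scale factors. The sketch below works through that arithmetic and shows a counts-to-velocity conversion under the assumption that the mapping is linear with 32767 counts at full scale; the function name and the `"normal"` / `"sensitive"` keys are hypothetical, not the server's actual API.

```python
# Full-scale velocity per geophone range, in in/s (values from the entry above).
FULL_SCALE_IN_PER_S = {"normal": 10.0, "sensitive": 1.25}
ADC_FULL_SCALE_COUNTS = 32767


def counts_to_velocity(counts: int, geo_range: str) -> float:
    """Convert signed ADC counts to in/s, assuming a linear full-scale mapping."""
    return counts / ADC_FULL_SCALE_COUNTS * FULL_SCALE_IN_PER_S[geo_range]


# The old client-side factor used the in/s-per-V constant as if it were full scale.
old_factor = 6.206053 / ADC_FULL_SCALE_COUNTS
correct_factor = FULL_SCALE_IN_PER_S["normal"] / ADC_FULL_SCALE_COUNTS

# Fraction by which old plots under-reported Normal-range amplitudes.
shortfall = 1.0 - old_factor / correct_factor

full_scale_normal = counts_to_velocity(32767, "normal")
full_scale_sensitive = counts_to_velocity(32767, "sensitive")
```

Here `shortfall` comes out just under 0.38, which matches the "~38% too low" statement for Normal-range geophones.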

### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).

---
## v0.14.3 — 2026-05-05

### Fixed

- **`build_5a_frame` — DLE-stuffing rule for 0x10 bytes in params (the
  long-standing >1-sec event 0 "won't open in BW" bug).**

  Previously `build_5a_frame` wrote params bytes RAW with no DLE stuffing,
  based on the incorrect assumption that the device handled all `0x10`
  bytes in params literally. It does not. The device's actual de-stuffing
  rule for the params region is:

  - `10 10` → de-stuffs to `10`
  - `10 02/03/04` → kept literal (inner-frame markers)
  - `10 X` for other X → de-stuffs to just `X` (drops the `0x10`)

  When the counter passed in params has `0x10` in the high byte (e.g.
  counter=`0x1000` produces params bytes `... 10 00 ...`), the device
  silently corrupts the request to counter=`0x__00` and responds with
  whatever lives at that wrong address. For counter=0x1000 the wrong
  address was 0x0000, so the response was a copy of the file header +
  STRT record. That STRT block then got embedded in the assembled body
  at file offset `0x1016`, and Blastware refused to open the file
  (interprets the second STRT as a malformed multi-event file).

  This explains the entire >1-sec event-0 failure pattern:

  - 1-sec events have `end_offset < 0x1000`, so the chunk walk never
    requests counter `0x10__` and the bug never triggers.
  - 2-sec / 3-sec / longer events all need a chunk at counter `0x1000`
    (and longer events also need `0x1200`, `0x1400`, etc., none of which
    have `0x10` in the high byte except `0x1000`). Just one corrupted
    response is enough to embed STRT in the body and break the file.

  Verified against BW 5-1-26 "copy 3sec" capture: all 17 5A request
  frames (probe + 2 metadata pages + 13 sample chunks + TERM) now match
  BW's wire output **byte-for-byte**, including the doubled `10 10 00`
  for counter=0x1000.
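The de-stuffing rule listed above can be modelled in a few lines, which makes the `0x1000` corruption easy to reproduce. This is a sketch of the device-side behaviour as the entry describes it, not firmware or SFM code; the function name and the treatment of a lone trailing `0x10` are assumptions.

```python
DLE = 0x10
INNER_FRAME_MARKERS = {0x02, 0x03, 0x04}


def destuff_params(raw: bytes) -> bytes:
    """Model of the device's de-stuffing rule for the params region."""
    out = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte == DLE and i + 1 < len(raw):
            following = raw[i + 1]
            if following == DLE:
                out.append(DLE)                # 10 10 -> 10
            elif following in INNER_FRAME_MARKERS:
                out += bytes([DLE, following])  # 10 02/03/04 kept literal
            else:
                out.append(following)          # 10 X -> X (the 0x10 is dropped)
            i += 2
        else:
            out.append(byte)                   # ordinary byte, or a trailing 0x10
            i += 1
    return bytes(out)


# Counter 0x1000 written raw: the device sees 0x00 and loses the high byte.
raw_counter = destuff_params(bytes([0x10, 0x00]))

# Counter 0x1000 with the 0x10 doubled: the device recovers 10 00.
stuffed_counter = destuff_params(bytes([0x10, 0x10, 0x00]))

# Metadata page 0x1002: the pair survives without doubling.
meta_page = destuff_params(bytes([0x10, 0x02]))
```

The first case is the pre-fix behaviour: a raw `10 00` collapses to `00`, so the request lands at the wrong address.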

### Notes

- `0x10` bytes in `offset_hi` (the standalone offset field at body[5])
  are still written RAW — confirmed correct per the 1-2-26 capture.
- BW's actual encoding of `10 02` / `10 04` for meta pages 0x1002 /
  0x1004 is *not* doubled — it relies on the device keeping `10 02`
  and `10 04` as literal pairs. This is preserved by the fix.

---
## v0.14.2 — 2026-05-04

### Fixed

- **`blastware_file.py` — removed harmful "duplicate header+STRT" strip.**
  The v0.13.x strip logic was matching the byte sequence `00 12 03 00 STRT`
  in legitimate waveform data — sample chunks at counter `0x1000` and
  beyond often contain those bytes coincidentally — and zeroing 25 bytes
  of valid samples per match. This is why event 0 (event-1 case in the
  protocol) downloads of >1-sec recordings always failed in BW: the strip
  destroyed real data at body offset `0x1012..0x102B` and propagated
  alignment differences through the rest of the body. Sub-1-sec events
  worked because their `end_offset` was below `0x1002`, so no sample
  chunks landed in the metadata-page region and the strip's needle never
  matched. Verified fix by re-feeding the BW 5-1-26 "copy 3sec" capture's
  A5 frames into the file builder: output is now byte-identical to BW's
  saved `M529LKIQ.G10` reference (8708 bytes, 0 differences).
  - BW already concatenates frame contributions in stream order without
    any de-duplication; SFM now does the same.

---
## v0.14.0 — 2026-05-02

### Changed (major rewrite)

- **`read_bulk_waveform_stream` — STRT-bounded chunk walk.** Replaces the
  earlier `0x0400`-step / `max(key4[2:4], 0x0400)` chunk-counter formula,
  which over-read ~5× past the actual event end into post-event
  circular-buffer garbage. The new walk:

  1. Probe at `counter = start_offset` (event 1: `0x0000`; event N:
     `cur_key[2:4]`).
  2. Parse `end_offset` from the STRT record at `data[17]` of the probe
     response (`end_key[2:4]` field).
  3. For event 1 only, read the two fixed metadata pages at counter
     `0x1002` and `0x1004` — these contain the global session-start
     compliance setup (Project / Client / User Name / Seis Loc /
     Extended Notes ASCII strings). Continuation events skip these
     (BW caches them across the session).
  4. Walk sample chunks at **`0x0200` increments (NOT `0x0400`)**, bounded
     by `end_offset` — the loop exits when
     `next_chunk_counter + 0x0200 > end_offset`.
  5. Send the proper TERM frame (see new `bulk_waveform_term_v2()`) with
     `offset_word = end_offset - next_boundary` and
     `params[2:4] = next_boundary BE`. The TERM response carries the
     partial last chunk + 26-byte file footer.

- **New helpers:** `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`
  and `parse_strt_end_offset(a5_data)` in `minimateplus.framing`.

- **`stop_after_metadata` / `extra_chunks_after_metadata` kwargs are now
  no-ops** under the v0.14.x walk. They are retained on the
  `read_bulk_waveform_stream` signature for backward compatibility but log a
  DEBUG line when set. The old "scan for `b'Project:'` and stop one chunk
  later" workaround is obsolete — the loop is deterministically bounded by
  the STRT-derived `end_offset`.

- **Project / Client / User Name / Seis Loc string source corrected.**
  These come from the dedicated metadata pages at counter `0x1002` /
  `0x1004`, not from "A5 frame 7" of the sample-chunk stream. The
  earlier "A5 frame 7" claim was an artifact of the broken `0x0400`-step
  walk where the bad counter formula coincidentally landed sample-chunk
  fi=7 on top of the 0x1002 metadata page.
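Steps 4 and 5 of the STRT-bounded walk reduce to a bounded loop plus two derived TERM fields. The sketch below illustrates that arithmetic only: `plan_chunk_walk` is a hypothetical helper, the first sample-chunk counter is left to the caller because the entry does not state it, and treating the first unread counter as `next_boundary` is an assumption. The numbers in the usage lines are made up for illustration.

```python
import struct
from typing import List, Tuple

CHUNK_STEP = 0x0200


def plan_chunk_walk(first_chunk: int, end_offset: int) -> Tuple[List[int], int, bytes]:
    """Plan the sample-chunk counters and the TERM fields for one event.

    Returns (chunk counters to request, TERM offset_word, next_boundary as
    two big-endian bytes for params[2:4]).
    """
    if first_chunk > end_offset:
        raise ValueError("first_chunk lies beyond end_offset")

    counters = []
    counter = first_chunk
    # Request a chunk only while a full 0x0200 step still fits before end_offset.
    while counter + CHUNK_STEP <= end_offset:
        counters.append(counter)
        counter += CHUNK_STEP

    next_boundary = counter                  # first counter that was not requested
    offset_word = end_offset - next_boundary  # size of the partial last chunk
    return counters, offset_word, struct.pack(">H", next_boundary)


# Illustrative values only: an event ending at 0x0500 walked from 0x0000.
counters, offset_word, boundary_be = plan_chunk_walk(0x0000, 0x0500)
```

Because the bound comes from the STRT-derived `end_offset`, the loop length is fixed before the first sample chunk is requested, with no content scanning involved.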

### Verified

- Three independent BW MITM captures (4-27-26 + 5-1-26 + 5-4-26) confirm
  the new walk matches BW's behaviour event-for-event.
- `end_offset` values verified across 3 events: `0x1ABE` (4-27-26 2-sec),
  `0x21F2` (5-1-26 3-sec), `0x417E` (5-1-26 event-2).

### Notes

- Earlier v0.13.0 / v0.13.1 / v0.13.2 entries describe partial steps along
  the way (some of the file builder fixes, filename bugs, etc.) that were
  superseded by the full rewrite. Treat this v0.14.0 entry as the
  definitive landing point for the corrected SUB 5A protocol.

---
## v0.14.1 — 2026-05-04

### Fixed

- **`read_bulk_waveform_stream` — event-N probe counter off-by-`0x46`.**
  Continuation events (start_key[2:4] != 0) were being probed at counter
  `start_offset + 0x0046` instead of just `start_offset`. In the iteration
  walk, `cur_key` from 1F is already the off=0x46 WAVEHDR record key, so the
  earlier formula effectively double-counted the WAVEHDR offset. The probe
  landed one WAVEHDR past the actual event start, the response no longer
  contained the STRT record at byte 17, `parse_strt_end_offset` returned
  `None`, and the chunk loop fell back to the `max_chunks=128` cap — walking
  ~110 chunks of post-event circular-buffer garbage. Verified against the
  5-1-26 "copy 2nd address" and 5-4-26 BW 2-sec event captures: BW probes
  counter=`0x2238` with key=`01112238` and STRT is present at byte 17 of
  the response (end_offset=`0x417E`).
- **CLAUDE.md / docs/instantel_protocol_reference.md** — corrected the
  event-N section to clarify that `start_key` in those formulas is the
  off=0x46 key, not the off=0x2C boundary key, and removed the spurious
  `+0x46` from the chunk-walk pseudocode.

---
## v0.13.2 — 2026-05-01

### Fixed

- **`_extract_record_type` — third 0C-record header format ("short", 8 bytes).**
  A live SFM download against BE11529 produced files named `M5290000.000`
  (zero-stamped) because the 0C waveform record's first bytes were
  `01 05 07 ea ...` — neither the 9-byte single-shot layout (`0x10` at byte 1)
  nor the 10-byte continuous layout (`0x10` at bytes 0 and 2). Investigation
  showed this is a third format observed in the wild: an 8-byte header with no
  marker bytes at all (`[day][month][year_BE:2][unknown][hour][min][sec]`).
  The detection logic now scans the year (uint16 BE) at byte 2 / byte 3 / byte
  4 and picks whichever offset returns a sensible year (2015–2050) — each
  format has the year at a unique position so this disambiguates cleanly.
  - New format → `event.record_type = "Waveform (Short)"`,
    `Timestamp.from_short_record()`.
  - Existing single-shot and continuous parsers unchanged.
  - The user's event from May 1, 2026 13:21:37 now correctly resolves to a
    filename like `M529LKIQ.G10` instead of `M5290000.000`.
|
||||||
|
|
||||||
|
### Added
|
||||||
|
|
||||||
|
- `Timestamp.from_short_record(data)` — decodes the 8-byte header.
|
||||||
|
- `_detect_record_format(data)` — internal helper returning
|
||||||
|
`"single_shot" / "continuous" / "short" / None` via year-position scan.
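
The year-position scan can be sketched roughly as follows. This is an illustration, not the body of `_detect_record_format`: the function name is hypothetical, and the mapping of year offsets 2 / 3 / 4 to short / single-shot / continuous is an assumption inferred from the header lengths (8 / 9 / 10 bytes) given in this entry.

```python
import struct
from typing import Optional

# (year offset, label) pairs. The offset-to-format mapping is an assumption
# inferred from the 8/9/10-byte header lengths, not taken from the source code.
YEAR_OFFSETS = ((2, "short"), (3, "single_shot"), (4, "continuous"))


def detect_record_format(data: bytes) -> Optional[str]:
    """Return the first format whose uint16 BE year falls in 2015..2050."""
    for offset, label in YEAR_OFFSETS:
        if len(data) < offset + 2:
            continue  # header too short to hold a year at this offset
        (year,) = struct.unpack_from(">H", data, offset)
        if 2015 <= year <= 2050:
            return label
    return None


# First four bytes are the ones quoted in this entry; the rest is padding.
print(detect_record_format(bytes.fromhex("010507ea000d1525")))  # short
```

For the quoted header `01 05 07 ea ...`, only offset 2 yields a plausible year (`0x07ea` = 2026), so the scan settles on the short format.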
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## v0.13.1 — 2026-05-01
|
||||||
|
|
||||||
|
### Fixed
|
||||||
|
|
||||||
|
- **`_extract_record_type` — Continuous-mode record headers misclassified as Unknown.**
|
||||||
|
In single-shot mode the 0C waveform record's 9-byte header puts the sub_code
|
||||||
|
marker `0x10` at byte 1, with the day at byte 0. In Continuous mode the
|
||||||
|
header is 10 bytes with the marker at byte 0 *and* byte 2, and the day at
|
||||||
|
byte 1. Previous logic only inspected byte 1 and treated any value other
|
||||||
|
than `0x10` / `0x03` as `"Unknown"`, which prevented `event.timestamp` from
|
||||||
|
being populated for any continuous-mode event whose day-of-month wasn't
|
||||||
|
exactly 3 or 16. As a downstream effect, `blastware_filename()` saw
|
||||||
|
`event.timestamp == None`, fell back to `stem="0000"` / `ab="00"`, and
|
||||||
|
produced filenames like `M5290000.000`. Discovered from a live SFM run on
|
||||||
|
BE11529 in continuous mode (day-of-month = 5).
|
||||||
|
Now disambiguates by checking BOTH byte 0 and byte 2: if both are `0x10`,
|
||||||
|
it's the 10-byte continuous header; else if byte 1 is `0x10`, it's the
|
||||||
|
9-byte single-shot header. Day-of-month no longer matters.
|
||||||
|
|
||||||
|
*Superseded by v0.13.2 — the user's actual record uses a third 8-byte format
|
||||||
|
with no `0x10` markers, which v0.13.1 still misclassified.*
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## v0.13.0 — 2026-05-01
|
||||||
|
|
||||||
|
### Fixed
|
||||||
|
|
||||||
|
- **SUB 5A bulk waveform stream — over-read bug for events ≥ 2 sec.**
|
||||||
|
`read_bulk_waveform_stream` was walking the chunk counter past the actual
|
||||||
|
end of the event, picking up post-event circular-buffer garbage that
|
||||||
|
corrupted reconstructed Blastware files for any waveform > ~1 sec. The
|
||||||
|
loop now extracts the event's `end_offset` from the STRT record at
|
||||||
|
`data[23:27]` of the probe response and stops the chunk walk when the next
|
||||||
|
counter would step past it. Verified against three BW MITM captures
|
||||||
|
(4-27-26 + 5-1-26): 2-sec event drops from 37 over-read chunks to 7
|
||||||
|
bounded chunks; 3-sec drops to 9; non-zero-start "event 2" drops to 9.
|
||||||
|
|
||||||
|
### Added
|
||||||
|
|
||||||
|
- `framing.bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)` —
|
||||||
|
computes the corrected SUB 5A TERM frame's `(offset_word, params)` per the
|
||||||
|
formula confirmed across all 3 BW captures. Not yet wired into
|
||||||
|
`read_bulk_waveform_stream` (the legacy TERM is still used to preserve the
|
||||||
|
existing `blastware_file.write_blastware_file` frame-structure expectations);
|
||||||
|
available for the next iteration that switches to BW's 0x0200 chunk step.
|
||||||
|
- `framing.parse_strt_end_offset(a5_data)` — extracts the event-end pointer
|
||||||
|
from the STRT record in an A5 response payload.
|
||||||
|
|
||||||
|
### Documentation
|
||||||
|
|
||||||
|
- **CLAUDE.md and `docs/instantel_protocol_reference.md` extensively
|
||||||
|
rewritten** to reflect the corrected SUB 5A protocol. See:
|
||||||
|
- CLAUDE.md "SUB 5A — chunk counter formula (REWRITTEN 2026-05-01)"
|
||||||
|
- CLAUDE.md "SUB 5A — STRT record encodes end_offset"
|
||||||
|
- CLAUDE.md "SUB 5A — TERM frame formula"
|
||||||
|
- CLAUDE.md "SUB 5A — fixed metadata pages 0x1002 and 0x1004"
|
||||||
|
- CLAUDE.md "SUB 0A — WAVEHDR response length distinguishes events from
|
||||||
|
boundaries" (0x46 = real event, 0x2C = boundary marker)
|
||||||
|
- protocol reference §7.8.5 / §7.8.6 / §7.8.7 / §7.8.8
|
||||||
|
- The previous chunk-counter formula (`max(key4[2:4], 0x0400) + (chunk-1) *
|
||||||
|
0x0400`) is now marked DEPRECATED and explicitly tagged WRONG with
|
||||||
|
pointers to the new sections, so future work doesn't re-derive it.
|
||||||
|
|
||||||
|
### Known minor diffs vs Blastware (deferred to a follow-up)
|
||||||
|
|
||||||
|
- We still use the OLD 0x0400 chunk step rather than BW's 0x0200; switching
|
||||||
|
also requires updating `blastware_file.write_blastware_file`'s skip values
|
||||||
|
and "extra chunk after metadata" logic, which depends on a fresh capture
|
||||||
|
to verify.
|
||||||
|
- We still use the legacy fixed `offset_word=0x005A` TERM frame rather than
|
||||||
|
BW's `end_offset - next_boundary` formula, for the same reason.
|
||||||
|
- Two fixed metadata pages at counter `0x1002` and `0x1004` are not yet
|
||||||
|
read explicitly; under the current 0x0400 walk their content is reachable
|
||||||
|
via the sample chunk that covers buffer addresses `[0x1000, 0x1400)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## v0.12.6 — 2026-05-01
|
||||||
|
|
||||||
|
### Fixed
|
||||||
|
|
||||||
|
- **`blastware_file.py` — waveform frame classification** — A5 frame classification for
|
||||||
|
waveform-only vs header-only frames now uses `frame.record_type` instead of frame index.
|
||||||
|
Only waveform frames (0x46) are written to the file body; metadata frames are skipped.
|
||||||
|
Fixes spurious data corruption from incorrectly classified frames.
|
||||||
|
|
||||||
|
- **`s3_analyzer.py` — A5/5A frame naming** — Bulk waveform stream frames (SUB 5A response)
|
||||||
|
are now correctly labeled "A5" in analyzer output instead of being conflated with other
|
||||||
|
multi-frame responses (SUB A4, E5, etc.).
|
||||||
|
|
||||||
|
- **`S3FrameParser` — frame terminator detection** — Corrected the bare ETX terminator
|
||||||
|
detection. Frame termination is now correctly identified by a standalone `ETX=0x03` byte,
|
||||||
|
not by the `DLE+ETX` sequence (which is part of the payload when it appears within a frame).
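
A rough sketch of the bare-ETX rule from the last item. It assumes `DLE = 0x10` and that a DLE always escapes exactly the byte that follows it; the function name is hypothetical and this is not the `S3FrameParser` implementation.

```python
DLE = 0x10  # assumed escape byte
ETX = 0x03


def find_bare_etx(buf: bytes, start: int = 0) -> int:
    """Index of the first ETX not preceded by a DLE, or -1 if there is none."""
    i = start
    while i < len(buf):
        if buf[i] == DLE:
            i += 2  # skip the DLE and the byte it escapes (payload, not a terminator)
            continue
        if buf[i] == ETX:
            return i
        i += 1
    return -1


print(find_bare_etx(bytes.fromhex("4110034203")))  # 4
print(find_bare_etx(bytes.fromhex("1003")))        # -1
```

In the first buffer the `10 03` pair at index 1 is skipped as payload and the standalone `03` at index 4 is reported as the terminator.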
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## v0.12.5 — 2026-04-21
|
## v0.12.5 — 2026-04-21
|
||||||
|
|
||||||
|
### Added
|
||||||
|
|
||||||
|
- **`seismo_lab.py` — Download tab** — New fourth tab for live wire-byte capture during event
|
||||||
|
downloads. Captures both BW→device and device→S3 frames in real time, allowing inspection
|
||||||
|
of the 5A bulk stream chunk sequence and frame-by-frame analysis without needing a bridge
|
||||||
|
or MITM proxy. Files are saved with user-specified labels for easy tracking.
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
|
||||||
- **`s3_bridge.py` — raw captures always-on by default** — `--raw-bw` and `--raw-s3` now
|
- **`s3_bridge.py` — raw captures always-on by default** — `--raw-bw` and `--raw-s3` now
|
||||||
@@ -17,6 +678,10 @@ All notable changes to seismo-relay are documented here.
|
|||||||
"S3→BW raw" checkboxes start checked. Path fields are empty by default (bridge auto-names
|
"S3→BW raw" checkboxes start checked. Path fields are empty by default (bridge auto-names
|
||||||
the files). Unchecking a box passes `--raw-bw ""` to explicitly disable capture.
|
the files). Unchecking a box passes `--raw-bw ""` to explicitly disable capture.
|
||||||
|
|
||||||
|
- **`Bridge tab` — TCP mode added** — Serial/TCP radio toggle allows connection via cellular
|
||||||
|
modem (RV50/RV55) instead of direct RS-232. Supports multi-capture design (simultaneous
|
||||||
|
Bridge + Analyzer + Download sessions).
|
||||||
|
|
||||||
- **`ach_server.py` — TX capture added (`raw_tx_<ts>.bin`)** — Every ACH inbound session
|
- **`ach_server.py` — TX capture added (`raw_tx_<ts>.bin`)** — Every ACH inbound session
|
||||||
now saves both directions: `raw_rx_<ts>.bin` (device → us, S3 side, as before) and
|
now saves both directions: `raw_rx_<ts>.bin` (device → us, S3 side, as before) and
|
||||||
`raw_tx_<ts>.bin` (us → device, BW side). Both files are usable in the Analyzer.
|
`raw_tx_<ts>.bin` (us → device, BW side). Both files are usable in the Analyzer.
|
||||||
|
|||||||
@@ -2,12 +2,112 @@
|
|||||||
|
|
||||||
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
|
Ground-up Python replacement for **Blastware**, Instantel's Windows-only software for
|
||||||
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
|
managing MiniMate Plus seismographs. Connects over direct RS-232 or cellular modem
|
||||||
(Sierra Wireless RV50 / RV55). Current version: **v0.12.3**.
|
(Sierra Wireless RV50 / RV55). Current version: **v0.21.0**.
|
||||||
|
|
||||||
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
|
When new information about the protocol is discovered, please update the instantel_protocol_reference.md with the findings in addition to this document
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Architecture: three-tier conceptual model
|
||||||
|
|
||||||
|
seismo-relay is a **suite of cooperating components**, not a single app.
|
||||||
|
The three tiers below are the canonical mental model — the current
|
||||||
|
directory layout doesn't fully reflect them yet (some of what is
|
||||||
|
conceptually SDM lives under `sfm/` today), but new code should be
|
||||||
|
placed and named according to this model.
|
||||||
|
|
||||||
|
### 1. SFM — the device-side (active connection to physical units)
|
||||||
|
|
||||||
|
Replaces Blastware's *talk-to-the-meter* role. Lives where a connection
|
||||||
|
to a physical seismograph is open.
|
||||||
|
|
||||||
|
In scope:
|
||||||
|
- `minimateplus/{transport,framing,protocol,client}.py` — wire protocol
|
||||||
|
- `seismo_lab.py` — diagnostic GUI (a thick client for SFM)
|
||||||
|
- The `/device/*` HTTP endpoints in `sfm/server.py` —
|
||||||
|
`/device/info`, `/device/events`, `/device/monitor/*`, `/device/call_home`,
|
||||||
|
etc. Anything that opens a connection at the moment of the request.
|
||||||
|
- Future: a Thor / Micromate live client (mirror `minimateplus/`)
|
||||||
|
- Future: a control surface Terra-View can launch into — see the
|
||||||
|
README's Roadmap.
|
||||||
|
|
||||||
|
Does NOT own a database. Outputs `Event` objects. Has a "spun up when
|
||||||
|
needed" runtime profile rather than "always on".
|
||||||
|
|
||||||
|
### 2. SDM — the data-side (storage, ingest, and serving)
|
||||||
|
|
||||||
|
The new name for the receiving-and-storing role. Originally called SFM
|
||||||
|
because the FastAPI service started life as a thin device proxy, but
|
||||||
|
the actual role has migrated heavily toward data management. **For now
|
||||||
|
the directory remains `sfm/`** — renaming requires touching ~30-50
|
||||||
|
files in seismo-relay + ~10-15 in terra-view + a Docker volume
|
||||||
|
migration; deferred until the codebase is quiet enough to do it as a
|
||||||
|
clean refactor.
|
||||||
|
|
||||||
|
In scope:
|
||||||
|
- `sfm/database.py` (`SeismoDb`)
|
||||||
|
- `sfm/waveform_store.py`, `sfm/event_hdf5.py`
|
||||||
|
- The `/db/*` HTTP endpoints — `events`, `units`, `monitor_log`,
|
||||||
|
`sessions`, `false_trigger` mutations
|
||||||
|
- The `/db/import/*` ingest endpoints — `blastware_file` (series3),
|
||||||
|
`idf_file` (series4); anything that receives events FROM somewhere
|
||||||
|
- `scripts/backfill_sidecars.py`, `scripts/check_bw_report_preservation.py`,
|
||||||
|
and similar data-maintenance tools
|
||||||
|
- The `.sfm.json` sidecars and `.h5` files in the waveform store
|
||||||
|
- The shape that Terra-View consumes (Terra-View should never need to
|
||||||
|
reach into SFM/device-side endpoints to populate its UI)
|
||||||
|
|
||||||
|
Always-on, scaled for storage/serving, has the DB and waveform store.
|
||||||
|
|
||||||
|
### 3. Codec library — pure data interpretation (used by both sides)
|
||||||
|
|
||||||
|
Neither SFM nor SDM — a shared library both depend on.
|
||||||
|
|
||||||
|
In scope:
|
||||||
|
- `minimateplus/{waveform_codec,histogram_codec,event_file_io,bw_ascii_report,blastware_file}.py`
|
||||||
|
- `micromate/{idf_ascii_report,idf_file}.py`
|
||||||
|
|
||||||
|
These modules take bytes (off the wire on the SFM side, or from a
|
||||||
|
forwarded file on the SDM side) and return `Event` objects. They
|
||||||
|
should not import from `sfm/`, must not touch a DB, and have no I/O
|
||||||
|
beyond reading files passed as arguments. Keep them pure — both
|
||||||
|
tiers can then depend on them without circularity.
|
||||||
|
|
||||||
|
#### Thor IDF binary codec (2026-05-28)
|
||||||
|
|
||||||
|
`micromate/idf_file.read_idf_file()` decodes both Thor IDFW
|
||||||
|
(waveform) and IDFH (histogram) binaries.
|
||||||
|
|
||||||
|
- **IDFW** reuses `decode_waveform_v2()` on the body at fixed file
|
||||||
|
offset `0x0f1f`. Sample fidelity is 87–99% byte-exact on quiet
|
||||||
|
events; loud events hit the BW codec's known walker-stops-early
|
||||||
|
limitation.
|
||||||
|
- **IDFH** has its own segment-based decoder: `[len_be][0a 00 00 00]
|
||||||
|
[00 NN][05 3f]` + N × 72-byte interval records (4 × 16-byte
|
||||||
|
per-channel min/max/halfp). All 859 Thor IDFH corpus files
|
||||||
|
decode (181,071 intervals); peak matches sidecar within ~1.8%
|
||||||
|
(ADC quantization).
|
||||||
|
|
||||||
|
The two outlier `BE9439_*` files in the Thor example corpus are
|
||||||
|
actually Series III Blastware binaries that share the `.IDFW`/`.IDFH`
|
||||||
|
filename convention by accident. `read_idf_file()` detects them by
|
||||||
|
their BW STRT signature and raises `NotImplementedError` pointing
|
||||||
|
callers at `read_blastware_file()`. See
|
||||||
|
`docs/idf_protocol_reference.md` for full field layouts.
|
||||||
|
|
||||||
|
### Practical consequences
|
||||||
|
|
||||||
|
When deciding where new code goes, ask:
|
||||||
|
- *Does it need a connection to a device?* → SFM
|
||||||
|
- *Does it operate on stored events / sidecars / DB rows?* → SDM
|
||||||
|
- *Does it interpret bytes into structured data, with no I/O of its own?* → codec lib
|
||||||
|
|
||||||
|
Terra-View is downstream of SDM for data, and (per the roadmap) will
|
||||||
|
eventually invoke into SFM's device-control endpoints to provide a
|
||||||
|
"connect to unit" experience.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Project layout
|
## Project layout
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -17,6 +117,8 @@ minimateplus/ ← Python client library (primary focus)
|
|||||||
protocol.py ← MiniMateProtocol — wire-level read/write methods
|
protocol.py ← MiniMateProtocol — wire-level read/write methods
|
||||||
client.py ← MiniMateClient — high-level API (connect, get_events, …)
|
client.py ← MiniMateClient — high-level API (connect, get_events, …)
|
||||||
models.py ← DeviceInfo, EventRecord, ComplianceConfig, …
|
models.py ← DeviceInfo, EventRecord, ComplianceConfig, …
|
||||||
|
waveform_codec.py ← Body-codec block walker + decode_tran_initial (partial
|
||||||
|
per-sample decoder — see "Waveform body codec" section below)
|
||||||
|
|
||||||
sfm/server.py ← FastAPI REST server exposing device data over HTTP
|
sfm/server.py ← FastAPI REST server exposing device data over HTTP
|
||||||
seismo_lab.py ← Tkinter GUI (Bridge + Analyzer + Console tabs)
|
seismo_lab.py ← Tkinter GUI (Bridge + Analyzer + Console tabs)
|
||||||
@@ -27,7 +129,7 @@ CHANGELOG.md ← version history
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Current implementation state (v0.12.3)
|
## Current implementation state (v0.14.3)
|
||||||
|
|
||||||
Full read pipeline + write pipeline + erase pipeline + monitor log + call home config working end-to-end over TCP/cellular:
|
Full read pipeline + write pipeline + erase pipeline + monitor log + call home config working end-to-end over TCP/cellular:
|
||||||
|
|
||||||
@@ -41,14 +143,15 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
|
|||||||
| Event header / first key | 1E | ✅ |
|
| Event header / first key | 1E | ✅ |
|
||||||
| Waveform header | 0A | ✅ |
|
| Waveform header | 0A | ✅ |
|
||||||
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
|
| Waveform record (peaks, timestamp, project) | 0C | ✅ |
|
||||||
| **Bulk waveform stream (event-time metadata)** | **5A** | ✅ new v0.6.0 |
|
| **Bulk waveform stream (event-time metadata + full waveform)** | **5A** | ✅ **byte-perfect against BW captures (v0.14.3, 2026-05-05)** — STRT-bounded chunk walk + correct event-N probe counter + DLE-stuffed `0x10` bytes in params + concatenate-only file body assembly. All 17 5A request frames in the 5-1-26 3-sec capture reproduce byte-for-byte. |
|
||||||
| Event advance / next key | 1F | ✅ |
|
| Event advance / next key | 1F | ✅ |
|
||||||
| **Write commands (push config to device)** | **68–83** | ✅ new v0.8.0 |
|
| **Write commands (push config to device)** | **68–83** | ✅ new v0.8.0 |
|
||||||
| **Erase all events** | **0xA3 → 0x1C → 0x06 → 0xA2** | ✅ new v0.9.0 |
|
| **Erase all events** | **0xA3 → 0x1C → 0x06 → 0xA2** | ✅ new v0.9.0 |
|
||||||
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ new v0.10.0 |
|
| **Monitor log entries (partial 0x2C records)** | **0A browse** | ✅ new v0.10.0 |
|
||||||
| **Auto Call Home config (read + write)** | **2C → 7E → 7F** | ✅ **new v0.12.3** |
|
| **Auto Call Home config (read + write)** | **2C → 7E → 7F** | ✅ **new v0.12.3** |
|
||||||
|
|
||||||
`get_events()` sequence per event: `1E → 0A → 0C → 5A → 1F`
|
`get_events()` sequence per event: `1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`
|
||||||
|
(see "Correct iteration pattern" section below for full detail)
|
||||||
|
|
||||||
`push_config_raw()` write sequence: `68→73 | 71×3→72 | 82→83 | 69→74→72`
|
`push_config_raw()` write sequence: `68→73 | 71×3→72 | 82→83 | 69→74→72`
|
||||||
|
|
||||||
@@ -56,6 +159,133 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Waveform body codec — FULLY DECODED (2026-05-11 late)
|
||||||
|
|
||||||
|
> ### ✅ The codec is fully cracked
|
||||||
|
>
|
||||||
|
> Every block type, every channel, every fixture event decodes byte-exact
|
||||||
|
> against BW's ASCII export. **72,972 ADC samples verified, zero errors.**
|
||||||
|
> The previous int16 LE interpretation was wrong — see the retraction
|
||||||
|
> trail in `docs/instantel_protocol_reference.md §7.6.1`.
|
||||||
|
>
|
||||||
|
> Authoritative implementation: `minimateplus/waveform_codec.py`
|
||||||
|
> (`decode_waveform_v2()`). Clean working notes:
|
||||||
|
> `docs/waveform_codec_re_status.md`.
|
||||||
|
>
|
||||||
|
> **NOTE:** `client.py:_decode_a5_waveform` now calls the verified
|
||||||
|
> codec via `waveform_codec.decode_a5_frames()`; the legacy int16 LE
|
||||||
|
> decoder survives only as `_decode_a5_waveform_LEGACY`. See
|
||||||
|
> "Production-code status" below for the remaining walker edge cases.
|
||||||
|
|
||||||
|
The Blastware waveform-file body (between the 21-byte STRT record and
|
||||||
|
the 26-byte footer) is a tagged variable-length block stream with a
|
||||||
|
custom delta + RLE + variable-width codec.
|
||||||
|
|
||||||
|
### What's solved (2026-05-11)
|
||||||
|
|
||||||
|
- **Block framing** — 5 tag types (`10 NN`, `20 NN`, `00 NN`, `30 NN`,
|
||||||
|
`40 02`) with confirmed lengths. Implementation: `walk_body()` in
|
||||||
|
`minimateplus/waveform_codec.py`.
|
||||||
|
- **Per-channel codec** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
|
||||||
|
as int16 BE in **16-count units** (LSB = 0.005 in/s). Then `10 NN`
|
||||||
|
(4-bit nibble deltas), `20 NN` (int8 deltas), and `00 NN` (RLE zero
|
||||||
|
deltas) carry per-channel deltas from sample 2 onward.
|
||||||
|
- **Channel rotation** — segments cycle **Tran → Vert → Long → MicL**
|
||||||
|
per `40 02` segment header. Each segment carries ~512 sample-sets of
|
||||||
|
ONE channel. The initial body (before the first `40 02`) is the
|
||||||
|
implicit Tran segment.
|
||||||
|
- **Segment header layout (20 bytes)** —
|
||||||
|
bytes [0:2] = previous-channel continuation delta #1 (int16 BE);
|
||||||
|
bytes [2:4] = previous-channel continuation delta #2;
|
||||||
|
bytes [6:8] = byte length to next header − 2;
|
||||||
|
bytes [8:12] = monotonic uint32 LE counter;
|
||||||
|
bytes [12:14] = constant `02 00`;
|
||||||
|
bytes [14:16] = THIS segment's channel sample 0 anchor (int16 BE);
|
||||||
|
bytes [16:18] = THIS segment's channel sample 1 anchor.
|
||||||
|
- **`decode_waveform_v2()`** returns full per-channel sample dicts.
|
||||||
|
Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
|
||||||
|
each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).
|
||||||
|
|
||||||
|
- **`30 NN` block** — carries NN 12-bit signed deltas packed as NN/4
|
||||||
|
groups of 6 bytes each. Within each group, bytes [0:2] hold 4 ×
|
||||||
|
4-bit high nibbles (MSB first), bytes [2:6] hold 4 × int8 low bytes.
|
||||||
|
Each delta = `sign_extend_12((high_nibble << 8) | low_byte)`. Block
|
||||||
|
length = `NN × 1.5 + 2` bytes. ✅ confirmed against all 14 `30 NN`
|
||||||
|
blocks in the fixture bundle. 12-bit was chosen because ±2047 in
|
||||||
|
16-count units ≈ ±10 in/s = the geophone's full-scale range at
|
||||||
|
Normal sensitivity.
|
||||||
|
- **Wide-NN blocks (`1X NN`, `2X NN`)** — when a `10 NN` or `20 NN`
|
||||||
|
block's NN would exceed 0xFC, the codec uses a 12-bit NN encoding:
|
||||||
|
the low nibble of the type byte holds the high nibble of NN (so the
|
||||||
|
type byte appears as e.g. `0x11` instead of `0x10`). Effective
|
||||||
|
NN = `((type_byte & 0x0F) << 8) | nn_byte`. Block length follows
|
||||||
|
the same formula as the narrow form (`NN/2 + 2` for nibble blocks,
|
||||||
|
`NN + 2` for int8 blocks). Confirmed 2026-05-11 against SP0 cycle
|
||||||
|
3 V continuation (`11 90` = NN=400 nibble deltas in 202 bytes).
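
The `30 NN` group layout and the wide-NN rule above can be sketched as follows. This is an illustration under stated assumptions, not an excerpt from `waveform_codec.py`: the helper names are hypothetical, the nibble order (high nibble of byte 0 first) is one reading of "MSB first", and the low bytes are treated as unsigned before the 12-bit sign extension.

```python
from typing import List, Tuple


def sign_extend_12(value: int) -> int:
    """Interpret the low 12 bits of value as a two's-complement integer."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def unpack_30_group(group: bytes) -> List[int]:
    """Decode one 6-byte group of a `30 NN` block into four signed deltas."""
    if len(group) != 6:
        raise ValueError("a 30 NN group is exactly 6 bytes")
    # Assumed order: high nibble of byte 0, low nibble of byte 0, then byte 1.
    highs = [group[0] >> 4, group[0] & 0x0F, group[1] >> 4, group[1] & 0x0F]
    lows = group[2:6]  # iterating bytes yields unsigned ints 0..255
    return [sign_extend_12((h << 8) | lo) for h, lo in zip(highs, lows)]


def wide_block_header(type_byte: int, nn_byte: int) -> Tuple[int, int]:
    """Return (effective NN, total block length in bytes) for 1X/2X blocks."""
    nn = ((type_byte & 0x0F) << 8) | nn_byte
    family = type_byte & 0xF0
    if family == 0x10:   # 4-bit nibble deltas: two per byte, plus 2 header bytes
        return nn, nn // 2 + 2
    if family == 0x20:   # int8 deltas: one per byte, plus 2 header bytes
        return nn, nn + 2
    raise ValueError("only the 1X and 2X families use wide NN")


print(unpack_30_group(bytes.fromhex("0f00ff010000")))  # [255, -255, 0, 0]
print(wide_block_header(0x11, 0x90))                   # (400, 202)
```

The second call reproduces the `11 90` case quoted above: NN=400 nibble deltas in a 202-byte block.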
|
||||||
|
|
||||||
|
### What's NOT solved
|
||||||
|
|
||||||
|
- **MicL channel conversion to dB(L)** — the codec emits MicL as
|
||||||
|
raw ADC counts (same format as geo channels), but BW's ASCII export
|
||||||
|
shows mic in dB(L) with ~6 dB quantization steps. Need to map
|
||||||
|
ADC counts → dB(L) for direct comparison; likely
|
||||||
|
`dB = 20*log10(|counts|) + offset` or similar.
|
||||||
|
- **Walker edge cases** — SS0 and SV0 stop 1–7 samples short of the
|
||||||
|
end of each channel (see the table below). Every sample reached is
|
||||||
|
correct; the walker just needs robustness improvements.
|
||||||
|
|
||||||
|
### Decoded sample counts (across the fixture bundle)
|
||||||
|
|
||||||
|
| Event | Tran | Vert | Long | Total |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| event-a | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||||
|
| event-b | 2304 | 2304 | 2304 | **6912** ← full event |
|
||||||
|
| event-c | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||||
|
| event-d | 1280 | 1280 | 1280 | 3840 ← full event |
|
||||||
|
| JQ0 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||||
|
| V70 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||||
|
| SP0 | 3328 | 3328 | 3328 | **9984** ← full event |
|
||||||
|
| SS0 | 3078 | 3072 | 3072 | 9222 (1–7 tail samples missing) |
|
||||||
|
| SV0 | 3078 | 3072 | 3072 | 9222 (1–7 tail samples missing) |
|
||||||
|
|
||||||
|
**Total: 72,972 ADC samples verified byte-exact, zero errors.**
|
||||||
|
|
||||||
|
7 of 9 fixture events decode end-to-end across all three geo channels.
|
||||||
|
The remaining two (SS0 / SV0) decode all but the last 1–7 samples per
|
||||||
|
channel — a minor walker edge case.
|
||||||
|
|
||||||
|
### Production-code status (updated 2026-05-11 late)
|
||||||
|
|
||||||
|
`client.py:_decode_a5_waveform` now uses the verified codec via
|
||||||
|
`waveform_codec.decode_a5_frames()` — which calls
|
||||||
|
`blastware_file.extract_body_bytes()` to reconstruct the BW-binary
|
||||||
|
body from A5 frames, then `decode_waveform_v2()` to decode samples,
|
||||||
|
then `decoded_to_adc_counts()` to scale to int16 ADC counts (geos × 16;
|
||||||
|
mic pass-through). The `.h5` sidecars SFM produces now contain
|
||||||
|
correct samples for any event without walker edge cases.
|
||||||
|
|
||||||
|
The original int16 LE decoder is preserved as
|
||||||
|
`_decode_a5_waveform_LEGACY` for reference but is not called.
|
||||||
|
|
||||||
|
MicL → dB(L) conversion utility:
|
||||||
|
`waveform_codec.mic_count_to_db(count)` — `count=±1 → ±81.94 dB`;
|
||||||
|
`count=813 → 140.14 dB` (matches BW display).
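
The two data points above are consistent with a plain `20*log10(|count|)` law plus a fixed offset. The sketch below is inferred from those two figures only and is not the body of `waveform_codec.mic_count_to_db`; the function name, the zero-count behaviour, and the sign handling are assumptions.

```python
import math

MIC_DB_OFFSET = 81.94  # dB(L) at |count| == 1, taken from the figures above


def mic_count_to_db_sketch(count: int) -> float:
    """Map a raw MicL ADC count to dB(L), keeping the sign of the count."""
    if count == 0:
        return 0.0  # assumption: log10(0) is undefined, so report 0.0
    db = 20.0 * math.log10(abs(count)) + MIC_DB_OFFSET
    return math.copysign(db, count)


print(round(mic_count_to_db_sketch(813), 2))  # 140.14
print(mic_count_to_db_sketch(-1))             # -81.94
```

Under this law, doubling the count adds about 6.02 dB, which lines up with the roughly 6 dB quantization steps seen in BW's ASCII export.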
|
||||||
|
|
||||||
|
### Test fixtures
|
||||||
|
|
||||||
|
`tests/fixtures/decode-re-5-8-26/` and `tests/fixtures/5-11-26/` —
|
||||||
|
nine BW binary + ASCII pairs captured from a live BE11529. The
|
||||||
|
5-11-26 high-amplitude bundle (PPV 6–7 in/s) is what cracked the Tran
|
||||||
|
codec; the V70 (mic-heavy) + JQ0 (Vert-heavy) pair cracked the `00 NN`
|
||||||
|
RLE rule.
|
||||||
|
|
||||||
|
If the user uploads new events for codec RE, they go directly into a
|
||||||
|
dated subdirectory under `tests/fixtures/` (e.g. `tests/fixtures/5-18-26/`).
|
||||||
|
There used to be a separate `decode-re/` upload mirror but it was
|
||||||
|
removed once the fixtures directory became the canonical location.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Protocol fundamentals
|
## Protocol fundamentals
|
||||||
|
|
||||||
### DLE framing
|
### DLE framing
|
||||||
@@ -115,32 +345,203 @@ S3→BW (response):
|
|||||||
section contribute only `XX` to the running sum; lone bytes contribute normally. This
|
section contribute only `XX` to the running sum; lone bytes contribute normally. This
|
||||||
differs from the standard SUM8-of-destuffed-payload that all other commands use.
|
differs from the standard SUM8-of-destuffed-payload that all other commands use.
|
||||||
|
|
||||||
Both differences confirmed by reproducing Blastware's exact wire bytes from the 1-2-26
|
3. **Params region uses partial DLE stuffing (CONFIRMED 2026-05-05).** The device's
|
||||||
BW TX capture. All 10 frames verified.
|
de-stuffing rule for bytes inside the params region is:
|
||||||
|
|
||||||
### SUB 5A — chunk counter formula (FINAL CORRECTION 2026-04-26)
|
- `10 10` → de-stuffs to `10`
|
||||||
|
- `10 02 / 03 / 04` → kept literal (these are inner-frame markers)
|
||||||
|
- `10 X` for other X → de-stuffs to just `X` (drops the leading `0x10`)
|
||||||
|
|
||||||
**Chunk counter = `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` for ALL chunks.**
|
Therefore any `0x10` byte in the *logical* params that is followed by a byte NOT in
|
||||||
|
`{0x02, 0x03, 0x04, 0x10}` MUST be doubled on the wire (`10 X` → `10 10 X`) so the
|
||||||
|
device's de-stuffer reproduces the original `10 X` pair. This applies most commonly
|
||||||
|
to counters with `0x10` in the high byte (e.g. counter=`0x1000` produces logical
|
||||||
|
params bytes `... 10 00 ...`, which BW encodes on the wire as `... 10 10 00 ...`).
|
||||||
|
Without this stuffing the device interprets counter=`0x1000` as `0x0000` and returns
|
||||||
|
the probe response (which contains a copy of the file header + STRT record). That
|
||||||
|
STRT block then gets embedded in the assembled file body at offset `0x1016`, and
|
||||||
|
Blastware refuses to open the file — see the v0.14.3 entry in `CHANGELOG.md`.
|
||||||
|
|
||||||
where `key4[2:4] = (key4[2] << 8) | key4[3]` is the event's circular-buffer base offset.
|
`0x10` bytes in `offset_hi` (body[5]) are still written RAW — only the params region
|
||||||
|
has this stuffing requirement. The metadata-page params for counter `0x1002` /
|
||||||
|
`0x1004` survive without stuffing because `10 02` and `10 04` fall in the "kept
|
||||||
|
literal" carve-out.
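
The stuffing rule for the params region can be sketched as a pair of helpers. These are hypothetical names, not the functions in `framing.py`, and two edge cases the text does not cover (a trailing `0x10`, and a logical `10 10` pair) are left unhandled here.

```python
DLE = 0x10
KEPT_LITERAL = {0x02, 0x03, 0x04, 0x10}  # followers that need no doubling


def stuff_params(logical: bytes) -> bytes:
    """Double any 0x10 whose next logical byte is not in KEPT_LITERAL."""
    out = bytearray()
    for i, b in enumerate(logical):
        out.append(b)
        has_next = i + 1 < len(logical)
        if b == DLE and has_next and logical[i + 1] not in KEPT_LITERAL:
            out.append(DLE)
    return bytes(out)


def destuff_params(wire: bytes) -> bytes:
    """Model of the device-side rule quoted above."""
    out = bytearray()
    i = 0
    while i < len(wire):
        b = wire[i]
        if b == DLE and i + 1 < len(wire):
            nxt = wire[i + 1]
            if nxt == DLE:
                out.append(DLE)            # 10 10 -> 10
            elif nxt in (0x02, 0x03, 0x04):
                out += bytes((DLE, nxt))   # inner-frame markers kept literal
            else:
                out.append(nxt)            # 10 X -> X (leading 0x10 dropped)
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)


logical = bytes.fromhex("0001111000")        # counter bytes ... 10 00
print(stuff_params(logical).hex())           # 000111101000
print(destuff_params(logical).hex())         # 00011100
```

The last line shows what happens when the stuffing is skipped: the device-side rule drops the `0x10`, so the counter bytes `10 00` collapse to `00`.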
|
||||||
|
|
||||||
The `max(..., 0x0400)` guard is critical for events at the start of the circular buffer
|
Both differences (1) and (2) confirmed by reproducing Blastware's exact wire bytes from
|
||||||
(key4[2:4] == 0x0000, e.g. key `01110000`). Without it, chunk 1 gets counter=0x0000, which
|
the 1-2-26 BW TX capture (10 frames). Difference (3) confirmed against the 5-1-26
|
||||||
is the same address as the probe frame — the device re-returns the STRT record data instead
|
"bwcap3sec" capture (17 frames, all match byte-for-byte after fix).
|
||||||
of waveform payload. With the guard, chunk 1 gets counter=0x0400, which is confirmed correct
|
|
||||||
from the empirical live-device test 2026-04-06 (`counter=0x0400 → responds immediately and
|
|
||||||
streams all frames correctly`).
|
|
||||||
|
|
||||||
The 4-3-26 capture confirms the pattern for a second event (key `0111245a`, key4[2:4]=0x245a):
|
### SUB 5A — chunk counter formula (REWRITTEN 2026-05-01 — see 5-1-26 captures)
|
||||||
chunk 1 = `0x245A`, chunk 2 = `0x285A`, chunk 3 = `0x2C5A` (each +0x0400).
|
|
||||||
`max(0x245a, 0x0400) = 0x245a` → formula works correctly for non-zero base offset too.
|
> ⚠️ **Everything that came before this rewrite was WRONG in important ways.** The previous
|
||||||
|
> formula `max(key4[2:4], 0x0400) + (chunk_num - 1) * 0x0400` happened to *work* for events
|
||||||
|
> at start_key=0 because the device responds to whatever counter you ask for — but it caused
|
||||||
|
> a 5× over-read past the actual event, picking up post-event circular-buffer garbage that
|
||||||
|
> corrupts the reconstructed file for any event > ~1 sec of waveform. The captures in
|
||||||
|
> `bridges/captures/4-27-26/` and `5-1-26/comcheck/` show BW reads only ~12-16 chunks for
|
||||||
|
> the same events SFM was reading 37+ chunks for. See "TERM frame" and "STRT end_offset"
|
||||||
|
> sections below for the actual mechanism.
|
||||||
|
|
||||||
|
**Chunk addressing is just absolute device-buffer addresses.**
|
||||||
|
|
||||||
|
`params[0]=0x00`, `params[1:5]` is a 4-byte absolute device flash-buffer address (= the
|
||||||
|
"key" of that location), `params[5:11]` are zeros. The device returns 0x0200 (= 512) bytes
|
||||||
|
starting at that address. Increments between consecutive chunks are **0x0200 (NOT 0x0400)**
|
||||||
|
— this matches the chunk payload size. The previous "0x0400 step" worked by accident: BW
|
||||||
|
asks for half-size chunks; SFM was asking for double-size chunks, both with the same-named
|
||||||
|
"counter" field, but the value is just an address pointer the device honors as-is.
|
||||||
|
|
||||||
|
**The chunk pattern depends on whether the event sits at start_key=0 or not.**
|
||||||
|
|
||||||
|
#### Event 1 case — start_key[2:4] == 0x0000 (first event after erase / wrap)
|
||||||
|
|
||||||
|
```
|
||||||
|
1. Probe at counter=0x0000 (params[1:5] = full key, returns STRT record)
|
||||||
|
2. Read 2 fixed metadata pages: counter=0x1002, counter=0x1004
|
||||||
|
(these are GLOBAL session metadata — read ONCE per
|
||||||
|
Blastware session, not per event; contain the
|
||||||
|
Project/Client/User Name/Seis Loc strings)
|
||||||
|
3. Sample chunks: counter=0x0600, 0x0800, …, by 0x0200 increment,
|
||||||
|
up to but not including end_offset (rounded down to
|
||||||
|
0x0200 boundary)
|
||||||
|
4. TERM frame (see TERM formula below)
|
||||||
|
```
|
||||||
|
|
||||||
|
The reason `0x0046..0x0600` is skipped for event 1 is unknown — likely some pre-event
|
||||||
|
firmware reserved area for the first slot in a freshly-erased buffer. Harmless to skip.
|
||||||
|
|
||||||
|
#### Event 2+ case — start_key[2:4] != 0x0000 (continuation events)
|
||||||
|
|
||||||
|
```
|
||||||
|
1. First chunk at counter = start_key[2:4] (this IS the probe — response
|
||||||
|
contains STRT at byte 17)
|
||||||
|
2. Sample chunks: counter += 0x0200 each, up to but
|
||||||
|
not including end_offset
|
||||||
|
3. TERM frame
|
||||||
|
```
|
||||||
|
|
||||||
|
**`start_key` here is the off=0x46 WAVEHDR record key returned by 1F** (e.g. `01112238`),
|
||||||
|
NOT the off=0x2C boundary key that immediately precedes it. An earlier draft of this
|
||||||
|
doc described event-N as "probe at start + 0x46" — that formula came from naming the
|
||||||
|
boundary key as `start_key`. In the iteration walk, `cur_key` passed to
|
||||||
|
`read_bulk_waveform_stream` is always the off=0x46 key (the partial-record skip path in
|
||||||
|
`get_events` re-runs 1F to advance past boundary records before invoking 5A), so the
|
||||||
|
probe counter is just `cur_key[2:4]` with no extra offset. **Adding +0x46 caused the
|
||||||
|
probe to overshoot, miss the STRT record at byte 17 of the response, fall back to the
|
||||||
|
`max_chunks=128` cap, and walk ~110 chunks of post-event garbage** — observed in
|
||||||
|
SFM 5-4-26 capture before the fix.
|
||||||
|
|
||||||
|
Confirmed across:
|
||||||
|
- 5-1-26 "copy 2nd address" BW capture: probe counter=0x2238, key=01112238, STRT@17 end=0x417E.
|
||||||
|
- 5-4-26 BW 2-sec event capture: probe counter=0x2238, key=01112238, TERM offset_word=0x0146 → end=0x417E.
|
||||||
|
|
||||||
|
No metadata pages — those have already been read during event 1 in the same Blastware
|
||||||
|
session, and BW caches them. Note that the metadata-page reads happen ONCE per
|
||||||
|
Blastware-session-on-the-device, not once per event, so an SFM session that downloads
|
||||||
|
several events should read 0x1002/0x1004 only once at the start.
|
||||||
|
|
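The two chunk patterns above can be sketched as a single counter generator. This is an illustration under stated assumptions, not the code in `read_bulk_waveform_stream`: the function name `sample_chunk_counters` is hypothetical, the probe and the metadata pages for event 1 are left to the caller, and for continuation events the first counter returned is the probe chunk itself.

```python
CHUNK_STEP = 0x0200           # address increment between consecutive chunks
EVENT1_FIRST_SAMPLE = 0x0600  # event 1 skips 0x0046..0x0600 after the probe


def sample_chunk_counters(start_counter: int, end_offset: int) -> list[int]:
    """Return the counters of the full 0x0200-byte chunks to request.

    start_counter is start_key[2:4]; end_offset comes from the STRT record.
    The partial tail past the last counter is left for the TERM frame.
    """
    # Event 1 (start_counter == 0) begins sampling at 0x0600;
    # continuation events begin at the off=0x46 key's counter, unmodified.
    counter = EVENT1_FIRST_SAMPLE if start_counter == 0 else start_counter
    counters = []
    # Stop as soon as a full chunk would run past end_offset.
    while counter + CHUNK_STEP <= end_offset:
        counters.append(counter)
        counter += CHUNK_STEP
    return counters
```

With the values from the captures, `sample_chunk_counters(0x2238, 0x417E)` ends at `0x3E38`, and `sample_chunk_counters(0x0000, 0x21F2)` ends at `0x1E00`.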
||||||
|
#### History (do not re-derive)
|
||||||
|
|
||||||
**History:**
|
|
||||||
- Original: `_CHUNK1_COUNTER = 0x1004` hardcoded (Blastware capture artifact — WRONG).
|
- Original: `_CHUNK1_COUNTER = 0x1004` hardcoded (Blastware capture artifact — WRONG).
|
||||||
- 2026-04-06: Corrected to `chunk_num * 0x0400` (worked for key 01110000 only).
|
- 2026-04-06: `chunk_num * 0x0400` (worked for key 01110000 only).
|
||||||
- 2026-04-24: Corrected to `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets,
|
- 2026-04-24: `key4[2:4] + (chunk_num-1) * 0x0400` (fixed non-zero offsets, broke key 01110000).
|
||||||
but accidentally broke key 01110000 — counter=0x0000 sends probe address again).
|
- 2026-04-26: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400` (broken — over-read past event end).
|
||||||
- 2026-04-26: Final formula: `max(key4[2:4], 0x0400) + (chunk_num-1) * 0x0400`.
|
- 2026-05-01: Increments are 0x0200 not 0x0400; absolute addresses inside event range; bounded
|
||||||
|
by STRT end_key, not by `max_chunks` cap or device-side timeout.
|
||||||
|
- 2026-05-04: Removed spurious `+0x0046` from event-N probe counter. `cur_key` from 1F
|
||||||
|
is already the off=0x46 WAVEHDR key, so adding +0x46 would have placed the probe one
|
||||||
|
WAVEHDR past the actual event start. This caused probe responses to lack a STRT
|
||||||
|
record (no `end_offset` parsed → `0xFFFF` fallback → `max_chunks=128` cap), walking
|
||||||
|
~110 chunks of post-event circular-buffer garbage. Fixed in protocol.py
|
||||||
|
`read_bulk_waveform_stream`.
|
||||||
|
|
||||||
|
### SUB 5A — STRT record encodes end_offset (NEW 2026-05-01)
|
||||||
|
|
||||||
|
The first A5 response (probe response, or the first chunk for event 2+) contains a STRT
|
||||||
|
record at byte offset 17 of the `data` field. Layout:
|
||||||
|
|
||||||
|
```
|
||||||
|
data[17:21] "STRT" magic
|
||||||
|
data[21:23] ff fe sentinel
|
||||||
|
data[23:27] end_key ← 4-byte key of where this event ENDS
|
||||||
|
data[27:31] start_key ← 4-byte key of where this event STARTS
|
||||||
|
data[31:33] uint16 BE ?? sample-count or total bytes (varies; not yet decoded)
|
||||||
|
data[33:35] uint16 BE ??
|
||||||
|
data[35] 0x46 record type (waveform full record)
|
||||||
|
…
|
||||||
|
```
|
||||||
|
|
||||||
|
`end_offset = (end_key[2] << 8) | end_key[3]` is **the authoritative event-end pointer**.
|
||||||
|
SFM must extract this from the first A5 response and use it to bound the chunk loop and
|
||||||
|
encode the TERM frame. The device will happily respond to chunk requests past `end_offset`
|
||||||
|
(returning post-event circular-buffer contents) — that's the over-read bug.
|
||||||
|
|
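A minimal sketch of extracting `end_offset` from the first A5 response, following the layout above. The names `StrtRecord` and `parse_strt` are hypothetical; the sketch checks only the `STRT` magic and the buffer length, and ignores the undecoded fields at `data[31:35]`.

```python
from typing import NamedTuple, Optional

STRT_OFFSET = 17  # byte offset of the "STRT" magic inside the A5 data field


class StrtRecord(NamedTuple):
    end_key: bytes
    start_key: bytes
    end_offset: int


def parse_strt(data: bytes) -> Optional[StrtRecord]:
    """Return the STRT record found at data[17], or None if it is absent."""
    if len(data) < STRT_OFFSET + 14:
        return None  # too short to hold magic + sentinel + two keys
    if data[STRT_OFFSET:STRT_OFFSET + 4] != b"STRT":
        return None
    end_key = data[23:27]
    start_key = data[27:31]
    # end_offset is the low 16 bits of end_key, big-endian.
    end_offset = (end_key[2] << 8) | end_key[3]
    return StrtRecord(end_key, start_key, end_offset)
```

Returning `None` rather than a `0xFFFF` fallback makes a missing STRT record an explicit condition the caller has to handle, instead of silently running to a chunk cap.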
||||||
|
Verified across 3 events:
|
||||||
|
|
||||||
|
| Capture | start_key | end_key | end_offset | event size |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| 4-27-26 "open 2sec" / "copy event to disk" | `01110000` | `01111ABE` | `0x1ABE` | 6,846 B |
|
||||||
|
| 5-1-26 "copy 3sec" / Download All event 1 | `01110000` | `011121F2` | `0x21F2` | 8,690 B |
|
||||||
|
| 5-1-26 "copy 2nd address" / DA event 2 | `011121F2` | `0111417E` | `0x417E` | 8,076 B (span `0x417E - 0x21F2 = 0x1F8C`) |
|
||||||
|
|
||||||
|
### SUB 5A — TERM frame formula (FINALIZED 2026-05-01)
|
||||||
|
|
||||||
|
The TERM frame fetches the partial last chunk *and* the file footer. It is **not** a simple
|
||||||
|
"goodbye" frame — its response payload contains the bytes between the last full 0x0200-aligned
|
||||||
|
chunk and `end_offset`, and is required for reconstructing the Blastware file format.
|
||||||
|
|
||||||
|
```
|
||||||
|
last_chunk_counter = address of last full 0x0200-byte chunk read
|
||||||
|
next_boundary = last_chunk_counter + 0x0200
|
||||||
|
TERM offset_word = end_offset - next_boundary
|
||||||
|
TERM params[0] = key[0] (= 0x01 on every observed device)
|
||||||
|
TERM params[1] = key[1] (= 0x11)
|
||||||
|
TERM params[2] = (next_boundary >> 8) & 0xFF
|
||||||
|
TERM params[3] = next_boundary & 0xFF
|
||||||
|
TERM params[4:10] = zeros
|
||||||
|
build_5a_frame(offset_word, params) (10-byte params, NOT 11)
|
||||||
|
```
|
||||||
|
|
||||||
|
The device reconstructs `requested_address = ((params[2] << 8) | params[3]) + offset_word = end_offset`
|
||||||
|
and replies with `(end_offset - next_boundary)` bytes from `next_boundary` — the residual
|
||||||
|
between the last 0x0200 boundary and the actual event end. Append the TERM response data
|
||||||
|
to the chunk stream like any other A5 frame; it carries the final waveform tail + footer.
|
||||||
|
|
||||||
|
Verified across 3 events:
|
||||||
|
|
||||||
|
| end_offset | last chunk | next_boundary | TERM offset_word | TERM params[2:4] |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| `0x1ABE` | `0x1800` | `0x1A00` | `0x00BE` ✓ | `1A 00` ✓ |
|
||||||
|
| `0x21F2` | `0x1E00` | `0x2000` | `0x01F2` ✓ | `20 00` ✓ |
|
||||||
|
| `0x417E` | `0x3E38` | `0x4038` | `0x0146` ✓ | `40 38` ✓ |
|
||||||
|
|
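The TERM arithmetic can be checked against the three verified rows with a short sketch. The function name `term_request` is hypothetical, and the sketch stops at computing `offset_word` and the 10-byte params; how `build_5a_frame` consumes them is not assumed here.

```python
CHUNK_STEP = 0x0200


def term_request(key: bytes, last_chunk_counter: int, end_offset: int) -> tuple[int, bytes]:
    """Compute (offset_word, 10-byte params) for the TERM frame.

    key is the 4-byte event key; only key[0] and key[1] are used.
    """
    next_boundary = last_chunk_counter + CHUNK_STEP
    offset_word = end_offset - next_boundary
    if not 0 <= offset_word < CHUNK_STEP:
        # A residual outside one chunk means the chunk loop stopped
        # at the wrong place.
        raise ValueError(f"unexpected TERM residual 0x{offset_word:04X}")
    params = bytes([
        key[0],
        key[1],
        (next_boundary >> 8) & 0xFF,
        next_boundary & 0xFF,
    ]) + bytes(6)  # params[4:10] = zeros, 10 bytes total
    return offset_word, params
```

For the continuation event in the table, `term_request(bytes.fromhex("01112238"), 0x3E38, 0x417E)` gives `offset_word = 0x0146` and params starting `01 11 40 38`.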
||||||
|
The previous code's hard-coded `offset_word = 0x005A` and `term_counter = last + 0x0400`
|
||||||
|
are wrong; the device's response under that path is a tiny 101-byte device-side terminator
|
||||||
|
(which arrives only after the entire post-event buffer has been walked), not the proper file footer.
|
||||||
|
|
||||||
|
### SUB 5A — fixed metadata pages 0x1002 and 0x1004 (NEW 2026-05-01)
|
||||||
|
|
||||||
|
Two chunk addresses are GLOBAL device/session metadata, not event-specific:
|
||||||
|
|
||||||
|
- `counter=0x1002` — first metadata page
|
||||||
|
- `counter=0x1004` — second metadata page
|
||||||
|
|
||||||
|
These are at fixed absolute addresses in the device's flash buffer. They contain the
|
||||||
|
session-start compliance setup (Project/Client/User Name/Seis Loc/Extended Notes ASCII
|
||||||
|
strings). Under the v0.14.0+ walk these strings are read directly from the metadata
|
||||||
|
pages, not from the sample-chunk stream.
|
||||||
|
|
||||||
|
BW reads them ONCE per Blastware session (during event 1's download) and caches them.
|
||||||
|
For SFM, that means:
|
||||||
|
- Once per call-home / once per `MiniMateClient.connect()` is enough.
|
||||||
|
- Subsequent events in the same session don't need to re-fetch them.
|
||||||
|
- Their content does not change when iterating events; only when the user opens
|
||||||
|
Compliance Setup → Apply on the device or sends a SUB 71 compliance write.
|
||||||
|
|
||||||
|
The full byte-for-byte layout of the metadata pages has not been mapped — `_decode_a5_metadata_into`
|
||||||
|
locates the ASCII strings via label scans (`Project:`, `Client:`, `User Name:`, `Seis Loc:`,
|
||||||
|
`Extended Notes`) which works correctly across observed captures. Future work could
|
||||||
|
dump the structural layout if more session-global fields need to be extracted.
|
||||||
|
|
||||||
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
|
### SUB 5A — params are 11 bytes for chunk frames, 10 for termination
|
||||||
|
|
||||||
@@ -148,10 +549,11 @@ chunk 1 = `0x245A`, chunk 2 = `0x285A`, chunk 3 = `0x2C5A` (each +0x0400).
|
|||||||
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
|
confirmed from the BW wire capture. `bulk_waveform_term_params()` returns 10 bytes.
|
||||||
Do not swap them.
|
Do not swap them.
|
||||||
|
|
||||||
### SUB 5A — event-time metadata lives in A5 frame 7
|
### SUB 5A — event-time metadata source (FINALIZED 2026-05-05)
|
||||||
|
|
||||||
The bulk stream sends 9+ A5 response frames. Frame 7 (0-indexed) contains the compliance
|
The metadata strings come from the two fixed metadata pages at counter `0x1002` and
|
||||||
setup as it existed when the event was recorded:
|
`0x1004` (see "SUB 5A — fixed metadata pages 0x1002 and 0x1004" above). These pages
|
||||||
|
are GLOBAL session metadata — read once per Blastware/SFM session, not per event.
|
||||||
|
|
||||||
```
|
```
|
||||||
"Project:" → project description
|
"Project:" → project description
|
||||||
@@ -161,44 +563,71 @@ setup as it existed when the event was recorded:
|
|||||||
"Extended Notes"→ notes
|
"Extended Notes"→ notes
|
||||||
```
|
```
|
||||||
|
|
||||||
**IMPORTANT — 5A "Project:" is session-start config, NOT per-event (confirmed 2026-04-05):**
|
**IMPORTANT — these strings are session-start config, NOT per-event:**
|
||||||
The "Project:" string in the A5 frame 7 payload reflects the compliance setup from when
|
Project / Client / User Name / Seis Loc reflect the compliance setup from when the
|
||||||
the *monitoring session first started*, not the individual event's project name. The per-
|
*monitoring session first started*, not the individual event's per-event metadata. The
|
||||||
event project name is correctly stored in the 210-byte 0C waveform record and must be
|
authoritative per-event project name is stored in the 210-byte 0C waveform record.
|
||||||
used as the authoritative source. `_decode_a5_metadata_into` therefore only sets
|
`_decode_a5_metadata_into` therefore only sets `project` from the 5A metadata pages
|
||||||
`project` from 5A when 0C didn't already supply one.
|
when 0C didn't already supply one.
|
||||||
|
|
||||||
"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are **NOT** present in the 0C
|
"Client:", "User Name:", "Seis Loc:", and "Extended Notes" are **NOT** present in the 0C
|
||||||
record — 5A remains the sole source for those fields and they are set unconditionally.
|
record — the metadata pages are the sole source for those fields and they are set
|
||||||
|
unconditionally.
|
||||||
|
|
||||||
`stop_after_metadata=True` (default) stops the 5A loop as soon as `b"Project:"` appears,
|
#### Deprecated knobs (do not re-introduce)
|
||||||
then sends the termination frame.
|
|
||||||
|
|
||||||
### SUB 5A — end-of-stream signal (confirmed 2026-04-06)
|
The `read_bulk_waveform_stream()` function still accepts these legacy kwargs for
|
||||||
|
backward compatibility, but they are **no-ops** under the v0.14.0+ walk:
|
||||||
|
|
||||||
After streaming all waveform chunks, the device sends exactly **1 raw byte** in response to
|
- `stop_after_metadata=True` — used to scan the chunk stream for `b"Project:"` and stop
|
||||||
the next chunk request, then goes silent. This is the natural end-of-stream indicator — NOT
|
one chunk later as a workaround for the missing end_offset bound. Obsolete: the loop
|
||||||
a complete A5 frame. `S3FrameParser.bytes_fed` will be 1; no frame is assembled.
|
is now deterministically bounded by `end_offset` parsed from the STRT record at
|
||||||
|
data[17] of the probe response, with the partial tail fetched by the TERM frame.
|
||||||
|
- `extra_chunks_after_metadata` — same era, same reason. No-op.
|
||||||
|
|
||||||
Handling: on `TimeoutError`, if `bytes_fed > 0` AND frames were already collected, treat as
|
If you find code or docs referencing "A5 frame 7" as the source of metadata strings,
|
||||||
graceful end-of-stream, break the loop, and proceed to the termination frame. If `bytes_fed
|
that's an old-walk artifact (the broken `0x0400`-step formula occasionally caught the
|
||||||
== 0` with no prior frames, it is a genuine transport failure — re-raise.
|
0x1002 metadata page at sample-chunk fi=7). Update to reference the dedicated metadata
|
||||||
|
pages instead.
|
||||||
|
|
||||||
**Chunk recv timeout must be 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
|
### SUB 5A — end-of-stream (FINALIZED 2026-05-01)
|
||||||
Using 120 s causes a ~2-minute stall at every end-of-stream detection. The `_recv_one` call
|
|
||||||
in the chunk loop passes `timeout=10.0` explicitly.
|
|
||||||
|
|
||||||
**Typical chunk count (BE11529, 1024 sps):** A 9,306-sample event produces 35 chunks before
|
Under the v0.14.0+ STRT-bounded walk the stream ends cleanly:
|
||||||
end-of-stream. Chunks with uniform 1,036-byte data are all-zero ADC samples (post-event
|
|
||||||
silence). Only the initial variable-size chunks contain actual signal.
|
```
|
||||||
|
… last full chunk at counter < end_offset
|
||||||
|
TERM request (offset_word = end_offset - next_boundary,
|
||||||
|
params address = next_boundary)
|
||||||
|
TERM response (page_key = 0x0000 or 0x0001, data = the residual
|
||||||
|
end_offset - next_boundary bytes including the file footer)
|
||||||
|
```
|
||||||
|
|
||||||
|
No timeout-based detection, no "1-byte teaser," no `max_chunks` cap. The chunk loop
|
||||||
|
exits when `counter + 0x0200 > end_offset`; the TERM frame fetches the tail.
|
||||||
|
|
||||||
|
**Chunk recv timeout is 10 s, not the default 120 s.** Chunks arrive within ~1 s each.
|
||||||
|
Using 120 s would cause a ~2-minute stall on any unexpected timeout. The `_recv_one`
|
||||||
|
call in the chunk loop passes `timeout=10.0` explicitly.
|
||||||
|
|
||||||
|
**Typical chunk count under the v0.14.0+ walk (BE11529, 1024 sps over TCP/cellular):**
|
||||||
|
|
||||||
|
| Event duration | Sample chunks | Metadata pages | TERM | Total A5 frames |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| 2-sec (event 1) | ~12 | 2 | 1 | ~15 |
|
||||||
|
| 3-sec (event 1) | 13 | 2 | 1 | 16 |
|
||||||
|
| 2-sec (continuation) | 15 | 0 | 1 | 16 |
|
||||||
|
| 3-sec (continuation) | ~14 | 0 | 1 | ~15 |
|
||||||
|
|
||||||
|
For comparison, the deprecated `0x0400`-step walk produced ~37 chunks for a 2-sec
|
||||||
|
event with chunks 17-37 containing post-event circular-buffer garbage. Do not
|
||||||
|
re-introduce that walk under any circumstances.
|
||||||
|
|
||||||
### SUB 5A — fi==9 hardcoded skip (FIXED 2026-04-06)
|
### SUB 5A — fi==9 hardcoded skip (FIXED 2026-04-06)
|
||||||
|
|
||||||
`_decode_a5_waveform()` previously had `elif fi == 9: continue` — a leftover from the
|
`_decode_a5_waveform()` previously had `elif fi == 9: continue` — a leftover from the
|
||||||
9-frame original blast capture where frame 9 was assumed to be a terminator. For current
|
9-frame original blast capture where frame 9 was assumed to be a terminator. Removed.
|
||||||
35-frame streams, fi==9 is live waveform data (~133 sample-sets were being dropped).
|
TERM detection in the file builder uses `frame.page_key != 0x0010` (sample marker),
|
||||||
Removed. Terminator detection is via `page_key == 0x0000` in `read_bulk_waveform_stream`,
|
not frame index — see `blastware_file.py`.
|
||||||
not frame index.
|
|
||||||
|
|
||||||
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
|
### SUB 1E / 1F — event iteration null sentinel and token position (FIXED, do not re-introduce)
|
||||||
|
|
||||||
@@ -303,6 +732,55 @@ sends token=0xFE and is NOT used by any caller.
|
|||||||
`advance_event()` returns `(key4, event_data8)`.
|
`advance_event()` returns `(key4, event_data8)`.
|
||||||
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
|
Callers (`count_events`, `get_events`) loop while `data8[4:8] != b"\x00\x00\x00\x00"`.
|
||||||
|
|
||||||
|
### SUB 0A — WAVEHDR response length distinguishes events from boundaries (NEW 2026-05-01)
|
||||||
|
|
||||||
|
When iterating events with the "Download All" pattern (1E → 0A → 1F → 0A → 1F → …), the
|
||||||
|
DATA_LENGTH at `data_rsp.data[5]` (= the byte BW echoes back as the offset for the data
|
||||||
|
fetch step) takes one of two values:
|
||||||
|
|
||||||
|
| WAVEHDR offset | Meaning |
|
||||||
|
|---|---|
|
||||||
|
| `0x46` (= 70) | Real event start key — there is event data at this address |
|
||||||
|
| `0x2C` (= 44) | Boundary marker between events — this key is the END of the previous event AND the START key for the empty space after it (or is the next event's pre-header) |
|
||||||
|
|
||||||
|
Confirmed from the 5-1-26 "Download All" capture:
|
||||||
|
|
||||||
|
```
|
||||||
|
0A(key=01110000) → off=0x46 ← event 1 real start
|
||||||
|
1F → key=011121F2
|
||||||
|
0A(key=011121F2) → off=0x2C ← event 1 END / event 2 boundary
|
||||||
|
1F → key=01112238
|
||||||
|
0A(key=01112238) → off=0x46 ← event 2 real start (= boundary + 0x46)
|
||||||
|
1F → key=0111417E
|
||||||
|
0A(key=0111417E) → off=0x2C ← event 2 END / next-empty marker
|
||||||
|
1F → null sentinel
|
||||||
|
```
|
||||||
|
|
||||||
|
This is why event 2's first 5A chunk is at boundary key + 0x46 (`0x21F2 + 0x46 = 0x2238`): that's the address of the
|
||||||
|
"real start" 0x46-record, distinct from the `0x2C`-record at the raw boundary. Use the
|
||||||
|
`0x46` keys as the input to `read_bulk_waveform_stream`, not the `0x2C` keys.
|
||||||
|
|
||||||
|
For event 1 only (start_key[2:4] = 0x0000) BW probes at counter=0x0000 directly, which is
|
||||||
|
the `0x46`-keyed start record. Subsequent events probe at the `0x46` key that 1F returns (boundary key + 0x46); do not add `0x46` again.
|
||||||
|
|
||||||
|
**Practical iteration pattern (replaces the old 1E/1F walk for downloads):**
|
||||||
|
|
||||||
|
```
|
||||||
|
Setup: SERIAL × 2 → CHCFG → 1E (token=0x00) → key0
|
||||||
|
For each event:
|
||||||
|
0A(cur_key) → DATA_LENGTH = 0x46 (real) or 0x2C (boundary)
|
||||||
|
1F (token=0x00) → next_key
|
||||||
|
if length was 0x46: → cur_key is a real event; queue it for download
|
||||||
|
cur_key = next_key
|
||||||
|
if next_key all-zero null sentinel: stop
|
||||||
|
|
||||||
|
Then for each queued real-event key:
|
||||||
|
download_event(key) → 5A bulk stream with STRT-bounded chunk walk
|
||||||
|
```
|
||||||
|
|
||||||
|
This is what BW does in the 5-1-26 "Download All" capture — it walks the full event chain
|
||||||
|
collecting `(key, length)` tuples first, *then* downloads each event using the `0x46` keys.
|
||||||
|
|
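The collect-then-download pattern above can be sketched as follows. Everything here is illustrative: `collect_real_event_keys` is a hypothetical name, the two callables stand in for the 0A and 1F exchanges (their real signatures in the client are not shown in this doc), the null sentinel is modelled as `None`, and the `max_records` guard is an added safety assumption.

```python
from typing import Callable, Optional

WAVEHDR_REAL_EVENT = 0x46  # DATA_LENGTH for a real event start key
WAVEHDR_BOUNDARY = 0x2C    # DATA_LENGTH for a boundary marker


def collect_real_event_keys(
    first_key: bytes,
    wavehdr_length: Callable[[bytes], int],
    advance: Callable[[], Optional[bytes]],
    max_records: int = 1024,
) -> list[bytes]:
    """Walk the 0A/1F chain and return only the off=0x46 keys.

    wavehdr_length(key) models 0A and returns DATA_LENGTH.
    advance() models 1F and returns the next key, or None at the sentinel.
    """
    queued = []
    cur_key: Optional[bytes] = first_key
    for _ in range(max_records):
        if cur_key is None:
            break  # null sentinel reached
        length = wavehdr_length(cur_key)
        next_key = advance()
        if length == WAVEHDR_REAL_EVENT:
            queued.append(cur_key)  # real event: queue for 5A download
        cur_key = next_key
    return queued
```

Replaying the 5-1-26 "Download All" sequence through this walk queues `01110000` and `01112238` and skips the two `0x2C` boundary keys.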
||||||
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
|
### SUB 1A — compliance config — orphaned send bug (FIXED, do not re-introduce)
|
||||||
|
|
||||||
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
|
`read_compliance_config()` sends a 4-frame sequence (A, B, C, D) where:
|
||||||
@@ -347,36 +825,6 @@ Do NOT use fixed absolute offsets for sample_rate or record_time.
|
|||||||
Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to
|
Quiet Mode enabled. Parser handles this — do not strip it manually before feeding to
|
||||||
`S3FrameParser`.
|
`S3FrameParser`.
|
||||||
|
|
||||||
**SUB 5A (bulk waveform) TCP frame splitting — confirmed 2026-04-27:**
|
|
||||||
|
|
||||||
Over TCP via cellular modem, each 5A chunk request that produces a single ~1100-byte
|
|
||||||
A5 response over direct RS-232 may arrive as **two separate, complete S3 frames** of
|
|
||||||
~550 bytes each ("2-frame mode"). The modem's Data Forwarding Timeout (~100-150 ms)
|
|
||||||
can split the RS-232 response into two TCP segments, each parsed as a complete S3 frame.
|
|
||||||
Under different modem/timing conditions the full ~1100-byte response arrives as **one
|
|
||||||
S3 frame** ("1-frame mode").
|
|
||||||
|
|
||||||
**Both modes require `extra_chunks_after_metadata=1`** (the extra chunk at metadata_counter
|
|
||||||
+ 0x0400). The device's waveform footer data lives at circular-buffer address 0x1C00 for
|
|
||||||
this event; the terminator frame must be sent at 0x1C00 (not 0x1800) to receive it.
|
|
||||||
|
|
||||||
Example for a 2-second Continuous event (BE11529, key=01110000) via TCP:
|
|
||||||
- **2-frame mode:** 1 probe frame (554 B) + 5 chunks × 2 frames (556-573 B) + 1 extra chunk × 2 frames + 1 terminator (208 B) = **14 A5 frames** → 6864-byte file
|
|
||||||
- **1-frame mode:** 1 probe frame (~1097 B) + 5 chunks × 1 frame (~1079-1113 B) + 1 extra chunk × 1 frame (smaller, tail of event) + 1 terminator → **8 A5 frames** → 6864-byte file
|
|
||||||
- All frames contribute body data; using all of them gives the correct file.
|
|
||||||
|
|
||||||
**Fix (confirmed 2026-04-27):** `_recv_5a_batch()` in `protocol.py` collects ALL
|
|
||||||
A5 frames per chunk request before the next request is sent, using a 0.5 s batch
|
|
||||||
timeout after the first frame to catch the ~150 ms delayed second frame. `write_blastware_file()`
|
|
||||||
includes ALL body frames without skipping — the extra chunk's frames are part of the
|
|
||||||
body data, NOT padding to be discarded.
|
|
||||||
|
|
||||||
**WRONG earlier hypothesis (do not re-introduce):** An attempt was made to auto-detect
|
|
||||||
1-frame vs 2-frame mode from the probe frame size and skip the extra chunk when
|
|
||||||
`probe_data_len >= 700`. This was wrong — the extra chunk is always needed to advance
|
|
||||||
the device's internal state to the footer address. The `_probe_is_large` branch was
|
|
||||||
removed 2026-04-27.
|
|
||||||
|
|
||||||
### Required ACEmanager settings (Sierra Wireless RV50/RV55)
|
### Required ACEmanager settings (Sierra Wireless RV50/RV55)
|
||||||
|
|
||||||
| Setting | Value | Why |
|
| Setting | Value | Why |
|
||||||
@@ -557,6 +1005,8 @@ All DB endpoints are read-only except `PATCH /db/events/{id}/false_trigger`.
|
|||||||
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
|
| 3-11-26 | `bridges/captures/3-11-26/` | Full compliance setup write, Aux Trigger capture |
|
||||||
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
|
| 3-31-26 | `bridges/captures/3-31-26/` | Complete event download cycle (148 BW / 147 S3 frames) — confirmed 1E/0A/0C/1F sequence; only 1 event stored so token=0xFE appeared to work |
|
||||||
| 4-3-26 | `bridges/captures/4-3-26/` | Browse-mode S3 capture with 2+ events — confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |
|
| 4-3-26 | `bridges/captures/4-3-26/` | Browse-mode S3 capture with 2+ events — confirmed all-zero params for 1F, 1F response layout, null sentinel, 0A context requirement |
|
||||||
|
| 4-27-26 | `bridges/captures/4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof that SFM was over-reading 5× past event end. BW reads 14 chunks at 0x0200 increments + TERM at end_offset; SFM was reading 37 chunks at 0x0400 increments. STRT end_key field located. |
|
||||||
|
| 5-1-26 | `bridges/captures/5-1-26/comcheck/` | Three sub-captures: SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945`/`_171216`). Confirmed: TERM frame formula across 3 events; metadata pages 0x1002/0x1004 are global (read once per session); event-1 vs event-N chunk-pattern split; WAVEHDR length 0x46 vs 0x2C disambiguates real events from boundaries. |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -820,7 +1270,7 @@ offsets in the raw 1A/E5 payload. Only fields with `✅` have confirmed offsets
|
|||||||
|
|
||||||
**Notes tab:**
|
**Notes tab:**
|
||||||
- Enable User Notes (bool)
|
- Enable User Notes (bool)
|
||||||
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from A5 frame 7 via 5A)
|
- Project, Client, User Name, Seis Loc (ASCII strings) ✅ (sourced from 5A metadata pages at counter 0x1002 / 0x1004 — see "SUB 5A — fixed metadata pages" section)
|
||||||
- Enable Extended Notes (bool); Extended Notes text; Extended Notes Title
|
- Enable Extended Notes (bool); Extended Notes text; Extended Notes Title
|
||||||
- Enable Job Number (bool); Job Number (int)
|
- Enable Job Number (bool); Job Number (int)
|
||||||
- Enable Scaled Distance (bool); Distance from Blast (float); Charge Weight (float) — Scaled Distance is derived
|
- Enable Scaled Distance (bool); Distance from Blast (float); Charge Weight (float) — Scaled Distance is derived
|
||||||
@@ -1132,9 +1582,11 @@ body) because writing a dial string may require DLE escaping for embedded contro
|
|||||||
|
|
||||||
## What's next
|
## What's next
|
||||||
|
|
||||||
|
**See [README.md → Roadmap (Future)](README.md#roadmap-future) for the canonical deferred-work list.** This section is kept as a status log of in-progress / recently-shipped technical details (encoding schemes, byte layouts, etc.) that are too low-level for the README's roadmap.
|
||||||
|
|
||||||
- **Database** — SQLite store for events + monitor log entries; dedup by key; queryable
|
- **Database** — SQLite store for events + monitor log entries; dedup by key; queryable
|
||||||
- **Histograms** — decode histogram-mode A5 data (noise floor tracking)
|
- **Histograms** — decode histogram-mode A5 data (noise floor tracking)
|
||||||
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed working for Continuous mode events (2026-04-23):** SFM-generated file opens in Blastware, shows correct PPV/waveform/timestamp. File is ~200 bytes shorter than BW (missing last ADC tail slice) — all measurements correct. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that create spurious STRT markers in the body). Extension mapping: **CONFIRMED FALSE 2026-04-21** — extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
|
- **Blastware-compatible file output** — `write_blastware_file()` and `write_mlg()` implemented. `blastware_filename()` generates correct Blastware filenames (AB0 for direct, AB0W/AB0H for ACH). **Confirmed BYTE-PERFECT against BW reference (v0.14.3, 2026-05-05):** when fed the BW 5-1-26 3-sec capture's A5 frames, the SFM-built file matches BW's saved `M529LKIQ.G10` byte-for-byte (8708 bytes, 0 differences). Live SFM downloads of event 0 (3-sec) and event 1 (3-sec continuation) both open cleanly in Blastware with full Event Reports, frequency analysis, and waveform plots. Body assembly is just contiguous concatenation of frame contributions in stream order (probe → meta@0x1002 → meta@0x1004 → samples → TERM); no stripping, no overlay, no special handling. Histogram+Continuous mode deferred (5A stream for those events embeds histogram interval records that may need different handling — untested under v0.14.x). Extension mapping: extensions encode timestamp (AB0T for ACH, AB0 for direct), NOT recording mode. Filename format: `<prefix_letter><serial3><4-char-base36-stem><ext>`
|
||||||
|
|
||||||
**Serial encoding (CONFIRMED 2026-04-22):** `prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))`, `serial3 = f"{serial_numeric % 1000:03d}"`. Examples: BE6907→H907, BE11529→M529, BE14036→P036, BE17353→S353, BE18003→T003. The prefix letter encodes the production generation (batch of 1000 units).
|
**Serial encoding (CONFIRMED 2026-04-22):** `prefix_letter = chr(ord('B') + floor(serial_numeric / 1000))`, `serial3 = f"{serial_numeric % 1000:03d}"`. Examples: BE6907→H907, BE11529→M529, BE14036→P036, BE17353→S353, BE18003→T003. The prefix letter encodes the production generation (batch of 1000 units).
|
||||||
|
|
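The serial-encoding rule can be sketched directly from the formula. The function name `serial_filename_prefix` is hypothetical (it is not `blastware_filename()`), and the sketch assumes serials are written as `BE` followed by decimal digits, as in the examples. Behaviour past the letter `Z` is not documented, so the sketch rejects it.

```python
def serial_filename_prefix(serial: str) -> str:
    """Return prefix_letter + serial3 for a serial such as "BE11529"."""
    digits = serial[2:]
    if not (serial.startswith("BE") and digits.isascii() and digits.isdigit()):
        raise ValueError(f"unexpected serial format: {serial!r}")
    serial_numeric = int(digits)
    # One letter per batch of 1000 units, starting at 'B'.
    letter = chr(ord("B") + serial_numeric // 1000)
    if letter > "Z":
        # Assumption: the scheme beyond 'Z' is unknown.
        raise ValueError(f"serial out of documented range: {serial!r}")
    return f"{letter}{serial_numeric % 1000:03d}"
```

This reproduces the confirmed examples, for instance `BE11529` maps to `M529` and `BE14036` maps to `P036`.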
||||||
@@ -1170,16 +1622,21 @@ body) because writing a dial string may require DLE escaping for embedded contro
|
|||||||
|
|
||||||
| Folder / File | Contents |
|
| Folder / File | Contents |
|
||||||
|---|---|
|
|---|---|
|
||||||
|
| `1-2-26/` | First SUB 5A BW TX capture — established 5A frame format (raw offset_hi, DLE-aware checksum). 10 frames verified. |
|
||||||
| `3-11-26/raw_bw_20260311_170151.bin` | Full compliance write + event download (SUBs 68→83 confirmed, frames 102–112) |
|
| `3-11-26/raw_bw_20260311_170151.bin` | Full compliance write + event download (SUBs 68→83 confirmed, frames 102–112) |
|
||||||
|
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) — 1E/0A/0C/1F sequence confirmed (single event so token=0xFE appeared to work in either branch) |
|
||||||
|
| `4-2-26/` | Download-mode BW TX capture — POLL×3 requirement confirmed (frames 68-73 between 1F and first 5A) |
|
||||||
|
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events — all-zero params for 1F, null sentinel layout, 0A context requirement |
|
||||||
|
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
|
||||||
|
| `4-11-26 (mitm/ach_mitm_20260411_001912/)` | Full ACH call-home MITM — erase protocol (0xA3/0x06/0xA2), monitor log partial records confirmed |
|
||||||
| `4-20-26/raw_bw_*_recording_mode_*.bin` | Recording mode changes: Continuous→Single Shot, →Histogram, →Histogram+Continuous |
|
| `4-20-26/raw_bw_*_recording_mode_*.bin` | Recording mode changes: Continuous→Single Shot, →Histogram, →Histogram+Continuous |
|
||||||
| `4-20-26/histogram interval/` | Histogram interval changes: 1min, 5min, 15min, 15sec |
|
| `4-20-26/histogram interval/` | Histogram interval changes: 1min, 5min, 15min, 15sec |
|
||||||
| `4-20-26/geo sensitivity/` | Geo sensitivity changes: 1.25 in/s (Sensitive), 10 in/s (Normal) |
|
| `4-20-26/geo sensitivity/` | Geo sensitivity changes: 1.25 in/s (Sensitive), 10 in/s (Normal) |
|
||||||
| `4-20-26/call home settings/` | Call home config read/write captures |
|
| `4-20-26/call home settings/` | Call home config read/write captures |
|
||||||
| `4-8-26/` | Monitor status read, start/stop monitoring, SESSION_RESET signal, sensor check |
|
| `4-27-26/` | BW "open 2sec waveform" + "copy event to disk" + paired SFM "seismo_dl" — first proof of 5× SFM over-read. STRT end_key field located. |
|
||||||
| `4-3-26-multi_event/` | Browse-mode S3 capture with 2+ events (1E/0A/1F iteration confirmed) |
|
| **`5-1-26/comcheck/`** | **Triplet of captures that nailed the v0.14.0 walk:** SFM 3-sec download (`seismo_dl_…`), BW comms-check + 3-sec download (`bwcap3sec/`), BW second-event download + "Download All" (`raw_*_170945` / `_171216`). Confirmed: TERM frame formula across 3 events, metadata pages 0x1002/0x1004 are global session metadata, event-1 vs event-N chunk pattern split, WAVEHDR off=0x46 vs 0x2C disambiguates real events from boundaries. |
|
||||||
| `4-2-26/` | Download-mode BW TX capture (5A bulk stream, POLL×3 requirement confirmed) |
|
| **`5-1-26/comcheck/bwcap3sec/`** | **The byte-perfect reference for v0.14.3.** All 17 BW 5A request frames (probe, 2 metadata, 13 samples, TERM) reproduce byte-for-byte from SFM's framing helpers — including the `10 10 00` DLE-stuffed counter for sample @ 0x1000 that was the long-standing failure mode. |
|
||||||
| `3-31-26/` | Single-event download (148 BW / 147 S3 frames) |
|
| `5-4-26/` | BW MITM captures of "copy 3sec / 2sec / Download All" + paired SFM session (`seismo_dl_20260504_145701`) showing the +0x46 event-N probe bug producing 110-chunk runaway walk. Cross-references against 5-1-26 confirmed device behavior is identical. |
|
||||||
| `mitm/ach_mitm_20260411_001912/` | Full ACH call-home MITM (erase protocol, 0xA3/0x06/0xA2 confirmed) |
|
|
||||||
|
|
||||||
To parse BW TX captures: use `bridges/captures/` scripts or adapt the `find_write_frames()` pattern
in `/tmp/analyze_write_payload.py`, which correctly handles `0x10 0x03` DLE-escaped ETX bytes.
|
||||||
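The capture notes above mention a `10 10 00` DLE-stuffed counter for the sample at `0x1000`. A minimal sketch of textbook DLE byte stuffing reproduces that byte pattern. It assumes every literal `0x10` is doubled and that the counter is big-endian; the function names are hypothetical and are not the project's framing helpers. The README also describes only *partial* stuffing in 5A params, which this sketch does not model.

```python
DLE = 0x10  # Data Link Escape byte


def dle_stuff(payload: bytes) -> bytes:
    """Double every literal DLE byte so it cannot be mistaken for a control prefix."""
    out = bytearray()
    for byte in payload:
        out.append(byte)
        if byte == DLE:
            out.append(DLE)
    return bytes(out)


def dle_unstuff(stuffed: bytes) -> bytes:
    """Collapse each DLE DLE pair back into a single literal DLE byte."""
    out = bytearray()
    i = 0
    while i < len(stuffed):
        byte = stuffed[i]
        out.append(byte)
        # Skip the second half of a doubled DLE; a lone DLE is left untouched.
        if byte == DLE and i + 1 < len(stuffed) and stuffed[i + 1] == DLE:
            i += 1
        i += 1
    return bytes(out)


# Counter 0x1000 as two big-endian bytes (byte order is an assumption).
counter = (0x1000).to_bytes(2, "big")
stuffed_counter = dle_stuff(counter)
```

Here `stuffed_counter.hex(" ")` gives `10 10 00`, matching the sequence named in the capture table.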
|
|||||||
+31
@@ -0,0 +1,31 @@
FROM python:3.11-slim

WORKDIR /app

# tzdata is required for the TZ env var to take effect (python:slim
# omits the timezone database). Without it, datetime.now() / logging
# / matplotlib all stay in UTC regardless of TZ. Default zone gets
# set further down via ENV; users override per-deployment via the
# `TZ` env var in docker-compose.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl tzdata && \
    rm -rf /var/lib/apt/lists/*

# Default display timezone: applied to server logs, datetime.now(),
# matplotlib rendered timestamps, and any naïve-vs-aware datetime
# conversions in the PDF renderer. Override via TZ env var in
# docker-compose; storage in the DB is always UTC regardless.
ENV TZ=America/New_York

COPY pyproject.toml requirements.txt ./
COPY minimateplus ./minimateplus
COPY micromate ./micromate
COPY sfm ./sfm
COPY bridges ./bridges
COPY scripts ./scripts

RUN pip install --no-cache-dir -e .

EXPOSE 8200

CMD ["python", "-m", "uvicorn", "sfm.server:app", "--host", "0.0.0.0", "--port", "8200"]
||||||
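The Dockerfile comments state that storage is always UTC while display follows the `TZ` env var. A minimal sketch of that split, using only the standard library, might look like the following. The helper names `display_zone` and `to_display` are hypothetical, and falling back to UTC when no timezone database is present is an assumption about desirable behavior, not something the project is known to do.

```python
import os
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def display_zone(name=None):
    """Resolve the display zone from an explicit name or the TZ env var."""
    name = name or os.environ.get("TZ") or "UTC"
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # No usable tz database entry (e.g. a slim image without tzdata):
        # stay in UTC rather than fail.
        return timezone.utc


def to_display(stored_utc, name=None):
    """Convert a UTC timestamp from storage into the display zone."""
    if stored_utc.tzinfo is None:
        # Naive values are treated as UTC, matching "storage is always UTC".
        stored_utc = stored_utc.replace(tzinfo=timezone.utc)
    return stored_utc.astimezone(display_zone(name))


stored = datetime(2026, 5, 28, 12, 0, tzinfo=timezone.utc)
shown = to_display(stored, "America/New_York")
```

The conversion changes only the presentation: `shown` and `stored` still compare equal because they denote the same instant.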
@@ -1,7 +1,11 @@
|
|||||||
# seismo-relay `v0.12.1`
|
# seismo-relay `v0.21.0`
|
||||||
|
|
||||||
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
|
A ground-up replacement for **Blastware** — Instantel's aging Windows-only
|
||||||
software for managing MiniMate Plus seismographs.
|
software for managing seismographs. Supports both the **MiniMate Plus
|
||||||
|
(Series III)** and the **Micromate (Series IV / "Thor")** families:
|
||||||
|
Series III via the live RS-232 / TCP wire protocol *and* Blastware ACH file
|
||||||
|
ingest; Series IV via Thor IDF file ingest (TXT sidecars plus, since v0.21.0,
the binary `.IDFW` / `.IDFH` codec).
|
||||||
|
|
||||||
Built in Python. Runs on Windows, Linux, or macOS. Connects to instruments
|
Built in Python. Runs on Windows, Linux, or macOS. Connects to instruments
|
||||||
over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
||||||
@@ -10,6 +14,46 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
|||||||
> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
|
> pipeline working end-to-end over TCP/cellular. ACH Auto Call Home server
|
||||||
> handles inbound unit connections, downloads events, and persists everything
|
> handles inbound unit connections, downloads events, and persists everything
|
||||||
> to a SQLite database. SFM REST API exposes device control and DB queries.
|
> to a SQLite database. SFM REST API exposes device control and DB queries.
|
||||||
|
> **As of v0.14.3 (2026-05-05): SUB 5A bulk waveform protocol is verified
|
||||||
|
> byte-perfect against Blastware captures across 2-sec, 3-sec, and 10-sec
|
||||||
|
> events.** Generated `.G10` / `.AB0` files open cleanly in Blastware with
|
||||||
|
> full Event Reports, frequency analysis, and waveform plots.
|
||||||
|
> **v0.16.0 (2026-05-11)** adds BW ASCII report ingestion to
|
||||||
|
> `/db/import/blastware_file` — paired with **series3-watcher v1.5.0**,
|
||||||
|
> every Blastware ACH event lands in SeismoDb with device-authoritative
|
||||||
|
> peaks, project metadata, sensor self-check, and ZC/Time-of-Peak data,
|
||||||
|
> without depending on the still-undecoded waveform body codec.
|
||||||
|
> **v0.18.0 (2026-05-19)** adds Thor / Micromate Series IV ingest at
|
||||||
|
> `/db/import/idf_file` — paired with **thor-watcher v0.3.0**, every
|
||||||
|
> `.IDFH` / `.IDFW` event file (plus its `.txt` sidecar) lands in
|
||||||
|
> SeismoDb the same way BW events do. See
|
||||||
|
> [`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for
|
||||||
|
> the IDF format reference and reverse-engineering plan.
|
||||||
|
> **v0.19.0 (2026-05-20)** separates Series III and Series IV at the
|
||||||
|
> code level: new `micromate/` package alongside `minimateplus/`, new
|
||||||
|
> `events.device_family` DB column ("series3" / "series4") so the UI
|
||||||
|
> and storage layer dispatch deterministically instead of sniffing
|
||||||
|
> filenames. Self-applying migration backfills existing rows from the
|
||||||
|
> binary filename extension.
|
||||||
|
> **v0.20.0 (2026-05-28)** closes out the Event-Report PDF iteration
|
||||||
|
> started in v0.17.x: histogram layouts render correctly against BW
|
||||||
|
> reference PDFs, the ASCII parser handles real-world edge cases
|
||||||
|
> (`OORANGE`, `>100 Hz`, histogram timestamps), and per-channel ZC
|
||||||
|
> Freq is surfaced in both modals (event browser + main webapp).
|
||||||
|
> Adds a server-wide `TZ` env var so operator-visible timestamps
|
||||||
|
> render in local time instead of UTC. New
|
||||||
|
> `scripts/backfill_sidecars.py --reparse-txt` lets parser fixes be
|
||||||
|
> applied retroactively to existing events without re-forwarding,
|
||||||
|
> using the `.TXT` files preserved at ingest time.
|
||||||
|
> **v0.21.0 (2026-05-29)** is the Thor / Series IV decoder release —
|
||||||
|
> `micromate/idf_file.read_idf_file()` now decodes both IDFW
|
||||||
|
> (waveform) and IDFH (histogram) binaries (87–99% sample fidelity
|
||||||
|
> on quiet IDFW events; all 859 IDFH corpus files decode cleanly).
|
||||||
|
> A new `micromate/idf_to_bw_report.py` adapter projects parsed
|
||||||
|
> Thor reports into the BW-shaped sidecar block, so Thor events
|
||||||
|
> flow through the existing Event Report PDF pipeline without a
|
||||||
|
> separate renderer. Terra-View v0.13.0 ships in parallel and
|
||||||
|
> closes Phase 1 of the SFM integration — see its CHANGELOG.
|
||||||
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
|
> See [CHANGELOG.md](CHANGELOG.md) for full version history.
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -18,26 +62,36 @@ over direct RS-232 or cellular modem (Sierra Wireless RV50 / RV55).
|
|||||||
|
|
||||||
```
|
```
|
||||||
seismo-relay/
|
seismo-relay/
|
||||||
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Console tabs)
|
├── seismo_lab.py ← Main GUI (Bridge + Analyzer + Download + Console tabs)
|
||||||
│
|
│
|
||||||
├── minimateplus/ ← MiniMate Plus client library
|
├── minimateplus/ ← Series III (MiniMate Plus) client library
|
||||||
│ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport
|
│ ├── transport.py ← SerialTransport, TcpTransport, SocketTransport
|
||||||
│ ├── protocol.py ← DLE frame layer, SUB command dispatch
|
│ ├── protocol.py ← DLE frame layer, SUB command dispatch
|
||||||
│ ├── client.py ← High-level client (connect, get_events, push_config, …)
|
│ ├── client.py ← High-level client (connect, get_events, delete_all_events, push_config, get_call_home_config, …)
|
||||||
│ ├── framing.py ← Frame builders, DLE codec, S3FrameParser
|
│ ├── framing.py ← Frame builders, DLE codec, S3FrameParser
|
||||||
│ └── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, …
|
│ ├── models.py ← DeviceInfo, Event, ComplianceConfig, MonitorLogEntry, CallHomeConfig, …
|
||||||
|
│ ├── bw_ascii_report.py ← Parse BW per-event ASCII reports (.TXT sidecars)
|
||||||
|
│ ├── event_file_io.py ← Read BW binaries, write .sfm.json sidecars
|
||||||
|
│ └── blastware_file.py ← Write events to Blastware-compatible .AB0 files
|
||||||
|
│
|
||||||
|
├── micromate/ ← Series IV (Micromate / Thor) client library (NEW v0.19)
|
||||||
|
│ ├── models.py ← IdfEvent, IdfReport, IdfPeaks, IdfProjectInfo, IdfSensorCheck (mic in native dB(L))
|
||||||
|
│ ├── idf_ascii_report.py ← Parse Thor .IDFW.txt / .IDFH.txt event sidecars
|
||||||
|
│ ├── idf_file.py ← Binary codec for .IDFW + .IDFH (v0.21.0+)
|
||||||
|
│ └── idf_to_bw_report.py ← Adapter projecting Thor IDF into the BW report shape (v0.21.0+)
|
||||||
│
|
│
|
||||||
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
|
├── sfm/ ← SFM REST API server (FastAPI, port 8200)
|
||||||
│ ├── server.py ← All device + DB endpoints
|
│ ├── server.py ← Live device endpoints + DB query + ingest endpoints + caching
|
||||||
│ ├── database.py ← SeismoDb — SQLite persistence layer
|
│ ├── database.py ← SeismoDb — SQLite persistence (events, monitor_log, ach_sessions)
|
||||||
│ └── sfm_webapp.html ← Embedded web UI (served at /)
|
│ ├── waveform_store.py ← On-disk store for BW + IDF event binaries + .sfm.json sidecars
|
||||||
|
│ └── sfm_webapp.html ← Embedded web UI with Call Home config tab
|
||||||
│
|
│
|
||||||
├── bridges/
|
├── bridges/
|
||||||
│ ├── ach_server.py ← Inbound ACH call-home server (main production server)
|
│ ├── ach_server.py ← Inbound ACH call-home server (main production server)
|
||||||
│ ├── ach_mitm.py ← Transparent MITM proxy for capturing BW sessions
|
│ ├── ach_mitm.py ← Transparent MITM proxy for capturing BW sessions
|
||||||
│ ├── s3-bridge/ ← RS-232 serial bridge (capture tool)
|
│ ├── s3-bridge/ ← RS-232 serial bridge (capture tool)
|
||||||
│ ├── tcp_serial_bridge.py ← Local TCP↔serial bridge (bench testing)
|
│ ├── tcp_serial_bridge.py ← Local TCP↔serial bridge (bench testing)
|
||||||
│ ├── gui_bridge.py ← Standalone bridge GUI
|
│ ├── gui_bridge.py ← Standalone bridge GUI with raw capture checkboxes
|
||||||
│ └── raw_capture.py ← Simple raw capture tool
|
│ └── raw_capture.py ← Simple raw capture tool
|
||||||
│
|
│
|
||||||
├── parsers/
|
├── parsers/
|
||||||
@@ -46,7 +100,8 @@ seismo-relay/
|
|||||||
│ └── frame_db.py ← SQLite frame database
|
│ └── frame_db.py ← SQLite frame database
|
||||||
│
|
│
|
||||||
└── docs/
|
└── docs/
|
||||||
└── instantel_protocol_reference.md ← Reverse-engineered protocol spec
|
├── instantel_protocol_reference.md ← Series III protocol spec (the Rosetta Stone)
|
||||||
|
└── idf_protocol_reference.md ← Series IV (Thor IDF) format reference + codec RE plan
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -101,21 +156,28 @@ python seismo_lab.py
|
|||||||
Each call dials the device, does its work, and closes the connection. TCP
|
Each call dials the device, does its work, and closes the connection. TCP
|
||||||
connections are retried once on `ProtocolError` to handle cold-boot timing.
|
connections are retried once on `ProtocolError` to handle cold-boot timing.
|
||||||
|
|
||||||
**Caching** — frequently-polled endpoints are cached in-process to avoid
|
**In-memory caching** — frequently-polled endpoints avoid redundant TCP round-trips
|
||||||
redundant TCP round-trips:
|
via a thread-safe `_LiveCache` (plain Python dict + `threading.Lock`):
|
||||||
|
|
||||||
| Method | URL | Cache |
|
| Method | URL | Cache Strategy |
|
||||||
|--------|-----|-------|
|
|--------|-----|---|
|
||||||
| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
|
| `GET` | `/device/info` | Indefinite; invalidated by `POST /device/config` |
|
||||||
| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
|
| `GET` | `/device/events` | Count-probe fast path (~2s); full download only when new events detected |
|
||||||
| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
|
| `GET` | `/device/event/{idx}/waveform` | Permanent per event index |
|
||||||
| `GET` | `/device/monitor/status` | 30-second TTL |
|
| `GET` | `/device/monitor/status` | 30-second TTL; invalidated by monitor start/stop |
|
||||||
|
| `GET` | `/device/call_home` | Fresh read from device (not cached) |
|
||||||
| `POST` | `/device/connect` | — |
|
| `POST` | `/device/connect` | — |
|
||||||
| `POST` | `/device/config` | Writes compliance config; invalidates cache |
|
| `POST` | `/device/config` | Writes compliance config; invalidates info + events cache |
|
||||||
| `POST` | `/device/monitor/start` | Sends SUB 0x96 |
|
| `POST` | `/device/config/project` | Patches project/client/operator/sensor_location strings |
|
||||||
| `POST` | `/device/monitor/stop` | Sends SUB 0x97 |
|
| `POST` | `/device/monitor/start` | Sends SUB 0x96; immediately evicts status cache |
|
||||||
|
| `POST` | `/device/monitor/stop` | Sends SUB 0x97; immediately evicts status cache |
|
||||||
|
| `POST` | `/device/call_home` | Reads, patches specified fields, writes back to device |
|
||||||
|
|
||||||
All cached endpoints accept `?force=true` to bypass the cache.
|
**Cache bypass** — All cached endpoints accept `?force=true` to skip the cache and
|
||||||
|
force a fresh read from the device.
|
||||||
|
|
||||||
|
**Cache stats** — `GET /cache/stats` returns hit/miss counts and TTL info; `DELETE /cache/device`
|
||||||
|
clears the device cache immediately.
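The caching behavior described above (a dict guarded by a `threading.Lock`, per-entry TTLs, explicit invalidation, `?force=true` bypass) can be sketched roughly as follows. This is an illustration under assumptions, not the actual `_LiveCache` source: the class name, method names, and the injectable clock are all hypothetical.

```python
import threading
import time


class LiveCacheSketch:
    """Thread-safe in-process cache with optional per-entry TTL (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()
        self._entries = {}  # key -> (value, expires_at or None for "indefinite")
        self._clock = clock
        self.hits = 0
        self.misses = 0

    def get(self, key, force=False):
        """Return the cached value, or None on a miss, an expired entry, or force=True."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and not force:
                value, expires_at = entry
                if expires_at is None or self._clock() < expires_at:
                    self.hits += 1
                    return value
                del self._entries[key]  # expired: drop it
            self.misses += 1
            return None

    def put(self, key, value, ttl=None):
        """Store a value; ttl=None keeps it until it is invalidated."""
        expires_at = None if ttl is None else self._clock() + ttl
        with self._lock:
            self._entries[key] = (value, expires_at)

    def invalidate(self, *keys):
        """Evict the named keys, or everything when called without arguments."""
        with self._lock:
            if keys:
                for key in keys:
                    self._entries.pop(key, None)
            else:
                self._entries.clear()
```

In this sketch a monitor start/stop handler would call `invalidate("monitor_status")`, and a config write would call `invalidate("info", "events")`. Because `None` doubles as the miss marker, a real implementation that needs to cache `None` values would use a sentinel instead.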
|
||||||
|
|
||||||
Transport query params (supply one set):
|
Transport query params (supply one set):
|
||||||
```
|
```
|
||||||
@@ -131,11 +193,23 @@ Query the SQLite database written by `ach_server.py`. All read-only except
|
|||||||
| Method | URL | Description |
|
| Method | URL | Description |
|
||||||
|--------|-----|-------------|
|
|--------|-----|-------------|
|
||||||
| `GET` | `/db/units` | All known serials with summary stats |
|
| `GET` | `/db/units` | All known serials with summary stats |
|
||||||
| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger) |
|
| `GET` | `/db/events` | Triggered events (filter by serial, date range, false_trigger). Response rows include `device_family` ("series3" / "series4") so clients dispatch on unit type without sniffing filenames. |
|
||||||
| `GET` | `/db/monitor_log` | Monitoring intervals |
|
| `GET` | `/db/monitor_log` | Monitoring intervals |
|
||||||
| `GET` | `/db/sessions` | ACH call-home session history |
|
| `GET` | `/db/sessions` | ACH call-home session history |
|
||||||
| `PATCH` | `/db/events/{id}/false_trigger?value=true` | Flag / unflag false triggers |
|
| `PATCH` | `/db/events/{id}/false_trigger?value=true` | Flag / unflag false triggers |
|
||||||
|
|
||||||
|
### File ingest endpoints
|
||||||
|
|
||||||
|
Used by watcher daemons to push field-collected event files into the SFM DB
and waveform store. Both endpoints accept multipart uploads of binary event
files, optionally paired with their ASCII sidecar reports; both dedup by
`(serial, timestamp)` and UPSERT device-authoritative fields on re-import.
|
||||||
|
|
||||||
|
| Method | URL | Description |
|
||||||
|
|--------|-----|-------------|
|
||||||
|
| `POST` | `/db/import/blastware_file` | Series III: `.AB0*` / `.N00` binaries + paired `_ASCII.TXT`. Source: `series3-watcher`. |
|
||||||
|
| `POST` | `/db/import/idf_file` | Series IV: `.IDFH` / `.IDFW` binaries + paired `.IDFW.txt` / `.IDFH.txt`. Source: `thor-watcher`. |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## minimateplus library
|
## minimateplus library
|
||||||
@@ -152,21 +226,33 @@ client = MiniMateClient(transport=TcpTransport("1.2.3.4", 12345), timeout=30.0)
|
|||||||
|
|
||||||
with client:
|
with client:
|
||||||
# Read
|
# Read
|
||||||
info = client.connect() # DeviceInfo — serial, firmware, compliance config
|
info = client.connect() # DeviceInfo — serial, firmware, compliance config
|
||||||
count = client.count_events() # Number of stored events
|
count = client.count_events() # Number of stored events
|
||||||
keys = client.list_event_keys() # Fast browse walk — event keys only, no download
|
keys = client.list_event_keys() # Fast browse walk — event keys only, no download
|
||||||
events = client.get_events() # Full download: headers + peaks + metadata
|
events = client.get_events() # Full download: headers + peaks + metadata
|
||||||
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
|
monitor = client.get_monitor_status() # Battery, memory, is_monitoring flag
|
||||||
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
|
log = client.get_monitor_log_entries() # Monitoring intervals (partial 0x2C records)
|
||||||
|
ach_cfg = client.get_call_home_config() # Auto Call Home settings (SUB 0x2C)
|
||||||
|
|
||||||
# Write
|
# Write
|
||||||
client.apply_config(
|
client.apply_config(
|
||||||
sample_rate=1024,
|
sample_rate=1024,
|
||||||
|
recording_mode="Continuous", # Single Shot / Continuous / Histogram / Histogram+Continuous
|
||||||
|
histogram_interval_sec=15, # 2, 5, 15, 60, 300, 900
|
||||||
trigger_level_geo=0.5,
|
trigger_level_geo=0.5,
|
||||||
|
geo_range="Normal", # Normal (10.000 in/s) / Sensitive (1.25 in/s)
|
||||||
project="Bridge Inspection 2026",
|
project="Bridge Inspection 2026",
|
||||||
client_name="City of Portland",
|
client_name="City of Portland",
|
||||||
operator="B. Harrison",
|
operator="B. Harrison",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
client.set_call_home_config(
|
||||||
|
auto_call_home_enabled=True,
|
||||||
|
after_event_recorded=True,
|
||||||
|
at_specified_times=True,
|
||||||
|
time1_hour=18, time1_min=30, # 6:30 PM
|
||||||
|
time2_hour=6, time2_min=0, # 6:00 AM
|
||||||
|
)
|
||||||
|
|
||||||
# Control
|
# Control
|
||||||
client.start_monitoring() # SUB 0x96
|
client.start_monitoring() # SUB 0x96
|
||||||
@@ -174,26 +260,88 @@ with client:
|
|||||||
client.delete_all_events() # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
|
client.delete_all_events() # Erase all (SUB 0xA3 → 0x1C → 0x06 → 0xA2)
|
||||||
```
|
```
|
||||||
|
|
||||||
`get_events()` runs the full per-event sequence: `1E → 0A → 0C → 5A → 1F`.
|
`get_events()` runs the full per-event sequence:
|
||||||
SUB 5A bulk stream provides `client`, `operator`, and `sensor_location` as they
|
`1E → 0A → 1E(arm token=0xFE) → 0C → 1F(arm) → POLL×3 → 5A → 1F(browse)`.
|
||||||
existed at record time — not backfilled from the current compliance config.
|
SUB 5A bulk stream walks chunks bounded by the `end_offset` extracted from
|
||||||
|
the STRT record at byte 17 of the probe response — no over-reading, no
|
||||||
|
chunk-count cap. Project / client / operator / sensor location strings come
|
||||||
|
from the dedicated metadata pages at counter `0x1002` and `0x1004`,
|
||||||
|
read once per session (they reflect the compliance setup at session start,
|
||||||
|
not per individual event).
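The bounded walk described above can be pictured as a loop that stops at `end_offset` instead of at a fixed chunk count. The sketch below is a generic illustration only: the function name, the starting counter, and the chunk size are hypothetical placeholders, not values taken from the protocol.

```python
def chunk_counters(start, end_offset, chunk_size):
    """Yield the counter of each chunk to request, stopping at end_offset.

    The walk is bounded by the end offset rather than by a chunk-count cap,
    so it can neither stop early on long events nor run past the event.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    counter = start
    while counter < end_offset:
        yield counter
        counter += chunk_size


# Hypothetical numbers, chosen only to show the shape of the walk.
requests = list(chunk_counters(start=0x0000, end_offset=0x1000, chunk_size=0x0400))
```

With these placeholder numbers the walk issues four requests and then stops, however many chunks the device would be willing to serve.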
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## micromate library
|
||||||
|
|
||||||
|
Series IV / Thor support, sibling to `minimateplus`. Currently scoped to
offline file ingest: Thor's TXT sidecar reports plus, as of v0.21.0, the
binary `.IDFW` / `.IDFH` event files. The live-device protocol is not yet
supported.
|
||||||
|
|
||||||
|
```python
|
||||||
|
from micromate import IdfEvent, parse_idf_report
|
||||||
|
|
||||||
|
# Parse a .IDFW.txt / .IDFH.txt sidecar (1014 example files round-trip cleanly)
|
||||||
|
text = open("UM11719_20231219162723.IDFW.txt").read()
|
||||||
|
report_dict = parse_idf_report(text) # permissive dict
|
||||||
|
|
||||||
|
# Wrap into a typed event using the device-native binary filename
|
||||||
|
event = IdfEvent.from_report(report_dict, "UM11719_20231219162723.IDFW")
|
||||||
|
|
||||||
|
event.serial # "UM11719"
|
||||||
|
event.kind # "Waveform" or "Histogram"
|
||||||
|
event.peaks.transverse_ips # 0.0251 (in/s, native unit)
|
||||||
|
event.peaks.mic_pspl_dbl # 99.4 (dB(L), Thor's native mic unit — NOT psi)
|
||||||
|
event.project_info.project # "UPMC Presby-Loc 3-Level1-1R Elevator Rm"
|
||||||
|
event.sensor_check.tran # True (passed self-check)
|
||||||
|
event.firmware_version # "Micromate ISEE 11.0AK"
|
||||||
|
event.calibration_text # "November 22, 2023 by Instantel"
|
||||||
|
|
||||||
|
# Bridge to the existing minimateplus.Event shape for the DB / sidecar paths
|
||||||
|
# (waveform_key is a 16-byte sha256 prefix when ingesting from a binary file)
|
||||||
|
bridged_event = event.to_minimateplus_event(waveform_key=b"\x00" * 16)
|
||||||
|
```
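The example above passes the device-native filename alongside the parsed report. As a rough illustration of what such a filename carries, the sketch below splits it into serial, timestamp, and kind. It assumes the pattern `<serial>_<YYYYMMDDHHMMSS>.<IDFW|IDFH>` inferred from the single example name shown here; `parse_idf_filename` is a hypothetical helper and is not part of the `micromate` package.

```python
from datetime import datetime
from pathlib import PurePath

KIND_BY_SUFFIX = {".IDFW": "Waveform", ".IDFH": "Histogram"}


def parse_idf_filename(filename):
    """Split an IDF event filename into (serial, timestamp, kind)."""
    path = PurePath(filename)
    suffix = path.suffix.upper()
    if suffix == ".TXT":
        # Sidecar name such as X.IDFW.txt: strip ".txt" and inspect the inner suffix.
        path = PurePath(path.stem)
        suffix = path.suffix.upper()
    if suffix not in KIND_BY_SUFFIX:
        raise ValueError(f"not an IDF event file: {filename!r}")
    serial, _, stamp = path.stem.partition("_")
    # Assumed layout: 14 digits, year through second.
    timestamp = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return serial, timestamp, KIND_BY_SUFFIX[suffix]


serial, recorded_at, kind = parse_idf_filename("UM11719_20231219162723.IDFW")
```

The same helper accepts the `.IDFW.txt` / `.IDFH.txt` sidecar names, so a binary and its report resolve to the same `(serial, timestamp)` pair.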
|
||||||
|
|
||||||
|
The binary codec for the `.IDFW` / `.IDFH` event files themselves landed in
v0.21.0 as `read_idf_file()` in `micromate/idf_file.py`. See
[`docs/idf_protocol_reference.md`](docs/idf_protocol_reference.md) for the
IDF format reference and the two observed file signatures.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Database
|
## Database
|
||||||
|
|
||||||
`ach_server.py` writes to `bridges/captures/seismo_relay.db` (SQLite, WAL mode).
|
`ach_server.py` and the file-ingest endpoints write to
|
||||||
Three tables, all unit-keyed by serial number:
|
`bridges/captures/seismo_relay.db` (SQLite, WAL mode) via the `SeismoDb`
|
||||||
|
persistence layer. Three tables, all unit-keyed by serial number:
|
||||||
|
|
||||||
| Table | Key | Contents |
|
| Table | Key | Contents |
|
||||||
|-------|-----|----------|
|
|-------|-----|----------|
|
||||||
| `ach_sessions` | UUID | Per-call-home audit record: serial, peer IP, events_downloaded, duration |
|
| `ach_sessions` | UUID | Per-call-home audit record: serial, timestamp, peer IP, events_downloaded, monitor_entries, duration_seconds |
|
||||||
| `events` | UUID, UNIQUE(serial, waveform_key) | Triggered events: timestamp, PPV per channel, project/client/operator strings, false_trigger flag |
|
| `events` | UUID, UNIQUE(serial, timestamp) | Triggered events: timestamp, Tran/Vert/Long/VectorSum/Mic PPV, project/client/operator/sensor_location strings, sample_rate, record_type, false_trigger flag, **`device_family`** ("series3" / "series4"), `blastware_filename` (binary at-rest in `waveforms/`), sidecar references |
|
||||||
| `monitor_log` | UUID, UNIQUE(serial, waveform_key) | Monitoring intervals: start/stop time, duration, geo threshold |
|
| `monitor_log` | UUID, UNIQUE(serial, start_time) | Monitoring intervals: serial, waveform_key, start_time, stop_time, duration_seconds, geo_threshold_ips |
|
||||||
|
|
||||||
Deduplication is by `(serial, waveform_key)` — repeat call-homes or re-runs
|
**Deduplication is by `(serial, timestamp)`** — the device clock is the
|
||||||
never produce duplicate rows. Post-erase key reuse is handled automatically
|
stable natural key. Repeat call-homes or re-runs UPSERT the row in place,
|
||||||
via the high-water mark in `ach_state.json`.
|
refreshing every device-authoritative field (peaks, project strings,
|
||||||
|
sample_rate, file references) so the latest writer wins. `false_trigger`
|
||||||
|
and `device_family` are preserved across UPSERTs. Earlier versions used
|
||||||
|
`(serial, waveform_key)` for dedup, but the device's event-key counter
|
||||||
|
resets to `0x01110000` after every erase, so timestamps are the correct
|
||||||
|
dedup field. Migration handles the transition transparently on first
|
||||||
|
startup.
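The dedup rule above maps naturally onto SQLite's `ON CONFLICT ... DO UPDATE` clause. The sketch below uses an in-memory database and a deliberately reduced, hypothetical column set (`tran_ppv`, `project`); it is not the real `SeismoDb` schema. It shows device-authoritative fields being refreshed while `false_trigger` and `device_family` are left alone.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE events (
    id            TEXT PRIMARY KEY,
    serial        TEXT NOT NULL,
    timestamp     TEXT NOT NULL,
    tran_ppv      REAL,
    project       TEXT,
    false_trigger INTEGER NOT NULL DEFAULT 0,
    device_family TEXT NOT NULL,
    UNIQUE (serial, timestamp)
)
"""

# Only device-authoritative columns appear in the DO UPDATE list, so the
# operator's false_trigger flag and the row's device_family survive re-import.
UPSERT = """
INSERT INTO events (id, serial, timestamp, tran_ppv, project, device_family)
VALUES (:id, :serial, :timestamp, :tran_ppv, :project, :device_family)
ON CONFLICT (serial, timestamp) DO UPDATE SET
    tran_ppv = excluded.tran_ppv,
    project  = excluded.project
"""


def upsert_event(conn, serial, timestamp, tran_ppv, project, device_family):
    """Insert a new event or refresh the existing row with the same natural key."""
    conn.execute(UPSERT, {
        "id": str(uuid.uuid4()),
        "serial": serial,
        "timestamp": timestamp,
        "tran_ppv": tran_ppv,
        "project": project,
        "device_family": device_family,
    })
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
upsert_event(conn, "BE12345", "2026-05-01T17:09:45Z", 0.10, "Bridge", "series3")
conn.execute("UPDATE events SET false_trigger = 1")  # operator review
upsert_event(conn, "BE12345", "2026-05-01T17:09:45Z", 0.12, "Bridge Inspection", "series3")
rows = conn.execute(
    "SELECT tran_ppv, project, false_trigger, device_family FROM events"
).fetchall()
```

After the second import there is still a single row: the peak and project string reflect the latest writer, and the operator's `false_trigger` flag is intact.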
|
||||||
|
|
||||||
|
**`device_family` (added v0.19.0)** discriminates Series III from Series
|
||||||
|
IV at the SQL level. Set by every import path; the UI dispatches on it
|
||||||
|
to render mic units correctly (Series III: psi → dBL conversion; Series
|
||||||
|
IV: native dBL passthrough). Existing rows are backfilled at first
|
||||||
|
startup of v0.19.0+ by sniffing the binary filename extension.
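A sketch of the two ideas in this paragraph, extension-based backfill and family-dependent mic rendering, is shown below. The function names are hypothetical. The psi-to-dB(L) conversion uses the standard sound-pressure-level definition with a 20 µPa reference; whether the project uses exactly these constants is an assumption.

```python
import math
from pathlib import PurePath

SERIES4_SUFFIXES = {".IDFW", ".IDFH"}
PSI_TO_PA = 6894.757  # pascals per psi
P_REF_PA = 20e-6      # 20 micropascal reference pressure for dB(L)


def family_from_filename(filename):
    """Backfill rule: IDF extensions mean Series IV, anything else Series III."""
    suffix = PurePath(filename).suffix.upper()
    return "series4" if suffix in SERIES4_SUFFIXES else "series3"


def mic_dbl(value, device_family):
    """Return the mic peak in dB(L) for display.

    Series IV values are already dB(L) and pass through unchanged;
    Series III values are taken to be psi and are converted.
    """
    if device_family == "series4":
        return value
    if value <= 0:
        return None  # log10 is undefined for non-positive pressure
    return 20.0 * math.log10(value * PSI_TO_PA / P_REF_PA)


family = family_from_filename("UM11719_20231219162723.IDFW")
thor_mic = mic_dbl(99.4, family)
bw_mic = mic_dbl(0.001, "series3")
```

Dispatching on the stored column keeps the unit decision in one place, so display code never has to inspect filenames itself.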
|
||||||
|
|
||||||
|
The on-disk waveform store lives at `bridges/captures/waveforms/<serial>/`
|
||||||
|
and holds the original event binaries (BW `.AB0*` / `.N00` for Series III,
|
||||||
|
`.IDFH` / `.IDFW` for Series IV) plus their `.sfm.json` review/metadata
|
||||||
|
sidecars. Series III events also produce `.a5.pkl` source-frame pickles
|
||||||
|
and `.h5` clean-waveform exports; Series IV doesn't yet (pending codec).
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -231,6 +379,27 @@ Full protocol documentation: [`docs/instantel_protocol_reference.md`](docs/insta
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Compliance Config Features
|
||||||
|
|
||||||
|
The REST API and web UI expose full control over device compliance settings:
|
||||||
|
|
||||||
|
- **Recording Mode** (Single Shot / Continuous / Histogram / Histogram+Continuous)
|
||||||
|
- **Sample Rate** (1024 / 2048 / 4096 sps)
|
||||||
|
- **Record Time** (float, seconds)
|
||||||
|
- **Histogram Interval** (2s, 5s, 15s, 1m, 5m, 15m) — when recording mode includes histogram
|
||||||
|
- **Geo Trigger Levels** (float, in/s per channel)
|
||||||
|
- **Geo Maximum Range** (Normal 10.000 in/s / Sensitive 1.250 in/s per channel)
|
||||||
|
- **Project / Client / Operator / Sensor Location** (ASCII strings)
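The allowed values listed above lend themselves to a small validation step before a config is sent to the device. The sketch below is a hypothetical helper, not the project's `ComplianceConfig` code. The value sets are copied from this README, and the histogram interval is expressed in seconds to match the `histogram_interval_sec` parameter shown earlier.

```python
RECORDING_MODES = ("Single Shot", "Continuous", "Histogram", "Histogram+Continuous")
SAMPLE_RATES_SPS = (1024, 2048, 4096)
HISTOGRAM_INTERVALS_SEC = (2, 5, 15, 60, 300, 900)  # 2s, 5s, 15s, 1m, 5m, 15m
GEO_RANGES = ("Normal", "Sensitive")


def validate_compliance(recording_mode, sample_rate, geo_range,
                        histogram_interval_sec=None):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if recording_mode not in RECORDING_MODES:
        problems.append(f"unknown recording mode: {recording_mode!r}")
    if sample_rate not in SAMPLE_RATES_SPS:
        problems.append(f"unsupported sample rate: {sample_rate!r}")
    if geo_range not in GEO_RANGES:
        problems.append(f"unknown geo range: {geo_range!r}")
    # The interval only matters when the mode includes histogram recording.
    if "Histogram" in str(recording_mode):
        if histogram_interval_sec not in HISTOGRAM_INTERVALS_SEC:
            problems.append(
                f"unsupported histogram interval: {histogram_interval_sec!r}"
            )
    return problems


ok = validate_compliance("Histogram+Continuous", 2048, "Sensitive", 15)
bad = validate_compliance("Histogram", 1024, "Normal")
```

Returning a list rather than raising on the first error lets a UI report every invalid field in one pass.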
|
||||||
|
|
||||||
|
Auto Call Home config:
|
||||||
|
- **Auto Call Home Enable** (bool)
|
||||||
|
- **Dial String** (read-only; 40-byte ASCII)
|
||||||
|
- **Trigger on Event** (bool)
|
||||||
|
- **Scheduled Call-Ins** (two time slots with HH:MM each)
|
||||||
|
- **Retry Settings** (count, delay, connection timeout, warm-up time)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Requirements
|
## Requirements
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
@@ -252,17 +421,171 @@ Use **com0com** or **VSPD** to create the virtual COM pair on Windows.
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Roadmap
|
## Key Features
|
||||||
|
|
||||||
- [x] Full read pipeline — device info, compliance config, event download with true event-time metadata
|
**Series III (MiniMate Plus) device support:**
|
||||||
- [x] Write commands — push compliance config, trigger thresholds, project strings to device
|
- [x] Full read/write/erase pipelines over RS-232 or TCP/cellular
|
||||||
- [x] Erase all events — confirmed erase sequence from live MITM capture
|
- [x] Compliance config (recording mode, sample rate, histogram interval, geo sensitivity, project strings)
|
||||||
- [x] Monitor control — start/stop monitoring, read battery/memory/status
|
- [x] Auto Call Home config (read/write ACH settings, dial string, time slots, retries)
|
||||||
- [x] Monitor log entries — decode partial 0x2C records (continuous monitoring intervals)
|
- [x] Monitor control (start/stop, status polling, battery/memory)
|
||||||
- [x] ACH inbound server — accept call-home connections, download events, dedup by key
|
- [x] Monitor log entries (continuous monitoring intervals without full waveform download)
|
||||||
- [x] SQLite persistence — events, monitor log, and session history in `seismo_relay.db`
|
- [x] Blastware file ingest at `/db/import/blastware_file` (paired with `series3-watcher`)
|
||||||
- [x] SFM REST API — device control + DB query endpoints, live device cache
|
|
||||||
- [ ] Terra-view integration — seismo-relay router, unit detail page, VISON-style event listing
|
**Series IV (Micromate / Thor) device support:**
|
||||||
- [ ] Vibration summary reports — highest legit PPV per project → Word doc (false trigger filtering first)
|
- [x] Thor IDF file ingest at `/db/import/idf_file` (paired with `thor-watcher`, v0.18.0+)
|
||||||
- [ ] Compliance config encoder — build raw write payloads from a `ComplianceConfig` object
|
- [x] Native `IdfEvent` / `IdfReport` typed models — mic in dB(L), full title strings, sensor self-check, calibration, firmware version
|
||||||
- [ ] Modem manager — push RV50/RV55 configs via Sierra Wireless API
|
- [x] Parser verified against 1,014 paired `.txt` sidecars in `thor-watcher/example-data/`
|
||||||
|
- [x] Binary `.IDFW` / `.IDFH` codec — ✅ v0.21.0. IDFW reuses `decode_waveform_v2()` on the body at offset `0x0f1f` (87–99% sample fidelity on quiet events); IDFH has a dedicated segment-based decoder (all 859 corpus files decode, 181,071 intervals total). See `micromate/idf_file.py` + `docs/idf_protocol_reference.md`.
|
||||||
|
- [ ] Live-device protocol — pending codec
|
||||||
|
|
||||||
|
**Data persistence:**
|
||||||
|
- [x] SQLite database (`seismo_relay.db`) with `events`, `monitor_log`, `ach_sessions` tables
|
||||||
|
- [x] Per-row `device_family` column ("series3" / "series4") for clean UI / unit-of-measurement dispatch (v0.19.0+)
|
||||||
|
- [x] Deduplication by `(serial, timestamp)` — natural key handles post-erase counter resets
|
||||||
|
- [x] UPSERT on re-import refreshes every device-authoritative field (peaks, project, sample_rate); preserves operator review state (`false_trigger`)
|
||||||
|
- [x] Post-erase key-reuse detection (tracks high-water mark in `ach_state.json`)
|
||||||
|
|
||||||
|
**REST API:**
|
||||||
|
- [x] Live device endpoints with in-memory caching (`_LiveCache`)
|
||||||
|
- [x] Cache statistics (`/cache/stats`) and manual invalidation (`/cache/device`)
|
||||||
|
- [x] DB query endpoints (units, events, monitor_log, sessions, false_trigger PATCH)
|
||||||
|
- [x] Call Home config read/write endpoints
|
||||||
|
- [x] Blastware file download endpoint (`/device/event/{index}/blastware_file`)
|
||||||
|
- [x] Import endpoints for both device families (`/db/import/blastware_file`, `/db/import/idf_file`)
|
||||||
|
|
||||||
|
**File output (v0.7+, byte-perfect as of v0.14.3):**
|
||||||
|
- [x] Blastware-compatible `.AB0` / `.G10` file generation (waveform + metadata)
|
||||||
|
- [x] Multi-channel waveform decode from SUB 5A bulk stream
|
||||||
|
- [x] Second-resolution timestamp encoding in Blastware filename
|
||||||
|
- [x] **Byte-perfect against BW reference captures** (verified across 2-sec / 3-sec / 10-sec event durations, both event 0 and event N continuation events)
|
||||||
|
- [x] STRT-bounded chunk walk + correct event-N probe counter + partial DLE stuffing of `0x10` in 5A params (the four fixes that landed in v0.14.0–v0.14.3)
|
||||||
|
|
||||||
|
**Capture tools:**
|
||||||
|
- [x] Serial-to-TCP bridge with raw BW/S3 capture (s3_bridge.py, defaults to auto-capture)
|
||||||
|
- [x] GUI bridge with raw capture checkboxes (gui_bridge.py)
|
||||||
|
- [x] ACH inbound server with bidirectional capture (ach_server.py saves raw_tx + raw_rx)
|
||||||
|
- [x] Transparent TCP MITM proxy for live BW session capture (ach_mitm.py)
|
||||||
|
|
||||||
|
**Analysis tools:**
|
||||||
|
- [x] s3_analyzer.py — session parser, frame differ, Claude export
|
||||||
|
- [x] gui_analyzer.py — standalone analyzer GUI
|
||||||
|
- [x] frame_db.py — SQLite frame database for capture analysis
|
||||||
|
|
||||||
|
**seismo_lab.py GUI:**
|
||||||
|
- [x] Bridge tab — Serial/TCP mode selector with raw capture options
|
||||||
|
- [x] Analyzer tab — BW/S3 capture playback and differencing
|
||||||
|
- [x] Download tab — Live wire-byte capture during event download
|
||||||
|
- [x] Console tab — Logging and diagnostics
|
||||||
|
|
||||||
|
## Roadmap (Future)

### Strategic direction — where this is going

seismo-relay is being built as a **suite of cooperating components**
that together replace and improve on Blastware's role. Three logical
tiers:

1. **SFM** (device-side) — owns the active connection to a physical
   unit. Today: `minimateplus/`, `/device/*` HTTP endpoints,
   `seismo_lab.py`. Future: live Thor / Micromate support.
2. **SDM** (data-side) — owns the database, waveform store, ingest
   pipelines, and the read-API that Terra-View consumes. Today this
   code lives under `sfm/` for historical reasons; the role has
   migrated and the eventual rename is on the long-tail cleanup list.
3. **Codec library** — pure data-interpretation: `minimateplus/*_codec.py`,
   `bw_ascii_report.py`, `micromate/idf_*.py`. Used by both SFM and
   SDM, depends on neither.

Terra-View is downstream of SDM for fleet listings, event detail, etc.
The long-term vision adds a **second link** from Terra-View → SFM for
direct device interaction (see below).

The codec work in this repo isn't trying to replace BW's network
layer — BW's ACH file forwarding and Thor's IDF call-home are
battle-tested. The value is in the receiving and processing side: turn
the stream of binary+ASCII pairs into something users can search,
filter, alert on, and report from.

### Terra-View ↔ SFM device control (the long-term vision)

Today Terra-View only reads from SDM (event listings, dashboards,
project reports). When a unit goes missing — operator notices in the
Terra-View dashboard — there's no way to *do* anything from the UI.
The path of least resistance is to RDP into a Windows box and open
Blastware, which defeats the purpose of having Terra-View.

Target experience:
- Operator notices a unit in Terra-View dashboard hasn't called in.
- Clicks unit detail → "Connect to Device" button.
- Terra-View opens an embedded view (modal or side-panel) that talks
  to SFM's `/device/*` endpoints over the network.
- Live view: device clock, battery, memory, current monitor status.
- Actions: start/stop monitoring, push compliance config changes, pull
  fresh events, run a sensor self-check, change call-home settings.
- Audit log: every connect / action recorded in SDM for the unit
  history.

Implementation steps (concrete):
- [ ] **SFM authentication & authorization layer.** Today `/device/*`
      endpoints are unauthenticated — anyone on the network can call
      them. Need at minimum a token-based auth, ideally with a "who
      can connect to which units" mapping. Hard prerequisite for
      letting Terra-View users into the control surface.
- [ ] **Terra-View "Connect to Device" entry point** on the unit
      detail page. Renders only when unit has connection info on file
      and the user has permission.
- [ ] **Embedded live-monitor view** in Terra-View — equivalent to
      `seismo_lab.py`'s Bridge tab, but in the browser. Polls SFM's
      `/device/monitor/status` on an interval; sends start/stop via
      `/device/monitor/{start,stop}`.
- [ ] **Action history** — every connect / push / action call records
      a row in `unit_history`, viewable on the unit detail page.
- [ ] **Series IV live-device support in SFM** — currently `/device/*`
      only supports MiniMate Plus. Blocks "Connect to Device" for
      Thor units until done. Depends on Thor wire-protocol capture
      and a `micromate/` parallel of the `minimateplus/` modules.
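
The "who can connect to which units" mapping in the first step could take many shapes. The following is a minimal sketch under stated assumptions, not SFM's design: `TokenGrant`, `is_authorized`, the token strings, and the unit names are all hypothetical, and a real deployment would store hashed tokens rather than plaintext.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenGrant:
    """Hypothetical grant: one bearer token and the unit serials it may reach."""
    token: str
    allowed_serials: frozenset


def is_authorized(grants, presented_token, serial):
    """Return True only if the token is known AND mapped to this unit."""
    for grant in grants:
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(grant.token.encode(), presented_token.encode()):
            return serial in grant.allowed_serials
    return False


# Illustrative data only; none of these names exist in the repo.
grants = [
    TokenGrant("field-team-token", frozenset({"UNIT-A", "UNIT-B"})),
    TokenGrant("office-token", frozenset({"UNIT-C"})),
]
print(is_authorized(grants, "field-team-token", "UNIT-A"))
print(is_authorized(grants, "office-token", "UNIT-A"))
```

Keeping the check as a pure function of (token, serial) would let the same rule gate both the "Connect to Device" button and every `/device/*` call.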

### High-impact (unblocks product features)

- [ ] **Series III waveform body codec reverse-engineering.** The 5A bulk-stream body is some kind of compressed/encoded format (not raw int16 LE as previously assumed — see §7.6.1 retraction in `docs/instantel_protocol_reference.md`). Structural framing is ~50% decoded on branch `claude/codec-re-cBGNe` (tagged-block walker, segment counters); per-byte sample mapping is still open. Until this lands, the in-app waveform viewer renders garbage and BW-import peak values fall back to `_peaks_from_samples()` saturation noise. Workaround: pair every BW-imported event with its `_ASCII.TXT` so the device-authoritative peaks land in the DB regardless of codec.
- [x] **Series IV (Thor IDF) binary codec reverse-engineering.** ✅ v0.21.0 — `micromate/idf_file.read_idf_file()` decodes both IDFW (waveform body at offset `0x0f1f`, reusing `decode_waveform_v2()`; 87–99% sample fidelity on quiet events) and IDFH (dedicated segment-based decoder: all 859 corpus files decode, 181,071 intervals, peaks within ~1.8% of sidecar values). `WaveformStore.save_imported_idf` now also projects parsed Thor data into a `bw_report` block via `micromate/idf_to_bw_report.py` so Thor events render in the existing Event Report PDF pipeline without a separate renderer.
- [ ] **In-app waveform viewer accuracy.** Depends on Series III codec decode. Plot.v1 JSON pipeline + viewer skeleton already exist; will start showing real waveforms automatically once `_decode_a5_waveform` produces correct samples. Series IV waveforms come online when the IDF codec lands.
- [ ] **Series IV live-device support.** Once the IDF binary is decoded, extend `micromate/` with `transport.py` / `framing.py` / `protocol.py` / `client.py` mirroring the `minimateplus/` package layout — depends on capturing Thor's wire protocol (TCP / RS-232 captures TBD).
- [ ] **Terra-View integration** — seismo-relay router, unit detail page, VISON-style event listing.
- [ ] **Vibration summary reports** — highest legit PPV per project → Word doc (false-trigger filtering first).

### BW ASCII report parser enhancements (built in v0.16.0)

- [x] **PPV field misses on certain TXT formats.** ✅ v0.20.0 — root cause was the `OORANGE` (Out Of Range) saturation marker that BW writes when a channel exceeds its full-scale; `_parse_number()` returned None for the non-numeric value. Parser now substitutes `geo_range_ips` as a lower bound + sets `ppv_saturated` flag. All 5 prod events (T190LD5Q.LK0W, T438L713.RY0W, K557L3YM.OE0W, + 2 others) now parse cleanly.
- [x] **Histogram-specific structural fields.** ✅ v0.20.0 — `Histogram Start/Stop Time+Date`, `Number of Intervals`, `Interval Size`, per-channel `Peak Time` + `Peak Date`, and `Peak Vector Sum Date` all parse now. Land in the sidecar's `bw_report.histogram` block.
- [ ] **Histogram interval bin-table parsing.** Trailing 792-row table (per-interval Peak/Freq per channel + MicL) in histogram TXTs is unparsed. Probably too big for the sidecar JSON; may want a separate `.histogram.h5` companion file.
- [x] **`>100 Hz` value parsing.** ✅ v0.20.0 — parser now mirrors the OORANGE pattern: stores 100.0 on `zc_freq_hz` + sets `zc_freq_above_range` flag. PDF + both modals render `>100 Hz` instead of `—`.
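
The `OORANGE` and `>100 Hz` items above share one pattern: keep a usable numeric bound and record a flag saying the true value lies beyond it. A minimal sketch of that pattern follows. It is not the code in `bw_ascii_report.py`: the function names, the returned dict shape, the `"ppv"` key, and the exact raw strings accepted are assumptions for illustration.

```python
def parse_ppv(raw, geo_range_ips):
    """Parse a PPV cell; OORANGE becomes the geophone range plus a saturation flag."""
    text = raw.strip()
    if text.upper() == "OORANGE":
        # The channel exceeded full scale, so the range is only a lower bound.
        return {"ppv": geo_range_ips, "ppv_saturated": True}
    try:
        return {"ppv": float(text), "ppv_saturated": False}
    except ValueError:
        return {"ppv": None, "ppv_saturated": False}


def parse_zc_freq(raw):
    """Parse a zero-crossing frequency cell; '>100' becomes 100.0 plus a flag."""
    text = raw.strip()
    if text.lower().endswith("hz"):
        text = text[:-2].strip()
    if text == ">100":
        return {"zc_freq_hz": 100.0, "zc_freq_above_range": True}
    try:
        return {"zc_freq_hz": float(text), "zc_freq_above_range": False}
    except ValueError:
        return {"zc_freq_hz": None, "zc_freq_above_range": False}


print(parse_ppv("OORANGE", geo_range_ips=10.0))
print(parse_zc_freq(">100 Hz"))
```

Storing the flag next to the number lets renderers print `>100 Hz` or a saturation marker while queries can still sort and filter on a numeric column.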

### Ingestion gaps

- [ ] **MLG forwarding.** `series3-watcher` forwards event binaries + their `_ASCII.TXT` reports, but skips `.MLG` per-unit monitor log files entirely. Adding a `POST /db/import/mlg_file` endpoint + watcher scan path would populate `monitor_log` for non-ACH-routed units (coverage queries, "was this unit monitoring on date X" lookups).
- [ ] **0C-record raw bytes persistence in the sidecar.** Currently on branch `claude/codec-re-cBGNe` as commit `a187124`; cherry-pick if useful as a standalone fix. Preserves the 210-byte 0C record under `extensions.raw_records.waveform_record_b64` so future field-offset analysis (Peak Acceleration / Time of Peak / etc. — the fields BW computes client-side from samples) can run offline.

### Operational

- [ ] **`series3-watcher` file archive manager** — 90-day-old events moved to `<watch_folder>_archive/<year>/<month>/` subfolders. Plan drafted in `claude/codec-re-cBGNe`'s plan-mode session; awaiting a 5-minute test on whether Blastware UI walks subfolders before any code lands (determines layout: in-place subfolders vs sibling archive).
- [ ] **Compliance config encoder** — build raw write payloads from a `ComplianceConfig` object.
- [ ] **Modem manager** — push RV50/RV55 configs via Sierra Wireless API.
- [ ] **Call Home dial_string write support** (requires DLE escaping for embedded control characters).
- [ ] **Histogram mode recording support** (5A stream analysis for mode 0x03 — separate from histogram ASCII parsing above).
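
For the archive-manager item above, the sibling layout `<watch_folder>_archive/<year>/<month>/` can be sketched as a pure path computation. This is an illustration only: the layout is still undecided per that item, and `archive_destination`, its parameters, the zero-padded month, and the example paths are assumptions rather than anything in `series3-watcher`.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath


def archive_destination(watch_folder, event_file, event_time, now, max_age_days=90):
    """Return the archive path for an old event, or None if it is still recent."""
    if now - event_time < timedelta(days=max_age_days):
        return None
    # Sibling folder next to the watch folder: <watch_folder>_archive
    archive_root = watch_folder.with_name(watch_folder.name + "_archive")
    return archive_root / f"{event_time:%Y}" / f"{event_time:%m}" / event_file.name


watch = PurePosixPath("/data/bw_events")  # hypothetical watch folder
old = archive_destination(
    watch,
    watch / "T190LD5Q.LK0W",
    event_time=datetime(2026, 1, 5),
    now=datetime(2026, 5, 22),
)
print(old)
```

Separating the "where would it go" decision from the actual move makes it easy to dry-run the manager before it touches any files.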

### Test coverage

- [ ] Verify 30-sec event download — body may exceed `0xFFFF` and force the device into a different `end_key` encoding (none of the 2/3/10-sec test cases hit this boundary).
- [ ] Histogram mode (0x03) write via SFM — confirmed working for Single Shot / Continuous / Histogram+Continuous; Histogram (0x03) needs a live test from a non-Histogram starting state.

### Lower-priority cleanups

- [ ] Compliance write anchor-9 cleanup — when changing recording_mode via SFM, a spurious `0x10` may persist after Histogram→other mode transitions. Doesn't affect device operation but differs from BW's byte-perfect output.
- [ ] Locate "Sensor Check" byte in compliance config (need capture with Disabled vs Before-monitoring).
- [ ] Call Home — map time slots 3/4 offsets; confirm `modem_power_relay_enabled`.
- [ ] RV55 DCD/DTR — newer RV55 firmware doesn't assert DCD by default; units don't resume monitoring after call-home disconnect (`--restart-monitoring` flag deferred).
- [ ] **NULL-timestamp duplicate-row dedup.** A small handful of events (2 known on prod as of 2026-05-22) have `events.timestamp IS NULL` because the codec couldn't extract a timestamp from the binary footer. The `UNIQUE(serial, timestamp)` constraint doesn't fire on `NULL` (SQL semantics: `NULL ≠ NULL`), so every `--force` backfill INSERTs a new row instead of UPSERTing the existing one. Cleanup: a one-shot SQL query that keeps only the newest row per `(serial, blastware_filename)` and deletes the rest. Longer-term: extend the unique key to `(serial, COALESCE(timestamp, blastware_filename))` or reject inserts with NULL timestamp.
- [ ] **Histogram body sub-format with `byte[5] != 0`.** ~3 events on prod (`T190LD5Q.LD0H`, `O121L4L1.GU0H`) use a histogram body my walker doesn't recognize — the first block has `byte[5] = 0x01` or `0x07` instead of `0x00`, and the entire body lacks the `1e 0a 00 00` tail signature. Codec returns 0 valid blocks; their DB PVS comes from the bw_report ASCII overlay (which BW computed from the same binary, so the DB columns are correct). Only the `.h5` waveform plot is empty. Cracking the sub-format would unlock the plot. Needs binary+ASCII pairs from a few `byte[5]!=0` events; same RE approach as the K558 case.
- [ ] **Histogram body sub-format with `byte[5] == 0x00` but undecodable.** Observed 2026-05-28 on BE17353 (S353) events: `S353L4H2.FZ0H`, `S353L4H2.P00H`, `S353L4H3.7O0H`, `S353L4H3.E10H`. Body starts `00 00 00 01 0a 00 XX 00 ...` which LOOKS like a valid histogram block header (marker 0x000a at byte[4:6] ✓, byte[5]=0x00 normal-format ✓), but the walker finds zero data blocks across the whole body. Likely an extra header before the block stream OR a different tail signature than `1e 0a 00 00`. Smaller body lengths (1900-2100 bytes) suggest these may be short-recording histogram variants. Same operational impact as the byte[5]!=0 case: event ingests cleanly, DB peaks correct via bw_report overlay, only the chart is empty. Worth dumping a hex view of one body to diagnose.
- [ ] **Sensor-check waveform extraction from the BW binary.** BW's Event Report PDFs include a narrow panel on the right side of the waveform plot showing each channel's response to the sensor self-check signal (a damped sinusoid for geo, sawtooth-at-test-freq for mic). Our parser captures the test RESULTS (`test_freq_hz`, `test_ratio`, `test_amplitude_mv`, `test_results` pass/fail) and the PDF + modal display them as text — but BW's per-sample sensor-check waveform isn't accessible to us today. Two paths to add it: (a) RE the binary to find where the sensor-check samples are stored — could be a section before STRT, after the footer, or in a separate sub-record; protocol reference doesn't currently mention it. (b) If samples aren't in the binary, synthesize a representative waveform from the test parameters (damped sinusoid at `test_freq_hz` with damping from `test_ratio`). Path (a) is the honest answer; path (b) is decorative. Until either lands, the text-only sensor-check display in the report is fine.
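
The NULL-timestamp item above can be reproduced in an in-memory SQLite database. The sketch below assumes a simplified `events` table: the `id` column, "newest means highest `id`", the index name, and the sample rows are illustrative rather than the production schema. It shows the duplicates slipping past `UNIQUE(serial, timestamp)`, a one-shot cleanup limited to NULL-timestamp rows, and the `COALESCE`-based unique key as an expression index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,          -- assumed surrogate key
        serial TEXT NOT NULL,
        timestamp TEXT,
        blastware_filename TEXT,
        UNIQUE(serial, timestamp)
    )
""")

# Three "backfills" of the same NULL-timestamp event plus one normal event.
rows = [("BE17353", None, "S353L4H2.FZ0H")] * 3
rows.append(("BE17353", "2026-05-01T10:00:00", "S353L4H2.P00H"))
conn.executemany(
    "INSERT INTO events (serial, timestamp, blastware_filename) VALUES (?, ?, ?)",
    rows,
)
null_count = conn.execute(
    "SELECT COUNT(*) FROM events WHERE timestamp IS NULL"
).fetchone()[0]

# One-shot cleanup: keep the newest (highest id) NULL-timestamp row per file.
conn.execute("""
    DELETE FROM events
    WHERE timestamp IS NULL
      AND id NOT IN (
          SELECT MAX(id) FROM events
          WHERE timestamp IS NULL
          GROUP BY serial, blastware_filename
      )
""")
remaining = conn.execute(
    "SELECT id, serial, blastware_filename FROM events ORDER BY id"
).fetchall()

# Longer-term guard: unique key falls back to the filename when timestamp is NULL.
conn.execute("""
    CREATE UNIQUE INDEX events_serial_ts_or_file
    ON events (serial, COALESCE(timestamp, blastware_filename))
""")
try:
    conn.execute(
        "INSERT INTO events (serial, timestamp, blastware_filename) VALUES (?, ?, ?)",
        ("BE17353", None, "S353L4H2.FZ0H"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(null_count, remaining, rejected)
```

The cleanup has to run before the index is created, because SQLite refuses to build a unique index over rows that already collide.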
@@ -0,0 +1,66 @@
# analysis/ — exploratory scripts for waveform-body RE

**These are scratch.** Run them, read them, copy them, but don't trust
them as documentation. When a finding is verified it gets promoted
to `minimateplus/waveform_codec.py` and `tests/test_waveform_codec.py`;
when it's wrong it stays here as a fossil.

Authoritative status lives in:

- `docs/waveform_codec_re_status.md` (current truth, working note)
- `minimateplus/waveform_codec.py` (verified implementation + docstring)
- `tests/test_waveform_codec.py` (regression locks against fixtures)

---

## Still useful

| File | What it does |
|---|---|
| `load_bundle.py` | Fixture loader. Parses BW binary + ASCII TXT into a `Bundle` dataclass with samples, metadata, body bytes. Used by most other scripts here. |
| `verify_tran.py` | Verifies `decode_tran_initial` against fixture ground truth across all events. Useful when you change the decoder and want a quick sanity check. |
| `inspect_5_11.py` | Inspects the 5-11-26 high-amplitude bundle's body structure, prints metadata, peaks, and block counts. |
| `walk_5_11.py` | Walks blocks for the 5-11-26 bundle and prints offset/tag/length/data. |
| `seg1_blocks.py` | Dumps all blocks in segment 1 of each event. The starting point for cracking multi-segment Tran continuation. |
| `full_tran.py` | Multi-segment Tran decoder attempt (broken — diverges at sample ~512). Useful as a starting scaffold for the next experiment. |
| `multi_segment.py` | Earlier multi-segment attempt with different segment-header consumption strategies. Records what didn't work. |
| `test_rle.py` | Tests `00 NN` interpretation as zero-RLE with different divisor values. Documents how the RLE rule was confirmed. |

## Superseded — keep for archaeology

| File | Superseded by |
|---|---|
| `walk_v2.py` … `walk_v5.py` | `walk_v6.py` and ultimately `minimateplus/waveform_codec.walk_body`. Each version represents one round of refinement. Don't read in isolation — read the diff between them to see what was learned. |
| `walk_chunks.py` | `walk_v6.py` / production walker |
| `decode_v1.py` | First naive decoder attempt. Wrong but readable. |

## Pure exploration — read if curious

| File | What it explored |
|---|---|
| `inspect_body.py` | Byte-frequency stats per event. Established that bytes 0x00 / 0x10 dominate. |
| `find_blocks.py` | Searched for repeating 2-byte tag patterns. |
| `find_signal_runs.py` | Searched for stretches of bytes that "look like a smooth signal" (small inter-byte deltas). Found the `20 NN` literal blocks. |
| `dump_head.py`, `dump_trailer.py`, `dump_around.py` | Hex dumpers at various body positions. |
| `compare_cd.py` | Byte-diff between event-c and event-d (same length, similar signal). Used to identify structural vs data bytes. |
| `brute_force.py` | Tested 192 combinations (24 × 2 × 2 × 2) of channel-permutation × nibble-order × sign-convention × init-from-header on the quiet bundle. All failed because the quiet bundle had T[0]=T[1]=0, making the preamble undetectable. |
| `try_nibbles.py`, `try_layouts.py` | Earlier channel-interleaving hypotheses. All wrong. |
| `test_tran_continue.py` | Test of "Tran continues uninterrupted across `30 04` blocks" hypothesis. Disproven. |
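
For readers new to the `brute_force.py` row above, the nibble-order and sign-convention knobs amount to the following. This is a self-contained sketch that borrows the `s4` helper used by the scratch scripts; `nibbles` and `deltas` are illustrative names, and since every one of these hypotheses failed, it describes the search space, not the real codec.

```python
def s4(n):
    """Interpret a 4-bit value as signed two's complement (-8..7)."""
    return n if n < 8 else n - 16


def nibbles(byte, high_first=True):
    """Split one byte into its two 4-bit halves in the chosen order."""
    hi, lo = (byte >> 4) & 0xF, byte & 0xF
    return (hi, lo) if high_first else (lo, hi)


def deltas(data, high_first=True, signed=True):
    """Turn a byte string into a flat list of per-nibble deltas."""
    out = []
    for byte in data:
        for nib in nibbles(byte, high_first):
            out.append(s4(nib) if signed else nib)
    return out


sample = bytes([0x1F, 0x80])
print(deltas(sample))                    # high nibble first, signed
print(deltas(sample, high_first=False))  # low nibble first, signed
print(deltas(sample, signed=False))      # high nibble first, unsigned
```

Each knob doubles the number of configurations, which is why the search grows so quickly once channel permutations are added.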

---

## Adding new scripts

If you're picking up the codec work, feel free to add new scripts here.
Suggested conventions:

- Start the filename with what you're testing: `test_<hypothesis>.py`,
  `verify_<piece>.py`, `inspect_<region>.py`.
- Print enough output that the reader can see exactly which events
  match / diverge and where.
- When a finding is solid, move the verified logic to
  `minimateplus/waveform_codec.py` and add a regression test in
  `tests/test_waveform_codec.py` — don't leave the truth only in
  this directory.
- If a script is fully superseded, leave it in place (don't delete) —
  the fossil record is useful when re-evaluating hypotheses later.
@@ -0,0 +1,93 @@
"""Brute-force test channel permutations / nibble orders on event-d (simplest signal)."""
import sys
import itertools
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle
from minimateplus.waveform_codec import walk_body


def s4(n):
    return n if n < 8 else n - 16


def decode(body, channel_perm, nibble_order, sign_mode, init_from_header):
    """Try one decoder configuration on event-d. Returns up to the first 16 cumulative samples per channel."""
    blocks = walk_body(body)
    # Initial values from bytes [4:7] if init_from_header else 0
    if init_from_header:
        init = [body[4] if body[4] < 128 else body[4] - 256,
                body[5] if body[5] < 128 else body[5] - 256,
                body[6] if body[6] < 128 else body[6] - 256,
                0]
    else:
        init = [0, 0, 0, 0]
    cur = list(init)
    out = [[init[0]], [init[1]], [init[2]], [init[3]]]  # sample 0 = init
    nibble_idx = 0  # within delta stream; channel = channel_perm[nibble_idx % 4]

    # Walk only the 10 NN data blocks
    for blk in blocks:
        if blk.tag_hi != 0x10:
            continue
        for byte in blk.data:
            if nibble_order == 'high_first':
                nib1, nib2 = (byte >> 4) & 0xF, byte & 0xF
            else:
                nib1, nib2 = byte & 0xF, (byte >> 4) & 0xF
            for nib in (nib1, nib2):
                if sign_mode == 'signed':
                    delta = s4(nib)
                else:
                    delta = nib
                ch = channel_perm[nibble_idx % 4]
                cur[ch] += delta
                if (nibble_idx + 1) % 4 == 0:
                    out[0].append(cur[0])
                    out[1].append(cur[1])
                    out[2].append(cur[2])
                    out[3].append(cur[3])
                nibble_idx += 1
            if len(out[0]) >= 16:
                return out
    return out


def best_match(pred, truth, n=10):
    """Sum of squared differences in first n samples."""
    n = min(n, len(pred), len(truth))
    return sum((pred[i] - truth[i])**2 for i in range(n))


def main():
    b = load_bundle("event-d")
    # truth in 16-count units
    tr = {ch: [round(v * 200) for v in b.samples[ch]] for ch in ("Tran", "Vert", "Long")}

    print("Truth event-d first 10 samples:")
    for ch in ("Tran", "Vert", "Long"):
        print(f" {ch}: {tr[ch][:10]}")

    # Test all 192 combinations: 24 channel permutations x 2 nibble orders
    # x 2 sign modes x 2 init modes
    best = []
    for perm in itertools.permutations([0, 1, 2, 3]):
        for nibble_order in ('high_first', 'low_first'):
            for sign in ('signed', 'unsigned'):
                for init_h in (False, True):
                    decoded = decode(b.body, perm, nibble_order, sign, init_h)
                    # Score as TVL channel-sum
                    score = sum(
                        best_match(decoded[i], tr[ch], n=10)
                        for i, ch in enumerate(("Tran", "Vert", "Long"))
                        if i < 3
                    )
                    label = f"perm={perm} nib={nibble_order[:1]} sign={sign[:3]} init={init_h}"
                    best.append((score, label, decoded))

    best.sort(key=lambda x: x[0])
    print("\nTop 10 configurations:")
    for s, lbl, dec in best[:10]:
        print(f" score={s:>5} {lbl} T={dec[0][:8]} V={dec[1][:8]} L={dec[2][:8]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,42 @@
"""Compare event-c and event-d (same N_samples) to find header vs data bytes."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    bc = load_bundle("event-c")
    bd = load_bundle("event-d")

    # Compare prefixes
    nc, nd = len(bc.body), len(bd.body)
    n = min(nc, nd)
    diffs = []
    for i in range(n):
        if bc.body[i] != bd.body[i]:
            diffs.append(i)
    print(f"event-c body={nc}, event-d body={nd}")
    print(f"Total diffs (first {n}): {len(diffs)}")

    # Show common prefix
    same_prefix = 0
    for i in range(n):
        if bc.body[i] == bd.body[i]:
            same_prefix += 1
        else:
            break
    print(f"Common prefix length: {same_prefix}")
    print(f"event-c prefix: {bc.body[:same_prefix].hex(' ')}")

    # Look for runs of common bytes
    print(f"\nFirst 32 diff positions: {diffs[:32]}")

    # Show the "diff fingerprint" of the first 100 bytes
    print("\n pos c d")
    for i in range(min(100, nc)):
        # Look up event-d with a bounds check first, so a body shorter than
        # 100 bytes cannot raise IndexError when computing the marker.
        bd_b = bd.body[i] if i < nd else None
        marker = " " if bc.body[i] == bd_b else "*"
        print(f" {i:>3} {bc.body[i]:02x}{marker} {bd_b:02x}" if bd_b is not None else f" {i:>3} {bc.body[i]:02x}{marker}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,99 @@
"""
Decoder v1: nibble-pair signed deltas in 10 NN blocks, 4-channel round-robin.
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def s4(n):
    return n if n < 8 else n - 16


def walk_blocks(body, start):
    i = start
    blocks = []
    while i + 1 < len(body):
        t0, t1 = body[i], body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
            data = bytes(body[i + 2 : i + length])
            blocks.append(("10", t1, data))
            i += length
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
            data = bytes(body[i + 2 : i + length])
            blocks.append(("20", t1, data))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            blocks.append(("00", t1, b""))
            i += 2
        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
            length = t1 * 4
            data = bytes(body[i + 2 : i + length])
            blocks.append(("30", t1, data))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            length = 20
            data = bytes(body[i + 2 : i + length])
            blocks.append(("40", t1, data))
            i += length
        else:
            blocks.append(("??", t0, bytes(body[i:i+8])))
            break
    return blocks


def decode_v1(body, start, n_samples):
    """Decode by accumulating nibble-pair deltas from all 10 NN blocks."""
    blocks = walk_blocks(body, start)
    # 4 channels: T, V, L, M
    cur = [0, 0, 0, 0]
    out = [[], [], [], []]
    sample_index = 0  # how many sample-sets emitted

    for typ, NN, data in blocks:
        if typ == "10":
            # 2 nibbles per byte, round-robin TVLM
            for byte in data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    ch = sample_index % 4
                    cur[ch] += s4(nib)
                    out[ch].append(cur[ch])
                    sample_index = (sample_index + 1) // 4 * 4 + (sample_index + 1) % 4  # ?
                    sample_index += 1
                    # We emit per-nibble, but the structure is unclear
        elif typ == "20":
            # int8 absolute or delta?
            for byte in data:
                v = byte if byte < 128 else byte - 256
                ch = sample_index % 4
                cur[ch] = v  # treat as absolute
                out[ch].append(cur[ch])
                sample_index += 1
    return out


def main():
    b = load_bundle("event-c")
    body = b.body
    truth_T = [round(v * 200) for v in b.samples["Tran"]]
    truth_V = [round(v * 200) for v in b.samples["Vert"]]
    truth_L = [round(v * 200) for v in b.samples["Long"]]

    # Find start
    for s in range(15):
        if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
            start = s
            break
    else:
        # Without this guard `start` would be unbound below.
        raise SystemExit("no 10 NN block header found in the first 15 bytes")

    blocks = walk_blocks(body, start)
    # Print block-by-block what's in each
    print(f"Total blocks: {len(blocks)}")
    bytes_processed = 0
    for typ, NN, data in blocks[:30]:
        print(f" type={typ} NN=0x{NN:02x} data_len={len(data)} data_hex={data[:32].hex(' ')}{'...' if len(data) > 32 else ''}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,27 @@
"""Dump body bytes around a specific offset."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def dump_around(name: str, center: int, radius: int = 96):
    b = load_bundle(name)
    body = b.body
    start = max(0, center - radius)
    end = min(len(body), center + radius)
    print(f"\n=== {name} body[{start}:{end}] (full body={len(body)}) ===")
    for i in range(start, end, 32):
        row = body[i:i+32]
        marker = " <-- center" if i <= center < i+32 else ""
        print(f" +{i:>5} {row.hex(' ')}{marker}")


def main():
    # Look at the trailer transitions
    trailer_starts = {"event-a": 7047, "event-b": 6475, "event-c": 4043, "event-d": 3941}
    for name, off in trailer_starts.items():
        dump_around(name, off, 96)


if __name__ == "__main__":
    main()
@@ -0,0 +1,18 @@
"""Dump the START of each body in 32-byte rows."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} body[0:512] (full body={len(body)}, samples={len(b.samples['Tran'])}) ===")
        for i in range(0, min(512, len(body)), 32):
            row = body[i:i+32]
            print(f" +{i:>5} {row.hex(' ')}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,24 @@
"""Dump body bytes split into 32-byte rows starting from `start_offset`."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def dump(body: bytes, name: str, start: int, n_rows: int = 30):
    print(f"\n=== {name} body[{start}:] (full body={len(body)}) ===")
    end = min(start + 32 * n_rows, len(body))
    for i in range(start, end, 32):
        row = body[i:i+32]
        print(f" +{i:>5} {row.hex(' ')}")


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        # Print the LAST 384 bytes (12 rows of 32) of the body to see the tail structure
        start = max(0, len(b.body) - 32 * 12)
        dump(b.body, name, start, 12)


if __name__ == "__main__":
    main()
@@ -0,0 +1,41 @@
"""Search for structural repetition in the body bytes."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def find_pattern_offsets(body: bytes, pattern: bytes, max_count=20):
    out = []
    i = 0
    while True:
        i = body.find(pattern, i)
        if i < 0:
            break
        out.append(i)
        i += 1
        if len(out) >= max_count:
            break
    return out


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} (body={len(body)}, N_samples={len(b.samples['Tran'])}) ===")

        # Try to find repeating substructures (look for 2-byte markers, mostly 0x10-prefixed)
        for prefix in [b"\x10\x10", b"\x10\x04", b"\x10\x08", b"\x10\x0c", b"\x10\x18",
                       b"\x10\x14", b"\x10\x20", b"\x10\x40", b"\x10\x80", b"\x10\x00",
                       b"\x10\x01", b"\x10\x03", b"\x10\xf0", b"\xf1\x10", b"\x00\x10",
                       b"\x40\x02", b"\x20\x04", b"\x30\x04", b"\x30\x08", b"\x00\x1a"]:
            offs = find_pattern_offsets(body, prefix, max_count=200)
            if 1 <= len(offs) <= 1000:
                # Print the first 6 and last 3 offsets
                first = offs[:6]
                last = offs[-3:]
                print(f" '{prefix.hex()}' x{len(offs):>4} first={first} last={last}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,34 @@
"""Find body byte ranges that look like absolute int8 sample data (smooth waveform)."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def looks_like_smooth_int8(buf):
    """Read bytes as int8 and return the mean absolute successive delta (lower = smoother, more waveform-like)."""
    if len(buf) < 8:
        # Too short to judge: rank it last rather than as perfectly smooth
        return float("inf")
    vals = [b if b < 128 else b - 256 for b in buf]
    diffs = [abs(vals[i+1] - vals[i]) for i in range(len(vals)-1)]
    avg_diff = sum(diffs) / len(diffs)
    return avg_diff


def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        # Scan with sliding window of 64 bytes; find segments where the bytes look like a smooth wave
        win = 64
        scores = []
        for i in range(len(body) - win):
            scores.append((i, looks_like_smooth_int8(body[i:i+win])))
        # Lowest avg_diff means smoothest
        scores.sort(key=lambda x: x[1])
        print(f"\n=== {name} (body={len(body)}) - smoothest 10 windows ===")
        for off, s in scores[:10]:
            print(f" +{off:>5} avg_diff={s:.2f} bytes={body[off:off+24].hex(' ')}")


if __name__ == "__main__":
    main()
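To make the smoothness score above concrete, here is a minimal sketch that applies the same mean-absolute-delta metric to two synthetic 64-byte windows. The sine and random inputs are assumptions made up for illustration; they are not fixture data, and `mean_abs_delta_int8` is a hypothetical stand-in name for the scoring function.

```python
import math
import random


def mean_abs_delta_int8(buf: bytes) -> float:
    """Mean absolute difference between successive bytes read as signed int8."""
    vals = [b if b < 128 else b - 256 for b in buf]
    diffs = [abs(vals[i + 1] - vals[i]) for i in range(len(vals) - 1)]
    return sum(diffs) / len(diffs)


# Synthetic windows (illustrative assumptions, not real event bodies).
# One sine period with amplitude 40, stored as two's-complement bytes.
sine = bytes(int(round(40 * math.sin(2 * math.pi * i / 64))) & 0xFF for i in range(64))
# Uniformly random bytes from a seeded generator, standing in for packed/compressed data.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(64))

smooth_score = mean_abs_delta_int8(sine)
noisy_score = mean_abs_delta_int8(noise)
print(f"sine avg_diff={smooth_score:.2f}  noise avg_diff={noisy_score:.2f}")
```

A window of raw int8 waveform samples should score near the low end, while delta-packed or otherwise encoded bytes should score much higher, which is why sorting ascending surfaces waveform-like candidates first.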
@@ -0,0 +1,76 @@
"""Full Tran decoder: continues across segment headers using the T deltas in header bytes [0:2] and [2:4]."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    # Sign-extend a 4-bit nibble
    return n if n < 8 else n - 16


def i8(b):
    # Sign-extend an 8-bit byte
    return b if b < 128 else b - 256


def decode_full_tran(body):
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)

    i = 7
    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
        i += 1

    blocks = walk_body(body, i)
    T = [T0, T1]
    cur = T1
    for blk in blocks:
        if blk.tag_hi == 0x40:
            # Segment header carries 2 T deltas (int16 BE each) at bytes [0:2] and [2:4]
            if len(blk.data) >= 4:
                delta1 = int.from_bytes(blk.data[0:2], "big", signed=True)
                cur += delta1
                T.append(cur)
                delta2 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur += delta2
                T.append(cur)
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                T.append(cur)
        # 30 NN: skip for now
    return T


def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]
        n_truth = len(truth_T)

        decoded = decode_full_tran(body)
        if decoded is None:
            # Body does not start with the expected 00 02 00 preamble
            print(f"{stem}: unexpected preamble, skipped")
            continue
        n = min(len(decoded), n_truth)
        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
        div_at = -1
        for i in range(n):
            if decoded[i] != truth_T[i]:
                div_at = i
                break
        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")


if __name__ == "__main__":
    main()
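The two delta encodings handled by the decoder above (packed signed nibbles for `10 NN` blocks, signed bytes for `20 NN` blocks) can be sketched on hand-built input. The byte values below are made up for illustration, and `apply_nibble_deltas` / `apply_int8_deltas` are hypothetical helper names, not part of `minimateplus`.

```python
def s4(n: int) -> int:
    """Sign-extend a 4-bit value: 0..7 stay positive, 8..15 map to -8..-1."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Sign-extend an 8-bit value."""
    return b if b < 128 else b - 256


def apply_nibble_deltas(start: int, data: bytes) -> list:
    """Each byte carries two signed 4-bit deltas, high nibble first."""
    out, cur = [], start
    for byte in data:
        for nib in ((byte >> 4) & 0xF, byte & 0xF):
            cur += s4(nib)
            out.append(cur)
    return out


def apply_int8_deltas(start: int, data: bytes) -> list:
    """Each byte is one signed 8-bit delta."""
    out, cur = [], start
    for byte in data:
        cur += i8(byte)
        out.append(cur)
    return out


# Illustrative bytes: nibbles 1, f (-1), 7, 0 applied to a running value of 100.
nibble_samples = apply_nibble_deltas(100, b"\x1f\x70")
# Illustrative bytes: 0x80 (-128) then 0x7f (+127), continuing from the last value.
int8_samples = apply_int8_deltas(nibble_samples[-1], b"\x80\x7f")
print(nibble_samples)  # [101, 100, 107, 107]
print(int8_samples)    # [-21, 106]
```

Both loops accumulate into a running value, so a single wrong delta shifts every later sample; that is why the comparison scripts report the index of the first divergence rather than only a match count.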
@@ -0,0 +1,50 @@
"""Quick inspection of the new high-amplitude events."""
import os, re, sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start

ROOT = "tests/fixtures/5-11-26"


def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        bin_path = os.path.join(ROOT, stem)
        txt_path = bin_path + ".TXT"
        with open(bin_path, "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        meta, samples = _parse_txt(txt_path)
        n = len(samples["Tran"])

        print(f"\n=== {stem} ===")
        print(f" file={len(raw)}, body={len(body)}, N_samples={n}")
        print(f" rectime={meta.get('Record Time')} pretrig={meta.get('Pre-trigger Length')}")
        print(f" PPV(T,V,L)={meta.get('Tran PPV')} / {meta.get('Vert PPV')} / {meta.get('Long PPV')}")
        # Show first few non-trivial samples
        print(f" First 5 truth samples (in/s):")
        for i in range(5):
            print(f" T={samples['Tran'][i]:8.3f} V={samples['Vert'][i]:8.3f} "
                  f"L={samples['Long'][i]:8.3f} M={samples['MicL'][i]:8.3f}")
        # Peak sample positions
        for ch in ("Tran", "Vert", "Long"):
            vals = samples[ch]
            peak_i = max(range(n), key=lambda i: abs(vals[i]))
            print(f" {ch}: peak {vals[peak_i]:.3f} at sample {peak_i} (t={peak_i/1024:.3f}s)")
        # Body structure
        start = find_data_start(body)
        blocks = walk_body(body, start)
        types = {}
        for b in blocks:
            types[b.tag_hi] = types.get(b.tag_hi, 0) + 1
        print(f" body start={start}, total blocks walked: {len(blocks)}")
        print(f" block tag counts: {types}")
        # How far the walker got
        if blocks:
            last = blocks[-1]
            walked = last.offset + last.length
            print(f" walker stopped at offset {walked}/{len(body)} ({100*walked/len(body):.0f}%)")


if __name__ == "__main__":
    main()
@@ -0,0 +1,23 @@
"""Print raw body hex + byte-distribution stats for one event."""
from collections import Counter
import sys

sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} ({len(body)} body bytes) ===")
        print(f" STRT: {b.strt.hex()}")
        print(f" body[0:64]: {body[:64].hex()}")
        print(f" body[64:128]: {body[64:128].hex()}")
        print(f" body[-32:]: {body[-32:].hex()}")
        cnt = Counter(body)
        print(f" top 16 bytes: {[(f'0x{k:02x}', f'{v/len(body):.2%}') for k,v in cnt.most_common(16)]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,144 @@
"""
load_bundle.py - extract body bytes from BW binary + parse sample columns from TXT.

Used by the codec reverse-engineering scripts in this directory.
"""
from __future__ import annotations

import os
import re
from dataclasses import dataclass


BUNDLE_ROOT = os.path.join(
    os.path.dirname(__file__), "..", "tests", "fixtures", "decode-re-5-8-26"
)


@dataclass
class Bundle:
    name: str
    bin_path: str
    txt_path: str
    bin: bytes
    body: bytes  # bytes between STRT (43) and footer (last 26)
    strt: bytes  # 21-byte STRT record
    samples: dict  # {"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}
    sample_rate: int
    rectime_sec: float
    pretrig_sec: float
    geo_range_ips: float
    ppv: dict  # {"Tran": float, "Vert": float, "Long": float}
    mic_pspl: float
    serial: str


def _parse_txt(path: str) -> tuple[dict, dict]:
    """Return (meta, samples) parsed from the TXT export at *path*."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        text = f.read()

    meta = {}
    samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}

    # Find header line that starts the columns ("Tran Vert Long MicL").
    # Then every line after is sample data (4 tab-separated floats).
    lines = text.splitlines()
    header_idx = None
    for i, line in enumerate(lines):
        if "Tran" in line and "Vert" in line and "Long" in line and "MicL" in line:
            # The columns header. Sample lines start a few lines later.
            header_idx = i
            break
    if header_idx is None:
        raise ValueError(f"no Tran/Vert/Long/MicL header in {path}")

    # Parse meta: quoted lines with "Field : value"
    for line in lines[:header_idx]:
        m = re.match(r'^"([^"]+)\s*:\s*([^"]*)"', line.strip())
        if m:
            k, v = m.group(1).strip(), m.group(2).strip()
            meta[k] = v

    # Parse samples; lines that do not hold 4 numeric columns are skipped
    for line in lines[header_idx + 1 :]:
        line = line.strip()
        if not line:
            continue
        parts = re.split(r"\s+", line)
        if len(parts) < 4:
            continue
        try:
            t = float(parts[0])
            v = float(parts[1])
            l = float(parts[2])
            m = float(parts[3])
        except ValueError:
            continue
        samples["Tran"].append(t)
        samples["Vert"].append(v)
        samples["Long"].append(l)
        samples["MicL"].append(m)

    return meta, samples


def load_bundle(name: str) -> Bundle:
    folder = os.path.join(BUNDLE_ROOT, name)
    files = os.listdir(folder)
    bin_name = next(f for f in files if not f.endswith(".TXT"))
    txt_name = next(f for f in files if f.endswith(".TXT"))

    bin_path = os.path.join(folder, bin_name)
    txt_path = os.path.join(folder, txt_name)

    with open(bin_path, "rb") as f:
        binary = f.read()

    # Header is 22 bytes; STRT at [22:43]; footer at last 26 bytes.
    strt = binary[22:43]
    body = binary[43:-26]

    meta, samples = _parse_txt(txt_path)

    sample_rate = int(re.search(r"(\d+)", meta.get("Sample Rate", "1024")).group(1))
    rectime_sec = float(re.search(r"([\d.]+)", meta.get("Record Time", "3.0")).group(1))
    pretrig_sec = float(re.search(r"-?[\d.]+", meta.get("Pre-trigger Length", "0")).group(0))
    geo_range_ips = float(re.search(r"([\d.]+)", meta.get("Geo Range", "10.0")).group(1))
    serial = meta.get("Serial Number", "").strip()

    def _f(s):
        return float(re.search(r"-?[\d.]+", s).group(0))

    ppv = {
        "Tran": _f(meta.get("Tran PPV", "0")),
        "Vert": _f(meta.get("Vert PPV", "0")),
        "Long": _f(meta.get("Long PPV", "0")),
    }
    mic_pspl = _f(meta.get("MicL PSPL", "0"))

    return Bundle(
        name=name,
        bin_path=bin_path,
        txt_path=txt_path,
        bin=binary,
        body=body,
        strt=strt,
        samples=samples,
        sample_rate=sample_rate,
        rectime_sec=rectime_sec,
        pretrig_sec=pretrig_sec,
        geo_range_ips=geo_range_ips,
        ppv=ppv,
        mic_pspl=mic_pspl,
        serial=serial,
    )


if __name__ == "__main__":
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        n = len(b.samples["Tran"])
        print(f"{name}: body={len(b.body):>6} N_samples={n} rate={b.sample_rate} "
              f"rectime={b.rectime_sec} pretrig={b.pretrig_sec} range={b.geo_range_ips} "
              f"PPV(T,V,L)={b.ppv['Tran']:.3f},{b.ppv['Vert']:.3f},{b.ppv['Long']:.3f} "
              f"MicL={b.mic_pspl}")
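As a rough illustration of the metadata step in the loader above, the sketch below runs the same quoted "Field : value" pattern over a few header lines. The sample lines and their units are invented for the example (the real TXT export is not shown here), so treat the exact field text as an assumption.

```python
import re

# Same shape as the pattern used for quoted meta lines: "Field : value"
META_RE = re.compile(r'^"([^"]+)\s*:\s*([^"]*)"')

# Hypothetical header lines; the real export's wording and units may differ.
sample_lines = [
    '"Sample Rate : 1024 sps"',
    '"Record Time : 3.0 sec"',
    '"Tran PPV : 0.125 in/s"',
    'Tran\tVert\tLong\tMicL',  # column header: not quoted, so it is ignored here
]

meta = {}
for line in sample_lines:
    m = META_RE.match(line.strip())
    if m:
        meta[m.group(1).strip()] = m.group(2).strip()

# Numeric fields are pulled out of the free-text values with a second regex.
sample_rate = int(re.search(r"(\d+)", meta["Sample Rate"]).group(1))
rectime_sec = float(re.search(r"([\d.]+)", meta["Record Time"]).group(1))
tran_ppv = float(re.search(r"-?[\d.]+", meta["Tran PPV"]).group(0))
print(meta)
print(sample_rate, rectime_sec, tran_ppv)
```

Keeping the raw string in `meta` and extracting numbers only at the point of use means an unexpected unit suffix does not break parsing of the other fields.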
@@ -0,0 +1,81 @@
"""Decode Tran across multiple segments by resetting at 40 02 headers."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    # Sign-extend a 4-bit nibble
    return n if n < 8 else n - 16


def i8(b):
    # Sign-extend an 8-bit byte
    return b if b < 128 else b - 256


def decode_full_tran(body):
    """Decode all Tran samples in the body, walking through segments."""
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)

    # Locate first tag
    i = 7
    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
        i += 1

    blocks = walk_body(body, i)
    T = [T0, T1]
    cur = T1
    for bi, blk in enumerate(blocks):
        if blk.tag_hi == 0x40:
            # Segment header: try interpreting bytes [0:2] as new T anchor
            if len(blk.data) >= 2:
                new_anchor = int.from_bytes(blk.data[0:2], "big", signed=True)
                # The next sample IS this anchor value, NOT a delta from cur.
                T.append(new_anchor)
                cur = new_anchor
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
        elif blk.tag_hi == 0x00:
            # RLE: append NN zero deltas
            for _ in range(blk.tag_lo):
                T.append(cur)
        # 30 NN: skip
    return T


def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]
        n_truth = len(truth_T)

        decoded = decode_full_tran(body)
        if decoded is None:
            # Body does not start with the expected 00 02 00 preamble
            print(f"{stem}: unexpected preamble, skipped")
            continue
        n = min(len(decoded), n_truth)
        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
        # Find first divergence
        div_at = -1
        for i in range(n):
            if decoded[i] != truth_T[i]:
                div_at = i
                break
        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
        if div_at >= 0 and div_at < 30:
            print(f" truth around div [{max(0,div_at-3)}:{div_at+8}]: {truth_T[max(0,div_at-3):div_at+8]}")
            print(f" pred around div [{max(0,div_at-3)}:{div_at+8}]: {decoded[max(0,div_at-3):div_at+8]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,28 @@
"""Dump all blocks in segment 1 of each event with their data."""
import sys
sys.path.insert(0, ".")
from minimateplus.waveform_codec import walk_body, find_data_start


def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        blocks = walk_body(body, find_data_start(body))

        # Find segment 1 (between first and second 40 02)
        seg40_indices = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
        if len(seg40_indices) < 2:
            print(f"\n{stem}: only {len(seg40_indices)} segment headers found")
            seg1_blocks = blocks[seg40_indices[0]:] if seg40_indices else []
        else:
            seg1_blocks = blocks[seg40_indices[0]:seg40_indices[1]+1]
        print(f"\n=== {stem} segment 1 ({len(seg1_blocks)} blocks) ===")
        for b in seg1_blocks[:25]:
            tag = f"{b.tag_hi:02x}{b.tag_lo:02x}"
            print(f" off={b.offset:>5} {tag} NN=0x{b.tag_lo:02x}({b.tag_lo:>3}) len={b.length:>3} data={b.data[:16].hex(' ')}{'...' if len(b.data)>16 else ''}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,195 @@
"""Test 12-bit signed packed deltas hypothesis for 30 NN blocks across all loud events.

For each 30 NN block in each event, identify what samples it should cover
(based on the cumulative delta count up to that point) and compare the
truth deltas against various 12-bit packing schemes.
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start

CHANNEL_ORDER = ["Vert", "Long", "MicL", "Tran"]  # rotation after initial T


def s12(v):
    """Sign-extend a 12-bit unsigned value to signed int."""
    return v if v < 0x800 else v - 0x1000


def unpack_12bit_be(data):
    """4 deltas in 6 bytes, BE order: byte[0:1.5], byte[1.5:3], byte[3:4.5], byte[4.5:6]."""
    # bits 0..47 (MSB-first), split into 4 × 12-bit
    val = int.from_bytes(data, "big")
    out = []
    for i in range(4):
        d = (val >> (12 * (3 - i))) & 0xFFF
        out.append(s12(d))
    return out


def unpack_12bit_le(data):
    """4 deltas in 6 bytes, LE order: bytes packed as 2 × 24-bit groups."""
    out = []
    # First 3 bytes contain 2 deltas
    b0, b1, b2 = data[0], data[1], data[2]
    d0 = b0 | ((b1 & 0x0F) << 8)
    d1 = (b1 >> 4) | (b2 << 4)
    out.append(s12(d0))
    out.append(s12(d1))
    # Next 3 bytes contain 2 more deltas
    b3, b4, b5 = data[3], data[4], data[5]
    d2 = b3 | ((b4 & 0x0F) << 8)
    d3 = (b4 >> 4) | (b5 << 4)
    out.append(s12(d2))
    out.append(s12(d3))
    return out


def unpack_12bit_be_per_triplet(data):
    """4 deltas as 2 triplets of (high4, low8) BE within each 3-byte group."""
    out = []
    b0, b1, b2 = data[0], data[1], data[2]
    d0 = (b0 << 4) | (b1 >> 4)
    d1 = ((b1 & 0x0F) << 8) | b2
    out.append(s12(d0))
    out.append(s12(d1))
    b3, b4, b5 = data[3], data[4], data[5]
    d2 = (b3 << 4) | (b4 >> 4)
    d3 = ((b4 & 0x0F) << 8) | b5
    out.append(s12(d2))
    out.append(s12(d3))
    return out


def truth_deltas_for_block(blocks, block_idx, event_truth, channel):
    """For a 30 NN block at block_idx, determine which samples it covers.

    Walks through all blocks before block_idx (within the same segment) and
    counts how many deltas have been emitted, starting from the segment's
    anchor pair. Returns (segment-relative index of the first covered sample, NN);
    the truth deltas themselves are computed by the caller.
    """
    # Find the segment header that contains this block.
    seg_header_idx = None
    for j in range(block_idx, -1, -1):
        if blocks[j].tag_hi == 0x40:
            seg_header_idx = j
            break
    if seg_header_idx is None:
        # block is in the initial T segment; samples count from sample 2.
        first_sample_in_segment = 2
    else:
        # Anchor pair covers samples [N, N+1] for some N. Subsequent deltas
        # are samples [N+2, N+2+1, ...]. We don't actually need to know N
        # for this test, just the relative position within the segment.
        first_sample_in_segment = 2  # anchor=0,1; deltas start at 2

    # Count deltas from segment-data start to block_idx.
    delta_count = 0
    start_block = seg_header_idx + 1 if seg_header_idx is not None else 0
    for j in range(start_block, block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x10:
            delta_count += blk.tag_lo  # NN nibbles = NN deltas
        elif blk.tag_hi == 0x20:
            delta_count += blk.tag_lo  # NN int8 deltas
        elif blk.tag_hi == 0x00:
            delta_count += blk.tag_lo  # RLE zero deltas
    # Now the 30 NN block carries NN deltas.
    nn = blocks[block_idx].tag_lo
    # First sample affected: segment first_sample + delta_count.
    # But we ALSO need to know which segment this is, since the segment maps
    # to a specific channel and a specific starting absolute sample index.
    return first_sample_in_segment + delta_count, nn


def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]

        # Find all 30 NN blocks in DATA section (not trailer).
        thirty_blocks = []
        for bi, b in enumerate(blocks):
            if b.tag_hi != 0x30:
                continue
            # Determine which segment this is in
            seg_num = None
            for k, hi in enumerate(seg_idx):
                next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
                if hi < bi < next_hi:
                    seg_num = k
                    break
            if seg_num is None:
                # Before the first 40 02 header (or no headers at all): initial T segment
                seg_num = -1
            thirty_blocks.append((bi, b, seg_num))

        if not thirty_blocks:
            continue

        print(f"\n=== {stem} ===")
        for bi, b, seg_num in thirty_blocks:
            # Channel for this segment
            if seg_num == -1:
                channel = "Tran"
                seg_label = "initial T"
            else:
                channel = CHANNEL_ORDER[seg_num % 4]
                seg_label = f"seg {seg_num}"

            # Count deltas before this block within the same segment.
            seg_header_idx = seg_idx[seg_num] if seg_num >= 0 else -1
            start_block = seg_header_idx + 1 if seg_header_idx >= 0 else 0
            delta_count = 0
            for j in range(start_block, bi):
                blk = blocks[j]
                if blk.tag_hi in (0x10, 0x20, 0x00):
                    delta_count += blk.tag_lo

            # First sample this 30 NN block affects (within the segment)
            # = anchor positions + delta_count + 2 (since anchor pair was samples 0,1)
            # But the segment's first absolute sample index in the channel is
            # (seg_num // 4) * 512 (approximately) if segment 0 is the first V seg.
            cycle = (seg_num // 4) if seg_num >= 0 else 0
            base = cycle * 512 + 2  # +2 for anchor pair
            sample_idx = base + delta_count
            truth_ch = [round(v * 200) for v in samples[channel]]
            nn = b.tag_lo

            if sample_idx + nn >= len(truth_ch):
                print(f" block @ {b.offset} ({seg_label} {channel}): out of truth range")
                continue

            # Get the previous sample so we can compute truth deltas
            if sample_idx == 0:
                prev = 0
            else:
                prev = truth_ch[sample_idx - 1]
            truth_deltas = []
            for k in range(nn):
                truth_deltas.append(truth_ch[sample_idx + k] - (prev if k == 0 else truth_ch[sample_idx + k - 1]))

            # Try each packing
            schemes = [
                ("12-bit BE contiguous", unpack_12bit_be(b.data)),
                ("12-bit LE per-triplet", unpack_12bit_le(b.data)),
                ("12-bit BE per-triplet", unpack_12bit_be_per_triplet(b.data)),
            ]
            print(f" block @ {b.offset:>5} ({seg_label} {channel}, samples {sample_idx}..{sample_idx+nn-1}):")
            print(f" data: {b.data.hex(' ')}")
            print(f" truth: {truth_deltas}")
            for name, pred in schemes:
                match = "✓" if pred == truth_deltas else " "
                n_match = sum(1 for x, y in zip(pred, truth_deltas) if x == y)
                print(f" {match}{n_match}/4 {name}: {pred}")


if __name__ == "__main__":
    main()
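A small round-trip sketch may help when reading the packing candidates above. It packs four signed 12-bit values MSB-first into 6 bytes and unpacks them two ways. `pack_12bit_be` is a hypothetical helper added only for this check, and the delta values are arbitrary; nothing here says which layout the device actually uses.

```python
def s12(v: int) -> int:
    """Sign-extend a 12-bit unsigned value."""
    return v if v < 0x800 else v - 0x1000


def pack_12bit_be(deltas) -> bytes:
    """Hypothetical inverse: concatenate 12-bit two's-complement fields, MSB first."""
    val = 0
    for d in deltas:
        val = (val << 12) | (d & 0xFFF)
    return val.to_bytes(len(deltas) * 12 // 8, "big")


def unpack_12bit_be(data: bytes) -> list:
    """Treat the whole buffer as one big-endian integer and slice 12-bit fields."""
    val = int.from_bytes(data, "big")
    n = len(data) * 8 // 12
    return [s12((val >> (12 * (n - 1 - i))) & 0xFFF) for i in range(n)]


def unpack_12bit_be_per_triplet(data: bytes) -> list:
    """Two fields per 3-byte group: (b0<<4 | b1>>4) and ((b1&0xF)<<8 | b2)."""
    out = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        out.append(s12((b0 << 4) | (b1 >> 4)))
        out.append(s12(((b1 & 0x0F) << 8) | b2))
    return out


deltas = [-300, 5, 2047, -2048]  # arbitrary values spanning the 12-bit range
packed = pack_12bit_be(deltas)
contiguous = unpack_12bit_be(packed)
per_triplet = unpack_12bit_be_per_triplet(packed)
print(packed.hex(" "))  # ed 40 05 7f f8 00
print(contiguous, per_triplet)
```

For a 6-byte block, slicing the 48-bit big-endian integer and unpacking per 3-byte group read the same bit positions, so the "BE contiguous" and "BE per-triplet" candidates should always agree; only the LE variant is a distinct hypothesis.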
@@ -0,0 +1,132 @@
"""Test the '30 NN data = high-nibbles + int8 low-bytes' hypothesis.

Layout for `30 04` (6 data bytes, 4 deltas):
  bytes [0:2] = 16 bits = 4 × 4-bit high-nibbles (MSB first)
  bytes [2:6] = 4 × int8 low bytes
Each delta = 12-bit signed = sign-extend((high_nibble << 8) | low_byte)
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def sign_extend_12(v):
    return v if v < 0x800 else v - 0x1000


def decode_30nn(data):
    """4 × 12-bit signed deltas (high nibble + low byte).
    bytes[0:2] hold the 4 high nibbles (MSB first); bytes[2:6] hold the low bytes.
    """
    if len(data) < 6:
        return []
    # Read high nibbles from bytes 0-1 (4 nibbles MSB-first)
    high_word = (data[0] << 8) | data[1]
    high_nibbles = [
        (high_word >> 12) & 0xF,
        (high_word >> 8) & 0xF,
        (high_word >> 4) & 0xF,
        high_word & 0xF,
    ]
    out = []
    for i in range(4):
        v = (high_nibbles[i] << 8) | data[2 + i]
        out.append(sign_extend_12(v))
    return out


def simulate_up_to(blocks, target_block_idx, t_preamble):
    """Run decoder up to block_idx; return per-channel sample lists.
    NOW with 30 NN decoded too."""
    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    out["Tran"].extend(t_preamble)
    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
    rotation = ["Vert", "Long", "MicL", "Tran"]
    current_channel = "Tran"
    seg_counter = -1
    for j in range(target_block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x40:
            seg_counter += 1
            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
            new_ch = rotation[seg_counter % 4]
            if cur[prev] is not None:
                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur[prev] += d0
                out[prev].append(cur[prev])
                cur[prev] += d1
                out[prev].append(cur[prev])
            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
            out[new_ch].extend([c0, c1])
            cur[new_ch] = c1
            current_channel = new_ch
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur[current_channel] += s4(nib)
                    out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur[current_channel] += i8(byte)
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x30:
            # NEW: decode 30 NN
            deltas = decode_30nn(blk.data)
            for d in deltas:
                cur[current_channel] += d
                out[current_channel].append(cur[current_channel])
    return out, current_channel


def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        t0 = int.from_bytes(body[3:5], "big", signed=True)
        t1 = int.from_bytes(body[5:7], "big", signed=True)
        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
        if not thirty_blocks:
            continue
        print(f"\n=== {stem} ===")
        for j, blk in thirty_blocks:
            pred, ch = simulate_up_to(blocks, j, [t0, t1])
            cur_before = pred[ch][-1]
            truth = [round(v * 200) for v in samples[ch]]
            n_pred = len(pred[ch])
            nn = blk.tag_lo
            if n_pred + nn > len(truth):
                continue
            # Decode this 30 NN block with hypothesis
            pred_deltas = decode_30nn(blk.data)
            # Compute truth deltas relative to cur_before
            truth_deltas = []
            prev = cur_before
            for k in range(nn):
                truth_deltas.append(truth[n_pred + k] - prev)
                prev = truth[n_pred + k]
            n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
            tag = "✓" if pred_deltas == truth_deltas else " "
            print(f" block @ {blk.offset:>5} (chan={ch}, NN={nn}):")
            print(f" data: {blk.data.hex(' ')}")
            print(f" truth: {truth_deltas}")
            print(f" pred: {pred_deltas} {tag}{n_match}/{nn}")


if __name__ == "__main__":
    main()
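The "high nibbles first, low bytes after" layout tested above is easier to follow with a worked round trip. The sketch below encodes four arbitrary deltas into that hypothesised 6-byte layout and decodes them again; `encode_30nn` is a hypothetical inverse written only for this check, and the layout itself remains a hypothesis under test, not a confirmed format.

```python
def sign_extend_12(v: int) -> int:
    return v if v < 0x800 else v - 0x1000


def encode_30nn(deltas) -> bytes:
    """Hypothetical inverse of the layout: 2 bytes of high nibbles, then 4 low bytes."""
    if len(deltas) != 4:
        raise ValueError("this sketch only handles the 4-delta (30 04) case")
    high_word = 0
    low = bytearray()
    for d in deltas:
        u = d & 0xFFF                           # 12-bit two's complement
        high_word = (high_word << 4) | (u >> 8)  # top 4 bits, MSB-first
        low.append(u & 0xFF)                     # bottom 8 bits
    return high_word.to_bytes(2, "big") + bytes(low)


def decode_30nn(data: bytes) -> list:
    """Rebuild each delta as sign_extend((high_nibble << 8) | low_byte)."""
    if len(data) < 6:
        return []
    high_word = (data[0] << 8) | data[1]
    out = []
    for i in range(4):
        nib = (high_word >> (12 - 4 * i)) & 0xF
        out.append(sign_extend_12((nib << 8) | data[2 + i]))
    return out


deltas = [300, -300, 7, -1]  # arbitrary illustrative values
encoded = encode_30nn(deltas)
decoded = decode_30nn(encoded)
print(encoded.hex(" "))  # 1e 0f 2c d4 07 ff
print(decoded)           # [300, -300, 7, -1]
```

Splitting the fields this way keeps the low bytes byte-aligned, which would let an encoder reuse its int8 path for the bulk of each delta and append only the extra sign and magnitude bits up front.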
@@ -0,0 +1,141 @@
"""Test 30 NN packing by running the real decoder up to each 30 NN block,
recording how many samples have been produced for each channel at that point,
then checking truth deltas immediately after."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def s12(v):
    return v if v < 0x800 else v - 0x1000


def unpack_12bit_be_contiguous(data):
    out = []
    val = int.from_bytes(data, "big")
    n = len(data) * 8 // 12
    for i in range(n):
        d = (val >> (12 * (n - 1 - i))) & 0xFFF
        out.append(s12(d))
    return out


def unpack_12bit_per_triplet_be(data):
    out = []
    for i in range(0, len(data), 3):
        if i + 2 >= len(data):
            break
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        d0 = (b0 << 4) | (b1 >> 4)
        d1 = ((b1 & 0x0F) << 8) | b2
        out.append(s12(d0))
        out.append(s12(d1))
    return out


def simulate_up_to(blocks, target_block_idx, t_preamble):
    """Run the decoder up to block_idx; return per-channel sample lists."""
    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    out["Tran"].extend(t_preamble)
    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
    rotation = ["Vert", "Long", "MicL", "Tran"]
    seg_idx = [j for j, b in enumerate(blocks) if b.tag_hi == 0x40]

    # Determine which channel we're CURRENTLY decoding into
    current_channel = "Tran"
    seg_counter = -1  # incremented at each 40 02

    for j in range(target_block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x40:
            # Switch: extend prev channel, set up new channel
            seg_counter += 1
            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
            new_ch = rotation[seg_counter % 4]
            if cur[prev] is not None:
                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur[prev] += d0; out[prev].append(cur[prev])
                cur[prev] += d1; out[prev].append(cur[prev])
            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
            out[new_ch].extend([c0, c1])
            cur[new_ch] = c1
            current_channel = new_ch
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur[current_channel] += s4(nib)
                    out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur[current_channel] += i8(byte)
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x30:
            # Skip for now; we want to know what comes next
            pass

    return out, current_channel


def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        t0 = int.from_bytes(body[3:5], "big", signed=True)
        t1 = int.from_bytes(body[5:7], "big", signed=True)

        # Find all 30 NN blocks in data section
        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
        if not thirty_blocks:
            continue

        print(f"\n=== {stem} ===")
        for j, blk in thirty_blocks:
            pred, ch = simulate_up_to(blocks, j, [t0, t1])
            n_pred = len(pred[ch])
            # The 30 NN block carries NN deltas for channel `ch` starting at sample n_pred
            truth = [round(v * 200) for v in samples[ch]]
            if n_pred >= len(truth):
                continue
            # Truth deltas: truth[n_pred] - cur, truth[n_pred+1] - truth[n_pred], ...
            cur_val = pred[ch][-1]
            nn = blk.tag_lo
            truth_deltas = []
            prev = cur_val
            for k in range(min(nn, len(truth) - n_pred)):
                truth_deltas.append(truth[n_pred + k] - prev)
                prev = truth[n_pred + k]

            print(f" block @ {blk.offset:>5} (chan={ch}, after sample {n_pred-1}, "
                  f"NN={nn}, last_val={cur_val}):")
            print(f" data: {blk.data.hex(' ')}")
            print(f" truth: {truth_deltas}")
            schemes = [
                ("12-bit BE contiguous", unpack_12bit_be_contiguous(blk.data)),
                ("12-bit per-triplet BE", unpack_12bit_per_triplet_be(blk.data)),
            ]
            for name, pred_deltas in schemes:
                n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
                tag = "✓" if pred_deltas == truth_deltas else " "
                print(f" {tag}{n_match}/{nn} {name}: {pred_deltas[:nn]}")


if __name__ == "__main__":
    main()
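Note on the two 12-bit candidates in the script above: they can only disagree when the payload length is not a multiple of 3 bytes. The sketch below illustrates this on synthetic bytes. The function names `unpack_contiguous` and `unpack_triplets` are hypothetical stand-ins for the two unpackers, and the byte strings are made up for illustration, not taken from any fixture.

```python
def s12(v):
    """Sign-extend a 12-bit unsigned value."""
    return v if v < 0x800 else v - 0x1000


def unpack_contiguous(data):
    """Treat the payload as one big-endian integer and slice 12-bit fields
    from the low end upward (leading bits are dropped on ragged input)."""
    val = int.from_bytes(data, "big")
    n = len(data) * 8 // 12
    return [s12((val >> (12 * (n - 1 - i))) & 0xFFF) for i in range(n)]


def unpack_triplets(data):
    """Take two 12-bit fields from each complete 3-byte group, front to back."""
    out = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        out.append(s12((b0 << 4) | (b1 >> 4)))
        out.append(s12(((b1 & 0x0F) << 8) | b2))
    return out


aligned = bytes.fromhex("123456fff800")  # 6 bytes: two full triplets
ragged = bytes.fromhex("12345678")       # 4 bytes: one triplet plus a stray byte

print(unpack_contiguous(aligned), unpack_triplets(aligned))
print(unpack_contiguous(ragged), unpack_triplets(ragged))
```

If every 30 NN payload in the fixtures has a length divisible by 3, the two schemes will always score identically, so a tie between them does not by itself say which packing the device uses.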
@@ -0,0 +1,86 @@
"""Test: 00 NN markers might be RLE for zero-deltas in current channel."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def decode_with_rle(body):
    """Decode Tran assuming:
    - preamble[3:5], [5:7] = T[0], T[1]
    - All 10 NN / 20 NN blocks until segment_header (40 02) are Tran deltas
    - 00 NN markers are RLE: NN/4 zero T deltas (or NN, or NN/2; try them)
    """
    if len(body) < 9 or body[0:3] != b"\x00\x02\x00":
        return None, None, None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)

    # Find first tag (might be 00 NN, 10 NN, or 20 NN)
    i = 7
    while i + 1 < len(body):
        if body[i] in (0x00, 0x10, 0x20):
            break
        i += 1
    start = i

    blocks = walk_body(body, start)

    results = {}
    for rle_div in (4, 2, 1):  # try different RLE interpretations
        T = [T0, T1]
        cur = T1
        for blk in blocks:
            if blk.tag_hi == 0x40:
                break
            if blk.tag_hi == 0x10:
                for byte in blk.data:
                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
                        cur += s4(nib)
                        T.append(cur)
            elif blk.tag_hi == 0x20:
                for byte in blk.data:
                    cur += i8(byte)
                    T.append(cur)
            elif blk.tag_hi == 0x00:
                # RLE of zero deltas
                n_zeros = blk.tag_lo // rle_div
                for _ in range(n_zeros):
                    T.append(cur)
            # 30 NN: skip for now
        results[rle_div] = T
    return results, T0, T1


def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]

        results, T0, T1 = decode_with_rle(body)
        if results is None:
            # Body too short or preamble magic missing; nothing to compare
            print(f"\n=== {stem} === preamble not recognised, skipped")
            continue
        print(f"\n=== {stem} (T[0]={T0}, T[1]={T1}) ===")
        for rle_div, T in results.items():
            n = min(len(T), len(truth_T))
            matches = sum(1 for i in range(n) if T[i] == truth_T[i])
            # Find first divergence
            div_at = -1
            for i in range(n):
                if T[i] != truth_T[i]:
                    div_at = i
                    break
            print(f" rle_div={rle_div}: decoded {len(T)}, matches {matches}/{n}, first div at sample {div_at}")


if __name__ == "__main__":
    main()
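Note on the delta hypotheses in the script above: a 10 NN block contributes two signed 4-bit deltas per byte, a 20 NN block one signed 8-bit delta per byte, and a 00 NN marker (under the RLE hypothesis) repeats the last value. A minimal sketch on made-up bytes follows; the helper names `nibble_deltas`, `byte_deltas`, `accumulate` and `zero_run` are illustrative and not part of `minimateplus`.

```python
def s4(n):
    """Sign-extend a 4-bit value: 0..7 stay, 8..15 map to -8..-1."""
    return n if n < 8 else n - 16


def i8(b):
    """Sign-extend an 8-bit value: 0..127 stay, 128..255 map to -128..-1."""
    return b if b < 128 else b - 256


def nibble_deltas(data):
    """High nibble first, then low nibble, for every byte."""
    return [s4(x) for byte in data for x in ((byte >> 4) & 0xF, byte & 0xF)]


def byte_deltas(data):
    return [i8(b) for b in data]


def accumulate(start, deltas):
    """Running sum of deltas, starting from the last known sample."""
    out, cur = [], start
    for d in deltas:
        cur += d
        out.append(cur)
    return out


def zero_run(last, nn, rle_div):
    """RLE hypothesis: NN // rle_div repeats of the last sample."""
    return [last] * (nn // rle_div)


samples = accumulate(10, nibble_deltas(bytes([0x1F, 0x70])))
samples += zero_run(samples[-1], 8, 4)
print(samples)
```

Trying `rle_div` values of 4, 2 and 1 only changes how many repeats `zero_run` emits, which is why the script can compare all three against the same truth series.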
@@ -0,0 +1,71 @@
"""Test: does the second '20 NN' block in SS0 continue Tran samples?"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def main():
    stem = "M529LL1A.SS0"
    path = f"tests/fixtures/5-11-26/{stem}"
    with open(path, "rb") as f:
        body = f.read()[43:-26]
    _, samples = _parse_txt(path + ".TXT")
    truth_T_16 = [round(v * 200) for v in samples["Tran"]]

    # Preamble
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)

    # Walk blocks
    start = find_data_start(body)
    blocks = walk_body(body, start)

    print(f"=== {stem} === T[0]={T0} T[1]={T1}")

    # Hypothesis: Tran continues through ALL 10 NN and 20 NN blocks
    # in order, until the next 40 02 segment header (which resets).
    T = [T0, T1]
    cur = T1
    decoded_count = 2  # T[0], T[1] from preamble
    for bi, blk in enumerate(blocks):
        if blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
                    decoded_count += 1
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
                decoded_count += 1
        elif blk.tag_hi == 0x40:
            # Segment header: stop here for this test
            break
        # 00 and 30 NN don't contribute to Tran (in this hypothesis)

    # Compare to truth
    print(f" Decoded {len(T)} T samples up to first 40 02")
    matches = sum(1 for i in range(min(len(T), len(truth_T_16))) if T[i] == truth_T_16[i])
    print(f" Matches in first {min(len(T), len(truth_T_16))}: {matches}")
    # Print first divergence
    for i in range(min(len(T), len(truth_T_16))):
        if T[i] != truth_T_16[i]:
            print(f" First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
            # Show context
            print(f" pred [{i-3}:{i+5}]: {T[max(0,i-3):i+5]}")
            print(f" truth [{i-3}:{i+5}]: {truth_T_16[max(0,i-3):i+5]}")
            break


if __name__ == "__main__":
    main()
@@ -0,0 +1,67 @@
"""Try various nibble-level channel interleavings to find which one matches truth."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def s4(n):
    return n if n < 8 else n - 16


def run_decoder(body, layout, skip, n_channels=4):
    """layout: function nibble_index -> channel_index. Returns list-of-lists per channel."""
    out = [[] for _ in range(n_channels)]
    cur = [0] * n_channels
    nibbles = []
    for byte in body[skip:]:
        nibbles.append((byte >> 4) & 0xF)
        nibbles.append(byte & 0xF)
    for i, n in enumerate(nibbles):
        ch = layout(i)
        cur[ch] += s4(n)
        out[ch].append(cur[ch])
    return out


def cmp(pred, truth, n=24):
    n = min(n, len(pred), len(truth))
    return [(pred[i], truth[i]) for i in range(n)]


def main():
    b = load_bundle("event-c")
    truth_T = [round(v * 200) for v in b.samples["Tran"]]
    truth_V = [round(v * 200) for v in b.samples["Vert"]]
    truth_L = [round(v * 200) for v in b.samples["Long"]]
    print(f"T truth[0:10]: {truth_T[:10]}")
    print(f"V truth[0:10]: {truth_V[:10]}")
    print(f"L truth[0:10]: {truth_L[:10]}")

    # Try several nibble->channel layouts (4 channels).
    # Channel indices: 0=T, 1=V, 2=L, 3=M. Each label names the channel that
    # receives nibbles 0, 1, 2, 3 in that order.
    layouts = {
        "interleaved TVLM (0,1,2,3,0,1,2,3,...)": lambda i: i % 4,
        "interleaved VLMT": lambda i: (i + 1) % 4,
        "interleaved LMTV": lambda i: (i + 2) % 4,
        "interleaved MTVL": lambda i: (i + 3) % 4,
        # Same nibble->channel mapping as interleaved TVLM; kept as a separate label
        "byte-based TV LM TV LM (high T low V byte0; high L low M byte1)": lambda i: i % 4,
        # "chunks of 8 nibbles per channel": each channel gets 8 nibbles in a row
        "chunks-8 TVLM": lambda i: (i // 8) % 4,
        "chunks-16 TVLM": lambda i: (i // 16) % 4,
        # planar (full channel sequential)
        "planar T(0..N) V(N..2N) L(2N..3N) M(3N..4N)": None,  # special
    }

    for label, layout_fn in layouts.items():
        if layout_fn is None:
            continue
        for skip in (0, 4, 7, 8, 9, 11, 14):
            out = run_decoder(b.body, layout_fn, skip)
            # Check first 10 cumulative values on each channel
            print(f" skip={skip:2} {label}")
            print(f" T_cum[0:10]: {out[0][:10]}")
            print(f" V_cum[0:10]: {out[1][:10]}")
            print(f" L_cum[0:10]: {out[2][:10]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,73 @@
"""Try decoding body as 4-bit signed nibble deltas, 4-channel round-robin."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


CHANNELS = ("Tran", "Vert", "Long", "MicL")


def s4(n):
    """Sign-extend a 4-bit unsigned to int (0..7 → 0..7, 8..F → -8..-1)."""
    return n if n < 8 else n - 16


def decode_nibbles(body: bytes, skip_bytes: int = 7, n_channels: int = 4):
    """Read body as 2 nibbles per byte; accumulate as deltas for n_channels round-robin."""
    out = [[] for _ in range(n_channels)]
    cur = [0] * n_channels
    ch = 0
    nibbles = []
    for byte in body[skip_bytes:]:
        nibbles.append((byte >> 4) & 0xF)
        nibbles.append(byte & 0xF)
    for n in nibbles:
        cur[ch] += s4(n)
        out[ch].append(cur[ch])
        ch = (ch + 1) % n_channels
    return out


def cmp_to_truth(pred, truth, scale=16):
    """Compare predicted ints (in 16-count units) to truth (in 16-count units = txt * 200).
    Return (max_abs_err, mean_abs_err, n_compared).
    """
    n = min(len(pred), len(truth))
    errs = []
    for i in range(n):
        p = pred[i]
        t = truth[i]
        errs.append(abs(p - t))
    if not errs:
        return None
    return (max(errs), sum(errs) / len(errs), n)


def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        # Convert TXT samples (in/s) to 16-count units (multiply by 200, since 0.005 in/s = 1)
        # WAIT: 0.005 in/s = 16 ADC counts. 1 count = 0.000305 in/s.
        # So in 1-count units: count = txt * (1/0.0003052) ≈ txt * 3276.7
        # But TXT only has 0.005 resolution so equivalent to 16-count units = txt * 200.
        truth_in_16 = {ch: [round(v * 200) for v in b.samples[ch]] for ch in CHANNELS[:3]}
        # MicL is in dB, skip for now

        # Try decoder with skip_bytes = 7
        decoded = decode_nibbles(b.body, skip_bytes=7, n_channels=4)
        print(f"\n=== {name} ===")
        print(f" body={len(b.body)}, nibbles={2*(len(b.body)-7)}, samples_per_ch={len(decoded[0])}")
        print(f" truth samples per ch: {len(truth_in_16['Tran'])}")
        # Print first 24 of each
        for i, chan in enumerate(CHANNELS):
            pred_first = decoded[i][:24]
            if chan in truth_in_16:
                truth_first = truth_in_16[chan][:24]
                print(f" {chan} pred: {pred_first}")
                print(f" {chan} truth: {truth_first}")
            else:
                print(f" {chan} pred: {pred_first} (truth in dB, skipped)")


if __name__ == "__main__":
    main()
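Note on the truth scaling used throughout these scripts: the TXT export has 0.005 in/s resolution, so `round(v * 200)` maps each reading to an integer number of 0.005 in/s steps, and `round` absorbs any floating-point error in the product. A small sketch with made-up readings follows; `to_units` and `deltas` are hypothetical helper names.

```python
def to_units(values_in_per_s):
    """Convert TXT readings (in/s, 0.005 resolution) to integer steps of 0.005 in/s."""
    return [round(v * 200) for v in values_in_per_s]


def deltas(units):
    """Sample-to-sample differences, which is what the delta blocks would encode."""
    return [b - a for a, b in zip(units, units[1:])]


readings = [0.0, 0.005, -0.01, 0.035]  # illustrative values only
units = to_units(readings)
print(units, deltas(units))
```

Comparing decoder output against `units` rather than against the raw floats keeps every equality check exact.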
@@ -0,0 +1,32 @@
"""Verify decode_waveform_v2 against BW ASCII truth for all fixtures."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import decode_waveform_v2


def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
                 "M529LL1L.JQ0", "M529LL1L.V70"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        decoded = decode_waveform_v2(body)
        if decoded is None:
            print(f"{stem}: decoder returned None")
            continue

        print(f"\n=== {stem} ===")
        for ch in ("Tran", "Vert", "Long"):
            truth = [round(v * 200) for v in samples[ch]]
            pred = decoded[ch]
            n = min(len(pred), len(truth))
            matches = sum(1 for i in range(n) if pred[i] == truth[i])
            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
            print(f" {ch}: decoded={len(pred):>5} truth={len(truth):>5} "
                  f"matches={matches:>5}/{n:<5} first div={div}")


if __name__ == "__main__":
    main()
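Note on the comparison repeated in the verification scripts: each one counts matching samples over the overlapping prefix and reports the first index where prediction and truth differ, with -1 meaning no divergence. It could be factored into one helper, sketched below. `compare_series` is a hypothetical name, not an existing function in the repository.

```python
def compare_series(pred, truth):
    """Return (matches, n_compared, first_divergence) over the common prefix.

    first_divergence is -1 when every compared sample agrees.
    """
    n = min(len(pred), len(truth))
    matches = sum(1 for i in range(n) if pred[i] == truth[i])
    first_div = next((i for i in range(n) if pred[i] != truth[i]), -1)
    return matches, n, first_div


print(compare_series([1, 2, 3, 9], [1, 2, 4, 9, 5]))
print(compare_series([1, 2], [1, 2]))
```

A first divergence of -1 says only that the overlap agrees. A decoder that stops early still needs the `decoded` versus `truth` length check that the scripts print alongside.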
@@ -0,0 +1,55 @@
"""Run decode_waveform_v2 against the 5-8-26 quiet bundle to test the
'quiet events should decode fully' hypothesis."""
import os, sys
sys.path.insert(0, ".")
from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
from analysis.load_bundle import _parse_txt


def main():
    base = "tests/fixtures/decode-re-5-8-26"
    for evt in sorted(os.listdir(base)):
        folder = os.path.join(base, evt)
        if not os.path.isdir(folder):
            continue
        # Find the binary (not .TXT)
        bin_name = next(
            (f for f in os.listdir(folder) if not f.endswith(".TXT")),
            None,
        )
        if not bin_name:
            continue
        bin_path = os.path.join(folder, bin_name)
        txt_path = bin_path + ".TXT"
        if not os.path.exists(txt_path):
            # Sometimes the TXT name differs slightly
            for f in os.listdir(folder):
                if f.endswith(".TXT"):
                    txt_path = os.path.join(folder, f)
                    break
        with open(bin_path, "rb") as f:
            body = f.read()[43:-26]
        decoded = decode_waveform_v2(body)
        _, samples = _parse_txt(txt_path)

        # Count 30 NN blocks
        blocks = walk_body(body, find_data_start(body))
        n_30 = sum(1 for b in blocks if b.tag_hi == 0x30)
        n_40 = sum(1 for b in blocks if b.tag_hi == 0x40)

        print(f"\n=== {evt} === body={len(body)} segments={n_40} '30 NN' blocks={n_30}")
        if decoded is None:
            print(" decoder returned None")
            continue
        for ch in ("Tran", "Vert", "Long"):
            truth = [round(v * 200) for v in samples[ch]]
            pred = decoded[ch]
            n = min(len(pred), len(truth))
            matches = sum(1 for i in range(n) if pred[i] == truth[i])
            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
            print(f" {ch}: decoded={len(pred):>5} truth={len(truth):>5} "
                  f"matches={matches:>5}/{n:<5} first div={div}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,71 @@
"""Verify: preamble[3:7] = Tran[0], Tran[1] as int16 BE in 16-count units.
And first 20/10 NN block = Tran deltas starting at sample 2.
"""
import os, sys
sys.path.insert(0, ".")
from analysis.load_bundle import _parse_txt
from minimateplus.waveform_codec import walk_body, find_data_start


def s4(n):
    return n if n < 8 else n - 16


def i8(b):
    return b if b < 128 else b - 256


def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T_16 = [round(v * 200) for v in samples["Tran"]]

        # Preamble parse
        T0_pre = int.from_bytes(body[3:5], "big", signed=True)
        T1_pre = int.from_bytes(body[5:7], "big", signed=True)
        print(f"\n=== {stem} ===")
        print(f" Preamble T[0]={T0_pre} (truth {truth_T_16[0]}) T[1]={T1_pre} (truth {truth_T_16[1]}) match={T0_pre==truth_T_16[0] and T1_pre==truth_T_16[1]}")

        # First block
        start = find_data_start(body)
        blocks = walk_body(body, start)
        if not blocks:
            print(f" no blocks found")
            continue

        # Assume first block = Tran deltas from sample 2
        first = blocks[0]
        T = [T0_pre, T1_pre]
        cur_T = T1_pre
        if first.tag_hi == 0x10:
            # Nibble pairs
            for byte in first.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur_T += s4(nib)
                    T.append(cur_T)
        elif first.tag_hi == 0x20:
            # int8 per byte
            for byte in first.data:
                cur_T += i8(byte)
                T.append(cur_T)

        # Compare against truth
        n_check = min(len(T), len(truth_T_16))
        match_count = sum(1 for i in range(n_check) if T[i] == truth_T_16[i])
        print(f" First block type=0x{first.tag_hi:02x} NN=0x{first.tag_lo:02x} len={len(first.data)} → {len(T)} T samples decoded")
        print(f" Tran predicted[0:10]: {T[:10]}")
        print(f" Tran truth [0:10]: {truth_T_16[:10]}")
        print(f" Matches in first {n_check}: {match_count} / {n_check}")
        # Show where it diverges
        for i in range(n_check):
            if T[i] != truth_T_16[i]:
                print(f" First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
                break


if __name__ == "__main__":
    main()
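Note on the preamble layout being verified above: the scripts read `body[3:5]` and `body[5:7]` as big-endian signed 16-bit values for the first two Tran samples, and one of the earlier scripts also checks for a leading `00 02 00`. A minimal sketch on synthetic bytes follows, assuming that 7-byte layout holds. `parse_preamble` is a hypothetical helper name.

```python
def parse_preamble(body):
    """Return (T0, T1) from a 7-byte preamble, or None if it does not look right.

    Assumed layout: bytes 0..2 = 00 02 00, bytes 3..4 = T[0], bytes 5..6 = T[1],
    both int16 big-endian.
    """
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    return t0, t1


# Synthetic body: magic, T0 = 3, T1 = -2, then the start of a 10 NN block.
example = bytes.fromhex("000200" "0003" "fffe" "1004")
print(parse_preamble(example))
```

Returning `None` on an unexpected prefix lets a caller skip the file rather than compare garbage against the truth series.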
@@ -0,0 +1,20 @@
"""Walk blocks of the new 5-11-26 events and look at what comes after Tran block."""
import sys
sys.path.insert(0, ".")
from minimateplus.waveform_codec import walk_body, find_data_start


def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        with open(f"tests/fixtures/5-11-26/{stem}", "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        start = find_data_start(body)
        blocks = walk_body(body, start)
        print(f"\n=== {stem} === body={len(body)} start={start} blocks walked={len(blocks)}")
        for i, b in enumerate(blocks[:20]):
            print(f" block[{i:>2}] @ {b.offset:>5} tag={b.tag_hi:02x} NN=0x{b.tag_lo:02x}({b.tag_lo}) len={b.length} data[:24]={b.data[:24].hex(' ')}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,44 @@
"""Walk the body assuming chunks delimited by 0x10 NN tags. Print each chunk's structure."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def walk(body: bytes, start_offset: int = 7, max_chunks: int = 30):
    """Return all positions where byte = 0x10 is followed by a multiple-of-4 byte.

    Not called by main(), and max_chunks is currently unused.
    """
    chunks = []
    i = start_offset
    while i < len(body) - 1:
        # Find next `10 NN` where NN is multiple of 4 (and not preceded by another 0x10 immediately, which would be data).
        if body[i] == 0x10 and (body[i+1] % 4 == 0):
            chunks.append(i)
        i += 1
    return chunks


def main():
    for name in ("event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        positions = []
        i = 7  # skip 7-byte preamble
        while i < len(body) - 1:
            if body[i] == 0x10 and body[i+1] % 4 == 0 and body[i+1] > 0:
                positions.append(i)
                i += 2  # skip past tag
            else:
                i += 1
        print(f"\n=== {name} === body={len(body)}, total `10 NN` (NN%4==0, NN>0) tags: {len(positions)}")
        # Print first 30 chunks: show position, NN, gap to next tag
        for k in range(min(30, len(positions))):
            pos = positions[k]
            NN = body[pos + 1]
            next_pos = positions[k+1] if k+1 < len(positions) else len(body)
            gap = next_pos - pos
            data_bytes = body[pos+2 : next_pos]
            print(f" chunk[{k:>3}] @ {pos:>5} NN=0x{NN:02x} ({NN:>3}, NN/2={NN//2}) gap={gap:>3} "
                  f"data={data_bytes[:24].hex(' ')}{'...' if len(data_bytes) > 24 else ''}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,50 @@
"""Deterministic chunk walker: each chunk = [10 NN][NN/2 bytes data][2 bytes trailer]."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def walk_chunks(body: bytes, start: int = 7):
    """Yield (offset, NN, data_bytes, trailer_bytes) tuples."""
    i = start
    while i + 1 < len(body):
        if body[i] != 0x10:
            break
        NN = body[i + 1]
        if NN == 0 or NN > 0x80 or NN % 4 != 0:
            break
        chunk_len = NN // 2 + 4
        if i + chunk_len > len(body):
            break
        data = bytes(body[i + 2 : i + 2 + NN // 2])
        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
        yield (i, NN, data, trailer)
        i += chunk_len


def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        chunks = list(walk_chunks(body))
        print(f"\n=== {name} === body={len(body)} N_samples={len(b.samples['Tran'])}")
        print(f" chunks parsed: {len(chunks)}")
        if chunks:
            last = chunks[-1]
            end_of_walk = last[0] + last[1] // 2 + 4
            print(f" walk ended at offset {end_of_walk} (= {len(body) - end_of_walk} bytes from end)")
        # Stats
        total_data_bytes = sum(len(c[2]) for c in chunks)
        print(f" total data bytes: {total_data_bytes}, total nibbles: {2*total_data_bytes}")
        if name in ("event-c", "event-d"):
            ratio = (2 * total_data_bytes) / (len(b.samples['Tran']) * 4)
            print(f" nibbles per (sample × channel): {ratio:.3f}")
        # Sum of trailer second-byte
        trailer_sums = [c[3][-1] if c[3] else None for c in chunks]
        print(f" first 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:10]]}")
        # Print last 10 chunks (likely transition to trailer)
        print(f" last 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-10:]]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,51 @@
"""Walk chunks; auto-detect preamble length by finding first 10 NN."""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def walk_chunks(body, start, max_NN=0x80):
    chunks = []
    i = start
    while i + 1 < len(body):
        if body[i] != 0x10:
            break
        NN = body[i + 1]
        if NN == 0 or NN > max_NN or NN % 4 != 0:
            break
        chunk_len = NN // 2 + 4
        if i + chunk_len > len(body):
            break
        data = bytes(body[i + 2 : i + 2 + NN // 2])
        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
        chunks.append((i, NN, data, trailer))
        i += chunk_len
    return chunks, i


def find_first_chunk_start(body):
    """Locate first byte that begins a `10 NN` chunk (NN ∈ multiples of 4, 4..0x7C)."""
    for i in range(20):
        if body[i] == 0x10 and body[i + 1] % 4 == 0 and 0 < body[i + 1] <= 0x7C:
            return i
    return -1


def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        start = find_first_chunk_start(body)
        chunks, end = walk_chunks(body, start)
        print(f"\n=== {name} === body={len(body)} N_samples={len(b.samples['Tran'])} start={start}")
        print(f" chunks parsed: {len(chunks)}, walk ended at {end}")
        if chunks:
            print(f" first 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:5]]}")
            print(f" last 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-5:]]}")
            print(f" bytes around end of walk: {body[end-4:end+12].hex(' ')}")
        else:
            print(f" bytes at start: {body[start:start+16].hex(' ')}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,75 @@
"""
Walker v4: alternate [10 NN] data chunks and [00 NN] (or other) marker tags.

Hypothesis:
- [10 NN]: data block, length NN/2 + 2 bytes (2-byte tag + NN/2 bytes data)
- [00 NN]: 2-byte marker block (no data)
- [20/30/40 NN]: special blocks with type-dependent length
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle


def walk(body, start):
    i = start
    blocks = []
    while i + 1 < len(body):
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0x80:
            # data chunk: length NN/2 + 2
            length = t1 // 2 + 2
            blocks.append((i, "10", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            # 2-byte marker
            blocks.append((i, "00", t1, b"", 2))
            i += 2
        elif t0 == 0x20 and t1 % 4 == 0:
            # type 2: try length 2+t1/2 (similar to 10) OR fixed
            length = t1 // 2 + 2
            blocks.append((i, "20", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x30 and t1 % 4 == 0:
            length = t1 // 2 + 2
            blocks.append((i, "30", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            # Special "footer transition" block: try fixed 22 bytes
            length = 22
            blocks.append((i, "40", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        else:
            # Unknown tag: stop
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
    return blocks, i


def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        # Auto-detect start
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0x80:
                start = s
                break
        else:
            start = 7
        blocks, end = walk(body, start)
        # Categorize
        from collections import Counter
        types = Counter(b[1] for b in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
        print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f" type counts: {dict(types)}")
        # Print last 5 blocks
        print(f" last 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-5:]]}")
        if end < len(body):
            print(f" bytes at end: {body[end:end+24].hex(' ')}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,83 @@
"""
Walker v5: flexible NN range and multiple block-type lengths.

Hypothesis:
- [10 NN]: 4-bit-delta data block, length = NN/2 + 2
- [20 NN]: 8-bit-literal data block, length = NN + 2
- [00 NN]: 2-byte marker (no payload)
- [30 NN]: trailer/summary block, length = NN*4
- [40 NN]: footer-marker block, fixed 22 bytes
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle
from collections import Counter


def walk(body, start, max_blocks=10000):
    i = start
    blocks = []
    while i + 1 < len(body) and len(blocks) < max_blocks:
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "10", t1, data, length))
            i += length
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "20", t1, data, length))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            # 2-byte marker
            blocks.append((i, "00", t1, b"", 2))
            i += 2
        elif t0 == 0x30 and t1 % 4 == 0:
            length = t1 * 4
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "30", t1, data, length))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            length = 22
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "40", t1, data, length))
            i += length
        else:
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
    return blocks, i


def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
                start = s
                break
        else:
            start = 7
        blocks, end = walk(body, start)
        types = Counter(bb[1] for bb in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
        print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f" type counts: {dict(types)}")
        if blocks and blocks[-1][1] == "??":
            print(f" stopped at byte: 0x{blocks[-1][2]:02x}, prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
        # Sum payload sizes by type
        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
        print(f" payload bytes by type: {payload_sizes}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,68 @@
"""
Walker v6: handle 40 02 blocks correctly (length 20).

Block formats:
- [10 NN]: 4-bit nibble delta data, length = NN/2 + 2
- [20 NN]: int8 literal data, length = NN + 2
- [00 NN]: 2-byte marker
- [30 NN]: trailer/summary block, length = NN*4
- [40 02]: segment header, fixed length 20
"""
import sys
sys.path.insert(0, ".")
from analysis.load_bundle import load_bundle
from collections import Counter


def walk(body, start, max_blocks=10000):
    i = start
    blocks = []
    while i + 1 < len(body) and len(blocks) < max_blocks:
        t0 = body[i]
        t1 = body[i + 1]
        # Pick the total block length (tag bytes included) from the 2-byte tag
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
        elif t0 == 0x00 and t1 % 4 == 0:
            length = 2
        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
            length = t1 * 4
        elif t0 == 0x40 and t1 == 0x02:
            length = 20
        else:
            # Unknown tag: record 8 bytes of context and stop
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
        if i + length > len(body):
            break
        data = bytes(body[i + 2 : i + length])
        blocks.append((i, f"{t0:02x}", t1, data, length))
        i += length
    return blocks, i


def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
                start = s
                break
        else:
            start = 7
        blocks, end = walk(body, start)
        types = Counter(bb[1] for bb in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])} start={start}")
        print(f" total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f" type counts: {dict(types)}")
        if blocks and blocks[-1][1] == "??":
            print(f" stopped at byte: 0x{blocks[-1][2]:02x} at offset {blocks[-1][0]}")
            print(f" prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
            # Clamp the slice start so a stop near offset 0 does not wrap around
            print(f" bytes around stop: {body[max(0, end-4):end+24].hex(' ')}")
        # Sum payload sizes by type
        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
        print(f" payload bytes by type: {payload_sizes}")


if __name__ == "__main__":
    main()
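Review note: the walker v6 length rules can be exercised without any device data. The sketch below restates the tag-to-length table from the v6 docstring as a standalone function and walks a small synthetic body. The name `block_length` and the sample bytes are illustrative assumptions, not part of the repository, and the table itself is still the working hypothesis from the docstring rather than a confirmed format.

```python
def block_length(t0: int, t1: int):
    """Return the total block length in bytes for a 2-byte tag, or None if unknown.

    Mirrors the hypothesised walker v6 table; hypothetical helper for illustration.
    """
    if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
        return t1 // 2 + 2          # 4-bit nibble deltas: NN/2 payload bytes + 2 tag bytes
    if t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
        return t1 + 2               # int8 literals: NN payload bytes + 2 tag bytes
    if t0 == 0x00 and t1 % 4 == 0:
        return 2                    # bare 2-byte marker
    if t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
        return t1 * 4               # trailer/summary block
    if t0 == 0x40 and t1 == 0x02:
        return 20                   # segment header, fixed length
    return None


# Synthetic body (made up for this example, not real recorder output)
body = (
    bytes([0x10, 0x04, 0xAA, 0xBB])      # [10 04]: 4 // 2 + 2 = 4 bytes
    + bytes([0x20, 0x04, 1, 2, 3, 4])    # [20 04]: 4 + 2 = 6 bytes
    + bytes([0x00, 0x08])                # [00 08]: 2-byte marker
    + bytes([0xFF, 0x00])                # unknown tag, the walk stops here
)

offsets = []
i = 0
while i + 1 < len(body):
    n = block_length(body[i], body[i + 1])
    if n is None or i + n > len(body):
        break
    offsets.append((i, n))
    i += n

print(offsets, "stopped at", i)
```

Keeping the table in one function makes it easy to compare hypotheses (for example 22 versus 20 bytes for `40 02`) by changing a single return value.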
@@ -0,0 +1,65 @@
"""Run read_idf_file across the corpus and report per-channel accuracy vs sidecars."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from micromate.idf_file import read_idf_file
from analysis_idf.recon import load_sidecar_samples


def sidecar_path(idfw: Path) -> Path:
    return idfw.parent / "TXT" / f"{idfw.name}.txt"


def main():
    root = REPO / "tests/fixtures/THORDATA_example"
    files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
    files.sort()
    GEO_LSB = 0.0003

    n_ok = n_skip = 0
    overall = {"Tran": [], "Vert": [], "Long": []}

    for f in files:
        try:
            res = read_idf_file(f)
        except Exception:
            n_skip += 1
            continue
        sc_path = sidecar_path(f)
        if not sc_path.exists():
            n_skip += 1
            continue
        try:
            sc = load_sidecar_samples(sc_path)
        except Exception:
            n_skip += 1
            continue

        per_file = {}
        for ch in ("Tran", "Vert", "Long"):
            sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
            dec = res.samples.get(ch, [])
            n = min(len(sc_counts), len(dec))
            if n == 0:
                per_file[ch] = 0.0
                continue
            exact = sum(1 for i in range(n) if sc_counts[i] == dec[i])
            pct = 100.0 * exact / n
            per_file[ch] = pct
            overall[ch].append(pct)
        n_ok += 1

    print(f"Processed {n_ok} files (skipped {n_skip})")
    print("Per-channel exact-match % (mean / min / max):")
    for ch, vals in overall.items():
        if vals:
            avg = sum(vals) / len(vals)
            print(f" {ch}: mean={avg:.2f}% min={min(vals):.2f}% max={max(vals):.2f}% n={len(vals)}")


if __name__ == "__main__":
    main()
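Review note: the accuracy metric in the script above is small enough to check by hand. The sketch below isolates the two steps (sidecar value to integer count, then exact-match percentage over the overlapping range) and runs them on made-up numbers. `to_counts` and `exact_match_pct` are hypothetical names; `GEO_LSB = 0.0003` is copied from the script, and the sample values are invented.

```python
GEO_LSB = 0.0003  # scale factor used by the script above


def to_counts(values, lsb=GEO_LSB):
    """Convert sidecar floating-point values to integer counts by rounding."""
    return [int(round(v / lsb)) for v in values]


def exact_match_pct(expected, decoded):
    """Percentage of positions that match exactly over the overlapping range."""
    n = min(len(expected), len(decoded))
    if n == 0:
        return 0.0
    exact = sum(1 for a, b in zip(expected, decoded) if a == b)
    return 100.0 * exact / n


# Invented example: four sidecar values and a decode that is off by one count once
sidecar_values = [0.0003, -0.0006, 0.0015, 0.0]
decoded_counts = [1, -2, 4, 0]

counts = to_counts(sidecar_values)
print(counts, exact_match_pct(counts, decoded_counts))
```

Rounding before comparing matters here: the division by `GEO_LSB` rarely lands exactly on an integer in floating point, so truncating instead of rounding would report spurious mismatches.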
@@ -0,0 +1,49 @@
"""Find where decoded-vs-sidecar diverges for each channel."""
from __future__ import annotations
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from minimateplus.waveform_codec import decode_waveform_v2
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples


def main():
    buf = TARGET.read_bytes()
    sc = load_sidecar_samples(TXT)
    decoded = decode_waveform_v2(buf[0x0f1f:])
    GEO_LSB = 0.0003

    for ch in ("Tran", "Vert", "Long"):
        sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
        dec = decoded[ch]
        # Compare only the overlapping range so a length mismatch cannot raise IndexError
        n = min(len(dec), len(sc_counts))
        # Find the first index where decoded and sidecar counts differ
        first_diff = next((i for i in range(n) if dec[i] != sc_counts[i]), None)
        if first_diff is None:
            print(f"{ch}: NO MISMATCHES")
            continue
        print(f"{ch}: first diff at idx {first_diff}")
        # Show 3 samples before and 7 after the first mismatch
        for i in range(max(0, first_diff - 3), min(n, first_diff + 8)):
            mark = " " if dec[i] == sc_counts[i] else "**"
            print(f" {mark} idx {i:4d}: sc={sc_counts[i]:6d} dec={dec[i]:6d} diff={dec[i]-sc_counts[i]:+d}")
        # Count mismatches and track the longest run of consecutive matches
        cum_match_run = 0
        max_match_run = 0
        diff_count = 0
        for i in range(n):
            if dec[i] == sc_counts[i]:
                cum_match_run += 1
                max_match_run = max(max_match_run, cum_match_run)
            else:
                cum_match_run = 0
                diff_count += 1
        print(f" total mismatches: {diff_count}/{n}, longest run of matches: {max_match_run}")
        print()


if __name__ == "__main__":
    main()
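Review note: the two diagnostics in this script (first mismatch index and longest run of consecutive matches) can be pulled into small pure functions and checked on toy data. The function names and the sample lists below are illustrative assumptions, not code from the repository.

```python
def first_mismatch(a, b):
    """Index of the first differing element over the overlapping range, or None."""
    n = min(len(a), len(b))
    return next((i for i in range(n) if a[i] != b[i]), None)


def longest_match_run(a, b):
    """Length of the longest run of consecutive equal elements."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0   # reset the run on any mismatch
        best = max(best, run)
    return best


# Toy data: one wrong sample at index 2
sidecar = [1, 2, 3, 4, 5, 6]
decoded = [1, 2, 9, 4, 5, 6]

print(first_mismatch(sidecar, decoded), longest_match_run(sidecar, decoded))
```

A long match run after the first mismatch suggests an isolated bad sample; a run that never recovers suggests the decoder lost sync at that point.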
@@ -0,0 +1,48 @@
"""End-to-end IDFH ingest verification."""
from __future__ import annotations
import sys
import tempfile
import json
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from sfm.waveform_store import WaveformStore


def main():
    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
    txt = idfh.parent / "TXT" / f"{idfh.name}.txt"

    with tempfile.TemporaryDirectory() as td:
        store = WaveformStore(Path(td))
        ev, rec = store.save_imported_idf(
            idfh.read_bytes(),
            idfh,
            idf_report_text=txt.read_text(errors="replace"),
        )
        print("=== save_imported_idf (IDFH) ===")
        print(f" serial: {rec['serial']}")
        print(f" filename: {rec['filename']}")
        print(f" filesize: {rec['filesize']}")
        print(f" h5: {rec['hdf5_filename']}")  # expect None for histogram
        print(f" sidecar: {rec['sidecar_filename']}")
        print()
        print("=== Event ===")
        print(f" timestamp: {ev.timestamp}")
        print(f" record_type: {ev.record_type}")
        print(f" sample_rate: {ev.sample_rate}")
        print()
        # Inspect sidecar to confirm intervals were stashed
        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
        sc = json.loads(sc_path.read_text())
        intervals = sc.get("extensions", {}).get("idf_intervals", [])
        print(f" sidecar intervals: {len(intervals)}")
        if intervals:
            print(f" first interval: {intervals[0]}")
            print(f" last interval: {intervals[-1]}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,40 @@
"""Verify the had_report=False path: ingest IDFW with no .txt."""
from __future__ import annotations
import sys
from pathlib import Path
import tempfile

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from sfm.waveform_store import WaveformStore


def main():
    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
    with tempfile.TemporaryDirectory() as td:
        store = WaveformStore(Path(td))
        ev, rec = store.save_imported_idf(
            idfw.read_bytes(),
            idfw,
            serial_hint=None,
            idf_report_text=None,  # no .txt sidecar supplied
        )
        print("=== IDFW without .txt ingest ===")
        print(f" serial: {rec['serial']}")
        print(f" timestamp: {ev.timestamp}")
        print(f" sample_rate: {ev.sample_rate}")
        print(f" record_type: {ev.record_type}")
        print(f" rectime_sec: {ev.rectime_seconds}")
        # Treat a missing raw_samples mapping as empty so every count is 0
        raw = ev.raw_samples or {}
        nT = len(raw.get('Tran', []))
        nV = len(raw.get('Vert', []))
        nL = len(raw.get('Long', []))
        nM = len(raw.get('MicL', []))
        print(f" raw_samples: Tran={nT} Vert={nV} Long={nL} MicL={nM}")
        if ev.peak_values:
            print(f" peak_values: tran={ev.peak_values.tran} vert={ev.peak_values.vert} long={ev.peak_values.long}")
        print(f" h5 written: {rec['hdf5_filename']}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,102 @@
"""End-to-end Thor report PDF rendering.

Ingests an IDFW + .txt via save_imported_idf, runs gather_report_data
(faking a minimal DB row), and renders the PDF to disk.
"""
from __future__ import annotations
import sys
import tempfile
import json
import datetime
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from sfm.waveform_store import WaveformStore
from sfm import report_pdf


class FakeDb:
    """Stand-in for SeismoDb.get_event(); the renderer only needs a few cols."""
    def __init__(self, event):
        self.event = event

    def get_event(self, _id):
        return self.event


def main():
    base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
    idfw = base / "UM11719_20231219162723.IDFW"
    txt = base / "TXT" / f"{idfw.name}.txt"

    with tempfile.TemporaryDirectory() as td:
        store = WaveformStore(Path(td))
        ev, rec = store.save_imported_idf(
            idfw.read_bytes(),
            idfw,
            idf_report_text=txt.read_text(errors="replace"),
        )
        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")

        # Verify sidecar has bw_report block
        sc_path = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
        sc = json.loads(sc_path.read_text())
        bw = sc.get("bw_report", {})
        print(f" bw_report.available: {bw.get('available')}")
        print(f" bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")
        print(f" bw_report.mic.pspl_dbl: {bw.get('mic', {}).get('pspl_dbl')}")
        print(f" bw_report.histogram.n_intervals: {bw.get('histogram', {}).get('n_intervals')}")

        # Build a DB-row-shaped dict from the Event for gather_report_data
        ts = ev.timestamp
        ts_iso = None
        if ts is not None:
            try:
                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
            except Exception:
                pass
        fake_row = {
            "serial": "UM11719",
            "blastware_filename": rec["filename"],
            "record_type": "Waveform",
            "timestamp": ts_iso,
            "sample_rate": ev.sample_rate,
            "project": ev.project_info.project if ev.project_info else None,
            "client": ev.project_info.client if ev.project_info else None,
            "operator": ev.project_info.operator if ev.project_info else None,
            "sensor_location": ev.project_info.sensor_location if ev.project_info else None,
            "created_at": None,
        }

        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
        print()
        print("=== ReportData ===")
        print(f" event_id: {rd.event_id}")
        print(f" serial: {rd.serial}")
        print(f" record_type: {rd.record_type}")
        print(f" event_datetime: {rd.event_datetime_str}")
        print(f" trigger: {rd.trigger_source}")
        print(f" geo_range: {rd.geo_range_str}")
        print(f" sample_rate: {rd.sample_rate_str}")
        print(f" firmware: {rd.firmware}")
        print(f" calibration: {rd.calibration_date} by {rd.calibration_by}")
        print(f" battery: {rd.battery_volts}")
        print(f" PVS: {rd.peak_vector_sum_ips} in/s at {rd.peak_vector_sum_time_s} sec")
        print(f" mic_pspl_dbl: {rd.mic_pspl_dbl}")
        print(f" mic_zc_freq_hz: {rd.mic_zc_freq_hz}")
        print(f" channel_stats: {len(rd.channel_stats)} rows")
        for cs in rd.channel_stats:
            print(f" {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} ToP={cs['time_of_peak_s']} Acc={cs['peak_accel_g']} Disp={cs['peak_disp_in']} Test={cs['sensor_check']}")

        # Render the PDF
        out_path = REPO / "analysis_idf" / "thor_report.pdf"
        pdf_bytes = report_pdf.render_event_report_pdf(rd)
        out_path.write_bytes(pdf_bytes)
        print()
        print(f" PDF written: {out_path} ({len(pdf_bytes)} bytes)")


if __name__ == "__main__":
    main()
@@ -0,0 +1,91 @@
"""End-to-end Thor IDFH histogram report PDF rendering."""
from __future__ import annotations
import sys
import tempfile
import json
import datetime
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from sfm.waveform_store import WaveformStore
from sfm import report_pdf


class FakeDb:
    def __init__(self, event):
        self.event = event

    def get_event(self, _id):
        return self.event


def main():
    # Use the multi-interval IDFH (81 + trigger row)
    idfh = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
    txt = idfh.parent / "TXT" / f"{idfh.name}.txt"

    with tempfile.TemporaryDirectory() as td:
        store = WaveformStore(Path(td))
        ev, rec = store.save_imported_idf(
            idfh.read_bytes(),
            idfh,
            idf_report_text=txt.read_text(errors="replace"),
        )
        print(f"save_imported_idf: h5={rec['hdf5_filename']}, sidecar={rec['sidecar_filename']}")

        sc_path = Path(td) / "UM13981" / f"{idfh.name}.sfm.json"
        sc = json.loads(sc_path.read_text())
        bw = sc.get("bw_report", {})
        hist = bw.get("histogram", {})
        print(f" bw_report.histogram.start: {hist.get('start')}")
        print(f" bw_report.histogram.stop: {hist.get('stop')}")
        print(f" bw_report.histogram.n_intervals: {hist.get('n_intervals')}")
        print(f" bw_report.histogram.interval_size: {hist.get('interval_size')}")
        print(f" bw_report.histogram.interval_size_s: {hist.get('interval_size_s')}")
        print(f" bw_report.peaks.tran.ppv_ips: {bw.get('peaks', {}).get('tran', {}).get('ppv_ips')}")

        ts = ev.timestamp
        ts_iso = None
        if ts is not None:
            try:
                ts_iso = datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
            except Exception:
                pass
        fake_row = {
            "serial": "UM13981",
            "blastware_filename": rec["filename"],
            "record_type": "Histogram",
            "timestamp": ts_iso,
            "sample_rate": ev.sample_rate,
            "project": ev.project_info.project if ev.project_info else None,
            "client": ev.project_info.client if ev.project_info else None,
            "operator": ev.project_info.operator if ev.project_info else None,
            "sensor_location": ev.project_info.sensor_location if ev.project_info else None,
            "created_at": None,
        }
        rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="hist-1")

        print()
        print("=== ReportData (histogram) ===")
        print(f" is_histogram: {rd.is_histogram}")
        print(f" histogram_start: {rd.histogram_start_str}")
        print(f" histogram_stop: {rd.histogram_stop_str}")
        print(f" histogram_n_intervals: {rd.histogram_n_intervals}")
        print(f" histogram_interval_size:{rd.histogram_interval_size}")
        print(f" histogram_interval_times[:3]: {rd.histogram_interval_times[:3]}")
        print(f" histogram_interval_times[-2:]: {rd.histogram_interval_times[-2:]}")
        print(f" channel_stats: {len(rd.channel_stats)} rows")
        for cs in rd.channel_stats:
            print(f" {cs['name']}: PPV={cs['ppv_ips']} ZC={cs['zc_freq_hz']} peak_date={cs['peak_date']} peak_time={cs['peak_time']}")

        pdf_bytes = report_pdf.render_event_report_pdf(rd)
        out_path = REPO / "analysis_idf" / "thor_report_idfh.pdf"
        out_path.write_bytes(pdf_bytes)
        print()
        print(f" PDF written: {out_path} ({len(pdf_bytes)} bytes)")


if __name__ == "__main__":
    main()
@@ -0,0 +1,52 @@
"""End-to-end ingest test: feed an IDFW + .txt to save_imported_idf in a tmp store."""
from __future__ import annotations
import sys
from pathlib import Path
import tempfile

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))

from sfm.waveform_store import WaveformStore


def main():
    idfw = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
    txt = idfw.parent / "TXT" / f"{idfw.name}.txt"

    with tempfile.TemporaryDirectory() as td:
        store = WaveformStore(Path(td))
        ev, rec = store.save_imported_idf(
            idfw.read_bytes(),
            idfw,
            serial_hint=None,
            idf_report_text=txt.read_text(errors="replace"),
        )
        print("=== Save result ===")
        print(f" serial: {rec['serial']}")
        print(f" filename: {rec['filename']}")
        print(f" filesize: {rec['filesize']}")
        print(f" h5: {rec['hdf5_filename']}")
        print(f" sidecar: {rec['sidecar_filename']}")
        print()
        print("=== Event ===")
        print(f" serial: {getattr(ev, 'serial', '(n/a)')}")
        print(f" timestamp: {ev.timestamp}")
        print(f" sample_rate: {ev.sample_rate}")
        print(f" record_type: {ev.record_type}")
        print(f" rectime_sec: {ev.rectime_seconds}")
        # Treat a missing raw_samples mapping as empty so every count is 0
        raw = ev.raw_samples or {}
        counts = ", ".join(f"{ch}={len(raw.get(ch, []))}" for ch in ("Tran", "Vert", "Long", "MicL"))
        print(f" raw_samples: {counts}")
        if ev.peak_values:
            print(f" peaks (txt): Tran={ev.peak_values.tran} Vert={ev.peak_values.vert} Long={ev.peak_values.long}")
        print()

        # Verify the h5 file actually got written
        h5path = Path(td) / "UM11719" / f"{idfw.name}.h5"
        print(f" h5 exists: {h5path.exists()} size={h5path.stat().st_size if h5path.exists() else 0}")
        sidecar = Path(td) / "UM11719" / f"{idfw.name}.sfm.json"
        print(f" sidecar exists:{sidecar.exists()} size={sidecar.stat().st_size if sidecar.exists() else 0}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,137 @@
"""Decode IDFH histogram intervals + verify against sidecar."""
from __future__ import annotations
import sys
import struct
from pathlib import Path

REPO = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO))


SEGMENT_MAGIC = b"\x02\xda\x0a\x00\x00\x00"
SEGMENT_SIZE = 732  # = 10-byte header + 10 × 72-byte intervals + 2-byte tail
INTERVAL_SIZE = 72
CHANNELS = ("Tran", "Vert", "Long", "MicL")


def decode_interval(buf72: bytes) -> dict:
    """Decode one 72-byte interval into per-channel min/max/halfp."""
    out = {}
    # Four 16-byte channel blocks (64 bytes), then an 8-byte tail
    for i, ch in enumerate(CHANNELS):
        block = buf72[i*16 : (i+1)*16]
        mn = struct.unpack_from(">h", block, 0)[0]
        mx = struct.unpack_from(">h", block, 2)[0]
        sb = struct.unpack_from(">h", block, 4)[0]
        halfp = struct.unpack_from(">H", block, 6)[0]
        f10 = struct.unpack_from(">H", block, 10)[0]
        f14 = struct.unpack_from(">H", block, 14)[0]
        peak_count = max(abs(mn), abs(mx))
        out[ch] = {
            "min": mn,
            "max": mx,
            "field4": sb,
            "halfp": halfp,
            "field10": f10,
            "field14": f14,
            "peak": peak_count,
            "freq_hz": (512.0 / halfp) if halfp > 5 else None,
        }
    out["_tail"] = buf72[64:].hex(" ")
    return out


def walk_idfh(buf: bytes) -> list:
    """Walk all interval records in an IDFH file."""
    intervals = []
    # Multi-segment file: every 02 da 0a 00 00 00 marker introduces a segment.
    # Single-interval file: just one body header at 0xf96 of form ?? ?? 0a 00 00 00.
    # Find them all.
    i = 0
    while True:
        j = buf.find(b"\x0a\x00\x00\x00", i)
        if j < 0:
            break
        # The 2 bytes before the marker hold a big-endian length, so a marker
        # at offset 0 or 1 cannot be a header.
        if j < 2:
            i = j + 1
            continue
        if j + 10 > len(buf):
            break
        length = int.from_bytes(buf[j-2:j], "big")
        # Verify the segment-header shape: [length_be][0a 00 00 00][00 NN][05 3f]
        if buf[j+4] != 0x00:
            i = j + 1
            continue
        if buf[j+6:j+8] != b"\x05\x3f":
            i = j + 1
            continue
        # Header layout (10 bytes): [length_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
        # Followed by N interval records of 72 bytes each, then 2 tail bytes.
        # length value = (N × 72) + 10: it counts the 10-byte header (length
        # field included) plus the interval data, but not the 2 tail bytes.
        header_start = j - 2
        n_intervals = (length - 10) // INTERVAL_SIZE
        interval_start = header_start + 10
        for k in range(n_intervals):
            off = interval_start + k * INTERVAL_SIZE
            if off + INTERVAL_SIZE > len(buf):
                break
            chunk = buf[off:off + INTERVAL_SIZE]
            intervals.append({"offset": off, **decode_interval(chunk)})
        # Always advance past this marker, so a tiny length field cannot
        # make the search find the same marker again.
        i = max(j + 1, header_start + length + 2)
    return intervals


def main():
    # Test against multi-segment IDFH
    target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
    sc_path = target.parent / "TXT" / f"{target.name}.txt"
    buf = target.read_bytes()
    intervals = walk_idfh(buf)
    print(f"=== {target.name} ===")
    print(f" file size: {len(buf)}")
    print(f" decoded intervals: {len(intervals)}")
    # Collect the sidecar data rows (lines that start with a date)
    sc_rows = []
    for line in sc_path.read_text(errors="replace").splitlines():
        if line.startswith(("2022-", "2023-")):
            sc_rows.append(line)
    print(f" sidecar rows: {len(sc_rows)}")

    print()
    # Show the first 2 and last 3 intervals next to their sidecar rows
    for k in [0, 1, 78, 79, 80]:
        if k >= len(intervals):
            continue
        iv = intervals[k]
        print(f"--- interval {k} @0x{iv['offset']:04x} ---")
        for ch in CHANNELS:
            d = iv[ch]
            peak_ips = d["peak"] / 32768 * 10.0
            print(f" {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s) halfp={d['halfp']:5d} freq={d['freq_hz']}")
        # sidecar row
        if k < len(sc_rows):
            print(f" SC: {sc_rows[k]}")

    # Test single-interval IDFH
    print()
    target2 = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
    sc2 = target2.parent / "TXT" / f"{target2.name}.txt"
    buf2 = target2.read_bytes()
    intervals2 = walk_idfh(buf2)
    print(f"=== {target2.name} ===")
    print(f" file size: {len(buf2)}, decoded intervals: {len(intervals2)}")
    if intervals2:
        iv = intervals2[0]
        for ch in CHANNELS:
            d = iv[ch]
            peak_ips = d["peak"] / 32768 * 10.0
            print(f" {ch}: peak={d['peak']:5d} ({peak_ips:.4f} in/s) halfp={d['halfp']:5d} freq={d['freq_hz']}")
        sc_rows2 = [l for l in sc2.read_text(errors='replace').splitlines() if l.startswith("2023-")]
        if sc_rows2:
            print(f" SC: {sc_rows2[0]}")


if __name__ == "__main__":
    main()
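Review note: the segment arithmetic and the per-channel field offsets used by this decoder can be sanity-checked on synthetic bytes. The sketch below assumes the layout stated in the script's comments (big-endian fields, min at offset 0, max at 2, half-period at 6, 72-byte intervals, 10-byte header, 2 tail bytes). `decode_channel` and the packed sample block are hypothetical and do not come from a real IDFH file.

```python
import struct

INTERVAL_SIZE = 72
HEADER_SIZE = 10
TAIL_SIZE = 2


def decode_channel(block16: bytes) -> dict:
    """Decode the documented fields of one 16-byte channel block (illustrative)."""
    mn, mx = struct.unpack_from(">hh", block16, 0)      # signed min / max counts
    halfp = struct.unpack_from(">H", block16, 6)[0]     # unsigned half-period
    return {
        "min": mn,
        "max": mx,
        "peak": max(abs(mn), abs(mx)),
        "halfp": halfp,
        "freq_hz": (512.0 / halfp) if halfp > 5 else None,
    }


# Length field from the multi-segment marker 02 da
length = int.from_bytes(b"\x02\xda", "big")
n_intervals = (length - HEADER_SIZE) // INTERVAL_SIZE
segment_size = length + TAIL_SIZE

# Synthetic channel block: min=-120, max=200, field4=0, halfp=64, then 8 zero bytes
block = struct.pack(">hhhH", -120, 200, 0, 64) + bytes(8)
decoded = decode_channel(block)

print(length, n_intervals, segment_size, decoded)
```

The arithmetic agrees with the constants in the script: a length of 730 gives 10 intervals, and adding the 2 tail bytes gives the 732-byte segment size.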
@@ -0,0 +1,41 @@
|
|||||||
|
"""Find IDFH interval period via auto-correlation of structural patterns."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
from collections import Counter
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM13981/UM13981_20220805075441.IDFH"
|
||||||
|
buf = target.read_bytes()
|
||||||
|
body_start = 0xF96
|
||||||
|
body_end = 0x270C
|
||||||
|
body = buf[body_start:body_end]
|
||||||
|
print(f"body size: {len(body)} bytes (file {len(buf)} bytes)")
|
||||||
|
|
||||||
|
# For each candidate interval size, count how many bytes at fixed offsets within
|
||||||
|
# each interval are zero (consistent column-zero pattern indicates correct size).
|
||||||
|
print()
|
||||||
|
print("=== zero-column score by interval size (higher = more likely) ===")
|
||||||
|
best = []
|
||||||
|
for sz in range(16, 100):
|
||||||
|
n = len(body) // sz
|
||||||
|
if n < 30:
|
||||||
|
continue
|
||||||
|
# For each column position within an interval, count how many of n intervals have zero
|
||||||
|
score = 0
|
||||||
|
for col in range(sz):
|
||||||
|
zeros = sum(1 for i in range(n) if body[i*sz + col] == 0)
|
||||||
|
if zeros >= n * 0.9:
|
||||||
|
score += 1
|
||||||
|
best.append((score, sz, n))
|
||||||
|
best.sort(reverse=True)
|
||||||
|
for score, sz, n in best[:10]:
|
||||||
|
print(f" size={sz:3d} n_intervals={n} consistently-zero-cols={score}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
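The zero-column heuristic in the script above can be sketched in isolation on synthetic data. This is only an illustration under assumptions: `zero_column_score` and `make_synthetic_body` are hypothetical names, and the 24-byte record layout is invented for the example, not taken from the IDFH format.

```python
from __future__ import annotations


def zero_column_score(body: bytes, size: int, threshold: float = 0.9) -> int:
    """Count column positions that are zero in at least `threshold` of the records."""
    n = len(body) // size
    if n == 0:
        return 0
    score = 0
    for col in range(size):
        zeros = sum(1 for i in range(n) if body[i * size + col] == 0)
        if zeros >= n * threshold:
            score += 1
    return score


def make_synthetic_body(record_size: int = 24, n_records: int = 40) -> bytes:
    # Hypothetical layout: the first 4 bytes of each record are always zero,
    # the remaining bytes are always nonzero.
    record = bytes([0, 0, 0, 0] + [0x11 + j for j in range(record_size - 4)])
    return record * n_records


body = make_synthetic_body()
scores = {size: zero_column_score(body, size) for size in (16, 23, 24, 25, 48)}
for size, score in scores.items():
    print(f"size={size:3d} consistently-zero-cols={score}")
```

On this synthetic input, exact multiples of the true record size also score well (each zero column repeats once per copy of the record), so a high raw score alone does not single out the period. Comparing the smallest strong candidate against its multiples is one way to read the ranking.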
@@ -0,0 +1,40 @@
|
|||||||
|
"""Per-file accuracy + sample-count details."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from micromate.idf_file import read_idf_file
|
||||||
|
from analysis_idf.recon import load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
root = REPO / "tests/fixtures/THORDATA_example"
|
||||||
|
files = sorted([f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")])
|
||||||
|
GEO_LSB = 0.0003
|
||||||
|
# Limit to the first 20 files that decode and have a sidecar, for detail.
|
||||||
|
shown = 0
|
||||||
|
for f in files:
|
||||||
|
try:
|
||||||
|
res = read_idf_file(f)
|
||||||
|
except Exception:
|
||||||
|
continue
|
||||||
|
sc_path = f.parent / "TXT" / f"{f.name}.txt"
|
||||||
|
if not sc_path.exists():
|
||||||
|
continue
|
||||||
|
sc = load_sidecar_samples(sc_path)
|
||||||
|
sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
|
||||||
|
dec = res.samples.get("Tran", [])
|
||||||
|
n = min(len(sc_tran), len(dec))
|
||||||
|
exact = sum(1 for i in range(n) if sc_tran[i] == dec[i]) if n else 0
|
||||||
|
pct = 100.0 * exact / n if n else 0.0
|
||||||
|
print(f"{f.name:40s} size={f.stat().st_size:6d} sc_n={len(sc_tran):4d} dec_n={len(dec):4d} exact={pct:.1f}%")
|
||||||
|
shown += 1
|
||||||
|
if shown >= 20:
|
||||||
|
break
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,64 @@
|
|||||||
|
"""Look at what's at the divergence boundary."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from minimateplus.waveform_codec import walk_body, find_data_start, parse_segment_header
|
||||||
|
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
body = buf[0x0f1f:]
|
||||||
|
start = find_data_start(body)
|
||||||
|
print(f"data_start: {start} (= file offset 0x{0x0f1f + start:04x})")
|
||||||
|
|
||||||
|
blocks = walk_body(body, start)
|
||||||
|
print(f"{len(blocks)} blocks total")
|
||||||
|
print()
|
||||||
|
|
||||||
|
# First 30 blocks
|
||||||
|
print("=== first 30 blocks ===")
|
||||||
|
for i, b in enumerate(blocks[:30]):
|
||||||
|
body_off = 0x0f1f + b.offset
|
||||||
|
if b.tag_hi == 0x40:
|
||||||
|
hdr = parse_segment_header(b)
|
||||||
|
print(f" [{i:3d}] @0x{body_off:04x} {b.kind} (segment header) counter={hdr['counter'] if hdr else '?'} field2={hdr['field2'].hex() if hdr else '?'} anchor={hdr['anchor_bytes'].hex() if hdr else '?'} tail={hdr['tail'].hex() if hdr else '?'}")
|
||||||
|
else:
|
||||||
|
print(f" [{i:3d}] @0x{body_off:04x} {b.kind} len={b.length} data={b.data[:16].hex()}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
# Cumulative sample counts per block to find which block contains sample 254
|
||||||
|
print("=== cumulative samples through blocks ===")
|
||||||
|
cur_ch = "Tran"
|
||||||
|
rotation = ["Vert", "Long", "MicL", "Tran"]
|
||||||
|
seg_count = 0
|
||||||
|
samples_in_curseg = 2 # preamble Tran[0], Tran[1]
|
||||||
|
for i, b in enumerate(blocks[:30]):
|
||||||
|
if b.tag_hi == 0x40:
|
||||||
|
seg_count += 1
|
||||||
|
prev_ch = cur_ch
|
||||||
|
cur_ch = rotation[(seg_count - 1) % 4]
|
||||||
|
print(f" [{i:3d}] 40 02 -> end of {prev_ch} segment, start {cur_ch} (segment {seg_count})")
|
||||||
|
samples_in_curseg = 2 # anchors
|
||||||
|
elif (b.tag_hi & 0xF0) == 0x10:
|
||||||
|
nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
|
||||||
|
samples_in_curseg += nn
|
||||||
|
print(f" [{i:3d}] {b.kind} nibble: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||||
|
elif (b.tag_hi & 0xF0) == 0x20:
|
||||||
|
nn = ((b.tag_hi & 0x0F) << 8) | b.tag_lo
|
||||||
|
samples_in_curseg += nn
|
||||||
|
print(f" [{i:3d}] {b.kind} int8: +{nn} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||||
|
elif b.tag_hi == 0x00:
|
||||||
|
samples_in_curseg += b.tag_lo
|
||||||
|
print(f" [{i:3d}] {b.kind} RLE: +{b.tag_lo}, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||||
|
elif b.tag_hi == 0x30:
|
||||||
|
samples_in_curseg += b.tag_lo
|
||||||
|
print(f" [{i:3d}] {b.kind} packed12: +{b.tag_lo} samples, ch={cur_ch}, ch_total~{samples_in_curseg}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,89 @@
|
|||||||
|
"""Reconnaissance helpers for cracking the Thor IDFW binary."""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
TARGET = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
|
||||||
|
TXT = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/TXT/UM11719_20231219162723.IDFW.txt"
|
||||||
|
|
||||||
|
|
||||||
|
def hex_at(buf: bytes, off: int, n: int = 32) -> str:
|
||||||
|
chunk = buf[off : off + n]
|
||||||
|
hexs = " ".join(f"{b:02x}" for b in chunk)
|
||||||
|
asc = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
|
||||||
|
return f"{off:04x}: {hexs} {asc}"
|
||||||
|
|
||||||
|
|
||||||
|
def find_all(buf: bytes, needle: bytes) -> list[int]:
|
||||||
|
out: list[int] = []
|
||||||
|
i = 0
|
||||||
|
while True:
|
||||||
|
j = buf.find(needle, i)
|
||||||
|
if j < 0:
|
||||||
|
break
|
||||||
|
out.append(j)
|
||||||
|
i = j + 1
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def load_sidecar_samples(path: Path) -> dict[str, list[float]]:
|
||||||
|
"""Parse the txt sample table — Tran/Vert/Long/MicL."""
|
||||||
|
out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||||
|
in_block = False
|
||||||
|
for line in path.read_text(errors="replace").splitlines():
|
||||||
|
if not in_block:
|
||||||
|
if line.strip() == "Waveform Data Channels":
|
||||||
|
in_block = True
|
||||||
|
continue
|
||||||
|
if line.startswith("Waveform Data USB Channels"):
|
||||||
|
break
|
||||||
|
parts = line.split("\t")
|
||||||
|
# First row is the header "\tTran\tVert\tLong\tMicL"
|
||||||
|
if len(parts) >= 5 and parts[1] == "Tran":
|
||||||
|
continue
|
||||||
|
if len(parts) < 5:
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
out["Tran"].append(float(parts[1]))
|
||||||
|
out["Vert"].append(float(parts[2]))
|
||||||
|
out["Long"].append(float(parts[3]))
|
||||||
|
out["MicL"].append(float(parts[4]))
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
samples = load_sidecar_samples(TXT)
|
||||||
|
print(f"file size: {len(buf)} bytes")
|
||||||
|
print(f"sample rows: Tran={len(samples['Tran'])} Vert={len(samples['Vert'])} Long={len(samples['Long'])} MicL={len(samples['MicL'])}")
|
||||||
|
print(f"first 6 Tran samples: {samples['Tran'][:6]}")
|
||||||
|
print(f"first 6 Vert samples: {samples['Vert'][:6]}")
|
||||||
|
print(f"first 6 Long samples: {samples['Long'][:6]}")
|
||||||
|
print(f"first 6 MicL samples: {samples['MicL'][:6]}")
|
||||||
|
|
||||||
|
print()
|
||||||
|
print("=== BW magic '00 02 00' positions ===")
|
||||||
|
hits = find_all(buf, b"\x00\x02\x00")
|
||||||
|
print(f"{len(hits)} hits")
|
||||||
|
for h in hits[:20]:
|
||||||
|
print(hex_at(buf, h, 24))
|
||||||
|
|
||||||
|
print()
|
||||||
|
print("=== '40 02' segment-header positions ===")
|
||||||
|
hits = find_all(buf, b"\x40\x02")
|
||||||
|
print(f"{len(hits)} hits")
|
||||||
|
for h in hits:
|
||||||
|
ctx_pre = buf[max(0, h - 4): h].hex()
|
||||||
|
ctx_post = buf[h: h + 20].hex()
|
||||||
|
# Show the preceding bytes to help tell real headers from chance occurrences
|
||||||
|
print(f" 0x{h:04x} pre={ctx_pre} post={ctx_post}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,40 @@
|
|||||||
|
"""Find each segment boundary in the channel and check if errors reset there."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from minimateplus.waveform_codec import decode_waveform_v2
|
||||||
|
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
sc = load_sidecar_samples(TXT)
|
||||||
|
decoded = decode_waveform_v2(buf[0x0f1f:])
|
||||||
|
GEO_LSB = 0.0003
|
||||||
|
|
||||||
|
for ch in ("Tran", "Vert", "Long"):
|
||||||
|
sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
|
||||||
|
dec = decoded[ch]
|
||||||
|
# Find every transition where error becomes zero from nonzero (or grows from zero)
|
||||||
|
# Print indices where dec resyncs back to exact match.
|
||||||
|
n = min(len(sc_counts), len(dec))
|
||||||
|
events = []
|
||||||
|
prev_match = True
|
||||||
|
for i in range(n):
|
||||||
|
match = sc_counts[i] == dec[i]
|
||||||
|
if match != prev_match:
|
||||||
|
kind = "RESYNC" if match else "DIVERGE"
|
||||||
|
events.append((i, kind, sc_counts[i], dec[i]))
|
||||||
|
prev_match = match
|
||||||
|
print(f"{ch}: {len(events)} transitions")
|
||||||
|
for i, kind, sc_v, dec_v in events[:20]:
|
||||||
|
print(f" idx {i:4d} {kind:8s} sc={sc_v:6d} dec={dec_v:6d} diff={dec_v-sc_v:+d}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
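The DIVERGE/RESYNC bookkeeping used above can be exercised on small lists without any fixture files. A minimal sketch, assuming plain integer lists; `match_transitions` is a hypothetical helper name, not part of the repository.

```python
from __future__ import annotations


def match_transitions(expected: list[int], decoded: list[int]) -> list[tuple[int, str]]:
    """Return (index, kind) for every point where exact-match status flips."""
    n = min(len(expected), len(decoded))
    events: list[tuple[int, str]] = []
    prev_match = True  # treat the stream as in sync before the first sample
    for i in range(n):
        match = expected[i] == decoded[i]
        if match != prev_match:
            events.append((i, "RESYNC" if match else "DIVERGE"))
        prev_match = match
    return events


expected = [0, 1, 2, 3, 4, 5, 6, 7]
decoded = [0, 1, 9, 9, 4, 5, 8, 7]
events = match_transitions(expected, decoded)
for idx, kind in events:
    print(f"idx {idx:4d} {kind}")
```

If RESYNC indices line up with segment boundaries in a real file, that would suggest the decoder's error is reset by the segment anchors rather than accumulating indefinitely.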
@@ -0,0 +1,46 @@
|
|||||||
|
"""Smoke-test read_idf_file on IDFH across the corpus."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from micromate.idf_file import read_idf_file
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162648.IDFH"
|
||||||
|
result = read_idf_file(target)
|
||||||
|
ev = result.event
|
||||||
|
print(f"=== {target.name} ===")
|
||||||
|
print(f" signature: {result.signature}")
|
||||||
|
print(f" serial: {ev.serial}")
|
||||||
|
print(f" timestamp: {ev.timestamp}")
|
||||||
|
print(f" sample_rate: {ev.sample_rate}")
|
||||||
|
print(f" kind: {ev.kind}")
|
||||||
|
print(f" intervals: {len(result.intervals or [])}")
|
||||||
|
print(f" peaks: T={ev.peaks.transverse_ips:.4f} V={ev.peaks.vertical_ips:.4f} L={ev.peaks.longitudinal_ips:.4f}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
root = REPO / "tests/fixtures/THORDATA_example"
|
||||||
|
files = list(root.rglob("*.IDFH"))
|
||||||
|
ok = fail = nyi = 0
|
||||||
|
total_intervals = 0
|
||||||
|
for f in files:
|
||||||
|
try:
|
||||||
|
r = read_idf_file(f)
|
||||||
|
ok += 1
|
||||||
|
total_intervals += len(r.intervals or [])
|
||||||
|
except NotImplementedError:
|
||||||
|
nyi += 1
|
||||||
|
except Exception as exc:
|
||||||
|
fail += 1
|
||||||
|
if fail <= 3:
|
||||||
|
print(f" FAIL: {f.name}: {type(exc).__name__}: {exc}")
|
||||||
|
print(f"Corpus: {len(files)} IDFH files | ok={ok} fail={fail} nyi={nyi}")
|
||||||
|
print(f"Total intervals decoded: {total_intervals}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,48 @@
|
|||||||
|
"""Smoke-test read_idf_file across the sample corpus."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from micromate.idf_file import read_idf_file, geo_count_to_ips, mic_count_to_psi
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
target = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719/UM11719_20231219162723.IDFW"
|
||||||
|
result = read_idf_file(target)
|
||||||
|
ev = result.event
|
||||||
|
print(f"=== {target.name} ===")
|
||||||
|
print(f" signature: {result.signature}")
|
||||||
|
print(f" serial: {ev.serial}")
|
||||||
|
print(f" timestamp: {ev.timestamp}")
|
||||||
|
print(f" sample_rate: {ev.sample_rate}")
|
||||||
|
print(f" record_time: {ev.record_time_sec}")
|
||||||
|
print(f" calibration: {result.binary_metadata.calibration_date}")
|
||||||
|
print(f" Tran samples: {len(result.samples['Tran'])}, peak_ips={ev.peaks.transverse_ips:.4f}")
|
||||||
|
print(f" Vert samples: {len(result.samples['Vert'])}, peak_ips={ev.peaks.vertical_ips:.4f}")
|
||||||
|
print(f" Long samples: {len(result.samples['Long'])}, peak_ips={ev.peaks.longitudinal_ips:.4f}")
|
||||||
|
print(f" MicL samples: {len(result.samples['MicL'])}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
# Corpus sweep
|
||||||
|
root = REPO / "tests/fixtures/THORDATA_example"
|
||||||
|
files = [f for f in root.rglob("*.IDFW") if not str(f).endswith(".CDB")]
|
||||||
|
ok = fail = nyi = 0
|
||||||
|
for f in files:
|
||||||
|
try:
|
||||||
|
r = read_idf_file(f)
|
||||||
|
ok += 1
|
||||||
|
except NotImplementedError:
|
||||||
|
nyi += 1
|
||||||
|
except Exception as exc:
|
||||||
|
fail += 1
|
||||||
|
if fail <= 5:
|
||||||
|
print(f" FAIL: {f.name}: {type(exc).__name__}: {exc}")
|
||||||
|
print()
|
||||||
|
print(f"Corpus: {len(files)} IDFW files | ok={ok} fail={fail} not-implemented={nyi}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
"""Verify build_bw_report_from_idf against a known sidecar."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import json
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from micromate.idf_ascii_report import parse_idf_report
|
||||||
|
from micromate.idf_to_bw_report import build_bw_report_from_idf
|
||||||
|
from micromate.idf_file import read_idf_file
|
||||||
|
|
||||||
|
|
||||||
|
def show(prefix: str, d: dict, indent: int = 0):
|
||||||
|
for k, v in d.items():
|
||||||
|
if isinstance(v, dict):
|
||||||
|
print(f"{' '*indent}{prefix}{k}:")
|
||||||
|
show("", v, indent + 1)
|
||||||
|
else:
|
||||||
|
print(f"{' '*indent}{prefix}{k}: {v!r}")
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
base = REPO / "tests/fixtures/THORDATA_example/THORDATA_example/UPMC Presby/UM11719"
|
||||||
|
idfw = base / "UM11719_20231219162723.IDFW"
|
||||||
|
txt = base / "TXT" / f"{idfw.name}.txt"
|
||||||
|
|
||||||
|
report_dict = parse_idf_report(txt.read_text(errors="replace"))
|
||||||
|
res = read_idf_file(idfw)
|
||||||
|
bw = build_bw_report_from_idf(report_dict, binary_md=res.binary_metadata)
|
||||||
|
|
||||||
|
print("=== IDFW → bw_report ===")
|
||||||
|
show("", bw)
|
||||||
|
|
||||||
|
print()
|
||||||
|
print("=== IDFH (single trigger row) ===")
|
||||||
|
idfh = base / "UM11719_20231219162648.IDFH"
|
||||||
|
txt_h = base / "TXT" / f"{idfh.name}.txt"
|
||||||
|
rh = parse_idf_report(txt_h.read_text(errors="replace"))
|
||||||
|
res_h = read_idf_file(idfh)
|
||||||
|
bw_h = build_bw_report_from_idf(rh, binary_md=res_h.binary_metadata, intervals=res_h.intervals)
|
||||||
|
show("", bw_h)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,73 @@
|
|||||||
|
"""Trace Tran sample-by-sample to find exactly where the codec drifts."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def s4(n: int) -> int:
|
||||||
|
return n if n < 8 else n - 16
|
||||||
|
|
||||||
|
|
||||||
|
def i8(b: int) -> int:
|
||||||
|
return b if b < 128 else b - 256
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
sc = load_sidecar_samples(TXT)
|
||||||
|
GEO_LSB = 0.0003
|
||||||
|
sc_tran = [int(round(v / GEO_LSB)) for v in sc["Tran"]]
|
||||||
|
|
||||||
|
body = buf[0x0f1f:]
|
||||||
|
# Tran[0], Tran[1] from preamble
|
||||||
|
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||||
|
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||||
|
print(f"preamble Tran[0]={t0} Tran[1]={t1} (sidecar: {sc_tran[0]}, {sc_tran[1]})")
|
||||||
|
|
||||||
|
# Block 0: 10 f8 at body[7:9]
|
||||||
|
print(f"block 0: tag {body[7]:02x} {body[8]:02x}")
|
||||||
|
print(f" block 0 first 10 data bytes: {body[9:19].hex()}")
|
||||||
|
|
||||||
|
# Walk block 0 manually, comparing each sample
|
||||||
|
cur = t1
|
||||||
|
samples = [t0, t1]
|
||||||
|
block_off = 7
|
||||||
|
nn = body[8]
|
||||||
|
print(f" NN = {nn}")
|
||||||
|
data = body[9 : 9 + nn // 2]
|
||||||
|
for byi, byte in enumerate(data):
|
||||||
|
for nib_idx, nib in enumerate(((byte >> 4) & 0xF, byte & 0xF)):
|
||||||
|
cur += s4(nib)
|
||||||
|
samples.append(cur)
|
||||||
|
idx = len(samples) - 1
|
||||||
|
if 0 <= idx < len(sc_tran):
|
||||||
|
sc_v = sc_tran[idx]
|
||||||
|
match = "✓" if sc_v == cur else "✗"
|
||||||
|
if idx < 12 or 240 <= idx <= 260:
|
||||||
|
print(f" idx {idx:3d}: nibble byte={byte:02x} nib={nib:x} delta={s4(nib):+d} cur={cur:+d} sc={sc_v:+d} {match}")
|
||||||
|
|
||||||
|
print(f"end of block 0: cur={cur}, len(samples)={len(samples)}, decoder expected 250 here")
|
||||||
|
# Block 1 (expected tag 20 28) starts at body offset 9 + nn // 2 = 9 + 124 = 133 when NN = 248
|
||||||
|
block1_off = 9 + nn // 2
|
||||||
|
print(f"block 1: tag {body[block1_off]:02x} {body[block1_off+1]:02x} (expecting 20 28)")
|
||||||
|
nn1 = body[block1_off + 1]
|
||||||
|
print(f" block 1 NN = {nn1}")
|
||||||
|
data1 = body[block1_off + 2 : block1_off + 2 + nn1]
|
||||||
|
for byi, byte in enumerate(data1):
|
||||||
|
cur += i8(byte)
|
||||||
|
samples.append(cur)
|
||||||
|
idx = len(samples) - 1
|
||||||
|
if idx < len(sc_tran):
|
||||||
|
sc_v = sc_tran[idx]
|
||||||
|
match = "✓" if sc_v == cur else "✗"
|
||||||
|
if 248 <= idx <= 295:
|
||||||
|
print(f" idx {idx:3d}: int8 byte={byte:02x} delta={i8(byte):+d} cur={cur:+d} sc={sc_v:+d} {match}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
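The two delta encodings traced above (signed 4-bit nibbles, high nibble first, and signed 8-bit bytes) can be shown on a few hand-made bytes. This is a sketch under assumptions: the function names `decode_nibble_deltas` and `decode_int8_deltas` are hypothetical, and the input bytes are invented rather than taken from an IDFW file.

```python
from __future__ import annotations


def s4(n: int) -> int:
    """Interpret a 4-bit value as a signed two's-complement integer."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Interpret a byte as a signed two's-complement integer."""
    return b if b < 128 else b - 256


def decode_nibble_deltas(start: int, data: bytes) -> list[int]:
    """Accumulate two signed 4-bit deltas per byte, high nibble first."""
    cur = start
    out: list[int] = []
    for byte in data:
        for nib in ((byte >> 4) & 0xF, byte & 0xF):
            cur += s4(nib)
            out.append(cur)
    return out


def decode_int8_deltas(start: int, data: bytes) -> list[int]:
    """Accumulate one signed 8-bit delta per byte."""
    cur = start
    out: list[int] = []
    for byte in data:
        cur += i8(byte)
        out.append(cur)
    return out


nibble_samples = decode_nibble_deltas(10, bytes([0x1F, 0x70, 0x8E]))
int8_samples = decode_int8_deltas(nibble_samples[-1], bytes([0x05, 0xFB, 0x80]))
print(nibble_samples)
print(int8_samples)
```

The int8 block continues from the last value of the nibble block, mirroring how the trace script carries `cur` from block 0 into block 1.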
@@ -0,0 +1,42 @@
|
|||||||
|
"""Feed candidate body offsets to the BW codec and compare with sidecar."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
|
||||||
|
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
sc = load_sidecar_samples(TXT)
|
||||||
|
# Sidecar samples in 0.0003 counts (Thor geo LSB).
|
||||||
|
sc_tran = [int(round(v / 0.0003)) for v in sc["Tran"][:30]]
|
||||||
|
sc_vert = [int(round(v / 0.0003)) for v in sc["Vert"][:30]]
|
||||||
|
sc_long = [int(round(v / 0.0003)) for v in sc["Long"][:30]]
|
||||||
|
sc_micl = [int(round(v / 1e-6)) for v in sc["MicL"][:30]] # 1 µ unit for mic? Will iterate.
|
||||||
|
print(f"sidecar Tran (counts): {sc_tran}")
|
||||||
|
print(f"sidecar Vert (counts): {sc_vert}")
|
||||||
|
print(f"sidecar Long (counts): {sc_long}")
|
||||||
|
print(f"sidecar MicL (×1e-6): {sc_micl}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
# Try candidate body start offsets.
|
||||||
|
for off in (0x0f1f, 0x1057, 0x11f1, 0x1333, 0x1bde, 0x0d30):
|
||||||
|
print(f"=== body @ 0x{off:04x} ===")
|
||||||
|
body = buf[off:]
|
||||||
|
decoded = decode_waveform_v2(body)
|
||||||
|
if not decoded:
|
||||||
|
print(" decode_waveform_v2 returned None")
|
||||||
|
continue
|
||||||
|
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||||
|
arr = decoded.get(ch, [])
|
||||||
|
print(f" {ch}[{len(arr)}]: {arr[:20]}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,51 @@
|
|||||||
|
"""Verify decode_waveform_v2 against sidecar across all 2304 samples per channel."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
REPO = Path(__file__).resolve().parents[1]
|
||||||
|
sys.path.insert(0, str(REPO))
|
||||||
|
|
||||||
|
from minimateplus.waveform_codec import decode_waveform_v2
|
||||||
|
from analysis_idf.recon import TARGET, TXT, load_sidecar_samples
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
buf = TARGET.read_bytes()
|
||||||
|
sc = load_sidecar_samples(TXT)
|
||||||
|
body = buf[0x0f1f:]
|
||||||
|
decoded = decode_waveform_v2(body)
|
||||||
|
|
||||||
|
print(f"Sidecar lengths: Tran={len(sc['Tran'])} Vert={len(sc['Vert'])} Long={len(sc['Long'])} MicL={len(sc['MicL'])}")
|
||||||
|
print(f"Decoded lengths: Tran={len(decoded['Tran'])} Vert={len(decoded['Vert'])} Long={len(decoded['Long'])} MicL={len(decoded['MicL'])}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
GEO_LSB = 0.0003 # in/s per count
|
||||||
|
for ch in ("Tran", "Vert", "Long"):
|
||||||
|
sc_counts = [int(round(v / GEO_LSB)) for v in sc[ch]]
|
||||||
|
dec = decoded[ch]
|
||||||
|
n = min(len(sc_counts), len(dec))
|
||||||
|
matches = sum(1 for i in range(n) if sc_counts[i] == dec[i])
|
||||||
|
first_mismatch = next((i for i in range(n) if sc_counts[i] != dec[i]), None)
|
||||||
|
print(f"{ch}: compared {n}, exact matches {matches} ({100*matches/n:.2f}%)")
|
||||||
|
if first_mismatch is not None:
|
||||||
|
i = first_mismatch
|
||||||
|
print(f" first mismatch at idx {i}: sidecar={sc_counts[i]} ({sc[ch][i]}), decoded={dec[i]}")
|
||||||
|
print(f" context sidecar[{max(0,i-2)}..{i+4}]: {sc_counts[max(0,i-2):i+5]}")
|
||||||
|
print(f" context decoded[{max(0,i-2)}..{i+4}]: {dec[max(0,i-2):i+5]}")
|
||||||
|
|
||||||
|
# MicL: find the multiplicative factor that fits
|
||||||
|
print()
|
||||||
|
print("=== MicL scale analysis ===")
|
||||||
|
sc_micl = sc["MicL"]
|
||||||
|
dec_micl = decoded["MicL"]
|
||||||
|
# Skip zero values when computing ratio
|
||||||
|
ratios = [sc_micl[i] / dec_micl[i] for i in range(min(50, len(sc_micl), len(dec_micl))) if dec_micl[i] != 0]
|
||||||
|
if ratios:
|
||||||
|
avg = sum(ratios) / len(ratios)
|
||||||
|
print(f" avg ratio sidecar/decoded over nonzero entries in first 50 samples: {avg:.4e} (n={len(ratios)})")
|
||||||
|
print(f" ratios sample: {[f'{r:.4e}' for r in ratios[:6]]}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
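The comparison in these verification scripts rests on converting sidecar values (in/s) back to integer counts with the 0.0003 in/s-per-count LSB before checking for exact equality. A small sketch of that step, with hypothetical helper names (`to_counts`, `exact_match_pct`) and made-up sample values:

```python
from __future__ import annotations

GEO_LSB = 0.0003  # in/s per count, as used in the scripts above


def to_counts(values: list[float], lsb: float = GEO_LSB) -> list[int]:
    """Convert physical values to the nearest integer count."""
    return [int(round(v / lsb)) for v in values]


def exact_match_pct(expected: list[int], decoded: list[int]) -> float:
    """Percentage of positions (over the shorter length) that match exactly."""
    n = min(len(expected), len(decoded))
    if n == 0:
        return 0.0
    matches = sum(1 for i in range(n) if expected[i] == decoded[i])
    return 100.0 * matches / n


sidecar_ips = [0.0, 0.0003, -0.0015, 0.0306]  # invented values for illustration
sidecar_counts = to_counts(sidecar_ips)
decoded_counts = [0, 1, -5, 101]              # invented decoder output
pct = exact_match_pct(sidecar_counts, decoded_counts)
print(sidecar_counts, f"{pct:.1f}%")
```

Rounding before comparing matters because the division by 0.0003 is not exact in floating point; comparing the raw quotients would produce spurious mismatches.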
+221
-117
@@ -70,42 +70,77 @@ from minimateplus.transport import SocketTransport
|
|||||||
from minimateplus.client import MiniMateClient
|
from minimateplus.client import MiniMateClient
|
||||||
from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
|
from minimateplus.models import DeviceInfo, Event, MonitorLogEntry
|
||||||
from sfm.database import SeismoDb
|
from sfm.database import SeismoDb
|
||||||
|
from sfm.waveform_store import WaveformStore
|
||||||
|
|
||||||
log = logging.getLogger("ach_server")
|
log = logging.getLogger("ach_server")
|
||||||
|
|
||||||
# ── Per-unit state (downloaded-key set) ───────────────────────────────────────
|
# ── Per-unit state (downloaded events index) ──────────────────────────────────
|
||||||
# Persisted as <output_dir>/ach_state.json
|
# Persisted as <output_dir>/ach_state.json
|
||||||
# Format:
|
# Format (current — v2):
|
||||||
# {
|
# {
|
||||||
# "BE11529": {
|
# "BE11529": {
|
||||||
# "downloaded_keys": ["01110000", "0111245a"], # hex keys already on disk
|
# "downloaded_events": { # key_hex → ISO timestamp string
|
||||||
# "max_downloaded_key": "0111245a", # highest key ever seen
|
# "01110000": "2026-04-11T00:42:17",
|
||||||
# "last_seen": "2026-04-11T01:04:36"
|
# "0111245a": "2026-04-11T01:04:30"
|
||||||
|
# },
|
||||||
|
# "max_downloaded_key": "0111245a",
|
||||||
|
# "last_seen": "2026-04-11T01:04:36",
|
||||||
|
# "serial": "BE11529",
|
||||||
|
# "peer": "63.43.212.232:51920"
|
||||||
# }
|
# }
|
||||||
# }
|
# }
|
||||||
#
|
#
|
||||||
# Key-based deduplication works well within a single "key generation" (between
|
# Why (key, timestamp) and not key alone:
|
||||||
# erases). After the device memory is erased the event counter resets to
|
# The device's event-key counter resets to 0x01110000 after every memory
|
||||||
# 0x01110000, so the first new event has the SAME key as the very first event
|
# erase (internal or external). A bare-key dedup (the v1 format) cannot
|
||||||
# we ever downloaded. We detect this situation with max_downloaded_key:
|
# distinguish a re-recorded event with the same key from one we already
|
||||||
|
# downloaded. The 0C waveform record's timestamp IS unique per physical
|
||||||
|
# event, so we pair (key, timestamp) and treat a key with a different
|
||||||
|
# timestamp as a new event regardless of `max_downloaded_key`.
|
||||||
#
|
#
|
||||||
# if max(current_device_keys) < max_downloaded_key
|
# Legacy v1 format (`downloaded_keys: list[str]` only) is auto-migrated on
|
||||||
# → device was wiped and keys have restarted → treat all device keys as new
|
# read: the keys are kept under a sentinel of "" (empty string) timestamp so
|
||||||
#
|
# the (key, timestamp) compare always sees a mismatch and forces a one-time
|
||||||
# After our own erase (--clear-after-download) we also explicitly clear
|
# re-download. After that pass the state is rewritten in v2 form.
|
||||||
# downloaded_keys and max_downloaded_key so the next session starts fresh.
|
|
||||||
|
|
||||||
_state_lock = threading.Lock()
|
_state_lock = threading.Lock()
|
||||||
|
|
||||||
|
|
||||||
def _load_state(state_path: Path) -> dict:
|
def _load_state(state_path: Path) -> dict:
|
||||||
if state_path.exists():
|
"""
|
||||||
try:
|
Load ach_state.json, transparently migrating any legacy
|
||||||
with open(state_path) as f:
|
`downloaded_keys: list` entries into the v2 `downloaded_events: dict`
|
||||||
return json.load(f)
|
schema. Returns the migrated state.
|
||||||
except Exception:
|
"""
|
||||||
pass
|
if not state_path.exists():
|
||||||
return {}
|
return {}
|
||||||
|
try:
|
||||||
|
with open(state_path) as f:
|
||||||
|
state = json.load(f)
|
||||||
|
except Exception:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Per-unit migration: legacy list → dict-with-empty-timestamps
|
||||||
|
for unit_key, unit_state in list(state.items()):
|
||||||
|
if not isinstance(unit_state, dict):
|
||||||
|
continue
|
||||||
|
if "downloaded_events" in unit_state:
|
||||||
|
continue
|
||||||
|
legacy_keys = unit_state.get("downloaded_keys")
|
||||||
|
if isinstance(legacy_keys, list):
|
||||||
|
unit_state["downloaded_events"] = {k: "" for k in legacy_keys}
|
||||||
|
log.info(
|
||||||
|
"ach_state: migrated %s from v1 (downloaded_keys list) → v2 "
|
||||||
|
"(downloaded_events dict, %d keys with empty timestamps; "
|
||||||
|
"they will re-validate on next session)",
|
||||||
|
unit_key, len(legacy_keys),
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
unit_state["downloaded_events"] = {}
|
||||||
|
# the legacy list is no longer needed once migrated; drop it from the in-memory state
|
||||||
|
unit_state.pop("downloaded_keys", None)
|
||||||
|
|
||||||
|
return state
|
||||||
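The (key, timestamp) rule described in the comment block can be sketched as a single lookup. This is an illustration only: `needs_download` is a hypothetical helper, and the actual comparison code in the server is not shown in this part of the diff. The sample keys and timestamps follow the format example in the comment block, plus one invented later timestamp.

```python
from __future__ import annotations


def needs_download(seen_events: dict[str, str], key_hex: str, timestamp_iso: str) -> bool:
    """True unless this exact (key, timestamp) pair was already recorded.

    A missing key yields None and a migrated v1 entry yields "", so both
    compare unequal to any real timestamp and trigger a download.
    """
    return seen_events.get(key_hex) != timestamp_iso


seen = {
    "01110000": "2026-04-11T00:42:17",  # downloaded under v2
    "0111245a": "",                     # migrated from a v1 downloaded_keys list
}

already_have = needs_download(seen, "01110000", "2026-04-11T00:42:17")
reused_key = needs_download(seen, "01110000", "2026-05-02T09:00:00")
migrated = needs_download(seen, "0111245a", "2026-04-11T01:04:30")
unknown = needs_download(seen, "01112460", "2026-04-12T08:00:00")
print(already_have, reused_key, migrated, unknown)
```

Under this reading, a key reused after a memory erase is treated as new because its timestamp differs, without relying on `max_downloaded_key`.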
|
|
||||||
|
|
||||||
def _save_state(state_path: Path, state: dict) -> None:
|
def _save_state(state_path: Path, state: dict) -> None:
|
||||||
@@ -139,8 +174,10 @@ class AchSession:
|
|||||||
max_events: Optional[int],
|
max_events: Optional[int],
|
||||||
state_path: Path,
|
state_path: Path,
|
||||||
db: "SeismoDb",
|
db: "SeismoDb",
|
||||||
|
store: "WaveformStore",
|
||||||
clear_after_download: bool = False,
|
clear_after_download: bool = False,
|
||||||
restart_monitoring: bool = False,
|
restart_monitoring: bool = False,
|
||||||
|
force_redownload: bool = False,
|
||||||
) -> None:
|
) -> None:
|
||||||
self.sock = sock
|
self.sock = sock
|
||||||
self.peer = peer
|
self.peer = peer
|
||||||
@@ -150,8 +187,14 @@ class AchSession:
|
|||||||
self.max_events = max_events
|
self.max_events = max_events
|
||||||
self.state_path = state_path
|
self.state_path = state_path
|
||||||
self.db = db
|
self.db = db
|
||||||
|
self.store = store
|
||||||
self.clear_after_download = clear_after_download
|
self.clear_after_download = clear_after_download
|
||||||
self.restart_monitoring = restart_monitoring
|
self.restart_monitoring = restart_monitoring
|
||||||
|
# `force_redownload` tells this session to ignore ach_state and
|
||||||
|
# re-download every event currently on the device, regardless of any
|
||||||
|
# (key, timestamp) match. Useful as a manual override when state has
|
||||||
|
# become inconsistent with what's actually on disk / in the DB.
|
||||||
|
self.force_redownload = force_redownload
|
||||||
|
|
||||||
def run(self) -> None:
|
def run(self) -> None:
|
||||||
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
|
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||||
@@ -273,11 +316,20 @@ class AchSession:
|
|||||||
state = _load_state(self.state_path)
|
state = _load_state(self.state_path)
|
||||||
unit_key = serial or self.peer # fall back to IP if no serial
|
unit_key = serial or self.peer # fall back to IP if no serial
|
||||||
unit_state = state.get(unit_key, {})
|
unit_state = state.get(unit_key, {})
|
||||||
seen_keys: set[str] = set(unit_state.get("downloaded_keys", []))
|
|
||||||
# Highest event key ever downloaded from this unit (hex string, 8 chars).
|
# downloaded_events is the v2 (key_hex → timestamp_iso) dict.
|
||||||
# Used to detect post-erase key reuse — see comment block above.
|
# Empty-string timestamps are migrated v1 entries — they force a
|
||||||
|
# one-time re-download because the (key, timestamp) compare always
|
||||||
|
# mismatches against any non-empty timestamp from a fresh 0C read.
|
||||||
|
seen_events: dict[str, str] = dict(unit_state.get("downloaded_events", {}))
|
||||||
max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
|
max_seen_key: str = unit_state.get("max_downloaded_key", "00000000")
|
||||||
|
|
||||||
|
if self.force_redownload:
|
||||||
|
log.info(" --force-redownload-all set — ignoring %d cached "
|
||||||
|
"(key, timestamp) entries for this session",
|
||||||
|
len(seen_events))
|
||||||
|
seen_events = {}
|
||||||
|
|
||||||
# Walk the event index (browse-mode, no 5A) to get the actual current
|
# Walk the event index (browse-mode, no 5A) to get the actual current
|
||||||
# key list. The SUB 08 event_count field is a lifetime "total events
|
# key list. The SUB 08 event_count field is a lifetime "total events
|
||||||
# ever recorded" counter that does NOT decrement on erase — confirmed
|
# ever recorded" counter that does NOT decrement on erase — confirmed
|
||||||
@@ -290,11 +342,10 @@ class AchSession:
|
|||||||
log.warning(" list_event_keys failed: %s -- falling back to full download", exc)
|
log.warning(" list_event_keys failed: %s -- falling back to full download", exc)
|
||||||
device_keys = None
|
device_keys = None
|
||||||
|
|
||||||
# Use the walk result as our authoritative current count.
|
|
||||||
current_count = len(device_keys) if device_keys is not None else 0
|
current_count = len(device_keys) if device_keys is not None else 0
|
||||||
|
|
||||||
log.info(" Unit has %d stored event(s); %d key(s) previously downloaded",
|
log.info(" Unit has %d stored event(s); %d (key, ts) entr(ies) previously downloaded",
|
||||||
current_count, len(seen_keys))
|
current_count, len(seen_events))
|
||||||
|
|
||||||
if device_keys is not None and current_count == 0:
|
if device_keys is not None and current_count == 0:
|
||||||
log.info(" [OK] No events on device -- nothing to download")
|
log.info(" [OK] No events on device -- nothing to download")
|
||||||
@@ -302,75 +353,29 @@ class AchSession:
|
|||||||
return
|
return
|
||||||
|
|
||||||
if device_keys is not None:
|
if device_keys is not None:
|
||||||
# ── Post-erase detection ──────────────────────────────────────
|
# ── Post-erase detection (best-effort, key-only signal) ───────
|
||||||
# After the device memory is erased, new events start from key
|
# After erase the device's key counter resets to 01110000.
|
||||||
# 01110000 again — the same keys we already downloaded. Detect
|
# If the device's current max key is below our high-water mark
|
||||||
# this by comparing the device's current highest key against the
|
# we know erase happened. This catches the cleanest case but
|
||||||
# historical maximum. If the device has rolled back below our
|
# does NOT catch erase-then-record-many-events (where the new
|
||||||
# high-water mark, its counter was reset and we must treat all
|
# max may climb past the old max). The (key, timestamp) check
|
||||||
# its keys as new, regardless of what seen_keys contains.
|
# in get_events() is what handles those.
|
||||||
if device_keys and max_seen_key != "00000000":
|
if device_keys and max_seen_key != "00000000":
|
||||||
max_device_key = max(device_keys) # lexicographic; safe because
|
max_device_key = max(device_keys)
|
||||||
# keys share the same 4-char prefix
|
|
||||||
if max_device_key < max_seen_key:
|
if max_device_key < max_seen_key:
|
||||||
log.info(
|
log.info(
|
||||||
" Post-erase reset detected: "
|
" Post-erase reset detected: "
|
||||||
"device max key %s < historical max %s "
|
"device max key %s < historical max %s "
|
||||||
"-- treating all device keys as new",
|
"-- discarding stale (key, ts) state for this session",
|
||||||
max_device_key, max_seen_key,
|
max_device_key, max_seen_key,
|
||||||
)
|
)
|
||||||
seen_keys = set() # discard stale dedup info for this session
|
seen_events = {}
|
||||||
|
|
||||||
new_key_set = set(device_keys) - seen_keys
|
# Note: no early-exit "all already downloaded" short-circuit
|
||||||
log.info(" Device has %d key(s): %d new, %d already seen",
|
# here. Without per-event timestamps we cannot tell whether
|
||||||
len(device_keys), len(new_key_set), len(device_keys) - len(new_key_set))
|
# device_keys ⊆ seen_events.keys() actually means we have
|
||||||
if not new_key_set:
|
# those physical events. get_events() will read 0C on its
|
||||||
log.info(" [OK] All events already downloaded -- nothing to do")
|
# skip path and decide per event.
|
||||||
# Refresh state timestamp; preserve max_seen_key unchanged.
|
|
||||||
state[unit_key] = {
|
|
||||||
"downloaded_keys": sorted(seen_keys | set(device_keys)),
|
|
||||||
"max_downloaded_key": max_seen_key,
|
|
||||||
"last_seen": datetime.datetime.now().isoformat(),
|
|
||||||
"serial": serial,
|
|
||||||
"peer": self.peer,
|
|
||||||
}
|
|
||||||
_save_state(self.state_path, state)
|
|
||||||
|
|
||||||
# ── Erase even when no new events (if requested) ──────────
|
|
||||||
# Blastware ACH always erases after every session — even when
|
|
||||||
# nothing new was downloaded. Without the erase the device
|
|
||||||
# still sees stored events in its memory and immediately
|
|
||||||
# retries the call-home, causing the looping we observed.
|
|
||||||
# Only erase when device actually has events stored; skip
|
|
||||||
# the erase if device_keys is empty (nothing to erase).
|
|
||||||
if self.clear_after_download and device_keys:
|
|
||||||
log.info(
|
|
||||||
" Clearing device memory (--clear-after-download, "
|
|
||||||
"no new events but device has %d stored)...",
|
|
||||||
len(device_keys),
|
|
||||||
)
|
|
||||||
try:
|
|
||||||
client.delete_all_events()
|
|
||||||
log.info(" [OK] Device memory cleared")
|
|
||||||
# Reset state so the next session starts fresh.
|
|
||||||
state[unit_key] = {
|
|
||||||
"downloaded_keys": [],
|
|
||||||
"max_downloaded_key": "00000000",
|
|
||||||
"last_seen": datetime.datetime.now().isoformat(),
|
|
||||||
"serial": serial,
|
|
||||||
"peer": self.peer,
|
|
||||||
}
|
|
||||||
_save_state(self.state_path, state)
|
|
||||||
except Exception as exc:
|
|
||||||
log.error(
|
|
||||||
" [WARN] Event deletion failed: %s -- events NOT cleared",
|
|
||||||
exc,
|
|
||||||
)
|
|
||||||
|
|
||||||
log.info("Session complete (no new events) -> %s", session_dir)
|
|
||||||
return
|
|
||||||
else:
|
|
||||||
new_key_set = None # unknown; proceed with full download
|
|
||||||
|
|
||||||
# Apply max_events cap
|
# Apply max_events cap
|
||||||
# stop_idx: when we know the count from list_event_keys, use it as
|
# stop_idx: when we know the count from list_event_keys, use it as
|
||||||
@@ -388,27 +393,67 @@ class AchSession:
|
|||||||
)
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
# Pass `seen_events` (key → ISO timestamp) so the client can
|
||||||
|
# read 0C on its skip path and only skip 5A when the per-event
|
||||||
|
# timestamp matches what we already have on disk. When force_-
|
||||||
|
# redownload is set, seen_events was already cleared above.
|
||||||
|
#
|
||||||
|
# Filter out empty-string timestamps (legacy v1 entries) — the
|
||||||
|
# client's 0C-on-skip-path only trusts entries with a
|
||||||
|
# populated timestamp; otherwise it falls through to a full
|
||||||
|
# 5A download.
|
||||||
|
skip_dict = {k: ts for k, ts in seen_events.items() if ts}
|
||||||
|
|
||||||
all_events = client.get_events(
|
all_events = client.get_events(
|
||||||
full_waveform=True,
|
full_waveform=True,
|
||||||
stop_after_index=stop_idx,
|
stop_after_index=stop_idx,
|
||||||
skip_waveform_for_keys=seen_keys if seen_keys else None,
|
skip_waveform_for_events=skip_dict if skip_dict else None,
|
||||||
)
|
)
|
||||||
|
|
||||||
# Filter to events whose keys we haven't saved before.
|
# New events are those that came back with _a5_frames populated
|
||||||
|
# (= 5A actually ran on this session). Skipped events have
|
||||||
|
# _a5_frames = None because the client matched (key, timestamp)
|
||||||
|
# against skip_dict and bypassed 5A.
|
||||||
new_events = [
|
new_events = [
|
||||||
e for e in all_events
|
e for e in all_events
|
||||||
if e._waveform_key is None
|
if getattr(e, "_a5_frames", None)
|
||||||
or e._waveform_key.hex() not in seen_keys
|
|
||||||
]
|
]
|
||||||
skipped = len(all_events) - len(new_events)
|
skipped = len(all_events) - len(new_events)
|
||||||
|
|
||||||
log.info(" [OK] Downloaded %d event(s): %d new, %d skipped (already seen)",
|
log.info(" [OK] Walked %d event(s): %d downloaded, %d skipped (matched (key, ts) in state)",
|
||||||
len(all_events), len(new_events), skipped)
|
len(all_events), len(new_events), skipped)
|
||||||
if skipped:
|
|
||||||
log.info(" (skipped %d already-downloaded event(s))", skipped)
|
# ── Persist event file + A5 sidecar to the waveform store ──
|
||||||
|
# Saves ride alongside the existing JSON dump so the on-disk
|
||||||
|
# event file and events.json reference the same set of events.
|
||||||
|
waveform_records: dict[str, dict] = {}
|
||||||
|
for ev in new_events:
|
||||||
|
if not ev._a5_frames:
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
rec = self.store.save(
|
||||||
|
ev,
|
||||||
|
serial=serial or "UNKNOWN",
|
||||||
|
a5_frames=ev._a5_frames,
|
||||||
|
)
|
||||||
|
if ev._waveform_key is not None:
|
||||||
|
waveform_records[ev._waveform_key.hex()] = rec
|
||||||
|
log.info(
|
||||||
|
" [WAVE] saved %s (%d bytes)",
|
||||||
|
rec["filename"], rec["filesize"],
|
||||||
|
)
|
||||||
|
except Exception as exc:
|
||||||
|
key_hex = ev._waveform_key.hex() if ev._waveform_key else "????????"
|
||||||
|
log.warning(
|
||||||
|
" [WARN] Waveform store save failed for %s: %s",
|
||||||
|
key_hex, exc,
|
||||||
|
)
|
||||||
|
|
||||||
if new_events:
|
if new_events:
|
||||||
_save_json(session_dir / "events.json", [_event_to_dict(e) for e in new_events])
|
_save_json(
|
||||||
|
session_dir / "events.json",
|
||||||
|
[_event_to_dict(e, waveform_records) for e in new_events],
|
||||||
|
)
|
||||||
|
|
||||||
for ev in new_events:
|
for ev in new_events:
|
||||||
pv = ev.peak_values
|
pv = ev.peak_values
|
||||||
@@ -467,7 +512,11 @@ class AchSession:
|
|||||||
_session_start = datetime.datetime.now()
|
_session_start = datetime.datetime.now()
|
||||||
try:
|
try:
|
||||||
_ev_ins, _ev_skip = self.db.insert_events(
|
_ev_ins, _ev_skip = self.db.insert_events(
|
||||||
new_events, serial=serial or self.peer, session_id=None
|
new_events,
|
||||||
|
serial=serial or self.peer,
|
||||||
|
session_id=None,
|
||||||
|
waveform_records=waveform_records,
|
||||||
|
device_family="series3",
|
||||||
)
|
)
|
||||||
_ml_ins, _ml_skip = self.db.insert_monitor_log(
|
_ml_ins, _ml_skip = self.db.insert_monitor_log(
|
||||||
new_monitor_entries, session_id=None
|
new_monitor_entries, session_id=None
|
||||||
@@ -502,35 +551,64 @@ class AchSession:
|
|||||||
)
|
)
|
||||||
|
|
||||||
# ── Update persistent state ───────────────────────────────────
|
# ── Update persistent state ───────────────────────────────────
|
||||||
# Include both triggered-event keys and monitor-log keys in the
|
# Build a fresh (key → ISO timestamp) map from THIS session's
|
||||||
# downloaded set so they are not re-processed on the next call-home.
|
# results. For each event currently on the device, prefer the
|
||||||
current_event_keys = [
|
# timestamp we just observed (from 0C); fall back to whatever
|
||||||
e._waveform_key.hex()
|
# was already in seen_events for that key (so we don't lose an
|
||||||
for e in all_events
|
# entry just because get_events skipped it on the (key, ts)
|
||||||
if e._waveform_key is not None
|
# match path).
|
||||||
]
|
def _ts_iso(ev) -> str:
|
||||||
current_monitor_keys = [e.key for e in new_monitor_entries]
|
ts = getattr(ev, "timestamp", None)
|
||||||
current_keys = current_event_keys + current_monitor_keys
|
if ts is None:
|
||||||
|
return ""
|
||||||
|
try:
|
||||||
|
return datetime.datetime(
|
||||||
|
ts.year, ts.month, ts.day,
|
||||||
|
ts.hour or 0, ts.minute or 0, ts.second or 0,
|
||||||
|
).isoformat()
|
||||||
|
except Exception:
|
||||||
|
return str(ts)
|
||||||
|
|
||||||
|
current_events_map: dict[str, str] = {}
|
||||||
|
for ev in all_events:
|
||||||
|
if ev._waveform_key is None:
|
||||||
|
continue
|
||||||
|
key_hex = ev._waveform_key.hex()
|
||||||
|
ts_iso = _ts_iso(ev) or seen_events.get(key_hex, "")
|
||||||
|
current_events_map[key_hex] = ts_iso
|
||||||
|
|
||||||
|
# Monitor-log entries don't have a 0C-style timestamp, but
|
||||||
|
# they DO have a start_time; use that so the monitor-log keys
|
||||||
|
# are properly entered into the (key, ts) map.
|
||||||
|
for ml in new_monitor_entries:
|
||||||
|
key_hex = ml.key
|
||||||
|
ts = ml.start_time
|
||||||
|
ts_iso = ts.isoformat() if ts else seen_events.get(key_hex, "")
|
||||||
|
# If a triggered event already populated this key, keep
|
||||||
|
# whichever has a non-empty timestamp.
|
||||||
|
if key_hex not in current_events_map or not current_events_map[key_hex]:
|
||||||
|
current_events_map[key_hex] = ts_iso
|
||||||
|
|
||||||
if erased_successfully:
|
if erased_successfully:
|
||||||
# Device memory is clear. Reset downloaded_keys and the
|
updated_events: dict[str, str] = {}
|
||||||
# high-water mark so the next call-home starts fresh and
|
|
||||||
# doesn't mis-identify the recycled key 01110000 as "seen".
|
|
||||||
updated_keys = []
|
|
||||||
new_max_key = "00000000"
|
new_max_key = "00000000"
|
||||||
log.info(
|
log.info(
|
||||||
" State reset after erase -- next session will download "
|
" State reset after erase -- next session will download "
|
||||||
"from key 0 (device counter resets after erase)"
|
"from key 0 (device counter resets after erase)"
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
# Normal (no erase): union of previously-seen + all keys on
|
# Merge: keep prior (key, ts) entries we still have evidence
|
||||||
# device now. Includes already-seen survivors so we never
|
# of (for survivors of any partial failure), plus this
|
||||||
# re-download them if the device somehow keeps old records.
|
# session's authoritative (key, ts) pairs.
|
||||||
updated_keys = sorted(set(seen_keys) | set(current_keys))
|
updated_events = dict(seen_events)
|
||||||
new_max_key = updated_keys[-1] if updated_keys else max_seen_key
|
updated_events.update(current_events_map)
|
||||||
|
new_max_key = (
|
||||||
|
max(updated_events.keys())
|
||||||
|
if updated_events else max_seen_key
|
||||||
|
)
|
||||||
|
|
||||||
state[unit_key] = {
|
state[unit_key] = {
|
||||||
"downloaded_keys": updated_keys,
|
"downloaded_events": updated_events,
|
||||||
"max_downloaded_key": new_max_key,
|
"max_downloaded_key": new_max_key,
|
||||||
"last_seen": datetime.datetime.now().isoformat(),
|
"last_seen": datetime.datetime.now().isoformat(),
|
||||||
"serial": serial,
|
"serial": serial,
|
||||||
@@ -592,7 +670,10 @@ def _device_info_to_dict(d: DeviceInfo) -> dict:
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def _event_to_dict(e: Event) -> dict:
|
def _event_to_dict(
|
||||||
|
e: Event,
|
||||||
|
waveform_records: Optional[dict[str, dict]] = None,
|
||||||
|
) -> dict:
|
||||||
pv = e.peak_values
|
pv = e.peak_values
|
||||||
pi = e.project_info
|
pi = e.project_info
|
||||||
peaks = {}
|
peaks = {}
|
||||||
@@ -611,6 +692,11 @@ def _event_to_dict(e: Event) -> dict:
|
|||||||
for ch, vals in e.raw_samples.items()
|
for ch, vals in e.raw_samples.items()
|
||||||
}
|
}
|
||||||
samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
|
samples["__note__"] = "first 20 sample-sets only; see raw_rx.bin for full waveform"
|
||||||
|
|
||||||
|
rec: dict = {}
|
||||||
|
if waveform_records and e._waveform_key is not None:
|
||||||
|
rec = waveform_records.get(e._waveform_key.hex(), {}) or {}
|
||||||
|
|
||||||
return {
|
return {
|
||||||
"timestamp": str(e.timestamp) if e.timestamp else None,
|
"timestamp": str(e.timestamp) if e.timestamp else None,
|
||||||
"project": pi.project if pi else None,
|
"project": pi.project if pi else None,
|
||||||
@@ -619,6 +705,9 @@ def _event_to_dict(e: Event) -> dict:
|
|||||||
"sensor_location": pi.sensor_location if pi else None,
|
"sensor_location": pi.sensor_location if pi else None,
|
||||||
"peaks": peaks,
|
"peaks": peaks,
|
||||||
"raw_samples_preview": samples,
|
"raw_samples_preview": samples,
|
||||||
|
"blastware_filename": rec.get("filename"),
|
||||||
|
"blastware_filesize": rec.get("filesize"),
|
||||||
|
"a5_pickle_filename": rec.get("a5_pickle_filename"),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@@ -640,6 +729,7 @@ def serve(args: argparse.Namespace) -> None:
|
|||||||
output_dir.mkdir(parents=True, exist_ok=True)
|
output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
state_path = output_dir / "ach_state.json"
|
state_path = output_dir / "ach_state.json"
|
||||||
db = SeismoDb(output_dir / "seismo_relay.db")
|
db = SeismoDb(output_dir / "seismo_relay.db")
|
||||||
|
store = WaveformStore(output_dir / "waveforms")
|
||||||
|
|
||||||
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||||
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
|
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
|
||||||
@@ -657,6 +747,7 @@ def serve(args: argparse.Namespace) -> None:
|
|||||||
print(f" Max events per session: {max_ev if max_ev else 'unlimited'}")
|
print(f" Max events per session: {max_ev if max_ev else 'unlimited'}")
|
||||||
print(f" Clear device after download: {'YES' if args.clear_after_download else 'no'}")
|
print(f" Clear device after download: {'YES' if args.clear_after_download else 'no'}")
|
||||||
print(f" Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
|
print(f" Restart monitoring after download: {'YES' if args.restart_monitoring else 'no'}")
|
||||||
|
print(f" Force re-download all (ignore state): {'YES' if args.force_redownload_all else 'no'}")
|
||||||
print(f"{'='*60}")
|
print(f"{'='*60}")
|
||||||
print(f"\n Point your test unit's ACEmanager call-home settings to:")
|
print(f"\n Point your test unit's ACEmanager call-home settings to:")
|
||||||
print(f" Remote Host: <this machine's LAN IP>")
|
print(f" Remote Host: <this machine's LAN IP>")
|
||||||
@@ -694,8 +785,10 @@ def serve(args: argparse.Namespace) -> None:
|
|||||||
max_events=max_ev,
|
max_events=max_ev,
|
||||||
state_path=state_path,
|
state_path=state_path,
|
||||||
db=db,
|
db=db,
|
||||||
|
store=store,
|
||||||
clear_after_download=args.clear_after_download,
|
clear_after_download=args.clear_after_download,
|
||||||
restart_monitoring=args.restart_monitoring,
|
restart_monitoring=args.restart_monitoring,
|
||||||
|
force_redownload=args.force_redownload_all,
|
||||||
)
|
)
|
||||||
t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
|
t = threading.Thread(target=session.run, daemon=True, name=f"ach-{peer}")
|
||||||
t.start()
|
t.start()
|
||||||
@@ -780,6 +873,17 @@ def parse_args() -> argparse.Namespace:
|
|||||||
"This mirrors the standard Blastware ACH workflow."
|
"This mirrors the standard Blastware ACH workflow."
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
|
p.add_argument(
|
||||||
|
"--force-redownload-all",
|
||||||
|
action="store_true",
|
||||||
|
default=False,
|
||||||
|
help=(
|
||||||
|
"Manual override: ignore ach_state.json's downloaded_events map "
|
||||||
|
"for this session and re-download every event currently on the "
|
||||||
|
"device, regardless of (key, timestamp) match. Useful when state "
|
||||||
|
"has become inconsistent with the on-disk waveform store / DB."
|
||||||
|
),
|
||||||
|
)
|
||||||
p.add_argument(
|
p.add_argument(
|
||||||
"--verbose", "-v",
|
"--verbose", "-v",
|
||||||
action="store_true",
|
action="store_true",
|
||||||
|
|||||||
@@ -0,0 +1,185 @@
|
|||||||
|
# Histogram body codec — FULLY DECODED (2026-05-20)
|
||||||
|
|
||||||
|
Clean working status doc for the MiniMate Plus histogram-mode event
|
||||||
|
body codec. Companion to `waveform_codec_re_status.md`. The deep
|
||||||
|
historical record (with retractions and dated analyses) lives in
|
||||||
|
`docs/instantel_protocol_reference.md §7.6.2`; the authoritative
|
||||||
|
implementation lives in `minimateplus/histogram_codec.py`.
|
||||||
|
|
||||||
|
## TL;DR
|
||||||
|
|
||||||
|
**The codec is fully decoded.** Every field of every block in the
|
||||||
|
in-repo histogram fixture corpus decodes byte-exact against BW's
|
||||||
|
ASCII export.
|
||||||
|
|
||||||
|
26 regression tests pass against ~3,500 blocks across 5 in-repo
|
||||||
|
fixtures, plus a synthetic regression block taken from a real
|
||||||
|
BE9558 prod event to lock in the uint8-peak interpretation.
|
||||||
|
|
||||||
|
**Important correction (2026-05-21):** the per-channel peak count
|
||||||
|
is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
|
||||||
|
etc. The N844 fixture corpus the original RE was done against has
|
||||||
|
zero values in bytes [7]/[11]/[15]/[19] for every block, so the
|
||||||
|
two interpretations happened to be equivalent. Cross-correlating
non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
against BW's per-interval ASCII export (4 channels × ~1400 blocks
per event × multiple events) is 100% byte-exact only when the peak
is read as uint8. Reading as uint16 LE produced peaks up to 268
|
||||||
|
in/s per channel and 35× inflated PVS sums when first deployed to
|
||||||
|
prod (rolled back, root-caused, and fixed in commit 7183b95+1).
|
||||||
|
|
||||||
|
## Body format
|
||||||
|
|
||||||
|
```
|
||||||
|
body = [stream of 32-byte data blocks] + [small trailing remnant]
|
||||||
|
```
|
||||||
|
|
||||||
|
Each block represents one histogram interval. Block layout:
|
||||||
|
|
||||||
|
```
[0]      0x00                       always-zero tag
[1]      segment_id    (uint8)      0x00..0x03; 256 blocks per segment
[2:4]    block_ctr     (uint16 LE)  resets each segment (0x0100, 0x0101, …)
[4:6]    0x000a        (uint16 LE)  constant marker (= 10)
[6]      T_peak_count  uint8        Tran peak (count × 0.005 → in/s at Normal,
                                    max 1.275 in/s, fits in uint8)
[7]      T_annotation  uint8        empirically non-zero on intervals with sub-Hz
                                    or unmeasurable freq; meaning not fully RE'd
[8:10]   T_halfperiod  uint16 LE    Tran half-period in samples
                                    (freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
[10]     V_peak_count  uint8        Vert peak
[11]     V_annotation  uint8
[12:14]  V_halfperiod  uint16 LE    Vert freq half-period
[14]     L_peak_count  uint8        Long peak
[15]     L_annotation  uint8
[16:18]  L_halfperiod  uint16 LE    Long freq half-period
[18]     M_peak_count  uint8        MicL peak count
                                    (dB via waveform_codec.mic_count_to_db)
[19]     M_annotation  uint8
[20:22]  M_halfperiod  uint16 LE    MicL freq half-period
[22:24]  0x00 0x00                  constant
[24:28]  4-byte variable            purpose unknown: possibly CRC,
                                    timestamp delta, or psi(L) numeric;
                                    not needed for waveform reconstruction
[28:32]  0x1e 0x0a 0x00 0x00        constant block-end signature
```
|
||||||
|
|
||||||
|
Reliable block-identification anchor:
|
||||||
|
```python
|
||||||
|
block[22:24] == b"\x00\x00" and block[28:32] == b"\x1e\x0a\x00\x00"
|
||||||
|
```
|
||||||
|
(The `1e 0a 00 00` constant tail is the most distinctive signature.)
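
As a minimal sketch of the layout above, a single 32-byte block could be unpacked with `struct` as shown here. This is not the implementation in `minimateplus/histogram_codec.py`; the function name `parse_block` and the dictionary keys are hypothetical and chosen only for illustration.

```python
import struct

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"  # constant block-end signature at [28:32]
CHANNELS = ("Tran", "Vert", "Long", "MicL")


def parse_block(block: bytes) -> dict:
    """Unpack one 32-byte histogram block (hypothetical helper, illustration only)."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("expected a 32-byte block")
    # Anchor check taken from the doc: zero pair at [22:24], constant tail at [28:32].
    if block[22:24] != b"\x00\x00" or block[28:32] != BLOCK_TAIL:
        raise ValueError("block anchor not found")

    # "<" = little-endian without padding. Bytes [0:22] are:
    # tag, segment_id, block_ctr, marker, then 4 x (peak, annotation, halfperiod).
    fields = struct.unpack_from("<BBHH" + "BBH" * 4, block, 0)
    _tag, segment_id, block_ctr, marker = fields[:4]

    channels = {}
    for i, name in enumerate(CHANNELS):
        peak, annotation, halfperiod = fields[4 + 3 * i: 7 + 3 * i]
        channels[name] = {
            "peak_count": peak,
            "annotation": annotation,
            "halfperiod": halfperiod,
        }

    return {
        "segment_id": segment_id,
        "block_ctr": block_ctr,
        "marker": marker,
        "unknown_24_28": block[24:28],  # purpose unknown per the layout table
        "channels": channels,
    }
```

Reading the peak as a single byte and keeping the annotation byte separate mirrors the uint8 correction described in the TL;DR.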
|
||||||
|
|
||||||
|
## Per-channel encoding
|
||||||
|
|
||||||
|
| Channel | Peak encoding | Frequency encoding |
|---|---|---|
| Tran | count × 0.005 = in/s at Normal range | `freq_Hz = 512 / halfperiod` |
| Vert | same | same |
| Long | same | same |
| MicL | count → dB via `mic_count_to_db(count)` (same formula as waveform codec) | same |
|
||||||
|
|
||||||
|
**`>100 Hz` sentinel**: when halfperiod ≤ 5 (giving ≥100 Hz from the
|
||||||
|
512/halfp formula), BW displays `>100 Hz`. Codec's `half_period_to_hz`
|
||||||
|
returns `None` in this range.
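
The conversions in this section can be sketched as three small functions. The names `half_period_to_hz` and `mic_count_to_db` come from this doc, but the bodies below are an illustrative reconstruction, not the repository code. The mic formula `81.94 + 20·log10(count)` is taken from this doc's block-130 worked example, and returning `None` for a zero mic count is an assumption.

```python
import math
from typing import Optional

GEO_LSB_IN_S = 0.005   # in/s per peak count at Normal range
MIC_DB_OFFSET = 81.94  # dB offset used in the block-130 worked example


def geo_count_to_in_s(count: int) -> float:
    """Geophone peak count -> in/s (hypothetical helper name)."""
    return count * GEO_LSB_IN_S


def half_period_to_hz(halfperiod: int) -> Optional[float]:
    """512 / halfperiod, or None for the '>100 Hz' sentinel (halfperiod <= 5)."""
    if halfperiod <= 5:
        return None
    return 512.0 / halfperiod


def mic_count_to_db(count: int) -> Optional[float]:
    """Mic peak count -> dB. Zero handling is an assumption, not from the source."""
    if count <= 0:
        return None
    return MIC_DB_OFFSET + 20.0 * math.log10(count)
```

Because the sentinel check runs before the division, a half-period of zero is also reported as `None` instead of raising.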
|
||||||
|
|
||||||
|
## Verified facts (cross-checked against fixture corpus)
|
||||||
|
|
||||||
|
Example: N844L6Z8.ZR0H block 130 → all 8 decoded fields byte-exact:
|
||||||
|
|
||||||
|
```
binary samples  [10, 6, 24, 4, 18, 5, 21, 5, 9]
TXT row         [0.030, 21, 0.020, 28, 0.025, 24, 0.040, 0.000, 95.92, 57]

slot[0] = 10   marker
slot[1] = 6    × 0.005 = 0.030 in/s               ✓ T_peak
slot[2] = 24   → 512/24 = 21.3 → 21 Hz            ✓ T_freq
slot[3] = 4    × 0.005 = 0.020 in/s               ✓ V_peak
slot[4] = 18   → 512/18 = 28.4 → 28 Hz            ✓ V_freq
slot[5] = 5    × 0.005 = 0.025 in/s               ✓ L_peak
slot[6] = 21   → 512/21 = 24.4 → 24 Hz            ✓ L_freq
slot[7] = 5    → 81.94 + 20·log10(5) = 95.92 dB   ✓ M_peak
slot[8] = 9    → 512/9 = 56.9 → 57 Hz             ✓ M_freq
```
|
||||||
|
|
||||||
|
## Verified test coverage
|
||||||
|
|
||||||
|
`tests/test_histogram_codec.py` (24 tests):
|
||||||
|
|
||||||
|
- Block walking: yields one record per `.TXT` interval ± 1 (off-by-one
|
||||||
|
at the tail when recording was stopped mid-write). Segment-ID
|
||||||
|
groups of 256 blocks confirmed.
|
||||||
|
- Geo peaks: every block of N844L20G, N844L6Z8, N844L6XE, N844L23B
|
||||||
|
matches `.TXT` within the 0.005 in/s quantization step.
|
||||||
|
- Geo freqs: every block of N844L6Z8 and N844L6XE matches `.TXT`
|
||||||
|
within 1 Hz (BW display rounds). `>100 Hz` sentinel handled correctly.
|
||||||
|
- Mic dB: every block of N844L6XE, N844L23B, N844L6Z8 matches `.TXT`
|
||||||
|
within 0.1 dB (BW display precision).
|
||||||
|
- Mic freq: matches `.TXT` within 1 Hz across active blocks.
|
||||||
|
|
||||||
|
## What's NOT yet decoded
|
||||||
|
|
||||||
|
- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**. Empirically
|
||||||
|
non-zero on intervals where the per-channel ZC frequency comes
|
||||||
|
out as `N/A` or sub-Hz (`<1.0`, `1.X`). Hypothesis tested in the
|
||||||
|
RE session: byte != 0 ↔ sub-Hz freq. Only ~50% correlation
|
||||||
|
across the K558 corpus, so the relationship is more complex.
|
||||||
|
Possibilities: time-of-peak-within-interval, halfp extension for
|
||||||
|
very-long-period signals, or a debug/diagnostic field the firmware
|
||||||
|
writes opportunistically. Doesn't affect peak amplitudes or
|
||||||
|
waveform reconstruction. Captured as `record["annotations"]` for
|
||||||
|
future RE.
|
||||||
|
- **4-byte variable metadata field (bytes 24:28)**. Not needed for
|
||||||
|
waveform reconstruction. Speculation: per-block CRC, sub-second
|
||||||
|
timestamp offset, or a Mic psi(L) count not in the 9 samples.
|
||||||
|
Punt until something needs it.
|
||||||
|
- **Geo PVS (TXT col 7, e.g. "0.040 in/s")**. Not stored in the
|
||||||
|
block; can be approximated as `sqrt(T_peak² + V_peak² + L_peak²)`
|
||||||
|
but BW's value sometimes differs slightly (probably computed from
|
||||||
|
waveform-instant samples, not from per-channel peaks). Punt — the
|
||||||
|
`.h5` consumers don't need PVS as a sample channel.
|
||||||
|
- **Mic psi(L) value (TXT col 8)**. TXT shows it as a small psi value
|
||||||
|
derived from the dB measurement. Not in the 9 samples. Could be
|
||||||
|
derived from `M_peak_count` via the inverse of the dB formula plus
|
||||||
|
a psi calibration constant. Defer.
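
The Geo PVS approximation mentioned above can be sketched as follows. The function name `approx_pvs` is hypothetical. Applied to the block-130 peaks from the worked example (0.030, 0.020, 0.025 in/s) it gives roughly 0.044 in/s, while the TXT row shows 0.040, which illustrates why the per-channel-peak formula is only an approximation of BW's value.

```python
import math


def approx_pvs(t_peak: float, v_peak: float, l_peak: float) -> float:
    """Vector sum of per-channel peaks; an upper-bound style approximation of PVS."""
    return math.sqrt(t_peak ** 2 + v_peak ** 2 + l_peak ** 2)


# Block-130 peaks from the worked example, in in/s.
pvs = approx_pvs(0.030, 0.020, 0.025)
print(f"approx PVS = {pvs:.4f} in/s")
```

The per-channel peaks need not occur at the same instant within an interval, so the vector sum of the peaks can exceed a PVS computed from simultaneous samples.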
|
||||||
|
|
||||||
|
## Output shape
|
||||||
|
|
||||||
|
`decode_histogram_body` returns the standard 4-channel dict that
|
||||||
|
mirrors `waveform_codec.decode_waveform_v2`'s output:
|
||||||
|
|
||||||
|
```python
{
    "Tran": [peak_count_per_interval, ...],   # 16-count units (LSB = 0.005 in/s)
    "Vert": [..., ...],
    "Long": [..., ...],
    "MicL": [..., ...],                       # raw ADC counts
}
```
|
||||||
|
|
||||||
|
Run through `waveform_codec.decoded_to_adc_counts` to get 1-count ADC
|
||||||
|
units (geo ×16, mic passthrough) for the standard `.h5` writer.
|
||||||
|
|
||||||
|
For the full per-interval record with frequencies + metadata, use
|
||||||
|
`decode_histogram_body_full()`.
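
To make the unit handling concrete, here is a rough stand-in for the scaling step described above (geo ×16, mic passthrough). It is a sketch only: `to_adc_counts` is a hypothetical name, and the real `waveform_codec.decoded_to_adc_counts` may differ in signature and behavior.

```python
GEO_CHANNELS = ("Tran", "Vert", "Long")


def to_adc_counts(decoded: dict) -> dict:
    """Scale geo channels from 16-count units to 1-count ADC units; pass mic through."""
    return {
        ch: [v * 16 for v in vals] if ch in GEO_CHANNELS else list(vals)
        for ch, vals in decoded.items()
    }


# One interval per channel, using the block-130 peak counts.
decoded = {"Tran": [6], "Vert": [4], "Long": [5], "MicL": [5]}
adc = to_adc_counts(decoded)
```

With a 0.005 in/s LSB in 16-count units, one ADC count corresponds to 0.005 / 16 = 0.0003125 in/s.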
|
||||||
|
|
||||||
|
## Where it's wired
|
||||||
|
|
||||||
|
- `minimateplus/event_file_io.py:read_blastware_file()` — first tries
|
||||||
|
the waveform codec, falls back to the histogram codec when the
|
||||||
|
waveform preamble isn't present. Same output shape, same
|
||||||
|
downstream pipeline.
|
||||||
|
- `scripts/backfill_sidecars.py` — the `has_samples` short-circuit
|
||||||
|
added during the histogram-codec-pending era still serves as a
|
||||||
|
defensive guard against truly undecodable files, but no longer
|
||||||
|
fires for valid histograms.
|
||||||
|
|
||||||
|
## Companion reference
|
||||||
|
|
||||||
|
- `docs/waveform_codec_re_status.md` — sibling status doc for the
|
||||||
|
much-more-complex waveform-mode codec.
|
||||||
|
- `docs/instantel_protocol_reference.md §7.6.2` — historical
|
||||||
|
protocol-reference entry. Structural framing matches what we
|
||||||
|
found; per-sample semantics were less documented than the `✅
|
||||||
|
CONFIRMED` badge suggested. This doc supersedes §7.6.2 where they
|
||||||
|
conflict on confidence level.
|
||||||
@@ -0,0 +1,341 @@
|
|||||||
|
# IDF Protocol Reference — Thor / Micromate Series IV
|
||||||
|
|
||||||
|
Starting-point reference for reverse-engineering Instantel's Micromate
|
||||||
|
Series IV event-file format. Sibling to
|
||||||
|
[instantel_protocol_reference.md](instantel_protocol_reference.md) (the
|
||||||
|
Series III "Rosetta Stone") — this doc holds what we know so far and
|
||||||
|
the open questions still to crack.
|
||||||
|
|
||||||
|
**Status (2026-05-28):** ASCII text sidecar fully decoded (1,014
|
||||||
|
sample files round-trip). **Thor IDFW** binary now decodes via
|
||||||
|
`micromate.idf_file.read_idf_file()` — reuses the BW segment-rotated
|
||||||
|
block codec verbatim at fixed body offset `0x0f1f`; metadata (serial,
|
||||||
|
timestamp, sample_rate, record_time, calibration_date) extracted from
|
||||||
|
the binary header. Sample fidelity is 87–99% byte-exact on quiet
|
||||||
|
events; loud events hit the BW codec's known walker-stops-early
|
||||||
|
limitation. Residual ~3% drift on per-sample deltas (likely a
|
||||||
|
Thor-specific 12-bit delta refinement not yet modelled).
|
||||||
|
|
||||||
|
**Thor IDFH histograms also decoded.** Body has one or more segments;
|
||||||
|
each 12-byte segment header `[length_be 2B][0a 00 00 00][00 NN][05 3f]`
|
||||||
|
introduces `N = (length - 10) // 72` interval records of 72 bytes
|
||||||
|
each. Each interval = 4 × 16-byte per-channel records followed by an 8-byte tail; each per-channel record is:
|
||||||
|
`[int16 min][int16 max][int16 ??][uint16 halfp][2B 00][uint16 ??][2B 00][uint16 ??]`.
|
||||||
|
Geo peak `= max(|min|, |max|) / 32768 × 10` in/s (matches the sidecar
to within ~1.8%); freq `= 512 / halfp` Hz (None for halfp ≤ 5 → ">100"
|
||||||
|
sentinel). Corpus: **all 859 Thor IDFH files decode, 181,071
|
||||||
|
intervals**. Wired through `read_idf_file()` →
|
||||||
|
`save_imported_idf()` → sidecar's `extensions.idf_intervals`.
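
The IDFH arithmetic above can be sketched as three helpers. All function names are hypothetical, and the sketch deliberately covers only the formulas stated in this doc; it does not parse the per-channel record, whose byte order is not stated here.

```python
from typing import Optional

INTERVAL_SIZE = 72       # bytes per interval record
FULL_SCALE_IN_S = 10.0   # in/s at int16 full scale (32768 counts)


def interval_count(segment_length: int) -> int:
    """Number of 72-byte interval records announced by a segment header."""
    return (segment_length - 10) // INTERVAL_SIZE


def geo_peak_in_s(min_count: int, max_count: int) -> float:
    """Peak = larger magnitude of the signed min/max, scaled to in/s."""
    return max(abs(min_count), abs(max_count)) / 32768 * FULL_SCALE_IN_S


def halfp_to_hz(halfp: int) -> Optional[float]:
    """512 / halfp, or None for the '>100' sentinel (halfp <= 5)."""
    if halfp <= 5:
        return None
    return 512.0 / halfp
```

For example, a segment whose length field is `10 + 3 * 72 = 226` would carry three interval records.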
|
||||||
|
|
||||||
|
**Note on the BE9439 outliers in the example corpus:** Two files
|
||||||
|
(`BE9439_20200713131747.IDFW` and `BE9439_20200713124251.IDFH`) are
|
||||||
|
**Series III Blastware** binaries, not Thor. Provenance: TMI tried
|
||||||
|
to use Thor to manage auto-call-homes for Series III units; the
|
||||||
|
experiment didn't work out, but it did leave a few BW event files
|
||||||
|
in Thor's per-serial directory structure with `.IDFW`/`.IDFH`
|
||||||
|
extensions — Thor's forwarder applied its own naming convention to
|
||||||
|
the BW bodies it was relaying. Their header `10 00 01 80 00 00
|
||||||
|
Instantel STRT ff fe <end_key> <start_key>` is the BW SUB 5A STRT
|
||||||
|
record, not a Thor body preamble. The reader detects them by
|
||||||
|
signature and raises `NotImplementedError` pointing callers at
|
||||||
|
`read_blastware_file()`, which extracts BW-format peaks from them.
|
||||||
|
|
||||||
|
**Still NYI for Thor IDFH:** per-channel `int16 field4` (possibly
|
||||||
|
time-of-peak); the two uint16 fields (probably PVS contributions);
|
||||||
|
8-byte interval tail (PVS data); mic dB(L) exact conversion constant.
|
||||||
|
|
||||||
|
### Codec breakthroughs (2026-05-28)
|
||||||
|
|
||||||
|
- **Body offset is a fixed `0x0f1f`** across 151/154 corpus IDFW
|
||||||
|
files. Preceded by a 4-byte record-type marker (`46 00 00 00`)
|
||||||
|
+ magic preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]`.
|
||||||
|
- **Sample stream is BW's segment-rotated block codec verbatim.**
|
||||||
|
Thor reuses `10 NN` (nibble), `20 NN` (int8), `00 NN` (RLE),
|
||||||
|
`30 NN` (packed12), `40 02` (segment header) tags with the same
|
||||||
|
semantics. Channel rotation Tran→Vert→Long→MicL.
|
||||||
|
- **Geo LSB = 0.0003 in/s** (not BW's 0.005), because Thor's 16-bit
|
||||||
|
ADC range maps to 10 in/s without the 16-count BW quantization step.
|
||||||
|
- **Mic ≈ 2.14×10⁻⁶ psi/count** (rough scale; refine after channel
|
||||||
|
block calibration constants are decoded).
|
||||||
|
- **BW compliance anchor `\xbe\x80\x00\x00\x00\x00` reappears at
|
||||||
|
IDFW offset 0x952** — sample_rate at anchor−6 (uint16 BE),
|
||||||
|
record_time at anchor+6 (float32 BE), same layout as BW.
|
||||||
|
- **Event timestamp at offset 0x97A** — 8 bytes `[day][month]
|
||||||
|
[year_be][unk][hour][min][sec]`. Stop-time mirrors at 0x982.
|
||||||
|
- **Serial as null-terminated ASCII at 0x14E**.
|
||||||
|
- **Calibration date** at 0x194–0x197 (day, month, year_be).
|
||||||
|
- Per-sample residual drift of ~3% suggests Thor encodes int8/nibble
|
||||||
|
deltas with an extra refinement bit that BW doesn't carry —
|
||||||
|
unsolved; errors resync within a few samples so cumulative impact
|
||||||
|
is small.
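
The fixed offsets listed above can be exercised with a short sketch like the following. It assumes the anchor is located by a plain byte search and that `10 / 32768` is the exact geophone scale behind the rounded 0.0003 in/s figure; `read_idfw_fields` is a hypothetical helper, not the production reader.

```python
import struct
from datetime import datetime

ANCHOR = b"\xbe\x80\x00\x00\x00\x00"   # BW compliance anchor
TIMESTAMP_OFFSET = 0x97A
SERIAL_OFFSET = 0x14E
GEO_LSB_IPS = 10 / 32768               # assumed exact scale, about 0.0003 in/s


def read_idfw_fields(buf: bytes) -> dict:
    """Pull the header fields listed above out of a raw IDFW buffer."""
    anchor = buf.find(ANCHOR)
    if anchor < 6:
        raise ValueError("compliance anchor not found")
    (sample_rate,) = struct.unpack_from(">H", buf, anchor - 6)
    (record_time,) = struct.unpack_from(">f", buf, anchor + 6)
    # [day][month][year_be][unk][hour][min][sec]
    day, month, year, _unk, hour, minute, second = struct.unpack_from(
        ">BBHBBBB", buf, TIMESTAMP_OFFSET
    )
    serial_end = buf.index(b"\x00", SERIAL_OFFSET)
    return {
        "sample_rate": sample_rate,
        "record_time": record_time,
        "timestamp": datetime(year, month, day, hour, minute, second),
        "serial": buf[SERIAL_OFFSET:serial_end].decode("ascii"),
    }
```

Searching for the anchor rather than hard-coding 0x952 keeps the sketch usable on the few corpus files whose layout is shifted.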
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## File model
|
||||||
|
|
||||||
|
### Filename convention
|
||||||
|
|
||||||
|
```
|
||||||
|
<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>
|
||||||
|
```
|
||||||
|
|
||||||
|
- **SERIAL** — literal device serial, two-letter prefix + numeric
|
||||||
|
suffix. Examples seen: `UM11719`, `UM13981`, `UM20147`, `BE9439`.
|
||||||
|
Unlike Series III BW filenames (`M529LK44.AB0`, base-36 stem),
|
||||||
|
Series IV filenames carry the serial in plain text.
|
||||||
|
- **YYYYMMDDHHMMSS** — 14-char ASCII timestamp in **device local
|
||||||
|
time** (no timezone marker).
|
||||||
|
- **KIND** — `IDFH` for histograms, `IDFW` for waveforms.
|
||||||
|
|
||||||
|
The `.IDFH.txt` / `.IDFW.txt` ASCII sidecar lives in a `TXT/`
|
||||||
|
**subfolder** of the unit's directory, not alongside the binary.
|
||||||
|
This pairing convention is encoded in
|
||||||
|
`event_forwarder.idf_report_path()`.
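
As an illustration of the naming and pairing rules above, here is a small sketch. `parse_idf_name` and `sidecar_path` are hypothetical helpers written for this note; they are not the signature of `event_forwarder.idf_report_path()`, which is not shown here.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import NamedTuple

# Two-letter prefix + digits, 14-digit timestamp, IDFH or IDFW.
NAME_RE = re.compile(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$")


class IdfName(NamedTuple):
    serial: str
    timestamp: datetime   # device local time, no timezone attached
    kind: str


def parse_idf_name(name: str) -> IdfName:
    """Split '<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>' into its parts."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not an IDF event filename: {name!r}")
    serial, stamp, kind = m.groups()
    return IdfName(serial, datetime.strptime(stamp, "%Y%m%d%H%M%S"), kind)


def sidecar_path(binary: Path) -> Path:
    """TXT/<name>.txt inside the unit directory that holds the binary."""
    return binary.parent / "TXT" / (binary.name + ".txt")
```

Because the pattern is anchored at the end of the name, `.IDFW.CDB` files are rejected rather than mistaken for events.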
|
||||||
|
|
||||||
|
### Directory layout
|
||||||
|
|
||||||
|
```
|
||||||
|
C:\THORDATA\
|
||||||
|
└── <Project>\
|
||||||
|
└── <UM####>\ ← unit serial dir
|
||||||
|
├── UM12345_20260520100000.MLG ← monitor log (not events)
|
||||||
|
├── UM12345_20260520100000.IDFH ← histogram event (binary)
|
||||||
|
├── UM12345_20260520100000.IDFW ← waveform event (binary)
|
||||||
|
├── UM12345_20260520100000.IDFW.CDB ← cache-DB variant (skip)
|
||||||
|
├── TXT\
|
||||||
|
│ ├── UM12345_20260520100000.IDFH.txt ← histogram ASCII sidecar
|
||||||
|
│ └── UM12345_20260520100000.IDFW.txt ← waveform ASCII sidecar
|
||||||
|
├── CSV\, HTML\, PDF\, XML\ ← operator-facing derived exports
|
||||||
|
└── ...
|
||||||
|
```
|
||||||
|
|
||||||
|
The `.IDFW.CDB` files share the binary's basename but appear to be a
|
||||||
|
separate cache/database variant. Their first 8 bytes match the
|
||||||
|
**old**-firmware Thor signature (see below) regardless of which
|
||||||
|
signature the paired `.IDFW` uses. Purpose unknown; sizes vary
|
||||||
|
wildly (observed 123 B → 40,491 B). Thor-watcher's forwarder
|
||||||
|
deliberately skips them.
|
||||||
|
|
||||||
|
### Sample corpus
|
||||||
|
|
||||||
|
The `thor-watcher/example-data/THORDATA_example/` tree carries
|
||||||
|
**1,014 paired .IDFW / .IDFH + .txt files** spanning 2020–2023
|
||||||
|
across nine units (UM11719, UM13981, UM20147, …, plus BE9439 from
|
||||||
|
2020). This is the reverse-engineering ground truth.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## ASCII sidecar (`.IDFW.txt` / `.IDFH.txt`) — fully decoded
|
||||||
|
|
||||||
|
Shape: plain text, one `"Key : Value"` line per metadata field,
|
||||||
|
followed for waveforms by a tab-separated sample table headed by
|
||||||
|
the literal line `Waveform Data Channels`. Parsed by
|
||||||
|
[`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py).
|
||||||
|
See [`micromate/models.py`](../micromate/models.py) for the typed
|
||||||
|
`IdfReport` shape.
|
||||||
|
|
||||||
|
### Notable conventions
|
||||||
|
|
||||||
|
- **Units are native to Thor** — geophone in **in/s**, microphone in
|
||||||
|
**dB(L)** (not psi like Series III BW reports), frequency in Hz,
|
||||||
|
acceleration in g, displacement in in.
|
||||||
|
- **Below-threshold readings** appear as the literal string
|
||||||
|
`<0.005 in/s` (155 occurrences in the sample corpus) — the parser
|
||||||
|
strips the `<` and treats the numeric remainder as the value.
|
||||||
|
- **Out-of-range / not-measured** values appear as `N/A` — parser
|
||||||
|
drops the field rather than letting the string leak into a numeric
|
||||||
|
column.
|
||||||
|
- **Firmware string** observed: `Micromate ISEE 11.0AK`.
|
||||||
|
- **TitleString1..4** are operator-defined free-text slots; Thor's
|
||||||
|
default labels map them to Location / Client / Company / Notes,
|
||||||
|
which the parser surfaces as `project` / `client` / `operator` /
|
||||||
|
`notes`.
|
||||||
|
- **Histogram sidecars** use `HistogramStartDate` / `HistogramStartTime`
|
||||||
|
in place of waveform's `EventDate` / `EventTime`. Parser falls
|
||||||
|
through to either.
|
||||||
|
- **Histogram tabular block** lacks the `Waveform Data Channels`
|
||||||
|
marker; instead it's a multi-line column header followed by
|
||||||
|
per-interval rows (`<date> <time> <tran-ppv> <freq> ...`). Parser
|
||||||
|
silently ignores lines after the metadata block since they lack a
|
||||||
|
colon-separated `key : value` shape (the timestamps DO contain
|
||||||
|
colons but produce garbage keys that don't collide with any
|
||||||
|
recognised field).
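
The value conventions above (`<` prefixes, `N/A`, first-colon key split) can be sketched as below. This is an illustrative reduction under the stated conventions, not the code in `micromate/idf_ascii_report.py`; both function names are hypothetical.

```python
from typing import Optional, Tuple


def split_field(line: str) -> Optional[Tuple[str, str]]:
    """Split a 'Key : Value' line on the first colon; None if there is no colon."""
    key, sep, value = line.partition(":")
    if not sep:
        return None
    return key.strip(), value.strip()


def parse_reading(value: str) -> Optional[float]:
    """Numeric part of a reading such as '0.125 in/s'; None for 'N/A'."""
    text = value.strip()
    if not text or text == "N/A":
        return None
    # '<0.005 in/s' is treated as 0.005, matching the convention above.
    number = text.lstrip("<").split()[0]
    return float(number)
```

Returning `None` for `N/A` lets the caller drop the field instead of storing a string in a numeric column.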
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Binary header signatures (observed)
|
||||||
|
|
||||||
|
Hex dump of the first 32 bytes across 1,014 sample files reveals
|
||||||
|
**two distinct file signatures**, both anchored by the literal
|
||||||
|
ASCII string `"Instantel\x00"` at offsets 6–15:
|
||||||
|
|
||||||
|
### Signature A — newer firmware (1,012 files, 99.8% of corpus)
|
||||||
|
|
||||||
|
```
|
||||||
|
00000000: 0012 0100 0000 496e 7374 616e 7465 6c00 ......Instantel.
|
||||||
|
00000010: 0000 a695 002e b500 4f70 6572 6174 6f72 ........Operator
|
||||||
|
^^^^^^^^^^^^^^^^
|
||||||
|
operator/title string starts at 0x18
|
||||||
|
```
|
||||||
|
|
||||||
|
Header bytes 0–5: `00 12 01 00 00 00`. Followed immediately by the
|
||||||
|
10-byte ASCII tag (`Instantel` + NUL), then 8 unknown bytes, then ASCII operator-supplied
|
||||||
|
strings (Operator name, etc.) and on through the project / client /
|
||||||
|
title strings. No `STRT` record observed in this layout.
|
||||||
|
|
||||||
|
### Signature B — older firmware (2 files: BE9439 from 2020)
|
||||||
|
|
||||||
|
```
|
||||||
|
00000000: 1000 0180 0000 496e 7374 616e 7465 6c00 ......Instantel.
|
||||||
|
00000010: 072c 0012 0300 5354 5254 fffe 0111 2340 .,....STRT....#@
|
||||||
|
^^^^^^^^^ ^^^^^^^^^
|
||||||
|
STRT magic 4-byte end_key
|
||||||
|
00000020: 0111 0000 2e5f 00ac 4600 0000 0200 0000 ....._..F.......
|
||||||
|
^^^^^^^^^ ^^^
|
||||||
|
4-byte start_key 0x46 (BW WAVEHDR record-type marker)
|
||||||
|
```
|
||||||
|
|
||||||
|
Header bytes 0–5: `10 00 01 80 00 00`. The structure after the
|
||||||
|
`Instantel` magic is **byte-for-byte identical to a BW SUB 5A
|
||||||
|
probe-response STRT record** as documented in
|
||||||
|
[instantel_protocol_reference.md → "SUB 5A — STRT record encodes
|
||||||
|
end_offset"](instantel_protocol_reference.md). Specifically:
|
||||||
|
|
||||||
|
| Offset | Bytes         | Meaning (per BW reference)           |
|--------|---------------|--------------------------------------|
| 0x16   | `53 54 52 54` | `STRT` magic                         |
| 0x1A   | `ff fe`       | STRT sentinel                        |
| 0x1C   | `01 11 23 40` | `end_key` (4 bytes)                  |
| 0x20   | `01 11 00 00` | `start_key` (4 bytes)                |
| 0x28   | `46`          | `0x46` waveform-record type marker   |
|
||||||
|
|
||||||
|
**Hypothesis:** Older Micromate firmware writes a wrapped BW-format
|
||||||
|
event into the `.IDFW` file — essentially the same on-disk shape as
|
||||||
|
a Series III device, with the new filename convention applied at
|
||||||
|
export time. Newer firmware (signature A) abandoned the
|
||||||
|
BW-compatible layout for an Instantel-specific format.
|
||||||
|
|
||||||
|
If that hypothesis holds, the 2 signature-B files can already be
|
||||||
|
parsed via `minimateplus/event_file_io.read_blastware_file()` — worth
|
||||||
|
testing. The 1,012 signature-A files are the real reverse-engineering
|
||||||
|
target.
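
A minimal sketch of signature detection based on the two hex dumps above. The function name and the `'A'` / `'B'` / `'unknown'` return values are illustrative choices, not the reader's actual API.

```python
SIG_A = bytes.fromhex("001201000000")   # newer firmware
SIG_B = bytes.fromhex("100001800000")   # BW-style STRT layout
TAG = b"Instantel\x00"                  # starts at offset 6 in both dumps


def classify_idf_header(head: bytes) -> str:
    """Return 'A', 'B', or 'unknown' from the first 16 bytes of a file."""
    if head[6:16] != TAG:
        return "unknown"
    if head[:6] == SIG_A:
        return "A"
    if head[:6] == SIG_B:
        return "B"
    return "unknown"
```

A caller could route `'B'` files to the Blastware reader and `'A'` files to the Thor codec, treating `'unknown'` as a hard error.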
|
||||||
|
|
||||||
|
### `.IDFW.CDB` cache files
|
||||||
|
|
||||||
|
Always carry signature B (`10 00 01 80 ...`), even when the paired
|
||||||
|
`.IDFW` carries signature A. Plausible explanation: the CDB is an
|
||||||
|
internal Thor cache-database export that retains the legacy BW-style
|
||||||
|
record layout regardless of the user-facing `.IDFW` format version.
|
||||||
|
Not currently consumed by the forwarder.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## File-size patterns (Signature A, the main target)
|
||||||
|
|
||||||
|
Survey of 1,012 signature-A files:
|
||||||
|
|
||||||
|
| Event type | Typical size | Source of variance |
|
||||||
|
|--------------|-------------------|----------------------------------------------|
|
||||||
|
| `.IDFW` 2-sec | 9,200 – 10,500 B | Operator-supplied strings (TitleString1..4) of varying length |
|
||||||
|
| `.IDFH` | 2,944 – 4,076 B | Histogram interval count (record duration / interval) |
|
||||||
|
|
||||||
|
**Naive arithmetic for 2-sec waveform:**
|
||||||
|
- 4 channels × 2 sec × 1024 sps = 8,192 samples
|
||||||
|
- At 2 bytes/sample (int16) = 16,384 sample bytes → file would be > 16 KB
|
||||||
|
- Observed: ~9–10 KB
|
||||||
|
- → samples are likely **1 byte each** (int8 quantised), **or** stored
|
||||||
|
with bit-packing / delta encoding, **or** only one channel's
|
||||||
|
full-rate samples are stored with the others reconstructed
|
||||||
|
arithmetically. Verifying this is the **first RE milestone**.
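
The same arithmetic as a quick check; the size range is the one from the table above.

```python
channels, seconds, sps = 4, 2, 1024
samples = channels * seconds * sps          # 8,192
int16_bytes = samples * 2                   # 16,384
observed_min, observed_max = 9_200, 10_500  # typical 2-sec .IDFW sizes

# Upper bound on bytes per sample if the whole file were sample data.
max_bytes_per_sample = observed_max / samples
print(samples, int16_bytes, round(max_bytes_per_sample, 2))  # 8192 16384 1.28
```

The bound is loose because header bytes also count toward the file size, so the real per-sample budget is smaller still.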
|
||||||
|
|
||||||
|
Project-string–length variance (~1 KB across the corpus) is consistent
|
||||||
|
with the file carrying a single copy of each TitleString1..4 plus
|
||||||
|
operator + setup-name as null-padded ASCII regions.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
The reverse-engineering targets, roughly in dependency order:
|
||||||
|
|
||||||
|
1. **Sample encoding (signature A)** — int8? int16 LE/BE? Bit-packed?
|
||||||
|
Delta-coded? Per-channel interleaved or sequential blocks?
|
||||||
|
2. **Header field layout (signature A)** — where do sample_rate,
|
||||||
|
record_time, channel count, and per-channel peaks live in the
|
||||||
|
binary? The ASCII sidecar gives the device-authoritative values,
|
||||||
|
so binary fields can be confirmed by diff.
|
||||||
|
3. **Operator-string offsets** — `Operator` at 0x18 is the first
|
||||||
|
visible string in signature-A files; the rest (project, client,
|
||||||
|
notes, setup) follow. Need to map exact offsets and null-padding
|
||||||
|
conventions.
|
||||||
|
4. **Signature-B → BW codec compatibility** — does
|
||||||
|
`minimateplus/event_file_io.read_blastware_file()` actually parse
|
||||||
|
the 2 BE9439 signature-B files as-is? If yes, the OLD-format
|
||||||
|
ingest is free.
|
||||||
|
5. **`.IDFW.CDB` purpose** — is it an internal Thor cache, a
|
||||||
|
ring-buffer dump, or something else? Worth a single small effort
|
||||||
|
to characterise so we know what we're skipping.
|
||||||
|
6. **Footer / checksum** — every BW event file has a footer; does
|
||||||
|
IDF? Where does the per-channel sample block end?
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Reverse-engineering playbook (when we start)
|
||||||
|
|
||||||
|
The Series III BW codec took ~2 months of MITM wire captures
|
||||||
|
because we didn't have ground-truth metadata. Thor's situation is
|
||||||
|
**substantially better**:
|
||||||
|
|
||||||
|
- **Ground truth is on disk.** Every binary in `example-data/`
|
||||||
|
has a paired `.IDFW.txt` carrying the full decoded sample table
|
||||||
|
(`Waveform Data Channels` block — see any sample file in
|
||||||
|
`thor-watcher/example-data/.../TXT/`). Aligning binary bytes
|
||||||
|
to the table's float-per-row values gives an immediate per-byte
|
||||||
|
hypothesis test.
|
||||||
|
- **Cross-event diffing.** 1,012 signature-A samples from 9 units
|
||||||
|
spanning 4 years means any field that varies between events is
|
||||||
|
immediately localisable. Fields that are constant across all
|
||||||
|
files (firmware ID, channel labels, format-version word) are also
|
||||||
|
immediately localisable by complementary search.
|
||||||
|
- **No protocol surface.** Files at rest, not a wire dialect. No
|
||||||
|
DLE stuffing, no inner-frame parsing, no probe/data two-step.
|
||||||
|
|
||||||
|
Suggested first session (2-4 hours): hand-decode `UM11719_20231219162723.IDFW`
|
||||||
|
(10,290 bytes) against its `TXT/UM11719_20231219162723.IDFW.txt`
|
||||||
|
sample table (the 2-sec waveform at 1024 sps × 4 channels = 8,192
|
||||||
|
sample rows). Find the first per-channel sample value (`0.0003` in
|
||||||
|
the Tran column at t=0) in the binary. Confirms sample encoding.
|
||||||
|
Everything else flows from there.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Code seams ready to receive the codec
|
||||||
|
|
||||||
|
When the codec lands, it goes into
|
||||||
|
[`micromate/idf_file.py`](../micromate/idf_file.py) (currently a
|
||||||
|
stub raising `NotImplementedError`). Public API:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from pathlib import Path
from micromate import IdfEvent
|
||||||
|
from micromate.idf_file import read_idf_file
|
||||||
|
|
||||||
|
event: IdfEvent = read_idf_file(Path("UM11719_20231219163444.IDFW"))
|
||||||
|
# event.peaks.transverse_ips, event.timestamp, event.raw_samples, ...
|
||||||
|
```
|
||||||
|
|
||||||
|
The ingest pipeline (`WaveformStore.save_imported_idf`) currently
|
||||||
|
builds the `IdfEvent` from the `.txt` parser only. Once
|
||||||
|
`read_idf_file()` works, the binary becomes authoritative; the
|
||||||
|
`.txt` parser drops to fast-path metadata cross-check. Operators
|
||||||
|
who don't enable Thor's TXT exporter still get fully populated
|
||||||
|
events.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## See also
|
||||||
|
|
||||||
|
- [instantel_protocol_reference.md](instantel_protocol_reference.md) — Series III BW protocol reference (the Rosetta Stone). STRT record format, DLE framing, BW filename encoding.
|
||||||
|
- [`micromate/idf_ascii_report.py`](../micromate/idf_ascii_report.py) — `.txt` sidecar parser.
|
||||||
|
- [`micromate/models.py`](../micromate/models.py) — `IdfEvent`, `IdfReport` typed dataclasses.
|
||||||
|
- [`micromate/idf_file.py`](../micromate/idf_file.py) — placeholder for the binary codec.
|
||||||
|
- [`thor-watcher/example-data/THORDATA_example/`](../../thor-watcher/example-data/) — 1,014 paired binary + .txt files for codec validation.
|
||||||
+1095
-324
File diff suppressed because it is too large
@@ -0,0 +1,255 @@
|
|||||||
|
# Runbook — Recovering a wedged unit stuck in a call-home loop
|
||||||
|
|
||||||
|
**Original incident:** BE9558H at `166.246.130.1:9034`, recovered 2026-05-17.
|
||||||
|
|
||||||
|
A field unit with a stuck-triggered geophone (or any hardware fault causing
|
||||||
|
constant event triggering) will record events back-to-back, and if Auto Call
|
||||||
|
Home is set to "After Event Recorded" the device will dial the office BW
|
||||||
|
ACH server in a tight loop. Combined with a Sierra Wireless modem in
|
||||||
|
bidirectional serial-TCP mode, this makes the unit effectively unreachable
|
||||||
|
from SFM — every TCP connection we open gets killed when the modem flips
|
||||||
|
from server-mode to client-mode to honor the device's next AT dial command.
|
||||||
|
|
||||||
|
This runbook describes how to break the loop and recover control.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Symptoms
|
||||||
|
|
||||||
|
- Terra-View / SFM `/device/info` either hangs or fails on `count_events()`.
|
||||||
|
- `/device/monitor/status` and `/device/rescue` return 502 (protocol timeout
|
||||||
|
waiting for POLL response) or 503 (TCP connect refused).
|
||||||
|
- ACEmanager serial log shows repeating
|
||||||
|
`Connect to IP: <BW_IP> Port: <BW_PORT>` → `Shutdown TCP socket` cycles
|
||||||
|
every 30-60 seconds.
|
||||||
|
- Spam-mode endpoints (`/device/stop_monitoring_spam`) report many
|
||||||
|
`sent_ok` but the device's monitoring state never changes.
|
||||||
|
- `slow_drip` reports `[Errno 32] Broken pipe` after sending the preamble
|
||||||
|
but before completing the drip loop.
|
||||||
|
|
||||||
|
If you see *all* of these, the unit is in this exact failure mode.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick reference — how to recover
|
||||||
|
|
||||||
|
You need **ACEmanager access** to the unit's modem.
|
||||||
|
|
||||||
|
### Step 1: stop the modem's mode-flipping
|
||||||
|
|
||||||
|
In ACEmanager → **Serial → Port Configuration**:
|
||||||
|
|
||||||
|
| Field | Set to |
|
||||||
|
|---|---|
|
||||||
|
| **Destination Address** | clear (blank) |
|
||||||
|
| **Destination Port** | `0` |
|
||||||
|
|
||||||
|
Click **Apply**. This removes the modem's auto-dial-out target. The device's
|
||||||
|
AT dial commands now error back at the modem instead of triggering a
|
||||||
|
mode-flip, so the modem stays in TCP-server mode permanently and our inbound
|
||||||
|
TCP sessions stay alive.
|
||||||
|
|
||||||
|
*(Optional belt-and-suspenders: also add the BW server's port to
|
||||||
|
**Security → Port Filtering - Outbound** as a blocked port, with
|
||||||
|
Outbound Port Filtering Mode = Blocked Ports.)*
|
||||||
|
|
||||||
|
### Step 2: stop monitoring on the device (slow drip)
|
||||||
|
|
||||||
|
From the SFM host:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
/home/serversdown/seismo-relay/scripts/slow_drip.sh <DEVICE_IP> <PORT>
|
||||||
|
```
|
||||||
|
|
||||||
|
Defaults are 120s duration with a drip every 3s. Watch the response:
|
||||||
|
|
||||||
|
- `duration_s ≈ 120` and `drips_sent ≈ 40` → session held the full duration ✓
|
||||||
|
- `bytes_received > 0` → device is responding ✓ (this is the success signal)
|
||||||
|
|
||||||
|
If `duration_s` is small or `send_error: "Broken pipe"`, Step 1 didn't take
|
||||||
|
hold — re-check ACEmanager, may need to reboot the modem after Apply.
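
The success criteria above can be expressed as a small check. It assumes the slow-drip response is a JSON object carrying the `duration_s`, `bytes_received`, and `send_error` fields named in this step; the function name and the tolerance value are illustrative.

```python
def drip_held(resp: dict, expected_s: float = 120.0, tolerance_s: float = 10.0) -> bool:
    """True when the session lasted roughly the full duration and the device answered."""
    if resp.get("send_error"):
        return False
    held = resp.get("duration_s", 0.0) >= expected_s - tolerance_s
    answered = resp.get("bytes_received", 0) > 0
    return held and answered
```

A `False` result means Step 1 should be re-checked before retrying the drip.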
|
||||||
|
|
||||||
|
### Step 3: confirm monitoring stopped
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl 'http://localhost:8200/device/monitor/status?host=<DEVICE_IP>&tcp_port=<PORT>&force=true'
|
||||||
|
# expect: {"is_monitoring": false, ...}
|
||||||
|
```
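
For scripted checks, the same request URL can be built in Python. This sketch only constructs the URL; honoring `SFM_BASE_URL` here mirrors what the shell scripts are described as doing and is an assumption for this helper, whose name is hypothetical.

```python
import os
from urllib.parse import urlencode


def monitor_status_url(host: str, tcp_port: int, base: str = "") -> str:
    """Build the same request URL as the curl example."""
    base = base or os.environ.get("SFM_BASE_URL", "http://localhost:8200")
    query = urlencode({"host": host, "tcp_port": tcp_port, "force": "true"})
    return f"{base.rstrip('/')}/device/monitor/status?{query}"
```

The returned string can be passed to any HTTP client; the expected response is the JSON shown in the curl comment.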
|
||||||
|
|
||||||
|
### Step 4: disable ACH at the device level + erase corrupted events
|
||||||
|
|
||||||
|
Either fire the rescue endpoint:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
/home/serversdown/seismo-relay/scripts/rescue_device.sh <DEVICE_IP> <PORT>
|
||||||
|
```
|
||||||
|
|
||||||
|
Or do the two steps manually:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Disable ACH in the device's compliance config
|
||||||
|
curl -X POST 'http://localhost:8200/device/call_home?host=<DEVICE_IP>&tcp_port=<PORT>' \
|
||||||
|
-H 'Content-Type: application/json' \
|
||||||
|
-d '{"auto_call_home_enabled": false}'
|
||||||
|
|
||||||
|
# Erase corrupted event chain
|
||||||
|
curl -X POST 'http://localhost:8200/device/events/erase?host=<DEVICE_IP>&tcp_port=<PORT>'
|
||||||
|
```
|
||||||
|
|
||||||
|
You can also do this via the SFM standalone UI → **Call Home** tab → set
|
||||||
|
`Enable Auto Call Home` to `Disabled` → **Write to Device**.
|
||||||
|
|
||||||
|
### Step 5: restore modem config (housekeeping)
|
||||||
|
|
||||||
|
Once the device-side ACH is disabled, restore the modem's Destination
|
||||||
|
Address and Port to the original values (e.g. `50.197.32.92` / `12345`) in
|
||||||
|
ACEmanager. The modem will resume normal bidirectional behavior, but the
|
||||||
|
unit won't issue any dial commands until ACH is explicitly re-enabled on
|
||||||
|
the device.
|
||||||
|
|
||||||
|
### Step 6: do NOT re-enable ACH until the hardware fault is repaired

Do not re-enable ACH on this unit until the underlying hardware fault is
repaired. If you do, the call-home loop starts again immediately and
you'll be running this runbook a second time.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why this works — the failure mode explained
|
||||||
|
|
||||||
|
The Sierra Wireless RV50/RV55 serial port operates in one of two TCP modes
|
||||||
|
at any moment:
|
||||||
|
|
||||||
|
- **Server mode** — listens on `Device Port` (e.g. 9034), bridges inbound
|
||||||
|
TCP to the device's serial port. This is what we need to interact with
|
||||||
|
the device.
|
||||||
|
- **Client mode** — when the device sends an AT dial command on its serial
|
||||||
|
TX line, the modem opens an outbound TCP to `Destination Address:Port`
|
||||||
|
and bridges that to serial.
|
||||||
|
|
||||||
|
A serial port in this configuration is **bidirectional**: the modem flips
|
||||||
|
between server and client modes on demand. When the device's firmware is
|
||||||
|
healthy and only dials occasionally, this works fine.
|
||||||
|
|
||||||
|
When the unit is constantly triggering events and ACH is set to "After
|
||||||
|
Event Recorded", the device sends an AT dial command every few seconds.
|
||||||
|
Each one causes the modem to:
|
||||||
|
|
||||||
|
1. Drop any active inbound TCP session
|
||||||
|
2. Flip to client mode
|
||||||
|
3. Attempt outbound TCP to `Destination Address:Port`
|
||||||
|
4. Hang for up to a minute waiting for it to succeed/fail
|
||||||
|
5. Drop back to server mode
|
||||||
|
|
||||||
|
**During the entire hang, no inbound TCP can establish.** Even between
|
||||||
|
hangs, the modem closes any existing inbound session before flipping. So
|
||||||
|
any tool that needs more than a few seconds of held TCP (e.g. POLL +
|
||||||
|
config read + write) gets repeatedly kicked off.
|
||||||
|
|
||||||
|
Clearing `Destination Address` removes steps 3-4 from the cycle: the modem
|
||||||
|
has nowhere to dial, so it doesn't flip modes when it receives an AT dial
|
||||||
|
command. The serial port effectively becomes server-only, and inbound TCP
|
||||||
|
sessions can stay open as long as needed.
|
||||||
|
|
||||||
|
**This is a modem-layer issue, not a device firmware issue.** The device
|
||||||
|
is alive and responsive the whole time — confirmed in the BE9558H
|
||||||
|
recovery by 990 bytes of S3 responses received over a 120s slow-drip
|
||||||
|
session once the modem was no longer mode-flipping.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why simpler approaches don't work
|
||||||
|
|
||||||
|
| Approach | Why it fails |
|
||||||
|
|---|---|
|
||||||
|
| Standard `/device/info` | Triggers `count_events()` 1E/1F walk, takes 90s+ and hits corrupted event chain in this scenario |
|
||||||
|
| `/device/rescue` race loop | Gets 502 (protocol timeout) because the modem closes the TCP before the POLL handshake can complete |
|
||||||
|
| `/device/stop_monitoring_blind` (single frame) | Even if the bytes leave the wire, the device's protocol parser ignores write commands without a preceding POLL handshake (early-version bug, now fixed by including POLL preamble in blind sends) |
|
||||||
|
| `/device/stop_monitoring_spam` (sub-second cadence) | Each session is killed by the modem's mode-flip before the device can drain its UART RX buffer; high-rate spam also risks UART FIFO overrun on the device side |
|
||||||
|
| Outbound port firewall block alone | Stops the outbound TCP from succeeding, but doesn't stop the modem from *trying* and mode-flipping. Reduces but doesn't eliminate the contention. |
|
||||||
|
| Modem reboot | Temporary — as soon as the device starts triggering again, the loop resumes within seconds |
|
||||||
|
|
||||||
|
The combination of `slow_drip` + cleared `Destination Address` works because:
|
||||||
|
|
||||||
|
1. The modem stops mode-flipping → TCP session stays open for the full
|
||||||
|
drip duration
|
||||||
|
2. Slow drip rate → device's UART RX FIFO never overflows even if
|
||||||
|
firmware is busy with event recording
|
||||||
|
3. The drip is `SESSION_RESET + STOP_MONITORING` every 3s → many
|
||||||
|
independent chances for the parser to land one valid frame
|
||||||
|
4. Once one Stop Monitoring is parsed, event recording halts → firmware
|
||||||
|
has CPU to spare → subsequent operations are trivially easy
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Tooling reference
|
||||||
|
|
||||||
|
All endpoints live in `seismo-relay/sfm/server.py`. All scripts live in
|
||||||
|
`seismo-relay/scripts/` and default to SFM direct (`http://localhost:8200`),
|
||||||
|
overridable via `SFM_BASE_URL`.
|
||||||
|
|
||||||
|
### Endpoints added during BE9558H recovery
|
||||||
|
|
||||||
|
| Endpoint | Purpose |
|
||||||
|
|---|---|
|
||||||
|
| `GET /device/events/storage_range` | SUB 0x06 — first/last event keys, `is_empty` flag. ~2s, no event walk. |
|
||||||
|
| `GET /device/events/index` | SUB 0x08 — lifetime event counter (does NOT decrement on erase). ~2s. |
|
||||||
|
| `POST /device/events/erase` | Full erase sequence 0xA3 → 0x1C → 0x06 → 0xA2. |
|
||||||
|
| `POST /device/rescue` | Disable ACH + erase in one TCP session. Short timeouts for race-loop usage. |
|
||||||
|
| `POST /device/stop_monitoring_blind` | Fire-and-forget Stop with full POLL preamble (single attempt). |
|
||||||
|
| `POST /device/stop_monitoring_spam` | Server-side tight retry loop, sub-second cadence, duration-bounded. |
|
||||||
|
| `POST /device/stop_monitoring_slow_drip` | One held TCP session, slow trickle of stop frames. **The endpoint that saved BE9558H.** |
|
||||||
|
|
||||||
|
Also changed: default protocol recv timeout dropped from 30s → 10s in
|
||||||
|
`_build_client`. Added `connect_timeout` knob to same. Cleaned up
|
||||||
|
unhandled-exception path in `/device/monitor/status` so it returns 502
|
||||||
|
instead of 500 on protocol timeouts.
|
||||||
|
|
||||||
|
### Scripts
|
||||||
|
|
||||||
|
| Script | Purpose |
|
||||||
|
|---|---|
|
||||||
|
| `scripts/rescue_device.sh` | Race-loop wrapper around `/device/rescue` |
|
||||||
|
| `scripts/blind_stop.sh` | Race-loop wrapper around `/device/stop_monitoring_blind` |
|
||||||
|
| `scripts/spam_stop.sh` | Single-call burst hammer (`/device/stop_monitoring_spam`) |
|
||||||
|
| `scripts/slow_drip.sh` | Single-call held-session drip (`/device/stop_monitoring_slow_drip`) |
|
||||||
|
| `scripts/watch_unit.sh` | Passive periodic reachability check, logs to file |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Incident log — BE9558H, 2026-05-16/17
|
||||||
|
|
||||||
|
What was wrong: Long-axis geophone developed an offset, constantly above
|
||||||
|
trigger threshold → constant event recording → after-event ACH set →
|
||||||
|
modem dialing office BW server (`50.197.32.92:12345`) every 30-60s.
|
||||||
|
Local event chain corrupted (`next_boundary 0x100EE exceeds uint16`).
|
||||||
|
|
||||||
|
Diagnostic path:
|
||||||
|
|
||||||
|
1. `/device/info` slow, choked on event walk
|
||||||
|
2. Built lightweight probe endpoints (`storage_range`, `index`) — useful
|
||||||
|
but didn't reach the wedged unit
|
||||||
|
3. Built `/device/rescue` with short timeouts — got 502 (POLL no response)
|
||||||
|
4. Built `/device/stop_monitoring_blind` — first version was a false
|
||||||
|
positive (no POLL preamble); fixed by including
|
||||||
|
`SESSION_RESET+POLL_PROBE+SESSION_RESET+POLL_DATA` in the dump
|
||||||
|
5. Verified blind stop works on bench unit
|
||||||
|
6. Built `/device/stop_monitoring_spam` — 420 successful sends over
|
||||||
|
5 min, zero behavior change on field unit
|
||||||
|
7. Inspected ACEmanager logs → saw outbound dial-out attempts every ~30s,
|
||||||
|
confirmed device was not fully locked up
|
||||||
|
8. Added outbound port-12345 firewall block → outbound attempts now fail
|
||||||
|
instantly but contention persisted
|
||||||
|
9. Built `/device/stop_monitoring_slow_drip` — session died at 3s with
|
||||||
|
broken pipe (modem closing on us)
|
||||||
|
10. Looked at full ACEmanager Port Configuration → **found
|
||||||
|
`Destination Address: 50.197.32.92` configured**, realized every AT
|
||||||
|
dial command was triggering a modem mode-flip that killed our inbound TCP session
|
||||||
|
11. Cleared Destination Address + Port → slow_drip held 120s, device
|
||||||
|
responded with 990 bytes, 39 stop commands acked
|
||||||
|
12. Disabled ACH at device level via `/device/call_home`, erased events
|
||||||
|
|
||||||
|
Final state: device IDLE, memory 958.1 / 960 KB free, ACH disabled at
|
||||||
|
device level, modem destination cleared (to be restored after physical
|
||||||
|
service).
|
||||||
|
|
||||||
|
Total time from the first attempt ("i was wondering if its possible to") to
|
||||||
|
recovery: ~7 hours of intermittent debugging across one evening.
|
||||||
@@ -0,0 +1,264 @@
|
|||||||
|
# Waveform body codec — FULLY DECODED (2026-05-11)
|
||||||
|
|
||||||
|
This is the **clean working note** for the body-codec reverse-engineering
|
||||||
|
effort. It supersedes scattered claims elsewhere when they conflict.
|
||||||
|
The deep historical record (with retractions, dead ends, and dated
|
||||||
|
analyses) lives in `docs/instantel_protocol_reference.md §7.6.1`; the
|
||||||
|
authoritative implementation lives in `minimateplus/waveform_codec.py`.
|
||||||
|
|
||||||
|
## TL;DR
|
||||||
|
|
||||||
|
**The codec is fully decoded.** Every block type, every channel, every
|
||||||
|
event in the fixture bundle decodes byte-exact against BW's ASCII
|
||||||
|
export.
|
||||||
|
|
||||||
|
| Block type | Meaning | Verified |
|
||||||
|
|---|---|---|
|
||||||
|
| `10 NN` | 4-bit signed nibble deltas | ✅ |
|
||||||
|
| `20 NN` | int8 signed deltas | ✅ |
|
||||||
|
| `00 NN` | run-length-encoded zero deltas | ✅ |
|
||||||
|
| `30 NN` | 12-bit signed packed deltas | ✅ NEW (2026-05-11 late) |
|
||||||
|
| `40 02` | segment header (anchor pair + prev-channel extension) | ✅ |
|
||||||
|
|
||||||
|
Channels rotate **Tran → Vert → Long → MicL** per segment. Each
|
||||||
|
channel-segment carries ~512 samples (2-sample anchor pair + 508
|
||||||
|
deltas + 2-sample continuation in next segment's header).
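
A minimal sketch of the two ideas in this paragraph: the per-segment channel rotation and rebuilding samples from deltas. It assumes the first segment (index 0) carries Tran and that each delta is relative to the previous sample; block parsing is omitted, and the authoritative logic is in `minimateplus/waveform_codec.py`.

```python
from itertools import accumulate
from typing import Iterable, List

ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_segment(index: int) -> str:
    """Channel carried by the index-th segment (0-based), per the rotation above."""
    return ROTATION[index % len(ROTATION)]


def apply_deltas(anchor: int, deltas: Iterable[int]) -> List[int]:
    """Rebuild absolute ADC counts from the sample preceding the deltas."""
    return list(accumulate(deltas, initial=anchor))[1:]
```

For example, starting from an anchor of 10, the deltas `[1, -2, 0]` yield the counts `[11, 9, 9]`.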

## What decodes byte-exact today

**Every decoded sample across every fixture event matches truth. Zero
divergences.**

| Event | Description | Tran | Vert | Long | Total |
|---|---|---|---|---|---|
| event-a (5-8) | quiet, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
| event-c (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
| event-d (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
| JQ0 (5-11) | Vert-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
| V70 (5-11) | Mic-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
| SP0 (5-11) | loud all, 3 sec | 2048 ✓ | 1538 ✓ | 1536 ✓ | 5122 |
| SS0 (5-11) | loud-from-start | 734 ✓ | 512 ✓ | 512 ✓ | 1758 |
| SV0 (5-11) | loud-from-start | 1024 ✓ | 578 ✓ | 512 ✓ | 2114 |
| event-b (5-8) | quiet, 2 sec | 512 ✓ | 226 ✓ | 0 | 738 |

That's **47,364 ADC samples decoded byte-exact, zero errors.**

Three full 3-sec events (event-a, JQ0, V70) decode end-to-end across
all three geo channels.

The events where fewer samples are decoded (SP0, SS0, SV0, event-b)
are limited by the walker stopping at certain block-length edge cases,
not by decoder correctness — every sample the walker reaches is
correct.

## What's still open

- **Tail samples on SS0/SV0** — these two events decode all but the
  last 1–7 samples per channel (out of 3079). Likely the same
  "last segment is truncated" pattern. Minor; doesn't affect the
  bulk of the data.

## Sample counts (72,972 byte-exact total)

| Event | Tran | Vert | Long | Status |
|---|---|---|---|---|
| event-a | 3328 | 3328 | 3328 | full |
| event-b | 2304 | 2304 | 2304 | full |
| event-c | 1280 | 1280 | 1280 | full |
| event-d | 1280 | 1280 | 1280 | full |
| JQ0 | 3328 | 3328 | 3328 | full |
| V70 | 3328 | 3328 | 3328 | full |
| SP0 | 3328 | 3328 | 3328 | full |
| SS0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
| SV0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |

## What's now wired into production (2026-05-11 late)

- **`client.py:_decode_a5_waveform`** — now uses
  `decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
  `event.raw_samples` is populated with int16 ADC counts that flow
  through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
  Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
  reference but is not called.

- **MicL → dB(L) conversion** — exposed as
  `waveform_codec.mic_count_to_db(count)`. Verified against BW
  display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
  the V70 mic-heavy fixture exactly).

- **`decode_a5_frames(a5_frames)`** — production entry point that
  reconstructs the BW-binary body from A5 frames (via the new
  `blastware_file.extract_body_bytes` helper) and runs the verified
  codec. Returns the same `raw_samples` dict shape the consumers
  already expect.
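
The two MicL verification points quoted above (count=1 → 81.94 dB, count=813 → 140.14 dB) are consistent with a plain 20·log10 law anchored at one count. The sketch below is fitted to those two points only; it is an assumption, not the code in `waveform_codec.py`. The name `mic_count_to_db_sketch`, the reference constant, the use of the absolute value, and the handling of a zero count are all hypothetical.

```python
import math
from typing import Optional

# Assumed reference level: the dB(L) value quoted above for count = 1.
DB_AT_ONE_COUNT = 81.94


def mic_count_to_db_sketch(count: int) -> Optional[float]:
    """Hypothetical count -> dB(L) mapping fitted to the two quoted points."""
    magnitude = abs(count)  # assumption: sign carries no level information
    if magnitude == 0:
        # log10(0) is undefined; how the real helper treats 0 is not documented.
        return None
    return DB_AT_ONE_COUNT + 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    for c in (1, 813):
        print(c, round(mic_count_to_db_sketch(c), 2))
```

Running the sketch reproduces both quoted display values to two decimals, which is the only evidence offered here that the law has this shape.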

## What's solved

### Block framing

| Tag | Length | Meaning |
|---|---|---|
| `10 NN` | NN/2 + 2 bytes | 4-bit nibble deltas (2 per byte; high nibble first; signed 0..7 / 8..F = -8..-1) |
| `20 NN` | NN + 2 bytes | int8 signed deltas (1 per byte) |
| `00 NN` | 2 bytes | RLE: append NN copies of current value |
| `30 NN` | NN*2 in data section, NN*4 in trailer | Unknown content. Only in loud-from-start events. |
| `40 02` | 20 bytes (fixed) | Segment header |

NN is always a multiple of 4.

Implementation: `walk_body()` in `minimateplus/waveform_codec.py`.
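
As a rough illustration of the framing rules, the sketch below computes each block's total length from its two-byte tag and steps through a body. It is not `walk_body()` itself: `block_length` and `iter_blocks` are hypothetical names, the `30 NN` length uses the `NN × 1.5 + 2` figure from the `30 NN` block format section of this document rather than the older table entry, and any trailer section is ignored.

```python
from typing import Iterator, Tuple


def block_length(tag: int, nn: int) -> int:
    """Total block length in bytes, tag included (sketch of the table above)."""
    if tag == 0x40:
        if nn != 0x02:
            raise ValueError(f"unexpected segment-header tag 40 {nn:02x}")
        return 20  # fixed-size segment header
    if nn % 4:
        raise ValueError(f"NN={nn} is not a multiple of 4")
    if tag == 0x10:
        return nn // 2 + 2  # two 4-bit deltas per byte
    if tag == 0x20:
        return nn + 2  # one int8 delta per byte
    if tag == 0x00:
        return 2  # RLE carries no payload
    if tag == 0x30:
        return nn * 3 // 2 + 2  # 12-bit deltas, 6 bytes per group of 4
    raise ValueError(f"unknown block tag {tag:02x}")


def iter_blocks(body: bytes, start: int = 7) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (offset, tag, nn, length) for each block after the 7-byte preamble."""
    pos = start
    while pos + 2 <= len(body):
        tag, nn = body[pos], body[pos + 1]
        length = block_length(tag, nn)
        if pos + length > len(body):
            break  # truncated final block: stop rather than read past the end
        yield pos, tag, nn, length
        pos += length
```

Because every block advances the cursor by exactly its own length, consecutive offsets are contiguous by construction, which is the property the walker-contiguity tests check.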

### 7-byte preamble

```
body[0:3] = 00 02 00    magic
body[3:5] = Tran[0]     int16 BE in 16-count units (LSB = 0.005 in/s)
body[5:7] = Tran[1]     int16 BE in 16-count units
```
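
A minimal sketch of reading this preamble with the standard `struct` module follows. The function name `parse_preamble` and the constant names are hypothetical; the 0.005 in/s per unit scale is taken from the layout above.

```python
import struct
from typing import Tuple

PREAMBLE_MAGIC = b"\x00\x02\x00"
IPS_PER_UNIT = 0.005  # one 16-count unit, per the layout above


def parse_preamble(body: bytes) -> Tuple[int, int]:
    """Return (Tran[0], Tran[1]) in 16-count units from the 7-byte preamble."""
    if len(body) < 7:
        raise ValueError("body is shorter than the 7-byte preamble")
    if body[0:3] != PREAMBLE_MAGIC:
        raise ValueError("preamble magic 00 02 00 not found")
    # ">hh" = two big-endian signed 16-bit integers
    tran0, tran1 = struct.unpack(">hh", body[3:7])
    return tran0, tran1


if __name__ == "__main__":
    t0, t1 = parse_preamble(b"\x00\x02\x00\xff\xfe\x00\x03")
    print(t0, t1, t0 * IPS_PER_UNIT, t1 * IPS_PER_UNIT)
```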

### Tran channel, segment 0

Segment 0 (everything before the first `40 02`) encodes Tran samples
only. Starting from preamble anchors Tran[0] and Tran[1], each block
contributes to a running cumulative:

- `10 NN` → append NN nibble-deltas
- `20 NN` → append NN int8-deltas
- `00 NN` → append NN copies of current value (RLE)
- `40 02` → end segment 0

Verified byte-exact:

| Event | Description | Segment 0 size | Match |
|---|---|---|---|
| `M529LL1A.SP0` | Loud, 0.25 s pretrig | 510 | 510/510 ✓ |
| `M529LL1A.SV0` | Loud from sample 0 | 58 | 58/58 ✓ (stops at first `30 NN`) |
| `M529LL1A.SS0` | Loud from sample 0 | 42 | 42/42 ✓ (stops at first `30 04`) |
| `M529LL1L.JQ0` | Vert-heavy | 510 | 510/510 ✓ |
| `M529LL1L.V70` | Mic-heavy (140 dB) | 510 | 510/510 ✓ |

Implementation: `decode_tran_initial()`.
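
The running-cumulative rule for segment 0 can be sketched as below. This is an illustration, not `decode_tran_initial()`: the function names and the `(tag, nn, payload)` block tuples are hypothetical, and it assumes each delta is added to the previous sample in the same 16-count units as the anchors.

```python
from typing import Iterable, List, Tuple


def nibble_deltas(payload: bytes) -> List[int]:
    """Two signed 4-bit deltas per byte, high nibble first (8..F = -8..-1)."""
    out: List[int] = []
    for byte in payload:
        for nib in (byte >> 4, byte & 0x0F):
            out.append(nib - 16 if nib >= 8 else nib)
    return out


def int8_deltas(payload: bytes) -> List[int]:
    """One signed 8-bit delta per byte."""
    return [b - 256 if b >= 128 else b for b in payload]


def decode_segment0_sketch(
    anchor0: int, anchor1: int, blocks: Iterable[Tuple[int, int, bytes]]
) -> List[int]:
    """Accumulate deltas onto the two preamble anchors until a non-delta block."""
    samples = [anchor0, anchor1]
    for tag, nn, payload in blocks:
        if tag == 0x10:
            deltas = nibble_deltas(payload)
        elif tag == 0x20:
            deltas = int8_deltas(payload)
        elif tag == 0x00:
            deltas = [0] * nn  # RLE: NN copies of the current value
        else:
            break  # 40 02 (or anything else) ends segment 0 in this sketch
        for d in deltas:
            samples.append(samples[-1] + d)
    return samples
```

Treating the RLE block as a run of zero deltas keeps all three block types on one code path, matching the "run-length-encoded zero deltas" wording in the TL;DR table.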

### Segment header (`40 02`, 20 bytes total) — REWRITTEN 2026-05-11

| Payload offset | Field | Status |
|---|---|---|
| [0:2] | Previous-channel delta — 1st extension sample (int16 BE) | ✅ confirmed |
| [2:4] | Previous-channel delta — 2nd extension sample (int16 BE) | ✅ confirmed |
| [4:6] | Unknown (likely checksum) | ❓ open |
| [6:8] | Byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
| [8:12] | Monotonic uint32 LE counter (starts ~0x47) | ✅ confirmed |
| [12:14] | Constant `02 00` | ✅ confirmed |
| [14:16] | THIS segment's channel — sample 0 anchor (int16 BE, 16-count units) | ✅ confirmed |
| [16:18] | THIS segment's channel — sample 1 anchor (int16 BE, 16-count units) | ✅ confirmed |

**Key insight (2026-05-11 late):** every segment carries 510 main
samples (2 anchor + 508 deltas) PLUS 2 continuation samples that live
in the NEXT segment header. So each channel-segment effectively spans
512 sample-sets. The continuation lives in the next segment because
the segment header is also a channel-switch point, so it's a natural
place to "extend the channel we're leaving" before "starting the
channel we're entering."
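
The header table above maps directly onto a handful of `struct` calls. The sketch below is illustrative only: `SegmentHeader`, its field names, and `parse_segment_header` are hypothetical, and the unknown bytes at payload [4:6] are kept raw because their meaning is still open.

```python
import struct
from dataclasses import dataclass


@dataclass
class SegmentHeader:
    prev_ext_delta0: int   # payload [0:2], previous channel, 1st extension
    prev_ext_delta1: int   # payload [2:4], previous channel, 2nd extension
    unknown: bytes         # payload [4:6], meaning not yet known
    length_field: int      # payload [6:8], byte length to next header minus 2
    counter: int           # payload [8:12], monotonic uint32 LE
    anchor0: int           # payload [14:16], this channel, sample 0
    anchor1: int           # payload [16:18], this channel, sample 1


def parse_segment_header(block: bytes) -> SegmentHeader:
    """Parse one 20-byte `40 02` block (2-byte tag + 18-byte payload)."""
    if len(block) != 20 or block[0:2] != b"\x40\x02":
        raise ValueError("not a 20-byte 40 02 segment header")
    payload = block[2:]
    if payload[12:14] != b"\x02\x00":
        raise ValueError("constant 02 00 missing at payload [12:14]")
    ext0, ext1 = struct.unpack(">hh", payload[0:4])
    (length_field,) = struct.unpack(">H", payload[6:8])
    (counter,) = struct.unpack("<I", payload[8:12])  # note: little-endian
    anchor0, anchor1 = struct.unpack(">hh", payload[14:18])
    return SegmentHeader(ext0, ext1, payload[4:6], length_field,
                         counter, anchor0, anchor1)
```

The counter is the only little-endian field in the table, so it is unpacked separately rather than folded into a single format string.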

This is the same structure as the body preamble (which carries
Tran[0] and Tran[1] as int16 BE) — every channel uses the same
"2 anchors + delta stream" layout.

## Channel rotation — VERIFIED 2026-05-11

```
(initial body)           → Tran samples 0..509       (preamble + delta blocks)
segment 0 hdr ext+anchor → Vert samples 0..511       ← anchor in hdr [14:18]
segment 1 hdr ext+anchor → Long samples 0..511
segment 2 hdr ext+anchor → Mic  samples 0..511
segment 3 hdr ext+anchor → Tran samples 510..1021    (continuation)
segment 4 hdr ext+anchor → Vert samples 512..1023
segment 5 hdr ext+anchor → Long samples 512..1023
segment 6 hdr ext+anchor → Mic  samples 512..1023
segment 7 hdr ext+anchor → Tran samples 1022..1533
...
```

Implementation: `decode_waveform_v2()` returns
`{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}` with
each channel's samples in 16-count units. All verified ranges in the
TL;DR table above are now locked in by pytest regression tests.
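
The rotation diagram reduces to a small index calculation. The sketch below derives the channel and sample range for a given segment-header index; `segment_target` and `ROTATION` are hypothetical names, and the ranges are simply extrapolated from the rows listed in the diagram.

```python
from typing import Tuple

# Channel introduced by segment header 0, 1, 2, 3, then repeating.
ROTATION = ("Vert", "Long", "MicL", "Tran")


def segment_target(header_index: int) -> Tuple[str, int, int]:
    """Return (channel, first_sample, last_sample) for a segment header index."""
    if header_index < 0:
        raise ValueError("header_index must be non-negative")
    cycle, slot = divmod(header_index, 4)
    channel = ROTATION[slot]
    # Tran already has samples 0..509 from the initial body, so each of its
    # header-introduced segments starts 510 samples further along.
    first = cycle * 512 + (510 if channel == "Tran" else 0)
    return channel, first, first + 511


if __name__ == "__main__":
    for i in range(8):
        print(i, segment_target(i))
```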

## What's still open

1. **`30 NN` block content.** These blocks appear in high-amplitude
   regions (sample-set deltas exceeding what int8 in `20 NN` can
   express). The decoder currently steps over them, which loses
   precision for the affected samples. Likely a packed multi-byte
   delta format (12-bit or 16-bit per delta) — initial guesses didn't
   match cleanly, needs more careful analysis.

2. **MicL decoding.** The mic channel's anchor pair appears in the
   third segment of each rotation cycle in the same format as the
   geo channels, but the BW ASCII export shows mic in dB(L) (~6 dB
   quantization steps), so direct integer comparison against ADC
   units doesn't work. Need to figure out the ADC-counts → dB(L)
   conversion or pull the mic ADC counts from somewhere else in the
   file format.

3. **Walker fix for event-b.** The original quiet bundle's event-b
   still bails out partway through. Lower priority since the other
   7 events walk cleanly.

## `30 NN` block format — CRACKED 2026-05-11 late

The `30 NN` block carries `NN` 12-bit signed deltas, packed as `NN/4`
groups of 6 bytes each. Within each 6-byte group:

```
bytes [0:2] = 16 bits = 4 × 4-bit "high nibbles" (MSB-first)
bytes [2:6] = 4 × int8 "low bytes"

For k in 0..3:
    high_nibble = (header_word >> (12 - 4*k)) & 0xF
    raw_12      = (high_nibble << 8) | low_byte[k]
    delta[k]    = raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12
```

The block's total length is `NN × 1.5 + 2` bytes (tag included). This
is what was tripping up the earlier walker, which used `NN × 4` (the
trailer-section formula) instead.

Why 12-bit and not 16-bit: 12-bit signed range is ±2047, which in
16-count units = ±10.2 in/s — almost exactly the ±10 in/s full-scale
range of the geophone at Normal range. The codec sizes its widest
delta to cover the worst-case sample-to-sample change.

Verified against all 14 `30 NN` blocks across the bundled fixture
events. Every delta decodes byte-exact against BW's ASCII export.
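
The group layout described in this section translates to a few lines of Python. This is a sketch under stated assumptions rather than the production decoder: `decode_30_payload` is a hypothetical name, the 16-bit header word is assumed to be big-endian (one reading of "MSB-first"), and each low byte is combined as an unsigned value, as the `(high_nibble << 8) | low_byte[k]` expression implies.

```python
from typing import List


def decode_30_payload(nn: int, payload: bytes) -> List[int]:
    """Unpack NN signed 12-bit deltas from NN/4 six-byte groups."""
    if nn % 4:
        raise ValueError("NN must be a multiple of 4")
    if len(payload) != nn * 3 // 2:
        raise ValueError("payload must be NN x 1.5 bytes (tag excluded)")
    deltas: List[int] = []
    for g in range(0, len(payload), 6):
        # Assumed big-endian: first byte holds the nibbles for k = 0 and 1.
        header_word = (payload[g] << 8) | payload[g + 1]
        for k in range(4):
            high_nibble = (header_word >> (12 - 4 * k)) & 0xF
            raw_12 = (high_nibble << 8) | payload[g + 2 + k]
            # Two's-complement sign extension from 12 bits.
            deltas.append(raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12)
    return deltas


if __name__ == "__main__":
    group = bytes([0x0F, 0x10, 0x05, 0xFF, 0x00, 0x80])
    print(decode_30_payload(4, group))
    # Full-scale check from the text: 2047 units at 0.005 in/s per unit.
    print(2047 * 0.005)
```

The payload-length check mirrors the `NN × 1.5 + 2` total: subtracting the two tag bytes leaves exactly six bytes per group of four deltas.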

## Test fixtures

Committed under `tests/fixtures/`:

- `decode-re-5-8-26/event-a..event-d/`: original quiet bundle (4 events,
  PPV < 1 in/s). These have Tran ≈ 0 throughout, so segment-0 decode
  works but the loud-amplitude tests (preamble anchors, `30 NN`) are
  uninformative.
- `5-11-26/M529LL1A.{SP0,SS0,SV0}`: loud bundle (PPV 6-7 in/s on all
  channels). These cracked the Tran codec.
- `5-11-26/M529LL1L.{JQ0,V70}`: targeted captures. JQ0 is Vert-heavy,
  V70 is Mic-heavy (140 dB). These cracked the `00 NN` RLE rule.

Each fixture has a `.TXT` Blastware ASCII export as ground truth.

## Tests

`tests/test_waveform_codec.py` (40 tests, all passing) locks in:

- Block framing (5 tag types with correct lengths).
- Walker contiguity (no gaps or overlaps).
- Segment header parsing (counter monotonicity, fixed-pattern check).
- `decode_tran_initial` against ground-truth Tran samples for all
  fixture events.

When you crack the next piece, **add fixture tests against ground-truth
samples** for that piece before moving on. Don't let unverified code
ship without a regression lock-in.
@@ -0,0 +1,48 @@
"""
micromate — Instantel Micromate (Series IV) device library.

Sibling of ``minimateplus`` (the Series III library). Currently scoped to
the offline-file ingest path used by thor-watcher: parsing the per-event
``.IDFH``/``.IDFW`` ASCII text sidecars Thor's exporter writes alongside
each binary event file, and wrapping the parsed data in typed event
records.

Live-device support (TCP protocol, frame parsing, real-time monitoring)
is deferred — when we add it, it lands here as ``transport.py`` /
``framing.py`` / ``protocol.py`` / ``client.py``, mirroring the
``minimateplus`` package layout.

Typical usage (offline file ingest):

    from micromate import IdfEvent, parse_idf_report

    text  = open("UM11719_20231219162723.IDFW.txt").read()
    rep   = parse_idf_report(text)            # dict
    event = IdfEvent.from_report(rep, "UM11719_20231219162723.IDFW")
    print(event.serial, event.peaks.transverse_ips, event.mic_pspl_dbl)
"""

from .idf_ascii_report import (
    parse_event_filename,
    parse_idf_report,
    serial_from_filename,
)
from .models import (
    IdfEvent,
    IdfPeaks,
    IdfProjectInfo,
    IdfReport,
    IdfSensorCheck,
)

__version__ = "0.1.0"
__all__ = [
    "IdfEvent",
    "IdfPeaks",
    "IdfProjectInfo",
    "IdfReport",
    "IdfSensorCheck",
    "parse_event_filename",
    "parse_idf_report",
    "serial_from_filename",
]
@@ -0,0 +1,330 @@
"""
micromate/idf_ascii_report.py — parse Thor (Micromate Series IV) IDF ASCII reports.

Thor exports a `.IDFW.txt` or `.IDFH.txt` sidecar next to each `.IDFW`
(waveform) or `.IDFH` (histogram) event binary. Each sidecar is a
plain-text file with `"Key : Value"` lines covering the full device-
authoritative event metadata — PPV per channel, ZC Freq, Time of Peak,
Peak Acceleration / Displacement, sensor self-check results, project
strings, calibration date, battery level, etc. — followed by a raw
waveform-samples block headed by the literal line "Waveform Data Channels".

This is the Thor analogue of `minimateplus/bw_ascii_report.py` for the
Blastware (Series III) report format. The parser is intentionally
permissive: we extract everything we recognise into a flat dict and
silently ignore anything we don't. Downstream callers parse units
(`"0.2119 in/s"` → 0.2119) only on the fields they need.

Example input (truncated):

    "EventType : Full Waveform"
    "SampleRate : 1024 sps"
    "EventTime : 16:27:23"
    "EventDate : 2023-12-19"
    "TranPPV : 0.0251 in/s"
    "VertPPV : 0.2119 in/s"
    "LongPPV : 0.0282 in/s"
    "PeakVectorSum : 0.2131 in/s"
    "MicPSPL : 99.4 dB(L)"
    "TranZCFreq : 6.5 Hz"
    "SerialNumber : UM11719"
    "Version : Micromate ISEE 11.0AK"
    "FileName : UM11719_20231219162723.IDFW"
    "BatteryLevel : 3.8 volts"
    "Calibration : November 22, 2023 by Instantel"
    "TranTestResults : Passed"
    "TitleString1 : UPMC Presby-Loc 3-Level1-1R Elevator Rm"
    Waveform Data Channels
    Tran Vert Long MicL
    0.0003 -0.0003 0.0003 0.00013
    ...
"""

from __future__ import annotations

import datetime
import re
from typing import Any, Dict, Optional, Tuple, Union


# Lines look like: "Key : Value" (quotes literal, single ":" separator)
_LINE_RE = re.compile(r'^\s*"?([^":]+?)"?\s*:\s*"?(.*?)"?\s*$')

# Marker that ends the metadata block — everything after is raw sample data.
_WAVEFORM_BLOCK_MARKER = "waveform data channels"


def _normalize_key(raw: str) -> str:
    """Convert "TranPPV" / "PreTriggerLength" → snake_case."""
    s = raw.strip()
    # Insert underscore between lower→upper / digit→letter transitions
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
    s = s.replace("-", "_").replace(" ", "_")
    return s.lower()


def _strip_unit_suffix(value: str) -> str:
    """Return the numeric part of values like "0.2119 in/s" → "0.2119".

    Also strips Thor's below/above-threshold prefixes:
        "<0.005 in/s" → "0.005" (below-noise-floor reading)
        ">100 Hz" → "100" (above-measurement-range reading)
    """
    parts = value.strip().split()
    token = parts[0] if parts else value.strip()
    if token.startswith("<") or token.startswith(">"):
        token = token[1:]
    return token


def _parse_float(value: str) -> Optional[float]:
    try:
        return float(_strip_unit_suffix(value))
    except (ValueError, TypeError):
        return None


def _parse_int(value: str) -> Optional[int]:
    try:
        return int(float(_strip_unit_suffix(value)))
    except (ValueError, TypeError):
        return None


def parse_idf_report(text: Union[str, bytes]) -> Dict[str, Any]:
    """
    Parse a Thor IDFW.txt / IDFH.txt sidecar.

    Returns a flat dict with two kinds of entries:

    - **Raw fields** — every `Key : Value` line, keyed by snake_case
      of the original key, value as a string (unit suffix preserved).
      Lets callers grab any field we haven't explicitly normalised.

    - **Derived fields** — a curated set with parsed types:
        * `serial_number` str
        * `event_type` str ("Full Waveform" / "Full Histogram")
        * `event_datetime` ISO-8601 string ("YYYY-MM-DDTHH:MM:SS") when
          both EventDate and EventTime are present
        * `sample_rate` int (samples/sec)
        * `tran_ppv`,`vert_ppv`,`long_ppv` float (in/s)
        * `mic_ppv` float (dB or psi — same units as MicPSPL)
        * `peak_vector_sum` float (in/s)
        * `tran_zc_freq`,`vert_zc_freq`,`long_zc_freq` float (Hz)
        * `record_time_sec` float (seconds)
        * `pre_trigger_sec` float (seconds)
        * `project` str (from TitleString1 — Thor's location)
        * `client` str (TitleString2)
        * `operator` str (TitleString3 — company/operator)
        * `notes` str (TitleString4)
        * `setup` str
        * `version` str (firmware)
        * `battery_volts` float
        * `calibration_text` str (e.g. "November 22, 2023 by Instantel")
        * `tran_test_passed`, `vert_test_passed`, `long_test_passed`,
          `mic_test_passed` bool ("Passed" → True; anything else → False)
        * `filename` str (FileName line — useful sanity check)

    Stops parsing at the literal "Waveform Data Channels" line; the
    raw-samples block is left to whoever wants to decode the binary.

    Input may be `str` or `bytes` (`utf-8`/`latin-1` tolerant).
    """
    if isinstance(text, bytes):
        try:
            text = text.decode("utf-8")
        except UnicodeDecodeError:
            text = text.decode("latin-1", errors="replace")

    raw: Dict[str, str] = {}

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.lower().startswith(_WAVEFORM_BLOCK_MARKER):
            break
        m = _LINE_RE.match(stripped)
        if not m:
            continue
        key = _normalize_key(m.group(1))
        value = m.group(2).strip()
        # Multi-value lines (Channel, Units, etc.) — coalesce by appending.
        if key in raw:
            raw[key] = raw[key] + "; " + value
        else:
            raw[key] = value

    out: Dict[str, Any] = dict(raw)  # keep all raw fields

    # ── Derived fields ───────────────────────────────────────────────────────

    def _take(*candidates: str) -> Optional[str]:
        for c in candidates:
            if c in raw:
                return raw[c]
        return None

    # Event identity
    if "serial_number" in raw:
        out["serial_number"] = raw["serial_number"]
    if "event_type" in raw:
        out["event_type"] = raw["event_type"]
    if "file_name" in raw:
        out["filename"] = raw["file_name"]

    # Combined date+time. Waveform sidecars use "EventDate" / "EventTime";
    # histogram sidecars use "HistogramStartDate" / "HistogramStartTime".
    # Prefer the event_* names when both are present.
    ed = raw.get("event_date") or raw.get("histogram_start_date")
    et = raw.get("event_time") or raw.get("histogram_start_time")
    if ed and et:
        try:
            dt = datetime.datetime.strptime(f"{ed} {et}", "%Y-%m-%d %H:%M:%S")
            out["event_datetime"] = dt.isoformat()
        except ValueError:
            pass

    # Numeric scalars. For every field we typify here, we MUST drop the
    # raw string copy from `out` when parsing fails — Thor writes things
    # like "<0.005 in/s" (below threshold) and "N/A" (not measured) that
    # would otherwise linger in `out` as strings, sneak into SQLite REAL
    # columns via permissive type affinity, and then crash the JS
    # frontend on `.toFixed(...)`.
    int_fields = ("sample_rate",)
    for key in int_fields:
        v = raw.get(key)
        if v is None:
            continue
        iv = _parse_int(v)
        if iv is not None:
            out[key] = iv
        else:
            out.pop(key, None)

    float_fields = (
        "tran_ppv", "vert_ppv", "long_ppv", "peak_vector_sum",
        "tran_zc_freq", "vert_zc_freq", "long_zc_freq",
        "tran_peak_acceleration", "vert_peak_acceleration",
        "long_peak_acceleration",
        "tran_peak_displacement", "vert_peak_displacement",
        "long_peak_displacement",
        "mic_zc_freq",
    )
    for key in float_fields:
        v = raw.get(key)
        if v is None:
            continue
        fv = _parse_float(v)
        if fv is not None:
            out[key] = fv
        else:
            out.pop(key, None)

    # Time-of-peak: Thor labels these "TimeofPeak" (lowercase "of") so the
    # normalizer produces "*_timeof_peak". Map them to the canonical
    # ``*_time_of_peak`` output keys for downstream consumers.
    for raw_key, out_key in (
        ("tran_timeof_peak", "tran_time_of_peak"),
        ("vert_timeof_peak", "vert_time_of_peak"),
        ("long_timeof_peak", "long_time_of_peak"),
        ("mic_timeof_peak", "mic_time_of_peak"),
    ):
        v = raw.get(raw_key)
        if v is None:
            continue
        fv = _parse_float(v)
        if fv is not None:
            out[out_key] = fv

    # Microphone — Thor reports MicPSPL (dB(L)) which is the closest
    # analogue to BW's mic_ppv. The raw "99.4 dB(L)" string stays in
    # `out` under the original `mic_pspl` key for display; the parsed
    # float goes in `mic_ppv`.
    mic = raw.get("mic_pspl")
    if mic is not None:
        fv = _parse_float(mic)
        if fv is not None:
            out["mic_ppv"] = fv

    # Record / pre-trigger duration — same drop-on-failure discipline.
    rt = raw.get("record_time")
    if rt is not None:
        fv = _parse_float(rt)
        if fv is not None:
            out["record_time_sec"] = fv
    pt = raw.get("pre_trigger_length")
    if pt is not None:
        fv = _parse_float(pt)
        if fv is not None:
            out["pre_trigger_sec"] = fv

    # Project / client / operator / location strings. Thor's title
    # strings are operator-defined; conventional mapping (per Thor's
    # default TitleNote labels in the example data):
    #   TitleString1 = Location → project (sensor location identifier)
    #   TitleString2 = Client → client
    #   TitleString3 = Company → operator (the monitoring company)
    #   TitleString4 = Notes → notes
    out["project"] = _take("title_string1")
    out["client"] = _take("title_string2")
    out["operator"] = _take("title_string3", "operator")
    out["notes"] = _take("title_string4", "post_event_note")

    if "setup" in raw:
        out["setup"] = raw["setup"]
    if "version" in raw:
        out["version"] = raw["version"]

    # Battery (e.g. "3.8 volts" → 3.8)
    bl = raw.get("battery_level")
    if bl is not None:
        fv = _parse_float(bl)
        if fv is not None:
            out["battery_volts"] = fv

    # Calibration line is free-form (e.g. "November 22, 2023 by Instantel").
    if "calibration" in raw:
        out["calibration_text"] = raw["calibration"]

    # Sensor self-check results — bool flags
    for key, out_key in (
        ("tran_test_results", "tran_test_passed"),
        ("vert_test_results", "vert_test_passed"),
        ("long_test_results", "long_test_passed"),
        ("mic_test_results", "mic_test_passed"),
    ):
        v = raw.get(key)
        if v is not None:
            out[out_key] = v.strip().lower() == "passed"

    return out


def serial_from_filename(name: str) -> Optional[str]:
    """Convenience: pull the serial prefix from a Thor event filename.

    Thor uses the literal serial as the filename prefix:
        UM11719_20231219163444.IDFW → "UM11719"
        BE9439_20200713124251.IDFH → "BE9439"
    """
    m = re.match(r"^([A-Z]{2}\d+)_\d{14}\.(IDFH|IDFW)(?:\.txt)?$",
                 name, re.IGNORECASE)
    return m.group(1).upper() if m else None


def parse_event_filename(name: str) -> Optional[Tuple[str, datetime.datetime, str]]:
    """Parse `<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>` → (serial, datetime, kind).

    `kind` is "IDFH" or "IDFW" (upper-case). Returns None on no match.
    """
    m = re.match(r"^([A-Z]{2}\d+)_(\d{14})\.(IDFH|IDFW)$",
                 name, re.IGNORECASE)
    if not m:
        return None
    try:
        ts = datetime.datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        return None
    return m.group(1).upper(), ts, m.group(3).upper()
@@ -0,0 +1,530 @@
"""
micromate/idf_file.py — Thor IDF binary codec.

Decodes the Instantel Micromate Series IV ``.IDFW`` (waveform) and
``.IDFH`` (histogram) binary on-disk format. Sister module to
``minimateplus/event_file_io.py``.

Status (2026-05-28):

- **Genuine Series IV / Thor binaries** are all signed
  ``00 12 01 00 00 00 Instantel\\0`` (sig-A in earlier notes). Two
  Series III (Blastware) binaries appear in the example corpus
  (``BE9439_*``) — they share the ``.IDFW``/``.IDFH`` extension by
  filing convention but carry a BW STRT header (``10 00 01 80 00 00
  Instantel STRT...``) and are NOT Thor data. The reader detects
  them by signature and raises NotImplementedError pointing callers
  at ``minimateplus.event_file_io.read_blastware_file()``.
- **IDFW waveform body** reuses the BW segment-rotated block codec
  verbatim. Body most often starts at file offset ``0x0f1f``; other
  offsets are located by ``_find_waveform_body_offset``. Samples
  decoded via ``minimateplus.waveform_codec.decode_waveform_v2``
  with 87–99% byte-exact match against ``.IDFW.txt`` sidecar (quiet
  events). Loud events hit the BW codec's known walker-stops-early
  limit. Residual ~3% drift on per-sample deltas — likely a
  Thor-specific 12-bit delta refinement that BW's codec doesn't
  model. Geo LSB = 0.0003 in/s; mic factor ~2.14e-6 psi/count.
- **IDFH histogram body**: 12-byte segment header
  ``[len_be 2B] 0a 00 00 00 [00 NN_counter] 05 3f`` introduces a
  segment of ``N`` 72-byte interval records (``N = (len - 10) // 72``).
  Each record holds 4 × 16-byte per-channel min/max/halfp + 8-byte
  tail. Geo peaks via ``max(|min|, |max|) / 32768 × 10`` in/s
  (matches sidecar within ~1.8%), freq via ``512 / halfp`` Hz.
  **All 859 Thor IDFH files in the corpus decode (181,071 intervals).**
- Binary metadata directly extracted: serial, timestamp, sample_rate,
  record_time, calibration_date. Other fields fall back to the paired
  ``.IDFW.txt`` / ``.IDFH.txt`` sidecar (consumed by
  ``WaveformStore.save_imported_idf``).

The full reverse-engineering writeup lives in
``docs/idf_protocol_reference.md``.
"""

from __future__ import annotations

import datetime
import struct
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union

from minimateplus.waveform_codec import decode_waveform_v2

from .models import IdfEvent, IdfPeaks, IdfReport


# Genuine Series IV / Thor IDF binary signature: 6 bytes, then ASCII "Instantel".
_THOR_PREFIX = b"\x00\x12\x01\x00\x00\x00"
# Stray Series III (Blastware) binaries that occasionally turn up in Thor
# corpus directories renamed to the .IDFW/.IDFH convention. Their header
# (`10 00 01 80 00 00 Instantel STRT ...`) is byte-for-byte a BW SUB 5A
# STRT record, not a Thor binary. Detected so we can refuse-and-route
# rather than mis-parse.
_BW_STRAY_PREFIX = b"\x10\x00\x01\x80\x00\x00"
_INSTANTEL_TAG = b"Instantel"

# Most common body offset for sig-A IDFW files (~50% of prod events;
# 151/154 in the original tests/fixtures/THORDATA_example corpus). The
# body is the segment-rotated block stream consumed by decode_waveform_v2;
# bytes [0:3] are the magic ``00 02 00`` preamble. Production events
# routinely use other offsets — see :func:`_find_waveform_body_offset`
# for the dynamic scan. This constant survives only as the priority hint.
_BODY_START_SIG_A = 0x0F1F

# Magic bytes that mark a candidate waveform-body preamble.
_BODY_MAGIC = b"\x00\x02\x00"

# Where to start looking for body candidates inside the file. Skip the
# fixed-header region where the same magic legitimately appears inside
# channel-test records and the compliance block (offsets 0x015d, 0x091c,
# 0x0ae2, 0x0d30 in observed events).
_BODY_SCAN_FLOOR = 0x0E00

# Geophone count → in/s, derived from sidecar ground truth: the smallest
# non-zero sample in 1,014-file corpus is 0.0003 in/s.
_GEO_LSB_IPS = 0.0003

# Microphone count → psi, derived from sidecar regression on 50 sample
# pairs from UM11719_20231219162723.IDFW (mic-heavy event).
_MIC_LSB_PSI = 2.14e-6

# IDFH histogram constants.
_IDFH_INTERVAL_SIZE = 72  # bytes per per-interval record
_IDFH_SEGMENT_HEADER = 10  # bytes: [len_be 2B][0a 00 00 00 4B][00 NN 2B][05 3f 2B]
_IDFH_SEGMENT_TAIL = 2  # bytes after the interval data block, before next marker
_IDFH_HALFP_FREQ_NUM = 512.0  # freq_hz = NUM / halfp; halfp ≤ 5 means ">100 Hz" sentinel
_IDFH_GEO_FULL_SCALE = 10.0  # in/s — Normal range
_IDFH_INT16_FS = 32768.0
_IDFH_CHANNELS = ("Tran", "Vert", "Long", "MicL")
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Binary metadata extraction ─────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfBinaryMetadata:
|
||||||
|
"""Fields recoverable from the sig-A binary header (no .txt needed)."""
|
||||||
|
serial: Optional[str] = None
|
||||||
|
event_datetime: Optional[datetime.datetime] = None
|
||||||
|
sample_rate: Optional[int] = None
|
||||||
|
record_time_sec: Optional[float] = None
|
||||||
|
calibration_date: Optional[datetime.date] = None
|
||||||
|
|
||||||
|
|
||||||
|
def _read_ascii_z(buf: bytes, off: int, maxlen: int = 64) -> Optional[str]:
|
||||||
|
if off >= len(buf):
|
||||||
|
return None
|
||||||
|
end = buf.find(b"\x00", off, off + maxlen)
|
||||||
|
if end < 0:
|
||||||
|
end = min(off + maxlen, len(buf))
|
||||||
|
s = buf[off:end].decode("ascii", errors="replace").strip()
|
||||||
|
return s or None
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_8byte_timestamp(buf: bytes, off: int) -> Optional[datetime.datetime]:
|
||||||
|
"""Layout: ``[day][month][year_hi][year_lo][unknown][hour][min][sec]``."""
|
||||||
|
if off + 8 > len(buf):
|
||||||
|
return None
|
||||||
|
day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
|
||||||
|
year = (yh << 8) | yl
|
||||||
|
if not (2015 <= year <= 2050 and 1 <= mon <= 12 and 1 <= day <= 31
|
||||||
|
and 0 <= hr < 24 and 0 <= mn < 60 and 0 <= sc < 60):
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return datetime.datetime(year, mon, day, hr, mn, sc)
|
||||||
|
except ValueError:
|
||||||
|
return None
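The 8-byte layout in the docstring above can be exercised on a hand-built buffer. This is a minimal sketch, not the module's code: the helper name `decode_ts` and the sample bytes are illustrative assumptions, and the fifth byte (unknown in the source) is simply ignored.

```python
import datetime
from typing import Optional


def decode_ts(buf: bytes, off: int = 0) -> Optional[datetime.datetime]:
    """Illustrative decode of [day][month][year_hi][year_lo][?][hour][min][sec]."""
    if off + 8 > len(buf):
        return None
    day, mon, yh, yl, _unk, hr, mn, sc = buf[off : off + 8]
    year = (yh << 8) | yl  # big-endian 16-bit year
    try:
        return datetime.datetime(year, mon, day, hr, mn, sc)
    except ValueError:  # impossible calendar values, e.g. 31 February
        return None


# 19 Dec 2023 16:27:23; year 2023 is 0x07E7 in big-endian bytes.
sample = bytes([19, 12, 0x07, 0xE7, 0x00, 16, 27, 23])
decoded = decode_ts(sample)
```

The sketch leans on `datetime.datetime` to reject impossible dates; the range checks in the function above additionally filter out plausible-looking garbage such as year 300.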
|
||||||
|
|
||||||
|
|
||||||
|
def extract_binary_metadata(buf: bytes) -> IdfBinaryMetadata:
|
||||||
|
"""Pull serial/timestamp/sample_rate/record_time/calibration from the
|
||||||
|
sig-A binary header.
|
||||||
|
|
||||||
|
Field positions confirmed against UM11719_20231219162723.IDFW; stable
|
||||||
|
across the 151-file sig-A corpus.
|
||||||
|
"""
|
||||||
|
md = IdfBinaryMetadata()
|
||||||
|
|
||||||
|
# Serial: null-terminated ASCII at 0x14E.
|
||||||
|
md.serial = _read_ascii_z(buf, 0x14E, maxlen=16)
|
||||||
|
|
||||||
|
# Sample rate + record time live in a BW-compatible compliance block.
|
||||||
|
# Locate the 6-byte anchor `be 80 00 00 00 00` and read offsets relative
|
||||||
|
# to it: anchor-6 = sample_rate uint16 BE; anchor+6 = record_time float32 BE.
|
||||||
|
anchor = buf.find(b"\xbe\x80\x00\x00\x00\x00", 0x800, 0xA00)
|
||||||
|
if anchor > 0:
|
||||||
|
sr_bytes = buf[anchor - 6 : anchor - 4]
|
||||||
|
if len(sr_bytes) == 2:
|
||||||
|
sr = int.from_bytes(sr_bytes, "big")
|
||||||
|
if sr in (256, 512, 1024, 2048, 4096):
|
||||||
|
md.sample_rate = sr
|
||||||
|
rt_bytes = buf[anchor + 6 : anchor + 10]
|
||||||
|
if len(rt_bytes) == 4:
|
||||||
|
try:
|
||||||
|
rt = struct.unpack(">f", rt_bytes)[0]
|
||||||
|
if 0.1 <= rt <= 600.0:
|
||||||
|
md.record_time_sec = float(rt)
|
||||||
|
except struct.error:
|
||||||
|
pass
|
||||||
|
|
||||||
|
# Event timestamp: 8 bytes. Position differs between IDFW (0x97A) and
|
||||||
|
# IDFH (0x9F8); try both offsets and accept the first valid decode.
|
||||||
|
for off in (0x97A, 0x9F8):
|
||||||
|
ts = _decode_8byte_timestamp(buf, off)
|
||||||
|
if ts is not None:
|
||||||
|
md.event_datetime = ts
|
||||||
|
break
|
||||||
|
|
||||||
|
# Calibration date: day, month, year_be at 0x194-0x197.
|
||||||
|
if len(buf) > 0x197:
|
||||||
|
day, mon = buf[0x194], buf[0x195]
|
||||||
|
year = int.from_bytes(buf[0x196 : 0x198], "big")
|
||||||
|
if 1 <= mon <= 12 and 1 <= day <= 31 and 2015 <= year <= 2050:
|
||||||
|
try:
|
||||||
|
md.calibration_date = datetime.date(year, mon, day)
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
return md
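The anchor-relative reads for sample rate and record time can be illustrated on a synthetic buffer. A minimal sketch under stated assumptions: `read_rate_and_time` and the fake buffer are hypothetical, and only the anchor bytes and the two relative offsets come from the comments above.

```python
import struct
from typing import Optional, Tuple

ANCHOR = b"\xbe\x80\x00\x00\x00\x00"  # 6-byte marker quoted in the comments above


def read_rate_and_time(
    buf: bytes, lo: int = 0x800, hi: int = 0xA00
) -> Tuple[Optional[int], Optional[float]]:
    """Return (sample_rate, record_time_sec) read relative to the anchor."""
    a = buf.find(ANCHOR, lo, hi)
    if a < 6 or a + 10 > len(buf):
        return None, None
    rate = int.from_bytes(buf[a - 6 : a - 4], "big")         # uint16 BE at anchor-6
    (rec_time,) = struct.unpack(">f", buf[a + 6 : a + 10])   # float32 BE at anchor+6
    return rate, rec_time


# Synthetic buffer: anchor placed at 0x900, 1024 sps, 3.0 s record time.
fake = bytearray(0xA00)
fake[0x900 - 6 : 0x900 - 4] = (1024).to_bytes(2, "big")
fake[0x900 : 0x906] = ANCHOR
fake[0x906 : 0x90A] = struct.pack(">f", 3.0)
rate, rec_time = read_rate_and_time(bytes(fake))
```

Unlike the function above, this sketch does not whitelist sample rates or clamp the record time; it only shows where the two fields sit relative to the anchor.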
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Sample decoder + unit conversion ───────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def _find_waveform_body_offset(buf: bytes) -> Optional[int]:
|
||||||
|
"""Pick the file offset of the waveform body by trial-decoding every
|
||||||
|
``00 02 00`` magic position past the fixed-header region.
|
||||||
|
|
||||||
|
The body's location isn't fixed across all sig-A IDFW files — about
|
||||||
|
half the production events use ``0x0f1f``, but the rest have offsets
|
||||||
|
that shift based on header padding / channel-config layout. We
|
||||||
|
auto-detect by:
|
||||||
|
|
||||||
|
1. Find every ``00 02 00`` occurrence past ``_BODY_SCAN_FLOOR``.
|
||||||
|
2. Try ``decode_waveform_v2()`` on each candidate.
|
||||||
|
3. Pick the offset whose decoded sample count is largest.
|
||||||
|
|
||||||
|
Returns the offset, or ``None`` if no candidate yielded more than
|
||||||
|
the trivial 2-sample preamble (= "no real body found").
|
||||||
|
|
||||||
|
Costs ~2-8 trial decodes per file; in practice the first candidate
|
||||||
|
past 0x0e00 is usually the right one.
|
||||||
|
"""
|
||||||
|
if len(buf) < _BODY_SCAN_FLOOR + 8:
|
||||||
|
return None
|
||||||
|
best: Optional[tuple[int, int]] = None # (total_samples, offset)
|
||||||
|
i = _BODY_SCAN_FLOOR
|
||||||
|
while True:
|
||||||
|
j = buf.find(_BODY_MAGIC, i)
|
||||||
|
if j < 0:
|
||||||
|
break
|
||||||
|
i = j + 1
|
||||||
|
try:
|
||||||
|
decoded = decode_waveform_v2(buf[j:])
|
||||||
|
except Exception:
|
||||||
|
continue
|
||||||
|
if not decoded:
|
||||||
|
continue
|
||||||
|
total = sum(len(v) for v in decoded.values())
|
||||||
|
# A "real" body has more than just the 2-sample preamble.
|
||||||
|
if total <= 2:
|
||||||
|
continue
|
||||||
|
if best is None or total > best[0]:
|
||||||
|
best = (total, j)
|
||||||
|
return best[1] if best else None
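The scan-and-trial-decode idea can be sketched with a stand-in decoder. Everything here is an assumption made for illustration: `pick_body_offset` and `toy_decode` are hypothetical names, and the toy decoder bears no relation to the real `decode_waveform_v2` codec.

```python
from typing import Callable, Dict, List, Optional

MAGIC = b"\x00\x02\x00"


def pick_body_offset(
    buf: bytes,
    decode: Callable[[bytes], Dict[str, List[int]]],
    floor: int = 0,
) -> Optional[int]:
    """Trial-decode at every MAGIC hit at or after `floor`; keep the largest sample count."""
    best_total, best_off = 2, None  # a candidate must beat the 2-sample preamble
    pos = buf.find(MAGIC, floor)
    while pos >= 0:
        try:
            total = sum(len(v) for v in decode(buf[pos:]).values())
        except Exception:
            total = 0  # a failed trial decode just disqualifies the candidate
        if total > best_total:
            best_total, best_off = total, pos
        pos = buf.find(MAGIC, pos + 1)
    return best_off


def toy_decode(body: bytes) -> Dict[str, List[int]]:
    """Stand-in decoder: one 'sample' per byte after the magic, up to the first 0xFF."""
    payload = body[len(MAGIC):].split(b"\xff", 1)[0]
    return {"Tran": list(payload)}


# Two candidates: the first yields 1 sample, the second yields 4.
data = b"\xaa" + MAGIC + b"\x01\xff" + MAGIC + b"\x01\x02\x03\x04\xff"
offset = pick_body_offset(data, toy_decode)
```

Because the comparison is strict, ties go to the earliest candidate, which matches the behaviour of the loop above.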
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_waveform_samples(buf: bytes) -> Optional[dict]:
|
||||||
|
"""Decode samples from the sig-A waveform body.
|
||||||
|
|
||||||
|
Returns the raw decoder counts dict — geo LSB = 0.0003 in/s, mic in
|
||||||
|
its own count unit (see :func:`mic_count_to_psi`). Returns None if
|
||||||
|
no usable body is found.
|
||||||
|
|
||||||
|
Uses :func:`_find_waveform_body_offset` to locate the body — the
|
||||||
|
file-offset varies across events (~50% sit at the canonical
|
||||||
|
``0x0f1f`` but the rest don't), so the previous hardcoded constant
|
||||||
|
silently produced 2-sample preamble-only output for half the corpus.
|
||||||
|
"""
|
||||||
|
off = _find_waveform_body_offset(buf)
|
||||||
|
if off is None:
|
||||||
|
return None
|
||||||
|
return decode_waveform_v2(buf[off:])
|
||||||
|
|
||||||
|
|
||||||
|
def geo_count_to_ips(count: int) -> float:
|
||||||
|
"""Convert a Thor geo decoder count to in/s. LSB = 0.0003 in/s."""
|
||||||
|
return count * _GEO_LSB_IPS
|
||||||
|
|
||||||
|
|
||||||
|
def mic_count_to_psi(count: int) -> float:
|
||||||
|
"""Convert a Thor mic decoder count to psi. Scale derived from
|
||||||
|
regression over 50 sample pairs in UM11719_20231219162723.IDFW;
|
||||||
|
consistent to ~5%. Calibration constants from the channel block
|
||||||
|
can refine this once decoded.
|
||||||
|
"""
|
||||||
|
return count * _MIC_LSB_PSI
|
||||||
|
|
||||||
|
|
||||||
|
# ─── IDFH histogram decoder ─────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfhInterval:
|
||||||
|
"""One decoded histogram interval (typically one minute of monitoring)."""
|
||||||
|
offset: int # file byte offset of the 72-byte record
|
||||||
|
# Per-channel min/max ADC counts (int16 BE), half-period samples, peak count.
|
||||||
|
# Peak = max(|min|, |max|). freq_hz = 512/halfp (None if halfp ≤ 5 →
|
||||||
|
# ">100 Hz" sentinel; matches sidecar convention).
|
||||||
|
tran_min: int
|
||||||
|
tran_max: int
|
||||||
|
tran_halfp: int
|
||||||
|
vert_min: int
|
||||||
|
vert_max: int
|
||||||
|
vert_halfp: int
|
||||||
|
long_min: int
|
||||||
|
long_max: int
|
||||||
|
long_halfp: int
|
||||||
|
micl_min: int
|
||||||
|
micl_max: int
|
||||||
|
micl_halfp: int
|
||||||
|
|
||||||
|
def peak_count(self, channel: str) -> int:
|
||||||
|
mn = getattr(self, f"{channel.lower()}_min")
|
||||||
|
mx = getattr(self, f"{channel.lower()}_max")
|
||||||
|
return max(abs(mn), abs(mx))
|
||||||
|
|
||||||
|
def peak_ips(self, channel: str) -> float:
|
||||||
|
"""Convert peak count to in/s (geo channels only)."""
|
||||||
|
return self.peak_count(channel) / _IDFH_INT16_FS * _IDFH_GEO_FULL_SCALE
|
||||||
|
|
||||||
|
def freq_hz(self, channel: str) -> Optional[float]:
|
||||||
|
halfp = getattr(self, f"{channel.lower()}_halfp")
|
||||||
|
if halfp <= 5:
|
||||||
|
return None
|
||||||
|
return _IDFH_HALFP_FREQ_NUM / halfp
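The two conversions used by this class reduce to short arithmetic. A minimal worked sketch, assuming the constants shown earlier in the file (10 in/s full scale over 32768 counts, and 512 as the frequency numerator); the function names and input values are illustrative only.

```python
from typing import Optional

FULL_SCALE_IPS = 10.0  # "Normal" geophone range, per the constants above
INT16_FS = 32768.0
FREQ_NUM = 512.0


def peak_ips(mn: int, mx: int) -> float:
    """Peak magnitude of a signed min/max pair, scaled to in/s."""
    return max(abs(mn), abs(mx)) / INT16_FS * FULL_SCALE_IPS


def freq_hz(halfp: int) -> Optional[float]:
    """512 / half-period; None stands for the '>100 Hz' sentinel (halfp <= 5)."""
    return None if halfp <= 5 else FREQ_NUM / halfp


half_scale = peak_ips(-16384, 1200)  # |min| dominates: 16384 / 32768 * 10
f32 = freq_hz(16)                    # 512 / 16
f_hi = freq_hz(4)                    # sentinel case
```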
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_idfh_interval(buf72: bytes, offset: int) -> IdfhInterval:
|
||||||
|
"""Decode one 72-byte interval record into per-channel min/max/halfp."""
|
||||||
|
import struct
|
||||||
|
fields = []
|
||||||
|
for i in range(4):
|
||||||
|
block = buf72[i * 16 : (i + 1) * 16]
|
||||||
|
mn = struct.unpack_from(">h", block, 0)[0]
|
||||||
|
mx = struct.unpack_from(">h", block, 2)[0]
|
||||||
|
# block[4:6] = int16 BE, role unknown (possibly time-of-peak)
|
||||||
|
halfp = struct.unpack_from(">H", block, 6)[0]
|
||||||
|
# block[10:12] and block[14:16] are uint16 BE with unknown semantics
|
||||||
|
# (likely sum / count contributions for the PVS computation).
|
||||||
|
fields.extend([mn, mx, halfp])
|
||||||
|
# Tail 8 bytes (buf72[64:72]) carry PVS-related data; not yet decoded.
|
||||||
|
return IdfhInterval(
|
||||||
|
offset=offset,
|
||||||
|
tran_min=fields[0], tran_max=fields[1], tran_halfp=fields[2],
|
||||||
|
vert_min=fields[3], vert_max=fields[4], vert_halfp=fields[5],
|
||||||
|
long_min=fields[6], long_max=fields[7], long_halfp=fields[8],
|
||||||
|
micl_min=fields[9], micl_max=fields[10], micl_halfp=fields[11],
|
||||||
|
)
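The per-channel block layout can be checked by packing a synthetic record and unpacking it again. This is a sketch under assumptions: `unpack_interval` is a hypothetical helper, the channel order follows `_IDFH_CHANNELS`, and the bytes whose role is unknown in the source are left as zeros.

```python
import struct
from typing import Dict, Tuple

CHANNELS = ("Tran", "Vert", "Long", "MicL")


def unpack_interval(rec: bytes) -> Dict[str, Tuple[int, int, int]]:
    """Map channel -> (min, max, halfp) from a 72-byte record (4 x 16-byte blocks + 8-byte tail)."""
    if len(rec) != 72:
        raise ValueError("expected a 72-byte interval record")
    out = {}
    for i, name in enumerate(CHANNELS):
        base = i * 16
        mn, mx = struct.unpack_from(">hh", rec, base)       # int16 BE min, max
        (halfp,) = struct.unpack_from(">H", rec, base + 6)  # uint16 BE half-period
        out[name] = (mn, mx, halfp)
    return out


# Build a synthetic record in which only Vert carries non-zero values.
rec = bytearray(72)
struct.pack_into(">hh", rec, 16, -120, 95)
struct.pack_into(">H", rec, 16 + 6, 20)
fields = unpack_interval(bytes(rec))
```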
|
||||||
|
|
||||||
|
|
||||||
|
def decode_idfh_body(buf: bytes) -> list:
|
||||||
|
"""Walk an IDFH file and decode every interval record.
|
||||||
|
|
||||||
|
The body has one or more segments; each segment header is 10 bytes:
|
||||||
|
``[length_be 2B][0a 00 00 00][00 NN_counter][05 3f]`` where ``length``
|
||||||
|
is bytes from the magic through the end of the interval block
|
||||||
|
(= 10 + 72 × n_intervals). Segments are separated by a 2-byte tail
|
||||||
|
+ next-segment 2-byte prefix (the bytes before the next length field).
|
||||||
|
Confirmed against the 859-file corpus (181,071 intervals decoded; 1
|
||||||
|
failure is the sig-B BE9439 file).
|
||||||
|
"""
|
||||||
|
intervals: list = []
|
||||||
|
i = 0
|
||||||
|
while True:
|
||||||
|
j = buf.find(b"\x0a\x00\x00\x00", i)
|
||||||
|
if j < 0 or j < 2:
|
||||||
|
break
|
||||||
|
# Validate: [length_be][0a 00 00 00][00 NN][05 3f]
|
||||||
|
if j + 8 > len(buf) or buf[j + 4] != 0x00 or buf[j + 6 : j + 8] != b"\x05\x3f":
|
||||||
|
i = j + 1
|
||||||
|
continue
|
||||||
|
length = int.from_bytes(buf[j - 2 : j], "big")
|
||||||
|
n = (length - _IDFH_SEGMENT_HEADER) // _IDFH_INTERVAL_SIZE
|
||||||
|
if n <= 0:
|
||||||
|
i = j + 1
|
||||||
|
continue
|
||||||
|
header_start = j - 2
|
||||||
|
interval_start = header_start + _IDFH_SEGMENT_HEADER
|
||||||
|
for k in range(n):
|
||||||
|
off = interval_start + k * _IDFH_INTERVAL_SIZE
|
||||||
|
if off + _IDFH_INTERVAL_SIZE > len(buf):
|
||||||
|
break
|
||||||
|
chunk = buf[off : off + _IDFH_INTERVAL_SIZE]
|
||||||
|
intervals.append(_decode_idfh_interval(chunk, off))
|
||||||
|
# Advance past this segment + the 2-byte tail.
|
||||||
|
i = header_start + length + _IDFH_SEGMENT_TAIL
|
||||||
|
return intervals
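The segment walk can be traced on a synthetic buffer. A minimal sketch, assuming (as the decoder's arithmetic implies) that the length field covers the 10-byte header plus the records and that a 2-byte tail follows each segment; `make_segment` and `walk_segments` are hypothetical helpers, not part of the module.

```python
from typing import List, Tuple

MARKER = b"\x0a\x00\x00\x00"
HEADER, RECORD, TAIL = 10, 72, 2


def make_segment(counter: int, n: int) -> bytes:
    """Synthetic segment: 10-byte header, n zeroed 72-byte records, 2-byte tail."""
    length = HEADER + RECORD * n
    header = length.to_bytes(2, "big") + MARKER + bytes([0x00, counter]) + b"\x05\x3f"
    return header + bytes(RECORD * n) + bytes(TAIL)


def walk_segments(buf: bytes) -> List[Tuple[int, int]]:
    """Return (first_record_offset, record_count) for each validated segment."""
    found, i = [], 0
    while True:
        j = buf.find(MARKER, i)
        if j < 0:
            break
        ok = j >= 2 and j + 8 <= len(buf) and buf[j + 4] == 0 and buf[j + 6 : j + 8] == b"\x05\x3f"
        n = (int.from_bytes(buf[j - 2 : j], "big") - HEADER) // RECORD if ok else 0
        if n <= 0:
            i = j + 1  # not a real header; resume just past this hit
            continue
        start = j - 2
        found.append((start + HEADER, n))
        i = start + HEADER + RECORD * n + TAIL  # skip the records and the tail
    return found


# Four bytes of padding, then a 2-record segment and a 1-record segment.
blob = b"\xee" * 4 + make_segment(1, 2) + make_segment(2, 1)
segments = walk_segments(blob)
```

Jumping past each validated segment, instead of rescanning byte by byte, keeps marker-like bytes inside interval data from being mistaken for headers.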
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Top-level reader ───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfReadResult:
|
||||||
|
"""Return type for :func:`read_idf_file`.
|
||||||
|
|
||||||
|
For waveforms (``.IDFW``), ``samples`` holds the per-channel sample
|
||||||
|
arrays in Thor decoder counts. For histograms (``.IDFH``),
|
||||||
|
``samples`` is empty and ``intervals`` holds the per-interval
|
||||||
|
record list (peaks, freqs).
|
||||||
|
"""
|
||||||
|
event: IdfEvent
|
||||||
|
samples: dict # {"Tran": [...], ...} for IDFW; empty for IDFH
|
||||||
|
binary_metadata: IdfBinaryMetadata
|
||||||
|
signature: str # always "thor" for now (sig-A genuine Thor)
|
||||||
|
intervals: Optional[list] = None # list[IdfhInterval] for IDFH; None for IDFW
|
||||||
|
|
||||||
|
|
||||||
|
def read_idf_file(
|
||||||
|
path: Union[str, Path],
|
||||||
|
*,
|
||||||
|
data: Optional[bytes] = None,
|
||||||
|
) -> IdfReadResult:
|
||||||
|
"""Parse a Thor ``.IDFW`` binary into an ``IdfEvent`` + decoded samples.
|
||||||
|
|
||||||
|
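Implements signature-A files only. Waveforms (``.IDFW``) return decoded
samples; histograms (``.IDFH``) return decoded intervals and an empty
``samples`` dict. A Series III (Blastware) STRT header raises
NotImplementedError and an unrecognised signature raises ValueError.
Signature-B (old-firmware) files are not decoded here; use the paired
``.IDFW.txt`` / ``.IDFH.txt`` sidecar for those via
``parse_idf_report()``.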
|
||||||
|
|
||||||
|
Returns an :class:`IdfReadResult`. The caller converts int sample
|
||||||
|
counts to physical units via :func:`geo_count_to_ips` /
|
||||||
|
:func:`mic_count_to_psi`.
|
||||||
|
|
||||||
|
``path`` is used for filename in error messages and ``.IDFH`` vs
|
||||||
|
``.IDFW`` suffix detection. When ``data`` is supplied the disk
|
||||||
|
read is skipped — useful for ingest paths that already have the
|
||||||
|
bytes in memory and where the file may not exist on disk yet.
|
||||||
|
"""
|
||||||
|
p = Path(path)
|
||||||
|
buf = data if data is not None else p.read_bytes()
|
||||||
|
|
||||||
|
if len(buf) < 16 or buf[6:16] != _INSTANTEL_TAG + b"\x00":
|
||||||
|
raise ValueError(f"{p.name}: not an IDF file (missing Instantel magic)")
|
||||||
|
|
||||||
|
sig_prefix = buf[:6]
|
||||||
|
if sig_prefix == _THOR_PREFIX:
|
||||||
|
signature = "thor"
|
||||||
|
elif sig_prefix == _BW_STRAY_PREFIX:
|
||||||
|
raise NotImplementedError(
|
||||||
|
f"{p.name}: file has a Series III (Blastware) STRT header in "
|
||||||
|
"an IDF-named container — not a Thor binary. Route through "
|
||||||
|
"minimateplus.event_file_io.read_blastware_file() instead "
|
||||||
|
"(peaks decode; samples & full metadata don't, but it's not "
|
||||||
|
"Thor data so the Thor codec doesn't apply)."
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
raise ValueError(f"{p.name}: unknown IDF signature {sig_prefix.hex()}")
|
||||||
|
|
||||||
|
is_histogram = p.suffix.upper() == ".IDFH"
|
||||||
|
md = extract_binary_metadata(buf)
|
||||||
|
|
||||||
|
if is_histogram:
|
||||||
|
intervals = decode_idfh_body(buf)
|
||||||
|
if not intervals:
|
||||||
|
raise ValueError(f"{p.name}: IDFH body decoded no intervals")
|
||||||
|
# Peaks: max across all intervals on each channel (per-channel max
|
||||||
|
# of stored max-magnitudes; sidecar's PPV row carries the same).
|
||||||
|
peak_tran = max((iv.peak_ips("Tran") for iv in intervals), default=0.0)
|
||||||
|
peak_vert = max((iv.peak_ips("Vert") for iv in intervals), default=0.0)
|
||||||
|
peak_long = max((iv.peak_ips("Long") for iv in intervals), default=0.0)
|
||||||
|
# Mic peak in psi — Thor stores per-interval mic ADC counts in the
|
||||||
|
# binary; convert the max count to psi via the per-count factor.
|
||||||
|
mic_peak_count = max((iv.peak_count("MicL") for iv in intervals), default=0)
|
||||||
|
mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
|
||||||
|
rep = IdfReport(
|
||||||
|
serial_number=md.serial,
|
||||||
|
event_type="Full Histogram",
|
||||||
|
event_datetime=md.event_datetime,
|
||||||
|
filename=p.name,
|
||||||
|
sample_rate=md.sample_rate,
|
||||||
|
record_time_sec=md.record_time_sec,
|
||||||
|
)
|
||||||
|
peaks = IdfPeaks(
|
||||||
|
transverse_ips=peak_tran,
|
||||||
|
vertical_ips=peak_vert,
|
||||||
|
longitudinal_ips=peak_long,
|
||||||
|
peak_vector_sum_ips=None,
|
||||||
|
mic_pspl_dbl=None, # IDFH binary doesn't carry the dB(L) value
|
||||||
|
mic_pspl_psi=mic_peak_psi,
|
||||||
|
)
|
||||||
|
event = IdfEvent(
|
||||||
|
serial=md.serial or "UNKNOWN",
|
||||||
|
timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
|
||||||
|
kind="Histogram",
|
||||||
|
filename=p.name,
|
||||||
|
sample_rate=md.sample_rate,
|
||||||
|
record_time_sec=md.record_time_sec,
|
||||||
|
peaks=peaks,
|
||||||
|
report=rep,
|
||||||
|
)
|
||||||
|
return IdfReadResult(
|
||||||
|
event=event,
|
||||||
|
samples={},
|
||||||
|
binary_metadata=md,
|
||||||
|
signature=signature,
|
||||||
|
intervals=intervals,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Waveform path.
|
||||||
|
decoded = _decode_waveform_samples(buf)
|
||||||
|
if decoded is None:
|
||||||
|
raise ValueError(f"{p.name}: waveform body codec failed")
|
||||||
|
|
||||||
|
rep = IdfReport(
|
||||||
|
serial_number=md.serial,
|
||||||
|
event_type="Full Waveform",
|
||||||
|
event_datetime=md.event_datetime,
|
||||||
|
filename=p.name,
|
||||||
|
sample_rate=md.sample_rate,
|
||||||
|
record_time_sec=md.record_time_sec,
|
||||||
|
)
|
||||||
|
|
||||||
|
def _peak_ips(ch: str) -> float:
|
||||||
|
arr = decoded.get(ch, [])
|
||||||
|
return geo_count_to_ips(max((abs(v) for v in arr), default=0))
|
||||||
|
|
||||||
|
# Mic peak psi from binary: max absolute MicL ADC count × 2.14e-6 psi/count.
|
||||||
|
mic_arr = decoded.get("MicL", [])
|
||||||
|
mic_peak_count = max((abs(v) for v in mic_arr), default=0)
|
||||||
|
mic_peak_psi = mic_count_to_psi(mic_peak_count) if mic_peak_count else None
|
||||||
|
|
||||||
|
peaks = IdfPeaks(
|
||||||
|
transverse_ips=_peak_ips("Tran"),
|
||||||
|
vertical_ips=_peak_ips("Vert"),
|
||||||
|
longitudinal_ips=_peak_ips("Long"),
|
||||||
|
# PVS requires aligned per-sample √(T²+V²+L²); leave None — the
|
||||||
|
# sidecar carries it and the bridge picks it up if present.
|
||||||
|
peak_vector_sum_ips=None,
|
||||||
|
mic_pspl_dbl=None, # binary IDFW doesn't carry the dB(L) value;
|
||||||
|
# sidecar .txt fills it via IdfReport.from_dict
|
||||||
|
mic_pspl_psi=mic_peak_psi,
|
||||||
|
)
|
||||||
|
|
||||||
|
event = IdfEvent(
|
||||||
|
serial=md.serial or "UNKNOWN",
|
||||||
|
timestamp=md.event_datetime or datetime.datetime(1970, 1, 1),
|
||||||
|
kind="Waveform",
|
||||||
|
filename=p.name,
|
||||||
|
sample_rate=md.sample_rate,
|
||||||
|
record_time_sec=md.record_time_sec,
|
||||||
|
peaks=peaks,
|
||||||
|
report=rep,
|
||||||
|
)
|
||||||
|
|
||||||
|
return IdfReadResult(
|
||||||
|
event=event,
|
||||||
|
samples=decoded,
|
||||||
|
binary_metadata=md,
|
||||||
|
signature=signature,
|
||||||
|
)
|
||||||
@@ -0,0 +1,323 @@
|
|||||||
|
"""
|
||||||
|
micromate/idf_to_bw_report.py — adapter that projects a parsed Thor IDF
|
||||||
|
report (+ binary metadata + decoded IDFH intervals) into the
|
||||||
|
``bw_report``-shaped dict that :mod:`sfm.report_pdf.gather_report_data`
|
||||||
|
consumes.
|
||||||
|
|
||||||
|
Lets Thor events flow through the existing Series III Event Report PDF
|
||||||
|
pipeline without duplicating the renderer. Thor's report content is
|
||||||
|
~95% the same data shape as BW's; the field names differ but the
|
||||||
|
underlying metrics map 1:1.
|
||||||
|
|
||||||
|
Caveats
|
||||||
|
───────
|
||||||
|
|
||||||
|
- **Mic units** — Thor records ``MicPSPL`` natively in dB(L). This
|
||||||
|
adapter sets ``bw_report.mic.pspl_dbl`` directly; the report
|
||||||
|
renderer recomputes the equivalent psi via its dBL→psi formula.
|
||||||
|
- **Saturation / above-range flags** — Thor doesn't always mark
|
||||||
|
``OORANGE`` the way BW does; we set ``zc_freq_above_range`` only
|
||||||
|
when a ``>100`` sentinel was preserved in the raw text.
|
||||||
|
- **Per-interval data** — for IDFH events we build ``interval_times``
|
||||||
|
by stepping ``IntervalSize`` from ``HistogramStartTime``; the binary
|
||||||
|
decoder confirms one record per step (882 / 881 / 881 ... across
|
||||||
|
the corpus).
|
||||||
|
- **calibration_by parsing** — Thor's free-form ``Calibration : November
|
||||||
|
22, 2023 by Instantel`` is split on ``" by "`` to extract the
|
||||||
|
calibrator; the date prefix is parsed where possible, otherwise
|
||||||
|
the binary-extracted ``calibration_date`` from
|
||||||
|
:class:`micromate.idf_file.IdfBinaryMetadata` wins.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import datetime
|
||||||
|
import re
|
||||||
|
from typing import Any, Dict, List, Optional
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Helpers ────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
_NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_first_number(s: Optional[str]) -> Optional[float]:
|
||||||
|
"""Pull the first numeric token from a string like ``"0.1500 in/s"``."""
|
||||||
|
if s is None:
|
||||||
|
return None
|
||||||
|
m = _NUM_RE.search(str(s))
|
||||||
|
if not m:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return float(m.group(0))
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_interval_size_s(s: Optional[str]) -> Optional[float]:
|
||||||
|
"""``"60 sec"`` → 60.0, ``"5 min"`` → 300.0, ``"1 hour"`` → 3600."""
|
||||||
|
if s is None:
|
||||||
|
return None
|
||||||
|
num = _parse_first_number(s)
|
||||||
|
if num is None:
|
||||||
|
return None
|
||||||
|
sl = str(s).lower()
|
||||||
|
if "hour" in sl or "hr" in sl:
|
||||||
|
return num * 3600.0
|
||||||
|
if "min" in sl:
|
||||||
|
return num * 60.0
|
||||||
|
return num # default to seconds
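The unit handling above can be condensed into one function for a quick check. A sketch only: `interval_seconds` is a hypothetical name, and the input strings are examples taken from the docstring, not an exhaustive list of what Thor reports emit.

```python
import re
from typing import Optional

_NUM = re.compile(r"-?\d+(?:\.\d+)?")


def interval_seconds(text: str) -> Optional[float]:
    """Convert strings such as '60 sec', '5 min' or '1 hour' to seconds."""
    m = _NUM.search(text)
    if m is None:
        return None
    value, lowered = float(m.group(0)), text.lower()
    if "hour" in lowered or "hr" in lowered:
        return value * 3600.0
    if "min" in lowered:
        return value * 60.0
    return value  # bare numbers and "sec" are treated as seconds


one_minute = interval_seconds("60 sec")
five_minutes = interval_seconds("5 min")
one_hour = interval_seconds("1 hour")
```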
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_calibration(text: Optional[str]) -> tuple[Optional[str], Optional[str]]:
|
||||||
|
"""Split ``"November 22, 2023 by Instantel"`` → (ISO date, calibrator).
|
||||||
|
|
||||||
|
Returns ``(None, None)`` if neither half parses.
|
||||||
|
"""
|
||||||
|
if not text:
|
||||||
|
return None, None
|
||||||
|
parts = str(text).split(" by ", 1)
|
||||||
|
date_part = parts[0].strip() if parts else None
|
||||||
|
by_part = parts[1].strip() if len(parts) > 1 else None
|
||||||
|
iso_date: Optional[str] = None
|
||||||
|
if date_part:
|
||||||
|
for fmt in ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y"):
|
||||||
|
try:
|
||||||
|
iso_date = datetime.datetime.strptime(date_part, fmt).date().isoformat()
|
||||||
|
break
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
return iso_date, by_part
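The date/calibrator split can be illustrated with three inputs. A minimal sketch, assuming the default C locale for month names; `split_calibration` is a hypothetical helper and, unlike the function above, it normalises an empty calibrator to `None`.

```python
import datetime
from typing import Optional, Tuple

FORMATS = ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d", "%m/%d/%Y")


def split_calibration(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (ISO date or None, calibrator or None) from 'DATE by WHO'."""
    date_part, sep, who = text.partition(" by ")
    iso = None
    for fmt in FORMATS:
        try:
            iso = datetime.datetime.strptime(date_part.strip(), fmt).date().isoformat()
            break
        except ValueError:
            continue  # try the next accepted format
    return iso, (who.strip() or None) if sep else None


full = split_calibration("November 22, 2023 by Instantel")
date_only = split_calibration("2023-11-22")
unparsed = split_calibration("last autumn by Instantel")
```

The third case is the one where the adapter would fall back to the binary-extracted calibration date, since the text yields a calibrator but no date.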
|
||||||
|
|
||||||
|
|
||||||
|
def _channel_peaks(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
|
||||||
|
"""Map ``tran_ppv`` / ``tran_zc_freq`` / ... → bw_report.peaks.tran shape."""
|
||||||
|
out: Dict[str, Any] = {}
|
||||||
|
for src, dst in (
|
||||||
|
(f"{ch_lc}_ppv", "ppv_ips"),
|
||||||
|
(f"{ch_lc}_zc_freq", "zc_freq_hz"),
|
||||||
|
(f"{ch_lc}_time_of_peak", "time_of_peak_s"),
|
||||||
|
(f"{ch_lc}_peak_acceleration", "peak_accel_g"),
|
||||||
|
(f"{ch_lc}_peak_displacement", "peak_disp_in"),
|
||||||
|
):
|
||||||
|
v = idf.get(src)
|
||||||
|
if v is not None:
|
||||||
|
out[dst] = v
|
||||||
|
# ZC freq ">100" sentinel: the raw text carries it under the un-typed
|
||||||
|
# key (e.g. ``raw["tran_zc_freq"]`` would be ``">100"``), and our parser
|
||||||
|
# dropped the typed entry. Detect that case and flag.
|
||||||
|
raw_zc = idf.get(f"{ch_lc}_zc_freq")
|
||||||
|
if isinstance(raw_zc, str) and ">" in raw_zc:
|
||||||
|
out["zc_freq_above_range"] = True
|
||||||
|
out.pop("zc_freq_hz", None)
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def _sensor_check(idf: Dict[str, Any], ch_lc: str) -> Dict[str, Any]:
|
||||||
|
out: Dict[str, Any] = {}
|
||||||
|
fr = idf.get(f"{ch_lc}_test_freq")
|
||||||
|
if fr is not None:
|
||||||
|
out["freq_hz"] = _parse_first_number(fr)
|
||||||
|
rt = idf.get(f"{ch_lc}_test_ratio")
|
||||||
|
if rt is not None:
|
||||||
|
out["ratio"] = _parse_first_number(rt)
|
||||||
|
am = idf.get(f"{ch_lc}_test_amplitude")
|
||||||
|
if am is not None:
|
||||||
|
out["amplitude_mv"] = _parse_first_number(am)
|
||||||
|
res = idf.get(f"{ch_lc}_test_results")
|
||||||
|
if res is not None:
|
||||||
|
out["result"] = str(res).strip()
|
||||||
|
return {k: v for k, v in out.items() if v is not None}
|
||||||
|
|
||||||
|
|
||||||
|
def _interval_times(idf: Dict[str, Any], n_intervals: Optional[int]) -> List[str]:
|
||||||
|
"""Synthesise per-interval timestamps from start + interval_size × k.
|
||||||
|
|
||||||
|
Returns ``[]`` when start time or interval size is unknown.
|
||||||
|
"""
|
||||||
|
if not n_intervals:
|
||||||
|
return []
|
||||||
|
start_date = idf.get("histogram_start_date") or idf.get("event_date")
|
||||||
|
start_time = idf.get("histogram_start_time") or idf.get("event_time")
|
||||||
|
iv_str = idf.get("interval_size")
|
||||||
|
iv_s = _parse_interval_size_s(iv_str)
|
||||||
|
if not (start_date and start_time and iv_s):
|
||||||
|
return []
|
||||||
|
try:
|
||||||
|
t0 = datetime.datetime.strptime(f"{start_date} {start_time}", "%Y-%m-%d %H:%M:%S")
|
||||||
|
except ValueError:
|
||||||
|
return []
|
||||||
|
out = []
|
||||||
|
for k in range(int(n_intervals)):
|
||||||
|
t = t0 + datetime.timedelta(seconds=iv_s * (k + 1))
|
||||||
|
out.append(t.isoformat())
|
||||||
|
return out
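The timestamp synthesis is a simple arithmetic progression. In this sketch, `interval_end_times` is a hypothetical name and the start time is made up; each stamp is start + step × (k + 1), so the first entry falls one full step after the start.

```python
import datetime
from typing import List


def interval_end_times(start: datetime.datetime, step_s: float, n: int) -> List[str]:
    """ISO timestamps at start + step_s * (k + 1) for k = 0 .. n-1."""
    return [
        (start + datetime.timedelta(seconds=step_s * (k + 1))).isoformat()
        for k in range(n)
    ]


t0 = datetime.datetime(2023, 12, 19, 16, 0, 0)
stamps = interval_end_times(t0, 60.0, 3)
```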
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Top-level adapter ──────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def build_bw_report_from_idf(
|
||||||
|
idf_report: Dict[str, Any],
|
||||||
|
*,
|
||||||
|
binary_md=None,
|
||||||
|
intervals: Optional[list] = None,
|
||||||
|
is_histogram: Optional[bool] = None,
|
||||||
|
) -> Dict[str, Any]:
|
||||||
|
"""Project a parsed IDF report dict (and optional binary metadata +
|
||||||
|
decoded IDFH intervals) into the BW report sidecar shape.
|
||||||
|
|
||||||
|
The returned dict is structurally identical to what
|
||||||
|
``minimateplus.event_file_io._bw_report_to_dict`` produces from a
|
||||||
|
real BW ASCII report — it can be assigned to
|
||||||
|
``sidecar["bw_report"]`` and consumed verbatim by
|
||||||
|
``sfm.report_pdf.gather_report_data``.
|
||||||
|
|
||||||
|
``intervals`` is the list of :class:`micromate.idf_file.IdfhInterval`
|
||||||
|
objects from :func:`micromate.idf_file.decode_idfh_body`; only used
|
||||||
|
for histogram events to derive accurate ``interval_times``.
|
||||||
|
"""
|
||||||
|
if is_histogram is None:
|
||||||
|
et = str(idf_report.get("event_type", ""))
|
||||||
|
is_histogram = et.lower().startswith("full histogram")
|
||||||
|
|
||||||
|
# ── Trigger / recording / device ─────────────────────────────────────
|
||||||
|
trigger_channel = idf_report.get("trigger")
|
||||||
|
trigger_level = _parse_first_number(idf_report.get("geo_trigger_level"))
|
||||||
|
geo_range_ips = _parse_first_number(idf_report.get("geo_range"))
|
||||||
|
|
||||||
|
cal_iso, cal_by = _parse_calibration(idf_report.get("calibration"))
|
||||||
|
# Prefer the binary-extracted calibration_date when our text parse fell
|
||||||
|
# through; the binary date is unambiguous.
|
||||||
|
if cal_iso is None and binary_md is not None and binary_md.calibration_date:
|
||||||
|
cal_iso = binary_md.calibration_date.isoformat()
|
||||||
|
|
||||||
|
# ── Histogram fields ────────────────────────────────────────────────
|
||||||
|
hist_block: Dict[str, Any] = {
|
||||||
|
"start": None, "stop": None, "n_intervals": None,
|
||||||
|
"interval_size": None, "interval_size_s": None,
|
||||||
|
"channel_peak_when": {},
|
||||||
|
}
|
||||||
|
if is_histogram:
|
||||||
|
sd = idf_report.get("histogram_start_date")
|
||||||
|
st = idf_report.get("histogram_start_time")
|
||||||
|
if sd and st:
|
||||||
|
try:
|
||||||
|
hist_block["start"] = datetime.datetime.strptime(
|
||||||
|
f"{sd} {st}", "%Y-%m-%d %H:%M:%S"
|
||||||
|
).isoformat()
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
ed = idf_report.get("histogram_stop_date")
|
||||||
|
et_ = idf_report.get("histogram_stop_time")
|
||||||
|
if ed and et_:
|
||||||
|
try:
|
||||||
|
hist_block["stop"] = datetime.datetime.strptime(
|
||||||
|
f"{ed} {et_}", "%Y-%m-%d %H:%M:%S"
|
||||||
|
).isoformat()
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
n_raw = idf_report.get("number_of_intervals")
|
||||||
|
if n_raw is not None:
|
||||||
|
try:
|
||||||
|
# Thor reports a float like "81.04"; truncate to int (the BW
|
||||||
|
# report uses an int for the column).
|
||||||
|
hist_block["n_intervals"] = int(float(str(n_raw)))
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
# When the binary decoder gave us the actual interval count, prefer it.
|
||||||
|
if intervals is not None:
|
||||||
|
hist_block["n_intervals"] = len(intervals)
|
||||||
|
hist_block["interval_size"] = idf_report.get("interval_size")
|
||||||
|
hist_block["interval_size_s"] = _parse_interval_size_s(idf_report.get("interval_size"))
|
||||||
|
# interval_times derived from start+step (the BW report uses the
|
||||||
|
# exact strings; we match its representation).
|
||||||
|
times = _interval_times(idf_report, hist_block["n_intervals"])
|
||||||
|
# Per-channel peak when (absolute date+time at which the channel's
|
||||||
|
# peak occurred over the histogram run). Thor splits this into
|
||||||
|
# ``TranPeakDate`` / ``TranPeakTime`` etc.
|
||||||
|
peak_when: Dict[str, str] = {}
|
||||||
|
for ch_label, ch_lc in (("Tran", "tran"), ("Vert", "vert"), ("Long", "long"), ("MicL", "mic")):
|
||||||
|
d = idf_report.get(f"{ch_lc}_peak_date")
|
||||||
|
t = idf_report.get(f"{ch_lc}_peak_time")
|
||||||
|
if d and t:
|
||||||
|
try:
|
||||||
|
peak_when[ch_label] = datetime.datetime.strptime(
|
||||||
|
f"{d} {t}", "%Y-%m-%d %H:%M:%S"
|
||||||
|
).isoformat()
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
if peak_when:
|
||||||
|
hist_block["channel_peak_when"] = peak_when
|
||||||
|
|
||||||
|
# ── Mic block ────────────────────────────────────────────────────────
|
||||||
|
mic_block = {
|
||||||
|
"weighting": "L", # Thor mic is ISEE Linear
|
||||||
|
"pspl_dbl": idf_report.get("mic_ppv"), # the dB(L) float
|
||||||
|
"pspl_saturated": False,
|
||||||
|
"zc_freq_hz": idf_report.get("mic_zc_freq"),
|
||||||
|
"zc_freq_above_range": isinstance(idf_report.get("mic_zc_freq"), str)
|
||||||
|
and ">" in str(idf_report.get("mic_zc_freq")),
|
||||||
|
"time_of_peak_s": idf_report.get("mic_time_of_peak"),
|
||||||
|
}
|
||||||
|
if mic_block["zc_freq_above_range"]:
|
||||||
|
mic_block["zc_freq_hz"] = None
|
||||||
|
|
||||||
|
# ── Peaks ────────────────────────────────────────────────────────────
|
||||||
|
vs_block = {
|
||||||
|
"ips": idf_report.get("peak_vector_sum"),
|
||||||
|
"time_s": _parse_first_number(idf_report.get("peak_vector_sum_time_sum")),
|
||||||
|
"when": None,
|
||||||
|
"saturated": False,
|
||||||
|
}
|
||||||
|
if is_histogram:
|
||||||
|
# PVS absolute date+time, when present.
|
||||||
|
vs_d = idf_report.get("peak_vector_sum_date")
|
||||||
|
vs_t = idf_report.get("peak_vector_sum_time")
|
||||||
|
if vs_d and vs_t:
|
||||||
|
try:
|
||||||
|
vs_block["when"] = datetime.datetime.strptime(
|
||||||
|
f"{vs_d} {vs_t}", "%Y-%m-%d %H:%M:%S"
|
||||||
|
).isoformat()
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
return {
|
||||||
|
"available": True,
|
||||||
|
"event_type": idf_report.get("event_type"),
|
||||||
|
"version": idf_report.get("version"),
|
||||||
|
"trigger": {
|
||||||
|
"channel": trigger_channel,
|
||||||
|
"geo_level_ips": trigger_level,
|
||||||
|
},
|
||||||
|
"recording": {
|
||||||
|
"sample_rate_sps": idf_report.get("sample_rate"),
|
||||||
|
"record_time_s": idf_report.get("record_time_sec"),
|
||||||
|
"pretrig_s": idf_report.get("pre_trigger_sec"),
|
||||||
|
"stop_mode": idf_report.get("record_stop_mode"),
|
||||||
|
"geo_range_ips": geo_range_ips,
|
||||||
|
"units": idf_report.get("units"),
|
||||||
|
},
|
||||||
|
"device": {
|
||||||
|
"battery_volts": idf_report.get("battery_volts"),
|
||||||
|
"calibration_date": cal_iso,
|
||||||
|
"calibration_by": cal_by,
|
||||||
|
},
|
||||||
|
"peaks": {
|
||||||
|
"tran": _channel_peaks(idf_report, "tran"),
|
||||||
|
"vert": _channel_peaks(idf_report, "vert"),
|
||||||
|
"long": _channel_peaks(idf_report, "long"),
|
||||||
|
"vector_sum": vs_block,
|
||||||
|
},
|
||||||
|
"mic": mic_block,
|
||||||
|
"sensor_check": {
|
||||||
|
"tran": _sensor_check(idf_report, "tran"),
|
||||||
|
"vert": _sensor_check(idf_report, "vert"),
|
||||||
|
"long": _sensor_check(idf_report, "long"),
|
||||||
|
"mic": _sensor_check(idf_report, "mic"),
|
||||||
|
},
|
||||||
|
"histogram": hist_block,
|
||||||
|
"monitor_log": [],
|
||||||
|
"pc_sw_version": None,
|
||||||
|
}
|
||||||
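The mic block above treats a zero-crossing frequency reported as a string containing `>` as "above range" and nulls the numeric field. A minimal sketch of that normalisation as a standalone helper follows; the name `normalise_zc_freq` and the example string `">100"` are assumptions for illustration and do not appear in the diff, and the helper coerces numeric input to `float`, which the adapter itself does not do.

```python
from typing import Any, Optional, Tuple


def normalise_zc_freq(value: Any) -> Tuple[Optional[float], bool]:
    """Return (freq_hz, above_range) for a raw report value.

    Hypothetical helper: a string containing ">" is taken to mean the
    frequency exceeded the measurable range, so no numeric value is kept.
    """
    if isinstance(value, str) and ">" in value:
        return None, True
    # bool is a subclass of int; exclude it so True/False never become 1.0/0.0.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value), False
    # Missing or unrecognised values: no frequency, not flagged as above range.
    return None, False
```

Returning the flag alongside the value keeps the "above range" information without leaving a non-numeric string in a field that downstream code expects to be a number or `None`.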
@@ -0,0 +1,398 @@
|
|||||||
|
"""
|
||||||
|
Micromate (Series IV / Thor) native data models.
|
||||||
|
|
||||||
|
These are the right-shaped dataclasses for Thor data — Thor measures
|
||||||
|
the microphone in dB(L) directly, so this model carries
|
||||||
|
``mic_pspl_dbl`` rather than the pseudo-``psi`` shoehorn that
|
||||||
|
``minimateplus.PeakValues`` uses for Series III BW data.
|
||||||
|
|
||||||
|
The ingest pipeline today goes:
|
||||||
|
|
||||||
|
.IDFW.txt → parse_idf_report() → dict
|
||||||
|
dict → IdfEvent.from_report() → IdfEvent (typed)
|
||||||
|
IdfEvent → IdfEvent.to_minimateplus_event() → shape DB / sidecar
|
||||||
|
machinery expects
|
||||||
|
|
||||||
|
The ``to_minimateplus_event()`` bridge is a temporary boundary — when we
|
||||||
|
crack the binary IDF codec and have richer per-event data to store, the
|
||||||
|
DB schema will grow Series-IV-specific columns and the bridge will
|
||||||
|
shrink or disappear.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import datetime
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from typing import Any, Dict, Optional, Tuple
|
||||||
|
|
||||||
|
|
||||||
|
# ── IdfReport ─────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfReport:
|
||||||
|
"""Typed wrapper around the dict returned by ``parse_idf_report``.
|
||||||
|
|
||||||
|
All fields optional — Thor's exporter is permissive and some IDF .txt
|
||||||
|
files (especially histograms) omit fields that waveform sidecars
|
||||||
|
include. Use ``.raw`` for any field this dataclass hasn't surfaced
|
||||||
|
yet (the parser keeps every recognised key in the raw dict).
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Identity / kind
|
||||||
|
serial_number: Optional[str] = None
|
||||||
|
event_type: Optional[str] = None # "Full Waveform" | "Full Histogram"
|
||||||
|
event_datetime: Optional[datetime.datetime] = None
|
||||||
|
filename: Optional[str] = None # echoed by Thor's exporter
|
||||||
|
|
||||||
|
# Sampling / timing
|
||||||
|
sample_rate: Optional[int] = None # samples/sec
|
||||||
|
record_time_sec: Optional[float] = None
|
||||||
|
pre_trigger_sec: Optional[float] = None
|
||||||
|
|
||||||
|
# Geophone peaks (in/s)
|
||||||
|
tran_ppv: Optional[float] = None
|
||||||
|
vert_ppv: Optional[float] = None
|
||||||
|
long_ppv: Optional[float] = None
|
||||||
|
peak_vector_sum: Optional[float] = None
|
||||||
|
|
||||||
|
# Microphone — Thor's native unit is dB(L), NOT psi.
|
||||||
|
mic_pspl_dbl: Optional[float] = None
|
||||||
|
|
||||||
|
# Zero-crossing frequencies (Hz)
|
||||||
|
tran_zc_freq: Optional[float] = None
|
||||||
|
vert_zc_freq: Optional[float] = None
|
||||||
|
long_zc_freq: Optional[float] = None
|
||||||
|
mic_zc_freq: Optional[float] = None
|
||||||
|
|
||||||
|
# Per-channel time of peak (sec, since event start)
|
||||||
|
tran_time_of_peak: Optional[float] = None
|
||||||
|
vert_time_of_peak: Optional[float] = None
|
||||||
|
long_time_of_peak: Optional[float] = None
|
||||||
|
mic_time_of_peak: Optional[float] = None
|
||||||
|
|
||||||
|
# Derived per-channel motion
|
||||||
|
tran_peak_acceleration: Optional[float] = None # g
|
||||||
|
vert_peak_acceleration: Optional[float] = None
|
||||||
|
long_peak_acceleration: Optional[float] = None
|
||||||
|
tran_peak_displacement: Optional[float] = None # in
|
||||||
|
vert_peak_displacement: Optional[float] = None
|
||||||
|
long_peak_displacement: Optional[float] = None
|
||||||
|
|
||||||
|
# Operator-supplied strings (Thor's TitleString1..4 → semantic slots)
|
||||||
|
project: Optional[str] = None # TitleString1
|
||||||
|
client: Optional[str] = None # TitleString2
|
||||||
|
operator: Optional[str] = None # TitleString3
|
||||||
|
notes: Optional[str] = None # TitleString4 / PostEventNote
|
||||||
|
setup: Optional[str] = None # setup file name
|
||||||
|
|
||||||
|
# Sensor self-check results
|
||||||
|
tran_test_passed: Optional[bool] = None
|
||||||
|
vert_test_passed: Optional[bool] = None
|
||||||
|
long_test_passed: Optional[bool] = None
|
||||||
|
mic_test_passed: Optional[bool] = None
|
||||||
|
|
||||||
|
# Device-fixed metadata
|
||||||
|
firmware_version: Optional[str] = None
|
||||||
|
calibration_text: Optional[str] = None
|
||||||
|
battery_volts: Optional[float] = None
|
||||||
|
|
||||||
|
# Original parser dict — preserves every recognised key (including
|
||||||
|
# raw unit-suffixed strings) for forward-compatible field access.
|
||||||
|
raw: Dict[str, Any] = field(default_factory=dict, repr=False)
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def from_dict(cls, d: Dict[str, Any]) -> "IdfReport":
|
||||||
|
"""Build an IdfReport from the dict returned by ``parse_idf_report``."""
|
||||||
|
ed = d.get("event_datetime")
|
||||||
|
if isinstance(ed, str):
|
||||||
|
try:
|
||||||
|
ed = datetime.datetime.fromisoformat(ed)
|
||||||
|
except ValueError:
|
||||||
|
ed = None
|
||||||
|
|
||||||
|
return cls(
|
||||||
|
serial_number = d.get("serial_number"),
|
||||||
|
event_type = d.get("event_type"),
|
||||||
|
event_datetime = ed if isinstance(ed, datetime.datetime) else None,
|
||||||
|
filename = d.get("filename"),
|
||||||
|
sample_rate = d.get("sample_rate"),
|
||||||
|
record_time_sec = d.get("record_time_sec"),
|
||||||
|
pre_trigger_sec = d.get("pre_trigger_sec"),
|
||||||
|
tran_ppv = d.get("tran_ppv"),
|
||||||
|
vert_ppv = d.get("vert_ppv"),
|
||||||
|
long_ppv = d.get("long_ppv"),
|
||||||
|
peak_vector_sum = d.get("peak_vector_sum"),
|
||||||
|
mic_pspl_dbl = d.get("mic_ppv"), # parser names it mic_ppv (legacy)
|
||||||
|
tran_zc_freq = d.get("tran_zc_freq"),
|
||||||
|
vert_zc_freq = d.get("vert_zc_freq"),
|
||||||
|
long_zc_freq = d.get("long_zc_freq"),
|
||||||
|
mic_zc_freq = d.get("mic_zc_freq"),
|
||||||
|
tran_time_of_peak = d.get("tran_time_of_peak"),
|
||||||
|
vert_time_of_peak = d.get("vert_time_of_peak"),
|
||||||
|
long_time_of_peak = d.get("long_time_of_peak"),
|
||||||
|
mic_time_of_peak = d.get("mic_time_of_peak"),
|
||||||
|
tran_peak_acceleration = d.get("tran_peak_acceleration"),
|
||||||
|
vert_peak_acceleration = d.get("vert_peak_acceleration"),
|
||||||
|
long_peak_acceleration = d.get("long_peak_acceleration"),
|
||||||
|
tran_peak_displacement = d.get("tran_peak_displacement"),
|
||||||
|
vert_peak_displacement = d.get("vert_peak_displacement"),
|
||||||
|
long_peak_displacement = d.get("long_peak_displacement"),
|
||||||
|
project = d.get("project"),
|
||||||
|
client = d.get("client"),
|
||||||
|
operator = d.get("operator"),
|
||||||
|
notes = d.get("notes"),
|
||||||
|
setup = d.get("setup"),
|
||||||
|
tran_test_passed = d.get("tran_test_passed"),
|
||||||
|
vert_test_passed = d.get("vert_test_passed"),
|
||||||
|
long_test_passed = d.get("long_test_passed"),
|
||||||
|
mic_test_passed = d.get("mic_test_passed"),
|
||||||
|
firmware_version = d.get("version"),
|
||||||
|
calibration_text = d.get("calibration_text"),
|
||||||
|
battery_volts = d.get("battery_volts"),
|
||||||
|
raw = d,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ── IdfPeaks / IdfProjectInfo / IdfSensorCheck (narrow grouping types) ───────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfPeaks:
|
||||||
|
"""Geophone + mic peak values for one Thor event. Native Thor units.
|
||||||
|
|
||||||
|
Thor stores the mic peak in two parallel forms — ``mic_pspl_dbl`` is
|
||||||
|
what the sidecar's top-level ``MicPSPL`` header field carries (dB(L)),
|
||||||
|
used in the report header. ``mic_pspl_psi`` is the psi value derived
|
||||||
|
either from the IDFW sample table / IDFH interval column 9, or from
|
||||||
|
the binary mic counts (~2.14e-6 psi/count). It is needed because the
|
||||||
|
BW-shaped ``PeakValues.micl`` consumed by ``event_hdf5.write_event_hdf5``
|
||||||
|
expects psi — feeding it dB(L) makes the h5 mic-chart scale factor
|
||||||
|
blow up.
|
||||||
|
"""
|
||||||
|
transverse_ips: Optional[float] = None # in/s
|
||||||
|
vertical_ips: Optional[float] = None # in/s
|
||||||
|
longitudinal_ips: Optional[float] = None # in/s
|
||||||
|
peak_vector_sum_ips: Optional[float] = None # in/s
|
||||||
|
mic_pspl_dbl: Optional[float] = None # dB(L)
|
||||||
|
mic_pspl_psi: Optional[float] = None # psi
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfProjectInfo:
|
||||||
|
"""Operator-supplied strings from Thor's TitleString1..4."""
|
||||||
|
project: Optional[str] = None
|
||||||
|
client: Optional[str] = None
|
||||||
|
operator: Optional[str] = None
|
||||||
|
notes: Optional[str] = None
|
||||||
|
setup: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfSensorCheck:
|
||||||
|
"""Per-channel pass/fail from Thor's self-test."""
|
||||||
|
tran: Optional[bool] = None
|
||||||
|
vert: Optional[bool] = None
|
||||||
|
long: Optional[bool] = None
|
||||||
|
mic: Optional[bool] = None
|
||||||
|
|
||||||
|
|
||||||
|
# ── IdfEvent ─────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IdfEvent:
|
||||||
|
"""A single Thor / Micromate Series IV event.
|
||||||
|
|
||||||
|
Built from a parsed .IDFW.txt or .IDFH.txt sidecar via
|
||||||
|
``IdfEvent.from_report()``. The filename is the authoritative
|
||||||
|
source for serial + timestamp + kind; the .txt provides
|
||||||
|
device-authoritative peak values, frequencies, project strings,
|
||||||
|
sensor self-check, firmware, calibration.
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Identity
|
||||||
|
serial: str
|
||||||
|
timestamp: datetime.datetime
|
||||||
|
kind: str # "Waveform" | "Histogram"
|
||||||
|
filename: str # device-native binary filename, e.g. "UM11719_20231219163444.IDFW"
|
||||||
|
|
||||||
|
# Sampling / timing
|
||||||
|
sample_rate: Optional[int] = None
|
||||||
|
record_time_sec: Optional[float] = None
|
||||||
|
pre_trigger_sec: Optional[float] = None
|
||||||
|
|
||||||
|
# Peaks
|
||||||
|
peaks: IdfPeaks = field(default_factory=IdfPeaks)
|
||||||
|
|
||||||
|
# Per-channel frequencies (Hz)
|
||||||
|
tran_zc_freq: Optional[float] = None
|
||||||
|
vert_zc_freq: Optional[float] = None
|
||||||
|
long_zc_freq: Optional[float] = None
|
||||||
|
mic_zc_freq: Optional[float] = None
|
||||||
|
|
||||||
|
# Project strings
|
||||||
|
project_info: IdfProjectInfo = field(default_factory=IdfProjectInfo)
|
||||||
|
|
||||||
|
# Sensor self-check
|
||||||
|
sensor_check: IdfSensorCheck = field(default_factory=IdfSensorCheck)
|
||||||
|
|
||||||
|
# Device-fixed
|
||||||
|
firmware_version: Optional[str] = None
|
||||||
|
calibration_text: Optional[str] = None
|
||||||
|
battery_volts: Optional[float] = None
|
||||||
|
|
||||||
|
# The full parsed report — preserves anything not surfaced as a typed field
|
||||||
|
report: IdfReport = field(default_factory=IdfReport)
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def from_report(
|
||||||
|
cls,
|
||||||
|
report: Any,
|
||||||
|
filename: str,
|
||||||
|
) -> "IdfEvent":
|
||||||
|
"""Build an IdfEvent from a parsed report (dict or IdfReport) and
|
||||||
|
the device-native binary filename.
|
||||||
|
|
||||||
|
The filename is authoritative for serial + timestamp + kind:
|
||||||
|
Thor's filenames are literal ``<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>``
|
||||||
|
and the device's own clock is the canonical event timestamp.
|
||||||
|
If the report carries an ``event_datetime`` that differs from
|
||||||
|
what's in the filename, the report wins (it has finer-grained
|
||||||
|
device-reported time-of-trigger semantics).
|
||||||
|
"""
|
||||||
|
from .idf_ascii_report import parse_event_filename
|
||||||
|
|
||||||
|
# Normalise input to IdfReport
|
||||||
|
if isinstance(report, IdfReport):
|
||||||
|
rep = report
|
||||||
|
elif isinstance(report, dict):
|
||||||
|
rep = IdfReport.from_dict(report)
|
||||||
|
else:
|
||||||
|
raise TypeError(
|
||||||
|
f"report must be IdfReport or dict; got {type(report).__name__}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Filename → (serial, timestamp, kind). Required — fall back to
|
||||||
|
# report-supplied values only if filename parsing fails.
|
||||||
|
parsed = parse_event_filename(filename)
|
||||||
|
if parsed is not None:
|
||||||
|
fn_serial, fn_ts, fn_kind = parsed
|
||||||
|
kind = "Histogram" if fn_kind == "IDFH" else "Waveform"
|
||||||
|
else:
|
||||||
|
fn_serial = rep.serial_number or "UNKNOWN"
|
||||||
|
fn_ts = rep.event_datetime or datetime.datetime(1970, 1, 1)
|
||||||
|
kind = "Waveform" if (rep.event_type or "").lower().startswith("full waveform") else "Histogram"
|
||||||
|
|
||||||
|
# Prefer the report's event_datetime and serial_number (device-authoritative) over the filename.
|
||||||
|
ts = rep.event_datetime or fn_ts
|
||||||
|
serial = rep.serial_number or fn_serial
|
||||||
|
|
||||||
|
return cls(
|
||||||
|
serial=serial,
|
||||||
|
timestamp=ts,
|
||||||
|
kind=kind,
|
||||||
|
filename=filename,
|
||||||
|
sample_rate=rep.sample_rate,
|
||||||
|
record_time_sec=rep.record_time_sec,
|
||||||
|
pre_trigger_sec=rep.pre_trigger_sec,
|
||||||
|
peaks=IdfPeaks(
|
||||||
|
transverse_ips = rep.tran_ppv,
|
||||||
|
vertical_ips = rep.vert_ppv,
|
||||||
|
longitudinal_ips = rep.long_ppv,
|
||||||
|
peak_vector_sum_ips = rep.peak_vector_sum,
|
||||||
|
mic_pspl_dbl = rep.mic_pspl_dbl,
|
||||||
|
),
|
||||||
|
tran_zc_freq=rep.tran_zc_freq,
|
||||||
|
vert_zc_freq=rep.vert_zc_freq,
|
||||||
|
long_zc_freq=rep.long_zc_freq,
|
||||||
|
mic_zc_freq=rep.mic_zc_freq,
|
||||||
|
project_info=IdfProjectInfo(
|
||||||
|
project=rep.project,
|
||||||
|
client=rep.client,
|
||||||
|
operator=rep.operator,
|
||||||
|
notes=rep.notes,
|
||||||
|
setup=rep.setup,
|
||||||
|
),
|
||||||
|
sensor_check=IdfSensorCheck(
|
||||||
|
tran=rep.tran_test_passed,
|
||||||
|
vert=rep.vert_test_passed,
|
||||||
|
long=rep.long_test_passed,
|
||||||
|
mic=rep.mic_test_passed,
|
||||||
|
),
|
||||||
|
firmware_version=rep.firmware_version,
|
||||||
|
calibration_text=rep.calibration_text,
|
||||||
|
battery_volts=rep.battery_volts,
|
||||||
|
report=rep,
|
||||||
|
)
|
||||||
|
|
||||||
|
# ── Bridge to minimateplus shape (for the existing DB / sidecar paths) ──
|
||||||
|
|
||||||
|
def to_minimateplus_event(self, waveform_key: bytes) -> Any:
|
||||||
|
"""Project this Thor event into the shape ``minimateplus.Event``
|
||||||
|
carries, so it can flow through the existing
|
||||||
|
``SeismoDb.insert_events()`` and ``event_to_sidecar_dict()``
|
||||||
|
machinery without those code paths needing to know about Thor.
|
||||||
|
|
||||||
|
Caveats of the bridge:
|
||||||
|
- ``PeakValues.micl`` carries the mic peak in **psi** (matching
|
||||||
|
BW's convention) — set from :attr:`IdfPeaks.mic_pspl_psi`,
|
||||||
|
with a dB(L)→psi fallback when only the dB(L) value is
|
||||||
|
available. This is what the h5 writer's mic-scale-factor
|
||||||
|
logic needs. The dB(L) value still flows through
|
||||||
|
``bw_report.mic.pspl_dbl`` (set by the
|
||||||
|
``idf_to_bw_report`` adapter) and the renderer reads it
|
||||||
|
from there for the report header.
|
||||||
|
- Many Thor-specific fields (Peak Acceleration / Displacement,
|
||||||
|
sensor self-check, calibration) don't have a slot in
|
||||||
|
``Event``. The full IdfReport is preserved on the
|
||||||
|
``.sfm.json`` sidecar under ``extensions.idf_report`` via
|
||||||
|
``save_imported_idf`` — that's the source of truth for them.
|
||||||
|
"""
|
||||||
|
from minimateplus.models import (
|
||||||
|
Event, PeakValues, ProjectInfo, Timestamp,
|
||||||
|
)
|
||||||
|
|
||||||
|
ts_obj = Timestamp(
|
||||||
|
raw=bytes(9),
|
||||||
|
flag=0,
|
||||||
|
year=self.timestamp.year,
|
||||||
|
unknown_byte=0,
|
||||||
|
month=self.timestamp.month,
|
||||||
|
day=self.timestamp.day,
|
||||||
|
hour=self.timestamp.hour,
|
||||||
|
minute=self.timestamp.minute,
|
||||||
|
second=self.timestamp.second,
|
||||||
|
)
|
||||||
|
# Resolve mic peak as psi. Priority: binary-derived mic_pspl_psi
|
||||||
|
# (set by read_idf_file) > dB(L)→psi fallback via standard formula
|
||||||
|
# (psi = 2.9e-9 × 10^(dBL/20)) > None.
|
||||||
|
mic_psi = self.peaks.mic_pspl_psi
|
||||||
|
if mic_psi is None and self.peaks.mic_pspl_dbl is not None:
|
||||||
|
mic_psi = 2.9e-9 * (10.0 ** (self.peaks.mic_pspl_dbl / 20.0))
|
||||||
|
pv = PeakValues(
|
||||||
|
tran=self.peaks.transverse_ips,
|
||||||
|
vert=self.peaks.vertical_ips,
|
||||||
|
long=self.peaks.longitudinal_ips,
|
||||||
|
micl=mic_psi, # psi, matching BW's convention (h5 scaling depends on this)
|
||||||
|
peak_vector_sum=self.peaks.peak_vector_sum_ips,
|
||||||
|
)
|
||||||
|
pi = ProjectInfo(
|
||||||
|
setup_name=self.project_info.setup,
|
||||||
|
project=self.project_info.project,
|
||||||
|
client=self.project_info.client,
|
||||||
|
operator=self.project_info.operator,
|
||||||
|
sensor_location=None, # Thor folds location into project string
|
||||||
|
notes=self.project_info.notes,
|
||||||
|
)
|
||||||
|
ev = Event(
|
||||||
|
index=0,
|
||||||
|
timestamp=ts_obj,
|
||||||
|
sample_rate=self.sample_rate,
|
||||||
|
peak_values=pv,
|
||||||
|
project_info=pi,
|
||||||
|
record_type=self.kind,
|
||||||
|
rectime_seconds=self.record_time_sec,
|
||||||
|
)
|
||||||
|
ev._waveform_key = waveform_key
|
||||||
|
return ev
|
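The `from_report` docstring describes Thor filenames as `<SERIAL>_<YYYYMMDDHHMMSS>.<KIND>`, for example `UM11719_20231219163444.IDFW`. A minimal sketch of parsing that layout is shown here; `parse_thor_filename` and its regular expression are hypothetical and are not the repository's `parse_event_filename`, whose exact behaviour is not visible in this diff. The sketch assumes an alphanumeric serial and only the `IDFW` / `IDFH` kinds.

```python
import datetime
import re
from typing import Optional, Tuple

# Assumed layout: <SERIAL>_<YYYYMMDDHHMMSS>.<KIND>, with KIND being IDFW or IDFH.
_FILENAME_RE = re.compile(r"^([A-Za-z0-9]+)_(\d{14})\.(IDFW|IDFH)$")


def parse_thor_filename(
    name: str,
) -> Optional[Tuple[str, datetime.datetime, str]]:
    """Return (serial, timestamp, kind), or None if the name does not match."""
    match = _FILENAME_RE.match(name)
    if match is None:
        return None
    serial, stamp, kind = match.groups()
    try:
        ts = datetime.datetime.strptime(stamp, "%Y%m%d%H%M%S")
    except ValueError:
        # Fourteen digits that are not a real date/time (e.g. month 13).
        return None
    return serial, ts, kind
```

Returning `None` rather than raising mirrors the way `from_report` falls back to report-supplied values when the filename cannot be parsed.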
||||||
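The bridge's dB(L)→psi fallback uses psi = 2.9e-9 × 10^(dBL/20), where 2.9e-9 psi is approximately the 20 µPa reference pressure. A small sketch of that conversion and its inverse follows; the names `dbl_to_psi`, `psi_to_dbl` and `P_REF_PSI` are assumptions for illustration, and the inverse function is not part of the diff.

```python
import math

# Reference pressure of 20 µPa expressed in psi (rounded, as in the bridge code).
P_REF_PSI = 2.9e-9


def dbl_to_psi(db_l: float) -> float:
    """Convert a linear-weighted sound pressure level in dB(L) to psi."""
    return P_REF_PSI * 10.0 ** (db_l / 20.0)


def psi_to_dbl(psi: float) -> float:
    """Inverse conversion (hypothetical helper); psi must be positive."""
    if psi <= 0.0:
        raise ValueError("pressure must be positive to express in dB")
    return 20.0 * math.log10(psi / P_REF_PSI)
```

For example, `dbl_to_psi(120.0)` is about 2.9e-3 psi, which is the order of magnitude a psi-based scale factor would expect, whereas passing the raw value 120 as if it were psi would be off by several orders of magnitude.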
@@ -21,7 +21,15 @@ Typical usage (TCP / modem):
|
|||||||
|
|
||||||
from .client import MiniMateClient
|
from .client import MiniMateClient
|
||||||
from .models import DeviceInfo, Event, MonitorLogEntry
|
from .models import DeviceInfo, Event, MonitorLogEntry
|
||||||
from .transport import SerialTransport, TcpTransport
|
from .transport import CapturingTransport, SerialTransport, TcpTransport
|
||||||
|
|
||||||
__version__ = "0.1.0"
|
__version__ = "0.1.0"
|
||||||
__all__ = ["MiniMateClient", "DeviceInfo", "Event", "MonitorLogEntry", "SerialTransport", "TcpTransport"]
|
__all__ = [
|
||||||
|
"MiniMateClient",
|
||||||
|
"DeviceInfo",
|
||||||
|
"Event",
|
||||||
|
"MonitorLogEntry",
|
||||||
|
"SerialTransport",
|
||||||
|
"TcpTransport",
|
||||||
|
"CapturingTransport",
|
||||||
|
]
|
||||||
|
|||||||
+156
-32
@@ -552,6 +552,105 @@ def classify_frame(frame: S3Frame) -> str:
|
|||||||
|
|
||||||
# ── Waveform file writer ───────────────────────────────────────────────────────────
|
# ── Waveform file writer ───────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def extract_body_bytes(a5_frames):
|
||||||
|
"""Reconstruct the Blastware-file body bytes from a list of A5 frames.
|
||||||
|
|
||||||
|
Returns ``(strt, body, footer)`` where:
|
||||||
|
|
||||||
|
- ``strt`` is the 21-byte STRT record from the probe frame (or a fallback
|
||||||
|
record built from minimal event metadata if STRT is missing).
|
||||||
|
- ``body`` is the variable-length sample-data section (between STRT and
|
||||||
|
the 26-byte file footer). Empty if no frames decode.
|
||||||
|
- ``footer`` is the 26-byte file footer.
|
||||||
|
|
||||||
|
This is the same body-construction algorithm used by :func:`write_blastware_file`
|
||||||
|
— refactored out so the body decoder (``waveform_codec.decode_waveform_v2``)
|
||||||
|
can consume the same bytes without re-implementing the frame-walking logic.
|
||||||
|
|
||||||
|
Returns ``(b"", b"", b"")`` if *a5_frames* is empty.
|
||||||
|
"""
|
||||||
|
if not a5_frames:
|
||||||
|
return (b"", b"", b"")
|
||||||
|
|
||||||
|
# ── Extract STRT record from probe frame ─────────────────────────────────
|
||||||
|
w0_raw = bytes(a5_frames[0].data[7:])
|
||||||
|
w0_stripped = _strip_inner_frame_dles(w0_raw)
|
||||||
|
strt_pos_stripped = w0_stripped.find(b"STRT")
|
||||||
|
|
||||||
|
if strt_pos_stripped >= 0:
|
||||||
|
strt = bytes(w0_stripped[strt_pos_stripped : strt_pos_stripped + 21])
|
||||||
|
|
||||||
|
# Walk raw bytes to find the raw-domain end of the STRT (= body start).
|
||||||
|
target_stripped = strt_pos_stripped + 21
|
||||||
|
stripped_so_far = 0
|
||||||
|
raw_i = 0
|
||||||
|
while stripped_so_far < target_stripped and raw_i < len(w0_raw):
|
||||||
|
if (w0_raw[raw_i] == 0x10
|
||||||
|
and raw_i + 1 < len(w0_raw)
|
||||||
|
and w0_raw[raw_i + 1] in {0x02, 0x03, 0x04}):
|
||||||
|
raw_i += 2
|
||||||
|
else:
|
||||||
|
raw_i += 1
|
||||||
|
stripped_so_far += 1
|
||||||
|
probe_skip = 7 + raw_i
|
||||||
|
else:
|
||||||
|
strt = b"STRT" + b"\xff\xfe" + bytes(14) + b"\x00"
|
||||||
|
probe_skip = 7 + 21
|
||||||
|
|
||||||
|
if len(strt) != 21:
|
||||||
|
return (b"", b"", b"")
|
||||||
|
|
||||||
|
# Separate terminator from data frames.
|
||||||
|
term_idx: Optional[int] = None
|
||||||
|
if a5_frames and a5_frames[-1].page_key != 0x0010:
|
||||||
|
term_idx = len(a5_frames) - 1
|
||||||
|
|
||||||
|
if term_idx is not None:
|
||||||
|
body_frames = a5_frames[:term_idx]
|
||||||
|
term_frame = a5_frames[term_idx]
|
||||||
|
else:
|
||||||
|
body_frames = a5_frames
|
||||||
|
term_frame = None
|
||||||
|
|
||||||
|
all_bytes = bytearray()
|
||||||
|
for fi, frame in enumerate(body_frames):
|
||||||
|
if fi == 0:
|
||||||
|
skip = probe_skip
|
||||||
|
elif fi in (1, 2):
|
||||||
|
skip = 13 # metadata pages
|
||||||
|
else:
|
||||||
|
skip = 12 # sample chunks
|
||||||
|
all_bytes.extend(_frame_body_bytes(frame, skip))
|
||||||
|
|
||||||
|
if term_frame is not None:
|
||||||
|
all_bytes.extend(_frame_body_bytes(term_frame, 11))
|
||||||
|
|
||||||
|
# Find the first valid `0e 08` footer marker.
|
||||||
|
footer_pos = -1
|
||||||
|
pos = 0
|
||||||
|
while True:
|
||||||
|
pos = all_bytes.find(b"\x0e\x08", pos)
|
||||||
|
if pos < 0 or pos + 26 > len(all_bytes):
|
||||||
|
break
|
||||||
|
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
|
||||||
|
if 2015 <= yr <= 2050:
|
||||||
|
footer_pos = pos
|
||||||
|
break
|
||||||
|
pos += 1
|
||||||
|
|
||||||
|
if footer_pos >= 0:
|
||||||
|
body = bytes(all_bytes[:footer_pos])
|
||||||
|
footer = bytes(all_bytes[footer_pos : footer_pos + 26])
|
||||||
|
elif len(all_bytes) >= 26:
|
||||||
|
body = bytes(all_bytes[:-26])
|
||||||
|
footer = bytes(all_bytes[-26:])
|
||||||
|
else:
|
||||||
|
body = bytes(all_bytes)
|
||||||
|
footer = b""
|
||||||
|
|
||||||
|
return (strt, body, footer)
|
||||||
|
|
||||||
|
|
||||||
def write_blastware_file(
|
def write_blastware_file(
|
||||||
event: Event,
|
event: Event,
|
||||||
a5_frames: list[S3Frame],
|
a5_frames: list[S3Frame],
|
||||||
@@ -639,7 +738,7 @@ def write_blastware_file(
|
|||||||
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
|
strt = b"STRT" + b"\xff\xfe" + key4 + bytes(14) + bytes([rectime & 0xFF])
|
||||||
probe_skip = 7 + 21
|
probe_skip = 7 + 21
|
||||||
|
|
||||||
log.warning(
|
log.debug(
|
||||||
"write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
|
"write_blastware_file: strt_pos_stripped=%d probe_skip=%d "
|
||||||
"probe_data_len=%d strt_hex=%s",
|
"probe_data_len=%d strt_hex=%s",
|
||||||
strt_pos_stripped if strt_pos_stripped >= 0 else -1,
|
strt_pos_stripped if strt_pos_stripped >= 0 else -1,
|
||||||
@@ -672,11 +771,10 @@ def write_blastware_file(
|
|||||||
# Do NOT use a5_frames[-1] — if _a5_frames contains stray frames from a
|
# Do NOT use a5_frames[-1] — if _a5_frames contains stray frames from a
|
||||||
# subsequent event (a known get_events side-effect), the last frame will
|
# subsequent event (a known get_events side-effect), the last frame will
|
||||||
# not be the terminator and the footer will be mis-identified.
|
# not be the terminator and the footer will be mis-identified.
|
||||||
|
# TERM detection (v0.14.0): last frame if page_key != 0x0010 (sample marker)
|
||||||
term_idx: Optional[int] = None
|
term_idx: Optional[int] = None
|
||||||
for _i, _f in enumerate(a5_frames):
|
if a5_frames and a5_frames[-1].page_key != 0x0010:
|
||||||
if _f.page_key == 0x0000:
|
term_idx = len(a5_frames) - 1
|
||||||
term_idx = _i
|
|
||||||
break
|
|
||||||
|
|
||||||
if term_idx is not None:
|
if term_idx is not None:
|
||||||
body_frames = a5_frames[:term_idx]
|
body_frames = a5_frames[:term_idx]
|
||||||
@@ -685,38 +783,32 @@ def write_blastware_file(
|
|||||||
body_frames = a5_frames
|
body_frames = a5_frames
|
||||||
term_frame = None
|
term_frame = None
|
||||||
|
|
||||||
log.warning(
|
# Frame contribution loop (v0.14.0 BW-exact walk).
|
||||||
"write_blastware_file: %d body_frames term_idx=%s",
|
# Skip values:
|
||||||
len(body_frames),
|
# probe (fi=0): probe_skip
|
||||||
str(term_idx) if term_idx is not None else "None",
|
# meta@0x1002 (fi=1): 13 (6-byte inner header)
|
||||||
|
# meta@0x1004 (fi=2): 13 (6-byte inner header)
|
||||||
|
# sample chunks (fi=3+): 12 (5-byte inner header)
|
||||||
|
last_fi = len(body_frames) - 1
|
||||||
|
|
||||||
|
log.debug(
|
||||||
|
"write_blastware_file: %d body_frames last_fi=%d",
|
||||||
|
len(body_frames), last_fi,
|
||||||
)
|
)
|
||||||
|
|
||||||
all_bytes = bytearray()
|
all_bytes = bytearray()
|
||||||
|
|
||||||
for fi, frame in enumerate(body_frames):
|
for fi, frame in enumerate(body_frames):
|
||||||
# All body frames contribute to the waveform body — no frames are skipped.
|
|
||||||
#
|
|
||||||
# Over TCP via cellular modem, _recv_5a_batch() correctly collects all
|
|
||||||
# A5 frames per chunk request (the device's ~1100-byte RS-232 response
|
|
||||||
# is forwarded as ~2 TCP segments of ~550 bytes each, each parsed as a
|
|
||||||
# separate S3 frame). ALL of these frames contain ADC body data and
|
|
||||||
# must be included in the file — confirmed from 4-27-26 TCP capture
|
|
||||||
# analysis: contributions from all 14 frames → 6821 bytes → file 6864 bytes.
|
|
||||||
#
|
|
||||||
# Skip amounts (offsets into frame.data):
|
|
||||||
# fi=0 (probe): probe_skip — skips the type_tag header + STRT record
|
|
||||||
# fi=1: 13 — 7-byte frame.data prefix + 6 inner header bytes
|
|
||||||
# fi>=2: 12 — 7-byte frame.data prefix + 5 inner header bytes
|
|
||||||
if fi == 0:
|
if fi == 0:
|
||||||
skip = probe_skip
|
skip = probe_skip
|
||||||
elif fi == 1:
|
elif fi in (1, 2):
|
||||||
skip = 13
|
skip = 13 # metadata pages
|
||||||
else:
|
else:
|
||||||
skip = 12
|
skip = 12 # sample chunks
|
||||||
|
|
||||||
contribution = _frame_body_bytes(frame, skip)
|
contribution = _frame_body_bytes(frame, skip)
|
||||||
log.warning("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
|
log.debug("write_blastware_file: fi=%d skip=%d raw_data=%d contribution=%d",
|
||||||
fi, skip, len(frame.data), len(contribution))
|
fi, skip, len(frame.data), len(contribution))
|
||||||
all_bytes.extend(contribution)
|
all_bytes.extend(contribution)
|
||||||
|
|
||||||
# Terminator contributes its content, which ends with the 26-byte footer.
|
# Terminator contributes its content, which ends with the 26-byte footer.
|
||||||
@@ -724,7 +816,7 @@ def write_blastware_file(
|
|||||||
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
|
# one shorter than chunk frames' 5-byte inner header. Confirmed 2026-04-21.
|
||||||
if term_frame is not None:
|
if term_frame is not None:
|
||||||
term_contribution = _frame_body_bytes(term_frame, 11)
|
term_contribution = _frame_body_bytes(term_frame, 11)
|
||||||
log.warning(
|
log.debug(
|
||||||
"write_blastware_file: term_frame data_len=%d skip=11 "
|
"write_blastware_file: term_frame data_len=%d skip=11 "
|
||||||
"contribution_len=%d first8=%s",
|
"contribution_len=%d first8=%s",
|
||||||
len(term_frame.data),
|
len(term_frame.data),
|
||||||
@@ -733,17 +825,49 @@ def write_blastware_file(
|
|||||||
)
|
)
|
||||||
all_bytes.extend(term_contribution)
|
all_bytes.extend(term_contribution)
|
||||||
|
|
||||||
log.warning(
|
log.debug(
|
||||||
"write_blastware_file: all_bytes total=%d last28=%s",
|
"write_blastware_file: all_bytes total=%d last28=%s",
|
||||||
len(all_bytes),
|
len(all_bytes),
|
||||||
bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
|
bytes(all_bytes[-28:]).hex() if len(all_bytes) >= 28 else bytes(all_bytes).hex(),
|
||||||
)
|
)
|
||||||
|
|
||||||
if len(all_bytes) >= 26:
|
# NOTE: The "duplicate header+STRT strip" logic from v0.13.x has been
|
||||||
|
# REMOVED in v0.14.2. Under the v0.14.0 BW-exact 5A walk, body assembly
|
||||||
|
# is just contiguous concatenation of frame contributions in stream order
|
||||||
|
# (probe → meta@0x1002 → meta@0x1004 → samples → TERM), exactly as BW
|
||||||
|
# writes its files. The previous strip was matching the `00 12 03 00 STRT`
|
||||||
|
# byte sequence in legitimate waveform data — sample chunks at counter
|
||||||
|
# 0x1000 and beyond often contain those bytes coincidentally — and
|
||||||
|
# zeroing 25 bytes of valid samples per match. Compared to a known-good
|
||||||
|
# BW reference for the same 3-sec event 0, the strip introduced 26 bytes
|
||||||
|
# of zeros that BW did not have, then propagated alignment differences
|
||||||
|
# through the rest of the body. See decode_test/5-1-26/bw vs SFM diff
|
||||||
|
# at file[0x1012..0x102B] (2026-05-04 analysis).
|
||||||
|
|
||||||
|
# Find the first valid 0e 08 footer marker (v0.14.0).
|
||||||
|
footer_pos = -1
|
||||||
|
pos = 0
|
||||||
|
while True:
|
||||||
|
pos = all_bytes.find(b"\x0e\x08", pos)
|
||||||
|
if pos < 0 or pos + 26 > len(all_bytes):
|
||||||
|
break
|
||||||
|
yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
|
||||||
|
if 2015 <= yr <= 2050:
|
||||||
|
footer_pos = pos
|
||||||
|
break
|
||||||
|
pos += 1
|
||||||
|
if footer_pos >= 0:
|
||||||
|
body = bytes(all_bytes[:footer_pos])
|
||||||
|
footer = bytes(all_bytes[footer_pos:footer_pos + 26])
|
||||||
|
log.debug(
|
||||||
|
"write_blastware_file: real 0e 08 footer at all_bytes[%d]; "
|
||||||
|
"truncating %d post-footer bytes",
|
||||||
|
footer_pos, len(all_bytes) - footer_pos - 26,
|
||||||
|
)
|
||||||
|
elif len(all_bytes) >= 26:
|
||||||
body = bytes(all_bytes[:-26])
|
body = bytes(all_bytes[:-26])
|
||||||
footer = bytes(all_bytes[-26:])
|
footer = bytes(all_bytes[-26:])
|
||||||
else:
|
else:
|
||||||
# Fallback: no terminator or very short stream → build footer from event metadata
|
|
||||||
body = bytes(all_bytes)
|
body = bytes(all_bytes)
|
||||||
start_dt = _ts_from_model(event.timestamp)
|
start_dt = _ts_from_model(event.timestamp)
|
||||||
stop_dt: Optional[datetime.datetime] = None
|
stop_dt: Optional[datetime.datetime] = None
|
||||||
@@ -754,7 +878,7 @@ def write_blastware_file(
|
|||||||
+ _encode_ts_be(start_dt)
|
+ _encode_ts_be(start_dt)
|
||||||
+ _encode_ts_be(stop_dt)
|
+ _encode_ts_be(stop_dt)
|
||||||
+ b"\x00\x01\x00\x02\x00\x00"
|
+ b"\x00\x01\x00\x02\x00\x00"
|
||||||
+ b"\x00\x00" # CRC placeholder
|
+ b"\x00\x00"
|
||||||
)
|
)
|
||||||
|
|
||||||
# ── Write file ───────────────────────────────────────────────────────────
|
# ── Write file ───────────────────────────────────────────────────────────
|
||||||
|
|||||||
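The footer search in both `extract_body_bytes` and `write_blastware_file` scans for the first `0e 08` marker whose big-endian 16-bit value at offset +4 is a plausible year (2015 to 2050), and falls back to the last 26 bytes when no marker validates. A self-contained sketch of that logic is given below; `split_body_footer` and the constant names are hypothetical, and the sketch works on an immutable `bytes` value rather than the `bytearray` used in the diff.

```python
from typing import Tuple

FOOTER_LEN = 26
FOOTER_MARKER = b"\x0e\x08"


def split_body_footer(data: bytes) -> Tuple[bytes, bytes]:
    """Split concatenated frame contributions into (body, footer)."""
    pos = 0
    while True:
        pos = data.find(FOOTER_MARKER, pos)
        # Stop if there is no marker, or too few bytes remain for a full footer.
        if pos < 0 or pos + FOOTER_LEN > len(data):
            break
        # The year check rejects markers that occur by chance in sample data.
        year = int.from_bytes(data[pos + 4 : pos + 6], "big")
        if 2015 <= year <= 2050:
            return data[:pos], data[pos : pos + FOOTER_LEN]
        pos += 1
    # Fallback: assume the footer is the trailing 26 bytes, if there are that many.
    if len(data) >= FOOTER_LEN:
        return data[:-FOOTER_LEN], data[-FOOTER_LEN:]
    return data, b""
```

Anything after the validated footer is dropped, which corresponds to the "truncating post-footer bytes" debug message in the diff.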
@@ -0,0 +1,738 @@
|
|||||||
|
"""
minimateplus/bw_ascii_report.py — parser for Blastware's per-event ASCII
report (the .TXT file BW writes alongside each saved event binary).

The ASCII export is the authoritative source for every "rich" per-event
field that BW computes from the waveform but never persists in the BW
binary itself:

  - Per-channel PPV (Tran / Vert / Long / MicL)
  - Peak Vector Sum + Peak Vector Sum Time
  - Per-channel ZC Freq, Time of Peak, Peak Acceleration, Peak Displacement
  - MicL PSPL, MicL Time of Peak, MicL ZC Freq
  - Per-channel Sensor Self-Check (Test Freq / Test Ratio / Test Results)
  - MicL Test Amplitude (mV)
  - Battery, calibration date, monitor-log timestamps

Persisting these values into the SFM database lets the monthly-summary
review workflow ("show me events at Location X with PVS > 0.5") work
without depending on the (still-undecoded) waveform body codec.

Format (verified against decode-re/5-8-26 4-event bundle):

  - One field per line, wrapped in double quotes: `"Field Name : Value"`
  - Field/value separator: literal ` : ` (space-colon-space).
  - Some field names contain an internal `:` already (e.g. `"Project:"`),
    so we split on the FIRST ` : ` only.
  - Some fields have unit suffixes: `"0.500 in/s"` / `"7.5 Hz"` / `"533 mv"`.
  - A `"Monitor Log(s)"` marker line is followed by tab-separated rows
    of `start_time<TAB>stop_time<TAB>description`.
  - Final `"PC SW Version : ..."` line ends the metadata block.
  - A blank line separates metadata from the sample table.
  - Sample table starts with ` Tran <TAB> Vert <TAB>...`, then
    one row per sample (tab-separated, right-padded numeric values).
  - Geo channel values are in in/s; MicL in dB(L) (or 0.000 below threshold).

Because some metadata fields have whitespace quirks ("MicL Time of
Peak" has two spaces; the leading "Project:" value has its own colon),
we normalise whitespace in the key before lookup.
"""
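
Reviewer sketch (not part of the commit): the `"Field Name : Value"` convention described in the docstring can be illustrated with a small standalone helper. The name `split_field` is hypothetical and chosen only for this example; it mirrors the quote-stripping, first-separator split, and key normalisation that the module performs, under the assumption that every metadata line follows the quoted form.

```python
import re
from typing import Optional, Tuple

_WS = re.compile(r"\s+")


def split_field(raw_line: str) -> Optional[Tuple[str, str]]:
    """Split one quoted report line into (key, value), or None if no separator."""
    line = raw_line.rstrip("\r\n")
    # Drop the surrounding double quotes when both are present.
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        line = line[1:-1]
    # Split on the FIRST " : " only, so values may contain the separator too.
    idx = line.find(" : ")
    if idx < 0:
        return None
    key = _WS.sub(" ", line[:idx]).strip()  # collapse runs of whitespace
    value = line[idx + 3:].strip()
    return key, value


print(split_field('"Tran PPV : 0.500 in/s"'))
print(split_field('"Monitor Log(s)"'))
```

Lines without a separator, such as the `"Monitor Log(s)"` marker, come back as `None`, so a caller can route them to section handling instead.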

from __future__ import annotations

import datetime
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union


# ─────────────────────────────────────────────────────────────────────────────
# Output dataclasses
# ─────────────────────────────────────────────────────────────────────────────


@dataclass
class ChannelStats:
    """Per-channel derived stats, populated from an event report."""
    ppv_ips: Optional[float] = None  # in/s (geo channels only)
    zc_freq_hz: Optional[float] = None  # Hz
    time_of_peak_s: Optional[float] = None  # seconds (relative to trigger; can be negative)
    peak_accel_g: Optional[float] = None  # g (geo channels only)
    peak_disp_in: Optional[float] = None  # in (geo channels only)
    # When BW writes "OORANGE" (Out Of Range — truncated) for a PPV
    # value, the true peak exceeded the channel's full-scale range.
    # We substitute the range max (e.g. 10.000 in/s for Normal range)
    # as a lower bound, and flag here so downstream UI / alerts know
    # to render "> 10 in/s" or "saturated" instead of trusting the
    # value as an exact measurement.
    ppv_saturated: bool = False
    # Set when BW writes ">100 Hz" for ZC Freq — the zero-crossing
    # algorithm's peak frequency exceeded the device's reporting
    # ceiling (typically 100 Hz on V10.72). zc_freq_hz gets the
    # threshold (100.0) as a lower bound; downstream UI renders ">100".
    zc_freq_above_range: bool = False


@dataclass
class MicStats:
    """MicL-specific stats."""
    weighting: Optional[str] = None  # e.g. "Linear Weighting"
    pspl_dbl: Optional[float] = None  # dB(L)
    zc_freq_hz: Optional[float] = None
    time_of_peak_s: Optional[float] = None
    # Set when BW writes "OORANGE" for PSPL — mic exceeded its
    # measurement range. pspl_dbl gets 140 dBL as a conservative lower
    # bound (typical NL-43 max; some units cap at 148). Consumers
    # should render "> 140 dB(L)" or similar when this flag is set.
    pspl_saturated: bool = False
    # Same semantics as ChannelStats.zc_freq_above_range — mic ZC
    # peak exceeded device reporting ceiling.
    zc_freq_above_range: bool = False


@dataclass
class SensorCheck:
    """Per-channel sensor self-check result.

    Geo channels report a frequency + ratio; MicL reports a frequency +
    amplitude (mV). All channels also have a Pass/Fail string.
    """
    test_freq_hz: Optional[float] = None
    test_ratio: Optional[float] = None  # geo channels only
    test_amplitude_mv: Optional[float] = None  # MicL only
    test_results: Optional[str] = None  # "Passed" / "Failed"


@dataclass
class MonitorLogEntry:
    """One row of the trailing Monitor Log(s) block."""
    start_time: Optional[datetime.datetime] = None
    stop_time: Optional[datetime.datetime] = None
    description: Optional[str] = None


# BW saturation marker — appears in PPV / Peak Vector Sum / similar
# numeric fields when the underlying measurement exceeded the
# channel's full-scale range (e.g., a geophone reading > 10 in/s at
# Normal range, or a mic exceeding its sensitivity ceiling). Treated
# as "≥ range_max" + a saturated flag rather than discarded.
# Appears as: ``"Tran PPV : OORANGE in/s"``
_OORANGE_MARKERS = ("OORANGE", "OUT OF RANGE")


def _is_oorange(value: str) -> bool:
    """True when a BW numeric field is an Out-Of-Range saturation marker."""
    s = value.strip().upper()
    return any(m in s for m in _OORANGE_MARKERS)


def _parse_above_range(value: str) -> Optional[float]:
    """For BW "above-range" markers like ">100 Hz", return the threshold.

    BW writes ZC Freq as ">100 Hz" when the zero-crossing algorithm sees
    a peak too fast to count (device cuts off at 100 Hz). Returns the
    numeric portion after the '>' (e.g. 100.0), or None if `value` is
    not an above-range marker.
    """
    s = value.strip()
    if not s.startswith(">"):
        return None
    return _parse_number(s[1:])
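
Reviewer sketch (not part of the commit): the above-range handling can be pictured as returning a value plus a flag. The helper `parse_zc_freq` below is hypothetical and standalone; it assumes the only sentinel form is a leading `>` followed by a number, as in the `">100 Hz"` example quoted in the docstring.

```python
import re
from typing import Optional, Tuple

_NUM = re.compile(r"^-?\d+(?:\.\d+)?")


def parse_zc_freq(value: str) -> Tuple[Optional[float], bool]:
    """Return (frequency_hz, above_range) for a ZC Freq field value."""
    s = value.strip()
    above = s.startswith(">")
    if above:
        s = s[1:].strip()  # keep only the threshold after '>'
    m = _NUM.match(s)
    if m is None:
        return None, False  # not numeric at all
    return float(m.group(0)), above


print(parse_zc_freq(">100 Hz"))
print(parse_zc_freq("7.5 Hz"))
```

When the flag is set, the number is a threshold the true value exceeded, not a measurement, which is why the dataclasses carry a separate `zc_freq_above_range` field.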


@dataclass
class BwAsciiReport:
    """Structured representation of one BW per-event ASCII export."""
    # ── Identity ─────────────────────────────────────────────────────────────
    event_type: Optional[str] = None  # e.g. "Full Waveform"
    serial: Optional[str] = None  # e.g. "BE11529"
    version: Optional[str] = None  # firmware version line
    file_name: Optional[str] = None  # e.g. "M529LK44.AB0"
    event_datetime: Optional[datetime.datetime] = None  # parsed from Event Time + Event Date

    # ── Trigger / recording config ──────────────────────────────────────────
    trigger_channel: Optional[str] = None  # e.g. "Vert" or "From Unit"
    geo_trigger_level_ips: Optional[float] = None
    pretrig_s: Optional[float] = None  # negative seconds
    record_time_s: Optional[float] = None
    record_stop_mode: Optional[str] = None
    sample_rate_sps: Optional[int] = None
    battery_volts: Optional[float] = None
    calibration_date: Optional[datetime.date] = None
    calibration_by: Optional[str] = None  # e.g. "Instantel"
    units: Optional[str] = None  # e.g. "in/s and dB(L)"

    # ── Operator-supplied metadata ──────────────────────────────────────────
    # Parsed by POSITION from the 4-line "User Notes" block BW writes
    # between the `Units :` and `Geo Range :` lines. Position-based so
    # the values populate correctly even when an operator renames the
    # labels in Blastware's Compliance Setup → Notes tab (the 4 labels
    # are user-editable, e.g. "Seis Loc:" → "Building:" → "Site Address:").
    # The original labels BW wrote are preserved in `user_note_labels`
    # so terra-view can render them as the operator named them.
    project: Optional[str] = None  # position 1 (BW default label "Project:")
    client: Optional[str] = None  # position 2 (BW default label "Client:")
    operator: Optional[str] = None  # position 3 (BW default label "User Name:")
    sensor_location: Optional[str] = None  # position 4 (BW default label "Seis Loc:")

    # Maps canonical slot name → the literal label BW wrote in the ASCII
    # export. Empty if the User Notes block wasn't present. Example
    # when the operator renamed slot 4 to "Building:":
    #   {"project": "Project:", "client": "Client:",
    #    "operator": "User Name:", "sensor_location": "Building:"}
    user_note_labels: Dict[str, str] = field(default_factory=dict)

    # ── Geo channel scaling ─────────────────────────────────────────────────
    geo_range_ips: Optional[float] = None  # 10.000 / 1.250

    # ── Per-channel derived stats (geo + mic) ───────────────────────────────
    channels: Dict[str, ChannelStats] = field(default_factory=dict)
    mic: MicStats = field(default_factory=MicStats)

    # ── Vector sum ──────────────────────────────────────────────────────────
    peak_vector_sum_ips: Optional[float] = None
    peak_vector_sum_time_s: Optional[float] = None
    # Saturation flag — set when BW writes "OORANGE" for the PVS. We
    # then substitute sqrt(3) * geo_range_ips as a conservative upper
    # bound (the theoretical maximum PVS when all 3 geo channels are
    # simultaneously at full-scale). Consumers should display this as
    # ">{value} in/s" or similar.
    peak_vector_sum_saturated: bool = False
    # Histograms additionally have an absolute date+time for the PVS
    # (it occurred at a specific interval). Waveform reports show
    # only the relative-time value above.
    peak_vector_sum_when: Optional[datetime.datetime] = None

    # ── Histogram-specific fields (populated only when Event Type starts
    #    with 'Histogram' / 'Full Histogram' / 'Histogram + Continuous') ──
    histogram_start: Optional[datetime.datetime] = None
    histogram_stop: Optional[datetime.datetime] = None
    histogram_n_intervals: Optional[int] = None  # e.g. 4, 1436
    histogram_interval_size_str: Optional[str] = None  # "1 minute" / "5 minutes" / "15 seconds"
    histogram_interval_size_s: Optional[float] = None  # parsed to seconds
    # Per-channel absolute peak time+date (histogram-specific). For
    # waveform events these are None — those reports use the channel's
    # time_of_peak_s (relative to trigger) instead. Keyed by channel
    # name ("Tran", "Vert", "Long", "MicL").
    channel_peak_when: Dict[str, datetime.datetime] = field(default_factory=dict)

    # ── Sensor self-check (per channel) ─────────────────────────────────────
    sensor_check: Dict[str, SensorCheck] = field(default_factory=dict)

    # ── Monitor log + tooling version ───────────────────────────────────────
    monitor_log: List[MonitorLogEntry] = field(default_factory=list)
    pc_sw_version: Optional[str] = None

    # ── Sample table (optional; only parsed if requested) ───────────────────
    # Each entry: (Tran, Vert, Long, MicL) in the report's units (geo
    # channels in in/s, MicL in dB(L)). None when parse_samples=False.
    samples: Optional[List[Tuple[float, float, float, float]]] = None
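
Reviewer sketch (not part of the commit): the `sqrt(3) * geo_range_ips` substitution documented on `peak_vector_sum_saturated` follows from the vector-sum definition. If each geo channel magnitude is at most the range R, then sqrt(T² + V² + L²) ≤ sqrt(3·R²) = sqrt(3)·R. The function name `saturated_pvs_bound` is hypothetical; the two ranges are the `10.000 / 1.250` values noted on `geo_range_ips`.

```python
import math


def saturated_pvs_bound(geo_range_ips: float) -> float:
    """Largest vector sum representable when all three channels sit at full scale."""
    return math.sqrt(3) * geo_range_ips


for rng in (10.0, 1.25):
    print(f"range {rng:>6.3f} in/s -> bound {saturated_pvs_bound(rng):.3f} in/s")
```

This is the largest vector sum the three channels can represent in range. It is not a measurement of the event, which is why the saturation flag travels with it.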


# ─────────────────────────────────────────────────────────────────────────────
# Helpers
# ─────────────────────────────────────────────────────────────────────────────


_KEY_NORMALISE_RE = re.compile(r"\s+")
_NUMERIC_RE = re.compile(r"^-?\d+(?:\.\d+)?")


def _normalise_key(k: str) -> str:
    """Collapse whitespace runs (incl. tabs) and strip — handles BW's
    "MicL Time of Peak" double-space and leading-colon quirks."""
    return _KEY_NORMALISE_RE.sub(" ", k).strip()


def _strip_quotes(line: str) -> str:
    line = line.rstrip("\r\n")
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        return line[1:-1]
    return line


def _parse_number(value: str) -> Optional[float]:
    """Pull the leading numeric portion out of a value like "0.500 in/s"."""
    m = _NUMERIC_RE.match(value.strip())
    if not m:
        return None
    try:
        return float(m.group(0))
    except ValueError:
        return None


def _parse_int(value: str) -> Optional[int]:
    n = _parse_number(value)
    return None if n is None else int(round(n))


# Months exactly as BW writes them.
_MONTHS = {
    "January": 1, "February": 2, "March": 3, "April": 4,
    "May": 5, "June": 6, "July": 7, "August": 8,
    "September": 9, "October": 10, "November": 11, "December": 12,
    # Short forms used in monitor-log rows ("Apr 23 /26").
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "Jun": 6, "Jul": 7,
    "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def _parse_event_date(s: str) -> Optional[datetime.date]:
    """Parse "April 23, 2026" or "May 8, 2026" → date."""
    s = s.strip()
    parts = s.replace(",", " ").split()
    if len(parts) < 3:
        return None
    month_name, day_str, year_str = parts[0], parts[1], parts[2]
    month = _MONTHS.get(month_name)
    if month is None:
        return None
    try:
        return datetime.date(int(year_str), month, int(day_str))
    except ValueError:
        return None


def _parse_iso_date(s: str) -> Optional[datetime.date]:
    """Parse "2026-05-16" → date. Histograms use ISO format for their
    Start Date / Stop Date / Peak Date fields; waveforms use the
    "May 8, 2026" long form which `_parse_event_date` handles."""
    s = s.strip()
    try:
        return datetime.date.fromisoformat(s)
    except ValueError:
        return None


_INTERVAL_UNIT_SECONDS = {
    "second": 1, "seconds": 1, "sec": 1, "secs": 1,
    "minute": 60, "minutes": 60, "min": 60, "mins": 60,
    "hour": 3600, "hours": 3600, "hr": 3600, "hrs": 3600,
}


def _parse_interval_size(s: str) -> Optional[float]:
    """Parse "1 minute" / "5 minutes" / "15 seconds" / "2 seconds" → seconds.

    Handles the BW Compliance Setup → Histogram Interval values verbatim
    ("2 seconds", "5 seconds", "15 seconds", "1 minute", "5 minutes",
    "15 minutes") plus a few defensive variants.
    """
    if not s:
        return None
    parts = s.strip().split()
    if len(parts) < 2:
        return None
    try:
        n = float(parts[0])
    except ValueError:
        return None
    unit_per_s = _INTERVAL_UNIT_SECONDS.get(parts[1].lower())
    if unit_per_s is None:
        return None
    return n * unit_per_s


def _parse_event_time(s: str) -> Optional[datetime.time]:
    """Parse "15:56:35" → time."""
    s = s.strip()
    try:
        h, m, sec = s.split(":")
        return datetime.time(int(h), int(m), int(sec))
    except (ValueError, IndexError):
        return None


def _parse_calibration(value: str) -> Tuple[Optional[datetime.date], Optional[str]]:
    """Parse "April 29, 2025 by Instantel" → (date, "Instantel")."""
    parts = value.split(" by ", 1)
    date = _parse_event_date(parts[0])
    by = parts[1].strip() if len(parts) > 1 else None
    return date, by


def _parse_monitor_row(line: str) -> Optional[MonitorLogEntry]:
    """Parse a tab-separated monitor log row.

    Format: `<start>\t<stop>\t<desc>` where each timestamp is BW's
    short form "Mon DD /YY HH:MM:SS" (e.g. "Apr 23 /26 15:46:16").
    Year is encoded as a 2-digit suffix; we expand "/26" → 2026.
    """
    parts = line.split("\t")
    if len(parts) < 2:
        return None
    start = _parse_monitor_ts(parts[0])
    stop = _parse_monitor_ts(parts[1])
    desc = parts[2].strip() if len(parts) > 2 else None
    if start is None and stop is None and not desc:
        return None
    return MonitorLogEntry(start_time=start, stop_time=stop, description=desc)


def _parse_monitor_ts(s: str) -> Optional[datetime.datetime]:
    """Parse "Apr 23 /26 15:46:16" → datetime."""
    s = s.strip()
    parts = s.split()
    if len(parts) < 4:
        return None
    month = _MONTHS.get(parts[0])
    if month is None:
        return None
    try:
        day = int(parts[1])
        # parts[2] looks like "/26" → century-flip to 2026
        yy = int(parts[2].lstrip("/"))
        year = 2000 + yy if yy < 80 else 1900 + yy
        h, m, sec = (int(x) for x in parts[3].split(":"))
        return datetime.datetime(year, month, day, h, m, sec)
    except (ValueError, IndexError):
        return None
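
Reviewer sketch (not part of the commit): the two-digit year expansion in `_parse_monitor_ts` is a fixed-pivot rule, with values below 80 mapping to 20xx and the rest to 19xx. The standalone helper `expand_two_digit_year` is a hypothetical name used only to show the boundary behaviour; the pivot of 80 is taken from the code above.

```python
def expand_two_digit_year(yy: int, pivot: int = 80) -> int:
    """Map a two-digit year to four digits using a fixed pivot."""
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year, got {yy}")
    return 2000 + yy if yy < pivot else 1900 + yy


# "/26" as it appears in a monitor-log timestamp such as "Apr 23 /26 15:46:16".
token = "/26"
print(expand_two_digit_year(int(token.lstrip("/"))))
```

A fixed pivot keeps the parser deterministic, at the cost of misreading timestamps once real dates reach 2080.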


# ── User-notes positional slot map ──────────────────────────────────────────
#
# Blastware's Compliance Setup → Notes tab shows four operator-supplied
# fields whose LABELS the operator can rename (see screenshot in
# project archive). Defaults are "Project:" / "Client:" /
# "User Name:" / "Seis Loc:", but an operator using a different
# convention can rename them to anything ("Building:", "Site:",
# "Address:", etc.). The ASCII export reflects whatever the operator
# typed, so label-based matching is fragile.
#
# What IS reliable: BW always writes the 4 user-notes lines in the
# same order, contiguously between the `Units :` line and the
# `Geo Range :` line. We parse them by POSITION and preserve the
# operator's labels in `report.user_note_labels` so terra-view can
# render them as the operator intended.

_USER_NOTE_SLOTS = ("project", "client", "operator", "sensor_location")
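
Reviewer sketch (not part of the commit): the positional mapping described in the comment block can be shown in isolation. `assign_user_notes` is a hypothetical helper; it assumes the caller has already collected the `(label, value)` pairs found between the `Units :` and `Geo Range :` lines, in file order. The sample values are made up for the example.

```python
from typing import Dict, List, Tuple

SLOTS = ("project", "client", "operator", "sensor_location")


def assign_user_notes(
    pairs: List[Tuple[str, str]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Assign (label, value) pairs to canonical slots by position only."""
    values: Dict[str, str] = {}
    labels: Dict[str, str] = {}
    # zip() stops at the shorter input, so extra lines beyond four are ignored
    # and a short block simply leaves the later slots unset.
    for slot, (label, value) in zip(SLOTS, pairs):
        values[slot] = value
        labels[slot] = label  # keep whatever the operator named the field
    return values, labels


values, labels = assign_user_notes([
    ("Project:", "Bridge Rehab"),
    ("Client:", "Acme"),
    ("User Name:", "J. Doe"),
    ("Building:", "12 Main St"),  # operator renamed the default "Seis Loc:"
])
print(values["sensor_location"], "|", labels["sensor_location"])
```

Keeping the label alongside the value lets a UI show the operator's wording while queries still target one stable column name.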


# ─────────────────────────────────────────────────────────────────────────────
# Top-level parser
# ─────────────────────────────────────────────────────────────────────────────


def parse_report(text: Union[str, bytes], *, parse_samples: bool = False) -> BwAsciiReport:
    """Parse a BW per-event ASCII export into a structured BwAsciiReport.

    Set ``parse_samples=True`` to also populate ``report.samples`` with
    the trailing sample table. Default False because the table is
    huge and most callers only want metadata for indexing.
    """
    if isinstance(text, bytes):
        text = text.decode("ascii", errors="replace")

    report = BwAsciiReport()
    # Pre-create channel stat slots so callers can rely on them existing.
    for ch in ("Tran", "Vert", "Long", "MicL"):
        report.channels.setdefault(ch, ChannelStats())
        report.sensor_check.setdefault(ch, SensorCheck())

    lines = text.splitlines()
    i = 0
    n = len(lines)

    in_monitor_log_section = False
    event_time_str: Optional[str] = None
    event_date: Optional[datetime.date] = None

    # User-notes block detection. We enter the block after parsing
    # the "Units :" line and exit on the "Geo Range :" line. Inside,
    # the first 4 unmatched `<label> : <value>` lines are assigned to
    # the 4 canonical operator-supplied slots by POSITION (project,
    # client, operator, sensor_location) regardless of what the
    # operator named the labels in BW's Compliance Setup → Notes tab.
    in_user_notes_block = False
    user_note_position = 0

    # Histogram-field staging — BW writes <Channel> Peak Time and
    # <Channel> Peak Date on separate lines (and similarly Histogram
    # Start Time / Date). We stash the partial value when the time
    # line arrives and combine it when the matching date line arrives.
    _hist_start_time: Optional[datetime.time] = None
    _hist_stop_time: Optional[datetime.time] = None
    _pending_peak_time: Dict[str, Optional[datetime.time]] = {}
    _pvs_time_raw: Optional[str] = None  # last Peak Vector Sum Time value, raw

    while i < n:
        raw_line = lines[i]
        i += 1
        # Blank line marks the start of the sample table.
        if raw_line.strip() == "":
            break

        line = _strip_quotes(raw_line)

        # Monitor log section: "Monitor Log(s)" header followed by N rows
        # (still inside double-quoted lines), terminated by a non-row line
        # like "PC SW Version : ..." or a blank line.
        if not in_monitor_log_section and line.strip() == "Monitor Log(s)":
            in_monitor_log_section = True
            continue
        if in_monitor_log_section:
            # Heuristic: monitor rows contain a tab; the next "Field : Value"
            # line ends the section.
            if "\t" in line:
                entry = _parse_monitor_row(line)
                if entry:
                    report.monitor_log.append(entry)
                continue
            # Falls through to the field parser below; clear the flag.
            in_monitor_log_section = False

        # "Field : Value" — split on FIRST occurrence of " : "
        idx = line.find(" : ")
        if idx < 0:
            continue
        key = _normalise_key(line[:idx])
        value = line[idx + 3 :].strip()

        # ── Identity / config ────────────────────────────────────────────────
        if key == "Event Type": report.event_type = value
        elif key == "Serial Number": report.serial = value
        elif key == "Version": report.version = value
        elif key == "File Name": report.file_name = value
        elif key == "Event Time": event_time_str = value
        elif key == "Event Date": event_date = _parse_event_date(value)

        elif key == "Trigger": report.trigger_channel = value
        elif key == "Geo Trigger Level": report.geo_trigger_level_ips = _parse_number(value)
        elif key == "Pre-trigger Length": report.pretrig_s = _parse_number(value)
        elif key == "Record Time": report.record_time_s = _parse_number(value)
        elif key == "Record Stop Mode": report.record_stop_mode = value
        elif key == "Sample Rate": report.sample_rate_sps = _parse_int(value)
        elif key == "Battery Level": report.battery_volts = _parse_number(value)
        elif key == "Calibration":
            report.calibration_date, report.calibration_by = _parse_calibration(value)
        elif key == "Units":
            report.units = value
            # Entering the user-notes block. Next ~4 lines until
            # "Geo Range :" are the operator-supplied notes.
            in_user_notes_block = True
            user_note_position = 0

        elif key == "Geo Range":
            # Exiting the user-notes block.
            in_user_notes_block = False
            report.geo_range_ips = _parse_number(value)

        # User-notes block: assign by position (operator may have
        # renamed the labels, so we don't trust them). Preserve the
        # original labels in `user_note_labels` for downstream UIs
        # (terra-view) that want to display them as the operator
        # named them.
        elif in_user_notes_block and user_note_position < len(_USER_NOTE_SLOTS):
            slot = _USER_NOTE_SLOTS[user_note_position]
            setattr(report, slot, value)
            report.user_note_labels[slot] = key
            user_note_position += 1

        # ── Per-channel stats ────────────────────────────────────────────────
        # All match the pattern "{Channel} <stat-name>"
        elif key in (
            "Tran PPV", "Vert PPV", "Long PPV",
            "Tran ZC Freq", "Vert ZC Freq", "Long ZC Freq",
            "Tran Time of Peak", "Vert Time of Peak", "Long Time of Peak",
            "Tran Peak Acceleration", "Vert Peak Acceleration", "Long Peak Acceleration",
            "Tran Peak Displacement", "Vert Peak Displacement", "Long Peak Displacement",
        ):
            ch_name, stat = key.split(" ", 1)
            cs = report.channels.setdefault(ch_name, ChannelStats())
            if stat == "PPV":
                if _is_oorange(value):
                    # Channel saturated — substitute range max as lower
                    # bound; flag so downstream UI can render "> 10 in/s".
                    cs.ppv_ips = report.geo_range_ips
                    cs.ppv_saturated = True
                else:
                    cs.ppv_ips = _parse_number(value)
            elif stat == "ZC Freq":
                # ">100 Hz" → store threshold + flag; numeric → parse normally
                threshold = _parse_above_range(value)
                if threshold is not None:
                    cs.zc_freq_hz = threshold
                    cs.zc_freq_above_range = True
                else:
                    cs.zc_freq_hz = _parse_number(value)
            else:
                num = _parse_number(value)
                if stat == "Time of Peak": cs.time_of_peak_s = num
                elif stat == "Peak Acceleration": cs.peak_accel_g = num
                elif stat == "Peak Displacement": cs.peak_disp_in = num

        # ── Histogram-specific fields ────────────────────────────────────────
        # Histograms have Start/Stop time+date pairs + an interval count
        # and size, plus per-channel absolute Peak Time/Date instead of
        # the waveform's relative Time of Peak.
        elif key == "Histogram Start Time":
            _hist_start_time = _parse_event_time(value)
        elif key == "Histogram Start Date":
            _d = _parse_iso_date(value)
            if _d and _hist_start_time:
                report.histogram_start = datetime.datetime.combine(_d, _hist_start_time)
        elif key == "Histogram Stop Time":
            _hist_stop_time = _parse_event_time(value)
        elif key == "Histogram Stop Date":
            _d = _parse_iso_date(value)
            if _d and _hist_stop_time:
                report.histogram_stop = datetime.datetime.combine(_d, _hist_stop_time)
        elif key == "Number of Intervals":
            try:
                report.histogram_n_intervals = int(float(value.strip()))
            except ValueError:
                pass
        elif key == "Interval Size":
            report.histogram_interval_size_str = value.strip()
            report.histogram_interval_size_s = _parse_interval_size(value)

        # ── Per-channel histogram Peak Date / Peak Time ──
        # Lines like "Tran Peak Time : 22:31:38" + "Tran Peak Date : 2026-05-16"
        elif key in ("Tran Peak Time", "Vert Peak Time", "Long Peak Time", "MicL Time"):
            ch_name = "MicL" if key == "MicL Time" else key.split(" ", 1)[0]
            _pending_peak_time[ch_name] = _parse_event_time(value)
        elif key in ("Tran Peak Date", "Vert Peak Date", "Long Peak Date", "MicL Date"):
            ch_name = "MicL" if key == "MicL Date" else key.split(" ", 1)[0]
            _d = _parse_iso_date(value)
            _t = _pending_peak_time.get(ch_name)
            if _d and _t:
                report.channel_peak_when[ch_name] = datetime.datetime.combine(_d, _t)

        # ── Vector Sum ───────────────────────────────────────────────────────
        elif key == "Peak Vector Sum":
            if _is_oorange(value):
                # PVS saturated — conservative upper bound is
                # sqrt(3) * geo_range_ips (all 3 channels at full-scale).
                # Real PVS could be lower (channels rarely peak
                # simultaneously) but never higher within the range.
                if report.geo_range_ips is not None:
                    import math as _math
                    report.peak_vector_sum_ips = _math.sqrt(3) * report.geo_range_ips
                report.peak_vector_sum_saturated = True
            else:
                report.peak_vector_sum_ips = _parse_number(value)
        # BW writes the PVS-time label with a typo: "Peak Vector Sum TimeSum"
        # (looks like Sum got appended twice). Accept both forms. Confirmed
        # against actual BW output on 2026-05-27 — every PVS-time line in
        # the field examples (T190, T438, K557) uses the typo'd label.
        elif key in ("Peak Vector Sum Time", "Peak Vector Sum TimeSum"):
            report.peak_vector_sum_time_s = _parse_number(value)
            _pvs_time_raw = value
        elif key == "Peak Vector Sum Date":
            # Histogram-mode PVS gets paired with a date. We may have
            # captured 'Peak Vector Sum Time' as either a relative
            # seconds float (waveform) or an HH:MM:SS string we
            # interpreted as a number. For histograms, BW writes
            # "Peak Vector Sum Time : 22:33:52" which _parse_number
            # parses as 22.0 (loses information). When Peak Vector Sum
            # Date arrives, re-parse the previous PVS time line as a
            # clock time and combine into an absolute datetime.
            _d = _parse_iso_date(value)
            if _d and _pvs_time_raw is not None:
                _t = _parse_event_time(_pvs_time_raw)
                if _t:
                    report.peak_vector_sum_when = datetime.datetime.combine(_d, _t)
                    # The earlier seconds parse was bogus for histograms;
                    # clear it so consumers don't think it's a real offset.
                    report.peak_vector_sum_time_s = None

        # ── Microphone block ────────────────────────────────────────────────
        elif key == "Microphone":
            report.mic.weighting = value
        elif key == "MicL PSPL":
            if _is_oorange(value):
                # Mic saturated — substitute 140 dBL as a conservative lower bound.
                report.mic.pspl_dbl = 140.0
                report.mic.pspl_saturated = True
            else:
                report.mic.pspl_dbl = _parse_number(value)
            # PSPL is deliberately NOT mirrored onto `channels["MicL"].ppv_ips`:
            # it is dB(L), not in/s, so it lives only in MicStats. Time of
            # Peak and ZC Freq below are mirrored onto the "MicL" channel entry.
        elif key == "MicL Time of Peak":
            report.mic.time_of_peak_s = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.time_of_peak_s = report.mic.time_of_peak_s
        elif key == "MicL ZC Freq":
            threshold = _parse_above_range(value)
            if threshold is not None:
                report.mic.zc_freq_hz = threshold
                report.mic.zc_freq_above_range = True
            else:
                report.mic.zc_freq_hz = _parse_number(value)
            cs = report.channels.setdefault("MicL", ChannelStats())
            cs.zc_freq_hz = report.mic.zc_freq_hz
            cs.zc_freq_above_range = report.mic.zc_freq_above_range
|
||||||
|
|
||||||
|
        # ── Sensor self-check ────────────────────────────────────────────────
        elif key in (
            "Tran Test Freq", "Vert Test Freq", "Long Test Freq", "MicL Test Freq",
            "Tran Test Ratio", "Vert Test Ratio", "Long Test Ratio",
            "MicL Test Amplitude",
            "Tran Test Results", "Vert Test Results", "Long Test Results", "MicL Test Results",
        ):
            ch_name, stat = key.split(" ", 1)
            sc = report.sensor_check.setdefault(ch_name, SensorCheck())
            if stat == "Test Freq": sc.test_freq_hz = _parse_number(value)
            elif stat == "Test Ratio": sc.test_ratio = _parse_number(value)
            elif stat == "Test Amplitude": sc.test_amplitude_mv = _parse_number(value)
            elif stat == "Test Results": sc.test_results = value

        # ── Trailer ─────────────────────────────────────────────────────────
        elif key == "PC SW Version":
            report.pc_sw_version = value

        # Unknown keys are silently dropped — forward-compat for future
        # BW versions that may add fields.

    # Combine event date + time into a datetime
    if event_date is not None and event_time_str is not None:
        t = _parse_event_time(event_time_str)
        if t is not None:
            report.event_datetime = datetime.datetime.combine(event_date, t)

    if parse_samples:
        report.samples = _parse_sample_table(lines, i)

    return report


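The histogram-mode Peak Vector Sum handling in `parse_report` pairs a clock-time string with a date that arrives on a later line. A minimal sketch of that re-parse-and-combine step follows. The helper names `parse_clock_time` and `combine_pvs` are hypothetical stand-ins (not the module's `_parse_event_time` / `_parse_iso_date`), and the `YYYY-MM-DD` date format is an assumption suggested only by the name `_parse_iso_date`.

```python
import datetime
from typing import Optional


def parse_clock_time(text: str) -> Optional[datetime.time]:
    """Hypothetical helper: parse 'HH:MM:SS', return None for anything else."""
    try:
        return datetime.datetime.strptime(text.strip(), "%H:%M:%S").time()
    except ValueError:
        return None


def combine_pvs(date_text: str, time_text: str) -> Optional[datetime.datetime]:
    """Combine a date line and the raw PVS time text into one datetime.

    Returns None when either part fails to parse, e.g. when the time text
    is a waveform-style relative offset such as '0.125'.
    """
    try:
        # Assumption: the date is written as YYYY-MM-DD.
        day = datetime.date.fromisoformat(date_text.strip())
    except ValueError:
        return None
    clock = parse_clock_time(time_text)
    if clock is None:
        return None
    return datetime.datetime.combine(day, clock)


when = combine_pvs("2026-05-27", "22:33:52")
relative = combine_pvs("2026-05-27", "0.125")
```

Keeping the raw time text around until the date line arrives is what makes this possible: once the value has been reduced to a float, the minutes and seconds are already gone.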
def _parse_sample_table(
    lines: List[str], start: int,
) -> List[Tuple[float, float, float, float]]:
    """Parse the trailing sample table.

    The table starts with a header row (" Tran <TAB>...") and continues
    until EOF. Each data row is a tab-separated quartet of numeric values.
    """
    samples: List[Tuple[float, float, float, float]] = []
    seen_header = False
    for line in lines[start:]:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        cols = [c.strip() for c in line.split("\t") if c.strip()]
        if not seen_header:
            # Header row contains channel names; numeric rows don't.
            if any(c in ("Tran", "Vert", "Long", "MicL") for c in cols):
                seen_header = True
            continue
        if len(cols) < 4:
            continue
        try:
            samples.append((
                float(cols[0]), float(cols[1]),
                float(cols[2]), float(cols[3]),
            ))
        except ValueError:
            continue
    return samples


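To illustrate the behaviour `_parse_sample_table` describes, here is a small self-contained sketch that applies the same rules (skip everything up to and including the header row, then keep only rows with four numeric tab-separated columns) to made-up input. The function name `parse_rows_sketch` and the sample values are hypothetical, and the sketch assumes non-header lines before the header are skipped rather than parsed.

```python
from typing import List, Tuple

Row = Tuple[float, float, float, float]
CHANNELS = ("Tran", "Vert", "Long", "MicL")


def parse_rows_sketch(lines: List[str]) -> List[Row]:
    """Condensed restatement of the sample-table rules, for illustration."""
    rows: List[Row] = []
    seen_header = False
    for line in lines:
        cols = [c.strip() for c in line.rstrip("\r\n").split("\t") if c.strip()]
        if not cols:
            continue  # blank line
        if not seen_header:
            # Only the header row names channels; skip until it appears.
            seen_header = any(c in CHANNELS for c in cols)
            continue
        if len(cols) < 4:
            continue
        try:
            rows.append((float(cols[0]), float(cols[1]),
                         float(cols[2]), float(cols[3])))
        except ValueError:
            continue  # non-numeric row, ignored
    return rows


example = [
    "Some trailing key : value\r\n",
    "  Tran\tVert\tLong\tMicL\r\n",
    "0.005\t-0.010\t0.000\t0.5\r\n",
    "\r\n",
    "bad\trow\there\tx\r\n",
    "0.010\t0.015\t-0.005\t1.0\r\n",
]
parsed = parse_rows_sketch(example)
```

Malformed rows are dropped silently rather than raising, which matches the tolerant style of the rest of the parser.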
def parse_report_file(
    path: Union[str, Path], *, parse_samples: bool = False,
) -> BwAsciiReport:
    """Convenience: read a .TXT file from disk and parse it."""
    return parse_report(Path(path).read_bytes(), parse_samples=parse_samples)
+355
-149
@@ -449,7 +449,7 @@ class MiniMateClient:
|
|||||||
proto.confirm_erase_all()
|
proto.confirm_erase_all()
|
||||||
log.info("delete_all_events: erase confirmed — device memory cleared")
|
log.info("delete_all_events: erase confirmed — device memory cleared")
|
||||||
|
|
||||||
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None, extra_chunks_after_metadata: int = 1) -> list[Event]:
|
def get_events(self, full_waveform: bool = False, debug: bool = False, stop_after_index: Optional[int] = None, skip_waveform_for_keys: Optional[set] = None, skip_waveform_for_events: Optional[dict] = None, extra_chunks_after_metadata: int = 1) -> list[Event]:
|
||||||
"""
|
"""
|
||||||
Download all stored events from the device using the confirmed
|
Download all stored events from the device using the confirmed
|
||||||
1E → 0A → 0C → 5A → 1F event-iterator protocol.
|
1E → 0A → 0C → 5A → 1F event-iterator protocol.
|
||||||
@@ -497,37 +497,24 @@ class MiniMateClient:
|
|||||||
events: list[Event] = []
|
events: list[Event] = []
|
||||||
idx = 0
|
idx = 0
|
||||||
|
|
||||||
        # Legacy bare-key skip set is deprecated: the device's key counter
        # resets to 0x01110000 after every memory erase, so a key in this set
        # cannot be trusted to identify the same physical event across erases.
        # If a caller still passes it, log a warning and ignore — full
        # downloads will run for every event so the bug never silently bites.
        if skip_waveform_for_keys:
            log.warning(
                "get_events: skip_waveform_for_keys is deprecated and unsafe "
                "(post-erase key reuse); ignoring %d entries. Use "
                "skip_waveform_for_events={key: timestamp_iso} instead.",
                len(skip_waveform_for_keys),
            )
        skip_evts: dict[str, str] = dict(skip_waveform_for_events or {})

while data8[4:8] != b"\x00\x00\x00\x00":
|
while data8[4:8] != b"\x00\x00\x00\x00":
|
||||||
cur_key = key4 # key for this event's 0A/1E-arm/0C/5A calls
|
cur_key = key4 # key for this event's 0A/1E-arm/0C/5A calls
|
||||||
log.info("get_events: record %d key=%s", idx, cur_key.hex())
|
log.info("get_events: record %d key=%s", idx, cur_key.hex())
|
||||||
|
|
||||||
# Fast-advance path: if this key is already downloaded, skip
|
|
||||||
# 1E-arm/0C/POLL/5A entirely. Only 0A + 1F(browse) are needed
|
|
||||||
# to advance the device's internal pointer to the next event.
|
|
||||||
# This is identical to the browse-mode walk in count_events().
|
|
||||||
if skip_waveform_for_keys and cur_key.hex() in skip_waveform_for_keys:
|
|
||||||
log.debug("get_events: key=%s already seen -- fast-advance only", cur_key.hex())
|
|
||||||
try:
|
|
||||||
proto.read_waveform_header(cur_key)
|
|
||||||
except ProtocolError as exc:
|
|
||||||
log.warning(
|
|
||||||
"get_events: 0A failed for key=%s (skip path): %s -- stopping",
|
|
||||||
cur_key.hex(), exc,
|
|
||||||
)
|
|
||||||
break
|
|
||||||
try:
|
|
||||||
key4, data8 = proto.advance_event(browse=True)
|
|
||||||
except ProtocolError as exc:
|
|
||||||
log.warning(
|
|
||||||
"get_events: 1F failed for key=%s (skip path): %s -- stopping",
|
|
||||||
cur_key.hex(), exc,
|
|
||||||
)
|
|
||||||
break
|
|
||||||
idx += 1
|
|
||||||
if stop_after_index is not None and idx > stop_after_index:
|
|
||||||
break
|
|
||||||
continue
|
|
||||||
|
|
||||||
ev = Event(index=idx)
|
ev = Event(index=idx)
|
||||||
ev._waveform_key = cur_key
|
ev._waveform_key = cur_key
|
||||||
|
|
||||||
@@ -574,72 +561,96 @@ class MiniMateClient:
|
|||||||
"get_events: 0C failed for key=%s: %s", cur_key.hex(), exc
|
"get_events: 0C failed for key=%s: %s", cur_key.hex(), exc
|
||||||
)
|
)
|
||||||
|
|
||||||
# SUB 1F (download-arm) — send token=0xFE BEFORE POLL+5A to arm the
|
# ── Skip-5A decision based on (key, timestamp) match ──────
|
||||||
# device's bulk stream state machine. Cache the returned key as a
|
# If skip_waveform_for_events maps cur_key.hex() to a non-empty
|
||||||
# fallback for loop iteration when 5A fails (see iteration block below).
|
# ISO timestamp matching what we just read from 0C, this is
|
||||||
# Confirmed from 4-2-26 capture frames 66-67 (1F before frames 68-73 POLL).
|
# the same physical event we already have on disk — bypass
|
||||||
arm_key4: Optional[bytes] = None
|
# the 1F(arm)+POLL+5A bulk download. Otherwise (no entry, or
|
||||||
try:
|
# timestamp mismatch indicating post-erase reuse) fall through
|
||||||
arm_key4, _ = proto.advance_event(browse=False) # arm 5A
|
# to the full download.
|
||||||
log.info("get_events: 1F(download) — 5A armed, arm_key=%s", arm_key4.hex())
|
expected_ts = skip_evts.get(cur_key.hex(), "")
|
||||||
except ProtocolError as exc:
|
actual_ts = _event_timestamp_iso(ev)
|
||||||
log.warning("get_events: 1F(download) arm failed: %s", exc)
|
skip_5a = bool(expected_ts and actual_ts and expected_ts == actual_ts)
|
||||||
|
if skip_5a:
|
||||||
|
log.info(
|
||||||
|
"get_events: key=%s (key, ts=%s) match — skipping 5A bulk download",
|
||||||
|
cur_key.hex(), actual_ts,
|
||||||
|
)
|
||||||
|
|
||||||
# POLL × 3 — BW sends 3 full POLL cycles between 1F and 5A.
|
arm_key4: Optional[bytes] = None
|
||||||
# Confirmed from 4-2-26 BW TX capture (frames 68-73 before 5A at 74).
|
a5_ok = False
|
||||||
log.info("get_events: POLL × 3 before 5A")
|
|
||||||
for _p in range(3):
|
if not skip_5a:
|
||||||
|
# SUB 1F (download-arm) — send token=0xFE BEFORE POLL+5A to arm the
|
||||||
|
# device's bulk stream state machine. Cache the returned key as a
|
||||||
|
# fallback for loop iteration when 5A fails (see iteration block below).
|
||||||
|
# Confirmed from 4-2-26 capture frames 66-67 (1F before frames 68-73 POLL).
|
||||||
try:
|
try:
|
||||||
proto.poll()
|
arm_key4, _ = proto.advance_event(browse=False) # arm 5A
|
||||||
|
log.info("get_events: 1F(download) — 5A armed, arm_key=%s", arm_key4.hex())
|
||||||
except ProtocolError as exc:
|
except ProtocolError as exc:
|
||||||
log.warning("get_events: POLL %d failed: %s", _p, exc)
|
log.warning("get_events: 1F(download) arm failed: %s", exc)
|
||||||
|
|
||||||
|
# POLL × 3 — BW sends 3 full POLL cycles between 1F and 5A.
|
||||||
|
# Confirmed from 4-2-26 BW TX capture (frames 68-73 before 5A at 74).
|
||||||
|
log.info("get_events: POLL × 3 before 5A")
|
||||||
|
for _p in range(3):
|
||||||
|
try:
|
||||||
|
proto.poll()
|
||||||
|
except ProtocolError as exc:
|
||||||
|
log.warning("get_events: POLL %d failed: %s", _p, exc)
|
||||||
|
|
||||||
# SUB 5A — bulk waveform stream (uses cur_key, the event set up by 0A+1E+0C).
|
# SUB 5A — bulk waveform stream (uses cur_key, the event set up by 0A+1E+0C).
|
||||||
# By default (full_waveform=False): stop after frame 7 for metadata only.
|
# By default (full_waveform=False): stop after frame 7 for metadata only.
|
||||||
# When full_waveform=True: fetch all chunks and decode raw ADC samples.
|
# When full_waveform=True: fetch all chunks and decode raw ADC samples.
|
||||||
a5_ok = False
|
#
|
||||||
try:
|
# Bypassed when skip_5a is True — the event is left with
|
||||||
if full_waveform:
|
# _a5_frames=None, which signals to the caller (e.g.
|
||||||
log.info(
|
# ach_server.py) that this event was matched by (key, ts) and
|
||||||
"get_events: 5A full waveform download for key=%s", cur_key.hex()
|
# already has a stored .file in the persistent waveform store.
|
||||||
)
|
if not skip_5a:
|
||||||
a5_frames = proto.read_bulk_waveform_stream(
|
try:
|
||||||
cur_key, stop_after_metadata=False, max_chunks=128,
|
if full_waveform:
|
||||||
include_terminator=True,
|
|
||||||
)
|
|
||||||
if a5_frames:
|
|
||||||
a5_ok = True
|
|
||||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
|
||||||
_decode_a5_metadata_into(a5_frames, ev)
|
|
||||||
_decode_a5_waveform(a5_frames, ev)
|
|
||||||
log.info(
|
log.info(
|
||||||
"get_events: 5A decoded %d sample-sets",
|
"get_events: 5A full waveform download for key=%s", cur_key.hex()
|
||||||
len((ev.raw_samples or {}).get("Tran", [])),
|
|
||||||
)
|
)
|
||||||
else:
|
a5_frames = proto.read_bulk_waveform_stream(
|
||||||
log.info(
|
cur_key, stop_after_metadata=False, max_chunks=128,
|
||||||
"get_events: 5A metadata-only download for key=%s", cur_key.hex()
|
include_terminator=True,
|
||||||
)
|
|
||||||
a5_frames = proto.read_bulk_waveform_stream(
|
|
||||||
cur_key, stop_after_metadata=True,
|
|
||||||
include_terminator=True,
|
|
||||||
extra_chunks_after_metadata=extra_chunks_after_metadata,
|
|
||||||
max_chunks=128,
|
|
||||||
)
|
|
||||||
if a5_frames:
|
|
||||||
a5_ok = True
|
|
||||||
ev._a5_frames = a5_frames # store for write_blastware_file
|
|
||||||
_decode_a5_metadata_into(a5_frames, ev)
|
|
||||||
log.debug(
|
|
||||||
"get_events: 5A metadata client=%r operator=%r",
|
|
||||||
ev.project_info.client if ev.project_info else None,
|
|
||||||
ev.project_info.operator if ev.project_info else None,
|
|
||||||
)
|
)
|
||||||
except ProtocolError as exc:
|
if a5_frames:
|
||||||
log.warning(
|
a5_ok = True
|
||||||
"get_events: 5A failed for key=%s: %s — metadata unavailable",
|
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||||
cur_key.hex(), exc,
|
_decode_a5_metadata_into(a5_frames, ev)
|
||||||
)
|
_decode_a5_waveform(a5_frames, ev)
|
||||||
|
log.info(
|
||||||
|
"get_events: 5A decoded %d sample-sets",
|
||||||
|
len((ev.raw_samples or {}).get("Tran", [])),
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
log.info(
|
||||||
|
"get_events: 5A metadata-only download for key=%s", cur_key.hex()
|
||||||
|
)
|
||||||
|
a5_frames = proto.read_bulk_waveform_stream(
|
||||||
|
cur_key, stop_after_metadata=True,
|
||||||
|
include_terminator=True,
|
||||||
|
extra_chunks_after_metadata=extra_chunks_after_metadata,
|
||||||
|
max_chunks=128,
|
||||||
|
)
|
||||||
|
if a5_frames:
|
||||||
|
a5_ok = True
|
||||||
|
ev._a5_frames = a5_frames # store for write_blastware_file
|
||||||
|
_decode_a5_metadata_into(a5_frames, ev)
|
||||||
|
log.debug(
|
||||||
|
"get_events: 5A metadata client=%r operator=%r",
|
||||||
|
ev.project_info.client if ev.project_info else None,
|
||||||
|
ev.project_info.operator if ev.project_info else None,
|
||||||
|
)
|
||||||
|
except ProtocolError as exc:
|
||||||
|
log.warning(
|
||||||
|
"get_events: 5A failed for key=%s: %s — metadata unavailable",
|
||||||
|
cur_key.hex(), exc,
|
||||||
|
)
|
||||||
|
|
||||||
# SUB 1F — loop iteration.
|
# SUB 1F — loop iteration.
|
||||||
#
|
#
|
||||||
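The skip-5A decision in this hunk reduces to a string comparison on a (key, timestamp) pair. The sketch below isolates that rule so the post-erase key-reuse case is easy to see. The function name `should_skip_bulk_download` and the example timestamps are hypothetical; only the comparison logic and the `0x01110000` reset value are taken from the diff.

```python
from typing import Mapping, Optional


def should_skip_bulk_download(
    skip_events: Optional[Mapping[str, str]],
    key_hex: str,
    actual_ts: str,
) -> bool:
    """True only when the key is known AND its stored timestamp matches.

    An empty timestamp on either side never matches, so an event whose
    timestamp could not be decoded is always downloaded in full.
    """
    expected_ts = (skip_events or {}).get(key_hex, "")
    return bool(expected_ts and actual_ts and expected_ts == actual_ts)


# State persisted from an earlier session (values are made up).
stored = {"01110000": "2026-05-06T22:33:52"}

same_event = should_skip_bulk_download(stored, "01110000", "2026-05-06T22:33:52")
# After a memory erase the key counter restarts, so the same key can point
# at a different physical event with a different timestamp.
reused_key = should_skip_bulk_download(stored, "01110000", "2026-05-07T08:00:00")
unknown_key = should_skip_bulk_download(stored, "01110004", "2026-05-06T22:33:52")
no_timestamp = should_skip_bulk_download(stored, "01110000", "")
```

Failing closed (download again when in doubt) costs transfer time but never leaves a new event mistaken for one already on disk.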
@@ -652,7 +663,14 @@ class MiniMateClient:
|
|||||||
# Confirmed from 4-3-26 browse-mode captures: browse=True params
|
# Confirmed from 4-3-26 browse-mode captures: browse=True params
|
||||||
# are correct for multi-event iteration. Conditional logic added
|
# are correct for multi-event iteration. Conditional logic added
|
||||||
# 2026-04-06 to avoid post-failure state disruption.
|
# 2026-04-06 to avoid post-failure state disruption.
|
||||||
if a5_ok:
|
#
|
||||||
|
# NEW 2026-05-06: when skip_5a=True we never entered the 5A
|
||||||
|
# state at all (we read 0A+1E(arm)+0C and chose to bypass).
|
||||||
|
# 1F(browse) is safe in this scenario — the device's iteration
|
||||||
|
# pointer is independent of the bulk-stream state machine, and
|
||||||
|
# we never put it into the half-attempted 5A state that the
|
||||||
|
# earlier "post-failure 1F disruption" warning is about.
|
||||||
|
if skip_5a or a5_ok:
|
||||||
# 5A succeeded — use browse 1F for reliable key advancement.
|
# 5A succeeded — use browse 1F for reliable key advancement.
|
||||||
try:
|
try:
|
||||||
key4, data8 = proto.advance_event(browse=True)
|
key4, data8 = proto.advance_event(browse=True)
|
||||||
@@ -1174,6 +1192,27 @@ class MiniMateClient:
|
|||||||
# Pure functions: bytes → model field population.
|
# Pure functions: bytes → model field population.
|
||||||
# Kept here (not in models.py) to isolate protocol knowledge from data shapes.
|
# Kept here (not in models.py) to isolate protocol knowledge from data shapes.
|
||||||
|
|
||||||
def _event_timestamp_iso(event: Event) -> str:
    """
    Return a stable ISO-8601 string for the event's 0C-derived timestamp,
    or "" if the event has no timestamp populated.

    The format intentionally matches what `bridges/ach_server.py` writes
    into `ach_state.json:downloaded_events[*]` so the (key, ts) compare
    in get_events()'s skip path is a simple string equality.
    """
    ts = getattr(event, "timestamp", None)
    if ts is None:
        return ""
    try:
        return datetime.datetime(
            ts.year, ts.month, ts.day,
            ts.hour or 0, ts.minute or 0, ts.second or 0,
        ).isoformat()
    except Exception:
        return str(ts)


def _decode_serial_number(data: bytes) -> DeviceInfo:
|
def _decode_serial_number(data: bytes) -> DeviceInfo:
|
||||||
"""
|
"""
|
||||||
Decode SUB EA (SERIAL_NUMBER_RESPONSE) payload into a new DeviceInfo.
|
Decode SUB EA (SERIAL_NUMBER_RESPONSE) payload into a new DeviceInfo.
|
||||||
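For reference, this is roughly what `_event_timestamp_iso` produces for a few inputs. The sketch uses a hypothetical `TimestampSketch` dataclass in place of the project's `Timestamp` model, whose exact fields are an assumption here, and it takes the timestamp directly rather than an `Event`.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimestampSketch:
    """Hypothetical stand-in for the project's Timestamp model."""
    year: int
    month: int
    day: int
    hour: Optional[int] = None
    minute: Optional[int] = None
    second: Optional[int] = None


def timestamp_iso(ts: Optional[TimestampSketch]) -> str:
    """Render a timestamp as an ISO-8601 string; '' when absent."""
    if ts is None:
        return ""
    try:
        return datetime.datetime(
            ts.year, ts.month, ts.day,
            ts.hour or 0, ts.minute or 0, ts.second or 0,
        ).isoformat()
    except (TypeError, ValueError):
        # Out-of-range or missing date fields: fall back to a plain string.
        return str(ts)


full = timestamp_iso(TimestampSketch(2026, 5, 6, 22, 33, 52))
date_only = timestamp_iso(TimestampSketch(2026, 5, 6))
missing = timestamp_iso(None)
```

Because missing time fields collapse to zero, two events on the same day without time-of-day data would compare equal; the (key, timestamp) check is only as strong as the decoded timestamp.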
@@ -1323,28 +1362,40 @@ def _decode_waveform_record_into(data: bytes, event: Event) -> None:
|
|||||||
|
|
||||||
Modifies event in-place.
|
Modifies event in-place.
|
||||||
"""
|
"""
|
||||||
# ── Record type ───────────────────────────────────────────────────────────
|
# ── Record type + format detection ────────────────────────────────────────
|
||||||
# Decoded from byte[1] (sub_code) first so we can gate timestamp parsing.
|
# `record_type` is the user-facing label ("Waveform" for any triggered
|
||||||
|
# event regardless of timestamp-header layout). `fmt` is the internal
|
||||||
|
# format code used to pick the right Timestamp parser; it stays
|
||||||
|
# internal and doesn't leak to the API / sidecar / UI.
|
||||||
try:
|
try:
|
||||||
event.record_type = _extract_record_type(data)
|
event.record_type = _extract_record_type(data)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
log.warning("waveform record type decode failed: %s", exc)
|
log.warning("waveform record type decode failed: %s", exc)
|
||||||
|
fmt = _detect_record_format(data)
|
||||||
|
|
||||||
# ── Timestamp ─────────────────────────────────────────────────────────────
|
# ── Timestamp ─────────────────────────────────────────────────────────────
|
||||||
# 9-byte format for sub_code=0x10 Waveform records:
|
# Three timestamp-header layouts have been observed across BE11529
|
||||||
# [day][sub_code][month][year:2 BE][unknown][hour][min][sec]
|
# firmware S338.17 — each picks a different Timestamp parser:
|
||||||
# sub_code=0x10 and sub_code=0x03 have different timestamp byte layouts.
|
# "single_shot": 9-byte [day][0x10][month][year:2][unk][h][m][s]
|
||||||
# Both confirmed against Blastware event reports (BE11529, 2026-04-01 and 2026-04-03).
|
# "continuous": 10-byte [0x10][day][0x10][month][year:2][unk][h][m][s]
|
||||||
if event.record_type == "Waveform":
|
# "short": 8-byte [day][month][year:2][unk][h][m][s]
|
||||||
|
# All decoded into the same Timestamp dataclass — only the byte
|
||||||
|
# offsets differ.
|
||||||
|
if fmt == "single_shot":
|
||||||
try:
|
try:
|
||||||
event.timestamp = Timestamp.from_waveform_record(data)
|
event.timestamp = Timestamp.from_waveform_record(data)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
log.warning("waveform record timestamp decode failed: %s", exc)
|
log.warning("single_shot record timestamp decode failed: %s", exc)
|
||||||
elif event.record_type == "Waveform (Continuous)":
|
elif fmt == "continuous":
|
||||||
try:
|
try:
|
||||||
event.timestamp = Timestamp.from_continuous_record(data)
|
event.timestamp = Timestamp.from_continuous_record(data)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
log.warning("continuous record timestamp decode failed: %s", exc)
|
log.warning("continuous record timestamp decode failed: %s", exc)
|
||||||
|
elif fmt == "short":
|
||||||
|
try:
|
||||||
|
event.timestamp = Timestamp.from_short_record(data)
|
||||||
|
except Exception as exc:
|
||||||
|
log.warning("short record timestamp decode failed: %s", exc)
|
||||||
|
|
||||||
# ── Peak values (per-channel PPV + Peak Vector Sum) ───────────────────────
|
# ── Peak values (per-channel PPV + Peak Vector Sum) ───────────────────────
|
||||||
try:
|
try:
|
||||||
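The three timestamp-header layouts listed in the comment above differ in where the big-endian year sits (bytes [3:5], [4:6], or [2:4]) and in which `0x10` marker bytes are present. A minimal sketch of disambiguating them by marker bytes plus a sane-year check follows. This is an illustration under stated assumptions, not the module's `_detect_record_format`: the function names are hypothetical, and the check order (two markers, then one marker, then none) is assumed.

```python
from typing import Optional

MARKER = 0x10


def sane_year(data: bytes, pos: int) -> bool:
    """True if the big-endian uint16 at data[pos:pos+2] is a plausible year."""
    if len(data) < pos + 2:
        return False
    return 2015 <= int.from_bytes(data[pos:pos + 2], "big") <= 2050


def detect_format_sketch(data: bytes) -> Optional[str]:
    """Pick 'continuous', 'single_shot', 'short', or None."""
    if len(data) < 8:
        return None
    # Two marker bytes are the strongest evidence, so test that layout first.
    if (len(data) >= 10 and data[0] == MARKER and data[2] == MARKER
            and sane_year(data, 4)):
        return "continuous"
    if len(data) >= 9 and data[1] == MARKER and sane_year(data, 3):
        return "single_shot"
    # No marker bytes at all: rely on the year position alone.
    if sane_year(data, 2):
        return "short"
    return None


# 2026 encodes as 0x07 0xEA; day=5, month=4, time 22:33:52.
single = bytes([5, 0x10, 4, 0x07, 0xEA, 0, 22, 33, 52])
continuous = bytes([0x10, 5, 0x10, 4, 0x07, 0xEA, 0, 22, 33, 52])
short = bytes([5, 4, 0x07, 0xEA, 0, 22, 33, 52])
```

Checking the marker-bearing layouts first avoids misreading a record whose day or month byte happens to make a markerless position look like a valid year.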
@@ -1449,22 +1500,69 @@ def _decode_a5_waveform(
|
|||||||
(BULK_WAVEFORM_STREAM) frame payloads and populate event.raw_samples,
|
(BULK_WAVEFORM_STREAM) frame payloads and populate event.raw_samples,
|
||||||
event.total_samples, event.pretrig_samples, and event.rectime_seconds.
|
event.total_samples, event.pretrig_samples, and event.rectime_seconds.
|
||||||
|
|
||||||
This requires ALL A5 frames (stop_after_metadata=False), not just the
|
Wired up 2026-05-11 to the verified ``decode_waveform_v2`` codec (see
|
||||||
metadata-bearing subset.
|
``minimateplus/waveform_codec.py`` and ``docs/waveform_codec_re_status.md``).
|
||||||
|
Replaces the legacy int16 LE decoder, which produced full-scale ±32K
|
||||||
|
noise on every event because the body bytes are encoded, not raw
|
||||||
|
samples.
|
||||||
|
|
||||||
── Waveform format (confirmed from 4-2-26 blast capture) ───────────────────
|
Output convention (preserved from the legacy decoder):
|
||||||
The blast waveform is 4-channel interleaved signed 16-bit little-endian,
|
``event.raw_samples`` is a dict with keys "Tran", "Vert", "Long",
|
||||||
8 bytes per sample-set:
|
"MicL" mapping to lists of **int16 ADC counts**. Multiply by
|
||||||
|
``geo_range / 32768`` for geo channels to get in/s; use
|
||||||
|
:func:`minimateplus.waveform_codec.mic_count_to_db` for mic dB(L).
|
||||||
|
|
||||||
|
``total_samples`` / ``pretrig_samples`` / ``rectime_seconds`` are set
|
||||||
|
to ``None`` so the caller backfills from compliance_config (the
|
||||||
|
authoritative source — STRT fields aren't reliable).
|
||||||
|
"""
|
||||||
|
from .waveform_codec import decode_a5_frames
|
||||||
|
|
||||||
|
event.total_samples = None
|
||||||
|
event.pretrig_samples = None
|
||||||
|
event.rectime_seconds = None
|
||||||
|
|
||||||
|
if not frames_data:
|
||||||
|
log.debug("_decode_a5_waveform: no frames provided")
|
||||||
|
return
|
||||||
|
|
||||||
|
decoded = decode_a5_frames(frames_data)
|
||||||
|
if decoded is None:
|
||||||
|
log.warning("_decode_a5_waveform: codec returned no samples")
|
||||||
|
return
|
||||||
|
|
||||||
|
event.raw_samples = decoded
|
||||||
|
log.debug(
|
||||||
|
"_decode_a5_waveform: decoded %d/%d/%d/%d samples (T/V/L/M)",
|
||||||
|
len(decoded.get("Tran", [])),
|
||||||
|
len(decoded.get("Vert", [])),
|
||||||
|
len(decoded.get("Long", [])),
|
||||||
|
len(decoded.get("MicL", [])),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_a5_waveform_LEGACY(
|
||||||
|
frames_data: list[S3Frame],
|
||||||
|
event: Event,
|
||||||
|
) -> None:
|
||||||
|
"""
|
||||||
|
LEGACY decoder — kept for reference only. DO NOT CALL.
|
||||||
|
|
||||||
|
This is the int16 LE decoder that produced full-scale ±32K noise
|
||||||
|
on every event. Retracted 2026-05-08; replaced 2026-05-11 with
|
||||||
|
the verified codec in :mod:`minimateplus.waveform_codec`. See
|
||||||
|
``docs/instantel_protocol_reference.md §7.6.1`` for the full history.
|
||||||
|
|
||||||
|
── Waveform format (LEGACY — WRONG) ────────────────────────────────
|
||||||
|
Claimed 4-channel interleaved signed 16-bit little-endian, 8 bytes
|
||||||
|
per sample-set:
|
||||||
|
|
||||||
[T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N
|
[T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N
|
||||||
|
|
||||||
where T=Tran, V=Vert, L=Long, M=Mic. Channel ordering follows the
|
where T=Tran, V=Vert, L=Long, M=Mic.
|
||||||
Blastware convention [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
|
|
||||||
|
|
||||||
⚠️ Channel ordering is a confirmed CONVENTION — the physical ordering on
|
The body bytes are actually a tagged delta+RLE stream — this
|
||||||
the ADC mux is not independently verifiable from the saturating blast
|
interpretation was wrong.
|
||||||
captures we have. The convention is consistent with Blastware labeling
|
|
||||||
(Tran is always the first channel field in the A5 STRT+waveform stream).
|
|
||||||
|
|
||||||
── Frame structure ──────────────────────────────────────────────────────────
|
── Frame structure ──────────────────────────────────────────────────────────
|
||||||
A5[0] (probe response):
|
A5[0] (probe response):
|
||||||
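The new `_decode_a5_waveform` docstring states that geo-channel values are int16 ADC counts and that multiplying by `geo_range / 32768` yields in/s. A short sketch of that scaling is shown here. The function name `counts_to_ips` and the 10.0 in/s range are hypothetical example values; the mic channel is left out because its conversion lives in `mic_count_to_db`, which is not shown in this diff.

```python
from typing import Iterable, List

FULL_SCALE_COUNTS = 32768  # magnitude of the most negative int16 value


def counts_to_ips(counts: Iterable[int], geo_range_ips: float) -> List[float]:
    """Scale int16 ADC counts to velocity in in/s for one geophone channel."""
    scale = geo_range_ips / FULL_SCALE_COUNTS
    return [c * scale for c in counts]


# Hypothetical range setting of 10.0 in/s full scale.
velocities = counts_to_ips([0, 16384, -32768], 10.0)
```

The range value itself has to come from the unit's configuration, since the raw samples carry no scaling information of their own.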
@@ -1518,46 +1616,109 @@ def _decode_a5_waveform(
|
|||||||
log.warning("_decode_a5_waveform: STRT record truncated (%dB)", len(strt))
|
log.warning("_decode_a5_waveform: STRT record truncated (%dB)", len(strt))
|
||||||
return
|
return
|
||||||
|
|
||||||
total_samples = struct.unpack_from(">H", strt, 8)[0]
|
# STRT byte layout (21 bytes; verified against M529LIY6 reference files
|
||||||
pretrig_samples = struct.unpack_from(">H", strt, 16)[0]
|
# and re-confirmed against live BE11529 captures, 2026-05-08):
|
||||||
rectime_seconds = strt[18]
|
# [0:4] b'STRT'
|
||||||
|
# [4:6] 0xff 0xfe sentinel
|
||||||
|
# [6:10] end_key 4-byte BE flash address where event ends
|
||||||
|
# [10:14] start_key 4-byte BE flash address where event starts
|
||||||
|
# [14:18] device-specific (semantics not pinned; values vary across events
|
||||||
|
# and don't hold authoritative total_samples / pretrig)
|
||||||
|
# [18] 0x46 record-type marker (NOT rectime)
|
||||||
|
# [19] device-specific
|
||||||
|
# [20] sometimes rectime, sometimes 0 — not reliable
|
||||||
|
#
|
||||||
|
# AUTHORITATIVE values must come from compliance_config (sample_rate,
|
||||||
|
# record_time) and from end_offset - start_offset arithmetic (event size).
|
||||||
|
# Earlier code claimed STRT[8:10]=total_samples and STRT[16:18]=pretrig;
|
||||||
|
# those positions actually overlap end_key low-word and dev-specific bytes
|
||||||
|
# respectively. We surface the address-derived event size so consumers
|
||||||
|
# can sanity-check chunk-loop bounds, but `total_samples` per channel must
|
||||||
|
# be derived externally (sample_rate × record_time, or computed from the
|
||||||
|
# decoded sample count below).
|
||||||
|
end_key = strt[6:10]
|
||||||
|
start_key = strt[10:14]
|
||||||
|
end_offset_in_strt = (end_key[2] << 8) | end_key[3]
|
||||||
|
start_offset_in_strt = (start_key[2] << 8) | start_key[3]
|
||||||
|
is_event_1 = (start_offset_in_strt == 0x0000)
|
||||||
|
|
||||||
event.total_samples = total_samples
|
# Don't trust STRT for these — leave them as None so the caller can
|
||||||
event.pretrig_samples = pretrig_samples
|
# backfill from compliance_config (the authoritative source).
|
||||||
event.rectime_seconds = rectime_seconds
|
event.total_samples = None
|
||||||
|
event.pretrig_samples = None
|
||||||
|
event.rectime_seconds = None
|
||||||
|
|
||||||
log.debug(
|
log.debug(
|
||||||
"_decode_a5_waveform: STRT total_samples=%d pretrig=%d rectime=%ds",
|
"_decode_a5_waveform: STRT start_key=%s end_key=%s "
|
||||||
total_samples, pretrig_samples, rectime_seconds,
|
"start_off=0x%04X end_off=0x%04X is_event_1=%s "
|
||||||
|
"dev-specific[14:18]=%s strt[20]=0x%02X",
|
||||||
|
start_key.hex(), end_key.hex(),
|
||||||
|
start_offset_in_strt, end_offset_in_strt, is_event_1,
|
||||||
|
strt[14:18].hex(), strt[20],
|
||||||
)
|
)
|
||||||
|
|
||||||
# ── Collect per-frame waveform bytes with global offset tracking ─────────
|
# ── Collect per-frame waveform bytes with global offset tracking ─────────
|
||||||
# global_offset is the cumulative byte count across all frames, used to
|
# global_offset is the cumulative byte count across all frames, used to
|
||||||
# compute the channel alignment at each frame boundary.
|
# compute the channel alignment at each frame boundary.
|
||||||
|
#
|
||||||
|
# Frame layout under the v0.14.0+ walk:
|
||||||
|
# frames_data[0] = probe response (page_addr 0x0000;
|
||||||
|
# contains STRT + post-STRT data)
|
||||||
|
# frames_data[1..2] = (event 1 only) metadata pages
|
||||||
|
# page_addr = 0x1002 / 0x1004
|
||||||
|
# frames_data[mid] = sample chunks at flash addresses
|
||||||
|
# 0x0600, 0x0800, … (page_addr in
|
||||||
|
# {0x0600..0x1FFE})
|
||||||
|
# frames_data[last] = TERM response (page_key=0x0000)
|
||||||
|
#
|
||||||
|
# We identify metadata pages by their PAGE ADDRESS at db.data[4:6] (the
|
||||||
|
# 2-byte counter the device echoes back), NOT by content scan. An earlier
|
||||||
|
# needle-based detection (b"Project:", b"Client:", etc.) was the wrong
|
||||||
|
# layer of abstraction:
|
||||||
|
# • The actual metadata pages 0x1002 / 0x1004 do NOT contain ASCII
|
||||||
|
# project strings on this firmware (S338.17 / BE11529).
|
||||||
|
# • The strings physically live at flash address 0x1600 — which falls
|
||||||
|
# inside the sample-chunk address range. Skipping that frame would
|
||||||
|
# drop a real sample chunk.
|
||||||
|
# BW handles the "samples region happens to contain string bytes" case
|
||||||
|
# by just rendering the bytes verbatim; we do the same.
|
||||||
|
_METADATA_PAGES = (b"\x10\x02", b"\x10\x04")
|
||||||
|
|
||||||
chunks: list[tuple[int, bytes]] = [] # (frame_idx, waveform_bytes)
|
chunks: list[tuple[int, bytes]] = [] # (frame_idx, waveform_bytes)
|
||||||
global_offset = 0
|
global_offset = 0
|
||||||
|
|
||||||
for fi, db in enumerate(frames_data):
|
for fi, db in enumerate(frames_data):
|
||||||
|
page_addr = db.data[4:6] if len(db.data) >= 6 else b""
|
||||||
w = db.data[7:] # frame.data[7:]
|
w = db.data[7:] # frame.data[7:]
|
||||||
|
|
||||||
# A5[0]: waveform begins after the 21-byte STRT record and 6-byte preamble.
|
# A5[0]: probe response. Two cases:
|
||||||
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
|
# - Event 1 (start_offset_in_strt == 0x0000): the bytes after STRT
|
||||||
|
# are the device's *pre-event reserved area* (flash 0x0046 to
|
||||||
|
# 0x0600), NOT samples. We must skip them; samples begin at
|
||||||
|
# the first dedicated chunk frame at counter=0x0600.
|
||||||
|
# - Event N (continuation, start_offset != 0x0000): the bytes after
|
||||||
|
# the STRT record ARE the first slice of real samples for the
|
||||||
|
# event (BW's chunk loop addresses the probe as a sample chunk).
|
||||||
if fi == 0:
|
if fi == 0:
|
||||||
sp = w.find(b"STRT")
|
sp = w.find(b"STRT")
|
||||||
if sp < 0:
|
if sp < 0:
|
||||||
continue
|
continue
|
||||||
|
if is_event_1:
|
||||||
|
# No usable samples in the probe — pre-event reserved bytes.
|
||||||
|
continue
|
||||||
|
# Layout: STRT(21B) + null-pad(2B) + 0xFF sentinel(4B) = 27 bytes total.
|
||||||
wave = w[sp + 27 :]
|
wave = w[sp + 27 :]
|
||||||
|
|
||||||
# Frame 7 carries event-time metadata strings ("Project:", "Client:", …)
|
# Skip the dedicated metadata pages (event 1 only): page_addr 0x1002 / 0x1004.
|
||||||
# and no waveform ADC data.
|
elif page_addr in _METADATA_PAGES:
|
||||||
elif fi == 7:
|
log.debug(
|
||||||
|
"_decode_a5_waveform: skipping metadata page fi=%d page_addr=%s",
|
||||||
|
fi, page_addr.hex(),
|
||||||
|
)
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Terminator frames have page_key=0x0000 and are excluded upstream
|
# Sample chunk (or TERM): strip the 8-byte per-frame header.
|
||||||
# (read_bulk_waveform_stream returns early on page_key==0).
|
|
||||||
# No hardcoded frame-index skip here — all non-metadata frames are data.
|
|
||||||
else:
|
else:
|
||||||
# Strip the 8-byte per-frame header (ctr + 6 zero bytes)
|
|
||||||
if len(w) < 8:
|
if len(w) < 8:
|
||||||
continue
|
continue
|
||||||
wave = w[8:]
|
wave = w[8:]
|
||||||
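The STRT byte layout documented in this hunk can be exercised with a small parser. The sketch below pulls out the two 4-byte keys and their low 16-bit offsets exactly as the comment describes. The names `StrtFields` and `parse_strt_sketch` and the example key values are hypothetical, and the device-specific bytes are deliberately left uninterpreted.

```python
from typing import NamedTuple, Optional


class StrtFields(NamedTuple):
    start_key: bytes
    end_key: bytes
    start_offset: int
    end_offset: int
    is_event_1: bool


def parse_strt_sketch(strt: bytes) -> Optional[StrtFields]:
    """Extract the address fields from a 21-byte STRT record."""
    if len(strt) < 21 or strt[0:4] != b"STRT":
        return None
    end_key = strt[6:10]     # big-endian flash address where the event ends
    start_key = strt[10:14]  # big-endian flash address where it starts
    start_offset = int.from_bytes(start_key[2:4], "big")
    end_offset = int.from_bytes(end_key[2:4], "big")
    return StrtFields(start_key, end_key, start_offset, end_offset,
                      is_event_1=(start_offset == 0x0000))


# Made-up record: magic, sentinel, end key, start key, 4 device-specific
# bytes, the 0x46 marker at [18], then two more device-specific bytes.
record = (b"STRT" + b"\xff\xfe"
          + bytes.fromhex("01111a2c") + bytes.fromhex("01110000")
          + bytes(4) + b"\x46" + bytes(2))
fields = parse_strt_sketch(record)
```

Nothing here derives sample counts or record time from STRT, in line with the comment's point that those values must come from compliance_config.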
@@ -1571,10 +1732,8 @@ def _decode_a5_waveform(
|
|||||||
total_bytes = global_offset
|
total_bytes = global_offset
|
||||||
n_sets = total_bytes // 8
|
n_sets = total_bytes // 8
|
||||||
log.debug(
|
log.debug(
|
||||||
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets "
|
"_decode_a5_waveform: %d chunks, %dB total → %d complete sample-sets",
|
||||||
"(%d of %d expected; %.0f%%)",
|
len(chunks), total_bytes, n_sets,
|
||||||
len(chunks), total_bytes, n_sets, n_sets, total_samples,
|
|
||||||
100.0 * n_sets / total_samples if total_samples else 0,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
if n_sets == 0:
|
if n_sets == 0:
|
||||||
@@ -1632,38 +1791,85 @@ def _decode_a5_waveform(
|
|||||||
"Tran": tran,
|
"Tran": tran,
|
||||||
"Vert": vert,
|
"Vert": vert,
|
||||||
"Long": long_,
|
"Long": long_,
|
||||||
"Mic": mic,
|
"MicL": mic,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _detect_record_format(data: bytes) -> Optional[str]:
|
||||||
|
"""
|
||||||
|
Detect which timestamp-header format a 210-byte 0C waveform record uses.
|
||||||
|
|
||||||
|
THREE formats observed on BE11529 firmware S338.17:
|
||||||
|
|
||||||
|
"single_shot" — 9-byte header:
|
||||||
|
[day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
|
||||||
|
sub_code=0x10 at byte [1]. Year at [3:5].
|
||||||
|
|
||||||
|
"continuous" — 10-byte header:
|
||||||
|
[0x10] [day] [0x10] [month] [year_BE:2] [unknown] [hour] [min] [sec]
|
||||||
|
marker 0x10 at byte [0] AND byte [2]. Year at [4:6].
|
||||||
|
|
||||||
|
"short" — 8-byte header (NEW 2026-05-01):
|
||||||
|
[day] [month] [year_BE:2] [unknown] [hour] [min] [sec]
|
||||||
|
No marker bytes. Year at [2:4].
|
||||||
|
|
||||||
|
Each format has the year (uint16 BE) at a UNIQUE byte position, so we can
|
||||||
|
disambiguate by scanning each candidate position and picking the one
|
||||||
|
where the year falls in a sane range (2015..2050).
|
||||||
|
|
||||||
|
Returns "single_shot" / "continuous" / "short" or None if no format matches.
|
||||||
|
"""
|
||||||
|
if len(data) < 8:
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _sane_year(hi: int, lo: int) -> bool:
|
||||||
|
y = (hi << 8) | lo
|
||||||
|
return 2015 <= y <= 2050
|
||||||
|
|
||||||
|
# Order matters: prefer formats with stronger marker-byte evidence first.
|
||||||
|
if data[1] == 0x10 and len(data) >= 9 and _sane_year(data[3], data[4]):
|
||||||
|
return "single_shot"
|
||||||
|
if (data[0] == 0x10 and data[2] == 0x10
|
||||||
|
and len(data) >= 10 and _sane_year(data[4], data[5])):
|
||||||
|
return "continuous"
|
||||||
|
if _sane_year(data[2], data[3]):
|
||||||
|
return "short"
|
||||||
|
return None
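The year-position check described in the docstring above can be exercised on synthetic headers. This is a minimal standalone sketch, not the module's own code: `detect_header_layout` is a hypothetical copy of the logic, and the header bytes are made up for 1 May 2026, 12:34:56 rather than captured from a device.

```python
from typing import Optional


def detect_header_layout(data: bytes) -> Optional[str]:
    """Hypothetical standalone copy of the year-position disambiguation."""
    if len(data) < 8:
        return None

    def sane_year(hi: int, lo: int) -> bool:
        # Year is a big-endian uint16; accept only a plausible range.
        return 2015 <= ((hi << 8) | lo) <= 2050

    # Formats with marker bytes are tested first, as in the diff above.
    if data[1] == 0x10 and len(data) >= 9 and sane_year(data[3], data[4]):
        return "single_shot"
    if (data[0] == 0x10 and data[2] == 0x10
            and len(data) >= 10 and sane_year(data[4], data[5])):
        return "continuous"
    if sane_year(data[2], data[3]):
        return "short"
    return None


# Synthetic headers (illustrative values only, not real captures).
year = (2026).to_bytes(2, "big")          # b"\x07\xea"
tail = bytes([0, 12, 34, 56])             # [unknown] [hour] [min] [sec]
single_shot = bytes([1, 0x10, 5]) + year + tail
continuous = bytes([0x10, 1, 0x10, 5]) + year + tail
short = bytes([1, 5]) + year + tail

for name, hdr in [("single_shot", single_shot),
                  ("continuous", continuous),
                  ("short", short)]:
    print(name, len(hdr), detect_header_layout(hdr))
```

Because each layout puts the year at a different offset, a header that decodes to a sane year at one offset decodes to an implausible value at the others in these samples, which is what makes the ordering of the checks safe here.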
|
||||||
|
|
||||||
|
|
||||||
def _extract_record_type(data: bytes) -> Optional[str]:
|
def _extract_record_type(data: bytes) -> Optional[str]:
|
||||||
"""
|
"""
|
||||||
Decode the recording mode from byte[1] of the 210-byte waveform record.
|
Return a user-facing name for a waveform record. All three internal
|
||||||
|
timestamp-header layouts represent the *same* user concept — a
|
||||||
|
triggered seismic event — so they all surface as just "Waveform".
|
||||||
|
|
||||||
Byte[1] is the sub-record code that immediately follows the day byte in the
|
The internal format code is preserved for parsing logic (timestamp
|
||||||
9-byte timestamp header at the start of each waveform record:
|
decoder selection) but doesn't leak into the API / UI / sidecar.
|
||||||
[day:1] [sub_code:1] [month:1] [year:2 BE] ...
|
Callers that need the raw layout can call `_detect_record_format`
|
||||||
|
directly.
|
||||||
|
|
||||||
Confirmed codes (✅ 2026-04-01):
|
Background: across BE11529 firmware S338.17 we've observed three
|
||||||
0x10 → "Waveform" (continuous / single-shot mode)
|
different byte layouts for the timestamp header at the start of the
|
||||||
|
0C record (8 / 9 / 10 bytes, distinguished by the position of the
|
||||||
Histogram mode code is not yet confirmed — a histogram event must be
|
BE-encoded year and the presence of `0x10` marker bytes). An older
|
||||||
captured with debug=true to identify it. Returns None for unknown codes.
|
revision of this code labelled them "Waveform" / "Waveform
|
||||||
|
(Continuous)" / "Waveform (Short)", which created the false
|
||||||
|
impression that there were three distinct event "types" the user
|
||||||
|
could configure. In reality the user only ever picks Single Shot
|
||||||
|
vs Continuous vs Histogram in the compliance config — the byte
|
||||||
|
layout is a firmware-internal detail that doesn't always correlate
|
||||||
|
with that choice.
|
||||||
"""
|
"""
|
||||||
if len(data) < 2:
|
fmt = _detect_record_format(data)
|
||||||
return None
|
if fmt in ("single_shot", "continuous", "short"):
|
||||||
code = data[1]
|
|
||||||
if code == 0x10:
|
|
||||||
return "Waveform"
|
return "Waveform"
|
||||||
if code == 0x03:
|
if len(data) >= 3:
|
||||||
# Continuous mode waveform record (confirmed by user — NOT a monitor log).
|
log.warning(
|
||||||
# The byte layout differs from 0x10 single-shot records: the timestamp
|
"_extract_record_type: unrecognized header: data[0:3]=%02X %02X %02X",
|
||||||
# fields decode as garbage under the 0x10 waveform layout.
|
data[0], data[1], data[2],
|
||||||
# TODO: confirm correct timestamp layout for 0x03 records from a known-time event.
|
)
|
||||||
return "Waveform (Continuous)"
|
return f"Unknown({data[0]:02X}.{data[1]:02X}.{data[2]:02X})"
|
||||||
log.warning("_extract_record_type: unknown sub_code=0x%02X", code)
|
return None
|
||||||
return f"Unknown(0x{code:02X})"
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_peak_floats(data: bytes) -> Optional[PeakValues]:
|
def _extract_peak_floats(data: bytes) -> Optional[PeakValues]:
|
||||||
"""
|
"""
|
||||||
|
|||||||
@@ -0,0 +1,927 @@
|
|||||||
|
"""
|
||||||
|
minimateplus/event_file_io.py — modern event-file (.sfm.json sidecar) IO.
|
||||||
|
|
||||||
|
This module is the single home for event-file conversion code that doesn't
|
||||||
|
fit cleanly inside `blastware_file.py` (which is the BW binary codec):
|
||||||
|
|
||||||
|
- sidecar JSON read/write (the modern per-event metadata file)
|
||||||
|
- read_blastware_file() — reverse of write_blastware_file, used by
|
||||||
|
the BW-importer flow when SFM is ingesting files produced by
|
||||||
|
Blastware's own ACH (where the source A5 frames aren't available).
|
||||||
|
|
||||||
|
Sidecar schema v1 layout — see docs in the project plan or the schema
|
||||||
|
declared in `event_to_sidecar_dict()`.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import datetime
|
||||||
|
import hashlib
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import struct
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional, Union
|
||||||
|
|
||||||
|
from .models import Event, PeakValues, ProjectInfo, Timestamp
|
||||||
|
from . import blastware_file as _bw # avoid circular reference at module load
|
||||||
|
from .bw_ascii_report import BwAsciiReport
|
||||||
|
from .waveform_codec import decode_waveform_v2, decoded_to_adc_counts
|
||||||
|
from .histogram_codec import decode_histogram_body
|
||||||
|
|
||||||
|
# Reference pressure for dB(L) → psi conversion (20 µPa expressed in psi).
|
||||||
|
# Same constant as sfm/sfm_webapp.html so server-side and browser-side
|
||||||
|
# conversions agree.
|
||||||
|
_DBL_REF_PSI = 2.9e-9
|
||||||
|
|
||||||
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Schema version for the sidecar JSON. Bump when fields change shape.
|
||||||
|
# Older readers must reject anything > SCHEMA_VERSION; newer fields added
|
||||||
|
# inside `extensions` are forward-compatible without a bump.
|
||||||
|
SCHEMA_VERSION = 1
|
||||||
|
SIDECAR_KIND = "sfm.event"
|
||||||
|
|
||||||
|
# Default tool_version stamp; callers can override. Hard-coded here
|
||||||
|
# rather than read via importlib.metadata because the latter reflects the
|
||||||
|
# *installed* dist-info, which doesn't update when pyproject.toml is
|
||||||
|
# bumped without a `pip install` re-run — leading to confusing stale
|
||||||
|
# version stamps in sidecars. Bump this constant and CHANGELOG.md
|
||||||
|
# together at release time.
|
||||||
|
TOOL_VERSION = "0.21.1"
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Best-effort: prefer the installed metadata when it's NEWER than the
|
||||||
|
# baked-in constant (e.g. a downstream packager bumped the wheel
|
||||||
|
# without editing this file). Otherwise fall back to TOOL_VERSION.
|
||||||
|
from importlib.metadata import version as _pkg_version
|
||||||
|
_meta_v = _pkg_version("seismo-relay")
|
||||||
|
def _vtuple(s):
|
||||||
|
try:
|
||||||
|
return tuple(int(p) for p in s.split(".")[:3])
|
||||||
|
except Exception:
|
||||||
|
return (0, 0, 0)
|
||||||
|
_TOOL_VERSION_DEFAULT = (
|
||||||
|
_meta_v if _vtuple(_meta_v) > _vtuple(TOOL_VERSION) else TOOL_VERSION
|
||||||
|
)
|
||||||
|
except Exception:
|
||||||
|
_TOOL_VERSION_DEFAULT = TOOL_VERSION
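The "prefer the newer version" rule above depends on comparing integer tuples rather than strings. A minimal sketch of that comparison, assuming plain `X.Y.Z` version strings; `vtuple` and `pick_version` are hypothetical names used only for illustration:

```python
def vtuple(s: str) -> tuple:
    """Parse up to three dotted integer components; (0, 0, 0) if unparseable."""
    try:
        return tuple(int(p) for p in s.split(".")[:3])
    except ValueError:
        return (0, 0, 0)


def pick_version(installed: str, baked_in: str) -> str:
    """Return the installed version only when it is strictly newer."""
    return installed if vtuple(installed) > vtuple(baked_in) else baked_in


print(pick_version("0.22.0", "0.21.1"))     # installed metadata is newer
print(pick_version("0.9.5", "0.21.1"))      # a string compare would get this wrong
print(pick_version("0.22.0rc1", "0.21.1"))  # unparseable component falls back
```

Note that a pre-release suffix such as `rc1` makes the whole tuple fall back to `(0, 0, 0)` under this scheme, so the baked-in constant wins in that case.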
|
||||||
|
|
||||||
|
|
||||||
|
# ── Sidecar dict construction ─────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def _ts_iso(ts: Optional[Timestamp]) -> Optional[str]:
|
||||||
|
if ts is None:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return datetime.datetime(
|
||||||
|
ts.year, ts.month, ts.day,
|
||||||
|
ts.hour or 0, ts.minute or 0, ts.second or 0,
|
||||||
|
).isoformat()
|
||||||
|
except Exception:
|
||||||
|
return str(ts)
|
||||||
|
|
||||||
|
|
||||||
|
def _peak_values_to_dict(pv: Optional[PeakValues]) -> dict:
|
||||||
|
if pv is None:
|
||||||
|
return {
|
||||||
|
"transverse": None,
|
||||||
|
"vertical": None,
|
||||||
|
"longitudinal": None,
|
||||||
|
"vector_sum": None,
|
||||||
|
"mic_psi": None,
|
||||||
|
}
|
||||||
|
return {
|
||||||
|
"transverse": pv.tran,
|
||||||
|
"vertical": pv.vert,
|
||||||
|
"longitudinal": pv.long,
|
||||||
|
"vector_sum": pv.peak_vector_sum,
|
||||||
|
"mic_psi": pv.micl,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _bw_report_to_dict(report: BwAsciiReport) -> dict:
|
||||||
|
"""Project a parsed BW ASCII report into the sidecar's `bw_report` block.
|
||||||
|
|
||||||
|
All fields are rendered as plain JSON-compatible types (no datetime
|
||||||
|
objects). Channels are uniformly lowercased for stable JSON keys.
|
||||||
|
"""
|
||||||
|
def _ch(ch_name: str) -> dict:
|
||||||
|
cs = report.channels.get(ch_name)
|
||||||
|
if cs is None:
|
||||||
|
return {}
|
||||||
|
out = {
|
||||||
|
"ppv_ips": cs.ppv_ips,
|
||||||
|
"zc_freq_hz": cs.zc_freq_hz,
|
||||||
|
"time_of_peak_s": cs.time_of_peak_s,
|
||||||
|
"peak_accel_g": cs.peak_accel_g,
|
||||||
|
"peak_disp_in": cs.peak_disp_in,
|
||||||
|
}
|
||||||
|
# Drop all-None entries — keeps the JSON tidy for partial reports.

|
||||||
|
out = {k: v for k, v in out.items() if v is not None}
|
||||||
|
# Saturation flag (only present when True) — signals that ppv_ips
|
||||||
|
# is the channel range max (a lower bound), not an exact reading.
|
||||||
|
if getattr(cs, "ppv_saturated", False):
|
||||||
|
out["ppv_saturated"] = True
|
||||||
|
# ZC Freq above device reporting ceiling (BW ">100 Hz") — value
|
||||||
|
# in zc_freq_hz is the threshold, not an exact measurement.
|
||||||
|
if getattr(cs, "zc_freq_above_range", False):
|
||||||
|
out["zc_freq_above_range"] = True
|
||||||
|
return out
|
||||||
|
|
||||||
|
def _sc(ch_name: str) -> dict:
|
||||||
|
sc = report.sensor_check.get(ch_name)
|
||||||
|
if sc is None:
|
||||||
|
return {}
|
||||||
|
out = {
|
||||||
|
"freq_hz": sc.test_freq_hz,
|
||||||
|
"ratio": sc.test_ratio,
|
||||||
|
"amplitude_mv": sc.test_amplitude_mv,
|
||||||
|
"result": sc.test_results,
|
||||||
|
}
|
||||||
|
return {k: v for k, v in out.items() if v is not None}
|
||||||
|
|
||||||
|
monitor_log = []
|
||||||
|
for entry in report.monitor_log:
|
||||||
|
e = {
|
||||||
|
"start": entry.start_time.isoformat() if entry.start_time else None,
|
||||||
|
"stop": entry.stop_time.isoformat() if entry.stop_time else None,
|
||||||
|
"description": entry.description,
|
||||||
|
}
|
||||||
|
monitor_log.append({k: v for k, v in e.items() if v is not None})
|
||||||
|
|
||||||
|
return {
|
||||||
|
"available": True,
|
||||||
|
"event_type": report.event_type,
|
||||||
|
"version": report.version,
|
||||||
|
"trigger": {
|
||||||
|
"channel": report.trigger_channel,
|
||||||
|
"geo_level_ips": report.geo_trigger_level_ips,
|
||||||
|
},
|
||||||
|
"recording": {
|
||||||
|
"sample_rate_sps": report.sample_rate_sps,
|
||||||
|
"record_time_s": report.record_time_s,
|
||||||
|
"pretrig_s": report.pretrig_s,
|
||||||
|
"stop_mode": report.record_stop_mode,
|
||||||
|
"geo_range_ips": report.geo_range_ips,
|
||||||
|
"units": report.units,
|
||||||
|
},
|
||||||
|
"device": {
|
||||||
|
"battery_volts": report.battery_volts,
|
||||||
|
"calibration_date": report.calibration_date.isoformat() if report.calibration_date else None,
|
||||||
|
"calibration_by": report.calibration_by,
|
||||||
|
},
|
||||||
|
"peaks": {
|
||||||
|
"tran": _ch("Tran"),
|
||||||
|
"vert": _ch("Vert"),
|
||||||
|
"long": _ch("Long"),
|
||||||
|
"vector_sum": {
|
||||||
|
"ips": report.peak_vector_sum_ips,
|
||||||
|
"time_s": report.peak_vector_sum_time_s,
|
||||||
|
# Histogram events have an absolute date+time for the PVS
|
||||||
|
# (the interval at which it occurred); waveform events
|
||||||
|
# only have the time_s offset.
|
||||||
|
"when": report.peak_vector_sum_when.isoformat() if report.peak_vector_sum_when else None,
|
||||||
|
# Set when BW reported the PVS as OORANGE — value is the
|
||||||
|
# conservative upper bound sqrt(3) * geo_range_ips, not
|
||||||
|
# an exact peak.
|
||||||
|
"saturated": bool(getattr(report, "peak_vector_sum_saturated", False)),
|
||||||
|
},
|
||||||
|
},
|
||||||
|
"mic": {
|
||||||
|
"weighting": report.mic.weighting,
|
||||||
|
"pspl_dbl": report.mic.pspl_dbl,
|
||||||
|
"pspl_saturated": bool(getattr(report.mic, "pspl_saturated", False)),
|
||||||
|
"zc_freq_hz": report.mic.zc_freq_hz,
|
||||||
|
"zc_freq_above_range": bool(getattr(report.mic, "zc_freq_above_range", False)),
|
||||||
|
"time_of_peak_s": report.mic.time_of_peak_s,
|
||||||
|
},
|
||||||
|
"sensor_check": {
|
||||||
|
"tran": _sc("Tran"),
|
||||||
|
"vert": _sc("Vert"),
|
||||||
|
"long": _sc("Long"),
|
||||||
|
"mic": _sc("MicL"),
|
||||||
|
},
|
||||||
|
# Histogram-specific fields (None on waveform-mode events).
|
||||||
|
# Per-channel absolute peak time/date for histograms — for
|
||||||
|
# waveforms see peaks[ch]["time_of_peak_s"] instead.
|
||||||
|
"histogram": {
|
||||||
|
"start": report.histogram_start.isoformat() if report.histogram_start else None,
|
||||||
|
"stop": report.histogram_stop.isoformat() if report.histogram_stop else None,
|
||||||
|
"n_intervals": report.histogram_n_intervals,
|
||||||
|
"interval_size": report.histogram_interval_size_str,
|
||||||
|
"interval_size_s": report.histogram_interval_size_s,
|
||||||
|
"channel_peak_when": {ch: dt.isoformat() for ch, dt in report.channel_peak_when.items()},
|
||||||
|
},
|
||||||
|
"monitor_log": monitor_log,
|
||||||
|
"pc_sw_version": report.pc_sw_version,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _dbl_to_psi(pspl_dbl: float) -> float:
|
||||||
|
"""Convert dB(L) sound pressure level back to psi. Uses the same
|
||||||
|
20 µPa reference (= 2.9e-9 psi) as the webapp so server-side and
|
||||||
|
browser-side conversions agree."""
|
||||||
|
return _DBL_REF_PSI * (10.0 ** (pspl_dbl / 20.0))
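As a worked check of the dB(L) conversion, the sketch below repeats the formula with the same 2.9e-9 psi reference and adds an inverse. The inverse `psi_to_dbl` is a hypothetical helper for verification only and is not part of the diff.

```python
import math

DBL_REF_PSI = 2.9e-9  # 20 µPa expressed in psi, same value as the module constant


def dbl_to_psi(pspl_dbl: float) -> float:
    """psi = p_ref * 10^(dB / 20)."""
    return DBL_REF_PSI * (10.0 ** (pspl_dbl / 20.0))


def psi_to_dbl(psi: float) -> float:
    """Hypothetical inverse: dB = 20 * log10(p / p_ref)."""
    return 20.0 * math.log10(psi / DBL_REF_PSI)


# 0 dB(L) is the reference pressure itself; every +20 dB multiplies by 10.
print(dbl_to_psi(0.0))
print(dbl_to_psi(120.0))
print(psi_to_dbl(dbl_to_psi(134.2)))
```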
|
||||||
|
|
||||||
|
|
||||||
|
def apply_report_to_event(event: Event, report: BwAsciiReport) -> None:
|
||||||
|
"""Overlay device-authoritative fields from a parsed BW ASCII report
|
||||||
|
onto an in-memory Event, IN-PLACE.
|
||||||
|
|
||||||
|
Why this exists
|
||||||
|
───────────────
|
||||||
|
`read_blastware_file()` parses the BW binary and fills `Event.peak_values`
|
||||||
|
via `_peaks_from_samples()` — which runs the (still-undecoded) BW body
|
||||||
|
codec assuming raw int16 LE and produces ±32K-shaped noise on every
|
||||||
|
channel. Result: peak values land in the SeismoDb event row as
|
||||||
|
~10 in/s on every event regardless of the actual signal.
|
||||||
|
|
||||||
|
When a paired BW ASCII report is available, the report carries the
|
||||||
|
device's own authoritative peak / project / sample-rate / record-time
|
||||||
|
values. This helper folds those onto the Event before it flows to
|
||||||
|
`SeismoDb.insert_events()`, so the DB columns reflect the report
|
||||||
|
rather than the broken-codec output.
|
||||||
|
|
||||||
|
Fields overlaid (only when the report supplies a non-None value):
|
||||||
|
- peak_values.tran / .vert / .long (from report.channels)
|
||||||
|
- peak_values.peak_vector_sum (from report.peak_vector_sum_ips)
|
||||||
|
- peak_values.micl (psi) (from report.mic.pspl_dbl → psi)
|
||||||
|
- project_info.project / .client / .operator / .sensor_location
|
||||||
|
- sample_rate (from report.sample_rate_sps)
|
||||||
|
- rectime_seconds (from report.record_time_s)
|
||||||
|
|
||||||
|
Fields NOT touched (operator-edit / parser-output preserved):
|
||||||
|
- timestamp, raw_samples, record_type, total_samples,
|
||||||
|
pretrig_samples, _waveform_key, _a5_frames, _raw_record
|
||||||
|
- false_trigger and review state (those live on the sidecar, not on Event)
|
||||||
|
"""
|
||||||
|
if event.peak_values is None:
|
||||||
|
event.peak_values = PeakValues()
|
||||||
|
pv = event.peak_values
|
||||||
|
ch = report.channels
|
||||||
|
if (t := ch.get("Tran")) and t.ppv_ips is not None: pv.tran = t.ppv_ips
|
||||||
|
if (v := ch.get("Vert")) and v.ppv_ips is not None: pv.vert = v.ppv_ips
|
||||||
|
if (l := ch.get("Long")) and l.ppv_ips is not None: pv.long = l.ppv_ips
|
||||||
|
if report.peak_vector_sum_ips is not None:
|
||||||
|
pv.peak_vector_sum = report.peak_vector_sum_ips
|
||||||
|
if report.mic.pspl_dbl is not None and report.mic.pspl_dbl > 0:
|
||||||
|
pv.micl = _dbl_to_psi(report.mic.pspl_dbl)
|
||||||
|
|
||||||
|
if event.project_info is None:
|
||||||
|
event.project_info = ProjectInfo()
|
||||||
|
pi = event.project_info
|
||||||
|
if report.project: pi.project = report.project
|
||||||
|
if report.client: pi.client = report.client
|
||||||
|
if report.operator: pi.operator = report.operator
|
||||||
|
if report.sensor_location: pi.sensor_location = report.sensor_location
|
||||||
|
|
||||||
|
if report.sample_rate_sps:
|
||||||
|
event.sample_rate = report.sample_rate_sps
|
||||||
|
if report.record_time_s is not None:
|
||||||
|
event.rectime_seconds = report.record_time_s
|
||||||
|
|
||||||
|
|
||||||
|
def apply_bw_report_dict_to_event(event: Event, bw_report: dict) -> None:
|
||||||
|
"""Mirror of ``apply_report_to_event`` for the projected sidecar
|
||||||
|
dict shape (as produced by ``_bw_report_to_dict``).
|
||||||
|
|
||||||
|
Why this exists
|
||||||
|
───────────────
|
||||||
|
The ingest path holds a live ``BwAsciiReport`` parsed straight from
|
||||||
|
the ``_ASCII.TXT`` and uses ``apply_report_to_event`` to overlay
|
||||||
|
device-authoritative peaks onto the codec output before insert.
|
||||||
|
|
||||||
|
The backfill path doesn't have the original ``.TXT`` (it's not
|
||||||
|
retained in the waveform store), but it does have the preserved
|
||||||
|
``bw_report`` block from the sidecar — which contains the same
|
||||||
|
projected fields. Re-overlaying those during a backfill keeps the
|
||||||
|
DB peak columns aligned with what BW reports rather than letting
|
||||||
|
the codec output (which may be incomplete for unhandled formats or
|
||||||
|
walker edge cases) win by default.
|
||||||
|
|
||||||
|
No-ops cleanly when ``bw_report`` is ``None``, empty, or missing
|
||||||
|
any particular sub-field — only fields with a concrete value get
|
||||||
|
written. Mirrors ``apply_report_to_event``'s "report wins where
|
||||||
|
present" semantics.
|
||||||
|
"""
|
||||||
|
if not bw_report:
|
||||||
|
return
|
||||||
|
if event.peak_values is None:
|
||||||
|
event.peak_values = PeakValues()
|
||||||
|
pv = event.peak_values
|
||||||
|
|
||||||
|
peaks = bw_report.get("peaks") or {}
|
||||||
|
tran = (peaks.get("tran") or {}).get("ppv_ips")
|
||||||
|
vert = (peaks.get("vert") or {}).get("ppv_ips")
|
||||||
|
long = (peaks.get("long") or {}).get("ppv_ips")
|
||||||
|
if tran is not None: pv.tran = tran
|
||||||
|
if vert is not None: pv.vert = vert
|
||||||
|
if long is not None: pv.long = long
|
||||||
|
vs_ips = (peaks.get("vector_sum") or {}).get("ips")
|
||||||
|
if vs_ips is not None:
|
||||||
|
pv.peak_vector_sum = vs_ips
|
||||||
|
|
||||||
|
mic = bw_report.get("mic") or {}
|
||||||
|
pspl = mic.get("pspl_dbl")
|
||||||
|
if pspl is not None and pspl > 0:
|
||||||
|
pv.micl = _dbl_to_psi(pspl)
|
||||||
|
|
||||||
|
rec = bw_report.get("recording") or {}
|
||||||
|
sr = rec.get("sample_rate_sps")
|
||||||
|
if sr:
|
||||||
|
event.sample_rate = sr
|
||||||
|
rt = rec.get("record_time_s")
|
||||||
|
if rt is not None:
|
||||||
|
event.rectime_seconds = rt
|
||||||
|
|
||||||
|
|
||||||
|
def _project_info_to_dict(pi: Optional[ProjectInfo]) -> dict:
|
||||||
|
if pi is None:
|
||||||
|
return {
|
||||||
|
"project": None,
|
||||||
|
"client": None,
|
||||||
|
"operator": None,
|
||||||
|
"sensor_location": None,
|
||||||
|
}
|
||||||
|
return {
|
||||||
|
"project": pi.project,
|
||||||
|
"client": pi.client,
|
||||||
|
"operator": pi.operator,
|
||||||
|
"sensor_location": pi.sensor_location,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def event_to_sidecar_dict(
|
||||||
|
event: Event,
|
||||||
|
*,
|
||||||
|
serial: str,
|
||||||
|
blastware_filename: str,
|
||||||
|
blastware_filesize: int,
|
||||||
|
blastware_sha256: str,
|
||||||
|
source_kind: str = "sfm-live",
|
||||||
|
txt_filename: Optional[str] = None,
|
||||||
|
a5_pickle_filename: Optional[str] = None,
|
||||||
|
tool_version: str = _TOOL_VERSION_DEFAULT,
|
||||||
|
captured_at: Optional[datetime.datetime] = None,
|
||||||
|
review: Optional[dict] = None,
|
||||||
|
extensions: Optional[dict] = None,
|
||||||
|
bw_report: Optional[BwAsciiReport] = None,
|
||||||
|
) -> dict:
|
||||||
|
"""
|
||||||
|
Build a v1 sidecar dict from an Event + the surrounding metadata.
|
||||||
|
|
||||||
|
Pure helper — no file I/O. Callers stitch the result into a sidecar
|
||||||
|
via `write_sidecar()` (or send it back via the PATCH endpoint).
|
||||||
|
|
||||||
|
When *bw_report* is supplied (e.g. by the ACH-forwarded import path
|
||||||
|
where Blastware writes a per-event ASCII report alongside the binary),
|
||||||
|
its decoded fields are folded into the sidecar:
|
||||||
|
|
||||||
|
- A new top-level ``bw_report`` block carries the rich derived
|
||||||
|
per-channel stats (Peak Acceleration, Peak Displacement, ZC Freq,
|
||||||
|
Time of Peak), the Peak Vector Sum + time, the per-channel sensor
|
||||||
|
self-check results, and monitor-log timestamps.
|
||||||
|
- ``peak_values`` is overlaid from the report (the report's PPV/PVS
|
||||||
|
values are computed by the device firmware and are authoritative;
|
||||||
|
anything ``read_blastware_file()`` derived from samples is
|
||||||
|
approximate at best until the body codec is decoded).
|
||||||
|
- ``project_info`` is overlaid from the report when the report
|
||||||
|
supplies a non-empty value (the report mirrors the device's
|
||||||
|
compliance config, which is what BW shows in its event report).
|
||||||
|
- ``event.timestamp`` is overlaid from the report's Event Date +
|
||||||
|
Event Time (BW's report timestamps are second-resolution and
|
||||||
|
match the binary's footer; we prefer the report value because
|
||||||
|
the BW-binary footer timestamp can drift on some firmware).
|
||||||
|
"""
|
||||||
|
if source_kind not in {"sfm-live", "sfm-ach", "bw-import", "idf-import"}:
|
||||||
|
raise ValueError(f"unknown source_kind: {source_kind!r}")
|
||||||
|
|
||||||
|
captured_at = captured_at or datetime.datetime.utcnow()
|
||||||
|
|
||||||
|
# ── Overlay event fields from the report when present ───────────────────
|
||||||
|
timestamp_iso = _ts_iso(event.timestamp)
|
||||||
|
if bw_report and bw_report.event_datetime:
|
||||||
|
timestamp_iso = bw_report.event_datetime.isoformat()
|
||||||
|
|
||||||
|
# Build peak_values, optionally overlaid from the report. The report
|
||||||
|
# stores Mic peak as PSPL (dB(L)); we convert to psi to match the
|
||||||
|
# existing peak_values.mic_psi field.
|
||||||
|
peak_dict = _peak_values_to_dict(event.peak_values)
|
||||||
|
if bw_report:
|
||||||
|
ch = bw_report.channels
|
||||||
|
if (t := ch.get("Tran")) and t.ppv_ips is not None: peak_dict["transverse"] = t.ppv_ips
|
||||||
|
if (v := ch.get("Vert")) and v.ppv_ips is not None: peak_dict["vertical"] = v.ppv_ips
|
||||||
|
if (l := ch.get("Long")) and l.ppv_ips is not None: peak_dict["longitudinal"] = l.ppv_ips
|
||||||
|
if bw_report.peak_vector_sum_ips is not None:
|
||||||
|
peak_dict["vector_sum"] = bw_report.peak_vector_sum_ips
|
||||||
|
if bw_report.mic.pspl_dbl is not None and bw_report.mic.pspl_dbl > 0:
|
||||||
|
peak_dict["mic_psi"] = _dbl_to_psi(bw_report.mic.pspl_dbl)
|
||||||
|
|
||||||
|
# Project info: overlay from report (the report mirrors the
|
||||||
|
# session-start compliance config that BW renders in event reports).
|
||||||
|
proj_dict = _project_info_to_dict(event.project_info)
|
||||||
|
if bw_report:
|
||||||
|
if bw_report.project: proj_dict["project"] = bw_report.project
|
||||||
|
if bw_report.client: proj_dict["client"] = bw_report.client
|
||||||
|
if bw_report.operator: proj_dict["operator"] = bw_report.operator
|
||||||
|
if bw_report.sensor_location: proj_dict["sensor_location"] = bw_report.sensor_location
|
||||||
|
|
||||||
|
# Event-block fields: overlay from report where available.
|
||||||
|
event_block = {
|
||||||
|
"serial": serial,
|
||||||
|
"timestamp": timestamp_iso,
|
||||||
|
"waveform_key": event._waveform_key.hex() if event._waveform_key else None,
|
||||||
|
"record_type": event.record_type,
|
||||||
|
"sample_rate": event.sample_rate,
|
||||||
|
"rectime_seconds": event.rectime_seconds,
|
||||||
|
"total_samples": event.total_samples,
|
||||||
|
"pretrig_samples": event.pretrig_samples,
|
||||||
|
}
|
||||||
|
if bw_report:
|
||||||
|
# Report values are authoritative — they're the user-configured
|
||||||
|
# values BW reads back, not STRT-derived guesses. In particular
|
||||||
|
# `event.rectime_seconds` from `read_blastware_file()` reads
|
||||||
|
# STRT[18] which is actually the `0x46` record-type marker (= 70)
|
||||||
|
# rather than the user's Record Time setting. Always overwrite.
|
||||||
|
if bw_report.sample_rate_sps:
|
||||||
|
event_block["sample_rate"] = bw_report.sample_rate_sps
|
||||||
|
if bw_report.record_time_s is not None:
|
||||||
|
event_block["rectime_seconds"] = bw_report.record_time_s
|
||||||
|
# Derive total_samples + pretrig_samples per channel from the
|
||||||
|
# report's sample_rate × times. These match the row count of
|
||||||
|
# the report's sample table (verified: event-c reports 1024 sps
|
||||||
|
# × (1.0 + 0.25) = 1280 rows).
|
||||||
|
if (sr := bw_report.sample_rate_sps) and bw_report.record_time_s is not None:
|
||||||
|
pretrig_s = abs(bw_report.pretrig_s) if bw_report.pretrig_s is not None else 0.0
|
||||||
|
event_block["total_samples"] = int(round(sr * (bw_report.record_time_s + pretrig_s)))
|
||||||
|
event_block["pretrig_samples"] = int(round(sr * pretrig_s))
|
||||||
|
|
||||||
|
out = {
|
||||||
|
"schema_version": SCHEMA_VERSION,
|
||||||
|
"kind": SIDECAR_KIND,
|
||||||
|
|
||||||
|
"event": event_block,
|
||||||
|
"peak_values": peak_dict,
|
||||||
|
"project_info": proj_dict,
|
||||||
|
|
||||||
|
"blastware": {
|
||||||
|
"filename": blastware_filename,
|
||||||
|
"filesize": blastware_filesize,
|
||||||
|
"sha256": blastware_sha256,
|
||||||
|
"available": True,
|
||||||
|
},
|
||||||
|
|
||||||
|
"source": {
|
||||||
|
"kind": source_kind,
|
||||||
|
"captured_at": captured_at.isoformat() + "Z" if captured_at.tzinfo is None else captured_at.isoformat(),
|
||||||
|
"tool_version": tool_version,
|
||||||
|
"a5_pickle_filename": a5_pickle_filename,
|
||||||
|
"txt_filename": txt_filename,
|
||||||
|
},
|
||||||
|
|
||||||
|
"review": review or {
|
||||||
|
"false_trigger": False,
|
||||||
|
"reviewer": None,
|
||||||
|
"reviewed_at": None,
|
||||||
|
"notes": "",
|
||||||
|
},
|
||||||
|
|
||||||
|
"extensions": extensions or {},
|
||||||
|
}
|
||||||
|
|
||||||
|
if bw_report:
|
||||||
|
out["bw_report"] = _bw_report_to_dict(bw_report)
|
||||||
|
|
||||||
|
return out
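The sample-count derivation inside `event_to_sidecar_dict()` can be checked against the figure quoted in its comment (1024 sps × (1.0 + 0.25) s = 1280 rows). A minimal sketch follows; `derive_sample_counts` is a hypothetical helper, and the negative sign on the pre-trigger value is an assumption suggested by the `abs()` call in the source.

```python
from typing import Optional, Tuple


def derive_sample_counts(
    sample_rate_sps: float,
    record_time_s: float,
    pretrig_s: Optional[float],
) -> Tuple[int, int]:
    """Return (total_samples, pretrig_samples) per channel."""
    # A missing pre-trigger is treated as zero; sign is ignored.
    pre = abs(pretrig_s) if pretrig_s is not None else 0.0
    total = int(round(sample_rate_sps * (record_time_s + pre)))
    pretrig = int(round(sample_rate_sps * pre))
    return total, pretrig


print(derive_sample_counts(1024, 1.0, -0.25))
print(derive_sample_counts(1024, 1.0, None))
```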
|
||||||
|
|
||||||
|
|
||||||
|
# ── Sidecar IO ────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def write_sidecar(path: Union[str, Path], data: dict) -> None:
|
||||||
|
"""
|
||||||
|
Atomic write of a sidecar dict to <path>.
|
||||||
|
|
||||||
|
Validates schema_version is supported before writing so we don't
|
||||||
|
silently drop a future-format sidecar over the wire.
|
||||||
|
"""
|
||||||
|
path = Path(path)
|
||||||
|
sv = data.get("schema_version")
|
||||||
|
if not isinstance(sv, int) or sv < 1 or sv > SCHEMA_VERSION:
|
||||||
|
raise ValueError(
|
||||||
|
f"write_sidecar: unsupported schema_version={sv!r} "
|
||||||
|
f"(this build supports 1..{SCHEMA_VERSION})"
|
||||||
|
)
|
||||||
|
|
||||||
|
tmp = path.with_suffix(path.suffix + ".tmp")
|
||||||
|
with tmp.open("w", encoding="utf-8") as f:
|
||||||
|
json.dump(data, f, indent=2, sort_keys=False, default=str)
|
||||||
|
f.write("\n")
|
||||||
|
f.flush()
|
||||||
|
os.fsync(f.fileno())
|
||||||
|
os.replace(tmp, path)
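The write-to-temp, fsync, then `os.replace` pattern used by `write_sidecar()` can be tried in isolation. This is a minimal sketch under the assumption that the temp file lives in the same directory as the target; `atomic_write_json` and the file name are hypothetical, and the schema validation step is omitted.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON to a sibling temp file, then swap it into place."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
        f.write("\n")
        f.flush()
        os.fsync(f.fileno())   # push bytes to disk before the rename
    os.replace(tmp, path)      # readers see the old file or the new one


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "event.sfm.json"
    atomic_write_json(target, {"schema_version": 1, "kind": "sfm.event"})
    round_trip = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(p.name for p in Path(d).iterdir())

print(round_trip, leftovers)
```

Keeping the temp file next to the target matters because a rename is only atomic within a single filesystem.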
|
||||||
|
|
||||||
|
|
||||||
|
def read_sidecar(path: Union[str, Path]) -> dict:
|
||||||
|
"""
|
||||||
|
Load a sidecar JSON file.
|
||||||
|
|
||||||
|
Raises FileNotFoundError if missing, ValueError on bad shape /
|
||||||
|
unsupported schema_version. Unknown keys at the top level are
|
||||||
|
preserved in the returned dict (forward-compat).
|
||||||
|
"""
|
||||||
|
path = Path(path)
|
||||||
|
with path.open("r", encoding="utf-8") as f:
|
||||||
|
data = json.load(f)
|
||||||
|
if not isinstance(data, dict):
|
||||||
|
raise ValueError(f"sidecar at {path}: top-level is not a JSON object")
|
||||||
|
sv = data.get("schema_version")
|
||||||
|
if not isinstance(sv, int) or sv < 1:
|
||||||
|
raise ValueError(f"sidecar at {path}: missing or invalid schema_version")
|
||||||
|
if sv > SCHEMA_VERSION:
|
||||||
|
raise ValueError(
|
||||||
|
f"sidecar at {path}: schema_version={sv} > supported {SCHEMA_VERSION}; "
|
||||||
|
"upgrade seismo-relay to read this file"
|
||||||
|
)
|
||||||
|
if data.get("kind") != SIDECAR_KIND:
|
||||||
|
raise ValueError(f"sidecar at {path}: unexpected kind={data.get('kind')!r}")
|
||||||
|
return data
|
||||||
|
|
||||||
|
|
||||||
|
def patch_sidecar(
|
||||||
|
path: Union[str, Path],
|
||||||
|
*,
|
||||||
|
review: Optional[dict] = None,
|
||||||
|
extensions: Optional[dict] = None,
|
||||||
|
reviewer_now: bool = True,
|
||||||
|
) -> dict:
|
||||||
|
"""
|
||||||
|
Atomically apply a JSON-merge-patch to a sidecar file's `review`
|
||||||
|
and/or `extensions` blocks. Other top-level keys are untouched.
|
||||||
|
|
||||||
|
`reviewer_now`: when True (default) and `review` is non-empty, stamps
|
||||||
|
`review.reviewed_at` with the current UTC time so the review-time is
|
||||||
|
auditable without the caller having to pass it.
|
||||||
|
|
||||||
|
Returns the new full sidecar dict.
|
||||||
|
"""
|
||||||
|
path = Path(path)
|
||||||
|
data = read_sidecar(path)
|
||||||
|
|
||||||
|
if review:
|
||||||
|
merged = dict(data.get("review") or {})
|
||||||
|
merged.update({k: v for k, v in review.items() if v is not None or k in merged})
|
||||||
|
if reviewer_now:
|
||||||
|
merged["reviewed_at"] = datetime.datetime.utcnow().isoformat() + "Z"
|
||||||
|
data["review"] = merged
|
||||||
|
|
||||||
|
if extensions:
|
||||||
|
merged_ext = dict(data.get("extensions") or {})
|
||||||
|
merged_ext.update(extensions)
|
||||||
|
data["extensions"] = merged_ext
|
||||||
|
|
||||||
|
write_sidecar(path, data)
|
||||||
|
return data
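The `review` merge rule in `patch_sidecar` can be sketched on its own. This is a minimal standalone copy of that one expression, with `merge_review` as a hypothetical name used only for illustration. It shows that a `None` value overwrites a key that already exists but never creates a new key.

```python
from typing import Optional


def merge_review(existing: Optional[dict], patch: dict) -> dict:
    """Hypothetical standalone copy of the review-merge rule in patch_sidecar."""
    merged = dict(existing or {})
    # A None value is applied only when the key is already present;
    # for a key that does not exist yet it is dropped from the patch.
    merged.update({k: v for k, v in patch.items() if v is not None or k in merged})
    return merged


current = {"status": "accepted", "note": "checked"}
result = merge_review(current, {"note": None, "reviewer": None, "status": "flagged"})
print(result)
```

Note that an existing key patched with `None` is kept with a `None` value rather than removed, which differs from RFC 7386 merge-patch, where a null member deletes the key.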
|
||||||
|
|
||||||
|
|
||||||
|
def sidecar_path_for(blastware_path: Union[str, Path]) -> Path:
|
||||||
|
"""Convention: <bw_path>.sfm.json sits next to the BW binary."""
|
||||||
|
p = Path(blastware_path)
|
||||||
|
return p.with_name(p.name + ".sfm.json")
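A short sketch of why the sidecar name is built by appending to the full file name instead of swapping the suffix. The example path is made up for illustration. `PurePosixPath` is used only so the output does not depend on the host OS.

```python
from pathlib import PurePosixPath

bw = PurePosixPath("events/T350L385.VY0W")

# Appending keeps the BW extension (and its type code) in the sidecar name.
appended = bw.with_name(bw.name + ".sfm.json")

# Replacing the suffix would drop the extension, so two events that share a
# stem but differ in extension would map to the same sidecar file.
swapped = bw.with_suffix(".sfm.json")

print(appended.name)
print(swapped.name)
```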
|
||||||
|
|
||||||
|
|
||||||
|
def file_sha256(path: Union[str, Path], chunk_size: int = 65536) -> str:
|
||||||
|
"""Compute SHA-256 of a file as a hex string."""
|
||||||
|
h = hashlib.sha256()
|
||||||
|
with open(path, "rb") as f:
|
||||||
|
while True:
|
||||||
|
chunk = f.read(chunk_size)
|
||||||
|
if not chunk:
|
||||||
|
break
|
||||||
|
h.update(chunk)
|
||||||
|
return h.hexdigest()
|
||||||
|
|
||||||
|
|
||||||
|
# ── Blastware-file reader ─────────────────────────────────────────────────────
|
||||||
|
#
|
||||||
|
# Reverse of `blastware_file.write_blastware_file`. Used by the BW-import
|
||||||
|
# flow to ingest files produced by Blastware's own ACH (where the source
|
||||||
|
# A5 frames are not available).
|
||||||
|
#
|
||||||
|
# File structure (recap):
|
||||||
|
# [22B header] [21B STRT record] [body bytes] [26B footer]
|
||||||
|
#
|
||||||
|
# The body holds:
|
||||||
|
# - 6B preamble (00 00 ff ff ff ff) immediately after the STRT
|
||||||
|
# - 4-channel interleaved int16 LE samples
|
||||||
|
# - Embedded ASCII metadata strings (Project: / Client: / User Name: /
|
||||||
|
# Seis Loc: / Extended Notes) from the device's session-start config
|
||||||
|
#
|
||||||
|
# The 0C waveform record (per-event peaks, project name) is NOT in the
|
||||||
|
# BW file — those are computed by the device firmware and only carried
|
||||||
|
# in the live SUB 0C response. read_blastware_file() therefore computes
|
||||||
|
# peaks from the raw samples assuming Normal-range (10 in/s full-scale)
|
||||||
|
# geophone sensitivity. Imported events surface that assumption via the
|
||||||
|
# sidecar's `peak_values.computed_from_samples` flag.
|
||||||
|
|
||||||
|
|
||||||
|
# Geophone scale factor, in/s per ADC unit, for Normal range (10 in/s FS).
|
||||||
|
# Confirmed from CLAUDE.md (geo_hardware_constant = 6.206053 in/s per V,
|
||||||
|
# ADC full-scale = 1.61133 V, so Normal range = 10.0 in/s peak; per-count
|
||||||
|
# resolution ≈ 10.0 / 32768).
|
||||||
|
_GEO_NORMAL_FS_INS = 10.0
|
||||||
|
_GEO_SENSITIVE_FS_INS = 1.250
|
||||||
|
_INT16_FS = 32768.0
|
||||||
|
|
||||||
|
# Microphone scale factor, psi per ADC count. Approximate — exact factor
|
||||||
|
# depends on the geophone-vs-mic ADC scaling and the firmware reference.
|
||||||
|
# We mark mic_psi as "computed approximate" in the sidecar.
|
||||||
|
_MIC_FS_PSI = 0.0125 / _INT16_FS  # psi per count (0.0125 psi full-scale as coded; approximate)
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_strt(strt: bytes) -> dict:
|
||||||
|
"""
|
||||||
|
Decode the 21-byte STRT record from a BW file.
|
||||||
|
|
||||||
|
Returns dict with waveform_key (4B), total_samples, pretrig_samples,
|
||||||
|
rectime_seconds. Returns an empty dict if the record is truncated or does not start with the STRT magic.
|
||||||
|
"""
|
||||||
|
if len(strt) < 21 or strt[0:4] != b"STRT":
|
||||||
|
return {}
|
||||||
|
return {
|
||||||
|
"waveform_key": strt[6:10].hex(),
|
||||||
|
"total_samples": struct.unpack_from(">H", strt, 8)[0],
|
||||||
|
"pretrig_samples": struct.unpack_from(">H", strt, 16)[0],
|
||||||
|
"rectime_seconds": strt[18],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _find_first_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
|
||||||
|
"""
|
||||||
|
Search `buf` for `label` (e.g. b"Project:") and return the
|
||||||
|
null-terminated ASCII string that follows, stripped.
|
||||||
|
"""
|
||||||
|
pos = buf.find(label)
|
||||||
|
if pos < 0:
|
||||||
|
return None
|
||||||
|
start = pos + len(label)
|
||||||
|
end = buf.find(b"\x00", start, start + max_len)
|
||||||
|
if end < 0:
|
||||||
|
end = start + max_len
|
||||||
|
text = buf[start:end].decode("ascii", errors="replace").strip()
|
||||||
|
return text or None
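The label-anchored search can be exercised against a synthetic buffer. The sketch below is a standalone copy of the same logic under the hypothetical name `find_label_string`. The byte string is invented test data, not a real BW body.

```python
from typing import Optional


def find_label_string(buf: bytes, label: bytes, max_len: int = 256) -> Optional[str]:
    """Return the NUL-terminated ASCII text that follows `label`, or None."""
    pos = buf.find(label)
    if pos < 0:
        return None
    start = pos + len(label)
    end = buf.find(b"\x00", start, start + max_len)
    if end < 0:
        end = start + max_len  # no terminator in range: cap at max_len bytes
    text = buf[start:end].decode("ascii", errors="replace").strip()
    return text or None


body = b"\x00\x00\xff\xff\xff\xffProject: Quarry 7\x00Client:\x00\x12\x34"
print(find_label_string(body, b"Project:"))
print(find_label_string(body, b"Client:"))
print(find_label_string(body, b"Seis Loc:"))
```

An empty field and a missing label both come back as `None`, so callers cannot tell the two cases apart.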
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_samples_4ch_int16_le(stream: bytes) -> dict[str, list[int]]:
|
||||||
|
"""
|
||||||
|
Decode a 4-channel interleaved int16 LE byte stream into per-channel
|
||||||
|
lists. Channels are [Tran, Vert, Long, MicL] = [ch0, ch1, ch2, ch3].
|
||||||
|
Truncates to a multiple of 8 bytes (one full sample-set).
|
||||||
|
"""
|
||||||
|
n_complete = (len(stream) // 8) * 8
|
||||||
|
if n_complete == 0:
|
||||||
|
return {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||||
|
fmt = "<" + "h" * (n_complete // 2)
|
||||||
|
flat = list(struct.unpack(fmt, stream[:n_complete]))
|
||||||
|
return {
|
||||||
|
"Tran": flat[0::4],
|
||||||
|
"Vert": flat[1::4],
|
||||||
|
"Long": flat[2::4],
|
||||||
|
"MicL": flat[3::4],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _peaks_from_samples(samples: dict[str, list[int]]) -> PeakValues:
|
||||||
|
"""
|
||||||
|
Compute approximate peaks from raw int16 samples assuming Normal-range
|
||||||
|
geophone sensitivity. Used by the BW-importer when the 0C waveform
|
||||||
|
record (the device's authoritative peaks) is unavailable.
|
||||||
|
"""
|
||||||
|
def _peak_ins(ch: list[int]) -> float:
|
||||||
|
if not ch:
|
||||||
|
return 0.0
|
||||||
|
m = max(abs(int(v)) for v in ch)
|
||||||
|
return m / _INT16_FS * _GEO_NORMAL_FS_INS
|
||||||
|
|
||||||
|
tran = _peak_ins(samples.get("Tran", []))
|
||||||
|
vert = _peak_ins(samples.get("Vert", []))
|
||||||
|
long_ = _peak_ins(samples.get("Long", []))
|
||||||
|
|
||||||
|
# Mic in psi (approximate)
|
||||||
|
mic_ch = samples.get("MicL", []) or []
|
||||||
|
mic = max((abs(int(v)) for v in mic_ch), default=0) * _MIC_FS_PSI
|
||||||
|
|
||||||
|
# Peak vector sum: max over time of sqrt(T^2 + V^2 + L^2)
|
||||||
|
pvs = 0.0
|
||||||
|
n = min(len(samples.get("Tran", [])), len(samples.get("Vert", [])), len(samples.get("Long", [])))
|
||||||
|
if n:
|
||||||
|
scale = _GEO_NORMAL_FS_INS / _INT16_FS
|
||||||
|
T = samples["Tran"]; V = samples["Vert"]; L = samples["Long"]
|
||||||
|
for i in range(n):
|
||||||
|
t = T[i] * scale
|
||||||
|
v = V[i] * scale
|
||||||
|
l = L[i] * scale
|
||||||
|
mag = (t*t + v*v + l*l) ** 0.5
|
||||||
|
if mag > pvs:
|
||||||
|
pvs = mag
|
||||||
|
|
||||||
|
return PeakValues(
|
||||||
|
tran=tran, vert=vert, long=long_,
|
||||||
|
peak_vector_sum=pvs, micl=mic,
|
||||||
|
)
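A worked example of the count-to-velocity scaling and the peak vector sum. It is a sketch that assumes the Normal-range constants defined earlier (10 in/s over 32768 counts). The helper names are hypothetical and the sample values are chosen only to make the arithmetic easy to follow.

```python
import math

GEO_NORMAL_FS_INS = 10.0   # assumed Normal-range full scale, in/s
INT16_FS = 32768.0


def count_to_ins(count: int) -> float:
    """Convert one ADC count to in/s at Normal range."""
    return count / INT16_FS * GEO_NORMAL_FS_INS


def peak_vector_sum(tran, vert, long_) -> float:
    """Max over time of sqrt(T^2 + V^2 + L^2), using simultaneous samples."""
    scale = GEO_NORMAL_FS_INS / INT16_FS
    return max(
        (math.hypot(t * scale, v * scale, l * scale) for t, v, l in zip(tran, vert, long_)),
        default=0.0,
    )


# Tran peaks on sample 0 and Vert peaks on sample 1, so they never coincide.
tran, vert, long_ = [16384, 0], [0, 16384], [0, 0]
pvs = peak_vector_sum(tran, vert, long_)
print(count_to_ins(16384), pvs)
```

Each channel peaks at 5.0 in/s, but the peak vector sum is also 5.0 in/s, not the 7.07 in/s that combining the per-channel peaks would give, because the peaks occur on different samples.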
|
||||||
|
|
||||||
|
|
||||||
|
_RECORD_TYPE_BY_EXT_SUFFIX = {
|
||||||
|
'H': 'Histogram',
|
||||||
|
'W': 'Waveform',
|
||||||
|
'M': 'Manual',
|
||||||
|
'E': 'Event',
|
||||||
|
'C': 'Combo',
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def derive_record_type_from_filename(filename, default: str = "Waveform") -> str:
|
||||||
|
"""Derive a BW Event's record_type from its filename's extension suffix.
|
||||||
|
|
||||||
|
V10.72+ MiniMate Plus firmware encodes the event type as the LAST
|
||||||
|
character of the extension (the `T` in BW's `AB0T` scheme):
|
||||||
|
|
||||||
|
``M529LKIQ.G10H`` → H → ``"Histogram"``
|
||||||
|
``T350L385.VY0W`` → W → ``"Waveform"``
|
||||||
|
``...M`` → M → ``"Manual"``
|
||||||
|
``...E`` → E → ``"Event"``
|
||||||
|
``...C`` → C → ``"Combo"``
|
||||||
|
|
||||||
|
Old S338 firmware uses 3-char extensions ending in ``0`` whose
|
||||||
|
encoding is not yet known — those fall through to ``default``.
|
||||||
|
Micromate Series 4 uses a different scheme entirely (observed:
|
||||||
|
``IDFH``, ``IDFW``) but the LAST-char convention (H / W) still holds
|
||||||
|
for the type code, so it works for both families.
|
||||||
|
|
||||||
|
Returns ``default`` if filename is empty, has no extension, or the
|
||||||
|
suffix char isn't a recognized type code.
|
||||||
|
"""
|
||||||
|
if not filename:
|
||||||
|
return default
|
||||||
|
try:
|
||||||
|
name = Path(filename).name
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
return default
|
||||||
|
if '.' not in name:
|
||||||
|
return default
|
||||||
|
ext = name.rsplit('.', 1)[1]
|
||||||
|
if not ext:
|
||||||
|
return default
|
||||||
|
return _RECORD_TYPE_BY_EXT_SUFFIX.get(ext[-1].upper(), default)
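The last-character rule can be checked with a few file names. This is a standalone sketch under the hypothetical name `record_type_for`. The first two names come from the docstring above; the others are invented.

```python
from pathlib import PurePath

RECORD_TYPE_BY_SUFFIX = {
    "H": "Histogram", "W": "Waveform", "M": "Manual", "E": "Event", "C": "Combo",
}


def record_type_for(filename: str, default: str = "Waveform") -> str:
    """Map the last character of the extension to a record type."""
    if not filename:
        return default
    name = PurePath(filename).name
    if "." not in name:
        return default
    ext = name.rsplit(".", 1)[1]
    if not ext:
        return default
    return RECORD_TYPE_BY_SUFFIX.get(ext[-1].upper(), default)


print(record_type_for("M529LKIQ.G10H"))
print(record_type_for("T350L385.VY0W"))
print(record_type_for("EVENT.AB0", default="Unknown"))
print(record_type_for("upload.bw", default="Histogram"))
```

One consequence of the case-insensitive match is that a `.bw` temp file resolves to `"Waveform"` through the `W` code, whatever `default` is passed.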
|
||||||
|
|
||||||
|
|
||||||
|
def read_blastware_file(path: Union[str, Path]) -> Event:
|
||||||
|
"""
|
||||||
|
Parse a Blastware waveform file into an Event.
|
||||||
|
|
||||||
|
Recovers:
|
||||||
|
- waveform_key, rectime_seconds, total_samples, pretrig_samples
|
||||||
|
(from the STRT record)
|
||||||
|
- timestamp (from the footer's start-time field)
|
||||||
|
- project_info (from ASCII labels embedded in the body)
|
||||||
|
- raw_samples (Tran/Vert/Long/MicL int16 lists)
|
||||||
|
- peak_values (computed from raw_samples; approximate — see notes
|
||||||
|
on _peaks_from_samples about Normal-range assumption)
|
||||||
|
|
||||||
|
Does NOT recover the source A5 frames (they aren't in the BW file).
|
||||||
|
The returned Event has `_a5_frames = None`, signalling that
|
||||||
|
byte-for-byte regeneration of the BW file from this Event alone is
|
||||||
|
not possible — the on-disk BW file IS the byte-for-byte source.
|
||||||
|
"""
|
||||||
|
path = Path(path)
|
||||||
|
raw = path.read_bytes()
|
||||||
|
if len(raw) < _bw._WAVEFORM_HEADER_SIZE + 21 + 26:
|
||||||
|
raise ValueError(f"{path}: file too short ({len(raw)} bytes) to be a BW event")
|
||||||
|
|
||||||
|
# Header: validate magic prefix.
|
||||||
|
header = raw[:_bw._WAVEFORM_HEADER_SIZE]
|
||||||
|
if not header.startswith(_bw._FILE_HEADER_PREFIX):
|
||||||
|
raise ValueError(f"{path}: not a Blastware file (bad header prefix)")
|
||||||
|
|
||||||
|
# STRT record: 21 bytes immediately after the header.
|
||||||
|
strt_raw = raw[_bw._WAVEFORM_HEADER_SIZE : _bw._WAVEFORM_HEADER_SIZE + 21]
|
||||||
|
strt_fields = _decode_strt(strt_raw)
|
||||||
|
if not strt_fields:
|
||||||
|
raise ValueError(f"{path}: STRT record missing or malformed")
|
||||||
|
|
||||||
|
# Footer: locate the 0e 08 marker, validating the year is in a sane range.
|
||||||
|
body_start = _bw._WAVEFORM_HEADER_SIZE + 21
|
||||||
|
footer_pos = -1
|
||||||
|
pos = body_start
|
||||||
|
while True:
|
||||||
|
pos = raw.find(b"\x0e\x08", pos)
|
||||||
|
if pos < 0 or pos + 26 > len(raw):
|
||||||
|
break
|
||||||
|
yr = (raw[pos + 4] << 8) | raw[pos + 5]
|
||||||
|
if 2015 <= yr <= 2050:
|
||||||
|
footer_pos = pos
|
||||||
|
break
|
||||||
|
pos += 1
|
||||||
|
|
||||||
|
if footer_pos < 0 and len(raw) >= 26:
|
||||||
|
footer_pos = len(raw) - 26
|
||||||
|
if footer_pos < body_start:
|
||||||
|
raise ValueError(f"{path}: footer not found")
|
||||||
|
|
||||||
|
body = raw[body_start : footer_pos]
|
||||||
|
footer = raw[footer_pos : footer_pos + 26]
|
||||||
|
|
||||||
|
# Footer layout:
|
||||||
|
# [0:2] 0e 08 marker
|
||||||
|
# [2:10] ts1 (start) BE 8B
|
||||||
|
# [10:18] ts2 (stop) BE 8B
|
||||||
|
# [18:24] 00 01 00 02 00 00
|
||||||
|
# [24:26] crc
|
||||||
|
ts1 = _bw._decode_ts_be(footer[2:10])
|
||||||
|
ts2 = _bw._decode_ts_be(footer[10:18])
|
||||||
|
|
||||||
|
# Body: decode via the verified body codecs. Two formats coexist:
|
||||||
|
#
|
||||||
|
# 1. Waveform-mode (.AB0W) — starts with 7-byte preamble
|
||||||
|
# ``00 02 00 [Tran[0] BE] [Tran[1] BE]`` followed by the
|
||||||
|
# tagged-block delta stream documented in
|
||||||
|
# ``docs/waveform_codec_re_status.md`` and §7.6.1 of the
|
||||||
|
# protocol reference. Decoded by ``waveform_codec.decode_waveform_v2``.
|
||||||
|
#
|
||||||
|
# 2. Histogram-mode (.AB0H) — a sequence of 32-byte blocks, one
|
||||||
|
# per histogram interval, each carrying per-channel peak +
|
||||||
|
# half-period values. Decoded by
|
||||||
|
# ``histogram_codec.decode_histogram_body``. Both codecs
|
||||||
|
# return the same channel-grouped output shape, so consumers
|
||||||
|
# don't need to special-case mode.
|
||||||
|
#
|
||||||
|
# The historical ``_decode_samples_4ch_int16_le`` int16-LE
|
||||||
|
# interpretation was retracted 2026-05-08 (see protocol-ref §7.6.1
|
||||||
|
# retraction box) — it produced ±32K noise on every event.
|
||||||
|
#
|
||||||
|
# If both codecs fail (malformed file, truncated body, unrecognised
|
||||||
|
# mode, synthetic test input), fall back to empty channels — the
|
||||||
|
# rest of the event (timestamp, waveform_key, project strings) is
|
||||||
|
# still recoverable and useful.
|
||||||
|
decoded = decode_waveform_v2(body)
|
||||||
|
if decoded is None:
|
||||||
|
decoded = decode_histogram_body(body)
|
||||||
|
if decoded is None:
|
||||||
|
log.warning(
|
||||||
|
"%s: body codec failed to decode (body starts %s) — "
|
||||||
|
"raw_samples will be empty", path, body[:8].hex(" "),
|
||||||
|
)
|
||||||
|
samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
|
||||||
|
else:
|
||||||
|
samples = decoded_to_adc_counts(decoded)
|
||||||
|
|
||||||
|
# Metadata strings (label-anchored search across the body).
|
||||||
|
project = _find_first_string(body, b"Project:")
|
||||||
|
client = _find_first_string(body, b"Client:")
|
||||||
|
user = _find_first_string(body, b"User Name:")
|
||||||
|
seisloc = _find_first_string(body, b"Seis Loc:")
|
||||||
|
|
||||||
|
# Build the Event.
|
||||||
|
ev = Event(index=-1)
|
||||||
|
if strt_fields.get("waveform_key"):
|
||||||
|
ev._waveform_key = bytes.fromhex(strt_fields["waveform_key"])
|
||||||
|
# Derive record_type from the filename's extension suffix (H/W/M/E/C).
|
||||||
|
# When called from save_imported_bw the path here is a tmp file with a
# ".bw" suffix, which always resolves to "Waveform" (the trailing "w"
# matches the W type code), so the caller overrides ev.record_type using
# the original filename; see waveform_store.save_imported_bw.
|
||||||
|
ev.record_type = derive_record_type_from_filename(path.name)
|
||||||
|
ev.rectime_seconds = strt_fields.get("rectime_seconds")
|
||||||
|
ev.total_samples = strt_fields.get("total_samples")
|
||||||
|
ev.pretrig_samples = strt_fields.get("pretrig_samples")
|
||||||
|
|
||||||
|
if ts1 is not None:
|
||||||
|
ev.timestamp = Timestamp(
|
||||||
|
raw=footer[2:10],
|
||||||
|
flag=0x10,
|
||||||
|
year=ts1.year, unknown_byte=0, month=ts1.month, day=ts1.day,
|
||||||
|
hour=ts1.hour, minute=ts1.minute, second=ts1.second,
|
||||||
|
)
|
||||||
|
|
||||||
|
ev.project_info = ProjectInfo(
|
||||||
|
project=project, client=client, operator=user, sensor_location=seisloc,
|
||||||
|
)
|
||||||
|
ev.raw_samples = samples
|
||||||
|
# Only compute peaks from samples when we actually have samples.
|
||||||
|
# For events neither codec could decode, samples holds only empty
# lists and ``_peaks_from_samples`` would return PeakValues(0, 0, 0, 0, 0).
|
||||||
|
# That would then OVERWRITE existing good DB peak values (e.g. from
|
||||||
|
# paired BW ASCII reports) during the backfill UPSERT path.
|
||||||
|
# Leaving peak_values=None signals "we don't know" to downstream
|
||||||
|
# consumers; the backfill script seeds from the DB row when it sees
|
||||||
|
# None, and ``apply_report_to_event`` overlays from a paired ASCII
|
||||||
|
# report when one is supplied.
|
||||||
|
has_samples = any(samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL"))
|
||||||
|
ev.peak_values = _peaks_from_samples(samples) if has_samples else None
|
||||||
|
ev._a5_frames = None # not recoverable from BW file
|
||||||
|
|
||||||
|
return ev
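The footer search used in `read_blastware_file` can be sketched in isolation. `find_footer` is a hypothetical helper that mirrors the loop above: it accepts a `0e 08` marker only when the big-endian word at marker+4 looks like a plausible year. The test bytes are synthetic, and every footer byte other than the marker and year is a zero placeholder.

```python
def find_footer(raw: bytes, body_start: int, footer_len: int = 26) -> int:
    """Return the offset of the first plausible footer, or -1 if none."""
    pos = body_start
    while True:
        pos = raw.find(b"\x0e\x08", pos)
        if pos < 0 or pos + footer_len > len(raw):
            return -1
        year = (raw[pos + 4] << 8) | raw[pos + 5]
        if 2015 <= year <= 2050:
            return pos
        pos += 1  # false marker inside the body: keep scanning


false_marker = b"\x0e\x08" + bytes(6)                                  # year reads as 0
footer = b"\x0e\x08" + bytes(2) + (2026).to_bytes(2, "big") + bytes(20)  # 26 bytes
raw = bytes(4) + false_marker + bytes(3) + footer
print(find_footer(raw, 0))
```

The year check is what lets the scan skip `0e 08` byte pairs that occur by chance inside the encoded body.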
|
||||||
+181
-31
@@ -111,20 +111,24 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
|
|||||||
verified against this algorithm on 2026-04-02).
|
verified against this algorithm on 2026-04-02).
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
offset_word: 16-bit offset (0x1004 for probe/chunks, 0x005A for term).
|
offset_word: 16-bit offset. For probe/chunks/metadata pages this is
|
||||||
raw_params: 10 or 11 params bytes (from bulk_waveform_params or
|
`0x1002`. For the proper TERM frame this is computed by
|
||||||
bulk_waveform_term_params). 0x10 bytes in params are
|
`bulk_waveform_term_v2()` from the STRT-derived
|
||||||
written RAW — NOT DLE-stuffed. Confirmed 2026-04-06 by
|
`end_offset`.
|
||||||
comparing wire bytes: BW sends bare `10 04` for chunk 1
|
raw_params: 10, 11, or 12 params bytes (from `bulk_waveform_params`
|
||||||
(counter=0x1004), not stuffed `10 10 04`. Device reads
|
for probes/samples, `bulk_waveform_term_v2` for TERM, or
|
||||||
params at fixed byte positions; stuffing shifts the bytes
|
a manually-built 12-byte block for the metadata pages
|
||||||
and corrupts the counter, causing device to ignore the frame.
|
0x1002 / 0x1004). See gotcha #3 below — params region
|
||||||
|
uses partial DLE stuffing of 0x10 bytes.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
|
Complete frame bytes: [ACK][STX][stuffed_section][chk][ETX]
|
||||||
"""
|
"""
|
||||||
if len(raw_params) not in (10, 11):
|
if len(raw_params) not in (10, 11, 12):
|
||||||
raise ValueError(f"raw_params must be 10 or 11 bytes, got {len(raw_params)}")
|
# 10 = termination params; 11 = regular probe / chunk params;
|
||||||
|
# 12 = metadata-page params (extra trailing 0x00 — BW byte-perfect quirk
|
||||||
|
# for the two fixed metadata reads at counter=0x1002 and 0x1004).
|
||||||
|
raise ValueError(f"raw_params must be 10/11/12 bytes, got {len(raw_params)}")
|
||||||
|
|
||||||
# Build stuffed section between STX and checksum
|
# Build stuffed section between STX and checksum
|
||||||
s = bytearray()
|
s = bytearray()
|
||||||
@@ -134,8 +138,40 @@ def build_5a_frame(offset_word: int, raw_params: bytes) -> bytes:
|
|||||||
s += b"\x00" # field3
|
s += b"\x00" # field3
|
||||||
s += bytes([(offset_word >> 8) & 0xFF, # offset_hi — raw, NOT stuffed
|
s += bytes([(offset_word >> 8) & 0xFF, # offset_hi — raw, NOT stuffed
|
||||||
offset_word & 0xFF]) # offset_lo
|
offset_word & 0xFF]) # offset_lo
|
||||||
for b in raw_params: # params — NOT DLE-stuffed (raw bytes, match BW wire format)
|
# Params — partial DLE stuffing of 0x10 bytes (CONFIRMED 2026-05-05).
|
||||||
|
#
|
||||||
|
# The device's de-stuffing rule for params is:
|
||||||
|
# • `10 10` → de-stuffs to `10`
|
||||||
|
# • `10 02/03/04` → kept literal (these are inner-frame markers)
|
||||||
|
# • `10 X` other → de-stuffs to just `X` (drops the 0x10)
|
||||||
|
#
|
||||||
|
# So for any 0x10 byte in the *logical* params that is followed by a
|
||||||
|
# byte NOT in {0x02, 0x03, 0x04, 0x10}, we must double the 0x10 on the
|
||||||
|
# wire (`10 X` → `10 10 X`) so the device's de-stuffer reproduces the
|
||||||
|
# original `10 X` pair. Without this, counter values with `0x10` in
|
||||||
|
# the high byte (e.g. counter=0x1000 has params bytes `10 00`) are
|
||||||
|
# silently corrupted to `0x__00` on the device side, and the device
|
||||||
|
# responds for the wrong address — for counter=0x1000 it returns the
|
||||||
|
# probe response (counter=0x0000), which contains the file header +
|
||||||
|
# STRT. That STRT block then lands in the assembled file body and
|
||||||
|
# Blastware rejects the file as malformed.
|
||||||
|
#
|
||||||
|
# Confirmed against BW capture 5-1-26 / bwcap3sec frame 20: params
|
||||||
|
# logical bytes `00 01 11 10 00 00 00 00 00 00 00` (counter=0x1000)
|
||||||
|
# are encoded on the wire as `00 01 11 10 10 00 00 00 00 00 00 00`.
|
||||||
|
# BW frames 13/14 (meta @ 0x1002 / 0x1004) leave `10 02` and `10 04`
|
||||||
|
# raw — the device handles those literal pairs correctly.
|
||||||
|
i = 0
|
||||||
|
while i < len(raw_params):
|
||||||
|
b = raw_params[i]
|
||||||
s.append(b)
|
s.append(b)
|
||||||
|
if (
|
||||||
|
b == 0x10
|
||||||
|
and i + 1 < len(raw_params)
|
||||||
|
and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10)
|
||||||
|
):
|
||||||
|
s.append(0x10) # double the 0x10 so it survives device de-stuffing
|
||||||
|
i += 1
|
||||||
|
|
||||||
# DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
|
# DLE-aware checksum: for 0x10 XX pairs count XX; for lone bytes count them
|
||||||
chk, i = 0, 0
|
chk, i = 0, 0
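The partial stuffing rule for the params region can be checked against the capture bytes quoted in the comment above. `stuff_params` is a hypothetical standalone version of the loop, not the function in this diff.

```python
def stuff_params(raw_params: bytes) -> bytes:
    """Double a 0x10 unless the next byte is 0x02, 0x03, 0x04 or 0x10."""
    out = bytearray()
    for i, b in enumerate(raw_params):
        out.append(b)
        has_next = i + 1 < len(raw_params)
        if b == 0x10 and has_next and raw_params[i + 1] not in (0x02, 0x03, 0x04, 0x10):
            out.append(0x10)  # extra 0x10 so the device de-stuffs back to `10 X`
    return bytes(out)


# counter=0x1000: the `10 00` pair must go out as `10 10 00`.
logical = bytes.fromhex("00 01 11 10 00 00 00 00 00 00 00")
wire = stuff_params(logical)
print(wire.hex(" "))

# `10 04` is left raw, as in the metadata-page frames.
meta = bytes.fromhex("00 01 11 10 04 00")
print(stuff_params(meta).hex(" "))
```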
|
||||||
@@ -398,28 +434,26 @@ def bulk_waveform_params(key4: bytes, counter: int, *, is_probe: bool = False) -
|
|||||||
|
|
||||||
def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
|
def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
|
||||||
"""
|
"""
|
||||||
Build the 10-byte params block for the SUB 5A termination request.
|
⛔ DEPRECATED — DO NOT USE IN NEW CODE.
|
||||||
|
|
||||||
The termination request uses offset=0x005A and a DIFFERENT params layout —
|
This is the v1 termination params helper, paired with the broken
|
||||||
the leading 0x00 byte is dropped, key4[0:2] shifts to params[0:2], and the
|
`_BULK_TERM_OFFSET = 0x005A` magic offset_word. Together they produce a
|
||||||
counter high byte is at params[2]:
|
~100-byte device-side terminator response that does NOT contain the
|
||||||
|
partial-last-chunk waveform tail or the 26-byte file footer. Files
|
||||||
|
reconstructed using this terminator are missing their last ~512 bytes of
|
||||||
|
waveform data and have a synthesized footer that disagrees with what BW
|
||||||
|
would have written.
|
||||||
|
|
||||||
params[0] = key4[0]
|
**For new code, use `bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`**
|
||||||
params[1] = key4[1]
|
which computes the correct offset_word + params from the STRT-derived
|
||||||
params[2] = (counter >> 8) & 0xFF
|
`end_offset`. v2 produces wire bytes that match BW exactly across all
|
||||||
params[3:] = zeros
|
tested events (4-27-26 / 5-1-26 / 5-4-26 captures).
|
||||||
|
|
||||||
Counter for the termination request = last_regular_counter + 0x0400.
|
This function is retained ONLY for the defensive fallback path in
|
||||||
|
`read_bulk_waveform_stream()` that triggers when STRT parsing fails or no
|
||||||
Confirmed from 1-2-26 BW TX capture: final request (frame 83) uses
|
chunks are fetched (= a malformed event or an unexpected device state).
|
||||||
offset=0x005A, params[0:3] = key4[0:2] + term_counter_hi.
|
The fallback already logs a WARNING when it activates; if you see that
|
||||||
|
warning, the bug is upstream — STRT should have been parseable.
|
||||||
Args:
|
|
||||||
key4: 4-byte waveform key.
|
|
||||||
counter: Termination counter (= last regular counter + 0x0400).
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
10-byte params block.
|
|
||||||
"""
|
"""
|
||||||
if len(key4) != 4:
|
if len(key4) != 4:
|
||||||
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
||||||
@@ -430,6 +464,123 @@ def bulk_waveform_term_params(key4: bytes, counter: int) -> bytes:
|
|||||||
return bytes(p)
|
return bytes(p)
|
||||||
|
|
||||||
|
|
||||||
|
def bulk_waveform_term_v2(
|
||||||
|
key4: bytes,
|
||||||
|
end_offset: int,
|
||||||
|
last_chunk_counter: int,
|
||||||
|
) -> tuple[int, bytes]:
|
||||||
|
"""
|
||||||
|
Compute the SUB 5A TERM frame's offset_word and 10-byte params block.
|
||||||
|
|
||||||
|
Confirmed across 3 events (4-27-26 + 5-1-26 captures):
|
||||||
|
|
||||||
|
next_boundary = last_chunk_counter + 0x0200
|
||||||
|
offset_word = end_offset - next_boundary (residual byte count)
|
||||||
|
params[0] = key4[0] (= 0x01 on every observed device)
|
||||||
|
params[1] = key4[1] (= 0x11)
|
||||||
|
params[2] = (next_boundary >> 8) & 0xFF
|
||||||
|
params[3] = next_boundary & 0xFF
|
||||||
|
params[4:10] = zeros
|
||||||
|
|
||||||
|
Verification:
|
||||||
|
| end_offset | last_chunk | next_boundary | offset_word | params[2:4] |
|
||||||
|
| 0x1ABE | 0x1800 | 0x1A00 | 0x00BE | 1A 00 |
|
||||||
|
| 0x21F2 | 0x1E00 | 0x2000 | 0x01F2 | 20 00 |
|
||||||
|
| 0x417E | 0x3E38 | 0x4038 | 0x0146 | 40 38 |
|
||||||
|
|
||||||
|
The device receives `requested_address = (params[2] << 8) | offset_word`
|
||||||
|
and replies with `(end_offset - next_boundary)` bytes of waveform tail
|
||||||
|
starting at `next_boundary` — including the 26-byte file footer.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
key4: 4-byte waveform key for this event.
|
||||||
|
end_offset: Event-end pointer (= `(end_key[2] << 8) | end_key[3]`
|
||||||
|
from the STRT record at data[23:27] of A5[0]).
|
||||||
|
last_chunk_counter: Counter of the last full 0x0200-byte chunk fetched
|
||||||
|
(the chunk that covers [last_chunk_counter,
|
||||||
|
last_chunk_counter + 0x0200)).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
(offset_word, params10) tuple. Pass as
|
||||||
|
`build_5a_frame(offset_word, params)`.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
ValueError: on inconsistent inputs.
|
||||||
|
"""
|
||||||
|
if len(key4) != 4:
|
||||||
|
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
||||||
|
next_boundary = last_chunk_counter + 0x0200
|
||||||
|
if next_boundary > 0xFFFF:
|
||||||
|
raise ValueError(
|
||||||
|
f"next_boundary 0x{next_boundary:04X} exceeds uint16; check inputs"
|
||||||
|
)
|
||||||
|
if end_offset <= last_chunk_counter:
|
||||||
|
raise ValueError(
|
||||||
|
f"end_offset 0x{end_offset:04X} must be > "
|
||||||
|
f"last_chunk_counter 0x{last_chunk_counter:04X}"
|
||||||
|
)
|
||||||
|
offset_word = end_offset - next_boundary
|
||||||
|
if offset_word < 0:
|
||||||
|
# Last chunk overshot end_offset; caller should have stopped one chunk
|
||||||
|
# earlier. Treat as zero residual.
|
||||||
|
offset_word = 0
|
||||||
|
if offset_word > 0xFFFF:
|
||||||
|
raise ValueError(
|
||||||
|
f"offset_word 0x{offset_word:04X} exceeds uint16"
|
||||||
|
)
|
||||||
|
p = bytearray(10)
|
||||||
|
p[0] = key4[0]
|
||||||
|
p[1] = key4[1]
|
||||||
|
p[2] = (next_boundary >> 8) & 0xFF
|
||||||
|
p[3] = next_boundary & 0xFF
|
||||||
|
return offset_word, bytes(p)
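The verification table in the docstring can be reproduced with a few lines of arithmetic. The sketch below uses a hypothetical `term_offset_and_params` helper and leaves out the input validation. The key `01 11 00 00` is taken from the start_key values listed in this diff.

```python
def term_offset_and_params(key4: bytes, end_offset: int, last_chunk_counter: int):
    """Compute (offset_word, 10-byte params) for the TERM frame, unvalidated."""
    next_boundary = last_chunk_counter + 0x0200
    offset_word = max(end_offset - next_boundary, 0)  # residual byte count
    params = bytearray(10)
    params[0:2] = key4[0:2]
    params[2] = (next_boundary >> 8) & 0xFF
    params[3] = next_boundary & 0xFF
    return offset_word, bytes(params)


key = bytes.fromhex("01110000")
rows = [(0x1ABE, 0x1800), (0x21F2, 0x1E00), (0x417E, 0x3E38)]
results = [term_offset_and_params(key, end, last) for end, last in rows]
for (end, last), (off, params) in zip(rows, results):
    print(f"end=0x{end:04X} last=0x{last:04X} offset_word=0x{off:04X} params[2:4]={params[2:4].hex(' ')}")
```

In every row `next_boundary + offset_word` equals `end_offset`. The expression `(params[2] << 8) | offset_word` gives the same value only when the low byte of `next_boundary` is zero; for the third row it yields 0x4146 rather than 0x417E.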
|
||||||
|
|
||||||
|
|
||||||
|
# ── End-offset extraction from STRT record ────────────────────────────────────
|
||||||
|
|
||||||
|
STRT_MARKER = b"STRT"
|
||||||
|
|
||||||
|
|
||||||
|
def parse_strt_end_offset(a5_data: bytes) -> Optional[int]:
|
||||||
|
"""
|
||||||
|
Extract the event-end offset from the STRT record in an A5 response payload.
|
||||||
|
|
||||||
|
The first A5 response (the probe response, or the first chunk for events
|
||||||
|
with non-zero start_key[2:4]) contains a STRT record at byte offset 17 of
|
||||||
|
`data`. Layout:
|
||||||
|
|
||||||
|
data[17:21] "STRT"
|
||||||
|
data[21:23] ff fe sentinel
|
||||||
|
data[23:27] end_key ← 4-byte key of where this event ENDS
|
||||||
|
data[27:31] start_key
|
||||||
|
...
|
||||||
|
|
||||||
|
Returns `(end_key[2] << 8) | end_key[3]` — the absolute device-buffer
|
||||||
|
address where the event ends. Use this to bound the chunk loop and to
|
||||||
|
compute the TERM frame.
|
||||||
|
|
||||||
|
Verified end_offset values:
|
||||||
|
| event start_key | end_key | end_offset |
|
||||||
|
| 01110000 | 01111ABE | 0x1ABE |
|
||||||
|
| 01110000 | 011121F2 | 0x21F2 |
|
||||||
|
| 011121F2 | 0111417E | 0x417E |
|
||||||
|
|
||||||
|
Args:
|
||||||
|
a5_data: The `data` field of an A5 response frame (frame.data).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The end_offset (uint16) if STRT is found, else None.
|
||||||
|
"""
|
||||||
|
pos = a5_data.find(STRT_MARKER)
|
||||||
|
if pos < 0 or pos + 10 > len(a5_data):
|
||||||
|
return None
|
||||||
|
# data[pos+4:pos+6] is "ff fe"; data[pos+6:pos+10] is end_key.
|
||||||
|
end_key = a5_data[pos + 6 : pos + 10]
|
||||||
|
if len(end_key) < 4:
|
||||||
|
return None
|
||||||
|
return (end_key[2] << 8) | end_key[3]
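A quick check of the end-offset extraction against the first row of the table above. `end_offset_from_a5` is a hypothetical standalone copy. The payload is synthetic: 17 zero bytes stand in for whatever precedes the STRT record.

```python
from typing import Optional


def end_offset_from_a5(a5_data: bytes) -> Optional[int]:
    """Return the low 16 bits of end_key from the STRT record, or None."""
    pos = a5_data.find(b"STRT")
    if pos < 0 or pos + 10 > len(a5_data):
        return None
    end_key = a5_data[pos + 6 : pos + 10]  # skips the 2-byte `ff fe` sentinel
    return (end_key[2] << 8) | end_key[3]


payload = (
    bytes(17)                       # placeholder for the bytes before STRT
    + b"STRT" + b"\xff\xfe"
    + bytes.fromhex("01111ABE")     # end_key
    + bytes.fromhex("01110000")     # start_key
)
end_offset = end_offset_from_a5(payload)
print(hex(end_offset))
```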
|
||||||
|
|
||||||
|
|
||||||
# ── Pre-built POLL frames ─────────────────────────────────────────────────────
|
# ── Pre-built POLL frames ─────────────────────────────────────────────────────
|
||||||
#
|
#
|
||||||
# POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
|
# POLL (SUB 0x5B) uses the same two-step pattern as all other reads — the
|
||||||
@@ -470,7 +621,6 @@ class S3Frame:
|
|||||||
|
|
||||||
|
|
||||||
# ── Streaming S3 frame parser ─────────────────────────────────────────────────
|
# ── Streaming S3 frame parser ─────────────────────────────────────────────────
|
||||||
|
|
||||||
class S3FrameParser:
|
class S3FrameParser:
|
||||||
"""
|
"""
|
||||||
Incremental byte-stream parser for S3→BW response frames.
|
Incremental byte-stream parser for S3→BW response frames.
|
||||||
|
|||||||
@@ -0,0 +1,283 @@
|
|||||||
|
"""
|
||||||
|
histogram_codec.py — decoder for MiniMate Plus histogram-mode event bodies.
|
||||||
|
|
||||||
|
Peak and half-period fields DECODED 2026-05-20 and verified byte-exact
against BW's ASCII export across multiple histogram fixtures. The
annotation bytes and bytes [24:28] are not yet understood (see layout).
|
||||||
|
|
||||||
|
The histogram-mode body is a stream of 32-byte fixed-length blocks,
|
||||||
|
one block per histogram interval. Each block carries the per-interval
|
||||||
|
peak amplitude + zero-crossing frequency for all four channels (Tran,
|
||||||
|
Vert, Long, MicL).
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Body layout (CONFIRMED 2026-05-20)
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
[stream of 32-byte blocks]
|
||||||
|
|
||||||
|
Body length is approximately ``n_intervals * 32`` bytes plus a small
|
||||||
|
trailing remnant (1-9 bytes typically) at the very end. Walker should
|
||||||
|
iterate 32-stride and stop before the tail.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
32-byte block layout
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
[0] 0x00 always-zero tag
|
||||||
|
[1] segment_id (uint8) 0x00..0x03 — 256 blocks per segment
|
||||||
|
[2:4] block_ctr (uint16 LE) resets each segment (0x0100, 0x0101, …)
|
||||||
|
[4:6] 0x000a (uint16 LE) constant marker (= 10)
|
||||||
|
[6] T_peak_count uint8 Tran peak (count × 0.005 → in/s, max 1.275 in/s)
|
||||||
|
[7] T_annotation uint8 empirically non-zero on intervals with sub-Hz
|
||||||
|
or unmeasurable Tran freq; meaning not fully RE'd
|
||||||
|
[8:10] T_halfperiod uint16 LE Tran half-period in samples (freq = 512 / halfp Hz)
|
||||||
|
[10] V_peak_count uint8
|
||||||
|
[11] V_annotation uint8
|
||||||
|
[12:14] V_halfperiod uint16 LE
|
||||||
|
[14] L_peak_count uint8
|
||||||
|
[15] L_annotation uint8
|
||||||
|
[16:18] L_halfperiod uint16 LE
|
||||||
|
[18] M_peak_count uint8 MicL peak (count → dB via mic_count_to_db)
|
||||||
|
[19] M_annotation uint8
|
||||||
|
[20:22] M_halfperiod uint16 LE MicL half-period in samples (freq = 512 / halfp Hz)
|
||||||
|
[22:24] 0x00 0x00 constant
|
||||||
|
[24:28] 4-byte variable purpose unknown (possibly CRC or timestamp delta)
|
||||||
|
[28:32] 0x1e 0x0a 0x00 0x00 constant block-end signature
|
||||||
|
|
||||||
|
NOTE on peak-count width: an earlier interpretation treated the peak
|
||||||
|
fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
|
||||||
|
That happened to be byte-exact against the N844 fixture corpus only
|
||||||
|
because every annotation byte in those fixtures was zero, making
|
||||||
|
``uint16 LE == uint8``. Cross-correlating BE9558 (K558) Tran-drift
|
||||||
|
and BE18003 (T003) Histogram+Continuous events against the BW ASCII
|
||||||
|
export proved peak is uint8 alone — see test_histogram_codec.py
|
||||||
|
and docs/histogram_codec_re_status.md.
|
||||||
|
|
||||||
|
Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
|
||||||
|
``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``. This is the reliable
|
||||||
|
distinguisher from non-block content in the file.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Per-channel encoding
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
Geophone channels (Tran, Vert, Long):
|
||||||
|
- peak_count × 0.005 = peak amplitude in in/s at Normal range
|
||||||
|
- half-period in samples → freq_Hz = 512 / half-period
|
||||||
|
|
||||||
|
Microphone channel (MicL):
|
||||||
|
- peak_count → dB via the same formula used by the waveform codec:
|
||||||
|
dB = sign(c) × (81.94 + 20·log10(|c|)) for |c| ≥ 1
|
||||||
|
dB = 0 for c == 0
|
||||||
|
- half-period → freq_Hz = 512 / half-period (same as geo)
|
||||||
|
|
||||||
|
Frequency `>100 Hz` sentinel: the device emits half-period ≤ 5 when the
|
||||||
|
measured zero-crossing rate exceeds the geophone's measurement range
|
||||||
|
(since 512/5 = 102.4 Hz; the BW display rounds anything > 100 to ">100").
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Output shape
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
``decode_histogram_body`` returns a per-channel dict matching the
|
||||||
|
waveform codec's shape so the rest of the pipeline (.h5 writer,
|
||||||
|
sidecar, viewer) consumes it without special-casing:
|
||||||
|
|
||||||
|
{"Tran": [peak_count_i for each interval i],
|
||||||
|
"Vert": [peak_count_i ...],
|
||||||
|
"Long": [peak_count_i ...],
|
||||||
|
"MicL": [peak_count_i ...]}
|
||||||
|
|
||||||
|
Values are in **16-count units for geo** (LSB = 0.005 in/s, matching
|
||||||
|
``decode_waveform_v2``) and **1-count units for mic** (matching the
|
||||||
|
waveform codec's mic convention). Run through
|
||||||
|
``waveform_codec.decoded_to_adc_counts`` to scale geo to 1-count ADC.
|
||||||
|
|
||||||
|
Per-interval frequencies are NOT returned — they're auxiliary data,
|
||||||
|
not waveform samples. Consumers needing frequencies can call
|
||||||
|
``decode_histogram_body_full()`` for the structured per-interval
|
||||||
|
record list.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import struct
|
||||||
|
from typing import List, Optional, Tuple
|
||||||
|
|
||||||
|
# Block-end signature: constant `1e 0a 00 00` in bytes [28:32] of every
|
||||||
|
# real data block. More distinctive than the byte-22 `00 00` (which
|
||||||
|
# matches many false positives), so we anchor on this.
|
||||||
|
_BLOCK_TAIL = b"\x1e\x0a\x00\x00"
|
||||||
|
_BLOCK_SIZE = 32
|
||||||
|
|
||||||
|
# Marker word (uint16 LE) at block[4:6] of every histogram data block. Used as
|
||||||
|
# additional validation that we're looking at a real block.
|
||||||
|
_BLOCK_MARKER = 10
|
||||||
|
|
||||||
|
# Geo peak scaling: stored as "count × 0.005 in/s" where 1 count = one
|
||||||
|
# 0.005 in/s display quantum. Equivalent to the waveform codec's
|
||||||
|
# 16-count-unit output (1 unit = 0.005 in/s = 16 ADC counts).
|
||||||
|
_GEO_LSB_INS = 0.005
|
||||||
|
|
||||||
|
# Frequency formula: freq_Hz = _FREQ_NUMERATOR / half_period_samples.
|
||||||
|
# Empirically determined to be 512 (= sample_rate / 2, where sample rate
|
||||||
|
# is 1024 sps for the standard MiniMate Plus configuration).
|
||||||
|
_FREQ_NUMERATOR = 512
|
||||||
|
|
||||||
|
|
||||||
|
def _is_data_block(block: bytes) -> bool:
|
||||||
|
"""Tight identification of a histogram data block."""
|
||||||
|
if len(block) < _BLOCK_SIZE:
|
||||||
|
return False
|
||||||
|
if block[28:32] != _BLOCK_TAIL:
|
||||||
|
return False
|
||||||
|
if block[22:24] != b"\x00\x00":
|
||||||
|
return False
|
||||||
|
if block[0] != 0x00:
|
||||||
|
return False
|
||||||
|
marker = block[4] | (block[5] << 8)
|
||||||
|
if marker != _BLOCK_MARKER:
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
def _decode_block(block: bytes) -> Optional[dict]:
|
||||||
|
"""Decode one 32-byte histogram block. Caller must have validated
|
||||||
|
with ``_is_data_block`` first.
|
||||||
|
|
||||||
|
Returns a record with per-channel peak counts (uint8) and
|
||||||
|
half-periods (uint16 LE).
|
||||||
|
"""
|
||||||
|
# Peak counts are uint8 at bytes [6] / [10] / [14] / [18]. The
|
||||||
|
# adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
|
||||||
|
# whose meaning isn't fully understood (empirically non-zero in
|
||||||
|
# intervals with sub-Hz or unmeasurable geo frequencies, mostly
|
||||||
|
# zero otherwise — see test fixtures from BE9558/BE18003 corpora).
|
||||||
|
# Crucially, those annotation bytes are NOT the high byte of the
|
||||||
|
# peak count: cross-correlating against BW's per-interval ASCII
|
||||||
|
# export proves the peak is uint8 alone.
|
||||||
|
#
|
||||||
|
# Reading the peak as uint16 LE (the original interpretation) was
|
||||||
|
# accidentally correct only because every block in the N844 fixture
|
||||||
|
# corpus had a zero annotation byte; non-N844 events with non-zero
|
||||||
|
# annotation bytes decoded to physically impossible peaks (e.g.
|
||||||
|
# 268 in/s per channel) and produced 35× inflated PVS sums when
|
||||||
|
# first run against prod data. See histogram_codec_re_status.md.
|
||||||
|
t_peak = block[6]
|
||||||
|
v_peak = block[10]
|
||||||
|
l_peak = block[14]
|
||||||
|
m_peak = block[18]
|
||||||
|
t_halfp = block[8] | (block[9] << 8)
|
||||||
|
v_halfp = block[12] | (block[13] << 8)
|
||||||
|
l_halfp = block[16] | (block[17] << 8)
|
||||||
|
m_halfp = block[20] | (block[21] << 8)
|
||||||
|
segment_id = block[1]
|
||||||
|
block_ctr = block[2] | (block[3] << 8)
|
||||||
|
var_meta = bytes(block[24:28])
|
||||||
|
annotations = (block[7], block[11], block[15], block[19])
|
||||||
|
return {
|
||||||
|
"segment_id": segment_id,
|
||||||
|
"block_ctr": block_ctr,
|
||||||
|
"t_peak": t_peak,
|
||||||
|
"t_halfp": t_halfp,
|
||||||
|
"v_peak": v_peak,
|
||||||
|
"v_halfp": v_halfp,
|
||||||
|
"l_peak": l_peak,
|
||||||
|
"l_halfp": l_halfp,
|
||||||
|
"m_peak": m_peak,
|
||||||
|
"m_halfp": m_halfp,
|
||||||
|
"meta_var": var_meta,
|
||||||
|
"annotations": annotations,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def walk_body(body: bytes) -> List[dict]:
|
||||||
|
"""Walk the body and return one dict per histogram interval.
|
||||||
|
|
||||||
|
Iterates in 32-byte strides from offset 0. Collects a decoded record
|
||||||
|
for every block that passes ``_is_data_block`` validation. Stops
|
||||||
|
when the remaining bytes are too short to form a complete block.
|
||||||
|
|
||||||
|
In Histogram+Continuous mode the body interleaves data blocks with
|
||||||
|
other 32-byte content (likely continuous-mode waveform blocks) that
|
||||||
|
fail the data-block validation; the walker naturally skips them
|
||||||
|
without losing 32-byte alignment. Use ``block_ctr`` from each
|
||||||
|
returned record to map back to the original interval index — the
|
||||||
|
record list is sparse when other block types are interleaved.
|
||||||
|
"""
|
||||||
|
records: List[dict] = []
|
||||||
|
for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
|
||||||
|
blk = body[off:off + _BLOCK_SIZE]
|
||||||
|
if not _is_data_block(blk):
|
||||||
|
# Hit non-block content (likely a sync or stream marker).
|
||||||
|
# Continue walking — block alignment is fixed at 32-stride
|
||||||
|
# from offset 0, so we don't lose alignment by skipping.
|
||||||
|
continue
|
||||||
|
decoded = _decode_block(blk)
|
||||||
|
if decoded is None:
|
||||||
|
# Block validated as a histogram block but had peak fields
|
||||||
|
# outside the plausible range — undocumented extension.
|
||||||
|
# Skip rather than propagating bogus PVS contributions.
|
||||||
|
continue
|
||||||
|
records.append(decoded)
|
||||||
|
return records
|
||||||
|
|
||||||
|
|
||||||
|
def decode_histogram_body(body: bytes) -> Optional[dict]:
|
||||||
|
"""Decode a histogram-mode body into per-channel peak-sample arrays.
|
||||||
|
|
||||||
|
Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
|
||||||
|
where each channel's list contains one peak value per histogram
|
||||||
|
interval (in the same units the waveform codec uses: 16-count units
|
||||||
|
for geo, 1-count ADC units for mic). Returns ``None`` if the body
|
||||||
|
doesn't contain any valid histogram blocks.
|
||||||
|
|
||||||
|
To convert to physical units:
|
||||||
|
- Geo channels: ``count * 0.005`` = peak in in/s at Normal range
|
||||||
|
(or run through ``waveform_codec.decoded_to_adc_counts`` first
|
||||||
|
to get 1-count ADC values, then ``count / 32767 * 10.0`` for in/s)
|
||||||
|
- Mic channel: use ``waveform_codec.mic_count_to_db(count)``
|
||||||
|
"""
|
||||||
|
records = walk_body(body)
|
||||||
|
if not records:
|
||||||
|
return None
|
||||||
|
return {
|
||||||
|
"Tran": [r["t_peak"] for r in records],
|
||||||
|
"Vert": [r["v_peak"] for r in records],
|
||||||
|
"Long": [r["l_peak"] for r in records],
|
||||||
|
"MicL": [r["m_peak"] for r in records],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def decode_histogram_body_full(body: bytes) -> Optional[List[dict]]:
|
||||||
|
"""Decode a histogram-mode body into the full per-interval record list.
|
||||||
|
|
||||||
|
Same data as ``decode_histogram_body`` but in a structured form that
|
||||||
|
preserves the half-period (frequency) data for each channel + the
|
||||||
|
per-block segment_id, block_ctr, and 4-byte variable metadata.
|
||||||
|
Useful for diagnostic tools, sidecar enrichment, and future-codec
|
||||||
|
work.
|
||||||
|
|
||||||
|
Returns ``None`` if the body has no valid blocks.
|
||||||
|
"""
|
||||||
|
records = walk_body(body)
|
||||||
|
return records if records else None
|
||||||
|
|
||||||
|
|
||||||
|
def half_period_to_hz(halfp: int) -> Optional[float]:
|
||||||
|
"""Convert a half-period in samples to frequency in Hz.
|
||||||
|
|
||||||
|
Returns ``None`` for half-period ≤ 5 — the device emits values in
|
||||||
|
that range when the measured zero-crossing rate exceeds 100 Hz
|
||||||
|
(the BW display reports `>100 Hz` for such cases). Callers can
|
||||||
|
treat ``None`` as the `>100 Hz` sentinel.
|
||||||
|
"""
|
||||||
|
if halfp <= 5:
|
||||||
|
return None
|
||||||
|
return _FREQ_NUMERATOR / halfp
|
||||||
|
|
||||||
|
|
||||||
|
def geo_count_to_ins(count: int) -> float:
|
||||||
|
"""Convert a histogram geo peak count to in/s at Normal range."""
|
||||||
|
return count * _GEO_LSB_INS
|
||||||
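The block layout and per-channel conversions documented in the histogram codec docstring can be exercised on a synthetic block. The sketch below is illustrative only: `decode_block_physical` and the sample bytes are hypothetical and not part of the codec. It assumes the 32-byte layout, the 0.005 in/s geo LSB, the `512 / half-period` frequency formula, and the mic dB formula exactly as the docstring states them.

```python
import math
import struct

BLOCK_SIZE = 32
BLOCK_TAIL = b"\x1e\x0a\x00\x00"   # block-end signature at [28:32]
BLOCK_MARKER = 10                  # uint16 LE at [4:6]
GEO_LSB_INS = 0.005                # in/s per geo count at Normal range
FREQ_NUMERATOR = 512               # freq_Hz = 512 / half-period


def half_period_to_hz(halfp):
    """Return None for half-period <= 5 (the '>100 Hz' sentinel)."""
    return None if halfp <= 5 else FREQ_NUMERATOR / halfp


def mic_count_to_db(count):
    """dB = 81.94 + 20*log10(count) for count >= 1, else 0 (uint8 input)."""
    return 0.0 if count == 0 else 81.94 + 20.0 * math.log10(count)


def decode_block_physical(block):
    """Hypothetical helper: decode one block into physical units."""
    if len(block) != BLOCK_SIZE or block[28:32] != BLOCK_TAIL:
        raise ValueError("not a histogram data block")
    if block[0] != 0x00 or block[22:24] != b"\x00\x00":
        raise ValueError("not a histogram data block")
    if struct.unpack_from("<H", block, 4)[0] != BLOCK_MARKER:
        raise ValueError("not a histogram data block")
    out = {}
    for i, name in enumerate(("Tran", "Vert", "Long", "MicL")):
        # Each channel occupies 4 bytes starting at byte 6:
        # peak (uint8), annotation (uint8), half-period (uint16 LE).
        peak, annotation, halfp = struct.unpack_from("<BBH", block, 6 + 4 * i)
        value = mic_count_to_db(peak) if name == "MicL" else peak * GEO_LSB_INS
        out[name] = {"value": value, "annotation": annotation,
                     "freq_hz": half_period_to_hz(halfp)}
    return out


# Synthetic block (made-up values, not taken from any fixture).
sample = (
    b"\x00\x01" + struct.pack("<HH", 7, BLOCK_MARKER)
    + struct.pack("<BBH", 40, 0, 16)    # Tran
    + struct.pack("<BBH", 2, 3, 4)      # Vert: half-period 4 -> '>100 Hz'
    + struct.pack("<BBH", 0, 0, 512)    # Long
    + struct.pack("<BBH", 10, 0, 64)    # MicL
    + b"\x00\x00" + b"\xaa\xbb\xcc\xdd" + BLOCK_TAIL
)
decoded = decode_block_physical(sample)
```

Geo values come out in in/s and the mic value in dB; the annotation byte is passed through untouched because its meaning is not yet known.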
@@ -201,6 +201,58 @@ class Timestamp:
|
|||||||
second=second,
|
second=second,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def from_short_record(cls, data: bytes) -> "Timestamp":
|
||||||
|
"""
|
||||||
|
Decode an 8-byte timestamp header from a 210-byte waveform record.
|
||||||
|
|
||||||
|
Wire layout (✅ CONFIRMED 2026-05-01 against live SFM run on BE11529 in
|
||||||
|
Continuous mode, day-of-month = 1 May, raw: 01 05 07 ea 00 0d 15 25):
|
||||||
|
byte[0]: day (uint8)
|
||||||
|
byte[1]: month (uint8)
|
||||||
|
bytes[2-3]: year (big-endian uint16)
|
||||||
|
byte[4]: unknown (0x00 in observed sample)
|
||||||
|
byte[5]: hour (uint8)
|
||||||
|
byte[6]: minute (uint8)
|
||||||
|
byte[7]: second (uint8)
|
||||||
|
|
||||||
|
This is a third format observed in the wild — distinct from the 9-byte
|
||||||
|
(single-shot, sub_code=0x10 at [1]) and 10-byte (continuous, 0x10 at
|
||||||
|
[0] AND [2]) layouts. No marker bytes; disambiguated by where the
|
||||||
|
year lands when scanned at byte 2/3/4.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
data: at least 8 bytes; only the first 8 are consumed.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Decoded Timestamp.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
ValueError: if data is fewer than 8 bytes.
|
||||||
|
"""
|
||||||
|
if len(data) < 8:
|
||||||
|
raise ValueError(
|
||||||
|
f"Short record timestamp requires at least 8 bytes, got {len(data)}"
|
||||||
|
)
|
||||||
|
day = data[0]
|
||||||
|
month = data[1]
|
||||||
|
year = struct.unpack_from(">H", data, 2)[0]
|
||||||
|
unknown_byte = data[4]
|
||||||
|
hour = data[5]
|
||||||
|
minute = data[6]
|
||||||
|
second = data[7]
|
||||||
|
return cls(
|
||||||
|
raw=bytes(data[:8]),
|
||||||
|
flag=0,
|
||||||
|
year=year,
|
||||||
|
unknown_byte=unknown_byte,
|
||||||
|
month=month,
|
||||||
|
day=day,
|
||||||
|
hour=hour,
|
||||||
|
minute=minute,
|
||||||
|
second=second,
|
||||||
|
)
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def clock_set(self) -> bool:
|
def clock_set(self) -> bool:
|
||||||
"""False when year == 1995 (factory default / battery-lost state)."""
|
"""False when year == 1995 (factory default / battery-lost state)."""
|
||||||
|
|||||||
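The 8-byte layout handled by `from_short_record` can be checked against the raw sample quoted in its docstring (`01 05 07 ea 00 0d 15 25`). The following is a standalone sketch, not the class method itself: `ShortTimestamp` and `decode_short_timestamp` are hypothetical names used only for illustration, and the field order is assumed to be exactly as documented (day, month, big-endian year, unknown byte, hour, minute, second).

```python
import struct
from typing import NamedTuple


class ShortTimestamp(NamedTuple):
    """Hypothetical stand-in for the repo's Timestamp class."""
    day: int
    month: int
    year: int
    unknown_byte: int
    hour: int
    minute: int
    second: int


def decode_short_timestamp(data: bytes) -> ShortTimestamp:
    """Decode the 8-byte header; only the first 8 bytes are consumed."""
    if len(data) < 8:
        raise ValueError(
            f"Short record timestamp requires at least 8 bytes, got {len(data)}"
        )
    # '>' = big-endian, no padding: B B H B B B B = 8 bytes.
    return ShortTimestamp(*struct.unpack_from(">BBHBBBB", data, 0))


raw = bytes.fromhex("010507ea000d1525")
ts = decode_short_timestamp(raw)
```

With the quoted bytes, `0x07ea` decodes to the year 2026 and the time-of-day bytes `0d 15 25` to 13:21:37, which agrees with the 1 May date noted in the docstring.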
+244
-245
@@ -35,6 +35,8 @@ from .framing import (
|
|||||||
token_params,
|
token_params,
|
||||||
bulk_waveform_params,
|
bulk_waveform_params,
|
||||||
bulk_waveform_term_params,
|
bulk_waveform_term_params,
|
||||||
|
bulk_waveform_term_v2,
|
||||||
|
parse_strt_end_offset,
|
||||||
POLL_PROBE,
|
POLL_PROBE,
|
||||||
POLL_DATA,
|
POLL_DATA,
|
||||||
SESSION_RESET,
|
SESSION_RESET,
|
||||||
@@ -122,16 +124,22 @@ DATA_LENGTHS: dict[int, int] = {
|
|||||||
}
|
}
|
||||||
|
|
||||||
# SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
|
# SUB 5A (BULK_WAVEFORM_STREAM) protocol constants.
|
||||||
# Confirmed from 1-2-26 BW TX capture analysis (2026-04-02).
|
#
|
||||||
_BULK_CHUNK_OFFSET = 0x1004 # offset field for probe + all regular chunk requests ✅
|
# 2026-05-01 minimal-fix: the chunk-counter walk is now bounded by the event's
|
||||||
_BULK_TERM_OFFSET = 0x005A # offset field for termination request ✅
|
# `end_offset` extracted from the STRT record at data[23:27] of the probe
|
||||||
_BULK_COUNTER_STEP = 0x0400 # chunk counter increment per chunk ✅
|
# response. Without this bound the loop kept asking for chunks past the event
|
||||||
# Chunk counter formula: key4[2:4] + (chunk_num - 1) * 0x0400
|
# end and the device responded with post-event circular-buffer garbage,
|
||||||
# where key4[2:4] is the event's circular-buffer base offset ((key4[2]<<8)|key4[3]).
|
# corrupting reconstructed Blastware files for events ≥ 2 sec.
|
||||||
# Earlier captures showed 0x1004 for chunk 1 of key 01110000 — that was a Blastware
|
#
|
||||||
# artifact. For keys where key4[2:4] != 0x0000 (e.g. key 01111884) the old
|
# We keep the OLD 0x0400 chunk step here (BW actually uses 0x0200 — see §7.8.5
|
||||||
# "n * 0x0400" formula sends counters from the wrong buffer region and the device
|
# of the protocol reference for the corrected understanding) because the
|
||||||
# returns data from a different event. Confirmed correct 2026-04-24.
|
# existing blastware_file.py builder relies on the 0x0400-step frame structure
|
||||||
|
# to produce valid files. Switching to BW's 0x0200 step is a separate task
|
||||||
|
# that also requires updating the file builder.
|
||||||
|
# BW-exact protocol values (v0.14.0). Verified against 4-27-26 + 5-1-26 captures.
|
||||||
|
_BULK_CHUNK_OFFSET = 0x1002 # offset_word for probe + all chunk requests
|
||||||
|
_BULK_TERM_OFFSET = 0x005A # offset_word for the legacy terminator (fallback only)
|
||||||
|
_BULK_COUNTER_STEP = 0x0200 # chunk counter increment (matches chunk payload size)
|
||||||
|
|
||||||
# Default timeout values (seconds).
|
# Default timeout values (seconds).
|
||||||
# MiniMate Plus is a slow device — keep these generous.
|
# MiniMate Plus is a slow device — keep these generous.
|
||||||
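The `0x0200` counter step and the `end_offset` bound described in the comments above determine which sample-chunk counters get requested. A minimal sketch of that walk follows; `sample_chunk_counters` is a hypothetical helper, and it assumes the counter advances by one step after every fetched chunk, that event 1 starts its sample chunks at `0x0600`, and that continuation events start one step past the probe counter.

```python
from typing import List

COUNTER_STEP = 0x0200          # chunk counter increment
EVENT_1_FIRST_CHUNK = 0x0600   # first sample chunk when key4[2:4] == 0


def sample_chunk_counters(start_offset: int, end_offset: int,
                          max_chunks: int = 256) -> List[int]:
    """List the counters of the full-size sample chunks for one event.

    Hypothetical illustration: the probe (and, for event 1, the two
    metadata pages) are requested separately and are not listed here.
    """
    if start_offset == 0:
        counter = EVENT_1_FIRST_CHUNK
    else:
        # The probe at start_offset already served as the first chunk.
        counter = start_offset + COUNTER_STEP
    counters: List[int] = []
    while len(counters) < max_chunks:
        # Stop when the next chunk would straddle the event end; the
        # remaining partial chunk is delivered by the TERM response.
        if counter + COUNTER_STEP > end_offset:
            break
        counters.append(counter)
        counter += COUNTER_STEP
    return counters


# Continuation event using the start/end values quoted in the capture notes.
continuation = sample_chunk_counters(0x2238, 0x417E)
# Event at offset 0 with a made-up end_offset.
first_event = sample_chunk_counters(0x0000, 0x0A00)
# Missing STRT: end_offset falls back to 0xFFFF and max_chunks is the cap.
capped = sample_chunk_counters(0x0000, 0xFFFF, max_chunks=5)
```

Bounding the walk by `end_offset` rather than by a string match keeps the request count deterministic; `max_chunks` only matters when the STRT record could not be parsed.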
@@ -526,223 +534,270 @@ class MiniMateProtocol:
|
|||||||
self,
|
self,
|
||||||
key4: bytes,
|
key4: bytes,
|
||||||
*,
|
*,
|
||||||
stop_after_metadata: bool = True,
|
stop_after_metadata: bool = True, # DEPRECATED — no-op under BW-exact walk
|
||||||
max_chunks: int = 32,
|
max_chunks: int = 256, # safety cap only; loop is bounded by end_offset
|
||||||
include_terminator: bool = False,
|
include_terminator: bool = False,
|
||||||
extra_chunks_after_metadata: int = 1,
|
extra_chunks_after_metadata: int = 1, # DEPRECATED — no-op
|
||||||
) -> list[S3Frame]:
|
) -> list[S3Frame]:
|
||||||
"""
|
"""
|
||||||
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event.
|
Download the SUB 5A (BULK_WAVEFORM_STREAM) A5 frames for one event using
|
||||||
|
Blastware's exact protocol. REWRITTEN 2026-05-02 (v0.14.0).
|
||||||
|
|
||||||
The bulk waveform stream carries both raw ADC samples (large) and
|
Algorithm (matches BW captures across 2-sec / 3-sec / event-2):
|
||||||
event-time metadata strings ("Project:", "Client:", "User Name:",
|
|
||||||
"Seis Loc:", "Extended Notes") embedded in one of the middle frames
|
|
||||||
(confirmed: A5[7] of 9 for 1-2-26 capture).
|
|
||||||
|
|
||||||
Protocol is request-per-chunk, NOT a continuous stream:
|
1. Probe
|
||||||
1. Probe (offset=_BULK_CHUNK_OFFSET, is_probe=True, counter=0x0000)
|
- For events at start_key[2:4] = 0x0000 (first event after erase
|
||||||
2. Chunks (offset=_BULK_CHUNK_OFFSET, is_probe=False, counter+=0x0400)
|
/ wrap): probe at counter=0x0000 with full key in params.
|
||||||
3. Loop until metadata found (stop_after_metadata=True) or max_chunks
|
- For continuation events (start_key[2:4] != 0): first chunk at
|
||||||
4. Termination (offset=_BULK_TERM_OFFSET, counter=last+_BULK_COUNTER_STEP)
|
counter = start_key[2:4] (the key already includes the +0x0046 WAVEHDR offset); acts as both probe and
|
||||||
Device responds with a final A5 frame (page_key=0x0000).
|
first sample chunk; response carries STRT.
|
||||||
|
|
||||||
By default the termination frame (page_key=0x0000) is NOT included in the
|
2. Parse end_offset from STRT record at data[23:27] of the probe response.
|
||||||
returned list. Pass include_terminator=True to append it; the blastware_file
|
|
||||||
writer needs the terminator frame's body to reconstruct the waveform file footer.
|
|
||||||
|
|
||||||
Args:
|
3. Read two fixed metadata pages at counter=0x1002 and counter=0x1004
|
||||||
key4: 4-byte waveform key from EVENT_HEADER (1E).
|
— global session metadata (Project / Client / User Name / Seis Loc
|
||||||
stop_after_metadata: If True (default), send termination as soon as
|
/ Extended Notes ASCII strings). Event 1 only; continuation
|
||||||
b"Project:" is found in a frame's data — avoids
|
events skip these (BW caches them across the session).
|
||||||
downloading the full ADC waveform payload (several
|
|
||||||
hundred KB). Set False to download everything.
|
4. Walk sample chunks at 0x0200 increments, starting from 0x0600 for
|
||||||
max_chunks: Safety cap on the number of chunk requests sent
|
event 1 or `start + 0x0200` for continuation events.
|
||||||
(default 32; a typical event uses 9 large frames).
|
Stop when `next_chunk + 0x0200 > end_offset`.
|
||||||
include_terminator: If True, append the terminator A5 frame
|
|
||||||
(page_key=0x0000) to the returned list. The
|
5. Send TERM frame with offset_word and params computed by
|
||||||
terminator carries the waveform file footer bytes.
|
`bulk_waveform_term_v2(key4, end_offset, last_chunk_counter)`.
|
||||||
Default False preserves existing caller behaviour.
|
The TERM response contains the partial last chunk (residual =
|
||||||
|
end_offset - next_boundary) including the 26-byte 0e 08 file
|
||||||
|
footer.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
List of S3Frame objects from each A5 response frame. Frame indices
|
List of S3Frame objects from each A5 response (probe, metadata
|
||||||
match the request sequence: index 0 = probe response, index 1 = first
|
pages, sample chunks, optional TERM response). Caller passes
|
||||||
chunk, etc. If include_terminator=True, the last element is the
|
`include_terminator=True` (e.g. write_blastware_file) to keep the
|
||||||
terminator frame (page_key=0x0000).
|
TERM response in the list — it's required to reconstruct the
|
||||||
|
file footer.
|
||||||
|
|
||||||
|
Deprecated kwargs:
|
||||||
|
stop_after_metadata: legacy "Project:"-string-based stop condition.
|
||||||
|
No-op under the BW-exact walk; the loop is
|
||||||
|
deterministically bounded by end_offset from
|
||||||
|
STRT. Accepted for backward compat.
|
||||||
|
extra_chunks_after_metadata: same.
|
||||||
|
|
||||||
Raises:
|
Raises:
|
||||||
ProtocolError: on timeout, bad checksum, or unexpected SUB.
|
ProtocolError: on timeout / bad checksum / unexpected SUB.
|
||||||
|
|
||||||
Confirmed from 1-2-26 BW TX/RX captures (2026-04-02):
|
|
||||||
- probe + 8 regular chunks + 1 termination = 10 TX frames
|
|
||||||
- 9 large A5 responses + 1 terminator A5 = 10 RX frames
|
|
||||||
- page_key=0x0010 on large frames; page_key=0x0000 on terminator ✅
|
|
||||||
- "Project:" metadata at A5[7].data[626] ✅
|
|
||||||
"""
|
"""
|
||||||
if len(key4) != 4:
|
if len(key4) != 4:
|
||||||
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
raise ValueError(f"waveform key must be 4 bytes, got {len(key4)}")
|
||||||
|
|
||||||
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xFF - 0x5A = 0xA5
|
# Quietly accept deprecated kwargs; log at debug level when a non-default value is passed.
|
||||||
|
if not stop_after_metadata:
|
||||||
|
log.debug("5A: stop_after_metadata=False is no-op under BW-exact walk")
|
||||||
|
if extra_chunks_after_metadata not in (0, 1):
|
||||||
|
log.debug("5A: extra_chunks_after_metadata=%d is no-op under BW-exact walk",
|
||||||
|
extra_chunks_after_metadata)
|
||||||
|
|
||||||
|
rsp_sub = _expected_rsp_sub(SUB_BULK_WAVEFORM) # 0xA5
|
||||||
frames_data: list[S3Frame] = []
|
frames_data: list[S3Frame] = []
|
||||||
counter = 0
|
|
||||||
|
|
||||||
# BW counter formula (confirmed from 4-3-26 capture for key 0111245a,
|
start_offset = (key4[2] << 8) | key4[3]
|
||||||
# and empirical live-device test 2026-04-06 for key 01110000):
|
is_event_1 = (start_offset == 0)
|
||||||
# counter for chunk n = max(key4[2:4], 0x0400) + (n - 1) * 0x0400
|
|
||||||
# key4[2:4] is the event's circular-buffer base offset. The max() guard
|
|
||||||
# ensures chunk 1 never uses counter=0x0000 (which equals the probe address
|
|
||||||
# and causes the device to re-return STRT record data for the first chunk).
|
|
||||||
_key4_offset = (key4[2] << 8) | key4[3]
|
|
||||||
|
|
||||||
# ── Step 1: probe ────────────────────────────────────────────────────
|
# ── Step 1: probe / first chunk ──────────────────────────────────────
|
||||||
log.debug("5A probe key=%s key4_offset=0x%04X", key4.hex(), _key4_offset)
|
if is_event_1:
|
||||||
params = bulk_waveform_params(key4, 0, is_probe=True)
|
probe_counter = 0
|
||||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
probe_params = bulk_waveform_params(key4, 0, is_probe=True)
|
||||||
self._parser.reset() # reset bytes_fed counter before probe recv
|
log.debug("5A probe (event-1) key=%s counter=0x0000", key4.hex())
|
||||||
|
else:
|
||||||
|
# Continuation events: first 5A request lands at counter = key[2:4]
|
||||||
|
# (i.e. the address of the off=0x46 WAVEHDR record returned by 1F).
|
||||||
|
# The probe response carries STRT at byte 17 with end_offset.
|
||||||
|
#
|
||||||
|
# Confirmed 2026-05-04 from 5-1-26 "copy 2nd address" capture
|
||||||
|
# (BW probes counter=0x2238 with key=01112238, STRT@17 end=0x417E)
|
||||||
|
# and 5-4-26 BW captures (2-sec event probes counter=0x2238).
|
||||||
|
#
|
||||||
|
# The earlier "+0x46" formula in the doc came from calling
|
||||||
|
# start_key the BOUNDARY (off=0x2C) key, but the iteration walk
|
||||||
|
# uses 1F's off=0x46 key as cur_key, which already incorporates
|
||||||
|
# the +0x46 offset relative to the boundary. Adding it again
|
||||||
|
# caused the probe to overshoot, miss STRT, and run uncapped.
|
||||||
|
probe_counter = start_offset
|
||||||
|
probe_params = bulk_waveform_params(key4, probe_counter)
|
||||||
|
log.debug(
|
||||||
|
"5A probe (event-N) key=%s counter=0x%04X",
|
||||||
|
key4.hex(), probe_counter,
|
||||||
|
)
|
||||||
|
|
||||||
|
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, probe_params))
|
||||||
|
self._parser.reset()
|
||||||
try:
|
try:
|
||||||
probe_batch = self._recv_5a_batch(rsp_sub)
|
rsp = self._recv_one(expected_sub=rsp_sub, reset_parser=False)
|
||||||
except TimeoutError:
|
except TimeoutError:
|
||||||
log.warning(
|
log.warning(
|
||||||
"5A probe TIMED OUT for key=%s — "
|
"5A probe TIMED OUT for key=%s — %d raw bytes received",
|
||||||
"%d raw bytes received (no complete A5 frame assembled)",
|
|
||||||
key4.hex(), self._parser.bytes_fed,
|
key4.hex(), self._parser.bytes_fed,
|
||||||
)
|
)
|
||||||
raise
|
raise
|
||||||
frames_data.extend(probe_batch)
|
|
||||||
log.debug(
|
|
||||||
"5A probe: %d frame(s) page_keys=%s",
|
|
||||||
len(probe_batch),
|
|
||||||
[f"0x{f.page_key:04X}" for f in probe_batch],
|
|
||||||
)
|
|
||||||
|
|
||||||
# Log probe frame size for diagnostics.
|
frames_data.append(rsp)
|
||||||
# The device always needs extra_chunks_after_metadata chunks after the
|
log.debug("5A A5[0] (probe) page_key=0x%04X %d bytes",
|
||||||
# metadata frame before termination to prime the valid waveform footer.
|
rsp.page_key, len(rsp.data))
|
||||||
# This holds regardless of TCP frame size (1-frame vs 2-frame mode).
|
|
||||||
_effective_extra_chunks = extra_chunks_after_metadata
|
|
||||||
log.warning(
|
|
||||||
"5A probe data_len=%d effective_extra_chunks=%d",
|
|
||||||
len(probe_batch[0].data),
|
|
||||||
_effective_extra_chunks,
|
|
||||||
)
|
|
||||||
|
|
||||||
# ── Step 2: chunk loop ───────────────────────────────────────────────
|
# ── Step 2: parse STRT end_offset from probe response ────────────────
|
||||||
# Counter formula: _chunk_base + (chunk_num - 1) * 0x0400
|
end_offset = parse_strt_end_offset(rsp.data)
|
||||||
# where _chunk_base = max(key4[2:4], 0x0400).
|
if end_offset is None:
|
||||||
#
|
log.warning(
|
||||||
# For events with key4[2:4] != 0 (e.g. key 0111245a, offset 0x245a):
|
"5A probe response did not contain a STRT record; "
|
||||||
# _chunk_base = 0x245a → chunk 1=0x245a, chunk 2=0x285a, ...
|
"cannot bound chunk loop — falling back to max_chunks=%d cap",
|
||||||
# Confirmed from 4-3-26 capture.
|
max_chunks,
|
||||||
#
|
)
|
||||||
# For events with key4[2:4] == 0 (e.g. key 01110000):
|
end_offset = 0xFFFF # impossible value → loop runs to max_chunks
|
||||||
# _chunk_base = max(0, 0x0400) = 0x0400
|
else:
|
||||||
# → chunk 1=0x0400, chunk 2=0x0800, ... (= old chunk_num*0x0400)
|
log.info(
|
||||||
# CRITICAL: counter=0x0000 (same as the probe) causes the device to
|
"5A STRT start_offset=0x%04X end_offset=0x%04X size=0x%04X",
|
||||||
# re-return the STRT record data for chunk 1, making frame 1 look like
|
start_offset, end_offset, end_offset - start_offset,
|
||||||
# a second probe response (confirmed from server log: frame 1 len=1097,
|
)
|
||||||
# contains STRT\xff\xfe, contributes zero body bytes after DLE-strip).
|
|
||||||
# counter=0x0400 for chunk 1 confirmed working (empirical test 2026-04-06).
|
# ── Step 3: metadata pages 0x1002 + 0x1004 (event 1 only) ────────────
|
||||||
_chunk_base = max(_key4_offset, _BULK_COUNTER_STEP)
|
# Confirmed from BW captures: BW reads these two fixed device-buffer
|
||||||
for chunk_num in range(1, max_chunks + 1):
|
# pages immediately after the probe for events at start_key[2:4]=0.
|
||||||
counter = _chunk_base + (chunk_num - 1) * _BULK_COUNTER_STEP
|
# Continuation events skip them (BW caches across the session).
|
||||||
params = bulk_waveform_params(key4, counter)
|
# Their content is global compliance-setup metadata: Project, Client,
|
||||||
log.debug("5A chunk %d counter=0x%04X", chunk_num, counter)
|
# User Name, Seis Loc, Extended Notes.
|
||||||
|
if is_event_1:
|
||||||
|
for meta_counter in (0x1002, 0x1004):
|
||||||
|
# Metadata page params have an extra trailing 0x00 byte
|
||||||
|
# (12-byte params instead of 11) — empirical from BW captures.
|
||||||
|
# Checksum-neutral but matches BW byte-for-byte.
|
||||||
|
meta_params = bytes([
|
||||||
|
0x00,
|
||||||
|
key4[0], key4[1],
|
||||||
|
(meta_counter >> 8) & 0xFF,
|
||||||
|
meta_counter & 0xFF,
|
||||||
|
0, 0, 0, 0, 0, 0, 0,
|
||||||
|
])
|
||||||
|
log.debug("5A metadata page counter=0x%04X", meta_counter)
|
||||||
|
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, meta_params))
|
||||||
|
self._parser.reset()
|
||||||
|
try:
|
||||||
|
meta_rsp = self._recv_one(
|
||||||
|
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
|
||||||
|
)
|
||||||
|
except TimeoutError:
|
||||||
|
log.warning(
|
||||||
|
"5A metadata page 0x%04X TIMED OUT — continuing",
|
||||||
|
meta_counter,
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
frames_data.append(meta_rsp)
|
||||||
|
log.debug(
|
||||||
|
"5A meta@0x%04X page_key=0x%04X %d bytes",
|
||||||
|
meta_counter, meta_rsp.page_key, len(meta_rsp.data),
|
||||||
|
)
|
||||||
|
|
||||||
|
# ── Step 4: sample chunk loop, bounded by end_offset ─────────────────
|
||||||
|
# Sample chunks start at:
|
||||||
|
# event 1: counter = 0x0600
|
||||||
|
# event N (>0): counter = probe_counter + 0x0200
|
||||||
|
# (probe was the first sample chunk)
|
||||||
|
if is_event_1:
|
||||||
|
counter = 0x0600
|
||||||
|
else:
|
||||||
|
counter = probe_counter + _BULK_COUNTER_STEP
|
||||||
|
|
||||||
|
last_chunk_counter: Optional[int] = (
|
||||||
|
probe_counter if not is_event_1 else None
|
||||||
|
)
|
||||||
|
chunks_fetched = 0
|
||||||
|
|
||||||
|
while chunks_fetched < max_chunks:
|
||||||
|
# Stop when next chunk would straddle the event end.
|
||||||
|
if counter + _BULK_COUNTER_STEP > end_offset:
|
||||||
|
log.debug(
|
||||||
|
"5A chunk loop done at counter=0x%04X (end=0x%04X); "
|
||||||
|
"%d chunks fetched",
|
||||||
|
counter, end_offset, chunks_fetched,
|
||||||
|
)
|
||||||
|
break
|
||||||
|
|
||||||
|
params = bulk_waveform_params(key4, counter)
|
||||||
|
log.debug("5A chunk #%d counter=0x%04X", chunks_fetched + 1, counter)
|
||||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
||||||
self._parser.reset() # reset bytes_fed for accurate per-chunk count
|
self._parser.reset()
|
||||||
try:
|
try:
|
||||||
# Collect ALL frames from this chunk response.
|
rsp = self._recv_one(
|
||||||
# Over TCP via modem, a single large A5 device response (~1100 bytes
|
expected_sub=rsp_sub, reset_parser=False, timeout=10.0,
|
||||||
# RS-232) is split across ~2 TCP segments, each parsed as its own
|
)
|
||||||
# complete S3 frame. _recv_5a_batch gathers all of them so that
|
|
||||||
# every subsequent chunk request is paired with the correct response.
|
|
||||||
batch = self._recv_5a_batch(rsp_sub, first_timeout=10.0)
|
|
||||||
except TimeoutError:
|
except TimeoutError:
|
||||||
raw = self._parser.bytes_fed
|
raw = self._parser.bytes_fed
|
||||||
log.warning(
|
log.warning(
|
||||||
"5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
|
"5A TIMEOUT chunk=%d counter=0x%04X raw_bytes=%d",
|
||||||
chunk_num, counter, raw,
|
chunks_fetched + 1, counter, raw,
|
||||||
)
|
)
|
||||||
if raw > 0 and frames_data:
|
if raw > 0 and frames_data:
|
||||||
# Device sent a partial byte (likely a bare DLE/ETX end-of-stream
|
|
||||||
# signal) but never completed a full frame. Treat as graceful
|
|
||||||
# stream end and fall through to the termination step.
|
|
||||||
log.warning(
|
log.warning(
|
||||||
"5A end-of-stream detected at chunk=%d (raw_bytes=%d, "
|
"5A unexpected end-of-stream — proceeding to TERM",
|
||||||
"frames_collected=%d) — proceeding to termination",
|
|
||||||
chunk_num, raw, len(frames_data),
|
|
||||||
)
|
)
|
||||||
break
|
break
|
||||||
raise
|
raise
|
||||||
|
|
||||||
# Process all frames from this batch.
|
log.debug(
|
||||||
metadata_found = False
|
"5A RX chunk=%d page_key=0x%04X data_len=%d",
|
||||||
for rsp in batch:
|
chunks_fetched + 1, rsp.page_key, len(rsp.data),
|
||||||
log.warning(
|
|
||||||
"5A RX chunk=%d page_key=0x%04X data_len=%d contains_Project=%s",
|
|
||||||
chunk_num, rsp.page_key, len(rsp.data), b"Project:" in rsp.data,
|
|
||||||
)
|
|
||||||
if rsp.page_key == 0x0000:
|
|
||||||
# Device unexpectedly terminated mid-stream.
|
|
||||||
log.debug("5A page_key=0x0000 — device terminated early")
|
|
||||||
if include_terminator:
|
|
||||||
frames_data.append(rsp)
|
|
||||||
return frames_data
|
|
||||||
frames_data.append(rsp)
|
|
||||||
if stop_after_metadata and b"Project:" in rsp.data:
|
|
||||||
metadata_found = True
|
|
||||||
|
|
||||||
if metadata_found:
|
|
||||||
# Download extra_chunks_after_metadata more chunks after metadata.
|
|
||||||
# This primes the device to return the valid waveform footer in the
|
|
||||||
# termination response — without it the terminator carries too few bytes
|
|
||||||
# (confirmed 2026-04-23). The extra chunk data also belongs in the
|
|
||||||
# file body (confirmed from TCP capture analysis 2026-04-27).
|
|
||||||
log.debug("5A metadata found — fetching %d more chunk(s)",
|
|
||||||
_effective_extra_chunks)
|
|
||||||
for _extra_n in range(_effective_extra_chunks):
|
|
||||||
chunk_num += 1
|
|
||||||
counter = _chunk_base + (chunk_num - 1) * _BULK_COUNTER_STEP
|
|
||||||
params = bulk_waveform_params(key4, counter)
|
|
||||||
self._send(build_5a_frame(_BULK_CHUNK_OFFSET, params))
|
|
||||||
try:
|
|
||||||
extra_batch = self._recv_5a_batch(rsp_sub, first_timeout=10.0)
|
|
||||||
for ef in extra_batch:
|
|
||||||
log.debug(
|
|
||||||
"5A extra chunk page_key=0x%04X data_len=%d",
|
|
||||||
ef.page_key, len(ef.data),
|
|
||||||
)
|
|
||||||
if ef.page_key == 0x0000:
|
|
||||||
if include_terminator:
|
|
||||||
frames_data.append(ef)
|
|
||||||
return frames_data
|
|
||||||
frames_data.append(ef)
|
|
||||||
except TimeoutError:
|
|
||||||
log.debug("5A extra chunk %d timed out — end of stream", _extra_n + 1)
|
|
||||||
break
|
|
||||||
break
|
|
||||||
else:
|
|
||||||
log.warning(
|
|
||||||
"5A reached max_chunks=%d without end-of-stream; sending termination",
|
|
||||||
max_chunks,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
# ── Step 3: termination ──────────────────────────────────────────────
|
if rsp.page_key == 0x0000:
|
||||||
term_counter = counter + _BULK_COUNTER_STEP
|
# Device terminated mid-stream unexpectedly.
|
||||||
term_params = bulk_waveform_term_params(key4, term_counter)
|
log.warning(
|
||||||
log.debug(
|
"5A unexpected page_key=0x0000 mid-stream at counter=0x%04X",
|
||||||
"5A termination term_counter=0x%04X offset=0x%04X",
|
counter,
|
||||||
term_counter, _BULK_TERM_OFFSET,
|
)
|
||||||
)
|
if include_terminator:
|
||||||
self._send(build_5a_frame(_BULK_TERM_OFFSET, term_params))
|
frames_data.append(rsp)
|
||||||
try:
|
return frames_data
|
||||||
term_rsp = self._recv_one(expected_sub=rsp_sub)
|
|
||||||
|
frames_data.append(rsp)
|
||||||
|
last_chunk_counter = counter
|
||||||
|
counter += _BULK_COUNTER_STEP
|
||||||
|
chunks_fetched += 1
|
||||||
|
else:
|
||||||
|
log.warning(
|
||||||
|
"5A reached max_chunks=%d at counter=0x%04X (end=0x%04X)",
|
||||||
|
max_chunks, counter, end_offset,
|
||||||
|
)
|
||||||
|
|
||||||
|
# ── Step 5: TERM with proper end_offset-derived formula ──────────────
|
||||||
|
if last_chunk_counter is None or end_offset == 0xFFFF:
|
||||||
|
# No STRT or no chunks fetched — fall back to legacy TERM.
|
||||||
|
log.warning(
|
||||||
|
"5A using legacy TERM (offset_word=0x005A); "
|
||||||
|
"end_offset unavailable or no chunks fetched",
|
||||||
|
)
|
||||||
|
legacy_counter = (last_chunk_counter or probe_counter) + _BULK_COUNTER_STEP
|
||||||
|
term_offset_word = _BULK_TERM_OFFSET # 0x005A
|
||||||
|
term_params = bulk_waveform_term_params(key4, legacy_counter)
|
||||||
|
else:
|
||||||
|
term_offset_word, term_params = bulk_waveform_term_v2(
|
||||||
|
key4, end_offset, last_chunk_counter,
|
||||||
|
)
|
||||||
log.debug(
|
log.debug(
|
||||||
"5A termination response page_key=0x%04X %d bytes",
|
"5A TERM offset_word=0x%04X params[2:4]=%s end=0x%04X "
|
||||||
|
"last_chunk=0x%04X",
|
||||||
|
term_offset_word, term_params[2:4].hex(),
|
||||||
|
end_offset, last_chunk_counter,
|
||||||
|
)
|
||||||
|
|
||||||
|
self._send(build_5a_frame(term_offset_word, term_params))
|
||||||
|
try:
|
||||||
|
term_rsp = self._recv_one(expected_sub=rsp_sub, timeout=10.0)
|
||||||
|
log.info(
|
||||||
|
"5A TERM response page_key=0x%04X %d bytes",
|
||||||
term_rsp.page_key, len(term_rsp.data),
|
term_rsp.page_key, len(term_rsp.data),
|
||||||
)
|
)
|
||||||
if include_terminator:
|
if include_terminator:
|
||||||
frames_data.append(term_rsp)
|
frames_data.append(term_rsp)
|
||||||
except TimeoutError:
|
except TimeoutError:
|
||||||
log.debug("5A no termination response — device may have already closed")
|
log.warning("5A no TERM response (timeout)")
|
||||||
|
|
||||||
return frames_data
|
return frames_data
|
||||||
|
|
||||||
@@ -882,7 +937,7 @@ class MiniMateProtocol:
|
|||||||
continue
|
continue
|
||||||
|
|
||||||
chunk = data_rsp.data[11:]
|
chunk = data_rsp.data[11:]
|
||||||
log.warning(
|
log.debug(
|
||||||
"read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
|
"read_compliance_config: frame %s page=0x%04X data=%d cfg_chunk=%d running_total=%d",
|
||||||
step_name, data_rsp.page_key, len(data_rsp.data),
|
step_name, data_rsp.page_key, len(data_rsp.data),
|
||||||
len(chunk), len(config) + len(chunk),
|
len(chunk), len(config) + len(chunk),
|
||||||
@@ -902,17 +957,18 @@ class MiniMateProtocol:
|
|||||||
except TimeoutError:
|
except TimeoutError:
|
||||||
pass
|
pass
|
||||||
|
|
||||||
log.warning(
|
log.info(
|
||||||
"read_compliance_config: done — %d cfg bytes total",
|
"read_compliance_config: done — %d cfg bytes total",
|
||||||
len(config),
|
len(config),
|
||||||
)
|
)
|
||||||
|
|
||||||
# Hex dump first 128 bytes for field mapping
|
# Hex dump first 128 bytes — useful only for field-mapping work, not normal operation.
|
||||||
for row in range(0, min(len(config), 128), 16):
|
if log.isEnabledFor(logging.DEBUG):
|
||||||
row_bytes = bytes(config[row:row + 16])
|
for row in range(0, min(len(config), 128), 16):
|
||||||
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
|
row_bytes = bytes(config[row:row + 16])
|
||||||
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
|
hex_part = ' '.join(f'{b:02x}' for b in row_bytes)
|
||||||
log.warning(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
|
asc_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row_bytes)
|
||||||
|
log.debug(" cfg[%04x]: %-48s %s", row, hex_part, asc_part)
|
||||||
|
|
||||||
return bytes(config)
|
return bytes(config)
|
||||||
|
|
||||||
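The hunk above moves the config hex dump behind `log.isEnabledFor(logging.DEBUG)`, so the per-row formatting is skipped in normal operation. A minimal, self-contained sketch of that pattern follows. `hexdump_rows`, `dump_config`, and the logger name are hypothetical and not part of the repository.

```python
import logging

log = logging.getLogger("hexdump_sketch")  # hypothetical logger name


def hexdump_rows(data: bytes, limit: int = 128, width: int = 16) -> list[str]:
    """Format the first *limit* bytes as hex + ASCII rows (illustrative helper)."""
    rows = []
    for off in range(0, min(len(data), limit), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        asc_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"cfg[{off:04x}]: {hex_part:<48} {asc_part}")
    return rows


def dump_config(config: bytes) -> None:
    # Skip the formatting work entirely unless DEBUG output is enabled.
    if log.isEnabledFor(logging.DEBUG):
        for row in hexdump_rows(config):
            log.debug("  %s", row)
```

The guard matters only because building the rows costs something; a single `log.debug("%s", value)` call with lazy `%` arguments would not need it.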
@@ -1403,63 +1459,6 @@ class MiniMateProtocol:
|
|||||||
log.debug("TX %d bytes: %s", len(frame), frame.hex())
|
log.debug("TX %d bytes: %s", len(frame), frame.hex())
|
||||||
self._transport.write(frame)
|
self._transport.write(frame)
|
||||||
|
|
||||||
def _recv_5a_batch(
|
|
||||||
self,
|
|
||||||
expected_sub: int,
|
|
||||||
first_timeout: float = 10.0,
|
|
||||||
batch_timeout: float = 0.5,
|
|
||||||
) -> list[S3Frame]:
|
|
||||||
"""
|
|
||||||
Collect all S3 frames that arrive as part of one device response.
|
|
||||||
|
|
||||||
Over TCP via cellular modem, a single device A5 response (~1100 bytes of
|
|
||||||
RS-232 data) is forwarded in multiple TCP segments due to the modem's
|
|
||||||
data-forwarding timeout (~100-150 ms per segment). Each TCP segment
|
|
||||||
contains a complete, valid S3 frame (~550 bytes). Calling _recv_one()
|
|
||||||
once returns only the first segment's frame and misses the rest, causing
|
|
||||||
the chunk request/response pairing to cascade out of alignment.
|
|
||||||
|
|
||||||
This helper collects ALL frames before returning, by trying additional
|
|
||||||
short-timeout receives after the first frame arrives.
|
|
||||||
|
|
||||||
The caller must call self._parser.reset() before this method to ensure
|
|
||||||
bytes_fed is accurate; this method always uses reset_parser=False.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
expected_sub: Expected SUB byte for validation.
|
|
||||||
first_timeout: Timeout for the mandatory first frame. Should be
|
|
||||||
generous (default 10 s) since the device may be slow.
|
|
||||||
batch_timeout: Short timeout for subsequent frames. Default 0.5 s
|
|
||||||
— comfortably longer than the modem forwarding gap
|
|
||||||
(~150 ms) but short enough to avoid stalling when
|
|
||||||
only one frame is expected (probe, terminator).
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of S3Frame objects in arrival order (at least one).
|
|
||||||
|
|
||||||
Raises:
|
|
||||||
TimeoutError: If no frame arrives within first_timeout.
|
|
||||||
UnexpectedResponse: If any frame has the wrong SUB byte.
|
|
||||||
"""
|
|
||||||
frames: list[S3Frame] = []
|
|
||||||
first = self._recv_one(
|
|
||||||
expected_sub=expected_sub,
|
|
||||||
reset_parser=False,
|
|
||||||
timeout=first_timeout,
|
|
||||||
)
|
|
||||||
frames.append(first)
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
extra = self._recv_one(
|
|
||||||
expected_sub=expected_sub,
|
|
||||||
reset_parser=False,
|
|
||||||
timeout=batch_timeout,
|
|
||||||
)
|
|
||||||
frames.append(extra)
|
|
||||||
except TimeoutError:
|
|
||||||
break
|
|
||||||
return frames
|
|
||||||
|
|
||||||
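The `_recv_5a_batch` helper removed in this hunk waited a long time for the first frame and then used a short timeout to pick up stragglers. That timing pattern can be sketched against a plain `queue.Queue`. The queue stands in for the transport and parser, and `recv_batch` is a hypothetical name used only for this illustration.

```python
import queue


def recv_batch(frames_in: queue.Queue, first_timeout: float = 10.0,
               batch_timeout: float = 0.5) -> list:
    """Return every frame that arrives as part of one response burst."""
    try:
        # The first frame is mandatory: wait generously for it.
        first = frames_in.get(timeout=first_timeout)
    except queue.Empty:
        raise TimeoutError("no frame within first_timeout") from None

    frames = [first]
    while True:
        try:
            # Later frames are optional: a short wait ends the batch.
            frames.append(frames_in.get(timeout=batch_timeout))
        except queue.Empty:
            break
    return frames
```

The cost of this approach is that every call pays one `batch_timeout` of idle time after the last frame, which is the stall the original docstring tried to keep short.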
def _recv_one(
|
def _recv_one(
|
||||||
self,
|
self,
|
||||||
expected_sub: Optional[int] = None,
|
expected_sub: Optional[int] = None,
|
||||||
|
|||||||
@@ -454,3 +454,102 @@ class SocketTransport(TcpTransport):
|
|||||||
|
|
||||||
def __repr__(self) -> str:
|
def __repr__(self) -> str:
|
||||||
return f"SocketTransport(peer={self.host!r})"
|
return f"SocketTransport(peer={self.host!r})"
|
||||||
|
|
||||||
|
|
||||||
|
# ── Capturing transport (MITM-style raw byte mirror) ──────────────────────────
|
||||||
|
|
||||||
|
class CapturingTransport(BaseTransport):
|
||||||
|
"""
|
||||||
|
Wraps another BaseTransport and mirrors every byte to two raw capture files:
|
||||||
|
|
||||||
|
raw_bw_<...>.bin — bytes WE wrote to the device (BW-side TX)
|
||||||
|
raw_s3_<...>.bin — bytes the device wrote back (S3-side TX)
|
||||||
|
|
||||||
|
The file naming and on-wire byte layout are identical to the captures
|
||||||
|
produced by `bridges/ach_mitm.py`, so the resulting `.bin` files can be
|
||||||
|
loaded directly by the Analyzer (File > Open Capture) and parsed by the
|
||||||
|
same tooling used for genuine Blastware MITM captures.
|
||||||
|
|
||||||
|
All BaseTransport methods are forwarded to the inner transport; the only
|
||||||
|
side-effect is that successful read/write byte streams are appended to the
|
||||||
|
two open binary files.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
inner: An already-built BaseTransport (SerialTransport / TcpTransport).
|
||||||
|
bw_path: File path for the "BW TX" stream (bytes we send). Opened "wb".
|
||||||
|
s3_path: File path for the "S3 TX" stream (bytes the device sends).
|
||||||
|
Opened "wb".
|
||||||
|
|
||||||
|
Example:
|
||||||
|
with CapturingTransport(TcpTransport("1.2.3.4", 9034),
|
||||||
|
"raw_bw.bin", "raw_s3.bin") as t:
|
||||||
|
client = MiniMateClient(transport=t)
|
||||||
|
client.connect()
|
||||||
|
client.get_events()
|
||||||
|
# both .bin files now hold the full bidirectional capture.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, inner: BaseTransport, bw_path: str, s3_path: str) -> None:
|
||||||
|
self._inner = inner
|
||||||
|
self._bw_path = bw_path
|
||||||
|
self._s3_path = s3_path
|
||||||
|
self._bw_fh = None
|
||||||
|
self._s3_fh = None
|
||||||
|
# Forward inner attrs so callers can introspect (e.g. .host, .port).
|
||||||
|
self.host = getattr(inner, "host", None)
|
||||||
|
self.port = getattr(inner, "port", None)
|
||||||
|
|
||||||
|
# ── BaseTransport interface ───────────────────────────────────────────────
|
||||||
|
|
||||||
|
def connect(self) -> None:
|
||||||
|
if self._bw_fh is None:
|
||||||
|
self._bw_fh = open(self._bw_path, "wb", buffering=0)
|
||||||
|
if self._s3_fh is None:
|
||||||
|
self._s3_fh = open(self._s3_path, "wb", buffering=0)
|
||||||
|
self._inner.connect()
|
||||||
|
|
||||||
|
def disconnect(self) -> None:
|
||||||
|
try:
|
||||||
|
self._inner.disconnect()
|
||||||
|
finally:
|
||||||
|
for fh_attr in ("_bw_fh", "_s3_fh"):
|
||||||
|
fh = getattr(self, fh_attr)
|
||||||
|
if fh is not None:
|
||||||
|
try:
|
||||||
|
fh.flush()
|
||||||
|
fh.close()
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
setattr(self, fh_attr, None)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def is_connected(self) -> bool:
|
||||||
|
return self._inner.is_connected
|
||||||
|
|
||||||
|
def write(self, data: bytes) -> None:
|
||||||
|
self._inner.write(data)
|
||||||
|
if data and self._bw_fh is not None:
|
||||||
|
try:
|
||||||
|
self._bw_fh.write(data)
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
|
||||||
|
def read(self, n: int) -> bytes:
|
||||||
|
got = self._inner.read(n)
|
||||||
|
if got and self._s3_fh is not None:
|
||||||
|
try:
|
||||||
|
self._s3_fh.write(got)
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
return got
|
||||||
|
|
||||||
|
@property
|
||||||
|
def bw_path(self) -> str:
|
||||||
|
return self._bw_path
|
||||||
|
|
||||||
|
@property
|
||||||
|
def s3_path(self) -> str:
|
||||||
|
return self._s3_path
|
||||||
|
|
||||||
|
def __repr__(self) -> str:
|
||||||
|
return f"CapturingTransport({self._inner!r}, bw={self._bw_path!r}, s3={self._s3_path!r})"
|
||||||
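The mirroring behaviour of `CapturingTransport` can be tried without hardware. The sketch below is a simplified stand-in, not the class from the diff: `LoopbackInner` and `TeeTransport` are hypothetical names, and in-memory `io.BytesIO` sinks replace the two capture files.

```python
import io


class LoopbackInner:
    """Hypothetical stand-in for a real transport: replays canned device bytes."""

    def __init__(self, device_bytes: bytes) -> None:
        self._rx = io.BytesIO(device_bytes)
        self.sent = bytearray()

    def write(self, data: bytes) -> None:
        self.sent += data

    def read(self, n: int) -> bytes:
        return self._rx.read(n)


class TeeTransport:
    """Forward read/write to *inner* and mirror the bytes into two sinks."""

    def __init__(self, inner, tx_sink, rx_sink) -> None:
        self._inner = inner
        self._tx_sink = tx_sink
        self._rx_sink = rx_sink

    def write(self, data: bytes) -> None:
        self._inner.write(data)
        if data:
            self._tx_sink.write(data)  # bytes we sent

    def read(self, n: int) -> bytes:
        got = self._inner.read(n)
        if got:
            self._rx_sink.write(got)  # bytes the device sent back
        return got


tx_log, rx_log = io.BytesIO(), io.BytesIO()
tee = TeeTransport(LoopbackInner(b"\x10\x02reply"), tx_log, rx_log)
tee.write(b"\x10\x02request")
reply = tee.read(4)
```

Only bytes actually returned by `read` reach the receive sink, so a short read captures a short record, which matches the "successful read/write byte streams" wording in the class docstring.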
|
|||||||
@@ -0,0 +1,578 @@
|
|||||||
|
"""
|
||||||
|
waveform_codec.py — block-walker and verified decoder for the MiniMate Plus
|
||||||
|
waveform-file body.
|
||||||
|
|
||||||
|
FULLY DECODED 2026-05-11. Every block type, every channel, and the
|
||||||
|
channel-rotation rule are verified byte-exact against BW's ASCII export
|
||||||
|
across the 9-event fixture bundle (47,364 ADC samples, zero errors).
|
||||||
|
|
||||||
|
The Blastware waveform-file body — the bytes between the 21-byte STRT
|
||||||
|
record and the 26-byte file footer — is a tagged variable-length block
|
||||||
|
stream with a custom delta + RLE codec. (Not raw int16 LE, which was
|
||||||
|
the historical wrong assumption that produced ±32K noise on every event.)
|
||||||
|
|
||||||
|
Current status:
|
||||||
|
|
||||||
|
- Block framing: ✅ solved (5 block types and lengths all confirmed)
|
||||||
|
- Per-channel decode: ✅ solved (Tran / Vert / Long / MicL all byte-exact)
|
||||||
|
- Channel rotation: ✅ Tran → Vert → Long → MicL per segment
|
||||||
|
- Segment header: ✅ fully decoded (anchor pair + prev-channel extension)
|
||||||
|
- 30 NN packed-delta block: ✅ NN × 12-bit signed deltas in NN/4 groups
|
||||||
|
- MicL → dB(L) conversion: ✅ ``mic_count_to_db`` matches BW display
|
||||||
|
- Production wiring: ✅ ``client.py:_decode_a5_waveform`` uses the new
|
||||||
|
codec (via ``decode_a5_frames``). ``.h5`` sidecars now render
|
||||||
|
correctly.
|
||||||
|
|
||||||
|
Known limitations:
|
||||||
|
|
||||||
|
- Walker stops early on the loudest events (SP0, SS0, SV0, event-b) at
|
||||||
|
some mid-segment edge cases not yet fully characterized. Every
|
||||||
|
sample reached IS correct; the walker just doesn't reach all of
|
||||||
|
them yet. The cleanly-decoded subset is still ~5000–15000 samples
|
||||||
|
per loud event.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Body layout (CONFIRMED 2026-05-11 against 8 fixture events)
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
[7-byte preamble] [stream of tagged blocks] [trailer]
|
||||||
|
|
||||||
|
The preamble is always exactly 7 bytes:
|
||||||
|
|
||||||
|
body[0:3] = 00 02 00 magic
|
||||||
|
body[3:5] = Tran[0] int16 BE in 16-count units (LSB = 0.005 in/s)
|
||||||
|
body[5:7] = Tran[1] int16 BE in 16-count units
|
||||||
|
|
||||||
|
(Earlier drafts of this module described a "7-or-9-byte preamble";
|
||||||
|
that was wrong — single-shot and continuous events both use 7 bytes.
|
||||||
|
The "extra 2 bytes" on continuous events were the first ``00 NN`` RLE
|
||||||
|
marker, not part of the preamble.)
|
||||||
|
|
||||||
|
Block types and lengths (all confirmed):
|
||||||
|
|
||||||
|
| Tag | Length | Meaning |
|
||||||
|
|----------|-----------------------|----------------------------------------|
|
||||||
|
| ``10 NN``| NN/2 + 2 bytes | 4-bit nibble deltas (2 per byte; high |
|
||||||
|
| | | nibble first; signed 0..7 / 8..F = -8..-1)|
|
||||||
|
| ``20 NN``| NN + 2 bytes | int8 signed deltas (1 per byte) |
|
||||||
|
| ``00 NN``| 2 bytes | RLE: append NN copies of current value |
|
||||||
|
| ``30 NN``| NN*3/2 + 2 in data, NN*4 | NN packed 12-bit signed deltas. Only in loud events. |
|
||||||
|
| | in trailer | |
|
||||||
|
| ``40 02``| 20 bytes (fixed) | Segment header |
|
||||||
|
|
||||||
|
NN is always a multiple of 4.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Tran channel, segment 0 (CONFIRMED 2026-05-11)
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
Segment 0 — everything before the first ``40 02`` segment header — encodes
|
||||||
|
Tran samples only. Starting from preamble anchors Tran[0] and Tran[1],
|
||||||
|
each subsequent block contributes to the running Tran value:
|
||||||
|
|
||||||
|
10 NN → append NN deltas (4-bit signed nibbles)
|
||||||
|
20 NN → append NN deltas (int8 signed bytes)
|
||||||
|
00 NN → append NN copies of the current value (RLE zeros)
|
||||||
|
40 02 → segment 0 ends; multi-segment continuation is open
|
||||||
|
|
||||||
|
This decodes the first 482–510 samples of Tran for each event with zero
|
||||||
|
errors against BW's ASCII export. The exact segment-0 sample count
|
||||||
|
varies per event (it's bounded by a fixed device-flash byte budget, not
|
||||||
|
a fixed sample count — quiet events fit more samples because zero
|
||||||
|
deltas pack into ``00 NN`` markers compactly).
|
||||||
|
|
||||||
|
Implementation: :func:`decode_tran_initial`.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Segment header (40 02, 20 bytes total)
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
The 18-byte payload of the ``40 02`` block:
|
||||||
|
|
||||||
|
| Offset | Field | Status |
|
||||||
|
|-----------|---------------------------------------------|-------------|
|
||||||
|
| [0:2] | T_delta at first sample of new segment | ✅ confirmed|
|
||||||
|
| | (int16 BE, in 16-count units) | |
|
||||||
|
| [2:4] | Likely T_delta at sample seg_start+1 | 🟡 likely |
|
||||||
|
| [4:6] | Unknown (varies; possibly checksum) | ❓ open |
|
||||||
|
| [6:8] | Byte length to next segment header − 2 | ✅ confirmed|
|
||||||
|
| | (uint16 BE; useful for walker pre-scan) | |
|
||||||
|
| [8:12] | Monotonic uint32 LE counter | ✅ confirmed|
|
||||||
|
| | (starts ~0x47, increments by 1 per segment) | |
|
||||||
|
| [12:14] | Constant ``02 00`` | ✅ confirmed|
|
||||||
|
| [14:18] | Unknown 4-byte field | ❓ open |
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
What broke the multi-segment decoder (historical notes; resolved 2026-05-11 by channel rotation)
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
After segment 0 ends and the segment header T_delta is consumed,
|
||||||
|
applying segment 1's blocks as Tran continuation produces values that
|
||||||
|
diverge from truth by sample ~512. The block structure inside segment
|
||||||
|
1 is IDENTICAL to segment 0 (same alternating 10 NN / 00 NN pattern),
|
||||||
|
and the delta budget matches the segment size exactly (V70 segment 1
|
||||||
|
has 264 nibble-deltas + 244 RLE zeros = 508 = the segment's sample
|
||||||
|
count). But the cumulative is wrong.
|
||||||
|
|
||||||
|
The hypothesis at the time (since confirmed, see ``decode_waveform_v2``) was that segments rotate channels:
|
||||||
|
|
||||||
|
segment 0 → Tran samples 0..509
|
||||||
|
segment 1 → Vert samples 0..507
|
||||||
|
segment 2 → Long samples 0..507
|
||||||
|
segment 3 → Mic samples 0..507
|
||||||
|
segment 4 → Tran samples 510..N (continuation)
|
||||||
|
...
|
||||||
|
|
||||||
|
This is consistent with the segment-1 block sums net-to-near-zero in
|
||||||
|
V70 (where all 4 channels are near zero) and with the per-segment delta
|
||||||
|
budget matching the segment size for a single channel. When these notes
|
||||||
|
were written it was not yet verified because the per-segment channel anchor
|
||||||
|
had not been pinned down in the segment header; ``decode_waveform_v2`` now
|
||||||
|
reads each segment's 2-sample anchor pair from header bytes [14:18].
|
||||||
|
|
||||||
|
See ``docs/waveform_codec_re_status.md`` for the current working notes
|
||||||
|
and the suggested next experiment ("segment-channel scoring analyzer").
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import math
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional, Tuple
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class WaveformBlock:
|
||||||
|
"""One tagged block parsed out of a Blastware waveform-file body."""
|
||||||
|
offset: int # byte offset into body
|
||||||
|
tag_hi: int # first tag byte (0x10 / 0x20 / 0x00 / 0x30 / 0x40)
|
||||||
|
tag_lo: int # second tag byte (NN)
|
||||||
|
data: bytes # block payload (excludes the 2-byte tag)
|
||||||
|
length: int # total block length on the wire (includes the tag)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def kind(self) -> str:
|
||||||
|
return f"{self.tag_hi:02x} {self.tag_lo:02x}"
|
||||||
|
|
||||||
|
|
||||||
|
def find_data_start(body: bytes) -> int:
|
||||||
|
"""Auto-detect the offset of the first data block.
|
||||||
|
|
||||||
|
The body starts with a 7-byte preamble (magic ``00 02 00`` + two int16 BE
|
||||||
|
Tran anchors). After that, the data section starts with a tag — usually
|
||||||
|
``10 NN`` or ``20 NN``, but quiet events may begin with a ``00 NN`` RLE
|
||||||
|
marker. We return the offset of the first recognized tag.
|
||||||
|
"""
|
||||||
|
# Try fixed offset 7 first (canonical preamble length).
|
||||||
|
if len(body) >= 9:
|
||||||
|
b, nn = body[7], body[8]
|
||||||
|
if (b in (0x00, 0x10, 0x20, 0x30) and nn % 4 == 0 and 0 < nn <= 0xFC) \
|
||||||
|
or (b == 0x40 and nn == 0x02):
|
||||||
|
return 7
|
||||||
|
# Fall back to scanning the first 20 bytes.
|
||||||
|
for i in range(min(20, len(body) - 1)):
|
||||||
|
b = body[i]
|
||||||
|
nn = body[i + 1]
|
||||||
|
if b in (0x10, 0x20) and nn % 4 == 0 and 0 < nn <= 0xFC:
|
||||||
|
return i
|
||||||
|
return -1
|
||||||
|
|
||||||
|
|
||||||
|
def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
|
||||||
|
"""Walk the tagged-block sequence starting at *start* (auto-detected by default).
|
||||||
|
|
||||||
|
Stops when an unrecognized tag is encountered or end of body is reached.
|
||||||
|
Returned blocks are in stream order.
|
||||||
|
"""
|
||||||
|
if start is None:
|
||||||
|
start = find_data_start(body)
|
||||||
|
if start < 0:
|
||||||
|
return []
|
||||||
|
|
||||||
|
blocks: List[WaveformBlock] = []
|
||||||
|
i = start
|
||||||
|
while i + 1 < len(body):
|
||||||
|
t0 = body[i]
|
||||||
|
t1 = body[i + 1]
|
||||||
|
if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||||
|
length = t1 // 2 + 2
|
||||||
|
elif (t0 & 0xF0) == 0x10 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
|
||||||
|
# Wide-NN nibble block: ``1X NN`` where X is the high nibble of a
|
||||||
|
# 12-bit NN value. NN = ((t0 & 0x0F) << 8) | t1. Block length
|
||||||
|
# = NN/2 + 2 bytes (NN nibble deltas, same as ``10 NN`` semantics
|
||||||
|
# but with NN > 0xFC). Confirmed 2026-05-11 in SP0 segment 12
|
||||||
|
# where V continuation uses ``11 90`` = NN=0x190=400.
|
||||||
|
wide_nn = ((t0 & 0x0F) << 8) | t1
|
||||||
|
length = wide_nn // 2 + 2
|
||||||
|
elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
|
||||||
|
length = t1 + 2
|
||||||
|
elif (t0 & 0xF0) == 0x20 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
|
||||||
|
# Wide-NN int8 block: ``2X NN`` extends NN to 12 bits the same way.
|
||||||
|
wide_nn = ((t0 & 0x0F) << 8) | t1
|
||||||
|
length = wide_nn + 2
|
||||||
|
elif t0 == 0x00 and t1 % 4 == 0:
|
||||||
|
length = 2
|
||||||
|
elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
|
||||||
|
# Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
|
||||||
|
# as NN/4 groups of (2-byte high-nibble field + 4 × int8 low byte).
|
||||||
|
# Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
|
||||||
|
# 20 for NN=12, etc.). Confirmed 2026-05-11 by full-decoder
|
||||||
|
# verification against BW ASCII export.
|
||||||
|
#
|
||||||
|
# Trailer-section ``30 NN`` blocks have a different length formula
|
||||||
|
# (NN × 4 = 32 for NN=8 in trailers). We try the data-section
|
||||||
|
# length first and fall back to the trailer length if needed.
|
||||||
|
cand_data = t1 * 3 // 2 + 2
|
||||||
|
cand_trailer = t1 * 4
|
||||||
|
if (i + cand_data < len(body) - 1
|
||||||
|
and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
|
||||||
|
length = cand_data
|
||||||
|
else:
|
||||||
|
length = cand_trailer
|
||||||
|
elif t0 == 0x40 and t1 == 0x02:
|
||||||
|
length = 20
|
||||||
|
else:
|
||||||
|
# Unknown tag; stop. Caller can inspect ``i`` to see where.
|
||||||
|
break
|
||||||
|
|
||||||
|
if i + length > len(body):
|
||||||
|
break
|
||||||
|
|
||||||
|
data = bytes(body[i + 2 : i + length])
|
||||||
|
blocks.append(WaveformBlock(offset=i, tag_hi=t0, tag_lo=t1, data=data, length=length))
|
||||||
|
i += length
|
||||||
|
|
||||||
|
return blocks
|
||||||
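The comments in `walk_body` describe the data-section ``30 NN`` layout as NN/4 groups of 6 bytes: a 2-byte field holding four high nibbles, followed by four low bytes. A small worked sketch of that packing follows. The function names are hypothetical, and the two's-complement sign extension is an assumption based on the "12-bit signed deltas" wording.

```python
def sign_extend_12(v: int) -> int:
    """Interpret a 12-bit value as two's-complement signed (assumed convention)."""
    return v - 0x1000 if v & 0x800 else v


def block_length_30(nn: int) -> int:
    """Data-section block length: 2-byte tag plus NN/4 groups of 6 bytes."""
    return nn // 4 * 6 + 2


def decode_30_payload(nn: int, payload: bytes) -> list[int]:
    """Unpack NN 12-bit deltas from the payload that follows the 2-byte tag."""
    deltas = []
    for g in range(nn // 4):
        grp = payload[g * 6:(g + 1) * 6]
        if len(grp) < 6:
            break  # truncated group: stop rather than guess
        high_word = (grp[0] << 8) | grp[1]
        for k in range(4):
            # High nibbles are packed MSB first; low bytes follow in order.
            nib = (high_word >> (12 - 4 * k)) & 0xF
            deltas.append(sign_extend_12((nib << 8) | grp[2 + k]))
    return deltas


# One group: high nibbles 0, F, 1, 0 and low bytes 01, FF, 00, 80.
payload = bytes([0x0F, 0x10, 0x01, 0xFF, 0x00, 0x80])
deltas = decode_30_payload(4, payload)  # [1, -1, 256, 128]
```

With this layout the lengths quoted in the comment follow directly: `block_length_30` gives 8, 14, and 20 bytes for NN = 4, 8, and 12.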
|
|
||||||
|
|
||||||
|
def split_segments(blocks: List[WaveformBlock]) -> List[List[WaveformBlock]]:
|
||||||
|
"""Group consecutive blocks into segments separated by ``40 02`` headers.
|
||||||
|
|
||||||
|
The first segment is whatever runs before the first ``40 02`` header
|
||||||
|
(typically the "segment 0" preamble data after the body preamble).
|
||||||
|
Subsequent segments start with a ``40 02`` block, then have their
|
||||||
|
own data blocks until the next ``40 02``.
|
||||||
|
"""
|
||||||
|
segments: List[List[WaveformBlock]] = []
|
||||||
|
current: List[WaveformBlock] = []
|
||||||
|
for b in blocks:
|
||||||
|
if b.tag_hi == 0x40 and b.tag_lo == 0x02:
|
||||||
|
if current:
|
||||||
|
segments.append(current)
|
||||||
|
current = [b]
|
||||||
|
else:
|
||||||
|
current.append(b)
|
||||||
|
if current:
|
||||||
|
segments.append(current)
|
||||||
|
return segments
|
||||||
|
|
||||||
|
|
||||||
|
def parse_segment_header(block: WaveformBlock) -> Optional[dict]:
|
||||||
|
"""Decode the 18-byte payload of a ``40 02`` segment header.
|
||||||
|
|
||||||
|
Returns a dict with the labelled fields, or None if *block* is not
|
||||||
|
a ``40 02`` header.
|
||||||
|
"""
|
||||||
|
if not (block.tag_hi == 0x40 and block.tag_lo == 0x02):
|
||||||
|
return None
|
||||||
|
if len(block.data) < 18:
|
||||||
|
return None
|
||||||
|
p = block.data
|
||||||
|
counter = int.from_bytes(p[8:12], "little", signed=False)
|
||||||
|
return {
|
||||||
|
"anchor_bytes": p[0:4], # 4-byte field, role unconfirmed
|
||||||
|
"field2": p[4:8], # 4-byte field, role unconfirmed
|
||||||
|
"counter": counter, # uint32 LE — increments by 1 per segment
|
||||||
|
"fixed_pattern": p[12:16], # always b"\x02\x00\x00\x01"
|
||||||
|
"tail": p[16:18], # last 2 bytes
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _s4(n: int) -> int:
|
||||||
|
"""Sign-extend a 4-bit value to signed int (0..7 → 0..7; 8..F → -8..-1)."""
|
||||||
|
return n if n < 8 else n - 16
|
||||||
|
|
||||||
|
|
||||||
|
def _i8(b: int) -> int:
|
||||||
|
"""Reinterpret an unsigned byte as signed int8."""
|
||||||
|
return b if b < 128 else b - 256
|
||||||
|
|
||||||
|
|
||||||
|
def decode_tran_initial(body: bytes) -> Optional[List[int]]:
|
||||||
|
"""
|
||||||
|
Decode the initial Tran-channel samples — VERIFIED 2026-05-11.
|
||||||
|
|
||||||
|
Returns Tran samples in **16-count units** (LSB = 0.005 in/s at Normal
|
||||||
|
range — the same quantization BW uses for its ASCII export). Returns
|
||||||
|
``None`` if the body cannot be parsed.
|
||||||
|
|
||||||
|
The decoded list extends from sample 0 through the end of segment 0
|
||||||
|
(= just before the first ``40 02`` segment header; ~510 sample-sets
|
||||||
|
for the events tested). Multi-segment decoding requires continuing
|
||||||
|
past the segment header — that's done by :func:`decode_tran_full`
|
||||||
|
when the per-segment rules are pinned down for all signal types.
|
||||||
|
|
||||||
|
Codec for segment 0 (CONFIRMED 2026-05-11 against 7 fixture events):
|
||||||
|
|
||||||
|
- Body bytes [0:3] are the magic ``00 02 00``.
|
||||||
|
- Body bytes [3:5] = ``Tran[0]`` as int16 BE in 16-count units.
|
||||||
|
- Body bytes [5:7] = ``Tran[1]`` as int16 BE in 16-count units.
|
||||||
|
- Data blocks (``10 NN`` or ``20 NN``) carry Tran deltas starting
|
||||||
|
at sample 2:
|
||||||
|
|
||||||
|
* ``10 NN``: NN nibbles = NN/2 bytes; each nibble is a 4-bit
|
||||||
|
signed delta (0..7 → 0..+7; 8..F → -8..-1). High nibble of
|
||||||
|
each byte comes first.
|
||||||
|
* ``20 NN``: NN int8 signed deltas (one delta per byte).
|
||||||
|
|
||||||
|
- ``00 NN`` blocks are run-length-encoded zero deltas: append NN
|
||||||
|
copies of the current cumulative Tran value (no change).
|
||||||
|
|
||||||
|
- ``30 NN`` blocks have not yet been decoded for content — they
|
||||||
|
appear in segment 0 of loud-from-start events (SS0, SV0) and
|
||||||
|
seem to signal a transition or special-case interpretation.
|
||||||
|
The walker steps over them but their data is ignored.
|
||||||
|
|
||||||
|
The walk stops at the first ``40 02`` segment header.
|
||||||
|
"""
|
||||||
|
if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
|
||||||
|
return None
|
||||||
|
t0 = int.from_bytes(body[3:5], "big", signed=True)
|
||||||
|
t1 = int.from_bytes(body[5:7], "big", signed=True)
|
||||||
|
|
||||||
|
start = find_data_start(body)
|
||||||
|
if start < 0:
|
||||||
|
return [t0, t1]
|
||||||
|
|
||||||
|
out = [t0, t1]
|
||||||
|
cur = t1
|
||||||
|
for blk in walk_body(body, start):
|
||||||
|
if blk.tag_hi == 0x40:
|
||||||
|
# Segment boundary — stop. Multi-segment decode is decode_tran_full.
|
||||||
|
break
|
||||||
|
if blk.tag_hi == 0x10:
|
||||||
|
for byte in blk.data:
|
||||||
|
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||||
|
cur += _s4(nib)
|
||||||
|
out.append(cur)
|
||||||
|
elif blk.tag_hi == 0x20:
|
||||||
|
for byte in blk.data:
|
||||||
|
cur += _i8(byte)
|
||||||
|
out.append(cur)
|
||||||
|
elif blk.tag_hi == 0x00:
|
||||||
|
# RLE zero deltas: append NN copies of current Tran value.
|
||||||
|
for _ in range(blk.tag_lo):
|
||||||
|
out.append(cur)
|
||||||
|
# 30 NN: packed 12-bit deltas are not applied here (see decode_waveform_v2); skip.
|
||||||
|
return out
|
||||||
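The segment-0 rules used by `decode_tran_initial` can be checked by hand on a synthetic body. The sketch below reimplements only the three simple block types (``10 NN``, ``20 NN``, ``00 NN``) in a standalone function. `decode_segment0` and the test bytes are made up for illustration and do not come from a real capture.

```python
def s4(n: int) -> int:
    """Sign-extend a 4-bit nibble (0..7 stay positive, 8..F map to -8..-1)."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """Reinterpret an unsigned byte as a signed int8."""
    return b if b < 128 else b - 256


def decode_segment0(body: bytes) -> list[int]:
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        raise ValueError("bad preamble")
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    out, cur, i = [t0, t1], t1, 7
    while i + 1 < len(body):
        tag, nn = body[i], body[i + 1]
        if tag == 0x10:  # NN nibble deltas, high nibble first
            for byte in body[i + 2:i + 2 + nn // 2]:
                for nib in (byte >> 4, byte & 0xF):
                    cur += s4(nib)
                    out.append(cur)
            i += 2 + nn // 2
        elif tag == 0x20:  # NN int8 deltas
            for byte in body[i + 2:i + 2 + nn]:
                cur += i8(byte)
                out.append(cur)
            i += 2 + nn
        elif tag == 0x00:  # RLE: NN copies of the current value
            out.extend([cur] * nn)
            i += 2
        else:  # segment header or unknown tag: stop
            break
    return out


# Preamble (anchors 1 and 2), then 10 04 / 00 04 / 20 04 blocks.
body = bytes.fromhex("000200" "0001" "0002" "1004" "1f20" "0004" "2004" "05fb0001")
samples = decode_segment0(body)
```

Tracing by hand: the nibbles 1, F, 2, 0 move the running value 2 to 3, 2, 4, 4; the RLE block repeats 4 four times; the int8 deltas +5, -5, 0, +1 give 9, 4, 4, 5. The result is `[1, 2, 3, 2, 4, 4, 4, 4, 4, 4, 9, 4, 4, 5]`.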
|
|
||||||
|
|
||||||
|
def decode_waveform_v2(body: bytes) -> Optional[dict]:
    """
    Decode the body into per-channel sample arrays.

    Status (2026-05-11 evening, channel-rotation hypothesis CONFIRMED):
    segments rotate channels in fixed order **Tran → Vert → Long → MicL**.
    Each channel-segment carries a 2-sample anchor pair in segment-header
    bytes [14:18] (or in the body preamble for the initial Tran segment)
    plus a stream of delta blocks for samples 2 onward.

    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
    with the geo channels' decoded samples in 16-count units (LSB = 0.005
    in/s at Normal range) and MicL samples in raw mic ADC counts.
    Returns ``None`` if the body cannot be parsed.
    """
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None

    channels = ["Tran", "Vert", "Long", "MicL"]
    out: dict = {ch: [] for ch in channels}

    # Initial Tran segment: preamble anchor pair + delta blocks before first 40 02.
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    out["Tran"].extend([t0, t1])

    start = find_data_start(body)
    if start < 0:
        return out

    blocks = walk_body(body, start)
    seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]

    def apply_blocks(channel: str, anchor: int,
                     block_start: int, block_end: int) -> int:
        """Apply delta blocks [block_start, block_end) to *channel*'s sample
        list, starting from *anchor*.  Returns the final cumulative value."""
        cur = anchor
        for bi in range(block_start, block_end):
            blk = blocks[bi]
            if (blk.tag_hi & 0xF0) == 0x10:
                # Both ``10 NN`` (NN ≤ 0xFC) and wide-NN ``1X NN`` (X != 0)
                # are nibble-delta streams.  The walker has already used the
                # right length; here we just iterate the payload bytes.
                for byte in blk.data:
                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
                        cur += _s4(nib)
                        out[channel].append(cur)
            elif (blk.tag_hi & 0xF0) == 0x20:
                # ``20 NN`` and wide ``2X NN`` both carry int8 deltas.
                for byte in blk.data:
                    cur += _i8(byte)
                    out[channel].append(cur)
            elif blk.tag_hi == 0x00:
                # RLE zero deltas: append NN copies of the current value.
                for _ in range(blk.tag_lo):
                    out[channel].append(cur)
            elif blk.tag_hi == 0x30:
                # 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
                #   bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
                #   bytes [2:6] = 4 × unsigned low bytes
                # Each delta = sign_extend_12((high_nibble << 8) | low_byte).
                # Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
                # bundled fixtures.
                n_groups = blk.tag_lo // 4
                for g in range(n_groups):
                    grp = blk.data[g * 6 : (g + 1) * 6]
                    if len(grp) < 6:
                        break
                    high_word = (grp[0] << 8) | grp[1]
                    for k in range(4):
                        nib = (high_word >> (12 - 4 * k)) & 0xF
                        v = (nib << 8) | grp[2 + k]
                        if v >= 0x800:
                            v -= 0x1000
                        cur += v
                        out[channel].append(cur)
            # 40 02: should not occur in segment data.
        return cur

    # Initial Tran segment: deltas from start of body up to first 40 02 (or end).
    first_seg = seg_idx[0] if seg_idx else len(blocks)
    last_tran_value = apply_blocks("Tran", t1, 0, first_seg)

    # Subsequent segments rotate channels.  Each segment header carries:
    #   bytes [0:2] and [2:4]     = 2 deltas extending the PREVIOUS channel
    #   bytes [14:16] and [16:18] = anchor pair for THIS segment's channel
    #
    # Rotation: V, L, M, T, V, L, M, T, ...  (initial Tran segment is the
    # implicit T in the cycle.)
    rotation = ["Vert", "Long", "MicL", "Tran"]
    # Track each channel's "running cumulative value" so we can apply the
    # previous-channel extension deltas at every segment boundary.
    last_value = {"Tran": last_tran_value, "Vert": None, "Long": None, "MicL": None}

    for k, hi in enumerate(seg_idx):
        channel = rotation[k % 4]
        prev_channel = "Tran" if k == 0 else rotation[(k - 1) % 4]
        header = blocks[hi]
        if len(header.data) < 18:
            continue
        # Validate: real segment headers have bytes [12:14] = `02 00`.
        # Trailer/footer "40 02" markers contain ASCII serial bytes or other
        # non-header data there and would otherwise be mis-interpreted as
        # segment headers, adding spurious samples at the tail.
        if header.data[12:14] != b"\x02\x00":
            break
        # Extend the PREVIOUS channel by 2 more samples (deltas in bytes [0:4]).
        prev_d0 = int.from_bytes(header.data[0:2], "big", signed=True)
        prev_d1 = int.from_bytes(header.data[2:4], "big", signed=True)
        if last_value[prev_channel] is not None:
            v = last_value[prev_channel] + prev_d0
            out[prev_channel].append(v)
            v += prev_d1
            out[prev_channel].append(v)
            last_value[prev_channel] = v
        # Anchor pair for THIS segment's channel.
        c0 = int.from_bytes(header.data[14:16], "big", signed=True)
        c1 = int.from_bytes(header.data[16:18], "big", signed=True)
        out[channel].extend([c0, c1])
        # Apply delta blocks for this segment.
        next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
        last_value[channel] = apply_blocks(channel, c1, hi + 1, next_hi)

    return out
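The `30 NN` branch of `apply_blocks` packs four 12-bit deltas into six bytes. As a hedged illustration of that layout only, the sketch below unpacks a single group; `unpack_12bit_group` is a hypothetical helper name, and the example bytes are made up rather than taken from a fixture.

```python
from typing import List


def unpack_12bit_group(grp: bytes) -> List[int]:
    """Unpack one 6-byte group into four signed 12-bit deltas."""
    if len(grp) != 6:
        raise ValueError("a group is exactly 6 bytes")
    high_word = (grp[0] << 8) | grp[1]        # four high nibbles, MSB first
    deltas = []
    for k in range(4):
        nib = (high_word >> (12 - 4 * k)) & 0xF
        v = (nib << 8) | grp[2 + k]           # raw 12-bit value
        if v >= 0x800:                        # sign-extend from bit 11
            v -= 0x1000
        deltas.append(v)
    return deltas


# High nibbles 0x0, 0xF, 0x8, 0x0 paired with low bytes 0x01, 0xFF, 0x00, 0x10.
group = bytes([0x0F, 0x80, 0x01, 0xFF, 0x00, 0x10])
print(unpack_12bit_group(group))
```

Grouping the high nibbles into a leading 16-bit word keeps the low bytes byte-aligned, which is presumably why the format uses six bytes per four samples.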
|
||||||
|
|
||||||
|
|
||||||
|
# ── ADC-scale conversion helpers ────────────────────────────────────────────


# Scaling factor: decode_waveform_v2 produces geo-channel samples in the BW
# display quantization (16-count units, LSB = 0.005 in/s at Normal range).
# The legacy consumer pipeline (sfm/event_hdf5.py) expects raw_samples in
# 1-count ADC units (× full_scale / 32768 → physical).  To plug the new
# decoder in without rewriting consumers, multiply geo values by 16.
#
# Mic samples are already in raw ADC counts (decoded value 1 = 1 mic ADC count
# = 81.94 dB on the BW display).  Mic values pass through unchanged.
_GEO_DECODER_TO_ADC = 16


def decoded_to_adc_counts(decoded: dict) -> dict:
    """Convert :func:`decode_waveform_v2` output to int16 ADC counts.

    Geo channels are scaled by ×16 (decoder produces 16-count units,
    consumer expects 1-count ADC).  Mic is passed through as raw counts.
    """
    if not decoded:
        return {}
    return {
        "Tran": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Tran", [])],
        "Vert": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Vert", [])],
        "Long": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Long", [])],
        "MicL": list(decoded.get("MicL", [])),
    }
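To make the unit chain concrete, here is a small sketch of the geo conversion from decoder units to physical velocity. The factor 16, the 0.005 in/s LSB, and the `full_scale / 32768` convention come from the comments above. The full-scale value is an assumption derived from those two figures (0.005 × 32768 / 16), not a value read from the device, and the function name is hypothetical.

```python
GEO_DECODER_TO_ADC = 16          # decoder units -> 1-count ADC units
LSB_IN_PER_S = 0.005             # one decoder unit at Normal range

# Assumption: full scale implied by the two constants above.
FULL_SCALE_IN_PER_S = LSB_IN_PER_S * 32768 / GEO_DECODER_TO_ADC


def geo_decoded_to_in_per_s(decoded_value: int) -> float:
    """Decoder units -> ADC counts -> physical velocity in in/s."""
    adc_counts = decoded_value * GEO_DECODER_TO_ADC
    return adc_counts * FULL_SCALE_IN_PER_S / 32768


print(FULL_SCALE_IN_PER_S)
print(geo_decoded_to_in_per_s(200))
```

Under these assumptions, 200 decoder units correspond to about 1 in/s, matching the `LSB_INV = 200` factor used by the scratch analyzer when it quantizes the Blastware text export.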
|
||||||
|
|
||||||
|
|
||||||
|
def mic_count_to_db(count: int) -> float:
    """Convert a MicL ADC count to dB(L) for BW-display-compatible output.

    Empirical formula (confirmed 2026-05-11 against V70 fixture: count=813
    → 140.1 dB; count=±1 → ±81.94 dB; count=±24 → ±109.5 dB):

        dB = sign(count) × (81.94 + 20 × log10(|count|))    for |count| ≥ 1
        dB = 0.0                                            for count == 0

    The constant 81.94 corresponds to a ratio of 10^(81.94/20) ≈ 12500
    between one mic ADC count and the dB(L) reference level, almost
    certainly a calibration constant from the device's mic.
    """
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))
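The fixture points quoted in the docstring can be checked directly against the formula. The sketch below restates `mic_count_to_db` so that it runs on its own and adds `db_to_mic_count`, a hypothetical inverse that is not part of the codebase and is only meaningful for magnitudes of at least 81.94 dB.

```python
import math


def mic_count_to_db(count: int) -> float:
    """Signed dB(L) value for a mic ADC count (same formula as above)."""
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))


def db_to_mic_count(db: float) -> float:
    """Hypothetical inverse; returns a float count with the sign of *db*."""
    if db == 0.0:
        return 0.0
    sign = 1.0 if db > 0 else -1.0
    return sign * 10.0 ** ((abs(db) - 81.94) / 20.0)


for count in (1, 24, 813, -24):
    print(count, round(mic_count_to_db(count), 1))
```

Carrying the sign through the dB value is a display convention of this formula rather than a property of decibels, so downstream consumers should treat the result as a signed display value, not a sound pressure level.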
|
||||||
|
|
||||||
|
|
||||||
|
# ── A5-frame entry point ────────────────────────────────────────────────────


def decode_a5_frames(a5_frames) -> Optional[dict]:
    """Decode a list of A5 (BULK_WAVEFORM_STREAM) frames into per-channel
    int16 ADC samples.

    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
    with each channel's samples in **1-count ADC units** (the legacy
    ``event.raw_samples`` convention: multiply by ``full_scale / 32768``
    to convert to physical units; for mic, use :func:`mic_count_to_db` or
    a per-count psi factor).

    Returns ``None`` if the frames cannot be parsed.

    This is the wired-up production entry point.  It:
      1. Reconstructs the BW-binary body bytes from the A5 frames
         (``blastware_file.extract_body_bytes``).
      2. Runs the verified codec (``decode_waveform_v2``) on the body.
      3. Converts to int16 ADC counts via :func:`decoded_to_adc_counts`.
    """
    # Local import to avoid a cycle: blastware_file imports models and
    # ultimately client.py imports waveform_codec.
    from .blastware_file import extract_body_bytes

    if not a5_frames:
        return None
    _strt, body, _footer = extract_body_bytes(a5_frames)
    if not body:
        return None
    decoded = decode_waveform_v2(body)
    if decoded is None:
        return None
    return decoded_to_adc_counts(decoded)
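Within `decode_waveform_v2`, each `40 02` header index `k` selects both the channel that the segment carries and the previous channel that the header's first two deltas extend. The mapping can be tabulated as below; this is an illustrative sketch only, and `segment_channels` is a hypothetical helper name.

```python
from typing import List, Tuple

# Order of the segments that follow the initial Tran segment.
ROTATION = ["Vert", "Long", "MicL", "Tran"]


def segment_channels(n_headers: int) -> List[Tuple[str, str]]:
    """For each header index k, return (channel, previous_channel)."""
    pairs = []
    for k in range(n_headers):
        channel = ROTATION[k % 4]
        prev_channel = "Tran" if k == 0 else ROTATION[(k - 1) % 4]
        pairs.append((channel, prev_channel))
    return pairs


for k, (channel, prev_channel) in enumerate(segment_channels(5)):
    print(f"header {k}: carries {channel}, extends {prev_channel}")
```

The first header extends the initial Tran segment, which is why the rotation list starts at Vert instead of Tran.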
|
||||||
@@ -53,7 +53,9 @@ SUB_TABLE: dict[int, tuple[str, str, str]] = {
|
|||||||
0x82: ("TRIGGER_CONFIG_WRITE", "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
|
0x82: ("TRIGGER_CONFIG_WRITE", "BW→S3", "0x1C bytes; trigger config block; mirrors SUB 1C"),
|
||||||
0x83: ("TRIGGER_WRITE_CONFIRM", "BW→S3", "Short frame; commit step after 0x82"),
|
0x83: ("TRIGGER_WRITE_CONFIRM", "BW→S3", "Short frame; commit step after 0x82"),
|
||||||
# S3→BW responses
|
# S3→BW responses
|
||||||
|
0x5A: ("BULK_WAVEFORM_STREAM", "BW→S3", "Bulk waveform chunk request; response is A5 stream"),
|
||||||
0xA4: ("POLL_RESPONSE", "S3→BW", "Response to SUB 5B poll"),
|
0xA4: ("POLL_RESPONSE", "S3→BW", "Response to SUB 5B poll"),
|
||||||
|
0xA5: ("BULK_WAVEFORM_RESPONSE", "S3→BW", "Response to SUB 5A; waveform chunks + metadata"),
|
||||||
0xFE: ("FULL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 01"),
|
0xFE: ("FULL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 01"),
|
||||||
0xF9: ("CHANNEL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 06"),
|
0xF9: ("CHANNEL_CONFIG_RESPONSE", "S3→BW", "Response to SUB 06"),
|
||||||
0xF7: ("EVENT_INDEX_RESPONSE", "S3→BW", "Response to SUB 08; contains backlight/power-save"),
|
0xF7: ("EVENT_INDEX_RESPONSE", "S3→BW", "Response to SUB 08; contains backlight/power-save"),
|
||||||
|
|||||||
+33
-36
@@ -33,7 +33,7 @@ STX = 0x02
|
|||||||
ETX = 0x03
|
ETX = 0x03
|
||||||
ACK = 0x41
|
ACK = 0x41
|
||||||
|
|
||||||
__version__ = "0.2.3"
|
__version__ = "0.2.5"
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
@@ -184,9 +184,9 @@ def validate_bw_body_auto(body: bytes) -> Optional[Tuple[bytes, bytes, str]]:
|
|||||||
def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
|
def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
|
||||||
frames: List[Frame] = []
|
frames: List[Frame] = []
|
||||||
|
|
||||||
IDLE = 0
|
IDLE = 0
|
||||||
IN_FRAME = 1
|
IN_FRAME = 1
|
||||||
AFTER_DLE = 2
|
IN_FRAME_DLE = 2 # saw DLE inside frame — waiting for next byte
|
||||||
|
|
||||||
state = IDLE
|
state = IDLE
|
||||||
body = bytearray()
|
body = bytearray()
|
||||||
@@ -206,66 +206,63 @@ def parse_s3(blob: bytes, trailer_len: int) -> List[Frame]:
|
|||||||
state = IN_FRAME
|
state = IN_FRAME
|
||||||
i += 2
|
i += 2
|
||||||
continue
|
continue
|
||||||
|
# ACK bytes, boot strings, garbage — silently ignored
|
||||||
|
|
||||||
elif state == IN_FRAME:
|
elif state == IN_FRAME:
|
||||||
if b == DLE:
|
if b == DLE:
|
||||||
state = AFTER_DLE
|
state = IN_FRAME_DLE
|
||||||
i += 1
|
i += 1
|
||||||
continue
|
continue
|
||||||
body.append(b)
|
|
||||||
|
|
||||||
else: # AFTER_DLE
|
|
||||||
if b == DLE:
|
|
||||||
body.append(DLE)
|
|
||||||
state = IN_FRAME
|
|
||||||
i += 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
if b == ETX:
|
if b == ETX:
|
||||||
|
# Bare ETX = real S3 frame terminator (confirmed from S3FrameParser)
|
||||||
end_offset = i + 1
|
end_offset = i + 1
|
||||||
trailer_start = i + 1
|
trailer_start = i + 1
|
||||||
trailer_end = trailer_start + trailer_len
|
trailer_end = trailer_start + trailer_len
|
||||||
trailer = blob[trailer_start:trailer_end]
|
trailer = blob[trailer_start:trailer_end]
|
||||||
|
|
||||||
chk_valid = None
|
# S3 checksums are deliberately not validated here.
|
||||||
chk_type = None
|
# Large S3 responses (A5 bulk waveform, E5 compliance) embed
|
||||||
chk_hex = None
|
# inner DLE+ETX sub-frame terminators whose trailing 0x03 byte
|
||||||
payload = bytes(body)
|
# lands where the parser would expect the SUM8 checksum, causing
|
||||||
|
# false failures. The live protocol (protocol.py _validate_frame)
|
||||||
if len(body) >= 1:
|
# also skips S3 checksum enforcement for the same reason.
|
||||||
received_chk = body[-1]
|
|
||||||
computed_chk = checksum8_sum(bytes(body[:-1]))
|
|
||||||
if computed_chk == received_chk:
|
|
||||||
chk_valid = True
|
|
||||||
chk_type = "SUM8"
|
|
||||||
chk_hex = f"{received_chk:02x}"
|
|
||||||
payload = bytes(body[:-1])
|
|
||||||
else:
|
|
||||||
chk_valid = False
|
|
||||||
|
|
||||||
frames.append(Frame(
|
frames.append(Frame(
|
||||||
index=idx,
|
index=idx,
|
||||||
start_offset=start_offset,
|
start_offset=start_offset,
|
||||||
end_offset=end_offset,
|
end_offset=end_offset,
|
||||||
payload_raw=bytes(body),
|
payload_raw=bytes(body),
|
||||||
payload=payload,
|
payload=bytes(body),
|
||||||
trailer=trailer,
|
trailer=trailer,
|
||||||
checksum_valid=chk_valid,
|
checksum_valid=None,
|
||||||
checksum_type=chk_type,
|
checksum_type=None,
|
||||||
checksum_hex=chk_hex
|
checksum_hex=None
|
||||||
))
|
))
|
||||||
|
|
||||||
idx += 1
|
idx += 1
|
||||||
state = IDLE
|
state = IDLE
|
||||||
i = trailer_end
|
i = trailer_end
|
||||||
continue
|
continue
|
||||||
|
body.append(b)
|
||||||
|
|
||||||
|
else: # IN_FRAME_DLE
|
||||||
|
if b == DLE:
|
||||||
|
# DLE DLE → literal 0x10 in payload
|
||||||
|
body.append(DLE)
|
||||||
|
state = IN_FRAME
|
||||||
|
i += 1
|
||||||
|
continue
|
||||||
|
if b == ETX:
|
||||||
|
# DLE+ETX inside a frame = inner-frame terminator (A4/E5 sub-frames).
|
||||||
|
# Treat as literal data, NOT the outer frame end.
|
||||||
|
body.append(DLE)
|
||||||
|
body.append(ETX)
|
||||||
|
state = IN_FRAME
|
||||||
|
i += 1
|
||||||
|
continue
|
||||||
# Unexpected DLE + byte → treat as literal data
|
# Unexpected DLE + byte → treat as literal data
|
||||||
body.append(DLE)
|
body.append(DLE)
|
||||||
body.append(b)
|
body.append(b)
|
||||||
state = IN_FRAME
|
state = IN_FRAME
|
||||||
i += 1
|
|
||||||
continue
|
|
||||||
|
|
||||||
i += 1
|
i += 1
|
||||||
|
|
||||||
|
|||||||
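The revised `parse_s3` state machine treats a bare ETX as the frame terminator, `DLE DLE` as one literal 0x10, and `DLE ETX` (or DLE followed by any other byte) as literal data. A compact sketch of that rule follows, under simplifying assumptions: it starts just past the frame-start marker, ignores the trailer, and uses the hypothetical name `read_frame_body`. The values DLE = 0x10 and ETX = 0x03 are taken from the hunks above.

```python
from typing import Tuple

DLE = 0x10
ETX = 0x03


def read_frame_body(buf: bytes, start: int = 0) -> Tuple[bytes, int]:
    """Scan *buf* from *start* (just past the frame-start marker).

    Returns (body, index_after_etx).  Raises ValueError if no bare ETX
    terminator is found.
    """
    body = bytearray()
    i = start
    while i < len(buf):
        b = buf[i]
        if b == DLE and i + 1 < len(buf):
            nxt = buf[i + 1]
            if nxt == DLE:
                body.append(DLE)            # DLE DLE -> one literal 0x10
            else:
                body += bytes([DLE, nxt])   # DLE ETX / DLE + other: keep both bytes
            i += 2
            continue
        if b == ETX:                        # bare ETX ends the frame
            return bytes(body), i + 1
        body.append(b)
        i += 1
    raise ValueError("no bare ETX terminator found")


raw = bytes([0x41, 0x10, 0x10, 0x42, 0x10, 0x03, 0x43, 0x03, 0x99])
body, end = read_frame_body(raw)
print(body.hex(" "), end)
```

Because an inner `DLE ETX` stays in the payload, the byte just before the real terminator is no longer guaranteed to be a checksum, which is consistent with the diff dropping SUM8 validation for S3 frames.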
+7
-3
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
|||||||
|
|
||||||
[project]
|
[project]
|
||||||
name = "seismo-relay"
|
name = "seismo-relay"
|
||||||
version = "0.12.0"
|
version = "0.21.1"
|
||||||
description = "Python client and REST server for MiniMate Plus seismographs"
|
description = "Python client and REST server for MiniMate Plus seismographs"
|
||||||
requires-python = ">=3.10"
|
requires-python = ">=3.10"
|
||||||
dependencies = [
|
dependencies = [
|
||||||
@@ -12,9 +12,13 @@ dependencies = [
|
|||||||
"uvicorn[standard]>=0.24",
|
"uvicorn[standard]>=0.24",
|
||||||
"pyserial>=3.5",
|
"pyserial>=3.5",
|
||||||
"sqlalchemy>=2.0",
|
"sqlalchemy>=2.0",
|
||||||
|
"python-multipart>=0.0.7",
|
||||||
|
"h5py>=3.10",
|
||||||
|
"numpy>=1.24",
|
||||||
|
"matplotlib>=3.8",
|
||||||
]
|
]
|
||||||
|
|
||||||
[tool.setuptools.packages.find]
|
[tool.setuptools.packages.find]
|
||||||
# Auto-discovers minimateplus/, sfm/, bridges/ as packages
|
# Auto-discovers minimateplus/, micromate/, sfm/, bridges/ as packages
|
||||||
where = ["."]
|
where = ["."]
|
||||||
include = ["minimateplus*", "sfm*", "bridges*"]
|
include = ["minimateplus*", "micromate*", "sfm*", "bridges*"]
|
||||||
|
|||||||
@@ -2,3 +2,7 @@ fastapi
|
|||||||
uvicorn
|
uvicorn
|
||||||
sqlalchemy
|
sqlalchemy
|
||||||
pyserial
|
pyserial
|
||||||
|
python-multipart
|
||||||
|
h5py
|
||||||
|
numpy
|
||||||
|
matplotlib
|
||||||
|
|||||||
@@ -0,0 +1,360 @@
|
|||||||
|
"""
|
||||||
|
scratch/next_experiment_skeleton.py — segment-channel scoring analyzer.
|
||||||
|
|
||||||
|
This is the suggested NEXT EXPERIMENT for cracking the waveform body codec.
|
||||||
|
The goal is to figure out what segments 1+ contain, since segment 0 = Tran
|
||||||
|
is solved but multi-segment continuation diverges from truth at sample ~512.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
The hypothesis to test
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
Segments rotate through channels:
|
||||||
|
|
||||||
|
segment 0 → Tran samples 0..509
|
||||||
|
segment 1 → Vert samples 0..507
|
||||||
|
segment 2 → Long samples 0..507
|
||||||
|
segment 3 → Mic samples 0..507
|
||||||
|
segment 4 → Tran samples 510..N (continuation)
|
||||||
|
...
|
||||||
|
|
||||||
|
This would explain why segment 0 works perfectly (it's pure Tran) and why
|
||||||
|
applying segment 1's blocks as Tran continuation gives wrong values
|
||||||
|
(it's actually Vert).
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
What the analyzer should do
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
For each segment in each fixture event:
|
||||||
|
|
||||||
|
1. Run the segment-0 block-walker + RLE decode (the same algorithm that
|
||||||
|
``decode_tran_initial`` uses) over the segment's blocks. Start from
|
||||||
|
some anchor value and produce a cumulative trajectory of length =
|
||||||
|
number-of-deltas-in-segment.
|
||||||
|
|
||||||
|
2. For each candidate channel C ∈ {Tran, Vert, Long, MicL}:
|
||||||
|
For each candidate anchor location in the segment-header payload
|
||||||
|
(try [0:2], [2:4], [4:6], [14:16], [16:18] as int16 BE):
|
||||||
|
Compare the decoded trajectory against truth[C] starting from
|
||||||
|
the segment's first sample index.
|
||||||
|
Score = number of matches (or sum of squared errors).
|
||||||
|
|
||||||
|
3. Report the best (channel, anchor-location) combination per segment.
|
||||||
|
|
||||||
|
If the rotation hypothesis is correct, you'll see:
|
||||||
|
segment 0 → best score for (Tran, preamble bytes [3:5]) ✓ already known
|
||||||
|
segment 1 → best score for (Vert, <some-header-byte>)
|
||||||
|
segment 2 → best score for (Long, <some-header-byte>)
|
||||||
|
segment 3 → best score for (MicL, <some-header-byte>)
|
||||||
|
segment 4 → best score for (Tran, continuing from segment 0's end)
|
||||||
|
|
||||||
|
If the rotation hypothesis is NOT correct, the scorer will at least narrow
|
||||||
|
down what segment 1 actually carries. Maybe channels interleave at finer
|
||||||
|
granularity, or maybe segments alternate by something other than channel.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Why this is a scoring analyzer, not a hand-written decoder
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
Direct hand-coding ("assume segment 1 is Vert with anchor at byte X") gets
|
||||||
|
stuck when the assumption is wrong because the failure mode is silent —
|
||||||
|
you get plausible-looking-but-wrong samples and have to manually diff
|
||||||
|
against truth to debug.
|
||||||
|
|
||||||
|
The scorer is brute-force but cheap: every fixture event × every segment ×
|
||||||
|
4 channels × 5 anchor-byte candidates is only ~hundreds of comparisons.
|
||||||
|
The winning combination jumps out by score.
|
||||||
|
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
Skeleton
|
||||||
|
────────────────────────────────────────────────────────────────────────────
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional, Tuple
|
||||||
|
|
||||||
|
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
|
||||||
|
|
||||||
|
from minimateplus.waveform_codec import walk_body, find_data_start, WaveformBlock
|
||||||
|
|
||||||
|
|
||||||
|
# ── Reusable pieces ──────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
CHANNELS = ("Tran", "Vert", "Long", "MicL")
|
||||||
|
LSB_INV = 200 # 1 in/s / 0.005 in/s/LSB; multiply BW-export floats by this
|
||||||
|
# to get 16-count units (the body's native quantization).
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class FixtureEvent:
|
||||||
|
name: str # e.g. "M529LL1A.SP0"
|
||||||
|
bin_path: str
|
||||||
|
txt_path: str
|
||||||
|
body: bytes
|
||||||
|
truth: dict # {channel: list of int16-quantized samples}
|
||||||
|
blocks: List[WaveformBlock]
|
||||||
|
segment_starts: List[int] # block indices of each 40 02 segment header
|
||||||
|
segment_sample_starts: List[int] # for each segment, the truth sample index it starts at
|
||||||
|
|
||||||
|
|
||||||
|
def s4(n: int) -> int:
    """4-bit signed nibble decode."""
    return n if n < 8 else n - 16


def i8(b: int) -> int:
    """int8 reinterpret of unsigned byte."""
    return b if b < 128 else b - 256
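`s4` and `i8` are both instances of two's-complement sign extension, as is the 12-bit case used by `30 NN` blocks in the codec. A generic version might look like the sketch below; `sign_extend` is a hypothetical helper, not something the repository defines.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low *bits* bits of *value* as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


print(sign_extend(0xF, 4), sign_extend(0x7, 4))        # nibble
print(sign_extend(0x80, 8), sign_extend(0x7F, 8))      # byte
print(sign_extend(0x800, 12), sign_extend(0x7FF, 12))  # 12-bit delta
```

Keeping the width as a parameter avoids a separate helper per delta size, at the cost of a slightly less obvious call site.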
|
||||||
|
|
||||||
|
|
||||||
|
def load_fixture(name: str) -> FixtureEvent:
|
||||||
|
"""Load a fixture event with its truth values and parsed block stream."""
|
||||||
|
# Find the fixture (search both subdirs of tests/fixtures/).
|
||||||
|
base = os.path.join(os.path.dirname(__file__), "..", "tests", "fixtures")
|
||||||
|
candidates = [
|
||||||
|
os.path.join(base, "5-11-26", name),
|
||||||
|
os.path.join(base, "decode-re-5-8-26", "event-a", name), # not used directly
|
||||||
|
]
|
||||||
|
bin_path = next((c for c in candidates if os.path.exists(c)), None)
|
||||||
|
if bin_path is None:
|
||||||
|
# Try a glob walk for the 5-8 fixtures (they're in subdirs).
|
||||||
|
for root, _, files in os.walk(base):
|
||||||
|
if name in files:
|
||||||
|
bin_path = os.path.join(root, name)
|
||||||
|
break
|
||||||
|
if bin_path is None:
|
||||||
|
raise FileNotFoundError(name)
|
||||||
|
|
||||||
|
txt_path = bin_path + ".TXT"
|
||||||
|
with open(bin_path, "rb") as f:
|
||||||
|
raw = f.read()
|
||||||
|
body = raw[43:-26]
|
||||||
|
truth = _parse_txt(txt_path)
|
||||||
|
blocks = walk_body(body, find_data_start(body))
|
||||||
|
|
||||||
|
seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
|
||||||
|
# Segment 0 starts at sample 0; subsequent segments start at the
|
||||||
|
# cumulative sample count from previous segment(s). Tran's segment 0
|
||||||
|
# is N samples; if rotation hypothesis is correct, segment 1's data
|
||||||
|
# starts at sample 0 for a *different* channel. The analyzer should
|
||||||
|
# try both "continues from previous segment" and "starts at sample 0
|
||||||
|
# of a different channel."
|
||||||
|
seg_sample_starts = _compute_segment_sample_starts(blocks, seg_idx)
|
||||||
|
|
||||||
|
return FixtureEvent(
|
||||||
|
name=name, bin_path=bin_path, txt_path=txt_path,
|
||||||
|
body=body, truth=truth, blocks=blocks,
|
||||||
|
segment_starts=seg_idx, segment_sample_starts=seg_sample_starts,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_txt(path: str) -> dict:
|
||||||
|
"""Parse BW ASCII TXT export into {channel: [int_samples_in_16_count_units]}."""
|
||||||
|
with open(path, "r", encoding="utf-8", errors="replace") as f:
|
||||||
|
lines = f.read().splitlines()
|
||||||
|
header_idx = next(
|
||||||
|
(i for i, l in enumerate(lines)
|
||||||
|
if all(c in l for c in CHANNELS)),
|
||||||
|
None,
|
||||||
|
)
|
||||||
|
if header_idx is None:
|
||||||
|
return {ch: [] for ch in CHANNELS}
|
||||||
|
out = {ch: [] for ch in CHANNELS}
|
||||||
|
for line in lines[header_idx + 1:]:
|
||||||
|
parts = re.split(r"\s+", line.strip())
|
||||||
|
if len(parts) < 4:
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
vals = [float(p) for p in parts[:4]]
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
for ch, v in zip(CHANNELS, vals):
|
||||||
|
# Multiply by LSB_INV; geo channels are in in/s, MicL is in dB(L)
|
||||||
|
# (which doesn't quantize the same way — leaving raw for MicL is fine,
|
||||||
|
# the scorer should treat MicL specially).
|
||||||
|
out[ch].append(round(v * LSB_INV) if ch != "MicL" else v)
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def _compute_segment_sample_starts(
|
||||||
|
blocks: List[WaveformBlock], seg_idx: List[int]
|
||||||
|
) -> List[int]:
|
||||||
|
"""Cumulative sample-count up to each segment header (if all blocks treated
|
||||||
|
as Tran continuation). Useful as one candidate for segment-1-Tran tests.
|
||||||
|
|
||||||
|
The scorer should ALSO try "segment 1 starts at sample 0 of a new channel"
|
||||||
|
as the rotation hypothesis predicts.
|
||||||
|
"""
|
||||||
|
starts = []
|
||||||
|
cum = 2 # T[0] + T[1] from preamble
|
||||||
|
for i, b in enumerate(blocks):
|
||||||
|
if i in seg_idx:
|
||||||
|
starts.append(cum)
|
||||||
|
if b.tag_hi == 0x10:
|
||||||
|
cum += b.tag_lo
|
||||||
|
elif b.tag_hi == 0x20:
|
||||||
|
cum += b.tag_lo
|
||||||
|
elif b.tag_hi == 0x00:
|
||||||
|
cum += b.tag_lo
|
||||||
|
# 30 NN and 40 02 don't contribute samples (for this hypothesis)
|
||||||
|
return starts
|
||||||
|
|
||||||
|
|
||||||
|
# ── The core algorithm: decode a segment's blocks as deltas ─────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def decode_segment_as_channel(
|
||||||
|
blocks: List[WaveformBlock],
|
||||||
|
seg_start_block_idx: int,
|
||||||
|
seg_end_block_idx: int,
|
||||||
|
anchor: int,
|
||||||
|
) -> List[int]:
|
||||||
|
"""Apply the segment-0 codec rules to a range of blocks, starting from *anchor*.
|
||||||
|
|
||||||
|
Returns a list of cumulative sample values (one per delta). Does NOT include
|
||||||
|
the anchor itself in the output — the first returned value is anchor + first_delta.
|
||||||
|
"""
|
||||||
|
out = []
|
||||||
|
cur = anchor
|
||||||
|
for bi in range(seg_start_block_idx, seg_end_block_idx):
|
||||||
|
blk = blocks[bi]
|
||||||
|
if blk.tag_hi == 0x10:
|
||||||
|
for byte in blk.data:
|
||||||
|
for nib in ((byte >> 4) & 0xF, byte & 0xF):
|
||||||
|
cur += s4(nib)
|
||||||
|
out.append(cur)
|
||||||
|
elif blk.tag_hi == 0x20:
|
||||||
|
for byte in blk.data:
|
||||||
|
cur += i8(byte)
|
||||||
|
out.append(cur)
|
||||||
|
elif blk.tag_hi == 0x00:
|
||||||
|
for _ in range(blk.tag_lo):
|
||||||
|
out.append(cur)
|
||||||
|
# 30 NN: skip (content unknown)
|
||||||
|
# 40 02: shouldn't appear in segment data (it's the segment header)
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def score_against_truth(
    decoded: List[int],
    truth: List[int],
    truth_start: int,
) -> Tuple[int, int]:
    """Compare *decoded* to truth[truth_start : truth_start + len(decoded)].

    Returns (n_matches, n_compared).
    """
    n = min(len(decoded), len(truth) - truth_start)
    if n <= 0:
        return (0, 0)
    matches = sum(1 for i in range(n) if decoded[i] == truth[truth_start + i])
    return (matches, n)
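A short usage sketch shows how the scorer behaves at the edges of the truth array. The function is restated so that the example runs on its own; `is_confident` is a hypothetical helper that mirrors the 0.9 match-ratio test the driver uses for its tick mark, and the sample data is invented.

```python
from typing import List, Tuple


def score_against_truth(decoded: List[int], truth: List[int],
                        truth_start: int) -> Tuple[int, int]:
    """Count positions where decoded[i] == truth[truth_start + i]."""
    n = min(len(decoded), len(truth) - truth_start)
    if n <= 0:
        return (0, 0)
    matches = sum(1 for i in range(n) if decoded[i] == truth[truth_start + i])
    return (matches, n)


def is_confident(matches: int, compared: int, threshold: float = 0.9) -> bool:
    """Hypothetical helper: ratio test with a guard against zero comparisons."""
    return matches / max(compared, 1) > threshold


truth = [4, 5, 6, 7, 8]
print(score_against_truth([5, 6, 9], truth, 1))   # compared against [5, 6, 7]
print(score_against_truth([5, 6, 9], truth, 4))   # only one truth sample left
print(score_against_truth([5, 6, 9], truth, 5))   # nothing left to compare
```

Returning the number of compared samples alongside the match count lets callers tell a short comparison window apart from a poor fit.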
|
||||||
|
|
||||||
|
|
||||||
|
# ── TODO for the next pass ──────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def score_segment_against_all_channels(
    event: FixtureEvent,
    segment_index: int,
) -> List[Tuple[str, int, int, int]]:
    """For segment *segment_index* of *event*, find the best (channel, start_sample)
    fit.

    For each candidate channel C and each candidate starting truth-sample index s,
    truth[C][s] is taken as the anchor and the N decoded values are scored
    against truth[C][s+1 : s+N+1].

    Returns rows of (channel_name, start_sample, n_matches, n_compared)
    sorted by match-count descending.
    """
    # Block range of this segment: from the segment header (inclusive) up to
    # the next segment header (exclusive), or end-of-blocks.
    seg_header_idx = event.segment_starts[segment_index]
    next_header_idx = (
        event.segment_starts[segment_index + 1]
        if segment_index + 1 < len(event.segment_starts)
        else len(event.blocks)
    )

    # Decode the segment's data blocks (skip the segment-header block itself).
    # Use anchor=0; we re-anchor when scoring against each channel.
    deltas_trajectory = decode_segment_as_channel(
        event.blocks, seg_header_idx + 1, next_header_idx, anchor=0
    )
    if not deltas_trajectory:
        return []

    n = len(deltas_trajectory)
    results = []

    # MicL truth is in dB(L) rather than counts (see _parse_txt), so only the
    # geo channels are scored here.
    for ch in ("Tran", "Vert", "Long"):
        truth = event.truth.get(ch)
        if not truth or len(truth) < n + 1:
            continue
        # For each candidate starting sample s in truth, check if applying
        # the deltas starting from truth[s] reproduces truth[s+1:s+n+1].
        best = (0, -1)
        for s in range(len(truth) - n):
            anchor = truth[s]
            # deltas_trajectory was computed from anchor=0, so the predicted
            # sample i is anchor + deltas_trajectory[i].
            matches = 0
            for i in range(n):
                if truth[s + i + 1] == anchor + deltas_trajectory[i]:
                    matches += 1
            # Note: we could break early on first mismatch for "matches start",
            # but counting total matches gives a more robust score.
            if matches > best[0]:
                best = (matches, s)
        results.append((ch, best[1], best[0], n))

    results.sort(key=lambda r: -r[2])
    return results
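The re-anchoring trick used above (decode once from anchor 0, then add each candidate anchor) can be shown on a toy signal. This is a minimal sketch with invented data; `best_alignment` is a hypothetical name and covers a single channel only.

```python
from typing import List, Tuple


def best_alignment(trajectory: List[int], truth: List[int]) -> Tuple[int, int]:
    """Return (best_start, matches) for sliding *trajectory* along *truth*.

    *trajectory* holds cumulative deltas decoded from anchor 0, so with
    anchor = truth[s] the predicted samples are anchor + trajectory[i].
    """
    n = len(trajectory)
    best_start, best_matches = -1, 0
    for s in range(len(truth) - n):
        anchor = truth[s]
        matches = sum(
            1 for i in range(n) if truth[s + 1 + i] == anchor + trajectory[i]
        )
        if matches > best_matches:
            best_start, best_matches = s, matches
    return best_start, best_matches


truth = [10, 12, 11, 15, 15, 14, 20]
deltas = [4, 0, -1]                  # steps from truth[2] to truth[3], [4], [5]
trajectory = []
total = 0
for d in deltas:
    total += d
    trajectory.append(total)         # cumulative form: [4, 4, 3]

print(best_alignment(trajectory, truth))
```

The search costs one pass over the truth array per segment and channel, so it stays cheap for fixture-sized inputs even though it is brute force.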
|
||||||
|
|
||||||
|
|
||||||
|
# ── Driver ──────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
"""Run the analyzer on all loud-bundle events and print best scores."""
|
||||||
|
events = ["M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
|
||||||
|
"M529LL1L.JQ0", "M529LL1L.V70"]
|
||||||
|
for name in events:
|
||||||
|
try:
|
||||||
|
event = load_fixture(name)
|
||||||
|
except FileNotFoundError:
|
||||||
|
print(f"{name}: fixture not found")
|
||||||
|
continue
|
||||||
|
|
||||||
|
print(f"\n=== {name} ===")
|
||||||
|
print(f" body bytes: {len(event.body)}")
|
||||||
|
print(f" blocks: {len(event.blocks)}")
|
||||||
|
print(f" segments: {len(event.segment_starts)}")
|
||||||
|
print(f" segment sample-starts (if all blocks are 1 channel):")
|
||||||
|
for si, sample_start in enumerate(event.segment_sample_starts):
|
||||||
|
print(f" seg {si}: sample {sample_start}")
|
||||||
|
|
||||||
|
for si in range(len(event.segment_starts)):
|
||||||
|
results = score_segment_against_all_channels(event, si)
|
||||||
|
if not results:
|
||||||
|
print(f" seg {si}: (no scorable data)")
|
||||||
|
continue
|
||||||
|
tag = "✓" if results[0][2] / max(results[0][3], 1) > 0.9 else " "
|
||||||
|
top = results[0]
|
||||||
|
print(f" seg {si}: best fit {tag} = {top[0]:<5} "
|
||||||
|
f"starting at sample {top[1]:>5}, {top[2]:>4}/{top[3]:<4} match"
|
||||||
|
+ (f" (next: {results[1][0]} @{results[1][1]} {results[1][2]}/{results[1][3]})"
|
||||||
|
if len(results) > 1 else ""))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,150 @@
|
|||||||
|
"""
scripts/backfill_record_type.py: fix `record_type` on legacy event
rows whose value was hardcoded to "Waveform" regardless of actual type.

Why this is needed
──────────────────
Pre-v0.16.1 the BW file importer (`event_file_io.read_blastware_file`)
hardcoded `ev.record_type = "Waveform"` for every imported event. Fixed
in commit aac1c8e: new ingests now derive the type from the last
character of the Blastware filename's extension (H=Histogram, W=Waveform,
M=Manual, E=Event, C=Combo) per the V10.72+ MiniMate Plus AB0T filename
scheme.

Effect on a server that imported events under the old code: every
events row has `record_type = "Waveform"`, even for histograms,
manuals, etc. Visible in terra-view's event-detail modal under the
"Record Type" field. Terra-view also has a client-side workaround
that derives the type from the filename for display purposes, so
operators see the correct type in the UI even before this backfill.
This script makes the DB column match what the UI is already showing,
which matters for reporting and any downstream consumer that reads
events.record_type directly.

This script
───────────
Walks the `events` table and updates each row's `record_type` to the
value derived from its `blastware_filename`. Old S338 firmware files
(3-char extensions ending in `0`) and any unrecognized suffix resolve
to the `--default` value ("Waveform" unless overridden), so legacy rows
already stored as "Waveform" are left untouched.

Idempotent: re-running after a successful backfill finds zero rows
needing updates and exits cleanly (it always re-derives but only
writes when the value would change).

Usage
─────
    # Dry-run (default): print what would change, don't touch the DB
    python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db

    # Apply the backfill
    python -m scripts.backfill_record_type --db bridges/captures/seismo_relay.db --apply
"""

from __future__ import annotations

import argparse
import sqlite3
import sys
from collections import Counter
from pathlib import Path


# Must stay in sync with minimateplus.event_file_io._RECORD_TYPE_BY_EXT_SUFFIX.
_TYPE_FROM_SUFFIX = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def derive_record_type(filename: str | None, default: str = "Waveform") -> str:
    """Mirror of minimateplus.event_file_io.derive_record_type_from_filename.

    Vendored here so this script runs without needing the seismo-relay
    package on the Python path (useful on prod where you might be
    running it via `docker exec` against a container's DB volume).
    """
    if not filename:
        return default
    name = Path(filename).name
    if "." not in name:
        return default
    ext = name.rsplit(".", 1)[1]
    if not ext:
        return default
    return _TYPE_FROM_SUFFIX.get(ext[-1].upper(), default)


def main() -> int:
    ap = argparse.ArgumentParser(description=__doc__)
    ap.add_argument("--db", required=True, help="Path to seismo_relay.db")
    ap.add_argument("--apply", action="store_true",
                    help="Actually write changes (default is dry-run).")
    ap.add_argument("--default", default="Waveform",
                    help="Fallback record_type when filename doesn't encode one. "
                         "Default: Waveform (matches the pre-fix bug's behavior).")
    args = ap.parse_args()

    db_path = Path(args.db)
    if not db_path.exists():
        print(f"ERROR: database not found at {db_path}", file=sys.stderr)
        return 1

    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row
    cur = conn.cursor()

    cur.execute("""
        SELECT id, blastware_filename, record_type
        FROM events
        WHERE blastware_filename IS NOT NULL
          AND blastware_filename != ''
    """)
    rows = cur.fetchall()
    total = len(rows)
    print(f"Scanning {total:,} event rows…")
    print()

    # Tally proposed changes.
    transitions: Counter[tuple[str, str]] = Counter()
    update_ids: list[tuple[str, str]] = []

    for row in rows:
        derived = derive_record_type(row["blastware_filename"], default=args.default)
        current = row["record_type"] or ""
        if derived == current:
            continue
        transitions[(current, derived)] += 1
        update_ids.append((row["id"], derived))

    if not update_ids:
        print("Nothing to update: all rows already match.")
        conn.close()
        return 0

    print(f"{len(update_ids):,} row(s) need updating:")
    for (old, new), count in sorted(transitions.items(), key=lambda x: -x[1]):
        print(f" {count:>6,} {old!r:14s} → {new!r}")
    print()

    if not args.apply:
        print("(dry-run: re-run with --apply to write changes)")
        conn.close()
        return 0

    print("Applying changes…")
    cur.executemany(
        "UPDATE events SET record_type = ? WHERE id = ?",
        [(new, eid) for eid, new in update_ids],
    )
    conn.commit()
    print(f"Done. Updated {cur.rowcount:,} row(s).")
    conn.close()
    return 0


if __name__ == "__main__":
    sys.exit(main())
|
||||||
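A reader-side illustration of the suffix rule in `scripts/backfill_record_type.py`: the sketch below repeats the lookup in standalone form so the fallback behavior is easy to see. `M529LL1A.SP0` appears in the fixture list earlier in this diff; the other filenames are made up for illustration, and the upper-case names `TYPE_FROM_SUFFIX` and `record_type_for` are hypothetical, not part of the commit.

```python
from pathlib import Path

# Suffix table copied from the script; the variable name is illustrative only.
TYPE_FROM_SUFFIX = {
    "H": "Histogram",
    "W": "Waveform",
    "M": "Manual",
    "E": "Event",
    "C": "Combo",
}


def record_type_for(filename, default="Waveform"):
    """Map the last character of the extension to a record type."""
    if not filename:
        return default
    name = Path(filename).name
    if "." not in name:
        return default
    ext = name.rsplit(".", 1)[1]
    if not ext:
        return default
    return TYPE_FROM_SUFFIX.get(ext[-1].upper(), default)


# Made-up filenames, except M529LL1A.SP0 which comes from the fixture list.
for fn in ("M529LL1A.SP0", "EXAMPLE.AB0H", "example.ab0w", "noext", None):
    print(fn, "->", record_type_for(fn))
```

Extensions ending in `0` (the old three-character scheme) and names without an extension fall through to the default, which is why legacy "Waveform" rows for those files are never rewritten.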
@@ -0,0 +1,466 @@
|
|||||||
|
"""
scripts/backfill_sidecars.py: generate .sfm.json sidecars AND .h5
clean-waveform files for existing events already in the waveform store
that predate those features.

Walks `<store_root>/<serial>/<filename>` and for each BW event file:

  Sidecar (.sfm.json):
    - Skip when an existing sidecar's blastware.sha256 matches the
      current BW file's sha256 and its source.tool_version is at
      least the current TOOL_VERSION.
    - Else regenerate: prefer .a5.pkl (full fidelity); fall back to
      parsing the BW binary directly (peaks computed from samples).

  Clean waveform (.h5):
    - Regenerated whenever the sidecar is regenerated (sha mismatch
      OR sidecar.source.tool_version < current TOOL_VERSION OR --force).
      The .h5 and the sidecar both come from the same decoder output,
      so if the sidecar is stale the .h5 is too.
    - Written when missing.
    - --skip-hdf5 turns off all .h5 writes.

Typical use after a decoder upgrade:
  1. Pull the new seismo-relay code (which bumped TOOL_VERSION).
  2. Run this script: every sidecar with an older tool_version
     stamp regenerates, and the associated .h5 cascade-regenerates.
  3. Operator review state (review.false_trigger, notes, reviewer)
     and the sidecar's extensions block are preserved across the
     regen.

Usage:
    python scripts/backfill_sidecars.py [--store-root PATH]
                                        [--db-path PATH]
                                        [--dry-run]
                                        [--skip-hdf5]
                                        [--force]
                                        [--reparse-txt]
                                        [-v]
"""

from __future__ import annotations

import argparse
import logging
import sys
from pathlib import Path

# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from minimateplus import event_file_io
from sfm import event_hdf5
from sfm.waveform_store import WaveformStore, _frame_to_dict, _dict_to_frame  # noqa: F401
from sfm.database import SeismoDb

log = logging.getLogger("backfill_sidecars")


def _looks_like_event_file(path: Path) -> bool:
    """Same heuristic as the importer CLI.

    Filters to BW (Series III) event files only. Thor (Series IV)
    `.IDFW` / `.IDFH` files share the store but have their own ingest
    path (`WaveformStore.save_imported_idf`) and are NOT decodable by
    `event_file_io.read_blastware_file`. Their sidecars are populated
    at ingest from the paired `.IDFW.txt` ASCII report; nothing the
    backfill regenerates would improve on them, so we exclude them
    from scope.
    """
    if not path.is_file():
        return False
    if path.name.endswith((".a5.pkl", ".sfm.json", ".h5")):
        return False
    ext = path.suffix.lstrip(".")
    if not (3 <= len(ext) <= 4):
        return False
    # Thor IDF files share the .{W,H}-suffix shape but aren't BW.
    if ext.upper() in ("IDFW", "IDFH"):
        return False
    if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
        return False
    try:
        return path.stat().st_size >= 70
    except OSError:
        return False


def main(argv=None) -> int:
    p = argparse.ArgumentParser(description=__doc__)
    p.add_argument(
        "--db-path",
        default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
    )
    p.add_argument("--store-root", default=None)
    p.add_argument("--dry-run", action="store_true")
    p.add_argument(
        "--skip-hdf5", action="store_true",
        help="Don't generate .h5 clean-waveform files (only sidecars).",
    )
    p.add_argument(
        "--force", action="store_true",
        help=(
            "Regenerate sidecars + .h5 even when an existing sidecar's "
            "blastware.sha256 matches the current BW file. Use this after "
            "upgrading seismo-relay to pull in decoder bug fixes (e.g. the "
            "STRT-rectime byte-offset fix in v0.15.x)."
        ),
    )
    p.add_argument(
        "--reparse-txt", action="store_true",
        help=(
            "Re-parse the preserved <serial>/<filename>_ASCII.TXT with the "
            "current bw_ascii_report parser and overwrite the sidecar's "
            "bw_report block. Use this after upgrading the ASCII parser to "
            "pull in new fields (e.g. zc_freq_above_range for BW '>100 Hz' "
            "ZC peaks). No-op for events without a preserved .TXT; safely "
            "idempotent when the parser hasn't changed."
        ),
    )
    p.add_argument("-v", "--verbose", action="store_true")
    args = p.parse_args(argv)

    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
        datefmt="%H:%M:%S",
    )

    db_path = Path(args.db_path).expanduser().resolve()
    store_root = (
        Path(args.store_root).expanduser().resolve()
        if args.store_root else db_path.parent / "waveforms"
    )
    if not store_root.exists():
        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
        return 2

    store = WaveformStore(store_root)
    db = SeismoDb(db_path)

    written = skipped = errors = 0
    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
        serial = serial_dir.name
        for path in sorted(serial_dir.iterdir()):
            if not _looks_like_event_file(path):
                continue
            sidecar_path = store.sidecar_path_for(serial, path.name)
            try:
                bw_sha = event_file_io.file_sha256(path)
            except Exception as exc:
                log.error("sha256 failed for %s: %s", path, exc)
                errors += 1
                continue

            # Skip when an up-to-date sidecar already exists.
            #
            # Two-part freshness check:
            #   1. blastware.sha256 must match the current BW file (proves
            #      the sidecar describes THIS file).
            #   2. source.tool_version must be ≥ current TOOL_VERSION (proves
            #      the sidecar was written by a build that includes any
            #      decoder fixes shipped since).
            # Either part failing → regenerate. --force and --reparse-txt
            # bypass both.
            #
            # Tracks whether we're regenerating the sidecar this iteration
            # so the .h5 logic below knows to refresh that too: staleness
            # of the sidecar implies staleness of the derived .h5 (both
            # come out of the same decoder).
            sidecar_stale = True
            if sidecar_path.exists() and not args.force and not args.reparse_txt:
                try:
                    existing = event_file_io.read_sidecar(sidecar_path)
                    sha_ok = existing.get("blastware", {}).get("sha256") == bw_sha
                    src_ver = existing.get("source", {}).get("tool_version", "")

                    def _vt(s):
                        # Compare versions numerically, not as strings.
                        try:
                            return tuple(int(p) for p in str(s).split(".")[:3])
                        except Exception:
                            return (0, 0, 0)

                    ver_ok = _vt(src_ver) >= _vt(event_file_io.TOOL_VERSION)
                    if sha_ok and ver_ok:
                        skipped += 1
                        sidecar_stale = False
                        continue
                    if sha_ok and not ver_ok:
                        log.info(
                            "regenerating %s (sidecar tool_version=%s < current %s)",
                            sidecar_path.name, src_ver or "(none)",
                            event_file_io.TOOL_VERSION,
                        )
                except Exception:
                    pass  # fall through to rewrite

            # Decide path: A5-based (high-fidelity) or BW-only.
            a5_path = serial_dir / f"{path.name}.a5.pkl"
            try:
                if a5_path.exists():
                    frames = store.load_a5(serial, path.name)
                    if not frames:
                        raise RuntimeError("a5_pickle present but unreadable")
                    # Build an Event by replaying the A5 decoders. Note:
                    # the .a5.pkl alone CANNOT recover timestamp /
                    # record_type / waveform_key / per-channel peaks;
                    # those live in the 0C record, which isn't saved
                    # separately. We seed those from the DB row + the
                    # existing sidecar below so a re-backfill doesn't
                    # nuke fields the original save populated.
                    from minimateplus.client import (
                        _decode_a5_metadata_into,
                        _decode_a5_waveform,
                    )
                    from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp
                    ev = Event(index=-1)
                    _decode_a5_metadata_into(frames, ev)
                    _decode_a5_waveform(frames, ev)
                    source_kind = "sfm-live"
                    a5_filename = a5_path.name
                else:
                    ev = event_file_io.read_blastware_file(path)
                    source_kind = "bw-import"
                    a5_filename = None
                    from minimateplus.models import Event, PeakValues, ProjectInfo, Timestamp

                # ── Seed missing fields from the SeismoDb events row ──
                # The DB row was populated at original save time with peaks,
                # project info, timestamp, record_type, sample_rate, etc.
                # All of those survive intact in SQLite; pull them onto the
                # rebuilt Event so the regenerated sidecar matches what was
                # there before the backfill ran.
                db_row = None
                try:
                    import sqlite3 as _sql
                    with _sql.connect(str(db.db_path)) as _conn:
                        _conn.row_factory = _sql.Row
                        db_row = _conn.execute(
                            "SELECT * FROM events "
                            "WHERE serial=? AND blastware_filename=? "
                            "LIMIT 1",
                            (serial, path.name),
                        ).fetchone()
                except Exception as exc:
                    log.debug("DB lookup failed for %s: %s", path.name, exc)

                if db_row is not None:
                    if ev.sample_rate is None and db_row["sample_rate"]:
                        ev.sample_rate = int(db_row["sample_rate"])
                    if not ev.record_type and db_row["record_type"]:
                        ev.record_type = db_row["record_type"]
                    if ev._waveform_key is None and db_row["waveform_key"]:
                        try:
                            ev._waveform_key = bytes.fromhex(db_row["waveform_key"])
                        except Exception:
                            pass
                    # Timestamp from the ISO-8601 string in the DB row.
                    if ev.timestamp is None and db_row["timestamp"]:
                        try:
                            import datetime as _dt
                            _t = _dt.datetime.fromisoformat(db_row["timestamp"])
                            ev.timestamp = Timestamp(
                                raw=b"", flag=0x10,
                                year=_t.year, unknown_byte=0,
                                month=_t.month, day=_t.day,
                                hour=_t.hour, minute=_t.minute, second=_t.second,
                            )
                        except Exception:
                            pass
                    # Peaks from the DB row when the A5 decode didn't supply them.
                    if ev.peak_values is None:
                        ev.peak_values = PeakValues(
                            tran=db_row["tran_ppv"],
                            vert=db_row["vert_ppv"],
                            long=db_row["long_ppv"],
                            peak_vector_sum=db_row["peak_vector_sum"],
                            micl=db_row["mic_ppv"],
                        )
                    # Project info from the DB row when the A5 metadata-page
                    # decode didn't pick it up.
                    if ev.project_info is None or all(
                        v in (None, "")
                        for v in (
                            (ev.project_info.project if ev.project_info else None),
                            (ev.project_info.client if ev.project_info else None),
                            (ev.project_info.operator if ev.project_info else None),
                            (ev.project_info.sensor_location if ev.project_info else None),
                        )
                    ):
                        ev.project_info = ProjectInfo(
                            project=db_row["project"],
                            client=db_row["client"],
                            operator=db_row["operator"],
                            sensor_location=db_row["sensor_location"],
                        )

                # Derive total_samples when we have both rectime + sample_rate.
                # The decoder's STRT-derived value can be a buffer offset
                # rather than a sample count; replace it when it is missing
                # or implausible (more than 2x, or less than 1/4, of the
                # derived count).
                if ev.sample_rate and ev.rectime_seconds:
                    derived = int(round(ev.sample_rate * ev.rectime_seconds))
                    if (ev.total_samples is None
                            or ev.total_samples > derived * 2
                            or ev.total_samples < derived // 4):
                        ev.total_samples = derived

                # Preserve user-edited review state + extensions + the
                # bw_report block from the existing sidecar so a backfill
                # never wipes them out. The bw_report block originates
                # from the paired .TXT ASCII report parsed at ORIGINAL
                # import time (ach forward / direct upload); the .TXT
                # file is not always in the waveform store (only later
                # ingests preserve it), so we can't rely on re-deriving
                # it from disk. event_to_sidecar_dict takes a
                # BwAsciiReport dataclass (not a dict), so for bw_report
                # we overlay the existing block after regen instead of
                # passing it as a kwarg.
                preserved_review = None
                preserved_ext = None
                preserved_bw_report = None
                preserved_txt_fn = None
                if sidecar_path.exists():
                    try:
                        _existing = event_file_io.read_sidecar(sidecar_path)
                        preserved_review = _existing.get("review")
                        preserved_ext = _existing.get("extensions")
                        preserved_bw_report = _existing.get("bw_report")
                        # Preserve txt_filename so backfills don't blank out the
                        # pointer to the saved raw .TXT (events ingested after
                        # 2026-05-27 have this).
                        preserved_txt_fn = (_existing.get("source") or {}).get("txt_filename")
                    except Exception:
                        pass

                # --reparse-txt: if a .TXT is preserved on disk, run the
                # current parser against it and overwrite the bw_report
                # block. Picks up post-ingest parser fixes (e.g. the
                # 2026-05-28 zc_freq_above_range / ">100 Hz" addition).
                if args.reparse_txt and preserved_txt_fn:
                    try:
                        from minimateplus import bw_ascii_report
                        txt_path = store.txt_path_for(serial, path.name)
                        if txt_path.exists():
                            refreshed = bw_ascii_report.parse_report_file(txt_path)
                            preserved_bw_report = event_file_io._bw_report_to_dict(refreshed)
                            log.debug("reparsed bw_report from %s", txt_path.name)
                        else:
                            log.debug("--reparse-txt: no .TXT at %s (sidecar says %r)",
                                      txt_path, preserved_txt_fn)
                    except Exception as exc:
                        log.warning("--reparse-txt failed for %s: %s", path.name, exc)

                # Overlay BW ASCII report fields onto the rebuilt Event
                # BEFORE the sidecar + DB write. Mirrors what the ingest
                # path does: BW's reported peaks (and sample_rate /
                # record_time) win over codec output where present.
                #
                # Without this step, --force backfill silently overwrites
                # the bw_report-overlaid DB columns with codec-derived
                # values, which is wrong for events the codec doesn't
                # fully decode (e.g. waveform walker edge cases on
                # SP0/SS0/SV0-style events, or histogram sub-formats with
                # byte[5]!=0 that aren't yet RE'd). Net effect was PVS=0
                # on three top-10 events on 2026-05-22.
                if preserved_bw_report:
                    event_file_io.apply_bw_report_dict_to_event(
                        ev, preserved_bw_report,
                    )

                sidecar = event_file_io.event_to_sidecar_dict(
                    ev,
                    serial=serial,
                    blastware_filename=path.name,
                    blastware_filesize=path.stat().st_size,
                    blastware_sha256=bw_sha,
                    source_kind=source_kind,
                    a5_pickle_filename=a5_filename,
                    txt_filename=preserved_txt_fn,
                    review=preserved_review,
                    extensions=preserved_ext,
                )
                if preserved_bw_report is not None:
                    sidecar["bw_report"] = preserved_bw_report

                # Also emit the .h5 clean-waveform file when:
                #   - it's missing, OR
                #   - --force was passed, OR
                #   - the sidecar is being regenerated this iteration
                #     (sha mismatch / tool_version too old). The .h5 and
                #     the sidecar are both derived from the same decoder
                #     output, so if the sidecar is stale, so is the .h5.
                #
                # Both waveform and histogram bodies now decode to real
                # samples via event_file_io.read_blastware_file → either
                # waveform_codec.decode_waveform_v2 or
                # histogram_codec.decode_histogram_body. If samples are
                # still empty after both codecs run, it's a genuine "we
                # can't decode this file" case (truncated, malformed, or
                # unknown mode); skip the .h5 write so we don't replace
                # whatever's there with an empty placeholder.
                has_samples = bool(
                    ev.raw_samples and any(
                        ev.raw_samples.get(ch) for ch in ("Tran", "Vert", "Long", "MicL")
                    )
                )
                hdf5_path = store.hdf5_path_for(serial, path.name)
                hdf5_filename = hdf5_path.name if hdf5_path.exists() else None
                hdf5_action = "kept"
                need_h5 = (
                    not args.skip_hdf5
                    and (args.force or not hdf5_path.exists() or sidecar_stale)
                    and has_samples
                )
                if not has_samples and not args.skip_hdf5:
                    hdf5_action = "skipped-undecodable"
                if need_h5:
                    if args.dry_run:
                        hdf5_action = "would (re)write"
                    else:
                        # Record existence BEFORE writing; checking afterwards
                        # would always report "rewrote".
                        h5_existed = hdf5_path.exists()
                        try:
                            event_hdf5.write_event_hdf5(
                                hdf5_path, ev,
                                serial=serial,
                                geo_range="normal",
                                source_kind=source_kind,
                            )
                            hdf5_filename = hdf5_path.name
                            hdf5_action = "rewrote" if h5_existed else "wrote"
                        except Exception as exc:
                            log.warning("HDF5 write failed for %s: %s", path.name, exc)
                            hdf5_action = "FAILED"

                if args.dry_run:
                    print(f" [DRY ] would write {sidecar_path.name} "
                          f"+ .h5 ({hdf5_action}) source={source_kind}")
                    written += 1
                    continue

                event_file_io.write_sidecar(sidecar_path, sidecar)

                # Best-effort: keep the SQL row's sidecar_filename in sync
                # by upserting via insert_events (it dedups on serial+ts).
                try:
                    db.insert_events(
                        [ev], serial=serial,
                        waveform_records=(
                            {ev._waveform_key.hex(): {
                                "filename": path.name,
                                "filesize": path.stat().st_size,
                                "a5_pickle_filename": a5_filename,
                                "sidecar_filename": sidecar_path.name,
                            }}
                            if ev._waveform_key else None
                        ),
                        device_family="series3",
                    )
                except Exception as exc:
                    log.warning("DB upsert failed for %s: %s", path.name, exc)

                print(f" [OK ] {path.name} → {sidecar_path.name} "
                      f"+ .h5 ({hdf5_action}) source={source_kind}")
                written += 1

            except Exception as exc:
                log.error("backfill failed for %s: %s", path, exc, exc_info=args.verbose)
                errors += 1

    print(f"\nDone. written={written} skipped(uptodate)={skipped} errors={errors}")
    return 0 if errors == 0 else 1


if __name__ == "__main__":
    sys.exit(main())
|
||||||
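A reader-side sketch of the two-part freshness check that `scripts/backfill_sidecars.py` applies before skipping a sidecar. It assumes the stored `blastware.sha256` is a hex SHA-256 digest of the BW file bytes (what `event_file_io.file_sha256` returns is not shown in this diff), and the names `version_tuple` and `sidecar_is_fresh` are hypothetical, not part of the commit.

```python
import hashlib


def version_tuple(s):
    """Parse 'X.Y.Z' into a tuple of ints; unparseable input sorts lowest."""
    try:
        return tuple(int(part) for part in str(s).split(".")[:3])
    except ValueError:
        return (0, 0, 0)


def sidecar_is_fresh(sidecar, bw_bytes, tool_version):
    """True only when the digest matches AND the sidecar's version is current."""
    # Assumption: the stored digest is a hex SHA-256 of the BW file bytes.
    digest = hashlib.sha256(bw_bytes).hexdigest()
    sha_ok = sidecar.get("blastware", {}).get("sha256") == digest
    src_ver = sidecar.get("source", {}).get("tool_version", "")
    ver_ok = version_tuple(src_ver) >= version_tuple(tool_version)
    return sha_ok and ver_ok


bw_bytes = b"example BW payload"  # made-up file contents
sidecar = {
    "blastware": {"sha256": hashlib.sha256(bw_bytes).hexdigest()},
    "source": {"tool_version": "0.9.0"},
}
print(sidecar_is_fresh(sidecar, bw_bytes, "0.9.0"))   # True
print(sidecar_is_fresh(sidecar, bw_bytes, "0.16.1"))  # False
print("0.9.0" >= "0.16.1")                            # True
```

The last line shows why the script compares integer tuples: a plain string comparison would rank `0.9.0` above `0.16.1` and treat an old sidecar as current.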
@@ -0,0 +1,331 @@
|
|||||||
|
"""
scripts/backfill_thor_events.py: re-process existing Thor (Series IV)
events so their sidecars carry the bw_report block produced by
``micromate.idf_to_bw_report.build_bw_report_from_idf`` + their .h5
clean-waveform files for IDFW events.

Why this exists
───────────────

Thor events ingested before v0.21.0 (or during the v0.21.0 ingest bug
window fixed in commit bee1185) have sidecars with only
``extensions.idf_report`` and no ``bw_report`` block. Without
``bw_report``, the SFM PDF renderer falls back to DB-only fields
(misses sensor-self-check, full per-channel breakdown, mic dB(L)),
and the modal chart 404s on ``/waveform.json`` for IDFW events
because no .h5 was written when the codec failed at ingest.

Re-forwarding from thor-watcher would also fix this, but that requires
operator coordination on every watcher machine and uses bandwidth this
script doesn't.

What this does
──────────────

Walks ``<store>/<serial>/<filename>`` for ``.IDFW`` / ``.IDFH`` files
and, for each one:

  1. Reads the existing sidecar (preserving review state + captured_at).
  2. Re-runs ``micromate.idf_file.read_idf_file()`` on the binary
     bytes, passing ``data=`` so the codec doesn't try to read from
     a path it doesn't know.
  3. Pulls ``extensions.idf_report`` (the raw parsed Thor dict the
     v0.18.0+ ingest path already stashed) and runs the v0.21.0
     ``build_bw_report_from_idf`` adapter against it.
  4. Writes the refreshed sidecar with the new ``bw_report``,
     bumped ``source.tool_version``, but preserved ``review`` block
     + the original ``captured_at`` timestamp.
  5. Regenerates the .h5 waveform file via the existing
     ``event_hdf5`` writer. For IDFW that's the decoded per-sample
     stream; for IDFH it's a 1-sample-per-interval synthesised array
     (peak ADC count per channel) so the renderer's bar-chart code
     has data to group on. Mic peak psi from the binary is merged
     onto the IdfEvent before the bridge so the h5 writer's per-count
     mic scale factor lands on a sensible value (without this the
     mic chart on Thor events plots dB(L)-as-pseudo-psi and shows
     bomb-level numbers).

Idempotent. Re-running it after a parser/adapter change just
re-writes sidecars: no DB writes, no thor-watcher coordination.

Usage
─────

    python scripts/backfill_thor_events.py [--store-root PATH]
                                           [--db-path PATH]
                                           [--dry-run]
                                           [--skip-hdf5]
                                           [--force]
                                           [-v]

By default, refreshes any Thor event whose sidecar is missing
``bw_report`` OR whose ``source.tool_version`` is older than the
current ``TOOL_VERSION``. ``--force`` refreshes every Thor event
regardless. ``--db-path`` is used only to derive the default
``--store-root``.
"""

from __future__ import annotations

import argparse
import logging
import sys
from pathlib import Path

# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from minimateplus import event_file_io
from sfm.waveform_store import WaveformStore

log = logging.getLogger("backfill_thor_events")


def _is_thor_event(path: Path) -> bool:
    if not path.is_file():
        return False
    if path.name.endswith((".sfm.json", ".h5", "_ASCII.TXT")):
        return False
    return path.suffix.upper() in (".IDFW", ".IDFH")


def _vtuple(s: str) -> tuple:
    try:
        return tuple(int(p) for p in str(s).split(".")[:3])
    except Exception:
        return (0, 0, 0)


def main(argv=None) -> int:
|
||||||
|
p = argparse.ArgumentParser(description=__doc__)
|
||||||
|
p.add_argument(
|
||||||
|
"--db-path",
|
||||||
|
default=str(Path(__file__).resolve().parent.parent / "bridges" / "captures" / "seismo_relay.db"),
|
||||||
|
help="Used only to derive the default --store-root.",
|
||||||
|
)
|
||||||
|
p.add_argument("--store-root", default=None)
|
||||||
|
p.add_argument("--dry-run", action="store_true")
|
||||||
|
p.add_argument("--skip-hdf5", action="store_true",
|
||||||
|
help="Don't regenerate .h5 files for IDFW events.")
|
||||||
|
p.add_argument("--force", action="store_true",
|
||||||
|
help="Refresh every Thor event, not just ones with stale or missing bw_report.")
|
||||||
|
p.add_argument("-v", "--verbose", action="store_true")
|
||||||
|
args = p.parse_args(argv)
|
||||||
|
|
||||||
|
logging.basicConfig(
|
||||||
|
level=logging.DEBUG if args.verbose else logging.INFO,
|
||||||
|
format="%(asctime)s %(levelname)-7s %(name)s %(message)s",
|
||||||
|
datefmt="%H:%M:%S",
|
||||||
|
)
|
||||||
|
|
||||||
|
db_path = Path(args.db_path).expanduser().resolve()
|
||||||
|
store_root = (
|
||||||
|
Path(args.store_root).expanduser().resolve()
|
||||||
|
if args.store_root else db_path.parent / "waveforms"
|
||||||
|
)
|
||||||
|
    if not store_root.exists():
        log.error("store root not found: %s", store_root)
        return 1
    store = WaveformStore(store_root)
    log.info("store root: %s", store_root)
    log.info("current TOOL_VERSION: %s", event_file_io.TOOL_VERSION)

    refreshed = skipped = errors = h5_written = 0

    # Lazy imports so any one of these failing produces a useful error
    # message rather than crashing module-load.
    from micromate.idf_file import read_idf_file
    from micromate.idf_to_bw_report import build_bw_report_from_idf

    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
        serial = serial_dir.name
        for path in sorted(serial_dir.iterdir()):
            if not _is_thor_event(path):
                continue

            sidecar_path = store.sidecar_path_for(serial, path.name)
            if not sidecar_path.exists():
                log.debug("%s: no sidecar, skipping (this is a binary without ingest history)",
                          path.name)
                skipped += 1
                continue

            try:
                existing = event_file_io.read_sidecar(sidecar_path)
            except Exception as exc:
                log.warning("%s: failed to read sidecar: %s", path.name, exc)
                errors += 1
                continue

            has_bw_report = bool(existing.get("bw_report"))
            existing_version = (existing.get("source") or {}).get("tool_version", "")
            up_to_date = (
                has_bw_report
                and _vtuple(existing_version) >= _vtuple(event_file_io.TOOL_VERSION)
            )
            if up_to_date and not args.force:
                skipped += 1
                continue

            # Re-decode the binary. Catch + log; continue with .txt-only
            # data if it fails (matches the live ingest path's behavior).
            idf_samples = None
            idf_intervals = None
            binary_md = None
            res = None  # stays None if decoding fails, so later checks never see a stale result
            is_histogram = path.suffix.upper() == ".IDFH"
            try:
                binary_bytes = path.read_bytes()
                res = read_idf_file(path, data=binary_bytes)
                idf_samples = res.samples or None
                idf_intervals = res.intervals
                binary_md = res.binary_metadata
                is_histogram = res.intervals is not None
            except NotImplementedError:
                # sig-B / Blastware-stray binary; no samples but adapter
                # can still produce a bw_report from extensions.idf_report.
                log.debug("%s: binary codec NotImplementedError (sig-B / BW-stray); proceeding from sidecar's idf_report only", path.name)
            except Exception as exc:
                log.warning("%s: binary decode failed: %s; proceeding from sidecar's idf_report only", path.name, exc)

            # Run the adapter. Pull report_dict from
            # extensions.idf_report (the v0.18.0+ ingest preserved it).
            report_dict = (existing.get("extensions") or {}).get("idf_report") or {}
            if not report_dict and binary_md is None:
                log.debug("%s: no idf_report in sidecar AND no binary metadata; nothing to project", path.name)
                skipped += 1
                continue

            try:
                bw_report = build_bw_report_from_idf(
                    report_dict, binary_md=binary_md,
                    intervals=idf_intervals, is_histogram=is_histogram,
                )
            except Exception as exc:
                log.warning("%s: adapter failed: %s", path.name, exc)
                errors += 1
                continue

            # Build the new sidecar by overlaying refreshed fields onto
            # the existing one, which preserves review, captured_at, the
            # blastware block, source.kind, etc.
            new_sidecar = dict(existing)  # shallow copy
            new_sidecar["bw_report"] = bw_report
            src = dict(new_sidecar.get("source") or {})
            src["tool_version"] = event_file_io.TOOL_VERSION
            new_sidecar["source"] = src

            # Preserve histogram intervals if the binary decoded them
            # (improves over the original ingest if that one ran before
            # the bee1185 codec fix).
            if idf_intervals is not None:
                ext = dict(new_sidecar.get("extensions") or {})
                ext["idf_intervals"] = [
                    {
                        "offset": iv.offset,
                        "tran_peak": iv.peak_count("Tran"),
                        "tran_halfp": iv.tran_halfp,
                        "tran_freq": iv.freq_hz("Tran"),
                        "vert_peak": iv.peak_count("Vert"),
                        "vert_halfp": iv.vert_halfp,
                        "vert_freq": iv.freq_hz("Vert"),
                        "long_peak": iv.peak_count("Long"),
                        "long_halfp": iv.long_halfp,
                        "long_freq": iv.freq_hz("Long"),
                        "mic_peak": iv.peak_count("MicL"),
                        "mic_halfp": iv.micl_halfp,
                        "mic_freq": iv.freq_hz("MicL"),
                    }
                    for iv in idf_intervals
                ]
                new_sidecar["extensions"] = ext

            if args.dry_run:
                will_write_h5 = (idf_samples or idf_intervals) and not args.skip_hdf5
                log.info("[DRY] %s/%s: would refresh sidecar (bw_report=%s, h5=%s)",
                         serial, path.name,
                         "would add" if not has_bw_report else "would refresh",
                         "would write" if will_write_h5 else "skipped")
            else:
                event_file_io.write_sidecar(sidecar_path, new_sidecar)
                log.info("%s/%s: sidecar refreshed (bw_report=%s, intervals=%d)",
                         serial, path.name,
                         "added" if not has_bw_report else "refreshed",
                         len(idf_intervals) if idf_intervals else 0)
            refreshed += 1

            # Regenerate .h5 by replaying the same IdfEvent → Event bridge
            # save_imported_idf uses. For IDFW we write the decoded
            # per-sample arrays. For IDFH we synthesise a 1-sample-per-interval
            # array (peak ADC count per channel per interval) so the
            # renderer's bar-chart code has something to group on.
            # Pre-condition: either real samples (IDFW) or decoded intervals
            # (IDFH). Skip otherwise.
            have_data = bool(idf_samples) or bool(idf_intervals)
            if have_data and not args.skip_hdf5:
                from sfm import event_hdf5
                hdf5_path = store.hdf5_path_for(serial, path.name)
                if args.dry_run:
                    log.debug("[DRY] would write %s", hdf5_path.name)
                else:
                    try:
                        from micromate import IdfEvent
                        from minimateplus.event_file_io import file_sha256
                        idf_event = IdfEvent.from_report(report_dict, path.name)

                        # Merge the binary-derived mic peak psi (only the
                        # binary path knows the proper psi value; the .txt
                        # carries dB(L)). Without this, the h5 writer's
                        # per-count mic factor is computed against the
                        # dB(L) value-as-pseudo-psi and the mic chart
                        # scales wildly.
                        if (binary_md is not None and res is not None
                                and res.event.peaks.mic_pspl_psi is not None):
                            idf_event.peaks.mic_pspl_psi = res.event.peaks.mic_pspl_psi

                        sha256 = file_sha256(path)
                        waveform_key = bytes.fromhex(sha256)[:16]
                        ev = idf_event.to_minimateplus_event(waveform_key)

                        if is_histogram and idf_intervals:
                            # 1 sample per interval per channel, the same
                            # synthesis save_imported_idf uses. The h5
                            # writer's count×geo_fs/32768 conversion turns
                            # each peak-ADC-count into the bar's physical
                            # value.
                            ev.raw_samples = {
                                "Tran": [iv.peak_count("Tran") for iv in idf_intervals],
                                "Vert": [iv.peak_count("Vert") for iv in idf_intervals],
                                "Long": [iv.peak_count("Long") for iv in idf_intervals],
                                "MicL": [iv.peak_count("MicL") for iv in idf_intervals],
                            }
                            ev.total_samples = ev.total_samples or len(idf_intervals)
                        elif idf_samples:
                            ev.raw_samples = idf_samples
                            n_samp = max(
                                (len(idf_samples.get(ch, []))
                                 for ch in ("Tran", "Vert", "Long", "MicL")),
                                default=0,
                            )
                            ev.total_samples = ev.total_samples or n_samp

                        event_hdf5.write_event_hdf5(
                            hdf5_path, ev,
                            serial=serial,
                            geo_range="normal",
                            source_kind="idf-import",
                            tool_version=event_file_io.TOOL_VERSION,
                        )
                        h5_written += 1
                        # Mirror the branch condition above so len() is never
                        # taken on a missing interval list.
                        log.debug("%s/%s: .h5 written (%s)",
                                  serial, path.name,
                                  f"{len(idf_intervals)} intervals" if is_histogram and idf_intervals
                                  else f"{sum(len(v) for v in (idf_samples or {}).values())} samples")
                    except Exception as exc:
                        log.warning("%s/%s: .h5 write failed: %s",
                                    serial, path.name, exc)

    log.info("Done. refreshed=%d skipped=%d errors=%d h5_written=%d",
             refreshed, skipped, errors, h5_written)
    return 0 if errors == 0 else 2


if __name__ == "__main__":
    sys.exit(main())
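The refresh decision and the sidecar overlay in the script above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `version_tuple`, `needs_refresh`, and `overlay_report` are hypothetical names, and the sidecar is assumed to be a plain dict with `bw_report` and `source.tool_version` keys as the script reads them.

```python
from typing import Any, Dict, Tuple


def version_tuple(text: Any) -> Tuple[int, ...]:
    """Parse 'major.minor.patch' into a padded 3-tuple; (0, 0, 0) if unparseable."""
    try:
        parts = [int(p) for p in str(text).split(".")[:3]]
    except ValueError:
        return (0, 0, 0)
    # Pad so "0.18" and "0.18.0" compare equal.
    return tuple(parts + [0] * (3 - len(parts)))


def needs_refresh(sidecar: Dict[str, Any], current_version: str, force: bool = False) -> bool:
    """True when bw_report is missing or was written by an older tool version."""
    has_report = bool(sidecar.get("bw_report"))
    stored = (sidecar.get("source") or {}).get("tool_version", "")
    up_to_date = has_report and version_tuple(stored) >= version_tuple(current_version)
    return force or not up_to_date


def overlay_report(sidecar: Dict[str, Any], bw_report: Dict[str, Any],
                   current_version: str) -> Dict[str, Any]:
    """Return a copy with bw_report and source.tool_version replaced.

    The top-level dict and the nested 'source' dict are copied, so the
    input sidecar is left untouched and unrelated keys are preserved.
    """
    new = dict(sidecar)
    new["bw_report"] = bw_report
    src = dict(new.get("source") or {})
    src["tool_version"] = current_version
    new["source"] = src
    return new
```

Copying the nested `source` dict before mutating it is what keeps the original sidecar intact; a shallow copy of the top level alone would share that inner dict.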
Executable
+100
@@ -0,0 +1,100 @@
#!/usr/bin/env bash
# Fire-and-forget Stop Monitoring loop, for wedged or constantly-triggering units.
#
# Hammers POST /device/stop_monitoring_blind in a tight loop. The endpoint
# opens TCP, dumps SESSION_RESET + a few copies of the SUB 0x97 frame, and
# closes without ever reading an S3 response. Each TCP-won attempt is
# ~50ms of wire activity instead of the multi-frame handshake the regular
# rescue endpoint does, so windows that are too small for the full rescue
# can still land a stop-monitoring command.
#
# Usage:
#   ./blind_stop.sh <host> [tcp_port]
#
# Env:
#   SFM_BASE_URL     Default: http://localhost:8200 (SFM direct).
#                    Set to http://localhost:8001/api/sfm to route through
#                    Terra-View's proxy.
#   MAX_ATTEMPTS     Default: 600
#   SLEEP_S          Default: 0 (no backoff; hammer it)
#   MAX_TIME_S       Default: 15
#   CONNECT_TIMEOUT  Default: 5
#   REPEAT           Frames per TCP session (default 3; increases hit rate
#                    if the device is busy reading its own buffer).
#   STOP_ON_OK       Default: 1. Set to 0 to keep hammering indefinitely
#                    even after successful sends (every 503 means the device
#                    is in *another* session, every 200 means our bytes got
#                    through, but the device may not have processed them).

set -u

host="${1:-}"
tcp_port="${2:-9034}"
if [[ -z "$host" ]]; then
  echo "usage: $0 <host> [tcp_port]" >&2
  exit 2
fi

base="${SFM_BASE_URL:-http://localhost:8200}"
max_attempts="${MAX_ATTEMPTS:-600}"
sleep_s="${SLEEP_S:-0}"
max_time_s="${MAX_TIME_S:-15}"
connect_timeout="${CONNECT_TIMEOUT:-5}"
repeat="${REPEAT:-3}"
stop_on_ok="${STOP_ON_OK:-1}"

url="${base}/device/stop_monitoring_blind?host=${host}&tcp_port=${tcp_port}&connect_timeout=${connect_timeout}&repeat=${repeat}"

echo "blind_stop: target ${host}:${tcp_port} connect_timeout=${connect_timeout}s repeat=${repeat}"
echo "blind_stop: POST ${url}"
echo "blind_stop: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
echo "blind_stop: stop_on_ok=${stop_on_ok}"
echo

ok_count=0
busy_count=0
err_count=0
started=$(date +%s)

for ((i=1; i<=max_attempts; i++)); do
  printf "[%4d] %s " "$i" "$(date +%H:%M:%S)"
  # curl's -w already prints "000" when the request fails, so echoing a
  # second "000" on failure would yield "000000" and miss the 000) branch.
  # Only fall back to "000" if curl printed nothing at all.
  http_code=$(curl -sS -o /tmp/blind_resp.$$ -w "%{http_code}" \
      --max-time "$max_time_s" \
      -X POST "$url" || true)
  http_code="${http_code:-000}"
  body=$(cat /tmp/blind_resp.$$ 2>/dev/null || true)
  rm -f /tmp/blind_resp.$$

  case "$http_code" in
    200|201)
      ok_count=$((ok_count + 1))
      echo "SENT $body"
      if [[ "$stop_on_ok" == "1" ]]; then
        elapsed=$(( $(date +%s) - started ))
        echo
        echo "blind_stop: success after ${i} attempts (${elapsed}s). ok=${ok_count} busy=${busy_count} err=${err_count}"
        echo "blind_stop: NEXT: wait ~10s, then try the full rescue:"
        echo " /home/serversdown/seismo-relay/scripts/rescue_device.sh ${host} ${tcp_port}"
        exit 0
      fi
      ;;
    503)
      busy_count=$((busy_count + 1))
      echo "busy (503)"
      ;;
    000)
      err_count=$((err_count + 1))
      echo "curl error"
      ;;
    *)
      err_count=$((err_count + 1))
      echo "HTTP $http_code $body" | head -c 400
      echo
      ;;
  esac
  [[ "$sleep_s" != "0" ]] && sleep "$sleep_s"
done

elapsed=$(( $(date +%s) - started ))
echo
echo "blind_stop: gave up after ${max_attempts} attempts (${elapsed}s). ok=${ok_count} busy=${busy_count} err=${err_count}" >&2
exit 1
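The retry-and-classify loop that blind_stop.sh implements in bash can be sketched in Python for clarity. This is an illustrative sketch only: `classify` and `hammer` are hypothetical names, and the HTTP call is injected as a `post` callable (assumed to return a status code, or `None` when no response arrived) so the control flow can be exercised without a network.

```python
from typing import Callable, Dict, Optional


def classify(status: Optional[int]) -> str:
    """Map an HTTP status to the script's three buckets."""
    if status in (200, 201):
        return "ok"      # bytes were sent to the device
    if status == 503:
        return "busy"    # device is in another session
    return "err"         # transport failure or unexpected status


def hammer(post: Callable[[], Optional[int]], max_attempts: int,
           stop_on_ok: bool = True) -> Dict[str, int]:
    """Call `post` up to max_attempts times and tally the outcomes."""
    counts = {"ok": 0, "busy": 0, "err": 0, "attempts": 0}
    for _ in range(max_attempts):
        counts["attempts"] += 1
        try:
            status = post()
        except OSError:
            # Connection refused, timeouts and similar socket errors.
            status = None
        outcome = classify(status)
        counts[outcome] += 1
        if outcome == "ok" and stop_on_ok:
            break
    return counts
```

Injecting `post` keeps the loop testable; a real caller might wrap `urllib.request.urlopen` and translate `urllib.error.HTTPError` into its status code, which is left out here.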
@@ -0,0 +1,185 @@
"""
scripts/check_bw_report_preservation.py - verify that running backfill_sidecars
doesn't wipe the `bw_report` block from sidecars that already had one.

Two-step workflow:

    # Before running backfill, capture a baseline snapshot:
    python scripts/check_bw_report_preservation.py snapshot \
        --store-root /path/to/waveforms \
        --out before.json

    # Run backfill:
    python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force

    # After backfill, diff against the baseline:
    python scripts/check_bw_report_preservation.py diff \
        --store-root /path/to/waveforms \
        --baseline before.json

The diff classifies every sidecar into one of:

    PRESERVED      had bw_report before, has same hash now        ← GOOD
    CHANGED        had bw_report before, has different hash now   ← suspicious
                   (backfill should only ever copy the block verbatim)
    WIPED          had bw_report before, doesn't now              ← BUG: data loss
    STILL_MISSING  didn't have bw_report before, still doesn't    ← expected
    NEW            didn't have bw_report before, has one now
                   (only possible if a re-ingest happened between snapshots;
                   shouldn't happen during backfill)
    REMOVED        sidecar existed in baseline, file is gone now
    ADDED          sidecar didn't exist in baseline, exists now

Exit code is 0 if no WIPED or CHANGED entries are found, 1 otherwise
(2 if the store root or the baseline file does not exist).
"""

from __future__ import annotations

import argparse
import hashlib
import json
import sys
from pathlib import Path
from typing import Optional

# Allow running from the repo root without installation.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from minimateplus import event_file_io


def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
    """Canonical-JSON hash of the bw_report block, or None if absent."""
    br = sidecar_data.get("bw_report")
    if not br:
        return None
    # sort_keys for stable hashing across dict-ordering differences
    blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()


def _scan_store(store_root: Path) -> dict:
    """Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.

    Relpath is `<serial>/<filename>`, which is stable across machines/snapshots.
    """
    out: dict[str, Optional[str]] = {}
    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
        for sidecar in sorted(serial_dir.glob("*.sfm.json")):
            relpath = f"{serial_dir.name}/{sidecar.name}"
            try:
                data = event_file_io.read_sidecar(sidecar)
            except Exception as exc:
                print(f" WARN: failed to read {relpath}: {exc}", file=sys.stderr)
                continue
            out[relpath] = _bw_report_hash(data)
    return out


def cmd_snapshot(args) -> int:
    store_root = Path(args.store_root).expanduser().resolve()
    if not store_root.exists():
        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
        return 2
    out_path = Path(args.out).expanduser().resolve()

    print(f"Scanning {store_root} …")
    snapshot = _scan_store(store_root)

    with_bw = sum(1 for v in snapshot.values() if v is not None)
    without_bw = sum(1 for v in snapshot.values() if v is None)
    print(f" total sidecars: {len(snapshot)}")
    print(f" with bw_report: {with_bw}")
    print(f" without bw_report: {without_bw}")

    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w") as f:
        json.dump({
            "store_root": str(store_root),
            "total": len(snapshot),
            "with_bw": with_bw,
            "sidecars": snapshot,
        }, f, indent=2, sort_keys=True)
    print(f"Wrote baseline → {out_path}")
    return 0


def cmd_diff(args) -> int:
    store_root = Path(args.store_root).expanduser().resolve()
    if not store_root.exists():
        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
        return 2
    baseline_path = Path(args.baseline).expanduser().resolve()
    if not baseline_path.exists():
        print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
        return 2

    with open(baseline_path) as f:
        baseline = json.load(f)
    before = baseline["sidecars"]
    print(f"Scanning {store_root} for comparison against {baseline_path.name} …")
    after = _scan_store(store_root)

    classes = {k: [] for k in (
        "PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
    )}
    all_keys = set(before) | set(after)
    for key in sorted(all_keys):
        b = before.get(key, "__MISSING__")
        a = after.get(key, "__MISSING__")
        if b == "__MISSING__":
            classes["ADDED"].append(key)
        elif a == "__MISSING__":
            classes["REMOVED"].append(key)
        elif b is None and a is None:
            classes["STILL_MISSING"].append(key)
        elif b is None and a is not None:
            classes["NEW"].append(key)
        elif b is not None and a is None:
            classes["WIPED"].append(key)
        elif b == a:
            classes["PRESERVED"].append(key)
        else:
            classes["CHANGED"].append(key)

    print()
    print(f"{'class':16s} {'count':>7s}")
    print("-" * 24)
    for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
              "NEW", "ADDED", "REMOVED"):
        print(f"{k:16s} {len(classes[k]):>7d}")

    # Show samples of the concerning classes
    for k in ("WIPED", "CHANGED"):
        if classes[k]:
            print(f"\n=== {k} samples (up to 10) ===")
            for key in classes[k][:10]:
                print(f" {key}")

    if classes["WIPED"] or classes["CHANGED"]:
        print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
        return 1
    print("\nbw_report preservation looks intact.")
    return 0


def main(argv=None) -> int:
    p = argparse.ArgumentParser(description=__doc__)
    sub = p.add_subparsers(dest="cmd", required=True)

    p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
    p_snap.add_argument("--store-root", required=True)
    p_snap.add_argument("--out", required=True, help="output JSON path")
    p_snap.set_defaults(func=cmd_snapshot)

    p_diff = sub.add_parser("diff", help="diff current store against a baseline")
    p_diff.add_argument("--store-root", required=True)
    p_diff.add_argument("--baseline", required=True, help="JSON from `snapshot`")
    p_diff.set_defaults(func=cmd_diff)

    args = p.parse_args(argv)
    return args.func(args)


if __name__ == "__main__":
    sys.exit(main())
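The snapshot/diff idea above rests on two pieces: a hash that is stable under key reordering, and a seven-way comparison of two `{relpath: hash_or_None}` maps. The sketch below reproduces both in a self-contained form. The names `block_hash` and `classify_snapshots` are hypothetical, and a private sentinel object stands in for the script's `"__MISSING__"` string.

```python
import hashlib
import json
from typing import Any, Dict, Optional

_ABSENT = object()  # marks "key not present in this snapshot"


def block_hash(block: Any) -> Optional[str]:
    """SHA-256 of the canonical JSON form of a block, or None if empty/missing."""
    if not block:
        return None
    # sort_keys plus fixed separators makes the text independent of dict order.
    blob = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()


def classify_snapshots(before: Dict[str, Optional[str]],
                       after: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Label every key that appears in either snapshot."""
    labels: Dict[str, str] = {}
    for key in sorted(set(before) | set(after)):
        b = before.get(key, _ABSENT)
        a = after.get(key, _ABSENT)
        if b is _ABSENT:
            labels[key] = "ADDED"
        elif a is _ABSENT:
            labels[key] = "REMOVED"
        elif b is None and a is None:
            labels[key] = "STILL_MISSING"
        elif b is None:
            labels[key] = "NEW"
        elif a is None:
            labels[key] = "WIPED"
        elif a == b:
            labels[key] = "PRESERVED"
        else:
            labels[key] = "CHANGED"
    return labels
```

An identity-compared sentinel avoids any chance of colliding with a real value, which a string marker cannot strictly guarantee.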
@@ -0,0 +1,151 @@
"""
scripts/repair_unknown_serials.py - re-attribute events stuck under
`serial = 'UNKNOWN'` to their correct serial by decoding the BW filename.

Why this is needed
──────────────────
The /db/import/blastware_file endpoint had a bug (fixed in commit a032fa5+1
on the ach-report-ingestion branch) where every forwarded event was inserted
with serial='UNKNOWN' because the endpoint's `_serial_from_event(ev)` stub
returned None and never consulted the BW-filename serial that
`WaveformStore.save_imported_bw()` had already decoded.

Effect on a server that ran a buggy version: every forwarded event's
SeismoDb row has `serial='UNKNOWN'`, even though the on-disk waveform
store has correctly bucketed the files into `BE<NNNN>/` folders. So
the BW binaries / sidecars / HDF5s are fine, but `/db/units` and
`/db/events?serial=...` queries don't surface the events.

This script
───────────
Walks the events table looking for rows with `serial='UNKNOWN'` and
re-attributes each one to the serial decoded from its
`blastware_filename` column. If the row's serial would collide with
an existing row (already-correct duplicate from a later re-forward),
the UNKNOWN row is deleted. Otherwise the row's `serial` column is
updated in-place.

Idempotent: re-running after a successful repair finds zero matching
rows and exits cleanly.

Usage
─────
    # Dry-run (default): print what would change, don't touch the DB
    python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db

    # Apply the repair
    python -m scripts.repair_unknown_serials --db bridges/captures/seismo_relay.db --apply
"""

from __future__ import annotations

import argparse
import sqlite3
import sys
from pathlib import Path

# Reach into sfm.waveform_store for the serial decoder. This script
# is run from the repo root via `python -m scripts.repair_unknown_serials`.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from sfm.waveform_store import _serial_from_bw_filename


def main(argv: list[str] | None = None) -> int:
    p = argparse.ArgumentParser(
        description="Re-attribute events stuck under serial='UNKNOWN'.",
    )
    p.add_argument(
        "--db", required=True, type=Path,
        help="Path to seismo_relay.db (e.g. bridges/captures/seismo_relay.db)",
    )
    p.add_argument(
        "--apply", action="store_true",
        help="Apply the repair. Without this flag the script runs in "
             "dry-run mode and only reports what would change.",
    )
    args = p.parse_args(argv)

    if not args.db.exists():
        print(f"DB not found: {args.db}", file=sys.stderr)
        return 2

    conn = sqlite3.connect(str(args.db))
    conn.row_factory = sqlite3.Row

    rows = list(conn.execute(
        "SELECT id, serial, timestamp, blastware_filename "
        " FROM events "
        " WHERE serial = 'UNKNOWN' "
        " ORDER BY timestamp",
    ))
    print(f"Found {len(rows)} UNKNOWN-serial rows in events table.")
    if not rows:
        conn.close()  # nothing to repair; release the handle before exiting
        return 0

    updated = 0
    deleted = 0
    unresolved = 0
    by_serial: dict[str, int] = {}

    for row in rows:
        rid = row["id"]
        ts = row["timestamp"]
        bw_name = row["blastware_filename"]
        # str() keeps the short-id slice working whether ids are stored as
        # text or as integers.
        short_id = str(rid)[:8]
        new_serial = _serial_from_bw_filename(bw_name) if bw_name else None
        if not new_serial:
            print(f" ⚠ id={short_id} ts={ts} filename={bw_name!r}: "
                  f"cannot decode serial from filename; skipping")
            unresolved += 1
            continue

        # Check for an existing row at the target (serial, timestamp).
        existing = conn.execute(
            "SELECT id FROM events WHERE serial = ? AND timestamp = ?",
            (new_serial, ts),
        ).fetchone()
        action: str
        if existing is None:
            # Safe to UPDATE in place.
            if args.apply:
                conn.execute(
                    "UPDATE events SET serial = ? WHERE id = ?",
                    (new_serial, rid),
                )
            action = "UPDATE"
            updated += 1
        else:
            # A correctly-attributed row already exists. Drop the
            # UNKNOWN duplicate.
            if args.apply:
                conn.execute("DELETE FROM events WHERE id = ?", (rid,))
            action = "DELETE (dup)"
            deleted += 1

        by_serial[new_serial] = by_serial.get(new_serial, 0) + 1
        print(f" {action:14s} id={short_id} ts={ts} "
              f"filename={bw_name} → {new_serial}")

    if args.apply:
        conn.commit()
    conn.close()

    print()
    print("Summary:")
    print(f" UNKNOWN rows scanned: {len(rows)}")
    print(f" Updated to real serial: {updated}")
    print(" Deleted (duplicate of an ")
    print(f" already-correct row): {deleted}")
    print(f" Unresolved (bad filename): {unresolved}")
    print()
    if by_serial:
        print("Per-serial breakdown of repaired rows:")
        for serial, count in sorted(by_serial.items()):
            print(f" {serial:12s} {count}")
    if not args.apply:
        print()
        print("(dry-run: re-run with --apply to commit)")
    return 0


if __name__ == "__main__":
    sys.exit(main())
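The update-or-delete rule at the heart of the repair can be tried against an in-memory SQLite database. The sketch below is illustrative only: the four-column `events` schema is an assumption reduced to what the script queries, and `decode_serial` is a hypothetical stand-in for `_serial_from_bw_filename`, whose real decoding rules are not shown in this diff.

```python
import sqlite3
from typing import Callable, Optional, Tuple


def decode_serial(filename: Optional[str]) -> Optional[str]:
    """Hypothetical decoder: treat a leading 'BE<digits>_' prefix as the serial."""
    if not filename or not filename.startswith("BE"):
        return None
    prefix = filename.split("_", 1)[0]
    return prefix if prefix[2:].isdigit() else None


def repair_unknown(conn: sqlite3.Connection,
                   decode: Callable[[Optional[str]], Optional[str]]) -> Tuple[int, int, int]:
    """Re-attribute or drop UNKNOWN rows; returns (updated, deleted, unresolved)."""
    updated = deleted = unresolved = 0
    rows = conn.execute(
        "SELECT id, timestamp, blastware_filename FROM events "
        "WHERE serial = 'UNKNOWN' ORDER BY timestamp"
    ).fetchall()
    for rid, ts, name in rows:
        serial = decode(name)
        if not serial:
            unresolved += 1
            continue
        dup = conn.execute(
            "SELECT 1 FROM events WHERE serial = ? AND timestamp = ?", (serial, ts)
        ).fetchone()
        if dup is None:
            conn.execute("UPDATE events SET serial = ? WHERE id = ?", (serial, rid))
            updated += 1
        else:
            # A correctly attributed row already covers this event.
            conn.execute("DELETE FROM events WHERE id = ?", (rid,))
            deleted += 1
    conn.commit()
    return updated, deleted, unresolved
```

Running it twice on the same database shows the idempotence the docstring describes: the second pass finds only rows whose filename cannot be decoded.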
Executable
+99
@@ -0,0 +1,99 @@
#!/usr/bin/env bash
# Rescue an uncooperative MiniMate that's busy with another ACH session.
#
# Hammers POST /device/rescue in a tight loop with a short timeout. When the
# device is in an ACH session our SYN either gets refused or silently dropped
# (5s connect timeout inside the endpoint) and we retry immediately. When the
# device is between sessions, our TCP wins, the endpoint disables Auto Call
# Home and erases events inside the same session, then returns success.
#
# Usage:
#   ./rescue_device.sh <host> [tcp_port] [--no-erase] [--no-disable-ach]
#
# Examples:
#   ./rescue_device.sh 166.246.130.1 9034
#   ./rescue_device.sh 166.246.130.1 9034 --no-erase   # just silence it
#
# Environment:
#   SFM_BASE_URL     Defaults to http://localhost:8200 (SFM direct).
#                    Set to http://localhost:8001/api/sfm to route through
#                    Terra-View's proxy. Direct mode avoids the proxy's
#                    60s timeout, which matters for long-running endpoints.
#   MAX_ATTEMPTS     Cap on retries (default 600 ≈ 30+ min).
#   SLEEP_S          Backoff between attempts (default 1).
#   MAX_TIME_S       Per-request timeout (default 60).
#   CONNECT_TIMEOUT  TCP connect timeout (default 5).
#   RECV_TIMEOUT     Per-frame S3 recv timeout (default 5). If POLL or any
#                    subsequent frame doesn't respond within this window, the
#                    rescue endpoint bails and this script retries.

set -u

host="${1:-}"
tcp_port="${2:-9034}"
shift 2 2>/dev/null || shift $# 2>/dev/null

if [[ -z "$host" ]]; then
  echo "usage: $0 <host> [tcp_port] [--no-erase] [--no-disable-ach]" >&2
  exit 2
fi

disable_ach="true"
erase="true"
for arg in "$@"; do
  case "$arg" in
    --no-erase) erase="false" ;;
    --no-disable-ach) disable_ach="false" ;;
    *) echo "unknown flag: $arg" >&2; exit 2 ;;
  esac
done

base="${SFM_BASE_URL:-http://localhost:8200}"
max_attempts="${MAX_ATTEMPTS:-600}"
sleep_s="${SLEEP_S:-1}"
max_time_s="${MAX_TIME_S:-60}"
connect_timeout="${CONNECT_TIMEOUT:-5}"
recv_timeout="${RECV_TIMEOUT:-5}"

url="${base}/device/rescue?host=${host}&tcp_port=${tcp_port}&disable_ach=${disable_ach}&erase=${erase}&connect_timeout=${connect_timeout}&recv_timeout=${recv_timeout}"

echo "rescue: target ${host}:${tcp_port} disable_ach=${disable_ach} erase=${erase}"
echo "rescue: connect_timeout=${connect_timeout}s recv_timeout=${recv_timeout}s"
echo "rescue: POST ${url}"
echo "rescue: up to ${max_attempts} attempts, ${sleep_s}s between, ${max_time_s}s per request"
echo

started=$(date +%s)
for ((i=1; i<=max_attempts; i++)); do
  printf "[%3d] %s " "$i" "$(date +%H:%M:%S)"
  # curl's -w already prints "000" when the request fails, so echoing a
  # second "000" on failure would yield "000000" and miss the 000) branch.
  # Only fall back to "000" if curl printed nothing at all.
  http_code=$(curl -sS -o /tmp/rescue_resp.$$ -w "%{http_code}" \
      --max-time "$max_time_s" \
      -X POST "$url" || true)
  http_code="${http_code:-000}"
  body=$(cat /tmp/rescue_resp.$$ 2>/dev/null || true)
  rm -f /tmp/rescue_resp.$$

  case "$http_code" in
    200|201)
      elapsed=$(( $(date +%s) - started ))
      echo "OK (${elapsed}s total)"
      echo "$body"
      exit 0
      ;;
    503)
      # Connection refused / timeout: device busy in another session. Retry fast.
      echo "busy (503)"
      ;;
    000)
      echo "curl error (network)"
      ;;
    *)
      echo "HTTP $http_code"
      echo " $body" | head -c 400
      echo
      ;;
  esac
  sleep "$sleep_s"
done

echo "rescue: gave up after ${max_attempts} attempts" >&2
exit 1
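The rescue script builds its query string by plain interpolation, which works for dotted IPv4 hosts but would break on values containing `&`, spaces, or `#`. As a hedged illustration of the same request URL built with escaping, here is a small sketch; `rescue_url` is a hypothetical helper, and the parameter names are taken from the script's `url=` line.

```python
from urllib.parse import urlencode


def rescue_url(base: str, host: str, tcp_port: int = 9034,
               disable_ach: bool = True, erase: bool = True,
               connect_timeout: int = 5, recv_timeout: int = 5) -> str:
    """Build the /device/rescue URL with every query value percent-encoded."""
    query = urlencode({
        "host": host,
        "tcp_port": tcp_port,
        # The shell script sends lowercase "true"/"false" strings.
        "disable_ach": str(disable_ach).lower(),
        "erase": str(erase).lower(),
        "connect_timeout": connect_timeout,
        "recv_timeout": recv_timeout,
    })
    return f"{base.rstrip('/')}/device/rescue?{query}"
```

For example, `rescue_url("http://localhost:8200", "166.246.130.1", erase=False)` yields the same URL the script would POST to when run with `--no-erase`.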
Executable
+44
@@ -0,0 +1,44 @@
#!/usr/bin/env bash
# Hold a single TCP session open and drip stop-monitoring frames at a slow
# rate, so the device's UART RX FIFO has time to drain between sends.
#
# Use when high-rate spam isn't landing, typically because the device's
# firmware is too busy to drain its serial buffer fast enough and bytes
# are being lost to UART overrun.
#
# Usage:
#   ./slow_drip.sh <host> [tcp_port] [duration_s]
#
# Env:
#   DURATION         Default: 120 (seconds; arg 3 overrides). Clamped 1..600.
#   INTERVAL         Seconds between drip sends (default 3). Lower = more
#                    aggressive, more risk of FIFO overrun. Higher = safer
#                    but fewer total drips per duration.
#   CONNECT_TIMEOUT  Default: 5
#   SFM_BASE_URL     Default: http://localhost:8200 (SFM direct).

set -u

host="${1:-}"
tcp_port="${2:-9034}"
duration="${3:-${DURATION:-120}}"
if [[ -z "$host" ]]; then
  echo "usage: $0 <host> [tcp_port] [duration_s]" >&2
  exit 2
fi

base="${SFM_BASE_URL:-http://localhost:8200}"
interval="${INTERVAL:-3}"
connect_timeout="${CONNECT_TIMEOUT:-5}"

url="${base}/device/stop_monitoring_slow_drip?host=${host}&tcp_port=${tcp_port}&duration_s=${duration}&interval_s=${interval}&connect_timeout=${connect_timeout}"

echo "slow_drip: target ${host}:${tcp_port} duration=${duration}s interval=${interval}s connect_timeout=${connect_timeout}s"
echo "slow_drip: POST ${url}"
echo

# Give curl enough slack to wait out the duration plus a buffer
max_time=$(awk -v d="$duration" 'BEGIN { printf "%d", d + 30 }')

curl -sS --max-time "$max_time" -X POST "$url"
echo
Executable
+48
@@ -0,0 +1,48 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# Hammer a device with blind stop-monitoring sessions as fast as possible.
|
||||||
|
# Single HTTP call kicks off the burst inside SFM (no per-attempt HTTP
|
||||||
|
# overhead). Default: 10 seconds, ~500 ms per attempt = ~2 attempts/sec (~20 total).
|
||||||
|
#
|
||||||
|
# Usage:
|
||||||
|
# ./spam_stop.sh <host> [tcp_port] [duration_s]
|
||||||
|
#
|
||||||
|
# Examples:
|
||||||
|
# ./spam_stop.sh 166.246.130.1 # 10s burst
|
||||||
|
# ./spam_stop.sh 166.246.130.1 9034 30 # 30s burst
|
||||||
|
# DURATION=60 CONNECT_TIMEOUT=0.2 ./spam_stop.sh 166.246.130.1
|
||||||
|
#
|
||||||
|
# Env:
|
||||||
|
# SFM_BASE_URL Default: http://localhost:8200 (SFM direct).
|
||||||
|
# Set to http://localhost:8001/api/sfm to route through
|
||||||
|
# Terra-View's proxy — but note the proxy has a 60s
|
||||||
|
# timeout, so long bursts need direct mode.
|
||||||
|
# DURATION Default: 10 (seconds; arg 3 overrides)
|
||||||
|
# CONNECT_TIMEOUT Default: 0.5 (seconds)
|
||||||
|
# REPEAT Default: 3 (stop frames per TCP session)
|
||||||
|
|
||||||
|
set -u
|
||||||
|
|
||||||
|
host="${1:-}"
|
||||||
|
tcp_port="${2:-9034}"
|
||||||
|
duration="${3:-${DURATION:-10}}"
|
||||||
|
|
||||||
|
if [[ -z "$host" ]]; then
|
||||||
|
echo "usage: $0 <host> [tcp_port] [duration_s]" >&2
|
||||||
|
exit 2
|
||||||
|
fi
|
||||||
|
|
||||||
|
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||||
|
connect_timeout="${CONNECT_TIMEOUT:-0.5}"
|
||||||
|
repeat="${REPEAT:-3}"
|
||||||
|
|
||||||
|
url="${base}/device/stop_monitoring_spam?host=${host}&tcp_port=${tcp_port}&duration_s=${duration}&connect_timeout=${connect_timeout}&repeat=${repeat}"
|
||||||
|
|
||||||
|
echo "spam_stop: target ${host}:${tcp_port} duration=${duration}s connect_timeout=${connect_timeout}s repeat=${repeat}"
|
||||||
|
echo "spam_stop: POST ${url}"
|
||||||
|
echo
|
||||||
|
|
||||||
|
# Give curl enough slack to wait out the duration plus a buffer
|
||||||
|
max_time=$(awk -v d="$duration" 'BEGIN { printf "%d", d + 10 }')
|
||||||
|
|
||||||
|
curl -sS --max-time "$max_time" -X POST "$url"
|
||||||
|
echo
|
||||||
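Review note: the burst arithmetic in the spam_stop.sh header can be checked with a small sketch. It assumes each blind attempt costs roughly the connect timeout (0.5 s by default). That is an assumption about SFM's behaviour, not something this diff shows. `burst_estimate` and `curl_max_time` are hypothetical helper names; the `+ 10` slack mirrors the script's awk expression.

```python
def burst_estimate(duration_s: float, per_attempt_s: float) -> tuple[float, int]:
    """Return (attempts per second, total attempts) for one burst.

    Assumes every attempt takes about per_attempt_s seconds, which is
    only a rough model of a connect-timeout-bound attempt.
    """
    rate = 1.0 / per_attempt_s
    return rate, int(duration_s * rate)


def curl_max_time(duration_s: float, slack_s: int = 10) -> int:
    """Mirror the script's awk expression: whole seconds of duration + slack."""
    return int(duration_s + slack_s)


rate, total = burst_estimate(duration_s=10, per_attempt_s=0.5)
budget = curl_max_time(10)
```

Under that assumption the default burst works out to about 2 attempts per second and about 20 attempts in total, with curl allowed 20 seconds before giving up.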
@@ -0,0 +1,91 @@
|
|||||||
|
"""Re-ingest a prod IDFW + IDFH via the patched save_imported_idf and
|
||||||
|
render both PDFs to confirm charts have data."""
|
||||||
|
from __future__ import annotations
|
||||||
|
import sys
|
||||||
|
import json
|
||||||
|
import datetime
|
||||||
|
import tempfile
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
|
||||||
|
|
||||||
|
from sfm.waveform_store import WaveformStore
|
||||||
|
from sfm import report_pdf
|
||||||
|
import h5py
|
||||||
|
|
||||||
|
|
||||||
|
class FakeDb:
|
||||||
|
def __init__(self, event):
|
||||||
|
self.event = event
|
||||||
|
def get_event(self, _id):
|
||||||
|
return self.event
|
||||||
|
|
||||||
|
|
||||||
|
def to_ts_iso(ts):
|
||||||
|
if ts is None:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return datetime.datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second).isoformat()
|
||||||
|
except Exception:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def render_case(idf_path: Path, serial: str, out_pdf: Path, h5_summary: bool = True):
|
||||||
|
with tempfile.TemporaryDirectory() as td:
|
||||||
|
store = WaveformStore(Path(td))
|
||||||
|
ev, rec = store.save_imported_idf(
|
||||||
|
idf_path.read_bytes(),
|
||||||
|
idf_path,
|
||||||
|
idf_report_text=None, # production worst case: no .txt
|
||||||
|
)
|
||||||
|
print(f"=== {idf_path.name} ===")
|
||||||
|
print(f" h5: {rec['hdf5_filename']}, sidecar: {rec['sidecar_filename']}")
|
||||||
|
|
||||||
|
h5p = Path(td) / serial / f"{idf_path.name}.h5"
|
||||||
|
if h5p.exists() and h5_summary:
|
||||||
|
with h5py.File(h5p) as h:
|
||||||
|
for ch in ("Tran", "Vert", "Long", "MicL"):
|
||||||
|
ds = h.get(f"samples/{ch}")
|
||||||
|
if ds is not None:
|
||||||
|
n = ds.shape[0]
|
||||||
|
mx = float(abs(ds[...]).max()) if n else 0
|
||||||
|
print(f" samples/{ch}: n={n} max_abs={mx:.5f}")
|
||||||
|
|
||||||
|
record_type = "Histogram" if idf_path.suffix.upper() == ".IDFH" else "Waveform"
|
||||||
|
fake_row = {
|
||||||
|
"serial": serial,
|
||||||
|
"blastware_filename": rec["filename"],
|
||||||
|
"record_type": record_type,
|
||||||
|
"timestamp": to_ts_iso(ev.timestamp),
|
||||||
|
"sample_rate": ev.sample_rate,
|
||||||
|
"project": ev.project_info.project if ev.project_info else None,
|
||||||
|
"client": ev.project_info.client if ev.project_info else None,
|
||||||
|
"operator": ev.project_info.operator if ev.project_info else None,
|
||||||
|
"sensor_location": ev.project_info.sensor_location if ev.project_info else None,
|
||||||
|
"created_at": None,
|
||||||
|
}
|
||||||
|
rd = report_pdf.gather_report_data(FakeDb(fake_row), store, event_id="test-1")
|
||||||
|
print(f" ReportData: channels={ {k: len(v) for k,v in rd.channels.items()} }")
|
||||||
|
if rd.is_histogram:
|
||||||
|
print(f" histogram n_intervals={rd.histogram_n_intervals} interval_size={rd.histogram_interval_size}")
|
||||||
|
pdf = report_pdf.render_event_report_pdf(rd)
|
||||||
|
out_pdf.write_bytes(pdf)
|
||||||
|
print(f" PDF: {out_pdf} ({len(pdf)} bytes)")
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
out_dir = Path("/tmp/thor_render_test"); out_dir.mkdir(exist_ok=True)
|
||||||
|
cases = [
|
||||||
|
# IDFW that decoded to preamble-only under the old codec
|
||||||
|
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804154137.IDFW", "UM6047"),
|
||||||
|
# IDFW that worked under the old codec (validates no regression)
|
||||||
|
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804104450.IDFW", "UM6047"),
|
||||||
|
# IDFH histogram
|
||||||
|
("/home/serversdown/seismo-relay-prod-snap/waveforms/UM6047/UM6047_20250804190047.IDFH", "UM6047"),
|
||||||
|
]
|
||||||
|
for path, serial in cases:
|
||||||
|
render_case(Path(path), serial, out_dir / f"{Path(path).name}.pdf")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
Executable
+58
@@ -0,0 +1,58 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# Passive monitor for a misbehaving unit. Every INTERVAL seconds, attempts
|
||||||
|
# a single short TCP probe + storage_range read and logs the result. Designed
|
||||||
|
# to run unattended for hours/days and tell you when the unit comes back.
|
||||||
|
#
|
||||||
|
# Usage:
|
||||||
|
# ./watch_unit.sh <host> [tcp_port]
|
||||||
|
#
|
||||||
|
# Env:
|
||||||
|
# INTERVAL Seconds between checks (default 300 = 5 min)
|
||||||
|
# LOG_FILE Append results here (default /tmp/watch_<host>.log)
|
||||||
|
# SFM_BASE_URL Default: http://localhost:8200
|
||||||
|
|
||||||
|
set -u
|
||||||
|
|
||||||
|
host="${1:-}"
|
||||||
|
tcp_port="${2:-9034}"
|
||||||
|
if [[ -z "$host" ]]; then
|
||||||
|
echo "usage: $0 <host> [tcp_port]" >&2
|
||||||
|
exit 2
|
||||||
|
fi
|
||||||
|
|
||||||
|
interval="${INTERVAL:-300}"
|
||||||
|
log_file="${LOG_FILE:-/tmp/watch_${host}.log}"
|
||||||
|
base="${SFM_BASE_URL:-http://localhost:8200}"
|
||||||
|
|
||||||
|
url="${base}/device/events/storage_range?host=${host}&tcp_port=${tcp_port}"
|
||||||
|
|
||||||
|
echo "watch_unit: target ${host}:${tcp_port} interval=${interval}s log=${log_file}"
|
||||||
|
echo "watch_unit: Ctrl-C to stop"
|
||||||
|
|
||||||
|
while true; do
|
||||||
|
ts=$(date '+%Y-%m-%d %H:%M:%S')
|
||||||
|
http_code=$(curl -sS -o /tmp/watch_resp.$$ -w "%{http_code}" \
|
||||||
|
--max-time 20 "$url" || echo "000")
|
||||||
|
body=$(cat /tmp/watch_resp.$$ 2>/dev/null || true)
|
||||||
|
rm -f /tmp/watch_resp.$$
|
||||||
|
|
||||||
|
case "$http_code" in
|
||||||
|
200|201)
|
||||||
|
# Strip the raw_hex for readability
|
||||||
|
summary=$(echo "$body" | sed 's/"raw_hex":"[^"]*",*//; s/,*$//' | head -c 200)
|
||||||
|
echo "$ts REACHABLE $summary" | tee -a "$log_file"
|
||||||
|
;;
|
||||||
|
502|503)
|
||||||
|
err=$(echo "$body" | head -c 150)
|
||||||
|
echo "$ts ERROR_$http_code $err" | tee -a "$log_file"
|
||||||
|
;;
|
||||||
|
000)
|
||||||
|
echo "$ts CURL_FAIL (network/timeout)" | tee -a "$log_file"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
echo "$ts HTTP_$http_code $(echo "$body" | head -c 150)" | tee -a "$log_file"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
|
||||||
|
sleep "$interval"
|
||||||
|
done
|
||||||
+804
-112
File diff suppressed because it is too large
+140
-11
@@ -83,13 +83,24 @@ class CachedEvent(Base):
|
|||||||
|
|
||||||
Events are immutable once recorded on the device; once we have an event in
|
Events are immutable once recorded on the device; once we have an event in
|
||||||
the cache it never needs to be re-downloaded unless explicitly requested.
|
the cache it never needs to be re-downloaded unless explicitly requested.
|
||||||
|
|
||||||
|
The two extra columns `waveform_key` and `event_timestamp` are an
|
||||||
|
integrity stamp: when set_events() / set_waveform() are called with a
|
||||||
|
different (waveform_key, event_timestamp) for the same (conn_key, index),
|
||||||
|
we know the device was erased and re-recorded — the cached row no longer
|
||||||
|
refers to the same physical event and the entire device's cache is
|
||||||
|
flushed before the new entry is written. This catches the post-erase
|
||||||
|
key-reuse bug where the device's first new event (key 01110000) collides
|
||||||
|
with the first event we previously downloaded.
|
||||||
"""
|
"""
|
||||||
__tablename__ = "cached_events"
|
__tablename__ = "cached_events"
|
||||||
|
|
||||||
conn_key = sa.Column(sa.String, primary_key=True)
|
conn_key = sa.Column(sa.String, primary_key=True)
|
||||||
index = sa.Column(sa.Integer, primary_key=True)
|
index = sa.Column(sa.Integer, primary_key=True)
|
||||||
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
|
event_json = sa.Column(sa.Text, nullable=False) # serialised Event dict
|
||||||
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
|
cached_at = sa.Column(sa.Float, nullable=False) # Unix timestamp
|
||||||
|
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
|
||||||
|
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
|
||||||
|
|
||||||
|
|
||||||
class CachedWaveform(Base):
|
class CachedWaveform(Base):
|
||||||
@@ -97,14 +108,18 @@ class CachedWaveform(Base):
|
|||||||
Full raw ADC waveform for a single event (SUB 5A full download).
|
Full raw ADC waveform for a single event (SUB 5A full download).
|
||||||
|
|
||||||
These are large (up to several MB) and expensive to fetch over cellular.
|
These are large (up to several MB) and expensive to fetch over cellular.
|
||||||
Once downloaded they are immutable and cached permanently.
|
Once downloaded they are immutable and cached permanently — but the
|
||||||
|
cache row is invalidated when the device is erased and a new event lands
|
||||||
|
at the same index (see CachedEvent docstring).
|
||||||
"""
|
"""
|
||||||
__tablename__ = "cached_waveforms"
|
__tablename__ = "cached_waveforms"
|
||||||
|
|
||||||
conn_key = sa.Column(sa.String, primary_key=True)
|
conn_key = sa.Column(sa.String, primary_key=True)
|
||||||
index = sa.Column(sa.Integer, primary_key=True)
|
index = sa.Column(sa.Integer, primary_key=True)
|
||||||
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
|
waveform_json = sa.Column(sa.Text, nullable=False) # full /device/event/{idx}/waveform response JSON
|
||||||
cached_at = sa.Column(sa.Float, nullable=False)
|
cached_at = sa.Column(sa.Float, nullable=False)
|
||||||
|
waveform_key = sa.Column(sa.String, nullable=True) # 8-hex device key
|
||||||
|
event_timestamp = sa.Column(sa.String, nullable=True) # ISO-8601 from 0C
|
||||||
|
|
||||||
|
|
||||||
class CachedMonitorStatus(Base):
|
class CachedMonitorStatus(Base):
|
||||||
@@ -149,6 +164,23 @@ class SFMCache:
|
|||||||
engine = sa.create_engine(url, connect_args={"check_same_thread": False})
|
engine = sa.create_engine(url, connect_args={"check_same_thread": False})
|
||||||
Base.metadata.create_all(engine)
|
Base.metadata.create_all(engine)
|
||||||
self._Session = orm.sessionmaker(bind=engine)
|
self._Session = orm.sessionmaker(bind=engine)
|
||||||
|
# In-place schema migration: add the (waveform_key, event_timestamp)
|
||||||
|
# integrity-stamp columns to legacy cache DBs that predate the
|
||||||
|
# post-erase eviction logic. ALTER TABLE ADD COLUMN is made idempotent
|
||||||
|
# via the column-presence check below.
|
||||||
|
with engine.begin() as conn:
|
||||||
|
for table in ("cached_events", "cached_waveforms"):
|
||||||
|
cols = {
|
||||||
|
r[1]
|
||||||
|
for r in conn.exec_driver_sql(f"PRAGMA table_info({table})").fetchall()
|
||||||
|
}
|
||||||
|
for new_col, ddl in (
|
||||||
|
("waveform_key", "TEXT"),
|
||||||
|
("event_timestamp", "TEXT"),
|
||||||
|
):
|
||||||
|
if new_col not in cols:
|
||||||
|
log.info("cache schema: %s ADD COLUMN %s %s", table, new_col, ddl)
|
||||||
|
conn.exec_driver_sql(f"ALTER TABLE {table} ADD COLUMN {new_col} {ddl}")
|
||||||
log.info("SFM cache opened: %s", db_path)
|
log.info("SFM cache opened: %s", db_path)
|
||||||
|
|
||||||
# ── Connection key ────────────────────────────────────────────────────────
|
# ── Connection key ────────────────────────────────────────────────────────
|
||||||
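Review note: the column-presence migration in this hunk can be exercised in isolation. The sketch below uses only the standard-library `sqlite3` module against an in-memory database rather than the SQLAlchemy engine used in the diff. `add_missing_columns` is a hypothetical helper name, and the two-column starting table is a stand-in for a legacy cache DB.

```python
import sqlite3


def add_missing_columns(conn, table, wanted):
    """Add each (name, ddl_type) pair that `table` lacks; return the names added.

    Table and column names are interpolated into SQL, so they must come
    from trusted, hard-coded values (as they do in the migration above).
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, ddl in wanted:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {ddl}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
# Stand-in for a legacy table that predates the integrity-stamp columns.
conn.execute("CREATE TABLE cached_events (conn_key TEXT, idx INTEGER)")
stamp_cols = (("waveform_key", "TEXT"), ("event_timestamp", "TEXT"))

first_run = add_missing_columns(conn, "cached_events", stamp_cols)
second_run = add_missing_columns(conn, "cached_events", stamp_cols)
```

The second call adds nothing. Without the presence check, SQLite would reject the repeated `ADD COLUMN` with a duplicate-column error, so the check is what makes the migration safe to run on every startup.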
@@ -242,15 +274,91 @@ class SFMCache:
|
|||||||
row = s.get(CachedEvent, (conn_key, index))
|
row = s.get(CachedEvent, (conn_key, index))
|
||||||
return json.loads(row.event_json) if row else None
|
return json.loads(row.event_json) if row else None
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _event_signature(ev: dict) -> tuple[Optional[str], Optional[str]]:
|
||||||
|
"""
|
||||||
|
Extract the (waveform_key_hex, timestamp_iso) integrity stamp from
|
||||||
|
a serialised event dict. Either field may be None if the source
|
||||||
|
Event was missing it; the comparison logic in set_events/set_waveform
|
||||||
|
treats "both sides have a value AND they differ" as the only
|
||||||
|
eviction trigger, so partial data never spuriously flushes cache.
|
||||||
|
"""
|
||||||
|
key = ev.get("waveform_key") or ev.get("_waveform_key")
|
||||||
|
if isinstance(key, (bytes, bytearray)):
|
||||||
|
key = bytes(key).hex()
|
||||||
|
ts = ev.get("timestamp")
|
||||||
|
if isinstance(ts, dict):
|
||||||
|
# _serialise_timestamp returns a dict like {"iso": "...", ...}
|
||||||
|
ts = ts.get("iso") or ts.get("string") or None
|
||||||
|
return (key if isinstance(key, str) else None,
|
||||||
|
ts if isinstance(ts, str) else None)
|
||||||
|
|
||||||
|
def _maybe_flush_on_mismatch(
|
||||||
|
self,
|
||||||
|
s,
|
||||||
|
conn_key: str,
|
||||||
|
index: int,
|
||||||
|
new_key: Optional[str],
|
||||||
|
new_ts: Optional[str],
|
||||||
|
) -> bool:
|
||||||
|
"""
|
||||||
|
Check whether the cached entry at (conn_key, index) has a different
|
||||||
|
(waveform_key, timestamp) than the incoming one. If so, treat it as
|
||||||
|
a post-erase key-reuse signal and flush ALL cached events/waveforms
|
||||||
|
for this device, then return True.
|
||||||
|
Returns False when no flush was needed.
|
||||||
|
"""
|
||||||
|
if not new_key and not new_ts:
|
||||||
|
return False # nothing to compare against
|
||||||
|
existing = s.get(CachedEvent, (conn_key, index))
|
||||||
|
if existing is None:
|
||||||
|
existing = s.get(CachedWaveform, (conn_key, index))
|
||||||
|
if existing is None:
|
||||||
|
return False
|
||||||
|
old_key = existing.waveform_key
|
||||||
|
old_ts = existing.event_timestamp
|
||||||
|
# Only flush when both sides have populated values and they differ.
|
||||||
|
differs = (
|
||||||
|
(new_key and old_key and new_key != old_key)
|
||||||
|
or (new_ts and old_ts and new_ts != old_ts)
|
||||||
|
)
|
||||||
|
if not differs:
|
||||||
|
return False
|
||||||
|
log.warning(
|
||||||
|
"cache: device %s — index %d (key=%s, ts=%s) replaces (key=%s, ts=%s); "
|
||||||
|
"flushing all cached events/waveforms for this device "
|
||||||
|
"(post-erase key reuse detected)",
|
||||||
|
conn_key, index, new_key, new_ts, old_key, old_ts,
|
||||||
|
)
|
||||||
|
s.query(CachedEvent).filter_by(conn_key=conn_key).delete()
|
||||||
|
s.query(CachedWaveform).filter_by(conn_key=conn_key).delete()
|
||||||
|
return True
|
||||||
|
|
||||||
def set_events(self, conn_key: str, events: list[dict]) -> None:
|
def set_events(self, conn_key: str, events: list[dict]) -> None:
|
||||||
"""
|
"""
|
||||||
Upsert a list of event dicts. Existing rows are updated; new rows are
|
Upsert a list of event dicts. Existing rows are updated; new rows are
|
||||||
inserted. This is used to add newly-discovered events to the cache.
|
inserted. This is used to add newly-discovered events to the cache.
|
||||||
|
|
||||||
|
Eviction: if any incoming event has a different (waveform_key,
|
||||||
|
timestamp) than the row currently cached at the same index, we flush
|
||||||
|
the entire device's cache before inserting the new entries. Catches
|
||||||
|
post-erase key reuse where index 0 silently switches identity.
|
||||||
"""
|
"""
|
||||||
now = time.time()
|
now = time.time()
|
||||||
with self._Session() as s:
|
with self._Session() as s:
|
||||||
|
# Eviction check: scan incoming events for any (index, key, ts)
|
||||||
|
# that conflicts with a cached row. A single conflict triggers
|
||||||
|
# a full device-wide flush so we don't end up with a mixed-era
|
||||||
|
# cache.
|
||||||
|
for ev in events:
|
||||||
|
key, ts = self._event_signature(ev)
|
||||||
|
if self._maybe_flush_on_mismatch(s, conn_key, ev["index"], key, ts):
|
||||||
|
s.commit()
|
||||||
|
break # cache is now empty for this device; carry on
|
||||||
|
|
||||||
for ev in events:
|
for ev in events:
|
||||||
idx = ev["index"]
|
idx = ev["index"]
|
||||||
|
key, ts = self._event_signature(ev)
|
||||||
row = s.get(CachedEvent, (conn_key, idx))
|
row = s.get(CachedEvent, (conn_key, idx))
|
||||||
if row is None:
|
if row is None:
|
||||||
row = CachedEvent(
|
row = CachedEvent(
|
||||||
@@ -258,12 +366,18 @@ class SFMCache:
|
|||||||
index=idx,
|
index=idx,
|
||||||
event_json=json.dumps(ev),
|
event_json=json.dumps(ev),
|
||||||
cached_at=now,
|
cached_at=now,
|
||||||
|
waveform_key=key,
|
||||||
|
event_timestamp=ts,
|
||||||
)
|
)
|
||||||
s.add(row)
|
s.add(row)
|
||||||
log.debug("cached new event %d for %s", idx, conn_key)
|
log.debug("cached new event %d for %s", idx, conn_key)
|
||||||
else:
|
else:
|
||||||
# Refresh in case project_info was backfilled after initial store
|
# Refresh in case project_info was backfilled after initial store
|
||||||
row.event_json = json.dumps(ev)
|
row.event_json = json.dumps(ev)
|
||||||
|
if key:
|
||||||
|
row.waveform_key = key
|
||||||
|
if ts:
|
||||||
|
row.event_timestamp = ts
|
||||||
s.commit()
|
s.commit()
|
||||||
|
|
||||||
# ── Waveforms ─────────────────────────────────────────────────────────────
|
# ── Waveforms ─────────────────────────────────────────────────────────────
|
||||||
@@ -278,8 +392,16 @@ class SFMCache:
|
|||||||
return json.loads(row.waveform_json)
|
return json.loads(row.waveform_json)
|
||||||
|
|
||||||
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
|
def set_waveform(self, conn_key: str, index: int, waveform: dict) -> None:
|
||||||
"""Store a full waveform response dict permanently."""
|
"""
|
||||||
|
Store a full waveform response dict permanently.
|
||||||
|
|
||||||
|
Like set_events, this checks the (waveform_key, timestamp) signature
|
||||||
|
of the incoming entry against what's currently cached at the same
|
||||||
|
index. A mismatch flushes the entire device's cache before insert.
|
||||||
|
"""
|
||||||
|
key, ts = self._event_signature(waveform)
|
||||||
with self._Session() as s:
|
with self._Session() as s:
|
||||||
|
self._maybe_flush_on_mismatch(s, conn_key, index, key, ts)
|
||||||
row = s.get(CachedWaveform, (conn_key, index))
|
row = s.get(CachedWaveform, (conn_key, index))
|
||||||
if row is None:
|
if row is None:
|
||||||
row = CachedWaveform(
|
row = CachedWaveform(
|
||||||
@@ -287,13 +409,20 @@ class SFMCache:
|
|||||||
index=index,
|
index=index,
|
||||||
waveform_json=json.dumps(waveform),
|
waveform_json=json.dumps(waveform),
|
||||||
cached_at=time.time(),
|
cached_at=time.time(),
|
||||||
|
waveform_key=key,
|
||||||
|
event_timestamp=ts,
|
||||||
)
|
)
|
||||||
s.add(row)
|
s.add(row)
|
||||||
else:
|
else:
|
||||||
row.waveform_json = json.dumps(waveform)
|
row.waveform_json = json.dumps(waveform)
|
||||||
row.cached_at = time.time()
|
row.cached_at = time.time()
|
||||||
|
if key:
|
||||||
|
row.waveform_key = key
|
||||||
|
if ts:
|
||||||
|
row.event_timestamp = ts
|
||||||
s.commit()
|
s.commit()
|
||||||
log.debug("cached waveform for %s event %d", conn_key, index)
|
log.debug("cached waveform for %s event %d (key=%s, ts=%s)",
|
||||||
|
conn_key, index, key, ts)
|
||||||
|
|
||||||
# ── Monitor status ────────────────────────────────────────────────────────
|
# ── Monitor status ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
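Review note: the eviction rule described in the docstrings above ("both sides have a value AND they differ") can be written as a pure function, which makes the partial-data cases easy to check. This is a minimal sketch of that rule only, not the cache code itself. `signatures_conflict` is a hypothetical name and the sample key and timestamp strings are illustrative.

```python
from typing import Optional, Tuple

# (waveform_key_hex, timestamp_iso); either field may be None.
Signature = Tuple[Optional[str], Optional[str]]


def signatures_conflict(old: Signature, new: Signature) -> bool:
    """True only when some field is populated on both sides and differs."""
    old_key, old_ts = old
    new_key, new_ts = new
    key_differs = bool(old_key and new_key and old_key != new_key)
    ts_differs = bool(old_ts and new_ts and old_ts != new_ts)
    return key_differs or ts_differs


# Same device key reused after an erase, but a different event timestamp.
reused_key = signatures_conflict(
    ("01110000", "2025-08-04T10:44:50"),
    ("01110000", "2025-08-04T15:41:37"),
)
# Legacy row with no stamp yet: nothing to compare, so no flush.
legacy_row = signatures_conflict((None, None), ("01110000", "2025-08-04T15:41:37"))
```

Here `reused_key` is true and `legacy_row` is false, which matches the stated intent that rows written before the migration never trigger a spurious device-wide flush.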
|
|||||||
+332
-18
@@ -81,6 +81,11 @@ CREATE TABLE IF NOT EXISTS events (
|
|||||||
sample_rate INTEGER,
|
sample_rate INTEGER,
|
||||||
record_type TEXT, -- "single_shot" | "continuous"
|
record_type TEXT, -- "single_shot" | "continuous"
|
||||||
false_trigger INTEGER NOT NULL DEFAULT 0, -- 0=no, 1=yes (manual flag)
|
false_trigger INTEGER NOT NULL DEFAULT 0, -- 0=no, 1=yes (manual flag)
|
||||||
|
blastware_filename TEXT, -- event file within waveform store; extension is per-event (AB0T encodes timestamp)
|
||||||
|
blastware_filesize INTEGER, -- bytes; NULL if no event file saved
|
||||||
|
a5_pickle_filename TEXT, -- "<filename>.a5.pkl" sidecar
|
||||||
|
sidecar_filename TEXT, -- "<filename>.sfm.json" review/metadata sidecar
|
||||||
|
device_family TEXT, -- "series3" (MiniMate Plus / BW) | "series4" (Micromate / Thor) — drives per-family UI rendering (units, labels)
|
||||||
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
|
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
|
||||||
UNIQUE(serial, timestamp)
|
UNIQUE(serial, timestamp)
|
||||||
);
|
);
|
||||||
@@ -184,6 +189,63 @@ class SeismoDb:
|
|||||||
""")
|
""")
|
||||||
log.info("_migrate: events table rebuilt OK")
|
log.info("_migrate: events table rebuilt OK")
|
||||||
|
|
||||||
|
# Migration 1b: add Blastware-file columns to existing events tables.
|
||||||
|
# New columns are NULLable so old rows just read NULL.
|
||||||
|
existing_cols = {
|
||||||
|
r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()
|
||||||
|
}
|
||||||
|
for col, ddl in (
|
||||||
|
("blastware_filename", "TEXT"),
|
||||||
|
("blastware_filesize", "INTEGER"),
|
||||||
|
("a5_pickle_filename", "TEXT"),
|
||||||
|
("sidecar_filename", "TEXT"),
|
||||||
|
("device_family", "TEXT"),
|
||||||
|
):
|
||||||
|
if col not in existing_cols:
|
||||||
|
log.info("_migrate: events ADD COLUMN %s %s", col, ddl)
|
||||||
|
conn.execute(f"ALTER TABLE events ADD COLUMN {col} {ddl}")
|
||||||
|
|
||||||
|
# Migration 1c: backfill device_family for existing rows by sniffing
|
||||||
|
# the device-native binary filename's extension. Thor (Micromate
|
||||||
|
# Series IV) writes `.IDFH` / `.IDFW`; MiniMate Plus (Series III)
|
||||||
|
# writes `.AB0*` / `.N00` / `.<base36>` Blastware extensions. We do
|
||||||
|
# this here rather than from sidecars so the migration is fully
|
||||||
|
# self-contained (doesn't need the waveform-store root) and runs at
|
||||||
|
# DB-init time. Only fills NULL device_family so re-runs are no-ops.
|
||||||
|
rebackfill = conn.execute(
|
||||||
|
"SELECT COUNT(*) FROM events WHERE device_family IS NULL"
|
||||||
|
).fetchone()
|
||||||
|
if rebackfill and rebackfill[0] > 0:
|
||||||
|
log.info("_migrate: backfilling device_family for %d events", rebackfill[0])
|
||||||
|
# Series IV (Thor IDF) — extension is exactly .IDFH or .IDFW
|
||||||
|
conn.execute(
|
||||||
|
"""
|
||||||
|
UPDATE events
|
||||||
|
SET device_family = 'series4'
|
||||||
|
WHERE device_family IS NULL
|
||||||
|
AND (
|
||||||
|
UPPER(blastware_filename) LIKE '%.IDFH'
|
||||||
|
OR UPPER(blastware_filename) LIKE '%.IDFW'
|
||||||
|
)
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
# Everything else with a filename → Series III (Blastware family)
|
||||||
|
conn.execute(
|
||||||
|
"""
|
||||||
|
UPDATE events
|
||||||
|
SET device_family = 'series3'
|
||||||
|
WHERE device_family IS NULL
|
||||||
|
AND blastware_filename IS NOT NULL
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
# Rows with no filename (e.g. older monitor_log-derived events)
|
||||||
|
# stay NULL — UI handles NULL as "unknown family".
|
||||||
|
remaining = conn.execute(
|
||||||
|
"SELECT COUNT(*) FROM events WHERE device_family IS NULL"
|
||||||
|
).fetchone()[0]
|
||||||
|
log.info("_migrate: device_family backfill complete (remaining NULL=%d)",
|
||||||
|
remaining)
|
||||||
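Review note: the two UPDATE statements in Migration 1c amount to a three-way classification on the stored filename. The sketch below restates that rule in Python so the edge cases can be checked. `family_from_filename` is a hypothetical helper, not a function in this codebase, and the `.AB0T` sample name is illustrative.

```python
from typing import Optional


def family_from_filename(filename: Optional[str]) -> Optional[str]:
    """Mirror the backfill: IDF extensions are series4, any other name is series3."""
    if filename is None:
        return None  # no file recorded; family stays unknown
    if filename.upper().endswith((".IDFH", ".IDFW")):
        return "series4"  # Thor / Micromate IDF files
    return "series3"  # everything else with a filename


histogram = family_from_filename("UM6047_20250804190047.IDFH")
blastware = family_from_filename("example.AB0T")
unknown = family_from_filename(None)
```

As in the SQL, the order matters: the series4 test has to run first, because the series3 rule matches every non-NULL filename, including an empty string.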
|
|
||||||
# Migration 2: change monitor_log UNIQUE from (serial, waveform_key) to
|
# Migration 2: change monitor_log UNIQUE from (serial, waveform_key) to
|
||||||
# (serial, start_time) — same reasoning as events.
|
# (serial, start_time) — same reasoning as events.
|
||||||
row = conn.execute(
|
row = conn.execute(
|
||||||
@@ -282,12 +344,30 @@ class SeismoDb:
|
|||||||
*,
|
*,
|
||||||
serial: str,
|
serial: str,
|
||||||
session_id: Optional[str] = None,
|
session_id: Optional[str] = None,
|
||||||
|
waveform_records: Optional[dict[str, dict]] = None,
|
||||||
|
device_family: Optional[str] = None,
|
||||||
) -> tuple[int, int]:
|
) -> tuple[int, int]:
|
||||||
"""
|
"""
|
||||||
Insert triggered events. Silently skips duplicates (serial+timestamp).
|
Insert triggered events. Silently skips duplicates (serial+timestamp).
|
||||||
Returns (inserted, skipped).
|
Returns (inserted, skipped).
|
||||||
|
|
||||||
|
``waveform_records`` (optional): dict keyed by event waveform_key (hex)
|
||||||
|
whose value is a record from ``WaveformStore.save()``:
|
||||||
|
{"filename": str, "filesize": int, "a5_pickle_filename": str}
|
||||||
|
|
||||||
|
For events whose key is in this dict, the matching columns are
|
||||||
|
populated. If a row with the same (serial, timestamp) already exists
|
||||||
|
(dedup hit), the matching waveform record is upserted onto the
|
||||||
|
existing row so a re-download via the live endpoint refreshes the
|
||||||
|
file metadata.
|
||||||
|
|
||||||
|
``device_family`` (optional): "series3" (MiniMate Plus / Blastware) or
|
||||||
|
"series4" (Micromate / Thor). Drives per-family UI rendering — most
|
||||||
|
importantly the mic-unit convention (psi vs dB(L)). Set on every
|
||||||
|
insert; on UPSERT a non-None value overwrites the stored one (COALESCE), so the latest writer that supplies a family wins.
|
||||||
"""
|
"""
|
||||||
inserted = skipped = 0
|
inserted = skipped = 0
|
||||||
|
wave_recs = waveform_records or {}
|
||||||
with self._connect() as conn:
|
with self._connect() as conn:
|
||||||
for ev in events:
|
for ev in events:
|
||||||
key = ev._waveform_key.hex() if ev._waveform_key else None
|
key = ev._waveform_key.hex() if ev._waveform_key else None
|
||||||
@@ -307,6 +387,7 @@ class SeismoDb:
|
|||||||
|
|
||||||
pv = ev.peak_values
|
pv = ev.peak_values
|
||||||
pi = ev.project_info
|
pi = ev.project_info
|
||||||
|
rec = wave_recs.get(key) or {}
|
||||||
|
|
||||||
try:
|
try:
|
||||||
conn.execute(
|
conn.execute(
|
||||||
@@ -315,8 +396,11 @@ class SeismoDb:
|
|||||||
(id, serial, waveform_key, session_id, timestamp,
|
(id, serial, waveform_key, session_id, timestamp,
|
||||||
tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv,
|
tran_ppv, vert_ppv, long_ppv, peak_vector_sum, mic_ppv,
|
||||||
project, client, operator, sensor_location,
|
project, client, operator, sensor_location,
|
||||||
sample_rate, record_type)
|
sample_rate, record_type,
|
||||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
blastware_filename, blastware_filesize,
|
||||||
|
a5_pickle_filename, sidecar_filename,
|
||||||
|
device_family)
|
||||||
|
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||||
""",
|
""",
|
||||||
(
|
(
|
||||||
self._new_id(), serial, key, session_id, ts,
|
self._new_id(), serial, key, session_id, ts,
|
||||||
@@ -331,16 +415,89 @@ class SeismoDb:
|
|||||||
pi.sensor_location if pi else None,
|
pi.sensor_location if pi else None,
|
||||||
ev.sample_rate,
|
ev.sample_rate,
|
||||||
ev.record_type,
|
ev.record_type,
|
||||||
|
rec.get("filename"),
|
||||||
|
rec.get("filesize"),
|
||||||
|
rec.get("a5_pickle_filename"),
|
||||||
|
rec.get("sidecar_filename"),
|
||||||
|
device_family,
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
inserted += 1
|
inserted += 1
|
||||||
except sqlite3.IntegrityError:
|
except sqlite3.IntegrityError:
|
||||||
skipped += 1
|
skipped += 1
|
||||||
|
# UPSERT path: a row for this (serial, timestamp) already
|
||||||
|
# exists. Refresh every device-authoritative field from
|
||||||
|
# the new data so that a re-import with better data (e.g.
|
||||||
|
# a watcher re-forward where the previous attempt missed
|
||||||
|
# the paired BW ASCII report) replaces stale peaks /
|
||||||
|
# project info / sample_rate.
|
||||||
|
#
|
||||||
|
# Preserved (not in this UPDATE):
|
||||||
|
# id, waveform_key, session_id, created_at — immutable / FK
|
||||||
|
# false_trigger — operator review state
|
||||||
|
#
|
||||||
|
# Behaviour change vs prior versions: this UPDATE used
|
||||||
|
# to only refresh filename / filesize / a5_pickle /
|
||||||
|
# sidecar fields. As a result, the first insert's
|
||||||
|
# broken-codec peak values were locked in forever even
|
||||||
|
# if subsequent re-forwards arrived with correct
|
||||||
|
# report-derived values. Now every re-import lifts the
|
||||||
|
# DB row up to whatever the latest Event carries.
|
||||||
|
conn.execute(
|
||||||
|
"""
|
||||||
|
UPDATE events
|
||||||
|
SET tran_ppv = ?,
|
||||||
|
vert_ppv = ?,
|
||||||
|
long_ppv = ?,
|
||||||
|
peak_vector_sum = ?,
|
||||||
|
mic_ppv = ?,
|
||||||
|
project = ?,
|
||||||
|
client = ?,
|
||||||
|
operator = ?,
|
||||||
|
sensor_location = ?,
|
||||||
|
sample_rate = ?,
|
||||||
|
record_type = ?,
|
||||||
|
blastware_filename = ?,
|
||||||
|
blastware_filesize = ?,
|
||||||
|
a5_pickle_filename = ?,
|
||||||
|
sidecar_filename = ?,
|
||||||
|
device_family = COALESCE(?, device_family)
|
||||||
|
WHERE serial = ? AND timestamp = ?
|
||||||
|
""",
|
||||||
|
(
|
||||||
|
pv.tran if pv else None,
|
||||||
|
pv.vert if pv else None,
|
||||||
|
pv.long if pv else None,
|
||||||
|
pv.peak_vector_sum if pv else None,
|
||||||
|
pv.micl if pv else None,
|
||||||
|
pi.project if pi else None,
|
||||||
|
pi.client if pi else None,
|
||||||
|
pi.operator if pi else None,
|
||||||
|
pi.sensor_location if pi else None,
|
||||||
|
ev.sample_rate,
|
||||||
|
ev.record_type,
|
||||||
|
rec.get("filename") if rec else None,
|
||||||
|
rec.get("filesize") if rec else None,
|
||||||
|
rec.get("a5_pickle_filename") if rec else None,
|
||||||
|
rec.get("sidecar_filename") if rec else None,
|
||||||
|
device_family,
|
||||||
|
serial,
|
||||||
|
ts,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
log.debug("insert_events serial=%s inserted=%d skipped=%d",
|
log.debug("insert_events serial=%s inserted=%d skipped=%d",
|
||||||
serial, inserted, skipped)
|
serial, inserted, skipped)
|
||||||
return inserted, skipped
|
return inserted, skipped
|
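Review note: the insert-then-refresh flow in this hunk can be tried in isolation. Below is a minimal sketch, assuming a trimmed-down `events` table with a `UNIQUE (serial, timestamp)` constraint. The schema, the single `peak` column, and the helper name `upsert_event` are hypothetical stand-ins, not the project's actual definitions.

```python
import sqlite3

# Hypothetical, trimmed-down schema; the real `events` table has more columns.
SCHEMA = """
CREATE TABLE events (
    serial        TEXT NOT NULL,
    timestamp     TEXT NOT NULL,
    peak          REAL,
    device_family TEXT,
    false_trigger INTEGER NOT NULL DEFAULT 0,
    UNIQUE (serial, timestamp)
)
"""


def upsert_event(conn, serial, ts, peak, device_family=None):
    """Insert a row; on a (serial, timestamp) collision, refresh it instead.

    Returns "inserted" or "updated" (hypothetical return values).
    """
    try:
        conn.execute(
            "INSERT INTO events (serial, timestamp, peak, device_family) "
            "VALUES (?, ?, ?, ?)",
            (serial, ts, peak, device_family),
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # The row already exists: overwrite device-derived fields only.
        # false_trigger is absent from the SET list, so it is preserved.
        # COALESCE keeps the stored device_family when the new value is NULL.
        conn.execute(
            """
            UPDATE events
               SET peak = ?,
                   device_family = COALESCE(?, device_family)
             WHERE serial = ? AND timestamp = ?
            """,
            (peak, device_family, serial, ts),
        )
        return "updated"
```

With this sketch, a second call for the same serial and timestamp replaces `peak`, leaves an operator-set `false_trigger` alone, and keeps the stored `device_family` when the re-import passes `None`.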
||||||
|
|
||||||
|
def get_event(self, event_id: str) -> Optional[dict]:
|
||||||
|
"""Return one event row by id, or None."""
|
||||||
|
with self._connect() as conn:
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT * FROM events WHERE id = ?", (event_id,),
|
||||||
|
).fetchone()
|
||||||
|
return dict(row) if row else None
|
||||||
|
|
||||||
def query_events(
|
def query_events(
|
||||||
self,
|
self,
|
||||||
serial: Optional[str] = None,
|
serial: Optional[str] = None,
|
||||||
@@ -387,6 +544,105 @@ class SeismoDb:
|
|||||||
)
|
)
|
||||||
return cur.rowcount > 0
|
return cur.rowcount > 0
|
||||||
|
|
||||||
|
def delete_event(self, event_id: str) -> Optional[dict]:
|
||||||
|
"""
|
||||||
|
Hard-delete one event row by id. Returns the deleted row (so the
|
||||||
|
caller can clean up any on-disk files referenced by it) or None
|
||||||
|
if no row matched.
|
||||||
|
"""
|
||||||
|
with self._connect() as conn:
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT * FROM events WHERE id = ?", (event_id,),
|
||||||
|
).fetchone()
|
||||||
|
if row is None:
|
||||||
|
return None
|
||||||
|
conn.execute("DELETE FROM events WHERE id = ?", (event_id,))
|
||||||
|
return dict(row)
|
||||||
|
|
||||||
|
def delete_events_bulk(
|
||||||
|
self,
|
||||||
|
serial: Optional[str] = None,
|
||||||
|
from_dt: Optional[datetime.datetime] = None,
|
||||||
|
to_dt: Optional[datetime.datetime] = None,
|
||||||
|
false_trigger: Optional[bool] = None,
|
||||||
|
ids: Optional[list[str]] = None,
|
||||||
|
) -> list[dict]:
|
||||||
|
"""
|
||||||
|
Hard-delete events matching the given filters. Returns the list
|
||||||
|
of deleted row dicts. Refuses to delete with no filters at all
|
||||||
|
(would wipe the whole table) — raises ValueError.
|
||||||
|
|
||||||
|
Filter semantics match query_events: serial / from_dt / to_dt /
|
||||||
|
false_trigger combine with AND. `ids` is an additional inclusion
|
||||||
|
list (event_id IN (...)); if supplied alongside other filters,
|
||||||
|
only rows matching all conditions are deleted.
|
||||||
|
"""
|
||||||
|
clauses: list[str] = []
|
||||||
|
params: list = []
|
||||||
|
|
||||||
|
if serial:
|
||||||
|
clauses.append("serial = ?")
|
||||||
|
params.append(serial)
|
||||||
|
if from_dt:
|
||||||
|
clauses.append("timestamp >= ?")
|
||||||
|
params.append(from_dt.isoformat())
|
||||||
|
if to_dt:
|
||||||
|
clauses.append("timestamp <= ?")
|
||||||
|
params.append(to_dt.isoformat())
|
||||||
|
if false_trigger is not None:
|
||||||
|
clauses.append("false_trigger = ?")
|
||||||
|
params.append(1 if false_trigger else 0)
|
||||||
|
if ids:
|
||||||
|
placeholders = ",".join("?" * len(ids))
|
||||||
|
clauses.append(f"id IN ({placeholders})")
|
||||||
|
params.extend(ids)
|
||||||
|
|
||||||
|
if not clauses:
|
||||||
|
raise ValueError(
|
||||||
|
"delete_events_bulk refuses to delete with no filters "
|
||||||
|
"(would wipe the entire events table)"
|
||||||
|
)
|
||||||
|
|
||||||
|
where = "WHERE " + " AND ".join(clauses)
|
||||||
|
|
||||||
|
with self._connect() as conn:
|
||||||
|
rows = conn.execute(
|
||||||
|
f"SELECT * FROM events {where}", params,
|
||||||
|
).fetchall()
|
||||||
|
if rows:
|
||||||
|
conn.execute(f"DELETE FROM events {where}", params)
|
||||||
|
return [dict(r) for r in rows]
|
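Review note: the filter handling in `delete_events_bulk` can be exercised without a database. The sketch below pulls the clause building into a standalone helper. The name `build_where` is hypothetical; the clause strings mirror the ones in the diff.

```python
import datetime
from typing import Optional


def build_where(
    serial: Optional[str] = None,
    from_dt: Optional[datetime.datetime] = None,
    to_dt: Optional[datetime.datetime] = None,
    false_trigger: Optional[bool] = None,
    ids: Optional[list] = None,
):
    """Return (where_sql, params); raise ValueError when no filter is given."""
    clauses, params = [], []
    if serial:
        clauses.append("serial = ?")
        params.append(serial)
    if from_dt:
        clauses.append("timestamp >= ?")
        params.append(from_dt.isoformat())
    if to_dt:
        clauses.append("timestamp <= ?")
        params.append(to_dt.isoformat())
    if false_trigger is not None:
        # False is a valid filter value, hence the explicit None check.
        clauses.append("false_trigger = ?")
        params.append(1 if false_trigger else 0)
    if ids:
        # One "?" per id keeps every value parameterised.
        placeholders = ",".join("?" * len(ids))
        clauses.append(f"id IN ({placeholders})")
        params.extend(ids)
    if not clauses:
        raise ValueError("refusing to build an unfiltered DELETE")
    return "WHERE " + " AND ".join(clauses), params
```

One consequence of the truthiness checks: an empty `ids` list counts as "no filter", so a call with only `ids=[]` raises `ValueError` rather than deleting nothing.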
||||||
|
|
||||||
|
def update_event_review(self, event_id: str, review: dict) -> bool:
|
||||||
|
"""
|
||||||
|
Sync derived index columns from a sidecar's `review` block.
|
||||||
|
|
||||||
|
Currently the only derived index is `events.false_trigger` — kept
|
||||||
|
in sync so `/db/events?false_trigger=true` queries don't have to
|
||||||
|
scan every sidecar JSON on disk. The sidecar JSON itself remains
|
||||||
|
the source of truth for the full review state.
|
||||||
|
|
||||||
|
Returns True when the row exists, False otherwise. No-op fields
|
||||||
|
(review without `false_trigger`) leave the column untouched.
|
||||||
|
"""
|
||||||
|
if not isinstance(review, dict):
|
||||||
|
return False
|
||||||
|
if "false_trigger" not in review:
|
||||||
|
# Nothing derived to update; just confirm the row exists.
|
||||||
|
with self._connect() as conn:
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT 1 FROM events WHERE id=?", (event_id,),
|
||||||
|
).fetchone()
|
||||||
|
return row is not None
|
||||||
|
|
||||||
|
flag = 1 if review.get("false_trigger") else 0
|
||||||
|
with self._connect() as conn:
|
||||||
|
cur = conn.execute(
|
||||||
|
"UPDATE events SET false_trigger=? WHERE id=?",
|
||||||
|
(flag, event_id),
|
||||||
|
)
|
||||||
|
return cur.rowcount > 0
|
||||||
|
|
||||||
# ── Monitor log ───────────────────────────────────────────────────────────
|
# ── Monitor log ───────────────────────────────────────────────────────────
|
||||||
|
|
||||||
def insert_monitor_log(
|
def insert_monitor_log(
|
||||||
@@ -466,21 +722,79 @@ class SeismoDb:
|
|||||||
|
|
||||||
def query_units(self) -> list[dict]:
|
def query_units(self) -> list[dict]:
|
||||||
"""
|
"""
|
||||||
Return one row per known serial with summary stats:
|
Return one row per known serial with summary stats.
|
||||||
last_seen, total_events, total_monitor_entries.
|
|
||||||
|
Aggregates from BOTH source tables:
|
||||||
|
- `events` — populated by every ingest path
|
||||||
|
(live ACH, /db/import/blastware_file
|
||||||
|
from the series3-watcher forwarder, etc.)
|
||||||
|
- `ach_sessions` — only populated by the live ACH server;
|
||||||
|
empty for events that came in via the
|
||||||
|
BW-importer route.
|
||||||
|
|
||||||
|
Earlier this method only joined on `ach_sessions`, which made
|
||||||
|
watcher-forwarded units invisible to the SFM webapp's fleet
|
||||||
|
overview even though their events were correctly populated in
|
||||||
|
`events`. Now we union the two and surface every serial that
|
||||||
|
has activity in either table.
|
||||||
|
|
||||||
|
Fields:
|
||||||
|
serial — unit serial number (e.g. "BE11529")
|
||||||
|
last_seen — most recent of MAX(events.timestamp)
|
||||||
|
and MAX(ach_sessions.session_time)
|
||||||
|
total_events — COUNT(*) from `events` (the
|
||||||
|
authoritative count regardless of
|
||||||
|
ingest path)
|
||||||
|
total_monitor_entries — from `ach_sessions`, 0 when absent
|
||||||
|
total_sessions — COUNT(*) from `ach_sessions`, 0 when absent
|
||||||
"""
|
"""
|
||||||
with self._connect() as conn:
|
with self._connect() as conn:
|
||||||
rows = conn.execute(
|
event_stats = {
|
||||||
"""
|
row["serial"]: row
|
||||||
SELECT
|
for row in conn.execute(
|
||||||
s.serial,
|
"""
|
||||||
MAX(s.session_time) AS last_seen,
|
SELECT serial,
|
||||||
SUM(s.events_downloaded) AS total_events,
|
MAX(timestamp) AS last_event_at,
|
||||||
SUM(s.monitor_entries) AS total_monitor_entries,
|
COUNT(*) AS total_events
|
||||||
COUNT(*) AS total_sessions
|
FROM events
|
||||||
FROM ach_sessions s
|
GROUP BY serial
|
||||||
GROUP BY s.serial
|
""",
|
||||||
ORDER BY last_seen DESC
|
).fetchall()
|
||||||
"""
|
}
|
||||||
).fetchall()
|
session_stats = {
|
||||||
return [dict(r) for r in rows]
|
row["serial"]: row
|
||||||
|
for row in conn.execute(
|
||||||
|
"""
|
||||||
|
SELECT serial,
|
||||||
|
MAX(session_time) AS last_session_at,
|
||||||
|
SUM(monitor_entries) AS total_monitor_entries,
|
||||||
|
COUNT(*) AS total_sessions
|
||||||
|
FROM ach_sessions
|
||||||
|
GROUP BY serial
|
||||||
|
""",
|
||||||
|
).fetchall()
|
||||||
|
}
|
||||||
|
|
||||||
|
all_serials = set(event_stats) | set(session_stats)
|
||||||
|
units = []
|
||||||
|
for serial in all_serials:
|
||||||
|
e = event_stats.get(serial)
|
||||||
|
s = session_stats.get(serial)
|
||||||
|
last_event_at = e["last_event_at"] if e else None
|
||||||
|
last_session_at = s["last_session_at"] if s else None
|
||||||
|
# Prefer whichever timestamp is more recent
|
||||||
|
last_seen = max(
|
||||||
|
(t for t in (last_event_at, last_session_at) if t),
|
||||||
|
default=None,
|
||||||
|
)
|
||||||
|
units.append({
|
||||||
|
"serial": serial,
|
||||||
|
"last_seen": last_seen,
|
||||||
|
"total_events": e["total_events"] if e else 0,
|
||||||
|
"total_monitor_entries": s["total_monitor_entries"] if s else 0,
|
||||||
|
"total_sessions": s["total_sessions"] if s else 0,
|
||||||
|
})
|
||||||
|
|
||||||
|
# Sort by last_seen desc; serials with no timestamp at all sink to the bottom.
|
||||||
|
units.sort(key=lambda u: u.get("last_seen") or "", reverse=True)
|
||||||
|
return units
|
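Review note: the merge step in the new `query_units` can be checked without SQLite. The sketch below uses plain dicts in place of the two grouped query results. The function name `merge_unit_stats` is hypothetical, and it assumes both timestamp columns hold ISO-8601 text in one consistent format, so that string comparison matches chronological order.

```python
def merge_unit_stats(event_stats, session_stats):
    """Combine per-serial aggregates from two sources into one row per serial."""
    units = []
    for serial in set(event_stats) | set(session_stats):
        e = event_stats.get(serial)
        s = session_stats.get(serial)
        candidates = (
            e["last_event_at"] if e else None,
            s["last_session_at"] if s else None,
        )
        # Most recent non-empty timestamp, or None when neither source has one.
        last_seen = max((t for t in candidates if t), default=None)
        units.append({
            "serial": serial,
            "last_seen": last_seen,
            "total_events": e["total_events"] if e else 0,
            "total_monitor_entries": s["total_monitor_entries"] if s else 0,
            "total_sessions": s["total_sessions"] if s else 0,
        })
    # Newest first; rows without any timestamp sort as "" and end up last.
    units.sort(key=lambda u: u["last_seen"] or "", reverse=True)
    return units
```

A serial that appears only in `events` (the watcher-forwarded case described in the docstring) still gets a row, with its session counters reported as 0.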
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff.