The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:
SP0 seg 0: best fit Vert 508/508 ✓
SP0 seg 1: best fit Long 508/508 ✓
SP0 seg 3: best fit Tran 508/508 ✓ (Tran continuation)
SP0 seg 5: best fit Long 508/508 ✓
SP0 seg 9: best fit Long 508/508 ✓
V70 seg 0: best fit Vert 508/508 ✓
V70 seg 1: best fit Long 508/508 ✓
Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.
Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor
the NEW segment's channel (2 samples as int16 BE in 16-count units), AND
bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as
int16 BE). This is the same "2 anchors + delta stream" structure as the
body preamble for Tran.
decode_waveform_v2 now returns full per-channel sample dicts.
Byte-exact verified ranges:
V70: Tran 512, Vert 512, Long 512 (all first segments)
JQ0: Tran 512, Vert 258
SP0: Long 1536 (all 3 L segments)
Still open: the 30 NN block format (high-amplitude packed deltas) —
appears mid-segment when single-byte deltas can't carry the magnitude.
6 new tests bring the count to 46. All passing.
User uploaded a Vert-heavy event (JQ0) and a Mic-heavy event (V70).
Those two were exactly what was needed to crack the next piece:
- 00 NN block = run-length-encoded zero deltas in the current channel.
Append NN copies of the current cumulative value (no change).
- find_data_start now recognizes 00 NN as a valid first tag (some events
begin with a leading 00 NN RLE block).
- decode_tran_initial now decodes the FULL segment 0 (not just the first
data block).
Results across 5 fixture events:
- M529LL1A.SP0 (loud-all-channels) : 510 / 510 ✓
- M529LL1L.JQ0 (Vert-heavy) : 510 / 510 ✓
- M529LL1L.V70 (Mic-heavy) : 510 / 510 ✓
- M529LL1A.SV0 (loud-from-start) : 58 / 58 ✓
- M529LL1A.SS0 (loud-from-start) : 42 / 502 (stops at first 30 04)
The 30 04 block (only seen in loud-from-start events) hasn't been
decoded yet — likely a channel-switch marker for the high-amplitude
regime.
Also discovered: segment header (40 02) payload bytes [0:2] = T_delta
at first sample of new segment, [6:8] = byte length to next segment.
Multi-segment Tran decoding still diverges after sample 512 because
the per-segment channel ordering after the header is unknown.
Tests: 40 pass (up from 36).
Files:
- minimateplus/waveform_codec.py: find_data_start fix, RLE handling,
full segment-0 decode in decode_tran_initial
- tests/test_waveform_codec.py: synthetic RLE test, full segment 0
tests for JQ0 and V70
- tests/fixtures/5-11-26/: M529LL1L.JQ0, M529LL1L.V70 + TXT exports
- docs/instantel_protocol_reference.md §7.6.1: RLE + segment-header docs
User uploaded 3 high-amplitude events (PPV 6-7 in/s — shook the geophone
hard) to decode-re/5-11-26/. These cracked the Tran codec:
- Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE
in 16-count units (LSB = 0.005 in/s). Confirmed across all 7
fixtures.
- First data block carries Tran deltas from sample 2 onward:
* 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed
nibble deltas (high nibble first)
* 20 NN block: NN int8 signed deltas
Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors
against BW's ASCII export.
Why the earlier 96-combination brute force failed: the quiet 5-8
events all had T[0] = T[1] ≈ 0 so the preamble's per-channel encoding
was undetectable. Loud events made the encoding obvious.
What's solved:
- minimateplus.waveform_codec.decode_tran_initial: returns first
N Tran samples in 16-count units for any body.
- Walker length formula for in-data 30 NN blocks (NN*2 instead of NN*4).
- Walker now handles bodies that start with 20 NN (in addition to 10 NN).
What's still open:
- Tran past the first data block (multi-block channel switching).
- Vert / Long / MicL channel encodings.
- Walker correctness past offset ~427 in event-b.
Tests: 36 pass. decode_waveform_v2 still returns None — the full
multi-channel decoder is not wired up. decode_tran_initial is the
new verified entry point.
Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py
(adds 5-11-26 fixtures + decode_tran_initial tests), and
docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).
Decoded the structural framing of the Blastware waveform body — the bytes
between the 21-byte STRT record and the 26-byte file footer. The body is
a sequence of tagged variable-length blocks, NOT raw int16 LE. Five tag
types (10/20/00/30/40 NN) and their lengths are now confirmed against the
4-event May 2026 fixture bundle. Body splits cleanly into ~16 segments
(for a 1280-sample event) separated by 40 02 segment headers carrying a
monotonically incrementing uint32 LE counter at bytes [8:12].
What's done:
- minimateplus/waveform_codec.py — block walker, segment splitter, segment
header parser. decode_waveform_v2 is a stub returning None until the
byte-to-sample mapping is solved; client.py is unchanged.
- tests/test_waveform_codec.py — 31 tests covering block detection, lengths,
contiguous-walk, segment splitting, segment-header parsing, and counter
monotonicity. All pass.
- tests/fixtures/decode-re-5-8-26/ — bundled fixtures (4 events, BW binary
+ Blastware ASCII export each).
- docs/instantel_protocol_reference.md §7.6.1 — replaced retraction box
with the verified structural decoding plus an explicit list of what's
still open.
What's still open: the per-byte mapping inside 10 NN / 20 NN blocks. 96
channel-permutation × nibble-order × sign-convention combinations were
brute-force tested; none match BW's ASCII export to within ±1 ADC count.
The codec is more elaborate than uniform 4-bit deltas — likely a hybrid
variable-bit-width scheme with segment-anchor resync points. Next
recommended step: capture an event with a known calibration tone to pin
down magnitude scaling.
Walker also bails out partway through event-b (open issue documented in
both the module and the protocol reference).
The /db/import/blastware_file endpoint was bucketing every
forwarded event into serial='UNKNOWN' in the DB. WaveformStore
correctly decoded the serial from the BW filename and saved
files to <store>/<serial>/<filename> (e.g.
.../BE17353/S353L5KC.DR0H.h5), but the endpoint code called
db.insert_events(serial=_serial_from_event(ev)) — and
_serial_from_event was a stub that always returned None,
falling back to "UNKNOWN".
Effect on the user's prod server: 3,039 events forwarded across
24 distinct units, ALL inserted under serial='UNKNOWN'. The
on-disk waveform store + sidecars + HDF5s were fine, but the
SFM webapp's /db/units only showed the two original manually-
uploaded serials because every forwarded row had its serial
column zeroed to UNKNOWN.
Fix:
- WaveformStore.save_imported_bw() now surfaces the decoded
serial on the returned `rec` dict (rec["serial"]).
- The import endpoint uses rec["serial"] as the authoritative
fallback when the operator hasn't supplied a serial_hint query
parameter. Order of precedence:
query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN"
- Response payload now includes `serial` per file so the watcher
log lines (or any future caller) can see which unit each event
was attributed to.
Recovery for existing DB rows:
scripts/repair_unknown_serials.py walks the events table looking
for rows with serial='UNKNOWN' and re-attributes each one to the
serial decoded from blastware_filename. Updates the row in place
unless the target (serial, timestamp) already has a row, in which
case the UNKNOWN duplicate is deleted. Idempotent. Default
dry-run; pass --apply to commit.
Verified on the user's actual DB (dry-run):
UNKNOWN rows scanned: 3039
Updated to real serial: 2602
Deleted (duplicate of an
already-correct row): 437
Unresolved (bad filename): 0
After running the repair, /db/units will show all 24 units
correctly populated.
The four operator-supplied note fields in BW's Compliance Setup →
Notes tab (Project / Client / User Name / Seis Loc) have
USER-EDITABLE LABELS — an operator can rename them in BW's UI to
"Building:", "Site Address:", "Inspector:", or anything else, and
the ASCII export writes those literal labels verbatim. The
previous label-normalisation map approach (just added in commit
6a7e8c6) was fragile: it could only match label spellings we'd
enumerated in advance. An operator using "Site:" instead of
"Seis Loc:" would have their sensor location silently dropped.
What IS reliable: BW always writes the 4 user-notes lines
contiguously, in the same order, between the "Units :" line and
the "Geo Range :" line of the export. So parse them by POSITION:
position 1 → project
position 2 → client
position 3 → operator
position 4 → sensor_location
The original labels BW wrote are preserved in a new
`BwAsciiReport.user_note_labels` dict (canonical slot → literal
label string) so terra-view can render them as the operator named
them.
Removes the `_OPERATOR_LABEL_MAP` / `_normalise_label_for_lookup`
helpers and the elif-by-normalised-label branch in `parse_report`.
Replaces with a small state machine that flips on the "Units" line
and flips off on the "Geo Range" line.
Tests:
- Default-label fixtures (waveform + histogram) still populate
correctly, with operator's labels captured.
- Synthetic custom-labelled exports ("Building:" / "Site Address:" /
etc.) populate the right slots by position.
- Histogram-specific "Seis. Location:" works.
- Lines outside the Units→Geo Range range are ignored even if
they look like user notes (defensive against malformed exports).
- Partial blocks (fewer than 4 lines) leave later slots None.
- Extra lines beyond 4 are dropped (5th slot doesn't exist).
26 tests in test_bw_ascii_report.py (was 33; net drop reflects
parametrised label tests collapsed into 6 focused position tests).
Full SFM suite: 62 passed, 44 skipped.
Pairs with series3-watcher v1.5.0 which fixes the filename pairing
so the report reaches this parser in the first place.
Blastware writes the operator-supplied fields with different label
spellings across firmware versions and recording modes — most
notably "Seis. Location" on histogram exports vs "Seis Loc:" on
waveform exports. Previous parser only matched the latter, so
every histogram event silently lost its sensor_location field.
Replace the four hardcoded `key.rstrip(":") == "X"` branches with
a single `_OPERATOR_LABEL_MAP` dispatch table keyed by normalised
label (lowercase, trailing colon/period stripped, internal
whitespace collapsed). Adds these variants on day 1:
project: "Project:" / "Project"
client: "Client:" / "Client"
operator: "User Name:" / "User Name"
sensor_location: "Seis Loc:" / "Seis. Location" / "Seis Location"
/ "Sensor Location" / "Seis Loc"
To absorb future BW label drift, add a one-line dict entry — no
new elif branch.
14 new tests cover:
- Each label variant routes to the correct field (parametrised)
- Case-insensitive matching ("seis loc" / "SEIS LOC" / "SeIs LoC")
- Whitespace-collapse ("Seis Loc" with double-space)
- End-to-end parse of a real histogram fixture from
example-events/histogram/ — sensor_location ('Loc #1 - 2652 Hepner...')
populates correctly even though the file uses "Seis. Location"
Total bw_ascii_report tests: 19 → 33. Full SFM suite still green
(69 passed, 44 skipped — pre-existing skips for h5py-dep tests).
Pairs with series3-watcher v1.5.4 (which fixes the filename pairing
so histograms actually reach this parser in the first place).
Blastware's ACH writes a per-event ASCII report (.TXT) alongside each
event binary, containing the rich derived per-channel fields BW
computes (PPV, ZC Freq, Time of Peak, Peak Acceleration, Peak
Displacement, Peak Vector Sum + time, sensor self-check Pass/Fail,
monitor-log timestamps). None of this lives in the BW binary itself.
When the watcher daemon forwards both files to /db/import/blastware_file
in one multipart POST, we now:
- Pair binaries with their .TXT partners by filename match
- Parse the report into a structured BwAsciiReport
- Land the rich fields in a new top-level `bw_report` block of the
sidecar JSON
- Overlay the report's peaks/project_info/timestamp/sample_rate/
record_time/total_samples/pretrig_samples onto the canonical
sidecar fields (the report values are device-authoritative; the
BW-binary STRT-derived values had bugs like reading the 0x46
record-type marker as rectime)
This unblocks the monthly-summary review workflow — events become
sortable/filterable by peak, location, project, etc. — without
depending on the still-undecoded waveform body codec.
### Added
- **Layered event storage architecture.** Each event now lands as four
files in the per-serial waveform store, each with a clear role:
- `<filename>` — the Blastware-readable binary (BW file). Untouched.
- `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
- `<filename>.h5` — clean per-channel waveform arrays in physical
units (in/s for geo, psi for mic) plus event metadata (HDF5 with
gzip compression). This is the canonical format for downstream
analysis tools.
- `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
project, source provenance, review state, extensions).
SQLite (`seismo_relay.db`) is the searchable index over all four.
- **Plot-ready waveform JSON (`sfm.plot.v1`).** The `/device/event/{idx}/waveform`
and `/db/events/{id}/waveform.json` endpoints now return samples in
physical units with explicit time-axis metadata, peak markers, and
per-channel unit hints — no more guessing the ADC-to-velocity scale
client-side. The webapp waveform viewer was rewritten to consume
this shape.
- **In-app waveform viewer accuracy fix.** The standalone SFM webapp
viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
(≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
*in/s per V* hardware constant — not the ADC-counts-to-velocity
factor. This silently scaled every plot ~38% too low for Normal-range
geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
Sensitive). Conversion is now done server-side using the geo_range
from compliance config; the client just plots.
- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
`read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.
### Dependencies
- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
`/db/import/blastware_file` endpoint introduced in this release).
- Added `waveform_key` and `event_timestamp` columns to `CachedEvent` and `CachedWaveform` for integrity verification.
- Implemented logic to flush the cache when a mismatch in (waveform_key, event_timestamp) is detected during event and waveform updates.
- Enhanced `set_events` and `set_waveform` methods to check for mismatches and trigger cache eviction as necessary.
- Introduced a new `LiveCache` class to manage in-memory caching of live device data, separating it from the server logic for better testability.
- Added tests to verify the correctness of cache invalidation logic, particularly for post-erase key reuse scenarios.
- Updated web application to include a "Force refresh" toggle, allowing users to bypass the cache and re-fetch data from the device.
test: add regression tests for v0.14.x SUB 5A protocol fixes
refactor(logging): change warning logs to debug for less verbosity in write_blastware_file