seismo-relay

serversdown/seismo-relay

Commit Graph

Author	SHA1	Message	Date

serversdown

d506ebc103

histogram_codec: peak count is uint8 (not uint16 LE); properly cracks the BE9558 / BE18003 extension-byte case

The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still
unclear — empirically non-zero on intervals with sub-Hz or unmeasurable
freq), NOT the high byte of the peak count.  The N844 fixture corpus
the original RE was done against had zero values in those bytes for
every block, so uint8 and uint16 LE were equivalent there — but on
real BE9558 Tran-drift events and BE18003 Histogram+Continuous events
the uint16 LE interpretation produced peaks up to 268 in/s and 35×
inflated PVS sums.

Cross-correlated against BW's per-interval ASCII export on:
  - K558LKZU/LL1P/LL3K  → 100% T/V/L/M peak match (1435 blocks each)
  - T003LKZR/LL0O/LL1M  → 100% T/V/L, 99.3% M (0.05 dB rounding only)
  - N599LKZS/LL0L        → 100% all channels
  - N844 fixture corpus  → 100% all channels (unchanged)

Annotations preserved on every record for future RE; the defensive
_MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s,
well below any physical limit).
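
To make the uint8 versus uint16 LE difference concrete, here is a minimal Python sketch. It assumes each peak count sits in the byte immediately before its annotation byte (offsets 6/10/14/18, inferred from the annotation offsets [7]/[11]/[15]/[19] above) and uses the 0.005 in/s per count scale quoted for Normal range. The function name `read_peaks` and the sample block are hypothetical and are not taken from histogram_codec.py.

```python
import struct

IN_PER_S_PER_COUNT = 0.005  # Normal-range geophone scale quoted in the codec notes

# Assumed offsets: each peak byte directly precedes its annotation byte.
PEAK_OFFSETS = {"Tran": 6, "Vert": 10, "Long": 14, "MicL": 18}


def read_peaks(block: bytes) -> dict:
    """Return, per channel, the uint8 count, the annotation byte,
    and the value the old uint16 LE reading would have produced."""
    out = {}
    for name, off in PEAK_OFFSETS.items():
        count = block[off]            # uint8 peak count (current interpretation)
        annotation = block[off + 1]   # preserved as-is; meaning still unknown
        (legacy,) = struct.unpack_from("<H", block, off)  # old uint16 LE reading
        out[name] = {"count": count, "annotation": annotation, "legacy_u16": legacy}
    return out


# Hypothetical 32-byte block: Tran count 12 with a non-zero annotation byte 0xD1.
block = bytearray(32)
block[6], block[7] = 12, 0xD1
tran = read_peaks(bytes(block))["Tran"]

print(tran["count"] * IN_PER_S_PER_COUNT)       # uint8 reading, in/s
print(tran["legacy_u16"] * IN_PER_S_PER_COUNT)  # inflated uint16 LE reading, in/s
print(255 * IN_PER_S_PER_COUNT)                 # uint8 ceiling, in/s
```

With a zero annotation byte the two readings coincide, which is why the N844 fixtures could not tell them apart.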

Synthetic regression test added using the verbatim K558LKZU.RE0H
interval-12 block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-21 06:05:19 +00:00

serversdown

7183b953e4

minimateplus: histogram body codec — FULLY DECODED

The histogram-mode event body is now byte-exact decodable.
Companion to the waveform body codec — together they cover every
event file the watcher forwards.  Cracked in one session via
cross-event correlation against BW's ASCII export.

The §7.6.2 spec in instantel_protocol_reference.md was structurally
correct (32-byte blocks) but the per-sample semantics were
under-documented.  Cross-checking block 130 of N844L6Z8.ZR0H
against its TXT row revealed the layout perfectly:

  slot[0] = 10 (constant marker)
  slot[1] = T_peak_count    (× 0.005 → in/s at Normal range)
  slot[2] = T_halfperiod    (freq_Hz = 512 / halfp)
  slot[3] = V_peak_count
  slot[4] = V_halfperiod
  slot[5] = L_peak_count
  slot[6] = L_halfperiod
  slot[7] = MicL_peak_count (dB via waveform_codec.mic_count_to_db)
  slot[8] = MicL_halfperiod

The `>100 Hz` sentinel is halfperiod ≤ 5 (since 512/5 = 102.4 Hz, the first value above 100 Hz).
Mic dB uses the SAME formula as the waveform codec (sign × (81.94
+ 20·log10(|count|))) — they share the mic ADC calibration constant.
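
The slot conversions above can be sketched in a few lines of Python. The formulas (× 0.005, 512 / halfperiod, and sign × (81.94 + 20·log10(|count|))) come from this message; the function names, the handling of a zero half-period, and the handling of a zero mic count are assumptions made for illustration and may differ from the real helpers in histogram_codec.py and waveform_codec.py.

```python
import math
from typing import Optional

GEO_IN_PER_S_PER_COUNT = 0.005  # Normal range, per the slot table
MIC_DB_OFFSET = 81.94           # shared mic ADC calibration constant
SENTINEL_HALF_PERIOD = 5        # half-period <= 5 is reported as ">100 Hz"


def geo_in_per_s(count: int) -> float:
    """Convert a geophone peak count to in/s at Normal range."""
    return count * GEO_IN_PER_S_PER_COUNT


def freq_hz(half_period: int) -> Optional[float]:
    """Return Hz, math.inf for the >100 Hz sentinel, None when unmeasurable."""
    if half_period <= 0:
        return None  # assumption: zero means no measurable frequency
    if half_period <= SENTINEL_HALF_PERIOD:
        return math.inf  # stand-in for the ">100 Hz" label
    return 512 / half_period


def mic_db(count: int) -> Optional[float]:
    """Apply sign * (81.94 + 20*log10(|count|)) to a mic peak count."""
    if count == 0:
        return None  # assumption: log10(0) is undefined, so report nothing
    sign = 1 if count > 0 else -1
    return sign * (MIC_DB_OFFSET + 20 * math.log10(abs(count)))


print(geo_in_per_s(10), freq_hz(8), freq_hz(5), mic_db(10))
```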

Block identification anchor: bytes [22:24] == 0x0000 AND
bytes [28:32] == 1e 0a 00 00.  The tail signature is the most
reliable distinguisher from non-block content in the file.
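
As an illustration of the anchor check, the sketch below tests a 32-byte window for the two signature fields and scans a body for matches. The byte positions and values are the ones stated above; the byte-by-byte scan and the names `is_histogram_block` and `find_blocks` are assumptions, and the real walk_body may locate blocks differently.

```python
BLOCK_SIZE = 32
TAIL_SIGNATURE = bytes.fromhex("1e0a0000")  # bytes [28:32]


def is_histogram_block(block: bytes) -> bool:
    """True when the window carries both signature fields."""
    return (
        len(block) == BLOCK_SIZE
        and block[22:24] == b"\x00\x00"
        and block[28:32] == TAIL_SIGNATURE
    )


def find_blocks(body: bytes) -> list:
    """Return start offsets of candidate blocks found in body."""
    offsets = []
    pos = 0
    while pos + BLOCK_SIZE <= len(body):
        if is_histogram_block(body[pos:pos + BLOCK_SIZE]):
            offsets.append(pos)
            pos += BLOCK_SIZE  # skip past the matched block
        else:
            pos += 1           # assumption: blocks need not be 32-byte aligned
    return offsets


# Hypothetical body: three junk bytes followed by two minimal blocks.
fake_block = bytes(28) + TAIL_SIGNATURE
print(find_blocks(b"\xff" * 3 + fake_block * 2))
```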

Files:

  minimateplus/histogram_codec.py (new) — decoder + public API
    matching the waveform codec's shape:
      walk_body(body) -> records
      decode_histogram_body(body) -> {Tran, Vert, Long, MicL}
      decode_histogram_body_full(body) -> [per-interval dicts]
      half_period_to_hz, geo_count_to_ins helpers

  minimateplus/event_file_io.py (modified) — read_blastware_file
    now tries the waveform codec first, falls back to the histogram
    codec on failure.  Same output shape, same downstream pipeline.

  tests/test_histogram_codec.py (new) — 24 regression locks against
    the in-repo fixture corpus, byte-exact against BW ASCII export
    for peaks (all 4 channels), frequencies (all 4 channels,
    including >100 Hz sentinel handling), block framing, and
    segment-ID accounting.

  scripts/backfill_sidecars.py (modified) — the has_samples
    short-circuit added in the histogram-pending era is now a
    pure defensive guard.  Histograms in prod will regen .h5 files
    correctly on the next backfill run.

  docs/histogram_codec_re_status.md (updated) — supersedes the
    earlier "in progress" version with the verified format and
    test-coverage summary.  Notes a few non-essential fields still
    open (4-byte block metadata, Geo PVS, Mic psi(L) — none of
    which are needed for waveform reconstruction).
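
The waveform-first, histogram-second dispatch described for read_blastware_file can be pictured roughly as follows. This is a sketch under stated assumptions: the decoders here are stand-ins, `decode_with_fallback` is a hypothetical name, and signalling failure with ValueError is an assumption rather than a documented behavior of either codec.

```python
from typing import Callable, Dict, List

Decoder = Callable[[bytes], Dict[str, List[float]]]


def decode_with_fallback(
    body: bytes, primary: Decoder, fallback: Decoder
) -> Dict[str, List[float]]:
    """Try the primary decoder; if it raises ValueError, use the fallback."""
    try:
        return primary(body)
    except ValueError:  # assumption: decoders signal a format mismatch this way
        return fallback(body)


# Stand-in decoders for illustration only.
def fake_waveform_decoder(body: bytes) -> Dict[str, List[float]]:
    raise ValueError("not a waveform body")


def fake_histogram_decoder(body: bytes) -> Dict[str, List[float]]:
    return {"Tran": [0.05], "Vert": [0.0], "Long": [0.0], "MicL": [0.0]}


result = decode_with_fallback(
    b"\x00" * 32, fake_waveform_decoder, fake_histogram_decoder
)
print(sorted(result))
```

Because both decoders return the same channel-keyed shape, the caller does not need to know which one succeeded.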

Total verified coverage: ~3,500 blocks across 5 fixtures, with every
decoded peak and frequency field byte-exact against BW.

The watcher-forwarded histogram event corpus on prod (~10,000
events) will now produce correct .h5 sidecars on the next backfill
run.  No additional changes needed to the backfill flow — the
existing tool_version-bump cascade picks them up automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 23:05:13 +00:00

serversdown

c3c7fe559c

docs: histogram body codec RE — starting-point status doc

Captures everything learned in the 2026-05-20 session before scope
forced a pause:

  - Block framing is solved: 32-byte blocks, one per histogram
    interval, signature byte pattern `[22:24]=0x0000` +
    `[28:32]=0x1e 0x0a 0x00 0x00` reliably identifies data blocks.
  - Block count tracks interval count to within one (791 blocks in
    N844L20G.630H for a TXT-reported 792 intervals).
  - Sample[0] = Tran peak in 0.0005 in/s/count units (verified on
    one event — needs cross-event confirmation).
  - Samples 1-8 → channel/metric mapping is still open.  None of
    the obvious layouts (peak-then-freq alternating, all-peaks-
    then-all-freqs, per-channel 3-tuples) match the TXT values
    across multiple blocks.  Likely needs a higher-activity
    fixture (current N844 corpus is all noise-floor data) to
    disambiguate.
  - `>100 Hz` sentinel encoding in the binary is unknown.
  - 4-byte variable metadata field at block[24:28] needs
    correlation work against TXT columns.

Doc mirrors the structure of docs/waveform_codec_re_status.md so
a future RE session has a familiar entry point.  Includes the
suggested attack plan + the code seam where the eventual decoder
will land (minimateplus/histogram_codec.py).

The §7.6.2 spec in instantel_protocol_reference.md is structurally
correct but doesn't pin down per-sample semantics — this doc
supersedes it where they conflict on confidence level.

No code shipped on this branch.  When the codec is cracked, the
plan is to land minimateplus/histogram_codec.py + wire into
event_file_io.read_blastware_file() + remove the has_samples
short-circuit from scripts/backfill_sidecars.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 21:13:26 +00:00

3 Commits