Merge pull request 'Histogram body codec — full RE + peak-count fix that resolves the prod inflation incident' (#26 ) from feat/wire-histogram-codec into dev

Reviewed-on: #26
scripts: bw_report preservation check for backfill safety
2026-05-22 13:08:03 -04:00 · 2026-05-21 06:13:52 +00:00 · 2026-05-21 06:05:19 +00:00 · 2026-05-21 02:50:10 +00:00 · 2026-05-21 02:17:33 +00:00 · 2026-05-21 01:20:08 +00:00
6 changed files with 364 additions and 27 deletions
@@ -1,6 +1,6 @@
 /bridges/captures/
 /example-events/
-
+/tests/fixtures/
 /manuals/

 # Python build artifacts
@@ -12,7 +12,21 @@ implementation lives in `minimateplus/histogram_codec.py`.
 in-repo histogram fixture corpus decodes byte-exact against BW's
 ASCII export.

-24 regression tests pass against ~3,500 blocks across 5 fixtures.
+26 regression tests pass against ~3,500 blocks across 5 in-repo
+fixtures, plus a synthetic regression block taken from a real
+BE9558 prod event to lock in the uint8-peak interpretation.
+
+**Important correction (2026-05-21):** the per-channel peak count
+is `uint8` at byte[6]/[10]/[14]/[18], NOT `uint16 LE` at byte[6:8]
+etc.  The N844 fixture corpus the original RE was done against has
+zero values in bytes [7]/[11]/[15]/[19] for every block, so the
+two interpretations happened to be equivalent.  Cross-correlating
+non-N844 events (BE9558 Tran-drift, BE18003 Histogram+Continuous)
+against BW's per-interval ASCII export — 4 channels × ~1400 blocks
+per event × multiple events = 100% byte-exact only when the peak
+is read as uint8.  Reading as uint16 LE produced peaks up to 268
+in/s per channel and 35× inflated PVS sums when first deployed to
+prod (rolled back, root-caused, and fixed in commit 7183b95+1).

 ## Body format

@@ -27,15 +41,21 @@ Each block represents one histogram interval.  Block layout:
 [1]    segment_id (uint8)        0x00..0x03 — 256 blocks per segment
 [2:4]  block_ctr (uint16 LE)     resets each segment (0x0100, 0x0101, …)
 [4:6]  0x000a (uint16 LE)        constant marker (= 10)
-[6:8]  T_peak_count   uint16 LE  Tran peak (count × 0.005 → in/s at Normal)
+[6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s at Normal,
+                                  max 1.275 in/s — fits in uint8)
+[7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                  or unmeasurable freq; meaning not fully RE'd
 [8:10] T_halfperiod   uint16 LE  Tran half-period in samples
                                  (freq_Hz = 512 / halfp; ≤ 5 means ">100 Hz")
-[10:12] V_peak_count  uint16 LE  Vert peak
+[10]   V_peak_count   uint8      Vert peak
+[11]   V_annotation   uint8
 [12:14] V_halfperiod  uint16 LE  Vert freq half-period
-[14:16] L_peak_count  uint16 LE  Long peak
+[14]   L_peak_count   uint8      Long peak
+[15]   L_annotation   uint8
 [16:18] L_halfperiod  uint16 LE  Long freq half-period
-[18:20] M_peak_count  uint16 LE  MicL peak count
+[18]   M_peak_count   uint8      MicL peak count
                                  (dB via waveform_codec.mic_count_to_db)
+[19]   M_annotation   uint8
 [20:22] M_halfperiod  uint16 LE  MicL freq half-period
 [22:24] 0x00 0x00                constant
 [24:28] 4-byte variable          purpose unknown — possibly CRC,
@@ -99,6 +119,16 @@ slot[8] = 9  → 512/9 = 56.9 → 57 Hz       ✓ M_freq

 ## What's NOT yet decoded

+- **Annotation bytes (`block[7]/[11]/[15]/[19]`)**.  Empirically
+  non-zero on intervals where the per-channel ZC frequency comes
+  out as `N/A` or sub-Hz (`<1.0`, `1.X`).  Hypothesis tested in the
+  RE session: byte != 0 ↔ sub-Hz freq.  Only ~50% correlation
+  across the K558 corpus, so the relationship is more complex.
+  Possibilities: time-of-peak-within-interval, halfp extension for
+  very-long-period signals, or a debug/diagnostic field the firmware
+  writes opportunistically.  Doesn't affect peak amplitudes or
+  waveform reconstruction.  Captured as `record["annotations"]` for
+  future RE.
 - **4-byte variable metadata field (bytes 24:28)**.  Not needed for
  waveform reconstruction.  Speculation: per-block CRC, sub-second
  timestamp offset, or a Mic psi(L) count not in the 9 samples.
@@ -28,18 +28,32 @@ iterate 32-stride and stop before the tail.
    [1]    segment_id  (uint8)       0x00..0x03 — 256 blocks per segment
    [2:4]  block_ctr  (uint16 LE)    resets each segment (0x0100, 0x0101, …)
    [4:6]  0x000a (uint16 LE)        constant marker (= 10)
-    [6:8]  T_peak_count   uint16 LE  Tran peak (count × 0.005 → in/s)
+    [6]    T_peak_count   uint8      Tran peak (count × 0.005 → in/s, max 1.275 in/s)
+    [7]    T_annotation   uint8      empirically non-zero on intervals with sub-Hz
+                                     or unmeasurable Tran freq; meaning not fully RE'd
    [8:10] T_halfperiod   uint16 LE  Tran half-period in samples (freq = 512 / halfp Hz)
-    [10:12] V_peak_count  uint16 LE
+    [10]   V_peak_count   uint8
+    [11]   V_annotation   uint8
    [12:14] V_halfperiod  uint16 LE
-    [14:16] L_peak_count  uint16 LE
+    [14]   L_peak_count   uint8
+    [15]   L_annotation   uint8
    [16:18] L_halfperiod  uint16 LE
-    [18:20] M_peak_count  uint16 LE  MicL peak (count → dB via mic_count_to_db)
+    [18]   M_peak_count   uint8      MicL peak (count → dB via mic_count_to_db)
+    [19]   M_annotation   uint8
    [20:22] M_halfperiod  uint16 LE  MicL half-period in samples (freq = 512 / halfp Hz)
    [22:24] 0x00 0x00                constant
    [24:28] 4-byte variable          purpose unknown (possibly CRC or timestamp delta)
    [28:32] 0x1e 0x0a 0x00 0x00      constant block-end signature

+NOTE on peak-count width: an earlier interpretation treated the peak
+fields as uint16 LE spanning [6:8] / [10:12] / [14:16] / [18:20].
+That happened to be byte-exact against the N844 fixture corpus only
+because every annotation byte in those fixtures was zero, making
+``uint16 LE == uint8``.  Cross-correlating BE9558 (K558) Tran-drift
+and BE18003 (T003) Histogram+Continuous events against the BW ASCII
+export proved peak is uint8 alone — see test_histogram_codec.py
+and docs/histogram_codec_re_status.md.
+
 Block-identification anchor: ``block[22:24] == b"\\x00\\x00"`` AND
 ``block[28:32] == b"\\x1e\\x0a\\x00\\x00"``.  This is the reliable
 distinguisher from non-block content in the file.
@@ -128,17 +142,40 @@ def _is_data_block(block: bytes) -> bool:
    return True


-def _decode_block(block: bytes) -> dict:
+def _decode_block(block: bytes) -> Optional[dict]:
    """Decode one 32-byte histogram block.  Caller must have validated
-    with ``_is_data_block`` first."""
-    # All 16-bit fields are little-endian unsigned.  Peak counts are
-    # always non-negative; half-periods are always positive when valid.
-    t_peak, t_halfp, v_peak, v_halfp, l_peak, l_halfp, m_peak, m_halfp = struct.unpack_from(
-        "<HHHHHHHH", block, 6
-    )
+    with ``_is_data_block`` first.
+
+    Returns a record with per-channel peak counts (uint8) and
+    half-periods (uint16 LE).
+    """
+    # Peak counts are uint8 at bytes [6] / [10] / [14] / [18].  The
+    # adjacent bytes [7] / [11] / [15] / [19] hold an annotation field
+    # whose meaning isn't fully understood (empirically non-zero in
+    # intervals with sub-Hz or unmeasurable geo frequencies, mostly
+    # zero otherwise — see test fixtures from BE9558/BE18003 corpora).
+    # Crucially, those annotation bytes are NOT the high byte of the
+    # peak count: cross-correlating against BW's per-interval ASCII
+    # export proves the peak is uint8 alone.
+    #
+    # Reading the peak as uint16 LE (the original interpretation) was
+    # accidentally correct only because every block in the N844 fixture
+    # corpus had a zero annotation byte; non-N844 events with non-zero
+    # annotation bytes decoded to physically impossible peaks (e.g.
+    # 268 in/s per channel) and produced 35× inflated PVS sums when
+    # first run against prod data.  See histogram_codec_re_status.md.
+    t_peak = block[6]
+    v_peak = block[10]
+    l_peak = block[14]
+    m_peak = block[18]
+    t_halfp = block[8]  | (block[9]  << 8)
+    v_halfp = block[12] | (block[13] << 8)
+    l_halfp = block[16] | (block[17] << 8)
+    m_halfp = block[20] | (block[21] << 8)
    segment_id = block[1]
    block_ctr  = block[2] | (block[3] << 8)
    var_meta   = bytes(block[24:28])
+    annotations = (block[7], block[11], block[15], block[19])
    return {
        "segment_id":  segment_id,
        "block_ctr":   block_ctr,
@@ -151,6 +188,7 @@ def _decode_block(block: bytes) -> dict:
        "m_peak":      m_peak,
        "m_halfp":     m_halfp,
        "meta_var":    var_meta,
+        "annotations": annotations,
    }


@@ -160,6 +198,13 @@ def walk_body(body: bytes) -> List[dict]:
    Iterates 32-byte strides from offset 0.  Yields a decoded record
    for every block that passes ``_is_data_block`` validation.  Stops
    when the remaining bytes are too short to form a complete block.
+
+    In Histogram+Continuous mode the body interleaves data blocks with
+    other 32-byte content (likely continuous-mode waveform blocks) that
+    fail the data-block validation; the walker naturally skips them
+    without losing 32-byte alignment.  Use ``block_ctr`` from each
+    returned record to map back to the original interval index — the
+    record list is sparse when other block types are interleaved.
    """
    records: List[dict] = []
    for off in range(0, len(body) - _BLOCK_SIZE + 1, _BLOCK_SIZE):
@@ -169,7 +214,13 @@ def walk_body(body: bytes) -> List[dict]:
            # Continue walking — block alignment is fixed at 32-stride
            # from offset 0, so we don't lose alignment by skipping.
            continue
-        records.append(_decode_block(blk))
+        decoded = _decode_block(blk)
+        if decoded is None:
+            # Block validated as a histogram block but had peak fields
+            # outside the plausible range — undocumented extension.
+            # Skip rather than propagating bogus PVS contributions.
+            continue
+        records.append(decoded)
    return records


@@ -54,14 +54,26 @@ log = logging.getLogger("backfill_sidecars")


 def _looks_like_event_file(path: Path) -> bool:
-    """Same heuristic as the importer CLI."""
+    """Same heuristic as the importer CLI.
+
+    Filters to BW (Series III) event files only — Thor (Series IV)
+    `.IDFW` / `.IDFH` files share the store but have their own ingest
+    path (`WaveformStore.save_imported_idf`) and are NOT decodable by
+    `event_file_io.read_blastware_file`.  Their sidecars are populated
+    at ingest from the paired `.IDFW.txt` ASCII report; nothing the
+    backfill regenerates would improve on them, so we exclude them
+    from scope.
+    """
    if not path.is_file():
        return False
-    if path.name.endswith((".a5.pkl", ".sfm.json")):
+    if path.name.endswith((".a5.pkl", ".sfm.json", ".h5")):
        return False
    ext = path.suffix.lstrip(".")
    if not (3 <= len(ext) <= 4):
        return False
+    # Thor IDF files share the .{W,H}-suffix shape but aren't BW.
+    if ext.upper() in ("IDFW", "IDFH"):
+        return False
    if not (ext[-1].upper() in {"W", "H"} or ext.endswith("0")):
        return False
    try:
@@ -275,16 +287,25 @@ def main(argv=None) -> int:
                            or ev.total_samples < derived // 4):
                        ev.total_samples = derived

-                # Preserve user-edited review state + extensions from the
-                # existing sidecar (false_trigger flag, notes, etc.) so a
-                # backfill never wipes them out.
+                # Preserve user-edited review state + extensions + the
+                # bw_report block from the existing sidecar so a backfill
+                # never wipes them out.  The bw_report block originates
+                # from the paired .TXT ASCII report parsed at ORIGINAL
+                # import time (ach forward / direct upload); the .TXT
+                # file is not in the waveform store, so we can't re-derive
+                # it from disk.  event_to_sidecar_dict takes a
+                # BwAsciiReport dataclass (not a dict), so for bw_report
+                # we overlay the existing block after regen instead of
+                # passing it as a kwarg.
                preserved_review     = None
                preserved_ext        = None
+                preserved_bw_report  = None
                if sidecar_path.exists():
                    try:
                        _existing = event_file_io.read_sidecar(sidecar_path)
                        preserved_review    = _existing.get("review")
                        preserved_ext       = _existing.get("extensions")
+                        preserved_bw_report = _existing.get("bw_report")
                    except Exception:
                        pass

@@ -299,6 +320,8 @@ def main(argv=None) -> int:
                    review=preserved_review,
                    extensions=preserved_ext,
                )
+                if preserved_bw_report is not None:
+                    sidecar["bw_report"] = preserved_bw_report

                # Also emit the .h5 clean-waveform file when:
                #   - it's missing, OR
@@ -0,0 +1,185 @@
+"""
+scripts/check_bw_report_preservation.py — verify that running backfill_sidecars
+doesn't wipe the `bw_report` block from sidecars that already had one.
+
+Two-step workflow:
+
+  # Before running backfill — capture a baseline snapshot:
+  python scripts/check_bw_report_preservation.py snapshot \
+      --store-root /path/to/waveforms \
+      --out before.json
+
+  # Run backfill:
+  python scripts/backfill_sidecars.py --store-root /path/to/waveforms --force
+
+  # After backfill — diff against the baseline:
+  python scripts/check_bw_report_preservation.py diff \
+      --store-root /path/to/waveforms \
+      --baseline before.json
+
+The diff classifies every sidecar into one of:
+
+  PRESERVED      had bw_report before, has same hash now  ← GOOD
+  CHANGED        had bw_report before, has different hash now  ← suspicious
+                 (backfill should only ever copy the block verbatim)
+  WIPED          had bw_report before, doesn't now  ← BUG — data loss
+  STILL_MISSING  didn't have bw_report before, still doesn't  ← expected
+  NEW            didn't have bw_report before, has one now
+                 (only possible if a re-ingest happened between snapshots;
+                  shouldn't happen during backfill)
+  REMOVED        sidecar existed in baseline, file is gone now
+  ADDED          sidecar didn't exist in baseline, exists now
+
+Exit code is 0 if no WIPED or CHANGED entries are found, 1 otherwise.
+"""
+
+from __future__ import annotations
+
+import argparse
+import hashlib
+import json
+import sys
+from pathlib import Path
+from typing import Optional
+
+# Allow running from the repo root without installation.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from minimateplus import event_file_io
+
+
+def _bw_report_hash(sidecar_data: dict) -> Optional[str]:
+    """Canonical-JSON hash of the bw_report block, or None if absent."""
+    br = sidecar_data.get("bw_report")
+    if not br:
+        return None
+    # sort_keys for stable hashing across dict-ordering differences
+    blob = json.dumps(br, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(blob.encode()).hexdigest()
+
+
+def _scan_store(store_root: Path) -> dict:
+    """Walk every <serial>/<file>.sfm.json and return {relpath: hash_or_None}.
+
+    Relpath is `<serial>/<filename>` — stable across machines/snapshots.
+    """
+    out: dict[str, Optional[str]] = {}
+    for serial_dir in sorted(p for p in store_root.iterdir() if p.is_dir()):
+        for sidecar in sorted(serial_dir.glob("*.sfm.json")):
+            relpath = f"{serial_dir.name}/{sidecar.name}"
+            try:
+                data = event_file_io.read_sidecar(sidecar)
+            except Exception as exc:
+                print(f"  WARN: failed to read {relpath}: {exc}", file=sys.stderr)
+                continue
+            out[relpath] = _bw_report_hash(data)
+    return out
+
+
+def cmd_snapshot(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    out_path = Path(args.out).expanduser().resolve()
+
+    print(f"Scanning {store_root} …")
+    snapshot = _scan_store(store_root)
+
+    with_bw    = sum(1 for v in snapshot.values() if v is not None)
+    without_bw = sum(1 for v in snapshot.values() if v is None)
+    print(f"  total sidecars:     {len(snapshot)}")
+    print(f"  with bw_report:     {with_bw}")
+    print(f"  without bw_report:  {without_bw}")
+
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    with open(out_path, "w") as f:
+        json.dump({
+            "store_root":  str(store_root),
+            "total":       len(snapshot),
+            "with_bw":     with_bw,
+            "sidecars":    snapshot,
+        }, f, indent=2, sort_keys=True)
+    print(f"Wrote baseline → {out_path}")
+    return 0
+
+
+def cmd_diff(args) -> int:
+    store_root = Path(args.store_root).expanduser().resolve()
+    if not store_root.exists():
+        print(f"error: store root does not exist: {store_root}", file=sys.stderr)
+        return 2
+    baseline_path = Path(args.baseline).expanduser().resolve()
+    if not baseline_path.exists():
+        print(f"error: baseline file not found: {baseline_path}", file=sys.stderr)
+        return 2
+
+    with open(baseline_path) as f:
+        baseline = json.load(f)
+    before = baseline["sidecars"]
+    print(f"Scanning {store_root} for comparison against {baseline_path.name} …")
+    after = _scan_store(store_root)
+
+    classes = {k: [] for k in (
+        "PRESERVED", "CHANGED", "WIPED", "STILL_MISSING", "NEW", "REMOVED", "ADDED",
+    )}
+    all_keys = set(before) | set(after)
+    for key in sorted(all_keys):
+        b = before.get(key, "__MISSING__")
+        a = after.get(key, "__MISSING__")
+        if b == "__MISSING__":
+            classes["ADDED"].append(key)
+        elif a == "__MISSING__":
+            classes["REMOVED"].append(key)
+        elif b is None and a is None:
+            classes["STILL_MISSING"].append(key)
+        elif b is None and a is not None:
+            classes["NEW"].append(key)
+        elif b is not None and a is None:
+            classes["WIPED"].append(key)
+        elif b == a:
+            classes["PRESERVED"].append(key)
+        else:
+            classes["CHANGED"].append(key)
+
+    print()
+    print(f"{'class':16s} {'count':>7s}")
+    print("-" * 24)
+    for k in ("PRESERVED", "STILL_MISSING", "CHANGED", "WIPED",
+              "NEW", "ADDED", "REMOVED"):
+        print(f"{k:16s} {len(classes[k]):>7d}")
+
+    # Show samples of the concerning classes
+    for k in ("WIPED", "CHANGED"):
+        if classes[k]:
+            print(f"\n=== {k} samples (up to 10) ===")
+            for key in classes[k][:10]:
+                print(f"  {key}")
+
+    if classes["WIPED"] or classes["CHANGED"]:
+        print("\n*** Preservation broken: WIPED or CHANGED entries present ***")
+        return 1
+    print("\nbw_report preservation looks intact.")
+    return 0
+
+
+def main(argv=None) -> int:
+    p = argparse.ArgumentParser(description=__doc__)
+    sub = p.add_subparsers(dest="cmd", required=True)
+
+    p_snap = sub.add_parser("snapshot", help="capture baseline bw_report hashes")
+    p_snap.add_argument("--store-root", required=True)
+    p_snap.add_argument("--out", required=True, help="output JSON path")
+    p_snap.set_defaults(func=cmd_snapshot)
+
+    p_diff = sub.add_parser("diff", help="diff current store against a baseline")
+    p_diff.add_argument("--store-root", required=True)
+    p_diff.add_argument("--baseline",   required=True, help="JSON from `snapshot`")
+    p_diff.set_defaults(func=cmd_diff)
+
+    args = p.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -335,3 +335,51 @@ def test_geo_count_to_ins_scale():
    assert geo_count_to_ins(1)  == pytest.approx(0.005)
    assert geo_count_to_ins(10) == pytest.approx(0.050)
    assert geo_count_to_ins(0)  == 0.0
+
+
+# ── Regression: peak is uint8 byte[N], NOT uint16 LE byte[N:N+2] ────────────
+#
+# Block taken verbatim from K558LKZU.RE0H (BE9558) interval 12 — a real
+# field event where the Tran channel had developed a DC offset and was
+# producing sub-Hz drift content the device couldn't characterize.
+# The annotation byte at [7] = 0xd2 is non-zero in that case.  The
+# legacy codec read [6:8] as uint16 LE, producing T_peak = 53763 →
+# 268 in/s — physically impossible and 35× too high for the actual
+# 0.015 in/s value (T_lo = 3 alone gives the correct count).
+# Verified against the paired BW ASCII export.
+_K558_INTERVAL_12_BLOCK = bytes.fromhex(
+    "00 00 0c 01 0a 00 03 d2 45 00 02 00 02 00 02 00"
+    "02 00 10 00 06 00 00 00 0e 91 2f 00 1e 0a 00 00".replace(" ", "")
+)
+
+
+def test_extension_byte_does_not_inflate_peak():
+    """The annotation byte at [7]/[11]/[15]/[19] must NOT contribute to
+    the peak count.  Decoded T_peak must be 3 (uint8 byte[6]), NOT
+    53763 (uint16 LE byte[6:8])."""
+    body = _K558_INTERVAL_12_BLOCK
+    records = decode_histogram_body_full(body)
+    assert records is not None
+    assert len(records) == 1
+    r = records[0]
+    assert r["t_peak"] == 3,    f"T_peak should be 3 (uint8), got {r['t_peak']}"
+    assert r["v_peak"] == 2
+    assert r["l_peak"] == 2
+    assert r["m_peak"] == 16
+    # Half-periods unchanged — still uint16 LE.
+    assert r["t_halfp"] == 0x0045  # 69 → 7.4 Hz
+    assert r["m_halfp"] == 6       # → 85.3 Hz
+    # Annotation byte is preserved (for future RE) but does not affect peak.
+    assert r["annotations"] == (0xd2, 0x00, 0x00, 0x00)
+
+
+def test_extension_byte_decoded_to_correct_in_s():
+    """End-to-end: the channel-grouped output for the K558 ext block
+    should give T = 3 counts = 0.015 in/s, not 53763 counts = 268 in/s."""
+    channels = decode_histogram_body(_K558_INTERVAL_12_BLOCK)
+    assert channels is not None
+    assert channels["Tran"] == [3]
+    assert geo_count_to_ins(channels["Tran"][0]) == pytest.approx(0.015)
+    assert channels["Vert"] == [2]
+    assert channels["Long"] == [2]
+    assert channels["MicL"] == [16]
Author	SHA1	Message	Date
serversdown	9ef424d098	Merge pull request 'Histogram body codec — full RE + peak-count fix that resolves the prod inflation incident' (#26 ) from feat/wire-histogram-codec into dev Reviewed-on: #26	2026-05-22 13:08:03 -04:00
serversdown	ed6982c512	scripts: bw_report preservation check for backfill safety Two-step tool to verify that backfill_sidecars doesn't wipe the bw_report block from existing sidecars. Workflow: 1. snapshot --out before.json (canonical-JSON hash per sidecar) 2. run backfill 3. diff --baseline before.json (classifies every sidecar: PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED) Exit code 1 if any WIPED or CHANGED entries found, 0 otherwise — so it can gate a CI step or a deploy script. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 06:13:52 +00:00
serversdown	d506ebc103	histogram_codec: peak count is uint8 (not uint16 LE) — properly cracks the BE9558 / BE18003 extension-byte case The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still unclear — empirically non-zero on intervals with sub-Hz or unmeasurable freq), NOT the high byte of the peak count. The N844 fixture corpus the original RE was done against had zero values in those bytes for every block, so uint8 and uint16 LE were equivalent there — but on real BE9558 Tran-drift events and BE18003 Histogram+Continuous events the uint16 LE interpretation produced peaks up to 268 in/s and 35× inflated PVS sums. Cross-correlated against BW's per-interval ASCII export on: - K558LKZU/LL1P/LL3K → 100% T/V/L/M peak match (1435 blocks each) - T003LKZR/LL0O/LL1M → 100% T/V/L, 99.3% M (0.05 dB rounding only) - N599LKZS/LL0L → 100% all channels - N844 fixture corpus → 100% all channels (unchanged) Annotations preserved on every record for future RE; the defensive _MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s, well below any physical limit). Synthetic regression test added using the verbatim K558LKZU.RE0H interval-12 block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 06:05:19 +00:00
serversdown	e949232875	histogram_codec + backfill: tighter peak ceiling, preserve bw_report histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling let extension-byte blocks slip through at up to 20.48 in/s per channel, producing 35× inflated PVS sums when first deployed to prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts) plus 10% headroom for quantization edge cases. backfill_sidecars: also preserve the bw_report block alongside review + extensions when regenerating sidecars. event_to_sidecar_dict takes a BwAsciiReport dataclass not a dict, so for bw_report we overlay the existing block after regen rather than passing as a kwarg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 02:50:10 +00:00
serversdown	bc5a2d3f19	histogram_codec: defensive bounds-check on peak counts Discovered while running the backfill on prod: certain histogram blocks contain an undocumented extension byte format whose naive uint16 LE interpretation yields physically impossible peak values (150+ in/s when the device max is 10). Concrete example from K558LKSG.3I0H block at body+7424: bytes [6:10] = 05 79 69 00 current code: T_peak = uint16 LE = 0x7905 = 30981 → 154.9 in/s reality: T_peak = byte[6] = 5 → 0.025 in/s (matches BW display) The high byte (0x79 here) appears to be an extension field — possibly "time of peak within interval" or a Histogram+Continuous sub-mode marker. Observed across BE9558 and BE18003 units in prod data; never appeared in the BE12844 fixture corpus the codec was originally verified against. Effect on prod: 26 out of 1433 blocks in this one event had inflated peaks, plus dozens of similar events across the fleet → sum(PVS) inflated from baseline 988 to 34501 (35x). Rolled back via the pre-backfill snapshot before any UI exposure. Defensive fix: bounds-check peak counts in `_decode_block`. Any field exceeding `_MAX_PEAK_COUNT` (4096 = ~20 in/s, well past the device's 10 in/s Normal-range FS) causes the block to be skipped entirely. Other valid blocks in the same event still decode correctly. Trade-off: those skipped blocks lose their per-interval data (peaks + frequencies). Acceptable until the extension format is reverse-engineered — better than propagating bogus values into PVS computations downstream. The 24 existing tests all still pass — the fixtures used during the original codec development don't exercise the extension-byte case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 02:17:33 +00:00
serversdown	88549bc659	backfill_sidecars: filter out Thor IDF files Discovered while dry-running the backfill on prod: the waveform store contains both BW (.AB0*/.N00) and Thor IDF (.IDFW/.IDFH) event files side-by-side because both go through the same per-serial directory layout. The script's `_looks_like_event_file` heuristic accepted any 3-4 char extension ending in W or H, which matched both BW and IDF. The script then routes everything through `event_file_io.read_blastware_file`, which rejects IDF files with "not a Blastware file (bad header prefix)" — 3807 errors on prod out of 7201 total events. Thor IDF events have their own ingest path (`WaveformStore.save_imported_idf`) and their sidecars are populated at ingest from the paired `.IDFW.txt` ASCII report. The backfill script has no value to add for them — there's no decoder to refresh, and the sidecar metadata is already correct. Filter them out. After this fix, the prod backfill should run clean: ~3392 BW events get sidecar+h5 regen as expected; the ~3807 Thor IDF events are silently skipped. The proper "IDF backfill" (refresh tool_version stamp on IDF sidecars by re-running event_to_sidecar_dict against the stored DB row + sidecar extensions block) is a separate, narrower follow-up — not blocking the BW backfill rollout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 01:20:08 +00:00