381 Commits

Author SHA1 Message Date
serversdown d0b66368d5 Merge pull request 'update to v0.21.1, thor data import successful' (#29) from dev into main
Reviewed-on: #29
2026-06-01 16:54:23 -04:00
serversdown 25386cab8b fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge
Two gaps in backfill_thor_events.py that left old Thor events showing
stale charts after a v0.21.1 backfill pass:
1. IDFH events were skipped from .h5 regeneration (the "have decoded
   samples" gate was IDFW-only).  Histograms kept their pre-v0.21.1
   .h5 — written from raw_samples = None, which the renderer turned
   into a near-empty bar chart, or for older events the dB(L)-as-pseudo-
   psi mic scale that produced "107.7 psi" peaks (atomic-bomb level
   instead of footstep level).  Fix: synthesise the same 1-sample-per-
   interval array save_imported_idf v0.21.1 uses (peak ADC count per
   channel per interval) so the renderer's bar-chart grouping has
   data to work with.
2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the
   IdfEvent before to_minimateplus_event().  The live save_imported_idf
   does this merge — without it, IdfEvent.from_report() only sees the
   .txt's dB(L) value, the bridge falls back to the dBL→psi formula
   (instead of the binary-accurate 2.14e-6 psi/count value), and the
   h5 writer's per-count mic factor lands on a less-correct value.
   Fix: same merge the live ingest does (lift res.event.peaks.mic_pspl_psi
   onto idf_event.peaks before the bridge call).
Verified against UM6047_20250804190047.IDFH (250-interval prod
histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being
treated as dB(L)=107.7 in the old h5).
Operator: re-run after deploy.  `docker compose exec sfm python
scripts/backfill_thor_events.py` is idempotent — the existing version
check still skips events already at the new TOOL_VERSION, and review
state + captured_at are preserved on the second pass.
2026-06-01 20:02:54 +00:00
serversdown 6cb619ecc4 version bump - 0.21.1 2026-06-01 19:33:44 +00:00
serversdown 1ed86244d0 fix(thor-events): add parallel field for mic psi. Mic now shows in dB(L) and psi (psi for charts). 2026-06-01 18:27:24 +00:00
serversdown b2c565f217 fix(idf_waveforms): _find_waveform_body_offset() — scans every 00 02 00 magic past offset 0x0E00, runs decode_waveform_v2 on each candidate, picks the one that returns the most samples. Validated on 483 prod IDFW files: 0 preamble-only events (was ~50%), 355/483 fully decode, 126/483 partial (BW codec walker-stops-early on loud events — known issue).
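A minimal sketch of the candidate scan described above, not the shipped implementation. `find_waveform_body_offset`, `MAGIC`, `SCAN_START`, and the `decode` callback are illustrative names; the real `decode_waveform_v2` signature is not shown in this log, so the decoder is passed in as an assumed `(data, offset) -> samples` callable.

```python
from typing import Callable, Optional, Sequence

MAGIC = b"\x00\x02\x00"  # body marker named in the commit message
SCAN_START = 0x0E00      # assumption: candidates at or after this offset are tried


def find_waveform_body_offset(
    data: bytes,
    decode: Callable[[bytes, int], Sequence[int]],
) -> Optional[int]:
    """Return the magic offset whose decode yields the most samples."""
    best_offset, best_count = None, 0
    pos = data.find(MAGIC, SCAN_START)
    while pos != -1:
        try:
            count = len(decode(data, pos))
        except Exception:
            count = 0  # a candidate that fails to decode simply loses
        if count > best_count:
            best_offset, best_count = pos, count
        pos = data.find(MAGIC, pos + 1)
    return best_offset
```

Ties keep the earliest candidate, and a file with no decodable candidate returns None so the caller can treat it as preamble-only.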
IDFH now synthesises a 1-sample-per-interval array from the binary intervals and writes an .h5 so the existing renderer works unchanged. Each "sample" is the per-interval peak ADC count → h5_value = count × geo_fs/32768 yields the right bar height.
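The count-to-value scaling above can be sketched as follows. This is an illustration under stated assumptions: `synthesise_interval_samples` is a hypothetical helper name, and `geo_fs` is taken to be the geophone full-scale value in the chart's units.

```python
from typing import List, Sequence

ADC_FULL_SCALE = 32768  # divisor quoted in the commit message


def synthesise_interval_samples(
    interval_peak_counts: Sequence[int], geo_fs: float
) -> List[float]:
    """One synthetic sample per interval: count * geo_fs / 32768."""
    return [count * geo_fs / ADC_FULL_SCALE for count in interval_peak_counts]
```

With `geo_fs=10.0`, a peak count of 16384 maps to 5.0, half of full scale.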
2026-05-31 20:51:09 +00:00
serversdown 43f440812a scripts: add backfill_thor_events.py
Refreshes the bw_report sidecar block + .h5 waveform files for Thor
events ingested before the v0.21.0 adapter wiring + the bee1185 codec
fix.  Those events landed with extensions.idf_report only (no
bw_report, no .h5 for IDFW) — symptom on the UI side: the modal chart
404'd on /waveform.json and the PDF rendered from DB-only fields
without sensor self-check, full per-channel breakdown, or mic dB(L).

Walks <store>/<serial>/<filename>:
  - Reads the existing sidecar (preserves review state + captured_at)
  - Re-runs read_idf_file() on the binary bytes (passes data=
    kwarg so codec doesn't try the broken bare-path Path.read_bytes)
  - Reads extensions.idf_report from the existing sidecar
  - Runs build_bw_report_from_idf adapter
  - Writes refreshed sidecar with bw_report + bumped tool_version,
    preserving review block and original captured_at
  - For IDFW: regenerates .h5 by bridging IdfEvent.from_report ->
    to_minimateplus_event -> write_event_hdf5 (mirrors save_imported_idf
    steps 4-7)
  - IDFH events skip .h5 (histograms have no per-sample data)

Skips events already at current TOOL_VERSION with bw_report present.
--force overrides.  --skip-hdf5 limits to sidecar-only refresh.
--dry-run for preview.
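A rough sketch of the skip rule, assuming the sidecar is a dict with top-level `tool_version` and `bw_report` keys. The key placement and the `should_skip` name are assumptions for illustration; the script's actual sidecar layout is not shown in this log.

```python
from typing import Any, Mapping


def should_skip(
    sidecar: Mapping[str, Any], current_version: str, force: bool = False
) -> bool:
    """Skip only when the sidecar is current AND already has a bw_report."""
    if force:
        return False  # --force always reprocesses
    up_to_date = sidecar.get("tool_version") == current_version
    has_report = bool(sidecar.get("bw_report"))
    return up_to_date and has_report
```

Requiring both conditions is what makes a second pass a no-op while still repairing sidecars that were stamped with the current version but never received a bw_report.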

Validated against the prod-snap waveform store: 3,815 Thor sidecars
refreshed cleanly with 0 errors, 462 IDFW .h5 files written, 2 skipped
(binaries with no sidecar — backfill doesn't conjure events from
nothing).  Verified one originally-broken IDFW event now serves
waveform.json (200, 168KB) and a fully populated PDF (119KB vs the
previous 56KB sparse output).

Operator workflow on prod:
  docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py --dry-run
  # Inspect counts, then for real:
  docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py

Idempotent — re-running it is a no-op once everything's at the current
TOOL_VERSION.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 04:37:43 +00:00
serversdown 23e83908c2 report_pdf: fix PVS overlapping stats table, drop NA caption
Three related fixes to the per-channel stats block:

1. Pin the stats table's position via an explicit bbox= on
   ax.table() so the bottom edge is at a known axes-fraction Y.
   The previous loc="upper left" + tbl.scale(1, 1.4) combo let
   matplotlib choose row heights based on text size, which made the
   table extend further below the axes than the hard-coded PVS line
   at y=-0.08 expected.  Result was the "Peak Vector Sum X in/s"
   string landing horizontally inside the Peak Displacement row.

   With bbox=[0, 1-N*0.12, 0.80, N*0.12] the table is pinned to a
   precise rectangle (12% axes-fraction per row × N rows tall).
   _draw_stats_table now stashes the bottom Y on the axes for the
   PVS helper to reference, so the geometry stays in sync.

2. Center PVS horizontally (ha="center" at x=0.5 instead of ha="left"
   at x=0).  The previous left-edge alignment put PVS at the same
   X as the label column, which read as "off-center" once the rest
   of the stats data was column-aligned further right.

3. Drop the "NA: Not Applicable" caption.  It existed to explain
   "—" placeholder cells, but "—" is universally understood and the
   caption was always visually squished against the PVS line below.
   Less cruft on the page; one fewer position to manage.

Verified against a real BE12599 histogram event (5 data rows) and
a real UM12947 IDFW waveform event (6 data rows) — both layouts
clear the table cleanly with no overlap.
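The pinned-rectangle geometry from fix 1 can be sketched without matplotlib. `stats_table_bbox` is a hypothetical helper; the returned list follows the `[left, bottom, width, height]` axes-fraction order that `ax.table(bbox=...)` accepts.

```python
from typing import List, Tuple

ROW_HEIGHT = 0.12   # axes-fraction per row, from the commit message
TABLE_WIDTH = 0.80  # axes-fraction width, from the commit message


def stats_table_bbox(n_rows: int) -> Tuple[List[float], float]:
    """Return (bbox, bottom_y) for a table pinned to the top of the axes."""
    height = n_rows * ROW_HEIGHT
    bottom_y = 1.0 - height
    return [0.0, bottom_y, TABLE_WIDTH, height], bottom_y
```

Returning `bottom_y` alongside the bbox mirrors the idea of stashing the table's bottom edge so the PVS line can be placed relative to it instead of at a hard-coded Y.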

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 22:17:43 +00:00
serversdown bee118506b fix(idf): decode from in-memory bytes during ingest
Bug shipped in v0.21.0: save_imported_idf called read_idf_file()
with `source_path` (a bare filename like "UM12947_….IDFW") BEFORE
writing the binary to disk.  The codec did Path(path).read_bytes()
which resolved relative to /app and hit FileNotFoundError.  The
error was caught + logged as a warning, and ingest fell back to
.txt-only — events still landed in the DB but lost the bw_report
block + .h5 waveform that the codec was supposed to produce.

Observed during a full re-forward from thor-watcher on 2026-05-29:
every Thor event logged "binary codec failed for X: [Errno 2] No
such file or directory" and got binary_decoded=False.

Fix:
- read_idf_file() gains a `data: Optional[bytes]` kwarg.  When
  supplied, skips the disk read and decodes the provided bytes
  directly.  `path` stays required (used for filename in error
  messages + .IDFH vs .IDFW suffix detection); only the read is
  conditional.  Backward compatible — existing positional callers
  (CLI scripts, tests) continue to work unchanged.
- save_imported_idf passes `data=idf_bytes` since the bytes are
  already in memory from the multipart upload.  Filesystem write
  still happens at step 5 of the existing flow; codec just no
  longer depends on it.
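A simplified sketch of the optional-bytes pattern described in the fix. `read_idf_bytes` is a stand-in name and only models the read step; the real `read_idf_file()` goes on to decode the payload.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_idf_bytes(path: str, data: Optional[bytes] = None) -> Tuple[str, bytes]:
    """Return (kind, raw bytes); kind is taken from the filename suffix."""
    suffix = Path(path).suffix.upper()
    if suffix not in (".IDFW", ".IDFH"):
        raise ValueError(f"{path}: expected an .IDFW or .IDFH file")
    if data is None:
        # Legacy behaviour: only touch the disk when no bytes were supplied.
        data = Path(path).read_bytes()
    return suffix.lstrip("."), data
```

Because `data` defaults to None, existing positional callers keep reading from disk, while the ingest path can pass the uploaded bytes and never depends on the file existing yet.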

Verified end-to-end against UM11719_20231219162723.IDFW from the
example-data corpus: ingest endpoint returns inserted=1, log line
shows binary_decoded=True + h5=...IDFW.h5, no warnings.

Re-forward existing Thor events from thor-watcher after deploy to
backfill the bw_report block — UPSERT preserves review state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 20:09:54 +00:00
serversdown defd17d9c2 sfm_webapp: harmonize "Received by server at" → "Time received"
Matches Terra-View's event-modal relabel from the same iteration.
Wording was already clearer here than in Terra-View's "Captured at",
but using identical text across both surfaces means operators see the
same label whether they're in the native modal or the standalone
webapp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 19:51:58 +00:00
serversdown e42956a20b release: v0.21.0 — Thor / Series IV codec + Thor→BW adapter
Documents two commits that landed on dev since v0.20.0:

  9b71ead  series 4 codec work, initial decode success
           micromate/idf_file.read_idf_file() decodes both IDFW
           (waveform; 87-99% sample fidelity reusing
           decode_waveform_v2 at offset 0x0f1f) and IDFH (histogram;
           dedicated segment-based decoder, all 859 corpus files
           decode, 181,071 intervals total).

  9fd52dd  feat: add thor report generation, pdf generation
           micromate/idf_to_bw_report.py adapter projects parsed
           Thor data into the bw_report sidecar shape so Thor
           events flow through sfm/report_pdf.py without a
           separate renderer.  Wired into save_imported_idf.

Net effect: a Thor event ingested via /db/import/idf_file now
lands with the same fidelity as a BW event, gets a per-event PDF
on demand, and renders in Terra-View's modal chart using the same
plotting code as a BW event.

Roadmap items closed:
- Binary .IDFW / .IDFH codec (was pending)
- Series IV (Thor IDF) binary codec reverse-engineering

Companion: Terra-View v0.13.0 ships in parallel and closes Phase 1
of the SFM integration.  No API changes in seismo-relay for that
piece — Terra-View just consumes existing endpoints better.

Bumps:
- pyproject.toml 0.20.0 → 0.21.0
- minimateplus.event_file_io.TOOL_VERSION 0.20.0 → 0.21.0
  (any subsequent backfill_sidecars.py --force will re-stamp
  existing sidecars; expected + harmless)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 19:25:44 +00:00
serversdown 9fd52ddabb feat: add thor report generation, pdf generation. 2026-05-29 19:03:06 +00:00
serversdown 9b71ead44b series 4 codec work, initial decode success 2026-05-29 06:33:13 +00:00
serversdown 2eb1d25028 Merge pull request 'v0.20.0 -- Full s3 event parse and PDF creation.' (#28) from dev into main
Reviewed-on: #28
2026-05-28 17:54:31 -04:00
serversdown 1bccc44b88 release: v0.20.0 — PDF + parser polish
Closes out the Event-Report PDF iteration started in v0.17.x and ships
the parser fixes the real-world events were tripping over.

Today's additions on top of the pre-v0.20 unreleased body:

- Server-wide display TZ via the TZ env var (default America/New_York
  on prod).  Affects server logs, the PDF report's "Created" footer,
  matplotlib datetime axes.  DB columns stay UTC.  Dockerfile now
  installs tzdata.
- ZC Freq "above-range" handling — parser stores 100.0 +
  zc_freq_above_range flag for BW's ">100 Hz" marker.  Renders as
  >100 in the PDF stats table, both modals (inline on webapp Peaks,
  new column on event-browser table).
- scripts/backfill_sidecars.py --reparse-txt — re-runs the current
  parser against the preserved _ASCII.TXT and overwrites the
  sidecar's bw_report block.  Lets parser fixes reach old events
  without re-forwarding.  Validated end-to-end against ~10k prod
  events.

Fixes shipped today:
- histogram_interval_size_s missing from ReportData → every
  histogram PDF render 500'd.
- Histogram PDF geo channels now share a nice-quantized y-axis
  (0.005-LSB-aware 1-2-5 step sequence) instead of auto-scaling
  per channel + inventing sub-LSB "0.003 in/s/div" footer labels.

Roadmap delta: closes the BW ASCII parser "PPV-miss on some TXT
formats", "histogram-specific structural fields", and ">100 Hz value
parsing" items.  Adds a new entry for the byte[5]==0 histogram body
sub-format observed on S353 events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 21:17:53 +00:00
serversdown a3cc44d30a feat(backfill): --reparse-txt flag to refresh bw_report from preserved .TXT
The existing backfill_sidecars.py PRESERVES the bw_report block across
regenerations — it's treated as the source of truth from the original
ingest pass (the .TXT isn't reachable from the script's normal data
path, so it can't be re-derived).

That means parser-side fixes (like the 2026-05-28 ">100 Hz" ZC Freq
addition) won't reach old events even with --force.  The new
--reparse-txt flag fixes that: when the sidecar's source.txt_filename
points at a preserved <serial>/<filename>_ASCII.TXT, the script re-runs
the current parser against it and overwrites the bw_report block.

Implies sidecar regeneration on every event (bypasses the
sha-up-to-date / version-up-to-date skip), so that the .h5 cascade-
regenerates alongside.  No-op for events without a preserved .TXT
(legacy ingests pre-2026-05-27).  Idempotent — re-running it produces
the same sidecar bytes when the parser hasn't changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 18:56:23 +00:00
serversdown 6a73523e4d ui: surface per-channel ZC Freq (and ">100") in event modals
The PDF report shows per-channel ZC Freq alongside PPV in the stats
block, but neither modal exposed it.  Now that the sidecar projection
carries zc_freq_hz + zc_freq_above_range, plumb them through:

- sfm_webapp.html: inline suffix on existing Peaks cells, e.g.
  "Tran  0.04500 in/s · >100 Hz".  Empty suffix when no ZC is
  available (legacy events without a preserved .TXT).

- event_browser.html: new ZC Freq column on the per-channel stats
  table.  Required adding a parallel sidecar fetch in loadEvent()
  (waveform.json alone doesn't carry bw_report).  Fetch failure is
  non-fatal — falls back to "—" in the new column.

Above-range ZC peaks (BW ">100 Hz") render with a literal ">"
prefix mirroring the PDF, so operators don't have to generate the
PDF to see when a channel hit the zero-crossing ceiling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 18:47:37 +00:00
serversdown 780b45a371 feat: render ">100" for above-range ZC Freq instead of "—"
BW writes ">100 Hz" for ZC Freq when the zero-crossing algorithm sees a
peak too fast to count — the device's reporting ceiling is 100 Hz on
V10.72.  Our parser fell back to None via _parse_number (which requires
a leading digit), so the PDF rendered "—" where BW shows ">100".

Mirrors the OORANGE/saturated pattern already used for PPV and PSPL:
parser stores the threshold (100.0) on zc_freq_hz + sets a new
zc_freq_above_range flag.  Projection carries the flag through to the
sidecar; PDF renderer prepends ">" when set.

Affects both per-channel stats tables (waveform + histogram variants)
and the mic block's ZC Freq row.

Verified on the real T190LD5Q.LK0W fixture: Tran zc_freq_hz=100.0
above_range=True; Vert/Long (normal values) above_range=False; "N/A"
still produces zc_freq_hz=None which renders as "—" (unchanged).
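A small sketch of the value-plus-flag parse, assuming the field text looks like "47 Hz", ">100 Hz", or "N/A". `parse_zc_freq` and its regex are illustrative, not the parser's actual code.

```python
import re
from typing import Optional, Tuple

# Optional ">" marker, then a number that must start with a digit.
_ZC = re.compile(r"^(>)?\s*(\d+(?:\.\d+)?)")


def parse_zc_freq(text: str) -> Tuple[Optional[float], bool]:
    """Return (zc_freq_hz, above_range) for a ZC Freq field."""
    m = _ZC.match(text.strip())
    if m is None:
        return None, False  # e.g. "N/A"
    return float(m.group(2)), m.group(1) is not None
```

Keeping the numeric threshold in the value field means sorts and filters still see the channel, and the flag lets renderers add the ">" prefix.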

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 18:38:49 +00:00
serversdown f6abe3caa0 fix(report_pdf): histogram geo channels share nice-quantized y-axis
Two related visual bugs on histogram PDFs:

1. Per-channel auto-scale meant Tran/Vert/Long had different y-axes
   (e.g. 0-0.015, 0-0.025, 0-0.020) — bars looked taller on the
   channel that happened to be quietest.  Not directly comparable.

2. Footer "Amplitude Geo: X in/s/div" was just amax/5 of the FIRST
   geo channel with data, with no LSB quantization — producing
   nonsense like 0.003 in/s/div when the geophone LSB is 0.005.

Fix: compute a single shared geo y-axis range from max(Tran,Vert,Long),
quantize the per-division step to BW's 1-2-5 sequence rounded to the
0.005 LSB (0.005, 0.01, 0.025, 0.05, 0.1, 0.25, ...), apply the same
ylim + ticks to all three geo subplots, and use that same step for the
footer label.  MicL stays on its own auto-scale (different units).

Verified across edge cases including the reported event
(geo max 0.025 → 0.005/div, top 0.025), small PVS events, and large
blast amplitudes.
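One way to sketch the shared-axis quantization, assuming five divisions per axis (suggested by the old amax/5 footer) and the step sequence listed above. `shared_geo_axis` and `N_DIVISIONS` are hypothetical names.

```python
from typing import Tuple

N_DIVISIONS = 5            # assumption: five divisions per axis
_MANTISSAS = (5, 10, 25)   # thousandths of in/s: 0.005, 0.01, 0.025, then x10


def shared_geo_axis(
    tran_max: float, vert_max: float, long_max: float
) -> Tuple[float, float]:
    """Return (step_per_div, y_top) shared by all three geo channels."""
    amax = max(tran_max, vert_max, long_max)
    scale = 1
    while True:
        for mantissa in _MANTISSAS:
            step = mantissa * scale / 1000
            # Small tolerance so a peak exactly on the top edge still fits.
            if step * N_DIVISIONS >= amax - 1e-12:
                return step, step * N_DIVISIONS
        scale *= 10
```

For channel maxima of 0.015, 0.025, and 0.020 in/s this picks a 0.005 step with the axis top at 0.025, matching the reported event.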

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 18:22:20 +00:00
serversdown ad2702d4bf fix(report_pdf): add missing histogram_interval_size_s field
The histogram-interval-times derivation block at line 314 references
rd.histogram_interval_size_s, but the field wasn't declared on the
ReportData dataclass — only the string form histogram_interval_size
was.  Result: every PDF render of a histogram event raised
AttributeError → 500 from /db/events/{id}/report.pdf.

Cause: when the histogram aggregation block was inlined into
gather_report_data, the seconds-numeric counterpart that the
projection already carries (bw_report.histogram.interval_size_s) was
never wired into the dataclass.  Waveform PDFs weren't affected
because the offending line is gated on is_histogram.

Fix: add the field, read it from the projection alongside the other
histogram keys.  No-op for waveform events (the field stays None and
the gate skips it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 18:07:41 +00:00
serversdown 86325b9bab docs: roadmap entry for a SECOND undecoded histogram sub-format (S353)
Observed in fresh ingest logs on 2026-05-28: BE17353 events
(S353L4H2.FZ0H, S353L4H2.P00H, etc.) cause "body codec failed to
decode" warnings.  Different from the byte[5]!=0 case already tracked
(T190 / O121) — these have byte[5]==0x00 with what looks like a
valid block header, but the walker finds zero data blocks anyway.

Operational impact identical to the existing case: ingestion
succeeds, DB peaks come from bw_report overlay, only the chart is
empty.  No data loss.

Pinning so it doesn't get lost — needs a hex dump of one body to
work out what's different about these.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:42:18 +00:00
serversdown 6381dcb312 tz: server-wide display timezone via TZ env var (default EST/EDT)
User-reported issue: server logs were timestamped in UTC ("05:36:20"
when local was ~01:36 EDT), and the PDF report's "Created" footer
similarly showed raw UTC.  Inconsistent with the modal which already
converts to browser local via toLocaleString.

Solution: standard Linux TZ env var.  Set once in the container, and:
  - Python's datetime.now() uses local
  - Logging module's timestamps use local
  - matplotlib renderers + report_pdf formatters use local
  - astimezone() conversions resolve to the configured TZ

DB columns stay UTC (created_at uses SQLite's strftime('%Y-...Z', 'now')
which is always UTC, regardless of TZ env var — proper "store UTC,
display local" pattern).

Changes:
  - Dockerfile: install tzdata (python:3.11-slim omits the timezone
    database), set default TZ=America/New_York
  - sfm/report_pdf.py: _fmt_iso_to_bw and _split_iso_to_date_time now
    convert UTC inputs (Z-suffixed) to local via astimezone(); naïve
    inputs (BW recorded-at, already unit-local) returned as-is.
    New _to_display_local helper centralizes the logic.
  - "Created" line in the PDF page footer now uses the converted
    timestamp.

Override per-deployment via the TZ env var in docker-compose
(separate commit on terra-view side).
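The "store UTC, display local" rule can be sketched as below. `to_display_local` is an illustrative stand-in for the helper named above, and the explicit `tz` parameter is an addition for testability; passing None falls back to the process-local zone, which is what the TZ env var controls.

```python
from datetime import datetime, timezone, tzinfo
from typing import Optional


def to_display_local(iso: str, tz: Optional[tzinfo] = None) -> datetime:
    """Z-suffixed (UTC) input becomes local; naive input is returned as-is."""
    if iso.endswith("Z"):
        utc = datetime.fromisoformat(iso[:-1]).replace(tzinfo=timezone.utc)
        return utc.astimezone(tz)  # tz=None means the process-local zone
    return datetime.fromisoformat(iso)  # unit-local time, left untouched
```

Stripping the "Z" by hand keeps the sketch working on Python versions whose `fromisoformat` does not accept the suffix.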

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:41:10 +00:00
serversdown 53c05d93e2 delete: also clean up preserved _ASCII.TXT file
_cleanup_event_files() removes the on-disk artifacts when an event is
hard-deleted (binary, a5_pickle, sidecar, h5).  Today's .TXT
preservation feature added a new on-disk file (_ASCII.TXT next to the
binary) but the cleanup didn't know about it — so any event deleted
via /db/events/{id} (single) or /db/events/delete_bulk (or the
Terra-View "SFM Event DB Manager" UI which proxies through to those
endpoints) was leaving orphan .TXT files in the store.

Added "txt" to the cleanup list using the new
WaveformStore.txt_path_for().  Safe for old events without a .TXT —
the exists() check skips the unlink.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:31:08 +00:00
serversdown a5888e1b5c report_pdf: PDF histogram aggregation + fix footer/x-axis overlap
Two issues spotted on a histogram event PDF:

1. Footer scale ("Time — /div  Amplitude Geo: X in/s/div  Mic: Y
   psi(L)/div") was overlapping horizontally with the x-axis tick
   labels (0, 20, 40, 60...).  Both rendered on the same Y row.
   Fix: bumped gridspec bottom margin from 0.06 → 0.12, moved the
   footer text from y=0.045 → y=0.030 (below the tick labels), moved
   the page-bottom Created/Event line from y=0.015 → y=0.005.
   Trigger legend on waveforms moved 0.030 → 0.018.  Everything
   stacks cleanly now without collision.

2. PDF was showing the raw codec output (~150+ bars per histogram)
   instead of BW's per-interval aggregation.  Why: the aggregation
   I'd added to /db/events/{id}/waveform.json wasn't replicated in
   the PDF gather path.  Now: gather_report_data does the same
   max-per-group aggregation when bw_report.histogram.n_intervals is
   populated, AND derives per-interval HH:MM:SS labels from the
   start time + interval_size_s.  Result: histogram PDFs now match
   BW's display (one bar per BW interval, x-axis labeled with actual
   times) — same fix as the modal chart, applied to the PDF.

For events ingested BEFORE the parser extension (no histogram block
in their sidecar), aggregation is a no-op — they still render with
per-block bars + interval-index x-axis (but the overlap fix applies
to them too).  Re-forwarding repopulates the histogram block.
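The per-interval label derivation can be sketched like this, assuming each label marks the start of its interval (the log does not say whether BW labels the start or the end). `interval_labels` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import List


def interval_labels(
    start: datetime, interval_size_s: float, n_intervals: int
) -> List[str]:
    """HH:MM:SS label for each interval, counted from the histogram start."""
    return [
        (start + timedelta(seconds=i * interval_size_s)).strftime("%H:%M:%S")
        for i in range(n_intervals)
    ]
```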

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:33:53 +00:00
serversdown b9f8bbb220 viewers: enforce minimum Y-range on histogram channels
Quiet histogram events were filling the chart panel even though the
peak was tiny (0.005 in/s rendered as 90% of chart height because
Chart.js auto-scaled to peak * 1.1).  Made everything look uniformly
loud regardless of actual amplitude.

BW's solution: a near-fixed scale per channel ("Geo: 0.002 in/s/div"
from the footer).  Quiet events render small, loud events render
proportionally tall.

Match the intent without copying BW's "no Y-axis labels at all"
convention.  For histogram channels:

  Geo (in/s):       min Y range 0.05 in/s
  Mic in psi:       min Y range 0.001 psi
  Mic in dBL:       unchanged (the 60 dBL floor + peak+5 top already
                    gives quiet events a sensible baseline)

So a 0.005 in/s geo event renders as ~10% of chart height; a 0.05
event fills it; a 5.0 event still fills it (max(peak*1.1, 0.05) ==
peak*1.1 for any peak > 0.045).
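The floor rule is small enough to restate directly. This is a Python restatement for illustration (the viewers themselves are Chart.js pages), with hypothetical constant and function names.

```python
GEO_MIN_RANGE = 0.05       # in/s, from the table above
MIC_PSI_MIN_RANGE = 0.001  # psi, from the table above


def histogram_y_max(peak: float, min_range: float) -> float:
    """Top of the Y axis: 10% headroom over the peak, never below the floor."""
    return max(peak * 1.1, min_range)
```

A 0.005 in/s peak therefore sits at 0.005 / 0.05 = 10% of the chart height, while any peak large enough for peak * 1.1 to exceed the floor scales as before.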

Waveform charts unchanged — they should zoom for shape detail.
Applied to both the modal in sfm_webapp.html and the standalone
/events page in event_browser.html.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:23:01 +00:00
serversdown b59f886cb7 docs: roadmap entry for sensor-check waveform extraction
BW's Event Report PDFs include a per-channel sensor-check response
waveform on the right side of the bottom plot (damped sinusoid for
geo channels, sawtooth-at-test-freq for mic).  Looks like real
per-sample data extracted from the binary, not synthesized.

Our parser captures the test results (freq, ratio, amplitude,
pass/fail) but not the waveform samples — so the report shows text
only for sensor check.  Pinning a roadmap entry to investigate the
binary for the sample data (path a) or fall back to synthesized
visualization (path b).

Current text-only display is operationally sufficient.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:17:50 +00:00
serversdown 87aec3f4d1 viewers: smoother mic dBL chart + restore binary/TXT download links
Two issues spotted in the modal:

1. Mic dBL chart looked spikey/discontinuous — isolated bars at 80-95
   with gaps in between.  Cause: _psiToDbl() returns null for zero or
   negative samples, and most mic samples on a quiet event sit at the
   digitization noise floor where they're effectively zero.  Result:
   the chart only renders the moments when instantaneous SPL exceeded
   the Y-axis bottom — looks like a sound trigger gate.

   Fix: new _psiToDblForChart() rectifies the AC waveform (abs), then
   converts to dBL, then floors at MIC_DBL_FLOOR=60 dBL.  Chart now
   has a continuous 60 dBL baseline with peaks above it — matches how
   acoustic engineers expect SPL-vs-time.  Y-axis bottom pinned to
   MIC_DBL_FLOOR, top to peak + 5 dB headroom.  Peak label still uses
   the unrectified _psiToDbl so the displayed peak value is exact.

2. Filename in Source/Files block was unlinked.  Endpoint exists
   (/db/events/{id}/blastware_file) — just wasn't wired to the modal.
   Made it a clickable download link.  Same treatment for the
   preserved .TXT — added "(download .TXT)" link next to source kind
   when source.txt_filename is populated (events ingested after the
   .TXT preservation feature landed; older events show no link).

Applied to both the inline modal in sfm_webapp.html and the
standalone /events page in event_browser.html.
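The rectify, convert, floor sequence from fix 1 can be sketched in Python as follows. The reference pressure is an assumption: the sketch uses the conventional 20 µPa acoustic reference converted to psi, which may differ from the constant in the viewer code.

```python
import math
from typing import List, Sequence

MIC_DBL_FLOOR = 60.0          # dBL baseline named in the commit message
P_REF_PSI = 20e-6 / 6894.757  # assumption: 20 µPa reference, expressed in psi


def psi_to_dbl_for_chart(samples_psi: Sequence[float]) -> List[float]:
    """Rectify each sample, convert to dBL, and clamp to the floor."""
    out = []
    for p in samples_psi:
        magnitude = abs(p)  # rectify the AC waveform
        if magnitude > 0:
            dbl = 20.0 * math.log10(magnitude / P_REF_PSI)
        else:
            dbl = MIC_DBL_FLOOR  # log10(0) is undefined; use the baseline
        out.append(max(dbl, MIC_DBL_FLOOR))
    return out
```

Zero and near-zero samples land on the 60 dBL baseline instead of producing gaps, which is what gives the chart a continuous trace.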

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:08:21 +00:00
serversdown ace542cba5 report_pdf: wire histogram peak date/time + PVS-when + Finish field
Spotted comparing our PDF to BW's reference for T003LLUB.CE0H:
  - Finish blank
  - Per-channel Date / Time rows all dashes
  - MicL PSPL line missing "on May 27, 2026 at 06:19:14"
  - Peak Vector Sum missing "on May 27, 2026 At 06:06:14"

Root cause: I'd added these fields to the projection (write side) in
_bw_report_to_dict but never wired them into gather_report_data
(read side).  Plus the projection used keys "start"/"stop" while
gather was reading "start_str"/"stop_str" — typo'd lookup.

Fixes:
  - gather_report_data now reads bw_report.histogram.start /
    .stop / .channel_peak_when (correct keys, matching the projection)
  - Per-channel "peak_date" / "peak_time" populated from
    channel_peak_when[<channel>] for the histogram stats table
  - MicL PSPL line formats as "PSPL  125.7 dB(L) on May 27, 2026
    at 06:19:14" (BW style) when channel_peak_when["MicL"] is present;
    falls back to the waveform-relative "at 0.012 sec" otherwise
  - PVS line formats as "Peak Vector Sum  0.091 in/s on May 27, 2026
    At 06:06:14" (BW style) when bw_report.peaks.vector_sum.when is
    populated; falls back to the relative time_s for waveforms
  - New _split_iso_to_date_time() helper splits ISO timestamps into
    BW-formatted ("May 27 /26", "06:06:14") date+time pairs for the
    stats table's separate Date and Time rows
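A sketch of the date/time split, built only from the one example format quoted above ("May 27 /26", "06:06:14"). `split_iso_to_date_time` here is an illustrative reimplementation; how BW pads single-digit days is not shown in this log, so the unpadded day is an assumption.

```python
from datetime import datetime
from typing import Tuple

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def split_iso_to_date_time(iso: str) -> Tuple[str, str]:
    """Split an ISO timestamp into BW-style date and time strings."""
    dt = datetime.fromisoformat(iso)
    # Month names come from a fixed tuple so the output is locale-independent.
    date = f"{_MONTHS[dt.month - 1]} {dt.day} /{dt.year % 100:02d}"
    return date, dt.strftime("%H:%M:%S")
```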

Events ingested BEFORE the parser extension landed (most of the
existing prod corpus) still show dashes — their sidecars lack the
histogram block.  Re-forwarding repopulates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 22:47:53 +00:00
serversdown 8cbda09917 viewers: render timestamps in browser-local time
Spotted on the SFM webapp event modal — "Received by server at" was
showing the raw ISO string "2026-05-27T21:59:57.213043Z" because we
were assigning ev.timestamp / src.captured_at directly to the
textContent of the modal fields, bypassing the existing _fmtTs()
helper that wraps them in toLocaleString().

Net effect for operators: confusing "21:59 vs it's 6 PM" mismatch
when the displayed UTC timestamp didn't match wall-clock time.  The
values were always correct; the display was just ambiguous.

After this fix:
  - "Recorded at" (naive ISO from BW = unit local time) renders
    cleanly as the unit wrote it: "5/27/2026, 6:00:13 AM"
  - "Received by server at" (UTC with Z suffix) converts to browser
    local: "5/27/2026, 5:59:57 PM"
  - Timestamp column in the history table already used _fmtTs —
    unchanged
  - Same fix applied to the standalone /events page (sidebar event
    list + meta header) via a new _fmtTsLocal helper

Note: did NOT add file-mtime-on-watcher-PC tracking as a separate
"Called in at" column — discussed and decided created_at is close
enough for schedule-compliance monitoring (worst case lag = watcher
poll interval ~60s, indistinguishable from BW write time at the
operationally-relevant resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 22:30:43 +00:00
serversdown 3457ed0072 bw_ascii_report: parse OORANGE saturation marker + TimeSum typo
BW writes "OORANGE" (truncation of "Out Of Range") when a channel
exceeds its full-scale, and uses a typo'd label "Peak Vector Sum
TimeSum" for the PVS time field.  Both confirmed against real ASCII
files pulled from a Windows watcher PC 2026-05-27:

  T190LD5Q.LK0W  Vert PPV = OORANGE  (Normal range, 10 in/s exceeded)
  T438L713.RY0W  All three PPVs OORANGE  (Sensitive range, 1.25 in/s)
  K557L3YM.OE0W  Tran+Vert PPV OORANGE + MicL PSPL OORANGE

Previously our _parse_number() returned None for OORANGE → DB columns
ended up NULL → events vanished from filters / sorts / dashboards
despite being legitimate high-amplitude events.

New behavior — substitute a conservative bound + set a saturation flag:
  - Channel PPV       → geo_range_ips + ChannelStats.ppv_saturated
  - Peak Vector Sum   → sqrt(3) * geo_range_ips + peak_vector_sum_saturated
  - MicL PSPL         → 140 dB(L) + MicStats.pspl_saturated

Flags propagate to the sidecar's bw_report block so the SFM UI can
render "> 10 in/s" / "> 140 dBL" rather than treating the substituted
value as exact.
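The substitution rule can be sketched as below, with hypothetical helper names. The sqrt(3) factor is the vector sum of three axes that are each at full scale: sqrt(r^2 + r^2 + r^2) = sqrt(3) * r.

```python
import math
from typing import Optional, Tuple

MIC_SATURATION_DBL = 140.0  # bound quoted in the commit message


def parse_ppv(text: str, geo_range_ips: float) -> Tuple[Optional[float], bool]:
    """Return (ppv, saturated); OORANGE becomes the range bound plus a flag."""
    text = text.strip()
    if text == "OORANGE":
        return geo_range_ips, True
    try:
        return float(text), False
    except ValueError:
        return None, False


def saturated_vector_sum(geo_range_ips: float) -> float:
    """Conservative PVS bound when all three axes are at full scale."""
    return math.sqrt(3) * geo_range_ips
```

On the Normal range (10 in/s) the PVS bound works out to about 17.32 in/s.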

Same commit also accepts "Peak Vector Sum TimeSum" as an alias for
"Peak Vector Sum Time" (BW always writes the typo on OORANGE PVS
lines — every example file confirms it).

Tests: new test_oorange_marker_treated_as_saturation (synthetic) +
test_real_oorange_event_t190_parses (skips if real fixture absent).
177/177 tests pass; 16 pre-existing missing-fixture skips unchanged.

Five events on prod (T190, T438, K557, plus 2 others matching the
same fault pattern) will pick up correct peaks + saturation flags
once watchers re-forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:32:56 +00:00
serversdown d21e3b5298 histogram aggregation + parser extension for BW interval fields
Three layered changes that together make histogram charts visually
match BW's printout (one bar per interval, not per codec block):

1. bw_ascii_report parser captures histogram fields it previously
   dropped:
     - Histogram Start/Stop Time + Date → datetime
     - Number of Intervals + Interval Size (string + parsed seconds)
     - <Channel> Peak Time + Peak Date → datetime (per-channel)
     - Peak Vector Sum Date (combined with PVS Time → datetime;
       clears the bogus seconds parse that interpreted "22:33:52"
       as 22.0)
   New _parse_iso_date() handles BW's ISO format for histograms
   (waveforms use "May 8, 2026" long form).  New _parse_interval_size()
   handles "1 minute" / "5 minutes" / "15 seconds" etc.

2. _bw_report_to_dict() projects the new fields into a new
   bw_report.histogram block in the sidecar.

3. /db/events/{id}/waveform.json wraps the existing path 1 (HDF5)
   output with _maybe_aggregate_histogram(): when the event is a
   histogram AND the sidecar has bw_report.histogram.n_intervals,
   group the codec's per-block samples into N intervals via
   max-per-group and return the aggregated array.  time_axis gains
   histogram_aggregated / n_intervals / interval_size_s / interval_times
   fields.
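A minimal sketch of max-per-group aggregation, assuming the per-block samples are split into contiguous, near-equal groups. The log does not state how blocks are assigned to intervals, so the grouping rule and the `aggregate_max_per_group` name are assumptions.

```python
from typing import List, Sequence


def aggregate_max_per_group(
    samples: Sequence[float], n_intervals: int
) -> List[float]:
    """Collapse per-block samples into n_intervals bars, keeping each max."""
    total = len(samples)
    if n_intervals <= 0 or total <= n_intervals:
        return list(samples)  # nothing to aggregate
    out = []
    for i in range(n_intervals):
        lo = i * total // n_intervals
        hi = (i + 1) * total // n_intervals
        out.append(max(samples[lo:hi]))
    return out
```

Taking the maximum preserves each interval's peak, which is the quantity a histogram bar is meant to show.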

Frontend (both modal chart in sfm_webapp.html + standalone event
browser) uses interval_times as x-axis labels when provided (BW-style
HH:MM:SS), falls back to interval index.

Defensive: aggregation is no-op when the sidecar lacks the histogram
block (events ingested before this change).  Activates automatically
on prod once a watcher re-forward populates new sidecars.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:23:05 +00:00
serversdown ad2b553c7b ingest: preserve raw BW ASCII report (.TXT) alongside the binary
Previously the .TXT was parsed into the sidecar's bw_report projection
and then discarded at ingest time.  Now save_imported_bw() writes it
to <store>/<serial>/<filename>_ASCII.TXT permanently.

Rationale: with BW Mail / Forwarding Agent being phased out of the
operator workflow, the XML/PDF/WMF those tools produce won't be
available — the binary + .TXT (created by BW ACH itself) are our
only authoritative inputs going forward.  Keeping the raw .TXT
unlocks:

  - Parser bug fixes can be applied RETROACTIVELY by re-parsing the
    stored .TXT, instead of requiring a re-forward from the watcher
    PC (which lost the .TXT after BW ACH cleanup).
  - Audit trail of what BW actually sent us, for debugging.
  - The five known parser-PPV-miss events will be re-parseable once
    the regex fix lands (instead of staying broken indefinitely).

Storage cost: ~15 KB per event × 14k events = ~210 MB on the
existing prod corpus.  Negligible.

Implementation:
  - WaveformStore gains txt_path_for() + open_txt()
  - save_imported_bw() writes the .TXT when bw_report_text is supplied
  - sidecar source block records the txt_filename
  - backfill_sidecars.py preserves txt_filename across regens
  - New GET /db/events/{id}/ascii_report.txt endpoint serves it
  - Returns 404 for events ingested before this change (no .TXT in
    the store yet) — re-forward to populate
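
A rough sketch of the path layout and the storage estimate above.  The
free-function signature is hypothetical (the commit adds txt_path_for()
as a WaveformStore method), and whether <filename> keeps the binary's
extension is an assumption.

```python
from pathlib import Path


def txt_path_for(store_root, serial, filename):
    """Build <store>/<serial>/<filename>_ASCII.TXT (hypothetical signature)."""
    return Path(store_root) / serial / f"{filename}_ASCII.TXT"


def estimate_mb(kb_per_event, n_events):
    """Total storage in MB, taking 1 MB = 1000 KB."""
    return kb_per_event * n_events / 1000
```

estimate_mb(15, 14_000) reproduces the ~210 MB figure quoted above.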

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:01:12 +00:00
serversdown dfbc8b8520 report_pdf: split waveform vs histogram layouts (BW PDF iteration)
Reviewed against real Blastware Event Report PDFs (uploaded to
example-events/pdfsnstuff/) for K558LLB7.V20H (histogram) and
K558LLB8.0E0W (waveform).  Each event type has its own layout because
BW's printouts genuinely differ:

  Waveform header:   Date/Time, Trigger Source, Range, Sample Rate
  Histogram header:  Start, Finish, Intervals At Size, Range, Sample Rate
                     (no trigger field — histograms aren't triggered)

  Waveform stats:    PPV, ZC Freq, Time (Rel. to Trig),
                     Peak Acceleration, Peak Displacement, Sensor Check
  Histogram stats:   PPV, ZC Freq, Date, Time (of peak), Sensor Check

  Waveform plot:     4-channel stacked line, x-axis in SECONDS,
                     trigger triangle + window markers, symmetric Y
                     for geo, zero-anchored mic, "0.0" baseline label
                     on right edge per BW convention
  Histogram plot:    4-channel stacked bars, Y-axis 0-to-peak only
                     (never negative — peaks are magnitudes), 0.0
                     baseline at the bottom

  Waveform footer:   USBM chart placeholder upper-right;
                     "Time X sec/div   Amplitude Geo: Y in/s/div   Mic: 0.001 psi(L)/div"
                     "Trigger = ▶━━◀"
  Histogram footer:  No USBM chart; same scale-info footer with
                     interval-size as the time unit

Other fixes from the first-pass screenshot review:
  - Channel labels (MicL/Long/Vert/Tran) no longer cut off (wider
    left margin)
  - Histogram bars rise from zero baseline (abs of any signed values)
  - ISO timestamp "2026-05-16T22:33:50" → "22:33:50 May 16, 2026"
    matching BW's display format
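
The timestamp reformatting in the last bullet could look something like
this.  It is only a sketch: the function name is hypothetical, and
using the full month name (%B, under the default C locale) is an
assumption, since "May" does not show whether BW abbreviates.

```python
from datetime import datetime


def to_bw_display(iso_text):
    """Convert "2026-05-16T22:33:50" to "22:33:50 May 16, 2026"."""
    dt = datetime.fromisoformat(iso_text)
    # dt.day is used instead of %d so single-digit days are not zero-padded.
    return f"{dt:%H:%M:%S} {dt:%B} {dt.day}, {dt.year}"
```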

Known gaps (separate work):
  - Histogram codec returns per-block granularity (~200 bars for
    BW's 4-interval display).  XML-driven data source is the planned
    fix; the structured BW XML has the per-interval aggregates.
  - USBM RI8507 / OSMRE compliance chart still placeholder

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 18:22:03 +00:00
serversdown 411ef8139e sfm: Event Report PDF generation (v0.20.0 stub layout)
New endpoint GET /db/events/{id}/report.pdf returns a single-page
letter-portrait PDF for any event with waveform data on disk.

Architecture:
  sfm/report_pdf.py — gather_report_data() assembles fields from
    SeismoDb row + .sfm.json sidecar (bw_report block) + .h5 samples;
    render_event_report_pdf() turns that into PDF bytes via matplotlib.
  sfm/server.py — new endpoint wires them together, streams PDF back
    with Content-Disposition: inline so the browser displays it.
  sfm_webapp.html — new "Download PDF" button in the event modal
    footer that opens the endpoint in a new tab.

Fields surfaced — same coverage as a Blastware Event Report:
  Header metadata (date/time, trigger source, range, sample rate,
                   project, client, operator, location, serial+firmware,
                   battery, calibration, file name)
  Microphone block (PSPL in dB(L) + psi, ZC freq, channel test)
  Per-channel stats (PPV, ZC Freq, Time of Peak, Peak Accel,
                     Peak Disp, Sensor Check) for Tran/Vert/Long
  Peak Vector Sum
  Waveform plot (MicL/Long/Vert/Tran stacked, shared time axis,
                 trigger marker, symmetric Y for geo, zero-anchored
                 mic) — OR per-interval bar chart for histograms.

Rendering pipeline = matplotlib only (vector PDF, no headless-browser
dep).  Adds matplotlib>=3.8 to deps.

Visual layout is approximate until reference PDFs from Instantel land
at docs/reference/instantel/ for iteration.  USBM RI8507 / OSMRE
compliance chart is stubbed (placeholder rectangle) — separate work
item.

Smoke-tested on a K558 waveform event: 77 KB valid PDF, all fields
populated correctly from the snapshot DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 02:55:58 +00:00
serversdown ed926de3f4 viewers: default mic to dB(L) + add Mic-unit toggle (dBL ↔ psi)
The sidecar-modal waveform plot was rendering mic in raw psi, while the
rest of SFM (history table column, peaks block, live-device chart,
event detail modal mic field) had already converted to dB(L) — matching
the BW Event Report convention.  Unifying.

Both viewers now:
  - Default mic chart values + axis title + peak label to dB(L)
  - Provide a header toggle ("Mic: dBL" pill) to flip to psi
  - Persist the preference via localStorage (sfm_mic_unit)
  - Re-render the open chart immediately on toggle

Conversion: dBL = 20 * log10(psi / 2.9e-9), where 2.9e-9 psi is the
20 µPa reference pressure already defined for the rest of the webapp.
Non-positive psi samples (log undefined) render as null; Chart.js
handles them as gaps in line mode and missing bars in histogram mode.
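
For reference, the conversion written out in Python (the viewers
themselves are JavaScript, so this is an equivalent sketch, and the
function names are hypothetical):

```python
import math

PSI_REF = 2.9e-9  # 20 µPa expressed in psi, as quoted above


def psi_to_dbl(psi):
    """Convert a psi sample to dB(L); non-positive input maps to None."""
    if psi is None or psi <= 0:
        return None  # log undefined: the chart renders this as a gap
    return 20.0 * math.log10(psi / PSI_REF)


def dbl_to_psi(dbl):
    """Inverse conversion, used when the toggle flips back to psi."""
    return PSI_REF * 10 ** (dbl / 20.0)
```

A sample equal to the reference pressure maps to 0 dB(L), and every
factor of 10 in psi adds 20 dB(L).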

Also fixes event_browser.html's stats table — the MicL row was
hard-coding "<value> psi"; now honors the same toggle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 02:30:56 +00:00
serversdown 5d5441604b viewers: symmetric Y-axis on geo waveforms + clarify timestamp labels
Two fixes from the second screenshot review:

1. Geophone waveform Y-axis now renders SYMMETRIC around zero — zero
   line sits in the middle of the chart, signal goes both above and
   below.  Standard seismograph display convention; matches the
   Instantel printout look.  Previously Chart.js auto-scaled to the
   data range so e.g. Vert showing values from -0.005 to -0.015 had
   the zero line completely off-screen.

   Mic channel (sound pressure, always positive) keeps the default
   auto-scale anchored at zero.  Histograms (per-interval peaks, also
   always positive) likewise keep bars rising from a zero baseline.

2. Modal labels clarified to remove the 'Timestamp' vs 'Captured at'
   ambiguity:
     'Timestamp'   →  'Recorded at'         (when the seismograph
                                              recorded the event —
                                              from BW report's Event
                                              Time field)
     'Captured at' →  'Received by server at' (when our sfm-db
                                              inserted the row)
   Both have tooltips explaining the distinction.
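
The symmetric-axis rule from item 1 amounts to taking the largest
absolute sample and mirroring it around zero.  A minimal sketch in
Python (the viewers configure Chart.js directly; the function name, the
padding parameter, and the fallback span are assumptions):

```python
def symmetric_limits(samples, pad=0.05):
    """Return (ymin, ymax) centred on zero for a geophone trace."""
    peak = max((abs(v) for v in samples), default=0.0)
    if peak == 0.0:
        peak = 1.0  # hypothetical fallback so a flat trace still gets an axis
    span = peak * (1.0 + pad)
    return (-span, span)
```

For the Vert example above (values between -0.005 and -0.015) this
yields limits of about ±0.016, so the zero line sits mid-chart.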

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 20:26:23 +00:00
serversdown 784f2cca36 viewers: decimal peak labels + bar chart for histograms + clean x-axis ticks
Three polish fixes spotted in the first prod screenshot of the inline
event-modal waveform plot:

1. Peak labels were rendering as "PEAK 2.500E-2 IN/S" because of a
   blanket toExponential(3) call.  New _fmtPeak() formatter picks
   decimal with adaptive precision for normal-range values (0.0001 to
   10000) and falls back to scientific only for truly extreme
   magnitudes.  Same value now reads "peak 0.0250 in/s".

2. Histogram events were being plotted as connected line charts, but
   histograms are per-INTERVAL peaks (one bar per minute, typically),
   not per-sample waveforms.  Now: detect histogram via record_type,
   render as a tight bar graph (bars touch), suppress the trigger line
   + zero baseline overlays (no trigger event on a histogram), and
   label the x-axis with interval number instead of milliseconds.

3. X-axis tick labels were displaying as "11.7187040000000002 ms"
   because the callback used the raw float, not the formatted label.
   Snap to 1 decimal place (or integer for whole-number values like
   histogram intervals).
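
The adaptive-precision idea behind _fmtPeak() in item 1 can be sketched
as follows.  This is a Python approximation, not the JavaScript in the
viewers; the three-significant-digit rule is an assumption chosen so
that 0.025 renders as "0.0250".

```python
import math


def fmt_peak(value, sig=3):
    """Decimal with adaptive precision in the normal range, else scientific."""
    if value == 0:
        return "0"
    mag = abs(value)
    if 1e-4 <= mag < 1e4:
        # Enough decimals to show `sig` significant digits, never fewer than 0.
        decimals = max(0, sig - 1 - math.floor(math.log10(mag)))
        return f"{value:.{decimals}f}"
    return f"{value:.{sig}e}"
```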

Applied to both the inline modal plot in sfm_webapp.html and the
standalone /events viewer in event_browser.html — they share the same
data shape and presentation conventions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 19:54:04 +00:00
serversdown 6abfadae4f viewers: render pre-trigger samples (time_axis is metadata, not an array)
The /db/events/{id}/waveform.json endpoint returns `time_axis` as a
metadata object — {sample_rate, pretrig_samples, t0_ms, dt_ms,
n_samples, total_samples, rectime_seconds} — not a per-sample times
array.  Both viewers (sfm_webapp.html sidecar modal + event_browser.html)
were treating it as an array, silently falling back to a derived path
that ignored pretrig entirely and started the time axis at 0.

Symptom: trigger line drawn at the very left edge of every chart, no
visible "leading up to the event" samples even though they're in the
decoded data.

Fix: read time_axis.t0_ms (negative when pretrig samples exist),
time_axis.dt_ms, build per-sample times as `t0_ms + i * dt_ms`.  Trigger
line lands at sample where t crosses 0; pretrig samples render at
negative t to the left of it.
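
The same construction in Python, as a sketch only (the viewers are
JavaScript; the function names are hypothetical, and the sample count
is passed in explicitly rather than read from a particular time_axis
field):

```python
def build_times_ms(t0_ms, dt_ms, count):
    """Per-sample times in ms: t0_ms + i * dt_ms for each sample index i."""
    return [t0_ms + i * dt_ms for i in range(count)]


def trigger_index(times_ms):
    """Index of the first sample at or after t = 0, or None if there is none."""
    for i, t in enumerate(times_ms):
        if t >= 0:
            return i
    return None
```

With 208 pretrig samples at 1024 sps, dt_ms is 1000/1024 and t0_ms is
-208 * dt_ms = -203.125, so sample 208 lands at t = 0.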

Confirmed on a K558 event with 208 pretrig samples + 2 sec rectime at
1024 sps — time axis now spans -203 ms to +2046 ms, trigger line at
~9% from the left edge as expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 21:58:20 +00:00
serversdown fd0e28657d sfm_webapp: default to Database view + sortable columns + inline waveform plot
Three UX upgrades to the main SFM webapp at /, all reinforcing the
'browse stored events' flow as the primary entry point:

1. Default section is now Database, not Live Device.  Most users land
   here to look at stored events; Live Device is opt-in (click the tab
   to talk to a unit).  Initial history + units fetch fires on first
   paint so the table is populated when the page loads.

2. History table columns are sortable.  Click any header to sort:
   timestamp, serial, per-channel PPV (Tran/Vert/Long), PVS, mic dB(L),
   project, client, type, key.  Default direction varies by column type
   (desc for numbers + timestamps, asc for text).  Sort arrows appear
   in the active column header.  Headers are sticky so they stay
   visible while scrolling.

3. Click-event-to-see-waveform.  The existing sidecar review modal now
   renders the 4-channel waveform plot inline at the top, fetched from
   /db/events/{id}/waveform.json in parallel with the sidecar fetch.
   Channels stacked MicL / Long / Vert / Tran (Instantel printout
   order), shared bottom time axis, dashed trigger line + triangle
   markers at t=0, zero baseline with "0.0" label on the right edge,
   peak callouts per channel.  Charts cleaned up on modal close.

Resolves the "where is the viewer" surprise — operators no longer need
to know about the /events route to see waveforms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:39:18 +00:00
serversdown c14a8c54db event_browser: Instantel-printout-style polish
Apply the cheap visual wins from the BW Event Report layout:

  1. Channel order reversed → MicL (top), Long, Vert, Tran (bottom)
     to match the Instantel printout.
  2. Shared bottom time axis — x-axis ticks only render on the
     bottom-most data channel; other channels hide ticks so all four
     visually share one time scale.
  3. Triangle trigger markers above and below the t=0 dashed line.
  4. Horizontal zero-baseline (dotted) per channel with "0.0" label
     on the right edge — Instantel convention.
  5. "Print view" toggle that flips dark→light theme (white panels,
     light grids, dark text) so the viewer can render usefully on
     paper-style output / @media print.
  6. Per-channel PPV stats table in the metadata header, with Peak
     Vector Sum displayed prominently.
  7. Colors adjusted to approximate BW trace colors (magenta MicL,
     blue Long, green Vert, red Tran).

Future PDF-export work will reproduce the same layout server-side
once you upload a real example PDF and we pick a rendering pipeline
(weasyprint / chromium --print-to-pdf / etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:09:12 +00:00
serversdown 460006e5cd sfm: stored-event browser at /events
New standalone HTML page (sfm/event_browser.html, ~470 lines, Chart.js)
that lets you browse persisted events from the SeismoDb + WaveformStore.
Companion to the existing live-device viewer at /waveform:

  /waveform  — connect to a unit and pull events in real time
  /events    — browse events already stored in the DB

Flow:
  1. Page loads → GET /db/units → populate serial dropdown
  2. Select serial → GET /db/events?serial=X&limit=500 → event list
  3. Click event → GET /db/events/{id}/waveform.json → render

Layout is Instantel-printout-ready: channels stacked vertically in
Tran / Vert / Long / MicL order, trigger line at t=0, peak labels,
clean dark theme.  Frames the future PDF-export feature without
needing extra layout work.

Smoke-tested against the dev prod-snapshot — 4 channels render with
correct peaks for K558 events (L=0.3 in/s = the offset-fault peak
we've been chasing all week).

CHANGELOG entry added under [Unreleased] per the v0.20.0 release plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 06:53:48 +00:00
serversdown 8710b8f327 docs: record three known issues discovered during prod deployment
1. bw_ascii_report parser misses PPV/vector_sum fields on certain TXT
   formats (5 events in prod).  Parser extracts every OTHER field for
   the same channels — likely a regex / format mismatch specific to
   some firmware-or-event-type combination.

2. NULL-timestamp duplicate rows.  events.timestamp can come back as
   NULL when the codec can't extract a footer timestamp; UNIQUE(serial,
   timestamp) doesn't fire on NULL, so backfills create new rows
   instead of upserting.  2 affected events on prod, easy SQL cleanup.

3. Histogram body sub-format with byte[5] != 0.  ~3 events on prod
   (T190LD5Q, O121L4L1) use a histogram body the walker doesn't
   recognize.  Codec returns 0 valid blocks; DB peaks come from the
   bw_report ASCII overlay so DB columns are correct, only the .h5
   plot is empty.  Cracking the sub-format unlocks the plot.
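
The NULL behaviour in item 2 is easy to reproduce in isolation.  The
sketch below uses an in-memory SQLite database with a cut-down,
hypothetical events table; that the production database is SQLite is an
assumption, though treating NULLs as distinct under UNIQUE is also the
SQL-standard behaviour.

```python
import sqlite3


def demo_null_unique():
    """Show that UNIQUE(serial, timestamp) does not reject NULL timestamps."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, serial TEXT, timestamp TEXT, "
        "UNIQUE(serial, timestamp))"
    )
    insert = "INSERT INTO events (serial, timestamp) VALUES (?, ?)"
    # Same serial, NULL timestamp twice: both inserts succeed.
    con.execute(insert, ("BE9558", None))
    con.execute(insert, ("BE9558", None))
    null_rows = con.execute(
        "SELECT COUNT(*) FROM events WHERE timestamp IS NULL"
    ).fetchone()[0]
    # Same serial, same non-NULL timestamp twice: the second is rejected.
    con.execute(insert, ("BE9558", "2026-05-16T22:33:50"))
    try:
        con.execute(insert, ("BE9558", "2026-05-16T22:33:50"))
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    con.close()
    return null_rows, rejected
```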

All three are pre-existing issues that today's deployment surfaced
during validation; none are regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 21:02:13 +00:00
serversdown db657bcac9 Merge pull request 'fix: bw_report overlay onto event before DB, prevents data loss docs: three-tier architecture model + strategic roadmap' (#27) from feat/wire-histogram-codec into dev
Reviewed-on: #27
2026-05-22 15:46:46 -04:00
serversdown 35842ac50a backfill: overlay bw_report onto Event before DB upsert
Mirror what the ingest path does: BW's reported peaks (and sample_rate
/ record_time) take precedence over codec output where present.

Without this, --force backfill silently overwrites bw_report-overlaid
DB columns with codec-derived peaks.  Wrong for events where the codec
doesn't fully decode (waveform walker edge cases on SP0/SS0/SV0-style
events, histogram byte[5]!=0 sub-format that isn't yet RE'd), producing
PVS=0 on real high-amplitude events.  This bit prod on 2026-05-22, when
three top-10 waveform events ended up at PVS=0 (rolled back the same
day; this fix is the proper resolution).

New helper minimateplus.event_file_io.apply_bw_report_dict_to_event
operates on the projected sidecar dict shape (the structure
_bw_report_to_dict produces, which is what gets preserved in the
sidecar).  Mirrors apply_report_to_event's semantics: only writes
fields where bw_report has a non-None value, no-ops cleanly on
empty / None input.
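
The overlay semantics described above (non-None values win, empty or
None input is a no-op) can be sketched like this.  EventStub and the
field names are hypothetical stand-ins; the real Event type and the
sidecar dict shape are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventStub:  # hypothetical stand-in for the real Event
    pvs: Optional[float] = None
    sample_rate: Optional[int] = None
    record_time: Optional[float] = None


def apply_report_dict(event, report):
    """Overlay non-None report values onto event; no-op on empty/None input."""
    if not report:
        return event
    for name in ("pvs", "sample_rate", "record_time"):
        value = report.get(name)
        if value is not None:
            setattr(event, name, value)  # report value takes precedence
    return event
```

A codec-derived pvs of 0.0 is replaced when the report carries a value,
while fields the report leaves as None keep their codec output.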

Dev validation against prod snapshot:
  pre  : 1839.7315 pvs_sum   356 events with DB PVS ≠ sidecar bw_report
  post : 2016.4902 pvs_sum     2 events still mismatched (both have NULL
                                timestamp + duplicate rows, edge case)

Both edge-case events DO get the correct value written by the new
backfill — their stale rows from prior backfills remain because
UNIQUE(serial, timestamp) doesn't fire on NULL.  Separate dedup
cleanup needed for those 2 events (0.014% of corpus); not blocking.

Backfill remains idempotent + bw_report preservation still passes
(0 WIPED, 0 CHANGED on the 3rd consecutive run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:56:22 +00:00
serversdown 49a524d0d4 docs: three-tier architecture model + strategic roadmap
CLAUDE.md gains an Architecture section near the top describing the
canonical three-tier mental model:

  - SFM: device-side, live connections, /device/* endpoints
  - SDM: data-side, DB + waveform store + /db/* endpoints (currently
    living under sfm/ for historical reasons; rename deferred)
  - Codec library: pure data-interpretation, used by both tiers

Future code should be placed and named according to this model even
though the directory layout doesn't fully reflect it yet.  Decision
rule for where new code goes is documented inline.

README.md's Roadmap section gains two strategic-direction subsections:

  - "Strategic direction" — frames the suite-of-components vision and
    notes that BW ACH + Thor IDF call-home remain the data movers;
    seismo-relay's value is on the receiving and processing side.
  - "Terra-View ↔ SFM device control" — the long-term vision where
    Terra-View can launch into SFM device-control surfaces (operator
    notices missing unit → clicks "Connect to Device" → live view in
    browser).  Includes concrete implementation checklist (auth,
    embedded live-monitor view, action history, series IV live
    support).

The existing tactical roadmap items remain unchanged below.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:38:00 +00:00
serversdown 9ef424d098 Merge pull request 'Histogram body codec — full RE + peak-count fix that resolves the prod inflation incident' (#26) from feat/wire-histogram-codec into dev
Reviewed-on: #26
2026-05-22 13:08:03 -04:00
claude cc821f9ee3 hotfix: fix dockerfile on main to fix import bug on prod 2026-05-21 20:42:15 +00:00

serversdown ed6982c512 scripts: bw_report preservation check for backfill safety
Two-step tool to verify that backfill_sidecars doesn't wipe the
bw_report block from existing sidecars.  Workflow:

  1. snapshot --out before.json    (canonical-JSON hash per sidecar)
  2. run backfill
  3. diff --baseline before.json   (classifies every sidecar:
       PRESERVED / CHANGED / WIPED / STILL_MISSING / NEW / ADDED / REMOVED)

Exit code 1 if any WIPED or CHANGED entries found, 0 otherwise — so
it can gate a CI step or a deploy script.
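
A minimal sketch of the snapshot/diff idea.  The function names and the
exact meaning of each class are assumptions (in particular, NEW is
taken to mean "bw_report appeared", and ADDED / REMOVED to mean the
sidecar itself appeared or vanished); the real script may define them
differently.

```python
import hashlib
import json


def canonical_hash(obj):
    """SHA-256 of a key-sorted, whitespace-free JSON encoding."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshot(sidecars):
    """Map sidecar name -> hash of its bw_report block (None when absent)."""
    snap = {}
    for name, sidecar in sidecars.items():
        block = sidecar.get("bw_report")
        snap[name] = canonical_hash(block) if block is not None else None
    return snap


def classify(before, after):
    """Label every sidecar seen in either snapshot."""
    result = {}
    for name in before.keys() | after.keys():
        if name not in before:
            result[name] = "ADDED"
        elif name not in after:
            result[name] = "REMOVED"
        elif before[name] is None and after[name] is None:
            result[name] = "STILL_MISSING"
        elif before[name] is None:
            result[name] = "NEW"
        elif after[name] is None:
            result[name] = "WIPED"
        elif before[name] == after[name]:
            result[name] = "PRESERVED"
        else:
            result[name] = "CHANGED"
    return result


def exit_code(result):
    """1 if anything was WIPED or CHANGED, else 0."""
    return 1 if any(v in ("WIPED", "CHANGED") for v in result.values()) else 0
```

Sorting keys before hashing means a regenerated sidecar whose bw_report
differs only in key order still counts as PRESERVED.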

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 06:13:52 +00:00
serversdown d506ebc103 histogram_codec: peak count is uint8 (not uint16 LE) — properly cracks
the BE9558 / BE18003 extension-byte case

The bytes at [7]/[11]/[15]/[19] are an annotation field (purpose still
unclear — empirically non-zero on intervals with sub-Hz or unmeasurable
freq), NOT the high byte of the peak count.  The N844 fixture corpus
the original RE was done against had zero values in those bytes for
every block, so uint8 and uint16 LE were equivalent there — but on
real BE9558 Tran-drift events and BE18003 Histogram+Continuous events
the uint16 LE interpretation produced peaks up to 268 in/s and 35×
inflated PVS sums.

Cross-correlated against BW's per-interval ASCII export on:
  - K558LKZU/LL1P/LL3K  → 100% T/V/L/M peak match (1435 blocks each)
  - T003LKZR/LL0O/LL1M  → 100% T/V/L, 99.3% M (0.05 dB rounding only)
  - N599LKZS/LL0L        → 100% all channels
  - N844 fixture corpus  → 100% all channels (unchanged)

Annotations preserved on every record for future RE; the defensive
_MAX_PEAK_COUNT bound is no longer needed (uint8 maxes at 1.275 in/s,
well below any physical limit).

Synthetic regression test added using the verbatim K558LKZU.RE0H
interval-12 block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 06:05:19 +00:00
serversdown e949232875 histogram_codec + backfill: tighter peak ceiling, preserve bw_report
histogram_codec: drop _MAX_PEAK_COUNT 4096 → 2200. The old ceiling
let extension-byte blocks slip through at up to 20.48 in/s per
channel, producing 35× inflated PVS sums when first deployed to
prod. 2200 covers Normal-range full-scale (10 in/s = 2000 counts)
plus 10% headroom for quantization edge cases.

backfill_sidecars: also preserve the bw_report block alongside
review + extensions when regenerating sidecars. event_to_sidecar_dict
takes a BwAsciiReport dataclass not a dict, so for bw_report we
overlay the existing block after regen rather than passing as a kwarg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:50:10 +00:00
serversdown bc5a2d3f19 histogram_codec: defensive bounds-check on peak counts
Discovered while running the backfill on prod: certain histogram
blocks contain an undocumented extension byte format whose naive
uint16 LE interpretation yields physically impossible peak values
(150+ in/s when the device max is 10).  Concrete example from
K558LKSG.3I0H block at body+7424:

  bytes [6:10] = 05 79 69 00
  current code: T_peak = uint16 LE = 0x7905 = 30981 → 154.9 in/s
  reality:     T_peak = byte[6] = 5 → 0.025 in/s (matches BW display)

The high byte (0x79 here) appears to be an extension field — possibly
"time of peak within interval" or a Histogram+Continuous sub-mode
marker.  Observed across BE9558 and BE18003 units in prod data; never
appeared in the BE12844 fixture corpus the codec was originally
verified against.
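
The two readings of the example bytes can be reproduced with a few
lines of Python.  This is an illustration only; the helper names are
hypothetical, and the 0.005 in/s-per-count scale is inferred from the
example itself (5 counts → 0.025 in/s).

```python
import struct

IN_S_PER_COUNT = 0.005  # inferred from the example: 5 counts -> 0.025 in/s

RAW = bytes.fromhex("05796900")  # block bytes [6:10] from the example above


def peak_uint16_le(buf, offset=0):
    """Naive reading: little-endian uint16 spanning the peak and the next byte."""
    return struct.unpack_from("<H", buf, offset)[0]


def peak_uint8(buf, offset=0):
    """Low byte only, which matches the BW display for this block."""
    return buf[offset]
```

peak_uint16_le(RAW) gives 30981 counts (about 154.9 in/s), while
peak_uint8(RAW) gives 5 counts (0.025 in/s).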

Effect on prod: 26 out of 1433 blocks in this one event had inflated
peaks, plus dozens of similar events across the fleet → sum(PVS)
inflated from baseline 988 to 34501 (35x).  Rolled back via the
pre-backfill snapshot before any UI exposure.

Defensive fix: bounds-check peak counts in `_decode_block`.  Any
field exceeding `_MAX_PEAK_COUNT` (4096 = ~20 in/s, well past the
device's 10 in/s Normal-range FS) causes the block to be skipped
entirely.  Other valid blocks in the same event still decode
correctly.
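
In outline, the skip-on-out-of-range rule behaves like the sketch
below.  It is not the body of `_decode_block`: the tuple-of-counts
block representation and the function name are assumptions made for
illustration.

```python
MAX_PEAK_COUNT = 4096  # ceiling from this commit (~20 in/s)


def filter_blocks(blocks):
    """Keep blocks whose peak counts are all in range; count the rest as skipped."""
    kept = []
    skipped = 0
    for counts in blocks:
        if any(c > MAX_PEAK_COUNT for c in counts):
            skipped += 1  # the whole block is dropped, not just the bad field
            continue
        kept.append(counts)
    return kept, skipped
```

One bad field discards the entire block, which is the trade-off the
next paragraph describes.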

Trade-off: those skipped blocks lose their per-interval data
(peaks + frequencies).  Acceptable until the extension format is
reverse-engineered — better than propagating bogus values into PVS
computations downstream.

The 24 existing tests all still pass — the fixtures used during the
original codec development don't exercise the extension-byte case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:17:33 +00:00