Commit Graph

11 Commits

Author SHA1 Message Date
serversdown 9b71ead44b series 4 codec work, inital decode success 2026-05-29 06:33:13 +00:00
serversdown ad2b553c7b ingest: preserve raw BW ASCII report (.TXT) alongside the binary
Previously the .TXT was parsed into the sidecar's bw_report projection
and then discarded at ingest time.  Now save_imported_bw() writes it
to <store>/<serial>/<filename>_ASCII.TXT permanently.

Rationale: with BW Mail / Forwarding Agent being phased out of the
operator workflow, the XML/PDF/WMF those tools produce won't be
available — the binary + .TXT (created by BW ACH itself) are our
only authoritative inputs going forward.  Keeping the raw .TXT
unlocks:

  - Parser bug fixes can be applied RETROACTIVELY by re-parsing the
    stored .TXT, instead of requiring a re-forward from the watcher
    PC (which lost the .TXT after BW ACH cleanup).
  - Audit trail of what BW actually sent us, for debugging.
  - The five known parser-PPV-miss events will be re-parseable once
    the regex fix lands (instead of staying broken indefinitely).

Storage cost: ~15 KB per event × 14k events = ~210 MB on the
existing prod corpus.  Negligible.

Implementation:
  - WaveformStore gains txt_path_for() + open_txt()
  - save_imported_bw() writes the .TXT when bw_report_text is supplied
  - sidecar source block records the txt_filename
  - backfill_sidecars.py preserves txt_filename across regens
  - New GET /db/events/{id}/ascii_report.txt endpoint serves it
  - Returns 404 for events ingested before this change (no .TXT in
    the store yet) — re-forward to populate

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:01:12 +00:00
serversdown ecc935482b seismo-relay v0.19.0 — device-family separation + micromate/ package
Tighten the Series III / Series IV boundary so UI and storage dispatch
on a clean signal instead of sniffing filenames or applying magnitude
heuristics.

Phase 1 — events.device_family column ("series3" | "series4"):
  self-applying migration with filename-based backfill of existing rows
  (1,132 backfilled on prod 2026-05-20); plumbed through every import
  path (BW endpoint, IDF endpoint, ACH server, BW CLI, sidecar
  backfill); UPSERT preserves via COALESCE; UI dispatches on it.

Phase 2 — extract micromate/ package alongside minimateplus/:
  native IdfEvent / IdfReport / IdfPeaks / IdfProjectInfo /
  IdfSensorCheck (mic in dB(L), not pseudo-psi); moved
  idf_ascii_report.py from sfm/ to micromate/; refactored
  save_imported_idf to use IdfEvent and bridge to minimateplus.Event at
  the SQL-insert boundary; idf_file.py stub for the future binary codec.

Phase 3 prep — docs/idf_protocol_reference.md captures the two
observed Thor binary header signatures (1,012 newer-firmware files vs
2 old files whose layout is byte-for-byte BW-STRT-compatible), file-size
hints suggesting int8 sample encoding, open questions in dependency
order, and a concrete first-session plan for cracking the codec.

Also rolled in the v0.18.1 hotfixes that motivated this work:
  - idf_ascii_report parser now handles "<0.005 in/s" (below-threshold)
    and "N/A" markers without leaving raw strings in numeric DB columns.
  - sfm_webapp.html: defensive _ppvFmt / mic formatter so future
    data-shape drift can't kill the whole events table render.

All 1,014 example-data sidecars round-trip through the new package.
See CHANGELOG.md for full notes.
2026-05-20 15:19:49 +00:00
serversdown cd20be2eff feat: add thor/micromate compatibility v0.18.0 2026-05-19 04:32:43 +00:00
serversdown aac1c8e06d fix(import): derive record_type from filename suffix instead of hardcoding "Waveform"
The BW ACH ingest path was inserting every event with
record_type="Waveform" regardless of the actual type because
read_blastware_file() had `ev.record_type = "Waveform"` hardcoded, and
the live watcher-forward path parses files from a tmp path (suffix
".bw") that doesn't carry the original extension.

V10.72+ MiniMate Plus firmware encodes the event type as the last
character of the AB0T extension scheme (H=Histogram, W=Waveform,
M=Manual, E=Event, C=Combo).  This change:

  1. Adds derive_record_type_from_filename() public helper in
     minimateplus/event_file_io.py
  2. Uses it inside read_blastware_file() so direct callers (the
     --dry-run path of scripts/import_bw.py, tests, ad-hoc scripts)
     get correct types automatically
  3. Overrides ev.record_type in WaveformStore.save_imported_bw()
     using the ORIGINAL filename (source_path.name) — required
     because the parser sees only the tmp file

Old S338 firmware (3-char extensions ending in `0`) and any
unrecognized suffix fall back to "Waveform".

Existing DB rows ingested before this fix are stuck with
record_type="Waveform" — a one-off SQL backfill would fix them
retroactively if desired.  Terra-view's event modal also derives
client-side from the filename, so the UI already shows the correct
type for old events even without the backfill.

Version bumped to 0.16.1 in pyproject.toml, event_file_io.py
TOOL_VERSION, sfm/server.py FastAPI version, and CHANGELOG.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 21:09:21 +00:00
serversdown 6b2a44ff02 fix(import): overlay BW report onto Event + upsert DB row on re-import
Two compounding bugs caused forwarded events to land in the DB with
broken-codec peak values (~10 in/s saturation on every channel) and
no project info, even when the watcher correctly paired a BW ASCII
report with the binary.

Bug 1: save_imported_bw built the sidecar JSON with the report's
authoritative peak / project values via event_to_sidecar_dict(
bw_report=...), but never overlaid those onto the in-memory Event
that flows to db.insert_events().  So the DB row got peak_values
from read_blastware_file()._peaks_from_samples() — which runs the
still-undecoded waveform body codec assuming raw int16 LE and
produces ±32K-shaped noise (= ±10 in/s at Normal range) regardless
of the actual signal.  The sidecar JSON had the truth but the DB
columns (which the webapp queries for fast filter/sort) lied.

Bug 2: insert_events' IntegrityError handler only refreshed the
filename/filesize/a5_pickle/sidecar columns when a duplicate
(serial, timestamp) was seen.  Peak values, project info,
sample_rate, record_type stayed locked in at whatever the FIRST
insert wrote.  So even after Bug 1 was fixed, the historical
events in the DB (already inserted with broken-codec peaks) would
never get their values corrected, because a re-forward would just
hit IntegrityError and skip the field refresh.

Fix 1 (minimateplus/event_file_io.py + sfm/waveform_store.py):
  - New apply_report_to_event(event, report) helper folds the BW
    report's device-authoritative fields onto the Event in-place:
    per-channel PPV, peak vector sum, mic PSPL→psi, project /
    client / operator / sensor_location, sample_rate, record_time.
  - save_imported_bw() calls the helper right after parsing the
    report.  The Event that flows to insert_events() now carries
    correct values.

Fix 2 (sfm/database.py):
  - insert_events()'s IntegrityError UPDATE now refreshes every
    device-authoritative column from the new data: tran_ppv,
    vert_ppv, long_ppv, peak_vector_sum, mic_ppv, project, client,
    operator, sensor_location, sample_rate, record_type, plus
    the existing filename/filesize/a5_pickle/sidecar fields.
  - Preserves: id, waveform_key, session_id, created_at (immutable
    / FK fields), and false_trigger (operator review state).

End-to-end simulation verified:
  - Step 1: import without report → DB has ±10 in/s peaks, no project
  - Step 2: re-import WITH report → upsert path fires, DB now has
            device-authoritative 0.005 in/s peaks + sensor_location
  - Step 3: operator sets false_trigger=1, re-import again → flag
            preserved, peaks remain correct

For the user's situation: deleting the watcher state file forces a
re-forward of all events.  Each re-forward now pairs with its
_ASCII.TXT, applies the report onto the Event, and the upsert
refreshes the DB row.  No DB nuke needed.

Full SFM suite: 62 passed, 44 skipped.
2026-05-11 05:51:39 +00:00
serversdown 082e5946bc fix(import): resolve real serial from BW filename instead of bucketing to UNKNOWN
The /db/import/blastware_file endpoint was bucketing every
forwarded event into serial='UNKNOWN' in the DB.  WaveformStore
correctly decoded the serial from the BW filename and saved
files to <store>/<serial>/<filename> (e.g.
.../BE17353/S353L5KC.DR0H.h5), but the endpoint code called
db.insert_events(serial=_serial_from_event(ev)) — and
_serial_from_event was a stub that always returned None,
falling back to "UNKNOWN".

Effect on the user's prod server: 3,039 events forwarded across
24 distinct units, ALL inserted under serial='UNKNOWN'.  The
on-disk waveform store + sidecars + HDF5s were fine, but the
SFM webapp's /db/units only showed the two original manually-
uploaded serials because every forwarded row had its serial
column zeroed to UNKNOWN.

Fix:
  - WaveformStore.save_imported_bw() now surfaces the decoded
    serial on the returned `rec` dict (rec["serial"]).
  - The import endpoint uses rec["serial"] as the authoritative
    fallback when the operator hasn't supplied a serial_hint query
    parameter.  Order of precedence:
      query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN"
  - Response payload now includes `serial` per file so the watcher
    log lines (or any future caller) can see which unit each event
    was attributed to.

Recovery for existing DB rows:
  scripts/repair_unknown_serials.py walks the events table looking
  for rows with serial='UNKNOWN' and re-attributes each one to the
  serial decoded from blastware_filename.  Updates the row in place
  unless the target (serial, timestamp) already has a row, in which
  case the UNKNOWN duplicate is deleted.  Idempotent.  Default
  dry-run; pass --apply to commit.

  Verified on the user's actual DB (dry-run):
    UNKNOWN rows scanned:       3039
    Updated to real serial:     2602
    Deleted (duplicate of an
     already-correct row):      437
    Unresolved (bad filename):  0

After running the repair, /db/units will show all 24 units
correctly populated.
2026-05-11 02:25:08 +00:00
serversdown cdfe4ad3c8 feat(import): parse paired BW ASCII reports on /db/import/blastware_file
Blastware's ACH writes a per-event ASCII report (.TXT) alongside each
event binary, containing the rich derived per-channel fields BW
computes (PPV, ZC Freq, Time of Peak, Peak Acceleration, Peak
Displacement, Peak Vector Sum + time, sensor self-check Pass/Fail,
monitor-log timestamps).  None of this lives in the BW binary itself.

When the watcher daemon forwards both files to /db/import/blastware_file
in one multipart POST, we now:

  - Pair binaries with their .TXT partners by filename match
  - Parse the report into a structured BwAsciiReport
  - Land the rich fields in a new top-level `bw_report` block of the
    sidecar JSON
  - Overlay the report's peaks/project_info/timestamp/sample_rate/
    record_time/total_samples/pretrig_samples onto the canonical
    sidecar fields (the report values are device-authoritative; the
    BW-binary STRT-derived values had bugs like reading the 0x46
    record-type marker as rectime)

This unblocks the monthly-summary review workflow — events become
sortable/filterable by peak, location, project, etc. — without
depending on the still-undecoded waveform body codec.
2026-05-08 23:56:43 +00:00
serversdown c641d5fc10 feat: v0.15.0
### Added

- **Layered event storage architecture.**  Each event now lands as four
  files in the per-serial waveform store, each with a clear role:

  - `<filename>` — the Blastware-readable binary (BW file).  Untouched.
  - `<filename>.a5.pkl` — the raw 5A frames (regenerative source).
  - `<filename>.h5` — clean per-channel waveform arrays in physical
    units (in/s for geo, psi for mic) plus event metadata (HDF5 with
    gzip compression).  This is the canonical format for downstream
    analysis tools.
  - `<filename>.sfm.json` — the modern review/metadata sidecar (peaks,
    project, source provenance, review state, extensions).

  SQLite (`seismo_relay.db`) is the searchable index over all four.

- **Plot-ready waveform JSON (`sfm.plot.v1`).**  The `/device/event/{idx}/waveform`
  and `/db/events/{id}/waveform.json` endpoints now return samples in
  physical units with explicit time-axis metadata, peak markers, and
  per-channel unit hints — no more guessing the ADC-to-velocity scale
  client-side.  The webapp waveform viewer was rewritten to consume
  this shape.

- **In-app waveform viewer accuracy fix.**  The standalone SFM webapp
  viewer was scaling geophone amplitudes by `geoAdcScale / 32767`
  (≈ 6.206 / 32767), where `geoAdcScale = 6.206053` is the device's
  *in/s per V* hardware constant — not the ADC-counts-to-velocity
  factor.  This silently scaled every plot ~38% too low for Normal-range
  geophones (the correct full-scale is 10.0 in/s, or 1.25 in/s for
  Sensitive).  Conversion is now done server-side using the geo_range
  from compliance config; the client just plots.

- New `sfm/event_hdf5.py` module: `write_event_hdf5()`,
  `read_event_hdf5()`, plus a plot-JSON helper.
- Backfill script extended to also emit `.h5` for existing events.

### Dependencies

- Added `h5py>=3.10` and `numpy>=1.24` for the HDF5 storage layer.
- Added `python-multipart>=0.0.7` (required by FastAPI for the
  `/db/import/blastware_file` endpoint introduced in this release).
2026-05-08 04:39:51 +00:00
serversdown 0484680c89 fix(docs/comments): rename refs to 'event files' to reflect their timestamp extenion names. 2026-05-06 19:08:38 +00:00
serversdown 3711b11bda feat: add waveform store handling 2026-05-06 19:03:38 +00:00