fix(import): resolve real serial from BW filename instead of bucketing to UNKNOWN

The /db/import/blastware_file endpoint was bucketing every
forwarded event into serial='UNKNOWN' in the DB.  WaveformStore
correctly decoded the serial from the BW filename and saved
files to <store>/<serial>/<filename> (e.g.
.../BE17353/S353L5KC.DR0H.h5), but the endpoint ignored that and
called db.insert_events(serial=serial or _serial_from_event(ev)
or "UNKNOWN").  _serial_from_event was a stub that always returned
None, so without an explicit `serial` query parameter every row
fell back to "UNKNOWN".

Effect on the user's prod server: 3,039 events forwarded across
24 distinct units, ALL inserted under serial='UNKNOWN'.  The
on-disk waveform store + sidecars + HDF5s were fine, but the
SFM webapp's /db/units only showed the two original manually
uploaded serials because every forwarded row had its serial
column set to UNKNOWN.

Fix:
  - WaveformStore.save_imported_bw() now surfaces the decoded
    serial on the returned `rec` dict (rec["serial"]).
  - The import endpoint uses rec["serial"] as the fallback when
    the operator hasn't supplied a `serial` query parameter (the
    value passed on as serial_hint).  Order of precedence:
      query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN"
  - Response payload now includes `serial` per file so the watcher
    log lines (or any future caller) can see which unit each event
    was attributed to.
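
The precedence chain can be illustrated in isolation.  This is a
minimal sketch, not the endpoint code: `resolve_serial`,
`old_resolve_serial` and `stub_serial_from_event` are hypothetical
names, and `rec` is assumed to be a plain dict that may or may not
carry a "serial" key.

```python
from typing import Optional

UNKNOWN = "UNKNOWN"


def stub_serial_from_event(ev: object) -> Optional[str]:
    """Hypothetical stand-in for the old stub: it always returns None."""
    return None


def old_resolve_serial(query_serial: Optional[str], ev: object) -> str:
    """Pre-fix behaviour: with no query serial, the stub forces UNKNOWN."""
    return query_serial or stub_serial_from_event(ev) or UNKNOWN


def resolve_serial(query_serial: Optional[str], rec: dict, ev: object) -> str:
    """Post-fix behaviour: return the first truthy candidate, in order."""
    return (
        query_serial                    # explicit operator override
        or rec.get("serial")            # serial decoded by the store
        or stub_serial_from_event(ev)   # legacy hook, still None here
        or UNKNOWN                      # last resort
    )
```

Because the chain uses `or`, an empty-string query parameter is
treated the same as a missing one and falls through to rec["serial"].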

Recovery for existing DB rows:
  scripts/repair_unknown_serials.py walks the events table looking
  for rows with serial='UNKNOWN' and re-attributes each one to the
  serial decoded from blastware_filename.  Updates the row in place
  unless the target (serial, timestamp) already has a row, in which
  case the UNKNOWN duplicate is deleted.  Idempotent.  Default
  dry-run; pass --apply to commit.
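
  The update-or-delete rule can be sketched as follows.  This is an
  illustration under stated assumptions, not the contents of
  scripts/repair_unknown_serials.py: it assumes a SQLite `events`
  table with `id`, `serial`, `timestamp` and `blastware_filename`
  columns, and it takes the filename decoder as a caller-supplied
  `decode_serial` callable because the real decoding rules are not
  shown here.

```python
import sqlite3
from typing import Callable, Optional


def repair_unknown_serials(
    conn: sqlite3.Connection,
    decode_serial: Callable[[str], Optional[str]],  # hypothetical decoder
    apply: bool = False,                            # False means dry-run
) -> dict:
    """Re-attribute serial='UNKNOWN' rows and return a summary of actions."""
    stats = {"scanned": 0, "updated": 0, "deleted": 0, "unresolved": 0}
    rows = conn.execute(
        "SELECT id, timestamp, blastware_filename FROM events "
        "WHERE serial = 'UNKNOWN' ORDER BY id"
    ).fetchall()
    # (serial, timestamp) keys claimed earlier in this run, so a dry-run
    # reports the same counts that an applied run would produce.
    claimed = set()
    for row_id, ts, filename in rows:
        stats["scanned"] += 1
        serial = decode_serial(filename or "")
        if not serial:
            stats["unresolved"] += 1
            continue
        exists = conn.execute(
            "SELECT 1 FROM events WHERE serial = ? AND timestamp = ? LIMIT 1",
            (serial, ts),
        ).fetchone()
        if exists or (serial, ts) in claimed:
            # A correctly attributed row already exists: drop the duplicate.
            stats["deleted"] += 1
            if apply:
                conn.execute("DELETE FROM events WHERE id = ?", (row_id,))
        else:
            stats["updated"] += 1
            claimed.add((serial, ts))
            if apply:
                conn.execute(
                    "UPDATE events SET serial = ? WHERE id = ?",
                    (serial, row_id),
                )
    if apply:
        conn.commit()
    return stats
```

  A second applied run finds no resolvable UNKNOWN rows left, which
  is what makes the operation idempotent in this sketch.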

  Verified on the user's actual DB (dry-run):
    UNKNOWN rows scanned:       3039
    Updated to real serial:     2602
    Deleted (duplicate of an
     already-correct row):      437
    Unresolved (bad filename):  0

After running the repair with --apply, /db/units should show all
24 units correctly populated.
2026-05-11 02:25:08 +00:00
parent a032fa5451
commit 082e5946bc
4 changed files with 169 additions and 1 deletions
+13 -1
@@ -1673,9 +1673,20 @@ async def db_import_blastware_file(
     serial_hint=serial,
     bw_report_text=report_bytes,
 )
+# WaveformStore decoded the serial from the BW filename
+# (e.g. T104… → BE18104) and surfaces it on `rec`. Use that
+# rather than the placeholder `_serial_from_event(ev)` stub,
+# which always returned None and was silently bucketing every
+# forwarded event into serial="UNKNOWN" in the DB.
+resolved_serial = (
+    serial
+    or rec.get("serial")
+    or _serial_from_event(ev)
+    or "UNKNOWN"
+)
 inserted, skipped = db.insert_events(
     [ev],
-    serial=(serial or _serial_from_event(ev) or "UNKNOWN"),
+    serial=resolved_serial,
     waveform_records={
         ev._waveform_key.hex(): rec
         if ev._waveform_key else None
@@ -1687,6 +1698,7 @@ async def db_import_blastware_file(
 "stored_filename": rec["filename"],
 "filesize": rec["filesize"],
 "sha256": rec["sha256"],
+"serial": resolved_serial,
 "report_attached": report_bytes is not None,
 "inserted": inserted,
 "skipped": skipped,