fix(import): resolve real serial from BW filename instead of bucketing to UNKNOWN

The /db/import/blastware_file endpoint bucketed every forwarded
event into serial='UNKNOWN' in the DB. WaveformStore correctly
decoded the serial from the BW filename and saved files to
<store>/<serial>/<filename> (e.g. .../BE17353/S353L5KC.DR0H.h5),
but the endpoint passed
serial=(serial or _serial_from_event(ev) or "UNKNOWN") to
db.insert_events(). _serial_from_event was a stub that always
returned None, so any import without an explicit serial fell
through to "UNKNOWN".
Effect on the user's prod server: 3,039 events forwarded across
24 distinct units were ALL inserted under serial='UNKNOWN'. The
on-disk waveform store, sidecars, and HDF5s were fine, but the
SFM webapp's /db/units showed only the two original manually
uploaded serials, because every forwarded row had its serial
column set to UNKNOWN.
Fix:
- WaveformStore.save_imported_bw() now surfaces the decoded
serial on the returned `rec` dict (rec["serial"]).
- The import endpoint uses rec["serial"] as the authoritative
fallback when the operator hasn't supplied a `serial` query
parameter (the value forwarded to the store as serial_hint).
Order of precedence:
query string `serial` → rec["serial"] → _serial_from_event(ev) → "UNKNOWN"
- Response payload now includes `serial` per file so the watcher
log lines (or any future caller) can see which unit each event
was attributed to.
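The precedence order above can be sketched as a standalone helper. This is an illustration only: `resolve_serial` and its parameter names are hypothetical and do not exist in the codebase; the endpoint inlines the same `or` chain.

```python
from typing import Optional


def resolve_serial(
    query_serial: Optional[str],
    rec: dict,
    event_serial: Optional[str],
) -> str:
    """Return the first truthy candidate in precedence order.

    query_serial -- the `serial` query-string value, if supplied
    rec          -- the dict returned by the store; may carry "serial"
    event_serial -- stand-in for the result of _serial_from_event(ev)
    """
    # An `or` chain skips None *and* empty strings, so a blank query
    # parameter falls through to the filename-decoded serial.
    return query_serial or rec.get("serial") or event_serial or "UNKNOWN"


if __name__ == "__main__":
    print(resolve_serial(None, {"serial": "BE17353"}, None))
    print(resolve_serial("BE18104", {"serial": "BE17353"}, None))
    print(resolve_serial(None, {}, None))
```

Because the chain tests truthiness rather than `is None`, an empty `serial=` in the query string behaves the same as omitting the parameter.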
Recovery for existing DB rows:
scripts/repair_unknown_serials.py walks the events table looking
for rows with serial='UNKNOWN' and re-attributes each one to the
serial decoded from blastware_filename. Updates the row in place
unless the target (serial, timestamp) already has a row, in which
case the UNKNOWN duplicate is deleted. Idempotent. Default
dry-run; pass --apply to commit.
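A minimal sketch of the update-or-delete logic described above, not the actual script. Assumptions: a SQLite `events` table with columns `id`, `serial`, `timestamp`, and `blastware_filename` (the real schema may differ), and a caller-supplied `decode_serial` callable. `toy_decode` is a made-up prefix table built from the two filename examples in this commit; the real decoding rule lives in WaveformStore and is not reproduced here.

```python
import sqlite3
from typing import Callable, Optional


def repair_unknown_serials(
    conn: sqlite3.Connection,
    decode_serial: Callable[[str], Optional[str]],
    apply: bool = False,
) -> dict:
    """Re-attribute serial='UNKNOWN' rows; dry-run unless apply=True."""
    stats = {"scanned": 0, "updated": 0, "deleted": 0, "unresolved": 0}
    claimed = set()  # (serial, timestamp) keys taken during this run
    rows = conn.execute(
        "SELECT id, timestamp, blastware_filename FROM events "
        "WHERE serial = 'UNKNOWN'"
    ).fetchall()
    for row_id, ts, filename in rows:
        stats["scanned"] += 1
        serial = decode_serial(filename or "")
        if not serial:
            stats["unresolved"] += 1
            continue
        key = (serial, ts)
        exists = key in claimed or conn.execute(
            "SELECT 1 FROM events WHERE serial = ? AND timestamp = ? LIMIT 1",
            key,
        ).fetchone() is not None
        if exists:
            # A correctly attributed row already covers this event.
            stats["deleted"] += 1
            if apply:
                conn.execute("DELETE FROM events WHERE id = ?", (row_id,))
        else:
            stats["updated"] += 1
            claimed.add(key)
            if apply:
                conn.execute(
                    "UPDATE events SET serial = ? WHERE id = ?",
                    (serial, row_id),
                )
    if apply:
        conn.commit()
    return stats


# Hypothetical decoder for illustration only.
_TOY_PREFIXES = {"S353": "BE17353", "T104": "BE18104"}


def toy_decode(filename: str) -> Optional[str]:
    return _TOY_PREFIXES.get(filename[:4])
```

Tracking claimed keys in memory keeps the dry-run counts identical to what an `apply=True` run would report, and a second applied run only sees rows that could not be resolved, which is what makes the repair idempotent.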
Verified on the user's actual DB (dry-run):
  UNKNOWN rows scanned:                              3039
  Updated to real serial:                            2602
  Deleted (duplicate of an already-correct row):      437
  Unresolved (bad filename):                            0
After running the repair, /db/units will show all 24 units
correctly populated.
@@ -1673,9 +1673,20 @@ async def db_import_blastware_file(
             serial_hint=serial,
             bw_report_text=report_bytes,
         )
+        # WaveformStore decoded the serial from the BW filename
+        # (e.g. T104… → BE18104) and surfaces it on `rec`. Use that
+        # rather than the placeholder `_serial_from_event(ev)` stub,
+        # which always returned None and was silently bucketing every
+        # forwarded event into serial="UNKNOWN" in the DB.
+        resolved_serial = (
+            serial
+            or rec.get("serial")
+            or _serial_from_event(ev)
+            or "UNKNOWN"
+        )
         inserted, skipped = db.insert_events(
             [ev],
-            serial=(serial or _serial_from_event(ev) or "UNKNOWN"),
+            serial=resolved_serial,
             waveform_records={
                 ev._waveform_key.hex(): rec
                 if ev._waveform_key else None
@@ -1687,6 +1698,7 @@ async def db_import_blastware_file(
             "stored_filename": rec["filename"],
             "filesize": rec["filesize"],
             "sha256": rec["sha256"],
+            "serial": resolved_serial,
             "report_attached": report_bytes is not None,
             "inserted": inserted,
             "skipped": skipped,
@@ -383,6 +383,7 @@ class WaveformStore:
             "a5_pickle_filename": None,
             "hdf5_filename": hdf5_filename,
             "sidecar_filename": sidecar_path.name,
+            "serial": serial,
         }

     def load_a5(self, serial: str, filename: str) -> Optional[list[S3Frame]]: