ingest: preserve raw BW ASCII report (.TXT) alongside the binary

Previously the .TXT was parsed into the sidecar's bw_report projection
and then discarded at ingest time.  Now save_imported_bw() writes it
to <store>/<serial>/<filename>_ASCII.TXT permanently.

Rationale: with BW Mail / Forwarding Agent being phased out of the
operator workflow, the XML/PDF/WMF those tools produce won't be
available — the binary + .TXT (created by BW ACH itself) are our
only authoritative inputs going forward.  Keeping the raw .TXT
unlocks:

  - Parser bug fixes can be applied RETROACTIVELY by re-parsing the
    stored .TXT, instead of requiring a re-forward from the watcher
    PC (which lost the .TXT after BW ACH cleanup).
  - Audit trail of what BW actually sent us, for debugging.
  - The five known parser-PPV-miss events will be re-parseable once
    the regex fix lands (instead of staying broken indefinitely).
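
A minimal sketch of what retroactive re-parsing could look like once the
.TXT files are in the store. It assumes the <store>/<serial>/<filename>_ASCII.TXT
layout described above; `iter_stored_reports`, `reparse_all`, and the `parse`
callable are hypothetical names for illustration, not functions from this
commit, and the ASCII decoding is an assumption about the report encoding.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, Tuple


def iter_stored_reports(store_root: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (serial, txt_path) for every preserved report under store_root.

    Assumes one directory per serial, with reports named *_ASCII.TXT.
    """
    for txt_path in sorted(store_root.glob("*/*_ASCII.TXT")):
        yield txt_path.parent.name, txt_path


def reparse_all(store_root: Path, parse: Callable[[str], dict]) -> Dict[Path, dict]:
    """Re-run a (fixed) parser over every stored report.

    `parse` stands in for whatever function builds the bw_report projection.
    """
    results: Dict[Path, dict] = {}
    for _serial, txt_path in iter_stored_reports(store_root):
        # Assumption: reports are ASCII; undecodable bytes are replaced
        # rather than aborting the whole backfill.
        text = txt_path.read_text(encoding="ascii", errors="replace")
        results[txt_path] = parse(text)
    return results
```

The point of the design is that the parser becomes a pure function of stored
input, so a regex fix can be verified against the known-bad events without
touching the watcher PC.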

Storage cost: ~15 KB per event × 14k events = ~210 MB on the
existing prod corpus.  Negligible.

Implementation:
  - WaveformStore gains txt_path_for() + open_txt()
  - save_imported_bw() writes the .TXT when bw_report_text is supplied
  - sidecar source block records the txt_filename
  - backfill_sidecars.py preserves txt_filename across regens
  - New GET /db/events/{id}/ascii_report.txt endpoint serves it; it
    returns 404 for events ingested before this change (no .TXT in
    the store yet), so re-forward those events to populate
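
A rough sketch of how the two store helpers might fit together, assuming the
<store>/<serial>/<filename>_ASCII.TXT layout from the summary above. The
class name `WaveformStoreSketch` and the `save_txt` method are hypothetical;
only `txt_path_for()` and `open_txt()` are named in this commit, and their
real signatures may differ. `open_txt()` is shown returning a path or None
because the endpoint in the diff checks for None and then reads
`txt_path.name`.

```python
from pathlib import Path
from typing import Optional


class WaveformStoreSketch:
    """Illustrative stand-in for WaveformStore; not the real class."""

    def __init__(self, root) -> None:
        self.root = Path(root)

    def txt_path_for(self, serial: str, filename: str) -> Path:
        # Assumption: the suffix is appended to the binary's filename as-is.
        return self.root / serial / f"{filename}_ASCII.TXT"

    def open_txt(self, serial: str, filename: str) -> Optional[Path]:
        """Return the report path if it was preserved, else None."""
        path = self.txt_path_for(serial, filename)
        return path if path.is_file() else None

    def save_txt(self, serial: str, filename: str, bw_report_text: str) -> Path:
        """Write the raw report next to the binary (hypothetical helper)."""
        path = self.txt_path_for(serial, filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(bw_report_text, encoding="ascii", errors="replace")
        return path
```

Returning None for a missing file lets the HTTP layer decide how to report
it, which is how the endpoint turns pre-change events into a 404.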

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:01:12 +00:00
parent dfbc8b8520
commit ad2b553c7b
5 changed files with 85 additions and 0 deletions
+33
@@ -2178,6 +2178,39 @@ def db_event_blastware_file(event_id: str) -> FileResponse:
)
@app.get("/db/events/{event_id}/ascii_report.txt")
def db_event_ascii_report_txt(event_id: str):
    """Serve the raw BW ASCII report (.TXT) for an event, when preserved.

    Returns 404 for events ingested before the .TXT-preservation feature
    landed (2026-05-27); those events have only the parsed ``bw_report``
    block in the sidecar, not the raw .TXT. Re-forwarding from the
    watcher PC will populate the .TXT going forward.
    """
    row = _get_db().get_event(event_id)
    if row is None:
        raise HTTPException(status_code=404, detail=f"Event {event_id} not found")
    serial = row.get("serial")
    filename = row.get("blastware_filename")
    if not serial or not filename:
        raise HTTPException(status_code=404, detail="Event has no associated BW file")
    txt_path = _get_store().open_txt(serial, filename)
    if txt_path is None:
        raise HTTPException(
            status_code=404,
            detail=(
                f"Raw .TXT not preserved for {filename}. Events ingested "
                "before 2026-05-27 don't have it; re-forward from the "
                "watcher PC to populate."
            ),
        )
    return FileResponse(
        path=str(txt_path),
        media_type="text/plain",
        filename=txt_path.name,
    )


@app.get("/db/events/{event_id}/report.pdf")
def db_event_report_pdf(event_id: str):
    """Render an Instantel-style Event Report as a PDF.