sfm: Event Report PDF generation (v0.20.0 stub layout)

New endpoint GET /db/events/{id}/report.pdf returns a single-page
letter-portrait PDF for any event with waveform data on disk.

Architecture:
  sfm/report_pdf.py — gather_report_data() assembles fields from
    SeismoDb row + .sfm.json sidecar (bw_report block) + .h5 samples;
    render_event_report_pdf() turns that into PDF bytes via matplotlib.
  sfm/server.py — new endpoint wires them together, streams PDF back
    with Content-Disposition: inline so the browser displays it.
  sfm_webapp.html — new "Download PDF" button in the event modal
    footer that opens the endpoint in a new tab.
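
A minimal sketch of the gather -> render -> respond flow described above,
kept framework-free so it runs standalone. The helper name
build_report_response, the 404 body, and the filename pattern are
illustrative assumptions, not the code in sfm/server.py:

```python
from typing import Callable, Optional, Tuple


def build_report_response(
    event_id: str,
    gather: Callable[[str], Optional[dict]],
    render: Callable[[dict], bytes],
) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body) for a report request (hypothetical helper)."""
    data = gather(event_id)
    if data is None:
        # Assumed behaviour: unknown event or nothing to render -> 404.
        return 404, {"Content-Type": "text/plain"}, b"event not found"
    pdf_bytes = render(data)
    headers = {
        "Content-Type": "application/pdf",
        # "inline" asks the browser to display the PDF instead of saving it.
        "Content-Disposition": f'inline; filename="{event_id}.pdf"',
    }
    return 200, headers, pdf_bytes
```

In the real endpoint the two callables would be gather_report_data() and
render_event_report_pdf(); they are passed in here only so the flow can be
exercised with stubs.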

Fields surfaced — same coverage as a Blastware Event Report:
  Header metadata (date/time, trigger source, range, sample rate,
                   project, client, operator, location, serial+firmware,
                   battery, calibration, file name)
  Microphone block (PSPL in dB(L) + psi, ZC freq, channel test)
  Per-channel stats (PPV, ZC Freq, Time of Peak, Peak Accel,
                     Peak Disp, Sensor Check) for Tran/Vert/Long
  Peak Vector Sum
  Waveform plot (MicL/Long/Vert/Tran stacked, shared time axis,
                 trigger marker, symmetric Y for geo, zero-anchored
                 mic) — OR per-interval bar chart for histograms.
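
The psi figure in the Microphone block is derived from the dB(L) value
rather than stored. A small sketch of that conversion, using the same
reference constant as sfm/report_pdf.py; the function names are
illustrative, and psi_to_dbl assumes the standard 20*log10(p/p_ref)
definition of sound pressure level:

```python
import math

# 20 µPa expressed in psi (same constant as DBL_REF_PSI in sfm/report_pdf.py).
DBL_REF_PSI = 2.9e-9


def dbl_to_psi(dbl: float) -> float:
    """Convert a linear-weighted sound pressure level in dB(L) to psi."""
    return DBL_REF_PSI * 10 ** (dbl / 20)


def psi_to_dbl(psi: float) -> float:
    """Inverse conversion; assumes the standard 20*log10(p / p_ref) definition."""
    return 20 * math.log10(psi / DBL_REF_PSI)
```

Under these assumptions a reading of 120 dB(L) is about 2.9e-3 psi, six
orders of magnitude above the reference pressure.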

Rendering pipeline = matplotlib only (vector PDF, no headless-browser
dep).  Adds matplotlib>=3.8 to deps.
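
For illustration, the matplotlib-only approach reduces to saving a figure
into an in-memory buffer with the PDF format. This is a stripped-down
sketch with made-up sample data, not the report renderer itself; the
function name is hypothetical:

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, no display required
import matplotlib.pyplot as plt


def figure_to_pdf_bytes() -> bytes:
    """Render a letter-portrait figure to vector PDF bytes in memory."""
    fig = plt.figure(figsize=(8.5, 11))
    ax = fig.add_subplot(111)
    ax.plot([0, 1, 2], [0.0, 0.5, -0.25], linewidth=0.6)  # placeholder trace
    buf = io.BytesIO()
    fig.savefig(buf, format="pdf")  # the PDF backend writes vector output
    plt.close(fig)  # release the figure so repeated calls don't leak memory
    return buf.getvalue()


pdf = figure_to_pdf_bytes()
```

Closing the figure after saving matters in a long-running server process,
since pyplot otherwise keeps every figure alive.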

Visual layout is approximate until reference PDFs from Instantel land
at docs/reference/instantel/ for iteration.  USBM RI8507 / OSMRE
compliance chart is stubbed (placeholder rectangle) — separate work
item.

Smoke-tested on a K558 waveform event: 77 KB valid PDF, all fields
populated correctly from the snapshot DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 02:55:58 +00:00
parent ed926de3f4
commit 411ef8139e
6 changed files with 566 additions and 1 deletions
@@ -0,0 +1,518 @@
"""
sfm/report_pdf.py — generate Instantel-style Event Report PDFs.
Stub layout for v0.20.0: the exact visual will be iterated against actual
Blastware reference PDFs once they are uploaded to docs/reference/instantel/.
Current output captures all the data fields a real BW Event Report
contains, but the visual hierarchy / spacing is still approximate.
Architecture
────────────
1. ``gather_report_data(db, store, event_id)`` assembles a ``ReportData``
from three sources: the SeismoDb events row, the .sfm.json sidecar
(bw_report block), and the .h5 waveform samples. Returns ``None`` when
the event doesn't exist or has no waveform data on disk.
2. ``render_event_report_pdf(rd)`` takes that ``ReportData`` and produces
a single-page letter-sized PDF as bytes, using matplotlib's PDF
backend (vector output, no rasterization, prints cleanly).
3. The HTTP endpoint at ``/db/events/{id}/report.pdf`` wires them
together: fetch event → gather → render → stream bytes back with
``Content-Type: application/pdf``.
What's in the report (every field BW's printout includes):
Header (left): Date/Time, Trigger Source, Range, Sample Rate, Notes,
Project, Client, User Name, Seis. Loc
Header (right): Serial + firmware, Battery, Calibration, File Name,
Post Event Notes
Mic block: PSPL (dBL + psi), ZC Freq, Channel Test result
Stats table: per-channel PPV / ZC Freq / Time of Peak /
Peak Acceleration / Peak Displacement / Sensor Check
Peak Vector Sum
Waveform plot: 4 channels stacked (MicL/Long/Vert/Tran), shared
time axis, trigger marker, peak markers
USBM RI8507/OSMRE compliance chart: STUBBED — separate work item
Histogram events: the layout differs (Number of Intervals header
field, no trigger marker, per-interval bar chart instead of waveform).
Handled via a record_type branch in ``render_event_report_pdf``.
"""
from __future__ import annotations
import io
import json
import logging
import math
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
import matplotlib
matplotlib.use("Agg") # headless — no display required
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages
log = logging.getLogger(__name__)
# Reference pressure for dB(L) conversion: 20 µPa expressed in psi.
DBL_REF_PSI = 2.9e-9
# ── Data assembly ────────────────────────────────────────────────────────────
@dataclass
class ReportData:
    """All fields needed to render an Instantel-style Event Report.

    Most fields are Optional: BW's printout shows '' or just omits
    sections when source data is missing. The renderer mirrors that.
    """

    # Header: left column
    event_datetime_str: Optional[str] = None
    trigger_source: Optional[str] = None
    geo_range_str: Optional[str] = None
    sample_rate_str: Optional[str] = None
    notes: Optional[str] = None
    project: Optional[str] = None
    client: Optional[str] = None
    operator: Optional[str] = None
    sensor_location: Optional[str] = None

    # Header: right column
    serial: Optional[str] = None
    firmware: Optional[str] = None
    battery_volts: Optional[float] = None
    calibration_date: Optional[str] = None
    calibration_by: Optional[str] = None
    file_name: Optional[str] = None
    post_event_notes: Optional[str] = None

    # Microphone block
    mic_pspl_dbl: Optional[float] = None
    mic_pspl_psi: Optional[float] = None
    mic_pspl_time_s: Optional[float] = None
    mic_zc_freq_hz: Optional[float] = None
    mic_channel_test_result: Optional[str] = None
    mic_channel_test_freq_hz: Optional[float] = None
    mic_channel_test_amp_mv: Optional[float] = None

    # Per-channel stats: list of dicts (one per channel)
    # Keys: name, ppv_ips, zc_freq_hz, time_of_peak_s,
    #       peak_accel_g, peak_disp_in, sensor_check
    channel_stats: list[dict] = field(default_factory=list)

    # Peak Vector Sum
    peak_vector_sum_ips: Optional[float] = None
    peak_vector_sum_time_s: Optional[float] = None

    # Waveform samples: channels[ch] = list of floats in physical units
    # Time axis derived from sample_rate + pretrig_samples
    channels: dict = field(default_factory=dict)
    sample_rate_sps: Optional[int] = None
    pretrig_samples: Optional[int] = None
    t0_ms: Optional[float] = None
    dt_ms: Optional[float] = None

    # Record-type discriminator
    record_type: Optional[str] = None
    is_histogram: bool = False

    # Bookkeeping
    event_id: Optional[str] = None
    server_received_at: Optional[str] = None
    bw_pc_sw_version: Optional[str] = None


def gather_report_data(
    db,
    store,
    event_id: str,
) -> Optional[ReportData]:
    """Collect every field needed to render an event report.

    Returns ``None`` if the event is unknown or has no waveform data
    on disk (no .h5, no .a5.pkl; the same condition the waveform.json
    endpoint 404s on).
    """
    row = db.get_event(event_id)
    if row is None:
        return None

    serial = row.get("serial")
    filename = row.get("blastware_filename")
    if not serial or not filename:
        return None

    rd = ReportData(
        event_id=event_id,
        serial=serial,
        file_name=filename,
        record_type=row.get("record_type"),
        is_histogram=str(row.get("record_type", "")).lower().startswith("hist"),
        event_datetime_str=row.get("timestamp"),
        sample_rate_sps=row.get("sample_rate"),
        project=row.get("project"),
        client=row.get("client"),
        operator=row.get("operator"),
        sensor_location=row.get("sensor_location"),
        server_received_at=row.get("created_at"),
    )

    # ── Sidecar bw_report: the rich BW-derived fields ──
    sidecar_path = store.sidecar_path_for(serial, filename)
    if sidecar_path.exists():
        try:
            sc = json.loads(sidecar_path.read_text())
        except Exception as exc:
            log.warning("gather_report_data: sidecar read failed: %s", exc)
            sc = {}
        bw = sc.get("bw_report") or {}

        # Trigger / range / sample-rate display
        trig = bw.get("trigger") or {}
        rd.trigger_source = (
            f"{trig.get('channel','')}: {trig.get('geo_level_ips')} in/s"
            if trig.get("channel") or trig.get("geo_level_ips") is not None
            else None
        )
        rec = bw.get("recording") or {}
        rd.geo_range_str = (
            f"Geo: {rec.get('geo_range_ips')} in/s"
            if rec.get("geo_range_ips") is not None else None
        )
        rt = rec.get("record_time_s")
        if rt is not None and rd.sample_rate_sps:
            rd.sample_rate_str = f"{rt:.1f} sec At {rd.sample_rate_sps} Sps"

        # Device block
        dev = bw.get("device") or {}
        rd.battery_volts = dev.get("battery_volts")
        rd.calibration_date = dev.get("calibration_date")
        rd.calibration_by = dev.get("calibration_by")
        rd.firmware = bw.get("version")
        rd.bw_pc_sw_version = bw.get("pc_sw_version")

        # Microphone block
        mic = bw.get("mic") or {}
        rd.mic_pspl_dbl = mic.get("pspl_dbl")
        if rd.mic_pspl_dbl is not None and rd.mic_pspl_dbl > 0:
            # Inverse of the dBL formula → psi. Mirrors waveform_codec convention.
            rd.mic_pspl_psi = DBL_REF_PSI * (10 ** (rd.mic_pspl_dbl / 20))
        rd.mic_pspl_time_s = mic.get("time_of_peak_s")
        rd.mic_zc_freq_hz = mic.get("zc_freq_hz")
        sc_mic = (bw.get("sensor_check") or {}).get("mic") or {}
        rd.mic_channel_test_result = sc_mic.get("result")
        rd.mic_channel_test_freq_hz = sc_mic.get("freq_hz")
        rd.mic_channel_test_amp_mv = sc_mic.get("amplitude_mv")

        # Per-channel stats (Tran / Vert / Long)
        peaks = bw.get("peaks") or {}
        sc_block = bw.get("sensor_check") or {}
        for ch_lc, ch_label in (("tran", "Tran"), ("vert", "Vert"), ("long", "Long")):
            ch = peaks.get(ch_lc) or {}
            sc_ch = sc_block.get(ch_lc) or {}
            rd.channel_stats.append({
                "name": ch_label,
                "ppv_ips": ch.get("ppv_ips"),
                "zc_freq_hz": ch.get("zc_freq_hz"),
                "time_of_peak_s": ch.get("time_of_peak_s"),
                "peak_accel_g": ch.get("peak_accel_g"),
                "peak_disp_in": ch.get("peak_disp_in"),
                "sensor_check": sc_ch.get("result"),
            })

        # Peak Vector Sum
        vs = peaks.get("vector_sum") or {}
        rd.peak_vector_sum_ips = vs.get("ips")
        rd.peak_vector_sum_time_s = vs.get("time_s")

    # ── Waveform samples: from the .h5 via the existing helper ──
    from sfm import event_hdf5
    h5_path = store.hdf5_path_for(serial, filename)
    if h5_path.exists():
        try:
            wf = event_hdf5.plot_json_from_hdf5(h5_path, event_id=event_id)
            rd.channels = {
                ch: (chd.get("values") or [])
                for ch, chd in (wf.get("channels") or {}).items()
            }
            ta = wf.get("time_axis") or {}
            rd.sample_rate_sps = rd.sample_rate_sps or ta.get("sample_rate")
            rd.pretrig_samples = ta.get("pretrig_samples")
            rd.t0_ms = ta.get("t0_ms")
            rd.dt_ms = ta.get("dt_ms")
        except Exception as exc:
            log.warning("gather_report_data: hdf5 read failed: %s", exc)

    return rd


# ── PDF rendering ────────────────────────────────────────────────────────────
def render_event_report_pdf(rd: ReportData) -> bytes:
    """Render a ``ReportData`` to a single-page letter PDF.

    Returns the raw PDF bytes; the caller streams them back via FastAPI.

    NOTE: this is a v0.20.0 stub layout. The visual hierarchy will be
    refined once reference PDFs land at docs/reference/instantel/. All
    fields the printout includes are surfaced; spacing and typography
    are approximate.
    """
    # Letter portrait: 8.5"×11"
    fig = plt.figure(figsize=(8.5, 11), dpi=100)
    fig.patch.set_facecolor("white")

    # Grid: header rows on top, stats in the middle, waveform plot at bottom.
    # The height_ratios sum doesn't matter, only the relative proportions.
    gs = fig.add_gridspec(
        nrows=4, ncols=1,
        left=0.07, right=0.96, top=0.96, bottom=0.04,
        height_ratios=[2.2, 1.0, 1.4, 5.0],
        hspace=0.35,
    )

    # ── Header area (top) ──
    ax_header = fig.add_subplot(gs[0])
    ax_header.axis("off")
    _draw_header(ax_header, rd)

    # ── Mic block (left) + USBM chart placeholder (right) ──
    ax_mic = fig.add_subplot(gs[1])
    ax_mic.axis("off")
    _draw_mic_block(ax_mic, rd)

    # ── Per-channel stats table + Peak Vector Sum ──
    ax_stats = fig.add_subplot(gs[2])
    ax_stats.axis("off")
    _draw_channel_stats(ax_stats, rd)

    # ── Waveform / histogram plot ──
    if rd.is_histogram:
        _draw_histogram_subplot(fig, gs[3], rd)
    else:
        _draw_waveform_subplot(fig, gs[3], rd)

    # Footer
    fig.text(
        0.07, 0.015,
        f"Generated by seismo-relay • event_id={rd.event_id or ''}",
        fontsize=7, color="#888", ha="left",
    )

    buf = io.BytesIO()
    fig.savefig(buf, format="pdf")
    plt.close(fig)
    return buf.getvalue()


def _kv(ax, x, y, label, value, *, label_w=0.18):
    """Render a 'Label Value' row at axes-coordinates (x, y)."""
    ax.text(x, y, label, fontsize=8, color="#555", ha="left", va="top",
            transform=ax.transAxes)
    ax.text(x + label_w, y, _fmt(value), fontsize=8, ha="left", va="top",
            transform=ax.transAxes, family="monospace")


def _fmt(v):
    """Format any field for display: '' for None, str otherwise."""
    if v is None:
        return ""
    if isinstance(v, float):
        # Up to 4 decimals, with trailing zeros and a dangling '.' trimmed.
        return f"{v:.4f}".rstrip("0").rstrip(".")
    return str(v)


def _draw_header(ax, rd: ReportData) -> None:
    """Two-column metadata header, following the BW printout layout."""
    # Left column
    rows_left = [
        ("Date/Time", rd.event_datetime_str),
        ("Trigger Source", rd.trigger_source),
        ("Range", rd.geo_range_str),
        ("Sample Rate", rd.sample_rate_str),
        ("Notes", rd.notes),
        ("Project:", rd.project),
        ("Client:", rd.client),
        ("User Name:", rd.operator),
        ("Seis. Loc:", rd.sensor_location),
    ]
    # Right column
    rows_right = [
        ("Serial Number", f"{rd.serial or ''}"
                          + (f" {rd.firmware}" if rd.firmware else "")),
        ("Battery Level",
         f"{rd.battery_volts:.1f} Volts" if rd.battery_volts is not None else None),
        ("Unit Calibration",
         (f"{rd.calibration_date}"
          + (f" by {rd.calibration_by}" if rd.calibration_by else ""))
         if rd.calibration_date else None),
        ("File Name", rd.file_name),
        ("Post Event Notes", rd.post_event_notes),
    ]

    y = 0.95
    dy = 0.10
    for label, value in rows_left:
        _kv(ax, 0.0, y, label, value, label_w=0.18)
        y -= dy

    y = 0.95
    for label, value in rows_right:
        _kv(ax, 0.55, y, label, value, label_w=0.20)
        y -= dy


def _draw_mic_block(ax, rd: ReportData) -> None:
    """Microphone block: PSPL, ZC Freq, Channel Test.

    USBM chart placeholder on the right (filled in by a separate work item).
    """
    ax.text(0.0, 0.95, "Microphone Linear Weighting", fontsize=8, color="#555",
            transform=ax.transAxes, va="top")

    rows = []
    if rd.mic_pspl_dbl is not None:
        line = f"{rd.mic_pspl_dbl:.1f} dB(L)"
        if rd.mic_pspl_time_s is not None:
            line += f" at {rd.mic_pspl_time_s:.3f} sec."
        rows.append(("PSPL", line))
    if rd.mic_zc_freq_hz is not None:
        rows.append(("ZC Freq", f"{rd.mic_zc_freq_hz:.0f} Hz"))
    if rd.mic_channel_test_result:
        line = rd.mic_channel_test_result
        if rd.mic_channel_test_freq_hz is not None and rd.mic_channel_test_amp_mv is not None:
            line += (f" (Freq = {rd.mic_channel_test_freq_hz:.1f} Hz, "
                     f"Amp = {rd.mic_channel_test_amp_mv:.0f} mv)")
        rows.append(("Channel Test", line))

    y = 0.70
    for label, value in rows:
        _kv(ax, 0.0, y, label, value, label_w=0.18)
        y -= 0.22

    # USBM chart placeholder: upper-right of this row
    ax.text(0.75, 0.95, "USBM RI8507 / OSMRE",
            fontsize=8, color="#555", ha="center", va="top",
            transform=ax.transAxes)
    ax.text(0.75, 0.45, "[compliance chart\nrenders here]",
            fontsize=8, color="#bbb", ha="center", va="center",
            transform=ax.transAxes, style="italic")


def _draw_channel_stats(ax, rd: ReportData) -> None:
    """Per-channel stats table + Peak Vector Sum row."""
    # Build a 2-D array of strings: one header row plus six stat rows,
    # each with a label, one value per channel (Tran/Vert/Long), and a unit.
    headers = ["", "Tran", "Vert", "Long", ""]
    rows = [
        ["PPV", "ppv_ips", "in/s"],
        ["ZC Freq", "zc_freq_hz", "Hz"],
        ["Time (Rel. to Trig)", "time_of_peak_s", "sec"],
        ["Peak Acceleration", "peak_accel_g", "g"],
        ["Peak Displacement", "peak_disp_in", "in"],
        ["Sensor Check", "sensor_check", ""],
    ]
    ch_lookup = {c["name"]: c for c in rd.channel_stats}

    # Parameter is named field_name so it doesn't shadow dataclasses.field.
    def _cell(field_name, ch_name):
        val = ch_lookup.get(ch_name, {}).get(field_name)
        if val is None:
            return ""
        if field_name == "sensor_check":
            return str(val)
        if isinstance(val, float):
            return f"{val:.3f}"
        return str(val)

    table_data = [headers]
    for label, field_name, unit in rows:
        table_data.append([
            label,
            _cell(field_name, "Tran"),
            _cell(field_name, "Vert"),
            _cell(field_name, "Long"),
            unit,
        ])

    tbl = ax.table(
        cellText=table_data, loc="upper left",
        colWidths=[0.30, 0.13, 0.13, 0.13, 0.10],
        cellLoc="left", edges="open",
    )
    tbl.auto_set_font_size(False)
    tbl.set_fontsize(8)
    tbl.scale(1, 1.4)

    # Header row styling
    for j in range(5):
        cell = tbl[(0, j)]
        cell.set_text_props(weight="bold", color="#555")

    # Peak Vector Sum
    if rd.peak_vector_sum_ips is not None:
        line = f"Peak Vector Sum {rd.peak_vector_sum_ips:.3f} in/s"
        if rd.peak_vector_sum_time_s is not None:
            line += f" At {rd.peak_vector_sum_time_s:.3f} sec."
        ax.text(0.0, -0.05, line, fontsize=9, weight="bold",
                ha="left", va="top", transform=ax.transAxes)


def _channel_axis_color(ch: str) -> str:
    """Trace/label color per channel; neutral grey for unknown names."""
    return {
        "MicL": "#cc00cc",
        "Long": "#0066ff",
        "Vert": "#009933",
        "Tran": "#cc0000",
    }.get(ch, "#444")


def _draw_waveform_subplot(fig, gridspec_cell, rd: ReportData) -> None:
    """4-channel stacked waveform plot in Instantel printout order
    (MicL on top, Tran on bottom), shared x-axis."""
    inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
    order = ["MicL", "Long", "Vert", "Tran"]
    sr = rd.sample_rate_sps or 1024
    dt_ms = rd.dt_ms or (1000.0 / sr)
    t0_ms = rd.t0_ms if rd.t0_ms is not None else 0.0
    last_idx = len(order) - 1

    for i, ch in enumerate(order):
        ax = fig.add_subplot(inner[i])
        values = rd.channels.get(ch) or []
        times = [t0_ms + j * dt_ms for j in range(len(values))]
        if values:
            color = _channel_axis_color(ch)
            ax.plot(times, values, color=color, linewidth=0.6)
            # Symmetric y-axis for geo channels (mic limits are left to autoscale)
            if ch != "MicL":
                # Floor the limit so an all-zero trace still gets a usable range
                amax = max(max(abs(v) for v in values), 0.001)
                ax.set_ylim(-amax * 1.1, amax * 1.1)

        # Channel label on left
        ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
                      color=_channel_axis_color(ch), weight="bold", labelpad=14)
        ax.grid(True, linestyle=":", linewidth=0.4, alpha=0.5)
        # Dashed trigger line at t=0
        ax.axvline(0.0, color="#cc0000", linestyle="--", linewidth=0.8, alpha=0.7)
        # Zero baseline
        ax.axhline(0.0, color="#888", linestyle="-", linewidth=0.4, alpha=0.5)

        if i != last_idx:
            ax.set_xticklabels([])
        else:
            ax.set_xlabel("Time (ms)", fontsize=8)
        ax.tick_params(axis="both", labelsize=7)


def _draw_histogram_subplot(fig, gridspec_cell, rd: ReportData) -> None:
    """4-channel stacked histogram bar chart of per-interval peaks."""
    inner = gridspec_cell.subgridspec(4, 1, hspace=0.0)
    order = ["MicL", "Long", "Vert", "Tran"]
    last_idx = len(order) - 1

    for i, ch in enumerate(order):
        ax = fig.add_subplot(inner[i])
        values = rd.channels.get(ch) or []
        if values:
            # Intervals are numbered from 1 on the x-axis
            xs = np.arange(1, len(values) + 1)
            color = _channel_axis_color(ch)
            ax.bar(xs, values, color=color, width=1.0, linewidth=0)

        ax.set_ylabel(ch, fontsize=8, rotation=0, ha="right", va="center",
                      color=_channel_axis_color(ch), weight="bold", labelpad=14)
        ax.grid(True, axis="y", linestyle=":", linewidth=0.4, alpha=0.5)

        if i != last_idx:
            ax.set_xticklabels([])
        else:
            ax.set_xlabel("Interval", fontsize=8)
        ax.tick_params(axis="both", labelsize=7)
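
The time-axis and y-limit arithmetic used by the waveform plot can be
isolated into two small pure functions, which makes it easy to check
without matplotlib. This is only a sketch: the helper names and the
0.001 floor for all-zero traces are assumptions for illustration, not
part of the committed module.

```python
from typing import List, Optional, Sequence, Tuple


def time_axis_ms(
    n_samples: int,
    sample_rate_sps: int = 1024,
    t0_ms: float = 0.0,
    dt_ms: Optional[float] = None,
) -> List[float]:
    """Sample times in ms; dt_ms falls back to 1000 / sample_rate_sps."""
    step = dt_ms if dt_ms else 1000.0 / sample_rate_sps
    return [t0_ms + j * step for j in range(n_samples)]


def symmetric_ylim(
    values: Sequence[float], pad: float = 1.1, floor: float = 0.001
) -> Tuple[float, float]:
    """Limits centred on zero, padded by 10 percent around the largest |value|."""
    amax = max((abs(v) for v in values), default=0.0)
    amax = max(amax, floor)  # assumed floor: avoids a zero-height axis
    return (-amax * pad, amax * pad)
```

With a negative t0_ms (pre-trigger samples), the trigger instant falls at
t = 0, which is where the dashed trigger line is drawn.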