diff --git a/CHANGELOG.md b/CHANGELOG.md index 3973d36..a71d3cf 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,56 @@ All notable changes to Terra-View will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.10.0] - 2026-05-14 + +This release brings terra-view onto the SFM (Seismograph Field Module) event pipeline. Triggered events forwarded by series3-watcher now land in SFM, and terra-view reads from that store as the authoritative source for vibration data. The watcher heartbeat is preserved as a transparent fallback signal. + +### Added +- **SFM Integration**: New fleet-wide events page at `/sfm` listing every event ingested by SFM, with filters for serial, date range, false-trigger flag, and limit. Unit detail pages and project-location pages show their own attributed subsets of the same event stream. +- **Event Detail Modal**: Shared across `/sfm`, unit detail, and project-location pages — clicking any event opens a rich modal showing peaks per channel (PVS color-coded by magnitude), microphone dB(L) + ZC frequency + time of peak, sensor self-check table with pass/fail per channel, device/recording metadata (firmware, battery, calibration date, geo range), and download buttons for the original Blastware binary and the sidecar JSON. Includes an inline pretty-printed JSON viewer with copy-to-clipboard. +- **Events Attribution Engine** (`backend/services/sfm_events.py`): Per-event attribution against `UnitAssignment` time windows. Events outside any assignment window surface in an "Unattributed" bucket with the nearest-assignment diagnostic (which location, signed delta in days). 
+- **Metadata Backfill Tool** (`/tools` → Backfill from event metadata): Scans operator-typed `project` and `sensor_location` strings in event sidecars, fuzzy-clusters them via `rapidfuzz.WRatio`, and proposes retroactive `UnitAssignment` records to attribute orphan events. Tracks operator decisions per cluster across re-scans. +- **Project Tidy Tool** (`/tools` → Project Tidy): Fuzzy-detect duplicate projects and bulk-merge them with a single click. Source projects soft-deleted with full audit trail. +- **Vibration Summary on Project Pages**: New roll-up card on vibration project detail pages showing per-location event counts, the project's "Overall Peak" PVS (false triggers excluded), last event timestamp, and a Top Locations by Activity list. +- **SFM-Primary Seismograph Status**: `emit_status_snapshot()` now consults SFM's `/db/units` (cached 15s) before falling back to `Emitter.last_seen` for each seismograph. The fresher signal wins; the choice is recorded in a new per-unit `last_seen_source` field. A small `SFM` (orange) or `HB` (gray) badge on each unit's active-table row shows which path is currently driving the status. +- **Dashboard Rework**: Top row reordered to Recent Alerts → Recent Call-Ins (double-wide) → Fleet Summary. Today's Schedule moved to a horizontal collapsible card below the Fleet Map, auto-expanding only when pending actions exist. Recent Call-Ins now sources from a new `/api/recent-event-callins` endpoint backed by SFM event forwards instead of the watcher-heartbeat endpoint. +- **Sortable Events Tables**: `/sfm` and unit-detail SFM Events tables now have clickable column headers with ↕/↓/↑ indicators. Default sort is Timestamp DESC. Click same column to toggle direction; click different column to switch and reset to DESC. Pure client-side over cached rows — no re-fetches. 
+- **Developer → SFM Admin** (`/admin/sfm`): Health banner with reachability indicator, terra-view↔SFM connection panel, 4 KPI tiles (known units, total events, stale `monitor_log` rows, stale `ach_sessions` rows), per-unit roll-up table, recent-events table with color-coded forwarding latency (so stale watcher forwards stand out), and a raw API tester for any `/api/sfm/*` path. +- **Developer → SLMM Admin** (`/admin/slmm`): Stripped-down companion page — health, connection info, raw API tester. +- **Tools Workflow Hub** (`/tools`): New top-level sidebar entry consolidating Pair Devices, Project Tidy, Metadata Backfill, Reports (info card), and Swap Detection (placeholder). +- **Sidebar Reorganization**: Devices → Projects → Events → Tools → Job Planner → Settings. Devices is now a single entry with internal tabs (All Devices / Seismographs / Sound Level Meters / Modems / Pair Devices) replacing five separate sidebar items. +- **Synology Deployment Doc** (`docs/SYNOLOGY_DEPLOYMENT.md`): End-to-end playbook for migrating the stack to an always-on office NAS — phased rollout (pre-stage, data rsync, watcher repoint, external access, decommission), Tailscale vs reverse-proxy options, rollback plan, and gotchas. + +### Changed +- **Overall Peak excludes false triggers**: The project-level "Overall Peak" KPI tile (and the underlying `_compute_stats()` function in `sfm_events.py`) now skip events flagged as false triggers when computing the highest PVS, so operators see the highest real event rather than the biggest sensor glitch. `false_trigger_count` still includes flagged events so operators can see how many were filtered out. +- **`RosterUnit.note` Editing**: Inline edit on seismograph cards is more forgiving and now auto-saves on blur. +- **Sidebar Nav Renamed**: Old "Fleet" sidebar entry → "Devices" (renamed because it always meant the device list, not the broader fleet view). 
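To illustrate the "Overall Peak excludes false triggers" change above, here is a minimal sketch of a roll-up that skips flagged events for the peak while still counting them. It is not the actual `_compute_stats()` from `sfm_events.py`: the function name `compute_stats_sketch`, the `overall_peak` and `event_count` keys, and the event shape are assumptions made for illustration. Only the `peak_vector_sum`, `false_trigger`, and `false_trigger_count` names come from this change.

```python
from typing import Any, Dict, Iterable, Optional


def compute_stats_sketch(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical roll-up: highest PVS over non-flagged events only."""
    overall_peak: Optional[float] = None
    false_trigger_count = 0
    event_count = 0

    for ev in events:
        event_count += 1
        if ev.get("false_trigger"):
            # Flagged events are still counted so operators can see how
            # many were filtered out, but they never contribute to the peak.
            false_trigger_count += 1
            continue
        pvs = ev.get("peak_vector_sum")
        if pvs is not None and (overall_peak is None or pvs > overall_peak):
            overall_peak = pvs

    return {
        "event_count": event_count,
        "false_trigger_count": false_trigger_count,
        "overall_peak": overall_peak,
    }
```

With this shape, a project whose only large reading is a flagged glitch reports the largest unflagged PVS as its peak, and `false_trigger_count` shows how many events were excluded.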
+ +### Fixed +- **Status drift between watcher heartbeat and actual event arrivals**: Seismographs are now reported with whichever signal is more recent. This eliminates the case where a unit with recent SFM events but a stale heartbeat (or vice versa) showed the wrong status. +- **Event modal: Record Type always showed "Waveform"**: Client-side workaround: Record Type is now derived from the Blastware filename's last-character code (`H`=Histogram, `W`=Waveform, `M`=Manual, `E`=Event, `C`=Combo). The proper fix lives in SFM's sidecar parser; tracked separately. +- **Event modal: Mic PSI tile removed**: Operators only care about dB(L); the redundant PSI tile was dropped. + +### Migration Notes +Run on each database before deploying. Every migration is idempotent. + +```bash +# Cleanest: re-run all migrations (the glob expands in alphabetical order). +# Already-applied migrations no-op safely. +for f in backend/migrate_*.py; do + docker exec terra-view-terra-view-1 python3 "/app/backend/$(basename "$f")" +done +``` + +Migrations new in this release: +- `migrate_add_metadata_backfill.py`: adds `unit_assignments.source` column and `metadata_backfill_decisions` table for the Metadata Backfill tool + +### Deployment Notes +- **`SFM_BASE_URL`**: Confirm prod's `docker-compose.yml` sets this for the terra-view service (typically `http://sfm:8200` for the in-stack SFM container, or an external URL if SFM lives elsewhere). +- **Watcher repoint**: series3-watcher's `sfm_forward_url` should point at `https:///api/sfm` (proxy-based, so no second port forward is needed). Watcher composes the full path `/db/import/blastware_file` itself. + +--- + ## [0.9.4] - 2026-04-06 ### Added diff --git a/README.md b/README.md index 011f764..9ab2bc2 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Terra-View v0.9.4 +# Terra-View v0.10.0 Backend API and HTMX-powered web interface for managing a mixed fleet of seismographs and field modems. 
Track deployments, monitor health in real time, merge roster intent with incoming telemetry, and control your fleet through a unified database and dashboard. ## Features @@ -496,6 +496,19 @@ docker compose down -v ## Release Highlights +### v0.10.0 — 2026-05-14 +- **SFM Integration**: terra-view now consumes events from the SFM (Seismograph Field Module) backend in real time, with a fleet-wide events page at `/sfm`, per-unit attribution against project assignment windows, and a project-level vibration roll-up that uses SFM data as the single source of truth. +- **SFM-Primary Seismograph Status**: Deployed seismograph status (OK/Pending/Missing) now flows from SFM event forwards first; the watcher heartbeat stays as a transparent backup. Each unit's active table row shows a small `SFM` or `HB` badge so operators can see at a glance which signal is currently driving the status. +- **Dashboard Rework**: Top row reordered to Recent Alerts → Recent Call-Ins (double-wide) → Fleet Summary. Today's Schedule moves to a horizontal collapsible card below the Fleet Map, auto-expanding only when there's a pending action. Recent Call-Ins now sources from SFM event forwards instead of the legacy watcher-heartbeat endpoint. +- **Event Detail Modal**: Click any event anywhere in the app to open a rich detail modal showing peak particle velocity per channel, microphone dB(L), sensor self-check results, device/recording metadata, and download buttons for the original Blastware binary and sidecar JSON. Includes an inline JSON viewer with one-click copy. +- **Sortable Events Tables**: Every events table (project events, unit-detail events, fleet-wide /sfm) now supports clickable column-header sorting with directional indicators. Defaults to newest-first. +- **Events Attribution & Backfill**: Each SFM event is automatically attributed to a project/location based on `UnitAssignment` time windows. Unattributed events get a diagnostic showing the nearest assignment and a delta-days gap. 
The metadata-backfill tool in `/tools` scans operator-typed project/sensor-location strings in event sidecars and clusters them via fuzzy matching to propose retroactive `UnitAssignment` records. +- **Projects Tools**: New `/tools` workflow hub consolidates Pair Devices, Project Tidy (fuzzy-detect + merge duplicate projects), Metadata Backfill, Reports, and Swap Detection (placeholder). +- **Sidebar Reorganization**: Devices → Projects → Events → Tools → Job Planner → Settings. Devices is now a single entry with internal tabs (All Devices / Seismographs / Sound Level Meters / Modems / Pair Devices). +- **Developer → SFM Admin**: New `/admin/sfm` page surfacing SFM health, per-unit roll-up from `/db/units`, recent-events table with forwarding latency (so operators can spot stale watcher forwards), stale-table counts, and a raw API tester. Companion `/admin/slmm` page covers SLMM health + raw API. +- **"Overall Peak" excludes False Triggers**: The project-level Overall Peak KPI tile now excludes events flagged as false triggers, so operators see the highest real event, not the biggest sensor glitch. +- **Synology Deployment Doc**: New `docs/SYNOLOGY_DEPLOYMENT.md` covers migrating the stack to an always-on office NAS, including phased rollout, data rsync, watcher repoint, external access (Tailscale or reverse-proxy), and rollback plan. 
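The attribution behaviour described in the highlights above (match each event to an assignment time window, otherwise report the nearest assignment and a delta in days) could be sketched roughly as follows. This is an illustrative sketch, not the code in `backend/services/sfm_events.py`: the `Window` dataclass, the `attribute_event` function, and the sign convention (negative when the event precedes the window, positive when it follows) are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Window:
    """Hypothetical stand-in for a UnitAssignment row."""
    location: str
    assigned_at: datetime
    assigned_until: Optional[datetime]  # None means the assignment is still open


def attribute_event(
    ts: datetime, windows: List[Window]
) -> Tuple[Optional[str], Optional[Tuple[str, float]]]:
    """Return (location, None) when ts falls inside a window.

    Otherwise return (None, (nearest_location, signed_delta_days)), or
    (None, None) when there are no windows at all.
    """
    nearest: Optional[Tuple[str, float]] = None
    for w in windows:
        if ts < w.assigned_at:
            # Event happened before this window opened: negative delta.
            delta = (ts - w.assigned_at).total_seconds() / 86400.0
        elif w.assigned_until is not None and ts > w.assigned_until:
            # Event happened after this window closed: positive delta.
            delta = (ts - w.assigned_until).total_seconds() / 86400.0
        else:
            return w.location, None
        if nearest is None or abs(delta) < abs(nearest[1]):
            nearest = (w.location, delta)
    return None, nearest
```

An event two days after a closed window would come back as unattributed with that window's location and a delta of `2.0`, which is the kind of diagnostic an operator could use to decide whether to extend the assignment.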
+ ### v0.8.0 — 2026-03-18 - **Watcher Manager**: Admin page for monitoring field watcher agents with live status cards, log tails, and one-click update triggering - **Watcher Status Fix**: Agent status now reflects heartbeat connectivity (missing if not heard from in >60 min) rather than unit-level data staleness @@ -599,9 +612,21 @@ MIT ## Version -**Current: 0.8.0** — Watcher Manager admin page, live agent status refresh, watcher connectivity-based status (2026-03-18) +**Current: 0.10.0** — SFM integration, SFM-primary seismograph status, dashboard rework, sortable events tables, event detail modal, /admin/sfm + /admin/slmm diagnostic pages, Tools workflow hub (2026-05-14) -Previous: 0.7.1 — Out-for-calibration status, reservation modal, migration fixes (2026-03-12) +Previous: 0.9.4 — Modular project types, deleted project management, swap modal search, roster auto-refresh fix (2026-04-06) + +0.9.3 — Monitoring session detail page, configurable period windows, vibration project redesign, modem assignment on locations (2026-03-28) + +0.9.2 — Deployment records, allocated status, quick-info unit modal, inline seismograph editing (2026-03-27) + +0.9.1 — Fix location slots not persisting on save/reload (2026-03-20) + +0.9.0 — Job Planner redesign, monitoring locations, estimated units, smart color picker, calendar bar tooltips, toast notifications (2026-03-19) + +0.8.0 — Watcher Manager admin page, live agent status refresh, watcher connectivity-based status (2026-03-18) + +0.7.1 — Out-for-calibration status, reservation modal, migration fixes (2026-03-12) 0.7.0 — Project status management, manual SD card upload, combined report wizard, NL32 support, MonitoringSession rename (2026-03-07) diff --git a/backend/main.py b/backend/main.py index ca35337..c7d39d7 100644 --- a/backend/main.py +++ b/backend/main.py @@ -18,7 +18,7 @@ logging.basicConfig( logger = logging.getLogger(__name__) from backend.database import engine, Base, get_db -from backend.routers import roster, 
units, photos, roster_edit, roster_rename, dashboard, dashboard_tabs, activity, slmm, slm_ui, slm_dashboard, seismo_dashboard, projects, project_locations, scheduler, modem_dashboard +from backend.routers import roster, units, photos, roster_edit, roster_rename, dashboard, dashboard_tabs, activity, slmm, slm_ui, slm_dashboard, seismo_dashboard, sfm, projects, project_locations, scheduler, modem_dashboard from backend.services.snapshot import emit_status_snapshot from backend.models import IgnoredUnit from backend.utils.timezone import get_user_timezone @@ -30,7 +30,7 @@ Base.metadata.create_all(bind=engine) ENVIRONMENT = os.getenv("ENVIRONMENT", "production") # Initialize FastAPI app -VERSION = "0.9.4" +VERSION = "0.10.0" if ENVIRONMENT == "development": _build = os.getenv("BUILD_NUMBER", "0") if _build and _build != "0": @@ -97,6 +97,7 @@ app.include_router(slmm.router) app.include_router(slm_ui.router) app.include_router(slm_dashboard.router) app.include_router(seismo_dashboard.router) +app.include_router(sfm.router) app.include_router(modem_dashboard.router) from backend.routers import settings @@ -105,6 +106,9 @@ app.include_router(settings.router) from backend.routers import watcher_manager app.include_router(watcher_manager.router) +from backend.routers import admin_modules +app.include_router(admin_modules.router) + # Projects system routers app.include_router(projects.router) app.include_router(project_locations.router) @@ -114,6 +118,10 @@ app.include_router(scheduler.router) from backend.routers import report_templates app.include_router(report_templates.router) +# Metadata-backfill admin router (Phase 5a) +from backend.routers import metadata_backfill +app.include_router(metadata_backfill.router) + # Alerts router from backend.routers import alerts app.include_router(alerts.router) @@ -233,6 +241,34 @@ async def seismographs_page(request: Request): return templates.TemplateResponse("seismographs.html", {"request": request}) +@app.get("/sfm", 
response_class=HTMLResponse) +async def sfm_page(request: Request): + """SFM live event data and device control dashboard""" + return templates.TemplateResponse("sfm.html", {"request": request}) + + +@app.get("/settings/developer/metadata-backfill", response_class=HTMLResponse) +async def metadata_backfill_wizard_page(request: Request): + """Wizard for auto-creating projects/locations/assignments from + operator-typed BW event metadata (Phase 5a).""" + return templates.TemplateResponse("admin/metadata_backfill.html", {"request": request}) + + +@app.get("/settings/developer/project-tidy", response_class=HTMLResponse) +async def project_tidy_page(request: Request): + """Tidy duplicate-looking projects: detect by fuzzy name match, merge + by clicking through pairs (Phase 5b).""" + return templates.TemplateResponse("admin/project_tidy.html", {"request": request}) + + +@app.get("/tools", response_class=HTMLResponse) +async def tools_page(request: Request): + """Tools / workflow hub. Active operator workflows (device pairing, + project tidy, metadata backfill, future swap detection, report + generators) all live here in card form.""" + return templates.TemplateResponse("tools.html", {"request": request}) + + @app.get("/modems", response_class=HTMLResponse) async def modems_page(request: Request): """Field modems management dashboard""" diff --git a/backend/migrate_add_metadata_backfill.py b/backend/migrate_add_metadata_backfill.py new file mode 100644 index 0000000..1afbbd8 --- /dev/null +++ b/backend/migrate_add_metadata_backfill.py @@ -0,0 +1,94 @@ +""" +Migration: add metadata-backfill support. + +Adds: + 1. `unit_assignments.source` column (TEXT, default 'manual'). + Lets us audit which assignments were created by the metadata-backfill + parser vs by a human, and bulk-undo parser actions if needed. + + 2. `metadata_backfill_decisions` table. 
Tracks operator decisions per + cluster_id so the wizard remembers what's been skipped, what's + been applied, and what's pending across re-scans. + +Idempotent — safe to re-run. +Non-destructive — adds only. + +Run with: + docker exec terra-view-terra-view-1 python3 /app/backend/migrate_add_metadata_backfill.py +""" + +import os +import sqlite3 + +DB_PATH = "./data/seismo_fleet.db" + + +def migrate_database(): + if not os.path.exists(DB_PATH): + print(f"Database not found at {DB_PATH}") + return + + print(f"Migrating database: {DB_PATH}") + conn = sqlite3.connect(DB_PATH) + cur = conn.cursor() + + # ── 1. unit_assignments.source column ────────────────────────────────── + cur.execute("PRAGMA table_info(unit_assignments)") + cols = {row[1] for row in cur.fetchall()} + if "source" not in cols: + print("Adding unit_assignments.source column (default 'manual') ...") + cur.execute( + "ALTER TABLE unit_assignments ADD COLUMN source TEXT DEFAULT 'manual'" + ) + # Backfill: any existing row gets source='manual' + cur.execute("UPDATE unit_assignments SET source='manual' WHERE source IS NULL") + conn.commit() + print(" Done.") + else: + print("unit_assignments.source already exists — skipping") + + # ── 2. 
metadata_backfill_decisions table ────────────────────────────── + cur.execute( + "SELECT name FROM sqlite_master WHERE type='table' AND name='metadata_backfill_decisions'" + ) + if cur.fetchone() is None: + print("Creating metadata_backfill_decisions table ...") + cur.execute(""" + CREATE TABLE metadata_backfill_decisions ( + cluster_id TEXT PRIMARY KEY, -- deterministic hash + status TEXT NOT NULL, -- pending | applied | skipped | conflict + confidence TEXT NOT NULL, -- high | medium | low (at time of decision) + decided_at TEXT, -- when applied/skipped + decided_by TEXT, -- 'background' | 'operator' | 'auto-high' + applied_assignment_id TEXT, -- FK to unit_assignments (if applied) + notes TEXT, + first_seen_at TEXT NOT NULL, + last_seen_at TEXT NOT NULL, + serial TEXT NOT NULL, + project_raw TEXT, + location_raw TEXT, + first_event_ts TEXT, + last_event_ts TEXT, + event_count INTEGER NOT NULL DEFAULT 0 + ) + """) + cur.execute( + "CREATE INDEX idx_mbd_status ON metadata_backfill_decisions(status)" + ) + cur.execute( + "CREATE INDEX idx_mbd_last_seen ON metadata_backfill_decisions(last_seen_at)" + ) + cur.execute( + "CREATE INDEX idx_mbd_serial ON metadata_backfill_decisions(serial)" + ) + conn.commit() + print(" Done.") + else: + print("metadata_backfill_decisions table already exists — skipping") + + conn.close() + print("\nMigration complete.") + + +if __name__ == "__main__": + migrate_database() diff --git a/backend/migrate_deprecate_deployment_records.py b/backend/migrate_deprecate_deployment_records.py new file mode 100644 index 0000000..0ccdc4f --- /dev/null +++ b/backend/migrate_deprecate_deployment_records.py @@ -0,0 +1,209 @@ +""" +Migration: deprecate the `deployment_records` table. + +Why: + The deployment-history view on the unit detail page used to render + from `deployment_records` — a manually-maintained table that drifted + out of sync with `unit_assignments` (the auto-written project/location + assignment table). 
That caused the "wonky timeline" symptom: missing + entries, duplicate / contradictory rows, and a UI that couldn't tell + the operator what the unit was actually doing during each window. + + Phase 4 of the SFM integration replaces the deployment-history view + with a derived timeline computed from `unit_assignments` + + `unit_history` + SFM event overlay. This migration is the cleanup: + + 1. Adds a `deprecated_at` timestamp column to `deployment_records` so + we can mark rows that have been migrated. + 2. For every `deployment_records` row that does NOT have a matching + `unit_assignments` row (matched by unit_id + overlapping date + range), synthesizes a best-effort UnitAssignment row. The + free-text `location_name` from the legacy table is preserved on + the new row's `notes` field (we do NOT try to fuzzy-match it to a + MonitoringLocation id; too error-prone — operators will need to + reattach those manually if they want). + 3. Marks every migrated deployment_records row with `deprecated_at`. + + This migration is non-destructive: deployment_records rows stay in + the DB. The actual `DROP TABLE` happens in a follow-up release after + one operator cycle confirms nothing relies on the legacy data. + +Idempotent: re-running the script is a no-op if the column already +exists and all migratable rows have already been processed. + +Run with: + docker exec terra-view-terra-view-1 python3 /app/backend/migrate_deprecate_deployment_records.py +""" + +import os +import sqlite3 +import uuid +from datetime import datetime + +DB_PATH = "./data/seismo_fleet.db" + + +def migrate_database(): + if not os.path.exists(DB_PATH): + print(f"Database not found at {DB_PATH}") + return + + print(f"Migrating database: {DB_PATH}") + conn = sqlite3.connect(DB_PATH) + conn.row_factory = sqlite3.Row + cur = conn.cursor() + + # 1. Add deprecated_at column if not present. 
+ cur.execute("PRAGMA table_info(deployment_records)") + cols = {row["name"] for row in cur.fetchall()} + if "deprecated_at" not in cols: + print("Adding deployment_records.deprecated_at column ...") + cur.execute("ALTER TABLE deployment_records ADD COLUMN deprecated_at TEXT") + conn.commit() + else: + print("deployment_records.deprecated_at column already exists — skipping ADD COLUMN") + + # 2. Find candidate rows: not-yet-deprecated deployment_records that + # have no matching unit_assignments row. + cur.execute(""" + SELECT id, unit_id, deployed_date, estimated_removal_date, + actual_removal_date, project_id, project_ref, location_name, notes + FROM deployment_records + WHERE deprecated_at IS NULL + """) + rows = cur.fetchall() + print(f"\nFound {len(rows)} deployment_records rows not yet deprecated.") + + backfilled = 0 + skipped_no_match_attempted = 0 + skipped_already_in_assignments = 0 + skipped_missing_unit = 0 + + for row in rows: + unit_id = row["unit_id"] + if not unit_id: + print(f" ⚠ row {row['id']!r}: no unit_id, marking deprecated without backfill") + cur.execute( + "UPDATE deployment_records SET deprecated_at=? WHERE id=?", + (datetime.utcnow().isoformat(), row["id"]), + ) + skipped_missing_unit += 1 + continue + + # Does the unit still exist? If not, skip — we don't synthesize + # assignments for ghost units. + cur.execute("SELECT id, device_type FROM roster WHERE id=?", (unit_id,)) + roster = cur.fetchone() + if not roster: + print(f" ⚠ row {row['id']!r}: unit_id {unit_id!r} not in roster, marking deprecated without backfill") + cur.execute( + "UPDATE deployment_records SET deprecated_at=? WHERE id=?", + (datetime.utcnow().isoformat(), row["id"]), + ) + skipped_missing_unit += 1 + continue + + # Check if a UnitAssignment already covers this window (any overlap). + # We don't try to be clever — just see if a row exists for this unit + # whose [assigned_at, assigned_until] overlaps the deployment window. 
+ cur.execute(""" + SELECT id FROM unit_assignments + WHERE unit_id=? + AND (assigned_at <= COALESCE(?, '9999') + AND COALESCE(assigned_until, '9999') >= COALESCE(?, '0000')) + LIMIT 1 + """, ( + unit_id, + row["actual_removal_date"] or row["estimated_removal_date"] or row["deployed_date"], + row["deployed_date"], + )) + if cur.fetchone(): + cur.execute( + "UPDATE deployment_records SET deprecated_at=? WHERE id=?", + (datetime.utcnow().isoformat(), row["id"]), + ) + skipped_already_in_assignments += 1 + continue + + # No matching UnitAssignment, so synthesize one. We can't FK to a + # MonitoringLocation because the legacy `location_name` is free + # text. Backfilled rows go in with location_id = "" (empty) and + # the original location_name dropped into notes for operator + # context. + if not row["project_id"]: + print(f" ⚠ row {row['id']!r}: no project_id, can't synthesize unit_assignment, marking deprecated") + cur.execute( + "UPDATE deployment_records SET deprecated_at=? WHERE id=?", + (datetime.utcnow().isoformat(), row["id"]), + ) + skipped_no_match_attempted += 1 + continue + + synthesized_id = str(uuid.uuid4()) + synth_notes_parts = [] + if row["location_name"]: + synth_notes_parts.append(f"Legacy location: {row['location_name']}") + if row["project_ref"]: + synth_notes_parts.append(f"Legacy project_ref: {row['project_ref']}") + if row["notes"]: + synth_notes_parts.append(f"Original notes: {row['notes']}") + synth_notes_parts.append(f"(Synthesized from deployment_records row {row['id']})") + synth_notes = " | ".join(synth_notes_parts) + + assigned_until = row["actual_removal_date"] + # Don't auto-close active deployments based on estimated_removal_date. + status = "completed" if assigned_until else "active" + + # Need a location_id to satisfy NOT NULL constraint. Use an + # empty-string placeholder so the row can be cleaned up later if the + # operator decides to retarget the assignment to a real location. 
+ # We tag this with the synthesized notes so it's discoverable. + placeholder_loc_id = "" + + try: + cur.execute(""" + INSERT INTO unit_assignments ( + id, unit_id, location_id, project_id, device_type, + assigned_at, assigned_until, status, notes, created_at + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + """, ( + synthesized_id, + unit_id, + placeholder_loc_id, + row["project_id"], + roster["device_type"] or "seismograph", + row["deployed_date"] or datetime.utcnow().isoformat(), + assigned_until, + status, + synth_notes, + datetime.utcnow().isoformat(), + )) + cur.execute( + "UPDATE deployment_records SET deprecated_at=? WHERE id=?", + (datetime.utcnow().isoformat(), row["id"]), + ) + backfilled += 1 + print( + f" ✓ row {row['id']!r}: synthesized unit_assignment {synthesized_id} " + f"for unit={unit_id} project={row['project_id'][:8]}… " + f"({row['deployed_date']} → {assigned_until or 'present'})" + ) + except Exception as e: + print(f" ✗ row {row['id']!r}: failed to synthesize — {e}") + + conn.commit() + conn.close() + + print("\n────────────────────────────────────────────────────────") + print(f"Backfilled new unit_assignments: {backfilled}") + print(f"Already covered (deprecated only): {skipped_already_in_assignments}") + print(f"No project_id (deprecated only): {skipped_no_match_attempted}") + print(f"Missing/orphaned unit (deprecated): {skipped_missing_unit}") + print(f"\nNOTE: synthesized rows have an empty location_id and the legacy") + print(f" free-text location is preserved in notes. 
An operator should") + print(f" retarget them to real MonitoringLocation rows if they want") + print(f" events to show up on a location detail page.") + + +if __name__ == "__main__": + migrate_database() diff --git a/backend/models.py b/backend/models.py index b3c3665..5ab7761 100644 --- a/backend/models.py +++ b/backend/models.py @@ -259,9 +259,48 @@ class UnitAssignment(Base): device_type = Column(String, nullable=False) # "slm" | "seismograph" project_id = Column(String, nullable=False, index=True) # FK to Project.id + # Provenance: how was this assignment created? Used for auditing, + # bulk-undo of parser actions, and the Phase 4 deployment timeline. + # "manual" — operator created via UI + # "metadata_backfill" — auto-created by the metadata parser + # from operator-typed BW event metadata + # (bulk backfill workflow) + # "metadata_backfill_swap" — auto-created by swap-detection + # background job + source = Column(String, nullable=False, default="manual") + created_at = Column(DateTime, default=datetime.utcnow) +class MetadataBackfillDecision(Base): + """ + Per-cluster decisions tracked by the metadata-backfill parser. + + `cluster_id` is the deterministic SHA1 hash of + (serial, first_event_date, last_event_date), so the same cluster + produces the same id across re-scans. The decisions table lets the + parser remember "I already applied this" or "operator skipped this" + across scan invocations. 
+ """ + __tablename__ = "metadata_backfill_decisions" + + cluster_id = Column(String, primary_key=True) + status = Column(String, nullable=False) # pending | applied | skipped | conflict + confidence = Column(String, nullable=False) # high | medium | low + decided_at = Column(DateTime, nullable=True) + decided_by = Column(String, nullable=True) # background | operator | auto-high + applied_assignment_id = Column(String, nullable=True) # FK to unit_assignments.id + notes = Column(Text, nullable=True) + first_seen_at = Column(DateTime, nullable=False, default=datetime.utcnow) + last_seen_at = Column(DateTime, nullable=False, default=datetime.utcnow) + serial = Column(String, nullable=False, index=True) + project_raw = Column(String, nullable=True) + location_raw = Column(String, nullable=True) + first_event_ts = Column(DateTime, nullable=True) + last_event_ts = Column(DateTime, nullable=True) + event_count = Column(Integer, nullable=False, default=0) + + class ScheduledAction(Base): """ Scheduled actions: automation for recording start/stop/download. 
diff --git a/backend/routers/activity.py b/backend/routers/activity.py index b881a8e..02c6813 100644 --- a/backend/routers/activity.py +++ b/backend/routers/activity.py @@ -4,11 +4,29 @@ from sqlalchemy import desc from pathlib import Path from datetime import datetime, timedelta, timezone from typing import List, Dict, Any +import os +import logging +import httpx from backend.database import get_db from backend.models import UnitHistory, Emitter, RosterUnit +log = logging.getLogger(__name__) + router = APIRouter(prefix="/api", tags=["activity"]) +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") + + +def _humanize_age(seconds: float) -> str: + if seconds < 60: + return "just now" + if seconds < 3600: + return f"{int(seconds / 60)}m ago" + if seconds < 86400: + hrs = seconds / 3600 + return f"{int(hrs)}h {int((hrs % 1) * 60)}m ago" + return f"{int(seconds / 86400)}d ago" + PHOTOS_BASE_DIR = Path("data/photos") @@ -144,3 +162,86 @@ def get_recent_callins(hours: int = 6, limit: int = None, db: Session = Depends( "hours": hours, "time_threshold": time_threshold.isoformat() } + + +@router.get("/recent-event-callins") +async def get_recent_event_callins(limit: int = 10, db: Session = Depends(get_db)): + """ + Recent unit call-ins derived from SFM event forwards. + + Architecture context: the live ACH replacement is on hold, so call-homes + arrive as Blastware ACH event files forwarded by series3-watcher and + landed in the SFM events store. One event ≈ one call-in. This is the + forward-looking source of "recent call-ins" that will eventually replace + the heartbeat-based /recent-callins endpoint entirely. + + Each row represents one event; multiple consecutive events from the same + serial are intentionally NOT collapsed — each one is a distinct call-home. 
+ """ + try: + async with httpx.AsyncClient(timeout=10.0) as client: + resp = await client.get( + f"{SFM_BASE_URL}/db/events", + params={"limit": limit}, + ) + resp.raise_for_status() + payload = resp.json() + except httpx.HTTPError as e: + log.warning("SFM /db/events failed for recent-event-callins: %s", e) + return {"call_ins": [], "total": 0, "error": str(e)} + + events = payload.get("events", []) or [] + + # Bulk-resolve serials → roster (single query, no N+1) + serials = list({ev.get("serial") for ev in events if ev.get("serial")}) + roster_map: Dict[str, RosterUnit] = {} + if serials: + roster_map = { + r.id: r + for r in db.query(RosterUnit).filter(RosterUnit.id.in_(serials)).all() + } + + now = datetime.now(timezone.utc) + call_ins: List[Dict[str, Any]] = [] + + for ev in events: + serial = ev.get("serial") + if not serial: + continue + + roster = roster_map.get(serial) + + # created_at = when SFM received the forward. Falls back to the event + # timestamp if the SFM payload didn't carry created_at (older rows). 
+ created_at_str = ev.get("created_at") or ev.get("timestamp") + time_ago = "—" + if created_at_str: + try: + ts = datetime.fromisoformat(created_at_str.replace("Z", "+00:00")) + if ts.tzinfo is None: + ts = ts.replace(tzinfo=timezone.utc) + time_ago = _humanize_age((now - ts).total_seconds()) + except ValueError: + pass + + call_ins.append({ + "unit_id": serial, + "serial": serial, + "event_id": ev.get("id"), + "event_timestamp": ev.get("timestamp"), + "created_at": ev.get("created_at"), + "time_ago": time_ago, + "peak_vector_sum": ev.get("peak_vector_sum"), + "false_trigger": bool(ev.get("false_trigger")), + "sensor_location": ev.get("sensor_location") or "", + "project": ev.get("project") or "", + "device_type": roster.device_type if roster else "seismograph", + "in_roster": roster is not None, + "note": (roster.note if roster else "") or "", + }) + + return { + "call_ins": call_ins, + "total": len(call_ins), + "source": "sfm-events", + } diff --git a/backend/routers/admin_modules.py b/backend/routers/admin_modules.py new file mode 100644 index 0000000..8f9dc6c --- /dev/null +++ b/backend/routers/admin_modules.py @@ -0,0 +1,209 @@ +""" +Admin / diagnostic pages for the device modules (SFM, SLMM). + +These pages live under /admin/{module} and exist purely so an operator can +peek under the hood and confirm the module is reachable, what data it's +holding, and whether the proxy from terra-view is healthy. + +Routes: + GET /admin/sfm — SFM diagnostic page + GET /admin/slmm — SLMM diagnostic page + +API helpers (called by the HTML pages via fetch): + GET /api/admin/sfm/overview — aggregated SFM health + db stats in one call + GET /api/admin/slmm/overview — aggregated SLMM health + device count + +The pages are intentionally read-only. Any actual administration of SFM +or SLMM happens in those modules directly. 
+""" + +import logging +import os +from datetime import datetime, timezone +from typing import Any, Dict + +import httpx +from fastapi import APIRouter, Depends, Request +from fastapi.responses import HTMLResponse, JSONResponse +from sqlalchemy.orm import Session + +from backend.database import get_db +from backend.templates_config import templates + +log = logging.getLogger(__name__) + +router = APIRouter() + +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") +SLMM_BASE_URL = os.getenv("SLMM_BASE_URL", "http://localhost:8100") + + +# ── SFM ─────────────────────────────────────────────────────────────────────── + + +@router.get("/admin/sfm", response_class=HTMLResponse) +def admin_sfm_page(request: Request): + return templates.TemplateResponse("admin_sfm.html", { + "request": request, + "sfm_base_url": SFM_BASE_URL, + }) + + +@router.get("/api/admin/sfm/overview") +async def admin_sfm_overview() -> JSONResponse: + """Aggregated SFM diagnostic snapshot. + + Returns health, db stats, stale-table counts, per-unit summary, and + recent events with forwarding latency. Tolerant of partial failures: + any individual sub-fetch error is captured into its section, so a flaky + sub-endpoint doesn't break the whole page. + """ + overview: Dict[str, Any] = { + "sfm_base_url": SFM_BASE_URL, + "checked_at": datetime.now(timezone.utc).isoformat(), + "health": None, + "reachable": False, + "units": [], + "events": [], + "stale": { + "monitor_log": None, + "sessions": None, + }, + "cache_stats": None, + "errors": {}, + } + + async with httpx.AsyncClient(timeout=5.0) as client: + # Health + try: + r = await client.get(f"{SFM_BASE_URL}/health") + r.raise_for_status() + overview["health"] = r.json() + overview["reachable"] = overview["health"].get("status") == "ok" + except Exception as e: # noqa: BLE001 + overview["errors"]["health"] = str(e) + overview["reachable"] = False + + # If SFM is down, no point hitting the rest. 
+ if not overview["reachable"]: + return JSONResponse(overview) + + # Units + try: + r = await client.get(f"{SFM_BASE_URL}/db/units") + r.raise_for_status() + overview["units"] = r.json() or [] + except Exception as e: # noqa: BLE001 + overview["errors"]["units"] = str(e) + + # Recent events (newest 25 — bigger sample of the call-home stream) + try: + r = await client.get(f"{SFM_BASE_URL}/db/events", params={"limit": 25}) + r.raise_for_status() + payload = r.json() or {} + events = payload.get("events", []) or [] + # Compute forwarding latency: created_at (SFM ingest) − timestamp (event). + now = datetime.now(timezone.utc) + for ev in events: + ev.pop("waveform_blob", None) + ev.pop("a5_pickle_filename", None) + ts_str = ev.get("timestamp") + ca_str = ev.get("created_at") + latency_seconds = None + try: + if ts_str and ca_str: + ts = datetime.fromisoformat(ts_str.replace("Z", "+00:00")) + ca = datetime.fromisoformat(ca_str.replace("Z", "+00:00")) + if ts.tzinfo is None: ts = ts.replace(tzinfo=timezone.utc) + if ca.tzinfo is None: ca = ca.replace(tzinfo=timezone.utc) + latency_seconds = (ca - ts).total_seconds() + except ValueError: + pass + ev["forwarding_latency_seconds"] = latency_seconds + overview["events"] = events + except Exception as e: # noqa: BLE001 + overview["errors"]["events"] = str(e) + + # Stale tables (deprecated by the watcher-forward pipeline but still + # present in SFM's SQLite). Surface as counts only. + for key, path in (("monitor_log", "/db/monitor_log"), + ("sessions", "/db/sessions")): + try: + r = await client.get(f"{SFM_BASE_URL}{path}", params={"limit": 1}) + r.raise_for_status() + payload = r.json() or {} + # SFM returns count = total when limit covers all rows; we + # query with limit=1 just to be polite, then ask again with + # a high limit if we need the real total. + first_count = payload.get("count") + if first_count is None: + overview["stale"][key] = None + continue + # Re-query with high limit to get the true total. 
+ r2 = await client.get(f"{SFM_BASE_URL}{path}", params={"limit": 100000}) + r2.raise_for_status() + overview["stale"][key] = (r2.json() or {}).get("count") + except Exception as e: # noqa: BLE001 + overview["errors"][f"stale_{key}"] = str(e) + + # Cache stats (in-memory device cache on SFM) + try: + r = await client.get(f"{SFM_BASE_URL}/cache/stats") + r.raise_for_status() + overview["cache_stats"] = r.json() + except Exception as e: # noqa: BLE001 + overview["errors"]["cache_stats"] = str(e) + + # Aggregate counts the UI can render without re-walking arrays + overview["totals"] = { + "units": len(overview["units"]), + "events_total": sum(u.get("total_events", 0) for u in overview["units"]), + "stale_monitor_log": overview["stale"]["monitor_log"], + "stale_sessions": overview["stale"]["sessions"], + } + + return JSONResponse(overview) + + +# ── SLMM ────────────────────────────────────────────────────────────────────── + + +@router.get("/admin/slmm", response_class=HTMLResponse) +def admin_slmm_page(request: Request): + return templates.TemplateResponse("admin_slmm.html", { + "request": request, + "slmm_base_url": SLMM_BASE_URL, + }) + + +@router.get("/api/admin/slmm/overview") +async def admin_slmm_overview() -> JSONResponse: + """Aggregated SLMM diagnostic snapshot.""" + overview: Dict[str, Any] = { + "slmm_base_url": SLMM_BASE_URL, + "checked_at": datetime.now(timezone.utc).isoformat(), + "health": None, + "reachable": False, + "devices": [], + "errors": {}, + } + + async with httpx.AsyncClient(timeout=5.0) as client: + try: + r = await client.get(f"{SLMM_BASE_URL}/health") + r.raise_for_status() + overview["health"] = r.json() + overview["reachable"] = True + except Exception as e: # noqa: BLE001 + overview["errors"]["health"] = str(e) + return JSONResponse(overview) + + # Pull a roster of configured devices (SLMM exposes per-unit + # config + status under /api/nl43/*). 
This is a best-effort probe + # — SLMM doesn't expose a "list all devices" endpoint, so we ask + # terra-view's RosterUnit table what serials it knows about for + # SLMs and just check each one. For now, just surface the health + # payload and let the operator click through to /sound-level-meters + # for the per-device details. + + return JSONResponse(overview) diff --git a/backend/routers/metadata_backfill.py b/backend/routers/metadata_backfill.py new file mode 100644 index 0000000..3e002a1 --- /dev/null +++ b/backend/routers/metadata_backfill.py @@ -0,0 +1,394 @@ +""" +Metadata-backfill admin router. + +Endpoints under /api/admin/metadata_backfill: + + GET /scan — run the scan; return clusters + suggestions (JSON). + Cached 5 minutes so the wizard doesn't re-scan on + every page render. + POST /apply — apply a list of cluster_ids; body specifies which to + accept and optional per-cluster overrides. + POST /skip — mark cluster_ids as skipped (won't reappear). +""" + +from __future__ import annotations + +import os +import time +from typing import Optional + +from fastapi import APIRouter, Depends, HTTPException, Request +from fastapi.responses import JSONResponse +from sqlalchemy.orm import Session + +from backend.database import get_db +from backend.models import Project, MonitoringLocation +from backend.services import metadata_backfill as svc + +router = APIRouter(prefix="/api/admin/metadata_backfill", tags=["metadata-backfill"]) + +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") + +# In-process scan cache. Trades memory for not re-hammering SFM on every +# wizard render. TTL: 5 minutes. Singleton per-process; fine for a +# single-worker uvicorn dev setup. For prod multi-worker we'd want to put +# this in the DB or Redis; deferred. 
+_SCAN_CACHE: dict = {"at": 0.0, "result": None} +_SCAN_CACHE_TTL_SECONDS = 300.0 + + +def _serialise_suggestion(s: svc.Suggestion) -> dict: + c = s.cluster + return { + "cluster_id": c.cluster_id, + "serial": c.serial, + "first_event_ts": c.first_event_ts.isoformat(), + "last_event_ts": c.last_event_ts.isoformat(), + "event_count": c.event_count, + "sample_event_id": c.sample_event_id, + "project_raw": c.project_raw, + "project_root": c.project_root, + "location_raw": c.location_raw, + "client_raw": c.client_raw, + "operator_raw": c.operator_raw, + "is_blank_meta": c.is_blank_meta, + "metadata_consistency": c.metadata_consistency, + + "project_match": s.project_match, + "project_existing_id": s.project_existing_id, + "project_existing_name": s.project_existing_name, + "project_match_score": s.project_match_score, + "project_suggested_name": s.project_suggested_name, + + "location_match": s.location_match, + "location_existing_id": s.location_existing_id, + "location_existing_name": s.location_existing_name, + "location_match_score": s.location_match_score, + "location_suggested_name": s.location_suggested_name, + + "proposed_assigned_at": s.proposed_assigned_at.isoformat(), + "proposed_assigned_until": s.proposed_assigned_until.isoformat() if s.proposed_assigned_until else None, + + "confidence": s.confidence, + "blocking_conflict": s.blocking_conflict, + "conflicts": [ + { + "existing_assignment_id": cf.existing_assignment_id, + "other_location_id": cf.other_location_id, + "other_location_name": cf.other_location_name, + "other_project_id": cf.other_project_id, + "other_project_name": cf.other_project_name, + } + for cf in s.conflicts + ], + } + + +@router.get("/scan") +async def scan( + force: bool = False, + db: Session = Depends(get_db), +): + """Run a scan and return clusters + suggestions. + + Set force=true to bypass the 5-minute cache. 
+ """ + now = time.time() + if not force and _SCAN_CACHE["result"] is not None \ + and (now - _SCAN_CACHE["at"]) < _SCAN_CACHE_TTL_SECONDS: + return _SCAN_CACHE["result"] + + result = await svc.scan_clusters_and_build_suggestions(db, SFM_BASE_URL) + + # Group suggestions for the wizard UI. + by_confidence = {"high": [], "medium": [], "low": []} + blocking_conflict_count = 0 + for s in result.suggestions: + by_confidence[s.confidence].append(_serialise_suggestion(s)) + if s.blocking_conflict: + blocking_conflict_count += 1 + + payload = { + "scanned_event_count": result.scanned_event_count, + "cluster_count": result.cluster_count, + "already_attributed": result.already_attributed, + "skipped_orphans": result.skipped_orphans, + "pending_count": len(result.suggestions), + "blocking_conflict_count": blocking_conflict_count, + "by_confidence": { + "high": by_confidence["high"], + "medium": by_confidence["medium"], + "low": by_confidence["low"], + }, + "scanned_at": now, + } + _SCAN_CACHE["result"] = payload + _SCAN_CACHE["at"] = now + return payload + + +@router.post("/apply") +async def apply( + request: Request, + db: Session = Depends(get_db), +): + """Apply a list of clusters. + + Body: + { + "cluster_ids": ["abc...", "def..."], + "overrides": { "abc...": { "project_name": "...", "location_name": "..." } } + } + + To accept ALL non-conflict suggestions in one shot, the UI sends every + pending cluster_id with no overrides. + """ + try: + body = await request.json() + except Exception: + raise HTTPException(status_code=400, detail="Invalid JSON body") + + cluster_ids = body.get("cluster_ids") or [] + overrides = body.get("overrides") or {} + if not isinstance(cluster_ids, list) or not cluster_ids: + raise HTTPException(status_code=400, detail="cluster_ids must be a non-empty list") + + # Re-scan to get current suggestions. We don't trust the cached scan + # blindly — the operator might have manually created projects in + # between scan and apply. 
+ scan_result = await svc.scan_clusters_and_build_suggestions(db, SFM_BASE_URL) + suggestions_by_id = {s.cluster.cluster_id: s for s in scan_result.suggestions} + + selected: list[svc.Suggestion] = [] + not_found: list[str] = [] + for cid in cluster_ids: + s = suggestions_by_id.get(cid) + if s is None: + not_found.append(cid) + continue + # Apply overrides. Per-cluster overrides take precedence over the + # parser's suggested match. Four override fields supported: + # project_id — attach to an existing Project (operator picked + # from the typeahead) + # project_name — create new project with this name (operator + # typed a custom name not matching anything) + # location_id — attach to an existing MonitoringLocation + # location_name — create new location with this name + # project_id + location_id pairings: location_id is only honored + # if its project_id matches the chosen project (otherwise treated + # as a create-new). + ov = overrides.get(cid) or {} + + if ov.get("project_id"): + target_id = ov["project_id"] + existing = db.query(svc.Project).filter_by(id=target_id).first() + if existing is not None: + s.project_existing_id = existing.id + s.project_existing_name = existing.name + s.project_suggested_name = existing.name + s.project_match = "exact" + else: + # Stale ID — treat as create_new with the cluster's typed name. + s.project_existing_id = None + s.project_match = "create_new" + elif "project_name" in ov: + new_name = (ov["project_name"] or "").strip() + if new_name: + s.project_suggested_name = new_name + s.project_existing_id = None + s.project_existing_name = None + s.project_match = "create_new" + + if ov.get("location_id"): + target_id = ov["location_id"] + existing = db.query(svc.MonitoringLocation).filter_by(id=target_id).first() + # Only attach if the location belongs to the (now chosen) project. 
+ chosen_project_id = s.project_existing_id + if existing is not None and ( + chosen_project_id is None or existing.project_id == chosen_project_id + ): + s.location_existing_id = existing.id + s.location_existing_name = existing.name + s.location_suggested_name = existing.name + s.location_match = "exact" + else: + s.location_existing_id = None + s.location_match = "create_new" + elif "location_name" in ov: + new_name = (ov["location_name"] or "").strip() + if new_name: + s.location_suggested_name = new_name + s.location_existing_id = None + s.location_existing_name = None + s.location_match = "create_new" + + selected.append(s) + + apply_result = svc.apply_suggestions(db, selected, decided_by="operator") + + # Invalidate the scan cache so the next /scan picks up the new state. + _SCAN_CACHE["at"] = 0.0 + _SCAN_CACHE["result"] = None + + return { + "applied": apply_result.applied, + "failed": [{"cluster_id": cid, "reason": r} for cid, r in apply_result.failed], + "not_found": not_found, + "project_ids_created": apply_result.project_ids_created, + "location_ids_created": apply_result.location_ids_created, + "assignment_ids_created": apply_result.assignment_ids_created, + } + + +@router.post("/skip") +async def skip( + request: Request, + db: Session = Depends(get_db), +): + """Mark cluster_ids as skipped — they won't reappear in future scans.""" + try: + body = await request.json() + except Exception: + raise HTTPException(status_code=400, detail="Invalid JSON body") + + cluster_ids = body.get("cluster_ids") or [] + if not isinstance(cluster_ids, list): + raise HTTPException(status_code=400, detail="cluster_ids must be a list") + + n = svc.skip_clusters(db, cluster_ids, decided_by="operator") + + _SCAN_CACHE["at"] = 0.0 + _SCAN_CACHE["result"] = None + + return {"skipped": n} + + +@router.get("/projects_search") +def projects_search( + q: str = "", + limit: int = 10, + db: Session = Depends(get_db), +): + """Typeahead search of existing projects for the wizard's 
per-cluster + override inputs. Combines case-insensitive substring match with + rapidfuzz scoring so partial typing and slight typos both surface + candidates. Always returns a 'Create new' option at the end so the + operator can confirm they want to create rather than match. + + Returns: + { + "matches": [ + {"id": "...", "name": "...", "score": 0.91, "location_count": 3}, + ... + ], + "create_new": {"label": "Create new: \"\""} + } + """ + q_clean = (q or "").strip() + q_norm = svc._normalise(q_clean) + + projects = ( + db.query(Project) + .filter(Project.status != "deleted") + .all() + ) + + scored: list[tuple[Project, float]] = [] + for p in projects: + p_norm = svc._normalise(p.name) + if not q_norm: + # Empty query → return top projects by latest activity + # (cheap heuristic: keep them all and sort by name). + scored.append((p, 0.0)) + continue + # Cheap substring boost: if the normalised query is a substring, + # treat that as 1.0 regardless of WRatio. + if q_norm in p_norm: + scored.append((p, 1.0)) + continue + score = svc.similarity(q_norm, p_norm) + if score >= 0.50: # surfacing threshold; not the match threshold + scored.append((p, score)) + + # Sort: score desc, then name asc. + scored.sort(key=lambda t: (-t[1], t[0].name.lower())) + scored = scored[:limit] + + # Compute location counts in one batch query. 
+ loc_counts: dict[str, int] = {} + if scored: + from sqlalchemy import func + ids = [p.id for p, _ in scored] + rows = ( + db.query(MonitoringLocation.project_id, func.count(MonitoringLocation.id)) + .filter(MonitoringLocation.project_id.in_(ids)) + .group_by(MonitoringLocation.project_id) + .all() + ) + loc_counts = {pid: cnt for pid, cnt in rows} + + return { + "matches": [ + { + "id": p.id, + "name": p.name, + "project_number": p.project_number, + "client_name": p.client_name, + "score": round(score, 3), + "location_count": loc_counts.get(p.id, 0), + } + for p, score in scored + ], + "create_new": {"label": f'Create new: "{q_clean}"' if q_clean else None}, + } + + +@router.get("/locations_search") +def locations_search( + project_id: str, + q: str = "", + limit: int = 10, + db: Session = Depends(get_db), +): + """Typeahead search of existing locations within a project.""" + if not project_id: + raise HTTPException(status_code=400, detail="project_id required") + + q_clean = (q or "").strip() + q_norm = svc._normalise(q_clean) + + locations = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == project_id) + .filter(MonitoringLocation.location_type == "vibration") + .all() + ) + + scored: list[tuple[MonitoringLocation, float]] = [] + for l in locations: + l_norm = svc._normalise(l.name) + if not q_norm: + scored.append((l, 0.0)) + continue + if q_norm in l_norm: + scored.append((l, 1.0)) + continue + score = svc.similarity(q_norm, l_norm) + if score >= 0.50: + scored.append((l, score)) + + scored.sort(key=lambda t: (-t[1], t[0].name.lower())) + scored = scored[:limit] + + return { + "matches": [ + { + "id": l.id, + "name": l.name, + "address": l.address, + "score": round(score, 3), + } + for l, score in scored + ], + "create_new": {"label": f'Create new: "{q_clean}"' if q_clean else None}, + } diff --git a/backend/routers/project_locations.py b/backend/routers/project_locations.py index 701ffcf..733fd38 100644 --- 
a/backend/routers/project_locations.py +++ b/backend/routers/project_locations.py @@ -30,6 +30,7 @@ from backend.models import ( RosterUnit, MonitoringSession, DataFile, + UnitHistory, ) from backend.templates_config import templates from backend.utils.timezone import local_to_utc @@ -37,6 +38,42 @@ from backend.utils.timezone import local_to_utc router = APIRouter(prefix="/api/projects/{project_id}", tags=["project-locations"]) +# ── Audit log helper ────────────────────────────────────────────────────────── +# Mirrors record_history() in roster_edit.py. Kept local to avoid cross-router +# imports. The four assignment endpoints below use this to write UnitHistory +# rows that the unit-detail deployment timeline (Phase 4) renders. + +def _record_assignment_history( + db: Session, + unit_id: str, + change_type: str, + *, + old_value: Optional[str] = None, + new_value: Optional[str] = None, + notes: Optional[str] = None, +) -> None: + """Append a UnitHistory row for an assignment-lifecycle event. + + change_type values used: + - assignment_created — unit assigned to a location (new assignment) + - assignment_ended — unit unassigned / removed (assigned_until set) + - assignment_swapped — unit replaced by another at the same location + - assignment_updated — assignment dates / notes edited via PATCH + + Caller is responsible for db.commit(). 
+ """ + db.add(UnitHistory( + unit_id=unit_id, + change_type=change_type, + field_name="unit_assignment", + old_value=old_value, + new_value=new_value, + changed_at=datetime.utcnow(), + source="manual", + notes=notes, + )) + + # ============================================================================ # Shared helpers # ============================================================================ @@ -403,6 +440,13 @@ async def assign_unit_to_location( ) db.add(assignment) + _record_assignment_history( + db, + unit_id=unit_id, + change_type="assignment_created", + new_value=f"{location.name} (project: {location.project_id})", + notes=form_data.get("notes"), + ) db.commit() db.refresh(assignment) @@ -448,11 +492,164 @@ async def unassign_unit( assignment.status = "completed" assignment.assigned_until = datetime.utcnow() + location = db.query(MonitoringLocation).filter_by(id=assignment.location_id).first() + _record_assignment_history( + db, + unit_id=assignment.unit_id, + change_type="assignment_ended", + old_value=location.name if location else assignment.location_id, + new_value="unassigned", + ) + db.commit() return {"success": True, "message": "Unit unassigned successfully"} +@router.patch("/assignments/{assignment_id}") +async def update_assignment( + project_id: str, + assignment_id: str, + request: Request, + db: Session = Depends(get_db), +): + """ + Update an assignment's date window and/or notes. + + Common use case: backdate a deployment so events emitted before the + operator created the assignment in terra-view (e.g. a unit that was + physically deployed in December but only recorded in the system today) + get correctly attributed to the location. + + Accepts JSON body with optional fields: + - assigned_at: ISO datetime (omit the key to leave unchanged; null/"" is rejected with 400) + - assigned_until: ISO datetime, or null/"" to mark indefinite (active) + - notes: string + + Sets `status` to "active" when assigned_until is cleared, "completed" + whenever it is set (past or future).
+ """ + assignment = db.query(UnitAssignment).filter_by( + id=assignment_id, + project_id=project_id, + ).first() + + if not assignment: + raise HTTPException(status_code=404, detail="Assignment not found") + + try: + payload = await request.json() + except Exception: + raise HTTPException(status_code=400, detail="Invalid JSON body") + + # Parse new values (None = unchanged, explicit None/"" for assigned_until = clear) + new_assigned_at = assignment.assigned_at + new_assigned_until = assignment.assigned_until + new_notes = assignment.notes + + if "assigned_at" in payload: + raw = payload["assigned_at"] + if raw is None or raw == "": + raise HTTPException( + status_code=400, + detail="assigned_at is required; cannot be cleared.", + ) + try: + # Accept "YYYY-MM-DDTHH:MM" from datetime-local inputs or full ISO. + new_assigned_at = datetime.fromisoformat(raw) + except (TypeError, ValueError): + raise HTTPException( + status_code=400, + detail=f"Invalid assigned_at datetime: {raw!r}", + ) + + if "assigned_until" in payload: + raw = payload["assigned_until"] + if raw is None or raw == "": + new_assigned_until = None + else: + try: + new_assigned_until = datetime.fromisoformat(raw) + except (TypeError, ValueError): + raise HTTPException( + status_code=400, + detail=f"Invalid assigned_until datetime: {raw!r}", + ) + + if "notes" in payload: + raw = payload["notes"] + new_notes = (raw or "").strip() or None + + # Validation: end must be after start if both set. + if new_assigned_until is not None and new_assigned_until <= new_assigned_at: + raise HTTPException( + status_code=400, + detail="assigned_until must be after assigned_at.", + ) + + # Sanity: reject creating an overlap with another assignment of the SAME + # unit at the SAME location. Different units at the same location can + # legitimately overlap during a swap window (rare but valid). 
+ new_end_for_overlap = new_assigned_until or datetime.utcnow() + overlapping = ( + db.query(UnitAssignment) + .filter(UnitAssignment.location_id == assignment.location_id) + .filter(UnitAssignment.unit_id == assignment.unit_id) + .filter(UnitAssignment.id != assignment.id) + .all() + ) + for other in overlapping: + other_start = other.assigned_at + other_end = other.assigned_until or datetime.utcnow() + if new_assigned_at < other_end and new_end_for_overlap > other_start: + raise HTTPException( + status_code=400, + detail=( + f"This window overlaps with another assignment for the " + f"same unit ({other.assigned_at:%Y-%m-%d} → " + f"{other.assigned_until and other.assigned_until.strftime('%Y-%m-%d') or 'present'})." + ), + ) + + # Capture change description for audit log BEFORE mutating. + old_start = assignment.assigned_at.isoformat() if assignment.assigned_at else None + old_end = assignment.assigned_until.isoformat() if assignment.assigned_until else "active" + new_start = new_assigned_at.isoformat() if new_assigned_at else None + new_end = new_assigned_until.isoformat() if new_assigned_until else "active" + + # Apply. 
+ assignment.assigned_at = new_assigned_at + assignment.assigned_until = new_assigned_until + assignment.notes = new_notes + assignment.status = "completed" if new_assigned_until is not None else "active" + + if old_start != new_start or old_end != new_end: + _record_assignment_history( + db, + unit_id=assignment.unit_id, + change_type="assignment_updated", + old_value=f"{old_start} → {old_end}", + new_value=f"{new_start} → {new_end}", + notes=new_notes, + ) + + db.commit() + db.refresh(assignment) + + return { + "success": True, + "assignment": { + "id": assignment.id, + "unit_id": assignment.unit_id, + "location_id": assignment.location_id, + "assigned_at": assignment.assigned_at.isoformat() if assignment.assigned_at else None, + "assigned_until": assignment.assigned_until.isoformat() if assignment.assigned_until else None, + "status": assignment.status, + "notes": assignment.notes, + }, + } + + @router.post("/locations/{location_id}/swap") async def swap_unit_on_location( project_id: str, @@ -503,6 +700,16 @@ async def swap_unit_on_location( if current: current.assigned_until = datetime.utcnow() current.status = "completed" + # If the swap is replacing a different unit, that unit's deployment ended. 
+ if current.unit_id != unit_id: + _record_assignment_history( + db, + unit_id=current.unit_id, + change_type="assignment_swapped", + old_value=location.name, + new_value=f"swapped out → {unit_id}", + notes=notes, + ) # Create new assignment new_assignment = UnitAssignment( @@ -516,6 +723,13 @@ async def swap_unit_on_location( notes=notes, ) db.add(new_assignment) + _record_assignment_history( + db, + unit_id=unit_id, + change_type="assignment_swapped" if (current and current.unit_id != unit_id) else "assignment_created", + new_value=f"{location.name} (project: {location.project_id})", + notes=notes, + ) # Update modem pairing on the seismograph if modem provided if modem_id: @@ -648,6 +862,108 @@ async def get_nrl_sessions( }) +@router.get("/vibration_summary", response_class=HTMLResponse) +async def get_project_vibration_summary( + project_id: str, + request: Request, + from_dt: Optional[datetime] = Query(None), + to_dt: Optional[datetime] = Query(None), + db: Session = Depends(get_db), +): + """ + Render a small HTML partial summarising vibration-event activity + across every vibration MonitoringLocation in the project. + + Returned to the Vibration tab of the project detail page via HTMX. + Fans out concurrently across all locations (which in turn fan out + across each location's UnitAssignment windows). Total queries to + SFM = sum of assignments across the project. + + 404 if the project doesn't exist. Empty-state partial if the + project has no vibration locations. 
+ """ + project = db.query(Project).filter_by(id=project_id).first() + if not project: + raise HTTPException(status_code=404, detail="Project not found.") + + from backend.services.sfm_events import vibration_summary_for_project + + summary = await vibration_summary_for_project( + db, project_id, from_dt=from_dt, to_dt=to_dt + ) + + return templates.TemplateResponse( + "partials/projects/vibration_summary.html", + { + "request": request, + "project_id": project_id, + "summary": summary, + }, + ) + + +@router.get("/locations/{location_id}/events", response_class=JSONResponse) +async def get_location_events( + project_id: str, + location_id: str, + from_dt: Optional[datetime] = Query(None), + to_dt: Optional[datetime] = Query(None), + false_trigger: Optional[bool] = Query(None), + limit: int = Query(500, ge=1, le=5000), + db: Session = Depends(get_db), +): + """ + Return SFM events recorded at this monitoring location. + + Fans out the location's UnitAssignment rows (every seismograph ever + assigned to this location, active + closed), queries SFM /db/events + for each (serial, time-window) pair concurrently, and unions the + results. + + Sound (SLM) locations return an empty payload — SFM events are + seismograph-only. + """ + location = db.query(MonitoringLocation).filter_by(id=location_id).first() + if not location: + raise HTTPException(status_code=404, detail="Location not found.") + if location.project_id != project_id: + raise HTTPException( + status_code=404, + detail="Location does not belong to this project.", + ) + + # SLM locations don't have SFM events — return an empty payload rather + # than 404 so the frontend can render an empty state gracefully. 
+ if location.location_type != "vibration": + return { + "events": [], + "count": 0, + "stats": { + "event_count": 0, + "peak_pvs": None, + "peak_pvs_at": None, + "peak_pvs_serial": None, + "last_event": None, + "false_trigger_count": 0, + }, + "assignments_used": [], + "location_type": location.location_type, + } + + from backend.services.sfm_events import events_for_location + + result = await events_for_location( + db, + location_id, + from_dt=from_dt, + to_dt=to_dt, + false_trigger=false_trigger, + limit=limit, + ) + result["location_type"] = location.location_type + return result + + @router.get("/nrl/{location_id}/files", response_class=HTMLResponse) async def get_nrl_files( project_id: str, diff --git a/backend/routers/projects.py b/backend/routers/projects.py index bc40b1b..3c61236 100644 --- a/backend/routers/projects.py +++ b/backend/routers/projects.py @@ -688,6 +688,115 @@ async def restore_project(project_id: str, db: Session = Depends(get_db)): return {"success": True, "message": f"Project '{project.name}' restored."} +# ── Project merge ────────────────────────────────────────────────────────────── +# Consolidate a duplicate project into another. Common after the +# metadata-backfill parser creates near-duplicate projects from name +# variations operators typed on the BW device. +# See backend/services/project_merge.py for the merge logic. + +@router.get("/{source_id}/merge_preview") +async def project_merge_preview( + source_id: str, + target_id: str, + db: Session = Depends(get_db), +): + """Preview what the merge will do — used by the confirmation modal. 
+ No writes.""" + from backend.services import project_merge as pm + preview = pm.preview(db, source_id, target_id) + return { + "source_project_id": preview.source_project_id, + "source_project_name": preview.source_project_name, + "target_project_id": preview.target_project_id, + "target_project_name": preview.target_project_name, + "total_assignments_moving": preview.total_assignments_moving, + "total_sessions_moving": preview.total_sessions_moving, + "total_data_files_moving": preview.total_data_files_moving, + "modules_to_add": preview.modules_to_add, + "warnings": preview.warnings, + "location_plans": [ + { + "source_id": p.source_id, + "source_name": p.source_name, + "target_id": p.target_id, + "target_name": p.target_name, + "action": p.action, + "assignments_moving": p.assignments_moving, + "sessions_moving": p.sessions_moving, + } + for p in preview.location_plans + ], + } + + +@router.get("/admin/duplicate_pairs") +async def get_duplicate_pairs( + threshold: float = 0.85, + max_pairs: int = 200, + db: Session = Depends(get_db), +): + """Return all active-project pairs whose names fuzzy-match above the + threshold. Used by the Tidy page to surface duplicates that would + otherwise have to be hunted down one at a time. + + Each pair carries a suggested merge-target with the reasoning so the + operator can decide direction with one click. 
+ """ + from backend.services import project_tidy as pt + pairs = pt.find_duplicate_pairs(db, threshold=threshold, max_pairs=max_pairs) + + def _ps(p): + return { + "id": p.id, + "name": p.name, + "project_number": p.project_number, + "client_name": p.client_name, + "source": p.source, + "status": p.status, + "location_count": p.location_count, + "assignment_count": p.assignment_count, + } + + return { + "pairs": [ + { + "a": _ps(pair.a), + "b": _ps(pair.b), + "score": round(pair.score, 3), + "suggested_target_id": pair.suggested_target_id, + "reason": pair.reason, + } + for pair in pairs + ], + "threshold": threshold, + } + + +@router.post("/{source_id}/merge_into") +async def project_merge_execute( + source_id: str, + target_id: str, + db: Session = Depends(get_db), +): + """Execute the merge. Source project gets soft-deleted; all its + locations / assignments / sessions / data_files / modules move to + the target. Same-named locations consolidate.""" + from backend.services import project_merge as pm + result = pm.execute(db, source_id, target_id, decided_by="operator") + return { + "success": True, + "source_project_id": result.source_project_id, + "target_project_id": result.target_project_id, + "assignments_moved": result.assignments_moved, + "locations_moved": result.locations_moved, + "locations_consolidated": result.locations_consolidated, + "sessions_moved": result.sessions_moved, + "data_files_moved": result.data_files_moved, + "modules_added": result.modules_added, + "audit_rows_written": result.audit_rows_written, + } + + @router.get("/{project_id}") async def get_project(project_id: str, db: Session = Depends(get_db)): """ diff --git a/backend/routers/sfm.py b/backend/routers/sfm.py new file mode 100644 index 0000000..5126284 --- /dev/null +++ b/backend/routers/sfm.py @@ -0,0 +1,130 @@ +""" +SFM (Seismograph Field Module) Proxy Router + +Proxies requests from terra-view to the standalone SFM backend service. 
+SFM runs on port 8200 and handles MiniMate Plus seismograph communication +and event database queries. + +SFM endpoints are at root level (e.g. /db/units, /device/info) — no /api/ prefix. +""" + +from fastapi import APIRouter, HTTPException, Request, Response +import httpx +import logging +import os + +logger = logging.getLogger(__name__) + +router = APIRouter(prefix="/api/sfm", tags=["sfm"]) + +# SFM backend URL - configurable via environment variable +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") + + +@router.get("/health") +async def check_sfm_health(): + """ + Check if the SFM backend service is reachable and healthy. + """ + try: + async with httpx.AsyncClient(timeout=5.0) as client: + response = await client.get(f"{SFM_BASE_URL}/health") + + if response.status_code == 200: + data = response.json() + return { + "status": "ok", + "sfm_status": "connected", + "sfm_url": SFM_BASE_URL, + "sfm_response": data + } + else: + return { + "status": "degraded", + "sfm_status": "error", + "sfm_url": SFM_BASE_URL, + "detail": f"SFM returned status {response.status_code}" + } + + except httpx.ConnectError: + return { + "status": "error", + "sfm_status": "unreachable", + "sfm_url": SFM_BASE_URL, + "detail": "Cannot connect to SFM backend. Is it running?" + } + except Exception as e: + return { + "status": "error", + "sfm_status": "error", + "sfm_url": SFM_BASE_URL, + "detail": str(e) + } + + +# HTTP catch-all — proxies everything else to SFM backend +@router.api_route("/{path:path}", methods=["GET", "POST", "PUT", "DELETE", "PATCH"]) +async def proxy_to_sfm(path: str, request: Request): + """ + Proxy all requests to the SFM backend service. + + SFM endpoints have no /api/ prefix — target URL is {SFM_BASE_URL}/{path}. + Timeout is 60s to allow for live device round-trips (event downloads can + take 30-45s for a full event list). 
+ """ + # Build target URL — SFM endpoints live at root, not /api/ + target_url = f"{SFM_BASE_URL}/{path}" + + # Forward query params + query_params = dict(request.query_params) + + # Read body for mutation requests + body = None + if request.method in ["POST", "PUT", "PATCH"]: + try: + body = await request.body() + except Exception as e: + logger.error(f"Failed to read request body: {e}") + body = None + + # Strip hop-by-hop headers + headers = dict(request.headers) + headers_to_exclude = ["host", "content-length", "transfer-encoding", "connection"] + proxy_headers = {k: v for k, v in headers.items() if k.lower() not in headers_to_exclude} + + logger.info(f"Proxying {request.method} {path} → SFM: {target_url}") + + try: + async with httpx.AsyncClient(timeout=60.0) as client: + response = await client.request( + method=request.method, + url=target_url, + params=query_params, + headers=proxy_headers, + content=body + ) + return Response( + content=response.content, + status_code=response.status_code, + headers={k: v for k, v in response.headers.items() if k.lower() not in ("content-encoding", "content-length", "transfer-encoding", "connection")}, # httpx has already decoded the body, so drop framing/encoding headers + media_type=response.headers.get("content-type") + ) + + except httpx.ConnectError: + logger.error(f"Failed to connect to SFM backend at {SFM_BASE_URL}") + raise HTTPException( + status_code=503, + detail=f"SFM backend service unavailable. Is SFM running on {SFM_BASE_URL}?" 
+ ) + except httpx.TimeoutException: + logger.error(f"Timeout connecting to SFM backend at {SFM_BASE_URL}") + raise HTTPException( + status_code=504, + detail="SFM backend timeout" + ) + except Exception as e: + logger.error(f"Error proxying to SFM: {e}") + raise HTTPException( + status_code=500, + detail=f"Failed to proxy request to SFM: {str(e)}" + ) diff --git a/backend/routers/units.py b/backend/routers/units.py index 31ffc07..55fc75d 100644 --- a/backend/routers/units.py +++ b/backend/routers/units.py @@ -1,7 +1,7 @@ -from fastapi import APIRouter, Depends, HTTPException +from fastapi import APIRouter, Depends, HTTPException, Query from sqlalchemy.orm import Session from datetime import datetime -from typing import Dict, Any +from typing import Dict, Any, Optional from backend.database import get_db from backend.services.snapshot import emit_status_snapshot @@ -72,3 +72,101 @@ def get_unit_by_id(unit_id: str, db: Session = Depends(get_db)): "slm_serial_number": unit.slm_serial_number, "deployed_with_modem_id": unit.deployed_with_modem_id } + + +@router.get("/units/{unit_id}/events") +async def get_unit_events( + unit_id: str, + bucket: str = Query("all", regex="^(all|attributed|unattributed)$"), + from_dt: Optional[datetime] = Query(None), + to_dt: Optional[datetime] = Query(None), + false_trigger: Optional[bool] = Query(None), + limit: int = Query(500, ge=1, le=5000), + db: Session = Depends(get_db), +): + """ + Return SFM events for a single unit, annotated with assignment attribution. + + Each event includes an `attribution` object pointing at the project/location + it falls into (or null if outside every assignment window). Unattributed + events also carry a `nearest_assignment` field with `delta_days` so the + operator can see how far off the nearest assignment is — useful for + deciding whether to backdate the assignment to absorb the event. 
+ + Bucket filter: + - all (default): every event + - attributed: only events inside an assignment window + - unattributed: only orphan events (the diagnostic bucket) + + Non-seismograph units return an empty events list. The route does not + 404 for SLMs/modems so the unit detail page can render the section + conditionally without depending on the response shape. + """ + unit = db.query(RosterUnit).filter_by(id=unit_id).first() + if not unit: + raise HTTPException(status_code=404, detail=f"Unit {unit_id} not found") + + if unit.device_type != "seismograph": + return { + "events": [], + "count": 0, + "stats": { + "event_count": 0, + "unattributed_count": 0, + "peak_pvs": None, + "peak_pvs_at": None, + "peak_pvs_serial": None, + "last_event": None, + "false_trigger_count": 0, + }, + "assignments_total": 0, + "device_type": unit.device_type, + } + + from backend.services.sfm_events import events_for_unit + + result = await events_for_unit( + db, + unit_id, + bucket=bucket, + from_dt=from_dt, + to_dt=to_dt, + false_trigger=false_trigger, + limit=limit, + ) + result["device_type"] = unit.device_type + return result + + +@router.get("/units/{unit_id}/deployment_timeline") +async def get_unit_deployment_timeline( + unit_id: str, + include_events: bool = Query(True), + db: Session = Depends(get_db), +): + """ + Return a chronological deployment timeline for a unit. + + Merges three sources: + 1. unit_assignments — authoritative project/location deployments + 2. unit_history — state changes (calibration, retirement, etc.) + 3. SFM events — per-assignment overlay (count + peak PVS + last event) + + Replaces the legacy /api/deployments/{unit_id} (which read the + deprecated `deployment_records` table) and the + /api/roster/history/{unit_id} timeline endpoint, unifying them into + a single derived view. + + Gaps >= 1 day between consecutive assignments are surfaced as + synthetic "gap" entries. 
+ + Pass include_events=false to skip the SFM event overlay (saves N + HTTP calls; useful for fast text-only history dumps). + """ + from backend.services.deployment_timeline import deployment_timeline_for_unit + + return await deployment_timeline_for_unit( + db, + unit_id, + include_event_overlay=include_events, + ) diff --git a/backend/services/deployment_timeline.py b/backend/services/deployment_timeline.py new file mode 100644 index 0000000..21fa8af --- /dev/null +++ b/backend/services/deployment_timeline.py @@ -0,0 +1,256 @@ +""" +Deployment timeline service — replaces the legacy `deployment_records`-driven +timeline on the seismograph unit detail page. + +Architecture: + - `unit_assignments` is the authoritative source for "where was this unit" + (one row per location/time-window). Auto-written by the project location + swap/assign/unassign/update workflows. + - `unit_history` is the audit log for non-location state changes + (calibration toggles, retirement, allocation, etc.). + - SFM events are overlaid per assignment window to show "what was the unit + actually doing during this deployment" (count + peak PVS + last-event). + +Gaps between assignments are emitted as synthetic "gap" entries so operators +can see when the unit was idle vs out-of-service. + +`deployment_records` is being deprecated; this module does not read it. +""" + +from __future__ import annotations + +import asyncio +import logging +from datetime import datetime, timedelta +from typing import Optional + +import httpx +from sqlalchemy.orm import Session + +from backend.models import ( + UnitAssignment, + UnitHistory, + MonitoringLocation, + Project, + RosterUnit, +) +from backend.services.sfm_events import ( + SFM_BASE_URL, + _fetch_events_for_serial, + _iso_utc, +) + +log = logging.getLogger("backend.services.deployment_timeline") + +# Don't emit synthetic gap entries shorter than this (seconds). Avoids visual +# clutter from a sub-second handoff during a swap workflow. 
+_MIN_GAP_SECONDS = 24 * 3600 # 1 day + +# Per-call timeout when querying SFM for the event overlay. +_SFM_TIMEOUT = 10.0 +_SFM_FETCH_CEILING = 5000 + + +# ── Public API ──────────────────────────────────────────────────────────────── + + +async def deployment_timeline_for_unit( + db: Session, + unit_id: str, + *, + include_event_overlay: bool = True, +) -> dict: + """Build a chronological timeline for a unit. + + Returns: + { + "unit_id": str, + "device_type": str, + "entries": [ + { + "kind": "assignment" | "gap" | "state_change", + "starts_at": ISO timestamp, + "ends_at": ISO timestamp | None, + "duration_days": float | None, + # — assignment-only fields — + "assignment_id": str, + "location_id": str, + "location_name": str, + "project_id": str, + "project_name": str, + "is_active": bool, + "event_overlay": {event_count, peak_pvs, peak_pvs_at, last_event} + or None if include_event_overlay=False, + "notes": str | None, + # — gap-only fields — + "context": "between assignments" | None, + # — state_change-only fields — + "change_type": str, + "field_name": str | None, + "old_value": str | None, + "new_value": str | None, + "source": str, + "history_notes": str | None, + }, + ... # newest first + ], + } + """ + unit = db.query(RosterUnit).filter_by(id=unit_id).first() + if not unit: + return {"unit_id": unit_id, "device_type": None, "entries": []} + + # 1. Load assignments + their location/project lookups in bulk. + assignments = ( + db.query(UnitAssignment) + .filter(UnitAssignment.unit_id == unit_id) + .order_by(UnitAssignment.assigned_at.asc()) + .all() + ) + + loc_ids = {a.location_id for a in assignments} + proj_ids = {a.project_id for a in assignments} + loc_map = { + l.id: l for l in db.query(MonitoringLocation).filter( + MonitoringLocation.id.in_(loc_ids) + ).all() + } if loc_ids else {} + proj_map = { + p.id: p for p in db.query(Project).filter( + Project.id.in_(proj_ids) + ).all() + } if proj_ids else {} + + # 2. Load relevant unit_history rows. 
We surface state changes that + # operators care about on a deployment timeline: calibration status, + # retirement, deployed flag, allocation, and calibration dates. The + # assignment_* events we just added are redundant with the assignment + # rows themselves, so we skip them to avoid double-rendering. + interesting_change_types = ( + "calibration_status_change", + "retired_change", + "deployed_change", + "allocation_change", + "last_calibrated_change", + "next_calibration_due_change", + ) + history = ( + db.query(UnitHistory) + .filter(UnitHistory.unit_id == unit_id) + .filter(UnitHistory.change_type.in_(interesting_change_types)) + .order_by(UnitHistory.changed_at.asc()) + .all() + ) + + now = datetime.utcnow() + + # 3. Optionally fetch SFM event overlay for each assignment window. + # Concurrent fan-out via httpx + asyncio.gather. + overlays: dict[str, dict] = {} + if include_event_overlay and assignments and unit.device_type == "seismograph": + async with httpx.AsyncClient(timeout=_SFM_TIMEOUT) as client: + results = await asyncio.gather( + *( + _fetch_events_for_serial( + client, + serial=unit_id, + from_dt=a.assigned_at, + to_dt=a.assigned_until or now, + false_trigger=None, + limit=_SFM_FETCH_CEILING, + ) + for a in assignments + ), + return_exceptions=False, + ) + for a, events in zip(assignments, results): + peak = None + peak_at = None + last_ev = None + for ev in events: + pvs = ev.get("peak_vector_sum") + if pvs is not None and (peak is None or pvs > peak): + peak = pvs + peak_at = ev.get("timestamp") + ts = ev.get("timestamp") + if ts and (last_ev is None or ts > last_ev): + last_ev = ts + overlays[a.id] = { + "event_count": len(events), + "peak_pvs": peak, + "peak_pvs_at": peak_at, + "last_event": last_ev, + } + + # 4. Build entries. Start by emitting assignment rows + gap rows between + # consecutive assignments, then add state-change rows from unit_history. 
+ entries: list[dict] = [] + + for idx, a in enumerate(assignments): + loc = loc_map.get(a.location_id) + proj = proj_map.get(a.project_id) + is_active = a.assigned_until is None + ends_at = a.assigned_until or now + duration_days = (ends_at - a.assigned_at).total_seconds() / 86400 if a.assigned_at else None + + entry = { + "kind": "assignment", + "starts_at": _iso_utc(a.assigned_at), + "ends_at": _iso_utc(a.assigned_until), + "duration_days": round(duration_days, 1) if duration_days is not None else None, + "assignment_id": a.id, + "location_id": a.location_id, + "location_name": loc.name if loc else None, + "project_id": a.project_id, + "project_name": proj.name if proj else None, + "is_active": is_active, + "notes": a.notes, + "event_overlay": overlays.get(a.id), + } + entries.append(entry) + + # Gap detection: from the end of this assignment to the start of the + # next one. Only emit gaps that are at least _MIN_GAP_SECONDS long + # so trivial sub-second handoffs during swaps don't clutter the view. + if idx + 1 < len(assignments): + next_a = assignments[idx + 1] + gap_start = a.assigned_until or now + gap_end = next_a.assigned_at + gap_seconds = (gap_end - gap_start).total_seconds() if gap_end and gap_start else 0 + if gap_seconds >= _MIN_GAP_SECONDS: + entries.append({ + "kind": "gap", + "starts_at": _iso_utc(gap_start), + "ends_at": _iso_utc(gap_end), + "duration_days": round(gap_seconds / 86400, 1), + "context": "between assignments", + }) + + # 5. State changes — interleaved by timestamp. Skip no-op rows where + # old_value == new_value (an artifact of the legacy record_history() + # being called on every save regardless of whether the field changed). 
+ for h in history: + if h.old_value == h.new_value: + continue + entries.append({ + "kind": "state_change", + "starts_at": _iso_utc(h.changed_at), + "ends_at": None, + "duration_days": None, + "change_type": h.change_type, + "field_name": h.field_name, + "old_value": h.old_value, + "new_value": h.new_value, + "source": h.source, + "history_notes": h.notes, + }) + + # 6. Sort newest first. Active assignments (no end) sort by start time, + # same as everything else. + entries.sort(key=lambda e: e.get("starts_at") or "", reverse=True) + + return { + "unit_id": unit.id, + "device_type": unit.device_type, + "entries": entries, + } diff --git a/backend/services/metadata_backfill.py b/backend/services/metadata_backfill.py new file mode 100644 index 0000000..303328a --- /dev/null +++ b/backend/services/metadata_backfill.py @@ -0,0 +1,1139 @@ +""" +metadata_backfill.py — turn operator-typed BW event metadata into the +terra-view Project / MonitoringLocation / UnitAssignment graph. + +Architecture (see /home/serversdown/.claude/plans/sfm-metadata-backfill-parser.md): + + 1. Pre-filter: drop events that already fall inside an existing + UnitAssignment window (Phase 2 attribution already handles them). + 2. Time-cluster: serial + 7-day gap is the cluster identity. + 3. Metadata-split: split on persistent (>= 2 events) metadata transitions. + 4. Match against existing graph (rapidfuzz multi-signal scoring). + 5. Score confidence (high/medium/low). + 6. Detect conflicts (overlap with existing UnitAssignment at different + location for the same serial → blocking). + 7. Apply: create Project / MonitoringLocation / UnitAssignment + + UnitHistory audit row, all in one transaction. 
+ +Public API: + scan_clusters_and_build_suggestions(db, sfm_base_url) → ScanResult + apply_suggestions(db, cluster_ids, *, decided_by) → ApplyResult + skip_clusters(db, cluster_ids, *, decided_by) → int +""" + +from __future__ import annotations + +import asyncio +import hashlib +import logging +import os +import re +import uuid +from collections import Counter +from dataclasses import dataclass, field +from datetime import datetime, date, timedelta +from typing import Optional, Iterable, Literal + +import httpx +import rapidfuzz +from sqlalchemy.orm import Session + +from backend.models import ( + Project, + ProjectModule, + MonitoringLocation, + UnitAssignment, + RosterUnit, + UnitHistory, + MetadataBackfillDecision, +) + +log = logging.getLogger("backend.services.metadata_backfill") + + +# ── Tunables ────────────────────────────────────────────────────────────────── +CLUSTER_GAP_DAYS = 7 # time gap that splits a time-cluster +MIN_SPLIT_RUN_LENGTH = 2 # min consecutive events to trigger meta-split +FUZZY_EXACT_THRESHOLD = 0.95 # WRatio score → treated as exact match +FUZZY_MATCH_THRESHOLD = 0.80 # WRatio score → treated as fuzzy match +FUZZY_AMBIGUITY_DELTA = 0.05 # if 2nd-best score is within this of 1st → ambiguous +SUSPICIOUS_SPAN_DAYS = 90 # cluster spanning > this with sparse events → suspicious +RECENT_CLUSTER_DAYS = 3 # if cluster end is within this many days of now → leave assigned_until=NULL + +SFM_FETCH_CEILING = 5000 # max events per SFM /db/events call +SFM_TIMEOUT = 30.0 # generous; this runs in the background + +# ProjectType to assign to auto-created projects. We use a sentinel +# "auto_imported" type that the parser ensures exists. Operator can re-type +# them later by editing the Project. 
+AUTO_IMPORTED_TYPE_ID = "auto_imported" +AUTO_IMPORTED_TYPE_NAME = "Auto-imported (from event metadata)" + + +# ── Normalisation + fuzzy matching ───────────────────────────────────────────── + + +def _normalise(s: Optional[str]) -> str: + """Lowercase, replace internal punctuation with spaces, collapse spaces. + + Aggressive enough that 'Test-5-8-26' and 'Test 5/8/26' produce the + same normalised string ('test 5 8 26'), but preserves alphanumeric + content so 'I-80 N Fork' doesn't lose its content (becomes 'i 80 n fork'). + """ + if not s: + return "" + s = s.strip().lower() + # Replace any non-alphanumeric character with a space (preserves the + # tokens, just normalises the separators). + s = re.sub(r"[^a-z0-9]+", " ", s) + s = re.sub(r"\s+", " ", s).strip() + return s + + +# Match a "Loc N" / "Location #N" suffix preceded by a separator. Operators +# often type project names like "Fay - Locks & Dam No3 - Loc 2 - 735 Bunola" +# where the leading "Fay - Locks & Dam No3" is the actual project and the +# trailing "- Loc 2 - ..." is location info that already lives in the +# sensor_location field. We strip the trailing junk so projects with the +# same root get clustered together. +# +# Matches: +# "- Loc 2", "-Loc3", "- Location #5", " — Location.5", "- LOC #07" +# Doesn't match strings without an obvious Loc N marker — those keep +# their full project_raw and the operator can edit them in the wizard. +_PROJECT_LOC_SUFFIX = re.compile( + r""" + \s* # any leading whitespace + [-–—.] # separator: hyphen, em-dash, or period + # (operators use any of these — see + # "Mont.Dam.Loc 2-R-25") + \s* + (?:loc|location) # 'Loc' or 'Location' + \.? # optional trailing period after Loc + \s* + (?:no\.?\s*)? # optional "No." or "No " before the digit + # (e.g. "Loc No. 3", "Loc No 5") + \#? 
# optional '#' + \s* + \d+ # required digit + \b + """, + re.IGNORECASE | re.VERBOSE, +) + + +def _extract_project_root(project_raw: str) -> str: + """Return the leading 'project root' portion of an operator-typed string. + + Strips everything from the first " - Loc N" (or similar) marker forward, + so 'Fay - Locks & Dam No3 - Loc 2 - 735 Bunola' becomes + 'Fay - Locks & Dam No3'. Strings without a Loc-marker pass through + unchanged. + + Trailing whitespace and dangling hyphens are cleaned up. + """ + if not project_raw: + return "" + m = _PROJECT_LOC_SUFFIX.search(project_raw) + if m is None: + return project_raw.strip() + root = project_raw[: m.start()] + # Strip trailing whitespace + dangling separators left behind + # (e.g. "Fay - Locks & Dam No3 -" → "Fay - Locks & Dam No3"). + root = re.sub(r"[\s\-–—]+$", "", root) + return root.strip() + + +# Min length of the SHORTER input before a fuzzy match is accepted. +# rapidfuzz.WRatio is generous with partial_ratio on short strings — e.g. +# 'demo' vs 'bridge demo project' scores 0.90 (false positive). Requiring +# the shorter input be >= 5 chars filters out those degenerate cases. +_MIN_FUZZY_LEN = 5 + + +def similarity(a: str, b: str) -> float: + """Multi-signal similarity score in [0.0, 1.0] via rapidfuzz.WRatio. + + Blends Levenshtein, partial-substring, token-sort, token-set + scoring. Handles abbreviations ('N' vs 'North'), word reordering, + and substring containment. + + Returns 0.0 if either input is empty OR if the shorter input is + too short to fuzzy-match safely (see _MIN_FUZZY_LEN comment) AND the + strings don't exact-match. This guardrails the 'one common word + inside a longer phrase' false positive. 
+ """ + if not a or not b: + return 0.0 + if a == b: + return 1.0 + if min(len(a), len(b)) < _MIN_FUZZY_LEN: + return 0.0 + return rapidfuzz.fuzz.WRatio(a, b) / 100.0 + + +# ── Cluster + Suggestion dataclasses ─────────────────────────────────────────── + + +@dataclass +class Cluster: + cluster_id: str + serial: str + first_event_ts: datetime + last_event_ts: datetime + event_count: int + sample_event_id: str + + # project_raw is the FULL operator-typed string (e.g. + # "Fay - Locks & Dam No3 - Loc 5 Synthomer"). Kept for display so + # operator can sanity-check what they typed. + project_raw: str + # project_root is project_raw with any trailing "- Loc N" suffix + # stripped — what we actually use for matching and as the suggested + # project name. (e.g. "Fay - Locks & Dam No3"). Same as project_raw + # if no Loc marker was found. + project_root: str + project_norm: str + + location_raw: str + location_norm: str + client_raw: str + operator_raw: str + + is_blank_meta: bool + metadata_consistency: float # 0.0–1.0 + + +@dataclass +class ConflictHint: + existing_assignment_id: str + other_location_id: str + other_location_name: str + other_project_id: str + other_project_name: str + + +@dataclass +class Suggestion: + cluster: Cluster + project_match: Literal["exact", "fuzzy", "create_new", "ambiguous"] + project_existing_id: Optional[str] + project_existing_name: Optional[str] + project_match_score: Optional[float] + project_suggested_name: str + + location_match: Literal["exact", "fuzzy", "create_new"] + location_existing_id: Optional[str] + location_existing_name: Optional[str] + location_match_score: Optional[float] + location_suggested_name: str + + proposed_assigned_at: datetime + proposed_assigned_until: Optional[datetime] # None = active / open-ended + + confidence: Literal["high", "medium", "low"] + conflicts: list[ConflictHint] = field(default_factory=list) + blocking_conflict: bool = False + already_attributed: bool = False + + +@dataclass +class 
ScanResult: + suggestions: list[Suggestion] + skipped_orphans: int # clusters skipped because pre-existing skip decision + already_attributed: int # clusters with full pre-existing assignment overlap + scanned_event_count: int + cluster_count: int + + +@dataclass +class ApplyResult: + applied: int + skipped: int + failed: list[tuple[str, str]] # (cluster_id, reason) + project_ids_created: list[str] + location_ids_created: list[str] + assignment_ids_created: list[str] + + +# ── Step 1+2+3: clustering ──────────────────────────────────────────────────── + + +def _build_cluster_id(serial: str, first_ts: datetime, last_ts: datetime) -> str: + """Deterministic SHA1 hash of (serial, first_date, last_date). + + Stable across re-scans of the same data — typo-corrected events + don't change the cluster_id. + """ + key = f"{serial}|{first_ts.date().isoformat()}|{last_ts.date().isoformat()}" + return hashlib.sha1(key.encode("utf-8")).hexdigest() + + +def _events_overlap_existing_assignment( + ev_ts: datetime, + serial: str, + assignments_by_serial: dict[str, list[UnitAssignment]], + now: datetime, +) -> bool: + """Does this event timestamp fall inside any existing UnitAssignment + window for this serial?""" + for a in assignments_by_serial.get(serial, []): + a_end = a.assigned_until or now + if a.assigned_at <= ev_ts <= a_end: + return True + return False + + +def _mode_string(values: Iterable[Optional[str]]) -> tuple[str, float]: + """Return (most_common_value, consistency_fraction). + + Treats None/empty as a single "blank" bucket. consistency_fraction + is fraction of inputs equal to the modal value. 
+ """ + vals = [v if v else "" for v in values] + if not vals: + return "", 1.0 + counts = Counter(vals) + mode_value, mode_count = counts.most_common(1)[0] + return mode_value, mode_count / len(vals) + + +def _split_by_metadata_runs(events: list[dict]) -> list[list[dict]]: + """Run-length-encode (project_norm, location_norm) sequence; fold runs + shorter than MIN_SPLIT_RUN_LENGTH into a neighbour; return list of sub-cluster event lists. + + Single-event metadata blips (typos) are merged back into the surrounding + cluster's modal metadata. + """ + if not events: + return [] + + # Build (key, idx) pairs. Blanks count as the "previous" key for splitting + # purposes — i.e., a blank event doesn't fork the cluster. + def _key(ev: dict) -> tuple[str, str]: + return (_normalise(ev.get("project")), _normalise(ev.get("sensor_location"))) + + # Identify runs. + runs: list[list[int]] = [] # each run is a list of event-indices + last_key: Optional[tuple[str, str]] = None + current_run: list[int] = [] + blank = ("", "") + for i, ev in enumerate(events): + k = _key(ev) + if k == blank: + # Blank events inherit the previous run (or start a blank run if + # they're at the beginning). + current_run.append(i) + continue + if last_key is None or k == last_key: + current_run.append(i) + last_key = k + else: + runs.append(current_run) + current_run = [i] + last_key = k + if current_run: + runs.append(current_run) + + # Filter out short runs (typos / one-off blips). Their events are folded + # into the front of the next long run, or onto the last long run if none follows. + def _run_key(run: list[int]) -> tuple[str, str]: + for idx in run: + k = _key(events[idx]) + if k != blank: + return k + return blank + + filtered: list[list[int]] = [] + pending_short: list[int] = [] + for run in runs: + if len(run) >= MIN_SPLIT_RUN_LENGTH: + # Fold any pending-short events into this run's front.
+ run = pending_short + run + pending_short = [] + filtered.append(run) + else: + pending_short.extend(run) + + # Any trailing pending-short get appended to the previous filtered run, + # or become their own run if nothing came before. + if pending_short: + if filtered: + filtered[-1].extend(pending_short) + else: + filtered.append(pending_short) + + return [[events[i] for i in run] for run in filtered] + + +def _events_to_clusters( + serial: str, + events: list[dict], # already filtered: assignments_by_serial overlap dropped +) -> list[Cluster]: + """Sort by timestamp, time-cluster on CLUSTER_GAP_DAYS, then metadata-split. + + Each output cluster has its modal metadata computed across its events. + """ + if not events: + return [] + + # Sort by timestamp ascending. (Caller has already dropped events with + # blank timestamps; guard anyway.) + events = [e for e in events if e.get("timestamp")] + events = sorted(events, key=lambda e: e["timestamp"]) + if not events: + return [] + + # Time-cluster. + gap = timedelta(days=CLUSTER_GAP_DAYS) + time_clusters: list[list[dict]] = [] + current: list[dict] = [] + last_ts: Optional[datetime] = None + for ev in events: + ts = _parse_ts(ev["timestamp"]) + if last_ts is None or (ts - last_ts) <= gap: + current.append(ev) + else: + time_clusters.append(current) + current = [ev] + last_ts = ts + if current: + time_clusters.append(current) + + # Metadata-split each time-cluster. + out: list[Cluster] = [] + for tc in time_clusters: + sub_clusters = _split_by_metadata_runs(tc) + for sc in sub_clusters: + cluster = _build_cluster(serial, sc) + out.append(cluster) + return out + + +def _parse_ts(ts: str) -> datetime: + """Parse SFM-returned timestamp string into datetime.""" + # SFM returns ISO 8601 with 'T' separator, sometimes with offset. 
+ s = ts.replace("Z", "+00:00") + try: + dt = datetime.fromisoformat(s) + except ValueError: + # Fallback: space separator + dt = datetime.fromisoformat(s.replace(" ", "T")) + # Convert to UTC, then strip tzinfo, for consistency with terra-view's naive UTC datetimes. + if dt.tzinfo is not None: + dt = (dt - dt.utcoffset()).replace(tzinfo=None) + return dt + + +def _build_cluster(serial: str, events: list[dict]) -> Cluster: + """Compute a Cluster from a list of events that already belong together.""" + timestamps = [_parse_ts(ev["timestamp"]) for ev in events] + first_ts = min(timestamps) + last_ts = max(timestamps) + + project_mode_norm, project_consistency = _mode_string(_normalise(ev.get("project")) for ev in events) + location_mode_norm, location_consistency = _mode_string(_normalise(ev.get("sensor_location")) for ev in events) + + # For display, pick the first RAW value (case-preserved) that + # normalises to the modal normalised value. + def _pick_display(field: str, mode_norm: str) -> str: + for ev in events: + v = (ev.get(field) or "").strip() + if _normalise(v) == mode_norm and v: + return v + return "" + + project_raw = _pick_display("project", project_mode_norm) + location_raw = _pick_display("sensor_location", location_mode_norm) + client_raw = _pick_display("client", _normalise(events[0].get("client"))) + operator_raw = _pick_display("operator", _normalise(events[0].get("operator"))) + + # Strip trailing "- Loc N" location info that operators sometimes bake + # into the project string for email-readability ("I-80 - Loc 2 - 543 W + # Plant Rd" → "I-80"). The sensor_location field already has the + # authoritative location identifier. Use project_root for matching + # and as the suggested project name; keep project_raw for display. 
+ project_root = _extract_project_root(project_raw) + project_norm_for_matching = _normalise(project_root) + + consistency = min(project_consistency, location_consistency) + is_blank = (not project_norm_for_matching) or (not location_mode_norm) + + return Cluster( + cluster_id = _build_cluster_id(serial, first_ts, last_ts), + serial = serial, + first_event_ts = first_ts, + last_event_ts = last_ts, + event_count = len(events), + sample_event_id = events[0]["id"], + project_raw = project_raw, + project_root = project_root, + project_norm = project_norm_for_matching, + location_raw = location_raw, + location_norm = location_mode_norm, + client_raw = client_raw, + operator_raw = operator_raw, + is_blank_meta = is_blank, + metadata_consistency = consistency, + ) + + +# ── Step 4: SFM fetch ───────────────────────────────────────────────────────── + + +async def _fetch_all_events_from_sfm(sfm_base_url: str) -> list[dict]: + """Pull every event from SFM. Currently the only filter we can use is + serial-based, so we have to iterate over known serials. Practically this + means: get the list of units from /db/units, then fetch /db/events per + serial. ~26 calls for the dev box, manageable for prod (~50-100 serials). 
+ """ + async with httpx.AsyncClient(timeout=SFM_TIMEOUT) as client: + units_resp = await client.get(f"{sfm_base_url}/db/units") + units_resp.raise_for_status() + units = units_resp.json() + serials = [u["serial"] for u in units if u.get("serial")] + + async def _fetch_one(serial: str) -> list[dict]: + try: + r = await client.get( + f"{sfm_base_url}/db/events", + params={"serial": serial, "limit": SFM_FETCH_CEILING}, + ) + r.raise_for_status() + payload = r.json() + # Strip waveform_blob (large bytes; we don't need them) + evs = payload.get("events", []) or [] + for ev in evs: + ev.pop("waveform_blob", None) + ev.pop("a5_pickle_filename", None) + return evs + except httpx.HTTPError as e: + log.warning("SFM fetch failed for serial=%s: %s", serial, e) + return [] + + all_events_nested = await asyncio.gather(*[_fetch_one(s) for s in serials]) + + all_events: list[dict] = [] + for batch in all_events_nested: + all_events.extend(batch) + return all_events + + +# ── Step 4 cont.: cluster building from SFM ─────────────────────────────────── + + +async def _scan_clusters( + db: Session, + sfm_base_url: str, +) -> tuple[list[Cluster], int, int]: + """Pull SFM events, pre-filter, time-cluster, metadata-split. + + Returns (clusters, scanned_event_count, already_attributed_event_count). + """ + events = await _fetch_all_events_from_sfm(sfm_base_url) + scanned = len(events) + + # Load all existing UnitAssignments once, group by serial. + all_assignments = db.query(UnitAssignment).all() + assignments_by_serial: dict[str, list[UnitAssignment]] = {} + for a in all_assignments: + assignments_by_serial.setdefault(a.unit_id, []).append(a) + + now = datetime.utcnow() + + # Bucket events by serial. + by_serial: dict[str, list[dict]] = {} + already_attributed = 0 + for ev in events: + serial = ev.get("serial") + if not serial: + continue + ts_raw = ev.get("timestamp") + if not ts_raw: + # Skip events without a timestamp — they can't be clustered. 
+ continue + ts = _parse_ts(ts_raw) + if _events_overlap_existing_assignment(ts, serial, assignments_by_serial, now): + already_attributed += 1 + continue + by_serial.setdefault(serial, []).append(ev) + + # Cluster per serial. + clusters: list[Cluster] = [] + for serial, serial_events in by_serial.items(): + clusters.extend(_events_to_clusters(serial, serial_events)) + + return clusters, scanned, already_attributed + + +# ── Step 5: matching against the existing graph ─────────────────────────────── + + +def _find_best_match( + candidate_norm: str, + candidates: list[tuple[str, str]], # (id, normalised_name) +) -> tuple[Optional[str], Optional[float], str]: + """Return (best_id, best_score, classification). + + classification ∈ {"exact", "fuzzy", "ambiguous", "no_match"} + """ + if not candidate_norm or not candidates: + return None, None, "no_match" + + scored = [(cid, similarity(candidate_norm, cnorm)) for cid, cnorm in candidates] + scored.sort(key=lambda x: x[1], reverse=True) + best_id, best_score = scored[0] + + # Check for ambiguity: 2nd-best within FUZZY_AMBIGUITY_DELTA of best, + # and both above the match threshold. + if len(scored) > 1: + _, second_score = scored[1] + if best_score >= FUZZY_MATCH_THRESHOLD and second_score >= FUZZY_MATCH_THRESHOLD \ + and (best_score - second_score) < FUZZY_AMBIGUITY_DELTA: + return best_id, best_score, "ambiguous" + + if best_score >= FUZZY_EXACT_THRESHOLD: + return best_id, best_score, "exact" + if best_score >= FUZZY_MATCH_THRESHOLD: + return best_id, best_score, "fuzzy" + return None, best_score, "no_match" + + +def _detect_conflicts( + db: Session, + cluster: Cluster, + target_location_id: Optional[str], + proposed_assigned_at: datetime, + proposed_assigned_until: Optional[datetime], +) -> tuple[list[ConflictHint], bool, bool]: + """Return (conflicts, blocking, already_attributed_at_target). 
+ + already_attributed_at_target = True means an existing UnitAssignment for + THIS serial is already at THIS target location during this window — + nothing to do, skip the cluster. + """ + now = datetime.utcnow() + end = proposed_assigned_until or now + + overlapping = ( + db.query(UnitAssignment) + .filter(UnitAssignment.unit_id == cluster.serial) + .all() + ) + conflicts: list[ConflictHint] = [] + already_at_target = False + blocking = False + + for a in overlapping: + a_end = a.assigned_until or now + if proposed_assigned_at < a_end and end > a.assigned_at: + # Overlapping window. + if target_location_id is not None and a.location_id == target_location_id: + already_at_target = True + continue + # Different location → blocking conflict. + loc = db.query(MonitoringLocation).filter_by(id=a.location_id).first() + proj = db.query(Project).filter_by(id=a.project_id).first() + conflicts.append(ConflictHint( + existing_assignment_id = a.id, + other_location_id = a.location_id, + other_location_name = loc.name if loc else "?", + other_project_id = a.project_id, + other_project_name = proj.name if proj else "?", + )) + blocking = True + + return conflicts, blocking, already_at_target + + +def _score_confidence( + cluster: Cluster, + project_match: str, + location_match: str, + conflicts: list[ConflictHint], +) -> str: + """Return 'high' | 'medium' | 'low'. See plan for full rules.""" + if conflicts: + return "low" + if cluster.is_blank_meta: + return "low" + if cluster.event_count < 2: + return "low" + + span_days = (cluster.last_event_ts - cluster.first_event_ts).days + if span_days > SUSPICIOUS_SPAN_DAYS and cluster.event_count < (span_days / 7): + # Cluster spans > 90 days but emits less than 1 event/week — sparse. 
+ return "low" + + if project_match == "ambiguous" or location_match == "ambiguous": + return "low" + + if project_match == "fuzzy" or location_match == "fuzzy": + return "medium" + + # Exact match on both, OR clean create_new on both with good event count + # and non-blank metadata. + if (project_match == "exact" and location_match == "exact"): + return "high" + if (project_match == "create_new" and location_match == "create_new"): + return "high" + # Mixed: one exact, one create_new → medium (probably means location + # is new under an existing project, which is common but worth review) + return "medium" + + +def _build_suggestion(db: Session, cluster: Cluster) -> Suggestion: + """Match cluster against the existing graph, score, detect conflicts.""" + + # Match project. + existing_projects = db.query(Project).filter(Project.status != "deleted").all() + project_candidates = [(p.id, _normalise(p.name)) for p in existing_projects] + if cluster.project_norm: + proj_id, proj_score, proj_match = _find_best_match(cluster.project_norm, project_candidates) + else: + proj_id, proj_score, proj_match = None, None, "create_new" + + project_existing = None + project_match: Literal["exact", "fuzzy", "create_new", "ambiguous"] + if proj_match == "exact": + project_match = "exact" + project_existing = next((p for p in existing_projects if p.id == proj_id), None) + elif proj_match == "fuzzy": + project_match = "fuzzy" + project_existing = next((p for p in existing_projects if p.id == proj_id), None) + elif proj_match == "ambiguous": + project_match = "ambiguous" + project_existing = next((p for p in existing_projects if p.id == proj_id), None) + else: + project_match = "create_new" + + project_suggested_name = ( + project_existing.name if project_existing and project_match == "exact" + else cluster.project_root or cluster.project_raw or f"Project {cluster.serial}" + ) + + # Match location ONLY within the matched project's existing locations. 
+ # If we're creating a new project, every location is also new (location + # names should not be fuzzy-matched across project boundaries — "Loc 1" + # at Project A is a different thing than "Loc 1" at Project B). + if project_existing and project_match in ("exact", "fuzzy"): + location_candidates_objs = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == project_existing.id) + .filter(MonitoringLocation.location_type == "vibration") + .all() + ) + location_candidates = [(l.id, _normalise(l.name)) for l in location_candidates_objs] + if cluster.location_norm: + loc_id, loc_score, loc_match = _find_best_match(cluster.location_norm, location_candidates) + else: + loc_id, loc_score, loc_match = None, None, "create_new" + else: + # Project will be created new → location is automatically new. + location_candidates_objs = [] + loc_id, loc_score, loc_match = None, None, "create_new" + + location_existing = None + location_match: Literal["exact", "fuzzy", "create_new"] + if loc_match == "exact": + location_match = "exact" + location_existing = next((l for l in location_candidates_objs if l.id == loc_id), None) + elif loc_match == "fuzzy": + location_match = "fuzzy" + location_existing = next((l for l in location_candidates_objs if l.id == loc_id), None) + elif loc_match == "ambiguous": + # We treat ambiguous-location as fuzzy for now; the operator can + # pick which one (out of scope: a richer disambiguation UI). + location_match = "fuzzy" + location_existing = next((l for l in location_candidates_objs if l.id == loc_id), None) + else: + location_match = "create_new" + + location_suggested_name = ( + location_existing.name if location_existing and location_match == "exact" + else cluster.location_raw or "Unnamed location" + ) + + # Proposed assignment window. 
+ proposed_at = cluster.first_event_ts - timedelta(hours=1) + now = datetime.utcnow() + if (now - cluster.last_event_ts) <= timedelta(days=RECENT_CLUSTER_DAYS): + proposed_until = None # Treat as active + else: + proposed_until = cluster.last_event_ts + timedelta(hours=1) + + # Conflict detection vs existing UnitAssignments. + target_loc_id = location_existing.id if location_existing else None + conflicts, blocking, already_attributed = _detect_conflicts( + db, cluster, target_loc_id, proposed_at, proposed_until, + ) + + confidence = _score_confidence(cluster, project_match, location_match, conflicts) + + return Suggestion( + cluster = cluster, + project_match = project_match, + project_existing_id = project_existing.id if project_existing else None, + project_existing_name = project_existing.name if project_existing else None, + project_match_score = proj_score, + project_suggested_name = project_suggested_name, + location_match = location_match, + location_existing_id = location_existing.id if location_existing else None, + location_existing_name = location_existing.name if location_existing else None, + location_match_score = loc_score, + location_suggested_name = location_suggested_name, + proposed_assigned_at = proposed_at, + proposed_assigned_until = proposed_until, + confidence = confidence, + conflicts = conflicts, + blocking_conflict = blocking, + already_attributed = already_attributed, + ) + + +# ── Public API: scan ────────────────────────────────────────────────────────── + + +async def scan_clusters_and_build_suggestions( + db: Session, + sfm_base_url: str, +) -> ScanResult: + """End-to-end scan: fetch SFM events, cluster, build suggestions. + + Persists / updates a MetadataBackfillDecision row per non-already-attributed + cluster with status='pending' (first time) or refreshed last_seen_at. 
+ """ + clusters, scanned, already_attributed_events = await _scan_clusters(db, sfm_base_url) + + suggestions: list[Suggestion] = [] + skipped_orphans = 0 + already_attributed_clusters = 0 + now = datetime.utcnow() + + for cluster in clusters: + # Check existing decision. + decision = db.query(MetadataBackfillDecision).filter_by( + cluster_id=cluster.cluster_id + ).first() + + if decision and decision.status == "skipped": + skipped_orphans += 1 + # Refresh last_seen so we know it's still around. + decision.last_seen_at = now + continue + if decision and decision.status == "applied": + # Already done. Phase 2 attribution should be picking up these + # events now. + decision.last_seen_at = now + continue + + suggestion = _build_suggestion(db, cluster) + + if suggestion.already_attributed: + already_attributed_clusters += 1 + continue + + suggestions.append(suggestion) + + # Upsert decision row. + if decision is None: + db.add(MetadataBackfillDecision( + cluster_id = cluster.cluster_id, + status = "conflict" if suggestion.blocking_conflict else "pending", + confidence = suggestion.confidence, + first_seen_at = now, + last_seen_at = now, + serial = cluster.serial, + project_raw = cluster.project_raw or None, + location_raw = cluster.location_raw or None, + first_event_ts = cluster.first_event_ts, + last_event_ts = cluster.last_event_ts, + event_count = cluster.event_count, + )) + else: + decision.status = "conflict" if suggestion.blocking_conflict else "pending" + decision.confidence = suggestion.confidence + decision.last_seen_at = now + decision.event_count = cluster.event_count + decision.last_event_ts = cluster.last_event_ts + + db.commit() + + return ScanResult( + suggestions = suggestions, + skipped_orphans = skipped_orphans, + already_attributed = already_attributed_clusters, + scanned_event_count = scanned, + cluster_count = len(clusters), + ) + + +# ── Public API: apply ───────────────────────────────────────────────────────── + + +def 
_ensure_auto_imported_project_type(db: Session) -> str: + """Create the 'auto_imported' ProjectType if it doesn't exist. Returns id.""" + from backend.models import ProjectType + pt = db.query(ProjectType).filter_by(id=AUTO_IMPORTED_TYPE_ID).first() + if pt is None: + pt = ProjectType( + id = AUTO_IMPORTED_TYPE_ID, + name = AUTO_IMPORTED_TYPE_NAME, + description = "Projects created automatically by the metadata-backfill parser. Operators can re-type them later.", + supports_vibration = True, + supports_sound = False, + ) + db.add(pt) + db.flush() + return pt.id + + +def _ensure_project(db: Session, suggestion: Suggestion) -> tuple[Project, bool]: + """Return (project, created_flag). + + Dedup is normalisation-aware: "SR81" and "SR 81" collapse to the same + project (both normalise to "sr 81"), as do "Fay - Locks & Dam No3" + and "Fay-Locks-&-Dam-No3". Important when applying many clusters in + one bulk operation — the first creates the project, subsequent + clusters with normalisation-equivalent names attach to it instead + of triggering a UNIQUE constraint violation. + """ + if suggestion.project_existing_id: + p = db.query(Project).filter_by(id=suggestion.project_existing_id).first() + if p is not None: + return p, False + + candidate_name = suggestion.project_suggested_name.strip() or f"Auto-imported project ({suggestion.cluster.serial})" + candidate_norm = _normalise(candidate_name) + + # Pre-flight normalised lookup: avoids creating duplicates that + # differ only in punctuation/spacing. + if candidate_norm: + for p in db.query(Project).filter(Project.status != "deleted").all(): + if _normalise(p.name) == candidate_norm: + return p, False + + # Final fallback: case-insensitive exact (cheap, catches the same + # things normalised lookup would but it's harmless to keep). 
+ existing = db.query(Project).filter(Project.name.ilike(candidate_name)).first() + if existing is not None: + return existing, False + + type_id = _ensure_auto_imported_project_type(db) + + # Derive dates. + start = suggestion.cluster.first_event_ts.date() + end = (suggestion.cluster.last_event_ts.date() + if suggestion.proposed_assigned_until else None) + + p = Project( + id = str(uuid.uuid4()), + name = candidate_name, + description = ( + f"Auto-created by the metadata-backfill parser on " + f"{datetime.utcnow():%Y-%m-%d}. Sourced from operator-typed " + f"BW metadata across {suggestion.cluster.event_count} event(s) " + f"from serial {suggestion.cluster.serial}." + ), + project_type_id = type_id, + status = "active", + data_collection_mode = "remote", + client_name = suggestion.cluster.client_raw or None, + start_date = start, + end_date = end, + ) + db.add(p) + db.flush() + + # Ensure the vibration_monitoring module is enabled. + pm = ProjectModule( + id = str(uuid.uuid4()), + project_id = p.id, + module_type = "vibration_monitoring", + enabled = True, + ) + db.add(pm) + db.flush() + + return p, True + + +def _ensure_location( + db: Session, + project: Project, + suggestion: Suggestion, +) -> tuple[MonitoringLocation, bool]: + """Return (location, created_flag).""" + if suggestion.location_existing_id: + l = db.query(MonitoringLocation).filter_by(id=suggestion.location_existing_id).first() + if l is not None and l.project_id == project.id: + return l, False + + candidate_name = suggestion.location_suggested_name.strip() or "Unnamed location" + candidate_norm = _normalise(candidate_name) + + # Normalisation-aware lookup within this project — same dedup + # principle as _ensure_project. + if candidate_norm: + for existing in ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == project.id) + .all() + ): + if _normalise(existing.name) == candidate_norm: + return existing, False + + # Fallback to case-insensitive exact. 
+ existing = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == project.id) + .filter(MonitoringLocation.name.ilike(candidate_name)) + .first() + ) + if existing is not None: + return existing, False + + l = MonitoringLocation( + id = str(uuid.uuid4()), + project_id = project.id, + location_type = "vibration", + name = candidate_name, + description = ( + f"Auto-created by metadata-backfill from operator-typed sensor_location " + f"\"{suggestion.cluster.location_raw}\" on events from serial " + f"{suggestion.cluster.serial}." + ), + ) + db.add(l) + db.flush() + return l, True + + +def _apply_one( + db: Session, + suggestion: Suggestion, + *, + decided_by: str, +) -> tuple[str, str, str]: + """Apply a single suggestion in a transaction. + + Returns (project_id, location_id, assignment_id). + """ + project, _proj_created = _ensure_project(db, suggestion) + location, _loc_created = _ensure_location(db, project, suggestion) + + # Create the UnitAssignment. + assignment_id = str(uuid.uuid4()) + assignment = UnitAssignment( + id = assignment_id, + unit_id = suggestion.cluster.serial, + location_id = location.id, + project_id = project.id, + device_type = "seismograph", + assigned_at = suggestion.proposed_assigned_at, + assigned_until = suggestion.proposed_assigned_until, + status = "active" if suggestion.proposed_assigned_until is None else "completed", + source = "metadata_backfill", + notes = ( + f"Auto-created from operator-typed metadata on " + f"{suggestion.cluster.event_count} event(s). " + f"Project: \"{suggestion.cluster.project_raw}\". " + f"Location: \"{suggestion.cluster.location_raw}\". " + f"Confidence: {suggestion.confidence}." + ), + ) + db.add(assignment) + + # Audit log entry. 
+ db.add(UnitHistory( + unit_id = suggestion.cluster.serial, + change_type = "assignment_backfilled", + field_name = "unit_assignment", + old_value = None, + new_value = f"{project.name} / {location.name}", + changed_at = datetime.utcnow(), + source = "metadata_backfill", + notes = ( + f"Created from {suggestion.cluster.event_count} events tagged " + f"({suggestion.cluster.project_raw!r}, {suggestion.cluster.location_raw!r}). " + f"By: {decided_by}." + ), + )) + + # Update the decision record. + decision = db.query(MetadataBackfillDecision).filter_by( + cluster_id=suggestion.cluster.cluster_id + ).first() + if decision: + decision.status = "applied" + decision.decided_at = datetime.utcnow() + decision.decided_by = decided_by + decision.applied_assignment_id = assignment_id + + return project.id, location.id, assignment_id + + +def apply_suggestions( + db: Session, + suggestions: list[Suggestion], + *, + decided_by: str, +) -> ApplyResult: + """Apply a list of suggestions. Each suggestion is flushed on its own and the batch is committed once at the end. + db.flush() is not a savepoint: the db.rollback() on failure discards every uncommitted change in the session, including suggestions flushed earlier in the same call. 
+ """ + applied = 0 + failed: list[tuple[str, str]] = [] + proj_ids: list[str] = [] + loc_ids: list[str] = [] + asgn_ids: list[str] = [] + + for s in suggestions: + if s.blocking_conflict: + failed.append((s.cluster.cluster_id, "blocking conflict — needs manual resolution")) + continue + try: + p_id, l_id, a_id = _apply_one(db, s, decided_by=decided_by) + db.flush() # ensure FK consistency within transaction + proj_ids.append(p_id) + loc_ids.append(l_id) + asgn_ids.append(a_id) + applied += 1 + except Exception as e: + log.exception("Failed to apply cluster %s", s.cluster.cluster_id) + db.rollback() + failed.append((s.cluster.cluster_id, str(e))) + + db.commit() + + return ApplyResult( + applied = applied, + skipped = 0, + failed = failed, + project_ids_created = list(dict.fromkeys(proj_ids)), + location_ids_created = list(dict.fromkeys(loc_ids)), + assignment_ids_created = asgn_ids, + ) + + +def skip_clusters( + db: Session, + cluster_ids: list[str], + *, + decided_by: str = "operator", +) -> int: + """Mark clusters as skipped so they don't reappear in future scans.""" + now = datetime.utcnow() + n = 0 + for cluster_id in cluster_ids: + decision = db.query(MetadataBackfillDecision).filter_by(cluster_id=cluster_id).first() + if decision is None: + continue + if decision.status not in ("pending", "conflict"): + continue + decision.status = "skipped" + decision.decided_at = now + decision.decided_by = decided_by + n += 1 + db.commit() + return n diff --git a/backend/services/project_merge.py b/backend/services/project_merge.py new file mode 100644 index 0000000..53a1c5d --- /dev/null +++ b/backend/services/project_merge.py @@ -0,0 +1,435 @@ +""" +project_merge.py — consolidate a duplicate project into another. + +Use case: the metadata-backfill parser (and operators) create projects with +slight name variations ("SR81" vs "SR 81", "Swank-Karns Crossing" vs +"Swank-Karns Crossings", "Trumbull-Bryman Mont.Dam" vs +"Trumbull-Brayman-Mont Dam"). 
Operator picks a SOURCE project to merge +into a TARGET project; everything attached to source moves to target, +same-named locations consolidate, and source is soft-deleted. + +Public API: + preview(db, source_id, target_id) → MergePreview + execute(db, source_id, target_id, *, decided_by="operator") → MergeResult + +Both raise HTTPException with appropriate 4xx codes for validation failures. +""" + +from __future__ import annotations + +import logging +from dataclasses import dataclass, field +from datetime import datetime +from typing import Optional + +from fastapi import HTTPException +from sqlalchemy.orm import Session + +from backend.models import ( + Project, + ProjectModule, + MonitoringLocation, + UnitAssignment, + UnitHistory, + MonitoringSession, + DataFile, +) + +log = logging.getLogger("backend.services.project_merge") + + +# ── Dataclasses ─────────────────────────────────────────────────────────────── + + +@dataclass +class LocationMergePlan: + source_id: str + source_name: str + target_id: Optional[str] # None = will be inserted as-new under target project + target_name: Optional[str] # name in target after merge + action: str # "move" | "consolidate" + assignments_moving: int + sessions_moving: int + + +@dataclass +class MergePreview: + source_project_id: str + source_project_name: str + target_project_id: str + target_project_name: str + location_plans: list[LocationMergePlan] = field(default_factory=list) + total_assignments_moving: int = 0 + total_sessions_moving: int = 0 + total_data_files_moving: int = 0 + modules_to_add: list[str] = field(default_factory=list) + warnings: list[str] = field(default_factory=list) + + +@dataclass +class MergeResult: + source_project_id: str + target_project_id: str + assignments_moved: int + locations_moved: int + locations_consolidated: int + sessions_moved: int + data_files_moved: int + modules_added: list[str] + audit_rows_written: int + + +# ── Helpers 
─────────────────────────────────────────────────────────────────── + + +def _normalise_name(s: Optional[str]) -> str: + """Case-insensitive, whitespace-collapsing name normalisation. + + Lighter than metadata_backfill._normalise (no punctuation stripping): + for merging we want "Loc 1" and "loc 1" to match, but NOT "Loc 1" + and "Loc-1" (those might be intentionally different). If operators + DO want loose matching, they can rename one before merging. + """ + if not s: + return "" + import re + return re.sub(r"\s+", " ", s.strip()).casefold() + + +def _validate_pair(db: Session, source_id: str, target_id: str) -> tuple[Project, Project]: + if source_id == target_id: + raise HTTPException(status_code=400, detail="Cannot merge a project into itself.") + + source = db.query(Project).filter_by(id=source_id).first() + target = db.query(Project).filter_by(id=target_id).first() + if source is None: + raise HTTPException(status_code=404, detail="Source project not found.") + if target is None: + raise HTTPException(status_code=404, detail="Target project not found.") + if source.status == "deleted": + raise HTTPException(status_code=400, detail=f"Source project '{source.name}' is already deleted.") + if target.status == "deleted": + raise HTTPException(status_code=400, detail=f"Target project '{target.name}' is deleted.") + + return source, target + + +# ── Preview ─────────────────────────────────────────────────────────────────── + + +def preview(db: Session, source_id: str, target_id: str) -> MergePreview: + """Build a preview of what the merge will do. No writes.""" + source, target = _validate_pair(db, source_id, target_id) + + # Locations in source vs target. 
+ source_locs = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == source_id) + .all() + ) + target_locs = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == target_id) + .all() + ) + target_by_norm = {_normalise_name(l.name): l for l in target_locs} + + location_plans: list[LocationMergePlan] = [] + total_assignments_moving = 0 + total_sessions_moving = 0 + + for sl in source_locs: + n = _normalise_name(sl.name) + tl = target_by_norm.get(n) + + a_count = ( + db.query(UnitAssignment) + .filter(UnitAssignment.location_id == sl.id) + .count() + ) + s_count = ( + db.query(MonitoringSession) + .filter(MonitoringSession.location_id == sl.id) + .count() + ) + total_assignments_moving += a_count + total_sessions_moving += s_count + + if tl is not None: + location_plans.append(LocationMergePlan( + source_id = sl.id, + source_name = sl.name, + target_id = tl.id, + target_name = tl.name, + action = "consolidate", + assignments_moving = a_count, + sessions_moving = s_count, + )) + else: + location_plans.append(LocationMergePlan( + source_id = sl.id, + source_name = sl.name, + target_id = None, + target_name = sl.name, + action = "move", + assignments_moving = a_count, + sessions_moving = s_count, + )) + + # DataFiles attached to the source project (if the table exists with a + # project_id column). Optional — terra-view's DataFile model may not + # always FK to project, so handle gracefully. + df_count = 0 + try: + df_count = ( + db.query(DataFile) + .filter(DataFile.project_id == source_id) + .count() + ) + except Exception: + df_count = 0 + total_data_files_moving = df_count + + # Modules: add anything in source missing from target. 
+ src_modules = { + m.module_type for m in db.query(ProjectModule) + .filter(ProjectModule.project_id == source_id, ProjectModule.enabled.is_(True)) + .all() + } + tgt_modules = { + m.module_type for m in db.query(ProjectModule) + .filter(ProjectModule.project_id == target_id, ProjectModule.enabled.is_(True)) + .all() + } + modules_to_add = sorted(src_modules - tgt_modules) + + warnings: list[str] = [] + # Surface conditions the operator should think about. + consolidations = sum(1 for p in location_plans if p.action == "consolidate") + if consolidations: + warnings.append( + f"{consolidations} location(s) with matching names will be consolidated " + f"(source assignments will move to the target's existing location). " + f"If your same-named locations are actually different sites, rename one first." + ) + if source.client_name and target.client_name and source.client_name.strip().casefold() != target.client_name.strip().casefold(): + warnings.append( + f"Client names differ: source is \"{source.client_name}\", target is " + f"\"{target.client_name}\". Target's client name will be kept." + ) + + return MergePreview( + source_project_id = source.id, + source_project_name = source.name, + target_project_id = target.id, + target_project_name = target.name, + location_plans = location_plans, + total_assignments_moving = total_assignments_moving, + total_sessions_moving = total_sessions_moving, + total_data_files_moving = total_data_files_moving, + modules_to_add = modules_to_add, + warnings = warnings, + ) + + +# ── Execute ─────────────────────────────────────────────────────────────────── + + +def execute( + db: Session, + source_id: str, + target_id: str, + *, + decided_by: str = "operator", +) -> MergeResult: + """Perform the merge in a single transaction. + + Steps: + 1. Re-validate the pair. + 2. 
For each location in source: + - if a same-name location exists in target → "consolidate" mode: + move source's assignments + sessions to target's location id, + delete source's location. + - else → "move" mode: just re-point the location's project_id. + 3. Move any remaining direct-to-project FK rows (DataFiles). + 4. Ensure target has all of source's modules. + 5. Soft-delete source project. + 6. Write a UnitHistory row per assignment that was moved + (change_type='assignment_merged') so the deployment timeline + on each affected unit reflects the merge. + 7. Commit. + """ + source, target = _validate_pair(db, source_id, target_id) + + src_modules = { + m.module_type for m in db.query(ProjectModule) + .filter(ProjectModule.project_id == source_id, ProjectModule.enabled.is_(True)) + .all() + } + tgt_modules = { + m.module_type for m in db.query(ProjectModule) + .filter(ProjectModule.project_id == target_id, ProjectModule.enabled.is_(True)) + .all() + } + modules_to_add = sorted(src_modules - tgt_modules) + + # ── 1. Locations + their dependents ─────────────────────────────── + source_locs = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == source_id) + .all() + ) + target_locs = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == target_id) + .all() + ) + target_by_norm = {_normalise_name(l.name): l for l in target_locs} + + assignments_moved = 0 + sessions_moved = 0 + locations_moved = 0 + locations_consolidated = 0 + audit_rows_written = 0 + + for sl in source_locs: + n = _normalise_name(sl.name) + tl = target_by_norm.get(n) + + # Pull this location's assignments + sessions (we'll re-point them). 
+ assignments = ( + db.query(UnitAssignment) + .filter(UnitAssignment.location_id == sl.id) + .all() + ) + sessions = ( + db.query(MonitoringSession) + .filter(MonitoringSession.location_id == sl.id) + .all() + ) + + if tl is not None: + # Consolidate: move dependents to target's existing location; + # then delete the source location. + for a in assignments: + old_loc_id = a.location_id + a.location_id = tl.id + a.project_id = target.id + + db.add(UnitHistory( + unit_id = a.unit_id, + change_type = "assignment_merged", + field_name = "unit_assignment.project_id", + old_value = f"{source.name} / {sl.name}", + new_value = f"{target.name} / {tl.name}", + changed_at = datetime.utcnow(), + source = "project_merge", + notes = ( + f"Project merge: '{source.name}' → '{target.name}'. " + f"Location consolidated by name match. " + f"By: {decided_by}." + ), + )) + audit_rows_written += 1 + assignments_moved += 1 + + for s in sessions: + s.location_id = tl.id + s.project_id = target.id + sessions_moved += 1 + + # Delete the now-empty source location. + db.delete(sl) + locations_consolidated += 1 + else: + # Move: just re-point this location to the target project. + sl.project_id = target.id + + for a in assignments: + old_proj_id = a.project_id + a.project_id = target.id + + db.add(UnitHistory( + unit_id = a.unit_id, + change_type = "assignment_merged", + field_name = "unit_assignment.project_id", + old_value = f"{source.name} / {sl.name}", + new_value = f"{target.name} / {sl.name}", + changed_at = datetime.utcnow(), + source = "project_merge", + notes = ( + f"Project merge: '{source.name}' → '{target.name}'. " + f"Location moved as-is. By: {decided_by}." + ), + )) + audit_rows_written += 1 + assignments_moved += 1 + + for s in sessions: + s.project_id = target.id + sessions_moved += 1 + + locations_moved += 1 + + # ── 2. 
Direct-to-project rows (DataFiles) ────── + data_files_moved = 0 + try: + data_files = ( + db.query(DataFile) + .filter(DataFile.project_id == source_id) + .all() + ) + for df in data_files: + df.project_id = target.id + data_files_moved += 1 + except Exception as e: + log.warning("DataFile move skipped (model may differ): %s", e) + + # ── 3. UnitAssignments that point directly at source.project_id with + # no location (shouldn't happen but be defensive) ────────────── + orphan_assignments = ( + db.query(UnitAssignment) + .filter(UnitAssignment.project_id == source_id) + .all() + ) + for a in orphan_assignments: + # Already moved if its location was moved. Catch any stragglers. + if a.project_id == source_id: + a.project_id = target.id + + # ── 4. Modules ──────────────────────────────────────────────────── + import uuid + for mod_type in modules_to_add: + db.add(ProjectModule( + id = str(uuid.uuid4()), + project_id = target.id, + module_type = mod_type, + enabled = True, + )) + + # Disable source's modules (defensive — source is being soft-deleted + # but its modules table rows could still be inspected). + for m in db.query(ProjectModule).filter(ProjectModule.project_id == source_id).all(): + m.enabled = False + + # ── 5. Soft-delete source ───────────────────────────────────────── + source.status = "deleted" + source.deleted_at = datetime.utcnow() + + # Final audit row on the source project itself (operator-facing). + # We don't have a Project-level history table, so log on every + # affected unit as a marker. Already done per-assignment above. 
+ + db.commit() + + return MergeResult( + source_project_id = source.id, + target_project_id = target.id, + assignments_moved = assignments_moved, + locations_moved = locations_moved, + locations_consolidated = locations_consolidated, + sessions_moved = sessions_moved, + data_files_moved = data_files_moved, + modules_added = modules_to_add, + audit_rows_written = audit_rows_written, + ) diff --git a/backend/services/project_tidy.py b/backend/services/project_tidy.py new file mode 100644 index 0000000..482b517 --- /dev/null +++ b/backend/services/project_tidy.py @@ -0,0 +1,235 @@ +""" +project_tidy.py — find duplicate-looking projects + offer bulk merge. + +The metadata-backfill parser is good at clustering events into candidate +projects but doesn't compare its proposed project names against EACH OTHER +(it only checks against existing terra-view projects). After a bulk +apply, you can end up with many near-duplicate projects — typo variants, +abbreviation differences, etc. This module surfaces them as pairs the +operator can merge. + +Pairs vs clusters: a fully-connected group like (A, B, C) where each pair +scores >= threshold becomes 3 pairs. The operator has to do 2 merges to +fully consolidate. We don't try to be smarter about transitive grouping — +in practice operators want to review the highest-similarity pair first +anyway, and the list re-computes after each merge. 
+ +Public API: + find_duplicate_pairs(db, *, threshold=0.85, max_pairs=200) → list[DuplicatePair] +""" + +from __future__ import annotations + +import logging +from dataclasses import dataclass +from typing import Optional + +import rapidfuzz +from sqlalchemy import func +from sqlalchemy.orm import Session + +from backend.models import ( + Project, + MonitoringLocation, + UnitAssignment, +) +from backend.services.metadata_backfill import _normalise as _meta_normalise + +log = logging.getLogger("backend.services.project_tidy") + + +DEFAULT_THRESHOLD = 0.85 # WRatio similarity above which we surface a pair +DEFAULT_MAX_PAIRS = 200 # Cap the result list to keep response small +MIN_NORMALISED_LENGTH = 4 # Skip projects whose normalised name is too short + # to fuzzy-match safely (avoids "1" / "1" pairs). + + +@dataclass +class ProjectSummary: + id: str + name: str + project_number: Optional[str] + client_name: Optional[str] + source: str # 'manual' | 'metadata_backfill' | ... + status: str + location_count: int + assignment_count: int + event_count_total: int # approx — sum across assignments + + +@dataclass +class DuplicatePair: + a: ProjectSummary + b: ProjectSummary + score: float + suggested_target_id: str # the recommended "keep" side + reason: str # why we picked that target + + +# ── Helpers ────────────────────────────────────────────────────────────────── + + +def _normalise_project_name(name: str) -> str: + """Project-name normalisation for tidy comparison. + + Reuses the metadata_backfill normaliser (lowercase, punctuation→space, + collapse whitespace). Returns "" for None or all-punctuation names. + """ + return _meta_normalise(name) + + +def _summarise_projects(db: Session) -> list[ProjectSummary]: + """One row per active project with cached counts. Excludes deleted.""" + projects = ( + db.query(Project) + .filter(Project.status != "deleted") + .all() + ) + + # Bulk lookup: assignment counts + location counts per project. 
+ loc_counts: dict[str, int] = dict( + db.query(MonitoringLocation.project_id, func.count(MonitoringLocation.id)) + .filter(MonitoringLocation.project_id.in_([p.id for p in projects]) if projects else False) + .group_by(MonitoringLocation.project_id) + .all() + ) + asgn_counts: dict[str, int] = dict( + db.query(UnitAssignment.project_id, func.count(UnitAssignment.id)) + .filter(UnitAssignment.project_id.in_([p.id for p in projects]) if projects else False) + .group_by(UnitAssignment.project_id) + .all() + ) + + summaries: list[ProjectSummary] = [] + for p in projects: + summaries.append(ProjectSummary( + id = p.id, + name = p.name, + project_number = p.project_number, + client_name = p.client_name, + source = None, # filled below per assignment + status = p.status or "active", + location_count = loc_counts.get(p.id, 0), + assignment_count = asgn_counts.get(p.id, 0), + event_count_total = 0, # not cheap to compute here; left 0 + )) + + # Determine each project's dominant assignment source. Used to break ties + # when picking the "keep" target — prefer manual over parser-created. + rows = ( + db.query(UnitAssignment.project_id, UnitAssignment.source, func.count(UnitAssignment.id)) + .group_by(UnitAssignment.project_id, UnitAssignment.source) + .all() + ) + by_proj_src: dict[str, dict[str, int]] = {} + for proj_id, src, cnt in rows: + by_proj_src.setdefault(proj_id, {})[src or "manual"] = cnt + for s in summaries: + src_map = by_proj_src.get(s.id, {}) + if not src_map: + s.source = "manual" + else: + # Dominant source (most assignments). + s.source = max(src_map.items(), key=lambda kv: kv[1])[0] + + return summaries + + +def _pick_target(a: ProjectSummary, b: ProjectSummary) -> tuple[str, str]: + """Decide which project should be the merge target (the one we keep). + + Priorities (in order): + 1. The one with `source='manual'` over `source='metadata_backfill'` + — operator-curated projects beat parser-created ones. + 2. The one with a populated `project_number`. + 3. 
The one with more locations (more curation history). + 4. The one with more assignments. + 5. The one with the shorter, cleaner name (tiebreaker). + + Returns (target_id, reason_string). + """ + # 1. Source provenance. + a_manual = a.source == "manual" + b_manual = b.source == "manual" + if a_manual and not b_manual: + return a.id, "A is manually-created; B is parser-created" + if b_manual and not a_manual: + return b.id, "B is manually-created; A is parser-created" + + # 2. project_number populated. + if a.project_number and not b.project_number: + return a.id, "A has a project_number; B doesn't" + if b.project_number and not a.project_number: + return b.id, "B has a project_number; A doesn't" + + # 3. More locations. + if a.location_count > b.location_count: + return a.id, f"A has more locations ({a.location_count} vs {b.location_count})" + if b.location_count > a.location_count: + return b.id, f"B has more locations ({b.location_count} vs {a.location_count})" + + # 4. More assignments. + if a.assignment_count > b.assignment_count: + return a.id, f"A has more assignments ({a.assignment_count} vs {b.assignment_count})" + if b.assignment_count > a.assignment_count: + return b.id, f"B has more assignments ({b.assignment_count} vs {a.assignment_count})" + + # 5. Shorter name (less likely to have baked-in junk). + if len(a.name) <= len(b.name): + return a.id, "A has the shorter / cleaner name" + return b.id, "B has the shorter / cleaner name" + + +# ── Public ─────────────────────────────────────────────────────────────────── + + +def find_duplicate_pairs( + db: Session, + *, + threshold: float = DEFAULT_THRESHOLD, + max_pairs: int = DEFAULT_MAX_PAIRS, +) -> list[DuplicatePair]: + """Compute all project-pair similarities above `threshold`. + + O(N^2) over the project count — fine up to ~500 projects; beyond that + we'd want a blocked / token-indexed approach. 
In practice + `metadata_backfill` projects tend to share tokens, so a simple + pre-filter (skip pairs that share NO tokens) would cheaply cut the + inner loop. Deferred until profiling motivates it. + """ + summaries = _summarise_projects(db) + + # Pre-compute normalised names; skip too-short ones. + norm_by_id: dict[str, str] = {} + candidates: list[ProjectSummary] = [] + for s in summaries: + n = _normalise_project_name(s.name) + if len(n) < MIN_NORMALISED_LENGTH: + continue + norm_by_id[s.id] = n + candidates.append(s) + + pairs: list[DuplicatePair] = [] + n = len(candidates) + for i in range(n): + a = candidates[i] + a_norm = norm_by_id[a.id] + for j in range(i + 1, n): + b = candidates[j] + b_norm = norm_by_id[b.id] + score = rapidfuzz.fuzz.WRatio(a_norm, b_norm) / 100.0 + if score < threshold: + continue + target_id, reason = _pick_target(a, b) + pairs.append(DuplicatePair( + a = a, + b = b, + score = score, + suggested_target_id = target_id, + reason = reason, + )) + + # Sort by score desc, then by total content (more data → review first). + pairs.sort(key=lambda p: (-p.score, -(p.a.assignment_count + p.b.assignment_count))) + + return pairs[:max_pairs] diff --git a/backend/services/sfm_events.py b/backend/services/sfm_events.py new file mode 100644 index 0000000..b866818 --- /dev/null +++ b/backend/services/sfm_events.py @@ -0,0 +1,592 @@ +""" +SFM events service — bridge between terra-view's UnitAssignment time-windows +and the SFM (seismo-relay) events store. + +Architecture: + 1. Terra-view owns the *assignment graph*: which seismograph was at which + monitoring location during which time window (UnitAssignment rows). + 2. SFM owns the *events store*: triggered waveform events keyed by + (serial, timestamp), forwarded from Blastware ACH by series3-watcher. + 3. This module fans out the assignments for a given location, queries SFM + for the events emitted by each (serial, window) pair concurrently, and + unions/sorts/paginates the results. 
+ +SFM remains the single source of truth for events. Terra-view does not +copy events into its own DB; every query hits SFM live. + +The events_for_location helper is also reused by Phase 3 (project-level +roll-up) to aggregate across every location in a project. +""" + +from __future__ import annotations + +import asyncio +import logging +import os +from datetime import datetime, timezone +from typing import Optional + +import httpx +from sqlalchemy.orm import Session + +from backend.models import UnitAssignment, RosterUnit, MonitoringLocation, Project + +log = logging.getLogger("backend.services.sfm_events") + +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") + +# Per-request timeout when calling SFM /db/events. SFM is local on the +# docker network so this should be fast; bump if you start seeing timeouts. +_SFM_TIMEOUT_SECONDS = 10.0 + +# Max events we ever fetch per (serial, window) call to SFM. Must match +# SFM's own /db/events max limit (currently 5000). The user-facing display +# limit is independent — we over-fetch up to this cap so summary stats are +# accurate, then trim the displayed list to the requested limit. +_SFM_FETCH_CEILING = 5000 + + +# ── Helpers ─────────────────────────────────────────────────────────────────── + + +def _iso_utc(dt: Optional[datetime]) -> Optional[str]: + """Render a datetime in the ISO format SFM /db/events expects.""" + if dt is None: + return None + # SFM parses naive ISO strings as UTC; strip tzinfo for consistency. + if dt.tzinfo is not None: + dt = dt.astimezone(timezone.utc).replace(tzinfo=None) + return dt.isoformat(sep=" ", timespec="seconds") + + +def _intersect_window( + assignment_start: datetime, + assignment_end: Optional[datetime], + filter_from: Optional[datetime], + filter_to: Optional[datetime], + now: datetime, +) -> Optional[tuple[datetime, datetime]]: + """Intersect an assignment window with the requested filter window. 
+ + Returns (effective_start, effective_end) or None if there's no overlap. + Open-ended assignments (assigned_until=NULL) are bounded by `now`. + """ + a_end = assignment_end or now + if filter_from and a_end <= filter_from: + return None + if filter_to and assignment_start >= filter_to: + return None + start = max(assignment_start, filter_from) if filter_from else assignment_start + end = min(a_end, filter_to) if filter_to else a_end + if end <= start: + return None + return (start, end) + + +async def _fetch_events_for_serial( + client: httpx.AsyncClient, + serial: str, + *, + from_dt: datetime, + to_dt: datetime, + false_trigger: Optional[bool], + limit: int, +) -> list[dict]: + """Issue one /db/events call to SFM for one (serial, window) pair.""" + params: dict[str, str] = { + "serial": serial, + "from_dt": _iso_utc(from_dt) or "", + "to_dt": _iso_utc(to_dt) or "", + "limit": str(limit), + } + if false_trigger is not None: + params["false_trigger"] = "true" if false_trigger else "false" + + try: + resp = await client.get(f"{SFM_BASE_URL}/db/events", params=params) + resp.raise_for_status() + except httpx.HTTPError as e: + log.warning("SFM /db/events failed for serial=%s: %s", serial, e) + return [] + + payload = resp.json() + events = payload.get("events", []) or [] + # Strip waveform_blob if present — it's the big per-event binary and we + # don't render it in the list view. SFM returns it by default. + for ev in events: + ev.pop("waveform_blob", None) + ev.pop("a5_pickle_filename", None) + return events + + +# ── Public API ──────────────────────────────────────────────────────────────── + + +async def events_for_location( + db: Session, + location_id: str, + *, + from_dt: Optional[datetime] = None, + to_dt: Optional[datetime] = None, + false_trigger: Optional[bool] = None, + limit: int = 500, +) -> dict: + """Fan out UnitAssignment rows for `location_id` and union SFM events. 
+ + Returns: + { + "events": [merged event dicts, newest first, capped at limit], + "count": total events found across all windows (pre-cap), + "stats": {event_count, peak_pvs, peak_pvs_at, + last_event, false_trigger_count}, + "assignments_used": [{unit_id, assigned_at, assigned_until, + events_in_window}, ...], + } + + The "events outside any assignment window" rule (Phase 1 design decision): + events whose timestamp falls outside every assignment window are simply + not fetched — we only ask SFM for events inside the intersected windows. + Those orphan events surface under the per-unit detail page in Phase 2. + """ + # 1. Fetch all assignments (active + closed) for the location. + assignments = ( + db.query(UnitAssignment) + .filter(UnitAssignment.location_id == location_id) + .filter(UnitAssignment.device_type == "seismograph") + .order_by(UnitAssignment.assigned_at.asc()) + .all() + ) + + if not assignments: + return { + "events": [], + "count": 0, + "stats": _empty_stats(), + "assignments_used": [], + } + + now = datetime.utcnow() + + # 2. For each assignment, compute the effective (start, end) window after + # intersecting with the requested filter range. Drop assignments that + # don't overlap the filter window. + fetch_specs: list[tuple[UnitAssignment, datetime, datetime]] = [] + for a in assignments: + window = _intersect_window(a.assigned_at, a.assigned_until, from_dt, to_dt, now) + if window is not None: + fetch_specs.append((a, window[0], window[1])) + + if not fetch_specs: + return { + "events": [], + "count": 0, + "stats": _empty_stats(), + "assignments_used": [ + { + "unit_id": a.unit_id, + "assigned_at": _iso_utc(a.assigned_at), + "assigned_until": _iso_utc(a.assigned_until), + "events_in_window": 0, + } + for a in assignments + ], + } + + # 3. Concurrent SFM fetches. 
We over-fetch (up to _SFM_FETCH_CEILING per + # window) so summary stats reflect the true peak/last/count across the + # full filter window, not just what fits in the user's display limit. + # The displayed event list is trimmed to `limit` after merge. + async with httpx.AsyncClient(timeout=_SFM_TIMEOUT_SECONDS) as client: + per_window_lists = await asyncio.gather( + *( + _fetch_events_for_serial( + client, + serial=a.unit_id, + from_dt=start, + to_dt=end, + false_trigger=false_trigger, + limit=_SFM_FETCH_CEILING, + ) + for a, start, end in fetch_specs + ), + return_exceptions=False, + ) + + # 4. Build the per-assignment event counts (transparency for the operator). + spec_event_counts: dict[str, int] = {} + for (a, _start, _end), evs in zip(fetch_specs, per_window_lists): + spec_event_counts[a.id] = len(evs) + + # 5. Union, sort newest-first, cap. + merged: list[dict] = [] + for evs in per_window_lists: + merged.extend(evs) + merged.sort(key=lambda e: e.get("timestamp") or "", reverse=True) + total_count = len(merged) + capped = merged[:limit] + + # 6. Compute summary stats over the full merged set (not the capped one). + stats = _compute_stats(merged) + + # 7. Build the assignments_used report (every assignment, in chronological + # order, with its event count — even ones that fell outside the filter + # window so the operator sees them but with count=0). 
+ assignments_used = [] + for a in assignments: + assignments_used.append( + { + "unit_id": a.unit_id, + "assignment_id": a.id, + "assigned_at": _iso_utc(a.assigned_at), + "assigned_until": _iso_utc(a.assigned_until), + "events_in_window": spec_event_counts.get(a.id, 0), + "status": a.status, + } + ) + + return { + "events": capped, + "count": total_count, + "stats": stats, + "assignments_used": assignments_used, + } + + +# ── Per-unit (cross-project) view ───────────────────────────────────────────── + + +async def events_for_unit( + db: Session, + unit_id: str, + *, + bucket: str = "all", # "all" | "attributed" | "unattributed" + from_dt: Optional[datetime] = None, + to_dt: Optional[datetime] = None, + false_trigger: Optional[bool] = None, + limit: int = 500, +) -> dict: + """Return events for a unit annotated with their assignment attribution. + + Unlike events_for_location (which queries SFM per assignment window), this + helper queries SFM for ALL events for the serial within the optional + [from_dt, to_dt] filter, then walks each event against the unit's + UnitAssignment intervals to compute attribution. + + Bucket semantics: + - "all": every event, attributed or not + - "attributed": events that fall inside at least one assignment window + - "unattributed": events with no overlapping assignment (the diagnostic + bucket — operator should fix assignment dates to + attribute these) + + Each event gets an extra `attribution` field: + {assignment_id, location_id, location_name, project_id, project_name, + assigned_at, assigned_until} or None + + Unattributed events also get a `nearest_assignment` field with the + same shape plus `delta_days` (signed; negative = event before assignment). + """ + # 1. Pull all assignments for this unit (any device_type — caller has + # already filtered by seismograph in the route). 
Order matters: we + # want the earliest-start assignment first so attribution prefers the + # chronologically-first overlap when there are simultaneous active + # assignments at different locations (rare but possible). + assignments = ( + db.query(UnitAssignment) + .filter(UnitAssignment.unit_id == unit_id) + .order_by(UnitAssignment.assigned_at.asc()) + .all() + ) + + # Resolve location + project names once. + loc_ids = {a.location_id for a in assignments} + proj_ids = {a.project_id for a in assignments} + loc_map = { + l.id: l for l in db.query(MonitoringLocation).filter( + MonitoringLocation.id.in_(loc_ids) + ).all() + } if loc_ids else {} + proj_map = { + p.id: p for p in db.query(Project).filter( + Project.id.in_(proj_ids) + ).all() + } if proj_ids else {} + + now = datetime.utcnow() + + def _attr_dict(a: UnitAssignment) -> dict: + loc = loc_map.get(a.location_id) + proj = proj_map.get(a.project_id) + return { + "assignment_id": a.id, + "location_id": a.location_id, + "location_name": loc.name if loc else None, + "project_id": a.project_id, + "project_name": proj.name if proj else None, + "assigned_at": _iso_utc(a.assigned_at), + "assigned_until": _iso_utc(a.assigned_until), + } + + # 2. Fetch all events for this serial in one shot. + async with httpx.AsyncClient(timeout=_SFM_TIMEOUT_SECONDS) as client: + events = await _fetch_events_for_serial( + client, + serial=unit_id, + from_dt=from_dt or datetime(1970, 1, 1), + to_dt=to_dt or now, + false_trigger=false_trigger, + limit=_SFM_FETCH_CEILING, + ) + + # 3. For each event, walk the assignment list and find the first + # overlapping window. O(N * M) but both are small in practice. + for ev in events: + ts_str = ev.get("timestamp") + if not ts_str: + ev["attribution"] = None + continue + try: + # SFM returns ISO with "T" separator; tolerate both. 
+ ts = datetime.fromisoformat(ts_str.replace(" ", "T")) + except ValueError: + ev["attribution"] = None + continue + + matched: Optional[UnitAssignment] = None + for a in assignments: + a_end = a.assigned_until or now + if a.assigned_at <= ts <= a_end: + matched = a + break + + if matched is not None: + ev["attribution"] = _attr_dict(matched) + else: + ev["attribution"] = None + # Find the nearest assignment (chronologically) for diagnostic. + if assignments: + nearest = min( + assignments, + key=lambda a: min( + abs((ts - a.assigned_at).total_seconds()), + abs((ts - (a.assigned_until or now)).total_seconds()), + ), + ) + # Signed delta in days from the nearest boundary + # (negative = event BEFORE that boundary). + if ts < nearest.assigned_at: + delta_seconds = (ts - nearest.assigned_at).total_seconds() + elif ts > (nearest.assigned_until or now): + delta_seconds = (ts - (nearest.assigned_until or now)).total_seconds() + else: + delta_seconds = 0 + ev["nearest_assignment"] = { + **_attr_dict(nearest), + "delta_days": round(delta_seconds / 86400, 1), + } + + # 4. Apply bucket filter. + if bucket == "attributed": + filtered = [e for e in events if e.get("attribution") is not None] + elif bucket == "unattributed": + filtered = [e for e in events if e.get("attribution") is None] + else: + filtered = events + + filtered.sort(key=lambda e: e.get("timestamp") or "", reverse=True) + total_count = len(filtered) + capped = filtered[:limit] + + # 5. Stats: compute over the ENTIRE event set (not the filtered bucket) + # so the unattributed_count tile is always meaningful regardless of + # which bucket the operator has selected. 
+ base_stats = _compute_stats(events) + unattributed_count = sum( + 1 for e in events if e.get("attribution") is None + ) + base_stats["unattributed_count"] = unattributed_count + + return { + "events": capped, + "count": total_count, + "stats": base_stats, + "assignments_total": len(assignments), + } + + +# ── Project-level roll-up (aggregates across all vibration locations) ───────── + + +async def vibration_summary_for_project( + db: Session, + project_id: str, + *, + from_dt: Optional[datetime] = None, + to_dt: Optional[datetime] = None, +) -> dict: + """Aggregate SFM events across every vibration location in a project. + + Returns: + { + "project_id": str, + "total_events": int, + "peak_pvs": float | None, + "peak_pvs_at": ISO timestamp | None, + "peak_pvs_location_id": str | None, + "peak_pvs_location_name": str | None, + "last_event": ISO timestamp | None, + "false_trigger_count": int, + "per_location": [ + {"location_id", "location_name", "event_count", + "peak_pvs", "last_event"}, + ... # sorted by event_count DESC + ], + "vibration_location_count": int, + } + """ + locations = ( + db.query(MonitoringLocation) + .filter(MonitoringLocation.project_id == project_id) + .filter(MonitoringLocation.location_type == "vibration") + .all() + ) + + if not locations: + return { + "project_id": project_id, + "total_events": 0, + "peak_pvs": None, + "peak_pvs_at": None, + "peak_pvs_location_id": None, + "peak_pvs_location_name": None, + "last_event": None, + "false_trigger_count": 0, + "per_location": [], + "vibration_location_count": 0, + } + + # Fan out across locations. Each call internally fans out across that + # location's UnitAssignment rows, so this is a nested fan-out. Both + # tiers run concurrently: asyncio.gather here and again inside each call. + results = await asyncio.gather( + *( + events_for_location( + db, + loc.id, + from_dt=from_dt, + to_dt=to_dt, + false_trigger=None, + limit=1, # We only need stats; events list itself is ignored. 
+ ) + for loc in locations + ), + return_exceptions=False, + ) + + per_location: list[dict] = [] + total_events = 0 + peak_pvs = None + peak_pvs_at = None + peak_pvs_location_id = None + peak_pvs_location_name = None + last_event = None + false_trigger_count = 0 + + for loc, res in zip(locations, results): + st = res.get("stats", {}) or {} + ec = st.get("event_count", 0) or 0 + total_events += ec + false_trigger_count += st.get("false_trigger_count", 0) or 0 + + ev_last = st.get("last_event") + if ev_last and (last_event is None or ev_last > last_event): + last_event = ev_last + + ev_peak = st.get("peak_pvs") + if ev_peak is not None and (peak_pvs is None or ev_peak > peak_pvs): + peak_pvs = ev_peak + peak_pvs_at = st.get("peak_pvs_at") + peak_pvs_location_id = loc.id + peak_pvs_location_name = loc.name + + per_location.append({ + "location_id": loc.id, + "location_name": loc.name, + "event_count": ec, + "peak_pvs": ev_peak, + "last_event": ev_last, + }) + + per_location.sort(key=lambda r: r["event_count"], reverse=True) + + return { + "project_id": project_id, + "total_events": total_events, + "peak_pvs": peak_pvs, + "peak_pvs_at": peak_pvs_at, + "peak_pvs_location_id": peak_pvs_location_id, + "peak_pvs_location_name": peak_pvs_location_name, + "last_event": last_event, + "false_trigger_count": false_trigger_count, + "per_location": per_location, + "vibration_location_count": len(locations), + } + + +# ── Stats helpers ───────────────────────────────────────────────────────────── + + +def _empty_stats() -> dict: + return { + "event_count": 0, + "peak_pvs": None, + "peak_pvs_at": None, + "peak_pvs_serial": None, + "last_event": None, + "false_trigger_count": 0, + } + + +def _compute_stats(events: list[dict]) -> dict: + """Roll up summary stats from a merged event list. Cheap O(N) pass. + + The "Overall Peak" stat (peak_pvs) EXCLUDES events flagged as false + triggers — operators care about the highest REAL event, not the + biggest sensor glitch. 
false_trigger_count still includes them so + operators can see how many were filtered out. last_event uses + every event regardless (it's about activity recency, not magnitude). + """ + if not events: + return _empty_stats() + + peak_pvs = None + peak_pvs_at = None + peak_pvs_serial = None + last_event = None + false_trigger_count = 0 + + for ev in events: + is_false_trigger = bool(ev.get("false_trigger")) + if is_false_trigger: + false_trigger_count += 1 + + # Peak calculation: skip flagged false triggers. + if not is_false_trigger: + pvs = ev.get("peak_vector_sum") + if pvs is not None and (peak_pvs is None or pvs > peak_pvs): + peak_pvs = pvs + peak_pvs_at = ev.get("timestamp") + peak_pvs_serial = ev.get("serial") + + ts = ev.get("timestamp") + if ts and (last_event is None or ts > last_event): + last_event = ts + + return { + "event_count": len(events), + "peak_pvs": peak_pvs, + "peak_pvs_at": peak_pvs_at, + "peak_pvs_serial": peak_pvs_serial, + "last_event": last_event, + "false_trigger_count": false_trigger_count, + } diff --git a/backend/services/snapshot.py b/backend/services/snapshot.py index da54f65..f01cef6 100644 --- a/backend/services/snapshot.py +++ b/backend/services/snapshot.py @@ -1,9 +1,77 @@ from datetime import datetime, timezone +import logging +import os +import threading +import time +from typing import Optional + +import httpx from sqlalchemy.orm import Session from backend.database import get_db_session from backend.models import Emitter, RosterUnit, IgnoredUnit +log = logging.getLogger(__name__) + +SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200") + +# Tiny module-level cache: /api/status-snapshot is polled every 10s by the +# dashboard, and we don't want to hammer SFM with one /db/units roundtrip per +# call. 15s TTL keeps the cache mostly hot, with occasional refreshes. 
+_SFM_CACHE_TTL_SECONDS = 15.0 +_sfm_cache_lock = threading.Lock() +_sfm_cache: dict = {"fetched_at": 0.0, "data": None, "reachable": False} + + +def _parse_sfm_timestamp(ts_str: Optional[str]) -> Optional[datetime]: + """SFM /db/units returns naive ISO timestamps (no tz suffix). Treat them + as UTC, mirroring how the watcher heartbeat stores Emitter.last_seen.""" + if not ts_str: + return None + try: + ts = datetime.fromisoformat(ts_str.replace("Z", "+00:00")) + except ValueError: + return None + if ts.tzinfo is None: + ts = ts.replace(tzinfo=timezone.utc) + return ts + + +def fetch_sfm_unit_last_seen() -> tuple[dict[str, datetime], bool]: + """Return ({serial: last_seen_utc}, sfm_reachable). + + Cached for _SFM_CACHE_TTL_SECONDS. On any HTTP error returns ({}, False) + so callers transparently fall back to the watcher-heartbeat path. + """ + now = time.monotonic() + with _sfm_cache_lock: + if _sfm_cache["data"] is not None and (now - _sfm_cache["fetched_at"]) < _SFM_CACHE_TTL_SECONDS: + return _sfm_cache["data"], _sfm_cache["reachable"] + + data: dict[str, datetime] = {} + reachable = False + try: + with httpx.Client(timeout=4.0) as client: + resp = client.get(f"{SFM_BASE_URL}/db/units") + resp.raise_for_status() + payload = resp.json() or [] + for row in payload: + serial = row.get("serial") + ts = _parse_sfm_timestamp(row.get("last_seen")) + if serial and ts is not None: + data[serial] = ts + reachable = True + except httpx.HTTPError as e: + log.warning("SFM /db/units unreachable for status snapshot: %s", e) + except Exception as e: # noqa: BLE001 — defensive against malformed payload + log.warning("SFM /db/units parse error: %s", e) + + with _sfm_cache_lock: + _sfm_cache["fetched_at"] = now + _sfm_cache["data"] = data + _sfm_cache["reachable"] = reachable + return data, reachable + def ensure_utc(dt): if dt is None: @@ -69,6 +137,11 @@ def emit_status_snapshot(): emitters = {e.id: e for e in db.query(Emitter).all()} ignored = {i.id for i in 
db.query(IgnoredUnit).all()} + # SFM event-forwards are now the primary "last seen" signal for + # seismographs. Watcher heartbeats stay as a backup — if SFM is down + # or hasn't seen a serial, we fall back to Emitter.last_seen. + sfm_last_seen_map, sfm_reachable = fetch_sfm_unit_last_seen() + units = {} # --- Merge roster entries first --- @@ -93,24 +166,49 @@ def emit_status_snapshot(): last_seen = None fname = "" else: - if e: - last_seen = ensure_utc(e.last_seen) - # RECALCULATE status based on current time, not stored value + device_type = r.device_type or "seismograph" + emitter_last_seen = ensure_utc(e.last_seen) if e else None + fname = e.last_file if e else "" + + # SFM-primary, heartbeat-backup logic — only for seismographs. + # (SLMs / modems aren't forwarded into SFM's events store.) + sfm_last_seen = sfm_last_seen_map.get(unit_id) if device_type == "seismograph" else None + + if sfm_last_seen and emitter_last_seen: + # Both sources reported — use whichever is more recent. + if sfm_last_seen >= emitter_last_seen: + last_seen = sfm_last_seen + last_seen_source = "sfm" + else: + last_seen = emitter_last_seen + last_seen_source = "heartbeat" + elif sfm_last_seen: + last_seen = sfm_last_seen + last_seen_source = "sfm" + elif emitter_last_seen: + last_seen = emitter_last_seen + # If SFM was reachable but doesn't have this serial, it + # means the unit is calling home to the watcher but not + # being forwarded — still a working state for now. 
+ last_seen_source = "heartbeat" + else: + last_seen = None + last_seen_source = "none" + + if last_seen is not None: status = calculate_status(last_seen, status_ok_threshold, status_pending_threshold) age = format_age(last_seen) - fname = e.last_file else: - # Rostered but no emitter data status = "Missing" - last_seen = None age = "N/A" - fname = "" units[unit_id] = { "id": unit_id, "status": status, "age": age, "last": last_seen.isoformat() if last_seen else None, + "last_seen_source": last_seen_source, + "sfm_reachable": sfm_reachable, "fname": fname, "deployed": r.deployed, "note": r.note or "", @@ -136,14 +234,23 @@ def emit_status_snapshot(): # --- Add unexpected emitter-only units --- for unit_id, e in emitters.items(): if unit_id not in roster: - last_seen = ensure_utc(e.last_seen) + emitter_last_seen = ensure_utc(e.last_seen) + sfm_last_seen = sfm_last_seen_map.get(unit_id) + if sfm_last_seen and (not emitter_last_seen or sfm_last_seen >= emitter_last_seen): + last_seen = sfm_last_seen + last_seen_source = "sfm" + else: + last_seen = emitter_last_seen + last_seen_source = "heartbeat" # RECALCULATE status for unknown units too status = calculate_status(last_seen, status_ok_threshold, status_pending_threshold) units[unit_id] = { "id": unit_id, "status": status, "age": format_age(last_seen), - "last": last_seen.isoformat(), + "last": last_seen.isoformat() if last_seen else None, + "last_seen_source": last_seen_source, + "sfm_reachable": sfm_reachable, "fname": e.last_file, "deployed": False, # default "note": "", @@ -192,6 +299,7 @@ def emit_status_snapshot(): unit_data["status"] = paired_unit.get("status", "Missing") unit_data["age"] = paired_unit.get("age", "N/A") unit_data["last"] = paired_unit.get("last") + unit_data["last_seen_source"] = paired_unit.get("last_seen_source", "none") unit_data["derived_from"] = paired_unit_id # Separate buckets for UI diff --git a/backend/static/event-modal.js b/backend/static/event-modal.js new file mode 100644 index 
0000000..b41d961 --- /dev/null +++ b/backend/static/event-modal.js @@ -0,0 +1,401 @@ +/* event-modal.js — shared event-detail modal. + * + * Used by: + * - /sfm (admin Events tab) + * - /projects/{p}/nrl/{l} (project-location Events tab) + * - /unit/{id} (unit-detail SFM Events table) + * + * Pages must include partials/event_detail_modal.html in the body + * before this script is loaded. + * + * Public API: + * showEventDetail(eventId) + * Open the modal and fetch /api/sfm/db/events/{id}/sidecar to + * populate the rich BW report fields (peaks, ZC freq, sensor + * self-check, device info, etc.) into a tabbed/sectioned view. + * + * closeEventDetailModal() + * Close the modal. + * + * Notes: + * - Fetches sidecar live from SFM via terra-view's /api/sfm proxy. + * - Renders gracefully when the sidecar lacks a bw_report block + * (older events forwarded before the _ASCII.TXT pairing fix). + * - All functions are global on window so inline onclick handlers + * can reach them across all three host pages. + */ + +(function () { + const MODAL_ID = 'event-detail-modal'; + + function _esc(s) { + if (s == null) return ''; + return String(s).replace(/&/g, '&amp;') + .replace(/</g, '&lt;') + .replace(/>/g, '&gt;') + .replace(/"/g, '&quot;'); + } + + function _fmt(v, digits = 4, suffix = '') { + if (v == null || (typeof v === 'number' && Number.isNaN(v))) return '—'; + if (typeof v === 'number') { + return v.toFixed(digits) + (suffix ? ` ${suffix}` : ''); + } + return _esc(v) + (suffix ? ` ${suffix}` : ''); + } + + function _ppvClass(v) { + if (v == null) return 'text-gray-400'; + if (v < 0.5) return 'text-green-600 dark:text-green-400'; + if (v < 2.0) return 'text-amber-600 dark:text-amber-400'; + return 'text-red-600 dark:text-red-400 font-semibold'; + } + + function _kvCard(label, value, options = {}) { + // Single key-value tile. `value` is pre-rendered HTML (or text). + const colorCls = options.colorCls || ''; + const valCls = `font-mono font-semibold ${colorCls}`; + return `
+
${_esc(label)}
+
${value}
+ ${options.sub ? `
${options.sub}
` : ''} +
`; + } + + function _deriveRecordType(filename, fallback) { + // SFM currently hardcodes record_type="Waveform" for every event. + // The actual type is encoded in the LAST character of the Blastware + // filename's extension (e.g. "O121LL5E.IS0H" → "H" → Histogram). + // We derive it client-side until SFM is fixed; if the suffix isn't + // a known code we fall back to whatever SFM reported. + if (!filename) return fallback || '—'; + const dotIdx = filename.lastIndexOf('.'); + if (dotIdx < 0 || dotIdx === filename.length - 1) return fallback || '—'; + const ext = filename.slice(dotIdx + 1); + const lastChar = ext.slice(-1).toUpperCase(); + const typeMap = { + 'H': 'Histogram', + 'W': 'Waveform', + 'M': 'Manual', + 'E': 'Event', + 'C': 'Combo', + }; + return typeMap[lastChar] || (fallback || '—'); + } + + function _sectionHeader(title, sub) { + return `

+ ${_esc(title)}${sub ? ` ${_esc(sub)}` : ''} +

`; + } + + // ── Section renderers ──────────────────────────────────────────── + + function _renderEventHeader(s) { + const ev = s.event || {}; + const bw = s.blastware || {}; + const ts = ev.timestamp ? ev.timestamp.replace('T', ' ').slice(0, 19) : '—'; + const recType = _deriveRecordType(bw.filename || ev.blastware_filename, ev.record_type); + return `
+
Serial ${_esc(ev.serial)}
+
Timestamp ${ts}
+
Record Type ${_esc(recType)}
+
Sample Rate ${ev.sample_rate ?? '—'} sps
+
Rec Time ${ev.rectime_seconds != null ? ev.rectime_seconds + ' s' : '—'}
+
Waveform Key ${_esc(ev.waveform_key || '—')}
+
`; + } + + function _renderUserNotes(s) { + // The "user notes" metadata the operator typed into the BW device. + // These are the strings the future metadata-driven parser will use. + // NOTE: SFM's sidecar JSON still names this block `project_info` — + // we render it as "User Notes" (the actual BW term) but read the + // field by its SFM-API name. Rename in SFM is a future cleanup. + const p = s.project_info || {}; + return `
+
Project ${_esc(p.project || '—')}
+
Client ${_esc(p.client || '—')}
+
Operator ${_esc(p.operator || '—')}
+
Sensor Location ${_esc(p.sensor_location || '—')}
+
+

+ Values are as typed into the seismograph at session start — not the terra-view project/location assignment. +

`; + } + + function _renderPeakValues(s) { + // Prefer bw_report.peaks for richer per-channel data; fall back to peak_values. + const bwPeaks = (s.bw_report && s.bw_report.peaks) || null; + const pv = s.peak_values || {}; + + const tran = bwPeaks ? bwPeaks.tran?.ppv_ips : pv.transverse; + const vert = bwPeaks ? bwPeaks.vert?.ppv_ips : pv.vertical; + const lng = bwPeaks ? bwPeaks.long?.ppv_ips : pv.longitudinal; + const pvs = bwPeaks ? bwPeaks.vector_sum?.ips : pv.vector_sum; + const pvsAt = bwPeaks ? bwPeaks.vector_sum?.time_s : null; + + return `
+ ${_kvCard('Transverse', `${_fmt(tran, 4)}`, { sub: 'in/s' })} + ${_kvCard('Vertical', `${_fmt(vert, 4)}`, { sub: 'in/s' })} + ${_kvCard('Longitudinal', `${_fmt(lng, 4)}`, { sub: 'in/s' })} + ${_kvCard('Peak Vector Sum', `${_fmt(pvs, 4)}`, { + sub: pvsAt != null ? `in/s @ t=${_fmt(pvsAt, 2)}s` : 'in/s', + })} +
`; + } + + function _renderMic(s) { + // Operators only care about dB(L); PSI tile was dropped 2026-05. + // We still render the row if any mic data is present so ZC freq / + // time-of-peak stay visible even when bw_report.mic is missing. + const mic = (s.bw_report && s.bw_report.mic) || null; + const pv = s.peak_values || {}; + + if (!mic && pv.mic_psi == null) return ''; + + const dbl = mic?.pspl_dbl; + const zcHz = mic?.zc_freq_hz; + const tPk = mic?.time_of_peak_s; + const wt = mic?.weighting; + + return `
+ ${_kvCard('Peak Mic dB(L)', _fmt(dbl, 1), { sub: wt || '' })} + ${_kvCard('ZC Frequency', _fmt(zcHz, 1, 'Hz'))} + ${_kvCard('Time of Peak', tPk != null ? _fmt(tPk, 2, 's') : '—')} +
`; + } + + function _sensorRow(label, ch) { + if (!ch) { + return `${_esc(label)} + —`; + } + const result = ch.result || '—'; + const resultCls = result === 'Passed' + ? 'text-green-600 dark:text-green-400' + : (result === 'Failed' ? 'text-red-600 dark:text-red-400 font-semibold' : 'text-gray-500'); + + // Geo channels have freq + ratio; mic has freq + amplitude. + const rightCol = (ch.amplitude_mv != null) + ? `${_fmt(ch.amplitude_mv, 1, 'mV')}` + : `${ch.ratio != null ? ch.ratio.toFixed(1) + ' ratio' : '—'}`; + + return ` + ${_esc(label)} + ${_fmt(ch.freq_hz, 1, 'Hz')} + ${rightCol} + ${_esc(result)} + `; + } + + function _renderSensorCheck(s) { + const sc = s.bw_report && s.bw_report.sensor_check; + if (!sc) return ''; + return ` + + + + + + + + + + ${_sensorRow('Transverse', sc.tran)} + ${_sensorRow('Vertical', sc.vert)} + ${_sensorRow('Longitudinal', sc.long)} + ${_sensorRow('Microphone', sc.mic)} + +
ChannelFrequencyAmplitude/RatioResult
`; + } + + function _renderDeviceMetadata(s) { + const bw = s.bw_report || {}; + const dev = bw.device || {}; + const rec = bw.recording || {}; + return `
+
Firmware ${_esc(bw.version || '—')}
+
Battery ${dev.battery_volts != null ? dev.battery_volts.toFixed(2) + ' V' : '—'}
+
Calibrated ${_esc(dev.calibration_date || '—')}${dev.calibration_by ? ' (' + _esc(dev.calibration_by) + ')' : ''}
+
Geo Range ${rec.geo_range_ips != null ? rec.geo_range_ips + ' in/s' : '—'}
+
Stop Mode ${_esc(rec.stop_mode || '—')}
+
Units ${_esc(rec.units || '—')}
+
`; + } + + function _renderFileInfo(s, eventId) { + const bw = s.blastware || {}; + const src = s.source || {}; + const sizeKb = bw.filesize ? (bw.filesize / 1024).toFixed(1) : null; + const canDownloadBinary = !!(bw.available && bw.filename && eventId); + + const downloadButtons = ` +
+ ${canDownloadBinary ? ` + + + + + Download Blastware file + (${_esc(bw.filename)}${sizeKb ? `, ${sizeKb} KB` : ''}) + + ` : ` + + + + + Blastware file unavailable + + `} + + + + + + Download sidecar JSON + +
+ + `; + + return `${downloadButtons} +
+
Blastware file ${_esc(bw.filename || '—')} ${sizeKb ? `(${sizeKb} KB)` : ''}
+
SHA-256 ${_esc(bw.sha256 || '—')}
+
Captured at ${_esc(src.captured_at ? src.captured_at.slice(0, 19).replace('T', ' ') : '—')}
+
Tool version ${_esc(src.tool_version || '—')}
+
`; + } + + // ── Public API ─────────────────────────────────────────────────── + + window.showEventDetail = async function (eventId) { + const modal = document.getElementById(MODAL_ID); + if (!modal) { + console.warn('event-modal: include event_detail_modal.html partial on this page.'); + return; + } + modal.classList.remove('hidden'); + document.getElementById(MODAL_ID + '-title').textContent = 'Event Detail'; + document.getElementById(MODAL_ID + '-content').innerHTML = ` +
+
+ Loading event detail… +
`; + + let s; + try { + const r = await fetch(`/api/sfm/db/events/${encodeURIComponent(eventId)}/sidecar`); + if (!r.ok) { + throw new Error('HTTP ' + r.status + ' fetching sidecar'); + } + s = await r.json(); + } catch (e) { + document.getElementById(MODAL_ID + '-content').innerHTML = ` +
+ Failed to load event detail: ${_esc(e.message)} +
`; + return; + } + + const ev = s.event || {}; + const ts = ev.timestamp ? ev.timestamp.replace('T', ' ').slice(0, 19) : ''; + document.getElementById(MODAL_ID + '-title').textContent = + `Event — ${ev.serial || '?'} @ ${ts}`; + + const hasReport = !!s.bw_report; + const reportNote = hasReport + ? '' + : `
+ No BW ASCII report paired with this event. + Older events forwarded before the watcher's _ASCII.TXT pairing fix landed lack this data. + PPV is still available from the binary event file. +
`; + + document.getElementById(MODAL_ID + '-content').innerHTML = ` + ${reportNote} + + ${_sectionHeader('Event')} + ${_renderEventHeader(s)} + + ${_sectionHeader('User Notes')} + ${_renderUserNotes(s)} + + ${_sectionHeader('Peak Particle Velocity')} + ${_renderPeakValues(s)} + + ${(s.bw_report && (s.bw_report.mic || s.peak_values?.mic_psi != null)) ? ` + ${_sectionHeader('Microphone')} + ${_renderMic(s)} + ` : ''} + + ${hasReport ? ` + ${_sectionHeader('Sensor Self-Check')} + ${_renderSensorCheck(s)} + + ${_sectionHeader('Device & Recording Metadata')} + ${_renderDeviceMetadata(s)} + ` : ''} + + ${_sectionHeader('Source File')} + ${_renderFileInfo(s, eventId)} + `; + }; + + window.closeEventDetailModal = function () { + const modal = document.getElementById(MODAL_ID); + if (modal) modal.classList.add('hidden'); + }; + + window.toggleEventJsonViewer = function () { + const viewer = document.getElementById('event-json-viewer'); + const label = document.getElementById('event-json-toggle-label'); + if (!viewer) return; + const isHidden = viewer.classList.toggle('hidden'); + if (label) label.textContent = isHidden ? 'View JSON' : 'Hide JSON'; + }; + + window.copyEventJson = function () { + const pre = document.getElementById('event-json-pre'); + const label = document.getElementById('event-json-copy-label'); + if (!pre) return; + navigator.clipboard.writeText(pre.textContent).then(() => { + if (label) { + label.textContent = 'Copied!'; + setTimeout(() => { label.textContent = 'Copy'; }, 1500); + } + }).catch(err => { + console.error('clipboard write failed', err); + if (label) { + label.textContent = 'Failed'; + setTimeout(() => { label.textContent = 'Copy'; }, 1500); + } + }); + }; + + // Close on Escape. 
+ document.addEventListener('keydown', function (e) { + if (e.key === 'Escape') window.closeEventDetailModal(); + }); +})(); diff --git a/docker-compose.yml b/docker-compose.yml index 5ea8549..a9a1ca2 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -10,9 +10,11 @@ services: - PYTHONUNBUFFERED=1 - ENVIRONMENT=production - SLMM_BASE_URL=http://host.docker.internal:8100 + - SFM_BASE_URL=http://sfm:8200 restart: unless-stopped depends_on: - slmm + - sfm extra_hosts: - "host.docker.internal:host-gateway" healthcheck: @@ -44,5 +46,25 @@ services: retries: 3 start_period: 10s + # --- SFM (Seismograph Field Module) --- + sfm: + build: + context: ../seismo-relay + dockerfile: Dockerfile + ports: + - "8200:8200" + volumes: + - ../seismo-relay/sfm/data:/app/sfm/data + - ../seismo-relay/bridges/captures:/app/bridges/captures + environment: + - PYTHONUNBUFFERED=1 + - PORT=8200 + restart: unless-stopped + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8200/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 10s volumes: data: diff --git a/docs/SYNOLOGY_DEPLOYMENT.md b/docs/SYNOLOGY_DEPLOYMENT.md new file mode 100644 index 0000000..0434647 --- /dev/null +++ b/docs/SYNOLOGY_DEPLOYMENT.md @@ -0,0 +1,436 @@ +# Synology NAS Deployment Guide + +This guide covers migrating the terra-view stack from a generic Linux host +(currently the home server at `10.0.0.44`) to an always-on Synology NAS in +the office, including data migration and the minimal external-access +networking layer. + +## Table of Contents + +1. [Architecture overview](#architecture-overview) +2. [Pre-requisites](#pre-requisites) +3. [Phase 1 — Pre-stage on the NAS (no downtime)](#phase-1--pre-stage-on-the-nas-no-downtime) +4. [Phase 2 — Data migration (~10 min window)](#phase-2--data-migration-10-min-window) +5. [Phase 3 — Repoint the watcher (download2-PC)](#phase-3--repoint-the-watcher-download2-pc) +6. 
[Phase 4 — External access for remote operators](#phase-4--external-access-for-remote-operators) +7. [Phase 5 — Decommission home server](#phase-5--decommission-home-server) +8. [Verification checklist](#verification-checklist) +9. [Rollback plan](#rollback-plan) +10. [Gotchas](#gotchas) + +--- + +## Architecture overview + +The terra-view stack is three containers: + +| Service | Port | What writes to it | Where it lives | +|---------------|-------|-----------------------------|----------------| +| terra-view | 8001 | Operators (UI), watchers (heartbeat) | Synology NAS | +| SFM | 8200 | Watchers (Blastware ACH forwards) | Synology NAS | +| SLMM | 8100 | terra-view (proxied), SLMs on LAN | Synology NAS | + +Everything that **writes** to the stack lives inside the office LAN: + +- **download2-PC** is the series3-watcher host. It has a static office IP and + POSTs to terra-view's heartbeat endpoint plus SFM's Blastware import + endpoint. Both flows are LAN-internal. +- **Sound level meters (NL-43)** sit on the office LAN; SLMM reaches them + via `network_mode: host`. + +The **only** thing that needs to cross the office firewall is operator UI +access from outside the office (laptops, phones, working from home). That +makes the external networking layer trivial — see Phase 4. + +--- + +## Pre-requisites + +On the Synology side: + +- **DSM 7.2+** with **Container Manager** installed (Package Center). + Older "Docker" package works too — same engine, different menu names. +- **x86_64 model** (Plus / Value / XS series). ARM j-series will build but + expect a slower first build. +- **Static LAN IP** reserved for the NAS in the office router's DHCP table. + Devices on the LAN must have a stable target. +- **SSH enabled** — Control Panel → Terminal & SNMP → Enable SSH service. +- **Shared folder** for the stack — e.g. `/volume1/docker/`. + +On the home server side: + +- Working terra-view / SFM / SLMM stack you want to migrate. +- `rsync` available (it almost certainly is). 
+ +You will also need: + +- An admin account on the Synology with sudo privileges. +- Network access between the home server and the NAS during the migration + window (or USB-drive shuttle if not). + +--- + +## Phase 1 — Pre-stage on the NAS (no downtime) + +Goal: get the NAS booting an empty stack so you can validate the build and +networking *before* touching any production data. + +### 1.1 Clone the repos + +SSH to the NAS as admin: + +```bash +sudo mkdir -p /volume1/docker +cd /volume1/docker +sudo git clone terra-view +sudo git clone slmm +sudo git clone seismo-relay +cd terra-view +sudo git checkout main # or whichever branch you ship from +``` + +### 1.2 Build images + +```bash +cd /volume1/docker/terra-view +sudo docker compose build +``` + +First build takes 5–15 min depending on model. + +### 1.3 Boot the empty stack + +```bash +sudo docker compose up -d +``` + +Hit `http://:1001` (dev profile) or `:8001` (prod profile) from +another office machine. You should see an empty fleet roster. If that +works, the NAS can run the stack — proven before any production data is +at risk. + +### 1.4 Stop the NAS stack again + +```bash +sudo docker compose stop +``` + +We're ready for the data migration. + +--- + +## Phase 2 — Data migration (~10 min window) + +The terra-view stack is stateful in three places. All three must be moved +together for consistency. 
+ +| Service | Data location (home server) | +|------------|----------------------------------------------| +| terra-view | `/home/serversdown/terra-view/data/` | +| SLMM | `/home/serversdown/slmm/data/` | +| SFM | `/home/serversdown/seismo-relay/data/` | + +### 2.1 Stop writes on both sides + +On the NAS: + +```bash +cd /volume1/docker/terra-view +sudo docker compose stop +``` + +On the home server: + +```bash +cd /home/serversdown/terra-view +docker compose stop terra-view slmm sfm +``` + +### 2.2 rsync the data dirs + +From the home server (or anywhere with SSH access to both): + +```bash +rsync -avh /home/serversdown/terra-view/data/ admin@:/volume1/docker/terra-view/data/ +rsync -avh /home/serversdown/slmm/data/ admin@:/volume1/docker/slmm/data/ +rsync -avh /home/serversdown/seismo-relay/data/ admin@:/volume1/docker/seismo-relay/data/ +``` + +### 2.3 Fix ownership on the NAS + +Synology admin is usually UID `1026`, GID `100`. Inside containers running +as root, this doesn't matter — but if you've configured `user:` in any +compose file it will. Safe default: + +```bash +ssh admin@ "sudo chown -R 1026:100 \ + /volume1/docker/terra-view/data \ + /volume1/docker/slmm/data \ + /volume1/docker/seismo-relay/data" +``` + +### 2.4 Run any pending migrations + +Some earlier feature work added migration scripts that need to run once +per database. After the rsync, before starting the stack, check what's +pending: + +```bash +ssh admin@ +cd /volume1/docker/terra-view +ls backend/migrate_*.py +``` + +Run each one inside the container (after starting it temporarily) or apply +them on the host with the same Python environment. Idempotent migrations +re-run safely. 
+ +### 2.5 Start the NAS stack + +```bash +ssh admin@ \ + "cd /volume1/docker/terra-view && sudo docker compose up -d" +``` + +### 2.6 Spot-check + +- Dashboard loads with real units +- `/sfm` page lists historical events +- A photo loads on a unit detail page +- SFM/HB badge mix on the active table matches what you saw on the home + server + +If anything's off, see [Rollback plan](#rollback-plan). + +--- + +## Phase 3 — Repoint the watcher (download2-PC) + +The download2-PC is the one client we have to reconfigure. It currently +POSTs to the home server. Two endpoints to change: + +1. **terra-view heartbeat URL** — + `http://:8001/api/series3/heartbeat` + → `http://:8001/api/series3/heartbeat` + +2. **SFM Blastware import URL** — + `http://:8200/db/import/blastware_file` + → `http://:8200/db/import/blastware_file` + + Or, if you want to keep SFM container-internal and not publish 8200 on + the LAN at all, point it through terra-view's existing SFM proxy: + → `http://:8001/api/sfm/db/import/blastware_file` + +Update the config, restart the watcher service, and confirm the next +heartbeat lands in the NAS DB (check the Recent Call-Ins card on the +dashboard). + +> **Tip:** keep the home server running in parallel for 1–2 days. If you +> forget to repoint something, it'll still flow into the old DB and you +> can resync. + +--- + +## Phase 4 — External access for remote operators + +Only the terra-view UI needs to be reachable from outside the office. Two +clean options — pick one. + +### Option A — Tailscale (recommended for small teams) + +Zero port forwards, zero certs, zero public DNS, zero reverse proxy. + +1. Install Tailscale from Synology Package Center, sign in. +2. Install Tailscale on each operator's laptop/phone, sign in to the same + tailnet. +3. Operators access `http://:8001` from anywhere. + +That's the whole setup. The office network has no external exposure at +all. 
+ +### Option B — Reverse proxy with Let's Encrypt + +If you want a `https://terraview.yourdomain.com` URL that any browser can +reach: + +#### B.1 Port forward on the office router + +``` +WAN 443 → :443 +WAN 80 → :80 (only needed for Let's Encrypt HTTP-01; + skip if you use DNS-01 challenge) +``` + +Do **not** forward 1001, 8001, 8100, or 8200. + +#### B.2 Public DNS + +- Free: Synology DDNS (Control Panel → External Access → DDNS) — gives + you `something.synology.me`. +- Better: your own domain with an A record → office WAN IP, or a CNAME → + Synology DDNS hostname (handles dynamic IPs automatically). + +#### B.3 Let's Encrypt certificate + +Control Panel → Security → Certificate → Add → "Get a certificate from +Let's Encrypt." DSM handles renewal. + +#### B.4 Synology reverse proxy + +Control Panel → Login Portal → Advanced → Reverse Proxy → Create: + +``` +Source: Hostname terraview.yourdomain.com + Protocol HTTPS + Port 443 +Destination: Hostname localhost + Protocol HTTP + Port 8001 +``` + +Under "Custom Header", add: + +| Header | Value | +|---------------------|------------------------------------| +| `X-Forwarded-For` | `$proxy_add_x_forwarded_for` | +| `X-Forwarded-Proto` | `$scheme` | +| `Host` | `$host` | + +Tick the WebSocket support checkbox. + +#### B.5 DSM firewall + +Control Panel → Security → Firewall → enable: + +- 443/TCP from `Anywhere` — allow +- 80/TCP from `Anywhere` — allow (cert renewal only) +- Everything else from WAN — deny +- All from LAN — allow + +Optional: geo-block to your country if your operators are domestic only. +Big reduction in scanning noise. + +--- + +## Phase 5 — Decommission home server + +After 1–2 weeks of stable NAS operation: + +1. Take a final `docker compose down` on the home server. +2. Archive `/home/serversdown/{terra-view,slmm,seismo-relay}/data/` to a + backup volume. +3. Free the home server hardware. 
+ +--- + +## Verification checklist + +After Phase 2 (data migration): + +- [ ] `http://:8001/` loads dashboard with real units +- [ ] Recent Alerts, Call-Ins (2 cols), Fleet Summary across the top +- [ ] SFM/HB badge mix on the active table looks sane +- [ ] `/sfm` page lists historical events (the same count as before) +- [ ] A unit detail page loads with photos rendering +- [ ] `/api/recent-event-callins` returns 200 with real data +- [ ] `/api/status-snapshot` returns 200, `sfm_reachable: true` + +After Phase 3 (watcher cutover): + +- [ ] Next heartbeat from download2-PC lands in NAS DB +- [ ] A new event arrives in `/sfm` page on the NAS within the next + Blastware ACH cycle +- [ ] No errors in `docker logs terra-view-terra-view-1` + +After Phase 4 (external access): + +- [ ] (Option A) Operator laptop on tailnet can reach + `http://:8001` +- [ ] (Option B) `https://terraview.yourdomain.com` resolves, cert is + valid, dashboard loads +- [ ] (Option B) Office DSM admin (5001) is **not** reachable from outside + +--- + +## Rollback plan + +The home server stays alive in parallel through Phases 2–3 as a safety +net. If anything goes wrong on the NAS: + +1. On the home server: + ```bash + cd /home/serversdown/terra-view + docker compose up -d + ``` +2. Point download2-PC back at the home server IP. +3. NAS data isn't lost — it's just sitting idle. Investigate, fix, retry. + +The "irreversible" point is when you decommission the home server in +Phase 5. Until then, you can always fall back. + +--- + +## Gotchas + +1. **Synology UID/GID quirks.** Synology admin is usually `1026:100`. + Containers running as root inside don't care, but if your compose + files set `user:`, mismatched UIDs cause SQLite "readonly database" + errors. Easiest fix: omit `user:` and let containers run as root. + +2. **`network_mode: host` for SLMM.** Required for LAN-direct comms with + sound level meters. 
On Synology this binds to the NAS's interface — + confirm nothing else on the NAS uses ports 8100 or 21 (FTP). + +3. **Auto-start on boot.** Container Manager → Project → Settings → + enable "Auto-restart". Otherwise a DSM update or NAS reboot drops the + stack. + +4. **`restart: unless-stopped` in compose.** Verify every service has it. + DSM occasionally restarts Docker during DSM updates — this flag + ensures everything comes back. + +5. **Hyper Backup.** Schedule a daily snapshot of + `/volume1/docker/terra-view/data/` to a USB drive or off-site. SQLite + + small photo dir = trivially small backups. The DB-Management UI's + built-in snapshots are an additional layer but not a replacement. + +6. **NAT loopback (Option B only).** If your office router doesn't + support hairpinning, machines INSIDE the office can't reach the NAS + by its public hostname — they have to use the LAN IP. Most modern + routers handle this; some ISP-provided ones don't. Test from a laptop + on the office Wi-Fi. + +7. **Let's Encrypt rate limits (Option B only).** 5 certificates per + exact set of hostnames per week. Don't fat-finger DNS or you'll be + locked out. Test with the staging endpoint first if unsure. + +8. **`host.docker.internal` resolution.** terra-view's + `SLMM_BASE_URL=http://host.docker.internal:8100` relies on the compose + `extra_hosts` host-gateway entry; `SFM_BASE_URL` uses the `sfm` service + name. Works on DSM 7.2+ in bridge mode. If you see "name not resolved" + errors, fall back to explicit container names with a custom network. + +9. **SFM stale rows.** The SFM SQLite has a few rows in `monitor_log` + and `ach_sessions` from earlier Python-ACH experiments. Harmless + to bring over — invisible to terra-view's UI under the + watcher-forward pipeline. + +--- + +## Suggested timeline + +For a low-risk migration: + +- **Week 1**: Phase 1. Get the NAS booting an empty stack. No production + touch. +- **Week 2, day 1**: Phase 2. Migrate data. 10-min window. Keep home + server alive in parallel. +- **Week 2, day 1**: Phase 3. 
Repoint download2-PC. Watch heartbeats + land on the NAS for the rest of the day. +- **Week 3**: Phase 4. Add Tailscale or reverse-proxy access for remote + operators. +- **Week 4–5**: Monitor. Confirm everything's stable. Then Phase 5 + (decommission home server). + +Splitting "make it work on LAN" from "expose it remotely" means you +debug one thing at a time. diff --git a/requirements.txt b/requirements.txt index 542f015..f5bd95d 100644 --- a/requirements.txt +++ b/requirements.txt @@ -8,3 +8,4 @@ aiofiles==23.2.1 Pillow==10.1.0 httpx==0.25.2 openpyxl==3.1.2 +rapidfuzz==3.10.1 diff --git a/templates/admin/metadata_backfill.html b/templates/admin/metadata_backfill.html new file mode 100644 index 0000000..84459cd --- /dev/null +++ b/templates/admin/metadata_backfill.html @@ -0,0 +1,692 @@ +{% extends "base.html" %} + +{% block title %}Metadata Backfill - Seismo Fleet Manager{% endblock %} + +{% block content %} + +
+ +
+ + +
+

Backfill from event metadata

+

+ Auto-create projects, locations, and unit assignments from operator-typed metadata on Blastware events. +

+
+ + +
+
+
+ + + +

Scan SFM events

+

+ Reads all events from SFM, clusters them by serial & time, matches the + operator-typed metadata against your existing projects, and proposes + Project / Location / UnitAssignment + chains to create. +

+ +
+
+ + +
+ + +
+ + + + + +{% include 'partials/event_detail_modal.html' %} + + + +{% endblock %} diff --git a/templates/admin/project_tidy.html b/templates/admin/project_tidy.html new file mode 100644 index 0000000..e314b17 --- /dev/null +++ b/templates/admin/project_tidy.html @@ -0,0 +1,267 @@ +{% extends "base.html" %} + +{% block title %}Project Tidy - Seismo Fleet Manager{% endblock %} + +{% block content %} + +
+ +
+ + +
+

Project Tidy

+

+ Find duplicate-looking projects via fuzzy name matching, then merge them with one click. + Useful after the metadata-backfill parser creates near-duplicates from operator name variations. +

+
+ + +
+
+
+ + +
+ +
+
+ + +
+
+ Click "Scan for duplicates" to find pairs. +
+
+ + + + + +{% endblock %} diff --git a/templates/admin_sfm.html b/templates/admin_sfm.html new file mode 100644 index 0000000..e73ea09 --- /dev/null +++ b/templates/admin_sfm.html @@ -0,0 +1,264 @@ +{% extends "base.html" %} + +{% block title %}SFM Admin - Seismo Fleet Manager{% endblock %} + +{% block content %} +
+
+ ← Back to Developer Tools +

SFM Admin

+

Diagnostics for the Seismograph Field Module (SFM) backend.

+
+ +
+ + +
+

Loading SFM status…

+
+ + +
+

Connection

+
+
+ terra-view → SFM URL +
{{ sfm_base_url }}
+
+
+ Last checked +
+
+
+ Version +
+
+
+
+ + +
+
+
Known Units
+
+
+
+
Total Events
+
+
+
+
Stale: monitor_log
+
+
+
+
Stale: ach_sessions
+
+
+
+ + +
+

Per-Unit Roll-up

+

All seismograph serials SFM has ever seen, with their last-event timestamp and total event count. Sourced from GET /db/units.

+
+ + + + + + + + + + + + + +
SerialLast SeenEventsMonitor (stale)Sessions (stale)
Loading…
+
+
+ + +
+

Recent Events — Forwarding Latency

+

The last 25 events SFM ingested, with the gap between the event's recorded timestamp and when SFM received the forward. Large latencies indicate the watcher is forwarding stale files (e.g. after a network outage).

+
+ + + + + + + + + + + + + +
RecordedSerialForwardedLatencyFile
Loading…
+
+
+ + +
+

Raw API Tester

+

Send a GET request to any SFM endpoint via the terra-view /api/sfm/* proxy. Path is relative to SFM root (no leading slash).

+
+ /api/sfm/ + + +
+ +
+ + +{% endblock %} diff --git a/templates/admin_slmm.html b/templates/admin_slmm.html new file mode 100644 index 0000000..c9056b1 --- /dev/null +++ b/templates/admin_slmm.html @@ -0,0 +1,138 @@ +{% extends "base.html" %} + +{% block title %}SLMM Admin - Seismo Fleet Manager{% endblock %} + +{% block content %} +
+
+ ← Back to Developer Tools +

SLMM Admin

+

Diagnostics for the Sound Level Meter Manager (SLMM) backend.

+
+ +
+ + +
+

Loading SLMM status…

+
+ + +
+

Connection

+
+
+ terra-view → SLMM URL +
{{ slmm_base_url }}
+
+
+ Last checked +
+
+
+ Version +
+
+
+
+ For per-device SLM control, see the Sound Level Meters dashboard. +
+
+ + +
+

Raw API Tester

+

Send a GET request to any SLMM endpoint via the terra-view /api/slmm/* proxy.

+
+ /api/slmm/ + + +
+ +
+ + +{% endblock %} diff --git a/templates/base.html b/templates/base.html index 36bd6c0..d4b9ccf 100644 --- a/templates/base.html +++ b/templates/base.html @@ -109,41 +109,24 @@ Dashboard - + {# Devices: single sidebar entry covering all device-type + pages. Lands on /roster (the unified all-devices view); + the tab strip on each underlying page lets the operator + drill into seismograph / SLM / modem specifics. + Active when on any /seismographs, /sound-level-meters, + /modems, /roster, /pair-devices, /unit/*, or /slm/* page. #} + {% set _is_devices = ( + request.url.path in ('/seismographs', '/sound-level-meters', '/modems', '/roster', '/pair-devices') + or request.url.path.startswith('/unit/') + or request.url.path.startswith('/slm/') + ) %} + Devices - - - - - Seismographs - - - - - - - Sound Level Meters - - - - - - - Modems - - - - - - - Pair Devices - - @@ -151,6 +134,34 @@ Projects + {# Events: fleet-wide event database (SFM). Cross-project + sortable/filterable event list. Day-to-day event browsing + for a specific location or unit lives on those detail + pages; this is the firehose for cross-cutting queries. #} + + + + + Events + + + {# Tools: operator workflow hub. Active when on /tools + itself or any of the workflow pages it links into + (project tidy, metadata backfill, pair devices). #} + {% set _is_tools = ( + request.url.path == '/tools' + or request.url.path == '/pair-devices' + or request.url.path == '/settings/developer/project-tidy' + or request.url.path == '/settings/developer/metadata-backfill' + ) %} + + + + + + Tools + + + diff --git a/templates/dashboard.html b/templates/dashboard.html index d1ec900..8b8e22c 100644 --- a/templates/dashboard.html +++ b/templates/dashboard.html @@ -29,7 +29,55 @@
- + +
+
+

Recent Alerts

+
+ + + + + + + +
+
+
+

Loading alerts...

+
+
+ + +
+ +

Fleet Summary

@@ -121,74 +169,6 @@
- -
-
-

Recent Alerts

-
- - - - - - - -
-
-
-

Loading alerts...

-
-
- - -
-
-

Recent Call-Ins

-
- - - - - - - -
-
-
-
-

Loading recent call-ins...

-
- -
-
- - -
-
-

Today's Schedule

-
- - - - - - - -
-
-
-

Loading scheduled actions...

-
-
-
@@ -269,6 +249,36 @@ + +
+
+
+ + + +

Today's Schedule

+ +
+ +
+ +
+
@@ -364,6 +374,17 @@ transform: rotate(-90deg); } } + +/* Today's Schedule — horizontal collapsible at all breakpoints. */ +#todays-actions-content.collapsed { + display: none; +} +#todays-actions-chevron.collapsed { + transform: rotate(-90deg); +} +#todays-actions-chevron { + transition: transform 0.2s ease-in-out; +} @@ -654,7 +675,8 @@ function toggleCard(cardName) { // Restore card states from localStorage on page load function restoreCardStates() { const cardStates = JSON.parse(localStorage.getItem('dashboardCardStates') || '{}'); - const cardNames = ['fleet-summary', 'recent-alerts', 'recent-callins', 'todays-actions', 'fleet-map', 'fleet-status']; + // Note: todays-actions has its own collapse handling (see toggleTodaysSchedule / onTodaysActionsSwap) + const cardNames = ['fleet-summary', 'recent-alerts', 'recent-callins', 'fleet-map', 'fleet-status']; cardNames.forEach(cardName => { const content = document.getElementById(`${cardName}-content`); @@ -839,89 +861,139 @@ async function loadRecentPhotos() { loadRecentPhotos(); setInterval(loadRecentPhotos, 30000); -// Load and display recent call-ins -let showingAllCallins = false; -const DEFAULT_CALLINS_DISPLAY = 5; - +// Load and display recent call-ins. +// Source: SFM events (forwarded by series3-watcher from Blastware ACH). +// Each event = one call-home. Heartbeat-derived endpoint /api/recent-callins +// is being phased out but kept as a backup. 
async function loadRecentCallins() { + const callinsList = document.getElementById('recent-callins-list'); try { - const response = await fetch('/api/recent-callins?hours=6'); + const response = await fetch('/api/recent-event-callins?limit=10'); if (!response.ok) { throw new Error('Failed to load recent call-ins'); } const data = await response.json(); - const callinsList = document.getElementById('recent-callins-list'); - const showAllButton = document.getElementById('show-all-callins'); - if (data.call_ins && data.call_ins.length > 0) { - // Determine how many to show - const displayCount = showingAllCallins ? data.call_ins.length : Math.min(DEFAULT_CALLINS_DISPLAY, data.call_ins.length); - const callinsToDisplay = data.call_ins.slice(0, displayCount); - - // Build HTML for call-ins list - let html = ''; - callinsToDisplay.forEach(callin => { - // Status color - const statusColor = callin.status === 'OK' ? 'green' : callin.status === 'Pending' ? 'yellow' : 'red'; - const statusClass = callin.status === 'OK' ? 'bg-green-500' : callin.status === 'Pending' ? 'bg-yellow-500' : 'bg-red-500'; - - // Build location/note line - let subtitle = ''; - if (callin.location) { - subtitle = callin.location; - } else if (callin.note) { - subtitle = callin.note; - } - - html += ` -
-
- -
- - ${callin.unit_id} - - ${subtitle ? `

${subtitle}

` : ''} -
-
- ${callin.time_ago} -
`; - }); - - callinsList.innerHTML = html; - - // Show/hide the "Show all" button - if (data.call_ins.length > DEFAULT_CALLINS_DISPLAY) { - showAllButton.classList.remove('hidden'); - showAllButton.textContent = showingAllCallins - ? `Show fewer (${DEFAULT_CALLINS_DISPLAY})` - : `Show all (${data.call_ins.length})`; - } else { - showAllButton.classList.add('hidden'); - } - } else { - callinsList.innerHTML = '

No units have called in within the past 6 hours

'; - showAllButton.classList.add('hidden'); + if (!data.call_ins || data.call_ins.length === 0) { + callinsList.innerHTML = '

No recent event call-ins from SFM

'; + return; } + + // Two-column dense grid on lg+, single column below. + let html = '
'; + data.call_ins.forEach(c => { + const isFalse = c.false_trigger; + const pvs = c.peak_vector_sum; + const pvsStr = (pvs !== null && pvs !== undefined) + ? Number(pvs).toFixed(3) + ' in/s' + : '—'; + + // Subtitle: prefer sensor_location, fallback to project. + const subtitle = c.sensor_location || c.project || ''; + + // Status dot: amber for false trigger, green for real event, + // gray if unit not in roster. + const dotClass = !c.in_roster + ? 'bg-gray-400' + : (isFalse ? 'bg-amber-400' : 'bg-green-500'); + + // Format event timestamp short (e.g. "05-13 05:00"). + let tsShort = ''; + if (c.event_timestamp) { + const ts = c.event_timestamp.replace('T', ' '); + // "2026-05-13 05:00:13" → "05-13 05:00" + tsShort = ts.length >= 16 ? ts.slice(5, 16) : ts; + } + + const unitLink = c.in_roster + ? `${c.unit_id}` + : `${c.unit_id}`; + + html += ` +
+
+ +
+
+ ${unitLink} + ${isFalse ? 'false' : ''} + ${pvsStr} +
+ ${subtitle ? `

${subtitle}

` : ''} +
+
+
+ ${c.time_ago} + ${tsShort ? `${tsShort}` : ''} +
+
`; + }); + html += '
'; + callinsList.innerHTML = html; } catch (error) { console.error('Error loading recent call-ins:', error); - document.getElementById('recent-callins-list').innerHTML = '

Failed to load recent call-ins

'; + callinsList.innerHTML = '

Failed to load recent call-ins

'; } } -// Toggle show all/show fewer -document.addEventListener('DOMContentLoaded', function() { - const showAllButton = document.getElementById('show-all-callins'); - showAllButton.addEventListener('click', function() { - showingAllCallins = !showingAllCallins; - loadRecentCallins(); - }); -}); - -// Load recent call-ins on page load and refresh every 30 seconds +// Load recent call-ins on page load and refresh every 30 seconds. loadRecentCallins(); setInterval(loadRecentCallins, 30000); + +// ===== Today's Schedule horizontal card ===== +function toggleTodaysSchedule() { + const content = document.getElementById('todays-actions-content'); + const chevron = document.getElementById('todays-actions-chevron'); + if (!content || !chevron) return; + const isCollapsed = content.classList.toggle('collapsed'); + chevron.classList.toggle('collapsed', isCollapsed); + // Remember the user's explicit choice so we don't fight them on the next + // 30s htmx refresh. + localStorage.setItem('todaysScheduleUserToggled', '1'); + localStorage.setItem('todaysScheduleCollapsed', isCollapsed ? '1' : '0'); +} + +function onTodaysActionsSwap(el) { + // Read pending/total counts from the rendered partial to drive + // auto-expand + the header badge. + const badge = document.getElementById('todays-actions-badge'); + const content = document.getElementById('todays-actions-content'); + const chevron = document.getElementById('todays-actions-chevron'); + if (!content || !chevron) return; + + // Count yellow status indicators in the rendered partial as a proxy for + // "pending action present today". 
+ const pendingDots = el.querySelectorAll('.bg-yellow-400').length; + const pendingTimes = el.querySelectorAll('.text-yellow-600').length; + const hasPending = pendingDots > 0 || pendingTimes > 0; + + if (badge) { + if (hasPending) { + const n = Math.max(pendingDots, pendingTimes); + badge.textContent = `${n} pending today`; + badge.classList.remove('hidden'); + } else { + badge.classList.add('hidden'); + } + } + + // Auto-expand only if the user has never manually toggled the card AND + // there's something pending. The manual choice is stored in localStorage, + // so it sticks across page reloads and browser sessions. + const userToggled = localStorage.getItem('todaysScheduleUserToggled') === '1'; + if (!userToggled && hasPending) { + content.classList.remove('collapsed'); + chevron.classList.remove('collapsed'); + } else if (!userToggled && !hasPending) { + content.classList.add('collapsed'); + chevron.classList.add('collapsed'); + } else if (userToggled) { + const stored = localStorage.getItem('todaysScheduleCollapsed') === '1'; + content.classList.toggle('collapsed', stored); + chevron.classList.toggle('collapsed', stored); + } +} {% endblock %} diff --git a/templates/modems.html b/templates/modems.html index 46dce54..a815e9e 100644 --- a/templates/modems.html +++ b/templates/modems.html @@ -3,6 +3,7 @@ {% block title %}Field Modems - Terra-View{% endblock %} {% block content %} +{% include "partials/fleet_tab_strip.html" %}

diff --git a/templates/partials/active_table.html b/templates/partials/active_table.html index e72a4d5..085766c 100644 --- a/templates/partials/active_table.html +++ b/templates/partials/active_table.html @@ -36,7 +36,14 @@

-
+
+ {% if unit.last_seen_source == 'sfm' %} + SFM + {% elif unit.last_seen_source == 'heartbeat' %} + HB + {% endif %} {{ unit.age }} diff --git a/templates/partials/event_detail_modal.html b/templates/partials/event_detail_modal.html new file mode 100644 index 0000000..68dc574 --- /dev/null +++ b/templates/partials/event_detail_modal.html @@ -0,0 +1,25 @@ +{# Shared event detail modal. + +Include this partial on any page that wants to call showEventDetail(eventId) +from event-modal.js. The partial provides only the modal shell — the +actual content is rendered by JS into #event-detail-modal-content. + +Usage: + {% include 'partials/event_detail_modal.html' %} + +#} + diff --git a/templates/partials/fleet_tab_strip.html b/templates/partials/fleet_tab_strip.html new file mode 100644 index 0000000..21ab219 --- /dev/null +++ b/templates/partials/fleet_tab_strip.html @@ -0,0 +1,54 @@ +{# Fleet tab strip. + +Shared header for every page under the "Fleet" sidebar section. Each +underlying page (/roster, /seismographs, /sound-level-meters, /modems) +keeps its own custom layout — this partial just provides the tab +navigation across the top so they feel like one logical area. + +The active tab is detected from request.url.path so deep links work. + +Usage at top of any Fleet-section template: + {% include 'partials/fleet_tab_strip.html' %} +#} +{% set _path = request.url.path %} + diff --git a/templates/partials/projects/project_header.html b/templates/partials/projects/project_header.html index f3e3f2e..6aeff79 100644 --- a/templates/partials/projects/project_header.html +++ b/templates/partials/projects/project_header.html @@ -75,10 +75,266 @@ Generate Combined Report {% endif %} +
+ + + + +