feat(sfm): Phase 5a — bulk-backfill projects/locations/assignments from event metadata

Operator clicks one button.  Parser reads SFM's events table (operator-typed
project / client / sensor_location strings), clusters by serial + time +
metadata, fuzzy-matches against existing projects, and proposes
Project / MonitoringLocation / UnitAssignment chains to create.
Auto-applies high-confidence non-conflicting clusters in bulk; queues
medium/low confidence for individual review.

Verified against real data: 10,052 events → 59 clusters → 37 high-
confidence + 14 medium + 8 low.  Test-applied one cluster end-to-end;
Project + Module + Location + Assignment + UnitHistory + Decision rows
all created correctly, and Phase 2's attribution walk picked up the
events automatically on the new location's detail page.

Pipeline (backend/services/metadata_backfill.py, ~700 lines):
  1. Pull all SFM events via /db/events per serial.
  2. Pre-filter: drop events already covered by an existing UnitAssignment
     window (Phase 2 handles those automatically).
  3. Time-cluster what's left per serial: a 7-day gap between events
     starts a new cluster.
  4. Metadata-split each time-cluster on persistent metadata transitions
     (≥ 2 consecutive events) so a single typo doesn't fork the cluster.
  5. Match against existing graph (rapidfuzz.WRatio multi-signal scoring,
     normalisation that handles abbreviations / reorders / separator
     variations).  Thresholds: 0.95 exact, 0.80 fuzzy; the shorter input
     must be at least 5 chars, to guard against false positives on single
     common words.
  6. Score confidence (high/medium/low) using event count, span,
     blank-meta, conflict, ambiguity rules.
  7. Detect conflicts: overlap with existing UnitAssignment at a different
     location for the same serial → blocking.  Operator must reconcile.
  8. Apply: ensure auto_imported ProjectType exists, ensure
     vibration_monitoring ProjectModule on the project, write
     Project / MonitoringLocation / UnitAssignment / UnitHistory all in
     one transaction.
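
Steps 3 and 4 above can be sketched roughly as follows.  This is a minimal
illustration, not the code in backend/services/metadata_backfill.py: the
Event shape, the function names, and the choice to treat a gap strictly
longer than 7 days as a break are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

GAP = timedelta(days=7)  # assumption: a gap longer than this starts a new cluster
MIN_RUN = 2              # a metadata change must persist for this many events


@dataclass
class Event:
    """Hypothetical minimal event shape, for illustration only."""
    serial: str
    ts: datetime
    project: str


def time_cluster(events):
    """Group events per serial, breaking wherever two consecutive events
    of the same serial are more than GAP apart."""
    clusters = []
    open_cluster = {}  # serial -> cluster currently being filled
    for ev in sorted(events, key=lambda e: (e.serial, e.ts)):
        current = open_cluster.get(ev.serial)
        if current is None or ev.ts - current[-1].ts > GAP:
            current = []
            open_cluster[ev.serial] = current
            clusters.append(current)
        current.append(ev)
    return clusters


def split_on_persistent_change(cluster):
    """Split one time-cluster where the project string changes and the new
    value holds for at least MIN_RUN consecutive events.  A one-off value
    (for example a typo) stays in the surrounding part."""
    parts = [[cluster[0]]]
    for i, ev in enumerate(cluster[1:], start=1):
        part_value = parts[-1][0].project  # value that opened the current part
        if ev.project != part_value:
            run = cluster[i:i + MIN_RUN]
            if len(run) == MIN_RUN and all(e.project == ev.project for e in run):
                parts.append([])
        parts[-1].append(ev)
    return parts
```

With this sketch, a single mistyped project name in the middle of a run
stays in the same part, while two or more consecutive events carrying a
new name open a new part.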

Migration (backend/migrate_add_metadata_backfill.py): adds
unit_assignments.source column (default 'manual') and
metadata_backfill_decisions table.  Idempotent, non-destructive.

API (backend/routers/metadata_backfill.py):
  GET  /api/admin/metadata_backfill/scan          — clusters + suggestions
  POST /api/admin/metadata_backfill/apply         — bulk apply by cluster_ids
                                                     w/ optional per-cluster
                                                     project/location overrides
  POST /api/admin/metadata_backfill/skip          — mark skipped (persistent)
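
As an illustration of the apply request, the helper below builds a body in
the shape the /apply endpoint documents (cluster_ids plus optional
per-cluster overrides).  The helper name and its validation rules are
assumptions for this sketch; trimming and dropping blank fields mirrors
what the wizard's override gathering does on the client side.

```python
import json

ALLOWED_FIELDS = ("project_name", "location_name")


def build_apply_body(cluster_ids, overrides=None):
    """Hypothetical helper: build the JSON body for
    POST /api/admin/metadata_backfill/apply."""
    if not cluster_ids:
        raise ValueError("cluster_ids must be a non-empty list")
    cleaned = {}
    for cid, fields in (overrides or {}).items():
        if cid not in cluster_ids:
            raise ValueError(f"override given for unselected cluster: {cid}")
        kept = {}
        for name in ALLOWED_FIELDS:
            value = (fields.get(name) or "").strip()
            if value:  # blank fields are dropped, not sent
                kept[name] = value
        if kept:
            cleaned[cid] = kept
    return {"cluster_ids": list(cluster_ids), "overrides": cleaned}


if __name__ == "__main__":
    body = build_apply_body(
        ["abc", "def"],
        {"abc": {"project_name": " Main St Bridge ", "location_name": ""}},
    )
    print(json.dumps(body, indent=2))
```

The resulting dict can be sent as JSON with any HTTP client; the host and
port depend on the deployment and are not specified here.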

UI (templates/admin/metadata_backfill.html, accessible at
/settings/developer/metadata-backfill via the Developer tab of Settings):
  - One-button "Run scan" entry.
  - Summary KPI tiles (scanned / already attributed / pending / conflicts).
  - "Apply all high-confidence" bulk button at the top — primary path.
  - Per-cluster cards below with Apply / Skip / Preview event actions.
  - Blank-meta clusters get inline input fields for operator-typed project +
    location names before applying.
  - Blocking-conflict clusters render with the conflicting assignment
    information and a disabled Apply button.
  - Live progress toast during apply.
  - Reuses the Phase 1+2+4 event-detail modal for "Preview event" — operator
    can sanity-check the BW report data against the cluster's sample event.

Dependencies: rapidfuzz==3.10.1 added to requirements.txt.  It ships
pre-built C wheels, so it adds only ~5s to the docker build.
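
For context on what the dependency is used for (step 5 of the pipeline),
here is a rough sketch of threshold-based name matching.  To keep it
standard-library only, it swaps in difflib.SequenceMatcher as a stand-in
for rapidfuzz.WRatio, so scores will differ from the real scorer.  The
normalisation shown covers only case, separators and token order (not
abbreviations), and treating short inputs as matchable only when identical
is an assumption about how the 5-char guard behaves.

```python
import re
from difflib import SequenceMatcher

EXACT_THRESHOLD = 0.95
FUZZY_THRESHOLD = 0.80
MIN_CHARS = 5  # guard: below this, fuzzy scoring is skipped


def normalise(name):
    """Lower-case, split on common separators, and sort the tokens so that
    reordered names compare equal."""
    tokens = re.split(r"[\s\-_/.,]+", name.lower().strip())
    return " ".join(sorted(t for t in tokens if t))


def classify_match(raw, existing):
    """Return 'exact', 'fuzzy' or 'none' for an operator-typed name against
    an existing project name (hypothetical helper)."""
    a, b = normalise(raw), normalise(existing)
    if min(len(a), len(b)) < MIN_CHARS:
        # Assumption: short names only match when identical after normalising.
        return "exact" if a and a == b else "none"
    score = SequenceMatcher(None, a, b).ratio()  # 0.0 .. 1.0
    if score >= EXACT_THRESHOLD:
        return "exact"
    if score >= FUZZY_THRESHOLD:
        return "fuzzy"
    return "none"
```

Sorting tokens before scoring is one simple way to make "Main St. Bridge"
and "bridge main-st" compare as the same name.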

Phase 5b (deferred to next session): swap-detection daily background job,
notification inbox for auto-applied swaps, recently-applied audit view,
"Tidy" page for renaming/merging auto-created projects.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:54:57 +00:00
parent 21844b4d65
commit 42de06f441
8 changed files with 1828 additions and 0 deletions
+11
@@ -115,6 +115,10 @@ app.include_router(scheduler.router)
from backend.routers import report_templates
app.include_router(report_templates.router)

# Metadata-backfill admin router (Phase 5a)
from backend.routers import metadata_backfill
app.include_router(metadata_backfill.router)

# Alerts router
from backend.routers import alerts
app.include_router(alerts.router)
@@ -240,6 +244,13 @@ async def sfm_page(request: Request):
    return templates.TemplateResponse("sfm.html", {"request": request})


@app.get("/settings/developer/metadata-backfill", response_class=HTMLResponse)
async def metadata_backfill_wizard_page(request: Request):
    """Wizard for auto-creating projects/locations/assignments from
    operator-typed BW event metadata (Phase 5a)."""
    return templates.TemplateResponse("admin/metadata_backfill.html", {"request": request})


@app.get("/modems", response_class=HTMLResponse)
async def modems_page(request: Request):
    """Field modems management dashboard"""
+94
@@ -0,0 +1,94 @@
"""
Migration: add metadata-backfill support.

Adds:
  1. `unit_assignments.source` column (TEXT, default 'manual').
     Lets us audit which assignments were created by the metadata-backfill
     parser vs by a human, and bulk-undo parser actions if needed.
  2. `metadata_backfill_decisions` table.  Tracks operator decisions per
     cluster_id so the wizard remembers what's been skipped, what's
     been applied, and what's pending across re-scans.

Idempotent — safe to re-run.
Non-destructive — adds only.

Run with:
    docker exec terra-view-terra-view-1 python3 /app/backend/migrate_add_metadata_backfill.py
"""
import os
import sqlite3

DB_PATH = "./data/seismo_fleet.db"


def migrate_database():
    if not os.path.exists(DB_PATH):
        print(f"Database not found at {DB_PATH}")
        return

    print(f"Migrating database: {DB_PATH}")
    conn = sqlite3.connect(DB_PATH)
    cur = conn.cursor()

    # ── 1. unit_assignments.source column ──────────────────────────────────
    cur.execute("PRAGMA table_info(unit_assignments)")
    cols = {row[1] for row in cur.fetchall()}
    if "source" not in cols:
        print("Adding unit_assignments.source column (default 'manual') ...")
        cur.execute(
            "ALTER TABLE unit_assignments ADD COLUMN source TEXT DEFAULT 'manual'"
        )
        # Backfill: any existing row gets source='manual'
        cur.execute("UPDATE unit_assignments SET source='manual' WHERE source IS NULL")
        conn.commit()
        print(" Done.")
    else:
        print("unit_assignments.source already exists — skipping")

    # ── 2. metadata_backfill_decisions table ──────────────────────────────
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='metadata_backfill_decisions'"
    )
    if cur.fetchone() is None:
        print("Creating metadata_backfill_decisions table ...")
        cur.execute("""
            CREATE TABLE metadata_backfill_decisions (
                cluster_id            TEXT PRIMARY KEY,   -- deterministic hash
                status                TEXT NOT NULL,      -- pending | applied | skipped | conflict
                confidence            TEXT NOT NULL,      -- high | medium | low (at time of decision)
                decided_at            TEXT,               -- when applied/skipped
                decided_by            TEXT,               -- 'background' | 'operator' | 'auto-high'
                applied_assignment_id TEXT,               -- FK to unit_assignments (if applied)
                notes                 TEXT,
                first_seen_at         TEXT NOT NULL,
                last_seen_at          TEXT NOT NULL,
                serial                TEXT NOT NULL,
                project_raw           TEXT,
                location_raw          TEXT,
                first_event_ts        TEXT,
                last_event_ts         TEXT,
                event_count           INTEGER NOT NULL DEFAULT 0
            )
        """)
        cur.execute(
            "CREATE INDEX idx_mbd_status ON metadata_backfill_decisions(status)"
        )
        cur.execute(
            "CREATE INDEX idx_mbd_last_seen ON metadata_backfill_decisions(last_seen_at)"
        )
        cur.execute(
            "CREATE INDEX idx_mbd_serial ON metadata_backfill_decisions(serial)"
        )
        conn.commit()
        print(" Done.")
    else:
        print("metadata_backfill_decisions table already exists — skipping")

    conn.close()
    print("\nMigration complete.")


if __name__ == "__main__":
    migrate_database()
+39
@@ -259,9 +259,48 @@ class UnitAssignment(Base):
    device_type = Column(String, nullable=False)  # "slm" | "seismograph"
    project_id = Column(String, nullable=False, index=True)  # FK to Project.id

    # Provenance: how was this assignment created?  Used for auditing,
    # bulk-undo of parser actions, and the Phase 4 deployment timeline.
    #   "manual"                  — operator created via UI
    #   "metadata_backfill"       — auto-created by the metadata parser
    #                               from operator-typed BW event metadata
    #                               (bulk backfill workflow)
    #   "metadata_backfill_swap"  — auto-created by swap-detection
    #                               background job
    source = Column(String, nullable=False, default="manual")

    created_at = Column(DateTime, default=datetime.utcnow)


class MetadataBackfillDecision(Base):
    """
    Per-cluster decisions tracked by the metadata-backfill parser.

    `cluster_id` is the deterministic SHA1 hash of
    (serial, first_event_date, last_event_date), so the same cluster
    produces the same id across re-scans.  The decisions table lets the
    parser remember "I already applied this" or "operator skipped this"
    across scan invocations.
    """
    __tablename__ = "metadata_backfill_decisions"

    cluster_id = Column(String, primary_key=True)
    status = Column(String, nullable=False)  # pending | applied | skipped | conflict
    confidence = Column(String, nullable=False)  # high | medium | low
    decided_at = Column(DateTime, nullable=True)
    decided_by = Column(String, nullable=True)  # background | operator | auto-high
    applied_assignment_id = Column(String, nullable=True)  # FK to unit_assignments.id
    notes = Column(Text, nullable=True)
    first_seen_at = Column(DateTime, nullable=False, default=datetime.utcnow)
    last_seen_at = Column(DateTime, nullable=False, default=datetime.utcnow)
    serial = Column(String, nullable=False, index=True)
    project_raw = Column(String, nullable=True)
    location_raw = Column(String, nullable=True)
    first_event_ts = Column(DateTime, nullable=True)
    last_event_ts = Column(DateTime, nullable=True)
    event_count = Column(Integer, nullable=False, default=0)


class ScheduledAction(Base):
    """
    Scheduled actions: automation for recording start/stop/download.
+226
@@ -0,0 +1,226 @@
"""
Metadata-backfill admin router.

Endpoints under /api/admin/metadata_backfill:
  GET  /scan   — run the scan; return clusters + suggestions (JSON).
                 Cached 5 minutes so the wizard doesn't re-scan on
                 every page render.
  POST /apply  — apply a list of cluster_ids; body specifies which to
                 accept and optional per-cluster overrides.
  POST /skip   — mark cluster_ids as skipped (won't reappear).
"""
from __future__ import annotations
import os
import time
from typing import Optional
from fastapi import APIRouter, Depends, HTTPException, Request
from fastapi.responses import JSONResponse
from sqlalchemy.orm import Session
from backend.database import get_db
from backend.services import metadata_backfill as svc
router = APIRouter(prefix="/api/admin/metadata_backfill", tags=["metadata-backfill"])
SFM_BASE_URL = os.getenv("SFM_BASE_URL", "http://localhost:8200")
# In-process scan cache. Trades memory for not re-hammering SFM on every
# wizard render. TTL: 5 minutes. Singleton per-process; fine for a
# single-worker uvicorn dev setup. For prod multi-worker we'd want to put
# this in the DB or Redis; deferred.
_SCAN_CACHE: dict = {"at": 0.0, "result": None}
_SCAN_CACHE_TTL_SECONDS = 300.0
def _serialise_suggestion(s: svc.Suggestion) -> dict:
    c = s.cluster
    return {
        "cluster_id": c.cluster_id,
        "serial": c.serial,
        "first_event_ts": c.first_event_ts.isoformat(),
        "last_event_ts": c.last_event_ts.isoformat(),
        "event_count": c.event_count,
        "sample_event_id": c.sample_event_id,
        "project_raw": c.project_raw,
        "location_raw": c.location_raw,
        "client_raw": c.client_raw,
        "operator_raw": c.operator_raw,
        "is_blank_meta": c.is_blank_meta,
        "metadata_consistency": c.metadata_consistency,
        "project_match": s.project_match,
        "project_existing_id": s.project_existing_id,
        "project_existing_name": s.project_existing_name,
        "project_match_score": s.project_match_score,
        "project_suggested_name": s.project_suggested_name,
        "location_match": s.location_match,
        "location_existing_id": s.location_existing_id,
        "location_existing_name": s.location_existing_name,
        "location_match_score": s.location_match_score,
        "location_suggested_name": s.location_suggested_name,
        "proposed_assigned_at": s.proposed_assigned_at.isoformat(),
        "proposed_assigned_until": s.proposed_assigned_until.isoformat() if s.proposed_assigned_until else None,
        "confidence": s.confidence,
        "blocking_conflict": s.blocking_conflict,
        "conflicts": [
            {
                "existing_assignment_id": cf.existing_assignment_id,
                "other_location_id": cf.other_location_id,
                "other_location_name": cf.other_location_name,
                "other_project_id": cf.other_project_id,
                "other_project_name": cf.other_project_name,
            }
            for cf in s.conflicts
        ],
    }


@router.get("/scan")
async def scan(
    force: bool = False,
    db: Session = Depends(get_db),
):
    """Run a scan and return clusters + suggestions.

    Set force=true to bypass the 5-minute cache.
    """
    now = time.time()
    if not force and _SCAN_CACHE["result"] is not None \
            and (now - _SCAN_CACHE["at"]) < _SCAN_CACHE_TTL_SECONDS:
        return _SCAN_CACHE["result"]

    result = await svc.scan_clusters_and_build_suggestions(db, SFM_BASE_URL)

    # Group suggestions for the wizard UI.
    by_confidence = {"high": [], "medium": [], "low": []}
    blocking_conflict_count = 0
    for s in result.suggestions:
        by_confidence[s.confidence].append(_serialise_suggestion(s))
        if s.blocking_conflict:
            blocking_conflict_count += 1

    payload = {
        "scanned_event_count": result.scanned_event_count,
        "cluster_count": result.cluster_count,
        "already_attributed": result.already_attributed,
        "skipped_orphans": result.skipped_orphans,
        "pending_count": len(result.suggestions),
        "blocking_conflict_count": blocking_conflict_count,
        "by_confidence": {
            "high": by_confidence["high"],
            "medium": by_confidence["medium"],
            "low": by_confidence["low"],
        },
        "scanned_at": now,
    }
    _SCAN_CACHE["result"] = payload
    _SCAN_CACHE["at"] = now
    return payload


@router.post("/apply")
async def apply(
    request: Request,
    db: Session = Depends(get_db),
):
    """Apply a list of clusters.

    Body:
        {
          "cluster_ids": ["abc...", "def..."],
          "overrides": { "abc...": { "project_name": "...", "location_name": "..." } }
        }

    To accept ALL non-conflict suggestions in one shot, the UI sends every
    pending cluster_id with no overrides.
    """
    try:
        body = await request.json()
    except Exception:
        raise HTTPException(status_code=400, detail="Invalid JSON body")
    # Valid JSON can still be a list or scalar; reject it before calling .get().
    if not isinstance(body, dict):
        raise HTTPException(status_code=400, detail="JSON body must be an object")

    cluster_ids = body.get("cluster_ids") or []
    overrides = body.get("overrides") or {}
    if not isinstance(cluster_ids, list) or not cluster_ids:
        raise HTTPException(status_code=400, detail="cluster_ids must be a non-empty list")
    if not isinstance(overrides, dict):
        raise HTTPException(status_code=400, detail="overrides must be an object")

    # Re-scan to get current suggestions.  We don't trust the cached scan
    # blindly — the operator might have manually created projects in
    # between scan and apply.
    scan_result = await svc.scan_clusters_and_build_suggestions(db, SFM_BASE_URL)
    suggestions_by_id = {s.cluster.cluster_id: s for s in scan_result.suggestions}

    selected: list[svc.Suggestion] = []
    not_found: list[str] = []
    for cid in cluster_ids:
        s = suggestions_by_id.get(cid)
        if s is None:
            not_found.append(cid)
            continue

        # Apply overrides.  A blank override is ignored, so it cannot detach
        # an existing match without supplying a replacement name.
        ov = overrides.get(cid) or {}
        project_name = (ov.get("project_name") or "").strip()
        if project_name:
            s.project_suggested_name = project_name
            # Operator typed a custom name — force create-new behaviour
            # so we don't accidentally attach to a different existing
            # project by exact-match.
            if s.project_match != "create_new":
                s.project_existing_id = None
                s.project_match = "create_new"
        location_name = (ov.get("location_name") or "").strip()
        if location_name:
            s.location_suggested_name = location_name
            if s.location_match != "create_new":
                s.location_existing_id = None
                s.location_match = "create_new"
        selected.append(s)

    apply_result = svc.apply_suggestions(db, selected, decided_by="operator")

    # Invalidate the scan cache so the next /scan picks up the new state.
    _SCAN_CACHE["at"] = 0.0
    _SCAN_CACHE["result"] = None

    return {
        "applied": apply_result.applied,
        "failed": [{"cluster_id": cid, "reason": r} for cid, r in apply_result.failed],
        "not_found": not_found,
        "project_ids_created": apply_result.project_ids_created,
        "location_ids_created": apply_result.location_ids_created,
        "assignment_ids_created": apply_result.assignment_ids_created,
    }


@router.post("/skip")
async def skip(
    request: Request,
    db: Session = Depends(get_db),
):
    """Mark cluster_ids as skipped — they won't reappear in future scans."""
    try:
        body = await request.json()
    except Exception:
        raise HTTPException(status_code=400, detail="Invalid JSON body")
    # Valid JSON can still be a list or scalar; reject it before calling .get().
    if not isinstance(body, dict):
        raise HTTPException(status_code=400, detail="JSON body must be an object")

    cluster_ids = body.get("cluster_ids") or []
    if not isinstance(cluster_ids, list):
        raise HTTPException(status_code=400, detail="cluster_ids must be a list")

    n = svc.skip_clusters(db, cluster_ids, decided_by="operator")
    _SCAN_CACHE["at"] = 0.0
    _SCAN_CACHE["result"] = None
    return {"skipped": n}
File diff suppressed because it is too large
+1
@@ -8,3 +8,4 @@ aiofiles==23.2.1
Pillow==10.1.0
httpx==0.25.2
openpyxl==3.1.2
rapidfuzz==3.10.1
+405
@@ -0,0 +1,405 @@
{% extends "base.html" %}
{% block title %}Metadata Backfill - Seismo Fleet Manager{% endblock %}
{% block content %}
<!-- Breadcrumb -->
<div class="mb-6">
<nav class="flex items-center space-x-2 text-sm">
<a href="/settings" class="text-seismo-orange hover:text-seismo-navy flex items-center">
<svg class="w-4 h-4 mr-1" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7"></path>
</svg>
Settings
</a>
<svg class="w-4 h-4 text-gray-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7"></path>
</svg>
<span class="text-gray-900 dark:text-white font-medium">Metadata Backfill</span>
</nav>
</div>
<!-- Header -->
<div class="mb-6">
<h1 class="text-3xl font-bold text-gray-900 dark:text-white">Backfill from event metadata</h1>
<p class="text-gray-600 dark:text-gray-400 mt-1">
Auto-create projects, locations, and unit assignments from operator-typed metadata on Blastware events.
</p>
</div>
<!-- Summary card (populated after scan) -->
<div id="summary-card" class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-6 mb-6">
<div id="summary-initial">
<div class="text-center py-8">
<svg class="w-16 h-16 mx-auto mb-4 text-gray-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z"></path>
</svg>
<h2 class="text-xl font-semibold text-gray-900 dark:text-white mb-2">Scan SFM events</h2>
<p class="text-gray-500 dark:text-gray-400 text-sm mb-6 max-w-xl mx-auto">
Reads all events from SFM, clusters them by serial &amp; time, matches the
operator-typed metadata against your existing projects, and proposes
<strong>Project</strong> / <strong>Location</strong> / <strong>UnitAssignment</strong>
chains to create.
</p>
<button onclick="runScan(false)"
id="initial-scan-btn"
class="px-6 py-3 bg-seismo-orange hover:bg-seismo-navy text-white rounded-lg font-medium transition-colors">
↻ Run scan
</button>
</div>
</div>
<div id="summary-results" class="hidden">
<div class="flex items-start justify-between mb-4 flex-wrap gap-3">
<div>
<h2 class="text-xl font-semibold text-gray-900 dark:text-white">Scan summary</h2>
<p id="summary-scanned-at" class="text-xs text-gray-500 dark:text-gray-400 mt-1"></p>
</div>
<button onclick="runScan(true)"
class="px-3 py-1.5 text-sm border border-gray-300 dark:border-gray-600 hover:bg-gray-50 dark:hover:bg-gray-700 text-gray-700 dark:text-gray-300 rounded-lg">
↻ Re-scan
</button>
</div>
<!-- KPI tiles -->
<div class="grid grid-cols-2 md:grid-cols-4 gap-3 mb-4">
<div class="bg-gray-50 dark:bg-slate-900/50 rounded-lg p-3 flex flex-col">
<span class="text-xs text-gray-500 dark:text-gray-400 uppercase tracking-wider">Events scanned</span>
<span id="kpi-scanned" class="text-2xl font-bold text-gray-900 dark:text-white mt-1"></span>
</div>
<div class="bg-gray-50 dark:bg-slate-900/50 rounded-lg p-3 flex flex-col">
<span class="text-xs text-gray-500 dark:text-gray-400 uppercase tracking-wider">Already attributed</span>
<span id="kpi-already" class="text-2xl font-bold text-gray-900 dark:text-white mt-1"></span>
<span class="text-xs text-gray-500 dark:text-gray-400 mt-1">inside existing assignments</span>
</div>
<div class="bg-gray-50 dark:bg-slate-900/50 rounded-lg p-3 flex flex-col">
<span class="text-xs text-gray-500 dark:text-gray-400 uppercase tracking-wider">Pending review</span>
<span id="kpi-pending" class="text-2xl font-bold text-gray-900 dark:text-white mt-1"></span>
<span class="text-xs text-gray-500 dark:text-gray-400 mt-1">clusters to attribute</span>
</div>
<div class="bg-gray-50 dark:bg-slate-900/50 rounded-lg p-3 flex flex-col">
<span class="text-xs text-gray-500 dark:text-gray-400 uppercase tracking-wider">Conflicts</span>
<span id="kpi-conflicts" class="text-2xl font-bold text-gray-900 dark:text-white mt-1"></span>
<span class="text-xs text-gray-500 dark:text-gray-400 mt-1">need manual reconciliation</span>
</div>
</div>
<!-- One-click bulk apply -->
<div id="bulk-apply-card" class="bg-orange-50 dark:bg-orange-900/20 border border-orange-200 dark:border-orange-800 rounded-lg p-4 mb-4 hidden">
<div class="flex items-start gap-3">
<svg class="w-6 h-6 text-seismo-orange shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 10V3L4 14h7v7l9-11h-7z"></path>
</svg>
<div class="flex-1">
<h3 class="font-semibold text-gray-900 dark:text-white mb-1">
Bulk-apply <span id="bulk-applicable-count">0</span> high-confidence cluster(s)
</h3>
<p class="text-sm text-gray-700 dark:text-gray-300 mb-3">
Apply every cluster scored <strong>high confidence</strong> with no blocking conflicts.
Will create <span id="bulk-stats" class="font-medium"></span>.
Medium and low confidence clusters remain in the list below for individual review.
</p>
<button onclick="applyBulkHighConfidence()"
class="px-5 py-2 bg-seismo-orange hover:bg-seismo-navy text-white rounded-lg font-medium transition-colors">
Apply all high-confidence
</button>
</div>
</div>
</div>
<p class="text-xs text-gray-500 dark:text-gray-400 italic">
Each cluster below shows the operator-typed metadata, what would be created or matched, and the proposed
assignment date window. Click <em>Apply</em> to attribute that cluster or <em>Skip</em> to ignore it (won't reappear).
Clusters with blank metadata show inline fields for naming the project and location before applying.
</p>
</div>
</div>
<!-- Cluster list -->
<div id="cluster-list" class="space-y-3"></div>
<!-- Apply progress toast -->
<div id="apply-toast" class="hidden fixed bottom-6 right-6 bg-white dark:bg-slate-800 rounded-xl shadow-2xl border border-gray-200 dark:border-gray-700 p-4 z-50 max-w-md">
<div class="flex items-center gap-3">
<div id="toast-icon" class="shrink-0">
<div class="animate-spin rounded-full h-6 w-6 border-b-2 border-seismo-orange"></div>
</div>
<div class="flex-1">
<p id="toast-message" class="text-sm font-medium text-gray-900 dark:text-white">Applying…</p>
<p id="toast-sub" class="text-xs text-gray-500 dark:text-gray-400 mt-0.5"></p>
</div>
</div>
</div>
<!-- Shared event-detail modal (Preview event button uses it) -->
{% include 'partials/event_detail_modal.html' %}
<script src="/static/event-modal.js"></script>
<script>
let _scanData = null;
function _esc(s) {
if (s == null) return '';
return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}
function _fmtDate(iso) {
if (!iso) return '—';
return iso.slice(0, 10);
}
function _fmtDateTime(iso) {
if (!iso) return '—';
return iso.slice(0, 19).replace('T', ' ');
}
function _confidenceBadge(c) {
const map = {
high: { cls: 'bg-green-100 text-green-800 dark:bg-green-900/30 dark:text-green-300', icon: '🟢', label: 'high' },
medium: { cls: 'bg-amber-100 text-amber-800 dark:bg-amber-900/30 dark:text-amber-300', icon: '🟡', label: 'medium' },
low: { cls: 'bg-red-100 text-red-800 dark:bg-red-900/30 dark:text-red-300', icon: '🔴', label: 'low' },
};
const e = map[c] || map.low;
return `<span class="px-2 py-0.5 rounded text-xs font-medium ${e.cls}">${e.icon} ${e.label}</span>`;
}
function _matchPill(match, score, suggestedName, existingName) {
if (match === 'exact') {
return `<span class="font-medium text-green-700 dark:text-green-400">✓ Matches existing: <em>${_esc(existingName || suggestedName)}</em></span>`;
}
if (match === 'fuzzy') {
return `<span class="font-medium text-amber-700 dark:text-amber-400">≈ Fuzzy match (${(score*100).toFixed(0)}%): <em>${_esc(existingName)}</em></span>
<span class="text-xs text-gray-500 dark:text-gray-400 ml-1">(your value: "${_esc(suggestedName)}")</span>`;
}
if (match === 'ambiguous') {
return `<span class="font-medium text-yellow-700 dark:text-yellow-400">? Ambiguous — multiple matches</span>`;
}
return `<span class="font-medium text-seismo-orange">+ Create new: <em>${_esc(suggestedName)}</em></span>`;
}
async function runScan(force) {
const initial = document.getElementById('summary-initial');
const results = document.getElementById('summary-results');
const list = document.getElementById('cluster-list');
initial.classList.add('hidden');
results.classList.remove('hidden');
list.innerHTML = '<div class="text-center py-12 text-gray-500 dark:text-gray-400"><div class="animate-spin rounded-full h-8 w-8 border-b-2 border-seismo-orange mx-auto mb-3"></div>Scanning events…</div>';
document.getElementById('kpi-scanned').textContent = '…';
document.getElementById('kpi-already').textContent = '…';
document.getElementById('kpi-pending').textContent = '…';
document.getElementById('kpi-conflicts').textContent = '…';
document.getElementById('bulk-apply-card').classList.add('hidden');
try {
const r = await fetch('/api/admin/metadata_backfill/scan' + (force ? '?force=true' : ''));
if (!r.ok) throw new Error('HTTP ' + r.status);
_scanData = await r.json();
} catch (e) {
list.innerHTML = `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-6 text-center text-red-500">Scan failed: ${_esc(e.message)}</div>`;
return;
}
document.getElementById('kpi-scanned').textContent = _scanData.scanned_event_count.toLocaleString();
document.getElementById('kpi-already').textContent = _scanData.already_attributed.toLocaleString();
document.getElementById('kpi-pending').textContent = _scanData.pending_count.toLocaleString();
document.getElementById('kpi-conflicts').textContent = _scanData.blocking_conflict_count.toLocaleString();
document.getElementById('summary-scanned-at').textContent =
'Scanned ' + new Date(_scanData.scanned_at * 1000).toLocaleString();
// Configure bulk-apply card.
const highApplicable = _scanData.by_confidence.high.filter(s => !s.blocking_conflict);
const newProjects = new Set(), newLocations = new Set();
for (const s of highApplicable) {
if (s.project_match === 'create_new') newProjects.add(s.project_suggested_name.toLowerCase());
if (s.location_match === 'create_new') newLocations.add(s.location_suggested_name.toLowerCase());
}
if (highApplicable.length > 0) {
document.getElementById('bulk-apply-card').classList.remove('hidden');
document.getElementById('bulk-applicable-count').textContent = highApplicable.length;
const parts = [];
if (newProjects.size > 0) parts.push(`${newProjects.size} project${newProjects.size === 1 ? '' : 's'}`);
if (newLocations.size > 0) parts.push(`${newLocations.size} location${newLocations.size === 1 ? '' : 's'}`);
parts.push(`${highApplicable.length} assignment${highApplicable.length === 1 ? '' : 's'}`);
document.getElementById('bulk-stats').textContent = parts.join(' · ');
}
renderClusterList();
}
function renderClusterList() {
const list = document.getElementById('cluster-list');
const all = [
..._scanData.by_confidence.high,
..._scanData.by_confidence.medium,
..._scanData.by_confidence.low,
];
if (all.length === 0) {
list.innerHTML = `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-8 text-center">
<svg class="w-16 h-16 mx-auto mb-4 text-green-500" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z"></path>
</svg>
<h3 class="text-lg font-semibold text-gray-900 dark:text-white mb-1">✅ All caught up</h3>
<p class="text-gray-500 dark:text-gray-400">Every event in SFM is either attributed to an existing assignment or has been skipped.</p>
</div>`;
return;
}
list.innerHTML = all.map(_renderCluster).join('');
}
function _renderCluster(s) {
const spanDays = (new Date(s.last_event_ts) - new Date(s.first_event_ts)) / 86400000;
const consistencyNote = s.metadata_consistency < 1.0
? `<span class="ml-2 text-xs text-amber-600 dark:text-amber-400" title="Some events in this cluster have slightly different metadata — possibly a typo or mid-stream change.">⚠ ${(s.metadata_consistency*100).toFixed(0)}% consistent</span>`
: '';
const blockingBanner = s.blocking_conflict
? `<div class="bg-red-50 dark:bg-red-900/20 border border-red-200 dark:border-red-800 rounded-lg p-3 mt-3 text-sm text-red-800 dark:text-red-300">
<strong>⚠ Blocking conflict.</strong>
${s.conflicts.map(c => `Unit ${_esc(s.serial)} is already assigned to <em>${_esc(c.other_project_name)} / ${_esc(c.other_location_name)}</em> during this window.`).join(' ')}
Resolve manually before this cluster can be applied.
</div>`
: '';
const orphanInputs = s.is_blank_meta
? `<div class="bg-amber-50 dark:bg-amber-900/20 border border-amber-200 dark:border-amber-800 rounded-lg p-3 mt-3">
<p class="text-sm text-amber-800 dark:text-amber-300 mb-2"><strong>⚠ Blank metadata.</strong> Operator didn't type project / location for these events. Fill in manually:</p>
<div class="grid grid-cols-2 gap-2">
<input type="text" placeholder="Project name" data-cluster-id="${_esc(s.cluster_id)}" data-field="project_name"
class="px-2 py-1 text-sm border border-amber-300 dark:border-amber-700 rounded bg-white dark:bg-slate-700 text-gray-900 dark:text-white">
<input type="text" placeholder="Location name" data-cluster-id="${_esc(s.cluster_id)}" data-field="location_name"
class="px-2 py-1 text-sm border border-amber-300 dark:border-amber-700 rounded bg-white dark:bg-slate-700 text-gray-900 dark:text-white">
</div>
</div>`
: '';
return `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-4" data-cluster-id="${_esc(s.cluster_id)}">
<div class="flex items-start justify-between gap-3 mb-3 flex-wrap">
<div class="flex-1 min-w-0">
<div class="flex items-center gap-2 mb-1 flex-wrap">
${_confidenceBadge(s.confidence)}
<a href="/unit/${_esc(s.serial)}" class="font-mono font-semibold text-seismo-orange hover:text-seismo-navy">${_esc(s.serial)}</a>
<span class="text-sm text-gray-600 dark:text-gray-400">${_fmtDate(s.first_event_ts)} → ${_fmtDate(s.last_event_ts)}</span>
<span class="text-xs text-gray-500 dark:text-gray-400">(${s.event_count} event${s.event_count === 1 ? '' : 's'}, ${spanDays.toFixed(0)}d span)</span>
${consistencyNote}
</div>
<div class="text-sm text-gray-700 dark:text-gray-300 mt-2 space-y-1">
<div><span class="text-gray-500 dark:text-gray-400 w-24 inline-block">Project:</span> ${_matchPill(s.project_match, s.project_match_score, s.project_suggested_name, s.project_existing_name)}</div>
<div><span class="text-gray-500 dark:text-gray-400 w-24 inline-block">Location:</span> ${_matchPill(s.location_match, s.location_match_score, s.location_suggested_name, s.location_existing_name)}</div>
<div><span class="text-gray-500 dark:text-gray-400 w-24 inline-block">Assignment:</span> ${_fmtDateTime(s.proposed_assigned_at)} → ${s.proposed_assigned_until ? _fmtDateTime(s.proposed_assigned_until) : '<span class="text-green-700 dark:text-green-400 font-medium">present (active)</span>'}</div>
${s.client_raw ? `<div><span class="text-gray-500 dark:text-gray-400 w-24 inline-block">Client:</span> <em>${_esc(s.client_raw)}</em></div>` : ''}
</div>
${blockingBanner}
${orphanInputs}
</div>
<div class="flex flex-col gap-2 shrink-0">
<button onclick="showEventDetail('${_esc(s.sample_event_id)}')"
class="px-3 py-1.5 text-xs border border-gray-300 dark:border-gray-600 hover:bg-gray-50 dark:hover:bg-gray-700 text-gray-700 dark:text-gray-300 rounded">
Preview event
</button>
${s.blocking_conflict
? `<button disabled class="px-3 py-1.5 text-xs bg-gray-100 dark:bg-gray-800 text-gray-400 rounded cursor-not-allowed">Apply</button>`
: `<button onclick="applyOne('${_esc(s.cluster_id)}')"
class="px-3 py-1.5 text-xs bg-seismo-orange hover:bg-seismo-navy text-white rounded font-medium">
Apply
</button>`}
<button onclick="skipOne('${_esc(s.cluster_id)}')"
class="px-3 py-1.5 text-xs border border-gray-300 dark:border-gray-600 hover:bg-gray-50 dark:hover:bg-gray-700 text-gray-700 dark:text-gray-300 rounded">
Skip
</button>
</div>
</div>
</div>`;
}
function _gatherOverrides(clusterIds) {
const overrides = {};
for (const cid of clusterIds) {
const inputs = document.querySelectorAll(`input[data-cluster-id="${CSS.escape(cid)}"]`); // escape so quotes or backslashes in an id cannot break the selector
if (inputs.length === 0) continue;
const o = {};
inputs.forEach(i => {
const v = i.value.trim();
if (v) o[i.dataset.field] = v;
});
if (Object.keys(o).length > 0) overrides[cid] = o;
}
return overrides;
}
function _showToast(message, sub, kind) {
const toast = document.getElementById('apply-toast');
const icon = document.getElementById('toast-icon');
document.getElementById('toast-message').textContent = message;
document.getElementById('toast-sub').textContent = sub || '';
if (kind === 'success') {
icon.innerHTML = '<svg class="w-6 h-6 text-green-500" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M5 13l4 4L19 7"></path></svg>';
} else if (kind === 'error') {
icon.innerHTML = '<svg class="w-6 h-6 text-red-500" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 9v2m0 4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z"></path></svg>';
} else {
icon.innerHTML = '<div class="animate-spin rounded-full h-6 w-6 border-b-2 border-seismo-orange"></div>';
}
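clearTimeout(_toastTimer); // cancel a pending hide left over from an earlier toast
toast.classList.remove('hidden');
}
var _toastTimer = null; // handle of the pending auto-hide; var so the hoisted binding is safe wherever _showToast is first called
function _hideToast(after) {
clearTimeout(_toastTimer);
_toastTimer = setTimeout(() => document.getElementById('apply-toast').classList.add('hidden'), after || 3000);
}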
async function _apply(clusterIds) {
if (clusterIds.length === 0) return;
_showToast(`Applying ${clusterIds.length} cluster${clusterIds.length === 1 ? '' : 's'}…`);
try {
const r = await fetch('/api/admin/metadata_backfill/apply', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({
cluster_ids: clusterIds,
overrides: _gatherOverrides(clusterIds),
}),
});
if (!r.ok) throw new Error('HTTP ' + r.status);
const d = await r.json();
const sub = `${d.applied} applied · ${d.project_ids_created.length} new project(s) · ${d.location_ids_created.length} new location(s)` + (d.failed.length ? ` · ${d.failed.length} failed` : '');
_showToast(`${d.applied} cluster${d.applied === 1 ? '' : 's'} applied`, sub, d.failed.length ? 'error' : 'success');
_hideToast(4000);
await runScan(true); // refresh
} catch (e) {
_showToast('Apply failed', e.message, 'error');
_hideToast(5000);
}
}
async function applyOne(clusterId) { return _apply([clusterId]); }
async function applyBulkHighConfidence() {
const high = _scanData.by_confidence.high.filter(s => !s.blocking_conflict);
const ids = high.map(s => s.cluster_id);
if (ids.length === 0) return;
if (!confirm(`Apply ${ids.length} high-confidence cluster${ids.length === 1 ? '' : 's'}? This will create projects, locations, and assignments without further prompting.`)) return;
return _apply(ids);
}
async function skipOne(clusterId) {
if (!confirm('Skip this cluster? It will not reappear in future scans.')) return;
_showToast('Skipping…');
try {
const r = await fetch('/api/admin/metadata_backfill/skip', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({ cluster_ids: [clusterId] }),
});
if (!r.ok) throw new Error('HTTP ' + r.status);
_showToast('Skipped', '', 'success');
_hideToast(2000);
await runScan(true);
} catch (e) {
_showToast('Skip failed', e.message, 'error');
_hideToast(4000);
}
}
</script>
{% endblock %}
@@ -560,6 +560,20 @@
Open
</a>
</div>
<!-- Metadata Backfill (Phase 5a) -->
<div class="flex items-center justify-between p-4 bg-gray-50 dark:bg-slate-700 rounded-lg">
<div>
<div class="font-medium text-gray-900 dark:text-white">Backfill from event metadata</div>
<div class="text-sm text-gray-500 dark:text-gray-400 mt-0.5">
Auto-create projects, locations, and unit assignments from the operator-typed metadata baked into SFM events. Skip the manual entry.
</div>
</div>
<a href="/settings/developer/metadata-backfill"
class="ml-6 px-4 py-2 bg-seismo-orange hover:bg-orange-600 text-white text-sm font-medium rounded-lg transition-colors whitespace-nowrap">
Open
</a>
</div>
</div>
</div>
</div>