terra-view

Author	SHA1	Message	Date
serversdown	ad55d4ca09	fix(backfill): location matching over-confident on boilerplate-shared names rapidfuzz.fuzz.WRatio inflates scores when two strings share substring tokens, even when the shared tokens are common boilerplate. For project names this is desirable (catches typos like '1-80' vs 'I-80') but for location names it produces obvious false positives: 'Area 2 - Brookville Dam - Loc 2 East' vs 'Area 1 - Loc 1 - 87 Jenks' → WRatio 85.5 (above 0.80 fuzzy threshold) These share only 'area' + 'loc' + a digit but score 85%+ because WRatio weights partial-substring overlap heavily. Operator reported the backfill tool suggesting completely unrelated locations as 86% matches. Fix: introduce `location_similarity()` — token_set_ratio + multi-digit mismatch penalty. Used for location matching everywhere; WRatio stays as the scorer for project names where its leniency is correct. The multi-digit penalty (-0.30) triggers when both strings contain 2+-digit numbers and none overlap. Catches the harder "same project, different address identifier" case: 'Area 1 - Loc 2 - 68 Jenks' vs 'Area 1 - Loc 1 - 87 Jenks' token_set_ratio = 0.91 (would still match without penalty) multi-digit tokens {68} and {87} disjoint → -0.30 → 0.61 (rejected) Single-digit tokens ('Loc 1', 'Area 2') are excluded from the penalty because they're often coincidentally shared. 
Updated: - backend/services/metadata_backfill.py: new location_similarity() function; _find_best_match() gains a `kind` parameter that selects scorer; cluster-match call site passes kind='location' - backend/routers/metadata_backfill.py: locations_search endpoint (the typeahead dropdown's data source) uses location_similarity instead of similarity for the same reason Verified all six test cases land correctly: - user-reported false positive: 0.85 → 0.59 (rejected) - '87 Jenks' vs '68 Jenks': 0.90 → 0.61 (rejected) - NRL-01 vs NRL-02: 0.83 → 0.53 (rejected) - 'Loc 2 - 735 Bunola' vs 'Loc 2 735 Bunola Rd': 1.00 (still matches) - punctuation-only difference: 1.00 (still matches) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 04:10:48 +00:00
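The multi-digit penalty described in this commit can be sketched in isolation. This is a minimal illustration, not the code in backend/services/metadata_backfill.py: the function names are hypothetical, the base score is passed in rather than computed (the commit uses rapidfuzz's token_set_ratio for that), and clamping at 0.0 is an assumption.

```python
import re

# Assumption: a "multi-digit token" is any run of two or more digits.
MULTI_DIGIT = re.compile(r"\d{2,}")
PENALTY = 0.30  # value stated in the commit message


def multi_digit_tokens(text):
    """Return the set of 2+-digit numbers found in text."""
    return set(MULTI_DIGIT.findall(text))


def apply_multi_digit_penalty(a, b, base_score):
    """Subtract PENALTY when both strings carry multi-digit numbers and none overlap.

    base_score is a 0..1 similarity computed elsewhere (hypothetical input here).
    """
    nums_a = multi_digit_tokens(a)
    nums_b = multi_digit_tokens(b)
    if nums_a and nums_b and nums_a.isdisjoint(nums_b):
        return max(0.0, base_score - PENALTY)
    return base_score


# The commit's own example: {68} and {87} are disjoint, so 0.91 drops to about 0.61.
score = apply_multi_digit_penalty(
    "Area 1 - Loc 2 - 68 Jenks", "Area 1 - Loc 1 - 87 Jenks", 0.91
)
```

Single-digit tokens such as the 2 in 'Loc 2' never enter either set, which is how they stay out of the penalty.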
serversdown	d5a0163852	feat(locations): soft-remove monitoring locations without destroying history When a client drops a location from scope mid-project (e.g. the office half of a museum+office monitoring job), operators couldn't previously mark it as no-longer-active without either deleting it (which would orphan historical events) or leaving it in the active list looking deployable. Now there's a proper middle ground. Data model - MonitoringLocation gets two new nullable columns: - removed_at — NULL means active; set means soft-removed - removal_reason — optional operator note Migration: backend/migrate_add_location_removed.py (idempotent) Endpoints - POST /api/projects/{p}/locations/{l}/remove Body: { effective_date?: ISO-datetime, reason?: str } Side effects (cascade): 1. Closes active UnitAssignment rows at this location (assigned_until = effective_date, status = "completed") 2. Cancels pending ScheduledActions at this location 3. Marks location.removed_at = effective_date Returns counts of assignments closed + actions cancelled. - POST /api/projects/{p}/locations/{l}/restore Clears removed_at + removal_reason. Does NOT auto-reopen assignments — operator creates new ones if resuming monitoring. Active-surface filters - locations-json defaults to active-only; pass include_removed=true for historical / reporting views. Schedule modal dropdowns now exclude removed locations automatically. - Metadata-backfill fuzzy matcher excludes removed locations from proposed targets (don't want backfill creating new assignments at decommissioned locations). - Vibration-summary per_location rollup includes removed locations (so historical event totals stay accurate) but tags each with removed_at so the UI can show a badge. 
UI - Project detail page's Monitoring Locations section now splits into: Active locations (full card with Assign / Edit / Remove / Delete) Removed locations (collapsed <details>, greyed cards, Restore button, shows removal date + reason) - New per-card "Remove" button → opens confirmation modal explaining the cascade, with optional effective-date (defaults to now, backdateable) and reason fields. - Unit detail's SFM Events attribution cell shows a small "removed" badge next to historical attributions whose location is no longer active. Same pattern in vibration_summary's top-locations list. - Soft-removal indicator surfaced through the events_for_unit attribution payload as location_removed_at. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 22:22:40 +00:00
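The remove/restore cascade described in this commit can be sketched with in-memory objects. Everything below is illustrative: the dataclasses stand in for the real ORM models, and the "active", "pending" and "cancelled" status strings and the returned key names are assumptions (the commit only names status = "completed" for closed assignments).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Location:  # hypothetical stand-in for MonitoringLocation
    id: int
    removed_at: Optional[datetime] = None
    removal_reason: Optional[str] = None


@dataclass
class Assignment:  # hypothetical stand-in for UnitAssignment
    location_id: int
    status: str = "active"
    assigned_until: Optional[datetime] = None


@dataclass
class Action:  # hypothetical stand-in for ScheduledAction
    location_id: int
    status: str = "pending"


def remove_location(location, assignments, actions, effective_date=None, reason=None):
    """Soft-remove a location and return counts of the cascaded changes."""
    when = effective_date or datetime.now(timezone.utc)  # defaults to now
    closed = 0
    for a in assignments:
        if a.location_id == location.id and a.status == "active":
            a.assigned_until = when
            a.status = "completed"
            closed += 1
    cancelled = 0
    for act in actions:
        if act.location_id == location.id and act.status == "pending":
            act.status = "cancelled"
            cancelled += 1
    location.removed_at = when
    location.removal_reason = reason
    return {"assignments_closed": closed, "actions_cancelled": cancelled}


def restore_location(location):
    """Clear the soft-removal markers; closed assignments are deliberately left closed."""
    location.removed_at = None
    location.removal_reason = None
```

Because removal only sets removed_at, historical rows keep pointing at the location, which is what lets the reporting views keep their totals.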
serversdown	d3b5a3fd26	feat(sfm): inline typeahead override of project + location on each cluster card Operator no longer has to accept the parser's suggested project / location verbatim. Each cluster card now has editable typeahead inputs that search existing projects (and existing locations within the chosen project), with a "Create new: <typed>" fallback always available. Solves the I-80-North-Fork case: of the 20+ cluster variants ("I-80-North Fork Bridges-I80 E. Abutment", "I-80- North Fork Bridges-543 Plank Rd", etc.), operator types "I-80" in the Project input, picks the existing project from the dropdown, and the cluster attaches to it. Repeat for the other variants. No need to pre-create the canonical project — though pre-creation still works fine if you'd rather. Backend (backend/routers/metadata_backfill.py): - GET /api/admin/metadata_backfill/projects_search?q=&limit= Returns existing projects matching by case-insensitive substring OR rapidfuzz WRatio score >= 0.50. Substring matches sort to the top (treated as exact for ordering). Includes location_count and project_number/client_name in each result for disambiguation. Always emits a "Create new: <q>" suggestion alongside the matches. - GET /api/admin/metadata_backfill/locations_search?project_id=&q=&limit= Same shape, scoped to a single project's vibration locations. 
- POST /api/admin/metadata_backfill/apply now accepts four override keys per cluster (was previously two): project_id → attach to existing Project (operator picked from typeahead) project_name → create new with this name (operator typed a custom name; existing project_name behaviour) location_id → attach to existing MonitoringLocation; validated against the chosen project_id so a stale location FK can't sneak in location_name → create new location with this name Frontend (templates/admin/metadata_backfill.html): - Each non-blank-meta cluster card now has two editable typeahead inputs (Project + Location) pre-populated with the parser's suggested values. Old static "Project: + Create new: X" / "≈ Fuzzy match" pills replaced with compact hint lines under the inputs showing what the current value will do. - Typeahead dropdown opens on focus, debounced 150ms on type. Shows matched existing entities with score badges (exact / NN%) plus a "Create new: <typed>" option at the bottom. Click-to-pick fills the text input and writes the entity id into a hidden field. - Picking a new project clears the location id (forces re-pick under the new project, avoids cross-project location FKs). - _gatherOverrides re-wired to emit the new project_id / location_id keys when the operator picked from the dropdown, falling back to *_name when they typed free-form. Backward-compatible: blank-meta clusters keep their existing "project_name / location_name" plain inputs and the override path still honours them. Verified end-to-end: - /projects_search?q=I-80 returns the existing "I-80 - North Fork Bridge" project (score 1.0, has 4 locations) plus a "Create new" option. - /locations_search requires project_id (400 without it). - Wizard page renders with typeahead wiring confirmed in HTML. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 19:48:09 +00:00
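How the four override keys might be resolved for one cluster is sketched below. This is not the handler in backend/routers/metadata_backfill.py: the function name, the return shape, and the rule that an id takes precedence over a name are assumptions. Only the key names and the check of location_id against the chosen project_id come from the commit message.

```python
def resolve_overrides(override, location_owner):
    """Turn one cluster's override dict into (project_plan, location_plan).

    override: dict with optional keys project_id, project_name,
              location_id, location_name.
    location_owner: hypothetical lookup mapping location_id -> owning project_id.
    Each plan is a tuple: ("attach", id), ("create", name) or ("suggested", None).
    """
    project_id = override.get("project_id")
    if project_id is not None:
        project = ("attach", project_id)
    elif override.get("project_name"):
        project = ("create", override["project_name"])
    else:
        project = ("suggested", None)  # fall back to the parser's suggestion

    location_id = override.get("location_id")
    if location_id is not None:
        # Reject a location that does not belong to the chosen project,
        # so a stale location FK cannot cross projects.
        if project_id is None or location_owner.get(location_id) != project_id:
            raise ValueError("location_id does not belong to the chosen project_id")
        location = ("attach", location_id)
    elif override.get("location_name"):
        location = ("create", override["location_name"])
    else:
        location = ("suggested", None)

    return project, location
```

For example, resolve_overrides({"project_id": 7, "location_name": "East Abutment"}, {}) attaches to project 7 and creates a new location under it.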
serversdown	6ebbe28308	feat(sfm): strip "- Loc N" suffix from operator-typed project names Operators sometimes bake location identifiers into the project string for email-readability — "Fay - Locks & Dam No3 - Loc 2 - 735 Bunola" where "Fay - Locks & Dam No3" is the actual project and "- Loc 2 - 735 Bunola" is location info that already lives in sensor_location. Without stripping, every "- Loc N" variant became a separate project, fragmenting what should be one project with several locations. Backend: - New _extract_project_root() helper. Regex matches " - Loc N" / "-Loc3" / " - Location #5" / etc. with case-insensitive multi-dash support; strips from that marker forward and cleans up dangling separators. Strings without a Loc-marker pass through unchanged. - Cluster dataclass adds project_root field alongside project_raw. project_raw stays the operator-typed string for display ("hover to see what was actually typed"). project_root is what gets normalised for matching and used as the suggested project name. - _ensure_project + _ensure_location now do normalisation-aware dedup before creating: a cluster of "SR81" and a cluster of "SR 81" (which normalise to the same string) collapse into one project on apply, even when applied in the same bulk operation. Avoids UNIQUE constraint collisions and duplicate-named-by-spacing projects. Frontend: - Wizard cluster cards show "↳ stripped trailing 'Loc N' suffix; operator typed: <raw>" when project_root differs from project_raw, so the operator can see at a glance what the parser did to the string. Real-data results: against the same 10,055 SFM events, confidence distribution improved from 37/14/8 (high/med/low) to 43/9/7. "Fay - Locks & Dam No3" now appears as ONE project across 6 cluster instances spanning 3 serials and 6 different locations — exactly the "one project, many locations" model the user described. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 16:49:14 +00:00
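The suffix stripping described in this commit can be approximated with one regular expression. This is a sketch under assumptions: the pattern is a guess at the behaviour the message describes, not the regex in _extract_project_root(), and the function name is hypothetical.

```python
import re

# Assumed pattern: one or more dashes, then "Loc" or "Location", an optional "#",
# a number, and everything after it. Case-insensitive.
LOC_MARKER = re.compile(r"\s*-+\s*loc(?:ation)?\s*#?\s*\d+\b.*$", re.IGNORECASE)


def extract_project_root(raw):
    """Strip a trailing '- Loc N ...' suffix and any dangling separators."""
    root = LOC_MARKER.sub("", raw)
    # Clean up separators left behind, e.g. "Foo - - Loc 2" -> "Foo".
    return root.rstrip(" -")


root = extract_project_root("Fay - Locks & Dam No3 - Loc 2 - 735 Bunola")
# root == "Fay - Locks & Dam No3"
```

Requiring a number after "Loc" is what keeps "Locks" in the example above from being treated as a marker; strings with no marker come back unchanged.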
serversdown	42de06f441	feat(sfm): Phase 5a — bulk-backfill projects/locations/assignments from event metadata Operator clicks one button. Parser reads SFM's events table (operator-typed project / client / sensor_location strings), clusters by serial + time + metadata, fuzzy-matches against existing projects, and proposes Project / MonitoringLocation / UnitAssignment chains to create. Auto-applies high-confidence non-conflicting clusters in bulk; queues medium/low confidence for individual review. Verified against real data: 10,052 events → 59 clusters → 37 high-confidence + 14 medium + 8 low. Test-applied one cluster end-to-end; Project + Module + Location + Assignment + UnitHistory + Decision rows all created correctly, and Phase 2's attribution walk picked up the events automatically on the new location's detail page. Pipeline (backend/services/metadata_backfill.py, ~700 lines): 1. Pull all SFM events via /db/events per serial. 2. Pre-filter: drop events already covered by an existing UnitAssignment window (Phase 2 handles those automatically). 3. Time-cluster what's left: serial + 7-day gap is the cluster identity. 4. Metadata-split each time-cluster on persistent metadata transitions (≥ 2 consecutive events) so a single typo doesn't fork the cluster. 5. Match against existing graph (rapidfuzz.WRatio multi-signal scoring, normalisation that handles abbreviations / reorders / separator variations). Thresholds: 0.95 exact, 0.80 fuzzy, min-shorter-input 5 chars to guard against false positives on single common words. 6. Score confidence (high/medium/low) using event count, span, blank-meta, conflict, ambiguity rules. 7. Detect conflicts: overlap with existing UnitAssignment at a different location for the same serial → blocking. Operator must reconcile. 8. Apply: ensure auto_imported ProjectType exists, ensure vibration_monitoring ProjectModule on the project, write Project / MonitoringLocation / UnitAssignment / UnitHistory all in one transaction. 
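Step 3 of the pipeline above (time-clustering by serial with a 7-day gap) can be sketched as follows. The function name and the event shape are hypothetical, and whether a gap of exactly 7 days splits a cluster is an assumption; here only gaps strictly longer than 7 days do.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MAX_GAP = timedelta(days=7)  # gap size from the commit message


def time_cluster(events):
    """Group (serial, timestamp) pairs into per-serial clusters.

    A new cluster starts whenever consecutive events for the same serial
    are more than MAX_GAP apart. Returns a list of (serial, [timestamps]).
    """
    by_serial = defaultdict(list)
    for serial, ts in events:
        by_serial[serial].append(ts)

    clusters = []
    for serial in sorted(by_serial):
        stamps = sorted(by_serial[serial])
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > MAX_GAP:
                clusters.append((serial, current))
                current = [ts]
            else:
                current.append(ts)
        clusters.append((serial, current))
    return clusters


clusters = time_cluster([
    ("BE1", datetime(2026, 1, 1)),
    ("BE1", datetime(2026, 1, 3)),
    ("BE1", datetime(2026, 1, 20)),  # 17 days after the previous event: new cluster
    ("BE2", datetime(2026, 1, 2)),
])
```

The metadata split in step 4 would then run inside each of these clusters.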
Migration (backend/migrate_add_metadata_backfill.py): adds unit_assignments.source column (default 'manual') and metadata_backfill_decisions table. Idempotent, non-destructive. API (backend/routers/metadata_backfill.py): GET /api/admin/metadata_backfill/scan — clusters + suggestions POST /api/admin/metadata_backfill/apply — bulk apply by cluster_ids w/ optional per-cluster project/location overrides POST /api/admin/metadata_backfill/skip — mark skipped (persistent) UI (templates/admin/metadata_backfill.html, accessible at /settings/developer/metadata-backfill via the Developer tab of Settings): - One-button "Run scan" entry. - Summary KPI tiles (scanned / already attributed / pending / conflicts). - "Apply all high-confidence" bulk button at the top — primary path. - Per-cluster cards below with Apply / Skip / Preview event actions. - Blank-meta clusters get inline input fields for operator-typed project + location names before applying. - Blocking-conflict clusters render with the conflicting assignment information and a disabled Apply button. - Live progress toast during apply. - Reuses the Phase 1+2+4 event-detail modal for "Preview event" — operator can sanity-check the BW report data against the cluster's sample event. Dependencies: rapidfuzz==3.10.1 added to requirements.txt. Pre-built C wheels for all platforms, ~5s docker build hit. Phase 5b (deferred to next session): swap-detection daily background job, notification inbox for auto-applied swaps, recently-applied audit view, "Tidy" page for renaming/merging auto-created projects. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 05:54:57 +00:00
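An idempotent column migration of the kind described for unit_assignments.source might look like the sketch below. It assumes SQLite, which the commit message does not state, and the function name is hypothetical. The metadata_backfill_decisions table is left out because its columns are not given in the message.

```python
import sqlite3


def add_source_column(conn):
    """Add unit_assignments.source (default 'manual') if it is not there yet.

    Returns True when the column was added, False when it already existed,
    so the migration can be re-run safely.
    """
    cols = {row[1] for row in conn.execute("PRAGMA table_info(unit_assignments)")}
    if "source" in cols:
        return False
    conn.execute(
        "ALTER TABLE unit_assignments ADD COLUMN source TEXT DEFAULT 'manual'"
    )
    conn.commit()
    return True
```

Checking the existing columns first is what makes a second run a no-op, and adding a column with a default leaves existing rows intact.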

5 Commits