main
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ba1f28ee53 |
fix(backfill): typeahead picks broken by JSON.stringify quote collision in onclick
The inline onclick on each typeahead dropdown item was:
onclick="onTypeaheadPick(event, 'cid', 'location', 'loc-id', ${JSON.stringify(m.name)})"
For any name with spaces/punctuation (i.e. every real location name like
"Area 1 - Loc 1 - 87 Jenks"), JSON.stringify emits double quotes around
the value, which collide with the onclick attribute's own double quotes
and terminate the attribute early. The dropdown rendered fine via
.innerHTML, but the browser's HTML parser saw a broken attribute and
never bound the click handler — clicks on dropdown items silently did
nothing.
Same pattern that broke the location Remove button yesterday. Same fix:
move args into data-* attributes and dispatch through a tiny trampoline
that reads from this.dataset. Robust against any character in
project/location names.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
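A minimal sketch of the data-* approach described in this commit, written in Python so the escaping can be checked outside the browser. The names render_item, pickFromDataset and ta-item are hypothetical and do not come from the repository; the sketch only assumes the markup is built as a string before being assigned to .innerHTML.

```python
import html
from html.parser import HTMLParser


def render_item(name: str, kind: str = "location") -> str:
    """Render one dropdown item with its arguments in data-* attributes."""
    # html.escape(..., quote=True) turns " into &quot; and ' into &#x27;,
    # so the value can never terminate the surrounding attribute.
    safe = html.escape(name, quote=True)
    return (
        f'<div class="ta-item" data-kind="{kind}" data-name="{safe}" '
        f'onclick="pickFromDataset(this)">{html.escape(name)}</div>'
    )


class AttrCollector(HTMLParser):
    """Collects the attributes of the first start tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.attrs: dict = {}

    def handle_starttag(self, tag, attrs):
        if not self.attrs:
            self.attrs = dict(attrs)


def parsed_attrs(markup: str) -> dict:
    """Parse markup the way an HTML parser would and return the attributes."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.attrs
```

Because the onclick attribute no longer contains any user data, its text is constant; the name travels in data-name and is read back by the handler.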
|
||
|
|
77483c2186 |
feat(projects): Tidy page for fuzzy-detecting + bulk-merging duplicate projects
Phase 5b first slice. Surfaces near-duplicate projects (typo variants,
abbreviation differences, spacing variations like "SR81" vs "SR 81") as
side-by-side pairs the operator can merge with one click.
Backend (backend/services/project_tidy.py):
- find_duplicate_pairs(db, threshold=0.85) walks all active projects and
computes rapidfuzz.WRatio similarity for every pair. Pre-filters
too-short normalised names (< 4 chars) to avoid noise. Skips
soft-deleted projects. Returns pairs sorted by score desc, then by
total content (more assignments → review first).
- Each pair carries a suggested merge target with a human-readable
reason. Priorities (in order): manual source over parser source,
populated project_number, more locations, more assignments, shorter
name. Operator can override the suggestion by clicking the OTHER
direction button.
- O(N^2) over project count. Fine up to ~500 projects. Token-prefix
blocking is the obvious next optimisation if it becomes slow.
Backend (backend/routers/projects.py):
- GET /api/projects/admin/duplicate_pairs?threshold=&max_pairs= returns
pairs as JSON for the Tidy page.
Frontend (templates/admin/project_tidy.html):
- New admin page at /settings/developer/project-tidy. Threshold selector
(95% / 90% / 85% / 80%) at the top; rescan button next to it;
auto-scans on load.
- Each pair card shows side-by-side project summaries (name,
project_number, client, source-badge, location/assignment counts) with
the suggested target visually highlighted (orange border). Three
buttons: "Merge A → B", "Merge B → A", "Not a dup" (hide locally).
- Click-to-merge opens a native confirm with the preview totals
(assignments/sessions/data files moving, consolidations), the same data
the project_header.html merge modal shows. On confirm, hits the
existing /merge_into endpoint and re-scans automatically.
- Source badges distinguish parser-created (`metadata_backfill`) from
manual projects, so at a glance the operator can see "this duplicate is
parser-generated; safe to merge into the manual one".
Frontend (templates/admin/metadata_backfill.html):
- Apply-result handling now surfaces failed[] cluster reasons in a
dedicated failure panel (bottom-left, dismissable). Previously a 200 OK
with all-failures showed a misleading "1 cluster applied" success toast
because the count and the failure list weren't being reconciled. This
bit us during the DB-revert recovery earlier: the project_modules table
was missing, every apply silently rolled back, user saw success
toasts. Fixed.
Smoke-verified against current state (10K events, 9 projects,
post-merge): tool correctly finds 0 pairs at threshold 0.85 (data is
clean), 1 false-positive at 0.70 (two unrelated projects sharing the
token "81"; an example of why the 0.85 default is correct).
Settings link added under Developer → Project Tidy.
Phase 5c (swap-detection daily background job + notification inbox)
remains deferred to the next session.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
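The pairwise scan described for find_duplicate_pairs can be sketched roughly as follows. This is an illustration under assumptions, not the repository code: difflib.SequenceMatcher is swapped in for rapidfuzz.WRatio so the example runs on the standard library alone, and the Project dataclass, normalise helper and in-memory project list are hypothetical stand-ins for the real models and database session.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
import re


@dataclass
class Project:
    id: int
    name: str
    assignments: int = 0
    deleted: bool = False


def normalise(name: str) -> str:
    """Lower-case and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())


def similarity(a: str, b: str) -> float:
    """Stand-in scorer in [0, 1]; the commit uses rapidfuzz.WRatio instead."""
    return SequenceMatcher(None, a, b).ratio()


def find_duplicate_pairs(projects, threshold=0.85, min_len=4):
    # Skip soft-deleted projects and names too short to compare reliably.
    candidates = [
        (p, normalise(p.name))
        for p in projects
        if not p.deleted and len(normalise(p.name)) >= min_len
    ]
    pairs = []
    # Every unordered pair is scored once: O(N^2) in the project count.
    for (a, na), (b, nb) in combinations(candidates, 2):
        score = similarity(na, nb)
        if score >= threshold:
            pairs.append((score, a.assignments + b.assignments, a, b))
    # Highest score first; ties broken by total content, larger first.
    pairs.sort(key=lambda t: (-t[0], -t[1]))
    return [(a, b, score) for score, _, a, b in pairs]
```

With "SR81 Bridge" and "SR 81 Bridge" both active, the two names normalise to the same string and come back as a single pair, while soft-deleted and very short names never enter the comparison.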
||
|
|
d3b5a3fd26 |
feat(sfm): inline typeahead override of project + location on each cluster card
Operator no longer has to accept the parser's suggested project /
location verbatim. Each cluster card now has editable typeahead inputs
that search existing projects (and existing locations within the chosen
project), with a "Create new: <typed>" fallback always available.
Solves the I-80-North-Fork case: of the 20+ cluster variants
("I-80-North Fork Bridges-I80 E. Abutment", "I-80- North Fork
Bridges-543 Plank Rd", etc.), operator types "I-80" in the Project
input, picks the existing project from the dropdown, and the cluster
attaches to it. Repeat for the other variants. No need to pre-create
the canonical project — though pre-creation still works fine if you'd
rather.
Backend (backend/routers/metadata_backfill.py):
- GET /api/admin/metadata_backfill/projects_search?q=&limit=
Returns existing projects matching by case-insensitive substring OR
rapidfuzz WRatio score >= 0.50. Substring matches sort to the top
(treated as exact for ordering). Includes location_count and
project_number/client_name in each result for disambiguation. Always
emits a "Create new: <q>" suggestion alongside the matches.
- GET /api/admin/metadata_backfill/locations_search?project_id=&q=&limit=
Same shape, scoped to a single project's vibration locations.
- POST /api/admin/metadata_backfill/apply now accepts four override
keys per cluster (was previously two):
project_id → attach to existing Project (operator picked from typeahead)
project_name → create new with this name (operator typed a custom name; existing project_name behaviour)
location_id → attach to existing MonitoringLocation; validated against the chosen project_id so a stale location FK can't sneak in
location_name → create new location with this name
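One way the four override keys might be resolved per cluster is sketched below. The function name, the plan format and the location_project lookup are hypothetical; the precedence of ids over names and the rejection of a location_id sent without a project_id are assumptions, since the commit only states that location_id is validated against the chosen project_id.

```python
def resolve_override(override: dict, location_project: dict) -> dict:
    """Turn one cluster's override keys into an attach/create plan.

    location_project maps location_id -> owning project_id and stands in
    for a database lookup.
    """
    project_id = override.get("project_id")
    location_id = override.get("location_id")

    if location_id is not None:
        owner = location_project.get(location_id)
        # A location picked under one project must not be applied under another.
        if project_id is None or owner != project_id:
            raise ValueError("location_id does not belong to project_id")

    plan = {}
    if project_id is not None:
        plan["project"] = ("attach", project_id)
    elif override.get("project_name"):
        plan["project"] = ("create", override["project_name"])
    if location_id is not None:
        plan["location"] = ("attach", location_id)
    elif override.get("location_name"):
        plan["location"] = ("create", override["location_name"])
    return plan
```

A cluster with no override keys yields an empty plan, which a caller could treat as "use the parser's suggestion".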
Frontend (templates/admin/metadata_backfill.html):
- Each non-blank-meta cluster card now has two editable typeahead inputs
(Project + Location) pre-populated with the parser's suggested
values. Old static "Project: + Create new: X" / "≈ Fuzzy match" pills
replaced with compact hint lines under the inputs showing what the
current value will do.
- Typeahead dropdown opens on focus, debounced 150ms on type. Shows
matched existing entities with score badges (exact / NN%) plus a
"Create new: <typed>" option at the bottom. Click-to-pick fills the
text input and writes the entity id into a hidden field.
- Picking a new project clears the location id (forces re-pick under
the new project, avoids cross-project location FKs).
- _gatherOverrides re-wired to emit the new project_id / location_id
keys when the operator picked from the dropdown, falling back to
*_name when they typed free-form.
Backward-compatible: blank-meta clusters keep their existing "project_name
/ location_name" plain inputs and the override path still honours them.
Verified end-to-end:
- /projects_search?q=I-80 returns the existing "I-80 - North Fork
Bridge" project (score 1.0, has 4 locations) plus a "Create new"
option.
- /locations_search requires project_id (400 without it).
- Wizard page renders with typeahead wiring confirmed in HTML.
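The ordering rules for projects_search (substring hits first, fuzzy hits at or above 0.50, and a "Create new" entry always present) could look roughly like this. It is a sketch only: difflib.SequenceMatcher replaces rapidfuzz WRatio so it runs without third-party packages, and the function name and result dictionary shape are hypothetical.

```python
from difflib import SequenceMatcher


def search_projects(query: str, names: list, limit: int = 10,
                    min_score: float = 0.50) -> list:
    q = query.strip().lower()
    hits = []
    for name in names:
        if q in name.lower():
            # Substring hits are treated as exact for ordering purposes.
            hits.append({"name": name, "score": 1.0, "kind": "existing"})
            continue
        # Stand-in fuzzy score in [0, 1]; the commit uses rapidfuzz WRatio.
        score = SequenceMatcher(None, q, name.lower()).ratio()
        if score >= min_score:
            hits.append({"name": name, "score": round(score, 2), "kind": "existing"})
    # Stable sort keeps input order among equally scored hits.
    hits.sort(key=lambda h: -h["score"])
    results = hits[:limit]
    # The "Create new" suggestion is always present, after the matches.
    results.append({"name": query, "score": None, "kind": "create_new"})
    return results
```

Querying "I-80" against a list containing "I-80 - North Fork Bridge" returns that project with score 1.0 followed by the create-new entry, mirroring the first verification bullet.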
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
6ebbe28308 |
feat(sfm): strip "- Loc N" suffix from operator-typed project names
Operators sometimes bake location identifiers into the project string for
email-readability: "Fay - Locks & Dam No3 - Loc 2 - 735 Bunola", where
"Fay - Locks & Dam No3" is the actual project and "- Loc 2 - 735 Bunola"
is location info that already lives in sensor_location. Without
stripping, every "- Loc N" variant became a separate project, fragmenting
what should be one project with several locations.
Backend:
- New _extract_project_root() helper. Regex matches " - Loc N" / "-Loc3" /
" - Location #5" / etc. with case-insensitive multi-dash support; strips
from that marker forward and cleans up dangling separators. Strings
without a Loc-marker pass through unchanged.
- Cluster dataclass adds project_root field alongside project_raw.
project_raw stays the operator-typed string for display ("hover to see
what was actually typed"). project_root is what gets normalised for
matching and used as the suggested project name.
- _ensure_project + _ensure_location now do normalisation-aware dedup
before creating: a cluster of "SR81" and a cluster of "SR 81" (which
normalise to the same string) collapse into one project on apply, even
when applied in the same bulk operation. Avoids UNIQUE constraint
collisions and duplicate-named-by-spacing projects.
Frontend:
- Wizard cluster cards show "↳ stripped trailing 'Loc N' suffix; operator
typed: <raw>" when project_root differs from project_raw, so the
operator can see at a glance what the parser did to the string.
Real-data results: against the same 10,055 SFM events, confidence
distribution improved from 37/14/8 (high/med/low) to 43/9/7.
"Fay - Locks & Dam No3" now appears as ONE project across 6 cluster
instances spanning 3 serials and 6 different locations, exactly the
"one project, many locations" model the user described.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
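A hedged sketch of the suffix stripping and the normalisation-aware dedup key follows. The regex is a guess at a pattern that satisfies the examples quoted in this commit; the actual expression inside _extract_project_root() is not shown in the message, and dedup_key is a hypothetical helper name.

```python
import re

# Matches a dash-separated "Loc N" / "Location #N" marker and everything
# after it, e.g. " - Loc 2 - 735 Bunola", "-Loc3", " - Location #5".
_LOC_MARKER = re.compile(
    r"\s*-+\s*loc(?:ation)?\.?\s*#?\s*\d+.*$",
    re.IGNORECASE,
)


def extract_project_root(raw: str) -> str:
    """Strip a trailing location marker; leave other strings untouched."""
    match = _LOC_MARKER.search(raw)
    if match is None:
        return raw  # no Loc-marker: pass through unchanged
    # Drop the marker and any dangling separators left before it.
    root = raw[: match.start()].rstrip(" -")
    return root or raw


def dedup_key(name: str) -> str:
    """Normalised form used to decide whether two names are the same project."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())
```

Note that "Locks" in "Fay - Locks & Dam No3" is not mistaken for a marker, because the pattern requires digits to follow "Loc" or "Location".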
||
|
|
42de06f441 |
feat(sfm): Phase 5a — bulk-backfill projects/locations/assignments from event metadata
Operator clicks one button. Parser reads SFM's events table (operator-typed
project / client / sensor_location strings), clusters by serial + time +
metadata, fuzzy-matches against existing projects, and proposes
Project / MonitoringLocation / UnitAssignment chains to create.
Auto-applies high-confidence non-conflicting clusters in bulk; queues
medium/low confidence for individual review.
Verified against real data: 10,052 events → 59 clusters → 37
high-confidence + 14 medium + 8 low. Test-applied one cluster end-to-end;
Project + Module + Location + Assignment + UnitHistory + Decision rows
all created correctly, and Phase 2's attribution walk picked up the
events automatically on the new location's detail page.
Pipeline (backend/services/metadata_backfill.py, ~700 lines):
1. Pull all SFM events via /db/events per serial.
2. Pre-filter: drop events already covered by an existing UnitAssignment
window (Phase 2 handles those automatically).
3. Time-cluster what's left: serial + 7-day gap is the cluster identity.
4. Metadata-split each time-cluster on persistent metadata transitions
(≥ 2 consecutive events) so a single typo doesn't fork the cluster.
5. Match against existing graph (rapidfuzz.WRatio multi-signal scoring,
normalisation that handles abbreviations / reorders / separator
variations). Thresholds: 0.95 exact, 0.80 fuzzy, min-shorter-input
5 chars to guardrail false positives on single common words.
6. Score confidence (high/medium/low) using event count, span,
blank-meta, conflict, ambiguity rules.
7. Detect conflicts: overlap with existing UnitAssignment at a different
location for the same serial → blocking. Operator must reconcile.
8. Apply: ensure auto_imported ProjectType exists, ensure
vibration_monitoring ProjectModule on the project, write
Project / MonitoringLocation / UnitAssignment / UnitHistory all in
one transaction.
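Steps 3 and 4 of the pipeline above can be sketched as two small functions. This is an illustration, not the code in backend/services/metadata_backfill.py: events are modelled as hypothetical (serial, timestamp, metadata) tuples, and "persistent transition" is interpreted as the new metadata value appearing on at least two consecutive events.

```python
from datetime import datetime, timedelta


def time_clusters(events, max_gap=timedelta(days=7)):
    """Group events per serial, starting a new cluster after a gap > max_gap."""
    clusters = []
    for event in sorted(events, key=lambda e: (e[0], e[1])):
        last = clusters[-1][-1] if clusters else None
        if last and last[0] == event[0] and event[1] - last[1] <= max_gap:
            clusters[-1].append(event)
        else:
            clusters.append([event])
    return clusters


def split_on_metadata(cluster, min_run=2):
    """Fork only where new metadata persists for at least min_run events."""
    segments = [[cluster[0]]]
    for i in range(1, len(cluster)):
        current_meta = segments[-1][0][2]
        if cluster[i][2] != current_meta:
            run = cluster[i : i + min_run]
            # A lone differing event (e.g. a typo) stays in the current segment.
            if len(run) == min_run and all(e[2] == cluster[i][2] for e in run):
                segments.append([])
        segments[-1].append(cluster[i])
    return segments
```

A single mistyped event in the middle of a run is absorbed into the surrounding segment, while two or more consecutive events with new metadata start a new one.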
Migration (backend/migrate_add_metadata_backfill.py): adds
unit_assignments.source column (default 'manual') and
metadata_backfill_decisions table. Idempotent, non-destructive.
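An idempotent, non-destructive migration of this kind might be written as below. The sketch assumes SQLite, which the commit does not state, and the columns of metadata_backfill_decisions are invented for illustration; only the unit_assignments.source column and its 'manual' default come from the message.

```python
import sqlite3


def migrate(conn: sqlite3.Connection) -> None:
    """Add the source column and decisions table if they are missing."""
    # PRAGMA table_info returns one row per column; index 1 is the name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(unit_assignments)")}
    if "source" not in columns:
        conn.execute(
            "ALTER TABLE unit_assignments "
            "ADD COLUMN source TEXT NOT NULL DEFAULT 'manual'"
        )
    # Hypothetical column layout; IF NOT EXISTS makes a re-run a no-op.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata_backfill_decisions ("
        "id INTEGER PRIMARY KEY, "
        "cluster_key TEXT NOT NULL, "
        "decision TEXT NOT NULL)"
    )
    conn.commit()
```

Running the function twice leaves the schema unchanged the second time, and existing rows pick up the default value without being rewritten by hand.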
API (backend/routers/metadata_backfill.py):
GET /api/admin/metadata_backfill/scan: clusters + suggestions
POST /api/admin/metadata_backfill/apply: bulk apply by cluster_ids, with optional per-cluster project/location overrides
POST /api/admin/metadata_backfill/skip: mark skipped (persistent)
UI (templates/admin/metadata_backfill.html, accessible at
/settings/developer/metadata-backfill via the Developer tab of Settings):
- One-button "Run scan" entry.
- Summary KPI tiles (scanned / already attributed / pending / conflicts).
- "Apply all high-confidence" bulk button at the top — primary path.
- Per-cluster cards below with Apply / Skip / Preview event actions.
- Blank-meta clusters get inline input fields for operator-typed project +
location names before applying.
- Blocking-conflict clusters render with the conflicting assignment
information and a disabled Apply button.
- Live progress toast during apply.
- Reuses the Phase 1+2+4 event-detail modal for "Preview event" — operator
can sanity-check the BW report data against the cluster's sample event.
Dependencies: rapidfuzz==3.10.1 added to requirements.txt. Pre-built C
wheels for all platforms, ~5s docker build hit.
Phase 5b (deferred to next session): swap-detection daily background job,
notification inbox for auto-applied swaps, recently-applied audit view,
"Tidy" page for renaming/merging auto-created projects.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|