terra-view/templates/admin/project_tidy.html
serversdown 77483c2186 feat(projects): Tidy page for fuzzy-detecting + bulk-merging duplicate projects
Phase 5b first slice.  Surfaces near-duplicate projects (typo variants,
abbreviation differences, spacing variations like "SR81" vs "SR 81")
as side-by-side pairs the operator can merge with one click.

Backend (backend/services/project_tidy.py):
- find_duplicate_pairs(db, threshold=0.85) walks all active projects and
  computes rapidfuzz.WRatio similarity for every pair.  Pre-filters
  too-short normalised names (< 4 chars) to avoid noise.  Skips
  soft-deleted projects.  Returns pairs sorted by score desc, then by
  total content (more assignments → review first).
- Each pair carries a suggested merge target with a human-readable
  reason.  Priorities (in order): manual source over parser source,
  populated project_number, more locations, more assignments, shorter
  name.  Operator can override the suggestion by clicking the OTHER
  direction button.
- O(N^2) over project count: N(N-1)/2 comparisons, about 125k at
  N = 500.  Fine up to ~500 projects.  Token-prefix blocking is the
  obvious next optimisation if it becomes slow.
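
A rough sketch of the scan and target suggestion described in the list above, written in JavaScript to match the template's script rather than the Python backend. Everything here is illustrative: `normalise`, the `deleted` flag, and reading "total content" as the summed assignment counts are assumptions, and `similarity` is a plain Levenshtein ratio standing in for `rapidfuzz.WRatio`, so its scores will not match the real scorer.

```javascript
// Assumed normalisation: lower-case, keep only letters and digits.
function normalise(name) {
  return String(name).toLowerCase().replace(/[^a-z0-9]/g, '');
}

// Stand-in metric (NOT rapidfuzz.WRatio): 1 - editDistance / longerLength.
function similarity(a, b) {
  if (a === b) return 1;
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return 1 - row[b.length] / Math.max(a.length, b.length);
}

// All-pairs scan: skip soft-deleted projects and names shorter than 4
// normalised characters, keep pairs at or above the threshold, then sort
// by score (desc) and by total content (desc).
function findDuplicatePairs(projects, threshold = 0.85) {
  const candidates = projects
    .filter(p => !p.deleted)
    .map(p => ({ project: p, key: normalise(p.name) }))
    .filter(c => c.key.length >= 4);
  const pairs = [];
  for (let i = 0; i < candidates.length; i++) {
    for (let j = i + 1; j < candidates.length; j++) {
      const score = similarity(candidates[i].key, candidates[j].key);
      if (score < threshold) continue;
      const a = candidates[i].project;
      const b = candidates[j].project;
      pairs.push({ a, b, score, content: a.assignment_count + b.assignment_count });
    }
  }
  return pairs.sort((x, y) => y.score - x.score || y.content - x.content);
}

// Walk the priority list in order; the first criterion that differs decides.
function suggestTarget(a, b) {
  const parserSources = ['metadata_backfill', 'metadata_backfill_swap'];
  const criteria = [
    ['manual source', p => (parserSources.includes(p.source) ? 0 : 1)],
    ['has project_number', p => (p.project_number ? 1 : 0)],
    ['more locations', p => p.location_count],
    ['more assignments', p => p.assignment_count],
    ['shorter name', p => -p.name.length],
  ];
  for (const [reason, rank] of criteria) {
    if (rank(a) !== rank(b)) return { target: rank(a) > rank(b) ? a : b, reason };
  }
  return { target: a, reason: 'no distinguishing criterion' };
}
```

With this normalisation, "SR81" and "SR 81" collapse to the same key, so that kind of spacing variant scores 1.0 regardless of the metric chosen.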

Backend (backend/routers/projects.py):
- GET /api/projects/admin/duplicate_pairs?threshold=&max_pairs=  returns
  pairs as JSON for the Tidy page.
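
As an illustration of how a caller might build that request, here is a small helper. The function name and option names are hypothetical; only the path and the `threshold` / `max_pairs` query parameters come from the endpoint description above.

```javascript
// Hypothetical helper: builds the duplicate-pairs request URL.
// `maxPairs` is left out of the query string when not supplied.
function duplicatePairsUrl({ threshold = 0.85, maxPairs } = {}) {
  const params = new URLSearchParams({ threshold: String(threshold) });
  if (maxPairs != null) params.set('max_pairs', String(maxPairs));
  return `/api/projects/admin/duplicate_pairs?${params.toString()}`;
}
```

The Tidy page script reads a `pairs` array from the JSON response, and from each pair the fields `a`, `b`, `score`, `reason`, and `suggested_target_id`.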

Frontend (templates/admin/project_tidy.html):
- New admin page at /settings/developer/project-tidy.  Threshold selector
  (95% / 90% / 85% / 80%) at the top; rescan button next to it; auto-
  scans on load.
- Each pair card shows side-by-side project summaries (name, project_
  number, client, source-badge, location/assignment counts) with the
  suggested target visually highlighted (orange border).  Three buttons:
  "Merge A → B", "Merge B → A", "Not a dup" (hide locally).
- Click-to-merge opens a native confirm with the preview totals
  (assignments/sessions/data files moving, consolidations) — same data
  the project_header.html merge modal shows.  On confirm, hits the
  existing /merge_into endpoint and re-scans automatically.
- Source badges distinguish parser-created (`metadata_backfill`) from
  manual projects — at a glance the operator can see "this duplicate is
  parser-generated; safe to merge into the manual one".

Frontend (templates/admin/metadata_backfill.html):
- Apply-result handling now surfaces failed[] cluster reasons in a
  dedicated failure panel (bottom-left, dismissable).  Previously a 200
  OK with all-failures showed a misleading "1 cluster applied" success
  toast because the count and the failure list weren't being reconciled.
  This bit us during the DB-revert recovery earlier: the
  project_modules table was missing, every apply silently rolled back,
  and the user still saw success toasts.  Fixed.
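
One way the count and the failure list could be reconciled before choosing a toast, as a minimal sketch. The response shape is an assumption: `requested` (clusters submitted) and the `cluster` / `reason` fields inside `failed[]` are hypothetical names, not the actual payload.

```javascript
// Hypothetical response shape: { requested: number, failed: [{ cluster, reason }] }.
// The success count is derived from the failure list rather than trusted as-is.
function summariseApplyResult(result) {
  const failed = Array.isArray(result.failed) ? result.failed : [];
  const requested = Number(result.requested) || 0;
  const succeeded = Math.max(0, requested - failed.length);
  const failures = failed.map(f => `${f.cluster}: ${f.reason}`);
  if (failed.length > 0 && succeeded === 0) {
    return { kind: 'error', message: `All ${failed.length} cluster(s) failed`, failures };
  }
  if (failed.length > 0) {
    return { kind: 'warning', message: `${succeeded} applied, ${failed.length} failed`, failures };
  }
  return { kind: 'success', message: `${succeeded} cluster(s) applied`, failures };
}
```

Under these assumptions, a 200 OK whose only cluster failed yields `kind: 'error'` with the reason listed, instead of a "1 cluster applied" success message.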

Smoke-verified against current state (10K events, 9 projects, post-
merge): tool correctly finds 0 pairs at threshold 0.85 (data is clean),
1 false-positive at 0.70 (two unrelated projects sharing the token "81"
— example of why the 0.85 default is correct).

Settings link added under Developer → Project Tidy.

Phase 5c (swap-detection daily background job + notification inbox)
remains deferred to the next session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 21:29:50 +00:00


{% extends "base.html" %}
{% block title %}Project Tidy - Seismo Fleet Manager{% endblock %}
{% block content %}
<!-- Breadcrumb -->
<div class="mb-6">
<nav class="flex items-center space-x-2 text-sm">
<a href="/settings" class="text-seismo-orange hover:text-seismo-navy flex items-center">
<svg class="w-4 h-4 mr-1" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7"></path>
</svg>
Settings
</a>
<svg class="w-4 h-4 text-gray-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7"></path>
</svg>
<span class="text-gray-900 dark:text-white font-medium">Project Tidy</span>
</nav>
</div>
<!-- Header -->
<div class="mb-6">
<h1 class="text-3xl font-bold text-gray-900 dark:text-white">Project Tidy</h1>
<p class="text-gray-600 dark:text-gray-400 mt-1">
Find duplicate-looking projects via fuzzy name matching, then merge them with one click.
Useful after the metadata-backfill parser creates near-duplicates from operator name variations.
</p>
</div>
<!-- Controls -->
<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-4 mb-5">
<div class="flex flex-wrap items-end gap-3">
<div class="flex flex-col gap-1">
<label for="threshold" class="text-xs text-gray-500 dark:text-gray-400">Similarity threshold</label>
<select id="threshold" onchange="runScan()"
class="px-3 py-1.5 text-sm border border-gray-300 dark:border-gray-600 rounded-lg bg-white dark:bg-slate-700 text-gray-900 dark:text-white">
<option value="0.95">≥ 95% — near-identical only (typos)</option>
<option value="0.90">≥ 90% — close variants</option>
<option value="0.85" selected>≥ 85% — fuzzy match floor (recommended)</option>
<option value="0.80">≥ 80% — aggressive (more false positives)</option>
</select>
</div>
<button onclick="runScan()"
class="ml-auto px-4 py-1.5 text-sm bg-seismo-orange text-white rounded-lg hover:bg-seismo-navy transition-colors">
↻ Scan for duplicates
</button>
</div>
</div>
<!-- Results -->
<div id="results" class="space-y-3">
<div class="text-center py-12 text-gray-500 dark:text-gray-400">
Click "Scan for duplicates" to find pairs.
</div>
</div>
<!-- Apply progress toast -->
<div id="tidy-toast" class="hidden fixed bottom-6 right-6 bg-white dark:bg-slate-800 rounded-xl shadow-2xl border border-gray-200 dark:border-gray-700 p-4 z-50 max-w-md">
<div class="flex items-center gap-3">
<div id="toast-icon" class="shrink-0">
<div class="animate-spin rounded-full h-6 w-6 border-b-2 border-seismo-orange"></div>
</div>
<div class="flex-1">
<p id="toast-message" class="text-sm font-medium text-gray-900 dark:text-white">Working…</p>
<p id="toast-sub" class="text-xs text-gray-500 dark:text-gray-400 mt-0.5"></p>
</div>
</div>
</div>
<script>
let _pairs = [];
function _esc(s) {
if (s == null) return '';
return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;').replace(/'/g, '&#39;');
}
function _sourceBadge(source) {
if (source === 'metadata_backfill' || source === 'metadata_backfill_swap') {
return '<span class="px-1.5 py-0.5 rounded text-xs bg-amber-100 text-amber-800 dark:bg-amber-900/30 dark:text-amber-300" title="Auto-created by the metadata-backfill parser">parser</span>';
}
return '<span class="px-1.5 py-0.5 rounded text-xs bg-blue-100 text-blue-800 dark:bg-blue-900/30 dark:text-blue-300" title="Manually created via the UI">manual</span>';
}
// Handle of the pending auto-hide timer, if any.
let _toastTimer = null;
function _showToast(message, sub, kind) {
// Cancel a pending auto-hide so an earlier toast's timer cannot hide this one.
clearTimeout(_toastTimer);
const toast = document.getElementById('tidy-toast');
const icon = document.getElementById('toast-icon');
document.getElementById('toast-message').textContent = message;
document.getElementById('toast-sub').textContent = sub || '';
if (kind === 'success') {
icon.innerHTML = '<svg class="w-6 h-6 text-green-500" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M5 13l4 4L19 7"></path></svg>';
} else if (kind === 'error') {
icon.innerHTML = '<svg class="w-6 h-6 text-red-500" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 9v2m0 4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z"></path></svg>';
} else {
icon.innerHTML = '<div class="animate-spin rounded-full h-6 w-6 border-b-2 border-seismo-orange"></div>';
}
toast.classList.remove('hidden');
}
function _hideToast(after) {
// Keep only one hide timer alive at a time.
clearTimeout(_toastTimer);
_toastTimer = setTimeout(() => document.getElementById('tidy-toast').classList.add('hidden'), after || 3000);
}
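// Incremented on every scan; lets a scan detect that a newer one has started.
let _scanSeq = 0;
async function runScan() {
// The threshold <select> and the button can start overlapping scans;
// only the most recent one is allowed to update the page.
const seq = ++_scanSeq;
const results = document.getElementById('results');
results.innerHTML = '<div class="text-center py-12 text-gray-500 dark:text-gray-400"><div class="animate-spin rounded-full h-8 w-8 border-b-2 border-seismo-orange mx-auto mb-3"></div>Scanning…</div>';
const threshold = document.getElementById('threshold').value;
try {
const r = await fetch(`/api/projects/admin/duplicate_pairs?threshold=${encodeURIComponent(threshold)}`);
if (!r.ok) throw new Error('HTTP ' + r.status);
const d = await r.json();
if (seq !== _scanSeq) return; // stale response, a newer scan is in flight
_pairs = d.pairs || [];
render();
} catch (e) {
if (seq !== _scanSeq) return;
results.innerHTML = `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-6 text-center text-red-500">Scan failed: ${_esc(e.message)}</div>`;
}
}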
function render() {
const results = document.getElementById('results');
if (_pairs.length === 0) {
results.innerHTML = `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-8 text-center">
<svg class="w-16 h-16 mx-auto mb-4 text-green-500" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z"></path>
</svg>
<h3 class="text-lg font-semibold text-gray-900 dark:text-white mb-1">✨ No duplicates above the threshold</h3>
<p class="text-gray-500 dark:text-gray-400">Lower the threshold or call it good.</p>
</div>`;
return;
}
const summary = `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-4 mb-3">
<div class="text-sm text-gray-700 dark:text-gray-300">
Found <strong>${_pairs.length}</strong> duplicate pair${_pairs.length === 1 ? '' : 's'}.
Review the suggested merge direction (arrow points at the target project to keep),
adjust if needed, then click <strong>Merge</strong>.
</div>
</div>`;
results.innerHTML = summary + _pairs.map(_renderPair).join('');
}
function _renderPair(pair, i) {
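// Despite the name, this is the side ('a' or 'b') suggested as the merge TARGET (the project to keep).
const sourceTarget = pair.suggested_target_id === pair.a.id ? 'a' : 'b';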
return `<div class="bg-white dark:bg-slate-800 rounded-xl shadow-lg p-4" data-idx="${i}">
<div class="flex items-center justify-between mb-3">
<div class="flex items-center gap-2">
<span class="px-2 py-0.5 rounded text-xs font-medium bg-amber-100 text-amber-800 dark:bg-amber-900/30 dark:text-amber-300">${(pair.score * 100).toFixed(0)}% match</span>
<span class="text-xs text-gray-500 dark:text-gray-400">${_esc(pair.reason)}</span>
</div>
<div class="flex items-center gap-2">
<button onclick="confirmMerge(${i}, 'a_into_b')"
class="px-3 py-1.5 text-xs rounded ${sourceTarget === 'b' ? 'bg-seismo-orange hover:bg-seismo-navy text-white font-medium' : 'border border-gray-300 dark:border-gray-600 text-gray-700 dark:text-gray-300 hover:bg-gray-50 dark:hover:bg-gray-700'}">
Merge A → B
</button>
<button onclick="confirmMerge(${i}, 'b_into_a')"
class="px-3 py-1.5 text-xs rounded ${sourceTarget === 'a' ? 'bg-seismo-orange hover:bg-seismo-navy text-white font-medium' : 'border border-gray-300 dark:border-gray-600 text-gray-700 dark:text-gray-300 hover:bg-gray-50 dark:hover:bg-gray-700'}">
Merge B → A
</button>
<button onclick="dismissPair(${i})"
title="Hide this pair (not actually a duplicate)"
class="px-3 py-1.5 text-xs border border-gray-300 dark:border-gray-600 text-gray-500 dark:text-gray-400 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
Not a dup
</button>
</div>
</div>
<div class="grid grid-cols-2 gap-3">
${_renderProject(pair.a, 'A', sourceTarget === 'a')}
${_renderProject(pair.b, 'B', sourceTarget === 'b')}
</div>
</div>`;
}
function _renderProject(p, label, isTarget) {
const borderCls = isTarget ? 'border-seismo-orange ring-1 ring-seismo-orange/30' : 'border-gray-200 dark:border-gray-700';
return `<a href="/projects/${_esc(p.id)}" target="_blank"
class="block bg-gray-50 dark:bg-slate-900/50 rounded-lg p-3 border ${borderCls} hover:shadow-md transition-shadow">
<div class="flex items-start justify-between gap-2 mb-1">
<div class="text-xs text-gray-500 dark:text-gray-400">Project ${label}${isTarget ? ' · suggested target' : ''}</div>
${_sourceBadge(p.source)}
</div>
<div class="font-semibold text-gray-900 dark:text-white text-sm">${_esc(p.name)}</div>
${p.project_number ? `<div class="text-xs text-gray-500 dark:text-gray-400 mt-0.5">#${_esc(p.project_number)}</div>` : ''}
${p.client_name ? `<div class="text-xs text-gray-500 dark:text-gray-400 mt-0.5">${_esc(p.client_name)}</div>` : ''}
<div class="flex items-center gap-3 text-xs text-gray-600 dark:text-gray-400 mt-2">
<span><strong>${p.location_count}</strong> location${p.location_count === 1 ? '' : 's'}</span>
<span><strong>${p.assignment_count}</strong> assignment${p.assignment_count === 1 ? '' : 's'}</span>
</div>
</a>`;
}
async function confirmMerge(idx, direction) {
const pair = _pairs[idx];
if (!pair) return;
let sourceId, targetId, sourceName, targetName;
if (direction === 'a_into_b') {
sourceId = pair.a.id; targetId = pair.b.id;
sourceName = pair.a.name; targetName = pair.b.name;
} else {
sourceId = pair.b.id; targetId = pair.a.id;
sourceName = pair.b.name; targetName = pair.a.name;
}
// Pull preview to surface conflicts / consolidation count BEFORE merging.
let preview;
try {
const r = await fetch(`/api/projects/${encodeURIComponent(sourceId)}/merge_preview?target_id=${encodeURIComponent(targetId)}`);
if (!r.ok) {
const err = await r.json().catch(() => ({detail: 'HTTP ' + r.status}));
throw new Error(err.detail || ('HTTP ' + r.status));
}
preview = await r.json();
} catch (e) {
alert('Preview failed: ' + e.message);
return;
}
const summary = [
`${preview.total_assignments_moving} assignment(s)`,
`${preview.total_sessions_moving} session(s)`,
`${preview.total_data_files_moving} data file(s)`,
].join(', ');
let consolidation = '';
// Tolerate a preview without location_plans instead of throwing.
const consolidates = (preview.location_plans || []).filter(p => p.action === 'consolidate').length;
if (consolidates > 0) {
consolidation = `\n\n${consolidates} same-named location(s) will be consolidated.`;
}
const ok = confirm(
`Merge "${sourceName}" into "${targetName}"?\n\n` +
`Will move: ${summary}.${consolidation}\n\n` +
`Source will be soft-deleted. This is reversible only via direct DB edit.`
);
if (!ok) return;
_showToast(`Merging "${sourceName}" → "${targetName}"…`);
try {
const r = await fetch(`/api/projects/${encodeURIComponent(sourceId)}/merge_into?target_id=${encodeURIComponent(targetId)}`, { method: 'POST' });
if (!r.ok) {
const err = await r.json().catch(() => ({detail: 'HTTP ' + r.status}));
throw new Error(err.detail || ('HTTP ' + r.status));
}
const d = await r.json();
_showToast(`Merged into "${targetName}"`,
`${d.assignments_moved} assignment(s), ${d.locations_moved + d.locations_consolidated} location(s)`,
'success');
_hideToast(3500);
// Re-scan: list updates without the merged pair.
await runScan();
} catch (e) {
_showToast('Merge failed', e.message, 'error');
_hideToast(5000);
}
}
function dismissPair(idx) {
// Just hide locally for now; doesn't persist across re-scans.
// A persistent "ignore pair" feature would need a new table; defer.
_pairs.splice(idx, 1);
render();
}
// Auto-scan on load with default threshold.
document.addEventListener('DOMContentLoaded', () => {
runScan();
});
</script>
{% endblock %}