perf: incremental era rebuilds — skip unchanged months

rebuild_eras() re-digested EVERY month from scratch on every coherence pass,
including old months whose sessions never change: ~17 redundant 32B calls per pass
(a big slice of the ~40-min consolidation grind + MI50 heat). Now it compares each
month's current session count to the count stored with its era and rebuilds only
the months where the two differ (force=True still rebuilds all). The returned
report gains built/skipped counts; months is now built + skipped.

test_era.py: builds all first pass, skips unchanged, rebuilds only a month that
gained a session, force rebuilds all. Suite 99 green, ruff clean.
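
The four scenarios listed above can be sketched roughly as follows. This is not the contents of test_era.py, which this page does not show. It is a self-contained stand-in that assumes an in-memory FakeStore and a digest function passed in as an argument (both hypothetical), so the skip logic can be exercised without a backend. The month keys and gists are made-up sample data.

```python
from collections import defaultdict


class FakeStore:
    """Hypothetical in-memory stand-in for the real memory module."""

    def __init__(self):
        self.sessions = defaultdict(list)  # month -> list of session gists
        self.eras = {}                     # month -> (digest, session_count)


def rebuild_eras(store, digest_fn, force=False):
    """Stand-in that mirrors the count comparison, with dependencies injected."""
    built = skipped = 0
    for month in sorted(store.sessions):
        n = len(store.sessions[month])
        stored = store.eras.get(month)
        if not force and stored is not None and stored[1] == n:
            skipped += 1  # same session count as last build: keep the digest
            continue
        store.eras[month] = (digest_fn(store.sessions[month]), n)
        built += 1
    return {"built": built, "skipped": skipped, "months": built + skipped}


def run_scenarios():
    """Run the four passes and return each report plus the digest call count."""
    store = FakeStore()
    store.sessions["2026-04"] = ["a", "b"]  # sample data, not real sessions
    store.sessions["2026-05"] = ["c"]
    calls = []

    def digest_fn(gists):
        calls.append(list(gists))  # record every (expensive) digest call
        return " | ".join(gists)

    first = rebuild_eras(store, digest_fn)               # nothing stored yet
    second = rebuild_eras(store, digest_fn)              # nothing changed
    store.sessions["2026-05"].append("d")                # one month gains a session
    third = rebuild_eras(store, digest_fn)
    forced = rebuild_eras(store, digest_fn, force=True)  # ignore stored counts
    return first, second, third, forced, len(calls)
```

Counting digest calls, rather than only checking the report, is what shows that the skipped months cost no LLM work.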

(Profile rebuild re-reading all 851 sessions every pass is the bigger remaining
hog — separate, harder fix.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 03:31:02 +00:00
parent 51c2d6abb9
commit d6f3516a34
2 changed files with 58 additions and 7 deletions
+14 -7
@@ -54,17 +54,24 @@ def _digest_month(gists: list[str], backend: Backend) -> str:
     return partials[0]
-def rebuild_eras(backend: Backend | None = None) -> dict:
-    """(Re)build a digest for every month that has session gists."""
+def rebuild_eras(backend: Backend | None = None, force: bool = False) -> dict:
+    """Build a digest per month, but only for months whose session count changed since
+    the last build — old months don't change, so re-digesting them every consolidation
+    pass was pure wasted LLM work (and MI50 heat). `force=True` rebuilds everything."""
     backend = backend or config.load().summary_backend
     by_month = memory.summaries_by_month()
-    months = 0
+    have = {e.month: e.session_count for e in memory.list_eras()}
+    built = skipped = 0
     for month in sorted(by_month):
+        n = len(by_month[month])
+        if not force and have.get(month) == n:
+            skipped += 1
+            continue  # unchanged month — keep its existing digest
         digest = _digest_month(by_month[month], backend)
-        memory.store_era(month, digest, len(by_month[month]))
-        months += 1
-        logbus.log("info", "era built", month=month, sessions=len(by_month[month]))
-    report = {"months": months}
+        memory.store_era(month, digest, n)
+        built += 1
+        logbus.log("info", "era built", month=month, sessions=n)
+    report = {"built": built, "skipped": skipped, "months": built + skipped}
     logbus.log("info", "eras complete", **report)
     return report
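
The skip check in rebuild_eras() can also be isolated as a pure function, which gives a way to preview which months a pass would re-digest without calling the backend. A minimal sketch follows. plan_era_rebuild and its two dict arguments are hypothetical names, not part of this commit. They stand in for the per-month lengths from memory.summaries_by_month() and the session_count values from memory.list_eras().

```python
def plan_era_rebuild(
    current_counts: dict[str, int],
    stored_counts: dict[str, int],
    force: bool = False,
) -> tuple[list[str], list[str]]:
    """Split months into (to_build, to_skip) using the same count comparison."""
    to_build: list[str] = []
    to_skip: list[str] = []
    for month in sorted(current_counts):
        # A month with no stored era yields None here, which never equals a count.
        if not force and stored_counts.get(month) == current_counts[month]:
            to_skip.append(month)
        else:
            to_build.append(month)
    return to_build, to_skip


# Sample data (made up): one unchanged month, one that grew, one brand new.
current = {"2026-04": 12, "2026-05": 9, "2026-06": 3}
stored = {"2026-04": 12, "2026-05": 8}
to_build, to_skip = plan_era_rebuild(current, stored)
```

Because only counts are compared, a month whose sessions changed while the count stayed the same would land in to_skip. As far as the hunk shows, force=True is the way to refresh such a month.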