5c41bd48d1
Two bugs surfacing in the log during live play:

- SUMMARY_BACKEND=mi50 (llama.cpp, 32B) was fed 24k-char chunks →
  "Context size has been exceeded". Chunk budget is now backend-aware:
  cloud 24k, local/mi50 8k, and the merge step recurses so merged
  partials never overflow either.
- maybe_summarize ran inline in the chat turn and retried 4× with
  backoff (~30s), stalling the reply and surfacing the error. It now
  runs in a background daemon thread, swallows errors (consolidation is
  best-effort maintenance), and dedupes so at most one summary per
  session runs at a time.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>