fix(consolidation): prevent orphan observations when source deleted mid-consolidation (#1090)
nicoloboschi merged 1 commit into main on Apr 16, 2026
Consolidation reads a source memory, calls an LLM for several seconds, then writes an observation referencing that source. If the source memory was hard-deleted during the LLM call, the observation landed referencing a now-missing uuid — the delete's stale-observation sweep had already run and could not see the not-yet-inserted row. `source_memory_ids` is a `uuid[]`, so Postgres cannot cascade through it, making this manual cleanup necessary.

Two coordinated changes close the race:

- The consolidator filters `source_memory_ids` against live rows with `SELECT ... FOR SHARE` inside the same transaction as the INSERT/UPDATE, dropping any id whose row has already been deleted and blocking concurrent deletes until the write commits. It skips the create/update entirely when no live sources remain.
- Delete paths (`delete_memory_unit`, `delete_document`, `delete_bank` by fact_type) now DELETE the source rows first and run the stale-observation sweep afterwards, so any observation that was inserted concurrently is also caught by the sweep under READ COMMITTED.

Adds three regression tests exercising the consolidator helpers directly with mixed live/dead and all-dead `source_memory_ids`.
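The live-source filter might look roughly like this. This is a minimal sketch, not the PR's code: the table and column names (`memory_units`, `id`) come from the description above, but the function name, SQL text, and psycopg-style cursor interface are assumptions.

```python
import uuid

# Hypothetical SQL for the live-source filter. FOR SHARE row-locks the
# surviving source rows, so a concurrent hard-delete of one of them
# blocks until the observation write in this same transaction commits.
FILTER_LIVE_SOURCES_SQL = """
    SELECT id FROM memory_units
    WHERE id = ANY(%(ids)s)
    FOR SHARE
"""


def filter_live_source_memories(cur, source_memory_ids: list[uuid.UUID]) -> list[uuid.UUID]:
    """Drop ids whose memory_units row has already been deleted.

    Must run on the same transaction/cursor that performs the
    observation INSERT/UPDATE; otherwise the row locks are released
    before the write and the race reopens.
    """
    cur.execute(FILTER_LIVE_SOURCES_SQL, {"ids": source_memory_ids})
    live = {row[0] for row in cur.fetchall()}
    # Preserve the caller's ordering of source_memory_ids.
    return [mid for mid in source_memory_ids if mid in live]
```

The caller then checks whether the returned list is empty and skips the create/update entirely if so.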
Summary
Closes a race between consolidation and memory deletion where an observation could be written referencing an already-deleted source memory.
The race: consolidation fetches source memory X, runs an LLM for several seconds, then inserts an observation with `source_memory_ids = [X, ...]`. If X is hard-deleted during the LLM call, the delete path's stale-observation sweep has already run and cannot see the not-yet-inserted row, leaving an orphan. `source_memory_ids` is a `uuid[]` column, so Postgres cannot cascade through it — cleanup is manual by necessity, which is why this is a latent bug in the current design.

Fix (two coordinated changes):

- consolidator.py: a new `_filter_live_source_memories` helper runs `SELECT ... FOR SHARE` against `memory_units` inside the same transaction as the INSERT/UPDATE, dropping uuids whose rows have been deleted and blocking concurrent deletes until the write commits. `_create_observation_directly` and `_execute_update_action` call it before writing; if no live sources remain, the action is skipped (`{"action": "skipped", "reason": "sources_deleted"}` for creates; a no-op return for updates).
- memory_engine.py: `delete_memory_unit`, `delete_document`, and `delete_bank(fact_type=...)` now DELETE the source rows before calling `_delete_stale_observations_for_memories`, so under READ COMMITTED the post-delete sweep also catches any observation a concurrent consolidation committed in the window.

Together these give us: either the consolidator sees the source gone and filters it, or the delete sees the new observation and sweeps it.
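The delete-side ordering can be sketched as follows. This is illustrative only: the function bodies, SQL, and the `&&` array-overlap sweep condition are assumptions; only the delete-before-sweep ordering and the names `delete_memory_unit` / `_delete_stale_observations_for_memories` come from the PR.

```python
def delete_stale_observations_for_memories(cur, memory_ids):
    # Stand-in for the sweep: remove observations that still reference
    # any of the deleted source ids (real query lives in memory_engine.py).
    cur.execute(
        "DELETE FROM observations WHERE source_memory_ids && %s",
        (memory_ids,),
    )


def delete_memory_unit(cur, memory_id):
    # The source row goes FIRST: a consolidation holding FOR SHARE on
    # this row blocks the DELETE until its observation write commits,
    # so under READ COMMITTED the sweep below can then see that freshly
    # committed observation and remove it too.
    cur.execute("DELETE FROM memory_units WHERE id = %s", (memory_id,))
    delete_stale_observations_for_memories(cur, [memory_id])
```

Reversing the two statements would reintroduce the race: the sweep could run before a concurrent consolidation commits its observation, leaving the orphan behind.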
Three regression tests in `test_observation_invalidation.py` exercise the consolidator helpers directly:

- mixed live/dead `source_memory_ids` → observation created with only live sources
- all-dead `source_memory_ids` → create skipped, no row written

Test plan

- `uv run pytest tests/test_observation_invalidation.py` — 8/8 pass locally (3 new + 5 existing)
- `./scripts/hooks/lint.sh` — all passed
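The two filtering cases above can be expressed in this shape. This is a hypothetical, self-contained illustration: the real tests call the consolidator helpers against a database fixture, while `filter_live` here is a pure-Python stand-in for the FOR SHARE filter.

```python
def filter_live(source_ids, live_ids):
    # Stand-in for the FOR SHARE filter: keep only surviving sources,
    # preserving order.
    return [s for s in source_ids if s in live_ids]


def test_mixed_live_dead_sources():
    # An observation is created referencing only the live sources.
    kept = filter_live(["a", "b", "c"], live_ids={"a", "c"})
    assert kept == ["a", "c"]  # dead id "b" is dropped


def test_all_dead_sources_skips_create():
    # With no live sources left, the create is skipped entirely.
    kept = filter_live(["b"], live_ids=set())
    action = (
        {"action": "skipped", "reason": "sources_deleted"} if not kept else None
    )
    assert action == {"action": "skipped", "reason": "sources_deleted"}
```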