Skip to content

fix(queuefs): dedupe memory semantic parent enqueues (#769)#792

Open
Protocol-zero-0 wants to merge 1 commit intovolcengine:mainfrom
Protocol-zero-0:fix/memory-semantic-parent-dedupe
Open

fix(queuefs): dedupe memory semantic parent enqueues (#769)#792
Protocol-zero-0 wants to merge 1 commit intovolcengine:mainfrom
Protocol-zero-0:fix/memory-semantic-parent-dedupe

Conversation

@Protocol-zero-0
Copy link

@Protocol-zero-0 Protocol-zero-0 commented Mar 19, 2026

Summary

Fixes volcengine/OpenViking#769.

Each memory write can enqueue the same parent directory for directory-level semantic regeneration (overview/abstract markdown). With no coalescing, rapid writes cause repeated expensive VLM work on large memory folders.

Change

  • In SemanticQueue.enqueue, for context_type == "memory" only, skip enqueueing when the same dedupe key (account_id, user_id, agent_id, uri) was enqueued within the last 45 seconds.
  • Return "deduplicated" (no backend write) in that case; resource and other context types are unchanged.
  • Bounded in-memory map with occasional stale-key pruning.

Tests

  • tests/storage/test_semantic_queue_memory_dedupe.py: same parent twice → one NamedQueue.enqueue; different URIs / resource → no dedupe.

Trade-off

Directory-level semantics may lag up to one window after burst writes; matches the short-term mitigation described in the issue.

…olcengine#769)

Repeated memory writes enqueue the same parent directory for .overview/.abstract
regeneration, causing redundant VLM work. Skip duplicate enqueues for the same
(account, user, agent, uri) within 45s; resource/session semantic paths unchanged.

Fixes volcengine#769

Made-with: Cursor
Copilot AI review requested due to automatic review settings March 19, 2026 21:53
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR mitigates excessive directory-level semantic recomputation for memory writes by coalescing repeated enqueues targeting the same parent directory within a short time window, addressing issue #769’s cost amplification on large memory folders.

Changes:

  • Added a 45s in-process dedupe window for context_type == "memory" in SemanticQueue.enqueue, returning "deduplicated" when coalesced.
  • Implemented a bounded-ish in-memory tracking map with periodic stale-key pruning.
  • Added pytest coverage to ensure same-parent memory enqueues are deduped while other URIs/context types are not.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
openviking/storage/queuefs/semantic_queue.py Adds memory-only enqueue deduplication with locking and periodic pruning.
tests/storage/test_semantic_queue_memory_dedupe.py Adds tests validating dedupe behavior for memory context and non-dedupe for other cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +34 to +46
if msg.context_type == "memory":
key = self._memory_parent_semantic_key(msg)
now = time.monotonic()
with self._memory_parent_semantic_lock:
last = self._memory_parent_semantic_last.get(key, 0.0)
if now - last < _MEMORY_PARENT_SEMANTIC_DEDUPE_SEC:
logger.debug(
"[SemanticQueue] Skipping duplicate memory semantic enqueue for %s "
"(within %.0fs dedupe window; see #769)",
msg.uri,
_MEMORY_PARENT_SEMANTIC_DEDUPE_SEC,
)
return "deduplicated"
Copy link

Copilot AI Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enqueue() is async but uses a threading.Lock() critical section. Under contention this can block the event loop (and it currently includes a logger.debug(...) call inside the lock). Consider using an asyncio.Lock if this is only accessed from the event loop, or otherwise minimizing the locked region (compute the decision under lock, then log/return after releasing).

Copilot uses AI. Check for mistakes.
Comment on lines +48 to +52
if len(self._memory_parent_semantic_last) > 2000:
cutoff = now - (_MEMORY_PARENT_SEMANTIC_DEDUPE_SEC * 4)
stale = [k for k, t in self._memory_parent_semantic_last.items() if t < cutoff]
for k in stale[:800]:
self._memory_parent_semantic_last.pop(k, None)
Copy link

Copilot AI Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stale-key pruning path does an O(n) scan of the entire _memory_parent_semantic_last dict (stale = [...]) while holding the lock, and the map is only softly bounded (it can grow well beyond 2000 if enqueue rate is high within the cutoff window). This can become a hot-path bottleneck during bursts. Consider switching to a real bounded TTL/LRU structure (e.g., a deque of (ts,key) + dict, or a TTL cache) and pruning incrementally so per-enqueue work stays O(1) amortized.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

Bug: repeated parent-directory semantic recomputation on each new memory write

5 participants