fix(queuefs): dedupe memory semantic parent enqueues (#769) #792
Protocol-zero-0 wants to merge 1 commit into volcengine:main
Conversation
…olcengine#769) Repeated memory writes enqueue the same parent directory for `.overview`/`.abstract` regeneration, causing redundant VLM work. Skip duplicate enqueues for the same (account, user, agent, uri) within 45s; resource/session semantic paths are unchanged. Fixes volcengine#769. Made-with: Cursor
Pull request overview
This PR mitigates excessive directory-level semantic recomputation for memory writes by coalescing repeated enqueues targeting the same parent directory within a short time window, addressing issue #769’s cost amplification on large memory folders.
Changes:
- Added a 45s in-process dedupe window for `context_type == "memory"` in `SemanticQueue.enqueue`, returning `"deduplicated"` when coalesced.
- Implemented a softly bounded in-memory tracking map with periodic stale-key pruning.
- Added pytest coverage to ensure same-parent memory enqueues are deduped while other URIs/context types are not.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `openviking/storage/queuefs/semantic_queue.py` | Adds memory-only enqueue deduplication with locking and periodic pruning. |
| `tests/storage/test_semantic_queue_memory_dedupe.py` | Adds tests validating dedupe behavior for memory context and non-dedupe for other cases. |
```python
if msg.context_type == "memory":
    key = self._memory_parent_semantic_key(msg)
    now = time.monotonic()
    with self._memory_parent_semantic_lock:
        last = self._memory_parent_semantic_last.get(key, 0.0)
        if now - last < _MEMORY_PARENT_SEMANTIC_DEDUPE_SEC:
            logger.debug(
                "[SemanticQueue] Skipping duplicate memory semantic enqueue for %s "
                "(within %.0fs dedupe window; see #769)",
                msg.uri,
                _MEMORY_PARENT_SEMANTIC_DEDUPE_SEC,
            )
            return "deduplicated"
```
`enqueue()` is async but uses a `threading.Lock()` critical section. Under contention this can block the event loop (and it currently includes a `logger.debug(...)` call inside the lock). Consider using an `asyncio.Lock` if this is only accessed from the event loop, or otherwise minimizing the locked region (compute the decision under the lock, then log/return after releasing).
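The reviewer's suggestion can be sketched as follows. This is a minimal illustration with assumed names (`DedupeSketch`, `should_dedupe`), not the PR's actual code: the dedupe decision and timestamp update happen under an `asyncio.Lock`, and any logging would happen after the lock is released.

```python
import asyncio
import time

_DEDUPE_SEC = 45.0  # matches the PR's 45s window


class DedupeSketch:
    """Sketch of the review suggestion: decide under an asyncio.Lock,
    then log/return after releasing. Names here are illustrative."""

    def __init__(self) -> None:
        # asyncio.Lock suspends the coroutine instead of blocking the event loop.
        self._lock = asyncio.Lock()
        self._last: dict = {}  # dedupe key -> last-enqueue monotonic timestamp

    async def should_dedupe(self, key) -> bool:
        now = time.monotonic()
        async with self._lock:
            last = self._last.get(key)
            duplicate = last is not None and (now - last) < _DEDUPE_SEC
            if not duplicate:
                self._last[key] = now
        # Any logger.debug(...) call would go here, outside the locked region.
        return duplicate
```

A first call for a key returns `False` (enqueue proceeds), and a second call for the same key inside the window returns `True` (skip).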
```python
if len(self._memory_parent_semantic_last) > 2000:
    cutoff = now - (_MEMORY_PARENT_SEMANTIC_DEDUPE_SEC * 4)
    stale = [k for k, t in self._memory_parent_semantic_last.items() if t < cutoff]
    for k in stale[:800]:
        self._memory_parent_semantic_last.pop(k, None)
```
The stale-key pruning path does an O(n) scan of the entire `_memory_parent_semantic_last` dict (`stale = [...]`) while holding the lock, and the map is only softly bounded (it can grow well beyond 2000 if the enqueue rate is high within the cutoff window). This can become a hot-path bottleneck during bursts. Consider switching to a real bounded TTL/LRU structure (e.g., a deque of `(ts, key)` plus a dict, or a TTL cache) and pruning incrementally so per-enqueue work stays O(1) amortized.
Summary
Fixes volcengine/OpenViking#769.
Each memory write can enqueue the same parent directory for directory-level semantic regeneration (overview/abstract markdown). With no coalescing, rapid writes cause repeated expensive VLM work on large memory folders.
Change
In `SemanticQueue.enqueue`, for `context_type == "memory"` only, skip enqueueing when the same dedupe key `(account_id, user_id, agent_id, uri)` was enqueued within the last 45 seconds. Return `"deduplicated"` (no backend write) in that case; resource and other context types are unchanged.
Tests
`tests/storage/test_semantic_queue_memory_dedupe.py`: same parent twice → one `NamedQueue.enqueue`; different URIs / `resource` → no dedupe.
Trade-off
Directory-level semantics may lag up to one window after burst writes; matches the short-term mitigation described in the issue.