fix: Optimize TaskQueueDB matching with composite indices and EXISTS by chrisburr · Pull Request #8462 · DIRACGrid/DIRAC

chrisburr · 2026-02-19T13:08:52Z

The __generateTQMatchSQL matching query uses COUNT subqueries to check whether multi-value tables (tq_TQTo*) have rows for a given TQId. This is inefficient because COUNT scans all matching rows even when we only need to know if any row exists, and the single-column TaskIndex on TQId requires a separate lookup to fetch the Value column.

This PR makes two changes:

- Composite (TQId, Value) indices on the multi-value tables, replacing the single-column TaskIndex on TQId. This makes the index a covering index for all subqueries that filter on both columns. The existing per-field {Field}Index on Value alone is kept for reverse lookups (e.g. BannedSites antijoin).
- EXISTS/NOT EXISTS instead of COUNT in __generateTQMatchSQL, __generateTagSQLSubCond, and __generateTQFindSQL. EXISTS short-circuits on the first matching row rather than counting all of them.
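
As a rough illustration of the COUNT to EXISTS change, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The parent table name tq_TaskQueues, the column types, and the sample rows are assumptions made for the example, and the two queries are simplified forms of a "this TQId has no rows in the multi-value table" check, not the SQL that __generateTQMatchSQL actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified schema (assumption, not the DIRAC definition).
    CREATE TABLE tq_TaskQueues (TQId INTEGER PRIMARY KEY);
    CREATE TABLE tq_TQToSites (TQId INTEGER NOT NULL, Value TEXT NOT NULL);
    CREATE INDEX idx_tqid_value ON tq_TQToSites (TQId, Value);
    INSERT INTO tq_TaskQueues VALUES (1), (2), (3);
    -- TQ 1 has no site rows, TQ 2 has one row, TQ 3 has two rows.
    INSERT INTO tq_TQToSites VALUES (2, 'SiteA'), (3, 'SiteA'), (3, 'SiteB');
    """
)

# COUNT form: every matching row is counted before comparing with 0.
count_sql = """
    SELECT tq.TQId FROM tq_TaskQueues tq
    WHERE ( SELECT COUNT(s.Value) FROM tq_TQToSites s WHERE s.TQId = tq.TQId ) = 0
    ORDER BY tq.TQId
"""

# EXISTS form: the subquery can stop at the first matching row.
exists_sql = """
    SELECT tq.TQId FROM tq_TaskQueues tq
    WHERE NOT EXISTS ( SELECT 1 FROM tq_TQToSites s WHERE s.TQId = tq.TQId )
    ORDER BY tq.TQId
"""

unrestricted_count = [row[0] for row in conn.execute(count_sql)]
unrestricted_exists = [row[0] for row in conn.execute(exists_sql)]
print(unrestricted_count, unrestricted_exists)
```

Both forms select the same task queues here; the difference is only in how much work the database may do per task queue, which this toy example does not measure.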

Together these drop __generateTQMatchSQL query time from ~30 ms to ~3 ms on a production-sized DB (confirmed via EXPLAIN ANALYZE).

Note: For existing deployments the schema change only takes effect on newly created tables (__initializeDB skips tables that already exist). Production DBs need a one-off ALTER TABLE to add the composite index.

BEGINRELEASENOTES
*WorkloadManagement
FIX: Optimize TaskQueueDB matching queries by adding composite (TQId, Value) indices and replacing COUNT subqueries with EXISTS/NOT EXISTS

ENDRELEASENOTES

The TaskIndex on multi-value tables (tq_TQTo*) previously only covered TQId, requiring a separate lookup for the Value column. Adding Value to the composite index makes it a covering index for all subqueries that filter on both columns, improving query performance significantly.
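
To make the covering-index idea concrete, here is a minimal sketch that again uses Python's sqlite3 module rather than MySQL. The column types and the literal values in the query are assumptions, and SQLite's planner is not MySQL's, so this only illustrates the concept: with an index on TQId alone the engine still has to visit the table row to read Value, whereas with (TQId, Value) the index alone can answer the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table definition (assumption).
conn.execute("CREATE TABLE tq_TQToTags (TQId INTEGER NOT NULL, Value TEXT NOT NULL)")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    # The human-readable plan text is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


query = "SELECT 1 FROM tq_TQToTags WHERE TQId = 7 AND Value = 'GPU'"

# Old layout: single-column index on TQId only.
conn.execute("CREATE INDEX TaskIndex ON tq_TQToTags (TQId)")
single_plan = plan(query)

# New layout: composite index on (TQId, Value).
conn.execute("DROP INDEX TaskIndex")
conn.execute("CREATE INDEX idx_tqid_value ON tq_TQToTags (TQId, Value)")
composite_plan = plan(query)

print("single-column:", single_plan)
print("composite:    ", composite_plan)
```

On MySQL the equivalent check would be an EXPLAIN on the real subquery; the PR description reports the production numbers from EXPLAIN ANALYZE.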

Replace COUNT-based subqueries with EXISTS/NOT EXISTS patterns in __generateTQMatchSQL, __generateTagSQLSubCond, and __generateTQFindSQL. EXISTS short-circuits on the first matching row instead of scanning all rows, which combined with the composite (TQId, Value) indices reduces matching query time from ~30ms to ~3ms on production.

chrisburr · 2026-02-19T13:18:19Z

@fstagni At the dops it's probably worth recommending people do this:

ALTER TABLE tq_TQToSites ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
ALTER TABLE tq_TQToPlatforms ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
ALTER TABLE tq_TQToJobTypes ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
ALTER TABLE tq_TQToBannedSites ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
ALTER TABLE tq_TQToGridCEs ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
ALTER TABLE tq_TQToTags ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;

Even on lbwms this ran in a few hundred milliseconds so it's pretty safe to do even in a production system.
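
For sites that script their migrations, the six statements above follow one pattern and could be generated along these lines. This is only a sketch: the function name and its parameters are hypothetical, and the table list is copied from the statements in this comment.

```python
# Multi-value tables, taken from the ALTER TABLE statements above.
MULTI_VALUE_TABLES = (
    "tq_TQToSites",
    "tq_TQToPlatforms",
    "tq_TQToJobTypes",
    "tq_TQToBannedSites",
    "tq_TQToGridCEs",
    "tq_TQToTags",
)


def migration_statements(tables=MULTI_VALUE_TABLES, new_index="idx_tqid_value", old_index="TaskIndex"):
    """Build one ALTER TABLE per table that adds the composite index and drops the old one."""
    return [
        f"ALTER TABLE {table} ADD INDEX {new_index} (TQId, Value), DROP INDEX {old_index};"
        for table in tables
    ]


statements = migration_statements()
for statement in statements:
    print(statement)
```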

aldbr

Not an expert in that part of the code, but given the description of the PR, the changes look good to me.

fstagni · 2026-02-19T15:26:23Z

src/DIRAC/WorkloadManagementSystem/DB/TaskQueueDB.py

-                sql2 = sql1 + f" AND {tableName}.Value={tagMatchList}"
-        sql = "( " + sql1 + " ) = (" + sql2 + " )"
+                valuesStr = tagMatchList
+            sql = f"NOT EXISTS ( SELECT 1 FROM {tableName} WHERE {tableName}.TQId=tq.TQId AND {tableName}.Value NOT IN ( {valuesStr} ) )"


My favorite LLM is telling me that:

When tagMatchList is empty, the new code accepts ANY task (with zero values), but the old code required tasks to ONLY contain empty string values.

Old logic:
(SELECT COUNT(all) WHERE id=tq.TQId) = (SELECT COUNT(...) AND value='')

New logic: NOT EXISTS (... WHERE value != '')
→ Accepts tasks with 0 values OR tasks where all values = ''

This changes semantics - might incorrectly accept tasks that should be rejected.

I would say this is OK. Any other ideas?

chrisburr · Feb 19, 2026

I think you need a smarter favorite LLM; that comment is claiming that 0 != 0.
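
The equivalence under discussion can be checked on a toy database. The sketch below uses Python's sqlite3 module instead of MySQL; the parent table tq_TaskQueues, the NOT NULL column definitions, and the sample rows are assumptions, and the "old" condition is reconstructed from the diff (the body of sql1 is not shown there, so a plain COUNT over the TQId is assumed, and IN is used in place of the single-value "=" branch).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified schema (assumption). Value is NOT NULL here;
    -- NULL values would need separate consideration with NOT IN.
    CREATE TABLE tq_TaskQueues (TQId INTEGER PRIMARY KEY);
    CREATE TABLE tq_TQToTags (TQId INTEGER NOT NULL, Value TEXT NOT NULL);
    INSERT INTO tq_TaskQueues VALUES (1), (2), (3), (4);
    -- TQ 1: no tags. TQ 2: 'GPU'. TQ 3: 'GPU' and 'MultiCore'. TQ 4: only ''.
    INSERT INTO tq_TQToTags VALUES (2, 'GPU'), (3, 'GPU'), (3, 'MultiCore'), (4, '');
    """
)


def old_condition(values_str):
    """COUNT(all rows) = COUNT(rows whose Value is allowed)."""
    sql1 = "SELECT COUNT(tq_TQToTags.Value) FROM tq_TQToTags WHERE tq_TQToTags.TQId=tq.TQId"
    sql2 = sql1 + f" AND tq_TQToTags.Value IN ( {values_str} )"
    return "( " + sql1 + " ) = ( " + sql2 + " )"


def new_condition(values_str):
    """No row exists whose Value is outside the allowed set."""
    return (
        "NOT EXISTS ( SELECT 1 FROM tq_TQToTags WHERE tq_TQToTags.TQId=tq.TQId "
        f"AND tq_TQToTags.Value NOT IN ( {values_str} ) )"
    )


def matching(condition):
    sql = f"SELECT tq.TQId FROM tq_TaskQueues tq WHERE {condition} ORDER BY tq.TQId"
    return [row[0] for row in conn.execute(sql)]


# The value lists are hard-coded literals for illustration only.
results = {}
for values_str in ("''", "'GPU'", "'GPU', 'MultiCore'"):
    results[values_str] = (matching(old_condition(values_str)), matching(new_condition(values_str)))
    print(values_str, results[values_str])
```

In this sketch a task queue with zero rows satisfies both forms (0 = 0 in the old one, nothing to find in the new one), so the empty-tag case selects the same task queues before and after the change.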

fstagni · 2026-02-19T15:33:09Z

> @fstagni At the dops it's probably worth recommending people do this:
>
> ALTER TABLE tq_TQToSites ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
> ALTER TABLE tq_TQToPlatforms ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
> ALTER TABLE tq_TQToJobTypes ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
> ALTER TABLE tq_TQToBannedSites ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
> ALTER TABLE tq_TQToGridCEs ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
> ALTER TABLE tq_TQToTags ADD INDEX idx_tqid_value (TQId, Value), DROP INDEX TaskIndex;
>
> Even on lbwms this ran in a few hundred milliseconds so it's pretty safe to do even in a production system.

I will do, but just for the sake of organization, I would:

- mark these as CHANGES in the release notes. I know they are "recommendations" and not mandatory changes, but still...
- create a 9.1.0 release at the next tagging

Like this:

- they will get into the release notes
- installing a minor release instead of a patch one will provide a decent reminder to check for changes in the release notes (something that I will advertise anyway)

chrisburr added 2 commits February 19, 2026 14:10

chrisburr force-pushed the fix/taskqueue-exists-optimization branch from 495abee to fe366e0 February 19, 2026 13:13

chrisburr marked this pull request as ready for review February 19, 2026 14:20

chrisburr requested review from atsareg and fstagni as code owners February 19, 2026 14:20

aldbr approved these changes Feb 19, 2026

fstagni reviewed Feb 19, 2026

chrisburr wants to merge 2 commits into DIRACGrid:integration from chrisburr:fix/taskqueue-exists-optimization
