Skip to content

Conversation

@hvitved
Copy link
Contributor

@hvitved hvitved commented Dec 19, 2025

Refactors existing logic, and also prioritizes specific trait implementation models over generic trait models.

@github-actions github-actions bot added the Rust Pull requests that update Rust code label Dec 19, 2025
@hvitved hvitved force-pushed the rust/mad-provenance-refactor branch 2 times, most recently from 868e36f to 8ffed9d Compare December 19, 2025 14:16
@hvitved hvitved force-pushed the rust/mad-provenance-refactor branch from 8ffed9d to f23351a Compare December 19, 2025 18:36
@hvitved hvitved added the no-change-note-required This PR does not need a change note label Dec 19, 2025
@hvitved hvitved marked this pull request as ready for review December 19, 2025 18:38
@hvitved hvitved requested a review from a team as a code owner December 19, 2025 18:38
Copilot AI review requested due to automatic review settings December 19, 2025 18:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Models as Data (MaD) provenance-based filtering logic in the Rust dataflow analysis, with the key goal of prioritizing specific trait implementation models over generic trait models.

Key Changes:

  • Introduced a new filtering mechanism (summaryModelRelevant) that distinguishes between exact models (specific implementations) and inexact models (trait models), preventing unnecessary duplication of flow steps
  • Simplified the callable resolution logic in DataFlowImpl.qll by removing complex filtering that is now handled in ModelsAsData.qll
  • Added specific manual models for String::Add operations to override generic trait models and improve analysis precision

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
rust/ql/lib/codeql/rust/dataflow/internal/ModelsAsData.qll Core refactoring - introduces summaryModel and summaryModelRelevant predicates to filter models based on exactness (trait vs specific implementation) and provenance (manual vs generated)
rust/ql/lib/codeql/rust/dataflow/internal/DataFlowImpl.qll Simplifies viableCallable by delegating model filtering to ModelsAsData.qll and improves toString() for summarized callables with [summarized] prefix
rust/ql/lib/codeql/rust/frameworks/stdlib/alloc.model.yml Adds specific manual models for String as Add trait to override generated generic trait models
rust/ql/test/library-tests/dataflow/models/models.ext.yml Adds test model for <core::i32 as core::cmp::PartialOrd>::lt to verify specific implementations override trait models
rust/ql/test/library-tests/dataflow/models/main.rs Adds test case for i32 comparison to verify new filtering behavior
rust/ql/test/library-tests/dataflow/strings/main.rs Updates test expectation - removes SPURIOUS flow annotation, confirming the refactoring fixed a false positive
Test expectation files (.expected) Updates model IDs and edge numbering reflecting the reduced number of duplicate models being applied
rust/ql/test/library-tests/dataflow/global/viableCallable.expected Updates to show [summarized] prefix for better clarity in test output

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// only apply trait models to concrete implementations when they are not
// defined in source code, and when there are no exact model for the functions
isExact = false and
not summaryModel(f, _, _, _, provenance, true, _) and
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition checks for the absence of an exact model with the same provenance. However, this means a generated trait model could still be applied even when a manual specific implementation model exists (since they have different provenance values). Based on the comment stating "when there are no exact model for the functions" (line 148), this should likely check for any exact model regardless of provenance: not summaryModel(f, _, _, _, _, true, _)

Suggested change
not summaryModel(f, _, _, _, provenance, true, _) and
not summaryModel(f, _, _, _, _, true, _) and

Copilot uses AI. Check for mistakes.
isExact = true
or
// only apply trait models to concrete implementations when they are not
// defined in source code, and when there are no exact model for the functions
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states "when there are no exact model for the functions" but should be clearer about whether this means "no exact model with any provenance" or "no exact model with the same provenance". Consider rephrasing for clarity.

Suggested change
// defined in source code, and when there are no exact model for the functions
// defined in source code, and when there is no exact model with the same provenance for the function

Copilot uses AI. Check for mistakes.
provenance.isManual()
or
// only apply generated models to functions not defined in source code, and
// when there are no exact manual models for the functions
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses plural "functions" but should use singular "function" since it refers to a specific function f.

Copilot uses AI. Check for mistakes.
isExact = true
or
// only apply trait models to concrete implementations when they are not
// defined in source code, and when there are no exact model for the functions
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses plural "functions" but should use singular "function" since it refers to a specific function f.

Suggested change
// defined in source code, and when there are no exact model for the functions
// defined in source code, and when there is no exact model for the function

Copilot uses AI. Check for mistakes.
@hvitved hvitved requested a review from paldepind January 6, 2026 08:20
Copy link
Contributor

@paldepind paldepind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Copilot have some thoughts on the grammar that seems correct.

private class SummarizedCallableFromModel extends SummarizedCallable::Range {
private string path;
private predicate summaryModel(
Function f, string input, string output, string kind, Provenance provenance, boolean isExact,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about isDirect or isInherited (with opposite value) instead of "exact"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for isExact, since that is what other languages use, and will be shared once #21051 is ready.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, but maybe that's an additional reason to get the naming right? :)

To me isExact sounds like something about the accuracy of the model itself and where the opposite would be an "approximate" model. The suggested names (and isInherited in particular) sounds more like what this is actually about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, I have changed it to isInherited.

Comment on lines 134 to 152
exists(boolean isExact | summaryModel(f, input, output, kind, provenance, isExact, madId) |
(
provenance.isManual()
or
// only apply generated models to functions not defined in source code, and
// when there are no exact manual models for the functions
provenance.isGenerated() and
not any(Provenance manual | summaryModel(f, _, _, _, manual, true, _)).isManual() and
not f.fromSource()
) and
(
isExact = true
or
// only apply trait models to concrete implementations when they are not
// defined in source code, and when there are no exact model for the functions
isExact = false and
not summaryModel(f, _, _, _, provenance, true, _) and
not f.fromSource()
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this to avoid a bit of repetition?

Suggested change
exists(boolean isExact | summaryModel(f, input, output, kind, provenance, isExact, madId) |
(
provenance.isManual()
or
// only apply generated models to functions not defined in source code, and
// when there are no exact manual models for the functions
provenance.isGenerated() and
not any(Provenance manual | summaryModel(f, _, _, _, manual, true, _)).isManual() and
not f.fromSource()
) and
(
isExact = true
or
// only apply trait models to concrete implementations when they are not
// defined in source code, and when there are no exact model for the functions
isExact = false and
not summaryModel(f, _, _, _, provenance, true, _) and
not f.fromSource()
)
exists(boolean isExact | summaryModel(f, input, output, kind, provenance, isExact, madId) |
// Only apply generated or inherited models to functions in library code and
// when no strictly better model exist
if provenance.isGenerated() or isExact = false
then
not f.fromSource() and
not exists(Provenance provenance2 | summaryModel(f, _, _, _, provenance2, true, _) |
provenance.isGenerated() and provenance2.isManual()
or
provenance = provenance2 and isExact = false
)
else any()
)

@hvitved hvitved requested a review from paldepind January 6, 2026 12:30
@hvitved hvitved force-pushed the rust/mad-provenance-refactor branch from e4865b8 to 646a932 Compare January 6, 2026 12:34
@hvitved hvitved force-pushed the rust/mad-provenance-refactor branch from 646a932 to da6d0ab Compare January 6, 2026 13:35
Copy link
Contributor

@paldepind paldepind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR 👍 and for considering my comments :)

@hvitved hvitved merged commit 1a2f722 into github:main Jan 6, 2026
19 checks passed
@hvitved hvitved deleted the rust/mad-provenance-refactor branch January 6, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note Rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants