feat: add variable substitution support for check definitions #1078
mwojtyczka merged 31 commits into databrickslabs:main
Pull request overview
Adds runtime variable substitution ({{ key }}) for metadata-defined checks so users can template rule definitions and resolve them by passing a variables dict at execution/validation time.
Changes:
- Introduces `apply_variables()` in `utils.py` to recursively substitute placeholders in all string fields, validate variable value types, and warn on unresolved placeholders.
- Extends `DQEngine`/`DQEngineCore` metadata entrypoints (`apply_checks_by_metadata*`, `validate_checks`) with an optional `variables` parameter and applies substitution before validation/deserialization.
- Adds unit + integration coverage for substitution behavior and engine plumbing.
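The substitution described in the first bullet can be illustrated with a minimal standalone sketch. It is not the PR's code: `substitute_placeholders` is a hypothetical name, and leaving an unresolved placeholder in place (with a logged warning) is an assumption about the behavior. The check dictionaries are sample data in the metadata style.

```python
import logging
import re
from typing import Any

logger = logging.getLogger(__name__)

# Matches {{ key }} with optional whitespace around the key.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute_placeholders(value: Any, variables: dict[str, Any]) -> Any:
    """Hypothetical helper: return a copy of `value` with placeholders resolved."""
    if isinstance(value, str):
        def _replace(match: re.Match) -> str:
            key = match.group(1)
            if key not in variables:
                # Assumption: unknown keys are left as-is and only warned about.
                logger.warning("Unresolved placeholder: %s", match.group(0))
                return match.group(0)
            return str(variables[key])

        # One pass over the string; substituted values are not re-scanned.
        return _PLACEHOLDER.sub(_replace, value)
    if isinstance(value, dict):
        return {k: substitute_placeholders(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_placeholders(v, variables) for v in value]
    return value  # non-string leaves pass through unchanged


checks = [
    {
        "criticality": "error",
        "check": {"function": "is_not_null", "arguments": {"column": "{{ id_col }}"}},
    },
    {
        "criticality": "warn",
        "check": {
            "function": "sql_expression",
            "arguments": {"expression": "{{ amount_col }} > {{ min_amount }}"},
        },
    },
]

resolved = substitute_placeholders(checks, {"id_col": "customer_id", "min_amount": 0})
```

Because new containers are built at every level, the input `checks` list is left unmodified, which matches the immutability the unit tests are said to cover.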
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/databricks/labs/dqx/utils.py` | Implements recursive placeholder substitution + scalar-only variable validation + unresolved-placeholder warnings. |
| `src/databricks/labs/dqx/engine.py` | Wires variables through metadata APIs and applies substitution before validation/deserialization. |
| `tests/unit/test_utils.py` | Adds focused unit tests for substitution correctness, immutability, warnings, and type validation. |
| `tests/integration/test_apply_checks.py` | Adds integration tests proving substitution works through metadata apply/split + validation paths. |
| `tests/integration/test_apply_checks_and_save_in_table.py` | Adds integration test proving substitution works when saving results to a table. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…ks_by_metadata_and_split, validate_checks to be consistent with the downstream implementation
✅ 641/641 passed, 36 skipped, 6h43m11s total. Running from acceptance #4239
ghanse left a comment
I think substitution should happen when the user loads checks. We should allow the user to pass variables to load_checks and delegate substitution to that method instead of adding to the apply_checks methods.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1078      +/-   ##
==========================================
- Coverage   92.02%   91.94%   -0.08%
==========================================
  Files          98       98
  Lines        9093     9140      +47
==========================================
+ Hits         8368     8404      +36
- Misses        725      736      +11
Code review feedback implementation Co-authored-by: Marcin Wojtyczka <marcin.wojtyczka@databricks.com>
mwojtyczka left a comment
This must work for the `save_checks` method too. Currently saving errors out. We need to resolve vars during saving. I'm updating accordingly.
Changes
Add variable substitution support for check definitions. This allows users to define reusable check templates with `{{ placeholder }}` syntax and resolve them prior to execution by passing a `variables` dictionary at load time.

New functionality:
- `resolve_variables()` in `utils.py`: recursively replaces `{{ key }}` placeholders in all string values of check definitions using an efficient single-pass substitution.
- `variables` parameter on `load_checks()` and `load_checks_from_local_file()`. This delegates the templating to the data-loading layer so the engine always executes clean and self-contained rules.
- `ExtraParams`: The `DQEngine` now accepts default variables via `ExtraParams`. These are applied to all checks loaded by that engine instance unless overridden in a specific `load_checks` call.
- Supports scalar variable values (`str`, `int`, `float`, `bool`, `Decimal`, `datetime.date`, `datetime.datetime`, `datetime.time`).
- Rejects `list`, `dict`, `set`, `tuple` and `None` with an explicit `InvalidParameterError`.

Example usage:
Using `load_checks` from a file or table:
If building checks programmatically using raw dictionaries:
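A minimal standalone sketch of the type validation and default/override behavior listed under "New functionality", not the PR's own example. `merge_and_validate` and `InvalidVariableError` are hypothetical stand-ins (the latter for the library's `InvalidParameterError`), and merging engine defaults with per-call variables key by key is an assumption about how the override works.

```python
import datetime
from decimal import Decimal
from typing import Any, Optional

# Scalar types named in the PR description.
ALLOWED_TYPES = (
    str, int, float, bool, Decimal,
    datetime.date, datetime.datetime, datetime.time,
)


class InvalidVariableError(ValueError):
    """Hypothetical stand-in for the library's InvalidParameterError."""


def merge_and_validate(
    defaults: dict[str, Any], overrides: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Combine engine-level defaults with per-call variables, then type-check."""
    # Assumption: per-call values win over defaults, one key at a time.
    merged = {**defaults, **(overrides or {})}
    for key, value in merged.items():
        # None, list, dict, set and tuple all fail this isinstance check.
        if not isinstance(value, ALLOWED_TYPES):
            raise InvalidVariableError(
                f"Variable '{key}' has unsupported type {type(value).__name__}"
            )
    return merged


engine_defaults = {"catalog": "main", "min_amount": 0}
variables = merge_and_validate(engine_defaults, {"min_amount": Decimal("10.5")})
# variables == {"catalog": "main", "min_amount": Decimal("10.5")}
```

Validating after the merge means a bad default is reported even when the caller passes no variables of their own.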
Linked issues
Resolves #967
Tests