Skip to content

Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix#16245

Merged
amogh-jahagirdar merged 2 commits intoapache:mainfrom
stevenzwu:backport-15512-spark-wap-branch-delete
May 7, 2026
Merged

Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix#16245
amogh-jahagirdar merged 2 commits intoapache:mainfrom
stevenzwu:backport-15512-spark-wap-branch-delete

Conversation

@stevenzwu
Copy link
Copy Markdown
Contributor

@stevenzwu stevenzwu commented May 7, 2026

Backport of #15512 to Spark v3.4, v3.5, and v4.0.

Bug

When WAP (Write-Audit-Publish) is enabled via spark.wap.branch,
canDeleteWhere() and deleteWhere() could target different branches:

  • canDeleteWhere() scanned the table identifier branch (null → main),
    because the WAP branch is only a session config and not part of the
    identifier.
  • deleteWhere() resolved the WAP branch before committing.

This could cause canDeleteWhere() to incorrectly approve a
metadata-only delete based on data that was never on the WAP branch,
surfacing at commit time as:

ValidationException: Cannot delete file where some, but not all, rows match filter

Not a clean backport

The v4.1 fix relies on APIs that don't exist in older versions, so it had to be adapted. Two real differences:

  1. SparkTableUtil.determineReadBranch doesn't exist in v3.4/v3.5/v4.0. It was added in v4.1 only by Spark 4.1: Align handling of branches in reads and writes #15288 ("Spark 4.1: Align handling of branches in reads and writes"), which restructured read/write branch resolution to share logic, support option-based branches, and add validateReadBranch/validateWriteBranch. That refactor touched 19 files and is v4.1-only, so backporting it as a prerequisite isn't appropriate. The equivalent logic this fix needs is inlined here as a small private scanBranchForDelete() in SparkTable.java. The behavior matches v4.1 for the scenarios these versions support (the inlined version skips the option-based branch resolution because older versions don't have option-based branches on SparkTable in the first place).

  2. SnapshotUtil.schemaFor(table(), branch) call. In v4.1's canDeleteUsingMetadata, the StrictMetricsEvaluator uses a class-level schema field — it doesn't take branch at all. In v3.4/v3.5/v4.0, that line is new StrictMetricsEvaluator(SnapshotUtil.schemaFor(table(), branch), deleteExpr). I changed branchscanBranch there too, so the schema lookup is consistent with the scan ref. v4.1's diff doesn't touch that line because it doesn't exist in v4.1.

Cosmetic: the new tests use spark.conf().set/unset directly (matching v4.1's PR) instead of the withSQLConf(...) lambda style that older WAP tests in this file use — withSQLConf takes a Runnable, which can't throw the checked NoSuchTableException that append(...) declares.

The fix in deleteWhere itself didn't need backporting — older versions already call determineWriteBranch there. The bug was only on the canDeleteWhere read path.

Test plan

  • compileJava and compileTestJava pass for v3.4, v3.5, v4.0
  • CI runs the new tests in each version's spark-extensions module:
    • TestDelete#testDeleteToWapBranchCanDeleteWhereScansWapBranch
    • TestDelete#testMetadataDeleteToWapBranchCommitsToWapBranch

🤖 Generated with Claude Code

…lete fix

When WAP is enabled via spark.wap.branch, canDeleteWhere() previously
scanned the main branch while deleteWhere() committed to the WAP branch.
This could cause canDeleteWhere() to incorrectly approve a metadata-only
delete based on data that was never on the WAP branch, surfacing as
"Cannot delete file where some, but not all, rows match filter" at
commit time.

Resolve the scan branch the same way deleteWhere resolves the write
branch (with a fall-back to main when the WAP branch has not been
created yet), and pass it through canDeleteUsingMetadata.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iceberg style requires an empty line between a control flow block and
the following statement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@aihuaxu aihuaxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@amogh-jahagirdar amogh-jahagirdar merged commit 57b1211 into apache:main May 7, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants