Removes in memory set from dead compaction detector #6283

Open
keith-turner wants to merge 2 commits into apache:main from keith-turner:tmp-file-cleanup

@keith-turner
Contributor

Removed an in-memory set of table IDs from the dead compaction detector. The set tracked tables that might have compaction tmp files needing cleanup. It would be hard to maintain across multiple managers, and it could lose track of tables if the process died.

Replaced the in-memory set with a set stored in the metadata table. The split and merge fate operations populate this set directly, so there is no chance of losing track of things when a process dies. The set is also narrower: it allows looking for tmp files to clean up in individual tablet dirs rather than scanning an entire table's dir.
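To illustrate why a narrower, persisted set reduces the work, here is a minimal sketch. It is not the code in this PR: `TmpFileMarkerSketch`, its fields, and the `_tmp` suffix check are all hypothetical stand-ins, with plain in-process collections playing the role of the metadata table and the filesystem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only; every name here is hypothetical and not part of Accumulo. */
class TmpFileMarkerSketch {
  // Stand-in for the filesystem: tablet dir -> names of the files in it.
  final Map<String, List<String>> filesByTabletDir = new TreeMap<>();
  // Stand-in for the persisted set of tablet dirs that may hold tmp files.
  final Set<String> markedTabletDirs = new TreeSet<>();
  // Counts directory listings so the cost of a scan is visible.
  int dirsListed = 0;

  void addFile(String tabletDir, String name) {
    filesByTabletDir.computeIfAbsent(tabletDir, d -> new ArrayList<>()).add(name);
  }

  /** Lists only the marked tablet dirs instead of every dir of the table. */
  List<String> findTmpFilesInMarkedDirs() {
    List<String> found = new ArrayList<>();
    for (String dir : markedTabletDirs) {
      dirsListed++;
      for (String name : filesByTabletDir.getOrDefault(dir, List.of())) {
        // Assumption for this sketch: tmp files are recognizable by a "_tmp" suffix.
        if (name.endsWith("_tmp")) {
          found.add(dir + "/" + name);
        }
      }
    }
    return found;
  }
}
```

In this model a split or merge step would add the affected tablet dir to `markedTabletDirs` before doing its work, so a later scan touches one directory per marker rather than every directory of the table.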

Also changed the order in which tmp files are deleted for failed compactions. They used to be deleted after the metadata for the compaction was cleaned up, which could lose track of the cleanup if the process died after deleting the metadata but before deleting the tmp file. Now the tmp files are deleted before the metadata entry, so the cleanup should no longer be lost on process death.
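The effect of the ordering can be sketched with a toy model. This is not the code in this PR: `CleanupOrderSketch` and its methods are hypothetical, a map stands in for the metadata table, a set stands in for the filesystem, and the `crashBetween` flag simulates the process dying between the two steps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of failed-compaction cleanup; every name here is hypothetical. */
class CleanupOrderSketch {
  // Stand-in for the metadata table: compaction id -> path of its tmp file.
  final Map<String, String> metadata = new HashMap<>();
  // Stand-in for the filesystem: the set of tmp file paths that exist.
  final Set<String> files = new HashSet<>();

  void startCompaction(String id, String tmpPath) {
    metadata.put(id, tmpPath);
    files.add(tmpPath);
  }

  /** Old order: remove the metadata entry first, then the tmp file. */
  void cleanupMetadataFirst(String id, boolean crashBetween) {
    String tmp = metadata.remove(id);
    if (crashBetween) {
      return; // simulated process death: the tmp file is never deleted
    }
    if (tmp != null) {
      files.remove(tmp);
    }
  }

  /** New order: remove the tmp file first, then the metadata entry. */
  void cleanupTmpFileFirst(String id, boolean crashBetween) {
    String tmp = metadata.get(id);
    if (tmp != null) {
      files.remove(tmp);
    }
    if (crashBetween) {
      return; // simulated process death: the entry survives, so a retry is possible
    }
    metadata.remove(id);
  }

  /** Files that still exist but that no metadata entry points to. */
  Set<String> orphanedFiles() {
    Set<String> orphans = new HashSet<>(files);
    orphans.removeAll(metadata.values());
    return orphans;
  }

  public static void main(String[] args) {
    CleanupOrderSketch oldOrder = new CleanupOrderSketch();
    oldOrder.startCompaction("c1", "/t-001/C1.rf_tmp");
    oldOrder.cleanupMetadataFirst("c1", true);
    System.out.println("old order orphans: " + oldOrder.orphanedFiles());

    CleanupOrderSketch newOrder = new CleanupOrderSketch();
    newOrder.startCompaction("c1", "/t-001/C1.rf_tmp");
    newOrder.cleanupTmpFileFirst("c1", true);
    System.out.println("new order orphans: " + newOrder.orphanedFiles());
  }
}
```

With the old order, a crash leaves a file that nothing references. With the new order, a crash leaves the metadata entry behind, and rerunning the cleanup is harmless because deleting an already-deleted file is a no-op in this model.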

This change is needed by #6217

keith-turner added this to the 4.0.0 milestone Mar 31, 2026

final TabletMetadata tm = ctx.getAmple().readTablet(extent, ColumnType.DIR);
if (tm != null) {
  final Collection<Volume> vols = ctx.getVolumeManager().getVolumes();
  for (Volume vol : vols) {
Contributor Author

This code was moved to FindCompactionTmpFiles so that it can be called both here and in the dead compaction detector.