Skip to content

Add directory cleanup after ref deletion#160

Open
tomchon wants to merge 1 commit intomainfrom
tomchon-patch-1
Open

Add directory cleanup after ref deletion#160
tomchon wants to merge 1 commit intomainfrom
tomchon-patch-1

Conversation

@tomchon
Copy link
Copy Markdown
Contributor

@tomchon tomchon commented Apr 27, 2026

This pull request refactors and enhances the .github/scripts/git_ref_lock_cleaner.py script to improve error handling, robustness, and cleanup of empty directories after reference deletions. The changes ensure that the script can better identify and resolve git ref lock errors, clean up any resulting empty directories, and retry operations when appropriate.

Error handling and recovery improvements:

  • Refactored the handler logic so that handle, delete_ref, and related methods return a boolean indicating success, allowing for better control flow and retry logic. [1] [2] [3]
  • Enhanced the main execution flow to support retrying git fetch up to two times if an error is detected and fixed, improving robustness against transient issues.

Directory cleanup enhancements:

  • Added the remove_empty_parents utility and integrated it into the ref deletion process to automatically remove empty parent directories under .git/refs and .git/logs/refs after a ref or reflog is deleted. [1] [2]

Error pattern matching and handling:

  • Improved regex patterns in the handlers to be more flexible and to support both English and Chinese error messages, and added logic to handle specific conflict scenarios where a blocking ref prevents the creation of a target ref. [1] [2]

These changes collectively make the script more reliable and easier to maintain in the face of various git ref lock error patterns.

Copilot AI review requested due to automatic review settings April 27, 2026 08:22
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the git ref lock cleaner script by introducing a retry mechanism for git fetch operations and implementing automatic cleanup of empty parent directories within the .git structure after a reference is deleted. It also refines the error handling flow to return success statuses and updates regex patterns for better compatibility. A review comment identifies that the handle override in Type2Handler is redundant and potentially problematic due to a fragile regex and incorrect directory cleanup logic, suggesting that the base class implementation is sufficient.

Comment on lines +100 to +118
def handle(self, error_output: str) -> bool:
# Example:
# cannot lock ref 'refs/remotes/origin/dev':
# 'refs/remotes/origin/dev/trigger-ci-for-3.0' exists; cannot create 'refs/remotes/origin/dev'
match = re.search(
r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'",
error_output,
)
if match:
target_ref = match.group(1)
blocking_ref = match.group(2)
print(
f"Detected conflict: blocking ref {blocking_ref} prevents creating {target_ref}"
)
fixed = self.delete_ref(blocking_ref)
# Ensure parent directories of target ref are not left as empty dirs.
self.cleanup_ref_dirs(target_ref)
return fixed
return super().handle(error_output)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The handle method override in Type2Handler is redundant and introduces potential issues.

  1. Redundancy: The base class handle method already uses parse_branch to identify the ref to delete. Since Type2Handler.parse_branch (line 97) correctly identifies the blocking ref (e.g., from the '... exists;' part of the error), the base implementation will correctly delete it and perform the necessary directory cleanup.
  2. Fragile Regex: The regex at line 105 is fragile because it expects a single space after the colon (': '), whereas Git error output for D/F (Directory/File) conflicts often contains newlines and indentation at that position.
  3. Incorrect Cleanup: The call to self.cleanup_ref_dirs(target_ref) at line 116 is problematic. If target_ref is refs/remotes/origin/dev, it will attempt to remove the parent directory .git/refs/remotes/origin if it is empty, which is likely not intended. The directory that actually needs cleaning (the one named after the target ref) is already handled by the delete_ref(blocking_ref) call.

Removing this override simplifies the code and avoids these issues while maintaining the same functionality through the base class implementation.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the .github/scripts/git_ref_lock_cleaner.py helper used in CI/local automation to recover from git fetch failures caused by ref lock/conflict issues, adding retry logic and cleanup of now-empty ref/reflog directories under .git.

Changes:

  • Refactors handler flow so handle() / delete_ref() return a boolean success value, enabling controlled retries.
  • Adds cleanup of empty parent directories after ref/reflog deletions.
  • Expands/refines error matching (including a more specific conflict case) and retries git fetch up to 2 times when a fix was applied.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +10 to +13
stop_dir = os.path.normpath(stop_dir)
while current and os.path.normpath(current).startswith(stop_dir):
if os.path.normpath(current) == stop_dir:
break
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_empty_parents() uses os.path.normpath(current).startswith(stop_dir) to decide whether current is under stop_dir. startswith() is not a path-safe containment check (e.g., .git/refs_backup would match .git/refs), which can make the cleanup traverse/remove unexpected directories. Consider converting both to absolute (and on Windows, normcase) and using os.path.commonpath([current, stop_dir]) == stop_dir (or pathlib.Path.resolve().is_relative_to() on 3.9+) to enforce a real directory boundary check.

Suggested change
stop_dir = os.path.normpath(stop_dir)
while current and os.path.normpath(current).startswith(stop_dir):
if os.path.normpath(current) == stop_dir:
break
stop_dir_abs = os.path.normcase(os.path.abspath(os.path.normpath(stop_dir)))
while current:
current_abs = os.path.normcase(os.path.abspath(os.path.normpath(current)))
if os.path.commonpath([current_abs, stop_dir_abs]) != stop_dir_abs:
break
if current_abs == stop_dir_abs:
break

Copilot uses AI. Check for mistakes.
Comment on lines +71 to +73
if not os.path.exists(ref_file):
remove_empty_parents(ref_file, os.path.join(".git", "refs"))
if not os.path.exists(reflog_file):
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup_ref_dirs() only runs directory cleanup when ref_file / reflog_file do not exist. In the conflict scenario this PR is addressing, the target ref path can exist as an (empty) directory, which still blocks Git from creating the ref file, but this code will skip cleanup because os.path.exists() is true for directories. Consider explicitly handling os.path.isdir(ref_file) / os.path.isdir(reflog_file) by attempting to rmdir the directory (if empty) and then cleaning up parents.

Suggested change
if not os.path.exists(ref_file):
remove_empty_parents(ref_file, os.path.join(".git", "refs"))
if not os.path.exists(reflog_file):
if os.path.isdir(ref_file):
try:
os.rmdir(ref_file)
print(f"Removed empty directory: {ref_file}")
except OSError as ex:
print(f"Failed to remove directory {ref_file}: {ex}")
else:
remove_empty_parents(ref_file, os.path.join(".git", "refs"))
elif not os.path.exists(ref_file):
remove_empty_parents(ref_file, os.path.join(".git", "refs"))
if os.path.isdir(reflog_file):
try:
os.rmdir(reflog_file)
print(f"Removed empty directory: {reflog_file}")
except OSError as ex:
print(f"Failed to remove directory {reflog_file}: {ex}")
else:
remove_empty_parents(reflog_file, os.path.join(".git", "logs", "refs"))
elif not os.path.exists(reflog_file):

Copilot uses AI. Check for mistakes.
# cannot lock ref 'refs/remotes/origin/dev':
# 'refs/remotes/origin/dev/trigger-ci-for-3.0' exists; cannot create 'refs/remotes/origin/dev'
match = re.search(
r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'",
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conflict regex in Type2Handler.handle() doesn't allow for the newline/whitespace Git often includes after cannot lock ref '...': (your own example shows a line break). As written, match will fail in that common format, so the target-ref cleanup path won’t run. Consider allowing \s* after the colon (or using re.DOTALL + more flexible spacing) so both single-line and multi-line error outputs are handled.

Suggested change
r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'",
r"cannot lock ref '(refs/remotes/origin/[^']+)':\s*'(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'",

Copilot uses AI. Check for mistakes.
Comment on lines 83 to 88
def parse_branch(self, error_output: str) -> str:
# 匹配 cannot lock ref 部分,兼容中英文
match = re.search(
r"error: cannot lock ref '(refs/remotes/origin/[^']+)': is at", error_output
r"cannot lock ref '(refs/remotes/origin/[^']+)': is at", error_output
)
return match.group(1) if match else None
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type1Handler.parse_branch() is annotated as returning str, but it can return None when the regex doesn't match (and the abstract parse_branch contract is Optional[str]). Align the return type annotation with the actual behavior (and keep handler overrides consistent with the base class) to avoid misleading typing and static-analysis issues.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants