Conversation
There was a problem hiding this comment.
Code Review
This pull request enhances the git ref lock cleaner script by introducing a retry mechanism for git fetch operations and implementing automatic cleanup of empty parent directories within the .git structure after a reference is deleted. It also refines the error handling flow to return success statuses and updates regex patterns for better compatibility. A review comment identifies that the handle override in Type2Handler is redundant and potentially problematic due to a fragile regex and incorrect directory cleanup logic, suggesting that the base class implementation is sufficient.
| def handle(self, error_output: str) -> bool: | ||
| # Example: | ||
| # cannot lock ref 'refs/remotes/origin/dev': | ||
| # 'refs/remotes/origin/dev/trigger-ci-for-3.0' exists; cannot create 'refs/remotes/origin/dev' | ||
| match = re.search( | ||
| r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'", | ||
| error_output, | ||
| ) | ||
| if match: | ||
| target_ref = match.group(1) | ||
| blocking_ref = match.group(2) | ||
| print( | ||
| f"Detected conflict: blocking ref {blocking_ref} prevents creating {target_ref}" | ||
| ) | ||
| fixed = self.delete_ref(blocking_ref) | ||
| # Ensure parent directories of target ref are not left as empty dirs. | ||
| self.cleanup_ref_dirs(target_ref) | ||
| return fixed | ||
| return super().handle(error_output) |
There was a problem hiding this comment.
The handle method override in Type2Handler is redundant and introduces potential issues.
- Redundancy: The base class
handlemethod already usesparse_branchto identify the ref to delete. SinceType2Handler.parse_branch(line 97) correctly identifies the blocking ref (e.g., from the'... exists;'part of the error), the base implementation will correctly delete it and perform the necessary directory cleanup. - Fragile Regex: The regex at line 105 is fragile because it expects a single space after the colon (
': '), whereas Git error output for D/F (Directory/File) conflicts often contains newlines and indentation at that position. - Incorrect Cleanup: The call to
self.cleanup_ref_dirs(target_ref)at line 116 is problematic. Iftarget_refisrefs/remotes/origin/dev, it will attempt to remove the parent directory.git/refs/remotes/originif it is empty, which is likely not intended. The directory that actually needs cleaning (the one named after the target ref) is already handled by thedelete_ref(blocking_ref)call.
Removing this override simplifies the code and avoids these issues while maintaining the same functionality through the base class implementation.
There was a problem hiding this comment.
Pull request overview
This PR enhances the .github/scripts/git_ref_lock_cleaner.py helper used in CI/local automation to recover from git fetch failures caused by ref lock/conflict issues, adding retry logic and cleanup of now-empty ref/reflog directories under .git.
Changes:
- Refactors handler flow so
handle()/delete_ref()return a boolean success value, enabling controlled retries. - Adds cleanup of empty parent directories after ref/reflog deletions.
- Expands/refines error matching (including a more specific conflict case) and retries
git fetchup to 2 times when a fix was applied.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| stop_dir = os.path.normpath(stop_dir) | ||
| while current and os.path.normpath(current).startswith(stop_dir): | ||
| if os.path.normpath(current) == stop_dir: | ||
| break |
There was a problem hiding this comment.
remove_empty_parents() uses os.path.normpath(current).startswith(stop_dir) to decide whether current is under stop_dir. startswith() is not a path-safe containment check (e.g., .git/refs_backup would match .git/refs), which can make the cleanup traverse/remove unexpected directories. Consider converting both to absolute (and on Windows, normcase) and using os.path.commonpath([current, stop_dir]) == stop_dir (or pathlib.Path.resolve().is_relative_to() on 3.9+) to enforce a real directory boundary check.
| stop_dir = os.path.normpath(stop_dir) | |
| while current and os.path.normpath(current).startswith(stop_dir): | |
| if os.path.normpath(current) == stop_dir: | |
| break | |
| stop_dir_abs = os.path.normcase(os.path.abspath(os.path.normpath(stop_dir))) | |
| while current: | |
| current_abs = os.path.normcase(os.path.abspath(os.path.normpath(current))) | |
| if os.path.commonpath([current_abs, stop_dir_abs]) != stop_dir_abs: | |
| break | |
| if current_abs == stop_dir_abs: | |
| break |
| if not os.path.exists(ref_file): | ||
| remove_empty_parents(ref_file, os.path.join(".git", "refs")) | ||
| if not os.path.exists(reflog_file): |
There was a problem hiding this comment.
cleanup_ref_dirs() only runs directory cleanup when ref_file / reflog_file do not exist. In the conflict scenario this PR is addressing, the target ref path can exist as an (empty) directory, which still blocks Git from creating the ref file, but this code will skip cleanup because os.path.exists() is true for directories. Consider explicitly handling os.path.isdir(ref_file) / os.path.isdir(reflog_file) by attempting to rmdir the directory (if empty) and then cleaning up parents.
| if not os.path.exists(ref_file): | |
| remove_empty_parents(ref_file, os.path.join(".git", "refs")) | |
| if not os.path.exists(reflog_file): | |
| if os.path.isdir(ref_file): | |
| try: | |
| os.rmdir(ref_file) | |
| print(f"Removed empty directory: {ref_file}") | |
| except OSError as ex: | |
| print(f"Failed to remove directory {ref_file}: {ex}") | |
| else: | |
| remove_empty_parents(ref_file, os.path.join(".git", "refs")) | |
| elif not os.path.exists(ref_file): | |
| remove_empty_parents(ref_file, os.path.join(".git", "refs")) | |
| if os.path.isdir(reflog_file): | |
| try: | |
| os.rmdir(reflog_file) | |
| print(f"Removed empty directory: {reflog_file}") | |
| except OSError as ex: | |
| print(f"Failed to remove directory {reflog_file}: {ex}") | |
| else: | |
| remove_empty_parents(reflog_file, os.path.join(".git", "logs", "refs")) | |
| elif not os.path.exists(reflog_file): |
| # cannot lock ref 'refs/remotes/origin/dev': | ||
| # 'refs/remotes/origin/dev/trigger-ci-for-3.0' exists; cannot create 'refs/remotes/origin/dev' | ||
| match = re.search( | ||
| r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'", |
There was a problem hiding this comment.
The conflict regex in Type2Handler.handle() doesn't allow for the newline/whitespace Git often includes after cannot lock ref '...': (your own example shows a line break). As written, match will fail in that common format, so the target-ref cleanup path won’t run. Consider allowing \s* after the colon (or using re.DOTALL + more flexible spacing) so both single-line and multi-line error outputs are handled.
| r"cannot lock ref '(refs/remotes/origin/[^']+)': '(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'", | |
| r"cannot lock ref '(refs/remotes/origin/[^']+)':\s*'(refs/remotes/origin/[^']+)' exists; cannot create '(refs/remotes/origin/[^']+)'", |
| def parse_branch(self, error_output: str) -> str: | ||
| # 匹配 cannot lock ref 部分,兼容中英文 | ||
| match = re.search( | ||
| r"error: cannot lock ref '(refs/remotes/origin/[^']+)': is at", error_output | ||
| r"cannot lock ref '(refs/remotes/origin/[^']+)': is at", error_output | ||
| ) | ||
| return match.group(1) if match else None |
There was a problem hiding this comment.
Type1Handler.parse_branch() is annotated as returning str, but it can return None when the regex doesn't match (and the abstract parse_branch contract is Optional[str]). Align the return type annotation with the actual behavior (and keep handler overrides consistent with the base class) to avoid misleading typing and static-analysis issues.
This pull request refactors and enhances the
.github/scripts/git_ref_lock_cleaner.pyscript to improve error handling, robustness, and cleanup of empty directories after reference deletions. The changes ensure that the script can better identify and resolve git ref lock errors, clean up any resulting empty directories, and retry operations when appropriate.Error handling and recovery improvements:
handle,delete_ref, and related methods return a boolean indicating success, allowing for better control flow and retry logic. [1] [2] [3]git fetchup to two times if an error is detected and fixed, improving robustness against transient issues.Directory cleanup enhancements:
remove_empty_parentsutility and integrated it into the ref deletion process to automatically remove empty parent directories under.git/refsand.git/logs/refsafter a ref or reflog is deleted. [1] [2]Error pattern matching and handling:
These changes collectively make the script more reliable and easier to maintain in the face of various git ref lock error patterns.