Skip to content

feat: support --from-file and --to-file#38

Open
kormide wants to merge 1 commit intomainfrom
from-to-file
Open

feat: support --from-file and --to-file#38
kormide wants to merge 1 commit intomainfrom
from-to-file

Conversation

@kormide
Copy link
Owner

@kormide kormide commented Mar 14, 2026

Diff can create multifile 1-N patches with --from-file and --to-file.

Due to a bug in diff where --label flags don't capture all of the file labels when using --to-file and --from-file, I stopped setting overridden labels and instead I'm zeroing out the timestamps to make the patches deterministic.

For other multifile patch cases, I'm going to make new rule later that takes in diff targets as deps. This is in line with keeping the diff rules as functional as the diff binary itself, which can't arbitrarily combine patches unless it's using one of the two options in this PR or --recursive.

@kormide kormide requested a review from alexeagle March 14, 2026 20:17
@kormide kormide force-pushed the from-to-file branch 19 times, most recently from c3fd961 to 321acff Compare March 15, 2026 19:57
@kormide kormide removed the request for review from alexeagle March 15, 2026 20:05
@kormide kormide marked this pull request as draft March 15, 2026 20:05
@kormide kormide force-pushed the from-to-file branch 4 times, most recently from a94662b to 4b2bf38 Compare March 15, 2026 21:33
@kormide kormide marked this pull request as ready for review March 15, 2026 21:33
@kormide kormide requested a review from alexeagle March 15, 2026 21:33
return (from_file, to_file, file_path)

def _build_command(bin_dir, diff_bin, args, files, patch, type):
zero_timestamp_and_offset = "0000-00-00 00:00:00.000000000 +0000"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: some systems compute a time delta, and the year 0 causes them to overflow a datatype (say, seconds in the period). So I think it's better to use a default value like we do in tar.bzl which is effectively "start of UNIX epoch (1970) or maybe it's a more recent date than that.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I'll switch to using the unix epoch.

def _build_command(bin_dir, diff_bin, args, files, patch, type):
zero_timestamp_and_offset = "0000-00-00 00:00:00.000000000 +0000"
if type == "unified":
match_timestamp_and_offset = "[0-9]{4}-[0-9]{2}-[0-9]{2}\\s+[0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]+\\s+[-+][0-9]{4}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: offset actually means timezone right? it's not an offset within the file?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. It's the timezone offset. +0000 is the offset for UTC.

if [[ $? == '2' ]]; then
exit 2
fi
echo "$DIFF" | sed -r 's#^((---|\\+\\+\\+)\\s+)({}/)?(\\S+)\\s+{}#\\1\\4 {}#' > {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we get sed from somewhere like coreutils?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sed isn't part of coreutils unfortunately so we can't use the bazel-lib toolchain. There is a source build on the BCR, but it doesn't work on windows. I'm hesitant to use it for Linux because I'm going to have to write windows PS or batch support eventually and I can't use that source build. Maybe I'll try fixing that upstream.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to use a hermetic sed but I can live with not doing that at this moment.

""".format(
diff_bin,
" ".join(args),
" ".join([file.path for file in files]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible, for optimization I think it's better to accumulate these in an Args object so the file paths are lazily evaluated at execution time rather than during analysis

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat. I didn't know about Args.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this file be in .gitattributes as generated so the delta doesn't appear in the PR twice?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants