-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathnextFilename
More file actions
executable file
·281 lines (227 loc) · 10.1 KB
/
nextFilename
File metadata and controls
executable file
·281 lines (227 loc) · 10.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
#!/usr/bin/env python3
#
# nextFilename.py: Find the next available serial-numbered filename.
# 2014-11-04: Written by Steven J. DeRose.
#
import sys
import os
import re
import argparse
from collections import defaultdict
__metadata__ = {
"title" : "nextFilename",
"description" : "Find the next available serial-numbered filename.",
"rightsHolder" : "Steven J. DeRose",
"creator" : "http://viaf.org/viaf/50334488",
"type" : "http://purl.org/dc/dcmitype/Software",
"language" : "Python 3.7",
"created" : "2014-11-04",
"modified" : "2022-04-27",
"publisher" : "http://github.com/sderose",
"license" : "https://creativecommons.org/licenses/by-sa/3.0/"
}
__version__ = __metadata__['modified']
descr = """
=Description=
Generate the "next" or "next available" filename relative to one provided,
by using (or adding) serial numbers within the name. For example,
nextFilename myFile0205.txt
would return "myFile0206.txt".
* By default, neither the mentioned file nor the
constructed incremental file needs to exist or not exist.
* Generated numbers are padded to at least `--width` digits.
* If there are no digits in a supplied filename, 1 (as padded) is appended
immediately preceding the extension (if any).
* If there are noncontiguous digits, the last sequence is used.
* If there are leading path components, they are retained.
To get the next `available` filename,
use `--mustBeAvailable` (or either of the synonyms `--available` or `-a`).
In that case, if the file named by the argument does not exist, that name is returned
unchanged. Otherwise, the first natural number is used that yields
a filename that does not represent an extant file.
''Note'': A reasonable alternative is
to want the number one past the highest one that does exist. That is not yet
implemented.
==Number padding==
Serial numbers are left-0-padded to `--width` digits (default 4), and are added
if there are no digits already present. That means that
nextFilename myFile205.txt
will also return "myFile0206.txt" (note the added "0").
For testing existence, "myFile205.txt" does not
(yet, at least) block the script from returning "myFile0205.txt", because they are
not properly "the same filename" (at least for any OS I can think of).
This may or may not be what you want.
You may instead want to suppress padding entirely here, which you can accomplish
by setting `--width 0`. But if your files
have mixed widths (say, myFile1.txt, ...myFile15.txt), you may want to
normalize them first, for example using my `renameFiles` command.
==Order of processing==
For each path provided as an argument, this is the basic order of steps:
Separate the path, basename, and extension
Extract final ASCII decimal digits from the basename (default to 1)
W = number of digits (including any leading zeros)
N = integer value of that number
Repeat forever:
NF = N as formatted given W and various options
newName = path + basename + NF + '.' + extension
if (not --available): break
if (newName does not exist): break
N = N + 1
If `--available` is specified, such names will be generated with larger
and larger numeric values inserted, until a name (path) is generated that
does not already exist (or until `--maxTries` attempts have been made).
=Related commands=
*nix C<rename>, my C<renameFiles>.
=Known bugs and limitations=
It is possible that files with higher numbers exist, for example if
there are gaps in the sequence of extant filenames. An option to avoid
this is desirable.
Existence is tested for
the literal filename, including the exact number of 0s inserted as padding.
Thus, "file01" and "file000000001" are not considered the same. It would be
better to automatically find all variants, and even auto-set `--width`.
Knows nothing about OS conventions such as "Copy 12 of X", "Backup of backup of X", etc.
Does not know enough to ignore numbers if they are part of dates, such as
in "foo-2020-01-01.log". Speaking of which, it knows nothing of negative or
non-integer or non-decimal numbers.
It is worth mentioning that many file-systems get slow if a single directory
contains more than about 1,000 files (see `bigDirTimer` to test yours).
=To do=
* Option to skip gaps (started, see getNumberedFiles()).
* Maybe support hexadecimal? But what about "foo00BEef.txt"?
* Maybe warn if files with different-width number fields are around?
=History=
* 2014-11-04: Written by Steven J. DeRose.
* 2015-02-06: Add --available, --width. Rename incrementFilename->nextFilename
* 2018-04-18: lint.
* 2020-09-15: New layout. Default to test existence.
* 2022-04-27: Improve help. Better handling of --available. Start smarter
handling of varying widths, numbering gaps, etc.
* 2022-07-13: Improve help, describe various edge cases. Fix bug that
prepended "/" even when there was no prior path. Add type hinting.
=Rights=
Copyright 2014-11-04 by Steven J. DeRose. This work is licensed under a
Creative Commons Attribution-Share-alike 3.0 unported license.
For further information on this license, see
[https://creativecommons.org/licenses/by-sa/3.0].
For the most recent version, see [http://www.derose.net/steve/utilities]
or [https://github.com/sderose].
=Options=
"""
###############################################################################
#
def getNumberedFiles(orig:str) -> (int, list, list):
"""Locate all filenames in the directory that differ from the given
one only in the last set of digits (immediately before the extension).
Return the maximum integer value found for those numbers, the list of
names, and a histogram of widths (hopefully there's only one width).
"""
dirPath, basename = os.path.split(orig)
nam, ext = os.path.splitext(basename) # (keeps the 'dot')
if (not re.match(r"\d", nam)):
sys.stderr.write("No digits found in '%s'." % (nam))
return None
widths = defaultdict(int)
nameList = []
maxValue = 0
namStripped = nam.rstrip("0123456789")
for f in os.listdir(dirPath):
if f.startswith(namStripped) and f.endswith(ext):
nameList.append(f)
numWidth = len(nam)-len(namStripped)
widths[numWidth] += 1
numPart = nam[-numWidth:]
numValue = int(numPart)
if (numValue > maxValue): maxValue = numValue
return maxValue, nameList, widths
def getNewPath(orig:str, mustBeAvailable:bool=True, maxTries:int=100) -> str:
"""Split up the path, then try numbers as long as needed.
Return the end result.
"""
dirPath, basename = os.path.split(orig)
nam, ext = os.path.splitext(basename) # (keeps the 'dot')
# Locate the last group of digits (if any)
mat = re.match(r'(.*?)(\d+)(\D*)$', nam)
if (mat):
pre = mat.group(1)
num = mat.group(2)
post = mat.group(3)
else: # no digits
pre = nam
num = "0"
post = ""
thisWidth = args.width
if (len(num) > args.width):
thisWidth = len(num)
sys.stderr.write("Warning: --width raised to %d to accommodate '%s'." %
(thisWidth, num))
newPath = ''
for i in range(maxTries):
newNum = int(num) + i + 1
if (args.verbose):
sys.stderr.write("Dir(%s) Pre(%s) Num(%d) Post(%s) Ext(%s).\n" %
(dirPath, pre, newNum, post, ext))
numPadded = str(newNum).rjust(thisWidth, '0')
newPath = ""
if (dirPath): newPath += dirPath + "/"
newPath += pre + numPadded + post + ext
if (not mustBeAvailable): break
if (not os.path.exists(newPath)): break
return(newPath)
###############################################################################
# Main
#
if __name__ == '__main__':
def processOptions():
try:
from BlockFormatter import BlockFormatter
parser = argparse.ArgumentParser(
description=descr, formatter_class=BlockFormatter)
except ImportError:
parser = argparse.ArgumentParser(description=descr)
parser.add_argument(
"--list", action="store_true",
help='Just list all existing numerical variants, and exit.')
parser.add_argument(
"--maxTries", type=int, default=100,
help='Only make this many tries before giving up.')
parser.add_argument(
"--mustBeAvailable", "--available", "-a", action="store_true",
help='Return the next name that does NOT yet exist.')
parser.add_argument(
"--no-mustBeAvailable", "--no-available",
action="store_false", dest='mustBeAvailable',
help='Return the next name whether or not it already exists.')
parser.add_argument(
"--quiet", "-q", action="store_true",
help='Suppress most messages.')
parser.add_argument(
"--verbose", "-v", action="count", default=0,
help='Add more messages (repeatable).')
parser.add_argument(
'--version', action="version", version='Version of '+__version__,
help='Display version information, then exit.')
parser.add_argument(
"--width", type=int, default=4,
help='Zero-pad numbers to be at least this wide. Default: 4.')
parser.add_argument(
'names', type=str, nargs=argparse.REMAINDER,
help='Filenames or other strings to "increment."')
args0 = parser.parse_args()
return args0
args = processOptions()
for n in (range(len(args.names))):
if (args.list):
theMax, theNames, theWidths = getNumberedFiles(args.names[n])
print("Max integer found: %d" % (theMax))
if (len(set(theWidths)) > 1):
print("WARNING: Not all the same width.")
for i0 in range(len(theNames)):
print("%2d: '%s'" % (theWidths[i0], theNames[i0]))
else:
newPath0 = getNewPath(args.names[n],
mustBeAvailable=args.mustBeAvailable, maxTries=args.maxTries)
if (args.verbose):
print("%s\t%s" % (args.names[n], newPath0))
else:
print(newPath0)