Automatic file renaming

Perform an automatic renaming of a list of files according to some basic rules.

This is a common problem where a simple script can help: we have a list of files that need to be renamed. There can be several reasons for this, and a changed naming convention is certainly one of them. Let's consider the following piece of information (from Wikipedia):

116	1	"Hot Water"	Chris Bennett	Judah Miller & Murray Miller	September 25, 2011	6AJN18	5.83[135]
117	2	"Hurricane!"	Tim Parsons	Erik Sommers	October 2, 2011	6AJN07	5.80[136]
118	3	"A Ward Show"	Josue Cervantes	Erik Durbin	November 6, 2011	6AJN01	4.85[137]
119	4	"The Worst Stan"	Rodney Clouden	Nahnatchka Khan	November 13, 2011	6AJN11	4.87[138]
120	5	"Virtual In-Stanity"	Shawn Murray	Jordan Blum & Parker Deay	November 20, 2011	6AJN16	4.82[139]
121	6	"The Scarlett Getter"	Josue Cervantes	Matt Fusfeld & Alex Cuthbertson	November 27, 2011	6AJN09	4.48[140]
122	7	"Season's Beatings"	Joe Daniello	Erik Sommers	December 11, 2011	6AJN21	5.00[141]
123	8	"The Unbrave One"	Joe Daniello	Rick Wiener & Kenny Schwartz	January 8, 2012	6AJN13	4.79[142]
124	9	"Stanny Tendergrass"	Tim Parsons	Keith Heisler	January 29, 2012	6AJN15	4.77[143]
125	10	"Wheels & the Legman and the Case of Grandpa's Key"	Josue Cervantes	Laura McCreary	February 12, 2012	6AJN17	3.59[144]
126	11	"Old Stan in the Mountain"	Pam Cooke & Valerie Fletcher	Jonathan Fener	February 19, 2012	6AJN14	4.40[145]
127	12	"The Wrestler"	Rodney Clouden	Alan R. Cohen & Alan Freedland	March 4, 2012	6AJN19	4.29[146]
128	13	"Dr. Klaustus"	John Aoshima & Jansen Yee	Brian Boyle	March 11, 2012	6AJN12	4.62[147]
129	14	"Stan's Best Friend"	John Aoshima & Jansen Yee	Jonathan Fener	March 18, 2012	6AJN20	4.61[148]
130	15	"Less Money, Mo' Problems"	Chris Bennett	Murray Miller & Judah Miller	March 25, 2012	6AJN02	4.28[149]
131	16	"The Kidney Stays in the Picture"	Pam Cooke & Valerie Fletcher	Rick Wiener & Kenny Schwartz	April 1, 2012	6AJN22	4.18[150]
132	17	"Ricky Spanish"	Shawn Murray	Erik Sommers	May 6, 2012	7AJN02	4.82[151]
133	18	"Toy Whorey"[152]	Tim Parsons & Jennifer Graves	Matt Fusfeld & Alex Cuthbertson	May 13, 2012	7AJN01

This list represents the episodes of a famous TV show. Since I digitized my media collection, all of it exists only as files on one of my media center drives. The files were originally named like this:

American.Dad.S07E01.mkv
American.Dad.S07E02.mkv
American.Dad.S07E03.mkv
American.Dad.S07E04.mkv
American.Dad.S07E05.mkv
American.Dad.S07E06.mkv
American.Dad.S07E07.mkv
American.Dad.S07E08.mkv
American.Dad.S07E09.mkv
American.Dad.S07E10.mkv
American.Dad.S07E11.mkv
American.Dad.S07E12.mkv
American.Dad.S07E13.mkv
American.Dad.S07E14.mkv
American.Dad.S07E15.mkv
American.Dad.S07E16.mkv
American.Dad.S07E17.mkv
American.Dad.S07E18.mkv

This might be good enough if we are just looking for a specific episode of a specific season. But the file name could carry a lot more information, like the title of the episode. Also, storing all files in one directory makes it hard to keep an overview. Therefore I came up with the convention to separate Movies from Series, the different Series from each other, and the different Seasons of a series from each other. Every level gets its own directory within its parent category. Storing the season number in the file name is therefore redundant.

The following code snippet is able to perform the required transformation:

using System;
using System.IO;
using System.Text.RegularExpressions;

var ext = "mkv";                                      // extension of the files to transform
var dir = @"X:\Media\Series\American Dad\Season 7";   // directory of the files to transform
var list = "list.txt";                                // file with the transformation data
var regex = @"[\d]{3}\t([\d]{1,2})\t""(.*)"".*";      // group 1: episode number, group 2: title
var files = Directory.GetFiles(dir, "*." + ext);
// GetFiles does not guarantee an order, so sort to make files[i] correspond to lines[i].
Array.Sort(files, StringComparer.OrdinalIgnoreCase);
var lines = File.ReadAllLines(Path.Combine(dir, list));
var n = Math.Min(files.Length, lines.Length);
var r = new Regex(regex);
var invalids = Path.GetInvalidFileNameChars();
for (var i = 0; i < n; i++)
{
	var m = r.Match(lines[i]);
	if (m.Success)
	{
		var nr = m.Groups[1].Value;
		var title = m.Groups[2].Value;
		var file = string.Format("{0} - {1}.{2}", nr, title, ext);

		// Replace characters that are not allowed in file names with a space.
		foreach (var invalid in invalids)
			file = file.Replace(invalid, ' ');

		File.Move(files[i], Path.Combine(dir, file));
	}
}

We can turn the variables ext (extension of the files to transform), dir (directory of the files to transform), list (file with the transformation data) and regex (regular expression, which expresses the transformation rule) into parameters to refactor the snippet as a method.
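A minimal sketch of such a method could look like the following. The name RenameFiles and the return value (the number of renamed files) are my own choices for illustration. The sorting step is an assumption as well: it only helps if the alphabetical order of the files matches the order of the lines in the list.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper: renames the files in dir according to the lines in list.
// Returns the number of files that were renamed.
int RenameFiles(string ext, string dir, string list, string regex)
{
	var files = Directory.GetFiles(dir, "*." + ext);
	// Assumption: the alphabetical file order matches the line order in the list.
	Array.Sort(files, StringComparer.OrdinalIgnoreCase);
	var lines = File.ReadAllLines(Path.Combine(dir, list));
	var n = Math.Min(files.Length, lines.Length);
	var r = new Regex(regex);
	var invalids = Path.GetInvalidFileNameChars();
	var renamed = 0;

	for (var i = 0; i < n; i++)
	{
		var m = r.Match(lines[i]);
		if (!m.Success)
			continue; // skip lines that do not fit the rule

		var file = string.Format("{0} - {1}.{2}", m.Groups[1].Value, m.Groups[2].Value, ext);

		// Replace characters that are not allowed in file names with a space.
		foreach (var invalid in invalids)
			file = file.Replace(invalid, ' ');

		File.Move(files[i], Path.Combine(dir, file));
		renamed++;
	}

	return renamed;
}
```

With the four variables from the snippet above, the call would simply be RenameFiles(ext, dir, list, regex).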

The magic lies in the regular expression. It searches for 3 digits followed by a tab, then a group of 1 or 2 digits followed by another tab. Finally there is a second group of arbitrary characters within quotation marks, with arbitrary characters afterwards. We are interested in the first group (the episode number) and the second group (the title). The group with index zero represents the whole match. Since every line starts with the 3 digits and the expression ends with .*, its value is equal to the whole line of the transformation data.
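To make the groups more tangible, here is a small sketch that applies the expression to the first line of the list. The variable names line and match are only used for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// One line of the transformation data (the columns are separated by tabs).
var line = "116\t1\t\"Hot Water\"\tChris Bennett\tJudah Miller & Murray Miller\tSeptember 25, 2011\t6AJN18\t5.83[135]";
var match = new Regex(@"[\d]{3}\t([\d]{1,2})\t""(.*)"".*").Match(line);

Console.WriteLine(match.Success);                  // True
Console.WriteLine(match.Groups[1].Value);          // 1
Console.WriteLine(match.Groups[2].Value);          // Hot Water
Console.WriteLine(match.Groups[0].Value == line);  // True, group 0 spans the whole line
```

Note that (.*) is greedy, so the second group runs up to the last quotation mark in the line. That is fine for this data, where each line contains exactly one quoted title.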

The outcome of the transformation will look like this:

1 - Hot Water.mkv
2 - Hurricane!.mkv
3 - A Ward Show.mkv
4 - The Worst Stan.mkv
5 - Virtual In-Stanity.mkv
6 - The Scarlett Getter.mkv
7 - Season's Beatings.mkv
8 - The Unbrave One.mkv
9 - Stanny Tendergrass.mkv
10 - Wheels & the Legman and the Case of Grandpa's Key.mkv
11 - Old Stan in the Mountain.mkv
12 - The Wrestler.mkv
13 - Dr. Klaustus.mkv
14 - Stan's Best Friend.mkv
15 - Less Money, Mo' Problems.mkv
16 - The Kidney Stays in the Picture.mkv
17 - Ricky Spanish.mkv
18 - Toy Whorey.mkv

Most files can be renamed easily using the code snippet above and a data source such as Wikipedia. The most complicated part is adapting the regular expression to the data. A quick test in a good editor or in a program like Expresso helps a lot here.
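Such a quick test can also be done in code before any file is touched. The following is only a sketch: PreviewNames is a hypothetical helper, the third sample line is a made-up header line to show a failed match, and the replacement of invalid file name characters is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the file name each line would produce.
// Lines that do not match are reported instead. No file is renamed.
List<string> PreviewNames(IEnumerable<string> lines, string regex, string ext)
{
	var r = new Regex(regex);
	var names = new List<string>();

	foreach (var line in lines)
	{
		var m = r.Match(line);
		names.Add(m.Success
			? string.Format("{0} - {1}.{2}", m.Groups[1].Value, m.Groups[2].Value, ext)
			: "(no match) " + line);
	}

	return names;
}

var sample = new[]
{
	"132\t17\t\"Ricky Spanish\"\tShawn Murray\tErik Sommers\tMay 6, 2012\t7AJN02\t4.82[151]",
	"133\t18\t\"Toy Whorey\"[152]\tTim Parsons & Jennifer Graves\tMatt Fusfeld & Alex Cuthbertson\tMay 13, 2012\t7AJN01",
	"No. in series\tNo. in season\tTitle" // made-up header line, should not match
};

foreach (var name in PreviewNames(sample, @"[\d]{3}\t([\d]{1,2})\t""(.*)"".*", "mkv"))
	Console.WriteLine(name);
```

The first two lines yield 17 - Ricky Spanish.mkv and 18 - Toy Whorey.mkv, while the header line is reported as not matching.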
