Changing characters in multiple file names

I'm building an archive of TV shows and I need to change over 1000 file names from the format:
S1 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg
to
Bonanza - S01 E05 - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg
for use with Plex.

I have been using AppleScript’s text item delimiters, but the code is getting rather long. I’m not fluent in AppleScript, but I have managed to code the sections where I select the folder and strip the beginning codes from the file name. I am struggling to figure out:
how to test for two-digit numbers following S (for season) and E (for episode)
how then to add that code in the correct place following the - and add a final -
In BASIC I would use LEFT$, RIGHT$ and MID$. Is there similar code in AppleScript? Any ideas would be appreciated.

Here is what I have so far. I hope there is an easier way to do this.


global theNewFile
global res
global tvCode
global newCode





---select folder to change
set the_folder to (choose folder with prompt "Choose Folder to Operate On")
set the_folder_list to list folder the_folder without invisibles

--cycle through folder file by file
repeat with x from 1 to count of the_folder_list
	set the_file to item x of the_folder_list
	--set the_new_file to the_file ---------the_folder & the_file
	tell application "Finder"
		activate
		--display dialog the_file as text
		
		set jobNum to the_file --jobNum is file name
		set theOldFile to jobNum --save the original file name
		--set theNewFile to "Hi my mame is joe!"
		--display dialog "jobNum " & jobNum -- as text	
	end tell
	
	-- rather than send the folder I'll just plug in the first file in it
	
	set jobNum to "S1 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"
	
	
	-- get tvCode from jobNum
	activate me
	getBaseName from jobNum
	display dialog "tvCode <-> " & theOldFile & tvCode
	
	findAndReplace(tvCode, "", theOldFile)
	tell application "Finder"
		activate
		--display dialog "res " & res
		
	end tell
	activate
	getNewCode from tvCode
	display dialog "newCode " & newCode
	
end repeat

------------------

on findAndReplace(tofind, toreplace, TheString)
	set ditd to text item delimiters
	set res to missing value
	set text item delimiters to tofind
	repeat with tis in text items of TheString
		if res is missing value then
			set res to tis
		else
			set res to res & toreplace & tis
		end if
	end repeat
	set text item delimiters to ditd
	return res
end findAndReplace

---------------

to getBaseName from t -- (Kai Edwards)
	set d to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " " -- separated at spaces
	if (count t's text items) > 1 then set t to t's text 1 thru text item 3
	
	--display dialog "t " & t as text
	
	set tvCode to t as text
	
	set AppleScript's text item delimiters to d -- always set them back again!
	return t
end getBaseName

-----
to getNewCode from t -- (Kai Edwards)
	set d to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " " -- separated at spaces
	if (count t's text items) > 1 then set t to t's text 1 thru text item 2
	
	
	--display dialog "t " & t as text
	
	set newCode to t as text
	
	set AppleScript's text item delimiters to d -- always set them back again!
	return t
end getNewCode

Model: MacBook 2.6 ghz 4gb ram
AppleScript: Version 2.4.3 (131.2)
Browser: Safari 536.25
Operating System: Mac OS X (10.8)

Will the N338 always be in the same format? If not, how will you distinguish it from the name of the show?

Hi. This isn’t connected to a method for obtaining or changing the file names, but here is my solution to handle the main issue. I’m using words to section the items; this relies on the assumption that there are no strange characters in the name and that the source formatting remains consistent (series number, episode number, useless alphanumeric, show name, hyphen with trailing text).

set AppleScript's text item delimiters to "" --reset to default

set nameString to "S1 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"
set {{series, Episode, |¿|, title}, filler} to nameString's {words from character 1 to 4, characters (offset of "-" in nameString) thru -1}
set newName to title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & space & filler


on paddZero(this)
	tell this to if length ≤ 2 then
		character 1 & 0 & its text items's rest
	else
		it
	end if
end paddZero

Regular expressions are not my strong point but assuming that “N338” is a constant, this should get you on the right path.

tell application "Finder" to set theFiles to every file of (choose folder)
repeat with aFile in theFiles
	set fileName to name of aFile
	set newName to do shell script "echo " & quoted form of fileName & " | sed -e 's/^S\\([0-9]\\) /S0\\1 /' -e 's/\\(S[0-9][0-9] E\\)\\([0-9]\\) /\\10\\2 /' -e s'/\\(S[0-9][0-9] E[0-9][0-9]\\) \\(N338\\) \\(.*-\\) \\(.*\\)/\\3 \\1 - \\4/'"
	tell application "Finder" to set name of aFile to newName
end repeat

Wow, you folks are fast! Many thanks. Marc, the code works beautifully. I plugged it in and it changes everything just fine. I wish I could understand the code. Is there a reference on the delimiters somewhere?

I think the code means:

set AppleScript's text item delimiters to "" --reset to default

--add routine to get file names into nameString
set nameString to "S1 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"

--this seems to "read" characters from nameString into series, Episode and title, but I don't understand how
--nameString's {words from ... to ...} works. "character 1 to 4" has me stumped
set {{series, Episode, |¿|, title}, filler} to nameString's {words from character 1 to 4, characters (offset of "-" in nameString) thru -1}

--this recombines the pieces above into the newName
set newName to title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & space & filler

-- this subroutine I understand.
on paddZero(this)
	tell this to if length ≤ 2 then
		character 1 & 0 & its text items's rest
	else
		it
	end if
end paddZero

Here is my work in progress so far:






---select folder to change
set the_folder to (choose folder with prompt "Choose Folder to Operate On")
set the_folder_list to list folder the_folder without invisibles

--cycle through folder file by file
repeat with x from 1 to count of the_folder_list
	set the_file to item x of the_folder_list
	tell application "Finder"
		activate
		--display dialog the_file as text
		
		
		set theOldFile to the_file --save the original file name
		
		display dialog "the file " & the_file -- as text	
	end tell
	
	-- dummy data represents file name from above
	set the_file to "S13 E09 N447  Gunsmoke - The Pillagers (08_15_2012) 6709 WOIODT2 110.mpeg"
	set AppleScript's text item delimiters to "" --reset to default
	
	set nameString to the_file
	set {{series, Episode, |¿|, title}, filler} to nameString's {words from character 1 to 4, characters (offset of "-" in nameString) thru -1}
	set newName to title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & space & filler
	activate me
	display dialog "new Name " & newName
	
end repeat

on paddZero(this)
	tell this to if length ≤ 2 then
		character 1 & 0 & its text items's rest
	else
		it
	end if
end paddZero

Now I have to apply newName to the actual file. Any suggestions?

I like Marc’s “words from character 1 to 4” method. However, it has nothing to do with the text item delimiters. He set them to the default value so his handler would work, but none of the text manipulation is done with the delimiters.

I have not seen that before. Marc, would you elaborate on “words from character 1 to 4” a bit?

Hi,

My version:


---select folder to change
set theFolder to (choose folder with prompt "Choose Folder to Operate On")
activate application "Finder"
tell application "Finder" to set theFiles to files of theFolder

--cycle through folder file by file
set {TID, text item delimiters} to {text item delimiters, space}
repeat with aFile in theFiles
	set fileName to name of aFile
	try
		tell text items of fileName to set {season, episode, episodeName, theRest} to {item 1, item 2, item 4, items 5 thru -1 as text}
		set newName to ({episodeName} & "-" & pad(season) & pad(episode) & theRest as text)
		tell application "Finder"
			display dialog "old name: " & fileName & return & return & "new name: " & newName buttons {"Cancel", "Skip", "Replace"}
			if button returned of result is "Replace" then set name of contents of aFile to newName
		end tell
	on error
		tell application "Finder" to display dialog "Filename does not match naming scheme"
	end try
end repeat
set text item delimiters to TID

------------------

on pad(v)
	try
		tell v to set {theLetter, theNumber} to {text 1, text 2 thru -1 as integer}
		return theLetter & text -2 thru -1 of ((100 + theNumber) as text)
	on error
		return v
	end try
end pad
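
The padding step in the pad handler above may be easier to follow in isolation. This is only a sketch of the same idea, with a hypothetical sample value ("S5") and my own variable names: add 100 to the number, turn the result into text, and keep the last two characters.

```applescript
-- Sketch of the zero-padding idea used in pad(); "S5" is just sample data.
set code to "S5"

set theLetter to character 1 of code -- "S"
set theNumber to (text 2 thru -1 of code) as integer -- 5

-- (100 + 5) as text is "105"; its last two characters are "05"
set padded to theLetter & text -2 thru -1 of ((100 + theNumber) as text)
-- padded is "S05"
```

A two-digit value passes through unchanged: "S13" gives 113, whose last two characters are "13".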

The delimiters do come into it, actually. Marc’s ‘filler’ variable contains a list of characters, which is coerced to text when it is concatenated to the end of the text in line 4 of his script. The delimiter value comes into play then. The ‘text items’s rest’ in the handler is just part of the general obfuscation in the script.

set nameString to "S1 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"
set {{series, Episode, |¿|, title}, filler} to nameString's {words 1 thru 4, text (offset of "-" in nameString) thru -1}
set newName to title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & space & filler

on paddZero(this)
	if (this's length is 2) then
		return character 1 of this & "0" & character 2 of this
	else
		return this
	end if
end paddZero
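
The coercion described above (a list of characters being joined with the current delimiter when it becomes text) can be seen on its own in a small sketch. The sample text and variable names here are made up for illustration.

```applescript
-- Sketch: how the current text item delimiters affect list-to-text coercion.
set savedDelimiters to AppleScript's text item delimiters

set charList to characters 1 thru 3 of "abc" -- {"a", "b", "c"}

set AppleScript's text item delimiters to ""
set plainText to charList as text -- "abc"
set plainJoined to "x" & charList -- "xabc" (the list is coerced to text first)

set AppleScript's text item delimiters to "-"
set dashedText to charList as text -- "a-b-c"
set dashedJoined to "x" & charList -- "xa-b-c"

set AppleScript's text item delimiters to savedDelimiters -- always set them back
```

This is why the script resets the delimiters to "" before concatenating a character list: with any other delimiter in force, the delimiter would be inserted between every character.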

It’s not clear from wkmanley’s query whether he just wants one word transposed to the front of the name or whatever text there may be between the three opening codes and the dash. Marc’s script and my rephrasing of it above assume just one word. This version copes with multiple-word and hyphenated series titles:

set nameString to "S1 E5 N338 Bonanza Time - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"

set o to (offset of " - " in nameString) -- ie. <space><dash><space>
set {{series, Episode}, title, filler} to nameString's {words 1 thru 2, text from word 4 to (o - 1), text o thru -1}
set newName to title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & filler

on paddZero(this)
	if (this's length is 2) then
		return character 1 of this & "0" & character 2 of this
	else
		return this
	end if
end paddZero

I came up with a slightly shorter “sed” script than yours, but I don’t know if it’s any better:

set fileName to "S2 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg" -- Test line.

set newName to (do shell script ("echo " & quoted form of fileName & " | sed -E 's/^S([0-9] )/S0\\1/ ;  s/^(.{4}E)([0-9] )/\\10\\2/ ; s/^(.{8})[^ ]+ ([^-]+ -)/\\2 \\1-/'"))
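
For anyone who finds the one-liner above hard to read, here is a sketch that splits the same sed program into its three substitutions. The expressions are copied unchanged from the line above; the variable names (padSeason, padEpisode, reorder, sedProgram) are mine and only there for explanation.

```applescript
set fileName to "S2 E5 N338 Bonanza - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg" -- Test line.

-- 1. "S" + one digit + space at the start: insert a 0 ("S2 " becomes "S02 ").
set padSeason to "s/^S([0-9] )/S0\\1/"

-- 2. The season code is now always 4 characters ("S02 "), so skip 4 characters,
--    then "E" + one digit + space: insert a 0 ("E5 " becomes "E05 ").
set padEpisode to "s/^(.{4}E)([0-9] )/\\10\\2/"

-- 3. Group 1 is the first 8 characters ("S02 E05 "). The next run of non-spaces
--    (the N code) is dropped. Group 2 is everything up to and including the
--    first dash (the show title). Output is title, codes, then a new dash.
set reorder to "s/^(.{8})[^ ]+ ([^-]+ -)/\\2 \\1-/"

set sedProgram to padSeason & " ; " & padEpisode & " ; " & reorder
set newName to do shell script "echo " & quoted form of fileName & " | sed -E " & quoted form of sedProgram
-- newName is "Bonanza - S02 E05 - Anatomy of a Lynching (08_15_2012) 1110 TVLAND 66.mpeg"
```

Because group 2 stops at the first dash, this handles multi-word titles but would not cope with a hyphenated show title.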

Hi Nigel. What I meant was that even though delimiters were used in the handler, the delimiter value had not changed from the default value. It seemed like a different approach than the OP was using because he/she had changed the delimiter value to manipulate the text.

I am still interested in the “words from character 1 to 4” section. Will you elaborate on that for me?

That is a slick regex solution, by the way …

Thanks.

Hi, John.

It means “every word from the first character in the text to the fourth word in the text.” It’s just a circumlocution for ‘words 1 thru 4’.

I’ve written about the various possibilities of range references in text here, if you’re interested.
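
As a rough illustration of those range references (and of the BASIC-style LEFT$, RIGHT$ and MID$ equivalents asked about at the top of the thread), here is a small sketch. The shortened sample name and the variable names are made up for the example.

```applescript
set nameString to "S1 E5 N338 Bonanza - Anatomy of a Lynching.mpeg"

-- Marc's phrasing and the plainer form give the same list:
set {a, b} to nameString's {words from character 1 to 4, words 1 thru 4}
-- a and b are both {"S1", "E5", "N338", "Bonanza"}

-- Character ranges, counted from the front (positive) or the back (negative):
set leftPart to text 1 thru 2 of nameString -- like LEFT$(s, 2): "S1"
set rightPart to text -5 thru -1 of nameString -- like RIGHT$(s, 5): ".mpeg"
set midPart to text 4 thru 5 of nameString -- like MID$(s, 4, 2): "E5"
```

Note that “words ...” returns a list of separate words, while “text ... thru ...” returns one continuous piece of text, including any spaces and punctuation inside the range.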

I’ve now edited my post above to include a more robust version of Marc’s script and to cut one of the “memory” uses in my “sed” script.

With every post that Nigel makes, my vocabulary grows in proportion to my understanding of AppleScript. :slight_smile:

Just don’t show him Towers of Hanoi programmed with sed. :wink:

IMHO, I liked Stefan’s and Marc Anthony’s solutions best from a readability perspective!

I just want to add that a solution with sed is as good as any, and sometimes much faster, so that isn’t the issue. When I come up with solutions, I also use the tools that are closest to my mind, and I see nothing wrong in that!

The overall objective here is often simply to produce a solution!

Thanks to all of you. I understand about 1/4 of the above conversation but Stefan’s script is so simple and clear that I actually think I understand all of it! Extra thanks to Stefan who helped me with a previous project that strips commercials from TV shows (really, he wrote most of the crucial parts of it).

I knew you’d love it, Nige, and now I’m somewhat tempted to change my name on the forum to “General Obfuscation”; it has a nice ring to it. :smiley: I appreciate the brevity of your and adayzdone’s regex solutions; however, that’s quite the feat in obfuscation itself: you need a cipher to parse them.

Phew! I must look at that more closely later on.

They both assume that all the series have one-word titles, if I’ve correctly understood the meaning of the dash in the original file names. I’m not sure if that’s what wkmanley wants or not.

I came back to say this but you beat me to it.

With the example:
“S4 E3 N338 Breaking Bad - Open House.txt”

adayzdone: Breaking Bad - S04 E03 - Open House.txt
Nigel: Breaking Bad - S04 E03 - Open House.txt
Marc Anthony: Breaking - S04 E03 - Open House.txt
Stefan: Breaking - S04 E03 Bad - Open House.txt
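
Given that comparison, one cautious way to try the multi-word parsing on a real folder is a dry run that only lists the proposed names and renames nothing. This is a sketch, not a finished tool: the parsing and paddZero are taken from Nigel's multi-word version earlier in the thread, while plexName and proposedNames are hypothetical names of my own. It assumes every file that should be renamed still follows the old "S.. E.. N... Title - Episode" pattern; a file that has already been renamed may produce a wrong proposal rather than an error, so check the list before acting on it.

```applescript
-- Dry run: build a list of "old -> new" lines without renaming any file.
set theFolder to (choose folder with prompt "Choose Folder to Operate On")
tell application "Finder" to set theFiles to files of theFolder

set proposedNames to {}
repeat with aFile in theFiles
	tell application "Finder" to set fileName to name of aFile
	try
		set end of proposedNames to fileName & " -> " & plexName(fileName)
	on error
		set end of proposedNames to fileName & " (skipped: does not match the naming scheme)"
	end try
end repeat

-- Join the lines with returns, then restore the delimiters.
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set reportText to proposedNames as text
set AppleScript's text item delimiters to savedDelimiters
return reportText -- shown in the script editor's result pane

-- Parsing borrowed from the multi-word version earlier in the thread.
on plexName(nameString)
	set o to (offset of " - " in nameString) -- ie. <space><dash><space>
	set {{series, Episode}, title, filler} to nameString's {words 1 thru 2, text from word 4 to (o - 1), text o thru -1}
	return title & space & "-" & space & paddZero(series) & space & paddZero(Episode) & filler
end plexName

on paddZero(this)
	if (this's length is 2) then
		return character 1 of this & "0" & character 2 of this
	else
		return this
	end if
end paddZero
```

Once the list looks right, the rename itself can be done the way the scripts above already do it, with the Finder's "set name of aFile to newName" inside the loop.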