Using AppleScript's Text Item Delimiters

Adam_Bell · June 18, 2007, 11:00am

Forums are full of scripting problems that involve removing the extension from a name, extracting text preceded by a date from a photo title, finding and replacing words in a text document, creating web-friendly names by inserting underscores (“_”) in place of spaces, grabbing extensions from file names, or removing “/” characters and replacing them with colons in a path. The answers to these queries almost always involve AppleScript’s Text Item Delimiters. This scriptutorial, if I might call it that, tries to clarify TIDs by explaining how they work and giving some hopefully useful examples.

It’s useful to start the discussion with what we know about written text. In plain English, a delimiter is a character or string of characters used to separate or mark the ends of items of data. We use this idea unconsciously all the time. We put parentheses around words in a sentence to indicate that they are an “aside”. In word processing, we think of non-printing characters like space, tab, return, and linefeed as separating words. In anglophone texts, leading capital letters and ending periods are usually our markers for sentences, and return or linefeed characters separate blocks of text containing words and sentences into paragraphs. In HTML text, we use “tags” - special sets of delimiters containing information for the rendering engine of a browser. In forums, we use “bbCode” to do the same thing. The Script Editor automatically treats text following a double dash (“--”) up to the next line feed, or text enclosed between “(*” and “*)” over several lines, as “not script”. Those symbols are the Script Editor’s delimiters for a comment in the midst of its otherwise plain text code.

Think of text item delimiters as having two main functions:

Breaking a string of text into parts called “text items” that are separated by the chosen delimiter. The delimiter does not appear in the list of text items.
Inserting text between the items of a list when coercing it to a string of text. The text inserted is the value of the current delimiter.

For many of the more common operations, these two functions are often used alternately, and we’ll be exploring examples of how to use that idea here.
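
As a minimal sketch of those two functions used one after the other (the sample string and the variable names here are made up purely for illustration):

```applescript
-- Save the current delimiters so they can be restored afterwards.
set savedDelimiters to AppleScript's text item delimiters

-- Function 1: split a string into text items at each comma.
set AppleScript's text item delimiters to ","
set theParts to text items of "red,green,blue" --> {"red", "green", "blue"}

-- Function 2: coerce the list back to text; the current delimiter is inserted between the items.
set AppleScript's text item delimiters to " + "
set joinedText to theParts as text --> "red + green + blue"

-- Put the delimiters back the way we found them.
set AppleScript's text item delimiters to savedDelimiters
```

Note that the comma does not appear in the list, and that the text placed between the items on the way back is whatever the delimiter happens to be at that moment.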

AppleScript’s Words and Paragraphs use delimiters: AppleScript includes some built-in special delimiters with names: it is capable of discerning words delimited by some but not all non-printing characters, and paragraphs delimited by line feeds or carriage returns. Here are some examples.

words of "Hi, I'm Peggy-Sue" --> {"Hi", "I'm", "Peggy-Sue"}

words of "Now*is the.time&for all_good  (folks) to learn-AppleScript"
--> {"Now", "*", "is", "the.time", "&", "for", "all", "_", "good", "folks", "to", "learn-AppleScript"}

words of "this string has a CR
in it"  --> {"this", "string", "has", "a", "CR", "in", "it"} - but the return itself doesn't show.

words of "this string has" & return & "in it"
-- {"this", "string", "has", "
-- ", "in it"}  - this time it does show because we inserted it as a character.

words of "this string has a" & tab & "in it" --> {"this", "string", "has", "a", "	", "in it"} - the tab counts

words of "set-piece is hyphenated" --> {"set-piece", "is", "hyphenated"} - the hyphen isn't separated

words of "funny*word with asterisk" --> {"funny", "*", "word", "with", "asterisk"} - the asterisk is separate

set t to "This, he said, is a sentence."
words of t --> {"This", "he", "said", "is", "a", "sentence"}
-- Note that the commas didn't come through. The commas in the result are separating the words in the list.

paragraphs of "this string has a return
in it" --> {"this string has a return", "in it"}

AppleScript’s text item delimiters: AppleScript includes a global property called “text item delimiters” that can be set in a script. The default value is {“”}, i.e., scripts in which the text item delimiters are not specifically changed don’t alter text in any way. Note that while the default value of AppleScript’s text item delimiters is actually {“”}, in our AppleScripts, “” will do because AppleScript will usually treat single-item lists as the item the list contains. The reason for the list default is to allow for future expansion with multiple delimiters, but this has been so for years and multiple delimiters have yet to be implemented. Perhaps they never will be. See this article on “AppleScript Properties” for the definitive word.

In the discussion that follows, I have carefully referred to “AppleScript’s text item delimiters” in the examples. As a matter of interest, for AppleScripts not involving a tell block, “text item delimiters” by itself will work in most scripts without the “AppleScript’s” preface. Caution is required, however, when a tell block is involved because applications like TextEdit also use the AppleScript key words “text item delimiters” in their dictionaries, and instant confusion will result about whose text item delimiters are meant. Beginners should stick to the full version. Experienced scripters will know when to use the shortened form.

It is easy to see that there is no pre-set delimiter in AppleScript if you take a list of words and coerce them into a text string as in the example below. (Note that for the remainder of this article, I will omit the braces when setting AppleScript’s text item delimiters.)

set myWords to words of "The time has come the walrus said"
--> {"The", "time", "has", "come", "the", "walrus", "said"}
-- Now coerce that list back to a string:
myWords as string --> "Thetimehascomethewalrussaid"
-- That is squished together because no text "" is put between the items.
-- but if we set the AppleScript's text item delimiters to space...
set AppleScript's text item delimiters to space
set MW to myWords as string
set AppleScript's text item delimiters to "" -- ALWAYS SET THEM BACK
MW --> "The time has come the walrus said"

It is important to pay attention to the message: “ALWAYS SET THEM BACK”. AppleScript remembers its delimiters setting. Even if you open a new second script in the Script Editor, the delimiters you set in the first will apply in the second. The “always set them back afterwards” philosophy with delimiters will avoid serious problems later. Some people use an “always set them explicitly before use” approach, which works well within self-contained scripts. It is the view of many expert scripters, however, that it’s safest to script courteously but defensively. Reset the delimiters yourself before finishing or handing over to another script, but don’t assume that other scripts are doing the same for you. The best policy is to note what the delimiters are before you change them and to set them back when your own values are no longer needed.
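
One way to make the “set them back” habit harder to forget is to do the work inside a handler that restores the saved value even if something goes wrong part-way through. The following is only a sketch of that idea; the handler name joinList and its parameter names are hypothetical, not part of AppleScript.

```applescript
-- joinList is a hypothetical helper: it joins a list into text using the given
-- delimiter and always restores the caller's delimiters, even after an error.
on joinList(theList, theDelimiter)
	set savedDelimiters to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to theDelimiter
		set joinedText to theList as text
	on error errMsg number errNum
		-- Restore first, then pass the original error along to the caller.
		set AppleScript's text item delimiters to savedDelimiters
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to savedDelimiters
	return joinedText
end joinList

joinList({"The", "walrus", "said"}, space) --> "The walrus said"
```

Because the handler notes the delimiters on entry and restores them on every exit path, the calling script never sees them change.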

As a second example, the Script Editor uses ASCII character 10 (a line feed) between paragraphs so delimiting paragraphs by looking for line feeds is the same as simply asking for the paragraphs.

set txt to "line 1
line 2 
line 3
line 4"
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 10 -- (a line feed)
set newTxt to text items of txt -- not text of, text items of
set AppleScript's text item delimiters to tid -- whatever they were before - ALWAYS SET THEM BACK!
newTxt --> {"line 1", "line 2 ", "line 3", "line 4"}
-- The script term "words" segregates text items using spaces, returns, and line feeds
words of txt --> {"line", "1", "line", "2", "line", "3", "line", "4"}
-- and the script term "paragraphs" produces the same result as the delimiters did above
paragraphs of txt --> {"line 1", "line 2 ", "line 3", "line 4"}
-- and if we coerce newTxt back to a string with a line feed as the delimiter, we get:
(* line 1
line 2
line 3
line 4 *)

Searching and Replacing: One of the most frequent uses of AppleScript’s text item delimiters is to search a string of text and replace one or more words in it with an alternative. Most text processors have a built-in tool to do this. Here’s a tool in AppleScript for finding and replacing text in a variable. It is in the form of a handler called “switchText”:

set ourText to "To be or not to be, that is the question."
set findThis to "be"
set replaceItWith to "script"
set newText to switchText of ourText from findThis to replaceItWith -- our call to the handler
--> "To script or not to script, that is the question."  
-- but we'll continue:
set nextText to switchText of newText from " is the question" to " in doubt"
--> "To script or not to script, that in doubt." 
-- and then again:
set lastText to switchText of nextText from "that" to "never"

--> "To script or not to script, never in doubt."

to switchText of theText from SearchString to ReplaceString
	set OldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to SearchString
	set newText to text items of theText
	set AppleScript's text item delimiters to ReplaceString
	set newText to newText as text
	set AppleScript's text item delimiters to OldDelims
	return newText
end switchText

Now what has happened here? Here’s a really nice explanation of what the handler is doing written by Kai Edwards (who I hope will forgive me for the editorial changes I’ve made to it):

set newText to switchText of "What, Purple Shoes?" from "Purple" to "Green" 

to switchText of currentText from SearchString to ReplaceString -- the handler
	
	set storedDelimiters to AppleScript's text item delimiters
	-- this simply stores the current value of AppleScript's text item delimiters
	-- so they can be restored later (thus helping to avoid potential problems elsewhere).
	-- Remember, we always set them back to what they were.
	
	set AppleScript's text item delimiters to SearchString
	-- AppleScript's text item delimiters are now set to "Purple"
	
	set currentText to currentText's text items -- note we have changed currentText's value
	-- create a list of text items from the original text, separated at the points where the
	-- current text item delimiter ("Purple") appeared.
	--> {"What, ", " Shoes?"} - Note that the spaces and punctuation are retained.
	
	set AppleScript's text item delimiters to ReplaceString
	-- AppleScript's text item delimiters are now set to "Green"
	
	set currentText to currentText as Unicode text
	-- coerce the list  {"What, ", " Shoes?"} to Unicode text. This operation will also 
	-- insert the current value of AppleScript's text item delimiters ("Green")
	-- between each of the listed items
	
	--> "What, Green Shoes?"
	
	set AppleScript's text item delimiters to storedDelimiters
	-- restore the value of AppleScript's text item delimiters
	-- to whatever they were on entering the subroutine. Remember that a call to this
	-- might have been made from within a section of script that had the TIDs set to
	-- something else. Hand the result back with the TIDs as they were.
	
	currentText
	-- return the now modified text (and restored TIDs) -- "What, Green Shoes?"
	
end switchText -- the end of the handler.
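
The same split-and-join pattern handles the other chores mentioned at the start of this tutorial, such as making web-friendly names or turning slashes into colons. Here is a small self-contained sketch; the handler name replaceText and the sample strings are made up for illustration, and the handler is simply the find-and-replace idea above written with positional parameters.

```applescript
-- replaceText is a hypothetical helper: split at searchString, join with replacementString.
on replaceText(theText, searchString, replacementString)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set theItems to text items of theText -- split at every occurrence
	set AppleScript's text item delimiters to replacementString
	set newText to theItems as text -- rejoin with the replacement in between
	set AppleScript's text item delimiters to savedDelimiters -- set them back
	return newText
end replaceText

-- Underscores in place of spaces:
set webName to replaceText("My Summer Photo.jpg", space, "_") --> "My_Summer_Photo.jpg"
-- Colons in place of slashes:
set colonPath to replaceText("Macintosh HD/Users/adam/Desktop", "/", ":") --> "Macintosh HD:Users:adam:Desktop"
```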

Finding Base Name of File (without the extension): Another example often seen is removing an extension, i.e., finding the “base name” of a file. TIDs will do it, although they are assuredly not the only way. We could have reversed the characters of the file name and searched for the first period in a repeat loop, for example.

set jobNum to "123.456.pdf"
getBaseName from jobNum --> "123.456"

-- the following looks after the possibility that the base name includes a "."
-- e.g. This.fileName.ext.

to getBaseName from t -- (Kai Edwards)
	set d to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "." -- separated at periods
	if (count t's text items) > 1 then set t to t's text 1 thru text item -2
	-- Aha, there is more than one text item, so keep everything from the first
	-- character through the second-to-last item, i.e., up to the last period.
	--> "123.456" 
	-- This is actually the result we want, and we could stop but the next few 
	-- instructions deal with getting the text back to its initial Unicode 
	-- text or ASCII text form.
	
	set t to t's text items -- splits t into a list again at the periods
	tell t to set t to beginning & ({""} & rest) -- puts it back together to
	-- preserve its "type". Basically what it does is to leave the result as
	-- ASCII text if it was ASCII text or as Unicode text if it was Unicode text.
	set AppleScript's text item delimiters to d -- always set them back again!
	return t
end getBaseName

Note that the purpose of restoring the text to its original form is that it avoids potential problems later. The dichotomy of ASCII text versus Unicode text in AppleScript is a subject for another article. It’s a constant source of confusion. Readers might refer to this article in “Joel on Software” for some hints.
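
The reverse task, grabbing the extension rather than the base name, is a small variation on the same idea: split at the periods and take the last item. This is only a sketch under the assumption that a name with no period has no extension; the handler name getExtension is made up for illustration.

```applescript
-- getExtension is a hypothetical helper: it returns the text after the last period,
-- or an empty string when the name contains no period at all.
on getExtension(fileName)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "."
	set nameParts to text items of fileName
	set AppleScript's text item delimiters to savedDelimiters -- set them back
	if (count nameParts) < 2 then return ""
	return item -1 of nameParts
end getExtension

getExtension("123.456.pdf") --> "pdf"
getExtension("README") --> ""
```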

Stripping Extra Spaces From Text: This example, by Nigel Garvey, shows how to remove an arbitrary number of spaces in front of and following some text to be retained. Such padding often results from reading text that has been set in columns by separating the words in each row with enough spaces to line up the columns in a monospaced font like Courier or Monaco. As Mr. Garvey said: this is by no means the only way to do this, but it happens to be a convenient way to cater for the possibility that the input string might be all spaces or zero-length.

set someText to "   -test bin  " -- as Unicode text (three leading and two trailing spaces)

set ASTID to AppleScript's text item delimiters -- remember the old value
set AppleScript's text item delimiters to space  -- the character we want to remove
set TIs to someText's text items -- get the list of items, {"", "", "", "-test", "bin", "", ""} -- more later on why the empty items appear.

set a to 1
set b to (count TIs) --> 7 in this case

repeat while (a < b) and ((count item a of TIs) is 0) -- count the characters in the item
	set a to a + 1
end repeat 
--> a is now 4 for this example, i.e., "-test" is the 4th text item in TIs.

-- Stripping trailing spaces as well.
repeat while (b > a) and ((count item b of TIs) is 0) -- start at the end of TIs and go backwards
	set b to b - 1
end repeat
--> b is now 5 for this example, i.e., "bin" is the 5th text item in TIs.

set strippedText to text from text item a to text item b of someText
set AppleScript's text item delimiters to ASTID -- SET THEM BACK!

strippedText --> "-test bin" with the internal space left intact

Why didn’t we lose the space between “-test” and “bin”? Because we counted in from the ends of the list of text items to get a and b and those counts never reached any spaces in the middle of the text items. Notice in this example that someText’s text items turned out to be {“”, “”, “”, “-test”, “bin”, “”, “”}. Why do the “empty” text items appear here but not in other examples? An “empty” item occurs whenever the delimiter occurs at the beginning or end of the given text, or whenever two delimiters sit side by side, as they do in a run of spaces. Compare these:

set t to "Able was I ere I saw Elba"
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "Able"
set ti to text items of t
ti --> {"", " was I ere I saw Elba"}
-- OR
set AppleScript's text item delimiters to "Elba"
set tii to text items of t
tii --> {"Able was I ere I saw ", ""}
-- BUT
set AppleScript's text item delimiters to "ere"
set tiii to text items of t
tiii --> {"Able was I ", " I saw Elba"}
set AppleScript's text item delimiters to tid
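
Empty items also appear when two delimiters are adjacent, which is what a run of several spaces amounts to. A minimal sketch with made-up sample strings:

```applescript
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to ","

-- Two commas in a row leave an empty item between them.
set adjacentItems to text items of "a,,b" --> {"a", "", "b"}

-- A delimiter at each end leaves an empty item at each end.
set edgeItems to text items of ",a," --> {"", "a", ""}

set AppleScript's text item delimiters to savedDelimiters -- set them back
```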

Just as an aside, the opposite of splitting the text at the middle word is finding the middle word using AppleScript’s “middle element reference”:

set t to "Able was I ere I saw Elba"
middle word of t --> "ere"
-- so we could have split this text like this:
set tid to AppleScript's text item delimiters -- save the current value
set AppleScript's text item delimiters to middle word of t
-- Note that "middle" returns the left one of a pair if there is an even number of elements in the object of the command.
set tm to text items of t
set AppleScript's text item delimiters to tid
tm --> {"Able was I ", " I saw Elba"}
{item 1 of tm, (reverse of characters of item 2 of tm) as string} --> {"Able was I ", "ablE was I "} - A Palindrome because the middle word is too!
--> Not very useful, but fun. We could have done this about the middle character as well to prove that the phrase was a palindrome (punctuation and capitalization excepted, the same read in either direction).

A word of warning about case: AppleScript’s text item delimiters are case sensitive for plain ASCII text, but they are case insensitive for Unicode text. Further, as of this writing, AppleScript cannot deal with text item delimiters containing Unicode characters that do not map to Western Mac OS Roman characters. This can be problematic when reading text into a script that contains such characters even when they may be quite readable in TextEdit, for example. Exploring what to do about such characters is a complex topic for another day.

The solution when using mappable Unicode text where case is important is to use “considering case” in the script. The following examples illustrate:

set t to "Twas brillig and the slithy toves"
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "twas"
set ti to t's text items
set AppleScript's text item delimiters to tid
ti --> {"Twas brillig and the slithy toves"} -- no match.
-- Try for Unicode text, however:
set t to t as Unicode text
set AppleScript's text item delimiters to "twas"
set tiUCT to t's text items
set AppleScript's text item delimiters to tid
tiUCT --> {"", " brillig and the slithy toves"}

To fix that “miss” for Unicode text, we insert “considering case”

set t to "Twas brillig and the slithy toves" as Unicode text
set tid to AppleScript's text item delimiters -- save the current value
considering case
	set AppleScript's text item delimiters to "twas"
	set tiU to t's text items
	set AppleScript's text item delimiters to tid
end considering
tiU --> {"Twas brillig and the slithy toves"}
-- Now "twas" is not equal to "Twas"

As our final examples, here are four scripts for dealing with multiple delimiters. The first pair, below, includes a handler for finding the text between two delimiters.

set t to "My father has spanked me, and my mother has spanked me; all my aunts and uncles have spanked me for my 'satiable curtiosity; and still I want to know what the Crocodile has for dinner!"

extractBetween(t, "my ", ";") --> "'satiable curtiosity"
---- The handler ----
to extractBetween(SearchText, startText, endText)
	set tid to AppleScript's text item delimiters  -- save them for later.
	set AppleScript's text item delimiters to startText -- split at the start marker.
	set endItems to text of text item -1 of SearchText -- everything after the last start marker.
	set AppleScript's text item delimiters to endText  -- split at the end marker.
	set beginningToEnd to text of text item 1 of endItems -- everything before the first end marker.
	set AppleScript's text item delimiters to tid  -- back to original values.
	return beginningToEnd -- pass back the piece.
end extractBetween

A more useful use of the same handler is to form the bbCode for a link in a forum from a webloc (anything but a Safari webloc created by dragging the favicon to the desktop - drag the text only if you use Safari). To illustrate this on this web page without confusing your browser and the software that produces the page, I must change the brackets used from the usual left and right chevrons “<->” for XML, and left and right brackets “[-]” for the bbCode. To actually use this script (which will run as is), you will have to change them back. I have made the chevrons into ^, and the brackets into |.

-- read (pathToYourWeblocHere) - text shown below as "p"
set p to "^?xml version=\"1.0\" encoding=\"UTF-8\"?^
^!DOCTYPE plist PUBLIC \"-//Apple Computer//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\"^
^plist version=\"1.0\"^
^dict^
	^key^URL^/key^
	^string^http://bbs.applescript.net/^/string^
^/dict^
^/plist^
"
-- I want the url from that webloc inserted in bbCode, say, with "AppleScript Forums" as the link text.
set link to "AppleScript Forums" -- the text for my link
set ex to extractBetween(p, "^string^", "^/string^") -- extract the URL
--> "http://bbs.applescript.net/"
set tURL to "|url=" & ex & "|" & link & "|/url|" -- form it into bbCode
--> "|url=http://bbs.applescript.net/|AppleScript Forums|/url|"
-- this, pasted into a forum, would look like a link but point to the url, after the symbols are changed back.

to extractBetween(SearchText, startText, endText)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to startText
	set endItems to text of text item -1 of SearchText
	set AppleScript's text item delimiters to endText
	set beginningToEnd to text of text item 1 of endItems
	set AppleScript's text item delimiters to tid
	return beginningToEnd
end extractBetween

As the final set of scripts we explore a handler for dealing with lists of TIDs instead of just a beginning and an ending one. We will use it for only two, however, just to illustrate. Although AppleScript’s text item delimiters are literally a list of one or more strings as was mentioned above, current versions of AppleScript process only the first item of a list and ignore any others, present or not. If AppleScript’s text item delimiters was set to {“tid-1”, “tid-2”} only “tid-1” would be processed. The following is an example by Jon Nathan of a script for processing multiple delimiters.

The handler below depends on the technique of inserting a string not likely to be encountered in any real text, like “?|?” for example, as a marker in place of each of the delimiters in the given list, and then splitting the text at the markers. To accomplish this, the handler runs through a repeat loop, replacing each delimiter in turn with the marker. Having marked them all, it splits the text at the markers, restores the saved TIDs, and returns the resulting list.

set the_string to "This is a string: with_multiple delimiters."
set the_delims to {":", "_"}
my multi_atid_split(the_string, the_delims)
--> {"This is a string", " with", "multiple delimiters."}

on multi_atid_split(the_string, the_delims)
	-- store the originals and set up the marker.
	set {OLD_delim, _marker_} to {AppleScript's text item delimiters, "?|?"}
	-- process each of the delimiters in the_delims replacing each with the  _marker_
	repeat with this_delim in the_delims
		my atid(this_delim) -- see the handler that follows
		set the_string to text items of the_string
		my atid(_marker_)
		set the_string to text items of the_string as string
	end repeat
	-- At this point our text looks like this:
	-- "This is a string?|? with?|?multiple delimiters."
	my atid(_marker_) -- now get the markers out
	set the_string to text items of the_string
	my atid(OLD_delim) -- restore the original delimiters
	return the_string
end multi_atid_split

-- This 3-line handler saves a lot of typing.
on atid(the_delim)
	set AppleScript's text item delimiters to the_delim
end atid

This can be useful too when removing extraneous characters from text. Suppose we had phone numbers in a number of formats and for our database, we wanted a simple string of numbers including the area code but without parentheses, dashes or spaces in it. The following script does this for a sampling of four phone numbers (in the North American style) with and without area codes. It would be easy to modify it to accept a single phone number and clean out the characters we didn’t want.

-- Set the phone numbers to be tested
set phNums to {"456-4321", "876-789-1212", "(898) 321-2121", "505 1234"}
set localAC to "911" -- local area code to use when none is given.

-- Test all the numbers
set cleanedPN to {} -- a place to put the "cleaned" phone numbers
set removals to {"(", ")", "-", space} -- the characters to remove
repeat with k from 1 to count phNums
	tell item k of phNums -- to permit using "it" for short
		tell my multiTiD(it, removals) as string -- this does the job
			if (count of it) > 7 then -- it's got an area code (different "it")
				set end of cleanedPN to it -- add to our list
			else -- it hasn't got an area code
				set end of cleanedPN to localAC & it -- add with AC to our list
			end if
		end tell
	end tell
end repeat

cleanedPN --> {"9114564321", "8767891212", "8983212121", "9115051234"}

--- handlers ---
on multiTiD(tString, delims)
	set {saveOTID, _marker_} to {atid("otid"), "?|?"}
	repeat with aDelim in delims
		my atid(aDelim)
		set tString to text items of tString
		my atid(_marker_)
		set tString to text items of tString as string
	end repeat
	my atid(_marker_)
	set tString to text items of tString
	my atid(saveOTID)
	return tString
end multiTiD

on atid(delim)
	if delim = "otid" then
		return AppleScript's text item delimiters
	else
		set AppleScript's text item delimiters to delim
	end if
end atid

Edit: An interested reader (JimT) wrote in to say that the approach above could be problematic if the marker inserted contained any characters to be included in the search. He proposed a workaround as follows:


set theString to "This is a string: with_multiple delimiters."
set theDelims to {":", "_"} -- remove the colon and underscore
set tTextItems to my multiDelimSplit(theString, theDelims)
--> {"This is a string", " with", "multiple delimiters."}

on multiDelimSplit(theString, theDelims)
	set oldDelim to AppleScript's text item delimiters
	set theList to {theString}
	repeat with aDelim in theDelims
		set AppleScript's text item delimiters to aDelim
		set newList to {}
		repeat with anItem in theList
			set newList to newList & text items of anItem
		end repeat
		set theList to newList
	end repeat
	set AppleScript's text item delimiters to oldDelim
	return theList
end multiDelimSplit

-- note that order of delims can matter if they have a common character
multiDelimSplit("This | is a | string.", {" | ", " "})
--> {"This", "is", "a", "string."}
multiDelimSplit("This | is a | string.", {" ", " | "})
--> {"This", "|", "is", "a", "|", "string."}
-- in the second case, the spaces used by the second delim
-- were removed by the first delim

For even more good examples, search the forums for “AppleScript’s text item delimiters”. There are hundreds of examples and one of them might be just what you need.

Another important example was brought to my attention on June 10, 2010; namely this handler by Yvan Koenig for extracting every instance of text that occurs between bounding delimiters:

-- Extract every instance of text between bounding delimiters (Yvan Koenig)
set t to "My father has spanked me, and my mother has spanked me; all my aunts and uncles have spanked me for my 'satiable curtiosity; and still I want to know what the Crocodile has for dinner!"

set extract to extractBetween(t, "my ", ";") --> {"mother has spanked me", "'satiable curtiosity"}

---- The handler ----
to extractBetween(SearchText, startText, endText)
	set tid to AppleScript's text item delimiters -- save them for later.
	set AppleScript's text item delimiters to startText -- find the first one.
	set liste to text items of SearchText
	set AppleScript's text item delimiters to endText -- find the end one.
	set extracts to {}
	repeat with subText in liste
		if subText contains endText then
			copy text item 1 of subText to end of extracts
		end if
	end repeat
	set AppleScript's text item delimiters to tid -- back to original values.
	return extracts
end extractBetween

Rather neat.

Adam Bell

Krioni · January 18, 2010, 10:33pm

According to the release notes for Mac OS X 10.6, and verified by my tests, AppleScript’s text item delimiters now RESPECTS and uses every item in a list of text item delimiters. Read the official note at:
http://developer.apple.com/mac/library/releasenotes/AppleScript/RN-AppleScript/RN-10_6/RN-10_6.html

So, it is possible now to split a long string of text into chunks using a list of separator characters, as long as you’re willing to have something only work in Snow Leopard.

That “s” in delimiters finally means something!
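
A short sketch of what that looks like, assuming AppleScript 2.1 (Mac OS X 10.6) or later. The sample strings are borrowed from the examples earlier in this thread; the variable names are made up for illustration.

```applescript
set savedDelimiters to AppleScript's text item delimiters

-- Split at either a colon or an underscore in a single pass.
set AppleScript's text item delimiters to {":", "_"}
set theParts to text items of "This is a string: with_multiple delimiters."
--> {"This is a string", " with", "multiple delimiters."}

-- Strip several unwanted characters at once: split on all of them, then join with "".
set AppleScript's text item delimiters to {"(", ")", "-", space}
set numberParts to text items of "(898) 321-2121" --> {"", "898", "", "321", "2121"}
set AppleScript's text item delimiters to ""
set cleanedNumber to numberParts as text --> "8983212121"

set AppleScript's text item delimiters to savedDelimiters -- set them back
```

On earlier systems only the first item of the list is used, so the marker-based handlers above are still needed there.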

Model: MacBook Pro
AppleScript: 2.1.1
Browser: Safari 531.21.10
Operating System: Mac OS X (10.6)

Adam_Bell · January 19, 2010, 8:16pm

Thanks for noticing that, Dan; I hadn’t (and still do about half my work on a PPC dual-core G5 which can’t run 10.6)

Georgech · July 10, 2011, 8:43pm

Hi,
I have tried to traverse 3 text files in a folder, but the delimiter fails when it goes through the 2nd text file. I am not sure what is wrong. The following is my script.

on open user_Choice
	tell application "Finder" to set fileList to every file in item 1 of user_Choice

repeat with aFile in fileList
	set aid to (read (aFile as alias))
	set AppleScript's text item delimiters to {"="}
	set aa to aid as string
	set aa to text item 2 of aid
	display dialog aa
end repeat

end open

Georgech · July 11, 2011, 7:07am

ooops, I solved my own question. Thanks anyway.

on open user_Choice
	tell application "Finder" to set fileList to every file in item 1 of user_Choice
	repeat with aFile in fileList
		set aid to (read (aFile as alias))
		getContent(aid)
	end repeat
end open

-- subroutine, to traverse text files and place in variables
on getContent(aid)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {"="}
	set nname to text item 1 of aid
	set mval to text item 2 of aid
	if (nname = "studentid") then
		set idd to mval
		display dialog idd with title "Student's ID"
	else if (nname = "studentip") then
		set ipp to mval
		display dialog ipp with title "Student's IP"
	else if (nname = "studentname") then
		set myname to mval
		display dialog myname with title "Student's Name"
	end if
	set AppleScript's text item delimiters to tid
end getContent

casp · August 26, 2011, 8:29pm

Thanks for the tutorial, very informative and educational as I’m new to this forum and applescripting.

I attempted to use the “Searching and Replacing” script to edit the string of a variable, but I can’t seem to get it to work in my situation. Here is my script:

repeat with i from 1 to (count of listOfFilesNotInDatabase)
	try
		set aFile to ((item i of listOfFilesNotInDatabase) as POSIX file as alias)
		tell application "iTunes" to add aFile
		log aFile -- debugging purposes
	end try
end repeat
try
	set theText to listOfFilesNotInDatabase
	set newText to switchText of theText from "/Volumes/Media/iTunes/iTunes Media/Music/" to ""
	---Subroutine for switchText is below----
	display dialog "The following files were added to your iTunes Library  " & newText
end try

Essentially the variable, listOfFilesNotInDatabase, is a list of file paths, all of which have the same first portion, /Volumes/Media/iTunes/iTunes Media/Music/. All I want is to clear this out so the display dialog doesn’t look so messy.

Any suggestions?
Thanks

Here is the subroutine from the previous post:

to switchText of theText from SearchString to ReplaceString
	set OldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to SearchString
	set newText to text items of theText
	set AppleScript's text item delimiters to ReplaceString
	set newText to newText as text
	set AppleScript's text item delimiters to OldDelims
	return newText
end switchText

Schmye_Bubbula · August 7, 2013, 5:39pm

Somebody explain conceptually what’s going on here; what’s the difference?


set AppleScript's text item delimiters to ASCII character (0)

repeat with x in "aabbccbb"
	exit repeat
end repeat

x --> item 1 of "aabbccbb"

set AppleScript's text item delimiters to x

AppleScript's text item delimiters --> "a"

text items of "aabbccbb" as text --> "bbccbb"

set AppleScript's text item delimiters to ASCII character (0)

But now do it like this:


set AppleScript's text item delimiters to ASCII character (0)

(*
repeat with x in "aabbccbb"
	exit repeat
end repeat

x --> item 1 of "aabbccbb"
*)

set AppleScript's text item delimiters to item 1 of "aabbccbb"

AppleScript's text item delimiters --> "a"

text items of "aabbccbb" as text --> "aabbccbb" (Huh?)

set AppleScript's text item delimiters to ASCII character (0)

I don’t get it: the meat of the algorithm seems the same to me either way, namely:

1st way: set AppleScript’s text item delimiters to x
vs.
2nd way: set AppleScript’s text item delimiters to item 1 of “aabbccbb”

Why doesn’t…
text items of “aabbccbb” as text
…the 2nd way strip out the first two characters of “aabbccbb”?
AppleScript’s text item delimiters read the same both ways (“a”).
What am I missing?

Adam_Bell · August 7, 2013, 6:25pm

text items returns a list, in this case three items: {“”, “”, “bbccbb”} where the first empty quote is the first “a”, the second empty quote is the second “a”, and the third part is the rest. By converting text items to text you get the whole list.

set AppleScript's text item delimiters to item 1 of "aabbccbb"

set td to AppleScript's text item delimiters --> "a"

set y to text items of "aabbccbb" --> {"", "", "bbccbb"}

I don’t know why the conversion reveals the “a”s. Seems to me the answer should be “ bbccbb”.

Schmye_Bubbula · August 7, 2013, 6:44pm

Why not “bbccbb” instead of “ bbccbb”? Aren’t the first two text items null characters?
And “item 1 of ‘aabbccbb’” is the same thing as “x” when setting AppleScript’s text item delimiters, no? Why the different result?
(My head is getting ready to explode!)

McUsrII · August 7, 2013, 8:47pm

Hello.

What first happens is that you get the same list as Adam Bell got: namely {“”,“”,“bbccbb”}. Now, you must look at the empty strings as placeholders for text item delimiters.

You haven’t changed your text item delimiters, so when the list is turned back into text, AppleScript inserts the text item delimiters there for you, and you end up with “aabbccbb” again.

This is totally normal behaviour. If you don’t change the text delimiter, you’ll end up with what you had originally, the moment you convert the list of text items back to text.


set AppleScript's text item delimiters to item 1 of "aabbccbb"

log AppleScript's text item delimiters --> "a"

set d to text items of "aabbccbb" 
# {"","","bbccbb"}
set AppleScript's text item delimiters to ""
# we remove the slots for the text item delimiters from the list
set d to d as text
log d
-- > "bbccbb"

Schmye_Bubbula · August 7, 2013, 11:40pm

Eureka! Got it! Thanks, guys, and especially McUsrII, who finally made me see it: I needed to split up the “to text items of” and the “as text,” and restore the standard null text item delimiters in between. So in terms of my 2nd scriptlet in post #7 above, to make it work, it should be modified to…


set AppleScript's text item delimiters to ASCII character (0)

(*
repeat with x in "aabbccdd"
	exit repeat
end repeat

x --> item 1 of "aabbccdd"
*)

set AppleScript's text item delimiters to item 1 of "aabbccdd"

AppleScript's text item delimiters --> "a"

set strippedText to text items of "aabbccdd"
-- Needed to split-up the "to text items of" and the "as text"...

set AppleScript's text item delimiters to ASCII character (0)
-- ... and put this in-between.

set strippedText to strippedText as text --> "bbccdd"
-- (This is the "as text" split away after the null text item delimiters were restored.)

AppleScript: 2.0.1
Browser: Firefox 4.0.1
Operating System: Mac OS X (10.5)

Adam_Bell · August 7, 2013, 11:54pm

Thanks, McUsrII – that was the missing link; that he didn’t change the TID before converting to text. I knew better but had a senior moment; just turned 76 a few days ago. :rolleyes:

McUsrII · August 8, 2013, 12:16am

You don’t have to be 76 to have a senior moment, I guess I am a living proof of that.

I didn’t figure it out at first either, not until it suddenly dawned upon me.

Belated congratulations on your birthday, Adam.

Schmye_Bubbula · August 8, 2013, 12:37am

I still don’t know why my 1st scriptlet worked in post #7. I didn’t split the operations with restoration of the null TID in between in that one, did I? Or did the repeat slip it in somehow?

What really threw me was that the TIDs were “a” in both scriptlets in my post #7. (And apparently it’s still an anomaly, judging from Adam’s post #8.)

McUsrII · August 8, 2013, 2:35am

Hello.

Yes, I’d say that behaviour is an anomaly! I can’t explain it, but I can guess that since the variable is declared in a loop, then it just isn’t “visible” enough to work properly. It works well enough to get things removed but not to be inserted.

Well, that is a clever hack, if you just want to remove stuff! Now, if you change the line

set AppleScript's text item delimiters to x

into

set AppleScript's text item delimiters to contents of x

Then you get the default behaviour.

If you change the x into a normal variable in your first script in your post #7, then it works as it should too, (returning aabbccbb), so I guess it is that very locally scoped variable that does the trick, (x is really only meant to be used in the loop as a loop variable). If you declare x as local before the assignment in the repeat loop, then the “trick” also breaks, returning (aabbccbb), so I think the scoping is the culprit.

I don’t think the trick saves you much anyway, 3 lines added contra one setting the delimiters, and one coercing to text, but there might be hiding a slight speed gain there.

Schmye_Bubbula · August 16, 2013, 2:12am

I don’t understand why text item delimiters appear only whenever they occur at the beginning or end of given text, and not in the middle. In other words:


set text item delimiters to ""
set x to "abbc"
set text item delimiters to "b"
set x to x's text items --> {"a", "", "c"}
set text item delimiters to ""

But:


set text item delimiters to ""
set x to "abc"
set text item delimiters to "b"
set x to x's text items --> {"a", "c"}
set text item delimiters to ""

Why weren’t the latter’s text items {“a”, “”, “c”}, and the former’s {“a”, “”, “”, “c”}? My natural expectation (wrong, it turns out) was that the delimiters are there in the original text string so they should also appear in the list of text items. The “Able was I ere I saw Elba” example in the first post’s tutorial says that’s how it works, but I don’t get why it works that way.

Nigel_Garvey · August 16, 2013, 7:00am

You have to think of ‘text items’ and ‘text item delimiters’ as alternating in the text, with ‘text items’ always being on the outside. Each instance of a delimiter comes between two text items.

In the text “abc”, the single instance of the delimiter “b” comes between “a” and “c”, so those are the text items.

In “abbc”, the two instances of the delimiter are adjacent. But there’s notionally a zero-length (or “empty”) text item between them and this is returned as the zero-length string “”.

Similarly, if an instance of the delimiter occurs at the beginning or end of the text, there’s notionally an empty text item on its outer side.

This sounds unnecessarily esoteric when you’re talking about extracting the text items, but when the list of text items is coerced back to text, the delimiter is simply inserted between the items and the result contains the right number of delimiter instances in the right places.
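
To make the alternation concrete, here is a minimal sketch (the sample string and variable names are only illustrative) that splits a string with delimiters at both ends and doubled in the middle, then coerces the list back to text with the same delimiter still in force:

```applescript
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to ","

-- Delimiters at the start, doubled in the middle, and at the end
set sampleText to ",a,,c,"
set theItems to text items of sampleText --> {"", "a", "", "c", ""}

-- Four delimiter instances separate five text items
set itemCount to count of theItems --> 5

-- With "," still in force, the coercion puts one delimiter between each pair of items
set rebuilt to theItems as text --> ",a,,c,"

set AppleScript's text item delimiters to oldDelims
```

The number of text items is always one more than the number of delimiter instances, which is why the round trip reproduces the original string exactly.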

Schmye_Bubbula · August 17, 2013, 12:34pm

Thanks, Nigel. I guess that’s the crux: The text items in the list pertaining to the delimiters (i.e., each “”) aren’t the delimiters themselves, as I was wrongly construing, but rather the “empty” text items between the delimiters. Hard to wrap my head around, and I will have to ponder why it’s thusly as I work more with delimiters before it sinks-in. I was wanting it to be otherwise (each “” representing the delimiters themselves, not the absence of text items between them) because I was trying to use that as a way of parsing strings with far fewer passes than going through every single character. There’s probably still a way of doing that if I can just get the pattern of the way it really works down pat, so I’ll just plod along until I get it. Seat-of-the-pants AppleScripting is hard.

Nigel_Garvey · August 17, 2013, 2:55pm

You’ve “got” it now with regard to each “” in the list representing an “empty” text item.

There is of course no significance or reality to “empty” text items. They’re simply a convenient device to indicate the insertion points for any adjacent or text-end delimiter instances in the text for which you have text items. These situations would be difficult to indicate otherwise.

{“”, “a”, “”, “c”, “”} is five text items, so four delimiter instances have been removed from between them or four can be inserted:
“” & “b” & “a” & “b” & “” & “b” & “c” & “b” & “”
or “babbcb”

Schmye_Bubbula · August 23, 2013, 7:16pm

(If my problem here turns-out not to be specifically pertinent to text item delimiters, I’ll ask a mod to move this post out of the thread.)

This script aspires to remove redundant characters from a string.
It does the following:
• For each character in turn…
• Concatenate a dupe character to the first occurrence so that (after then setting text item delimiters to that character) any single instance not at the string endpoints (i.e., within the string) will explicitly show up as an empty string in the resulting text items list.
• Look for the first appearance of an empty string in the text items list and replace it with the character in question. (Not necessary to restore it in the same order to remove redundancies; just for shits & giggles.)
• After setting text item delimiters back to the regular {“”}, convert the text items list back to a string, freshly stripped of that character’s redundancies, and go on to the next character.

There may be a better way (and if there is, I’d love to hear it, however strictly speaking it would be off-topic), but what I really want to know is why in the first pass of the y-loop, {“a”, “”, “”, “bbccdd”} becomes an empty string (“”) when its variable is set to “as text” under the auspices of the default {“”} text item delimiters.


stripRedundantCharacters("aabbccdd")

to stripRedundantCharacters(inputText)
	repeat with x in inputText
		set getUniqueCharacters to text 1 through (the offset of x in inputText) in inputText & ¬
			text (the offset of x in inputText) through -1 in inputText
		set text item delimiters to x
		set getUniqueCharacters to getUniqueCharacters's text items
		
		repeat with y from 1 to the count of getUniqueCharacters's text items
			if text item y in getUniqueCharacters = "" then
				set text item y in getUniqueCharacters to x
				set text item delimiters to ""
				set getUniqueCharacters to getUniqueCharacters as text
				exit repeat
			end if
		end repeat
		
	end repeat
	return getUniqueCharacters
end stripRedundantCharacters

For your convenience, here it is again with the addition of debug code and comments:


stripRedundantCharacters("aabbccdd")

to stripRedundantCharacters(inputText)
	repeat with x in inputText
		log x --> "a" (in 1st pass; similarly with the following recorded Results)
		set getUniqueCharacters to text 1 through (the offset of x in inputText) in inputText & ¬
			text (the offset of x in inputText) through -1 in inputText ¬
			# Doubles-up the first character which will next become text item delimiters so that any single instances of it (within the string's interior) will make sure to get a null character ("") in the string's text items list.
		set text item delimiters to x
		set getUniqueCharacters to getUniqueCharacters's text items
		log getUniqueCharacters --> {"", "", "", "bbccdd"}
		log the (count of getUniqueCharacters's text items) --> 4
		
		repeat with y from 1 to the count of getUniqueCharacters's text items
			log y --> 1
			log text item y in getUniqueCharacters --> ""
			if text item y in getUniqueCharacters = "" then
				set text item y in getUniqueCharacters to x ¬
					# Puts-back the first occurrence of the character, in place.
				log text item y in getUniqueCharacters --> "a"
				log getUniqueCharacters --> {"a", "", "", "bbccdd"}
				set text item delimiters to ""
				set getUniqueCharacters to getUniqueCharacters as text
				log getUniqueCharacters --> "" (Huh? Why not "abbccdd" in the 1st pass?)
				log "Next x"
				exit repeat
			end if
		end repeat
		
	end repeat
	return getUniqueCharacters
end stripRedundantCharacters

Actually, I should have posed my question more simply in the general form (apologies!):


set z to "aabbccdd"

repeat with x in z
	set text item delimiters to x
	log text item delimiters --> "a"
	set z to z's text items
	log z --> {"", "", "bbccdd"}
	
	repeat with y from 1 to the count of z's text items
		set text item y in z to x
		log z --> {"a", "", "bbccdd"}
		set text item delimiters to ""
		set z to z as text
		log z --> "" (Huh? Why not "abbccdd" in 1st pass?)
		
	end repeat
end repeat
