A technique for speeding up handling of very long strings

G’day scripters.

I recently ran into a problem: AppleScript is very slow at finding characters near the end of very long strings; I’m talking 50,000+ characters here.

However, I came up with a way of greatly speeding up the process by dealing with the string in ‘chunks’. By doing this I cut my whole routine down from 55 minutes to 4 1/2 minutes.
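As a minimal sketch of the chunking idea on its own, separate from Mail: the handler below copies one short slice of the long string at a time and indexes into that slice rather than into the full string. The handler name findRunsInChunks and its parameters are made up for illustration and are not part of my script, and chunkSize is assumed to be a positive integer.

```applescript
-- Hypothetical helper (not from the original script): returns a list of
-- {startIndex, endIndex} pairs, one per run of markChar found in theText.
-- Assumes chunkSize is a positive integer.
on findRunsInChunks(theText, markChar, chunkSize)
	set theRuns to {}
	set textLength to count of theText
	if textLength = 0 then return theRuns
	repeat with chunkStart from 1 to textLength by chunkSize
		set chunkEnd to chunkStart + chunkSize - 1
		if chunkEnd > textLength then set chunkEnd to textLength
		-- Copy one short slice, then index into the slice instead of the long string
		set theChunk to text chunkStart thru chunkEnd of theText
		set runStart to 0
		repeat with i from 1 to (count of theChunk)
			if character i of theChunk is markChar then
				if runStart = 0 then set runStart to i
			else if runStart ≠ 0 then
				-- The run ended on the previous character; convert to absolute positions
				set end of theRuns to {chunkStart + runStart - 1, chunkStart + i - 2}
				set runStart to 0
			end if
		end repeat
		-- A run that reaches the end of the chunk is closed at the chunk boundary
		if runStart ≠ 0 then set end of theRuns to {chunkStart + runStart - 1, chunkEnd}
	end repeat
	return theRuns
end findRunsInChunks

findRunsInChunks("ab∎∎c∎", "∎", 4)
```

With a chunk size of 4, the call above should give {{3, 4}, {6, 6}}. A run that straddles a chunk boundary comes back as two adjacent ranges, which makes no difference when the ranges are only used for coloring.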

My problem was that I needed to color every instance of “∎” in a mail message that was between 10,000 and 50,000 characters in length. Mail is fast when told to color the 49,999th character, but finding that character in the long original string takes forever.

As well, the list of clients that the message had to supply a graph for was over 2,900 lines long, so I used a similar method to cut its time down too. The time above is the total for both routines combined.

So, here’s the code. However, if someone knows of a faster way of doing the above, I’d be grateful to know of it.

Regards

Santa

Edit: I hit the ‘post’ button without finishing the text, dammit.


on SendTheEmail(the_content, the_subject)
	set logo1 to ((current application's class "NSBundle")'s mainBundle())
	set LogoPath1 to (logo1's bundlePath()) as text
	set LogoPath to LogoPath1 & "/Contents/Resources/Logo.png" as text
	set LogoMMPath to LogoPath1 & "/Contents/Resources/LogoMM.png" as text
	tell application "Mail"
		activate
		set newMessage to make new outgoing message with properties {address:the_mailto, subject:the_subject, attachment:LogoPath, content:the_content}
		tell newMessage
			if colorFlag then
				set ContentCount to count of the_content
				set theCycleLoop to 1000
				if theCycleLoop > ContentCount then set theCycleLoop to ContentCount
				try
					repeat with x from 1 to ContentCount by theCycleLoop
						set y to x + theCycleLoop - 1 -- last character of this chunk; the next chunk starts at y + 1
						if y > ContentCount then set y to ContentCount
						set tempstring to characters x thru y of the_content as text
						set thePosition to 0
						repeat with xxx from 1 to (count of tempstring)
							if character xxx of tempstring = "∎" then
								if thePosition = 0 then set thePosition to xxx
							else
								if thePosition ≠ 0 then
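									-- the run starts at chunk index thePosition and ends at xxx - 1; add x - 1 for positions in the full text
									set color of characters (x + thePosition - 1) thru (x + xxx - 2) of content to TheColor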
									set thePosition to 0
								end if
							end if
						end repeat
						if thePosition ≠ 0 then set color of characters (x + thePosition - 1) thru (x + xxx - 1) of content to TheColor
					end repeat
				on error errmsg
					tell application "Finder"
						say "error"
						activate
						display dialog errmsg
					end tell
				end try
			end if
			set TempP to count of paragraphs
			make new attachment with properties {file name:LogoPath} at before the first paragraph
			make new attachment with properties {file name:LogoMMPath} at after paragraph (TempP - 4)
			repeat with themailitem in MainRecipients
				if themailitem as text ≠ "" then make new to recipient at end of to recipients with properties ¬
					{address:themailitem}
			end repeat
			repeat with themailitem in CCRecipients
				if themailitem as text ≠ "" then make new cc recipient at end of cc recipients with properties ¬
					{address:themailitem}
			end repeat
			send
		end tell
	end tell
end SendTheEmail


set theClientList to {}
		set theClientList2 to {}
		tell application "Finder"
			set theFiles to files of folder TallyName2
			set xx to 0
			repeat with individualFile in theFiles
				my UpDateProgress(10)
				set theCreationDate to creation date of individualFile
				set theCMonth to month of theCreationDate as integer
				if theCMonth ≥ tempStartingMonth and theCMonth ≤ tempendingmonth then
					set tempWholeList to my ReadFile2(individualFile as text) as list
					repeat with paragraphCycle in paragraphs of item 1 of tempWholeList -- run through the clients in .dat
						try
							set x to offset of "," in paragraphCycle
							set y to offset of ">" in paragraphCycle
							set tempClientString to characters (x + 1) thru y of paragraphCycle as text
							set exitFlag to false
							set ContentCount to count of items of theClientList
							set theCycleLoop to 100
							if theCycleLoop > ContentCount then set theCycleLoop to ContentCount
							if ContentCount ≠ 0 then
								repeat with x from 1 to ContentCount by theCycleLoop
									set y to x + theCycleLoop - 1 -- last item of this chunk; the next chunk starts at y + 1
									if y > ContentCount then set y to ContentCount
									set tempstring to items x thru y of theClientList
									repeat with tempCycle from 1 to count of items of tempstring -- now check through existing clients
										if tempClientString = item tempCycle of tempstring as text then
											set item (tempCycle + x - 1) of theClientList2 to (item (tempCycle + x - 1) of theClientList2) + 1
											set exitFlag to true
											exit repeat
										end if
									end repeat
									if exitFlag then exit repeat
								end repeat
							end if
							if not exitFlag then
								if tempClientString ≠ "" then
									set end of theClientList to tempClientString
									set end of theClientList2 to 1
								end if
							end if
						on error errmsg
							--display dialog errmsg
						end try
					end repeat
				end if
			end repeat
		end tell