Is AppleScript suitable for doing this?

I have a text file, named MortIn, that is a little more than a gigabyte in size. It consists of more than two million 440-character records, each with a return character at the end. I need to extract the first 159 characters of each record, then check whether characters 155 to 156 match a desired value. If they match, I want to write the 159 characters plus a return character to another file, MortOut. When completed, the MortOut file would have between 20,000 and 30,000 records. I have spent some time looking at AppleScript, but all the examples do very different kinds of tasks. I used to do this kind of work with HyperCard, but it no longer reads or writes files, even when running under OS 9, once OS X has been installed. I want to learn AppleScript sometime, but I need to get this done soon. It would be ideal to learn some AppleScript while getting this done, but if it is not suitable for this, I need to start looking elsewhere.

Jim Gundlach

Yes, AppleScript can do this quite easily.

The thing to be aware of, though, is the size of the file. Since there are so many records (2 million) you don’t want to read the entire file in at once.

Instead, if you’re sure the records are all fixed-length, at 440 chars each, something like this should work (untested, since I don’t have the file in question, but you should be able to make any necessary adjustments):

-- adjust for the size of each record in bytes (use 441 if the trailing return is not counted in the 440)
property recordSize : 440

-- define the input and output files
set theInputFile to open for access file "Macintosh HD:MortIn"
set theOutputFile to open for access file "Macintosh HD:MortOut" with write permission

-- initialize some variables
set currentRecNum to 0
set numLinesOut to 0

repeat
	-- where are we in the file? (read positions are 1-based, so the first record starts at byte 1)
	set curPos to (currentRecNum * recordSize) + 1
	try
		-- can we read some data?
		set thisRecord to read theInputFile from curPos for recordSize
	on error
		-- if we get here, there's no more data to read
		exit repeat
	end try
	-- increment the counter so that we can read the next record correctly
	set currentRecNum to currentRecNum + 1
	
	-- at this point, thisRecord contains the entire 440-character record
	
	-- get the bits to check
	set charsToCheck to text 155 thru 156 of thisRecord
	-- do they equal the check string?
	if charsToCheck = "xx" then
		-- if so, get the first 159 chars and append a return
		set outputStr to (text 1 thru 159 of thisRecord) & return
		-- write them to the output file
		write outputStr to theOutputFile
		-- keep a running count, just for kicks
		set numLinesOut to numLinesOut + 1
	end if
end repeat

-- by the time we get here, we're done, so close both files
close access theInputFile
close access theOutputFile
-- and let the user know how it went
display dialog (numLinesOut as string) & " records written to MortOut." buttons {"OK"}
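
Since the records are described as 440 characters with a return at the end, it may be worth confirming the true record length before running the full job. Here is a minimal sketch (also untested against the real file), assuming the same "Macintosh HD:MortIn" path as above and that each record ends in a classic Mac return (ASCII 13). The names probeFile, firstChunk and recordLength are made up for illustration:

```applescript
-- open the input file read-only and peek at the first 441 bytes
set probeFile to open for access file "Macintosh HD:MortIn"
try
	-- file positions for read are 1-based, so 1 is the first byte
	set firstChunk to read probeFile from 1 for 441
	close access probeFile
on error errMsg number errNum
	-- make sure the file is closed before passing the error along
	close access probeFile
	error errMsg number errNum
end try

-- position of the first return in the chunk (0 if none was found)
set recordLength to offset of return in firstChunk
display dialog "First return found at character " & recordLength buttons {"OK"}
```

If this reports 441, the return is an extra character and recordSize should be 441. If it reports 440, the return is already counted in the 440. A result of 0 would suggest the file uses a different line ending, such as a linefeed.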

That looks very much like I expected it to if it would work. I will give it a try this evening.

Again, thanks - you went way beyond what I expected. This is the kind of helpfulness that makes the Mac community a real community.

Jim Gundlach :smiley: