Convert formatted text file to xls, then process?

Hello–
I am an extreme n00b to AppleScript, and though I have been struggling through tutorials here and there, I’m now a bit stumped. Here’s my situation:
I am running experiments for a linguistics lab, using the Tcl/Tk program Linger. This is immaterial to the question at hand, but it gives a general idea of the scope. Anyhow, for each participant, the program outputs a .dat file in which each line holds the subject number, the name of the experiment, the sentence number, the condition (for this experiment, it’s “obj” for every one), a percent sign, their rating of this particular question on a scale of 1-7, and the time it took them to answer. Here’s a sample:
1 similarity 62 obj 0 - - 1886884
1 similarity 62 obj % 2 {} 1689426

The first line above is the reading screen, which has no rating associated with it. Every other line is a reading screen, so the first two entries look like this:
1 similarity 62 obj 0 - - 1886884
1 similarity 62 obj % 2 {} 1689426
1 similarity 38 obj 0 - - 414750
1 similarity 38 obj % 6 {} 1164866

This is just to let you know there will be some “junk processing” that needs to go on.

Anyhow, for this particular experiment, I need to put this data into an Excel 2008 spreadsheet, ordered by the sentence number. So for the above input, all I want in Excel is:
38 6
62 2

So far I have been able to do this with a search and replace using the regular expression [0-9] similarity ([0-9]+) obj % ([0-9]) .* , which is replaced with $1 $2. I also run a second regular expression to replace the junk lines with a blank space. This allows me to open the file in Excel and import the data, whereupon I can sort by sentence number manually.
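The manual search-and-replace and sort described above could be scripted roughly as follows. This is only a sketch under assumptions: the file is the hypothetical test.dat on the desktop, it uses Unix line endings, and the subject number is matched with [0-9]+ instead of [0-9] so that subjects 10 and above would also match. It hands the work to the standard sed and sort command-line tools through do shell script.

```applescript
-- Hypothetical input file: test.dat on the desktop
set datPath to POSIX path of ((path to desktop as text) & "test.dat")

-- sed -n with the trailing p prints only the lines that match,
-- so the reading-screen lines are dropped instead of blanked out.
-- Backslashes are doubled because they sit inside an AppleScript string.
set sedCommand to "sed -nE 's/^[0-9]+ similarity ([0-9]+) obj % ([0-9]) .*/\\1 \\2/p' " & quoted form of datPath

-- sort -n orders the remaining lines numerically by sentence number
set sortedPairs to do shell script sedCommand & " | sort -n"
```

The variable sortedPairs then holds one "sentence rating" pair per line, which is the same text that was previously imported into Excel by hand.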

Given that there will be quite a few subjects, however, I want to be able to do this via AppleScript. Each subject has his/her own .dat file, and all of them are formatted the same way. Namely, I want to be able to open each .dat file, parse it to get the sentence number and rating for every sentence, and put the result, sorted, into an Excel worksheet.

Ideally, I would have each sentence number corresponding to row number, and columns corresponding to participant number. I would fill the data in with the rating that each participant gave to the given sentence.

This seems like a specific question, but more generally it deals with parsing a text file via AppleScript and putting it into an Excel worksheet in the right format. I believe this question can benefit many people seeking to do similar Excel processing with AppleScript (especially now that Excel 2008 has dropped VBA support).

Any help you can give would be greatly appreciated!

Hi,

Assuming the .dat file is plain text in MacRoman encoding, you can use this to read the file (test.dat on the desktop),
parse the data, put the result in an Excel sheet and sort by the column with the sentence numbers:


set theText to read file ((path to desktop as text) & "test.dat")

tell application "Microsoft Excel"
	activate
	make new workbook
end tell
set {TID, text item delimiters} to {text item delimiters, space}
set x to 1
repeat with i in paragraphs of theText
	if i contains "%" then
		tell text items of i to set {s, r} to {item 3, item 6}
		tell application "Microsoft Excel"
			tell active sheet
				set value of cell x of column 1 to s
				set value of cell x of column 2 to r
			end tell
		end tell
		set x to x + 1
	end if
end repeat
set text item delimiters to TID
tell application "Microsoft Excel"
	sort range "A1" of active sheet order1 sort ascending key1 range "A:A"
end tell
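The script above depends on the encoding assumption stated before it. If the .dat files turned out to be UTF-8 instead (an assumption; the thread does not say how Linger encodes its output), the read command accepts an explicit text class. A minimal sketch, again using the test.dat file on the desktop:

```applescript
-- Read the file as UTF-8 instead of the default (MacRoman) encoding
set theText to read file ((path to desktop as text) & "test.dat") as «class utf8»
```

For data that contains only plain ASCII characters, as in the sample lines, both forms should return the same text.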

WOW!!! Thank you very much. This is exactly what I was looking for. I am wondering if it is possible to extend it to handle all the files ending in .dat in a directory? Each file is also named with a number corresponding to the subject number. Can this number also determine the column the data goes into?

For instance, if we have two subjects, as follows: (note, located in two separate files, the first in 1.dat, the second in 2.dat)
1 similarity 62 obj % 6 {} 3243243
2 similarity 62 obj % 3 {} 1854764

I want to be able to have the following in row 62:
62 6 3
(rows represent sentence numbers, columns represent subject numbers, values represent ratings)

I played around with the example you gave and it keeps telling me “the object you are trying to access does not exist”. My method was to add a third parameter t, representing the subject number, as follows:
if i contains "%" then
	tell text items of i to set {t, s, r} to {item 1, item 3, item 6}
	tell application "Microsoft Excel"
		tell active sheet
			set value of cell s of column t to r
		end tell
	end tell
end if

This does not take into account the fact that this data would be in different files. Thank you again so much for your help!! This is really helpful as well as educational for me.

Try this (I haven’t seen your edited post yet):


set theFolder to choose folder
tell application "Finder" to set theFiles to files of theFolder whose name extension is "dat"
if theFiles is {} then return

tell application "Microsoft Excel"
	activate
	make new workbook
end tell

set {TID, text item delimiters} to {text item delimiters, space}
set x to 1
repeat with oneFile in theFiles
	set theText to read (oneFile as alias)
	repeat with i in paragraphs of theText
		if i contains "%" then
			tell text items of i to set {t, s, r} to {item 1, item 3, item 6}
			tell application "Microsoft Excel"
				tell active sheet
					set value of cell x of column 1 to s
					set value of cell x of column 2 to t
					set value of cell x of column 3 to r
				end tell
			end tell
			set x to x + 1
		end if
	end repeat
end repeat
-- Restore the delimiters only after every file has been parsed
set text item delimiters to TID
tell application "Microsoft Excel"
	sort range "A1" of active sheet order1 sort ascending key1 range "A:A"
end tell
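For the layout asked for in the edited post (rows as sentence numbers, one column per subject), the same loop could be adapted along the following lines. This is an untested sketch, not a confirmed fix. It assumes the "object does not exist" error came from using the text items "62" and "1" directly as cell and column indexes, so it coerces them to integers first. Column 1 is reserved for the sentence number, which puts subject n in column n + 1. The names rowIndex and columnIndex are introduced here for illustration.

```applescript
set theFolder to choose folder
tell application "Finder" to set theFiles to files of theFolder whose name extension is "dat"
if theFiles is {} then return

tell application "Microsoft Excel"
	activate
	make new workbook
end tell

set {TID, text item delimiters} to {text item delimiters, space}
try
	repeat with oneFile in theFiles
		set theText to read (oneFile as alias)
		repeat with i in paragraphs of theText
			if i contains "%" then
				tell text items of i to set {t, s, r} to {item 1, item 3, item 6}
				-- Assumption: Excel needs integer indexes, not the strings "62" and "1"
				set rowIndex to s as integer
				set columnIndex to (t as integer) + 1
				tell application "Microsoft Excel"
					tell active sheet
						-- Column 1 labels the row with its sentence number
						set value of cell rowIndex of column 1 to rowIndex
						set value of cell rowIndex of column columnIndex to r as integer
					end tell
				end tell
			end if
		end repeat
	end repeat
on error errMsg number errNum
	-- Put the delimiters back before passing the error on
	set text item delimiters to TID
	error errMsg number errNum
end try
set text item delimiters to TID
```

Because the sentence number picks the row directly, no sort step is needed in this version. With the two sample lines from 1.dat and 2.dat, row 62 would be expected to read 62, 6, 3.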