How to find text and display results from PostScript in TextEdit?

Hello MacScripter Forum

This seems to be the right place to post the question I have. I am not an experienced scripter at all and lack the fundamentals but can normally find my way around logical code to make it work after some time.
The challenge I would like to solve with AppleScript looks like this: We have PostScript files of various sizes (1MB - 100MB), generated from Adobe Illustrator or ArtPro, which look like this when opened in TextEdit (a small extract):
‘…
} Adobe_AGM_Core/end_feature get exec
Adobe_AGM_Core/driver_check_media_override get exec
[1 0 0 -1 0 841.89 ] concat
Adobe_AGM_Core begin
50.8 37.5 getspotfunction setscreen
end
Adobe_CoolType_Core/page_setup get exec
Adobe_AGM_Image/page_setup get exec
mTSsetup
pmSVsetup
initializepage
…’

I would like to write an AppleScript which will search the opened PS file in TextEdit for the string ‘getspotfunction setscreen’ and then display the two numbers in front from the same line ‘50.8 37.5’ which are screen frequency and screen angle respectively. Depending on the number of separations (color plates) in the PS file, the occurrences of ‘getspotfunction setscreen’ with different numbers in front vary as well (normally 1-3).
I have started with:


tell application "TextEdit"
	set text of document 1 to my PSSearch(get text of document 1, "getspotfunction setscreen")
end tell

on PSSearch(txtStr, srchStr)
	set {PS, AppleScript's text item delimiters} to {AppleScript's text item delimiters, {srchStr}}
	show(srchStr)
	set temp to every text item of txtStr
	set txtStr to temp as text
	set AppleScript's text item delimiters to PS
	return txtStr
end PSSearch

but then got stuck. Can anybody provide some assistance to develop this a bit (a lot :wink: ) further?

Thanks, Axel

This should be relatively readable and doesn’t require TextEdit:

on run
	set theFileAsPOSIX to (POSIX path of (choose file with prompt "Choose a file whose screen frequencies and angles you wish to display.") as Unicode text)
	set screenFrequenciesAndAngles to (do shell script "/usr/bin/awk '/getspotfunction setscreen/{print \"Frequency: \"$1\"   Angle: \"$2}'" & space & quoted form of theFileAsPOSIX)
	display alert "The following screen frequencies and angles were found in the chosen colour plate:" message screenFrequenciesAndAngles
end run

Hi Mikey-San

:slight_smile: Excellent. This looks very elegant.

It seems, however, that as Unicode text the file’s paragraphs are ignored, so the positions 1 and 2 you indicate return values from higher up the page as seen in TextEdit. Is there a way to ‘offset’ by -2 and -1 from the ‘getspotfunction setscreen’ position, so that I don’t have to use positions 90 and 91, which would be a bit risky as these might change?

Ciao, Axel

I think I know what you’re trying to say, here.

The “Unicode text” bit is only for getting the path to the file as Unicode, to account for special characters in the path/file name. Has nothing to do with the text itself. [Note that this bit has been removed in the following version of the script. “As Unicode text” was redundant. See Hamish’s post below.]

You may need to pipe through strings to do it, however, because the file isn’t just plain text:

on run
	set theFileAsPOSIX to POSIX path of (choose file with prompt "Choose a file whose screen frequencies and angles you wish to display.")
	set screenFrequenciesAndAngles to (do shell script "/usr/bin/strings" & space & quoted form of theFileAsPOSIX & "| /usr/bin/awk '/getspotfunction setscreen/{print \"Frequency: \"$1\" Angle: \"$2}'")
	display alert "The following screen frequencies and angles were found in the chosen colour plate:" message screenFrequenciesAndAngles
end run

Try that. If it still does not produce the desired results, I will obtain an example document from you and work from that.

Hello again

It works. I will have to look at it in detail in order to really understand it so that I can expand it. But the main part is solved.

Thanks,

Axel

I’m happy to break it down for you.

Step one: We need to get the path to the file, but the system will give us what’s called an alias path. Alias paths look like this:

It’s not what we call “POSIX-compliant”, so the shell can’t use it. AppleScript, however, is kind enough to provide us a facility to convert Mac OS aliases to POSIX paths:

set pathAsPOSIX to POSIX path of pathAsAlias

That will turn the above alias-style path into this:

We can do the conversion in the same line with the “choose file with prompt” command:

set pathAsPOSIX to POSIX path of (choose file with prompt "Your ad here.")

Step two: The file isn’t a regular plain text or rich text file, so we need to yank out the plain text strings within. We already know the value we want is stored as a plain text string inside the file, so we use a shell program called strings to dump a list of the text strings inside the file. E.g.,

In your case, the strings command will dump the same info TextEdit shows you. Now we’re ready to look for the text you want to extract. For this, we turn to awk.

Step the third: We need to locate lines that contain “getspotfunction setscreen”. We could use another shell program called grep, but awk can perform both the search and the value extraction we’re going to do later, so we’ll keep it down to just using strings and awk.

So now, we search:

Oh, but that doesn’t work, because we want to search the text that the strings command gave us, not the file itself. For this, we use what’s called a “pipe”: we take the output of the strings command and pipe it to awk with the | character. Awk reads the output of strings from its standard input instead of opening a file. E.g.,

Data comes out of strings and goes into awk. Easy peasy.

But now we need to return what awk finds, so we tell awk to print it:

The output of that entire command might look like this:
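To make the pipeline concrete, here is a minimal sketch run directly in a shell rather than through do shell script. The temporary file and its three lines are stand-ins borrowed from the extract in the first post, not a real PostScript job, and the bare command names assume strings and awk are on the PATH (inside do shell script you would still use the full paths).

```shell
# Stand-in for a PostScript file: three lines taken from the extract above.
sample="$(mktemp)"
printf '%s\n' 'Adobe_AGM_Core begin' '50.8 37.5 getspotfunction setscreen' 'end' > "$sample"

# strings dumps the printable text; the pipe hands that text to awk,
# which prints every line matching the search pattern.
matches="$(strings "$sample" | awk '/getspotfunction setscreen/ {print}')"

printf '%s\n' "$matches"
# prints: 50.8 37.5 getspotfunction setscreen

rm -f "$sample"
```

A file with several separations would simply produce one such line per match.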

We’re rockin’ out so far. We just need to . . .

Step four: Extract the two values at the beginning of the lines of text returned by awk. By default, awk splits each line into fields on runs of whitespace (spaces and tabs). This is analogous to AppleScript’s text item delimiters. Since our values are space-separated anyway, we just have to reference them by position, $1 for the first field and $2 for the second:

Since I’m already telling awk to print text anyway and return it to AppleScript, I may as well label and format them for my dialog inside the awk command:

(There should be three spaces between the value of the first field and the label of the second, just in case the forums truncate the runs of spaces.)
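As a small sketch of the field extraction and labelling, the following feeds awk a single line by hand. The variable names are only for illustration, and the sample line is the one from the extract in the first post.

```shell
# One matching line, as strings would deliver it.
line='50.8 37.5 getspotfunction setscreen'

# $1 and $2 are the first two whitespace-separated fields of the line;
# the quoted pieces are literal labels, with three spaces before "Angle:".
labelled="$(printf '%s\n' "$line" | awk '/getspotfunction setscreen/ {print "Frequency: " $1 "   Angle: " $2}')"

printf '%s\n' "$labelled"
# prints: Frequency: 50.8   Angle: 37.5
```

Placing string literals and fields side by side in awk's print concatenates them, which is why no extra operator is needed between the labels and the values.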

Step five: Almost done. Now we just need to bring it into AppleScript.

set shellOutput to (do shell script "/usr/bin/strings" & space & quoted form of theFileAsPOSIX & "| /usr/bin/awk '/getspotfunction setscreen/{print \"Frequency: \"$1\" Angle: \"$2}'")

Note that we need to do THREE IMPORTANT THINGS (IN ALL CAPS HERE BECAUSE THEY’RE IMPORTANT) when we turn this into an AppleScript shell call:

  1. We always reference shell programs /by/full/path/names. See this post for an explanation:

http://bbs.applescript.net/viewtopic.php?pid=52253#p52253

  2. Spaces in disk, folder, and file names have to be handled correctly in the shell. The way we commonly do this in AppleScript is to use “quoted form of” with our path name variable:
do shell script "/bin/rm -r" & space & quoted form of theFile

In the shell, it looks like this:

  3. Double quotes tell AppleScript where the shell script command starts and ends, so we have to escape them (protect/translate them) properly:
do shell script "/bin/echo "bash rules!""

That won’t compile, but this will:

do shell script "/bin/echo \"bash rules!\""
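For the point about spaces in file names, here is a minimal sketch of what the shell ends up seeing. The file name my plate.ps and the temporary folder are made up for the example. “quoted form of” wraps the path in single quotes, which is what the literal 'my plate.ps' below imitates.

```shell
# Hypothetical folder and file name containing a space.
workdir="$(mktemp -d)"
printf '%s\n' '50.8 37.5 getspotfunction setscreen' > "$workdir/my plate.ps"

# The single quotes keep "my plate.ps" together as one argument;
# without them awk would be handed two files, "my" and "plate.ps".
found="$(cd "$workdir" && awk '/getspotfunction setscreen/ {print $1, $2}' 'my plate.ps')"

printf '%s\n' "$found"
# prints: 50.8 37.5

rm -rf "$workdir"
```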

Step six: Insert into the display alert command, which you should be familiar with (or be able to pick through) at this point.

And there you have it.

Minor point: that step’s really redundant since ‘POSIX path of [some file object]’ already returns Unicode text.

HTH

Hm, it does indeed. I keep forgetting that for some reason. Edited to remove redundancy.

(Note that the step isn’t redundant, but “as Unicode text” definitely is.)

Good Morning Mikey-San, MacScripter Forum

In your excellent description you write in Step 3:

Can awk also return a line that is, let’s say, 4 lines above or 2 lines below the found line containing “getspotfunction setscreen”?
I am working on expanding the script to also include the plate name, which is found some lines above the “getspotfunction setscreen” info.

Ciao, Axel

Sorry, I didn’t see your e-mail to me and forgot to check up on this thread in the last couple of days.

Four lines above the found regex:

Two lines above:

(I have added spaces around the /regex/ to make it more readable.)
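One possible way to do both lookups from the question (four lines above, two lines below) is sketched here. This is an illustration under stated assumptions, not necessarily the commands originally posted: the sample file reuses lines from the extract in the first post, and the names buf and target are arbitrary.

```shell
# Stand-in file: seven lines from the extract, with the match on line 5.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
} Adobe_AGM_Core/end_feature get exec
Adobe_AGM_Core/driver_check_media_override get exec
[1 0 0 -1 0 841.89 ] concat
Adobe_AGM_Core begin
50.8 37.5 getspotfunction setscreen
end
Adobe_CoolType_Core/page_setup get exec
EOF

# Four lines above: buf keeps recent lines, indexed by line number modulo 5.
# On a match, the line four back is still in the buffer and gets printed.
above="$(awk '/getspotfunction setscreen/ && NR > 4 {print buf[(NR - 4) % 5]} {buf[NR % 5] = $0}' "$sample")"

# Two lines below: on a match, remember the line number to print later.
below="$(awk '/getspotfunction setscreen/ {target = NR + 2} NR == target {print}' "$sample")"

printf 'Above: %s\nBelow: %s\n' "$above" "$below"
# prints: Above: } Adobe_AGM_Core/end_feature get exec
#         Below: Adobe_CoolType_Core/page_setup get exec

rm -f "$sample"
```

The "below" version tracks only one pending target, so two matches less than two lines apart would lose the first one. For plates that are many lines apart that should not matter.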