Convert PDF To Plain Text - Modify Current Folder Action Script?

(This is a continuation of my original post “Watch Folder with Text Extraction? Looking for ideas.”)


My workflow is:

Open the email in Thunderbird.

Save/open the PDF in Acrobat.

Save the PDF as plain text into the “watched” folder.

  • The folder action extracts the data from the plain text file, adds an “OR” between the values, and enters the final result into the HoudahSpot search software.

Would it be possible to save the PDF file directly from Thunderbird into my current watch folder and have the folder action script handle the conversion from PDF to plain text, using the result as the data for the current script?

In other words: I would like to eliminate the end user interaction with Acrobat from the workflow.

Also, it would be nice if the folder action script deleted the folder's contents after processing.


The current folder action script is as follows:



use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

on findPattern:thePattern inString:theString
	set theNSString to current application's NSString's stringWithString:theString
	set theOptions to ((current application's NSRegularExpressionDotMatchesLineSeparators) as integer) + ((current application's NSRegularExpressionAnchorsMatchLines) as integer)
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:theOptions |error|:(missing value)
	set theFinds to theRegEx's matchesInString:theNSString options:0 range:{location:0, |length|:theNSString's |length|()}
	set theResult to {} -- we will add to this
	repeat with i from 1 to count of theFinds
		set theRange to (item i of theFinds)'s range()
		set end of theResult to (theNSString's substringWithRange:theRange) as string
	end repeat
	return theResult
end findPattern:inString:

on adding folder items to this_folder after receiving these_items
	# read the contents of the first added item
	set theDatas to read (item 1 of these_items)
	# extract every group of a space, 7 digits and a space from the data
	set searchNumbers to its findPattern:"(?<!AdServices:DisplayAds_HiRes:[0-9]{7}\\.pdf\\s{0,5})\\s[0-9]{7}\\s" inString:theDatas
	# nothing to search for if no number was found
	if searchNumbers is {} then return
	# concatenate the numbers, separated by "OR"
	set oTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "OR"
	set searchtext to searchNumbers as text
	set AppleScript's text item delimiters to oTIDs
	# drop the leading and trailing whitespace character left over from the matches
	set searchtext to text 2 thru -2 of searchtext
	
	tell application "HoudahSpot"
		activate
		search searchtext
	end tell
	
end adding folder items to
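
For illustration only, the two requests above (converting the PDF inside the folder action and clearing the file afterwards) might be sketched roughly as follows. This is an untested sketch, not a confirmed solution: it assumes PDFKit's PDFDocument (loaded through the Quartz framework) can read the text layer of these PDFs, and the handler names textFromPDF: and trashFile: are hypothetical, not part of the current script. The text PDFKit returns may be laid out differently from Acrobat's plain text export, so the search pattern might need adjusting, and a scanned PDF without a text layer would yield nothing.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use framework "Quartz" -- PDFKit (PDFDocument) is part of the Quartz umbrella framework
use scripting additions

-- Hypothetical helper (not in the script above): returns the text layer
-- of the PDF at posixPath, or "" if the file cannot be opened as a PDF.
on textFromPDF:posixPath
	set theURL to current application's NSURL's fileURLWithPath:posixPath
	set theDoc to current application's PDFDocument's alloc()'s initWithURL:theURL
	if theDoc is missing value then return ""
	set theText to theDoc's |string|()
	if theText is missing value then return ""
	return theText as text
end textFromPDF:

-- Hypothetical helper: moves the processed file to the Trash rather than
-- deleting it outright, so a mistake is recoverable.
on trashFile:posixPath
	set theURL to current application's NSURL's fileURLWithPath:posixPath
	current application's NSFileManager's defaultManager()'s trashItemAtURL:theURL resultingItemURL:(missing value) |error|:(missing value)
end trashFile:

on adding folder items to this_folder after receiving these_items
	repeat with anItem in these_items
		set thePath to POSIX path of (contents of anItem)
		if thePath ends with ".pdf" then
			set theDatas to (my textFromPDF:thePath)
			-- theDatas would then feed the existing findPattern / HoudahSpot steps
			(my trashFile:thePath)
		end if
	end repeat
end adding folder items to
```

In this arrangement the line that currently reads the plain text file would be replaced by the call to textFromPDF:, and the rest of the existing handler would stay as it is.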



Thank you in advance! R260

Hi rick_260.

When posting AppleScript code here, could you please put it between this forum’s [applescript] and [/applescript] BBCode tags? Thanks.

Result:

-- Your code here.

Fixed. Sorry about that. Noted for the future.