
#1 2017-08-18 05:48:26 am

Registered: 2012-06-28
Posts: 54

Convert PDF To Plain Text - Modify Current Folder Action Script?

(This is a continuation of my original post "Watch Folder with Text Extraction? Looking for ideas.")


My workflow is:

Open the email in Thunderbird.

Save/open the PDF in Acrobat.

Save the PDF as plain text into the "watched" folder.

- Data is extracted from the plain text file, an "OR" is added between the numbers, and the final result is entered into the HoudahSpot search software.


Would it be possible to save the PDF file DIRECTLY from Thunderbird into my current watch folder and have the folder action script handle the conversion from PDF to plain text, using the result as the data for the current script?

In other words: I would like to eliminate the end user interaction with Acrobat from the workflow.

Also, it would be nice if the folder action script deleted the folder's contents after processing.


The current folder action script is as follows:


use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

on findPattern:thePattern inString:theString
   set theNSString to current application's NSString's stringWithString:theString
   set theOptions to ((current application's NSRegularExpressionDotMatchesLineSeparators) as integer) + ((current application's NSRegularExpressionAnchorsMatchLines) as integer)
   set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:theOptions |error|:(missing value)
   set theFinds to theRegEx's matchesInString:theNSString options:0 range:{location:0, |length|:theNSString's |length|()}
   set theResult to {} -- we will add to this
   repeat with i from 1 to count of theFinds
       set theRange to (item i of theFinds)'s range()
       set end of theResult to (theNSString's substringWithRange:theRange) as string
   end repeat
   return theResult
end findPattern:inString:

on adding folder items to this_folder after receiving these_items
   # read the contents of the first added item
   set theDatas to read (item 1 of these_items)
   # extract every group of whitespace, 7 digits and whitespace from the data
   set searchNumbers to its findPattern:"(?<!AdServices:DisplayAds_HiRes:[0-9]{7}\\.pdf\\s{0,5})\\s[0-9]{7}\\s" inString:theDatas
   # nothing to search for if no numbers were found
   if searchNumbers is {} then return
   # concatenate the numbers, separated by "OR"
   set oTIDs to AppleScript's text item delimiters
   set AppleScript's text item delimiters to "OR"
   set searchtext to searchNumbers as text
   set AppleScript's text item delimiters to oTIDs
   # each match keeps its surrounding whitespace; trim the outermost two characters
   set searchtext to text 2 thru -2 of searchtext
   tell application "HoudahSpot"
       search searchtext
   end tell
end adding folder items to
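
A minimal sketch of how the two requests (reading the PDF directly, and clearing the folder afterwards) might be handled inside the folder action. It assumes the PDFs contain a real text layer and that PDFKit's PDFDocument, loaded through the Quartz framework, is an acceptable stand-in for Acrobat's "Save as Plain Text". The handler name textFromPDF is hypothetical, and moving the file to the Trash via the Finder is only one possible reading of "delete".

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use framework "Quartz" -- PDFKit (PDFDocument) is part of Quartz
use scripting additions

-- Hypothetical helper, not part of the original script:
-- returns the text of a PDF, or "" if the file can't be opened or has no text.
on textFromPDF(pdfFile)
	set theURL to current application's |NSURL|'s fileURLWithPath:(POSIX path of pdfFile)
	set theDoc to current application's PDFDocument's alloc()'s initWithURL:theURL
	if theDoc is missing value then return ""
	set theText to theDoc's |string|()
	if theText is missing value then return ""
	return theText as text
end textFromPDF

on adding folder items to this_folder after receiving these_items
	set pdfFile to item 1 of these_items
	-- takes the place of "read (item 1 of these_items)"
	set theDatas to my textFromPDF(pdfFile)
	-- ... the existing findPattern / HoudahSpot steps would follow here, unchanged ...
	-- Assumption: moving the processed file to the Trash counts as deleting it
	tell application "Finder" to delete pdfFile
end adding folder items to
```

The text PDFKit returns may differ in spacing and line breaks from Acrobat's plain text export, so the search pattern might need adjusting once it is tried against real files.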


Thank you in advance! R260

Last edited by rick_260 (2017-08-18 10:59:34 am)



#2 2017-08-18 06:39:45 am

Nigel Garvey
From: Warwickshire, England
Registered: 2002-11-20
Posts: 4604

Re: Convert PDF To Plain Text - Modify Current Folder Action Script?

Hi rick_260.

When posting AppleScript code here, could you please put it between this forum's [applescript] and [/applescript] BBCode tags? Thanks.

[applescript]-- Your code here.[/applescript]







#3 2017-08-18 11:00:10 am

Registered: 2012-06-28
Posts: 54

Re: Convert PDF To Plain Text - Modify Current Folder Action Script?

Fixed. Sorry about that. Noted for the future.


