I’ve got a folder action script for automating Acrobat OCR, put together from a variety of sources, including:
AppleScript code posted by Joe Kissell in Macworld (http://www.macworld.com/article/60229/2007/10/nov07geekfactor.html)
DocumentSnap (http://www.documentsnap.com/acrobat-applescript-for-scansnap-ocr/)
and other places (I can’t find the references at the moment).
The problem I’m having is that if I add a PDF that already has “renderable” text in it, Acrobat displays a dialog saying that OCR couldn’t be done because the page (and document) contains renderable text.
This dialog (see http://gallery.me.com/jduke#100028/Picture%203&bgcolor=black ) has no title, so I’m not sure how to set up error trapping to intercept it, log the issue, and then continue with the next file in the queue. Can I “read” the text in the dialog? I’d also like to tick the checkbox so the error isn’t displayed again for that document (that way I need to check for the error only once). Then I can just close the file (without saving), log the error, and move on.
Any help from those assembled here would be greatly appreciated.
Thanks.
Cheers,
Jon
Here’s my current code:
on adding folder items to this_folder after receiving these_items
	--delay 240
	(* Finder label colors: No color = 0, Orange = 1, Red = 2, Yellow = 3, Blue = 4, Purple = 5, Green = 6, Gray = 7 *)
	repeat with i from 1 to number of items in these_items
		set this_item to item i of these_items
		set the item_info to info for this_item
		set the item_size to size of item_info
		-- set delay_time to ((item_size / 1024 / 30) as integer)
		set file_type to name extension of item_info
		if file_type is equal to "pdf" then
			-- Label the file red while it is being processed
			tell application "Finder"
				activate
				set label index of this_item to 2
			end tell
			try
				tell application "Adobe Acrobat Pro"
					activate
					open this_item
				end tell
				-- Drive the OCR menu command and its dialog through UI scripting
				tell application "System Events"
					tell application process "AdobeAcrobat"
						click the menu item "Recognize Text Using OCR..." of menu 1 of menu item "OCR Text Recognition" of the menu "Document" of menu bar 1
						try
							click radio button "All pages" of group 1 of group 2 of group 1 of window "Recognize Text"
						end try
						click button "OK" of window "Recognize Text"
					end tell
					--Need to insert renderable text error handler here(?)
				end tell
				with timeout of 600 seconds
					tell application "Adobe Acrobat Pro"
						save the front document with linearize
						close the front document
					end tell
				end timeout
				-- Label the file green once OCR and saving have finished
				tell application "Finder"
					activate
					set label index of this_item to 6
				end tell
			end try
		end if
	end repeat
end adding folder items to
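
For the spot marked "Need to insert renderable text error handler here(?)", one possible direction is sketched below. It is untested and rests on several assumptions: that the untitled dialog is window 1 of the Acrobat process when it appears, that its message is exposed as static text directly inside that window, that the "don't show again" box is the only checkbox, and that the dismiss button is named "OK". The handler name dismissRenderableTextDialog is made up for illustration. The real element hierarchy would need to be confirmed first, for example with Accessibility Inspector.

```applescript
-- Hypothetical helper: returns true if a "renderable text" dialog was found and dismissed.
-- All element paths below are assumptions about the dialog's layout.
on dismissRenderableTextDialog()
	tell application "System Events"
		tell application process "AdobeAcrobat"
			-- A window without a title can still be addressed by index
			set dialogText to ""
			try
				-- Join the text of every static text element in the front window
				set dialogText to (value of every static text of window 1) as text
			end try
			if dialogText contains "renderable text" then
				try
					-- Assumed to be the "do not show again" checkbox
					click checkbox 1 of window 1
				end try
				-- Assumed button name
				click button "OK" of window 1
				return true
			end if
		end tell
	end tell
	return false
end dismissRenderableTextDialog
```

Inside the folder action it would be called as "my dismissRenderableTextDialog()", probably after a short delay so the dialog has time to appear. When it returns true, the script could log the file name, close the document without saving, and skip the save step for that file.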