Hi,
I need help saving the content of a PDF file to a text file.
Right now I am doing it manually, like this:
Open the PDF file, press V (text select tool), press cmd+a (select all), cmd+c (copy all text), open a text editor, press cmd+v (paste), and then save the file in the required folder with a specific name.
Can this process be automated using AppleScript?
I am not much of an expert in AppleScript.
Thanks in advance,
-P
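In fact, a PDF file is largely a TEXT file (you can open it in a text editor, can't you?), although the page contents inside it are often compressed, so what you see may not be readable.
Perhaps you could duplicate the file, then change its “creator type” to that of whatever application you wish to open it with (TextEdit, Tex-Edit Plus, Word…):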
tell application "Finder"
    set x to duplicate alias "path:to:file.pdf"
    set creator type of x to "ttxt" --> TextEdit
    set file type of x to "TEXT" --> so, TextEdit will recognize the document
    set name of x to (name of x) & ".txt" --> if you wish the extension, or rename it to whatever you want
end tell
If you don’t plan to distribute your script, and aren’t opposed to using a scripting addition outside of the Standard Additions, you could download Sändi’s Additions:
http://osaxen.com/modules.php?op=modload&name=Downloads&file=index&req=getit&lid=40
Place the scripting addition in the Scripting Additions folder in your System Folder.
Sändi’s Additions allow you to invoke keystrokes - which is pretty much what you are doing manually now.
The following, saved as a classic application, should allow you to drop PDFs onto the script. The end result is a text file in the same location as each PDF, with the original PDF left untouched.
But then, I am not doing a very great job of writing stuff today.
on open droppedItems --droppedItems is a list of the files you dropped on the droplet (script)
    repeat with thisItem in droppedItems --repeat with each item in the list
        set thisItem to thisItem as string --coerce each item of the list to text
        tell application "Finder"
            set folderName to the folder of file thisItem as string --get the path of the folder this item is found in
            set fileName to the name of file thisItem --get the name of the file
        end tell
        set oldDelims to AppleScript's text item delimiters --capture current text item delimiters so we can reset them later
        set AppleScript's text item delimiters to "." --set them to period for now
        set splitName to the text items of fileName --get the text items of the file name (everything on either side of periods)
        if (count of splitName) > 1 then
            set nameOnly to (items 1 thru -2 of splitName) as string --everything before the last period, so names containing extra periods survive
        else
            set nameOnly to fileName --no period in the name, so there is no extension to strip
        end if
        set AppleScript's text item delimiters to oldDelims --reset the text item delimiters
        set finishedFile to folderName & nameOnly & ".txt" as string --build the path to the new text file
        tell application "Acrobat™ Reader 4.0"
            activate --bring Acrobat Reader to the front so the keyboard commands work
            open alias thisItem --open the file we are on
        end tell
        TypeText "v" --select text tool
        TypeText "a" with Command --select all
        TypeText "c" with Command --copy
        tell application "Acrobat™ Reader 4.0"
            activate --bring Acrobat to the front if not already
            TypeText "w" with Command --close the window
        end tell
        tell application "Finder"
            activate --in my experience you have to activate the Finder to work with the clipboard contents
            set extractedText to the clipboard as string --this is what we copied
            set newFile to open for access file finishedFile with write permission --create a new text file based on the file name & location of the PDF we are on
            repeat until exists alias finishedFile --sometimes the file doesn't appear fast enough - repeat until it exists
                --
            end repeat
            set eof file finishedFile to 0 --once it does, set the end of the file to 0 (might not be necessary)
            write extractedText to newFile --write the text we copied from the PDF to the text file
            close access newFile --close access to the file so it can be worked with
        end tell
    end repeat
end open
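If you want to reuse parts of the droplet elsewhere, the name-splitting and file-writing steps could be pulled out into handlers, roughly like this. This is only a sketch: baseNameOf and writeTextToFile are names I made up for illustration, and I am assuming only the Standard Additions file commands (open for access, set eof, write, close access).

-- Hypothetical helper: return a file name without its last extension.
on baseNameOf(fileName)
    set oldDelims to AppleScript's text item delimiters
    set AppleScript's text item delimiters to "."
    set nameParts to text items of fileName
    if (count of nameParts) > 1 then
        -- rejoin everything except the last part; the delimiter is still "." here
        set baseName to (items 1 thru -2 of nameParts) as string
    else
        set baseName to fileName -- no period, so nothing to strip
    end if
    set AppleScript's text item delimiters to oldDelims
    return baseName
end baseNameOf

-- Hypothetical helper: write text to the file at filePath (a path string), replacing any old contents.
on writeTextToFile(someText, filePath)
    set fileRef to open for access file filePath with write permission
    try
        set eof of fileRef to 0 -- truncate the file before writing
        write someText to fileRef
        close access fileRef
    on error errMsg number errNum
        close access fileRef -- don't leave the file open if the write fails
        error errMsg number errNum
    end try
end writeTextToFile

For example, baseNameOf("report.final.pdf") gives "report.final". If you call either handler from inside a tell block, put "my" in front of it (my baseNameOf(fileName)) so AppleScript looks for the handler in the script rather than in the application.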
Thanks, you solved my problem.