Recently I received an email from someone asking whether it was possible to convert all pages of a PDF file to JPG images using AppleScript.
I knew about the pdf2tiff script from Dinu C. Gherman, but had never heard of a similar solution for JPG conversion. Moreover, pdf2tiff currently converts just one page of a PDF file to TIFF, not all pages.
And then there is the versatile sips command, but it converts only the first page of a PDF file to JPG:
sips -s format jpeg /Users/martin/Desktop/sample.pdf --out /Users/martin/Desktop/sample.jpg
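If you would rather call sips from a script than from the Terminal, a minimal Python sketch could look like the following. The helper names build_sips_command and convert_first_page are assumptions made for illustration, and the sketch uses the documented long flag --out. It shares the limitation described above: only the first page of the PDF is converted.

```python
import subprocess
import sys


def build_sips_command(pdf_path, jpg_path):
    """Return the argument list for the sips call (hypothetical helper)."""
    return ["sips", "-s", "format", "jpeg", pdf_path, "--out", jpg_path]


def convert_first_page(pdf_path, jpg_path):
    """Run sips; only the first page of the PDF ends up in the JPG."""
    if sys.platform != "darwin":
        # sips ships with Mac OS X only
        raise RuntimeError("sips is only available on Mac OS X")
    # Passing a list avoids any shell quoting problems with spaces in paths
    subprocess.run(build_sips_command(pdf_path, jpg_path), check=True)
```

Building the command as a list instead of a single string means paths containing spaces need no extra quoting.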
But the automation request kept me thinking, so I finally sat down and quickly modified Gherman’s Python script to convert all pages of a PDF document to JPG. In addition, I wrote a convenient AppleScript droplet that executes this Python script on dropped PDF files.
I invite you to download the result right here; maybe you can make good use of the script, too:
PDF2JPG • Convert all pages of PDF files to JPG images (ca. 38.1 KB)
The script was tested on Mac OS X 10.5.2 and does not run on Mac OS X 10.4 or earlier incarnations of our beloved operating system. It works on both Intel- and PowerPC-based Macs and asks you to choose the resolution to be used for the image conversion.
In the future it would be nice to let the user choose single page numbers (or a range of page numbers) to be converted to JPG, as well as set the image compression factor, but currently I have no time to implement this.
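To give an idea of what the page selection could look like, here is a minimal Python sketch that turns a user-entered string such as "1-3,5" into a list of page numbers. Nothing like this exists in the script yet: the function name parse_page_spec and the comma-and-hyphen syntax are assumptions, not part of PDF2JPG.

```python
def parse_page_spec(spec, page_count):
    """Turn a spec such as "1-3,5" into a sorted list of 1-based page numbers.

    Hypothetical helper: the spec syntax is an assumption for illustration.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = part.split("-", 1)
            first, last = int(first), int(last)
        else:
            first = last = int(part)
        # reject ranges that fall outside the document or run backwards
        if first < 1 or last > page_count or first > last:
            raise ValueError("invalid page range: %r" % part)
        pages.update(range(first, last + 1))
    return sorted(pages)
```

The page count passed in would come from the getpagecount action that the Python script already offers.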
This is the modified Python script responsible for the PDF to JPG conversion step.
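Judging only from how the droplet calls it, the script takes an action name (getpagecount or pdf2jpg), the PDF path and, for pdf2jpg, a resolution. The following is a minimal sketch of that command-line interface and nothing more. It is not the original script: the function name parse_args and the returned dictionary are assumptions, and the actual rendering of the pages is left out entirely.

```python
ACTIONS = ("getpagecount", "pdf2jpg")


def parse_args(argv):
    """Split the arguments (without the script name) into their parts.

    Hypothetical helper that mirrors the call
    'python pdflib.py <action> <pdf path> [resolution]'.
    """
    if len(argv) < 2 or argv[0] not in ACTIONS:
        raise ValueError("usage: pdflib.py getpagecount|pdf2jpg <pdf> [resolution]")
    action, pdf_path = argv[0], argv[1]
    resolution = None
    if action == "pdf2jpg":
        if len(argv) < 3:
            raise ValueError("pdf2jpg needs a resolution")
        resolution = int(argv[2])
    return {"action": action, "pdf_path": pdf_path, "resolution": resolution}
```

A real script would pass sys.argv[1:] to such a function and then dispatch on the action.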
This is the AppleScript droplet code that executes the above Python script on every dropped PDF document:
-- created: 01.04.2008
-- version: 0.1
-- tested on:
-- • Mac OS X 10.5.2
-- • Intel & PowerPC based Macs
-- This script will convert dropped PDF files to JPG images.
-- The JPG images are saved in the same folder as the
-- PDF source files. If a JPG file already exists,
-- it won't be replaced. The PDF source files are
-- not modified. For now, all pages of a PDF file
-- are converted to JPG. It would be a nice feature
-- if the user could also choose only certain page
-- numbers to be converted. Future?
property mytitle : "PDF2JPG"
property batchresolution : missing value
-- I am called when the user drops Finder items onto the script icon
on open droppeditems
    my main(droppeditems)
end open
-- I am called when the user double clicks the script icon
on run
    set infomsg to "I am a hungry AppleScript droplet, so please drop a bunch of PDF files onto my icon to convert them to JPG images."
    my dspinfomsg(infomsg)
    return
end run
-- I am the main function controlling the script flow
on main(droppeditems)
    try
        -- initializing important script properties
        set batchresolution to missing value
        -- searching the dropped items for PDF files
        set pdfpaths to my getpdfpaths(droppeditems)
        -- no PDF files found :(
        if pdfpaths is {} then
            set errmsg to "You did not drop any PDF documents onto the script."
            my dsperrmsg(errmsg, "--")
            return
        end if
        -- processing the PDF files
        repeat with pdfpath in pdfpaths
            -- getting the image resolution to be used for the PDF2JPG conversion
            if batchresolution is missing value then
                set resolution to my askforresolution(pdfpath)
            else
                set resolution to batchresolution
            end if
            -- did the user provide a resolution?
            if resolution is not missing value then
                -- yes, so let's convert the PDF to JPG
                my pdf2jpg(pdfpath, resolution)
            end if
        end repeat
        -- catching unexpected errors
    on error errmsg number errnum
        my dsperrmsg(errmsg, errnum)
    end try
end main
-- I am searching the dropped items for PDF files
-- and return a list of unquoted Posix file paths
on getpdfpaths(droppeditems)
    set pdfpaths to {}
    repeat with droppeditem in droppeditems
        set iteminfo to info for droppeditem
        if folder of iteminfo is false and name extension of iteminfo is "pdf" then
            set pdfpaths to pdfpaths & (POSIX path of (droppeditem as Unicode text))
        end if
    end repeat
    return pdfpaths
end getpdfpaths
-- I am returning the Posix path to the Python script
-- responsible for the PDF manipulation, which is
-- located in the application bundle
on getpyscriptpath()
    set pyscriptpath to ((path to me) as Unicode text) & "Contents:Resources:pdflib.py"
    return (POSIX path of pyscriptpath)
end getpyscriptpath
-- I am returning the total page count of a given PDF file
-- [PDF file path must be passed as an unquoted Posix path]
on getpagecount(pdfpath)
    set action to "getpagecount"
    set cmd to "python" & space & quoted form of (my getpyscriptpath()) & space & action & space & quoted form of pdfpath
    set cmd to cmd as «class utf8»
    set pagecount to (do shell script cmd) as integer
    return pagecount
end getpagecount
-- I am converting a given PDF file to JPG
-- [PDF file path must be passed as an unquoted Posix path]
on pdf2jpg(pdfpath, resolution)
    set action to "pdf2jpg"
    set cmd to "python" & space & quoted form of (my getpyscriptpath()) & space & action & space & quoted form of pdfpath & space & resolution
    set cmd to cmd as «class utf8»
    do shell script cmd
end pdf2jpg
-- I am asking the user to provide a value for the resolution
on askforresolution(pdfpath)
    -- the dialog names the PDF file the resolution is asked for
    set msg to "Please enter a resolution (72-600) for the JPG conversion of the following PDF file:" & return & return & (pdfpath as Unicode text)
    try
        tell me
            display dialog msg default answer "72" buttons {"Use value for batch", "Cancel", "Enter"} default button 3 with title mytitle
            set dlgresult to result
        end tell
    on error errmsg number errnum
        -- user hit 'Cancel' button :(
        if errnum is equal to -128 then
            return missing value
        end if
        -- anything else is unexpected, so hand it on to the caller
        error errmsg number errnum
    end try
    set resolution to text returned of dlgresult
    -- empty input...asking again :)
    if resolution is "" then
        return my askforresolution(pdfpath)
    else
        try
            -- can the input be coerced to an integer?
            set resolution to resolution as integer
        on error
            -- no, it can't...so ask again and return that answer
            set errmsg to "The entered resolution is not a number."
            my dsperrmsg(errmsg, "--")
            return my askforresolution(pdfpath)
        end try
        -- is the given resolution valid?
        if resolution > 600 then
            -- no, it's too high...
            set errmsg to "The entered resolution (" & resolution & ") exceeds the maximum value (600)."
            my dsperrmsg(errmsg, "--")
            return my askforresolution(pdfpath)
        else if resolution < 0 then
            -- no, it's too low...
            set errmsg to "The entered resolution (" & resolution & ") is a negative value."
            my dsperrmsg(errmsg, "--")
            return my askforresolution(pdfpath)
        else
            -- finally...
            if button returned of dlgresult is "Use value for batch" then
                set batchresolution to resolution
            end if
            return resolution
        end if
    end if
end askforresolution
-- I am displaying info messages
on dspinfomsg(infomsg)
    tell me
        activate
        display dialog infomsg buttons {"OK"} default button 1 with icon note with title mytitle
    end tell
end dspinfomsg
-- I am displaying error messages, hopefully rather seldom :)
on dsperrmsg(errmsg, errnum)
    set msg to "Sorry, an error occurred:" & return & return & errmsg & " (" & errnum & ")"
    tell me
        activate
        display dialog msg buttons {"OK"} default button 1 with icon stop with title mytitle
    end tell
end dsperrmsg
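For readers who are more at home in Python than in AppleScript, the two central ideas of the droplet can be sketched in a few lines: checking the entered resolution and quoting the paths before they are handed to the shell. This is only an illustration under assumptions, not code from the download. The names validate_resolution and build_command are made up, and shlex.quote stands in for AppleScript's quoted form of.

```python
import shlex


def validate_resolution(text):
    """Return (resolution, None) on success or (None, message) on failure.

    Hypothetical helper using the same limits as the droplet (0 to 600).
    """
    text = text.strip()
    if not text:
        return None, "No resolution entered."
    try:
        resolution = int(text)
    except ValueError:
        return None, "The entered resolution is not a number."
    if resolution > 600:
        return None, "The entered resolution (%d) exceeds the maximum value (600)." % resolution
    if resolution < 0:
        return None, "The entered resolution (%d) is a negative value." % resolution
    return resolution, None


def build_command(script_path, pdf_path, resolution):
    """Assemble the shell command, quoting both paths (hypothetical helper)."""
    parts = ["python", shlex.quote(script_path), "pdf2jpg",
             shlex.quote(pdf_path), str(resolution)]
    return " ".join(parts)
```

Returning a message instead of asking again recursively leaves it to the caller to decide how often the user is prompted.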