Import Multi-page PDF into Pages as background

kiwilegal · February 28, 2009, 5:40pm

Hi there

Am wondering if someone can tell me that I am barking up the wrong tree.

I have found Apple’s Pages word processor in its current form to be very handy for completing PDF forms, overlaying translation etc etc. At the moment, I open the relevant PDF in Preview, drag over the PDF pages one by one into Pages, creating a new page in Pages to receive each PDF page. I resize each imported page in Pages to fill the Pages page. I then lock the page and send it to the background using Pages’ object inspector.

That all sounds rather complicated but hopefully you get the idea.

I am trying to automate this process so I can get down to the business end ASAP without having to manually set up the Pages document each time in the above fashion. However, I am stuck, and it may be a limitation of the scriptability of Pages.

So far, all that works is this:

tell me to set thePDF to choose file with prompt "CHOOSE A PDF FILE" of type {"PDF "} without invisibles
tell application "Pages"
	set newDoc to make document
	tell newDoc
		set {top margin, bottom margin, right margin, left margin} to {0, 0, 0, 0}
		-- Routine here: bring in each page of the PDF, expand it to fill the matching page in Pages, lock it and move it to the background. Repeat for the number of pages in the PDF.
	end tell
end tell

As you can see, the hard bit is missing. I have tried a number of things but nothing seems to work. I cannot see how to “extract” a particular PDF page. Then in Pages, I am not sure you can send an object to the background using AppleScript, although you can apparently do this for a layer. Also, it is important that the imported PDF page be a floating object, not inline (where it moves with the text). Again, I cannot see in AppleScript how to make an object (graphic) floating; the default when a graphic comes in seems to be inline.

Perhaps someone will put me out of my misery. I think the task may be too difficult.

Thanks

Fenton · March 1, 2009, 5:54pm

I don’t know about the Pages part, as I don’t use Pages. But I was recently trying to figure out a good way to split PDFs into one file per page. There are several ways to do that. There are commercial apps which can do it (which I don’t have). There’s the free Skim, which I imagine could do it (it can extract the text), but I can’t figure out how to save each page as a PDF file (please do tell if you know). And there’s Automator, for which I found an ExtractPDF workflow (it worked, but each file was way bigger than the original).

Finally I got and installed this small command line tool, which seems to do a good job. The files are the appropriate size. Its use is very simple, if you are OK with just doing it in the same folder as the original file (you could move them later).

http://www.iis.ee.ic.ac.uk/~g.briscoe/ICL/JoinPDF.html

It’s named JoinPDF, but it can also split PDFs. Example:


set mac_file to choose file
set unix_file to quoted form of POSIX path of mac_file

do shell script "splitPDF " & unix_file
-- splits into filename1.pdf, filename2.pdf, etc., in same folder
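
For anyone driving the splitter from Python instead of AppleScript, the same call might be sketched as below. This assumes the splitPDF tool from the JoinPDF package is on the PATH and that the output files follow the filename1.pdf, filename2.pdf pattern described in the comment above. `split_command` and `expected_outputs` are hypothetical helper names, not part of the tool.

```python
import os
import shlex


def split_command(pdf_path):
    """Build the shell command, quoting the path much like AppleScript's `quoted form of`."""
    return "splitPDF " + shlex.quote(pdf_path)


def expected_outputs(pdf_path, page_count):
    """Guess the per-page file names (assumed pattern: name1.pdf, name2.pdf, ...)."""
    stem, ext = os.path.splitext(pdf_path)
    return ["%s%d%s" % (stem, n, ext) for n in range(1, page_count + 1)]
```

The command string can then be handed to `subprocess.run(..., shell=True)`, and the guessed names checked with `os.path.exists` before importing them anywhere.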

kiwilegal · March 2, 2009, 8:29am

Thanks for all the help here guys. I was offline for most of the weekend but had almost completed a script. But nothing as cool looking as Jacques’!! I will have a play around. One of the problems I was having was creating a new document in a way that allowed the PDF page image to completely fill the page. Setting margins to (0,0,0,0) did not do it. I also had to deactivate headers/footers in the document inspector. I could not see how to do this in AppleScript with Pages, so I saved a template and opened that each time I started.

Does your script entirely fill the page with the PDF page image, Jacques? i.e. right to the edge.

Sorry, not near my Mac for a while so am unable to test:(
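
Whether a PDF page reaches right to the edge of the Pages page comes down to the two aspect ratios. Here is a rough sketch of the arithmetic, assuming both sizes are in points (A4 is roughly 595 x 842, US Letter is 612 x 792). The function names are hypothetical and are not taken from any script in this thread.

```python
def stretch_factors(src_w, src_h, page_w, page_h):
    """Scale each axis independently so the image covers the page exactly (may distort)."""
    return page_w / src_w, page_h / src_h


def fit_scale(src_w, src_h, page_w, page_h):
    """Single scale that keeps proportions and shows the whole image (may leave a border)."""
    return min(page_w / src_w, page_h / src_h)


def fill_scale(src_w, src_h, page_w, page_h):
    """Single scale that keeps proportions and covers the page (may crop an edge)."""
    return max(page_w / src_w, page_h / src_h)
```

When the PDF and the Pages document use the same paper size, all three give a factor of 1 and the distinction disappears.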

kiwilegal · March 4, 2009, 8:35pm

Thanks, Jacques

Will be deploying this weekend

Merci

kiwilegal · March 7, 2009, 9:45pm

Brilliant!

All works without a hitch.

I have made some small tweaks (e.g. to delete the created PDF page files as they are imported), but it works beautifully. A 440-page PDF was brought into Pages without any problems at all.

Many thanks, Jacques

kiwilegal · March 7, 2009, 10:12pm

Tweaked script follows:

tell me to activate
set a_PDF to choose file of type "com.adobe.pdf" without invisibles
set posix_PDF_Path to POSIX path of a_PDF

do shell script "/usr/bin/python -c 'import os, sys
from CoreGraphics import *

pdf_filename = \"" & (posix_PDF_Path) & "\"
pdf_name, ext = os.path.splitext(pdf_filename)
provider = CGDataProviderCreateWithFilename( pdf_filename )
this_pdf = CGPDFDocumentCreateWithProvider( provider )
if this_pdf is None:
   print \"Error reading PDF document - check that the supplied filename points to a PDF file\"
   sys.exit(1)

for page_number in range( 1, this_pdf.getNumberOfPages() + 1 ):
   new_file_path = \"%s_%d.pdf\" % (pdf_name, page_number)
   pageRect = this_pdf.getPage(page_number).getBoxRect(kCGPDFMediaBox)
   writeContext = CGPDFContextCreateWithFilename(new_file_path, pageRect)
   writeContext.beginPage(pageRect)
   writeContext.drawPDFDocument(pageRect, this_pdf, page_number)
   writeContext.endPage()
   print new_file_path'"

-- the Python script prints one file path per line
set pdf_List to paragraphs of the result
--display dialog pdf_List
set tot to (count pdf_List)
set i to 1
if "Error reading PDF document" is not in item 1 of pdf_List then
	tell application "Pages"
		set newDoc to make document
		tell newDoc
			set {top margin, bottom margin, right margin, left margin} to {0, 0, 0, 0}
			-- use the size of the first page for every imported image
			tell page 1
				set h to height
				set w to width
			end tell
			repeat with aPdf in pdf_List
				set thisFile to ((POSIX file aPdf) as string) as alias
				-- place the PDF page on the background layer, sized to the page, then lock it
				tell (make new image at end of images of background layer of page i with properties ¬
					{image data:thisFile, vertical position:0, horizontal position:0, height:h, width:w})
					set locked to true
				end tell
				select insertion point after every text of body text
				if i < tot then insert page break (body text) -- new page
				set i to i + 1
				--display dialog aPdf
				-- delete the temporary single-page PDF once it has been imported
				set filepath to POSIX path of thisFile
				tell application "System Events" to delete disk item filepath
			end repeat
		end tell
	end tell
end if
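
The embedded Python relies on Apple's CoreGraphics bindings for the system Python of that era, so it is not portable as written. The part that does not depend on those bindings is the naming of the per-page files and their removal once imported, which might look like this in Python 3. `page_paths` and `cleanup` are hypothetical helpers, and the naming follows the `%s_%d.pdf` pattern used in the script above.

```python
import os


def page_paths(pdf_filename, page_count):
    """Return the per-page output paths, e.g. doc_1.pdf, doc_2.pdf, next to the source PDF."""
    pdf_name, _ext = os.path.splitext(pdf_filename)
    return ["%s_%d.pdf" % (pdf_name, n) for n in range(1, page_count + 1)]


def cleanup(paths):
    """Delete the temporary page files that exist and return the ones actually removed."""
    removed = []
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Passing the PDF path in as an argument instead of pasting it into the Python source would also avoid trouble with file names that contain quote characters.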