Sunday, April 18, 2021

#1 2021-03-14 01:38:43 pm

ayden27
Member
Registered: 2021-03-08
Posts: 1

Rasterize pdf

Hi, guys!
I'm a newbie in AppleScript and don't have any coding experience. I'm trying to make a script to rasterize PDF files. Basically, I try to first convert the selected PDF file to TIFF format, and then convert this TIFF image back to PDF with the same name in the same folder, i.e. overwrite the file. I tried to use the Preview app to convert the PDF to TIFF, but no luck. Second, I don't see how to make it work when I choose several PDFs at once (I use an Automator quick action for Finder to select the PDFs, and then send them to AppleScript).
I will be very grateful for any help.
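
On the second question (several PDFs at once), a minimal sketch of an Automator "Run AppleScript" action may help. Automator passes the Finder selection to the run handler as a list, so a repeat loop can handle each file in turn. The handler name rasterizeInPlace is hypothetical and only a placeholder for whatever conversion is eventually chosen; it is not part of any library.

```applescript
-- Sketch of an Automator "Run AppleScript" action.
-- Assumption: the quick action is set to receive files from Finder,
-- so input arrives as a list of the selected items.
on run {input, parameters}
	set processedPaths to {}
	repeat with anItem in input
		set posixPath to POSIX path of (contents of anItem)
		-- Skip anything that is not a PDF (text comparison ignores case by default)
		if posixPath ends with ".pdf" then
			my rasterizeInPlace(posixPath)
			set end of processedPaths to posixPath
		end if
	end repeat
	return processedPaths
end run

-- Hypothetical placeholder: replace the body with the actual conversion.
on rasterizeInPlace(posixPath)
	log "Would rasterize: " & posixPath
end rasterizeInPlace
```

Passing the same path as both source and destination to the conversion step would give the overwrite behaviour described above, so testing on copies first seems prudent.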


Filed under: PDF


#2 2021-03-15 07:21:49 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 839

Re: Rasterize pdf

ayden27. The Preview app does have the ability to save a PDF as TIFF--at least with Catalina. Just open the PDF in Preview, hold down the option key and select "File" > "Save As", and select TIFF as the file format. Then, repeat this to save the TIFF as a PDF. You could automate this process with GUI scripting. It's important to note that Catalina's Preview will save a multipage PDF as a multipage TIFF.

The following will do what you want with an AppleScript but it has a fatal flaw, which is that the sips utility appears unable to create a multipage TIFF. The same appears to be the case with Image Events.

Applescript:

set sourceFiles to (choose file with multiple selections allowed)

repeat with aFile in sourceFiles
   set aFile to quoted form of POSIX path of aFile
   repeat with aFormat in {"tiff", "pdf"}
       do shell script "sips --setProperty format " & (aFormat as text) & " " & aFile & " --out " & aFile
       delay 0.5 -- test different values
   end repeat
end repeat

My guess is that ASObjC could be used to do what you want but I don't have the knowledge level to accomplish this. Perhaps another forum member will let you know if this is possible.

Last edited by peavine (2021-03-15 07:45:01 am)


2018 Mac mini - macOS Catalina


#3 2021-03-16 10:21:54 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

This topic interested me.

As I see it, scripting the Preview interface is a good way to convert a PDF to a multipage TIFF, and there is an ASObjC script for the reverse conversion. But I would like to understand the purpose of converting there and back. What is the advantage?

Another point: as far as I can see, sips and Image Events can export single-page TIFFs. I think ASObjC can also convert each page of a PDF to a separate TIFF and, as far as I know, an ASObjC script for merging TIFFs exists. So Preview GUI scripting may be avoidable. But first I want to know what the purpose is.

NOTE: on Catalina, "Save As..." corresponds to the "Export..." menu item of the interface.

Last edited by KniazidisR (2021-03-16 10:39:18 am)


Model: MacBook Pro
OS X: Catalina 10.15.4
Web Browser: Safari 13.1
Ram: 4 GB


#4 2021-03-16 12:28:25 pm

db123
Member
Registered: 2020-12-07
Posts: 16

Re: Rasterize pdf

If you convert a PDF to a TIFF and back again, then it is no longer searchable and can't be indexed. Possibly that is the reason, but certainly ayden27 can say more about it.
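
One way to check this on a given file is to ask PDFKit how much text it can extract. The sketch below assumes the Quartz framework's PDFDocument class and its string property; a fully rasterized PDF would be expected to report zero (or close to zero) extractable characters, though this has not been verified against the scripts in this thread.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use framework "Quartz"
use scripting additions

-- Pick a PDF and open it with PDFKit
set aPDF to choose file of type {"com.adobe.pdf"} with prompt "Choose a PDF to inspect"
set theURL to current application's |NSURL|'s fileURLWithPath:(POSIX path of aPDF)
set theDoc to current application's PDFDocument's alloc()'s initWithURL:theURL
if theDoc is missing value then error "Could not open the PDF."

-- The string property holds the document's extractable text; it may be missing
set docText to theDoc's |string|()
if docText is missing value then
	set charCount to 0
else
	set charCount to (docText's |length|()) as integer
end if

display dialog "Extractable characters: " & charCount
```

Running it on the original and on the rasterized copy should make the difference visible.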


#5 2021-03-17 01:23:49 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

I have written an ASObjC solution here, for myself and other users, that does not involve Preview GUI scripting. I improved the script's speed further by creating and using a RAM disk instead of the Temporary Items folder of the user domain:

Applescript:


use scripting additions
use framework "Foundation"
use framework "AppKit"
use framework "Quartz"
use framework "QuartzCore"

property NSString : a reference to NSString of current application
property |NSURL| : a reference to |NSURL| of current application
property PDFDocument : a reference to PDFDocument of current application
property NSImage : a reference to NSImage of current application
property NSImageView : a reference to NSImageView of current application
property NSBitmapImageRep : a reference to NSBitmapImageRep of current application
property NSTIFFFileType : a reference to NSTIFFFileType of current application
property desktopFolder : path to desktop folder

-- Create RAM disk
my makeRAMdisk()
-- Choose some pdf file
set aPDF to choose file of type "pdf"
-- Make temporary job folder on the RAM disk
tell application "System Events" to try
   make new folder at folder "Ram disk:" with properties {name:"TIFFs"}
end try
-- Save PDF as multiple TIFFs in the temporary job folder
set TIFFs to my splitPDFasTIFFs(aPDF)
-- Merge TIFFs back to single PDF file, saved on the desktop
my combineFiles:TIFFs savingToPDF:(POSIX path of desktopFolder & "Combined.pdf")
-- Now, we can delete unneeded temporary folder
tell application "System Events" to delete folder "Ram disk:TIFFs:"
-- eject the RAM disk (if needed)
tell application "Finder" to eject disk "RAM Disk"


--=================================== HANDLERS =======================================

on makeRAMdisk()
   set dName to "RAM Disk"
   set dCapacity to 512 * 2 * 2000 --1GB
   set aCmd to "diskutil erasevolume HFS+ '" & dName & "' `hdiutil attach -nomount ram://" & (dCapacity as string) & "`"
   do shell script aCmd
end makeRAMdisk


on splitPDFasTIFFs(aPDF)
   set aURL to (|NSURL|'s fileURLWithPath:(POSIX path of aPDF))
   set aPDFdoc to PDFDocument's alloc()'s initWithURL:aURL
   set pCount to aPDFdoc's pageCount()
   set TIFFs to {}
   -- Split the PDF into pages exported as Tiff files
   repeat with i from 0 to (pCount - 1)
       set thisPage to (aPDFdoc's pageAtIndex:i)
       set thisDoc to (NSImage's alloc()'s initWithData:(thisPage's dataRepresentation()))
       if thisDoc = missing value then error "Error in getting imagerep from PDF in page:" & (i as string)
       set theData to thisDoc's TIFFRepresentation()
       set newRep to (NSBitmapImageRep's imageRepWithData:theData)
       set targData to (newRep's representationUsingType:NSTIFFFileType |properties|:{NSTIFFCompressionNone:1})
       set nextPath to "/Volumes/RAM Disk/TIFFs/" & i & ".tiff"
       set end of TIFFs to nextPath
       set outPath to (NSString's stringWithString:nextPath)
       (targData's writeToFile:outPath atomically:true) -- Export
   end repeat
   return TIFFs
end splitPDFasTIFFs


on combineFiles:TIFFs savingToPDF:destPosixPath
   -- make new empty PDF document
   set theDoc to PDFDocument's alloc()'s init()
   repeat with i from 0 to (count TIFFs) - 1
       -- make URL of the next TIFF
       set inNSURL to (|NSURL|'s fileURLWithPath:(item (i + 1) of TIFFs))
       -- make PDF document from the URL
       set newDoc to (my pdfDocFromImageURL:inNSURL)
       -- get page of PDF
       set thePDFPage to (newDoc's pageAtIndex:0) -- zero-based indexes
       -- insert the page into main PDF
       (theDoc's insertPage:thePDFPage atIndex:i)
   end repeat
   set outNSURL to |NSURL|'s fileURLWithPath:destPosixPath
   -- save the main PDF
   (theDoc's writeToURL:outNSURL)
end combineFiles:savingToPDF:


on pdfDocFromImageURL:inNSURL
   set theImage to NSImage's alloc()'s initWithContentsOfURL:inNSURL
   set theSize to theImage's |size|()
   set theRect to {{0, 0}, theSize}
   set theImageView to NSImageView's alloc()'s initWithFrame:theRect
   theImageView's setImage:theImage
   set theData to theImageView's dataWithPDFInsideRect:theRect
   return PDFDocument's alloc()'s initWithData:theData
end pdfDocFromImageURL:

Last edited by KniazidisR (2021-03-18 03:23:45 am)



#6 2021-03-17 04:33:03 am

Mockman
Member
From: Toronto
Registered: 2020-05-27
Posts: 12

Re: Rasterize pdf

peavine wrote:

The following will do what you want with an AppleScript but it has a fatal flaw, which is that the sips utility appears unable to create a multipage TIFF. The same appears to be the case with Image Events.


I wouldn't call it a fatal flaw.

sips is a tool for working with raster images and colour profiles. Image Events provides access to its functionality.

PDF is a vector format, and thus sips is not built for such a purpose.


#7 2021-03-17 05:56:56 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6615

Re: Rasterize pdf

KniazidisR wrote:

I have written an ASObjC solution here, for myself and other users, that does not involve Preview GUI scripting. I improved the script's speed further by creating and using a RAM disk instead of the Temporary Items folder of the user domain:



You can probably skip the intermediate files and RAM disk altogether, like this:

Applescript:

on rasterPDF:aPDF savingTo:destPosixPath
   set aURL to (|NSURL|'s fileURLWithPath:(POSIX path of aPDF))
   set aPDFdoc to PDFDocument's alloc()'s initWithURL:aURL
   set pCount to aPDFdoc's pageCount()
   repeat with i from 0 to (pCount - 1)
       set thisPage to (aPDFdoc's pageAtIndex:i)
       set thisDoc to (NSImage's alloc()'s initWithData:(thisPage's dataRepresentation()))
       if thisDoc = missing value then error "Error in getting image from PDF in page:" & (i as string)
       set theData to thisDoc's TIFFRepresentation()
       set theImage to (NSImage's alloc()'s initWithData:theData)
       set newPage to (current application's PDFPage's alloc()'s initWithImage:theImage)
       (aPDFdoc's removePageAtIndex:i)
       (aPDFdoc's insertPage:newPage atIndex:i)
   end repeat
   set outNSURL to |NSURL|'s fileURLWithPath:destPosixPath
   aPDFdoc's writeToURL:outNSURL
end rasterPDF:savingTo:


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com


#8 2021-03-17 06:48:50 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

There are no words. Great. I will keep both scripts for myself, as a keepsake. Your script, Shane, is what I call optimal. By the way, I hadn't found the slightest information on PDF rasterization using ASObjC before.

I tested the two scripts with a 168-page PDF. The speed is almost the same (mine is slightly faster), but the PDF produced by Shane's script is 42 MB and the PDF produced by my script is 9 MB. I don't understand why there is such a big difference between them.

Last edited by KniazidisR (2021-03-17 07:47:53 am)



#9 2021-03-17 07:45:35 am

Mockman
Member
From: Toronto
Registered: 2020-05-27
Posts: 12

Re: Rasterize pdf

Is it possible to increase the resolution within the script? Say… to 150 dpi? The output seems like it's about 72 dpi.


#10 2021-03-17 08:05:55 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 839

Re: Rasterize pdf

KniazidisR wrote:

There are no words. Great... By the way, I hadn't found the slightest information on PDF rasterization using ASObjC before.



@Shane. I agree with KniazidisR--your script is outstanding. Very useful and beautifully compact.

BTW, the script did not work until I inserted "current application's" in several spots. Is there some reason these are not needed?
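
A likely explanation, judging from post #5: the handler seems to have been written to drop into a script that already declares properties such as NSImage, PDFDocument and |NSURL|. With such a property in place, the bare class name resolves on its own; without it, "current application's" has to be spelled out. A minimal sketch of the two equivalent forms, using NSString only as a stand-in class:

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- With this property, a bare NSString resolves to the Cocoa class
property NSString : a reference to current application's NSString

-- Form 1: relies on the property declared above
set viaProperty to ((NSString's stringWithString:"rasterize")'s uppercaseString()) as text

-- Form 2: needs no property, the class is named in full
set viaFullName to ((current application's NSString's stringWithString:"rasterize")'s uppercaseString()) as text

-- Both forms call the same class method, so the results should match
return viaProperty is equal to viaFullName
```

Deleting the property line should make the first form fail to run while the second keeps working, which matches the behaviour described above.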

@Mockman. I meant the words, fatal flaw, to refer to my script and its inability to fulfill the OP's needs. Perhaps I should have been more clear.



#11 2021-03-17 09:08:48 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 839

Re: Rasterize pdf

KniazidisR wrote:

I tested the two scripts with a 168-page PDF. The speed is almost the same (mine is slightly faster), but the PDF produced by Shane's script is 42 MB and the PDF produced by my script is 9 MB. I don't understand why there is such a big difference between them.



FWIW, I tested Shane's and KniazidisR's scripts, using Shane's ASObjC book (a PDF) as the test document. I also tested with Preview (save as TIFF at 72 dpi and then as PDF). The file sizes were:

Original - 2.4 MB

With Shane's script - 104.5 MB

With KniazidisR's script - 21.5 MB

With Preview - 18.1 MB

I looked at the new PDFs. Shane's was as expected, but the pages of the PDF created by KniazidisR's script were out of order. This appears to be fixed by padding the counter used in naming the TIFF files.

Applescript:

set j to text -3 thru -1 of ("000" & i as text)
set outPath to (NSString's stringWithString:("/Volumes/RAM Disk/TIFFs/" & j & ".tiff"))

The PDFs created by KniazidisR's and Shane's scripts were both 72 dpi.

Last edited by peavine (2021-03-17 03:05:34 pm)



#12 2021-03-17 05:08:54 pm

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6615

Re: Rasterize pdf

Mockman wrote:

Is it possible to increase the resolution within the script?



This version lets you specify the resolution:

Applescript:

on rasterPDF:aPDF savingTo:destPosixPath resolution:theDpi
   set aURL to (|NSURL|'s fileURLWithPath:(POSIX path of aPDF))
   set aPDFdoc to PDFDocument's alloc()'s initWithURL:aURL
   set pCount to aPDFdoc's pageCount()
   repeat with i from 0 to (pCount - 1)
       set thisPage to (aPDFdoc's pageAtIndex:i)
       -- do size calculations
       set pageSize to (thisPage's boundsForBox:(current application's kPDFDisplayBoxMediaBox))
       set pageWidth to current application's NSWidth(pageSize)
       set pageHeight to current application's NSHeight(pageSize)
       set pixelWidth to (pageWidth * theDpi / 72) div 1
       set pixelHeight to (pageHeight * theDpi / 72) div 1
       -- make bitmaps
       set theImageRep to (current application's NSPDFImageRep's imageRepWithData:(thisPage's dataRepresentation()))
       set newRep to (current application's NSBitmapImageRep's alloc()'s initWithBitmapDataPlanes:(missing value) pixelsWide:pixelWidth pixelsHigh:pixelHeight bitsPerSample:8 samplesPerPixel:4 hasAlpha:yes isPlanar:false colorSpaceName:(current application's NSDeviceRGBColorSpace) bytesPerRow:0 bitsPerPixel:32)
       -- store the existing graphics context
       current application's NSGraphicsContext's saveGraphicsState()
       -- set graphics context to new context based on the new bitmapImageRep
       (current application's NSGraphicsContext's setCurrentContext:(current application's NSGraphicsContext's graphicsContextWithBitmapImageRep:newRep))
       (theImageRep's drawInRect:{origin:{x:0, y:0}, |size|:{width:pixelWidth, height:pixelHeight}} fromRect:(current application's NSZeroRect) operation:(current application's NSCompositeSourceOver) fraction:1.0 respectFlipped:false hints:(missing value))
       -- restore state
       current application's NSGraphicsContext's restoreGraphicsState()
       -- make new image and page
       (newRep's setSize:{pageWidth, pageHeight})
       set theData to newRep's TIFFRepresentation()
       set theImage to (NSImage's alloc()'s initWithData:theData)
       set newPage to (current application's PDFPage's alloc()'s initWithImage:theImage)
       (aPDFdoc's removePageAtIndex:i)
       (aPDFdoc's insertPage:newPage atIndex:i)
   end repeat
   set outNSURL to |NSURL|'s fileURLWithPath:destPosixPath
   aPDFdoc's writeToURL:outNSURL
end rasterPDF:savingTo:resolution:



#13 2021-03-18 03:34:26 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

peavine wrote:

I looked at the new PDFs. Shane's was as expected, but the pages of the PDF created by KniazidisR's script were out of order. This appears to be fixed by padding the counter used in naming the TIFF files.


Thank you, peavine, for looking into it. My script did not keep the pages in order. I have made the fix in post #5, in a way that is slightly more efficient than padding the filename.

Also, I removed the unnecessary repeat loop (in the combineFiles handler), and now the script is 1.5 times faster. (It also creates a PDF whose size is close to that of the PDF created by the Preview method.)

Last edited by KniazidisR (2021-03-18 03:40:06 am)



#14 2021-03-18 05:25:58 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

Shane Stanley wrote:

This version lets you specify the resolution:


Shane, thanks for your last script. I ran 2 tests.

Your script worked successfully with a 5-page PDF and DPI = 600. With a 168-page PDF and DPI = 600, the script hangs and I get a message that Script Debugger is not responding and is using 62 GB of memory!

It looks like a memory leak somewhere in the script. What do you say?

I tried your handler as follows. Maybe the alloc() statements need some additional parentheses?

Applescript:


use scripting additions
use framework "Foundation"
use framework "AppKit"
use framework "Quartz"
use framework "QuartzCore"

set aPDF to choose file of type "pdf"
set destPosixPath to (POSIX path of (path to desktop folder)) & "Rasterized.pdf"
my rasterPDF:aPDF savingTo:destPosixPath resolution:600

on rasterPDF:aPDF savingTo:destPosixPath resolution:theDpi
   set aURL to (current application's |NSURL|'s fileURLWithPath:(POSIX path of aPDF))
   set aPDFdoc to current application's PDFDocument's alloc()'s initWithURL:aURL
   set pCount to aPDFdoc's pageCount()
   -- do size calculations
   set thisPage to (aPDFdoc's pageAtIndex:0)
   set pageSize to (thisPage's boundsForBox:(current application's kPDFDisplayBoxMediaBox))
   set pageWidth to current application's NSWidth(pageSize)
   set pageHeight to current application's NSHeight(pageSize)
   set pixelWidth to (pageWidth * theDpi / 72) div 1
   set pixelHeight to (pageHeight * theDpi / 72) div 1
   repeat with i from 0 to (pCount - 1)
       set thisPage to (aPDFdoc's pageAtIndex:i)
       -- make bitmaps
       set theImageRep to (current application's NSPDFImageRep's imageRepWithData:(thisPage's dataRepresentation()))
       set newRep to (current application's NSBitmapImageRep's alloc()'s initWithBitmapDataPlanes:(missing value) pixelsWide:pixelWidth pixelsHigh:pixelHeight bitsPerSample:8 samplesPerPixel:4 hasAlpha:yes isPlanar:false colorSpaceName:(current application's NSDeviceRGBColorSpace) bytesPerRow:0 bitsPerPixel:32)
       -- store the existing graphics context
       current application's NSGraphicsContext's saveGraphicsState()
       -- set graphics context to new context based on the new bitmapImageRep
       (current application's NSGraphicsContext's setCurrentContext:(current application's NSGraphicsContext's graphicsContextWithBitmapImageRep:newRep))
       (theImageRep's drawInRect:{origin:{x:0, y:0}, |size|:{width:pixelWidth, height:pixelHeight}} fromRect:(current application's NSZeroRect) operation:(current application's NSCompositeSourceOver) fraction:1.0 respectFlipped:false hints:(missing value))
       -- restore state
       current application's NSGraphicsContext's restoreGraphicsState()
       -- make new image and page
       (newRep's setSize:{pageWidth, pageHeight})
       set theData to newRep's TIFFRepresentation()
       set theImage to (current application's NSImage's alloc()'s initWithData:theData)
       set newPage to (current application's PDFPage's alloc()'s initWithImage:theImage)
       (aPDFdoc's removePageAtIndex:i)
       (aPDFdoc's insertPage:newPage atIndex:i)
   end repeat
   set outNSURL to current application's |NSURL|'s fileURLWithPath:destPosixPath
   aPDFdoc's writeToURL:outNSURL
end rasterPDF:savingTo:resolution:

Last edited by KniazidisR (2021-03-18 05:29:14 am)



#15 2021-03-18 05:34:45 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6615

Re: Rasterize pdf

KniazidisR wrote:

It looks like a memory leak somewhere in the script. What do you say?



The issue, sadly, is that ASObjC leaks memory badly, period. Initially it relied on automatic garbage collection, but when that was abandoned memory management was presumably just tacked on to AppleScript's own, periodic, garbage collection. But even that doesn't seem to clear everything out.

In most cases, the OS's efficient overall memory management means it doesn't matter much. But when you push it hard -- which you're doing in that test -- it tends to bog down more or less completely.

(The leaking is such that if you run a batch of tests, the memory use is cumulative. I suspect that's a straight bug.)
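
As a rough illustration of how hard a 600 dpi test pushes memory, whether or not anything leaks, the following sketch estimates the size of the uncompressed bitmaps alone. The page size (US Letter, 612 x 792 points) is an assumption, since the thread does not state it; the 4 bytes per pixel follows from the 8 bits per sample and 4 samples per pixel passed to the bitmap initializer in the handler.

```applescript
-- Assumed inputs (the page size is not given in the thread): US Letter, in points
set pageWidth to 612
set pageHeight to 792
set theDpi to 600
set pageCount to 168

-- Same conversion the handler uses: 72 points per inch
set pixelWidth to (pageWidth * theDpi / 72) div 1
set pixelHeight to (pageHeight * theDpi / 72) div 1

-- 4 samples x 8 bits = 4 bytes per pixel; reals avoid integer overflow
set bytesPerPage to 4.0 * pixelWidth * pixelHeight
set totalGB to bytesPerPage * pageCount / 1.0E+9

-- {pixels wide, pixels high, MB per page bitmap, GB for all page bitmaps}
return {pixelWidth, pixelHeight, bytesPerPage / 1.0E+6, totalGB}
```

Under these assumptions each page bitmap is about 135 MB and 168 pages come to roughly 22.6 GB before any TIFF copies or image objects are counted, so memory use in the tens of gigabytes is plausible even with prompt clean-up.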



#16 2021-03-18 05:37:53 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6615

Re: Rasterize pdf

Further to that: the poor memory management is one of the reasons I withdrew my book on how to write ASObjC-based apps in Xcode. It's just too easy to write apps that then crash intermittently because of memory problems (sometimes because a clean-up appears to have happened).

But it's generally fine in applets, which mostly just run and quit.



#17 2021-03-18 05:43:36 am

KniazidisR
Member
From: Greece
Registered: 2019-03-03
Posts: 1752

Re: Rasterize pdf

Yes, this is a very unpleasant problem. Thanks for the clarification. I was also encoding a movie (at about 97% CPU) while testing your script.

Last edited by KniazidisR (2021-03-18 05:58:42 am)



#18 2021-03-26 05:01:34 am

Fredrik71
Member
Registered: 2019-10-23
Posts: 671

Re: Rasterize pdf

Shane Stanley wrote:

...withdrew my book on how to write ASObjC-based apps in Xcode.


I wish for another title: ASObjC-based apps in PyObjC :), or maybe it would suffer the same fate.


if you are the expert, who will you call if its not your imagination.
