Acrobat Metadata?

Mark67 · July 20, 2007, 8:50am

Similar in some respects to Calvin's method, this can be used with Quark PDFs if they were created via print to PS and distilled (this requires the Quark PDF Boxer XT to embed the media box, bleed box, etc.). Instead of writing a report you could move and rename the files. The script itself requires the Satimage OSAX. I just import the text file back into a Quark doc.

set Q1 to "Do you want to include all the subfolders" & return & "within your folder selection?"
set theDialog to display dialog Q1 buttons {"No", "Yes", "Cancel"} default button 1 with icon note
if button returned of theDialog is "Yes" then
	set inputFolder to choose folder with prompt "Where is the top level folder of PDFs?" without invisibles
	tell application "Finder"
		-- entire contents walks every subfolder, so this can be slow on deep trees
		set filesList to (files of entire contents of inputFolder whose name extension is "pdf" or file type is "PDF ")
	end tell
else
	set inputFolder to choose folder with prompt "Where is the folder of PDFs?" without invisibles
	tell application "Finder"
		set filesList to (files of inputFolder whose name extension is "pdf" or file type is "PDF ")
	end tell
end if
set countA to count of filesList
if countA = 0 then
	tell application "Finder"
		display dialog "Your selection contained no PDF files to check!" buttons {"Cancel"} default button 1 giving up after 3
	end tell
	-- If the dialog times out instead of being cancelled, stop here; there is nothing to process
	return
end if
--
repeat with aFile in filesList
	tell application "Finder"
		set theFile to aFile as alias
		set docName to name of theFile
	end tell
	try
		-- Satimage "find text" searches the file's raw contents for the header version, e.g. "PDF-1.4"
		set theLevel to find text "PDF-1[.][3-7]" in theFile with regexp and string result
	on error
		set theLevel to "Not Found"
	end try
	set DN to true
	try
		-- A DeviceN colour space means the file carries spot or multi-ink separations
		set SepsType to find text "DeviceN" in theFile with regexp and string result
	on error
		set DN to false
	end try
	if DN is false then
		try
			find text "Separation/All/DeviceCMYK" in theFile with regexp and string result
			set SepsType to "DeviceCMYK"
		on error
			-- Neither marker found, so the file is assumed to be grayscale
			set SepsType to "Grayscale"
		end try
	end if
	set MediaBox to my GREPSearch(theFile, "MediaBox")
	set BleedBox to my GREPSearch(theFile, "BleedBox")
	set TrimBox to my GREPSearch(theFile, "TrimBox")
	set CropBox to my GREPSearch(theFile, "CropBox")
	--
	my writelog(docName, theLevel, SepsType, MediaBox, BleedBox, TrimBox, CropBox)
end repeat
--
set ReadFile to ((path to desktop folder) as text) & "PDF Info Log.txt" as alias
tell application "TextEdit"
	activate
	open ReadFile
end tell
--
on GREPSearch(theFile, BoxType)
	-- Matches e.g. "MediaBox[0 0 595.276 841.89]"; negative values or a space before "[" will not match
	set SearchString to BoxType & "\\[[. [:digit:]]{1,}\\]"
	try
		set FoundString to find text SearchString in theFile with regexp and string result
		return my BoxMeasure(BoxType, FoundString)
	on error
		return ("No " & BoxType) as text
	end try
end GREPSearch
--
on BoxMeasure(BoxType, FoundString)
	-- Strip the box name and the surrounding brackets, leaving "L B R T"
	set x to text ((BoxType's length) + 2) thru -2 of FoundString
	set {L, B, R, T} to my GetTextItem(x, " ", 0)
	set {L, B, R, T} to {L as number, B as number, R as number, T as number}
	-- Box values are in points (72 per inch); convert to millimetres
	set H to (((((T - B) / 72) as inches) as centimeters) as number) * 10
	set W to (((((R - L) / 72) as inches) as centimeters) as number) * 10
	set thisHeight to my RoundDecimal(H, 3, to nearest)
	set thisWidth to my RoundDecimal(W, 3, to nearest)
	-- Coerce each number to text first so the result does not depend on a list coercion
	return (thisWidth as text) & "mm " & (thisHeight as text) & "mm"
end BoxMeasure
--
on RoundDecimal(NumberToRound, DecimalPlace, RoundType)
	set RoundFactor to 10 ^ DecimalPlace
	NumberToRound * RoundFactor
	round result rounding RoundType
	result / RoundFactor
end RoundDecimal
--
on GetTextItem(ThisString, ThisDelim, ThisItem)
	copy the text item delimiters to OldDelims
	set the text item delimiters to ThisDelim
	if class of ThisItem is list then
		set fromItem to (item 1 of ThisItem) as integer
		set toitem to (item 2 of ThisItem) as integer
		set arrItem to (text items fromItem thru toitem of ThisString)
	else
		set arrItem to every text item of ThisString
	end if
	set the text item delimiters to OldDelims
	if class of ThisItem is list then
		return arrItem as text
	else
		if ThisItem is not 0 then
			return (item ThisItem of arrItem) as text
		else
			return arrItem
		end if
	end if
end GetTextItem
--
on writelog(docName, theLevel, SepsType, MediaBox, BleedBox, TrimBox, CropBox)
	set theLog to ((path to desktop folder) as text) & "PDF Info Log.txt"
	try
		-- Append one tab-delimited line per PDF; the log is created if it does not exist yet
		open for access file theLog with write permission
		write (docName & tab & theLevel & tab & SepsType & tab & MediaBox & tab & BleedBox & tab & TrimBox & tab & CropBox & return) to file theLog starting at eof
		close access file theLog
	on error
		-- Make sure the log is not left open if the write fails
		try
			close access file theLog
		end try
	end try
end writelog
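
For anyone adapting BoxMeasure: the box values in a PDF are in points, 72 to the inch, so the same conversion can be done with plain arithmetic instead of the unit coercions. This is only a sketch, the handler name pointsToMillimetres is made up for illustration and is not part of the script above, and the A4 figures are just sample values.

```applescript
-- Hypothetical helper (not part of the original script).
-- PDF box values are in points: 72 points per inch, 25.4 mm per inch.
on pointsToMillimetres(pts)
	set mm to pts / 72 * 25.4
	-- Round to 3 decimal places, as RoundDecimal(x, 3, to nearest) does
	return (round (mm * 1000) rounding to nearest) / 1000
end pointsToMillimetres

-- Sample values only: an A4 page written as MediaBox[0 0 595.276 841.89]
set boxWidth to pointsToMillimetres(595.276 - 0)
set boxHeight to pointsToMillimetres(841.89 - 0)
return (boxWidth as text) & "mm " & (boxHeight as text) & "mm"
```

With those sample values the result should come out at roughly 210 by 297 mm.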

Mark67 · July 20, 2007, 9:04am

John, Acrobat’s Preflight is nowhere near as fast as reading the file head data, but if you are familiar with using and/or creating your own Preflight conditions then a script can process this for you. What you are looking for in this exercise is pretty basic info that is contained in the file head; Preflight, however, will allow you to dig much deeper into the contents of the PDF file. I’m in the middle of running some tests with mine at the moment and the initial signs look promising (fingers crossed, I really can’t be sat about checking hundreds of these). The PDFs are opened invisibly and moved to new locations; this takes around 10-30 seconds per file with press-ready PDFs (my testing is with PDFs of approximately 8-12 MB). I’ll let you know how I get on with this.
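
To give an idea of why reading the file head is so quick, here is a minimal sketch that pulls the version out of the first 8 bytes using only the Standard Additions read command, with no OSAX and no Acrobat. The handler name pdfVersionOf is made up for illustration. It assumes the file starts with the usual "%PDF-x.y" header, and it will not see a version that was changed later inside the file.

```applescript
-- Hypothetical helper (not part of the script posted earlier).
-- Returns e.g. "1.4", or "Not Found" if the file has no PDF header.
on pdfVersionOf(theFile)
	try
		-- A PDF normally begins with "%PDF-x.y", so 8 bytes are enough
		set headerText to read theFile for 8
	on error
		-- Unreadable file, or shorter than 8 bytes
		return "Not Found"
	end try
	if headerText starts with "%PDF-" then
		return text 6 thru 8 of headerText
	else
		return "Not Found"
	end if
end pdfVersionOf

set thePDF to choose file with prompt "Pick a PDF to inspect:"
display dialog "PDF version: " & pdfVersionOf(thePDF)
```

Because only a few bytes are read per file, the cost barely changes with file size, which is the opposite of opening each PDF in Acrobat.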

JohnR · July 20, 2007, 3:48pm

Mark,

Thanks for that. I didn’t know Preflight can run that fast. However, at the low end that is 240 files x 10 seconds = 40 minutes, and at the high end 240 files x 30 seconds = 120 minutes, to do these files. It’s still more time-consuming than I’d like.

I’m grateful for the input, but perhaps scanning head data will work better in a time crunch?

Your thoughts?

John

P.S. Calvin - I’m gonna try your script, and see how that works first!