To get the pdf files in the folders and subfolders

Hi All,

Please share the code to get the pdf files in the folder and multiple subfolders. Your help is much appreciated.

Hi Fredrick,

Is it possible to use plain AppleScript directly, without the ObjC library code? Actually, I'm a newbie to ObjC.

An effective plain AppleScript solution uses System Events and recursion for this task.

Getting FILE references:


property pdfsList : {}

set pdfsList to {} -- reset first: property values can persist between runs
getPDFs(choose folder)
return pdfsList

-- recursive handler
on getPDFs(aFolder)
	tell application "System Events"
		set pdfsList to pdfsList & (files of aFolder whose name extension is "pdf")
		repeat with subFolder in (get folders of aFolder)
			my getPDFs(subFolder)
		end repeat
	end tell
end getPDFs

To get HFS paths:


-- Replace the appropriate script code line with this:
set pdfsList to pdfsList & (path of files of aFolder whose name extension is "pdf")

To get POSIX paths:


-- Replace the appropriate script code line with this:
set pdfsList to pdfsList & (POSIX path of files of aFolder whose name extension is "pdf")

TIP: In the whose clause you can use the kind property or the type identifier property instead of the name extension property. For me, the type identifier property is the best choice. For example:


set pdfsList to pdfsList & (files of aFolder whose kind is "PDF document")
set pdfsList to pdfsList & (files of aFolder whose type identifier is "com.adobe.pdf")

This is probably the quickest way…

property mdFindContent : quoted form of "kMDItemKind == '*PDF document*'"

activate
set searchThisFolder to quoted form of POSIX path of (choose folder)

set pdfFiles to paragraphs of ¬
	(do shell script "mdfind -onlyin " & searchThisFolder & space & mdFindContent)

-- if you don't want the results as a list
--set pdfFiles to do shell script "mdfind -onlyin " & searchThisFolder & space & mdFindContent

Will using a property work when the script is run on another Mac? Actually, I faced some issues in other scripts while using properties.

The “property” value in this case is only text. It’s not pointing to a location on the hard drive or anything. You should have no problems with this.
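If properties have caused trouble elsewhere, the same search can be sketched with the query held in an ordinary variable. This is only a rearrangement of the script above, not a different technique, and it assumes nothing beyond what that script already uses:

```applescript
-- The Spotlight query is plain text, so a normal variable works as well as a property
set mdFindContent to quoted form of "kMDItemKind == '*PDF document*'"

activate
set searchThisFolder to quoted form of POSIX path of (choose folder)

-- One POSIX path per line of mdfind output, split into a list
set pdfFiles to paragraphs of ¬
	(do shell script "mdfind -onlyin " & searchThisFolder & space & mdFindContent)
```

Because nothing is stored in the script between runs, there is no saved state to carry over to another Mac.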

It was not for nothing that I presented several options using the slower System Events. Reliability (when Spotlight indexing is disabled, or when the filename contains special characters) matters a lot. And compared to the reliable ASObjC variant (which does not use Spotlight, and is not presented here), System Events has more flexibility in quickly returning results as POSIX paths and HFS paths.

Many users have Spotlight disabled on their hard drive (as I do), and almost everyone has it disabled on external drives. Therefore, I never use or archive scripts that rely on Spotlight. A very fast and reliable script is the ASObjC variant, which does not use Spotlight (it returns URLs and is not presented here). The most flexible and reliable is System Events. Scripts using Spotlight are the fastest, but also not reliable.

One should always warn about the unreliability of data from Spotlight, especially for newbies.
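One way to hedge against that unreliability is to try Spotlight first and fall back to System Events when it returns nothing. The following is only a sketch combining the two scripts already posted in this thread. The names pdfPaths and collectPDFs are made up for the illustration, and an empty Spotlight result is treated as "possibly not indexed", which is an assumption rather than a real indexing check:

```applescript
property pdfPaths : {}

set theFolder to choose folder
set pdfPaths to {} -- reset, because property values can persist between runs

-- Step 1: ask Spotlight (fast, but depends on the index being present)
set mdQuery to quoted form of "kMDItemKind == '*PDF document*'"
set searchThisFolder to quoted form of POSIX path of theFolder
try
	set pdfPaths to paragraphs of ¬
		(do shell script "mdfind -onlyin " & searchThisFolder & space & mdQuery)
on error
	set pdfPaths to {}
end try

-- Step 2: an empty list may mean "no PDFs" or "no index", so verify with System Events
if pdfPaths is {} then collectPDFs(theFolder)

return pdfPaths

-- recursive handler, same approach as getPDFs above, returning POSIX paths
on collectPDFs(aFolder)
	tell application "System Events"
		set pdfPaths to pdfPaths & (POSIX path of files of aFolder whose name extension is "pdf")
		repeat with subFolder in (get folders of aFolder)
			my collectPDFs(subFolder)
		end repeat
	end tell
end collectPDFs
```

This does not catch a partially built index, where Spotlight returns some but not all files. When completeness matters, the System Events handler alone is the safer choice.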

Vijay_Yukthi requested a solution that used basic AppleScript and KniazidisR has provided that solution.

As regards speed, I tested KniazidisR’s and wch1zpink’s scripts, both of which worked as expected. I also tested the following ASObjC script, which was probably originally written by Shane:

use framework "Foundation"
use scripting additions

set theFolder to POSIX path of (choose folder)

set theFiles to getFiles(theFolder, {"pdf"})

on getFiles(theFolder, theFileExtensions)
	set fileManager to current application's NSFileManager's defaultManager()
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set searchOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:searchOptions errorHandler:(missing value))'s allObjects()
	set searchPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theFileExtensions)
	return ((theFiles's filteredArrayUsingPredicate:searchPredicate) as list)
end getFiles

The test folder contained 385 PDF files spread over 102 subfolders, and the test results were:

KniazidisR’s script - 803 milliseconds
wch1zpink’s script - 62 milliseconds
ASObjC script - 19 milliseconds
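The ASObjC script returns file references. If POSIX paths are wanted instead, a variant of the handler might look like the sketch below. The handler name getFilePaths is hypothetical and this variant was not part of the timing test. It relies on the standard Foundation behaviour that valueForKey: applied to an array collects that key from every element:

```applescript
use framework "Foundation"
use scripting additions

set theFolder to POSIX path of (choose folder)

set thePaths to getFilePaths(theFolder, {"pdf"})

-- hypothetical variant of getFiles that returns POSIX path strings
on getFilePaths(theFolder, theFileExtensions)
	set fileManager to current application's NSFileManager's defaultManager()
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set searchOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:searchOptions errorHandler:(missing value))'s allObjects()
	set searchPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theFileExtensions)
	set pdfURLs to theFiles's filteredArrayUsingPredicate:searchPredicate
	-- ask every NSURL in the array for its "path" and convert the result to a list of text
	return (pdfURLs's valueForKey:"path") as list
end getFilePaths
```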

Just for the sake of completeness, we shouldn’t forget the Finder, which completed my timing test in 93 seconds. This is not usable, of course, but with a limited number of files and folders it might do the job.

set theFolder to (choose folder)

tell application "Finder"
	set theFiles to (every file of the entire contents of theFolder whose name extension is "pdf") as alias list
end tell

Thanks peavine and Fredrik for the clear explanation.