Sunday, October 17, 2021

#1 2021-09-16 10:03:48 am

Vijay_Yukthi
Member
Registered: 2021-04-16
Posts: 34

To get the pdf files in the folders and subfolders

Hi All,

Please share the code to get the pdf files in the folder and multiple subfolders. Your help is much appreciated.

Offline

 

#2 2021-09-16 10:09:22 am

Fredrik71
Member
Registered: 2019-10-23
Posts: 898

Re: To get the pdf files in the folders and subfolders

Search this forum for code that uses NSPredicate.


if you are the expert, who will you call if its not your imagination.

Offline

 

#3 2021-09-16 10:26:40 am

Vijay_Yukthi
Member
Registered: 2021-04-16
Posts: 34

Re: To get the pdf files in the folders and subfolders

Hi Fredrik,

Is it possible to do this directly in plain AppleScript, without ObjC library code? I'm a newbie to ObjC.

Offline

 

#4 2021-09-16 11:17:15 am

KniazidisR
Member
From:: Greece
Registered: 2019-03-03
Posts: 2079

Re: To get the pdf files in the folders and subfolders

An effective plain AppleScript solution for this task uses System Events and recursion.

Getting FILE references:

Applescript:


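property pdfsList : {}

-- Reset the list first, because a property may keep its value between runs
set pdfsList to {}
getPDFs(choose folder)
return pdfsList

-- Recursive handler: collects the PDFs of aFolder, then descends into each subfolder
on getPDFs(aFolder)
   tell application "System Events"
       set pdfsList to pdfsList & (files of aFolder whose name extension is "pdf")
       repeat with subFolder in (get folders of aFolder)
           my getPDFs(subFolder)
       end repeat
   end tell
end getPDFs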

To get HFS paths:

Applescript:


-- Replace the "set pdfsList to ..." line in the handler with this:
set pdfsList to pdfsList & (path of files of aFolder whose name extension is "pdf")

To get POSIX paths:

Applescript:


-- Replace the "set pdfsList to ..." line in the handler with this:
set pdfsList to pdfsList & (POSIX path of files of aFolder whose name extension is "pdf")

TIP: in the whose clause you can use the kind property or the type identifier property instead of the name extension property. For me, the type identifier property is the best choice. For example:

Applescript:


-- Use one of these lines, not both:
set pdfsList to pdfsList & (files of aFolder whose kind is "PDF document")
set pdfsList to pdfsList & (files of aFolder whose type identifier is "com.adobe.pdf")
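Putting the variants above together, here is a minimal sketch that filters by type identifier and returns POSIX paths. It is only one possible arrangement: the handler name getPDFPaths and the variable foundPaths are made up for illustration, and the handler returns its result instead of appending to a property, so nothing carries over from one run to the next.

```applescript
-- Sketch only: getPDFPaths and foundPaths are illustrative names, not from the posts above
set pdfPaths to getPDFPaths(choose folder)
return pdfPaths

-- Recursive handler: returns the POSIX paths of the PDFs in aFolder and all of its subfolders
on getPDFPaths(aFolder)
	tell application "System Events"
		-- PDFs directly inside this folder
		set foundPaths to POSIX path of files of aFolder whose type identifier is "com.adobe.pdf"
		-- Add the results from every subfolder
		repeat with subFolder in (get folders of aFolder)
			set foundPaths to foundPaths & (my getPDFPaths(subFolder))
		end repeat
	end tell
	return foundPaths
end getPDFPaths
```

Because each call builds and returns its own list, running the script twice gives the same result both times.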

Last edited by KniazidisR (2021-09-16 11:42:19 am)


Model: MacBook Pro
OS X: Catalina 10.15.4
Web Browser: Safari 14.1
Ram: 4 GB

Offline

 

#5 2021-09-16 11:35:42 am

Fredrik71
Member
Registered: 2019-10-23
Posts: 898

Re: To get the pdf files in the folders and subfolders

Vijay_Yukthi wrote:

Is it possible to directly use the applescript


The reason to avoid it is that it is very slow compared to ASObjC.

You could also use Shane Stanley's Metadata_Lib, which uses Spotlight search.
https://latenightsw.com/freeware/



Offline

 

#6 2021-09-16 12:01:45 pm

wch1zpink
Member
Registered: 2011-08-20
Posts: 61

Re: To get the pdf files in the folders and subfolders

This is probably the quickest way…

Applescript:

property mdFindContent : quoted form of "kMDItemKind == '*PDF document*'"

activate
set searchThisFolder to quoted form of POSIX path of (choose folder)

set pdfFiles to paragraphs of ¬
   (do shell script "mdfind -onlyin " & searchThisFolder & space & mdFindContent)

-- If you don't want the results as a list, use this line instead:
--set pdfFiles to do shell script "mdfind -onlyin " & searchThisFolder & space & mdFindContent
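A possible variation on the script above, as a sketch: the kind text ("PDF document") may depend on the system language, so the query below matches the content type instead, using the same "com.adobe.pdf" identifier mentioned in post #4. The variable names are illustrative, and it still assumes that Spotlight indexing is enabled for the chosen folder.

```applescript
-- Sketch only: assumes Spotlight indexing is enabled for the volume holding the chosen folder
set searchFolder to quoted form of POSIX path of (choose folder)

-- Match on the content type (UTI) rather than on the kind text
set mdQuery to quoted form of "kMDItemContentType == 'com.adobe.pdf'"

-- mdfind prints one POSIX path per line; an empty result gives an empty list
set pdfFiles to paragraphs of ¬
	(do shell script "mdfind -onlyin " & searchFolder & space & mdQuery)
```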

Offline

 

#7 2021-09-16 12:25:30 pm

Vijay_Yukthi
Member
Registered: 2021-04-16
Posts: 34

Re: To get the pdf files in the folders and subfolders

Will a script that uses a property work when it is run on another Mac? I have faced some issues in other scripts when using properties.

Offline

 

#8 2021-09-16 12:41:26 pm

wch1zpink
Member
Registered: 2011-08-20
Posts: 61

Re: To get the pdf files in the folders and subfolders

Vijay_Yukthi wrote:

Will a script that uses a property work when it is run on another Mac? I have faced some issues in other scripts when using properties.



The “property” value in this case is only text. It's not pointing to a location on the hard drive or anything like that. You should have no problems with this.

Offline

 

#9 2021-09-16 12:43:33 pm

KniazidisR
Member
From:: Greece
Registered: 2019-03-03
Posts: 2079

Re: To get the pdf files in the folders and subfolders

I offered several options using the slower System Events for a reason. Reliability matters a lot, for example when Spotlight indexing is disabled or when a filename contains special characters. And compared to the reliable ASObjC variant (which does not use Spotlight, and is not presented here), System Events is more flexible: it can quickly return results as POSIX paths or HFS paths.

Many users have Spotlight disabled on their hard drive (as I do), and almost everyone has it disabled on external drives. Therefore, I never use or archive scripts that rely on Spotlight. The ASObjC variant, which does not use Spotlight (it returns URLs and is not presented here), is very fast and reliable. System Events is the most flexible and reliable. Scripts that use Spotlight are the fastest, but they are not reliable.

Newbies especially should always be warned that data from Spotlight can be unreliable.

Last edited by KniazidisR (2021-09-16 12:58:40 pm)



Offline

 

#10 2021-09-16 03:35:31 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 1081

Re: To get the pdf files in the folders and subfolders

Vijay_Yukthi requested a solution that used basic AppleScript and KniazidisR has provided that solution.

As regards speed, I tested KniazidisR's and wch1zpink's scripts, both of which worked as expected. I also tested the following ASObjC script, which was probably originally written by Shane:

Applescript:

use framework "Foundation"
use scripting additions

set theFolder to POSIX path of (choose folder)

set theFiles to getFiles(theFolder, {"pdf"})

on getFiles(theFolder, theFileExtensions)
   set fileManager to current application's NSFileManager's defaultManager()
   set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
   set searchOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
   set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:searchOptions errorHandler:(missing value))'s allObjects()
   set searchPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theFileExtensions)
   return ((theFiles's filteredArrayUsingPredicate:searchPredicate) as list)
end getFiles

The test folder contained 385 PDF files spread over 102 subfolders, and the test results were:

KniazidisR's script - 803 milliseconds
wch1zpink's script - 62 milliseconds
ASObjC script - 19 milliseconds
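For anyone who wants POSIX path strings from the ASObjC approach, here is a sketch of one way to adapt the script above. The handler and variable names (getFilePaths, folderURL, allURLs, pdfURLs) are illustrative, and this variant was not part of the timing test.

```applescript
use framework "Foundation"
use scripting additions

set theFolder to POSIX path of (choose folder)
set thePaths to getFilePaths(theFolder, {"pdf"})

-- Sketch only: same enumeration and predicate as the script above, but returns POSIX path strings
on getFilePaths(theFolder, theFileExtensions)
	set fileManager to current application's NSFileManager's defaultManager()
	set folderURL to current application's |NSURL|'s fileURLWithPath:theFolder
	-- Skip package contents and hidden files while enumerating
	set searchOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	set allURLs to (fileManager's enumeratorAtURL:folderURL includingPropertiesForKeys:{} options:searchOptions errorHandler:(missing value))'s allObjects()
	-- Keep only the URLs whose lowercased extension is in the list
	set searchPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theFileExtensions)
	set pdfURLs to allURLs's filteredArrayUsingPredicate:searchPredicate
	-- valueForKey: on an NSArray collects the "path" value of every NSURL in it
	return (pdfURLs's valueForKey:"path") as list
end getFilePaths
```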

Last edited by peavine (2021-09-18 07:40:28 am)


2018 Mac mini - macOS Catalina - Script Debugger 8

Offline

 

#11 2021-09-16 04:32:12 pm

Fredrik71
Member
Registered: 2019-10-23
Posts: 898

Re: To get the pdf files in the folders and subfolders

KniazidisR wrote:

Many users have Spotlight disabled


Spotlight is not disabled by default on the main drive (Macintosh HD).
If you want Spotlight indexing disabled on an external drive, you need to add that drive to the
Privacy tab. In other words, it is not disabled by default.

You could remove all other options from the search results and include only PDF Document.

If you ask me, the greatest feature of the Finder is the smart folder. It uses the same approach
as building an NSPredicateEditor in an NSView to filter content.

If the background process that builds the Spotlight database makes a computer slow, there is probably
another reason. Maybe it is time to upgrade to a faster computer with more memory.

PS. It takes 10 seconds for the Spotlight index to update with new content. (I have done the research.)

What the Spotlight index does not do is index every file on the drive.

But if you point the Finder at a system directory, it will search that directory, including its subfolders.
In other words, searching "This Mac" does not search the entire computer, whatever Apple says.

Last edited by Fredrik71 (2021-09-16 04:44:44 pm)



Offline

 

#12 2021-09-16 05:23:13 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 1081

Re: To get the pdf files in the folders and subfolders

Just for the sake of completeness, we shouldn't forget the Finder, which completed my timing test in 93 seconds. This is not usable, of course, but with a limited number of files and folders it might do the job.

Applescript:

set theFolder to (choose folder)

tell application "Finder"
   set theFiles to (every file of the entire contents of theFolder whose name extension is "pdf") as alias list
end tell



Offline

 

#13 2021-09-17 01:41:44 am

Vijay_Yukthi
Member
Registered: 2021-04-16
Posts: 34

Re: To get the pdf files in the folders and subfolders

Thanks, peavine and Fredrik, for the clear explanation.

Offline

 

#14 2021-09-17 05:09:24 am

Fredrik71
Member
Registered: 2019-10-23
Posts: 898

Re: To get the pdf files in the folders and subfolders

peavine wrote:


The test results were:

KniazidisR's script - 803 milliseconds
wch1zpink's script - 62 milliseconds
ASObjC script - 19 milliseconds


I would only like to point out...

I have seen that it takes approximately 20 milliseconds just to execute AppleScript's do shell script command.
In other words, ASObjC has finished before AppleScript has even called do shell script.
(Even if the bash command adds no extra time, do shell script will always lose when the ASObjC
time is less than 20 milliseconds.)

Most of the time bash commands are very fast...
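To check the do shell script overhead on your own machine, a rough sketch like the following could be used. It times a shell command that does essentially nothing, so whatever it reports is mostly the cost of calling do shell script itself. The run count of 20 is an arbitrary assumption, and the result will vary by machine and macOS version.

```applescript
use framework "Foundation"
use scripting additions

-- Sketch only: runCount is an arbitrary sample size chosen for illustration
set runCount to 20

set startDate to current application's NSDate's |date|()
repeat runCount times
	do shell script "true" -- a shell command that does essentially nothing
end repeat

-- timeIntervalSinceNow is negative for a date in the past, so flip the sign
set totalSeconds to -((startDate's timeIntervalSinceNow()) as real)
set averageMilliseconds to (totalSeconds / runCount) * 1000
return averageMilliseconds
```

Averaging over several calls smooths out the variation between individual runs.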

Last edited by Fredrik71 (2021-09-17 05:10:24 am)



Offline

 
