Convert recursive to iterative script

The following is a script that lists files in a source folder and its subfolders:


set theFolder to choose folder

set allFiles to {}

getAllFiles(theFolder, allFiles)

allFiles

on getAllFiles(theFolder, allFiles)
	tell application "System Events"
		set fileList to (every file of theFolder)
		repeat with i from 1 to (count fileList)
			set end of allFiles to item i of fileList as alias
		end repeat
		set subFolders to folders of theFolder
		repeat with subFolderRef in subFolders
			my getAllFiles(subFolderRef, allFiles)
		end repeat
	end tell
end getAllFiles

It is a minor rewrite of a script contained in:

https://www.macscripter.net/viewtopic.php?id=46815

I read the following post, the title of which is “Recursion against iteration”

https://www.macscripter.net/viewtopic.php?pid=96272

Is it possible to convert the above script so that it works using iteration? I spent significant time on this but didn’t get far.

BTW, I’m doing this just to learn, not for a script I will actually use.

Thanks.

(*
List a Given Folder  based upon a script posted in
applescript-users@lists.apple.com

2018-12-01

*)
----------------------------------------------------------------
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
----------------------------------------------------------------

property listFolderPaths : true

on listFolder:targetDir
	
	set POSIXPath to POSIX path of targetDir
	
	set theFolderURL to current application's NSURL's fileURLWithPath:POSIXPath
	
	set NSDirectoryEnumerationSkipsHiddenFiles to a reference to 4
	set NSDirectoryEnumerationSkipsPackageDescendants to a reference to 2
	set myOptions to NSDirectoryEnumerationSkipsHiddenFiles + NSDirectoryEnumerationSkipsPackageDescendants
	
	set NSFileManager to a reference to current application's NSFileManager's defaultManager()
	
	set typeIdentifierKey to current application's NSURLTypeIdentifierKey
	set keysToRequest to current application's NSArray's arrayWithObject:(typeIdentifierKey)
	set theURLs to (NSFileManager's enumeratorAtURL:theFolderURL includingPropertiesForKeys:keysToRequest options:myOptions errorHandler:(missing value))'s allObjects()
	
	if theURLs is equal to missing value then error "The folder's contents could not be enumerated."
	set thePaths to {}
	repeat with aURL in theURLs
		set itsPath to (aURL's valueForKey:"path") as text
		set {theResult, theValue, theError} to (aURL's getResourceValue:(reference) forKey:(current application's NSURLIsDirectoryKey) |error|:(reference))
		if theResult as boolean is false then error (theError's |localizedDescription|() as text) number (theError's code() as integer)
		
		if theValue as boolean then
			if listFolderPaths then set end of thePaths to itsPath
		else
			set end of thePaths to itsPath
		end if
		
	end repeat
	return thePaths
end listFolder:

set targetDir to choose folder


my listFolder:targetDir



Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Monday 29 April 2019 15:59:38

How about this?


set theFolder to (choose folder)

set allFiles to {}

getAllFiles(theFolder, allFiles)

allFiles

on getAllFiles(theFolder, allFiles)
	script o
		property allFiles : {}
		property subfolders : {}
	end script
	
	set o's allFiles to allFiles
	set f to 0
	repeat
		tell application "System Events"
			set fileList to (path of every file of theFolder)
			repeat with i from 1 to (count fileList)
				set end of o's allFiles to (item i of fileList) as alias
			end repeat
			set o's subfolders to o's subfolders & folders of theFolder
			if (f = (length of o's subfolders)) then exit repeat
			set f to f + 1
			set theFolder to item f of o's subfolders
		end tell
	end repeat
end getAllFiles

The script object is a by-now well-known way to speed up access to items in long lists. It allows the list variables to be addressed as object specifiers (variable of object) instead of just as variables. For reasons known only to AppleScript engineers, this speeds up the getting and setting of list items if the lists are very long. It’s similar to what happens when you use a variable set to a reference to a list variable: https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_classes.html#//apple_ref/doc/uid/TP40000983-CH1g-DontLinkElementID_587. A reference value is simply an object specifier in a variable. In my script above, the object specifiers are compiled into the script code, which makes things slightly faster still.

The script’s allFiles and/or subfolders lists can get very long if it’s dealing with a large folder hierarchy, hence the decision to use object specifiers. These knock a couple of seconds off the time the script takes to handle the large folder I used to test it, but make no difference with small folders.
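As a minimal sketch of the script-object technique on its own, away from System Events: the handler name sumList and the property name theList are hypothetical, chosen only for illustration. The list is assigned to the script object's property inside the handler, as in the script above, and every item is then read through "o's theList" rather than through the plain variable.

```applescript
-- Hypothetical example: sum a long list, reading each item through a script object.
on sumList(bigList)
	script o
		property theList : {}
	end script
	
	-- Point the script object's property at the list passed in.
	set o's theList to bigList
	
	set total to 0
	repeat with i from 1 to (count bigList)
		-- "item i of o's theList" is the fast, object-specifier form of access.
		set total to total + (item i of o's theList)
	end repeat
	return total
end sumList

-- Build a test list of 5000 integers and sum it.
set testList to {}
repeat with i from 1 to 5000
	set end of testList to i
end repeat
sumList(testList)
```

With short lists, replacing "item i of o's theList" with "item i of bigList" should make no measurable difference; any gain only shows up as the list grows long.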

It turned out that the repeating process here was inherently faster than the recursive one anyway, and that getting and coercing the paths of the files was considerably faster than getting and coercing the file objects themselves. (Using paths also allowed the script to handle file packages, which System Events can’t coerce to alias, for some reason.)

The script Yvan posted is the fastest of the three, but it returns POSIX paths instead of aliases. It can be easily modified to return AppleScript file specifiers instead, but converting these to aliases slows it down. However, file specifiers would be good enough in most cases.
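The general shape of the recursive-to-iterative conversion can be sketched without any file access at all. In this hypothetical example, nested lists stand in for folders and subfolders, and the handler name flattenList is an assumption made for illustration. Instead of calling itself for each sublist, the handler appends the sublist to a list of pending work and keeps looping until it has visited every entry, which is the same idea as the subfolders list in the iterative script above.

```applescript
-- Hypothetical example: nested lists play the role of folders containing subfolders.
on flattenList(nestedList)
	set pending to {nestedList} -- lists still to be visited (the "subfolders")
	set leaves to {} -- non-list items found so far (the "files")
	set n to 0
	repeat until n = (count pending)
		set n to n + 1
		set currentList to item n of pending
		repeat with i from 1 to (count currentList)
			set thisItem to item i of currentList
			if class of thisItem is list then
				-- Defer the sublist instead of recursing into it.
				set end of pending to thisItem
			else
				set end of leaves to thisItem
			end if
		end repeat
	end repeat
	return leaves
end flattenList

flattenList({1, {2, 3, {4}}, 5})
```

Because pending sublists are visited in the order they were found, the result comes back level by level ({1, 5, 2, 3, 4} here) rather than in the depth-first order a recursive handler would produce.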

Thanks Yvan and Nigel for your responses.

Nigel. Yesterday, after first seeing your script, I ran some tests on an extremely large folder I created for the purpose and found your script to be over twice as fast as the script I posted. I didn’t understand why that was the case, and I appreciate your subsequent post explaining the reasons for the difference.

For completeness, here’s a version of Yvan’s ASObjC script which returns a list of file specifiers. The items are basically obtained using the NSFileManager equivalent of the Finder’s entire contents. The repeat simply filters out folder URLs if they’re not wanted. Unlike Yvan’s version, this script won’t work on a Yosemite system.

(*
List a Given Folder  based upon a script originally posted in
applescript-users@lists.apple.com

2018-12-01

*)
----------------------------------------------------------------
use AppleScript version "2.5" -- Mac OS 10.11 (El Capitan) or later.
use framework "Foundation"
use scripting additions
----------------------------------------------------------------

on listFolder from targetDir given anyFolders:includingFolders
	
	set POSIXPath to POSIX path of targetDir
	
	set |⌘| to current application
	set theFolderURL to |⌘|'s class "NSURL"'s fileURLWithPath:(POSIXPath)
	
	set isDirectoryKey to |⌘|'s NSURLIsDirectoryKey
	set isPackageKey to |⌘|'s NSURLIsPackageKey
	set keysToRequest to |⌘|'s class "NSArray"'s arrayWithArray:{isDirectoryKey, isPackageKey}
	
	-- Check that the input's a folder.
	set {resourceValues, theError} to (theFolderURL's resourceValuesForKeys:(keysToRequest) |error|:(reference))
	if (theError is not missing value) then error (theError's |localizedDescription|() as text) number (theError's code() as integer)
	if (not (resourceValues's valueForKey:(isDirectoryKey)) or (resourceValues's valueForKey:(isPackageKey))) then error "The passed item isn't a folder!"
	
	set NSDirectoryEnumerationSkipsHiddenFiles to |⌘|'s NSDirectoryEnumerationSkipsHiddenFiles
	set NSDirectoryEnumerationSkipsPackageDescendants to |⌘|'s NSDirectoryEnumerationSkipsPackageDescendants
	set myOptions to NSDirectoryEnumerationSkipsHiddenFiles + NSDirectoryEnumerationSkipsPackageDescendants
	
	-- Get all the normally visible items in the hierarchy.
	set NSFileManager to |⌘|'s class "NSFileManager"'s defaultManager()
	set theURLs to (NSFileManager's enumeratorAtURL:(theFolderURL) includingPropertiesForKeys:(keysToRequest) options:(myOptions) errorHandler:(missing value))'s allObjects()
	
	-- If folders aren't wanted, remove any such URLs from the array.
	if (not includingFolders) then
		set theURLs to theURLs's mutableCopy()
		repeat with i from (count theURLs) to 1 by -1
			set aURL to item i of theURLs
			set {resourceValues, theError} to (aURL's resourceValuesForKeys:(keysToRequest) |error|:(reference))
			if (theError is not missing value) then error (theError's |localizedDescription|() as text) number (theError's code() as integer)
			
			if ((resourceValues's valueForKey:(isDirectoryKey)) and not (resourceValues's valueForKey:(isPackageKey))) then tell theURLs to removeObjectAtIndex:(i - 1)
		end repeat
	end if
	
	-- Return the URL array as a list of AS file specifiers.
	return theURLs as list
end listFolder


set targetDir to (choose folder)
listFolder from targetDir without anyFolders

Hello Nigel.

Your script is fine but, at this time, I don’t see which feature prevents it from running under Yosemite.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Wednesday 1 May 2019 16:15:15

Hi Yvan.

I was going by Shane’s book. Apparently it’s possible to coerce NSURL fileURLs directly to alias or to «class furl» in Yosemite, but coercing an array of them to list only gives a list of «class furl» from El Capitan onwards. It would be easy to adapt the script to work in Yosemite, but the NSURLs would have to be coerced individually. I don’t know if it would also be necessary to restore the explicit as boolean coercions I cut. Maybe it would be safer to keep them anyway.

Thanks.

I forgot this feature.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Wednesday 1 May 2019 19:12:15

Out of curiosity, I ran timing tests on the scripts in this thread. To do this, I created a folder with numerous subfolders containing 3GB of data and 7,850 files. For timing, I used the following:

[the choose folder command]
set startTime to (time of (current date))
[the script]
set executeTime to (time of (current date)) - startTime

The execute times were:

Recursive script I posted: 16 seconds
Nigel’s iterative script: 6 seconds
Nigel’s version of Yvan’s script: 2 seconds
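One caveat about the timing wrapper: time of (current date) counts seconds since midnight, so a run that happened to span midnight would report a negative time. A minimal variation, assuming whole-second resolution is still good enough, is to subtract two date values directly. The variable names startDate and testFolder are only illustrative.

```applescript
set testFolder to (choose folder)
set startDate to (current date)
-- [the script being timed goes here, working on testFolder]
set executeTime to (current date) - startDate -- elapsed time in whole seconds
```

Subtracting one date from another gives the difference in seconds, so the result can be compared directly with the figures listed above.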

To set a baseline, I ran the following script on the test folder and the execute time was 35 seconds. Additional time would be required to convert the returned result to aliases.

tell application "Finder" to set fileList to entire contents of testFolder

While this assumption seems sound, in practice, aliases are returned noticeably faster with high file volumes, as Finder’s references are longish. What’s surprising to me is that using entire contents worked at all with 7,850 files; it’s often unreliable with more than several hundred.

Hi peavine.

Those are very similar to the timings I’m getting here. One of the things which takes so long with the Finder is that it has to put together a long list of its own specifiers. As with System Events, if you can get it to return the results in some other form instead, it’s often quicker (that is, quicker than it otherwise would be). This takes about 12 seconds with the same folder on my system:

tell application "Finder" to set fileList to (files of entire contents of testFolder) as alias list

The as alias list is a Finder speciality which works with the preceding specifier rather than coercing a returned list after the fact.

Marc and Nigel. Thanks for explaining the speed benefit of using aliases–that’s a useful thing to know.

I ran Nigel’s suggestion on my test folder (3GB and 7,850 files), and it took 9 seconds to complete. A big improvement.