Loading file list from a folder

A basic idea behind AppleScript is its “plug-in” architecture. There’s the core language, which is actually quite small but can do a lot. Then there’s the ability to add commands supplied by separate OSAXen and commands belonging to applications whose authors have included suitable scripting interfaces. Over the years, further abilities have been added: running shell scripts, simulating user actions in the GUI, and recently accessing some of the system’s Objective-C frameworks. On the one hand, it’s a bewildering array of things to learn. On the other, it offers a vast choice of solutions from which an expert can select whichever he or she feels is the most appropriate. On the third hand ( :slight_smile: ), it can be approached from a number of different directions to suit people coming from different programming backgrounds. Complete beginners (English speakers, at least) should find the core language fairly easy to grasp. People familiar with Unix or languages like Python or Ruby can go straight to ‘do shell script’ and achieve a lot of what they want to do straight away in a way that they already know. Hard-core Objective-C programmers should be able to adapt to ASObjC without too much trouble if they need to. Once a start’s been made, you can “add on” any additional knowledge you need much as the language adds on extensions.

So “pure AppleScript” is a term rather like “thoroughbred mongrel”. In so far as it means anything, I’d personally regard “pure AppleScript” as being the core language and (when I’m in a flexible mood!) the StandardAdditions OSAX.

I could probably live with your inflexible definition. But I struggle with the need to Balkanize in the first place, especially with loaded terms like “pure”. The core language is not very useful by itself – that’s why the hack that is scripting additions was added before it was even released – and in many ways it’s stuck in a time warp. It’s the “impure” bits that have helped keep it alive.

Sorry, that was a typo left behind from when I first put the code together and had extensionList coded as a property rather than a local variable. I corrected the entry.

Of course. My mistake. I had been thinking of using text item delimiters as in my example and noticed its presence in yours, but didn’t look closely enough to see the difference in usage.

It’s clear from the posts above that there are multiple ways of getting the HFS paths or AppleScript aliases of the files in a folder. But as I mentioned earlier, I am not aware of any shell solutions for getting that information, other than to “cheat” with the osascript command. So for the fun of it, and just so that it’s out there, I put together the following shell solution. It is not quite pure, since it requires an AppleScript run script command, but the heavy lifting is done by the shell.

To get the HFS paths or AppleScript aliases of all pdf files in a parent folder:


set hfsPaths to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

--or--

set applescriptAliases to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

And to get the HFS paths or AppleScript aliases of all pdf, txt, and jpg files in a parent folder:


set hfsPaths to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o  -iname '*.txt' -o  -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

--or--

set applescriptAliases to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o  -iname '*.txt' -o  -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

Notes: 1) The -name primaries have been changed to -iname so that file extension searching will be case-insensitive. 2) Since this post was first submitted, the curly braces have been transferred from the do shell script command to the run script command so that an empty list will be returned in the case of no matching files. 3) This approach will fail if an HFS path of an item in the parent folder has a double-quote character in its name.

Here are the same solutions but with the curly braces incorporated into the do shell script command, and with additional examples in which all files are returned (i.e., no filtering is performed based on file name extension):

To get the HFS paths or AppleScript aliases of all files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

To get the HFS paths or AppleScript aliases of all pdf files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

And to get the HFS paths or AppleScript aliases of all pdf, txt, and jpg files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o -iname '*.txt' -o -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o -iname '*.txt' -o -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

Hi bmose.

‘run script’ isn’t “cheating”, of course. :wink:

I find these to be faster and slightly more thorough:

set parent_folder_hfs_path to (path to downloads folder)

set hfsPaths to run script (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; 1 s/^/{/; $ s/,¬$/}/'")

--or--

set hfsPaths to run script (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as alias,¬/; 1 s/^/{/; $ s/,¬$/}/'")

‘find’ simply returns the relevant files’ POSIX paths. The ‘sed’ code double-escapes any quotes or backslashes in them, enquotes them and adds the AppleScript code, inserts an opening brace at the beginning of the first line, and edits a closing brace onto the end of the last.
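
For anyone who wants to watch that editing in isolation, here is a rough sketch of the same ‘sed’ program run as a plain shell function on two made-up paths instead of real ‘find’ output. The function name to_applescript_list and the sample paths are purely illustrative, and the quote is matched as a bare " because there is no AppleScript string layer to get through here.

```shell
# Hypothetical helper: turn POSIX paths on stdin into the text of an AppleScript list.
to_applescript_list() {
    # 1. put a backslash in front of every backslash or double quote
    # 2. wrap each line in quotes and append the coercion plus ",¬"
    # 3. open the list on the first line
    # 4. swap the trailing ",¬" on the last line for the closing brace
    sed -E 's/\\|"/\\&/g; s/^.*$/"&" as POSIX file as text,¬/; 1 s/^/{/; $ s/,¬$/}/'
}

# Made-up sample paths: one containing double quotes, one containing a backslash.
printf '%s\n' '/tmp/a "b".pdf' '/tmp/c\d.pdf' | to_applescript_list
```

The result is two lines of AppleScript source: the first begins with the opening brace and ends in ",¬", and the second ends with the closing brace. That text is what ‘run script’ would then compile.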

Nice tweaks! It’s more streamlined. I made one slight adjustment: I pulled your curly braces out of the sed command and put them in a wrapping echo command so that in the case where no matching files are found, an empty list rather than no result is returned. (Also, I used applescriptAliases for the second statement’s variable name :).)


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; $s/,¬$//;')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $s/,¬$//;')}\"")

Ah. Right! I hadn’t realised sed wouldn’t be triggered in such cases.
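
The difference is easy to reproduce outside AppleScript. This is only a minimal sketch with hypothetical function names, and it uses printf in place of the wrapping echo (my substitution, so that echo's own backslash handling doesn't cloud the point). When ‘find’ matches nothing, ‘sed’ has no lines to edit, so it never gets the chance to emit the braces itself.

```shell
# Braces added by sed itself: nothing at all comes out when there is no input.
braces_inside_sed() {
    sed -E 's/^.*$/"&" as POSIX file as text,¬/; 1 s/^/{/; $ s/,¬$/}/'
}

# Braces added by the wrapper: an empty input still yields "{}".
braces_outside_sed() {
    printf '{%s}\n' "$(sed -E 's/^.*$/"&" as POSIX file as text,¬/; $ s/,¬$//')"
}

braces_inside_sed </dev/null     # prints nothing
braces_outside_sed </dev/null    # prints {}
```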

Oops! :rolleyes:

I thought I might submit this to Code Exchange, given that there is pretty much nothing out there about using the shell to get HFS paths and AppleScript aliases. One question: Is it really necessary to “doubly” escape the double-quote character in the first sed command? This seems to work just as well:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; $s/,¬$//;')}\"")

Under your ‘echo’ scheme, the ‘sed’ code’s a string embedded in an ‘echo’ string in a shell script represented by an AppleScript string. The ‘sed’ code edits text returned by ‘find’ which may contain quote or backslash characters. These characters have to be doubly escaped in the AppleScript text to be received correctly by ‘sed’, which then has to add enough escapage to any matches so that, after everything’s gone through ‘echo’, there’s enough escapage left to doubly escape the characters in the path string(s) represented within the AppleScript text returned by the shell script. Simple really. :wink:

set parent_folder_hfs_path to (path to downloads folder)

set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as text,¬/; $ s/,¬$//')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $ s/,¬$//')}\"")

Edit: Explanation rewritten.

Or try using ‘satimage.osax’ with its ‘list files’ command:
http://www.satimage.fr/software/en/dictionaries/dict_satimage.html#SatimageFileAdditions.listfiles

Another good one! An equivalent script using Satimage would look like this:

set extensionList to {"pdf", "zip", "dmg", "jpg"}
set sourceFolder to (path to downloads folder)

list files sourceFolder of extension extensionList without recursively

Here’s an extension of bmose’s idea which allows — er — extensions to be passed:

set extensionList to {"pdf", "zip", "dmg", "jpg"}
set sourceFolder to (path to downloads folder)

listFilesWithGivenExtensions(sourceFolder, extensionList)

on listFilesWithGivenExtensions(sourceFolder, extensionList)
	if (extensionList is {}) then
		set filterBlock to "-not -name '.*'"
	else
		set astid to AppleScript's text item delimiters
		set AppleScript's text item delimiters to "' -o -iname '*."
		set filterBlock to "\\( -iname '*." & extensionList & "' \\)"
		set AppleScript's text item delimiters to astid
	end if
	
	return (run script (do shell script "echo \"{$(find " & sourceFolder's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 " & filterBlock & " | LC_ALL='en_GB' sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $ s/,¬$//')}\""))
end listFilesWithGivenExtensions

I submitted to Code Exchange the run script/do shell script solution along with a discussion of other methods of batch-retrieving HFS paths and AppleScript aliases. I tried to beat into partial submission the backslashes in the run script/do shell script expression with a more liberal use of the & special character :slight_smile:

Hey Nigel, can the AppleScript objects also be referenced as, say:

set aList to {path to desktop, "aardvark", 17, {1, 2, 3}, 4, {{10,11,12}, {20,30,40}, {300, 400, 500}}, "hello", {a:"apple", b:"banana"}, 5.0, 7, "world", {a:"orange", b:"grapefruit"}, {c:{csub1:"strawberry", csub2:"raspberry"}, d:{dsub1:"tomato", dsub2:"potato"}}}

aList's second record --->? {a:"orange", b:"grapefruit"}
aList's third record ---->? {c:{csub1:"strawberry", csub2:"raspberry"}, d:{dsub1:"tomato", dsub2:"potato"}}

aList's second record of (aList's third record) ---->? d:{dsub1:"tomato", dsub2:"potato"}
aList's second list ---->? {{10,11,12}, {20,30,40}, {300, 400, 500}}
aList's third list of (aList's second list) ----->?  {300, 400, 500}

I guess for that last question, if that is true then that is probably going to help clear up the “List of List” or “Lists of Lists” that I find in some dictionaries. That has never been clear to me.

thanks

Hi technomorph.

Your first, second, and fourth fetch lines are correct.

The fifth should just be:

third list of (aList's second list) -- With or without the parentheses.

… or …

aList's second list's third list

… or …

list 3 of list 2 of aList

… or whatever mixture of styles you prefer.

You can’t get an indexed item of a record because the point of a record is that the values in it are labelled rather than being in a particular order. So your third fetch line would have to be something like this:

d of (aList's third record) ----> {dsub1:"tomato", dsub2:"potato"}

“List of list” simply means a list containing lists, the implication being that it either only contains lists or is empty. What any lists in it might contain isn’t specified.

item 1 = {INFO:{BITRATE:"128000", GENRE:"Alternative", COMMENT:"missing 20180401", RATING:"Search In Playlists", PLAYTIME:"353", IMPORT_DATE:"2018/4/29", FLAGS:"10", FILESIZE:"5582", |color|:"1"}, PRIMARYKEY:{|key|:"Tekno/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:All My Friends (Franz Ferdinand Version).m4p"}, LOCATION:{DIR:"/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:", |file|:"All My Friends (Franz Ferdinand Version).m4p", VOLUME:"Tekno", VOLUMEID:"Tekno"}, MAININFO:{MODIFIED_DATE:"2018/5/9", MODIFIED_TIME:"47417", AUDIO_ID:"AQoAABEBERERIhEQEiERERFERlVCIjVnQxRWZCIiNVZ1QhETZlUiRVIRNENoQUVVImdlQgJWZiRTEQAWd4UVZ2QQJnZhRUIRERIiEREiEREQABJmRTMhERVYUhFFVUERJERERCJHVFVSE0VCEUZkiDFFZRJmZVITVWUTQyIRFXd1E1Z2MRWGcyVEIiM0M0MyIUdlQQFGZSFGMzEAATITZldBJWVlM0dVd2dnQyIREyIzIiMiISERERA2ZUEmZlUgNWZhRUERAEd4ckZmYxFnZyJFQ0RmdUIRIiIhBHYlVVQzIhEjMjIjMzIiMzIhERERFDERElQxERNDMxACMyERAA==", title:"Test Title 02", ARTIST:"Test Artist 02"}, ENTRYNO:1}

which breaks down to
item 1 having these records:
ENTRYNO: record of 1 item
INFO: record of 9 items
-----BITRATE
-----GENRE
-----COMMENT
-----RATING
-----PLAYTIME
-----IMPORT_DATE
-----FLAGS
-----FILESIZE
-----|color|
PRIMARYKEY : record of … etc
-----subrecord1
-----subrecord2
LOCATION : …etc
-----subrecord3
-----subrecord4
-----subrecord5
…etc

All of the subrecords have different user-defined names.

  1. Am I able to access the subrecords directly IE:
    from previous example:
    1a)
    d of aList
    1b)
    dsub of aList
    1c)
    dsub of (aList’s third record)
    1d) or do I have to drill right down into the subrecords like:
    dsub of d of (aList’s third record)

  2. I’m guessing I’m probably not able to ask for a recordkey name that’s deep in a list.
    But in my ITEM 1 it just contains records. Can I

2a)
BITRATE of item 1

2b) or do I have to
BITRATE of INFO of item 1

  3. If I have to drill down, would I be better to set up 3b rather than 3a?
    3a)
    BITRATE of INFO of item 1
    GENRE of INFO of item 1
    COMMENT of INFO of item 1
    RATING of INFO of item 1
    FLAGS of INFO of item 1
    3b)
    set ITEMINFO to INFO of item 1
    BITRATE of ITEMINFO
    GENRE of ITEMINFO
    COMMENT of ITEMINFO
    RATING of ITEMINFO
    FLAGS of ITEMINFO

  4. I’m guessing there are probably other functions/libraries out there that will “filter” an
    array for me by “keys”.

I don’t want all of the INFO from item 1, just certain subrecords,
selected by providing keys, say {“BITRATE”, “GENRE”, “COMMENT”, “RATING”, “FLAGS”}.
For the keys, is it possible to use a list of lists?
{“INFO” {“BITRATE”, “GENRE”, “COMMENT”, “RATING”, “FLAGS”}, “LOCATION” {“VOLUME”, “DIR”, “|file|”}}

thanks

Hi technomorph.

We seem to have strayed a long way from the topic of this thread — and even from the items-of-a-class-from-a-list digression. :confused:

If ‘aList’ is a list, then 1a and 1b are both wrong because lists don’t have labelled properties. If ‘dsub’ is a property of a record which is the ‘d’ property of aList’s third record, then 1c is wrong and 1d is right.

You’d normally only use an expression like ‘aList’s third record’ where the list contained items of various classes and you specifically wanted the third record. If you know that the list only contains records, then ‘aList’s third item’ may be more convenient. It’s also theoretically more efficient, since it simply grabs the third item from the list without having to check the item’s class.

Yes, you need 2b. BITRATE is a property of the record which is item 1’s INFO value, not a direct property of item 1 itself.

For 3a versus 3b, use whichever you find more convenient at the time. Theoretically, if you need to extract several values from a subitem, it’s more efficient to use a variable, as in 3b, since it reduces the number of pointers which have to be followed with each individual value. But the difference is tiny.

It may be possible in ASObjC, but filtering is straying even further off-topic here.

If you know that a record hierarchy contains all the properties of interest, a convenient way to set variables to all the required values would be like this:

-- For this demo, aList is a list containing one record.
set aList to {{INFO:{BITRATE:"128000", GENRE:"Alternative", COMMENT:"missing 20180401", RATING:"Search In Playlists", PLAYTIME:"353", IMPORT_DATE:"2018/4/29", FLAGS:"10", FILESIZE:"5582", |color|:"1"}, PRIMARYKEY:{|key|:"Tekno/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:All My Friends (Franz Ferdinand Version).m4p"}, LOCATION:{DIR:"/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:", |file|:"All My Friends (Franz Ferdinand Version).m4p", VOLUME:"Tekno", VOLUMEID:"Tekno"}, MAININFO:{MODIFIED_DATE:"2018/5/9", MODIFIED_TIME:"47417", AUDIO_ID:"AQoAABEBERERIhEQEiERERFERlVCIjVnQxRWZCIiNVZ1QhETZlUiRVIRNENoQUVVImdlQgJWZiRTEQAWd4UVZ2QQJnZhRUIRERIiEREiEREQABJmRTMhERVYUhFFVUERJERERCJHVFVSE0VCEUZkiDFFZRJmZVITVWUTQyIRFXd1E1Z2MRWGcyVEIiM0M0MyIUdlQQFGZSFGMzEAATITZldBJWVlM0dVd2dnQyIREyIzIiMiISERERA2ZUEmZlUgNWZhRUERAEd4ckZmYxFnZyJFQ0RmdUIRIiIhBHYlVVQzIhEjMjIjMzIiMzIhERERFDERElQxERNDMxACMyERAA==", title:"Test Title 02", ARTIST:"Test Artist 02"}, ENTRYNO:1}}

-- Set variables to the required values from the first record in the list.
-- The record containing the variables must have the same structure as the record containing the values, but need only contain the required properties.
-- The order in which the records' properties are written in the source code is of course immaterial.
set {INFO:{BITRATE:theBitRate, GENRE:theGenre, COMMENT:theComment, RATING:theRating, FLAGS:theFlags}, LOCATION:{VOLUME:theVolume, DIR:theDirectory, |file|:theFile}} to item 1 of aList

return {theBitRate, theGenre, theComment, theRating, theFlags, theVolume, theDirectory, theFile}
--> {"128000", "Alternative", "missing 20180401", "Search In Playlists", "10", "Tekno", "/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:", "All My Friends (Franz Ferdinand Version).m4p"}

While I’m not expecting to get hundreds of files returned, I’d like to speed up the following AppleScript version and add the ability to exclude a list of file extensions. The list of exclusions is the part about which I’m particularly concerned, as I’m not that familiar with some of the languages used in the preceding posts and can’t extend them. Any help would be appreciated.


		-- --------------------------------------------------------------------------------
		-- get all the files having a filename that is the same as the ".meta" file,
		-- but not the ".meta" file itself, into a list.
		-- --------------------------------------------------------------------------------
		try
			tell application "System Events"
				set theList to (name of files of alias (thePath) whose (name begins with theFilename & ".") and (name does not contain "." & theExtension))
			end tell
		on error error_message number error_number
			set this_error to "Error: " & error_number & ". " & error_message & return
			write_to_file(theStatus, this_file, true)
		end try

It currently looks for all files having the root name “theFilename” in “thePath”, but not with the extension “theExtension”.

I’m not too particular whether the substitute uses shell scripts or Objective-C. (But I’m hoping the suggestion is commented so I can learn from it.) I just need to improve the speed, as the pure AppleScript way is quite slow.

I was playing around with bash and the terminal. I “think” the script needs to incorporate something like:


find . -type f -iname "GoodFile.*" -a -not \( -iname "*.meta" -o -iname "*.app" \)

then have its output fed to grep to use a regular expression to format it as a string for return to AppleScript.

To get the GoodFile and list of excluded extensions, I believe I’d have to create the bash script on the fly in a handler.

Am I on the right track?
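
As a rough sketch of that idea (not a tested drop-in for your script), the exclusions can be appended to the ‘find’ command one at a time, so no regular expression is needed. The function name list_siblings and its argument order are made up for illustration. I've also assumed that only the folder's top level should be searched, like the System Events version, rather than the recursive search that ‘find .’ performs.

```shell
# Hypothetical helper: list the files in DIR named STUB.<anything>, skipping
# any whose extension is one of the remaining arguments (case-insensitive).
# usage: list_siblings DIR STUB [EXCLUDED_EXTENSION ...]
list_siblings() {
    dir=$1
    stub=$2
    shift 2
    count=$#    # number of excluded extensions still to convert

    # Append the fixed part of the find expression after the extensions ...
    set -- "$@" "$dir" -mindepth 1 -maxdepth 1 -type f -iname "$stub.*"

    # ... then move each extension from the front to the back of the
    # argument list as a "! -iname '*.ext'" test.
    while [ "$count" -gt 0 ]; do
        ext=$1
        shift
        set -- "$@" ! -iname "*.$ext"
        count=$((count - 1))
    done

    find "$@"
}
```

A call such as `list_siblings "$HOME/Downloads" GoodFile meta app` prints one POSIX path per line, which AppleScript could split with ‘paragraphs of’ on the ‘do shell script’ result. Note that the stub is handed to ‘find’ as a pattern, so a name containing characters like * or [ would need escaping first.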

Try this:

use AppleScript version "2.5" -- 10.11 or later
use framework "Foundation"
use scripting additions

on listFilesIn:sourceAliasOrFile fileStub:fileStub excludedExtensions:theExtensions
	-- Get the URLs of the folder's visible items (top level only, no recursion).
	set fileManager to current application's NSFileManager's defaultManager()
	set theURLs to fileManager's contentsOfDirectoryAtURL:sourceAliasOrFile includingPropertiesForKeys:{} options:(current application's NSDirectoryEnumerationSkipsHiddenFiles) |error|:(missing value)
	
	-- Keep only the URLs whose name minus its extension equals fileStub
	-- and whose extension is not in the list of excluded extensions.
	set thePred to current application's NSPredicate's predicateWithFormat:"%K == %@ AND !(%K IN %@)" argumentArray:{"lastPathComponent.stringByDeletingPathExtension", fileStub, "pathExtension", theExtensions}
	set theURLs to theURLs's filteredArrayUsingPredicate:thePred
	-- Coerce the filtered array of URLs to an AppleScript list.
	return theURLs as list
end listFilesIn:fileStub:excludedExtensions: