Searching for files with Finder - name contains strings from list

Hello Nigel.

I think I misread the post a little, so your answer is correct to the specification as well. I read the OP as wanting any file that contained one of the search terms, when he really wanted the ones that matched all the search terms.

Not only is your solution probably faster, but it is correct as well. :wink:

Thank you both for the quick answers; this really helps me learn more about AppleScript!

I tried both scripts. For me the second script works better, since it only copies the files that match all the search strings.

For instance if I have files

Berlin_City_Airport.jpg
and
Berlin_City_Airport_Restaurant.jpg

and my search strings are {“Berlin”,“City”,“Airport”,“Restaurant”}

the first script duplicates both files, while Nigel's script with the name search only copies Berlin_City_Airport_Restaurant.jpg
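The difference between the two behaviours can be sketched with two small handlers. This is only an illustration, not either of the scripts being discussed: the handler names nameContainsAll and nameContainsAny are made up here, and the test is a plain (case-insensitive) substring check on the file name.

```applescript
-- Hypothetical helper handlers, not taken from the scripts in this thread.

-- Returns true only if theName contains every string in searchStrings.
-- (An empty list therefore counts as a match.)
on nameContainsAll(theName, searchStrings)
	repeat with aString in searchStrings
		if theName does not contain (contents of aString) then return false
	end repeat
	return true
end nameContainsAll

-- Returns true if theName contains at least one string in searchStrings.
on nameContainsAny(theName, searchStrings)
	repeat with aString in searchStrings
		if theName contains (contents of aString) then return true
	end repeat
	return false
end nameContainsAny

set searchStrings to {"Berlin", "City", "Airport", "Restaurant"}
set shortName to "Berlin_City_Airport.jpg"
set longName to "Berlin_City_Airport_Restaurant.jpg"

-- "Any" matching accepts both names; "all" matching accepts only the longer one.
{nameContainsAny(shortName, searchStrings), nameContainsAll(shortName, searchStrings), nameContainsAll(longName, searchStrings)}
-- → {true, false, true}
```

With "all" matching, Berlin_City_Airport.jpg is rejected because it lacks "Restaurant", which is the behaviour wanted here.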

I'm always impressed by how quickly people answer here!
Once again thank you so much for your help!
Marc

McUsrII, while I was typing my reply you wrote again ;).
My post title might have been a little misleading. I was looking for matching of all searchstrings.

Love this forum, learned so much here!

Hello.

Here is possibly a simpler way to do it, if you are going to edit the list often anyway. Here I use the fact that you can build up a filter clause with and’s (or or’s) as connectives.
I have also changed your path to the image folder, a bit, to make the whole script do less.

set imagefolder to (((path to desktop as text) & "Pictures")) as alias
set target_folder to (choose folder with prompt "Choose the target folder:")

tell application "Finder"
	set the_files to files's name of folder imagefolder whose name contains "Berlin" and name contains "City" and name contains "Airport"
	repeat with this_file in the_files
		if (not (exists file this_file of target_folder)) then
			duplicate file this_file of imagefolder to target_folder
		end if
	end repeat
end tell

Hi McUsrII,

This copying is part of a bigger script. The list, and with it the search strings, is regenerated from a CSV each time.

So I might do one search for {“Berlin”,“City”,“Airport”,“Restaurant”} and then for {“Cologne”,“City”,“Cathedral”} then for {“Hamburg”,“City”,“Harbour”}.

I first tried it with ands but since it always changes I was stuck.
With the script from Nigel I can change the searchstrings and the resulting request and copy the files I need.

Hello.

This is more of a fun exercise, as I don’t really know whether using a complex filter is faster than Nigel’s way of doing it. Maybe I’ll time it tomorrow. :wink:

Here I build up the filter expression from the list of terms and execute it with a run script statement. I then process the result in what’s left of the Finder block. I have to do it this way because run script is part of Standard Additions, so I call it outside the Finder block for the script to run correctly.

set target_folder to (choose folder with prompt "Choose the target folder:")
-- Change: imagefolder is now kept as a path string rather than an alias, so it can be inserted into the filter clause
set imagefolder to (((path to desktop as text) & "Pictures"))

set searchstrings to {"Berlin", "City", "Airport"}
set ssc to count searchstrings
if ssc = 0 or class of searchstrings is not list then
	error "No list of search strings specified"
else
	-- we build up a string that contains the filter clause for querying finder
	set theString to "tell application \"finder\"  to files's name of folder \"" & imagefolder & "\"   whose name contains " & "\"" & item 1 of searchstrings & "\""
	
	-- We add on every search criteria after the first one to the filter criteria.
	repeat with i from 2 to ssc
		set theString to theString & " and name contains \"" & item i of searchstrings & "\""
	end repeat
	-- We run the query for finder and gets the result.
	set the_files to run script theString
	-- the final duplicating of files
	tell application "Finder"
		set i to 0
		repeat with this_file in the_files
			if (not (exists file this_file of target_folder)) then
				duplicate file this_file of folder imagefolder to target_folder
				set i to i + 1
			end if
		end repeat
		display notification "There were " & i & " files copied." with title "Duplicate picture files"
	end tell
end if

Edit
I removed some unnecessary coercions and let the code flow more naturally, with an error message if no search strings are specified. I also added a notification telling us how many files were copied.

And I presume that {“Berlin”, “City”, “Airport”} scoring a hit with both files would be correct for your purposes.

My script has a slight weakness in that {“Hamburg”, “City”, “Restaurant”} would pick out both “Hamburg_City_Restaurant.jpg” and “Berlin_City_Hamburger restaurant.jpg”. If you’re running Snow Leopard or later, only a small tweak is needed to cure this:

set imagefolder to POSIX path of ((path to desktop as text) & "Pictures") -- Assuming this really has to be a POSIX path.
set target_folder to (choose folder with prompt "Choose the target folder:")
set searchstrings to {"Hamburg", "City", "Restaurant"}

set imageFolderPath to (POSIX file imagefolder) as text -- Get the POSIX path as an HFS path.

tell application "Finder" to set filenames to name of every file of folder imageFolderPath

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"_", "."}
repeat with i from 1 to (count filenames)
	set thisName to item i of filenames
	-- Get a list of this file name's individual segments. It's assumed that the extension won't be the same as any of the other segments!
	set nameSegments to thisName's text items
	set matched to false
	-- Check each file name against each search string in turn.
	repeat with j from 1 to (count searchstrings)
		set matched to (nameSegments contains item j of searchstrings)
		-- Give up immediately on any mismatch.
		if (not matched) then exit repeat
	end repeat
	-- If the file name contains all the search strings, duplicate the file to the destination folder.
	-- As in the original script, any name clashes at the destination simply prevent the duplication.
	if (matched) then
		tell application "Finder"
			if not (file thisName of target_folder exists) then
				duplicate file thisName of folder imageFolderPath to target_folder
			end if
		end tell
	end if
end repeat
set AppleScript's text item delimiters to astid

Hello.

The Finder isn’t as smart as Nigel Garvey when it comes to filtering, that is for sure.
My script will pick out Berlin_Hamburger_restaurant, and Berlin_eu_de_Cologne_shop as well for that matter. :slight_smile: I also guess that Nigel’s approach may be faster, since I have some vague memories of filter queries being slow with the Finder, and the and clauses won’t make them any faster. But it can be useful in some situations to compile filter queries on the fly, and that is why I posted my last script: to show that it can be done, for those not initiated in it (not Nigel! :slight_smile: ).

Hey McUsrII,

If I’m understanding the problem correctly I believe what you were attempting can be done. It’s just that the filter form is very slow.


set srcFldr to ((path to desktop as text) & "Pictures:")
set destFldr to alias ((path to home folder as text) & "test_directory:TestFldr_Dest:")
set searchStrings to {"Berlin", "City", "Airport", "Restaurant"}

set AppleScript's text item delimiters to "\" and name contains \""

set _query to "tell application \"Finder\"
	(files of alias \"" & srcFldr & "\" whose name contains \"" & (searchStrings as text) & "\") as alias list
end tell"

set fileList to run script _query

tell application "Finder"
	duplicate fileList to destFldr
end tell

Okay, this is pretty quick.

I’ve left off the duplicate in the Finder code, since that’s been well covered already.

The result is a list of aliases.


set searchStrings to {"Berlin", "City", "Airport", "Restaurant"}
set srcFldrPath to POSIX path of ((path to desktop as text) & "Pictures:")
set destFldr to alias ((path to home folder as text) & "test_directory:TestFldr_Dest:")
set AppleScript's text item delimiters to linefeed

set shCMD to "sed -En '/" & item 1 of searchStrings & "/"

set endOfCmd to {}
set searchStrings to rest of searchStrings
repeat with i in searchStrings
	set end of endOfCmd to "}"
	set shCMD to shCMD & "{" & linefeed & "/" & i & "/"
end repeat
set shCMD to shCMD & "p"
set endOfCmd to endOfCmd as text
set shCMD to shCMD & linefeed & endOfCmd & "'"

set shCMD to "srcDIR='" & srcFldrPath & "';
ls -1 \"$srcDIR\" | " & shCMD & " \\
| sed -E \"s!^!${srcDIR}!\""

set fileList to paragraphs of (do shell script shCMD)

repeat with i in fileList
	set contents of i to alias POSIX file (contents of i)
end repeat

fileList

Alternatively, the second sed command can be dispensed with altogether if ls returns the paths too:


set searchStrings to {"Berlin", "City", "Airport", "Restaurant"}
set srcFldrPath to POSIX path of ((path to desktop as text) & "Pictures:")
set destFldr to alias ((path to home folder as text) & "test_directory:TestFldr_Dest:")
set AppleScript's text item delimiters to linefeed

set shCMD to "sed -En '/" & item 1 of searchStrings & "/"

set endOfCmd to {}
set searchStrings to rest of searchStrings
repeat with i in searchStrings
	set end of endOfCmd to "}"
	set shCMD to shCMD & "{" & linefeed & "/" & i & "/"
end repeat
set shCMD to shCMD & "p"
set endOfCmd to endOfCmd as text
set shCMD to shCMD & linefeed & endOfCmd & "'"

set shCMD to "ls " & quoted form of srcFldrPath & "*  | " & shCMD -- NB. the asterisk.
set fileList to paragraphs of (do shell script shCMD)

repeat with i in fileList
	set contents of i to alias POSIX file (contents of i)
end repeat

fileList

Hi Chris.

Yep. That’s fast. :slight_smile:

The second sed command in the shell script can in fact be incorporated into the first:


set searchStrings to {"Berlin", "City", "Airport", "Restaurant"}
set srcFldrPath to POSIX path of ((path to desktop as text) & "Pictures:")
set destFldr to alias ((path to home folder as text) & "test_directory:TestFldr_Dest:")
set AppleScript's text item delimiters to linefeed

set shCMD to "sed -En '/" & item 1 of searchStrings & "/"

set endOfCmd to {}
set searchStrings to rest of searchStrings
repeat with i in searchStrings
	set end of endOfCmd to "}"
	set shCMD to shCMD & "{" & linefeed & "/" & i & "/"
end repeat
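-- ${srcDIR} is expanded by the shell between the two single-quoted parts; the double quotes keep a path containing spaces in one piece.
set shCMD to shCMD & "s!^!'\"${srcDIR}\"'!p"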
set endOfCmd to endOfCmd as text
set shCMD to shCMD & linefeed & endOfCmd & "'"

set shCMD to "srcDIR='" & srcFldrPath & "';
ls -1 \"$srcDIR\" | " & shCMD
set fileList to paragraphs of (do shell script shCMD)

repeat with i in fileList
	set contents of i to alias POSIX file (contents of i)
end repeat

fileList

Here’s another version. Filtering in Cocoa is done using things called predicates, which are similar to whose clauses. In fact, Cocoa scripting’s whose clauses are built on predicates.

But there are also things called compound predicates: you create a bunch of predicates, make a compound predicate from them, and then filter with that. It seems to me ideal for this sort of problem.

So this version requires Yosemite (or you can put it in a script library under Mavericks):

use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

set searchStrings to {"Berlin", "City", "Restaurant"}
set srcFldrPath to POSIX path of ((path to desktop as text) & "Pictures:")
-- we need it in NSString form later, so...
set srcFldrPath to current application's NSString's stringWithString:srcFldrPath
-- get list of files
set theNSFileManager to current application's NSFileManager's defaultManager()
set theFiles to theNSFileManager's contentsOfDirectoryAtPath:srcFldrPath |error|:(missing value)
-- build list of predicates from search strings
set thePreds to {}
repeat with aString in searchStrings
	set end of thePreds to (current application's NSPredicate's predicateWithFormat_("self contains %@", aString))
end repeat
-- filter names using compound *and* predicate
set theFiles to theFiles's filteredArrayUsingPredicate:(current application's NSCompoundPredicate's andPredicateWithSubpredicates:thePreds)
-- convert names to full paths
set fileList to (srcFldrPath's stringsByAppendingPaths:theFiles) as list
-- convert to list of aliases
repeat with i in fileList
	set contents of i to alias POSIX file (contents of i)
end repeat

I’d imagine timings will vary depending on the number of files in the directory and so on, but in my very simple test it takes around 0.001 seconds, compared with about 0.01 seconds for the sed versions. (I found Nigel’s a fraction slower than Chris’s.)

Hello Chris.

This is a reply to the post where you addressed me. What I meant was that Nigel’s script did well in testing that the search strings ended on word boundaries, so that, for instance, Hamburger wasn’t included when Hamburg was the search string (exact match). You get a very slow script if you want to avoid those cases while filtering with the Finder. The reason is that, since you can’t make any assumptions about where the individual search strings are located in a file name, you can’t end the strings with either a dot or an underscore. This means that you’ll have to test each file name from the filtered set like Nigel did. :slight_smile:

Nice bunch of scripts you all have here, I’ll have to work through them. :slight_smile:

I started to read this late, but what if the name has to include the same string twice, like Kansas City, Kansas?

Edited: i.e. like {kansas, city, kansas}. Like Garden City, Kansas.

I must admit I didn’t compare their timings. It was after two in the morning here, I’d only been trying to economise on Chris’s code, and (since my test folder only contains seven, carefully named files) any differences in the time taken to process the text would be too close to be reliable and easily swamped in practice by variations in the times taken to duplicate the identified files (which none of the later scripts actually bothers to do). It is interesting that you find Chris’s script faster though, because it runs the shell script twice! I’ll check it out again later on.

This would probably have to be dealt with by crossing off strings as they were found. But Marc hasn’t mentioned it as a possibility, so I wouldn’t be inclined to worry about it here. (Yet, at least!) When writing the script in post #10, I did think of eliminating the name extensions in case they coincided with any of the search terms; but in the end, I decided it was unlikely to happen.
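A rough sketch of what "crossing off" could look like, under the same assumption as the script in post #10 that name segments are separated by "_" and "." (multiple delimiters need Snow Leopard or later). The handler name nameMatchesEachTermOnce is invented for this illustration, and the name extension is simply treated as one more segment.

```applescript
-- Hypothetical handler: true if every search string matches its own segment of theName.
on nameMatchesEachTermOnce(theName, searchStrings)
	-- Split the name into segments, restoring the delimiters straight afterwards.
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {"_", "."}
	set nameSegments to text items of theName
	set AppleScript's text item delimiters to astid
	
	repeat with aString in searchStrings
		set foundAt to 0
		repeat with k from 1 to (count nameSegments)
			if item k of nameSegments is (contents of aString) then
				set foundAt to k
				exit repeat
			end if
		end repeat
		-- Give up immediately on any mismatch.
		if foundAt is 0 then return false
		-- Cross the segment off so that it can't satisfy a later, identical search string.
		set item foundAt of nameSegments to missing value
	end repeat
	return true
end nameMatchesEachTermOnce

set searchStrings to {"Kansas", "City", "Kansas"}
{nameMatchesEachTermOnce("Kansas_City_Kansas.jpg", searchStrings), nameMatchesEachTermOnce("Garden_City_Kansas.jpg", searchStrings)}
-- → {true, false}
```

The second name fails because its single "Kansas" segment is used up by the first search string, leaving nothing for the repeat.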

Edit: Ah. I see the forum times are correct again. The clocks must have gone back in the US. So it was between one and two in the morning when I posted my modifications of Chris’s script. :wink:

Later: With the redundant ‘do shell script’ removed from Chris’s script, I can’t find any reliable difference in speed between it and my variations. Shane’s ASObjC script is of course eight to ten times as fast as any of them.

I couldn’t time Shane’s script at first, because the code I use to load my timer into a script ...

set timer to (load script file ((path to scripts folder from user domain as text) & "Libraries:Timer.scpt"))

... kept returning the stupid error “Can’t make current application into type file.” The cure turned out to be:

set timer to (load script (get alias ((path to scripts folder from user domain as text) & "Libraries:Timer.scpt")))

Hello.

If there were a folder with lots of files, then a do shell script with mdfind -onlyin and a compiled search expression may be beneficial. I just mention that as another option; maybe it is faster than the sed version of things, maybe not. I don’t have the number of files to test this anyway, so I leave it open. :slight_smile:
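For anyone who wants to try it, here is an untested sketch of how such a search expression might be compiled. The handler name buildMdfindCommand is made up; it assumes the search strings contain no quotes or asterisks, relies on the Spotlight index being current, and mdfind -onlyin also searches subfolders, so the results may need trimming. The query uses the kMDItemFSName attribute with wildcards and the c (case-insensitive) modifier.

```applescript
-- Hypothetical handler: only builds the mdfind command line, it doesn't run it.
on buildMdfindCommand(posixFolderPath, searchStrings)
	set clauses to {}
	repeat with aString in searchStrings
		-- "*term*" is a wildcard match on the file name; the trailing c ignores case.
		set end of clauses to "kMDItemFSName == \"*" & (contents of aString) & "*\"c"
	end repeat
	-- Join the clauses with && so that every term has to match.
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " && "
	set theQuery to clauses as text
	set AppleScript's text item delimiters to astid
	return "mdfind -onlyin " & quoted form of posixFolderPath & " " & quoted form of theQuery
end buildMdfindCommand

set shCMD to buildMdfindCommand("/Users/me/Desktop/Pictures/", {"Berlin", "City"})
-- → mdfind -onlyin '/Users/me/Desktop/Pictures/' 'kMDItemFSName == "*Berlin*"c && kMDItemFSName == "*City*"c'
```

The command could then be run with something like paragraphs of (do shell script shCMD), which should give a list of POSIX paths. Like the plain contains filters, this is substring matching, so Hamburger would still match Hamburg.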

That’s why I mentioned it – but the difference was slight, and my test folder very basic. I guess what I was really saying was that I was surprised there wasn’t a significant difference the other way around, though.

FWIW, changing the repeat loop slightly to use a regular expression will deal with hamburgers and Berliners:

repeat with aString in searchStrings
	set aString to (current application's NSRegularExpression's escapedPatternForString:aString) -- escape any significant characters
	set end of thePreds to (current application's NSPredicate's predicateWithFormat_("self matches %@", "(?i).*" & aString & "[ _. ].+"))
end repeat

That should theoretically slow it down a smidge, but I can’t see any difference here.

You know, there’s an app for that :wink:

Probably, but I’m always nervous about the risk of the Spotlight database being wrong when there’s not a lot of time to be gained. I just changed my test folder from a handful of files to 500-odd, and the sed time only rose from 0.01 to 0.053 seconds (whereas the ASObjC time went to 0.034).

So much for theory. My test on a larger number of files shows the regex version is actually quicker.