Copy every nth file from a folder to a new folder

Thanks for your help.
I've now tried to build a loop to make it a little bit faster.
My idea is to run the script five times, with each pass creating a new folder named with the loop number. I tried this, but no luck…

set theFolder to (choose folder)
repeat with a from 1 to 5
tell application "Finder"
	set theFiles to files of theFolder -- AppleScript list of the files.
	set savedFiles to (make new folder at desktop with properties {name:"Result"& a})
	repeat with i from 10 to (count theFiles) by 10
		move item i of theFiles to savedFiles
	end repeat	
a = a + 1
end tell
end repeat
end tell

I got it:


set theFolder to (choose folder)
repeat with a from 1 to 5
	tell application "Finder"
		set theFiles to files of theFolder -- AppleScript list of the files.
		set savedFiles to (make new folder at desktop with properties {name:"Result" & a})
		repeat with i from 10 to (count theFiles) by 10
			move item i of theFiles to savedFiles
		end repeat
		a = a + 1
	end tell
end repeat

Hi moosmahna.

There’s no need for the a = a + 1 line. a is automatically incremented each time round the outer repeat. In AppleScript, a = a + 1 is just a test to see if a = a + 1, which obviously it doesn’t, so the line generates the unused value false.
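
To illustrate the point about = being a test rather than an assignment, here is a minimal sketch. The variable name comparisonResult is only there for illustration and isn't in the original script.

```applescript
set a to 1
set comparisonResult to (a = a + 1) -- compares 1 with 2; nothing is assigned
-- comparisonResult is false and a is still 1

set a to a + 1 -- "set … to" is the assignment form
-- a is now 2
```

Inside a repeat with a from 1 to 5 loop neither form is needed, since the loop supplies the next value of a on each pass.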

Hi peavine.

Possibly. I was simply churning out suggestions for solving the immediate problem, testing with a folder I happened to have on my desktop. I didn’t look very closely at fine-tuning for speed. But I’d encourage anyone who’s interested to experiment.

Nigel–thanks for the response.

I created a test folder containing 5,000 text files and I then ran the script included below ten times, alternating between “as alias list” enabled and then “as alias list” disabled (commented out). The average time of all runs was about 26 seconds and the runs with “as alias list” enabled were consistently one to two seconds slower.

BTW, I used duplicate rather than move to keep the source files intact.

set startTime to (time of (current date))

set theFolder to "Samsung SSD:Test:" as alias

tell application "Finder"
	set theFiles to files of theFolder -- as alias list
	set savedFiles to (make new folder at desktop with properties {name:"Result"})
	repeat with i from 10 to (count theFiles) by 10
		duplicate item i of theFiles to savedFiles
	end repeat
end tell

set executeTime to (time of (current date)) - startTime


I'm off to work now. To make the script faster, you can try two things: 1) use System Events instead of Finder; 2) there may be a command-line tool for this task, so try do shell script too.
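
A rough sketch of the do shell script idea, under a few assumptions: the target is a folder named "Result" on the desktop, file names contain no line breaks, and the shell's sort order is acceptable (it will generally differ from the Finder's). The names sourcePath, targetPath and shellCommand are made up for this example. It is untested on large folders, so treat it as a starting point rather than a finished solution.

```applescript
set sourcePath to POSIX path of (choose folder)
set targetPath to (POSIX path of (path to desktop)) & "Result"

-- Pipeline: list visible regular files in the top level of the folder,
-- sort them, keep every 10th line, and move each of those files.
-- mv -n declines to overwrite a file that already exists in the target.
set shellCommand to "mkdir -p " & quoted form of targetPath & ¬
	" && cd " & quoted form of sourcePath & ¬
	" && find . -maxdepth 1 -type f ! -name '.*' | sort | awk 'NR % 10 == 0'" & ¬
	" | while IFS= read -r f; do mv -n \"$f\" " & quoted form of targetPath & "; done"

do shell script shellCommand
```

Using quoted form of for both paths keeps spaces and quotes in folder names from breaking the command. Replacing mv with cp would leave the source files in place.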

OK. But don’t forget that copying takes longer than moving — and by how much depends largely on the sizes of the items being copied. “Moving” only involves changing entries in the disk catalogue and is therefore faster and more nearly the same whatever’s being “moved”.

An ASObjC script would no doubt be fastest of all, but of little help to the OP’s understanding at the moment.

It was meant more for amusement, really. I forgot the smiley :frowning:

Nigel. In retrospect, I didn’t need to move or copy files to see if appending “as alias list” makes things faster or slower. I ran some tests on this and found the latter to be the case. Often an alias list is needed for file processing outside the Finder or for some other reason but otherwise it probably does not need to be used (as was the case with your original script).

Apologies to the OP for getting off-topic.

It’s pretty much a given in my experience with large lists that an alias list will be faster in comparison to Finder’s own references, which are voluminous. If you have very long lists, you can also prepend a file reference with the word “my” and get more than tenfold performance improvement (isolated to list iteration); e.g.:

move item i of my theFiles to savedFiles

Edit: There may be a nominal performance penalty in using these tricks in low iteration settings, and 500 is probably not sufficiently long enough.
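
For anyone who wants to try the "my" trick away from the Finder, here is a small self-contained sketch. It assumes the list is held in a script property (bigList is a made-up name), because "my" refers to the script's own variables and not to locals inside a handler. The list size of 5,000 is arbitrary.

```applescript
property bigList : {}

set bigList to {} -- reset, since properties can persist between runs
repeat with n from 1 to 5000
	set end of my bigList to n
end repeat

-- Visit every 10th item through the "my" reference.
set total to 0
repeat with i from 10 to (count my bigList) by 10
	set total to total + (item i of my bigList)
end repeat
total
```

Removing the word "my" from the loops and timing both versions should show whether the difference matters at a given list length.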

Marc Anthony. I’ve included my test script below. It takes 10 seconds to run with a test folder containing 20,000 text files and 20 seconds if I uncomment “as alias list”. What you say makes sense but it doesn’t work that way for me–or perhaps I don’t understand what is happening here.

Thanks for letting me know about adding the word “my” to the script.

myTest()

on myTest()
	set startTime to (time of (current date))
	set theFolder to "Samsung SSD:Test:" as alias
	tell application "Finder" to set theFiles to (files of theFolder) -- as alias list
	set executeTime to (time of (current date)) - startTime
end myTest

BTW, I ran this script from within the Script Editor but I also created separate script applications with and without “as alias list”. The script application without “as alias list” was consistently twice as fast (8 seconds without and 18 seconds with).

FWIW, here’s a pretty fast version using my FileManagerLib script library:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use script "FileManagerLib" version "2.2.1"
use scripting additions

set sourcePath to  (choose folder)
set newFolder to create folder at (path to desktop) use name "Result"
set theContents to objects of sourcePath without include folders
set theContents to sort objects theContents sorted property name property sort type Finder like
repeat with i from 10 to count of theContents by 10
	move object (item i of theContents) to folder newFolder
end repeat

peavine: I tested the below code twice—once with a folder with 3460 files and then with a folder with 20K files; it was edited just a bit from your original to ensure the Finder didn’t start doing something else in between runs. The respective results were {13, 6} and {76,31}, indicating that using alias lists is about twice as fast on my machine around the few thousand mark and that the effect scales gradually higher in performance with larger counts. I tested this on a 2012 Mac Mini with a spinning HD formatted in HFS, which may be a factor.


set logTime to {}
tell application "Finder"
	set theFolder to folder -1 --the last created folder on my desktop
	set startTime to my (current date)'s time --my used to escape standard addition error
	get theFolder's files
	set logTime's end to (my (current date)'s time) - startTime
	delay 0.5
	set startTime to my (current date)'s time
	get theFolder's files as alias list
	set logTime's end to (my (current date)'s time) - startTime
end tell
logTime

Marc Anthony. I ran your script on my test folder of 20,000 text files and my results were {9, 19} compared with your result of {76, 31}. It does seem that part of the reason for this difference could be the hardware.

My computer is a 2018 Mac mini with a 6-core CPU and 16GB memory. The test files are on an APFS-formatted Samsung T5 external SSD, which has a read speed of about 540 MB/s.

Once again, apologies to the OP for going off-topic. I raised this issue as a simple question to Nigel and it sort of became more involved. If Nigel thinks these posts should be moved to another thread that would be great.

I wonder. I know that APFS does smart tricks when duplicating files, so if your folder is full of files you’ve made by duplicating then their contents haven’t actually been duplicated, but that’s just the contents.

I set up a folder of 4700 files. I ran Marc’s script, and I got {19, 16}. I have to say, this surprised me – I was expecting results more like Marc’s. (I tested on an iMac with Fusion Drive.)

But if I get the list using my FileManagerLib, it takes well under 1 second. So by definition, the time spent talking to hardware is a minute fraction overall. I suppose the Finder could be doing thousands of separate disk hits.

But I wonder if it’s some combination of OS version and APFS vs HFS.

Oooh! No. My apologies to you and KniazidisR. I was getting my wires crossed a little above when I mentioned replies being helpful to the OP.

While I feel that initial responses should be geared directly towards helping the OP — which includes offering code they’re likely to be able to understand given their apparent current knowledge of AppleScript (judging from what they post) — it’s perfectly reasonable for anyone who’s interested to go on to discuss improvements, other methods, and matters arising once the initial help’s been given. This is exactly what happened above and I’m sorry if I gave the impression that exploring alternative ways to achieve the same end was off-topic.

Nigel–thanks for the information. Sometimes there’s a fine line between hijacking a thread and adding information that could be helpful, and it was good to read your thoughts on this.

Hi all, I have tried some of your scripts for copying every 10th file from a folder containing around 40,000 files. They keep giving me the error “AppleEvent timed out”. Any suggestions?
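
One general thing to try with the Finder-based scripts above is a with timeout block, since AppleScript by default gives an application about two minutes to answer each command. The sketch below wraps the earlier duplicate loop. The one-hour limit is an arbitrary assumption, and I can't say whether the Finder will cope well with 40,000 files even with the longer limit.

```applescript
set theFolder to (choose folder)

with timeout of 3600 seconds -- assumed limit of one hour; adjust as needed
	tell application "Finder"
		set theFiles to files of theFolder
		set savedFiles to (make new folder at desktop with properties {name:"Result"})
		repeat with i from 10 to (count theFiles) by 10
			duplicate item i of theFiles to savedFiles
		end repeat
	end tell
end timeout
```

The timeout applies to each command sent inside the block, so the slow step it is most likely to help with is getting the file list.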

I successfully tested the following script on my Sonoma computer with 5,000 files. I don’t know if it will work with 40,000 files. Please note:

  • The script gets all files in the source folder but not its subfolders.

  • Package files are skipped.

  • Before getting every tenth file, the source files are sorted by name in the same manner as the Finder.

  • If an existing file is found in the target folder, the user is notified and the file is skipped.

-- revised 2024.03.17

use framework "Foundation"
use scripting additions

on main()
	set sourceFolder to POSIX path of (choose folder with prompt "Select the source folder")
	set sourceFolder to current application's |NSURL|'s fileURLWithPath:sourceFolder
	set targetFolder to POSIX path of (choose folder with prompt "Select the destination folder")
	set targetFolder to current application's |NSURL|'s fileURLWithPath:targetFolder
	set sourceFiles to getSourceFiles(sourceFolder)
	copyFiles(sourceFiles, targetFolder)
end main

on getSourceFiles(theFolder)
	set fileManager to current application's NSFileManager's defaultManager()
	set theKey to current application's NSURLIsRegularFileKey -- does not include packages
	set theFiles to (fileManager's contentsOfDirectoryAtURL:theFolder includingPropertiesForKeys:{theKey} options:4 |error|:(missing value))'s mutableCopy() -- option 4 skips hidden files; mutableCopy() guarantees the array can be edited and sorted in place
	repeat with i from theFiles's |count|() to 1 by -1
		set anItem to (theFiles's objectAtIndex:(i - 1))
		set {theResult, regularFile} to (anItem's getResourceValue:(reference) forKey:theKey |error|:(missing value))
		if regularFile as boolean is false then (theFiles's removeObject:anItem)
	end repeat
	set theDescriptor to current application's NSSortDescriptor's sortDescriptorWithKey:"lastPathComponent" ascending:true selector:"localizedStandardCompare:" -- sorts in the same manner as the Finder
	(theFiles's sortUsingDescriptors:{theDescriptor})
	return theFiles
end getSourceFiles

on copyFiles(sourceFiles, targetFolder)
	set fileManager to current application's NSFileManager's defaultManager()
	set fileCount to sourceFiles's |count|()
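	repeat with i from 10 to fileCount by 10 -- items 10, 20, 30, … as in the earlier scripts (use "from 1" to start with the first file instead)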
		set aFile to (sourceFiles's objectAtIndex:(i - 1))
		set fileName to aFile's lastPathComponent()
		set targetFile to (targetFolder's URLByAppendingPathComponent:fileName)
		set {theResult, theError} to (fileManager's copyItemAtURL:aFile toURL:targetFile |error|:(reference))
		if theResult is false then display alert "File " & quote & (fileName as text) & quote & " could not be copied" message "This often occurs when a file with this name already exists in the destination folder" buttons {"Cancel", "Skip"} cancel button 1
	end repeat
end copyFiles

main()