Timing Results with Content Property in a Repeat Loop

peavine · April 29, 2023, 12:45pm

As written, the following script takes 2.256 seconds to run in Script Geek. If I enable the line with comment characters, the script takes 0.211 second to run. What accounts for this difference?

set aList to {"one", "two", "three"}
repeat 10 times
	set aList to aList & aList
end repeat

set newList to {}
repeat with anItem in aList
	-- set anItem to contents of anItem -- this is the line in question
	set end of newList to anItem
end repeat
return newList

wch1zpink · April 29, 2023, 4:33pm

The difference is that one version processes “the contents of the item”, which is only one actual item per loop, whereas the other version processes “the referenced item”, which is the actual item plus the entire list from which it was taken.

The following two images should show you.

KniazidisR · April 29, 2023, 4:37pm

In the first case, a “2-dimensional list” (a “list of lists”) is created. In the second, a 1-dimensional list is created. Naturally, building a 2-dimensional list in a repeat loop should take longer.

This does not depend on the execution environment, but on the fact that the two scripts perform different tasks.

set aList to {"one", "two", "three"}

set newList to {}
repeat with anItem in aList
	set end of newList to anItem
end repeat
return newList

--> {item 1 of {"one", "two", "three"}, item 2 of {"one", "two", "three"}, item 3 of {"one", "two", "three"}}

set aList to {"one", "two", "three"}

set newList to {}
repeat with anItem in aList
	set end of newList to contents of anItem
end repeat
return newList

--> {"one", "two", "three"}

KniazidisR · April 29, 2023, 5:01pm

The first script creates the following table of references in RAM:

reference to “one”, reference to “two”, reference to “three”
reference to “one”, reference to “two”, reference to “three”
reference to “one”, reference to “two”, reference to “three”

The second script creates the following table of references in RAM:

reference to “one”, reference to “two”, reference to “three”

peavine · April 30, 2023, 12:35am

Thanks @wch1zpink and @KniazidisR for the responses, which make perfect sense. I was surprised by the magnitude of the difference, and that made me wonder if something was wrong. I’ll keep this in mind when writing repeat loops with large lists.

Krioni · April 30, 2023, 8:05pm

It isn’t just speed that can be affected by this.
If you plan to use the list being built outside of a tell block, and the references to the items won’t work outside the tell block but the contents will, you want to put the contents of the item into your list, not the reference.

KniazidisR · May 1, 2023, 2:46am

Krioni wrote: “the references to the items won’t work outside the tell block”

No, they work. That is not the point. The point is that an excessive level of references (an address table in RAM) is being created. A regular reference in a list is a reference to an element, whereas in this case a reference to a reference to the element is created.

set aList to {"one", "two", "three"}

set newList to {}
repeat with anItem in aList
	set end of newList to anItem
end repeat

contents of items of newList --> {"one", "two", "three"}
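
As a side note on the practical difference between holding a reference and holding its contents, here is a minimal sketch. The variable names matchesByReference and matchesByValue are made up for illustration. It counts how often the loop variable compares equal to a string with and without the contents property. The usual expectation is that the comparison against the bare reference does not match, because the reference is not resolved for the comparison, but it is worth running on your own system rather than taking that on trust.

```applescript
set aList to {"one", "two", "three"}

-- Hypothetical counters, used only for this illustration
set matchesByReference to 0
set matchesByValue to 0

repeat with anItem in aList
	-- anItem is a reference (item n of aList), not the string itself
	if anItem is "one" then set matchesByReference to matchesByReference + 1
	-- contents of anItem resolves the reference to the underlying value
	if contents of anItem is "one" then set matchesByValue to matchesByValue + 1
end repeat

return {matchesByReference, matchesByValue}
```

If the two counters differ, that is a reason beyond speed to decide deliberately whether a list should store references or values.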

peavine · May 1, 2023, 3:06pm

I thought I should probably add a quick note for those forum members who are new to AppleScript. The following is functionally equivalent to my script in post 1 but with the commented line enabled, and this script takes the same amount of time to run (211 milliseconds).

set aList to {"one", "two", "three"}
repeat 10 times
	set aList to aList & aList
end repeat

set newList to {}
repeat with i from 1 to (count aList)
	set end of newList to item i of aList
end repeat
return newList

CJK · May 5, 2023, 9:32am

What Are You Actually Wanting To Measure?

Without a doubt, the script that contains the following line within the second repeat loop:

set anItem to contents of anItem

will execute slower than the same script with that line removed (or commented out), and the reason for this is more-or-less as @wch1zpink described, but the other way around: with the above line included, every item being iterated over is being dereferenced (evaluated), and this takes time and slows things down. Conversely, when that line is commented out, everything in the repeat loop is being handled by reference, so no unnecessary evaluations occur, which is going to be substantially faster. It’s the entire reason AppleScript collections are held as referenced objects.

This is a similar principle to the speed enhancement observed when iterating over items in a list housed within a script object (it helps prevent needless dereferencing/evaluation of list items).

What Are Your Timings Actually Measuring?

Clearly, what I’ve described is the complete opposite of what you’ve found to seemingly be the case in testing when timing these scripts. But this is where you have to be very careful about what it is you’re actually timing, and in the case of these scripts, the vast majority of the apparent execution time is dedicated solely to the final line:

return newList

Replace this line in all of the scripts with something like:

return the length of the newList

Time the scripts again, and you’ll get a truer reflection of what I think you are actually trying to get a measure of.

@wch1zpink mentioned the processing of referenced items, which he believed was happening within the repeat loop and adding to the execution time. In fact, the processing is happening in the return statement, and it’s purely in order to display the result for you. As @KniazidisR points out, if your list is still full of references, then printing these out will sometimes (but, by no means always) be more labour-intensive than either printing out a single reference to the entire collection, or printing out the fully-dereferenced list.

Fairer Testing

This wasn’t possible initially, as it was the differences in timings that you were questioning, so you wouldn’t have known where to focus your attention. But, in the end, the remit of these timed tests has been to compare the performance characteristics of dealing with list data when these data are handled by value versus by reference. This is achieved solely by the presence (by value) or absence (by reference) of this line:

set anItem to contents of anItem

Inadvertently, the final line:

return newList

added a separate, and additional, set of operations being timed, and these were different in one script compared to the other—that is, the seemingly identical return statement in each script ends up having to do different things, and ends up affecting the outcome drastically.

With the benefit of hindsight, you can run a fairer test of speed by ensuring, as far as conceivably possible, that the set of operations you wish to time represents the only functionally significant difference between the scripts, whilst aiming for the rest to be functionally equivalent (which will generally imply that the excess code will be similar, but it won’t always be identical).

I suggested earlier changing the last line of both scripts to:

return the length of the newList

But there’s no reason the scripts shouldn’t output the final list, provided either they both output a list of references (which is only possible for one script), or they both output the contents of the list. To do this, it’s sufficient to change the last line of the script handling items by reference, so that it dereferences the entire list. Note that this is usually the way you will want to approach things, because dereferencing all in one go will, again, be quicker than dereferencing item-by-item:

return the contents of the newList

The script that contains the line set anItem to contents of anItem (uncommented) can keep its return statement as is, since it has already dereferenced everything; or, if you would prefer both scripts to share as much code as possible, it can also use return the contents of the newList without impacting performance.

Results

[ AppleScript version:2.8, system version:12.6.3, CPU type:Intel x86-64h Haswell, CPU speed:2300MHz, physical memory:16384MB ]

Script A:

set anItem to contents of anItem
return newList
Execution time (approx.): 711ms

Script B:

(line removed: set anItem to contents of anItem)
return the contents of the newList
Execution time (approx.): 111ms

peavine · May 5, 2023, 2:07pm

Thanks @CJK for the post. A lot of great information.

I had understood that the reason why “collections are held as referenced objects” is to save memory and/or increase speed, and that’s one reason I posed my original question. I reran my timing results, which confirm what you say.

set aList to {"one", "two", "three"}
repeat 10 times
	set aList to aList & aList
end repeat

set newList to {}
repeat with anItem in aList
	-- set anItem to contents of anItem -- test line 1
	set end of newList to anItem
end repeat
-- return newList -- test line 2
return the length of the newList

-- result as written is 4 milliseconds
-- result with test line 1 enabled is 214 milliseconds
-- result with test line 2 enabled is 2.278 seconds
-- result with test line 1 and test line 2 enabled is 211 milliseconds

I also tested your suggestion to dereference the entire list all at once, and the timing result was 5 milliseconds:

set aList to {"one", "two", "three"}
repeat 10 times
	set aList to aList & aList
end repeat

set newList to {}
repeat with anItem in aList
	set end of newList to anItem
end repeat
return contents of newList

My computer is an M2 Mac mini with 16GB of memory running Ventura 13.3.1.
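
For anyone without a timing utility such as Script Geek, the separation discussed above between time spent in the loop and time spent displaying the result can be approximated inside the script itself. The following is only a sketch under stated assumptions: it uses AppleScriptObjC with Foundation's NSDate as the clock (an assumption on my part, not what was used for the timings in this thread), and the names startTime, stopTime, listLength and loopSeconds are made up for illustration.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

set aList to {"one", "two", "three"}
repeat 10 times
	set aList to aList & aList
end repeat

-- Start the clock just before the loop under test
set startTime to current application's NSDate's timeIntervalSinceReferenceDate()

set newList to {}
repeat with anItem in aList
	-- set anItem to contents of anItem -- enable to time per-item dereferencing
	set end of newList to anItem
end repeat

-- Stop the clock before anything is returned or displayed
set stopTime to current application's NSDate's timeIntervalSinceReferenceDate()

-- Return only small values so that displaying the result adds little overhead
return {listLength:(length of newList), loopSeconds:(stopTime - startTime)}
```

Because the clock stops before the return statement, whatever the script editor does to render the result is excluded from loopSeconds. Absolute numbers will differ from those reported by an external timer, so compare variants only against each other.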