Position of an item in a list

brian_donovan · October 29, 2012, 5:50pm

Hi, hope someone can help… I am trying to get the number or position of an item in a list stored in a variable. This is my script:

set X to {"Albert", "John", "Mike", "Bill"}
set y to get the position in X of "John"

The answer I would expect is 2, but this does not work… any idea will help!

Thank you for your time!

Brian

StefanK · October 29, 2012, 6:30pm

Hi,

There is no index-of-item-in-list function in AppleScript; you need a repeat loop:


set X to {"Albert", "John", "Mike", "Bill"}
set foundIndex to indexOfItemInList("John", X)
if foundIndex = -1 then display dialog "Item not found"

on indexOfItemInList(theItem, theList)
	repeat with i from 1 to count theList
		if item i of theList is theItem then return i
	end repeat
	return -1
end indexOfItemInList

brian_donovan · October 29, 2012, 7:20pm

Thank you! I did that but my list has 1,000 items! I am hoping to find a simpler solution!

StefanK · October 29, 2012, 7:31pm

This is the simplest solution.

If the list is ordered, the lookup could be sped up with a binary search.
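
A minimal sketch of what such a binary search could look like, assuming the list is already sorted in ascending order as AppleScript's own < operator sees it. The handler name binarySearchIndex and its variable names are made up for illustration and are not from this thread. It returns -1 when the item is missing, the same convention as indexOfItemInList above.

```applescript
-- Hypothetical handler (not from the thread): classic binary search.
-- Assumes sortedList is in ascending order according to AppleScript's < operator.
on binarySearchIndex(theItem, sortedList)
	script o
		property lst : sortedList -- script property for faster item access
	end script
	set lowIndex to 1
	set highIndex to (count o's lst)
	repeat while lowIndex <= highIndex
		set midIndex to (lowIndex + highIndex) div 2
		set candidate to item midIndex of o's lst
		if candidate = theItem then
			return midIndex
		else if candidate < theItem then
			set lowIndex to midIndex + 1 -- target can only be in the upper half
		else
			set highIndex to midIndex - 1 -- target can only be in the lower half
		end if
	end repeat
	return -1 -- not found
end binarySearchIndex

set X to {"Albert", "Bill", "John", "Mike"} -- must already be sorted
binarySearchIndex("John", X)
--> 3
```

Each pass halves the remaining range, so a 1,000-item list needs at most about ten comparisons instead of up to 1,000. On an unsorted list the result is not reliable.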

McUsr · October 29, 2012, 7:46pm

Hello!

I have never tried this on a list with 1000 items, but I think it will do well, as long as your list items are unique.


set ml to {"Alpha", "Beta", "Delta", "Epsilon"}

set tpos to indexOfItem("Alpha", ml)

log tpos
--> 1


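on indexOfItem(theItem, itemsList) -- credit to Emmanuel Levy
	local rs
	-- Join the list into one text, one item per line, with a return at each end
	set text item delimiters to return
	set itemsList to return & itemsList & return
	set text item delimiters to {""}
	try
		-- Cut the text at the return that precedes the match, then count the lines
		set rs to -1 + (count (paragraphs of (text 1 thru (offset of (return & theItem & return) in itemsList) of itemsList)))
	on error
		return 0 -- item not found: offset gave 0, so "text 1 thru 0" failed
	end try
	rs
end indexOfItem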

McUsr · October 29, 2012, 8:31pm

Hello!

I was curious as to how the indexOfItem handler performs with a list of 1000 items, so I got myself a list of the first 1000 words in Webster’s.

That the words are sorted shouldn’t really matter, given how the handler works.

I tried it a couple of times in Smile, timed with chrono, and it took 4 thousandths of a second to look up item number 865 in the list.

I call that respectable!

regulus6633 · October 29, 2012, 9:45pm

This is a subroutine for this. I got it from this website several years ago. It should be about the fastest you will find for a large list…

on binaryListItemIndex(indexItem, aList)
	script o
		property lst : aList
	end script
	set indexItemAsList to {indexItem}
	set L to 1
	set r to (count o's lst)
	considering case
		if (indexItemAsList is in aList) then
			repeat until (indexItem is item L of o's lst)
				set L to L + 1
				set m to (L + r) div 2
				if (indexItemAsList is in items L thru m of o's lst) then
					set r to m - 1
				else
					set L to m + 1
				end if
			end repeat
			return L
		else
			return 0
		end if
	end considering
end binaryListItemIndex

McUsr · October 29, 2012, 10:00pm

Hello!

One learns something every day. Keeping the list sorted, the binaryListItemIndex was marginally faster (around 0.0005 seconds on average on my machine).

I then randomized the list with the handler below, which I think I snagged from this site, and then tried to compare.

It didn’t turn out quite as I expected, as now the binaryListItemIndex was around 0.002 faster!

But my machine is busy at the moment, so I don’t trust much else in those results; I reran the test and got different numbers. I do trust that indexOfItem is slightly slower when the list is sorted.

It also amazes me that binaryListItemIndex is that fast, as it considers case. I removed considering case from some of my handlers for speed, but I may have gotten that wrong. (Not timed enough.)

on randomizeList(theList)
	set randomList to {}
	repeat with i from 1 to (count of theList)
		set randomItem to some item of theList -- get a random item from the list
		set itemIndex to indexOfItem(randomItem, theList) -- get the index number of the random item
		if (count of theList) > 1 then -- remove the item number from the list
			if itemIndex = theList's length then
				set theList to items 1 thru -2 of theList
			else if itemIndex = 1 then
				set theList to rest of theList
			else
				set theList to items 1 thru (itemIndex - 1) of theList & items (itemIndex + 1) thru -1 of theList
			end if
			set end of randomList to randomItem
		else
			set end of randomList to (item 1 of theList)
			exit repeat
		end if
	end repeat
	return randomList
end randomizeList

Edit

Looking a little closer at the handler, I do believe it!

DJ_Bazzie_Wazzie · October 29, 2012, 11:15pm

This is why AppleScript is such a peculiar and odd language, at least to me.

Nigel_Garvey · October 30, 2012, 12:04am

Considering case is faster than ignoring it (or certainly used to be) when comparing texts, because “ignoring” has to do more work to decide whether or not different characters are equivalent. When considering case, characters are either identical or not.

Ironically, if you know you’re comparing texts containing caseless characters such as numbers and punctuation, ‘considering case’ is the way to go for speed!
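
A minimal sketch of the two comparison modes, with variable names invented for illustration. AppleScript ignores case by default when comparing text, so the comparison has to be wrapped in a considering case block to get the exact-match behaviour described above.

```applescript
-- Default behaviour: case is ignored when comparing text.
set defaultResult to ("John" is "john")

-- Inside the block, characters must match exactly.
considering case
	set strictResult to ("John" is "john")
end considering

{defaultResult, strictResult}
--> {true, false}
```

Note that this changes which items count as a match, so it is only a free speed-up when the texts are known to use consistent case or caseless characters.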

McUsr · October 30, 2012, 12:55am

I thought so too for a while, until I tried to make the indexOfItem handler faster by adding a considering case clause. It appears not to help much with text items. Or I didn’t test it extensively enough!

The reference trick didn’t help inside it either, when counting paragraphs, for that matter. So it appears that different rules apply to handlers that work with text item delimiters and paragraphs. (Which I find such an elegant solution to lookup problems!)

But for regular list lookup and regular text/string comparison, I think those tricks are good of course.

The binaryListItemIndex points quite clearly in that direction.

DJ_Bazzie_Wazzie · October 30, 2012, 10:52am

Nothing in AppleScript is logical, so I wouldn’t be surprised if considering case were slower. You, as a developer with wide experience, would also know that it’s odd for TIDs to be the faster option in this case. In many programming languages the solution Stefan posted would be the fastest, logically.

Here is a sped-up version of Stefan’s script:

set X to {"Albert", "John", "Mike", "Bill"}
set foundIndex to indexOfItemInList("John", X)

on indexOfItemInList(theItem, theList)
	script s
		property l : theList
	end script
	set o to 0
	repeat with i in s's l
		set o to o + 1
		if contents of i = theItem then return o
	end repeat
	return -1
end indexOfItemInList

Because even my oldest machine is an i7, I can’t really be sure it’s fast enough on slower hardware, but a list of 25,000 items is still fast.

McUsr · October 30, 2012, 11:24am

Hello!

The handler DJ Bazzie Wazzie came up with is just a tad slower than the one I brought to the table when the list is sorted. (The item I search for is at the end.)

It performs of course better when the item is randomized, and the item is at the end.

Those tests are not real tests; real ones would have to be much more elaborate.

It seems to me though that text item delimiters also benefit from sorted lists!!!

But it still seems that text item delimiters give a more stable lookup time with regard to the item’s placement in the worst case, though never quite reaching the best-case times of the other approaches.

Now, if it was meant not to consider case in the first place, then you would have to perform some extra operations afterwards, slowing it down. This is not Unix; AS was made to be simple for its users in the first place. I still believe that for regular string searching and comparison, considering case kicks in and makes the operations faster.

I like AS as a language, but nowadays Objective-C is hotter for me (HOT HOT HOT). A clean cut between Java and C++. And it actually seems that Go is becoming very similar to Objective-C and C#.

We live in a world with Prolog, Lisp, and many other alternatives to Algol-based languages. I find that AS is a sexy little language; dealing with objects and text item delimiters and all, it covers what I would have used Smalltalk for quite beautifully, and probably better than Smalltalk itself.

Interapplication communication and Apple Events RULE!

StefanK · October 30, 2012, 11:35am

However, in the case of an ordered list, a binary search is much faster and more logical.

Comparing AppleScript with (other) programming languages is a bit unfair because AppleScript is designed as a scripting language, not a full-fledged programming language.

Shane_Stanley · October 30, 2012, 12:01pm

But there is logic to this – considering case means the underlying code doesn’t need to deal with the whole case lookup business – it’s a one-to-one match for each character.

In Cocoa, it means using isEqualToString:, rather than one of the compare:-type methods – and there’s a fraction more speed there because isEqualToString: is so common that it’s implemented as a function.

DJ_Bazzie_Wazzie · October 30, 2012, 12:23pm

You should know that my handler works with all types of objects, not only strings.

No, as you can see in the second sentence, I meant that using TIDs has overhead and is still faster than simply iterating through an array. That means the normal if statements and/or repeat loops have more overhead than splitting a text. In case you don’t know, AS was written at a time when the OS programming language went from Pascal to C. When we went to Mac OS X there were some significant performance differences; mostly the code ran better in Mac OS 9 than in Mac OS X. The code in the background is still C today, and therefore I look at the overhead from a C perspective (and maybe because I’m a C developer myself). So even from a programmer’s perspective or a syntax perspective, I think using TIDs is not very logical. So if AS lives by its own rules in its own world, why would AS be logical with considering blocks? I know and understand Nigel’s comments but I would.

Even in modern languages (JS, AS3 and PHP are relatively new) you don’t coerce an array to text, split it and count the returns to know the position of a string. No, you simply iterate through the array, and those are simple scripting languages as well. They do it better…

I thought everything is Unix; I don’t know what you mean by that, or what Unix has to do with this, other than that AS and BSD are both written in C. If we’re talking about AS in depth, we’re not talking about syntax. The syntax, or scripting component, is built on top of it and was used later, and it had similarities with Smalltalk. The beauty of this was that you could use JavaScript as a syntax, but also many other syntaxes. Too bad it stayed only with AS (there is a beta for JavaScript). So if you chose the JavaScript syntax, the interpreter would still execute TIDs faster than iterating, while with the same code in V8 iterating would be a hundred times faster. It’s just a part of OSA, not the AS syntax, so I don’t understand why you bring this up anyway.

I’ve never been pro Objective-C, from the start. When you have developed in C++ for years and go to Objective-C, it’s a step back. C++ is not only faster but also has a much larger community, many more libraries and frameworks, a nicer syntax, and is easier to distribute. Therefore, when my applications need performance, I choose to write Objective-C++, doing all the heavy work in C/C++ and leaving the interface elements to Objective-C code. This way I can keep the speed up in my applications and can also create the presentation layer in C# for Windows platforms while using the same business and data layers (C++) on Windows.

The whole reason for improving a programming language is to make it better, isn’t it? Otherwise AS was a failure like D.

I compare it with V8, PHP or AS3

I know what you mean, Shane: in the background, the string case comparison

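#include <wchar.h>
#include <wctype.h>

int wcscasecmp(const wchar_t *s1, const wchar_t *s2)
{
	wchar_t c1, c2;

	/* Lower-case both characters before every comparison. */
	for (; *s1; s1++, s2++) {
		c1 = towlower(*s1);
		c2 = towlower(*s2);
		if (c1 != c2)
			return ((int)c1 - c2);
	}
	return (-*s2); /* nonzero if s2 is longer than s1 */
}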
would be slower than a case-sensitive compare

int wcscmp(const wchar_t *s1, const wchar_t *s2)
{
	while (*s1 == *s2++)
		if (*s1++ == 0)
			return (0);
	return (*(const unsigned int *)s1 - *(const unsigned int *)--s2);
}

But my point was that with AS you don’t know which code is actually called or how it’s done. There is tons of overhead that we know is there, but we don’t know exactly what it looks like.

Nigel_Garvey · October 30, 2012, 12:27pm

It is in fact faster in straight text comparisons, as I’ve tested again for myself this morning. However, the difference is minute and, in the context of the index handlers, is swamped out by the larger vagaries of using an OSAX command (‘offset’) in McUsr’s offering and the list extractions and list comparisons in the binary handler (which looks to be one of mine from some time ago :D).

The algorithm of Stefan’s solution is ultimately the only way to do it. Low-level languages therefore do it that way. High-level languages like AppleScript can do it that way, but with the overhead of invoking low level code to analyse and execute each individual instruction. Where a high-level instruction does more, there’s more end-orientated low-level work per instruction and less to-ing and fro-ing, so that tends to be faster in the high-level language than imitating low-level code.

Stefan’s code, however, and the binary handler are more generally useful in AppleScript because they work even when the items in the list aren’t text or numbers.
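
To illustrate that point, here is a minimal sketch that reuses StefanK's indexOfItemInList handler from earlier in the thread on a list mixing a number, a sublist, text, and a boolean. The list contents and the variable names mixedList and listPosition are invented for the example.

```applescript
-- Handler copied from StefanK's post earlier in this thread.
on indexOfItemInList(theItem, theList)
	repeat with i from 1 to count theList
		if item i of theList is theItem then return i
	end repeat
	return -1
end indexOfItemInList

-- Hypothetical test data: not every item is text.
set mixedList to {42, {1, 2}, "Mike", true}
set listPosition to indexOfItemInList({1, 2}, mixedList)
--> 2
```

The plain equality test compares whole values, so the sublist {1, 2} is found as an item in its own right.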

DJ_Bazzie_Wazzie · October 30, 2012, 12:52pm

Exactly my point… even if you have written AS for several years, you still don’t know for sure what happens behind the closed doors of OSA.

McUsr · October 30, 2012, 12:54pm

There is really nothing to say against Nigel Garvey’s post, and this is just an amendment.

There is just one point here: when the data are such (have such constraints) that you can use text item delimiters, I am sure that treating lists of lists by paragraphs and text item delimiters will outperform anything else.

We have things here that handle the modulus math for sublists of up to 4 items (MacScripter / Fast handler for finding the index of sublists within a list). I haven’t bothered to take it further at the moment, as it is quite laborious to figure out the formulas for identifying which “row” was hit.

One thing that really speaks for text item delimiters based handlers, is that they are easy to “inline”, so that you can do some work while you are at it.

adayzdone · October 30, 2012, 1:29pm

I think that was from this thread:
http://macscripter.net/viewtopic.php?pid=75633#p75633