Remove Duplicates from a List?

Wow, that opened a can of worms I wasn’t expecting.

Thanks everyone, I’ve got a better understanding now, and using {braces} properly for my subroutine results in a faster and more elegant subroutine than my nested repeats.

Oddly, I now realize that many years ago I came across this problem and ended up, by checking my variable values, eventually figuring out to put something in a (seemingly) “extra” set of braces for a comparison, but never dug in to really figure out what was going on and just ran with it. That was probably 10+ years ago and I’d forgotten entirely until I read this thread.

I was checking my variable values along the way this time and saw that the duplicates weren’t being caught because one value was remaining a record while the other was being coerced to a list, but I couldn’t figure out why it was doing that, and it didn’t occur to me that I could fix it with a simple set of braces. I just figured that if I forced AppleScript to obtain both items in an identical manner, I would either avoid the coercion or force an identical coercion, so I wrote it that way and got the desired result.
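For anyone landing here later, a minimal sketch of why the “extra” braces matter in a containment test. It uses a list of lists rather than the records from the original script, and the variable names (seen, bare, wrapped) are made up for illustration. The same wrapping should apply when the items are records, since a bare record operand is coerced to a list before the test.

```applescript
-- Hypothetical data: each item of seen is itself a list
set seen to {{1, 2}, {3, 4}}

-- Without braces: asks whether the run of items 1, 2 appears among the items of seen
set bare to ({1, 2} is in seen) --> false

-- With braces: asks whether a one-item list holding {1, 2} appears in seen
set wrapped to ({{1, 2}} is in seen) --> true

{bare, wrapped} --> {false, true}
```

The point is that “is in” compares a list against a list, so the thing being looked for has to be presented as a single item of a list.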

What an odd language I write in. I completely understand the thought behind dynamic variable types to make things simpler for the user, and they’re great when they work as expected… but I can’t believe how often my “bug” is an unexpected variable type, and then it’s hidden from me… like having the user choose from a list of numbers, and it returns the resulting number as text… My scripts are full of

(((PathToFolder as text) & "/document name") as POSIX file)

and

if (userChoice as number) is someNumber

and, here’s a good one,

set the keywords of info to {tabChoice, (pathData as list as string), "true"}

Note that I’m forcing the value “true” to text, which is a long story. Sometimes making things simple makes them much more difficult.

There are probably better ways to do all these things; I usually just do the first thing I find that works. I know that can come back to bite me when I lack a deeper understanding of why it worked, because that means I also don’t understand when it’s not going to work. So thanks again for the deeper understanding on this one.

  • t.spoon.

Why

  1. do you have an inner script (foo)?
  2. is there no return clause?
  3. is there the line “foo’s okAddresses”? What does it do? What is its purpose?

I don’t think @julifos is still around. So, without meaning to step on any toes, I’ll briefly answer to prevent leaving you stranded:

The inner script object (foo) is an optimisation technique that leverages a quirk in AppleScript’s handling of list objects, for reasons pertaining to:

  • Speed: Accessing lists through a script object is significantly faster than conventional methods.
  • Structure: The script object declares two properties:
    • foo2: A direct reference to the input list l
    • okAddresses: An empty list that will be populated with items from l
  • Efficiency: Both reading from foo2 and writing to okAddresses can be done directly in memory, without the need to copy the data to and from the buffer, and without the need to evaluate any other items contained within the lists.
  • Implementation: This approach is particularly useful when dealing with large lists.
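As a rough illustration of the points above, here is a minimal sketch of the script-object technique. It is not the handler posted earlier in the thread: the handler name removeDuplicates is hypothetical, and only the names foo, foo2, okAddresses and l are taken from the discussion.

```applescript
-- Hypothetical handler name; a sketch of the technique, not the original code
on removeDuplicates(l)
	script foo
		property foo2 : l -- refers to the input list
		property okAddresses : {} -- collects the unique items
	end script
	
	repeat with i from 1 to (count l)
		-- Read through the script object rather than through l directly
		set thisItem to item i of foo's foo2
		-- Braces make the test "is this single item already in the list?"
		if {thisItem} is not in foo's okAddresses then
			set end of foo's okAddresses to thisItem
		end if
	end repeat
	
	foo's okAddresses -- last expression evaluated, so this is what the handler returns
end removeDuplicates

removeDuplicates({"a@a.e", "b@b.c", "d@a.h", "a@a.e"})
--> {"a@a.e", "b@b.c", "d@a.h"}
```

Any speed benefit is only likely to be noticeable on lists with many items; for a handful of addresses a plain loop over l would behave the same.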

AppleScript does have a return mechanism, but it’s often implicit. Here’s how it works:

  • By default, handlers and scripts return the result of their final executed command.
  • An explicit return statement can be used to:
    • Terminate execution early
    • Optionally, specify a value to return
  • Omitting return means all commands execute sequentially, with the last one’s result being returned.

The return clause isn’t always necessary because of this implicit behaviour, although it’s useful for control over script flow and can aid readability/clarity.
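A small sketch of the two styles side by side. The handler names (implicitSum, explicitSum) are invented for this example.

```applescript
-- Implicit: the result of the last expression evaluated is returned
on implicitSum(a, b)
	a + b
end implicitSum

-- Explicit: return can exit early and states the returned value outright
on explicitSum(a, b)
	if a is missing value then return 0 -- early exit
	return a + b
end explicitSum

{implicitSum(2, 3), explicitSum(2, 3), explicitSum(missing value, 3)}
--> {5, 5, 0}
```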

okAddresses is one of the properties in the foo script object. It points to the list that is constructed by the main body of the handler by populating it with items from the original list, l, such that no two items in okAddresses will have the same value.

The line serves as the handler’s implicit return statement. It’s the final command that gets executed, so its result is returned by the handler. It is functionally identical to:

return foo's okAddresses

When one assigns an array (your list of addresses) to an NSSet, all duplicates are automatically purged, eliminating any repeat loop overhead. If any list entries are empty, the following code also removes them. Then array y is produced from the NSSet and coerced back to a list.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property ca : current application

set y to ca's NSMutableArray's array()
set x to {"a@a.e", "b@b.c", "d@a.h", "a@a.e", ""}
set uniqSet to ca's NSSet's alloc()'s initWithArray:x
y's addObjectsFromArray:(uniqSet's allObjects())
y's removeObject:""

return y as list

Hi.

The order of the first instance of each item in the original list can be preserved by using an NSOrderedSet — or an NSMutableOrderedSet if you then want to remove further items:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property ca : current application

set x to {"a@a.e", "b@b.c", "d@a.h", "a@a.e", ""}
set uniqSet to ca's NSMutableOrderedSet's orderedSetWithArray:x
uniqSet's removeObject:""
set y to uniqSet's array()

return y as list

Thanks, Nigel.

An NSOrderedSet or, as in your corrected example, an NSMutableOrderedSet would be the right solution to efficiently retain the order of a mailing list without duplication.