Tuesday, April 13, 2021

#1 2020-12-02 03:53:22 am

MrCee
Member
Registered: 2016-07-01
Posts: 12

Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

I’m hoping someone can offer some ASOC examples of how I could go about ordering an NSArray that contains multiple key/value pairs, in which one of these key/value pairs contains duplicates.

I’m visualizing this scenario as a ‘table’ with 2 columns: {theID, theLocation}.
I’d like to de-dupe rows based on {theLocation}, but only after all rows are in sort order. I expect to build 2 arrays in the process: the first retains the first alphanumeric {theID} and its corresponding {theLocation}; the second holds the remaining {theID, theLocation} rows, which are the duplicates ready for deletion.

It's not for lack of trying with NSMutableOrderedSet: I’m not finding many ASOC examples on the net that help me understand how to de-duplicate ‘tables’ and then reference the item properties/elements of the rows left over for the deletion set.

Here’s what I have to set things up, using Music.app/iTunes.app as an example…

Applescript:

use framework "Foundation"

tell application "Music"
   set theID to current application's NSArray's arrayWithArray:(get persistent ID of every file track of library playlist 1)
   set theLocation to current application's NSArray's arrayWithArray:(get location of every file track of library playlist 1)
end tell

set myArrayWithDuplicateLocation to current application's NSMutableArray's alloc's init()
repeat with i from 1 to count of theID
   set myData to {theID:theID's objectAtIndex:(i - 1), theLocation:theLocation's objectAtIndex:(i - 1)}
   (myArrayWithDuplicateLocation's addObject:myData)
end repeat

Results from myArrayWithDuplicateLocation:

(NSArray) {
    {
        theID:"EAC86C7395B85A98",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/Kings%20Of%20Tomorrow%20-%20Finally%20(Danny%20Krivit,%20Steve%20Travolta%20Re-edit).aif
    },
    {
        theID:"7863B30CD98CE67F",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/Kings%20Of%20Tomorrow%20-%20Finally%20(Danny%20Krivit,%20Steve%20Travolta%20Re-edit).mp3
    },
    {
        theID:"3711FDC9DAB25514",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/Kings%20Of%20Tomorrow%20-%20Finally%20(Danny%20Krivit,%20Steve%20Travolta%20Re-edit).aif
    },
    {
        theID:"6BED2D9585D0FE7E",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/01%20Show%20Me%20Love.mp3
    },
    {
        theID:"E42CCAFA2A7CE4A2",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/01%20Show%20Me%20Love.aif
    },
    {
        theID:"071EE671801F822F",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/1-02%20-%20Sho-Nuff%20-%20It's%20Alright.mp3
    },
    {
        theID:"DCC4A1D20E443786",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/1-01%20-%20Sho-Nuff%20-%20Tonite.mp3
    },
    {
        theID:"A42D90507D48062F",
        theLocation:(NSURL) file:///Users/USER/Music/TEST/1-01%20-%20Sho-Nuff%20-%20Tonite.mp3
    }
}

Last edited by MrCee (2020-12-02 05:12:03 am)


#2 2020-12-02 04:05:45 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6613

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

It's not clear to me what you mean by "de-duplicating". What are you trying to achieve?


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com


#3 2020-12-02 04:31:21 am

MrCee
Member
Registered: 2016-07-01
Posts: 12

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

Apologies, this would be clearer if I explained it with a table example...

In the result of myArrayWithDuplicateLocation you will see library items that have duplicate locations, which means more than one {theID} points at the same {theLocation}, as in the following 2 scenarios...

EXAMPLE 1: theLocation:(NSURL) file:///Users/USER/Music/TEST/Kings%20Of%20Tomorrow%20-%20Finally%20(Danny%20Krivit,%20Steve%20Travolta%20Re-edit).aif
has 2 IDs: EAC86C7395B85A98 & 3711FDC9DAB25514. One of these IDs needs to be deleted.

EXAMPLE 2: theLocation:(NSURL) file:///Users/USER/Music/TEST/1-01%20-%20Sho-Nuff%20-%20Tonite.mp3
has 2 IDs: DCC4A1D20E443786 & A42D90507D48062F. One of these IDs needs to be deleted.

At this stage I don't know which of the IDs sharing a location was created first, so I would like to know how to prioritise the IDs in alphanumeric order: retain the first one in the library and delete any further IDs that share the same location and therefore reference the same file.

Last edited by MrCee (2020-12-02 04:39:54 am)


#4 2020-12-02 05:49:13 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6613

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

If the IDs were in creation order, you could use:

Applescript:

-- Locations become the keys, so duplicate locations collapse to one entry each
set theDict to current application's NSDictionary's dictionaryWithObjects:theID forKeys:theLocation
set theID to theDict's allValues()
set theLocation to theDict's allKeys()

If you wanted to use the IDs based on sort order, you could do something like this:

Applescript:

-- Sample data: five IDs, with some locations repeated
set theID to current application's NSArray's arrayWithArray:{"A", "B", "C", "D", "E"}
set theLocation to current application's NSArray's arrayWithArray:{2, 1, 1, 3, 2}
-- IDs as keys, locations as values
set theDict to current application's NSDictionary's dictionaryWithObjects:theLocation forKeys:theID
-- IDs ordered by their location value, then reversed
set theID to (theDict's keysSortedByValueUsingSelector:"compare:")'s reverseObjectEnumerator()'s allObjects()
-- Locations sorted and reversed the same way, so both arrays line up
set theLocation to (theDict's allValues()'s sortedArrayUsingSelector:"compare:")'s reverseObjectEnumerator()'s allObjects()
-- Locations as keys: duplicate locations collapse to a single entry
set theDict to current application's NSDictionary's dictionaryWithObjects:theID forKeys:theLocation
set theID to theDict's allValues()
set theLocation to theDict's allKeys()
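
The "keep the lowest ID per location, collect the rest for deletion" idea can also be sketched with an explicit sort followed by a set of already-seen locations. This is only a minimal sketch on made-up sample rows: the names theRows, seenLocations, keepRows and deleteRows are hypothetical, and the locations are plain strings rather than the NSURLs that Music returns.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical sample rows standing in for the library: "A" and "C" share a location
set theRows to current application's NSArray's arrayWithArray:{{theID:"C", theLocation:"/Music/a.mp3"}, {theID:"A", theLocation:"/Music/a.mp3"}, {theID:"B", theLocation:"/Music/b.mp3"}}

-- Sort the rows by ID so the lowest ID for each location is met first
set theDesc to current application's NSSortDescriptor's sortDescriptorWithKey:"theID" ascending:true
set sortedRows to theRows's sortedArrayUsingDescriptors:{theDesc}

set seenLocations to current application's NSMutableSet's |set|()
set keepRows to current application's NSMutableArray's array()
set deleteRows to current application's NSMutableArray's array()
repeat with i from 0 to (sortedRows's |count|()) - 1
	set aRow to (sortedRows's objectAtIndex:i)
	set aLocation to (aRow's objectForKey:"theLocation")
	if (seenLocations's containsObject:aLocation) as boolean then
		-- Location already claimed by a lower ID: mark this row for deletion
		(deleteRows's addObject:aRow)
	else
		(seenLocations's addObject:aLocation)
		(keepRows's addObject:aRow)
	end if
end repeat

set keepIDs to (keepRows's valueForKey:"theID") as list
set deleteIDs to (deleteRows's valueForKey:"theID") as list
```

With this sample data keepIDs holds "A" and "B" and deleteIDs holds "C". The loop makes the tie-breaking rule explicit, at the cost of one bridge call per row, so it is likely to be slower than the dictionary approach on a large library.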



#5 2020-12-04 03:13:53 am

MrCee
Member
Registered: 2016-07-01
Posts: 12

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

This is a great example of how to leverage NSDictionary’s unique keys while ignoring the values… a now blindingly obvious way of de-duplicating, or efficiently removing, the values associated with each duplicated key.

Previously I had been using theID as key and attempting to de-duplicate theLocation as value in an array. Now I see that it was as simple as swapping this around: theLocation as key and theID as value in an NSDictionary.

I’ll be coming back to theID based on sort order. I haven’t seen many examples of keysSortedByValueUsingSelector, so I’ll definitely be using this later. Thank you.

For now, this is the most optimised version I know how to write at this stage. It uses NSSet/NSMutableSet to find the delta between the 2 sets, which gives a list of IDs to be deleted. If anyone has any optimization tips, please let me know.

Applescript:

use framework "Foundation"
tell application "Music"
   set theID to current application's NSArray's arrayWithArray:(get persistent ID of every file track of library playlist 1)
   set theLocation to current application's NSArray's arrayWithArray:(get location of every file track of library playlist 1)
end tell

set theDict to current application's NSDictionary's dictionaryWithObjects:theID forKeys:theLocation
set myUniqueLocationID to current application's NSSet's setWithArray:(theDict's allValues())
set myDuplicateLocationID to current application's NSMutableSet's setWithArray:theID
myDuplicateLocationID's minusSet:myUniqueLocationID --the delta, remaining ID's for deletion
set myDuplicateLocationID to myDuplicateLocationID's allObjects() as list
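
To see the set difference on data that does not need a Music library, here is a small sketch with made-up values. The IDs, paths and the variable names keptIDs, duplicateIDs and duplicateList are hypothetical. Which of the IDs sharing a location survives in the dictionary is not something this sketch relies on.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical sample data: IDs "A" and "C" point at the same file
set theID to current application's NSArray's arrayWithArray:{"A", "B", "C"}
set theLocation to current application's NSArray's arrayWithArray:{"/Music/a.mp3", "/Music/b.mp3", "/Music/a.mp3"}

-- One dictionary entry per unique location
set theDict to current application's NSDictionary's dictionaryWithObjects:theID forKeys:theLocation
set keptIDs to current application's NSSet's setWithArray:(theDict's allValues())

-- All IDs minus the kept IDs leaves the duplicates
set duplicateIDs to current application's NSMutableSet's setWithArray:theID
duplicateIDs's minusSet:keptIDs
set duplicateList to duplicateIDs's allObjects() as list
```

duplicateList ends up with a single ID, either "A" or "C", depending on which value the dictionary retained for the shared location.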

As I familiarize myself with NSDictionary, I do have one more question. How can I return a list of values (theID) by querying the key (theLocation) in a way that lets me search by partial string matches using ‘CONTAINS’? I’ve been looking at keysOfEntriesPassingTest but don’t understand it, and I haven’t seen it translated to ASOC anywhere so far. I have also noted that theLocation is an NSURL, so this may have thwarted my recent attempts.

I’m sure I’m missing something simple here. In summary, I’m looking for something like this: set aTestDict to theDict's valueForKey:("self CONTAINS 'TEST' or self CONTAINS 'Kings' or self CONTAINS 'Show Me Love' ")

Last edited by MrCee (2020-12-04 03:20:54 am)


#6 2020-12-04 04:51:52 am

Shane Stanley
Member
From: Australia
Registered: 2002-12-07
Posts: 6613

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

Predicates and valueForKey: or valueForKeyPath: work on arrays. Methods using blocks like keysOfEntriesPassingTest: aren't available to ASObjC (and wouldn't gain you a lot in this case anyway).
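
Since predicates work on arrays, one way to get at the dictionary is to filter its array of keys and then look up the values for the keys that matched. This is a rough sketch with made-up data: the IDs and paths are hypothetical, and the keys are path strings rather than NSURLs (with NSURL keys the predicate would presumably need to test a key such as path instead of self, which is an assumption and untested here).

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical dictionary: path strings as keys, IDs as values
set theKeys to {"/Users/USER/Music/TEST/a.mp3", "/Users/USER/Music/Other/b.mp3", "/Users/USER/Music/Kings.aif"}
set theDict to current application's NSDictionary's dictionaryWithObjects:{"ID1", "ID2", "ID3"} forKeys:theKeys

-- Filter the keys with a predicate, then fetch the values for the matching keys
set thePred to current application's NSPredicate's predicateWithFormat:"self CONTAINS 'TEST' OR self CONTAINS 'Kings'"
set matchingKeys to (theDict's allKeys())'s filteredArrayUsingPredicate:thePred
set matchingIDs to theDict's objectsForKeys:matchingKeys notFoundMarker:(current application's NSNull's |null|())

-- Dictionary key order is not defined, so sort before converting to a list
set matchingIDList to (matchingIDs's sortedArrayUsingSelector:"compare:") as list
```

Here matchingIDList holds "ID1" and "ID3".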



#7 2020-12-14 04:38:30 am

MrCee
Member
Registered: 2016-07-01
Posts: 12

Re: Sort & de-duplicate 1st array to find 2nd array of deletion duplicates

Thanks, Shane. It seems I got lost in the documentation. I'm back on track and incorporating everything I’ve learned so far.
My process: create an NSDictionary for efficient de-duplication of keys to remove redundant values, use the NSDictionary values to build a ‘delta NSSet’, then use that ‘delta NSSet’ in a predicate, together with further partial-string matches, across the full NSArray of library items. The final result is a unique deletion list. Sharing the code in case anyone finds it useful on their ASOC journey…

Applescript:

use framework "Foundation"
tell application "Music"
   set theID to current application's NSArray's arrayWithArray:(get persistent ID of every file track of library playlist 1)
   set theLocation to current application's NSArray's arrayWithArray:(get location of every file track of library playlist 1)
end tell

set myLocationDict to current application's NSDictionary's dictionaryWithObjects:theID forKeys:theLocation
set myUniqueLocationID to current application's NSSet's setWithArray:(myLocationDict's allValues())
set myDuplicateLocationID to current application's NSMutableSet's setWithArray:theID
myDuplicateLocationID's minusSet:myUniqueLocationID

set thePosixPath to (theLocation's valueForKey:"path")
set myPosixArray to my makeArrayQuickly(theID, thePosixPath)
set thePred to current application's NSPredicate's predicateWithFormat_("thePosixPath = nil OR NOT thePosixPath CONTAINS '/Music/' OR thePosixPath CONTAINS '/.Trash/' OR theID IN %@", myDuplicateLocationID)
set myDeletionArray to (myPosixArray's filteredArrayUsingPredicate:thePred)

on makeArrayQuickly(theID, thePosixPath)
   set theID to theID as list
   set thePosixPath to thePosixPath as list
   script o
       property oID : theID's items
       property oPosixPath : thePosixPath's items
       property oResult : {}
   end script
   repeat with i from 1 to (count o's oID)
       set end of o's oResult to {theID:item i of o's oID, thePosixPath:item i of o's oPosixPath}
   end repeat
   set theResult to current application's NSMutableArray's arrayWithArray:(o's oResult)
   return theResult
end makeArrayQuickly

Although I’m content with the time it takes to complete for 15K items, I still keep coming back to makeArrayQuickly(). Using script object properties is amazingly fast, but it feels like a ‘workaround’, and I hope I’m not missing something simple when it comes to array creation. We can create an NSDictionary in a split second, but apparently not an NSArray in the format shown below without a repeat loop. Or is there another way?

(NSArray) {
    {
        theID:"3F80FAB1127481EF",
        thePosixPath:"/Users/USER/.Trash/01 - Downtown Shutdown (Eva Shaw Remix).mp3"
    },
    {
        theID:"42D67D14E4598F82",
        thePosixPath:"/Users/USER/Music/TEST/04 - Downtown Shutdown (The Revenge Dubstramental).mp3"
    },
    {
        theID:"04A543FB688D9C23",
        thePosixPath:"/Users/USER/Music/TEST/03 - Downtown Shutdown (The Revenge Remix).mp3"
    },
    {
        theID:"9D18E2D0D48445C8",
        thePosixPath:"/Users/USER/Music/TEST/1-01 - Sho-Nuff - Tonite.mp3"
    },
    {
        theID:"237E8FB89A0FBC17",
        thePosixPath:"/Users/USER/Music/TEST/1-01 - Sho-Nuff - Tonite.mp3"
    }
}

Below are 4 ways I have tested. I’m interested in learning the fastest way using NS values only. At the moment, the example that reads NS values only (#1 NSArray) is actually the slowest of them all.

Applescript:

use framework "Foundation"
tell application "Music"
   
   --Setup NS Arrays for testing comparison
   set theNSID to current application's NSArray's arrayWithArray:(get persistent ID of every file track of library playlist 1)
   set theNSLocation to current application's NSArray's arrayWithArray:(get location of every file track of library playlist 1)
   set theNSPosixPath to (theNSLocation's valueForKey:"path")
   
   --Setup Applescript lists for testing comparison
   set theASID to theNSID as list
   set theASPosixPath to theNSPosixPath as list
   
   
   --Test 2 methods of creating the same NSMutableDictionary
   
   log "creating NSDictionary from NS values" ---0.02 SECONDS
   set theNSDict to current application's NSMutableDictionary's dictionaryWithObjects:theNSPosixPath forKeys:theNSID
   log "finished dictionary from NS values"
   
   log "creating NSDictionary from Applescript List values" ---0.38 SECONDS
   set theASDict to current application's NSMutableDictionary's dictionaryWithObjects:theASPosixPath forKeys:theASID
   log "finished dictionary from Applescript List values"
   
   --Test 4 methods of creating the same NSMutableArray
   
   --#1 NSArray : 21 SECONDS
   log "#1 NSArray : create NSMutableArray from NS values"
   set myArrayWithNSvalues to current application's NSMutableArray's alloc's init()
   repeat with i from 1 to theNSID's |count|()
       (myArrayWithNSvalues's addObject:{theNSID:theNSID's objectAtIndex:(i - 1), theNSPosixPath:theNSPosixPath's objectAtIndex:(i - 1)})
   end repeat
   log "finished NSMutableArray from NS values"
   
   --#2 NSArray : 17 SECONDS
   log "#2 NSArray : create vanilla list from applescript list values - then convert to NSMutableArray"
   set MyListWithASListValues to {}
   repeat with i from 1 to count of theASID
       set end of MyListWithASListValues to {theID:item i of theASID, thePosixPath:item i of theASPosixPath}
   end repeat
   set myConvertedFromListArrayWithApplescriptLists to current application's NSMutableArray's arrayWithArray:MyListWithASListValues
   log "finished NSMutableArray from applescript list values - then convert to NSMutableArray"
   
   --#3 NSArray : 11 SECONDS
   log "#3 NSArray : create NSMutableArray from applescript list values" ---
   set myArrayWithApplescriptLists to current application's NSMutableArray's alloc's init()
   repeat with i from 1 to count of theASID
       (myArrayWithApplescriptLists's addObject:{theID:item i of theASID, thePosixPath:item i of theASPosixPath})
   end repeat
   log "finished NSMutableArray from applescript list values"
   
   --#4 NSArray : 0.01 SECONDS
   log "#4 NSArray : create NSMutableArray from script object properties" ---0.01 SECONDS
   script o
       property oID : theASID's items
       property oPosixPath : theASPosixPath's items
       property oResult : {}
   end script
   -- The loop sits outside the script body: statements inside a script object only execute when the script is run
   repeat with i from 1 to count of o's oID
       set end of o's oResult to {theID:item i of o's oID, thePosixPath:item i of o's oPosixPath}
   end repeat
   set myArrayWithScriptObjectProperties to current application's NSMutableArray's arrayWithArray:(o's oResult)
   log "finished NSMutableArray from script object properties"
   
end tell

Let me know if anyone has any further suggestions to try.
Thanks
