Wednesday, September 30, 2020

#1 2020-08-10 03:18:35 pm

jolinwarren
Member
From:: Edinburgh, Scotland
Registered: 2010-04-02
Posts: 15
Website

Get record position in list of records, or quickly search for record

I've read through many posts on records and quickly searching big lists or finding the position/index of an item in a long list. However, I can't figure out a good way to do what I need to do (without just using a repeat loop to step through the whole list of records).

I have a list of records such as this:

Applescript:


{num:8808, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

The num property can be used to match a file (e.g. DSCF8808.jpg), and when I'm processing that file, I need to access the record. So I need a quick way of locating the record so I can then access the other properties. In an ideal world, I would be able to do the following:

Applescript:


set vPhotoDetails to vPhotoDetails & {{num:8808, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}}

set vFileIndex to hGetFileIndex("DSCF8808.jpg") --> vFileIndex = 8808

set vPhotoRecord to (item of vPhotoDetails whose num = vFileIndex) --> vPhotoRecord = {num:8808, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

Now, I know the last line is not valid AppleScript, which is my whole problem! Is there an elegant way to do this in a handler without resorting to:

Applescript:


on hGetRecordIndex(vRecords, vFileIndex)
   set vRecordNum to 1
   repeat with vRecord in vRecords
       if (num of vRecord) = (vFileIndex as integer) then return vRecordNum
       set vRecordNum to vRecordNum + 1
   end repeat
   
   return null
end hGetRecordIndex

My concern is that this handler will be quite slow if vRecords is a large list. Any advice much appreciated!
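
The hGetFileIndex handler called above is not shown in the post. As a minimal sketch only, assuming file names like DSCF8808.jpg (no dots before the extension, and no digits in the base name other than the photo number), it might look something like this. The body is a guess at what such a helper could do, not the poster's actual code:

```applescript
-- Hypothetical helper: assumes names like "DSCF8808.jpg", where the only
-- digits before the extension are the photo number.
set vFileIndex to hGetFileIndex("DSCF8808.jpg") --> 8808

on hGetFileIndex(vFileName)
	-- Drop the extension so digits in it (e.g. "mp4") are ignored
	set vOldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "."
	set vBaseName to text item 1 of vFileName
	set AppleScript's text item delimiters to vOldDelims
	
	-- Collect the digits of the base name in order
	set vDigits to ""
	repeat with vCharRef in (characters of vBaseName)
		set vChar to contents of vCharRef
		if vChar is in "0123456789" then set vDigits to vDigits & vChar
	end repeat
	
	if vDigits is "" then return missing value -- no number in the name
	return vDigits as integer
end hGetFileIndex
```

Returning an integer here means the result can be compared directly with the num property of each record.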

Last edited by jolinwarren (2020-08-10 03:21:53 pm)


Filed under: Search, records


#2 2020-08-10 05:54:07 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6462

Re: Get record position in list of records, or quickly search for record

There's no shortcut in AppleScript. However, AppleScriptObjC provides a solution:

Applescript:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

set theArray to current application's NSArray's arrayWithArray:vPhotoDetails
set thePred to current application's NSPredicate's predicateWithFormat:"num = %@" argumentArray:{8808}
set theResult to (theArray's filteredArrayUsingPredicate:thePred)'s firstObject() as record -- assuming only one match


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com


#3 2020-08-10 06:16:54 pm

Marc Anthony
Member
From:: Dallas, TX
Registered: 2006-04-27
Posts: 964

Re: Get record position in list of records, or quickly search for record

Hi. Assuming you have a list that's consecutively ordered (by the num record), you can just call it by index. If the list is nonconsecutive, it may be more efficient to repopulate—forcing it to be consecutive—or to loop with a reference; this will depend on the span between any gaps in the series.

Applescript:

set thing to {{num:1, something:"something", whatever:"something else"}, {num:2, something:"something2", whatever:"something else2"}, {num:8809, something:"something8809", whatever:"something else8809"}, {num:8810, something:"something8810", whatever:"something else8810"}}

#1) Call by index
thing's item 2 --or item 8809, if there are that many items

#2) Loop until criteria
set {counter, findThis} to {1, 8809}
repeat until my thing's item counter's num is findThis
   set counter to counter + 1
end repeat
thing's item counter's num


#4 2020-08-10 08:35:16 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

jolinwarren wrote:

My concern is that this handler will be quite slow if vRecords is a large list. Any advice much appreciated!



A repeat loop made faster with a reference-to operator or script object seems a reasonably quick way to accomplish what the OP wants. To quantify this, I created a list containing 1000 items with the match record being item 901. I then ran the loop suggested by the OP both with and without a reference-to operator. The results were 9 milliseconds with the reference-to operator and 463 milliseconds without.

Applescript:


-- untimed code
set oneRecord to {num:1, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

set vRecords to {}

repeat 1000 times
   copy oneRecord to end of vRecords
   set num of oneRecord to ((num of oneRecord) + 1)
end repeat

-- timed code
set vFileIndex to 901

getRecordIndex(a reference to vRecords, vFileIndex)

on getRecordIndex(vRecords, vFileIndex)
   
   set vRecordNum to 1
   repeat with vRecord in vRecords
       if (num of vRecord) = vFileIndex then return vRecordNum
       set vRecordNum to vRecordNum + 1
   end repeat
   
end getRecordIndex

Last edited by peavine (2020-08-13 03:48:34 pm)


2018 Mac mini - macOS Catalina


#5 2020-08-11 07:34:43 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

I thought I'd also write a script utilizing a script object. It's a few milliseconds faster than the script using a reference-to operator.

Applescript:


-- untimed code
set oneRecord to {num:1, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

set vRecords to {}

repeat 1000 times
   copy oneRecord to end of vRecords
   set num of oneRecord to ((num of oneRecord) + 1)
end repeat

-- timed code
set vFileIndex to 901

getRecordIndex(vRecords, vFileIndex)

on getRecordIndex(vRecords, vFileIndex)
   
   script o
       property vRecordsRef : vRecords
   end script
   
   set vRecordNum to 1
   repeat with vRecord in o's vRecordsRef
       if (num of vRecord) = vFileIndex then return vRecordNum
       set vRecordNum to vRecordNum + 1
   end repeat
   
end getRecordIndex

Last edited by peavine (2020-08-13 03:55:24 pm)


2018 Mac mini - macOS Catalina


#6 2020-08-13 10:06:00 am

jolinwarren
Member
From:: Edinburgh, Scotland
Registered: 2010-04-02
Posts: 15
Website

Re: Get record position in list of records, or quickly search for record

Thanks all for the replies, this is all really helpful!

Shane Stanley wrote:

There's no shortcut in AppleScript. However, AppleScriptObjC provides a solution:



I've never dipped into ASObjC because of my lack of knowledge of Objective-C, so this is very helpful. I would never have been able to come up with this on my own! I've not had a chance to work on my script since my original post, but I'll probably try using this method first. Am I correct that the use AppleScript version line is to ensure a minimum of version 2.5 (i.e. it will work with version 2.7 which I'm currently using on Mac OS 10.14 Mojave)?

peavine wrote:

A repeat loop made faster with a reference-to operator or script object seems a reasonably quick way to accomplish what the OP wants. To quantify this, I created a list containing 1000 items with the match record being item 901. I then ran the loop suggested by the OP both with and without a reference-to operator. The results were 5 milliseconds with the reference-to operator and 25 milliseconds without.



This is really helpful, as the loop is a lot quicker than I thought it would be. The reference-to operator is a good idea which I hadn't thought of. Thanks for doing the timings, I haven't figured out how to do that, so it's useful to have some data on this. I'm still considering using your solution of putting the records in a script object and looping around. Though it might be slower than the ASObjC solution Shane provided, it looks like it will be plenty fast enough for my purposes (I can't see my records numbering more than 10k-20k) and would be a bit more readable for me in the future (given my lack of Objective-C knowledge).

One related question I have is why is it faster to put the reference to the records in a script object. When I was looking up fast find routines in posts on this forum, I saw the use of script objects (for instance, this post by Nigel Garvey). I've not used script objects before, and I don't understand why putting the reference to the records in a script object would speed up the execution. What is the relationship of a script object to the rest of the script?

Marc Anthony wrote:

Assuming you have a list that's consecutively ordered (by the num record), you can just call it by index. If the list is nonconsecutive, it may be more efficient to repopulate—forcing it to be consecutive—or to loop with a reference



This is actually how the current version of my script works (which is now almost 9 years old), but it was written for data exports from a slightly different database, where the data was grouped in ‘rolls’ of only a few dozen records with consecutive numbering, and the numbering of the records always started around zero (generally -2, -1, 0, 1, or 2). So it was easy to initialise an offset based on the first record and then just call the record by index from the filename. I was initially hoping I could adapt the code for my new data exports. However, with my current setup, it's one ungrouped database with all records, gaps between numbers, and the first record (and file) could be 8088. So repopulating the list with empty records from 1 to 8087 would be a lot less efficient than just looping through the existing records looking for the right num.
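
The offset approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original script: the sample roll, the handler name hGetRecordByOffset, and the guard clauses are all hypothetical.

```applescript
-- Hypothetical sample data: one 'roll' with consecutive nums starting at -1
set vRoll to {{num:-1, caption:"A"}, {num:0, caption:"B"}, {num:1, caption:"C"}, {num:2, caption:"D"}}

-- Offset that maps a num onto its list position (first record -> item 1)
set vOffset to 1 - (num of item 1 of vRoll)

set vPhotoRecord to hGetRecordByOffset(vRoll, vOffset, 1)
caption of vPhotoRecord --> "C"

on hGetRecordByOffset(vRecords, vOffset, vFileIndex)
	set vPosition to vFileIndex + vOffset
	-- Guard against nums that fall outside the roll
	if vPosition < 1 or vPosition > (count vRecords) then return missing value
	set vRecord to item vPosition of vRecords
	-- Guard against gaps: the record found must carry the expected num
	if (num of vRecord) is not vFileIndex then return missing value
	return vRecord
end hGetRecordByOffset
```

The second guard is what fails once the numbering has gaps, which is why this shortcut does not carry over to the ungrouped database.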

Thanks for all the input, great to have a couple of options that will work!


#7 2020-08-13 10:52:01 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

jolinwarren wrote:

Thanks for doing the timings, I haven't figured out how to do that, so it's useful to have some data on this.



jolinwarren. I'm glad you've received some helpful suggestions. FWIW, I have included below the script I used for the timing tests--in this case it tests the script-object script.

Applescript:


use framework "Foundation"
use scripting additions

set decimalPlaces to 3

-- untimed code
set oneRecord to {num:1, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

set vRecords to {}

repeat 1000 times
   copy oneRecord to end of vRecords
   set num of oneRecord to ((num of oneRecord) + 1)
end repeat

-- start time
set startTime to current application's CFAbsoluteTimeGetCurrent()

-- timed code
set vFileIndex to 901

set theIndexNumber to getRecordIndex(vRecords, vFileIndex)

on getRecordIndex(vRecords, vFileIndex)
   
   script o
       property vRecordsRef : vRecords
   end script
   
   set vRecordNum to 1
   repeat with vRecord in o's vRecordsRef
       if (num of vRecord) = vFileIndex then return vRecordNum
       set vRecordNum to vRecordNum + 1
   end repeat
   
end getRecordIndex

-- elapsed time
set elapsedTime to (current application's CFAbsoluteTimeGetCurrent()) - startTime
set nf to current application's NSNumberFormatter's new()
nf's setFormat:("0." & (text 1 thru decimalPlaces of "00000"))
set elapsedTime to ((nf's stringFromNumber:elapsedTime) as text) & " seconds"

-- result
elapsedTime --> 6 milliseconds on first run and 2 milliseconds on rerun
-- count vRecords --> 1000
-- theIndexNumber --> 901

Last edited by peavine (2020-08-13 03:35:58 pm)


2018 Mac mini - macOS Catalina


#8 2020-08-13 04:30:34 pm

Marc Anthony
Member
From:: Dallas, TX
Registered: 2006-04-27
Posts: 964

Re: Get record position in list of records, or quickly search for record

jolinwarren wrote:

One related question I have is why is it faster to put the reference to the records in a script object.



Generally speaking, script objects, a reference to, and my are just hocus pocus that play on an AppleScript quirk. Using a script object seems to be a stylistic choice, but, in many scenarios, it ultimately won't be the fastest method—it tends to verbosity. My solution is at least twice as fast as the script object method; even when there are 10K list objects, it executes in significantly less than a second.


Applescript:

use framework "Foundation"
use scripting additions

set decimalPlaces to 3

-- untimed code
set oneRecord to {num:1, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

set vRecords to {}

repeat 1000 times
   copy oneRecord to end of my vRecords --*edit
   set num of oneRecord to ((num of oneRecord) + 1)
end repeat
-- start time
set startTime to current application's CFAbsoluteTimeGetCurrent()

-- timed code

#2) Loop until criteria
set {counter, findThis} to {1, 901}
repeat until my vRecords's item counter's num is findThis
   set counter to counter + 1
end repeat
set finality to vRecords's item counter

-- elapsed time
set elapsedTime to (current application's CFAbsoluteTimeGetCurrent()) - startTime
set nf to current application's NSNumberFormatter's new()
nf's setFormat:("0." & (text 1 thru decimalPlaces of "00000"))
set elapsedTime to ((nf's stringFromNumber:elapsedTime) as text) & " seconds"

-- result
elapsedTime --> about 1 millisecond

*Edited for clarity, test list generation improvement, and to update timing outcome.

Last edited by Marc Anthony (2020-08-13 04:57:46 pm)


#9 2020-08-14 03:31:19 am

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5285

Re: Get record position in list of records, or quickly search for record

Marc Anthony wrote:

My solution is at least twice as fast as the script object method; even when there are 10K list objects


Hi.

1. On my machine, peavine's timing script is consistently "twice as fast" as Marc's: ie. 0.001 seconds as opposed to 0.002.
2. Marc's script loops through my vRecords, which makes it essentially the script object method anyway. The script object in this case is the main script, not one in a handler.
3. peavine's script uses a repeat with … in … repeat with the script object list variable instead of an indexed repeat, which is interesting. Its timing appears to be essentially identical to Marc's, even with 10,000 items, although the initial set-up takes forever because the script copies the records to end of vRecords instead of to end of my vRecords.
4. When using a script object in a handler, I've always obtained the list's length by applying count to the handler's list parameter variable rather than to the script object property because this always used to be slightly faster. (My theory was that count is applied directly to the list rather than to its items or properties.) But experimenting with it again this morning, I'm finding that counting the parameter variable instead of the script object property nearly doubles the time taken for 10,000 items, raising it from 0.034 seconds to about 0.060! I'll be changing my ways from now on and counting lists as script properties! ;)


NG


#10 2020-08-14 10:16:36 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

Thanks Nigel for the post.

I'm just learning the ins-and-outs of speed-enhancement techniques for large lists and wanted to decide on one before placing it in my notebook. So, I reran the tests with 10000 records and with the matching record set at 9901. The results--which only include enough code to identify the approach employed--were:

Applescript:


-- no speed enhancement - 299 seconds on first run
getRecordIndex(vRecords, vFileIndex)
on getRecordIndex(vRecords, vFileIndex)
   set vRecordNum to 1
   repeat with vRecord in vRecords
       if (num of vRecord) = vFileIndex then return vRecordNum
       set vRecordNum to vRecordNum + 1
   end repeat
end getRecordIndex

-- modify above with a-reference-to operator - 0.51 seconds on first run
getRecordIndex(a reference to vRecords, vFileIndex)

-- script object one - 0.44 seconds on first run
repeat with vRecord in o's vRecordsRef

-- script object two - 0.71 seconds on first run
repeat with i from 1 to (count vRecords)

-- script object three - 0.42 seconds on first run
-- this appears to confirm Nigel's point 4
repeat with i from 1 to (count o's vRecordsRef)

I also ran Marc Anthony's script as written, changing only the number of records to 10000 and the matching record to 9901. The timing result on first run was 0.42 seconds.


2018 Mac mini - macOS Catalina


#11 2020-08-16 04:07:17 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

jolinwarren wrote:

One related question I have is why is it faster to put the reference to the records in a script object. When I was looking up fast find routines in posts on this forum, I saw the use of script objects (for instance, this post by Nigel Garvey). I've not used script objects before, and I don't understand why putting the reference to the records in a script object would speed up the execution. What is the relationship of a script object to the rest of the script?



I think Nigel answers this question in the following:

Access to the items or properties of a list is faster if you use a reference to identify the list rather than a simple variable. Assigning the list to a property of a script object allows you to use a reference expression like 'BL's Big_List' rather than just, say, 'Big_List'. The script object itself doesn't make access faster, it simply allows a reference expression to be written into the script. This is particularly useful inside a handler, where the list variable may be local and thus unreferenceable.



https://www.macscripter.net/viewtopic.php?pid=60390
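
To make the quoted point concrete, here is a minimal sketch of a handler that holds its list parameter in a script object property, so the loop can use a reference expression (o's vRecordsRef) even though vRecords itself is local. The sample data, the handler name hGetRecordByNum, and the choice to return missing value when nothing matches are assumptions made for illustration:

```applescript
-- Hypothetical sample data with gaps in the numbering
set vPhotoDetails to {{num:8806, caption:"First"}, {num:8808, caption:"Middle of nowhere"}, {num:8811, caption:"Last"}}

set vPhotoRecord to hGetRecordByNum(vPhotoDetails, 8808)
if vPhotoRecord is missing value then
	"No matching record"
else
	caption of vPhotoRecord --> "Middle of nowhere"
end if

on hGetRecordByNum(vRecords, vFileIndex)
	-- vRecords is local to the handler, so 'my vRecords' is not available here.
	-- Holding the list in a script object property allows a reference form instead.
	script o
		property vRecordsRef : vRecords
	end script
	
	-- Count and index through the script object property
	repeat with i from 1 to (count o's vRecordsRef)
		if (num of item i of o's vRecordsRef) is vFileIndex then return item i of o's vRecordsRef
	end repeat
	
	return missing value -- no record has this num
end hGetRecordByNum
```

Counting o's vRecordsRef rather than the parameter variable follows Nigel's point 4 earlier in the thread.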

BTW, there does seem to be a significant speed advantage to the use of what Nigel refers to above as the script-object method as compared with the simple use of a reference-to operator. I don't know the reason for this but it's probably worth bearing in mind when working with very large lists.

Last edited by peavine (2020-08-16 10:55:22 pm)


2018 Mac mini - macOS Catalina


#12 2020-08-23 09:04:59 am

jolinwarren
Member
From:: Edinburgh, Scotland
Registered: 2010-04-02
Posts: 15
Website

Re: Get record position in list of records, or quickly search for record

Thanks for all this follow-up. I think I'm starting to get my head around the different methods and their implications, so the code examples, timings, and explanations from everyone are hugely helpful! I hadn't previously seen/digested Nigel's explanation on why a script object can speed things up, but that makes sense, and the whole thing is a lot clearer to me.

peavine, thank you for the comprehensive timing results. It looks like I should go with either the "script object three" or Marc's method. They are both plenty fast enough for my purposes. If I ever end up with a database that has significantly more than 10k rows, I will revisit this thread and consider the ASObjC solution. peavine, thanks also for the timing code, that will be useful for future development.


#13 2020-08-24 03:05:19 am

jolinwarren
Member
From:: Edinburgh, Scotland
Registered: 2010-04-02
Posts: 15
Website

Re: Get record position in list of records, or quickly search for record

As a further follow-up for those who are interested, I am now using a modified version of Marc's approach #2. The script isn't finished yet, but I think this part of it won't change further.

The one issue with Marc's code was that if none of the records contained the number being searched for, after getting to the end of the list of records AppleScript will (quite rightly) throw an error. After rewriting the loop, I also noticed that if I passed it a number that wasn't in any of the records, the timing was significantly faster (thanks again peavine for the timing code, it's so useful!). This led me to realise that comparing num in each record to the number I was searching for was less of a performance hit than:

Applescript:

set vPhotoRecord to vPhotoDetails's item vRecNum

Using my newly gained understanding from all the helpful people in this thread, I used my to use a reference to vPhotoDetails instead, and the timing on first run with 10000 records and searching for 9991 is now 0.0245 seconds!

Applescript:


use framework "Foundation"
use scripting additions

set decimalPlaces to 4

-- untimed code
set oneRecord to {num:1, datetime:"2020-07-24T14:32:00", caption:"Middle of nowhere", lat:"57.59631", lon:"-13.68732", notes:"Photographer: Jane Smith", lens:"Fujinon Super EBC 23mm"}

set vPhotoDetails to {}

repeat 10000 times
   copy oneRecord to end of my vPhotoDetails
   set num of oneRecord to ((num of oneRecord) + 1)
end repeat

-- start timer
set startTime to current application's CFAbsoluteTimeGetCurrent()

-- timed code

set {vRecNum, vFileIndex} to {1, 9991}
set vPhotoRecord to false

repeat with vRecNum from 1 to (count my vPhotoDetails)
   if my vPhotoDetails's item vRecNum's num is vFileIndex then
       set vPhotoRecord to my vPhotoDetails's item vRecNum
       exit repeat
   end if
end repeat

-- elapsed time
set elapsedTime to (current application's CFAbsoluteTimeGetCurrent()) - startTime
set nf to current application's NSNumberFormatter's new()
nf's setFormat:("0." & (text 1 thru decimalPlaces of "00000"))
set elapsedTime to ((nf's stringFromNumber:elapsedTime) as text) & " seconds"

-- result
elapsedTime --> 0.0245 seconds

Even with 100,000 records and searching for 99,991, the timing for the loop is only 0.1657 seconds, so this scales well. I doubt I will ever get to 100,000 records in any case, but that's probably around when I would rewrite this section in ASObjC as suggested by Shane.

Thanks again to everyone. I'm now going through other existing code to optimise some of the loops. :D


#14 2020-08-24 07:31:45 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 620

Re: Get record position in list of records, or quickly search for record

jolinwarren. Sounds like you've made great progress.

Just as a point of credit, the timing script I use is based on one written by Nigel and the elapsed-time code was written by Shane. I did some fine-tuning, though.


2018 Mac mini - macOS Catalina


#15 2020-08-24 04:39:50 pm

Marc Anthony
Member
From:: Dallas, TX
Registered: 2006-04-27
Posts: 964

Re: Get record position in list of records, or quickly search for record

jolinwarren wrote:

...100,000 records...that's probably around when I would rewrite this section in ASObjC...



Out of curiosity, I tested the ASObjC snippet with 10K objects, and it's quite the laggard method at ~.385 seconds, which is a little surprising.
