Friday, July 30, 2010

#1 2005-06-10 01:33:12 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4250

Simple List Reduction?

I have a string of unseparated characters, "abcdefg", and I'd like to test it for an embedded pair, say "cd". If I find "cd", I want to remove it, leaving "abefg".

One way to go is to find the offset of "cd" if it occurs and then rebuild the string as shown below.

I know this fails if the getit character (or group) isn't found, or if it starts at the beginning or includes the end of the original string. I could fix that with some tests, but the whole shebang seems to involve too much testing for what should be a simple task. Is there a better way to approach this?

Applescript:

set myString to "abcdef"
set len to length of myString
set getit to "cd"
set getnum to length of getit
set n to offset of getit in myString
set newString to ((characters 1 thru (n - 1) of myString) & (characters (n + getnum) through len of myString)) as string
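
For reference, the tests mentioned above could be folded into a handler along these lines. This is only a sketch under stated assumptions: the handler name removeFirst is hypothetical (it does not appear in the thread), and it uses text ranges rather than character lists.

```applescript
-- Hypothetical handler (not from the original post): remove the first
-- occurrence of getit from myString, guarding the three failure cases.
on removeFirst(getit, myString)
	set n to offset of getit in myString
	if n = 0 then return myString -- not found: nothing to remove
	set len to length of myString
	set getnum to length of getit
	set head to ""
	set tail to ""
	-- only take a leading part if the match does not start the string
	if n > 1 then set head to text 1 thru (n - 1) of myString
	-- only take a trailing part if the match does not end the string
	if (n + getnum) <= len then set tail to text (n + getnum) thru len of myString
	return head & tail
end removeFirst

removeFirst("cd", "abcdef")
--> "abef"
```

The same handler returns the input unchanged when the target is absent, and copes with a match at either end of the string.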

Last edited by NovaScotian (2005-06-10 01:34:26 pm)


Scripts are tested on a PowerMac dual-core G5/2.3 running OS X 10.5.8 or MacBook Pro Intel Core 2 Duo running OS X 10.6.4


#2 2005-06-10 01:44:38 pm

dant
Member
Registered: 2003-08-29
Posts: 175

Re: Simple List Reduction?

I think this does it:

Applescript:

set myString to "abcdef"
set targetString to "cd"
-- quoted form protects myString from interpretation by the shell
set newString to do shell script "echo " & quoted form of myString & " | sed 's/" & targetString & "//'"

- Dan
--As I sed, the stream editor is invaluable.


#3 2005-06-10 01:50:42 pm

dkmarsh
Member
Registered: 2005-04-06
Posts: 50

Re: Simple List Reduction?

Applescript:

set myString to "cdabcdefcdghcdijklmcd"
set getit to "cd"
set AppleScript's text item delimiters to getit
set keptChars to text items of myString
set AppleScript's text item delimiters to ""
set newString to keptChars as string


#4 2005-06-10 05:09:47 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4250

Re: Simple List Reduction?

Thank you both for these different views. In this instance, at least, the "sed" version is slightly faster, but the text item delimiters method is more transparent.



#5 2005-06-10 08:15:02 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Simple List Reduction?

NovaScotian wrote:

Thank you both for these different views. In this instance, at least, the "sed" version is slightly faster, but the text item delimiters method is more transparent.

That comparison looked a bit curious to me, since external calls (to an application, scripting addition or the shell) usually involve a hit of some kind - whereas text item delimiters are native to AppleScript. So I carried out a number of tests here, based on slightly simplified versions of each suggestion.

The results, though varying slightly from one run to another, consistently showed the TID method well ahead - at least 140 times faster than the shell version. (Of course, if huge files were being processed, the results might well be different again.) While in practical terms, the impact of these differences may be relatively small, I thought it might be worth mentioning...

One other difference possibly worth pointing out: I'm not sure whether you wanted to remove all cases of the search string or just the first one. If it matters, you may like to note that the methods (as posted) don't produce quite the same results. :)
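
To make that difference concrete, here is a small side-by-side sketch. The sample string and the variable names firstOnly, allGone and oldTIDs are illustrative assumptions, and quoted form is added so the shell sees the string literally.

```applescript
set myString to "cdabcdefcd"
set targetString to "cd"

-- sed without the g flag removes only the first match
set firstOnly to do shell script "echo " & quoted form of myString & " | sed 's/" & targetString & "//'"

-- text item delimiters split on every match, so all of them are removed
set oldTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to targetString
set keptChars to text items of myString
set AppleScript's text item delimiters to ""
set allGone to keptChars as string
set AppleScript's text item delimiters to oldTIDs

{firstOnly, allGone}
--> {"abcdefcd", "abef"}
```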


kai


#6 2005-06-10 08:57:19 pm

Bruce Phillips
Administrator
Registered: 2004-07-15
Posts: 2647

Re: Simple List Reduction?

kai wrote:

I'm not sure whether you wanted to remove all cases of the search string or just the first one.

NovaScotian, if you need to remove all cases instead of just the first one, then add a "g" to the command dant posted.

Applescript:

set newString to do shell script "echo " & quoted form of myString & " | sed 's/" & targetString & "//g'"


#7 2005-06-10 09:44:07 pm

dant
Member
Registered: 2003-08-29
Posts: 175

Re: Simple List Reduction?

As soon as I saw dkmarsh's post I knew that was a better solution, and that's what I would have suggested had I thought of it. However, I believe mine has a significant advantage in that I could use the "As I sed" pun whereas no such advantage occurs with the TID solution. :P

- Dan


#8 2005-06-11 06:51:21 am

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4250

Re: Simple List Reduction?

guardian34 wrote:

kai wrote:

I'm not sure whether you wanted to remove all cases of the search string or just the first one.

NovaScotian, if you need to remove all cases instead of just the first one, then add a "g" to the command dant posted.

Applescript:

set newString to do shell script "echo " & quoted form of myString & " | sed 's/" & targetString & "//g'"

No, I don't want to remove all cases; I want to remove them one at a time because I have to account for them. I must confess that I used the sed version more because I wanted to learn about sed than because the AppleScript text item delimiters method had anything wrong with it.

I had tried to get to dkmarsh's solution but got the syntax slightly wrong so I couldn't get my version to work properly. I was missing this clean phrase in dkm's script (chasing characters or words instead of "text items"): "set keptChars to text items of myString"
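
As a small illustration of why "text items" is the phrase that matters here (the variable names parts, chars and oldTIDs are assumptions for this sketch): characters always splits a string into single letters, whereas text items splits it wherever the current delimiter occurs.

```applescript
set myString to "abcdef"

set oldTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to "cd"
-- text items: the pieces of the string between each delimiter
set parts to text items of myString
set AppleScript's text item delimiters to oldTIDs

-- characters: one item per letter, regardless of the delimiters
set chars to characters of myString

{parts, chars}
--> {{"ab", "ef"}, {"a", "b", "c", "d", "e", "f"}}
```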



#9 2005-06-11 06:59:59 am

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4250

Re: Simple List Reduction?

An auxiliary question: How do you get accurate timing for a script, Kai?

Last edited by NovaScotian (2005-06-11 07:00:31 am)



#10 2005-06-11 07:44:57 am

John M
Member
Registered: 2003-07-14
Posts: 384

Re: Simple List Reduction?

NovaScotian wrote:

An auxiliary question: How do you get accurate timing for a script

There is an OSAX to get milliseconds since system startup. See http://osaxen.com/files/getmillisec1.0.1.html

Best wishes

John M


#11 2005-06-11 11:46:01 am

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Simple List Reduction?

dant wrote:

I believe mine has a significant advantage in that I could use the "As I sed" pun whereas no such advantage occurs with the TID solution.

Very good point, Dan. While I wantid to come up with an equally witty pun, the challenge proved impossible... ;)

NovaScotian wrote:

No, I don't want to remove all cases, I want to remove them one at a time because I have to account for them.

Fair enough, NovaScotian - in which case the TID version of Dan's suggestion might look something like:

Applescript:

to cutFirstCase of s from t
   set d to text item delimiters
   set text item delimiters to s
   tell t to if (count text items) > 1 then set t to text item 1 & text from text item 2 to -1
   set text item delimiters to d
   t
end cutFirstCase

cutFirstCase of "cd" from "cdabcdefcdghcdijklmcd"
--> "abcdefcdghcdijklmcd"

In case they might help, here are a couple of additional variations:

Applescript:

to countAndCutCases of s from t
   set d to text item delimiters
   set text item delimiters to s
   set t to t's text items
   set c to (count t) - 1
   set text item delimiters to ""
   set t to t as string
   set text item delimiters to d
   {c, t}
end countAndCutCases

countAndCutCases of "cd" from "cdabcdefcdghcdijklmcd"
--> {5, "abefghijklm"}

Applescript:

on indexList of s at t
   set l to {}
   set d to text item delimiters
   set text item delimiters to s
   repeat with n from 1 to (count t's text items) - 1
       set l's end to (count t's text 1 thru text item n) + 1
   end repeat
   set text item delimiters to d
   l
end indexList

indexList of "cd" at "cdabcdefcdghcdijklmcd"
--> {1, 5, 9, 13, 20}

NovaScotian wrote:

An auxiliary question: How do you get accurate timing for a script, Kai?

Funnily enough, I considered including a brief description of the test script used - but wondered if that might perhaps be introducing too much noise. However, since you ask...

Some folks apparently use 'current date' to time scripts - although, since that gives results to the nearest second, it's really a bit of a blunt instrument. For greater accuracy and convenience, consider using something more precise, such as Jon's commands[1], GetMilliSec[2], Precision Timing Osax[3] or Smile's 'chrono' [4].

[1] http://osaxen.com/files/jonscommands2.1.2.html
[2] http://osaxen.com/files/getmillisec1.0.1.html
[3] http://osaxen.com/files/precisiontiming1.0.html
[4] http://www.satimage.fr/software/en/index.html

To compare the performance of very fast routines, I usually place them inside a loop that repeats several (hundred/thousand) times (enough to achieve a clear and consistent difference). Since a run can sometimes throw up spurious results, I also run each test a number of times to establish a distinct pattern.

In addition, a certain amount of latency can occur during testing. This may, for example, extend the timing slightly on the first script run. One way around this is to reverse the order in which the scripts are run, and then to average out all the results.

To help focus on the essential differences between routines, it's also a good idea to place any statements common to both (such as those initialising shared variables) outside the timed loops.

Having said all that, it may not be worth getting too hung up about minor timing differences - especially where a routine may be used only once within a script. (On the other hand, a script that iterates through hundreds or thousands of repeated operations may well benefit from some optimisation.) Remember, too, that performance can vary substantially from one machine to another - so it's prudent to use caution when quoting any comparisons.

One final word of caution: Some third party scripting additions can enable certain operations and coercions that are not possible on a 'vanilla' system. To avoid confusion, it's not a bad idea, after testing, to disable those that you don't use regularly.

Anyway - here's an example, based on earlier suggestions in this thread, using 'the ticks' from Jon's commands:

Applescript:

set n to 100 (* number of repeats *)

set t to "cdabcdefcdghcdijklmcd"
set s to "cd"

set t1 to the ticks
repeat n times
   
   set text item delimiters to s
   set r to t's text items
   set text item delimiters to ""
   r as string
   
end repeat
set t2 to the ticks
repeat n times
   
   do shell script "echo " & quoted form of t & " | sed 's/" & s & "//g'"
   
end repeat
set t3 to the ticks
{tids:t2 - t1, shell:t3 - t2}

My apologies for this rather lengthy reply... :)


kai


#12 2005-06-11 02:08:24 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4250

Re: Simple List Reduction?

On the contrary; thanks for the lengthy reply. An example of one remaining problem is shown below. For a whole variety of combinations and functions to be repeated, Jon's ticks and GetMilliSec give entirely different answers that do not seem to be factors of one another. Interesting to use ticks as the "load" on GetMilliSec and vice versa and then swap. The answers are about the same either way; tick: 14-15, gms: 239 - 240. Beats me what's being measured here.

Applescript:

set n to 5000
set t1 to the ticks
repeat n times
   factorial(20)
end repeat
set t2 to the ticks

set g1 to GetMilliSec
repeat n times
   factorial(20)
end repeat
set g2 to GetMilliSec

{tick:t2 - t1, gms:g2 - g1}

on factorial(n)
   if n > 0 then
       return n * (factorial(n - 1))
   else
       return 1
   end if
end factorial

-- {tick:43, gms:690.0}



#13 2005-06-11 03:24:23 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Simple List Reduction?

NovaScotian wrote:

On the contrary; thanks for the lengthy reply. An example of one remaining problem is shown below. For a whole variety of combinations and functions to be repeated, Jon's ticks and GetMilliSec give entirely different answers that do not seem to be factors of one another. Interesting to use ticks as the "load" on GetMilliSec and vice versa and then swap. The answers are about the same either way; tick: 14-15, gms: 239 - 240. Beats me what's being measured here.

A second comprises 60 ticks or (more obviously) 1000 milliseconds, so one tick is roughly 16.7 ms. Given some variation in the results, they convert reasonably well:

Applescript:

on MsToTicks(n)
   n * 0.06 div 1
end MsToTicks

on ticksToMs(n)
   n div 0.06
end ticksToMs

set r to {tick:43, gms:690.0}
{ticksToMs:ticksToMs(r's tick), MsToTicks:MsToTicks(r's gms)}
--> {ticksToMs:716, MsToTicks:41}

Last edited by kai (2005-06-11 08:08:02 pm)


kai
