Convert the Text to Record (effectively)

KniazidisR · August 5, 2022, 5:06pm

Thanks, thanks, thanks!

I ended up with 3 solutions that return the exact result, but at different speeds. The most effective method (for me) here was the JXA-method, provided by @CK. I also discovered a lot of interesting things about his programming technique.

The JXA solution from @CK turned out to be the fastest (3 ms). It is ideal for users who understand JXA. See my post #10 for how I applied this solution.
The method suggested by @Marc Anthony is also fast (9 ms), although it is 3 times slower than the JXA solution. Since the difference in speed is not critical, this solution is ideal for users less comfortable with JXA. See my post #13 for how I applied this second solution.
The method suggested by @robertfern is the most accurate solution, so it’s worth considering. It is ideal for those who want numeric values returned as numbers rather than as text. I tried to get the most speed out of it and made it strip leading spaces from the key names. I ended up with a script that is 10 times slower than the JXA solution (30 ms):


on currentWiFiNetworkInfo()
	script o
		property currentWiFiNetworkInfo : ""
	end script
	set o's currentWiFiNetworkInfo to paragraphs of (do shell script "/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport -I")
	set {ATID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, ": "}
	set myRec to {}
	repeat with i in o's currentWiFiNetworkInfo
		set tmp to text items of i
		set spaceOffset to item 1 of tmp
		repeat while spaceOffset begins with space -- added by me
			set spaceOffset to text 2 thru -1 of spaceOffset
		end repeat
		set end of myRec to spaceOffset
		set end of myRec to (rest of tmp) as {integer, text} -- the @CK's last suggestion applied here
	end repeat
	set AppleScript's text item delimiters to ATID
	set rawRecord to {«class usrf»:myRec}
end currentWiFiNetworkInfo

my currentWiFiNetworkInfo()

estockly · August 5, 2022, 5:52pm

CK:

try blocks will slow a script down quite significantly when an error is caught. Error catching is a fairly intense set of operations that get performed in order to catch the error, itemise the components of the script responsible for the error, reporting it, and logging it. Luckily, you can probably do away with all of that and simply do this:
tell the textList to set the last item to the last item as {boolean, real, integer, text}

Hmmm… I tried various forms of that, and it changes the class of everything to text. But the below works.


set theContent to "agrCtlRSSI: -29
agrExtRSSI: true
 agrCtlNoise: -87
agrExtNoise: 2.10
state: running
		op mode: station 
 lastTxRate: 59
maxRate: 72
lastAssocStatus: 0
  802.11 auth: open
link auth: wpa2-psk
  BSSID: 8:78:8:0:aa:5c
 SSID: AndroidAP
MCS: 6
channel: 6"
set testRecords to RecordIzeText(theContent)
set currentWiFiNetworkInfo to do shell script "/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport -I"
set actualRecords to RecordIzeText(currentWiFiNetworkInfo)

on RecordIzeText(theContent)
	set theContent to linefeed & theContent -- prepend a linefeed so the first line's leading white space is stripped too
	set saveTID to AppleScript's text item delimiters
	repeat
		set AppleScript's text item delimiters to {return & tab, linefeed & tab, return & space, linefeed & space}
		set fixedText to text items of theContent
		set AppleScript's text item delimiters to {linefeed}
		set fixedText to fixedText as text
		if fixedText = theContent then exit repeat
		set theContent to fixedText
	end repeat
	set the theContent to the rest of paragraphs of (theContent as text)
	set recordz to {}
	set text item delimiters to {": "}
	repeat with index from 1 to count theContent
		set anItem to theContent's item index
		set anItem to text items of anItem
		set recordLabel to item 1 of anItem
		set recordValue to the rest of anItem as text
		--set recordValue to recordValue as {boolean, real, integer, text} --turns everything to text
		try
			set recordValue to recordValue as {boolean, number}
		on error
			set recordValue to quote & recordValue & quote
		end try
		set end of recordz to ("|" & recordLabel & "|: " & (recordValue))
	end repeat
	set text item delimiters to {","}
	set recordz to run script "{" & (recordz as text) & "}"
	set AppleScript's text item delimiters to saveTID
	return recordz
end RecordIzeText
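
The last step of the handler above relies on run script compiling a text that is laid out like a record literal. Here is a minimal sketch of that step in isolation, using a hard-coded source text. The labels and values come from the sample data; recordSource and theRecord are names made up for this illustration.

```applescript
-- Hypothetical source text, laid out the way RecordIzeText assembles it:
-- barred labels, unquoted numbers, quoted text values.
set recordSource to "{|agrCtlRSSI|: -29, |state|: \"running\"}"

-- run script compiles and runs the text, so the unquoted value
-- arrives as an integer and the quoted one as text.
set theRecord to run script recordSource

class of (|agrCtlRSSI| of theRecord) -- integer
```

Because the values are compiled as script source, a text value that itself contains a double quote would need escaping before being wrapped in quotes.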

Also…

Script Debugger adds a lot of overhead to script execution in order to do all its magic. That makes it less than ideal for timing scripts. I think that’s why Shane made Script Geek in the first place.
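
For a rough comparison without a dedicated timing tool, a script can also time itself. This is only a sketch under assumptions, not how Script Geek works: the iteration count and the code inside the loop are placeholders, and the first run also includes the cost of loading the Foundation framework.

```applescript
use framework "Foundation"
use scripting additions

set iterations to 10 -- assumed iteration count
set startDate to current application's NSDate's |date|()
repeat iterations times
	-- Placeholder for the code under test: split one short line.
	set AppleScript's text item delimiters to ": "
	set theParts to text items of "SSID: AndroidAP"
	set AppleScript's text item delimiters to ""
end repeat
-- timeIntervalSinceNow is negative for a date in the past, so negate it.
set elapsedMilliseconds to -(startDate's timeIntervalSinceNow()) * 1000
```

Running it from Script Editor or osascript rather than a debugger keeps the editor's own overhead out of the figure.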

Nigel_Garvey · August 5, 2022, 7:08pm

They’ve not been mentioned in any edition of the AppleScript Language Guide for at least 25 years. So either the author(s) have been very forgetful or the AppleScript team hasn’t intended square brackets to be an official part of the language. The latter explanation’s by far the more likely. Square brackets were mentioned once or twice in my very early days on the AppleScript-Users e-mail list, but you and one other poster elsewhere (possibly the same person posting under a different name) are the only people I’ve ever seen actually use them. The length of time I’ve had to point out that they’re not (or are no longer) an official part of AppleScript equates to the time you’ve been casually using them here without explanation or good reason. It’s the permanency of your stubbornness rather than that of square brackets that’s being demonstrated.

AppleScript has quite a few undocumented features and sometimes they can be the only way to get a particular job done or to get it done efficiently (if you know about them). Where this is the case, it should be noted and explained in comments. Where not, more standard methods should be used. This isn’t some law I’ve made up myself. It’s a philosophy passed on by the many professional coders with whom I’ve crossed swords over the past quarter century. Likewise with not using reserved terms (with a few exceptions) as labels for script and user record properties. But as I noted above, your convoluted script seems to depend on the labels not being user ones.

Not being casual or a smartarse is especially relevant in fora where people come for help in learning the language. There are subtle differences between lists with square brackets (“linked lists”) and those with braces (“vectors”). If a learner sees someone who comes on clever casually using square brackets instead of braces simply as a matter of style, they might be tempted to think the two are equivalent, follow suit, and one day run foul of the differences. That said, of course, there’s no reason why an undocumented feature shouldn’t be demo’d at the end of a relevant, already answered topic for the interest and amusement of other scripters.

I in turn would encourage you to read MacScripter’s forum rules, which were also not made up by me. Rule 2 states: “You will be required to enter your legal first and last name when you register.” The first and last name shown in your profile don’t conform to this requirement and I’d encourage you to correct them before Mark notices.

If anyone’s interested, in April 2002, on the AppleScript-Users list, Shane quoted from a 1993 AppleScript document which appears to show the beginning of linked lists’ fall from favour. By 1997, there was no mention of them at all in the AppleScript documentation.

From notes accompanying AS version 1.1 in December 1993:

The Story of Lists

Representation and Efficiency

The way in which lists are represented in memory has changed from
AppleScript 1.0 to AppleScript 1.1, and correspondingly, the characteristics
of these values have changed with respect to efficiency. Efficiency involves
both the amount of time it takes to perform certain operations, and the
amount of space these values consume. Additionally, certain operations can
generate intermediate values (garbage) that are automatically reclaimed by
AppleScript. The more of these intermediate values, the more often the
time-consuming operation of collecting them must occur.

In AppleScript 1.0, lists were represented as linked lists in memory. Linked
lists means that each item was linked to the next via an internal pointer.
On the positive side, linked lists didn’t take up much memory, and
concatenating new item (at least on the left-hand side) was relatively fast
and didn’t create any garbage:

On the negative side, linked lists took longer to reach elements at higher
indices than they did to reach elements at lower indices (i.e. the access
time was linear (increased proportionally) to the index of the item being
accessed). This is because the list had to be traversed from beginning,
counting up to the index of the item desired. Also, if you were unfortunate
enough to concatenate on the right-hand side, the entire left-hand side of
the list would have to be copied. This is because we mustn’t change the
left-hand side value to permanently be concatenated with the right-hand side
because other parts of the program may still be using its old value.

In AppleScript 1.1, lists are represented as vectors. This means that
elements are stored in consecutive locations in memory rather than linked
together. This makes accessing elements by index fast (constant time – it
doesn’t matter about the value of the index), but makes concatenation
generate much more garbage, as both halves of the concatenate operator (&)
must be copied.

This trade-off was made because we found that people accessed items of lists
far more often than they concatenated, or more precisely, that this balance
led to better overall performance characteristics. However, for certain
scripts, the previous list representation may have been more appropriate,
therefore we did two things: We allow linked lists to be explicitly created,
and we allow insertion operations on vectors.

Explicit Linked Lists

AppleScript 1.1 allows the old-style (linked) lists to be created by using a
square bracket notation. Whereas to put variables x, y and z into a vector,
we say:

{x, y, z}

we can now also write scripts that put these variables into a linked list
via:

[x, y, z]

Note, however, that both of these print results out with braces.
Furthermore, asking the class of either of these two representations will
return the class ‘list’:

class of {1, 2} → list
class of [1, 2] → list

If the two representations need to be distinguished, the ‘best type’
property may be used:

best type of {1, 2} → vector
best type of [1, 2] → linked list

For the concatenation operator (&), the representation of the result will be
the same as the representation of the left-hand side:

best type of ({1} & [2]) → vector
best type of ([1] & {2}) → linked list

When AppleScript 1.1 reads 1.0 script files, all lists stored in persistent
properties will actually be linked lists. However, the distinction will be
transparent to the script when run.

Linked lists are a very good representation for recursive programs that
“walk” lists, processing one item on each recursive step. This is usually
the case when the ‘rest’ property is being used. For example, the following
function computes the sum of the squares of the items in a list:

on SumOfSquares(theList)
	if theList = {} then
		return 0
	else
		set element to first item of theList
		return (element * element) + SumOfSquares(rest of theList)
	end if
end SumOfSquares

When this function is applied to a linked list or vector, the same answer
results (note that the empty list equals the empty vector, so the equality
test works in either case):

SumOfSquares([1, 2, 3]) → 14
SumOfSquares({1, 2, 3}) → 14

however the linked list case doesn’t generate any garbage. The vector case
must generate a new sub-list for each time the ‘rest’ property is accessed.

Vector Insertion

The second thing we did to enhance the performance when using vectors is to
allow items to be inserted at the beginning or end of a vector. E.g.:

set myList to {1, 2}
set end of myList to 3
myList → {1, 2, 3}

set beginning of myList to 0
myList → {0, 1, 2, 3}

Unlike concatenation, setting the beginning or end of a list actually
changes the list. Note that it’s not the variable that’s being changed, it’s
the list itself. That means that if two variables are referring to the same
list, and an item is inserted into the list, both variables will refer to
the changed list:

set x to {1, 2}
set y to x -- y refers to the same value that x refers to
set end of x to 3
x → {1, 2, 3}

y → {1, 2, 3}

However, despite the “destructive” nature of insertion, often times we’re
only using lists to accumulate results, and don’t care that variables
referring to the list will be modified (presumably we don’t care because
only one variable is referring to that list anyway). In these cases, we can
greatly increase the efficiency of our scripts by inserting items rather
than concatenating them, forming new lists and leaving the old ones as
garbage. For example, this script that accumulates the squares of a list of
numbers:

set myNumbers to {1, 2, 3}
set resultList to {}
repeat with i in myNumbers
	set resultList to resultList & {i * i}
end repeat
resultList → {1, 4, 9}

can be made more efficient (faster and generate less garbage) if rewritten
as:

set myNumbers to {1, 2, 3}
set resultList to {}
repeat with i in myNumbers
	set end of resultList to i * i
end repeat
resultList → {1, 4, 9}

For the record, setting the end of a list is by far the most efficient, and
setting the beginning is second (because all the items must be slid down to
make room for a new first item).

Minor difference in Concatenation

There is one minor difference between the way concatenation works on linked
lists and vectors (and consequently between AppleScript 1.0 and AppleScript
1.1 using the curly brace notation for lists). When linked lists are
concatenated, only the left-hand argument is copied, the right-hand argument
is shared. For example:

set x to [1, 2, 3]
set y to [4, 5, 6]
set z to x & y
set item 2 of y to 999
z → {1, 2, 3, 4, 999, 6}

Here, x and y are two linked lists (or AppleScript 1.0 lists) that are
concatenated. Since y appears on the right-hand side of the concatenate
operator (&), it is shared between y and the last three elements of z. Then
when y is updated, these changes are reflected in z also.

However, if vectors are used, both x and y are copied to create z, and
updating y does not change z:

set x to {1, 2, 3}
set y to {4, 5, 6}
set z to x & y
set item 2 of y to 999
z → {1, 2, 3, 4, 5, 6}

Although this does constitute a change in the default behavior from
AppleScript 1.0 to 1.1, we believe the new behavior is more intuitive, and
the old behavior can be preserved using the square bracket syntax.

Nigel_Garvey · August 6, 2022, 7:18am

Here’s an analysis of CK’s AppleScript script in post #5. The way the TIDs are used is interesting.

set WifiStats to "		agrCtlRSSI: -29
		agrExtRSSI: 0
		agrCtlNoise: -87
		agrExtNoise: 0
		state: running
		op mode: station 
		lastTxRate: 59
		maxRate: 72
		lastAssocStatus: 0
		802.11 auth: open
		link auth: wpa2-psk
		BSSID: 8:78:8:0:aa:5c
		SSID: AndroidAP
		MCS: 6
 channel: 6"

-- Replace both linefeeds and colon-spaces with linefeeds.
set AppleScript's text item delimiters to {linefeed, ": "}
set WifiStats to WifiStats's text items as text
-- Replace both spaces and tabs with empty texts resulting from a non-text delimiter.
set AppleScript's text item delimiters to {{space}, tab} -- simile: {7, space, tab}
set WifiStats to WifiStats's text items as text
-- Hack the result's alternating label and value lines into a record.
return {«class usrf»:WifiStats's paragraphs}'s contents as anything

-- Another once-popular version of the hack:
return (record {«class usrf»:WifiStats's paragraphs} as record)'s «class seld»

-- Or even:
script
	{«class usrf»:WifiStats's paragraphs}
end script
return (run script result)
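
The first step of the analysis, using several text item delimiters at once, can be tried on its own. This is a minimal sketch with a two-line sample; savedTIDs, sampleText, theItems and joinedText are names made up for this illustration. Unlike the script above, it also restores the delimiters afterwards.

```applescript
set savedTIDs to AppleScript's text item delimiters
set sampleText to "SSID: AndroidAP" & linefeed & "MCS: 6"

-- With a list of delimiters, the text is split at every occurrence of any of them.
set AppleScript's text item delimiters to {linefeed, ": "}
set theItems to text items of sampleText
-- theItems is now {"SSID", "AndroidAP", "MCS", "6"}

-- When a list is coerced to text, only the first delimiter (linefeed) is used as the joiner.
set joinedText to theItems as text

set AppleScript's text item delimiters to savedTIDs
paragraphs of joinedText -- {"SSID", "AndroidAP", "MCS", "6"}
```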

peavine · August 6, 2022, 4:26pm

The purpose of KniazidisR’s script appears to be to get information from the airport utility, and, if this is correct, an ASObjC solution is probably of little worth. However, FWIW, the following ASObjC script does the job and takes 4 milliseconds (with the Foundation framework in memory), which is competitive with the other suggestions. I had earlier posted a different version of this script, but the following is faster because it uses ": " as a separator to parse each paragraph of the source string. Error correction is needed if blank or nonconforming paragraphs are possible.

use framework "Foundation"

set theString to "agrCtlRSSI: -29
agrExtRSSI: 0
agrCtlNoise: -87
agrExtNoise: 0
state: running
op mode: station 
lastTxRate: 59
maxRate: 72
lastAssocStatus: 0
802.11 auth: open
link auth: wpa2-psk
BSSID: 8:78:8:0:aa:5c
SSID: AndroidAP
MCS: 6
channel: 6"

on getRecord(theString)
	set theString to current application's NSString's stringWithString:theString
	set theArray to (theString's componentsSeparatedByString:linefeed)
	set theDictionary to (current application's NSMutableDictionary's new())
	set whiteSpaceCharacters to current application's NSCharacterSet's whitespaceCharacterSet()
	repeat with aParagraph in theArray
		set paragraphArray to (aParagraph's componentsSeparatedByString:": ")
		set theKey to ((paragraphArray's objectAtIndex:0)'s stringByTrimmingCharactersInSet:whiteSpaceCharacters)
		set theValue to ((paragraphArray's objectAtIndex:1)'s stringByTrimmingCharactersInSet:whiteSpaceCharacters)
		(theDictionary's setObject:theValue forKey:theKey)
	end repeat
	return theDictionary as record
end getRecord

set theRecord to getRecord(theString)
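
The error correction mentioned above could be sketched as a guard on the number of components. This is one possible approach, not a tested replacement: getRecordSafely is a hypothetical variant of the handler, and the test string is made up.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical variant of getRecord that skips paragraphs lacking a ": " separator.
on getRecordSafely(theString)
	set theString to current application's NSString's stringWithString:theString
	set theArray to (theString's componentsSeparatedByString:linefeed)
	set theDictionary to (current application's NSMutableDictionary's new())
	set whiteSpaceCharacters to current application's NSCharacterSet's whitespaceCharacterSet()
	repeat with aParagraph in theArray
		set paragraphArray to (aParagraph's componentsSeparatedByString:": ")
		-- Blank or nonconforming paragraphs yield fewer than two components.
		if (paragraphArray's |count|()) > 1 then
			set theKey to ((paragraphArray's objectAtIndex:0)'s stringByTrimmingCharactersInSet:whiteSpaceCharacters)
			set theValue to ((paragraphArray's objectAtIndex:1)'s stringByTrimmingCharactersInSet:whiteSpaceCharacters)
			(theDictionary's setObject:theValue forKey:theKey)
		end if
	end repeat
	return theDictionary as record
end getRecordSafely

-- Made-up input with a blank paragraph and a paragraph without a separator.
getRecordSafely("SSID: AndroidAP" & linefeed & linefeed & "no separator here" & linefeed & "MCS: 6")
```

Like the original handler, this keeps only the text up to a second ": " in a value, which is acceptable for the airport output shown here but may not be for other utilities.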

KniazidisR · August 6, 2022, 6:07pm

@peavine,

I tested your latest script. It runs in 9 ms for me, that is, 3 times slower than the JXA script. I think that’s because of the repeat loop; an ASObjC solution could probably only beat the JXA one by applying a suitable regular expression instead of looping.

Marc_Anthony · August 6, 2022, 10:39pm

It’s unfortunate that the results from the airport command are so sloppy and haphazard. Multiple lines randomly begin with spaces, and a single line ends with several of them; the need for correction lends itself to less than optimal efficiency, no matter the method. Since my initial attempt eliminated internal spaces, I played around with a regex and married it to CK’s initial approach. It may or may not be faster, but it’s brief.


({«class usrf»:(do shell script "/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport -I | egrep -o '\\w[^:]+|:.+' | awk '{gsub(\" +$|: \",\"\"); print} ' ")'s paragraphs})'s contents as anything
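
To see what the pipeline hands back before the record hack is applied, it can be fed a few sample lines instead of the live airport output. This is a sketch under assumptions: the sample lines are typed in by hand rather than captured, and sampleText, shellCommand and labelsAndValues are names made up for this illustration.

```applescript
-- Hand-typed stand-ins for airport output, including leading spaces and one trailing space.
set sampleText to "     agrCtlRSSI: -29" & linefeed & "     op mode: station " & linefeed & "          BSSID: 8:78:8:0:aa:5c"

-- egrep -o prints each match on its own line: first the label, then the colon and value.
-- awk then strips trailing spaces and the ": " prefix from every line.
set shellCommand to "printf '%s\\n' " & quoted form of sampleText & " | egrep -o '\\w[^:]+|:.+' | awk '{gsub(\" +$|: \",\"\"); print}'"

-- The result should be a flat list of alternating labels and values.
set labelsAndValues to paragraphs of (do shell script shellCommand)
```

The label pattern needs at least two characters before the colon, so a one-character key would not be picked up.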

KniazidisR · August 7, 2022, 5:06am

Marc Anthony:

It’s unfortunate that the results from the airport command are so sloppy and haphazard.


It’s not just the airport utility and its sloppy, haphazard output: many other utilities produce a similar key-value structure. I therefore wanted to find out which approach is the most effective in such cases. Some utilities have options for filtering their output, but they still don’t return it as a record. This topic has convinced me that using a regular expression is the most effective solution.

I tested your last script and I can confirm that it works instantly (2 ms), i.e. 1 ms faster than the JXA solution.

Thanks, this will come in handy.

Nigel_Garvey · August 7, 2022, 7:08am

Here’s an ASObjC solution using regex. It renders number values in the final record as numbers rather than as text.

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set WifiStats to "		agrCtlRSSI: -29 
		agrExtRSSI: 0
		agrCtlNoise: -87
		agrExtNoise: 0
		state: running
		op mode: station 
		lastTxRate: 59
		maxRate: 72  
		lastAssocStatus: 0
		802.11 auth: open
		link auth: wpa2-psk
		BSSID: 8:78:8:0:aa:5c
		SSID: AndroidAP
		MCS: 6
 channel: 6"

set regex to current application's NSRegularExpressionSearch
tell (current application's class "NSMutableString"'s stringWithString:(WifiStats))
	-- Trim leading and trailing white space and empty lines.
	its replaceOccurrencesOfString:("(?m)^\\s++|\\h++$|\\s++$") withString:("") options:(regex) range:({0, its |length|()})
	-- Enbar labels and enquote values.
	its replaceOccurrencesOfString:("(?m)^([^:]++):\\h++(.++)") withString:("|$1|: \"$2\"") options:(regex) range:({0, its |length|()})
	-- Unenquote number values.
	its replaceOccurrencesOfString:("\"([0-9.-]++)\"") withString:("$1") options:(regex) range:({0, its |length|()})
	-- Replace internal line endings with commas.
	its replaceOccurrencesOfString:("\\R++") withString:(", ") options:(regex) range:({0, its |length|()})
	-- Concatenate between AS text braces (thereby also coercing to AS text) and run as script code.
	return my (run script ("{" & it & "}"))
end tell

Or perhaps more coolly presented:

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set WifiStats to "		agrCtlRSSI: -29 
		agrExtRSSI: 0
		agrCtlNoise: -87
		agrExtNoise: 0
		state: running
		op mode: station 
		lastTxRate: 59
		maxRate: 72  
		lastAssocStatus: 0
		802.11 auth: open
		link auth: wpa2-psk
		BSSID: 8:78:8:0:aa:5c
		SSID: AndroidAP
		MCS: 6
 channel: 6"

set regex to current application's NSRegularExpressionSearch
set patternsAndReplacements to {¬
	{"(?# Trim leading and trailing white space and empty lines.)(?m)^\\s++|\\h++$|\\s++$", ""}, ¬
	{"(?# Enbar labels and enquote values.)(?m)^([^:]++):\\h++(.++)", "|$1|: \"$2\""}, ¬
	{"(?# Unenquote number values.)\"([0-9.-]++)\"", "$1"}, ¬
	{"(?# Replace internal line endings with commas.)\\R++", ", "} ¬
		}
tell (current application's class "NSMutableString"'s stringWithString:(WifiStats))
	repeat with this in patternsAndReplacements
		(its replaceOccurrencesOfString:(this's beginning) withString:(this's end) options:(regex) range:({0, its |length|()}))
	end repeat
	return my (run script ("{" & it & "}"))
end tell

KniazidisR · August 7, 2022, 8:05am

I can confirm: it is very fast, and it returns numbers in number format.

peavine · August 7, 2022, 1:19pm

I ran timing tests to verify the above. My computer is a 4-year-old Mac mini running Monterey and is neither fast nor slow by current standards. I used Script Geek and the results are in milliseconds:

SCRIPT                  1 ITERATION   10 ITERATIONS
CK (JXA)                4             45
peavine (ASObjC)        4             41
Nigel (first ASObjC)    2             10

IMO, these timing results are sufficiently close that other factors should probably determine which is used.

CK11 · August 8, 2022, 6:36am

You yourself posted the excerpt below entitled From notes accompanying AS version 1.1, which proves (in the words of someone who clearly was an engineer on the team that developed AppleScript) that they are, and therefore still remain, a part of AppleScript.

I only post using my own name.

Nigel, this is not acceptable. I already had to ask another user to refrain from targeting personal comments at me (or anyone else), but that I now need to ask that this same code of conduct be adhered to by a moderator is disgraceful.

Please do not throw personal insults at me (or other users). I understand that, for whatever reason, my disagreeing with you appears to upset you. But this is not ok.

I’m familiar with the philosophy. I began my programming career somewhat more than a quarter of a century ago. I have explained it in previous posts, which I seem to recall you also took issue with.

And, no, my convoluted script doesn’t depend on them.

Again, not acceptable. I get it, you don’t like me. I don’t care. Find a better way to compose yourself or leave me alone.

I am aware I have certain communication issues that cause me to be interpreted as blunt, rude, or “a smartarse”. I am autistic, and this is something I continually work to improve, but it’s also not something I easily get right. I’m fine with criticisms being made about how I write in a way that’s designed to be helpful and provide a way for me to learn, but you don’t know me as a person, and you don’t have the right to label me however you see fit just because you’re in a bad mood.

Am I being clear about this?

So it seems that when a moderator doesn’t agree with one of the members, the thing to do is to try and attack them in an oblique fashion? My name was Christofer, but it was changed long ago (by deed poll) to CK. There are some social media sites where this name is too short to be accepted, and I’ve been forced to retain my old name. I also use my old name in some professional contexts. But, frankly, none of this is any of your business, and I’m confident Mark does not behave in the manner that I’m seeing here.

Nigel_Garvey · August 8, 2022, 11:18am

No. You’re right. They’re completely unnecessary. I worked that out when I finally got round to de-obfuscating the script properly. (See post #24.)

As the moderator, it’s not for me to like or dislike anyone on this forum, any more than it is for you to tell me how to moderate it. That I do dislike.

My main concerns when moderating include:

• People asking for help here should be given accurate information that they can both use in the immediate context of their enquiry and understand well enough to be able to use or adapt for use elsewhere if need be. Similarly, people reading posts in order to increase their AppleScript knowledge shouldn’t be handed stuff that might lead them astray later. This isn’t always possible, of course, but it should be the ambition. Obfuscated and uncommented scripts unnecessarily employing hacks and obsolete, long-undocumented features don’t come anywhere near achieving this.

• Posters with obvious talent and/or enthusiasm for AppleScript and MacScripter should be encouraged and given the best possible guidance — hopefully without stifling their own creativity. They should also be able to enjoy visiting and contributing to MacScripter without fear of their contributions being mocked (although constructive criticism is of course always welcome) or of being abused themselves. (See Posting Guidelines.) The spectacle of occasional visitors insisting that black is white, condescendingly referring to people’s comfort zones, and then laying down the law themselves about what is or is not acceptable is something we could all do without.

• MacScripter’s very few Rules and Posting Guidelines apply to everyone. I didn’t write them. I’ve just been landed with the job of moderating. If I see rules apparently being flouted by someone who’s also being difficult in the fora, it very much is my business.

Matters that simply boil down to common sense, such as not cluttering the fora with unnecessary replies to queries posted many years before, are, according to my brief, decided at my own discretion.

If you have issues with any of this, refer them to Mark. I’ll refer this topic to him myself so that he can remove my moderator status should he deem it necessary. Meanwhile this topic is now locked.