You’re correct that you can’t depend on the order without a base. Every file has the same order, so when you look up the line in one file you can print the same line in another file. LC_TIME files work this way because it is faster and saves space compared to localization strings, but the principle remains the same.
Yvan, no. It seems we are both using an osax DJ does not have. Change his ln variable, and she’ll work. Satimage?
DJ, just for fun I tried the script with an abbreviated day name. And learned something about sed - I think.
With an abbreviated name you get 2 line numbers, so it seems to apply a ‘contains’ condition, not “is equal to”.
I just looked at the sed manpage for the 1st time ever, which didn’t do me much good.
Regulus and McUsr are both right. Some entries appear multiple times in the same file, so we also need to tell sed to quit on the first result. It should also work for users of the Satimage osax, which, as correctly mentioned, I don’t have installed.
timeLocalStrings("ma", "nl_NL", "en_US")
on timeLocalStrings(str, _from, _to)
set validLocales to every paragraph of (do shell script "ls /usr/share/locale | grep -i '.utf-8$' | awk -F. '{print $1}'")
if _from is not in validLocales or _to is not in validLocales then return str
set fPath to "/usr/share/locale/" & _from & ".UTF-8/LC_TIME"
set tPath to "/usr/share/locale/" & _to & ".UTF-8/LC_TIME"
set lineNumber to (do shell script "cat " & quoted form of fPath & " | sed -n '/^" & str & "$/ {=;q;}'") as integer
if lineNumber = 0 then return str
return do shell script "cat " & quoted form of tPath & " | sed -n -e '" & lineNumber & "," & lineNumber & "p'"
end timeLocalStrings
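To see the line-number lookup outside AppleScript, here is a minimal shell sketch of the same idea. The two sample files and the helper name translate_line are made up for illustration; real LC_TIME files hold many more lines.

```shell
#!/bin/sh
# Made-up parallel files: line N of "from" corresponds to line N of "to".
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '%s\n' Mon Tue Wed Monday Tuesday Wednesday > "$tmpdir/from"
printf '%s\n' lun mar mer lundi mardi mercredi > "$tmpdir/to"

# translate_line STR FROM_FILE TO_FILE  (hypothetical helper)
translate_line() {
    # "=" prints the number of the first line equal to STR, "q" stops there.
    n=$(sed -n "/^$1\$/{=;q;}" "$2")
    if [ -z "$n" ]; then
        # No match: fall back to the input string, as the handler does.
        printf '%s\n' "$1"
        return
    fi
    # Print only line n of the target file.
    sed -n "${n}p" "$3"
}

translate_line Tue "$tmpdir/from" "$tmpdir/to"      # prints: mar
translate_line Tuesday "$tmpdir/from" "$tmpdir/to"  # prints: mardi
translate_line Foo "$tmpdir/from" "$tmpdir/to"      # prints: Foo
```

As in the handler, the search string goes into the sed pattern unescaped, so this sketch assumes it contains no regex metacharacters or slashes.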
Not that it matters much, but awk is notoriously slow, and you are really shooting a sparrow with a cannon when you return a field like that with awk. I first tried sed, and that seemed as hard as changing newlines! Then I remembered cut!
I am really fond of awk too. And there is nothing wrong with your code; this is just an alternative, and possibly much faster, way to get that field returned. I think it is generally a better way to return just the first field of a record separated by some delimiter (a period in this case) when no processing of the data is needed!
Not 100% vanilla, unless you want to write a plist parser in plain A/S.
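For comparison, a small sketch of both ways to take the part before the first period. The directory names are a made-up sample standing in for the output of ls /usr/share/locale, and the grep pattern escapes the period, which the handler's pattern does not.

```shell
#!/bin/sh
# Made-up sample of locale directory names.
names='en_US.UTF-8
fr_FR.UTF-8
nl_NL.ISO8859-1
nl_NL.UTF-8'

# awk: split on "." and print the first field.
with_awk=$(printf '%s\n' "$names" | grep -i '\.utf-8$' | awk -F. '{print $1}')

# cut: same delimiter, same field, no scripting language involved.
with_cut=$(printf '%s\n' "$names" | grep -i '\.utf-8$' | cut -d. -f1)

printf '%s\n' "$with_cut"
```

Both variables end up holding the same three locale names; which pipeline is faster is something to measure on the machine at hand.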
It’s defaults or System Events otherwise.
And:
I can’t find that file on my 10.6.8 system. It’s a system file anyway (right?), to get the current user’s language/locale you’d read one of his plists:
do shell script "defaults read -g AppleLanguages"
-- or
do shell script "defaults read -g AppleLocale"
From my first run of the script I guessed that sed does a “contains” match, not “equal to”. In that case, quitting at the 1st match for “Tue” would return “Tuesday” when the order in the LC_TIME is reversed (short names after long names).
So, how does sed match? “equal to” or “contains”?
I can’t find that out from its manpage. It may be there, but hidden by the terminology.
AWK was written to outperform the shell by a wide margin, and it succeeded. Oddly, I would expect cut to be faster, but that does not seem to hold on every machine. On my machine there is no difference; on some machines cut is faster and on others awk is faster. AWK is pretty awesome and insanely fast for an interpreter.
For instance, when you look up ‘Tue’ you would normally get a ‘contains’ match. But when you wrap ^…$ around it (begins with and ends with), it is an exact match.
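A quick way to check this in a terminal, using a made-up three-line sample rather than a real LC_TIME file:

```shell
#!/bin/sh
# Made-up sample lines.
days='Tue
Tuesday
Wed'

# Unanchored pattern: every line that contains "Tue" matches.
contains=$(printf '%s\n' "$days" | sed -n '/Tue/=')

# Anchored pattern: only a line that is exactly "Tue" matches.
exact=$(printf '%s\n' "$days" | sed -n '/^Tue$/=')

echo "contains: $contains"   # line numbers 1 and 2
echo "exact: $exact"         # line number 1
```

The "=" command prints the number of each matching line, so the unanchored pattern reports two numbers and the anchored one reports a single number.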
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
on timeLocalStrings(str, _from, _to)
set vName to path to startup disk as text
set fPath to vName & "usr:share:locale:" & _from & ".UTF-8:LC_TIME"
set tPath to vName & "usr:share:locale:" & _to & ".UTF-8:LC_TIME"
tell application "System Events"
{exists disk item fPath, exists disk item tPath}
end tell
if result contains false then return str
paragraphs of (read file fPath)
tell application "ASObjC Runner"
set maybe to look in list result matching str
end tell
if maybe = {} then return str
item 1 of maybe
paragraph result of (read file tPath)
return result
end timeLocalStrings
I have not timed it against perl, and maybe it does as well as cut when there is just one line of input.
I have no doubt that cut will outperform awk on any OS X from Tiger onwards if, say, there are over 100 lines of text to be cut.
Most of the time I use sed when I can, to overcome the slowness of awk; that is, when I don’t need such a big scripting language to process input. Sed is also an interpreter, though much smaller and faster.
How perl performs compared to awk would be interesting to see. My initial guess would be that it is faster, but I have no knowledge of the matter.
For fun, I compared three handlers, executing each of them 1000 times.
with handler #1 :
set beg to current date
repeat 1000 times
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
end repeat
(current date) - beg
--> 460
on timeLocalStrings(str, _from, _to)
tell application "System Events"
set validLocales to name of every folder of folder "Macintosh HD:usr:share:locale:" whose name contains ".UTF-8"
end tell
{validLocales contains _from & ".UTF-8", validLocales contains _to & ".UTF-8"}
if result contains false then return str
set vName to path to startup disk as text
set fPath to vName & "usr:share:locale:" & _from & ".UTF-8:LC_TIME"
set tPath to vName & "usr:share:locale:" & _to & ".UTF-8:LC_TIME"
paragraphs of (read file fPath)
tell application "ASObjC Runner"
set maybe to look in list result matching str
end tell
if maybe = {} then return str
item 1 of maybe
paragraph result of (read file tPath)
return result
end timeLocalStrings
with handler #2 (DJ Bazzie Wazzie one)
set beg to current date
repeat 1000 times
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
end repeat
(current date) - beg
--> 37
on timeLocalStrings(str, _from, _to)
set validLocales to every paragraph of (do shell script "ls /usr/share/locale | grep -i '.utf-8$' | awk -F. '{print $1}'")
if _from is not in validLocales or _to is not in validLocales then return str
set fPath to "/usr/share/locale/" & _from & ".UTF-8/LC_TIME"
set tPath to "/usr/share/locale/" & _to & ".UTF-8/LC_TIME"
set lineNumber to (do shell script "cat " & quoted form of fPath & " | sed -n '/^" & str & "$/ {=;q;}'") as integer
if lineNumber = 0 then return str
return do shell script "cat " & quoted form of tPath & " | sed -n -e '" & lineNumber & "," & lineNumber & "p'"
end timeLocalStrings
with handler #3
set beg to current date
repeat 1000 times
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
end repeat
(current date) - beg
--> 9
on timeLocalStrings(str, _from, _to)
set vName to path to startup disk as text
set fPath to vName & "usr:share:locale:" & _from & ".UTF-8:LC_TIME"
set tPath to vName & "usr:share:locale:" & _to & ".UTF-8:LC_TIME"
tell application "System Events"
{exists disk item fPath, exists disk item tPath}
end tell
if result contains false then return str
paragraphs of (read file fPath)
tell application "ASObjC Runner"
set maybe to look in list result matching str
end tell
if maybe = {} then return str
item 1 of maybe
paragraph result of (read file tPath)
return result
end timeLocalStrings
First of all, there are different awks. The byte code awk is the fastest; unbelievably, the byte code version of AWK is faster than compiled code, and I’m still amazed about that. Unfortunately it is not distributed with Mac OS X. Now we have to work with the ‘one and only true’ AWK (designed by Aho, Weinberger and Kernighan). Because Kernighan also co-wrote the classic book on C, we don’t have to worry about whether the C code is properly written :P.
No, sed is a good tool, but remember that sed was there first; AWK was designed to extend it, or at least to do things sed isn’t able to. For instance, AWK supports extended regular expressions, and it has C-style conditions and control structures, which sed doesn’t have. AWK also has a built-in field separator that ignores surrounding white space, which neither cut nor sed has.
Later, when the limits of AWK became apparent, Perl was designed to do things that can’t be done with AWK, like system calls. Perl is also extensible, which neither AWK nor sed is.
So performance-wise AWK should be in the middle, with perl the slowest and sed the fastest of the three.
When to use which?
sed for simple text processing
awk for more complex processing
perl when more complex processing and system calls are needed.
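To illustrate the first two rules of thumb, here is a small sketch on made-up data (a name and a number per line). It is only an example of where each tool feels natural, not a benchmark.

```shell
#!/bin/sh
# Made-up sample data: name and hours per line.
data='ann 3
bob 5
ann 4'

# sed: a simple per-line substitution.
sed_out=$(printf '%s\n' "$data" | sed 's/^ann /anna /')

# awk: field splitting plus arithmetic, summing the second field per name.
awk_out=$(printf '%s\n' "$data" |
    awk '{ total[$1] += $2 } END { print "ann", total["ann"]; print "bob", total["bob"] }')

printf '%s\n' "$sed_out"
printf '%s\n' "$awk_out"   # prints: ann 7, then bob 5
```

The substitution is a one-liner in sed, while the per-name totals need the fields, variables and arithmetic that awk provides.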
set beg to current date
repeat 1000 times
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
end repeat
(current date) - beg
--> 1
on timeLocalStrings(str, _from, _to)
set fPath to "/usr/share/locale/" & _from & ".UTF-8/LC_TIME"
set tPath to "/usr/share/locale/" & _to & ".UTF-8/LC_TIME"
try
set lookup1 to linefeed & (read fPath as «class utf8») & linefeed
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed & str & linefeed
if ((count lookup1's text items) is 1) then
set AppleScript's text item delimiters to astid
error
end if
set lineNumber to (count paragraphs of text item 1 of lookup1)
if (lineNumber is 0) then set lineNumber to 1 -- Special-case the first line in the file.
set AppleScript's text item delimiters to astid
set outStr to paragraph lineNumber of (read tPath as «class utf8»)
on error
set outStr to str
end try
return outStr
end timeLocalStrings
Edit: Incorporated a fix by alastor933 for a problem he discovered some months later which occurs when the ‘str’ term is the first line in the ‘_from’ file.
set beg to current date
repeat 10000 times
--weekday of (current date) as text
"Wed"
timeLocalStrings(result, "en_US", "fr_FR")
end repeat
(current date) - beg
--> 20 -- Edited
on timeLocalStrings(str, _from, _to)
set fPath to "/usr/share/locale/" & _from & ".UTF-8/LC_TIME"
set tPath to "/usr/share/locale/" & _to & ".UTF-8/LC_TIME"
try
paragraphs of (read fPath as «class utf8») -- Edited
tell application "ASObjC Runner"
set maybe to look in list result matching str
end tell
--if maybe = {} then return str
item 1 of maybe
paragraph result of (read tPath) -- Edited
return result
on error
return str
end try
end timeLocalStrings
I ran handler #4 with 10 000 passes too and got 2 seconds.
I think that yours is the best answer to the original question, as the OP wished for a plain AppleScript one.
I agree with the usage of the tools, Bazzie Wazzie, and I didn’t even know there was a byte code version of awk available. Wondering if it is made of Java, or the Microsoft byte code, or something else?
As for speed, I am not sure whether perl really is generally slower. Having said that, perl code is hard to write; it looks sexy when done, but I can’t understand it after a month away from it, so I prefer awk over perl for those reasons, though the seeming similarity of awk to C confuses me at times.
But if I wanted optimum speed, I’d actually test both of those tools to find the one that performs fastest in that case.
@ post 24 just amazing! Then suddenly grep -n was implemented in Applescript.