A bit early, I still have to work 3 more days (including this one). Thanks, and same to you all!
Thanks! It’s not only faster (for smaller lists), it also uses the actual locale settings in AppleScript, while bash’s locale settings are still (up to and including Mountain Lion) a single-byte encoding inherited from ISO Latin 1 collation. Unfortunately, that means the bash version only works properly with Unicode characters 1 through 127, while yours works properly with every Unicode character. I think that’s very useful when file names contain umlauts, for instance.
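To illustrate the collation point, here is a minimal sketch. It assumes UTF-8 input and uses LC_ALL=C as a stand-in for byte-wise comparison; that is my assumption for the demo, not necessarily the exact locale a shell ends up with on Mountain Lion.

```shell
# Byte-wise collation compares the raw UTF-8 bytes, so "ö" (bytes C3 B6)
# sorts after every ASCII letter, including "z".
bytewise=$(printf 'coz\ncöm\ncom\n' | LC_ALL=C sort)
printf '%s\n' "$bytewise"
# prints:
# com
# coz
# cöm
```

A locale-aware comparison would be expected to put "cöm" next to "com" instead.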
It works a bit more like Bash then. The ‘bug’ you mentioned is still in there on Mountain Lion.
Edit: Because Nigel’s script made me curious, here is a version that works similarly to my bash version:
set theFiles to {"cöm.apple.QuickTime.plist", "Com.apple.iTunes.plist", "org.x.X11.plist", "org.mozilla.firefox.plist", "com.apple.iCal.plist", "com.apple.AddressBook.plist", "QuickTime Preferences", "com.TeamViewer8.Settings.plist", "com.quark.quarkupdate.preferences.plist", "Compressed file.7z", "Compressed file 3.7z", "Compressed file 2.7z", "Compressed file 5.7z", "Compressed file 4.7z"}
set theList to splitFileName(theFiles)
set sorted2DList to bubbleSort2D(theList, 1)
set theList to mergeFileName(sorted2DList)
return theList
on bubbleSort2D(_data, _field)
	script l
		property f : _data --just to speed things up
	end script
	set _end to (count l's f) - 1 --after every pass the last item is in its final place, so each pass needs one check fewer
	set _pos to 1 --the current position in the list
	set _sorted to true --when a pass makes no swap between cells, the data is sorted
	repeat until _end < 1 --the final pass still has to compare items 1 and 2
		if isGreater(item _field of item (_pos) of l's f, item _field of item (_pos + 1) of l's f) then
			tell l's f to set {item _pos, item (_pos + 1)} to {contents of item (_pos + 1), contents of item _pos}
			set _sorted to false
		end if
		if _pos = _end then
			if _sorted then exit repeat
			set _pos to 1
			set _end to _end - 1
			set _sorted to true
		else
			set _pos to _pos + 1
		end if
	end repeat
	return _data
end bubbleSort2D
on isGreater(a, b)
	return (a > b)
end isGreater
on splitFileName(_data)
	script l
		property f : _data
	end script
	set {oldTD, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "."}
	repeat with x from 1 to count l's f
		set item x of l's f to {text items 1 thru -2 of item x of l's f as string, contents of item x of l's f}
	end repeat
	set AppleScript's text item delimiters to oldTD
	return _data
end splitFileName
on mergeFileName(_data)
	script l
		property f : _data
	end script
	repeat with x from 1 to count of l's f
		set item x of l's f to item 2 of item x of l's f
	end repeat
	return _data
end mergeFileName
I thought awk would be really nice for this: avoiding two calls to sed while still getting the extensions back on looked like a gain. But it wasn’t really. Relying on pipes inside awk scripts turned out to be a bad idea, as they are one-way pipes: you either read from them or write to them, never both.
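A minimal sketch of what that one-way limitation means in practice: awk writes into sort through an output pipe, closes it, and only then reads the result back from a scratch file. The file name pattern and the variable names here are made up for the illustration.

```shell
# Hypothetical scratch file; the template form works with both BSD and GNU mktemp.
tmp=$(mktemp "${TMPDIR:-/tmp}/awkpipe.XXXXXX")
roundtrip=$(printf 'pear\napple\nfig\n' | awk -v tf="$tmp" '
	BEGIN { cmd = "sort > \"" tf "\"" }
	{ print | cmd }                       # write-only pipe into sort
	END {
		close(cmd)                        # wait until sort has written tf
		while ((getline line < tf) > 0)   # read the sorted lines back
			print line
		close(tf)
	}')
rm -f "$tmp"
printf '%s\n' "$roundtrip"
# prints:
# apple
# fig
# pear
```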
Awk is a nice descendant of sed; like sed, it goes with the flow, with range patterns and pattern-action rules.
Here it is. I don’t have the patience to convert it into a do shell script at the moment.
You have to chmod u+x the script to make it executable, and feed it the list of file names either on stdin or in a file given as a command-line argument. Hearing about code points and umlauts doesn’t make it a whole lot better as far as usability goes, so this is merely an alternative solution. But it works, at least for me.
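#!/usr/bin/awk -f
# Sort file names on their base name (everything before the last dot),
# ignoring case, then print them with their extensions put back on.
# Note: names that share a base name (a.txt, a.pdf) overwrite each
# other's entry in ext[].
BEGIN { FS="" }
# No dot: the whole name is the sort key and there is no extension.
$0 !~ /\./ {
	ext[$0] = ""
	t = t "\n" $0
}
# Has a dot: strip the last extension and remember it under the base name.
$0 ~ /\./ {
	s = $0
	sub(/\.[^.]*$/, "", s)
	t = t "\n" s
	n = split($0, tarr, ".")
	ext[s] = tarr[n]
}
END {
	"mktemp -t awksort" | getline tf
	# t starts with an empty line, which sorts first; tail drops it again.
	command = "sort -df | tail -n +2 > " tf
	print t | command
	close(command)
	# Read the sorted base names back from the temporary file.
	n = 0
	while ((getline line < tf) > 0)
		fn[++n] = line
	close(tf)
	for (i = 1; i <= n; i++) {
		fnm = fn[i]
		if (ext[fnm] != "")
			print fnm "." ext[fnm]
		else
			print fnm
	}
}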
Edit: Just ask if you want to know how and why it works.
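For comparison, here is a sketch of a different way to get the same effect without a temporary file. This is not the script above; it decorates each name with its base name as a sort key, sorts on that key, and cuts the key off again. The function name sort_by_basename is made up, and the sketch assumes file names contain no tab characters.

```shell
# Decorate-sort-undecorate: key <TAB> original name, sort on the key, drop the key.
sort_by_basename() {
	tab=$(printf '\t')
	awk '{
		key = $0
		sub(/\.[^.]*$/, "", key)    # drop the last extension, if any
		print key "\t" $0
	}' |
	LC_ALL=C sort -t "$tab" -d -f -k1,1 |   # LC_ALL=C only pins the result for this demo
	cut -f2-
}

sorted=$(printf '%s\n' 'Compressed file 2.7z' 'Compressed file.7z' \
	'com.apple.iCal.plist' 'Com.apple.AddressBook.plist' | sort_by_basename)
printf '%s\n' "$sorted"
# prints:
# Com.apple.AddressBook.plist
# com.apple.iCal.plist
# Compressed file.7z
# Compressed file 2.7z
```

Because the full name travels along with the key, two names with the same base but different extensions both survive. Dropping LC_ALL=C makes sort use whatever locale the shell has.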