set aFile to POSIX path of (choose file)
tell application id "au.com.myriad-com.ASObjC-Runner" -- ASObjC Runner.app
	set usedFormat to value for key path "decomposedStringWithCanonicalMapping" of item aFile
end tell
Can I use the code for text in general, and not just for files? When I deal with filenames it usually goes well, as they are obtained through some operating system call. (That is how I did it with ICU and wide chars: I just made a representation of them that couldn’t be translated back, and passed the “originals” back and forth to the OS.) The trouble arises when the filenames appear inside a recent items list, for instance.
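For plain text rather than file paths, the same canonical decomposition can be tried outside ASObjC Runner. Here is a minimal Python sketch, assuming the standard `unicodedata` module is an acceptable stand-in for `decomposedStringWithCanonicalMapping` (Unicode form NFD). The helper names and the sample string are made up for illustration.

```python
import unicodedata

def decompose(text):
    """Canonical decomposition (NFD): 'é' becomes 'e' + U+0301."""
    return unicodedata.normalize("NFD", text)

def precompose(text):
    """Canonical composition (NFC): 'e' + U+0301 becomes 'é'."""
    return unicodedata.normalize("NFC", text)

sample = "caf\u00e9"            # hypothetical name with a precomposed é
decomposed = decompose(sample)
print(len(sample), len(decomposed))       # 4 5
print(precompose(decomposed) == sample)   # True
```

The two forms look identical on screen but compare as different strings, which is why a name taken from a recent items list may fail to match the name the file system returns.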
Thank you very much for sharing!
You know, this is a huge problem in the Linux world when people come from, say, the Czech Republic and have accents in their file names. The upside is that they won’t get Linus Torvalds to speak about it when they migrate.
And as far as I know, HFS+ was released before Unicode had completed its definition of decomposed characters. So you could be right that decomposed characters in HFS+ don’t 100% match Unicode 3.0 and up. Could be right? Yes, we’re talking about the Mac OS (8 and 9) and early Mac OS X (10.0 through 10.2) versions of the file system.
With Mac OS X 10.2.2, Apple introduced a new file system named HFS+ with journaling. This was an opportunity for Apple to update the file name encoding as well. Going by a logical time schedule, I think HFS+ has been on Unicode 3.2 since Panther, so there are no longer issues with decomposed characters.
Edit: Since I like to keep my posts close to the facts, I did some searching and found something here that confirmed my assumptions.
My script in post #36 works by deleting numbered entries, so the handling of Unicode characters in the names is something of a cosmetic nicety in its case. However, in getting the edited text into AppleScript with one shell script and then writing the result back to the plist file with another, the number of backslashes in the file tends to increase! This isn’t noticeable in the application itself, which presumably reads the name of the file referenced by the Bookmark data, but the name display in the script on successive runs becomes somewhat grotesque! This version does Unicode conversions and keeps backslashes under control:
pruneOpenRecentMenu("com.apple.iWork.Pages.LSSharedFileList")

on pruneOpenRecentMenu(domain)
	-- Get a list of the application's "Open Recent" menu item names.
	set RecentNames to getRecentNames(domain)
	-- Ask which items to remove from the menu.
	activate
	try
		set namesToCut to (choose from list RecentNames with prompt "Delete document(s):" with multiple selections allowed)
	on error
		display dialog "Either the \"Open Recent\" menu's empty or the input domain's faulty." buttons {"OK"} default button 1 with icon stop
		set namesToCut to false
	end try
	if (namesToCut is false) then error number -128
	-- Create individual "delete numbered entry" lines matching the numbers displayed with the selected names.
	set cutRegex to linefeed
	repeat with i from (count namesToCut) to 1 by -1
		set cutRegex to cutRegex & "s/[,[:space:]]+\\{[^\"}]+(\"[^[:cntrl:]]+\\n[^}]*)?\\}//" & word 1 of item i of namesToCut & " ;" & linefeed
	end repeat
	-- Derive an edited CustomListItems "array" to use in a "defaults write" shell script.
set newData to quoted form of text 1 thru -2 of (do shell script ("defaults read " & domain & " RecentDocuments | sed -En '
/\\(/,$ H ;
$ {
g ;" & ¬
cutRegex & ¬
"s/\\\\\\\\/\\\\/g ; #Adjustment to avoid writing back more backslashes than came out!
s/^[^(]+\\(,?/\\(/ ;
s/\\);[^)]+$/\\)/p ;
}'") without altering line endings)
-- Write the edited array text back to the plist file.
do shell script ("defaults write " & domain & " RecentDocuments -dict-add \"CustomListItems\" " & newData)
end pruneOpenRecentMenu
on getRecentNames(domain)
-- Parse the "Open Recent" document names from the plist file.
set RecentNames to (do shell script ("defaults read " & domain & " RecentDocuments |
sed -En '/^[[:space:]]*CustomListItems[[:space:]]*=[[:space:]]*\\(/,/^[[:space:]]+\\)/ s/^[[:space:]]+Name[[:space:]]*=[[:space:]\"]*(.*(\\\\\"|[^\"]))\"?;$/\\1/p' |
sed '= ' |
sed 'N ;
s/\\n/. / ;
s/\\\\\\\\\"/\"/g ;'"))
-- Convert any Unicode circumlocutions to Unicode characters.
set RecentNames to decodeDefaultsUnicode(RecentNames)
return paragraphs of RecentNames
end getRecentNames
on decodeDefaultsUnicode(defaultsText)
	-- Split the text at each escape marker; every text item after the first
	-- then begins with the four hex digits of one escaped character.
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "\\\\U"
	set output to defaultsText's text items
	repeat with i from 2 to (count output)
		set thisTextItem to item i of output
		-- Replace the four hex digits with the character they stand for.
		set UniChar to (character id hex2Int(text 1 thru 4 of thisTextItem))
		if ((count thisTextItem) > 4) then
			set item i of output to UniChar & text 5 thru -1 of thisTextItem
		else
			set item i of output to UniChar
		end if
	end repeat
	-- Rejoin the pieces and restore the original delimiters.
	set AppleScript's text item delimiters to ""
	set output to output as text
	set AppleScript's text item delimiters to astid
	return output
end decodeDefaultsUnicode
on hex2Int(hexStr)
	-- Convert a string of hex digits to an integer, one digit at a time.
	set hexDigits to "0123456789abcdef"
	set int to 0
	repeat with thisDigit in hexStr
		set int to int * 16 + (offset of thisDigit in hexDigits) - 1
	end repeat
	return int
end hex2Int
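For anyone who wants to check the decoding logic away from a Mac, the two handlers above can be mirrored in Python. This is only a sketch. The function names are mine, the sample entry is made up, and it assumes, as the handlers do, that each escape is the two-backslash `U` marker followed by exactly four hex digits.

```python
def hex_to_int(hex_str):
    """Mirror of hex2Int: accumulate base-16 digits left to right."""
    hex_digits = "0123456789abcdef"
    value = 0
    for digit in hex_str.lower():
        value = value * 16 + hex_digits.index(digit)
    return value

def decode_defaults_unicode(text, marker="\\\\U"):
    """Mirror of decodeDefaultsUnicode: split on the escape marker and
    turn the four hex digits after each marker into one character."""
    pieces = text.split(marker)
    for i in range(1, len(pieces)):
        piece = pieces[i]
        pieces[i] = chr(hex_to_int(piece[:4])) + piece[4:]
    return "".join(pieces)

escaped = "1. caf" + "\\\\U00e9" + ".pages"   # made-up sample entry
print(decode_defaults_unicode(escaped))        # 1. café.pages
```

Like the AppleScript version, this handles only four-digit escapes, so a character outside the Basic Multilingual Plane would need extra handling.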
I can’t really say what I want to say, because then Administrators will turn up, telling me to mind my language!
I actually wrote an “encoder”, but realized that I’d have to do it single-handedly, since each escaped Unicode character was spread out over several UTF-8 bytes. I took a break, and here it is, in pure AS and sed.
They say seeing is believing, but sometimes one has to wonder.
I have UTF-16 characters with diacriticals that must be precomposed, and Nigel’s code does the trick!
(For those interested in the decomposed business, it turns out that you’ll have to use NSString and not CFString(ref)).