This code was posted by Bruce Phillips back in 2006. I know nothing about Python.
Q1: Can Python strip diacriticals? i.e. é to e.
Q2: Is Python installed on all modern Macs? (10.3.x and later? 10.4.x and later?)
Q3: Is this the fastest method of converting text? I will use it to convert addresses.
I will be inserting this code in an AppleScript Studio app.
TIA
on changeCase of subject to someCase
    local subject, someCase, returnList
    set returnList to true
    if someCase is not in {"lower", "upper", "title", "capitalize"} then error "Invalid case for changeCase()" number 1
    -- Make one-item list if needed
    if subject's class is not list then set {subject, returnList} to {{subject}, false}
    count subject
    repeat with i from 1 to result
        do shell script "/usr/bin/python -c \"import sys; print unicode(sys.argv[1], 'utf8')." & someCase & "().encode('utf8')\" " & quoted form of (subject's item i)
        set subject's item i to result
    end repeat
    if not returnList then set subject to subject's first item
    return subject
end changeCase
changeCase of "hELLO, wORlD!" to "lower"
-- Python should handle accented characters as well. Try out this line:
-- changeCase of "hELLo, wORlD! éèê äöü ñ" to "upper"
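On Q3: the handler above starts a new Python process for every list item, so for a long list of addresses most of the time probably goes into launching Python rather than converting text. A minimal sketch of doing the whole list in one process, assuming Python 3 (the one-liner above is Python 2 syntax); the name change_case and the command-line layout are hypothetical, not part of the original script:

```python
import sys

# Same four case names the AppleScript handler accepts.
ALLOWED = ("lower", "upper", "title", "capitalize")


def change_case(items, some_case):
    """Apply one of the built-in str case methods to every item."""
    if some_case not in ALLOWED:
        raise ValueError("Invalid case for change_case()")
    return [getattr(s, some_case)() for s in items]


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical command line: the case name first, then the strings.
    # One converted string is printed per line.
    for line in change_case(sys.argv[2:], sys.argv[1]):
        print(line)
```

Called from AppleScript, this would mean a single do shell script call with all items quoted on the command line, then splitting the result into paragraphs. That breaks if an item itself contains a line break, so treat it as a starting point only.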
It doesn’t strip diacriticals for me (10.4.10).
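That is expected: upper() and lower() only change case and keep the accents. For Q1, Python can strip diacriticals, but it takes a separate step. A minimal sketch using the standard unicodedata module, assuming Python 3; the function name strip_diacriticals is just for illustration:

```python
import unicodedata


def strip_diacriticals(text):
    """Return text with combining accent marks removed."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacriticals("éèê äöü ñ"))  # eee aou n
```

Letters that have no decomposition into base letter plus mark (ø or ß, for example) pass through unchanged, so they would need their own replacement table.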
I thought a bit more about this and came up with the script below. It’s simple, fast, and native AppleScript. The code should be easy to understand for anyone who wants to improve it further.
--REMOVE DIACRITICALS AND SET CASE TO UPPER, lower, Title, .Sentence, & none
--written by GEOFF LACY
global upper
global lower
set none to "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" -- ; default
set upper to "ABCDEFGHIJKLMNOPQRSTUVWXYZ" -- upper
set lower to "abcdefghijklmnopqrstuvwxyz" -- lower
set title to "ABCDEFGHIJKLMNOPQRSTUVWXYZ " -- space at the end helps detect Title case
--set sentence to ""
-----------------------------------------------------------------------
set stringToConvert to "teS23a a 49348 tin g âîçÇé èå åÄ ü"
set _caseChange to upper --lower, upper, title, none; Considering title Vs. titleStrict
-----------------------------------------------------------------------
--MAIN
if _caseChange is equal to none then
    considering case but ignoring diacriticals
        get removeDiacriticalMarks(stringToConvert, _caseChange)
    end considering
else if (_caseChange is equal to upper or _caseChange is equal to lower) then
    ignoring diacriticals
        get removeDiacriticalMarks(stringToConvert, _caseChange)
    end ignoring
else if _caseChange is equal to title then
    ignoring diacriticals
        get removeDiacriticalMarks2(stringToConvert, _caseChange)
    end ignoring
end if
-----------------------------------------------------------------------
on removeDiacriticalMarks(stringToConvert, _caseChange) -- UPPER and lower case
    set newString to ""
    repeat with i from 1 to count of characters in stringToConvert
        set _index to offset of (character i in stringToConvert) in _caseChange
        if _index is not equal to 0 then -- 0=not found
            set newString to newString & character (_index) in _caseChange
        else
            set newString to newString & character i in stringToConvert
        end if
    end repeat
    return newString
end removeDiacriticalMarks
on removeDiacriticalMarks2(stringToConvert, _caseChange) --Title case
    set _titlecase to false
    set newString to ""
    --First character; Capitalize if A-Z
    set _index to offset of (character 1 in stringToConvert) in _caseChange
    if _index is not equal to 0 and _index is not equal to 27 then
        set newString to newString & character _index in upper
    else
        -- not a letter: keep it as is (index 27 is the space, which upper does not contain)
        set newString to newString & character 1 in stringToConvert
        if _index is equal to 27 then set _titlecase to true
    end if
    --Subsequent characters; Capitalize after a space
    repeat with i from 2 to count of characters in stringToConvert
        set _index to offset of (character i in stringToConvert) in _caseChange
        if _index is not equal to 0 then --character found
            if _index is not equal to 27 then -- not a space
                if _titlecase is true then
                    set newString to newString & character _index in upper
                    set _titlecase to false
                else
                    set newString to newString & character _index in lower
                end if
            else
                -- found a space, so the next letter will be capitalized.
                set newString to newString & character _index in _caseChange
                set _titlecase to true
            end if
        else
            set newString to newString & character i in stringToConvert
        end if
    end repeat
    return newString
end removeDiacriticalMarks2
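For comparison with the Python approach asked about at the top, here is a rough Python equivalent of the title-case handler. It is a sketch assuming Python 3, not a line-by-line port: title_case_stripped is a made-up name, and unlike the handler it treats the start of the string the same way as a space, so a letter that follows leading digits is also capitalized.

```python
import unicodedata


def title_case_stripped(text):
    """Strip accents and capitalize the first letter after each space."""
    # NFD separates base letters from their combining accent marks.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    capitalize_next = True  # the first letter of the string is capitalized
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent mark itself
        if ch == " ":
            capitalize_next = True
            out.append(ch)
        elif ch.isalpha():
            out.append(ch.upper() if capitalize_next else ch.lower())
            capitalize_next = False
        else:
            # Digits and punctuation pass through and do not reset the flag.
            out.append(ch)
    return "".join(out)


print(title_case_stripped("teS23a a 49348 tin g âîçÇé èå åÄ ü"))
# Tes23a A 49348 Tin G Aicce Ea Aa U
```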