[url=http://www.unicode.org/faq/utf_bom.html]http://www.unicode.org/faq/utf_bom.html[/url]
I don’t have Excel, which is apparently where the problem lies.
I do have Numbers 2.3. I’ve saved five CSV files containing the data for the range “A2:C4” in your image in post #16. The formats are UTF-8 without BOM, UTF-8 with BOM, UTF-16BE (i.e. File Read/Write’s ‘as Unicode text’) without BOM, UTF-16BE with BOM, and UTF-16LE (‘as «class ut16»’ on an Intel machine) with BOM. The BOM’s unavoidable in the last instance and in any case, BOM-less UTF-16 (according to the blurb at the above link) should be assumed to be big-endian. In fact, though, only the BOM-less UTF-16 is misinterpreted when the files are opened in Numbers. (It’s apparently not recognised as Unicode text.) The others are all rendered perfectly.
Copying the files over to my G5 (where of course the version written ‘as Unicode text’ with a BOM is identical to one written locally ‘as «class ut16»’), the results are exactly the same. The data are all rendered perfectly except for the UTF-16BE without BOM.
So your problem’s not intrinsic to CSV. It lies with the application interpreting CSV’s somewhat loose rules.
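If you want to check which BOM a particular file actually starts with, something along these lines should do it. It’s only a sketch: identifyBOM is a handler name I’ve made up for illustration, it assumes it’s given an HFS path to an existing file, and it only looks for the UTF-8 and UTF-16 BOMs. (A UTF-32LE file would be reported as UTF-16LE, since its BOM begins with the same two bytes.)

```applescript
-- Hypothetical helper, not part of the script below: report which
-- Unicode BOM, if any, begins the file at the given HFS path.
on identifyBOM(hfsPath)
	set fRef to (open for access file hfsPath)
	try
		set fileLength to (get eof fRef)
		set bomName to "none"
		-- The UTF-8 BOM is three bytes long.
		if (fileLength ≥ 3) then
			if ((read fRef from 1 for 3 as data) is «data rdatEFBBBF») then set bomName to "UTF-8"
		end if
		-- The UTF-16 BOMs are two bytes long.
		if ((bomName is "none") and (fileLength ≥ 2)) then
			set firstTwo to (read fRef from 1 for 2 as data)
			if (firstTwo is «data rdatFEFF») then
				set bomName to "UTF-16BE"
			else if (firstTwo is «data rdatFFFE») then
				set bomName to "UTF-16LE"
			end if
		end if
	on error msg number errNum
		-- Don't leave the file open if anything goes wrong.
		close access fRef
		error msg number errNum
	end try
	close access fRef
	return bomName
end identifyBOM

identifyBOM((path to desktop as text) & "Test UTF-16BE.csv")
```

A result of "none" only means that none of these three BOMs was found. It says nothing about what encoding the file is really in.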
To cover all the bases, here’s the script I used to prepare and write the data. It’s hard-wired to use a comma delimiter:
on list2csv(listOfLists)
	copy listOfLists to listOfLists -- Work on a private copy so that the caller's lists aren't changed.
	set csvQuotedQuote to quote & quote
	set csvRecordDelimiter to return & linefeed
	set astid to AppleScript's text item delimiters
	repeat with i from 1 to (count listOfLists)
		set recordList to item i of listOfLists
		repeat with j from 1 to (count recordList)
			set fieldValue to (item j of recordList) as text
			-- Double any quotes within the field.
			if (fieldValue contains quote) then
				set AppleScript's text item delimiters to quote
				set fieldValue to fieldValue's text items
				set AppleScript's text item delimiters to csvQuotedQuote
				set fieldValue to fieldValue as text
			end if
			-- Quote the field if it contains a quote, a comma, or a line break, or if it begins or ends with a space.
			if ((fieldValue contains quote) or (fieldValue contains ",") or ((count fieldValue's paragraphs) > 1) or (fieldValue begins with space) or (fieldValue ends with space)) then set item j of recordList to quote & fieldValue & quote
		end repeat
		-- Join this record's fields with commas.
		set AppleScript's text item delimiters to ","
		set item i of listOfLists to recordList as text
	end repeat
	-- Join the records with CRLF.
	set AppleScript's text item delimiters to csvRecordDelimiter
	set csv to listOfLists as text
	set AppleScript's text item delimiters to astid
	return csv
end list2csv

set myData to {{"one", "two", "three"}, {"Régulier a", "b", "c"}, {" Régulier d", "e,h", "f"}}
set csv to list2csv(myData)
-- Edit the following variously to write the data in different UTFs.
set fRef to (open for access file ((path to desktop as text) & "Test UTF-16BE.csv") with write permission)
try
	set eof fRef to 0
	-- BOM values, if used, at beginnings of files:
	-- When writing 'as «class utf8»', use «data rdatEFBBBF»
	-- When writing 'as Unicode text' (UTF-16BE, i.e. big-endian), use «data rdatFEFF»
	-- When writing 'as «class ut16»' (UTF-16LE, little-endian), a processor-native BOM is prepended automatically by the system TO EACH SECTION OF TEXT WRITTEN.
	write «data rdatFEFF» to fRef
	write csv to fRef as Unicode text
on error msg
	display dialog msg buttons {"OK"}
end try
close access fRef