How to find out the encoding of a file?

Hi!

I don’t know of any native AppleScript method to find out the encoding of a file.

Is there a way to determine the encoding via Shell/Bash?

write "testtext" to (choose file) as unicode text

writes the text as UTF-16

write "testtext" to (choose file) as «class utf8»

writes the text as UTF-8

If I want to keep a file’s existing encoding when writing to it, what do I have to do?

Operating System: Mac OS X (10.4)

You can use a BOM (byte order mark) for UTF-16 files. The Mac writes Unicode text as big-endian, so the BOM is 0xFEFF. Those two bytes are not ASCII; read as MacRoman text they show up as “˛ˇ” (I don’t know if this forum will preserve them). Write them as the first two bytes of your UTF-16 files (Unicode text).

For UTF-8, there is a similar method: write the three bytes 0xEFBBBF at the start of the file (read as MacRoman text they show up as “Ôªø”). Any smart text editor should now recognize the file as UTF-8 encoded.
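Since the original question asked about Shell/Bash, here is a minimal sketch of checking for those BOM bytes from the shell. The function name `detect_bom` is made up for illustration, and it only looks at the first three bytes, so a file without a BOM is reported as "unknown" and a UTF-32 little-endian BOM (which also starts with FF FE) would be misreported.

```shell
# Hypothetical helper: report the encoding implied by a file's BOM, if any.
detect_bom() {
    # Read the first three bytes as lowercase hex, e.g. "efbbbf".
    sig=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        efbbbf*) echo "utf-8" ;;      # EF BB BF
        feff*)   echo "utf-16be" ;;   # FE FF (big-endian)
        fffe*)   echo "utf-16le" ;;   # FF FE (little-endian)
        *)       echo "unknown" ;;    # no BOM found
    esac
}
```

Call it as `detect_bom /path/to/file.txt`. From AppleScript, the same pipeline could presumably be run through `do shell script` and the result used to pick `as Unicode text` or `as «class utf8»` when writing.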

If I read a UTF-16 file with

read (choose file)

I get a string that starts with “˛ˇ”

But if I read a UTF-8 file with

read (choose file)

I get the content incorrectly decoded, but NOT starting with “Ôªø”

Conclusion:
Detecting utf16 works for me
Detecting utf8 doesn’t

Hmmm… Works fine here :confused:
(Tiger, but it should work also in previous versions…)

Try reading these two files…
http://homepage.mac.com/julifos/tests.zip → 1Kb

Now it works fine! Thx!

My utf8 file must have been corrupt!
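A UTF-8 file saved without a BOM will not start with “Ôªø” either, so the BOM check alone cannot confirm UTF-8. As a rough sketch (the function name `is_valid_utf8` is made up for illustration, and it assumes `iconv` is installed), you can ask `iconv` to decode the file as UTF-8 and look at its exit status. Note that plain ASCII files also pass, since ASCII is valid UTF-8.

```shell
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
is_valid_utf8() {
    # iconv exits non-zero when it meets a byte sequence that is not valid UTF-8.
    # The converted output is discarded; only the exit status matters.
    iconv -f UTF-8 -t UTF-16 "$1" > /dev/null 2>&1
}
```

Usage: `is_valid_utf8 /path/to/file.txt && echo "looks like UTF-8"`. A UTF-16 file that begins with the bytes FE FF fails this check, because those bytes never occur in valid UTF-8.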