Read file - respecting encoding

The Standard Additions read command is not ideal, because you never know in advance how to read a simple “*.txt” file:

  • Chinese text and smileys need “read as Unicode text”
  • Western text documents get messed up if you use “read as Unicode text”

You can ask the file command what it thinks the document is:

do shell script "file --brief '" & POSIX path of sel & "'"

But that analysis isn’t sufficient to determine the encoding of a txt file. If I knew the encoding, I could simply pass it to iconv -f.

I want a reliable way to convert txt documents into UTF-8 encoded files, so I tried:

do shell script "iconv -t ASCII//TRANSLIT '" & POSIX path of sel & "'"

But it throws an error.

Standard Additions’ “read as Unicode text” reads the file as UTF-16; for UTF-8, use “read as «class utf8»”.
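
As a minimal sketch of the two forms side by side (the choose file prompt and the variable names are only illustrative, not part of the original question):

```applescript
-- Pick a text file interactively (illustrative; any file reference or alias works).
set theFile to choose file with prompt "Choose a text file:"

-- Interpret the file's bytes as UTF-8.
set utf8Text to read theFile as «class utf8»

-- Interpret the same bytes as UTF-16; a file that is not UTF-16 typically comes out garbled.
set utf16Text to read theFile as Unicode text
```

Neither form detects the encoding for you; each one simply applies the encoding you name.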

To answer your question directly:

use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

on convertTextFileAt:posixPath
	set pathNSString to current application's NSString's stringWithString:posixPath
	set theExt to pathNSString's pathExtension()
	-- Only handle files with a "txt" extension
	if theExt as text is "txt" then
		-- Load the raw bytes without assuming any encoding
		set theNSData to current application's NSData's dataWithContentsOfFile:posixPath
		-- Do not allow lossy conversion while detecting the encoding
		set theOptionsDict to current application's NSDictionary's dictionaryWithObject:false ¬
			forKey:(current application's NSStringEncodingDetectionAllowLossyKey)
		-- Returns the detected encoding plus the decoded string (via the reference parameter)
		set {theEncoding, theNSString} to current application's NSString's stringEncodingForData:theNSData ¬
			encodingOptions:theOptionsDict convertedString:(reference) usedLossyConversion:(missing value)
		-- An encoding of 0 means detection failed
		if theEncoding = 0 then
			error "Unknown encoding"
		end if
		-- Write a UTF-8 copy next to the original, e.g. "notes.txt" becomes "notes-utf8.txt"
		set theNewPath to pathNSString's stringByDeletingPathExtension()'s stringByAppendingString:"-utf8.txt"
		theNSString's writeToFile:theNewPath atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	end if
end convertTextFileAt:
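
A possible way to call the handler, as a sketch: it assumes these lines live in the same script as the handler above, and it reuses the question's variable name sel with an interactive file prompt that is only illustrative.

```applescript
-- Hypothetical caller; the prompt text is an assumption, not from the original post.
set sel to choose file with prompt "Choose a .txt file:"

-- Handlers declared with Objective-C style (interleaved) parameters
-- need an explicit target such as "my" when called.
my convertTextFileAt:(POSIX path of sel)
```

If detection succeeds, a UTF-8 copy is written next to the original with “-utf8.txt” in place of the “.txt” extension; the original file is left untouched.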