Now, if the op inserts MACCENTRALEUROPE for MACROMAN or CP1250 if that doesn’t work, his problems should be solved.
I come to think of it, that if he meets a comma, and the values are comma separated, then this may lead to problems, , may he is better off using “;” for “,” when generating the csv.
He can choose his field separator in awk anyways. with FS=“;”.
Sorry guys, maybe I didn’t make clear enough what I need to do.
I often get Excel files that contain the same text in different languages, so I need to merge them to create an import file that needs to be CSV UTF-8.
So the workflow would be:
1 “ Select files to merge
2 - If they are xls|xlsx files convert them to UTF-8 csv
3 - Type output filename and select output folder
4 - Merge keeping the header row only of the first file
Point 4 can be achieved using the following script; points 1, 2 and 3 are still a question mark eheh
set theFiles to choose file with multiple selections allowed
set fileName to text returned of (display dialog "Type file name:" default answer ".csv")
set saveFile to POSIX path of (((path to desktop) as text) & fileName)
do shell script "cp " & quoted form of POSIX path of item 1 of theFiles & space & quoted form of saveFile
if (count of theFiles) = 1 then return
repeat with fileToAdd in theFiles
set x to quoted form of POSIX path of fileToAdd
do shell script "cat " & x & " | awk 'BEGIN {RS=\"\\r\\n\"; getline}{print $0}' >> " & quoted form of saveFile
end repeat
What are you wanting to do with the merged CSV file ?
If you want to open it with Excel, you will always get odd results.
Exporting the csv as xls with Numbers do the wanted trick (at least with the encoding used by Excel in French).
I’m not sure with the one used for Czech.
If you really need to open the final file in Excel, I will try to get a link allowing you to download the trial version of iWork '09 (Apple removed it from its servers).
it needs to be imported into a content management system, it doesn’t need to open in Excel. Only some some source files are received in xls so they need to be converted into csv.
No need to open the final file in Excel
Opening the file in Numbers, which I have, and exporting works, but I want to automate this with a script, because there are from 10 to 30 files to process every time.
tell application "Microsoft Excel"
activate
set outFile to (path of active workbook)
set fileName to (name of active workbook)
set outFile to (outFile & ":" & fileName & ".csv")
try
set openFile to open for access file outFile with write permission
set eof openFile to 0
set lastCol to count of columns of used range of active sheet
set lastRow to count of rows of used range of active sheet
repeat with r from 1 to lastRow
set cellVal to (value of cell r of column 1 of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal as text is not "," then
set rowStr to cellVal
else
set rowStr to "0" & cellVal
end if
else
set rowStr to ""
end if
repeat with c from 2 to lastCol
set cellVal to (value of cell r of column c of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal is not "," then
set rowStr to rowStr & "," & cellVal
else
set rowStr to rowStr & "," & "0" & cellVal
end if
else
set rowStr to rowStr & ","
end if
end repeat
write rowStr & return to openFile as «class utf8»
end repeat
close access openFile
on error e number n
-- Chris Stone
set {cr, sep} to {return, "------------------------------------------"}
set errmsg to sep & cr & "Error: " & e & cr & sep & cr & "Error
Number: " & n & cr & sep
tell application "SystemUIServer"
activate
display dialog errmsg
end tell
set fail to true
end try
end tell
As I suggested in a post above, you should really use semi colon, or colon as a field separator when you are using a “,” as the decimal separator.
I have changed the script to use semi-colon!
tell application "Microsoft Excel"
activate
set outFile to (path of active workbook)
set fileName to (name of active workbook)
set outFile to (outFile & ":" & fileName & ".csv")
try
set openFile to open for access file outFile with write permission
set eof openFile to 0
set lastCol to count of columns of used range of active sheet
set lastRow to count of rows of used range of active sheet
repeat with r from 1 to lastRow
set cellVal to (value of cell r of column 1 of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal as text is not ";" then
set rowStr to cellVal
else
set rowStr to "0" & cellVal
end if
else
set rowStr to ""
end if
repeat with c from 2 to lastCol
set cellVal to (value of cell r of column c of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal is not ";" then
set rowStr to rowStr & ";" & cellVal
else
set rowStr to rowStr & ";" & "0" & cellVal
end if
else
set rowStr to rowStr & ";"
end if
end repeat
write rowStr & return to openFile as «class utf8»
end repeat
close access openFile
on error e number n
-- Chris Stone
set {cr, sep} to {return, "------------------------------------------"}
set errmsg to sep & cr & "Error: " & e & cr & sep & cr & "Error
Number: " & n & cr & sep
tell application "SystemUIServer"
activate
display dialog errmsg
end tell
set fail to true
end try
end tell
Here is an edited version taking care of the local delimiter.
character 2 of (0.5 as text)
if result is "," then
set delim to ";"
else
set delim to ","
end if
tell application "Microsoft Excel"
activate
set outFile to (path of active workbook)
set fileName to (name of active workbook)
set outFile to (outFile & ":" & fileName & ".csv")
try
set openFile to open for access file outFile with write permission
set eof openFile to 0
set lastCol to count of columns of used range of active sheet
set lastRow to count of rows of used range of active sheet
repeat with r from 1 to lastRow
set cellVal to (value of cell r of column 1 of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal as text is not delim then
set rowStr to cellVal
else
set rowStr to "0" & cellVal
end if
else
set rowStr to ""
end if
repeat with c from 2 to lastCol
set cellVal to (value of cell r of column c of active sheet) as text
if cellVal is not "" then
if text 1 of cellVal is not delim then
set rowStr to rowStr & delim & cellVal
else
set rowStr to rowStr & delim & "0" & cellVal
end if
else
set rowStr to rowStr & delim
end if
end repeat
write rowStr & return to openFile as «class utf8»
end repeat
close access openFile
on error e number n
-- Chris Stone
set {cr, sep} to {return, "------------------------------------------"}
set errmsg to sep & cr & "Error: " & e & cr & sep & cr & "Error
Number: " & n & cr & sep
tell application "SystemUIServer"
activate
display dialog errmsg
end tell
set fail to true
end try
end tell
Yvan KOENIG (VALLAURIS, France) samedi 8 septembre 2012 16:58:47
tell application "Microsoft Excel"
activate
set outFile to (path of active workbook)
set fileName to (name of active workbook)
set outFile to (outFile & ":" & fileName & ".csv")
try
set openFile to open for access file outFile with write permission
set eof openFile to 0
set lastCol to count of columns of used range of active sheet
set lastRow to count of rows of used range of active sheet
repeat with r from 1 to lastRow
set cellVal to (value of cell r of column 1 of active sheet) as text
set rowStr to cellVal
repeat with c from 2 to lastCol
set cellVal to (value of cell r of column c of active sheet) as text
set rowStr to rowStr & ";" & cellVal
end repeat
write rowStr & return to openFile as «class utf8»
end repeat
close access openFile
on error e number n
-- Chris Stone
set {cr, sep} to {return, "------------------------------------------"}
set errmsg to sep & cr & "Error: " & e & cr & sep & cr & "Error
Number: " & n & cr & sep
tell application "SystemUIServer"
activate
display dialog errmsg
end tell
set fail to true
end try
end tell
difference between excel 2008 and 2011 is opening:
2008 it opens as macroman
2011 import the csv into cell A1 and select the correct encoding, delimiter and column types so you’re sure all data is in your sheet.
When it shows wrong in excel it is probably the correct encoding, because utf-8 CSV files should be displaying special characters wrong. Set ‘opening character encoding’ of text edit’s preferences to utf-8 and open the CSV file there, if the characters are displayed correctly, the encoding is correct. Because you said that my script with CRLR (return followed with a linefeed) is the correct line terminator I assumed that the encoding you got is Windows. Not certain if it’s UTF-8 or not already but here a script where you can change the first two variable values to the correct encodings.
set fromEncoding to "CP1252"
set toEncoding to "UTF-8"
set iconvCommand to " | iconv -f " & fromEncoding & " -t " & toEncoding & space
set theFiles to choose file with multiple selections allowed
set fileName to text returned of (display dialog "Type file name:" default answer ".csv")
set saveFile to POSIX path of (((path to desktop) as text) & fileName)
do shell script "cat " & quoted form of POSIX path of item 1 of theFiles & iconvCommand & "> " & quoted form of saveFile
if (count of theFiles) = 1 then return
repeat with fileToAdd in theFiles
set x to quoted form of POSIX path of fileToAdd
do shell script "cat " & x & " | awk 'BEGIN {RS=\"\\r\\n\"; getline}{print $0}' " & iconvCommand & ">> " & quoted form of saveFile
end repeat
Now we assume that the encoding of the selected files is CP1252 (Windows latin 1) and that the output file, the merged CSV file, is UTF-8 encoded.
We’re working with CSV files and automate these actions for CMS systems of world leading WebShops and we know the drawbacks (read my post in previous posts). How about fields that needs text indicators (quoting), does the CMS support that as well? You don’t want that the lines shifts and that the wrong copy or price will be related to a product.