Convert XLS files into CSV preserving UTF-8 encoding

Will do, thanks Yvan :slight_smile:

Hmm something that I missed that would be really good for the script.

As the files are in different single folders, is it possible to select the parent folder and let the script pick up the files in its subfolders, unless single files are selected?

As far as I know, the function choose file can’t apply to select items in different folders.

So several schemes are available :

(1) no longer use choose file.
select the files in their folder from the Finder then drop the selection onto a revised script

(2) in a single folder, create aliases of the different files stored in different locations

(3) insert the call to choose file in a loop allowing you to select files.

May you test this code and tell me if you agree with it ?


set theFiles to {}
repeat
	try
		choose file with multiple selections allowed
		set theFiles to theFiles & result
		log theFiles
	on error number errNbr
		
		if errNbr is -128 then
			display dialog "Press OK to exit the loop selecting the files." & return & "Press Cancel to exit the script."
			exit repeat
		end if
	end try
end repeat
theFiles

It allow you to select files in different folders.
When you have done, press the Cancel button.
The script will ask you to :
Press OK to exit the loop selecting the files.
Press Cancel to exit the script.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 16:21:15

Hmmm it doesn’t seem to do anything for me, if I select a folder it just goes inside the folder and I need to keep selecting.

The situation is often like this

Main folder
|-Project subfolder
|—Language #1
|------Actual-file.xlsx
|—Language #2
|------Actual-file.xlsx
|—Language #3
|------Actual-file.xlsx



So it would be good to just select the “Main folder” and let it pick out the files.

I understand your description but as you wrote, it’s often the described one, not always.

The piece of code is supposed to do the trick in every situation.

display choose file dialog
select files in folder A
press OK
display choose file dialog
select files in folder B
press OK
display choose file dialog
select files in folder C
press OK
.
display choose file dialog
when all the wanted files are selected, press Cancel.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 16:42:59

Yep you’re right, and you were saying that it cannot do both - select files inside the parent folder and subfolders, or just single files? :slight_smile:

We can’t select files in different folder from a single dialog.
My code use a loop.
pass 1
select file(s) in folder A
pass 2
select file(s) in folder B
.

may I insert the piece of code in the script ?

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 17:05:20

Hmm, but when I select the folder, it doesn’t select the folder and moves on, it just goes into the folder and then I have to select the file myself?

Isn’t it what I wrote ?

You said that often the scheme matches what you described.
As far as I know, often is not the same than always.

If I apply the scheme which you described,
if it doesn’t match the real situation, you will get odd results.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 17:24:28

Sorry, I’m confusing you! :smiley:

I was thinking to keep the script the same with single file selection which will work every time.

But adding an option to select the parent folder to get all the files inside as well.

Here is a dispatcher offering the two schemes.
Is it OK for you ?


set lesFichiersExcel to {}
set scanSubFolders to "Scan subfolders"
display dialog "Choose the selection mode" buttons {"Cancel", "Use a loop", scanSubFolders} default button 3
button returned of result is scanSubFolders

if result then
	choose folder with prompt "Select the folder stroring subfolders storing Excel files"
	set mainFolder to result as text
	tell application "System Events"
		set maybe to (disk items of folder mainFolder whose (type identifier is "org.openxmlformats.spreadsheetml.sheet" or type identifier is "com.microsoft.excel.xls"))
		if maybe is not {} then set lesFichiersExcel to lesFichiersExcel & maybe
		
		path of folders of folder mainFolder
		repeat with aFolder in result
			set maybe to (disk items of folder aFolder whose (type identifier is "org.openxmlformats.spreadsheetml.sheet" or type identifier is "com.microsoft.excel.xls"))
			if maybe is not {} then set lesFichiersExcel to lesFichiersExcel & maybe
		end repeat
	end tell
	
else
	
	set continue_Local to my getLocalString("Finder", "AL3")
	set cancel_Local to my getLocalString("Finder", "AL1")
	repeat
		try
			choose file with prompt "Select the files to merge" & return & "Press " & cancel_Local & " to exit the loop." of type {"org.openxmlformats.spreadsheetml.sheet", "com.microsoft.excel.xls", "com.apple.traditional-mac-plain-text", "public.comma-separated-values-text"} with multiple selections allowed
			set lesFichiersExcel to lesFichiersExcel & result
		on error errMsg number errNbr
			if errNbr is -128 then
				display dialog "Press " & continue_Local & " to exit the loop selecting the files." & return & "Press " & cancel_Local & " to exit the script." buttons {cancel_Local, continue_Local}
				exit repeat
			else
				error errMsg number errNbr
			end if
		end try
	end repeat
end if
lesFichiersExcel

on getLocalString(the_App, the_Key)
	tell application the_App to return localized string the_Key
end getLocalString


Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 18:05:23

Yep seems to be working fine, the log shows it’s picking up the files :slight_smile:

Can we try integrating it in the script? :slight_smile:

It’s done but I’m refining some details.

It seems to be OK now.
I made some changes to make it easier to read.


#=====

on run
	local lesFichiersExcel, parleAnglais, cancel_Local, scanSubFolders, mainFolder, maybe
	local theFolder, theExt, theName, thePathname, theOpenFile, uneSource, i
	
	# I use this flag because in a script posted in such forum, I can't use the standard localization scheme
	# If you use neither English nor French, you may edit the test and the strings given in French
	set parleAnglais to (do shell script "defaults read 'Apple Global Domain' AppleLocale") does not start with "fr_"
	
	set cancel_Local to my getLocalString("Finder", "AL1")
	
	if parleAnglais then
		set scanSubFolders to "Scan subfolders"
		display dialog "Choose the selection mode" buttons {"Cancel", "Use a loop", scanSubFolders} default button 3
	else
		set scanSubFolders to "Balayage des sous-dossiers"
		display dialog "Choisir le mode de sélection" buttons {cancel_Local, "Boucle", scanSubFolders} default button 3
	end if
	button returned of result is scanSubFolders
	if result then
		set lesFichiersExcel to my useScan(parleAnglais)
	else
		set lesFichiersExcel to my useLoop(parleAnglais)
	end if
	# Extract the container, the name, the name extension of the first item in the list of selected files
	tell application "System Events" to tell disk item (item 1 of lesFichiersExcel as text)
		set {theFolder, theName, theExt} to {path of container, name, name extension}
	end tell
	set theName to (text 1 thru -(1 + (length of theExt)) of theName) & "csv"
	
	# Define the name of the merged file
	display dialog "Define the merged file name" default answer theName
	set theName to text returned of result
	if theName does not end with ".csv" then set theName to theName & ".csv"
	
	# Select the folder in which the merged file will be saved
	choose folder with prompt "Location of the merged file" default location (theFolder as alias)
	
	# Build the pathname of the merged file
	set thePathname to (result as text) & theName
	
	# Applying close access in the error handler was a bad idea.
	# So I try to close access here in case of a file remaining open
	try
		close access file thePathname
	end try
	set theOpenFile to open for access file thePathname with write permission # EDITED
	set eof theOpenFile to 0
	
	# Treats the first source file keeping its first row
	set uneSource to item 1 of lesFichiersExcel
	tell application "System Events" to tell disk item (uneSource as text)
		type identifier
	end tell
	if result is in {"org.openxmlformats.spreadsheetml.sheet", "com.microsoft.excel.xls"} then
		my treatAnExcelFile(uneSource, 1, theOpenFile)
	else
		my treatAcsvFile(uneSource, 1, theOpenFile)
	end if
	
	# Treat the other source file(s) dropping the first row
	repeat with i from 2 to count lesFichiersExcel
		set uneSource to item i of lesFichiersExcel
		tell application "System Events" to tell disk item (uneSource as text)
			type identifier
		end tell
		if result is in {"org.openxmlformats.spreadsheetml.sheet", "com.microsoft.excel.xls"} then
			my treatAnExcelFile(uneSource, 2, theOpenFile)
		else
			my treatAcsvFile(uneSource, 2, theOpenFile)
		end if
	end repeat
	try
		close access theOpenFile # EDITED
	end try
end run

#=====

on useScan(parle_Anglais)
	local mainFolder, maybe, les_Fichiers_Excel
	set les_Fichiers_Excel to {} # ADDED <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
	if parle_Anglais then
		choose folder with prompt "Select the folder stroring subfolders storing Excel files"
	else
		choose folder with prompt "Choisir le dossier contenant les sous-dossiers contenant les fichiers Excel"
	end if
	set mainFolder to result as text
	# CAUTION in this piece of code, the script grab only the Excel files. you may add the other types if you want
	tell application "System Events"
		set maybe to (path of disk items of folder mainFolder whose (type identifier is "org.openxmlformats.spreadsheetml.sheet" or type identifier is "com.microsoft.excel.xls"))
		if maybe is not {} then set les_Fichiers_Excel to les_Fichiers_Excel & maybe
		path of folders of folder mainFolder
		repeat with aFolder in result
			set maybe to (path of disk items of folder aFolder whose (type identifier is "org.openxmlformats.spreadsheetml.sheet" or type identifier is "com.microsoft.excel.xls"))
			if maybe is not {} then set les_Fichiers_Excel to les_Fichiers_Excel & maybe
		end repeat
	end tell -- System Events
	return les_Fichiers_Excel
end useScan

#=====

on useLoop(parle_Anglais)
	local continue_Local, cancel_Local, les_Fichiers_Excel, errNbr
	set continue_Local to my getLocalString("Finder", "AL3")
	set cancel_Local to my getLocalString("Finder", "AL1")
	set les_Fichiers_Excel to {}
	
	# "com.apple.traditional-mac-plain-text" is the type identifier of the csv generated by the export scripts.
	# "public.comma-separated-values-text" is the type identifier of the csv generated by the export feature of Numbers.
	
	repeat
		try
			if parle_Anglais then
				choose file with prompt "Select the files to merge" & return & "Press "Cancel" to exit the loop." of type {"org.openxmlformats.spreadsheetml.sheet", "com.microsoft.excel.xls", "com.apple.traditional-mac-plain-text", "public.comma-separated-values-text"} with multiple selections allowed
			else
				choose file with prompt "Choisir les fichiers à  fusionner" & return & "Presser « " & cancel_Local & " » pour sortir de la boucle." of type {"org.openxmlformats.spreadsheetml.sheet", "com.microsoft.excel.xls", "com.apple.traditional-mac-plain-text", "public.comma-separated-values-text"} with multiple selections allowed
			end if
			set les_Fichiers_Excel to les_Fichiers_Excel & result
		on error errMsg number errNbr
			if errNbr is -128 then
				if parle_Anglais then
					display dialog "Press "Continue" to apply the end of the script." & return & "Press "Cancel" to exit the script." buttons {"Cancel", "Continue"}
				else
					display dialog "Presser « " & continue_Local & " » pour exécuter la suite du script." & return & "Presser « " & cancel_Local & " » pour quitter le script." buttons {cancel_Local, continue_Local}
				end if
				exit repeat
			else
				error errMsg number errNbr
			end if
		end try
	end repeat
	return les_Fichiers_Excel
end useLoop

#=====

# CAUTION, I removed a parameter so that both handlers use the same parameters !

on treatAnExcelFile(a_Pathname, start_Row, the_OpenFile)
	local delim, lastCol, lastRow, r, cellVal, rowStr, e, n
	character 2 of (0.5 as text)
	if result is "," then
		set delim to ";"
	else
		set delim to ","
	end if
	
	tell application "Microsoft Excel"
		activate
		try
			open a_Pathname
			tell active sheet
				set lastCol to count of columns of used range # of active sheet
				set lastRow to count of rows of used range # of active sheet
				
				repeat with r from start_Row to lastRow
					tell row r
						set cellVal to (value of cell 1) as text
						if cellVal contains delim then set cellVal to quote & cellVal & quote # ADDED
						set rowStr to cellVal
						repeat with c from 2 to lastCol
							set cellVal to (value of cell c) as text
							if cellVal contains delim then set cellVal to quote & cellVal & quote # ADDED
							set rowStr to rowStr & delim & cellVal
						end repeat
					end tell
					tell me to write rowStr & return to the_OpenFile as «class utf8» # EDITED
					(*
# Without the « tell me  » statement we would get this kind of behavior :

(1) write "blah;blah;blah
" to 1632 as «class utf8»
--> error number -1708
(2) «event ascrgdut»
--> error number -1708
(3) write "blah;blah;blah
" to 1632 as «class utf8»
--> error number -10004
end tell

tell current application 
write "blah;blah;blah
" to 1632 as «class utf8»
end tell

# Hardcoding the « tell current application  » statement, we spare the three operations issuing a non fatal error
# As I am lazy, I typed the shorter version « tell me »
*)
				end repeat
			end tell # active sheet
			
		on error e number n
			my ErrorHandler(e, n)
		end try

		close active workbook saving no 
 end tell
end treatAnExcelFile

#=====

on treatAcsvFile(a_Pathname, start_Row, the_OpenFile)
	local rowStr, e, n
	# EDITED
	try
		paragraphs start_Row thru -1 of (read a_Pathname as «class utf8»)
	on error
		paragraphs start_Row thru -1 of (read a_Pathname)
	end try
	
	try
		repeat with rowStr in result
			# Here we aren't in a tell application . end tell block so there is no need for the « tell me  » statement 
			write (rowStr as text) & return to the_OpenFile as «class utf8»
		end repeat
	on error e number n
		my ErrorHandler(e, n)
	end try
end treatAcsvFile

#=====

on ErrorHandler(e_, n_)
	local sep
	# Chris Stone
	set sep to "------------------------------------------"
	my getLocalString("Finder", "PE26")
	item 1 of my decoupe(result, " ^0)")
	item 2 of my decoupe(result, "(")
	sep & return & my getLocalString("Finder", "NE86") & ": " & e_ & return & sep & return & result & ": " & n_ & return & sep
	tell application (path to frontmost application as string)
		display dialog result buttons {my getLocalString("Finder", "BN62"), my getLocalString("Finder", "AL3")} # default button 2 (or 1) as you want
	end tell
	# set fail to true
end ErrorHandler

#=====

on getLocalString(the_App, the_Key)
	tell application the_App to return localized string the_Key
end getLocalString

#=====

on decoupe(t, d)
	local oTIDs, l
	set oTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to d
	set l to text items of t
	set AppleScript's text item delimiters to oTIDs
	return l
end decoupe

#=====

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 19:15:10

I’m getting this when I select “Scan subfolders”:

error “The variable les_Fichiers_Excel is not defined.” number -2753 from “les_Fichiers_Excel”

Am I doing something wrong? :slight_smile:

You made nothing wrong.
I forgot an instruction.

Now it’s added.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 21:25:33

Now getting

Can’t make file “Macintosh HD:Users:
xlsx” of application “System Events” into type text.

Just using scan subfolders now, not sure what the loop function is doing?

Seeing 4 pages for such a simple thing doesn’t make MacScripter user look good :stuck_out_tongue:

(1) the script is corrected.

(2) It seems that you forgot what I took time to explain in depth.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 22:46:52

I apologize but when the asker add new requested feature one by one, it’s not surprising to get a long thread.

Yvan KOENIG (VALLAURIS, France) lundi 10 septembre 2012 22:47:56