Converting old DOC to DOCX - need help with file dates

I am no programmer at all and I need to convert a bunch of 20-25 year old DOC files to DOCX.
Since Office V16 cannot do this, I installed LibreOffice and found an AppleScript that does the conversion job.

What the script does not do is copy the original file dates to the newly converted file.

The script needs to be saved as an application on the desktop. When files and folders are dropped onto it, it starts converting.

It does so by calling e.g. “/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx --outdir /Users/Demo/Files /Users/Demo/Files/Filename.DOC”

That creates the file Filename.DOCX.

How can I copy all the file dates from Filename.DOC to Filename.DOCX?

Thanks for helping!
zebramusik

========================

property name_extension : {"doc"}

global fileCnt

on open of finderObjects

set fileCnt to 0


repeat with i in (finderObjects)
	
	if folder of (info for i) is true then
		
		tell application "Finder" to set temp to (entire contents of i)
		
		repeat with j in (temp)
			
			process_files(j)
			
		end repeat
		
	else
		
		process_files(i)
		
	end if
	
end repeat


display notification "Processed Files: " & fileCnt with title "Word DOC to DOCX" subtitle "Processing Complete"

end open

on process_files(fname)

set cmd to "/Applications/LibreOffice.app/Contents/MacOS/soffice "

set cmdArgs to "--headless --convert-to docx --outdir "


tell application "Finder"
	
	set nameExt to name extension of fname
	
	set outDir to do shell script "dirname " & POSIX path of (fname as alias)
	
	
	if name extension of fname is in name_extension then
		
		try
			
			do shell script cmd & cmdArgs & outDir & space & POSIX path of (fname as alias)
			
			set fileCnt to fileCnt + 1
			
		on error errorMessage number errorNumber
			
			display alert "Processing Error: " message "[ " & errorNumber & " ] " & errorMessage
			
			error number -128
			
		end try
		
	end if
	
end tell

return

end process_files

use LibreOffice with Save As

The script above already converts a bunch of files or folders automatically.
The only thing missing is copying the original file modification and creation dates.

LibreOffice, which the script relies on heavily, does not do that.

  1. Your script throws the error “Finder got an error: usage: dirname path” (1).
  2. It contains one more error: it processes only the .doc files directly inside the dropped folders and skips the .doc files in their subfolders. A recursive subroutine call is needed here.

First you need to solve these 2 problems.

Problem 1) is solved by replacing set outDir to do shell script "dirname " & POSIX path of (fname as alias) with this:


set outDir to do shell script "dirname " & quoted form of (POSIX path of (fname as alias))
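
To illustrate why the quoting matters, here is a minimal shell sketch of what dirname receives. The path is a made-up example containing a space; it is not taken from the poster's machine.

```shell
# Hypothetical path with a space in it, as is common in user folders.
doc_path="/Users/Demo/My Files/Filename.DOC"

# Unquoted, the shell would split the path at the space and hand dirname
# two operands, which is what triggers the "usage: dirname path" error.
# Quoted, dirname receives the whole path as a single argument.
out_dir=$(dirname "$doc_path")
echo "$out_dir"   # prints /Users/Demo/My Files
```

AppleScript's quoted form wraps the path in single quotes and escapes any apostrophes inside it, so it is likely the safer choice compared with adding double quotes by hand.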

I think you could use the SetFile shell command to set the creation date and modification date of the .docx files after getting those dates (in the proper format) from the .doc files. Useful snippets to work into a repeat loop:


tell application "Finder"
	set d to get creation date of file fp_in -- eg, "dataHD:folder_in:file.doc"
	set m to get modification date of file fp_in
	
	tell d to set dateTimeD to its short date string & ", " & its time string
	tell m to set dateTimeM to its short date string & ", " & its time string
end tell

--> "11/22/15, 1:23:12 PM" is the .doc file creation date
--> "11/22/15, 2:22:22 PM" is the .doc file modification date


-- File path to change:
set docxFP to quoted form of POSIX path of "/Volumes/dataHD/folder_out/file.docx" --> "'/Volumes/dataHD/folder_out/file.docx'"

-- Change creation date of .docx file
set cmd to "SetFile -d " & quote & dateTimeD & quote & " " & docxFP
do shell script (cmd)

-- Change modification date of .docx file
set cmd to "SetFile -m " & quote & dateTimeM & quote & " " & docxFP
do shell script (cmd)
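
SetFile comes with Apple's developer command line tools, so it may not be installed on every Mac. As a possible alternative for the modification date only, the standard touch command can copy the timestamps of one file onto another with its -r (reference file) option. The sketch below assumes this swapped-in technique; the function name copy_mod_date and the example paths are hypothetical.

```shell
# Copy the access and modification times of the original file ($1)
# onto the converted file ($2). The creation date is not copied directly.
copy_mod_date() {
    touch -r "$1" "$2"
}

# Example call (hypothetical paths):
# copy_mod_date "/Users/Demo/Files/Filename.DOC" "/Users/Demo/Files/Filename.docx"
```

From AppleScript this could be run with do shell script, passing both paths through quoted form.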


@KniazidisR Thank you for your support!!
I changed the code accordingly. Still works.

@kerflooey Thank you!

I could not save the code.

error “The variable fp_in is not defined.” number -2753 from “fp_in”

So fp_in is not defined, and I do not know how to work this into my existing script.

Is fp_in = “POSIX path of (fname as alias)”?

I also have no clue how to compute docxFP, since “POSIX path of (fname as alias)” contains the original filename with the old DOC extension.
How do I convert this to the same filename but with the “DOCX” extension?
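
One way to derive the new name, sketched in shell: strip everything from the last dot onward and append the new extension. This assumes the converted file keeps the original base name and lands in the same folder, as in the soffice call quoted at the top of the thread. The function name docx_path_for is made up for illustration.

```shell
# Print the .docx path that corresponds to a .doc path ($1).
docx_path_for() {
    # "${1%.*}" removes the shortest trailing ".something", i.e. the extension.
    printf '%s\n' "${1%.*}.docx"
}

docx_path_for "/Users/Demo/Files/Filename.DOC"   # prints /Users/Demo/Files/Filename.docx
```

The result can be captured in AppleScript with do shell script and then used as docxFP.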

Thanks again!

I think this is a problem many have with old DOC files.

No, the problem is in your code. Both do shell script commands have quoting problems. Use "\"" around the paths to fix the bug (as in my code). The dirname command string must be written as “dirname '/Users/User/myDocFile.doc'” or as “dirname "/Users/User/myDocFile.doc"”, not as “dirname /Users/User/myDocFile.doc”.

Here is my code that works completely correctly:


on open of finderObjects
	
	set all_Files_List to {}
	set shell_Command to "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx --outdir "
	
	--If the dropped item is a file, process it as a file... else process it as a folder
	if folder of (info for finderObjects) is false then
		tell application "System Events" to set nameExt to name extension of finderObjects
		if nameExt ≠ "doc" then return -- If it is a file but not ".doc", quit the droplet application
		set end of all_Files_List to finderObjects
	else
		get_All_doc_Files_of_Folder(finderObjects, all_Files_List)
	end if
	
	--If no ".doc" files were found, exit the droplet application
	if all_Files_List = {} then return
	
	--Else convert the ".doc" files with LibreOffice and save the ".docx" files in the same directory
	repeat with next_File in all_Files_List
		set outDir to do shell script "dirname " & "\"" & next_File & "\""
		tell application "System Events" to set modif_Date to modification date of file next_File --comment this or remove if you don't need this
		do shell script shell_Command & "\"" & outDir & "\"" & space & "\"" & next_File & "\""
		set new_File to next_File & "x" --comment this or remove if you don't need this
		tell application "System Events" to set modification date of file new_File to modif_Date --comment this or remove if you don't need this
	end repeat
	
	--Brief notification of the conversion results
	display dialog "Processed Files: " & ((count of all_Files_List) as string) with title "Word DOC to DOCX processing complete"
	
end open

on get_All_doc_Files_of_Folder(the_folder, all_Files_List)
	tell application "System Events"
		set files_list to every file of the_folder whose type identifier is "com.microsoft.word.doc"
		repeat with doc_Item in files_list
			set end of all_Files_List to POSIX path of doc_Item
		end repeat
		set sub_folders_list to folders of the_folder
	end tell
	
	--The subroutine calls itself recursively to search for ".doc" files in subfolders too
	repeat with the_sub_folder_ref in sub_folders_list
		my get_All_doc_Files_of_Folder(the_sub_folder_ref, all_Files_List)
	end repeat
end get_All_doc_Files_of_Folder

NOTE: On my computer LibreOffice takes about 1 minute to convert each file. From what I read online, this happens because the system checks old Word files for viruses. So if your folder contains 1000 “.doc” files … be patient.

I have now added code that keeps the modification date. You can remove the added code lines (the lines marked “--comment this or remove”) if you don’t need this functionality.

NOTE: Setting the modification date back automatically sets the creation date back as well.
That’s all.

@KniazidisR Thank you very much for your work and HAPPY EASTER !!!

I tried your code but it gives me error messages.

When I test with a single DOC file it says:

“Can’t get «class extn» of {alias “HD MBpro CB:Users:christian:Desktop:!CONVERT:MAENGEL6.DOC”}.” (-1728)

When I test with a directory it says:

“Can’t get {} whose «class utid» of it = “com.microsoft.word.doc”.” (-1728)

Very strange.

For test purposes I use a folder named “!CONVERT” that sits on the desktop and contains some old DOC files.

Once again, thank you very much for all the effort, which I am sure many people will appreciate, because many people still have old DOC files.

Christian

Hi.

The parameter of an ‘open’ handler is a list of the dropped items. You need to loop through the items in the list, as in your original script, not try to get ‘info for’ or the ‘name extension’ of the list itself.

Both previous authors are right in their comments. The script needs to be fixed. I don’t have time right now, but for those who urgently need to fix a lot of old .doc files, I suggest using the script as an applet instead of a droplet:


set finderObjects to choose folder -- or: set finderObjects to choose file

set all_Files_List to {}
set shell_Command to "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx --outdir "

--If the item is a file, process it as a file... else process it as a folder
if folder of (info for finderObjects) is false then
	tell application "System Events" to set nameExt to name extension of finderObjects
	if nameExt ≠ "doc" then return -- If it is a file but not ".doc", quit the applet
	tell application "System Events" to set end of all_Files_List to (POSIX path of finderObjects)
else
	get_All_doc_Files_of_Folder(finderObjects, all_Files_List)
end if

--If no ".doc" files were found, exit the application
if all_Files_List = {} then return

--Else convert the ".doc" files with LibreOffice and save the ".docx" files in the same directory
repeat with next_File in all_Files_List
	set outDir to do shell script "dirname " & "\"" & next_File & "\""
	tell application "System Events" to set modif_Date to modification date of file next_File --comment this or remove if you don't need this
	do shell script shell_Command & "\"" & outDir & "\"" & space & "\"" & next_File & "\""
	set new_File to next_File & "x" --comment this or remove if you don't need this
	tell application "System Events" to set modification date of file new_File to modif_Date --comment this or remove if you don't need this
end repeat

--Brief notification of the conversion results
display dialog "Processed Files: " & ((count of all_Files_List) as string) with title "Word DOC to DOCX processing complete"

on get_All_doc_Files_of_Folder(the_folder, all_Files_List)
	tell application "System Events"
		set files_list to every file of the_folder whose type identifier is "com.microsoft.word.doc"
		repeat with doc_Item in files_list
			set end of all_Files_List to POSIX path of doc_Item
		end repeat
		set sub_folders_list to folders of the_folder
	end tell
	
	--The subroutine calls itself recursively to search for ".doc" files in subfolders too
	repeat with the_sub_folder_ref in sub_folders_list
		my get_All_doc_Files_of_Folder(the_sub_folder_ref, all_Files_List)
	end repeat
end get_All_doc_Files_of_Folder

TIP: If you open the folder being processed in Finder while the script is running, you can watch the conversion happen live.

@KniazidisR Thank you very, very much!

As said before, there are so many people out there without coding skills (like me!) who own many, many DOC files from the 1990s that can no longer be opened and read with the current version of Microsoft Word or Apple Pages.

I can confirm that the script above, used as an applet, works well with a single folder that contains just files. All DOC files are then converted fine with LibreOffice.

If there is a subfolder, the script stops with an error message:
“Can’t make «class cfol» “HD MBpro CB:Users:christian:Desktop:!CONVERT:SUBFOLDER:” of application “System Events” into the expected type.” (-1700)

TO FINALIZE:
If somebody wants to fix their DOC files, this works perfectly with the script above, provided LibreOffice (free download!) is installed and no subfolders are involved.

E.g., put all DOC files that need to be converted into a standalone folder and you’re fine!

If someone has many nested subfolders, there is a tool called “Document Converter” from RootRise Technologies in the App Store. It does not rely on LibreOffice at all and also handles subfolders.
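
For nested subfolders, another possible route is to let the shell collect the files. The sketch below uses find with -iname (case-insensitive name matching, available in the BSD find that ships with macOS) to list every .doc file under a folder, including subfolders. The function name list_doc_files and the example path are hypothetical.

```shell
# Print every regular file under the folder ($1) whose name ends in
# .doc or .DOC, one path per line, descending into all subfolders.
list_doc_files() {
    find "$1" -type f -iname '*.doc'
}

# Example call (hypothetical path):
# list_doc_files "/Users/Demo/Desktop/!CONVERT"
```

Each printed path could then be handed to the same soffice command the scripts above use.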

Thanks everyone!

I combined the AppleScript template “Recursive File Processing Droplet” with the essential code from above, and now there is a droplet that converts the DOC files in subfolders as well.

Kudos to KniazidisR !!! This would not have been possible without your contribution!!!



property type_list : {"DOC"}
property extension_list : {"doc"}
property typeIDs_list : {"com.microsoft.word.doc"}
property counter : 0

on open these_items
	repeat with i from 1 to the count of these_items
		set this_item to item i of these_items
		set the item_info to info for this_item
		if folder of the item_info is true then
			process_folder(this_item)
		else
			try
				set this_extension to the name extension of item_info
			on error
				set this_extension to ""
			end try
			try
				set this_filetype to the file type of item_info
			on error
				set this_filetype to ""
			end try
			try
				set this_typeID to the type identifier of item_info
			on error
				set this_typeID to ""
			end try
			if (folder of the item_info is false) and (package folder of the item_info is false) and (alias of the item_info is false) and ((this_filetype is in the type_list) or (this_extension is in the extension_list) or (this_typeID is in typeIDs_list)) then
				process_file(this_item)
			end if
		end if
	end repeat
	display dialog "Processed Files: " & (counter as string) with title "Word DOC to DOCX processing complete"
end open

-- this sub-routine processes folders 
on process_folder(this_folder)
	set these_items to list folder this_folder without invisibles
	repeat with i from 1 to the count of these_items
		set this_item to alias ((this_folder as Unicode text) & (item i of these_items))
		set the item_info to info for this_item
		if folder of the item_info is true then
			process_folder(this_item)
		else
			try
				set this_extension to the name extension of item_info
			on error
				set this_extension to ""
			end try
			try
				set this_filetype to the file type of item_info
			on error
				set this_filetype to ""
			end try
			try
				set this_typeID to the type identifier of item_info
			on error
				set this_typeID to ""
			end try
			if (folder of the item_info is false) and (package folder of the item_info is false) and (alias of the item_info is false) and ((this_filetype is in the type_list) or (this_extension is in the extension_list) or (this_typeID is in typeIDs_list)) then
				process_file(this_item)
			end if
		end if
	end repeat
end process_folder

-- this sub-routine processes files 
on process_file(this_item)
	-- NOTE that during execution, the variable this_item contains a file reference in alias format to the item passed into this sub-routine
	-- FILE PROCESSING STATEMENTS GO HERE
	
	set next_File to (POSIX path of this_item)
	
	set shell_Command to "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx --outdir "
	set outDir to do shell script "dirname " & "\"" & next_File & "\""
	
	# Read the date from the original file
	tell application "System Events" to set modif_Date to modification date of file next_File
	
	# Convert DOC to DOCX with LibreOffice
	do shell script shell_Command & "\"" & outDir & "\"" & space & "\"" & next_File & "\""
	
	# Set the original file's date on the new file
	set new_File to next_File & "x"
	tell application "System Events" to set modification date of file new_File to modif_Date
	
	set counter to counter + 1
	
end process_file


As promised, here is the Convert_Doc_to_Docx droplet in amended form. It now accepts files and folders, allows multiple items to be dropped in any combination, and processes subfolders too:


property counter : 0

on open theDroppedItems
	set all_Files_List to {}
	set shell_Command to "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx --outdir "
	
	--The droplet accepts multiple files and folders at once
	repeat with theCurrentItem in theDroppedItems
		if folder of (info for theCurrentItem) is true then
			get_All_doc_Files_of_Folder(theCurrentItem, all_Files_List)
		else
			tell application "System Events" to set theTypeIdentifier to type identifier of (theCurrentItem as alias)
			if theTypeIdentifier = "com.microsoft.word.doc" then
				tell application "System Events" to set end of all_Files_List to (POSIX path of theCurrentItem)
			end if
		end if
	end repeat
	
	--If ".doc" files were found, convert them with LibreOffice and save the ".docx" files in the same directory
	if all_Files_List ≠ {} then
		repeat with next_File in all_Files_List
			set outDir to do shell script "dirname " & "\"" & next_File & "\""
			tell application "System Events" to set modif_Date to modification date of file next_File --comment this or remove if you don't need this
			do shell script shell_Command & "\"" & outDir & "\"" & space & "\"" & next_File & "\""
			set new_File to next_File & "x" --comment this or remove if you don't need this
			tell application "System Events" to set modification date of file new_File to modif_Date --comment this or remove if you don't need this
			set counter to counter + 1
		end repeat
	end if
	
	--Brief notification of the conversion results
	display dialog "Processed Files: " & (counter as string) with title "Word DOC to DOCX processing complete" giving up after 5
end open

on get_All_doc_Files_of_Folder(the_folder, all_Files_List)
	tell application "System Events"
		set files_list to every file of the_folder whose type identifier is "com.microsoft.word.doc"
		repeat with doc_Item in files_list
			set end of all_Files_List to POSIX path of doc_Item
		end repeat
	end tell
	tell application "Finder" to set sub_folders_list to folders of the_folder
	
	--The subroutine calls itself recursively to search for ".doc" files in subfolders too
	repeat with the_sub_folder in sub_folders_list
		set the_sub_folder_ref to the_sub_folder as alias
		my get_All_doc_Files_of_Folder(the_sub_folder_ref, all_Files_List)
	end repeat
end get_All_doc_Files_of_Folder