Why does this Mail archiving script consume 40 GB app memory?

Here’s a cleaner version of the Mail archiving script I posted recently.
I wrote this to chip away at a mailbox with 130,000 messages going back to 2011!
It seems to run OK, though not terribly fast.
But if I leave it running overnight, in the morning I find Script Editor is using 30 to 40 GB of app memory (on a computer with 8 GB of physical RAM) and has to be force quit. Usually it gets through a couple of thousand messages before crashing.
And no, I am not selecting all 130,000 messages before running the script!

I know it’s problematic to “set selected_messages to selection” but I can’t find any other way of specifying individual messages in a script. My bash-script brain wants to dump all the message IDs to a text file and step through it, deleting lines as we go, instead of holding them all in memory. But I couldn’t get that to work.

(*
This script creates a folder called "saved-email" in Documents (with subfolders for each year) and saves each message in its own folder, also saving attachments if any. Message folders & files are named with the sender, subject, and UID of the message to prevent duplication or overwriting. The script also sets the timestamp of each saved message to its date sent, for easy chronological sorting & searching.

Select some subset of old messages in the Mail app and run the script. It works pretty well if you do a couple thousand messages at a time.

This script includes various chunks of other people's scripts that I found here and there. Much gratitude to these valiant partners in the struggle. 
*)
set path_to_docs to "~/Documents/saved-email/"

tell current application
	repeat with y from 2011 to 2025
		do shell script ("mkdir -p " & path_to_docs & y)
	end repeat
end tell

tell application "Mail"
	set selected_messages to selection
	set msg_total to (count selected_messages)
	set running_total to 1
	
	repeat with this_message in selected_messages -- loop through the messages sent by Mail
		try
			set msg_sender to (extract name from (sender of this_message))
			if msg_sender = "" then set msg_sender to (extract address from (sender of this_message))
			set msg_text to content of this_message -- retrieve message body
			set msg_subject to (subject of this_message) as Unicode text
			set msg_date to (date received of this_message) as date
			set yyyy to (year of msg_date)
			set msg_id to id of this_message as string
			
			tell current application
				do shell script ("echo " & quoted form of (msg_sender & "-" & (msg_subject & "-" & msg_id)) & " > /tmp/scratch.txt")
				set the_file_name to (do shell script "cat /private/tmp/scratch.txt | sed -r 's/Re://g ; s/[^[:alnum:]-]/-/g ; s/-+/-/g ; s/^-//g' ") -- clean up message names
				set thePath to path_to_docs & yyyy & "/" & the_file_name & "/"
				
				do shell script ("mkdir -p " & thePath)
				do shell script ("touch " & thePath & the_file_name & ".txt")
				
				set save_folder to ((path to documents folder as string) & "saved-email:" & yyyy & ":" & the_file_name & ":") as alias
				set disk_file to ((save_folder as string) & the_file_name & ".txt" as string) as alias
				set disk_file_id to open for access file disk_file with write permission -- open the new file for writing
				
				write msg_text to disk_file_id -- write the body text of the current email into the new file
				close access disk_file_id -- close the file
			end tell
			
			if (count mail attachments of this_message) > 0 then
				
				repeat with this_attachment in (mail attachments of this_message)
					try
						set this_attachment_filename to (((save_folder as string) & this_attachment's name))
						save this_attachment in alias this_attachment_filename -- saves the attachment 
					end try
				end repeat
				
			end if
			
			tell application "Finder"
				set save_folder's modification date to msg_date
				set (entire contents of save_folder)'s modification date to msg_date
			end tell
			
			delete this_message
			tell current application to do shell script ("echo " & quoted form of ("___________________ Archived " & running_total & " of " & msg_total) & " messages > /dev/null ")
			set running_total to running_total + 1
			
		end try
	end repeat
end tell

I’ll look at it. BTW, why do you have a ‘tell current application’ block?
It is not needed.

Without it, the “do shell script” commands in the middle of the Mail tell block throw errors.

The one I’m referring to is at the top, outside of the Mail tell block.
The one inside can use ‘tell me’ instead.

See if this works

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

property mailFile : missing value

on run
	set mailFile to (path to desktop folder as text) & "Mail-IDs.txt"
	if generateMailFile() then
		setupFolders()
		processMail()
	end if
end run

on generateMailFile()
	local mfile, myList
	try
		set mfile to open for access file mailFile with write permission
	on error
		return false
	end try
	set eof mfile to 0
	set myList to {}
	tell application "Mail"
		set selected_messages to selection
		repeat with this_message in selected_messages
			set this_message to contents of this_message
			set aMailbox to mailbox of this_message
			set anAccount to account of aMailbox
			set tmp to (id of this_message as rich text) & tab & name of aMailbox & tab & id of anAccount & linefeed
			tell me to write tmp to mfile
		end repeat
	end tell
	close access mfile
	return true
end generateMailFile

on processMail()
	local mfile, myList, c, this_message, msg_sender, msg_text, msg_subject, msg_date, yyyy, msg_id, the_file_name, thePath, tid
	try
		set mfile to open for access file mailFile
	on error
		return false
	end try
	set progress description to "Manage Emails…"
	set progress total steps to -1
	set progress completed steps to 0
	set tid to text item delimiters
	set text item delimiters to tab
	set c to 1
	repeat
		try
			set mailItem to text 1 thru -2 of (read mfile until linefeed)
		on error errMsg number errNum
			exit repeat
		end try
		set progress completed steps to c
		
		set mailItem to text items of mailItem
		set {mailID, mboxName, accntID} to mailItem
		set mailID to mailID as integer
		tell application "Mail"
			tell mailbox mboxName of account id accntID
				--message mailID
				set this_message to item 1 of (messages whose id is mailID)
				set msg_sender to (extract name from (sender of this_message))
				if msg_sender = "" then set msg_sender to (extract address from (sender of this_message))
				set msg_text to content of this_message -- retrieve message body
				set msg_subject to (subject of this_message) as text
				set msg_date to (date received of this_message) as date
				set yyyy to (year of msg_date)
				set msg_id to id of this_message as string
			end tell
		end tell
		do shell script ("echo " & quoted form of (msg_sender & "-" & (msg_subject & "-" & msg_id)) & " > /tmp/scratch.txt")
		set the_file_name to (do shell script "cat /private/tmp/scratch.txt | sed -r 's/Re://g ; s/[^[:alnum:]-]/-/g ; s/-+/-/g ; s/^-//g' ") -- clean up message names
		set thePath to path_to_docs & yyyy & "/" & the_file_name & "/"
		
		do shell script ("mkdir -p " & thePath)
		do shell script ("touch " & thePath & the_file_name & ".txt")
		set save_folder to ((path to documents folder as text) & "saved-email:" & yyyy & ":" & the_file_name & ":") as alias
		set disk_file to ((save_folder as text) & the_file_name & ".txt" as text) as alias
		set disk_file_id to open for access file disk_file with write permission -- open the new file for writing
		
		write msg_text to disk_file_id -- write the body text of the current email into the new file
		close access disk_file_id -- close the file
		
		tell application "Mail"
			if (count mail attachments of this_message) > 0 then
				repeat with this_attachment in (mail attachments of this_message)
					try
						set this_attachment_filename to (((save_folder as string) & this_attachment's name))
						save this_attachment in alias this_attachment_filename -- saves the attachment 
					end try
				end repeat
			end if
		end tell
		tell application "Finder"
			set save_folder's modification date to msg_date
			set (entire contents of save_folder)'s modification date to msg_date
		end tell
		tell application "Mail"
			delete this_message
			tell current application to do shell script ("echo " & quoted form of ("___________________ Archived " & running_total & " of " & msg_total) & " messages > /dev/null ")
			set running_total to running_total + 1
		end tell
		if (c mod 10) = 0 then
			set my progress additional description to subject of mailItem & " (" & c & ")"
		end if
		set c to c + 1
	end repeat
	close access mfile
	set text item delimiters to tid
end processMail

on setupFolders()
	set path_to_docs to (path to documents folder as text)
	try
		alias ((path to documents folder as text) & "saved-email:")
	on error
		tell application "System Events" to make new folder at folder path_to_docs with properties {name:"saved-email"}
	end try
	set path_to_docs to path_to_docs & "saved-email:"
	repeat with y from 2011 to 2025
		try
			alias (path_to_docs & y & ":")
		on error
			tell application "System Events" to make new folder at folder path_to_docs with properties {name:(y as text)}
		end try
	end repeat
end setupFolders

I’m not great at English, so I might not express everything perfectly, but…

When you loop through something with standard output, like do shell script, memory consumption tends to get pretty high, right?

In my case, when I have to loop a lot, I turn off history logging. That way, memory usage seems to be a bit more controlled compared to when logging is on.

Hope this helps!

Another thing — closing the log history window also seems to affect memory usage a bit (though I have no solid proof… haha).

Thanks! Is your version set up to run as a droplet? (didn’t work when I tried running it directly in Script Editor).

Thanks … That makes sense, I’ll try it! (Your English is fine btw … better than my Japanese anyway) :wink:

Also what are the scripting additions you’re using here?

The scripting additions are Apple’s default scripting additions. Apple used to let companies make compiled scripting additions as add-ons to extend the AppleScript language. At some point, due to security concerns, Apple removed this capability except for its own scripting additions. It told everyone else who made a scripting addition to convert it to a background scriptable app like ‘System Preferences’.

BTW, do shell script does seem to have memory leaks, so it would be better to try to convert most of those into native AppleScript if you can.

Can you explain each ‘do shell script’ that you currently have? Especially the one using sed. How is it cleaning up the message names?

repeat with y from 2011 to 2025
	do shell script ("mkdir -p " & path_to_docs & y)
end repeat

Creates the “saved-email” parent directory and subdirectories for each year if they don’t exist. The nice thing about mkdir -p is that you don’t have to ask if the directory already exists; if it does, mkdir ignores it.


do shell script ("echo " & quoted form of (msg_sender & "-" & (msg_subject & "-" & msg_id)) & " > /tmp/scratch.txt")

Dumps text of sender, subject, and message id, separated by dashes, to a temporary text file.


set the_file_name to (do shell script "cat /private/tmp/scratch.txt | sed -r 's/Re://g ; s/[^[:alnum:]-]/-/g ; s/-+/-/g ; s/^-//g' ") -- clean up message names

This looks ugly but it works.
It reads the temporary text file created in the previous step and (in this order) deletes “Re:”, replaces every character that is not alphanumeric or a dash with a dash, replaces runs of contiguous dashes with one dash, and deletes any dash at the beginning of the string. I cannot even imagine how many lines of AppleScript it would take to do all that!
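For comparison, here is a rough sketch of how those four sed rules might look in plain AppleScript. The handler names cleanName and replaceText are made up for this example. It assumes only ASCII letters and digits count as alphanumeric, which may not match what [:alnum:] does in every locale. Also, text item delimiters ignore case by default, so unlike sed this version strips “RE:” and “re:” as well.

```applescript
-- Example call with a made-up sender, subject and id.
cleanName("Jane Doe-Re: Hello, world!-123")
-- → "Jane-Doe-Hello-world-123"

on cleanName(s)
	set allowed to "abcdefghijklmnopqrstuvwxyz0123456789-"
	-- Rule 1: delete every "Re:".
	set s to replaceText("Re:", "", s)
	-- Rule 2: turn anything that is not a letter, digit or dash into a dash.
	-- ("is in" ignores case by default, so uppercase letters are kept.)
	set out to ""
	repeat with i from 1 to (length of s)
		set ch to character i of s
		if ch is in allowed then
			set out to out & ch
		else
			set out to out & "-"
		end if
	end repeat
	-- Rule 3: collapse runs of dashes into a single dash.
	repeat while out contains "--"
		set out to replaceText("--", "-", out)
	end repeat
	-- Rule 4: drop a leading dash.
	if out starts with "-" then
		if (length of out) > 1 then
			set out to text 2 thru -1 of out
		else
			set out to ""
		end if
	end if
	return out
end cleanName

-- Generic find-and-replace built on text item delimiters.
on replaceText(findText, replaceWith, sourceText)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set parts to text items of sourceText
	set AppleScript's text item delimiters to replaceWith
	set resultText to parts as text
	set AppleScript's text item delimiters to tid
	return resultText
end replaceText
```

Building the string one character at a time is slow for long inputs, but sender-plus-subject strings are short, so it should not matter here.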


do shell script ("mkdir -p " & thePath)
do shell script ("touch " & thePath & the_file_name & ".txt")

Creates the unique directory for each saved message and creates an empty .txt file inside it.
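As a side note, a slightly more defensive sketch of these two steps passes the path through quoted form, so a stray space or quote in a name cannot break the shell command. The values of yyyy and the_file_name below are made-up examples, and the sketch writes into the temporary items folder so it can be tried without touching Documents; swapping in (path to documents folder) is an assumption about how it would be used in the real script. An absolute path is built because the shell does not expand a quoted "~".

```applescript
-- Hypothetical example values; in the real script these come from the message.
set yyyy to 2011
set the_file_name to "Jane-Doe-Hello-world-123"

-- Absolute POSIX base path (ends with a slash).
set base_path to (POSIX path of (path to temporary items)) & "saved-email/"
set thePath to base_path & yyyy & "/" & the_file_name & "/"

-- quoted form protects spaces and shell metacharacters in the path.
do shell script "mkdir -p " & quoted form of thePath
do shell script "touch " & quoted form of (thePath & the_file_name & ".txt")
```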


tell current application to do shell script ("echo " & quoted form of ("___________________ Archived " & running_total & " of " & msg_total) & " messages > /dev/null ")

A very primitive mechanism for displaying the script’s progress in the Event Log window.


And btw none of these commands return anything except for “cat” and “echo”.

OK try this…

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

global mailFile, path_to_docs, msg_total

on run
	set mailFile to (path to desktop folder as text) & "Mail-IDs.txt"
	if generateMailFile() then
		setupFolders()
		processMail()
	end if
	activate
	display alert "All Done!" giving up after 60
end run

on generateMailFile()
	local mfile, selected_messages
	try
		set mfile to open for access file mailFile with write permission
	on error
		return false
	end try
	set eof mfile to 0
	
	tell application "Mail"
		set selected_messages to selection
		set msg_total to count selected_messages
		tell me
			set progress description to "Manage Emails… (" & msg_total & ")"
			set progress additional description to "(Generating Email List…)"
			set progress total steps to -1
		end tell
		repeat with this_message in selected_messages
			set this_message to contents of this_message
			set aMailbox to mailbox of this_message
			set anAccount to account of aMailbox
			set tmp to (id of this_message as rich text) & tab & name of aMailbox & tab & id of anAccount & linefeed
			tell me to write tmp to mfile
		end repeat
	end tell
	
	close access mfile
	return true
end generateMailFile

on processMail()
	local mfile, myList, running_total, this_message, msg_sender, msg_text, msg_subject, msg_date, yyyy, msg_id, the_file_name, thePath, save_folder, disk_file, disk_file_id, tid
	try
		set mfile to open for access file mailFile
	on error
		return false
	end try
	set progress description to "Manage Emails… (" & msg_total & ")"
	set my progress additional description to "(Starting Archiving…)"
	set progress total steps to msg_total
	set progress completed steps to 0
	set tid to text item delimiters
	set text item delimiters to tab
	set running_total to 1
	repeat
		try
			set mailItem to text 1 thru -2 of (read mfile until linefeed)
		on error errMsg number errNum
			exit repeat
		end try
		set progress completed steps to running_total
		
		set mailItem to text items of mailItem
		set {mailID, mboxName, accntID} to mailItem
		set mailID to mailID as integer
		tell application "Mail"
			tell mailbox mboxName of account id accntID
				--message mailID
				set this_message to item 1 of (messages whose id is mailID)
				set msg_sender to (extract name from (sender of this_message))
				if msg_sender = "" then set msg_sender to (extract address from (sender of this_message))
				set msg_text to content of this_message -- retrieve message body
				set msg_subject to (subject of this_message) as rich text
				set msg_date to (date received of this_message) as date
				set yyyy to (year of msg_date)
				set msg_id to id of this_message as string
			end tell
		end tell
		set the_file_name to cleanText(msg_sender & "-" & msg_subject & "-" & msg_id)
		set thePath to path_to_docs & yyyy & ":"
		tell application "System Events" to make new folder at folder thePath with properties {name:the_file_name}
		--set f to open for access file (thePath & the_file_name & ":" & the_file_name & ".txt") with write permission
		set save_folder to (path_to_docs & yyyy & ":" & the_file_name & ":")
		set disk_file to (save_folder & the_file_name & ".txt")
		set disk_file_id to open for access file disk_file with write permission -- open the new file for writing
		
		write msg_text to disk_file_id -- write the body text of the current email into the new file
		close access disk_file_id -- close the file
		
		tell application "Mail"
			if (count mail attachments of this_message) > 0 then
				repeat with this_attachment in (mail attachments of this_message)
					try
						set this_attachment_filename to (((save_folder as string) & this_attachment's name))
						save this_attachment in alias this_attachment_filename -- saves the attachment 
					end try
				end repeat
			end if
			--delete this_message
		end tell
		tell application "System Events"
			set save_folder to folder save_folder
			set save_folder's modification date to msg_date
			set (disk items of save_folder)'s modification date to msg_date
		end tell
		log ("___________________ Archived " & running_total & " of " & msg_total & " messages")
		--if (running_total mod 10) = 0 then
		try
			set progress additional description to msg_subject & " (" & running_total & ")"
		end try
		--end if
		set running_total to running_total + 1
	end repeat
	set progress completed steps to msg_total
	close access mfile
	set text item delimiters to tid
end processMail

on setupFolders()
	set path_to_docs to (path to documents folder as text)
	try
		alias ((path to documents folder as text) & "saved-email:")
	on error
		tell application "System Events" to make new folder at folder path_to_docs with properties {name:"saved-email"}
	end try
	set path_to_docs to path_to_docs & "saved-email:"
	repeat with y from 2011 to 2025
		try
			alias (path_to_docs & y & ":")
		on error
			tell application "System Events" to make new folder at folder path_to_docs with properties {name:(y as text)}
		end try
	end repeat
end setupFolders

on cleanText(s)
	local ans, c, i, tid
	set ans to ""
	repeat with i from 1 to length of s
		set c to text i of s
		if c is in "1234567890-_abcdefghijklmnopqrstuvwxyz?'.!@#$%^&*()+=~ " then set ans to ans & c
	end repeat
	set tid to text item delimiters
	repeat while "--" is in ans
		set text item delimiters to "--"
		set ans to text items of ans
		set text item delimiters to "-"
		set ans to ans as text
	end repeat
	set text item delimiters to tid
	return ans
end cleanText

No More ‘do shell script’

** EDIT ** I edited the script to have a better progress bar.

error “The variable c is not defined.” number -2753 from “c”

I fixed it, try again…

** EDIT **
Any results yet?

Thanks @robertfern … it works sporadically. Several issues:

  1. The message properties that Mail returns don’t always work as specifiers. For instance, calling a mailbox by its name sometimes gives an “invalid index” error. So even though a message says its mailbox is “All Mail”, Mail.app doesn’t recognize that as a valid mailbox name.

tell application "Mail"

get item 1 of every message of mailbox "All Mail" of account id "752625AE-8F62-4ECB-B76C-6157C042F45B" whose id = 143202

error number -1728 from mailbox "All Mail" of account id "752625AE-8F62-4ECB-B76C-6157C042F45B"

Result:

error "Mail got an error: Can’t get mailbox \"All Mail\" of account id \"752625AE-8F62-4ECB-B76C-6157C042F45B\"." number -1728 from mailbox "All Mail" of account id "752625AE-8F62-4ECB-B76C-6157C042F45B"

Maybe it would be better to get the ID of the mailbox at the beginning, and just use that.

  2. Sometimes that happens with the message ID as well, no idea why. It seems random. If I select more than about a dozen messages to archive, typically there will be an “invalid index” error before the loop ends.

  3. I also suspect that at least some of the slowness might be caused by the constant upstream communication with Gmail. Every time you delete a Gmail message locally, Mail.app has to call Gmail with the update, because IMAP. I tried taking the account offline but it pops back online as soon as you start moving messages around.

  4. The text cleanup is not exactly replicating what I had set up with sed, but that’s a tweak for later.

Do you have an IMAP Gmail account set up in Apple Mail and did the script work on it?

Being a Security & IT specialist, I will never have a Gmail account, sorry!
I do have a Microsoft account though.

As for the Gmail account, when you select the emails you want archived, are you choosing emails in the “All Mail” mailbox or from the Inbox?

If you are selecting from the “All Mail” mailbox, DON’T. Gmail doesn’t follow standard mail protocols and as you found out causes problems with scriptability. AOL also does this.

BTW, is mine running faster than the ‘do shell script’ version?

I edited my script above to have a better progress bar.

I’m selecting them from the Archive mailbox, which seems like a fake mailbox that Apple puts in by default.

I love this script, or at least the idea of this script. I have been fiddling around with a couple of Mail scripts over the years, but I’ve never attempted one as thorough as yours.

I just logged in to comment that I received an “All Mail” error. Specifically: “Mail got an error: Can’t get mailbox "All Mail" of account id "xxxx-xxx-xxxx-xxxx".”

I initially thought the error was happening with messages that had attachments, but that is not the case. It occurs with all of my Gmail accounts, regardless of whether I select messages in “All Mail” or the Gmail inbox.

Yup, same here. Try calling it INBOX instead:

tell application "Mail"
	count messages of mailbox "INBOX" of account "my-account-name" -- whatever you named the account in Mail.app
end tell

This is why scripting Mail.app is a pain: you can ask Mail what a class is called, but you can’t count on being able to use that same name in a script. It might have to be named something else or it might not work at all.
Better to call things by their ID number.

And yes it does seem faster, though hard to tell because it won’t run for long without an error.

Mailboxes in Mail don’t have IDs, only names.