Find and replace multiple words in subject string

Hi pros, I’m trying to remove multiple words like “RE:”, “Re:”, “Fwd:”… from a string all at once with a simple find-and-replace function. Does anybody know a good handler that finds and replaces (or erases) several words in one pass? Please keep in mind that I’m a newbie, as you can see from the unstructured code I used here. Thanks a lot in advance for the help!

Hello,

AppleScript’s text item delimiters work fine with multiple words to replace. In your script above, just set the variable findText to the list {"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}. To remove all of these words, replace them with an empty string (set replaceText to ""):


set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"

on findReplace(findText, replaceText, sourceText)
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set sourceText to text items of sourceText
	set AppleScript's text item delimiters to replaceText
	set sourceText to "" & sourceText
	set AppleScript's text item delimiters to ASTID
	return sourceText
end findReplace

set cleanedSubject to my findReplace({"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}, "", newSubject)

NOTE: I wrote your newSubject text in a more complicated form to show how useful AppleScript’s text item delimiters can be.
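A side note on the list itself: since AppleScript 2.0, text item delimiters follow the usual considering/ignoring rules, so they ignore case by default. The sketch below is not from the script above; the variable name subjectParts is just for illustration. It assumes default settings and uses one spelling per prefix:

```applescript
-- Sketch: text item delimiters ignore case by default (AppleScript 2.0 and later),
-- so one spelling per prefix should be enough.
set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"

set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"Fwd: ", "Re: ", "Aw: "}
set subjectParts to text items of newSubject -- split at every delimiter
set AppleScript's text item delimiters to ""
set cleanedSubject to subjectParts as text -- join the pieces with nothing in between
set AppleScript's text item delimiters to ASTID

cleanedSubject -- expected: "This is a Subject text"
```

If you do need case-sensitive matching, wrap the splitting line in a considering case block.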

Uff, thanks so much KniazidisR! Now I see how to use it properly. Case closed and many thanks again for that.

I would like to complete my answer with an exact solution to your question. A peculiarity of AppleScript’s text item delimiters, when they are given a list of words, is that the words are tried from left to right, in the order they appear in the list. It doesn’t matter in your example, but with other text you may get the wrong result if the list is not sorted by the number of letters.

This was an oversight in my post #2, so I want to draw special attention to it now. To be safe, the list should always be sorted from the longest words to the shortest before applying text item delimiters:


set cleanedSubject to my findReplace({"Fwd: ", "FWD: ", "Re: ", "Aw: ", "AW: ", "RE: "}, "", newSubject)
-- The list is ordered from the longest delimiters to the shortest.
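To see for yourself why the order can matter, here is a small test with two overlapping delimiters. The handler name stripAll and the test string are made up for this illustration. Going by the explanation above, the short-first call should leave a stray ": " behind, but run it on your own machine to confirm:

```applescript
-- Hypothetical helper (not from the posts above): removes every delimiter in the list.
on stripAll(delimiterList, sourceText)
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimiterList
	set theParts to text items of sourceText
	set AppleScript's text item delimiters to ""
	set theResult to theParts as text
	set AppleScript's text item delimiters to ASTID
	return theResult
end stripAll

-- Same two delimiters, listed in opposite orders.
set shortFirst to stripAll({"Re", "Re: "}, "Re: Hello")
set longFirst to stripAll({"Re: ", "Re"}, "Re: Hello")
{shortFirst, longFirst} -- compare the two results in the Result pane
```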

You can automate sorting the original list as well. There are examples of sorting words by number of characters on MacScripter.

Here is one of them (complete solution):


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"

on findReplace(findText, replaceText, sourceText)
	set findText to current application's NSArray's arrayWithArray:findText
	set sortedList to findText's sortedArrayUsingDescriptors:{current application's NSSortDescriptor's sortDescriptorWithKey:"length" ascending:false}
	set findText to sortedList as list
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set sourceText to text items of sourceText
	set AppleScript's text item delimiters to replaceText
	set sourceText to "" & sourceText
	set AppleScript's text item delimiters to ASTID
	return sourceText
end findReplace

set cleanedSubject to my findReplace({"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}, "", newSubject)

Happy New Year.

KniazidisR has answered the OP’s question. However, in the future, someone may want an ASObjC solution and I’ve included one below.

use framework "Foundation"
use scripting additions

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |FWD: |Re: |Aw: |AW: |RE: "
set replaceText to ""
set theNewString to searchAndReplace(theString, findText, replaceText)

on searchAndReplace(theString, findText, replaceText)
	set theString to current application's NSString's stringWithString:theString
	return (theString's stringByReplacingOccurrencesOfString:findText withString:replaceText options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}) as text
end searchAndReplace

The following script differs only in that the search is case insensitive.

use framework "Foundation"
use scripting additions

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |Re: |Aw: "
set replaceText to ""
set theNewString to searchAndReplace(theString, findText, replaceText)

on searchAndReplace(theString, findText, replaceText)
	set theString to current application's NSString's stringWithString:theString
	set theOptions to (current application's NSRegularExpressionSearch as integer) + (current application's NSCaseInsensitiveSearch as integer) -- this NSString method takes string compare options, so NSCaseInsensitiveSearch is the matching flag
	return (theString's stringByReplacingOccurrencesOfString:findText withString:replaceText options:theOptions range:{0, theString's |length|()}) as text
end searchAndReplace

KniazidisR’s recommendation that findText be arranged from the longest to shortest text applies to the scripts in this post.

If you prefer to use as few lines of AppleScript code as possible, the following code will also produce the desired results.

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |Re: |Aw: "

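-- quoted form of protects the shell from quotes and other special characters in the text
set cleanedSubject to (do shell script "echo " & quoted form of theString & " | sed -E " & quoted form of ("s/" & findText & "//gi"))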

I highly recommend learning about RegEx. It’s something that I avoided for a long time because it seemed complicated, but I’ve found great use for it in finding variations on things. You can specify ignoring case, so you only need to define matching re|aw|fwd once.

The most important thing I’ve found is including conditions like “there must be a space before” (re|aw|fwd):
which makes sure it doesn’t capture things like this if they happen to be in your source:
“This is some text where: you don’t want to remove what I saw: being fwd:”

You can also add an option to catch things like “Re. Aw. Fwd.” by allowing the end segment to be either : or .
Maybe you even want to catch “re aw fwd” without anything at the end except a space. For that, make the “: or .” optional, but ensure that a space follows it.
In those cases you want to make sure that a space or word boundary comes before it, so it doesn’t capture things like
“We all went there. You know what we saw. Fwd: it even won’t catch awkward thing regarding fwd messages”
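A minimal sketch of that idea, kept separate from the longer example script: a word boundary, then re/aw/fwd, an optional ":" or ".", then at least one whitespace character. The pattern and the test string are my own illustration, and the call follows the same NSString approach used earlier in this thread:

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- Hypothetical pattern: \b = word boundary, [:.]? = optional ":" or ".", \s+ = one or more spaces
set thePattern to "\\b(re|aw|fwd)[:.]?\\s+"
set theSource to "Re. Aw: fwd This is what we saw. Forward it"

set theString to current application's NSString's stringWithString:theSource
-- Regular-expression search combined with case-insensitive matching
set theOptions to (current application's NSRegularExpressionSearch as integer) + (current application's NSCaseInsensitiveSearch as integer)
set theResult to (theString's stringByReplacingOccurrencesOfString:thePattern withString:"" options:theOptions range:{0, theString's |length|()}) as text

theResult -- expected: "This is what we saw. Forward it"
```

The word boundary is what keeps “saw.” intact, because there is no boundary between the “s” and the “aw”. Be aware that making the punctuation optional also removes a standalone word “re” or “aw” in ordinary text.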

Here’s an example RegEx AppleScript:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to 1
property NSRegularExpressionUseUnicodeWordBoundaries : a reference to 64 -- 1 << 6 (0x40 in hex, not decimal 40)
property NSRegularExpressionAnchorsMatchLines : a reference to 16
property NSRegularExpressionSearch : a reference to 1024
property NSString : a reference to current application's NSString

property myTestName : ""

property mySourceA : ""
property mySourceB : ""
property myPattern1 : ""
property myPattern2 : ""
property myReplace : ""

property myTest1A : ""
property myTest2A : ""

property myTest1B : ""
property myTest2B : ""

property myTestExpect1 : ""
property myTestExpect2 : ""

property logRegEx : true
property logResults : true
property logDebug : false


-- REMOVE RE AW FWD
set myTestName to "REGEX REPLACE TEST"
set myPattern1 to "\\b((re|aw|fwd)[:]\\s)" -- removes (re or aw or fwd) followed by : and space
set myPattern2 to "\\b((re|aw|fwd)[:|.]\\s)" -- removes (re or aw or fwd) followed by (: or .) and space; inside [ ] the | is a literal character, so [:.] would be enough
set mySourceA to "Re. Aw: aw AW. Forward: FWD: RE. where: This is a Subject text"
set mySourceB to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace to ""
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
set myTest2A to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace

set myTest1B to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
set myTest2B to my findInString:mySourceB withPattern:myPattern2 replaceWith:myReplace

if logResults then
	log ("-------------------------------------------NEW TEST START")
	log {"----------------myTestName is", myTestName}
	
	log {"myPattern1 is", myPattern1}
	log {"myPattern2 is", myPattern2}
	log {"myReplace is", myReplace}
	
	log {"mySourceA is", mySourceA}
	
	log {"myTest1A is", myTest1A}
	log {"myTest2A is", myTest2A}
	
	log {"mySourceB is", mySourceB}
	log {"myTest1B is", myTest1B}
	log {"myTest2B is", myTest2B}
end if




-- MAIN FUNCTIONS


on findInString:aString withPattern:aRegExString replaceWith:aReplace
	set aRegEx to my createRegularExpressionWithPattern:aRegExString
	if logDebug then
		log {"aRegEx is:", aRegEx}
	end if
	return (my findInString:aString withRegEx:aRegEx replaceWith:aReplace)
end findInString:withPattern:replaceWith:

on findInString:aString withRegEx:aRegEx replaceWith:aReplace
	if logDebug then log ("findInString:withRegEx:replaceWith: START")
	set aSource to NSString's stringWithString:aString
	set aRepString to NSString's stringWithString:aReplace
	set aLength to aSource's |length|()
	set aRange to (current application's NSMakeRange(0, aLength))
	set aCleanString to (aRegEx's stringByReplacingMatchesInString:aSource options:0 range:aRange withTemplate:aRepString)
	
	return aCleanString
end findInString:withRegEx:replaceWith:

on createRegularExpressionWithPattern:aRegExString
	if (class of aRegExString) is equal to (NSRegularExpression's class) then
		log ("it already was a RegEx")
		return aRegExString
	end if
	set aPattern to NSString's stringWithString:aRegExString
	set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
	set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
	if (aError ≠ missing value) then
		log {"regEx failed to create aError is:", aError}
		log {"aError debugDescrip is:", aError's debugDescription()}
		return missing value -- AppleScript has no "break" statement; hand back no regex on failure
	end if
	return aRegEx
end createRegularExpressionWithPattern:


log output:

(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST*)
(*myPattern1 is, \b((re|aw|fwd)[:]\s)*)
(*myPattern2 is, \b((re|aw|fwd)[:|.]\s)*)
(*myReplace is, *)
(*mySourceA is, Re. Aw: aw AW. Forward: FWD: RE. where: This is a Subject text*)
(*myTest1A is, (NSString) "Re. aw AW. Forward: RE. where: This is a Subject text"*)
(*myTest2A is, (NSString) "aw Forward: where: This is a Subject text"*)
(*mySourceB is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1B is, (NSString) "your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"*)
(*myTest2B is, (NSString) "your message: This is where. You don’t want to remove what I saw: being fwd:"*)

In the above you’ll notice that the last test with mySourceB fails to catch the trailing “fwd:”, because the regEx requires a qualifying space after it, and here it sits at the end of the line. So I’ll modify it like this:


-- REMOVE RE AW FWD - with factoring in LINE END
set myTestName to "REGEX REPLACE TEST V2"
set myPattern1 to "\\b((re|aw|fwd)[:|.](\\s|$))" -- removes (re or aw or fwd) followed by (: or .) and (space or end of line)
set myPattern2 to "(^|\\s)((re|fwd|aw)(\\:|\\.))(\\s|$)" -- removes (re or aw or fwd) followed by (: or .), when preceded by (start or space) and followed by (space or end of line)
set mySourceA to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace1 to ""
set myReplace2 to "$1"
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace1
set myTest2A to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace1

set myTest1B to my findInString:mySourceA withPattern:myPattern2 replaceWith:" "
set myTest2B to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace2

if logResults then
	log ("-------------------------------------------NEW TEST START")
	log {"----------------myTestName is", myTestName}
	
	log {"myPattern1 is", myPattern1}
	log {"myPattern2 is", myPattern2}
	log {"myReplace1 is", myReplace1}
	
	log {"mySourceA is", mySourceA}
	
	log {"myTest1A is", myTest1A}
	log {"myTest2A is", myTest2A}
	
	log {"myReplace2 is", myReplace2}
	
	log {"myTest1B is", myTest1B}
	log {"myTest2B is", myTest2B}
end if

output from above:


(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST V2*)
(*myPattern1 is, \b((re|aw|fwd)[:|.](\s|$))*)
(*myPattern2 is, (^|\s)((re|fwd|aw)(\:|\.))(\s|$)*)
(*myReplace1 is, *)
(*mySourceA is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2A is, (NSString) "yourmessage: This is where. You don’t want to remove what I saw: being"*)
(*myReplace2 is, $1*)
(*myTest1B is, (NSString) " your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2B is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)

Now there are problems with the replacements.
myPattern1 is capturing (“Re: ”) (“fwd. ”) (“fwd:”)
myPattern2 is capturing (“Re: ”) (“ fwd. ”) (“ fwd:”)

myTest1A with myPattern1 replaces with “”.
It works fine except for the extra space at the end of "being ".

myTest2A with myPattern2 replaces with “”.
It fixes the end of "being ", but now “yourmessage:” is smashed together.

myTest1B with myPattern2 replaces with " ".
It fixes “your message:”, but now there is an extra space at the start and the end.

myTest2B with myPattern2 replaces with “$1”.
$1 is the 1st captured group, which is the (^|\s).
It works fine except for the extra space at the end of "being ".

Whenever I run into these situations I usually use a replacement of " ".
Then, after the replacements, I run a regEx that cleans up all the extra white space.
It cleans the head and tail and any multiple spaces in between.

see next

Here’s the script with whitespace-cleaning functions:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to 1
property NSRegularExpressionUseUnicodeWordBoundaries : a reference to 64 -- 1 << 6 (0x40 in hex, not decimal 40)
property NSRegularExpressionAnchorsMatchLines : a reference to 16
property NSRegularExpressionSearch : a reference to 1024
property NSCharacterSet : a reference to current application's NSCharacterSet
property NSString : a reference to current application's NSString

property myTestName : ""

property mySourceA : ""
property mySourceB : ""
property myPattern1 : ""
property myPattern2 : ""
property myReplace : ""

property myTest1A : ""
property myTest2A : ""

property myTest1B : ""
property myTest2B : ""

property myTestExpect1 : ""
property myTestExpect2 : ""

property logRegEx : true
property logResults : true
property logDebug : false



-- REMOVE RE AW FWD - with factoring in LINE END
set myTestName to "REGEX REPLACE TEST V3"
set myPattern1 to "\\b((re|aw|fwd)[:|.](\\s|$))"
set mySourceA to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace1 to ""
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace1
set myTest2A to my cleanAllWhiteSpaceInString:myTest1A

if logResults then
	log ("-------------------------------------------NEW TEST START")
	log {"----------------myTestName is", myTestName}
	
	log {"myPattern1 is", myPattern1}
	log {"myReplace1 is", myReplace1}
	
	log {"mySourceA is", mySourceA}
	
	log {"myTest1A is", myTest1A}
	log {"myTest2A is", myTest2A}
end if


-- MAIN FUNCTIONS


on findInString:aString withPattern:aRegExString replaceWith:aReplace
	set aRegEx to my createRegularExpressionWithPattern:aRegExString
	if logDebug then
		log {"aRegEx is:", aRegEx}
	end if
	return (my findInString:aString withRegEx:aRegEx replaceWith:aReplace)
end findInString:withPattern:replaceWith:

on findInString:aString withRegEx:aRegEx replaceWith:aReplace
	if logDebug then log ("findInString:withRegEx:replaceWith: START")
	set aSource to NSString's stringWithString:aString
	set aRepString to NSString's stringWithString:aReplace
	set aLength to aSource's |length|()
	set aRange to (current application's NSMakeRange(0, aLength))
	set aCleanString to (aRegEx's stringByReplacingMatchesInString:aSource options:0 range:aRange withTemplate:aRepString)
	
	return aCleanString
end findInString:withRegEx:replaceWith:

on createRegularExpressionWithPattern:aRegExString
	if (class of aRegExString) is equal to (NSRegularExpression's class) then
		log ("it already was a RegEx")
		return aRegExString
	end if
	set aPattern to NSString's stringWithString:aRegExString
	set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
	set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
	if (aError ≠ missing value) then
		log {"regEx failed to create aError is:", aError}
		log {"aError debugDescrip is:", aError's debugDescription()}
		return missing value -- AppleScript has no "break" statement; hand back no regex on failure
	end if
	return aRegEx
end createRegularExpressionWithPattern:

on cleanAllWhiteSpaceInString:aString
	set aPattern to (NSString's stringWithString:"\\s{2,}") -- two or more consecutive whitespace characters
	set aDirtyString to (my findInString:aString withPattern:aPattern replaceWith:" ")
	set aCleanString to (my cleanWhiteSpaces:aDirtyString)
	return aCleanString
end cleanAllWhiteSpaceInString:

on cleanWhiteSpaces:aString
	set aCharSet to NSCharacterSet's whitespaceCharacterSet()
	return (my cleanString:aString withCharacterSet:aCharSet)
end cleanWhiteSpaces:



on cleanString:aString withCharacterSet:aCharSet
	set aDirtyString to NSString's stringWithString:aString
	set aCleanString to (aDirtyString's stringByTrimmingCharactersInSet:aCharSet)
	return aCleanString
end cleanString:withCharacterSet:



(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST V3*)
(*myPattern1 is, \b((re|aw|fwd)[:|.](\s|$))*)
(*myReplace1 is, *)
(*mySourceA is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being"*)