Sunday, January 23, 2022

#1 2022-01-01 01:16:07 am

NewtonsLaws
Member
From:: Switzerland
Registered: 2021-05-02
Posts: 25

Find and replace multiple words in subject string

Hi pros, I'm trying to remove multiple words like "RE:", "Re:", "Fwd:"... from a string all at once with a simple find-and-replace function. Does anybody know a good handler that finds and replaces (or erases) several words in one pass? Please keep in mind that I'm a newbie, as you can see from the unstructured code below. Thanks a lot in advance for the help!

set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject text" -- goal is to clean this string to This is a Subject text

on findReplace(findText, replaceText, sourceText)
    set ASTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to findText
    set sourceText to text items of sourceText
    set AppleScript's text item delimiters to replaceText
    set sourceText to "" & sourceText
    set AppleScript's text item delimiters to ASTID
    return sourceText
end findReplace

-- this next approach doesn't really give a good result esp. if multiple words are there to clean out like RE: Aw: Fwd:

if (newSubject as text) contains "Re:" or "Aw:" then
    return my findReplace("Re: AW:", "", (newSubject))
else if (newSubject as text) contains "Re:" then
    return my findReplace("Re:", "", (newSubject))
else if (newSubject as text) contains "Fwd:" then
    return my findReplace("Fwd:", "", (newSubject))
else if (newSubject as text) contains "AW:" then
    return my findReplace("AW:", "", (newSubject))
else if (newSubject as text) contains "RE:" then
    return my findReplace("RE:", "", (newSubject))
else
    return newSubject
end if

Last edited by NewtonsLaws (2022-01-01 03:58:26 am)


Filed under: find replace


#2 2022-01-01 03:55:26 am

KniazidisR
Member
From:: Greece
Registered: 2019-03-03
Posts: 2208

Re: Find and replace multiple words in subject string

Hello,

AppleScript's text item delimiters work fine with multiple words to replace. In your script above, just set the variable findText to the list {"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}. All of these words are then replaced with an empty string (set replaceText to ""):

Applescript:


set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"

on findReplace(findText, replaceText, sourceText)
   set ASTID to AppleScript's text item delimiters
   set AppleScript's text item delimiters to findText
   set sourceText to text items of sourceText
   set AppleScript's text item delimiters to replaceText
   set sourceText to "" & sourceText
   set AppleScript's text item delimiters to ASTID
   return sourceText
end findReplace

set cleanedSubject to my findReplace({"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}, "", newSubject)

NOTE: I made your newSubject text a little more complicated, to show how useful AppleScript's text item delimiters can be.
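
A side note on the list itself: as far as I know, text item delimiters have followed the usual considering/ignoring rules since AppleScript 2.0, and case is ignored by default. If that holds on your system, one spelling per word should be enough. The sketch below repeats the handler from above so it runs on its own; the variable names looseResult and strictResult are just for illustration.

```applescript
-- Same handler as above, repeated so this snippet runs on its own.
on findReplace(findText, replaceText, sourceText)
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set sourceText to text items of sourceText
	set AppleScript's text item delimiters to replaceText
	set sourceText to "" & sourceText
	set AppleScript's text item delimiters to ASTID
	return sourceText
end findReplace

set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject text"

-- Default behaviour (assumption: case is ignored), so three delimiters
-- should cover all six spellings.
set looseResult to findReplace({"Re: ", "Aw: ", "Fwd: "}, "", newSubject)

-- Inside "considering case" only the exact spellings should be removed.
considering case
	set strictResult to findReplace({"Re: ", "Aw: ", "Fwd: "}, "", newSubject)
end considering

{looseResult, strictResult}
```

If you do want "RE:" and "Re:" treated differently, wrapping the call in considering case as shown is one way to get that.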

Last edited by KniazidisR (2022-01-01 04:10:39 am)


Model: MacBook Pro
OS X: Catalina 10.15.7
Web Browser: Safari 14.1
Ram: 4 GB


#3 2022-01-01 04:00:32 am

NewtonsLaws
Member
From:: Switzerland
Registered: 2021-05-02
Posts: 25

Re: Find and replace multiple words in subject string

Uff, thanks so much KniazidisR! Now I see how to use it properly. Case closed and many thanks again for that.


#4 2022-01-01 04:45:10 am

KniazidisR
Member
From:: Greece
Registered: 2019-03-03
Posts: 2208

Re: Find and replace multiple words in subject string

I would like to give you the complete, exact solution to your question. A peculiarity of how AppleScript's text item delimiters work with a list of words is that the words are replaced from left to right, in the order they appear in the list. It doesn't matter in your example, but with other text (for example, when one word in the list is the beginning of another, longer one) you may get the wrong result if the list of words is not sorted by length.

This was an oversight in my post #2, so I want to draw special attention to it now. To be safe, the list should always be sorted from the longest words to the shortest before applying AppleScript's text item delimiters:

Applescript:


set cleanedSubject to my findReplace({"Fwd: ", "FWD: ", "Re: ", "Aw: ", "AW: ", "RE: "}, "", newSubject)
-- list order above: from longest ------------->>>>>> to shortest

You can automate sorting the original list as well. There are examples of sorting words by number of characters on MacScripter.

Here is one of them (complete solution):

Applescript:


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set newSubject to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"

on findReplace(findText, replaceText, sourceText)
   set findText to current application's NSArray's arrayWithArray:findText
   set sortedList to findText's sortedArrayUsingDescriptors:{current application's NSSortDescriptor's sortDescriptorWithKey:"length" ascending:false}
   set findText to sortedList as list
   set ASTID to AppleScript's text item delimiters
   set AppleScript's text item delimiters to findText
   set sourceText to text items of sourceText
   set AppleScript's text item delimiters to replaceText
   set sourceText to "" & sourceText
   set AppleScript's text item delimiters to ASTID
   return sourceText
end findReplace

set cleanedSubject to my findReplace({"Re: ", "Aw: ", "AW: ", "Fwd: ", "FWD: ", "RE: "}, "", newSubject)
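
If you would rather not load the Foundation framework just for the sort, the same longest-to-shortest ordering can be sketched in plain AppleScript. The handler name sortByLengthDescending is hypothetical (it is not a built-in), and the simple insertion sort is only meant for short lists like this one.

```applescript
-- Hypothetical helper: return a copy of theList ordered from the longest
-- string to the shortest. Items of equal length keep their original order.
on sortByLengthDescending(theList)
	set sortedList to {}
	repeat with anItem in theList
		set thisText to contents of anItem
		set wasInserted to false
		repeat with i from 1 to (count sortedList)
			if (length of thisText) > (length of (item i of sortedList)) then
				-- thisText is longer than item i, so it goes in front of it
				if i = 1 then
					set sortedList to {thisText} & sortedList
				else
					set sortedList to (items 1 thru (i - 1) of sortedList) & {thisText} & (items i thru -1 of sortedList)
				end if
				set wasInserted to true
				exit repeat
			end if
		end repeat
		-- nothing shorter found: append at the end
		if not wasInserted then set end of sortedList to thisText
	end repeat
	return sortedList
end sortByLengthDescending

set sortedWords to sortByLengthDescending({"Re: ", "Aw: ", "Fwd: ", "FWD: "})
```

The sorted list can then be passed as findText to the plain findReplace handler from post #2.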

Happy New Year.

Last edited by KniazidisR (2022-01-01 11:28:38 am)




#5 2022-01-01 08:15:57 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 1213

Re: Find and replace multiple words in subject string

NewtonsLaws wrote:

Uff, thanks so much KniazidisR! Now I see how to use it properly. Case closed and many thanks again for that.



KniazidisR has answered the OP's question. However, in the future, someone may want an ASObjC solution and I've included one below.

Applescript:

use framework "Foundation"
use scripting additions

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |FWD: |Re: |Aw: |AW: |RE: "
set replaceText to ""
set theNewString to searchAndReplace(theString, findText, replaceText)

on searchAndReplace(theString, findText, replaceText)
   set theString to current application's NSString's stringWithString:theString
   return (theString's stringByReplacingOccurrencesOfString:findText withString:replaceText options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}) as text
end searchAndReplace

The following script differs only in that the search is case insensitive.

Applescript:

use framework "Foundation"
use scripting additions

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |Re: |Aw: "
set replaceText to ""
set theNewString to searchAndReplace(theString, findText, replaceText)

on searchAndReplace(theString, findText, replaceText)
   set theString to current application's NSString's stringWithString:theString
   set theOptions to (current application's NSRegularExpressionSearch as integer) + (current application's NSCaseInsensitiveSearch as integer) -- this method takes NSString compare options, so the case flag is NSCaseInsensitiveSearch
   return (theString's stringByReplacingOccurrencesOfString:findText withString:replaceText options:theOptions range:{0, theString's |length|()}) as text
end searchAndReplace

KniazidisR's recommendation that findText be arranged from the longest to shortest text applies to the scripts in this post.

Last edited by peavine (2022-01-02 06:59:10 am)


2018 Mac mini - macOS Monterey - Script Debugger 8


#6 2022-01-02 12:34:39 pm

wch1zpink
Member
Registered: 2011-08-20
Posts: 77

Re: Find and replace multiple words in subject string

If you prefer to use as few lines of AppleScript code as possible, the following code will also produce the desired result.

Applescript:

set theString to "Re: Aw: AW: Fwd: FWD: RE: This is a Subject Re: Aw: text"
set findText to "Fwd: |Re: |Aw: "

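-- quoted form of keeps the shell from interpreting quotes, $, ; and similar characters in the subject
set cleanedSubject to (do shell script "echo " & quoted form of theString & " | sed -E 's/" & findText & "//gi'")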
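
One thing to watch with the shell route: real-world subjects often contain apostrophes, semicolons or dollar signs, and the shell will interpret those unless the text is quoted. A minimal sketch of a more defensive variant, assuming a made-up subject line and a helper variable sedScript that are not part of the script above:

```applescript
-- Hypothetical subject containing characters the shell treats specially.
set theString to "Re: Don't forget; Fwd: $HOME"
set findText to "Fwd: |Re: |Aw: "

-- quoted form of wraps each piece in single quotes and escapes any
-- embedded single quote, so the shell passes the text through untouched.
set sedScript to "s/" & findText & "//g"
set cleanedSubject to (do shell script "echo " & quoted form of theString & " | sed -E " & quoted form of sedScript)
```

This sketch leaves out the case-insensitive flag used above to keep the moving parts to a minimum, so only the exact spellings in findText are removed.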

Last edited by wch1zpink (2022-01-03 05:59:16 pm)


#7 2022-01-03 05:04:52 pm

technomorph
Member
Registered: 2017-12-14
Posts: 242

Re: Find and replace multiple words in subject string

I highly recommend learning about RegEx. It’s something that I avoided for a long time as it seemed complicated, but I’ve found great use for it in finding variations on things. You can specify ignoring case, so you only need to define matching re|aw|fwd once.

The most important thing I’ve found is including conditions like “there must be a space before”: \s(re|aw|fwd):
That makes sure it doesn’t capture the “re:” in “where:” or the “aw:” in “saw:” if text like this happens to be in your source:
“This is some text where: you don’t want to remove what I saw: being fwd:”

You can also add an option to catch stuff like “Re.  Aw.  Fwd.” by letting the end segment be either : or .
Maybe you even want to catch “re aw fwd” without anything at the end except a space. In that case you can make the “: or .” optional, but ensure that it is followed by a space. In those cases you want to make sure that a space or word boundary comes before it, so it doesn’t capture things like:
“We all went there. You know what we saw. Fwd: it even won’t catch awkward thing regarding fwd messages”
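
To make those conditions concrete, here is a compact sketch that reuses the NSString approach from post #5. The handler name stripPrefixes and the test sentence are made up for illustration; the pattern asks for a word boundary before re, aw or fwd, then either : or ., then one whitespace character.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical handler: remove "re", "aw" or "fwd" (any case) when it starts
-- at a word boundary and is followed by ":" or "." plus one whitespace character.
on stripPrefixes(theText)
	set theString to current application's NSString's stringWithString:theText
	set theOptions to (current application's NSRegularExpressionSearch as integer) + (current application's NSCaseInsensitiveSearch as integer)
	set thePattern to "\\b(re|aw|fwd)[:.]\\s"
	return (theString's stringByReplacingOccurrencesOfString:thePattern withString:"" options:theOptions range:{0, theString's |length|()}) as text
end stripPrefixes

set cleanedText to stripPrefixes("Re: Aw. what I saw: here")
```

The word boundary is what protects "saw:" (the "aw" is not at the start of a word), and "here" survives because nothing after its "re" matches the : or . requirement.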


#8 2022-01-03 09:19:52 pm

technomorph
Member
Registered: 2017-12-14
Posts: 242

Re: Find and replace multiple words in subject string

Here's an example RegEx AppleScript:

Applescript:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to 1
property NSRegularExpressionUseUnicodeWordBoundaries : a reference to 64 -- 1 << 6, i.e. 0x40 in hex
property NSRegularExpressionAnchorsMatchLines : a reference to 16
property NSRegularExpressionSearch : a reference to 1024
property NSString : a reference to current application's NSString

property myTestName : ""

property mySourceA : ""
property mySourceB : ""
property myPattern1 : ""
property myPattern2 : ""
property myReplace : ""

property myTest1A : ""
property myTest2A : ""

property myTest1B : ""
property myTest2B : ""

property myTestExpect1 : ""
property myTestExpect2 : ""

property logRegEx : true
property logResults : true
property logDebug : false


-- REMOVE RE AW FWD
set myTestName to "REGEX REPLACE TEST"
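set myPattern1 to "\\b((re|aw|fwd)[:]\\s)" -- removes (re or aw or fwd) followed by : and a whitespace character
set myPattern2 to "\\b((re|aw|fwd)[:|.]\\s)" -- removes (re or aw or fwd) followed by (: or .) and a whitespace character; note that | is a literal pipe inside [ ], so [:.] would be enough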
set mySourceA to "Re. Aw: aw AW. Forward: FWD: RE. where: This is a Subject text"
set mySourceB to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace to ""
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
set myTest2A to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace

set myTest1B to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
set myTest2B to my findInString:mySourceB withPattern:myPattern2 replaceWith:myReplace

if logResults then
   log ("-------------------------------------------NEW TEST START")
   log {"----------------myTestName is", myTestName}
   
   log {"myPattern1 is", myPattern1}
   log {"myPattern2 is", myPattern2}
   log {"myReplace is", myReplace}
   
   log {"mySourceA is", mySourceA}
   
   log {"myTest1A is", myTest1A}
   log {"myTest2A is", myTest2A}
   
   log {"mySourceB is", mySourceB}
   log {"myTest1B is", myTest1B}
   log {"myTest2B is", myTest2B}
end if




-- MAIN FUNCTIONS


on findInString:aString withPattern:aRegExString replaceWith:aReplace
   set aRegEx to my createRegularExpressionWithPattern:aRegExString
   if logDebug then
       log {"aRegEx is:", aRegEx}
   end if
   return (my findInString:aString withRegEx:aRegEx replaceWith:aReplace)
end findInString:withPattern:replaceWith:

on findInString:aString withRegEx:aRegEx replaceWith:aReplace
   if logDebug then log ("findInString:withRegEx:replaceWith: START")
   set aSource to NSString's stringWithString:aString
   set aRepString to NSString's stringWithString:aReplace
   set aLength to aSource's |length|()
   set aRange to (current application's NSMakeRange(0, aLength))
   set aCleanString to (aRegEx's stringByReplacingMatchesInString:aSource options:0 range:aRange withTemplate:aRepString)
   
   return aCleanString
end findInString:withRegEx:replaceWith:

on createRegularExpressionWithPattern:aRegExString
   if (class of aRegExString) is equal to (NSRegularExpression's class) then
       log ("it already was a RegEx")
       return aRegExString
   end if
   set aPattern to NSString's stringWithString:aRegExString
   set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
   set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
   if (aError ≠ missing value) then
       log {"regEx failed to create aError is:", aError}
       log {"aError debugDescrip is:", aError's debugDescription()}
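       error "Could not create a regular expression from pattern: " & (aPattern as text) -- AppleScript has no break statement, so raise an error instead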
   end if
   return aRegEx
end createRegularExpressionWithPattern:

log output:

Applescript:

(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST*)
(*myPattern1 is, \b((re|aw|fwd)[:]\s)*)
(*myPattern2 is, \b((re|aw|fwd)[:|.]\s)*)
(*myReplace is, *)
(*mySourceA is, Re. Aw: aw AW. Forward: FWD: RE. where: This is a Subject text*)
(*myTest1A is, (NSString) "Re. aw AW. Forward: RE. where: This is a Subject text"*)
(*myTest2A is, (NSString) "aw Forward: where: This is a Subject text"*)
(*mySourceB is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1B is, (NSString) "your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"*)
(*myTest2B is, (NSString) "your message: This is where. You don’t want to remove what I saw: being fwd:"*)


#9 2022-01-03 09:37:19 pm

technomorph
Member
Registered: 2017-12-14
Posts: 242

Re: Find and replace multiple words in subject string

In the above you'll notice that the last test with mySourceB fails to catch the very last "fwd:", because the RegEx wants a qualifying space after it, but there it is followed by the end of the line. So I'll modify it like this:

Applescript:


-- REMOVE RE AW FWD - with factoring in LINE END
set myTestName to "REGEX REPLACE TEST V2"
set myPattern1 to "\\b((re|aw|fwd)[:|.](\\s|$))" -- removes (re or aw or fwd) followed by (: or .) and (space or end of line)
set myPattern2 to "(^|\\s)((re|fwd|aw)(\\:|\\.))(\\s|$)" -- removes (re or aw or fwd) preceded by (start of line or space) and followed by (: or .) and (space or end of line)
set mySourceA to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace1 to ""
set myReplace2 to "$1"
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace1
set myTest2A to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace1

set myTest1B to my findInString:mySourceA withPattern:myPattern2 replaceWith:" "
set myTest2B to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace2

if logResults then
   log ("-------------------------------------------NEW TEST START")
   log {"----------------myTestName is", myTestName}
   
   log {"myPattern1 is", myPattern1}
   log {"myPattern2 is", myPattern2}
   log {"myReplace1 is", myReplace1}
   
   log {"mySourceA is", mySourceA}
   
   log {"myTest1A is", myTest1A}
   log {"myTest2A is", myTest2A}
   
   log {"myReplace2 is", myReplace2}
   
   log {"myTest1B is", myTest1B}
   log {"myTest2B is", myTest2B}
end if

output from above:

Applescript:


(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST V2*)
(*myPattern1 is, \b((re|aw|fwd)[:|.](\s|$))*)
(*myPattern2 is, (^|\s)((re|fwd|aw)(\:|\.))(\s|$)*)
(*myReplace1 is, *)
(*mySourceA is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2A is, (NSString) "yourmessage: This is where. You don’t want to remove what I saw: being"*)
(*myReplace2 is, $1*)
(*myTest1B is, (NSString) " your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2B is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)

Now there are problems with the replacement.
myPattern1 is capturing ("Re: ") ("fwd. ") ("fwd:")
myPattern2 is capturing ("Re: ") (" fwd. ") (" fwd:")

myTest1A with myPattern1 is replacing with ""
It works fine except for the extra space at the end of "being ".

myTest2A with myPattern2 is replacing with ""
It fixes the end of "being" but now "yourmessage:" is smashed together.

myTest1B with myPattern2 is replacing with " "
It fixes "your message:" but now there is an extra space at the start and end.

myTest2B with myPattern2 is replacing with "$1"
$1 is the 1st captured group, which is the (^|\\s).
It works fine except for the extra space at the end of "being ".


Whenever I run into these situations I usually use a replacement of " ", and then after the replacements I run a "clean all extra whitespace" RegEx. That cleans the head and tail and any multiple spaces in between.

See the next post.


#10 2022-01-03 09:48:42 pm

technomorph
Member
Registered: 2017-12-14
Posts: 242

Re: Find and replace multiple words in subject string

Here's a script with whitespace-cleaning functions:

Applescript:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to 1
property NSRegularExpressionUseUnicodeWordBoundaries : a reference to 64 -- 1 << 6, i.e. 0x40 in hex
property NSRegularExpressionAnchorsMatchLines : a reference to 16
property NSRegularExpressionSearch : a reference to 1024
property NSCharacterSet : a reference to current application's NSCharacterSet
property NSString : a reference to current application's NSString

property myTestName : ""

property mySourceA : ""
property mySourceB : ""
property myPattern1 : ""
property myPattern2 : ""
property myReplace : ""

property myTest1A : ""
property myTest2A : ""

property myTest1B : ""
property myTest2B : ""

property myTestExpect1 : ""
property myTestExpect2 : ""

property logRegEx : true
property logResults : true
property logDebug : false



-- REMOVE RE AW FWD - with factoring in LINE END
set myTestName to "REGEX REPLACE TEST V3"
set myPattern1 to "\\b((re|aw|fwd)[:|.](\\s|$))"
set mySourceA to "Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:"
set myReplace1 to ""
set myTest1A to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace1
set myTest2A to my cleanAllWhiteSpaceInString:myTest1A

if logResults then
   log ("-------------------------------------------NEW TEST START")
   log {"----------------myTestName is", myTestName}
   
   log {"myPattern1 is", myPattern1}
   log {"myReplace1 is", myReplace1}
   
   log {"mySourceA is", mySourceA}
   
   log {"myTest1A is", myTest1A}
   log {"myTest2A is", myTest2A}
end if


-- MAIN FUNCTIONS


on findInString:aString withPattern:aRegExString replaceWith:aReplace
   set aRegEx to my createRegularExpressionWithPattern:aRegExString
   if logDebug then
       log {"aRegEx is:", aRegEx}
   end if
   return (my findInString:aString withRegEx:aRegEx replaceWith:aReplace)
end findInString:withPattern:replaceWith:

on findInString:aString withRegEx:aRegEx replaceWith:aReplace
   if logDebug then log ("findInString:withRegEx:replaceWith: START")
   set aSource to NSString's stringWithString:aString
   set aRepString to NSString's stringWithString:aReplace
   set aLength to aSource's |length|()
   set aRange to (current application's NSMakeRange(0, aLength))
   set aCleanString to (aRegEx's stringByReplacingMatchesInString:aSource options:0 range:aRange withTemplate:aRepString)
   
   return aCleanString
end findInString:withRegEx:replaceWith:

on createRegularExpressionWithPattern:aRegExString
   if (class of aRegExString) is equal to (NSRegularExpression's class) then
       log ("it already was a RegEx")
       return aRegExString
   end if
   set aPattern to NSString's stringWithString:aRegExString
   set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
   set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
   if (aError ≠ missing value) then
       log {"regEx failed to create aError is:", aError}
       log {"aError debugDescrip is:", aError's debugDescription()}
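       error "Could not create a regular expression from pattern: " & (aPattern as text) -- AppleScript has no break statement, so raise an error instead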
   end if
   return aRegEx
end createRegularExpressionWithPattern:

on cleanAllWhiteSpaceInString:aString
   set aPattern to (NSString's stringWithString:"\\s{2,}") -- any run of two or more whitespace characters
   set aDirtyString to (my findInString:aString withPattern:aPattern replaceWith:" ")
   set aCleanString to (my cleanWhiteSpaces:aDirtyString)
   return aCleanString
end cleanAllWhiteSpaceInString:

on cleanWhiteSpaces:aString
   set aCharSet to NSCharacterSet's whitespaceCharacterSet()
   return (my cleanString:aString withCharacterSet:aCharSet)
end cleanWhiteSpaces:



on cleanString:aString withCharacterSet:aCharSet
   set aDirtyString to NSString's stringWithString:aString
   set aCleanString to (aDirtyString's stringByTrimmingCharactersInSet:aCharSet)
   return aCleanString
end cleanString:withCharacterSet:

log output:

Applescript:


(*-------------------------------------------NEW TEST START*)
(*----------------myTestName is, REGEX REPLACE TEST V3*)
(*myPattern1 is, \b((re|aw|fwd)[:|.](\s|$))*)
(*myReplace1 is, *)
(*mySourceA is, Re: your fwd. message: This is where. You don’t want to remove what I saw: being fwd:*)
(*myTest1A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being "*)
(*myTest2A is, (NSString) "your message: This is where. You don’t want to remove what I saw: being"*)
