Regular Expression Capture Groups

I’m working to learn the ASObjC implementation of Regular Expression capture groups. I’m using a rewrite of a script from Shane’s ASObjC book, and the goal is to get substrings in parentheses that match a pattern.

Scripts one and two work as expected, but I can’t get script three–which has two capture groups–to work. I’m not sure if my RegEx pattern is faulty or if I’m doing something else wrong. I tested the script in Shane’s book (page 82) with 3 capture groups, and it works fine. Thanks for any help.

use framework "Foundation"

-- script one
# set theString to "(Joe) and (Jack) and (John) and (30) and (40) and (50)"
# set thePattern to "\\((\\D.*?)\\)"
# set n to 1 --> {"Joe", "Jack", "John"}

-- script two
# set theString to "(Joe) and (Jack) and (John) and (30) and (40) and (50)"
# set thePattern to "\\((\\d.*?)\\)"
# set n to 1 --> {"30", "40", "50"}

-- script three
set theString to "(Joe) and (Jack) and (John) and (30) and (40) and (50)"
set thePattern to "\\((\\D.*?)\\)|\\((\\d.*?)\\)"
set n to 2 --> unable to set argument 2... 'utxt'("length"), 0 ] }> could not be coerced to type {_NSRange=QQ}.

set theString to current application's NSString's stringWithString:theString
set theOptions to 24 -- DotMatchesLineSeparators and AnchorsMatchLines
set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:theOptions |error|:(missing value)
set regExResults to theRegEx's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
log (count of items of regExResults) --> 6
set theMatches to {}
repeat with i from 1 to count of items of regExResults
	set aMatch to (item i of regExResults)
	if (aMatch's numberOfRanges()) as integer < (n + 1) then -- never true here: each match has 3 ranges (whole match plus 2 groups) and n + 1 is 3
		set end of theMatches to missing value
	else
		set theRange to (aMatch's rangeAtIndex:n) --> {location:9.22337203685478E+18, |length|:0}
		set end of theMatches to (theString's substringWithRange:theRange) as string
	end if
end repeat
return theMatches


The problem is that at some point the range’s location is NSNotFound, which is too big an integer for AppleScript even to represent accurately as a real. When you pass it back, the location therefore has a different value.

The solution is to test for this:

		set theRange to (aMatch's rangeAtIndex:n)
		if theRange's |length| > 0 then
			set end of theMatches to (theString's substringWithRange:theRange) as string
		end if

Thanks Shane. I made the changes you suggested and the script works great.

Just for learning purposes, the solution raises the question in my mind why a range’s location is NSNotFound. There are 6 regExResults, and all of them are valid substrings. I worked my way through the script in Script Debugger’s debug mode but couldn’t learn anything.

Hi peavine.

Your regex pattern “\((\D.*?)\)|\((\d.*?)\)” contains two capture groups, one on either side of the OR indicator “|”. They’re regarded as capture groups 1 and 2 even though they’re simply alternatives. There are entries for both of them in every match result, but only one of them actually matches the subtext found. So either aMatch’s rangeAtIndex:1 is the range of the subtext and its rangeAtIndex:2 indicates a non-match (location NSNotFound, length 0), or vice versa. You could reduce the number of capture groups to one in this particular case by having the OR within a group, e.g. “\((\D.*?|\d.*?)\)”. This way, rangeAtIndex:1 is always the one you want.
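As a rough sketch of the single-group suggestion (the handler name getGroupOne is made up for illustration; the rest follows the script in the first post), something like this should return all six substrings from capture group 1:

```applescript
use framework "Foundation"
use scripting additions

set theString to "(Joe) and (Jack) and (John) and (30) and (40) and (50)"
set textInParentheses to getGroupOne(theString) --> {"Joe", "Jack", "John", "30", "40", "50"}

-- getGroupOne is a hypothetical name used only for this sketch
on getGroupOne(theString)
	set theString to current application's NSString's stringWithString:theString
	-- one capture group; the alternation sits inside it
	set thePattern to "\\((\\D.*?|\\d.*?)\\)"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regExResults to theRegEx's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theMatches to current application's NSMutableArray's new()
	repeat with aMatch in regExResults
		-- group 1 takes part in every match, so its range is always usable
		(theMatches's addObject:(theString's substringWithRange:(aMatch's rangeAtIndex:1)))
	end repeat
	return theMatches as list
end getGroupOne
```

Because the only group participates in every match, no NSNotFound test is needed in the loop.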

Thanks Nigel for the explanation–it took a little thought but I understand things now.

Just as an aside, the above script can be simplified if only one capture group is present. The following returns everything within parentheses and returns a blank list if no parentheses are found. Also, the returned list includes an empty string if blank parentheses are encountered, although these can be filtered out in the repeat loop if desired. This script is easily modified to return text contained in other characters–one example being quoted text.

-- requires macOS El Capitan or newer
use framework "Foundation"
use scripting additions

set theString to "(Jack) and (Joe) and (30)" --> {"Jack", "Joe", "30"}
# set theString to "(Jack) and () and (Joe) and (30)" --> {"Jack", "", "Joe", "30"}
# set theString to "" --> {}

set textInParentheses to getTextInParentheses(theString)

on getTextInParentheses(theString)
	set theString to current application's NSString's stringWithString:theString
	set thePattern to "\\((.*?)\\)"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regExResults to theRegEx's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theMatches to current application's NSMutableArray's new()
	repeat with anItem in regExResults
		set theRange to (anItem's rangeAtIndex:1)
		(theMatches's addObject:(theString's substringWithRange:theRange))
	end repeat
	return theMatches as list
end getTextInParentheses

I was working to learn look-behind and look-ahead assertions and realized that they can be used to perform the same task as the script in post 5 above.

-- requires macOS El Capitan or newer
use framework "Foundation"
use scripting additions

set theString to "(Jack) and (Joe) and (30)" --> {"Jack", "Joe", "30"}

set textInParentheses to getTextInParentheses(theString)

on getTextInParentheses(theString)
	set theString to current application's NSString's stringWithString:theString
	set thePattern to "(?<=\\().*?(?=\\))"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regExResults to theRegEx's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theRanges to (regExResults's valueForKey:"range")
	set theMatches to current application's NSMutableArray's new()
	repeat with aRange in theRanges
		(theMatches's addObject:(theString's substringWithRange:aRange))
	end repeat
	return theMatches as list
end getTextInParentheses

One major limitation appears to be that the characters that bracket the desired text have to be different. Thus, at least in my testing, the script cannot be used to find text in quotes. The changed lines in the above script and the result are:

set theString to "\"Jack\" and \"Joe\" and \"30\"" --> {"Jack", " and ", "Joe", " and ", "30"}
set thePattern to "(?<=\\\").*?(?=\\\")"

Hi peavine.

Look-behinds and look-aheads can seem a bit odd at first. They don’t count towards a regex search’s progress through the source text. Text matched by a look-behind may already have been matched or passed over before the match it precedes is reached. Similarly, where a look-ahead is matched, the search resumes from the matched look-ahead text, not from after it. So where look-behind and look-ahead matches are identical and the characters between them essentially wildcards, the results are as you describe. You sometimes have to be very inventive to get round such possibilities. I think a capture group’s the way to go in this case.
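For comparison, here is a minimal sketch of the capture-group approach applied to quoted text (getTextInQuotes is a made-up handler name). The quotes are consumed as part of each match, so the search resumes after the closing quote and the " and " text between the pairs is not returned:

```applescript
use framework "Foundation"
use scripting additions

set theString to "\"Jack\" and \"Joe\" and \"30\""
set textInQuotes to getTextInQuotes(theString) --> {"Jack", "Joe", "30"}

-- getTextInQuotes is a hypothetical name used only for this sketch
on getTextInQuotes(theString)
	set theString to current application's NSString's stringWithString:theString
	-- the regex itself is "(.*?)" with literal quotes on both sides of the group
	set thePattern to "\"(.*?)\""
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regExResults to theRegEx's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theMatches to current application's NSMutableArray's new()
	repeat with aMatch in regExResults
		(theMatches's addObject:(theString's substringWithRange:(aMatch's rangeAtIndex:1)))
	end repeat
	return theMatches as list
end getTextInQuotes
```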

Thanks Nigel. I’ll stick with capture groups.

My script in post 5 works fine with one capture group. My script in post 1 is intended to work with 2 or more capture groups but is broken. I’ve included below a revised script which incorporates Shane’s fix and includes a few miscellaneous edits, which are just a matter of personal preference.

use framework "Foundation"
use scripting additions

set theString to "(Joe) and (30) and (A1) and (Jack) and (40) and (B1)"

set textInParentheses to getTextInParentheses(theString, 1)
-- The second parameter is the capture group, which in this instance would normally be set to 1 (all letters), 2 (all digits), or 3 (a combination of letters and digits). The script will throw an error if a particular capture group is not found (e.g. 4), and error correction needs to be added for this.

on getTextInParentheses(theString, captureGroup)
	set theString to current application's NSString's stringWithString:theString
	set thePattern to "(?i)\\(([a-z]*?)\\)|\\(([0-9]*?)\\)|\\(([a-z0-9]*?)\\)"
	set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regexResults to theRegex's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theMatches to current application's NSMutableArray's new()
	repeat with aMatch in regexResults
		set theRange to (aMatch's rangeAtIndex:captureGroup)
		if theRange's |length| > 0 then (theMatches's addObject:(theString's substringWithRange:theRange))
	end repeat
	return theMatches as list
end getTextInParentheses
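One possible way to add the missing error correction, offered only as a sketch, is to compare the requested group with the regex’s numberOfCaptureGroups before looping. Returning an empty list for an out-of-range group is my assumption about the desired behavior; raising an error would work just as well.

```applescript
use framework "Foundation"
use scripting additions

set theString to "(Joe) and (30) and (A1) and (Jack) and (40) and (B1)"
set lettersOnly to getTextInParentheses(theString, 1) --> {"Joe", "Jack"}
set outOfRange to getTextInParentheses(theString, 4) --> {}

on getTextInParentheses(theString, captureGroup)
	set theString to current application's NSString's stringWithString:theString
	set thePattern to "(?i)\\(([a-z]*?)\\)|\\(([0-9]*?)\\)|\\(([a-z0-9]*?)\\)"
	set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	-- this pattern has 3 capture groups; index 0 would be the whole match
	set groupCount to (theRegex's numberOfCaptureGroups()) as integer
	-- assumption: an out-of-range group yields an empty list rather than an error
	if captureGroup < 0 or captureGroup > groupCount then return {}
	set regexResults to theRegex's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	set theMatches to current application's NSMutableArray's new()
	repeat with aMatch in regexResults
		set theRange to (aMatch's rangeAtIndex:captureGroup)
		-- a group that took no part in this match has length 0 (location is NSNotFound)
		if theRange's |length| > 0 then (theMatches's addObject:(theString's substringWithRange:theRange))
	end repeat
	return theMatches as list
end getTextInParentheses
```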

I’ve gained a basic understanding of capture groups but one issue remained unclear. There are two types of capture group back references, and their formats are \n and $n. In the NSRegularExpression documentation, the \n back reference is discussed under “Regular Expression Metacharacters” and $n is discussed under “Template Matching Format”. In short, \n is used inside the search pattern and $n is used in the replacement template.

Like much having to do with regular expressions, an example helps. The following looks for consecutive duplicate words using \1 and replaces them with one instance of the duplicate words using $1. The search is case insensitive and does not match across paragraph returns, although both of these behaviors are easily changed.

use framework "Foundation"
use scripting additions

set theString to "This is is a test  test.
This This is another Another test."

set cleanedString to removeDuplicateWords(theString)

on removeDuplicateWords(theString)
	set thePattern to "(?i)\\b(\\w+)\\h+\\1\\b" -- \\1 is a back reference to (\\w+)
	set theString to current application's NSMutableString's stringWithString:theString
	set replaceCount to (theString's replaceOccurrencesOfString:thePattern withString:"$1" options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}) -- $1 is a back reference to (\\w+)
	return theString as text
end removeDuplicateWords
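A second small sketch, with made-up sample text, shows that a replacement template can refer to more than one group. Here $1 and $2 are swapped to turn a "Last, First" name around:

```applescript
use framework "Foundation"
use scripting additions

-- sample text invented for this illustration
set theString to current application's NSString's stringWithString:"Smith, John"
set thePattern to "(\\w+), (\\w+)" -- group 1 is the last name, group 2 the first name
set swappedString to (theString's stringByReplacingOccurrencesOfString:thePattern withString:"$2 $1" options:(current application's NSRegularExpressionSearch) range:{location:0, |length|:theString's |length|()}) as text
--> "John Smith"
```

No \n back reference appears in this pattern because nothing has to be matched twice; the groups are only referred to from the template.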

Check out the RegExKit app for Mac. It’s amazing for testing your regexes: it has tips and hints, shows you capture groups, etc. I use it all the time.

Thanks technomorph. I downloaded RegexKit from GitHub at:

https://github.com/forhappy/RegexKit

It’s a single app file of about 37 MB with an attractive interface and lots of helpful Regular Expression information. I had been using the Atom editor to test Regular Expressions, but I think RegexKit will be much better.

It will also supply you with “code” for your expression and replacement. I use the PHP code for Objective-C, as it escapes everything properly.

Here’s a script I use to test them in AppleScript. At the end is a bunch of commented-out “tests” or examples you might find useful.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : 1
property NSRegularExpressionUseUnicodeWordBoundaries : 64 -- 1 << 6; 40 would be UseUnixLineSeparators (32) + DotMatchesLineSeparators (8)
property NSRegularExpressionAnchorsMatchLines : 16
property NSRegularExpressionSearch : 1024
property NSString : a reference to current application's NSString

property myTestName : ""

property mySourceA : ""
property mySourceB : ""
property myPattern1 : ""
property myPattern2 : ""
property myReplace : ""

property myTestA1 : ""
property myTestA2 : ""

property myTestB1 : ""
property myTestB2 : ""
property myTestExpect1 : ""
property myTestExpect2 : ""

property logRegEx : true
property logResults : true
property logDebug : false



-- RUN TEMPLATE

-- \\b(WAV|24 bit|96|19\\.2)\\b
-- NEED FLAC MISSING BAD LOW REPLACE NOT LIVE

set aWordsPattern1 to my createPatternForMatchAnyWords:"WAV 24%bit 96 19.2"
set aWordsPattern2 to my createPatternForMatchAnyWords:"NEED%FLAC MISSING BAD LOW REPLACE NOT LIVE"

my testRegWithName:"TRACK QUALITY SCANNING TAGS FOR CONTAINS" pattern1:aWordsPattern1 pattern2:aWordsPattern2 source1:"FLAC 24 bit - 19.2 kHz" source2:"missing" replaceWith:"MATCHED" expecting1:"" expecting2:""


-- MAIN SCRIPT OBJECT FUNCTIONS
on testRegWithName:aName pattern1:patternNo1 pattern2:patternNo2 ¬
	source1:sourceA source2:sourceB replaceWith:aReplace ¬
	expecting1:expectNo1 expecting2:expectNo2
	my resetValues()
	set myTestName to aName
	if not patternNo1 is "" then set myPattern1 to patternNo1
	if not patternNo2 is "" then set myPattern2 to patternNo2
	if not sourceA is "" then set mySourceA to sourceA
	if not sourceB is "" then set mySourceB to sourceB
	if not aReplace is "" then set myReplace to aReplace
	if not expectNo1 is "" then set myTestExpect1 to expectNo1
	if not expectNo2 is "" then set myTestExpect2 to expectNo2
	
	my runTestA()
	my runTestB()
	if logResults then my logTestResults()
end testRegWithName:pattern1:pattern2:source1:source2:replaceWith:expecting1:expecting2:

on resetValues()
	set myTestName to ""
	set myPattern1 to "NONE"
	set myPattern2 to "NONE"
	set mySourceA to "NONE"
	set mySourceB to "NONE"
	set myReplace to ""
	
	set myTestA1 to "NONE"
	set myTestA2 to "NONE"
	
	set myTestB1 to "NONE"
	set myTestB2 to "NONE"
	
	set myTestExpect1 to "NONE"
	set myTestExpect2 to "NONE"
end resetValues

on runTestA()
	if mySourceA is "NONE" then
		return
	end if
	if not myPattern1 is "NONE" then
		set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	end if
	if not myPattern2 is "NONE" then
		set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	end if
end runTestA

on runTestB()
	if mySourceB is "NONE" then
		return
	end if
	if not myPattern1 is "NONE" then
		set myTestB1 to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
	end if
	if not myPattern2 is "NONE" then
		set myTestB2 to my findInString:mySourceB withPattern:myPattern2 replaceWith:myReplace
	end if
end runTestB

on logTestResults()
	log ("------------------------------------------- TEST RESULTS LOG")
	log {"----------------myTestName is", myTestName}
	
	log {"myPattern1 is", myPattern1}
	log {"myPattern2 is", myPattern2}
	log {"myReplace is", myReplace}
	
	log {"--------------mySourceA is", mySourceA}
	
	log {"myTestA1 is", myTestA1}
	log {"myTestA2 is", myTestA2}
	if not myTestExpect1 is "NONE" then
		log {"myTestExpect1 is", myTestExpect1}
	end if
	
	log {"--------------mySourceB is", mySourceB}
	log {"myTestB1 is", myTestB1}
	log {"myTestB2 is", myTestB2}
	if not myTestExpect2 is "NONE" then
		log {"myTestExpect2 is", myTestExpect2}
	end if
end logTestResults

-- MAIN FUNCTIONS


on findInString:aString withPattern:aRegExString replaceWith:aReplace
	set aRegEx to my createRegularExpressionWithPattern:aRegExString
	if logDebug then
		log {"aRegEx is:", aRegEx}
	end if
	return (my findInString:aString withRegEx:aRegEx replaceWith:aReplace)
end findInString:withPattern:replaceWith:

on findInString:aString withRegEx:aRegEx replaceWith:aReplace
	if logDebug then log ("findInString:withRegEx:replaceWith: START")
	set aSource to NSString's stringWithString:aString
	set aRepString to NSString's stringWithString:aReplace
	set aLength to aSource's |length|()
	set aRange to (current application's NSMakeRange(0, aLength))
	set aCleanString to (aRegEx's stringByReplacingMatchesInString:aSource options:0 range:aRange withTemplate:aRepString)
	
	return aCleanString
end findInString:withRegEx:replaceWith:

on createRegularExpressionWithPattern:aRegExString
	if (class of aRegExString) is equal to (NSRegularExpression's class) then
		log ("it already was a RegEx")
		return aRegExString
	end if
	set aPattern to NSString's stringWithString:aRegExString
	set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
	set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
	if (aError ≠ missing value) then
		log {"regEx failed to create aError is:", aError}
		log {"aError debugDescrip is:", aError's debugDescription()}
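		-- "break" is not an AppleScript command; raise an error so the caller never receives an unusable regex
		error "Could not create a regular expression from pattern: " & (aPattern as text)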
	end if
	return aRegEx
end createRegularExpressionWithPattern:



on createPatternForMatchAnyWords:aLine
	set aString to NSString's stringWithString:aLine
	set aArray to aString's componentsSeparatedByString:" "
	set aPattern to NSString's stringWithString:"\\b("
	if (logRegEx) then
		log {"createPatternForMatchAnyWords aArray is:", aArray}
	end if
	
	set aTotal to (aArray's |count|())
	repeat with i from 1 to aTotal
		set aWord to aArray's item i
		set aWord to (aWord's stringByReplacingOccurrencesOfString:"%" withString:" ")
		set aWordPattern to (NSRegularExpression's escapedPatternForString:aWord)
		if (i ≠ aTotal) then
			set aWordPattern to (aWordPattern's stringByAppendingString:"|")
		end if
		if (logRegEx) then
			log {"aWord is:", aWord}
			log {"aWordPattern is:", aWordPattern}
		end if
		set aPattern to (aPattern's stringByAppendingString:aWordPattern)
	end repeat
	set aPattern to aPattern's stringByAppendingString:")\\b"
	if (logRegEx) then
		log {"final pattern is:", aPattern}
	end if
	return aPattern
end createPatternForMatchAnyWords:


on createPatternForMatchAllWords:aLine
	set aString to NSString's stringWithString:aLine
	set aArray to aString's componentsSeparatedByString:" "
	set aPattern to NSString's stringWithString:"^"
	if (logRegEx) then
		log {"createPatternForMatchAllWords aArray is:", aArray}
	end if
	
	repeat with i from 1 to (aArray's |count|())
		set aWord to aArray's item i
		if ((aWord's |length|()) > 1) then
			set aWordPattern to (my createPatternForMatchWord:aWord)
		else
			set aWordPattern to (my createPatternForMatchLetter:aWord)
		end if
		if (logRegEx) then
			log {"aWordPattern is:", aWordPattern}
		end if
		set aPattern to (aPattern's stringByAppendingString:aWordPattern)
	end repeat
	set aPattern to aPattern's stringByAppendingString:".*$"
	if (logRegEx) then
		log {"final pattern is:", aPattern}
	end if
	return aPattern
end createPatternForMatchAllWords:

-- (?=.*\\bYou\\b)
on createPatternForMatchWord:aWord
	set aWordPattern to NSString's stringWithString:"(?=.*\\b"
	set aWordPattern to (aWordPattern's stringByAppendingString:aWord)
	set aWordPattern to (aWordPattern's stringByAppendingString:".?\\b)")
	return aWordPattern
end createPatternForMatchWord:

on createPatternForMatchLetter:aWord
	set aWordPattern to NSString's stringWithString:"(?=.*\\b"
	set aWordPattern to (aWordPattern's stringByAppendingString:aWord)
	set aWordPattern to (aWordPattern's stringByAppendingString:".{0,2}\\b)")
	return aWordPattern
end createPatternForMatchLetter:




(*
	-- /(^.*?\\.){1}
	
	
	my testRegWithName:"REMOVE FROM START TO FIRST PERIOD / ALT ALSO REMOVE TRACK." pattern1:"(^.*?\\.){1}" pattern2:"((?:^.*?\\.){1}(?:track\\.?)?)" source1:"@unionOfArrays.trackGenres" source2:"self.track.bitRate" replaceWith:"" expecting1:"" expecting2:""
*)


(*
	-- ^(?=.*\\bYou\\b)(?=.*\\bKnow\\b)(?=.*\\bLove\\b)(?=.*\\bYou\\b).*$
	-- 
	
	set aWordsPattern1 to my createPatternForMatchAllWords:"You Know I Love You"
	set aWordsPattern2 to my createPatternForMatchAllWords:"You Know I Fuck You"
	
	
	my testRegWithName:"MATCH ALL WORDS IN LINE TITLE TEST 01" pattern1:aWordsPattern1 pattern2:aWordsPattern2 source1:"If You Love Me (Let Me Know)" source2:"I Didn't Know I Loved You" replaceWith:"MATCHED" expecting1:"" expecting2:""
*)



(*
	my testRegWithName:"SINGLE ARTIST MATCH 3 MORE ADDS NO THE" pattern1:"(?>((^the\\s)|(\\s?(\\,|\\&|\\+)\\s?)|(\\s(and|vs)\\.?\\s)|^))(Eagles)(?>($|(\\,?\\s?)))" pattern2:"(?>((\\s?(\\,|\\&|\\+)\\s?)|(\\s(and|vs)\\.?\\s)|^))(Eagles)(?>($|(\\,?\\s?)))" source1:"The Eagles CCR, Eagles Rolling Stones The Eagles of Death and Eagles II" source2:"CCR, Eagles, Rolling Stones & eagles plus Eagles of Death vs Eagles" replaceWith:"$1MATCHED$8" expecting1:"" expecting2:""
	-- ((^.*)(?:(^the\s)|(\,\s?)|^)((Red)*.*(Hot)*.*(Chili)*.*(Peppers)*)(?:$|\,)(.*+))
*)


(*
	
	my testRegWithName:"SINGLE ARTIST MATCH" pattern1:"(?:(^the\\s)|(\\,\\s?)|^)(Eagles)(?:$|\\,)" pattern2:"((^.*)(?:(^the\\s)|(\\,\\s?)|^)(Eagles)(?:$|\\,)(.*+))" source1:"The Eagles of DeathMetal" source2:"CCR, Eagles, Rolling Stones" replaceWith:"MATCHED" expecting1:"" expecting2:""
*)

(*
	
	-- REMOVE DIGITS AND DASH FROM BEGINNING
	set myTestName to "REMOVE DIGITS AND DASH FROM BEGINNING"
	set myPattern1 to "/^(\\s*[0-9]+\\s*-?\\s*)"
	set myPattern2 to "/^(\\s*[0-9]+\\s*-?\\s*)/m"
	set mySourceA to "001 Come Together"
	set mySourceB to "123123123 - Believe"
	set myReplace to ""
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	set myTestB1 to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
	set myTestB2 to my findInString:mySourceB withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
		
		log {"mySourceB is", mySourceB}
		log {"myTestB1 is", myTestB1}
		log {"myTestB2 is", myTestB2}
	end if
	
	
	
	-- REMOVE THE FROM BEGINNING
	set myTestName to "REMOVE THE FROM BEGINNING"
	set myPattern1 to "/^the\\W/mi"
	set myPattern2 to ""
	set mySourceA to "The Beatles"
	set mySourceB to "Adam and the Ants"
	set myReplace to ""
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	((?:Red)?\b(?:Hot)?\b(?:Chili)?\b(?:Peppers)?\b)
	
	-- MATCH WHOLE WORD - EG WORK, hello
	set myTestName to "MATCH WHOLE WORD - EG WORK, hello"
	set myPattern1 to "\\b(\\w*work\\w*)\\b"
	set myPattern2 to "\\b(\\w*hello\\w*)\\b"
	set mySourceA to "hello 'worked? hello working all works and \"worked with \""
	set mySourceB to ""
	set myReplace to "XXXXX"
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	
	-- REMOVE BRACKETS AND BETWEEN
	set myTestName to "REMOVE BRACKETS AND BETWEEN"
	set myPattern1 to "(\\s*?\\(.+?\\)\\s*)+"
	set myPattern2 to "(\\W?(\\(|\\[|\\{).+?(\\)|\\]|\\})\\W?)+" -- also remove { and [
	set mySourceA to "Blah (blah1) (blah 2) me to (plus) check{all the men) and the [alll them]"
	set mySourceB to ""
	set myReplace to " "
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	
	-- REMOVE DASH TO END
	set myTestName to "REMOVE DASH TO END"
	set myPattern1 to "( -\\s?(.*))"
	set myPattern2 to "(\\s+-\\s?(.*))" -- also remove { and [
	set mySourceA to "Blah (blah1)  -me to (plus) check{all the men) and the [alll them]"
	set mySourceB to ""
	set myReplace to ""
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	
	-- CAPTURE YEAR 19xx or 20xx or 21xx
	set myTestName to "CAPTURE YEAR 19xx or 20xx or 21xx"
	set myPattern1 to "/\\s?(?:\\(|\\[|\\{)?([1-2][0|1|9][0-9]{2})(?:\\)|\\]|\\})?\\s?/i"
	set myPattern2 to "\\s?(?:\\(|\\[|\\{)?([1-2][0|1|9][0-9]{2})(?:\\)|\\]|\\})?\\s?\\-?\\s?(.*)\\s\\[.*(HD).*\\]"
	set mySourceA to "1971 - Paul Simon [24bit 96kHz 2010 HDtracks FLAC]"
	set mySourceB to "1923 Rolling Stones"
	set myReplace to "$2 $3 ($1)"
	set myTestExpect1 to "Paul Simon HD (1971)"
	set expectResults2 to "Rolling Stones (1923)"
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceB withPattern:myPattern1 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	-- CAPTURE YEAR V2
	set myTestName to "CAPTURE YEAR V2"
	set myPattern1 to "/\\s?(?:\\(|\\[|\\{)?([1-2][0|1|9][0-9]{2})(?:\\)|\\]|\\})?\\W+/i"
	set myPattern2 to ""
	set mySourceA to "2009 - The Rolling Stones - Great Album"
	set mySourceB to ""
	set myReplace to "$`$' ($1)"
	set expectResults1 to "The Rolling Stones - Great Album (2009)"
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to ""
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	
	-- CAPTURE YEAR V3 More Complete With Capture Groups
	set myTestName to "CAPTURE YEAR V3 More Complete With Capture Groups"
	set myPattern1 to "/(.*)\\s?(?:\\(|\\[|\\{)?([1-2][0|1|9][0-9]{2})(?:\\)|\\]|\\})?\\W+(.*)/i"
	set myPattern2 to ""
	set mySourceA to "2009 - The Rolling Stones - Great Album"
	set mySourceB to ""
	set myReplace to "$1$3 ($2)"
	set expectResults1 to "The Rolling Stones - Great Album (2009)"
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to ""
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	
	-- REMOVE DASH TO END
	set myTestName to "REMOVE DASH TO END"
	set myPattern1 to "( -\\s?(.*))"
	set myPattern2 to "(\\s+-\\s?(.*))" -- also remove { and [
	set mySourceA to "Blah (blah1)  -me to (plus) check{all the men) and the [alll them]"
	set mySourceB to ""
	set myReplace to ""
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
	
	-- MUSIC SPECIFIC
	
	-- GENRE REFORMAT
	set myTestName to "GENRE REFORMAT"
	--set myPattern1 to "(\\s?,\\s?)|(?:\\b)(\\s?[/]\\s?)(?:\\b)" -- this works
	set myPattern1 to "(^|\\W+)?(,|\\/)(\\W+|$)?" -- this works and also does not replace dashes
	set myPattern2 to "(^|\\W+)?(,|\\/|\\s)(\\W+|$)?" -- this works and also replaces spaces with dash
	set mySourceA to "futureSoul/RnB Disco Boogie, Funk"
	set mySourceB to ""
	set myReplace to " - "
	set expectResults1 to "futureSoul - RnB - Disco - Boogie - Funk"
	set myTestA1 to my findInString:mySourceA withPattern:myPattern1 replaceWith:myReplace
	set myTestA2 to my findInString:mySourceA withPattern:myPattern2 replaceWith:myReplace
	
	if logResults then
		log ("-------------------------------------------NEW TEST START")
		log {"----------------myTestName is", myTestName}
		
		log {"myPattern1 is", myPattern1}
		log {"myPattern2 is", myPattern2}
		log {"mySourceA is", mySourceA}
		log {"mySourceB is", mySourceB}
		log {"myReplace is", myReplace}
		log {"myTestA1 is", myTestA1}
		log {"myTestA2 is", myTestA2}
	end if
*)

-- MUSIC SPECIFIC

(*
	-- VS REPLACE
	set myTestName to "VS REPLACE"
	set myPattern1 to "(^|\\W+)(vs|versus)(\\W+|$)"
	set myPattern2 to "/(^|\\W+)(vs|versus)(\\W+|$)/mi"
	set mySourceA to "this is kerry vs. the world and her versus everything and dj vs me and vse "
	set mySourceB to ""
	set myReplace to " & "
	set myTestExpect to "this is kerry & the world and her & everything and dj & me and vse "
*)

-- *myTestA1  "this is kerry & the world and her & everything and dj & me and vse "*)
-- (*myTestA2 is, (NSString) "this is kerry vs. the world and her versus everything and dj vs me and vse "*)
(*
	
	-- SINGLE ARTIST MATCH
	-- (?:(^the\\s)|(\\,\\s?)|^)(Eagles)(?:$|\\,)
	set myTestName to "SINGLE ARTIST MATCH"
	set myPattern1 to "(?:(^the\\s)|(\\,\\s?)|^)(Eagles)(?:$|\\,)"
	set myPattern2 to "/(^|\\W+)(vs|versus)(\\W+|$)/mi"
	set mySourceA to "The Eagles of"
	set mySourceB to "CCR, Eagles"
	set myReplace to "XXXXXXXXXXXX"
*)


Here are a few sites that I’ve found helpful:

https://www.regexpal.com
I used this one before I found RegExKit. You might find some useful examples there.

This site is awesome for explaining more advanced topics and even simple ones. It’s also helpful for making your regexes more efficient.

https://www.regular-expressions.info/tutorial.html

Thanks technomorph.

BTW, have you found a way to save the regular expression and test string when working with RegexKit? I couldn’t find a way to do this, and it’s not really important, but I thought I’d ask.

No, I don’t think you can save them (or I haven’t figured it out). That’s why I copy and paste them and save them as I did in my script.

I definitely find I’m adjusting them often, as something doesn’t get captured or something gets captured that I don’t want.

Thanks technomorph. When you click on “RegEx Workspace” in the upper-left corner of RegexKit, a dialog states the following, which made me wonder if there might be a way to save the data. I don’t think there is, though, and RegexKit is a marvelous app regardless.