Search and replace multiple delimiters in text string?

I have a text string with garbled delimiters. Periods, commas and semicolons are all used, something like:

value1,v2.v3;,v4,.;v5

Note that there is more than one delimiter between some of the fields.

I want to replace all this with just comma and if there are multiple delimiters they should be replaced with just one delimiter.

I was thinking about doing a recursive search for two commas in a row but there must be easier ways?

I don’t even see a single instance of two commas in a row in that string.

P.S. Maybe you just need to replace anything that’s not a letter, number, or comma with a comma, then replace multiple commas with a single comma.

but regex gurus may have an easier solution.
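A minimal sketch of that idea, assuming the field values contain only ASCII letters and digits (an assumption, not something stated in the question). The pattern `[^A-Za-z0-9]+` is my own choice: it treats every run of other characters as a single delimiter, so the two steps collapse into one replacement. Note that it would also turn spaces into commas.

```applescript
use framework "Foundation"
use scripting additions

set theString to "value1,v2.v3;,v4,.;v5"
set theString to current application's NSString's stringWithString:theString
-- options:1024 is NSRegularExpressionSearch, so the search string is a regex.
-- The pattern matches any run of characters that are not ASCII letters or digits.
set theString to theString's stringByReplacingOccurrencesOfString:"[^A-Za-z0-9]+" withString:"," options:1024 range:{0, theString's |length|()}
theString as text --> "value1,v2,v3,v4,v5"
```

Leading or trailing delimiters would still leave a comma at the start or end of the result.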

You can do this with vanilla AppleScript. No need to involve a shell.

lagr. The following solution uses basic AppleScript and has a timing result of less than a millisecond. As written, it will only work with four or fewer consecutive unwanted characters but this can be changed.

set theString to "value1,,v2.v3;,v4,.;v5"

set {TID, text item delimiters} to {text item delimiters, {".", ";"}}
set theString to text items of theString
set text item delimiters to {","}
set theString to theString as text
set text item delimiters to {",,,,", ",,,", ",,"}
set theString to text items of theString
set text item delimiters to {","}
set theString to theString as text --> "value1,v2,v3,v4,v5"
set text item delimiters to TID
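One way the four-character limit might be lifted is to collapse doubled commas in a loop instead of listing fixed-length runs. This is only a sketch of that variation, not the script as posted; the variable name theItems and the longer run of semicolons in the sample string are mine.

```applescript
set theString to "value1,,v2.v3;,v4,.;;;;;;v5"

-- Save the current delimiters, then turn every "." and ";" into ","
set {TID, text item delimiters} to {text item delimiters, {".", ";"}}
set theItems to text items of theString
set text item delimiters to {","}
set theString to theItems as text

-- Each pass replaces ",," with ",", so the string gets shorter until no pair is left
repeat while theString contains ",,"
	set text item delimiters to {",,"}
	set theItems to text items of theString
	set text item delimiters to {","}
	set theString to theItems as text
end repeat

set text item delimiters to TID
theString --> "value1,v2,v3,v4,v5"
```

The loop needs more passes for longer runs, so it trades a little speed for not having to guess the maximum run length in advance.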

The following is an ASObjC solution that also takes less than a millisecond. I prefer this for its brevity.

use framework "Foundation"
use scripting additions

set theString to "value1,,v2..v3;,v4,.;v5"
set theString to current application's NSMutableString's stringWithString:theString
-- options:1024 is NSRegularExpressionSearch, so the search string is treated as a regex
theString's replaceOccurrencesOfString:"[,;.]+" withString:"," options:1024 range:{0, theString's |length|()}
set theString to theString as text --> "value1,v2,v3,v4,v5"

The multiple text item delimiters appear to be matched in order, so they can also be grouped into something like:


{",,,,", ",,,", ",.;", ",,", ";,", ".", ",", ";"}

I went with recursion. It might not be super efficient clockcycle-wise but it is very efficient codereading-wise:

on findReplace(findText, replaceText, sourceText)
	-- code that replaces ,.; with comma as delimiter
	-- then collapse doubled commas, recursing until none remain
	if sourceText contains ",," then
		set sourceText to my findReplace(",,", ",", sourceText)
	end if
	return sourceText
end findReplace

Very short, very easy to read and very simple = bug-free.

The only issue that remains is that the string might end with one or more delimiters, which this doesn’t handle.

lagr. I tested your script and received a stack overflow error.

No you didn’t. And if that happens, you just add regular error handling and recurse on/in that.

Code I linked to above solves it all; internal, leading, trailing garbage:

set theText to "value1,,v2.v3;,v4,.;v5"
set {TIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, {",", ".", ";", ":", "whatever"}}
set textItems to every text item of theText
repeat with x from 1 to count textItems
	if item x of textItems = "" then set item x of textItems to missing value
end repeat
set theText to text of textItems as text -- this uses the first item in the list of delims
set AppleScript's text item delimiters to TIDs
return theText

I edited my ASObjC solution to remove the specified delimiter characters from the end of the string. The timing result as written was 0.3 millisecond. I increased the length of theString to 2380 words, and the timing result was 1 millisecond.

use framework "Foundation"
use scripting additions

set theString to "value1,,v2..v3;,v4,.;v5.;,"
set theString to current application's NSMutableString's stringWithString:theString
theString's replaceOccurrencesOfString:"[,;.]+" withString:"," options:1024 range:{0, theString's |length|()}
theString's replaceOccurrencesOfString:",$" withString:"" options:1024 range:{0, theString's |length|()}
set theString to theString as text --> "value1,v2,v3,v4,v5"

@peavine, your second replaceOccurrencesOfString statement can be simplified and you can add a space after the comma for more readability. If someone needs it, the result can be coerced to a list:

use framework "Foundation"
use scripting additions

set theString to "..;;,value1,,v2..v3;,v4,.;v5,.;"
set theString to current application's NSMutableString's stringWithString:theString
theString's replaceOccurrencesOfString:"[,;.]+" withString:", " options:1024 range:{0, theString's |length|()}
theString's replaceOccurrencesOfString:", $|^, " withString:"" options:1024 range:{0, theString's |length|()}
return theString as text --> "value1, v2, v3, v4, v5"
-- or, to get a list, use this line in place of the return above:
return (theString's componentsSeparatedByString:", ") as list --> {"value1", "v2", "v3", "v4", "v5"}

Thanks Jonas. Your replacement for my second regex pattern is way better. I’ll make that change in my script.