
#1 2022-09-30 04:08:57 pm

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 1517

RegEx Search and Incremented Replacement

I've been working on something without success. The following script replaces an asterisk at the start of a line with a dash and works as expected. However, I want each asterisk replaced with a consecutive number instead (1, 2, 3, and so on). This is easily done with a repeat loop, but I wondered whether that could be avoided. A Google search turned up some suggestions, but I didn't understand them. This is not for a particular project, just to learn. Thanks.

Applescript:

use framework "Foundation"

set theString to "a line
* aa
* bb
* cc
a line"


set theCharacter to "*"
set thePattern to "(?m)^(\\h*)\\" & theCharacter
set theString to current application's NSString's stringWithString:theString
set theString to (theString's stringByReplacingOccurrencesOfString:thePattern withString:("$1" & "-") options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()})
return theString as text

The desired text is:

a line
1 aa
2 bb
3 cc
a line


2018 Mac mini - macOS Monterey - Script Debugger 8


#2 2022-10-02 01:28:22 am

StefanK
Member
From: St. Gallen, Switzerland
Registered: 2006-10-21
Posts: 11777

Re: RegEx Search and Incremented Replacement

A repeat loop is necessary in order to increment the counter.

This is a solution with `NSRegularExpression`, where the `NSRegularExpressionAnchorsMatchLines` option makes the ^ character match at the start of every line.

The substrings must be replaced from last to first, because each replacement changes the length of the string and would otherwise invalidate the ranges of the matches that follow.

Applescript:

use framework "Foundation"

set theString to "a line
* aa
* bb
* cc
a line"


set theString to current application's NSMutableString's stringWithString:theString
-- AnchorsMatchLines lets ^ match at the start of every line, not just at the start of the string.
set regex to current application's NSRegularExpression's regularExpressionWithPattern:"^\\*\\s" options:(current application's NSRegularExpressionAnchorsMatchLines) |error|:(missing value)
set matches to regex's matchesInString:theString options:0 range:{0, theString's |length|()}
set theCounter to count matches
-- Work backwards so that the earlier ranges stay valid while the string grows.
repeat with i from theCounter to 1 by -1
   set theRange to (matches's objectAtIndex:(i - 1))'s range()
   (theString's replaceCharactersInRange:theRange withString:((theCounter as text) & " - "))
   set theCounter to theCounter - 1
end repeat
return theString as text


regards

Stefan


#3 2022-10-02 02:59:34 am

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-20
Posts: 5588

Re: RegEx Search and Incremented Replacement

Here's a minor reworking of Stefan's solution which I think gives the results peavine wanted. Basically the regex is different and the extraction and use of the ranges is slightly optimised:

Applescript:

use framework "Foundation"

set theString to "a line
* aa
* bb
* cc
a line"


set theString to current application's NSMutableString's stringWithString:theString
-- Look-behinds don't allow unbounded repeats, but do allow bounded ones. Up to 10 leading whitespace characters should be enough.
set regex to current application's NSRegularExpression's ¬
   regularExpressionWithPattern:"(?m)(?<=^\\h{0,10})\\*" options:(0) |error|:(missing value)
set matches to regex's matchesInString:theString options:0 range:{0, theString's |length|()}
set ranges to matches's valueForKey:"range"
repeat with i from (count ranges) to 1 by -1
   (theString's replaceCharactersInRange:(ranges's item i) withString:(i as text))
end repeat
return theString as text
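
A possible variation, offered only as a sketch: capturing the asterisk in a group and taking the match's rangeAtIndex:1 avoids the look-behind and its fixed upper bound on leading whitespace. The sample text and the variable names sampleText and asteriskRange are assumptions made for this example.

Applescript:

use framework "Foundation"

-- Sample text built with explicit linefeeds; the second list item is indented.
set sampleText to "a line" & linefeed & "* aa" & linefeed & "   * bb" & linefeed & "a line"
set theString to current application's NSMutableString's stringWithString:sampleText
-- Group 1 is the asterisk alone; leading whitespace is matched but left untouched.
set regex to current application's NSRegularExpression's ¬
   regularExpressionWithPattern:"(?m)^\\h*(\\*)" options:(0) |error|:(missing value)
set matches to regex's matchesInString:theString options:0 range:{0, theString's |length|()}
-- Replace from last to first so that the remaining ranges stay valid.
repeat with i from (count matches) to 1 by -1
   set asteriskRange to ((matches's objectAtIndex:(i - 1))'s rangeAtIndex:1)
   (theString's replaceCharactersInRange:asteriskRange withString:(i as text))
end repeat
return theString as text

With this sample, the two asterisks should become 1 and 2 while the indentation of the second item is kept.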


NG


#4 2022-10-06 07:27:40 pm

technomorph
Member
Registered: 2017-12-14
Posts: 302

Re: RegEx Search and Incremented Replacement

Here's a script that includes some handlers I find myself using a lot.
Mainly:
- create a RegEx from a pattern
- check whether a RegEx has any matches in a string
- replace a RegEx's matches in a string using a template
(the continual need to create a range for every call... arrrrgh)

Workflow:
- set replaceIndex to 1
- create aRegEx
- split the test string into aArray
- enumerate aArray, taking each aLine
- set aNewLine to aLine
- check whether aRegEx matches aLine
- if YES, create the replace string from replaceIndex plus a space,
   set aNewLine to the result of replaceMatchesForRegEx:inString:withPattern:,
   and increment replaceIndex
- add aNewLine to aFinalArray
- create aFinalString by joining aFinalArray with linefeed


Applescript:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
use framework "Foundation"

property NSArray : a reference to current application's NSArray
property NSMutableArray : a reference to current application's NSMutableArray
property NSString : a reference to current application's NSString

property NSRegularExpression : a reference to current application's NSRegularExpression
-- Plain integer values of the Foundation option constants
property NSRegularExpressionCaseInsensitive : 1
property NSRegularExpressionUseUnicodeWordBoundaries : 64
property NSRegularExpressionAnchorsMatchLines : 16
property NSRegularExpressionSearch : 1024

--logging
property logRXMatches : false
property logDebugMode : false

--test properties
property aTestPattern : "^\\*\\s"
property aRegEx : missing value
property aReplaceIndex : 1
property aDelimiter : linefeed
property aTestString : ""
property aTestArray : {}
property aFinalString : ""

set aTestString to "a line
* aa
* bb
* cc
a line"


set aResult to (my incrementalReplaceInString:aTestString withPattern:aTestPattern startIndex:aReplaceIndex) as text

(*
   a line
   1 aa
   2 bb
   3 cc
   a line
*)


on incrementalReplaceInString:aString withPattern:aPattern startIndex:aIndex
   set aFinalArray to NSMutableArray's array()
   set aReplaceIndex to aIndex
   set aRegEx to (my createRegularExpressionWithPattern:aPattern)
   set aTestArray to (my splitString:aString usingDelimiter:(my aDelimiter))
   set aCount to aTestArray's |count|()
   repeat with aIndex from 0 to (aCount - 1)
       set aLine to (aTestArray's objectAtIndex:aIndex)
       set aNewLine to aLine
       if (my containsMatchesForRegEx:aRegEx inString:aLine) then
           set aRepString to NSString's stringWithFormat_("%@ ", aReplaceIndex)
           if (my logDebugMode) then
               log {"aReplaceIndex is:", aReplaceIndex}
               log {"aRepString is:", aRepString}
           end if
           set aNewLine to (my replaceMatchesForRegEx:aRegEx inString:aLine withPattern:aRepString)
           set aReplaceIndex to aReplaceIndex + 1
       end if
       (aFinalArray's addObject:aNewLine)
   end repeat
   
   set aFinalString to (aFinalArray's componentsJoinedByString:(my aDelimiter))
   if (my logDebugMode) then
       log {"aTestArray is:", aTestArray}
       log {"aFinalArray is:", aFinalArray}
       log {"aFinalString is:", aFinalString}
   end if
   return aFinalString
end incrementalReplaceInString:withPattern:startIndex:



-- CREATE A REGULAR EXPRESSION

on createRegularExpressionWithPattern:aRegExString
   if (class of aRegExString) is equal to (NSRegularExpression's class) then
       if (my logDebugMode) then log ("it already was a RegEx")
       return aRegExString
   end if
   set aPattern to NSString's stringWithString:aRegExString
   set regOptions to NSRegularExpressionCaseInsensitive + NSRegularExpressionUseUnicodeWordBoundaries
   set {aRegEx, aError} to (NSRegularExpression's regularExpressionWithPattern:aPattern options:regOptions |error|:(reference))
   if (aError ≠ missing value) then
       log {"regEx failed to create aError is:", aError}
       log {"aError debugDescrip is:", aError's debugDescription()}
       return missing value
   end if
   return aRegEx
end createRegularExpressionWithPattern:

-- REGEX MATCHING FUNCTIONS
-- CONTAINS MATCHES?
on containsMatchesForRegEx:aRegEx inString:aString
   if (aRegEx = missing value) then
       if (my logDebugMode) then log ("aRegEx was nil")
       return false
   end if
   set aCount to (my countOfMatchesForRegEx:aRegEx inString:aString)
   return (aCount > 0)
end containsMatchesForRegEx:inString:

-- COUNT OF MATCHES
on countOfMatchesForRegEx:aRegEx inString:aString
   set matches to (my matchesForRegEx:aRegEx inString:aString)
   return matches's |count|()
end countOfMatchesForRegEx:inString:

-- REGEX MATCHES
on matchesForRegEx:aRegEx inString:aString
   set aSource to NSString's stringWithString:aString
   set aLength to aSource's |length|()
   set aRange to (current application's NSMakeRange(0, aLength))
   set matches to (aRegEx's matchesInString:aSource options:0 range:aRange)
   if (my logRXMatches) then
       log {"matches for Pattern Logs ========"}
       log {"aString is:", aString}
       log {"aSource is:", aSource}
       log {"aLength is:", aLength}
       log {"aRange is:", aRange}
       log {"matches is:", matches}
       if (my logDebugMode) then my debugLogRegEx:aRegEx
   end if
   return matches
end matchesForRegEx:inString:

-- REGEX REPLACE MATCHES IN STRING WITH TEMPLATE

on replaceMatchesForRegEx:aRegEx inString:aString withPattern:aPattern
   set aLength to aString's |length|()
   set aRange to (current application's NSMakeRange(0, aLength))
   set aNewString to (aRegEx's stringByReplacingMatchesInString:aString options:0 range:aRange withTemplate:aPattern)
   return aNewString
end replaceMatchesForRegEx:inString:withPattern:


-- UTILITY SPLIT STRING
on splitString:aString usingDelimiter:aDelimiter
   set aSource to (NSString's stringWithString:aString)
   set aSplitter to (NSString's stringWithString:aDelimiter)
   set aArray to (aSource's componentsSeparatedByString:aSplitter)
   return aArray
end splitString:usingDelimiter:

-- DEBUGGING
on debugLogRegEx:aRegEx
   if (aRegEx = missing value) then
       log {"aRegEx is empty"}
       return
   end if
   set groupCount to aRegEx's numberOfCaptureGroups()
   set aPattern to aRegEx's pattern()
   set aCleanPattern to (NSRegularExpression's escapedPatternForString:aPattern)
   log {"aRegEx is", aRegEx}
   log {"aRegEx aPattern is", aPattern}
   log {"aRegEx aCleanPattern is", aCleanPattern}
   log {"aRegEx groupCount is", groupCount}
   log {"aRegEx options is", aRegEx's options()}
end debugLogRegEx:


#5 2022-10-08 10:06:58 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 1517

Re: RegEx Search and Incremented Replacement

Thanks Stefan, Nigel, and technomorph for the script suggestions. They all work great, and I appreciate your time.



Just for learning purposes, I wrote the following script to reset the counter when intervening non-list paragraphs are encountered. I ran timing tests with a string that contained 1,729 paragraphs with a 5-item asterisked list every 50 paragraphs. The timing results for Nigel's and my scripts were 16 and 28 milliseconds, respectively.

Applescript:

use framework "Foundation"
use scripting additions

set theString to "some text
   * line one
   * line two
some text
   * line one
   * line two
some text"


set numberedString to getNumberedString(theString)

on getNumberedString(theString)
   set theString to current application's NSMutableString's stringWithString:theString
   set thePattern to "(?m)^(\\h*)\\*(.*)$"
   set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:(thePattern) options:(0) |error|:(missing value)
   set theMatches to theRegex's matchesInString:theString options:0 range:{0, theString's |length|()}
   -- Start at {0, -1} so that a list item on the very first line is numbered 1.
   set {theCounter, priorRangeEnd} to {0, -1}
   repeat with aMatch in theMatches
       set aRange to aMatch's range()
       if priorRangeEnd = (aRange's location()) then
           set theCounter to theCounter + 1
       else
           set theCounter to 1
       end if
       (theString's replaceOccurrencesOfString:(thePattern) withString:("$1" & theCounter & "$2") options:(1024) range:aRange)
       set priorRangeEnd to (aRange's location()) + (aRange's |length|()) + 1
   end repeat
   return theString as text
end getNumberedString
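
A possible alternative, sketched under assumptions: processing the text line by line means no match ranges have to stay valid while the string changes length, for example when a counter reaches two digits. The handler name numberListsByLine and the sample text are hypothetical, and the sketch assumes the lines are separated by linefeeds.

Applescript:

use framework "Foundation"

-- numberListsByLine is a hypothetical handler name used only for this sketch.
on numberListsByLine(theText)
   set theLines to (current application's NSString's stringWithString:theText)'s componentsSeparatedByString:linefeed
   -- Group 1 captures only the asterisk; any leading whitespace is left in place.
   set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:"^\\h*(\\*)" options:0 |error|:(missing value)
   set newLines to current application's NSMutableArray's array()
   set theCounter to 0
   repeat with i from 0 to ((theLines's |count|()) - 1)
       set aLine to (theLines's objectAtIndex:i)'s mutableCopy()
       set aMatch to (theRegex's firstMatchInString:aLine options:0 range:{0, aLine's |length|()})
       if aMatch is missing value then
           set theCounter to 0 -- a non-list line ends the current list
       else
           set theCounter to theCounter + 1
           (aLine's replaceCharactersInRange:(aMatch's rangeAtIndex:1) withString:(theCounter as text))
       end if
       (newLines's addObject:aLine)
   end repeat
   return (newLines's componentsJoinedByString:linefeed) as text
end numberListsByLine

set sampleText to "intro" & linefeed & "* a" & linefeed & "   * b" & linefeed & "text" & linefeed & "* c"
numberListsByLine(sampleText)

With this sample, the first list should be numbered 1 and 2, and the item after "text" should start again at 1.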

Last edited by peavine (2022-10-12 07:08:30 am)



#6 2022-10-12 10:19:17 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 1517

Re: RegEx Search and Incremented Replacement

I was curious how a basic AppleScript would fare in my testing. The timing result was 233 milliseconds but improved to 11 milliseconds if enhanced with a script object (see below). The testing procedure was the same as that described above.

Applescript:

set theString to "some text
   * line one * one
   * line two * two
some text
   * line one
   * line two
some text"


set numberedString to getNumberedString(theString)

on getNumberedString(theString)
   script o
       property theParagraphs : (paragraphs of theString)
       property numberedParagraphs : {}
   end script
   
   set TID to text item delimiters
   set text item delimiters to "*"
   set theCounter to 1
   
   ignoring white space
       repeat with aParagraph in o's theParagraphs
           set aParagraph to contents of aParagraph
           if aParagraph begins with "*" then
               set end of o's numberedParagraphs to (text item 1 of aParagraph & (theCounter as text) & text items 2 thru -1 of aParagraph)
               set theCounter to theCounter + 1
           else
               set end of o's numberedParagraphs to aParagraph
               set theCounter to 1
           end if
       end repeat
   end ignoring
   
   set text item delimiters to linefeed
   set o's numberedParagraphs to (o's numberedParagraphs as text)
   set text item delimiters to TID
   return o's numberedParagraphs
end getNumberedString
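
For anyone unfamiliar with the script-object technique used above: referring to a list as a property of a script object generally makes access to its items much faster than referring to it through an ordinary variable. A minimal sketch of the idiom on its own, with a hypothetical handler name (sumList) and made-up data:

Applescript:

-- sumList is a hypothetical handler used only to show the idiom.
on sumList(theList)
   script o
       -- The list is referenced as a property of the script object.
       property listRef : theList
   end script
   
   set theTotal to 0
   repeat with i from 1 to (count o's listRef)
       set theTotal to theTotal + (item i of o's listRef)
   end repeat
   return theTotal
end sumList

set bigList to {}
repeat with i from 1 to 5000
   set end of bigList to i
end repeat

sumList(bigList) --> 12502500

The script object has to be created inside the handler, as here, so that its property is initialised from the handler's parameter at run time.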



#7 2022-10-12 03:59:45 pm

Fredrik71
Member
Registered: 2019-10-23
Posts: 1090

Re: RegEx Search and Incremented Replacement

peavine wrote:

11 milliseconds if enhanced with a script object...


To get 11 milliseconds on my machine I have to run 100 iterations in Script Geek ;)

Last edited by Fredrik71 (2022-10-12 04:00:06 pm)


Node-RED makes it easy to automate IoT


#8 2022-10-13 07:24:11 am

peavine
Member
From: Prescott, Arizona
Registered: 2018-09-04
Posts: 1517

Re: RegEx Search and Incremented Replacement

Fredrik71 wrote:

To get 11 milliseconds on my machine I have to run 100 iterations in Script Geek ;)



I've included below the script I used to run the timing tests. I didn't post it before because I didn't want to clutter the forum. To avoid doing that, I'll remove the script after a few days.

-- script deleted by peavine --

Last edited by peavine (2022-10-14 03:24:07 pm)



#9 2022-10-13 09:40:30 am

Fredrik71
Member
Registered: 2019-10-23
Posts: 1090

Re: RegEx Search and Incremented Replacement

@peavine
I ran 100 iterations of your new script (benchmark) on an M1 Max and got 5 ms on average.
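
For reference, a minimal sketch of averaging a timing over a number of iterations. This is not the benchmark script discussed in this thread; the handler name averageMilliseconds and the placeholder workload are assumptions made for the example.

Applescript:

use framework "Foundation"

-- averageMilliseconds is a hypothetical helper, not part of any script in this thread.
on averageMilliseconds(iterationCount)
   set startTime to current application's NSDate's timeIntervalSinceReferenceDate()
   repeat iterationCount times
       -- Placeholder workload; substitute a call to the handler being timed.
       set theParagraphs to paragraphs of ("a line" & linefeed & "* aa" & linefeed & "* bb")
   end repeat
   set elapsedSeconds to (current application's NSDate's timeIntervalSinceReferenceDate()) - startTime
   -- Average time per iteration, in milliseconds.
   return (elapsedSeconds * 1000) / iterationCount
end averageMilliseconds

averageMilliseconds(100)

The figure returned depends on the machine and on what else is running, so it is only useful for comparing scripts under the same conditions.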

Last edited by Fredrik71 (2022-10-13 09:41:05 am)


