Anyways, just for practice, I rewrote my earlier regex pattern to include a few of your suggestions and to meet applescource’s latest request. I used the ASObjC script that I posted above as a test vehicle (I don’t have BBEdit).
Tested and working
Though your ‘regexReplace()’ func using the ‘L’ lowercase switch works fine, I realized that getting back all lower-case words created its own problem: sentences without the 1st word capitalized.
So, I modded your code by adding a new ‘S’ switch for ‘Sentence Case’:
-- func comments - add description of 'S' switch
S - lowercase an entire capture group, then upper-case the 1st word of each sentence
...
-- line 14, added 'S' switch to backRefRegex pattern
set backRefRegex to |⌘|'s class "NSRegularExpression"'s regularExpressionWithPattern:("(?<!\\\\)(\\$[ULIFS]?)(\\d{1,2})") options:(0) |error|:(missing value)
...
-- new line 69, code to capitalize 1st word of each sentence
else if (backRefPrefix ends with "S") then
	set sentenceCasePattern to "([\\.\\?\\!]\\W*\\b)(\\w)"
	set sentenceCaseTemplate to "$1$U2"
	set textMatchedByCaptureGroup to textMatchedByCaptureGroup's lowercaseString()
	-- recursive call to regexReplace()
	set sentenceCaseText to my regexReplace(textMatchedByCaptureGroup, sentenceCasePattern, sentenceCaseTemplate)
	set textMatchedByCaptureGroup to sentenceCaseText
end if
The code 1st lowercases the capture group, then makes a recursive call to regexReplace() with a pattern matching the 1st letter that follows each sentence-ending ‘.’, ‘?’ or ‘!’, capitalizing it.
There’s probably a more efficient way to do this, but it works.
One flaw in this method is that proper nouns and other words that should be capitalized will be lowercase, but this should probably be handled in a separate operation.
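For comparison, here is a minimal standalone sketch of the same two steps (lowercase everything, then upper-case the 1st letter of each sentence) without the recursive call. It is not part of the regexReplace() handler: the handler name sentenceCase() and its variable names are made up for illustration. It also assumes a slightly different pattern, adding ‘^’ as an alternative to the punctuation, because the pattern above only fires after ‘.’, ‘?’ or ‘!’ and so, as far as I can tell, would leave the very first word of the text lowercase.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical helper, not from the thread's regexReplace() handler.
on sentenceCase(theText)
	-- Step 1: lowercase the whole text
	set loweredText to (current application's NSString's stringWithString:theText)'s lowercaseString()
	set workText to current application's NSMutableString's stringWithString:loweredText
	-- Step 2: find the 1st word character at the start of the text or after . ? !
	-- (assumed pattern; the '^' alternative is an addition to the one posted above)
	set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:"(?:^|[.?!])\\W*(\\w)" options:0 |error|:(missing value)
	set fullRange to {location:0, |length|:loweredText's |length|()}
	set theMatches to theRegex's matchesInString:loweredText options:0 range:fullRange
	-- Work backwards so earlier ranges stay valid even if a replacement changes length
	repeat with i from (theMatches's |count|()) to 1 by -1
		set letterRange to ((theMatches's objectAtIndex:(i - 1))'s rangeAtIndex:1)
		set upperLetter to ((loweredText's substringWithRange:letterRange)'s uppercaseString())
		(workText's replaceCharactersInRange:letterRange withString:upperLetter)
	end repeat
	return workText as text
end sentenceCase

sentenceCase("HELLO WORLD. HOW ARE YOU? fine!")
--> "Hello world. How are you? Fine!"
```

Like the ‘S’ switch, this treats any ‘.’, ‘?’ or ‘!’ as a sentence end, so abbreviations such as "e.g." get a capital after the period, and proper nouns still come back lowercase.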
A fun exercise! But in practice it’s just as easy for the script using the handler to pass a suitable pattern and template to it as to have the handler pass them to itself. More flexible too, since the definition of a sentence is then controlled by the script, not by the handler.
You’re probably right.
Just seemed logically consistent to add it here. Though, I imagine the definition of a sentence might be different in other languages.