Regular expression syntax

Dear all,

I have a list of 2000 strings, and I want to check whether each item starts with 2 digits, then a “.”, then 4 digits, and then a “/”.
Basically, I want to be sure that each of the items starts with something like this:
10.1074/JBC.270.42.24818 --> OK
10.100a/GENO.1997.5201 --> not OK
10.1126/SCIENCE.278.5340.1059 --> OK

I have never used regular expressions, but my guess is that I am looking for something like “^[0-9]{2}.[0-9]{4}” followed by a “/”. However, I don’t know how to include all of this in an AppleScript loop.

Thanks.

L.

A single dot matches any character. To check for a literal dot you have to escape it with a backslash (\.).

So the pattern is “^\d{2}\.\d{4}/” (\d is the same as [0-9]).

AppleScriptObjC (Foundation's NSString) can test strings against a regular expression. Note that each backslash has to be doubled inside an AppleScript string literal:

use AppleScript version "2.5"
use framework "Foundation"
use scripting additions

property |⌘| : a reference to current application

set textLines to "10.1074/JBC.270.42.24818
10.100a/GENO.1997.5201
10.1126/SCIENCE.278.5340.1059"

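-- Show a dialog for every non-empty line that does not match the pattern
repeat with aLine in (get paragraphs of textLines)
	if length of aLine > 0 and checkRegex(aLine, "^\\d{2}\\.\\d{4}/") is false then
		display dialog aLine buttons {"Cancel", "OK"} default button "OK"
	end if
end repeat

-- Returns true if theText contains a match for the regular expression thePattern
on checkRegex(theText, thePattern)
	set foundationString to |⌘|'s NSString's stringWithString:theText
	-- With NSRegularExpressionSearch the search string is treated as a regex; a zero-length range means no match
	return (foundationString's rangeOfString:thePattern options:(|⌘|'s NSRegularExpressionSearch))'s |length|() > 0
end checkRegex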
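
As an alternative to looping, the whole list could probably be filtered in one step with an NSPredicate. This is only a sketch, not the method used above: it assumes the items are already in an AppleScript list (the variable names theItems and badItems are made up for illustration). The MATCHES operator has to match the entire string, so the pattern drops the leading “^” and ends with “.*” to absorb the rest of each item.

```applescript
use AppleScript version "2.5"
use framework "Foundation"
use scripting additions

-- Sample data from the question; theItems is a hypothetical variable name
set theItems to {"10.1074/JBC.270.42.24818", "10.100a/GENO.1997.5201", "10.1126/SCIENCE.278.5340.1059"}
set theArray to current application's NSArray's arrayWithArray:theItems

-- MATCHES compares the whole string against the regex, hence the trailing ".*"
set thePred to current application's NSPredicate's predicateWithFormat:"NOT (SELF MATCHES %@)" argumentArray:{"\\d{2}\\.\\d{4}/.*"}

-- Keep only the items that do NOT start with 2 digits, a dot, 4 digits and a slash
set badItems to (theArray's filteredArrayUsingPredicate:thePred) as list
```

With the three sample items, badItems should contain only the second one. For 2000 strings this avoids one dialog per failure and gives a single list to inspect or log.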

Thanks for the info, Fredrik71.
I had already tried Atom.app, but had not noticed that it can be used to quickly build regular expressions.

Hello StefanK
In the thread https://macscripter.net/viewtopic.php?id=47299, Shane Stanley explained why using “property NSNotFound : a reference to 9.22337203685477E+18 + 5807” isn’t really OK, and provided alternative ways to get rid of the original oddity.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Sunday 15 March 2020 10:07:57

Yvan,

In this case you cannot use one of the suggested alternatives, because rangeOfString returns an NSRange. However, you can check the length field of the range instead. I have updated the post.

Thank you for the edited version, StefanK.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Sunday 15 March 2020 12:25:00

Thanks to all!
I now have what I was looking for.

Ciao, and stay at home these days!
L.

I found this was a good site for trying and figuring out my regexes:

https://www.regexpal.com/

And this tutorial:

https://www.regular-expressions.info/tutorial.html

The only trouble is figuring out what needs to be escaped for the NSString.

There is this NSRegularExpression method, but I found it didn’t seem to work all the time. Or maybe I wasn’t using it properly.

https://developer.apple.com/documentation/foundation/nsregularexpression/1408386-escapedpatternforstring?language=objc

The escapedPatternForString: method is meant for dealing with strings that aren’t defined at the time of writing. A simple example might be the result of a dialog, like this:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

set theText to "Jim Jones	12.00
Jane Smith	4.00"
set theText to current application's NSString's stringWithString:theText
set theName to text returned of (display dialog "Enter company name" default answer "")
set theNameEsc to current application's NSRegularExpression's escapedPatternForString:theName
set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:("(?m)^" & theNameEsc & "\\t(.+)") options:0 |error|:(reference)
if theRegex is missing value then error theError's localizedDescription() as text
set theMatch to theRegex's firstMatchInString:theText options:0 range:{0, theText's |length|()}
if theMatch is missing value then error "Match not found"
set theRest to theText's substringWithRange:(theMatch's rangeAtIndex:1)

It means the script won’t fail because the text entry contains a character that needs escaping.
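
To see the effect of the escaping, here is a small sketch that counts matches with and without escapedPatternForString:. The handler countMatches and the sample strings are hypothetical and only for illustration; “A.B” stands in for whatever the user might type into the dialog.

```applescript
use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

-- Hypothetical helper: returns how many times thePattern matches in theText
on countMatches(thePattern, theText)
	set theString to current application's NSString's stringWithString:theText
	set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(reference)
	if theRegex is missing value then error (theError's localizedDescription() as text)
	return (theRegex's numberOfMatchesInString:theString options:0 range:{0, theString's |length|()}) as integer
end countMatches

set theName to "A.B" -- stands in for user input containing a regex metacharacter
set theNameEsc to (current application's NSRegularExpression's escapedPatternForString:theName) as text

set rawCount to countMatches(theName, "AxB") -- unescaped: the dot matches any character, including "x"
set escCount to countMatches(theNameEsc, "AxB") -- escaped: the dot only matches a literal "."
```

Here rawCount should come out as 1 and escCount as 0. Only the user-supplied part is escaped; the surrounding pattern pieces such as “(?m)^” and “\\t(.+)” are written by hand and must stay unescaped.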

Thanks Shane, I’ll have to recheck my code.
I may have been escaping the whole pattern.