I have a list of 2000 strings, and I want to check whether each item starts with two digits, then a “.”, then four digits, and then a “/”.
Basically, I want to be sure that each item starts with something like this:
10.1074/JBC.270.42.24818 --> OK
10.100a/GENO.1997.5201 --> not OK
10.1126/SCIENCE.278.5340.1059 --> OK
I have never used regular expressions, but my guess is that I am looking for something like “^[0-9]{2}.[0-9]{4}” followed by a “/”. However, I don’t know how to include all of this in an AppleScript loop.
A single dot matches any character; to match a literal dot you have to escape it (\.).
So the pattern is “^\d{2}\.\d{4}/”. (\d is the same as [0-9])
AppleScriptObjC (Foundation’s NSString) can test strings against a regular expression:
use AppleScript version "2.5"
use framework "Foundation"
use scripting additions
property |⌘| : a reference to current application
set textLines to "10.1074/JBC.270.42.24818
10.100a/GENO.1997.5201
10.1126/SCIENCE.278.5340.1059"
-- Report every non-empty line that does not start with the expected prefix
repeat with aLine in (get paragraphs of textLines)
    if length of aLine > 0 and checkRegex(aLine, "^\\d{2}\\.\\d{4}/") is false then
        display dialog aLine buttons {"Cancel", "OK"} default button "OK"
    end if
end repeat

-- Returns true if thePattern matches somewhere in theText (the matched range has a non-zero length)
on checkRegex(theText, thePattern)
    set foundationString to |⌘|'s NSString's stringWithString:theText
    return (foundationString's rangeOfString:thePattern options:(|⌘|'s NSRegularExpressionSearch))'s |length|() > 0
end checkRegex
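With 2000 items, one dialog per failing line may get tedious. As a rough sketch of an alternative (not part of the original answer), the failing items could be collected in one pass with an NSPredicate. The variable names theItems, thePattern, thePredicate and badItems are made up for illustration. Note that the MATCHES operator has to match the whole string, which is why the pattern below ends with “.*”.

```applescript
use AppleScript version "2.5"
use framework "Foundation"
use scripting additions

-- Sample data from the question; in practice this would be the full list of items
set theItems to {"10.1074/JBC.270.42.24818", "10.100a/GENO.1997.5201", "10.1126/SCIENCE.278.5340.1059"}

-- MATCHES compares against the entire string, hence the trailing ".*"
set thePattern to "^\\d{2}\\.\\d{4}/.*"
set thePredicate to current application's NSPredicate's predicateWithFormat:"NOT (SELF MATCHES %@)" argumentArray:{thePattern}

-- Keep only the items that do NOT start with the expected prefix
set theArray to current application's NSArray's arrayWithArray:theItems
set badItems to (theArray's filteredArrayUsingPredicate:thePredicate) as list
```

badItems can then be shown in a single dialog or written to a file, instead of stopping at every failing line.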
Thanks for the info, Fredrik71.
I had already tried Atom.app, but had not noticed that it can be used to quickly build regular expressions.
Hello StefanK
In the thread https://macscripter.net/viewtopic.php?id=47299, Shane Stanley explained why using property NSNotFound : a reference to 9.22337203685477E+18 + 5807 isn’t really OK and offered alternative ways to get rid of the original oddity.
Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Sunday 15 March 2020 10:07:57
In this case you cannot use one of the suggested alternatives, because rangeOfString returns an NSRange. However, you can check its length instead. I have updated the post.
The escapedPatternForString: method is meant for dealing with strings that aren’t defined at the time of writing. A simple example might be the result of a dialog, like this:
use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions
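-- Name and amount must be separated by a tab, because the pattern below looks for \t
set theText to "Jim Jones" & tab & "12.00" & linefeed & "Jane Smith" & tab & "4.00"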
set theText to current application's NSString's stringWithString:theText
set theName to text returned of (display dialog "Enter company name" default answer "")
set theNameEsc to current application's NSRegularExpression's escapedPatternForString:theName
set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:("(?m)^" & theNameEsc & "\\t(.+)") options:0 |error|:(reference)
if theRegex is missing value then error theError's localizedDescription() as text
set theMatch to theRegex's firstMatchInString:theText options:0 range:{0, theText's |length|()}
if theMatch is missing value then error "Match not found"
set theRest to theText's substringWithRange:(theMatch's rangeAtIndex:1)
This means the script won’t fail just because the text entered contains a character that has a special meaning in a regular expression and would otherwise need escaping.
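To make the difference visible, here is a small sketch that counts matches with and without escaping. The company name “Smith & Co. (UK)” is a made-up input, and the names rawCount and escapedCount are only for illustration. Unescaped, the parentheses are read as a capture group rather than as literal characters, so the raw pattern should not find the literal text, while the escaped one should.

```applescript
use AppleScript version "2.5"
use framework "Foundation"
use scripting additions

-- Hypothetical user input that contains regex metacharacters: "." "(" ")"
set theName to "Smith & Co. (UK)"
set theText to current application's NSString's stringWithString:theName
set fullRange to {0, theText's |length|()}

-- Unescaped: "(UK)" is treated as a group that matches "UK", not as literal parentheses
set rawRegex to current application's NSRegularExpression's regularExpressionWithPattern:theName options:0 |error|:(missing value)
set rawCount to rawRegex's numberOfMatchesInString:theText options:0 range:fullRange

-- Escaped: every metacharacter is matched literally
set theNameEsc to current application's NSRegularExpression's escapedPatternForString:theName
set escRegex to current application's NSRegularExpression's regularExpressionWithPattern:theNameEsc options:0 |error|:(missing value)
set escapedCount to escRegex's numberOfMatchesInString:theText options:0 range:fullRange

-- Compare the two counts to see the effect of escaping
return {rawCount, escapedCount}
```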