I have implemented a very primitive regexp algorithm based on this blog entry byBrian W. Kernighan, the code by Robert Pike. it is basically the NFA algorithm.
The algorithm seem to be implemented properly, and can at least return a match on a line at a time!
(or whatever character chunk below some 2^14 characters you choose.)
I did it mainly to stop the whiners about not having regexp in Applescript, secondly for the hell of it!
It is kindof primitive, no posix character classes, regular character classes, nor word-boundaries, but it has ^ $ for anchors, and supports the ‘.’ and ‘*’ characters.
It was fun to implement it, I hope it can be useful for something, because, if your needs are miniscule, this should be fast enough, and it is easy to fill in the rest of the details for making o a regexp object.
You’ll see by the example, that you’ll call it with characters of your text, as this was the easiest way to overcome the lack of a pointer to char. (char *)
# but refer to the link to this post. The excempt is Jon Pugh and his smart strings library
# Or HHAS and AppleMods, any AppleScript Library delivered free of charge, as a matter of fact.
set RE to "r$*"
set thisText to " TIf you wanted to get rich, how would you do it? I think your best bet would be to start or join a startup. That's been a reliable way to get rich for hundreds of years. The word \"startup\" dates from the 1960s, but what happens in one is very similar to the venture-backed trading voyages of the Middle Ages."
script o
property aline : {}
property lastMatch : false
end script
set o's aline to characters of thisText
set res to match(characters of RE, thisText)
on match(regexp, theText)
if item 1 of regexp = "^" then
return matchhere(items 2 thru -1 of regexp, theText)
end if
repeat with i from 1 to (count theText)
if matchhere(regexp, items i thru -1 of theText) then return true
end repeat
return false
end match
on matchhere(regexp, theText)
if o's lastMatch and length of regexp = 1 then return true
if length of regexp ≥ 3 and item 2 of regexp = "*" then
return matchstar(item 1 of regexp, items 3 thru -1 of regexp, theText)
end if
if item 1 of regexp = "$" and length of regexp = 1 then
return ((length of theText) = 1)
end if
if length of theText > 0 and (item 1 of regexp = "." or item 1 of regexp = "*" or item 1 of regexp = item 1 of theText) then
if length of theText > 1 then
set o's aline to items 2 thru -1 of theText
else
set o's aline to item -1 of theText
end if
if length of regexp > 1 then
return (matchhere(items 2 thru -1 of regexp, o's aline))
else
set o's lastMatch to true
return (matchhere(item -1 of regexp, o's aline))
end if
end if
return false
end matchhere
on matchstar(c, regexp, theText)
repeat with i from 1 to (count theText)
set o's aline to items i thru -1 of theText
if matchhere(regexp, o's aline) then return true
if item i of theText ≠c and c ≠"." then exit repeat
end repeat
return false
end matchstar