Help with syntax for calling a Perl regular expression from a shell script?

I’ve seen examples of how to call Perl from a shell script when I want to do a search and replace with a regular expression, such as:

set mytext to "I am learning Perl. My new favorite scripting language in 2009 is Perl and I expect it to be in 2010."
do shell script "echo " & quoted form of mytext & " | perl -p -e 's/Perl/AppleScript/g;'"

However, I can’t figure out the syntax for just doing a search (without replacing anything) and finding a specific match, for example finding the years in the text by searching for (\d\d\d\d).

Could someone help me figure out how to access the matches of the four digits? Thanks.

This might do what you want:

set mytext to "I am learning Perl. My new favorite scripting language in 2009 is Perl and I expect it to be in 2010." & (ASCII character 10) & "foo 2039 bar"
do shell script "echo " & quoted form of mytext & " | perl -n -e 'print join(q{ },m/(\\d\\d\\d\\d)/g),qq{\\n}'"

If you are not familiar with Perl, q{ } is the same as ' ' and qq{\n} is the same as "\n". I used these alternates to avoid having to escape the otherwise equivalent quote characters. The backslashes are doubled because we have to escape them in AppleScript string literals.

If you feed this Perl program input that contains runs of more than four consecutive digits, you may be surprised at the output. It will break up the long sequence into runs of exactly four digits and disregard any digits after the last exact multiple of four (“foo 1234567890” becomes “1234 5678”). If you want to avoid breaking up long sequences into “quads” and avoid the loss of trailing digits (after sequences of at least four), add a plus (+) after the last \d.

Edit History: Removed superfluous change from the example shell code given in the thread’s first post.

Thanks very much. I have done some work with Perl regular expressions but did not know how to make a Perl shell script call from an AppleScript. I also didn’t know about the qq{\n}. Appreciate the info!

As AppleScript does not provide proper regular expression functionality, we need a workaround, and that requires escaping:

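set s to "m\"a'quoted/string"
regexReplace("([M\"/']+)", "\\1X", s, "gi")

-- Run a Perl s/// substitution on s. sP: pattern, rP: replacement, o: flags such as "gi".
on regexReplace(sP, rP, s, o)
    set perlCode to "s/" & escapeForRegex(sP) & "/" & escapeForRegex(rP) & "/" & o
    -- quoted form of protects both the text and the Perl code from the shell
    return do shell script "printf '%s' " & quoted form of s & " | perl -p -e " & quoted form of perlCode
end regexReplace

-- Escape the slash, because it is the delimiter of s///
on escapeForRegex(s)
    return replaceString(s, "/", "\\/")
end escapeForRegex

-- Replace every occurrence of ss (a string or a list of strings) in s with rs
on replaceString(s, ss, rs)
    if class of ss ≠ list then set ss to {ss}
    repeat with os in ss
        set AppleScript's text item delimiters to contents of os
        set s to every text item of s
        set AppleScript's text item delimiters to rs
        set s to s as text
    end repeat
    set AppleScript's text item delimiters to ""
    return s
end replaceString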