AppleScript Regular Expression

Hey guys, need help!!!

I have a huge network of files, in which I need to verify that each file name matches the NameType regex…

Example :

My regex is \b([A-Z]){2,4}_\d{8}(SP)?\d?\.([pP][dD][fF])\b

Sample files and the expected result:

AA_20151212.pdf =========> TRUE
AA_20151212SP.PdF =========> TRUE
AAA_20151212SP1.pDf =========> TRUE
AAA_20151212SP3.pDf =========> TRUE

A_20151212SP1.pdf =========> FALSE
AAAA_20151212.pDf =========> FALSE
AAAAAA_151212.pdf =========> FALSE
BBB_2015-12-12.pdf =========> FALSE
BBB_2015_12_12.pdf =========> FALSE

I’m trying to … grep or echo or sed ??? I don’t know any more…



tell application "Finder"
	
	set sLibrary to folder "XYZ" of folder "Archive" of disk "XYZ" as alias
	set errorFolder to folder "_ERRORS_HTML5" of sLibrary
	
	set phase0 to {"AA"}
	-- in a folder "AA" folder "XYZ" of folder "Archive" of disk "XYZ"
	-- there is : 	{"A_20151212SP1.pDf", "AA_20151212SP.pDf", "AAA_20151212SP1.pDf", "AAA_20151212SP3.pDf", "AAAA_20151212.pDf", "AAAAAA_20151212.pdf", "BBB_2015-12-12.pdf", "BBB_2015_12_12.pdf", "AA_20151212.pdf"}

	set toRender1 to folder "toRender01" of folder "_flipbook_" of sLibrary

	repeat with aNpIndex from 1 to count of phase0
		set aNpName to (item aNpIndex of phase0)
		set grpPDFs to every file in folder aNpName of sLibrary
		repeat with aPdfNp from 1 to count of grpPDFs
			set aNpItem to (item aPdfNp of grpPDFs)
			set aNpPdfName to name of aNpItem

			-- My regex is :  \b([A-Z]){2,4}\_\d{8}(SP)?\d?\.([pP][dD][fF])\b

			---------
			---------
			--this is where it is not working, the shell script
			--if (do shell script "grep '\\b([A-Z]){2,4}\\_\\d{8}(SP)?\\d?\\.([pP][dD][fF])\\b' ]];") is true then
			if (do shell script "/bin/echo " & quoted form of aNpPdfName & " | /usr/bin/grep -E '\\b([A-Z]){2,4}\\_\\d{8}(SP)?\\d?\\.([pP][dD][fF])\\b' ]];") is true then
	---------
				---------

				display dialog (aNpPdfName as string) & " is RIGHT NAME" buttons {"Cancel", "OK"} default button "OK"
				--and also do something
			else
				display dialog (aNpPdfName as string) & " is WRONG NAME" buttons {"Cancel", "OK"} default button "OK"
				--and also do something
				if (exists file aNpPdfName of errorFolder) is false then
					duplicate file aNpPdfName of folder aNpName of sLibrary to errorFolder
				end if
			end if

		end repeat

	end repeat

end tell


I tried over 50 versions with egrep, sed, grep, perl… This regex is driving me so nuts that I can’t think anymore, I’m yelling out for HELP…

Thanks in advance…

Hi,

first of all, according to the regex this line

AAAA_20151212.pDf

also matches the condition.

The main issue is that grep does not return a boolean value. It prints every line that matches the regex; if no line matches, it exits with a non-zero status, which do shell script reports as an error.

Try this. I slightly modified the regex: (?i)(pdf) is the same as ([pP][dD][fF])


...
    repeat with aNpIndex from 1 to count of phase0
        set aNpName to (item aNpIndex of phase0)
        set grpPDFs to every file in folder aNpName of sLibrary
        repeat with aPdfNp from 1 to count of grpPDFs
            set aNpItem to (item aPdfNp of grpPDFs)
            set aNpPdfName to name of aNpItem
            try
                do shell script "/bin/echo " & quoted form of aNpPdfName & " | /usr/bin/grep -E '\\b([A-Z]){2,4}\\_\\d{8}(SP)?\\d?\\.(?i)(pdf)\\b'"
                display dialog aNpPdfName & " is RIGHT NAME" buttons {"Cancel", "OK"} default button "OK"
            on error
                display dialog aNpPdfName & " is WRONG NAME" buttons {"Cancel", "OK"} default button "OK"
                --and also do something
                if not (exists file aNpPdfName of errorFolder) then
                    duplicate file aNpPdfName of folder aNpName of sLibrary to errorFolder
                end if
            end try
        end repeat
    end repeat
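
The exit-status behaviour described above can be checked directly in a shell, outside AppleScript. This is only a minimal sketch: matches_pdf is a hypothetical helper name, and the pattern is deliberately reduced to the extension test so that the status handling stays visible.

```shell
#!/bin/sh
# grep -q prints nothing; it only sets the exit status:
#   0 = at least one line matched, 1 = no line matched, 2 = an error occurred.
# matches_pdf is a hypothetical helper name used only for this sketch.
matches_pdf() {
    printf '%s\n' "$1" | grep -Eq '\.[pP][dD][fF]$'
}

# The shell "if" branches on the exit status, just as the AppleScript
# try / on error block branches on the error raised by do shell script.
if matches_pdf "AA_20151212.pdf"; then
    echo "match"
else
    echo "no match"
fi
```

Since do shell script turns any non-zero exit status into an AppleScript error, the try block plays the role of the if/else here.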

If you have AppleScript Toolbox installed, you can list a folder and apply a regular expression to each file in a single command. I have also changed the regex so that it matches your examples: with your regular expression, the second item in the FALSE list returns true.

set sLibrary to "/Volumes/XYZ/Archive/XYZ/"
set errorFolder to sLibrary & "_ERRORS_HTML5/"
set toRender1 to sLibrary & "_flipbook_/toRender01"
set phase0 to {"AA"}

repeat with aNpIndex from 1 to count of phase0
	set aNpName to (item aNpIndex of phase0)
	set allFiles to AST list folder sLibrary & aNpName without showing invisibles
	set matchingFiles to AST list folder sLibrary & aNpName matching regex "^[A-Z]{2,3}_\\d{8}(SP)?\\d?\\.[pP][dD][fF]$" without showing invisibles
	repeat with aPdfNp from 1 to count of allFiles
		if item aPdfNp of allFiles is in matchingFiles then
			display dialog (item aPdfNp of allFiles as string) & " is RIGHT NAME" buttons {"Cancel", "OK"} default button "OK"
		else
			display dialog (item aPdfNp of allFiles as string) & " is WRONG NAME" buttons {"Cancel", "OK"} default button "OK"
			do shell script "cp -n " & quoted form of (sLibrary & aNpName & "/" & item aPdfNp of allFiles) & space & quoted form of errorFolder
		end if
	end repeat
end repeat
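
Before wiring a pattern into the script, it may help to run it against the sample names from the first post in a plain shell. The sketch below is an assumption-laden test harness, not part of either script above: PATTERN and check_name are hypothetical names, and the regex is rewritten in portable ERE syntax ([0-9] instead of \d, ^…$ instead of \b) so that it should not depend on which grep is installed. It keeps the original (SP)?\d? tail, so a trailing digit without SP is still accepted.

```shell
#!/bin/sh
# Portable ERE version of the file-name pattern (an assumption of this sketch):
# 2-3 capital letters, underscore, 8 digits, optional SP, optional digit, .pdf in any case.
PATTERN='^[A-Z]{2,3}_[0-9]{8}(SP)?[0-9]?\.[pP][dD][fF]$'

# check_name is a hypothetical helper: prints TRUE or FALSE for one file name.
# LC_ALL=C keeps [A-Z] limited to ASCII capitals regardless of the locale.
check_name() {
    if printf '%s\n' "$1" | LC_ALL=C grep -Eq "$PATTERN"; then
        echo "TRUE"
    else
        echo "FALSE"
    fi
}

# The nine sample names from the question.
for name in AA_20151212.pdf AA_20151212SP.PdF AAA_20151212SP1.pDf AAA_20151212SP3.pDf \
            A_20151212SP1.pdf AAAA_20151212.pDf AAAAAA_151212.pdf \
            BBB_2015-12-12.pdf BBB_2015_12_12.pdf; do
    printf '%s => %s\n' "$name" "$(check_name "$name")"
done
```

With this pattern the first four names should come out TRUE and the remaining five FALSE, which is the result listed in the question.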