Cleaning data out of a text file?

I think the code in post #14 is what you just asked for.

I think I misread what the second option was. The master (deny) IP list is not maintained by myself but a third party. That list is updated weekly (if not daily) with all the IPs that need to be blocked on a Synology (or similar) NAS.

What I’m now looking to do is download the latest incarnation of that list, then process my IP list from my Orbi router: remove the duplicates and sort it (as we do now), then open the third party deny list and remove from my list any IP addresses which are already in the master deny list.

That way any IPs I submit are only those that are new and need including in the master list, and the author of the master deny list does not have to manually check submissions where a great number may be duplicates. As an example, last week I submitted 25 IPs for blocking and it turns out all 25 were already on the list.

dbrewood. I’ve modified my script to work as you want. The script prompts for the IP deny list, which seems a bit awkward, but the actual path can be placed in the script instead. The IP deny list has to be paragraph-separated text of IP addresses only. I tested the script and it worked without issue.

use framework "Foundation"
use scripting additions

on open theDroppedItems
	set theFile to POSIX path of item 1 of theDroppedItems
	set theExistingFile to POSIX path of (choose file)
	set theIPFile to getFileName(theFile)
	
	set theText to (current application's NSString's stringWithContentsOfFile:theFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	set theExistingText to (current application's NSString's stringWithContentsOfFile:theExistingFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	set theIPData to getIPData(theText, theExistingText)
	
	(current application's NSString's stringWithString:theIPData)'s writeToFile:theIPFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end open

on getIPData(theText, theExistingText)
	set regExPattern to "[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:regExPattern options:0 |error|:(missing value)
	set regExMatches to theRegEx's matchesInString:theText options:0 range:{location:0, |length|:theText's |length|()}
	
	set ipSet to current application's NSMutableSet's new()
	repeat with anItem in regExMatches
		(ipSet's addObject:(theText's substringWithRange:(anItem's range())))
	end repeat
	
	set ipExistingArray to (theExistingText's componentsSeparatedByCharactersInSet:(current application's NSCharacterSet's newlineCharacterSet()))
	set ipExistingSet to current application's NSSet's setWithArray:ipExistingArray
	
	ipSet's minusSet:ipExistingSet
	set ipSortedArray to ipSet's allObjects()'s sortedArrayUsingSelector:"localizedStandardCompare:"
	return ((ipSortedArray's componentsJoinedByString:linefeed) as text)
end getIPData

on getFileName(theFile)
	set theFile to current application's NSString's stringWithString:theFile
	set fileBase to theFile's stringByDeletingPathExtension()
	set fileExtension to theFile's pathExtension()
	return ((fileBase's stringByAppendingString:"_IP_CLEANED")'s stringByAppendingPathExtension:fileExtension)
end getFileName
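A side note on the requirement that the deny list be paragraph-separated IP addresses only: if the third-party file ever carries extra text (comments, trailing spaces), one option might be to run the same regular expression over both files. This is only a sketch under that assumption; the handler name ipSetFromText is made up for illustration and is not part of the script above.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical helper: collects every dotted-quad match in a text into a set
on ipSetFromText(theText)
	set theText to current application's NSString's stringWithString:theText
	set regExPattern to "[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:regExPattern options:0 |error|:(missing value)
	set regExMatches to theRegEx's matchesInString:theText options:0 range:{location:0, |length|:theText's |length|()}
	set ipSet to current application's NSMutableSet's new()
	repeat with anItem in regExMatches
		(ipSet's addObject:(theText's substringWithRange:(anItem's range())))
	end repeat
	return ipSet
end ipSetFromText

-- Same result as the original getIPData, but both inputs go through the regex
on getIPData(theText, theExistingText)
	set ipSet to my ipSetFromText(theText)
	ipSet's minusSet:(my ipSetFromText(theExistingText))
	set ipSortedArray to ipSet's allObjects()'s sortedArrayUsingSelector:"localizedStandardCompare:"
	return ((ipSortedArray's componentsJoinedByString:linefeed) as text)
end getIPData
```

The open and getFileName handlers would stay as they are. The trade-off is that anything in the deny list that merely looks like an IP address is also treated as one.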

Wow! That is totally awesome and does indeed look to do what is needed. I’ll not know for sure until I run it with real data, but on test data I just threw together it looked to work perfectly!

If I do want to hard code the path for the third party deny list:

“/Users/dbrewood/OneDrive/NAS/Deny List/deny-ip-list.txt”

Where exactly would I make the changes? (Sorry I can’t work out which variable I’d need to set).

Thanks again to everyone involved in this.

In my script, delete the first line below and insert the second line below.

set theExistingFile to POSIX path of (choose file)
set theExistingFile to "/Users/dbrewood/OneDrive/NAS/Deny List/deny-ip-list.txt"

Absolutely superb, worked wonderfully well on my test data. Many many thanks.

This should save a lot of work for both myself and the script maintainer!

Guys, the script I ended up using is:

use framework "Foundation"
use scripting additions

on open theDroppedItems
	set theFile to POSIX path of item 1 of theDroppedItems
	# set theExistingFile to POSIX path of (choose file)
	set theExistingFile to "/Users/dbrewood/Library/Mobile Documents/com~apple~CloudDocs/_Daron Files/NAS/Deny List/deny-ip-list.txt"
	set theIPFile to getFileName(theFile)
	
	set theText to (current application's NSString's stringWithContentsOfFile:theFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	set theExistingText to (current application's NSString's stringWithContentsOfFile:theExistingFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	
	set theIPData to getIPData(theText, theExistingText)
	(current application's NSString's stringWithString:theIPData)'s writeToFile:theIPFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end open

on getIPData(theText, theExistingText)
	set regExPattern to "[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+"
	set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:regExPattern options:0 |error|:(missing value)
	set regExMatches to theRegEx's matchesInString:theText options:0 range:{location:0, |length|:theText's |length|()}
	set ipList to {}
	repeat with anItem in regExMatches
		set end of ipList to (theText's substringWithRange:(anItem's range())) as text
	end repeat
	
	set ipArray to current application's NSMutableArray's arrayWithArray:ipList
	set ipExistingText to current application's NSString's stringWithString:theExistingText
	set newlineSet to current application's class "NSCharacterSet"'s newlineCharacterSet()
	set ipExistingArray to (ipExistingText's componentsSeparatedByCharactersInSet:(newlineSet))
	ipArray's removeObjectsInArray:ipExistingArray
	
	set ipSet to current application's NSOrderedSet's orderedSetWithArray:ipArray
	set ipSortedArray to ipSet's array()'s sortedArrayUsingSelector:"localizedStandardCompare:"
	return ((ipSortedArray's componentsJoinedByString:linefeed) as text)
end getIPData

on getFileName(theFile)
	set theFile to current application's NSString's stringWithString:theFile
	set fileBase to theFile's stringByDeletingPathExtension()
	set fileExtension to theFile's pathExtension()
	return ((fileBase's stringByAppendingString:"_IP_CLEANED")'s stringByAppendingPathExtension:fileExtension)
end getFileName

Which works very well indeed. However I’m looking to automate the process further by submitting the list of addresses to www.abuseipdb.com via their API (detailed here: https://docs.abuseipdb.com/#configuring-fail2ban). What I’d want to be getting back from each IP address is something like this:

185.162.235.175 was found in our database, this IP was reported 110 times. Confidence of Abuse is 100% - ISP Alliance LLC - Country Netherlands - City Meppel, Drenthe

This would make reporting to the guys who compile the IP ban list very easy.

Is any of this actually possible?

I am not familiar with the site’s API, and I would rather not study it and then possibly have to pay for a key. So I solved this problem with simple parsing of the web page (which can be improved further). If there is an expert on this API, their solution will probably be better than mine.


set ipAddress to "95.217.31.46"

tell application "Safari"
	open location "https://www.abuseipdb.com/check/" & ipAddress
	activate
	my waitSafariWebPageLoading()
	set HTML to text of document 1
	close document 1
end tell

set ATID to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"CHECK
" & ipAddress, "REPORT " & ipAddress & "
WHOIS " & ipAddress}
set theReport to text item 2 of HTML
set AppleScript's text item delimiters to ATID

activate me
display dialog ipAddress & " " & theReport buttons {" Cancel ", " OK "}

on waitSafariWebPageLoading()
	tell application "System Events" to tell application process "Safari"
		repeat until ((UI element "Reload this page" of group 3 of toolbar 1 of window 1 exists) or (UI element "Reload this page" of group 2 of toolbar 1 of window 1 exists))
			delay 0.1
		end repeat
	end tell
end waitSafariWebPageLoading

I can get a free API key if needed :) In fact I just created this one: 247cdf97b62552676b6f56a4e56611a7fcb2ceaced16cd1e27f7f987be861c1f42b96a7f8be45fb9

I’ve just run the script. If I’m right, it opens the web check page, enters the IP to be checked and runs the check?

I don’t get the dialog box popping up with any data, though. It just stops at the report page.

No, it reports full information about the provided IP. At least, on my Catalina system.

I forgot to add 2 activate commands. They are required because the waiting handler uses GUI scripting. Try the updated script.

Also, the UI element “Reload this page” may have a different reference in your Safari version.

Weird, I’ve tried the revised script on macOS Monterey 12.3.1 and it stops at the same point which is where the web page loads and shows the reported IP address.
I have to force quit the script at that point to get it to exit.

On my Catalina 10.15.7 system the reference is: UI element “Reload this page” of group 3 of toolbar 1 of window 1.

It seems it is different on macOS Monterey 12.3.1. Maybe someone who is on macOS Monterey 12.3.1 will help you determine the correct reference.

Right, on Monterey that looks to be under the ‘View’ menu, 12th item down, ‘Reload This Page’. I’ve no idea if that helps?
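Since the toolbar reference differs between Safari versions, one alternative that might avoid GUI scripting altogether is to ask the page itself whether it has finished loading. This is a rough sketch, not a tested replacement: it assumes Safari’s Develop > “Allow JavaScript from Apple Events” setting is switched on, and the timeoutSeconds parameter is my own addition.

```applescript
-- Polls document.readyState instead of looking for the Reload button.
-- Returns true once the page reports "complete", false if the timeout runs out.
on waitSafariWebPageLoading(timeoutSeconds)
	repeat ((timeoutSeconds * 10) as integer) times
		try
			tell application "Safari"
				set readyState to do JavaScript "document.readyState" in document 1
			end tell
			if readyState is "complete" then return true
		end try
		delay 0.1
	end repeat
	return false
end waitSafariWebPageLoading
```

It would be called as my waitSafariWebPageLoading(30) in place of the original call. A short delay before the first check may be needed, because the previously shown page can still report “complete” for a moment after open location.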

BTW, it looks like there is a script for

The page can be found here:

https://github.com/AdmiralSYN-ACKbar/bulkcheck

I’ve no idea if that helps at all?

If there was an AppleScript version of that, then I guess the output of the ‘cleaning script’ I use now could be fed into it?

Okay, I’ve downloaded the script and made it executable. I’ve got the AppleScript as:

set ipListFile to POSIX path of (choose file of type ".csv")
set resultsFile to (POSIX path of (path to desktop folder)) & "results.csv"

do shell script "/Users/myusername/Library/Mobile Documents/com~apple~ScriptEditor2/Documents/BulkAbuseCheck.sh" & quoted form of ipListFile & " " & quoted form of resultsFile

When I run it and the file selector comes up, all files are greyed out and not selectable.

The source I have is a .txt file, and I then imported that into a spreadsheet and exported it out as a .csv file. Same problem, both files are greyed out and can’t be selected?

That should be:

set ipListFile to POSIX path of (choose file of type "csv")

Okay I modified the location of the script and made the other changes suggested:

set ipListFile to POSIX path of (choose file of type "csv")
set resultsFile to (POSIX path of (path to desktop folder)) & "results.csv"

do shell script "/usr/local/bin/BulkAbuseCheck.sh " & quoted form of ipListFile & " " & quoted form of resultsFile

In the result area of the script editor I just get “”.

I did try changing the csv to txt and selecting the raw txt file I had of IP addresses, same result alas.

I do not get prompted to input the API key as it states should happen in the documentation. Oh and I did install ‘jq’ via homebrew as mentioned in the docs.

The raw text file was:

the contents of the CSV file:
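A possible way to see why the result is just “”: do shell script runs with a minimal PATH (/usr/bin:/bin:/usr/sbin:/sbin) and without a terminal, so a Homebrew-installed jq may not be found and a prompt for the API key cannot be answered. The sketch below only makes those failures visible; the Homebrew directories in the PATH line are assumptions about where jq was installed.

```applescript
set ipListFile to POSIX path of (choose file of type {"csv", "txt"})
set resultsFile to (POSIX path of (path to desktop folder)) & "results.csv"

-- Prepend the usual Homebrew locations (assumed) and merge stderr into the result
set shellCommand to "export PATH=/opt/homebrew/bin:/usr/local/bin:$PATH; " & ¬
	quoted form of "/usr/local/bin/BulkAbuseCheck.sh" & " " & ¬
	quoted form of ipListFile & " " & quoted form of resultsFile & " 2>&1"

try
	set shellOutput to do shell script shellCommand
on error errorMessage number errorNumber
	-- A non-zero exit status from the shell script ends up here
	set shellOutput to "Exit status " & errorNumber & ": " & errorMessage
end try
display dialog "Output: " & shellOutput
```

If the dialog shows the text of a prompt, or nothing at all, that would suggest the script is waiting for interactive input it can never receive when run this way.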

I decided to abandon the proposed BulkAbuseCheck.sh utility, since it does not work the way I expected. It is interactive, requires manual input from the user, and is not designed for the Mac.

Instead, I decided to write my own script, which I think will be more useful for Mac users:


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

property NSString : a reference to current application's NSString
property NSJSONSerialization : a reference to current application's NSJSONSerialization
property NSUTF8StringEncoding : a reference to current application's NSUTF8StringEncoding

property serviceAddress : "https://api.abuseipdb.com/api/v2/check"
property myKey : "ce8b31add6647986bca78a2460b342a168375bdd35efc801ce8ffbade5ea385d06094b665454efe8"
property maxAgeInDays : 30 -- for last 30 days 

set ipListFile to (choose file of type "txt") -- provide ip addresses inside this file
set ipList to paragraphs of (read ipListFile)

set report to ""
repeat with nextIP in ipList
	-- get JSON Report (as record) using site's API
	set jsonData to do shell script "curl -G " & serviceAddress & ¬
		" --data-urlencode \"ipAddress=" & nextIP & ¬
		"\" -d maxAgeInDays=" & maxAgeInDays & ¬
		" -H \"Key: " & myKey & ¬
		"\" -H \"Accept: application/json\""
	set jsonString to (NSString's stringWithString:jsonData)
	set jsonData to (jsonString's dataUsingEncoding:NSUTF8StringEncoding)
	set aRecord to (NSJSONSerialization's JSONObjectWithData:jsonData options:0 |error|:(missing value)) as record
	-- BUILD THE FINAL REPORT as TEXT 
	tell (|data| of aRecord)
		set report to report & "-----------------------   ip: " & its ipAddress & "  ----------------------" & linefeed
		set report to report & "abuseConfidenceScore: " & its abuseConfidenceScore & linefeed
		set report to report & "countryCode: " & its countryCode & linefeed
		set report to report & "domain: " & its domain & linefeed
		set hostnames to its hostnames
		set report to report & "hostnames: "
		repeat with nextName in hostnames
			set report to report & nextName & linefeed
		end repeat
		if (count hostnames) = 0 then set report to report & linefeed
		set report to report & "ipVersion: " & its ipVersion & linefeed
		set report to report & "isp: " & its isp & linefeed
		set report to report & "isPublic: " & its isPublic & linefeed
		set report to report & "isWhitelisted: " & its isWhitelisted & linefeed
		set report to report & "lastReportedAt: " & its lastReportedAt & linefeed
		set report to report & "numDistinctUsers: " & its numDistinctUsers & linefeed
		set report to report & "totalReports: " & its totalReports & linefeed
		set report to report & "usageType: " & its usageType & linefeed
		set report to report & linefeed & linefeed
	end tell
end repeat

It totally and completely amazes me how you could create such an awesome script to do the job natively on a Mac, and using a text file as input is brilliant.

Would it be possible (if I dare ask) for the script to be modified so that it gives an output file in CSV format (a single line per entry)? That would make it easy to sort on attack relevance, country etc.

If not, it’s not a worry, it still saves me a heck of a lot of work :)

Thanks again…

Hello, again.

The following script will export the report as a CSV file:


-- script:  Check IP addresses list for DDoS and other net attacks
-- written by: KniazidisR (today)
-- note: visit https://api.abuseipdb.com to learn more details

property serviceAddress : "https://api.abuseipdb.com/api/v2/check"
property myKey : "ce8b31add6647986bca78a2460b342a168375bdd35efc801ce8ffbade5ea385d06094b665454efe8"
property maxAgeInDays : 30 -- for last 30 days 

set ipList to {"13.32.145.30", "34.247.206.80"}
-- or:
-- set ipListFile to (choose file of type "txt") -- provide ip addresses inside this file
-- set ipList to paragraphs of (read ipListFile)

-- get JSON Report using site's API
set report to ""
repeat with nextIP in ipList
	set jsonData to do shell script "curl -G " & serviceAddress & ¬
		" --data-urlencode \"ipAddress=" & nextIP & ¬
		"\" -d maxAgeInDays=" & maxAgeInDays & ¬
		" -H \"Key: " & myKey & ¬
		"\" -H \"Accept: application/json\""
	set report to report & jsonData & linefeed
end repeat

-- make temporary text file
set tempFolder to (path to temporary items folder from user domain)
tell application "Finder"
	try
		set reportTextFile to (make new file at tempFolder with properties {name:"Report.txt"}) as alias
	on error
		set reportTextFile to (file "Report.txt" of folder (tempFolder as text)) as alias
	end try
end tell

-- write JSON report to temporary text file
set file_ID to open for access reportTextFile with write permission
set eof file_ID to 0
write report to file_ID as «class utf8»
close access file_ID

-- convert temporary text file to CSV file
set csvReportFile to "" & (path to desktop folder) & "Report.csv"
tell application "Numbers"
	set theDoc to open reportTextFile
	export theDoc to file csvReportFile as CSV
end tell

Many thanks, that does indeed work, but it’s not true CSV: the column titles are not shown as a header row and are instead repeated inside the contents of each row. I’m not sure if it is possible to format that out to give a true CSV format?

e.g. The first row should ideally be the column titles ‘ipAddress’, followed by ‘isPublic’, ‘ipVersion’ etc, the columns underneath should be 15.235.144.210, true, 4, etc

Does that make sense?

If it’s not possible don’t worry about it, I know I’m asking for a lot…

If the above could be achieved I guess the script could be added to GitHub so that others could benefit from it.

Thanks as always…
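For the header-row layout described above, one possible approach is to parse each JSON response with NSJSONSerialization and write the values out under a fixed list of column titles. The following is a minimal sketch, not a finished tool: the key names are the ones used in the report script earlier in the thread, while the column order, the handler names (csvFromJSONTexts, textForValue, csvRow) and the sample JSON are my own assumptions.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- Assumed column order; key names taken from the report script above
property columnKeys : {"ipAddress", "isPublic", "ipVersion", "isWhitelisted", "abuseConfidenceScore", "countryCode", "usageType", "isp", "domain", "totalReports", "numDistinctUsers", "lastReportedAt"}
property booleanKeys : {"isPublic", "isWhitelisted"}

-- Made-up sample response, just to exercise the handlers
set sampleJSON to "{\"data\":{\"ipAddress\":\"15.235.144.210\",\"isPublic\":true,\"ipVersion\":4}}"
set csvText to csvFromJSONTexts({sampleJSON})

-- jsonTexts: a list of raw JSON responses, one per checked IP address
on csvFromJSONTexts(jsonTexts)
	set csvRows to current application's NSMutableArray's new()
	(csvRows's addObject:(my csvRow(columnKeys))) -- header row
	repeat with jsonText in jsonTexts
		set jsonString to (current application's NSString's stringWithString:(contents of jsonText))
		set jsonData to (jsonString's dataUsingEncoding:(current application's NSUTF8StringEncoding))
		set parsed to (current application's NSJSONSerialization's JSONObjectWithData:jsonData options:0 |error|:(missing value))
		if parsed is not missing value then
			set dataDict to (parsed's objectForKey:"data")
			if dataDict is not missing value then
				set rowValues to {}
				repeat with aKey in columnKeys
					set aValue to (dataDict's objectForKey:(contents of aKey))
					set end of rowValues to my textForValue(aValue, booleanKeys contains (contents of aKey))
				end repeat
				(csvRows's addObject:(my csvRow(rowValues)))
			end if
		end if
	end repeat
	return (csvRows's componentsJoinedByString:linefeed) as text
end csvFromJSONTexts

-- Turns one JSON value into plain text; absent or null values become empty fields
on textForValue(aValue, isBooleanColumn)
	if aValue is missing value then return ""
	if (aValue's isKindOfClass:(current application's NSNull)) as boolean then return ""
	if isBooleanColumn then
		if (aValue's boolValue()) as boolean then return "true"
		return "false"
	end if
	return (aValue's |description|()) as text
end textForValue

-- Quotes every field and doubles embedded quotes, then joins with commas
on csvRow(fieldList)
	set quotedFields to {}
	repeat with aField in fieldList
		set fieldString to (current application's NSString's stringWithString:(contents of aField))
		set escapedField to (fieldString's stringByReplacingOccurrencesOfString:"\"" withString:"\"\"") as text
		set end of quotedFields to "\"" & escapedField & "\""
	end repeat
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set rowText to quotedFields as text
	set AppleScript's text item delimiters to savedDelimiters
	return rowText
end csvRow
```

In the export script, the curl loop could collect each jsonData into a list (set end of jsonTexts to jsonData) instead of concatenating, pass that list to csvFromJSONTexts, and write the returned text straight to Report.csv, which would make the Numbers step unnecessary. Responses that fail to parse or lack a data object (for example an error reply from the API) are simply skipped in this sketch.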