My objective is to copy the contents of a webpage (a GWebCache) and process them until only the host addresses remain, at which point they are inserted into a host file text document. Chrome works fine if delays are added, and Firefox works very well even without the short delays, but Safari is not straightforward. My tests show that Safari copy-pastes each line broken up into 4 lines, so I'm trying to write an extra script for Safari to delete all the unnecessary lines. The 4 lines are: host address and other details, network name (gnutella), program name, and timestamp. Removing repeated lines partly solves the issue, but I still need to remove the other unwanted lines (the timestamp lines are not identical).
It would be easier if I could instruct Safari to copy each line intact instead of needing this extra code. I have placed the copy-webpage code at the bottom of the post. If that cannot be done, I need to figure out how to delete the unwanted lines: those containing words such as gnutella or wireshare, the strings 2019/ or 2020/, or two colons.
The text formatting that removes the other details is handled by subroutines not listed here; there is more than one, depending on which webcache site is accessed.
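The line filter described above could be sketched in plain AppleScript, without relying on the text editor at all. This is only a minimal sketch: the handler name `keepHostLines` and its variable names are hypothetical, and it assumes that a wanted host line never contains any of the listed words and has at most one colon (as in host:port).

```applescript
-- Hypothetical helper: returns theText with the unwanted lines removed.
on keepHostLines(theText)
	set unwantedWords to {"gnutella", "wireshare", "2019/", "2020/"}
	set keptLines to {}
	set oldDelims to AppleScript's text item delimiters
	-- Splitting on ":" lets us count the colons in each line
	set AppleScript's text item delimiters to ":"
	set theLines to paragraphs of theText
	repeat with aLine in theLines
		set lineText to contents of aLine
		-- Empty lines are dropped as well
		set isWanted to (lineText is not "")
		repeat with aWord in unwantedWords
			-- "contains" ignores case by default, so "WireShare" also matches
			if lineText contains (contents of aWord) then set isWanted to false
		end repeat
		-- Two or more colons split a line into three or more text items
		if (count of text items of lineText) > 2 then set isWanted to false
		if isWanted then set end of keptLines to lineText
	end repeat
	-- Join the kept lines back into one block of text
	set AppleScript's text item delimiters to linefeed
	set resultText to keptLines as text
	set AppleScript's text item delimiters to oldDelims
	return resultText
end keepHostLines
```

In the script at the bottom of the post it could be called as `set webpage to keepHostLines(webpage)` right after the clipboard is read into `webpage`, so the editor only ever receives the filtered text.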
The keystroke delete in the following script comes up with error number -1700:
tell application "TextWrangler"
activate
--- open find window
find "gnutella" searching in text 1 of text document "untitled text 41" options {wrap around:true} with selecting match --- you may need to change this to document 1 or front document.
--- set linetobedeleted to selection
tell application "TextWrangler"
tell front window
tell application "System Events"
try
keystroke (delete {shift down, control down})
end try
end tell
end tell
end tell
end tell
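Error -1700 is AppleScript's coercion failure. A likely cause here is that `keystroke` expects text, whereas `(delete {shift down, control down})` is not text. A minimal sketch of one possible fix follows, assuming the intent was to press the Delete key with Shift and Control held. Whether that shortcut removes the whole line in TextWrangler is an assumption not verified here. Key code 51 is the Delete (Backspace) key on standard Apple keyboards, and `front text document` stands in for the named document.

```applescript
tell application "TextWrangler"
	activate
	-- Select the next match so the insertion point sits on the unwanted line
	find "gnutella" searching in text 1 of front text document options {wrap around:true} with selecting match
end tell

tell application "System Events"
	-- Non-character keys are sent with "key code", not "keystroke"
	key code 51 using {shift down, control down}
end tell
```

The nested `tell` blocks from the script above are not needed for this; System Events sends the key press to whichever application is frontmost.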
You can test the first section of the code below using Safari, Chrome or Firefox to see the difference and the problem. (A different script pre-checks whether the person possesses either BBEdit or TextWrangler; otherwise this extra processing is bypassed.)
set browser_name to "Safari" --- I've removed the code that checks for the default browser, and the code that checks whether any browsers are already open so that they are only quit if they were not initially open.
-- Write Document Text
set DocText to ""
-- Look at whichever browser is default browser, open and do things
tell application browser_name
activate --- using launch instead of activate would open it in the background (it remains in the background if already open and hidden).
open location "http://disobscure.velum-ultra.com/skulls.php?showhosts=1"
--- "http://wireshare.sourceforge.net/gwc/gwc.php?display=gnutella"
end tell
to raiseWindow of browser_name for theName
tell the application named browser_name
activate
set theWindow to the first item of ¬
(get the windows whose name is theName)
if index of theWindow is not 1 then
set index of theWindow to 1
set visible of theWindow to true
end if
end tell
end raiseWindow
delay 4
tell application "System Events"
-- Open URL
if frontmost of process browser_name then
set visible of process browser_name to true
else
set frontmost of process browser_name to true
end if
tell application process browser_name
set visible to true
-- Press ⌘A (select all)
try
keystroke "a" using command down
end try
delay 2
-- Press ⌘C (copy)
try
keystroke "c" using command down
end try
delay 2
-- Press ⌘W (close browser tab)
try
keystroke "w" using command down
end try
delay 1
keystroke return
end tell
end tell
delay 1
set the clipboard to string of (the clipboard as record)
set webpage to (the clipboard as text)
tell application "TextWrangler"
launch --- launch instead of activate as it will open in background. Remains in background if already open and hidden. --- realised the select, copy-paste probably does not work if in background.
set thisDoc to make new document
-- readying process to copy web-page contents
set ContentRelitive to webpage & return
set DocText to ContentRelitive & return
tell thisDoc
set its text to DocText
end tell
tell application "TextWrangler" to tell front text document to delete text of (lines 1 thru 808)
tell application "TextWrangler" to tell front text document to delete text of (lines 78 thru 1610) --- I had to increase these numbers for Safari. For Chrome + Firefox see the original line listing below. Also depends on which URL is accessed.
-- tell application "TextWrangler" --- I forgot, I added this for safari but does not do anything. Would not want this for the other browsers so remove this section if using chrome or firefox.
-- repeat with i in (get every line of front text document)
-- tell i's text
-- if (contents of line contains "WireShare") then
-- delete text of line
-- end if
-- end tell
-- end repeat
-- end tell
-- tell application "TextWrangler" to tell front text document to delete text of (lines 1 thru 6)
-- tell application "TextWrangler" to tell front text document to delete text of (lines 51 thru 70)
--- these are actually for the sourceforge url
end tell
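For Safari specifically, one alternative to the ⌘A / ⌘C keystrokes might be to ask Safari for the page text directly, since Safari's scripting dictionary gives each document a `text` property. The sketch below reuses the URL and the fixed 4-second wait from the script above. Whether the text returned this way avoids the 4-line splitting is an assumption that would need testing.

```applescript
tell application "Safari"
	activate
	open location "http://disobscure.velum-ultra.com/skulls.php?showhosts=1"
	delay 4 -- same fixed wait as the script above; may need adjusting
	-- Read the rendered page text without going through the clipboard
	set webpage to text of front document
end tell
```

If this works, `webpage` can be passed to the TextWrangler section unchanged, and the System Events block would only be needed for Chrome and Firefox.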