Friday, August 14, 2020

#1 2011-10-03 03:59:01 pm

Ole
Member
Registered: 2011-10-03
Posts: 38

export safari page from source with applescript

I'm trying to write a script that exports the source of a Safari page as an HTML document to a given folder. Posts in the archives recommend either a curl command or a System Events solution. The first gives pages that look less like the original than what you get from Safari's manual export from source; the latter is kind of kludgy. Below are the scripts I have written so far, along with the Safari AppleScript dictionary entries for save and source that I'm trying to use. I'm getting errors, but I'm not sure what they mean. Any help would be appreciated!



These are scripts I've written so far:

1.

 

Applescript:

tell application "Safari"
set folderToSaveSafariWindowIn to "Q:Ø:"
set pageToBeSaved to front window
save document pageToBeSaved as source in alias folderToSaveSafariWindowIn
end tell

RESULT LOG:

tell application "Safari"
   get window 1
       --> window id 6017
   save document (window id 6017) as source in alias "Q:Ø:"
       --> error number -1700 from window id 6017 to integer

error "Safari got an error: Can’t make window id 6017 into type integer." number -1700 from window id 6017 to integer

2.

   

Applescript:

tell application "Safari"
save source of document in "Q:Ø:"
end tell

RESULT:

error "Can’t get source of document." number -1728 from «class conT» of document

These are entries from the applescript dictionary:

document n [see also Standard Suite] : A Safari document representing the active tab in a window.
properties
source (text, r/o) : The HTML source of the web page currently loaded in the document.
text (text, r/o) : The text of the web page currently loaded in the document. Modifications to text aren't reflected on the web page.
URL (text) : The current URL of the document.


save v : Save an object.
save specifier : the object for the command
[as text] : The file type in which to save the data.
[in alias] : The file in which to save the object.



Model: macbook pro
AppleScript: NA
Browser: Safari 533.20.27
Operating System: Mac OS X (10.7)


Filed under: safari


 

#2 2011-10-03 10:19:41 pm

Trash Man
Sanitation Department
Registered: 2005-10-20
Posts: 5336

Re: export safari page from source with applescript

Applescript:

try
   tell application "Safari"
       set x to URL of document 1
       set r to do shell script "echo " & quoted form of x & " | sed 's|/$||;s|:|%3A|g;s|/|%2F|g'"
       do shell script "curl " & x & " > " & quoted form of ((system attribute "HOME") & "/Desktop/" & r & ".html")
   end tell
end try


One man's trash is another man's treasure


 

#3 2011-10-04 12:49:08 am

Ole
Member
Registered: 2011-10-03
Posts: 38

Re: export safari page from source with applescript

Thank you! I think there may be some sort of bug here, though (see below). Would the output be saved as HTML in the same way as in the scripts with the curl command?

I get this output: ""

The event log says:

Applescript:

tell application "Safari"
   get URL of document 1
       --> "http://macscripter.net/post.php?tid=37175"
   do shell script "echo 'http://macscripter.net/post.php?tid=37175' | sed 's|/$||;s|:|%3A|g;s|/|%2F|g'"
       --> error number -10004
end tell
tell current application
   do shell script "echo 'http://macscripter.net/post.php?tid=37175' | sed 's|/$||;s|:|%3A|g;s|/|%2F|g'"
       --> "http%3A%2F%2Fmacscripter.net%2Fpost.php?tid=37175"
end tell
tell application "Safari"
   system attribute "HOME"
       --> error number -10004
end tell
tell current application
   system attribute "HOME"
       --> "/Users/ivindkulsrud"
end tell
tell application "Safari"
   do shell script "curl http://macscripter.net/post.php?tid=37175 > '/Users/ivindkulsrud/Desktop/http%3A%2F%2Fmacscripter.net%2Fpost.php?tid=37175.html'"
       --> error number -10004
end tell
tell current application
   do shell script "curl http://macscripter.net/post.php?tid=37175 > '/Users/ivindkulsrud/Desktop/http%3A%2F%2Fmacscripter.net%2Fpost.php?tid=37175.html'"
       --> ""
end tell
Result:
""

Last edited by Ole (2011-10-04 01:35:06 am)


 

#4 2011-10-04 01:11:17 am

StefanK
Member
From: St. Gallen, Switzerland
Registered: 2006-10-21
Posts: 11693

Re: export safari page from source with applescript

Hi,

In Safari, the keyword source is a property of the document class.
It cannot be used as a parameter of the save command.

To get the HTML content of a page, curl is indeed the best way.
The echo/sed part is not needed; quoted form is sufficient and much more reliable.

Applescript:


tell application "Safari" to set {theURL, theTitle} to {URL, name} of document 1
set {TID, text item delimiters} to {text item delimiters, "/"} -- exchange all slashes with underscores
set theTitle to text items of theTitle
set text item delimiters to "_"
set theTitle to theTitle as text
set text item delimiters to TID

do shell script "curl " & quoted form of theURL & " > " & quoted form of (POSIX path of (path to desktop) & theTitle & ".html")

Last edited by StefanK (2011-10-04 01:11:45 am)


regards

Stefan


 

#5 2011-10-04 02:29:03 am

Ole
Member
Registered: 2011-10-03
Posts: 38

Re: export safari page from source with applescript

got it, thank you stefan


 

#6 2011-10-04 04:40:17 am

StefanK
Member
From: St. Gallen, Switzerland
Registered: 2006-10-21
Posts: 11693

Re: export safari page from source with applescript

Or, since you are using Safari anyway:

Applescript:


tell application "Safari" to set {theSource, theTitle} to {source, name} of document 1
set {TID, text item delimiters} to {text item delimiters, "/"} -- exchange all slashes with underscores
set theTitle to text items of theTitle
set text item delimiters to "_"
set theTitle to theTitle as text
set text item delimiters to TID
set theFile to ((path to desktop as text) & theTitle & ".html")
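try
   set fRef to open for access file theFile with write permission
   set eof of fRef to 0 -- truncate first so a shorter page cannot leave old bytes behind
   write theSource to fRef as «class utf8»
   close access fRef
on error
   -- make sure the file is not left open if writing fails
   try
       close access file theFile
   end try
end try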

This avoids loading the same page twice.
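One thing to watch when the file name is built from the page title: the scripts in this thread replace "/" only, but a file path assembled with `path to desktop as text` is an HFS-style path, where ":" is the separator, so a title containing a colon would probably be read as a folder boundary. A minimal sketch of a reusable clean-up step follows, assuming you want both characters turned into underscores. The handler names `replaceText` and `safeFileName` are made up for this illustration and are not part of Safari's dictionary or of the scripts above.

```applescript
-- Hypothetical helper: replace every occurrence of searchString in theText,
-- using the same text item delimiters technique as the scripts above.
on replaceText(theText, searchString, replacementString)
	set TID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set theItems to text items of theText
	set AppleScript's text item delimiters to replacementString
	set theText to theItems as text
	set AppleScript's text item delimiters to TID -- restore the previous delimiters
	return theText
end replaceText

-- Hypothetical helper: make a page title usable as a file name
-- in both POSIX paths and HFS paths.
on safeFileName(theTitle)
	set theTitle to replaceText(theTitle, "/", "_") -- POSIX path separator
	set theTitle to replaceText(theTitle, ":", "_") -- HFS path separator
	return theTitle
end safeFileName

tell application "Safari" to set theTitle to name of document 1
set theFile to (path to desktop as text) & safeFileName(theTitle) & ".html"
```

With this in place, `theFile` can be passed to `open for access file theFile with write permission` exactly as before. Other characters may still need attention depending on the titles you run into; this only covers the two path separators.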

Last edited by StefanK (2011-10-04 04:40:44 am)


regards

Stefan


 
