crosspost: http://www.mac-forums.com/forums/showthread.php?t=120528
The NetNewsWire RSS reader can save webpages (the pages that open after clicking, say, “read more” in an RSS feed) viewed through its internal browser. Is there a way to automate saving all of these webpages?
If not with NetNewsWire, can it be done with any other feed reader?
Hi chris2,
As far as I know NetNewsWire offers support for AppleScript and Automator, so chances are good that you can save webpages automatically. I don’t have the application installed, but you can look at its AppleScript dictionary by dropping NetNewsWire’s application icon onto the Script Editor.
But you can also use curl to download RSS feeds and then process them according to your very own requirements:
set feedurl to "http://images.apple.com/main/rss/hotnews/hotnews.rss"
set command to "curl " & quoted form of feedurl
set feedsource to do shell script command
set feedlines to paragraphs of feedsource
repeat with feedline in feedlines
if feedline begins with "<link>" then
set newsurl to (characters 7 through -8 of feedline) as Unicode text
-- more code goes here
end if
end repeat
DEVONagent (and DEVONthink) also offers ways to process feeds and websites:
set feedurl to "feed://images.apple.com/main/rss/hotnews/hotnews.rss"
tell application "DEVONagent"
set feedsource to download markup from feedurl
set feeditems to get items of feed feedsource
repeat with feeditem in feeditems
set articleurl to link of feeditem
set articlesource to download markup from articleurl
-- more code goes here
end repeat
end tell
Thanks for the help. DEVONagent is not free.
set feedurl to "http://images.apple.com/main/rss/hotnews/hotnews.rss"
set command to "curl " & quoted form of feedurl
set feedsource to do shell script command
set feedlines to paragraphs of feedsource
repeat with feedline in feedlines
if feedline begins with "<link>" then
set newsurl to (characters 7 through -8 of feedline) as Unicode text
-- more code goes here
end if
end repeat
-- more code goes here
What code should I add here?
If there were a way to do it through Mail it would be great, because curl is all command line, which scares me.
As far as I understand, you want to save the webpages into files. But I do not know where you want to save them, which naming scheme you want to use, and so on. Therefore you will have to add code there that saves the source of the website into a file on your Mac. For example, you could tell Safari to open the URL and save it for you, as Safari is quite scriptable. It’s up to you.
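A minimal sketch of what could go into the “more code goes here” spot, assuming the pages should simply be written as HTML source into an existing folder. The handler name saveurl, its parameters, and the file name in the usage line are made up for illustration; only curl and its -L, -s and -o options are real.

```applescript
-- Hypothetical helper: download one URL into a folder using curl.
-- folderpath is an HFS (colon separated) path that ends with a colon
-- and must point to a folder that already exists.
on saveurl(theurl, folderpath, filename)
	-- the shell needs a POSIX (slash separated) path
	set targetpath to POSIX path of (folderpath & filename)
	-- -L follows redirects, -s hides the progress meter, -o names the output file
	set command to "curl -L -s -o " & quoted form of targetpath & " " & quoted form of theurl
	do shell script command
	return targetpath
end saveurl
```

Inside the repeat loop it could then be called as saveurl(newsurl, (path to home folder as text) & "Downloads:feed:", "page1.html"), with a file name that changes for each article so that the pages do not overwrite each other.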
Mail currently does not offer AppleScript support for managing feeds. But if the command line scares you, then you can also opt for URL Access Scripting, which is only a little bit more work:
on run
set tmpfilepath to my gettmpfilepath()
tell application "URL Access Scripting"
set feedsource to download "http://images.apple.com/main/rss/hotnews/hotnews.rss" to tmpfilepath
end tell
set filecont to read file tmpfilepath
-- deleting the temp file
set command to "rm " & quoted form of POSIX path of tmpfilepath
do shell script command
set feedlines to paragraphs of filecont
repeat with feedline in feedlines
if feedline begins with "<link>" then
set newsurl to (characters 7 through -8 of feedline) as Unicode text
-- more code goes here
end if
end repeat
end run
on gettmpfilepath()
set tmpfolderpath to (path to temporary items folder from user domain) as Unicode text
repeat
set randnum to random number from 10000 to 99999
set tmpfilename to (randnum as text) & ".html" -- coerce the number first, otherwise & builds a list
set tmpfilepath to tmpfolderpath & tmpfilename
try
set tmpfilealias to tmpfilepath as alias
on error
exit repeat
end try
end repeat
return tmpfilepath
end gettmpfilepath
Thanks Martin.
I used the URL Access Scripting script.
I replaced (path to temporary items folder from user domain) with “Users/chris/Downloads/feed”
and did not change anything else in the script.
I get the error “Bad name for file. Users/chris/Downloads/feed13377.html”
Hi,
AppleScript works only with colon-separated (HFS) paths, starting with the name of the startup volume:
"Mac HD:Users:chris:Downloads:feed:"
or, regardless of the user name:
((path to home folder as text) & "Downloads:feed:")
Note: If you use a POSIX (slash-separated) path, it must start with a slash.
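To illustrate the two path styles, here is a small sketch that converts between them. The variable names are made up for this example, and the Downloads:feed: folder is the one mentioned in this thread.

```applescript
-- HFS (colon separated) path to the download folder, independent of the user name
set hfsfolder to (path to home folder as text) & "Downloads:feed:"

-- convert it to a POSIX (slash separated) path, e.g. for do shell script
set posixfolder to POSIX path of hfsfolder

-- convert a POSIX path back to an HFS path
set hfsagain to (POSIX file posixfolder) as text
```

The POSIX form always starts with a slash, while the HFS form starts with the name of the volume.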
thanks StefanK
In “more code goes here” I wanted to add some code that downloads the web pages into my /users/chris/downloads/feed folder. So I thought I would first check whether they open properly before making them save themselves in the feed folder.
I added this in “more code goes here”:
tell application "Safari"
open feedline
end tell
Some 20 weird URLs like
file:///%3Clink%3Ehttp/::www.apple.com:iphone:softwareupdate:%3Fsr=hotnews%3Fsr=hotnews.rss%3C:link%3E
file:///%3Clink%3Ehttp/::www.apple.com:trailers:independent:theluckyones:%3Fsr=hotnews%3Fsr=hotnews.rss%3C:link%3E
opened up in Safari with an error (the address is different for each one, obviously).
:lol:
AppleScript works only with HFS paths (colon separated), but of course literal URLs keep their slashes, e.g.
www.apple.com/iphone/softwareupdate/.
PS: The shell works only with POSIX paths (slash separated)
In simple words, you mean the above script is not the solution yet?
It’s probably the solution, but please don’t mix up the HFS paths (like your download folder)
and the POSIX paths (like a URL)
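The file:/// addresses above appear to come from handing the whole feed line, including its <link> tags, to Safari’s open command, which treats the text as a file path. A minimal sketch of the alternative, assuming each link sits on its own line exactly as in the scripts in this thread (the handler name linkurl is made up for illustration):

```applescript
-- Hypothetical helper: strip the <link> and </link> tags from one feed line.
-- Returns missing value if the line is not a non-empty <link> line.
on linkurl(feedline)
	set feedline to feedline as text
	-- "<link>" has 6 characters and "</link>" has 7, so a usable line is longer than 13
	if (length of feedline) > 13 and feedline begins with "<link>" and feedline ends with "</link>" then
		return text 7 through -8 of feedline
	end if
	return missing value
end linkurl

set newsurl to linkurl("<link>http://www.apple.com/iphone/softwareupdate/</link>")
if newsurl is not missing value then
	open location newsurl -- opens the URL in the default browser
end if
```

The point is that the URL is passed on untouched as plain text, while HFS paths are only used for files and folders on the Mac.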
This works on my machine, except for iTunes Store links:
on run
set tmpfilepath to my gettmpfilepath()
tell application "URL Access Scripting"
set feedsource to download "http://images.apple.com/main/rss/hotnews/hotnews.rss" to tmpfilepath
end tell
set filecont to read file tmpfilepath
-- deleting the temp file
set command to "rm " & quoted form of POSIX path of tmpfilepath
do shell script command
set feedlines to paragraphs of filecont
repeat with feedline in feedlines
if feedline begins with "<link>" then
open location (characters 7 through -8 of feedline) as Unicode text -- this opens each URL with the default browser
end if
end repeat
end run
.
I googled around and saw this: http://docs.info.apple.com/article.html?path=AppleScript/2.1/en/as208.html
and accordingly did this
I get the error
I’m sorry if I’m annoying you. Maybe I will learn some more AppleScript and then come back.
You need the handler, which I skipped to save space and indicated with “.”
Here’s the whole script:
on run
set tmpfilepath to my gettmpfilepath()
tell application "URL Access Scripting"
set feedsource to download "http://images.apple.com/main/rss/hotnews/hotnews.rss" to tmpfilepath
end tell
set filecont to read file tmpfilepath
-- deleting the temp file
set command to "rm " & quoted form of POSIX path of tmpfilepath
do shell script command
set feedlines to paragraphs of filecont
repeat with feedline in feedlines
if feedline begins with "<link>" then
open location (text 7 through -8 of feedline) -- this opens each URL with the default browser
end if
end repeat
end run
on gettmpfilepath()
set tmpfolderpath to (path to temporary items folder as text)
repeat
set randnum to random number from 10000 to 99999
set tmpfilename to (randnum as text) & ".html" -- coerce the number first, otherwise & builds a list
set tmpfilepath to tmpfolderpath & tmpfilename
try
set tmpfilealias to tmpfilepath as alias
on error
exit repeat
end try
end repeat
return tmpfilepath
end gettmpfilepath
Note: path to temporary items is used only for the temporary file, which will be deleted afterwards