I’m new to AppleScript. Like another poster here, I download many files from the net using Firefox (I still have Safari, but I prefer Firefox; if this script is only possible in Safari, that’s fine too). I would like to know whether it’s possible to write a script that tells my browser to locate all links on the page and then download each linked file into a specific folder. Is this possible, and how do I do it? Thanks.
Something like this? It works only in Safari; the scriptability of Firefox is hopeless.
-- File extensions to download; links to anything else are skipped
property fileList : {"jpg", "gif", "png", "tif", "bmp", "zip", "dmg", "pdf"}
set destinationFolder to POSIX path of (choose folder)
tell application "Safari"
    activate
    set num_links to (do JavaScript "document.links.length" in document 1)
    repeat with i from 0 to num_links - 1
        set this_link to do JavaScript "document.links[" & i & "].href" in document 1
        -- Take the text after the last "/" as the file name
        set {ASTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "/"}
        set fName to last text item of this_link
        set AppleScript's text item delimiters to ASTID
        try
            if text -3 thru -1 of fName is in fileList then
                -- Quote the URL as well, so characters such as & or ? are not interpreted by the shell
                do shell script "curl -o " & quoted form of (destinationFolder & fName) & space & quoted form of this_link
            end if
        end try
    end repeat
end tell
Thanks Stefan! That’s very close to what I’m looking for. The script still won’t download all the links on the page for some reason. Here is a link to the page I’m trying:
There are 139 items on the page. I believe there are different kinds of links on this page, some to start downloading files and others that direct you to another page. Maybe that makes a difference. Just wanted to supply a sample page so it’s easier to get help. Thanks very much.
I assumed that the site could also contain links that refer to other pages.
That is why there is a list of extensions, which filters for files with a certain extension.
The following version should download everything, including files that are served through a redirect.
set destinationFolder to POSIX path of (choose folder)
tell application "Safari"
    activate
    set num_links to (do JavaScript "document.links.length" in document 1)
    repeat with i from 0 to num_links - 1
        set this_link to do JavaScript "document.links[" & i & "].href" in document 1
        -- Take the text after the last "/" as the file name
        set {ASTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "/"}
        set fName to last text item of this_link
        set AppleScript's text item delimiters to ASTID
        try
            -- -L makes curl follow redirects; the URL is quoted so the shell does not interpret & or ?
            do shell script "curl -L -o " & quoted form of (destinationFolder & fName) & space & quoted form of this_link
        end try
    end repeat
end tell
It seems to hit a problem on item 18 on the page. The error says it was timed out due to an error in Safari. Can I view more details on the error or is it really that vague?
Item #18 probably takes more than 2 minutes to download (the default limit for a command sent to an application), and then the timeout error occurs.
I changed the script a bit so that the shell script line is outside the application tell block.
set destinationFolder to POSIX path of (choose folder)
tell application "Safari"
    activate
    set num_links to (do JavaScript "document.links.length" in document 1)
end tell
repeat with i from 0 to num_links - 1
    tell application "Safari" to set this_link to do JavaScript "document.links[" & i & "].href" in document 1
    -- Take the text after the last "/" as the file name
    set {TID, text item delimiters} to {text item delimiters, "/"}
    set fName to last text item of this_link
    set text item delimiters to TID
    try
        -- Runs outside the Safari tell block, so a long download is not cut off by Safari's timeout
        do shell script "curl -L -o " & quoted form of (destinationFolder & fName) & space & quoted form of this_link
    end try
end repeat
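For reference, the timeout described above can also be raised explicitly instead of moving the shell call. This is only a minimal sketch, assuming the `do shell script` line stays inside the Safari tell block as in the earlier versions; the 600-second limit, the example URL, and the `destinationPath` variable are placeholders, not values from this thread.

```applescript
-- Sketch only: the limit, URL and destination below are assumed placeholder values
set this_link to "https://example.com/file.zip"
set destinationPath to (POSIX path of (path to desktop)) & "file.zip"
tell application "Safari"
    -- Allow up to 10 minutes before AppleScript reports a timeout error
    with timeout of 600 seconds
        do shell script "curl -L -o " & quoted form of destinationPath & space & quoted form of this_link
    end timeout
end tell
```

Keeping the shell call outside the tell block, as in the script above, avoids the issue without having to guess a suitable limit.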
That works so much better! However, it seems the script missed 5 items on the list. I’m trying to figure out which ones those are and what they have in common. Also, is there a way to have the file names displayed in the target folder without the % symbol? Thanks again.
The names of the files that cause an error are saved in a list, and the list is displayed at the end. This version also replaces %20 in the file names with a space.
set destinationFolder to POSIX path of (choose folder)
tell application "Safari"
    activate
    set num_links to (do JavaScript "document.links.length" in document 1)
end tell
-- Collects the names of files that failed to download, one per line
set errorList to ""
repeat with i from 0 to num_links - 1
    tell application "Safari" to set this_link to do JavaScript "document.links[" & i & "].href" in document 1
    -- Take the text after the last "/" as the file name
    set {TID, text item delimiters} to {text item delimiters, "/"}
    set fName to last text item of this_link
    -- Replace every %20 in the file name with a space
    set text item delimiters to "%20"
    set fName to text items of fName
    set text item delimiters to space
    set fName to fName as text
    set text item delimiters to TID
    try
        -- The URL is quoted so the shell does not interpret characters such as & or ?
        do shell script "curl -L -o " & quoted form of (destinationFolder & fName) & space & quoted form of this_link
    on error
        set errorList to errorList & fName & return
    end try
end repeat
display dialog errorList
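The %20 replacement in the script above only handles spaces; other escapes such as %28 or %26 would stay in the file name. One possible way to cover them, sketched here under the assumption that the links on the page are well-formed, is to let Safari's JavaScript decode the last path segment with `decodeURIComponent`. The handler name `decodedNameOfLink` is hypothetical and not part of the scripts above.

```applescript
-- Hypothetical helper: returns the percent-decoded last path segment of link number i
-- Assumes document 1 in Safari is the page whose links are being downloaded
on decodedNameOfLink(i)
    tell application "Safari"
        set js to "decodeURIComponent(document.links[" & i & "].href.split('/').pop())"
        return (do JavaScript js in document 1)
    end tell
end decodedNameOfLink
```

In the loop, `set fName to decodedNameOfLink(i)` would then stand in for the text item delimiter lines. A query string after the file name (for example ?id=1) is not removed by this sketch, and a malformed escape makes `decodeURIComponent` fail, so the result should be checked before it is used as a file name.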