Web Image Download from CSV file.

Thank you so much for the help, this works an absolute treat :smiley:

Just out of curiosity, were you using my script, or ccstone’s?

I think they’re pretty similar, except he’s doing everything in the shell and I’m using as much AppleScript as possible, calling the shell only so curl can do the actual download.

If you know some AppleScript but not much shell, then mine might have some modification/maintainability advantage for you. Of course, if you know shell better than AppleScript, then the opposite would be true.

Hey guys,

I keep getting an error when using these URLs. I have a list of over 300, and I have changed the base URL.

http://jpl.a.bigcontent.io/v1/static/bl_006773_a
http://jpl.a.bigcontent.io/v1/static/bl_006806_a
http://jpl.a.bigcontent.io/v1/static/bl_006908_a

Not sure why? Can anyone help?

Hey Eastland,

Whose script are you using?

What’s the base-url in your script?

What exactly is the error?

-Chris

Hey ccstone,

Thanks for having a look. I am using t.spoon’s script:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set baseURL to "http://jpl.a.bigcontent.io/v1/static/"

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
   set downloadsFolder to path to downloads folder
   set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
   if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
   set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
repeat with aLine in dataLines
   set delimitHolder to AppleScript's text item delimiters
   set AppleScript's text item delimiters to ","
   set newURL to baseURL & text item 1 of aLine
   copy newURL to the end of theURLs
   set AppleScript's text item delimiters to delimitHolder
end repeat

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}
repeat with imageURL in theURLs
   set delimitHolder to AppleScript's text item delimiters
   set AppleScript's text item delimiters to "/"
   set imageName to the last text item of imageURL
   set AppleScript's text item delimiters to delimitHolder
   set downloadFilePath to saveFolder & imageName as text
   
   set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" = \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
   
   if shellOutput is "Image does not exist" then
       copy imageURL to the end of missingImages
   else
       copy imageURL to the end of downloadedImages
   end if
end repeat

set missingImageDialogText to ""
if (count of missingImages) > 0 then
   set missingImageDialogText to "The following images failed to download:"
   repeat with anImage in missingImages
       set missingImageDialogText to missingImageDialogText & return & anImage
   end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

I am using the base URL http://jpl.a.bigcontent.io/v1/static/

and getting this error

The script completed.
0 out of 152 images were successfully downloaded to: /Users/eastland/Downloads/Script Downloaded Images/
The following images failed to download:
http://jpl.a.bigcontent.io/v1/static/ÔÂȘÞbl_006773_a
http://jpl.a.bigcontent.io/v1/static/bl_006806_a
http://jpl.a.bigcontent.io/v1/static/bl_006908_a
http://jpl.a.bigcontent.io/v1/static/bl_007373_a
http://jpl.a.bigcontent.io/v1/static/bl_012337_a

and so on


I have checked the URLs, and some automatically download the image when entered into a browser; I’m not sure whether this may cause an issue.

Can you help? :smiley:

I opened my script from your post, with the base URL modified as you did it, and put the URLs you posted (with the base path stripped back off) into my test .csv.

My result was that four out of five downloaded successfully.

Only the first one failed, and it also fails if I put the URL you provided into a browser.

Those weird characters in the first link make me suspect a character encoding problem: the values in your CSV, like the first value you posted, may be getting characters changed between what you see in the CSV and what Terminal actually tries to retrieve. For example, the shell may not be reading the text in the same encoding the spreadsheet saved it in.

But that’s just a guess, and I’ve never had to mess with text encoding problems (yet), so I’m really not sure.
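If the encoding guess is on the right track, one candidate is a byte-order mark that a spreadsheet export can write at the start of a CSV, which would land in front of the first value only. That is an assumption; nothing in this thread confirms it. Here is a minimal shell sketch for stripping it, where `strip_bom_and_cr` and the file names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print a CSV with any leading UTF-8 byte-order mark
# and any trailing carriage returns removed. The original file is untouched.
strip_bom_and_cr() {
    # A UTF-8 BOM is the three bytes EF BB BF (octal 357 273 277).
    bom=$(printf '\357\273\277')
    cr=$(printf '\r')
    # LC_ALL=C makes sed treat the input as raw bytes rather than characters.
    LC_ALL=C sed -e "1s/^${bom}//" -e "s/${cr}\$//" "$1"
}

# Usage (file names are placeholders):
# strip_bom_and_cr urls.csv > urls-clean.csv
```

To check whether a file starts with those bytes, `head -c 3 urls.csv | od -An -tx1` would show `ef bb bf`.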

But for me, the script is working perfectly with the URLs you posted, aside from the first one, which doesn’t work in a browser either.

If you want me to look more, you could PM a link to download an actual sample of a CSV you’re using for these, and I can try to dig a little deeper.

Thanks t.spoon,

I think I may be saving out from Excel incorrectly. I have PM’d you a link to the file; thanks for having a look.

Well, this just got hard for me to help you with. I opened the script from your last post, downloaded the .csv you PM’d me, ran the script, and got:

The script completed.
146 out of 146 images were successfully downloaded to: ~/Downloads/Script Downloaded Images/

So being unable to reproduce the error, it’s hard to troubleshoot.

Are you clearing out the folder between runs? I’m not entirely sure what the shell script does in case of name conflicts with already downloaded files. If clearing out the folder helps, I can easily modify my script to be like ccstone’s and create a new folder for each run.

But I sort of doubt that’s the problem, and I’m afraid I’m short of ideas on what the problem could be. It really does get hard when the error isn’t reproducible.

Do the files download on your computer when you manually put the URL into a browser?

-Tom.

Thanks for having a look into this for me. I cleared all the folders and re-ran the script with no luck; I’m still getting 0 of 146 images downloaded. I then copied individual URLs into my browser: the first two auto-downloaded, and the third showed the image in the browser.

I connect through a proxy; do you think this could be an issue?

The odd thing is that if I revert to the older base URL, http://i1.adis.ws/i/jpl/ , this works fine, but the new URL, http://jpl.a.bigcontent.io/v1/static/ , is still not working :confused:

The reason for changing the URL is that the image file it returns has less compression, which is the one I need.
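On the proxy question: curl picks up proxy settings from a handful of environment variables, so one quick check is to see which of them are set on each machine. This is only a sketch; `show_proxy_env` is a made-up name, and a proxy configured only in System Preferences would not show up here.

```shell
#!/bin/sh
# Hypothetical helper: list the proxy-related environment variables that
# curl consults, and say which ones are set in the current shell.
show_proxy_env() {
    # curl reads http_proxy in lowercase only; the others in either case.
    for name in http_proxy https_proxy HTTPS_PROXY all_proxy ALL_PROXY no_proxy NO_PROXY; do
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is not set"
        fi
    done
}

# Usage:
# show_proxy_env
```

If the two machines differ, curl's `--proxy` and `--noproxy` options can force or bypass a proxy for a single test download.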

Yeah, I don’t know. I double-checked, and it is working fine for me with the base URL set to http://jpl.a.bigcontent.io/v1/static/

It does seem particularly odd that the script doesn’t work for you with that base URL but does work for me with the same one, while for other base URLs it works for both of us.

I’m just scratching my head here.

-Tom.

We’re both using similar “do shell script” lines with curl for the actual downloading, so the results will probably be the same if you use ccstone’s script.

But since what we’re looking at seems to be a weird, inconsistent bug, I’d give his script a shot and see if you get a different result.

-Tom.

Hi Tom,

Yeah, I’m confused by this one. I tried ccstone’s script, but again it returned errors: it only brought back the last cell and skipped the rest.

If the script was modified to just read the CSV file with full links in the doc, do you think this might work? Basically, ditch the base URL function from the script.

Or am I clutching at straws?

Probably clutching at straws, but who knows. Computers can be weird.

Here it is expecting the full URL in the file.

I tested it with a few files you provided using the “http://jpl.a.bigcontent.io/v1/static/” URL beginning and the full URL in the .csv and it worked fine.

-Tom.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
set delimitHolder to AppleScript's text item delimiters
repeat with aLine in dataLines
	set AppleScript's text item delimiters to ","
	copy text item 1 of aLine to the end of theURLs
end repeat
set AppleScript's text item delimiters to delimitHolder

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}

set delimitHolder to AppleScript's text item delimiters
repeat with imageURL in theURLs
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set downloadFilePath to saveFolder & imageName as text
	set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" = \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat
set AppleScript's text item delimiters to delimitHolder

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

Just to try to help troubleshoot, here’s a version that still uses the base URL, but it also logs a plain text file of all the URLs it was trying to load to the downloaded images folder.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set baseURL to "http://jpl.a.bigcontent.io/v1/static/"

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
set delimitHolder to AppleScript's text item delimiters

set dataForFile to ""

repeat with aLine in dataLines
	set AppleScript's text item delimiters to ","
	set newURL to baseURL & text item 1 of aLine
	copy newURL to the end of theURLs
	set dataForFile to dataForFile & newURL & return
end repeat

set dataSavePath to POSIX file (saveFolder & "URLs as Text.txt")
try
	close access file dataSavePath
end try
set fileReference to open for access (dataSavePath) with write permission
set eof fileReference to 0
write dataForFile to fileReference
close access fileReference


set AppleScript's text item delimiters to delimitHolder

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}

set delimitHolder to AppleScript's text item delimiters
repeat with imageURL in theURLs
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set downloadFilePath to saveFolder & imageName as text
	set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" = \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat
set AppleScript's text item delimiters to delimitHolder

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

Take a look and see if the URLs look right in the text file. Maybe post part of the text file to the forum, or PM me another Dropbox link to what it saved, so we can take a look; maybe it’ll give me something to go on.

Another possible step is for me to simplify the curl line and return Terminal errors to AppleScript.
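As a rough idea of what that simplification might look like on the shell side, here is a sketch that makes a single curl call and hands back curl’s own exit code and error text instead of a fixed “Image does not exist” message. The name `download_image` is made up for illustration, and this is not the script from this thread.

```shell
#!/bin/sh
# Hypothetical helper: download one URL with a single curl call and report
# curl's own exit status and error message on failure.
download_image() {
    url=$1
    dest=$2
    # --fail                 treat HTTP error responses (4xx/5xx) as failures
    # --silent --show-error  hide the progress meter but keep error messages
    # --location             follow redirects
    if err=$(curl --fail --silent --show-error --location --output "$dest" "$url" 2>&1); then
        echo "Image downloaded"
    else
        status=$?
        echo "curl failed with exit code $status: $err"
        return "$status"
    fi
}

# Usage:
# download_image http://jpl.a.bigcontent.io/v1/static/bl_006806_a "$HOME/Downloads/bl_006806_a"
```

Because the function returns a non-zero status on failure, a “do shell script” call wrapping it would raise an AppleScript error carrying that message, which a try block could catch and log.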

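Actually, come to think of it, please open Terminal and at the command prompt paste:

curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/Script\ Downloaded\ Images/bl_006806_a

and hit Enter, then post Terminal’s response here. Maybe there will be a curl error that can help us zoom in.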

Hi,

I ran the script on another computer and it worked fine :confused:

This is what terminal brought back on the machine that doesn’t work

Last login: Tue May 2 09:30:55 on console
JDM10SHA00039:~ leo.eastland$ curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/Script\ Downloaded\ Images/bl_006806_a
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Warning: Failed to create the file /Users/leo.eastland/Downloads/Script
Warning: Downloaded Images/bl_006806_a: No such file or directory
curl: (23) Failed writing received data to disk/application

and this is what terminal returned on the machine that the script works on

Last login: Fri Apr 21 18:46:10 on console
JDM11SHA00010:~ capture.station$ curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/Script\ Downloaded\ Images/bl_006806_a
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 3088k 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0Warning: Failed to create the file /Users/capture.station/Downloads/Script
Warning: Downloaded Images/bl_006806_a: No such file or directory
curl: (23) Failed writing body (0 != 11748)

I’ll cross reference the URL text files and see if I can see any anomalies.

Both of these Terminal responses report failures. Did you have the “Script Downloaded Images” folder that the script makes present in your Downloads folder on both machines when you ran the Terminal command? That’s where I left it downloading to.
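The “Failed to create the file … No such file or directory” warnings are what curl prints when the folder named in the -o path does not exist. One way around that when testing by hand is to create the folder first; a minimal sketch, with `dest_path` as a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: make sure the destination folder exists, then print
# the full path that curl's -o option should write to.
dest_path() {
    dir=$1
    name=$2
    # -p creates missing parent folders and is a no-op if the folder exists.
    mkdir -p "$dir" || return 1
    printf '%s/%s\n' "$dir" "$name"
}

# Usage (quoting the result keeps the spaces in the folder name intact):
# curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a \
#     -o "$(dest_path "$HOME/Downloads/Script Downloaded Images" bl_006806_a)"
```

curl also has a `--create-dirs` option that creates missing folders in the -o path itself.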

Alternately, I can just take that out of the path and download directly to your ~/Downloads folder:

curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/bl_006806_a

Please try that.

Thanks,

Tom.

Sorry, my bad. I tested again; here you go:

JDM11SHA00010:~ capture.station$ curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/bl_006806_a
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3088k 100 3088k 0 0 3519k 0 --:--:-- --:--:-- --:--:-- 3517k

JDM10SHA00039:~ leo.eastland$ curl http://jpl.a.bigcontent.io/v1/static/bl_006806_a -o ~/Downloads/bl_006806_a
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0

OK, well, you’ve got me. It worked in the top instance and didn’t work in the bottom instance, and no error was returned when it didn’t work.

So we’ve ruled out something weird about the AppleScript feeding the command to Terminal incorrectly (character encoding, escaping spaces, stuff like that), because a valid command fed directly into Terminal still doesn’t download the file.
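Since the failing machine reports zero bytes and no error, one more thing that might narrow it down is having curl print the HTTP status code and the number of bytes it actually received. This is a sketch only; `probe_url` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: fetch a URL and print "<http status> <bytes received>".
probe_url() {
    url=$1
    dest=$2
    # --write-out prints the chosen transfer details after the download ends.
    curl --silent --show-error --location \
        --output "$dest" \
        --write-out '%{http_code} %{size_download}\n' \
        "$url"
}

# Usage:
# probe_url http://jpl.a.bigcontent.io/v1/static/bl_006806_a "$HOME/Downloads/bl_006806_a"
```

Running the same curl line with `--verbose` added prints the request and response headers, which should show whether something between the machine and the server, such as a proxy, is answering with an empty response.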

I’m afraid I’m at a loss here for what could be going wrong on the machine where it doesn’t work.

-Tom.

Yeah, I am also at a loss with this. I thought it might be the version of curl, but both machines are running exactly the same one.