Web Image Download from CSV file.

Hi,

I was hoping somebody would be able to help with a script that does the following:

Scans a .csv file column A

Adds each cell in turn to the end of a url : http://i1.adis.ws/i/jpl/contents_of_cell_A1

Accesses the site

Then downloads the image to a specified local folder.

Repeats until the last cell.

I am currently doing this manually and have been trying to build something in Automator, but without much luck.

If anyone could help that would be awesome

Cheers :smiley:

This worked on a test .csv I made with just image URLs in the first column:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
repeat with aLine in dataLines
	set delimitHolder to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	copy text item 1 of aLine to the end of theURLs
	-- may want to insert validation in case it doesn't start with "HTTP" or end with the right extension, etc.
	set AppleScript's text item delimiters to delimitHolder
end repeat

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}
repeat with imageURL in theURLs
	set delimitHolder to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set AppleScript's text item delimiters to delimitHolder
	set downloadFilePath to saveFolder & imageName as text
	
	set shellOutput to (do shell script "
			
			image=\"" & (contents of imageURL as text) & "\"
			response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
			
			if [ \"$response\" == \"200\" ]; then
			curl \"$image\" -o \"" & downloadFilePath & "\"
			echo \"Image downloaded\"
			else
			echo \"Image does not exist\"
			fi
			
			")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"


Hey There,

It’s pretty simple if you know how.

This script will run as is and will download 3 images to a new dated folder in your Downloads folder.


-------------------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2017/03/08 16:00
# dMod: 2017/03/08 16:28 
# Appl: AppleScript & the Shell
# Task: Download image URLs from a CSV file to a dated folder in the ~/Downloads folder.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Download, @Image, @URLs, @CSV, @Dated, @Folder
-------------------------------------------------------------------------------------------

set shCMD to "
export PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:$PATH;

baseURL='http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/'

# Simulate reading the csv data:
csvData='
Ferrari-Enzo-001.jpg,Line01-Field02,Line01-Field03
Ferrari-Enzo-002.jpg,Line02-Field02,Line02-Field03
Ferrari-Enzo-003.jpg,Line03-Field02,Line03-Field03
'

urlList=$(sed -En '/[[:alnum:]]+/{
	s!,.+!!
	s!^!'\"$baseURL\"'!
	s!^!\"!g
	s!$!\"!g
	s!^!url = !g
	p
}' <<< \"$csvData\")

newFolderPath=~/Downloads/'Downloaded Files - '$(date \"+%Y-%m-%d %H.%M.%S\")
mkdir -p \"$newFolderPath\"

cd \"$newFolderPath\"

echo \"$urlList\" \\
| curl -Ls --remote-name-all -L --user-agent 'Opera/9.70 (Linux ppc64 ; U; en) Presto/2.2.1' -K -
"

do shell script shCMD

-------------------------------------------------------------------------------------------

You can read the file with AppleScript, or more simply by using the cat command in the shell.

[format]
cat '/path/to/your/file.csv'
[/format]

Make sure to quote the path if it has any spaces in it.

So in the script above you’d replace the “csvData=” segment with:

[format]csvData=$(cat '/path/to/your/file.csv')[/format]


Chris
··················································································
{ MacBookPro6,1 · 2.66 GHz Intel Core i7 · 8GB RAM · OSX 10.12.3 }

Okay, let’s make that simpler for you and change it to read a file.

Given a file named “URL LIST.csv” in your Downloads folder (see dataFilePath in the script) with content of:

[format]Ferrari-Enzo-001.jpg,Line01-Field02,Line01-Field03
Ferrari-Enzo-002.jpg,Line02-Field02,Line02-Field03
Ferrari-Enzo-003.jpg,Line03-Field02,Line03-Field03
[/format]

The script will read the file directly into ‘sed’ to isolate the file names and build the URLs.


-------------------------------------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2017/03/08 16:00
# dMod: 2017/03/08 16:50
# Appl: Miscellaneous
# Task: Download image URLs from a CSV file to a dated folder in your ~/Downloads folder.
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @Download, @Image, @URLs, @CSV, @Dated, @Folder, @curl, @sed
-------------------------------------------------------------------------------------------

set shCMD to "
export PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:$PATH;

# Note the single-quotes in the path due to spaces.
dataFilePath=~/'Downloads/URL LIST.csv'

baseURL='http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/'

urlList=$(sed -En '/[[:alnum:]]+/{
	s!,.+!!
	s!^!'\"$baseURL\"'!
	s!^!\"!g
	s!$!\"!g
	s!^!url = !g
	p
}' \"$dataFilePath\" )

newFolderPath=~/Downloads/'Downloaded Files - '$(date \"+%Y-%m-%d %H.%M.%S\")
mkdir -p \"$newFolderPath\"

cd \"$newFolderPath\"

echo \"$urlList\" | curl -Ls --remote-name-all -L --user-agent 'Opera/9.70 (Linux ppc64 ; U; en) Presto/2.2.1' -K -
"
do shell script shCMD

-------------------------------------------------------------------------------------------
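
For anyone curious what the sed step is doing, here is a minimal sketch that runs it on its own against inline sample rows. The function name build_url_list and the sample rows are made up for illustration; the base URL is the one from the script above. It should print one url = "..." line per non-empty row, which is the config format curl reads from stdin with -K -.

```shell
#!/bin/sh
# Sketch only: isolate column 1 of each CSV row and turn it into a curl config line.
baseURL='http://members.aceweb.com/randsautos/photogallery/ferrari/enzo/'

build_url_list() {
    # Reads CSV rows on stdin. Rows without any alphanumeric character are skipped.
    # Each kept row loses everything from the first comma onward, then gets
    # wrapped as: url = "<baseURL><first field>"
    sed -En '/[[:alnum:]]/{
        s!,.*!!
        s!^!url = "'"$baseURL"'!
        s!$!"!
        p
    }'
}

# Illustrative sample rows (not real data), including one blank row.
printf '%s\n' 'Ferrari-Enzo-001.jpg,Line01-Field02' '' 'Ferrari-Enzo-002.jpg,Line02-Field02' | build_url_list
```

Running the step separately like this makes it easier to confirm the URLs look right before handing them to curl.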

-Chris

Whoops, I see I missed that the csv doesn’t contain the entire URL, that it’s just the end of it.

Amended script here:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set baseURL to "http://i1.adis.ws/i/jpl/"

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
repeat with aLine in dataLines
	set delimitHolder to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set newURL to baseURL & text item 1 of aLine
	copy newURL to the end of theURLs
	set AppleScript's text item delimiters to delimitHolder
end repeat

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}
repeat with imageURL in theURLs
	set delimitHolder to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set AppleScript's text item delimiters to delimitHolder
	set downloadFilePath to saveFolder & imageName as text
	
	set shellOutput to (do shell script "
			
			image=\"" & (contents of imageURL as text) & "\"
			response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
			
			if [ \"$response\" == \"200\" ]; then
			curl \"$image\" -o \"" & downloadFilePath & "\"
			echo \"Image downloaded\"
			else
			echo \"Image does not exist\"
			fi
			
			")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"


Thank you so much for the help, this works an absolute treat :smiley:

Just out of curiosity, were you using my script, or ccstone’s?

I think they’re pretty similar except he’s doing everything in the shell and I’m using as much Applescript as possible and just using the shell to call curl to do the actual download.

If you know some AppleScript but not much shell, then mine might have some modification/maintainability advantage for you… of course, if you know shell better than AppleScript, then the opposite would be true.

Hey guys,

I keep getting an error when using these URLs. I have a list of over 300 and I have changed the base URL.

http://jpl.a.bigcontent.io/v1/static/bl_006773_a
http://jpl.a.bigcontent.io/v1/static/bl_006806_a
http://jpl.a.bigcontent.io/v1/static/bl_006908_a

Not sure why? Can anyone help?

Hey Eastland,

Whose script are you using?

What’s the base-url in your script?

What exactly is the error?

-Chris

Hey ccstone,

Thanks for having a look. I am using t.spoon’s script:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set baseURL to "http://jpl.a.bigcontent.io/v1/static/"

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
   set downloadsFolder to path to downloads folder
   set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
   if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
   set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
repeat with aLine in dataLines
   set delimitHolder to AppleScript's text item delimiters
   set AppleScript's text item delimiters to ","
   set newURL to baseURL & text item 1 of aLine
   copy newURL to the end of theURLs
   set AppleScript's text item delimiters to delimitHolder
end repeat

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}
repeat with imageURL in theURLs
   set delimitHolder to AppleScript's text item delimiters
   set AppleScript's text item delimiters to "/"
   set imageName to the last text item of imageURL
   set AppleScript's text item delimiters to delimitHolder
   set downloadFilePath to saveFolder & imageName as text
   
   set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" == \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
   
   if shellOutput is "Image does not exist" then
       copy imageURL to the end of missingImages
   else
       copy imageURL to the end of downloadedImages
   end if
end repeat

set missingImageDialogText to ""
if (count of missingImages) > 0 then
   set missingImageDialogText to "The following images failed to download:"
   repeat with anImage in missingImages
       set missingImageDialogText to missingImageDialogText & return & anImage
   end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

I am using the base URL http://jpl.a.bigcontent.io/v1/static/

and getting this error

The script completed.
0 out of 152 images were successfully downloaded to: /Users/eastland/Downloads/Script Downloaded Images/
The following images failed to download:
http://jpl.a.bigcontent.io/v1/static/Ԫøbl_006773_a
http://jpl.a.bigcontent.io/v1/static/bl_006806_a
http://jpl.a.bigcontent.io/v1/static/bl_006908_a
http://jpl.a.bigcontent.io/v1/static/bl_007373_a
http://jpl.a.bigcontent.io/v1/static/bl_012337_a

and so on…

I have checked the URLs and some auto-download the image when entered into a browser; not sure if this may cause an issue.

Can you help? :smiley:

I opened my script from your post, with the base URL modified as you did it, put the URL’s you posted (with the base path stripped back off) into my test .csv:

And my result was that 4 out of five downloaded successfully.

Only the first one failed, and it also fails if I put the URL you provided into a browser.

Those weird characters in the first link make me suspect a character encoding problem: that some values in your CSV, like the first one you posted, are getting characters changed between what you see in the CSV and what Terminal actually tries to retrieve. For example, the script and the shell may not be treating the text as the same encoding the file was saved in.

But that’s just a guess, and I’ve never had to mess with text encoding problems (yet), so I’m really not sure.

But for me, the script is working perfectly with the URL’s you posted, aside from the first that doesn’t work in a browser either.

If you want me to look more, you could PM a link to download an actual sample of a CSV you’re using for these, and I can try to dig a little deeper.
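
On the encoding guess: characters like those at the start of the first link are roughly what a UTF-8 byte order mark (the bytes EF BB BF, which some spreadsheet apps add when saving a CSV as UTF-8) looks like when it is read in a different encoding. That is an assumption about the file, not something confirmed here. If that is what is happening, a sketch like this would strip a leading BOM and any Windows-style carriage returns before the values are used (the function name clean_csv is made up for illustration):

```shell
#!/bin/sh
# Sketch only: remove a leading UTF-8 BOM and trailing carriage returns from CSV text.
clean_csv() {
    # Build the byte sequences with printf so this works with both BSD and GNU sed.
    bom=$(printf '\357\273\277')
    cr=$(printf '\r')
    # LC_ALL=C makes sed treat its input as raw bytes rather than multibyte text.
    LC_ALL=C sed -e "1s/^$bom//" -e "s/$cr\$//"
}

# Example: a BOM-prefixed, CRLF-terminated two-line input.
printf '\357\273\277bl_006773_a\r\nbl_006806_a\r\n' | clean_csv
```

The cleaned text could then be fed to whichever script builds the URLs.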

Thanks t.spoon,

I think I may be saving out from Excel incorrectly. I have PM’d you a link to the file; thanks for having a look.

Well, this just got hard for me to help you with. I opened the script from your last post, downloaded the .csv you PM’d me, ran the script, and got:

The script completed.
146 out of 146 images were successfully downloaded to: ~/Downloads/Script Downloaded Images/

So being unable to reproduce the error, it’s hard to troubleshoot.

Are you clearing out the folder between runs? I’m not entirely sure what the shell script does in case of name conflicts with already downloaded files. If clearing out the folder helps, I can easily modify my script to be like ccstone’s and create a new folder for each run.
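
On name conflicts: curl's -o option normally just overwrites an existing file with the same name, so stale files should not block a download. If a fresh folder per run is still wanted, a shell sketch of the dated-folder idea could look like this (the function name make_run_folder is hypothetical; the naming pattern follows ccstone's script):

```shell
#!/bin/sh
# Sketch only: create a new timestamped folder under a parent directory and print its path.
make_run_folder() {
    parent=$1
    stamp=$(date '+%Y-%m-%d %H.%M.%S')
    folder="$parent/Downloaded Files - $stamp"
    # Print the path only if the folder was actually created (or already existed).
    mkdir -p "$folder" && printf '%s\n' "$folder"
}

# Example: make a run folder inside a scratch directory.
scratch=$(mktemp -d)
make_run_folder "$scratch"
```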

But I sort of doubt that’s the problem, and I’m afraid I’m short of ideas on what the problem could be. It really does get hard when the error isn’t reproducible.

Do the files download on your computer when you manually put the URL into a browser?

  • Tom.

Thanks for looking into this for me. I cleared all the folders and re-ran the script with no luck; I am still getting 0 of 146 images downloaded. I then copied individual URLs into my browser: the first two auto-downloaded and the third showed the image in the browser.

I connect through a proxy; do you think this could be an issue?

The odd thing is that if I revert to the older base URL http://i1.adis.ws/i/jpl/ it works fine, but the new URL http://jpl.a.bigcontent.io/v1/static/ is still not working :confused:

The reason for changing the URL is that the image file it returns has less compression, which is the version I need.

Yeah, I don’t know. I double-checked, and it is working fine for me with the base URL set to

It does seem particularly odd that the script doesn’t work for you with that base URL but does work for me with the same, but for other Base URL’s it works for both of us.

I’m just scratching my head here.

  • Tom.

We’re both using similar “do shell script” lines with curl for the actual downloading, so the results will probably be the same if you use ccstone’s script.

But since what we’re looking at seems to be a weird, inconsistent bug, I’d give his script a shot and see if you get a different result.

  • Tom.

Hi Tom,

Yeah, I’m confused by this one. I tried ccstone’s script but again it returned errors: it only brought back the last cell and skipped the rest.

If the script was modified to just read the CSV file with full links in the doc, do you think this might work? Basically, ditch the base URL function from the script.

Or am I clutching at straws?

Probably clutching at straws, but who knows. Computers can be weird.

Here it is expecting the full URL in the file.

I tested it with a few files you provided using the “http://jpl.a.bigcontent.io/v1/static/” URL beginning and the full URL in the .csv and it worked fine.

  • Tom.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
set delimitHolder to AppleScript's text item delimiters
repeat with aLine in dataLines
	set AppleScript's text item delimiters to ","
	copy text item 1 of aLine to the end of theURLs
end repeat
set AppleScript's text item delimiters to delimitHolder

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}

set delimitHolder to AppleScript's text item delimiters
repeat with imageURL in theURLs
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set downloadFilePath to saveFolder & imageName as text
	set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" == \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat
set AppleScript's text item delimiters to delimitHolder

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

Just to try to help troubleshoot… here’s a version that still uses the base file path, but it also logs a plain text file of all the URL’s it was trying to load to the downloaded images folder.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

set baseURL to "http://jpl.a.bigcontent.io/v1/static/"

set theSpreadsheet to choose file with prompt "Please select your csv file with the image URL's" of type {"csv"} without multiple selections allowed

tell application "Finder"
	set downloadsFolder to path to downloads folder
	set saveFolder to ((downloadsFolder as text) & "Script Downloaded Images:")
	if not (exists saveFolder) then make new folder at downloadsFolder with properties {name:"Script Downloaded Images"}
	set saveFolder to POSIX path of saveFolder
end tell

set spreadsheetData to read theSpreadsheet
set dataLines to the paragraphs of spreadsheetData
set theURLs to {}
set delimitHolder to AppleScript's text item delimiters

set dataForFile to ""

repeat with aLine in dataLines
	set AppleScript's text item delimiters to ","
	set newURL to baseURL & text item 1 of aLine
	copy newURL to the end of theURLs
	set dataForFile to dataForFile & newURL & return
end repeat

set dataSavePath to POSIX file (saveFolder & "URLs as Text.txt")
try
	close access file dataSavePath
end try
set fileReference to open for access (dataSavePath) with write permission
set eof fileReference to 0
write dataForFile to fileReference
close access fileReference


set AppleScript's text item delimiters to delimitHolder

set imageCount to the count of theURLs

set missingImages to {}
set downloadedImages to {}

set delimitHolder to AppleScript's text item delimiters
repeat with imageURL in theURLs
	set AppleScript's text item delimiters to "/"
	set imageName to the last text item of imageURL
	set downloadFilePath to saveFolder & imageName as text
	set shellOutput to (do shell script "
           
           image=\"" & (contents of imageURL as text) & "\"
           response=$(curl --write-out %{http_code} --silent --output /dev/null \"$image\")
           
           if [ \"$response\" == \"200\" ]; then
           curl \"$image\" -o \"" & downloadFilePath & "\"
           echo \"Image downloaded\"
           else
           echo \"Image does not exist\"
           fi
           
           ")
	
	if shellOutput is "Image does not exist" then
		copy imageURL to the end of missingImages
	else
		copy imageURL to the end of downloadedImages
	end if
end repeat
set AppleScript's text item delimiters to delimitHolder

set missingImageDialogText to ""
if (count of missingImages) > 0 then
	set missingImageDialogText to "The following images failed to download:"
	repeat with anImage in missingImages
		set missingImageDialogText to missingImageDialogText & return & anImage
	end repeat
end if
display dialog "The script completed." & return & (count of downloadedImages) & " out of " & imageCount & " images were successfully downloaded to: " & saveFolder & return & missingImageDialogText buttons {"Cancel", "OK"} default button "OK"

Take a look and see if the URL’s look right in the text file. Maybe post part of the text file to the forum, or PM me another dropbox link to what it saved so we can take a look; it might give me something to go on.
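
Stray bytes are easy to miss by eye, so one way to check the logged text file is to let grep list any line containing something other than plain printable ASCII. This is only a sketch: the function name find_suspect_lines is invented here, and the throwaway sample file stands in for the real log. The logging script above separates URLs with carriage returns, so the sketch converts those to newlines first.

```shell
#!/bin/sh
# Sketch only: print (with line numbers) every line that contains a byte
# outside printable ASCII, such as a BOM or a tab.
find_suspect_lines() {
    # Convert CR line endings to LF, then match any byte outside space..tilde.
    # -a treats the input as text, -n prefixes line numbers; LC_ALL=C matches raw bytes.
    LC_ALL=C tr '\r' '\n' < "$1" | LC_ALL=C grep -an '[^ -~]'
}

# Example with a throwaway file: entry 1 carries a UTF-8 BOM, entry 2 is clean.
sample=$(mktemp)
printf 'http://jpl.a.bigcontent.io/v1/static/\357\273\277bl_006773_a\rhttp://jpl.a.bigcontent.io/v1/static/bl_006806_a\r' > "$sample"
# Show only the line numbers of the suspect entries.
find_suspect_lines "$sample" | cut -d: -f1
rm -f "$sample"
```

If nothing is printed for the real log, hidden characters in the URLs are probably not the cause.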

Another possible step is for me to simplify the CURL line and return terminal errors to Applescript.
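
A simplified curl step along those lines might make one request instead of two and hand curl's own error text back to the caller. The sketch below relies on curl's --fail, --silent, --show-error, --location and --output options; the function name download_one and the message wording are illustrative only, not part of either script in this thread.

```shell
#!/bin/sh
# Sketch only: download one URL and report curl's own error text on failure.
download_one() {
    # $1 = URL, $2 = destination file path.
    # --fail       exit non-zero on HTTP errors such as 404
    # --silent     no progress meter
    # --show-error still print the error message despite --silent
    # --location   follow redirects
    if err=$(curl --fail --silent --show-error --location --output "$2" "$1" 2>&1); then
        echo "Image downloaded"
    else
        status=$?
        echo "Image failed (curl exit $status): $err"
        return 1
    fi
}

# Usage (performs a real request, so it is left commented out here):
# download_one 'http://jpl.a.bigcontent.io/v1/static/bl_006806_a' "$HOME/Downloads/bl_006806_a"
```

When something like this is called through do shell script, a non-zero exit status is raised as an AppleScript error, so the AppleScript side would need a try block, or the shell line would need to end with || true.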

Actually, come to think of it…

Please open terminal and at the command prompt paste:

and hit enter and then post the response in Terminal here. Maybe there will be a CURL error that can help zoom us in.