I need some help writing an AppleScript that will automatically download the PDF behind each of the many “View PDF” buttons on this Chrome page:
Essentially, the problem is that these links are not direct downloads where I can just right-click and save the PDF files. Instead they open the native Chrome viewer, which means I have to manually click every “View PDF” button in order to download the associated file.
I can temporarily set the Chrome preference to download the files instead of displaying the viewer first, but even then it still takes multiple steps each time to download a single PDF, multiplied by hundreds of buttons on dozens of pages from this site.
I am looking to automate the process so that it cycles through every “View PDF” button and downloads each file without any further input from me. In effect, the script would click the first “View PDF” button, perform the download, close the window, then move to the next “View PDF” button and repeat until every download for that page is complete.
I was curious how the approach suggested by Paul would work, so I roughed out a script that opens all of the PDFs in Safari. This worked well on my Sonoma computer. Pavilion would want to edit the script to save the PDFs individually instead of opening them, though.
use framework "Foundation"
use scripting additions

set theURL to "https://www.lakewoodnj.gov/agendas"
set urlOne to "https://www.lakewoodnj.gov"

-- fetch the page source; quoting keeps the shell from interpreting the URL
set sourceCode to (do shell script "curl " & quoted form of theURL)
delay 5 -- test different values

-- collect every data-src=...pdf match, then turn each into a full URL
set pdfURLs to getMatchingStrings(sourceCode)
set urlList to {}
repeat with aURL in pdfURLs
    set urlTwo to (aURL's stringByReplacingOccurrencesOfString:"data-src=\"" withString:"") as text
    set end of urlList to urlOne & (urlTwo as text)
end repeat

display dialog "A total of " & (count urlList) & " PDFs were found and will be opened in Safari"

-- open each PDF in a new tab of the front Safari window
tell application "Safari"
    activate
    tell window 1
        repeat with aURL in urlList
            set current tab to (make new tab with properties {URL:aURL})
        end repeat
    end tell
end tell

-- returns an array of the substrings of theString that match the pattern
on getMatchingStrings(theString)
    set theString to current application's NSString's stringWithString:theString
    set thePattern to "data-src=.*\\.pdf"
    set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
    set regexResults to theRegex's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
    set theRanges to (regexResults's valueForKey:"range")
    set theMatches to current application's NSMutableArray's new()
    repeat with aRange in theRanges
        (theMatches's addObject:(theString's substringWithRange:aRange))
    end repeat
    return theMatches
end getMatchingStrings
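As a rough sketch of the "save instead of open" edit mentioned above, the Safari block could be swapped for a curl loop along these lines. The handler names downloadCommand and savePDFs and the example file name are hypothetical, and destFolder is assumed to end with a slash, as the path returned by "path to downloads folder" does.

```applescript
use scripting additions

-- Builds the curl command that saves one URL into destFolder.
-- Assumption: destFolder is a POSIX path ending in "/".
on downloadCommand(aURL, destFolder)
    set tid to AppleScript's text item delimiters
    set AppleScript's text item delimiters to "/"
    set fileName to last text item of aURL -- e.g. "agenda.pdf" (hypothetical name)
    set AppleScript's text item delimiters to tid
    -- -L follows redirects; -o names the output file
    return "curl -L -o " & quoted form of (destFolder & fileName) & " " & quoted form of aURL
end downloadCommand

-- Downloads every URL in urlList into destFolder, one curl call per file.
on savePDFs(urlList, destFolder)
    repeat with aURL in urlList
        -- contents of resolves the loop reference to plain text
        do shell script downloadCommand(contents of aURL, destFolder)
    end repeat
end savePDFs

-- In the script above, this call would take the place of the Safari block:
-- savePDFs(urlList, POSIX path of (path to downloads folder))
```

Building the command in its own handler makes it easy to log or inspect the exact shell line before anything is downloaded.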
Here’s my version which doesn’t use a web browser.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

on run
    local mySource, aFile, tid
    set mySource to do shell script "curl https://www.lakewoodnj.gov/agendas"
    set tid to text item delimiters
    -- split the page source at every data-src=" attribute
    set text item delimiters to {"data-src=\""}
    set mySource to text items of mySource
    -- within each chunk, text item 1 is now everything up to the closing quote
    set text item delimiters to {"\""}
    repeat with aFile in mySource
        set aFile to contents of aFile
        if aFile contains ".pdf" then
            do shell script "cd ~ ; curl -O " & quoted form of ("https://www.lakewoodnj.gov/" & text item 1 of aFile)
        end if
    end repeat
    set text item delimiters to tid
end run
The “~” will make the script save to the user’s home directory. You can change this to a hard-coded POSIX path if you wish.
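A minimal sketch of the hard-coded path variant, assuming a destination folder whose name contains a space. The helper name cdPrefixFor and the folder "/Users/Shared/Agenda PDFs" are hypothetical. Using quoted form keeps the shell from splitting the path at the space, and mkdir -p creates the folder if it is missing.

```applescript
use scripting additions

-- Returns a shell prefix that creates saveFolder if needed and changes into it.
-- cdPrefixFor is a hypothetical helper, not part of the script above.
on cdPrefixFor(saveFolder)
    set q to quoted form of saveFolder
    return "mkdir -p " & q & " && cd " & q & " && "
end cdPrefixFor

-- Hypothetical hard-coded destination
set cdPrefix to cdPrefixFor("/Users/Shared/Agenda PDFs")

-- In the loop, cdPrefix would then replace the "cd ~ ; " text:
-- do shell script cdPrefix & "curl -O " & ...
```

Joining the steps with && rather than ; means curl only runs if the folder change succeeded.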
I’ve always found the opposite to be true.
When you use a referenced variable, it gets dereferenced temporarily each time the variable is used.
That line dereferences it permanently, for the lifetime of its contents.
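To illustrate what dereferencing means here, assuming the line in question is "set aFile to contents of aFile": in a "repeat with ... in" loop the loop variable holds a reference such as "item 1 of fileNames", not the text itself. One place the difference shows up is an equality test, which compares the reference itself unless it is resolved with "contents of". The handler name countMatches and the sample file names are hypothetical.

```applescript
-- Counts how often target is found with and without resolving the loop reference.
-- Returns {directHits, resolvedHits}.
on countMatches(fileNames, target)
    set directHits to 0
    set resolvedHits to 0
    repeat with aName in fileNames
        -- aName is a reference to a list item, so this comparison is false
        if aName = target then set directHits to directHits + 1
        -- contents of resolves the reference to the text it points at
        if (contents of aName) = target then set resolvedHits to resolvedHits + 1
    end repeat
    return {directHits, resolvedHits}
end countMatches

countMatches({"agenda.pdf", "minutes.pdf"}, "agenda.pdf") -- {0, 1}
```

Resolving the reference once at the top of the loop, as the script above does, avoids having to remember this at every later use of the variable.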