If I load the Facebook start page in Safari and copy-paste the source code into a text editor, it is 3 020 305 bytes/characters (according to BBEdit). If I then scroll down to the bottom ten times to load more content and copy-paste the source again, it is STILL 3 020 305, despite much more content having been loaded.
Can someone explain this?
I am writing a script where I want to know whether I have reached the end of a feed such as the Facebook feed. One way to detect this would have been to compare the length of the source property, but it doesn’t seem to update when I load more content, neither in the browser nor in the AppleScript source property. Is there some way around this?
A simplified version of my code:
tell application "Safari"
set oldSource to ""
set newSource to ""
open location myURL
tell document 1 to repeat 15 times
set newSource to source
if newSource = oldSource then
exit repeat
else
set oldSource to newSource
end if
do JavaScript JSJumpToEnd
delay 0.5
end repeat
end tell
Perhaps the content is already loaded but you don’t see it because you haven’t scrolled to it yet. Or Safari’s source property is simply not updated when the DOM changes. In that case, I’d use (or rather, try to use) a JS event handler for DOMContentLoaded. Or read the content with JS after scrolling and waiting a bit.
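A minimal sketch of the "read the content with JS" idea, assuming the page keeps previously loaded posts in the DOM. Serializing the live DOM with `outerHTML` reflects what is currently in the document, so its length can be compared between scrolls. The helper names `liveDomLength` and `feedStoppedGrowing` are made up for illustration; the document is passed in as a parameter so the functions can also be tried outside a browser.

```javascript
// Hypothetical helper: length of the serialized live DOM.
// Unlike a saved copy of the originally delivered HTML, outerHTML is
// generated from the current DOM tree each time it is read.
function liveDomLength(doc) {
  return doc.documentElement.outerHTML.length;
}

// Hypothetical helper: true when the serialized DOM did not grow
// between two consecutive samples.
function feedStoppedGrowing(previousLength, currentLength) {
  return currentLength <= previousLength;
}

// Inside a browser, log the current value. From AppleScript, the
// expression "document.documentElement.outerHTML.length" could be
// passed to Safari's do JavaScript command instead.
if (typeof document !== "undefined") {
  console.log(liveDomLength(document));
}
```

A single unchanged sample is not proof that the feed has ended, since the page may simply not have finished loading, so it is probably worth sampling again after a pause.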
tell application "Safari"
set oldHeight to 0
set newHeight to 0
set loopDelay to 1
set noLoops to 0
activate
open location myURL
try
tell document 1 to repeat
set noLoops to noLoops + 1
set newHeight to do JavaScript JSDocumentHeight
if newHeight = oldHeight then
tell application "System Events"
set frontmostApplicationName to name of 1st process whose frontmost is true
end tell
activate
tell application frontmostApplicationName to activate
delay loopDelay * 5
set newHeight to do JavaScript JSDocumentHeight
if newHeight ≠ oldHeight then
set loopDelay to loopDelay + 1
else
exit repeat
end if
else
set oldHeight to newHeight
end if
do JavaScript JSJumpToEnd
delay 1
do JavaScript JSJumpToEnd
delay loopDelay
end repeat
on error errMsg
log "Safari - errMsg: " & errMsg
do shell script "kill " & caffeinatePid
end try
end tell
I use this to scroll all the way down in Facebook feeds.
Basically, I jump to the end of the document, measure the height of the page, wait a few seconds, jump to the end again, measure the height again and compare the result. If the height has changed, I run the loop again.
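The script does not show what JSJumpToEnd and JSDocumentHeight contain. The following is only a guess at what such snippets might look like, not the actual strings used above. The function names are hypothetical, and the window and document are passed in as parameters so the code can be tried outside Safari.

```javascript
// Hypothetical equivalent of JSDocumentHeight: the larger of the body
// and root element scroll heights, in pixels.
function documentHeight(doc) {
  return Math.max(doc.body.scrollHeight, doc.documentElement.scrollHeight);
}

// Hypothetical equivalent of JSJumpToEnd: scroll to the bottom of the
// page, which is what normally triggers loading of more feed content.
function jumpToEnd(win, doc) {
  win.scrollTo(0, documentHeight(doc));
}
```

In the AppleScript itself, the same statements would be stored as strings and handed to do JavaScript, with `window` and `document` used directly.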
If the height hasn’t changed, which is common because the network or Safari regularly stalls when the page is very long (it is often 20 000+ px high), I make a long pause and try again. If that succeeds, I increase the delay the script waits for the page to load and then loop again.
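The control flow described here can be sketched independently of Safari. This is an illustration under assumptions, not the script above: `measure`, `jump`, and `pause` are hypothetical injected functions, `pause` is treated as synchronous for simplicity, and `maxLoops` is an added safety cap. Unlike the AppleScript, the sketch also stores the new height right after a successful retry.

```javascript
// Scroll until the page height stops changing, backing off when it stalls.
// measure(): returns the current page height
// jump():    scrolls to the end of the page
// pause(s):  waits s seconds (synchronous here, for simplicity)
function scrollUntilStable(measure, jump, pause, maxLoops = 500) {
  let oldHeight = 0;
  let loopDelay = 1;
  let loops = 0;
  while (loops < maxLoops) {
    loops += 1;
    let newHeight = measure();
    if (newHeight === oldHeight) {
      // Possibly a stall: wait much longer and measure once more.
      pause(loopDelay * 5);
      newHeight = measure();
      if (newHeight === oldHeight) {
        break; // still unchanged, assume the end of the feed
      }
      // The page was only slow, so wait longer on every later loop.
      loopDelay += 1;
    }
    oldHeight = newHeight;
    jump();
    pause(1);
    jump();
    pause(loopDelay);
  }
  return { loops, loopDelay, finalHeight: oldHeight };
}
```

Passing the three operations in as functions keeps the stall-handling logic separate from the browser, so it can be exercised against a simulated page before it is run against a real feed.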
I also make sure that Safari is in front, because otherwise it is deprioritized and fails to load pages.
So far this is pretty stable, but it can easily take an hour to jump down 250 times.
Depending on the size of the web page, the result may appear only after some delay. Safari’s source property has long been unreliable: it fails to refresh to the new source code immediately after a new page is opened.
I still have old macOS systems with correspondingly old Safari versions (Safari 9 and even Safari 5). There, using source is buttery smooth and blazingly fast. It’s not the case with Safari 14.