This is very hard to describe over text, but I will do my best. I should also preface this by saying that I’m a novice, so forgive me if I’m not using proper terminology. Here goes:
I made a web scraper for work, and one of the items I’m scraping isn’t constant. It is always one of the following: a) two different numbers, b) the same number twice, c) only one number, or d) no number at all.
If two numbers are present, I’m required to scrape the larger value. Here’s that part of the script:
try
    set Num1 to ScrapeByClass("html_class", 0) as number
    set Num2 to ScrapeByClass("html_class", 1) as number
    -- Both numbers scraped: keep the larger (or either one if they are equal)
    if Num2 is greater than Num1 then
        set varNum to Num2
    else
        set varNum to Num1
    end if
on error
    -- Reached when a scrape or a coercion above fails
    try
        set varNum to Num1
    on error
        set varNum to "N/A"
    end try
end try
return varNum
NOTE: ScrapeByClass is a handler that uses JavaScript to scrape a webpage:
to ScrapeByClass(elementClass, num)
    tell application "Safari" to do JavaScript "document.getElementsByClassName('" & elementClass & "')[" & num & "].innerHTML;" in document 1
end ScrapeByClass
So I start with scenario a), where there are two different numbers, and use an if statement to pick the larger value. If the two are the same (scenario b), it just picks one of them. In scenario c), where only one number is given, the script drops into the on error block and simply uses the single scraped value. Lastly, if no number is given (scenario d), it sets the variable to “N/A” as the final result.
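To make the intended behavior concrete, here is a minimal sketch of the four scenarios in plain JavaScript. This is not the script I’m running: `pickLargest` is a made-up name, and it assumes the scraped values arrive as an array of strings (one per matching element).

```javascript
// Hypothetical helper, not part of the original script.
// Takes the raw strings scraped for one class name and returns the
// largest numeric value, or "N/A" when no number is present.
function pickLargest(rawValues) {
  const numbers = rawValues
    .map((text) => Number.parseFloat(String(text).trim())) // "12" -> 12
    .filter((n) => !Number.isNaN(n)); // drop anything non-numeric

  if (numbers.length === 0) {
    return "N/A"; // scenario d) no number at all
  }
  return Math.max(...numbers); // scenarios a), b) and c)
}

console.log(pickLargest(["3", "7"])); // a) two different numbers -> 7
console.log(pickLargest(["5", "5"])); // b) same number twice -> 5
console.log(pickLargest(["4"]));      // c) only one number -> 4
console.log(pickLargest([]));         // d) no number -> "N/A"
```

In the browser, the input could presumably be built with something like `Array.from(document.getElementsByClassName("html_class"), (el) => el.innerHTML)`, but I have not wired that into the AppleScript handler.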
The script seems to work fine, with the exception of one issue (or bug?). If I run the script once, scrape a value, and print it, and then run the script again on a page in scenario d) with no numbers, it still prints the last scraped value even though nothing is present on the webpage. This does not happen when the very first run of the script is on a scenario d) page with no numbers.
After spending a couple of hours messing around, I noticed that when I remove “as number” at the beginning of my script (lines 2 and 3), the problem no longer occurs. However, because the web scraper handler returns a string, and comparing two strings doesn’t give a reliable numeric “larger” value, the if condition is then broken. So I’m stuck in between.
Has anyone ever experienced something like this?
Initially I thought it was a browser caching issue because of the handler. However, that is not the case: I have tried completely clearing Safari’s browser data and cache, and the script still remembers the last scraped value when the webpage doesn’t have one. Also, in my research, I learned that you cannot declare a variable as undefined in AppleScript, so it’s not possible to do something like "if variable is null/nil/"" then …".
I’m at a loss as to how AppleScript is “memorizing” a value when the script never involves any data structure or storage.
Any insight or advice is greatly appreciated!