I’m racking my brain with this and getting nowhere, so I hope the experts can help.
What I want to do is parse the source of a specific webpage, which changes periodically, for a string of text and use the result to generate a URL that gets loaded automatically in a web browser.
For example, let’s say the HTML source of the page in question contains the string “soup of the day is tomato and tomorrow will be”. The variable data I’m looking to extract periodically is the string of text between “of the day is ” and “ and tomorrow will” (in this case, “tomato”). I then want to take that string and insert it into a URL such as “http://www.example.com/sotd.php?s=tomato” to be opened periodically (once every 24 hours on an unattended kiosk, for example). While the string in the middle changes, the surrounding text is always the same and is unique within the page source.
A critical limitation is that the site providing the page I want to parse requires cookies to be set in order to get at the necessary page. As far as I can tell, that rules out the use of command-line options such as curl.
In other words, I need a script to do this:
1. Periodically load a specific web page and save the HTML source or otherwise make it available for parsing. This page requires cookies and cannot be accessed (to my knowledge) via curl.
2. Parse the source for a variable string of text bounded by constant text.
3. Append the variable string to a URL and periodically load said URL in a web browser.
Is this possible to do within AppleScript? If someone can show me how, I’d be VERY grateful!
Most likely you can do that with AppleScript.
Because parsing HTML text always requires an individual solution, it’s nearly impossible to help in detail without seeing the page’s source code.
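For the fetching step, one option that sidesteps the cookie question is to let the browser load the page and then read its source. Here is a rough sketch, assuming Safari is the kiosk browser, that pageURL (a made-up address here) is the page to parse, and that a fixed 10-second delay is long enough for it to load. Treat it as a starting point rather than a finished solution.

```applescript
-- Assumption: placeholder address; replace with the real page
set pageURL to "http://www.example.com/menu.php"

tell application "Safari"
	-- Open the page in a new window so Safari handles the cookies itself
	make new document with properties {URL:pageURL}
	-- Crude wait for the page to finish loading; adjust as needed
	delay 10
	-- Read the HTML source of the frontmost document
	set htmlText to source of front document
end tell

return htmlText
```

It may also be worth a second look at curl: its -b and -c options send and store cookies, so a call through do shell script could work if the site only needs ordinary cookies.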
Once you figure out how to get the web page’s text, you can use the code below on it. As StefanK said, parsing HTML is “an individual solution”, so this may or may not work as expected. It mainly depends on the firstDelimiter variable: is that text unique within the page, and if not, where does it first occur? The code takes whatever follows the first occurrence.
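-- initial parameters
set htmlText to "soup of the day is tomato and tomorrow will be"
set firstDelimiter to "of the day is "
set secondDelimiter to " and tomorrow will"
-- remember the current delimiters so they can be restored afterwards
set savedDelimiters to AppleScript's text item delimiters
-- take the text that follows the first occurrence of firstDelimiter
set AppleScript's text item delimiters to firstDelimiter
set interimText to text item 2 of htmlText
-- keep only the part before secondDelimiter
set AppleScript's text item delimiters to secondDelimiter
set finalText to text item 1 of interimText
-- restore the original delimiters
set AppleScript's text item delimiters to savedDelimiters
return finalText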
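Since the page changes over time, it may be safer to wrap the parsing in a handler that copes with a missing delimiter instead of stopping with an error. A minimal sketch follows; the handler name extractBetween and its parameter names are made up for illustration.

```applescript
-- Hypothetical helper: returns the text between startMarker and endMarker,
-- or missing value if either marker is not found in sourceText.
on extractBetween(sourceText, startMarker, endMarker)
	set savedDelimiters to AppleScript's text item delimiters
	set foundText to missing value
	try
		set AppleScript's text item delimiters to startMarker
		-- If startMarker is absent there is no text item 2, which raises an error
		set interimText to text item 2 of sourceText
		set AppleScript's text item delimiters to endMarker
		-- More than one item means endMarker occurs after startMarker
		if (count of text items of interimText) > 1 then
			set foundText to text item 1 of interimText
		end if
	end try
	-- Always restore the delimiters, even when a marker was missing
	set AppleScript's text item delimiters to savedDelimiters
	return foundText
end extractBetween

extractBetween("soup of the day is tomato and tomorrow will be", "of the day is ", " and tomorrow will")
```

With the sample text this returns "tomato". The caller can check for missing value and skip opening the URL on days when the page layout has changed.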
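Once you have that, you can append it to a URL and open it in the default browser like this:
-- the fixed part of the URL, ending where the extracted text goes
set baseURL to "http://www.example.com/sotd.php?s="
-- placeholder for the text extracted above
set finalText to "tomato"
set finalURL to baseURL & finalText
open location finalURL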
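For the "once every 24 hours" part, one approach is a stay-open script application with an idle handler, which returns the number of seconds to wait before it is called again. The sketch below makes some assumptions: getSoupOfTheDay is a made-up stand-in for the fetch-and-parse steps, encodeSpaces is a made-up helper that only handles spaces (not full URL encoding), and the script is saved from Script Editor as an application with "Stay open after run handler" checked.

```applescript
property baseURL : "http://www.example.com/sotd.php?s="

-- Hypothetical stand-in for the fetch-and-parse steps; returns fixed sample text here
on getSoupOfTheDay()
	return "tomato"
end getSoupOfTheDay

-- Hypothetical helper: replaces spaces with %20 so the text is safe in a URL
on encodeSpaces(rawText)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " "
	set parts to text items of rawText
	set AppleScript's text item delimiters to "%20"
	set encodedText to parts as text
	set AppleScript's text item delimiters to savedDelimiters
	return encodedText
end encodeSpaces

-- Called repeatedly while the stay-open application is running
on idle
	set soupName to getSoupOfTheDay()
	if soupName is not missing value then
		open location (baseURL & encodeSpaces(soupName))
	end if
	-- Number of seconds until the next idle call (24 hours)
	return 24 * 60 * 60
end idle
```

Adding the application to the kiosk account's login items would start the cycle again after a restart.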