I’m trying to pull data from a website’s Google tags. I’d like to retrieve vendorBrand, productName and productCategory from the following HTML:
<!DOCTYPE html>
<html lang="en" class="no-js">
<head>
<!-- BEGIN: Google Tag Manager -->
<script>
var dataLayer = [];
function gtag() {
dataLayer.push(arguments);
}
dataLayer.push({
"pageType": "product",
"productNumber": "123456",
"vendorBrand": "Brand Name Here",
"productName": "Short Product Description Here",
"productCategory": "Product Category Lives Here",
"subcategory": "Subcategory Lives Here",
"department": "Product Department Here",
});
dataLayer.push({
"event": "crto_productpage",
"crto": {
"email": "",
"products": ["123456"]
}
});
</script>
<!-- Google Tag Manager -->
<script>
(function(w, d, s, l, i) {
w[l] = w[l] || [];
w[l].push({
'gtm.start':
new Date().getTime(),
event: 'gtm.js'
});
var f = d.getElementsByTagName(s)[0],
j = d.createElement(s),
dl = l != 'dataLayer' ? '&l=' + l : '';
j.async = true;
j.src =
'https://www.googletagmanager.com/gtm.js?id=' + i + dl;
f.parentNode.insertBefore(j, f);
})(window, document, 'script', 'dataLayer', 'GTM-0000000');
</script>
<!-- End Google Tag Manager -->
With the script below, I was able to retrieve the above HTML and display it in an alert. Now I just need to figure out how to parse vendorBrand, productName and productCategory into separate variables. I’m going to assign them as keywords to product images once I figure this out.
tell application "Safari" -- tells AppleScript to use Safari
do JavaScript "var url = 'https://www.websitename.com/products/123456'
var xhr = new XMLHttpRequest();
xhr.open('GET', url, false);
//WAIT
xhr.send();
var data = xhr.responseText;
alert(data);" in document 1
end tell -- stops using safari
Is this something I’d be able to put in the AppleScript/JavaScript code I already have? It’s going to be part of a larger script where I drop a product image on a droplet, find the product number in the image filename, open the product page for that image, store some of the tags in variables and then assign those variables as keywords.
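One way the parsing step could look on the JavaScript side is sketched below. This is only a rough illustration, not a tested solution for the real site: `extractTags` is a hypothetical helper name, the `sample` string stands in for `xhr.responseText`, and the regular expression assumes each value is a plain double-quoted string with no escaped quotes inside it.

```javascript
// Hypothetical helper: pulls selected "key": "value" pairs out of raw page source.
// Assumes keys are simple identifiers and values contain no escaped quotes.
function extractTags(html, keys) {
  var result = {};
  keys.forEach(function (key) {
    // Matches "key": "value", allowing optional whitespace around the colon.
    var re = new RegExp('"' + key + '"\\s*:\\s*"([^"]*)"');
    var m = re.exec(html);
    // Fall back to an empty string when the tag is not present.
    result[key] = m ? m[1] : '';
  });
  return result;
}

// Stand-in for xhr.responseText, copied from the HTML in the question.
var sample = [
  'dataLayer.push({',
  '"pageType": "product",',
  '"productNumber": "123456",',
  '"vendorBrand": "Brand Name Here",',
  '"productName": "Short Product Description Here",',
  '"productCategory": "Product Category Lives Here",',
  '});'
].join('\n');

var tags = extractTags(sample, ['vendorBrand', 'productName', 'productCategory']);
```

If this were pasted into the AppleScript `do JavaScript` string, every double quote and backslash would need escaping (`\"` and `\\`), and the three values would still have to be handed back to AppleScript, for example as one delimited string.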
I tried using a shell script with curl and discovered the website didn’t like that: it returned the source of its bot-interruption page. I’m posting it anyway in case it works for someone else on a different site.
do shell script "curl " & quoted form of site & " | textutil -stdin -stdout -format html -convert txt -encoding UTF-8 "
I got the following to work, though it’s not very elegant.
tell application "Safari"
activate
--Getting the page source
set WebPage to the source of document 1
--Getting just the Google Tags
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "});"
set gTagsPart to first text item of WebPage
set AppleScript's text item delimiters to "dataLayer.push({"
set gTags to last text item of gTagsPart -->all the Google Tags
set AppleScript's text item delimiters to "\","
set gTagPart to every text item of gTags -->Each individual part (ie. "productCategory":"Bathroom Accessories)
set AppleScript's text item delimiters to tid
--Declaring variables as empty text so the dialog at the end works even when a tag is missing
set productCategory to ""
set categoryLevel4 to ""
set categoryLevel3 to ""
set categoryLevel2 to ""
set categoryLevel1 to ""
set theCategory to ""
set subCategory to ""
set theDepartment to ""
--Repeating with every possible Google Tag
repeat with possGTag in gTagPart
--Storing original delimiters in variable
set tid to AppleScript's text item delimiters
--Isolating the categories, subcategories and departments
--(ie. we need to get rid of "productCategory":" in the tag "productCategory":"Bathroom Accessories)
if possGTag contains "productCategory\":\"" then
set AppleScript's text item delimiters to "productCategory\":\""
set productCategory to last text item of possGTag
else if possGTag contains "catLevel4\":\"" then
set AppleScript's text item delimiters to "catLevel4\":\""
set categoryLevel4 to last text item of possGTag
else if possGTag contains "catLevel3\":\"" then
set AppleScript's text item delimiters to "catLevel3\":\""
set categoryLevel3 to last text item of possGTag
else if possGTag contains "catLevel2\":\"" then
set AppleScript's text item delimiters to "catLevel2\":\""
set categoryLevel2 to last text item of possGTag
else if possGTag contains "catLevel1\":\"" then
set AppleScript's text item delimiters to "catLevel1\":\""
set categoryLevel1 to last text item of possGTag
--Checking "subcategory" before "category", because a "subcategory" tag also contains the text "category"
else if possGTag contains "subcategory\":\"" then
set AppleScript's text item delimiters to "subcategory\":\""
set subCategory to last text item of possGTag
else if possGTag contains "category\":\"" then
set AppleScript's text item delimiters to "category\":\""
set theCategory to last text item of possGTag
else if possGTag contains "department\":\"" then
set AppleScript's text item delimiters to "department\":\""
set theDepartment to last text item of possGTag
end if
--Resetting delimiters
set AppleScript's text item delimiters to tid
end repeat
--Displaying test dialog
display dialog productCategory & return & categoryLevel4 & return & categoryLevel3 & return & categoryLevel2 & return & categoryLevel1 & return & theCategory & return & subCategory & return & theDepartment & return
end tell
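Since the page is already open in Safari at this point, a possibly tidier route is to skip the text splitting and read the page’s own `dataLayer` array through `do JavaScript`. The sketch below is untested against the real site and rests on some assumptions: that the page still exposes `dataLayer` as a global once loaded, and that the tags sit in a plain object pushed onto it, as in the HTML above. `findInDataLayer` is a hypothetical helper name, and the array here is a stand-in copied from the question so the snippet runs on its own.

```javascript
// Hypothetical helper: returns the value of `key` from the first entry in a
// GTM-style dataLayer array that has it, or an empty string if none does.
function findInDataLayer(layer, key) {
  for (var i = 0; i < layer.length; i++) {
    var entry = layer[i];
    if (entry && typeof entry === 'object' &&
        Object.prototype.hasOwnProperty.call(entry, key)) {
      return entry[key];
    }
  }
  return '';
}

// Stand-in for the page's global dataLayer, copied from the HTML in the question.
var dataLayer = [
  {
    "pageType": "product",
    "productNumber": "123456",
    "vendorBrand": "Brand Name Here",
    "productName": "Short Product Description Here",
    "productCategory": "Product Category Lives Here",
    "subcategory": "Subcategory Lives Here",
    "department": "Product Department Here"
  },
  { "event": "crto_productpage", "crto": { "email": "", "products": ["123456"] } }
];

var brand = findInDataLayer(dataLayer, 'vendorBrand');
var category = findInDataLayer(dataLayer, 'productCategory');
```

In Safari, only the function and the lookups would go into the `do JavaScript` string, since the page supplies `dataLayer` itself. Looking values up by exact key also avoids the problem of one tag name being a substring of another.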