Here is my XML file:
<?xml version="1.0" encoding="UTF-16"?>
<8086:293e>
<type>Audio</type>
<name>Some Name Here</name>
<displayname>Some Display Name Here</displayname>
<download>http://somesite.com</download>
</8086:293e>
How would I use AppleScript to parse the XML without any third-party add-ons? Basically I want to save all the data from the "<8086:293e>" element into separate variables (e.g. type, name, displayname, download). I've already read the Dictionary for this but I can't make anything out of it.
Help is much appreciated
Thanks
Last edited by wizboyx86 (2008-07-18 01:50:09 pm)
Assuming it's a text file like what you've shown (I put mine on my desktop) then:
Applescript:
set P to paragraphs 3 thru 6 of (read alias "ACB-G5_Leopard:Users:bell:Desktop:xmlFile")
set tid to AppleScript's text item delimiters
set tData to {}
-- after the loop below, tData is {"Audio", "Some Name Here", "Some Display Name Here", "http://somesite.com"}
repeat with oneP in P
set AppleScript's text item delimiters to ">"
set aPart to text item 2 of oneP
set AppleScript's text item delimiters to "<"
set end of tData to text item 1 of aPart
end repeat
set AppleScript's text item delimiters to tid
OK, but what if I had different elements? Say I wanted to get the values of "<8086:3333>" rather than "<8086:293e>". Would there be a way to specify which element to extract them from?
Check out the XML section of the System Events dictionary. There is some basic XML support built right into System Events.
Applescript:
tell application "System Events"
tell XML element 1 of contents of XML file XMLfile -- This targets the main outer tag, <8086:293e> in your case
set typeText to (value of (XML elements whose name is "type")) as string
set nameText to (value of (XML elements whose name is "name")) as string
end tell
end tell
I didn't check this so I hope there aren't errors in the syntax. That should give you a good start at least.
If you want to get more complicated (like writing out an updated XML file after you change the data), you might want to get the XML and XSLT scripting additions here: http://www.latenightsw.com/freeware/.
I tried that, but when I tried to run it I got:
Can’t make «class valL» of every «class xmle» of «class xmle» 1 of contents of «class xmlf» "/xmlfile.xml" of application "System Events" whose name = "type" into type string.
EDIT: OK so I have SOME progress. Here is the XML:
Code:
<?xml version="1.0" encoding="UTF-16"?>
<test>
<type>Audio</type>
<name>Some Name</name>
<displayname>Some Display Name</displayname>
<download>http://somesite.com</download>
</test>
And the AppleScript:
Applescript:
set XMLfile to "Leopard:Users:pcwiz:Desktop:xmlfile.xml"
tell application "System Events"
tell XML element "test" of contents of XML file XMLfile
set typeText to (value of XML element "type")
set nameText to (value of XML element "name")
end tell
end tell
That works fine. However, when I change the name from "test" to "8086:293e" and change "tell XML element "test"" in AppleScript to "tell XML element "8086:293e"" it gives me this error:
System Events got an error: Can’t get XML element "8086:293e" of contents of XML file "Leopard:Users:pcwiz:Desktop:xmlfile.xml"
Any ideas?
Last edited by wizboyx86 (2008-07-18 03:48:26 pm)
To me, XML or HTML or whatever is just one big long string, so I don't need special XML commands to extract information from it. I just use repeat loops (looking for the characters I'm interested in) to get the information I want. The following will get all of the values and place them in a list for you.
Applescript:
set tt to "<?xml version=\"1.0\" encoding=\"UTF-16\"?>
<8086:293e>
<type>Audio</type>
<name>Some Name Here</name>
<displayname>Some Display Name Here</displayname>
<download>http://somesite.com</download>
</8086:293e>"
-- put the xml into a list
set ttParas to paragraphs of tt
-- we cycle through the list and extract the portion between the tag we're interested in
-- we know that the tags we want are between "<8086:" and "</8086:"
set tagValueList to {}
repeat with i from 1 to count of ttParas
if (item i of ttParas) begins with "<8086:" then -- first we find the tag we're interested in ("begins with" is safe on lines shorter than the tag)
repeat with j from (i + 1) to count of ttParas
if (item j of ttParas) begins with "</8086:" then exit repeat -- we exit the loop when we hit the end tag
set end of tagValueList to item j of ttParas
end repeat
exit repeat
end if
end repeat
-- return tagValueList
-- next we extract each value from the found tags knowing that the value is between the ">" and "<" characters
set theTagValues to {}
repeat with i from 1 to count of tagValueList
set thisTag to item i of tagValueList
set thisTagValue to ""
repeat with j from 1 to count of thisTag
if item j of thisTag is ">" then
repeat with k from (j + 1) to count of thisTag
if item k of thisTag is "<" then exit repeat
set thisTagValue to thisTagValue & item k of thisTag
end repeat
exit repeat
end if
end repeat
set end of theTagValues to thisTagValue
end repeat
theTagValues
wizboyx86 wrote:
However, when I change the name from "test" to "8086:293e" and change "tell XML element "test"" in AppleScript to "tell XML element "8086:293e"" it gives me this error:
System Events got an error: Can’t get XML element "8086:293e" of contents of XML file "Leopard:Users:pcwiz:Desktop:xmlfile.xml"
Any ideas?
Hi.
It seems that "8086:293e", which not only begins with a numeral but also contains a colon, is an invalid name for an XML tag, at least as far as System Events is concerned. (The XML specification itself does not allow an element name to begin with a digit, and the colon is reserved for namespace prefixes.) Is it of your own devising or has it come from some application? If you're stuck with it, you'll have to use regulus's text parsing idea.
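For what it's worth, here is a minimal sketch of that text-parsing route which also lets you say which element you want (e.g. "8086:3333" instead of "8086:293e"). The handler names textBetween and valuesInElement are made up for this example, and it assumes plain tags with no attributes and no nesting of the same tag, so treat it as a starting point rather than a real XML parser.

```applescript
-- Hypothetical helper: returns the text between the first startMarker
-- and the next endMarker, or "" if either marker is missing.
on textBetween(sourceText, startMarker, endMarker)
	set savedDelims to AppleScript's text item delimiters
	set foundText to ""
	try
		set AppleScript's text item delimiters to startMarker
		if (count of text items of sourceText) > 1 then
			-- everything after the first startMarker
			set afterStart to text from text item 2 to text item -1 of sourceText
			set AppleScript's text item delimiters to endMarker
			if (count of text items of afterStart) > 1 then set foundText to text item 1 of afterStart
		end if
	end try
	-- always put the delimiters back, even if something went wrong above
	set AppleScript's text item delimiters to savedDelims
	return foundText
end textBetween

-- Hypothetical helper: finds <elementName>...</elementName> in xmlText and
-- returns the values of the named child tags, in the order requested.
on valuesInElement(xmlText, elementName, childNames)
	set blockText to textBetween(xmlText, "<" & elementName & ">", "</" & elementName & ">")
	set foundValues to {}
	repeat with childName in childNames
		set tagName to contents of childName
		set end of foundValues to textBetween(blockText, "<" & tagName & ">", "</" & tagName & ">")
	end repeat
	return foundValues
end valuesInElement

set sampleXML to "<?xml version=\"1.0\" encoding=\"UTF-16\"?>
<8086:293e>
<type>Audio</type>
<name>Some Name Here</name>
<displayname>Some Display Name Here</displayname>
<download>http://somesite.com</download>
</8086:293e>"

set {typeText, nameText, displayText, downloadText} to valuesInElement(sampleXML, "8086:293e", {"type", "name", "displayname", "download"})
--> typeText is "Audio", nameText is "Some Name Here", and so on
```

Asking for an element that isn't in the text just gives back empty strings, so check for "" before using the values.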
An attribute value can contain those characters. If you can't rename the tag so that it starts with a non-numeral and has no colon, something like this will work. I tested this (I am in Tiger):
<?xml version="1.0"?>
<Main title="8086:293e">
<type>Audio</type>
<name>Some Name</name>
<displayname>Some Display Name</displayname>
<download>http://somesite.com</download>
</Main>
Applescript:
set XMLfile to "Macintosh HD:Users:myUser:Desktop:TestXML.xml"
tell application "System Events"
tell XML element 1 of contents of XML file XMLfile
set typeText to (value of XML element "type")
set nameText to (value of XML element "name")
end tell
end tell
Last edited by Matt-Boy (2008-07-21 09:38:02 am)
Alright. I have been experimenting with this some more. I think you would be best structuring your XML file like this:
<?xml version="1.0"?>
<Main>
<DataElement DataValue="8086:293e">
<type>Audio1</type>
<name>Some Name1</name>
<displayname>Some Display Name1</displayname>
<download>http://somesite1.com</download>
</DataElement>
<DataElement DataValue="8086:295e">
<type>Audio2</type>
<name>Some Name2</name>
<displayname>Some Display Name2</displayname>
<download>http://somesite2.com</download>
</DataElement>
<DataElement DataValue="8086:297e">
<type>Audio3</type>
<name>Some Name3</name>
<displayname>Some Display Name3</displayname>
<download>http://somesite3.com</download>
</DataElement>
</Main>
And then read in all of the values (however many there are) into a list, and then do a search on the list to pull the values of the one you are looking for.
Applescript:
set theFinalValues to {}
set XMLfile to "Macintosh HD:Users:myUser:Desktop:TestXML.xml"
tell application "System Events"
tell XML element "Main" of contents of XML file XMLfile
repeat with thisElement from 1 to (count of XML elements)
set dataValue to (value of XML attribute of XML element thisElement) as string
set typeText to (value of (XML elements whose name is "type") of XML element thisElement) as string
set nameText to (value of (XML elements whose name is "name") of XML element thisElement) as string
set displayname to (value of (XML elements whose name is "displayname") of XML element thisElement) as string
set download to (value of (XML elements whose name is "download") of XML element thisElement) as string
set theFinalValues to theFinalValues & {{dataValue, typeText, nameText, displayname, download}}
end repeat
end tell
end tell
repeat with thisData from 1 to count theFinalValues
if item 1 of item thisData of theFinalValues is "8086:295e" then
set typeText to item 2 of item thisData of theFinalValues
set nameText to item 3 of item thisData of theFinalValues
set displayname to item 4 of item thisData of theFinalValues
set download to item 5 of item thisData of theFinalValues
set theFinalSearchValue to {(item 1 of item thisData of theFinalValues), typeText, nameText, displayname, download}
end if
end repeat
theFinalSearchValue
result:
{"8086:295e", "Audio2", "Some Name2", "Some Display Name2", "http://somesite2.com"}
Last edited by Matt-Boy (2008-07-21 10:25:08 am)
I would agree with Matt-Boy. It is one of the requirements of XML that there be a "root" element as the 1st element. If you use changeable data as the 1st element, then you don't have a root element, to my way of thinking.
I thought I would add to this thread with a similar issue I'm having. I'm attempting to parse an XML file from the Haloscan commenting system. Here's a short example of what the XML looks like.
<?xml version="1.0" encoding="iso-8859-1" ?>
<comments>
<thread id="rw_unique_entry_id_446_page0">
<comment>
<datetime>2005-06-14T08:57:25-05:00</datetime>
<name>Some Name</name>
<email>some@mail</email>
<uri>http://wwww.w.come</uri>
<ip>100.100.0110.111</ip>
<text><![CDATA[bla bla bla]]></text>
</comment>
<comment>
<datetime>2005-06-14T08:57:25-05:00</datetime>
<name>Some Name</name>
<email>some@mail</email>
<uri>http://wwww.w.come</uri>
<ip>100.100.0110.111</ip>
<text><![CDATA[bla bla bla]]></text>
</comment>
</thread>
<thread id="rw_unique_entry_id_194_page0">
<comment>
<datetime>2005-06-14T08:57:25-05:00</datetime>
<name>Some Name</name>
<email>some@mail</email>
<uri>http://wwww.w.come</uri>
<ip>100.100.0110.111</ip>
<text><![CDATA[bla bla bla]]></text>
</comment>
</thread>
</comments>
There is a thread id with one or more comments under each. I want to search for specific thread ids and extract the comment info.
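One way this might be done with the built-in System Events XML support, following the pattern Matt-Boy used above: walk the thread elements, compare each one's id attribute, and collect the comments of the match. This is an untested sketch. The handler name commentsForThread and the file name comments.xml are made up for the example, it assumes every thread has an id attribute, and it skips the CDATA text field since I'm not sure how System Events reports that.

```applescript
-- Hypothetical handler: returns a list of {datetime, name, email} lists,
-- one per comment in the thread whose id attribute equals wantedID.
-- xmlPath is an HFS path such as "Macintosh HD:Users:myUser:Desktop:comments.xml".
on commentsForThread(xmlPath, wantedID)
	set foundComments to {}
	tell application "System Events"
		tell XML element "comments" of contents of XML file xmlPath
			-- each child of <comments> is a <thread>
			repeat with t from 1 to (count of XML elements)
				tell XML element t
					if (value of XML attribute "id") is wantedID then
						-- each child of the matching <thread> is a <comment>
						repeat with c from 1 to (count of XML elements)
							tell XML element c
								set end of foundComments to {value of XML element "datetime", value of XML element "name", value of XML element "email"}
							end tell
						end repeat
					end if
				end tell
			end repeat
		end tell
	end tell
	return foundComments
end commentsForThread

-- Example call; the file name here is an assumption, use your own.
-- commentsForThread((path to desktop as text) & "comments.xml", "rw_unique_entry_id_194_page0")
```

If no thread has the id you ask for, the handler returns an empty list.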
Just wanted to add my method:
set paski to do shell script "curl http://www.onthesnow.com/pennsylvania/snow.rss"
-->get all items
set paskiitems to my parsecode(paski, "<item>", "</item>")
-->build array of mountain data
set regiondata to {}
repeat with x from 1 to count of every item of paskiitems
set thismountain to item x of paskiitems
set AppleScript's text item delimiters to return
set astid to AppleScript's text item delimiters
set mountainname to my cleantags(thismountain, "<title>", "</title>", astid)
set mountaindescription to my cleantags(thismountain, "<description>", "</description>", astid)
set mountainlink to my cleantags(thismountain, "<guid isPermaLink=\"true\">", "</guid>", astid)
set mountainstatus to cleantags(thismountain, "<ots:open_staus>", "</ots:open_staus>", astid)
set mountaindepth to cleantags(thismountain, "<ots:base_depth>", "</ots:base_depth>", astid)
set mountain48sf to cleantags(thismountain, "<ots:snowfall_48hr>", "</ots:snowfall_48hr>", astid)
copy {mountainname, mountaindescription, mountainlink, mountainstatus, mountaindepth, mountain48sf} to end of regiondata
end repeat
return regiondata
on parsecode(code, opentag, closetag)
set itemlist to {}
set AppleScript's text item delimiters to opentag
set taglist to every text item of code as list
set childtaglist to {}
repeat with x from 2 to count of every item of taglist
copy item x of taglist to end of childtaglist
end repeat
repeat with thisitem in childtaglist
set AppleScript's text item delimiters to closetag
copy text item 1 of thisitem to end of itemlist
set AppleScript's text item delimiters to opentag
end repeat
return itemlist
end parsecode
on cleantags(thisitem, opentag, closetag, astid)
try
set AppleScript's text item delimiters to opentag
set rawitem to text item 2 of thisitem
set AppleScript's text item delimiters to closetag
set cleanitem to text item 1 of rawitem
-->reset the delimiters
set AppleScript's text item delimiters to astid
return cleanitem
on error errmsg
display dialog "Could not clean the " & opentag & return & return & errmsg
end try
end cleantags