Parsing XML with AppleScript, or maybe shell

I need to parse XML for an AppleScript project. I’ve made a start, but for some reason my code is not behaving the way I expected: it does find the item I’m looking for, but it does not return a value.

Here is the code:


set xmlFile to ((choose file without invisibles) as string)
 tell application "System Events"
    set xdata to XML element 1 of contents of XML file xmlFile
    set foo to my getxml(xdata, "line1")
    return foo
 end tell

 on getxml(xmldata, e)
    tell application "System Events"
        repeat with i from 1 to count of XML elements of xmldata
            set e_name to (get name of XML element i of xmldata) as Unicode text
            log e_name
            if e_name is equal to e then
                display dialog "hello"
                return value of XML element i of xmldata
            else
                my getxml(XML element i of xmldata, e)
            end if
        end repeat
    end tell
 end getxml

Here is the XML:

<?xml version="1.0" encoding="UTF-8"?>
 <foo>
    <bar>
        <line1>test</line1>
    </bar>
    <crap>ohh</crap>
 </foo>

What is interesting is that if I give it a top-level item, it does what I expect it to.

So I’m just looking either to fix this code or to make something better, maybe something from the shell that I can call to parse my XML?

thanks

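You haven’t given yourself any way to return a value from the recursive branch of that subroutine, so it’s not returning one: when the match is inside a nested element, the result of the inner getxml call is simply thrown away. Try this…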

set xmlFile to ((choose file without invisibles) as string)
tell application "System Events"
	set xdata to XML element 1 of contents of XML file xmlFile
	set foo to my getxml(xdata, "line1")
	return foo
end tell

on getxml(xmldata, e)
	-- Depth-first search: return the value of the first element named e, or missing value if none is found
	set returnValue to missing value
	tell application "System Events"
		repeat with i from 1 to count of XML elements of xmldata
			-- Stop as soon as an earlier recursive call has found a match
			if returnValue is not missing value then exit repeat
			set e_name to (get name of XML element i of xmldata) as Unicode text
			if e_name is equal to e then
				set returnValue to value of XML element i of xmldata
				exit repeat
			else
				-- Keep the result of the recursive call instead of discarding it
				set returnValue to my getxml(XML element i of xmldata, e)
			end if
		end repeat
	end tell
	return returnValue
end getxml

Another way to parse XML is with an XSL stylesheet. It seems to me that this might be faster if the XML file is large. OS X has a built-in XSLT processor, xsltproc. This is an example AppleScript file I wrote for myself.


set xml_file to quoted form of POSIX path of (choose file with prompt "XML File")
set xsl_file to quoted form of POSIX path of (choose file with prompt "XSL File")

-- return results here
do shell script "xsltproc " & xsl_file & " " & xml_file

-- write xml file with results
-- set out_file to "~/Desktop/Result.xml"
-- do shell script "xsltproc -o " & out_file & " " & xsl_file & " " & xml_file

This is what the XSL file would look like to get the line1 value(s). Most of it would be the same for any XSL file, so it can be kept as a snippet (or clipping).
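The stylesheet from this post is not reproduced here, so what follows is only a minimal sketch of what such a file could look like for the sample data, spelling out the full path /foo/bar/line1. It is an illustration, not the original file: the names sample.xml and line1.xsl and the temporary directory are assumptions, and the shell wrapper is there only so the example can be run as-is.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

# The sample XML from the original question.
cat > "$workdir/sample.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <bar>
        <line1>test</line1>
    </bar>
    <crap>ohh</crap>
</foo>
EOF

# A stylesheet that outputs the text of /foo/bar/line1 and nothing else.
# Everything except the select expression is reusable boilerplate.
cat > "$workdir/line1.xsl" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <xsl:value-of select="/foo/bar/line1"/>
    </xsl:template>
</xsl:stylesheet>
EOF

# Same argument order as in the AppleScript: stylesheet first, then the XML file.
result=$(xsltproc "$workdir/line1.xsl" "$workdir/sample.xml")
echo "$result"
```

With the sample data this should print test. From AppleScript, the same xsltproc call would go through do shell script, with the two chosen files passed as quoted POSIX paths.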

I’m not sure what XSL is, but I’ll see if this does what I want it to. Can you show me how to use it with the example data I supplied?

@hank Thanks, that works like a charm, though it is a tad slow, but that’s not your fault :)

I also wonder if there is a way to capture multiple instances, where instead of saying


   if e_name is equal to e then

you would use contains?

Pseudocode here:


   if e_name contains e then
      -- returnValue would have to start out as an empty list, {}
      set end of returnValue to value of XML element i of xmldata

No problem. You should give Fenton’s method a try. I’ve never used it either, but he did give you the exact command you need and the XSL file you need too. It looks like you just need to try it.

Indeed, wow, that’s fast! I wonder if there is a way to do that without spelling out the tree?

It is possible to target an element using only its name, by using the “//” syntax.

<xsl:value-of select="//line1" />

But I would think this would be a bit slower, and could return the wrong data if the element was used more than once (like in Apple’s plist files).

What is more accurate is to include as much of the path as you can, before using “//”. The most common situation where you need this is if an element may sometimes be at one level OR one level down from there.
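To make the difference concrete, here is a small sketch comparing a bare //line1 with a path that is anchored before the "//". The data is a hypothetical variation on the original sample (an extra line1 under a made-up other element, and the real one moved one level deeper), and select_value is a helper name invented for this example. In XSLT 1.0, xsl:value-of outputs only the first matching node in document order, which is how the wrong value can come back.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical data: line1 appears twice, at different places in the tree.
cat > "$workdir/sample.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <other>
        <line1>wrong</line1>
    </other>
    <bar>
        <deep>
            <line1>test</line1>
        </deep>
    </bar>
</foo>
EOF

# select_value XPATH: print the value of the first node matching XPATH.
# Illustrative helper: it writes a tiny stylesheet around the expression.
select_value() {
    cat > "$workdir/select.xsl" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <xsl:value-of select="$1"/>
    </xsl:template>
</xsl:stylesheet>
EOF
    xsltproc "$workdir/select.xsl" "$workdir/sample.xml"
}

anywhere=$(select_value "//line1")          # first line1 anywhere in the file
anchored=$(select_value "/foo/bar//line1")  # line1 at any depth below /foo/bar

echo "//line1         -> $anywhere"
echo "/foo/bar//line1 -> $anchored"
```

Here the bare //line1 picks up the value under other, while the anchored path still finds the intended element even though it sits one level deeper than in the original sample.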

The trouble with mixing AppleScript and XML/XSL is that it’s more difficult to return multiple values AND assign them to AppleScript variables.
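One possible workaround, sketched here under assumptions, is to have the stylesheet emit one value per line and split the result afterwards. The example also matches on part of the element name, in the spirit of the "contains" idea earlier in the thread. The line2 element, the file names, and the substring 'line' are all made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# The original sample with a hypothetical line2 element added.
cat > "$workdir/sample.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <bar>
        <line1>test</line1>
        <line2>more</line2>
    </bar>
    <crap>ohh</crap>
</foo>
EOF

# For every element whose name contains "line", output its value and a newline.
cat > "$workdir/lines.xsl" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <xsl:for-each select="//*[contains(name(), 'line')]">
            <xsl:value-of select="."/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
EOF

values=$(xsltproc "$workdir/lines.xsl" "$workdir/sample.xml")
echo "$values"
```

On the AppleScript side, taking paragraphs of the do shell script result should turn that output into a list with one item per matched element, which can then be assigned to variables as needed.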

It is, however, possible to return the data in many different ways. I use this method to import XML directly into FileMaker databases, which can see XML like any other data source once you transform it into FileMaker’s XML syntax.

It is also possible to transform the returned data into whatever text you’d like, such as tab-separated data.
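As a rough sketch of the tab-separated case: the stylesheet below walks each bar element and writes its line1 and line2 values separated by a tab, one record per line. The two-record data set, the line2 element, and the file names are hypothetical, chosen only to have something to tabulate.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical data: two <bar> records with two fields each.
cat > "$workdir/records.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<foo>
    <bar>
        <line1>test</line1>
        <line2>more</line2>
    </bar>
    <bar>
        <line1>again</line1>
        <line2>data</line2>
    </bar>
</foo>
EOF

# One output row per <bar>: line1, a tab (&#9;), line2, a newline (&#10;).
cat > "$workdir/tsv.xsl" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <xsl:for-each select="/foo/bar">
            <xsl:value-of select="line1"/>
            <xsl:text>&#9;</xsl:text>
            <xsl:value-of select="line2"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
EOF

rows=$(xsltproc "$workdir/tsv.xsl" "$workdir/records.xml")
printf '%s\n' "$rows"
```

The tab and newline are wrapped in xsl:text so that they survive as output; the indentation between the other instructions is whitespace-only and is dropped by the processor.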