Getting at the nodes and attributes in an XML document

In the last screencast and post, we parsed and printed our exported XML file from backpackit with a Python script that uses the minidom package.

Over the last year I have been gathering a lot of my experimental data in the form of "note entries" on several backpackit pages (one for each project). As a first task, I wanted to get a feeling for how many such notes there are.

So, with a little help from Mark Pilgrim's excellent book Dive Into Python, we can do the following two things.
First, obtain an array containing all the note nodes of the kind:

<note title="Bt-TRK pH 7.0 uptake flux with Trypsin treated Bt-TRK 06/19/06" id="1005550" created_at="2006-08-22 22:38:31"

Then read the created_at attribute of each note node and write it to a text file for further processing.
So now for the code:

from xml.dom import minidom
# Create XML object
xmldoc = minidom.parse('backpackit.xml')

# I want to get a feel for how many note nodes are in the original file
# That's as simple as a call to getElementsByTagName

notelist = xmldoc.getElementsByTagName("note")
num_notes = len(notelist)
print("I have " + str(num_notes) + " notes total")

# Now I want to capture when those notes were created and save it to a file

outfile = open('notelist.txt', 'w')

# The Python for loop: write one created_at value per line

for note in notelist:
    date_elem = note.attributes["created_at"]
    outfile.write(date_elem.nodeValue + "\n")

# Close the file so everything is flushed to disk
outfile.close()
print("Wrote file notelist.txt")

# END of script
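
If you don't have a backpackit export handy, the same two calls can be tried on a small inline document. This is only a sketch: the <page> wrapper, titles, ids and dates below are made up for illustration and are not the real export layout. It also uses getAttribute, a minidom shortcut that returns the attribute value directly as a string.

```python
from xml.dom import minidom

# Made-up sample in the rough shape of the export; the <page> wrapper
# and the titles, ids and dates are hypothetical
SAMPLE = """<page>
  <note title="First note" id="1" created_at="2006-08-22 22:38:31"/>
  <note title="Second note" id="2" created_at="2006-08-29 09:15:00"/>
</page>"""

# parseString works like parse, but takes the XML text instead of a file
xmldoc = minidom.parseString(SAMPLE)
notes = xmldoc.getElementsByTagName("note")

# getAttribute returns the value as a string
# (an empty string if the attribute is missing)
dates = [note.getAttribute("created_at") for note in notes]

print(len(notes))
for d in dates:
    print(d)
```

Either way of reading the attribute gives the same text, so note.attributes["created_at"].nodeValue and note.getAttribute("created_at") can be swapped freely in the script above.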

So now we have a file with all the created_at dates and times. What I really want to know is how busy I was over the last year, so I would like to chart how many posts I had per week for the whole year. Let's try to get at that information in the next codeitch project.

Refs: minidom Python doc


