Opening XML in Python

Priyam Mehta
3 min readMar 17, 2021

--

XML (Extensive Markup Language) is pretty much common, and if you’re a programmer, it’s really likely that you come across XML files now and then. Now, you can directly open and edit an XML document in VSCode or Atom or any similar editor, or you convert it to other formats like JSON and make it more intractable for a programming language.

So to read and edit XML files in python, there are 3 pretty simple ways:

  1. The first and the most simplest method is to convert it into a JSON file and then use it in Python using it’s built-in library JSON. Now JSON, which is Java Script Object Notation is quiet similiar to Python dictonaries. You can convert your XML into JSON by this JSON-Converter Online .
  2. If this isn’t what you were looking for then, you could go for xmltodict library in python. Which is my personal favourite.

→ for that purpose, let’s try an example

let’s say I have this XML file which I want to Open in python.

XML file taken from site map of an online store

So initally, we beign with including the library

import xmltodict

Then, we open it using Open method in python. Here the name of the file on my computer was “1.xml”

f = open(“/path/to/file/1.xml” , “r”)

next we call the read method on the file

xmlContent = f.read()

now, to convert it into a Python Ordered Dictionary, we shall parse it using

xmltodict.parse method. And voila !! you have successfully converted it into a dictonary. Now you can manipulate the data like you wold with normal python dictionaries and lists.

d = xmltodict.parse(xmlContent)
print(d)
print(d['urlset'])
print(d['urlset']['url'][0])

suppose you want extract the first URL in the given data,

print(d[‘urlset’][‘url’][0][‘loc’])

With this you can now use it like a normal ordered dict.

→ Full code

import xmltodict# opening the file 
f = open(“/path/to/file/1.xml” , “r”)
xmlContent = f.read()
#parsing it with xmltodict
d = xmltodict.parse(xmlContent)
print(d)
print(d['urlset'])
print(d['urlset']['url'][0])

and you can convert it into JSON using

import json
jsd = json.loads(json.dumps(d))

3. This one’s a little complicated, compared to the other two, but more powerful. It uses python’s built-in The ElementTree XML Module.

You can also use this method to modify the original XML file

suppose we have a XML file like the one below

<employees><employee><name>aaa</name><age>21</age><sal>5000</sal></employee><employee><name>xyz</name><age>22</age><sal>60</sal></employee></employee>

then to Parse it we can use the following code

import xml.etree.ElementTree as ettree = et.ElementTree(file='employees.xml')root = tree.getroot() # finding the root element of the filechildren = root.getchildren() # getting the children elements of the #root

the .getchildren() method returns a List of children in the root element

To access the individual child, one can use a for loop to iterate through it

for child in children:
pairs = child.getchildren()

and for our data set

employee = {}
for child in children:
pairs = child.getchildren()
for pair in pairs:
employee[pair.tag]=pair.text

to get the employee dict like

{'name': 'aaa', 'age': '21', 'sal': '5000'}
{'name': 'xyz', 'age': '22', 'sal': '6000'}

→ And you can write or modify an XML using

et.Element

and

.SubElement(child, "name")

calls. The full example code to make an XML file is below

import xml.etree.ElementTree as etemployees=[{'name':'aaa','age':21,'sal':5000},{'name':xyz,'age':22,'sal':6000}] # making a List with the required #dataroot = et.Element("employees") # defining the root element of the #XMLfor employee in employees:
child=xml.Element("employee") # making a new element
root.append(child) # and appending it to the root element # making sub elements and appending them

nm = xml.SubElement(child, "name")
nm.text = student.get('name')

age = xml.SubElement(child, "age")
age.text = str(student.get('age'))

sal=xml.SubElement(child, "sal")
sal.text=str(student.get('sal'))

tree = et.ElementTree(root)
with open('employees.xml', "wb") as fh:
tree.write(fh)

and this way we can write an XML file using Python.

--

--

No responses yet