Opening XML in Python
XML (Extensible Markup Language) is very common, and if you're a programmer, you will likely come across XML files now and then. You can open and edit an XML document directly in VS Code, Atom, or any similar editor, or you can convert it to another format such as JSON to make it easier to work with in a programming language.
To read and edit XML files in Python, there are three fairly simple ways:
- The first and simplest method is to convert the XML into a JSON file and then load it in Python with the built-in json module. JSON (JavaScript Object Notation) is quite similar to Python dictionaries. You can convert your XML into JSON with this JSON-Converter Online.
- If this isn't what you were looking for, you could use the xmltodict library for Python, which is my personal favourite.
→ For that purpose, let's try an example.
Let's say I have this XML file that I want to open in Python.
We begin by importing the library:
import xmltodict
Then we open the file using Python's built-in open function. Here the name of the file on my computer was "1.xml".
f = open("/path/to/file/1.xml", "r")
Next, we call the read method on the file:
xmlContent = f.read()
Now, to convert it into a Python ordered dictionary, we parse it using the xmltodict.parse method. And voilà! You have successfully converted it into a dictionary. Now you can manipulate the data as you would with normal Python dictionaries and lists.
d = xmltodict.parse(xmlContent)
print(d)
print(d['urlset'])
print(d['urlset']['url'][0])
Suppose you want to extract the first URL in the given data:
print(d['urlset']['url'][0]['loc'])
With this, you can now use it like a normal ordered dict.
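Since the XML file itself is not reproduced here, the steps above can be sketched against an inline string instead. This is only an illustration: the sample data and the example.com URLs are made up, chosen to match the urlset/url/loc keys used above. It also shows one thing worth knowing: when an element occurs only once, xmltodict returns a dict rather than a list, and the force_list argument is one way to keep the indexing consistent.

```python
import xmltodict

# Hypothetical sitemap-style data; the real 1.xml from the article is not shown.
SAMPLE = """<urlset>
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

d = xmltodict.parse(SAMPLE)

# Two <url> elements, so d['urlset']['url'] is a list and [0] works.
first_loc = d['urlset']['url'][0]['loc']
print(first_loc)

# With a single <url>, the value would be a dict, not a list.
# force_list makes the named elements always come back as lists.
single = xmltodict.parse(
    "<urlset><url><loc>https://example.com/only</loc></url></urlset>",
    force_list=('url',),
)
print(single['urlset']['url'][0]['loc'])
```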
→ Full code
import xmltodict
# opening the file
f = open("/path/to/file/1.xml", "r")
xmlContent = f.read()
# parsing it with xmltodict
d = xmltodict.parse(xmlContent)
print(d)
print(d['urlset'])
print(d['urlset']['url'][0])
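You can also convert it into JSON. Here json.dumps(d) produces the JSON string, and json.loads turns that string back into plain Python dicts and lists: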
import json
jsd = json.loads(json.dumps(d))
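If the goal is an actual JSON file rather than a Python object, json.dumps or json.dump should be enough on its own. Here is a minimal sketch, assuming a dictionary shaped like the one in the example above. The sample data and the output file name are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical data shaped like the parsed sitemap used in this article.
d = {"urlset": {"url": [{"loc": "https://example.com/"},
                        {"loc": "https://example.com/about"}]}}

# json.dumps returns a JSON-formatted string; indent makes it readable.
json_text = json.dumps(d, indent=2)
print(json_text)

# json.dump writes straight to a file. A temporary directory keeps
# this sketch self-contained; use your own path in practice.
out_path = os.path.join(tempfile.mkdtemp(), "sitemap.json")
with open(out_path, "w", encoding="utf-8") as fh:
    json.dump(d, fh, indent=2)

# Reading the file back gives plain dicts and lists again.
with open(out_path, encoding="utf-8") as fh:
    restored = json.load(fh)
```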
- The third way is a little more complicated than the other two, but more powerful. It uses Python's built-in xml.etree.ElementTree module.
You can also use this method to modify the original XML file.
Suppose we have an XML file like the one below:
<employees>
    <employee><name>aaa</name><age>21</age><sal>5000</sal></employee>
    <employee><name>xyz</name><age>22</age><sal>6000</sal></employee>
</employees>
Then, to parse it, we can use the following code:
import xml.etree.ElementTree as et
tree = et.ElementTree(file='employees.xml')
root = tree.getroot()  # finding the root element of the file
children = list(root)  # getting the children elements of the root
list(root) returns a list of the child elements of the root. (Older code used the .getchildren() method for this, but it was removed in Python 3.9.)
To access the individual children, you can use a for loop to iterate through them:
for child in children:
    pairs = list(child)
And for our data set:
for child in children:
    employee = {}  # a fresh dict for each employee
    for pair in child:
        employee[pair.tag] = pair.text
    print(employee)
to get the employee dicts like
{'name': 'aaa', 'age': '21', 'sal': '5000'}
{'name': 'xyz', 'age': '22', 'sal': '6000'}
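The same loop can be sketched without a file on disk by parsing a string with et.fromstring. Treat the snippet below as an illustration under assumptions: the inline XML is written to match the output shown above, and converting age and sal to integers is an addition of mine, since ElementTree always returns element text as strings.

```python
import xml.etree.ElementTree as et

# Inline sample data, so no employees.xml file is needed.
XML_TEXT = """<employees>
  <employee><name>aaa</name><age>21</age><sal>5000</sal></employee>
  <employee><name>xyz</name><age>22</age><sal>6000</sal></employee>
</employees>"""

root = et.fromstring(XML_TEXT)  # returns the root Element directly

employees = []
for child in root.findall("employee"):  # only direct <employee> children
    employees.append({
        "name": child.findtext("name"),
        "age": int(child.findtext("age")),  # text is a string, so convert
        "sal": int(child.findtext("sal")),
    })

print(employees)
```

Using findall and findtext by tag name avoids depending on the order of the sub-elements, which can help if the file gains extra fields later.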
→ You can also write or modify XML using et.Element and et.SubElement(child, "name") calls. The full example code to create an XML file is below:
import xml.etree.ElementTree as et
# making a list with the required data
employees = [{'name': 'aaa', 'age': 21, 'sal': 5000},
             {'name': 'xyz', 'age': 22, 'sal': 6000}]
root = et.Element("employees")  # defining the root element of the XML
for employee in employees:
    child = et.Element("employee")  # making a new element
    root.append(child)  # and appending it to the root element
    # making sub elements and appending them
    nm = et.SubElement(child, "name")
    nm.text = employee.get('name')
    age = et.SubElement(child, "age")
    age.text = str(employee.get('age'))  # element text must be a string
    sal = et.SubElement(child, "sal")
    sal.text = str(employee.get('sal'))
tree = et.ElementTree(root)
with open('employees.xml', "wb") as fh:
    tree.write(fh)
And this way, we can write an XML file using Python.
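The section above mentions modifying an existing XML file but only shows creating one. As a rough sketch of the modify case, assuming the same employees structure: parse the file, change the elements in place, and write the tree back. The 10% raise, the "updated" attribute, and the temporary file are purely illustrative choices.

```python
import os
import tempfile
import xml.etree.ElementTree as et

# Write a small sample file first so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "employees.xml")
with open(path, "w", encoding="utf-8") as fh:
    fh.write("<employees><employee><name>aaa</name><age>21</age>"
             "<sal>5000</sal></employee></employees>")

tree = et.parse(path)
root = tree.getroot()

for employee in root.findall("employee"):
    sal = employee.find("sal")
    sal.text = str(int(sal.text) * 110 // 100)  # illustrative 10% raise
    employee.set("updated", "yes")              # add an attribute

# Overwrite the original file with the modified tree.
tree.write(path, encoding="utf-8", xml_declaration=True)

# Re-read the file to confirm the change was saved.
new_sal = et.parse(path).getroot().findtext("employee/sal")
print(new_sal)
```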