I have been experimenting with custom maps (a.k.a. My Places) in Google Maps to obtain the geographic coordinates of certain locations. In a way, I am geocoding. According to Wikipedia, geocoding is the process of finding geographic coordinates associated with other forms of location data, such as street addresses or postal codes. My process is not fully automated, but I managed to automate some parts of it.
What am I geocoding? I want to create a data set with the geographic coordinates of the main stock exchanges across the world. Why? Because I am interested in calculating the physical distance between stock exchanges. As you can see in the following map, I have already located most of the stock exchanges of America and Europe.
So far, I locate the stock exchanges manually in a custom Google map and then download an XML-structured file that contains the geographic data of that map. From this file I can extract the coordinates of each stock exchange. Locating the stock exchanges in my map is a manual process, but obtaining the coordinates is automatic, and that second part is where the fun is.
I wrote a small Python script to get the data from Google Maps in a simple format. Based on the data I add to my Stock Exchanges map, Google Maps creates an XML file containing tags with the geographic data. The script reads the XML file and then navigates the tag structure to obtain the geographic coordinates of each stock exchange. To navigate the XML file, I use the BeautifulSoup library, which parses the file for you (see this interesting example). There are many other libraries that can do this job, but BeautifulSoup is the one I have learnt to use. I discovered it in Nathan Yau’s book Visualize This.
The structure of the XML can be seen in the following excerpt, which holds the data for the Toronto Stock Exchange:
<Placemark>
  <name>Toronto Stock Exchange; Canada; Toronto</name>
  <description><![CDATA[<div dir="ltr"></div>]]></description>
  <styleUrl>#style11</styleUrl>
  <Point>
    <coordinates>-79.383545,43.648350,0.000000</coordinates>
  </Point>
</Placemark>
Every stock exchange in the map is a Placemark. I need what is inside each name and coordinates tag. The word name is not reserved in Python itself, but BeautifulSoup tag objects already have a .name attribute that holds the tag's own name, so i.name returns the string placemark instead of the child tag. My workaround is to replace the word name with, say, nombre before parsing. Note that this is a plain text substitution, so it also changes any other occurrence of name in the file. Then, using BeautifulSoup, I look up the nombre and coordinates tags and read the text that each one contains.
#Import library from BeautifulSoup
from BeautifulSoup import BeautifulStoneSoup
#Open the Google Maps XML file (read-only)
localFilein = open('Stock Exchanges of the World Input.xml', "r")
localFile = open('Stock Exchanges of the World.xml', "w")
#Replace the word 'name' by 'nombre'.
for line in localFilein:
    localFile.write( line.replace('name', 'nombre') )
localFilein.close()
localFile.close()
#Parse the file with BeautifulSoup (Stone library for XML)
localFile = open('Stock Exchanges of the World.xml','r')
soup = BeautifulStoneSoup(localFile)
#Print the name of the stock exchange and the coordinates
for i in soup.findAll('placemark'):
    stockname = i.nombre.string
    coord = i.point.coordinates.string
    print stockname + ";" + coord
In line 15 (the for loop), I find all the tags called placemark and iterate over them. In line 16, i.nombre.string gives me the text inside the nombre tag of the current placemark. Line 17 works the same way: it retrieves the text inside the tag structure placemark.point.coordinates.
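For readers on Python 3, the same walk over the Placemark tags can be sketched with the standard library's xml.etree.ElementTree instead of BeautifulSoup. This is a different parser from the one used in my script, not a drop-in replacement, and it does not need the name-to-nombre substitution. The sample document, its kml/Document wrapper, the namespace, and the helper names (local_name, find_first, extract_placemarks) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample modelled on the Toronto Placemark; the <kml>/<Document> wrapper and
# the namespace are assumptions about what the real file looks like.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Toronto Stock Exchange; Canada; Toronto</name>
      <Point>
        <coordinates>-79.383545,43.648350,0.000000</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>"""


def local_name(tag):
    """Drop a '{namespace}' prefix so tags match with or without a namespace."""
    return tag.rsplit("}", 1)[-1]


def find_first(element, name):
    """Return the first element under `element` whose local tag is `name`."""
    for child in element.iter():
        if local_name(child.tag) == name:
            return child
    return None


def extract_placemarks(xml_text):
    """Return a list of (name, coordinates) string pairs, one per Placemark."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():
        if local_name(element.tag) != "Placemark":
            continue
        name = find_first(element, "name")
        coords = find_first(element, "coordinates")
        if name is None or coords is None:
            continue  # skip placemarks that lack a name or a point
        rows.append(((name.text or "").strip(), (coords.text or "").strip()))
    return rows


# Same output format as the BeautifulSoup script: name;coordinates
for stockname, coord in extract_placemarks(SAMPLE_KML):
    print(stockname + ";" + coord)
```

Matching on the local tag name is a deliberate simplification: it keeps the sketch working whether or not the downloaded file declares a namespace.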
For the Toronto Stock Exchange, for instance, the output of this script is as follows (after replacing each , with ; in Notepad):
Toronto Stock Exchange; Canada; Toronto;-79.383545;43.648350;0.000000
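The Notepad step could be folded into the script, and once longitude and latitude are numbers, the distance I am ultimately after is only a formula away. The following is a minimal Python 3 sketch, not part of my original script. It assumes the coordinates come in KML order (longitude, latitude, altitude) and approximates the Earth as a sphere with a mean radius of 6371 km, using the haversine formula. The function names are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def to_row(stockname, coord):
    """Join name and coordinates with ';', turning the commas into ';' too."""
    return stockname + ";" + coord.strip().replace(",", ";")


def parse_row(row):
    """Split a row into (label, longitude, latitude).

    The last three fields are longitude, latitude and altitude (KML order);
    everything before them is kept as the label.
    """
    fields = row.split(";")
    label = ";".join(fields[:-3]).strip()
    lon, lat = float(fields[-3]), float(fields[-2])
    return label, lon, lat


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


row = to_row("Toronto Stock Exchange; Canada; Toronto",
             "-79.383545,43.648350,0.000000")
print(row)

label, lon, lat = parse_row(row)
# Distance to an arbitrary point one degree further north (not a real exchange).
print(round(haversine_km(lon, lat, lon, lat + 1.0), 1))
```

A spherical model ignores the flattening of the Earth, so the result is an approximation; for comparing distances between exchanges in different cities that should be accurate enough.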
A further step for automating the whole task would be to geocode the addresses of the stock exchanges. That is, assuming that I can get a data set containing the address of each stock exchange (I did a quite extensive search and I did not find anything like that), I can retrieve the geographic coordinates from Google Maps based on that address. This would mean that I would not have to manually place each stock exchange in a custom map. I would just work “behind the scenes” with the Google Maps API.
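As a rough idea of what that could look like, here is a Python 3 sketch against the JSON endpoint of the Google Maps Geocoding API. It assumes that a list of addresses exists and that you have an API key. The function names are hypothetical, and the response handling only covers the fields that, as far as I know, the service returns (status, results, geometry.location). Treat it as a starting point rather than a tested client.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for one address (api_key is your own key)."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return GEOCODE_ENDPOINT + "?" + query


def extract_lon_lat(payload):
    """Return (longitude, latitude) of the first result, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lng"], location["lat"]


def geocode(address, api_key):
    """Look up one address. This performs a network request."""
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return extract_lon_lat(payload)
```

Calling geocode(address, api_key) for each address in the list would then replace the manual placement in the custom map. The result is returned as (longitude, latitude) to match the order used in the XML file.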
I know that the code is far from perfect, but it works just fine! Suggestions are welcome!