Choropleth maps are among the most interesting and useful visualizations. They show how a value varies by geographic location, and they look good enough to grab attention in a presentation. Several different libraries can be used to draw them. In this tutorial, I will use folium.
What is a choropleth map?
Here is the definition from Wikipedia:
Choropleth maps provide an easy way to visualize how a measurement varies across a geographic area or show the level of variability within a region. A heat map or isarithmic map is similar but does not use a priori geographic areas. They are the most common type of thematic map because published statistical data (from government or other sources) is generally aggregated into well-known geographic units, such as countries, states, provinces, and counties, and thus they are relatively easy to create using GIS, spreadsheets, or other software tools.
In simple words, a choropleth map shades each geographic region with a color that represents a value for that region.
Data preparation is an important and common task for all data scientists. The dataset I use here is reasonably clean, but it still needs a little work for this visualization. Let’s import the necessary libraries and the dataset.
import pandas as pd
import numpy as np

df = pd.read_excel(
    'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DV0101EN/labs/Data_Files/Canada.xlsx',
    sheet_name='Canada by Citizenship',
    skiprows=range(20),
    skipfooter=2,
)
I cannot show a screenshot of the dataset here because it’s too big. I encourage you to run the code by yourself. That’s the only way to learn.
This dataset records how many immigrants came to Canada from each country of the world between 1980 and 2013. Let’s look at the column names to get a sense of what the dataset contains:
df.columns

#Output:
Index(['Type', 'Coverage', 'OdName', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName',
       1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991,
       1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
       2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013],
      dtype='object')
We are going to plot the total number of immigrants who came from each country between 1980 and 2013.
We only need the name of the country and the yearly counts, so drop the other columns from the dataset.
df.drop(['AREA', 'REG', 'DEV', 'Type', 'Coverage', 'AreaName', 'RegName', 'DevName'], axis=1, inplace=True)
The column ‘OdName’ holds the name of the country. Rename it to ‘Country’ to make it easier to understand.
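The rename call itself does not appear above, so here is a minimal sketch of it. It assumes the standard pandas rename method and uses a tiny made-up frame (the name `sample` and its values are only for illustration) instead of the real dataset:

```python
import pandas as pd

# Tiny stand-in for the real dataset; the rows and values are made up.
sample = pd.DataFrame({
    "OdName": ["Afghanistan", "Albania"],
    1980: [16, 1],
    1981: [39, 0],
})

# Rename only the 'OdName' column; every other column keeps its name.
sample = sample.rename(columns={"OdName": "Country"})

print(sample.columns.tolist())
```

On the tutorial's DataFrame, the equivalent call would presumably be df.rename(columns={'OdName': 'Country'}, inplace=True).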
Now, make a column ‘Total’ that will be the sum of the immigrants from all the years for each country.
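# numeric_only=True skips the text column with the country names;
# recent pandas versions may raise an error without it
df['Total'] = df.sum(axis=1, numeric_only=True)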
The ‘Total’ column now sits at the end of the DataFrame. It gives the total number of immigrants for each country.
Remember that setting the axis to 1 is important. It tells pandas to add up the values across the columns of each row, which gives one total per country. With the default axis of 0, the sum would run down each column instead, and we would end up with the total number of immigrants per year instead of per country.
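To see the difference between the two axes, here is a small sketch on a made-up frame (the names `sample`, `per_country`, and `per_year` are hypothetical). It passes numeric_only=True so that the text column is skipped:

```python
import pandas as pd

# Made-up data: two countries, two years.
sample = pd.DataFrame({
    "Country": ["A", "B"],
    1980: [10, 1],
    1981: [20, 2],
})

# axis=1 adds across the columns of each row: one total per country.
per_country = sample.sum(axis=1, numeric_only=True)

# axis=0 adds down each column: one total per year.
per_year = sample.sum(axis=0, numeric_only=True)

print(per_country.tolist())  # [30, 3]
print(per_year.tolist())     # [11, 22]
```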
Basic Choropleth Map
I am going to show how to develop a choropleth map step by step. If you do not have folium, install it by running this command in your Anaconda prompt:
conda install -c conda-forge folium
Import folium now and generate a world map.
import folium

world = folium.Map(location=[0, 0], zoom_start=2)
Next, we will put our data on this world map. That also requires geo data containing the boundary coordinates of each country. Download the geo data from this link. I have already downloaded it and put it in the same folder as the notebook I used for this tutorial, so I just need to point to that file now.
wc = r'world-countries.json'
For this choropleth map, you need to pass:
- the geo_data that we saved as ‘wc’ above,
- the data, which is our DataFrame ‘df’,
- the columns we need to use from the dataset: first the column that matches the geo data, then the column with the values to color by,
- the ‘key_on’ parameter, which comes from the geo_data. Its value always starts with ‘feature’, followed by the path to the matching key inside the geo_data that we saved as ‘wc’. That JSON file is too big to show in full, but each country entry in it has a properties key that holds the name of the country. That is what we need to match on, so the value of the key_on parameter will be ‘feature.properties.name’.
- I will also use some styling parameters: fill_color, fill_opacity, line_opacity, and legend_name. I think these are self-explanatory.
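To make the key_on path concrete, here is a rough sketch. The GeoJSON text is a hypothetical, heavily trimmed stand-in for one country entry (not the real file), and `resolve_key` is a helper written only for this illustration, not a folium function:

```python
import json

# Hypothetical, heavily trimmed stand-in for one entry of the geo data file.
geo_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "France"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[2.0, 48.0], [3.0, 48.0], [3.0, 49.0], [2.0, 48.0]]]
      }
    }
  ]
}
"""


def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.name' inside one feature."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on must start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value


geo = json.loads(geo_text)
first_feature = geo["features"][0]
country_name = resolve_key(first_feature, "feature.properties.name")
print(country_name)
```

Roughly speaking, folium performs this kind of lookup for every feature and matches the result against the first column named in columns.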
Here is the code for our first choropleth map:
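# folium.Choropleth is used here because the older world.choropleth(...) method
# is deprecated and no longer available in recent folium releases
folium.Choropleth(
    geo_data=wc,
    data=df,
    columns=['Country', 'Total'],
    key_on='feature.properties.name',
    fill_color='YlOrRd',
    fill_opacity=0.8,
    line_opacity=0.2,
    legend_name='Immigration to Canada',
).add_to(world)

world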
This map is interactive! You can navigate around using the mouse. The color also changes with intensity: the darker the color, the more immigrants came from that country to Canada. Black means the dataset has no matching entry for that country.
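One possible reason for a black country is a spelling difference between the dataset and the geo data. A rough way to check for that, using made-up name sets (both sets below are hypothetical stand-ins, not the real contents of the files):

```python
# Hypothetical country names taken from the geo data file.
geo_names = {"France", "India", "United States of America"}

# Hypothetical country names taken from the dataset's country column.
data_names = {"France", "India", "United States"}

# Countries drawn on the map that have no matching row in the data.
no_data = sorted(geo_names - data_names)

# Rows in the data that match no shape on the map.
unmatched_rows = sorted(data_names - geo_names)

print(no_data)
print(unmatched_rows)
```

If the two lists point at the same country under different spellings, renaming the entry in the dataset should bring that country back onto the color scale.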
This map may look a bit plain. We can use tiles to make it look more interesting:
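# Assumption: threshold_scale is not defined earlier in this tutorial, so six evenly
# spaced bin edges between the smallest and largest totals are used here
threshold_scale = np.linspace(df['Total'].min(), df['Total'].max(), 6, dtype=int).tolist()
threshold_scale[-1] = threshold_scale[-1] + 1  # keep the largest total inside the last bin

# Note: recent folium releases may no longer bundle the Stamen tile sets
world_map = folium.Map(location=[0, 0], zoom_start=2, tiles='stamenwatercolor')
folium.Choropleth(
    geo_data=wc,
    data=df,
    columns=['Country', 'Total'],
    threshold_scale=threshold_scale,
    key_on='feature.properties.name',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Immigration to Canada',
).add_to(world_map)

world_map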
Isn’t it better looking! We can make it even more interesting by adding several tile sets and letting the viewer switch between them as required. We will use folium’s TileLayer class to add the different layers of tiles to the map. At the end, we will also add a LayerControl to get the option of switching between the layers.
world = folium.Map(location=[0, 0], zoom_start=2, tiles='cartodbpositron')

# Note: recent folium releases may no longer bundle the Stamen tile sets
tiles = ['stamenwatercolor', 'cartodbpositron', 'openstreetmap', 'stamenterrain']
for tile in tiles:
    folium.TileLayer(tile).add_to(world)

folium.Choropleth(
    geo_data=wc,
    data=df,
    columns=['Country', 'Total'],
    threshold_scale=threshold_scale,
    key_on='feature.properties.name',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Immigration to Canada',
    smooth_factor=0,
).add_to(world)

# Add the layer switcher so the tile sets can be changed on the map
folium.LayerControl().add_to(world)

world
Look underneath the right corner of the legend: there is an icon that looks like a stack of tiles. If you click on it, you will get the list of tile sets and can change the tile style there. I find this option very cool!
Add Informative Label
Finally, I want to show you another useful and interesting option: an informative label. We cannot expect everyone to know the name of a country just by looking at the map, so it is useful to show the country name on the map itself. Folium has a class called ‘GeoJsonTooltip’ that does that. First, we make the world map as usual, add all the parameters to the choropleth, and save it in a variable. Then we add the tooltip using ‘GeoJsonTooltip’ with the add_child method. Here is the complete code.
world = folium.Map(location=[0, 0], zoom_start=2, tiles='cartodbpositron')

choropleth = folium.Choropleth(
    geo_data=wc,
    data=df,
    columns=['Country', 'Total'],
    threshold_scale=threshold_scale,
    key_on='feature.properties.name',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Immigration to Canada',
).add_to(world)

# Show the 'name' property of each country when hovering over it
choropleth.geojson.add_child(
    folium.features.GeoJsonTooltip(['name'], labels=False)
)

world
For example, when I put the cursor on France, the map shows the name France. In the same way, you can put the cursor on any country on the map to get its name.
I wanted to show how to develop an interactive choropleth map, style it, and add informative labels to it. I hope it was helpful.