Five Advanced Plots in Python — Matplotlib

Data visualization is probably the most widely used feature in data science, for obvious reasons: it is one of the best ways to present and understand data. So it is a good idea to have a lot of visualization tools at hand, because different kinds of visualization suit different situations. In this article, I will share some code for five 3d visualizations in Matplotlib.

I am assuming you already know 2d plots in Matplotlib, so I won’t go into too much discussion. This article will simply demonstrate how to make these five plots.

The five 3d plots I will demonstrate in this article:

  1. Scatter Plot
  2. Bar Plot
  3. Surface Plot
  4. Contour Plot
  5. Tri-Surf Plot

I am using a dataset from Kaggle for this article.

Dataset

Please feel free to download the dataset from this link if you want to run this code yourself. This is an open dataset that is mentioned here.

First import the necessary packages and the dataset:

import pandas as pd
import numpy as np
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt

df = pd.read_csv("auto_clean.csv")

The dataset is pretty big, so I am not showing any screenshots here. These are the columns of this dataset:

df.columns

Output:

Index(['symboling', 'normalized-losses', 'make', 'aspiration', 'num-of-doors', 'body-style', 'drive-wheels', 'engine-location', 'wheel-base', 'length', 'width', 'height', 'curb-weight', 'engine-type', 'num-of-cylinders', 'engine-size', 'fuel-system', 'bore', 'stroke', 'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price', 'city-L/100km', 'horsepower-binned', 'diesel', 'gas'], dtype='object')

Let’s move to the plotting part.

Scatter Plot

The scatter plot is pretty self-explanatory. I am assuming that you know the 2d scatter plot. To make a 3d scatter plot, we just need to use the ‘scatter3D’ function and pass x, y, and z values. I chose the length, width, and height for the x, y, and z values.

To add some more information, and some style as well, I will pass the price as the size parameter. I feel a scatter plot looks nice when it has bubbles of different sizes and colors, and the size carries extra information at the same time.

So, the bigger the bubble, the higher the price. The same plot gives an idea of how price varies with length, width, and height. The colors change with peak RPM.

%matplotlib notebook
fig = plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")
ax.scatter3D(df['length'], df['width'], df['height'],
c = df['peak-rpm'], s = df['price']/50, alpha = 0.4)
ax.set_xlabel("Length")
ax.set_ylabel("Width")
ax.set_zlabel("Height")
ax.set_title("Relationship between height, width, and length")
plt.show()
Image by Author

This plot is interactive, which helps to understand the data even better:

Animation by Author
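Since the colors carry the peak RPM, a colorbar makes that encoding readable. Here is a minimal sketch of adding one. It uses randomly generated stand-in values rather than the dataset, and the Agg backend is chosen only so the snippet runs outside a notebook; both are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in values (not the real dataset columns)
rng = np.random.default_rng(0)
length = rng.uniform(140, 210, 50)
width = rng.uniform(60, 72, 50)
height = rng.uniform(48, 60, 50)
peak_rpm = rng.uniform(4000, 6600, 50)
price = rng.uniform(5000, 45000, 50)

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection="3d")

# Color encodes peak RPM, bubble size encodes price
points = ax.scatter(length, width, height,
                    c=peak_rpm, s=price / 50, alpha=0.4)

# The colorbar is drawn in its own axes next to the 3d axes
cbar = fig.colorbar(points, ax=ax, shrink=0.6, label="Peak RPM")

ax.set_xlabel("Length")
ax.set_ylabel("Width")
ax.set_zlabel("Height")
```

With the real data, the five arrays would simply be replaced by the corresponding columns of df.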

Bar Plot

A bar plot is always useful, and a 3D bar plot can be even more useful. Here I will plot the price with respect to the body style and peak RPM. That requires a little data preparation. First of all, body style is a categorical variable. The values of the body-style column are as follows:

df['body-style'].unique()

Output:

array(['convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'],
dtype=object)

These strings need to be replaced with numeric values:

df['body_style1'] = df['body-style'].replace({"convertible": 1,
"hatchback": 2,
"sedan": 3,
"wagon": 4,
"hardtop": 5})

Now, I will find the average peak rpm and price for each body style.

gr = df.groupby("body_style1")[['peak-rpm', 'price']].agg('mean')

Output:
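As a minimal sketch of the two preparation steps above (encoding the categories, then averaging per group), here is the same idea on a made-up four-row table. The values are invented for illustration and are not taken from the dataset. Series.map is used here as an alternative to replace; it returns NaN for any category missing from the dictionary, which makes unmapped values easy to spot.

```python
import pandas as pd

# Toy rows invented for illustration; not taken from the dataset
toy = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon", "hardtop"],
    "peak-rpm": [5000, 5400, 4800, 5200],
    "price": [10000, 14000, 9000, 20000],
})

# Same category-to-number dictionary as in the article
codes = {"convertible": 1, "hatchback": 2, "sedan": 3, "wagon": 4, "hardtop": 5}
toy["body_style1"] = toy["body-style"].map(codes)

# Double brackets select a list of columns from the grouped frame
toy_gr = toy.groupby("body_style1")[["peak-rpm", "price"]].agg("mean")
print(toy_gr)
```

The result has one row per body-style code and one column per averaged variable, which is the shape the bar plot needs.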

For the 3d bar plot, we need to pass the x, y, and z values to the bar3d function, as you would expect. It also asks for dx, dy, and dz.

First, think of the 2d plane where the bars will be placed. There we can only use x and y values, and the z values are zeros. So, we will use body_style1 on the x-axis and peak-rpm on the y-axis, and every bar will start at z = 0.

Next, we should specify the size of the bars. I am using a width of 0.3 on the x-axis and a width of 30 on the y-axis, and the height will be the value of the price column. That means,

dx = 0.3

dy = 30

dz = gr['price']
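To make the roles of x, y, z and dx, dy, dz concrete, here is a small sketch with three made-up bars. All numbers are hypothetical, and the Agg backend is used only so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Anchor point of each bar: (x, y) position on the floor, z = 0
x = np.array([1, 2, 3])
y = np.array([10, 20, 30])
z = np.zeros(3)

# Size of each bar: footprint dx by dy, height dz
dx = 0.3                        # scalar, applied to every bar
dy = 5                          # scalar, applied to every bar
dz = np.array([2.0, 5.0, 3.0])  # one height per bar

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection="3d")
bars = ax.bar3d(x, y, z, dx, dy, dz, color=["b", "g", "r"])
ax.set_zlabel("Height (dz)")
```

Each bar starts at its (x, y, z) anchor and extends by dx, dy, and dz along the three axes, so z = 0 with dz set to the value is what makes the bar height equal the value.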

Here is the code snippet for the bar plot:

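%matplotlib notebook
x = gr.index
y = gr['peak-rpm']
z = [0]*5

# Bar sizes as described above
dx = 0.3
dy = 30
dz = gr['price']

colors = ["b", "g", "crimson", 'r', 'pink']
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection="3d")
# Draw the bars (this call is reconstructed from the dx, dy, dz description above)
ax.bar3d(x, y, z, dx, dy, dz, color=colors)
# Put one tick on each body-style code so the five labels line up with the bars
ax.set_xticks(list(x))
ax.set_xticklabels(['convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'])
ax.set_xlabel("Body Style", labelpad = 7)
ax.set_yticks(np.linspace(5000, 5250, 6))
ax.set_ylabel("Peak RPM", labelpad = 10)
ax.set_zlabel("Price")
plt.show()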
Image by Author

Here is the full view of the bar plot:

Animation by Author

Here is the bar plot!

Surface Plot

For this type of plot one-dimensional x and y values do not work. So, we need to use the ‘meshgrid’ function to generate a rectangular grid out of two one-dimensional arrays.
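As a quick illustration of what meshgrid returns, here is a minimal sketch with two tiny made-up arrays (the values are arbitrary):

```python
import numpy as np

x = np.array([1, 2, 3])   # 3 values along the x direction
y = np.array([10, 20])    # 2 values along the y direction

X, Y = np.meshgrid(x, y)

# X repeats x along the rows, Y repeats y along the columns
print(X)
# [[1 2 3]
#  [1 2 3]]
print(Y)
# [[10 10 10]
#  [20 20 20]]
print(X.shape)  # (2, 3), i.e. (len(y), len(x))
```

Every (X[i, j], Y[i, j]) pair is one point of the rectangular grid, so a function evaluated on X and Y yields one z value per grid point.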

This plot shows the relationship between two variables in a 3d setting.

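I chose to look at the relationship between length and width in this plot. Note that the z values are not a column of the dataset; they are computed from x and y with the sine-based z_function. Here is the code for this surface plot: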

def z_function(x, y):
    return np.sin(np.sqrt(x**2 + y**2))

plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")

# x and y must hold the plotted columns before the grid is built
x = df['length']
y = df['width']
X, Y = np.meshgrid(x, y)
Z = z_function(X, Y)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1,
                cmap='winter', edgecolor='none')
ax.set_xlabel("Length")
ax.set_ylabel("Width")
ax.set_title("Length vs Width")
ax.view_init(65, 30)
plt.show()
Image by Author

Here is the full view of the surface plot

Animation by Author
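The view_init call used in the code sets the camera position: the first argument is the elevation angle and the second is the azimuth, both in degrees. A minimal sketch on an empty 3d axes follows; the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# elev: angle above the x-y plane, azim: rotation around the z axis
ax.view_init(elev=65, azim=30)

# The chosen angles are stored on the axes
print(ax.elev, ax.azim)
```

A higher elevation looks at the surface more from above, while changing the azimuth rotates the scene around the vertical axis, which is a handy replacement for dragging when the figure is static.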

Contour Plot

You may already know the 2d contour plot. Here I am showing the relationship between the peak RPM and the city-MPG using a 3d contour plot.

In this example, the z values come from a sine function.

Here is the code snippet:

%matplotlib notebook
def z_function(x, y):
    return np.sin(np.sqrt(x**2 + y**2))

plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")
x = df['peak-rpm']
y = df['city-mpg']
X, Y = np.meshgrid(x, y)
Z = z_function(X, Y)
# rstride, cstride, and edgecolor belong to plot_surface, not contour3D, so they are dropped here
ax.contour3D(X, Y, Z, cmap='binary')
ax.set_xlabel("Peak RPM")
ax.set_ylabel("City-MPG")
ax.set_title("Peak RPM vs City-MPG")
ax.view_init(60, 35)
plt.show()
Image by Author

Here is the full view of this contour plot

Animation by Author

Try cosine for the z-function and see how the contour looks with the same data.
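Here is a minimal sketch of the cosine variant. To keep it self-contained it uses an evenly spaced made-up grid instead of the dataset columns, and the level count of 40 is an arbitrary choice; the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

def z_cosine(x, y):
    # Same shape of function as z_function, with cosine instead of sine
    return np.cos(np.sqrt(x**2 + y**2))

# Evenly spaced stand-in grid (not the dataset columns)
x = np.linspace(-6, 6, 60)
y = np.linspace(-6, 6, 60)
X, Y = np.meshgrid(x, y)
Z = z_cosine(X, Y)

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection="3d")
ax.contour3D(X, Y, Z, 40, cmap="binary")  # 40 contour levels
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("cos(sqrt(x^2 + y^2))")
```

With cosine the surface peaks at the origin instead of starting from zero there, so the innermost contour rings sit at the top of the plot.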

Tri-Surf Plot

Let’s see what a tri-surf plot looks like. We do not need a mesh grid for the tri-surf plot. Simple one-dimensional data works for the x and y directions.

Here is the code.

%matplotlib notebook
plt.figure(figsize=(8, 8))
ax = plt.axes(projection="3d")
x = df['peak-rpm']
y = df['city-mpg']
z = z_function(x, y)  # z_function as defined for the contour plot
ax.plot_trisurf(x, y, z,
                cmap='viridis', edgecolor='none')
ax.set_xlabel("Peak RPM")
ax.set_ylabel("City-MPG")
ax.set_title("Peak RPM vs City-MPG")
ax.view_init(60, 25)
plt.show()
Image by Author

Here is the full shape of the plot

Animation by Author
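The reason one-dimensional data is enough for a tri-surf plot is that the points are connected into triangles instead of being laid out on a rectangular grid. A minimal sketch with randomly scattered stand-in points (not the dataset) shows this; the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.tri import Triangulation

# Scattered 1-D stand-in data: no meshgrid involved
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 200)
y = rng.uniform(-3, 3, 200)
z = np.sin(np.sqrt(x**2 + y**2))

# The triangulation that connects the scattered (x, y) points
tri = Triangulation(x, y)
print(tri.triangles.shape)  # (number of triangles, 3 corner indices)

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection="3d")
surf = ax.plot_trisurf(x, y, z, cmap="viridis", edgecolor="none")
```

plot_trisurf builds such a triangulation itself when given plain x, y, and z arrays; the explicit Triangulation object here is only to show what the surface is made of.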

Conclusion

These are the five plots I wanted to share in this article. I hope you will be using them in your own projects.

Feel free to follow me on Twitter and check out my new YouTube channel.
