Plotting multiple scatter plots pandas

Within the HoloViz ecosystem, there is a library called hvPlot that provides very nice high-level plotting functionality (built on HoloViews) and works out of the box with pandas:

import numpy as np
import hvplot.pandas  # importing this registers the .hvplot accessor on DataFrames
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 6), columns=['a', 'b', 'c', 'd', 'e', 'f'])

df.hvplot(x='a', y=['b', 'c', 'd', 'e'], kind='scatter')



Where would this bx be passed into?

You ought to repeat the second call to plot, not the first, so there is no need for bx.

In detail: plot takes an optional ax argument, which is the axes it draws into. If the argument is not provided, the function creates a new figure and axes. The function also returns the axes, so it can be reused for further drawing operations. The idea is to omit the ax argument in the first call to plot and pass the returned axes to all subsequent calls.

You can verify that each call to plot returns the same axes that it got passed:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(100, 6), columns=['a', 'b', 'c', 'd', 'e', 'f'])


ax1 = df.plot(kind='scatter', x='a', y='b', color='r')    
ax2 = df.plot(kind='scatter', x='c', y='d', color='g', ax=ax1)    
ax3 = df.plot(kind='scatter', x='e', y='f', color='b', ax=ax1)

print(ax1 == ax2 == ax3)  # True

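A variation on the same idea, as a minimal sketch: create the axes up front with matplotlib's plt.subplots() and pass it as ax to every call, including the first. The figure size, the pairs list, and the labels are illustrative assumptions, not part of the question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 6), columns=['a', 'b', 'c', 'd', 'e', 'f'])

# Create the figure and axes explicitly instead of letting the first plot call do it.
fig, ax = plt.subplots(figsize=(6, 4))

# (x column, y column, color) for each scatter layer; chosen for illustration only.
pairs = [('a', 'b', 'r'), ('c', 'd', 'g'), ('e', 'f', 'b')]

# Every call draws into the same axes and hands that axes back.
returned = [
    df.plot(kind='scatter', x=x, y=y, color=color, label=f'{y} vs. {x}', ax=ax)
    for x, y, color in pairs
]

print(all(r is ax for r in returned))
```

Holding on to fig as well makes it easy to save or resize the figure later. Call plt.show() at the end to display it interactively.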

Also, if the plot is the same graph, shouldn't the x-axis be consistently either 'a' or 'c'?

Not necessarily. Whether it makes sense to put different columns on the same axes depends on what data they represent. For example, if a were income and c were expenditures, it would make sense to put both on the same 'money' axis. In contrast, if a were a number of peas and c a voltage, they should probably not share an axis.
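If two columns measure unrelated quantities, one option is to give each its own axes in the same figure rather than forcing them onto a shared x-axis. The following is only a sketch under assumptions: the column names (peas, voltage, yield) and the random data are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: column names and value ranges are invented for this example.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'peas': rng.integers(0, 50, size=100),     # a count
    'voltage': rng.normal(230, 5, size=100),   # a voltage
    'yield': rng.normal(10, 2, size=100),      # the common y quantity
})

# One row, two columns of axes; the y-axis is shared because both panels plot 'yield'.
fig, (ax_peas, ax_volt) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))

# Each scatter plot goes into its own axes via the ax keyword.
df.plot(kind='scatter', x='peas', y='yield', color='g', ax=ax_peas)
df.plot(kind='scatter', x='voltage', y='yield', color='b', ax=ax_volt)
```

Each panel keeps its own x scale, so the pea counts and the voltages never have to be read off the same axis.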


You can plot any column against any other column. Whether that makes sense is for you to decide. For example, plotting a column denoting time on the same axis as a column denoting distance may not make sense, but plotting two columns that both contain distances on the same axis is fine.

To draw a plot on an already existing axes (ax), pass that axes via the ax keyword, as described in the documentation. Of course, you can create several plots on the same axes.

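# The first call creates the axes; the later calls reuse it via ax=ax.
ax = df.plot(kind="scatter", x="x", y="a", color="b", label="a vs. x")
# No kind is given here, so these two are drawn as line plots.
df.plot(x="x", y="b", color="r", label="b vs. x", ax=ax)
df.plot(x="x", y="c", color="g", label="c vs. x", ax=ax)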

A complete example:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0,6.3, 50)
a = (np.sin(x)+1)*3
b = (np.cos(x)+1)*3
c = np.ones_like(x)*3
d = np.exp(x)/100.
df = pd.DataFrame({"x":x, "a":a, "b":b, "c":c, "d":d})

# The first call creates the axes; every later call reuses it via ax=ax.
ax = df.plot(kind="scatter", x="x", y="a", color="b", label="a vs. x")
# No kind is given below, so these are drawn as line plots.
df.plot(x="x", y="b", color="r", label="b vs. x", ax=ax)
df.plot(x="x", y="c", color="g", label="c vs. x", ax=ax)
df.plot(x="d", y="x", color="orange", label="x vs. d", ax=ax)
df.plot(x="a", y="x", color="purple", label="x vs. a", ax=ax)

ax.set_xlabel("horizontal label")
ax.set_ylabel("vertical label")
plt.show()
