matplotlib: plotting histogram plot just above scatter plot

I strongly recommend to flip the right histogram by adding these 3 lines of code to the current best answer before plt.show() :

ax_yDist.invert_xaxis()
ax_yDist.yaxis.tick_right()
ax_yCumDist.invert_xaxis()

after flipping the right histogram

The advantage is that any person who is visualizing it can compare easily the two histograms just by moving and rotating clockwise the right histogram on their mind.

On contrast, in the plot of the question and in all other answers, if you want to compare the two histograms, your first reaction is to rotate the right histogram counterclockwise, which leads to wrong conclusions because the y axis gets inverted. Indeed, the right CDF of the current best answer looks decreasing at first sight:

before flipping the right histogram


Here's an example of how to do it, using gridspec.GridSpec:

import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
import numpy as np

x = np.random.rand(50)
y = np.random.rand(50)

fig = plt.figure()

gs = GridSpec(4,4)

ax_joint = fig.add_subplot(gs[1:4,0:3])
ax_marg_x = fig.add_subplot(gs[0,0:3])
ax_marg_y = fig.add_subplot(gs[1:4,3])

ax_joint.scatter(x,y)
ax_marg_x.hist(x)
ax_marg_y.hist(y,orientation="horizontal")

# Turn off tick labels on marginals
plt.setp(ax_marg_x.get_xticklabels(), visible=False)
plt.setp(ax_marg_y.get_yticklabels(), visible=False)

# Set labels on joint
ax_joint.set_xlabel('Joint x label')
ax_joint.set_ylabel('Joint y label')

# Set labels on marginals
ax_marg_y.set_xlabel('Marginal x label')
ax_marg_x.set_ylabel('Marginal y label')
plt.show()

enter image description here


I encountered the same problem today. Additionally I wanted a CDF for the marginals.

enter image description here

Code:

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np

x = np.random.beta(2,5,size=int(1e4))
y = np.random.randn(int(1e4))

fig = plt.figure(figsize=(8,8))
gs = gridspec.GridSpec(3, 3)
ax_main = plt.subplot(gs[1:3, :2])
ax_xDist = plt.subplot(gs[0, :2],sharex=ax_main)
ax_yDist = plt.subplot(gs[1:3, 2],sharey=ax_main)
    
ax_main.scatter(x,y,marker='.')
ax_main.set(xlabel="x data", ylabel="y data")

ax_xDist.hist(x,bins=100,align='mid')
ax_xDist.set(ylabel='count')
ax_xCumDist = ax_xDist.twinx()
ax_xCumDist.hist(x,bins=100,cumulative=True,histtype='step',density=True,color='r',align='mid')
ax_xCumDist.tick_params('y', colors='r')
ax_xCumDist.set_ylabel('cumulative',color='r')

ax_yDist.hist(y,bins=100,orientation='horizontal',align='mid')
ax_yDist.set(xlabel='count')
ax_yCumDist = ax_yDist.twiny()
ax_yCumDist.hist(y,bins=100,cumulative=True,histtype='step',density=True,color='r',align='mid',orientation='horizontal')
ax_yCumDist.tick_params('x', colors='r')
ax_yCumDist.set_xlabel('cumulative',color='r')

plt.show()

Hope it helps the next person searching for scatter-plot with marginal distribution.