Symmetrical, violin plot-like histogram?

Here is one possibility using base graphics:

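# Compute one histogram per species without drawing it
tmp <- tapply( iris$Petal.Length, iris$Species, function(x) hist(x, plot=FALSE) )

# Set up an empty plot: one column per group, y-axis spanning all breaks
plot.new()
tmp.r <- do.call( range, lapply(tmp, `[[`, 'breaks') )
plot.window(xlim=c(1/2,length(tmp)+1/2), ylim=tmp.r)
abline(v=seq_along(tmp))

# Draw each bin as a rectangle centered on its group's vertical line,
# with width equal to the bin's relative frequency
for( i in seq_along(tmp) ) {
    h <- tmp[[i]]
    rf <- h$counts/sum(h$counts)
    rect( i-rf/2, head(h$breaks, -1), i+rf/2, tail(h$breaks, -1) )
}

axis(1, at=seq_along(tmp), labels=names(tmp))
axis(2)
box()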

You can tweak the different parts to your preferences and the whole thing could easily be wrapped into a function.
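
As a rough sketch of that wrapping, the steps above could be collected as follows. The function name hanoi_hist and its arguments are made up for illustration and do not come from any package. The sketch assumes that a vector of breaks, if supplied, spans the data of every group, as hist() requires.

```r
# Hypothetical wrapper around the base-graphics code above.
# x: numeric data, g: grouping factor, breaks: passed on to hist(),
# ...: further arguments for rect() such as col or border.
hanoi_hist <- function(x, g, breaks = "Sturges", ...) {
    # one histogram object per non-empty group, computed without plotting
    hs <- lapply(split(x, g, drop = TRUE), hist, breaks = breaks, plot = FALSE)

    # y-axis must cover the breaks of every group
    ylim <- range(unlist(lapply(hs, `[[`, "breaks")))

    plot.new()
    plot.window(xlim = c(0.5, length(hs) + 0.5), ylim = ylim)
    abline(v = seq_along(hs))

    for (i in seq_along(hs)) {
        h <- hs[[i]]
        rf <- h$counts / sum(h$counts)   # relative frequency per bin
        rect(i - rf/2, head(h$breaks, -1), i + rf/2, tail(h$breaks, -1), ...)
    }

    axis(1, at = seq_along(hs), labels = names(hs))
    axis(2)
    box()
    invisible(hs)   # return the histogram objects for inspection
}

hanoi_hist(iris$Petal.Length, iris$Species, col = "grey80")
```

Returning the histogram objects invisibly is just one design choice; it makes it easy to check the breaks and counts that were actually drawn.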


Here is a Lattice panel function based on @GregSnow's base-graphics answer. I couldn't have done it without the solid starting point Greg provided, so all credit goes to him. The panel function is not very sophisticated and could well break on something simple, but it handles both horizontal and vertical orientations, accepts an optional vector of breaks, and removes empty bins at either end. For breaks it follows the default behavior of hist rather than that of Lattice's histogram, which is more complicated. Comments about better methods are welcome.

Since symmetric or centered histograms don't have an existing name, as far as I know, and they're reminiscent of a Tower of Hanoi toy, perhaps they should be called "Tower of Hanoi histograms". Thus the function is called panel.hanoi.

Simple usage example using the definition of df30 above:

bwplot(crime.v ~ bias, data=df30, panel=panel.hanoi)

Here is a more complex example using data provided in the link in the question (graphic at end of answer).

bwplot(crime.v ~ bias, data=df, ylim=c(-1,1), pch="|", coef=0,
       panel=function(...) {
         panel.hanoi(col="pink", breaks=cv.ints, ...)
         panel.bwplot(...)
       })

This example adds ylim to specify that the plot should go from -1 to 1, and overlays a bwplot on top of the Hanoi plot. pch and coef affect the appearance of the bwplot. The example also uses the following definition to center each box of the Hanoi plot around the locations where my data points tend to lie (see original question):

cv.ints <- c(-1.000000000, -0.960000012, -0.822307704, -0.684615396, -0.546923088, -0.409230781, -0.271538473, -0.133846165, 0.003846142, 0.141538450, 0.279230758, 0.416923065, 0.554615373, 0.692307681, 0.829999988, 0.967692296, 1.000000000)

Here is the panel function:

panel.hanoi <- function(x, y, horizontal, breaks="Sturges", ...) {  # "Sturges" is hist()'s default

  if (horizontal) {
    condvar <- y # conditioning ("independent") variable
    datavar <- x # data ("dependent") variable
  } else {
    condvar <- x
    datavar <- y
  }

  conds <- sort(unique(condvar))

  # loop through the possible values of the conditioning variable
  for (i in seq_along(conds)) {

    h <- hist(datavar[condvar == conds[i]], breaks=breaks, plot=FALSE) # use base hist() to compute breaks and counts without plotting

    # strip outer counts == 0, and corresponding bins
    brks.cnts <- stripOuterZeros(h$breaks, h$counts)
    brks <- brks.cnts[[1]]
    cnts <- brks.cnts[[2]]

    halfrelfs <- (cnts/sum(cnts))/2  # i.e. half of the relative frequency
    center <- as.numeric(conds[i])  # axis position of this level; equals i when every level is present

    # The arguments passed to panel.rect will usually be vectors, so panel.rect draws one rectangle per bin.
    if (horizontal) {
      panel.rect(head(brks, -1), center - halfrelfs, tail(brks, -1), center + halfrelfs, ...)
    } else {
      panel.rect(center - halfrelfs, head(brks, -1), center + halfrelfs, tail(brks, -1), ...)
    }
  }
}

# function to strip counts that are all zero on ends of data, along with the corresponding breaks
stripOuterZeros <- function(brks, cnts) { do.call("stripLeftZeros", stripRightZeros(brks, cnts)) }

stripLeftZeros <- function(brks, cnts) {
  if (cnts[1] == 0) {
    stripLeftZeros(brks[-1], cnts[-1])
  } else {
    list(brks, cnts)
  }
}

stripRightZeros <- function(brks, cnts) {
  len <- length(cnts)
  if (cnts[len] == 0) {
    stripRightZeros(brks[-(len+1)], cnts[-len])
  } else {
    list(brks, cnts)
  }
}
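
To make the stripping step concrete, here is a small non-recursive sketch that should give the same result as stripOuterZeros for any histogram with at least one non-empty bin. The name strip_outer_zeros is hypothetical and is not used by panel.hanoi above. Note that empty bins between non-empty ones are kept; only the outer ones go.

```r
# Hypothetical non-recursive counterpart of stripOuterZeros().
# Assumes at least one count is non-zero, which holds for any
# histogram computed by hist() from non-empty data.
strip_outer_zeros <- function(brks, cnts) {
    nz <- which(cnts != 0)           # indices of non-empty bins
    keep <- seq(min(nz), max(nz))    # first through last non-empty bin
    # n bins are bounded by n + 1 breaks, so keep one extra break on the right
    list(brks[c(keep, max(keep) + 1)], cnts[keep])
}

h <- hist(c(2.5, 2.7, 5.5), breaks = 0:8, plot = FALSE)
h$counts                                  # 0 0 2 0 0 1 0 0
out <- strip_outer_zeros(h$breaks, h$counts)
out[[1]]                                  # breaks 2 3 4 5 6
out[[2]]                                  # counts 2 0 0 1
```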

[Figure: Tower of Hanoi histograms with overlaid bwplots]

Tags:

R

Lattice