No outliers in ggplot boxplot with facet_wrap

Setting outlier.size=NA only makes the outliers invisible; it does not tell ggplot to ignore them when the boxplots are drawn. The plots are therefore still generated with the (invisible) outliers taken into account. There seems to be no built-in option for what you want. To get the boxplots you need, I would calculate the quantiles myself and generate the boxplots from them, as in the following example:

library(ggplot2)

# Five boxplot statistics (whisker ends, hinges, median) for every cut x clarity cell
stat <- tapply(diamonds$price, list(diamonds$cut, diamonds$clarity),
               function(x) boxplot.stats(x)$stats)
cuts <- dimnames(stat)[[1]]
clarities <- dimnames(stat)[[2]]

# unlist() walks the matrix column by column, so clarity varies slowest
df <- data.frame(
  cut = rep(rep(cuts, each = 5), length(clarities)),
  clarity = rep(clarities, each = 5 * length(cuts)),
  price = unlist(stat))

ggplot(df, aes(x = cut, y = price, fill = cut)) +
  geom_boxplot() +
  facet_wrap(~clarity, scales = "free")

This gives the following plot (note that the order of the categories in the plot is different now):

[Figure: boxplots of price by cut, one panel per clarity, drawn without outliers]
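
The changed order comes from cut and clarity no longer being ordered factors in df, so the categories are sorted alphabetically. A minimal sketch of one way to restore the original order is to turn the column back into a factor with explicit levels. The toy vector below is a made-up stand-in for df$cut, and the levels are written out by hand on the assumption that they match levels(diamonds$cut):

```r
# Assumed to match levels(diamonds$cut); written out so the sketch runs without ggplot2
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Made-up stand-in for df$cut, which holds the category names as plain strings
cut_chr <- rep(c("Fair", "Good", "Ideal", "Premium", "Very Good"), each = 5)

# Alphabetical order is what the plot falls back to by default
default_order <- levels(factor(cut_chr))

# Explicit levels restore the order used in the original data
cut_fac <- factor(cut_chr, levels = cut_levels)
restored_order <- levels(cut_fac)
```

In the example above this would amount to df$cut <- factor(df$cut, levels = levels(diamonds$cut)) before plotting, and the same for clarity.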


It can be done with stat_summary and a custom function that calculates the statistics:

calc_boxplot_stat <- function(x) {
  coef <- 1.5
  # drop missing values so the comparisons below are well defined
  x <- x[!is.na(x)]
  # calculate quantiles
  stats <- quantile(x, probs = c(0.0, 0.25, 0.5, 0.75, 1.0))
  names(stats) <- c("ymin", "lower", "middle", "upper", "ymax")
  iqr <- diff(stats[c(2, 4)])
  # set whiskers to the most extreme points within coef * IQR of the box
  outliers <- x < (stats[2] - coef * iqr) | x > (stats[4] + coef * iqr)
  if (any(outliers)) {
    stats[c(1, 5)] <- range(c(stats[2:4], x[!outliers]))
  }
  return(stats)
}

ggplot(diamonds, aes(x=cut, y=price, fill=cut)) + 
    stat_summary(fun.data = calc_boxplot_stat, geom="boxplot") + 
    facet_wrap(~clarity, scales="free")

[Figure: output of the stat_summary version]

The statistics function is generic, so no data manipulation is needed before plotting.
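
Because the function only takes a numeric vector, it can also be checked outside of ggplot. The sketch below repeats the function in compact form so that it runs on its own, and compares it with boxplot.stats on a made-up vector containing one extreme value:

```r
# Compact restatement of calc_boxplot_stat, so this block is self-contained
calc_boxplot_stat <- function(x, coef = 1.5) {
  x <- x[!is.na(x)]
  stats <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
  names(stats) <- c("ymin", "lower", "middle", "upper", "ymax")
  iqr <- diff(stats[c(2, 4)])
  outliers <- x < (stats[2] - coef * iqr) | x > (stats[4] + coef * iqr)
  if (any(outliers)) {
    stats[c(1, 5)] <- range(c(stats[2:4], x[!outliers]))
  }
  stats
}

# Toy data, invented for illustration: four ordinary values and one extreme one
x <- c(1, 2, 3, 4, 100)

custom  <- calc_boxplot_stat(x)     # ymin = 1, lower = 2, middle = 3, upper = 4, ymax = 4
builtin <- boxplot.stats(x)$stats   # 1 2 3 4 4

print(custom)
print(builtin)
```

In both cases the upper whisker stops at 4 rather than 100. The two do not always agree exactly, because boxplot.stats takes its hinges from fivenum while quantile interpolates: for x <- 1:4 the lower hinge is 1.5 but the 25% quantile is 1.75.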

It is also possible to set the whiskers at the 10% and 90% quantiles:

calc_stat <- function(x) {
  # calculate quantiles; the whiskers end at the 10% and 90% quantiles
  stats <- quantile(x, probs = c(0.1, 0.25, 0.5, 0.75, 0.9), na.rm = TRUE)
  names(stats) <- c("ymin", "lower", "middle", "upper", "ymax")
  return(stats)
}

ggplot(diamonds, aes(x=cut, y=price, fill=cut)) + 
    stat_summary(fun.data = calc_stat, geom="boxplot") + 
    facet_wrap(~clarity, scales="free")

[Figure: output with 10% and 90% whiskers]
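
If other whisker positions are needed, the two outer quantiles can be turned into arguments. The following is only a sketch: make_whisker_stat is a hypothetical helper name, not part of ggplot2, and it returns a function of the same shape as calc_stat:

```r
# Hypothetical helper (not part of ggplot2): builds a summary function whose
# whiskers end at the given lower and upper quantiles
make_whisker_stat <- function(lower = 0.1, upper = 0.9) {
  # the whiskers must lie outside the box, which spans the 25% to 75% quantiles
  stopifnot(lower >= 0, lower <= 0.25, upper >= 0.75, upper <= 1)
  function(x) {
    stats <- quantile(x, probs = c(lower, 0.25, 0.5, 0.75, upper), na.rm = TRUE)
    names(stats) <- c("ymin", "lower", "middle", "upper", "ymax")
    stats
  }
}

# Toy data, invented for illustration: the integers 0 to 100
x <- 0:100
stat_5_95 <- make_whisker_stat(0.05, 0.95)
res <- stat_5_95(x)
print(res)
```

The returned function can be passed to stat_summary in the same way as before, for example stat_summary(fun.data = make_whisker_stat(0.05, 0.95), geom = "boxplot").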

Tags: R, ggplot2