Remove rows which have all NAs in certain columns

Here are two dplyr options:

library(dplyr)
# data_frame() is deprecated; tibble() is its replacement and is re-exported by dplyr
df <- tibble(a = c(0, NA, 0, 4, NA, 0, 6),
             b = c(1, NA, 0, 4, NA, 0, NA),
             c = c(1, 0, 1, NA, NA, 0, NA))


# b and c are the columns to check: a row is dropped only if both are NA

df %>% 
  filter_at(vars(b, c), any_vars(!is.na(.)))

df %>% 
  filter_at(vars(b, c), any_vars(complete.cases(.)))

# A tibble: 5 x 3
      a     b     c
  <dbl> <dbl> <dbl>
1     0     1     1
2    NA    NA     0
3     0     0     1
4     4     4    NA
5     0     0     0

In newer versions of dplyr (1.0.4 and later), where filter_at() is superseded, use if_any() instead:

df %>% 
  filter(if_any(c(b, c), complete.cases))
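
The same condition can also be written as "not all of the checked columns are NA" with if_all(). The following is a minimal sketch rather than part of the original answer; it assumes dplyr 1.0.4 or later and reuses the example data from above. The names kept_any and kept_all are only illustrative.

```r
library(dplyr)

# Example data from the answer above
df <- tibble(a = c(0, NA, 0, 4, NA, 0, 6),
             b = c(1, NA, 0, 4, NA, 0, NA),
             c = c(1, 0, 1, NA, NA, 0, NA))

# Keep rows where at least one of b, c is not NA
kept_any <- df %>%
  filter(if_any(c(b, c), ~ !is.na(.x)))

# Equivalent: drop rows where all of b, c are NA
kept_all <- df %>%
  filter(!if_all(c(b, c), is.na))
```

Both forms should keep the same rows; which one reads better is mostly a matter of taste.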

You can use all() with apply() to find the rows where all values are NA, then negate the result to drop them:

x[!apply(is.na(x[,5:9]), 1, all),]

Or negate is.na() and test for any():

x[apply(!is.na(x[,5:9]), 1, any),]

Or use rowSums(), as @RHertel does, in a form where you don't need to know the number of selected columns:

x[rowSums(!is.na(x[,5:9])) > 0,]

This is a one-liner to remove the rows that have NA in all of columns 5 through 9. By combining rowSums() with is.na(), it is easy to check whether all entries in these 5 columns are NA:

x <- x[rowSums(is.na(x[, 5:9])) != 5, ]
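
The base R approach can be tried on a small data frame. This is only a sketch under assumptions: the data frame x and the vector cols below are made up for illustration, and columns are selected by name instead of by position 5:9. Comparing against length(cols) avoids hard-coding the number of checked columns, and drop = FALSE keeps the selection a data frame even if cols holds a single name.

```r
# Hypothetical data: only columns b and c are checked for NAs
x <- data.frame(
  a = c(0, NA, 0, 4, NA, 0, 6),
  b = c(1, NA, 0, 4, NA, 0, NA),
  c = c(1, 0, 1, NA, NA, 0, NA)
)
cols <- c("b", "c")

# TRUE for rows where every checked column is NA
all_na <- rowSums(is.na(x[, cols, drop = FALSE])) == length(cols)

# Keep the remaining rows
x_clean <- x[!all_na, ]
```

Note that base R subsetting keeps the original row names, so x_clean still refers to the rows by their positions in x.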
