How to get count of non-NA raster cells within polygon using R

The example data from Jeffrey

library(raster)
r <- raster(ncols=10, nrows=10)
set.seed(0)
x <- runif(ncell(r))
x[round(runif(25,1,100),digits=0)] <- NA
r[] <- x
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1),  Polygons(list(Polygon(cds2)), 2)))
polys <- SpatialPolygonsDataFrame(polys, data.frame(ID=sapply(slot(polys, "polygons"), function(x) slot(x, "ID"))))

Now use extract

extract(r, polys, fun=function(x, ...) length(na.omit(x))/length(x))
#[1] 0.8333333 0.6666667

If you have many rasters, first use stack to combine them (if they have the same extent and resolution)

To get the actual polygon area you should not use the slot(i, 'area') approach. For planar data you can use rgeos::gArea(polys, byid=TRUE) For spherical data (lon/lat) you can use geosphere::areaPolygon


I am not sure if you want the ratio based on the "real value" of the polygon(s) areas or the areas of the cells intersecting them. Here is some example code that uses all cells intersecting the polygons (basically, ratio of NA cells to non-NA cells). It is a dummy example and you will need to write your own function.

    # Create some example data
    require(raster)
    require(sp)

    r <- raster(ncols=10, nrows=10)
      x <- runif(ncell(r))
        x[round(runif(25,1,100),digits=0)] <- NA
          r[] <- x
      cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
        cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
          polys <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1), 
                                   Polygons(list(Polygon(cds2)), 2)))
            polys <- SpatialPolygonsDataFrame(polys, data.frame(ID=sapply(slot(polys, "polygons"), 
                                              function(x) slot(x, "ID"))))
plot(r)
  plot(polys, add=TRUE)

You can use this code snippet to add an area column to your polygon data by extracting from the area slot. This could be used if you want to ratio using the "real" polygon area.

# Add area of polygon(s)
polys@data <- data.frame(polys@data, Area=sapply(slot(polys, 'polygons'), 
                         function(i) slot(i, 'area')))  

The most efficient, and considerable faster, alternative to for loops are "apply" like functions. There are a number of these available in R that are utilized for different object classes or data structures. In this case, since extract returns a list, we will use lapply (list apply). This is a way to apply a base or custom function to a list object. The the object class stored in the list is a vector, the function is quite straight forward. If you use extract on a brick or stack raster object the resulting objects stored in the list would be data.frame objects.

# On a single raster object, extract returns list object with stored vectors.                           
( vList <- extract(r, polys, na.rm=FALSE) )
  class(vList)

# Use lapply to apply function that calculates ratio of NA to non-NA values
#   wrapping lapply in unlist() collapses result into a vector  
aRatio <- function(x) { if(length(x[is.na(x)]) > 0) (length(x[is.na(x)]) / length(x[!is.na(x)])) else 0 }  
  ( vArea <- unlist( lapply(vList, FUN=aRatio ) ) )

# Assign ordered vector back to polygons
polys@data <- data.frame(polys@data, NAratio=vArea)
  str(polys@data)