Randomly sampling points in R with minimum distance constraint?

If I understand you correctly, you want to draw a distance-constrained random sample from your data for each observation in the data. This is akin to a K nearest neighbor analysis.

Here is an example workflow that will create a kNN random sample, using a minimum distance constraint, and add the corresponding rowname back to your data.

Add libraries and example data

library(sp)
data(meuse)
coordinates(meuse) <- ~x+y

Calculate a distance matrix using spDists

 dmat <- spDists(meuse)

Define the minimum sample distance and set every distance at or below it to NA in the distance matrix. This is where you would impose any other type of constraint, such as a distance range.

min.dist <- 500 
dmat[dmat <= min.dist] <- NA
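
As a minimal sketch of the distance-range idea mentioned above, the same masking step can take an upper bound as well. The coordinates, the object names (xy.demo, dmat.demo) and the max.dist value are made up for illustration and are not part of the original workflow; base R's dist() stands in for spDists() so the snippet runs without extra packages.

```r
# Hypothetical example data: 50 random points in a 1000 x 1000 square
set.seed(42)
xy.demo <- cbind(x = runif(50, 0, 1000), y = runif(50, 0, 1000))

# Full symmetric Euclidean distance matrix
dmat.demo <- as.matrix(dist(xy.demo))

min.dist <- 200   # lower bound (assumed value)
max.dist <- 500   # upper bound (assumed value, not in the original answer)

# Mask everything outside the allowed range; the zero diagonal is masked too
dmat.demo[dmat.demo <= min.dist | dmat.demo > max.dist] <- NA

# Number of admissible neighbours for each point
n.candidates <- rowSums(!is.na(dmat.demo))
```

Any row with n.candidates equal to 0 has no admissible neighbour, which is the case the NA handling in the sampling loop is there for.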

Here we iterate through each row in the distance matrix and draw one random sample from the values that are not NA. The "samples" object is a data.frame where ID holds the rownames of the source object and kNN holds the rowname of the randomly selected neighbor beyond the minimum distance. Note: there is some NA handling added in case no neighbor is found, which can happen with distance constraints.

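samples <- data.frame(ID = rownames(meuse@data), kNN = NA_character_,
                      stringsAsFactors = FALSE)
for (i in seq_len(nrow(dmat))) {
  x <- dmat[i, ]                # distances from observation i to all others
  names(x) <- samples$ID
  x <- x[!is.na(x)]             # keep only candidates beyond min.dist
  if (length(x) > 0) {
    # draw one candidate rowname; kNN stays NA if no candidate exists
    samples$kNN[i] <- sample(names(x), 1)
  }
}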

We can then add the kNN column, containing the rowname of the sampled neighbor, to the original data.

meuse@data <- data.frame(meuse@data, kNN=samples$kNN)
  head(meuse@data)

We could also subset the unique nearest neighbor observations.

meuse.sub <- meuse[which(rownames(meuse@data) %in% unique(samples$kNN)),]

There are more elegant ways to perform this analysis, but this workflow gets the general idea across. For a more advanced solution, I would recommend taking a hard look at the dnearneigh and knearneigh functions in the spdep package.
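
A rough sketch of what the dnearneigh route could look like, assuming the sp and spdep packages are installed. The upper bound max.d and the names nb, pick and kNN.nb are introduced here for illustration only. Note that current spdep versions treat the lower bound d1 as inclusive, so this is not an exact replacement for the `<=` mask used above.

```r
library(sp)
library(spdep)

data(meuse)
coordinates(meuse) <- ~x+y

# Use the largest pairwise distance as the upper bound so that no point
# is excluded for being too far away
max.d <- max(spDists(meuse))

# Neighbour list: for each point, the indices of all points within [500, max.d]
nb <- dnearneigh(coordinates(meuse), d1 = 500, d2 = max.d,
                 row.names = rownames(meuse@data))

# Draw one neighbour index per point; spdep codes "no neighbour" as a single 0
set.seed(1)
pick <- vapply(nb, function(j) {
  if (length(j) == 1L && j[1] == 0L) return(NA_integer_)
  j[sample.int(length(j), 1)]
}, integer(1))

# Rownames of the sampled neighbours
kNN.nb <- rownames(meuse@data)[pick]
```

The neighbour list avoids holding a masked copy of the full distance matrix and can be reused for other spdep analyses.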


You can do this with either R or ArcGIS.

With ArcGIS, first create a feature class (e.g. a shapefile) from your grid coordinates. Then use this grid feature class as the constraining feature class (the constraining_feature_class parameter) of the "Create Random Points" tool.

The only coding you have to do is to put that tool in a loop, which can be done with ModelBuilder or ArcPy.

Here is a sample (100 iterations):

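import arcpy

# Create 100 independent sets of 500 random points inside the study area
for i in range(100):
    print(i)
    # Use a distinct output name per iteration so earlier results are not overwritten.
    # The sixth argument (minimum_allowed_distance) is left empty here;
    # set it to enforce a minimum spacing between points.
    arcpy.CreateRandomPoints_management("c:/data/project", "samplepoints_{0}".format(i), "c:/data/studyarea.shp", "", 500, "", "POINT", "")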

For R, use genrandompnts from the spatialecology website. This tool is similar to the ArcGIS "Create Random Points" tool.

In addition, there is another thread similar to your question:

How to create random points outside polygons?


If one is interested in sampling points with a distance constraint for each polygon, there are now two nice and fast options: (1) the QGIS "Random points inside polygons" algorithm via the RQGIS package, or (2) the spatstat::rSSI function.

Examples for (1) RQGIS and (2) spatstat::rSSI follow:

## load relevant packages
if(!require("pacman")) install.packages("pacman")
pacman::p_load(sf, sp, rgdal, dplyr, mapview, spatstat, maptools, devtools)


## load data and convert to sf
columbus <- readOGR(system.file("shapes/columbus.shp", package="spData")[1]) %>%
              sf::st_as_sf(.)



## start random sampling with distance constraint

# # # # # # # # # # # # # # # # # # # #
# (1) ... using RQGIS
# # # # # # # # # # # # # # # # # # # #

devtools::install_github("jannes-m/RQGIS")
library("RQGIS")

# ... open QGIS tunnel
RQGIS::open_app() # QGIS must be installed 


# ... find suitable algorithm
RQGIS::find_algorithms(search_term = "random")
# [6] "Random points inside polygons (fixed)
# ---------------->qgis:randompointsinsidepolygonsfixed"                                          

# [7] "Random points inside polygons (variable)
# ------------->qgis:randompointsinsidepolygonsvariable"  


# ... check usage
RQGIS::get_usage(alg = "qgis:randompointsinsidepolygonsfixed")
RQGIS::get_args_man(alg = "qgis:randompointsinsidepolygonsfixed")


# ... process random points with minimum distance (0.25 degree)
#     using a fixed maximum number (10)
rdnmPts.RQGIS <- RQGIS::run_qgis(alg = "qgis:randompointsinsidepolygonsfixed",
                                 show_output_paths = TRUE,
                                 load_output = TRUE, params = list(
                                  VECTOR =  columbus, MIN_DISTANCE = "0.25", 
                                  VALUE = "10",OUTPUT = "rndm_pts.shp"))


# ... take a look at the result
mapview::mapview(list(rdnmPts.RQGIS, columbus))


# ... check number of random points
nrow(rdnmPts.RQGIS)
# [1] 187



# # # # # # # # # # # # # # # # # # # #
# (2) ... using spatstat::rSSI ------------------------------------
# # # # # # # # # # # # # # # # # # # #

# spatstat::rSSI uses a special format input. Therefore, a function is created
# to transform the simple feature to owin-format.


# init function
genRandomPtsDist <- function(x, seed = 123, dist = 10, n = Inf, 
                             maxit = 100, quiet = TRUE, ...)
{

  # get start time of process
  process.time.start <- proc.time()

  # get crs
  crs <- sf::st_crs(x = x)

  # convert simple feature to spatial polygons
  x.sp <- x %>% as(., "Spatial") %>%  as(., "SpatialPolygons")


  # convert to owin object
  x.owin <- x.sp %>%
    slot(., "polygons") %>%
    lapply(X = ., FUN = function(x){sp::SpatialPolygons(list(x))}) %>%
    lapply(X = ., FUN = spatstat::as.owin)


  # generate random sampling with distance constraint (can be parallelized)
  pts.ppp <- lapply(X = 1:length(x.owin), FUN = function(i, x.owin, r, n, quiet, 
                                                         seed, maxit, ...)
  {
    if(quiet == FALSE) cat("Run ", i, " of ", length(x.owin), "\n")
    set.seed(seed)
    spatstat::rSSI(r = r, n = n, giveup = maxit, win = x.owin[[i]], ...)
  }, quiet = quiet, x.owin = x.owin, r = dist, n = n, seed = seed, maxit = maxit, ...)


  # back-conversion to simple feature
  pts.sf <- pts.ppp %>%
    lapply(X = ., FUN = function(x) sf::st_as_sfc(as(x, "SpatialPoints"))) %>%
    do.call(c, .) %>%
    sf::st_sf(., crs = crs)


  # get intersected items
  pts.inter.x <- sf::st_intersects(x = pts.sf, y = x) %>% unlist

  if(length(pts.inter.x) != nrow(pts.sf))
  {
    warning("Some sample points are outside a polygon")
  } else{
    pts.sf$In <- pts.inter.x
  }


  # get time of process
  process.time.run <- proc.time() - process.time.start
  if(quiet == FALSE) cat(paste0("------ Run of genRandomPtsDist: " , 
    round(x = process.time.run["elapsed"][[1]]/60, digits = 3), " Minutes ------\n"))

  return(pts.sf)
} # end of function



## Using the defined function, now one can generate a random sample.
# ... process random points with minimum distance (0.25 degree) 
#     using a fixed maximum number (10)
rdnmPts.rSSI <- genRandomPtsDist(x = columbus, dist = 0.25, n = 10, quiet = FALSE)


# ... take a look at the result
mapview::mapview(list(rdnmPts.rSSI, columbus))


# ... check number of random points
nrow(rdnmPts.rSSI)
# [1] 171
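
For readers wondering what rSSI does: it simulates a simple sequential inhibition process, in which points are proposed at random one at a time and a proposal is kept only if it lies at least r away from every point already accepted. The base R sketch below illustrates that idea on a plain rectangle. The helper ssi_rect and its arguments are hypothetical names used for illustration. It is not spatstat's implementation, does not handle polygon windows, and interprets giveup as a limit on consecutive rejections, which is an assumption.

```r
# Simple sequential inhibition on a rectangle (illustrative helper only)
ssi_rect <- function(n, r, xlim = c(0, 1), ylim = c(0, 1), giveup = 1000) {
  pts <- matrix(numeric(0), ncol = 2, dimnames = list(NULL, c("x", "y")))
  fails <- 0
  # stop when n points are accepted or after `giveup` consecutive rejections
  while (nrow(pts) < n && fails < giveup) {
    # propose a uniformly distributed point
    p <- c(runif(1, xlim[1], xlim[2]), runif(1, ylim[1], ylim[2]))
    # squared distances from the proposal to all accepted points
    d2 <- (pts[, 1] - p[1])^2 + (pts[, 2] - p[2])^2
    if (all(d2 >= r^2)) {
      pts <- rbind(pts, p, deparse.level = 0)  # accept the proposal
      fails <- 0
    } else {
      fails <- fails + 1                       # reject and try again
    }
  }
  pts
}

set.seed(123)
pts <- ssi_rect(n = 10, r = 0.25)
```

Because proposals are rejected rather than moved, the function may return fewer than n points when r is large relative to the window, which is also why the two approaches above report fewer points than 10 per polygon.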
