How to take a subset from a netCDF file using latitude/longitude boundaries in R

In principle you are 2/3 of the way there. You can of course create the starting indices using something like this:

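library(ncdf4)

# MyNetCDF is the path to your netCDF file
ncFile <- nc_open( MyNetCDF )
# Index of the first longitude / latitude of the box
# (assumes these exact values exist on the grid)
LonStartIdx <- which( ncFile$dim$lon$vals == 355)
LatStartIdx <- which( ncFile$dim$lat$vals == 34.5)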

Work out the counts (the number of grid points to read along each dimension) in the same way. Then read the variable you want:

MyVariable <- ncvar_get( ncFile, varName, start=c( LonStartIdx, LatStartIdx), count=...)
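
One way to get both the start index and the count is to derive them from the same logical test. This is only a minimal sketch: the helper name hyperslab is hypothetical, the latitude axis is made up to stand in for ncFile$dim$lat$vals, and it assumes the selected indices form one contiguous run (so no wrap-around).

```r
# Hypothetical helper: turn coordinate bounds into start/count for one dimension.
# Assumes the matching indices form a single contiguous run.
hyperslab <- function(vals, lower, upper) {
  idx <- which(vals >= lower & vals <= upper)
  if (length(idx) == 0) stop("no coordinates inside the requested bounds")
  if (any(diff(idx) != 1)) stop("selection is not contiguous")
  list(start = idx[1], count = length(idx))
}

# Made-up 1.5-degree latitude axis standing in for ncFile$dim$lat$vals
lat <- seq(30, 60, by = 1.5)
latSlab <- hyperslab(lat, 34.5, 44.5)
# latSlab$start is 4 and latSlab$count is 7 for this made-up axis
```

For a 2-D variable the results for each dimension would then go into start=c(lonSlab$start, latSlab$start) and count=c(lonSlab$count, latSlab$count). If the variable has further dimensions (time, for example), they need entries too; ncvar_get treats a count of -1 as "read everything along this dimension".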

However, as far as I know, you are out of luck in your case. ncvar_get reads one contiguous block per call, defined by start and count. Your longitudes run from 0 to 360 and the box you are interested in contains the zero meridian, so it wraps around the edge of the grid and cannot be described by a single start/count pair.

For you (assuming you do not have too much data) it would make more sense to read the full grid into R and then cut out your "box" there, either with subset or with indices created by which:

ncFile <- nc_open( MyNetCDF )
LonIdx <- which( ncFile$dim$lon$vals > 355 | ncFile$dim$lon$vals < 10)
LatIdx <- which( ncFile$dim$lat$vals > 34.5 & ncFile$dim$lat$vals < 44.5)
MyVariable <- ncvar_get( ncFile, varName)[ LonIdx, LatIdx]
nc_close(ncFile)
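
To see what this selection does without touching a file, here is a small sketch on made-up data. The 1-degree longitude axis, the 0.5-degree latitude axis and the field matrix are all assumptions standing in for the dimension values and the ncvar_get result.

```r
# Made-up axes standing in for ncFile$dim$lon$vals and ncFile$dim$lat$vals
lon <- seq(0, 359, by = 1)
lat <- seq(30, 50, by = 0.5)
# Fake [lon x lat] field standing in for ncvar_get(ncFile, varName)
field <- matrix(seq_len(length(lon) * length(lat)), nrow = length(lon))

# The box straddles the zero meridian, so both ends of the axis are combined with |
LonIdx <- which(lon > 355 | lon < 10)
LatIdx <- which(lat > 34.5 & lat < 44.5)

# drop = FALSE keeps the result a matrix even if one dimension has length 1
box <- field[LonIdx, LatIdx, drop = FALSE]
dim(box)
lon[LonIdx]
```

Note that which returns indices in ascending order, so lon[LonIdx] starts with the longitudes just east of the meridian (0 to 9) and ends with the ones just west of it (356 to 359).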

Remark: I prefer ncdf4; I find the syntax a bit easier to remember (and it has another advantage over the older netcdf R package that I have forgotten...)

OK. Comments cannot be as long as I would need them to be, so I updated the answer. No worries. Let's go through the questions step by step.

  • The approach using which will work. I use it myself.
  • The data will be in a similar format to the netCDF file, but I am not too sure whether there is a problem with the zero meridian (I guess yes). You might have to swap the two halves by doing something like this (replace the corresponding line in the 2nd example)

    LonIdx <- c(which( ncFile$dim$lon$vals > 355) , which( ncFile$dim$lon$vals < 10) )
    

    This changes the order of the coordinate indices so that the Western part comes first and then the Eastern.

  • Reformatting everything to a 2x3 data frame is possible. Take the data my 2nd code example returns (it will be a matrix, [lon x lat]). Also get the values of the coordinates from

    lon <- ncFile$dim$lon$vals[LonIdx]
    

    (or whatever longitude is called in your file; do the same for lat). Then assemble the matrix using

    cbind( rep(lat, each=length(lon)), rep(lon,length(lat)), c(MyVariable) )
    
  • The coordinates will of course be the same as in the netcdf file...

You need to sanity check the last cbind, as I am only about 98% confident that I have not messed up the coordinates. In the R scripts I found on my desktop I use loops, which are... evil... This should be (a bit?) faster and is also more sensible.
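
One possible way to do that sanity check is to run the same cbind on a made-up matrix whose values encode their own coordinates. Everything here (the coordinates, the 1000 * lon + lat encoding, the column names) is an assumption for illustration, not data from your file.

```r
# Made-up coordinates, already ordered west to east across the zero meridian
lon <- c(356, 358, 0, 2)
lat <- c(36, 38, 40)

# Fake [lon x lat] matrix whose values encode their own coordinates
MyVariable <- outer(lon, lat, function(x, y) 1000 * x + y)

# Same assembly as the cbind above: c() unrolls the matrix column by column,
# so lon varies fastest and lat changes once per full sweep of lon
long <- cbind(lat = rep(lat, each = length(lon)),
              lon = rep(lon, length(lat)),
              value = c(MyVariable))

# Every row should reproduce its own encoded value
ok <- all(long[, "value"] == 1000 * long[, "lon"] + long[, "lat"])
ok
```

If ok comes out TRUE, the coordinates line up with the values. as.data.frame(long) turns the result into a data frame; expand.grid(lon = lon, lat = lat) should give the coordinate columns in the same row order, since its first argument also varies fastest.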


You can also use CDO to extract the area from the bash command line first and then read the file in R:

cdo sellonlatbox,-5,12,34.5,44.5 in.nc out.nc 

I note from the discussion above that there was a problem with the order of the latitudes. You can also use the CDO operator "invertlat" to sort that out for you.
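
If you would rather fix the latitude order after reading the data into R, reversing the latitude dimension should have a similar effect for a 2-D field. A minimal sketch on made-up data, assuming a [lon x lat] matrix whose latitude axis is stored north to south:

```r
lon <- c(0, 5, 10)
lat <- c(44, 40, 36)   # made-up axis, stored north to south
# Fake [lon x lat] data whose values encode their own coordinates
field <- outer(lon, lat, function(x, y) 1000 * x + y)

# Reverse the latitude axis and the matching matrix columns together
latSorted <- rev(lat)
fieldSorted <- field[, rev(seq_along(lat)), drop = FALSE]
```

The coordinate vector and the data have to be reversed in the same step, otherwise the values end up attached to the wrong latitudes.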