Working with PostGIS data in R?

If you have PostGIS driver capability in the rgdal package then its just a question of creating a connection string and using that. Here I'm connecting to my local database gis using default credentials, so my DSN is rather simple. You might need to add a host, username, or password. See gdal docs for info.

> require(rgdal)
> dsn="PG:dbname='gis'"

What tables are in that database?

> ogrListLayers(dsn)
 [1] "ccsm_polygons"         "nongp"                 "WrldTZA"              
 [4] "nongpritalin"          "ritalinmerge"          "metforminmergev"      

Get one:

> polys = readOGR(dsn="PG:dbname='gis'","ccsm_polygons")
OGR data source with driver: PostgreSQL 
Source: "PG:dbname='gis'", layer: "ccsm_polygons"
with 32768 features and 4 fields
Feature type: wkbMultiPolygon with 2 dimensions

What have I got?

> summary(polys)
Object of class SpatialPolygonsDataFrame
Coordinates:
        min      max
x -179.2969 180.7031
y  -90.0000  90.0000
Is projected: NA 
proj4string : [NA]
Data attributes:
      area         perimeter       ccsm_polys      ccsm_pol_1   
 Min.   :1.000   Min.   :5.000   Min.   :    2   Min.   :    1  
 1st Qu.:1.000   1st Qu.:5.000   1st Qu.: 8194   1st Qu.: 8193  
 Median :1.000   Median :5.000   Median :16386   Median :16384  
 Mean   :1.016   Mean   :5.016   Mean   :16386   Mean   :16384  
 3rd Qu.:1.000   3rd Qu.:5.000   3rd Qu.:24577   3rd Qu.:24576  
 Max.   :2.000   Max.   :6.000   Max.   :32769   Max.   :32768  

Otherwise you can use R's database functionality and query the tables directly.

> require(RPostgreSQL)
Loading required package: RPostgreSQL
Loading required package: DBI
> m <- dbDriver("PostgreSQL")
> con <- dbConnect(m, dbname="gis")
> q="SELECT ST_AsText(the_geom) AS geom from ccsm_polygons LIMIT 10;"
> rs = dbSendQuery(con,q)
> df = fetch(rs,n=-1)

That returns the feature geometry in df$geom, which you'll need to convert to sp class objects (SpatialPolygons, SpatialPoints, SpatialLines) to do anything with. The readWKT function in rgeos can help with that.

The things to beware of are usually stuff like database columns that can't be mapped to R data types. You can include SQL in the query to do conversions, filtering, or limiting. This should get you started though.


If you have data in Postgis, don't export it to shapefile. From my point of view, it's kind of a step back.

You can query your postgis database from R using SQL statements, importing them as dataframes and, since you are familiar with R, do all the geostatistics you need from there. I believe you can also export your geostatistic result back to postgis.

Using SQL with Postgis Functions you can also do all kind of spatial analysis, like overlay operations, distances,and so on.

For map plotting I would use QGIS, a OpenSource GIS software, that can read postgis directly (as far as I know that was the initial goal of the project), and the upcoming version 2.0 comes with lots of features to produce great looking maps.


UPDATE:

As of v0.9 st_read_db() and st_write_db() are defunct and you can simply use st_read() and st_write().


The newly introduced sf-package (succesor of sp) provides the st_read() and st_read_db() functions. After this tutorial and from my experience it's faster than the already mentioned ways. As sf will probably replace sp one day it's also a good call to have look now ;)

require(sf)
dsn = "PG:dbname='dbname' host='host' port='port' user='user' password='pw'"
st_read(dsn, "schema.table")

you can also access the DB using RPostgreSQL:

require(sf)
require(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = dbname, user = user, host = host,
                 port = port, password = pw)

st_read_db(con, table = c("schema", "table"))
# or:
st_read_db(con, query = "SELECT * FROM schema.table")
    
dbDisconnect(con)
dbUnloadDriver(drv)

With st_write_db() you can upload data.