Convert huge XYZ CSV to GeoTIFF

You can do this with GDAL, which supports the XYZ format directly. It doesn't matter that your coordinates are UTM: gdal_translate writes the output in the same coordinate system as the input.

Converting to GeoTIFF is as simple as:

gdal_translate test.xyz test.tif

See the GeoTIFF driver documentation for output options (such as compression) and the gdal_translate documentation for more usage information. In particular, an XYZ file carries no coordinate system information, so you should specify the coordinate system with the -a_srs parameter.

-a_srs srs_def:

Override the projection for the output file. The srs_def may be any of the usual GDAL/OGR forms, complete WKT, PROJ.4, EPSG:n or a file containing the WKT.

gdal_translate -a_srs EPSG:12345 test.xyz test.tif
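The two points above (output options and -a_srs) can be combined in one call. The following is a minimal sketch rather than a recommended configuration: xyz_to_geotiff is a hypothetical helper name, the creation options shown (COMPRESS, TILED, BIGTIFF) are ones the GeoTIFF driver documents, and the SRS has to be whatever your data actually uses.

```shell
# Hypothetical helper: convert an XYZ file to a compressed, tiled GeoTIFF.
#   $1 = input XYZ file
#   $2 = output GeoTIFF
#   $3 = SRS definition in any form -a_srs accepts (e.g. EPSG:n)
xyz_to_geotiff() {
    gdal_translate \
        -a_srs "$3" \
        -co COMPRESS=DEFLATE \
        -co TILED=YES \
        -co BIGTIFF=IF_SAFER \
        "$1" "$2"
}

# Example call; EPSG:4326 is only an assumption about the sample data:
# xyz_to_geotiff test.xyz test.tif EPSG:4326
```

BIGTIFF=IF_SAFER lets the driver switch to BigTIFF when the output might exceed the 4 GB limit of classic TIFF, which can matter for a huge input.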

Files with comma- or space-separated columns and files with fixed-width columns are supported, with or without a header row.

The supported column separators are space, comma, semicolon and tab.

$ head -n 2 test_space.xyz 
x y z
146.360047076550984 -39.0631214488636616 0.627969205379486084

$ gdalinfo test_space.xyz
Driver: XYZ/ASCII Gridded XYZ
Files: test_space.xyz
Size is 84, 66
Coordinate System is `'
Origin = (146.359922066953317,-39.062997159090934)
Pixel Size = (0.000250019195332,-0.000248579545455)
Corner Coordinates:
Upper Left  ( 146.3599221, -39.0629972) 
Lower Left  ( 146.3599221, -39.0794034) 
Upper Right ( 146.3809237, -39.0629972) 
Lower Right ( 146.3809237, -39.0794034) 
Center      ( 146.3704229, -39.0712003) 
Band 1 Block=84x1 Type=Float32, ColorInterp=Undefined
  Min=0.336 Max=0.721 

$ head -n 2 test_commas.xyz 
x, y, z
146.360047076550984, -39.0631214488636616, 0.627969205379486084

$ gdalinfo test_commas.xyz
Driver: XYZ/ASCII Gridded XYZ
etc...

$ head -n 2 test_formatted.xyz
x                       y                       z
146.3600471            -39.06312145             0.627969205

$ gdalinfo test_formatted.xyz
Driver: XYZ/ASCII Gridded XYZ
etc...

The only gotchas I'm aware of are:

The opening of a big dataset can be slow as the driver must scan the whole file to determine the dataset size and spatial resolution; and

The file has to be sorted correctly (by Y, then X).

Cells with same Y coordinates must be placed on consecutive lines. For a same Y coordinate value, the lines in the dataset must be organized by increasing X values. The value of the Y coordinate can increase or decrease however.
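Because opening a large file is slow, it can be worth checking the ordering rule with a cheap pass before handing the file to GDAL. The following is a rough sketch, assuming a comma-separated file without a header row; check_xyz_order is a hypothetical helper, not part of GDAL, and it only tests ordering, not whether the grid spacing is regular.

```shell
# Hypothetical helper: verify the row ordering the XYZ driver expects.
#   $1 = comma-separated x,y,z file without a header row
# Exits with status 0 if the ordering looks valid, 1 otherwise.
check_xyz_order() {
    awk -F',' '
        # Within one Y value, X must strictly increase.
        NR > 1 && $2 == prev_y && $1 + 0 <= prev_x + 0 {
            printf "line %d: X is not increasing\n", NR
            bad = 1
            exit
        }
        # A Y value must not come back after a different Y was seen.
        NR > 1 && $2 != prev_y && ($2 in seen) {
            printf "line %d: Y value %s is not on consecutive lines\n", NR, $2
            bad = 1
            exit
        }
        { seen[$2] = 1; prev_x = $1; prev_y = $2 }
        END { exit bad }
    ' "$1"
}

# Example call:
# check_xyz_order test_sorted.xyz && echo "ordering looks fine"
```

The script keeps one array entry per distinct Y value, so memory use grows with the number of raster rows rather than with the number of lines.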

$ head -n 5 test.csv
x,y,z
146.3707979,-39.07778764,0.491866767
146.3787985,-39.07157315,0.614820838
146.3637974,-39.07132457,0.555555582
146.3630473,-39.07579901,0.481217861

$ gdalinfo test.csv
ERROR 1: Ungridded dataset: At line 3, too many stepY values
gdalinfo failed - unable to open 'test.csv'.

$ tail -n +2 test.csv | sort -t ',' -k2,2n -k1,1n > test_sorted.xyz

$ head -n 5 test_sorted.xyz 
146.3600471,-39.07927912,0.606096148
146.3602971,-39.07927912,0.603663027
146.3605471,-39.07927912,0.603663027
146.3607971,-39.07927912,0.589507282
146.3610472,-39.07927912,0.581049323

$ gdalinfo test_sorted.xyz
Driver: XYZ/ASCII Gridded XYZ
etc...
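The sorting step can also be wrapped so that the header row is kept and the locale cannot change how the numbers are compared. This is a sketch under those assumptions: sort_xyz_csv is a hypothetical helper name, the input is assumed to be comma-separated with exactly one header line, and the values are assumed to be plain decimals (sort -n does not understand scientific notation).

```shell
# Hypothetical helper: sort an x,y,z CSV by Y, then X, keeping the header.
#   $1 = input CSV with one header row
#   $2 = output file, ordered the way the XYZ driver expects
sort_xyz_csv() {
    {
        # Copy the header line through unchanged.
        head -n 1 "$1"
        # Sort the data lines numerically on column 2 (Y), then column 1 (X).
        # LC_ALL=C pins "." as the decimal point regardless of the user locale.
        tail -n +2 "$1" | LC_ALL=C sort -t ',' -k2,2n -k1,1n
    } > "$2"
}

# Example call:
# sort_xyz_csv test.csv test_sorted.xyz
```

Restricting each key to a single field (-k2,2 and -k1,1) keeps the comparison from running on into the following columns. For very large inputs, GNU sort spills to temporary files, so the temporary directory (TMPDIR, or the -T option) needs enough free space.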

As a worked example, take the open-data DGM200 digital terrain model of Germany (200 m grid spacing): https://gdz.bkg.bund.de/index.php/default/open-data/digitales-gelandemodell-gitterweite-200-m-dgm200.html

Download the XYZ file: https://daten.gdz.bkg.bund.de/produkte/dgm/dgm200/aktuell/dgm200.utm32s.xyzascii.zip

Convert the XYZ file to a GeoTIFF, taking the coordinate system from the .prj file:

gdal_translate -a_srs utm32s.prj dgm200_utm32s.xyz germanyDGM200.tif
