Displaying huge geotiffs (or vrts) with QGIS?

You seem to have two main concerns: browsing a VRT is slow, and building global overviews for it is slow.


While I am sure that GDAL VRT used to be slow for me and my MapServer many years ago, the situation may have changed. I made a test layer with 10000 aerial images (image/tile sizes from 10000x10000 to 12000x12000 pixels), and now GDAL VRT is actually faster than the native MapServer shapefile index: in a simple single-threaded test where every GetMap request hit the first overview level, the test computer served 6 tiles (256x256) per second. A mosaic with 10000 images is still quite a small one, and I guess that in my test Linux had the whole VRT file in cache memory. How many images do you have in your VRT?

The following chapter may contain old information; read responsibly:

There is some evidence that VRT is slow when it contains a huge number of images. That's because a VRT is an index in XML format and it does not support a spatial index, which leads to a full scan of the whole XML file every time. There is nothing you can do to improve that with plain GDAL, even though there has been some discussion about implementing a spatial index for VRT http://osgeo-org.1560.x6.nabble.com/gdal-dev-Don-t-we-have-any-ideas-for-GSoC-2017-td5309810.html.

If you are willing to install new software, the easiest workaround could be to use MapServer with a tileindex http://www.mapserver.org/optimization/tileindex.html. If you create a tileindex with gdaltindex http://www.gdal.org/gdaltindex.html and also create an index for the tileindex with shptree http://www.mapserver.org/utilities/shptree.html, then MapServer should be able to access all of your image files very quickly. Create overviews for the individual tiles and serve the layer through WMS for QGIS, and you have solved the first part of the problem, but not the problem with global overviews. Even if you have created overviews for the individual tiles, it will be slow to open thousands of image files to cover a large area, so you must limit the number of files by creating overview images that each cover a larger area. That is what you have already tried to do by building overviews for the whole VRT with gdaladdo.
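The tileindex workflow above is only a few commands. A minimal sketch, assuming your tiles live in a directory called tiles/ and the GDAL and MapServer command-line utilities are installed (all file and layer names here are hypothetical):

```shell
# 1. Build a shapefile tileindex listing every tile and its footprint.
gdaltindex doq_index.shp tiles/*.tif

# 2. Build a quadtree spatial index (.qix) so MapServer can find the
#    relevant tiles without scanning the whole shapefile.
shptree doq_index.shp

# 3. Reference the index from a raster layer in your mapfile, e.g.:
#    LAYER
#      NAME "mosaic"
#      TYPE RASTER
#      TILEINDEX "doq_index.shp"
#      TILEITEM "location"
#    END
```

With the .qix in place, MapServer only opens the handful of tiles that intersect each request.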

I do not know any ready-made tool in the GDAL/MapServer world for creating global pyramids automatically. You could convert tiles from the global VRT into a set of images with a bigger pixel size by writing a script that runs gdal_translate http://www.gdal.org/gdal_translate.html with a sliding -projwin or -srcwin. Then you can combine the resulting tiles into a new overview layer with gdalbuildvrt or gdaltindex.
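The sliding-window script could look something like the sketch below: cut the global VRT into fixed-size chunks with -srcwin and shrink each chunk with -outsize, producing overview tiles with 4x bigger pixels. The window size, scale factor, and filenames are assumptions; adjust them to your data.

```shell
SRC=global.vrt
STEP=4096   # window size in source pixels (hypothetical)

# Read the raster dimensions from gdalinfo ("Size is <w>, <h>").
WIDTH=$(gdalinfo "$SRC"  | sed -n 's/Size is \([0-9]*\), .*/\1/p')
HEIGHT=$(gdalinfo "$SRC" | sed -n 's/Size is [0-9]*, \([0-9]*\)/\1/p')

# Slide the window over the mosaic, downsampling each chunk to 25 %.
for (( y=0; y<HEIGHT; y+=STEP )); do
  for (( x=0; x<WIDTH; x+=STEP )); do
    gdal_translate -srcwin $x $y $STEP $STEP -outsize 25% 25% \
      "$SRC" ov_${x}_${y}.tif
  done
done

# Combine the shrunken chunks into a new overview layer.
gdalbuildvrt overview.vrt ov_*.tif
```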

Because you also consider using GeoServer, I would recommend having a look at the gdal_retile script http://www.gdal.org/gdal_retile.html, which is written to handle your case. It could also be possible to use the tiles which gdal_retile creates directly as overviews with QGIS by building a VRT over them. However, the first problem with slow huge VRT files would remain.
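For illustration, gdal_retile can retile the mosaic and build a pyramid of overview levels in one run. The tile size, level count, and directory names below are assumptions:

```shell
mkdir -p retiled

# Retile the source images into 2048x2048 tiled GeoTIFFs and build
# 4 pyramid levels with bilinear resampling.
gdal_retile.py -v -r bilinear -levels 4 \
  -ps 2048 2048 -co "TILED=YES" \
  -targetDir retiled tiles/*.tif

# Each pyramid level lands in retiled/1, retiled/2, ...; a VRT can
# be built over any level to use it as an overview layer.
gdalbuildvrt overview_level4.vrt retiled/4/*.tif
```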


Ok, well I solved both my problems... mainly by buying an NVMe SSD. My disk read/write speed has gone from 125 MB/s to 1200 MB/s.

Programmatically, there are a few things you can do to help your read/write speed. First, consider the block size of your tiff. If you are using a striped tiff, when you zoom to a particular region the GIS software has to read each complete row of the region, including the portions of the tiff that won't be displayed. For example, if you zoom into a 256 x 256 pixel region of a striped tiff, the software has to read at least 256 blocks (one per row). With a tiled tiff (tiled at 256 x 256), the maximum number of blocks that must be read is 4 (and the minimum is 1). So the first thing you can do is ensure that you are using a tiled tiff (the TILED=YES creation option in GDAL), and set the block size to something reasonable (I used 256 x 256 via the GDAL creation options BLOCKXSIZE and BLOCKYSIZE).
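Converting an existing striped tiff to a tiled one is a single gdal_translate call. A sketch, with hypothetical filenames:

```shell
# Rewrite a striped tiff as a tiled tiff with 256x256 blocks.
gdal_translate -co TILED=YES -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 \
  striped.tif tiled.tif

# Verify the block layout (look for "Block=256x256" in the band info).
gdalinfo tiled.tif | grep Block
```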

Secondly, a hybrid approach to overviews seems to work well. If you can parallelize your operations, you can add overviews to the individual tiles quite quickly, but this only benefits you at resolutions smaller than your tile size. I created internal overviews of levels 2, 4, 8, 16, 32, and 64 on the individual tiles. Then I built a VRT and created overviews of levels 128, 256, and 512 on the VRT (keep in mind that these levels are for a global dataset at 30 m resolution; yours will differ depending on the number of pixels in your tiff). The total time for creating the individual overviews is on the order of minutes (depending on how many threads you can run and how many tiles you have), but creating overviews on the VRT is still on the order of an hour. The runtime improvement over my initial post is due to the SSD and creating fewer levels on the VRT.
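As a sketch, the hybrid approach boils down to two gdaladdo passes; the level numbers match the ones above, the filenames and the parallelism degree are assumptions:

```shell
# Per-tile overviews: this step parallelizes well, e.g. 8 at a time
# with xargs (GNU parallel works too).
ls tiles/*.tif | xargs -n 1 -P 8 -I{} \
  gdaladdo -r average {} 2 4 8 16 32 64

# Build the mosaic, then add only the coarsest levels on top of it
# (gdaladdo on a VRT writes an external mosaic.vrt.ovr file).
gdalbuildvrt mosaic.vrt tiles/*.tif
gdaladdo -r average mosaic.vrt 128 256 512
```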

Thirdly, you can play with the GDAL_MAX_DATASET_POOL_SIZE configuration option when reading VRTs, as described at the bottom of this page. It sets the maximum number of datasets (your tiffs) that GDAL keeps open at once.
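GDAL_MAX_DATASET_POOL_SIZE can be set as an environment variable or passed with --config to any GDAL utility. The value below is only an illustration; tune it to your RAM and open-file limits:

```shell
# As an environment variable for one command:
GDAL_MAX_DATASET_POOL_SIZE=450 gdalinfo -stats mosaic.vrt

# Or via --config:
gdal_translate --config GDAL_MAX_DATASET_POOL_SIZE 450 \
  mosaic.vrt out.tif
```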

Fourthly, I found that compressing with PACKBITS gives the fastest display times. The files aren't as small as with LZW, but it's a tradeoff you might be willing to make.
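Compression is just another creation option, so it combines with the tiling options from earlier. A sketch with hypothetical filenames:

```shell
# Recompress a tiff with PACKBITS (faster to decompress than LZW,
# at the cost of larger files), keeping 256x256 tiling.
gdal_translate -co COMPRESS=PACKBITS -co TILED=YES \
  -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 lzw.tif packbits.tif
```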

The result is a VRT that loads rapidly and pans/zooms almost seamlessly.