Advantages either way between GML and Shapefiles?

You can't really choose one over the other: as a GIS professional you will have data thrown at you in a million different formats and you need to handle all of them. This is why GDAL/OGR has so many format drivers. More importantly, neither format is a 'beautiful' way of managing and storing GIS data. For real GIS data management you want to use one of the spatially enabled database formats. Shapefiles and GML are now really formats of convenience and should both be treated as such.

A few more random thoughts:

  • The shapefile format is very old and often poorly implemented, leading to poor data in some cases. On this note, I was once told by a senior staff member at ESRI (the original developers of the shapefile) that they wish the shapefile format would die, but it is so prevalent as a data transfer format that it has a life of its own. GML is more recent... and often equally poorly implemented...
  • The Shapefile has been the lingua franca of vector GIS since the early 1990s and so is more widely accepted than GML, which is why it remains one of the most common data transfer formats.
  • Shapefiles are binary and GML is text, so as data volumes increase the shapefile can be smaller, though GML responds well to compression (especially when you strip redundant whitespace) and is human readable (which I've never found to be a huge advantage in practice).
  • The GML format can be verbose and is not universally liked by GIS specialists. Anecdotal evidence comes from many conversations I've had on this topic with people including senior ESRI developers who have stated they prefer the KML format to GML (having opened THAT can of worms I shall step away from the argument!).
  • GML is a single file and less likely to lose bits along the way (I can't count the number of times I have been given just the '.shp' part of a shapefile - an understandable mistake given the name).
  • GML has no special advantage on the web as most web mapping systems are just as happy with a database or a shapefile and for a heavy-duty app, the database is the way to go.
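
On the 'losing bits' point above, a quick check on a shapefile delivery can be sketched roughly as below. This is only a minimal Python sketch: it assumes the mandatory parts are the .shp, .shx and .dbf files sharing one base name, and the function name missing_shapefile_parts is my own invention, not part of any library.

```python
from pathlib import Path

# Assumption: these three parts are mandatory; .prj, .cpg and friends are optional extras.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the mandatory extensions that are absent next to shp_path."""
    base = Path(shp_path)
    missing = []
    for ext in REQUIRED_EXTENSIONS:
        # Accept both lower- and upper-case extensions.
        candidates = (base.with_suffix(ext), base.with_suffix(ext.upper()))
        if not any(p.exists() for p in candidates):
            missing.append(ext)
    return missing
```

Calling missing_shapefile_parts("roads.shp") on a folder that only holds the '.shp' part would report the '.shx' and '.dbf' parts as missing, which is exactly the situation described in the list.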

I could ramble on with other pros and cons but really my first paragraph says it all. I would treat both formats as a means of sharing data rather than part of a GI-data archive management system.


Shapefiles restrict field names to 10 characters (a limit inherited from the dBASE .dbf attribute table), and in practice to ASCII characters as well. I would prefer GML to avoid this.
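
To see how much the field name restriction would bite before exporting, something like the following could help. It is a rough Python sketch, assuming the 10-character limit of the .dbf table (passed as max_len so you can change it) and using made-up helper names; it only inspects a list of names and does not touch any file.

```python
def find_truncation_clashes(field_names, max_len=10):
    """Group field names that collapse to the same name once truncated."""
    truncated = {}
    for name in field_names:
        truncated.setdefault(name[:max_len], []).append(name)
    # Keep only the truncated names shared by more than one original field.
    return {short: names for short, names in truncated.items() if len(names) > 1}


def non_ascii_names(field_names):
    """Return the field names containing characters outside ASCII."""
    return [name for name in field_names if not name.isascii()]
```

For example, 'population_2010' and 'population_2020' both truncate to 'population', so they would be reported as a clash.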


What you're dealing with is two different formats of vector data. ESRI Shapefiles are a native vector format of ESRI's GIS software (ArcView and later ArcGIS), which has a very long heritage (so that's probably why you're finding quite a lot of GIS data available out there in this format). GML is the Geography Markup Language, an XML-based format. QGIS can handle both formats (and many more). As has been suggested above, it would be really helpful to know what you're trying to do with the files.
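
Because GML is just XML, you can read it with any XML parser. As an illustration only, here is a small Python sketch using the standard library. The sample point, the read_point name and the namespace URI are my assumptions (GML 3.2 documents use a different namespace, http://www.opengis.net/gml/3.2); for real work you would let QGIS or GDAL/OGR do the parsing.

```python
import xml.etree.ElementTree as ET

# Assumption: the older GML namespace; GML 3.2 uses ".../gml/3.2" instead.
GML_NAMESPACE = "http://www.opengis.net/gml"

# Hypothetical sample geometry, invented for this sketch.
SAMPLE_GML = """<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
  <gml:pos>51.5 -0.12</gml:pos>
</gml:Point>"""


def read_point(gml_text):
    """Return the coordinates of the first gml:pos element as a tuple of floats."""
    root = ET.fromstring(gml_text)
    # iter() also yields the root itself, so this works at any nesting depth.
    pos = next(root.iter("{%s}pos" % GML_NAMESPACE))
    return tuple(float(value) for value in pos.text.split())


coords = read_point(SAMPLE_GML)
```

The axis order of the two numbers depends on the coordinate reference system named in srsName, so treat the tuple as raw values rather than as a guaranteed (x, y) pair.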