Organization and tidiness of multiple copies of layers?

This is a wicked problem. We've tried various systems, all of which worked to varying degrees for a time and eventually grew unwieldy and started to fall apart as more and more edge cases that didn't quite fit were encountered. That said, each of the systems we've used is way better than nothing, proving the maxim that any system is better than no system.

Here is a thumbnail overview of our current practice:

Put everything except rasters into file geodatabases, and use as few of them as you can. Don't nest feature classes under feature datasets unless they are related in some manner (e.g. hydro>streams, hydro>lakes, hydro>wetlands, etc.). This leads to a big long list at the top of the fgdb, but that is an acceptable evil.

Create layer files for all the feature classes and organize those instead. This gives a lot of freedom to name as needed (using unsupported characters, etc.*) and the ability to move and rename as circumstances change. It also allows duplication without redundancy: for example, one set of layers grouped according to nominal scale (50k, 250k...), another by region (AK, YT...), a third by theme (caribou, land use, transportation...), and a fourth by client, while the datastore itself remains unchanged.

For duplicates use shortcuts instead of the layer files themselves; otherwise there are too many things to update when things change. Configure ArcCatalog to show shortcuts: Tools > Options > File Types: .lnk. (Limitations: preview and metadata don't work, and you can't follow the shortcut to its source in ArcCatalog. This can be remedied by using symbolic links instead of shortcuts; see Link Shell Extension.)
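If you go the symbolic link route, the links can also be scripted rather than made by hand. Here is a minimal Python sketch of that idea; the function name `link_layer` and the folder names in the usage note are made up for illustration, and on Windows creating symlinks usually requires an elevated prompt or Developer Mode.

```python
import os
from pathlib import Path


def link_layer(layer_file, group_dir):
    """Create a symbolic link to layer_file inside group_dir; return the link path.

    Hypothetical helper: the layer file stays in one place and each
    grouping folder (by scale, region, theme, client) only holds links.
    """
    layer_file = Path(layer_file).resolve()
    group_dir = Path(group_dir)
    group_dir.mkdir(parents=True, exist_ok=True)

    link = group_dir / layer_file.name
    # Leave an existing link or file alone so the call can be repeated safely.
    if not link.is_symlink() and not link.exists():
        os.symlink(layer_file, link)
    return link
```

Something like `link_layer(r"Z:\Layers\Base\Roads.lyr", r"Z:\Layers\ByRegion\YT")` would then put a link to the one real layer file into a regional grouping folder (both paths are examples only).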

*(tip: add the Layers folder as a Start Menu toolbar so the layers are always at your fingertips.)

Z:\Layers\
          Base\
          Thematic\
          Reference\
          All Dressed Base (250k).lyr
          Administration Boundaries (1000k).lyr
          ...
Z:\Raster\
          Landsat\
          Orthos\
Z:\Data\
        Foo_50k.gdb
        Foo_250k.gdb
        NoScale.gdb

Map compositions and outputs (print files, PDFs, exports, etc.), which are by nature more dynamic and variable, are stored and organized differently, somewhere else. This is the part that has been harder for us. We currently use a dedicated drive with folders named according to Job# (doing it again I'd use the date instead, e.g. '2010-10-26') and subfolders for project-specific data and results/deliverables. A spreadsheet index lists all the job numbers (folder names) with their corresponding map titles and clients. Ex:

W:\Foo_0123\
            Foobarmap_001.mxd
            Docs\
                 ReadMe.doc
            Data\
                 buffers_2000m.shp
                 gps_tracks.csv
            Output\
                   Foobarmap_001.pdf
            Deliverables\
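
A job folder skeleton like this is easy to create from a script, which also keeps the naming consistent. A rough Python sketch follows; the sub-folder names come from the example above, while the date-plus-job-name folder pattern and the function name are my own assumptions, not something we actually run.

```python
from datetime import date
from pathlib import Path

# Sub-folders taken from the example layout above.
SUBFOLDERS = ("Docs", "Data", "Output", "Deliverables")


def create_job_folder(root, job_name, when=None):
    """Create <root>/<YYYY-MM-DD>_<job_name>/ with the standard sub-folders.

    The date prefix is an assumed convention (ISO dates sort chronologically).
    """
    when = when or date.today()
    job_dir = Path(root) / f"{when.isoformat()}_{job_name}"
    for sub in SUBFOLDERS:
        (job_dir / sub).mkdir(parents=True, exist_ok=True)
    return job_dir
```

Calling `create_job_folder("W:/", "Foo_0123", date(2010, 10, 26))` would give a folder named `2010-10-26_Foo_0123` with the four sub-folders inside.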

Keeping the index up to date is a friction point: people don't like doing it, avoid it, and are inconsistent with naming etc. (using a database instead of a spreadsheet would help). A numerical folder naming convention also makes it very difficult to find the map for project X without the index, which is another notable source of friction. Ideally the index would be a clickable HTML page generated automatically from a db application. That is a whole 'nother project though.
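
To give an idea of how small the "clickable page" part could be, here is a hedged Python sketch that turns a CSV export of the index into an HTML table of links. The column names (`job`, `title`, `client`), the function names and the `file:///W:/` link prefix are all assumptions for illustration; a real version would read from whatever db you end up using.

```python
import csv
import html
import io
from urllib.parse import quote


def read_index_csv(text):
    """Parse CSV text with (assumed) columns: job, title, client."""
    return list(csv.DictReader(io.StringIO(text)))


def build_index_html(rows, drive_root="file:///W:/"):
    """Return an HTML table with one clickable row per job folder."""
    lines = ["<table>", "<tr><th>Job</th><th>Map title</th><th>Client</th></tr>"]
    for row in rows:
        # Percent-encode the folder name for the link, HTML-escape everything shown.
        href = drive_root + quote(row["job"]) + "/"
        lines.append(
            f'<tr><td><a href="{html.escape(href)}">{html.escape(row["job"])}</a></td>'
            f'<td>{html.escape(row["title"])}</td><td>{html.escape(row["client"])}</td></tr>'
        )
    lines.append("</table>")
    return "\n".join(lines)
```

The output of `build_index_html(read_index_csv(csv_text))` can be written to an `.html` file on the shared drive and regenerated whenever the index changes.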

Key principles:

  • Separate the slowly changing and often reused stuff from the dynamic and variable, and treat them differently.
  • Don't duplicate unnecessarily; use layer files and shortcuts/links wherever possible.
  • Don't change systems too frequently; give each a solid try.

I very much welcome examples of other structures; as I said, we're not content with what we have. :)


If other people will be accessing the data in your system, you cannot make the organization schema meaningful only to yourself; you must keep their use of the system in mind. If you don't consider them, you'll be spending a lot of time answering questions such as "where is the landuse data?" and "why can't I find the [insert dataset here]?"

In maintaining such a system for many years, I found that people can't find data if it is organized by source first, e.g. c:\CensusBureau\Roads and c:\ESRI\Countries. Instead, I recommend listing the data thematically first, then by source in case you have multiple sources, e.g. c:\Roads\CensusBureau and c:\Roads\LocalGovt.

Likewise, I wouldn't separate rasters and vectors into different directories. However, it may be necessary to split them across different physical or logical drives if you have a lot of raster data that won't fit onto one drive.

I recommend the following directory structure: Theme\SourceYear, where Theme is the thematic layer, Source is an abbreviated name for the data source, and Year is the year the data represents on the ground. In this scenario, TIGER Roads from the Census Bureau would be located in \Roads\Census00 and \Roads\Census10 (or replace 'Census' with 'TIGER').
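
If folders are created by script, the convention can be captured in one small helper so everyone gets the same names. This is only a sketch: the function name `dataset_dir` and the `c:\` default root are assumptions, and the two-digit year follows the Census00/Census10 examples above.

```python
from pathlib import PureWindowsPath


def dataset_dir(theme, source, year, root="c:\\"):
    """Build <root>\\<Theme>\\<Source><YY> following the Theme\\SourceYear scheme."""
    # Two-digit year suffix, as in Census00 and Census10.
    return PureWindowsPath(root, theme, f"{source}{year % 100:02d}")
```

For example, `dataset_dir("Roads", "Census", 2010)` gives `c:\Roads\Census10`.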

Be aware that certain extensions in ArcGIS don't work with file names longer than 13 characters. I can't remember which extension, I just remember this being a problem.
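
A quick way to see whether you are affected is to scan the data directory for long names. The Python sketch below does that; the 13-character limit is the figure quoted above, and counting the name without its file extension is my assumption, since I don't recall exactly how the limit was applied.

```python
from pathlib import Path

MAX_NAME_LENGTH = 13  # limit quoted above


def find_long_names(root, limit=MAX_NAME_LENGTH):
    """Return paths under root whose name (without extension) exceeds limit."""
    return sorted(p for p in Path(root).rglob("*") if len(p.stem) > limit)
```

Passing a different `limit` makes it easy to re-check if the limit turns out to include the extension.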


We work at the project level for CAD files; I guess it depends on how your particular workflow is set up. We have our master working project, then prepare any additional datastores from it with an export script at the end of the editing session.

datadir\cad\cadastre.dgn
datadir\srv\fuel.dgn
datadir\srv\sewerage.dgn
datadir\map\base.dgn
datadir\map\printsets.dgn
...

Then each file has levels/layers/features named with an identifier:
sewPipe
sewManhole
sewPit
...

We then export it all to SQL spatial, rather than having anything read our working project files directly. From there it is displayed to the user via MapGuide or whatever flavour of GIS app is needed.

The GIS layers are sorted by feature name with identifiers, and use a similar folder layout to allow sorting.