database design - categories and sub-categories

Managing hierarchical data has some ways. One of the most important ones is Nested Set Model. See here for implementation. Even some content management systems like Joomla, use this structure.

Update 2020: As there are some considerations on this post, I should say that now I prefer the Adjacency List Model instead of the Nested Set Model since there is less complexity in this way. Also See here for implementation.


I would use this structure:

ParentId, CategoryId, Title
null, 1, Home
null, 2, Business
null, 3, Hobbies
1, 4, Gardening
1, 5, Kitchen
1, 6, ...
2, 7, Development
2, 8, Marketing
2, 9, ...
3, 10, Soccer
3, 11, Reading
3, 12, ...

In detail:

  • only use one table, which references itself, so that you can have unlimited depth of categories
  • use technical ids (using IDENTITY, or similar), so that you can have more than 10 subcategories
  • if required add a human readable column for category-numbers as separate field

As long as you are only using two levels of categories you can still select like this:

SELECT BizTitle FROM tblBiz WHERE ParentId = 3 AND CategoryId = 11

The new hierarchyid feature of SQL server also looks quite promising: https://msdn.microsoft.com/en-us/library/bb677173.aspx


What I don't like about the Nested Set Model:

  • Inserting and deleting items in the Nested Set Model is a quite comlicated thing and requires expensive locks.
  • One can easily create inconsistencies which is prohibited, if you use the parent field in combination with a foreign key constraint.
    • Inconsistencies can appear, if rght is lower than lft
    • Inconsistencies can appear, if a value apprears in several rght or lft fields
    • Inconsistencies can appear, if you create gaps
    • Inconsistencies can appear, if you create overlaps
  • The Nested Set Model is in my opinion more complex and therefore not as easy to understand. This is absolutely subjective, of course.
  • The Nested Set Model requires two fields, instead of one - and so uses more disk space.