Why is iterating over a std::set so much slower than over a std::vector?

Isn't a set just a structured vector that checks whether each item is unique upon insertion?

No, not at all. These data structures are completely different, and the main distinction here is the memory layout: std::vector stores its elements contiguously in memory, while std::set is a node-based container, where every element is allocated separately and lives at its own address, possibly far away from its neighbours. Traversing a set means chasing pointers from node to node, which largely defeats the processor's hardware prefetching. std::vector is the opposite: as the next element is always right next to the current one in memory, the CPU loads several elements into its cache at once, and when actually processing them it only has to go to the cache to retrieve the values, which is very fast compared to RAM access.
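To make the layout difference visible, here is a minimal sketch that measures the byte distance between consecutive elements in iteration order. The helper names element_strides and assert_contiguous are made up for this illustration. For a std::vector every stride equals sizeof(T); for a std::set the strides depend entirely on the allocator and the implementation, so no particular values should be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Byte distance between each element and its predecessor, in iteration order.
// Hypothetical helper; assumes the container stores real objects
// (so it does not apply to std::vector<bool>).
template <typename Container>
std::vector<std::ptrdiff_t> element_strides(const Container& c) {
    std::vector<std::ptrdiff_t> strides;
    const void* prev = nullptr;
    for (const auto& elem : c) {
        const void* cur = &elem;
        if (prev != nullptr) {
            strides.push_back(static_cast<std::ptrdiff_t>(
                reinterpret_cast<std::uintptr_t>(cur) -
                reinterpret_cast<std::uintptr_t>(prev)));
        }
        prev = cur;
    }
    return strides;
}

// std::vector guarantees contiguity, so this assertion should never fire;
// it only documents the invariant the answer relies on.
template <typename T>
void assert_contiguous(const std::vector<T>& v) {
    for (std::ptrdiff_t d : element_strides(v)) {
        assert(d == static_cast<std::ptrdiff_t>(sizeof(T)));
    }
}
```

Calling element_strides on a std::set<int> and printing the result typically shows strides that are larger than sizeof(int) and not necessarily uniform, since each value lives inside its own tree node next to the node's bookkeeping pointers.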

Note that it's a common need to have a sorted, unique collection of data that is laid out contiguously in memory. This was proposed for standardization as flat_set in P1222 and has since been adopted into C++23 as std::flat_set.
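The idea behind such a container can be sketched with a sorted std::vector and a binary search. The class name SortedUniqueVector and its interface are invented for this example; this is not the interface proposed in P1222, only an illustration of the underlying technique.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal "flat set": a sorted std::vector without duplicates.
template <typename T>
class SortedUniqueVector {
public:
    // Inserts value if absent; returns true when an insertion happened.
    // Costs an O(log n) search plus O(n) element shifting.
    bool insert(const T& value) {
        auto pos = std::lower_bound(data_.begin(), data_.end(), value);
        if (pos != data_.end() && !(value < *pos)) {
            return false;  // an equivalent element is already present
        }
        data_.insert(pos, value);
        assert(std::is_sorted(data_.begin(), data_.end()));  // debug-only invariant check
        return true;
    }

    bool contains(const T& value) const {
        return std::binary_search(data_.begin(), data_.end(), value);
    }

    // Iteration is plain vector iteration over contiguous memory.
    typename std::vector<T>::const_iterator begin() const { return data_.begin(); }
    typename std::vector<T>::const_iterator end() const { return data_.end(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

The trade-off is that insertion in the middle has to shift elements, so this layout tends to pay off when the collection is built once (or rarely modified) and then iterated or searched many times.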

Matt Austern's "Why you shouldn't use set (and what you should use instead)" is an interesting read, too.


The main reason is that when you iterate over a std::vector, which stores its elements in a contiguous memory chunk, you basically do:

++p;

where p is a T* raw pointer. The corresponding libstdc++ code is:

 __normal_iterator&
 operator++() _GLIBCXX_NOEXCEPT
 {
    ++_M_current;                            // <--- std::vector<>: ++iter
    return *this;
 }

For a std::set, the underlying object is more complex, and in most implementations you iterate over a tree-like structure. In its simplest form this is something like:

p=p->next_node;

where p is a pointer to a tree node structure:

struct tree_node {
   ...
   tree_node *next_node;
};

but in practice the real libstdc++ code is much more complex:

_Self&
operator++() _GLIBCXX_NOEXCEPT
{
    _M_node = _Rb_tree_increment(_M_node);   // <--- std::set<> ++iter
    return *this;
}

// ----- underlying code \/\/\/

static _Rb_tree_node_base*
local_Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
{
  if (__x->_M_right != 0) 
    {
      __x = __x->_M_right;
      while (__x->_M_left != 0)
        __x = __x->_M_left;
    }
  else 
    {
      _Rb_tree_node_base* __y = __x->_M_parent;
      while (__x == __y->_M_right) 
        {
          __x = __y;
          __y = __y->_M_parent;
        }
      if (__x->_M_right != __y)
        __x = __y;
    }
  return __x;
}

_Rb_tree_node_base*
_Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
{
  return local_Rb_tree_increment(__x);
}

const _Rb_tree_node_base*
_Rb_tree_increment(const _Rb_tree_node_base* __x) throw ()
{
  return local_Rb_tree_increment(const_cast<_Rb_tree_node_base*>(__x));
}

(see: What is the definition of _Rb_tree_increment in bits/stl_tree.h?)
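Stripped of the libstdc++ specifics, the increment above is an in-order successor walk. Here is a simplified sketch on a plain binary search tree with parent pointers. The Node type and the attach_left, attach_right and successor helpers are hypothetical, and unlike the real implementation (which uses a header node to represent end()), this version simply returns nullptr after the largest element.

```cpp
#include <cassert>

// Hypothetical node type for illustration; not libstdc++'s _Rb_tree_node_base.
struct Node {
    int value;
    Node* left;
    Node* right;
    Node* parent;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr), parent(nullptr) {}
};

// Small helpers to link nodes and keep the parent back-pointer consistent.
inline void attach_left(Node& parent, Node& child) {
    parent.left = &child;
    child.parent = &parent;
}
inline void attach_right(Node& parent, Node& child) {
    parent.right = &child;
    child.parent = &parent;
}

// Returns the in-order successor of x, or nullptr if x holds the largest value.
inline Node* successor(Node* x) {
    assert(x != nullptr);
    if (x->right != nullptr) {
        // Successor is the leftmost node of the right subtree.
        x = x->right;
        while (x->left != nullptr) x = x->left;
        return x;
    }
    // Otherwise climb until we arrive at a parent from its left side.
    Node* up = x->parent;
    while (up != nullptr && x == up->right) {
        x = up;
        up = up->parent;
    }
    return up;
}
```

Every step dereferences at least one pointer to a separately allocated node, and some steps walk several levels up or down the tree, which is quite different from the single pointer increment of the vector iterator.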


First of all, note that a std::set is sorted. This is typically achieved by storing the data in a tree-like structure.

A vector is typically stored in a contiguous memory area (like a simple array), so traversing it makes good use of the CPU cache. This is why it is faster.
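If you want to measure the difference yourself, a rough timing sketch could look like the following. The names TimedSum, Comparison, timed_sum and compare_traversals are made up for this example. Treat any numbers it produces with care: they depend on the compiler, optimization flags, allocator and hardware, and a set filled in one go with ascending values may end up with nodes placed close together, which can understate the gap seen in a real program.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <set>
#include <vector>

// Result of one traversal: the computed sum and how long it took.
struct TimedSum {
    long long sum;
    std::chrono::nanoseconds elapsed;
};

// Sums all elements of c and times the traversal with a monotonic clock.
template <typename Container>
TimedSum timed_sum(const Container& c) {
    auto start = std::chrono::steady_clock::now();
    long long sum = std::accumulate(c.begin(), c.end(), 0LL);
    auto stop = std::chrono::steady_clock::now();
    return {sum, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}

struct Comparison {
    TimedSum vec;
    TimedSum set;
};

// Fills a vector and a set with the values 0..n-1 and times one pass over each.
inline Comparison compare_traversals(int n) {
    std::vector<int> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 0);
    std::set<int> s(v.begin(), v.end());
    Comparison result{timed_sum(v), timed_sum(s)};
    assert(result.vec.sum == result.set.sum);  // same elements, same total
    return result;
}
```

Calling compare_traversals with a few million elements and printing both elapsed.count() values gives a first impression; for anything more serious, repeat the measurement several times and use a proper benchmarking harness.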
