Binary Search Tree Implementation in C++ STL?

A clean and simple BST implementation in CPP:

struct node {
   int val;
   node* left;
   node* right;
};

node* createNewNode(int x)
{
    node* nn = new node;
    nn->val = x;
    nn->left  = nullptr;
    nn->right = nullptr;

    return nn;
}

void bstInsert(node* &root, int x)
{
    if(root == nullptr) {
        root = createNewNode(x);
        return;
    }

    if(x < root->val)
    {
        if(root->left == nullptr) {
            root->left = createNewNode(x);
            return;
        } else {
            bstInsert(root->left, x);
        }
    }

    if( x > root->val )
    {
        if(root->right == nullptr) {
            root->right = createNewNode(x);
            return;
        } else {
            bstInsert(root->right, x);
        }
    }
}

int main()
{
     node* root = nullptr;

     int x;
     while(cin >> x) {
         bstInsert(root, x);
     }

     return 0;
}

What you need is a way to look up some data given a key. With the key being an unsigned int, this gives you several possibilities. Of course, you could use a std::map:

typedef std::map<unsigned int, record_t> my_records;

However, there's other possibilities as well. For example, it's quite likely that a hash map would be even faster than a binary tree. Hash maps are called unordered_map in C++, and are a part of the C++11 standard, likely already supported by your compiler/std lib (check your compiler version and documentation). They were first available in C++TR1 (std::tr1::unordered_map)

If your keys are rather closely distributed, you might even use a simple array and use the key as an index. When it comes to raw speed, nothing would beat indexing into an array. OTOH, if your key distribution is too random, you'd be wasting a lot of space.

If you store your records as pointers, moving them around is cheap, and an alternative might be to keep your data sorted by key in a vector:

typedef std::vector< std::pair<unsigned int, record_t*> > my_records;

Due to its better data locality, which presumably plays nice with processor cache, a simple std::vector often performs better than other data structures which theoretically should have an advantage. Its weak spot is inserting into/removing from the middle. However, in this case, on a 32bit system, this would require moving entries of 2*32bit POD around, which your implementation will likely perform by calling CPU intrinsics for memory move.


std::set and std::map are usually implemented as red-black trees, which are a variant of binary search trees. The specifics are implementation dependent tho.


STL's set class is typically implemented as a BST. It's not guaranteed (the only thing that is is it's signature, template < class Key, class Compare = less<Key>, class Allocator = allocator<Key> > class set;) but it's a pretty safe bet.

Your post says you want speed (presumably for a tighter game loop).

So why waste time on these slow-as-molasses O(lg n) structures and go for a hash map implementation?