C++ Linked list using smart pointers

I would look at the interface of std::list, which is the C++ standard library's implementation of a linked list. It seems that you are approaching the templating of your linked-list class the wrong way. Ideally your linked list should not care about ownership semantics (i.e. whether it is instantiated with raw pointers, smart pointers or objects held by value). An example of ownership semantics with STL containers follows. However, there are better examples of STL and ownership from more authoritative sources.

#include <iostream>
#include <list>
#include <memory>

using namespace std;

int main()
{

    // Unique ownership.
    unique_ptr<int> int_ptr = make_unique<int>(5);

    {
        // list of uniquely owned integers.
        list<unique_ptr<int>> list_unique_integers;

        // Transfer of ownership from my parent stack frame to the
        // unique_ptr list.
        list_unique_integers.push_back(move(int_ptr));

    } // The list is destroyed, along with the integers it owns.

    // int_ptr is null after the move, so dereferencing it here would be
    // undefined behaviour.
    // cout << *int_ptr << endl;
    // You can make a new one though.
    int_ptr = make_unique<int>(6);

    // Shared ownership.
    // Create a pointer we intend to share.
    shared_ptr<int> a_shared_int = make_shared<int>(5);

    {
        // A list that shares ownership of integers with anyone that has
        // copied the shared pointer.
        list<shared_ptr<int>> list_shared_integers;

        list_shared_integers.push_back(a_shared_int);

        // Editing and reading obviously works.
        const shared_ptr<int> a_ref_to_int = list_shared_integers.back();
        (*a_ref_to_int)++;
        cout << *a_ref_to_int << endl;

    } // list_shared_integers goes out of scope, but the integer is not
    // freed, because a_shared_int still refers to it.

    // a_shared_int is still accessible.
    (*a_shared_int)++;
    cout << (*a_shared_int) << endl;

} // Now the integer is deallocated, because the last shared_ptr to it
// goes out of scope.

A good exercise to understand ownership, memory allocation/deallocation, and shared pointers is to work through a tutorial where you implement your own smart pointers. Then you will understand exactly how to use smart pointers, and you will have one of those zen moments where you realise how pretty much everything in C++ comes back to RAII (tying ownership of a resource to the lifetime of an object).
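To give a flavour of that exercise, here is a minimal sketch of a uniquely owning pointer. OwnedPtr, Tracked and demo_owned_ptr are names I made up for illustration; this is a teaching toy under those assumptions, not a replacement for std::unique_ptr.

```cpp
#include <cassert>
#include <utility>

// Hypothetical teaching type; use std::unique_ptr in real code.
template <class T>
class OwnedPtr {
public:
    explicit OwnedPtr(T* p = nullptr) noexcept : ptr_(p) {}
    ~OwnedPtr() { delete ptr_; }  // RAII: the resource is released here

    OwnedPtr(const OwnedPtr&) = delete;  // unique ownership: no copies
    OwnedPtr& operator=(const OwnedPtr&) = delete;

    OwnedPtr(OwnedPtr&& other) noexcept : ptr_(other.ptr_) {
        other.ptr_ = nullptr;  // the source gives up ownership
    }
    OwnedPtr& operator=(OwnedPtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    T& operator*() const { return *ptr_; }
    T* get() const noexcept { return ptr_; }
    explicit operator bool() const noexcept { return ptr_ != nullptr; }

private:
    T* ptr_;
};

// Hypothetical helper that counts live objects so destruction is observable.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of Tracked objects still alive after the scope ends.
inline int demo_owned_ptr() {
    {
        OwnedPtr<Tracked> a(new Tracked);
        OwnedPtr<Tracked> b(std::move(a));  // ownership moves from a to b
        assert(!a && b);
        assert(Tracked::alive == 1);
    }  // b's destructor deletes the object
    return Tracked::alive;
}
```

The deleted copy operations are what make the ownership unique; a shared variant would add a reference count instead.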

So, back to the crux of your question. If you want to stick to nodes of type T, don't wrap the node in a smart pointer. The Node destructor must delete the raw pointer it holds. That raw pointer may point to an object that is itself a smart pointer, if that is what was specified as T. When your LinkedList's destructor is called, it iterates through all nodes via Node::next and calls delete node; after it has obtained the pointer to the next node.
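A rough sketch of that destructor loop, assuming for brevity that the node stores T by value (rather than through a raw pointer) and that the list offers only push_front. The member names and demo_front_value are my own, not taken from your code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical node: holds a T by value and a raw link to the next node.
template <class T>
struct Node {
    T value;
    Node* next;
};

// Hypothetical minimal list: the list owns its nodes through raw pointers.
template <class T>
class LinkedList {
public:
    LinkedList() = default;
    LinkedList(const LinkedList&) = delete;  // copying omitted for brevity
    LinkedList& operator=(const LinkedList&) = delete;

    ~LinkedList() {
        Node<T>* node = head_;
        while (node != nullptr) {
            Node<T>* next = node->next;  // obtain the successor first
            delete node;                 // then free the current node
            node = next;
        }
    }

    void push_front(T value) {
        head_ = new Node<T>{std::move(value), head_};
        ++size_;
    }

    std::size_t size() const { return size_; }
    const T& front() const { return head_->value; }

private:
    Node<T>* head_ = nullptr;
    std::size_t size_ = 0;
};

// T may itself be a smart pointer; the list does not care.
inline int demo_front_value() {
    LinkedList<std::unique_ptr<int>> list;
    list.push_front(std::make_unique<int>(1));
    list.push_front(std::make_unique<int>(2));
    assert(list.size() == 2);
    return *list.front();
}  // the nodes are deleted here, and each unique_ptr frees its int
```

Because the loop frees one node per iteration, the depth of the call stack does not depend on the length of the list.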

You could create a list where the nodes are held by smart pointers, but this is a very specialised linked list, probably called SharedLinkedList or UniqueLinkedList, with very different semantics for object creation, popping, etc. Just as an example, a UniqueLinkedList would move a node into the return value when popping a value to a caller. Handling this with template metaprogramming would require partial specialization on the type T that is passed in. For example, something like:

#include <memory>

// Node is assumed to be defined elsewhere (as in your question).
template<class T>
struct Node;

template<class T>
struct LinkedList
{
    Node<T> *head;
};

// The very start of a LinkedList with shared ownership. In all your access
// methods, etc... you will be returning copies of the appropriate pointer,
// therefore creating another reference to the underlying data.
template<class T>
struct LinkedList<std::shared_ptr<T>>
{
    std::shared_ptr<Node<T>> head;
};

Now you are starting to implement your own STL! You can already see the potential for problems with this approach, as mentioned in the comments on your question. If each node holds a shared_ptr to the next node, destroying the head calls that node's destructor, which calls the next node's destructor, and so forth; for a long list, this recursion can overflow the stack. That is why I don't care much for this approach.
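If you go down this road anyway, one common way around the recursion is to unlink the nodes in a loop before they are destroyed. The following is only a sketch under my own assumptions: SharedNode, SharedLinkedList and demo_long_list are hypothetical names, and the list supports nothing but push_front and clear.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical node whose link is an owning smart pointer.
template <class T>
struct SharedNode {
    T value;
    std::shared_ptr<SharedNode> next;
};

template <class T>
class SharedLinkedList {
public:
    SharedLinkedList() = default;
    SharedLinkedList(const SharedLinkedList&) = delete;
    SharedLinkedList& operator=(const SharedLinkedList&) = delete;
    ~SharedLinkedList() { clear(); }

    void push_front(T value) {
        auto node = std::make_shared<SharedNode<T>>();
        node->value = std::move(value);
        node->next = std::move(head_);
        head_ = std::move(node);
    }

    // Detach one node per iteration: the old head is released with an
    // already-empty `next`, so destructor calls never nest.
    void clear() {
        while (head_) {
            head_ = std::move(head_->next);
        }
    }

    bool empty() const { return head_ == nullptr; }

private:
    std::shared_ptr<SharedNode<T>> head_;
};

// Builds a long list and tears it down without deep recursion.
inline bool demo_long_list() {
    SharedLinkedList<int> list;
    for (int i = 0; i < 200000; ++i) {
        list.push_front(i);
    }
    assert(!list.empty());
    list.clear();  // a loop, not 200000 nested destructor calls
    return list.empty();
}
```

Note that if outside code still holds a shared_ptr to a node, clear() still empties that node's next link; whether that is acceptable depends on the semantics you want.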


There are basically two alternatives to set up a smart-pointer enhanced list:

  1. Using std::unique_ptr:

    template<typename T>
    struct Node
    {
         Node* _prev;
         std::unique_ptr<Node> _next;
         T data;
    };
    
    std::unique_ptr<Node<T> > root; //inside list
    

    That would be my first choice. The unique_ptr _next ensures that there are no memory leaks, whereas _prev is an observing pointer. However, the copy constructor and similar operations, in case you need them, have to be defined and implemented by hand.

  2. Using shared_ptr:

    template<typename T>
    struct Node
    {
         std::weak_ptr<Node> _prev;   //or as well Node*
         std::shared_ptr<Node> _next;
         T data;
    };
    
    std::shared_ptr<Node<T> > root; //inside list
    

    This alternative is copyable by design and adds further safety because of the weak_ptr, see below. It is less performant than the unique_ptr variant when it comes to structural changes of the list, such as insertions and removals, e.g. because of the atomic reference counting in shared_ptr's control block.

    Yet, traversing the list, i.e. dereferencing the pointers, should be as performant as for the unique_ptr.

In both approaches the idea is that each node owns the complete remaining list. When a node goes out of scope, there is no danger that the remaining list becomes a memory leak, because destroying that node destroys its successors in turn (the destructor calls nest, so the last node is the first to finish being destroyed).
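As a sketch of the first alternative, assuming a list that only needs push_back: UNode, UniqueList, root_, tail_ and demo_walk_backwards are hypothetical names of mine, while _prev, _next and data follow the node layout above. The destructor unlinks nodes in a loop, which is my own addition to keep very long lists from recursing deeply.

```cpp
#include <cassert>
#include <memory>
#include <utility>

template <typename T>
struct UNode {
    UNode* _prev = nullptr;        // observing: never new'd or deleted
    std::unique_ptr<UNode> _next;  // owning: keeps the rest of the list alive
    T data{};
};

template <typename T>
class UniqueList {
public:
    UniqueList() = default;

    ~UniqueList() {
        // Tear down front to back in a loop instead of relying on the
        // nested destructor calls of the _next chain.
        while (root_) {
            root_ = std::move(root_->_next);
        }
    }

    void push_back(T value) {
        auto node = std::make_unique<UNode<T>>();
        node->data = std::move(value);
        node->_prev = tail_;
        UNode<T>* raw = node.get();
        if (tail_) {
            tail_->_next = std::move(node);  // the old tail owns the new node
        } else {
            root_ = std::move(node);         // the list owns the first node
        }
        tail_ = raw;
    }

    UNode<T>* front() const { return root_.get(); }
    UNode<T>* back() const { return tail_; }

private:
    std::unique_ptr<UNode<T>> root_;
    UNode<T>* tail_ = nullptr;  // observing pointer to the last node
};

// Walks the list backwards through the observing _prev pointers.
inline int demo_walk_backwards() {
    UniqueList<int> list;
    for (int i = 1; i <= 3; ++i) {
        list.push_back(i);
    }
    int digits = 0;
    for (UNode<int>* n = list.back(); n != nullptr; n = n->_prev) {
        digits = digits * 10 + n->data;  // visits 3, then 2, then 1
    }
    assert(list.front()->_prev == nullptr);
    return digits;
}
```

Since unique_ptr is not copyable, UniqueList is not copyable either unless you write a deep copy by hand.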

In both options the _prev pointer is only an observing pointer: its task is not to keep the previous nodes alive, but only to provide a link through which to visit them. For that, a Node * is usually sufficient (note: observing pointer means that you never do memory-related things such as new or delete on the pointer).

If you want more safety, you can also use a std::weak_ptr, which guards against things like

std::shared_ptr<Node<T> > n;
{
    list<T> li;
    //fill the list
    n = li.root->_next->_next; //let's say that works for this example
}
n->_prev; //dangling pointer, the list does not exist anymore

Using a weak_ptr, you can lock() it and in this way check whether _prev is still valid.
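A small sketch of that check, assuming the shared_ptr node layout from the second alternative. WNode, has_live_prev and demo_prev_expires are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>

template <typename T>
struct WNode {
    std::weak_ptr<WNode> _prev;    // observing, but its validity can be checked
    std::shared_ptr<WNode> _next;  // owning
    T data{};
};

// Returns true if the previous node is still alive.
template <typename T>
bool has_live_prev(const WNode<T>& node) {
    // lock() yields an empty shared_ptr once the target has been destroyed.
    std::shared_ptr<WNode<T>> prev = node._prev.lock();
    return prev != nullptr;
}

// Keeps only the second node alive and reports whether its _prev survived.
inline bool demo_prev_expires() {
    std::shared_ptr<WNode<int>> second;
    {
        auto first = std::make_shared<WNode<int>>();
        first->_next = std::make_shared<WNode<int>>();
        first->_next->_prev = first;
        second = first->_next;
        assert(has_live_prev(*second));
    }  // the first node is destroyed here; the second survives via `second`
    return has_live_prev(*second);
}
```

With a raw Node* in place of the weak_ptr, the same situation would leave _prev dangling with no way to detect it.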


You do not "need" to use a smart pointer for a linked list, because that statement doesn't make sense. You do not use smart pointers for low-level data structures. You use smart pointers for high-level program logic.

As far as low-level data structures are concerned, you use a standard container class from the C++ standard library, like std::list [*], which solves all your memory-management problems anyway, without using any smart pointers internally.

If you really, really need your own highly specialised/optimised custom container class because the entire C++ standard library is unfit for your requirements and you need a replacement for std::list, std::vector, std::unordered_map and other optimised, tested, documented and safe containers (which I very much doubt!), then you have to manage memory manually anyway, because the point of such a specialised class will almost certainly be the need for techniques like memory pools, copy-on-write or even garbage collection, all of which conflict with a typical smart pointer's rather simplistic deletion logic.

In the words of Herb Sutter:

Never use owning raw pointers and delete, except in rare cases when implementing your own low-level data structure (and even then keep that well encapsulated inside a class boundary).

Something along those lines is also expressed in Herb Sutter's and Bjarne Stroustrup's C++ Core Guidelines:

This problem cannot be solved (at scale) by transforming all owning pointers to unique_ptrs and shared_ptrs, partly because we need/use owning "raw pointers" as well as simple pointers in the implementation of our fundamental resource handles. For example, common vector implementations have one owning pointer and two non-owning pointers.

Writing a linked-list class in C++ with raw pointers can be a useful academic exercise. Writing a linked-list class in C++ with smart pointers is a pointless academic exercise. Using either of these two self-made things in production code is almost automatically wrong.


[*] Or just std::vector, because due to cache locality that will almost always be the better choice anyway.