Choice of programming language for learning data structures and algorithms

I would recommend Java mainly because:

  • garbage collection
  • references
  • rich collections

EDIT: Down voters please explain.


The answer to this question depends on exactly what you want to learn.

Python and Ruby

High-level languages like Python and Ruby are often suggested because they are high level and the syntax is quite readable. However, these languages all have abstractions for the common data structures. There's nothing stopping you implementing your own versions as a learning exercise but you may find that you're building high-level data structures on top of other high-level data structures, which isn't necessarily useful.

Also, Ruby and Python are dynamically-typed languages. This can be good but it can also be confusing for the beginner and it can be harder (initially) to catch errors since they typically won't be apparent until runtime.

C

C is at the other extreme. It's good if you want to learn really low-level details like how the memory is managed but memory management is suddenly an important consideration, as in, correct usage of malloc()/free(). That can be distracting. Also, C isn't object-oriented. That's not a bad thing but simply worth noting.

C++

C++ has been mentioned. As I said in the comment, I think this is a terrible choice. C++ is hideously complicated even in simple usage and has a ridiculous amount of "gotchas". Also, C++ has no common base class. This is important because data structures like hash tables rely on there being a common base class. You could implement a version for a nominal base class but it's a little less useful.

Java

Java has also been mentioned. Many people like to hate Java and it's true that the language is extremely verbose and lacking in some of the more modern language features (eg closures) but none of that really matters. Java is statically typed and has garbage collection. This means the Java compiler will catch many errors that dynamically typed languages won't (until runtime) and there's no dealing with segmentation faults (which isn't to say you can't leak memory in Java; obviously you can). I think Java is a fine choice.

C#

C# the language is like a more modern version of Java. Like Java, it is a managed (garbage collected) intermediate compiled language that runs on a virtual machine. Every other language listed here apart from C/C++ also run on a virtual machine but Python, Ruby, etc are interpreted directly rather than compiled to bytecode.

C# has the same pros and cons as Java, basically.

Haskell (etc)

Lastly, you have functional languages: Haskell, OCaml, Scheme/Lisp, Clojure, F#, etc. These think about all problems in a very different way and are worth learning at some point but again it comes down to what you want to learn: functional programming or data structures? I'd stick to learning one thing at a time rather than confusing the issue. If you do learn a functional language at some point (which I would recommend), Haskell is a safe and fine choice.

My Advice

Pick Java or C#. Both have free, excellent IDEs (Eclipse, Netbeans and IntelliJ Community Edition for Java, Visual Studio Express for C#, Visual studio community edition) that make writing and running code a snap. If you use no native data structure more complex than an array and any object you yourself write you'll learn basically the same thing as you would in C/C++ but without having to actually manage memory.

Let me explain: an extensible hash table needs to be resized if sufficient elements are added. In any implementation that will mean doing something like doubling the size of the backing data structure (typically an array) and copying in the existing elements. The implementation is basically the same in all imperative languages but in C/C++ you have to deal with segmentation faults when you don't allocate or deallocate something correctly.

Python or Ruby (it doesn't really matter which) would be my next choice (and very close to the other two) just because the dynamic typing could be problematic at first.


In my opinion, C would be the best language to learn data structures and algorithms because it will force you to write your own. It will force you to understand pointers, dynamic memory allocation, and the implementations behind the popular data structures like linked lists, hash tables, etc. Many of which are things you can take for granted in higher level languages (Java, C#, etc.).