Why is quicksort better than mergesort?

As many people have noted, the average case performance for quicksort is faster than mergesort. But this is only true if you are assuming constant time to access any piece of memory on demand.

In RAM this assumption is generally not too bad (it is not always true because of caches, but it is not too bad). However if your data structure is big enough to live on disk, then quicksort gets killed by the fact that your average disk does something like 200 random seeks per second. But that same disk has no trouble reading or writing megabytes per second of data sequentially. Which is exactly what mergesort does.

Therefore if data has to be sorted on disk, you really, really want to use some variation on mergesort. (Generally you quicksort sublists, then start merging them together above some size threshold.)

Furthermore if you have to do anything with datasets of that size, think hard about how to avoid seeks to disk. For instance this is why it is standard advice that you drop indexes before doing large data loads in databases, and then rebuild the index later. Maintaining the index during the load means constantly seeking to disk. By contrast if you drop the indexes, then the database can rebuild the index by first sorting the information to be dealt with (using a mergesort of course!) and then loading it into a BTREE datastructure for the index. (BTREEs are naturally kept in order, so you can load one from a sorted dataset with few seeks to disk.)

There have been a number of occasions where understanding how to avoid disk seeks has let me make data processing jobs take hours rather than days or weeks.


Quicksort has O(n2) worst-case runtime and O(nlogn) average case runtime. However, it’s superior to merge sort in many scenarios because many factors influence an algorithm’s runtime, and, when taking them all together, quicksort wins out.

In particular, the often-quoted runtime of sorting algorithms refers to the number of comparisons or the number of swaps necessary to perform to sort the data. This is indeed a good measure of performance, especially since it’s independent of the underlying hardware design. However, other things – such as locality of reference (i.e. do we read lots of elements which are probably in cache?) – also play an important role on current hardware. Quicksort in particular requires little additional space and exhibits good cache locality, and this makes it faster than merge sort in many cases.

In addition, it’s very easy to avoid quicksort’s worst-case run time of O(n2) almost entirely by using an appropriate choice of the pivot – such as picking it at random (this is an excellent strategy).

In practice, many modern implementations of quicksort (in particular libstdc++’s std::sort) are actually introsort, whose theoretical worst-case is O(nlogn), same as merge sort. It achieves this by limiting the recursion depth, and switching to a different algorithm (heapsort) once it exceeds logn.