Why and where to use INDEXes - pros and cons

What are indexes for, what are they in database?

Without index on column user_name system would have to scan the entire Example table on a row-by-row basis to find all matching entries. If the data distribution in particular table points that there are only a few rows or so this is clearly an inefficient way of obtaining those rows.

However, when using indexes, you are redirecting the power of search to a different, tree structure, that has faster lookups and very small depth.

Please have in mind, that indexes are pure redundancy. Database index is just like a telephone book one or any other index in a book you might be willing to read (probably a part of, to quickly find what you're looking for).

If you are interested in a chapter of a book the index lets you find it relatively quickly so that you don't have to skim through many pages to get it.

Why aren't indexes created on default?

Index is a data structure that is created alongside a table and maintains itself whenever a table is changed. The fact of it's existance implies usage of data storage.

If you would index every column on a large table, the storage needed to keep indexes would exceed the size of table itself by far.

Self maintenance of an index structure also means that whenever an UPDATE, INSERT, DELETE occurs, the index has to be updated (it's done automatically and does not require your action), but that costs time which means these operations are performed slower.

There are situations, when you need to retrieve most of the table (eg 90% of rows will be in the output), or the entire table, and in this case Sequence scan of the whole table (behaviour without an index) would be more efficient than doing the tree traversal and leaf node chain (which is the behaviour for navigating the index tree structure).


Well you can probably fill books about indices but in short here a few things to think about, when creating an index:

While it (mostly) speeds up a select, it slows down inserts, updates and deletes because the database engine does not have to write the data only, but the index, too. An index need space on hard disk (and much more important) in RAM. An index that can not be held in RAM is pretty useless. An index on a column with only a few different values doesn't speed up selects, because it can not sort out much rows (for example a column "gender", which usually has only two different values - male, female).

If you use MySQL for example you can check, if the engine uses an index by adding "explain" before the select - for your above example EXPLAIN SELECT TestField FROM Example WHERE username=XXXX


The main reason why don't we use an index as a default is the maintenance problem. i.e when we generally update(insert,delete,or update) that particular column which is indexed in a table then the index must be updated dynamically which is a bit time consuming process. Moreover it becomes an overhead to maintain this index.