What is the difference between indexing and sharding?

Indexing is a way to store column values in a data structure designed for fast searching. This speeds up a search tremendously compared to a full table scan, since not every row has to be examined. You should consider indexing the columns that appear in your WHERE clauses.
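To see the difference between a full table scan and an indexed lookup, here is a minimal sketch using Python's built-in sqlite3 module. The table `users`, its columns, and the index name `idx_users_email` are made up for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(10_000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
args = ("user1234@example.com",)

# Without an index on email, SQLite has to scan the whole table.
plan_without_index = query_plan(conn, lookup, args)

# Index the column used in the WHERE clause, then ask for the plan again.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(conn, lookup, args)

print(plan_without_index)
print(plan_with_index)
```

The first plan should report a scan of `users`, while the second should mention a search using `idx_users_email`.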

Sharding is a technique for splitting a table across different machines. This makes it possible to resolve queries in parallel. For example, half the table can be searched on one machine and the other half on another machine. In some cases this makes it possible to increase performance by adding more hardware, especially for large tables.
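The idea can be sketched in a few lines of Python. This is only a toy model: each "machine" is a plain list in the same process, the shard key (`user_id` modulo the shard count) is one possible choice among many, and all names are hypothetical. A real sharded database also has to deal with networking, rebalancing, and failures.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 2  # hypothetical: each list below stands in for one machine


def shard_for(user_id: int) -> int:
    """Pick a shard from the shard key (here: user_id modulo shard count)."""
    return user_id % NUM_SHARDS


# Distribute ten made-up rows across the shards.
shards = [[] for _ in range(NUM_SHARDS)]
for user_id in range(1, 11):
    row = {"id": user_id, "country": "SE" if user_id % 3 == 0 else "US"}
    shards[shard_for(user_id)].append(row)


def search_shard(rows, country):
    """The part of the query that runs on a single shard."""
    return [row for row in rows if row["country"] == country]


def search_all_shards(country):
    """Send the query to every shard, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        partials = list(pool.map(search_shard, shards, [country] * NUM_SHARDS))
    merged = [row for partial in partials for row in partial]
    return sorted(merged, key=lambda row: row["id"])


def find_by_id(user_id):
    """A lookup on the shard key only needs to visit one shard."""
    for row in shards[shard_for(user_id)]:
        if row["id"] == user_id:
            return row
    return None


print(search_all_shards("SE"))
print(find_by_id(7))
```

A query on the shard key (`find_by_id`) touches a single shard, while a query on any other column (`search_all_shards`) has to visit every shard and merge the partial results.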


Indexing is the process of storing column values in a data structure such as a B-tree or a hash table. It makes search and join queries faster than they would be without an index, because looking up values takes less time. Sharding splits a single table across multiple machines. For both indexing and sharding, it is necessary to select an appropriate key.
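The two index structures mentioned above behave differently, which a rough Python sketch can show. A dict plays the role of a hash index, and a sorted list searched with `bisect` stands in for a B-tree (a real B-tree is a balanced multi-way tree, not a sorted list, but both keep the keys in order). The sample data and function names are invented for the example.

```python
import bisect

# Hypothetical column values; the list position plays the role of the row id.
emails = [
    "dana@example.com",
    "alice@example.com",
    "carol@example.com",
    "bob@example.com",
]

# Hash-style index: value -> row position. Good for equality lookups only.
hash_index = {email: pos for pos, email in enumerate(emails)}

# Ordered index (stand-in for a B-tree): sorted (value, row position) pairs.
ordered_index = sorted((email, pos) for pos, email in enumerate(emails))
ordered_keys = [key for key, _ in ordered_index]


def lookup_hash(email):
    """Equality lookup through the hash index."""
    return hash_index.get(email)


def lookup_ordered(email):
    """Equality lookup through the ordered index, using binary search."""
    i = bisect.bisect_left(ordered_keys, email)
    if i < len(ordered_keys) and ordered_keys[i] == email:
        return ordered_index[i][1]
    return None


def range_ordered(low, high):
    """Row positions whose key satisfies low <= key < high."""
    start = bisect.bisect_left(ordered_keys, low)
    end = bisect.bisect_left(ordered_keys, high)
    return [pos for _, pos in ordered_index[start:end]]


print(lookup_hash("carol@example.com"))
print(lookup_ordered("carol@example.com"))
print(range_ordered("b", "d"))
```

Both structures answer equality lookups without examining every row, but only the ordered one can answer a range query such as "all keys from b up to d" without a full scan.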

For large tables, you should consider both indexing and sharding. For example, consider a table X with 1 million rows. If you search for a key K in X and the key column is indexed, query processing can jump almost directly to the row R that contains the key and return R to the user, instead of scanning all the rows. If you have not reached your storage limit, in most cases you don't need to shard the table. If you exceed your storage limit, you have to shard. There is no benefit in sharding a small table, as it adds the overhead of network communication and of aggregating the results of the sub-queries.