Redis, distributed or not?

Regarding question 1, Redis is an in-memory store with some persistence capabilities. Your entire dataset must fit in memory, so a single instance is limited by the maximum memory of your server. You can also shard the data across several Redis instances running on multiple servers. Provided you have the budget for it, it is perfectly possible to store 100 GB to 1 TB of data on a set of Redis boxes. Please note that sharding is not automatic: it has to be implemented by the client or the application. It also puts some constraints on the operations you can run on your data (for instance, it is not possible to calculate server-side the intersection of two sets hosted by different Redis instances).
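To make the "sharding has to be implemented by the client" point concrete, here is a minimal sketch of client-side key routing. The shard addresses and the `shard_for` helper are hypothetical names chosen for illustration, and CRC32-modulo is just one simple placement scheme, not something Redis mandates.

```python
import zlib

# Hypothetical shard addresses; replace with your own Redis instances.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(key, shards=SHARDS):
    """Return the shard that owns `key`, using a stable hash modulo the shard count."""
    # zlib.crc32 is deterministic across processes, unlike Python's built-in hash().
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]
```

The application would then open a connection to `shard_for(key)` before issuing each command. Two caveats with this simple scheme: changing the number of shards remaps most keys (consistent hashing reduces that), and a command such as SINTER cannot span shards, so the client would have to fetch both sets (for example with SMEMBERS) and intersect them itself.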

Regarding question 2, a single Redis instance is not a distributed system. It is a remote centralized store. By using several Redis instances, however, you can build a distributed system. Because it is a do-it-yourself approach, you can decide to make it a CP or an AP system.

A single instance can replicate its activity to slave instances (which are therefore eventually consistent with the master). The application can choose to always connect to the master for both reads and writes. In that case, you may get a CP system. It can also write to the master and read from all instances (including slaves), in which case you may get an AP system. I said "may" because it requires some significant work to build such systems on top of Redis.
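The two connection strategies can be sketched as a small router. This is only an illustration under stated assumptions: `ReadWriteRouter` and `FakeStore` are hypothetical names, the router expects any client object exposing `get(key)` and `set(key, value)`, and the in-memory `FakeStore` stands in for real connections and does not replicate anything.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master; send reads to the master or to the replicas."""

    def __init__(self, master, replicas=(), read_from_replicas=False):
        self.master = master
        # With read_from_replicas=False every read goes to the master.
        use_replicas = read_from_replicas and len(replicas) > 0
        targets = list(replicas) if use_replicas else [master]
        # Rotate over the read targets in round-robin order.
        self._read_cycle = itertools.cycle(targets)

    def set(self, key, value):
        return self.master.set(key, value)

    def get(self, key):
        return next(self._read_cycle).get(key)


class FakeStore:
    """In-memory stand-in for a Redis client, for illustration only."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)
```

Reading from replicas spreads the load, but a read can return stale or missing data until replication catches up. Handling failover, reconnection, and promotion of a new master is the "significant work" this sketch leaves out.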

You can mix sharding and master/slave replication to build the distributed system you need. However, Redis only provides the basic building blocks to do this. In particular, it offers very little to deal with resiliency and HA (and to address the P in the CAP theorem). IMO, Redis Sentinel alone is not enough to support an HA Redis configuration, since it only covers role management. You need to complement it with a resource manager, and put a lot of logic into the client/application.

There is an ongoing project called Redis Cluster, whose purpose is to provide a minimalistic, ready-to-use distributed system, but at the time of writing it still lacks a lot of things and is not yet usable in production.

If you need an off-the-shelf distributed store, Redis is probably not a good option. You will be better served by Cassandra, Riak, MongoDB, Couchbase, Aerospike, MySQL Cluster, Oracle NoSQL, etc. However, if you want to build your own specialized system, Redis is an excellent component to build upon.


Here is a useful link, Redis Cluster Tutorial:

https://redis.io/topics/cluster-tutorial

You might also benefit from looking at the Facebook solution with memcache:

https://engineering.fb.com/web/introducing-mcrouter-a-memcached-protocol-router-for-scaling-memcached-deployments/