How does Cassandra choose the coordinator node and the replication nodes?

The coordinator node is typically chosen by an algorithm which takes "network distance" into account. Any node can act as the coordinator. At first, requests are sent to the nodes your driver knows about, but once the driver connects and learns the topology of your cluster, it may switch to a "closer" coordinator.

The coordinator only stores data locally (on a write) if it ends up being one of the nodes responsible for the data's token range.
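To make the token-range point concrete, here is a minimal sketch of a toy token ring. It is not Cassandra's actual partitioner or replication strategy: the class `ToyRing`, its method names, and the small integer tokens are all made up for illustration, and the "walk clockwise and take the next distinct nodes" rule is only similar in spirit to what SimpleStrategy does.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Cassandra's real classes.
class ToyRing {
    // token -> node that owns the range ending at that token (inclusive)
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node, long token) {
        ring.put(token, node);
    }

    // Start at the first token >= keyToken, walk clockwise (wrapping around),
    // and collect up to `replicationFactor` distinct nodes.
    List<String> replicasFor(long keyToken, int replicationFactor) {
        List<String> clockwise = new ArrayList<>(ring.tailMap(keyToken, true).values());
        clockwise.addAll(ring.headMap(keyToken, false).values());

        Set<String> replicas = new LinkedHashSet<>();
        for (String node : clockwise) {
            if (replicas.size() == replicationFactor) {
                break;
            }
            replicas.add(node);
        }
        return new ArrayList<>(replicas);
    }

    // The coordinator keeps a local copy only if it is itself a replica.
    boolean coordinatorStoresLocally(String coordinator, long keyToken, int replicationFactor) {
        return replicasFor(keyToken, replicationFactor).contains(coordinator);
    }

    public static void main(String[] args) {
        ToyRing ring = new ToyRing();
        ring.addNode("A", 0L);
        ring.addNode("B", 100L);
        ring.addNode("C", 200L);
        ring.addNode("D", 300L);

        System.out.println(ring.replicasFor(150L, 2));                   // prints [C, D]
        System.out.println(ring.coordinatorStoresLocally("A", 150L, 2)); // prints false
    }
}
```

In this toy setup, a write for a key with token 150 that arrives at node A is only forwarded by A to C and D. A would store it locally only if A were one of the replicas.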


The coordinator is selected by the driver based on the load balancing policy you have set. Common policies are DCAwareRoundRobinPolicy and TokenAwarePolicy.

For DCAwareRoundRobinPolicy, the driver selects the coordinator node by cycling round-robin through the nodes in the local data center. See more here: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html
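The round-robin idea can be sketched without the driver. The following is a simplified illustration, not the driver's implementation: `ToyDcAwareRoundRobin` and its `queryPlan` method are hypothetical names, and the sketch ignores remote data centers and node up/down state entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: hypothetical names, not the driver's real classes.
class ToyDcAwareRoundRobin {
    private final List<String> localNodes = new ArrayList<>();
    private final AtomicInteger counter = new AtomicInteger();

    // nodeToDc maps node name -> data center name; only nodes in localDc are kept,
    // in the iteration order of the map that was passed in.
    ToyDcAwareRoundRobin(String localDc, Map<String, String> nodeToDc) {
        for (Map.Entry<String, String> entry : nodeToDc.entrySet()) {
            if (entry.getValue().equals(localDc)) {
                localNodes.add(entry.getKey());
            }
        }
    }

    // Returns the local nodes in the order they would be tried. The first entry
    // plays the role of the coordinator; each call starts one position later.
    List<String> queryPlan() {
        int n = localNodes.size();
        List<String> plan = new ArrayList<>(n);
        if (n == 0) {
            return plan;
        }
        int start = Math.floorMod(counter.getAndIncrement(), n);
        for (int i = 0; i < n; i++) {
            plan.add(localNodes.get((start + i) % n));
        }
        return plan;
    }
}
```

With two local nodes, successive calls to `queryPlan()` alternate which node comes first, so coordinator duty is spread evenly across the local data center.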

For TokenAwarePolicy, the driver selects a coordinator node that is a replica for the data being queried, to reduce "hops" and latency. More info: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/TokenAwarePolicy.html

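It is a best practice to wrap policies, so that there is a primary policy and a secondary (child) policy to fall back on should there be an issue. For example, TokenAwarePolicy wraps a child policy such as DCAwareRoundRobinPolicy. More information is available at the links above.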
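As a rough illustration of what "wrapping" means, the sketch below puts a token-aware layer on top of an arbitrary child plan. It is a toy model under stated assumptions, not the driver's code: `ToyTokenAware` is a hypothetical class, the child policy is reduced to a `Supplier` of node names, and the replica set for a key is simply passed in by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Toy model only: hypothetical names, not the driver's real classes.
class ToyTokenAware {
    // The wrapped (child) policy, reduced to "give me your next plan".
    private final Supplier<List<String>> childPlan;

    ToyTokenAware(Supplier<List<String>> childPlan) {
        this.childPlan = childPlan;
    }

    // Nodes that are replicas for the key are tried first, in the child's order.
    // The remaining nodes from the child plan follow as fallbacks. If no replicas
    // are known, the result is exactly the child plan.
    List<String> queryPlan(Set<String> replicasForKey) {
        List<String> child = childPlan.get();
        List<String> plan = new ArrayList<>(child.size());
        for (String node : child) {
            if (replicasForKey.contains(node)) {
                plan.add(node);
            }
        }
        for (String node : child) {
            if (!replicasForKey.contains(node)) {
                plan.add(node);
            }
        }
        return plan;
    }
}
```

The design point is that the outer policy only reorders what the inner policy offers: if the replicas are unknown or unavailable, requests still go somewhere sensible. In the 2.1 Java driver the same shape is expressed by passing a child policy to the TokenAwarePolicy constructor.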