Neo4j in distributed mode - is it possible?

It sounds like you're asking about database sharding. The short answer is no, this feature isn't supported.

Neo4j has two primary clustering modes: the older HA (High Availability) clustering and the newer Causal Clustering. Both require Enterprise Edition, and in both cases every instance participating in the cluster must contain the entire graph.

For now I'll stick with causal clustering, as that's where feature development is continuing.

Reads scale horizontally: you can add read replicas to the cluster. With the bolt+routing protocol, the driver routes explicit read transactions to one of the followers or to a read replica, taking load into account to some degree.

Writes scale vertically only, because only one node at a time (the elected leader) is allowed to write. It is therefore critical that all core nodes (the nodes in the cluster that can potentially be elected leader) have adequate RAM and disk space and run on SSDs.
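To make the read/write routing concrete, here is a minimal sketch using the official `neo4j` Python driver. It assumes the 1.7-era API, where the `bolt+routing://` scheme and `session.read_transaction` / `session.write_transaction` are available. The host name, credentials, `Person` label and helper function names are placeholders of mine, not anything from your setup.

```python
# Sketch only: assumes the official `neo4j` Python driver with the 1.7-era API.
# URI, credentials, label and property names are hypothetical placeholders.

def make_driver(uri="bolt+routing://core1.example.com:7687",
                user="neo4j", password="secret"):
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase
    return GraphDatabase.driver(uri, auth=(user, password))


def count_people(tx):
    # Transaction function: the driver passes in a transaction object.
    record = tx.run("MATCH (p:Person) RETURN count(p) AS total").single()
    return record["total"]


def add_person(tx, name):
    tx.run("CREATE (p:Person {name: $name})", name=name)


def read_total(session):
    # Explicit read transaction: eligible for routing to a follower
    # or a read replica.
    return session.read_transaction(count_people)


def write_person(session, name):
    # Explicit write transaction: routed to the current leader.
    return session.write_transaction(add_person, name)


def demo():
    # Not called automatically; needs a reachable cluster.
    driver = make_driver()
    with driver.session() as session:
        write_person(session, "Alice")
        print(read_total(session))
    driver.close()
```

The point is that routing depends on declaring the transaction type explicitly; auto-commit `session.run(...)` calls don't give the driver that information. Newer driver versions use the `neo4j://` scheme instead of `bolt+routing://` and have renamed the transaction methods, so check the documentation for the version you run.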

EDIT:

Neo4j Fabric was introduced in January 2020 with the release of Neo4j 4.0. It allows data to be sharded across multiple databases or clusters (which need no additional configuration to act as shards), and provides ways to query across those shards and work with the combined results.
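As a rough illustration of querying across shards, the sketch below sends a Fabric-style Cypher query from Python. It assumes Neo4j 4.x, a Fabric database that I have called `fabric`, a `Person` label on each shard, and a driver session opened against that Fabric database (for example `driver.session(database="fabric")` in the 4.x Python driver). All of these names are placeholders.

```python
# Sketch only: assumes Neo4j 4.x Fabric with a Fabric database named "fabric".
# The database name, label and function name are hypothetical.

FABRIC_COUNT_QUERY = """
UNWIND fabric.graphIds() AS graphId
CALL {
  USE fabric.graph(graphId)
  MATCH (p:Person)
  RETURN count(p) AS shardTotal
}
RETURN sum(shardTotal) AS total
"""


def count_across_shards(session):
    # `session` must be opened against the Fabric database itself,
    # not against one of the individual shard databases.
    record = session.run(FABRIC_COUNT_QUERY).single()
    return record["total"]
```

Each pass through the `CALL { USE ... }` subquery runs on one shard, and the outer query combines the per-shard results. Later Neo4j releases restructured this feature, so treat the syntax as version-specific.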


Neo4j Enterprise has clustering, but it is for high availability.

It does not shard data the way TigerGraph does, for example.

Each instance (node) in the cluster holds a replica of the full data set.

Tags:

Neo4J