How do you programmatically configure Hazelcast for the multicast discovery mechanism?

The problem apparently is that the cluster starts (and stops) without waiting until enough members have joined. You can set the hazelcast.initial.min.cluster.size property to prevent this from happening.

You can set 'hazelcast.initial.min.cluster.size' programmatically using:

import com.hazelcast.config.Config;

Config config = new Config();
config.setProperty("hazelcast.initial.min.cluster.size", "3");

Your configuration is correct, but you have set a very long multicast timeout of 200 seconds, where the default is 2 seconds. Setting a smaller value should solve it.

From the Hazelcast Java API documentation, MulticastConfig.html#setMulticastTimeoutSeconds(int):

Specifies the time in seconds that a node should wait for a valid multicast response from another node running in the network before declaring itself as master node and creating its own cluster. This applies only to the startup of nodes where no master has been assigned yet. If you specify a high value, e.g. 60 seconds, it means until a master is selected, each node is going to wait 60 seconds before continuing, so be careful with providing a high value. If the value is set too low, it might be that nodes are giving up too early and will create their own cluster.
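
As a rough illustration of the two settings discussed above, the following sketch builds a multicast-only configuration in code. It assumes a Hazelcast 3.x to 5.x jar on the classpath; the class name MulticastMember and the helper multicastConfig() are made up for this example, and the multicast group and port are left at their defaults.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// Hypothetical example class; not part of Hazelcast itself.
class MulticastMember {

    // Builds a config that relies on multicast discovery only.
    static Config multicastConfig() {
        Config config = new Config();
        // Block startup until at least 3 members have joined (value taken from the answer above).
        config.setProperty("hazelcast.initial.min.cluster.size", "3");

        JoinConfig join = config.getNetworkConfig().getJoin();
        // Make sure TCP/IP discovery is off so only multicast is used.
        join.getTcpIpConfig().setEnabled(false);
        join.getMulticastConfig()
            .setEnabled(true)
            // Keep the timeout small; 2 seconds is the documented default.
            .setMulticastTimeoutSeconds(2);
        return config;
    }

    public static void main(String[] args) {
        // With the minimum cluster size set to 3, this call waits until 3 members are present.
        HazelcastInstance member = Hazelcast.newHazelcastInstance(multicastConfig());
        System.out.println("Members: " + member.getCluster().getMembers());
    }
}
```

Start the same program on each machine; if the members never see each other, multicast traffic is probably being blocked on the network or by a firewall.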


It seems you are using TCP/IP clustering, so that is good. Try the following (from the Hazelcast book):

If you are making use of iptables, the following rule can be added to allow for outbound traffic from ports 31000-33000:

iptables -A OUTPUT -p TCP --dport 31000:33000 -m state --state NEW -j ACCEPT

and to control incoming traffic from any address to port 5701:

iptables -A INPUT -p tcp -d 0/0 -s 0/0 --dport 5701 -j ACCEPT

and to allow incoming multicast traffic:

iptables -A INPUT -m pkttype --pkt-type multicast -j ACCEPT

Connectivity test

If you are having trouble because machines won't join a cluster, you might check the network connectivity between the two machines. You can use a tool called iperf for that. On one machine you execute:

iperf -s -p 5701

This means that you are listening at port 5701.

At the other machine you execute the following command:

iperf -c 192.168.1.107 -d -p 5701

Where you replace '192.168.1.107' with the IP address of your first machine. If you run the command and you get output like this:

------------------------------------------------------------
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 192.168.1.107, TCP port 5701
TCP window size: 59.4 KByte (default)
------------------------------------------------------------
[  5] local 192.168.1.105 port 40524 connected with 192.168.1.107 port 5701
[  4] local 192.168.1.105 port 5701 connected with 192.168.1.107 port 33641
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  55.8 MBytes  45.7 Mbits/sec
[  5]  0.0-10.3 sec  6.25 MBytes  5.07 Mbits/sec

then you know the two machines can connect to each other. However, if you are seeing something like this:

Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
connect failed: No route to host

then you know that you might have a network connection problem on your hands.
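
If iperf is not installed, a similar TCP reachability check can be sketched in plain Java using only the standard library. The class name PortProbe and the method canConnect are invented for this example, and the 3 second timeout is an arbitrary choice. Something still has to be listening on the target port (a running Hazelcast member, or iperf -s as above), otherwise the connection is refused.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; it only checks that a TCP connection can be opened.
class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Typical causes: connection refused, no route to host, or a timeout.
            System.err.println("connect failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the iperf example; pass your own host and port as arguments.
        String host = args.length > 0 ? args[0] : "192.168.1.107";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Unlike iperf, this only tells you whether the port is reachable over TCP; it does not measure bandwidth and says nothing about multicast traffic.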