Why can't I create this gluster volume?

I was seeing an obscure error message about an unconnected socket with peer 127.0.0.1.

[2013-08-16 00:36:56.765755] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:1022)

It turns out the problem I was having was due to NAT. I was trying to create gluster servers that were behind a NAT device and use the public IP to resolve the names. This is just not going to work properly for the local machine.

What I had was something like the following on each node.

A hosts file containing

192.168.0.11  gluster1
192.168.0.12  gluster2
192.168.0.13  gluster3
192.168.0.14  gluster4

The fix was to remove the trusted peers first

sudo gluster peer detach gluster2
sudo gluster peer detach gluster3
sudo gluster peer detach gluster4

Then change the hosts file on each machine to be

# Gluster1
127.0.0.1     gluster1
192.168.0.12  gluster2
192.168.0.13  gluster3
192.168.0.14  gluster4


# Gluster2
192.168.0.11  gluster1
127.0.0.1     gluster2
192.168.0.13  gluster3
192.168.0.14  gluster4

etc

Then peer probe, and finally create the volume which was then successful.

I doubt that using IP addresses (the public ones) will work in this case. It should work if you use the private addresses behind your NAT. In my case, each server was behind a NAT in the AWS cloud.