docker-compose network creation kicks me out of ssh

I finally ended up running docker network ls. The output listed more than 15 networks, most of them very old. I ran docker ps to make sure that nothing attached to these networks was still running. One container (redis) was indeed still running, on the network called bridge, so I stopped it. Then I went through the networks one by one with docker network rm <network name> until I was left with 4 networks: bridge, host, none, and the only network that was still working. After that I could create new networks with docker-compose up again as usual.
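
The manual cleanup above can be sketched as a small helper that decides which networks are candidates for docker network rm. This is only an illustration: it assumes the network names were obtained beforehand (for example with docker network ls --format "{{.Name}}"), and the names in the sample are hypothetical. It prints the commands rather than running them.

```python
# Networks that Docker creates itself; these are left alone.
BUILT_IN = {"bridge", "host", "none"}


def removable_networks(ls_output, keep=()):
    """Return network names that are neither built-in nor explicitly kept.

    ls_output is expected to contain one network name per line.
    """
    names = [line.strip() for line in ls_output.splitlines() if line.strip()]
    protected = BUILT_IN | set(keep)
    return [name for name in names if name not in protected]


if __name__ == "__main__":
    # Hypothetical sample; real names depend on your projects.
    sample = "bridge\nhost\nnone\nold_project_default\nmyapp_default\n"
    for name in removable_networks(sample, keep=["myapp_default"]):
        # Print instead of executing, so the list can be reviewed first.
        print(f"docker network rm {name}")
```

Check with docker ps that no running container still uses a network before removing it, as described above.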


I had the same issue and solved it by setting the network_mode option in the Compose file (see the Compose docs for network_mode; the solution came from another thread).

services:
  my_service:
    image: ...
    network_mode: "host"

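The br-xxxxxxx interfaces are Docker's bridge interfaces, and the vethxxxxxxx interfaces are the host-side ends of the virtual Ethernet pairs that connect your containers to a bridge. Docker uses those veth interfaces, but you do not interact with them directly. On the host they carry no IPv4 address and usually show only an IPv6 link-local address; the container's IPv4 address sits on the other end of the pair, inside the container. Docker does not create a separate NAT interface: it creates only bridges and veth pairs, and NAT is handled by iptables rules. Traffic from a bridge then leaves through a physical or virtual interface of your host.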

So it works like this:

eth0 (your interface, or a virtual interface if you want) ↔ br-xxxxxxx (Docker bridge) ↔ vethxxxxxxx (virtual interface of your container)

That's all I can say. I'm not sure anyone else will answer, since there aren't many Docker experts, so I'm giving you all the information I can to help you understand your logs.


Diagnosis

Our team uses AWS EC2 instances running Ubuntu 18.04 as devservers. We recently got reports that docker-compose broke SSH connections and that the devservers remained inaccessible even after a restart. So I started investigating.

I was able to rule out docker-compose as the cause by reproducing the problem with plain docker commands.

ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
aca5884d60f146cef81ac55c8cccd231a43f40927d645168642d9b28c5e009a6

ubuntu@ip-172-31-115-116:~$ docker network prune
WARNING! This will remove all custom networks not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Networks:
my-bridge-network

ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
f0a7a06a9627bc2de00eb60091a92010451690626d95e077f622f3058cc3a07c

ubuntu@ip-172-31-115-116:~$ docker network prune
WARNING! This will remove all custom networks not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Networks:
my-bridge-network

ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
Connection reset by 172.31.115.116 port 22

Then the root cause occurred to me.

Root cause

  • Our docker-compose files use the bridge network mode, which creates a new bridge network by default. When docker-compose down or docker network prune is run, that bridge network is torn down, and the next docker-compose run or docker network create creates a new one.
  • The default IP range for the docker0 bridge adapter is 172.17.0.0/16.
  • When I first ran the docker network create -d bridge my-bridge-network command, it created a new bridge adapter for 172.18.0.0/16.
  • The second bridge adapter was created for 172.19.0.0/16.
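  • Naturally, the third bridge adapter is created for 172.20.0.0/16. However, that is our Engineering VPN IP range. Once the bridge exists, the server has a directly connected route for 172.20.0.0/16 on the bridge, so replies to our laptops on the VPN are sent to the bridge instead of back out through the gateway, and the SSH connection drops.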
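
The sequence in the list above can be checked with a short sketch using Python's standard ipaddress module. It assumes, as observed in the session, that Docker hands out consecutive /16 blocks starting at 172.17.0.0/16 for docker0; the function names are made up for illustration and this is a model of the behaviour, not Docker's actual allocator.

```python
import ipaddress

# VPN range taken from the root-cause list above.
VPN_RANGE = ipaddress.ip_network("172.20.0.0/16")


def nth_bridge_subnet(n):
    """Subnet of the n-th bridge, where n=0 is docker0 (172.17.0.0/16).

    Assumption: consecutive /16 blocks, as seen in the reproduction.
    """
    return ipaddress.ip_network(f"172.{17 + n}.0.0/16")


def first_conflict(vpn, max_bridges=15):
    """Return (n, subnet) for the first bridge overlapping vpn, else None.

    max_bridges=15 keeps the model inside 172.17.0.0/16 .. 172.31.0.0/16.
    """
    for n in range(max_bridges):
        subnet = nth_bridge_subnet(n)
        if subnet.overlaps(vpn):
            return n, subnet
    return None


if __name__ == "__main__":
    print(first_conflict(VPN_RANGE))
```

With docker0 counted as bridge 0, the conflict appears at bridge 3, which is the third network created by hand in the reproduction.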

Solutions

The solution is to make sure new docker bridge networks will skip our VPN IP range.

Temp solution

If we add the IP ranges to be skipped to the system route table, Docker will automatically avoid them when it allocates new networks. We can therefore run the command below each time a devserver is rebooted.

sudo route add -net [our VPN IP range] netmask 255.255.0.0 gw [our gateway]

This solution is imperfect because the new routes are discarded when the machine restarts.

Main solution

We should permanently apply the route changes to all devservers.

echo "            routes:" | sudo tee -a /etc/netplan/50-cloud-init.yaml
echo "            - to: [our VPN IP range]" | sudo tee -a /etc/netplan/50-cloud-init.yaml
echo "              via: [our gateway]" | sudo tee -a /etc/netplan/50-cloud-init.yaml
sudo netplan apply

Docker IP changes

We also plan to modify Docker's default-address-pools setting to redefine the IP ranges Docker allocates from. Refer to https://github.com/docker/compose/issues/4336#issuecomment-457326123. I would say modifying /etc/docker/daemon.json is the better approach.
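
As a rough sketch of that plan, the helper below builds the default-address-pools entry for /etc/docker/daemon.json and refuses a pool that overlaps a reserved range. The pool 10.200.0.0/16, the /24 size, and the function name are illustrative assumptions, not values from our setup; pick a range that is free in your own network.

```python
import ipaddress
import json


def build_daemon_config(base, size, reserved):
    """Return a daemon.json fragment, or raise if base overlaps a reserved range.

    base:     CIDR block Docker may carve networks from (assumed value).
    size:     prefix length of each network taken from the block.
    reserved: CIDR blocks Docker must never use, e.g. the VPN range.
    """
    pool = ipaddress.ip_network(base)
    if size < pool.prefixlen:
        raise ValueError(f"size /{size} is larger than the pool {base}")
    for cidr in reserved:
        if pool.overlaps(ipaddress.ip_network(cidr)):
            raise ValueError(f"pool {base} overlaps reserved range {cidr}")
    return {"default-address-pools": [{"base": base, "size": size}]}


if __name__ == "__main__":
    # 172.20.0.0/16 is the VPN range named in the root cause.
    config = build_daemon_config("10.200.0.0/16", 24, ["172.20.0.0/16"])
    print(json.dumps(config, indent=2))
```

The printed JSON would need to be merged into any existing /etc/docker/daemon.json, and the Docker daemon restarted, before new networks pick up the pool.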