With Ansible, is it possible to connect to hosts that are behind Cloud IAP (Identity-Aware Proxy) in GCP?

I figured out a way to make this work without a connection plugin: write a script that wraps the gcloud tool, then point the ansible_ssh_executable parameter (which can be set at the inventory level) at that script. You do need to make sure the gcp_compute inventory plugin identifies hosts by name, because that is what gcloud compute ssh expects.

Here's the script:

#!/bin/sh
set -o errexit
# Wraps the gcloud utility to enable connecting to instances which are behind
# GCP Cloud IAP. Enable it via the `ansible_ssh_executable` setting for a play
# or inventory. Parses out the relevant information from Ansible's call to the
# script and injects it into the right places of the gcloud call.

# Join all arguments into one string so the regexes below can split it again.
arg_string="$*"

grep_hostname_regex='[a-z]*[0-9]\{2\}\(live\|test\)'
sed_hostname_regex='[a-z]*[0-9]{2}(live|test)'

target_host=$(
  echo "$arg_string\c" | grep -o "$grep_hostname_regex"
)

ssh_args=$(
  echo "$arg_string\c" | sed -E "s# ${sed_hostname_regex}.*##"
)

cmd=$(
  echo "$arg_string\c" | sed -E "s#.*${sed_hostname_regex} ##"
)

# $ssh_args is left unquoted on purpose so it splits back into separate options.
gcloud compute ssh "$target_host" \
  --command="$cmd" \
  --tunnel-through-iap \
  -- $ssh_args

Notes:

  • This was tested on macOS. The sed options (and the handling of "\c" by echo) may differ on Linux, for example.
  • The "host" regexes need to fit your naming convention. If your host names don't follow a consistent pattern like mine do, you'll need to find some other way to parse out the information.
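To make the three parsing steps easier to follow, and to avoid relying on how a particular shell's echo treats "\c", here is a minimal sketch of the same split using printf and grep -E. The function names and the sample argument string are made up for illustration (the real argument list Ansible passes is longer); the host name pattern is the one from the script above.

```shell
#!/bin/sh
# Illustrative only: helper names and the sample string are hypothetical.
# One extended regex serves both grep -E and sed -E.
host_regex='[a-z]*[0-9]{2}(live|test)'

# The first token that looks like a host name.
parse_target_host() {
  printf '%s' "$1" | grep -oE "$host_regex" | head -n 1
}

# Everything before the host name: the options meant for ssh.
parse_ssh_args() {
  printf '%s' "$1" | sed -E "s# ${host_regex}.*##"
}

# Everything after the host name: the remote command.
parse_remote_cmd() {
  printf '%s' "$1" | sed -E "s#.*${host_regex} ##"
}

sample='-o ControlMaster=auto -o ConnectTimeout=10 web01live /bin/sh -c "echo ok"'

parse_target_host "$sample"   # prints: web01live
parse_ssh_args "$sample"      # prints: -o ControlMaster=auto -o ConnectTimeout=10
parse_remote_cmd "$sample"    # prints: /bin/sh -c "echo ok"
```

Like the original, this still breaks if the remote command itself contains something that matches the host name pattern, so treat it as a sketch rather than a hardened parser.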

After a discussion on https://www.reddit.com/r/ansible/comments/e9ve5q/ansible_slow_as_a_hell_with_gcp_iap_any_way_to/ I altered the solution to use SSH connection sharing via a control socket.

It is twice as fast as @mat's solution, and I have put it on our PROD. This implementation also doesn't depend on host name patterns!

The proper solution is to use a bastion/jump host, because the gcloud command still spawns a Python interpreter that in turn spawns ssh, so it is still inefficient!

ansible.cfg:

[ssh_connection]
pipelining = True
ssh_executable = misc/gssh.sh
ssh_args =
transfer_method = piped

[privilege_escalation]
become = True
become_method = sudo

[defaults]
interpreter_python = /usr/bin/python
gathering = False
# Somehow important to enable parallel execution...
strategy = free

gssh.sh:

#!/bin/bash

# ansible/ansible/lib/ansible/plugins/connection/ssh.py
# exec_command(self, cmd, in_data=None, sudoable=True) calls _build_command(self, binary, *other_args) as:
#   args = (ssh_executable, self.host, cmd)
#   cmd = self._build_command(*args)
# So "host" is next to the last, cmd is the last argument of ssh command.

host="${@: -2: 1}"
cmd="${@: -1: 1}"

# ControlMaster=auto & ControlPath=... make Ansible execution about twice as fast.
socket="/tmp/ansible-ssh-${host}-22-iap"

gcloud_args="
--tunnel-through-iap
--zone=europe-west1-b
--quiet
--no-user-output-enabled
--
-C
-o ControlMaster=auto
-o ControlPersist=20
-o PreferredAuthentications=publickey
-o KbdInteractiveAuthentication=no
-o PasswordAuthentication=no
-o ConnectTimeout=20"

# $gcloud_args is left unquoted on purpose so it splits into separate arguments.
exec gcloud compute ssh "$host" $gcloud_args -o ControlPath="$socket" "$cmd"
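The "${@: -2: 1}" and "${@: -1: 1}" slices in gssh.sh are bash-specific. To illustrate what they pick out, here is a minimal sketch of the same extraction written for plain POSIX sh. The helper name last_two_args and the sample arguments are hypothetical, not part of Ansible or gcloud.

```shell
#!/bin/sh
# Illustrative only: prints the next-to-last and the last argument,
# one per line, which is what gssh.sh treats as host and command.
last_two_args() {
  prev=''
  last=''
  for arg in "$@"; do
    prev=$last   # what used to be last is now next-to-last
    last=$arg
  done
  printf '%s\n%s\n' "$prev" "$last"
}

# Sample call shaped like Ansible's: ssh options, then host, then command.
last_two_args -o ControlMaster=auto -tt my-instance '/bin/sh -c "echo ok"'
# prints:
#   my-instance
#   /bin/sh -c "echo ok"
```

Because the host and command are taken by position rather than by pattern, this approach works with any instance name.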

UPDATE: A Google engineer has responded that gcloud isn't supposed to be called in parallel! See "gcloud compute ssh" can't be used in parallel

My experiments showed that with Ansible forks=5 I almost always hit an error. With forks=2 I've never experienced one.

UPDATE 2: As of the end of 2020, I can run gcloud compute ssh in parallel (in WSL I used forks = 10) without locking errors.