Not able to make AWS ECS services communicate over service discovery

I would like to elaborate on @Imran's detailed answer, since most of it discusses the SRV DNS record type and shows an Nginx example that only works with the premium version of Nginx (and SRV records).

If you work with ECS Fargate and have configured an A DNS record, the most important thing is to configure a proper resolver.

From the docs:

Configures name servers used to resolve names of upstream servers into addresses, for example:

resolver 127.0.0.1 [::1]:5353;

The address can be specified as a domain name or IP address, with an optional port. If port is not specified, the port 53 is used. Name servers are queried in a round-robin fashion.

That said, the resolver must be able to resolve the private DNS name, so we need to use the NS DNS record. Using 8.8.8.8 as the resolver won't work, since that public DNS server can't resolve the private DNS name.

NS stands for ‘name server’ and this record indicates which DNS server is authoritative for that domain (which server contains the actual DNS records). A domain will often have multiple NS records which can indicate primary and backup name servers for that domain.

To get the DNS resolver, run the following command:

aws route53 list-resource-record-sets --hosted-zone-id %HOSTED_ZONE_ID% --query "ResourceRecordSets[?Type == 'NS']"

Pick one of the resource records and place it into the Nginx resolver (including the trailing .).
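As a rough illustration of this step, the sketch below takes the JSON printed by the command above and turns the first NS value into a resolver line. The function name resolver_directive and the sample name server values are made up for the example; the JSON layout (a list of record sets, each with a ResourceRecords list of Value entries) is what the CLI query is expected to return.

```python
import json


def resolver_directive(ns_query_output: str, valid: str = "10s") -> str:
    """Build an nginx `resolver` line from the JSON printed by the NS query."""
    record_sets = json.loads(ns_query_output)
    # Collect every name server value across the returned NS record sets.
    name_servers = [
        record["Value"]
        for record_set in record_sets
        for record in record_set.get("ResourceRecords", [])
    ]
    if not name_servers:
        raise ValueError("no NS resource records found in the query output")
    # Keep the first name server exactly as returned, trailing dot included.
    return f"resolver {name_servers[0]} valid={valid};"


# Hypothetical sample output; real values come from your own hosted zone.
sample_output = """
[
  {
    "Name": "local.",
    "Type": "NS",
    "TTL": 172800,
    "ResourceRecords": [
      {"Value": "ns-1536.awsdns-00.co.uk."},
      {"Value": "ns-0.awsdns-00.com."}
    ]
  }
]
"""

print(resolver_directive(sample_output))
```

The helper only formats text; it does not check that the chosen name server is reachable from the task.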

Nginx basic template:

events {
  worker_connections 768;
}

http {
  # DNS Resolver
  resolver ns-###.awsdns-####.com. valid=10s;
  gzip on;
  gzip_proxied any;
  gzip_types text/plain application/json;
  gzip_min_length 1000;
  fastcgi_buffers 16 16k; 
  fastcgi_buffer_size 32k;

  server {

    listen 80;
    
    location / {
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header Host $host;
      proxy_redirect   off;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      # This is the important part
      proxy_pass http://ecs-fargate-svc.local:8080;
    }

    location = /health-check {
      return 200 'all good';
    }

  }
}

A few points to consider:

  • Don't forget to add the mapped port (8080 in my example).
  • Make sure the security group allows traffic within the VPC.
  • Since Fargate gives us limited logs, consider creating an EC2 instance in the same VPC as the ECS Fargate tasks and try to curl or ping the URL or DNS record from it.
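If curl or ping is not handy on the debugging instance, the same check can be sketched with Python's standard library. This is only a minimal example: check_endpoint is a made-up helper name, and the host and port you pass in should be your own service discovery name and mapped port.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0):
    """Return (addresses, reachable) for a host name and TCP port."""
    try:
        # Step 1: ask the system resolver for the addresses behind the name.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name did not resolve at all, so there is nothing to connect to.
        return [], False
    addresses = sorted({info[4][0] for info in infos})
    try:
        # Step 2: try a plain TCP connection to the mapped port.
        with socket.create_connection((host, port), timeout=timeout):
            return addresses, True
    except OSError:
        # The name resolved but the port is closed, filtered, or timed out.
        return addresses, False
```

Calling check_endpoint("ecs-fargate-svc.local", 8080) from an instance inside the VPC separates the two failure modes: an empty address list points at DNS, while resolved addresses with reachable set to False point at the port mapping or the security group.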

My service discovery:

[screenshot of the service discovery configuration]

Documentation:

Nginx resolver

The name server (NS) record


Update 03/2022

AWS now offers ENI trunking, which increases the number of ENIs that can be attached to a given EC2 instance type in the VPC. This makes awsvpc mode a lot more flexible with DNS A records and makes service discovery easier to configure for ECS services.

Combining this with AWS App Mesh and AWS Cloud Map can make ECS service discovery a lot easier.

More info about ENI trunking and App Mesh examples:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-eni.html

https://github.com/aws/aws-app-mesh-examples/tree/main/walkthroughs/howto-ingress-gateway


Original Answer

As per our conversation, here is a brief summary of what's happening.

  • If Service1 (nginx in your case) needs to interact with Service2 (redis) through AWS service discovery using SRV records, then Service1 must know to perform a DNS SRV lookup instead of a DNS A (address) lookup.

  • You have multiple options here. First, if you want to keep using SRV records, your nginx client needs to proxy to the redis upstream server with the service and resolve options, which are available only in the premium version of nginx. See the sample nginx configuration at the bottom of this answer, which I have tested and which works.

  • Also make sure you create the AWS service discovery name with the prefix _http._tcp; without the prefix, I had issues configuring the SRV resolve/service options in the nginx configuration.

[screenshot: aws ecs service]

  • The other option: if you do not want to rely on SRV records and prefer a standard A record lookup, you will have to use awsvpc mode for the containers and select the A option.


  • With the DNS A option, a query for service_discovery_service_name.service_discovery_namespace will work fine.
  • With the DNS A option, there are some constraints. You can run only a limited number of tasks on a given EC2 instance, because the number of ENIs that can be attached is limited and depends on the EC2 instance family. Update: see the 03/2022 note above.
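To make the difference between the two record types concrete: an A answer is just an IP address, while an SRV answer carries priority, weight, port, and target host, so the client has to know which kind of lookup to perform and how to read the result. The sketch below parses SRV record data in its standard text form; the SrvRecord type, the parse_srv helper, and the sample record are invented for illustration and are not part of any AWS or nginx API.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(rdata: str) -> SrvRecord:
    """Parse SRV record data of the form 'priority weight port target'."""
    priority, weight, port, target = rdata.split()
    # Drop the trailing dot of the fully qualified target name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


# Hypothetical SRV answer for a task registered under redisservice.local.
record = parse_srv("1 1 32768 task-1234.redisservice.local.")
print(f"connect to {record.target}:{record.port}")
```

With an A record there is no port in the answer, which is why the port has to be written into the client configuration instead.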

Sample nginx DNS SRV Options configuration:

stream {
    resolver 172.31.0.2;
    upstream redis {
        zone tcp_servers 64k;
        server redisservice.local service=_http._tcp resolve;
    }
    server {
        listen 12345;
        status_zone tcp_server;
        proxy_pass redis;
    }
}

Some references:

https://aws.amazon.com/blogs/aws/amazon-ecs-service-discovery/

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-service-discovery.html