CloudFormation template for creating ECS service stuck in CREATE_IN_PROGRESS

You don't need to spell out the full ARN for the TaskDefinition: when the logical ID of the task definition resource is passed to the Ref intrinsic function, Ref returns its Amazon Resource Name (ARN).

In the following sample, the Ref function returns the ARN of the MyTaskDefinition task, such as arn:aws:ecs:us-west-2:123456789012:task/1abf0f6d-a411-4033-b8eb-a4eed3ad252a.

{ "Ref": "MyTaskDefinition" }

Source: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-taskdefinition.html


Your AWS::ECS::Service needs to register the full ARN for the TaskDefinition (Source: See the answer from ChrisB@AWS on the AWS forums). The key thing is to set your TaskDefinition with the full ARN, including revision. If you skip the revision (:123 in the example below), the latest revision is used, but CloudFormation still goes out to lunch with "CREATE_IN_PROGRESS" for about an hour before failing. Here's one way to do that:

"MyService": {
    "Type": "AWS::ECS::Service",
    "Properties": {
        "Cluster": { "Ref": "ECSClusterArn" },
        "DesiredCount": 1,
        "LoadBalancers": [
            {
                "ContainerName": "myContainer",
                "ContainerPort": "80",
                "LoadBalancerName": "MyELBName"
            }
        ],
        "Role": { "Ref": "EcsElbServiceRoleArn" },
        "TaskDefinition": {
            "Fn::Join": ["", ["arn:aws:ecs:", { "Ref": "AWS::Region" },
            ":", { "Ref": "AWS::AccountId" },
            ":task-definition/my-task-definition-name:123"]]
        }
    }
}

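Here's a nifty way to grab the latest revision number of MyTaskDefinition via the aws cli and jq. It sorts descending so the newest revision comes first, then takes the last colon-separated field of that ARN (which also works once the revision reaches two digits):

aws ecs list-task-definitions --family-prefix MyTaskDefinition --sort DESC | jq --raw-output '.taskDefinitionArns[0] | split(":") | last'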
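
If you'd rather not edit the template every time the revision changes, one option is to feed the revision in as a stack parameter. This is only a sketch of the same service in YAML: TaskDefinitionRevision is a hypothetical parameter name I made up, and ECSClusterArn and EcsElbServiceRoleArn are assumed to be declared elsewhere in the template, as in the JSON above.

```yaml
Parameters:
    # Hypothetical parameter (not in the original template): the revision
    # number printed by the CLI command.
    TaskDefinitionRevision:
        Type: Number

Resources:
    MyService:
        Type: AWS::ECS::Service
        Properties:
            # ECSClusterArn and EcsElbServiceRoleArn are assumed to exist
            # elsewhere in this template.
            Cluster: !Ref ECSClusterArn
            DesiredCount: 1
            LoadBalancers:
                - ContainerName: myContainer
                  ContainerPort: 80
                  LoadBalancerName: MyELBName
            Role: !Ref EcsElbServiceRoleArn
            # Full ARN including the revision, built from the parameter.
            TaskDefinition: !Sub "arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:task-definition/my-task-definition-name:${TaskDefinitionRevision}"
```

The value would then be supplied at deploy time instead of being hard-coded as :123.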

I found another related scenario that will cause this and thought I'd put it here in case anyone else runs into it. If you define a TaskDefinition with an Image that doesn't actually exist in its ContainerDefinition and then you try to run that TaskDefinition as a Service, you'll run into the same hang issue (or at least something that looks like the same issue).

NOTE: The example YAML chunks below were all in the same CloudFormation template

So as an example, I created this Repository:

MyRepository:
    Type: AWS::ECR::Repository

And then I created this Cluster:

MyCluster:
    Type: AWS::ECS::Cluster

And this TaskDefinition (abridged):

MyECSTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
        # ...
        ContainerDefinitions:
            # ...
              Image: !Join ["", [!Ref "AWS::AccountId", ".dkr.ecr.", !Ref "AWS::Region", ".amazonaws.com/", !Ref MyRepository, ":1"]]
            # ...

With those defined, I went to create a Service like this:

MyECSServiceDefinition:
    Type: AWS::ECS::Service
    Properties:
        Cluster: !Ref MyCluster
        DesiredCount: 2
        PlacementStrategies:
            - Type: spread
              Field: attribute:ecs.availability-zone
        TaskDefinition: !Ref MyECSTaskDefinition

Which all seemed sensible to me, but it turns out there are two issues with this as written/deployed that caused it to hang.

  1. The DesiredCount is set to 2, which means it will actually try to spin up the service and run it, not just define it. If I set DesiredCount to 0, this works just fine.
  2. The Image defined in MyECSTaskDefinition doesn't exist yet. I made the repository as part of this template, but I didn't actually push anything to it. So when MyECSServiceDefinition tried to start the DesiredCount of 2 tasks, it hung because the image wasn't available in the repository (the repository had literally just been created in the same template).

So, for now, the solution is to create the CloudFormation stack with a DesiredCount of 0 for the Service, upload the appropriate Image to the repository, and then update the CloudFormation stack to scale up the service. Alternatively, use one template that sets up core infrastructure like the repository, upload builds to that repository, and then run a second template that sets up the Services themselves.
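
The first approach (create with 0, push the image, then scale up) can be sketched by turning DesiredCount into a stack parameter, so the same template serves both the initial create and the later update. ServiceDesiredCount is a hypothetical parameter name, not something from my original template; MyCluster and MyECSTaskDefinition are the resources shown above.

```yaml
Parameters:
    # Hypothetical parameter: defaults to 0 so the first deploy only
    # defines the service without trying to start any tasks.
    ServiceDesiredCount:
        Type: Number
        Default: 0
        MinValue: 0

Resources:
    MyECSServiceDefinition:
        Type: AWS::ECS::Service
        Properties:
            Cluster: !Ref MyCluster
            # Raise this via a stack update once the image has been pushed.
            DesiredCount: !Ref ServiceDesiredCount
            PlacementStrategies:
                - Type: spread
                  Field: attribute:ecs.availability-zone
            TaskDefinition: !Ref MyECSTaskDefinition
```

With something like this, the first deploy uses the default of 0, and after the image is in the repository a stack update with ServiceDesiredCount set to 2 scales the service up.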

Hope that helps anyone having this issue!