Sunday, February 16, 2020

ECS Fargate ERROR : CannotPullContainerError: Error response from daemon

Last week I was asked to look into an issue faced by a team working on a service deployed on ECS Fargate.

ERROR

CannotPullContainerError: Error response from daemon: Get https://xxxxxxxxxxxx.dkr.ecr.us-east-x.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)



Details:
The container service was failing to provision and stopping with the above error.


Possible Reasons:
The error indicates that the container fails during the "pull" step. In this case, the only pull configured is for the Docker image stored in the ECR registry (note the dkr.ecr hostname in the error).

Most of the time such an issue occurs due to a lack of access, and this case was no different. There are two areas to look at so that a Fargate task deployed in a private subnet can pull an image from the ECR registry:


  • If the task is launched without a public IP, the subnet's route table must have a "0.0.0.0/0" route going to a NAT gateway or NAT instance, so that the task can reach the internet. If the task is launched with a public IP, the subnet's route table must have a "0.0.0.0/0" route going to an internet gateway, so that traffic can flow in and out.


  • Ensure the security groups for the task allow outbound access (the image pull goes over HTTPS, port 443).
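
The two checks above can be sketched in a few lines of Python. This is only a rough illustration, not a complete diagnostic: the function names are my own, and the input dictionaries are assumed to have the shape of the entries returned by the EC2 DescribeRouteTables and DescribeSecurityGroups calls (for example via boto3). The security group check looks only for an egress rule open to 0.0.0.0/0, which is a simplification, since a narrower CIDR could also be enough.

```python
def default_route_target(route_table):
    """Return 'nat', 'igw', or None for the active 0.0.0.0/0 route."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State", "active") != "active":
            continue
        # NAT gateway, or a NAT instance referenced by instance ID
        if route.get("NatGatewayId") or route.get("InstanceId"):
            return "nat"
        if route.get("GatewayId", "").startswith("igw-"):
            return "igw"
    return None


def route_allows_pull(route_table, assign_public_ip):
    """Check the subnet route against the task's public IP setting."""
    target = default_route_target(route_table)
    if target == "nat":
        return True
    # An internet gateway route only helps if the task has a public IP.
    return target == "igw" and assign_public_ip


def allows_https_egress(security_group):
    """True if some egress rule permits TCP 443 to 0.0.0.0/0 (simplified)."""
    for perm in security_group.get("IpPermissionsEgress", []):
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # all protocols, all ports
            port_ok = True
        elif protocol == "tcp":
            port_ok = perm.get("FromPort", 0) <= 443 <= perm.get("ToPort", 65535)
        else:
            port_ok = False
        open_cidr = any(r.get("CidrIp") == "0.0.0.0/0"
                        for r in perm.get("IpRanges", []))
        if port_ok and open_cidr:
            return True
    return False


# Hypothetical sample data: a private subnet with only the local route,
# which is the situation that produces the pull timeout.
private_rt = {"Routes": [{"DestinationCidrBlock": "10.0.0.0/16",
                          "GatewayId": "local", "State": "active"}]}
print(route_allows_pull(private_rt, assign_public_ip=False))
```

With real data, the dictionaries would come from the route table associated with the task's subnet and from the security groups attached to the task.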

If internet access is a concern, then another option is to deploy the registry inside the VPC.
Here is a link for one of the options:





