The Long Way to Windows Container on Amazon EKS: VPC Resource Controller

Amazon EKS, formerly known as Amazon Elastic Container Service for Kubernetes, is AWS's managed Kubernetes service. We have had some services (Linux containers) running in production on it for a while now. It is battle-tested and works well, if you ask me.

Because several of our Windows-based services are unlikely to run on other platforms in the foreseeable future, we have been discussing the feasibility of running Windows Containers on EKS.

I am aware that Amazon announced general availability of Windows Container support on EKS last October; however, I hadn't tried it until now.

To find out whether it works as we expect, I fired up an EKS cluster (version 1.14), then followed the Windows Support guide to add…well, Windows support.

There are a few ways to do it, either with eksctl or step by step using the commands they provide. I chose the latter so I would at least know what I was doing.
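For reference, the eksctl route was roughly a single command at the time; I'm quoting the subcommand from memory, so treat the exact name and flags as an assumption and check against your eksctl version:

$ eksctl utils install-vpc-controllers --name <cluster-name> --approve

As far as I can tell, it deploys the same vpc-resource-controller and admission webhook components that the manual steps walk you through.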

However, here comes the plot twist: I applied the IIS sample YAML, but the pod never came up.

Describing the pod showed events like these:

$ k describe pod <IIS-pod>
...(...omitted)
Events:
  Type     Reason             Age                      From                Message
  ----     ------             ----                     ----                -------
  Warning  FailedScheduling   4m45s (x181 over 45h)    default-scheduler   0/3 nodes are available: 2 node(s) didn't match node selector, 3 Insufficient vpc.amazonaws.com/PrivateIPv4Address.
  Normal   NotTriggerScaleUp  4m21s (x1046 over 179m)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added): 3 max limit reached

Strangely, I had never seen errors like Insufficient vpc.amazonaws.com/PrivateIPv4Address. I went to the EC2 console and checked the Windows worker instance: there were no private IP addresses attached to that instance other than the primary private IP, even though there should have been.
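In hindsight, two quick checks make the symptom obvious: the Windows node never advertises the vpc.amazonaws.com/PrivateIPv4Address extended resource, and the instance has no secondary private IPs. The node name and instance ID below are placeholders:

$ k describe node <windows-node-name> | grep PrivateIPv4Address
# no output means the node never advertised the vpc.amazonaws.com/PrivateIPv4Address resource

$ aws ec2 describe-instances --instance-ids <instance-id> \
    --query 'Reservations[].Instances[].NetworkInterfaces[].PrivateIpAddresses[].PrivateIpAddress'
# in my case only the primary private IP came back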

So yeah, it looked like the only way out was everybody's best friend, Google. At that time, there were only a few GitHub issues that remained open. I found one interesting blog post that seemed helpful, but it was written in Russian, and the only part I could comprehend was the message from an AWS support engineer.

The AWS support engineer in that post said the private subnets the worker nodes were using were not associated with any route tables. So I checked our VPC console, and the subnets we were using were all associated with the main route table! I thought maybe we were having a different issue and just moved on.

Digging into the logs of the vpc-resource-controller that I had deployed while following the Windows Support guide, I found errors:

$ k logs -f <vpc-resource-controller-pod> -n kube-system

...(...omitted)
I1230 07:25:35.212299       1 watcher.go:238] Node watcher processing update on node ip-10-xx-xx-xx.ap-northeast-1.compute.internal.
I1230 07:25:35.212330       1 manager.go:173] Node manager updating node ip-10-xx-xx-xx.ap-northeast-1.compute.internal.
E1230 07:25:35.212341       1 watcher.go:242] Node watcher failed to update node ip-10-xx-xx-xx.ap-northeast-1.compute.internal: node manager: failed to find node ip-10-xx-xx-xx.ap-northeast-1.compute.internal.

I noticed the “failed to find” messages and thought: maybe there was a DNS resolution issue? I then exec'd into that pod to see what was going on:

$ k exec -it <vpc-resource-controller-pod> sh -n kube-system
sh-4.2# yum install -y bind-utils
sh-4.2# nslookup ip-10-xx-xx-xx.ap-northeast-1.compute.internal
Server:        10.xx.xx.xx
Address:    10.xx.xx.xx#53

Non-authoritative answer:
Name:    ip-10-xx-xx-xx.ap-northeast-1.compute.internal
Address: 10.xx.xx.xx

So, it wasn't a DNS resolution issue. Out of ideas, I opened a support ticket. After a few back-and-forths, the support engineer couldn't reproduce the problem and suggested that I redeploy the vpc-resource-controller.
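Redeploying here just meant letting the Deployment re-create the controller pod, roughly like this (pod name is a placeholder, assuming it is managed by a Deployment in kube-system as in the Windows Support guide):

$ k delete pod <vpc-resource-controller-pod> -n kube-system
$ k get pods -n kube-system | grep vpc-resource-controller
# wait for the replacement pod to be Running before checking its logs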

And I checked the logs again:

I0101 12:21:09.102850       1 manager.go:109] Node manager adding node ip-10-xx-xx-xx.ap-northeast-1.compute.internal with instanceID i-0ae8ea6ef05a41947.
E0101 12:21:09.245666       1 manager.go:126] Node manager failed to get resource vpc.amazonaws.com/CIDRBlock pool on node ip-10-xx-xx-xx.ap-northeast-1.compute.internal: failed to find the route table for subnet subnet-xxxxxxxxx
E0101 12:21:09.245695       1 watcher.go:183] Node watcher failed to add node ip-10-xx-xx-xx.ap-northeast-1.compute.internal: failed to find the route table for subnet subnet-xxxxxxxxx
I0101 12:21:09.245707       1 watcher.go:259] Node watcher adding key ip-10-xx-xx-xx.ap-northeast-1.compute.internal (4): failed to find the route table for subnet subnet-xxxxxxxxx

That was new and looked like it was somehow related to…the route table. But these errors turned back into the previous ones after I deleted the pod and let it be re-created.

Finally, the support engineer confirmed that they were able to reproduce the issue and mentioned that our private subnets weren't associated with any route table.

I thought I had checked that before. So I went to the VPC console and checked again; these subnets still looked like they were associated!

The support engineer also suggested using the AWS CLI to check:

$ aws ec2 describe-route-tables --filters "Name=association.subnet-id,Values=subnet-xxxxx"

But the output contained only an empty array:

{
    "RouteTables": []
}

The support engineer pointed out that subnets without an explicitly associated route table implicitly use the main route table (I should have read the documentation, though):

…You can explicitly associate a subnet with a particular route table. Otherwise, the subnet is implicitly associated with the main route table.

The VPC console usually shows the route table that a subnet is currently using, which doesn't necessarily mean that the subnet is explicitly associated with that route table.

And you guessed it: the vpc-resource-controller uses the DescribeRouteTables API, which couldn't find the route table, just like the result above, hence the errors.
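If you need to look up the main route table ID for the associate command below, the association.main filter does the trick; the subnet ID and VPC ID here are placeholders:

$ aws ec2 describe-subnets --subnet-ids subnet-xxxxx --query 'Subnets[0].VpcId'
$ aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=<vpc-id>" "Name=association.main,Values=true" \
    --query 'RouteTables[0].RouteTableId'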

Let's associate the subnets with the main route table, as the support engineer suggested:

$ aws ec2 associate-route-table --route-table-id <route-table-id> --subnet-id <subnet-id>

After that, the vpc-resource-controller didn't show errors anymore, and the IIS server pod was finally working.
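A quick sanity check after the fix (node name is a placeholder):

$ k get pods -o wide
# the IIS pod should now be Running with a pod IP from the subnet
$ k describe node <windows-node-name> | grep PrivateIPv4Address
# the extended resource should now show up under Capacity and Allocatable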

At the time of writing, you must manually associate the subnets you are using with the route table you need; otherwise, the vpc-resource-controller will fail.
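If your node groups span multiple subnets, you can simply repeat the association for each one; a minimal sketch with placeholder subnet IDs and the main route table ID found earlier:

$ for subnet in subnet-aaaa subnet-bbbb subnet-cccc; do
    aws ec2 associate-route-table --route-table-id <main-route-table-id> --subnet-id "$subnet"
  done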

I am not sure whether they will modify the vpc-resource-controller, but this is something you need to be aware of for now if you want to schedule Windows workloads on Amazon EKS.
