Someone's life journey.

Automatically Recover EC2 Instances That Fail Status Checks With CloudWatch Events and Lambda

The Incident

Recently, some of our EKS worker nodes suddenly became unresponsive. When I checked the EC2 console, the status check showed “Insufficient Data”.

In my experience, we usually get a notification when the underlying hardware is impaired. This time there was little useful information, so I did a quick investigation and then had to terminate the instances manually.

Prevent AWS CLI V2 From Using Pager

I always like to try new things out. So when I saw that AWS CLI v2 was generally available, I upgraded right away without checking what had changed.

One difference I noticed after the upgrade is that all output now seems to be sent through a pager such as less. That is inconvenient when calling the AWS CLI from a shell script.
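For calls made from a script rather than a terminal, here is a minimal sketch of one way to switch the pager off. AWS CLI v2 reads the `AWS_PAGER` environment variable, and an empty value disables paging. The helper names `aws_env_without_pager` and `run_aws` are made up for illustration, and the sketch assumes the `aws` binary is on `PATH`.

```python
import os
import subprocess


def aws_env_without_pager(base_env=None):
    """Return a copy of the environment with the AWS CLI v2 pager disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # An empty AWS_PAGER tells AWS CLI v2 not to pipe output through a pager.
    env["AWS_PAGER"] = ""
    return env


def run_aws(args):
    """Run `aws <args>` without a pager and return its standard output.

    Hypothetical helper: assumes the `aws` executable is installed and
    that credentials are already configured.
    """
    result = subprocess.run(
        ["aws", *args],
        env=aws_env_without_pager(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

In a plain shell script the same effect should come from exporting `AWS_PAGER=""` before calling the CLI, or from passing `--no-cli-pager` on an individual command.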

Notify Google to Update Sitemap Using Netlify Functions

Currently, this site is hosted on Netlify. I am pretty satisfied and don’t plan to move anytime soon. I also submitted my sitemap to Google for indexing, but Google does not seem to refresh it very often.

Fortunately, Google provides an endpoint for you to notify it. Send a GET request to${siteMapUrl} and you are done.

But do we have to use curl every time we deploy to tell Google it’s time to fetch our sitemap? Well, life is short, don’t waste time on things like that.

Disable T3 Unlimited Using ASG Lifecycle Hooks, CloudWatch Events, and Lambda


To save costs on testing environments, we use multiple instance types with a 100% Spot ratio and the lowest-price allocation strategy for several Auto Scaling groups.

We combine several instance types such as c5.xlarge, m5.xlarge, t3.xlarge, and t3a.xlarge. It has worked fine so far, but t3 and t3a instances come with unlimited credits enabled by default. If applications running on these instances suddenly start misbehaving, the cost will increase once the accumulated credits are used up.
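As a rough illustration of the core step, the sketch below switches a burstable instance from unlimited to standard credits using the EC2 `modify_instance_credit_specification` call available in boto3. It is not the full lifecycle-hook flow: the function names, the prefix list, and the way the client is passed in are assumptions made for this example, and a real Lambda would also need to read the instance ID from the event and complete the lifecycle action.

```python
# Instance families from this post that default to unlimited credits.
BURSTABLE_PREFIXES = ("t3.", "t3a.")


def credit_spec_for(instance_id, instance_type):
    """Build the request entry for a burstable instance, or None otherwise."""
    if not instance_type.startswith(BURSTABLE_PREFIXES):
        return None
    return {"InstanceId": instance_id, "CpuCredits": "standard"}


def disable_unlimited(ec2_client, instance_id, instance_type):
    """Switch the instance to standard credits; return True if a call was made.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    spec = credit_spec_for(instance_id, instance_type)
    if spec is None:
        # Not a t3/t3a instance, so there is nothing to change.
        return False
    ec2_client.modify_instance_credit_specification(
        InstanceCreditSpecifications=[spec]
    )
    return True
```

Passing the client in as an argument keeps the function easy to exercise with a stub; inside a Lambda handler it could be called as `disable_unlimited(boto3.client("ec2"), instance_id, instance_type)`.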