The Making of Admission Webhooks, Part 2: The Implementation

In part 1, we briefly went through the concept of admission webhooks. In this post, we are going to build one and deploy it to a cluster.

Let’s keep it simple: this webhook adds a throwaway Redis sidecar container when the pod has the following annotations (why use annotations?):

  • cache.wtcx.dev/inject: true
  • cache.wtcx.dev/port: <user specified port> (optional)
  • cache.wtcx.dev/memory: <user specified memory> (optional)
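
For example, a Pod (or a Deployment's pod template) that opts in could carry annotations like these (the workload name and image below are just placeholders; the port and memory values match the ones used later in this post):

apiVersion: v1
kind: Pod
metadata:
  name: my-app                         # placeholder name
  annotations:
    cache.wtcx.dev/inject: "true"      # required to trigger the sidecar injection
    cache.wtcx.dev/port: "5566"        # optional
    cache.wtcx.dev/memory: "100Mi"     # optional
spec:
  containers:
    - name: my-app
      image: my-app:0.0.1              # placeholder image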

You can find the complete resources in this repo: github.com/wtchangdm/k8s-admission-webhook-example.

This post is part of the Kubernetes admission webhook series

Prerequisites

k3d

To run our admission webhooks, we need a cluster. k3d lets you run a k3s cluster using Docker. It’s also very lightweight and easy to create/teardown for testing purposes.

After installing k3d, run the following command:

$ k3d cluster create

When the cluster is up, you can find a single node in it:

$ k get no
NAME                       STATUS   ROLES                  AGE   VERSION
k3d-k3s-default-server-0   Ready    control-plane,master   49s   v1.21.2+k3s1

Then we got ourselves a disposable cluster. Let’s not spend too much time on it.

cert-manager

Kubernetes specifically requires admission webhooks to be served over HTTPS. To fulfill this requirement, we need a certificate. Of course, we can generate one manually, but then every time we deploy, we will have to paste it into the caBundle field of the MutatingWebhookConfiguration and ValidatingWebhookConfiguration.

This becomes inconvenient when we need to wrap our webhook into a simple Helm chart or put it into a version control system. There are many ways to automate this step. For example, kyverno generates the certificate at runtime; some people also use Helm hooks with other tools, which isn't a bad idea when you have time for it.

However, since our goal is just to build a webhook, it would be great if we could focus on that.

Let’s leverage cert-manager’s CA Injector here:

cainjector helps to configure the CA certificates for: Mutating Webhooks, Validating Webhooks, and Conversion Webhooks.

Pretty self-explanatory, isn’t it? With CA Injector, we don’t need to generate the certificate ourselves, and cert-manager will automatically renew our certificate (it can be signed for a long time, though).

When the certificate is generated, cert-manager will also create a Secret containing the TLS key and certificate. We can simply mount these two files into our web server pods.
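
For example, the webhook's Deployment can mount that Secret like this (a rough sketch; the secret name and mount path here are assumptions, check the repo's manifests for the real values):

# excerpt of the webhook Deployment's pod template
spec:
  containers:
    - name: wtcx-admission-webhook
      image: webhook-test:0.0.1
      volumeMounts:
        - name: tls
          mountPath: /etc/webhook/certs            # assumed path where the server reads tls.crt/tls.key
          readOnly: true
  volumes:
    - name: tls
      secret:
        secretName: wtcx-admission-webhook-crt     # the Secret created from the Certificate resource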

All we need is:

  1. Install cert-manager:

    $ helm repo add jetstack https://charts.jetstack.io
    $ helm install \
      cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --create-namespace \
      --version v1.4.0 \
      --set installCRDs=true
  2. A selfsigned Issuer custom resource by cert-manager (see the sketch after this list)

  3. A Certificate custom resource by cert-manager

  4. As for the MutatingWebhookConfiguration and ValidatingWebhookConfiguration, we can simply remove the caBundle field and add the annotation cert-manager.io/inject-ca-from: <NAMESPACE>/<CERT_NAME>.
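
Steps 2 and 3 roughly look like this (a sketch only; the resource names match the apply output later in this post, but fields like namespace and dnsNames are assumptions here). The annotation from step 4 appears in the webhook configuration sketch further below:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wtcx-admission-webhook-crt
spec:
  secretName: wtcx-admission-webhook-crt       # Secret that will hold tls.crt and tls.key
  dnsNames:
    - wtcx-admission-webhook.default.svc       # must match the webhook Service (namespace assumed)
  issuerRef:
    name: selfsigned-issuer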

The webhook design

A webhook is essentially an API. Therefore, we will build a simple API server that serves the response that Kubernetes’ Admission controller expects.

The mutating flow is simple. As you can see, we can skip requests that either don't have the annotations we are looking for (cache.wtcx.dev/inject: true) or are dry-run requests. Generally speaking, we can skip DELETE requests in this case as well. However, that is easier done in the MutatingWebhookConfiguration and/or ValidatingWebhookConfiguration, as we can simply opt out of DELETE requests without ever receiving them (see the configuration sketch a bit further below).

The application itself is written in Node.js with Fastify. Again, you can find the source code here.

I just registered two routes, one for mutating admission webhook, and one for validating admission webhook:

  • /v1/hook/cache/mutate
  • /v1/hook/cache/validate

If you are wondering how Kubernetes' admission controller knows what the endpoints are, check the MutatingWebhookConfiguration and ValidatingWebhookConfiguration files.
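
Roughly, the mutating configuration ties the rules, the service endpoint, and the cert-manager annotation together like this (a sketch with assumed namespace and rule details; the manifests in the repo are the source of truth):

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: cache.wtcx.dev
  annotations:
    # cert-manager's cainjector fills in the caBundle from this Certificate
    cert-manager.io/inject-ca-from: default/wtcx-admission-webhook-crt
webhooks:
  - name: cache.wtcx.dev
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      service:
        name: wtcx-admission-webhook
        namespace: default                 # assumed namespace
        path: /v1/hook/cache/mutate        # this is how the admission controller finds the endpoint
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["CREATE"]             # DELETE isn't listed, so the webhook never receives delete requests

The ValidatingWebhookConfiguration looks almost the same, with its path pointing to /v1/hook/cache/validate instead.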

Both of these routes have several pre-handler hooks that run before them:

// ...
const v1 = (v1Apis, opts, done) => {
  v1Apis.register((sidecarMutatingApis, logMutatingApiOpts, done) => {
    sidecarMutatingApis.addHook('preHandler', fastifyHooks.skipOnDryRunRequest)
    sidecarMutatingApis.addHook('preHandler', fastifyHooks.skipOnPodWithoutAnnotations)
    sidecarMutatingApis.addHook('preHandler', fastifyHooks.rejectOnInvalidAnnotations)
    sidecarMutatingApis.addHook('preHandler', fastifyHooks.skipOnPatchedPod)

    sidecarMutatingApis.post('/hook/cache/mutate', mutate)

    sidecarMutatingApis.post('/hook/cache/validate', validate)
    // ...
}

These four Fastify hooks (not to be confused with admission webhooks) are the condition blocks in the flow chart above. Each one acts as middleware, so the handler itself won't be bothered unless necessary.

Together they guarantee that a request only reaches the mutating route (/v1/hook/cache/mutate) when:

  1. It’s not a dry-run request.
  2. It’s a Pod that has annotation cache.wtcx.dev/inject set to true.
  3. All related annotations (cache.wtcx.dev/port & cache.wtcx.dev/memory) are valid if there are any.
  4. It hasn't been patched before. (Remember that a request can be sent multiple times?)

Review your work, again

The same logic applies to the validating route (/v1/hook/cache/validate). Why do we need a validating route here? Because of the "guaranteeing the final state of the object is seen" rule: it's there to make sure the final state is what you expected.

In my example, the validating route directly rejects the request with a 400 code and an error message in the response body (while the HTTP response itself is still 200, see response format).

That's because a legitimate request shouldn't have reached this part; it is supposed to be returned early by one of the middlewares. However, you can always run a much more detailed check to see what's going on and why the request didn't pass this phase.

const validate = async (req, res) => {
  const uid = req.body.request.uid

  // No pod request should reach here, as it's expected to be allowed earlier by the "fastifyHooks.skipOnPatchedPod" hook.
  // Just reject if nothing further needs to be done.
  return k8sAdmissionReviewHelper.buildRejectResponse(uid, {
    code: 400,
    message: 'Pod validation failed.'
  })
}

JSONPatch and base64 encoding

As mentioned in part 1, I spent some time on JSONPatch to properly modify the object and write the tests.

Since our example is very straightforward, it doesn't involve updating or replacing existing resources like containers, volumes, etc.

We already know that only Pod requests that need to be patched can reach the mutating route; all we need to do here is create a Redis container with its settings taken from the values the annotations specify:

// ...
/** @type {import('@kubernetes/client-node').V1Container} */
const container = {
  name: REDIS_SIDECAR_CONTAINER,
  image: 'redis:alpine',
  command: [
    'sh',
    '-c',
    `redis-server --port ${port}`
  ],
  resources: {
    requests: {
      cpu: '100m',
      memory: mem
    },
    limits: {
      cpu: '100m',
      memory: mem
    }
  },
  ports: [
    {
      containerPort: port
    }
  ]
}
// ...

In the mutating route, you can see the following lines:

const container = k8sAdmissionReviewHelper.createRedisContainer(req.body.request.object)

// JSON Patch format
const patchResult = [
  { op: 'add', path: `/spec/containers/-`, value: container }
]

The value is the container object we created above. Since the container is not in the Pod (yet), we add it at the path /spec/containers/-, which appends it to the end of the containers array.

However, we can't just send this JSON array as part of the response. The admission controller expects a base64-encoded string of the JSONPatch result above.

const base64Result = Buffer.from(JSON.stringify(patchResult)).toString('base64')

Which looks like:

W3sib3AiOiJhZGQiLCJwYXRoIjoiL3NwZWMvY29udGFpbmVycy8tIiwidmFsdWUiOnsibmFtZSI6Ind0Y3gtZXhhbXBsZS1yZWRpcyIsImltYWdlIjoicmVkaXM6YWxwaW5lIiwiY29tbWFuZCI6WyJzaCIsIi1jIiwicmVkaXMtc2VydmVyIC0tcG9ydCA1NTY2Il0sInJlc291cmNlcyI6eyJyZXF1ZXN0cyI6eyJjcHUiOiIxMDBtIiwibWVtb3J5IjoiMTAwTWkifSwibGltaXRzIjp7ImNwdSI6IjEwMG0iLCJtZW1vcnkiOiIxMDBNaSJ9fSwicG9ydHMiOlt7ImNvbnRhaW5lclBvcnQiOjU1NjZ9XX19XQ==

Let’s decode it just to be sure:

$ echo -n "W3sib3AiOiJhZGQiLCJwYXRoIjoiL3NwZWMvY29udGFpbmVycy8tIiwidmFsdWUiOnsibmFtZSI6Ind0Y3gtZXhhbXBsZS1yZWRpcyIsImltYWdlIjoicmVkaXM6YWxwaW5lIiwiY29tbWFuZCI6WyJzaCIsIi1jIiwicmVkaXMtc2VydmVyIC0tcG9ydCA1NTY2Il0sInJlc291cmNlcyI6eyJyZXF1ZXN0cyI6eyJjcHUiOiIxMDBtIiwibWVtb3J5IjoiMTAwTWkifSwibGltaXRzIjp7ImNwdSI6IjEwMG0iLCJtZW1vcnkiOiIxMDBNaSJ9fSwicG9ydHMiOlt7ImNvbnRhaW5lclBvcnQiOjU1NjZ9XX19XQ==" | base64 -d | jq
[
  {
    "op": "add",
    "path": "/spec/containers/-",
    "value": {
      "name": "wtcx-example-redis",
      "image": "redis:alpine",
      "command": [
        "sh",
        "-c",
        "redis-server --port 5566"
      ],
      "resources": {
        "requests": {
          "cpu": "100m",
          "memory": "100Mi"
        },
        "limits": {
          "cpu": "100m",
          "memory": "100Mi"
        }
      },
      "ports": [
        {
          "containerPort": 5566
        }
      ]
    }
  }
]

Finally, the response with JSONPatch will look like:

{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<UID_FROM_REQUEST>",
    "allowed": "allowed",
    "patch": "<BASE64_JSONPATCH>",
    "patchType": "JSONPatch"
  }
}

Put it all together

I assume you already have a k3d cluster running and cert-manager installed. In that case, let’s try to deploy it:

First, if you don't have an existing image, we can build it locally and then import it into the cluster:

# Clone the repo
$ git clone https://github.com/wtchangdm/k8s-admission-webhook-example.git
$ cd k8s-admission-webhook-example
# Don't use latest tag as the imagePullPolicy will be "Always" by default
$ docker build -t webhook-test:0.0.1 .
# ...(omitted)
$ k3d image import webhook-test:0.0.1
INFO[0000] Importing image(s) into cluster 'k3s-default'
# ...(omitted)
INFO[0021] Successfully imported image(s)
INFO[0021] Successfully imported 1 image(s) into 1 cluster(s)

With the image imported, we can apply all manifests:

$ k apply --recursive -f manifests
certificate.cert-manager.io/wtcx-admission-webhook-crt created
issuer.cert-manager.io/selfsigned-issuer created
mutatingwebhookconfiguration.admissionregistration.k8s.io/cache.wtcx.dev created
validatingwebhookconfiguration.admissionregistration.k8s.io/cache.wtcx.dev created
deployment.apps/wtcx-admission-webhook created
service/wtcx-admission-webhook created

Last, we will apply a deployment that runs redis-cli and connects to a Redis server on localhost:

$ k apply -f tests/redis.yaml
deployment.apps/redis created
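
For reference, tests/redis.yaml is roughly shaped like this (a sketch reconstructed from the behavior shown below; the exact manifest in the repo may differ, and the redis-cli command here is an assumption):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
      annotations:
        cache.wtcx.dev/inject: "true"
        cache.wtcx.dev/port: "5566"
        cache.wtcx.dev/memory: "100Mi"
    spec:
      containers:
        - name: redis
          image: redis:alpine
          # a long-running redis-cli session; it exits (and restarts) if the injected sidecar isn't up yet
          command: ["redis-cli", "-h", "localhost", "-p", "5566", "monitor"]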

The deployment itself only includes a single container in the PodSpec. But after being patched by our mutating admission webhook, the Pod now contains two containers:

$ k get po
NAME                                     READY   STATUS              RESTARTS   AGE
wtcx-admission-webhook-848bc49f6-c7xgm   1/1     Running             0          8m2s
redis-7d9b6787d6-46rn8                   0/2     ContainerCreating   0          0s

Why was one of the containers restarted?

Most of the time, you will see the restart count at 1. That's because redis-cli launched faster than the redis-server we injected.

$ k get po
NAME                                     READY   STATUS    RESTARTS   AGE
wtcx-admission-webhook-848bc49f6-c7xgm   1/1     Running   0          9m34s
redis-7d9b6787d6-46rn8                   2/2     Running   1          92s
$ k logs -f redis-7d9b6787d6-46rn8 redis --previous
Could not connect to Redis at localhost:5566: Connection refused

This demonstration shows how to inject a Redis container with resources set on demand. But again, this is only for testing purposes; I don't believe a sane person would deploy a Redis server like this.

Cleanup

$ k3d cluster delete
INFO[0000] Deleting cluster 'k3s-default'
INFO[0001] Deleted k3d-k3s-default-serverlb
INFO[0002] Deleted k3d-k3s-default-server-0
INFO[0002] Deleting image volume 'k3d-k3s-default-images'
INFO[0002] Removing cluster details from default kubeconfig...
INFO[0002] Removing standalone kubeconfig file (if there is one)...
INFO[0002] Successfully deleted cluster k3s-default!

Further readings

This post is part of the Kubernetes admission webhook series

Cover: https://unsplash.com/photos/BINLgyrG_fI
