How to Make Changes to kube-apiserver

kube-apiserver might not restart after making changes to it. Here are some tips on how to monitor the process and fix it.

Juampy NR

June 15, 2022

When preparing for the CKS exam, you will practice many tasks that involve making changes to /etc/kubernetes/manifests/kube-apiserver.yaml. This file is monitored by the Kubelet service, which runs the kube-apiserver process in a container. If the Kubelet detects a change in that file, it restarts the container with the updated configuration.

In successful scenarios, kube-apiserver's container stops briefly and comes back within a couple of minutes. However, if there is an error, it might not come back.

We'll show you steps you can follow to get back on track, especially if you are preparing for any of the Linux Foundation's Kubernetes certifications.

The testing environment

Let's take one of Kim Wuestkamp's killercoda challenges where we have an ephemeral cluster that we can tinker with. Here is an active environment with the contents of /etc/kubernetes/manifests/kube-apiserver.yaml printed:

The contents of /etc/kubernetes/manifests/kube-apiserver.yaml

We can see that there is a command called kube-apiserver, which receives a long list of options, each on its own line. We will start by changing something that we know will work, then forge ahead from there.

Making a change

The change we are going to make is about forbidding the creation of privileged containers. Per the official documentation, this is about turning the option --allow-privileged=true into --allow-privileged=false.

Before making the change, it is recommended to make a copy of the existing configuration. You can just copy it into the home directory with the following command:

cp /etc/kubernetes/manifests/kube-apiserver.yaml .
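If you expect to make several attempts, a timestamped copy keeps each version apart. The helper below is a minimal sketch; the function name backup_manifest and its default paths are assumptions, not part of any Kubernetes tooling. It deliberately writes outside /etc/kubernetes/manifests, since the Kubelet may treat any file it finds there as another manifest.

```shell
# Copy a static pod manifest to a backup directory under a timestamped name.
# Usage: backup_manifest [manifest] [backup_dir]
# Both defaults are assumptions of this sketch; backup_dir should not be the
# manifests directory itself.
backup_manifest() {
  manifest=${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}
  backup_dir=${2:-$HOME}
  backup="$backup_dir/$(basename "$manifest").$(date +%Y%m%d-%H%M%S).bak"
  # -p preserves mode and timestamps so a restored copy matches the original
  cp -p "$manifest" "$backup" && echo "$backup"
}
```

Called without arguments, it copies the kube-apiserver manifest into the home directory and prints the path of the backup, which you can copy back if an edit goes wrong.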

Now let's make the actual change in the file with an editor like Vim. Open the file, change --allow-privileged=true to --allow-privileged=false, save, and quit.
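For a change as small as this one, the edit could also be scripted instead of done by hand. This is only a sketch: the function name set_allow_privileged is made up for illustration, and it assumes GNU sed (for the -i option) and that the flag already appears in the manifest.

```shell
# Set --allow-privileged to the given value in a manifest file.
# Usage: set_allow_privileged <manifest> <true|false>
set_allow_privileged() {
  manifest=$1
  value=$2
  # Replace the whole flag, whatever its current value, and leave the
  # indentation and the leading "- " of the YAML list item untouched.
  sed -i "s/--allow-privileged=[^[:space:]]*/--allow-privileged=$value/" "$manifest"
  # Print the resulting line (with its line number) for a quick visual check.
  grep -n -- '--allow-privileged' "$manifest"
}
```

Running set_allow_privileged /etc/kubernetes/manifests/kube-apiserver.yaml false prints the edited line, so you can confirm that the dash and indentation survived.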

TIP: In a production environment, Kubernetes configuration might be managed as code. However, the steps explained below to monitor and analyze the result of a change should still be helpful.

Monitor the change

The Kubelet will detect the file change and restart the kube-apiserver container with the new configuration. We can track this by observing the running containers via the following command:

watch crictl ps

Here is the output, which refreshes every two seconds:

Running containers. Notice that kube-apiserver has been running for 6 minutes.

This listing shows our running containers. One of them is kube-apiserver, which has been running for 6 minutes. If everything goes well, it should vanish for a few seconds and reappear, like in the screenshot below:

Output from watch crictl ps after kube-apiserver restarts

Running containers. Notice how kube-apiserver has been running for 33 seconds.

TIP: In the above command, we use crictl instead of docker to list running containers. crictl is a command-line interface for CRI-compatible container runtimes. CRI (Container Runtime Interface) is the interface the Kubelet uses to talk to the runtime, and clusters commonly run a CRI-compatible runtime such as containerd instead of Docker. If you are using Docker, though, the command would be identical, only replacing crictl with docker.
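When you would rather not stare at the watch output, the same check can be polled from a small helper. This is a sketch under assumptions: wait_for_container and the LIST_CMD variable are hypothetical names, and the two-second interval simply mirrors the default refresh rate of watch.

```shell
# Poll until a running container whose listing mentions $1 shows up, or give
# up after $2 seconds (default 120).
# LIST_CMD is an assumption of this sketch: it defaults to "crictl ps" and can
# be set to "docker ps" on clusters that use Docker.
wait_for_container() {
  name=$1
  timeout=${2:-120}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if ${LIST_CMD:-crictl ps} 2>/dev/null | grep -q "$name"; then
      echo "$name is running"
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "$name did not come back within ${timeout}s" >&2
  return 1
}
```

Run immediately after saving the manifest, wait_for_container kube-apiserver may return at once because the old container is still listed. The CREATED column of crictl ps tells the old container apart from the restarted one.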

Verify the change

Now that the kube-apiserver container is running, we can verify that it is using our change by searching for running processes:

ps aux | grep kube-apiserver | grep privileged

The kube-apiserver process has the expected option value.

We can see that there is a kube-apiserver process that is using the option --allow-privileged=false, hence the change did take effect.
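To read just the value instead of scanning a long command line, the ps output can be split into one option per line. The helper name flag_value is an assumption for illustration; the sketch relies only on ps, grep, tr, cut, and head.

```shell
# Read process listings on stdin and print the value of one kube-apiserver
# flag, e.g. "false" for --allow-privileged=false.
# Usage: ps aux | flag_value allow-privileged
flag_value() {
  flag=$1
  # The [k] keeps grep from matching its own entry in the ps output.
  grep '[k]ube-apiserver' |
    tr ' ' '\n' |
    grep -- "^--$flag=" |
    cut -d= -f2- |
    head -n 1
}
```

With the change from this article applied, ps aux | flag_value allow-privileged should print false. An empty result means that no running kube-apiserver process carries the flag.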

Dealing with failures

There are two kinds of errors:

Bad syntax errors. These are caused by introducing an error in /etc/kubernetes/manifests/kube-apiserver.yaml that makes its YAML invalid. The Kubelet then stops the kube-apiserver container but does not start it again.
Incorrect configuration. The manifest is valid YAML, but the container fails at startup, so the Kubelet keeps trying to start the kube-apiserver container and it keeps crashing.

In both cases, the steps to find the cause are the same, but understanding these two scenarios will help you know where to look. Let's look at each in turn.

Bad syntax

We will introduce a typo at /etc/kubernetes/manifests/kube-apiserver.yaml and then see what happens. The following screenshot shows an invalid array of options that are passed to the kube-apiserver command:

kube-apiserver.yaml with typo at allow-privileged

Notice the missing dash at the third option (allow-privileged).

Now let's save the file and monitor containers as we did before. We list containers with the following command:

watch crictl ps

kube-apiserver is still running as it has not been stopped yet by the Kubelet

After a couple of minutes, kube-apiserver is gone and does not reappear:

Output of watch crictl ps after kube-apiserver stops working

kube-apiserver does not show up again.

Time to look at the logs. These are under /var/log/pods or /var/log/containers. Use whichever you prefer. In this case, there is nothing to see, since the container was stopped and deleted. The kube-apiserver logs are missing at /var/log/pods:

Output of tail -f /var/log/pods/kube-system_

There is no directory for kube-apiserver at /var/log/pods.

Now let's look at /var/log/containers:

Output of tail -f /var/log/containers/kube-

There are no kube-apiserver logs either.
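Both lookups can be folded into one helper that lists whatever kube-apiserver log files exist. This is a sketch; the name apiserver_logs is an assumption, and it simply matches any .log file whose path mentions kube-apiserver.

```shell
# List log files that belong to kube-apiserver under the given directories.
# Usage: apiserver_logs [dir...]   (defaults to the two paths used above)
apiserver_logs() {
  [ "$#" -gt 0 ] || set -- /var/log/pods /var/log/containers
  for dir in "$@"; do
    # Entries in /var/log/containers are usually symlinks, so there is no
    # -type filter here; missing directories are silently skipped.
    find "$dir" -name '*.log' -path '*kube-apiserver*' 2>/dev/null
  done
}
```

An empty result matches what we see in this scenario: the container is gone, and so are its log files.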

Next, let's list all containers in the system using crictl, including the ones that are not running:

crictl ps -a

There is no container for kube-apiserver. Not even a crashed one.

If you get to this point, look for errors in the system journal with journalctl, filtering the output by apiserver since there is usually a lot of it. You can do this with:

journalctl | grep apiserver

And here is the result, right at the bottom of the output:

May 13 13:25:49 controlplane kubelet[25181]: E0513 13:25:49.437945 25181 file.go:187] "Could not process manifest file" err="/etc/kubernetes/manifests/kube-apiserver.yaml: couldn't parse as pod(yaml: line 18: could not find expected ':'), please check config file" path="/etc/kubernetes/manifests/kube-apiserver.yaml"

We can see above that the manifest file /etc/kubernetes/manifests/kube-apiserver.yaml could not be processed. We even got the line number (18), which matches the line where we broke the array of options. Let's fix it and verify that kube-apiserver comes back to life:

vim /etc/kubernetes/manifests/kube-apiserver.yaml

kube-apiserver.yaml file with the missing dash restored

Here we have restored the missing dash at the options array.

Then we save and monitor containers again. As expected, kube-apiserver is now running:

kube-apiserver has been running for 10 seconds.

When we introduce bad syntax, there will be no new containers and (unless you look quickly) no logs under /var/log. The place to find the error is journalctl.
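Since the journal is noisy, it can help to reduce the Kubelet's messages to just the file and the YAML line that failed. The sketch below is written against the exact message shown earlier ("Could not process manifest file" with a "yaml: line N" detail); the function name manifest_errors is an assumption, and errors worded differently will not be picked up.

```shell
# Filter journal lines on stdin down to manifest parse errors and print them
# as "<file>:<line>: <reason>".
# Usage: journalctl -u kubelet --no-pager | manifest_errors
manifest_errors() {
  grep 'Could not process manifest file' |
    sed -n 's/.*(yaml: line \([0-9]*\): \(.*\)), please check.* path="\([^"]*\)".*/\3:\1: \2/p'
}
```

For the log entry above, this prints /etc/kubernetes/manifests/kube-apiserver.yaml:18: could not find expected ':'. The -u kubelet filter assumes that the Kubelet runs as the systemd unit kubelet.service.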

Let's look at the other type of error that we may find.

Incorrect configuration

In this scenario, the manifest syntax is correct, but the configuration is not. This could happen when we set the wrong value to an option, such as an incorrect data type or an incorrect volume path. Let's see an example.

We will use the same option we have been using in this article, allow-privileged, and this time we will set it to the invalid value "wrong" (the option expects true or false). The resulting manifest is valid YAML, but when the Kubelet starts the container, it will fail and enter a loop in which it tries to start and crashes. Let's start by making the change:

vim /etc/kubernetes/manifests/kube-apiserver.yaml

Notice the “wrong” value at --allow-privileged

We then save the file and monitor containers:

watch crictl ps

kube-apiserver can’t start

Look at the listing above, in which kube-apiserver appears as STATE Exited, and there have been two attempts to start it, the last one 11 seconds ago. We can dig deeper by looking at the logs of the container of the previous attempt by copying the container identifier in the CONTAINER column (2a5e823478bdf), and then using crictl logs 2a5e823478bdf:

There is the error. Now that we know what to fix, let's look at container logs to see if we can find this error there:

No kube-apiserver logs at /var/log/containers

Nothing there. Now let's look at pod logs:

No kube-apiserver logs at /var/log/pods

Nothing there either. Finally, let's look at the error at journalctl:

journalctl | grep kube-apiserver

The above command returns a lot of error messages since there are other services trying to reach out to the kube-apiserver and failing. None of the errors is the one that we found in the container logs via crictl.
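Since crictl logs is where this kind of error shows up, copying the container ID by hand can be scripted. This is a sketch under assumptions: the helper names and the CRICTL override variable are hypothetical, and it assumes that crictl ps -a lists the most recent attempt first, as in the listing above.

```shell
# Print the ID (first column) of the first row on stdin that mentions the
# given name. Assumes the listing is ordered newest first.
# Usage: crictl ps -a | first_container_id kube-apiserver
first_container_id() {
  awk -v name="$1" 'index($0, name) { print $1; exit }'
}

# Show the last lines logged by the most recent attempt.
# CRICTL is a hypothetical override, e.g. for testing with a stand-in command.
last_attempt_logs() {
  id=$(${CRICTL:-crictl} ps -a | first_container_id "${1:-kube-apiserver}")
  [ -n "$id" ] || { echo "no container found" >&2; return 1; }
  ${CRICTL:-crictl} logs --tail 20 "$id"
}
```

Calling last_attempt_logs with no arguments is then roughly equivalent to the manual crictl logs step used earlier, limited to the last 20 lines.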

Summary

Here is a suggested list of steps that sums up all the above examples:

1. Make a copy of kube-apiserver.yaml before you make changes to it.
2. Make the change at kube-apiserver.yaml and save it.
3. Monitor containers with watch crictl ps or watch docker ps, depending on the container runtime that the cluster uses.
4. If the kube-apiserver does not reappear after a couple of minutes:
   a. List all containers with crictl ps -a. If you see a kube-apiserver container, inspect its logs with crictl logs [id of the container]. If you identify the error, fix it at kube-apiserver.yaml and go back to step 3.
   b. If you did not find the error, search for it in the journal using journalctl | grep apiserver.
   c. If you did not find the error at journalctl, look at the container and pod logs at /var/log.
5. There are cases where the Kubelet did stop the kube-apiserver container but did not start it again. You can force it to do so with systemctl restart kubelet.service. That should attempt to start kube-apiserver and log an error at journalctl if it failed.
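The decision in the troubleshooting part of this list can be sketched as a single function. Treat it as an illustration rather than a finished tool: triage_apiserver, CRICTL, and JOURNAL are made-up names, and the function is meant to be run only after kube-apiserver has failed to reappear.

```shell
# Decide where to look first, following the order of the list above.
# CRICTL and JOURNAL are hypothetical overrides for testing or other setups.
triage_apiserver() {
  id=$(${CRICTL:-crictl} ps -a | awk '/kube-apiserver/ { print $1; exit }')
  if [ -n "$id" ]; then
    # A container exists, so the manifest was parsed: check its own logs.
    echo "container $id exists: likely a configuration error, showing its logs"
    ${CRICTL:-crictl} logs --tail 20 "$id"
  else
    # No container at all: the Kubelet probably could not parse the manifest.
    echo "no container: likely a manifest syntax error, searching the journal"
    ${JOURNAL:-journalctl --no-pager} | grep apiserver | tail -n 20
  fi
}
```

If neither branch reveals the cause, the remaining options from the list still apply: the log files under /var/log and a Kubelet restart.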

Conclusion

This guide should help you to track configuration errors at kube-apiserver. It requires patience and practice until you get fluent with these commands and the kind of output you will find. You can start practicing now by solving the challenges that involve the apiserver at https://killercoda.com.

Published in:

System Administration
