Turning the lights on/off in Kubernetes clusters
Some time ago, cruising through the Sustainability-related parts of the Internet, I came across the term LightSwitchOps. Since I'm a fool for all things Ops, I decided to have a look.
In this article, we will dip our toes into the concept of LightSwitchOps and how to apply it in a Kubernetes environment, at the smallest possible level - pods. You can call this an intro to LSO with a practical example, if you wish.
Just a small note on wording - I'll use the term hardware to describe underlying machines, servers, virtual machines, and other equipment.
Now, without further ado, let us start.
What is LightSwitchOps?
LightSwitchOps is a sustainability concept that boils down to turning machines, servers, and VMs off when they are not used. Now, from a logical point of view, this seems obvious, right? I myself am the person who goes around the house and switches the lights off whenever they're unnecessary. Why can't I (we) do the same with the servers?
Well, in the IT world, because of all those -ilities (availability being one of them), we gravitate towards leaving the hardware running for as long as possible. Now, don't get me wrong, if the application, process, or whatever you have/use needs to be on 24/7, then go ahead, leave it on. But for most of the stuff, we don't need that availability.
And there is research showing that a big percentage of running IT equipment - actively emitting CO2 - is not being used at all; it just sits there and consumes electricity. More information can be found here and here.
Okay, we can turn our hardware off, but what about the start-up time?
What about it? If you don't need your hardware to be on all the time, you can spare some seconds to wait during startup. Or minutes, if we're talking in Windows terms.
What is kube-green?
Now, let's move the concept of LightSwitchOps into Kubernetes. We now know what the concept is, but how can we apply it to our services running in Kubernetes?
We can use a tool called kube-green. kube-green helps you turn services off when they are not used, and back on when needed - for example, turning them on during working hours and off during non-working hours.
How does it work?
The tool itself works quite simply - when you install it on the cluster, you get a CustomResourceDefinition called SleepInfo. This CRD basically takes the resources you specify via label selector(s) and scales them to 0 (Deployment), or suspends them (CronJob).
And that's it!
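To make this concrete, here is a minimal sketch of a SleepInfo resource, based on the kube-green documentation. The name, namespace, schedule, and time zone are illustrative assumptions - check the docs for the exact schema of the version you install.

```yaml
apiVersion: kube-green.com/v1alpha1
kind: SleepInfo
metadata:
  name: working-hours        # illustrative name
  namespace: my-namespace    # the namespace whose workloads should sleep
spec:
  weekdays: "1-5"            # Monday through Friday
  sleepAt: "20:00"           # scale Deployments to 0 at 8 PM...
  wakeUpAt: "08:00"          # ...and restore the previous replicas at 8 AM
  timeZone: "Europe/Rome"    # IANA time zone name
  suspendCronJobs: true      # also suspend CronJobs during sleep
```

With a resource like this applied, the namespace's workloads go to sleep every weekday evening and wake up before working hours - no metrics involved, just the clock.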
The setup is quite small; it creates the following resources:
- a namespace where kube-green runs
- the SleepInfo CRD mentioned above
- a service account for the service
- a role and a cluster role for the controller manager
- cluster roles for the metrics and for the proxy
- some configuration for the controller
- two services
- one deployment
- one certificate (cert-manager is a prerequisite)
- one validating webhook configuration to validate SleepInfo CRD applications
For simple instructions on how to install it, check out the link below.
https://kube-green.dev/docs/install/
Why not use the HPA for this?
You may ask yourself, or me for that matter - what about Kubernetes-native mechanisms such as the HorizontalPodAutoscaler (HPA)?
Good question. The simple answer is that the HPA cannot scale resources to 0. Yep.
The HPA works by monitoring the metrics of the resources we specify. If resource usage rises above or falls below a defined threshold (e.g. memory or CPU), the HPA automatically scales those resources up or down. In that way, it enables an auto-scaling mechanism in a quite easy and stress-free way.
Because it uses the metrics of the running resources, scaling to 0 would leave no metrics to tell it when to scale back up. Therefore, if you try to put minReplicas: 0 in the HPA, Kubernetes will throw a validation error and won't apply the configuration (unless the alpha HPAScaleToZero feature gate is enabled).
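For reference, here is a minimal sketch of an HPA manifest; the target name and thresholds are illustrative. Setting minReplicas to 0 here would be rejected by default.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app               # illustrative name
spec:
  scaleTargetRef:            # the workload this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1             # must be at least 1 by default validation
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out when average CPU exceeds 80%
```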
This is where using kube-green can help us.
Bonus points
If you install kube-green to handle turning the services off when not used, consider also using the Kubernetes Cluster Autoscaler to scale your nodes down when they are not being used. It is a good pairing, from my point of view. For more info on that, check out the link below.
Summary
In a nutshell, kube-green is a time-based scaling tool. It doesn't use any fancy metrics or approaches. It just scales resources to 0 and back up at the times you define. Following are a couple of impressions of the tool.
- Quite easy to install and set up, even with cert-manager as a prerequisite.
- Simple to use, without any hassle or additional configuration.
- It follows a set-it-and-forget-it approach - you install it, configure it, and leave it working.
Things can be that simple! Even with Kubernetes.
Further information
To find out more about the LightSwitchOps concept, visit the link below.
To find out more about kube-green, check out the link below.
Thanks for sticking with me for this long! In the next article, we will go one level up, and apply the concept of LightSwitchOps to Kubernetes nodes. See you in a couple of weeks!
If you liked the article, feel free to share it. If there is something wrong with the things I wrote, feel free to drop a comment below. Bonus - subscribe to the blog and receive these articles in your inbox!
Thank you for helping me grow!