Turning the lights on/off in Kubernetes clusters

Some time ago, while cruising through the sustainability-related parts of the Internet, I came across the term LightSwitchOps. Since I'm a fool for all things Ops, I decided to take a look.

In this article, we will dip our toes into the concept of LightSwitchOps and how to apply it in a Kubernetes environment, at the smallest possible level - pods. You can call this an intro to LightSwitchOps with a practical example, if you wish.

Just a small note on wording - I'll use the term hardware to describe underlying machines, servers, virtual machines, and other equipment.

Now, without further ado, let us start.

What is LightSwitchOps?

LightSwitchOps is a sustainability concept: turn machines, servers, and VMs off when they are not being used. From a logical point of view, this seems normal, right? I myself am the person who goes around the house and switches the lights off whenever they're unnecessary. Why can't I (we) do the same with servers?

Well, in the IT world, because of all those ilities (availability being one of them), we gravitate towards leaving the hardware running for as long as possible. Now, don't get me wrong: if the application, process, or whatever you have needs to be on 24/7, then go ahead, leave it on. But for most of the stuff, we don't need that availability.

And there is research showing that a big percentage of running IT equipment, actively emitting CO2, is not being used at all; it just sits there and consumes electricity. More information can be found here and here.

Okay, we can turn our hardware off, but what about the start-up time?

What about it? If you don't need your hardware to be on all the time, you can spare a few seconds of waiting during startup. Or minutes, if we're talking in Windows terms.

What is kube-green?

Now, let's move the concept of LightSwitchOps into Kubernetes. We know what the concept is; how can we apply it to our services running in Kubernetes?

We can use a tool called kube-green. kube-green helps you turn services off when they're not used, and back on when needed - for example, on during working hours and off during non-working hours.

How does it work?

The tool itself is quite simple: when you install it on the cluster, you get a CustomResourceDefinition, the SleepInfo. This CRD takes the resources you specify via label selector(s) and scales them to 0 (Deployments) or suspends them (CronJobs).

And that's it!
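To make this concrete, here is a sketch of what a SleepInfo resource might look like. The field names come from the kube-green documentation, but treat the concrete values (names, namespace, schedule) as placeholders, and double-check the fields against the version you install:

```yaml
apiVersion: kube-green.com/v1alpha1
kind: SleepInfo
metadata:
  name: working-hours          # placeholder name
  namespace: my-namespace      # the namespace whose workloads should sleep
spec:
  weekdays: "1-5"              # Monday to Friday
  sleepAt: "20:00"             # scale Deployments to 0 at 8 PM...
  wakeUpAt: "08:00"            # ...and back up at 8 AM
  timeZone: "Europe/Rome"
  suspendCronJobs: true        # also suspend CronJobs in the namespace
  excludeRef:                  # keep selected workloads running
    - apiVersion: "apps/v1"
      kind: Deployment
      name: api-gateway
```

As far as I can tell from the docs, wakeUpAt is optional - without it, the resources stay asleep until you scale them back up yourself.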

The setup is quite small, it creates the following resources:

  • namespace - where to run kube-green
  • SleepInfo CRD mentioned above
  • service account for the service
  • role and cluster role for controller manager
  • cluster role for the metrics and for the proxy
  • some configuration for the controller
  • two services
  • one deployment
  • one certificate (cert-manager is a prerequisite)
  • and one validating webhook configuration to validate SleepInfo resources on apply.

For simple instructions on how to install it, check out the link below.

https://kube-green.dev/docs/install/

Why not use the HPA for this?

You may ask yourself, or me for that matter: what about Kubernetes-native mechanisms such as the HorizontalPodAutoscaler (HPA)?

Good question. The simple answer is that the HPA cannot scale resources to 0. Yep.

The HPA monitors the metrics of the resources we specify. If resource usage (e.g. memory or CPU) rises above or falls below a defined threshold, the HPA automatically scales those resources up or down. In that way, it enables auto-scaling in quite an easy and stress-free way.

Because it uses the metrics of running pods, scaling to 0 would leave nothing to measure, so the HPA wouldn't know when to scale back up. Therefore, if you try to put minReplicas: 0 in the HPA, Kubernetes will throw a validation error and won't apply the configuration (unless the alpha HPAScaleToZero feature gate is enabled).
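For comparison, here is a minimal HPA manifest (the names are hypothetical; autoscaling/v2 is the current stable API). Scaling between 1 and 5 replicas works fine; dropping the floor to 0 is exactly what this mechanism can't do out of the box:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                 # hypothetical workload
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1               # must be >= 1 by default - the HPA needs running pods to measure
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale up when average CPU usage exceeds 80%
```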

This is where using kube-green can help us.

Bonus points

If you install kube-green to turn services off when they're not used, consider pairing it with the Kubernetes Cluster Autoscaler, which scales your nodes down when they're not being used (and back up when needed). It's a good pair, from my point of view. For more info, check out the link below.

autoscaler/cluster-autoscaler at master · kubernetes/autoscaler

Summary

In a nutshell, kube-green is a time-based scaling tool. It doesn't use any fancy metrics or approaches; it just scales resources to 0 and back up on the schedule you define. Here are a couple of impressions of the tool.

  • Quite easy to install and set up, even with cert-manager as a prerequisite.
  • Simple to use, without any hassle or additional configuration.
  • It takes a "set it and forget it" approach - you install it, configure it, and leave it working.

Things can be that simple! Even with Kubernetes.

Further information

To find out more about the LightSwitchOps concept, visit the link below.

Why Cloud Zombies Are Destroying the Planet and How You Can Stop Them
At QCon London, Holly Cummins, Quarkus senior principal software engineer at RedHat, talked about how utilization and elasticity relate to sustainability. In addition, she introduced a range of practical zombie-hunting techniques, including absurdly simple automation, LightSwitchOps, and FinOps.

To find out more about kube-green, check out the link below.

GitHub - kube-green/kube-green: A K8s operator to reduce CO2 footprint of your clusters

Thanks for sticking with me for this long! In the next article, we will go one level up, and apply the concept of LightSwitchOps to Kubernetes nodes. See you in a couple of weeks!

If you liked the article, feel free to share it. If there is something wrong with the things I wrote, feel free to drop a comment below. Bonus - subscribe to the blog and receive these articles in your inbox!

Thank you for helping me grow!