Autoscaling java application based on Memory utilization metrics

Sudhakar_Vuppalapati · September 5, 2024, 4:20pm

I run Spring Boot applications in a Kubernetes cluster and would like to autoscale the number of pods based on the memory utilization of the web application. The problem is that the JVM heap does not release memory once it has grown. So once we scale out, we won't be able to scale back in, because the heap holds on to the memory; we would need to run GC regularly. How do you handle this? Can you please help?

Thanks
Sudhakar

jayeshmahajan · September 6, 2024, 10:12pm

Tuning Garbage Collection (GC) Behavior

You can configure the JVM to handle memory more efficiently by tuning its GC behavior. Here are some approaches:

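Use G1 Garbage Collector: G1 GC is good at managing heap fragmentation and reducing large spikes in memory usage. It is already the default collector on most JDK 9+ setups, but the JVM may pick the serial collector in small containers, so setting the flag explicitly is still reasonable.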

-XX:+UseG1GC
-XX:MaxGCPauseMillis=200

Configure Heap Size Properly: Set appropriate -Xms (initial heap size) and -Xmx (maximum heap size) values. Often, setting these two to the same value avoids heap resizing and makes memory usage more predictable.

-Xms1g -Xmx1g
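To see how much heap the JVM is actually using versus how much it has reserved, a small check like the one below can help. It is only a sketch: the class name HeapReport and the helper toMiB are made up for illustration, while MemoryMXBean and MemoryUsage are standard java.lang.management APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the original answer.
public class HeapReport {
    // Convert bytes to whole mebibytes for readable output.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // "used" is what live and not-yet-collected objects occupy.
        System.out.println("used MiB      = " + toMiB(heap.getUsed()));
        // "committed" is what the JVM has reserved from the OS for the heap.
        System.out.println("committed MiB = " + toMiB(heap.getCommitted()));
        // getMax() returns -1 when no maximum is defined.
        long max = heap.getMax();
        System.out.println("max MiB       = " + (max < 0 ? "undefined" : String.valueOf(toMiB(max))));
    }
}
```

If committed stays high while used drops after a GC, that is the "heap holds the memory" effect described in the question.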

Use Kubernetes Memory-Based Autoscaling

While Kubernetes Horizontal Pod Autoscaler (HPA) scales pods based on memory or CPU metrics, these metrics can become misleading due to how the JVM holds on to memory. You can apply the following approaches:

Vertical Pod Autoscaler (VPA): Instead of horizontally scaling the number of pods, you could use VPA to adjust the resources (CPU, memory) assigned to each pod dynamically.
Custom Metrics: Use a custom metric that better reflects the actual application load, such as HTTP request rate, queue length, or business metrics (like active users) instead of raw memory utilization.
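To illustrate why a load-based metric behaves better than memory, here is a rough sketch of the core HPA calculation as described in the Kubernetes documentation (desired = ceil(currentReplicas * currentMetric / targetMetric)). The class name HpaMath and the numbers are invented for the example, and the real controller also applies a tolerance and stabilization windows that are left out here.

```java
// Hypothetical illustration of the HPA replica formula; not Kubernetes code.
public class HpaMath {
    // desired = ceil(currentReplicas * currentMetric / targetMetric)
    static int desiredReplicas(int currentReplicas, double currentMetric, double targetMetric) {
        return (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
    }

    public static void main(String[] args) {
        // Memory metric: traffic has dropped, but each JVM still sits at 80%
        // of its request because the heap was not returned. No scale-in.
        System.out.println(desiredReplicas(4, 80.0, 80.0));  // prints 4

        // Request-rate metric: 25 req/s per pod against a target of 100 req/s.
        // The metric follows the load, so the HPA can scale in.
        System.out.println(desiredReplicas(4, 25.0, 100.0)); // prints 1
    }
}
```

With memory as the metric the ratio stays at or above 1 after a spike, so the replica count never comes back down; with request rate it drops as soon as the load does.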

Using liveness and readiness probes can help manage JVM memory more effectively. These probes ensure that Kubernetes only routes traffic to healthy instances and can restart unhealthy pods, allowing the JVM to reset its heap.

For instance, you could tie the liveness probe to an endpoint that checks for heap memory exhaustion:
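A minimal sketch of such a check in plain Java is shown below. The class name HeapHealthCheck and the 95% threshold are assumptions chosen for illustration; in a Spring Boot application the same logic could be wrapped in an Actuator health indicator and exposed on the endpoint the probe calls.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical health check; names and threshold are illustrative only.
public class HeapHealthCheck {
    // Assumed threshold: report unhealthy when used heap reaches 95% of max.
    static final double MAX_HEAP_RATIO = 0.95;

    static boolean isHealthy(long usedBytes, long maxBytes, double threshold) {
        if (maxBytes <= 0) {
            // Max heap is undefined (-1), so the ratio cannot be judged.
            return true;
        }
        return (double) usedBytes / maxBytes < threshold;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        boolean healthy = isHealthy(heap.getUsed(), heap.getMax(), MAX_HEAP_RATIO);
        // A real endpoint would map this to an HTTP 200 or 503 response.
        System.out.println(healthy ? "UP" : "DOWN");
    }
}
```

Keep in mind that used heap includes garbage that has not been collected yet, so a single high reading does not necessarily mean the heap is exhausted. Giving the probe a failure threshold of several consecutive checks helps avoid needless restarts.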

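As a last resort, you can schedule periodic restarts of your pods, for example with a Kubernetes CronJob that triggers a rolling restart of the deployment. Note that Kured (Kubernetes Reboot Daemon) reboots whole nodes rather than individual pods, so it is a much blunter tool for this purpose. While periodic restarts aren't ideal, they can provide temporary relief by resetting memory usage.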
