{"id":27789,"date":"2025-07-24T10:48:02","date_gmt":"2025-07-24T08:48:02","guid":{"rendered":"https:\/\/blog.mi.hdm-stuttgart.de\/?p=27789"},"modified":"2025-07-24T10:48:04","modified_gmt":"2025-07-24T08:48:04","slug":"beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling","status":"publish","type":"post","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/","title":{"rendered":"Beyond Reactive: How AI is Revolutionizing Kubernetes Autoscaling"},"content":{"rendered":"\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Note:<\/strong>&nbsp;This blog post was written for the module Enterprise IT (113601a) in the summer semester of 2025<\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction<\/h2>\n\n\n\n<p>Kubernetes has become the leading open-source platform for managing containerized applications. Its ability to automate deployment, scaling, and operations helps teams efficiently manage microservices architectures and dynamic cloud workloads.<\/p>\n\n\n\n<p>A cornerstone of efficient Kubernetes cluster management is auto-scaling, a mechanism that dynamically adjusts resources to meet fluctuating application demands, thereby ensuring optimal performance, maximizing resource utilization, and controlling operational costs. While Kubernetes offers built-in auto-scaling capabilities through tools like the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA), these traditional methods often rely on reactive measures and predefined thresholds. These approaches, while effective for many scenarios, can fall short in handling the dynamic and often unpredictable nature of contemporary workloads.<\/p>\n\n\n\n<p>Artificial Intelligence is now stepping in to revolutionize this landscape, offering a more intelligent and efficient way to manage Kubernetes cluster scaling. 
AI&#8217;s capacity to analyze vast amounts of data, learn from patterns, and make proactive decisions is transforming how Kubernetes clusters adapt to changing demands. This article examines the limitations of traditional Kubernetes auto-scaling and shows how AI enables more autonomous and optimized cloud-native operations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Foundations of Kubernetes Auto-Scaling Mechanisms<\/h2>\n\n\n\n<p>To fully grasp Kubernetes&#8217; auto-scaling capabilities, it&#8217;s essential to first understand the foundational components that form its backbone: Containers, Pods, Nodes, and Clusters.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Containers:<\/strong> These are standardized, executable software packages that bundle an application with all its dependencies, ensuring consistent execution across various environments. Each container is designed to be stateless and immutable, promoting repeatable deployments by decoupling applications from the underlying host infrastructure. [1]<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Pods:<\/strong> The smallest deployable unit in Kubernetes, a Pod represents a single instance of a running process in your cluster. A Pod encapsulates one or more Containers, along with shared storage resources, a unique network IP, and options that govern how its containers run. All containers within a Pod share the same network namespace and can communicate with each other easily. [2]<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Nodes:<\/strong> These are the worker machines (either virtual or physical) in a Kubernetes cluster where Pods are actually run. Each Node is managed by the Kubernetes control plane and contains the necessary services, like kubelet, a container runtime, and kube-proxy, to facilitate container execution and network communication for its Pods. 
[3]<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Clusters:<\/strong> A Kubernetes Cluster is a complete system that manages containerized applications. It consists of a set of Nodes (which run your Pods) and a control plane. The control plane acts as the brain of the cluster, coordinating all activities such as scheduling Pods, maintaining the desired application states, and managing scaling, all to ensure high availability and fault tolerance of your applications. [4]<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"591\" data-attachment-id=\"27791\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/kubernetes_basics\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics.png\" data-orig-size=\"2392,1380\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"kubernetes_basics\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-1024x591.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-1024x591.png\" alt=\"\" class=\"wp-image-27791\" style=\"width:668px;height:auto\" 
srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-1024x591.png 1024w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-300x173.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-768x443.png 768w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-1536x886.png 1536w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/kubernetes_basics-2048x1182.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 1:<\/strong> Conceptual overview of basic Kubernetes architecture, illustrating the relationship between Nodes, Pods, and Containers within a Cluster.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Overview of Kubernetes Auto-Scaling Mechanisms<\/h2>\n\n\n\n<p>With these foundational components in mind, we can now explore Kubernetes&#8217; built-in mechanisms for automating the scaling of applications and infrastructure. These mechanisms ensure that applications can dynamically adjust to varying workloads, optimizing resource utilization and maintaining performance.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Horizontal Pod Autoscaler (HPA)<\/strong> \u2013 scales the number of pod replicas in a workload. [5]<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Vertical Pod Autoscaler (VPA)<\/strong> \u2013 adjusts the CPU\/memory requests of containers (changing pod resource allocations). [6]<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Cluster Autoscaler (CA)<\/strong> \u2013 adds or removes nodes in the cluster itself. 
[6]<\/li>\n<\/ul>\n\n\n\n<p>Each of these mechanisms operates through a rule-based control loop that continuously monitors relevant metrics and makes scaling decisions reactively.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Horizontal Pod Autoscaler (HPA): Scaling by Replicas<\/h3>\n\n\n\n<p>The Horizontal Pod Autoscaler (HPA) is a core Kubernetes feature that automatically adjusts the number of pod replicas for a given workload to match demand [5].<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"564\" data-attachment-id=\"27792\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/hpa\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA.png\" data-orig-size=\"1600,881\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"HPA\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-1024x564.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-1024x564.png\" alt=\"\" class=\"wp-image-27792\" style=\"width:734px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-1024x564.png 1024w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-300x165.png 300w, 
https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-768x423.png 768w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA-1536x846.png 1536w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/HPA.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 2:<\/strong> Visual Overview: HPA Workflow<\/figcaption><\/figure>\n\n\n\n<p>HPA scales based on observed metrics such as average CPU utilization, average memory usage, or custom metrics [5]. The HPA controller operates in a control loop, periodically querying metrics and comparing them to target values to determine the desired number of replicas [5]. The calculation for the desired number of replicas is as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"112\" data-attachment-id=\"27794\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/screenshot-2025-07-23-210700\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700.png\" data-orig-size=\"1464,160\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2025-07-23 210700\" data-image-description=\"\" data-image-caption=\"\" 
data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700-1024x112.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700-1024x112.png\" alt=\"\" class=\"wp-image-27794\" style=\"width:642px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700-1024x112.png 1024w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700-300x33.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700-768x84.png 768w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210700.png 1464w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>For example, if the current average CPU utilization across all pods is 200m, the target is set at 100m, and there are currently 2 replicas, the HPA computes:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735.png\"><img loading=\"lazy\" decoding=\"async\" width=\"891\" height=\"94\" data-attachment-id=\"27795\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/screenshot-2025-07-23-210735\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735.png\" data-orig-size=\"891,94\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Screenshot 2025-07-23 210735\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735.png\" alt=\"\" class=\"wp-image-27795\" style=\"width:395px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735.png 891w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735-300x32.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-210735-768x81.png 768w\" sizes=\"auto, (max-width: 891px) 100vw, 891px\" \/><\/a><\/figure>\n\n\n\n<p>This results in the HPA doubling the number of pods to meet the target. This calculation is performed at regular intervals (defaulting to every 15 seconds). [5]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Vertical Pod Autoscaler (VPA): Scaling by Resources<\/h3>\n\n\n\n<p>The Vertical Pod Autoscaler (VPA) takes a different approach to scaling by automatically adjusting resource requests (minimum guaranteed resources) and limits (maximum allowed resources) for containers. Instead of changing the number of replicas like HPA, VPA focuses on optimizing the resource allocation of existing pods to improve workload efficiency. 
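<\/p>\n\n\n\n<p>As a minimal sketch, such a VPA object could be declared as follows (illustrative names; assumes the VerticalPodAutoscaler CRD from the Kubernetes autoscaler project is installed in the cluster):<\/p>

```yaml
# Hypothetical example: let VPA manage resource requests for a Deployment
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa            # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # illustrative target workload
  updatePolicy:
    updateMode: "Auto"     # apply recommendations by evicting and recreating pods
```

<p>Setting <code>updateMode<\/code> to <code>"Off"<\/code> instead would keep VPA in recommendation-only mode.<\/p>\n\n\n\n<p>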
[8]<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"352\" data-attachment-id=\"27796\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/vpa\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA.png\" data-orig-size=\"1600,550\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"VPA\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-1024x352.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-1024x352.png\" alt=\"\" class=\"wp-image-27796\" style=\"width:830px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-1024x352.png 1024w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-300x103.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-768x264.png 768w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA-1536x528.png 1536w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/VPA.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 3:<\/strong> The Vertical Pod Autoscaler (VPA) dynamically adjusts a Pod\u2019s resource 
allocation.<\/figcaption><\/figure>\n\n\n\n<p>VPA operates through three main components: the Recommender (which analyzes resource usage and generates recommendations), the Admission Controller (which applies resource settings to new pods), and the Updater (which evicts pods to apply new resource configurations when needed). The system can operate in different modes, ranging from recommendation-only to fully automated resource management. VPA manages resource requests and limits for containers by analyzing historical usage data and Out-Of-Memory events to determine optimal resource allocations. [8]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cluster Autoscaler (CA): Scaling the Infrastructure<\/h3>\n\n\n\n<p>The Cluster Autoscaler (CA) operates at the cluster level, automatically adjusting the number of nodes based on pod scheduling requirements. Unlike HPA and VPA, which work at the pod level, CA monitors for pods in a pending state rather than directly measuring resource utilization, checking every 10 seconds by default to detect when pods cannot be scheduled due to insufficient cluster capacity. 
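<\/p>\n\n\n\n<p>This pending-pod check can be caricatured in a few lines (a toy sketch for intuition, not the actual autoscaler logic; all names are illustrative):<\/p>

```python
def needs_scale_up(pending_pod_requests_milli, node_free_capacity_milli):
    """Toy version of the Cluster Autoscaler check: scale up when at least
    one pending pod's CPU request fits on no existing node."""
    return any(
        all(pod > free for free in node_free_capacity_milli)
        for pod in pending_pod_requests_milli
    )

# Two nodes with 300m and 150m free; a pending pod requesting 500m fits nowhere
print(needs_scale_up([500], [300, 150]))  # -> True: provision a new node
print(needs_scale_up([100], [300, 150]))  # -> False: the pod can be scheduled
```

<p>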
[9, 10]<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"644\" data-attachment-id=\"27797\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/ca\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA.png\" data-orig-size=\"1600,1007\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"CA\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-1024x644.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-1024x644.png\" alt=\"\" class=\"wp-image-27797\" style=\"width:773px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-1024x644.png 1024w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-300x189.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-768x483.png 768w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA-1536x967.png 1536w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/CA.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 4:<\/strong> The Cluster Autoscaler (CA) adding a new Node to accommodate pending Pods. 
When new Pods cannot be scheduled on existing Nodes, CA triggers the cloud provider to launch a new Node, where these pending Pods can then be scheduled.<\/figcaption><\/figure>\n\n\n\n<p>As illustrated in Figure 4, the scaling process follows a systematic workflow: CA detects pending pods that cannot be scheduled (1), communicates with the cloud provider infrastructure to provision additional capacity (2), and triggers the launch of a new node (3). The newly provisioned node is then registered with the Kubernetes control plane, making it available for scheduling the previously pending pods. [9]<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Limitations of Traditional Autoscaling: Reactive Design<\/h2>\n\n\n\n<p>Traditional autoscaling mechanisms in Kubernetes, like the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler (CA), operate reactively, initiating scaling actions only after a change in demand is detected [11]. This reactive approach inherently involves a delay between the moment a demand spike occurs and when the autoscalers actually respond and adjust resources [12]. This delay can result in periods of over-provisioning (too many resources) or under-provisioning (insufficient resources) before the system can adapt [12, 11]. Under-provisioning leads to application slowdowns or failures due to a lack of resources, while over-provisioning wastes resources and increases costs [12].<\/p>\n\n\n\n<p>The Horizontal Pod Autoscaler (HPA) evaluates metrics like CPU or memory usage at fixed intervals, typically every 15 seconds [5], introducing latency between demand changes and scaling. The Vertical Pod Autoscaler (VPA) applies resource adjustments by restarting pods [13], which temporarily disrupts availability. 
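<\/p>\n\n\n\n<p>The reactive decision rule can be sketched in a few lines, a simplified illustration of the HPA replica formula shown earlier rather than the actual controller code. The key point is that the input is always an already-observed metric value:<\/p>

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """HPA core rule: desired = ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * (current_metric / target_metric))

# Simplified reactive control-loop step: the decision is based on metrics
# observed *after* the load change has already happened (polled ~every 15s).
def control_loop_step(current_replicas, observed_cpu_milli, target_cpu_milli):
    return desired_replicas(current_replicas, observed_cpu_milli, target_cpu_milli)

# Worked example from the HPA section: 2 replicas at 200m observed vs. 100m target
print(control_loop_step(2, 200, 100))  # -> 4 (the replica count is doubled)
```

<p>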
Similarly, the Cluster Autoscaler (CA) requires time to provision and integrate new nodes, delaying capacity expansion.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive.png\"><img loading=\"lazy\" decoding=\"async\" width=\"586\" height=\"331\" data-attachment-id=\"27798\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/reactive\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive.png\" data-orig-size=\"586,331\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"reactive\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive.png\" alt=\"\" class=\"wp-image-27798\" style=\"width:646px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive.png 586w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/reactive-300x169.png 300w\" sizes=\"auto, (max-width: 586px) 100vw, 586px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 5:<\/strong> Illustration of Reactive Scaling: Scaling actions are taken only after the CPU load has already increased, based on existing values. 
This can lead to temporary performance degradation during sudden spikes.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">From Reactive to Proactive: The Case for AI-Driven Autoscaling<\/h2>\n\n\n\n<p>To address the shortcomings of traditional scaling mechanisms, modern infrastructures increasingly turn to proactive autoscaling. Unlike reactive scaling, which acts only after a load increase has been observed, proactive AI-based strategies predict upcoming demand and prepare resources in advance [11].<\/p>\n\n\n\n<p><em>Figure 6<\/em> illustrates the concept of proactive scaling: instead of waiting for CPU usage to increase, the system uses predicted values beyond the current time point (<em>s2<\/em>) to adjust resources ahead of time. This mitigates the drawbacks of reactive scaling by enabling timely resource adjustments, helping prevent latency issues and performance degradation before they occur. To achieve such proactive capabilities, various AI models are employed, which we will outline in the subsequent section.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive.png\"><img loading=\"lazy\" decoding=\"async\" width=\"839\" height=\"448\" data-attachment-id=\"27799\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/proactive\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive.png\" data-orig-size=\"839,448\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"proactive\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive.png\" alt=\"\" class=\"wp-image-27799\" style=\"width:685px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive.png 839w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive-300x160.png 300w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/proactive-768x410.png 768w\" sizes=\"auto, (max-width: 839px) 100vw, 839px\" \/><\/a><figcaption class=\"wp-element-caption\">Figure 6: Illustration of Proactive Scaling<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">AI Models for Predictive Auto-Scaling<\/h2>\n\n\n\n<p>As established, the transition from reactive to proactive auto-scaling in Kubernetes fundamentally relies on the ability to accurately forecast future resource demands. 
This predictive capability is achieved through the application of various Machine Learning models, which can analyze historical workload data to identify patterns and anticipate future needs.<\/p>\n\n\n\n<p>In this section, we explore key AI model architectures commonly employed for predictive auto-scaling in Kubernetes environments, beginning with foundational time series approaches and progressing to more complex deep learning techniques.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Time Series Models<\/h3>\n\n\n\n<p>Time series models like ARIMA (AutoRegressive Integrated Moving Average), which have long been widely used for time series forecasting [14], can be applied to predict future workload. These models focus on decomposing time series data into key components [15]:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Trends<\/strong>: Capturing long-term increases or decreases in resource utilization.<\/li>\n\n\n\n<li><strong>Seasonalities<\/strong>: Modeling regular, predictable cycles such as daily, weekly, or monthly patterns.<\/li>\n\n\n\n<li><strong>Cyclical Patterns<\/strong>: Accounting for irregular, non-fixed cycles that may occur due to workload changes or business cycles.<\/li>\n<\/ul>\n\n\n\n<p>Their relative simplicity and interpretability [15] explain much of this long-standing popularity [14].<\/p>\n\n\n\n<p>However, these models are primarily linear, which can limit their ability to capture complex, non-linear relationships [15] or sudden workload changes often seen in dynamic Kubernetes environments [16].<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Neural Networks and Deep Learning<\/h3>\n\n\n\n<p>For more complex and non-linear patterns that traditional time series models might miss [15], Neural Networks (NN) and especially Deep Learning (DL) models come into play. 
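<\/p>\n\n\n\n<p>Before going deeper: the core idea behind the time series approach above can be illustrated with a deliberately simple seasonal-naive forecaster. This is a toy sketch, not ARIMA, and all names and numbers are illustrative:<\/p>

```python
def seasonal_naive_forecast(history, period, horizon):
    """Predict future values by repeating the last observed seasonal cycle.

    A deliberately simple stand-in for time series models: real systems
    would model trend and seasonality jointly (e.g. with ARIMA).
    """
    if len(history) < period:
        raise ValueError("need at least one full seasonal cycle")
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]

# Toy example: CPU load samples (millicores) with a repeating 4-sample pattern
cpu_history = [100, 400, 300, 120, 110, 410, 310, 125]
prediction = seasonal_naive_forecast(cpu_history, period=4, horizon=4)
print(prediction)  # -> [110, 410, 310, 125]: the next cycle mirrors the last
```

<p>A proactive autoscaler would feed such predictions, rather than current observations, into its scaling decision.<\/p>\n\n\n\n<p>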
These advanced algorithms can process vast amounts of data and learn intricate relationships, making them highly suitable for predictive auto-scaling.<\/p>\n\n\n\n<p>Here are the primary AI model architectures commonly used for this purpose:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Long Short-Term Memory (LSTM):<\/strong> These special recurrent neural networks (RNNs) [17] are well suited to time-series forecasting. They can learn long-term dependencies [15] from historical Kubernetes workload data, making them ideal for predicting future resource needs.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Multilayer Perceptrons (MLP): <\/strong>These fundamental feedforward networks are effective when predictions rely heavily on current or cross-sectional features (e.g., immediate CPU utilization, time of day, current active connections). They can model complex relationships between various input metrics gathered from your Kubernetes cluster.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Convolutional Neural Networks (CNNs) (specifically 1D-CNNs): <\/strong>While known for image processing and feature extraction, 1D-CNNs can efficiently identify local patterns within time series data. However, their forecasting accuracy tends to be relatively low compared to models such as LSTM and ARIMA; in one study, CNNs performed worse at Bitcoin price prediction than LSTM and traditional econometric methods [18].<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Ensemble Methods:<\/strong> Since no single forecasting model performs best in all situations, combining models like LSTM, MLP, and CNN can improve prediction accuracy. Each model captures different aspects of the data, and their combination increases the chances of identifying relevant patterns [15]. 
This makes ensemble approaches more robust, especially in complex and dynamic environments such as Kubernetes workloads.<\/li>\n<\/ul>\n\n\n\n<p>The best choice of architecture for your Kubernetes environment depends heavily on the specific workload patterns and available data. LSTMs are often preferred for deep temporal dependencies, MLPs excel with rich feature sets, and CNNs are well suited to detecting local patterns. However, ensemble methods generally offer improved overall performance and reliability for proactive scaling decisions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Benefits of AI-Based Scaling for Kubernetes Environments<\/h2>\n\n\n\n<p>Having explored the diverse AI models applicable to predictive auto-scaling, we now turn our attention to the advantages they deliver. Integrating AI into Kubernetes auto-scaling leads to more efficient, reliable, and cost-effective cloud-native operations, as the following sections explain.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cost Optimization<\/h3>\n\n\n\n<p>Traditional scaling methods often lead to either over-provisioning or under-provisioning [11]. Both scenarios are costly for businesses, resulting in wasted resources or lost revenue due to service degradation [11]. In today\u2019s fast-paced digital environment, even brief downtime or slow responses are increasingly intolerable, potentially leading to substantial financial losses [11].<\/p>\n\n\n\n<p>AI-driven auto-scaling can significantly reduce cloud infrastructure costs by accurately predicting resource needs. This approach can help prevent both over-provisioning during low demand and under-provisioning during peak loads. 
Techniques like intelligent rightsizing of containers, based on actual usage, can contribute to substantial savings [19].<\/p>\n\n\n\n<p>Beyond basic rightsizing, AI technologies can also optimize the use of discounted computing options offered by major cloud providers like AWS, Azure, and Google Cloud. For instance, AI models can predict when spot instances (unused capacity) are likely to be available and when they might be interrupted. This foresight can help organizations plan their resource allocation to minimize service disruption, potentially allowing them to confidently use these highly cost-effective instances. Similarly, for reserved instances (long-term commitments), AI can analyze historical usage patterns. Based on this data, it can recommend the optimal number and type of reserved instances to purchase, aiming to ensure businesses commit to the right amount of resources and maximize savings by avoiding over-provisioning [19].<\/p>\n\n\n\n<p>Many companies offering AI-based autoscaling for Kubernetes highlight these potential cost benefits. For example,<em> StormForge <\/em>promotes savings of up to 65% with its machine learning-driven rightsizing [20], and <em>Sedai.io<\/em> reports 30-50% cost reductions through AI-powered optimization [21]. Furthermore, research and case studies also suggest that AI-based autoscaling can lead to significant cost reductions [22, 23, 24]. These considerable promises underscore AI-based autoscaling&#8217;s immense potential for substantial cloud cost optimization.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Improved Performance<\/h3>\n\n\n\n<p>AI-powered predictive scaling can significantly boost the performance of cloud environments by proactively adjusting resources based on forecasted workload demands. This is especially advantageous in Kubernetes, where scaling operations, like spinning up new pods or provisioning additional nodes, can introduce noticeable delays during sudden traffic spikes. 
Factors like metric collection intervals, scaling thresholds, and new infrastructure startup times contribute to these lags [12]. Consequently, workloads might temporarily experience performance degradation or underutilization before the system fully adapts [12].<\/p>\n\n\n\n<p>Predictive scaling directly addresses this by provisioning resources ahead of time, ensuring adequate capacity is available precisely when required. This leads to more consistent responsiveness [11]. Experimental results have shown that AI-driven autoscaling can enhance application response times by up to 25% compared to traditional rule-based methods [24].<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison.png\"><img loading=\"lazy\" decoding=\"async\" width=\"753\" height=\"475\" data-attachment-id=\"27800\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/respnsetimecomparison\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison.png\" data-orig-size=\"753,475\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"respnsetimeComparison\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison.png\" alt=\"\" 
class=\"wp-image-27800\" style=\"width:671px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison.png 753w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/respnsetimeComparison-300x189.png 300w\" sizes=\"auto, (max-width: 753px) 100vw, 753px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 7:<\/strong> Comparison of System Response Times Using Rule-Based vs. Predictive AI-Driven Autoscaling<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Enhanced Resource Utilization<\/h3>\n\n\n\n<p>AI-based autoscaling offers considerable advantages in optimizing resource utilization by making more intelligent and context-aware scaling decisions. Traditional approaches like the Horizontal Pod Autoscaler (HPA) rely primarily on standard metrics such as CPU and memory usage by default [5]. While these metrics provide a basic indication of system load, they often fail to capture the complexity of real-world workloads. As a result, scaling actions may occur too late or be misaligned with actual resource needs, leading to overprovisioning or underutilization, especially in scenarios where custom metrics are not configured or utilized.<\/p>\n\n\n\n<p>Although HPA can be extended to use custom metrics [5], doing so typically requires considerable manual effort to define, collect, and integrate relevant metrics into the scaling logic. This process can be time-consuming and brittle, especially in dynamic or complex environments. 
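<\/p>\n\n\n\n<p>To illustrate the manual route, a custom-metric HPA might be declared as follows (the workload names, the <em>http_requests_per_second<\/em> metric, and the target of 100 requests per pod are purely illustrative):<\/p>\n\n\n\n

```yaml
# Illustrative only: assumes a custom metrics adapter (e.g., prometheus-adapter)
# already exposes http_requests_per_second for the webapp pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

\n\n\n\n<p>Every such metric must first be collected, exported, and registered with a custom metrics adapter before the HPA can consume it, which is exactly the manual effort described here.<\/p>\n\n\n\n<p>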
In contrast, AI-driven autoscaling can automatically learn patterns from a broad set of input signals (CPU, memory, network traffic, disk I\/O, latency, request throughput, and even application-specific parameters) without requiring extensive manual configuration.<\/p>\n\n\n\n<p>For example, an AI-based approach might autonomously identify that scaling should occur based on queue length, response times, or the number of active user sessions: factors that may be critical for performance but are ignored by default in traditional setups. By continuously adapting to evolving workloads and learning from historical data, AI-driven systems can provide more timely and accurate scaling decisions with less manual intervention.<\/p>\n\n\n\n<p>This intelligent, adaptive capacity directly translates into tangible improvements in efficiency. Experimental results have shown that AI-driven autoscaling can reduce resource wastage by up to 30% compared to traditional rule-based techniques [24].<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased.png\"><img loading=\"lazy\" decoding=\"async\" width=\"718\" height=\"468\" data-attachment-id=\"27801\" data-permalink=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/07\/24\/beyond-reactive-how-ai-is-revolutionizing-kubernetes-autoscaling\/aivsrulebased\/\" data-orig-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased.png\" data-orig-size=\"718,468\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"AIvsRuleBased\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased.png\" src=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased.png\" alt=\"\" class=\"wp-image-27801\" style=\"width:632px;height:auto\" srcset=\"https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased.png 718w, https:\/\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2025\/07\/AIvsRuleBased-300x196.png 300w\" sizes=\"auto, (max-width: 718px) 100vw, 718px\" \/><\/a><figcaption class=\"wp-element-caption\"><strong>Figure 8: <\/strong>Comparison of System Resource Utilization Using Rule-Based vs. Predictive AI-Driven Autoscaling<\/figcaption><\/figure>\n\n\n\n<p>While the benefits of AI-driven autoscaling are substantial, its implementation is not without its own set of challenges and important considerations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Challenges and Considerations for AI-Driven Auto-Scaling<\/h2>\n\n\n\n<p>While AI-driven auto-scaling offers significant advantages, its adoption also presents certain challenges and considerations that organizations need to address.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Implementation and Integration Complexities<\/h3>\n\n\n\n<p>Integrating AI-based auto-scaling into existing cloud infrastructure presents challenges, particularly in environments with legacy systems. 
This process necessitates not only deep technical expertise but also a comprehensive understanding of both the current infrastructure and the proposed AI capabilities.<\/p>\n\n\n\n<p>Implementing predictive scaling usually involves substantial changes to existing resource management procedures, which can be technically challenging and expensive. Furthermore, rolling out AI-based auto-scaling and the associated monitoring solutions demands considerable staff time from IT departments, representing a large upfront investment and requiring specialized skills. These complexities are especially burdensome for small and medium-sized enterprises. [11]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Requirements and Quality<\/h3>\n\n\n\n<p>AI-driven predictive scaling relies heavily on accurate and comprehensive historical and real-time data. Poor data quality, such as noise, inconsistencies, or gaps, can undermine prediction accuracy and lead to inefficient resource allocation. Therefore, ensuring data reliability and addressing any shortcomings in data collection are essential for effective scaling. While cloud environments generate vast amounts of data, preparing this information for AI use remains challenging, as it requires careful cleaning, labeling, and filtering to ensure relevance. Moreover, limited access to diverse real-world datasets can make it difficult to train and evaluate AI-based algorithms robustly, impacting their overall performance. [11]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Stability and Predictability<\/h3>\n\n\n\n<p>As AI models become part of cloud systems, continuous monitoring and regular updates are essential to maintain their accuracy and effectiveness under changing conditions.
This helps ensure stable and predictable scaling behavior, which is critical for reliable system performance. [11]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Computational Overhead<\/h3>\n\n\n\n<p>Running AI and machine learning models for auto-scaling can introduce significant computational overhead. This added load may offset some of the expected efficiency gains [11]. Therefore, the overhead must be carefully considered as part of the overall cost-benefit analysis when adopting AI-powered auto-scaling solutions.<\/p>\n\n\n\n<p>Despite these challenges, a structured approach is essential for successfully implementing AI-driven autoscaling; a conceptual framework for this process is outlined next.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conceptual Framework for AI-Driven Auto-Scaling in Kubernetes<\/h2>\n\n\n\n<p>Implementing AI-driven autoscaling in Kubernetes requires an architecture that collects data, analyzes it with machine learning models, and translates predictions into scaling actions. In the following, it will be outlined how AI-driven autoscaling in Kubernetes could be implemented.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Data Collection and Preprocessing<\/h3>\n\n\n\n<p>To enable intelligent autoscaling, relevant metrics must be collected from the Kubernetes environment. These could include CPU and memory usage, request and error rates, latency, and application-specific indicators. Tools like <em>Prometheus<\/em> can be used to collect and store time-series data from the Kubernetes cluster [25]. 
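<\/p>\n\n\n\n<p>As a small sketch of what this could look like (the PromQL expression, endpoint, and window sizes are examples, not prescriptions), one might build a range query against Prometheus&#8217; HTTP API and min-max normalize the resulting series for training:<\/p>\n\n\n\n

```python
from urllib.parse import urlencode

def prometheus_range_url(base_url, promql, start, end, step="60s"):
    """Build a URL for Prometheus' /api/v1/query_range endpoint."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"

def min_max_normalize(series):
    """Scale a time series into [0, 1] for model training."""
    lo, hi = min(series), max(series)
    span = hi - lo or 1.0  # avoid division by zero on a flat series
    return [(v - lo) / span for v in series]

# Example query: cluster-wide CPU usage rate over a one-hour window.
url = prometheus_range_url(
    "http://prometheus:9090",
    'sum(rate(container_cpu_usage_seconds_total{namespace="default"}[5m]))',
    start=1750000000, end=1750003600,
)
print(min_max_normalize([2, 5, 8]))  # [0.0, 0.5, 1.0]
```

\n\n\n\n<p>A production pipeline would add error handling, gap filling, and feature engineering (lags, time-of-day encodings) on top of such raw series.<\/p>\n\n\n\n<p>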
The collected data should be preprocessed, cleaned, normalized, and transformed to create consistent and useful input for model training.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Model Training and Target Definition<\/h3>\n\n\n\n<p>The AI model can be trained to predict different kinds of targets, depending on the chosen scaling strategy:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Workload Forecasting:<\/strong> The model predicts workload indicators such as CPU demand or request volume. These predictions can then inform scaling decisions by existing autoscaling mechanisms in Kubernetes.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Direct Scaling Output:<\/strong> Alternatively, the model can be trained to output a desired scaling state directly, for example, the number of pods or nodes required under future conditions.<\/li>\n<\/ul>\n\n\n\n<p>For model selection, various approaches can be applied depending on the use case; suitable architectures, such as LSTMs, MLPs, CNNs, and ensembles, were outlined earlier in this article.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Model Deployment<\/h3>\n\n\n\n<p>Once trained, the AI models need to be deployed for real-time inference.
This can be achieved in several ways:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>In-Cluster Deployment:<\/strong> Deploy models as containerized services within the Kubernetes cluster using tools like TensorFlow Serving [26].<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>External Deployment:<\/strong> Host models on external platforms (cloud ML services, dedicated inference servers) that communicate with the cluster via APIs.<\/li>\n<\/ul>\n\n\n\n<p>The deployment choice depends on latency requirements, resource constraints, and security considerations.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Scaling Execution and Kubernetes Integration<\/h3>\n\n\n\n<p>Depending on the model output, different execution paths are required to apply scaling decisions within the Kubernetes environment. If the AI model forecasts workload metrics\u2014such as future CPU usage or incoming request rates\u2014these predictions can be fed into existing Kubernetes autoscalers by making them available through a custom metrics API. For example, a tool like KEDA (Kubernetes Event-Driven Autoscaler) can be used to expose these predicted values as custom metrics, which Kubernetes\u2019 Horizontal Pod Autoscaler (HPA) can then consume to make real-time scaling decisions [27]. Alternatively, if the model directly outputs target scaling values such as the desired number of pod replicas, a custom service or controller can be implemented that reads these predictions and applies them to the Kubernetes API by updating the replica count of deployments.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Continuous Inference and Feedback Loop<\/h3>\n\n\n\n<p>Once integrated, predictive autoscaling operates as a continuous loop. 
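<\/p>\n\n\n\n<p>A minimal sketch of one loop iteration (the forecast source, the per-pod capacity of 100 requests per second, and the apply callback are placeholders for whatever Steps 2 to 4 provide):<\/p>\n\n\n\n

```python
import math

def desired_replicas(predicted_rps, per_pod_rps=100, min_r=2, max_r=20):
    """Translate a workload forecast into a bounded replica count."""
    want = math.ceil(predicted_rps / per_pod_rps)
    return max(min_r, min(max_r, want))

def autoscale_step(fetch_forecast, apply_replicas):
    """One iteration of the predict-then-scale loop.

    fetch_forecast: callable returning the predicted requests/second
    apply_replicas: callable that applies the new replica count
    """
    replicas = desired_replicas(fetch_forecast())
    apply_replicas(replicas)
    return replicas

# Stubbed iteration: a forecast of 750 rps maps to 8 replicas.
print(autoscale_step(lambda: 750, lambda n: None))  # 8
```

\n\n\n\n<p>In a real cluster, the apply callback would patch the Deployment&#8217;s replica count through the Kubernetes API, or expose the predicted metric to KEDA\/HPA as described in Step 4, and the loop would run on a fixed schedule.<\/p>\n\n\n\n<p>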
The model consumes up-to-date metrics, generates forecasts (e.g., CPU demand or target replicas), and triggers scaling actions.<\/p>\n\n\n\n<p>The impact of each scaling action feeds back into the metrics pipeline, allowing the model to adapt to workload changes over time. As patterns evolve, periodic retraining with recent data keeps prediction quality high, enabling a self-correcting and adaptive autoscaling system.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion and Outlook<\/h2>\n\n\n\n<p>AI is fundamentally transforming Kubernetes auto-scaling, shifting it from a reactive, threshold-based mechanism to a proactive, predictive, and adaptive one. This shift brings significant benefits, including cost optimization through efficient resource utilization and improvements in application performance and availability. The importance of AI-based scaling in Kubernetes is rapidly growing, with many companies actively developing and deploying solutions that leverage these advanced capabilities. For instance, commercial solutions like StormForge [28], Wave Autoscale [29], and Dysnix PredictKube [30] already use machine learning to offer intelligent resource optimization. While adopting AI for Kubernetes auto-scaling presents challenges related to increased complexity, significant data requirements for training, and the need for specialized expertise, the potential rewards are immense and lay the groundwork for truly autonomous Kubernetes operations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<p>[1] Containers. Kubernetes. url: <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/containers\/\">https:\/\/kubernetes.io\/docs\/concepts\/containers\/<\/a> (visited on 05\/25\/2025).<br>[2] Pods. Kubernetes. url: <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/workloads\/pods\/\">https:\/\/kubernetes.io\/docs\/concepts\/workloads\/pods\/<\/a> (visited on 05\/25\/2025).<br>[3] Nodes. Kubernetes. 
url: <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/architecture\/nodes\/\">https:\/\/kubernetes.io\/docs\/concepts\/architecture\/nodes\/<\/a> (visited on 05\/25\/2025).<br>[4] Cluster Architecture. Kubernetes. url: <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/architecture\/\">https:\/\/kubernetes.io\/docs\/concepts\/architecture\/<\/a> (visited on 05\/25\/2025).<br>[5] Horizontal Pod Autoscaling. Kubernetes. url: <a href=\"https:\/\/kubernetes.io\/docs\/tasks\/run-application\/horizontal-pod-autoscale\/\">https:\/\/kubernetes.io\/docs\/tasks\/run-application\/horizontal-pod-autoscale\/<\/a> (visited on 05\/25\/2025).<br>[6] Autoscaler\/README.Md at Master \u00b7 Kubernetes\/Autoscaler. url: <a href=\"https:\/\/github.com\/kubernetes\/autoscaler\/blob\/master\/README.md\">https:\/\/github.com\/kubernetes\/autoscaler\/blob\/master\/README.md<\/a> (visited on 05\/25\/2025).<br>[7] Chapter 2: Horizontal Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/horizontal\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/horizontal\/<\/a> (visited on 07\/22\/2025).<br>[8] Chapter 1: Vertical Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/vertical\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/vertical\/<\/a> (visited on 07\/22\/2025).<br>[9] Chapter 3: Cluster Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/cluster\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/cluster\/<\/a> (visited on 07\/22\/2025).<br>[10] Autoscaler\/Cluster-Autoscaler\/README.Md at Master \u00b7 Kubernetes\/Autoscaler. GitHub. 
url: <a href=\"https:\/\/github.com\/kubernetes\/autoscaler\/blob\/master\/cluster-autoscaler\/README.md\">https:\/\/github.com\/kubernetes\/autoscaler\/blob\/master\/cluster-autoscaler\/README.md<\/a> (visited on 07\/22\/2025).<br>[11] Pranav Murthy. \u201cAI-Powered Predictive Scaling in Cloud Computing: Enhancing Efficiency through Real-Time Workload Forecasting\u201d. In: Iconic Research And Engineering Journals 5.4 (Nov. 2021). issn: 2456-8880. <br>[12] Raymond Ajax, Ayuns Luz, and Crystal Herry. \u201cAutoscaling in Kubernetes: Horizontal Pod Autoscaling and Cluster Autoscaling\u201d.<br>[13] Vertical Pod Autoscaling \u2014 Google Kubernetes Engine (GKE) \u2014 Google Cloud. url: <a href=\"https:\/\/cloud.google.com\/kubernetes-engine\/docs\/concepts\/verticalpodautoscaler\">https:\/\/cloud.google.com\/kubernetes-engine\/docs\/concepts\/verticalpodautoscaler<\/a> (visited on 05\/25\/2025).<br>[14] Sima Siami Namini, Neda Tavakoli, and Akbar Siami Namin. \u201cA Comparison of ARIMA and LSTM in Forecasting Time Series\u201d. doi: 10.1109\/ICMLA.2018.00227.<br>[15] Vaia Kontopoulou et al. \u201cA Review of ARIMA vs. Machine Learning Approaches for Time Series Forecasting in Data Driven Networks\u201d. In: Future Internet 15 (July 2023). doi: 10.3390\/fi15080255.<br>[16] Charan Shankar Kummarapurugu. \u201cAI-Driven Predictive Scaling for Multi-Cloud Resource Management: Using Adaptive Forecasting, Cost-Optimization, and Auto-Tuning Algorithms\u201d. In: International Journal of Science and Research (IJSR) 13 (Oct. 2024). doi: 10.21275\/SR241015062841.<br>[17] Sepp Hochreiter and J\u00fcrgen Schmidhuber. \u201cLong Short-Term Memory\u201d. In: Neural Computation 9 (Nov. 1997).<br>[18] Dinh-Thuan Nguyen and Huu-Vinh Le. \u201cPredicting the Price of Bitcoin Using Hybrid ARIMA and Machine Learning\u201d. In: Nov. 2019, isbn: 978-3-030-35652-1. doi: 10.1007\/978-3-030-35653-8_49.<br>[19] Lawrence Emma. 
\u201cAI-POWERED CLOUD RESOURCE MANAGEMENT: MACHINE LEARNING FOR DYNAMIC AUTOSCALING AND COST OPTIMIZATION\u201d.<br>[20] Kubernetes Cost Optimization. stormforge.io. url: <a href=\"https:\/\/stormforge.io\/solution-brief\/kubernetes-cost-optimization\/\">https:\/\/stormforge.io\/solution-brief\/kubernetes-cost-optimization\/<\/a> (visited on 05\/31\/2025).<br>[21] Autonomous Cloud Cost Optimization for Modern Apps \u2014 Reduce Costs by 30-50%. url: <a href=\"https:\/\/www.sedai.io\/use-cases\/cloud-cost-optimization\">https:\/\/www.sedai.io\/use-cases\/cloud-cost-optimization<\/a> (visited on 05\/31\/2025).<br>[22] Charan Shankar Kummarapurugu. \u201cAI-Driven Predictive Scaling for Multi-Cloud Resource Management: Using Adaptive Forecasting, Cost-Optimization, and Auto-Tuning Algorithms\u201d. In: International Journal of Science and Research (IJSR) 13.10 (Oct. 2024). issn: 23197064. doi: 10.21275\/SR241015062841. (visited on 05\/31\/2025).<br>[23] Ravi Pulle, Gaurav Anand, and Satish Kumar. \u201cMonitoring Performance Computing Environments And Autoscaling Using AI\u201d. In: International Research Journal of Modernization in Engineering Technology and Science 5.5 (2023).<br>[24] Shraddha Gajjar. AI-Driven Auto-Scaling in Cloud Environments. Feb. 2025. doi: 10.13140\/RG.2.2.12666.61125.<br>[25] Overview \u2014 Prometheus. url: <a href=\"https:\/\/prometheus.io\/docs\/introduction\/overview\/\">https:\/\/prometheus.io\/docs\/introduction\/overview\/<\/a> (visited on 06\/01\/2025).<br>[26] TensorFlow Serving with Docker \u2014 TFX. url: <a href=\"https:\/\/www.tensorflow.org\/tfx\/serving\/docker\">https:\/\/www.tensorflow.org\/tfx\/serving\/docker<\/a> (visited on 06\/01\/2025).<br>[27] KEDA \u2014 KEDA Concepts. KEDA. url: <a href=\"https:\/\/keda.sh\/docs\/2.17\/concepts\/\">https:\/\/keda.sh\/docs\/2.17\/concepts\/<\/a> (visited on 07\/22\/2025).<br>[28] Introducing Intelligent Bi-dimensional Autoscaling. stormforge.io. 
url: <a href=\"https:\/\/stormforge.io\/blog\/introducing-intelligent-bi-dimensional-autoscaling\/\">https:\/\/stormforge.io\/blog\/introducing-intelligent-bi-dimensional-autoscaling\/<\/a> (visited on 07\/22\/2025).<br>[29] Stclab Inc. Introduction \u2013 Wave Autoscale. Feb. 12, 2025. url: <a href=\"https:\/\/waveautoscale.com\/docs\/introduction\">https:\/\/waveautoscale.com\/docs\/introduction<\/a> (visited on 07\/22\/2025).<br>[30] KEDA \u2014 Introducing PredictKube &#8211; an AI-based Predictive Autoscaler for KEDA Made by Dysnix. KEDA. url: <a href=\"https:\/\/keda.sh\/blog\/2022-02-09-predictkube-scaler\/\">https:\/\/keda.sh\/blog\/2022-02-09-predictkube-scaler\/<\/a> (visited on 07\/22\/2025).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">List of Figures<\/h2>\n\n\n\n<p><strong>Figure 1<\/strong>: <em>Source: <em>Kubernetes Networking Guide for Beginners &#8211; Kubernetes Book<\/em>. <a href=\"https:\/\/matthewpalmer.net\/kubernetes-app-developer\/articles\/kubernetes-networking-guide-beginners.html\">https:\/\/matthewpalmer.net\/kubernetes-app-developer\/articles\/kubernetes-networking-guide-beginners.html<\/a> (accessed 2025-07-23).<\/em><br><strong>Figure 2<\/strong>:<em> <em>Source: Chapter 2: Horizontal Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/horizontal\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/horizontal\/<\/a> (visited on 07\/22\/2025).<\/em><\/em><br><strong>Figure 3<\/strong>: <em>Source: Chapter 1: Vertical Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/vertical\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/vertical\/<\/a> (visited on 07\/22\/2025).<\/em><br><strong>Figure 4<\/strong>: <em>Source: Chapter 3: Cluster Autoscaling &#8211; Kubernetes Guides &#8211; Apptio. Apr. 
16, url: <a href=\"https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/cluster\/\">https:\/\/www.apptio.com\/topics\/kubernetes\/autoscaling\/cluster\/<\/a> (visited on 07\/22\/2025).<\/em><br><strong>Figure 5<\/strong>: <em>Source: Gutman, D. &amp; Sirota, O. (2023). Proactive automatic upscaling for Kubernetes [in Ukrainian]. Adaptive Systems of Automatic Control, 1, 32-38. doi: 10.20535\/1560-8956.42.2023.278925.<\/em><br><strong>Figure 6<\/strong>: <em>Source: Gutman, D. &amp; Sirota, O. (2023). Proactive automatic upscaling for Kubernetes [in Ukrainian]. Adaptive Systems of Automatic Control, 1, 32-38. doi: 10.20535\/1560-8956.42.2023.278925.<\/em><br><strong>Figure 7<\/strong>: <em>Source: Shraddha Gajjar. AI-Driven Auto-Scaling in Cloud Environments. Feb. 2025. doi: 10.13140\/RG.2.2.12666.61125.<\/em><br><strong>Figure 8<\/strong>: <em>Source: <\/em>Shraddha Gajjar. AI-Driven Auto-Scaling in Cloud Environments. Feb. 2025. 
doi: 10.13140\/RG.2.2.12666.61125.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Note:&nbsp;This blog post was written for the module Enterprise IT (113601a) in the summer semester of 2025 Introduction Kubernetes has become the leading open-source platform for managing containerized applications. Its ability to automate deployment, scaling, and operations helps teams efficiently manage microservices architectures and dynamic cloud workloads. A cornerstone of efficient Kubernetes cluster management is [&hellip;]<\/p>\n","protected":false},"author":1261,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"[]"},"categories":[1],"tags":[1031,154,57],"ppma_author":[1104],"class_list":["post-27789","post","type-post","status-publish","format-standard","hentry","category-allgemein","tag-enterprise-it","tag-kubernetes","tag-machine-learning"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":10190,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2020\/03\/01\/autoscaling-of-docker-containers-in-google-kubernetes-engine\/","url_meta":{"origin":27789,"position":0},"title":"Autoscaling of Docker Containers  in Google Kubernetes Engine","author":"de032","date":"1. March 2020","format":false,"excerpt":"In this blog post we are taking a look at scaling possibilities within Kubernetes in a cloud environment. 
We are going to present and discuss various options that all have the same target: increase the availability of a service.","rel":"","context":"In &quot;Allgemein&quot;","block_context":{"text":"Allgemein","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/allgemein\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2020\/03\/1052ebad-d01f-4803-bde6-e943c4598ef9.jpeg?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":27583,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/02\/28\/einsatz-von-kunstlicher-intelligenz-zur-automatischen-skalierung-von-kubernetes-clustern\/","url_meta":{"origin":27789,"position":1},"title":"Einsatz von K\u00fcnstlicher Intelligenz zur automatischen Skalierung von Kubernetes-Clustern","author":"Lars Gerigk","date":"28. February 2025","format":false,"excerpt":"Anmerkung:\u00a0Dieser Blogpost wurde f\u00fcr das Modul Enterprise IT (113601a) verfasst.Aus Gr\u00fcnden der besseren Lesbarkeit wird in dieser Arbeit auf eine geschlechtsneutrale Differenzierung verzichtet. S\u00e4mtliche Personenbezeichnungen gelten gleicherma\u00dfen f\u00fcr alle Geschlechter. 
Author: Hannah Holzheu