Cost-Optimal Microservices Deployment with Cluster Autoscaling and Spot Pricing

Read original: arXiv:2405.12311 - Published 5/22/2024 by Dasith Edirisinghe, Kavinda Rajapakse, Pasindu Abeysinghe, Sunimal Rathnayake

Overview

Microservices architecture is well suited to cloud-based development and deployment, offering benefits such as agility and efficiency.
Microservices are often deployed using containers and container orchestration systems, which provide tools for resource management and automating orchestration processes.
Cloud providers offer transient pricing options like AWS Spot Pricing, which can significantly reduce cloud costs, but the dynamic nature of resource demand and the abrupt termination of spot VMs make those savings hard to realize in practice.
Containerization and container orchestration systems open new ways to optimize the cost of microservices deployments by leveraging spot pricing while meeting application and business goals.

Plain English Explanation

The paper introduces a solution called SpotKube, which is designed to help companies running microservices-based applications on public clouds reduce their costs. Microservices are a way of building software where the application is divided into smaller, independent services that can be developed and deployed separately.

Companies often use containers and container orchestration systems like Kubernetes to manage and run their microservices. Cloud providers also offer special "spot" pricing options, which can be much cheaper than regular cloud instances but come with the risk of the instances being suddenly terminated.

SpotKube is an open-source tool that automatically scales the Kubernetes cluster used to run the microservices, using a genetic algorithm to find the optimal configuration that minimizes costs while still meeting the performance requirements of the application. This helps companies take advantage of the cost savings from spot pricing while ensuring their microservices applications continue to run smoothly.

Technical Explanation

The paper proposes SpotKube, an open-source, Kubernetes-based solution for cost optimization in microservices deployments on public clouds with spot pricing options. SpotKube uses a genetic algorithm-based optimization approach to autoscale the Kubernetes cluster, ensuring cost-effective deployment while meeting application performance requirements and handling abrupt termination of nodes.
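To make the idea of a genetic algorithm searching for a cheap cluster configuration more concrete, here is a minimal sketch. It is not SpotKube's actual algorithm: the instance catalogue, the prices, the penalty weight, and the names `InstanceType`, `fitness`, and `optimize` are all assumptions chosen for illustration. Each candidate ("genome") is a list of node counts per instance type, and fitness is hourly cost plus a large penalty for unmet CPU or memory demand.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int
    hourly_price: float  # hypothetical spot price in USD, not a real quote


# Hypothetical catalogue; names and prices are illustrative only.
CATALOGUE = [
    InstanceType("small", 2, 4, 0.012),
    InstanceType("medium", 4, 8, 0.026),
    InstanceType("large", 8, 16, 0.047),
]
MAX_NODES_PER_TYPE = 10


def cost(genome):
    """Hourly cost of a genome (one node count per catalogue entry)."""
    return sum(n * t.hourly_price for n, t in zip(genome, CATALOGUE))


def fitness(genome, cpu_demand, mem_demand):
    """Lower is better: hourly cost plus a heavy penalty for unmet demand."""
    cpu = sum(n * t.vcpus for n, t in zip(genome, CATALOGUE))
    mem = sum(n * t.memory_gib for n, t in zip(genome, CATALOGUE))
    shortfall = max(0, cpu_demand - cpu) + max(0, mem_demand - mem)
    return cost(genome) + 10.0 * shortfall


def optimize(cpu_demand, mem_demand, generations=200, pop_size=40, seed=0):
    """Evolve node counts with elitism, one-point crossover, and mutation."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, MAX_NODES_PER_TYPE) for _ in CATALOGUE]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, cpu_demand, mem_demand))
        elite = population[: pop_size // 4]  # survivors are carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(CATALOGUE))
            child = a[:cut] + b[cut:]  # one-point crossover builds a new list
            if rng.random() < 0.3:  # mutation: nudge one node count by +/- 1
                i = rng.randrange(len(child))
                child[i] = max(0, min(MAX_NODES_PER_TYPE, child[i] + rng.choice((-1, 1))))
            children.append(child)
        population = elite + children
    return min(population, key=lambda g: fitness(g, cpu_demand, mem_demand))
```

Calling `optimize(20, 40)` returns a list of node counts, one per catalogue entry, for a demand of 20 vCPUs and 40 GiB of memory. A genetic algorithm is a heuristic, so the result is a good candidate rather than a guaranteed optimum, and a real autoscaler would also need to account for pod placement and live spot prices.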

The system analyzes the characteristics of the microservices-based application and recommends the optimal configuration for resource allocation to the cluster. This is achieved through an elastic cluster autoscaler powered by the optimization algorithm, which aims to minimize costs by leveraging spot pricing while maintaining system availability.
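One simple way to picture "analyzing application characteristics" is to aggregate per-service resource requests into a cluster-wide demand figure that an optimizer can then try to satisfy. The sketch below assumes each service is described by a per-replica CPU and memory request and a replica count; the `ServiceProfile` type, the `cluster_demand` function, and the 20% headroom default are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    name: str
    cpu_millicores: int  # CPU request per replica
    memory_mib: int      # memory request per replica
    replicas: int


def cluster_demand(services, headroom_percent=20):
    """Return (cpu_millicores, memory_mib) needed, rounded up, with headroom."""
    cpu = sum(s.cpu_millicores * s.replicas for s in services)
    mem = sum(s.memory_mib * s.replicas for s in services)
    factor = 100 + headroom_percent
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-cpu * factor // 100), -(-mem * factor // 100)


# Illustrative services; the names and numbers are made up.
services = [
    ServiceProfile("frontend", 250, 256, 3),
    ServiceProfile("cart", 500, 512, 2),
    ServiceProfile("catalog", 200, 128, 4),
]
demand = cluster_demand(services)
```

Keeping the headroom as an integer percentage is a deliberate choice here: it keeps the arithmetic exact, so the same inputs always produce the same demand figure.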

The authors implement and evaluate SpotKube using representative microservices-based applications in a real public cloud setup, comparing its effectiveness against alternative optimization strategies. The results demonstrate the benefits of their approach in reducing cloud costs while maintaining application performance, even in the face of sudden changes in resource demand and unexpected termination of spot instances.
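The spot-termination scenario mentioned above can be made concrete with a small capacity check: does the cluster still cover demand if its largest spot nodes disappear at once? The sketch below is an assumption-laden illustration, not the paper's evaluation method. It looks only at CPU, ignores how pods are packed onto nodes, and the `Node` type and `survives_spot_loss` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    vcpus: int
    is_spot: bool


def survives_spot_loss(nodes, cpu_demand, lost=1):
    """True if CPU demand is still covered after the `lost` largest spot nodes vanish."""
    spot_sizes = sorted((n.vcpus for n in nodes if n.is_spot), reverse=True)
    total = sum(n.vcpus for n in nodes)
    worst_case_loss = sum(spot_sizes[:lost])  # pessimistic: the biggest nodes go first
    return total - worst_case_loss >= cpu_demand


# Illustrative cluster: one on-demand node plus three spot nodes.
cluster = [
    Node("on-demand-1", 4, False),
    Node("spot-1", 8, True),
    Node("spot-2", 4, True),
    Node("spot-3", 4, True),
]
```

With a demand of 12 vCPUs, this cluster tolerates losing one spot node but not two, which is the kind of trade-off between cost and availability that an autoscaler for spot instances has to weigh.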

Critical Analysis

The paper presents a novel and practical solution to the challenge of optimizing costs for microservices-based applications deployed on public clouds with spot pricing options. By leveraging containerization and container orchestration systems like Kubernetes, the authors show how it is possible to adaptively manage resources and take advantage of transient pricing while ensuring application performance and availability.

One potential limitation of the research is that it only considers a single public cloud provider (AWS) and its spot pricing model. It would be interesting to see how the approach would fare in a multi-cloud environment with different spot pricing mechanisms, as discussed in the related work on cost minimization in multi-cloud systems.

Additionally, the paper does not delve into the potential impact of decentralized, mobility-aware resource management on microservices deployments, which could be a consideration for certain application domains.

Further research could also investigate the integration of workload prediction techniques to enhance the optimization capabilities of SpotKube, potentially improving its ability to anticipate and respond to changes in resource demand.

Conclusion

The paper introduces SpotKube, a solution that helps companies running microservices-based applications on public clouds reduce their costs by leveraging spot pricing options. Using a genetic algorithm-based optimization approach within a Kubernetes-based architecture, SpotKube automatically scales the cluster and allocates resources in a way that minimizes costs while maintaining application performance and availability.

This research demonstrates the potential of combining containerization, container orchestration, and cost optimization techniques to address the challenges of deploying microservices in the cloud. As more businesses migrate their applications to the cloud, solutions like SpotKube could play a crucial role in helping them optimize their cloud spending and ensure the long-term viability of their microservices-based software systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Cost-Optimal Microservices Deployment with Cluster Autoscaling and Spot Pricing

Dasith Edirisinghe, Kavinda Rajapakse, Pasindu Abeysinghe, Sunimal Rathnayake

Microservices architecture has been established as an ideal software architecture for cloud-based software development and deployment, offering many benefits such as agility and efficiency. Microservices are often associated with containers and container orchestration systems for deployment, as containerization provides convenient tools and techniques for resource management, including the automation of orchestration processes. Among the factors that make the cloud suitable for commercial software deployment, transient pricing options like AWS Spot Pricing are particularly attractive as they allow consumers to significantly reduce cloud costs. However, the dynamic nature of resource demand and the abrupt termination of spot VMs make transient pricing challenging. Nonetheless, containerization and container orchestration systems open new avenues to optimize the cost of microservices deployments by leveraging spot pricing on the public cloud while achieving application and business goals. We propose SpotKube, an open-source, Kubernetes-based, application-aware, genetic algorithm-based solution for cost optimization, which autoscales clusters for microservices-based applications hosted on public clouds with spot pricing options. SpotKube analyzes application characteristics and recommends the optimal configuration for resource allocation to the cluster. It consists of an elastic cluster autoscaler powered by an optimization algorithm that ensures cost-effective microservices deployment while meeting application performance requirements and handling abrupt termination of nodes, thereby minimizing the impact on system availability. We implement and evaluate SpotKube with representative microservices-based applications in a real public cloud setup, demonstrating the effectiveness of our approach against alternative optimization strategies.

5/22/2024

Cost Minimization in Multi-cloud Systems with Runtime Microservice Re-orchestration

Marco Zambianco, Silvio Cretti, Domenico Siracusa

Multi-cloud systems facilitate a cost-efficient and geographically-distributed deployment of microservice-based applications by temporarily leasing virtual nodes with diverse pricing models. To preserve the cost-efficiency of multi-cloud deployments, it is essential to redeploy microservices onto the available nodes according to a dynamic resource configuration, which is often performed to better accommodate workload variations. However, this approach leads to frequent service disruption since applications are continuously shut down and redeployed in order to apply the new resource assignment. To overcome this issue, we propose a re-orchestration scheme that migrates microservices at runtime based on a rolling update scheduling logic. Specifically, we propose an integer linear optimization problem that minimizes the cost associated with multi-cloud virtual nodes and that ensures that delay-sensitive microservices are co-located on the same regional cluster. The resulting rescheduling order guarantees no service disruption by repacking microservices between the available nodes without the need to turn off the outdated microservice instance before redeploying the updated version. In addition, we propose a two-step heuristic scheme that effectively approximates the optimal solution at the expense of close-to-zero service disruption and QoS violation probability. Results show that the proposed schemes achieve better performance in terms of cost mitigation, low service disruption and low QoS violation probability compared to baseline schemes replicating Kubernetes scheduler functionalities.

5/9/2024

StatuScale: Status-aware and Elastic Scaling Strategy for Microservice Applications

Linfeng Wen, Minxian Xu, Sukhpal Singh Gill, Muhammad Hafizhuddin Hilman, Satish Narayana Srirama, Kejiang Ye, Chengzhong Xu

Microservice architecture has transformed traditional monolithic applications into lightweight components. Scaling these lightweight microservices is more efficient than scaling servers. However, scaling microservices still faces challenges resulting from unexpected spikes or bursts of requests, which are difficult to detect and can degrade performance instantaneously. To address this challenge and ensure the performance of microservice-based applications, we propose a status-aware and elastic scaling framework called StatuScale, which is based on a load status detector that can select appropriate elastic scaling strategies for differentiated resource scheduling in vertical scaling. Additionally, StatuScale employs a horizontal scaling controller that utilizes comprehensive evaluation and resource reduction to manage the number of replicas for each microservice. We also present a novel metric named correlation factor to evaluate the resource usage efficiency. Finally, we use Kubernetes, an open-source container orchestration and management platform, and realistic traces from Alibaba to validate our approach. The experimental results have demonstrated that the proposed framework can reduce the average response time in the Sock-Shop application by 8.59% to 12.34%, and in the Hotel-Reservation application by 7.30% to 11.97%, decrease service level objective violations, and offer better performance in resource usage compared to baselines.

7/16/2024

DRPC: Distributed Reinforcement Learning Approach for Scalable Resource Provisioning in Container-based Clusters

Haoyu Bai, Minxian Xu, Kejiang Ye, Rajkumar Buyya, Chengzhong Xu

Microservices have transformed monolithic applications into lightweight, self-contained, and isolated application components, establishing themselves as a dominant paradigm for application development and deployment in public clouds such as Google and Alibaba. Autoscaling emerges as an efficient strategy for managing resources allocated to microservices' replicas. However, the dynamic and intricate dependencies within microservice chains present challenges to the effective management of scaled microservices. Additionally, the centralized autoscaling approach can encounter scalability issues, especially in the management of large-scale microservice-based clusters. To address these challenges and enhance scalability, we propose an innovative distributed resource provisioning approach for microservices based on the Twin Delayed Deep Deterministic Policy Gradient algorithm. This approach enables effective autoscaling decisions and decentralizes responsibilities from a central node to distributed nodes. Comparative results with state-of-the-art approaches, obtained from a realistic testbed and traces, indicate that our approach reduces the average response time by 15% and the number of failed requests by 24%, validating improved scalability as the number of requests increases.

7/16/2024