Evaluating Serverless Machine Learning Performance on Google Cloud Run

arXiv:2406.16250

Published 6/26/2024 by Prerana Khatiwada, Pranjal Dhakal


Abstract

End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions known as microservices. Machine learning tools have proven reliably useful, and demand for services built with them is growing at scale. Serverless platforms appear well suited to hosting these machine learning services for large-scale applications: they are known for cost efficiency, fault tolerance, resource scaling, robust APIs for communication, and global reach. However, these platforms were originally designed to host web services, and machine learning services differ from web services. Our study aims to understand how serverless platforms handle machine learning workloads. We examine machine learning performance on one such platform, Google Cloud Run, a GPU-less infrastructure that is not designed for machine learning application deployment.


Overview

Serverless platforms, also known as functions-as-a-service (FaaS), allow users to run individual functions called microservices without managing infrastructure
Machine learning (ML) services are in high demand and could benefit from the cost-efficiency, fault tolerance, and scalability of serverless platforms
However, serverless platforms were originally designed for web services, not ML workloads, so it's unclear how well they handle ML tasks
This study examines the performance of ML workloads on Google Cloud Run, a serverless platform not specifically designed for ML

Plain English Explanation

Serverless platforms let people run small, individual computer programs (called "functions" or "microservices") without having to manage the servers or infrastructure needed to run those programs. This can be cheaper and more reliable than traditional server-based approaches.

Machine learning tools are increasingly useful and in high demand, so there's interest in running ML services on serverless platforms. Serverless platforms are known for being cost-effective, fault-tolerant, and able to automatically scale resources. However, serverless platforms were originally made for simple web applications, not complex ML workloads.

This study looks at how well one specific serverless platform, Google Cloud Run, handles ML tasks. As evaluated in the paper, Cloud Run is a GPU-less infrastructure: it offers none of the specialized hardware, such as graphics processing units (GPUs), that ML workloads often rely on. The researchers wanted to understand how this type of serverless platform performs for ML even without that hardware.

Technical Explanation

The authors examined the performance of machine learning workloads on Google Cloud Run, a serverless platform that does not provide GPU hardware typically used for ML tasks. Serverless platforms like Cloud Run promise benefits such as lower hosting costs, high availability, fault tolerance, and dynamic scaling for hosting individual microservices.

While machine learning tools are seen as reliably useful, and the demand for ML services is increasing, the serverless platforms were originally designed to host traditional web services, not specialized ML workloads. The authors aimed to understand how these serverless platforms can handle machine learning tasks, which have different performance characteristics than the web applications the platforms were built for.
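
To make the setting concrete, here is a minimal sketch of the kind of CPU-only inference microservice such a platform might host. It is not the authors' code. The linear "model" (`WEIGHTS`, `BIAS`), the `predict` function, and the JSON request shape are all hypothetical stand-ins. The only platform detail assumed is that the container listens on the port given in the `PORT` environment variable, as Cloud Run containers do.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: fixed weights for a linear scorer.
WEIGHTS = [0.5, -0.25, 2.0]
BIAS = 0.1


def predict(features):
    """Return a linear score for one feature vector (plain CPU arithmetic)."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


class PredictHandler(BaseHTTPRequestHandler):
    """Handle POST requests whose JSON body looks like {"features": [...]}."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            body = {"score": predict(payload["features"])}
            status = 200
        except (KeyError, TypeError, ValueError) as exc:
            # Malformed JSON, a missing key, or a wrong-sized vector is a client error.
            body = {"error": str(exc)}
            status = 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


if __name__ == "__main__":
    # Serverless container platforms typically inject the listening port.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

A real service would load a trained model once at startup instead of using fixed weights. That loading step is one place where ML workloads can differ from typical web handlers, since model size affects both memory use and startup time.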

Critical Analysis

The paper provides a useful exploration of how a serverless platform designed for web services, rather than ML, performs on machine learning workloads. However, the study is limited to a single platform, Google Cloud Run, which lacks GPU hardware typically used for ML.

The results may not generalize to other serverless offerings that provide GPU support or are designed more explicitly for ML use cases. Further research is needed to understand how a broader range of serverless platforms, with varying hardware and architectural choices, handle diverse ML workloads and models.

Additionally, the paper does not deeply examine potential issues or limitations that may arise when running ML services on serverless infrastructure, such as cold starts, resource constraints, or challenges with model deployment and updates.
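
As an illustration of how cold starts could be separated from steady-state latency in a follow-up measurement, the sketch below times repeated calls and reports the first call separately from the rest. It is an assumption-laden example, not the paper's methodology: `time_calls`, `summarize`, and the rule that the first sample is the cold one are hypothetical choices, and `invoke` stands for any zero-argument callable that sends one request to a deployed service.

```python
import math
import statistics
import time


def time_calls(invoke, n):
    """Call invoke() n times and return each call's wall-clock latency in ms."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        invoke()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Treat the first sample as the cold start and the rest as warm calls."""
    if len(latencies_ms) < 2:
        raise ValueError("need one cold sample and at least one warm sample")
    cold = latencies_ms[0]
    warm = sorted(latencies_ms[1:])
    # Nearest-rank 95th percentile over the warm samples.
    rank = max(1, math.ceil(0.95 * len(warm)))
    return {
        "cold_ms": cold,
        "warm_median_ms": statistics.median(warm),
        "warm_p95_ms": warm[rank - 1],
    }
```

The first-sample rule only holds if the service has scaled to zero before the run begins. A more careful experiment would also repeat the whole run several times, since a single cold sample says little about variance.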

Conclusion

This study provides an initial look at the performance of machine learning workloads on a serverless platform, Google Cloud Run, that was not designed with ML in mind. The results suggest that even without specialized hardware, serverless platforms can be a viable option for hosting certain ML services, particularly those with modest resource requirements.

However, further research is needed to understand how a broader range of serverless offerings, with different architectural choices, handle more diverse ML tasks and models. Careful consideration of the unique characteristics and constraints of serverless platforms will be important as ML services increasingly seek to leverage their cost-efficiency, scalability, and fault tolerance.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
