A Feature Dataset of Microservices-based Systems

Read original: arXiv:2404.01789 - Published 4/3/2024 by Weipan Yang, Yongchao Xing, Yiming Lyu, Zhihao Liang, Zhiying Tu

Overview

Microservice architecture has become a dominant approach in the software industry.
Poor practices in the design and development of microservices are known as "microservice bad smells", and they signal potential problems in a system.
Detecting these bad smells relies on data about microservice features, but there is a lack of open-source datasets for this purpose.
This paper aims to address this research gap by creating an open-source dataset of microservice features.

Plain English Explanation

Microservices are a way of building software where the entire application is broken down into smaller, independent services. This approach has become very popular, as it can offer benefits like flexibility, scalability, and faster development cycles. However, if these microservices are not designed well, they can start to exhibit certain "bad smells" - signs that something might be wrong with the way the system is structured.

Researchers who study these microservice bad smells need access to data about the features and characteristics of real-world microservice systems. But there hasn't been a good open-source dataset available for this purpose. This makes it harder for researchers to thoroughly investigate microservice bad smells and find ways to detect and prevent them.

The researchers in this paper set out to create that missing dataset. They collected a number of open-source microservice systems that use the popular Spring Cloud framework. They then developed a tool to automatically extract key information and features about those microservices. After verifying the data, they packaged it up into an open-source dataset that researchers can use for their work on microservice bad smells.

Technical Explanation

The researchers first identified the need for an open-source dataset of microservice features to support research on microservice bad smells. They selected Spring Cloud as the target framework because of its widespread adoption in the industry.

They then established feature metrics based on the architecture and interactions of Spring Boot style microservices, and developed a custom extraction program that analyzes Spring Cloud-based systems and collects those metrics. Examples of such features could include the number of services, the dependencies between them, and their communication patterns.
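
To make the idea of feature extraction concrete, here is a minimal Python sketch. It is not the authors' extraction program. It assumes that services call each other through Spring Cloud OpenFeign clients declared with the `@FeignClient` annotation, and it only recognizes the simplest forms of that annotation. The `sources` mapping, the function names, and the `fan_in`/`fan_out` features are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Matches @FeignClient("x"), @FeignClient(name = "x") and @FeignClient(value = "x").
# Assumption: the service name is the first attribute; other layouts are ignored.
FEIGN_PATTERN = re.compile(
    r'@FeignClient\(\s*(?:(?:name|value)\s*=\s*)?"([^"]+)"'
)


def extract_dependencies(sources):
    """Map each service to the sorted list of services it calls via Feign.

    `sources` is a hypothetical input: service name -> list of Java source texts.
    """
    deps = defaultdict(set)
    for service, files in sources.items():
        deps[service]  # make sure services without outgoing calls still appear
        for text in files:
            for target in FEIGN_PATTERN.findall(text):
                deps[service].add(target)
    return {service: sorted(targets) for service, targets in deps.items()}


def service_features(deps):
    """Derive two illustrative per-service features from the dependency map."""
    rows = []
    for service, targets in sorted(deps.items()):
        # fan_in: how many other services declare a Feign client for this one
        fan_in = sum(service in other for name, other in deps.items() if name != service)
        rows.append({"service": service, "fan_out": len(targets), "fan_in": fan_in})
    return rows
```

A real extractor would more likely parse the Java source and configuration files properly instead of relying on a regular expression; the sketch only shows the shape of the task, which is turning source code into per-service feature rows.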

The researchers applied this extraction program to a collection of open-source microservice systems they had gathered. They manually verified the extracted data to ensure its accuracy, and then packaged it all into a CSV dataset that can be freely shared and used by other researchers.
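
As a rough illustration of the packaging step, the sketch below writes per-service feature rows to CSV text and reads them back using Python's standard `csv` module. The column names in `FIELDS` are assumptions made for this example and are not the schema of the published dataset.

```python
import csv
import io

# Hypothetical column names; the published dataset may use a different schema.
FIELDS = ["system", "service", "fan_in", "fan_out"]


def to_csv(rows):
    """Serialize a list of feature dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into dictionaries, restoring the integer columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "system": row["system"],
            "service": row["service"],
            "fan_in": int(row["fan_in"]),
            "fan_out": int(row["fan_out"]),
        }
        for row in reader
    ]
```

A flat file of this kind is easy to load into spreadsheets or data-analysis tools, which may be one reason a CSV release suits a research dataset.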

The resulting dataset provides a valuable resource for studying microservice bad smells. It contains real-world data that can be used to develop and test detection techniques, ultimately helping to improve the design and quality of microservice-based applications.
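
As one example of how feature data might feed a detection technique, the sketch below checks a service dependency map for a cycle, a pattern that microservice smell catalogues often describe as a cyclic dependency. This is an illustrative detector written for this summary, not a method from the paper, and the input format (service name mapped to the list of services it calls) is an assumption.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of services, or None if there is none.

    `deps` maps each service name to the list of services it calls.
    Uses a depth-first search with three node states.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {service: WHITE for service in deps}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for nxt in deps.get(node, []):
            if state.get(nxt, WHITE) == GREY:
                # nxt is already on the current path, so the path closes a cycle
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, WHITE) == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for service in sorted(deps):
        if state[service] == WHITE:
            cycle = visit(service)
            if cycle:
                return cycle
    return None
```

For instance, `find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})` returns `["a", "b", "c", "a"]`, while a map with no circular calls returns `None`.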

Critical Analysis

The researchers acknowledge that their dataset is specific to microservices built using the Spring Cloud framework. This means the findings and insights derived from analyzing the data may not generalize perfectly to microservices built with other technologies.

Additionally, the dataset only includes open-source systems, which may not be representative of proprietary, enterprise-scale microservice architectures. Further research would be needed to validate the dataset's applicability in those contexts.

That said, the availability of this open-source dataset is a valuable contribution to the field. It provides a foundation for researchers to build upon, and the extraction tool developed by the authors can likely be adapted to work with other microservice frameworks as well.

Conclusion

This research paper addresses a gap in the study of microservice bad smells by creating an open-source dataset of microservice feature data. The authors believe that both the dataset and the extraction program have the potential to contribute to research in this area.

By providing a common dataset for researchers to work with, it becomes easier to develop and compare different techniques for detecting microservice bad smells. This, in turn, can lead to better practices and patterns for designing high-quality, maintainable microservice-based applications.

Overall, this work represents an important step forward in the study of microservice architecture and its associated challenges and best practices.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

A Feature Dataset of Microservices-based Systems

Weipan Yang, Yongchao Xing, Yiming Lyu, Zhihao Liang, Zhiying Tu

Microservice architecture has become a dominant architectural style in the service-oriented software industry. Poor practices in the design and development of microservices are called microservice bad smells. In microservice bad smells research, the detection of these bad smells relies on feature data from microservices. However, there is a lack of an appropriate open-source microservice feature dataset. The availability of such datasets may contribute to the detection of microservice bad smells unexpectedly. To address this research gap, this paper collects a number of open-source microservice systems utilizing Spring Cloud. Additionally, feature metrics are established based on the architecture and interactions of Spring Boot style microservices. And an extraction program is developed. The program is then applied to the collected open-source microservice systems, extracting the necessary information, and undergoing manual verification to create an open-source feature dataset specific to microservice systems using Spring Cloud. The dataset is made available through a CSV file. We believe that both the extraction program and the dataset have the potential to contribute to the study of micro-service bad smells.

4/3/2024

The PetShop Dataset -- Finding Causes of Performance Issues across Microservices

Michaela Hardt, William R. Orchard, Patrick Blobaum, Shiva Kasiviswanathan, Elke Kirschbaum

Identifying root causes for unexpected or undesirable behavior in complex systems is a prevalent challenge. This issue becomes especially crucial in modern cloud applications that employ numerous microservices. Although the machine learning and systems research communities have proposed various techniques to tackle this problem, there is currently a lack of standardized datasets for quantitative benchmarking. Consequently, research groups are compelled to create their own datasets for experimentation. This paper introduces a dataset specifically designed for evaluating root cause analyses in microservice-based applications. The dataset encompasses latency, requests, and availability metrics emitted in 5-minute intervals from a distributed application. In addition to normal operation metrics, the dataset includes 68 injected performance issues, which increase latency and reduce availability throughout the system. We showcase how this dataset can be used to evaluate the accuracy of a variety of methods spanning different causal and non-causal characterisations of the root cause analysis problem. We hope the new dataset, available at https://github.com/amazon-science/petshop-root-cause-analysis/ enables further development of techniques in this important area.

4/10/2024

Microservices-based Software Systems Reengineering: State-of-the-Art and Future Directions

Thakshila Imiya Mohottige (University of Melbourne), Artem Polyvyanyy (University of Melbourne), Rajkumar Buyya (University of Melbourne), Colin Fidge (Queensland University of Technology), Alistair Barros (Queensland University of Technology)

Designing software compatible with cloud-based Microservice Architectures (MSAs) is vital due to the performance, scalability, and availability limitations. As the complexity of a system increases, it is subject to deprecation, difficulties in making updates, and risks in introducing defects when making changes. Microservices are small, loosely coupled, highly cohesive units that interact to provide system functionalities. We provide a comprehensive survey of current research into ways of identifying services in systems that can be redeployed as microservices. Static, dynamic, and hybrid approaches have been explored. While code analysis techniques dominate the area, dynamic and hybrid approaches remain open research topics.

7/22/2024

Synthetic Time Series for Anomaly Detection in Cloud Microservices

Mohamed Allam, Noureddine Boujnah, Noel E. O'Connor, Mingming Liu

This paper proposes a framework for time series generation built to investigate anomaly detection in cloud microservices. In the field of cloud computing, ensuring the reliability of microservices is of paramount concern and yet a remarkably challenging task. Despite the large amount of research in this area, validation of anomaly detection algorithms in realistic environments is difficult to achieve. To address this challenge, we propose a framework to mimic the complex time series patterns representative of both normal and anomalous cloud microservices behaviors. We detail the pipeline implementation that allows deployment and management of microservices as well as the theoretical approach required to generate anomalies. Two datasets generated using the proposed framework have been made publicly available through GitHub.

8/2/2024