How to Sustainably Monitor ML-Enabled Systems? Accuracy and Energy Efficiency Tradeoffs in Concept Drift Detection

Read original: arXiv:2404.19452 - Published 5/1/2024 by Rafiullah Omar, Justus Bogner, Joran Leest, Vincenzo Stoico, Patricia Lago, Henry Muccini

Overview

Machine learning (ML) models deployed in production often suffer from "concept drift": gradual changes in the statistical properties of the real-world domain they operate in.
Regularly retraining these models helps, but retraining is energy-intensive.
The researchers ran a controlled experiment to study the tradeoff between accuracy and energy efficiency for seven common concept drift detection methods.

Plain English Explanation

The researchers looked at a common problem that affects machine learning (ML) models used in the real world. Over time, the data these models encounter in production can drift away from the data they were trained on, causing their performance to decline. This phenomenon is known as "concept drift".

A simple solution is to periodically retrain the ML models, but this can consume a lot of energy. The researchers wanted to find a more energy-efficient approach. They tested several methods for detecting concept drift and measured how accurate each one was, as well as how much energy it consumed.

The key idea is that if you can detect when concept drift starts to occur, you can retrain the model only when necessary rather than on a fixed schedule, which could save a lot of energy. The detectors consume energy themselves, however, so the choice of detector matters. To evaluate the detection methods thoroughly, the researchers used synthetic datasets with different types of drift together with several common ML models.
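The "retrain only when drift is detected" idea can be sketched in a few lines. The example below is a minimal illustration, not the study's setup: it hand-rolls a simplified Page-Hinkley test (chosen only because it is compact; the study itself places PageHinkley among the low-accuracy detectors) and feeds it a simulated stream of 0/1 prediction errors. The class, the `simulate` function, the parameter values (`delta`, `threshold`, `min_samples`), and the error rates are all assumptions made for illustration.

```python
import random


class PageHinkley:
    """Simplified Page-Hinkley test for an increase in a stream's mean."""

    def __init__(self, delta=0.05, threshold=20.0, min_samples=30):
        self.delta = delta              # tolerated deviation (illustrative value)
        self.threshold = threshold      # alarm threshold (illustrative value)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0  # running sum of deviations from the mean
        self.minimum = 0.0     # smallest cumulative sum seen so far

    def update(self, value):
        """Feed one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        if self.n < self.min_samples:
            return False
        return self.cumulative - self.minimum > self.threshold


def simulate(n_steps=2000, drift_at=1000, seed=0):
    """Simulate 0/1 prediction errors whose rate jumps at `drift_at`.

    Returns the steps at which the detector would trigger a retrain.
    """
    rng = random.Random(seed)
    detector = PageHinkley()
    retrain_steps = []
    error_rate = 0.1
    for step in range(n_steps):
        if step == drift_at:
            error_rate = 0.5  # abrupt drift: the model suddenly gets worse
        error = 1.0 if rng.random() < error_rate else 0.0
        if detector.update(error):
            retrain_steps.append(step)
            error_rate = 0.1  # stand-in for retraining restoring accuracy
            detector.reset()
    return retrain_steps


if __name__ == "__main__":
    print("retrain triggered at steps:", simulate())
```

In this toy stream the detector should fire shortly after the drift point and stay quiet otherwise, whereas a fixed schedule would retrain regardless of whether anything had changed. How much energy that saves in practice depends on the detector's own cost, which is the tradeoff the study measures.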

Technical Explanation

The researchers conducted a controlled experiment to study the accuracy-energy tradeoff of seven common concept drift detection methods. They used five synthetic datasets, each in one version with abrupt drift and one with gradual drift, along with six different ML models as base classifiers.

This full factorial design resulted in 420 test combinations (7 drift detectors * 5 datasets * 2 drift types * 6 base classifiers). The researchers measured the energy consumption and drift detection accuracy for each combination.
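As a quick check of the arithmetic, the full factorial design can be enumerated directly. The detector names come from the summary; the dataset and classifier names are not given here, so the sketch uses placeholder labels for them.

```python
from itertools import product

# Detector names as listed in the summary.
detectors = ["KSWIN", "HDDM_W", "ADWIN", "HDDM_A", "PageHinkley", "DDM", "EDDM"]
# Placeholder labels: the actual dataset and classifier names are not given here.
datasets = [f"dataset_{i}" for i in range(1, 6)]
drift_types = ["abrupt", "gradual"]
classifiers = [f"classifier_{i}" for i in range(1, 7)]

# Full factorial design: every detector on every dataset, drift type, and classifier.
combinations = list(product(detectors, datasets, drift_types, classifiers))

print(len(combinations))  # 7 * 5 * 2 * 6 = 420
```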

Their results show three main types of drift detectors: 1) those that prioritize detection accuracy over energy efficiency (KSWIN), 2) balanced detectors that consume low to medium energy while maintaining good accuracy (HDDM_W, ADWIN), and 3) detectors that use very little energy but have unacceptably poor accuracy (HDDM_A, PageHinkley, DDM, EDDM).
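For a selection script, this grouping could be encoded as a simple lookup. The sketch below only restates the three groups reported above; the group labels and the helper functions `group_of` and `candidates` are hypothetical names introduced for illustration.

```python
# Grouping as reported in the summary; the labels are illustrative.
DETECTOR_GROUPS = {
    "accuracy_first": ["KSWIN"],
    "balanced": ["HDDM_W", "ADWIN"],
    "low_energy_poor_accuracy": ["HDDM_A", "PageHinkley", "DDM", "EDDM"],
}


def group_of(detector):
    """Return the group label for a detector name."""
    for group, members in DETECTOR_GROUPS.items():
        if detector in members:
            return group
    raise KeyError(f"unknown detector: {detector}")


def candidates(allow_high_energy=False):
    """List detectors worth evaluating, skipping the poor-accuracy group."""
    chosen = list(DETECTOR_GROUPS["balanced"])
    if allow_high_energy:
        chosen += DETECTOR_GROUPS["accuracy_first"]
    return chosen


print(candidates())                        # balanced detectors only
print(candidates(allow_high_energy=True))  # adds the accuracy-first detector
```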

By providing this detailed evidence on the energy-accuracy tradeoffs, the researchers aim to help ML practitioners choose the most appropriate concept drift detection method for their specific systems and requirements.

Critical Analysis

The researchers acknowledge several limitations in their study. They only used synthetic datasets, so the results may not fully reflect the performance of these drift detection methods on real-world data. Additionally, the energy measurements were based on simulations, not actual hardware deployments, which could impact the fidelity of the results.

Further research is needed to validate these findings on a broader range of datasets and deployment scenarios. It would also be valuable to explore the potential impact of hyperparameter tuning on the accuracy-energy tradeoffs observed for each drift detection method.

Overall, the researchers provide a solid foundation for understanding the practical tradeoffs involved in choosing a concept drift detection strategy. However, ML practitioners should still carefully evaluate these methods in the context of their specific applications before making a selection.

Conclusion

This study offers valuable insights into the accuracy-energy tradeoffs of different concept drift detection methods for ML models deployed in production environments. The researchers' findings suggest that there are balanced options that can maintain good detection accuracy while consuming relatively low amounts of energy.

By providing this empirical evidence, the researchers aim to support ML practitioners in choosing the most appropriate drift detection approach for their systems. This can help improve the long-term reliability and efficiency of ML-powered applications, which is crucial as these technologies become more widespread in the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

How to Sustainably Monitor ML-Enabled Systems? Accuracy and Energy Efficiency Tradeoffs in Concept Drift Detection

Rafiullah Omar, Justus Bogner, Joran Leest, Vincenzo Stoico, Patricia Lago, Henry Muccini

ML-enabled systems that are deployed in a production environment typically suffer from decaying model prediction quality through concept drift, i.e., a gradual change in the statistical characteristics of a certain real-world domain. To combat this, a simple solution is to periodically retrain ML models, which unfortunately can consume a lot of energy. One recommended tactic to improve energy efficiency is therefore to systematically monitor the level of concept drift and only retrain when it becomes unavoidable. Different methods are available to do this, but we know very little about their concrete impact on the tradeoff between accuracy and energy efficiency, as these methods also consume energy themselves. To address this, we therefore conducted a controlled experiment to study the accuracy vs. energy efficiency tradeoff of seven common methods for concept drift detection. We used five synthetic datasets, each in a version with abrupt and one with gradual drift, and trained six different ML models as base classifiers. Based on a full factorial design, we tested 420 combinations (7 drift detectors * 5 datasets * 2 types of drift * 6 base classifiers) and compared energy consumption and drift detection accuracy. Our results indicate that there are three types of detectors: a) detectors that sacrifice energy efficiency for detection accuracy (KSWIN), b) balanced detectors that consume low to medium energy with good accuracy (HDDM_W, ADWIN), and c) detectors that consume very little energy but are unusable in practice due to very poor accuracy (HDDM_A, PageHinkley, DDM, EDDM). By providing rich evidence for this energy efficiency tactic, our findings support ML practitioners in choosing the best suited method of concept drift detection for their ML-enabled systems.

5/1/2024

Concept Drift Detection using Ensemble of Integrally Private Models

Ayush K. Varshney, Vicenc Torra

Deep neural networks (DNNs) are one of the most widely used machine learning algorithms. DNNs require the training data to be available beforehand with true labels. This is not feasible for many real-world problems where data arrives in streaming form and true labels are scarce and expensive to acquire. In the literature, not much focus has been given to the privacy aspect of streaming data, where data may change its distribution frequently. These concept drifts must be detected privately in order to avoid any disclosure risk from DNNs. Existing privacy models use concept drift detection schemes such as ADWIN and KSWIN to detect the drifts. In this paper, we focus on the notion of integrally private DNNs to detect concept drifts. Integrally private DNNs are the models which recur frequently from different datasets. Based on this, we introduce an ensemble methodology which we call the 'Integrally Private Drift Detection' (IPDD) method to detect concept drift from private models. Our IPDD method does not require labels to detect drift but assumes true labels are available once the drift has been detected. We have experimented with binary and multi-class synthetic and real-world data. Our experimental results show that our methodology can privately detect concept drift, has comparable utility (even better in some cases) with ADWIN, and outperforms the utility of different levels of differentially private models. The source code for the paper is available at https://github.com/Ayush-Umu/Concept-drift-detection-Using-Integrally-private-models.

6/10/2024

Online Drift Detection with Maximum Concept Discrepancy

Ke Wan, Yi Liang, Susik Yoon

Continuous learning from an immense volume of data streams becomes exceptionally critical in the internet era. However, data streams often do not conform to the same distribution over time, leading to a phenomenon called concept drift. Since a fixed static model is unreliable for inferring concept-drifted data streams, establishing an adaptive mechanism for detecting concept drift is crucial. Current methods for concept drift detection primarily assume that the labels or error rates of downstream models are given and/or underlying statistical properties exist in data streams. These approaches, however, struggle to address high-dimensional data streams with intricate irregular distribution shifts, which are more prevalent in real-world scenarios. In this paper, we propose MCD-DD, a novel concept drift detection method based on maximum concept discrepancy, inspired by the maximum mean discrepancy. Our method can adaptively identify varying forms of concept drift by contrastive learning of concept embeddings without relying on labels or statistical properties. With thorough experiments under synthetic and real-world scenarios, we demonstrate that the proposed method outperforms existing baselines in identifying concept drifts and enables qualitative analysis with high explainability.

7/9/2024

Optimized Deep Learning Models for Malware Detection under Concept Drift

William Maillet, Benjamin Marais

Despite the promising results of machine learning models in malicious file detection, they face the problem of concept drift due to the constant evolution of such files. This leads to declining performance over time, as the data distribution of new files differs from that of the training data, requiring frequent model updates. In this work, we propose a model-agnostic protocol to improve a baseline neural network against drift. We show the importance of feature reduction and training with the most recent validation set possible, and propose a loss function named Drift-Resilient Binary Cross-Entropy, an improvement to the classical Binary Cross-Entropy that is more effective against drift. We train our model on the EMBER dataset, published in 2018, and evaluate it on a dataset of recent malicious files, collected between 2020 and 2023. Our improved model shows promising results, detecting 15.2% more malware than a baseline model.

8/2/2024