Applied Machine Learning to Anomaly Detection in Enterprise Purchase Processes

A. Herreros-Mart'inez, R. Magdalena-Benedicto, J. Vila-Franc'es, A. J. Serrano-L'opez, S. P'erez-D'iaz

In a context of a continuous digitalisation of processes, organisations must deal with the challenge of detecting anomalies that can reveal suspicious activities upon an increasing volume of data. To pursue this goal, audit engagements are carried out regularly, and internal auditors and purchase specialists are constantly looking for new methods to automate these processes. This work proposes a methodology to prioritise the investigation of the cases detected in two large purchase datasets from real data. The goal is to contribute to the effectiveness of the companies' control efforts and to increase the performance of carrying out such tasks. A comprehensive Exploratory Data Analysis is carried out before using unsupervised Machine Learning techniques addressed to detect anomalies. A univariate approach has been applied through the z-Score index and the DBSCAN algorithm, while a multivariate analysis is implemented with the k-Means and Isolation Forest algorithms, and the Silhouette index, resulting in each method having a transaction candidates' proposal to be reviewed. An ensemble prioritisation of the candidates is provided jointly with a proposal of explicability methods (LIME, Shapley, SHAP) to help the company specialists in their understanding.

5/24/2024

Anomaly Detection Within Mission-Critical Call Processing

Sean Doris, Iosif Salem, Stefan Schmid

With increasingly larger and more complex telecommunication networks, there is a need for improved monitoring and reliability. Requirements increase further when working with mission-critical systems requiring stable operations to meet precise design and client requirements while maintaining high availability. This paper proposes a novel methodology for developing a machine learning model that can assist in maintaining availability (through anomaly detection) for client-server communications in mission-critical systems. To that end, we validate our methodology for training models based on data classified according to client performance. The proposed methodology evaluates the use of machine learning to perform anomaly detection of a single virtualized server loaded with simulated network traffic (using SIPp) with media calls. The collected data for the models are classified based on the round trip time performance experienced on the client side to determine if the trained models can detect anomalous client side performance only using key performance indicators available on the server. We compared the performance of seven different machine learning models by testing different trained and untrained test stressor scenarios. In the comparison, five models achieved an F1-score above 0.99 for the trained test scenarios. Random Forest was the only model able to attain an F1-score above 0.9 for all untrained test scenarios with the lowest being 0.980. The results suggest that it is possible to generate accurate anomaly detection to evaluate degraded client-side performance.

8/28/2024

📊

A Data Mining-Based Dynamical Anomaly Detection Method for Integrating with an Advance Metering System

Sarit Maitra

Building operations consume 30% of total power consumption and contribute 26% of global power-related emissions. Therefore, monitoring, and early detection of anomalies at the meter level are essential for residential and commercial buildings. This work investigates both supervised and unsupervised approaches and introduces a dynamic anomaly detection system. The system introduces a supervised Light Gradient Boosting machine and an unsupervised autoencoder with a dynamic threshold. This system is designed to provide real-time detection of anomalies at the meter level. The proposed dynamical system comes with a dynamic threshold based on the Mahalanobis distance and moving averages. This approach allows the system to adapt to changes in the data distribution over time. The effectiveness of the proposed system is evaluated using real-life power consumption data collected from smart metering systems. This empirical testing ensures that the system's performance is validated under real-world conditions. By detecting unusual data movements and providing early warnings, the proposed system contributes significantly to visual analytics and decision science. Early detection of anomalies enables timely troubleshooting, preventing financial losses and potential disasters such as fire incidents.

5/7/2024

Towards a Unified Framework of Clustering-based Anomaly Detection

Zeyu Fang, Ming Gu, Sheng Zhou, Jiawei Chen, Qiaoyu Tan, Haishuai Wang, Jiajun Bu

Unsupervised Anomaly Detection (UAD) plays a crucial role in identifying abnormal patterns within data without labeled examples, holding significant practical implications across various domains. Although the individual contributions of representation learning and clustering to anomaly detection are well-established, their interdependencies remain under-explored due to the absence of a unified theoretical framework. Consequently, their collective potential to enhance anomaly detection performance remains largely untapped. To bridge this gap, in this paper, we propose a novel probabilistic mixture model for anomaly detection to establish a theoretical connection among representation learning, clustering, and anomaly detection. By maximizing a novel anomaly-aware data likelihood, representation learning and clustering can effectively reduce the adverse impact of anomalous data and collaboratively benefit anomaly detection. Meanwhile, a theoretically substantiated anomaly score is naturally derived from this framework. Lastly, drawing inspiration from gravitational analysis in physics, we have devised an improved anomaly score that more effectively harnesses the combined power of representation learning and clustering. Extensive experiments, involving 17 baseline methods across 30 diverse datasets, validate the effectiveness and generalization capability of the proposed method, surpassing state-of-the-art methods.

6/4/2024

Applied Machine Learning to Anomaly Detection in Enterprise Purchase Processes

Overview

Plain English Explanation

Technical Explanation

Critical Analysis

Conclusion

Related Papers