A process mining-based error correction approach to improve data quality of an IoT-sourced event log

Original: arXiv:2404.13091 - Published 4/23/2024 by Mohsen Shirali, Zahra Ahmadi, Carlos Fernández-Llatas, Jose-Luis Bayo-Monton, Gemma Di Federico

Overview

  • This paper proposes a process mining-based approach to improve the data quality of event logs collected from Internet of Things (IoT) devices.
  • The researchers aim to automatically detect and correct errors in IoT-sourced event logs, which can be prone to various issues due to sensor malfunctions, network disruptions, and other factors.
  • The proposed method leverages process mining techniques to analyze the event log and identify patterns that indicate potential errors, then applies correction algorithms to fix those errors.

Plain English Explanation

The paper focuses on improving the quality of data collected from IoT devices. Sensors and similar devices monitor the real world and record events, but sensor malfunctions, network problems, and other factors can introduce errors and inaccuracies into the resulting event logs.

The researchers developed a new approach that uses process mining techniques to analyze the event log data and identify patterns that might indicate an error. Process mining is a way of studying the actual processes and workflows that are happening, based on the event log data. By looking for patterns that don't seem to fit the normal process, the system can flag potential errors.
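To make the idea of "patterns that don't fit the normal process" concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a simple directly-follows count as the process model, and the function names, the `min_share` threshold, and the smart-home location labels are all hypothetical choices for illustration.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def flag_rare_transitions(traces, min_share=0.1):
    """Return (trace_index, position, transition) for every transition whose
    share of all transitions leaving the same source activity is below min_share."""
    counts = directly_follows(traces)
    outgoing = Counter()
    for (a, _), n in counts.items():
        outgoing[a] += n
    flagged = []
    for i, trace in enumerate(traces):
        for pos, (a, b) in enumerate(zip(trace, trace[1:])):
            if counts[(a, b)] / outgoing[a] < min_share:
                flagged.append((i, pos, (a, b)))
    return flagged

# Hypothetical data: 19 traces follow the usual routine, one deviates.
usual = ["bedroom", "bathroom", "kitchen", "living_room"]
odd = ["bedroom", "kitchen", "bathroom", "living_room"]
traces = [usual] * 19 + [odd]

flagged = flag_rare_transitions(traces)
for trace_index, position, transition in flagged:
    print(trace_index, position, transition)
```

A flagged transition is only a candidate error. A rare but genuine behaviour looks the same to this check, which is why a threshold like `min_share` would need tuning against the data at hand.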

Once the errors are detected, the system then applies correction algorithms to try to fix them and improve the overall data quality. This allows the event log data to be more reliable and useful for things like understanding business processes or analyzing IoT sensor data.

The key idea is to automatically clean up the messy IoT data, rather than having people manually review it, which can be time-consuming and error-prone. By using process mining and correction algorithms, the system can do this in a more efficient and scalable way.

Technical Explanation

The paper presents a process mining-based approach for detecting and correcting errors in IoT-sourced event logs. The researchers first provide background on Ambient Assisted Living (AAL) systems, which are IoT-based systems that monitor and assist elderly or disabled individuals in their daily lives.

The key steps of the proposed error correction method are:

  1. Event Log Preprocessing: The raw IoT event log data is preprocessed to handle missing values, outliers, and other basic data quality issues.

  2. Process Discovery: Process mining techniques are used to discover the underlying process model from the preprocessed event log. This provides a baseline understanding of the expected process behavior.

  3. Error Detection: The discovered process model is used to identify deviations from the expected process flow, which may indicate potential errors in the event log data.

  4. Error Correction: Based on the detected errors, the system applies various correction algorithms to fix the issues in the event log, such as correcting timestamp inaccuracies or removing duplicate events.

  5. Evaluation: The corrected event log is evaluated to assess the improvement in data quality, using metrics such as completeness, correctness, and consistency.
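Steps 4 and 5 can be illustrated with a small sketch. The summary does not say how the paper implements correction or how it defines its quality metrics, so everything here is an assumption: `remove_duplicates` drops a repeated activity that arrives shortly after the last kept event, and `conforming_share` is a stand-in consistency measure that counts how many consecutive activity pairs are allowed by a hand-written model. The 5-second window and the sample events are hypothetical.

```python
from datetime import datetime, timedelta

def remove_duplicates(events, window=timedelta(seconds=5)):
    """Drop an event when the last kept event has the same activity and
    lies within `window`. `events` is a list of (timestamp, activity)
    tuples, assumed to be sorted by timestamp."""
    cleaned = []
    for ts, activity in events:
        if cleaned and cleaned[-1][1] == activity and ts - cleaned[-1][0] <= window:
            continue  # treat as a duplicate sensor firing
        cleaned.append((ts, activity))
    return cleaned

def conforming_share(events, allowed):
    """Fraction of consecutive activity pairs that appear in `allowed`."""
    pairs = [(a[1], b[1]) for a, b in zip(events, events[1:])]
    if not pairs:
        return 1.0
    return sum(p in allowed for p in pairs) / len(pairs)

# Hypothetical model: the transitions considered normal.
allowed = {("bedroom", "bathroom"), ("bathroom", "kitchen")}

t0 = datetime(2024, 1, 1, 7, 0, 0)
raw = [
    (t0, "bedroom"),
    (t0 + timedelta(seconds=60), "bathroom"),
    (t0 + timedelta(seconds=62), "bathroom"),  # repeated reading 2 s later
    (t0 + timedelta(seconds=300), "kitchen"),
]

cleaned = remove_duplicates(raw)
print("before:", conforming_share(raw, allowed))
print("after: ", conforming_share(cleaned, allowed))
```

Comparing the same metric before and after correction mirrors the evaluation step: the corrected log should score higher, and a drop would suggest the correction removed legitimate events.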

The paper presents a case study using a real-world AAL system dataset to demonstrate the effectiveness of the proposed approach. The results show that the method is able to significantly improve the quality of the IoT-sourced event log, making the data more reliable for downstream analysis and applications.

Critical Analysis

The paper offers a well-designed approach to a common problem: poor data quality in IoT-generated event logs. Using process mining to identify patterns and detect potential errors is a novel and promising direction.

However, the paper does not discuss the limitations or potential drawbacks of the proposed method. For example, the error detection and correction algorithms may not be able to handle all types of errors, and there could be cases where the process model discovery is inaccurate or incomplete. Additionally, the performance and scalability of the approach when dealing with large-scale or highly complex IoT systems are not explored.

Further research could also investigate the generalizability of the method to other IoT domains beyond Ambient Assisted Living, as well as the potential integration with other data quality enhancement techniques, such as information fusion or error mitigation approaches.

Conclusion

This paper presents a novel process mining-based approach to improve the data quality of IoT-sourced event logs. By leveraging process discovery and error detection techniques, the method can automatically identify and correct various types of errors in the event log data, making it more reliable and useful for a wide range of applications, such as business process analysis or IoT-based monitoring and decision-making.

The proposed approach offers a promising solution to a common challenge in the IoT domain, where sensor data can be prone to various quality issues. Further research and refinement of the method could lead to even more robust and effective data quality improvement techniques for IoT systems.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

