How to design a dataset compliant with an ML-based system ODD?

Read original: arXiv:2406.14027 - Published 6/21/2024 by Cyril Cappi, Noémie Cohen, Mélanie Ducoffe, Christophe Gabreau, Laurent Gardes, Adrien Gauffriau, Jean-Brice Ginestet, Franck Mamalet, Vincent Mussot, Claire Pagetti, David Vigouroux

Overview

This paper focuses on designing and validating a dataset for a machine learning-based Vision-based Landing system.
The researchers describe a process for establishing Operational Design Domains (ODDs) at both the system and image levels, based on emerging certification standards.
They translate high-level system constraints into actionable image-level properties, allowing for the definition of verifiable Data Quality Requirements (DQRs).
The Landing Approach Runway Detection (LARD) dataset, which combines synthetic imagery and real footage, is used to illustrate this approach.

Plain English Explanation

The paper discusses the challenge of designing a dataset that meets the strict requirements of machine learning (ML) systems used in safety-critical applications, such as autonomous landing of aircraft. The researchers present a framework for establishing Operational Design Domains (ODDs) - the specific conditions under which an ML system is intended to operate safely. This involves translating high-level system constraints, like environmental conditions and operational scenarios, into verifiable Data Quality Requirements (DQRs) for the training data.
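To make the idea of an ODD concrete, here is a minimal sketch that treats an ODD as a set of named parameter ranges and reports which parameters of a sample fall outside them. The parameter names and numeric bounds are hypothetical placeholders chosen for illustration; they are not taken from the paper or from the LARD dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high] for one ODD parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical system-level ODD: names and bounds are illustrative assumptions.
SYSTEM_ODD = {
    "distance_to_runway_nm": Range(0.1, 3.0),
    "glide_slope_deg": Range(2.0, 4.0),
    "sun_elevation_deg": Range(10.0, 90.0),
}


def odd_violations(sample: dict, odd: dict) -> list:
    """Return the ODD parameters that are missing from the sample or out of range."""
    violations = []
    for name, allowed in odd.items():
        value = sample.get(name)
        if value is None or not allowed.contains(value):
            violations.append(name)
    return violations


# Example: metadata attached to one image of the dataset (hypothetical values).
sample = {"distance_to_runway_nm": 1.5, "glide_slope_deg": 3.0, "sun_elevation_deg": 45.0}
print(odd_violations(sample, SYSTEM_ODD))
```

A sample with an empty violation list lies inside this toy ODD. A real ODD would likely also include categorical conditions (for example weather or runway type) that simple numeric ranges do not capture.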

To demonstrate this approach, the researchers used the Landing Approach Runway Detection (LARD) dataset, which combines computer-generated synthetic images and real-world footage. They focused on the steps required to verify that the LARD dataset meets the defined DQRs, ensuring it can adequately support the development and testing of an ML-based Vision-based Landing system.

The goal of this work is to provide a replicable framework for designing datasets that meet the stringent requirements for certifying ML-based systems in safety-critical applications.

Technical Explanation

The paper presents a methodology for designing and validating a dataset that aligns with the Operational Design Domain (ODD) of a machine learning-based Vision-based Landing system. The researchers leverage emerging certification standards to establish ODDs at both the system and image levels, which define the specific conditions under which the ML system is intended to operate safely.

The key steps in their approach include:

Translating high-level system constraints, such as environmental factors and operational scenarios, into actionable image-level properties. This allows for the definition of verifiable Data Quality Requirements (DQRs) for the training data.
Using the Landing Approach Runway Detection (LARD) dataset, which combines synthetic imagery and real footage, to illustrate the process of verifying that the dataset meets the defined DQRs.
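The first step, turning a system-level constraint into an image-level property, can be illustrated with a standard pinhole camera model: a range of distances to the runway becomes a range of apparent runway widths in pixels. This is only a sketch of the kind of translation involved, not the paper's actual derivation; the function names, runway width, image width, and field of view used below are hypothetical.

```python
import math


def focal_length_px(image_width_px: float, horizontal_fov_deg: float) -> float:
    """Focal length in pixels for an ideal pinhole camera."""
    return (image_width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)


def apparent_width_px(object_width_m: float, distance_m: float, focal_px: float) -> float:
    """Projected width of an object facing the camera at the given distance."""
    return focal_px * object_width_m / distance_m


def image_level_range(object_width_m, min_distance_m, max_distance_m, focal_px):
    """Map a system-level distance range to an image-level pixel-width range."""
    smallest = apparent_width_px(object_width_m, max_distance_m, focal_px)
    largest = apparent_width_px(object_width_m, min_distance_m, focal_px)
    return smallest, largest


# Hypothetical values: a 45 m wide runway seen by a 2448 px wide, 60 degree FOV camera.
f_px = focal_length_px(image_width_px=2448, horizontal_fov_deg=60.0)
print(image_level_range(45.0, min_distance_m=500.0, max_distance_m=5000.0, focal_px=f_px))
```

The resulting pixel range is a property that can be checked directly on annotated images, which is what makes it usable as the basis for a verifiable requirement. The model ignores perspective foreshortening and lens distortion, so it should be read as a rough bound only.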

The framework is intended to be replicable: it addresses the challenge of designing a dataset that meets the stringent requirements for certifying ML-based systems in safety-critical applications. By defining ODDs at both the system and image levels and translating them into verifiable DQRs, the researchers aim to ensure that the dataset can adequately support the development and testing of the ML-based Vision-based Landing system.
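As an illustration of what a verifiable DQR might look like, the sketch below checks a simple coverage requirement: every bin of an ODD parameter must contain at least a minimum number of samples, and no sample may lie outside the ODD range. The requirement itself, the function names, and the thresholds are assumptions made for this example; the paper's actual DQRs may be formulated differently.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Return the bin of `value` in [low, high], or None if it is out of range."""
    if not (low <= value <= high):
        return None
    width = (high - low) / n_bins
    # Clamp so that value == high falls into the last bin.
    return min(int((value - low) / width), n_bins - 1)


def check_coverage_dqr(values, low, high, n_bins, min_per_bin):
    """Check that each bin of the ODD range holds at least `min_per_bin` samples."""
    counts = Counter()
    out_of_range = 0
    for value in values:
        idx = bin_index(value, low, high, n_bins)
        if idx is None:
            out_of_range += 1
        else:
            counts[idx] += 1
    under_filled = [i for i in range(n_bins) if counts[i] < min_per_bin]
    return {
        "satisfied": not under_filled and out_of_range == 0,
        "under_filled_bins": under_filled,
        "out_of_range": out_of_range,
    }


# Hypothetical distances to the runway (nautical miles) read from dataset metadata.
distances = [0.2, 0.8, 1.2, 1.9, 2.5, 2.9]
print(check_coverage_dqr(distances, low=0.0, high=3.0, n_bins=3, min_per_bin=2))
```

Reporting the under-filled bins, rather than only a pass or fail flag, shows where additional synthetic or real images would be needed to satisfy the requirement.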

Critical Analysis

The paper presents a comprehensive and well-structured approach to designing and validating a dataset for a machine learning-based Vision-based Landing system. The researchers' focus on establishing Operational Design Domains (ODDs) and translating them into verifiable Data Quality Requirements (DQRs) is a crucial step in ensuring the dataset can support the certification of safety-critical ML applications.

One potential limitation of the research is the reliance on a single dataset, the Landing Approach Runway Detection (LARD) dataset, to illustrate the framework. While the LARD dataset combines synthetic and real-world data, it would be valuable to see the framework applied to additional datasets, potentially in different safety-critical domains, to further validate its generalizability.

Additionally, the paper does not explicitly address the challenges of dataset bias and its impact on the certification of ML-based systems. Exploring strategies to mitigate bias in the dataset design process could strengthen the overall framework.

Despite these minor concerns, the researchers' work provides a valuable contribution to the field of machine learning in safety-critical applications. The replicable framework presented in the paper can serve as a foundation for researchers and practitioners to design and validate datasets that meet the stringent requirements of certification standards.

Conclusion

This paper presents a comprehensive framework for designing and validating a dataset that aligns with the Operational Design Domain (ODD) of a machine learning-based Vision-based Landing system. By establishing ODDs at both the system and image levels, and translating high-level constraints into verifiable Data Quality Requirements (DQRs), the researchers have developed a replicable approach to addressing the challenges of certifying ML-based systems for safety-critical applications.

The work has implications for the development of reliable and trustworthy ML-based systems in safety-critical domains. By ensuring that the dataset used to train and validate such systems meets rigorous, verifiable quality requirements, the researchers contribute to safe and reliable AI technologies that can be deployed in the real world.

Related Papers

How to design a dataset compliant with an ML-based system ODD?

Cyril Cappi, Noémie Cohen, Mélanie Ducoffe, Christophe Gabreau, Laurent Gardes, Adrien Gauffriau, Jean-Brice Ginestet, Franck Mamalet, Vincent Mussot, Claire Pagetti, David Vigouroux

This paper focuses on a Vision-based Landing task and presents the design and the validation of a dataset that would comply with the Operational Design Domain (ODD) of a Machine-Learning (ML) system. Relying on emerging certification standards, we describe the process for establishing ODDs at both the system and image levels. In the process, we present the translation of high-level system constraints into actionable image-level properties, allowing for the definition of verifiable Data Quality Requirements (DQRs). To illustrate this approach, we use the Landing Approach Runway Detection (LARD) dataset which combines synthetic imagery and real footage, and we focus on the steps required to verify the DQRs. The replicable framework presented in this paper addresses the challenges of designing a dataset compliant with the stringent needs of ML-based systems certification in safety-critical applications.

6/21/2024

Formalization of Operational Domain and Operational Design Domain for Automated Vehicles

Ali Shakeri

Specifying an Operational Design Domain (ODD) is crucial for safeguarding automated vehicle systems against conditions that exceed their capabilities. Yet, prior definitions of ODD have relied on ambiguous and unclear terms, resulting in numerous misunderstandings and misconceptions. This paper introduces a formal approach to clearly define the Operational Domain (OD) and ODD for automated vehicles. Furthermore, the absence of essential terms, such as the OD, has resulted in the creation of numerous terms that have made things more complicated and confusing. This level of complexity is unacceptable when it comes to developing safety-critical systems, where any uncertainty can lead to significant risks. This study addresses these deficiencies by providing a precise mathematical model of OD and clarifying its relationship with other terms. Also, by formalizing these terms, this work establishes a foundation for developing further concepts such as ODD specification and ODD monitoring, which are explained in this paper.

8/28/2024

Towards Robust Training Datasets for Machine Learning with Ontologies: A Case Study for Emergency Road Vehicle Detection

Lynn Vonderhaar, Timothy Elvira, Tyler Procko, Omar Ochoa

Countless domains rely on Machine Learning (ML) models, including safety-critical domains, such as autonomous driving, which this paper focuses on. While the black box nature of ML is simply a nuisance in some domains, in safety-critical domains, this makes ML models difficult to trust. To fully utilize ML models in safety-critical domains, it would be beneficial to have a method to improve trust in model robustness and accuracy without human experts checking each decision. This research proposes a method to increase trust in ML models used in safety-critical domains by ensuring the robustness and completeness of the model's training dataset. Because ML models embody what they are trained with, ensuring the completeness of training datasets can help to increase the trust in the training of ML models. To this end, this paper proposes the use of a domain ontology and an image quality characteristic ontology to validate the domain completeness and image quality robustness of a training dataset. This research also presents an experiment as a proof of concept for this method, where ontologies are built for the emergency road vehicle domain.

6/24/2024

Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models

Akchay Srivastava, Atif Memon

Open Domain Question Answering (ODQA) within natural language processing involves building systems that answer factual questions using large-scale knowledge corpora. Recent advances stem from the confluence of several factors, such as large-scale training datasets, deep learning techniques, and the rise of large language models. High-quality datasets are used to train models on realistic scenarios and enable the evaluation of the system on potentially unseen data. Standardized metrics facilitate comparisons between different ODQA systems, allowing researchers to objectively track advancements in the field. Our study presents a thorough examination of the current landscape of ODQA benchmarking by reviewing 52 datasets and 20 evaluation techniques across textual and multimodal modalities. We introduce a novel taxonomy for ODQA datasets that incorporates both the modality and difficulty of the question types. Additionally, we present a structured organization of ODQA evaluation metrics along with a critical analysis of their inherent trade-offs. Our study aims to empower researchers by providing a framework for the robust evaluation of modern question-answering systems. We conclude by identifying the current challenges and outlining promising avenues for future research and development.

6/21/2024