Fostering Trust and Quantifying Value of AI and ML

Read original: arXiv:2407.05919 - Published 7/9/2024 by Dalmo Cirne, Veena Calambur

Overview

AI and ML providers have a responsibility to develop valid and reliable systems
Trust in AI/ML inferences (the predictions or task solutions a trained model produces) is widely discussed but rarely defined, which makes it hard to build
The paper examines the dynamic of trust between providers (Trustors) and users (Trustees), and proposes a framework to measure trustworthiness of ML systems

Plain English Explanation

As artificial intelligence (AI) and machine learning (ML) systems become more widely adopted, it's important that the companies developing them ensure the systems are valid, reliable, and trustworthy. Users need to be able to trust the inferences or predictions made by these AI/ML models.

However, there isn't a clear definition of what it means for an AI/ML system to be "trustworthy." Topics like transparency, explainability, safety, and bias are important, but there aren't frameworks to quantify and measure them.

This paper looks at the relationship between the AI/ML providers (called "Trustors") and the users (called "Trustees"). Trustors need to be both trusting and trustworthy, while Trustees don't necessarily need to be either. The challenge is for Trustors to produce results good enough that Trustees will increase their trust to a minimum threshold, both to do business and to continue using the service.

The paper proposes a framework and set of metrics to calculate a "trust score" - a way to objectively measure how trustworthy a given AI/ML system can claim to be, and how that changes over time. This could help build greater trust in AI/ML systems and lead to more widespread adoption.

Technical Explanation

The paper begins by examining the dynamics of trust between AI/ML providers (the "Trustors") and the users of their systems (the "Trustees"). Trustors must be both trusting and trustworthy, while Trustees do not necessarily need to exhibit those qualities.

The key challenge for Trustors is to deliver results that are "good enough" to raise a Trustee's trust above a minimum threshold - both for the initial business relationship, and for the Trustee to continue using the service long-term. To address this, the paper proposes a framework and set of metrics to calculate a "trust score" that can objectively quantify the trustworthiness of a given AI/ML system.

The framework involves defining trust dimensions, identifying associated metrics, and combining those into an overall trust score. The suggested trust dimensions include:

Transparency: How well the system's inner workings and decision-making process are explained
Explainability: How intelligible the system's outputs and predictions are to human users
Safety: The system's reliability, robustness, and ability to avoid harmful mistakes
Fairness: The system's lack of discrimination or bias

By measuring these factors and compiling them into a trust score, the framework aims to provide a quantitative way for Trustors to demonstrate the trustworthiness of their AI/ML systems to potential Trustees.
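To make the idea of combining dimensions concrete, here is a minimal sketch of one possible aggregation. It is not the paper's formula: the weighted average, the `Dimension` class, the `trust_score` function, the `MIN_TRUST` threshold, and every score and weight below are assumptions chosen purely for illustration. Each dimension is assumed to have already been reduced to a normalized score between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """One trust dimension (hypothetical structure, not from the paper)."""
    name: str
    score: float   # assumed to be normalized to [0, 1]
    weight: float  # assumed relative importance of this dimension


def trust_score(dimensions: list[Dimension]) -> float:
    """Return the weighted average of the dimension scores, in [0, 1]."""
    total_weight = sum(d.weight for d in dimensions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for d in dimensions:
        if not 0.0 <= d.score <= 1.0:
            raise ValueError(f"{d.name}: score must be in [0, 1]")
    return sum(d.score * d.weight for d in dimensions) / total_weight


# Illustrative values only; safety is weighted twice as heavily here.
dims = [
    Dimension("transparency", 0.8, 1.0),
    Dimension("explainability", 0.6, 1.0),
    Dimension("safety", 0.9, 2.0),
    Dimension("fairness", 0.7, 1.0),
]

MIN_TRUST = 0.7  # hypothetical minimum threshold a Trustee might require
score = trust_score(dims)
print(f"trust score = {score:.2f}, meets threshold: {score >= MIN_TRUST}")
```

A weighted average is only one option. A provider might instead take the minimum across dimensions, so that a single weak dimension cannot be masked by strong ones.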

Critical Analysis

The paper makes a compelling case for the need to define and measure trustworthiness in AI/ML systems. As these technologies become more ubiquitous, it is crucial that users can trust the inferences and decisions produced by the models.

However, the proposed framework is still high-level, and further research would be needed to operationalize the specific metrics and weighting schemes. Additionally, the framework focuses on the provider's perspective (the Trustor), when end-user trust may also depend on factors beyond the provider's control, such as the user's own familiarity with AI/ML.

There are also open questions around how to handle evolving trust over time, as models are updated and retrained. The framework suggests tracking trust scores, but doesn't address scenarios where a model's trustworthiness may decrease after deployment due to data drift or other issues.
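One way a provider could watch for such a decline is to record the trust score at regular intervals and flag the first period in which it falls below the minimum threshold. The sketch below assumes this simple monitoring scheme; the `first_breach` function, the monthly labels, the scores, and the threshold are all hypothetical and do not come from the paper.

```python
from typing import Optional


def first_breach(history: list[tuple[str, float]], threshold: float) -> Optional[str]:
    """Return the label of the first period scoring below threshold, or None."""
    for label, score in history:
        if score < threshold:
            return label
    return None


# Hypothetical monthly trust scores for a deployed model that slowly degrades.
history = [
    ("2024-01", 0.82),
    ("2024-02", 0.80),
    ("2024-03", 0.74),
    ("2024-04", 0.66),
]

breach = first_breach(history, threshold=0.7)
if breach is None:
    print("trust score stayed at or above the threshold")
else:
    print(f"trust score first fell below the threshold in {breach}")
```

A flag like this only signals that something changed. Diagnosing whether data drift, retraining, or a shift in user expectations caused the drop would need separate investigation.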

Overall, this paper lays important groundwork for defining and measuring trustworthiness in AI/ML, but more work is needed to develop a comprehensive, practical system that can be widely adopted by the industry.

Conclusion

This paper tackles the critical challenge of building trust in AI and machine learning systems. By examining the dynamic between providers (Trustors) and users (Trustees), it proposes a framework for quantifying the trustworthiness of ML models through measures of transparency, explainability, safety, and fairness.

Implementing such a framework could help AI/ML providers demonstrate the reliability of their systems, fostering greater user confidence and enabling wider adoption of these powerful technologies. However, further research is needed to refine the specific metrics and address evolving trust over time.

Ultimately, developing trustworthy AI is an essential step towards realizing the full potential of these systems to positively transform industries and society. This paper represents an important contribution to the ongoing journey towards trustworthy AI.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Fostering Trust and Quantifying Value of AI and ML

Dalmo Cirne, Veena Calambur

Artificial Intelligence (AI) and Machine Learning (ML) providers have a responsibility to develop valid and reliable systems. Much has been discussed about trusting AI and ML inferences (the process of running live data through a trained AI model to make a prediction or solve a task), but little has been done to define what that means. Those in the space of ML-based products are familiar with topics such as transparency, explainability, safety, bias, and so forth. Yet, there are no frameworks to quantify and measure those. Producing ever more trustworthy machine learning inferences is a path to increase the value of products (i.e., increased trust in the results) and to engage in conversations with users to gather feedback to improve products. In this paper, we begin by examining the dynamic of trust between a provider (Trustor) and users (Trustees). Trustors are required to be trusting and trustworthy, whereas trustees need not be trusting nor trustworthy. The challenge for trustors is to provide results that are good enough to make a trustee increase their level of trust above a minimum threshold for: 1- doing business together; 2- continuation of service. We conclude by defining and proposing a framework, and a set of viable metrics, to be used for computing a trust score and objectively understand how trustworthy a machine learning system can claim to be, plus their behavior over time.

7/9/2024

Whether to trust: the ML leap of faith

Tory Frame, Julian Padget, George Stothart, Elizabeth Coulthard

Human trust is critical for trustworthy AI adoption. Trust is commonly understood as an attitude, but we cannot accurately measure this, nor manage it. We conflate trust in the overall system, ML, and ML's component parts; so most users do not understand the leap of faith they take when they trust ML. Current efforts to build trust explain ML's process, which can be hard for non-ML experts to comprehend because it is complex, and explanations are unrelated to their own (unarticulated) mental models. We propose an innovative way of directly building intrinsic trust in ML, by discerning and measuring the Leap of Faith (LoF) taken when a user trusts ML. Our LoF matrix identifies where an ML model aligns to a user's own mental model. This match is rigorously yet practically identified by feeding the user's data and objective function both into an ML model and an expert-validated rules-based AI model, a verified point of reference that can be tested a priori against a user's own mental model. The LoF matrix visually contrasts the models' outputs, so the remaining ML-reasoning leap of faith can be discerned. Our proposed trust metrics measure for the first time whether users demonstrate trust through their actions, and we link deserved trust to outcomes. Our contribution is significant because it enables empirical assessment and management of ML trust drivers, to support trustworthy ML adoption. Our approach is illustrated with a long-term high-stakes field study: a 3-month pilot of a sleep-improvement system with embedded AI.

8/6/2024

To Trust or Not to Trust: Towards a novel approach to measure trust for XAI systems

Miquel Miró-Nicolau, Gabriel Moyà-Alcover, Antoni Jaume-i-Capó, Manuel González-Hidalgo, Maria Gemma Sempere Campello, Juan Antonio Palmer Sancho

The increasing reliance on Deep Learning models, combined with their inherent lack of transparency, has spurred the development of a novel field of study known as eXplainable AI (XAI) methods. These methods seek to enhance the trust of end-users in automated systems by providing insights into the rationale behind their decisions. This paper presents a novel approach for measuring user trust in XAI systems, allowing their refinement. Our proposed metric combines both performance metrics and trust indicators from an objective perspective. To validate this novel methodology, we conducted a case study in a realistic medical scenario: the use of an XAI system for the detection of pneumonia from x-ray images.

5/10/2024

Trustworthy Artificial Intelligence in the Context of Metrology

Tameem Adel, Sam Bilson, Mark Levene, Andrew Thompson

We review research at the National Physical Laboratory (NPL) in the area of trustworthy artificial intelligence (TAI), and more specifically trustworthy machine learning (TML), in the context of metrology, the science of measurement. We describe three broad themes of TAI: technical, socio-technical and social, which play key roles in ensuring that the developed models are trustworthy and can be relied upon to make responsible decisions. From a metrology perspective we emphasise uncertainty quantification (UQ), and its importance within the framework of TAI to enhance transparency and trust in the outputs of AI systems. We then discuss three research areas within TAI that we are working on at NPL, and examine the certification of AI systems in terms of adherence to the characteristics of TAI.

6/17/2024