Phrase-Level Adversarial Training for Mitigating Bias in Neural Network-based Automatic Essay Scoring

Read original: arXiv:2409.04795 - Published 9/10/2024 by Haddad Philip, Tsegaye Misikir Tashu

Overview

The paper explores using phrase-level adversarial training to mitigate bias in neural network-based automatic essay scoring systems.
The authors aim to reduce unwanted biases in essay scoring that could lead to unfair assessments for certain demographics.
The proposed approach involves training the model to be invariant to sensitive attributes like gender and race while maintaining high scoring accuracy.

Plain English Explanation

The paper looks at a problem in automated essay scoring systems, where the models used to grade essays can sometimes be biased against certain groups of people. For example, the model might give lower scores to essays written by women or racial minorities, even if the content of the essay is equally strong.

To address this issue, the researchers developed a new training approach called phrase-level adversarial training. The idea is to train the model not just to accurately score the essays, but also to be "blind" to sensitive attributes like gender and race. That way, the model won't let those factors unfairly influence the score.

The key steps are:

Train the main essay scoring model as usual to predict accurate essay scores.
Simultaneously train a separate "adversarial" model that tries to predict the sensitive attributes (like gender) from the essay text.
Encourage the main model to learn features that are useful for scoring, but not for predicting the sensitive attributes.

This adversarial training process helps the main model become more fair and unbiased in its essay assessments, without sacrificing too much accuracy. The researchers tested this approach on a dataset of essays and found it was effective at reducing demographic biases.
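
As a rough illustration of the three steps above, the sketch below runs adversarial training on a toy linear model using only the Python standard library. It is not the authors' implementation: the model shape, the feature layout (one "quality" feature and one "group marker" feature), the toy data, and the names `adversarial_step`, `train`, and `lam` are all assumptions made for this example.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def adversarial_step(params, x, y, g, lam=0.5, lr=0.05):
    """One SGD step on a single essay (hypothetical toy model).

    params: 'w' shared encoder weights, 'a' scorer weight,
            'b' and 'c' adversary weight and bias.
    x: feature list, y: target score, g: sensitive attribute (0 or 1).
    """
    w, a, b, c = params["w"], params["a"], params["b"], params["c"]
    h = sum(wi * xi for wi, xi in zip(w, x))  # shared representation
    s = a * h                                 # predicted essay score
    p = sigmoid(b * h + c)                    # adversary's P(g = 1)

    p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    score_loss = (s - y) * (s - y)
    adv_loss = -(g * math.log(p_safe) + (1 - g) * math.log(1.0 - p_safe))

    ds = 2.0 * (s - y)  # d score_loss / d s
    dz = p - g          # d adv_loss / d (b*h + c)
    # The encoder descends the score loss but ascends the adversary's
    # loss: this sign flip is the adversarial ("gradient reversal") part.
    dh = ds * a - lam * dz * b

    new_params = {
        "w": [wi - lr * dh * xi for wi, xi in zip(w, x)],
        "a": a - lr * ds * h,   # scorer minimizes score_loss
        "b": b - lr * dz * h,   # adversary minimizes adv_loss
        "c": c - lr * dz,
    }
    return new_params, score_loss, adv_loss

def train(data, epochs=50, lam=0.5, lr=0.05):
    params = {"w": [0.5, 0.5], "a": 1.0, "b": 0.0, "c": 0.0}
    for _epoch in range(epochs):
        for x, y, g in data:
            params, _sl, _al = adversarial_step(params, x, y, g, lam, lr)
    return params

# Made-up data: ([quality feature, group marker], score, group label).
toy_essays = [
    ([1.0, 1.0], 0.9, 1),
    ([0.2, 1.0], 0.3, 1),
    ([0.9, 0.0], 0.8, 0),
    ([0.1, 0.0], 0.2, 0),
]
```

Calling `train(toy_essays)` returns the updated parameters. Setting `lam=0.0` turns off the adversarial term and recovers ordinary score-only training, which makes it easy to compare the two regimes on the same data.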

Technical Explanation

The paper proposes a phrase-level adversarial training approach to mitigate bias in neural network-based automatic essay scoring systems. The key technical components are:

Essay Scoring Model: This is the main model that takes an essay text as input and predicts a score for the essay quality. The authors use a BERT-based transformer model for this.
Adversarial Model: This is a separate model that takes the essay text as input and tries to predict sensitive demographic attributes like gender and race. The goal is to train the essay scoring model to be invariant to these sensitive attributes.
Adversarial Training: During training, the essay scoring model is optimized to both predict accurate essay scores and confuse the adversarial model's ability to predict the sensitive attributes. This is done by adding an adversarial loss term to the main scoring model's objective function.
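
The summary does not state the objective in symbols. One common way to write an adversarial objective of the kind described above is shown here; every symbol is an assumption introduced for illustration, not notation taken from the paper. Here $\theta_f$ denotes the shared encoder parameters, $\theta_s$ the scoring head, $\theta_a$ the adversary, $\mathcal{L}_{\text{score}}$ the scoring loss, $\mathcal{L}_{\text{adv}}$ the adversary's classification loss on the sensitive attribute, and $\lambda \ge 0$ the weight of the adversarial term.

```latex
\min_{\theta_f,\,\theta_s}\;\max_{\theta_a}\;
\mathcal{L}_{\text{score}}(\theta_f,\theta_s)
\;-\;\lambda\,\mathcal{L}_{\text{adv}}(\theta_f,\theta_a)
```

Under this formulation the adversary minimizes $\mathcal{L}_{\text{adv}}$, while the encoder and scoring head minimize $\mathcal{L}_{\text{score}} - \lambda\,\mathcal{L}_{\text{adv}}$, so a larger $\lambda$ pushes harder toward representations from which the sensitive attribute cannot be predicted.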

The authors evaluate their approach on a dataset of essays written in Arabic. They find that the phrase-level adversarial training leads to significant reductions in demographic biases in the essay scoring, while maintaining competitive scoring accuracy compared to standard transformer-based models.
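
The summary does not say which bias metric the authors report. Purely as an illustration, one simple way to quantify a demographic gap is to compare mean predicted scores across groups. The function name `group_score_gap` and the data in the test case are hypothetical.

```python
from statistics import mean

def group_score_gap(scores, groups):
    """Return (largest gap between group mean scores, per-group means)."""
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    means = {group: mean(vals) for group, vals in by_group.items()}
    return max(means.values()) - min(means.values()), means
```

A gap near zero on essays of comparable quality would suggest the scorer treats the groups similarly. A raw gap alone cannot separate model bias from genuine differences in the sample, so in practice it would be read alongside human-assigned scores.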

Critical Analysis

The paper presents an interesting and technically sound approach to addressing the important problem of bias mitigation in automated essay scoring systems. A few key points:

Novelty: The use of phrase-level adversarial training to target bias is a novel contribution, building on prior work in adversarial debiasing.
Scope: The evaluation is limited to a dataset of Arabic essays, so further work is needed to assess generalization across languages and domains.
Limitations: Beyond reporting that scoring accuracy remains competitive, the paper does not analyze the tradeoff between bias reduction and scoring accuracy in depth. More analysis is needed on this front.
Ethical Considerations: While the paper aims to reduce unfair biases, there are open questions around the ethical implications of such techniques and how to ensure they are deployed responsibly.

Overall, this is a valuable contribution to the field of fair and ethical AI systems, but more research is needed to fully understand the implications and limitations of the proposed approach.

Conclusion

This paper introduces a novel phrase-level adversarial training technique to mitigate demographic biases in neural network-based automatic essay scoring systems. By encouraging the scoring model to be invariant to sensitive attributes like gender and race, the approach can significantly reduce unwanted biases while maintaining competitive scoring performance.

The findings highlight the importance of addressing fairness and equity concerns in high-stakes AI applications like education assessment. Further research is needed to generalize the approach and fully characterize the tradeoffs involved. However, this work represents an important step towards more fair and inclusive automated essay scoring systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Phrase-Level Adversarial Training for Mitigating Bias in Neural Network-based Automatic Essay Scoring

Haddad Philip, Tsegaye Misikir Tashu

Automatic Essay Scoring (AES) is widely used to evaluate candidates for educational purposes. However, due to the lack of representative data, most existing AES systems are not robust, and their scoring predictions are biased towards the most represented data samples. In this study, we propose a model-agnostic phrase-level method to generate an adversarial essay set to address the biases and robustness of AES models. Specifically, we construct an attack test set comprising samples from the original test set and adversarially generated samples using our proposed method. To evaluate the effectiveness of the attack strategy and data augmentation, we conducted a comprehensive analysis utilizing various neural network scoring models. Experimental results show that the proposed approach significantly improves AES model performance in the presence of adversarial examples and scenarios without such attacks.

9/10/2024

Automated essay scoring in Arabic: a dataset and analysis of a BERT-based system

Rayed Ghazawi, Edwin Simpson

Automated Essay Scoring (AES) holds significant promise in the field of education, helping educators to mark larger volumes of essays and provide timely feedback. However, Arabic AES research has been limited by the lack of publicly available essay data. This study introduces AR-AES, an Arabic AES benchmark dataset comprising 2046 undergraduate essays, including gender information, scores, and transparent rubric-based evaluation guidelines, providing comprehensive insights into the scoring process. These essays come from four diverse courses, covering both traditional and online exams. Additionally, we pioneer the use of AraBERT for AES, exploring its performance on different question types. We find encouraging results, particularly for Environmental Chemistry and source-dependent essay questions. For the first time, we examine the scale of errors made by a BERT-based AES system, observing that 96.15 percent of the errors are within one point of the first human marker's prediction, on a scale of one to five, with 79.49 percent of predictions matching exactly. In contrast, additional human markers did not exceed 30 percent exact matches with the first marker, with 62.9 percent within one mark. These findings highlight the subjectivity inherent in essay grading, and underscore the potential for current AES technology to assist human markers to grade consistently across large classes.

7/17/2024

Transformer-based Joint Modelling for Automatic Essay Scoring and Off-Topic Detection

Sourya Dipta Das, Yash Vadi, Kuldeep Yadav

Automated Essay Scoring (AES) systems are widely popular in the market as they constitute a cost-effective and time-effective option for grading systems. Nevertheless, many studies have demonstrated that the AES system fails to assign lower grades to irrelevant responses. Thus, detecting the off-topic response in automated essay scoring is crucial in practical tasks where candidates write unrelated text responses to the given task in the question. In this paper, we are proposing an unsupervised technique that jointly scores essays and detects off-topic essays. The proposed Automated Open Essay Scoring (AOES) model uses a novel topic regularization module (TRM), which can be attached on top of a transformer model, and is trained using a proposed hybrid loss function. After training, the AOES model is further used to calculate the Mahalanobis distance score for off-topic essay detection. Our proposed method outperforms the baseline we created and earlier conventional methods on two essay-scoring datasets in off-topic detection as well as on-topic scoring. Experimental evaluation results on different adversarial strategies also show how the suggested method is robust for detecting possible human-level perturbations.

4/16/2024

Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression

Kun Sun, Rong Wang

Automated essay scoring (AES) involves predicting a score that reflects the writing quality of an essay. Most existing AES systems produce only a single overall score. However, users and L2 learners expect scores across different dimensions (e.g., vocabulary, grammar, coherence) for English essays in real-world applications. To address this need, we have developed two models that automatically score English essays across multiple dimensions by employing fine-tuning and other strategies on two large datasets. The results demonstrate that our systems achieve impressive performance in evaluation using three criteria: precision, F1 score, and Quadratic Weighted Kappa. Furthermore, our system outperforms existing methods in overall scoring.

6/4/2024