Evaluating Algorithmic Bias in Models for Predicting Academic Performance of Filipino Students

2405.09821

Published 5/17/2024 by Valdemar Švábenský, Mélina Verger, Maria Mercedes T. Rodrigo, Clarence James G. Monterozo, Ryan S. Baker, Miguel Zenon Nicanor Lerias Saavedra, Sébastien Lallé, Atsushi Shimada

cs.LG cs.CY

Abstract

Algorithmic bias is a major issue in machine learning models in educational contexts. However, it has not yet been studied thoroughly in Asian learning contexts, and only limited work has considered algorithmic bias based on regional (sub-national) background. As a step towards addressing this gap, this paper examines the population of 5,986 students at a large university in the Philippines, investigating algorithmic bias based on students' regional background. The university used the Canvas learning management system (LMS) in its online courses across a broad range of domains. Over the period of three semesters, we collected 48.7 million log records of the students' activity in Canvas. We used these logs to train binary classification models that predict student grades from the LMS activity. The best-performing model reached AUC of 0.75 and weighted F1-score of 0.79. Subsequently, we examined the data for bias based on students' region. Evaluation using three metrics: AUC, weighted F1-score, and MADD showed consistent results across all demographic groups. Thus, no unfairness was observed against a particular student group in the grade predictions.

Overview

This paper evaluates the potential for algorithmic bias in models that predict the academic performance of Filipino students.
The researchers trained binary classification models on Canvas LMS activity logs from 5,986 students at a large university in the Philippines and checked whether prediction quality differed by students' regional (sub-national) background.
Across three metrics (AUC, weighted F1-score, and MADD), results were consistent for all groups, so no unfairness against a particular student group was observed.

Plain English Explanation

The researchers in this study looked at how well AI models could predict the grades of Filipino university students from their activity in an online learning platform. They wanted to see whether the region of the Philippines a student comes from might make the predictions less accurate or less fair for some groups. They found that the models worked about equally well for every group.

This is an important issue because AI is increasingly being used in education, such as to personalize learning or identify students who might need extra support. But if the AI models have built-in biases, they could end up disadvantaging certain groups of students. So the researchers set out to evaluate the fairness and accuracy of several machine learning models when applied to data on Filipino students.

By better understanding the potential for algorithmic bias in this context, the researchers hope to provide insights that can help developers create more equitable AI-powered education tools for use in the Philippines and similar settings. This could ensure that all students have an equal opportunity to succeed, regardless of their personal circumstances.

Technical Explanation

The paper starts from the observation that algorithmic bias is a major issue for machine learning models in educational contexts. The researchers note that it has not yet been studied thoroughly in Asian learning contexts, and that only limited work has considered bias based on regional (sub-national) background.

To address this gap, the study uses data from 5,986 students at a large university in the Philippines, which ran its online courses across a broad range of domains on the Canvas learning management system (LMS). Over three semesters, the researchers collected 48.7 million log records of student activity in Canvas and used them to train binary classification models that predict student grades. The best-performing model reached an AUC of 0.75 and a weighted F1-score of 0.79.
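Evaluating a classifier separately for each student group, using the AUC and weighted F1-score named in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes each student has a true binary label, a predicted probability, and a group label, and the function names, the 0.5 decision threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict


def auc(y_true, y_score):
    """AUC as the chance that a random positive outranks a random negative.

    Ties count as half a win. Returns NaN if one class is missing.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's true count."""
    total = 0.0
    for cls in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * (tp + fn)  # (tp + fn) is the class support
    return total / len(y_true)


def metrics_by_group(y_true, y_score, groups, threshold=0.5):
    """Compute AUC and weighted F1 separately for every group label."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(y_true, y_score, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    report = {}
    for g, (ys, ss) in by_group.items():
        preds = [1 if s >= threshold else 0 for s in ss]
        report[g] = {
            "n": len(ys),
            "auc": auc(ys, ss),
            "weighted_f1": weighted_f1(ys, preds),
        }
    return report


# Toy data (hypothetical): two groups of four students each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.8, 0.3, 0.4, 0.6]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for group, row in sorted(metrics_by_group(y_true, y_score, groups).items()):
    print(group, row)
```

A large gap between groups in either metric would be a signal worth investigating; similar values across groups match the kind of consistency the abstract reports. In practice, scikit-learn's `roc_auc_score` and `f1_score(average="weighted")` compute the same two quantities.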

The researchers then examined the predictions for bias based on students' region, using three metrics: AUC, weighted F1-score, and MADD. The results were consistent across all demographic groups, so no unfairness against a particular student group was observed in the grade predictions.
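The abstract names MADD only by its acronym. The sketch below assumes it refers to the Model Absolute Density Distance, which compares how a model's predicted probabilities are distributed for two groups: each group's scores are binned into a normalized histogram over [0, 1], and the absolute differences between the two histograms are summed. Under that assumption, 0 means identical distributions and 2 means no overlap at all. The function names, the default of 10 bins, and the toy scores are hypothetical choices for illustration, not values from the paper.

```python
def density_vector(scores, n_bins):
    """Share of scores falling into each of n_bins equal-width bins on [0, 1]."""
    if not scores:
        raise ValueError("each group needs at least one score")
    counts = [0] * n_bins
    for s in scores:
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(scores) for c in counts]


def madd(scores_g0, scores_g1, n_bins=10):
    """Sum of absolute differences between two groups' score histograms."""
    d0 = density_vector(scores_g0, n_bins)
    d1 = density_vector(scores_g1, n_bins)
    return sum(abs(a - b) for a, b in zip(d0, d1))


# Toy predicted probabilities (hypothetical) for two regional groups.
region_x = [0.15, 0.15, 0.85, 0.85]
region_y = [0.15, 0.85, 0.85, 0.85]
print(madd(region_x, region_y))
```

Unlike AUC or F1-score, this measure does not use the true labels: it asks only whether the model behaves differently across groups. The bin count affects the result, so values are comparable only when computed with the same binning.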

Because the metrics agreed across groups, the findings do not point to a need for bias mitigation in this setting. They do underline the importance of evaluating algorithmic fairness before deploying AI systems in high-stakes domains like education.

Critical Analysis

The paper provides a valuable contribution by examining algorithmic bias in the understudied context of Philippine higher education, and by looking at regional background, an attribute that only limited earlier work has considered. Evaluating the models with three metrics (AUC, weighted F1-score, and MADD) means the fairness assessment does not rest on a single measure.

However, the study draws on data from a single university, and the extent to which the findings generalize to other Filipino student populations or education systems is unclear. Additionally, the analysis described in the abstract covers only regional background, so it does not show whether the models are equally fair with respect to other student attributes.

Further research would be beneficial to test whether the absence of observed bias holds for other institutions, prediction tasks, and student attributes, and to keep AI-driven education tools equitable and inclusive, not just in the Philippines but globally. <a href="https://aimodels.fyi/papers/arxiv/fairness-bias-algorithmic-hiring-multidisciplinary-survey">Multidisciplinary perspectives</a> could also help uncover systemic factors that contribute to algorithmic bias in education.

Conclusion

This study makes an important contribution by examining the potential for algorithmic bias in models used to predict the academic performance of Filipino students. In a dataset of 5,986 students, prediction quality was consistent across regional groups, and no unfairness against a particular student group was observed. The work also shows how fairness can be evaluated carefully before AI systems are deployed in high-stakes educational contexts.

By providing these insights, the paper lays the groundwork for developing more equitable AI-powered tools to support student learning and success, not just in the Philippines but in other emerging economies as well. Continued research and multidisciplinary collaboration will be crucial to ensuring AI in education promotes opportunity and fairness for all.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards Region-aware Bias Evaluation Metrics

Angana Borah, Aparna Garimella, Rada Mihalcea

When exposed to human-generated data, language models are known to learn and amplify societal biases. While previous works introduced benchmarks that can be used to assess the bias in these models, they rely on assumptions that may not be universally true. For instance, a gender bias dimension commonly used by these metrics is that of family–career, but this may not be the only common bias in certain regions of the world. In this paper, we identify topical differences in gender bias across different regions and propose a region-aware bottom-up approach for bias assessment. Our proposed approach uses gender-aligned topics for a given region and identifies gender bias dimensions in the form of topic pairs that are likely to capture gender societal biases. Several of our proposed bias topic pairs are on par with human perception of gender biases in these regions in comparison to the existing ones, and we also identify new pairs that are more aligned than the existing ones. In addition, we use our region-aware bias topic pairs in a Word Embedding Association Test (WEAT)-based evaluation metric to test for gender biases across different regions in different data domains. We also find that LLMs have a higher alignment to bias pairs for highly-represented regions showing the importance of region-aware bias evaluation metric.

6/26/2024

cs.CL

A Principled Approach for a New Bias Measure

Bruno Scarone, Alfredo Viola, Ricardo Baeza-Yates

The widespread use of machine learning and data-driven algorithms for decision making has been steadily increasing over many years. The areas in which this is happening are diverse: healthcare, employment, finance, education, the legal system to name a few; and the associated negative side effects are being increasingly harmful for society. Negative data bias is one of those, which tends to result in harmful consequences for specific groups of people. Any mitigation strategy or effective policy that addresses the negative consequences of bias must start with awareness that bias exists, together with a way to understand and quantify it. However, there is a lack of consensus on how to measure data bias and oftentimes the intended meaning is context dependent and not uniform within the research community. The main contributions of our work are: (1) a general algorithmic framework for defining and efficiently quantifying the bias level of a dataset with respect to a protected group; and (2) the definition of a new bias measure. Our results are experimentally validated using nine publicly available datasets and theoretically analyzed, which provide novel insights about the problem. Based on our approach, we also derive a bias mitigation algorithm that might be useful to policymakers.

5/22/2024

cs.LG cs.CY

Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Yun-Wei Chu, Seyyedali Hosseinalipour, Elizabeth Tenorio, Laura Cruz, Kerrie Douglas, Andrew Lan, Christopher Brinton

Conventional methods for student modeling, which involve predicting grades based on measured activities, struggle to provide accurate results for minority/underrepresented student groups due to data availability biases. In this paper, we propose a Multi-Layer Personalized Federated Learning (MLPFL) methodology that optimizes inference accuracy over different layers of student grouping criteria, such as by course and by demographic subgroups within each course. In our approach, personalized models for individual student subgroups are derived from a global model, which is trained in a distributed fashion via meta-gradient updates that account for subgroup heterogeneity while preserving modeling commonalities that exist across the full dataset. The evaluation of the proposed methodology considers case studies of two popular downstream student modeling tasks, knowledge tracing and outcome prediction, which leverage multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums) in model training. Experiments on three real-world online course datasets show significant improvements achieved by our approach over existing student modeling benchmarks, as evidenced by an increased average prediction quality and decreased variance across different student subgroups. Visual analysis of the resulting students' knowledge state embeddings confirm that our personalization methodology extracts activity patterns clustered into different student subgroups, consistent with the performance enhancements we obtain over the baselines.

5/29/2024

cs.LG cs.AI cs.CY

When mitigating bias is unfair: multiplicity and arbitrariness in algorithmic group fairness

Natasa Krco, Thibault Laugel, Vincent Grari, Jean-Michel Loubes, Marcin Detyniecki

Most research on fair machine learning has prioritized optimizing criteria such as Demographic Parity and Equalized Odds. Despite these efforts, there remains a limited understanding of how different bias mitigation strategies affect individual predictions and whether they introduce arbitrariness into the debiasing process. This paper addresses these gaps by exploring whether models that achieve comparable fairness and accuracy metrics impact the same individuals and mitigate bias in a consistent manner. We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions: Impact Size (how many people were affected), Change Direction (positive versus negative changes), Decision Rates (impact on models' acceptance rates), Affected Subpopulations (who was affected), and Neglected Subpopulations (where unfairness persists). This framework is intended to help practitioners understand the impacts of debiasing processes and make better-informed decisions regarding model selection. Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods. These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.

5/24/2024

cs.LG stat.ML