Improving On-Time Undergraduate Graduation Rate For Undergraduate Students Using Predictive Analytics

Read original: arXiv:2407.10253 - Published 7/16/2024 by Ramineh Lopez-Yazdani, Roberto Rivera

Overview

The paper examines the issue of low on-time graduation rates among universities in Puerto Rico compared to the mainland United States.
It aims to develop a predictive model that can accurately identify students at risk of not graduating on time.
The researchers evaluated various predictive models using a dataset from the University of Puerto Rico at Mayaguez.
The boosting model trained on an oversampled dataset performed best at predicting who would not graduate on time.

Plain English Explanation

The paper looks at a problem faced by universities in Puerto Rico: many students do not graduate on time. This matters because late graduation can have significant negative consequences for the students, the schools, and the local economy. The researchers wanted a way to predict which students are most likely not to graduate on time, so that universities can help those students and improve their graduation rates.

They tested different machine learning models using data from the University of Puerto Rico at Mayaguez. The model that best predicted which students would not graduate on time was a "boosting" model, which combines many simple models into one stronger predictor. It was trained on an oversampled dataset, meaning extra examples of the less common outcome were added to the training data so that students who graduated on time and those who did not were more evenly represented.

The key idea is that by being able to accurately predict which students are at risk of not graduating on time, the universities can try to provide extra support and resources to help those students succeed and graduate on schedule. This could have big benefits for the students, the schools, and the local economy.

Technical Explanation

The researchers developed and evaluated various predictive models using a dataset containing information on 24,432 undergraduate students at the University of Puerto Rico at Mayaguez. They looked at two different scenarios:

Group I models used both first-year college factors and pre-college factors as inputs.
Group II models only used pre-college factors as inputs.
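The two scenarios differ only in which input columns a model is allowed to see. The following minimal sketch illustrates that split; the feature names are hypothetical placeholders chosen for illustration and are not taken from the paper's dataset.

```python
# Hypothetical feature names (assumptions, not the paper's actual variables).
PRE_COLLEGE = ["high_school_gpa", "admission_test_score"]
FIRST_YEAR = ["first_year_gpa", "first_year_credits"]

# Group I sees pre-college and first-year factors; Group II sees pre-college only.
FEATURE_GROUPS = {
    "group_1": PRE_COLLEGE + FIRST_YEAR,
    "group_2": PRE_COLLEGE,
}


def select_features(record, group):
    """Return the feature vector for one student under the given scenario."""
    return [record[name] for name in FEATURE_GROUPS[group]]


# Made-up student record for illustration only.
student = {
    "high_school_gpa": 3.6,
    "admission_test_score": 610,
    "first_year_gpa": 3.1,
    "first_year_credits": 30,
}
print(select_features(student, "group_1"))
print(select_features(student, "group_2"))
```

A Group II model can be applied before a student has taken any college courses, while a Group I model has to wait until first-year results are available.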

For both scenarios, the boosting model trained on an oversampled dataset performed best at predicting which students would not graduate on time. Oversampling rebalances the training set by adding instances of the under-represented outcome class, for example by duplicating existing records, so that the model sees both outcomes more evenly during training.
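One simple form of oversampling is random duplication. The sketch below shows it in plain Python. It is an illustration under assumptions: this summary does not say which oversampling variant the paper used, the function name `random_oversample` is made up here, and the toy labels are arbitrary. The boosting learner itself is not shown.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class is as large as the largest one. Returns new (rows, labels) lists;
    the inputs are left unchanged."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (made up): one numeric feature per student, two outcome classes.
rows = [[3.1], [2.4], [3.8], [2.9], [3.5], [2.2]]
labels = [0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated records do not leak into the data used to evaluate the model.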

The researchers also experimented with other models like logistic regression, decision trees, and neural networks. However, the boosting model consistently outperformed these other approaches in terms of accurately identifying at-risk students.
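Comparing models on how well they flag at-risk students usually means looking at class-specific metrics rather than overall accuracy. The sketch below computes recall and precision for the "did not graduate on time" class and picks the model with the highest recall. The choice of metric, the model names, and the predictions are all assumptions for illustration; this summary does not state which metric the paper used for model selection.

```python
def recall_precision(y_true, y_pred, positive=1):
    """Recall and precision for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Made-up labels: 1 = did not graduate on time, 0 = graduated on time.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]

# Hypothetical predictions from two candidate models.
predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 0, 0, 0, 1, 0, 0],
}

scores = {name: recall_precision(y_true, pred) for name, pred in predictions.items()}
best = max(scores, key=lambda name: scores[name][0])  # rank by recall
print(scores, best)
```

Ranking by recall favors models that miss fewer at-risk students, at the cost of possibly flagging some students who would have graduated on time anyway.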

Critical Analysis

The paper provides a valuable contribution by demonstrating the potential of using machine learning to proactively identify students who may struggle to graduate on time. However, there are a few important caveats to consider:

The research was conducted using data from a single university, so the findings may not generalize to other institutions in Puerto Rico or beyond. Further studies across a broader range of universities would be helpful.
The paper does not delve into the specific factors or student characteristics that the models used to make their predictions. Understanding these "black box" decision-making processes could lead to more interpretable and actionable insights.
While the predictive performance of the models is promising, the paper does not explore how the universities could actually intervene to support the identified at-risk students. More research is needed on effective retention and support strategies.

Overall, this research represents an important step forward, but there are still opportunities to build on these findings and develop more comprehensive solutions to the problem of low on-time graduation rates.

Conclusion

This paper tackles the critical issue of low on-time graduation rates at universities in Puerto Rico. By developing a predictive model that can accurately identify students at risk of not graduating on time, the researchers have laid the foundation for universities to proactively intervene and provide the necessary support to help these students succeed.

The strong performance of the boosting model, especially when trained on an oversampled dataset, suggests that advanced machine learning techniques can be valuable tools for addressing this problem. If universities can use these insights to implement effective retention and support strategies, it could lead to substantial benefits for students, institutions, and the broader Puerto Rican economy.

More broadly, the study shows how data-driven approaches can be applied to complex challenges in higher education, and predictive modeling is likely to play a growing role in such efforts.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Improving On-Time Undergraduate Graduation Rate For Undergraduate Students Using Predictive Analytics

Ramineh Lopez-Yazdani, Roberto Rivera

The on-time graduation rate among universities in Puerto Rico is significantly lower than in the mainland United States. This problem is noteworthy because it leads to substantial negative consequences for the student, both socially and economically, the educational institution and the local economy. This project aims to develop a predictive model that accurately detects students early in their academic pursuit at risk of not graduating on time. Various predictive models are developed to do this, and the best model, the one with the highest performance, is selected. Using a dataset containing information from 24432 undergraduate students at the University of Puerto Rico at Mayaguez, the predictive performance of the models is evaluated in two scenarios: Group I includes both the first year of college and pre-college factors, and Group II only considers pre-college factors. Overall, for both scenarios, the boosting model, trained on the oversampled dataset, is the most successful at predicting who will not graduate on time.

7/16/2024

Unified Prediction Model for Employability in Indian Higher Education System

Pooja Thakar, Anil Mehta, Manisha

Educational Data Mining has become extremely popular among researchers in the last decade. Prior effort in this area was directed only towards predicting a student's academic performance. Very little research is directed towards predicting a student's employability, i.e., predicting students' performance in campus placements at an early stage of enrollment. Furthermore, existing research on student employability prediction is not universal in approach and is based on only one type of course or university/institute. Hence, it is not scalable from one context to another. Given the need for unification, data on students in professional technical courses, namely Bachelor in Engineering/Technology and Masters in Computer Applications, have been collected from 17 states of India. To deal with such data, a unified predictive model has been developed and applied to the datasets of the 17 states. The research in this paper proves that the model has universal application and can be applied to various states and institutes across India with different cultural backgrounds and course structures. This paper also explores and proves statistically that there is no significant difference in the Indian education system across states as far as prediction of student employability is concerned. The model provides a generalized solution for student employability prediction in the Indian scenario.

7/26/2024

Forecasting Success of Computer Science Professors and Students Based on Their Academic and Personal Backgrounds

Ghazal Kalhor, Behnam Bahrak

After completing their undergraduate studies, many computer science (CS) students apply for competitive graduate programs in North America. Their long-term goal is often to be hired by one of the big five tech companies or to become a faculty member. Therefore, being aware of the role of admission criteria may help them choose the best path towards their goals. In this paper, we analyze the influence of students' previous universities on their chances of being accepted to prestigious North American universities and returning to academia as professors in the future. Our findings demonstrate that the ranking of their prior universities is a significant factor in achieving their goals. We then illustrate that there is a bias in the undergraduate institutions of students admitted to the top 25 computer science programs. Finally, we employ machine learning models to forecast the success of professors at these universities. We achieved an RMSE of 7.85 for this prediction task.

8/1/2024

Inside the Black Box: Detecting and Mitigating Algorithmic Bias across Racialized Groups in College Student-Success Prediction

Denisa Gándara, Hadis Anahideh, Matthew P. Ison, Lorenzo Picchiarini

Colleges and universities are increasingly turning to algorithms that predict college-student success to inform various decisions, including those related to admissions, budgeting, and student-success interventions. Because predictive algorithms rely on historical data, they capture societal injustices, including racism. In this study, we examine how the accuracy of college student success predictions differs between racialized groups, signaling algorithmic bias. We also evaluate the utility of leading bias-mitigating techniques in addressing this bias. Using nationally representative data from the Education Longitudinal Study of 2002 and various machine learning modeling approaches, we demonstrate how models incorporating commonly used features to predict college-student success are less accurate when predicting success for racially minoritized students. Common approaches to mitigating algorithmic bias are generally ineffective at eliminating disparities in prediction outcomes and accuracy between racialized groups.

7/12/2024