Balancing Act
Early, Fair, and Accurate Identification of At-Risk Students
DOI: https://doi.org/10.18608/jla.2025.8761

Keywords: algorithmic bias, identifying at-risk students, machine learning, learning analytics, K-12 students, AP statistics, research paper

Abstract
Machine learning algorithms are widely used to identify at-risk students. Current research focuses on the timeliness and accuracy of predictions, which has led to a heavy reliance on demographic data and, with it, serious bias concerns. This study develops fairness-aware machine learning models to identify at-risk students in high school Advanced Placement (AP) statistics, a course in which student performance is closely linked to demographic background. We evaluated the predictive performance and bias mitigation strategies of various machine learning algorithms. To determine the optimal time for accurate and fair identification of at-risk students, we divided the dataset into three stages corresponding to the course's progress. At each stage, we examined model performance and fairness across groups defined by race, gender, and eligibility for free/reduced-price lunch. Our findings suggest that by Stage 1 (i.e., up to the first unit review assignment), the models already identified at-risk students effectively while maintaining fairness across demographic groups. We also found that incorporating more learning activity data reduced the potential bias caused by overreliance on demographic information. We further examined how different bias mitigation approaches, as well as the exclusion of sensitive features, affect predictive accuracy and fairness, and we discuss the implications for designing more context-specific solutions in educational settings.
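The per-group fairness evaluation the abstract describes is typically operationalized with group-fairness metrics such as demographic parity difference and equal opportunity difference. The sketch below is an illustrative reimplementation under assumed conventions, not the authors' code; the helper names and the toy predictions/groups are invented for the example.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Gap between the highest and lowest positive-prediction
    (flagged-as-at-risk) rates across demographic groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equal_opportunity_difference(y_true, y_pred, group):
    """Gap between the highest and lowest true-positive rates
    (recall on truly at-risk students) across groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)
        tprs.append(y_pred[mask].mean())
    return max(tprs) - min(tprs)

# Toy data: 1 = at-risk; two demographic groups A and B.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

dpd = demographic_parity_difference(y_pred, group)        # 0.25
eod = equal_opportunity_difference(y_true, y_pred, group)  # 1/3
```

A model is considered fairer at a given stage when both gaps are close to zero for every sensitive attribute (race, gender, free/reduced-price lunch eligibility); libraries such as Fairlearn and AIF360, cited in the full paper, provide production versions of these metrics.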
Copyright (c) 2024 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.