Exploring Fairness and Explainability in LLM-Generated Support for Online Learning Discussion Forums
DOI: https://doi.org/10.18608/jla.2025.8885

Keywords: sentiment bias, text generation, fairness, explainable analysis, large language model, online learning, research paper

Abstract
Large language models (LLMs) hold significant potential to enhance online learning by automating responses to learner queries and offering personalized, scalable support. However, concerns about bias in LLM-generated responses present challenges to their ethical and equitable use in educational settings. This study explores fairness and explainability in LLM-generated replies within online discussion forums. Specifically, we fine-tuned three state-of-the-art LLMs (GPT-2, Gemma, and LLaMA) on both the original MOOC Posts dataset and a counterfactual version of it. We then analyzed the sentiment patterns of the LLM-generated replies and compared them with human-generated responses. To quantify potential sentiment bias, we introduced absolute distributional sentiment divergence (ADSD), a measure of sentiment disparities across sensitive attributes, using gender as a case study. To mitigate bias and enhance transparency, we employed counterfactual fine-tuning, incorporating both factual and counterfactual data, and we used TIGERScore, a reference-free explainability metric, to assess response quality. Our findings reveal that LLM-generated responses are generally more neutral than human replies but exhibit varying degrees of sentiment bias across gender. Notably, counterfactual fine-tuning shows promise in reducing this bias, yielding more balanced sentiment distributions. Additionally, the explainability analysis indicates that while the newer models (Gemma and LLaMA) outperform GPT-2 in response quality, gaps in accuracy and comprehension remain. This study advances the understanding of bias mitigation and fairness evaluation in LLM-generated educational support, contributing to the development of more equitable, transparent, and responsible AI-driven tools for online learning environments.
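The abstract does not reproduce the ADSD formula itself. As a minimal sketch of one plausible reading — a divergence between binned sentiment-score distributions of replies for two groups (e.g., replies to posts by female- vs. male-identified learners) — the following hypothetical implementation computes the normalized sum of absolute bin-wise differences (total variation distance); the function name, binning scheme, and score range are assumptions for illustration, not the paper's exact definition:

```python
import numpy as np

def adsd(scores_a, scores_b, bins=10, lo=-1.0, hi=1.0):
    """Hypothetical sketch of an absolute distributional sentiment
    divergence: bin the sentiment scores of each group, normalize to
    probability distributions, and sum the absolute bin-wise gaps.
    Returns a value in [0, 1]: 0 = identical distributions."""
    hist_a, _ = np.histogram(scores_a, bins=bins, range=(lo, hi))
    hist_b, _ = np.histogram(scores_b, bins=bins, range=(lo, hi))
    p = hist_a / hist_a.sum()  # distribution for group A
    q = hist_b / hist_b.sum()  # distribution for group B
    return 0.5 * np.abs(p - q).sum()

# Identical sentiment distributions yield zero divergence;
# fully disjoint ones yield the maximum of 1.
print(adsd([0.1, 0.5, -0.2], [0.1, 0.5, -0.2]))  # 0.0
print(adsd([-0.9] * 5, [0.9] * 5))               # 1.0
```

Under this reading, counterfactual fine-tuning would aim to drive `adsd` toward zero across the gender-swapped reply sets.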
License
Copyright (c) 2024 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.