Modelability as a Strategy for Improving the Generalizability and Scalability of Predictive Models
DOI: https://doi.org/10.18608/jla.2026.9099

Keywords: modelability, generalizability, scalability, cross-institutional research, predictive learning analytics, research paper

Abstract
Learning analytics has the potential to enhance education through data-informed decision-making, but persistent challenges around generalizability and scalability continue to limit its real-world impact. In this paper, we introduce the concept of a modelable world: a learning ecosystem purposefully designed to support the development of predictive models that generalize across diverse contexts. We outline three core design principles of modelability: (1) valid and interpretable measurements, (2) scalable and stable implementation, and (3) a collaborative research–practice–technology ecosystem. We then illustrate how these principles can be operationalized in the real world through a case study of CourseKata, a platform offering a fully instrumented online textbook adopted across a wide range of institutions and disciplines. Using CourseKata data, we developed early prediction models of students' final course grades from behavioral measures and tested their generalizability across institutions (something rarely done in the modeling literature). Results show that a system designed with modelability in mind can produce predictive models that generalize effectively across diverse educational contexts.
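The cross-institutional generalizability test the abstract describes can be sketched with a leave-one-group-out evaluation, where each "group" is an institution held out entirely from training. The sketch below is illustrative only: the synthetic features, the random-forest classifier, and the number of institutions are assumptions for demonstration, not the paper's actual CourseKata pipeline.

```python
# Hedged sketch: leave-one-institution-out evaluation of an early-warning
# grade predictor. Features, model, and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 300
# Toy early-course behavioral features (e.g., pages read, exercise attempts)
X = rng.normal(size=(n, 4))
# Toy binary outcome (at-risk vs. not), loosely tied to the features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
# Institution label per student: the model is always tested on an
# institution it never saw during training
groups = rng.integers(0, 5, size=n)

aucs = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))

# One AUC per held-out institution; stable values across institutions
# would indicate the kind of generalizability the paper targets
print("Held-out-institution AUCs:", np.round(aucs, 3))
```

Reporting one metric per held-out institution, rather than a single pooled cross-validation score, is what distinguishes a cross-institutional generalizability claim from ordinary within-sample validation.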
Copyright (c) 2024 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.