Learning As It Happens: A Decade of Analyzing and Shaping a Large-Scale Online Learning System

Frederik Coomans
Han L. J. van der Maas
Gunter Maris


With the advent of computers in education and the wide availability of online learning and practice environments, enormous amounts of data on learning have become available. This paper presents a decade of experience with analyzing and improving an online practice environment for math, which has thus far recorded over a billion responses. We present the methods we use to both steer and analyze this system in real time: scoring rules based on accuracy and response times, a tailored rating system that provides both learners and items with current ability and difficulty ratings, and an adaptive engine that matches learners to items. Moreover, we explore the quality of fit by means of prediction accuracy and parallel item reliability. Limitations and pitfalls are discussed by diagnosing sources of misfit, such as violations of unidimensionality and unforeseen dynamics. Finally, directions for development are discussed, including embedded learning analytics and a focus on online experimentation to evaluate both the system itself and the users’ learning gains. Though many challenges remain open, we believe that large steps have been made in providing methods to efficiently manage and research educational big data from a massive online learning system.
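To make the three components named in the abstract concrete, the following is a minimal Python sketch of how a scoring rule on accuracy and response time, a rating update for learners and items, and an item-matching step could fit together. It is an illustration under stated assumptions, not the system's actual implementation: the signed-residual-time form of the score, the expected-score formula, the step size `k`, the target expected score, and all function names are hypothetical choices that do not appear in the abstract.

```python
import math


def score(correct: bool, response_time: float, time_limit: float) -> float:
    """Assumed scoring rule: signed residual time, scaled to [-1, 1].

    Fast correct answers approach +1, fast errors approach -1,
    and any answer given at the time limit scores 0.
    """
    t = min(max(response_time, 0.0), time_limit)
    sign = 1.0 if correct else -1.0
    return sign * (time_limit - t) / time_limit


def expected_score(ability: float, difficulty: float) -> float:
    """Assumed expected score for the rule above: coth(x) - 1/x,
    where x = ability - difficulty. It lies in (-1, 1) and is 0 at x = 0.
    """
    x = ability - difficulty
    if abs(x) < 1e-4:
        return x / 3.0  # first-order series term; avoids division by ~0
    return 1.0 / math.tanh(x) - 1.0 / x


def update(ability: float, difficulty: float, observed: float, k: float = 0.1):
    """Elo-style update: the learner and the item move in opposite
    directions by k times the prediction error (k is a hypothetical value).
    """
    error = observed - expected_score(ability, difficulty)
    return ability + k * error, difficulty - k * error


def pick_item(ability: float, items: dict, target: float = 0.3) -> str:
    """Choose the item whose expected score is closest to a target
    (the target value is a hypothetical tuning choice).
    """
    return min(items, key=lambda name: abs(expected_score(ability, items[name]) - target))


if __name__ == "__main__":
    items = {"easy": -1.5, "medium": 0.0, "hard": 1.5}
    ability = 0.0
    chosen = pick_item(ability, items)
    s = score(correct=True, response_time=5.0, time_limit=20.0)
    ability, items[chosen] = update(ability, items[chosen], s)
    print(chosen, round(ability, 3), round(items[chosen], 3))
```

Because both ratings are adjusted after every response, a design of this kind needs no separate calibration phase for items, which is what makes real-time steering of a large item bank feasible in principle.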

DOI: https://doi.org/10.18608/jla.2018.52.3
