Shapes of Educational Data in an Online Calculus Course




Online calculus, Markov chain, clickstream, sequence analysis


This paper describes investigations in visualizing logpaths of students in an online calculus course held at Florida State University in 2014. The clickstreams that make up the logpaths can be used to visualize student progress through the information space of the course as a graph. The graded activities are the nodes of the graph, and information extracted from the logpaths between graded activities labels its edges. We show that this graph is associated with a Markov chain in which the states are the graded activities and the weight of each edge is proportional to the probability of the corresponding transition. Visualizing the graph makes it apparent that most students follow the course sequentially, section after section. The model allows us to study, by applying sequence analysis to the information buried in their clickstreams, how different groups of students employ the learning resources.
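As an illustration of the construction described in the abstract, the following minimal Python sketch estimates transition probabilities between graded activities from a set of student paths. It is not the paper's implementation: the function name `transition_probabilities`, the activity labels `quiz_1` to `quiz_3`, and the toy paths are hypothetical, and each logpath is assumed to have already been reduced to the ordered list of graded activities a student visited.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """Estimate first-order Markov transition probabilities.

    `paths` is an iterable of sequences of state labels (here, graded
    activities). Returns a nested dict: probs[src][dst] is the fraction
    of observed transitions out of `src` that go to `dst`.
    """
    counts = defaultdict(Counter)
    for path in paths:
        # Count each consecutive pair (src, dst) along the path.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        # Normalize counts so the outgoing weights of each state sum to 1.
        probs[src] = {dst: n / total for dst, n in outgoing.items()}
    return probs


# Hypothetical toy data: four students, three graded activities.
paths = [
    ["quiz_1", "quiz_2", "quiz_3"],
    ["quiz_1", "quiz_2", "quiz_3"],
    ["quiz_1", "quiz_3", "quiz_2"],
    ["quiz_1", "quiz_2", "quiz_1", "quiz_2", "quiz_3"],
]

probs = transition_probabilities(paths)
for src in sorted(probs):
    for dst in sorted(probs[src]):
        print(f"{src} -> {dst}: {probs[src][dst]:.2f}")
```

In this toy data the heaviest edges are the sequential ones (`quiz_1` to `quiz_2`, `quiz_2` to `quiz_3`), which is the kind of pattern the abstract reports for the actual course. The resulting nested dictionary can serve as the weighted edge list of the graph.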






How to Cite

Caprotti, O. (2017). Shapes of Educational Data in an Online Calculus Course. Journal of Learning Analytics, 4(2), 76–90.



Special section: Shape of Educational Data