Evaluating 21st-Century Competencies in Postsecondary Curricula with Large Language Models
Performance Benchmarking and Reasoning-Based Prompting Strategies
DOI:
https://doi.org/10.18608/jla.2026.9127
Keywords:
curricular analytics, 21st century competencies, large language models (LLMs), prompt engineering, chain of thought (CoT), research paper
Abstract
The growing emphasis on 21st-century competencies in postsecondary education, intensified by the transformative impact of generative artificial intelligence (GenAI) on the economy and society, underscores the urgent need to evaluate how these competencies are embedded in curricula and how effectively academic programs align with evolving workforce and societal demands. Curricular analytics, particularly recent advancements powered by GenAI, offer a promising data-driven approach to this challenge. However, the analysis of 21st-century competencies requires pedagogical reasoning beyond surface-level information retrieval, and the capabilities of large language models (LLMs) in this context remain underexplored. In this study, we extend prior research on curricular analytics of 21st-century competencies across a broader range of curriculum documents, competency frameworks, and models. Using 7,600 manually annotated curriculum-competency alignment scores (38 competencies and 200 courses across five curriculum document types), we evaluate the informativeness of different curriculum document sources, benchmark the performance of general-purpose LLMs on mapping curricula to competencies, and analyze error patterns. We further introduce a reasoning-based prompting strategy, curricular chain-of-thought (CoT), to strengthen LLMs’ pedagogical reasoning. Our results show that detailed instructional activity descriptions are the most informative type of curriculum document for competency analytics. Open-weight LLMs achieve accuracy comparable to proprietary models on coarse-grained tasks, demonstrating their scalability and cost-effectiveness for institutional use. However, no model reaches human-level precision in fine-grained pedagogical reasoning. Our proposed curricular CoT yields modest improvements by reducing bias in instructional keyword inference and improving the detection of nuanced pedagogical evidence in long text.
Together, these findings highlight the untapped potential of institutional curriculum documents and provide an empirical foundation for advancing AI-driven curricular analytics.
License
Copyright (c) 2024 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.