Introduction
The integration of Artificial Intelligence (AI) in mathematics education has spurred numerous research initiatives at the international level (Shin, 2022; Hsu et al., 2021; Zhou, 2023). These initiatives primarily focus on analyzing the impact of AI on teaching and learning mathematics (Hwang & Tu, 2021; Zhang & Aslan, 2021), driven by the increasing and progressive use of AI by students to tackle school challenges. Consequently, it is essential to identify how educational research can provide guidelines that positively impact teaching and learning.
Initially, AI concentrated on various problem-solving tasks, such as theorem proving or playing chess, which are linked to decision-making and traditionally modeled by decision trees to devise problem-solving strategies (Abeliuk & Gutiérrez, 2021). Currently, AI is integrated into various domains and use cases, including facial recognition technology, language learning, image processing, and natural language processing. Education, particularly mathematics, has emerged as an area of interest for AI applications in teaching and learning processes (Zhang & Aslan, 2021). In this regard, Jara and Ochoa (2020) highlight AI’s role in personalizing learning, fostering student collaboration, and using games as learning experiences. Additionally, AI has been closely combined with Information and Communication Technologies (ICT), which, in educational settings, have been mediated by public policies and various technological waves leading to the implementation of digital whiteboards, tablets, computers, Internet access, and other technological resources with AI integration (Jara & Ochoa, 2020).
These technological waves also influence the teaching and learning of mathematics. According to Bakker et al. (2021), mathematics education faces challenges such as incorporating new teaching approaches, conducting research in diverse domains, utilizing low-technology resources, maintaining online presence, and conducting online assessments. AI is recognized as a driver that can contribute to the solution of some of these problems (Chen et al., 2020; Hwang & Tu, 2021). For instance, Zhou (2023) developed a program based on computer-assisted personalized learning, which demonstrated improvements in students’ academic performance and motivation across five different subjects.
In this context, and based on the literature reviewed, efforts to systematize the state of the art on AI in mathematics education are evident when analyzing systematic reviews in the English language (e.g., Zhang & Aslan, 2021; Mohamed et al., 2022). However, in the context of Latin American research, there are still no comprehensive reviews of the literature on how different technologies can be utilized in education. Therefore, this systematic review aims to analyze the current state of research in AI in mathematics education, its applications, and its role in teaching and learning processes, as well as the purposes and methods used. This analysis is expected to enhance the understanding of the concepts underpinning AI research, with implications for mathematics education, and to encourage further research.
Conceptual Framework
Artificial Intelligence
The International Conference on Artificial Intelligence and Education, held in Beijing in 2019, reached a consensus on the implementation of AI in education. The conference identified integrating AI planning in educational policies as a key area for several tasks: management and participation in education, support for teaching and teachers, learning and assessment of learning, monitoring, evaluation, and research (UNESCO, 2019). This underscores the growing significance of AI in the educational sector and the various aspects of teaching that can benefit from it. Therefore, it is necessary to clarify certain terms or concepts used in the AI field to refer to specific technologies used in education, including intelligent tutoring systems, machine learning, chatbots, and robotics.
Intelligent Tutoring Systems
Intelligent Tutoring Systems (ITS) are computerized learning environments that mimic a teacher’s teaching style to provide student support in a way that adapts to their learning needs and profiles (Erümit & Çetin, 2020; Lippert et al., 2020; Sharma & Harkishan, 2022). In other words, ITS adapt to the content or concepts, teaching methods, and needs of individual students (Lippert et al., 2020). From this perspective, one of its primary functions is “to assess students’ knowledge acquisition during the educational process” (Erümit & Çetin, 2020, p. 4478).
Machine Learning
Machine Learning is a type of AI that automates data analysis methods. This automation develops algorithms that enable learning from data and making predictions (Alenezi & Faisal, 2020; Chen et al., 2020; Webb et al., 2021). Machine learning can build intelligent applications whose behavioral systems can mimic the human brain; these applications can be controlled through human-computer interaction (Chen et al., 2020).
Chatbots
Chatbots are conversational agents, i.e., computer software capable of engaging in conversations or simulating communication to provide information and services through interaction in common or everyday language (Følstad et al., 2021; Liu et al., 2020). In the educational environment, chatbots can help personalize and enrich the learning environment (Liu et al., 2020). Additionally, they can support students with course content, assignments, study resources, individual interaction, or collaborative activities (Kuhail et al., 2023).
Robotics
Educational robotics is defined by Mendoza-Hernández et al. (2020) as “a pedagogical approach that becomes a teaching strategy for different areas such as mathematics, science, and computer science. This approach creates a learning environment where the student plays a key role” (p. 7). Robotics practices in education can promote mental representations of abstract concepts, increase motivation, enhance teamwork, and foster persistence when students face complex and challenging scenarios (Kopcha et al., 2017).
Based on the aforementioned concepts, this systematic review of the literature on artificial intelligence in the field of mathematics education aims to show the progress of research in this area over the last five years. We pose the following research questions:
How are artificial intelligence studies characterized according to the country of research conduct, the year of publication, the type of research, the use of research methods, and the educational level?
What are the uses or roles of artificial intelligence in mathematics education in the studies analyzed?
Methodology of the Systematic Literature Review
Search Strategies and Article Selection Procedures
To determine the scope of artificial intelligence research in mathematics education, a systematic literature review was conducted, defined as “a review of existing studies that use rigorous, explicit, and accountable research methods” (Gough et al., 2012, p. 6). The systematic review followed the guidelines of the PRISMA 2020 (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement (Page et al., 2021). The search was conducted up to June 20, 2023, in the following databases: Web of Science (WoS), Scopus, and SciELO.
The terms or words used in the search string or equation were consistent with the UNESCO thesaurus: artificial intelligence, education, and mathematics. Boolean and asterisk operators were employed to refine the search. The search strings used to retrieve the items are shown in Table 1 below.
Table 1 Search strings used
Database | Search items |
---|---|
Scopus | TITLE-ABS-KEY (“artificial intelligence” OR “AI”) AND TITLE-ABS-KEY (education) AND TITLE-ABS-KEY (math*) |
WoS | ALL= ((“AI” OR “artificial intelligence”) AND (“education”) AND (math*)) |
SciELO | (“AI” OR “artificial intelligence”) AND (“education”) AND (math*) |
Note: Own research source.
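To make the Boolean structure of Table 1 explicit, the following minimal Python sketch assembles the three search strings from the shared term groups. The helper names (`or_group`, `build_queries`) are hypothetical and introduced only for illustration; the terms and field tags are copied from Table 1, with straight quotes in place of typographic ones. It is a sketch of how such strings can be composed, not the tool used in this review.

```python
# Hypothetical helpers for illustration; terms and field tags come from Table 1.
AI_TERMS = ["artificial intelligence", "AI"]
EDUCATION_TERM = "education"
MATH_TERM = "math*"  # the asterisk truncates the term (math, maths, mathematics, ...)


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_queries():
    """Return one search string per database, mirroring Table 1."""
    scopus = (
        f"TITLE-ABS-KEY {or_group(AI_TERMS)} "
        f"AND TITLE-ABS-KEY ({EDUCATION_TERM}) "
        f"AND TITLE-ABS-KEY ({MATH_TERM})"
    )
    # WoS and SciELO share the same core expression (AI terms in reverse order).
    core = (
        f"{or_group(list(reversed(AI_TERMS)))} "
        f'AND ("{EDUCATION_TERM}") AND ({MATH_TERM})'
    )
    return {"Scopus": scopus, "WoS": f"ALL= ({core})", "SciELO": core}


if __name__ == "__main__":
    for database, query in build_queries().items():
        print(f"{database}: {query}")
```

Keeping the term groups in one place means that a change to a term (for example, adding a synonym for AI) propagates to all three database strings at once.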
The search focused on articles published in English and Spanish as of June 20, 2023, that relate artificial intelligence to various areas of mathematics education, including teaching, learning, assessment, and other topics. The search yielded 9,144 studies addressing AI at different educational levels in mathematics. After applying the inclusion and exclusion criteria outlined in Table 2, 32 articles were selected.
The article selection process was conducted using the steps proposed by Page et al. (2021): identification, selection, and inclusion. During the identification phase, the search strings shown in Table 1 were implemented. The search across three databases yielded 9,144 articles based on titles, abstracts, and keywords. In the initial selection phase, we used the refine function of the electronic databases to exclude 8,958 articles. These exclusions were based on the type of publication (e.g., book chapters, conference proceedings, or books), language (other than Spanish or English), subject areas (other than social sciences or educational research), and publication years (prior to 2019). In the final selection phase, we reviewed the abstracts of 173 articles that were potentially relevant to this manuscript. In the last inclusion stage, we included 29 articles in the systematic review. Figure 1 presents the flow chart of the article selection process.
Table 2 Inclusion and exclusion criteria
Inclusion criteria | Exclusion criteria |
---|---|
I1: Studies at all school levels of mathematics education | E1: Studies in a discipline other than mathematics |
I2: Studies focused on artificial intelligence in future teachers, students of school education or tertiary education related to mathematics. | E2.1: Studies that do not focus on incorporating artificial intelligence into mathematics education |
| E2.2: Studies that mention artificial intelligence but do not focus on it |
I3: Studies published in English or Spanish | E3: Studies not published in English or Spanish |
I4: Articles | E4: Conference proceedings, books, press articles and book chapters |
I5: Articles in the final stage of publication | E5: Articles in press |
I6: Studies indexed in the WoS, Scopus and SciELO databases | E6: Studies not indexed in any of the databases included in the study |
I7: Studies conducted between 2019-2023 | E7: Studies conducted before 2019 |
Note: Own research source.
Data Analysis
This analysis included 29 articles. Initially, a content analysis of the eligible studies was performed, which were reviewed in depth. Subsequently, a coding scheme was developed to relate them to the research questions according to the following categories:
Bibliometric indicators for the selected studies
Research methodologies implemented
Role of artificial intelligence in studies
Concepts addressed in studies related to mathematics education.
Regarding the coding of the bibliometric and methodological characteristics of the studies, we followed the guidelines proposed by Cevikbas et al. (2022). The analysis continued with content analysis techniques (Cáceres, 2003), focusing on the proposed categories of analysis (deductive). For the first research question, the bibliometric and methodological characteristics identified in each study were subcategorized using the following criteria: year of publication, geographic distribution, research method, sample/participants, level of education, sample size, data collection methods, and role of AI.
Results of the Systematic Review of the Literature on Mathematics Education and Artificial Intelligence
Characteristics of the Studies and Research Methodologies of the Articles
Types of Documents and Years of Publication
The 29 articles included in the analyses were published in 21 different scientific journals: 8 technology and education journals, mathematics education journals, 3 education journals, 3 interdisciplinary journals, 2 distance education journals, and 1 education journal. Articles published in mathematics education journals represent 14% of the total, while those published in technology and education journals account for 25%. Regarding the years of publication, there has been an increasing trend in the production of scientific articles over the past five years, as illustrated in Graph 1. Few articles were published in 2023, possibly because the year has not yet concluded and several journals are published semi-annually.
Geographic Distribution
The geographic distribution of authors was determined based on the affiliations reported in the articles. An analysis revealed that 80% of the articles have between one and five authors. Table 3 shows the distribution of authors across 19 countries, with the United States (29.8%) and China (16.3%) being the most represented. However, when analyzed by continent, most authors are from Europe and Asia, followed by the Americas.
Table 3 Geographical distribution of authors according to their affiliation
Countries | Frequency | Percentage |
---|---|---|
Germany | 13 | 12.5 |
Canada | 3 | 2.9 |
China | 17 | 16.3 |
Colombia | 2 | 1.9 |
South Korea | 2 | 1.9 |
United Arab Emirates | 2 | 1.9 |
Fiji | 2 | 1.9 |
Indonesia | 3 | 2.9 |
Italy | 5 | 4.8 |
Jordan | 1 | 1 |
Kazakhstan | 7 | 6.7 |
Norway | 3 | 2.9 |
Oman | 1 | 1 |
Portugal | 4 | 3.8 |
United Kingdom | 1 | 1 |
Russia | 1 | 1 |
Sweden | 4 | 3.8 |
Taiwan | 2 | 1.9 |
United States | 31 | 29.8 |
Total | 104 | 100 |
Note: Own research source.
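The percentage column of Table 3 follows directly from the frequency column: each share is the country's author count divided by the total number of authors (104), expressed as a percentage and rounded to one decimal place. The short Python sketch below reproduces that arithmetic; the function name `percentage_shares` is hypothetical, and the counts are copied from Table 3.

```python
# Author counts per country, copied from Table 3.
AUTHORS_BY_COUNTRY = {
    "Germany": 13, "Canada": 3, "China": 17, "Colombia": 2,
    "South Korea": 2, "United Arab Emirates": 2, "Fiji": 2,
    "Indonesia": 3, "Italy": 5, "Jordan": 1, "Kazakhstan": 7,
    "Norway": 3, "Oman": 1, "Portugal": 4, "United Kingdom": 1,
    "Russia": 1, "Sweden": 4, "Taiwan": 2, "United States": 31,
}


def percentage_shares(counts):
    """Return each entry's share of the total, in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    shares = percentage_shares(AUTHORS_BY_COUNTRY)
    # List countries from the largest to the smallest share.
    for country, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{country}: {share}%")
```

The same function can be applied to any frequency table in this review, for instance to group the counts by continent before computing the shares.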
Research Designs and Data Collection Methods
The articles included in the analyses comprise 20 empirical studies, 3 theoretical studies, and 6 studies that present the implementation of AI in the field of education. As a general trend, they demonstrate successful applications of AI in educational environments. The analysis revealed that 27.6% of the articles used quantitative methods, 24.1% used qualitative methods, and 10.3% used a combination of both qualitative and quantitative methodologies (see Graph 2). This relates to the fact that 9 articles discussed online programs based on artificial intelligence applied to mathematics education, though these were not empirically tested. The approaches reported in the qualitative and quantitative articles include design-based studies, phenomenological and ethnographic approaches, case studies, and experimental designs.
In the 29 articles reviewed, data collection methods were analyzed to identify the number and types of instruments used in the research. A total of 43 instruments were employed, including questionnaires, interviews, tests, and various observation tools (see Graph 3). The most frequently used instruments were questionnaires (40.4%) and interviews (19.2%). Regarding the number of instruments per article, it was found that 4 instruments were used in one study, while 1 and 2 data collection instruments were most commonly used, at 47% and 32% (n=20), respectively.
Sample, Sizes and Education Levels of Survey Participants
In the 29 studies analyzed, the sample, its size, and the educational levels of the participants were examined and categorized based on the information provided by the authors in each article. Twenty-four percent of the studies had a sample size of fewer than 50 participants, and the same percentage applied to studies with sample sizes between 101 and 200 participants. Studies that did not mention or did not apply sample sizes (theoretical articles) represented 20% of articles reviewed (see Table 4).
Table 4 Sample size used in the studies
Sample size | Frequency | Percentage |
---|---|---|
0-50 | 7 | 24.1 |
51-100 | 2 | 6.9 |
101-200 | 7 | 24.1 |
201-500 | 4 | 13.8 |
>500 | 3 | 10.3 |
Not mentioned | 1 | 3.4 |
Not Applicable | 5 | 17.2 |
Total | 29 | 100 |
Note: Own research source.
Regarding the educational levels of the study participants, Graph 4 shows that most of them were primary and secondary school students, at 13.8% and 34.5%, respectively. Only one article considered both students and teachers as participants. Three articles reported studies with teachers as participants, while 2 articles focused on future teachers. In addition, we subcategorized the higher education level into university students and future teachers.
Analysis of Artificial Intelligence Concepts
Given that the studies link some AI concepts in the articles, this information was inductively analyzed and categorized using the proposed conceptual framework, prioritizing the significance of the AI resource in the analysis. These concepts include machine learning, adaptive learning, chatbot, robotics and intelligent tutoring systems. The results indicate that most of the articles refer to artificial intelligence systems (48%, n=19), followed by machine learning (20.7%, n=6). The complete description can be found in Table 5.
Some Results of the Reviewed Articles
Table 6 presents some empirical results from the reviewed articles.
Table 6 Some empirical results of the reviewed articles
Authors/year | Title | Some results |
---|---|---|
Kong et al. (2023) | Evaluating an artificial intelligence literacy programme for empowering and developing concepts, literacy and ethical awareness in senior secondary students | Knowledge of programming provides advantages for the deep learning course but not for other courses, such as AI application projects. In addition, some ethical principles may be too complex for upper secondary school students to understand. |
Huang and Qiao (2022) | Enhancing Computational Thinking Skills Through Artificial Intelligence Education at a STEAM High School | AI education with STEM is beneficial in promoting students' creativity, cooperation, critical thinking, and problem-solving in computational thinking skills. It also improves the learning motivation and self-efficacy of the students in the experimental group. |
Lee and Yeo (2022) | Developing an AI-based chatbot for practicing responsive teaching in mathematics | They developed a chatbot with knowledge of concepts and operations between fractions. They show that this chatbot reasonably and adequately covered the questions of future teachers and provided answers that seemed realistic. |
Zhai et al. (2022) | Applying machine learning to automatically assess scientific models | Using assessments incorporating drawn and textual models, they achieved excellent scoring accuracy through machine learning. They also identified five characteristics of the drawn models that can significantly affect the accuracy of the machine score. |
Bekmanova et al. (2021) | Personalized training model for organizing blended and lifelong distance learning courses and its effectiveness in Higher Education | The results of a distance learning course, based on personalized learning, indicate that the course meets expectations and is innovative. In addition, they found that 100% of students were successfully certified compared to a traditional classroom course. |
Shin (2022) | Teaching Mathematics Integrating Intelligent Tutoring Systems: Investigating Prospective Teachers’ Concerns and TPACK | Future teachers recognize that they have solid pedagogical knowledge (PK) and pedagogical content knowledge (PCK) they need to teach mathematics. However, when PK and PCK were integrated with technological knowledge, they were less likely to recognize that they had sound knowledge for effective teaching with technology. |
Azevedo et al. (2022) | Mathematics learning and assessment using MathE platform: A case study | When asking students about the extent to which the MathE platform is a valuable aid to their studies, they found that 40.6% of students considered it useful, while 9.4% said it was not useful, and 9.4% thought the platform was helpful. The main difficulties experienced with the platform were organization and language. |
Moltudal et al. (2022) | Adaptive Learning Technology in Primary Education: Implications for Professional Teacher Knowledge and Classroom Management | Teachers describe the technology as promising but feel that, to use it fully, their students must spend more time solving tasks in the program than teachers are willing to allocate. |
Zhou (2023) | Integration of modern technologies in higher education on the example of artificial intelligence use | There is a significant difference in student performance in five subjects before and after the introduction of the Raptivity personalized learning platform. |
Wang et al. (2023) | When adaptive learning is effective learning: comparison of an adaptive learning system to teacher-led instruction | Students using Squirrel AI learning independently outperformed students enrolled in a course taught by expert teachers. They also outperformed students who received both whole-class and small-group instruction. |
Sperling et al. (2022) | Still w(AI)ting for the automation of teaching: An exploration of machine learning in Swedish primary education using Actor-Network Theory | The study shows that AI technologies designed to personalize and automate require mutual adaptation of human and non-human actors in the network. |
Wang et al. (2022) | Development and Application of an Intelligent Assessment System for Mathematics Learning Strategy among High School Students: Take Jianzha County as an Example | They found that assessment and implementation systems are effective in providing teachers with techniques to help assess and improve mathematics learning strategies. |
Shin et al. (2021) | Analyzing students’ performance in computerized formative assessments to optimize teachers’ test administration decisions using deep learning frameworks | The model created helps predict whether the next test will be significant based on the performance scores of two previous high-pressure tests. |
Ferro et al. (2021) | Gea2: A Serious Game for Technology-Enhanced Learning in STEM | They found that the effectiveness of the game, as a learning tool, did not yield good overall results. The game was expected to improve understanding of topics explained in class but ended up being a replacement for face-to-face lectures. |
Hsu et al. (2021) | Is it possible for young students to learn the AI-STEAM application with experiential learning? | They showed that the use of experiential learning integrated into an AI-STEM course improves learning effectiveness. |
Robles and Quintero (2020) | Intelligent system for interactive teaching through videogames | For each video game it is shown that the implementation of the intelligent system, with two computational techniques implemented, enables the user to obtain a better performance in the subjects addressed. |
Yannier et al. (2020) | Active Learning is About More Than Hands-On: A Mixed-Reality AI System to Support STEM Education | AI agent-guided inquiry helped students formulate better and more scientific theories of the phenomena they experience. Additionally, children who receive guidance during inquiry can learn to apply science in engineering tasks better. |
Büscher (2020) | Scaling up qualitative mathematics education research through artificial intelligence methods | They report that the model performed with 76% accuracy on the test set, meaning that it labeled 76% of the data in the same way as a research team would have done. |
Cung et al. (2019) | Getting Academically Underprepared Students Ready through College Developmental Education: Does the Course Delivery Format Matter? | When using a well-developed intelligent tutoring system, the learning gains are even more significant when combined with in-person lectures. |
So and Lee (2023) | Pedagogical exploration and technological development of a humanoid robotic system for teaching to and learning in young children | They found that the NAO Robot can build a positive and friendly relationship with children while achieving math learning outcomes. |
Denes (2023) | A case study of using AI for General Certificate of Secondary Education (GCSE) grade prediction in a selective independent school in England | They showed that predictions are more accurate for STEM subjects and those with more students. In addition, they found that STEM and non-STEM teachers have different perceptions when awarding grades. |
Soesanto et al. (2022) | Indonesian students’ perceptions towards AI-based learning in mathematics | Some of the students’ perceptions are the following: they see AI robots as intelligent machines that can detect something; they understand AI as a robot created to do something; and AI is seen as a simulation of intelligence modeled on a machine. |
Yang et al. (2021) | Can Crowds Customize Instructional Materials with Minimal Expert Guidance? Exploring Teacher-guided Crowdsourcing for Improving Hints in an AI-based Tutor | In one of the studies conducted (2), they report that teachers perceive self-written suggestions as better and more satisfactory than existing AI tutor suggestions. However, they did not perceive personalized, crowd-produced suggestions as an improvement on the original suggestions. |
Walkington and Bernacki (2019) | Personalizing Algebra to Students’ Individual Interests in an Intelligent Tutoring System: Moderators of Impact | Deeper personalization tended to result in lower efficiency, but more positive affective states, whereas situational interest and play were unaffected. Additionally, students who were more deeply engaged with their interests performed better on measures of efficiency. |
Wardat et al. (2023) | ChatGPT: A revolutionary tool for teaching and learning mathematics | They found that ChatGPT is a useful tool, but caution is needed when using it and guidelines for its safe use should be developed. |
Discussion
The above results show a notable lack of papers from Latin America focused on AI and education. This gap invites further development of such topics, addressing issues related to teaching and learning in different educational environments. Given the small number of works identified, research efforts should be undertaken at all educational levels. Regarding the methods used, although there is no great imbalance among them, there are insufficient studies using mixed methods and involving a large number of participants. Progress in this area is crucial, as AI is already widespread in everyday life and is at students’ fingertips as a support for working on mathematical problems.
On the other hand, the results consider, for the most part, the integration of AI to enhance learning, mainly through autonomous work. The articles identify exploratory-descriptive scopes regarding its implementation, either by integrating this type of technology into their teaching processes (e.g., chatbots in initial teacher training), or by identifying the types of knowledge that optimize technology use, such as prior programming knowledge at intermediate levels as reported in Kong et al. (2023). This also calls for a deeper understanding of the complexities involved in adequately integrating AI into educational processes. Consequently, creation and proper functioning are not sufficient: correct integration is required in both pedagogical practice and gradual adaptation with students (Sperling et al., 2022).
This review highlights papers in which AI is directly linked to evaluation processes. The results indicate a higher degree of effectiveness when AI plays a significant role in feedback processes. However, several reviewed studies emphasize the importance of continuously monitoring technological tools to ensure that responses are closely aligned with students’ needs, encompassing language aspects and developing a response typology consistent with in-person practices. From this perspective, when educational activities are conducted concurrently (that is, when they involve work with a teacher and work with an AI system), it is highly important that there be synchrony between them: one that facilitates discussion of both, keeps the feedback processes consistent with the classroom work, and enables assessment beyond each individual task. In this regard, a line of future research is proposed. Although some studies focus on synergistic assessment, results in the evaluation field are still at an early stage.
Conclusions
This study systematically analyzed current research on AI in mathematics education across 29 articles. Our main focus was to examine the characteristics of these studies, the AI technologies used, and their linkage to theories or theoretical perspectives in mathematics education.
Concerning the first research question, the examined studies indicate a growing presence of AI in mathematics education, especially notable in 2022, with a similar trend expected for 2023. Additionally, the vast majority of the reviewed material was empirical, with only a few theoretical studies. Articles published in journals with a tradition in mathematics education constituted less than 20% of the analyzed articles; most of these come from the United States and China. Moreover, Asia is the continent with the largest number of countries addressing AI in mathematics education, while only a small number of authors come from Latin America.
The examined studies primarily used quantitative (26%) and qualitative (24.14%) methods, with few and insufficient mixed-method studies (6.90%) for research development in the area. Furthermore, approximately one-third of the studies focused on high school students as subjects of study, followed by primary students and teachers. Few research studies analyzed university students, particularly future teachers. According to the results, most studies were conducted in secondary and primary education. However, no study was found investigating participants in early childhood education.
The strategies used to collect data from AI interventions were mostly questionnaires, interviews, and tests. Approximately half of these studies used at least two instruments to obtain their data. However, few investigations approached AI from a theory or a theoretical perspective of the didactics of mathematics (n=6).
Regarding the role of AI in mathematics education, the results reveal that it is most commonly used as a computerized learning environment, i.e., intelligent learning systems for assessment, learning effectiveness, distance education, learning, and teaching. This finding encourages further development in already initiated research fields or the identification of new areas to explore, starting from those mentioned.
The limitations of this study stem from the selected databases and the exclusion of conference proceedings, book chapters, books, and other materials, as well as journals not included in the Scopus, WoS, and SciELO databases. In addition, the exclusion of languages other than English and Spanish limits the scope of the results. Another limitation may be connected to the automated exclusion process conducted in each database and the search strings utilized. “Artificial intelligence,” “education,” and “math” were the terms employed in this review, although some identified studies did not use the term “artificial intelligence” but instead referred directly to the AI technology used in the study.
Authors’ Contribution Statement
All authors acknowledge that they have read and approved the final version of this article. The percentages of contribution for the conceptualization, preparation, and correction of this paper were as follows: D.P.G. 70% and J.H.A. 30%.
Data Availability Statement
Data supporting the results of this study will be made available by the corresponding author, DP, upon reasonable request.
Preprint
A preprint version of this article was deposited at: https://doi.org/10.5281/zenodo.8277888