Artificial Intelligence is undoubtedly becoming pervasive in everyone's everyday life. In this setting, developing a correct conception of AI from childhood is not only a need to be addressed in educational curricula, but also a children's right. Accordingly, several initiatives at national and international levels aim to promote AI and emerging technology literacy, supported by a proliferation in the literature of learning courses covering a variety of topics, learning objectives and target ages. Schools are therefore pushed to introduce innovative activities for children into their curricula. In this paper, we report the results of a case study in which we tested the contribution of an AI block-based course to developing computational thinking and an understanding of human and AI minds in fifth and sixth grade children.
In recent years, the rapid advancement of artificial intelligence (AI) has fostered an urgent need to better prepare current and future educators to be able to integrate AI technologies in their teaching and to teach AI literacy to PreK-12 students. While many organizations have developed professional learning opportunities for inservice educators, a gap remains for resources specifically designed for those facilitating and enrolled in Educator Preparation Programs (EPPs). In response to this gap, the International Society for Technology in Education (ISTE) launched its first AI Explorations for EPPs Faculty Fellowship. As a result of the Faculty Fellows’ collaboration, this paper articulates a framework of seven critical strategies with the potential to address the urgent need EPPs have in preparing preservice teachers to effectively integrate AI-powered instructional tools and to teach this new area of content knowledge in PreK-12 classrooms. In addition, we provide a review of literature and an overview of the emerging needs for integrating AI education in EPPs. We demonstrate why support for preservice teachers’ critical examination and application of AI, including a focus on the issues of equity, ethics, and culturally responsive teaching, is essential to their later success in PreK-12 classrooms. Recommendations for further research and learning are also provided to promote community-wide initiatives for supporting the integration of AI in education through Educator Preparation Programs and beyond.
Roughly every decade, the ACM and IEEE professional organizations have produced recommendations for the education of undergraduate computer science students. These guidelines are used worldwide by research universities, liberal arts colleges, and community colleges. For the latest 2023 revision of the curriculum, AAAI has collaborated with ACM and IEEE to integrate artificial intelligence more broadly into this new curriculum and to address the issues it raises for students, instructors, practitioners, policy makers, and the general public. This paper describes the development process and rationale that underlie the artificial intelligence components of the CS2023 curriculum, discusses the challenges in curriculum design for such a rapidly advancing field, and examines lessons learned during this three-year process.
Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with the input of real teachers or students. This paper applies a large language model-based QG approach where questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes have no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise for large scale use of QG in the classroom setting.
Artificial intelligence (AI) is quickly finding broad application in every sector of society. This rapid expansion of AI has increased the need to cultivate an AI-literate workforce, and it calls for introducing AI education into K-12 classrooms to foster students’ awareness and interest in AI. With rich narratives and opportunities for situated problem solving, story-driven game-based learning offers a promising approach for creating engaging and effective K-12 AI learning experiences. In this paper, we present our ongoing work to iteratively design, develop, and evaluate a story-driven game-based learning environment focused on AI education for upper elementary students (ages 8 to 11). The game features a science inquiry problem centering on an endangered species and incorporates a Use-Modify-Create scaffolding framework to promote student learning. We present findings from an analysis of data collected from 16 students playing the game's quest focused on AI planning. Results suggest that the scaffolding framework provided students with the knowledge they needed to advance through the quest and that overall, students experienced positive learning outcomes.
Artificial intelligence (AI) and its teaching in the K-12 grades has been championed as a vital need for the United States due to the technology's future prominence in the 21st century. However, there remain several barriers to effective AI lessons at these age groups, including the broad range of interdisciplinary knowledge needed and the lack of formal training or preparation for teachers to implement these lessons. In this experience report, we present ImageSTEAM, a teacher professional development program for creating lessons on computer vision, machine learning, and computational photography/cameras targeted at middle school classes (grades 6-8). Teacher professional development workshops were conducted in the states of Arizona and Georgia from 2021 to 2023, where lessons were co-created with teachers to introduce specific visual computing concepts while aligning with state and national standards. In addition, a variety of computer vision and image processing software, including custom-designed Python notebooks, was developed into technology activities and demonstrations for classroom use. Educational research showed that teachers improved their self-efficacy and outcomes for concepts in computer vision, machine learning, and artificial intelligence when participating in the program. Results from the professional development workshops highlight key opportunities and challenges in integrating this content into the standard curriculum, the benefits of a co-creation pedagogy, and the positive impact on teachers' and students' learning experiences. The open-source program curriculum is available at www.imagesteam.org.
Sentiment analysis provides a promising tool to automatically assess the emotions voiced in written student feedback such as periodically collected unit-of-study reflections. The commonly used dictionary-based approaches are limited to major languages and fail to capture contextual differences. Pretrained large language models have been shown to be biased and online versions raise privacy concerns. Hence, we resort to traditional supervised machine learning (ML) approaches which are designed to overcome these issues by learning from domain-specific labeled data. However, these labels are hard to come by -- in our case manually annotating student feedback is prone to bias and time-consuming, especially in high-enrollment courses. In this work, we investigate the use of student crowdsourced labels for supervised sentiment analysis for education. Specifically, we compare crowdsourced and student self-reported labels with human expert annotations and use them in various ML approaches to evaluate the performance on predicting emotions of written student feedback collected from large computer science classes. We find that the random forest model trained with student-crowdsourced labels tremendously improves the identification of reflections with negative sentiment. In addition to our quantitative study, we describe our crowdsourcing experiment which was intentionally designed to be an educational activity in an introduction to data science course.
As Artificial Intelligence (AI) continues to permeate various aspects of societies, understanding the disparities in AI knowledge and skills across different living areas becomes imperative. Small living areas have emerged as significant contributors to Europe's economy, offering an alternative to the bustling environment of larger cities for those seeking an improved quality of life. Nonetheless, they often encounter challenges related to digital infrastructure, access to financial resources, and digital skills gaps, limiting their economic and social growth prospects. This study investigates the digital and AI skills gaps in the context of small and large European living areas, shedding light on the potential hindrances to unleashing the full economic and social potential of these regions in an AI-enabled economy. Drawing on a comprehensive dataset encompassing 4,006 respondents across eight EU countries, this research examines the current perceptions and understandings of AI and digital skills within two distinct population groups: residents of smaller living areas and their counterparts in larger communities. Bivariate analysis reveals notable insights concerning trust in AI solutions and entities, self-assessed digital skills, AI awareness, AI attitudes and demographic variables in both population groups. These insights point to the significance of addressing digital and AI skills gaps in fostering growth and preparedness for the AI-driven future. As AI becomes increasingly integral to various aspects of society, targeted interventions and policies are essential to bridge these gaps and enable individuals and communities to harness the transformative potential of AI-enabled economies.
With the advancement and utility of Artificial Intelligence (AI), personalising education to a global population could be a cornerstone of new educational systems in the future. This work presents the PEEKC dataset and the TrueLearn Python library, which together provide a dataset and a series of online learner state models that are essential to facilitate research on learner engagement modelling. The TrueLearn family of models was designed following the "open learner" concept, using humanly-intuitive user representations. This family of scalable, online models also helps end-users visualise the learner models, which may in the future facilitate user interaction with their models/recommenders. The extensive documentation and coding examples make the library highly accessible to both machine learning developers and educational data mining and learning analytics practitioners. The experiments show the utility of both the dataset and the library, with predictive performance significantly exceeding comparative baseline models. The dataset contains a large number of AI-related educational videos, which are of interest for building and validating AI-specific educational recommenders.
As artificial intelligence (AI) is playing an increasingly important role in our society and global economy, AI education and literacy have become necessary components in college and K-12 education to prepare students for an AI-powered society. However, current AI curricula have not yet been made accessible and engaging enough for students and schools from all socio-economic backgrounds with different educational goals. In this work, we developed an open-source learning module for college and high school students, which allows students to build their own robot companion from the ground up. This open platform can be used to provide hands-on experience and introductory knowledge about various aspects of AI, including robotics, machine learning (ML), software engineering, and mechanical engineering. Because of the social and personal nature of a socially assistive robot companion, this module also puts a special emphasis on human-centered AI, enabling students to develop a better understanding of human-AI interaction and AI ethics through hands-on learning activities. With open-source documentation, assembling manuals and affordable materials, students from different socio-economic backgrounds can personalize their learning experience based on their individual educational goals. To evaluate the student-perceived quality of our module, we conducted a usability testing workshop with 15 college students recruited from a minority-serving institution. Our results indicate that our AI module is effective, easy-to-follow, and engaging, and it increases student interest in studying AI/ML and robotics in the future. We hope that this work will contribute toward accessible and engaging AI education in human-AI interaction for college and high school students.
High school teachers from many disciplines have growing interests in teaching about artificial intelligence (AI). This cross-disciplinary interest reflects the prevalence of AI tools across society, such as Generative AI tools built upon Large Language Models (LLM). However, high school classes are unique and complex environments, led by teachers with limited time and resources with priorities that vary by class and the students they serve. Therefore, developing curricula about AI for classes that span many disciplines (e.g. history, art, math) must involve centering the expertise of cross-disciplinary teachers. In this study, we conducted five collaborative curricular co-design sessions with eight teachers who taught high school humanities and STEM classes. We sought to understand how teachers considered AI when it was taught in art, math, and social studies contexts, as well as opportunities and challenges they identified with incorporating AI tools into their instruction. We found that teachers considered technical skills and ethical debates around AI, opportunities for "dual exploration" between AI and disciplinary learning, and limitations of AI tools as supporting engagement and reflection but also potentially distracting. We interpreted our findings relative to co-designing adaptable AI curricula to support teaching about and with AI across high school disciplines.
Large language models like ChatGPT can generate human-like code, posing challenges for programming education as students may be tempted to misuse them on assignments. However, there are currently no robust detectors designed specifically to identify AI-generated code. This is an issue that needs to be addressed to maintain academic integrity while allowing proper utilization of language models. Previous work has explored different approaches to detect AI-generated text, including watermarks, feature analysis, and fine-tuning language models. In this paper, we address the challenge of determining whether a student's code assignment was generated by a language model. First, our proposed method identifies AI-generated code by leveraging targeted masking perturbation paired with comprehensive scoring. Rather than applying a random mask, areas of the code with higher perplexity are more intensely masked. Second, we utilize a fine-tuned CodeBERT to fill in the masked portions, producing subtly modified samples. Then, we integrate the overall perplexity, variation of code line perplexity, and burstiness into a unified score. In this scoring scheme, a higher rank for the original code suggests it's more likely to be AI-generated. This approach stems from the observation that AI-generated code typically has lower perplexity. Therefore, perturbations often exert minimal influence on it. Conversely, sections of human-written code that the model struggles to understand can see their perplexity reduced by such perturbations. Our method outperforms current open-source and commercial text detectors. Specifically, it improves detection of code submissions generated by OpenAI's text-davinci-003, raising average AUC from 0.56 (GPTZero baseline) to 0.87 for our detector.
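As a rough illustration of the perplexity-guided masking idea in the abstract above, the sketch below computes perplexity from token log-probabilities and splits a masking budget across code lines in proportion to their perplexity. The function names, the proportional allocation rule, and the example numbers are assumptions made for illustration only; the abstract does not state the exact masking schedule or how the unified score is weighted.

```python
import math


def perplexity(token_logprobs):
    """Perplexity is exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def mask_budget(line_perplexities, total_masks):
    """Hypothetical allocation: masks per line proportional to line perplexity.

    Because of rounding, the counts may not sum exactly to total_masks.
    """
    total = sum(line_perplexities)
    return [round(total_masks * p / total) for p in line_perplexities]


# Illustrative values only: three code lines, the middle one most "surprising".
line_ppl = [2.0, 6.0, 2.0]
print(mask_budget(line_ppl, total_masks=10))
```

In a real detector the log-probabilities would come from a language model scoring the code; here they are simply passed in as numbers so that the allocation logic can be read on its own.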
Building a skilled cybersecurity workforce is paramount to building a safer digital world. However, the diverse skill set, constantly emerging vulnerabilities, and deployment of new cyber threats make learning cybersecurity challenging. Traditional education methods struggle to cope with cybersecurity's rapidly evolving landscape and keep students engaged and motivated. Different studies on students' behaviors show that an interactive mode of education by engaging through a question-answering system or dialoguing is one of the most effective learning methodologies. There is a strong need to create advanced AI-enabled education tools to promote interactive learning in cybersecurity. Unfortunately, there are no publicly available standard question-answer datasets to build such systems for students and novice learners to learn cybersecurity concepts, tools, and techniques. The education course material and online question banks are unstructured and need to be validated and updated by domain experts, which is tedious when done manually. In this paper, we propose CyberGen, a novel unification of large language models (LLMs) and knowledge graphs (KG) to generate the questions and answers for cybersecurity automatically. Augmenting the structured knowledge from knowledge graphs in prompts improves factual reasoning and reduces hallucinations in LLMs. We used the knowledge triples from cybersecurity knowledge graphs (AISecKG) to design prompts for ChatGPT and generate questions and answers using different prompting techniques. Our question-answer dataset, CyberQ, contains around 4k pairs of questions and answers. The domain expert manually evaluated the random samples for consistency and correctness. We train the generative model using the CyberQ dataset for question answering task.
Automatic short answer grading (ASAG) seeks to mitigate the burden on teachers by leveraging computational methods to evaluate student-constructed text responses. Large language models (LLMs) have recently gained prominence across diverse applications, with educational contexts being no exception. The sudden rise of ChatGPT has raised expectations that LLMs can handle numerous tasks, including ASAG. This paper aims to shed some light on this expectation by evaluating two LLM-based chatbots, namely ChatGPT built on GPT-3.5 and GPT-4, on scoring short-question answers under zero-shot and one-shot settings. Our data consists of 2000 student answers in Finnish from ten undergraduate courses. Multiple perspectives are taken into account during this assessment, encompassing those of grading system developers, teachers, and students. On our dataset, GPT-4 achieves a good QWK score (0.6+) in 44% of one-shot settings, clearly outperforming GPT-3.5 at 21%. We observe a negative association between student answer length and model performance, as well as a correlation between a smaller standard deviation among a set of predictions and lower performance. We conclude that while GPT-4 exhibits signs of being a capable grader, additional research is essential before considering its deployment as a reliable autograder.
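For readers unfamiliar with the QWK (quadratic weighted kappa) metric cited in the abstract above, the following is a minimal pure-Python sketch of the standard definition. It assumes grades are integers from 0 to `num_classes - 1` with at least two classes; the function name and the toy grade lists are illustrative and are not taken from the paper.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """QWK for two equal-length lists of integer grades in [0, num_classes)."""
    n = len(rater_a)
    # Observed confusion counts between the two raters.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [rater_a.count(c) for c in range(num_classes)]
    hist_b = [rater_b.count(c) for c in range(num_classes)]

    disagreement = 0.0  # weighted observed disagreement
    chance = 0.0        # weighted disagreement expected by chance
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            disagreement += weight * observed[i][j]
            chance += weight * hist_a[i] * hist_b[j] / n
    if chance == 0:
        # Both raters used one identical grade throughout: treat as agreement.
        return 1.0
    return 1.0 - disagreement / chance


human = [0, 1, 2, 1]
model = [0, 1, 2, 1]
print(quadratic_weighted_kappa(human, model, num_classes=3))
```

The quadratic weights penalise a prediction more heavily the further it lies from the human grade, which is why the metric is commonly used for ordinal grading scales.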
This paper explores the use of large language models (LLMs) to score and explain short-answer assessments in K-12 science. While existing methods can score more structured math and computer science assessments, they often do not provide explanations for the scores. Our study focuses on employing GPT-4 for automated assessment in middle school Earth Science, combining few-shot and active learning with chain-of-thought reasoning. Using a human-in-the-loop approach, we successfully score and provide meaningful explanations for formative assessment responses. A systematic analysis of our method's pros and cons sheds light on the potential for human-in-the-loop techniques to enhance automated grading for open-ended science assessments.
Pedagogical planners can provide adaptive support to students in narrative-centered learning environments by dynamically scaffolding student learning and tailoring problem scenarios. Reinforcement learning (RL) is frequently used for pedagogical planning in narrative-centered learning environments. However, RL-based pedagogical planning raises significant challenges due to the scarcity of data for training RL policies. Most prior work has relied on limited-size datasets and offline RL techniques for policy learning. Unfortunately, offline RL techniques do not support on-demand exploration and evaluation, which can adversely impact the quality of induced policies. To address the limitation of data scarcity and offline RL, we propose INSIGHT, an online RL framework for training data-driven pedagogical policies that optimize student learning in narrative-centered learning environments. The INSIGHT framework consists of three components: a narrative-centered learning environment simulator, a simulated student agent, and an RL-based pedagogical planner agent, which uses a reward metric that is associated with effective student learning processes. The framework enables the generation of synthetic data for on-demand exploration and evaluation of RL-based pedagogical planning. We have implemented INSIGHT with OpenAI Gym for a narrative-centered learning environment testbed with rule-based simulated student agents and a deep Q-learning-based pedagogical planner. Our results show that online deep RL algorithms can induce near-optimal pedagogical policies in the INSIGHT framework, while offline deep RL algorithms only find suboptimal policies even with large amounts of data.
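To make the online Q-learning idea in the abstract above concrete, here is a minimal sketch of a single tabular Q-learning update. This is a deliberate simplification: the abstract describes a deep Q-learning planner built with OpenAI Gym, whereas this example uses a plain lookup table, and the states, actions, reward, and hyperparameters are hypothetical.

```python
def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step in place and return the new value."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical toy setup: 2 student states, 2 pedagogical actions each.
q = {0: [0.0, 0.0], 1: [1.0, 2.0]}
q_learning_update(q, state=0, action=1, reward=1.0, next_state=1)
print(q[0])
```

In an online setting, each such update would follow a fresh interaction with the simulator, which is what allows the planner to explore on demand rather than relying on a fixed logged dataset.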
Automatic speech scoring is crucial in language learning, providing targeted feedback to language learners by assessing pronunciation, fluency, and other speech qualities. However, the scarcity of human-labeled data for languages beyond English poses a significant challenge in developing such systems. In this work, we propose a Language-Independent scoring approach to evaluate speech without relying on labeled data in the target language. We introduce a multilingual speech scoring system that leverages representations from the wav2vec 2.0 XLSR model and a force-alignment technique based on CTC-Segmentation to construct speech features. These features are used to train a machine learning model to predict pronunciation and fluency scores. We demonstrate the potential of our method by predicting expert ratings on a speech dataset spanning five languages - English, French, Spanish, German and Portuguese, and comparing its performance against Language-Specific models trained individually on each language, as well as a jointly-trained model on all languages. Results indicate that our approach shows promise as an initial step towards a universal language independent speech scoring.
MineObserver 2.0 is an AI framework that uses Computer Vision and Natural Language Processing for assessing the accuracy of learner-generated descriptions of Minecraft images that include some scientifically relevant content. The system automatically assesses the accuracy of participant observations, written in natural language, made during science learning activities that take place in Minecraft. We demonstrate our system working in real-time and describe a teacher dashboard to showcase observations, both of which advance our previous work. We present the results of a study showing that MineObserver 2.0 improves over its predecessor both in perceived accuracy of the system's generated descriptions as well as in usefulness of the system's feedback. In future work, we intend to improve system-generated descriptions to give more teacher control and shift the system to perform continuous learning to more rapidly respond to novel observations made by learners.
This paper focuses on using Large Language Models to support teaching assistants in answering questions on large student forums such as Piazza and EdSTEM. Since student questions on these forums are often closely tied to specific aspects of the institution, instructor, and course delivery, general-purpose LLMs do not perform well on this task out of the box. We introduce RetLLM-E, a method that combines text-retrieval and prompting approaches to enable LLMs to provide precise and high-quality answers to student questions. When presented with a student question, our system initiates a two-step process. First, it retrieves relevant context from (i) a dataset of student questions addressed by course instructors (Q&A Retrieval) and (ii) relevant segments of course materials (Document Retrieval). RetLLM-E then prompts an LLM using the retrieved text and an engineered prompt structure to yield an answer optimized for the student question. We present a set of quantitative and human evaluation experiments, comparing our method to ground truth answers to questions in a test set of actual student questions. Our results demonstrate that our approach provides higher-quality responses to course-related questions than an LLM operating without context or relying solely on retrieval-based context. RetLLM-E can easily be adopted in different courses, providing instructors and students with context-aware automatic responses.
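The two-step retrieve-then-prompt process described in the abstract above can be sketched as follows. This is not the RetLLM-E implementation: the word-overlap (Jaccard) retriever, the prompt layout, and all function names and example strings are assumptions chosen to keep the example dependency-free.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    query_tokens = tokenize(query)
    ranked = sorted(passages,
                    key=lambda p: jaccard(query_tokens, tokenize(p)),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, past_answers, course_docs, k=1):
    """Combine Q&A retrieval and document retrieval into one prompt string."""
    qa_context = retrieve(question, past_answers, k)
    doc_context = retrieve(question, course_docs, k)
    parts = ["Past instructor answers:"] + qa_context
    parts += ["Course material:"] + doc_context
    parts += ["Student question: " + question]
    return "\n".join(parts)


# Hypothetical course data, for illustration only.
past = ["The midterm exam is on Friday", "Homework two covers recursion"]
docs = ["Syllabus: the midterm exam covers weeks one to five"]
print(build_prompt("When is the midterm exam", past, docs))
```

The resulting string would then be sent to whichever LLM is in use; that call is omitted here because the abstract does not specify the model interface.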
Motor skills, especially fine motor skills like handwriting, play an essential role in academic pursuits and everyday life. Traditional methods to teach these skills, although effective, can be time-consuming and inconsistent. With the rise of advanced technologies like robotics and artificial intelligence, there is increasing interest in automating such teaching processes. In this study, we examine the potential of a virtual AI teacher in emulating the techniques of human educators for motor skill acquisition. We introduce an AI teacher model that captures the distinct characteristics of human instructors. Using a reinforcement learning environment tailored to mimic teacher-learner interactions, we tested our AI model against four guiding hypotheses, emphasizing improved learner performance, enhanced rate of skill acquisition, and reduced variability in learning outcomes. Our findings, validated on synthetic learners, revealed significant improvements across all tested hypotheses. Notably, our model showcased robustness across different learners and settings and demonstrated adaptability to handwriting. This research underscores the potential of integrating Imitation and Reinforcement Learning models with robotics in revolutionizing the teaching of critical motor skills.
Learnersourcing offers great potential for scalable education through student content creation. However, predicting student performance on learnersourced questions, which is essential for personalizing the learning experience, is challenging due to the inherent noise in student-generated data. Moreover, while conventional graph-based methods can capture the complex network of student and question interactions, they often fall short under cold start conditions where limited student engagement with questions yields sparse data. To address both challenges, we introduce a strategy that integrates Signed Graph Neural Networks (SGNNs) with Large Language Model (LLM) embeddings. Our methodology employs a signed bipartite graph to comprehensively model student answers, complemented by a contrastive learning framework that enhances noise resilience. Furthermore, the LLM contributes foundational question embeddings, which prove especially advantageous in cold start scenarios characterized by limited graph data. Validation across five real-world datasets sourced from the PeerWise platform underscores our approach's effectiveness. Our method outperforms baselines, showcasing enhanced predictive accuracy and robustness.
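As a small illustration of the signed bipartite graph mentioned in the abstract above, the sketch below stores student-question edges with a sign of +1 for a correct answer and -1 for an incorrect one. The sign convention, data layout, and identifiers are assumptions for illustration; the abstract does not describe the paper's actual data structures.

```python
from collections import defaultdict


def build_signed_bipartite_graph(responses):
    """Build {student: {question: sign}} from (student, question, is_correct)."""
    graph = defaultdict(dict)
    for student, question, is_correct in responses:
        graph[student][question] = 1 if is_correct else -1
    return graph


def sign_counts(graph):
    """Count positive and negative edges across the whole graph."""
    signs = [s for edges in graph.values() for s in edges.values()]
    return signs.count(1), signs.count(-1)


# Hypothetical responses: (student_id, question_id, answered_correctly).
responses = [("s1", "q1", True), ("s1", "q2", False), ("s2", "q1", True)]
graph = build_signed_bipartite_graph(responses)
print(sign_counts(graph))
```

A question with no edges at all in such a graph is the cold start case the abstract refers to, which is where text-derived question embeddings can supply information the graph lacks.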
Understanding student behavior in educational settings is critical in improving both the quality of pedagogy and the level of student engagement. While various AI-based models exist for classroom analysis, they tend to specialize in limited tasks and lack generalizability across diverse educational environments. Additionally, these models often fall short in ensuring student privacy and in providing actionable insights accessible to educators. To bridge this gap, we introduce a unified, end-to-end framework by leveraging temporal action detection techniques and advanced large language models for a more nuanced student behavior analysis. Our proposed framework provides an end-to-end pipeline that starts with raw classroom video footage and culminates in the autonomous generation of pedagogical reports. It offers a comprehensive and scalable solution for student behavior analysis. Experimental validation confirms the capability of our framework to accurately identify student behaviors and to produce pedagogically meaningful insights, thereby setting the stage for future AI-assisted educational assessments.
The rapid evolution of artificial intelligence (AI), specifically large language models (LLMs), has opened opportunities for various educational applications. This paper explored the feasibility of utilizing ChatGPT, one of the most popular LLMs, for automating feedback for Java programming assignments in an introductory computer science (CS1) class. Specifically, this study focused on three questions: 1) To what extent do students view LLM-generated feedback as formative? 2) How do students see the comparative affordances of feedback prompts that include their code, vs. those that exclude it? 3) What enhancements do students suggest for improving LLM-generated feedback? To address these questions, we generated automated feedback using the ChatGPT API for four lab assignments in a CS1 class. The survey results revealed that students perceived the feedback as aligning well with formative feedback guidelines established by Shute. Additionally, students showed a clear preference for feedback generated by including the students' code as part of the LLM prompt, and our thematic study indicated that the preference was mainly attributed to the specificity, clarity, and corrective nature of the feedback. Moreover, this study found that students generally expected specific and corrective feedback with sufficient code examples, but had divergent opinions on the tone of the feedback. This study demonstrated that ChatGPT could generate Java programming assignment feedback that students perceived as formative. It also offered insights into the specific improvements that would make the ChatGPT-generated feedback useful for students.
Text-to-image generation (TTIG) technologies are Artificial Intelligence (AI) algorithms that use natural language algorithms in combination with visual generative algorithms. TTIG tools have gained popularity in recent months, garnering interest from non-AI experts, including educators and K-12 students. While they have exciting creative potential when used by K-12 learners and educators for creative learning, they are also accompanied by serious ethical implications, such as data privacy, spreading misinformation, and algorithmic bias. Given the potential learning applications, social implications, and ethical concerns, we designed six hours of learning materials to teach K-12 teachers with diverse subject expertise about the technical implementation, classroom applications, and ethical implications of TTIG algorithms. We piloted the learning materials, titled “Demystify text-to-image generative tools for K-12 educators”, with 30 teachers across two workshops with the goal of preparing them to teach about and use TTIG tools in their classrooms. We found that teachers demonstrated a technical, applied and ethical understanding of TTIG algorithms and successfully designed prototypes of teaching materials for their classrooms.
Generative AI tools introduce new and accessible forms of media creation for youth. They also raise ethical concerns about the generation of fake media, data protection, privacy and ownership of AI-generated art. Since generative AI is already being used in products used by youth, it is critical that they understand how these tools work and how they can be used or misused. In this work, we facilitated students’ generative AI learning through expression of their imagined future identities. We designed a learning workshop - Dreaming with AI - where students learned about the inner workings of generative AI tools, used text-to-image generation algorithms to create their imagined future dreams, reflected on the potential benefits and harms of generative AI tools and voiced their opinions about policies for the use of these tools in classrooms. In this paper, we present the learning activities and experiences of 34 high school students who engaged in our workshops. Students reached creative learning objectives by using prompt engineering to create their future dreams, gained technical knowledge by learning the abilities, limitations, text-visual mappings and applications of generative AI, and identified most potential societal benefits and harms of generative AI.