Knowledge about Machine Learning is becoming essential, yet it remains a restricted privilege that may not be available to students from a low socio-economic status background. Thus, in order to provide equal opportunities, we taught ML concepts and applications to 158 middle and high school students from a low socio-economic background in Brazil. Results show that these students can understand how ML works and execute the main steps of a human-centered process for developing an image classification model. No substantial differences regarding class period, educational stage, or sex assigned at birth were observed. The course was perceived as fun and motivating, especially by girls. Although this context imposes limitations, the results show that they can be overcome. Mitigating solutions involve partnerships between social institutions and universities, an adapted pedagogical approach, and increased one-on-one assistance. These findings can be used to guide course designs for teaching ML to underprivileged students from a low socio-economic status background and thus contribute to their inclusion.
The pervasiveness of Machine Learning (ML) in everyday life demonstrates the importance of popularizing an understanding of ML already in school. Accompanying this trend arises the need to assess students' learning. Yet, so far, few assessments have been proposed, and most lack an evaluation. Therefore, we evaluate the reliability and validity of an automated assessment of students' learning, based on the image classification model created as a learning outcome of the "ML for All!" course. Results based on data collected from 240 students indicate that the assessment can be considered reliable (coefficient omega ω = 0.834; Cronbach's alpha α = 0.83). We also identified moderate to strong convergent and discriminant validity based on the polychoric correlation matrix. Factor analyses indicate two underlying factors, "Data Management and Model Training" and "Performance Interpretation", that complement each other. These results can guide the improvement of assessments, as well as decisions on applying this model to support ML education as part of a comprehensive assessment.
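The reliability coefficient reported above can be computed directly from a student-by-item score matrix using the standard formula for Cronbach's alpha, α = k/(k−1) · (1 − Σσ²ᵢ/σ²ₜ). A minimal sketch in Python, using hypothetical rubric scores rather than the study's data:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_students, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical scores (4 students x 3 assessment items), not the study's data:
scores = [[2, 1, 3], [3, 2, 4], [4, 3, 5], [1, 1, 2]]
alpha = cronbach_alpha(scores)
```

Values above roughly 0.7 are conventionally read as acceptable internal consistency, which is how the study's α = 0.83 supports calling the assessment reliable.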
Machine Learning (ML) is becoming increasingly present in our lives. Thus, it is important to introduce ML already in High School, enabling young people to become conscious users and creators of intelligent solutions. Yet, as ML is typically taught only in higher education, there is still a lack of knowledge on how to properly teach younger students. Therefore, in this systematic literature review, we analyze findings on teaching ML in High School with regard to content, pedagogical strategy, and technology. Results show that High School students were able to understand and apply basic ML concepts, algorithms, and tasks. Pedagogical strategies focusing on active, problem/project-based, hands-on approaches were successful in engaging students and demonstrated positive learning effects. Visual as well as text-based programming environments supported students in building ML models in an effective way. Yet, the review also identified the need for more rigorous evaluations of how to teach ML.
Educational data mining is widely deployed to extract valuable information and patterns from academic data. This research explores new features that can help predict the future performance of undergraduate students and identify at-risk students early on. It answers some crucial and intuitive questions that are not addressed by previous studies. Most of the existing research is conducted on 2–3 years of data in an absolute grading scheme. We examined the effects of 15 years of historical academic data on predictive modeling. Additionally, we explore the performance of undergraduate students in a relative grading scheme and examine the effects of grades in core courses and initial semesters on future performance. As a pilot study, we analyzed the academic performance of Computer Science university students. Several noteworthy findings emerged: the duration and size of the historical data play a significant role in predicting future performance, mainly due to changes in curriculum, faculty, society, and evolving trends. Furthermore, predicting grades in advanced courses based on initial pre-requisite courses is challenging in a relative grading scheme, as students' performance depends not only on their own efforts but also on those of their peers. In short, educational data mining can come to the rescue by uncovering valuable insights from academic data to predict future performance and identify the critical areas that need significant improvement.
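The kind of early-warning prediction described above can be sketched as a regression from initial-semester grades to a later outcome, with a threshold flagging at-risk students. The data, the 2.5 threshold, and the choice of plain linear regression below are illustrative assumptions, not the study's actual features or model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: rows are students, columns are grade points in three
# initial core courses; y is a later outcome such as final CGPA.
X = np.array([
    [3.7, 3.3, 4.0],
    [2.0, 2.3, 2.7],
    [3.0, 3.3, 3.0],
    [1.7, 2.0, 2.3],
    [3.3, 3.7, 3.3],
])
y = np.array([3.6, 2.4, 3.1, 2.0, 3.5])

model = LinearRegression().fit(X, y)
predicted = model.predict(X)

# Flag students whose predicted outcome falls below a chosen threshold.
AT_RISK_THRESHOLD = 2.5
at_risk = predicted < AT_RISK_THRESHOLD
```

In a relative grading scheme, as the abstract notes, such a model is harder to apply because a student's grade also depends on the cohort, so the feature set would need peer-relative standings rather than raw grade points.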
With the development of technology allowing for a rapid expansion of data science and machine learning in our everyday lives, a significant gap is forming in the global job market, where the demand for qualified workers in these fields cannot be properly satisfied. This worrying trend calls for immediate action in education, where these skills must be taught to students at all levels in an efficient and up-to-date manner. This paper gives an overview of the current state of data science and machine learning education globally, at both the high school and university levels, while outlining some illustrative and positive examples. Special focus is given to vocational education and training (VET), where the teaching of these skills is at its very beginning. Also presented and analysed are survey results concerning VET students in Slovenia, Serbia, and North Macedonia, and their knowledge, interests, and prerequisites regarding data science and machine learning. These results confirm the need to develop efficient and accessible curricula and courses on these subjects in vocational schools.
Although Machine Learning (ML) is already used in our daily lives, few are familiar with the technology. This poses new challenges for students to understand ML, its potential, and its limitations, as well as to empower them to become creators of intelligent solutions. To effectively guide the learning of ML, this article proposes a scoring rubric for the performance-based assessment of the learning of concepts and practices regarding image classification with artificial neural networks in K-12. The assessment is based on the examination of student-created artifacts as part of open-ended applications in the use stage of the Use-Modify-Create cycle. An initial evaluation of the scoring rubric through an expert panel demonstrates its internal consistency as well as its correctness and relevance. Providing a first step toward the assessment of image recognition concepts, the results may support the progress of ML learning by providing feedback to students and teachers.
Prior programming knowledge of students has a major impact on introductory programming courses. Those with prior experience often seem to breeze through the course. Those without prior experience see others breeze through the course and disengage from the material or drop out. The purpose of this study is to demonstrate that novice student programming behavior can be modeled as a Markov process. The resulting transition matrix can then be used in machine learning algorithms to create clusters of similarly behaving students. We describe in detail the state machine used in the Markov process and how to compute the transition matrix. We compute the transition matrix for 665 students and cluster them using the k-means clustering algorithm. We choose the number of clusters to be three based on an analysis of the dataset. We show that the resulting clusters have statistically different means for students' prior knowledge in programming when measured on a Likert scale of 1-5.
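A minimal sketch of the described pipeline: estimate a row-normalized transition matrix per student, then cluster the flattened matrices with k-means (k = 3). The behavior states below are hypothetical placeholders, since the paper's actual state machine is not reproduced here:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical programming-behavior states (the paper's state machine
# may differ): editing, compile error, running the program, idle.
STATES = ["edit", "compile_error", "run", "idle"]

def transition_matrix(sequence, states=STATES):
    """Row-normalized matrix of transition counts between observed states."""
    idx = {s: i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for a, b in zip(sequence, sequence[1:]):
        counts[idx[a], idx[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions stay all-zero instead of dividing by 0.
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

# One flattened transition matrix per student as the feature vector,
# then k-means with k = 3 as chosen in the study.
student_sequences = [
    ["edit", "compile_error", "edit", "run", "edit", "run"],
    ["edit", "run", "idle", "edit", "run", "run"],
    ["edit", "compile_error", "compile_error", "edit", "idle", "edit"],
]
X = np.array([transition_matrix(seq).ravel() for seq in student_sequences])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```

Each row of a transition matrix is a probability distribution over next states, so the flattened matrices are directly comparable across students regardless of how long each student's event sequence is.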
Although Machine Learning (ML) has already become part of our daily lives, few are familiar with this technology. Thus, in order to help students understand ML, its potential, and its limitations, and to empower them to become creators of intelligent solutions, diverse courses for teaching ML in K-12 have emerged. Yet, a question less considered is how to assess the learning of ML. Therefore, we performed a systematic mapping, identifying 27 instructional units that also present a quantitative assessment of the students' learning. Assessments range from simple quizzes to performance-based assessments, evaluating the learning of basic ML concepts and approaches and, in some cases, ethical issues and the impact of ML, mostly at lower cognitive levels. Feedback is mostly limited to an indication of the correctness of the answers, and only a few assessments are automated. These results indicate a need for more rigorous and comprehensive research in this area.
Although Machine Learning (ML) is integrated today into various aspects of our lives, few understand the technology behind it. This presents new challenges for extending computing education early to ML concepts, helping students to understand its potential and limits. Thus, in order to obtain an overview of the state of the art on teaching Machine Learning concepts from elementary to high school, we carried out a systematic mapping study. We identified 30 instructional units, mostly focusing on ML basics and neural networks. Considering the complexity of ML concepts, several instructional units cover only the most accessible processes, such as data management, or present model learning and testing on an abstract level, black-boxing some of the underlying ML processes. Results demonstrate that teaching ML in school can increase understanding of and interest in this knowledge area as well as contextualize ML concepts through their societal impact.
The growing amount of information in the world has increased the need for computerized classification of different objects. This situation is also present in higher education, where effortless detection of similarity between different study courses would make it possible to organize student exchange programmes effectively and facilitate curriculum management and development. This area, which currently relies on manual, time-consuming expert activities, could benefit from the application of suitably adapted machine learning technologies. Data in this problem domain is complex, so fully automatic classification approaches cannot always reach the desired classification accuracy. Therefore, our approach suggests an automated/semi-automated classification solution, which incorporates both machine learning facilities and the interactive involvement of a domain expert to improve classification results. The system's prototype has been implemented and experiments have been carried out. This interactive classification system makes it possible to classify educational data, which often comes in unstructured or semi-structured, incomplete, and/or insufficient form, thus reducing the number of misclassified instances significantly in comparison with the fully automatic machine learning approach.