Sentence Encoder-Based Clustering Method for Modeling Students' Learning Programming Behavior
Introductory programming courses are widely regarded as difficult for students. Success in these courses is commonly measured by final grades, which might not capture the challenges students face while learning. In this paper, we predict students’ success and their future compiler errors from the errors they have made previously. Furthermore, we examine the effect of applying two clustering techniques before making the predictions, and we identify the weeks and errors that have the greatest impact on the predictions. Experimental results show that the compiler errors students make throughout the semester are an important predictor of their achievement and future struggles. Predictions improve further when sentence encoder-generated embeddings are clustered with the K-Means algorithm. Our study suggests that students’ errors, particularly the most recent ones, enable meaningful clustering that enhances performance prediction after only three weeks of the semester.
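To make the clustering idea concrete, the following is a minimal sketch of grouping students by embeddings of their compiler error messages with K-Means. It is not the study's implementation: the student data is invented for illustration, the `embed` function is a toy word-count stand-in for a real sentence encoder, and the `kmeans` helper, its seeding strategy, and the choice of two clusters are all assumptions.

```python
import math
from collections import Counter

# Hypothetical compiler error messages per student (illustrative, not study data).
student_errors = {
    "s1": ["missing semicolon", "missing semicolon", "undeclared variable"],
    "s2": ["missing semicolon", "undeclared variable"],
    "s3": ["incompatible types", "missing return statement"],
    "s4": ["incompatible types", "incompatible types", "missing return statement"],
}

# Shared vocabulary so every student maps to a vector of the same length.
vocabulary = sorted(
    {word for msgs in student_errors.values() for m in msgs for word in m.split()}
)


def embed(messages, vocab):
    """Toy stand-in for a sentence encoder: L2-normalised word counts."""
    counts = Counter(word for m in messages for word in m.split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


embeddings = [embed(msgs, vocabulary) for msgs in student_errors.values()]
labels = kmeans(embeddings, k=2)

for student, label in zip(student_errors, labels):
    print(student, "-> cluster", label)
```

In this sketch, students with similar error profiles end up in the same cluster, and the cluster label could then be passed as an extra feature to a downstream predictor. In practice, a pretrained sentence encoder and a library K-Means implementation would replace the two toy helpers.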