Publication details

Comparison of Three Programming Error Measures for Explaining Variability in CS1 Grades

Authors

ŠVÁBENSKÝ Valdemar, PANKIEWICZ Maciej, ZHANG Jiayi, CLOUDE Elizabeth B., BAKER Ryan S., FOUH Eric

Year of publication 2024
Type Article in Proceedings
Conference Proceedings of the 29th Annual ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE'24) [to appear]
Web ArXiv.org preprint
DOI http://dx.doi.org/10.1145/3649217.3653563
Keywords programming education; introductory programming; introduction to programming; novice programming; computer science education
Description Programming courses can be challenging for first-year university students, especially for those without prior coding experience. Students initially struggle with code syntax, but as more advanced topics are introduced across a semester, the difficulty in learning to program shifts to learning computational thinking (e.g., debugging strategies). This study examined the relationships between students' rate of programming errors and their grades on two exams. Using an online integrated development environment, data were collected from 280 students in a Java programming course. The course had two parts. The first focused on introductory procedural programming and culminated with exam 1, while the second covered more complex topics and object-oriented programming and ended with exam 2. To measure students' programming abilities, 51,095 code snapshots were collected from students while they completed assignments that were autograded based on unit tests. Compiler and runtime errors were extracted from the snapshots, and three measures -- Error Count, Error Quotient, and Repeated Error Density -- were explored to identify the measure that best explains variability in exam grades. Models utilizing Error Quotient outperformed the models using the other two measures, both in explained variability in grades and in Bayesian Information Criterion (BIC). Compiler errors were significant predictors of exam 1 grades but not exam 2 grades; only runtime errors significantly predicted exam 2 grades. The findings indicate that leveraging Error Quotient with multiple error types (compiler and runtime) may be a better measure of students' introductory programming abilities, though it still does not explain most of the observed variability.
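Of the three measures compared, Error Quotient (Jadud, 2006) scores consecutive pairs of compilation events: a pair contributes more when both events end in an error, and more still when the error type repeats. The following is a minimal sketch, not the paper's implementation; it assumes each snapshot has been reduced to a (has_error, error_type) tuple, and uses Jadud's standard pair weights (8 for both-in-error, +3 for a repeated type, normalized by 11):

```python
def error_quotient(events):
    """Jadud's Error Quotient over one student's session.

    `events` is a chronological list of (has_error, error_type) tuples,
    where error_type may be None for successful compilations.
    Returns a value in [0, 1]; higher means errors persist across
    consecutive compilations.
    """
    pairs = list(zip(events, events[1:]))
    if not pairs:
        return 0.0  # fewer than two events: EQ is undefined, report 0

    total = 0.0
    for (prev_err, prev_type), (cur_err, cur_type) in pairs:
        score = 0
        if prev_err and cur_err:
            score += 8                # both compilations failed
            if prev_type == cur_type:
                score += 3            # same error type repeated
        total += score / 11           # normalize pair score to [0, 1]
    return total / len(pairs)         # average over all pairs
```

For example, a session of two identical errors followed by a successful compilation yields an EQ of 0.5: the first pair scores 11/11 and the second scores 0. The names `error_quotient` and the tuple encoding are illustrative choices, not identifiers from the paper.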
