Data-Driven Test Case Generation for Automated Programming Assessment

Abstract

Building high-quality test cases for programming problems is an important part of any well-built Automated Programming Assessment System. Traditionally, test cases are created by human experts or by machine auto-generation methods based on the problem definition and sample solutions. Unfortunately, human experts cannot anticipate the numerous ways in which programmers can construct erroneous solutions, and machine auto-generation methods are complex, problem-specific, and time-consuming. This paper proposes a fast, simple method for generating high-quality test sets for a programming problem from an existing collection of student solutions for that problem, and demonstrates its effectiveness in online programming course assessments. The experiments show that, when applied to large collections of such programs, the method produces concise, human-understandable test sets that provide better coverage than test sets built by experts with extensive teaching experience.

Publication
Proceedings of the Conference on Innovation and Technology in Computer Science Education (ITiCSE)