Is ‘Vibes-Based Grading’ in First-year Mathematics Fair?

Authors

Keywords:

First-year mathematics, transition pedagogy, holistic grading, assessment

Abstract

BACKGROUND

Many students enter Australian universities underprepared for the mathematical content of the degrees they are pursuing, unskilled in how to learn mathematics, and unenthusiastic about the subject. To overcome these disadvantages, it is crucial that first-year mathematics courses engage students using the best pedagogical principles and practices, especially those related to transition pedagogy.

In the Mathematical Sciences Institute at the Australian National University (ANU), we have recently undertaken a fundamental redesign of MATH1013, our first mathematics subject for engineering students, physics students and others. The redesign aimed to directly address issues of transition pedagogy. A key piece of the project is a new assessment structure which largely decouples the question of whether a student should pass the course from the question of which passing grade the student should receive. The final exam is used to determine the level of pass the student should receive. The exam is constructed to allow students opportunities to demonstrate the depth of their conceptual understanding and higher-order mathematical thinking in the context of the course, and it is graded using holistic grading. In holistic grading, also known as nonreductionist grading, a student’s performance on a complex task, or a collection of related tasks, is “considered an entity in itself, which cannot be sub-divided into component performances” (Clarke, 1996). Our final exam contains six problems, each of which has subparts of varying complexity. A grader reads the entirety of the student’s response to a problem and assigns one number that purports to represent the level of understanding demonstrated throughout the response. Recently, one of our students described this as “vibes-based grading.”

AIMS
In this study we aim to address the following research question: Are the final exam grades for MATH1013 being awarded fairly? 

DESIGN AND METHODS

In any fair grading scheme, the score assigned to a student’s work is determined by the understanding demonstrated in the response, not by the caprice or idiosyncrasies of the grader. We conducted an experiment in which two experienced graders each used a prescribed methodology to assess a large sample of student work. We considered two measures of interrater reliability: percent agreement and Cohen’s kappa statistic (McHugh, 2012).
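As an illustration only, the two reliability measures named above can be sketched in a few lines of Python. This is a minimal sketch of the standard unweighted definitions, not the analysis code used in the study; the abstract does not state whether a weighted or unweighted kappa was computed, and the function names and the example scores mentioned below are hypothetical.

```python
from collections import Counter


def percent_agreement(scores_a, scores_b):
    """Fraction of responses on which the two graders gave the same score."""
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("both graders must score the same non-empty sample")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by chance,
    computed from each grader's marginal distribution of scores.
    """
    n = len(scores_a)
    p_o = percent_agreement(scores_a, scores_b)
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    categories = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both graders used a single identical score throughout;
        # kappa is undefined in this case, so return NaN (an assumption
        # made for this sketch).
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)
```

For example, with hypothetical scores `[1, 1, 0, 0]` from one grader and `[1, 0, 0, 0]` from the other, the graders agree on three of four responses, so percent agreement is 0.75; chance agreement is 0.5, giving a kappa of 0.5. Kappa is lower than percent agreement because it discounts the agreement that two graders would reach by chance alone.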

In this talk, we will describe the methodology used and the results obtained.

REFERENCES

Clarke, D. (1996). Assessment. In: Bishop, A.J., Clements, K., Keitel, C., Kilpatrick, J., Laborde, C. (eds) International Handbook of Mathematics Education. Kluwer International Handbooks of Education, vol 4. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-1465-0_10

McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia Medica, 22(3), 276–282.

Published

2025-09-22