This post isn’t entirely physics-based; its argument could be applied to most subjects examined in New Zealand. I have much more to say about a review of our *physics* curriculum, but because I want to address some issues from last year’s exams, I will wait until all remarking is finished so I don’t accidentally get myself in trouble…

Today I want to talk about some implications of New Zealand having an examination system that uses a *“ranking grade”* (also known as “grading on a curve”), which is interestingly **dressed up** and marketed as *“standards-based grading”*.

According to the NCEA website, an “excellence” grade in Level 2 Physics mechanics requires “a comprehensive understanding of mechanics”. The expectation, therefore, is that any student who shows “a comprehensive understanding of mechanics” should be awarded an excellence grade. This is unfortunately where the misdirection starts.

This same system promoting standards-based grading has expectations of what the grading curve should look like. Disturbingly, I hear a large number of stories from people who know NCEA markers (not just in physics) that marking schedules for the exams are continuously adjusted so that the spread of marks matches the expected grading curve. I didn’t explicitly mention it in my post on “5 tau”, but I suspect the curriculum/syllabus was effectively changed that year by a marking panel that decided (post hoc) that a question wasn’t hard enough, and that the “5 tau” requirement was added to produce the correct mark distribution for the overall examination.

As a teacher of science, I find this abhorrent.

So, what are the implications of this? Let’s play a “lightning round game“.

Suppose I (Teacher A) am teaching a large (very large) number of physics students and I figure out a way to make them understand physics extraordinarily well. What happens in their exams? *They all get excellence grades.*

Now, because we fit our grades to a curve, what happens to the students of Teacher B, who hasn’t changed their approach? *Their students’ grades are shifted down. From Teacher B’s perspective, it looks as though they have done something worse this year, even though nothing has changed.*

Now suppose that you (Teacher C) also invent a new way to make students understand physics better, but not quite as well as my way. What happens? *Maybe some students’ grades improve, but the data is skewed by my (Teacher A’s) students.*

Finally, imagine that all three of us taught the same as we always have, but the examiner accidentally writes an easier-than-normal examination. What happens? *In order to maintain the grade curve, some questions are marked harder. Perhaps the examiner is now looking for a special “key word” (5 tau!).*
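The quota effect in the first two scenarios can be sketched with a toy simulation. To be clear, all the numbers, the 25% quota, and the `curve_grades` helper below are invented for illustration; this is not NCEA’s actual process, just the logic of any fixed-quota curve:

```python
def curve_grades(scores, quota=0.25):
    # Award Excellence to the top `quota` fraction by rank,
    # regardless of the absolute standard each script reaches.
    n_excellence = max(1, round(len(scores) * quota))
    cutoff = sorted(scores, reverse=True)[n_excellence - 1]
    return [s >= cutoff for s in scores]

teacher_b = [85, 75, 65, 55]          # Teacher B's raw scores never change

# Year 1: Teacher A teaches as usual.
teacher_a_y1 = [80, 70, 60, 50]
grades_y1 = curve_grades(teacher_a_y1 + teacher_b)
b_excellence_y1 = sum(grades_y1[4:])  # Excellences won by B's students

# Year 2: Teacher A's new method lifts their students' raw scores.
teacher_a_y2 = [95, 94, 93, 92]
grades_y2 = curve_grades(teacher_a_y2 + teacher_b)
b_excellence_y2 = sum(grades_y2[4:])

print(b_excellence_y1, b_excellence_y2)  # 1 0
```

Teacher B’s raw scores are identical in both years, yet under the quota their Excellence count drops from one to zero purely because someone else’s cohort improved.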

The first three points above are criticisms of a ranking-based system: it impacts our ability to impartially critique our own teaching practice, because the goalposts shift each year. As we enter another exciting year of developing an inquiry, I wonder how we can accurately measure any progress using NCEA examinations based on a ranking system. For this reason I am now judging my teaching practice with a pre- and post-test using the FCI (Force Concept Inventory) and Lawson’s CTSR (Classroom Test of Scientific Reasoning). Although I am still teaching with the NCEA examinations as a goal, at least I have tests I can compare year on year that are known to be reliable and valid.

The final point from above (that the examiner writes an easy test and has to tweak the marking schedule) specifically shows the weakness of a ranking system *dressed up* as standards-based grading. If we had a pure ranking system, this would not be an issue: we would simply report students’ ranks based on an unaltered marking schedule. But because we want to achieve a certain number of excellences, the marking schedule of our physics exam is distorted and twisted, and this ends up influencing the very curriculum we teach to.

As a teacher of physics, a subject I consider to be objective and full of laws and truth, our current system does not sit well with me.

So let me finish with a specific example. Last year I dropped teaching L2 Waves to my students so that they could try to understand Mechanics at a deeper level. How did my change affect your results, a teacher I have never met, with students I have never taught? Although just a drop in the ocean (N=70), it had implications for your students in both Waves and Mechanics! If my students had historically done well in Waves, then by not sitting this exam they left more Merits and Excellences in the pool for your students. If my students had historically struggled in Waves, then by no longer entering them, and thereby decreasing the total number of students sitting that paper, your students would have suffered.

And because my students concentrated and spent more time on Mechanics, and inevitably got better grades on that paper, your students unfortunately had a lower chance at the limited number of Merits and Excellences available in that paper.
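The pool effect can be sketched the same way. Again, the scores, the 25% quota, and the `curve_grades` helper are invented purely for illustration:

```python
def curve_grades(scores, quota=0.25):
    # Award Excellence to the top `quota` fraction of the cohort by rank.
    n_excellence = max(1, round(len(scores) * quota))
    cutoff = sorted(scores, reverse=True)[n_excellence - 1]
    return [s >= cutoff for s in scores]

your_students = [72, 68, 64, 60]   # your cohort's raw Waves scores, fixed
my_strong = [90, 88, 86, 84]       # my cohort, if historically strong
my_weak = [40, 38, 36, 34]         # my cohort, if historically weak

# Your Excellences when my strong students sit vs. when they withdraw:
yours_with_strong = sum(curve_grades(your_students + my_strong)[:4])
yours_without_me = sum(curve_grades(your_students))

# Your Excellences when my weak students sit (a larger, weaker pool):
yours_with_weak = sum(curve_grades(your_students + my_weak)[:4])

print(yours_with_strong, yours_with_weak, yours_without_me)  # 0 2 1
```

Withdrawing a strong cohort frees up Excellences for everyone else (0 becomes 1 here); withdrawing a weak cohort takes them away (2 becomes 1), even though your students’ scripts are identical in every case.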

Obviously the numbers here mean that any influence is incredibly small, but you get my point. There are a number of flaws in the system…