Are you an instructional leader, department chair, or lead teacher trying to support more consistent assessment practices among your team–and specifically, more consistency in how student work is evaluated, scored, or rated? This post is for you.
Before we jump in, it’s worth naming: subjectivity is par for the course. Every human brings their own perspectives, experiences, biases, and understandings to the table. That said, we can mitigate bias and benefit from the best forms of subjectivity (multiple, diverse perspectives) when we do these three things:
Hold these three things in mind as you review the five ways below to help your team calibrate when rating student work. And we know you’re busy! So we’ve ordered them by complexity and lift. You might use one or more of them to help plan an upcoming professional development workshop or series, or you might introduce them as key methods across your faculty for calibrating when scoring student work samples.
“Two Views” is the practice of ensuring that, before a final decision is made, another person has the opportunity to review the work and provide input on the decision. Importantly, this should be someone with relevant knowledge and, ideally, a different point of view from yours.
In practice? Partner with another teacher or coach and have them periodically, or frequently, rate the same piece of work (or part of a piece of work) that you are rating. Treat any discrepancies as helpful signals that you may need to look more closely at the work to ensure you can substantiate your rating.
Short on time? Choose the particular dimension of analysis or assessment that you feel most unsure about, and present your request for help in an open-ended (don’t lead the witness!) but targeted way.
“I’m unsure which level on the skill progression Liza meets, based on how she has introduced her argument in paragraph 1. Can you try rating this for me, and share your rating and rationale?”
Practice, practice, practice. Another important way to calibrate when rating student work is simply to create recurring opportunities to practice. Bring your team together, provide work samples or ask teachers to bring their own, and then use a step-by-step protocol like this to structure and guide the session.
Modeling and Debrief is much like a “Fishbowl” activity: Have one or several teachers who have demonstrated strength in rating student work model the process. Their job is to “think out loud” as they review the student work sample, reference the scoring criteria, discuss their ratings and rationale, and make their determination.
Tips? Have them do it twice: choose one piece of work that is fairly straightforward to score, and another that seems more difficult to rate. Give everyone, including observers, time to review the work and practice rating it on their own. Then ensure sufficient time for the teachers in the Fishbowl to engage in meaningful discussion about the work and model norms of appreciative inquiry (stay curious, stay positive) and evidence-based decision-making.
Finally, debrief the experience with the full faculty, using such guiding questions as:
Remember your Probability and Statistics class? When it’s not feasible to survey every person in a population, we use a sample–a smaller, representative part–to gather data. By analyzing the sample, we can generalize or make predictions about the entire population.
This is that logic, applied to calibrating faculty on how to rate student work using a stable set of criteria or indicators. Instead of evaluating every single piece of scored work, we’ll have faculty score one piece of work as a “sample” to better understand how aligned the team is, and to identify areas for further learning and calibration.
Here’s how it works:
This takes time, practice, and organization, but in the long run it pays off. The process of selecting student work samples and determining which performance level each represents is itself incredibly helpful for further aligning your team! Over time, you can curate student work samples, across different learning contexts such as disciplines or grade levels, that reflect a particular level on the skill progression or rubric, so that both educators and students have tangible examples of what success looks like along a learning trajectory.
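If you want to put a quick number on how aligned your team is after a sampling round, a simple percent-agreement check can help. Below is a minimal sketch in Python, using hypothetical teacher names, rating levels, and an illustrative threshold (none of which come from a particular protocol), that tallies how many raters matched the most common rating for one sampled piece of work.

```python
from collections import Counter

# Hypothetical ratings from one sampling round: each teacher rated the
# same piece of student work against a shared skill progression.
ratings = {
    "Teacher A": "Level 3",
    "Teacher B": "Level 3",
    "Teacher C": "Level 2",
    "Teacher D": "Level 3",
    "Teacher E": "Level 3",
}

# Find the most common rating and how many raters chose it.
counts = Counter(ratings.values())
modal_rating, modal_count = counts.most_common(1)[0]

# Percent agreement: the share of raters who matched the modal rating.
agreement = modal_count / len(ratings)

print(f"Most common rating: {modal_rating}")
print(f"Agreement: {agreement:.0%}")

# An illustrative threshold for deciding whether to revisit the
# criteria together before the next sampling round.
if agreement < 0.8:
    print("Consider a calibration conversation on this criterion.")
```

Percent agreement is only a rough signal, of course; the richer insight comes from talking through why ratings diverged.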
And there you have it! Five ways to calibrate when rating student work. Happy calibrating.