This year I was selected to be a reader for the AP Physics 1 exam. I really had no idea what to expect other than that I’d work my tail off for a week. I’d gathered from hearing other people talk about it that it was a wonderful experience. And I’m happy to report that, yes, it is!
I wanted to write up my experience to help future readers understand what it’s like, and also to give people some insight into the intricate process of getting roughly 160,000 exams (the approximate number of AP Physics 1 exams for 2018) graded efficiently and, most importantly, accurately in 7 days. Later, I’ll write a post detailing what I learned about how to help students succeed on the paragraph-length response FRQs.
Travel, Lodging, Food, and Other Logistics
I was chosen in the last round of selections sometime in February. I received an email linking me to the AP Reader site that had all the information I needed. My flight and lodging were 100% covered. Three meals and two snacks a day are provided (with vegetarian and gluten-free options, though the latter can be slim pickings), and the menu for the week was posted well in advance. One night is designated “Dine-out Night.” Dinner isn’t provided then because you’re encouraged to go out to a restaurant with your fellow readers, and that meal is reimbursed (up to $25).
Transport, including Lyft and Uber, is reimbursed to and from the airport and your home. Meals paid for on the travel days are also reimbursed.
Readers are housed 2 to a room, with the option to name a roommate if you know someone who’s going and you’d like to stay with them. The hotel was a block from the convention center, but for the one group at a hotel further out, a shuttle was provided. Rooming alone is possible, but you have to cover half the cost of the room.
The process of booking the flight and lodging was all done through the AP Reader website and was quick and painless.
The First Day
Each day was divided into 4 sessions, each about 2 hours long, with 2 fifteen-minute breaks and an hour-long lunch interspersed.
Each person is assigned to grade one question for the week. I was assigned AP Physics 1, question 5 of the primary exam, a paragraph-length response question. “P1Q5 Operational” (as opposed to alternate or international), to use the lingo. Everyone grading the same question is in the same room (or a partition of a larger room).
There were a total of 16 P1Q5 readers, each of whom worked under 1 of 3 Table Leaders. For the first two sessions, my Table Leader (TL), the other readers under that TL, and I went through 30-40 “sample” packs, questions chosen specifically by the TLs to help train us on the rubric. We were provided a copy of the rubric along with their notes clarifying particularly sticky points. We would grade 5 or so questions by ourselves, then discuss as a group with the goal of independently coming to the same grade by hitting the same points on the rubric.
The training session was essential to all of us having some confidence going into the afternoon sessions where we’d be actually grading our first exams.
Table Leaders, Aides, and Safety Nets
We sat two to a table and were encouraged to discuss with our neighbor when we needed a second opinion. Boxes were lined up at the front of the room, each filled with 12 folders of 25 exams. Each folder also had 5 Scantron-like pages, one for each question; on ours, we’d write and bubble in the Q5 score for each exam.
After finishing a folder, we’d hand it off to our TL, who would then check over (“backread”) our work for rubric consistency. They were our safety net for most of the week. What was great, though, was that it was never stated or implied that we “messed up” or that we “missed something.” Any time my TL approached me, it always came from a place of “here’s what I got, how did you get what you got?” And not in a you’re-actually-wrong-but-I’m-being-euphemistic-about-it way, but in a way that was truly open to our opinions. Any time I changed my original grade, it wasn’t because I was outranked (though I’m sure they’re prepared to do that if necessary) but because I agreed after we’d compared our reasoning. There were plenty of times when my TL would say “that’s what I got too, I just wanted to check” or “oh, ok, I see now.”
I felt like my opinion was desired and respected, that my expertise in physics teaching was welcomed. This was essential to me feeling like a valued individual part of a team vs. an interchangeable, disposable cog in the machine. All in all, our TLs were the most influential part of the experience. All for the better.
As the week went on, the TLs backread our packets less and less, signifying a growing trust in our ability to be consistent. This belief was backed by data too, as exam sheets were graded continuously. The Question Lead (who supervised the TLs) and the TLs would look at the score distributions for each reader constantly to make sure that exams were being graded fairly and consistently.
The room was also staffed by a group of aides. They organized boxes, put folders of exams back in the proper boxes, carted boxes in and out, and, most importantly in my opinion, checked the bubble sheets for errors. If I wrote a 3, but bubbled a 4, they would catch it, bring it over to me, and have me correct it. While I’m proud that this only happened 10-15 times (out of 12,426 questions I graded), that is 10-15 times too many. To err is human, and our aides saved countless students from the consequences of our errors.
Everything about this process assumes that mistakes will be made, which is simply recognizing reality. Layers of safety nets are therefore crucial to ensuring fair grading for every student. And it was always, always, always emphasized that accuracy was more important than speed. 100%. Every time. Not once were we pressured to go faster or meet a quota.
Fine-tuning the Algorithm
About halfway through the second day I started to fine-tune my process. Grab two folders (which became three later on, to save on the number of trips to the pile), place them precisely *here* on the table, etc. Closing one book then opening the next can be condensed to just opening the next one under the one I’m grading, thus combining it into one fluid motion. Write and bubble the score while my fingers thumb through the next book to find the exact right page to flip to, further condensing the process.
Ah, crap, it’s a stack of books with a thinner paper weight. Now I’ve got to “recalibrate” my thumb. Aaaand we’re back to the thicker weight. Recalibrate again.
By the third day, I’d pretty much seen every iteration of every student response. Patterns start to emerge. I’m needing to read and process each response less and less and instead I move to scanning for key words. Lots of students answered part B from an inertia/Newton’s Second Law/Hooke’s Law perspective, which couldn’t be done to the level of detail required by the rubric by students with this level of math and physics knowledge. So instead of reading the argument for the 8,346th time, taking it in word-for-word, weighing the students’ proper use of vocabulary, skimming over the 5,683rd student definition of inertia… it became “argument from inertia” and was scored accordingly.
Regardless of my initial take on a response, I always scanned for key words and phrases:
- Kinetic energy, elastic/spring energy, KE, EPE, SPE
- Transfer, exchange, roll over (an interesting way to describe energy exchange)
- Equations applying the conservation of momentum and/or energy
Without using these phrases, or others like them, students couldn’t arrive at a sufficiently detailed explanation. At the end of the day, though, it was far less about the specific words and more about the ideas the student was attempting to express. It was less “looking for key phrases” (like “energy was conserved”) and more like looking for evidence that the student is communicating, say, that kinetic and elastic energy are exchanged back-and-forth during oscillation.
This perspective helped me feel good about the grades I assigned. Never once did I feel like a student was punished for not using the correct phrasing, nor that a student was awarded points for trying to game the system by throwing out an assortment of physics-y language. Simply writing “conservation of energy” without sufficient evidence that it was specifically “elastic to kinetic back to elastic during oscillation” wasn’t enough to be awarded points.
It got even more fine-pointed than that. For example, if someone wrote “momentum was conserved in the collision, so velocity had to stay the same” they probably wouldn’t get points for that because it demonstrates that they don’t understand how to apply that principle. But “the blocks stick together, mass goes up, so to keep momentum the same, velocity had to go down” would get points even though they didn’t use “inelastic collision.”
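As a sketch of the reasoning behind that second response (the symbols, and the assumption that the second block starts at rest, are mine for illustration and not taken from the exam question), a block of mass m_1 moving at v_i that sticks to a block of mass m_2 gives:

```latex
% Conservation of momentum for a perfectly inelastic collision,
% assuming the second block (mass m_2) is initially at rest.
\begin{align}
  m_1 v_i &= (m_1 + m_2)\, v_f \\
  v_f &= \frac{m_1}{m_1 + m_2}\, v_i < v_i \quad \text{since } m_2 > 0
\end{align}
```

The total momentum is unchanged, but because the moving mass went up, the speed has to go down, which is exactly the idea the student expressed in words.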
I greatly enjoyed the experience and would happily do it again were I invited. Being away from home for this long is tough on me, so that would be the only reason I’d turn down future opportunities.
Overall, I was highly impressed with everything about this process, both in its fairness to students and in how well I was taken care of.