Can A Computer Grade Essays As Well As A Human? Maybe Even Better, Study Says | WAMU 88.5 - American University Radio


Computers have been grading multiple-choice tests in schools for years. To the relief of English teachers everywhere, essays have been tougher to gauge. But look out, teachers: A new study finds that software designed to automatically read and grade essays can do as good a job as humans — maybe even better.

The study, conducted at the University of Akron, ran more than 16,000 essays from both middle school and high school tests through automated systems developed by nine companies. The essays, from six different states, had originally been graded by humans.

In a piece in The New York Times, education columnist Michael Winerip described the outcome:

Computer scoring produced "virtually identical levels of accuracy, with the software in some cases proving to be more reliable," according to a University of Akron news release.

"In terms of consistency, the automated readers might have done a little better even," Winerip tells All Things Considered host Melissa Block.

The automated systems look for a number of things in order to grade, or rate, an essay, Winerip says. Among them are sentence structure, syntax, word usage and subject-verb agreement.

"[It's] a lot of the same things a human editor or reader would look for," he says.

What the automated readers aren't good at, he says, is comprehension, including judging whether a sentence is factually true. They also have a hard time with other forms of writing, like poetry. One such automated reader is e-rater, software developed by Educational Testing Service.

Les Perelman, a director of writing at the Massachusetts Institute of Technology, was allowed to test e-rater. He told Winerip that the system has biases that can be easily gamed.

E-Rater prefers long essays. A 716-word essay [Perelman] wrote that was padded with more than a dozen nonsensical sentences received a top score of 6; a well-argued, well-written essay of 567 words was scored a 5.

"You could say the War of 1812 started in 1925," Winerip says. "There are all kinds of things you could say that have little or nothing to do in reality that could receive a high score."

Efficiency is where the automated readers excel, Winerip says. The e-rater engine can grade 16,000 essays in about 20 seconds, according to ETS. An average teacher might spend an entire weekend grading 150 essays, he says, and that efficiency is what drives more education companies to create automated systems.
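To put those figures side by side, here is a rough back-of-the-envelope calculation. It uses only the numbers quoted above (16,000 essays in about 20 seconds, and a 150-essay weekend stack); the variable and function names are illustrative, and the result is an approximation, not a measured benchmark.

```python
# Figures quoted in the article (attributed to ETS and to Winerip).
ERATER_ESSAYS = 16_000   # essays e-rater is said to grade...
ERATER_SECONDS = 20      # ...in about this many seconds
TEACHER_ESSAYS = 150     # essays a teacher might grade over a weekend


def essays_per_second(essays, seconds):
    """Return the average grading throughput in essays per second."""
    return essays / seconds


# Approximate throughput implied by the ETS figure.
erater_rate = essays_per_second(ERATER_ESSAYS, ERATER_SECONDS)

# Time the software would need for one teacher's weekend stack at that rate.
seconds_for_teacher_stack = TEACHER_ESSAYS / erater_rate

print(f"e-rater throughput: about {erater_rate:.0f} essays per second")
print(f"time for a {TEACHER_ESSAYS}-essay stack: about {seconds_for_teacher_stack:.2f} seconds")
```

By that arithmetic, the software works through roughly 800 essays per second, so a stack that might take a teacher a weekend would take it well under a second.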

"Virtually every education company has a model, and there's lots of money to be made on this stuff," he says.

A greater focus on standardized testing and homogenized education only serves to increase the development of automated readers to keep up with demand, Winerip says.

Winerip says his worry is that if automated readers become the standard way of grading essays, teachers will begin teaching to them, removing a lot of the "juice" of the English language.

"If you're not allowed to use a sentence fragment ... [or] a short paragraph ... then you're going to get a very homogenized form of writing," he says. "The joy of writing is surprise."

Copyright 2012 National Public Radio. To see more, visit http://www.npr.org/.
