Fact Checking, Online Communities and Journalism

A few tidbits about fact checking that caught my eye recently:

First, here’s a shocker: algorithms can do it. Tipped via the National Science Foundation’s Science360 site:

Indiana University scientists create computational algorithm for fact-checking

  • June 17, 2015

FOR IMMEDIATE RELEASE

BLOOMINGTON, Ind. — Network scientists at Indiana University have developed a new computational method that can leverage any body of knowledge to aid in the complex human task of fact-checking.

As a former professional fact checker, I smiled a bit at their “complex human task” description. Sometimes it is complex, but many facts you check for publication in a daily newspaper (people’s names, titles, addresses, spellings, dates, quotations) are pretty straightforward, and the sources are fairly structured. If you think of the sources as one part of a network and the query as another, it’s basically a (geometric) math problem. So it makes complete sense that an algorithmic approach could, in principle, do this work; following paths to explore where the fact “lives,” and whether it can be located in multiple sources with different axes to grind, is what a resourceful fact checker will do, computer or human.
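To make that geometry concrete, here is a toy sketch of the path-following idea: a claimed fact looks plausible when a short path connects its subject and object in a knowledge network. This is my own illustration, not the IU researchers’ actual algorithm, and the entities and edges in the mini-graph below are invented for the example.

```python
from collections import deque

# Toy knowledge graph: each edge links two entities that co-occur in
# some structured source. Nodes and links here are invented examples.
GRAPH = {
    "Barack Obama": {"Honolulu", "United States", "President"},
    "Honolulu": {"Barack Obama", "Hawaii"},
    "Hawaii": {"Honolulu", "United States"},
    "United States": {"Barack Obama", "Hawaii", "President"},
    "President": {"Barack Obama", "United States"},
}

def path_length(graph, start, goal):
    """Breadth-first search: return the number of hops between two
    entities, or None if no path connects them."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr == goal:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None

def plausibility(graph, subject, obj):
    """Crude truth score: short paths look plausible, long or
    missing paths look dubious."""
    hops = path_length(graph, subject, obj)
    return 0.0 if hops is None else 1.0 / (1 + hops)

print(plausibility(GRAPH, "Barack Obama", "Honolulu"))  # 0.5 (one hop)
print(plausibility(GRAPH, "Narnia", "Hawaii"))          # 0.0 (no path)
```

A real system would of course weight paths by the reliability and generality of the intermediate nodes, which is where checking a fact against multiple sources with different axes to grind comes in.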

I wonder what is next for these researchers, and I hope it involves not just checking the facts fed in but also finding a way to determine which facts (and biases) in a document need to be checked. Although there is often misinformation at the root of factual errors, more pernicious and harder to automate is smoking out persistent bias, a problem of sense-making in which true facts are nonetheless marshaled to dubious or faulty ends, or, less balefully, are just not applicable to the question at hand. (Insert old pirates and global warming joke here.) If such tests were computerizable, it might end, or at least put a dent in, blog commenting as we know it. Not a bad outcome. Fact checking is also a notoriously unoptimized activity, at least when done by humans. The more obscure the fact (and, honestly, the less relevant), the more heroic and inefficient the quest (microfilm, anybody?). That works for sleuthing in the stacks for that telling citation, but on the web bad facts spread like wildfire, and catching them fast and correcting them decisively would be a real service.

Second, Poynter, a resource for journalists, has an interesting piece about using “gossip communities” (their term) as sources for journalists. Writer Ben Lyons pegs his story to a now-debunked social science study about whether people became more open to gay marriage after in-person canvassing. He sheds light on what happens when a journalist needs to enter a subculture, abrasive and unreliable though such communities sometimes are, to get or check a story.

Complications abound with such “online community beats”: real names are rare, verifiable sources likewise, and the details can often only be checked against the comments of other people in the same world. Nonetheless, in the case of the gay marriage canvassing story, the PoliSciRumors.com community did raise doubts about the data long before it unraveled more publicly. I suppose a modern-day Woodward & Bernstein team wouldn’t be meeting in a parking garage, but in a chat room over Tor!

The next bite, “fact checking: are you doing it at all?” comes from the science journalist (and inventor of the Dance Your Ph.D. contest!) John Bohannon, who explains that the results of his fake study linking chocolate to weight loss were an all-too-easy sell to the media, who didn’t bother to sniff out that the results (and the publication they appeared in) were rubbish. From his lively explanation at io9:

Here’s a dirty little science secret: If you measure a large number of things about a small number of people, you are almost guaranteed to get a “statistically significant” result. Our study included 18 different measurements—weight, cholesterol, sodium, blood protein levels, sleep quality, well-being, etc.—from 15 people. (One subject was dropped.) That study design is a recipe for false positives.

It was a perfect storm of problems: fishing for p-values, a very small n, and, to top it off, a “pay to play” journal that published the paper two weeks after submission, without changing a word, for the low, low cost of 600 euros.
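The arithmetic behind that “recipe for false positives” is easy to check. Assuming (unrealistically, since real measures correlate) that the 18 outcomes are independent and every null hypothesis is true, the chance of at least one spurious hit at the 0.05 level is:

```python
# Probability of at least one spurious "significant" result when
# running m independent tests at the alpha level, assuming every
# null hypothesis is actually true (independence is an idealization).
def family_wise_error(m, alpha=0.05):
    return 1 - (1 - alpha) ** m

# Bohannon's chocolate study measured 18 different outcomes.
print(f"{family_wise_error(18):.0%}")  # prints "60%"
```

In other words, with that design the hoaxers were more likely than not to get a publishable “finding” from pure noise.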

The experiment was craptastic, but the news coverage was a dream. And now his “results” are probably part of the corpus of facts that the IU researchers’ computers will have to untangle. Maybe they will factor in the questions from commenters, who, unlike the professional journalists, actually raised them.

And as a bonus, Priceonomics has a timely entry about scientific retractions, with the point that the increase in their number is possibly due more to better policing than to an epidemic of cheating (although that remains a possibility).

Fact Checking Words: Doing it Diligently

In the 24-hour 360-degree news cycle that is the Web, fact checking seems to be a lost art, but I encountered this interview with an editor at a small Virginia paper that suggests otherwise:

From the American Press Institute site:

Fact checking a sensitive story: 6 good questions with News Leader editor William Ramsey.

I was particularly struck by these bits:

Q: Can you describe how the fact checking was conducted for this series? Did you use a checklist? A spreadsheet? A particular process?

A: We had a multi-pronged approach. We generated a list of every factual statement (not actual copy) from the main stories and sent it to state officials, who used investigators and PIOs to verify the information. This was critical since a portion of our reporting featured narratives rebuilt from disjointed case records. We also sampled a percentage of our hand-built database and determined an error rate, which was really low. We made those error fixes, and re-sampled another portion, which held up. For accuracy in [Borns’]  writing, we extracted facts from her project’s main story and made a Google spreadsheet for the team, using it to log verification of each fact, the source, the person checking and a note when a change was made to the draft.
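The verification log Ramsey describes is easy to reproduce in any tool. Here is a minimal sketch of one; the column names and the sample fact below are my invention, not the News Leader’s actual spreadsheet, but the fields mirror the ones he lists (the fact, its source, who checked it, and whether the draft changed).

```python
import csv
import io

# Columns mirroring Ramsey's verification log.
FIELDS = ["fact", "source", "checked_by", "draft_changed"]

rows = [
    # Invented example entry, for illustration only.
    {"fact": "Agency received 120 case files in 2014",
     "source": "state records request",
     "checked_by": "govt reporter",
     "draft_changed": "no"},
]

# Write the log as CSV, which imports cleanly into a shared spreadsheet.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The point is less the code than the discipline: every extracted factual statement gets a row, and no row is closed until a named person has logged its verification.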

Q: In the fact checking of this series, were there any lessons learned that will be used at the News Leader in the future, or could be replicated at other news organizations?

A: I hope so. We tried two new ideas I liked: war room Fridays and a black hat review.

For the Friday sessions, we took over a conference room and brought in reporters not connected to the project. On one Friday, for example, our government reporter spent the day checking story drafts against state records.

For the “black hat” review, borrowed from the software development industry, we took turns playing a critic’s role, peppering ourselves in a hostile interview about process, sources and conclusions. It gave us actionable information to improve the content before it published.

There is so much yammer about computational journalism (much of it hype to my old-school ears), but this example of using both old-fashioned and computer approaches to fact check the work of journalism seems to me a lot more valid than trawling the data for “news” and then reporting it even if specious or trivial. I particularly like the image of “black hat” fact checkers. In cybersecurity, it seems, you call (at least some of) these people pen testers.

 

The Washington Times (a different publication from the current one) as it was 100 years ago. From the Library of Congress’s Historic American Newspapers Collection, which has a display of front pages from 100 years ago each day.

 

Surprising Words: Fact Checking in Books

When I was a news researcher, it surprised me that we were allowed to cite a fact previously reported in our own pages to resolve a query. But at least the effort to get things right was serious; if this Atlantic piece is correct, book publishers don’t bother now, and never really did.

One of the most notorious and colorful publishing frauds. One quibble with the Atlantic piece…fact-checking and fraud detection are distinct tasks. As is rooting out bias. Most editorial “gatekeepers,” the few that are left, don’t attempt all three.

http://www.theatlantic.com/entertainment/archive/2014/09/why-books-still-arent-fact-checked/378789/

“When I was working on my book, I did an anecdotal survey asking people: Between books, magazines, and newspapers, which do you think has the most fact-checking?” explained Craig Silverman, author of Regret the Error, a book on media accuracy, and founder of a blog by the same name. Almost inevitably, the people Silverman spoke with guessed books.

“A lot of readers have the perception that when something arrives as a book, it’s gone through a more rigorous fact-checking process than a magazine or a newspaper or a website, and that’s simply not the case,” Silverman said. He attributes this in part to the physical nature of a book: Its ink and weight imbue it with a sense of significance unlike that of other mediums.

Fact-checking dates back to the founding of Time in 1923, and has a strong tradition at places like Mother Jones and The New Yorker. (The Atlantic checks every article in print.) But it’s becoming less and less common even in the magazine world. Silverman suggests this is in part due to the Internet and the drive for quick content production. “Fact-checkers don’t increase content production,” he said. “Arguably, they slow it.”

What many readers don’t realize is that fact-checking has never been standard practice in the book-publishing world at all.

Reasonable Words: Fact Checking in a Digital World

Years ago (pre-Web) I was a news researcher for the Washington Post, and in a job before that I responded to inquiries from Congress as a staffer at the Congressional Research Service. Both jobs involved digging things up and fact checking in books (“You Could Look It Up”) and in expensive databases that you dialed up and mined with arcane search strings. Now resources beyond anything I had access to (and that included the largest library in the world when I was at LoC) are a few taps of a cell phone away. But in that constant blizzard of content, what’s reliable? What counts as news? What factual standards should reporters aim for, particularly in a breaking-news situation?

A group of journalists has just put out a web resource to address this issue, The Verification Handbook. I’ve only just begun to browse it, but the content looks strong and the need is real.

The Verification Handbook

Thanks to Joyce Valenza’s school library blog, NeverEndingSearch, for the pointer.