Fact Checking, Online Communities and Journalism

A few tidbits about fact checking that caught my eye recently:

First, here’s a shocker: algorithms can do it. Tipped via the National Science Foundation’s Science 360 site:

Indiana University scientists create computational algorithm for fact-checking

  • June 17, 2015


BLOOMINGTON, Ind. — Network scientists at Indiana University have developed a new computational method that can leverage any body of knowledge to aid in the complex human task of fact-checking.

As a former professional fact checker, I smiled a bit at their “complex human task” description. Sometimes it is, but many of the facts you check for publication in a daily newspaper (people’s names, titles, addresses, spellings, dates, quotations) are pretty straightforward, and the sources are fairly structured. If you think of the sources as sitting in one part of a network and the query in another, it’s basically a (geometric) math problem. So it makes complete sense that an algorithmic approach could, in principle, do this work. Following paths to explore where the fact “lives,” and whether it can be located in multiple sources with different axes to grind, is what a resourceful fact checker will do, computer or human.
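To make the “following paths” idea concrete, here is a toy sketch in Python. It is not the Indiana University method, just my own illustration: the statements, names and the `shortest_path` helper are all made up, and a plain breadth-first search stands in for whatever the researchers actually compute. The only point is that a claim linking two things looks more plausible when a short chain of sourced statements connects them.

```python
from collections import deque


def build_graph(statements):
    """Turn (entity, entity) pairs into an undirected adjacency map."""
    graph = {}
    for a, b in statements:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of entities, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical statements, as if pulled from structured sources.
STATEMENTS = [
    ("Jane Doe", "Springfield City Council"),
    ("Springfield City Council", "Springfield"),
    ("John Roe", "Shelbyville"),
]

graph = build_graph(STATEMENTS)
print(shortest_path(graph, "Jane Doe", "Springfield"))
print(shortest_path(graph, "Jane Doe", "Shelbyville"))
```

The first query finds a two-step chain through the city council; the second finds nothing, which is the cue for a fact checker (of either kind) to go looking for another source.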

I wonder what is next for these researchers, and I hope it involves not just checking the facts fed in but also finding a way to determine which facts (and biases) in a document need to be checked. Although there is often misinformation at the root of factual errors, more pernicious and harder to automate is smoking out persistent bias. That is a problem of sense-making, in which true facts are nonetheless marshaled to dubious or faulty ends or, less balefully, are just not applicable to the question at hand. (Insert old pirates and global warming joke here.) If such tests were computerizable, it might end, or at least put a dent in, blog commenting as we know it. Not a bad outcome.

Fact checking is also a notoriously unoptimized activity, at least when done by humans. The more obscure the fact and, honestly, the less relevant, the more heroic and inefficient the quest (microfilm, anybody?). That works for sleuthing in the stacks for that telling citation, but on the web bad facts spread like wildfire, and catching them fast and correcting them decisively would be a real service.

Second, Poynter, a resource for journalists, has an interesting piece about using “gossip communities” (their term) as sources for journalists. Writer Ben Lyons pegs his story to a now-debunked social science study about whether people became more open to gay marriage after in-person canvassing. He sheds light on what happens when a journalist needs to enter a subculture, abrasive and unreliable though it may sometimes be, to get or check a story.

Complications abound with such “online community beats”: real names are rare, verifiable sources likewise, and the details can often only be checked against the comments of other people in the same world. Nonetheless, in the case of the gay marriage canvassing story, the PoliSciRumors.com community did raise doubts about the data long before it unraveled more publicly. I suppose a modern-day Woodward & Bernstein team wouldn’t be meeting in a parking garage, but in a chat room on Tor!

The next bite, “fact checking, are you doing it at all?” comes from the science journalist (and inventor of dance your PhD thesis!) John Bohannon, who explains that the results from his fake study linking chocolate to weight loss were an all-too-easy sell to the media, who didn’t bother to sniff out that the results (and the publication they appeared in) were rubbish. From his lively explanation at io9:

Here’s a dirty little science secret: If you measure a large number of things about a small number of people, you are almost guaranteed to get a “statistically significant” result. Our study included 18 different measurements—weight, cholesterol, sodium, blood protein levels, sleep quality, well-being, etc.—from 15 people. (One subject was dropped.) That study design is a recipe for false positives.

It was a perfect storm of problems: p-values, a very small n, and then, to top it off, a “pay to play” journal that published it two weeks after submission, without changing a word, and for the low, low cost of 600 Euros.
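The arithmetic behind that “recipe for false positives” is easy to sketch. The quote gives the 18 measurements; the rest is my assumption: that each measurement is tested separately at the conventional 0.05 significance threshold, and that the tests are independent (real measurements such as weight and cholesterol are not, so treat this as a rough back-of-the-envelope figure). The function name is mine.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests comes out
    'significant' at threshold alpha when there is no real effect at all."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 5, 18):
        print(f"{n:2d} measurements: {chance_of_false_positive(n):.0%}")
    #  1 measurements: 5%
    #  5 measurements: 23%
    # 18 measurements: 60%
```

So under those assumptions, measuring 18 things gives roughly a 60 percent chance that something will look “significant” by luck alone, before the tiny sample even enters the picture.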

The experiment was craptastic, but the news coverage was a dream. And now his “results” are probably part of the corpus of facts that the IU researchers’ computers have to untangle. Maybe they will factor in the questions from commenters, who, unlike the professional journalists, raised doubts.

And as a bonus, Priceonomics has a timely entry about scientific retractions, with the point that the increase in their number is possibly due more to better policing than to an epidemic of cheating (although that remains a possibility).

