Checking It Twice

By this time in the season you’ve probably heard one or another rendition of the familiar Christmas carol about Santa’s annual performance measurement regimen. Mr. Claus and team work hard to make sure the results of the North Pole poll are accurate. After all, it would never do to have children receiving gifts they don’t want or deserve.

In his book, Proofiness: The Dark Arts of Mathematical Deception, Charles Seife points out that verification is also the cornerstone of good journalism.1 He writes, “Responsible reporters must take nothing for granted—only by basing every single sentence of your story upon observations or verifiable facts can you be assured that you’re reporting the truth.”2 Grittier newspaper veterans say it this way: “If your mother says she loves you, check it out.”

But seeing may or may not be believing. Our powers of observation are limited, and our senses can be easily deceived. And sometimes our judgment is inadequate: unreasonable claims can seem credible to those of us with little knowledge of the domain they pertain to. My brother (an electrical engineer) reminded me of this when I sent him the link to a slick advertisement for the “ultimate all-in-one device,” the Pomegranate mobile phone. Like me, at first he wondered whether the device was for real. But it didn’t take him long to figure it out. He explained that the coffee-brewing feature tipped him off to the hoax:

No cell phone battery has enough stored energy to heat a cup of water! There’s a characteristic of materials called heat capacity—the measure of how much energy is required to raise the temperature of some given material by 1 degree Centigrade. The heat capacity of water is huge, more than twice that of silver. It’s even larger than the heat capacity of iron. (This means it takes more energy to heat a little cube of water than to heat a little cube of iron.) I don’t know how much energy it takes to heat a cup of water but I’d guess you’d need a dozen, or maybe more, large flashlight batteries.
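His hunch is easy to check with a little arithmetic. Here’s a rough sketch in Python (the cup size, temperatures, and battery capacity are my own assumptions, not figures from the ad):

```python
# Rough check: energy needed to heat a cup of water vs. energy
# stored in a cell phone battery. All inputs are assumed values.

mass_kg = 0.25          # assume one 250 ml cup of water (~250 g)
delta_t_c = 95 - 20     # assume heating from 20 °C to 95 °C
c_water = 4186          # specific heat of water, J/(kg·°C)

q_joules = mass_kg * c_water * delta_t_c    # Q = m · c · ΔT

# Assume a circa-2009 phone battery: ~1 Ah at 3.7 V.
battery_joules = 1.0 * 3.7 * 3600           # Ah × V × (s/h) = joules

print(f"Energy to heat the cup: {q_joules / 1000:.0f} kJ")
print(f"Energy in one battery:  {battery_joules / 1000:.0f} kJ")
print(f"Batteries per cup:      {q_joules / battery_joules:.1f}")
```

Even ignoring heat losses, brewing a single cup would drain the entire charge of about six phone batteries.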

The ad is a parody of our society’s adoration of new gadgets, so its trickiness is benign. Its actual purpose is to entice viewers to spend their time on the real-world experience of life in Nova Scotia.

There are, of course, plenty of phony claims broadcast into the infosphere for less than honorable reasons—claims with no basis in fact whatsoever. Seife calls phony quantitative claims Potemkin numbers. The name comes from a legend about the Russian Prince Potemkin, consort of Catherine the Great. The prince wanted to convince the empress that the Crimea was a vital, thriving area. So he built Potemkin villages, elaborate facades in the shape of villages and towns that appeared real from a distance. But they were fake and insubstantial. Potemkin numbers are the same thing—completely made-up numbers.


Seife gives several examples of Potemkin numbers in his book. However, as an excuse for posting a fascinating holiday image that someone sent me (shown at right), I’ll mention one qualitative Potemkin claim that Seife recounts: a 1950s cigarette advertisement promised “from first puff to last, Chesterfield gives you a smoke measurably smoother…cooler…best for you!”3 He wonders how such highly subjective attributes could be successfully defined, let alone measured.

More troubling than Potemkin numbers, though, are claims that are half-truths. While these are based on verifiable facts, the evidence behind these claims isn’t nearly as weighty as their promoters suggest. One such partial truth is what Seife calls disestimation, “the act of taking a number too literally and understating or ignoring the uncertainties that surround it.”4

Many library advocacy claims are disestimations. I saw an example this week in a story about an economic impact study done at the Free Library of Philadelphia. The story reports that the Free Library has been an essential resource for starting or improving an estimated 8,630 area businesses. Rather than saying “about 8,500 or 9,000” (a fairer portrayal of the data), the story gives 8,630, a figure that makes the findings seem more definite. Or, as Seife says, this pseudo-precision dresses a number up “as an absolute fact instead of presenting it as the error-prone estimate that it really is.”

Exactly how exact is this 8,630 figure, anyway? Because neither the story nor the report says what sampling method was used, we can only guess. If survey respondents were self-selected, the findings are unlikely to be an accurate reflection of the larger population of all active Free Library patrons. Probably, then, 8,630 is considerably slanted in one direction or the other. (Maybe the true figure is closer to 2,000, or maybe it’s closer to 12,000.) If probability sampling were used, the survey results still wouldn’t match the larger patron population exactly. There would be some play in the data (sampling error), although in that case the figures would be more meaningful than figures from a self-selected (convenience) sample.
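To get a feel for how much play even a well-run probability sample leaves, consider this hypothetical sketch. Every number in it is invented (the story reports neither the sample size nor the population), but the sample proportion is chosen so the point estimate lands on 8,630:

```python
import math

# Hypothetical sketch: every number here is invented, since the
# story reports neither the sample design nor the sample size.
population = 50_000     # assumed count of business-owning patrons
sample_size = 400       # assumed number of survey respondents
p_hat = 0.1726          # assumed sample proportion helped by the library
                        # (chosen so the point estimate lands on 8,630)

# 95% margin of error for a simple random sample proportion,
# ignoring the finite-population correction.
moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)

point = p_hat * population
low, high = (p_hat - moe) * population, (p_hat + moe) * population
print(f"Point estimate: {point:,.0f} businesses")
print(f"95% interval:   {low:,.0f} to {high:,.0f} businesses")
```

Even under these favorable assumptions (a true random sample, no nonsampling error), the honest answer is a range from roughly 6,800 to 10,500 businesses, not a single number.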

Regardless of the sampling method the researchers used, there is also other noise (nonsampling error) in the data, from sources such as respondents misunderstanding questions or providing socially desirable answers.

Given these kinds of challenges in survey research, it would have been nice if the study researchers had said more about the accuracy of their data. They might have explained how their assumptions affected the results and how different assumptions could produce different ones. Or they might have explored other data sources that could shine a light on their findings. And why not give estimates as ranges, like “we think between 8,000 and 9,000 businesses”; as a conservative minimum, like “at least 7,000 businesses”; or in phrasing that emphasizes the approximate nature of the data, like “roughly 8,500, give or take a few hundred”?
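As a small illustration of how a report might do that, the sketch below renders a point estimate and its margin of error into those three phrasings. The 8,630 figure and the margin of 1,800 are placeholders, not numbers from the study:

```python
def as_range(estimate, margin):
    """Report the estimate as a rounded range."""
    lo, hi = round(estimate - margin, -3), round(estimate + margin, -3)
    return f"between {lo:,} and {hi:,} businesses"

def as_minimum(estimate, margin):
    """Report only a conservative lower bound."""
    return f"at least {round(estimate - margin, -3):,} businesses"

def as_approximation(estimate, margin):
    """Report a rounded figure with its give-or-take."""
    return f"roughly {round(estimate, -2):,} businesses, give or take {round(margin, -2):,}"

# Placeholder values, not figures from the Free Library study.
for phrasing in (as_range, as_minimum, as_approximation):
    print(phrasing(8630, 1800))
```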

Sure, readers are more comfortable with supposedly clear-cut findings. Maybe something like:

Naughty:                8.347%
Nice:                      91.653%

But they’re also interested in the truth, and the truth is that estimates are always fuzzy. When writing final reports and executive summaries, researchers should take the time to educate readers about this fuzziness.

 
———————-

1  Sian Brannon of Denton Public Library (Texas) told me about this book. Thanks, Sian!
2  Seife, C., 2010, Proofiness: The Dark Arts of Mathematical Deception, New York: Penguin Group, p. 226.
3   Seife, C., p. 15.
4   Seife, C., p. 23. The reasons why study findings can never be completely definitive or exact can be found in Seife’s book, as well as in introductory texts on behavioral science research, program evaluation, and performance measurement. Or you can peruse some prior entries in this blog like this one, this one, or this one.
