Do No Quantitative Harm

Every measurement and every real-world number is a little bit fuzzy, a little bit uncertain. It is an imperfect reflection of reality. A number is always impure: it is an admixture of truth, error, and uncertainty.
Charles Seife, Proofiness: How You Are Being Fooled by the Numbers

Seife explains that the most well-conceived measures and carefully collected data are still imperfect. In real-life situations where measurement designs are far from ideal and data collection is messy, the numbers are even more imperfect. The challenge for library assessment and research professionals is making sure our study designs and measures don’t make things any worse than they already are. To the best of our abilities we should strive to do no harm to the data.

Sharpening our skills in quantitative reasoning/numeracy will help make sure our measures aren’t exacerbating the situation. Here I’m continuing a quantitative exercise begun in my Nov. 9 post about a library return-on-investment (ROI) white paper connected with the LibValue project. In the post I explained that the ROI formula substantially exaggerated library benefits, a quantitative side-effect I suspect the researchers weren’t aware of.

A caveat before proceeding: Quantitative reasoning is not for people expecting quick and simple takeaways. Or those seeking confirmation of their pre-conceived notions. Quantitative reasoning is about thoroughness. It involves systematic thinking and lots of it! (That’s why this post is so long.)

This exercise involves one remaining component in the LibValue ROI formula I didn’t get to in my prior post. I can tell you up front that, measurement-wise, this component makes things worse. By that I mean it detracts from a sound estimate of library ROI rather than enhancing it. To see why let’s revisit the entire formula shown here with sample data:


LibValue White Paper ROI Formula with Sample Data.  Click image for formula only.

The expression circled in blue is the component we’ll be looking at. Let’s begin by first considering how this expression works arithmetically. After that we’ll explore the meanings of the specific measures in the expression.

You can see that the expression is a fraction containing multiple terms embedded in the larger fraction that is the entire ROI formula. In formula terms alone the blue-circled fraction looks like this:


Blue-Circled Fraction in Formula Terms

This fraction and its blue-circled version above are rates. A rate is a fraction where the numerator and denominator have different units of measure such as miles per gallon, influenza cases per 1000 adults, or library return per dollar invested. So, for the remainder of this post I’ll use the term formula rate to refer to the fraction shown here. And I’ll use rates for other fractions we encounter that meet the definition just given.

In the larger ROI formula the formula rate serves as an adjustment to the other two terms to its right ($101,596 X 1128 in the blue-circled fraction). As explained in my prior post, these two terms multiplied together are equivalent to a university’s total grant income. Total grant income, in turn, is used in the LibValue ROI formula to reflect library return/earnings (benefits).

Whether the formula rate ends up increasing or decreasing library benefits depends on each university’s data. For now, we can simply note that, based on the 8 universities studied in the white paper, the value of this rate hovers around 0.45. Its median value is 0.443, or roughly 4/9. So, it typically decreases the value of the rest of the formula numerator by about 55%.

The formula rate is made up of two products, one in the numerator and one in the denominator, which follow this pattern:


It also happens that the rate can be re-expressed as the product of two separate rates like this:

LibValue Ratio as 2 Separate Ratios

Formula Rate Expressed as Two Separate Rates

To identify these let’s call the left rate awards/proposals rate and the right one percentages rate. Using sample data from the blue-circled fraction the equations below demonstrate that the expression consisting of two separate rates (upper equation) is equivalent to the original formula rate (lower equation):


The two-rates (upper) and formula rate (lower) expressions are arithmetically equivalent.
Click for larger image.
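For readers who want to check the algebra themselves, here is a minimal Python sketch showing that the single-fraction formula rate and the product of the two separate rates always agree. The counts and percentages are hypothetical stand-ins, not the white paper’s actual figures:

```python
# Hypothetical stand-in data for one university (not the white paper's figures)
awards = 577             # number of grant awards
proposals = 1000         # number of grant proposals
pct_awards = 0.96        # % of faculty who say citations are important to grant awards
pct_proposals = 0.95     # % of proposals said to contain citations obtained through the library

# The formula rate: one fraction with a product on top and a product on the bottom
formula_rate = (awards * pct_awards) / (proposals * pct_proposals)

# The same value re-expressed as the product of two separate rates
awards_proposals_rate = awards / proposals
percentages_rate = pct_awards / pct_proposals
two_rates = awards_proposals_rate * percentages_rate

# The two expressions are arithmetically equivalent
assert abs(formula_rate - two_rates) < 1e-12
```

Rearranging the four terms this way changes nothing arithmetically, which is exactly why the decomposition is useful for analysis.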

Now let’s see how these two separate rates behave for the 8 universities surveyed in the white paper. Take a look at this dot plot:


Dot Plot of Awards/Proposals and Percentages Rates.  Click for larger image.

Note how the percentages rates (red markers) are high and stay within the range from .95 to 1.11, whereas the awards/proposals rates (green markers) are lower with a broader range, from .006 to .70.

Because the percentages rates hover so close to 1.0 their individual percentages essentially cancel each other out. Recall that multiplying any number by 1.0 yields the number itself. This is what happens when the awards/proposals rates are multiplied by percentages rates so close to 1.0. For example, in the upper calculation shown above the awards/proposals rate (.577 or .58 rounded) is multiplied by 1.01 resulting in .583 (or .58 rounded). This multiplication does almost nothing. So, in this case the formula rate can be simplified to:


When the percentages rate numerator and denominator are close, the formula rate
can be simplified to just the awards/proposals term.
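A quick sketch of the cancellation, using the rounded rates quoted above:

```python
awards_proposals_rate = 0.577  # the sample value from the calculation above
percentages_rate = 1.01        # hovers near 1.0 for most universities

formula_rate = awards_proposals_rate * percentages_rate  # about 0.583

# Rounded to two decimals, the adjustment disappears entirely
print(round(awards_proposals_rate, 2))  # 0.58
print(round(formula_rate, 2))           # 0.58
```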

The near equivalence of the formula rate and the awards/proposals rate alone is evident in these dot plots:


Dot Plots of Formula & Awards/Proposals Rates. Lower chart has decreased vertical axis.
Click either chart for larger image.

To provide a closer view of the gaps between the dots (markers) the lower plot omits university 3. Decreasing the axis range widens the scale units. In the plots you can see that the two rates stay close together university by university. So close that in the top plot 6 of the 8 marker pairs overlap. We can also gauge the closeness of these two rates by computing the differences between them as I’ve done here:


Comparison of Universities’ Formula and (Simplified) Awards/Proposals Rates.  Click for larger image.

The differences appear in row 3. The abbreviation Abs indicates that these are absolute values. Since we’re interested in any difference between the rates, the sign of the difference doesn’t matter. Row 4 gives the difference in row 3 as a percent of the formula value (row 1).
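The same two rows can be computed in a few lines. The rates below are illustrative values only; the white paper’s table has the actual figures for all eight universities:

```python
# Hypothetical formula rates and simplified awards/proposals rates for four universities
formula_rates = [0.58, 0.44, 0.31, 0.70]
simplified_rates = [0.58, 0.42, 0.33, 0.70]

# Row 3: absolute differences -- the sign of the gap doesn't matter
abs_diffs = [abs(f - s) for f, s in zip(formula_rates, simplified_rates)]

# Row 4: each difference as a percent of the formula rate
pct_diffs = [100 * d / f for d, f in zip(abs_diffs, formula_rates)]
```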

These differences (gaps) are also plotted here as percentages, sorted low to high:


Gaps in Universities’ Percentage Rate Terms.  Click for larger image.

Notice that for 4 of the 8 universities the gaps are 1.5% or less. For these universities, as I said, the percentages rate can be omitted because it’s inconsequential. For the other 4 universities the rate may be important. However, this does raise the question of whether these measures are worth the time and expense to collect.1  Answering this requires a larger and more representative sample than the white paper had.

There’s something else interesting to be seen in the pair of dot plots shown above. For 4 of the universities the formula rate equals or exceeds the awards/proposals rate. For the 4 other universities the opposite is true. (I’ll leave it to the reader to determine whether either of these sets of 4 universities matches the 4 mentioned as having gaps between the rates of 1.5% or lower.) This diagram explains the two scenarios:


How Percentages Rate Affects Value of Formula Rate.  Click for larger image.

Again, in the LibValue ROI formula the formula rate adjusts total grant income downward, typically by 55% for the sample of 8 universities in the white paper. (We can’t say whether or not this is true for the larger population of universities without a representative sample.) As seen in the dot plots above, the awards/proposals rate primarily determines the final value of the formula rate while the percentages rate plays a minor role.

Still, there’s another problem with the percentages rate that needs to be addressed in spite of its minor role. This has to do with a specific arithmetic behavior of the two rates. The awards/proposals rate can never exceed 1.0 (100%) since this measure was defined by the white paper researchers as a part of a whole—how many proposals succeeded (grants awarded) per grant proposal submitted. On the other hand, the percentages rate can exceed 1.0 and does so as we saw in the diagram above.

The percentages rate’s ability to exceed 1.0 has an unintended effect. All else being roughly equal, universities with a lower % of proposals that contain citations obtained through the library (the percentages rate denominator) earn higher library benefit estimates! That is, for these universities the final adjustment to total income is relatively higher than for others. Assigning higher library benefits to universities that use fewer library resources is inconsistent with the idea that making use of more library resources would produce more benefit.

While this unintended effect hinges on the arithmetic behavior of the formula rate, it involves the meanings of the individual measures as well. Since we have yet to address what the ROI formula measures mean, let’s begin this by re-visiting the original formula rate:


Formula Rate.  Click for larger image.

Looking still at the percentages, it seems the white paper researchers intended that each would modify—lessen, actually—the term to its left. Though they didn’t explain their measurement design decisions in detail, the researchers did say that not all grant proposal successes should automatically be attributed to library resources used. I’m presuming they weren’t comfortable giving libraries credit for grant successes that had nothing to do with use of library resources. So, they sought to dampen the awards and proposals counts some. Thus, % of faculty who say that citations are important to grant awards lessens number of grant awards and % of proposals that contain citations obtained through the library lessens number of grant proposals.

Their decision led to compromises that diminished the validity of the LibValue study findings. This is nothing unusual, of course. Every assessment, evaluation, and research project involves methodological compromises. Researchers just need to recognize them, understand their implications, and inform readers about the implications in the final reports. Unfortunately, this doesn’t always happen. When it doesn’t, it’s up to the astute (and quantitatively skilled!) reader to try to sort things out.

So let’s sort through the compromises the white paper researchers made. First, the measures needed for the formula rate numerator were counts of awards where citations contributed to awards success and total awards counts. Unfortunately, these counts were not available to the researchers. So, they relied on faculty opinions as a substitute. This substitute is the percentages rate measure we’re already familiar with, % of faculty who say that citations are important to grant awards.

But think about it. It would be mere happenstance for the proportion of faculty who believed that citations contribute to grant success to match the proportion of grant awards where citations actually did contribute. The fact that these two measures have different units—faculty members versus grant awards—is the first clue that it’s a stretch to equate them. More important, faculty perceptions are a poor approximation of actual grant awards decisions since faculty perceptions will tend to be stable over time compared to awards decisions. Therefore, the accuracy of this faculty opinions measure is questionable. One of the main tenets of assessment and evaluation is relying upon direct measures rather than upon impressionistic opinions about what these measures might be. Here faculty opinions are indirect and impressionistic measures.

The second compromise concerns the formula rate denominator. Here the measurement needed would be counts of grant proposals containing citations obtained from the library. Again, the researchers did not have access to these counts. So, they devised a completely different measure and labeled it as if it were the counts just described. What they actually collected were faculty opinions about % of citations in grant proposals, grant reports, and/or published articles accessed by faculty via university computer network or via the library.2  Note that the unit of measure for the measure collected is citations whereas for the named measure it is proposals.

An example should help clarify this mismatch: Suppose at one university faculty proposals, reports, and published articles contained a total of 10,000 citations and the faculty estimated that 96% of these were obtained from the library. This means that the 96% refers to 9,600 citations. Now suppose the university also submitted 1,000 grant proposals that year. Applying the 96% to the named measure—% of proposals that contain citations obtained through the library—amounts to 960 proposals containing citations obtained from the library. In any given year or at any given university the proportions of library citations (among all citations) and grant proposals containing library citations (among all proposals) could, by random luck, be the same. But this lucky match would not occur every year or at every university. And, of course, the percentages researchers collected were not based on actual citation counts but on rough faculty estimates.
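The arithmetic of this example can be laid out in a few lines:

```python
total_citations = 10_000   # citations across proposals, reports, and articles
total_proposals = 1_000    # grant proposals submitted that year
pct_from_library = 0.96    # faculty estimate of citations obtained from the library

# What the collected measure actually quantifies: citations
library_citations = total_citations * pct_from_library    # 9,600 citations

# What the relabeled measure claims to quantify: proposals
relabeled_proposals = total_proposals * pct_from_library  # 960 proposals

# Same percentage, but applied to two entirely different units of measure.
```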

Again, this is a case of collecting impressionistic data rather than direct measurements. Substituting less-than-ideal estimates for unavailable measures may be justified. But relabeling a measure to make it appear to be something that it is not is deceptive.

There’s a third possible compromise apparent in the ROI formula. This has to do with the meaning researchers ascribed to the formula rate. Take a look at this description of the University of Illinois at Urbana-Champaign (UIUC) ROI model appearing in the LibValue white paper:


Depiction of University of Illinois at Urbana-Champaign ROI Model.3  Click for larger image.

As explained in my prior post, the white paper ROI model is a near clone of the UIUC ROI model with one exception I describe below.4  In the UIUC model shown here note the bold text, Percentage of proposals that are successful and use citations obtained by the library. This text labels the rate (fraction) just below it, which is the formula rate we’ve been discussing all along.

The bold label describes the joint probability that a grant proposal was successful and also contained citations from the library. I’ll leave it to the reader to compare how joint probabilities are calculated with the formula rate calculation. In the meantime, let me just say this: If the label describes what researchers meant to measure, then the formula rate calculation is incorrectly specified. If the researchers purposely substituted an alternative calculation for a joint probability, then their ROI formula is another step removed from a valid measure of library ROI.
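To make the comparison concrete, here is a sketch with hypothetical counts (all numbers invented for illustration) showing that a true joint probability and the formula rate are different calculations:

```python
total_proposals = 1_000
successful = 300               # proposals that won awards
with_library_citations = 800   # proposals containing library-obtained citations
both = 250                     # proposals that are BOTH successful AND contain them

# A joint probability comes from a direct count of the overlap
joint_probability = both / total_proposals                # 0.25

# The formula rate multiplies separate percentages instead
pct_faculty = 0.96             # stand-in for the faculty-opinion percentage
pct_citations = with_library_citations / total_proposals  # 0.8
formula_rate = (successful * pct_faculty) / (total_proposals * pct_citations)

# The two quantities need not agree -- and can't both be the intended measure.
```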

Finally, we need to consider the meaning of the left part of the ROI formula rate, the awards/proposals rate. You probably have surmised that this rate is essentially the university’s grant application success rate—the number of grant awards won per grant proposal submitted. As seen already, it is mainly the awards/proposals rate that determines how much total grant income is lessened to obtain the final library benefit estimate.

So, let’s think about how this works. Say two universities each earned $10 million in grant income. According to the ROI formula the library benefit estimate is $10 million with some adjustment (usually) downward based on the formula rate. Suppose the first university submitted 10 grant proposals and 5 of these were funded. And the second submitted 100 proposals and 20 were funded. The success rates for the universities would be 50% and 20% respectively. Thus, the second university’s earnings end up adjusted downward by 80% (which is what a 20% formula rate does) while the first is adjusted downward by 50%.

Now consider a third university that submitted just 1 grant proposal which succeeded in obtaining a single grant award of $10 million. The ROI formula credits this university with library benefits equal to the full $10 million (100%). But why should one university deserve the full $10 million credit while others earn considerably less? Actually, there is no good reason for these gradations. The monetary value of grant income/revenue is whatever it is, and is tracked in university financial accounting systems the same, regardless of how it was or was not earned. Grant success rates have nothing at all to do with accurately tallying grant income. Since the ROI model defines library benefit as total grant income, then the benefit equals the total money received.
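The three-university example works out like this (a sketch that treats the formula rate as just the success rate, with the percentages rate near 1.0):

```python
grant_income = 10_000_000   # each university earned $10 million in grant income

# (proposals submitted, proposals funded) for the three universities in the example
universities = [(10, 5), (100, 20), (1, 1)]

benefits = []
for proposals, awards in universities:
    success_rate = awards / proposals   # 50%, 20%, and 100% respectively
    # The benefit estimate is the same $10 million income, scaled by the rate:
    benefits.append(grant_income * success_rate)

# benefits: $5 million, $2 million, and the full $10 million
```

Identical income, three very different "benefit" figures, based solely on proposal-writing behavior.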

Despite all of the time we’ve spent here deciphering the formula rate, this rate should never have been part of the ROI formula in the first place.5  Its inclusion made the formula worse rather than better.

Now it may be that accurate measurement requires adjusting gross grant income figures somewhat. But these adjustments should be based on possible causes of inaccuracy such as over- or under-reporting, record-keeping errors, inflation, and so on. Adjustments based upon extraneous characteristics of the universities won’t produce accurate results. These just adulterate the measures and make them difficult to interpret. Which brings up an interesting question: Just what does it mean to give universities with higher success rates more credit for the money earned? What is the meaning of library benefits that have been filtered this way?

The LibValue researchers came close to recognizing the formula rate didn’t belong in the ROI formula. Take a look at the first bold label in the UIUC formula shown above, Percentage of faculty who use citations in grant proposals who are also PI’s. The researchers rejected this rate, writing “The percentage of faculty who are principal investigators [PI’s] has no bearing on the library’s ROI.”6  (Leaving this rate out of their ROI formula is the only alteration to the UIUC ROI model made by the LibValue researchers.) They didn’t realize that their statement applied equally to the next part of the UIUC formula, the formula rate.

Let me finish now by acknowledging the elephant in the library. The biggest weakness in the UIUC and LibValue ROI models is equating library benefits with significant portions of university grant earnings. The quick example I used above involved portions between 20% and 100%. But those did not include cases where the ROI formula exaggerated grant income significantly (see my prior post). If we include exaggerated income estimates with the formula rate adjustment, the library benefits specified by the ROI model range from 40% to 225% of total grant income, with an average benefit estimate of 110% and median of 78% (based on the white paper data).

It must be quite a surprise to university teams who pored and sweated over proposal drafts—principal investigators, faculty collaborators and colleagues, graduate students, grants offices, support staff, and others—to learn that libraries feel justified in claiming something like 80% or 110% credit for the teams’ work, leaving grant creators with a small or even negative share.

Obviously, this allotment isn’t reasonable. It is much more likely that a library’s contribution to grant earnings would be single-digit percentages or less. Even a crude ratio of library citation pages to total proposal page counts would be, what, around 5%? If pages are weighted according to content and relevance, citations pages would be greatly outweighed. (Redundant, actually.)

Then there’s the question of credit for relevant content culled from cited articles, books, and other works. Do original creators get the credit? Or do grant authors deserve it for their ability to master, synthesize, and add to knowledge? Whichever way this might go, clearly the library’s role is tertiary. The library is basically the messenger. Of course, a messenger doesn’t deserve to be harmed for being a medium. But he shouldn’t take credit for the message either.

So how shall we gauge the value of libraries being the messenger (medium)? A difficult question, indeed. Its answer needs to be more credible than the white paper and UIUC ROI models.

1   For a measure to be meaningful it has to show some variation. Otherwise, it does not help us make useful distinctions between important subgroups in our population such as small academic, public university, and large private university libraries. A questionnaire item that elicits identical responses from all respondents probably isn’t measuring anything useful.
2   Tenopir et al. 2010. University Investment in the Library Phase II, p. 18. See Table 12. The percentages are estimates based on faculty recall rather than on actual citation counts.
3   Tenopir et al. 2010, p. 7.
4   See my prior post for more information about the UIUC model, including the fact that the model is specified differently in different sections of the article.
5   Studying the formula rate was worthwhile as a quantitative exercise, though. It involved some good ideas such as the importance of analyzing patterns in the data to see how the measure behaves.
6   Tenopir et al. 2010, p. 7.


Quantitative Thinking Improves Mental Alertness

It’s been a while since I’ve posted here. Writer’s block, I guess. I was hoping to come up with some new angle on library statistics. But to be honest, I haven’t been able to shake the quantitative literacy kick I’ve been on. I believe that quantitative literacy/numeracy is important in this era of data-driven, evidence-based, value-demonstrated librarianship. Especially when much of the data-driving, evidence-basing, and value-demonstrating has been undermined by what I’ll call quantitative deficit disorder. Not only has this disorder gone largely undiagnosed among library advocacy researchers and marketing aficionados, it has also found its way to their audiences. You may even have a colleague nearby who suffers from the disorder.

The most common symptoms among library audiences are these: When presented with research reports, survey findings, or statistical tables or graphs, subjects become listless and unable to concentrate. Within seconds their vision begins to blur. The primary marker of the disorder is an unusually compliant demeanor. Common subject behavior includes visible head-nodding in agreement with all bullet points in data presentations or executive summaries. In severe cases, subjects require isolation from all data-related visual or auditory stimuli before normal cognitive processes will resume.

The only known therapeutic intervention for quantitative deficit disorder is regular exercise consisting of deliberate and repetitive quantitative thinking. Thankfully, this intervention has been proven to be 100% effective! Therefore, I have an exercise to offer to those interested in staving off this disorder.

This exercise is more advanced than others I’ve posted in the past. Researchers who conducted the study I’ve chosen didn’t communicate their quantitative thought processes very clearly, meaning the exercise requires us to fill in several blanks in the study rationale.

The study is a white paper about library return-on-investment from the LibValue project at the University of Tennessee Center for Information and Communication Studies. The aim of the white paper is to determine how valuable library collections and services are in the university grants process.

It turns out that the LibValue study substantially exaggerates benefits attributable to library collections and services. To understand this we need to examine the formulas the researchers used. To begin with, they were guided by the basic formulas for calculating ROI and cost/benefit analysis shown here:

Basic ROI formula

Basic Return-On-Investment and Cost/Benefit Analysis Formulas

The idea is determining the degree to which monetary investments (costs) pay off in monetary earnings (benefits). Properly calculating ROI or cost/benefit ratios requires (1) a thorough identification of relevant earnings (benefits) and investments (costs) and (2) accurate assignment of monetary values to identified benefits and costs.
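In code, the two basic formulas might be sketched like this (standard textbook forms; the white paper’s figure may present them slightly differently):

```python
def roi(earnings, investment):
    """Basic return on investment: net return per dollar invested."""
    return (earnings - investment) / investment

def cost_benefit_ratio(benefits, costs):
    """Basic cost/benefit analysis: benefits per dollar of cost."""
    return benefits / costs

# A library that turns a $100 investment into $150 of benefits:
roi(150, 100)                 # 0.5, i.e. a 50% return
cost_benefit_ratio(150, 100)  # 1.5, i.e. $1.50 of benefit per dollar
```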

The white paper uses the following ROI formula, which I show here verbatim, although reformatted for readability:

LibValue ROI formula

LibValue white paper ROI formula.1  Click to see original format.

Though more complicated than the basic ROI formula shown above, this formula follows the same template. The return (earnings) part shows up in the numerator and consists of three separate expressions multiplied together—the multi-term fraction on the left and the two simpler terms to the right. The investment part appears in the denominator as the single term total library budget.

The white paper researchers adapted this formula from an earlier ROI study conducted at the University of Illinois at Urbana-Champaign (UIUC).2  The UIUC researchers, in turn, derived their formula from a 2003 ROI study of special libraries.3

Since the white paper doesn’t explain much about the thinking behind the formula (aka model), we have to refer to the UIUC article to understand it. Except those authors didn’t explain why their model ended up constructed as it is, other than to say they adapted the idea from the special libraries study. All we really have to go on is their table below:

UIUC Article Models

Adapted from UIUC article.4  Red annotations correct errors explained below. Row numbering and shading added. Click for larger image.

The UIUC model appears in the right column and the special libraries model it was adapted from appears in the left column. Both columns contain formula elements—measures, actually—arranged vertically and connected by what were minus signs (hyphens) followed by an equal sign. The minus signs are typographical errors which I have corrected in red. Based on text in the UIUC article these should be X’s to indicate multiplication (which also matches the white paper model). The top two rows are grayed out to indicate the part of the UIUC model that white paper researchers decided not to use.

To interpret the table follow either column downward. (Later you can, if you want, trace left to right to see how the UIUC researchers adapted the special libraries entries to academic libraries.) Following the right column beginning with row 3 we see two measures multiplied together: Grant proposal success rate using library resources and average grant income. This gives the intermediate product seen in row 7, average grant income generated using library resources.

From the left column in the table above we see the idea of using an average came from rows 5 and 7 of the special libraries model. Regrettably, the special library researchers mistakenly believed a median can be equated with an average. As if waving a magic wand over one can transform it into the other. Fairy dust aside, an average and a median are very different measures. Considering these to be interchangeable is innumerate.
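A few hypothetical grant amounts make the difference plain (skewed values like these are common in grant data; the numbers are invented for illustration):

```python
import statistics

# Hypothetical grant award amounts -- one large grant skews the distribution
grants = [50_000, 60_000, 75_000, 90_000, 2_000_000]

mean_grant = statistics.mean(grants)      # 455000: dragged up by the outlier
median_grant = statistics.median(grants)  # 75000: the middle value, unmoved
```

In a skewed distribution like this, swapping one measure for the other changes the result by a factor of six.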

Returning to the right column of the table, the sequence continues into row 8 where multiple calculations are crowded together, which may explain why the model ended up incorrectly specified. According to calculations in the article the first two terms in row 8 are both divided by library materials budget. I corrected this in the equation shown here:


Corrected equation from rows 7 & 8 in right column of UIUC table (above).

It’s not obvious at first, but there are significant measurement decisions embedded in this formula (and the white paper formula also). Decisions that the UIUC researchers acknowledged only in a couple of cryptic statements:

The model is extended to determine the return on the library budget from grant income.5

Quantifying grant income using awards data is problematic as grants can be multiyear awards with income received over time or all at once, and they can be extended or renegotiated. …Data on grant expenditures were recommended because they are part of the university’s reporting system that accounts for the disposition of each grant.6

Let’s untangle the thinking here so we can see what we think of it! The first statement announces that, in the numerator of the basic ROI formula, revenue (earnings/return) was replaced with grant income. Conceiving income as equivalent to revenue is fine as long as researchers and readers alike understand an important distinction between these accounting concepts which happens to be related to timing. Revenue is money earned irrespective of when you actually get your hands on that money. Income is money you have received and have had in your possession at some time. This distinction, and related complications, presented certain measurement challenges that are barely mentioned in the UIUC article.

This mention occurs in the second statement above, where the researchers concluded that quantifying (that is, measuring) income using awards data was problematic. But this conclusion doesn’t make sense since they already had chosen to substitute income for awards data (grant revenue). I think that the researchers had a problem with revenue and also with a particular type of income, income from one-time disbursements by grantors in the full amount of grant awards. That is, I believe the researchers’ issue was with library ROI earnings that occurred at a single point in time, whether as revenue or income.

For some (unexplained) reason, the researchers wanted income spread out over time, presumably over the lifetime of the grant. They sought to view all grant income this way whether or not this was how the income actually worked. For example, to the researchers a 3-year $6 million grant award disbursed all at once was equivalent to $500,000 received by the university each quarter or $2 million received annually. To this end, the researchers found a suitable way for spreading grant income over time when it wasn’t so in fact. They chose to gather data on grant expenditures as a substitute measure for grant income.
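The annualizing logic, in the example’s own numbers (a sketch of the equivalence, not the researchers’ actual computation):

```python
award = 6_000_000    # a 3-year grant disbursed all at once
years = 3

# Viewing the lump sum as if it were spread over the grant's lifetime:
annual_income = award / years            # $2,000,000 per year
quarterly_income = award / (years * 4)   # $500,000 per quarter
```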

Thus, the researchers’ measurement decisions had two stages: First, they substituted income data for revenue. Then, they substituted expenditure data for income data. As I say, the only rationale offered for these decisions are the statements quoted above. The LibValue white paper says nothing at all about measurement decisions the model entails, other than the omission of part of the UIUC model (shaded in the table above).

A related puzzle in both studies is exactly which year(s) the ROI data pertain to. The UIUC article says that grants data were collected from an unspecified 10-year period. A survey in the appendix queried faculty about both grant proposals for 2006 and awards received for the 5-year period ending in 2006. Yet, the researchers included this example calculation of their ROI formula indicating the data were from the single year 2006:

UIUC Model Example

Example calculation from UIUC article.7  Shading in the original.
Red annotation added. Click for larger image.

Notice the red-circled measure no. of grants (expended) in year. In the earlier table the ROI formula lists this measure as simply number of grants expended. (Another inconsistency is library [materials] budget in the table which evolved into total library budget in the example.)

Similar mixed messages appear in the LibValue white paper regarding the years the data are for. Again, a survey in the appendix queried faculty about both 2007 and the prior 5 years. And regression analysis results were reported for data for the unspecified 10-year period. The data tables in the article do not indicate what year(s) the data pertain to. But the data must be annual, rather than multi-year, as the measure from the white paper formula, grants expended each year, implies. And, of course, total library budget in the formula denominator is an annual statistic.

Therefore, we can presume that average grant income (UIUC model) and the average size of grant (white paper model) are annual figures. (Average size of grant means average grant award amount.) But there’s another wrinkle. The averages just named could have been annualized in one of two ways: (1) The averages were calculated from a single year’s grants or (2) they were calculated from multiple years’ grants. Either way poses a measurement challenge related to timing of earnings (grant income) versus investment (budget). If the grant income averages are from a single year, then grants awarded in a given year typically wouldn’t show up as income that same year, although income from prior year grants would. The substitution of expenditures for income was likely meant to resolve this, except expenditures timing could be out-of-sync with annualized grant income. If the averages are from multiple years, somehow the researchers needed to reconcile the multi-year data with annual investment (budget) data.

Spending more time second-guessing the researchers’ measurement decisions is probably not very fruitful here. But you get the idea. Comparing earnings—whether annualized over multiple years or for a single year—with a single budget year is a bit iffy. How can we know which grant awards were earned from one year’s library investment versus another year’s? It’s also conceivable that some grant awards were facilitated by more than one year’s library investment.

Let’s set these issues aside, as it’s the researchers’ responsibility to address them thoughtfully. Let’s take a look instead at one other interesting aspect of the formulas. In the table from the UIUC article (above) note that the average in row 7 is multiplied by number of grants expended in row 8. Here’s the calculation—corrected as explained earlier—with a twist I’ve added which I will explain:


A twist added to the equation from rows 7 & 8 of the table from the UIUC article (above).

My added twist is equating the two terms multiplied in the numerator left of the equal sign with a single term in the numerator on the right, total grant income generated using library resources. Here’s my reasoning: If you would, temporarily assume that number of grants expended is the same as number of grant awards received. This is roughly the same as equating income (which was apparently measured as reimbursements received for grant expenditures) with revenue (grant awards). As reported in the white paper, these two counts do happen to be equal for some universities but not for others.8   But we’ll get to that.

This assumption means that the UIUC researchers had to calculate average grant income by taking total grant income and dividing it by number of grants awarded. Here’s why the average in the numerator of the equation shown just above has become a total: Multiplying any average by the number of cases used to calculate that average yields the total amount for all cases—in this instance, total grant income.
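This relationship is easy to verify with a quick sketch (the grant figures below are hypothetical, chosen only to illustrate the arithmetic):

```python
# Multiplying an average by the number of cases used to calculate it
# recovers the total for all cases. Figures are hypothetical.
grant_incomes = [200_000, 450_000, 350_000, 1_000_000]  # four made-up grants

total_income = sum(grant_incomes)                   # 2,000,000
average_income = total_income / len(grant_incomes)  # 500,000

# average grant income x number of grants = total grant income
reconstructed_total = average_income * len(grant_incomes)
```

No matter what the individual amounts are, the average times the count always lands back on the total.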

This same multiplication occurs in the LibValue formula:

LibValue ROI right side

Right terms of LibValue formula numerator.

Again, multiplying average size of grant by the number of grants expended (which we’re assuming to be equivalent to number of grants awarded) gives the total grant income:


Right terms of LibValue formula numerator are equivalent to total grant income.

Simplified in this way, it becomes obvious that library earnings/benefits are based on total annual grant income, moderated to an unknown degree by spreading that income over time. Neither the UIUC article nor the LibValue white paper explain why it was necessary to use (annual) averages. But it stands to reason that they had to have access to data on total grant income in order to calculate the averages at all.

Just one more thing to consider. We assumed that the number of grant awards was equal to the number of grants expended. However, this assumption is only partially true. It is true for 3 of the 8 universities in the white paper, as indicated in red here:

Awards per Grants Expended

Grants expended compared to grant awards for 8 universities in LibValue white paper.9

For the rest of the universities (green in the table) grants expended outpaced grants awarded by factors of about 1.5 to one, three to one, four to one, and six to one. From all of this we can conclude: The right terms in the LibValue ROI formula numerator are equivalent at least to a university’s total annual grant income and usually to 300% to 600% of total annual grant income. Thus, the white paper considerably exaggerates library return/earnings for the majority of the universities studied (assuming the LibValue model as a whole to be acceptable).
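A short sketch (with made-up figures) shows how the expended-to-awarded ratio inflates the formula’s numerator:

```python
# Hypothetical university: when grants expended outnumber grants awarded,
# average award size x grants expended overstates total annual grant income.
total_income = 10_000_000  # made-up total annual grant income
grants_awarded = 50
grants_expended = 150      # a three-to-one ratio, as at some universities

average_award = total_income / grants_awarded  # 200,000
numerator = average_award * grants_expended    # 30,000,000

overstatement = numerator / total_income       # 3.0, i.e., 300% of income
```

Whatever the dollar amounts, the numerator scales directly with the expended-to-awarded ratio.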

To determine whether, and how much, the final ROI ratios are exaggerated we need to decipher the rest of the formula, including the multi-term fraction in the numerator. Besides this, there is also the question of how defining library benefits as equal to 100% of university grant awards can be justified.

But these are issues we can ruminate on in a sequel to this post. I mean, you can only think quantitatively for so long. Time to give our brains some rest.

1   Tenopir, C. et al. 2010. University Investment in the Library, Phase II: An International Study of the Library’s Value to the Grants Process, p. 7.
2   Luther, J. 2008. University Investment in the Library: What’s the Return, A Case Study at the University of Illinois at Urbana-Champaign.
3   Strouse, R. 2003. Demonstrating Value and Return on Investment, Information Outlook, 14-19.
4   Luther, J. 2008. p. 8.
5   Luther, J. 2008. p. 8.
6   Luther, J. 2008. p. 9.
7   Luther, J. 2008. p. 11.
8   Although it’s possible, for record-keeping reasons, for annual grant income to differ from annual grant expenditures, the white paper researchers tallied earnings based on grant expenditures rather than grant income. This means they also considered the number of grants expended in a given year to be equivalent to the number of grant awards receiving income.
9   Data from Table 1b in Tenopir et al. 2010. p. 9.

Posted in Measurement, Numeracy, Research

If You Plug Them In They Will Come

In their book What the Numbers Say Derrick Niederman and David Boyum say that the way to good quantitative thinking is practice, practice, practice! In this spirit I offer this post as another exercise for sharpening the reader’s numeracy skills.

A couple of months back I presented a series of statistical charts about large U.S. public library systems. Sticking with the theme of large public libraries, I thought I’d focus on one in particular, The Free Library of Philadelphia. This is because the PEW Charitable Trusts Philadelphia Research Initiative did an up-close analysis of The Free Library in 2012. So this post is a retrospective on that PEW report. Well, actually, on just this graph from the report:

PEW Philadelphia Report Bar Chart

Source: The Library in the City, PEW Charitable Trusts Philadelphia Research Initiative.

The PEW researchers interpreted the chart this way:

Over the last six years, a period in which library visits and circulation grew modestly, the number of computer sessions rose by 80 percent…These numbers only begin to tell the story of how the public’s demands on libraries are changing.1

The implication is that because demand for technology outgrew demand for traditional services by a factor of 8-to-1, The Free Library should get ready to plug in even more technological devices! This plan may have merit, but the evidence in the chart does not justify it. Those data tell quite a different story when you study them closely. So, let’s do that.

The main problem with the chart is that 80% is an exaggerated figure. It is potentially inflated on its own and is definitely inflated in the context of comparisons made in the chart. Let me begin with the first point.

The percentages in the PEW chart are cumulative rates, that is, rates of change calculated over multiple years. When cumulative rates are based on low initial values, the rates end up artificially inflated. This is clearly the case with a statement in the PEW report about digital downloads at the Phoenix Public Library increasing by more than 800%.2  When you see large 3- or 4-digit percentages, chances are the baseline (the denominator in the fraction that the percentage comes from) is too low. These percentages are so exaggerated they aren’t meaningful.
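Here is a minimal illustration, using made-up download counts, of how a small baseline inflates a cumulative rate:

```python
# Two measures with the same absolute growth (4,000 units) but very
# different baselines. Counts are hypothetical.
def cumulative_pct(baseline, current):
    """Cumulative percentage change from baseline to current."""
    return (current - baseline) / baseline * 100

small_baseline_rate = cumulative_pct(500, 4_500)      # up 800%!
large_baseline_rate = cumulative_pct(50_000, 54_000)  # up a mere 8%
```

Identical absolute growth, yet the small-baseline measure reports a rate 100 times larger.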

Another example of a measure on the upswing from low initial values is public computers in U.S. public libraries, seen in the chart below. The Institute of Museum and Library Services (IMLS) began collecting data on these in 1998.



The plotted lines indicate cumulative growth over time calculated from 5 different baseline years listed in the chart legend. Each line begins the year following the baseline year. For example, tracking growth based on the count of computers in 1998 begins in 1999 and extends to 2011. Growth based on the 2000 count begins in 2001. And so forth. In a sense, each baseline year begins with zero growth and increases from there.

The legend lists the count of computers for each of these years. (A peculiar use for a legend, I know. But the data are a bit easier to reference there than in a separate table or chart.) The arrow at the right edge of the horizontal axis indicates that in 2011 there were a total of 262,462 public computers in U.S. public libraries.

The calculation of how much growth 262,462 represents depends on what baseline year we choose. Using the 1998 count as the baseline (24,104, brown line) the 2011 count represents 989% growth. Using the 2002 level (99,453, orange line) the growth was 164%. Using 2004 (141,194, green line) gives 86%. And so on.
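These figures are straightforward to verify from the counts listed in the chart legend:

```python
# Cumulative growth in public computers through 2011, from the IMLS counts
# quoted above.
def cumulative_pct(baseline, current):
    return (current - baseline) / baseline * 100

computers_2011 = 262_462

growth_from_1998 = cumulative_pct(24_104, computers_2011)   # about 989%
growth_from_2002 = cumulative_pct(99_453, computers_2011)   # about 164%
growth_from_2004 = cumulative_pct(141_194, computers_2011)  # about 86%
```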

The earlier the baseline year, the higher the 2011 cumulative rate is. With each baseline year the cumulative rate decreases because the baseline amounts increase steadily as seen in the chart legend. So, gauging overall growth depends on how far back our hindsight stretches.

Now, let’s see how this dynamic plays out with The Free Library data. Computer use in 2005 at the library was not really a project startup, as is obvious from this chart:



Nevertheless, we know the trend involves low early numbers (denominators) and high later numbers (numerators), as that’s the gist of the PEW story to begin with! And we know that with cumulative rates there is leeway in selecting the baseline year. So, for kicks let’s see how the PEW researchers’ choice compares with other baselines.

To do this we need The Free Library’s computer usage data. Unfortunately, I don’t have access to the data from the PEW report. And also unfortunately, there are some irregularities with the official data the library reported to IMLS—computer uses in 2006 and 2007 were apparently under-counted. Presuming the PEW report to be correct, I estimated usage for 2005 through 2007 by extrapolating backward from the 2011 IMLS count using PEW’s 80% figure. By my calculation computer use at the library was roughly 722,000 in 2005. These alternative counts, IMLS-reported versus extrapolated, are shown in this chart:


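The backward extrapolation works like this (the 2011 count below is a round hypothetical standing in for the IMLS-reported figure):

```python
# If the 2011 count embodies 80% cumulative growth since 2005, then the
# 2005 baseline is the 2011 count divided by 1.80. The 2011 figure here
# is a placeholder, not the actual IMLS number.
count_2011 = 1_300_000
cumulative_growth = 0.80

estimated_2005 = count_2011 / (1 + cumulative_growth)  # roughly 722,000
```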

Using the extrapolated figures up until 2007 and the IMLS data thereafter, here’s what 2012 cumulative rates for this measure look like:



Consistent with the PEW report, in the chart the 2011 value for the 2005 baseline (brown) line is marked at 80%. Of course, this amount increased in 2012 since the rate is cumulative. The black dot (a line with no length) indicates the cumulative rate of growth from 2011 to 2012, which is equivalent to the annual rate of growth in 2012.

So, is the 80% cumulative growth in 2011 exaggerated? Possibly, if a lower figure can be used just as easily. And there is no rule of thumb for choosing from among alternative cumulative rates, making the choice an arbitrary one.

Whatever choice analysts make, it’s hard to make sense of cumulative rates in general. On their own the rates—80%, 67%, 50% and so on—sound high. But how high are they really, with multiple years rolled into them? And how well are readers able to evaluate them? I suspect that they’re like weather forecasts that audiences translate impressionistically into low-medium-high probabilities.

Another drawback with cumulative rates is that, as summary numbers (that is, the values listed at the right ends of the lines in the chart), they hide year-to-year details. Like the dip that occurred in computer use at the Free Library in 2009. Or the big jump in public computer counts reported to IMLS in 1999 seen in the earlier chart.

A more straightforward way to describe growth in performance data is to track it annually. This would be the ideal route for the PEW study because it eliminates the baseline year dilemma. Plus it provides readers with a more complete understanding of the data.

The next set of charts shows annual rates of growth in public computers in libraries along with rates for visits and circulation from 1999 to 2011. In chart A you can see that rates for public computer counts (green line) begin high due to smaller counts early on. But within a few years they fall pretty much in line with rates for established library services.

Public Computers Annual Growth


Notice in chart B that in the aftermath of the Great Recession total counts of traditional services fell while total public computer uses grew. Still, the point is that over time technology growth rates settled into the single digits.

The next chart gives annual rates of growth in public computer use at The Free Library. (These same percentages appeared at the left ends of trend lines in the chart above that shows cumulative growth.)



Looking at this chart, would you have guessed that the pattern up to 2011 represents 80% cumulative growth? The annual percentages, on the other hand, do help us understand what is behind the 80% from the PEW report and the 88% in 2012. We can describe the 80% as a trend in which growth was at or somewhat above 10% in 5 of 6 years, equivalent to an average annual growth of 10.3%. The same translation can be done for the 67% cumulative rate from 2006 to 2012. The 67% amounts to growth of roughly 10% or somewhat higher in 5 of 7 years, or an average annual growth of 9.0%.
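The translation from a cumulative rate to an average annual rate is a geometric mean, assuming growth compounds year over year:

```python
# Equivalent average annual growth rate for a cumulative rate over n years,
# assuming steady compound growth.
def avg_annual_rate(cumulative_pct, years):
    return ((1 + cumulative_pct / 100) ** (1 / years) - 1) * 100

rate_2005_2011 = avg_annual_rate(80, 6)  # about 10.3% per year
rate_2006_2012 = avg_annual_rate(67, 6)  # about 9% per year
```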

As you can see, describing growth trends is a bit of a moving target! The time range selected has a major bearing on the answer. In any case annual rates are easier for the typical reader to comprehend.

There’s another thing to be aware of with trends in growth rates. When we see a dip and then an increase, the rate for the year following the dip is somewhat exaggerated because its baseline (the dip year) is low. This applies to the 14% increase in 2010.

Finally, we come to my second point about the PEW report 80% figure being inflated “in the context of comparisons made in the chart.”  (I prefer the term biased over inflated.) The bias comes from comparing relative growth in measures having very different orders of magnitude. As seen in the chart below, annual visit and circulation counts at The Free Library ranged from 5.5 to 7.5 million, whereas public computer uses peaked at a bit above 1 million. So we’re talking a difference factor of 6 or 7 here.



As you may have surmised, the issue here is baselines (denominators) again! A 10% increase in circulation amounts to 700,000 units while a 10% increase in computer uses amounts to 100,000. In this example growth in circulation obviously outpaces growth in computer uses, despite the rates being identical. Comparing relative growth in measures of such different magnitudes is biased because it favors the measure with the smaller baseline. This next chart illustrates this bias:


The measure with the lowest 6-year increase gets all the glory!

The chart shows the net changes for the three measures presented in the PEW bar chart at the beginning of this post. The red annotations show the cumulative growth which each net count represents. (Data submitted by The Free Library to IMLS do not match the PEW chart: Based on IMLS data the 2011 cumulative rate of growth in circulation was 14.4%, not 11%. And the 2011 cumulative rate of growth in installed public computers was 20%, not 27%.)

Due to its low baseline—5 to 6 million lower than visits and circulation baselines—public computer uses shows stellar growth! Meanwhile, the other two statistics hobble along in what the PEW researchers described as a modest pace.
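The arithmetic behind this bias takes only a couple of lines (baselines rounded to the magnitudes mentioned above):

```python
# Identical 10% growth rates yield very different absolute gains when one
# baseline is about 7 times the other. Baselines are rounded approximations.
circulation_base = 7_000_000
computer_use_base = 1_000_000
rate = 0.10

circulation_gain = circulation_base * rate    # 700,000 more circulations
computer_use_gain = computer_use_base * rate  # 100,000 more computer uses
```

Equal rates, but the "modest" measure actually added seven times as many units.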

How curious that the third-place statistic was cast as the most impressive in the PEW report! Which makes me wonder whether PEW was intentionally praising the library for clearing a not-very-high bar, since it’s a lot easier to “grow” small programs than large ones. Getting circulation to increase by 14% was no meager accomplishment. Yet, this went unheralded in the PEW report.

Spinning data serves PEW’s agenda, as it does others who see the main function of public libraries as portals to the Internet. Of course, seeing through spun data requires un-spinning them. And the only things needed for this are time and sound quantitative thinking.

Incidentally, the story that the PEW bar chart only began to tell—about unrelenting demand for technology at libraries—didn’t exactly come true at The Free Library. Perhaps you noticed this in the chart of annual change in computer uses (the chart above with the single green line). The 14% rebound in 2010 occurred in a year when the number of computers remained unchanged. It’s doubtful the rebound can be attributed to longer hours, since the total hours The Free Library was open in 2010 were the lowest in 8 years. In 2012 the library had the second-lowest hours open over the same 8-year span. In 2012 growth in computer uses fell to a modest 4.6% even though the library had added 120 more computers. Looks like just plugging them in is no guarantee after all.


1  PEW Charitable Trusts Philadelphia Research Initiative, The Library in the City: Changing Demands and a Challenging Future, 2012, p. 10.

2  Pew Charitable Trusts Philadelphia Research Initiative, p. 14.

Posted in Advocacy, Data visualization, Library statistics, Numeracy

Averages Gone Wrong

In this post I’ll be telling a tale of averages gone wrong. I tell it not just to describe the circumstances but also as a mini-exercise in quantitative literacy (numeracy), which is as much about critical thinking as it is about numbers. So if you’re game for some quantitative calisthenics, I believe you’ll find this tale invigorating. Also, you’ll see examples of how simple, unadorned statistical graphs are indispensable in data sleuthing!

Let me begin, though, with a complaint. I think we’ve all been trained to trust averages too much. Early in our school years we acquiesced to the idea of an average of test scores being the fairest reflection of our performance. Later in college statistics courses we learned about a host of theories and formulas that depend on the sacrosanct statistical mean/average. All of this has convinced us that averages are a part of the natural order of things.

But the truth is that the idea of averageness is a statistical invention, or more accurately, a sociopolitical convention.1 There is no such thing as an average student, average musician, average automobile, average university, average library, average book, or an average anything. The residents of Lake Wobegon realized this a long time ago!

Occasionally our high comfort level with averages allows them to be conduits for wrong information. Such was the case for the average that went wrong found in this table from a Public Library Funding and Technology Access Study (PLFTAS) report:


Source: Hoffman, J. et al. 2012, Libraries Connect Communities: Public Library Funding & Technology Study 2011-2012, 11.

The highlighted percentage for 2009-2010 is wrong. It is impossible for public libraries nationwide to have, on average, lost 42% of their funding in a single year. For that average to be true almost all of the libraries would have had to endure cuts close to 40%. Or for any libraries with lower cuts (like 20% or less) there would have been an equivalent number with more severe cuts (70% or greater). Either way a groundswell of protests and thousands of news stories of libraries closing down would have appeared. These did not, of course. Nor did the official Institute of Museum & Library Services (IMLS) data show funding changes anywhere near the -42% in that table. The Public Libraries in the U.S. Survey data show the average expenditures decrease was -1.7%.2
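The IMLS figure is easy to confirm from the averages cited in note 2:

```python
# Average per-library operating expenditures from IMLS data (see note 2).
avg_2009 = 1_190_000  # $1.19 million
avg_2010 = 1_170_000  # $1.17 million

pct_change = (avg_2010 - avg_2009) / avg_2009 * 100  # about -1.7%
```

A far cry from the -42% in the PLFTAS table.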

Various factors could have caused the 2010 PLFTAS percentage to be so far off. I suspect that two of these were an over-reliance on statistical averages and the way the averages were calculated.

Since the percentages in the table describe annual changes, they are rates. Rates, you will recall, are comparisons of given numbers to base figures, like miles per gallon, visits per capita, or number of influenza cases per 1,000 adults. The rates in the PLFTAS table indicate how each year’s average library expenditures compare with the prior year’s. The chart title labels the data average total operating expenditures change.

That label is somewhat ambiguous due to use of the terms average and total together. Usually, a number cannot simultaneously be an average and a total. The percentages in the chart are based on a measure named total operating expenditures, which is the sum of staffing, collection, and other expenditures at an individual library outlet. So, total refers to totals provided by the library outlets, not a total calculated by the researchers from data for the entire group of outlets surveyed.

The title’s wording is ambiguous in another, more significant way. To elaborate, let me first abbreviate total operating expenditures as expenditures, making the phrase average expenditures change. Both the chart title and my revised phrase are ambiguous because they can be interpreted in two ways:


1. Average change in expenditures (that is, the average rate of change in expenditures)
2. Change in average expenditures (that is, the rate of change in average expenditures)

Two Interpretations of the Phrase Average Expenditures Change

Tricky, isn’t it? It turns out that the percentages from the PLFTAS table fall under the second interpretation, change in average expenditures. That is, the percentages are rates of change in a set of annual averages. The data in the table are the rates, while the averages appear elsewhere in the PLFTAS reports.3

As explained in my prior post, averages—as well as medians, totals, and proportions—are aggregate measures. Aggregate measures are single numbers that summarize an entire set of data. Thus, we can say more generally that the PLFTAS data are changes in an aggregate measure (an average). Tracking aggregate library measures of one type or another is quite common in library statistics. Here is an example:

Lyons Fig7a Visits
Lyons Fig8a Visits Rate

Annual Library Visit Totals and Rates of Change in the Totals.
 Source: Lyons, R. 2013. Rainy Day Statistics: U.S. Public Libraries and the Great Recession,
Public Library Quarterly, 32:2, 106-107.

The upper chart tracks annual visit totals (aggregate measures) and the lower tracks rates of change in these. The annual rate of change in any measure, including aggregate measures, is calculated as follows:

rate of change = (current year value − prior year value) ÷ prior year value × 100

This is exactly how the PLFTAS researchers calculated their—oops…I almost typed average rates! I mean their rates of change in the averages. They compared how much each year’s average expenditure level changed compared to the prior year.

In the earlier table the alternative interpretation of the phrase average expenditures change is average rate of change in expenditures. This type of average is typically called an average rate, which is short-hand for average rate of change in a given measure. An average rate is an average calculated from multiple rates we already have on hand. For example, we could quickly calculate an average rate for the lower of the two line charts above. The average of the 5 percentages there is 3.0%. For the rates in the PLFTAS table the average is -9.6%. In both instances these averages are 5-year average rates.

However, 5-year rates aren’t very useful to us here because they mask the annual details that interested the PLFTAS researchers. We can, though, devise an average rate that does incorporate detailed annual expenditure data. We begin by calculating an individual rate for each of the 6000-8000+ library outlets that participated in the PLFTAS studies, following the rate-of-change formula above. We do this for each of the 5 years. Then, for each year we calculate an average of the 6000-8000+ rates. Each of the 5 resulting rates is the average rate of change in total operating expenditures for one year.

Obviously, tracking how thousands of individual cases, on average, change each year is one thing, and tracking how a single aggregate measure like an average or total changes is quite another. The chart below shows how these types of rates differ:

Pub Lib Expend Rates

Three Different Types of Library Expenditure Rates.

The data are for 9000+ libraries that participated in the IMLS Public Libraries in the U.S. Survey in any of the 5 years covered. Notice that rates for the aggregate measures (red and green lines) decrease faster over time than the average rate (blue line). Since thousands of individual rates were tabulated into the average rate, this rate is less susceptible to fluctuations due to extreme values reported by a small minority of libraries.

On the other hand, rates for totals and averages are susceptible to extreme values reported by a small minority, mainly because the calculation units are dollars instead of rates (percentages).4 This susceptibility would usually involve extreme values due to significant funding changes at very large libraries. (A 1% budget cut at a $50 million library would equal the entire budget at a $500,000 library, and a 10% cut would equal the entire budget at a $5 million library!) Or fluctuations could be caused simply by data for two or three very large libraries being missing in a given year. For the PLFTAS studies, the likelihood of non-response by large systems would probably be higher than in the IMLS data.
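A toy example (made-up dollar amounts) shows how a single very large library sways the rate of change in the average far more than it sways the average rate:

```python
# 100 hypothetical libraries: one $50 million system plus 99 small libraries.
# The big system cuts spending 10%; every small library grows 1%.
prior = [50_000_000] + [500_000] * 99
current = [45_000_000] + [505_000] * 99

# Rate of change in the average (equivalent to rate of change in the total):
rate_of_average = (sum(current) - sum(prior)) / sum(prior) * 100  # about -4.5%

# Average of the individual libraries' rates of change:
rates = [(c - p) / p * 100 for c, p in zip(current, prior)]
average_rate = sum(rates) / len(rates)                            # about +0.9%
```

One library's cut drags the dollar-based measure negative even though 99 of 100 libraries grew.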

The other striking thing visible in the line graph above is how trends in rates of change in totals and averages (red and green lines) are nearly identical. So, tracking rates in average funding pretty much amounts to tracking total funding. (Makes sense, since an average is calculated directly from the total.)

Now the question becomes, which type of rate is better for understanding library funding changes—rate of change in an average or an average rate? I honestly cannot say for sure. Clearly, each can slant the outcome in certain ways, although that isn’t necessarily a bad thing. It all depends on what features of the data we’re hoping to represent.

Regardless, the lesson is that an unexamined average can be very deceptive. For this reason, it’s always smart to study the distribution (spread) of our data closely. As it happens, staring out of the pages of one PLFTAS report is the perfect data distribution for the -42% mystery discussed here. Beginning with the 2009-2010 edition the PLFTAS studies asked library outlets to report how much funding change they experienced annually. The responses appear in the report grouped into the categories appearing in the left column of this table:


Distribution of Reported Changes in 2010 Library Funding. Adapted from: Bertot et al. 2010.
2009-2010 Public Library Funding & Technology Survey: Survey Findings and Results.

Presuming the data to be accurate, they are strong evidence that the -42% average decrease could not be right. The mere fact that funding for 25% of the library outlets was unchanged lowers the chances that the average decrease would be -42%. Add to this the percentages in the greater-than-0% categories (top 5 rows) and any possibility of such a severe decrease is ruled out.
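A quick bound makes the point concrete. If 25% of outlets reported no change, then for the overall average to reach -42% the remaining 75% would have to average a -56% cut, even before counting any outlets with increases:

```python
# Minimum average cut required of the other outlets for a -42% overall
# average, given that 25% of outlets reported no change at all.
share_unchanged = 0.25
claimed_overall_average = -42.0

required_avg_for_rest = claimed_overall_average / (1 - share_unchanged)  # -56.0
```

Counting the outlets that reported increases would push that required figure even lower, into the realm of the absurd.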

This argument is even more compelling when visualized in traditional statistical graphs (rather than some silly infographic layout). The graphs below show the distributions of data from the table above and corresponding tables in the 2011 and 2012 PLFTAS reports.5 The first graphic is a set of bar charts, one for each year the PLFTAS study collected the data:


Depiction of Distribution of Budget Changes from 2010 – 2012 as Bar Charts.

Perhaps you recognize this graph as a trellis chart (introduced in my prior post) since the 3 charts share a single horizontal axis. Notice in that axis that the categories from the PLFTAS table above are now sorted low-to-high with 0% in the middle. This re-arrangement lets us view the distribution of the data. Because the horizontal axis contains an ordered numeric scale (left-to-right), these bar charts are actually equivalent to histograms, the graphical tools of choice for examining distributions. The area covered by the adjacent bars in a histogram directly reflects the quantity of data falling within the intervals indicated on the horizontal axis.

From the bar charts we see that the distributions for the 3 years are quite similar. Meaning, for one thing, that in 2010 there was no precipitous drop or anything else atypical. We also see that the 0% category contains the most outlets in every year. After that the intervals 0.1 to 2% and 2.1 to 4% account for the most outlets. Even without summing the percentages above the bars we can visually estimate that a majority of outlets fall within the 0% to 4% range. Summing the 2010 percentages for 5 categories 0% or higher we find that 69% of the outlets fall within this range. For 2011 the sum is also 69% and for 2012 it is 73%.

Visually comparing the distributions is easier with the next set of graphs, a line chart and a 3-D area chart. I usually avoid 3-D graphics completely since they distort things so much. (In the horizontal axis, can your eyes follow the 0% gridline beneath the colored slices to the back plane of the display?) Here I reluctantly use a 3-D chart because it does give a nice view of the distributions’ outlines, better than the line chart or separate bar charts. So, I hereby rescind my policy of never using 3-D graphics! But I stick by this guiding principle: Does the graphical technique help us understand the data better?

Budget Change Line
Budget Change 3D

Depictions of Distribution of Budget Changes from 2010 – 2012 in a Line Chart and 3-D Area Chart.

Notice that the horizontal axes in these charts are identical to the horizontal axis in the bar charts. Essentially, the line chart overlays the distributions from the bar charts, confirming how similar these three are. This chart is also useful for comparing specific values within a budget change category or across categories. On the other hand, the closeness of the lines and the numerous data labels interfere with viewing the shapes of the distributions.

Here’s where the 3-D chart comes in. By depicting the distributions as slices the 3-D chart gives a clear perspective on their shapes. It dramatizes (perhaps too much?) the sharp slopes on the negative side of 0% and more gradual slopes on the positive side. Gauging the sizes of the humps extending from 0% to 6%, it appears that the bulk of library outlets had funding increases each year.

So, there you have it. Despite reports to the contrary, the evidence available indicates that no drastic drop in public library funding occurred in 2010. Nor did a miraculous funding recovery restore the average to -4% in 2011. (Roughly, this miracle would have amounted to a 60% increase.) Accuracy-wise, I suppose it’s some consolation that in the end these two alleged events did average out!


1   Desrosières, A. 1998. The Politics of Large Numbers: A History of Statistical Reasoning. Cambridge MA: Harvard University Press. See chapters 2 & 3.
2   Based on IMLS data the 2009 average expenditures were $1.19 million and the 2010 average was $1.17 million, a 1.7% decrease. Note that I calculated these averages directly from the data. Beginning in 2010, IMLS changed the data appearing in its published tables to exclude libraries outside the 50 states and entities not meeting the library definition. So it was impossible to get comparable totals for 2009 and 2010 from those tables.
3   I corresponded with Judy Hoffman, primary author of the study, who explained the calculation methods to me. The figures necessary for arriving at the annual averages appear in the detailed PLFTAS reports available here.
4   This is something akin to political voting. With the average rate each library outlet submits its vote—the outlet’s individual rate of expenditure change. The range of these will be relatively limited, theoretically from -100% to 100%. In practice, however, very few libraries will experience funding increases higher than 40% or decreases more severe than -40%. Even if a few extreme rates occur, these will be counter-balanced by thousands of rates less than 10%. Therefore, a small minority of libraries with extreme rates (high or low) cannot sway the final results very much.
   With the calculation of annual averages each library votes with its expenditure dollars. These have a much wider range—from about $10 thousand to $100 million or more. With aggregate measures like totals, means/averages, and medians, each library's vote is essentially weighted in proportion to its funding dollars. Due to the quantities involved, aggregate library measures are affected much more by changes at a few very large libraries than by changes at a host of small libraries.
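A quick numeric sketch makes this voting effect concrete. The figures below are invented for illustration: four small libraries each gain 5% while one very large library loses 10%.

```python
# Toy illustration (invented figures): how a dollar-weighted aggregate
# differs from the unweighted average of each library's rate of change.
prior = [100_000, 120_000, 90_000, 110_000, 50_000_000]    # last one is a very large library
current = [105_000, 126_000, 94_500, 115_500, 45_000_000]  # small libs +5%, large lib -10%

# Unweighted: each library gets one "vote" regardless of size.
rates = [(c - p) / p for p, c in zip(prior, current)]
avg_rate = sum(rates) / len(rates)      # (4 x 5% + 1 x -10%) / 5 = 2%

# Aggregate: votes are weighted by dollars, so the large library dominates.
agg_rate = (sum(current) - sum(prior)) / sum(prior)

print(f"average of rates: {avg_rate:.1%}")   # 2.0%
print(f"change in total:  {agg_rate:.1%}")   # -9.9%
```

The average of rates says funding went up; the dollar-weighted total says it plunged. Both are arithmetically correct answers to different questions.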
5    The data are from the sequence of annual reports entitled Public Library Funding and Technology Access Survey: Survey and Findings available at the University of Maryland Information Policy & Access Center.
See table 66, p. 61 in the 2009-2010 report; table 53, p. 54 in the 2010-2011 report; and table 57, p. 65 in the 2011-2012 report.

Posted in Advocacy, Library statistics, Measurement

I Think That I Shall Never See…

This post is about a much discussed question: How did the Great Recession affect U.S. public libraries? I’m not really going to answer the question, as that would amount to a lengthy journal article or two. But I am going to suggest a way to approach the question using data from the Institute of Museum and Library Services (IMLS) Public Libraries in the United States Survey. Plus I’ll be demonstrating a handy data visualization tool known as a trellis chart that you might want to consider for your own data analysis tasks. (Here are two example trellis charts in case you’re curious. They are explained further on.)

As for the recession question, in the library world most of the discussion has centered on pronouncements made by advocacy campaigns: Dramatic cuts in funding. Unprecedented increases in demand for services. Libraries between a rock and hard place. Doing more with less. And so forth.

Two things about these pronouncements make them great as soundbites but problematic as actual information. First, the pronouncements are based on the presumption that looking at the forest—or at the big picture, to mix metaphors—tells us what we need to know about the trees. But it does not.

In the chart below you can see that the Great Recession had no general, across-the-board effect on public library funding. Some libraries endured severe funding cuts, others more moderate cuts, others lost little or no ground, while the majority of libraries actually had funding increases in the aftermath of the recession.


 Bars to the left of the zero line reflect libraries with decreases; bars to the right, increases. Change of -10% = 10% decrease. Change of 10% = 10% increase. Click for larger image.

In the chart note that 35% of libraries had 5-year inflation-adjusted cumulative decreases of one size or another. Of these libraries, about half (18% of all libraries) had decreases of 10% or greater and half (17% of all libraries) had decreases less than 10%. 65% of libraries had cumulative increases of some size. Of libraries with increases, two-thirds (43% of all libraries) had increases of 10% or greater and one-third (22% of all libraries) had increases less than 10%. By the way, expenditure data throughout this post are adjusted for inflation because using unadjusted (face-value) figures would understate actual decreases and overstate actual increases.1

The second problem with the advocacy pronouncements as information is their slantedness. Sure, library advocacy is partial by definition. And we promote libraries based on strongly held beliefs about their benefits. So perhaps the sky-is-falling messages about the Great Recession were justified in case they actually turned out to be true. Yet many of these messages were contradicted by the available evidence. Most often the messages involved reporting trends seen only at a minority of libraries as if these applied to the majority of libraries. That is essentially what the pronouncements listed above do.

A typical example of claims that contradict actual evidence appeared in the Online Computer Library Center (OCLC) report Perceptions of Libraries, 2010. Data in that report showed that 69% of Americans did not feel the value of libraries had increased during the recession. Nevertheless, the authors pretended that the 31% minority spoke for all Americans, concluding that:

Millions of Americans, across all age groups, indicated that the value of the library has increased during the recession.2

In our enthusiasm for supporting libraries we should be careful not to be dishonest.

But enough about information accuracy and balance. Let’s move on to some nitty-gritty data exploration! For this I want to look at certain trees in the library forest. The data we’ll be looking at are just for urban and county public library systems in the U.S. Specifically, the 44 libraries with operating expenditures of $30 million or more in 2007.3 The time period analyzed will be 2007 to 2011, that is, from just prior to the onset of the Great Recession to two years past its official end.

Statistically speaking, a forest-perspective can still compete with a tree-perspective even with a small group of subjects like this one. Here is a graph showing a forest-perspective for the 44 libraries:

Median Coll Expend

Median collection expenditures for large U.S. urban libraries.  Click to see larger graph.

You may recall that a statistical median is one of a family of summary (or aggregate) statistics that includes totals, means, ranges, percentages/proportions, standard deviations, and the like. Aggregate statistics are forest statistics. They describe a collective as a whole (forest) but tell us very little about its individual members (trees).

To understand subjects in a group we, of course, have to look at those cases in the data. Trellis charts are ideal for examining individual cases. A trellis chart—also known as a lattice chart, panel chart, or small multiples—is a set of statistical graphs that have been arranged in rows and columns. To save space the graphs’ axes are consolidated in the trellis chart’s margins. Vertical axes appear in the chart’s left margin and the horizontal axes in the bottom or top margin or both.
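In matplotlib, this small-multiples layout takes only a few lines. Here is a minimal sketch under stated assumptions: the library names and values are invented placeholders, and `sharex`/`sharey` is what consolidates the axes in the chart margins.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

years = [2007, 2008, 2009, 2010, 2011]
# Invented illustrative values, not actual IMLS figures.
libraries = {
    "Library A": [22, 21, 19, 18, 18],
    "Library B": [15, 16, 17, 16, 15],
    "Library C": [30, 28, 25, 20, 17],
    "Library D": [12, 13, 13, 14, 15],
    "Library E": [25, 26, 24, 25, 26],
    "Library F": [18, 17, 18, 17, 18],
}

# sharex/sharey consolidates the axis scales in the margins --
# the defining feature of a trellis (small-multiples) layout.
fig, axes = plt.subplots(2, 3, sharex=True, sharey=True, figsize=(9, 5))
for ax, (name, values) in zip(axes.flat, libraries.items()):
    ax.plot(years, values, marker="o")
    ax.set_title(name, fontsize=9)
fig.savefig("trellis.png")
```

Because every panel shares one scale, the eye can compare shapes across graphs without rereading axis labels for each one.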

Take a look at the chart below which presents data from agricultural experiments done in Minnesota in the 1930’s. It happens that the data depicted there are famous because legendary statistician R. A. Fisher published them in his classic 1935 book, The Design of Experiments. Viewing the data in a trellis chart helped AT&T Bell Laboratories statistician William Cleveland discover an error in the original data that went undetected for decades. The story of this discovery both opens and concludes Cleveland’s 1993 book Visualizing Data.4

The core message of Cleveland’s book is one I’ve echoed here and here: Good data visualization practices can help reveal things about data that would otherwise remain hidden.5
Trellis Chart Example

Trellis chart depicting 1930’s agricultural experiments data.
Source:  Click to see larger image.

At the left side of the chart notice that a list of items (these are barley seed varieties) serves as labels for the vertical axes for three graphs in the top row. The list is repeated again as axes labels for the graphs in the second row. On the bottom of the chart repeated numbers (20 to 60) form the horizontal scales for the two graphs in each column. The layout of a trellis chart provides more white space so that the eye can concentrate on the plotted data alone, in this case circles depicting experimental results for 1931 and 1932.

With multiple graphs arranged side by side a trellis chart makes it easy to see how different cases (aka research subjects) compare on a single measure. The chart below shows how this works using library data:

Demo trellis chart

Trellis chart example with library collection expenditures data.  Click for larger image.

The chart presents collection expenditures as a percent of total operating expenditures from 2007 to 2011. The cases are selected libraries as labeled. Notice how easy it is to identify the line shapes—like the humped lines of Atlanta, Baltimore, Cuyahoga Co., and Hawaii. And the bird-shapes of Brooklyn and Hennepin Co. And the District of Columbia’s inverted bird. Trellis charts make it easy to find similarities among individual trends, such as the fairly flat lines for Baltimore Co., Broward Co., Cincinnati, Denver, and King Co. Nevertheless, the charts presented here are more about identifying distinct patterns in single graphs. Each graph tells a unique story about a given library’s changes in annual statistics.

Incidentally, the trellis charts to follow have been adapted slightly to accommodate cases with exceptionally high data values. Instead of appearing in alphabetical order with other libraries in the chart, graphs for cases with high values appear in the far right column as shown in this one-row example:

Trellis Chart Adaptation

Row from trellis chart with high value graph shaded and in red.
 Click for larger image.

Notice that the graph at the right is shaded with its vertical axis clearly labeled in red, whereas the vertical axes for the non-shaded/black-lettered graphs appear at the left margin of the chart row. In this post all shaded/red-lettered graphs have scaling different from the rest of the graphs in the chart. By using extended scaling just for libraries with high values, the scaling for the rest of the libraries can be left intact.6

With that explanation out of the way, let’s look for some stories about these 44 urban and county libraries beginning with total operating expenditures:

Oper Expend Chart #1

Chart #1 interactive version

Oper Expend Chart #2

Chart #2 interactive version

Total Operating Expenditures.  Click charts for larger images. Click text links for interactive charts.

Take a moment to study the variety of patterns among the libraries in these charts. For instance, in chart #1 Brooklyn, Broward County, Cleveland, and Cuyahoga Co. all had expenditure levels that decreased significantly by 2011. Others like Denver, Hawaii, Hennepin Co., Houston, Multnomah Co., and Philadelphia had dips in 2010 (the Great Recession officially ended in June of the prior year) followed by immediate increases in 2011. And others like Boston, Orange Co. CA, San Diego Co., and Tampa had their expenditures peak in 2009 and decrease afterwards.

Now look at collection expenditures in these next two charts. You can see, for instance, that these dropped precipitously over the 4-year span for Cleveland, Los Angeles, Miami, and Queens. For several libraries including Atlanta, Baltimore, and Columbus expenditures dipped in 2010 followed by increases in 2011. Note also other variations like the stair-step upward trend of Hennepin Co., Houston’s bridge-shaped trend, the 2009 expenditure peaks for King Co., Multnomah, San Diego Co., and Seattle, and Chicago’s intriguing sideways S-curve.

Coll Expend Chart #1

Chart #1 interactive version

Coll Expend Chart #2

Chart #2 interactive version

Collection Expenditures.  Click charts for larger images. Click text links for interactive charts.

Again, with trellis charts the main idea is visually scanning the graphs to see what might catch your eye. Watch for unusual or unexpected patterns although mundane patterns might be important also. It all depends on what interests you and the measures being viewed.

Once you spot an interesting case you’ll need to dig a little deeper. The first thing to do is view the underlying data since data values are typically omitted from trellis charts. For instance, I gathered the data seen in the single graph below for New York:

NYPL Coll Expend

Investigating a trend begins with gathering detailed data. Click for larger image.

The example trellis chart presented earlier showed collection expenditures as a percent of total operating expenditures. This same measure is presented in the next charts for all 44 libraries, including links to the interactive charts. Take a look to see if any trends pique your curiosity.

Coll Expend as pct chart #1

Chart #1 interactive version

Coll Expend as pct chart #2

Chart #2 interactive version

Percent Collection Expenditures .  Click charts for larger images. Click text links for interactive charts.

Exploring related measures at the same time can be revealing also. For example, collection expenditure patterns are made clearer by seeing how decreases in these compare to total expenditures. And how collection expenditures as a percentage of total expenditures relate to changes in the other two measures. The charts below make these comparisons possible for the 4 libraries mentioned earlier—Cleveland, Los Angeles, Miami, and Queens:

Multiple collection measures

Chart #1 interactive version

Multiple measures with data values

Chart #2 interactive version

Understanding collection expenditure trends via multiple measures. Chart #1, trends alone. Chart #2, data values visible.  Click charts for larger images. Click text links for interactive charts.

The next step is analyzing the trends and comparing relevant figures, with a few calculations (like percentage change) thrown in. Cleveland’s total expenditures fell continuously from 2007 to 2011, with a 20% cumulative decrease. The library’s collection expenditures decreased at nearly twice that rate (39%). As a percent of total expenditures collection expenditures fell from 20.4% to 15.6% over that period. Still, before and after the recession Cleveland outspent the other three libraries on collections.

From 2007 to 2010 Los Angeles’ total expenditures increased by 6% to $137.5 million, then dropped by 18% to $113.1 million. Over the 4-year span this amounted to a 13% decrease. For that same period Los Angeles’ collection expenditures decreased by 45%.

By 2010 Miami’s total expenditures had steadily increased by 38% to $81.8 million. However, in 2011 these fell to $66.7 million, a 17% drop from the 2010 level but an increase of 13% over the 2007 level. Miami’s collection expenditures decreased by 78% from 2007 to 2011, from $7.4 million to $1.6 million.

Total expenditures for Queens increased by 17% from 2007 to 2009, the year the Great Recession ended. However, by 2011 these expenditures had dropped to just below the 2007 level, a 2% cumulative loss over the 4 years and a 19% loss from the 2009 level. From 2007 to 2011, though, Queens’ collection expenditures declined by 63%, or $7.3 million.
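Calculations like these are simple enough to script. A sketch using the Los Angeles figures quoted above (the 2007 base is backed out from the stated +6% change to 2010):

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old

# Los Angeles (inflation-adjusted, $ millions): +6% from 2007 to 2010
# reached $137.5M, so the 2007 base was about 137.5 / 1.06.
la_2007 = 137.5 / 1.06          # ~129.7
la_2010 = 137.5
la_2011 = 113.1

print(f"2010 to 2011: {pct_change(la_2010, la_2011):.0%}")  # -18%
print(f"2007 to 2011: {pct_change(la_2007, la_2011):.0%}")  # -13%
```

Note that the cumulative change is computed against the starting level, not by adding the yearly percentages, which is why +6% followed by -18% nets out to -13% rather than -12%.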

Talk about data telling stories! Three of the 4 libraries saw the percentage of total expenditures spent on collections drop below 6% in the aftermath of the recession. To investigate these figures further we would need to obtain more information from the libraries.

As you can see, trellis charts are excellent tools for traipsing through a data forest, chart by chart and tree by tree. Obviously this phase takes time, diligence, and curiosity. Just 44 libraries and 5 years’ worth of a half-dozen measures produces a lot of data! But the effort expended can produce quite worthwhile results.

If you’re curious about other interesting trends, the next two sets of charts show visits and circulation for the 44 urban and county public libraries. Looking quickly, I didn’t see much along the lines of unprecedented demand for services. Take a gander yourself and see if any stories emerge. I hope there isn’t bad news hiding there. (Knock on wood.)


Chart #1 interactive version

Visits chart #2

Chart #2 interactive version

Visits.  Click charts for larger images. Click text links for interactive charts.

Circ Chart #1

Chart #1 interactive version

Circ Chart #2

Chart #2 interactive version

Circulation.  Click charts for larger images. Click text links for interactive charts.


1   The 2007 through 2010 expenditure data presented here have been adjusted for inflation. The data have been re-expressed as constant 2011 dollars using the GDP Deflator method specified in IMLS Public Libraries in the United States Survey: Fiscal Year 2010 (p. 45). For example, because the cumulative inflation rate from 2007 to 2011 was 6.7%, if a library’s total expenditures were $30 million in 2007, then for this analysis that figure was adjusted to $32 million.
   Standardizing the dollar values across the 4-year period studied is the only way to get an accurate assessment of actual expenditure changes. A 2% expenditure increase in a year with 2% annual inflation is really no expenditure increase. Conversely, a 2% expenditure decrease in a year with 2% annual inflation is actually a 4% expenditure decrease.
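The adjustment arithmetic can be sketched in a few lines of Python. The 6.7% cumulative rate is the figure cited above; the real-change formula is the standard one, dividing out inflation rather than subtracting it.

```python
def real_change(nominal_rate, inflation_rate):
    """Inflation-adjusted (real) rate of change."""
    return (1 + nominal_rate) / (1 + inflation_rate) - 1

# Re-expressing 2007 dollars in 2011 dollars at 6.7% cumulative inflation:
print(30.0 * 1.067)                       # $30M in 2007 ~ $32M in 2011 dollars

# A 2% raise in a year with 2% inflation is no real increase...
print(f"{real_change(0.02, 0.02):.1%}")   # 0.0%
# ...and a 2% cut in that year is really about a 4% cut.
print(f"{real_change(-0.02, 0.02):.1%}")  # -3.9%
```

(The exact real decrease is 3.9% rather than a flat 4%, because rates compound instead of adding, but at small rates the simple subtraction is a fine approximation.)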
2   Online Computer Library Center, Perceptions of Libraries, 2010: Context and Community, p. 44.
3   In any data analysis where you have to create categories you end up drawing lines somewhere. To define large urban libraries I drew the line at $30 million total operating expenditures. I based this threshold on inflation-adjusted figures as described in footnote #1. So any library with unadjusted total operating expenditures equal to or exceeding $28.2 million in 2007 was included.
4   See anything unusual in the chart? (Hint: Look at the chart labeled Morris.) The complete story about this discovery can be found here. Page down to the heading Barley Yield vs. Variety and Year Given Site. See also William S. Cleveland’s book, Visualizing Data, pp. 4-5, 328-340.
5   Using ordinary graphical tools statistician Howard Wainer discovered a big mistake in data that were 400+ years old! His discovery is described in his 2005 book, Graphic Discovery: A Trout in the Milk and Other Visual Adventures. Wainer uncovered anomalies in data appearing in an article published in 1710 by Queen Anne’s physician, John Arbuthnot. The original data were registered christenings and burials collected in England from 1620 to 1720 at the orders of Oliver Cromwell. See Wainer, H. Graphic Discovery, 2005, pp.1-4.
6   The chart below illustrates how a larger scale range affects the shape of a trend line. The scale in the left graph ranges from $25M to $100M, while the scale of the right graph ranges from $25M to $200M. Because the left graph’s scaling is more spacious (a narrower range), its trend line angles are more accentuated.

Different Axes Example

Click for larger image.

Posted in Advocacy, Data visualization, Library statistics

Roughly Wrong

I decided to move right on to my first 2014 post without delay. The reason is the knot in my stomach that developed while viewing the Webjunction webinar on the University of Washington iSchool Impact Survey. The webinar, held last fall, presented a new survey tool designed for gathering data about how public library patrons make use of library technology and what benefits this use provides them.

Near the end of the webinar a participant asked whether the Impact Survey uses random sampling and whether results can be considered to be statistically representative. The presenter explained that the survey method is not statistically representative since it uses convenience sampling (a topic covered in my recent post). And she confirmed that the data only represent the respondents themselves. And that libraries will have no way of knowing whether the data provide an accurate description of their patrons or community.

Then she announced that this uncertainty and the whole topic of sampling were non-issues, saying, “It really doesn’t matter.” She urged attendees to set aside any worries they had about using data from unrepresentative samples, saying these samples portray “real people doing these real activities and experiencing real outcomes.” And that the samples provide “information you can put into use.”

As well-meaning as the Impact Survey project staff may be, you have to remember their goal is selling their product, which they just happen to have a time-limited introductory offer for. Right now the real issues of data accuracy and responsible use of survey findings are secondary or tertiary to the project team. They could have chosen the ethical high road by proactively discussing the strengths and weaknesses of the Impact Survey. And instructing attendees about appropriate ways to interpret the findings. And encouraging their customers to go the extra mile to augment the incomplete (biased) survey with data from other sources.

But this is not part of their business model. You won’t read about these topics on their website. Nor were they included in the prepared Webjunction presentation last fall. If the issue of sampling bias comes up, their marketing tactic is to “comfort” (the presenter’s word) anyone worried about how trustworthy the survey data are.

The presenter gave two reasons for libraries to trust data from unrepresentative samples: (1) A preeminent expert in the field of program evaluation said they should; and (2) the University of Washington iSchool’s 2010 national study compared its convenience sample of more than 50,000 respondents with a smaller representative sample and found the two samples to be pretty much equivalent.

Let’s see whether these are good reasons. First, the preeminent expert the presenter cited is Harry P. Hatry, a pioneer in the field of program evaluation.1  She used this quote by Hatry: “Better to be roughly right than to be precisely ignorant.”2  To understand Hatry’s statement we must appreciate the context he was writing about. He was referring mainly to federal program managers who opted to not survey their users at all rather than attempt to meet high survey standards promoted by the U.S. Office of Management and Budget. Hatry was talking about the black-and-white choice of high methodological rigor versus doing nothing at all. The only example of lower versus higher precision survey methods he mentioned is mail rather than telephone surveys. Nowhere in the article does he say convenience sampling is justified.

The Impact Survey team would have you believe that Hatry is fine with public agencies opting for convenient and cheap data collection methods without even considering the alternatives. Yet an Urban Institute manual for which Hatry served as an advisor, Surveying Clients About Outcomes, encourages public agencies to first consider surveying their complete roster of clientele. If that is not feasible, public agencies should then use a sampling method that makes sure findings “can be projected reliably to the full client base.”3  The manual does not discuss convenience sampling as an option.

Data accuracy is a big deal to Hatry. He has a chapter in the Handbook of Practical Program Evaluation about using public agency records in evaluation research. There you can read page after page of steps evaluators should follow to assure the accuracy of the data collected. Hatry would never advise public agencies to collect whatever they can, however they can, and use it however they want regardless of how inaccurate or incomplete it is. But that is exactly the advice of the Impact Survey staff when they counsel libraries that sample representativeness doesn’t really matter.

The Impact Survey staff would like libraries to interpret roughly right to mean essentially right. But these are two very different things. When you have information that is roughly right, that information is also roughly wrong. (Statisticians call this situation uncertainty, and the degree of wrongness, error.) The responsibility of a quantitative analyst here is exactly that of an information professional. She must assess how roughly right/wrong the information is. And then communicate this assessment to users of the information so they can account for this in their decision-making. If they do not consider the degree of error in their data, the analyst and decision-makers are replacing Hatry’s precise ignorance with the more insidious ignorance of over-confidence in unvetted information.4
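With a genuine random sample, that degree of wrongness can actually be estimated. A standard back-of-envelope sketch for a survey proportion follows; the 60% result and 400-respondent sample size are invented for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion
    from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# E.g., 60% of 400 randomly sampled patrons report some outcome:
moe = margin_of_error(0.60, 400)
print(f"60% plus or minus {moe:.1%}")   # about +/-4.8%

# For a convenience sample no such formula applies: the selection
# bias is unknown, so the degree of wrongness cannot be quantified.
```

That last comment is the crux. The problem with convenience samples is not merely that they carry error, but that the size of the error is unknowable.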

The second reason the presenter gave for libraries not worrying about convenience samples was an analysis from the 2010 U.S. Impact Public Library Study. She said that study researchers compared their sample of 50,000+ self-selected patrons with another sample they had which they considered to be representative. They found that patterns in the data from the large convenience sample were very similar to those in the small representative sample. She explained, “Once you get enough data you start seeing a convergence between what is thought of as a representative sample…and what happens in a convenience sample.”

So, let me rephrase this. You start by attracting thousands and thousands of self-selected respondents from the population you’re interested in. And you continue getting more and more self-selected respondents added to this. When your total number of respondents gets really large, then the patterns in this giant convenience sample begin to change so that they now match patterns found in a small representative sample drawn from that same population. Therefore, very large convenience samples should be just as good as fairly small representative samples.

Assuming this statistical effect is true, how would this help improve the accuracy of small convenience samples at libraries that sign up for the Impact Survey? Does this statistical effect somehow trickle down to the libraries’ small samples, automatically making them the equivalent of representative samples? I don’t think so. I think that, whatever statistical self-correction occurred in the project’s giant national sample, libraries using this survey tool are still stuck with their small unrepresentative samples.5

While it is certainly intriguing, this convergence idea doesn’t quite jibe with the methodology of the 2010 study. You can read in the study appendix or in my prior post about how the analysis worked in the opposite direction. The researchers took great pains to statistically adjust the data in their convenience sample (web survey) in order to counter its intrinsic slantedness. Using something called propensity scoring they statistically reshaped the giant set of data to align it with the smaller (telephone) sample, which they considered to be representative. All of the findings in the final report were based on these adjusted data. It would be very surprising to learn that they later found propensity scoring to be unnecessary because of some statistical effect that caused the giant sample to self-correct.
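Propensity scoring itself takes real statistical machinery, but the underlying idea—reweighting a slanted sample so it resembles a reference sample—can be sketched with simple cell weighting (post-stratification), a simpler cousin of the technique. All counts and rates below are invented for illustration.

```python
# Toy post-stratification sketch (a simpler relative of propensity scoring):
# reweight a slanted web sample so its age mix matches a reference sample.
# All figures are invented for illustration.

web_sample = {        # age group -> (respondents, share reporting some outcome)
    "under 35": (6000, 0.50),
    "35 plus":  (2000, 0.30),
}
reference_mix = {"under 35": 0.40, "35 plus": 0.60}  # from the representative sample

web_total = sum(n for n, _ in web_sample.values())

# Unadjusted estimate: dominated by whoever chose to respond.
raw = sum(n * rate for n, rate in web_sample.values()) / web_total

# Weighted estimate: each cell counted in proportion to the reference mix.
adjusted = sum(reference_mix[g] * rate for g, (_, rate) in web_sample.items())

print(f"raw: {raw:.0%}, adjusted: {adjusted:.0%}")  # raw: 45%, adjusted: 38%
```

The point of the sketch: the adjustment moves the estimate, sometimes substantially. If the giant sample had truly self-corrected, reweighting it would change nothing.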

As you can see, the Impact Survey staff’s justifications for the use of convenience sampling aren’t convincing. We need to rethink the idea of deploying quick-and-easy survey tools for the sake of library advocacy. As currently conceived, these tools require libraries to sacrifice certain of their fundamental values. Gathering and presenting inaccurate and incomplete data is not something libraries should be involved in.


1   The presenter said Hatry “wrote the book on evaluation.” Hatry is legendary in the field of program evaluation. But the book on evaluation has had numerous co-authors and still does. See Marvin Alkin’s 2013 book, Evaluation Roots.
2   The complete quotation is, “I believe that the operational principle for most programs is that it is better to be roughly right than to be precisely ignorant.” Hatry, H.P. (2002). Performance Measurement: Fashions and Fallacies, Public Performance & Management Review, 25:4, 356.
3   Abravanel, M.D. (2003). Surveying Clients About Outcomes, Urban Institute, Appendix C.
4   Yes, convenience samples produce unvetted information. They share the same weakness that focus groups have. Both data collection methods provide real information from real customers. But you take a big risk assuming these customers speak for the entire target group you hope to reach.
5   As I mentioned in my recent post, there is a known statistical effect that can make a library’s convenience sample perfectly match a representative sample drawn from the population of interest. This effect is known as luck or random chance. Just by the luck of the draw your convenience sample could, indeed, end up exactly matching the data from a random sample. The problem is, without an actual random sample to cross-check this with your library will never know whether this has happened. Nor how lucky the library has been!

Posted in Advocacy, Probability, Research, Statistics

Wasting Time Bigtime

We all know that the main function of libraries is to make information accessible in ways that satisfy user needs. Following Ranganathan’s Fourth Law of Library Science, library instructions guiding users to information must be clear and simple in order to save the user’s time. This is why library signage avoids exotic fonts, splashy decorations, and any embellishments that can muddle the intended message. Library service that wastes the user’s time is bad service.

So I am baffled by how lenient our profession is when it comes to muddled and unclear presentations of quantitative information in the form of data visualizations. We have yet to realize that the sorts of visualizations that are popular nowadays actually waste the user’s time—bigtime!  As appealing as these visualizations may be, from an informational standpoint they violate Ranganathan’s Fourth Law.

Consider the data visualization shown below from the American Library Association’s (ALA) Digital Inclusion Study:

Digital Inclusion Total Dash

ALA Digital Inclusion Study national-level dashboard. Click to access original dashboard.

This visualization was designed to keep state data coordinators (staff at U.S. state libraries) informed. The coordinators were called upon to encourage local public libraries to participate in a survey conducted last fall for this study. The graphic appears on the project website as a tool for monitoring the progress of the survey state by state.

Notice that the visualization is labeled a dashboard, a data display format popularized by the Balanced Scorecard movement. The idea is a graphic containing multiple statistical charts, each one indicating the status of an important dimension of organizational performance. As Stephen Few observed in his 2006 book, Information Dashboard Design, many dashboard software tools are created by computer programmers who know little to nothing about the effective presentation of quantitative information. Letting programmers decide how to display quantitative data is like letting me tailor your coat. The results will tend towards the Frankensteinian. Few’s book provides several scary examples.

Before examining the Digital Inclusion Study dashboard, I’d like to show you a different example: the graphic appearing below, designed by the programmers at Zoomerang and posted on The Center for What Works website. It gives you some idea of the substandard designs that programmers can dream up:1

What Works Chart

Zoomerang chart posted on The Center for What Works website. Click to see larger version.

The problems with this chart are:

  • There are no axis labels explaining what data are being displayed. The data seem to be survey respondents’ self-assessment of areas for improvement based on a pre-defined list in a questionnaire.
  • There is no chart axis indicating scaling. There are no gridlines to assist readers in evaluating bar lengths.
  • Long textual descriptions interlaced between the blue bars interfere with visually evaluating bar lengths.
  • 3D-shading on the blue bars has a visual effect not far from something known as moiré, visual “noise” that makes the eye work harder to separate the visual cues in the chart. The gray troughs to the right of the bars are extra cues the eye must decipher.
  • The quantities at the far right are too far away from the blue bars, requiring extra reader effort. The quantities are located where the maximum chart axis value typically appears. This unorthodox use of the implied chart axis is confusing.
  • The questionnaire items are not sorted in a meaningful order, making comparisons more work.
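The last item on the list is easy to remedy. A minimal sketch of a meaningful sort (the item names and counts here are invented for illustration, not taken from the Zoomerang survey):

```python
# Sketch: sorting chart categories by value so readers can compare at a glance.
# Item names and counts are hypothetical.

items = {
    "Outcome measurement": 42,
    "Board development": 18,
    "Strategic planning": 35,
    "Fundraising": 27,
}

# Descending order puts the most-cited area first -- a meaningful ordering
# for a bar chart, unlike the questionnaire's original item sequence.
ranked = sorted(items.items(), key=lambda kv: kv[1], reverse=True)

for name, count in ranked:
    print(f"{name:<22}{count}")
```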

We should approach data visualizations the way we approach library signage. The visualizations should make the reader’s task quick and easy—something the Zoomerang chart fails at. Here’s a better design:2

What Works Revision

Revision of original (blue) Zoomerang chart posted above. Click to see larger version.

WARNING:  Beware of statistical, graphical, and online survey software. Nine times out of ten the companies that create this software are uninformed about best practices in graphical data presentation. (This applies to a range of vendors, from Microsoft and Adobe to upstart vendors that hawk visualization software for mobile devices.) Indiscriminate use of these software packages can cause you to waste the user’s time.

The Digital Inclusion Study dashboard appearing at the beginning of this post wastes the user’s time. Let’s see how. Note that the dashboard contains three charts—a gauge, line chart, and map of the U.S. The titles for these are imprecise, but probably okay for the study’s purposes (assuming the state data coordinators were trained in use of the screen). Still, for people unfamiliar with the project or users returning to this display a year later, the titles could be worded more clearly. (Is a goal different from a target? How about a survey submission versus a completion?)

Understandability is a definite problem with the map’s color-coding scheme. The significance of the scheme is likely to escape the average user. It uses the red-amber-green traffic signal metaphor seen in the map legend (bottom left). With this metaphor green usually represents acceptable/successful performance, yellow/amber, borderline/questionable performance, and red, unacceptable performance.

Based on the traffic signal metaphor, when a state’s performance is close to, at, or exceeds 100%, the state should appear in some shade of green on the map. But you can see that this is not the case. Instead, the continental United States is colored in a palette ranging from light reddish to bright yellow. Although Oregon, Washington, Nevada, Michigan, and other states approach or exceed 100%, they are coded orangeish-yellow.3  And states like Colorado, North Carolina, and Pennsylvania, which reported 1.5 to 2 times the target rate, appear in bright yellow.

This is all due to the statistical software reserving green for the highest value in the data, namely, Hawaii’s 357% rate. Generally speaking, color in a statistical chart is supposed to contain (encode) information. If the encoding imparts the wrong message, then it detracts from the informativeness of the chart. In other words, it wastes user time—specifically, time spent wondering what the heck the coding means!
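The miscoding can be sketched in a few lines. This is not the dashboard software's actual code, and the rates below are rounded illustrations; the point is the difference between auto-scaling a colormap to the data maximum and anchoring it to the 100% target:

```python
# Sketch: auto-scaling a colormap to the data maximum vs. anchoring it to
# the performance target. Rates are illustrative, not the study's exact data.

def colormap_position(value, scale_max):
    """Return a 0.0-1.0 position on a red-to-green color scale, clamped."""
    return min(max(value / scale_max, 0.0), 1.0)

rates = {"Hawaii": 357, "Colorado": 175, "Oregon": 100, "Alabama": 50}

# Auto-scaled to the data maximum (what the software apparently did):
auto_max = max(rates.values())  # 357
auto = {s: round(colormap_position(v, auto_max), 2) for s, v in rates.items()}

# Anchored to the 100% target (what the traffic-signal metaphor implies):
target = {s: round(colormap_position(v, 100), 2) for s, v in rates.items()}

print(auto)    # Oregon sits at 0.28 -- deep in the red/amber zone
print(target)  # Oregon sits at 1.0 -- green, as the metaphor promises
```

Under auto-scaling, only Hawaii reaches the green end of the scale; every state that merely met its target is pushed down into the amber range.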

Besides misleading color-coding, the shading in the Digital Inclusion Study dashboard map is too subtle to interpret reliably. (The dull haze covering the entire map doesn’t help.) Illinois’ shading seems to match Alabama’s, Michigan’s, and Mississippi’s, but these three differ from Illinois by 13 to 22 points. At the same time, darker-shaded California is only 5 points lower than Illinois.

The Digital Inclusion map’s interactive feature also wastes time. To compare data for two or more states the user must hover her device pointer over each state, one at a time. And then remember each percentage as it is displayed and then disappears.

Below is a well-designed data visualization that clarifies information rather than making it inaccessible. Note that the legend explains the color-coding so that readers can determine which category each state belongs to. And the colors have enough contrast to allow readers to visually assemble the groupings quickly—dark blue, light blue, white, beige, and gold. Listing the state abbreviations and data values on the map makes state-to-state comparisons easy.


A well-designed data visualization. Source: U.S. Bureau of Economic Analysis. Click to see larger version.

This map is definitely a time saver!
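The design principle behind maps like this one is classed (binned) shading: values are grouped into a handful of labeled categories with visibly distinct colors, rather than mapped onto a continuous gradient. A minimal sketch, with made-up class boundaries:

```python
# Sketch: binning values into a few labeled classes, the approach behind
# classed choropleth maps like the BEA example. Class edges are hypothetical.
from bisect import bisect_right

EDGES = [80, 95, 105, 120]          # hypothetical class boundaries (percent)
LABELS = ["well below target", "below target", "near target",
          "above target", "well above target"]

def classify(value):
    """Assign a value to one of len(EDGES) + 1 discrete classes."""
    return LABELS[bisect_right(EDGES, value)]

print(classify(70))   # well below target
print(classify(100))  # near target
print(classify(357))  # well above target
```

With five discrete classes the reader's task shifts from judging tiny shading differences to matching a color against a short legend, which is far quicker.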

Now let’s turn to an…er…engaging feature of the ALA dashboard above—the dial/gauge. To the dismay of Stephen Few and others, dials/gauges are ubiquitous in information dashboards despite the fact that they are poor channels for the transmission of information. Almost always these virtual gadgets obscure information rather than reveal it.4  Meaning, again, that they are time wasters.

The gauge in the dashboard above presents a single piece of data—the number 88. It is astonishing that designers of this virtual gadget have put so many hurdles in the way of users trying to comprehend this single number. I hope this bad design comes from ignorance rather than malice. Anyway, here are the hurdles:

  1. The dial’s scaling is all but invisible. The dial is labeled, but only at the beginning (zero) and end (100) of the scale, and in a tiny font. To determine values for the rest of the scale the user must ignore the prominent white lines in favor of the obscured black lines (both types of lines are unlabelled). Then she has to study the spacing to determine that the black lines mark the 25, 50, and 75 points on the dial. The white lines turn out to be superfluous.
  2. The needle is impossible to read. The green portion of the banding causes the red tick-marks to be nearly invisible. The only way to tell exactly where the needle is pointing is by referring to the ‘88’ printed on the dial, a requirement that renders the needle useless.
  3. The uninitiated user cannot tell what is being measured. The text at the center of the image is masked at both edges because it has been squeezed into too small a space. And the gauge’s title is too vague to tell us much. I am guessing that the dial measures completed survey questionnaires as a percentage of some target quantity set for the U.S. public libraries that were polled. (And, honestly, I find it irritating that the 88 is not followed by a percent symbol.)
  4. The time period for the data depicted by the gauge is unspecified. It doesn’t help that the line chart at the right contains no scale values on the horizontal axis. Or, technically, the axis has one scale value—the entirety of 2013. (Who ever heard of a measurement scale with one point on it?) The dial and line chart probably report questionnaires submitted to date. So it would be especially informative for the programmers to have included the date on the display.
  5. Although the red-amber-green banding seems to be harmless decoration, it actually can lead the reader to false conclusions. Early on in the Digital Inclusion Study survey period, submissions at a rate of, say, 30%, would be coded ‘unacceptable’ even though the rate might be quite acceptable. The same misclassification can occur in the amber region of the dial. Perhaps users should have been advised to ignore the color-coding until the conclusion of the survey period. (See also the discussion of this scheme earlier in this post.)

The graphic below reveals a serious problem with these particular gauges. The graphic is from a second dashboard visible on the Digital Inclusion Study website, one that appears when the user selects any given U.S. state (say, Alaska) from the dashboard shown earlier:

Digital Inclusion Alaska Chart

ALA Digital Inclusion Study state-level dashboard. Click to see larger version.

Notice that this dashboard contains five dials—one for the total submission rate for Alaska (overall) and one for each of four location categories (city, suburban, town, and rural). While the scaling in all five dials spans from 0% to 100%, two of the dials—city and town—depict quantities far in excess of 100%. I’ll skip the questions of how and why the survey submission rate could be so high, as I am uninformed about the logistics of the survey. But you can see that, regardless of the actual data, the needles in these two gauges extend only a smidgen beyond the 100% mark.

Turns out these imitation gauges don’t bother to display values outside the range of the set scaling, which, if you think about it, is tantamount to withholding information.5  Users hastily scanning just the needle positions (real-life instrument dials are designed for quick glances) will get a completely false impression of the data. Obviously, the gauges are unsatisfactory for the job of displaying this dataset correctly.
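The clamping behavior can be sketched in a few lines. The rates below are hypothetical stand-ins (the post doesn't give Alaska's exact category figures); the point is what the needle shows versus what the data say:

```python
# Sketch: a gauge that silently clamps its needle to the scale maximum,
# compared with a plain table that reports the true value. Rates are
# hypothetical, not the study's actual Alaska figures.

def needle_position(value, scale_min=0, scale_max=100):
    """Where the gauge draws its needle: out-of-range values are clamped."""
    return max(scale_min, min(value, scale_max))

alaska = {"overall": 88, "city": 150, "town": 180, "rural": 60}

needles = {k: needle_position(v) for k, v in alaska.items()}
print(needles)  # city and town both show 100 -- the excess simply vanishes

# A single-row table loses nothing:
print(" | ".join(f"{k}: {v}%" for k, v in alaska.items()))
```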

So now the question becomes, why use these gauges at all? Why not just present the data in a single-row table? This is all the dials are doing anyway, albeit with assorted visual aberrations. Besides, there are other graphical formats capable of displaying these data intelligently. (I won’t trouble you with the details of these alternatives.)

One point about the line chart in the Alaska (state-level) dashboard. Well, two points, actually. First, the weekly survey submission counts should be listed near the blue plotted line—again, to save the user’s time. Second, the horizontal axis is mislabeled. Or, technically, untitled. The tiny blue square and label are actually the chart legend, which has been mislocated. As it is, its location suggests that both chart axes measure survey completions, which makes no sense. The legend pertains only to the vertical axis, not to the horizontal. The horizontal axis represents the survey period measured in weeks. So perhaps the label “Weeks” would work there.

In charts depicting a single type of data (i.e., a single plotted line) there is no need for a color-coded legend at all. This is the sort of detail that software programmers will know nothing about.

Finally, a brief word about key information the dashboard doesn’t show—the performance thresholds (targets) that states had to meet to earn an acceptable rating. Wouldn’t it be nice to know what these are? They might provide some insight into the wide variation in states’ overall submission rates, which ranged from 12% to 357%. And the curiously high levels seen among the location categories. Plus, including these targets would have required the dashboard designers to select a more effective visualization format instead of the whimsical gauges.

Bottom line, the Digital Inclusion Study dashboard requires a lot of user time to obtain a little information, some of which is just plain incorrect. Maybe this is no big deal to project participants who have adjusted to the visualization’s defects in order to extract what they need. Or maybe they just ignore it. (I’m still confused about the purpose of the U.S. map.)

But this is a big deal in another way. It’s not a good thing when nationally visible library projects model such unsatisfactory methods for presenting information. Use of canned visualizations from these software packages is causing our profession to set the bar too low. And libraries mimicking these methods in their own local projects will be unaware of the methods’ shortcomings. They might even assume that Ranganathan would wholeheartedly approve!


1   Convoluted designs by computer programmers are not limited to data visualizations. Alan Cooper, the inventor of Visual Basic, describes how widespread this problem is in his book, The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity.
2   Any chart with closely spaced bars can be subject to moiré, especially when bold colors are used. Pastel shades, like the tan in this chart, help minimize this.
3   Delaware also falls into this category and illustrates the distortion intrinsic to maps used to display non-spatial measures. (Shale deposit areas by state is a spatial measure; prevalence of obesity by state is a non-spatial measure.) Large states will be visually over-emphasized while tiny states like Delaware and Rhode Island struggle to be seen at all.
4   My favorite example, viewable in Stephen Few’s blog, is how graphic artists add extra realism as swatches of glare on the dials’ transparent covers. These artists don’t think twice about hiding information for the sake of a more believable image.
5   This is extremely bad form—probably misfeasance—on the part of the software companies. More responsible software companies, like SAS and Tableau Software, are careful to warn chart designers when data extend beyond the scaling that chart designers define.
