May 03, 2017
Comparing Apples and Oranges, Floors and Ceilings in Digital Scholarship
This post by the WMQ’s Josh Piker originally appeared on The Scholarly Kitchen, a blog about “What’s Hot and Cooking in Scholarly Publishing.”
by Joshua Piker
As part of an “Exchange” on “Reviewing Digital History” in the February 2016 American Historical Review (AHR), Harvard historian Vincent Brown recounted a Reddit-inspired, Ultimate Fighting Championship (UFC)-fueled (really!) deluge of over 33,000 visitors to his website, Slave Revolt in Jamaica, 1760-1761: A Cartographic Narrative, on a single day in 2015. This is an evocative example of the spontaneous possibilities the internet offers for disseminating world-class historical scholarship. Brown also discussed the more modest, but still quite impressive, numbers of viewers in the ten months prior: “more than 6,400 users in 121 countries and 1,870 cities viewed the site, which continued to record between 1,000 and 1,500 sessions per month.”
To establish that his site’s regular visitor numbers really are impressive, Brown offered as a point of comparison the view and download statistics for articles in the William and Mary Quarterly (WMQ), “which would have been an ideal venue for a print essay about the Jamaican slave revolt.” Brown notes that the most viewed and downloaded article was a 1983 essay by Daniel K. Richter, “War and Culture: The Iroquois Experience,” and that it was “accessed 4,817 times between September 5, 2012, and September 4, 2015.” In short, three years’ worth of JSTOR access for the most popular WMQ article added up to only about 75% of the viewer numbers racked up by Brown’s site in the ten months before his deluge day.
That does seem impressive. And I very much appreciate Brown’s shout-out to the Quarterly as a place he might well have wanted to bring the project had he set out to write a print-based article. I would have liked to have published that piece, and that’s especially true because, thanks to our OI Reader app, the Quarterly article might well have been able to incorporate much of (or even more, or different types of) the exploratory and interactive material that makes the website so valuable. But, water under the bridge and all that. Brown has given us a wonderful historical resource, one that has been viewed by tens of thousands of people. That’s more than would have seen the Quarterly version….
Or is it?
That’s not a rhetorical question. I don’t know the answer, but I do know that comparing website views to article accesses won’t get us the information we need to find it. And to be clear, that’s not a criticism of Brown. He knows that viewer numbers are just one way to measure scholarly reach and impact. As he put it, “While accessibility is a great egalitarian virtue, it is not always a satisfying end in itself”; his project, he argued, “is most valuable as part of a larger social and educational ecosystem.” Moreover, the article accesses v. website views comparison that he draws in his essay is a very common one, so common, in fact, that the editors of the AHR who were organizing a conversation about digital history let it pass without apparent quibble or qualification. I probably would’ve too. This is just how we talk about these two ways of communicating historical scholarship. Website? Views! Article? Accesses!
It’s worth taking the time, however, to consider these terms and the logic behind their use a bit more carefully. The Richter article that Brown references for comparison with his website is actually an excellent vehicle for showing why we should think twice before assuming that website views v. article accesses is an apples-to-apples comparison.
Let’s start with the basics. Richter’s article was published in 1983. Although the WMQ was one of the first 10 journals on JSTOR, back content was not added until 2009. So, according to JSTOR, from January 1, 2009 to December 15, 2016, the article was accessed (either downloaded or read online) 32,989 times. That’s a lot, but it is only about as many times as Brown’s site was viewed in that one spectacular day.
As Brown notes, however, almost 90% of those single-day viewers never got beyond the site’s first page (p. 176), and even before the UFC crowd showed up, “the average duration of viewing sessions for the Slave Revolt website was just under two minutes” (p. 185), although the vagaries of how visitors to the site can be tracked mean the figures for duration of visit are necessarily suggestive, not definitive. One way or another, though, it seems likely the viewers of Brown’s website and the people who accessed Richter’s article interacted quite differently with the material. The vast majority of the people who downloaded Richter’s article all but certainly made it to p. 2 — after all, if you have a subscription (either personal or institutional), you can read the first page on JSTOR before you have to decide whether to download the article — and spent more than two minutes with the essay. And if you don’t have a subscription and come to the article on JSTOR’s site, you can read the first page before you decide whether to “Read Online.” People in that situation who read the first page would be counted as “viewers” on Brown’s site, but they would not have “accessed” the article on JSTOR.
But the problems with the article access v. website viewer numbers only begin with what we might think of as “depth of engagement.” Yes, those numbers likely measure different things, but it’s important to recognize, as well, that the number for Brown’s site is the number. That’s it. It’s a big — and growing bigger — number, but the number of viewers is pretty close to the number of people who have engaged directly with his material. Perhaps a few people clustered around one computer screen were counted as only one viewer? More likely, some teachers put the site up on a screen in their classrooms, and each class was counted as one viewer. But the number of people we’re talking about in those situations is likely a rounding error in re: the baseline number of viewers. Or, put more simply, Brown has a pretty good idea of how many people have viewed his site.
For Richter’s article, by contrast, the figure of 32,989 accesses from JSTOR is just the beginning. After all, the article first appeared in 1983 in hardcopy, pre-JSTOR and online publication. In 1983, the Quarterly published 4,249 copies of the issue that included Richter’s article. Some of those copies no doubt went into the circular file, but we can assume that many people subscribed to the journal because they wanted to actually read the darn thing. Moreover, 1,547 copies went to libraries and archives. How many people (students, faculty, researchers) consulted the article in those contexts? In some places, possibly only a few, but even if the answer is one reader every few years, the numbers add up in a hurry. And at places that train graduate students, Richter’s status as a leader in the booming fields of early American and Native American history means that the numbers likely add up even faster than that.
Moreover, the number of copies that we published is actually only the tip of the iceberg for hardcopies of Richter’s article. The article has been reprinted in at least eight different essay collections including subsequent editions and Richter’s own 2013 volume, Trade, Land, Power: The Struggle for Eastern North America. Those volumes include both the go-to essay collection for teachers of American colonial history courses and two of the most popular essay collections in Native American history. How many copies are we talking about here? Thousands? Tens of thousands? We also shouldn’t forget the used-book market. There are fifty copies of one of those essay collections currently available for sale on Amazon as I write this. And according to WorldCat, the volumes containing Richter’s essay are owned by 2,766 libraries worldwide.
Plus Google Books. There you can read the entirety of Richter’s article (as published in two volumes) for free. You can also “view” most of the article in another volume on that site, although the absence of a few key pages may impact your viewing pleasure, as they say. Of course, I have not the slightest clue how many people have looked at any of these sources. Nor do I have any way of knowing how many people have downloaded the version of the article that Richter published as chapter 4 in Trade, Land, Power and that is available for download via Project Muse. That latter number would go to the University of Pennsylvania Press, which published the book, not to the WMQ.
Of course, not everyone who lands on the Google Books version of the essay or buys or checks out one of the hardcopies of the books equals a reader of Richter’s article. But how different is a buyer of the book who flips through the pages from the viewer of the Slave Revolt website who spends a minute or less on the site’s first page? A bit different, I guess, but I wouldn’t give great odds on either of them passing a snap quiz about Richter’s or Brown’s larger arguments.
Speaking of quizzes, most of the books that reprint Richter’s article are intended for classroom use. And while the popularity of those texts suggests that the number of students who have read (skimmed? browsed? turned the pages of?) Richter’s article is large, we really can’t even begin to guess how large – and not just because I don’t have access to either the databases of the publishers and the used-book sellers or the circulation records for the lending libraries.
The other problem with figuring out classroom use of Richter’s article goes back to the issue of downloads. The article has been downloaded from JSTOR 9,859 times. Each of those downloads is a PDF. And once a PDF is downloaded, it can circulate via email attachments and posting on websites and the like. Professors are getting better about using stable links instead of providing PDFs to their students, but too many of us still opt for the ease and convenience of simply emailing a PDF to the class or posting a PDF on the class website.
Can we track and count those PDFs? Not on your life. If you download a PDF of Richter’s article and email it to your 200-student class, that’s one download. If each of those students loves the article so much that they send it to everyone in their contact list, we’re still at one download. If — to move us back toward the case of Brown’s website — one of those contacts is the newly crowned UFC champion who posts the PDF on his personal webpage where it’s read by millions of adoring-but-bloodthirsty fans, we’re still at one download.
PDFs, in other words, are like currency – they last for some time and they can circulate widely. A minute’s worth of research reveals there are about 6 billion $20 US bills out there, but that’s not the same thing as knowing how many transactions used a $20 bill in the last few years. At least as many PDFs of Richter’s article exist as there have been JSTOR downloads, and we know some circulate. They wind up on course websites, syllabi, and libraries’ e-reserve lists. Faculty send them to local copy services to make course packs (over the years, only forty-eight course instructors have officially requested permission to do so for Richter’s article), and they pass them along to their colleagues and students. And so on. PDFs are great, until you want to know how many people have read a given article.
The point of all of this isn’t that Richter’s article has been “read” (whatever that might mean) more than Brown’s website. There simply is no way of knowing if that’s the case because of the very different publishing ecosystems in which these two works are situated. What I can say with something close to certainty, though, is that comparing access numbers for an article to viewer numbers for a website is to compare the readership floor of the former to the readership ceiling of the latter.
And not to belabor the point — too late? — but there’s also the whole time-depth aspect to this accessed article v. viewed website comparison. Websites are ephemeral creatures. We live in a digital world in which even NEH-sponsored digital projects are assumed to “sunset.” What “enduring” means for digital resources is very much an open question. Richter’s article, by contrast, is thirty-four years old and going strong. Rather than fading away, it’s appearing in an ever-broader array of venues and a proliferating number of media.
One more analogy, then: a website like Brown’s is a sprinter; an article like Richter’s is a decathlete.
The sprinter is going to rack up impressive numbers quickly. Decathletes sprint, of course, but not like the true sprinter, and so a decathlete’s sprint-based numbers will not be as eye-catching. But the decathlete doesn’t just sprint; there are other events, other venues in which to rack up numbers. And some of those other events are long-term ones, endurance races in which numbers pile up slowly but steadily for some time. One sort of athlete isn’t better than the other, just as Richter’s article isn’t better than Brown’s website, or vice versa. But if we’re going to compare the two, then we should be careful not to look at one ‘event’ and decide that we’ve got the complete picture.