Collecting experiential data: Blogging self-study

This is the last part, part 7, of a multi-part post on the subject of blogs as scholarship. It discusses a blogging self-study that I undertook using CommentPress. The previous post discussed underlying assumptions and theoretical frameworks within which one might view the phenomenon.

The literature about blogs as scholarship suggests many promising areas for investigation, but as a preliminary matter, I experimented with academic blogging myself. The use of a researcher’s own experience and skills, or “data in the head” drawn from personal, research, and literature-reading experience, can lead to much better theory building, the ultimate goal of qualitative research (referring to Strauss 1990) (Alston & Bowles, 2003). The University of Texas at Austin’s Libraries installed a WordPress blog on the Texas Digital Library Website for me, using the CommentPress theme to encourage commenting. I serialized a first draft of an academic paper about the effects of mass digitization projects on copyright law and policy (Mass Digitization), posting one section each week for six weeks. I announced the experiment on other blogs I host, on listservs, and at a conference I attended during the experiment, and asked friends to announce it on their blogs. I used Google Analytics to monitor traffic to the site and correlate it with the announcements and weekly postings, and kept a journal of my thoughts about and reactions to the experience throughout the six weeks. I actively reviewed the literature reported above while I conducted the experiment, and used the themes emerging from the literature, together with my experience of blogging Mass Digitization, to begin constructing a survey, to be administered later, that will help broaden my understanding of how bloggers experience blogging.

Blogging Mass Digitization in light of the revelations in the existing literature enabled me to examine aspects of the experience to which I had not attended in previous blogging. My experience suggests that bloggers idealize blogging in some respects. I doubt that all of their observations and opinions will hold up to systematic observation and scrutiny, though some are likely to be supported.

Nevertheless, the academic power hierarchy and the publishing industry’s hegemony will make the future for academic blogging difficult, despite its benefits. Because of these dynamics, blogs present interesting objects of study for those who wish to better understand the politics of academe. One can say the same about every computer-mediated, networked form of communication today – they all challenge existing power relationships and will eventually reconfigure institutions and scholarly publishing. No aspect of the community is likely to emerge unchanged.


Installing CommentPress

Installing CommentPress is “tricky,” according to the U.T. Libraries’ Dustin Slater, who installed it for me. He completed the installation over about a two-week period. He reports that the code contained many deviations from standard procedure and best practices, complicating the adaptations needed to situate it within the Texas Digital Library environment. The blog also needed significant debugging over an additional period of two weeks once Slater invited me to begin testing it. I was able to launch the blog on October 18, 2007 with my initial announcement on Collectanea. This was Slater’s first blog installation, so he believes that, based on this experience, it will be easier to bring up blogs in the future.

Drafting in public

I chose a subject for this experimental public drafting about which I have formed opinions over a long period. Initial drafting did not require additional research. I had been collecting the evidence to support my arguments for most of 2007. I drafted an overview first, setting out a series of arguments that I would later expand. Each week I posted a new section, building on earlier sections supported by evidence from the Web, to which I linked on a “Project resources” page. I tried to time announcements to new posts, when possible. The Resources page acted as a repository for links to all the Web-based references I will eventually include in the final draft. In this early stage of drafting, however, I did not link directly to the resources from the text; rather, I referred readers to the Resources page to explore the resources in context groupings (orphan works, mass digitization projects, legal resources, retreats from DRM, etc.). I thought that a lot of textual linking could distract from the theme of each paragraph, and might reduce time on the blog and the likelihood of commenting.

Monitoring data

Google Analytics provides a wealth of data about blog visitors and the Libraries already had an account for the Texas Digital Library, so it was easy to add the Mass Digitization blog to the account. I spent about an hour each week exploring the tool and the data to become familiar with all the facts that I could bring to bear on the story I want to tell about my experience. The data is permanently retained on Google’s servers, but can also be saved in various file formats to illustrate points I may wish to make.


I wrote in a journal each week, and more often when noteworthy events occurred, for example, when I received two spam comments. I wrote from a personal perspective, commenting mainly on how it felt to be drafting publicly, how this experience compared to other drafting experiences, how I was affected by seeing where readers came from, how long they stayed, what they read and what they didn’t read, and how colleagues reacted when I talked with them about the experiment.


Without data, my impression of blogging would have been very inaccurate because it would have been based on the presence, or rather, absence of comments. Few or no comments would have suggested few or no visitors, but that was not the case. The comment rate on my blog for the 40 days ending November 26, 2007 was 0.25%, but an estimated 250 visitors[1] viewed pages 790 times. The blog welcome page, containing no substantive information, garnered the most views (324), followed by the overview page (156). Each successive section was viewed fewer times than the one before, partly because each page was present on the blog for a shorter period as the experiment wound down. For example, visitors viewed the conclusion page only 4 times, but I posted it on the day before the project ended. They viewed the resource page 51 times; the comment pages, 46 times. Small numbers of visitors viewed the category pages (fair use, irrelevance of law, business models, creative commons and orphan works).

Once they arrived on the blog, visitors tended to stay for a while, spending, on average, 6 to 11 minutes on the site, depending on the pages they viewed and whether they were new or returning visitors.

Visitors came from 93 cities, including Austin, Washington, New York, Albany, Ann Arbor, Chicago and London, and from nine countries, including the US, Canada, Australia, Puerto Rico, Italy, Norway, Switzerland and the UK. They were referred by or came from nine sources (among them, of course, the ones from which I announced the experiment), including the Copyright Crash Course, Collectanea, Lifelong learning (my personal blog), Google, Bloglines (an RSS aggregator), The Googlization of Everything (where I commented on one of Siva's posts), Moodle, and the Texas Digital Library. New visitors slightly outnumbered returning visitors, but returning visitors spent four to five times longer on the pages they visited, and viewed nearly twice as many pages per visit.

There were clear correlations between the announcements and visits. It is probably safe to conclude that some, although not all, announcements increased traffic to the site. For example, I made 14 announcements over the course of the experiment and visitor tracking recorded above-average numbers of pageviews on about half of those days or the day immediately following. Graph 1 shows the overview of traffic to the blog annotated with the announcements, numerical totals for the peak days and the average number of pageviews per day, for comparison. Once published, the sections received most of their pageviews within 2 weeks. After that, traffic fell off, but did not disappear entirely. For example, graphs 2 and 3 show the traffic pattern for post numbers two (All quiet on the legislative front) and four (Ok, ok, DRM and contracts were a big mistake; now what?).
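The comparison described above, checking whether pageviews rose above the daily average on an announcement day or the day after, can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure I followed in Google Analytics: the function name `announcements_with_bump` is made up for the example, and the daily counts and announcement dates are hypothetical numbers, not the blog's actual data.

```python
from datetime import date, timedelta
from statistics import mean


def announcements_with_bump(daily_pageviews, announcement_days):
    """Return the announcement days that were followed, on the same day or
    the next day, by pageviews above the average for the whole period."""
    average = mean(daily_pageviews.values())
    bumped = []
    for day in announcement_days:
        window = (day, day + timedelta(days=1))
        # Days missing from the data are treated as having zero pageviews.
        if any(daily_pageviews.get(d, 0) > average for d in window):
            bumped.append(day)
    return bumped


# Hypothetical numbers for illustration only; NOT the blog's actual data.
start = date(2007, 10, 18)
counts = [30, 12, 8, 10, 40, 9, 7]
daily = {start + timedelta(days=i): n for i, n in enumerate(counts)}
announced = [start, start + timedelta(days=2), start + timedelta(days=3)]

bumped = announcements_with_bump(daily, announced)
print(f"{len(bumped)} of {len(announced)} announcements coincided with "
      "or preceded an above-average day")
```

A check like this shows only that traffic was above average near an announcement. It cannot show that the announcement caused the traffic, which is why I claim no more than that some announcements probably increased visits.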

Interpretation and discussion

The blogging experience may be very different for those just starting out from the experience of those who have been blogging for years (Hoffman, 2007) or who were well-known before they began to blog (Volokh, 2006). This particular experiment only lasted 6 weeks, so the Mass Digitization blog was not likely to become well-known. Bloggers who report amazing numbers (“stunning” numbers in one case referenced above) (2007; Rodrik, 2007, pars. 4-5; 2006, p. 1) can leave the impression that everyone will experience that kind of readership. Clearly, I did not.

More surprising, however, is the absence of substantive comments. The presence of modest numbers of comments about many blog posts attests to the fact that comments are or can be a significant feature of a blog, but apparently, only at much higher overall numbers of visitors, or only within certain communities built around certain bloggers. At least I am not alone in being alone: over the summer, the University of Michigan’s Scholarly Publishing Office collaborated with the authors of The Ithaka Report (Brown, Griffiths, & Rascoff, 2007) and the Institute for the Future of the Book to publish the Report in CommentPress, about which Maria Bonn, Director, Scholarly Publishing Office, University of Michigan Library, commented recently (p. 1):

We are watching this experiment with interest. In the first three weeks that the Ithaka report was available in CommentPress, this version of the report was viewed thousands of times. We received dozens and dozens of e-mails and verbal reports from members of the academic community noting their enthusiasm for the projects. And yet, the discussion the report invites has been relatively quiet. We look forward to seeing if the level of discussion remains constant or increases and to performing some analysis to see what this experiment can teach us about the appropriate alignment of content, user communities, and technology.

Beginning academic bloggers may not be the only ones who need to exercise patience and perseverance if they want to join the ranks of those who really are able to build community through conversation and commentary around a blog. These experiences suggest that the time it takes to build community could be a good subject for investigation through longitudinal study, and should be included among the questions I ask in my survey.

If it is generally the case that providing feedback for new ideas is not one of the blog’s strong points for a novice academic blogger, what are the advantages of blogging, if any? The same data that tends to dash the hope of commentary exalts public distribution. Two hundred fifty people have looked at one or more of the pages that constitute this early draft of Mass Digitization. Even though I plan to polish it, run it through the site a second time with links and citations, and submit it for publication (open access, certainly), it may be that the style of the blog post, short, informal and timely, attracted some viewers who would never have looked at the article in its final polished form. If so, these readers constitute a new audience. But does the blogger lose the legal scholars in the process, those who could be counted on, at least, to note that the article was published (they’ll see it in some index or table of contents service if it’s in their area of interest) and perhaps even read it in its polished form? Might those readers also peruse the draft in blog style? If so, it might be reasonable to assume a net gain of readers. And that’s just from the perspective of the value to the scholar (since public dissemination of one’s work is one of the academic’s responsibilities). If we assume that the non-legal reader would not likely read the same argument in formal, lengthy law-review style, those readers probably have gained some benefit as well.

But there may be other benefits to be considered: I simply had a very good time blogging the Mass Digitization article. On November 20, I commented in my journal:

Reading an interview with Cory Doctorow just now, in which he talks about how O'Reilly tolerates piracy b/c the most pirated of his works is also the most profitable, and seeing how that quote will fit in with what I'm writing at Mass digitization, and how I can link to the interview in the writing resources page --- wow, this is just so cool. I love this writing/reading/remixing from the real world. It is so much fun! The relaxed pace of one section each week for a first draft – very doable. This process is much more creative and involves so much more fortuitous discovery than the way I would have written an article before, based entirely on law review articles, cases and statutes. Those will still be part of the story, but only as background. I don't plan to use them to support my argument really. Maybe a little, but quotes and real stories from the public that illustrate the points seem to me a lot more persuasive to the audience I want to reach, not just lawyers. It is a public audience that probably includes many more non-lawyers than lawyers.

The subjective experience of pleasure in crafting an argument “in the flow” of the network may be a more important key to the future of blogs and other Web 2.0 forms of interactive communication than it might seem. All things being equal, why wouldn’t scholars prefer to perform their duties in a manner that they enjoy if they have that choice? Increasingly, they will have more choices about how to publish their research findings. Thus, the most interesting questions are those about who determines what counts as scholarship, who will win and lose if traditional forms of publishing diminish in importance, and how the scholarly community will reform itself around more open, shorter and disintermediated documentary forms. Blogging scholars themselves are taking steps to identify works of and about peer-reviewed scholarship reported on blogs. The Website BPR3 (Bloggers for Peer-reviewed Research Reporting) offers both a way for bloggers to identify their serious posts and a way for readers to find them. The BPR3 icon, affixed to posts about peer-reviewed research, creates an aggregation system, like that of the Creative Commons, that links all the posts bearing the icon back to the BPR3 Website and to search capabilities that are soon to be added. What will they think of next?


The literature about blogs as scholarship reports mostly people’s opinions about the benefits and disadvantages of legal scholars’ blogging. Research based on systematic empirical observation of any kind is sparse. Nevertheless, clear themes emerge suggesting promising areas for investigation, including:

  • studies designed to systematically document qualitative as well as quantitative aspects of the scholarly blogosphere,
  • studies about attitudes towards the bloggers’ assertions among bloggers, their non-blogging colleagues and institutional administrators,
  • studies that look at the larger environment of which blogs are a part, that investigate the transition occurring within the academic publishing milieu and its effect on the use and acceptance of shorter, open, disintermediated forms of scholarly communication, such as blogs, and
  • studies of the eventual reformation of the scholarly communication network in light of the changes resulting from its full integration into networked environments.

The state of research on these subjects invites a wide variety of studies that can deepen the discussion and inform the analysis of what effect scholars’ blogging is likely to have on academe, and over what time horizon. Because it is so preliminary, not yet addressing any major concern in the field of information studies, the literature sets the stage for research from a variety of perspectives. To extend my own observations, I will identify a group of legal academic bloggers in the spring to survey regarding the themes discussed herein, the observations from my self-study and hypotheses about the acceptance of blogging as scholarship.


[1] It is not possible to determine how many unique visitors visited just the Mass Digitization pages. I applied the site ratio of visitors to pageviews (.32) to the number of pageviews for the Mass Digitization pages, which could be determined, to arrive at 253.
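The arithmetic behind this estimate is simple enough to write out. The sketch below just multiplies the two figures given in this note; the function and constant names are my own labels for the example, and the result is an estimate that assumes Mass Digitization readers viewed pages at the same rate as visitors to the site as a whole.

```python
# Figures reported in note 1 and in the text.
SITE_VISITORS_PER_PAGEVIEW = 0.32   # site-wide ratio of visitors to pageviews
MASS_DIGITIZATION_PAGEVIEWS = 790   # pageviews for the Mass Digitization pages


def estimate_visitors(pageviews: int, visitors_per_pageview: float) -> int:
    """Estimate visitors by applying a site-wide visitors/pageviews ratio."""
    return round(pageviews * visitors_per_pageview)


estimated = estimate_visitors(MASS_DIGITIZATION_PAGEVIEWS,
                              SITE_VISITORS_PER_PAGEVIEW)
print(estimated)  # 790 * 0.32 = 252.8, which rounds to 253
```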

