December 8, 2017

For more than four years, I've been working on a digital legal history project called 9CHRIS -- the 9th Circuit Historical Records Index System. The potential for historical analysis that comes from having some 40,000 briefs and transcripts available is what keeps me perpetually interested in continuing to improve this project. Each time I open one of the documents, I think of the potential that such rich detail can offer to historians and others studying the West, especially the relationship between western places, western residents, and the law.

Lately, I have realized that a simpler factor also keeps me improving 9CHRIS: the variety of work required keeps upgrades from becoming dull. From the tedious escapism of hand-correcting the records, to the puzzle-solving of deriving meaning from ugly text, to learning about servers, HTML, Markdown, SQLite, and git, to the inspirational feeling of having a senior scholar get in touch about how they have been using the project to inform their work, each element requires a different kind of skill and delivers a different sort of reward. I can "play around" with something new, and if it seems to work and shows promise, I can formalize it as part of the project; but if it turns out to be just a curiosity or a dead end, the exploration was satisfying on its own.

Over the last few months, these two threads have come together in a new effort to improve the data. The identification of documents in the first version of 9CHRIS was mostly done by computers, and it was largely but not entirely correct. I hand-corrected some of the most egregious examples where the computer guessed wrong, and the system was generally usable as a result. Recently, however, I realized that if the whole dataset could be hand-corrected, it would open a number of possibilities for future use. Other information could be layered atop the corrected data, and this might, in turn, open up prospects for much more interesting analysis. Consequently, I began hand-correcting each of the 3,359 volumes in 9CHRIS, inaugurating a new phase of the project. Beginning with volume 0001 and moving sequentially, I flip through each volume page by page, noting the beginning of each document it contains and updating the 9CHRIS site with the corrected information.

Today I passed a milestone, having corrected volume 1000. With each volume I correct, searches improve, errors are removed from the dataset, and I get new ideas about what can be done with all this data once every document has been correctly identified. Once this hand-correcting phase is complete, future project initiatives will build layers of additional metadata on that foundation. Even before all the corrections are complete, however, the existing system keeps improving gradually with each corrected document.