Project Update: Before you can manipulate the text…

you have to have text to work with. In my case, I have text, but it’s handwritten on parchment. I spent a good part of last semester transcribing pages of text from University of Illinois MS 80, a small manuscript (about 3.5″ x 5.5″) of 122 folios (244 pages), into a Word document. While my enthusiasm to record as much of the information from the manuscript as possible in my transcription was admirable for my paleography project, the data I have will need some adjusting for my digital humanities project. I plan to use plagiarism software to see if I can better narrow down the original source material for the portions of MS 80’s text where the scribe simply states that some of the writings are from St. Bernard, from St. Anselm, and from St. Augustine. In order to run MS 80 (in Middle English) and the source materials (some Latin, some Middle English, some modern English, depending on the source and where I’m able to obtain a digital copy) through the plagiarism software to (hopefully!) detect overlapping text, I need unmarked, unformatted, digitized text.

As my digitized version of MS 80 stands now, several problems are immediately apparent:

1. I need to remove the table (not a big fix)

2. The punctuation needs to be standardized. This is a little more complicated; the periods sometimes indicate a sense break (comma) and sometimes indicate the end of a sentence (semicolon or period). Sometimes, they simply indicate that the reader should pause briefly before continuing. I’m still not exactly sure whether the slash marks indicate the end of a paragraph, a sentence, or both. It seems to depend on which scribe is copying at the time, and also depends perhaps on the formatting of the source material.

3. All abbreviations need to be spelled out, with parentheses removed. However, if the source material that I’ll be checking this manuscript against is formatted with parentheses, I will either have to remove those as well, or leave them in my text and be prepared for screwier data.

4. If the source material is modernized, I will need to create a modernized version of my manuscript. This is problematic in itself, as my vocabulary selection may not always match that of other editors.

5. The spelling, too, is not standardized. If my sources are in Middle English (my preference, to help decrease the number of variables involved here), their spelling will also not be standardized, nor will it be consistent within itself.
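The mechanical parts of this list (items 2, 3, and 5) could be roughed out in a few lines of Python. This is only a sketch under some loud assumptions: the `normalize` function and its `replacements` parameter are hypothetical names, the sample line is invented rather than quoted from MS 80, and treating every punctuation mark as a plain word break deliberately throws away the sense-break information described in item 2. Removing the table (item 1) and modernizing vocabulary (item 4) are editorial jobs that a script like this does not attempt.

```python
import re


def normalize(text, replacements=None):
    """Reduce a transcription to bare lowercase words separated by single spaces.

    `replacements` is an optional, hypothetical mapping of spelling
    substitutions (for example {"þ": "th"}); which substitutions are
    appropriate is an editorial decision, not something the code knows.
    """
    text = text.lower()
    # Keep the letters of an expanded abbreviation but drop the
    # parentheses around them: "w(i)t(h)" becomes "with".
    text = re.sub(r"[()]", "", text)
    # Treat periods, slashes, and every other punctuation mark as a
    # word break. \w is Unicode-aware, so letters like þ and ȝ survive.
    text = re.sub(r"[^\w\s]", " ", text)
    # Apply any caller-supplied spelling substitutions.
    for old, new in (replacements or {}).items():
        text = text.replace(old, new)
    # Collapse runs of whitespace left behind by the steps above.
    return " ".join(text.split())


# Invented sample line, not a quotation from the manuscript.
print(normalize("W(i)t(h) gret drede. and loue / Ih(es)u"))
# prints: with gret drede and loue ihesu
```

Running the same function over both the transcription and each candidate source would at least ensure the two sides are formatted the same way, which is the concern raised in item 3.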

So, my original concern that I wouldn’t be able to find adequate software is turning out to be less of a concern as the issue of data preparation becomes more complex and unwieldy. I’m learning quite a bit about textual editing theory in the process, which I’m finding oddly enjoyable. The more I work on this, the less likely it seems that I will actually come away with new information about MS 80. I may end up simplifying this project by searching for similarities between two *known* borrowings (for example, MS 80 includes a well-known & oft-copied version of a prayer to Jesus), seeing how well the plagiarism software detects what scholars have already verified, and then determining whether or not this is a viable option for detecting textual echoes.
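To make the "known borrowing" test concrete, here is a rough sketch of one common way to score overlap between two texts: break each into overlapping runs of n words (often called shingles) and compare the two sets. This is an assumption about how such a check could work, not a description of what any particular plagiarism package does, and the names `shingles` and `overlap` are hypothetical. It expects text that has already been stripped of punctuation and markup.

```python
def shingles(text, n=3):
    """Return the set of all runs of n consecutive words in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(a, b, n=3):
    """Jaccard similarity of the two shingle sets: shared / total distinct.

    Returns 0.0 when either text is shorter than n words.
    """
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Toy strings standing in for two cleaned-up texts.
print(overlap("a b c d", "a b c e"))  # one shared shingle out of three
```

A score near 1 for a pair scholars already accept as a borrowing, and near 0 for an unrelated pair, would suggest the approach is worth trying on the unattributed passages. Unstandardized spelling will pull scores down, since two spellings of the same word count as different words here.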


3 Responses to Project Update: Before you can manipulate the text…

  1. tedunderwood says:

    I actually think you are learning more about digital humanities here than you would if the project were simpler! In practice, much of the labor often turns out to be data preparation. I’d like to find ways around that, but I also have to admit that you learn a lot from getting down & dirty with the data.

  2. Camille Marshall says:

    Hi, I’m actually working on the same text (The Meditations upon the Seven Days of the Week) from the only other known manuscript (CUL MS Kk. 1.6) for a different purpose (a uni workshop on editing). I’d love to compare the two versions. Were you working from microfilm, or directly on the manuscript? Willing to share data?

    • Sara Weisweaver says:

      Camille, this is excellent news, and I’d love to collaborate with you! Please send me an email at weiswea2 [at] illinois [dot] edu & we can go from there.
