Some results of topic modeling.

This is going to be a post with a lot of pictures and not a lot of interpretation. I’m actually using it in lieu of a class handout, because it’s easier than printing 40 pages in full color! The interpretation will be up to us in class.

These images are based on a collection of 1629 volumes from 1751 to 1902, including fiction, drama, poetry, biography, and some essays. The collection was topic-modeled using LDA. The plots you’re looking at here indicate the volumes that prominently featured a given topic, giving some indication of date and genre. BrowseLDA3.R can give you more detailed metadata on the volumes where a given topic appears prominently.

In each case it’s important to note that the gray curve represents an overall frequency that is based on the whole collection, not just the volumes plotted as individual points. (The script only plots the top 1/4 of the volumes, because plotting 1629 points is like plotting fog.) Also, the collection has more drama and poetry in the 18c than it does in the 19c; remedying this is a priority for us going forward.
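For anyone who wants to see that distinction spelled out, here is a minimal Python sketch. It is not the script that produced these plots. The data are made up, the field names (`year`, `proportion`) are hypothetical, and I am assuming the gray curve is a per-year mean of the topic's proportion across every volume (the real curve may well be smoothed). The curve is computed from the whole collection, while the plotted points are only the quarter of volumes where the topic is most prominent.

```python
from collections import defaultdict

def yearly_mean(volumes):
    """Mean topic proportion per year over ALL volumes (assumed basis of the gray curve)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for vol in volumes:
        totals[vol["year"]] += vol["proportion"]
        counts[vol["year"]] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}

def top_quarter(volumes):
    """The quarter of volumes where the topic is most prominent (the plotted points)."""
    ranked = sorted(volumes, key=lambda v: v["proportion"], reverse=True)
    keep = max(1, len(ranked) // 4)
    return ranked[:keep]

# Made-up volumes for illustration only; not drawn from the actual collection.
volumes = [
    {"title": "A", "year": 1760, "proportion": 0.5},
    {"title": "B", "year": 1760, "proportion": 0.25},
    {"title": "C", "year": 1800, "proportion": 0.125},
    {"title": "D", "year": 1800, "proportion": 0.375},
    {"title": "E", "year": 1850, "proportion": 0.0},
    {"title": "F", "year": 1850, "proportion": 0.0625},
    {"title": "G", "year": 1900, "proportion": 0.75},
    {"title": "H", "year": 1900, "proportion": 0.25},
]

curve = yearly_mean(volumes)    # uses all eight volumes
points = top_quarter(volumes)   # keeps only two of them
print(curve)
print([v["title"] for v in points])
```

Volumes that never appear as points still pull the curve up or down, which is why the curve can sit well below the cloud of plotted volumes.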

Topic modeling often introduces you to things you “already know,” in roughly the sense that a map of your home town shows you “things you already know.” You may have been on most of those streets. But a picture from above, showing the spatial relationship of the parts, is not exactly the form in which you experienced them.

We know about the eighteenth-century discourse of sensibility, for instance. But in this modeling run, sensibility divides into two parts. Topic 109 contains a lot of epistolary fiction — and, less prominently, drama — and it expresses tenderness in these terms: dear am happiness myself moment shall most adieu amiable heaven heart happy alas cannot present can passion creature situation charming dearest friendship ever letter confess mine sentiments unhappy truly esteem honour tenderness both distress thousand affection conduct wishes wish possible.

By contrast, Topic 45 contains fiction and poetry, and peaks slightly later.

It expresses sensibility, if possible, even more tenderly: heart mind bosom spirit soul tender every nature generous affection virtue virtues parent tenderness youth whose delight friendship feel affections ardent lovely tears sensibility passion object pure gentle beloved gratitude passions sweet frame form human pride sentiment anguish sacred kind. This discourse may have a slightly less epistolary, or less first-person, emphasis.

I don’t want to overinterpret the next graph, because it’s quite possible that the generic division is an artefact of the collection rather than a feature of underlying print culture. But those black circles are books on peerage, and biographies of various noble characters, especially Byron.

What I find surprising in the next one: the volumes where this topic is most prominent include novels by William Godwin and Brockden Brown — and also biographies, of William Godwin and Brockden Brown! Whatever affliction is happening here, even talking about Godwin and Brown seems to cause it. The affliction might be Jacobinism, I suppose, because a lot of the volumes further down the list are Jacobin novels.

Here are some of the terms prominent in the topic: myself situation mind present most degree scarcely however period conduct character travels might human intercourse thoughts circumstances conceived means purpose sentiments motives existence produced condition appeared temper scene perhaps person attention nothing own species regard time already confidence circumstance every. The stray novel in the 1840s there is Melville’s Typee.

Next, here’s something for you later 19c types. I’ve been saying in class that there’s something weird about faces and body parts in later 19c fiction. Here it is again, mixed up with architecture and spatial orientation somehow.

Whatever this topic is, it always involves “looking” through “windows.” Here are its prominent terms: door face back eyes looked room stood window down suddenly turned yes hand moment looking walked sat round voice light across cried white front herself girl look chair floor standing away behind table fire dark wall opened slowly towards between. We could say this is just “concrete,” but it’s concrete in some weirdly specific way. It seems to me that there’s something here (a periodizable discourse analogous to 18c “sensibility”) waiting to be defined by anyone who can figure out what the heck it is. The authors represented here include Thomas Hardy, Olive Schreiner, Mary Augusta (Mrs. Humphry) Ward, and A. Conan Doyle.

Finally, just to prove that we do have some 19c poetry, here’s Topic #2, which interestingly bridges late-Romantic and Victorian poetry, including Byron and Felicia Hemans, but also Elizabeth Barrett Browning, Harriet Beecher Stowe, and the occasional volume of fiction by Bulwer-Lytton or C. R. Maturin.

I actually am not entirely sure what to call this: dark light wild voice night through around dead spirit earth beneath like dim deep darkness stood death dream still sound sleep heard grave pale words sun silent fell waters bright gaze within shadow cloud near burning cold alone calm vision.


About tedunderwood

Ted Underwood is Professor of Information Sciences and English at the University of Illinois, Urbana-Champaign. On Twitter he is @Ted_Underwood.