Topic Modeling and The Ladder

Hi guys.

I’m quite late to this party, but since I didn’t participate in the round-table discussion/blog post week, I thought I should share what I’ve been working on with everybody now. Below is an excerpt from my paper. Thanks, everybody, for being awesome; this was a great class. I would love to read some updates!

For this project, and for me as a scholar in general, it was important to figure out some way to balance the specific needs of my archive and still put what was relevant about DH methods into practice. Following Tanya Clement’s example of a close-distant-reading, I attempted a simple “bag of words” type analysis on some works by one of my pet authors, Edna Ferber; because she was so widely read in the 30s and 40s, yet nobody cares about her now, it was fairly easy to find a good number of her texts online in .txt format. The results were profoundly uninteresting. Ferber is a straightforward writer, known for her interesting characters more than complex plots or innovative style. In fact, this experience gave me a new understanding of Tanya Clement’s suggestion in “‘A Thing Not Beginning and Not Ending’: Using Digital Tools to Distant-Read Gertrude Stein’s The Making of Americans” that digital methods proved that Stein was a “genius.” Stein may not have been secretly writing in these patterns on purpose, psychically intuiting that someday, almost a century later, someone with a computer would find them, but it does appear to be true that a text has to be pretty complex in order for a close-distant-reading like that to produce interesting results. I would need something bigger, over a longer period of time, that was still at least somewhat unified by something, if not a specific author.

Enter The Ladder. The Ladder was a monthly periodical that ran from 1956 to 1972. It was written, printed and distributed by the lesbian organization the Daughters of Bilitis in San Francisco. It was a small magazine, in every sense of the word: small subscribership, only about twenty pages per issue, and a relatively brief 16-year run. But the journal was tremendously important to the disenfranchised community it served, namely the ever-invisible lesbian population in 1950s America. McCarthy-era anti-gay paranoia meant that publishing and distributing such a text was fairly risky (and indeed, the Daughters of Bilitis had an extensive FBI file), but it was vital for establishing a lesbian community and/or support network for those who feared for their jobs and friendships. The Daughters of Bilitis’s expressed purpose in creating the magazine came down to two basic concepts: 1. educating people about what it means to be a gay woman, and 2. archiving, reviewing, and soliciting lesbian-centered literature. If the literary field as a whole functions as a “collective system,” as Moretti insists, The Ladder was explicitly concerned with facilitating its own smaller version of that system for lesbian women in the 1950s United States. As a magazine, it’s barely a dent in the large corpus of twentieth century magazines. As a medium of communication between individual members of an otherwise-estranged community of readers and writers, it’s a perfect resource, and though it is not large in the distant-reading sense, it is large enough that a traditional close reading of all 180 or so issues is less effective than a comprehensive view of the publication using digital methods.

Given the wide-ranging, system-based goals of The Ladder, topic modeling seemed like it would produce the most interesting results. Obtaining the data proved to be extremely feasible, because one of the best things about The Ladder is how carefully it has been preserved and archived. Though the regular subscriber list never topped a few hundred, issues were shared, read, re-read, and stored much more widely than these numbers suggest. Several archival projects have taken an interest in the Daughters of Bilitis and their landmark publication, and as a result it’s available in several libraries, including the Rare Books and Manuscripts library here at U of I, and it’s even been digitized by the LGBT Life database. I had my entire corpus available to me, already in a digital format.

Due to time constraints, I ended up stopping at 1963, with more than 1000 .txt files for topic modeling. Though the project seemed exceedingly promising, and though I was learning more and more about my corpus firsthand through OCR correction, I started my project far too late to make the topic modeling happen by this point. I strongly wish I had come up with this idea earlier in the semester, so I might have been able to do a more thorough job.

For now, I did a quick “bag of words” scan on about half of the issues from 1957, which was the first full year of The Ladder’s publication and the first year in which Phyllis Lyon cast aside her pseudonym Ann Ferguson in favor of her real name. The 20th through 80th most commonly used words for the January through July issues, excluding April, are below.

The term “homosexual” is not insignificant here: it was a fairly new term at the time, as the more commonly used word was Havelock Ellis’s “congenital invert” to describe gay citizens. Active verbs like “make” and “can,” in conjunction with the word “public,” are likewise unsurprising, considering that The Ladder placed itself at the frontier of a social movement aimed at cultural acceptance of homosexuality as a minority identity. I’m struck by the lack of the word “lesbian” (or Lesbian, as The Ladder typically capitalized the word), because my understanding of the periodical was that it was the only one addressed specifically to gay women, and that this is what differentiated it from associated homophile publications like The Mattachine Review and ONE. I am equally struck by the appearance of the word “love” on this list, though upon second glance it’s less puzzling when one sees the words “book” and “subject” on the list, since much of the lesbian literature being published and reviewed by The Ladder was centered on forbidden love stories. While this approach to text mining has obvious limitations, and is not itself enough to compose a compelling argument, I think enough interesting things emerge, even from a simple text analysis, that it bodes well for a future project involving topic modeling and The Ladder.
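
For anyone curious how a “bag of words” scan like the one described above might work, here is a minimal sketch in Python. It is only an illustration under some assumptions, not the tool actually used for this project: the function name ranked_words, the tokenizing rule, and the toy sample text are all hypothetical. It counts every word across a set of texts and returns a slice of the ranking, such as the 20th through 80th most common words.

```python
import re
from collections import Counter


def ranked_words(texts, start=20, end=80):
    """Return (word, count) pairs ranked start..end (1-indexed, inclusive)."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters (allowing internal
        # apostrophes, as in "ladder's") as words. This is an assumed rule.
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[start - 1:end]


# Toy stand-in for the issue files; real input would be the text of each
# corrected OCR .txt file, read in one string per file.
sample_issues = [
    "the public and the book and the subject of love",
    "the public can make the book public",
]
top = ranked_words(sample_issues, start=1, end=3)
print(top)
```

Two caveats about this sketch. Lowercasing everything means “Lesbian” and “lesbian” are counted as the same word, which matters for a periodical that capitalized the term. And starting at rank 20 only roughly skips the most common function words; an explicit stopword list would be a more deliberate way to filter them.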

