This week I was most focused on the Dataset Deep Dives: both the process of finding a dataset and the process of reporting on one. I fully recognize that what I reported on was not actually a dataset, and wasn’t even much of a corpus, but I wanted to report on something closer to digital literary studies in practice than the Pudding articles, and I was fascinated by the results of the Orlando stylometry article. While poking cursorily around the world of digital modernism, I found what I briefly alluded to in my video: the combined effects of copyright and time mean that literary datasets are not equally distributed.
Maybe it’s because I was concentrating on my own area of literary studies (twentieth-century literature), but a lot of what I saw seemed to involve crafting a corpus by hand rather than working from a pre-existing dataset. This is, in turn, what I imagined I’d be doing for my final project. I think I was one of relatively few people who focused on the data from a literary studies paper rather than a web essay, and it was interesting to see the marked differences in tone, form, and content between the two media. It made me realize that, even though the web essay’s register is probably the more “natural” one, the academic register is probably the one I’m more comfortable doing analysis in.