Inspired by a project by David Venturi which aimed at creating his own Data Science Curriculum using only courses and resources available online, I decided to do something similar, but instead of focussing on Data Science, my goal is to get a better grasp of the field of Digital Humanities.
I took this decision because of an experience I had during my ›traditional‹ course of studies. I came into contact with the Digital Humanities a few times during that period, but never in a way that allowed for a deep and thorough understanding. For this reason, I remained fascinated but at the same time rather ignorant of the methodologies and concepts used by practitioners of the discipline. I didn’t even come close to what one could call code literacy (Cf. Rieder and Röhle 2012, p. 76): a critical understanding of the techniques and methods used in the Digital Humanities, and of what coding means in the humanities.
At the same time, I noticed in discussions with ›traditionally‹ oriented classmates a certain discontent with the Digital Humanities. This discontent was rather vague, but to most of them it simply didn’t seem necessary to bring informatics into the mix or to change established ways of tackling problems. Books and papers written by academics in the Digital Humanities can indeed be irritating to read for a more ›traditional‹ scholar. Not only is the way of writing significantly different, the form and structure of arguments also seem to differ substantially. The most obvious marker of this is the use of numerical data and graphic representations (Cf. Ramsay 2003 on the topic of relative incommensurability).
The hope often placed in the Digital Humanities is the prospect of stronger trans- and interdisciplinary communication within the humanities without homogenising their disciplinary diversity. This hope is not without reason: the use of digital methods is not restricted to one use case or discipline, and it is already discernible that the Digital Humanities have the potential to cross boundaries within the humanities conceptually as well as systematically (Cf. Meister 2012, p. 84). Whether, on the other hand, the reformulation of humanist problems into abstract code will result in the rise of a new lingua franca that every scholar in the humanities will have to speak in the future, as Meister seems to propose (Cf. ibid., p. 83), remains to be seen.
In my opinion, it would be misguided to demand that all kinds of humanities research conform to an informatics rule set. Such a development would be detrimental to the productive specialisations of the individual disciplines, each of which has its own history. Convergence on the basis of digital methods, however, seems possible. The perspectives and possibilities of the Digital Humanities are still in their infancy, which means that experimenting with what works and what doesn’t is still important.
Since I started studying in 2012, the professionalisation (and ›professoralisation‹) of the Digital Humanities has steadily increased. In 2013 the organisation Digital Humanities im deutschsprachigen Raum was founded, and it seems like a new Bachelor or Master course in Digital Humanities is established each year. Still, the main question of what defines or should define the Digital Humanities remains at the centre of many disciplinary discussions. For this reason, the following paragraphs point out some of the more prominent positions and definitions.
Franco Moretti is considered to be one of the most important emissaries of the discipline. Ten years before he established the Stanford Literary Lab in 2010, he had coined the term Distant Reading, which he used almost like a battle cry against ›traditional‹ literary criticism. Moretti defines Distant Reading as the »focus on units that are much smaller or much larger than the text: devices, themes, tropes — or genres and systems« (Moretti 2000, p. 57), which is why, he argued, these units could only be adequately captured and described using large amounts of text. His plea to literary criticism was a polemical one: »[W]e know how to read texts, now let’s learn how not to read them.« (Ibid.)
In the almost two decades since this plea, a form of consolidation has taken place, leading to a more grounded tone in the discussion. The Digital Humanities have arrived in the academic mainstream. This is also perceptible in the broad definition of the discipline by Jannidis et al. (2017). The authors describe the field as »the sum of all attempts to utilise informatics in the context of humanities research« (p. 13). A figure that depicts this intersection can be found in Patrick Sahle’s paper DH Studieren! Auf dem Weg zu einem Kern- und Referenzcurriculum (2013):
Sahle (2013), p. 27.
One can, therefore, differentiate at least four main areas of the Digital Humanities at this point (Cf. Jannidis et al. 2017, p. 13):
I’m particularly interested in the first and the last point of this list. The curriculum, however, focusses on learning skills related to the first point: it is supposed to provide basic knowledge of Python, supplemented by basic concepts and methods of corpus linguistics, Natural Language Processing (NLP) and Machine Learning. The courses are offered by different universities and platforms, which partly leads to redundancies that are, in my opinion, helpful on a didactic level. I will document my progress in the blog and also try to find and present further helpful resources. The bars indicate how far I have progressed in the respective course.
Each of the courses can be understood as an introduction to the relevant subject area, be it working with Python or special applications of digital methods for dealing with large amounts of text and data. My aim is not to acquire full informatics knowledge, but to enable myself to independently design and apply digital methods for examining and answering my own research questions, and to critically evaluate those very techniques and methods.
Introduction to object-oriented programming with Python, with introductory units on the structure and modules of the standard library as well as on the concept and application of functions and classes. Various projects for application-oriented learning of algorithmic problem solving.
Three-month Challenge Course teaching the basics of data science. Basic concepts from the areas of descriptive statistics, Python and SQL necessary for the description, exploration and visualization of given data.
Introductory course in the Digital Humanities. Conveys the basics of research in the Digital Humanities by showing different projects as well as tools and how and why they were developed. Includes units on acquiring, cleaning, and creating data as well as using the command line and the tool Voyant.
Basic terms and concepts of corpus linguistics: word frequencies, collocations, N-grams. In addition, learning units for manual and automatic corpus annotation (Named-entity recognition (NER), Part-of-Speech-Tagging (POS) and Lemmatization).
Techniques and methods for the extraction and analysis of text data (Natural Language Processing), including learning units on Topic Analysis, Text Clustering and Text Categorization.
In addition to Deep Learning with PyTorch, the course also includes units on using NumPy and pandas for machine learning.
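To make the corpus-linguistic terms from the course descriptions above (word frequencies, N-grams, collocations) a little more tangible, here is a minimal sketch using only the Python standard library. It is not taken from any of the courses: the sample sentence, the deliberately naive tokenizer and the helper names `tokenize` and `ngrams` are my own assumptions for illustration. Frequent bigrams are treated here only as rough candidates for collocations; a proper collocation analysis would use an association measure rather than raw counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of the letters a-z (a naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented sample sentence, used purely for illustration.
sample = "The cat sat on the mat and the dog sat on the rug"

tokens = tokenize(sample)

# Word frequencies: how often each token occurs in the sample.
word_freq = Counter(tokens)

# Bigram frequencies: how often each pair of adjacent tokens occurs.
bigram_freq = Counter(ngrams(tokens, 2))

print(word_freq.most_common(3))
print(bigram_freq.most_common(2))
```

On a real corpus one would swap the regular expression for a proper tokenizer and add lemmatization, so that inflected forms of a word are counted together.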