One of my projects for 2018 is to take a text and shepherd it, or curate it, all the way through an open source pipeline from ‘print’ to ‘digital edition’. This is part of my 2018 year of digital humanities. Here I talk a little bit about the envisioned process.
The text I have in mind is quite short, just over 2000 words. It’s Gregory of Nyssa’s De Deitate adversus Evagrium (in vulgo In suam Ordinationem). I’ve done some work on De Deitate Filii et Spiritus Sancti and this will be a nice complement to that.
My checklist of things to do:
Step 1: OCRing a print text
Step 2: Correcting the OCR output
Step 3: Creating a TEI-XML version
Step 4: PoS tagging, lemma tagging, and morphological tagging
Step 5: Producing a translation
Step 6: Aligning the text and the translation
Step 7: Annotations and commentary
Then, voilà: an open-source text, freely available, with useful data attached. Half of these things I don’t actually know how to do yet. Maybe more than half. That’s part of the fun. And, presuming it goes well, it will serve as a pilot project for taking future texts through a similar pipeline.
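To give a rough idea of what Step 3 might involve, here is a minimal sketch in Python that wraps corrected paragraphs in a bare-bones TEI skeleton using only the standard library. It is an illustration under assumptions, not the encoding I have settled on: the function names (`build_tei`, `to_string`), the header wording, the `type="edition"` division, and the placeholder paragraphs are all hypothetical, and a real edition would need a much fuller header.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; registering it with an empty prefix makes it
# serialize as the default namespace (xmlns="...") on the root element.
TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return a tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, author, paragraphs):
    """Build a minimal TEI tree (hypothetical helper, illustrative only)."""
    root = ET.Element(tei("TEI"))

    # teiHeader/fileDesc with its three required children.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    ET.SubElement(title_stmt, tei("author")).text = author
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Unpublished draft."  # placeholder
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = (
        "Transcribed from corrected OCR of a print edition."  # placeholder
    )

    # The text itself: one numbered <p> per corrected paragraph.
    # "grc" is the language code for Ancient Greek.
    text = ET.SubElement(root, tei("text"), {XML_LANG: "grc"})
    body = ET.SubElement(text, tei("body"))
    div = ET.SubElement(body, tei("div"), {"type": "edition"})
    for n, para in enumerate(paragraphs, start=1):
        ET.SubElement(div, tei("p"), {"n": str(n)}).text = para
    return root


def to_string(root):
    """Serialize the tree to a Unicode string."""
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder content, not the actual text of the work.
    demo = build_tei(
        "De Deitate adversus Evagrium",
        "Gregory of Nyssa",
        ["[first corrected paragraph]", "[second corrected paragraph]"],
    )
    print(to_string(demo))
```

Numbering the paragraphs with `n` at this stage is a design choice made with the later steps in mind: tagging, alignment, and annotation all need stable places in the text to point to.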