One of my projects for 2018 is to take a text and shepherd it, or curate it, all the way through an open source pipeline from ‘print’ to ‘digital edition’. This is part of my 2018 year of digital humanities. Here I talk a little bit about the envisioned process.
The text I have in mind is quite short, just over 2000 words. It’s Gregory of Nyssa’s De Deitate adversus Evagrium (in vulgo In suam Ordinationem). I’ve done some work on De Deitate Filii et Spiritus Sancti and this will be a nice complement to that.
My checklist of things to do:
The Pipeline
Step 1: OCRing a print text
Step 2: Correcting the OCR output
Step 3: Creating a TEI-XML version
Step 4: PoS Tagging/Lemma tagging/Morph tagging
Step 5: Produce a translation
Step 6: Alignment
Step 7: Annotations and commentary
Then, voilà, an open-source text, freely available with useful data attached. Half of these things I don’t actually know how to do yet. Maybe more than half. That’s part of the fun. And, presuming it goes well, this will serve as a pilot project for taking future texts through a similar pipeline.
I have had a strong desire for quite some time now to look into doing some of this type of work as well, but it sits low enough on the priority list that I’ve not gotten around to it yet. I am pondering a research project as part of my MDiv that might neatly intersect with a project like this. (Actually my real desire is to move past just the digitisation towards an open/free online database that allows conducting searches of this type of material.)
> just the digitisation towards an open/free online database that allows conducting searches of this type of material.)
Then you should check out Open Greek and Latin (http://www.dh.uni-leipzig.de/wo/projects/open-greek-and-latin-project/) and similar projects. There’s a lot going on in digital humanities and classics, and collaboration is vital.
That’s awesome.
I also have been wanting to do exactly this but don’t know how. Keep us posted!
As with my ‘Diary/Apprentice’ posts, I plan to blog my way through the process with my own step-by-step guide.