linguistic corpora i've developed / improved upon. if you use these in research, please cite me!
DN online (November 2017)
a version of the Diplomatarium Norvegicum that is: (1) searchable with regexp; (2) tagged for gender, title/rank and name of first signatory, and number of signatories; (3) tagged for date; (4) (largely) localised
XML Fornrit for gendered speech in Old Icelandic (June 2013)
XML version of the Fornrit corpus from the Mörkuð íslensk málheild, with all direct speech tagged for the gender of the speaker