Linguistic corpora I've developed or improved upon. If you use these in research, please cite me!
Annotated DN online (November 2017)
A version of the Diplomatarium Norvegicum that is: (1) searchable with regular expressions; (2) tagged for gender, title/rank and name of first signatory, and number of signatories; (3) tagged for date; (4) (largely) localised
XML Fornrit for gendered speech in Old Icelandic (June 2013)
An XML version of the Fornrit corpus from the Mörkuð íslensk málheild, with direct speech tagged for the gender of the speaker
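As a rough illustration of how gender-tagged direct speech in an XML corpus might be queried, here is a minimal Python sketch. The element name `speech`, the attribute `gender`, and the sample text are all hypothetical placeholders, not the corpus's actual schema or content; check the real files and adjust the names before use.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: element names, attribute names, and text are
# placeholders for illustration only, not taken from the corpus.
SAMPLE = """
<text>
  <s>narration <speech gender="m">first utterance</speech></s>
  <s>narration <speech gender="f">second utterance</speech></s>
  <s><speech gender="m">third utterance</speech></s>
</text>
"""

def count_speech_by_gender(xml_text):
    """Count direct-speech elements per value of the (assumed) gender attribute."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # iter() walks the whole tree, so nested speech elements are found too
    for speech in root.iter("speech"):
        counts[speech.get("gender", "unknown")] += 1
    return counts

counts = count_speech_by_gender(SAMPLE)
print(dict(counts))
```

For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring`, with the tag and attribute names changed to match the corpus.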