Spelling Regularisation in the Corpus of Early English Correspondence: A Quantitative Analysis
12 November 2019
L21, Veveří 26, Brno
The paper explores trends in spelling variation in Early English correspondence (15th–17th c.), using the Corpus of Early English Correspondence (CEEC) as its material.
The history of English spelling has so far been described mostly from a systemic (langue) point of view, focusing on the spelling conventions used in English at particular periods and on the specific changes between them. Where (ir-)regularity has been described, it has mostly been in terms of structural regularity and consistency, or of general sound-spelling correspondence. This paper, by contrast, focuses on regularity understood as the predictability of actual written forms (parole) throughout the period.
Overall change in spelling regularity has so far been commented on only in relatively general terms, such as “as the fifteenth century progressed so a universal stabilised orthography … was increasingly widely used” (Scragg, 46), “after 1550 we find a … greater stability and regularity of spelling in private documents”, “up to the final fixing of spelling circa 1650” (ibid., 68) or “the 15th century saw a steady movement towards a fixed spelling that very much resembles the spelling of Modern English” (Upward & Davidson, 174). Such assertions, while undoubtedly based on quality research, rest on an unspecified methodology supported by individual examples only and, as quantitative evidence of the progress of regularisation, are largely circumstantial. There is of course no doubt about the general direction of the process and its basic characteristics, such as the slower pace of change in private documents compared with professional publications, but neither the data to support these assertions nor a precise definition of spelling regularisation has, to my knowledge, yet been provided.
This paper introduces a novel methodology for the quantification of spelling regularity, which allows a more objective assessment of its progression and which also makes use of the metadata provided by the CEEC, such as gender, letter authenticity or the relationship/kinship between the author and the recipient. The paper explores the interactions of such variables from a diachronic perspective using quantified levels of spelling regularity. The measure introduced for this purpose is based on weighted information (Shannon) entropy, used as a measure of the predictability of the spellings of individual functionally defined types, and its calculation is partly based on the morphological tagging of the parsed version of the corpus.
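As an illustration only, a weighted entropy measure of this kind could be sketched as follows. The abstract does not specify the weighting scheme or how types are delimited, so the sketch makes two assumptions: a type is represented by a hypothetical (lemma, tag) pair, and each type's entropy is weighted by its share of all tokens. The function names and the sample tags are likewise hypothetical and are not taken from the CEEC tooling.

```python
import math
from collections import Counter, defaultdict


def spelling_entropy(variant_counts):
    """Shannon entropy (in bits) of the spelling variants of a single type."""
    total = sum(variant_counts.values())
    return sum(
        (n / total) * math.log2(total / n)
        for n in variant_counts.values()
        if n > 0
    )


def weighted_entropy(tokens):
    """Token-weighted mean entropy over all types.

    `tokens` is an iterable of (type_id, spelling) pairs. Weighting each
    type by its share of tokens is an assumption made for this sketch.
    """
    by_type = defaultdict(Counter)
    for type_id, spelling in tokens:
        by_type[type_id][spelling] += 1

    total = sum(sum(counts.values()) for counts in by_type.values())
    if total == 0:
        return 0.0
    return sum(
        (sum(counts.values()) / total) * spelling_entropy(counts)
        for counts in by_type.values()
    )


# Hypothetical sample: one type spelled two ways, one type spelled uniformly.
sample = [
    (("have", "HV"), "have"),
    (("have", "HV"), "haue"),
    (("the", "D"), "the"),
    (("the", "D"), "the"),
]
score = weighted_entropy(sample)
```

Under these assumptions, a fully regular text scores 0, and the score rises as the spellings of frequent types become less predictable. Computing it separately for each time slice or metadata category would yield the kind of diachronic comparison described above.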
Keywords: spelling, regularity, correspondence, corpus, entropy
Parsed Corpus of Early English Correspondence, tagged version. (2006). Annotated by Arja Nurmi, Ann Taylor, Anthony Warner, Susan Pintzuk, and Terttu Nevalainen. Compiled by the CEEC Project Team. York: University of York and Helsinki: University of Helsinki.
Scragg, D. G. (1974). A history of English spelling. Manchester: Manchester University Press.
Upward, C., & Davidson, G. (2011). The history of English spelling. Malden, MA: Wiley-Blackwell.