LINDSAY ROSE RUSSELL
“The Miriams of Webster: Women Employees at Merriam-Webster, 1864–1961”
Responding to common errors in dictionary correspondents’ address lines, real-life Merriam-Webster editor Kory Stamper offered this whimsical caricature of her spectral colleague “Miriam Webster”: “The stack of marked sources she must type up never shrinks; the stack of finished citations never grows. […] She lives alone with her books, her disappointment, and that burnt-out lightbulb in the hallway.” This spinster lexicographer, impressively diligent if acutely disaffected, counters the far more pervasive sense of dictionary makers as male, to be greeted “Dear Sirs,” a form equally irksome to Stamper. Of course, Merriam-Webster dictionaries are the product of neither specter. They are and have long been the product of heterogeneous labor. As early as 1864, women were working alongside men on An American Dictionary of the English Language. By the early part of the twentieth century, women were being recruited from nearby women’s colleges to serve as lexicographical as well as administrative assistants for Webster’s [Second] New International Dictionary. These practices of recruiting and employment would continue for Webster’s [Third] New International Dictionary, a project that would also involve women consultants, editors, and advertisers. By the close of the 1960s, women and men alike would be complaining that the company’s purpose-built dictionary offices afforded insufficient accommodations for its female workforce. This paper sketches the roles women played at Merriam-Webster before 1961, supplanting the single-spinster assumption with a more vibrant sense of how women inhabited and enhanced dictionary work.
Biography: Lindsay Rose Russell teaches at the University of Illinois at Urbana-Champaign where she is also core faculty in The Center for Writing Studies. Her research interests include histories and descriptions of the English language, rhetorical theory, genre studies, and feminist historiography. Her first book, Women and Dictionary Making, is forthcoming from Cambridge University Press.
BEATRIZ SÁNCHEZ CÁRDENAS & MÍRIAM BUENDÍA CASTRO
Explaining hurricanes from a collocational perspective. You will know terms by the company they keep
A recent trend in Terminology is the enhancement of terminological entries with phraseological information, such as multiword expressions (MWEs). Given that verbs carry most of the semantic load of the sentence, they are essential to define the syntactic and semantic structure in which terms are inserted. Thus, the identification of noun-verb combinations in corpora is crucial to define the linguistic behavior of terms. This paper explores new techniques, using an NLP software tool, for the automatic identification and extraction from corpora of noun-verb combinations that are relevant for the construction of new terminological entries. Accordingly, we implemented a methodology to identify which verbs are associated with each term in the corpora. Based on Claveau & L’Homme (2006), this method has three steps: “1) isolating contexts in which N-V pairs sharing a realization relationship occur; 2) from these contexts, inferring linguistically-motivated rules that reflect the behavior of realization N-V pairs; and 3) projecting these rules on corpora to find other valid N-V pairs”. Thanks to the analysis of these noun-verb combinations, it is possible to extract the most salient collocational information for specialized terms. The inclusion of such information in the entries of specialized lexicographic resources provides valuable information for writing and understanding specialized texts.
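As a rough illustration of the first step only (collecting the verbs that occur near a term), the sketch below counts verbs found within a small window around a noun in sentences that are already lemmatized and part-of-speech tagged. This is plain co-occurrence counting, not the rule-inference method of Claveau & L’Homme (2006); the function name, the tag labels, the window size, and the sample sentences are all assumptions made for the example.

```python
from collections import Counter


def verbs_for_term(tagged_sentences, term, window=3):
    """Count verb lemmas occurring within `window` tokens of the noun `term`.

    Each sentence is a list of (lemma, tag) pairs; the tags "NOUN" and
    "VERB" are assumed labels, not those of any particular tagger.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        for i, (lemma, tag) in enumerate(sentence):
            if lemma != term or tag != "NOUN":
                continue
            start = max(0, i - window)
            end = min(len(sentence), i + window + 1)
            for other, other_tag in sentence[start:end]:
                if other_tag == "VERB":
                    counts[other] += 1
    return counts


# Invented, pre-tagged sample sentences (hypothetical data).
sentences = [
    [("the", "DET"), ("hurricane", "NOUN"), ("strike", "VERB"),
     ("the", "DET"), ("coast", "NOUN")],
    [("a", "DET"), ("hurricane", "NOUN"), ("form", "VERB"),
     ("over", "ADP"), ("warm", "ADJ"), ("water", "NOUN")],
    [("the", "DET"), ("hurricane", "NOUN"), ("strike", "VERB"),
     ("again", "ADV")],
]

verb_counts = verbs_for_term(sentences, "hurricane")
print(verb_counts.most_common())
```

Raw counts of this kind would only supply candidate N-V pairs; deciding which pairs stand in a realization relationship is the work of the later steps quoted above.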
Biographies: Beatriz Sánchez Cárdenas is a lecturer in the Department of Translation and Interpreting at the University of Granada. She holds a PhD from the University of Strasbourg (France). Her main research interests are corpus linguistics, specialized language and phraseology. She has been invited to present her research at various leading universities, such as the University of Paris Diderot and the University of California, Berkeley. She has published papers in prestigious journals and with publishing houses such as Procedia - Social and Behavioral Sciences, Scolia (Sciences Cognitives, Linguistique & Intelligence Artificielle), Revista de Lingüística y Lenguas Aplicadas, John Benjamins, and Peter Lang. Míriam Buendía Castro is a lecturer in the Department of Modern Philology at the University of Castilla-La Mancha (Spain). She holds a PhD in Translation and Interpreting from the University of Granada, for which she received the Outstanding Doctoral Dissertation Award. She has published more than 35 articles, book chapters, and a book in prestigious international journals and with publishing houses, including Terminology and RESLA. Her main research interests are terminology, phraseology, and corpus linguistics.
Hall Speak: Using Language Contact and Lexical Borrowing on Halls of Residence to Update Regional Lexicography
This study focuses on a language contact situation that has resulted in continuous lexical borrowing on the halls of residence at the UWI, St. Augustine Campus. The language situations on its halls are distinct from more typical contact situations, which tend to involve two groups in prolonged contact. Here many groups interact, and those involved in the contact share a language (Standard English) that can be used to defuse any misinterpretations when use of their native language variety fails. Additionally, each year, half the population of each hall consists of new students, who often bring new or updated terms. This paper reports the results of the first survey of language contact and lexical borrowing on these halls, which identified 232 loanwords. Of these terms, 180 come from different languages around the region, and the other 52 have been classified as having developed on the halls themselves. The list of lexical items identified by residents was compiled into a dictionary and checked against standard regional references. Almost half of the loanwords compiled in this study, particularly slang and swearwords, are not found in the most recent of these dictionaries. It is hoped that this study will serve as a platform for investigating other multicultural/multilingual environments. The methodological framework used in this study provides a comprehensive basis from which lexicographic research can be done elsewhere, and the terms documented here can be used to update and augment existing regional dictionaries.
Biography: Kellon Sankar is currently a student at the University of the West Indies (UWI), St. Augustine, pursuing an MPhil in Linguistics. He is also a Research Assistant/Tutor at UWI, St. Augustine, attached to two departments of the Faculty of Humanities and Education. His study focuses on linguistic and cultural integration on halls of residence at the UWI, St. Augustine campus, specifically language contact and lexical borrowing. He has lived on a hall of residence for the duration of his undergraduate degree, as well as his postgraduate degree (thus far), and has seen first-hand the extent of cultural exchange taking place. Upon completion of this degree, he would like to progress to a career in research, exploring regional lexicography, while continuing to teach at the tertiary level.
ANTONIO SAN MARTÍN
The contextualization of definitions in specialized lexical resources
This paper presents a new method of creating definitions, which can be integrated into the workflow of any specialized lexical resource to solve one of the problems of specialized definitions, namely, the fact that they do not always meet the needs of their target users. One of the reasons for this is the one-size-fits-all approach that is usually followed. The current practice in specialized lexicography is for each term to have only one definition. However, the knowledge transmitted by a term in real usage events varies, depending on the context of activation. As a consequence, terminological definitions tend to be either too general, in an effort to encompass all possible contexts, or too specific, which means that many applicable contexts are omitted.
This paper characterizes the contextual dimension of terminological definitions and describes the flexible terminological definition approach, which includes a corpus-based methodology for crafting contextualized definitions founded on cognitive linguistics principles. As a practical example, we applied this methodology to the elaboration of definitions of environmental terms with Caribbean thematic-cultural contextual constraints. The resulting flexible definitions (based on corpus evidence) reflect the conceptual content of environmental terms from a Caribbean perspective. Consequently, these definitions provide more relevant information for users such as translators, legislators or scientific writers dealing with environmental terminology in the Caribbean context. It thus follows that incorporating the flexible definition approach in Caribbean specialized lexicography would increase the quality of its lexical resources.
Close reading of large corpora through digital reformatting
The Oxford English Dictionary (OED) has long been concerned with establishing the earliest date at which an English word began written use (Gilliver, 25). Throughout the dictionary’s history the means of establishing such dates has largely been the reading program (carried out both by volunteers and by OED staff). In recent years digital corpora have enabled editors to search far larger bodies of text than was possible several decades ago. Yet although the functions enabling navigation through digital corpora offer certain advantages, digital searching lacks the subtlety of close reading, and its tools have typically not been designed with the needs of historical lexicographers foremost in mind. An illustration of the problems accompanying a reliance on digitally searching such corpora as Early English Books Online (EEBO) may be found in the OED entry for anxious (adj.). This entry is part of the third edition of the OED, and, given that it was updated in March 2016, it is certain that OED editors would have searched for this word in EEBO. The earliest citation in OED3 for anxious is 1548; however, a 1529 use of this word by Thomas More may be found in EEBO. The explanation for this gap is methodological. Thomas More’s spelling of the word (anxyouse) is not one of the variants used by EEBO, so unless this specific (and uncommon) spelling is used by a researcher accessing this database, this antedating use of anxious will not be in the search results.
This paper proposes a new search methodology, one which combines some of the nuance of close reading with the efficacy of digital searching. A large number of titles from EEBO (25,000) will be downloaded and then run through a program (PanDoc) that allows certain types of formatting, such as the alphabetization of all individual words in the corpus. Deceptively simple, the process of alphabetizing large bodies of digital text, particularly those from a period before orthographic standardization, allows researchers to examine the contents of corpora such as EEBO from a fresh perspective. All potential lemmata, as well as all the orthographic variants found in EEBO (many of which are unrecognized by this database’s search function), are readily available for chronological comparison with recent entries in OED.
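The alphabetization idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used in the paper: the function name is hypothetical, and the two sample text snippets are invented (only the dates 1529 and 1548 and the spellings anxious and anxyouse come from the abstract). It records the earliest year for every distinct word form and lists the forms alphabetically, so that variant spellings fall next to one another.

```python
import re


def alphabetized_index(dated_texts):
    """Return (word form, earliest year) pairs in alphabetical order.

    `dated_texts` is an iterable of (year, text) pairs. Word forms are
    lowercased runs of ASCII letters, a simplification for this sketch.
    """
    earliest = {}
    for year, text in dated_texts:
        for form in re.findall(r"[a-z]+", text.lower()):
            if form not in earliest or year < earliest[form]:
                earliest[form] = year
    return sorted(earliest.items())


# Invented snippets standing in for dated corpus texts (hypothetical data).
sample = [
    (1548, "an anxious mind"),
    (1529, "so anxyouse a care"),
]

index = alphabetized_index(sample)
for form, year in index:
    print(form, year)
```

In the alphabetized list the form anxyouse sits directly after anxious, with the earlier date attached, which is the kind of adjacency a reader scanning the list could notice even when a keyword search for the modern spelling would not return it.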
Biography: Ammon Shea is Digital Content Editor for Merriam-Webster, where he writes articles on language for m-w.com, and researches antedatings. He has previously been a consulting editor for American Dictionaries for Oxford University Press, and worked with the North American Reading Program of the Oxford English Dictionary. His most recent book is Bad English: A History of Linguistic Aggravation (Perigee, 2014).
JASON F. SIEGEL
Some moral obligations of lexicographers
It is a long-standing belief that lexicographers have a responsibility to the truth of the facts of language usage. Dictionary-makers must create senses that line up with actual usage, without distortion or obfuscation. Yet as Lew (2013) points out, dictionary senses at most correspond to reality as structured by editorial judgments of lexicographers. In this presentation, I present some of the tendencies that lexicographers follow, despite other options being available to them, such as privileging scientific accuracy over folk usage (e.g. defining cucumber and pumpkin as ‘fruits’, or assigning man and woman to reproductive characteristics to which speakers are not privy when they apply the labels), as well as alternatives that could support the speaker community better. One such alternative is a reduced embrace of scientism in favor of folk understanding, in line with the conception of dictionaries as repositories of words as they are actually used. Indeed, as Tao (2016) has demonstrated, dictionaries can have real-life consequences for vulnerable populations when judges use them to ascertain plain meanings. Another is to examine whether the sense order and labelling of words such as racism and gay ought to be guided by the primary or preferred usage of the speakers most likely to be harmed by the concept or by epithetic usage. Finally, we will examine the moral obligations in the development of lexicographic corpora, through the lens of coverage of Caribbean English terms in dictionaries from outside and inside the region alike, and how exclusion of countries’ words affects the conception of their language.
Biography: Jason F. Siegel is the Research Fellow in Lexicography at the University of the West Indies, Cave Hill Campus in Barbados. In this capacity he is the Director of the Richard & Jeannette Allsopp Centre for Caribbean Lexicography. His research focuses on Caribbean lexicography of the French- and English-official Caribbean. He is also the Co-Producer of the new series Bit o’Bajan, a televised series of vignettes about Barbadian lexemes.
Detection of Errors in the Treatment of Multiple Equivalence: A Prolegomenon
The present paper is meant to be the first step in developing the mechanisms for automatic detection of errors in lexicographic treatment of multiple equivalence, the most prominent form of lexical anisomorphism. In addition to the obvious interest this phenomenon has been commanding in lexicography (e.g., Zgusta 1971, Al-Kasimi 1977, Yong and Peng 2007), important direct and indirect contributions to the study of multiple equivalence can be found in various other intellectual traditions, from ordinary language philosophy (most notably in Frege 1982 and Putnam 1975) to linguistic anthropology (Goddard 2012), translatology (Pym 2009), and second language teaching (Kramsch 1993). Sipka (2015) has identified the following common errors in the treatment of multiple equivalence in bilingual dictionaries: a. strategy mismatch, b. vagueness, c. lack of explanation or specification, d. redundancy, e. underspecification overlaps, f. overspecification gaps, g. mislabeling, h. inadequate exemplification, i. language mismatch, j. inconsistent segregation, k. equivalent omission and l. overinterpretation.
Automatic detection of the aforementioned errors in bilingual dictionaries assumes a feasible and reasonably accurate search pattern of some kind. A search pattern, in turn, needs to be efficient, and its construction should not be overly time-consuming. More precisely, when deciding whether it makes sense to develop a search pattern, one should be guided by the following considerations: design feasibility, design resources complexity, comprehensiveness, and exclusiveness. The first parameter of a possible search pattern is a simple check of whether the error in question lends itself to automatic detection. If the answer is positive, i.e., if the development of a search pattern is possible given the type of error, the next question is what kinds of resources one would need to muster to design that particular search pattern. It may be that gathering the resources for a search pattern is overly time-consuming and that it makes more sense to search for the errors manually, by visually scanning the dictionary database. The next step in building the search pattern is to ask what proportion of the errors in question will be detected by that particular search pattern. Ideally, the search pattern should detect all instances of treatment errors, but it may make sense to deploy patterns that find most of them, especially if the design of the pattern is not overly time-consuming. In this situation of underdetection, the cases that are not found by the search pattern may be addressed manually, left to be corrected in other editions of the same dictionary, etc. Finally, we want the search pattern to exclude cases that are not the errors we are looking for. Ideally, the output of the search patterns should not contain any non-errors (but, realistically, some level of overdetection can be tolerated).
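The last two considerations can be made concrete with a small calculation. The sketch below is one possible reading, not a measure defined in the paper: it treats comprehensiveness as the share of true errors that a pattern finds and exclusiveness as the share of the pattern's output that consists of true errors (what information retrieval calls recall and precision). The function name and the entry identifiers are hypothetical.

```python
def evaluate_pattern(flagged, true_errors):
    """Return (comprehensiveness, exclusiveness) for one search pattern.

    flagged:     entry ids returned by the search pattern
    true_errors: entry ids that a manual check confirmed as errors
    """
    flagged, true_errors = set(flagged), set(true_errors)
    hits = flagged & true_errors
    # Share of real errors that the pattern detected (low = underdetection).
    comprehensiveness = len(hits) / len(true_errors) if true_errors else 1.0
    # Share of flagged entries that are real errors (low = overdetection).
    exclusiveness = len(hits) / len(flagged) if flagged else 1.0
    return comprehensiveness, exclusiveness


# Hypothetical entry ids for illustration only.
true_errors = {"e1", "e2", "e3", "e4"}
flagged = {"e1", "e2", "e3", "e8", "e9"}

comprehensiveness, exclusiveness = evaluate_pattern(flagged, true_errors)
print(comprehensiveness, exclusiveness)
```

In this invented case the pattern misses one of four real errors and returns two non-errors among five flagged entries, so a lexicographer could weigh the manual clean-up against the cost of refining the pattern.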
Possible search patterns for each of the aforementioned types of error in the treatment of multiple equivalence are discussed in the central section of the present paper.
Biography: Danko Sipka is a professor of Slavic languages and linguistics at Arizona State University, where he teaches in the School of International Letters and Cultures. His research interests include lexicography and lexicology, as shown by his recent monograph Lexical Conflict: Theory and practice (Cambridge University Press, 2015).