The use of language corpora, or large samples of natural texts, has become ubiquitous in linguistic research. Yet there are no conceptual or methodological frameworks for corpus representativeness. This book is the first to provide the field of linguistics with a comprehensive framework for corpus design, evaluation, and representativeness.
Jesse Egbert is Associate Professor of Applied Linguistics at Northern Arizona University. He is a co-founding General Editor of Register Studies, and his recent books focus on online register variation (2018), methodological triangulation (2016, 2020), and corpus linguistics methods (2020).
1. Introduction; 2. Approaches to representativeness in previous corpus linguistic research; 3. Corpus representativeness: a conceptual and methodological framework; 4. Domain considerations; 5. Distribution considerations; 6. The influence of domain and distribution considerations on corpus representativeness - bringing it all together; 7. Corpus design and representativeness in practice; Glossary; Appendix A. Example articles documenting existing corpora; Appendix B. Survey of corpus design and compilation practices.