Preface

1. The Basics
    The Importance of Language Annotation
    The Layers of Linguistic Description
    What Is Natural Language Processing?
    A Brief History of Corpus Linguistics
    What Is a Corpus?
    Early Use of Corpora
    Corpora Today
    Kinds of Annotation
    Language Data and Machine Learning
    Classification
    Clustering
    Structured Pattern Induction
    The Annotation Development Cycle
    Model the Phenomenon
    Annotate with the Specification
    Train and Test the Algorithms over the Corpus
    Evaluate the Results
    Revise the Model and Algorithms
    Summary

2. Defining Your Goal and Dataset
    Defining Your Goal
    The Statement of Purpose
    Refining Your Goal: Informativity Versus Correctness
    Background Research
    Language Resources
    Organizations and Conferences
    NLP Challenges
    Assembling Your Dataset
    The Ideal Corpus: Representative and Balanced
    Collecting Data from the Internet
    Eliciting Data from People
    The Size of Your Corpus
    Existing Corpora
    Distributions Within Corpora
    Summary

3. Corpus Analytics
    Basic Probability for Corpus Analytics
    Joint Probability Distributions
    Bayes Rule
    Counting Occurrences
    Zipf's Law
    N-grams
    Language Models
    Summary

4. Building Your Model and Specification
    Some Example Models and Specs
    Film Genre Classification
    Adding Named Entities
    Semantic Roles
    Adopting (or Not Adopting) Existing Models
    Creating Your Own Model and Specification: Generality Versus Specificity
    Using Existing Models and Specifications
    Using Models Without Specifications
    Different Kinds of Standards
    ISO Standards
    Community-Driven Standards
    Other Standards Affecting Annotation
    Summary

5. Applying and Adopting Annotation Standards
    Metadata Annotation: Document Classification
    Unique Labels: Movie Reviews
    Multiple Labels: Film Genres
    Text Extent Annotation: Named Entities
    Inline Annotation
    …

6. Annotation and Adjudication
7. Training: Machine Learning
8. Testing and Evaluation
9. Revising and Reporting
10. Annotation: TimeML
11. Automatic Annotation: Generating TimeML
12. Afterword: The Future of Annotation
A. List of Available Corpora and Specifications
B. List of Software Resources
C. MAE User Guide
D. MAI User Guide
E. Bibliography

Index