Building And Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288
Size: 31.72 MB
Format: PDF, Kindle

Building And Using Comparable Corpora from the Author: Serge Sharoff. The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Web As Corpus

Author: Maristella Gatto
Publisher: A&C Black
ISBN: 1441134131
Size: 45.17 MB
Format: PDF, ePub, Docs

Web As Corpus from the Author: Maristella Gatto. Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the "web as corpus". It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

The Routledge Handbook Of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
ISBN: 1135153620
Size: 72.88 MB
Format: PDF, ePub

The Routledge Handbook Of Corpus Linguistics from the Author: Anne O'Keeffe. The Routledge Handbook of Corpus Linguistics provides a timely overview of a dynamic and rapidly growing area with a widely applied methodology. Through the electronic analysis of large bodies of text, corpus linguistics demonstrates and supports linguistic statements and assumptions. In recent years it has seen an ever-widening application in a variety of fields: computational linguistics, discourse analysis, forensic linguistics, pragmatics and translation studies. Bringing together experts in the key areas of development and change, the handbook is structured around six themes which take the reader through building and designing a corpus to using a corpus to study literature and translation. A comprehensive introduction covers the historical development of the field and its growing influence and application in other areas. Structured around five headings for ease of reference, each contribution includes further reading sections with three to five key texts highlighted and annotated to facilitate further exploration of the topics. The Routledge Handbook of Corpus Linguistics is the ideal resource for advanced undergraduates and postgraduates.

Parallel Text Processing

Author: Jean Véronis
Publisher: Springer Science & Business Media
ISBN: 9780792365464
Size: 27.21 MB
Format: PDF

Parallel Text Processing from the Author: Jean Véronis. With the rising importance of multilingualism in language industries, brought about by global markets and world-wide information exchange, parallel corpora, i.e. corpora of texts accompanied by their translation, have become key resources in the development of natural language processing tools. The applications based upon parallel corpora are numerous and growing in number: multilingual lexicography and terminology, machine and human translation, cross-language information retrieval, language learning, etc. The book's chapters have been commissioned from major figures in the field of parallel corpus building and exploitation, with the aim of showing the state of the art in parallel text alignment and use ten to fifteen years after the first parallel-text alignment techniques were developed. Within the book, the following broad themes are addressed: (i) techniques for the alignment of parallel texts at various levels such as sentence, clause, and word; (ii) the use of parallel texts in fields as diverse as translation, lexicography, and information retrieval; (iii) available corpus resources and the evaluation of alignment methods. The book will be of interest to researchers and advanced students of computational linguistics, terminology, lexicography and translation, both in academia and industry.

Corpus Based Language Studies

Author: Tony McEnery
Publisher: Taylor & Francis
ISBN: 9780415286237
Size: 64.86 MB
Format: PDF, ePub, Docs

Corpus Based Language Studies from the Author: Tony McEnery. Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.

Seeing Through Multilingual Corpora

Author: Stig Johansson
Publisher: John Benjamins Publishing
ISBN: 9789027223005
Size: 32.51 MB
Format: PDF, ePub, Mobi

Seeing Through Multilingual Corpora from the Author: Stig Johansson. Through electronic corpora we can observe patterns which we were unaware of before or only vaguely glimpsed. The availability of multilingual corpora has led to a renewal of contrastive studies. We gain new insight into similarities and differences between languages, at the same time as the characteristics of each language are brought into relief. The present book focuses on the work in building and using the English-Norwegian Parallel Corpus and the Oslo Multilingual Corpus. Case studies are reported on lexis, grammar, and discourse. A concluding chapter sums up problems and prospects of corpus-based contrastive studies, including applications in lexicography, translator training, and foreign-language teaching. Though the main focus is on English and Norwegian, the approach should be of interest more generally for corpus-based contrastive research and for language studies in general. Seeing through corpora we can see through language.

Exploring English With Online Corpora

Author: Wendy Anderson
Publisher: Palgrave Macmillan
ISBN: 1137079150
Size: 72.72 MB
Format: PDF, Mobi

Exploring English With Online Corpora from the Author: Wendy Anderson. Never before have so many electronic resources been available to support the teaching of English. From a wide variety of online corpora to specialized archives of speech and writing, teachers and students are faced with the challenge of understanding these resources and selecting those appropriate to their purpose. This accessible introduction demonstrates how freely available corpora can be used for the study of English at different linguistic levels, including vocabulary, grammar, discourse and pronunciation. With clear descriptions and a practical approach, Exploring English with Online Corpora:
• introduces readers to a range of online corpora that can be used to explore and analyse the English language
• provides a detailed guide to interpreting corpus data
• demonstrates how teachers and lecturers can integrate corpora into language courses
• includes interactive tasks, a helpful glossary and further reading sections
This essential guide to the use of online corpora will be invaluable for students studying the English language and beginning to formulate their own research questions.

Translation And Cognition

Author: Gregory M. Shreve
Publisher: John Benjamins Publishing
ISBN: 9789027231918
Size: 32.23 MB
Format: PDF, ePub, Docs

Translation And Cognition from the Author: Gregory M. Shreve. Translation and Cognition assesses the state of the art in cognitive translation and interpreting studies by examining three important trends: methodological innovation, the evolution of research design, and the continuing integration of translation process research results with the core findings of the cognitive sciences. Several of the volume's essays focus on fruitful new process research methods, such as eye tracking and keystroke logging, that have arisen to supplement the use of think-aloud protocols. Another set of contributions investigates how some central theories, concepts, and methods from our sister disciplines of psycholinguistics, cognitive psychology, and neuroscience can inform our understanding of translation processes and their development in novices and experts. Yet another set of essays argues that methodological innovation and integration with the cognitive sciences can lead to more robust research designs and theoretical frameworks to explain the intricacies of cognitive processing during translation and interpreting. Thus, this timely volume actively demonstrates that a new theoretical and methodological consensus in cognitive translation studies is emerging, promising to greatly improve the quality, verifiability, and generalizability of translation process research.

Treebanks

Author: A. Abeillé
Publisher: Springer Science & Business Media
ISBN: 9401002010
Size: 73.86 MB
Format: PDF, ePub

Treebanks from the Author: A. Abeillé. This book surveys the state of the art in work on parsed corpora. It gathers 21 papers on building and using parsed corpora, raising many relevant questions, and deals with a variety of languages and a variety of corpora. It is for those working in linguistics, computational linguistics, natural language, syntax, and grammar.

Building And Exploring Web Corpora (WAC3 2007)

Author: Cédrick Fairon
Publisher: Presses univ. de Louvain
ISBN: 9782874630828
Size: 56.91 MB
Format: PDF, ePub, Mobi

Building And Exploring Web Corpora (WAC3 2007) from the Author: Cédrick Fairon. WAC: More and more people are using Web data for linguistic and NLP research. The Web as Corpus workshop (WAC) provides a venue for exploring how we can use it effectively and the advancements to which this could lead. This book is a collection of the talks presented at the 3rd WAC in Louvain-la-Neuve (Belgium). The focus is on the description of Web corpus collection projects, the exploration of Web data characteristics from a linguistics/NLP perspective, and on the use of crawled Web data for NLP purposes. CLEANEVAL: Any use of Web data requires that it be cleaned in order to get rid of unwanted material including, for example, HTML markup, navigation bars, advertisements. To date there has been no sharing of resources or expertise in this particular domain and the cleaning has often been done minimally. Cleaneval was an exercise aimed at promoting collaboration and improving our understanding of the issues. Results and perspectives are presented in this book.