Building And Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than are translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Web As Corpus

Author: Maristella Gatto
Publisher: A&C Black
ISBN: 1441134131

Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the "web as corpus". It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

Applications Of Finite State Language Processing

Author: Tamás Váradi
Publisher: Cambridge Scholars Publishing
ISBN: 1443826030

NooJ is both a corpus processing tool and a linguistic development environment. It allows linguists to formalize several levels of linguistic phenomena: orthography and spelling, lexicons for simple words, multiword units and frozen expressions, inflectional, derivational and productive morphology, local, structural syntax and transformational syntax. For each of these levels, NooJ provides linguists with one or more formal tools specifically designed to facilitate the description of each phenomenon, as well as parsing tools designed to be as computationally efficient as possible. This approach distinguishes NooJ from most computational linguistic tools, which provide a single formalism that should describe everything. As a corpus processing tool, NooJ allows users to apply sophisticated linguistic queries to large corpora in order to build indices and concordances, annotate texts automatically, perform statistical analyses, etc. NooJ is freely available and linguistic modules can already be downloaded for Acadian, Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian, French, English, German, Hebrew, Greek, Hungarian, Italian, Polish, Portuguese, Spanish and Turkish. The present volume contains papers from the 2008 International NooJ conference, which was held 8–10 June 2008 in Budapest. While the focus of the Budapest conference was on making NooJ compatible with other applications, the papers vary with respect to whether they regard Natural Language Processing (NLP) as a research goal or as a tool. However, they all present a slightly different problem either in the field of NLP, or in one that can be solved using NLP, or present a new development in the tool itself. The range of problems dealt with in the volume is quite varied, which will hopefully enable the readers to find contributions that are relevant to their field of interest.

The Routledge Handbook Of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
ISBN: 1135153620

The Routledge Handbook of Corpus Linguistics provides a timely overview of a dynamic and rapidly growing area with a widely applied methodology. Through the electronic analysis of large bodies of text, corpus linguistics demonstrates and supports linguistic statements and assumptions. In recent years it has seen an ever-widening application in a variety of fields: computational linguistics, discourse analysis, forensic linguistics, pragmatics and translation studies. Bringing together experts in the key areas of development and change, the handbook is structured around six themes which take the reader through building and designing a corpus to using a corpus to study literature and translation. A comprehensive introduction covers the historical development of the field and its growing influence and application in other areas. Structured around five headings for ease of reference, each contribution includes further reading sections with three to five key texts highlighted and annotated to facilitate further exploration of the topics. The Routledge Handbook of Corpus Linguistics is the ideal resource for advanced undergraduates and postgraduates.

Translation And Cognition

Author: Gregory M. Shreve
Publisher: John Benjamins Publishing
ISBN: 9789027231918

"Translation and Cognition" assesses the state of the art in cognitive translation and interpreting studies by examining three important trends: methodological innovation, the evolution of research design, and the continuing integration of translation process research results with the core findings of the cognitive sciences. Several of the volume's essays focus on fruitful new process research methods, such as eye tracking and keystroke logging, that have arisen to supplement the use of think-aloud protocols. Another set of contributions investigates how some central theories, concepts, and methods from our sister disciplines of psycholinguistics, cognitive psychology, and neuroscience can inform our understanding of translation processes and their development in novices and experts. Yet another set of essays argues that methodological innovation and integration with the cognitive sciences can lead to more robust research designs and theoretical frameworks to explain the intricacies of cognitive processing during translation and interpreting. Thus, this timely volume actively demonstrates that a new theoretical and methodological consensus in cognitive translation studies is emerging, promising to greatly improve the quality, verifiability, and generalizability of translation process research.

Corpus Based Language Studies

Author: Tony McEnery
Publisher: Taylor & Francis
ISBN: 9780415286237

Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.

Multilingual Corpora And Multilingual Corpus Analysis

Author: Thomas Schmidt
Publisher: John Benjamins Publishing
ISBN: 9027219346

This volume deals with different aspects of the creation and use of multilingual corpora. The term 'multilingual corpus' is understood in a comprehensive sense, meaning any systematic collection of empirical language data enabling linguists to carry out analyses of multilingual individuals, multilingual societies or multilingual communication. The individual contributions are thus concerned with a variety of spoken and written corpora ranging from learner and attrition corpora, language contact corpora and interpreting corpora to comparable and parallel corpora. The overarching aim of the volume is first to take stock of the variety of existing multilingual corpora, documenting possible corpus designs and uses, second to discuss methodological and technological challenges in the creation and analysis of multilingual corpora, and third to provide examples of linguistic analyses that were carried out on the basis of multilingual corpora.

Seeing Through Multilingual Corpora

Author: Stig Johansson
Publisher: John Benjamins Publishing
ISBN: 9789027223005

Through electronic corpora we can observe patterns which we were unaware of before or only vaguely glimpsed. The availability of multilingual corpora has led to a renewal of contrastive studies. We gain new insight into similarities and differences between languages, at the same time as the characteristics of each language are brought into relief. The present book focuses on the work in building and using the English-Norwegian Parallel Corpus and the Oslo Multilingual Corpus. Case studies are reported on lexis, grammar, and discourse. A concluding chapter sums up problems and prospects of corpus-based contrastive studies, including applications in lexicography, translator training, and foreign-language teaching. Though the main focus is on English and Norwegian, the approach should be of interest more generally for corpus-based contrastive research and for language studies in general. Seeing through corpora we can see through language.

From Corpus To Classroom

Author: Anne O'Keeffe
Publisher: Cambridge University Press
ISBN: 0521851467

This book summarises and makes accessible recent work in corpus research, focusing on spoken data and on the place of lexis in grammar and discourse.

Treebanks

Author: A. Abeillé
Publisher: Springer Science & Business Media
ISBN: 9401002010

This book surveys the state of the art in work on parsed corpora. It gathers 21 papers on building and using parsed corpora, raising many relevant questions, and deals with a variety of languages and a variety of corpora. It is for those working in linguistics, computational linguistics, natural language, syntax, and grammar.