Building And Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than are translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Building And Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer
ISBN: 9783642201271


Hybrid Approaches To Machine Translation

Author: Marta R. Costa-jussà
Publisher: Springer
ISBN: 3319213113

This volume provides an overview of the field of Hybrid Machine Translation (MT) and presents some of the latest research conducted by linguists and practitioners from different multidisciplinary areas. Nowadays, the most important developments in MT are achieved by combining data-driven and rule-based techniques. These combinations typically involve hybridization of different traditional paradigms, such as the introduction of linguistic knowledge into statistical approaches to MT, the incorporation of data-driven components into rule-based approaches, or statistical and rule-based pre- and post-processing for both types of MT architectures. The book is of interest primarily to MT specialists, but also – in the wider fields of Computational Linguistics, Machine Learning and Data Mining – to translators and managers of translation companies and departments who are interested in recent developments concerning automated translation tools.

Corpora In Translator Education

Author: Federico Zanettin
Publisher: Routledge
ISBN: 1317641353

The use of language corpora as a resource in linguistics and language-related disciplines is now well established. One of the many fields where the impact of corpora has been growing in recent years is translation, at both a descriptive and a practical level. The papers in this volume, which grew out of presentations at the conference Cult2k (Bertinoro, Italy, 2000), the second in the series Corpus Use and Learning to Translate, are principally concerned with the use of corpora as resources for the translator and as teaching and learning aids in the translation classroom. This book offers a cross-section of research by some leading scholars in the field, who offer accounts of first-hand experience and theoretical insights into the various ways of building and using appropriate corpora in translation teaching, for the benefit of teachers and learners alike. The various contributions provide a rich source of inspiration for other researchers and practitioners concerned with 'corpora in translator education'. Contributors include Stig Johansson, Tony McEnery, Kirsten Malmkjær, Jennifer Pearson, Lynne Bowker, Krista Varantola, Belinda Maia and a number of other scholars.

Corpus Use And Translating

Author: Allison Beeby
Publisher: John Benjamins Publishing
ISBN: 9027224269

Professional translators are increasingly dependent on electronic resources, and trainee translators need to develop skills that allow them to make the best use of these resources. The aim of this book is to show how CULT (Corpus Use for Learning to Translate) methodologies can be used to prepare learning materials, and how novice translators can become autonomous users of corpora. Readers interested in translation studies, translator training and corpus linguistics will find the book particularly useful. It includes practical, technical advice for using and learning to use corpora, and it also addresses important issues such as the balance between training and education and how CULT methodologies reinforce student autonomy and responsibility. Besides being a good introduction to CULT, it incorporates the latest developments in this field, showing the advantages of using these methodologies in competence-based learning.

Web As Corpus

Author: Maristella Gatto
Publisher: A&C Black
ISBN: 1441134131

Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the "web as corpus". It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

Parallel Text Processing

Author: Jean Véronis
Publisher: Springer Science & Business Media
ISBN: 9780792365464

With the rising importance of multilingualism in language industries, brought about by global markets and world-wide information exchange, parallel corpora, i.e. corpora of texts accompanied by their translation, have become key resources in the development of natural language processing tools. The applications based upon parallel corpora are numerous and growing in number: multilingual lexicography and terminology, machine and human translation, cross-language information retrieval, language learning, etc. The book's chapters have been commissioned from major figures in the field of parallel corpus building and exploitation, with the aim of showing the state of the art in parallel text alignment and use ten to fifteen years after the first parallel-text alignment techniques were developed. Within the book, the following broad themes are addressed: (i) techniques for the alignment of parallel texts at various levels such as sentence, clause, and word; (ii) the use of parallel texts in fields as diverse as translation, lexicography, and information retrieval; (iii) available corpus resources and the evaluation of alignment methods. The book will be of interest to researchers and advanced students of computational linguistics, terminology, lexicography and translation, both in academia and industry.

Corpus Based Language Studies

Author: Tony McEnery
Publisher: Taylor & Francis
ISBN: 9780415286237

Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.

Applications Of Finite State Language Processing

Author: Tamás Váradi
Publisher: Cambridge Scholars Publishing
ISBN: 1443826030

NooJ is both a corpus processing tool and a linguistic development environment. It allows linguists to formalize several levels of linguistic phenomena: orthography and spelling, lexicons for simple words, multiword units and frozen expressions, inflectional, derivational and productive morphology, local, structural syntax and transformational syntax. For each of these levels, NooJ provides linguists with one or more formal tools specifically designed to facilitate the description of each phenomenon, as well as parsing tools designed to be as computationally efficient as possible. This approach distinguishes NooJ from most computational linguistic tools, which provide a single formalism that should describe everything. As a corpus processing tool, NooJ allows users to apply sophisticated linguistic queries to large corpora in order to build indices and concordances, annotate texts automatically, perform statistical analyses, etc. NooJ is freely available and linguistic modules can already be downloaded for Acadian, Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian, French, English, German, Hebrew, Greek, Hungarian, Italian, Polish, Portuguese, Spanish and Turkish. The present volume contains papers from the 2008 International NooJ conference, which was held 8–10 June 2008 in Budapest. While the focus of the Budapest conference was on making NooJ compatible with other applications, the papers vary with respect to whether they regard Natural Language Processing (NLP) as a research goal or as a tool. However, each presents a slightly different problem, either in the field of NLP or in one that can be solved using NLP, or presents a new development in the tool itself. The range of problems dealt with in the volume is quite varied, which should help readers find contributions relevant to their field of interest.

Treebanks

Author: A. Abeillé
Publisher: Springer Science & Business Media
ISBN: 9401002010

This book surveys the state of the art in work on parsed corpora. It gathers 21 papers on building and using parsed corpora, raising many relevant questions, and covers a variety of languages and corpora. It is for those working in linguistics, computational linguistics, natural language, syntax, and grammar.