Building And Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288
Size: 50.70 MB
Format: PDF, Mobi

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than are translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Web As Corpus

Author: Maristella Gatto
Publisher: A&C Black
ISBN: 1441134131
Size: 19.11 MB
Format: PDF, Docs

Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the "web as corpus". It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

Applications Of Finite State Language Processing

Author: Tamás Váradi
Publisher: Cambridge Scholars Publishing
ISBN: 1443826030
Size: 60.59 MB
Format: PDF, Docs

NooJ is both a corpus processing tool and a linguistic development environment. It allows linguists to formalize several levels of linguistic phenomena: orthography and spelling, lexicons for simple words, multiword units and frozen expressions, inflectional, derivational and productive morphology, local, structural syntax and transformational syntax. For each of these levels, NooJ provides linguists with one or more formal tools specifically designed to facilitate the description of each phenomenon, as well as parsing tools designed to be as computationally efficient as possible. This approach distinguishes NooJ from most computational linguistic tools, which provide a single formalism that is supposed to describe everything. As a corpus processing tool, NooJ allows users to apply sophisticated linguistic queries to large corpora in order to build indices and concordances, annotate texts automatically, perform statistical analyses, etc. NooJ is freely available and linguistic modules can already be downloaded for Acadian, Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian, French, English, German, Hebrew, Greek, Hungarian, Italian, Polish, Portuguese, Spanish and Turkish. The present volume contains papers from the 2008 International NooJ conference, which was held on 8–10 June 2008 in Budapest. While the focus of the Budapest conference was on making NooJ compatible with other applications, the papers vary with respect to whether they regard Natural Language Processing (NLP) as a research goal or as a tool. However, each presents a slightly different problem, either in the field of NLP or in one that can be solved using NLP, or presents a new development in the tool itself. The range of problems dealt with in the volume is quite varied, which should enable readers to find contributions that are relevant to their field of interest.

Multilingual Processing In Eastern And Southern EU Languages

Author: Cristina Vertan
Publisher: Cambridge Scholars Pub
Size: 58.27 MB
Format: PDF

This volume draws attention to many specific challenges of multilingual processing within the European Union, especially after the recent successive enlargements. Most of the languages considered herein are not only less resourced in terms of processing tools and training data, but also have features which are different from the well-known international language pairs. The 16 contributions address specific problems and solutions for languages from south-eastern and central Europe in the context of multilingual communication, translation and information retrieval.

Das Deutsche Als Kompositionsfreudige Sprache

Author: Livio Gaeta
Publisher: Walter de Gruyter
ISBN: 311027843X
Size: 48.34 MB
Format: PDF

Compounding occupies a central position in the word formation of the Germanic languages (and not only of these) and is regarded, in German in particular, as a highly productive word-formation pattern. Despite the wealth of literature on compounding in general, a comprehensive examination and account of compounding in German remains a desideratum in current research, even within German-language German studies. Following on from earlier studies, new aspects for investigation now arise from both language-specific and cross-linguistic perspectives, not least from current methodological possibilities and research questions, for example in corpus linguistics and psycholinguistics. The volume examines compounding in German from different perspectives and thereby aims to help fill this gap. The contributions discuss questions that arise from a language-internal, structural perspective as well as broader, system-related aspects of compounding in German. These include, among others, the delimitation of compounding from other word-formation processes and from syntactic processes, reflections on the structure and interpretation of the underlying morphological units and their representation in the mental lexicon, as well as the graphematic dimension of compounding research and questions arising from current developments in text linguistics and discourse theory.

Corpora In Translator Education

Author: Federico Zanettin
Publisher: Routledge
ISBN: 1317641353
Size: 37.93 MB
Format: PDF, ePub

The use of language corpora as a resource in linguistics and language-related disciplines is now well-established. One of the many fields where the impact of corpora has been growing in recent years is translation, both at a descriptive and a practical level. The papers in this volume, which grew out of presentations at the conference Cult2k (Bertinoro, Italy, 2000), the second in the series Corpus Use and Learning to Translate, are principally concerned with the use of corpora as resources for the translator and as teaching and learning aids in the context of the translation classroom. This book offers a cross-section of research by some leading scholars in the field, who offer accounts of first-hand experience and theoretical insights into the various ways of building and using appropriate corpora in translation teaching, for the benefit of teachers and learners alike. The various contributions provide a rich source of inspiration for other researchers and practitioners concerned with 'corpora in translator education'. Contributors include Stig Johansson, Tony McEnery, Kirsten Malmkjær, Jennifer Pearson, Lynne Bowker, Krista Varantola, Belinda Maia and a number of other scholars.

Introducing Corpora In Translation Studies

Author: Maeve Olohan
Publisher: Routledge
ISBN: 1134492219
Size: 22.54 MB
Format: PDF, Mobi

The use of corpora in translation studies, both as a tool for translators and as a way of analyzing the process of translation, is growing. This book provides a much-needed assessment of how the analysis of corpus data can make a contribution to the study of translation. Introducing Corpora in Translation Studies: traces the development of corpus methods within translation studies; defines the types of corpora used for translation research, discussing their design and application and presenting tools for extracting and analyzing data; examines research potential and methodological limitations; considers some uses of corpora by translators and in translator training; and features research questions, case studies and discussion points to provide a practical guide to using corpora in translation studies. Offering a comprehensive account of the use of corpora by today's translators and researchers, Introducing Corpora in Translation Studies is the definitive guide to a fast-developing area of study.