Definite Noun Phrases in Statistical Machine Translation into Danish
2009 (English). In: Proceedings of the Workshop on Extracting and Using Constructions in NLP / [ed] Magnus Sahlgren and Ola Knutsson, 2009, p. 4-9. Conference paper (Refereed)
There are two ways to express definiteness in Danish, which makes statistical machine translation (SMT) from English problematic, since the wrong realisation can be chosen. We present a part-of-speech-based method for identifying and transforming English definite NPs that would likely be expressed in a different way in Danish. The transformed English is used for training a phrase-based SMT system. This technique gives significant improvements in translation quality, of up to 22.1% relative on Bleu, compared to a baseline trained on original English, in two different domains.
Place, publisher, year, edition, pages
2009. 4-9 p.
SICS Technical Report, ISSN 1100-3154 ; T2009:10
Statistical machine translation, definiteness, nouns, Scandinavian languages
Language Technology (Computational Linguistics); Computer Science
Identifiers: URN: urn:nbn:se:liu:diva-53955; OAI: oai:DiVA.org:liu-53955; DiVA: diva2:294003
Workshop on Extracting and Using Constructions in NLP, May 14, Odense, Denmark