Holmqvist, Maria
Publications (10 of 14)
Holmqvist, M., Stymne, S., Ahrenberg, L. & Merkel, M. (2012). Alignment-based reordering for SMT. In: Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12). Paper presented at The Eight International Conference on Language Resources and Evaluation (LREC'12), May 2012, Istanbul, Turkey (pp. 3436-3440).
2012 (English). In: Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12), 2012, p. 3436-3440. Conference paper, Published paper (Other academic)
Abstract [en]

We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the source text according to the target word order suggested by an initial word alignment. The reordered text is used to create a second word alignment which can be an improvement of the first alignment, since the word order is more similar. The method requires no other pre-processing such as part-of-speech tagging or parsing. We report improved Bleu scores for English-to-German and English-to-Swedish translation. We also examined the effect on word alignment quality and found that the reordering method increased recall while lowering precision, which partly can explain the improved Bleu scores. A manual evaluation of the translation output was also performed to understand what effect our reordering method has on the translation system. We found that where the system employing reordering differed from the baseline in terms of having more words, or a different word order, this generally led to an improvement in translation quality.
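The reordering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `reorder_source`, the use of the mean target position for words with several links, and the rule that unaligned words follow the nearest preceding aligned word are all assumptions made here for illustration, since the abstract does not specify these details.

```python
from collections import defaultdict


def reorder_source(source_tokens, alignment):
    """Reorder source tokens to follow the target order implied by an alignment.

    alignment: iterable of (source_index, target_index) pairs, 0-based.
    Assumption of this sketch: an unaligned source word keeps the sort key
    of the nearest preceding aligned word, so it moves together with it.
    """
    targets = defaultdict(list)
    for s, t in alignment:
        targets[s].append(t)

    keys = []
    last_key = -1.0  # unaligned words at the very start sort first
    for i in range(len(source_tokens)):
        if i in targets:
            # Assumption: use the mean target position when a word has several links.
            last_key = sum(targets[i]) / len(targets[i])
        keys.append((last_key, i))  # the original index breaks ties

    order = [i for _, i in sorted(keys)]
    return [source_tokens[i] for i in order]


# English source aligned to the German "ich habe ihn gesehen".
english = ["I", "have", "seen", "him"]
links = [(0, 0), (1, 1), (2, 3), (3, 2)]
print(reorder_source(english, links))
```

In the method the abstract describes, the reordered source text would then be passed to the word aligner a second time; that second alignment pass is outside the scope of this sketch.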

Keywords
Machine translation, statistical machine translation, word alignment, reordering
National Category
Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:liu:diva-80355 (URN)
Conference
The Eight International Conference on Language Resources and Evaluation (LREC'12), May 2012, Istanbul, Turkey
Available from: 2012-08-23 Created: 2012-08-23 Last updated: 2018-01-12
Holmqvist, M. & Ahrenberg, L. (2011). A Gold Standard for English-Swedish Word Alignment. In: Bolette Sandford Pedersen, Gunta Nepore and Inguna Skadina (Ed.), Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Paper presented at NODALIDA 2011: 18th Nordic Conference of Computational Linguistics, May 11-13 2011, Riga, Latvia (pp. 106-113). Tartu, Estonia
2011 (English). In: Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011 / [ed] Bolette Sandford Pedersen, Gunta Nepore and Inguna Skadina, Tartu, Estonia, 2011, p. 106-113. Conference paper, Poster (with or without abstract) (Other academic)
Abstract [en]

Word alignment gold standards are an important resource for developing and evaluating word alignment methods. In this paper we present a free English–Swedish word alignment gold standard consisting of texts from Europarl with manually verified word alignments. The gold standard contains two sets of word aligned sentences, a test set for the purpose of evaluation and a training set that can be used for supervised training. The guidelines used for English–Swedish alignment were created based on guidelines for other language pairs and with statistical machine translation as the targeted application. We also present results of intrinsic evaluation using our gold standard and discuss the relationship to extrinsic evaluation in a statistical machine translation system.

Place, publisher, year, edition, pages
Tartu, Estonia, 2011
Series
NEALT Proceedings Series, ISSN 1736-6305 ; 11
Keywords
Machine translation, Evaluation, Gold standard, Word alignment
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-80286 (URN)
Conference
NODALIDA 2011: 18th Nordic Conference of Computational Linguistics, May 11-13 2011, Riga, Latvia
Projects
Multilingual extraction and term structuring
Funder
Swedish Research Council, 621-2008-4664
Available from: 2012-08-23 Created: 2012-08-23 Last updated: 2012-08-30
Holmqvist, M., Stymne, S. & Ahrenberg, L. (2011). Experiments with word alignment, normalization and clause reordering for SMT between English and German. In: Chris Callison-Burch, Philipp Koehn, Christof Monz, Omar F. Zaidan (Ed.), Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT 2011). Paper presented at The Sixth Workshop on Statistical Machine Translation (WMT 2011) (pp. 393-398).
2011 (English). In: Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT 2011) / [ed] Chris Callison-Burch, Philipp Koehn, Christof Monz, Omar F. Zaidan, 2011, p. 393-398. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents the LIU system for the WMT 2011 shared task for translation between German and English. For English–German we attempted to improve the translation tables with a combination of standard statistical word alignments and phrase-based word alignments. For German–English translation we tried to make the German text more similar to the English text by normalizing German morphology and performing rule-based clause reordering of the German text. This resulted in small improvements for both translation directions.

Keywords
Machine translation, word alignment, reordering, normalization
National Category
Language Technology (Computational Linguistics) Computer Sciences
Identifiers
urn:nbn:se:liu:diva-70129 (URN)
Conference
The Sixth Workshop on Statistical Machine Translation (WMT 2011)
Available from: 2011-08-19 Created: 2011-08-19 Last updated: 2018-01-12
Herdagdelen, A., Ciaramita, M., Mahler, D., Holmqvist, M., Hall, K., Riezler, S. & Alfonseca, E. (2010). Generalized syntactic and semantic models of query reformulation. In: SIGIR '10 Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. Paper presented at 33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR 2010), 19-23 July 2010, Geneva, Switzerland (pp. 283-290). ACM Press
2010 (English). In: SIGIR '10 Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, ACM Press, 2010, p. 283-290. Conference paper, Published paper (Other academic)
Abstract [en]

We present a novel approach to query reformulation which combines syntactic and semantic information by means of generalized Levenshtein distance algorithms where the substitution operation costs are based on probabilistic term rewrite functions. We investigate unsupervised, compact and efficient models, and provide empirical evidence of their effectiveness. We further explore a generative model of query reformulation and supervised combination methods providing improved performance at variable computational costs. Among other desirable properties, our similarity measures incorporate information-theoretic interpretations of taxonomic relations such as specification and generalization.
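As an illustration of the general idea of a Levenshtein distance with term-dependent substitution costs, here is a minimal Python sketch over query tokens. It does not reproduce the paper's models: the names `generalized_levenshtein` and `make_sub_cost`, the unit insertion and deletion costs, the toy rewrite probabilities, and the mapping from a rewrite probability p to a cost of 1 - p are all assumptions chosen for this example.

```python
def generalized_levenshtein(a, b, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Edit distance between token sequences a and b.

    sub_cost(x, y) returns the cost of rewriting token x as token y.
    Unit insertion and deletion costs are an assumption of this sketch.
    """
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                       # delete a[i-1]
                dist[i][j - 1] + ins_cost,                       # insert b[j-1]
                dist[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute
            )
    return dist[-1][-1]


def make_sub_cost(rewrite_prob):
    """Build a substitution cost from hypothetical rewrite probabilities.

    Assumption: cost = 1 - P(x -> y), and unseen pairs get the full cost 1.
    """
    def sub_cost(x, y):
        if x == y:
            return 0.0
        return 1.0 - rewrite_prob.get((x, y), 0.0)
    return sub_cost


# Toy probabilities, invented for the example.
cost = make_sub_cost({("cheap", "inexpensive"): 0.75})
print(generalized_levenshtein(["cheap", "flights"], ["inexpensive", "flights"], cost))
print(generalized_levenshtein(["cheap", "flights"], ["cheap", "hotels"], cost))
```

With these toy values, a reformulation that swaps in a likely rewrite is scored as closer to the original query than one that swaps in an unrelated term.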

Place, publisher, year, edition, pages
ACM Press, 2010
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-60173 (URN), 10.1145/1835449.1835498 (DOI), 978-1-4503-0153-4 (ISBN)
Conference
33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR 2010), 19-23 July 2010, Geneva, Switzerland
Available from: 2010-10-07 Created: 2010-10-07 Last updated: 2012-12-06 Bibliographically approved
Holmqvist, M. (2010). Heuristic Word Alignment with Parallel Phrases. In: Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias (Ed.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10). Paper presented at the Seventh conference on International Language Resources and Evaluation. European Language Resources Association (ELRA)
2010 (English). In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) / [ed] Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias, European Language Resources Association (ELRA), 2010. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
European Language Resources Association (ELRA), 2010
National Category
Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:liu:diva-60172 (URN), 2-9517408-6-7 (ISBN)
Conference
the Seventh conference on International Language Resources and Evaluation
Available from: 2010-10-07 Created: 2010-10-07 Last updated: 2018-01-12 Bibliographically approved
De Bona, F., Riezler, S., Hall, K., Ciaramita, M., Herdagdelen, A. & Holmqvist, M. (2010). Learning dense models of query similarity from user click logs. In: HLT '10: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Paper presented at The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 474-482).
2010 (English). In: HLT '10: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010, p. 474-482. Conference paper, Published paper (Refereed)
National Category
Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:liu:diva-60175 (URN), 1-932432-65-5 (ISBN)
Conference
The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Available from: 2010-10-07 Created: 2010-10-07 Last updated: 2018-01-12 Bibliographically approved
Stymne, S., Holmqvist, M. & Ahrenberg, L. (2010). Vs and OOVs: Two Problems for Translation between German and English. In: Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR (WMT'10). Paper presented at The Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 15-16 July 2010 Uppsala, Sweden (pp. 183-188).
2010 (English). In: Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR (WMT'10), 2010, p. 183-188. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we report on experiments with three preprocessing strategies for improving translation output in a statistical MT system. In training, two reordering strategies were studied: (i) reorder on the basis of the alignments from Giza++, and (ii) reorder by moving all verbs to the end of segments. In translation, out-of-vocabulary words were preprocessed in a knowledge-lite fashion to identify a likely equivalent. All three strategies were implemented for our English-German systems submitted to the WMT10 shared task. Combining them led to improvements in both language directions.

Keywords
Machine translation, reordering, Out-of-vocabulary words
National Category
Language Technology (Computational Linguistics) Computer Sciences
Identifiers
urn:nbn:se:liu:diva-58979 (URN), 978-1-932432-71-8 (ISBN), 1-932432-71-X (ISBN)
Conference
The Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 15-16 July 2010 Uppsala, Sweden
Available from: 2010-09-03 Created: 2010-09-03 Last updated: 2018-01-12 Bibliographically approved
Holmqvist, M., Stymne, S., Jody, F. & Ahrenberg, L. (2009). Improving alignment for SMT by reordering and augmenting the training corpus. In: Proceedings of the Fourth Workshop on Statistical Machine Translation (WMT09). Paper presented at The Fourth Workshop on Statistical Machine Translation (WMT09) (pp. 120-124). Athens, Greece
2009 (English). In: Proceedings of the Fourth Workshop on Statistical Machine Translation (WMT09), Athens, Greece, 2009, p. 120-124. Conference paper, Published paper (Refereed)
Abstract [en]

We describe the LIU systems for English-German and German-English translation in the WMT09 shared task. We focus on two methods to improve the word alignment: (i) by applying Giza++ in a second phase to a reordered training corpus, where reordering is based on the alignments from the first phase, and (ii) by adding lexical data obtained as high-precision alignments from a different word aligner. These methods were studied in the context of a system that uses compound processing, a morphological sequence model for German, and a part-of-speech sequence model for English. Both methods gave some improvements to translation quality as measured by Bleu and Meteor scores, though not consistently. All systems used both out-of-domain and in-domain data as the mixed corpus had better scores in the baseline configuration.

Place, publisher, year, edition, pages
Athens, Greece, 2009
Keywords
Machine translation, reordering, word alignment
National Category
Language Technology (Computational Linguistics) Computer Sciences
Identifiers
urn:nbn:se:liu:diva-58978 (URN)
Conference
The Fourth Workshop on Statistical Machine Translation (WMT09)
Available from: 2010-09-03 Created: 2010-09-03 Last updated: 2018-01-12
Stymne, S., Holmqvist, M. & Ahrenberg, L. (2008). Effects of Morphological Analysis in Translation between German and English. In: Proceedings of the Third Workshop on Statistical Machine Translation. Paper presented at ACL 2008 Third Workshop on Statistical Machine Translation, Columbus, Ohio, USA, 19 June, 2008 (pp. 135-138). Stroudsburg, PA, USA: Association for Computational Linguistics
2008 (English). In: Proceedings of the Third Workshop on Statistical Machine Translation, Stroudsburg, PA, USA: Association for Computational Linguistics, 2008, p. 135-138. Conference paper, Published paper (Refereed)
Abstract [en]

We describe the LIU systems for German-English and English-German translation submitted to the Shared Task of the Third Workshop of Statistical Machine Translation. The main features of the systems, as compared with the baseline, are the use of morphological pre- and post-processing, and a sequence model for German using morphologically rich parts-of-speech. It is shown that these additions lead to improved translations.

Place, publisher, year, edition, pages
Stroudsburg, PA, USA: Association for Computational Linguistics, 2008
Keywords
computational linguistics, statistical machine translation
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-44127 (URN), 75721 (Local ID), 978-1-932432-09-1 (ISBN), 75721 (Archive number), 75721 (OAI)
Conference
ACL 2008 Third Workshop on Statistical Machine Translation, Columbus, Ohio, USA, 19 June, 2008
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-01-12
Holmqvist, M., Ahrenberg, L. & Stymne, S. (2008). High-precision Word Alignment with Parallel Phrases. In: The second Swedish Language Technology Conference SLTC-08, 2008 (pp. 45-46).
2008 (English). In: The second Swedish Language Technology Conference SLTC-08, 2008, 2008, p. 45-46. Conference paper, Published paper (Refereed)
Keywords
computational linguistics, word alignment
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-44125 (URN), 75719 (Local ID), 75719 (Archive number), 75719 (OAI)
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2018-01-12