Contribution to Terminology Internationalization by Word Alignment in Parallel Corpora
INSERM, U729, Paris, France.
Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
INSERM, U729, Paris, France.
2006 (English). In: AMIA 2006 Symposium Proceedings, Washington D.C., USA: AMIA, 2006, 185-189 p. Conference paper (Refereed)
Abstract [en]

Background and objectives

Creating a complete translation of a large vocabulary is a time-consuming task that requires skilled and knowledgeable medical translators. Our goal is to examine to what extent this task can be eased by a specific natural language processing technique: word alignment in parallel corpora. We experiment with translation from English to French.


Methods

We built a large corpus of parallel English-French documents and automatically aligned it at the document, sentence, and word levels using state-of-the-art alignment methods and tools. We then projected English terms from existing controlled vocabularies onto the aligned word pairs and examined the number and quality of the putative French translations obtained. We considered three American vocabularies present in the UMLS, each with a different translation status: MeSH, SNOMED CT, and the MedlinePlus Health Topics.
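The projection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `project_terms`, the input layout (token lists plus a set of word-alignment links), and the heuristic of taking the smallest French span that covers all linked words are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def project_terms(terms, aligned_sentences):
    """Collect candidate French translations for English terms.

    terms: iterable of English terms (possibly multiword strings).
    aligned_sentences: iterable of (en_tokens, fr_tokens, links), where
        links is a set of (i, j) pairs meaning en_tokens[i] is aligned
        to fr_tokens[j]. This layout is hypothetical.
    Returns a dict mapping each matched term to a Counter of candidates.
    """
    term_tokens = {t: t.lower().split() for t in terms}
    candidates = defaultdict(Counter)
    for en, fr, links in aligned_sentences:
        en_lower = [w.lower() for w in en]
        for term, toks in term_tokens.items():
            n = len(toks)
            # Look for every occurrence of the term in the English sentence.
            for start in range(len(en_lower) - n + 1):
                if en_lower[start:start + n] != toks:
                    continue
                # French positions linked to any token of this occurrence.
                fr_pos = sorted({j for i, j in links if start <= i < start + n})
                if not fr_pos:
                    continue
                # Assumption: use the smallest French span covering all links.
                span = fr[fr_pos[0]:fr_pos[-1] + 1]
                candidates[term][" ".join(span).lower()] += 1
    return dict(candidates)


# Toy aligned sentence pair (invented for this example).
corpus = [(
    ["Heart", "failure", "is", "common"],
    ["L'", "insuffisance", "cardiaque", "est", "fréquente"],
    {(0, 2), (1, 1), (2, 3), (3, 4)},
)]
example = project_terms(["heart failure", "asthma"], corpus)
```

Counting candidates per term gives a reviewer a ranked list to validate, which is in line with the human reviewing effort mentioned in the abstract; a real system would also need filtering of noisy alignments.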


Results

We obtained several thousand new translations of our input terms; this number is closely linked to the number of terms in the input vocabularies.


Conclusions

Our study shows that alignment methods can extract a number of new term translations from large bodies of text with moderate human reviewing effort, and can thus help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include applying these methods to a corpus 20 times larger than the one used here, together with more focused methods for term extraction.

Place, publisher, year, edition, pages
Washington D.C., USA: AMIA, 2006. 185-189 p.
National Category
Computer Science
URN: urn:nbn:se:liu:diva-35773
PubMedID: 17238328
Local ID: 28508
OAI: diva2:256621
AMIA 2006 Annual Symposium, Washington, DC, USA, November 11, 2006 - November 15, 2006
Available from: 2009-10-10 Created: 2009-10-10 Last updated: 2012-09-12

By author/editor
Merkel, Magnus