Alignment-based profiling of Europarl data in an English-Swedish parallel corpus
Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology. (HCS)
2010 (English) In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) / [ed] Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias, Paris, France: European Language Resources Association (ELRA), 2010, 3398-3404 p. Conference paper (Refereed)
Abstract [en]

This paper profiles the Europarl part of an English-Swedish parallel corpus and compares it with three other subcorpora of the same parallel corpus. We first describe our method for comparison, which is based on alignments, both at the token level and the structural level. Although two of the other subcorpora contain fiction, it is found that the Europarl part is the one having the highest proportion of many types of restructurings, including additions, deletions and long distance reorderings. We explain this by the fact that the majority of Europarl segments are parallel translations.

Place, publisher, year, edition, pages
Paris, France: European Language Resources Association (ELRA), 2010. 3398-3404 p.
Keyword [en]
parallel corpora, profiling, translation, English, Swedish
National Category
Language Technology (Computational Linguistics)
URN: urn:nbn:se:liu:diva-60039
ISBN: 2-9517408-6-7
OAI: diva2:354794
Available from: 2010-10-05 Created: 2010-10-04 Last updated: 2010-10-18 Bibliographically approved

Open Access in DiVA

fulltext (418 kB), 174 downloads
File information
File name: FULLTEXT01.pdf
File size: 418 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

By author/editor
Ahrenberg, Lars