liu.se: Search for publications in DiVA
  • 1.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Holmer, Daniel
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Holmlid, Stefan
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Jönsson, Arne
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Analysing changes in official use of the design concept using SweCLARIN resources. 2023. In: Selected papers from the CLARIN Annual Conference 2022 / [ed] Tomaž Erjavec and Maria Eskevich, Linköping University Electronic Press, 2023. Chapter in book (Refereed)
    Abstract [en]

    We investigate changes in the use of four Swedish words from the fields of design and architecture. It has been suggested that their meanings have been blurred, especially in governmental reports and policy documents, so that distinctions between them that are important to stakeholders in the respective fields are lost. Specifically, we compare usage in two governmental public reports on design, one from 1999 and the other from 2015, and additionally in opinion responses to the 2015 report. Our approach is to contextualise occurrences of the words in different representations of the texts using word embeddings, topic modelling and sentiment analysis. Tools and language resources developed within the SweClarin infrastructure have been crucial for the implementation of the study.

  • 2.
    Holmer, Daniel
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Monsen, Julius
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Jönsson, Arne
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Apel, Mikael
    Sveriges Riksbank.
    Blix Grimaldi, Marianna
    The Swedish National Debt Office.
    Who said what? Speaker Identification from Anonymous Minutes of Meetings. 2023. In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) / [ed] Tanel Alumäe and Mark Fishel, 2023. Conference paper (Refereed)
  • 3.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Holmer, Daniel
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Holmlid, Stefan
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Jönsson, Arne
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Analysing Changes in Official Use of the Design Concept Using SweCLARIN Resources. 2022. In: Proceedings of the CLARIN Annual meeting, 2022. Conference paper (Refereed)
    Abstract [en]

    We show how the tools and language resources developed within the SweClarin infrastructure can be used to investigate changes in the use and understanding of the related Swedish words arkitektur, design, form, and formgivning. Specifically, we compare their use in two governmental public reports on design, one from 1999 and the other from 2015. We test the hypothesis that their meaning has developed in a way that blurs distinctions that may be important to stakeholders in the respective fields.

  • 4.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Att vara Lars: Några tankar om språkteknologi och socioonomastik. 2022. In: Live and learn: Festschrift in honor of Lars Borin / [ed] Elena Volodina, Dana Dannélls, Aleksandrs Berdicevskis, Markus Forsberg, Shafqat Virk, Göteborg: Göteborgs universitet, 2022, p. 1-4. Chapter in book (Other academic)
    Abstract [en]

    Since the SweClarin project began in 2015, its resources in terms of data and tools have been used in many different projects, including linguistics. A research area where they have been less employed is the study of names. In this paper I suggest that language technology and general corpora can be used to contribute to the sociological study of personal names, and offer a few examples. As is fitting for the occasion, I take Lars as the point of departure.

  • 5.
    Axelsson, Bodil
    et al.
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Culture and Society, Division of Culture, Society, Design and Media.
    Holmer, Daniel
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Jönsson, Arne
    Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Studying Emerging New Contexts for Museum Digitisations on Pinterest. 2021. In: Selected Papers from the CLARIN Annual Conference 2020 / [ed] Costanza Navarretta and Maria Eskevich, 2021, p. 24-36. Conference paper (Refereed)
    Abstract [en]

    In a SweClarin cooperation project we apply topic modelling to the texts found with pins in Pinterest boards. The data in focus are digitisations of Viking Age finds from the Swedish History Museum and the underlying research question is how they are given new contextual meanings in boards. We illustrate how topic modelling can support interpretation of polysemy and culturally situated meanings. It expands on the employment of topic modelling by accentuating the necessity of interpretation in every step of the process from capturing and cleaning the data, to modelling and visualisation. The paper concludes that the national context of digitisations of Viking Age jewellery in the Swedish History Museum’s collection management system is replaced by several transnational contexts in which Viking Age jewellery is appreciated for its symbolical meanings and decorative functions in contemporary genres for re-imagining, reliving and performing European pasts and mythologies. The emerging contexts on Pinterest also highlight the business opportunities involved in genres such as reenactment, neo-paganism, lajv and fantasy. The boards are clues to how digitisations serve as prototypes for replicas.

  • 6.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Translation Competence in Machines: A Study of Adjectives in English-Swedish Translation. 2021. In: Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age / [ed] Yuri Bizzoni, Elke Teich, Cristina España-Bonet and Josef van Genabith, 2021, p. 57-65. Conference paper (Refereed)
    Abstract [en]

    Recent improvements in neural machine translation call for increased efforts on qualitative evaluations so as to get a better understanding of differences in translation competence between human and machine. This paper reports the results of a study of 1170 adjectives in translation from English to Swedish, using the Parallel Universal Dependencies Treebanks for these languages. The comparison covers two dimensions: the types of solutions employed and the incidence of debatable or incorrect translations. It is found that the machine translation uses all of the solution types that the human translation does, but in different proportions and less competently.

  • 7.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Olsson, Leif-Jöran
    University of Gothenburg, Sweden.
    Frid, Johan
    Lund University, Sweden.
    A New Gold Standard for Swedish NERC. 2019. In: Proceedings of the CLARIN Annual Conference 2019 / [ed] Kiril Simov, Maria Eskevich, CLARIN, 2019, p. 112-115. Conference paper (Refereed)
    Abstract [en]

    Starting in 2018, Swe-Clarin members are working cross-institutionally on special themes. In this paper we report ongoing work in a project aimed at the creation of a new gold standard for Swedish Named-Entity Recognition and Categorisation. In contrast to previous efforts, the new resource will contain data from both social media and edited text. The resource will be made freely available through Språkbanken Text.

  • 8.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Danielsson, Henrik
    Linköping University, The Swedish Institute for Disability Research. Linköping University, Department of Behavioural Sciences and Learning, Disability Research. Linköping University, Faculty of Arts and Sciences.
    Bengtsson, Staffan
    The Swedish Institute for Disability Research, Jönköping University, Sweden.
    Arvå, Hampus
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Holme, Lotta
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Jönsson, Arne
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Studying Disability Related Terms with Swe-Clarin Resources. 2019. Conference paper (Refereed)
    Abstract [en]

    In Swedish, as in other languages, the words used to refer to disabilities and people with disabilities are manifold. Recommendations as to which terms to use have been changed several times over the last hundred years. In this exploratory paper we have used textual resources provided by Swe-Clarin to study such changes quantitatively. We demonstrate that old and new recommendations co-exist for long periods of time, and that usage sometimes converges.

  • 9.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Towards an adequate account of parataxis in Universal Dependencies. 2019. In: Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019) / [ed] Alexandre Rademaker, Francis Tyers, Association for Computational Linguistics, 2019. Conference paper (Refereed)
    Abstract [en]

    The parataxis relation as defined for Universal Dependencies 2.0 is general and, for this reason, sometimes hard to distinguish from competing analyses, such as coordination, conj, or apposition, appos. The specific subtypes that are listed for parataxis are also quite different in character. In this study we first show that the actual practice by UD-annotators is varied, using the parallel UD (PUD-) treebanks as data. We then review the current definitions and guidelines and suggest improvements.

  • 10.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Comparing machine translation and human translation: A case study. 2017. In: RANLP 2017 The First Workshop on Human-Informed Translation and Interpreting Technology (HiT-IT) Proceedings of the Workshop, September 7th, 2017 / [ed] Irina Temnikova, Constantin Orasan, Gloria Corpas and Stephan Vogel, Shoumen, Bulgaria: Association for Computational Linguistics, 2017, p. 21-28. Conference paper (Refereed)
    Abstract [en]

    As machine translation technology improves, comparisons to human performance are often made in quite general and exaggerated terms. Thus, it is important to be able to account for differences accurately. This paper reports a simple, descriptive scheme for comparing translations and applies it to two translations of a British opinion article published in March, 2017. One is a human translation (HT) into Swedish, and the other a machine translation (MT). While the comparison is limited to one text, the results are indicative of current limitations in MT.

  • 11.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Swedish prepositions are not pure function words. 2017. In: Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies, 22 May, Gothenburg, Sweden / [ed] Marie-Catherine de Marneffe, Joakim Nivre, and Sebastian Schuster, Linköping: Linköping University Electronic Press, 2017, Vol. 135, p. 11-18. Conference paper (Refereed)
    Abstract [en]

    As for any categorial scheme used for annotation, UD abounds with borderline cases. The main instruments to resolve them are the UD design principles and, of course, the linguistic facts of the matter. UD makes a fundamental distinction between content words and function words, and a, perhaps less fundamental, distinction between pure function words and the rest. It has been suggested that adpositions are to be included among the pure function words. In this paper I discuss the case of prepositions in Swedish and related languages in the light of these distinctions. It relates to a more general problem: How should we resolve cases where the linguistic intuitions and UD design principles are in conflict?

  • 12.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Converting an English-Swedish Parallel Treebank to Universal Dependencies. 2015. In: Proceedings of the Third International Conference on Dependency Linguistics (DepLing 2015), Association for Computational Linguistics, 2015, p. 10-19, article id W15-2103. Conference paper (Refereed)
    Abstract [en]

    The paper reports experiences of automatically converting the dependency analysis of the LinES English-Swedish parallel treebank to universal dependencies (UD). The most tangible result is a version of the treebank that actually employs the relations and parts-of-speech categories required by UD, and no other. It is also more complete in that punctuation marks have received dependencies, which is not the case in the original version. We discuss our method in the light of problems that arise from the desire to keep the syntactic analyses of a parallel treebank internally consistent, while available monolingual UD treebanks for English and Swedish diverge somewhat in their use of UD annotations. Finally, we compare the output from the conversion program with the existing UD treebanks.

  • 13.
    Lahoud, Inaya
    et al.
    University of Galatasaray, Turkey.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Summary of the Workshop on Educational Knowledge Management. 2015. In: Knowledge Engineering and Knowledge Management, EKAW 2014, Springer-Verlag Berlin, 2015, Vol. 8982, p. 64-65. Conference paper (Refereed)
  • 14.
    Haglund, Jesper
    et al.
    Linköping University, Department of Social and Welfare Studies, Learning, Aesthetics, Natural science. Linköping University, Faculty of Arts and Sciences.
    Jeppsson, Fredrik
    Linköping University, Department of Social and Welfare Studies, Learning, Aesthetics, Natural science. Linköping University, Faculty of Educational Sciences.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Taking advantage of the "Big Mo": Momentum in everyday English and Swedish and in physics teaching. 2015. In: Research in science education, ISSN 0157-244X, E-ISSN 1573-1898, Vol. 45, no 3, p. 345-365. Article in journal (Refereed)
    Abstract [en]

    Science education research suggests that our everyday intuitions of motion and interaction of physical objects fit well with how physicists use the term “momentum”. Corpus linguistics provides an easily accessible approach to study language in different domains, including everyday language. Analysis of language samples from English text corpora reveals a trend of increasing metaphorical use of “momentum” in non-science domains, and through conceptual metaphor analysis, we show that the use of the word in everyday language, as opposed to for instance “force”, is largely adequate from a physics point of view. In addition, “momentum” has recently been borrowed into Swedish as a metaphor in domains such as sports, politics and finance, with meanings similar to those in physics. As an implication for educational practice, we find support for the suggestion to introduce the term “momentum” to English-speaking pupils at an earlier age than what is typically done in the educational system today, thereby capitalising on their intuitions and experiences of everyday language. For Swedish-speaking pupils, and possibly also relevant to other languages, the parallel between “momentum” and the corresponding physics term in the students’ mother tongue could be made explicit.

  • 15.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Alignment. 2014. In: Routledge Encyclopedia of Translation Technology / [ed] Chan Sin-wai, London and New York: Routledge, 2014, p. 395-408. Chapter in book (Refereed)
  • 16.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Chunk Accuracy: A Simple, Flexible Metric for Translation Quality. 2014. In: LREC 2014 - Ninth International Conference on Language Resources and Evaluation, European Language Resources Association, 2014. Conference paper (Refereed)
  • 17.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Towards a research infrastructure for translation studies. 2014. Conference paper (Other academic)
    Abstract [en]

    In principle the CLARIN research infrastructure provides a good environment to support research on translation. In reality, the progress within CLARIN in this area seems to be fairly slow. In this paper I will give examples of the resources currently available, and suggest what is needed to achieve a relevant research infrastructure for translation studies. Also, I argue that translation studies has more to gain from language technology, and statistical machine translation in particular, than what is generally assumed, and give some examples.

  • 18.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    Tarvi, Ljuba
    Helsinki University.
    Translation Class Instruction as Collaboration in the Act of Translation. 2014. In: Proceedings of The 9th Workshop on Innovative Use of NLP for Building Educational Applications, Baltimore, USA, June 26th, 2014 / [ed] Joel Tetreault, Jill Burstein, Claudia Leacock, The Association for Computational Linguistics, 2014, p. 34-42. Conference paper (Refereed)
    Abstract [en]

    The paper offers an effective way of teacher-student computer-based collaboration in translation class. We show how a quantitative-qualitative method of analysis supported by word alignment technology can be applied to student translations for use in the classroom. The combined use of natural-language processing and manual techniques enables students to ‘co-emerge’ during highly motivated collaborative sessions. Within the advocated approach, students are pro-active seekers for a better translation (grade) in a teacher-centered computer-based peer-assisted translation class.

  • 19.
    Eldén, Lars
    et al.
    Linköping University, Department of Mathematics, Computational Mathematics. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Fagerlund, Martin
    Linköping University, Department of Mathematics, Computational Mathematics. Linköping University, The Institute of Technology.
    Computing Semantic Clusters by Semantic Mirroring and Spectral Graph Partitioning. 2013. In: Mathematics in Computer Science, ISSN 1661-8270, Vol. 7, p. 293-313. Article in journal (Refereed)
    Abstract [en]

    Using the technique of semantic mirroring a graph is obtained that represents words and their translations from a parallel corpus or a bilingual lexicon. The connectedness of the graph holds information about the semantic relations of words that occur in the translations. Spectral graph theory is used to partition the graph, which leads to a grouping of the words in different clusters. We illustrate the method using a small sample of seed words from a lexicon of Swedish and English adjectives and discuss its application to computational lexical semantics and lexicography.

  • 20.
    Stymne, Sara
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Cancedda, Nicola
    Xerox Research Centre Europe.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Generation of Compound Words in Statistical Machine Translation into Compounding Languages. 2013. In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 39, no 4, p. 1067-1108. Article in journal (Refereed)
    Abstract [en]

    In this article we investigate statistical machine translation (SMT) into Germanic languages, with a focus on compound processing. Our main goal is to enable the generation of novel compounds that have not been seen in the training data. We adopt a split-merge strategy, where compounds are split before training the SMT system, and merged after the translation step. This approach reduces sparsity in the training data, but runs the risk of placing translations of compound parts in non-consecutive positions. It also requires a postprocessing step of compound merging, where compounds are reconstructed in the translation output. We present a method for increasing the chances that components that should be merged are translated into contiguous positions and in the right order and show that it can lead to improvements both by direct inspection and in terms of standard translation evaluation metrics. We also propose several new methods for compound merging, based on heuristics and machine learning, which outperform previously suggested algorithms. These methods can produce novel compounds and a translation with at least the same overall quality as the baseline. For all subtasks we show that it is useful to include part-of-speech based information in the translation process, in order to handle compounds.

  • 21.
    Merkel, Magnus
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems.
    Foo, Jody
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering.
    IPhraxtor - A linguistically informed system for extraction of term candidates. 2013. In: Proceedings of the 19th Nordic Conference on Computational Linguistics (Nodalida 2013), Oslo, May 22-24, 2013, NEALT Proceedings Series 16 / [ed] Stephan Oepen, Kristin Hagen, Janne Bondi Johannesse, Linköping: Linköping University Electronic Press, 2013, Vol. 85, p. 121-132. Conference paper (Refereed)
    Abstract [en]

    In this paper a method and a flexible tool for performing monolingual term extraction is presented, based on the use of syntactic analysis where information on parts-of-speech, syntactic functions and surface syntax tags can be utilised. The standard approaches to evaluating term extraction, namely by manual evaluation of the top n term candidates or by comparing to a gold standard consisting of a list of terms from a specific domain, can have their advantages, but in this paper we try to realise a proposal by Bernier-Colborne (2012) where extracted terms are compared to a gold standard consisting of a test corpus where terms have been annotated in context. Apart from applying this evaluation to different configuratio

  • 22.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Tarvi, Ljuba
    Helsinki University, Finland.
    Natural Language Processing for the Translation Class. 2013. In: Proceedings of the second workshop on NLP for computer-assisted language learning at NODALIDA 2013: NEALT Proceedings Series 17 / [ed] Elena Volodina, Lars Borin, Hrafn Loftsson, Linköping: Linköping University Electronic Press, 2013, p. 1-10. Conference paper (Refereed)
    Abstract [en]

    We propose a system for use in translation teaching with automatic support for alignment and comparative assessment of different translations. A primary use of this system is for discussion in class and comparison of student translations from a given source text, but it may also be used to study and compare differences between published translations. We describe the intended functions of the system and give suggestions on its design and architecture. We also discuss the degree of automation that can be expected and report results from a small indicative study focused on word alignment performance.

  • 23.
    Holmqvist, Maria
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Stymne, Sara
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Alignment-based reordering for SMT. 2012. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2012, p. 3436-3440. Conference paper (Other academic)
    Abstract [en]

    We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the source text according to the target word order suggested by an initial word alignment. The reordered text is used to create a second word alignment which can be an improvement of the first alignment, since the word order is more similar. The method requires no other pre-processing such as part-of-speech tagging or parsing. We report improved Bleu scores for English-to-German and English-to-Swedish translation. We also examined the effect on word alignment quality and found that the reordering method increased recall while lowering precision, which partly can explain the improved Bleu scores. A manual evaluation of the translation output was also performed to understand what effect our reordering method has on the translation system. We found that where the system employing reordering differed from the baseline in terms of having more words, or a different word order, this generally led to an improvement in translation quality.

  • 24.
    Weiss, Sandra
    et al.
    Linköping University.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, The Institute of Technology.
    Error profiling for evaluation of machine-translated text: a Polish-English case study. 2012. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12) / [ed] Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis, Paris, France: European Language Resources Association (ELRA), 2012, p. 1764-1770. Conference paper (Other academic)
    Abstract [en]

    We present a study of Polish-English machine translation, where the impact of various types of errors on cohesion and comprehensibility of the translations was investigated. The following phenomena are in focus: (i) The most common errors produced by current state-of-the-art MT systems for Polish-English MT. (ii) The effect of different types of errors on text cohesion. (iii) The effect of different types of errors on readers' understanding of the translation. We found that errors of incorrect and missing translations are the most common for current systems, while the category of non-translated words had the most negative impact on comprehension. All three of these categories contributed to the breaking of cohesive chains. The correlation between number of errors found in a translation and number of wrong answers in the comprehension tests was low. Another result was that non-native speakers of English performed at least as well as native speakers on the comprehension tests.

  • 25.
    Stymne, Sara
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    On the practice of error analysis for machine translation evaluation. 2012. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), European Language Resources Association, 2012, p. 1786-1790. Conference paper (Refereed)
    Abstract [en]

    Error analysis is a means to assess machine translation output in qualitative terms, which can be used as a basis for the generation of error profiles for different systems. As for other subjective approaches to evaluation it runs the risk of low inter-annotator agreement, but very often in papers applying error analysis to MT, this aspect is not even discussed. In this paper, we report results from a comparative evaluation of two systems where agreement initially was low, and discuss the different ways we used to improve it. We compared the effects of using more or less fine-grained taxonomies, and the possibility to restrict analysis to short sentences only. We report results on inter-annotator agreement before and after measures were taken, on error categories that are most likely to be confused, and on the possibility to establish error profiles also in the absence of a high inter-annotator agreement.

  • 26.
    Holmqvist, Maria
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    A Gold Standard for English-Swedish Word Alignment. 2011. In: Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011 / [ed] Bolette Sandford Pedersen, Gunta Nepore and Inguna Skadina, Tartu, Estonia, 2011, p. 106-113. Conference paper (Other academic)
    Abstract [en]

    Word alignment gold standards are an important resource for developing and evaluating word alignment methods. In this paper we present a free English–Swedish word alignment gold standard consisting of texts from Europarl with manually verified word alignments. The gold standard contains two sets of word aligned sentences, a test set for the purpose of evaluation and a training set that can be used for supervised training. The guidelines used for English–Swedish alignment were created based on guidelines for other language pairs and with statistical machine translation as the targeted application. We also present results of intrinsic evaluation using our gold standard and discuss the relationship to extrinsic evaluation in a statistical machine translation system.

  • 27.
    Holmqvist, Maria
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Stymne, Sara
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Experiments with word alignment, normalization and clause reordering for SMT between English and German. 2011. In: Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT 2011) / [ed] Chris Callison-Burch, Philipp Koehn, Christof Monz, Omar F. Zaidan, 2011, p. 393-398. Conference paper (Refereed)
    Abstract [en]

    This paper presents the LIU system for the WMT 2011 shared task for translation between German and English. For English–German we attempted to improve the translation tables with a combination of standard statistical word alignments and phrase-based word alignments. For German–English translation we tried to make the German text more similar to the English text by normalizing German morphology and performing rule-based clause reordering of the German text. This resulted in small improvements for both translation directions.

  • 28.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Alignment-based profiling of Europarl data in an English-Swedish parallel corpus. 2010. In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) / [ed] Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias, Paris, France: European Language Resources Association (ELRA), 2010, p. 3398-3404. Conference paper (Refereed)
    Abstract [en]

    This paper profiles the Europarl part of an English-Swedish parallel corpus and compares it with three other subcorpora of the same parallel corpus. We first describe our method for comparison which is based on alignments, both at the token level and the structural level. Although two of the other subcorpora contain fiction, it is found that the Europarl part is the one having the highest proportion of many types of restructurings, including additions, deletions and long distance reorderings. We explain this by the fact that the majority of Europarl segments are parallel translations.

  • 29.
    Fagerlund, Martin
    et al.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Eldén, Lars
    Linköping University, Department of Mathematics, Scientific Computing. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Computing Word Senses by Semantic Mirroring and Spectral Graph Partitioning. 2010. In: Proceedings of TextGraphs-5 - 2010 Workshop on Graph-based Methods for Natural Language Processing / [ed] Carmen Banea, Alessandro Moschitti, Swapna Somasundaran and Fabio Massimo Zanzotto, Stroudsburg, PA, USA: The Association for Computational Linguistics, 2010, p. 103-107. Conference paper (Refereed)
    Abstract [en]

    Using the technique of “semantic mirroring”, a graph is obtained that represents words and their translations from a parallel corpus or a bilingual lexicon. The connectedness of the graph holds information about the different meanings of words that occur in the translations. Spectral graph theory is used to partition the graph, which leads to a grouping of the words according to different senses. We also report results from an evaluation using a small sample of seed words from a lexicon of Swedish and English adjectives.

  • 30.
    Stymne, Sara
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Using a Grammar Checker for Evaluation and Postprocessing of Statistical Machine Translation. 2010. In: Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10), European Language Resources Association, 2010, p. 2175-2181. Conference paper (Refereed)
    Abstract [en]

    One problem in statistical machine translation (SMT) is that the output often is ungrammatical. To address this issue, we have investigated the use of a grammar checker for two purposes in connection with SMT: as an evaluation tool and as a postprocessing tool. As an evaluation tool the grammar checker gives a complementary picture to standard metrics such as Bleu, which do not account for grammaticality. We use the grammar checker as a postprocessing tool by applying the error correction suggestions it gives. There are only small overall improvements of the postprocessing on automatic metrics, but the sentences that are affected by the changes are improved, as shown both by automatic metrics and by a human error analysis. These results indicate that grammar checker techniques are a useful complement to SMT.

  • 31.
    Stymne, Sara
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Holmqvist, Maria
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Vs and OOVs: Two Problems for Translation between German and English. 2010. In: Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR (WMT'10), 2010, p. 183-188. Conference paper (Refereed)
    Abstract [en]

    In this paper we report on experiments with three preprocessing strategies for improving translation output in a statistical MT system. In training, two reordering strategies were studied: (i) reorder on the basis of the alignments from Giza++, and (ii) reorder by moving all verbs to the end of segments. In translation, out-of-vocabulary words were preprocessed in a knowledge-lite fashion to identify a likely equivalent. All three strategies were implemented for our English-German systems submitted to the WMT10 shared task. Combining them led to improvements in both language directions.

  • 32.
    Maleki, Jalal
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Yaesoubi, Maziar
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Applying Finite State Morphology to Conversion Between Roman and Perso-Arabic Writing Systems. 2009. In: FINITE-STATE METHODS AND NATURAL LANGUAGE PROCESSING, ISSN 0922-6389, Vol. 191, p. 215-223. Article in journal (Refereed)
    Abstract [en]

    This paper presents a method for converting back and forth between the Perso-Arabic and a romanized writing system for Persian. Given a word in one writing system, we use finite state transducers to generate morphological analysis for the word that is subsequently used to regenerate the orthography of the word in the other writing system. The system has been implemented in XFST and LEXC.

  • 33.
    Maleki, Jalal
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Yaesoubi, Maziar
    Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Applying Finite State Morphology to Conversion Between Roman and Perso-Arabic Writing Systems. 2009. In: Finite-State Methods and Natural Language Processing - Post-proceedings of the 7th International Workshop FSMNLP 2008 / [ed] Jakub Piskorski, Bruce Watson, Anssi Yli-Jyrä, IOS Press, 2009, p. 215-223. Conference paper (Other academic)
    Abstract [en]

    This paper presents a method for converting back and forth between the Perso-Arabic and a romanized writing system for Persian. Given a word in one writing system, we use finite state transducers to generate morphological analysis for the word that is subsequently used to regenerate the orthography of the word in the other writing system. The system has been implemented in XFST and LEXC.

  • 34.
    Nyström, Mikael
    et al.
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Zweigenbaum, Pierre
    Assistance Publique-Hôpitaux de Paris, Inserm U729, Inalco CRIM.
    Petersson, Håkan
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Åhlfeldt, Hans
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Creating a medical English-Swedish dictionary using interactive word alignment. 2009. In: Lexicography: The Changing Landscape / [ed] Salonee Priya, Hyderabad, India: The Icfai University Press, 2009, 1, p. 131-157. Chapter in book (Other academic)
    Abstract [sv]

    Lexicography is a realm of growing academic specialization. Dictionaries map meaning onto use. We have innumerable dictionaries on different subjects and for different purposes which we keep referring to, time and again. Despite the frequency with which dictionaries are unquestioningly consulted, many have little idea of what actually goes into making them or how meanings are definitively ascertained. We have become so accustomed to using dictionaries that we fail to take notice of the effort and time spent in their making. Understanding the finer nuances of the art of dictionary-making will be of interest to everyone. With changing times and the penetration of technology, the bulkier forms of dictionaries have given way to softer forms. This book updates the reader to the changing notions of the lexicon and dictionary-making in the new realm of modern technology and newer electronic tools. The book introduces us to lexicography and leads us to dictionaries for general and specific purposes. It examines dictionary compilation and research and enables compilers, users, educators and publishers to look anew at the art of lexicography. It duly takes into account the fact that dictionaries are meant to fulfill the needs of specific user groups and reflects the same in the chapters devoted to various professional dictionaries, which have recently achieved widespread recognition in the lexicographical literature. A good read for students of linguistics, teachers and translators apart from general readers interested in knowing the intricate art of making a dictionary.

  • 35.
    Holmqvist, Maria
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Stymne, Sara
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Foo, Jody
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Improving alignment for SMT by reordering and augmenting the training corpus. 2009. In: Proceedings of the Fourth Workshop on Statistical Machine Translation (WMT09), Athens, Greece, 2009, p. 120-124. Conference paper (Refereed)
    Abstract [en]

    We describe the LIU systems for English-German and German-English translation in the WMT09 shared task. We focus on two methods to improve the word alignment: (i) by applying Giza++ in a second phase to a reordered training corpus, where reordering is based on the alignments from the first phase, and (ii) by adding lexical data obtained as high-precision alignments from a different word aligner. These methods were studied in the context of a system that uses compound processing, a morphological sequence model for German, and a part-of-speech sequence model for English. Both methods gave some improvements to translation quality as measured by Bleu and Meteor scores, though not consistently. All systems used both out-of-domain and in-domain data as the mixed corpus had better scores in the baseline configuration.

  • 36.
    Maleki, Jalal
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Converting Romanized Persian to the Arabic Writing System. 2008. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08) / [ed] Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias, Marrakech, Morocco: European Language Resources Association, 2008. Conference paper (Refereed)
    Abstract [en]

    This paper describes a syllabification-based method for converting romanized Persian text to the traditional Arabic-based writing system. The system is implemented in Xerox XFST and relies on rule-based conversion of words rather than using morphological analysis. The paper presents a brief evaluation of the accuracy of the transcriptions generated by the method.

  • 37.
    Stymne, Sara
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Holmqvist, Maria
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Effects of Morphological Analysis in Translation between German and English. 2008. In: Proceedings of the Third Workshop on Statistical Machine Translation, Stroudsburg, PA, USA: Association for Computational Linguistics, 2008, p. 135-138. Conference paper (Refereed)
    Abstract [en]

    We describe the LIU systems for German-English and English-German translation submitted to the Shared Task of the Third Workshop of Statistical Machine Translation. The main features of the systems, as compared with the baseline, are the use of morphological pre- and post-processing, and a sequence model for German using morphologically rich parts-of-speech. It is shown that these additions lead to improved translations.

  • 38.
    Holmqvist, Maria
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Stymne, Sara
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    High-precision Word Alignment with Parallel Phrases. 2008. In: The second Swedish Language Technology Conference SLTC-08, 2008, p. 45-46. Conference paper (Refereed)
  • 39.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Searching Parallel Treebanks for Translation Relations. 2008. In: Resourceful Language Technology: Festschrift in Honor of Anna Sågvall Hein, Uppsala: Acta Universitatis Upsaliensis, 2008, 1, p. 11-20. Chapter in book (Other academic)
    Abstract [en]

    As the first holder of the first chair in computational linguistics in Sweden, Anna Sågvall Hein has played a central role in the development of computational linguistics and language technology both in Sweden and on the international scene. Besides her valuable contributions to research, which include work on machine translation, syntactic parsing, grammar checking, word prediction, and corpus linguistics, she has been instrumental in establishing a national graduate school in language technology as well as an undergraduate program in language technology at Uppsala University. It is with great pleasure that we present her with this Festschrift to honor her lasting contributions to the field and to commemorate her retirement from the chair in computational linguistics at Uppsala University. The contributions to the Festschrift come from Anna’s friends and colleagues around the world and deal with many of the topics that are dear to her heart. A common theme in many of the articles, as well as in Anna’s own scientific work, is the design, development and use of adequate language technology resources, epitomized in the title Resourceful Language Technology.

  • 40.
    Holmqvist, Maria
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Stymne, Sara
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Getting to Know Moses: Initial Experiments on German-English Factored Translation. 2007. In: Proceedings of the Second Workshop on Statistical Machine Translation, Stroudsburg, PA: Association for Computational Linguistics, 2007, p. 181-. Conference paper (Refereed)
  • 41.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    LinES: An English-Swedish Parallel Treebank. 2007. In: Proceedings of 16th Nordic Conference of Computational Linguistics Nodalida / [ed] Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek, Mare Koit, Tartu, Estonia: University of Tartu, 2007, p. 270-273. Conference paper (Refereed)
    Abstract [en]

    This paper presents an English-Swedish Parallel Treebank, LinES, that is currently under development. LinES is intended as a resource for the study of variation in translation of common syntactic constructions from English to Swedish. For this reason, annotation in LinES is syntactically oriented, multi-level, complete and manually reviewed according to guidelines. Another aim of LinES is to support queries made in terms of types of translation shifts.

  • 42.
    Stymne, Sara
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    A Bilingual Grammar for Translation of English-Swedish Verb Frame Divergences. 2006. In: Annual Conference of the European Association for Machine Translation EAMT, Oslo, Norway: EAMT, 2006. Conference paper (Refereed)
  • 43.
    Nyström, Mikael
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Biomedical Engineering, Medical Informatics.
    Merkel, Magnus
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Zweigenbaum, Pierre
    Petersson, Håkan
    Linköping University, The Institute of Technology. Linköping University, Department of Biomedical Engineering, Medical Informatics.
    Åhlfeldt, Hans
    Linköping University, The Institute of Technology. Linköping University, Department of Biomedical Engineering, Medical Informatics.
    Creating a medical English-Swedish dictionary using interactive word alignment. 2006. In: BMC Medical Informatics and Decision Making, E-ISSN 1472-6947, Vol. 6, no 35. Article in journal (Refereed)
    Abstract [en]

    Background: This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many other West European language pairs than English-Swedish. Methods: The medical terminology systems were collected in electronic format in both English and Swedish and the rubrics were extracted in parallel language pairs. Initially, interactive word alignment was used to create training data from a sample. Then the training data were utilised in automatic word alignment in order to generate candidate term pairs. The last step was manual verification of the term pair candidates. Results: A dictionary of 31,000 verified entries has been created in less than three man weeks, thus with considerably less time and effort needed compared to a manual approach, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems and these results indicate the power of the method for finding inconsistencies in terminology translations. We also report on some factors that may contribute to making the process of dictionary creation with similar tools even more expedient. Finally, the contribution is discussed in relation to other ongoing efforts in constructing medical lexicons for non-English languages. Conclusion: In three man weeks we were able to produce a medical English-Swedish dictionary consisting of 31,000 entries and also found hidden translation errors in the utilized medical terminology systems. © 2006 Nyström et al, licensee BioMed Central Ltd.

  • 44.
    Maegaard, Bente
    et al.
    Fenstad, Jens-Erik
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Kvale, Knut
    Mühlenbock, Katarina
    Heid, Bernt-Erik
    KUNSTI - Knowledge Generation for Norwegian Language Technology. 2006. In: Proceedings of LREC 2006 Language Resources and Evaluation Conference, Genoa: LREC, 2006, p. 757-. Conference paper (Refereed)
  • 45.
    Ahrenberg, Lars
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory.
    Codified Close Translation as a Standard for MT. 2005. In: Conference of the European Association for Machine Translation, Budapest: EAMT, 2005, p. 13-. Conference paper (Refereed)
  • 46.
    Nyström, Mikael
    et al.
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Åhlfeldt, Hans
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Klein, Gunnar
    Karolinska Institutet, Solna.
    Nilsson, Gunnar
    Karolinska Institutet.
    Chen, Rong
    Karolinska Institutet, Solna.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Halvautomatisk översättning av SNOMED CT till svenska. 2003. In: IT i vården - terminologi, 2003. Conference paper (Other academic)
  • 47.
    Merkel, Magnus
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Petterstedt, Michael
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Ahrenberg, Lars
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Interactive Word Alignment for Corpus Linguistics. 2003. In: Proceedings of Corpus Linguistics 2003, 28-31st March, 2003, Lancaster UK. UCREL Technical Papers, UCREL (University Centre for Computer Corpus Research on Language), 2003, p. 533-542. Conference paper (Refereed)
  • 48.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Petterstedt, Michael
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Interactive Word Alignment for Language Engineering. 2003. In: The 10th Conference of the European Chapter of the Association for Computational Linguistics, Conference Companion, Association for Computational Linguistics, 2003, p. 49-52. Conference paper (Refereed)
  • 49.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Andersson, Mikael
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    A System for Incremental and Interactive Word Linking. 2002. In: Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, 29-31 May 2002, European Language Resources Association (ELRA), 2002, p. 485-490. Conference paper (Refereed)
    Abstract [en]

    Aligned parallel corpora constitute a critical information resource for a great number of linguistic and technological endeavors. Automatic sentence alignment has reached a level whereby large parallel documents can be fully aligned with the aid of interactive post-editing tools. Word alignment systems have not yet reached the same level of performance, but are good enough to support full word alignment if embedded in an interactive system. In this paper we describe a system for fast and accurate word alignment currently under development at our department, where the user can review and improve the output from an automatic system in an incremental fashion.

  • 50.
    Ahrenberg, Lars
    et al.
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    Merkel, Magnus
    Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology.
    A knowledge-lite approach to word alignment. 2000. In: Parallel Text Processing: Alignment and Use of Translation Corpora / [ed] Jean Veronis, Dordrecht, The Netherlands: Kluwer Academic Publishers, 2000, p. 97-116. Chapter in book (Other academic)
    Abstract [en]

    The most promising approach to word alignment is to combine statistical methods with non-statistical information sources. Some of the proposed non-statistical sources, including bilingual dictionaries, POS-taggers and lemmatizers, rely on considerable linguistic knowledge, while other knowledge-lite sources such as cognate heuristics and word order heuristics can be implemented relatively easily. While knowledge-heavy sources might be expected to give better performance, knowledge-lite systems are easier to port to new language pairs and text types, and they can give sufficiently good results for many purposes, e.g. if the output is to be used by a human user for the creation of a complete word-aligned bitext. In this paper we describe the current status of the Linköping Word Aligner (LWA), which combines the use of statistical measures of co-occurrence with four knowledge-lite modules for (i) word categorization, (ii) morphological variation, (iii) word order, and (iv) phrase recognition. We demonstrate the portability of the system (from English-Swedish texts to French-English texts) and present results for these two language pairs. Finally, we report observations from an error analysis of system output, and identify the major strengths and weaknesses of the system.
