Search publications in DiVA (liu.se)
1 - 5 of 5
  • 1.
    Bollmann, Marcel
    et al.
    University of Copenhagen, Denmark.
    Søgaard, Anders
    University of Copenhagen, Denmark.
    Error Analysis and the Role of Morphology (2021). In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, 2021, pp. 1887-1900. Conference paper (Refereed)
    Abstract [en]

    We evaluate two common conjectures in error analysis of NLP models: (i) Morphology is predictive of errors; and (ii) the importance of morphology increases with the morphological complexity of a language. We show across four different tasks and up to 57 languages that of these conjectures, somewhat surprisingly, only (i) is true. Using morphological features does improve error prediction across tasks; however, this effect is less pronounced with morphologically complex languages. We speculate this is because morphology is more discriminative in morphologically simple languages. Across all four tasks, case and gender are the morphological features most predictive of error.

    Download full text (pdf)
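The error-prediction setup described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the paper's model: the feature values and the simple majority-vote rule are hypothetical stand-ins for whatever classifier the authors actually trained.

```python
from collections import defaultdict

def train_error_predictor(examples):
    """Count how often each morphological feature value co-occurs with a model error,
    then flag any feature value that erred more often than not."""
    counts = defaultdict(lambda: [0, 0])  # feature value -> [errors, total]
    for features, is_error in examples:
        for value in features:
            counts[value][0] += int(is_error)
            counts[value][1] += 1
    risky = {v for v, (err, tot) in counts.items() if err / tot > 0.5}
    return lambda features: any(v in risky for v in features)

# Hypothetical token-level data: (morphological feature values, did the model err?)
data = [
    ({"Case=Nom", "Gender=Fem"}, False),
    ({"Case=Gen", "Gender=Masc"}, True),
    ({"Case=Gen", "Gender=Fem"}, True),
    ({"Case=Nom", "Gender=Masc"}, False),
]
predict = train_error_predictor(data)
print(predict({"Case=Gen"}))  # True: Case=Gen erred in 2/2 training examples
```

Note the role of case and gender here mirrors the abstract's finding that these are the features most predictive of error, but the rule itself is invented for illustration.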
  • 2.
    Bollmann, Marcel
    et al.
    University of Copenhagen, Denmark.
    Aralikatte, Rahul
    University of Copenhagen, Denmark.
    Murrieta Bello, Héctor
    University of Copenhagen, Denmark.
    Hershcovich, Daniel
    University of Copenhagen, Denmark.
    de Lhoneux, Miryam
    University of Copenhagen, Denmark.
    Søgaard, Anders
    University of Copenhagen, Denmark.
    Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task (2021). In: Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, 2021, pp. 248-254. Conference paper (Refereed)
    Abstract [en]

    We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios. Unsuccessfully. In the end, we submitted two runs: (i) a standard phrase-based model, and (ii) a random babbling baseline using character trigrams. We found that it was surprisingly hard to beat (i), in spite of this model being, in theory, a bad fit for polysynthetic languages; and more interestingly, that (ii) was better than several of the submitted systems, highlighting how difficult low-resource machine translation for polysynthetic languages is.

    Download full text (pdf)
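The character-trigram "random babbling" baseline described above can be sketched in a few lines. This is one plausible reading of the idea, not the authors' implementation; the corpus and the exact sampling scheme are assumptions.

```python
import random

def babble(corpus, length=20, seed=0):
    """Character-trigram 'random babbling': repeatedly sample trigrams seen
    anywhere in the corpus and concatenate them, ignoring the source text."""
    rng = random.Random(seed)  # fixed seed for a reproducible baseline run
    trigrams = [corpus[i:i + 3] for i in range(len(corpus) - 2)]
    out = ""
    while len(out) < length:
        out += rng.choice(trigrams)
    return out[:length]

print(babble("the quick brown fox jumps over the lazy dog"))
```

Such a baseline produces character sequences with the target language's surface statistics but no meaning, which is exactly why beating it is a meaningful (if low) bar for a translation system.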
  • 3.
    Bollmann, Marcel
    et al.
    University of Copenhagen, Denmark.
    Elliott, Desmond
    University of Copenhagen, Denmark.
    On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology (2020). In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 7819-7827. Conference paper (Refereed)
    Abstract [en]

    The field of natural language processing is experiencing a period of unprecedented growth, and with it a surge of published papers. This represents an opportunity for us to take stock of how we cite the work of other researchers, and whether this growth comes at the expense of “forgetting” about older literature. In this paper, we address this question through bibliographic analysis. By looking at the age of outgoing citations in papers published at selected ACL venues between 2010 and 2019, we find that there is indeed a tendency for recent papers to cite more recent work, but the rate at which papers older than 15 years are cited has remained relatively stable.

    Download full text (pdf)
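The citation-age measurement underlying the analysis above boils down to arithmetic like the following; the publication years here are invented for illustration.

```python
def citation_age_stats(paper_year, cited_years):
    """Age of each outgoing citation, plus the share older than 15 years
    (the threshold the abstract reports as remaining stable)."""
    ages = [paper_year - y for y in cited_years]
    older = sum(a > 15 for a in ages) / len(ages)
    return sum(ages) / len(ages), older

mean_age, frac_old = citation_age_stats(2019, [2018, 2017, 2016, 2001, 1995])
print(mean_age, frac_old)  # 9.6 0.4
```

Aggregating these per-paper statistics over venues and years would yield the trend lines the paper reports.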
  • 4.
    Bollmann, Marcel
    University of Copenhagen, Denmark.
    A Large-Scale Comparison of Historical Text Normalization Systems (2019). In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, 2019, pp. 3885-3898. Conference paper (Refereed)
    Abstract [en]

    There is no consensus on the state-of-the-art approach to historical text normalization. Many techniques have been proposed, including rule-based methods, distance metrics, character-based statistical machine translation, and neural encoder–decoder models, but studies have used different datasets, different evaluation methods, and have come to different conclusions. This paper presents the largest study of historical text normalization done so far. We critically survey the existing literature and report experiments on eight languages, comparing systems spanning all categories of proposed normalization techniques, analysing the effect of training data quantity, and using different evaluation methods. The datasets and scripts are made publicly available.

    Download full text (pdf)
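Of the technique families the survey above compares, a distance-metric baseline is the simplest to sketch. This hypothetical version maps a historical spelling to its closest match in a modern lexicon using difflib's similarity ratio; the surveyed systems use their own metrics and lexica.

```python
import difflib

def normalize(historical, lexicon):
    """Distance-based normalization baseline: return the lexicon entry
    most similar to the historical spelling, or the input unchanged if
    the lexicon is empty."""
    matches = difflib.get_close_matches(historical, lexicon, n=1, cutoff=0.0)
    return matches[0] if matches else historical

lexicon = ["evil", "eve", "over", "ever"]
print(normalize("euyll", lexicon))
```

Character-level statistical MT and neural encoder-decoder models can be seen as learned generalizations of this lookup, which is why a comparison across all categories on shared datasets is informative.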
  • 5.
    Bollmann, Marcel
    et al.
    University of Copenhagen, Denmark.
    Korchagina, Natalia
    University of Zurich, Switzerland.
    Søgaard, Anders
    University of Copenhagen, Denmark.
    Few-Shot and Zero-Shot Learning for Historical Text Normalization (2019). In: Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), Association for Computational Linguistics, 2019, pp. 104-114. Conference paper (Refereed)
    Abstract [en]

    Historical text normalization often relies on small training datasets. Recent work has shown that multi-task learning can lead to significant improvements by exploiting synergies with related datasets, but there has been no systematic study of different multi-task learning architectures. This paper evaluates 63 multi-task learning configurations for sequence-to-sequence-based historical text normalization across ten datasets from eight languages, using autoencoding, grapheme-to-phoneme mapping, and lemmatization as auxiliary tasks. We observe consistent, significant improvements across languages when training data for the target task is limited, but minimal or no improvements when training data is abundant. We also show that zero-shot learning outperforms the simple, but relatively strong, identity baseline.

    Download full text (pdf)
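The identity baseline mentioned in the abstract above simply copies the input, since many historical spellings already match their modern forms; measuring it is a one-liner. The word pairs below are invented examples, not data from the paper.

```python
def identity_baseline_accuracy(pairs):
    """Fraction of (historical, modern) pairs where the historical spelling
    already equals the modern form: the baseline zero-shot systems must beat."""
    return sum(h == m for h, m in pairs) / len(pairs)

pairs = [("ye", "the"), ("olde", "old"), ("dog", "dog"), ("hath", "has")]
print(identity_baseline_accuracy(pairs))  # 0.25
```

Because this baseline is "simple, but relatively strong" on real historical corpora, outperforming it without target-language training data is a nontrivial result.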