Few-Shot and Zero-Shot Learning for Historical Text Normalization
University of Copenhagen, Denmark. ORCID iD: 0000-0003-2598-8150
University of Zurich, Switzerland.
University of Copenhagen, Denmark.
2019 (English). In: Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), Association for Computational Linguistics, 2019, p. 104-114. Conference paper, Published paper (Refereed)
Abstract [en]

Historical text normalization often relies on small training datasets. Recent work has shown that multi-task learning can lead to significant improvements by exploiting synergies with related datasets, but there has been no systematic study of different multi-task learning architectures. This paper evaluates 63 multi-task learning configurations for sequence-to-sequence-based historical text normalization across ten datasets from eight languages, using autoencoding, grapheme-to-phoneme mapping, and lemmatization as auxiliary tasks. We observe consistent, significant improvements across languages when training data for the target task is limited, but minimal or no improvements when training data is abundant. We also show that zero-shot learning outperforms the simple, but relatively strong, identity baseline.
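For concreteness, the identity baseline mentioned above simply predicts each historical word form unchanged and is scored on how often that already matches the modern normalization. A minimal sketch, using invented example pairs that are not from the paper's datasets:

```python
# Illustrative sketch of the identity baseline for historical text
# normalization: copy each historical form unchanged and measure
# word-level accuracy. The pairs below are invented for illustration.

def identity_baseline_accuracy(pairs):
    """Fraction of (historical, modern) pairs where copying the
    historical form already yields the modern normalization."""
    correct = sum(1 for hist, norm in pairs if hist == norm)
    return correct / len(pairs)

# Hypothetical early-modern-style spelling pairs (invented examples).
pairs = [
    ("vppon", "upon"),
    ("daye", "day"),
    ("good", "good"),   # already identical to the modern spelling
    ("haue", "have"),
]

print(identity_baseline_accuracy(pairs))  # 0.25: one of four forms unchanged
```

Because many historical forms coincide with their modern spellings, this baseline is simple but, as the abstract notes, relatively strong.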

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2019. p. 104-114
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:liu:diva-197948
DOI: 10.18653/v1/d19-6112
OAI: oai:DiVA.org:liu-197948
DiVA, id: diva2:1798249
Conference
2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), Hong Kong, China, November 3, 2019
Available from: 2023-09-18 Created: 2023-09-18 Last updated: 2023-09-26

Open Access in DiVA

fulltext (294 kB), 40 downloads
File information
File name: FULLTEXT01.pdf
File size: 294 kB
Checksum (SHA-512): 522b613a4f0f4e6f49ed2b76f8353670b97779e6e0e413f864b3a2d0b2f0ca78546889a526cf87c8263c8a78407152d680244708423a413163acbd9d3b813acf
Type: fulltext
Mimetype: application/pdf

Authority records

Bollmann, Marcel

Total: 40 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 235 hits