Search publications in DiVA
1 - 4 of 4
  • 1.
    Kunz, Jenny
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Holmström, Oskar
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    The Impact of Language Adapters in Cross-Lingual Transfer for NLU, 2024. Conference paper (Refereed)
    Abstract [en]

    Modular deep learning has been proposed for the efficient adaptation of pre-trained models to new tasks, domains and languages. In particular, combining language adapters with task adapters has shown potential where no supervised data exists for a language. In this paper, we explore the role of language adapters in zero-shot cross-lingual transfer for natural language understanding (NLU) benchmarks. We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets. Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models. Retaining the source-language adapter instead often leads to equivalent, and sometimes better, performance. Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
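
    To make the adapter setup concrete, the following is a minimal sketch of stacking a language adapter with a task adapter for zero-shot cross-lingual transfer, using the AdapterHub adapters library. The model name, adapter identifiers, task name and label count are illustrative assumptions, not the paper's exact configuration.

    ```python
    # Minimal sketch of zero-shot cross-lingual transfer with stacked adapters.
    # Model and adapter names are illustrative, not the paper's exact setup.
    from adapters import AutoAdapterModel
    import adapters.composition as ac

    model = AutoAdapterModel.from_pretrained("xlm-roberta-base")

    # Pre-trained MAD-X-style language adapters for source and target language.
    model.load_adapter("en/wiki@ukp", load_as="en")
    model.load_adapter("sv/wiki@ukp", load_as="sv")

    # A fresh task adapter plus a classification head for the NLU task.
    model.add_adapter("nlu_task")
    model.add_classification_head("nlu_task", num_labels=3)

    # Train only the task adapter, stacked on the frozen source-language adapter.
    model.train_adapter("nlu_task")
    model.active_adapters = ac.Stack("en", "nlu_task")
    # ... fine-tune on source-language (English) task data here ...

    # Zero-shot transfer: swap in the target-language adapter at test time.
    # (The ablations above also test retaining "en" or removing it entirely.)
    model.active_adapters = ac.Stack("sv", "nlu_task")
    ```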

  • 2.
    Holmström, Oskar
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Kunz, Jenny
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Kuhlmann, Marco
    Linköping University, Department of Computer Science, Interactive and Cognitive Systems. Linköping University, Faculty of Science and Engineering. Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems.
    Bridging the Resource Gap: Exploring the Efficacy of English and Multilingual LLMs for Swedish. In: Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), Tórshavn, the Faroe Islands, 2023, pp. 92-110. Conference paper (Refereed)
    Abstract [en]

    Large language models (LLMs) have substantially improved natural language processing (NLP) performance, but training these models from scratch is resource-intensive and challenging for smaller languages. With this paper, we want to initiate a discussion on the necessity of language-specific pre-training of LLMs. We propose applying the “one model-many models” conceptual framework for task transfer to language transfer, and explore this approach by evaluating the performance of non-Swedish monolingual and multilingual models on tasks in Swedish. Our findings demonstrate that LLMs exposed to limited Swedish during training can be highly capable and transfer competencies from English off-the-shelf, including emergent abilities such as mathematical reasoning, while at the same time showing distinct culturally adapted behaviour. Our results suggest that there are resourceful alternatives to language-specific pre-training when creating useful LLMs for small languages.
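
    As a concrete illustration of the evaluation setting, here is a hedged sketch of zero-shot prompting a non-Swedish model on a Swedish task; the model name, task and prompt wording are illustrative and not taken from the paper.

    ```python
    # Sketch: zero-shot prompting of an English-centric LLM on a Swedish task.
    # Model choice and prompt wording are illustrative assumptions.
    from transformers import pipeline

    generator = pipeline("text-generation", model="EleutherAI/gpt-j-6b")

    # A Swedish sentiment-style prompt; the model has seen little Swedish.
    prompt = (
        'Recension: "Filmen var fantastisk från början till slut."\n'
        "Sentiment (positiv/negativ):"
    )
    result = generator(prompt, max_new_tokens=3, do_sample=False)
    print(result[0]["generated_text"])
    ```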

  • 3.
    Holmström, Oskar
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Doostmohammadi, Ehsan
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models. In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), 2023, pp. 634-642. Conference paper (Refereed)
    Abstract [en]

    In recent years, instruction-finetuned models have received increased attention due to their remarkable zero-shot and generalization capabilities. However, the widespread implementation of these models has been limited to the English language, largely due to the costs and challenges associated with creating instruction datasets. To overcome this, automatic instruction generation has been proposed as a resourceful alternative. We see this as an opportunity for the adoption of instruction finetuning for other languages. In this paper, we explore the viability of instruction finetuning for Swedish. We translate a dataset of generated instructions from English to Swedish, using it to finetune both Swedish and non-Swedish models. Results indicate that the use of translated instructions significantly improves the models’ zero-shot performance, even on unseen data, while staying competitive with strong baselines ten times their size. We see this paper as a first step and a proof of concept that instruction finetuning for Swedish is within reach, through resourceful means, and that there exist several directions for further improvements.
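
    The pipeline described above can be sketched as follows: translate English instruction data with an off-the-shelf MT model, then format the result for supervised finetuning. The MT model, record fields and prompt template are illustrative assumptions, not the paper's exact choices.

    ```python
    # Sketch: translate English instruction-response pairs into Swedish, then
    # format them as training text for causal-LM finetuning. All field names
    # and the prompt template are illustrative.
    from transformers import pipeline

    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-sv")

    english_data = [
        {"instruction": "Explain photosynthesis in one sentence.",
         "response": "Photosynthesis is how plants turn sunlight into energy."},
    ]

    def to_swedish(text: str) -> str:
        return translator(text, max_length=512)[0]["translation_text"]

    swedish_data = [
        {"instruction": to_swedish(ex["instruction"]),
         "response": to_swedish(ex["response"])}
        for ex in english_data
    ]

    # Each translated pair becomes one supervised training example.
    def to_training_text(ex: dict) -> str:
        return f"Instruktion: {ex['instruction']}\nSvar: {ex['response']}"

    for ex in swedish_data:
        print(to_training_text(ex))
    ```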

  • 4.
    Kunz, Jenny
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Jirénius, Martin
    Linköping University, Department of Computer Science. Linköping University, Faculty of Science and Engineering.
    Holmström, Oskar
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Kuhlmann, Marco
    Linköping University, Department of Computer Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering.
    Human Ratings Do Not Reflect Downstream Utility: A Study of Free-Text Explanations for Model Predictions. In: Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2022, Vol. 5, pp. 164-177, article id 2022.blackboxnlp-1.14. Conference paper (Refereed)
    Abstract [en]

    Models able to generate free-text rationales that explain their output have been proposed as an important step towards interpretable NLP for “reasoning” tasks such as natural language inference and commonsense question answering. However, the relative merits of different architectures and types of rationales are not well understood and hard to measure. In this paper, we contribute two insights to this line of research: First, we find that models trained on gold explanations learn to rely on these but, in the case of the more challenging question answering data set we use, fail when given generated explanations at test time. However, additional fine-tuning on generated explanations teaches the model to distinguish between reliable and unreliable information in explanations. Second, we compare explanations by a generation-only model to those generated by a self-rationalizing model and find that, while the former score higher in terms of validity, factual correctness, and similarity to gold explanations, they are not more useful for downstream classification. We observe that the self-rationalizing model is prone to hallucination, which is punished by most metrics but may add useful context for the classification step.
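
    The contrast between the two architectures can be illustrated with seq2seq input/target formats; the field layout below is a hypothetical sketch, not the paper's exact encoding.

    ```python
    # Sketch of the two rationale setups contrasted above, as seq2seq formats.
    # The example and field layout are illustrative, not the paper's encoding.
    premise = "A man is playing a guitar on stage."
    hypothesis = "A person is performing music."
    gold_label = "entailment"
    gold_explanation = "Playing a guitar on stage is a way of performing music."

    # Generation-only: one model produces the explanation; a separate classifier
    # then predicts the label from the input plus that explanation.
    explain_input = f"explain: premise: {premise} hypothesis: {hypothesis}"
    explain_target = gold_explanation
    classify_input = (
        f"classify: premise: {premise} hypothesis: {hypothesis} "
        f"explanation: {gold_explanation}"  # a generated explanation at test time
    )

    # Self-rationalizing: one model predicts label and explanation jointly.
    joint_input = f"premise: {premise} hypothesis: {hypothesis}"
    joint_target = f"{gold_label} because {gold_explanation}"
    ```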
