Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models
Holmström, Oskar (Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems; Linköping University, Faculty of Science & Engineering) (NLP)
Doostmohammadi, Ehsan (Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems; Linköping University, Faculty of Science & Engineering), ORCID iD: 0000-0002-5633-5307
2023 (English). In: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), 2023, p. 634-642. Conference paper, Published paper (Refereed).
Abstract [en]

In recent years, instruction-finetuned models have received increased attention due to their remarkable zero-shot and generalization capabilities. However, the widespread implementation of these models has been limited to the English language, largely due to the costs and challenges associated with creating instruction datasets. To overcome this, automatic instruction generation has been proposed as a resourceful alternative. We see this as an opportunity for the adoption of instruction finetuning in other languages. In this paper, we explore the viability of instruction finetuning for Swedish. We translate a dataset of generated instructions from English to Swedish and use it to finetune both Swedish and non-Swedish models. Results indicate that the use of translated instructions significantly improves the models' zero-shot performance, even on unseen data, while staying competitive with strong baselines ten times their size. We see this paper as a first step and a proof of concept that instruction finetuning for Swedish is within reach through resourceful means, and that there exist several directions for further improvement.
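
The abstract outlines a two-step pipeline: machine-translate an automatically generated English instruction dataset into Swedish, then finetune language models on the result. Below is a minimal, hypothetical sketch of that idea using the Hugging Face transformers and datasets libraries; the translation model (Helsinki-NLP/opus-mt-en-sv), the base model (gpt2 as a small stand-in), the prompt template, and all hyperparameters are illustrative assumptions, not the authors' actual setup or data.

# Minimal sketch (not the paper's exact pipeline): translate English
# instruction-response pairs to Swedish, then finetune a causal LM on them.
from datasets import Dataset
from transformers import (
    pipeline,
    AutoTokenizer,
    AutoModelForCausalLM,
    Trainer,
    TrainingArguments,
)

# Toy stand-in for a generated English instruction dataset.
english_data = [
    {"instruction": "Explain photosynthesis.",
     "output": "Plants convert light, water, and CO2 into sugar and oxygen."},
]

# Machine-translate both fields to Swedish. Helsinki-NLP/opus-mt-en-sv is a
# public en->sv model; the paper's actual translation system is an assumption.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-sv")

def translate(example):
    example["instruction"] = translator(example["instruction"])[0]["translation_text"]
    example["output"] = translator(example["output"])[0]["translation_text"]
    return example

swedish = Dataset.from_list(english_data).map(translate)

# Finetune a causal LM on the translated pairs; swap in a Swedish or
# multilingual model in practice.
model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(example):
    # Hypothetical Swedish prompt template.
    text = f"Instruktion: {example['instruction']}\nSvar: {example['output']}"
    enc = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    enc["labels"] = enc["input_ids"].copy()  # standard causal-LM objective
    return enc

train = swedish.map(tokenize, remove_columns=swedish.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sv-instruct", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=train,
)
trainer.train()

The sketch covers only dataset translation and supervised finetuning; the paper additionally evaluates zero-shot performance on unseen data against larger baselines.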

Place, publisher, year, edition, pages
2023. p. 634-642
Keywords [en]
NLP, natural language processing, language models, gpt, instruction tuning, instruction finetuning, multilingual, zero-shot
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-196546
OAI: oai:DiVA.org:liu-196546
DiVA, id: diva2:1787063
Conference
The 24th Nordic Conference on Computational Linguistics (NoDaLiDa), 2023
Funder
CUGS (National Graduate School in Computer Science)
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-08-11 Created: 2023-08-11 Last updated: 2024-11-08

Open Access in DiVA

fulltext (132 kB), 11 downloads
File information
File name: FULLTEXT02.pdf
File size: 132 kB
Checksum: SHA-512
b308dedc6d9e97ead3b7fff9ec1e8b6ca8618ac34071dab9865142a36966b5f49e64677ce39e1a27021e8da40998109afa839d2953c89deb47bf5a975318193d
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text

Authority records

Holmström, Oskar; Doostmohammadi, Ehsan
