Bridging the Resource Gap: Exploring the Efficacy of English and Multilingual LLMs for Swedish
2023 (English). In: Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), Tórshavn, the Faroe Islands, 2023, p. 92-110. Conference paper, Published paper (Refereed)
Abstract [en]
Large language models (LLMs) have substantially improved natural language processing (NLP) performance, but training these models from scratch is resource-intensive and challenging for smaller languages. With this paper, we want to initiate a discussion on the necessity of language-specific pre-training of LLMs. We propose how the “one model-many models” conceptual framework for task transfer can be applied to language transfer, and we explore this approach by evaluating non-Swedish monolingual and multilingual models on tasks in Swedish. Our findings demonstrate that LLMs exposed to limited Swedish during training can be highly capable and transfer competencies from English off-the-shelf, including emergent abilities such as mathematical reasoning, while at the same time showing distinct culturally adapted behaviour. Our results suggest that there are resourceful alternatives to language-specific pre-training when creating useful LLMs for small languages.
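As a rough illustration of the kind of evaluation the abstract describes (not the authors' actual setup), the following hypothetical Python sketch prompts an off-the-shelf multilingual causal LM with a Swedish arithmetic-reasoning question; the model name and prompt are placeholder assumptions chosen for illustration.

# Hypothetical sketch: probing a multilingual LLM with a Swedish task prompt.
# Model name and prompt are illustrative assumptions, not the paper's setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # assumed example of a multilingual model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Swedish prompt: "Anna has 3 apples and buys 4 more. How many apples does she have now?"
prompt = "Fråga: Anna har 3 äpplen och köper 4 till. Hur många äpplen har hon nu? Svar:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))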
Place, publisher, year, edition, pages
Tórshavn, the Faroe Islands, 2023. p. 92-110
Keywords [en]
NLP, Natural Language Processing, language model, GPT, monolingual, multilingual, cross-lingual, one model-many models
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-196545
OAI: oai:DiVA.org:liu-196545
DiVA, id: diva2:1787062
Conference
RESOURCEFUL workshop at NoDaLiDa
Funder
CUGS (National Graduate School in Computer Science)