Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records
2022 (English) In: Terminology in the 21st Century: Many Faces, Many Places, Term 2022 - held in conjunction with the International Conference on Language Resources and Evaluation, LREC 2022 - Proceedings / [ed] Rute Costa, Sara Carvalho, Ana Ostroški Anić, Anas Fahad Khan, European Language Resources Association, 2022, Vol. 2022.term-1, p. 30-32. Conference paper, Published paper (Refereed)
Abstract [en]
In the experiments briefly presented in this abstract, we compare the performance of a generalist Swedish pre-trained language model with a domain-specific Swedish pre-trained model on the downstream task of focused terminology extraction of implant terms, which are terms that indicate the presence of implants in the body of patients. The fine-tuning is identical for both models. For the search strategy we rely on a KD-Tree that we feed with two different lists of term seeds, one with noise and one without noise. Results show that the use of a domain-specific pre-trained language model has a positive impact on focused terminology extraction only when using term seeds without noise.
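As an illustration of the search strategy described above, the sketch below embeds seed implant terms and candidate terms with a Swedish pre-trained model and queries a KD-Tree built over the candidate embeddings. The model name (KB/bert-base-swedish-cased), the mean pooling, the example terms, and the number of neighbours are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: KD-Tree nearest-neighbour search over term embeddings.
# Model choice, pooling, and example terms are assumptions for illustration.
import torch
from sklearn.neighbors import KDTree
from transformers import AutoModel, AutoTokenizer

MODEL = "KB/bert-base-swedish-cased"  # assumed generalist Swedish model; a clinical model would replace it for the domain-specific run
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
model.eval()

def embed(terms):
    """Mean-pooled last-layer embeddings for a list of terms."""
    enc = tokenizer(terms, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state      # (batch, seq, hidden)
    mask = enc["attention_mask"].unsqueeze(-1)       # mask out padding tokens
    summed = (hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1)
    return (summed / counts).numpy()

# Hypothetical data: seed implant terms and candidate terms from the records.
seeds = ["pacemaker", "höftprotes", "insulinpump"]
candidates = ["hjärtstimulator", "knäprotes", "stent", "blodtryck", "kateter"]

tree = KDTree(embed(candidates))                     # index candidate embeddings
dist, idx = tree.query(embed(seeds), k=2)            # nearest candidates per seed
for seed, row in zip(seeds, idx):
    print(seed, "->", [candidates[i] for i in row])
```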
Place, publisher, year, edition, pages
European Language Resources Association, 2022. Vol. 2022.term-1, p. 30-32
Keywords [en]
terminology extraction, implant terms, generalist BERT, domain-specific BERT
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-190559
Scopus ID: 2-s2.0-85146268302
ISBN: 9791095546955 (print)
OAI: oai:DiVA.org:liu-190559
DiVA, id: diva2:1718756
Conference
Language Resources and Evaluation Conference (LREC 2022), Marseille, France, 20-25 June 2022
Available from: 2022-12-13 Created: 2022-12-13 Last updated: 2024-08-23 Bibliographically approved