Experimenting with modeling-specific word embeddings
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science and Engineering. ORCID iD: 0000-0003-2439-2136
Univ Murcia, Spain.
Univ Murcia, Spain.
2025 (English). In: Software and Systems Modeling, ISSN 1619-1366, E-ISSN 1619-1374, Vol. 24, no. 6, pp. 1647-1669. Article in journal (Refereed). Published.
Abstract [en]

The application of machine learning techniques to address MDE problems often requires transforming raw information (e.g., software models) into a numerical representation that can be used by machine learning algorithms. To this end, pretrained embeddings are a key technology to facilitate the construction of such applications. However, previous works have demonstrated that these embeddings struggle to generalize effectively in the MDE domain due to their training on general-purpose corpora. To tackle this issue, we developed WordE4MDE, a set of specialized word embeddings trained specifically on modeling documents. In this study, we aim to overcome several limitations of WordE4MDE and conduct additional experiments to assess its efficacy. Key limitations we address include: (1) mitigating the out-of-vocabulary issue through the utilization of sub-word embeddings, (2) adding contextualization to the embeddings by training a BERT model on our specific modeling corpus, and (3) addressing the constraint of limited training data by investigating the augmentation of our modeling corpus with StackOverflow and StackExchange data.
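
The out-of-vocabulary point in (1) can be illustrated with a minimal sketch. It assumes a fastText-style scheme in which a word vector is the average of its character n-gram vectors; the abstract only says "sub-word embeddings", so this scheme, the names `char_ngrams`, `ngram_vector`, `embed`, and `cosine`, and the size `DIM` are all assumptions made for illustration. The n-gram vectors here are deterministic pseudo-random stand-ins, not trained values.

```python
import hashlib
import math
import random

DIM = 64  # illustrative embedding size (assumption, not from the paper)


def char_ngrams(word, n_min=3, n_max=5):
    """Return the set of character n-grams of a word wrapped in boundary markers."""
    marked = f"<{word}>"
    return {
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    }


def ngram_vector(ngram, dim=DIM):
    """Stand-in for a trained n-gram vector: deterministic pseudo-random values."""
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def embed(word, dim=DIM):
    """Average the n-gram vectors, so any word of 1+ characters gets a vector."""
    vectors = [ngram_vector(g, dim) for g in sorted(char_ngrams(word))]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


if __name__ == "__main__":
    # "metamodels" may be missing from a word-level vocabulary, yet it shares
    # most of its n-grams with "metamodel", so it still receives a related vector.
    shared = char_ngrams("metamodel") & char_ngrams("metamodels")
    print(len(shared), "shared n-grams")
    print(cosine(embed("metamodel"), embed("metamodels")))
```

With a word-level lookup table, an unseen token has no vector at all; under this scheme an unseen token is composed from n-grams it shares with seen tokens, which is why sub-word models are a common remedy for out-of-vocabulary terms.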

Place, publisher, year, edition, pages
Springer Heidelberg, 2025. Vol. 24, no. 6, pp. 1647-1669
Keywords [en]
Embeddings; Classification; Clustering; Recommendation; Machine Learning; Model-Driven Engineering
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:liu:diva-210726
DOI: 10.1007/s10270-024-01250-5
ISI: 001376228600001
Scopus ID: 2-s2.0-85212045326
OAI: oai:DiVA.org:liu-210726
DiVA, id: diva2:1926131
Note

Funding agencies: Agencia Estatal de Investigación [TED2021-129381B-C22, MCIN/AEI/10.13039/501100011033, PID2022-140109NB-I00, MCIN/AEI/10.13039/501100011033]; FEDER/UE [CNS2022-135578, MICIU/AEI/10.13039/501100011033]

Available from: 2025-01-10. Created: 2025-01-10. Last updated: 2026-02-24. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Person

Hernández López, José Antonio
