Text2VQL: Teaching a Model Query Language to Open-Source Language Models with ChatGPT
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering. McGill Univ, Canada. ORCID iD: 0000-0002-8790-252X
2024 (English). In: 27th International ACM/IEEE Conference on Model Driven Engineering Languages and Systems (MODELS), Association for Computing Machinery, 2024, p. 13-24. Conference paper, Published paper (Refereed)
Abstract [en]

While large language models (LLMs) like ChatGPT have demonstrated impressive capabilities in addressing various software engineering tasks, their use in a model-driven engineering (MDE) context is still at an early stage. Since the technology is proprietary and accessible solely through an API, its use may be incompatible with the strict protection of intellectual property in industrial models. While there are open-source LLM alternatives, they often lack the power of proprietary models and require extensive fine-tuning on data to realize their full potential. Furthermore, open-source datasets tailored for MDE tasks are scarce, posing challenges for training such models effectively. In this work, we introduce Text2VQL, a framework that generates graph queries captured in the VIATRA Query Language (VQL) from natural language specifications using open-source LLMs. Initially, we create a high-quality synthetic dataset comprising pairs of queries and their corresponding natural language descriptions using ChatGPT and the VIATRA parser. Leveraging this dataset, we use parameter-efficient tuning to specialize three open-source LLMs, namely DeepSeek Coder 1b, DeepSeek Coder 7b, and CodeLlama 7b, for VQL query generation. Our experimental evaluation demonstrates that the fine-tuned models outperform the base models in query generation, highlighting the usefulness of our synthetic dataset. Moreover, one of the fine-tuned models achieves performance comparable to ChatGPT.

Place, publisher, year, edition, pages
Association for Computing Machinery, 2024. p. 13-24
Keywords [en]
large language model (LLM); model query language; query generation; VIATRA Query Language (VQL); ChatGPT
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-209114; DOI: 10.1145/3640310.3674091; ISI: 001322650200013; ISBN: 9798400705045 (print); OAI: oai:DiVA.org:liu-209114; DiVA id: diva2:1910901
Conference
27th International Conference on Model Driven Engineering Languages and Systems (MODELS), Linz, Austria, September 22-27, 2024
Note

Funding agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation; MCIN/AEI [PID2022-140109NB-I00]; FEDER/UE [PID2022-140109NB-I00]; Software Center project [30]

Available from: 2024-11-06 Created: 2024-11-06 Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Authors
Hernández López, José Antonio; Földiák, Máté; Varro, Daniel