Hernández López, José Antonio (ORCID iD: orcid.org/0000-0003-2439-2136)
Publications (3 of 3)
Saad, M., Hernández López, J. A., Chen, B., Varró, D. & Sharma, T. (2025). An Adaptive Language-Agnostic Pruning Method for Greener Language Models for Code. Paper presented at ACM International Conference on the Foundations of Software Engineering (FSE). Proceedings of the ACM on Software Engineering, 2(FSE), 1183-1204
2025 (English). In: Proceedings of the ACM on Software Engineering, E-ISSN 2994-970X, Vol. 2, no. FSE, p. 1183-1204. Article in journal (Refereed), Published
Abstract [en]

Language models of code have demonstrated remarkable performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and consequential environmental footprint remain significant challenges. This work introduces Alpine, an adaptive programming language-agnostic pruning technique designed to substantially reduce the computational overhead of these models. The proposed method offers a pluggable layer that can be integrated with all Transformer-based models. With Alpine, input sequences undergo adaptive compression throughout the pipeline, reaching a size that is up to ×3 less than their initial size, resulting in significantly reduced computational load. Our experiments on two software engineering tasks, defect prediction and code clone detection, across three language models, CodeBert, GraphCodeBert and UniXCoder, show that Alpine achieves up to a 50% reduction in FLOPs, a 58.1% decrease in memory footprint, and a 28.1% improvement in throughput on average. This led to a reduction in CO2 emissions by up to 44.85%. Importantly, it achieves a reduction in computation resources while maintaining up to 98.1% of the original predictive performance. These findings highlight the potential of Alpine in making language models of code more resource-efficient and accessible while preserving their performance, contributing to the overall sustainability of their adoption in software development. Also, it sheds light on redundant and noisy information in source code analysis corpora, as shown by the substantial sequence compression achieved by Alpine.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2025
National Category
Software Engineering Artificial Intelligence
Identifiers
urn:nbn:se:liu:diva-218874 (URN); 10.1145/3715773 (DOI)
Conference
ACM International Conference on the Foundations of Software Engineering (FSE)
Available from: 2025-10-16 Created: 2025-10-16 Last updated: 2025-12-04
Hernández López, J. A., Dura, C. & Cuadrado, J. S. (2025). Experimenting with modeling-specific word embeddings. Software and Systems Modeling, 24(6), 1647-1669
2025 (English). In: Software and Systems Modeling, ISSN 1619-1366, E-ISSN 1619-1374, Vol. 24, no. 6, p. 1647-1669. Article in journal (Refereed), Published
Abstract [en]

The application of machine learning techniques to address MDE problems often requires transforming raw information (e.g., software models) into a numerical representation that can be used by machine learning algorithms. To this end, pretrained embeddings are a key technology to facilitate the construction of such applications. However, previous works have demonstrated that these embeddings struggle to generalize effectively in the MDE domain due to their training on general-purpose corpora. To tackle this issue, we developed WordE4MDE, a set of specialized word embeddings trained specifically on modeling documents. In this study, we aim to overcome several limitations of WordE4MDE and conduct additional experiments to assess its efficacy. Key limitations we address include: (1) mitigating the out-of-vocabulary issue through the utilization of sub-word embeddings, (2) adding contextualization to the embeddings by training a BERT model on our specific modeling corpus, and (3) addressing the constraint of limited training data by investigating the augmentation of our modeling corpus with StackOverflow and StackExchange data.

Place, publisher, year, edition, pages
SPRINGER HEIDELBERG, 2025
Keywords
Embeddings; Classification; Clustering; Recommendation; Machine Learning; Model-Driven Engineering
National Category
Computer Systems
Identifiers
urn:nbn:se:liu:diva-210726 (URN); 10.1007/s10270-024-01250-5 (DOI); 001376228600001 (); 2-s2.0-85212045326 (Scopus ID)
Note

Funding agencies: Agencia Estatal de Investigación [TED2021-129381B-C22, MCIN/AEI/10.13039/501100011033, PID2022-140109NB-I00, MCIN/AEI/10.13039/501100011033]; FEDER/UE [CNS2022-135578, MICIU/AEI/10.13039/501100011033]

Available from: 2025-01-10 Created: 2025-01-10 Last updated: 2026-02-24 Bibliographically approved
Sadrnezhaad, M., Hernández López, J. A., Mårtensson, T. & Varró, D. (2025). Generative AI in Simulation-Based Test Environments for Large-Scale Cyber-Physical Systems: An Industrial Study. In: Product-Focused Software Process Improvement: 26th International Conference, PROFES 2025, Salerno, Italy, December 1–3, 2025, Proceedings. Paper presented at 26th International Conference, PROFES 2025, Salerno, Italy, December 1–3, 2025 (pp. 203-219). Springer Nature, 16361
2025 (English). In: Product-Focused Software Process Improvement: 26th International Conference, PROFES 2025, Salerno, Italy, December 1–3, 2025, Proceedings, Springer Nature, 2025, Vol. 16361, p. 203-219. Conference paper, Published paper (Refereed)
Abstract [en]

Quality assurance for large-scale cyber-physical systems relies on sophisticated test activities using complex test environments investigated with the help of numerous types of simulators. As these systems grow, extensive resources are required to develop and maintain simulation models of hardware and software components, as well as physical environments. Meanwhile, recent advances in generative AI have led to tools that can produce executable test cases for software systems, offering potential benefits such as reducing manual efforts or increasing test coverage. However, the application of generative AI techniques to simulation-based testing of large-scale cyber-physical systems remains underexplored. To better understand this gap, this study captures practitioners’ perspectives on leveraging generative AI, based on a cross-company workshop with six organizations. Our contribution is twofold: (1) detailed, experience-based insights into challenges faced by engineers, and (2) a research agenda comprising three high-priority directions: (a) AI-generated scenarios and environment models, (b) simulators and AI in CI/CD pipelines, and (c) trustworthiness in generative AI for simulation. While participants acknowledged substantial potential, they also highlighted unresolved challenges. By detailing these issues, the paper aims to guide future academia-industry collaboration towards the responsible adoption of generative AI in simulation-based testing.

Place, publisher, year, edition, pages
Springer Nature, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Generative AI, Cyber-physical system, Simulation, Test environment
National Category
Computer Sciences Software Engineering
Identifiers
urn:nbn:se:liu:diva-219589 (URN); 10.1007/978-3-032-12089-2_13 (DOI); 001718768800013 (); 2-s2.0-105023306090 (Scopus ID); 9783032120892 (ISBN); 9783032120885 (ISBN)
Conference
26th International Conference, PROFES 2025, Salerno, Italy, December 1–3, 2025
Note

Funding: Vinnova competence center on Continuous Digitalization (CoDiG); Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2025-11-19 Created: 2025-11-19 Last updated: 2026-04-14
Identifiers
ORCID iD: orcid.org/0000-0003-2439-2136