A Grammar-Based Method for Instilling Empirical Dependency Structure in LLMs
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0001-7117-5594
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
2025 (English). In: Proceedings of the 9th Workshop on Constraint Grammar and Finite State NLP / [ed] Trond Trosterud, Linda Wiechetek, Flammie Pirinen, University of Tartu Library, 2025, p. 45-49. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate whether synthetic pretraining data generated from a formal grammar that models syntactic dependencies can improve English language models. Building upon the structured pretraining data approach of Papadimitriou and Jurafsky (2023), we develop a grammar that more closely mirrors empirical dependency structures. Our results are negative: this type of pretraining significantly degrades model performance, and both our pretraining approach and theirs perform worse than no pretraining at all. We analyze potential explanations for these findings and discuss implications for future work on structured-data pretraining.

Place, publisher, year, edition, pages
University of Tartu Library, 2025. p. 45-49
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-213116; ISI: 001656769600008; ISBN: 9789908531137 (electronic); OAI: oai:DiVA.org:liu-213116; DiVA, id: diva2:1953091
Conference
9th Workshop on Constraint Grammar and Finite State NLP, Tallinn, March 5, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); CUGS (National Graduate School in Computer Science)
Note

Funding Agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation; National Graduate School of Computer Science in Sweden (CUGS); Swedish Research Council [2022-06725]

Available from: 2025-04-17 Created: 2025-04-17 Last updated: 2026-02-03

Open Access in DiVA

No full text in DiVA

Authority records

Torstensson, Olle; Holmström, Oskar
