Crystal Symmetry and Machine Learning for Systematic Materials Discovery
Linköping University, Department of Physics, Chemistry and Biology, Theoretical Physics. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-0747-1289
2026 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Discovering new crystalline materials lies at the frontier of modern materials science, driving innovation in energy storage, catalysis, semiconductors, and beyond. The vastness of the chemical and structural space poses a profound challenge: the number of possible atomic arrangements grows prohibitively large with system size and composition. Traditional first-principles methods such as density functional theory (DFT) have revolutionized materials discovery, but their high computational cost limits large-scale exploration. This work addresses the combinatorial bottleneck by bringing together two complementary dimensions of modern materials discovery: data-driven predictions using machine learning and high-performance computing.

The work presented in this thesis builds on a symmetry-aware representation of crystal structures called protostructures, based on Wyckoff positions: a coordinate-free description of symmetry-related atomic sites. This formulation transforms the continuous space of atomic coordinates into a discrete and combinatorially enumerable one. We developed a machine learning model, Wren, which is trained on this representation to provide fast estimates of stability and guide exploration toward promising regions of structural space. A GPU-accelerated workflow using machine-learning-based interatomic potentials and parallelized screening allows for the evaluation of billions of candidate structures within practical timeframes.
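As a rough illustration of why a Wyckoff-based description is combinatorially enumerable, the sketch below lists every way to place a given composition on a small table of Wyckoff positions. It is not the thesis workflow: the table `TOY_TABLE`, the function names, and the occupancy rule (a position without free coordinates is used at most once, a position with free coordinates may be reused) are simplifying assumptions, and equivalent relabelings of positions are not merged.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

# Hypothetical toy table, not a real space group:
# (Wyckoff letter, multiplicity, has a free coordinate)
Position = Tuple[str, int, bool]
TOY_TABLE: List[Position] = [
    ("a", 1, False),
    ("b", 1, False),
    ("c", 2, True),
]


def letter_multisets(
    count: int, table: List[Position], blocked: FrozenSet[str], start: int = 0
) -> Iterator[Tuple[str, ...]]:
    """Yield tuples of letters whose multiplicities sum to `count`."""
    if count == 0:
        yield ()
        return
    for i in range(start, len(table)):
        letter, mult, free = table[i]
        if mult > count or letter in blocked:
            continue
        # A position with a free coordinate may be reused (stay at i);
        # a fixed position may be occupied only once (move on to i + 1).
        next_start = i if free else i + 1
        for rest in letter_multisets(count - mult, table, blocked, next_start):
            yield (letter,) + rest


def enumerate_assignments(
    composition: Dict[str, int], table: List[Position]
) -> Iterator[Dict[str, Tuple[str, ...]]]:
    """Yield one {element: letters} mapping per distinct site assignment."""
    elements = sorted(composition)
    fixed = {letter for letter, _, free in table if not free}

    def recurse(idx: int, blocked: FrozenSet[str]):
        if idx == len(elements):
            yield {}
            return
        element = elements[idx]
        for letters in letter_multisets(composition[element], table, blocked):
            # Fixed positions taken by this element are unavailable to the rest.
            now_blocked = blocked | (set(letters) & fixed)
            for rest in recurse(idx + 1, frozenset(now_blocked)):
                yield {element: letters, **rest}

    yield from recurse(0, frozenset())


if __name__ == "__main__":
    for assignment in enumerate_assignments({"A": 1, "B": 2}, TOY_TABLE):
        print(assignment)
```

Because the output is a finite list of discrete labels rather than continuous coordinates, each entry can be scored by a model before any atomic positions are resolved.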

Building on this framework, the presented work enumerates 39 billion binary and ternary compounds spanning the chemical space from lithium to bromine, identifying over 88,000 new structural prototypes, and about half a million new crystal structures within a stability limit of 100 meV/atom. The approach is further applied to experimentally unresolved powder diffraction data, where it reconstructs crystal structures consistent with measured patterns, demonstrating the workflow’s ability to uncover physically realizable materials beyond known prototypes.

To explore even broader regions of structural complexity, this work introduces WyckoffDiff, a diffusion-based generative model that produces novel, symmetry-consistent protostructures beyond the training distribution, some predicted to be thermodynamically stable.

Since pretrained interatomic potentials form the foundation of this work, their quality was examined through two complementary studies. The first benchmarks their accuracy in reproducing mixing enthalpies across disordered alloys. The second investigates how these potentials capture the topology of potential energy surfaces by probing energy variations along symmetry-constrained pathways, showing how different machine-learning potentials represent local minima, saddle points, and other artifacts. These two benchmarks provide insight into the reliability of the potentials for structure prediction, and the resulting findings informed the selection and parametrization of models used throughout our screening framework.

Altogether, the work presented in this thesis demonstrates that the combination of coarse-grained screening, ML-based interatomic potentials, and high-performance computing can dramatically accelerate the discovery of previously unseen crystal structures. The framework presented in the thesis expands the boundaries of computational materials discovery and represents a step toward a large-scale, perhaps even comprehensive, mapping of all stable crystal structures permitted by chemistry and symmetry.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2026. p. 89
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2510
National Category
Condensed Matter Physics
Identifiers
URN: urn:nbn:se:liu:diva-221198
DOI: 10.3384/9789181184761
ISBN: 9789181184754 (print)
ISBN: 9789181184761 (electronic)
OAI: oai:DiVA.org:liu-221198
DiVA, id: diva2:2038335
Public defence
2026-03-06, Planck, F Building, Campus Valla, Linköping, 09:15 (English)
Available from: 2026-02-13 Created: 2026-02-13 Last updated: 2026-02-13 Bibliographically approved
List of papers
1. Rapid discovery of stable materials by coordinate-free coarse graining
2022 (English) In: Science Advances, E-ISSN 2375-2548, Vol. 8, no 30, article id eabn4117. Article in journal (Refereed) Published
Abstract [en]

A fundamental challenge in materials science pertains to elucidating the relationship between stoichiometry, stability, structure, and property. Recent advances have shown that machine learning can be used to learn such relationships, allowing the stability and functional properties of materials to be accurately predicted. However, most of these approaches use atomic coordinates as input and are thus bottlenecked by crystal structure identification when investigating previously unidentified materials. Our approach solves this bottleneck by coarse-graining the infinite search space of atomic coordinates into a combinatorially enumerable search space. The key idea is to use Wyckoff representations, coordinate-free sets of symmetry-related positions in a crystal, as the input to a machine learning model. Our model demonstrates exceptionally high precision in finding unknown theoretically stable materials, identifying 1569 materials that lie below the known convex hull of previously calculated materials from just 5675 ab initio calculations. Our approach opens up fundamental advances in computational materials discovery.
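To make the phrase "below the known convex hull" concrete, here is a minimal sketch for a binary A-B system. It assumes formation energies per atom measured relative to the pure elements, so the end members sit at zero, and it uses made-up numbers. The function names and the data are illustrative and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (fraction of B, formation energy in eV/atom)


def _cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(phases: List[Point]) -> List[Point]:
    """Lower convex hull of known phases (monotone-chain construction)."""
    lowest: Dict[float, float] = {}
    for x, e in phases:
        # Keep only the lowest-energy phase at each composition.
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(x: float, hull: List[Point]) -> float:
    """Linearly interpolate the hull energy at composition x."""
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return e0 + (e1 - e0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the composition range of the hull")


def energy_above_hull(x: float, e: float, hull: List[Point]) -> float:
    """Negative values mean the candidate lies below the known hull."""
    return e - hull_energy(x, hull)


# Made-up known phases; the pure elements define the zero of energy.
KNOWN = [(0.0, 0.0), (0.25, -0.1), (0.5, -0.4), (1.0, 0.0)]
HULL = lower_hull(KNOWN)

if __name__ == "__main__":
    print(HULL)
    print(energy_above_hull(0.75, -0.3, HULL))  # hypothetical candidate
```

A candidate with a negative energy above the hull would reshape the hull if confirmed, while a small positive value marks a structure that is metastable with respect to the known phases.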

Place, publisher, year, edition, pages
AMER ASSOC ADVANCEMENT SCIENCE, 2022
National Category
Textile, Rubber and Polymeric Materials
Identifiers
urn:nbn:se:liu:diva-187733 (URN); 10.1126/sciadv.abn4117 (DOI); 000836554300009 (); 35895811 (PubMedID)
Note

Funding Agencies|Winton Programme for the Physics of Sustainability; Royal Society; Swiss National Science Foundation [P2BSP2_191736]; Swedish Research Council (VR) [2020-05402]; Swedish e-Science Centre (SeRC); Swedish Research Council [2018-05973]

Available from: 2022-08-30 Created: 2022-08-30 Last updated: 2026-02-13
2. Identifying crystal structures beyond known prototypes from x-ray powder diffraction spectra
2024 (English) In: Physical Review Materials, E-ISSN 2475-9953, Vol. 8, no 10, article id 103801. Article in journal (Refereed) Published
Abstract [en]

The large amount of powder diffraction data for which the corresponding crystal structures have not yet been identified suggests the existence of numerous undiscovered, physically relevant crystal structure prototypes. In this paper, we present a scheme to resolve powder diffraction data into crystal structures with precise atomic coordinates by screening the space of all possible atomic arrangements, i.e., structural prototypes, including those not previously observed, using a pre-trained machine learning (ML) model. This involves (i) enumerating all possible symmetry-confined ways in which a given composition can be accommodated in a given space group, (ii) ranking the element-assigned prototype representations using energies predicted by the pre-trained ML model, (iii) perturbing atoms along the degrees of freedom allowed by the Wyckoff positions to match the experimental diffraction data, and (iv) validating the thermodynamic stability of the material using density-functional theory. An advantage of the presented method is that it does not rely on a database of previously observed prototypes and is therefore capable of finding crystal structures with entirely new symmetric arrangements of atoms. We demonstrate the workflow on unidentified x-ray diffraction spectra from the ICDD database and identify a number of stable structures, the majority of which turn out to be derivable from known prototypes. However, at least two are found not to be part of our prior structural data sets.

Place, publisher, year, edition, pages
AMER PHYSICAL SOC, 2024
National Category
Structural Biology
Identifiers
urn:nbn:se:liu:diva-208676 (URN); 10.1103/PhysRevMaterials.8.103801 (DOI); 001330003700001 ()
Note

Funding Agencies|Swedish Research Council (VR) [2020-05402]; Swedish e-Science Centre (SeRC); Swedish Research Council [2018-05973]

Available from: 2024-10-22 Created: 2024-10-22 Last updated: 2026-02-13
3. Evaluating and improving the predictive accuracy of mixing enthalpies and volumes in disordered alloys from universal pretrained machine learning potentials
2024 (English) In: Physical Review Materials, E-ISSN 2475-9953, Vol. 8, no 11, article id 113803. Article in journal (Refereed) Published
Abstract [en]

The advent of machine learning in materials science opens the way for exciting and ambitious simulations of large systems and long time scales with the accuracy of ab initio calculations. Recently, several pretrained universal machine learned interatomic potentials (UPMLIPs) have been published, i.e., potentials distributed with a single set of weights trained to target systems across a very wide range of chemistries and atomic arrangements. These potentials raise the hope of reducing the computational cost and methodological complexity of performing simulations compared to models that require for-purpose training. However, the application of these models needs critical evaluation to assess their usability across material types and properties. In this work, we investigate the application of the following UPMLIPs to alloy theory: MACE, CHGNET, and M3GNET. We calculate the mixing enthalpies and volumes of 21 binary alloy systems and compare the results with DFT calculations to assess the performance of these potentials over different properties and types of materials. We find that the small relative energies necessary to correctly predict mixing energies are generally not reproduced by these methods with sufficient accuracy to describe correct mixing behaviors. However, the performance can be significantly improved by supplementing the training set with relevant data. The potentials can also be used to partially accelerate these calculations by replacing the ab initio structural relaxation step.
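The quantity being benchmarked can be written down in a few lines. The sketch below assumes zero pressure, so enthalpy and total energy coincide, and takes all energies per atom. The function names and the numbers in the example are hypothetical and do not come from the paper.

```python
def mixing_enthalpy(e_alloy: float, x_b: float, e_a: float, e_b: float) -> float:
    """Mixing enthalpy per atom of A(1-x)B(x), in the units of the inputs.

    e_alloy, e_a and e_b are energies per atom of the alloy and of the pure
    end members; x_b is the atomic fraction of B.
    """
    if not 0.0 <= x_b <= 1.0:
        raise ValueError("x_b must lie between 0 and 1")
    return e_alloy - ((1.0 - x_b) * e_a + x_b * e_b)


def same_mixing_tendency(h_model: float, h_reference: float) -> bool:
    """True if both values agree on mixing (negative) vs. demixing (positive)."""
    return (h_model > 0) == (h_reference > 0)


if __name__ == "__main__":
    # Made-up energies in eV/atom for an equiatomic alloy.
    h = mixing_enthalpy(e_alloy=-3.95, x_b=0.5, e_a=-4.0, e_b=-3.8)
    print(f"mixing enthalpy: {h * 1000:.1f} meV/atom")
```

Because the result is a small difference between large total energies, an error of a few tens of meV/atom in any one term can flip its sign, which is the sensitivity the abstract describes.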

Place, publisher, year, edition, pages
AMER PHYSICAL SOC, 2024
National Category
Theoretical Chemistry
Identifiers
urn:nbn:se:liu:diva-210045 (URN); 10.1103/PhysRevMaterials.8.113803 (DOI); 001356380700001 ()
Note

Funding Agencies|Swedish Research Council (VR) [2020-05402]; Swedish Government Strategic Research Area in Materials Science on Functional Materials at Linköping University [2009-00971]; Swedish e-Science Centre (SeRC) - Swedish Research Council [2022-06725]

Available from: 2024-11-27 Created: 2024-11-27 Last updated: 2026-02-13
4. WyckoffDiff – A Generative Diffusion Model for Crystal Symmetry
2025 (English) In: Proceedings of the 42nd International Conference on Machine Learning, PMLR, 2025, Vol. 267, p. 15130-15147. Conference paper, Published paper (Refereed)
Abstract [en]

Crystalline materials often exhibit a high level of symmetry. However, most generative models do not account for symmetry, but rather model each atom without any constraints on its position or element. We propose a generative model, Wyckoff Diffusion (WyckoffDiff), which generates symmetry-based descriptions of crystals. This is enabled by considering a crystal structure representation that encodes all symmetry, and we design a novel neural network architecture which enables using this representation inside a discrete generative model framework. In addition to respecting symmetry by construction, the discrete nature of our model enables fast generation. We additionally present a new metric, Fréchet Wrenformer Distance, which captures the symmetry aspects of the materials generated, and we benchmark WyckoffDiff against recently proposed generative models for crystal generation. As a proof-of-concept study, we use WyckoffDiff to find new materials below the convex hull of thermodynamical stability.

Place, publisher, year, edition, pages
PMLR, 2025
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Condensed Matter Physics; Artificial Intelligence
Identifiers
urn:nbn:se:liu:diva-218524 (URN)
Conference
ICML 2025, Forty-Second International Conference on Machine Learning, Vancouver Convention Center, Sun. July 13th through Sat. July 19th
Available from: 2025-10-07 Created: 2025-10-07 Last updated: 2026-02-13

Open Access in DiVA

fulltext (77474 kB), 89 downloads
File information
File name: FULLTEXT01.pdf
File size: 77474 kB
Checksum SHA-512:
78fc936548e3b4051ac20709cef32766e88b6be3b227f417fa5c28385ce995a3556e00fdc64f23c38b905fda99431122fc47d2c281b0545a17437ca14e032e17
Type: fulltext
Mimetype: application/pdf
Authority records

Parackal, Abhijith S.
