Metadata/README elements for synthetic structured data made with GenAI: Recommendations to data repositories to encourage transparent, reproducible, and responsible data sharing
Linköping University, Department of Thematic Studies, The Department of Gender Studies. Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Thematic Studies, Technology and Social Change. ORCID iD: 0000-0001-5041-5018
Swedish National Data Service, University of Gothenburg.
University of Manchester.
Linköping University, Department of Thematic Studies.
2025 (English) Report (Other (popular science, discussion, etc.))
Abstract [en]

Publication of AI-generated synthetic structured data in data repositories is beginning to reveal the specific documentation elements that need to accompany synthetic datasets to ensure reproducibility and enable data reuse. This document identifies actions that research repositories can take to encourage users to provide AI-generated synthetic datasets with appropriate structure and documentation. The recommendations apply specifically to AI-generated data, not to (for example) data produced using pre-configured models or missing data created by statistical inference. Additionally, this document discusses metadata/README elements for synthetic structured datasets (tabular and multi-modal) only; it does not cover textual data from LLMs or images for computer vision.

The document is the result of a workshop held on 23 January 2025, with participants from the Swedish National Data Service, Linköping University and the University of Manchester. It also draws on survey responses about current practice from 17 data repositories and a review of existing metadata and README requirements.

Place, publisher, year, edition, pages
AI Policy Exchange Forum (AIPEX), 2025.
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:liu:diva-212766
DOI: 10.63439/MPEW5336
OAI: oai:DiVA.org:liu-212766
DiVA, id: diva2:1949237
Funder
Wallenberg AI, Autonomous Systems and Software Program – Humanity and Society (WASP-HS)
Available from: 2025-04-02 Created: 2025-04-02 Last updated: 2025-04-11

Open Access in DiVA

fulltext (227 kB), 28 downloads
File information
File name: FULLTEXT02.pdf
File size: 227 kB
Checksum SHA-512: 3fefd573758c66914734c1565737e131fc9d6ce41c926442c2c559e88e737830fee3cf4e78171d3ff404f5c05e7ac6946016be1e9dc939e78fab14704f80a4c7
Type: fulltext
Mimetype: application/pdf

Authority records

Johnson, Ericka
Hajisharif, Saghi
