liu.se — Search for publications in DiVA
Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving
Affiliation: Zenseact. ORCID iD: 0000-0002-0194-6346
2023 (English). In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 20121-20131. Conference paper, published paper (refereed)
Abstract [en]

Existing datasets for autonomous driving (AD) often lack diversity and long-range capabilities, focusing instead on 360° perception and temporal reasoning. To address this gap, we introduce Zenseact Open Dataset (ZOD), a large-scale and diverse multimodal dataset collected over two years in various European countries, covering an area 9× that of existing datasets. ZOD boasts the highest range and resolution sensors among comparable datasets, coupled with detailed keyframe annotations for 2D and 3D objects (up to 245 m), road instance/semantic segmentation, traffic sign recognition, and road classification. We believe that this unique combination will facilitate breakthroughs in long-range perception and multi-task learning. The dataset is composed of Frames, Sequences, and Drives, designed to encompass both data diversity and support for spatio-temporal learning, sensor fusion, localization, and mapping. Frames consist of 100k curated camera images with two seconds of other supporting sensor data, while the 1473 Sequences and 29 Drives include the entire sensor suite for 20 seconds and a few minutes, respectively. ZOD is the only large-scale AD dataset released under a permissive license, allowing for both research and commercial use. More information, and an extensive devkit, can be found at zod.zenseact.com.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023. p. 20121-20131
Series
International Conference on Computer Vision (ICCV), ISSN 1550-5499, E-ISSN 2380-7504
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-209825
DOI: 10.1109/iccv51070.2023.01846
ISBN: 9798350307184 (electronic)
ISBN: 9798350307191 (print)
OAI: oai:DiVA.org:liu-209825
DiVA id: diva2:1913276
Conference
2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023
Available from: 2024-11-14. Created: 2024-11-14. Last updated: 2025-10-27.
In thesis
1. On the Road to Safe Autonomous Driving via Data, Learning, and Validation
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomous driving systems hold the promise of safer and more efficient transportation, with the potential to fundamentally reshape what everyday mobility looks like. However, to realize these promises, such systems must perform reliably in both routine driving and in rare, safety-critical situations. To this end, this thesis addresses three core aspects of autonomous driving development: data, learning, and validation.

First, we tackle the fundamental need for high-quality data by introducing the Zenseact Open Dataset (ZOD) in Paper A. ZOD is a large-scale multimodal dataset collected across diverse geographies, weather conditions, and road types throughout Europe, effectively addressing key shortcomings of existing academic datasets.

We then turn to the challenge of learning from this data. First, we develop a method that bypasses the need for intricate image signal processing pipelines and instead learns to detect objects directly from RAW image data in a supervised setting (Paper B). This reduces the reliance on hand-crafted preprocessing but still requires annotations. Although sensor data is typically abundant in the autonomous driving setting, such annotations become prohibitively expensive at scale. To overcome this bottleneck, we propose GASP (Paper C), a self-supervised method that captures structured 4D representations by jointly modeling geometry, semantics, and dynamics solely from sensor data.

Once models are trained, they must undergo rigorous validation. Yet existing evaluation methods often fall short in realism, scalability, or both. To remedy this, we introduce NeuroNCAP (Paper D), a neural rendering-based closed-loop simulation framework that enables safety-critical testing in photorealistic environments. Building on this, we present R3D2 (Paper E), a generative method for realistic insertion of non-native 3D assets into such environments, further enhancing the scalability and diversity of safety-critical testing.

Together, these contributions provide a scalable set of tools for training and validating autonomous driving systems, supporting progress both in mastering the nominal 99% of everyday driving and in validating behavior in the critical 1% of rare, safety-critical situations.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025. p. 65
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2478
National Category
Computer Vision and Learning Systems
Identifiers
URN: urn:nbn:se:liu:diva-219102
DOI: 10.3384/9789181182453
ISBN: 9789181182446
ISBN: 9789181182453
Public defence
2025-11-28, Zero, Zenit Building, Campus Valla, Linköping, 09:15 (English)
Note

Funding agencies: This thesis work was supported by the Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation, and by Zenseact AB through their industrial PhD program. The computational resources were provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) at C3SE, partially funded by the Swedish Research Council through grant agreement no. 2022-06725, and by the Berzelius resource, provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Centre.

Available from: 2025-10-27. Created: 2025-10-27. Last updated: 2025-10-27. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Ljungbergh, William
