Picking out the bad apples: unsupervised biometric data filtering for refined age estimation
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia. ORCID iD: 0000-0002-5861-7076
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-6763-5487
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia.
2023 (English). In: The Visual Computer, ISSN 0178-2789, E-ISSN 1432-2315, Vol. 39, p. 219-237. Article in journal (Refereed), Published
Abstract [en]

Introduction of large training datasets was essential for the recent advancement and success of deep learning methods. Due to the difficulties related to biometric data collection, facial image datasets with biometric trait labels are scarce and usually limited in terms of size and sample diversity. Web-scraping approaches for automatic data collection can produce large amounts of weakly labeled and noisy data. This work is focused on picking out the bad apples from web-scraped facial datasets by automatically removing erroneous samples that impair their usability. The unsupervised facial biometric data filtering method presented in this work greatly reduces label noise levels in web-scraped facial biometric data. Experiments on two large state-of-the-art web-scraped datasets demonstrate the effectiveness of the proposed method with respect to real and apparent age estimation based on five different age estimation methods. Furthermore, we apply the proposed method, together with a newly devised strategy for merging multiple datasets, to data collected from three major web-based data sources (i.e., IMDb, Wikipedia, Google) and derive the new Biometrically Filtered Famous Figure Dataset or B3FD. The proposed dataset, which is made publicly available, enables considerable performance gains for all tested age estimation methods and age estimation tasks. This work highlights the importance of training data quality compared to data quantity and selection of the estimation method.

Place, publisher, year, edition, pages
Heidelberg, Germany: Springer, 2023. Vol. 39, p. 219-237
Keywords [en]
Filtering, Biometric, Unsupervised, Web scraping, Age estimation, Dataset design
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-182685
DOI: 10.1007/s00371-021-02323-y
ISI: 000740610100001
Scopus ID: 2-s2.0-85122654875
OAI: oai:DiVA.org:liu-182685
DiVA, id: diva2:1634548
Note

Funding: The author K. Besenic receives a Ph.D. scholarship from the company Visage Technologies.

Available from: 2022-02-02. Created: 2022-02-02. Last updated: 2023-05-15. Bibliographically approved

By author/editor
Bešenić, Krešimir; Ahlberg, Jörgen