Leveraging the Power of Data Augmentation for Transformer-based Tracking
Dalian University of Technology.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-1019-8634
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-6096-3648
Dalian University of Technology.
2024 (English) In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Institute of Electrical and Electronics Engineers (IEEE), 2024, Vol. 34, pp. 6455-6464. Conference paper, Published paper (Refereed)
Abstract [en]

Due to long-distance correlation and powerful pretrained models, transformer-based methods have initiated a breakthrough in visual object tracking performance. Previous works focus on designing effective architectures suited for tracking, but ignore that data augmentation is equally crucial for training a well-performing model. In this paper, we first explore the impact of general data augmentations on transformer-based trackers via systematic experiments, and reveal the limited effectiveness of these common strategies. Motivated by experimental observations, we then propose two data augmentation methods customized for tracking. First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples. Second, we propose a token-level feature mixing augmentation strategy, which makes the model more robust against challenges like background interference. Extensive experiments on two transformer-based trackers and six benchmarks demonstrate the effectiveness and data efficiency of our methods, especially under challenging settings, like one-shot tracking and small image resolutions. Code is available at https://github.com/zj5559/DATr.
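The token-level feature mixing mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under assumed details (a flat token layout, a fixed mixing ratio, and the hypothetical helper name `token_mix`), not the authors' implementation; the real code is in the linked repository.

```python
import numpy as np

def token_mix(feat_a, feat_b, target_mask, mix_prob=0.2, rng=None):
    """Sketch of token-level feature mixing for tracking (assumed details).

    feat_a:      (N, C) search-region tokens of the training sample
    feat_b:      (N, C) tokens taken from another sample (distractor source)
    target_mask: (N,) boolean array, True where a token overlaps the target
    mix_prob:    fraction of background tokens to replace

    Background tokens of feat_a are randomly swapped for tokens from
    feat_b, simulating background interference while leaving the
    target's own tokens untouched.
    """
    rng = rng or np.random.default_rng()
    mixed = feat_a.copy()
    bg = np.flatnonzero(~target_mask)           # background token indices
    k = int(len(bg) * mix_prob)                 # how many tokens to mix in
    chosen = rng.choice(bg, size=k, replace=False)
    mixed[chosen] = feat_b[chosen]              # inject distractor tokens
    return mixed
```

Applied during training, such a mix would be computed per batch on the backbone's token features before the prediction head, so the tracker learns to ignore injected background clutter.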

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 34, pp. 6455-6464
Series
IEEE Winter Conference on Applications of Computer Vision, ISSN 2472-6737, E-ISSN 2642-9381
National subject category
Computer graphics and computer vision
Identifiers
  • URN: urn:nbn:se:liu:diva-207504
  • DOI: 10.1109/wacv57701.2024.00634
  • ISI: 001222964606058
  • Scopus ID: 2-s2.0-85192011829
  • ISBN: 9798350318920 (digital)
  • ISBN: 9798350318937 (print)
  • OAI: oai:DiVA.org:liu-207504
  • DiVA, id: diva2:1896389
Conference
2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, Waikoloa, HI, USA, Jan. 3-8, 2024.
Note

Funding: National Natural Science Foundation of China (10.13039/501100001809); Fundamental Research Funds for the Central Universities (10.13039/501100012226)

Available from: 2024-09-10 Created: 2024-09-10 Last updated: 2025-03-20

Open Access in DiVA

Full text not available in DiVA

Other links

Publisher's full text | Scopus

Search further in DiVA

By the author/editor
Edstedt, Johan; Felsberg, Michael
By the organisation
Computer Vision; Faculty of Science & Engineering
Computer graphics and computer vision
