Search for publications in DiVA (liu.se)
Eldesokey, Abdelrahman (ORCID iD: orcid.org/0000-0003-3292-7153)
Publications (10 of 10)
Robinson, A., Eldesokey, A. & Felsberg, M. (2021). Distractor-aware video object segmentation. In: Pattern Recognition. DAGM GCPR 2021: . Paper presented at German Conference on Pattern Recognition (pp. 222-234).
Distractor-aware video object segmentation
2021 (English). In: Pattern Recognition. DAGM GCPR 2021, 2021, p. 222-234. Conference paper, Published paper (Refereed)
Abstract [en]

Semi-supervised video object segmentation is a challenging task that aims to segment a target throughout a video sequence, given an initial mask in the first frame. Discriminative approaches have demonstrated competitive performance on this task at reasonable complexity. These approaches typically formulate the problem as one-versus-one classification between the target and the background. However, in reality, a video sequence usually encompasses a target, background, and possibly other distracting objects. Those objects increase the risk of introducing false positives, especially if they share visual similarities with the target. Therefore, it is more effective to separate distractors from the background and handle them independently.

We propose a one-versus-many scheme that addresses this situation by separating distractors into their own class. This separation allows imposing special attention on challenging regions that are most likely to degrade performance. We demonstrate the benefit of this formulation by modifying the learning-what-to-learn method to be distractor-aware. Our proposed approach sets a new state-of-the-art on the DAVIS val dataset, and improves over the baseline on the DAVIS test-dev benchmark by 4.8 percentage points.
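The one-versus-many idea can be illustrated with a toy label construction. This helper is hypothetical (the actual method modifies the learning-what-to-learn segmentation network, not raw masks); it only shows how distractors get their own class instead of being folded into the background:

```python
import numpy as np

def one_vs_many_labels(target_mask, distractor_mask):
    # Merge binary masks into a 3-class label map:
    # 0 = background, 1 = target, 2 = distractor.
    labels = np.zeros_like(target_mask, dtype=np.int64)
    labels[distractor_mask.astype(bool)] = 2
    labels[target_mask.astype(bool)] = 1  # target wins on overlap
    return labels
```

A multi-class loss over such labels forces the model to attend to distractor regions explicitly, rather than treating them as ordinary background.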

Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13024
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-175117 (URN); 10.1007/978-3-030-92659-5_14 (DOI); 001500565200014 (); 2-s2.0-85124271728 (Scopus ID); 978-3-030-92658-8 (ISBN); 978-3-030-92659-5 (ISBN)
Conference
German Conference on Pattern Recognition
Available from: 2021-04-19 Created: 2021-04-19 Last updated: 2025-10-10
Eldesokey, A. & Felsberg, M. (2021). Normalized Convolution Upsampling for Refined Optical Flow Estimation. In: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: . Paper presented at 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021), Online, February 8-10, 2021 (pp. 742-752). SciTePress, 5
Normalized Convolution Upsampling for Refined Optical Flow Estimation
2021 (English). In: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, SciTePress, 2021, Vol. 5, p. 742-752. Conference paper, Published paper (Refereed)
Abstract [en]

Optical flow is a regression task where convolutional neural networks (CNNs) have led to major breakthroughs. However, this comes at the cost of major computational demands due to the use of cost-volumes and pyramidal representations. This was mitigated by producing flow predictions at a quarter of the resolution, which are upsampled using bilinear interpolation during test time. Consequently, fine details are usually lost and post-processing is needed to restore them. We propose the Normalized Convolution UPsampler (NCUP), an efficient joint upsampling approach to produce the full-resolution flow during the training of optical flow CNNs. Our proposed approach formulates the upsampling task as a sparse problem and employs normalized convolutional neural networks to solve it. We evaluate our upsampler against existing joint upsampling approaches when trained end-to-end with a coarse-to-fine optical flow CNN (PWCNet) and we show that it outperforms all other approaches on the FlyingChairs dataset while having at least one order of magnitude fewer parameters. Moreover, we test our upsampler with a recurrent optical flow CNN (RAFT) and we achieve state-of-the-art results on the Sintel benchmark with ∼ 6% error reduction, and on-par results on the KITTI dataset, while having 7.5% fewer parameters (see Figure 1). Finally, our upsampler shows better generalization capabilities than RAFT when trained and evaluated on different datasets.
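The "upsampling as a sparse problem" formulation can be sketched as follows: the low-resolution flow is scattered onto the full-resolution grid, and the resulting validity mask plays the role of the input confidence for a normalized-convolution network. The function name and the nearest-grid scattering are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def as_sparse_problem(flow_lr, scale=4):
    # Scatter low-res flow onto the full-res grid; the validity mask
    # becomes the input confidence of a normalized-convolution network.
    h, w = flow_lr.shape[:2]
    flow_hr = np.zeros((h * scale, w * scale) + flow_lr.shape[2:])
    conf = np.zeros((h * scale, w * scale))
    flow_hr[::scale, ::scale] = flow_lr   # known samples at grid sites
    conf[::scale, ::scale] = 1.0          # 1 = measured, 0 = missing
    return flow_hr, conf
```

The remaining pixels are then filled in by normalized convolutions guided by the confidence map, rather than by fixed bilinear interpolation.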

Place, publisher, year, edition, pages
SciTePress, 2021
Series
VISIGRAPP, ISSN 2184-4321
Keywords
Optical Flow Estimation CNNs; Joint Image Upsampling; Normalized Convolution; Sparse CNNs
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-175901 (URN); 10.5220/0010343707420752 (DOI); 000661288200079 (); 9789897584886 (ISBN)
Conference
16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021), Online, February 8-10, 2021
Note

Funding: Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council; European Commission [2018-04673]

Available from: 2021-05-26 Created: 2021-05-26 Last updated: 2025-02-07. Bibliographically approved
Eldesokey, A. (2021). Uncertainty-Aware Convolutional Neural Networks for Vision Tasks on Sparse Data. (Doctoral dissertation). Linköping: Linköping University Electronic Press
Uncertainty-Aware Convolutional Neural Networks for Vision Tasks on Sparse Data
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Early computer vision algorithms operated on dense 2D images captured using conventional monocular or color sensors. These sensors are passive by nature, providing limited scene representations based on light flux, and can only operate under adequate lighting conditions. These limitations hindered the development of many computer vision algorithms that require some knowledge of the scene structure under varying conditions. The emergence of active sensors such as Time-of-Flight (ToF) cameras contributed to mitigating these limitations; however, they gave rise to many novel challenges, such as data sparsity stemming from multi-path interference, and occlusion.

Many approaches have been proposed to alleviate these challenges by enhancing the acquisition process of ToF cameras or by post-processing their output. Nonetheless, these approaches are sensor- and model-specific, requiring individual tuning for each sensor. Alternatively, learning-based approaches, i.e., machine learning, are an attractive solution to these problems, learning a mapping from the original sensor output to a refined version of it. Convolutional Neural Networks (CNNs) are one example of powerful machine learning approaches, and they have demonstrated remarkable success on many computer vision tasks. Unfortunately, CNNs naturally operate on dense data and cannot efficiently handle sparse data from ToF sensors.

In this thesis, we propose a novel variation of CNNs denoted as the Normalized Convolutional Neural Networks that can directly handle sparse data very efficiently. First, we formulate a differentiable normalized convolution layer that takes in sparse data and a confidence map as input. The confidence map provides information about valid and missing pixels to the normalized convolution layer, where the missing values are interpolated from their valid vicinity. Afterwards, we propose a confidence propagation criterion that allows building cascades of normalized convolution layers similar to the standard CNNs. We evaluated our approach on the task of unguided scene depth completion and achieved state-of-the-art results using an exceptionally small network.
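A rough sketch of the normalized convolution described above is given below. It uses a fixed box filter purely for illustration; in the thesis the applicability filters are learned end-to-end and the confidence propagation criterion is part of the differentiable layer:

```python
import numpy as np

def normalized_convolution(data, conf, kernel):
    """One normalized-convolution step on 2-D sparse data.
    data:   sparse input (values at low-confidence pixels are unknown)
    conf:   per-pixel confidence in [0, 1] (0 = missing measurement)
    kernel: non-negative applicability filter (fixed here for the sketch)
    Returns the densified output and a propagated confidence map."""
    k = kernel.shape[0]
    pad = k // 2
    d = np.pad(data * conf, pad)   # confidence-weighted data
    c = np.pad(conf, pad)
    H, W = data.shape
    out = np.empty((H, W))
    out_conf = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            num = (d[i:i + k, j:j + k] * kernel).sum()
            den = (c[i:i + k, j:j + k] * kernel).sum()
            # interpolate each pixel from its valid vicinity
            out[i, j] = num / max(den, 1e-8)
            # simple propagation criterion: normalized confidence mass
            out_conf[i, j] = den / kernel.sum()
    return out, out_conf
```

Because each layer emits a new confidence map, such layers can be cascaded like standard convolutions, with confidence flowing through the whole network.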

As a second contribution, we investigated the fusion of a normalized convolution network with standard CNNs employing RGB images. We study different fusion schemes, and we provide a thorough analysis for different components of the network. By employing our best fusion strategy, we achieve state-of-the-art results on guided depth completion using a remarkably small network.

Thirdly, to provide a statistical interpretation for confidences, we derive a probabilistic framework for the normalized convolutional neural networks. This framework estimates the input confidence in a self-supervised manner and propagates it to provide a statistically valid output confidence. When compared against existing approaches for uncertainty estimation in CNNs such as Bayesian Deep Learning, our probabilistic framework provides a higher quality measure of uncertainty at a significantly lower computational cost.

Finally, we attempt to employ our framework in a common task in CNNs, namely upsampling. We formulate the upsampling problem as a sparse problem, and we employ the normalized convolutional neural networks to solve it. In comparison to existing approaches, our proposed upsampler is structure-aware while being light-weight. We test our upsampler with various optical flow estimation networks, and we show that it consistently improves the results. When integrated with a recent optical flow network, it sets a new state-of-the-art on the most challenging optical flow dataset.

Abstract [sv]

Tidiga datorseendealgoritmer arbetade med täta 2D-bilder som spelats in i gråskala eller med färgkameror. Dessa är passiva bildsensorer som under gynnsamma ljusförhållanden ger en begränsad scenrepresentation baserad endast på ljusflöde. Dessa begränsningar hämmade utvecklingen av de många datorseendealgoritmer som kräver information om scenens struktur under varierande ljusförhållanden. Utvecklingen av aktiva sensorer såsom kameror baserade på Time-of-Flight (ToF) bidrog till att lindra dessa begränsningar. Dessa gav emellertid istället upphov till många nya utmaningar, såsom bearbetning av gles data kommen av flervägsinterferens samt ocklusion.

Man har försökt tackla dessa utmaningar genom att förbättra insamlingsprocessen i ToF-kameror eller genom att efterbearbeta deras data. Tidigare föreslagna metoder har dock varit sensor- eller till och med modellspecifika där man måste ställa in varje enskild sensor. Ett attraktivt alternativ är inlärningsbaserade metoder där man istället lär sig förhållandet mellan sensordatan och en förbättrad version av dito. Ett kraftfullt exempel på inlärningsbaserade metoder är neurala faltningsnät (CNNs). Dessa har varit extremt framgångsrika inom datorseende, men förutsätter tyvärr tät data och kan därför inte på ett effektivt sätt bearbeta ToF-sensorernas glesa data.

I denna avhandling föreslår vi en ny variant av faltningsnät som vi kallar normaliserade faltningsnät (eng. Normalized Convolutional Neural Networks) och som direkt kan arbeta med gles data. Först skapar vi ett deriverbart faltningsnätlager baserat på normaliserad faltning som tar in gles data samt en konfidenskarta. Konfidenskartan innehåller information om vilka pixlar vi har mätningar för och vilka som saknar mätningar. Modulen interpolerar sedan pixlar som saknar mätningar baserat på närliggande pixlar för vilka mätningar finns. Därefter föreslår vi ett kriterie för att propagera konfidens vilket tillåter oss att bygga en kaskad av normaliserade faltningslager motsvarande kaskaden av faltningslager i ett faltningsnät. Vi utvärderade metoden på scendjupkompletteringsproblemet utan färgbilder och uppnådde state-of-the-art-prestanda med ett mycket litet nätverk.

Som ett andra bidrag undersökte vi sammanslagningen av normaliserade faltningsnät med konventionella faltningsnät som arbetar med vanliga färgbilder. Vi undersöker olika sätt att slå samman näten och ger en grundlig analys för de olika nätverksdelarna. Den bästa sammanslagningsmetoden uppnår state-of-the-art-prestanda på scendjupkompletteringsproblemet med färgbilder, återigen med ett mycket litet nätverk.

Som ett tredje bidrag försöker vi statistiskt tolka prediktionerna från det normaliserade faltningsnätet. Vi härleder ett statistiskt ramverk för detta ändamål där det normaliserade faltningsnätet via självstyrd inlärning lär sig estimera konfidenser och propagera dessa till en statistiskt korrekt sannolikhet. När vi jämför med befintliga metoder för att prediktera osäkerhet i faltningsnät, exempelvis via Bayesiansk djupinlärning, så ger vårt probabilistiska ramverk bättre estimat till en lägre beräkningskostnad.

Slutligen försöker vi använda vårt ramverk för en uppgift man ofta löser med vanliga faltningsnät, nämligen uppsampling. Vi formulerar uppsamplingsproblemet som om vi fått in gles data och löser det med normaliserade faltningsnät. Jämfört med befintliga metoder är den föreslagna metoden både medveten om lokal bildstruktur och lättviktig. Vi testar vår uppsamplare med diverse optisktflödesnät och visar att den konsekvent ger förbättrade resultat. När vi integrerar den med ett nyligen föreslaget optisktflödesnät slår vi alla befintliga metoder för estimering av optiskt flöde.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2021. p. 59
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2123
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-175307 (URN); 10.3384/diss.diva-175307 (DOI); 9789179297015 (ISBN)
Public defence
2021-06-18, Online through Zoom (contact carina.e.lindstrom@liu.se) and Ada Lovelace, B Building, Campus Valla, Linköping, 13:00 (English)
Opponent
Supervisors
Funder
Swedish Research Council [2018-04673]; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2021-05-26 Created: 2021-04-28 Last updated: 2025-02-07. Bibliographically approved
Eldesokey, A., Felsberg, M. & Khan, F. S. (2020). Confidence Propagation through CNNs for Guided Sparse Depth Regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10)
Confidence Propagation through CNNs for Guided Sparse Depth Regression
2020 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, Vol. 42, no 10. Article in journal (Refereed). Published
Abstract [en]

Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open research problem with numerous applications in autonomous driving, robotics, and surveillance. In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. We also propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. To integrate structural information, we also investigate fusion strategies to combine depth and RGB information in our normalized convolution network framework. In addition, we introduce the use of output confidence as auxiliary information to improve the results. The capabilities of our normalized convolution network framework are demonstrated for the problem of scene depth completion. Comprehensive experiments are performed on the KITTI-Depth and the NYU-Depth-v2 datasets. The results clearly demonstrate that the proposed approach achieves superior performance while requiring only about 1-5% of the number of parameters compared to the state-of-the-art methods.

Place, publisher, year, edition, pages
IEEE, 2020
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-161086 (URN); 10.1109/TPAMI.2019.2929170 (DOI); 000567471300008 ()
Note

Funding agencies: Vinnova (grant CYCLA); Swedish Research Council [2018-04673]; VR starting grant [2016-05543]

Available from: 2019-10-21 Created: 2019-10-21 Last updated: 2025-02-07
Eldesokey, A., Felsberg, M. & Khan, F. S. (2019). Propagating Confidences through CNNs for Sparse Data Regression. In: British Machine Vision Conference 2018, BMVC 2018: . Paper presented at The 29th British Machine Vision Conference (BMVC), Northumbria University, Newcastle upon Tyne, England, UK, 3-6 September, 2018. BMVA Press
Propagating Confidences through CNNs for Sparse Data Regression
2019 (English). In: British Machine Vision Conference 2018, BMVC 2018, BMVA Press, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance. To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods. Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.
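The objective that "simultaneously minimizes the data error while maximizing the output confidence" could take roughly the following shape. This is a hedged sketch only: the confidence weighting, the log-barrier term, and the trade-off `lam` are assumptions, not the paper's exact formulation:

```python
import numpy as np

def confidence_aware_loss(pred, target, out_conf, lam=0.1):
    # Data error weighted by output confidence: errors at confident
    # pixels cost more, so the net is pushed to be accurate where it
    # claims confidence ...
    data_err = (np.abs(pred - target) * out_conf).mean()
    # ... while a -log(confidence) term penalizes the trivial escape
    # of driving all confidences to zero.
    conf_term = -np.log(out_conf + 1e-8).mean()
    return data_err + lam * conf_term
```

With perfect predictions and full confidence the loss is near zero; lowering confidence without improving accuracy only increases it.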

Place, publisher, year, edition, pages
BMVA Press, 2019
National Category
Computer graphics and computer vision; Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-149648 (URN)
Conference
The 29th British Machine Vision Conference (BMVC), Northumbria University, Newcastle upon Tyne, England, UK, 3-6 September, 2018
Available from: 2018-07-13 Created: 2018-07-13 Last updated: 2025-02-01Bibliographically approved
Kristan, M., Matas, J., Leonardis, A., Felsberg, M., Pflugfelder, R., Kamarainen, J.-K., . . . Ni, Z. (2019). The Seventh Visual Object Tracking VOT2019 Challenge Results. In: 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW): . Paper presented at IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, Oct 27-Nov 02, 2019 (pp. 2206-2241). IEEE COMPUTER SOC
The Seventh Visual Object Tracking VOT2019 Challenge Results
2019 (English). In: 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), IEEE COMPUTER SOC, 2019, p. 2206-2241. Conference paper, Published paper (Refereed)
Abstract [en]

The Visual Object Tracking challenge VOT2019 is the seventh annual tracker benchmarking activity organized by the VOT initiative. Results of 81 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis as well as the standard VOT methodology for long-term tracking analysis. The VOT2019 challenge was composed of five challenges focusing on different tracking domains: (i) the VOT-ST2019 challenge focused on short-term tracking in RGB, (ii) the VOT-RT2019 challenge focused on "real-time" short-term tracking in RGB, and (iii) VOT-LT2019 focused on long-term tracking, namely coping with target disappearance and reappearance. Two new challenges have been introduced: (iv) the VOT-RGBT2019 challenge focused on short-term tracking in RGB and thermal imagery and (v) the VOT-RGBD2019 challenge focused on long-term tracking in RGB and depth imagery. The VOT-ST2019, VOT-RT2019 and VOT-LT2019 datasets were refreshed while new datasets were introduced for VOT-RGBT2019 and VOT-RGBD2019. The VOT toolkit has been updated to support standard short-term tracking, long-term tracking, and tracking with multi-channel imagery. Performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website.

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2019
Series
IEEE International Conference on Computer Vision Workshops, ISSN 2473-9936
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-169305 (URN); 10.1109/ICCVW.2019.00276 (DOI); 000554591602038 (); 9781728150239 (ISBN)
Conference
IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, Oct 27-Nov 02, 2019
Note

Funding Agencies: Slovenian Research Agency - Slovenia [J2-8175, P2-0214, P2-0094]; Czech Science Foundation Project GACR [P103/12/G084]; MURI project - MoD/Dstl; Engineering & Physical Sciences Research Council (EPSRC) [EP/N019415/1]; WASP; VR (ELLIIT, LAST, and NCNN); SSF (SymbiCloud); AIT Strategic Research Programme; Faculty of Computer Science, University of Ljubljana, Slovenia

Available from: 2020-09-12 Created: 2020-09-12 Last updated: 2025-02-07. Bibliographically approved
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Zajc, L. C., . . . He, Z. (2019). The Sixth Visual Object Tracking VOT2018 Challenge Results. In: Laura Leal-Taixé and Stefan Roth (Ed.), Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8–14, 2018 Proceedings, Part I. Paper presented at Computer Vision – ECCV 2018 Workshops, Munich, Germany, September 8–14, 2018 (pp. 3-53). Cham: Springer Publishing Company
The Sixth Visual Object Tracking VOT2018 Challenge Results
2019 (English). In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8–14, 2018 Proceedings, Part I / [ed] Laura Leal-Taixé and Stefan Roth, Cham: Springer Publishing Company, 2019, p. 3-53. Conference paper, Published paper (Refereed)
Abstract [en]

The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in the recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis and a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking subchallenge has been introduced to the set of standard VOT sub-challenges. The new subchallenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both standard short-term and the new long-term tracking subchallenges. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).

Place, publisher, year, edition, pages
Cham: Springer Publishing Company, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11129
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-161343 (URN); 10.1007/978-3-030-11009-3_1 (DOI); 000594378400001 (); 9783030110086 (ISBN); 9783030110093 (ISBN)
Conference
Computer Vision – ECCV 2018 Workshops, Munich, Germany, September 8–14, 2018
Note

Funding agencies: Slovenian Research Agency - Slovenia [P2-0214, P2-0094, J2-8175]; Czech Science Foundation [GACR P103/12/G084]; WASP; VR (EMC2); SSF (SymbiCloud); SNIC; AIT Strategic Research Programme 2017 Visua

Available from: 2019-10-30 Created: 2019-10-30 Last updated: 2025-02-01. Bibliographically approved
Nyberg, A., Eldesokey, A., Bergström, D. & Gustafsson, D. (2019). Unpaired Thermal to Visible Spectrum Transfer using Adversarial Training. In: Computer Vision - ECCV 2018 Workshops, Pt VI: . Paper presented at 15th European Conference on Computer Vision (ECCV), Munich, Germany, Sep 08-14, 2018 (pp. 657-669). Springer
Unpaired Thermal to Visible Spectrum Transfer using Adversarial Training
2019 (English). In: Computer Vision - ECCV 2018 Workshops, Pt VI, Springer, 2019, p. 657-669. Conference paper, Published paper (Refereed)
Abstract [en]

Thermal Infrared (TIR) cameras are gaining popularity in many computer vision applications due to their ability to operate under low-light conditions. Images produced by TIR cameras are usually difficult for humans to perceive visually, which limits their usability. Several methods in the literature have been proposed to address this problem by transforming TIR images into realistic visible spectrum (VIS) images. However, existing TIR-VIS datasets suffer from imperfect alignment between TIR-VIS image pairs, which degrades the performance of supervised methods. We tackle this problem by learning the transformation with an unsupervised Generative Adversarial Network (GAN) trained on unpaired TIR and VIS images. When trained and evaluated on the KAIST-MS dataset, our proposed method was shown to produce significantly more realistic and sharp VIS images than the existing state-of-the-art supervised methods. In addition, our proposed method was shown to generalize very well when evaluated on a new dataset of unseen environments.
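Training on unpaired TIR/VIS images typically relies on a cycle-consistency term (as in CycleGAN-style methods) to make the unpaired setting well-posed: translating an image to the other domain and back should recover the original. Whether this exact loss matches the paper's formulation is an assumption; the sketch below only illustrates the idea, with the two generators passed in as plain callables:

```python
import numpy as np

def cycle_consistency_loss(tir, vis, G_tv, G_vt):
    # L1 penalty on the round trip TIR -> VIS -> TIR ...
    loss_t = np.abs(G_vt(G_tv(tir)) - tir).mean()
    # ... and on the round trip VIS -> TIR -> VIS.
    loss_v = np.abs(G_tv(G_vt(vis)) - vis).mean()
    return loss_t + loss_v
```

This term is added to the usual adversarial losses; it is what removes the need for pixel-aligned TIR-VIS pairs.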

Place, publisher, year, edition, pages
Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11134
Keywords
Thermal imaging; Generative Adversarial Networks; Unsupervised learning; Colorization
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-161252 (URN); 10.1007/978-3-030-11024-6_49 (DOI); 000594200000049 (); 2-s2.0-85061729407 (Scopus ID)
Conference
15th European Conference on Computer Vision (ECCV), Munich, Germany, Sep 08-14, 2018
Available from: 2019-10-24 Created: 2019-10-24 Last updated: 2025-02-07
Eldesokey, A., Felsberg, M. & Khan, F. S. (2017). Ellipse Detection for Visual Cyclists Analysis “In the Wild”. In: Michael Felsberg, Anders Heyden and Norbert Krüger (Ed.), Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I. Paper presented at 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I (pp. 319-331). Springer, 10424
Ellipse Detection for Visual Cyclists Analysis “In the Wild”
2017 (English). In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, p. 319-331. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous driving safety is becoming a paramount issue due to the emergence of many autonomous vehicle prototypes. Safety measures ensure that autonomous vehicles are safe to operate among pedestrians, cyclists and conventional vehicles. While safety measures for pedestrians have been widely studied in the literature, little attention has been paid to safety measures for cyclists. Visual cyclist analysis is a challenging problem due to the complex structure and dynamic nature of cyclists. The dynamic model used for cyclist analysis relies heavily on the wheels. In this paper, we investigate the problem of ellipse detection for visual cyclist analysis in the wild. Our first contribution is the introduction of a new challenging annotated dataset of bicycle wheels, collected in a real-world urban environment. Our second contribution is a method that combines reliable arc selection and grouping strategies for ellipse detection. The reliable selection and grouping mechanism leads to robust ellipse detections when combined with the standard least-squares ellipse fitting approach. Our experiments clearly demonstrate that our method provides improved results, both in terms of accuracy and robustness, in challenging urban environment settings.
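The final fitting step mentioned above (least-squares ellipse fitting on grouped arc points) can be sketched as a simple algebraic conic fit. `fit_conic_lstsq` is an illustrative stand-in: it omits the reliable arc selection and grouping that are the paper's actual contribution, and uses an unconstrained least-squares solve rather than an ellipse-specific one:

```python
import numpy as np

def fit_conic_lstsq(x, y):
    # Fit the conic A x^2 + B xy + C y^2 + D x + E y = 1 to the
    # arc points (x, y) in the algebraic least-squares sense.
    M = np.column_stack([x**2, x * y, y**2, x, y])
    coeffs, *_ = np.linalg.lstsq(M, np.ones_like(x), rcond=None)
    return coeffs  # (A, B, C, D, E)
```

For points that really lie on an ellipse (e.g. a wheel rim), the recovered coefficients reproduce the conic exactly; noisy arcs are where the selection and grouping strategies matter.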

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
National Category
Computer graphics and computer vision; Computer Engineering
Identifiers
urn:nbn:se:liu:diva-145372 (URN); 10.1007/978-3-319-64689-3_26 (DOI); 000432085900026 (); 9783319646886 (ISBN); 9783319646893 (ISBN)
Conference
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I
Note

Funding agencies: VR (EMC2, ELLIIT, starting grant) [2016-05543]; Vinnova (Cykla)

Available from: 2018-02-26 Created: 2018-02-26 Last updated: 2025-02-01. Bibliographically approved
Felsberg, M., Kristan, M., Matas, J., Leonardis, A., Pflugfelder, R., Häger, G., . . . He, Z. (2016). The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results. In: Hua G., Jégou H. (Ed.), Computer Vision – ECCV 2016 Workshops. ECCV 2016.: . Paper presented at 14th European Conference on Computer Vision (ECCV) (pp. 824-849). SPRINGER INT PUBLISHING AG
The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results
2016 (English). In: Computer Vision – ECCV 2016 Workshops. ECCV 2016. / [ed] Hua G., Jégou H., SPRINGER INT PUBLISHING AG, 2016, p. 824-849. Conference paper, Published paper (Refereed)
Abstract [en]

The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge; the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, the VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensates for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.

Place, publisher, year, edition, pages
SPRINGER INT PUBLISHING AG, 2016
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 9914
Keywords
Performance evaluation; Object tracking; Thermal IR; VOT
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-133773 (URN); 10.1007/978-3-319-48881-3_55 (DOI); 000389501700055 (); 978-3-319-48881-3 (ISBN); 978-3-319-48880-6 (ISBN)
Conference
14th European Conference on Computer Vision (ECCV)
Available from: 2017-01-11 Created: 2017-01-09 Last updated: 2025-02-07