Publications (10 of 210)
Edstedt, J., Bökman, G., Wadenbäck, M. & Felsberg, M. (2024). DeDoDe: Detect, Don’t Describe — Describe, Don’t Detect for Local Feature Matching. In: 2024 International Conference on 3D Vision (3DV). Paper presented at the International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), Davos, Switzerland, 18-21 March 2024. Institute of Electrical and Electronics Engineers (IEEE)
DeDoDe: Detect, Don’t Describe — Describe, Don’t Detect for Local Feature Matching
2024 (English). In: 2024 International Conference on 3D Vision (3DV), Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, published paper (Refereed)
Abstract [en]

Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene. Crucially, the detected points need to be consistent between views, i.e., correspond to the same 3D point in the scene. One of the main challenges with keypoint detection is the formulation of the learning objective. Previous learning-based methods typically jointly learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours. However, basing keypoint detection on descriptor nearest neighbours is a proxy task, which is not guaranteed to produce 3D-consistent keypoints. Furthermore, this ties the keypoints to a specific descriptor, complicating downstream usage. In this work, we instead learn keypoints directly from 3D consistency. To this end, we train the detector to detect tracks from large-scale SfM. As these points are often overly sparse, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections. To train a descriptor, we maximize the mutual nearest neighbour objective over the keypoints with a separate network. Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks. Code is provided at https://github.com/Parskatt/DeDoDe.
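
The mutual nearest-neighbour objective mentioned above rests on a standard matching criterion. As a point of reference, here is a minimal NumPy sketch of mutual nearest-neighbour matching; the random descriptors stand in for real network outputs, and this is not the paper's training code:

```python
import numpy as np

def mutual_nearest_neighbours(desc_a, desc_b):
    """Return index pairs (i, j) such that desc_a[i] and desc_b[j] are
    each other's nearest neighbour in Euclidean distance."""
    # Squared pairwise distances via ||a-b||^2 = ||a||^2 - 2ab + ||b||^2.
    d2 = ((desc_a ** 2).sum(1)[:, None]
          - 2 * desc_a @ desc_b.T
          + (desc_b ** 2).sum(1)[None, :])
    nn_ab = d2.argmin(axis=1)      # nearest B index for every A descriptor
    nn_ba = d2.argmin(axis=0)      # nearest A index for every B descriptor
    i = np.arange(desc_a.shape[0])
    mutual = nn_ba[nn_ab] == i     # keep pairs that agree in both directions
    return np.stack([i[mutual], nn_ab[mutual]], axis=1)

# Example: two random 500-keypoint, 256-D descriptor sets.
rng = np.random.default_rng(0)
matches = mutual_nearest_neighbours(rng.normal(size=(500, 256)),
                                    rng.normal(size=(500, 256)))
```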

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
2024 International Conference on 3D Vision (3DV), ISSN 2378-3826, E-ISSN 2475-7888
HSV category
Identifiers
urn:nbn:se:liu:diva-204892 (URN); 10.1109/3dv62453.2024.00035 (DOI); 9798350362459 (ISBN); 9798350362466 (ISBN)
Conference
International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), Davos, Switzerland, 18-21 March 2024
Available from: 2024-06-17. Created: 2024-06-17. Last updated: 2024-06-17.
Jonnarth, A., Zhang, Y. & Felsberg, M. (2024). High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation. In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Paper presented at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, January 3-8, 2024 (pp. 999-1008). Institute of Electrical and Electronics Engineers (IEEE)
High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation
2024 (English). In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 999-1008. Conference paper, published paper (Refereed)
Abstract [en]

Image-level weakly-supervised semantic segmentation (WSSS) reduces the usually vast data annotation cost by using surrogate segmentation masks during training. The typical approach involves training an image classification network using global average pooling (GAP) on convolutional feature maps. This enables the estimation of object locations based on class activation maps (CAMs), which identify the importance of image regions. The CAMs are then used to generate pseudo-labels, in the form of segmentation masks, to supervise a segmentation model in the absence of pixel-level ground truth. Our work is based on two techniques for improving CAMs: importance sampling, which is a substitute for GAP, and the feature similarity loss, which utilizes the heuristic that object contours almost always align with color edges in images. However, both are based on the multinomial posterior with softmax, and implicitly assume that classes are mutually exclusive, which turns out to be suboptimal in our experiments. Thus, we reformulate both techniques based on binomial posteriors of multiple independent binary problems. This has two benefits: their performance improves, and they become more general, resulting in an add-on method that can boost virtually any WSSS method. This is demonstrated on a wide variety of baselines on the PASCAL VOC dataset, improving the region similarity and contour quality of all implemented state-of-the-art methods. Experiments on the MS COCO dataset further show that our proposed add-on is well-suited for large-scale settings. Our code implementation is available at https://github.com/arvijj/hfpl.
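
The multinomial-versus-binomial distinction above amounts to a softmax across classes versus an independent sigmoid per class. A hedged PyTorch sketch of the two posteriors, with importance sampling standing in for GAP; the random logits replace a real classifier's CAMs, and the paper's exact formulation may differ:

```python
import torch

# CAM logits for a batch: (batch, classes, H, W); random values stand in
# for a real classification network's output.
logits = torch.randn(2, 20, 32, 32)

# Multinomial posterior (softmax over classes): classes compete at every
# pixel, implicitly assuming mutual exclusivity.
p_multinomial = logits.softmax(dim=1)

# Binomial posteriors (independent sigmoid per class): each class is its
# own binary problem, so co-occurring objects no longer suppress each other.
p_binomial = logits.sigmoid()

# Importance sampling as a GAP substitute: draw one spatial location per
# class in proportion to its posterior map and read off the value there.
b, c = p_binomial.shape[:2]
flat = p_binomial.flatten(2).reshape(b * c, -1)       # (b*c, H*W)
idx = torch.multinomial(flat + 1e-8, num_samples=1)   # sampled locations
image_scores = flat.gather(1, idx).reshape(b, c)      # image-level scores
```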

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
weakly supervised, semantic segmentation, importance sampling, feature similarity, class activation maps
HSV category
Identifiers
urn:nbn:se:liu:diva-202446 (URN); 10.1109/WACV57701.2024.00105 (DOI)
Conference
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, January 3-8, 2024
Available from: 2024-04-15. Created: 2024-04-15. Last updated: 2024-04-24. Bibliographically checked.
Edstedt, J., Athanasiadis, I., Wadenbäck, M. & Felsberg, M. (2023). DKM: Dense Kernelized Feature Matching for Geometry Estimation. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Paper presented at the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17-24 June 2023 (pp. 17765-17775). IEEE Communications Society
DKM: Dense Kernelized Feature Matching for Geometry Estimation
2023 (English). In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Communications Society, 2023, pp. 17765-17775. Conference paper, published paper (Refereed)
Abstract [en]

Feature matching is a challenging computer vision task that involves finding correspondences between two images of a 3D scene. In this paper we consider the dense approach instead of the more common sparse paradigm, thus striving to find all correspondences. Perhaps counter-intuitively, dense methods have previously shown inferior performance to their sparse and semi-sparse counterparts for estimation of two-view geometry. This changes with our novel dense method, which outperforms both dense and sparse methods on geometry estimation. The novelty is threefold: First, we propose a kernel regression global matcher. Secondly, we propose warp refinement through stacked feature maps and depthwise convolution kernels. Thirdly, we propose learning dense confidence through consistent depth and a balanced sampling approach for dense confidence maps. Through extensive experiments we confirm that our proposed dense method, Dense Kernelized Feature Matching, sets a new state-of-the-art on multiple geometry estimation benchmarks. In particular, we achieve an improvement on MegaDepth-1500 of +4.9 and +8.9 AUC@5° compared to the best previous sparse method and dense method respectively. Our code is provided at the following repository: https://github.com/Parskatt/DKM.
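
For intuition about the kernel regression global matcher, here is a simplified Nadaraya-Watson-style sketch in PyTorch. The function name and the cosine kernel are illustrative assumptions; DKM's actual GP-based formulation (and its warp refinement stage) is considerably more elaborate:

```python
import torch

def kernel_regression_matcher(feats_a, feats_b, coords_b, temperature=0.1):
    """Regress a coordinate in image B for every feature in image A as a
    kernel-weighted average of B's grid coordinates (a simplification of
    DKM's GP-based kernel regression)."""
    a = torch.nn.functional.normalize(feats_a, dim=-1)
    b = torch.nn.functional.normalize(feats_b, dim=-1)
    k = (a @ b.T) / temperature       # (n_a, n_b) similarity kernel
    w = k.softmax(dim=-1)             # normalised kernel weights
    return w @ coords_b               # (n_a, 2) regressed matches

# Example with a flattened 16x16 feature grid for image B.
n = 16 * 16
feats_a, feats_b = torch.randn(n, 128), torch.randn(n, 128)
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 16),
                        torch.linspace(-1, 1, 16), indexing="ij")
coords_b = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
warp_a_to_b = kernel_regression_matcher(feats_a, feats_b, coords_b)
```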

Place, publisher, year, edition, pages
IEEE Communications Society, 2023
Series
Proceedings: IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919, E-ISSN 2575-7075
HSV category
Identifiers
urn:nbn:se:liu:diva-197717 (URN); 10.1109/cvpr52729.2023.01704 (DOI); 001062531302008 (ISI); 9798350301298 (ISBN); 9798350301304 (ISBN)
Conference
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17-24 June 2023
Note

This work was supported by the Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP), funded by the Knut and Alice Wallenberg Foundation, and by the strategic research environment ELLIIT funded by the Swedish government. The computational resources were provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725, and by the Berzelius resource, provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Centre.

Available from: 2023-09-11. Created: 2023-09-11. Last updated: 2023-11-30. Bibliographically checked.
Holmquist, K., Klasén, L. & Felsberg, M. (2023). Evidential Deep Learning for Class-Incremental Semantic Segmentation. In: Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen (Eds.), Image Analysis. SCIA 2023. Paper presented at SCIA 2023, the 23rd Scandinavian Conference on Image Analysis, Sirkka, Finland, April 18–21, 2023 (pp. 32-48). Springer
Evidential Deep Learning for Class-Incremental Semantic Segmentation
2023 (English). In: Image Analysis. SCIA 2023 / [ed] Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen, Springer, 2023, pp. 32-48. Conference paper, published paper (Refereed)
Abstract [en]

Class-incremental learning is a challenging problem in machine learning that aims to extend previously trained neural networks with new classes. This is especially useful if the system is able to classify new objects despite the original training data being unavailable. Although the semantic segmentation problem has received less attention than classification, it poses distinct problems and challenges, since previous and future target classes can be unlabeled in the images of a single increment. In this case, the background, past and future classes are correlated, and a background shift occurs.

In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes. We propose to use Evidential Deep Learning to model the evidence of the classes as a Dirichlet distribution. Our method factorizes the problem into a separate foreground class probability, calculated by the expected value of the Dirichlet distribution, and an unknown class (background) probability corresponding to the uncertainty of the estimate. In our novel formulation, the background probability is implicitly modeled, avoiding the feature space clustering that comes from forcing the model to output a high background score for pixels that are not labeled as objects. Experiments on the incremental Pascal VOC and ADE20k benchmarks show that our method is superior to the state of the art, especially when repeatedly learning new classes with an increasing number of increments.
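
The Dirichlet factorisation described above can be written compactly. A minimal PyTorch sketch following the standard evidential deep learning construction (expected class probability alpha/S, uncertainty mass K/S); the function name and the ReLU evidence head are assumptions, not the paper's exact model:

```python
import torch

def evidential_posteriors(logits):
    """Map per-pixel class logits to Dirichlet evidence, then split the
    posterior into foreground class probabilities (Dirichlet mean) and
    an implicit background/unknown probability (Dirichlet uncertainty)."""
    evidence = torch.relu(logits)              # non-negative evidence e_k
    alpha = evidence + 1.0                     # Dirichlet parameters
    strength = alpha.sum(dim=1, keepdim=True)  # S = sum_k alpha_k
    p_foreground = alpha / strength            # expected class probability
    k = logits.shape[1]                        # number of known classes K
    p_unknown = k / strength                   # uncertainty mass u = K / S
    return p_foreground, p_unknown

# Example on a (batch, classes, H, W) logit map.
fg, unk = evidential_posteriors(torch.randn(1, 15, 8, 8))
```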

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13886
Keywords
Class-incremental learning, Continual-learning, Semantic Segmentation
HSV category
Identifiers
urn:nbn:se:liu:diva-193265 (URN); 10.1007/978-3-031-31438-4_3 (DOI); 9783031314377 (ISBN); 9783031314384 (ISBN)
Conference
SCIA 2023, the 23rd Scandinavian Conference on Image Analysis, Sirkka, Finland, April 18–21, 2023
Available from: 2023-04-26. Created: 2023-04-26. Last updated: 2024-04-27. Bibliographically checked.
Zhang, Y., Robinson, A., Magnusson, M. & Felsberg, M. (2023). Leveraging Optical Flow Features for Higher Generalization Power in Video Object Segmentation. In: 2023 IEEE International Conference on Image Processing: Proceedings. Paper presented at the 2023 IEEE International Conference on Image Processing (ICIP), 8–11 October 2023, Kuala Lumpur, Malaysia (pp. 326-330). IEEE
Leveraging Optical Flow Features for Higher Generalization Power in Video Object Segmentation
2023 (English). In: 2023 IEEE International Conference on Image Processing: Proceedings, IEEE, 2023, pp. 326-330. Conference paper, published paper (Refereed)
Abstract [en]

We propose to leverage optical flow features for higher generalization power in semi-supervised video object segmentation. Optical flow is usually exploited as additional guidance information in many computer vision tasks. In video object segmentation, however, it has mainly been used in unsupervised settings or to warp or refine previously predicted masks. In contrast, we propose to directly leverage the optical flow features in the target representation. We show that this enriched representation improves the encoder-decoder approach to the segmentation task. A model to extract the combined information from the optical flow and the image is proposed, which is then used as input to the target model and the decoder network. Unlike previous methods, e.g. in tracking, where concatenation is used to integrate information from image data and optical flow, a simple yet effective attention mechanism is exploited in our work. Experiments on DAVIS 2017 and YouTube-VOS 2019 show that integrating the information extracted from optical flow into the original image branch results in a strong performance gain, especially on unseen classes, which demonstrates its higher generalization power.
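
As a rough illustration of attention-based fusion of image and flow features, here is a hypothetical PyTorch sketch with image tokens as queries and flow tokens as keys/values; the module name and layout are assumptions, and the paper's module may differ:

```python
import torch
import torch.nn as nn

class FlowImageAttentionFusion(nn.Module):
    """Fuse image and optical-flow feature maps with one cross-attention
    step instead of channel concatenation. Illustrative sketch only."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_feat, flow_feat):
        # (B, C, H, W) -> (B, H*W, C) token sequences.
        b, c, h, w = img_feat.shape
        q = img_feat.flatten(2).transpose(1, 2)     # image tokens (queries)
        kv = flow_feat.flatten(2).transpose(1, 2)   # flow tokens (keys/values)
        fused, _ = self.attn(q, kv, kv)
        out = self.norm(q + fused)                  # residual connection
        return out.transpose(1, 2).reshape(b, c, h, w)

fusion = FlowImageAttentionFusion(dim=64)
out = fusion(torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16))
```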

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
Optical flow features; Attention mechanism; Semi-supervised VOS; Generalization power
HSV category
Identifiers
urn:nbn:se:liu:diva-199057 (URN); 10.1109/ICIP49359.2023.10222542 (DOI); 001106821000063 (ISI); 9781728198354 (ISBN); 9781728198361 (ISBN)
Conference
2023 IEEE International Conference on Image Processing (ICIP), 8–11 October 2023, Kuala Lumpur, Malaysia
Available from: 2023-11-08. Created: 2023-11-08. Last updated: 2024-03-12.
Ljungbergh, W., Johnander, J., Petersson, C. & Felsberg, M. (2023). Raw or Cooked? Object Detection on RAW Images. In: Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen (Eds.), Image Analysis: 22nd Scandinavian Conference, SCIA 2023, Sirkka, Finland, April 18–21, 2023, Proceedings, Part I. Paper presented at the Scandinavian Conference on Image Analysis, Sirkka, Finland, April 18–21, 2023 (pp. 374-385). Springer, 13885
Raw or Cooked? Object Detection on RAW Images
2023 (English). In: Image Analysis: 22nd Scandinavian Conference, SCIA 2023, Sirkka, Finland, April 18–21, 2023, Proceedings, Part I / [ed] Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen, Springer, 2023, Vol. 13885, pp. 374-385. Conference paper, published paper (Refereed)
Abstract [en]

Images fed to a deep neural network have in general undergone several handcrafted image signal processing (ISP) operations, all of which have been optimized to produce visually pleasing images. In this work, we investigate the hypothesis that the intermediate representation of visually pleasing images is sub-optimal for downstream computer vision tasks compared to the RAW image representation. We suggest that the operations of the ISP instead should be optimized towards the end task, by learning the parameters of the operations jointly during training. We extend previous works on this topic and propose a new learnable operation that enables an object detector to achieve superior performance when compared to both previous works and traditional RGB images. In experiments on the open PASCALRAW dataset, we empirically confirm our hypothesis.
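
To illustrate the idea of learning ISP parameters jointly with the detector, here is a deliberately tiny PyTorch sketch of a learnable tone-mapping stage; the parameterisation is hypothetical and much simpler than the learnable operation proposed in the paper:

```python
import torch
import torch.nn as nn

class LearnableToneMap(nn.Module):
    """A tiny learnable ISP stage: a gain and a gamma exponent learned
    jointly with the downstream detector, replacing a handcrafted,
    visually tuned tone curve. Illustrative assumption, not the paper's
    exact operation."""
    def __init__(self):
        super().__init__()
        self.log_gain = nn.Parameter(torch.zeros(1))   # multiplicative gain
        self.log_gamma = nn.Parameter(torch.zeros(1))  # tone-curve exponent

    def forward(self, raw):
        # raw: (B, 1, H, W) sensor values normalised to [0, 1].
        x = raw.clamp(min=1e-6) * self.log_gain.exp()
        return x.pow(self.log_gamma.exp())

# Gradients from the detection loss flow into the ISP parameters, so the
# "cooking" is optimised for the end task rather than for visual appeal.
isp = LearnableToneMap()
processed = isp(torch.rand(2, 1, 128, 128))
```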

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13885
Keywords
Computer Vision, Object detection, RAW images, Image Signal Processing
HSV category
Identifiers
urn:nbn:se:liu:diva-199000 (URN); 10.1007/978-3-031-31435-3_25 (DOI); 2-s2.0-85161382246 (Scopus ID); 9783031314346 (ISBN); 9783031314353 (ISBN)
Conference
Scandinavian Conference on Image Analysis, Sirkka, Finland, April 18–21, 2023
Research funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-11-06. Created: 2023-11-06. Last updated: 2023-11-09. Bibliographically checked.
Brissman, E., Johnander, J., Danelljan, M. & Felsberg, M. (2023). Recurrent Graph Neural Networks for Video Instance Segmentation. International Journal of Computer Vision, 131, 471-495
Recurrent Graph Neural Networks for Video Instance Segmentation
2023 (English). In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 131, pp. 471-495. Journal article (Refereed). Published
Abstract [en]

Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state-of-the-art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
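
A schematic PyTorch sketch of one recurrent graph step over tracks and detections (each track aggregates messages from the current frame's detections, and a GRU cell carries the track state across frames); the class name and mean aggregation are assumptions, not the published architecture:

```python
import torch
import torch.nn as nn

class RecurrentTrackUpdate(nn.Module):
    """One frame of a recurrent graph update on a tracks-and-detections
    graph. Schematic sketch, not the full published model."""
    def __init__(self, dim):
        super().__init__()
        self.message = nn.Linear(2 * dim, dim)  # edge message from (track, det)
        self.gru = nn.GRUCell(dim, dim)         # recurrent track memory

    def forward(self, track_state, det_feats):
        # track_state: (n_tracks, dim); det_feats: (n_dets, dim)
        n_t, n_d = track_state.shape[0], det_feats.shape[0]
        pairs = torch.cat([track_state[:, None].expand(-1, n_d, -1),
                           det_feats[None].expand(n_t, -1, -1)], dim=-1)
        msg = self.message(pairs).mean(dim=1)   # aggregate over detections
        return self.gru(msg, track_state)       # recurrent state update

step = RecurrentTrackUpdate(dim=64)
tracks = step(torch.randn(5, 64), torch.randn(12, 64))
```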

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Detection; Tracking; Segmentation; Video
HSV category
Identifiers
urn:nbn:se:liu:diva-190196 (URN); 10.1007/s11263-022-01703-8 (DOI); 000885236800001 (ISI)
Note

Funding agencies: Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP), funded by the Knut and Alice Wallenberg Foundation; Excellence Center at Linköping-Lund in Information Technology (ELLIIT)

Available from: 2022-11-29. Created: 2022-11-29. Last updated: 2023-11-02. Bibliographically checked.
Melnyk, P., Felsberg, M. & Wadenbäck, M. (2022). Steerable 3D Spherical Neurons. In: Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, Sivan Sabato (Eds.), Proceedings of the 39th International Conference on Machine Learning. Paper presented at the International Conference on Machine Learning, Baltimore, Maryland, USA, 17-23 July 2022 (pp. 15330-15339). PMLR, 162
Steerable 3D Spherical Neurons
2022 (English). In: Proceedings of the 39th International Conference on Machine Learning / [ed] Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, Sivan Sabato, PMLR, 2022, Vol. 162, pp. 15330-15339. Conference paper, published paper (Refereed)
Abstract [en]

Emerging from low-level vision theory, steerable filters found their counterpart in prior work on steerable convolutional neural networks equivariant to rigid transformations. In our work, we propose a steerable feed-forward learning-based approach that consists of neurons with spherical decision surfaces and operates on point clouds. Such spherical neurons are obtained by conformal embedding of Euclidean space and have recently been revisited in the context of learning representations of point sets. Focusing on 3D geometry, we exploit the isometry property of spherical neurons and derive a 3D steerability constraint. After training spherical neurons to classify point clouds in a canonical orientation, we use a tetrahedron basis to quadruplicate the neurons and construct rotation-equivariant spherical filter banks. We then apply the derived constraint to interpolate the filter bank outputs and, thus, obtain a rotation-invariant network. Finally, we use a synthetic point set and real-world 3D skeleton data to verify our theoretical findings. The code is available at https://github.com/pavlo-melnyk/steerable-3d-neurons.
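
The conformal embedding and spherical decision surface can be made concrete in a few lines. A NumPy sketch following the standard hypersphere-neuron construction (the steerability machinery itself is not shown, and function names are illustrative):

```python
import numpy as np

def embed_point(x):
    """Conformal embedding of a Euclidean point x in R^n into R^(n+2)."""
    return np.concatenate([x, [-1.0, -0.5 * np.dot(x, x)]])

def sphere_neuron(center, radius):
    """Hypersphere (center, radius) as an (n+2)-vector, so that the dot
    product with an embedded point yields a spherical decision surface:
    positive inside, zero on, and negative outside the sphere."""
    c2 = np.dot(center, center)
    return np.concatenate([center, [0.5 * (c2 - radius ** 2), 1.0]])

# The dot product equals -(||x - c||^2 - r^2) / 2.
x, c, r = np.array([1.0, 0.0, 0.0]), np.zeros(3), 2.0
activation = embed_point(x) @ sphere_neuron(c, r)   # > 0: x lies inside
```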

Place, publisher, year, edition, pages
PMLR, 2022
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
HSV category
Identifiers
urn:nbn:se:liu:diva-187149 (URN); 000900064905021 (ISI)
Conference
International Conference on Machine Learning, Baltimore, Maryland, USA, 17-23 July 2022
Note

Funding: Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council [2018-04673]; strategic research environment ELLIIT

Available from: 2022-08-08. Created: 2022-08-08. Last updated: 2023-05-10.
Felsberg, M. (2022). Visual tracking: Tracking in scenes containing multiple moving objects. In: E. R. Davies, Matthew A. Turk (Eds.), Advanced Methods and Deep Learning in Computer Vision (pp. 305-336). London: Elsevier
Visual tracking: Tracking in scenes containing multiple moving objects
2022 (English). In: Advanced Methods and Deep Learning in Computer Vision / [ed] E. R. Davies, Matthew A. Turk, London: Elsevier, 2022, pp. 305-336. Book chapter, part of anthology (Refereed)
Place, publisher, year, edition, pages
London: Elsevier, 2022
HSV category
Identifiers
urn:nbn:se:liu:diva-188549 (URN); 9780128221099 (ISBN)
Available from: 2022-09-16. Created: 2022-09-16. Last updated: 2022-11-11. Bibliographically checked.
Holmquist, K., Klasén, L. & Felsberg, M. (2021). Class-Incremental Learning for Semantic Segmentation - A study. In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS). Paper presented at the 2021 Swedish Artificial Intelligence Society Workshop (SAIS), 14-15 June 2021, Sweden (pp. 25-28). IEEE
Class-Incremental Learning for Semantic Segmentation - A study
2021 (English). In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, pp. 25-28. Conference paper, published paper (Refereed)
Abstract [en]

One of the main challenges of applying deep learning in robotics is the difficulty of efficiently adapting to new tasks while maintaining performance on previous tasks. The problem of incrementally learning new tasks commonly suffers from catastrophic forgetting, in which previous knowledge is lost. Class-incremental learning for semantic segmentation addresses this problem: we want to learn new semantic classes without having access to labeled data for previously learned classes. This is a problem in industry, where few pre-trained models and open datasets exactly match the requirements. In these cases it is both expensive and labour-intensive to collect an entirely new fully-labeled dataset. Instead, collecting a smaller dataset and labeling only the new classes is much more efficient in terms of data collection. In this paper we present the class-incremental learning problem for semantic segmentation, discuss related work in terms of the more thoroughly studied classification task, and experimentally validate the current state of the art for semantic segmentation. This lays the foundation as we discuss some of the problems that still need to be investigated and improved upon in order to reach a new state of the art for class-incremental semantic segmentation.

Place, publisher, year, edition, pages
IEEE, 2021
Keywords
Industries, Deep learning, Conferences, Semantics, Labeling, Task analysis, Artificial intelligence
HSV category
Identifiers
urn:nbn:se:liu:diva-189039 (URN); 10.1109/sais53221.2021.9483955 (DOI); 000855522600007 (ISI); 9781665442367 (ISBN); 9781665442374 (ISBN)
Conference
2021 Swedish Artificial Intelligence Society Workshop (SAIS), 14-15 June 2021, Sweden
Research funder
Vinnova
Note

Funding agencies: Vinnova [2020-02838]

Available from: 2022-10-08. Created: 2022-10-08. Last updated: 2023-03-01. Bibliographically checked.
Identifiers
ORCID iD: orcid.org/0000-0002-6096-3648