Search for publications in DiVA (liu.se)
Results 251 - 300 of 725
  • 251.
    Holmkvist, Albin
    et al.
    Linköping University, Department of Electrical Engineering.
    Björkander, Max
    Linköping University, Department of Electrical Engineering.
    Learning features for extrinsic camera calibration of wide-angle cameras. 2023. Independent thesis, Advanced level (degree of Master (Two Years)), 28 HE credits. Student thesis.
    Abstract [en]

    This thesis attempts to solve the problem of estimating the extrinsic camera parameters (pitch and roll) from a wide-angle view image. The first contribution is a data generation pipeline capable of producing wide-angle distorted images with rotation and line segment annotations. This pipeline was used to produce four datasets with distortion and rotation in the range −5° to 5°. The second contribution is two neural networks aiming to estimate the roll and pitch angles: one using line segments, and one using ResNet and DenseNet features. The roll and pitch angles are predicted both directly and with vanishing points as an intermediate representation in both networks. The line segment network managed to extract line segments from distorted images and predict the roll and pitch angles with a mean error of 3.70° over all datasets. The network with features from ResNet and DenseNet performed best, with a mean angle error of 1.02° over all datasets.

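    The second network variant described above can be illustrated with a small sketch: a CNN backbone followed by a two-output regression head for roll and pitch. This is a minimal sketch, not the thesis implementation; the resnet18 backbone, the 512-dimensional feature size, and the direct angle regression (no vanishing-point intermediate) are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class RollPitchRegressor(nn.Module):
    """Hypothetical regression head: CNN features -> (roll, pitch) in degrees."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # keep everything up to and including global average pooling, drop fc
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.head = nn.Linear(512, 2)  # two outputs: roll and pitch

    def forward(self, x):                # x: (B, 3, H, W) distorted images
        f = self.features(x).flatten(1)  # (B, 512) pooled features
        return self.head(f)              # (B, 2) angle estimates

angles = RollPitchRegressor()(torch.randn(1, 3, 224, 224))  # smoke test
```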
  • 252.
    Holmquist, Karl
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Data-Driven Robot Perception in the Wild. 2023. Doctoral thesis, comprehensive summary (Other academic).
    Abstract [en]

    As technology continues to advance, interest in relieving humans of tedious or dangerous tasks through automation increases. Some of the tasks that have received increasing attention are autonomous driving, disaster relief, and forestry inspection. Developing and deploying an autonomous robotic system in this type of unconstrained environment, in a safe way, is highly challenging. The system requires precise control and high-level decision making, both of which require a robust and reliable perception system that understands the surroundings correctly.

    The main purpose of perception is to extract meaningful information from the environment, be it in the form of 3D maps, dense classification of the types of objects and surfaces, or high-level information about the position and direction of moving objects. Depending on the limitations and application of the system, various types of sensors can be used: lidars, to collect sparse depth information; cameras, to collect dense information for different parts of the visual spectrum, often the red-green-blue (RGB) bands; Inertial Measurement Units (IMUs), to estimate the ego-motion; microphones, to interact with and respond to humans; and GPS receivers, to get global position information; to mention just a few.

    This thesis investigates some of the necessities for approaching the requirements of this type of system, specifically focusing on data-driven approaches, that is, machine learning, which has been shown time and again in recent years to be the main contender for high-performance perception tasks. Although precision requirements might be high in industrial production plants, the environment there is relatively controlled and the task is fixed. Instead, this thesis studies some of the aspects necessary for complex, unconstrained environments, primarily outdoors and potentially near humans or other systems. The term in the wild refers exactly to the unconstrained nature of these environments, where the system can easily encounter something previously unseen and where it might interact with unknowing humans. Some examples of such environments are city traffic, disaster relief scenarios, and dense forests.

    This thesis will mainly focus on the following three key aspects necessary to handle the types of tasks and situations that could occur in the wild: 1) generalizing to a new environment, 2) adapting to new tasks and requirements, and 3) modeling uncertainty in the perception system. 

    First, a robotic system should be able to generalize to new environments and still function reliably. Papers B and G address this by using an intermediate representation to allow the system to handle much more diverse types of environment than otherwise possible. Paper B also investigates how robust the proposed autonomous driving system was to incorrect predictions, which is one of the likely results of changing the environment. 

    Second, a robot should be sufficiently adaptive to allow it to learn new tasks without forgetting the previous ones. Paper E proposed a way to allow incrementally adding new semantic classes to a trained model without access to the previous training data. The approach is based on utilizing the uncertainty in the predictions to model the unknown classes, marked as background. 

    Finally, the perception system will always be partially flawed, either because of the lack of modeling capabilities or because of ambiguities in the sensor data. To properly take this into account, it is fundamental that the system has the ability to estimate the certainty in the predictions. Paper F proposed a method for predicting the uncertainty in the model predictions when interpolating sparse data. Paper G addresses the ambiguities that exist when estimating the 3D pose of a human from a single camera image. 

    List of papers
    1. Computing a Collision-Free Path using the monogenic scale space
    2018 (English). In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2018, p. 8097-8102. Conference paper, Published paper (Refereed).
    Abstract [en]

    Mobile robots are used for various purposes, with functionalities that require them to move freely in environments containing both static and dynamic obstacles in order to accomplish their tasks. One of the most important capabilities for navigating a mobile robot in such an environment is finding a safe path to a goal position. This paper shows that there exists an accurate solution to the Laplace equation which allows finding a collision-free path, and that it can be efficiently calculated for a rectangular bounded domain such as a map represented as an image. This is accomplished by the use of the monogenic scale space, resulting in a vector field which describes the attracting and repelling forces from the obstacles and the goal. The method is shown to work in reasonably convex domains, and, through tessellation of the environment map, in non-convex environments.

    Place, publisher, year, edition, pages
    IEEE, 2018
    Series
    International Conference on Intelligent Robots and Systems (IROS), ISSN 2153-0858
    National Category
    Computer Vision and Robotics (Autonomous Systems); Engineering and Technology
    Identifiers
    urn:nbn:se:liu:diva-152713 (URN); 10.1109/IROS.2018.8593583 (DOI); 000458872707044 (); 978-1-5386-8094-0 (ISBN); 978-1-5386-8095-7 (ISBN); 978-1-5386-8093-3 (ISBN)
    Conference
    IROS 2018, Madrid, Spain, October 1-5, 2018
    Note

    Funding agencies: This work was funded by the European Union's Horizon 2020 Programme under grant agreement 644839 (CENTAURO).

    Available from: 2018-11-16 Created: 2018-11-16 Last updated: 2023-03-01
    2. A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control
    2021 (English). In: 2020 25th International Conference on Pattern Recognition (ICPR), IEEE Computer Society, 2021, p. 3947-3954. Conference paper, Published paper (Refereed).
    Abstract [en]

    In this paper, we present a state-of-the-art reinforcement learning method for autonomous driving. Our approach employs temporal difference learning in a Bayesian framework to learn vehicle control signals from sensor data. The agent has access to images from a forward-facing camera, which are pre-processed to generate semantic segmentation maps. We trained our system using both ground truth and estimated semantic segmentation input. Based on our observations from a large set of experiments, we conclude that training the system on ground truth input data leads to better performance than training the system on estimated input, even if estimated input is used for evaluation. The system is trained and evaluated in a realistic simulated urban environment using the CARLA simulator. The simulator also contains a benchmark that allows for comparison with other systems and methods. The required training time of the system is shown to be lower, and the performance on the benchmark superior, compared to competing approaches.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2021
    Series
    International Conference on Pattern Recognition, ISSN 1051-4651
    Keywords
    Reinforcement Learning; Semantic Segmentation; Autonomous Driving; Bayesian method
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-178788 (URN)10.1109/ICPR48806.2021.9412200 (DOI)000678409204009 ()978-1-7281-8808-9 (ISBN)
    Conference
    25th International Conference on Pattern Recognition (ICPR), held online, January 10-15, 2021
    Note

    Funding agencies: SSF project [RIT15-0097]; Wallenberg AI, Autonomous Systems and Software Program (WASP), Knut and Alice Wallenberg Foundation

    Available from: 2021-09-01 Created: 2021-09-01 Last updated: 2023-03-01
    3. Flexible Disaster Response of Tomorrow: Final Presentation and Evaluation of the CENTAURO System
    2019 (English). In: IEEE Robotics & Automation Magazine, ISSN 1070-9932, E-ISSN 1558-223X, Vol. 26, no. 4, p. 59-72. Article in journal (Refereed), Published.
    Abstract [en]

    Mobile manipulation robots have great potential for roles in support of rescuers on disaster-response missions. Robots can operate in places too dangerous for humans and therefore can assist in accomplishing hazardous tasks while their human operators work at a safe distance. We developed a disaster-response system that consists of the highly flexible Centauro robot and suitable control interfaces, including an immersive telepresence suit and support-operator controls offering different levels of autonomy.

    Place, publisher, year, edition, pages
    IEEE - Institute of Electrical and Electronics Engineers Inc., 2019
    Keywords
    Robot sensing systems; Task analysis; Hardware; Batteries; Legged locomotion
    National Category
    Robotics
    Identifiers
    urn:nbn:se:liu:diva-162953 (URN)10.1109/MRA.2019.2941248 (DOI)000502779800009 ()
    Note

    Funding agencies: European Union (EU) [644839]

    Available from: 2020-01-02 Created: 2020-01-02 Last updated: 2023-03-01
    4. Class-Incremental Learning for Semantic Segmentation - A study
    2021 (English). In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 25-28. Conference paper, Published paper (Refereed).
    Abstract [en]

    One of the main challenges of applying deep learning in robotics is the difficulty of efficiently adapting to new tasks while maintaining the same performance on previous tasks. The problem of incrementally learning new tasks commonly suffers from catastrophic forgetting, in which previous knowledge is lost. Class-incremental learning for semantic segmentation addresses this problem: we want to learn new semantic classes without having access to labeled data for previously learned classes. This is a problem in industry, where few pre-trained models and open datasets match the requirements exactly. In these cases it is both expensive and labour-intensive to collect an entirely new fully-labeled dataset, whereas collecting a smaller dataset and labeling only the new classes is much more efficient in terms of data collection. In this paper we present the class-incremental learning problem for semantic segmentation, discuss related work in terms of the more thoroughly studied classification task, and experimentally validate the current state of the art for semantic segmentation. This lays the foundation as we discuss some of the problems that still need to be investigated and improved upon in order to reach a new state of the art for class-incremental semantic segmentation.

    Place, publisher, year, edition, pages
    IEEE, 2021
    Keywords
    Industries, Deep learning, Conferences, Semantics, Labeling, Task analysis, Artificial intelligence
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:liu:diva-189039 (URN); 10.1109/sais53221.2021.9483955 (DOI); 000855522600007 (); 9781665442367 (ISBN); 9781665442374 (ISBN)
    Conference
    2021 Swedish Artificial Intelligence Society Workshop (SAIS), 14-15 June 2021, Sweden
    Funder
    Vinnova
    Note

    Funding agencies: Vinnova [2020-02838]

    Available from: 2022-10-08 Created: 2022-10-08 Last updated: 2023-03-01. Bibliographically approved.
    5. Evidential Deep Learning for Class-Incremental Semantic Segmentation
    2023 (English). In: Image Analysis. SCIA 2023 / [ed] Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen, Springer, 2023, p. 32-48. Conference paper, Published paper (Refereed).
    Abstract [en]

    Class-Incremental Learning is a challenging problem in machine learning that aims to extend previously trained neural networks with new classes. This is especially useful if the system is able to classify new objects even though the original training data is unavailable. Although the semantic segmentation problem has received less attention than classification, it poses distinct problems and challenges, since previous and future target classes can be unlabeled in the images of a single increment. In this case, the background, past and future classes are correlated, and a background shift arises.

    In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes. We propose to use Evidential Deep Learning to model the evidence of the classes as a Dirichlet distribution. Our method factorizes the problem into a separate foreground class probability, calculated by the expected value of the Dirichlet distribution, and an unknown class (background) probability corresponding to the uncertainty of the estimate. In our novel formulation, the background probability is implicitly modeled, avoiding the feature-space clustering that comes from forcing the model to output a high background score for pixels that are not labeled as objects. Experiments on the incremental Pascal VOC and ADE20k benchmarks show that our method is superior to the state of the art, especially when repeatedly learning new classes with an increasing number of increments.

    Place, publisher, year, edition, pages
    Springer, 2023
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13886
    Keywords
    Class-incremental learning, Continual-learning, Semantic Segmentation
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-193265 (URN); 10.1007/978-3-031-31438-4_3 (DOI); 9783031314377 (ISBN); 9783031314384 (ISBN)
    Conference
    SCIA 2023, 23rd Scandinavian Conference on Image Analysis. Sirkka, Finland, April 18–21, 2023
    Available from: 2023-04-26 Created: 2023-04-26 Last updated: 2023-05-11. Bibliographically approved.
    6. Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End
    2020 (English). In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2020, p. 12011-12020. Conference paper, Published paper (Refereed).
    Abstract [en]

    The focus in deep learning research has mostly been to push the limits of prediction accuracy. However, this was often achieved at the cost of increased complexity, raising concerns about the interpretability and reliability of deep networks. Recently, increasing attention has been given to untangling the complexity of deep networks and quantifying their uncertainty for different computer vision tasks. In contrast, the task of depth completion has not received enough attention despite the inherently noisy nature of depth sensors. In this work, we thus focus on modeling the uncertainty of depth data in depth completion, starting from the sparse noisy input all the way to the final prediction. We propose a novel approach to identify disturbed measurements in the input by learning an input confidence estimator in a self-supervised manner based on normalized convolutional neural networks (NCNNs). Further, we propose a probabilistic version of NCNNs that produces a statistically meaningful uncertainty measure for the final prediction. When we evaluate our approach on the KITTI dataset for depth completion, we outperform all existing Bayesian deep learning approaches in terms of prediction accuracy, quality of the uncertainty measure, and computational efficiency. Moreover, our small network with 670k parameters performs on par with conventional approaches with millions of parameters. These results give strong evidence that separating the network into parallel uncertainty and prediction streams leads to state-of-the-art performance with accurate uncertainty estimates.

    Place, publisher, year, edition, pages
    IEEE, 2020
    Series
    Conference on Computer Vision and Pattern Recognition (CVPR), ISSN 1063-6919, E-ISSN 2575-7075
    Keywords
    Uncertainty, Task analysis, Probabilistic logic, Measurement uncertainty, Noise measurement, Convolution, Computer vision
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-169106 (URN); 10.1109/CVPR42600.2020.01203 (DOI); 978-1-7281-7168-5 (ISBN); 978-1-7281-7169-2 (ISBN)
    Conference
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
    Available from: 2020-09-09 Created: 2020-09-09 Last updated: 2023-03-01
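    Paper 6 above builds on normalized convolutional neural networks (NCNNs). Below is a minimal single-channel sketch of the normalized-convolution step that underlies them; treating the filter as non-negative (e.g. the softplus of raw weights) is an assumption of this sketch, not necessarily the paper's exact construction.

```python
import torch
import torch.nn.functional as F

def normalized_conv2d(x, conf, weight, eps=1e-8):
    """Confidence-weighted (normalized) convolution for sparse inputs.

    x:      (B, 1, H, W) data, e.g. sparse depth
    conf:   (B, 1, H, W) input confidence in [0, 1]; 0 marks missing values
    weight: (1, 1, k, k) non-negative filter
    """
    pad = weight.shape[-1] // 2
    num = F.conv2d(x * conf, weight, padding=pad)  # confidence-weighted sum
    den = F.conv2d(conf, weight, padding=pad)      # local applicability mass
    out = num / (den + eps)                        # normalized estimate
    new_conf = den / weight.sum().clamp(min=eps)   # propagated confidence
    return out, new_conf
```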
  • 253.
    Holmquist, Karl
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Klasén, Lena
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Office of the National Police Commissioner, The Swedish Police Authority, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Evidential Deep Learning for Class-Incremental Semantic Segmentation. 2023. In: Image Analysis. SCIA 2023 / [ed] Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen, Springer, 2023, p. 32-48. Conference paper (Refereed).
    Abstract [en]

    Class-Incremental Learning is a challenging problem in machine learning that aims to extend previously trained neural networks with new classes. This is especially useful if the system is able to classify new objects even though the original training data is unavailable. Although the semantic segmentation problem has received less attention than classification, it poses distinct problems and challenges, since previous and future target classes can be unlabeled in the images of a single increment. In this case, the background, past and future classes are correlated, and a background shift arises.

    In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes. We propose to use Evidential Deep Learning to model the evidence of the classes as a Dirichlet distribution. Our method factorizes the problem into a separate foreground class probability, calculated by the expected value of the Dirichlet distribution, and an unknown class (background) probability corresponding to the uncertainty of the estimate. In our novel formulation, the background probability is implicitly modeled, avoiding the feature-space clustering that comes from forcing the model to output a high background score for pixels that are not labeled as objects. Experiments on the incremental Pascal VOC and ADE20k benchmarks show that our method is superior to the state of the art, especially when repeatedly learning new classes with an increasing number of increments.

    The full text will be freely available from 2024-04-27 20:26
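    A minimal sketch of the Dirichlet factorization described in the abstract above: per-pixel class evidence parameterizes a Dirichlet distribution, its expected value gives the foreground probabilities, and the remaining uncertainty mass acts as the implicit background probability. The softplus evidence mapping is an assumption; the paper's parameterization may differ.

```python
import torch
import torch.nn.functional as F

def dirichlet_fg_bg(logits):
    """Split (B, K, H, W) logits for K known classes into foreground
    probabilities and an implicit background (uncertainty) probability."""
    evidence = F.softplus(logits)              # non-negative class evidence
    alpha = evidence + 1.0                     # Dirichlet concentrations
    strength = alpha.sum(dim=1, keepdim=True)  # S = sum_k alpha_k
    fg = alpha / strength                      # E[p_k] under the Dirichlet
    bg = logits.shape[1] / strength            # uncertainty K / S
    return fg, bg
```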
  • 254.
    Holmquist, Karl
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Senel, Deniz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Computing a Collision-Free Path using the monogenic scale space. 2018. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2018, p. 8097-8102. Conference paper (Refereed).
    Abstract [en]

    Mobile robots are used for various purposes, with functionalities that require them to move freely in environments containing both static and dynamic obstacles in order to accomplish their tasks. One of the most important capabilities for navigating a mobile robot in such an environment is finding a safe path to a goal position. This paper shows that there exists an accurate solution to the Laplace equation which allows finding a collision-free path, and that it can be efficiently calculated for a rectangular bounded domain such as a map represented as an image. This is accomplished by the use of the monogenic scale space, resulting in a vector field which describes the attracting and repelling forces from the obstacles and the goal. The method is shown to work in reasonably convex domains, and, through tessellation of the environment map, in non-convex environments.

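    The paper above obtains the harmonic field via the monogenic scale space; as a simpler stand-in for the same idea, the sketch below relaxes Laplace's equation on an occupancy grid by Jacobi iteration and follows the potential downhill. A harmonic potential has no spurious local minima, so the descent cannot get stuck. Marking the map border as obstacle is an assumption of this sketch.

```python
import numpy as np

def harmonic_potential(occupied, goal, iters=5000):
    """Relax Laplace's equation: obstacles held at 1, the goal held at 0.

    occupied: bool grid, True = obstacle; mark the border True so the
    wrap-around of np.roll is harmless."""
    phi = np.ones(occupied.shape)
    phi[goal] = 0.0
    free = ~occupied
    free[goal] = False
    for _ in range(iters):
        avg = 0.25 * (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
                      np.roll(phi, 1, 1) + np.roll(phi, -1, 1))
        phi[free] = avg[free]  # obstacle and goal cells stay fixed
    return phi

def follow(phi, start, max_steps=100000):
    """Steepest-descent path from start to the goal cell (potential 0)."""
    r, c = start
    path = [start]
    while phi[r, c] > 0.0 and len(path) < max_steps:
        r, c = min(((r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))),
                   key=lambda p: phi[p])
        path.append((r, c))
    return path
```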
  • 255.
    Holmquist, Karl
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Wandt, Bastian
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Diffpose: Multi-hypothesis human pose estimation using diffusion models. 2023. Conference paper (Refereed).
    Abstract [en]

    Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the 2D-3D lifting step, which results in overly confident 3D pose predictors. To this end, we propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image. Compared to similar approaches, our diffusion model is straightforward and avoids intensive hyperparameter tuning, complex network structures, mode collapse, and unstable training. Moreover, we tackle the over-simplification of the intermediate representation in common two-step approaches, which first estimate a distribution of 2D joint locations via joint-wise heatmaps and then use only the position of each heatmap's maximum for the 3D pose estimation step. Since such a simplification of the heatmaps removes valid information about possibly correct, though less likely, joint locations, we propose to represent the heatmaps as a set of 2D joint candidate samples. To extract information about the original distribution from these samples, we introduce our embedding transformer, which conditions the diffusion model. Experimentally, we show that DiffPose improves upon the state of the art for multi-hypothesis pose estimation by 3-5% for simple poses and outperforms it by a large margin for highly ambiguous poses.
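
    The abstract above replaces the arg max of each joint heatmap with a set of 2D candidate samples. A minimal sketch of that representation follows; the function name and the proportional sampling scheme are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

def sample_joint_candidates(heatmap, n_samples=50, rng=None):
    """Draw candidate pixel locations with probability proportional to the
    heatmap, keeping plausible but low-probability joint positions that a
    plain arg max would discard."""
    rng = rng or np.random.default_rng()
    p = heatmap.ravel() / heatmap.sum()
    idx = rng.choice(p.size, size=n_samples, p=p)
    ys, xs = np.unravel_index(idx, heatmap.shape)
    return np.stack([xs, ys], axis=1)  # (n_samples, 2) candidates
```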

  • 256.
    Hoppmann, Kai
    et al.
    Zuse Institute Berlin, Berlin, Germany; Chair of Software and Algorithms for Discrete Optimization, TU Berlin, Berlin, Germany.
    Mexi, Gioni
    Zuse Institute Berlin, Berlin, Germany.
    Burdakov, Oleg
    Linköping University, Department of Mathematics, Optimization. Linköping University, Faculty of Science & Engineering.
    Casselgren, Carl Johan
    Linköping University, Department of Mathematics, Mathematics and Applied Mathematics. Linköping University, Faculty of Science & Engineering.
    Koch, Thorsten
    Zuse Institute Berlin, Berlin, Germany; Chair of Software and Algorithms for Discrete Optimization, TU Berlin, Berlin, Germany.
    Minimum Cycle Partition with Length Requirements. 2020. In: Integration of Constraint Programming, Artificial Intelligence, and Operations Research, 2020, Vol. 12296, p. 273-282. Conference paper (Refereed).
    Abstract [en]

    In this article we introduce a Minimum Cycle Partition Problem with Length Requirements (CPLR). This generalization of the Travelling Salesman Problem (TSP) originates from routing Unmanned Aerial Vehicles (UAVs). Apart from nonnegative edge weights, CPLR has an individual critical weight value associated with each vertex. A cycle partition, i.e., a vertex disjoint cycle cover, is regarded as a feasible solution if the length of each cycle, which is the sum of the weights of its edges, is not greater than the critical weight of each of its vertices. The goal is to find a feasible partition, which minimizes the number of cycles. In this article, a heuristic algorithm is presented together with a Mixed Integer Programming (MIP) formulation of CPLR. We furthermore introduce a conflict graph, whose cliques yield valid constraints for the MIP model. Finally, we report on computational experiments conducted on TSPLIB-based test instances.
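
    The feasibility condition from the abstract is easy to state in code: a cycle partition is feasible if each cycle's total edge weight does not exceed the critical weight of any vertex on it. The data structures below (edge weights as a dict keyed by vertex pairs, cycles as vertex lists) are illustrative assumptions.

```python
def cycle_length(cycle, weight):
    """Total weight of a cycle given as a vertex list [v0, v1, ..., vk]."""
    return sum(weight[cycle[i], cycle[(i + 1) % len(cycle)]]
               for i in range(len(cycle)))

def is_feasible(cycles, weight, critical):
    """CPLR feasibility check; minimizing len(cycles) is then the objective."""
    return all(cycle_length(c, weight) <= min(critical[v] for v in c)
               for c in cycles)
```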

  • 257.
    Horney, Tobias
    et al.
    Swedish Defence Research Agency, Sweden.
    Ahlberg, Jörgen
    Swedish Defence Research Agency, Sweden.
    Grönwall, Christina
    Swedish Defence Research Agency, Sweden.
    Folkesson, Martin
    Swedish Defence Research Agency, Sweden.
    Silvervarg, Karin
    Swedish Defence Research Agency, Sweden.
    Fransson, Jörgen
    Swedish Defence Research Agency, Sweden.
    Klasén, Lena
    Swedish Defence Research Agency, Sweden.
    Jungert, Erland
    Swedish Defence Research Agency, Sweden.
    Lantz, Fredrik
    Swedish Defence Research Agency, Sweden.
    Ulvklo, Morgan
    Swedish Defence Research Agency, Sweden.
    An information system for target recognition. 2004. In: Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications, Volume 5434 / [ed] Belur V. Dasarathy, SPIE - International Society for Optical Engineering, 2004, p. 163-175. Conference paper (Refereed).
    Abstract [en]

    We present an approach to a general decision support system. The aim is to cover the complete process for automatic target recognition, from sensor data to the user interface. The approach is based on a query-based information system and includes tasks like feature extraction from sensor data, data association, data fusion and situation analysis. Currently, we are working with data from laser radar, infrared cameras, and visual cameras, studying target recognition with cooperating sensors on one or several platforms. The sensors are typically airborne and at low altitude. The processing of sensor data is performed in two steps. First, several attributes are estimated from the (unknown but detected) target. The attributes include orientation, size, speed, temperature, etc. These estimates are used to select the models of interest in the matching step, where the target is matched against a number of target models, returning a likelihood value for each model. Several methods and sensor data types are used in both steps. The user communicates with the system via a visual user interface where, for instance, the user can mark an area on a map and ask for hostile vehicles in the chosen area. The user input is converted to a query in ΣQL, a query language developed for this type of application, and an ontological system decides which algorithms should be invoked and which sensor data should be used. The output from the sensors is fused by a fusion module and answers are given back to the user. The user does not need any detailed technical knowledge about the sensors (or which sensors are available), and new sensors and algorithms can easily be plugged into the system.

  • 258.
    Hotz, Ingrid
    et al.
    University of California, Davis, USA.
    Feng, Louis
    University of California, Davis, USA.
    Hagen, Hans
    University of Kaiserslautern.
    Hamann, Bernd
    University of California, Davis, USA.
    Joy, Ken
    University of California, Davis, USA.
    Tensor Field Visualization Using a Metric Interpretation. 2006. In: Visualization and Image Processing of Tensor Fields / [ed] Joachim Weickert, Hans Hagen, Springer, 2006, p. 269-281. Chapter in book (Refereed).
    Abstract [en]

    This chapter introduces a visualization method specifically tailored to the class of tensor fields with properties similar to stress and strain tensors. Such tensor fields play an important role in many application areas such as structural mechanics or solid-state physics. The presented technique is a global method that represents the physical meaning of these tensor fields through their central features: regions of compression or expansion. The method consists of two steps: first, the tensor field is interpreted as a distortion of a flat metric with the same topological structure; second, the resulting metric is visualized using a texture-based approach. The method supports an intuitive distinction between positive and negative eigenvalues.

  • 259.
    Hotz, Ingrid
    et al.
    University of California, Davis, USA.
    Feng, Louis
    University of California, Davis, USA.
    Hagen, Hans
    University of Kaiserslautern, Germany.
    Hamann, Bernd
    University of California, Davis, USA.
    Joy, Ken
    University of California, Davis, USA.
    Jeremic, Boris
    University of California, Davis, USA.
    Physically Based Methods for Tensor Field Visualization. 2004. Conference paper (Refereed).
    Abstract [en]

    The physical interpretation of mathematical features of tensor fields is highly application-specific. Existing visualization methods for tensor fields only cover a fraction of the broad application areas. We present a visualization method tailored specifically to the class of tensor fields exhibiting properties similar to stress and strain tensors, which are commonly encountered in geomechanics. Our technique is a global method that represents the physical meaning of these tensor fields through their central features: regions of compression or expansion. The method is based on two steps: first, we define a positive definite metric with the same topological structure as the tensor field; second, we visualize the resulting metric. The eigenvector fields are represented using a texture-based approach resembling line integral convolution (LIC) methods. The eigenvalues of the metric are encoded in free parameters of the texture definition. Our method supports an intuitive distinction between positive and negative eigenvalues. We have applied the method to synthetic and standard data sets, and to "real" data from earth science and mechanical engineering applications.

  • 260.
    Hotz, Ingrid
    et al.
    University of California, Davis.
    Feng, Louis
    University of California, Davis.
    Hamann, Bernd
    University of California, Davis, USA.
    Joy, Ken
    University of California, Davis, USA.
    Tensor-fields Visualization using a Fabric like Texture on Arbitrary two-dimensional Surfaces. 2009. In: Mathematical Foundations of Scientific Visualization / [ed] Torsten Möller, Bernd Hamann, Robert D. Russell, Springer, 2009, p. 139-155. Chapter in book (Refereed).
    Abstract [en]

    We present a visualization method for three-dimensional tensor fields based on the idea of a stretched or compressed piece of fabric used as a "texture" for two-dimensional surfaces. The texture parameters, such as the fabric density, reflect the physical properties of the tensor field. This method is especially appropriate for the visualization of stress and strain tensor fields, which play an important role in many application areas including mechanics and solid-state physics. To allow investigation of a three-dimensional field, we use a scalar field that defines a one-parameter family of iso-surfaces controlled by their iso-value. This scalar field can be a "connected" scalar field, for example pressure, or an additional scalar field representing some symmetry or inherent structure of the dataset. Texture generation consists of three steps. The first is the transformation of the tensor field into a positive definite metric. The second is the generation of an input for the final texture generation using line integral convolution (LIC); this input image consists of "bubbles" whose shape and density are controlled by the eigenvalues of the tensor field, and it incorporates the entire information content defined by the three eigenvalue fields. Convolving this input texture in the direction of the eigenvector fields provides a continuous representation. The method supports an intuitive distinction between positive and negative eigenvalues and the additional visualization of a connected scalar field.

  • 261.
    Hotz, Ingrid
    et al.
    University of Kaiserslautern.
    Hagen, Hans
    University of Kaiserslautern.
    Isometric Embedding for a Discrete Metric. 2004. In: Geometric Modeling for Scientific Visualization / [ed] Guido Brunnett, Bernd Hamann, Heinrich Müller, Lars Linsen, Springer, 2004, 1, p. 19-36. Chapter in book (Refereed).
  • 262.
    Hotz, Ingrid
    et al.
    Zuse Institute Berlin, Berlin, Germany.
    Peikert, Ronald
    ETH Zurich, Zurich, Switzerland .
    Definition of a Multifield. 2014. In: Scientific Visualization: Uncertainty, Multifield, Biomedical, and Scalable Visualization / [ed] Charles D. Hansen, Min Chen, Christopher R. Johnson, Arie E. Kaufman, Hans Hagen, Springer London, 2014, p. 105-109. Chapter in book (Refereed).
    Abstract [en]

    A challenge that visualization often faces is the complex structure of scientific data. Complexity can arise in various ways: from high dimensionalities of domains and ranges, time series of measurements, and ensemble simulations, to heterogeneous collections of data such as combinations of measured and simulated data. Many of these complexities can be subsumed under a concept of multifields, and in fact, multifield visualization has been identified as one of the major current challenges in scientific visualization. In this chapter, we propose a multifield definition, which allows us a systematic approach to discussing related research.

  • 263.
    Hotz, Ingrid
    et al.
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Schultz, Thomas
    University of Bonn, Germany.
    Visualization and Processing of Tensors and Higher Order Descriptors for Multi-Valued Data (Dagstuhl'14). 2015. Collection (editor) (Refereed).
    Abstract [en]
    • Transfer results from one application to another between which there is otherwise little exchange
    • Bring together ideas from applications and theory: applications can stimulate new basic research, and basic results can be of great use in the applications
    • Summarize the state of the art and the major open questions in the field
    • Present new and innovative work capable of advancing the field
  • 264.
    Hotz, Ingrid
    et al.
    University of California, USA.
    Sreevalsan-Nair, Jaya
    University of California, USA.
    Hagen, Hans
    Technical University of Kaiserslautern,Kaiserslautern, Germany.
    Hamann, Bernd
    University of California, USA.
    Tensor Field Reconstruction Based on Eigenvector and Eigenvalue Interpolation. 2010. In: Dagstuhl Follow-Ups, E-ISSN 1868-8977, Vol. 1, p. 110-123. Article in journal (Refereed).
    Abstract [en]

    Interpolation is an essential step in the visualization process. While most data from simulations or experiments are discrete, many visualization methods are based on smooth, continuous data approximation or interpolation methods. We introduce a new interpolation method for symmetric tensor fields given on a triangulated domain. Unlike standard tensor field interpolation, which is based on the tensor components, we use tensor invariants, eigenvectors and eigenvalues, for the interpolation. This interpolation minimizes the number of eigenvector and eigenvalue computations by restricting them to mesh vertices, and it makes an exact integration of the tensor lines possible. The tensor field topology is qualitatively the same as for component-wise interpolation. Since the interpolation decouples the "shape" and "direction" interpolation, it is shape-preserving, which is especially important for tracing fibers in diffusion MRI data.
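
    A minimal 2D sketch of the decoupled interpolation described above: eigenvalues are blended linearly, while the major-eigenvector direction is blended with the angle-doubling trick (eigenvector directions are only defined modulo pi). The angle-doubling treatment of the sign ambiguity is an assumption of this sketch, not necessarily the paper's scheme.

```python
import numpy as np

def eig2d(t):
    """Eigenvalues (major first) and major-eigenvector angle of a 2x2 symmetric tensor."""
    w, v = np.linalg.eigh(t)                 # ascending eigenvalues
    return w[::-1], np.arctan2(v[1, 1], v[0, 1])

def interp_tensors(tensors, bary):
    """Barycentric interpolation of 2x2 symmetric tensors at triangle vertices,
    performed on eigenvalues and eigenvector direction, not raw components."""
    lams = np.zeros(2)
    c = s = 0.0
    for t, b in zip(tensors, bary):
        w, theta = eig2d(t)
        lams += b * w
        c += b * np.cos(2 * theta)           # double the angle: direction mod pi
        s += b * np.sin(2 * theta)
    theta = 0.5 * np.arctan2(s, c)
    e1 = np.array([np.cos(theta), np.sin(theta)])
    e2 = np.array([-e1[1], e1[0]])
    return lams[0] * np.outer(e1, e1) + lams[1] * np.outer(e2, e2)
```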

  • 265.
    Hu, Gang
    et al.
    Xian Univ Technol, Peoples R China.
    Zheng, Yixuan
    Xian Univ Technol, Peoples R China.
    Abualigah, Laith
    Al Al Bayt Univ, Jordan; Yuan Ze Univ, Taiwan; Al Ahliyya Amman Univ, Jordan; Middle East Univ, Jordan; Appl Sci Private Univ, Jordan; Univ Sains Malaysia, Malaysia; Sunway Univ Malaysia, Malaysia.
    Hussien, Abdelazim
    Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering. Fayoum Univ, Egypt.
    DETDO: An adaptive hybrid dandelion optimizer for engineering optimization. 2023. In: Advanced Engineering Informatics, ISSN 1474-0346, E-ISSN 1873-5320, Vol. 57, article id 102004. Article in journal (Refereed).
    Abstract [en]

    Dandelion Optimizer (DO) is a recently proposed swarm intelligence algorithm modeled on the process of finding the best reproduction site for dandelion seeds. Compared with classical meta-heuristic algorithms, DO is strongly competitive, but it also has some drawbacks: weak exploitation, a tendency to fall into local optima, and slow convergence. In this paper, we propose an adaptive hybrid dandelion optimizer, DETDO, which combines three strategies, adaptive tent chaotic mapping, a differential evolution (DE) strategy, and adaptive t-distribution perturbation, to address these shortcomings. First, adaptive tent chaos mapping is used in the initialization phase to obtain a uniformly distributed high-quality initial population, which helps the algorithm enter the correct search region quickly. Second, the DE strategy is introduced to increase the diversity of dandelion populations and avoid stagnation, improving the exploitation capability and the accuracy of the optimal solution. Finally, adaptive t-distribution perturbation around the elite solution balances the exploration and exploitation phases while improving convergence speed through a reasonable transition from a Cauchy to a Gaussian distribution. The proposed DETDO is compared with classical and advanced optimization algorithms on the CEC2017 and CEC2019 test sets, and the experimental results and statistical analysis demonstrate that the algorithm has better optimization accuracy and speed. In addition, DETDO obtained the best results on six real-world engineering design problems. Finally, DETDO is applied to two bar topology optimization cases, where, under a series of complex constraints, it produces a lighter bar structure than the current scheme, further illustrating its effectiveness and applicability in practical problems. These results suggest that DETDO is a competitive swarm intelligence option for optimization problems.
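
    Of the three strategies, the chaotic initialization is the simplest to sketch. The tent map below is a common non-adaptive form; the parameter a = 0.7 and the mapping details are assumptions, since the abstract does not specify the adaptive variant.

```python
import numpy as np

def tent_chaotic_init(pop_size, dim, lower, upper, a=0.7, rng=None):
    """Initialize a population by iterating the tent map
    x' = x/a if x < a else (1-x)/(1-a), which spreads points over (0, 1)
    more evenly than independent uniform draws."""
    rng = rng or np.random.default_rng()
    pop = np.empty((pop_size, dim))
    x = rng.uniform(0.01, 0.99, size=dim)     # seed away from the fixed points
    for i in range(pop_size):
        x = np.where(x < a, x / a, (1 - x) / (1 - a))
        pop[i] = lower + x * (upper - lower)  # scale to the search bounds
    return pop
```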

  • 266.
    Hult, Evelina
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Toward Equine Gait Analysis: Semantic Segmentation and 3D Reconstruction. 2023. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    Harness racing horses are exposed to a high workload and are consequently at risk of joint injuries and lameness. In recent years, interest in applications to improve animal welfare has increased, and there is a demand for objective assessment methods that can enable early and robust diagnosis of injuries.

    In this thesis, experiments were conducted on video recordings collected by a helmet camera mounted on the driver of a sulky. The aim was to take the first steps toward equine gait analysis by investigating how semantic segmentation and 3D reconstruction of such data could be performed. Since these were the first experiments on this data, no expectations of the results existed in advance.

    Manual pixel-wise annotations were created on a small set of extracted frames, and a deep learning model for semantic segmentation was trained to localize the horse, as well as the sulky and reins. The results are promising and could probably be improved further by expanding the annotated dataset and using a larger image resolution. Structure-from-motion using COLMAP was performed to estimate the camera motion in part of a video recording. A method to filter out dynamic objects, based on masks created from predicted segmentation maps, was investigated; the results showed that the reconstruction was partly successful, but struggled when dynamic objects were not filtered out and when the equipage was moving at high speed along a straight stretch.

    Overall, the results are promising, but further development is needed to ensure robustness and to conclude whether data collected by the investigated helmet camera configuration is suitable for equine gait analysis.

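    One way to realize the dynamic-object filtering described above is through COLMAP's feature-extraction masks, where, as I understand COLMAP's convention, zero-valued pixels are ignored during feature extraction and masks are stored as <image_name>.png files in a mask directory. The sketch below converts predicted segmentation maps into such masks; the label ids are hypothetical.

```python
import numpy as np
from PIL import Image

DYNAMIC_IDS = {1, 2, 3}  # hypothetical label ids: horse, sulky, reins

def write_colmap_mask(seg_path, mask_path):
    """Zero out dynamic-object pixels so COLMAP ignores keypoints there."""
    seg = np.array(Image.open(seg_path))
    mask = np.where(np.isin(seg, list(DYNAMIC_IDS)), 0, 255).astype(np.uint8)
    Image.fromarray(mask).save(mask_path)  # save as <image_name>.png
```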
  • 267.
    Hultberg, Johanna
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Dehazing of Satellite Images. 2018. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    The aim of this work is to find a method for removing haze from satellite imagery. This is done by taking two algorithms developed for images taken from the surface of the earth and adapting them for satellite images. The two algorithms are Single Image Haze Removal Using Dark Channel Prior by He et al. and Color Image Dehazing Using the Near-Infrared by Schaul et al. Both algorithms, altered to fit satellite images, plus their combination, are applied on four sets of satellite images. The results are compared with each other and with the unaltered images. The evaluation is both qualitative, i.e. looking at the images, and quantitative, using three properties: colorfulness, contrast and saturated pixels. Both the qualitative and the quantitative evaluation determined that using only the altered version of Dark Channel Prior gives the result with the least amount of haze and the most realistic colors.

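    For reference, a compact version of the dark channel prior pipeline that the thesis adapts (He et al.): dark channel, coarse transmission estimate, and radiance recovery. This sketch omits the refinement step (soft matting or guided filtering) and assumes the atmospheric light A has already been estimated.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB followed by a local minimum filter."""
    return minimum_filter(img.min(axis=2), size=patch)

def transmission(img, airlight, omega=0.95, patch=15):
    """Coarse transmission estimate t = 1 - omega * dark_channel(I / A)."""
    return 1.0 - omega * dark_channel(img / airlight, patch)

def dehaze(img, airlight, t_min=0.1):
    """Recover scene radiance J = (I - A) / max(t, t_min) + A."""
    t = np.clip(transmission(img, airlight), t_min, 1.0)[..., None]
    return (img - airlight) / t + airlight
```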
  • 268.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improving Discriminative Correlation Filters for Visual Tracking. 2015. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
    Abstract [en]

    Generic visual tracking is one of the classical problems in computer vision. In this problem, no prior knowledge of the target is available aside from a bounding box in the initial frame of the sequence. Generic visual tracking is a difficult task due to a number of factors such as momentary occlusions, target rotations, changes in target illumination and variations in target size. In recent years, discriminative correlation filter (DCF) based trackers have shown promising results for visual tracking. These DCF-based methods use the Fourier transform to efficiently calculate detection and model updates, allowing significantly higher frame rates than competing methods. However, existing DCF-based methods only estimate the translation of the object while ignoring changes in size. This thesis investigates the problem of accurately estimating scale variations within a DCF-based framework. A novel scale estimation method is proposed by explicitly constructing translation and scale filters. The proposed scale estimation technique is robust, significantly improves tracking performance, and operates in real time. In addition, a comprehensive evaluation of feature representations in a DCF framework is performed. Experiments are performed on the benchmark OTB-2015 dataset, as well as the VOT 2014 dataset. The proposed methods are shown to significantly improve the performance of existing DCF-based trackers.

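    The core DCF machinery referred to above fits in a few lines. Below is a single-channel sketch without the usual cosine window or online model update; the DSST-style scale estimation proposed in the thesis would add a second, one-dimensional filter over scale-sampled features.

```python
import numpy as np

def train_filter(patch, response, lam=1e-2):
    """Closed-form DCF training in the Fourier domain:
    H* = (G . conj(F)) / (F . conj(F) + lam), elementwise."""
    F_ = np.fft.fft2(patch)
    G = np.fft.fft2(response)   # desired output, e.g. a centred Gaussian peak
    return (G * np.conj(F_)) / (F_ * np.conj(F_) + lam)

def locate(h_star, patch):
    """Correlate the filter with a new patch; the peak gives the translation."""
    resp = np.real(np.fft.ifft2(np.fft.fft2(patch) * h_star))
    return np.unravel_index(resp.argmax(), resp.shape)
```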
  • 269.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Learning visual perception for autonomous systems. 2021. Doctoral thesis, comprehensive summary (Other academic).
    Abstract [en]

    In the last decade, developments in hardware, sensors and software have made it possible to create increasingly autonomous systems. These systems can be as simple as driver-assistance software for lane-following in cars, or limited collision-warning systems for otherwise manually piloted drones. On the other end of the spectrum exist fully autonomous cars, boats or helicopters. With increasing abilities to function autonomously, the demands to operate with minimal human supervision in unstructured environments increase accordingly.

    Common to most, if not all, autonomous systems is that they require an accurate model of the surrounding world. While a large number of sensors useful for creating such models are currently available, cameras are among the most versatile. From a sensing perspective, cameras have several advantages over other sensors: they require no external infrastructure, are relatively cheap, and can be used to extract information such as the relative positions of other objects and their movements over time, to create accurate maps, and to locate the autonomous system within these maps.

    Using cameras to produce a model of the surroundings requires solving a number of technical problems. Often these problems have a basis in recognizing that an object or region of interest is the same over time or from novel viewpoints. In visual tracking, this type of recognition is required to follow an object of interest through a sequence of images. In geometric problems, recognizing corresponding image regions is often required in order to perform 3D reconstruction or localization.

    The first set of contributions in this thesis relates to the improvement of a class of online-learned visual object trackers based on discriminative correlation filters. In visual tracking, estimation of the object's size is important for reliable tracking; the first contribution in this part of the thesis investigates this problem. The performance of discriminative correlation filters is highly dependent on the feature representation used by the filter. The second tracking contribution investigates the performance impact of different features derived from a deep neural network.

    A second set of contributions relates to the evaluation of visual object trackers. The first of these is the Visual Object Tracking challenge, a yearly comparison of state-of-the-art visual tracking algorithms. A second contribution is an investigation into the possible issues of using bounding-box representations for ground-truth data.

    In real-world settings, tracking typically occurs over longer time sequences than is common in benchmarking datasets. In such settings it is common that the model updates of many tracking algorithms cause the tracker to fail silently. For this reason it is important to have an estimate of the tracker's performance even in cases where no ground-truth annotations exist. The first of the final three contributions investigates this problem in a robotics setting, by fusing information from a pre-trained object detector in a state-estimation framework. An additional contribution describes how to dynamically re-weight the data used for the appearance model of a tracker. A final contribution investigates how to estimate how certain detections are in a setting where geometrical limitations can be imposed on the search region. The proposed solution learns to accurately predict stereo disparities along with accurate assessments of each prediction's certainty.

    List of papers
    1. Discriminative Scale Space Tracking
    2017 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 39, no. 8, p. 1561-1575. Article in journal (Refereed), Published.
    Abstract [en]

    Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when confronted with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale-adaptive tracking approach that learns separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate than the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.

    Place, publisher, year, edition, pages
    IEEE Computer Society, 2017
    Keywords
    Visual tracking; scale estimation; correlation filters
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-139382 (URN); 10.1109/TPAMI.2016.2609928 (DOI); 000404606300006 (); 27654137 (PubMedID)
    Note

    Funding agencies: Swedish Foundation for Strategic Research; Swedish Research Council; Strategic Vehicle Research and Innovation (FFI); Wallenberg Autonomous Systems Program; National Supercomputer Centre; Nvidia

    Available from: 2017-08-07 Created: 2017-08-07 Last updated: 2023-04-03. Bibliographically approved.
    2. Convolutional Features for Correlation Filter Based Visual Tracking
    2015 (English). In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), IEEE conference proceedings, 2015, p. 621-629. Conference paper, Published paper (Refereed).
    Abstract [en]

    Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features on the visual tracking problem. We propose to use activations from the convolutional layers of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need for task-specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, and unlike in image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.

    Place, publisher, year, edition, pages
    IEEE conference proceedings, 2015
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-128869 (URN); 10.1109/ICCVW.2015.84 (DOI); 000380434700075 (); 9781467397117 (ISBN); 9781467397100 (ISBN)
    Conference
    15th IEEE International Conference on Computer Vision Workshops, ICCVW 2015, 7-13 December 2015, Santiago, Chile
    Available from: 2016-06-02 Created: 2016-06-02 Last updated: 2023-04-03. Bibliographically approved.
    3. The Visual Object Tracking VOT2017 challenge results
    2017 (English). In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW 2017), IEEE, 2017, p. 1949-1972. Conference paper, Published paper (Refereed).
    Abstract [en]

    The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies, plus a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. The performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a real-time tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.

    Place, publisher, year, edition, pages
    IEEE, 2017
    Series
    IEEE International Conference on Computer Vision Workshops, ISSN 2473-9936
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:liu:diva-145822 (URN)10.1109/ICCVW.2017.230 (DOI)000425239602001 ()978-1-5386-1034-3 (ISBN)
    Conference
    16th IEEE International Conference on Computer Vision (ICCV)
    Note

    Funding Agencies|Slovenian research agency research programs [P2-0214, P2-0094]; Slovenian research agency project [J2-8175]; Czech Science Foundation Project [GACR P103/12/G084]; WASP; VR (EMC2); SSF (SymbiCloud); SNIC; AIT Strategic Research Programme Visual Surveillance and Insight; Faculty of Computer Science, University of Ljubljana, Slovenia

    Available from: 2018-03-21 Created: 2018-03-21 Last updated: 2023-04-03
    4. Countering bias in tracking evaluations
    Open this publication in new window or tab >>Countering bias in tracking evaluations
    2018 (English)In: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications / [ed] Francisco Imai, Alain Tremeau and Jose Braz, Science and Technology Publications, Lda , 2018, Vol. 5, p. 581-587Conference paper, Published paper (Refereed)
    Abstract [en]

    Recent years have witnessed a significant leap in visual object tracking performance mainly due to powerful features, sophisticated learning methods and the introduction of benchmark datasets. Despite this significant improvement, the evaluation of state-of-the-art object trackers still relies on the classical intersection over union (IoU) score. In this work, we argue that object tracking evaluations based on the classical IoU score are sub-optimal. As our first contribution, we theoretically prove that the IoU score is biased in the case of large target objects and favors over-estimated target prediction sizes. As our second contribution, we propose a new score that is unbiased with respect to target prediction size. We systematically evaluate our proposed approach on benchmark tracking data with variations in relative target size. Our empirical results clearly suggest that the proposed score is unbiased in general.
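
    A small numeric illustration (our own, not taken from the paper) of the kind of bias discussed: with a slightly misplaced center, an over-sized prediction can score a higher IoU than a correctly sized one.

    ```python
    def iou(a, b):
        """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
        ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    target = (0, 0, 100, 100)
    same_size_shifted = (10, 0, 110, 100)  # correct size, center off by 10 px
    oversized = (0, 0, 120, 100)           # 20 px wider, same 10 px center error
    print(iou(target, same_size_shifted))  # ~0.818
    print(iou(target, oversized))          # ~0.833 -- the over-estimate scores higher
    ```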

    Place, publisher, year, edition, pages
    Science and Technology Publications, Lda, 2018
    National Category
    Signal Processing
    Identifiers
    urn:nbn:se:liu:diva-151306 (URN)10.5220/0006714805810587 (DOI)000576679800066 ()9789897582905 (ISBN)
    Conference
    13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, January 27-29, 2018, Funchal, Madeira
    Available from: 2018-09-17 Created: 2018-09-17 Last updated: 2021-07-15Bibliographically approved
    5. Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
    Open this publication in new window or tab >>Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
    2016 (English)In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1430-1438Conference paper, Published paper (Refereed)
    Abstract [en]

    Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
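
    A simplified sketch of the joint idea (the reweighting rule here is our own stand-in, not the paper's formulation): alternate between fitting a linear appearance model under sample weights and re-estimating the weights from the residuals, so corrupted samples lose influence.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 10))             # 50 training samples, 10-dim features
    y = X @ rng.normal(size=10)
    y[:5] += 5.0                              # corrupt the first five samples

    alpha = np.full(len(y), 1.0 / len(y))     # sample quality weights, sum to 1
    for _ in range(10):
        # 1) weighted least squares for the appearance model under current weights
        W = np.diag(alpha)
        w = np.linalg.solve(X.T @ W @ X + 1e-3 * np.eye(10), X.T @ W @ y)
        # 2) re-weight: samples with small residuals gain influence
        res = (X @ w - y) ** 2
        alpha = np.exp(-res / res.mean())
        alpha /= alpha.sum()

    print(alpha[:5].sum())   # the corrupted samples typically end up down-weighted
    ```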

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2016
    Series
    IEEE Conference on Computer Vision and Pattern Recognition, E-ISSN 1063-6919 ; 2016
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-137882 (URN)10.1109/CVPR.2016.159 (DOI)000400012301051 ()9781467388511 (ISBN)9781467388528 (ISBN)
    Conference
    29th IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June 2016, Las Vegas, NV, USA
    Note

    Funding Agencies|SSF (CUAS); VR (EMC2); VR (ELLIIT); Wallenberg Autonomous Systems Program; NSC; Nvidia

    Available from: 2017-06-01 Created: 2017-06-01 Last updated: 2023-04-03Bibliographically approved
    6. Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV
    Open this publication in new window or tab >>Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV
    Show others...
    2016 (English)In: Proceedings of the 12th International Symposium on Advances in Visual Computing, Springer, 2016Conference paper, Published paper (Refereed)
    Abstract [en]

    Visual object tracking performance has improved significantly in recent years. Most trackers are based on one of two paradigms: online learning of an appearance model or the use of a pre-trained object detector. Methods based on online learning provide high accuracy, but are prone to model drift. Model drift occurs when the tracker fails to correctly estimate the tracked object’s position. Methods based on a detector, on the other hand, typically have good long-term robustness, but reduced accuracy compared to online methods.

    Despite the complementarity of the aforementioned approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion between an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting that includes scenarios that are typically not well represented in standard visual tracking datasets.
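
    A hypothetical sketch of such a fusion rule (all names and thresholds are ours, not the paper's): the online tracker runs every frame, and a confident detection that disagrees with it triggers re-initialization.

    ```python
    def fused_track(frames, tracker, detector, iou_fn,
                    reinit_iou=0.3, det_conf=0.8):
        """tracker/detector are assumed objects with update/detect/reset methods."""
        boxes = []
        for frame in frames:
            box = tracker.update(frame)            # fast online appearance model
            det = detector.detect(frame)           # slower but drift-free detector
            if det is not None and det.confidence > det_conf:
                if iou_fn(box, det.box) < reinit_iou:   # tracker has likely drifted
                    tracker.reset(frame, det.box)       # re-initialize on detection
                    box = det.box
            boxes.append(box)
        return boxes
    ```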

    Place, publisher, year, edition, pages
    Springer, 2016
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-137897 (URN)10.1007/978-3-319-50835-1_50 (DOI)2-s2.0-85007039301 (Scopus ID)978-3-319-50834-4 (ISBN)978-3-319-50835-1 (ISBN)
    Conference
    International Symposium on Advances in Visual Computing
    Available from: 2017-05-31 Created: 2017-05-31 Last updated: 2023-04-03Bibliographically approved
    Download full text (pdf)
    fulltext
    Download (png)
    presentationsbild
  • 270.
    Häger, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bhat, Goutam
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Rudol, Piotr
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Doherty, Patrick
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV2016In: Proceedings of the 12th International Symposium on Advances in Visual Computing, Springer, 2016Conference paper (Refereed)
    Abstract [en]

    Visual object tracking performance has improved significantly in recent years. Most trackers are based on one of two paradigms: online learning of an appearance model or the use of a pre-trained object detector. Methods based on online learning provide high accuracy, but are prone to model drift. Model drift occurs when the tracker fails to correctly estimate the tracked object’s position. Methods based on a detector, on the other hand, typically have good long-term robustness, but reduced accuracy compared to online methods.

    Despite the complementarity of the aforementioned approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion between an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting that includes scenarios that are typically not well represented in standard visual tracking datasets.

    Download full text (pdf)
    fulltext
  • 271.
    Häger, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Persson, Mikael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Predicting Disparity Distributions2021In: 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021Conference paper (Refereed)
    Abstract [en]

    We investigate a novel deep-learning-based approach to estimate uncertainty in stereo disparity prediction networks. Current state-of-the-art methods often formulate disparity prediction as a regression problem with a single scalar output in each pixel. This can be problematic in practical applications, as in many cases there might not exist a single well-defined disparity, for example in cases of occlusions or at depth boundaries. While current neural-network-based disparity estimation approaches obtain good performance on benchmarks, the disparity prediction is treated as a black box at inference time. In this paper we show that by formulating the learning problem as a regression with a distribution target, we obtain a robust estimate of the uncertainty in each pixel, while maintaining the performance of the original method. The proposed method is evaluated both on a large-scale standard benchmark, as well as on our own data. We also show that the uncertainty estimate significantly improves by maximizing the uncertainty in those pixels that have no well-defined disparity during learning.
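
    A minimal sketch of one way to realize a distribution target (our assumption, not necessarily the paper's architecture): the network outputs a categorical distribution over discretized disparities, its mean serves as the point estimate, and its entropy as the per-pixel uncertainty.

    ```python
    import torch
    import torch.nn.functional as F

    num_disp = 64
    logits = torch.randn(1, num_disp, 32, 32)    # stand-in network output (B, D, H, W)
    p = F.softmax(logits, dim=1)                 # per-pixel disparity distribution

    disp = torch.arange(num_disp, dtype=torch.float32).view(1, num_disp, 1, 1)
    expected_disp = (p * disp).sum(dim=1)                       # point estimate
    uncertainty = -(p * p.clamp_min(1e-9).log()).sum(dim=1)     # entropy per pixel
    # Training would minimize a cross-entropy-style loss against a distribution
    # target instead of an L1/L2 loss on a single scalar disparity.
    ```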

    Download full text (pdf)
    fulltext
  • 272.
    Hägerlind, Johannes
    Linköping University, Department of Electrical Engineering, Computer Vision.
    3D-Reconstruction of the Common Murre2023Independent thesis Advanced level (degree of Master (Two Years)), 28 HE creditsStudent thesis
    Abstract [en]

    Automatic 3D reconstruction of birds can aid researchers in studying their behavior. Recently there has been an attempt to reconstruct a variety of birds from single-view images. However, the common murre's appearance differs from that of the birds that have been studied. Moreover, recent studies have focused on side views. This thesis studies the 3D reconstruction of the common murre from single-view top-view images. A template mesh is first optimized to fit a 3D scan. Then the result is used to optimize a species-specific mean from side-view images annotated with keypoints and silhouettes. The resulting mean mesh is used to initialize the optimization for top-down images. Using a mask loss, a pose prior loss, and a bone length loss that uses a mean vector from the side-view images improves the 3D reconstruction as rated by humans. Furthermore, the intersection over union (IoU) and percentage of correct keypoints (PCK) metrics, although used by other authors, are insufficient in a single-view top-view setting.
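
    A hedged sketch of the combined fitting objective described above; the weights and argument names are illustrative, not taken from the thesis. The function works on NumPy arrays and PyTorch tensors alike.

    ```python
    def total_loss(mask_pred, mask_gt, pose, pose_prior, bone_len, bone_len_mean,
                   w_mask=1.0, w_pose=0.1, w_bone=0.1):
        l_mask = ((mask_pred - mask_gt) ** 2).mean()      # silhouette/mask term
        l_pose = ((pose - pose_prior) ** 2).sum()         # pose prior term
        l_bone = ((bone_len - bone_len_mean) ** 2).sum()  # bone length term
        return w_mask * l_mask + w_pose * l_pose + w_bone * l_bone
    ```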

    Download full text (pdf)
    fulltext
  • 273.
    Härnström, Denise
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Classification of Clothing Attributes Across Domains2020Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Classifying clothing attributes in surveillance images can be useful in the forensic field, making it easier to, for example, find suspects based on eyewitness accounts. Deep Neural Networks are often used successfully in image classification, but require a large amount of annotated data. Since labeling data can be time-consuming or difficult, and it is easier to get hold of labeled fashion images, this thesis investigates how the domain shift from a fashion domain to a surveillance domain, with little or no annotated data, affects a classifier.

    In the experiments, two deep networks of different depth are used as a base and trained on only fashion images as well as both labeled and unlabeled surveillance images, with and without domain adaptation regularizers. The surveillance dataset is new and consists of images that were collected from different surveillance cameras and annotated during this thesis work.

    The results show that there is a degradation in performance for a classifier trained on the fashion domain when tested on the surveillance domain, compared to when tested on the fashion domain. The results also show that if no labeled data in the surveillance domain is used for these experiments, it is more effective to use the deeper network and train it on only fashion data, rather than to use the more complicated unsupervised domain adaptation method.

    Download full text (pdf)
    fulltext
  • 274.
    Höst, Gunnar
    et al.
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Schönborn, Konrad
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Tibell, Lena
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Educational Sciences.
    Visual images of the biological microcosmos: Viewers’ perception of realism, preference, and desire to explore2022In: Frontiers in Education, E-ISSN 2504-284X, Vol. 7, article id 933087Article in journal (Refereed)
    Abstract [en]

    Visual images are crucial for communicating science in educational contexts and amongst practitioners. Reading images contributes to meaning-making in society at large, and images are fundamental communicative tools in public spaces such as science centers. Here, visitors are exposed to a range of static, dynamic, and digital visual representations accessible through various multimodal and interactive possibilities. Images conveying scientific phenomena differ in the extent to which they represent real objects, and include photographs, schematic illustrations, and measurement-based models. Depicting realism in biological objects, structures and processes through images differs with respect to, inter alia, shading, color, and surface texture. Although research has shown that aspects of these properties can both potentially benefit and impair interpretation, little is known about their impact on viewers’ visual preference and inclination for further exploration. Therefore, the aim of this study is to investigate what effect visual properties have on visitors’ perception of biological images integrated in an interactive science center exhibit. Visitors responded to a questionnaire designed to assess the impact of three indicators of realism (shading, color, and surface texture) and biological content (e.g., cells and viruses) on participants’ preferences, perceptions of whether biological images depicted real objects, and their desire to further explore images. Inspired by discrete choice experiments, image pairs were systematically varied to allow participants to make direct choices between images with different properties. Binary logistic regression analysis revealed that the three indicators of realism were all significant predictors of participants’ assessments that images depict real objects. Shadows emerged as a significant predictor of preference for further exploration together with the presence of cells in the image. Correlation analysis indicated that images that were more often selected as depicting real objects were also more often selected for further exploration. We interpret the results in terms of construal level theory in that a biological image perceived as a realistic portrayal would induce a desire for further exploration. The findings have implications for considering the role of realism and preference in the design of images for communicating science in public spaces.

    Download full text (pdf)
    fulltext
  • 275.
    Ingemars, Nils
    Linköping University, Department of Electrical Engineering.
    A feature based face tracker using extended Kalman filtering2007Independent thesis Basic level (professional degree), 20 points / 30 hpStudent thesis
    Abstract [en]

    A face tracker is exactly what it sounds like. It tracks a face in a video sequence. Depending on the complexity of the tracker, it could track the face as a rigid object or as a complete deformable face model with face expressions.

    This report is based on the work of a real-time feature-based face tracker. Feature-based means that you track certain features in the face, like points with special characteristics. It might be a mouth or eye corner, but theoretically it could be any point. For this tracker, the latter is of interest. Its task is to extract global parameters, i.e. rotation and translation, as well as dynamic facial parameters (expressions) for each frame. It tracks feature points using motion between frames and a textured face model (Candide). It then uses an extended Kalman filter to estimate the parameters from the tracked feature points.
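
    A textbook extended Kalman filter step (a generic sketch, not the thesis code); here the state x would hold pose and expression parameters, and h(x) would project model points into the image to predict the tracked feature positions.

    ```python
    import numpy as np

    def ekf_step(x, P, z, f, h, F, H, Q, R):
        """One EKF iteration. f/h: motion and measurement functions; F/H: their
        Jacobians evaluated at the current estimate; Q/R: noise covariances."""
        x_pred = f(x)                        # predict state (e.g., constant velocity)
        P_pred = F @ P @ F.T + Q
        y = z - h(x_pred)                    # innovation from tracked feature points
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new
    ```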

    Download full text (pdf)
    FULLTEXT01
  • 276.
    Ingemars, Nils
    et al.
    Linköping University, Department of Electrical Engineering, Image Coding. Linköping University, The Institute of Technology.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Image Coding. Linköping University, The Institute of Technology.
    Feature-based Face Tracking using Extended Kalman Filtering2007Conference paper (Other academic)
    Abstract [en]

    This work examines the possibility to, with the computational power of today’s consumer hardware, employ techniques previously developed for 3D tracking of rigid objects, and use them for tracking of deformable objects. Our target objects are human faces in a video conversation pose, and our purpose is to create a deformable face tracker based on a head tracker operating in real-time on consumer hardware. We also investigate how to combine model-based and image-based tracking in order to get precise tracking and avoid drift.

    Download full text (pdf)
    fulltext
  • 277.
    Ingerstad, Erica
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Kåreborn, Liv
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Planet-NeRF: Neural Radiance Fields for 3D Reconstruction on Satellite Imagery in Season Changing Environments2024Independent thesis Advanced level (degree of Master (Two Years)), 28 HE creditsStudent thesis
    Abstract [en]

    This thesis investigates the seasonal predictive capabilities of Neural Radiance Fields (NeRF) applied to satellite images. Focusing on the utilization of satellite data, the study explores how Sat-NeRF, a novel approach in computer vision, performs in predicting seasonal variations across different months. Through comprehensive analysis and visualization, the study examines the model’s ability to capture and predict seasonal changes, highlighting specific challenges and strengths. Results showcase the impact of the sun on predictions, revealing nuanced details in seasonal transitions, such as snow cover, color accuracy, and texture representation in different landscapes. The research introduces modifications to the Sat-NeRF network. The implemented versions of the network include geometrically rendered shadows, a signed distance function, and a month embedding vector, where the last version mentioned resulted in Planet-NeRF. Comparative evaluations reveal that Planet-NeRF outperforms prior models, particularly in refining seasonal predictions. This advancement contributes to the field by presenting a more effective approach for seasonal representation in satellite imagery analysis, offering promising avenues for future research in this domain.
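
    An illustrative sketch of what a month embedding vector could look like in this setting (our assumption of the mechanism, not the thesis implementation): a learned embedding per calendar month is concatenated to the encoded sample points before the radiance MLP.

    ```python
    import torch
    import torch.nn as nn

    month_embedding = nn.Embedding(12, 16)            # one 16-d vector per month

    pos_encoding = torch.randn(1024, 63)              # stand-in encoded 3D samples
    month = torch.full((1024,), 1, dtype=torch.long)  # e.g., February for all samples
    mlp_input = torch.cat([pos_encoding, month_embedding(month)], dim=-1)  # (1024, 79)
    # mlp_input then feeds the NeRF MLP that regresses density and color, letting
    # the field condition its appearance on the season.
    ```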

    Download full text (pdf)
    fulltext
  • 278.
    Isaksson, Filip
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Measuring Porosity in Ceramic Coating using Convolutional Neural Networks and Semantic Segmentation2022Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Ceramic materials contain several defects, one of which is porosity. At the time of writing, porosity measurement is a manual and time-consuming process performed by a human operator. With advances in deep learning for computer vision, this thesis explores to what degree convolutional neural networks and semantic segmentation can reliably measure porosity from microscope images. Combining classical image processing techniques with deep learning, images were automatically labeled and then used for training semantic segmentation neural networks leveraging transfer learning. Deep learning-based methods were more robust and could more reliably identify porosity in a larger variety of images than solely relying on classical image processing techniques.
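
    Once a segmentation network has produced a binary pore mask, the measurement itself reduces to a pixel fraction. A minimal sketch, assuming 1 marks pore pixels (the mask below is random stand-in data):

    ```python
    import numpy as np

    def porosity(mask: np.ndarray) -> float:
        """Porosity as the fraction of coating pixels classified as pores."""
        return float(mask.sum()) / mask.size

    mask = (np.random.rand(512, 512) > 0.97).astype(np.uint8)  # stand-in network output
    print(f"porosity: {porosity(mask):.2%}")
    ```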

    Download full text (pdf)
    fulltext
  • 279.
    Isoz, Wilhelm
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Calibration of Multispectral Sensors2005Independent thesis Basic level (professional degree), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    This thesis describes and evaluates a number of approaches and algorithms for nonuniform correction (NUC) and suppression of fixed pattern noise in an image sequence. The main task for this thesis work was to create a general NUC for infrared focal plane arrays. To create a radiometrically correct NUC, reference-based methods using polynomial approximation are used instead of the more common scene-based methods, which create a cosmetic NUC.
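
    A plain illustration of reference-based NUC with a per-pixel polynomial (degree 1 here for brevity; the thesis procedure may differ): fit each pixel's response against known reference radiances, then apply the fitted map to new frames.

    ```python
    import numpy as np

    # Calibration frames viewing uniform references of known radiance (e.g., blackbodies).
    ref_radiance = np.array([1.0, 2.0, 3.0])
    ref_frames = np.stack([r * np.ones((4, 4)) + np.random.normal(0, 0.05, (4, 4))
                           for r in ref_radiance])   # (n_refs, H, W) raw responses

    H, W = ref_frames.shape[1:]
    coeffs = np.empty((2, H, W))
    for i in range(H):
        for j in range(W):
            # per-pixel polynomial mapping raw response -> radiance
            coeffs[:, i, j] = np.polyfit(ref_frames[:, i, j], ref_radiance, deg=1)

    def correct(frame):
        return coeffs[0] * frame + coeffs[1]         # apply per-pixel linear map

    print(correct(2.0 * np.ones((4, 4))).round(2))   # approx. 2.0 everywhere
    ```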

    The pixels that cannot be adjusted to give a correct value for the incoming radiation are defined as dead. Four separate methods of identifying dead pixels are used to find these pixels. Both the scene sequence and calibration data are used in these identifying methods.

    The algorithms and methods have all been tested by using real image sequences. A graphical user interface using the presented algorithms has been created in Matlab to simplify the correction of image sequences. An implementation to convert the corrected values from the images to radiance and temperature is also performed.

    Download full text (pdf)
    FULLTEXT01
  • 280.
    Izquierdo, Milagros
    et al.
    Linköping University, Department of Mathematics, Mathematics and Applied Mathematics. Linköping University, Faculty of Science & Engineering.
    Stokes, Klara
    Högskolan i Skövde.
    Isometric Point-Circle Configurations on Surfaces from Uniform Maps2016In: Springer Proceedings in Mathematics and Statistics, ISSN 2194-1009, Vol. 159, p. 201-212Article in journal (Refereed)
    Abstract [en]

    We embed neighborhood geometries of graphs on surfaces as point-circle configurations. We give examples coming from regular maps on surfaces with a maximum number of automorphisms for their genus, and survey geometric realization of pentagonal geometries coming from Moore graphs. An infinite family of point-circle (v4) configurations on p-gonal surfaces with two p-gonal morphisms is given. The image of these configurations on the sphere under the two p-gonal morphisms is also described.

  • 281.
    Jack Lee, Wing
    et al.
    Monash University of Malaysia, Malaysia.
    Ng, Kok Yew
    Linköping University, Department of Electrical Engineering. Monash University of Malaysia, Malaysia.
    Luh Tan, Chin
    Monash University of Malaysia, Malaysia; Trity Technology, Malaysia.
    Pin Tan, Chee
    Monash University of Malaysia, Malaysia; Trity Technology, Malaysia.
    Real-Time Face Detection And Motorized Tracking Using ScicosLab and SMCube On SoCs2016In: 14TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), IEEE, 2016, article id UNSP Su23.3Conference paper (Refereed)
    Abstract [en]

    This paper presents a method for real-time detection and tracking of the human face. This is achieved using the Raspberry Pi microcomputer and the EasyLab microcontroller as the main hardware with a camera mounted on servomotors for continuous image feed-in. Real-time face detection is performed using Haar-feature classifiers and ScicosLab in the Raspberry Pi. Then, the EasyLab is responsible for face tracking, keeping the face in the middle of the frame through a pair of servomotors that control the horizontal and vertical movements of the camera. The servomotors are in turn controlled based on the state-diagrams designed using SMCube in the EasyLab. The methodology is verified via practical experimentation.
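
    A Python/OpenCV equivalent of the detection stage (the paper itself uses ScicosLab on the Raspberry Pi; this sketch only illustrates the same Haar-cascade classifier):

    ```python
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    cap = cv2.VideoCapture(0)                  # continuous camera feed
    ok, frame = cap.read()
    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            cx, cy = x + w // 2, y + h // 2
            # The offset between (cx, cy) and the frame center would be turned into
            # pan/tilt servo commands to keep the face in the middle of the frame.
    cap.release()
    ```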

  • 282.
    Jackman, Simeon
    Linköping University, Department of Biomedical Engineering.
    Football Shot Detection using Convolutional Neural Networks2019Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In this thesis, three different neural network architectures are investigated to detect the action of a shot within a football game using video data. The first architecture uses conventional convolution and pooling layers as feature extraction. It acts as a baseline and gives insight into the challenges faced during shot detection. The second architecture uses a pre-trained feature extractor. The last architecture uses three-dimensional convolution. All these networks are trained using short video clips extracted from football game video streams. Apart from investigating network architectures, different sampling methods are evaluated as well. This thesis shows that amongst the three evaluated methods, the approach using MobileNetV2 as a feature extractor works best. However, when applying the networks to a video stream there are a multitude of challenges, such as false positives and incorrect annotations, that inhibit the potential of detecting shots.

    Download full text (pdf)
    fulltext
  • 283.
    Jackowski, C.
    et al.
    Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Wyss, M.
    Department of Preventive, Restorative and Paediatric Dentistry, University of Bern, 3010 Bern, Switzerland.
    Persson, A.
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Faculty of Health Sciences. Linköping University, Department of Medical and Health Sciences, Radiology.
    Classens, M.
    Department of Diagnostic Radiology, Lindenhofspital, Bremgartenstrasse 117, 3001 Bern, Switzerland.
    Thali, M.J.
    Center of Forensic Imaging and Virtopsy, Institute of Forensic Medicine, University of Bern, Bühlstreet 20, 3012 Bern, Switzerland.
    Lussi, A.
    Department of Preventive, Restorative and Paediatric Dentistry, University of Bern, 3010 Bern, Switzerland.
    Ultra-high-resolution dual-source CT for forensic dental visualization - Discrimination of ceramic and composite fillings2008In: International journal of legal medicine, ISSN 0937-9827, E-ISSN 1437-1596, Vol. 122, no 4, p. 301-307Article in journal (Refereed)
    Abstract [en]

    Dental identification is the most valuable method to identify human remains in single cases with major postmortem alterations as well as in mass casualties because of its practicability and reliability. Computed tomography (CT) has been investigated as a supportive tool for forensic identification and has proven to be valuable. It can also scan the dentition of a deceased within minutes. In the present study, we investigated currently used restorative materials using ultra-high-resolution dual-source CT and the extended CT scale for the purpose of a color-encoded, in-scale, and artifact-free visualization in 3D volume rendering. In 122 human molars, 220 cavities with 2-, 3-, 4- and 5-mm diameter were prepared. With presently used filling materials (different composites, temporary filling materials, ceramic, and liner), these cavities were restored in six teeth for each material and cavity size (exception amalgam n=1). The teeth were CT scanned and images reconstructed using an extended CT scale. Filling materials were analyzed in terms of resulting Hounsfield units (HU) and filling size representation within the images. Varying restorative materials showed distinctively differing radiopacities, allowing for CT-data-based discrimination. In particular, ceramic and composite fillings could be differentiated. The HU values were used to generate an updated volume-rendering preset for postmortem extended CT scale data of the dentition to easily visualize the position of restorations, the shape (in scale), and the material used, which is color-encoded in 3D. The results provide the scientific background for the application of 3D volume rendering to visualize the human dentition for forensic identification purposes. © 2008 Springer-Verlag.

  • 284.
    Jankowai, Jochen
    et al.
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Wang, Bei
    Univ Utah, UT 84112 USA.
    Hotz, Ingrid
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Robust Extraction and Simplification of 2D Symmetric Tensor Field Topology2019In: Computer graphics forum (Print), ISSN 0167-7055, E-ISSN 1467-8659, Vol. 38, no 3, p. 337-349Article in journal (Refereed)
    Abstract [en]

    In this work, we propose a controlled simplification strategy for degenerated points in symmetric 2D tensor fields that is based on the topological notion of robustness. Robustness measures the structural stability of the degenerate points with respect to variation in the underlying field. We consider an entire pipeline for generating a hierarchical set of degenerate points based on their robustness values. Such a pipeline includes the following steps: the stable extraction and classification of degenerate points using an edge labeling algorithm, the computation and assignment of robustness values to the degenerate points, and the construction of a simplification hierarchy. We also discuss the challenges that arise from the discretization and interpolation of real world data.
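
    For background, degenerate points of a symmetric 2D tensor field are where the two eigenvalues coincide, i.e. where the deviator components vanish. A small sketch of flagging near-degenerate grid cells (ours, not the paper's extraction algorithm):

    ```python
    import numpy as np

    def deviator_magnitude(T11, T22, T12):
        """Zero exactly at degenerate points of a symmetric 2D tensor field."""
        return np.sqrt(((T11 - T22) / 2.0) ** 2 + T12 ** 2)

    # Stand-in analytic field on a grid; it is degenerate at the origin.
    x, y = np.meshgrid(np.linspace(-1, 1, 65), np.linspace(-1, 1, 65))
    T11, T22, T12 = x, -x, y
    near_degenerate = deviator_magnitude(T11, T22, T12) < 5e-2
    print(near_degenerate.sum())    # grid cells flagged for closer inspection
    ```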

    Download full text (pdf)
    fulltext
  • 285.
    Jarnemyr, Pontus
    et al.
    Linköping University, Department of Computer and Information Science, Software and Systems.
    Markus, Gustafsson
    Linköping University, Department of Computer and Information Science, Software and Systems.
    3D Camera Selection for Obstacle Detection in a Warehouse Environment2020Independent thesis Basic level (degree of Bachelor), 10,5 credits / 16 HE creditsStudent thesis
    Abstract [en]

    The increasing demand for online commerce has led to an increasing demand for autonomous vehicles in the logistics sector. The work in this thesis aims to improve the obstacle detection of autonomous forklifts by using 3D sensor technology. Three different products were compared based on a number of criteria. These criteria were provided by Toyota Material Handling, a manufacturer of autonomous forklifts. One of the products was chosen for developing a prototype. The prototype was used to determine if 3D camera technology could provide sufficient obstacle detection in a warehouse environment. The determination was based on the prototype's performance in a series of tests. The tests ranged from human to pallet detection, and were aimed to fulfill all criteria. The advantages and disadvantages of the chosen camera are presented. The conclusion is that the chosen 3D camera cannot provide sufficient obstacle detection due to certain environmental factors.

    Download full text (pdf)
    fulltext
  • 286.
    Javed, Sajid
    et al.
    Khalifa Univ Sci & Technol, U Arab Emirates.
    Danelljan, Martin
    Swiss Fed Inst Technol, Switzerland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZUAI, U Arab Emirates.
    Khan, Muhammad Haris
    MBZUAI, U Arab Emirates.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Matas, Jiri
    Czech Tech Univ, Czech Republic.
    Visual Object Tracking With Discriminative Filters and Siamese Networks: A Survey and Outlook2023In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 45, no 5, p. 6552-6574Article in journal (Refereed)
    Abstract [en]

    Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems. It entails estimating the trajectory of the target in an image sequence, given only its initial location and segmentation, or its rough approximation in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as dominating tracking paradigms, which have led to significant progress. Following the rapid evolution of visual object tracking in the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results in nine tracking benchmarks. First, we present the background theory of both the DCF and Siamese tracking core formulations. Then, we distinguish and comprehensively review the shared as well as specific open research challenges in both these tracking paradigms. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We finish the survey by presenting recommendations and suggestions for distinguished open challenges based on our analysis.

    Download full text (pdf)
    fulltext
  • 287.
    Jogbäck, Mats
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology.
    Bildbaserad estimering av rörelse för reducering av rörelseartefakter [Image-based motion estimation for reduction of motion artifacts]2006Independent thesis Basic level (professional degree), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Before reconstructing a three-dimensional volume from an MR brain imaging sequence, there is a need for aligning each slice, due to unavoidable movement of the patient during the scanning. This procedure is known as image registration, and the method used primarily today is based on a selected slice being the reference slice and then registering the neighbouring slices, which are assumed to deviate only minimally.

    The purpose of this thesis is to use another method commonly used in computer vision: to estimate the motion from a regular video sequence by tracking markers that indicate movement. The aim is to create a robust estimation of the movement of the head, which in turn can be used to create a more accurate alignment and volume.

    Download full text (pdf)
    FULLTEXT01
  • 288.
    Johansson, Marcus
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems.
    Online Whole-Body Control using Hierarchical Quadratic Programming: Implementation and Evaluation of the HiQP Control Framework2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The application of local optimal control is a promising paradigm for manipulative robot motion generation. In practice this involves instantaneous formulations of convex optimization problems depending on the current joint configuration of the robot and the environment. To be effective, however, constraints have to be carefully constructed, as this kind of motion generation approach trades off completeness. Local optimal solvers, which are greedy in a temporal sense, have proven to be significantly more effective computationally than classical grid-based or sampling-based planning approaches.

    In this thesis we investigate how a local optimal control approach, namely the task function approach, can be implemented to grant high usability, extendibility and effectivity. This has resulted in the HiQP control framework, which is compatible with ROS and written in C++. The framework supports geometric primitives to aid in task customization by the user. It is also modular as to what communication system it is being used with, and to what optimization library it uses for finding optimal controls.

    We have evaluated the software quality of the framework according to common quantitative methods found in the literature. We have also evaluated an approach to perform tasks using minimal-jerk motion generation, with promising results. The framework also provides simple translation and rotation tasks based on six rudimentary geometric primitives, as well as task definitions for specific joint position setting and velocity limitation.
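
    For intuition, the classical null-space construction below sketches a two-level task hierarchy (a textbook illustration; HiQP itself solves a sequence of QPs and also supports inequality constraints):

    ```python
    import numpy as np

    def two_level_solution(J1, e1, J2, e2):
        """Joint velocities meeting task 1 in the least-squares sense, with task 2
        resolved only inside task 1's null space."""
        J1_pinv = np.linalg.pinv(J1)
        dq1 = J1_pinv @ e1                           # primary task
        N1 = np.eye(J1.shape[1]) - J1_pinv @ J1      # null-space projector of task 1
        dq2 = np.linalg.pinv(J2 @ N1) @ (e2 - J2 @ dq1)
        return dq1 + N1 @ dq2

    J1, e1 = np.random.randn(3, 7), np.random.randn(3)   # e.g., end-effector position
    J2, e2 = np.random.randn(2, 7), np.random.randn(2)   # e.g., a secondary task
    dq = two_level_solution(J1, e1, J2, e2)
    print(np.allclose(J1 @ dq, e1))                      # primary task still satisfied
    ```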

    Download full text (pdf)
    fulltext
  • 289.
    Johansson, Robert
    et al.
    Linköping University, Department of Computer and Information Science, Human-Centered systems. Linköping University, Faculty of Science & Engineering. Stockholm Univ, Sweden.
    Lofthouse, Tony
    Stockholm Univ, Sweden.
    Hammer, Patrick
    Stockholm Univ, Sweden.
    Generalized Identity Matching in NARS2023In: ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, SPRINGER INTERNATIONAL PUBLISHING AG , 2023, Vol. 13539, p. 243-249Conference paper (Refereed)
    Abstract [en]

    Generalized identity matching is the ability to apply an identity concept in novel situations. This ability has been studied experimentally among humans and non-humans in a match-to-sample task. The aim of this study was to test if this ability was possible to demonstrate in the Non-Axiomatic Reasoning System (NARS). More specifically, we used a minimal configuration of OpenNARS for Applications that contained only sensorimotor reasoning. After training with only two identity matching-to-sample problems, NARS was able to derive an identity concept that it could generalize to novel situations.

  • 290.
    Johansson, Ted
    et al.
    Linköping University, Department of Electrical Engineering, Integrated Circuits and Systems. Linköping University, Faculty of Science & Engineering.
    Forchheimer, Robert
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Åstrom, Anders
    Swedish National Forensic Center, Linköping, Sweden.
    Low-Power Optical Sensor for Traffic Detection2020In: IEEE Sensors Letters, ISSN 2475-1472, Vol. 4, no 5, article id 9050911Article in journal (Refereed)
    Abstract [en]

    A CMOS sensor chip was used, together with an Arduino microcontroller, to create and verify a low-power low-cost optical motion detector for use in traffic detection under dark and daylight conditions. The chip can sense object features with very high dynamic range. On-chip near sensor image processing was used to reduce the data to be transferred to a host computer. A method using local extrema point detection was used to estimate motion through time-to-impact (TTI). Sensor data from the headlights of an approaching/passing car were used to extract TTI values similar to estimations from distance and speed of the object. The method can be used for detection of approaching objects to switch on streetlights (dark conditions) or sensors for traffic lights instead of magnetic sensors in the streets or conventional cameras (dark and daylight conditions). A sensor with a microcontroller operating at low clock frequency will consume less than 30 mW in this application. 
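
    The TTI estimate itself follows from the expansion rate of image features: if s is the image distance between two tracked extrema points (e.g., a headlight pair), then TTI ≈ s / (ds/dt). A back-of-the-envelope sketch:

    ```python
    def time_to_impact(s_prev: float, s_curr: float, dt: float) -> float:
        """TTI in seconds from the growth of an image-plane distance s."""
        ds_dt = (s_curr - s_prev) / dt
        return float("inf") if ds_dt <= 0 else s_curr / ds_dt

    # Headlights 20 px apart, then 22 px apart one frame (0.1 s) later:
    print(time_to_impact(20.0, 22.0, 0.1))   # 1.1 s to impact at constant speed
    ```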

  • 291.
    Johansson, Victor
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    3D Position Estimation of a Person of Interest in Multiple Video Sequences: Person of Interest Recognition2013Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Because of the increase in the number of security cameras, there is more video footage available than a human could efficiently process. Combined with the fact that computers are becoming ever more powerful, this makes it increasingly interesting to solve the problem of detecting and recognizing people automatically.

    Therefore a method is proposed for estimating a 3D path of a person of interest in multiple, non-overlapping, monocular cameras. This project is a collaboration between two master's theses. This thesis will focus on recognizing a person of interest from several possible candidates, as well as estimating the 3D position of a person and providing a graphical user interface for the system. The recognition of the person of interest includes keeping track of said person frame by frame, and identifying said person in video sequences where the person of interest has not been seen before.

    The final product is able to both detect and recognize people in video, as well as estimate their 3D position relative to the camera. The product is modular and any part can be improved or changed completely, without changing the rest of the product. This results in a highly versatile product which can be tailored for any given situation.

    Download full text (pdf)
    fulltext
  • 292. Order onlineBuy this publication >>
    Johnander, Joakim
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenseact AB, Gothenburg.
    Dynamic Visual Learning2022Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Autonomous robots act in a dynamic world where both the robots and other objects may move. The surround sensing systems of said robots therefore work with dynamic input data and need to estimate both the current state of the environment as well as its dynamics. One of the key elements to obtain a high-level understanding of the environment is to track dynamic objects. This enables the system to understand what the objects are doing; predict where they will be in the future; and in the future better estimate where they are. In this thesis, I focus on input from visual cameras, images. Images have, with the advent of neural networks, become a cornerstone in sensing systems. Image-processing neural networks are optimized to perform a specific computer vision task -- such as recognizing cats and dogs -- on vast datasets of annotated examples. This is usually referred to as offline training and given a well-designed neural network, enough high-quality data, and a suitable offline training formulation, the neural network is expected to become adept at the specific task.

    This thesis starts with a study of object tracking. The tracking is based on the visual appearance of the object, achieved via discriminative convolution filters (DCFs). The first contribution of this thesis is to decompose the filter into multiple subfilters. This serves to increase the robustness during object deformations or rotations. Moreover, it provides a more fine-grained representation of the object state as the subfilters are expected to roughly track object parts. In the second contribution, a neural network is trained directly for object tracking. In order to obtain a fine-grained representation of the object state, it is represented as a segmentation. The main challenge lies in the design of a neural network able to tackle this task. While the common neural networks excel at recognizing patterns seen during offline training, they struggle to store novel patterns in order to later recognize them. To overcome this limitation, a novel appearance learning mechanism is proposed. The mechanism extends the state-of-the-art and is shown to generalize remarkably well to novel data. In the third contribution, the method is used together with a novel fusion strategy and failure detection criterion to semi-automatically annotate visual and thermal videos.

    Sensing systems need not only track objects, but also detect them. The fourth contribution of this thesis strives to tackle joint detection, tracking, and segmentation of all objects from a predefined set of object classes. The challenge here lies not only in the neural network design, but also in the design of the offline training formulation. The final approach, a recurrent graph neural network, outperforms prior works that have a runtime of the same order of magnitude.

    Last, this thesis studies dynamic learning of novel visual concepts. It is observed that the learning mechanisms used for object tracking essentially learn the appearance of the tracked object. It is natural to ask whether this appearance learning could be extended beyond individual objects to entire semantic classes, enabling the system to learn new concepts based on just a few training examples. Such an ability is desirable in autonomous systems as it removes the need for manually annotating thousands of examples of each class that needs recognition. Instead, the system is trained to efficiently learn to recognize new classes. In the fifth contribution, we propose a novel learning mechanism based on Gaussian process regression. With this mechanism, our neural network outperforms the state-of-the-art and the performance gap is especially large when multiple training examples are given.
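
    The core computation is ordinary Gaussian process regression (sketched below with an RBF kernel; the kernel choice and dimensions are illustrative, not the thesis configuration): given a few support features with labels, class scores for query features follow from the GP posterior mean.

    ```python
    import numpy as np

    def rbf(A, B, gamma=0.5):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    X = np.random.randn(5, 64)            # few-shot support features of a novel class
    y = np.array([1., 1., 1., 0., 0.])    # e.g., foreground/background indicators
    Q = np.random.randn(3, 64)            # query features

    K = rbf(X, X) + 1e-3 * np.eye(len(X))        # kernel matrix plus noise term
    scores = rbf(Q, X) @ np.linalg.solve(K, y)   # GP posterior mean at the queries
    print(scores)
    ```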

    To summarize, this thesis studies and makes several contributions to learning systems that parse dynamic visuals and that dynamically learn visual appearances or concepts.

    List of papers
    1. DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking
    Open this publication in new window or tab >>DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking
    2017 (English)In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, p. 55-67Conference paper, Published paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filter (DCF) based methods have shown competitive performance on tracking benchmarks in recent years. Generally, DCF based trackers learn a rigid appearance model of the target. However, this reliance on a single rigid appearance model is insufficient in situations where the target undergoes non-rigid transformations. In this paper, we propose a unified formulation for learning a deformable convolution filter. In our framework, the deformable filter is represented as a linear combination of sub-filters. Both the sub-filter coefficients and their relative locations are inferred jointly in our formulation. Experiments are performed on three challenging tracking benchmarks: OTB-2015, TempleColor and VOT2016. Our approach improves the baseline method, leading to performance comparable to state-of-the-art.

    Place, publisher, year, edition, pages
    Springer, 2017
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 10424
    National Category
    Computer Vision and Robotics (Autonomous Systems) Computer Engineering
    Identifiers
    urn:nbn:se:liu:diva-145373 (URN)10.1007/978-3-319-64689-3_5 (DOI)000432085900005 ()9783319646886 (ISBN)9783319646893 (ISBN)
    Conference
    17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I
    Note

    Funding agencies: SSF (SymbiCloud); VR (EMC2) [2016-05543]; SNIC; WASP; Nvidia

    Available from: 2018-02-26 Created: 2018-02-26 Last updated: 2023-04-03Bibliographically approved
    2. A generative appearance model for end-to-end video object segmentation
    Open this publication in new window or tab >>A generative appearance model for end-to-end video object segmentation
    Show others...
    2019 (English)In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 8945-8954Conference paper, Published paper (Refereed)
    Abstract [en]

    One of the fundamental challenges in video object segmentation is to find an effective representation of the target and background appearance. The best performing approaches resort to extensive fine-tuning of a convolutional neural network for this purpose. Besides being prohibitively expensive, this strategy cannot be truly trained end-to-end since the online fine-tuning procedure is not integrated into the offline training of the network. To address these issues, we propose a network architecture that learns a powerful representation of the target and background appearance in a single forward pass. The introduced appearance module learns a probabilistic generative model of target and background feature distributions. Given a new image, it predicts the posterior class probabilities, providing a highly discriminative cue, which is processed in later network modules. Both the learning and prediction stages of our appearance module are fully differentiable, enabling true end-to-end training of the entire segmentation pipeline. Comprehensive experiments demonstrate the effectiveness of the proposed approach on three video object segmentation benchmarks. We close the gap to approaches based on online fine-tuning on DAVIS17, while operating at 15 FPS on a single GPU. Furthermore, our method outperforms all published approaches on the large-scale YouTube-VOS dataset.
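
    A simplified sketch of the generative idea (our reduction, not the paper's module): fit class-conditional Gaussians to target and background features, then obtain posterior class probabilities from the log-likelihoods.

    ```python
    import numpy as np

    def fit_gaussian(feats):
        return feats.mean(0), feats.var(0) + 1e-5       # diagonal Gaussian

    def log_likelihood(feats, mu, var):
        return -0.5 * (((feats - mu) ** 2) / var + np.log(2 * np.pi * var)).sum(-1)

    target_feats = np.random.randn(200, 32) + 1.0       # features inside the mask
    bg_feats = np.random.randn(800, 32)                 # features outside the mask

    params = [fit_gaussian(f) for f in (target_feats, bg_feats)]
    query = np.random.randn(10, 32)                     # features from a new frame
    ll = np.stack([log_likelihood(query, mu, var) for mu, var in params], axis=-1)
    post = np.exp(ll - ll.max(-1, keepdims=True))
    post /= post.sum(-1, keepdims=True)                 # posterior (target, background)
    print(post[:, 0])                                   # target probability per sample
    ```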

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2019
    Series
    Proceedings - IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919, E-ISSN 2575-7075
    Keywords
    Segmentation; Grouping and Shape; Motion and Tracking
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-161037 (URN)10.1109/CVPR.2019.00916 (DOI)9781728132938 (ISBN)9781728132945 (ISBN)
    Conference
    IEEE Conference on Computer Vision and Pattern Recognition. 2019, Long Beach, CA, USA, USA, 15-20 June 2019
    Funder
    Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Foundation for Strategic Research; Swedish Research Council
    Available from: 2019-10-17 Created: 2019-10-17 Last updated: 2023-04-03Bibliographically approved
    3. Semi-automatic Annotation of Objects in Visual-Thermal Video
    Open this publication in new window or tab >>Semi-automatic Annotation of Objects in Visual-Thermal Video
    Show others...
    2019 (English)In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Institute of Electrical and Electronics Engineers (IEEE), 2019Conference paper, Published paper (Refereed)
    Abstract [en]

    Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities, human annotation is hard to justify. In such cases, semi-automatic annotation provides an acceptable option.

    In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizes a state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.

    The method is evaluated on a subset of the RGBT-234 visual-thermal dataset, reducing the workload for a human annotator by approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.

    Place, publisher, year, edition, pages
    Institute of Electrical and Electronics Engineers (IEEE), 2019
    Series
    IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), ISSN 2473-9936, E-ISSN 2473-9944
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-161076 (URN)10.1109/ICCVW.2019.00277 (DOI)000554591602039 ()978-1-7281-5023-9 (ISBN)978-1-7281-5024-6 (ISBN)
    Conference
    IEEE International Conference on Computer Vision Workshop (ICCVW)
    Funder
    Swedish Research Council, 2013-5703; Swedish Foundation for Strategic Research; Wallenberg AI, Autonomous Systems and Software Program (WASP); Vinnova, VS1810-Q
    Note

    Funding agencies: Swedish Research Council [2013-5703]; project ELLIIT (the Strategic Area for ICT research - Swedish Government); Wallenberg AI, Autonomous Systems and Software Program (WASP); Visual Sweden project n-dimensional Modelling [VS1810-Q]

    Available from: 2019-10-21 Created: 2019-10-21 Last updated: 2021-12-03
    4. Video Instance Segmentation with Recurrent Graph Neural Networks
    Open this publication in new window or tab >>Video Instance Segmentation with Recurrent Graph Neural Networks
    2021 (English)In: Pattern Recognition: 43rd DAGM German Conference, DAGM GCPR 2021, Bonn, Germany, September 28 – October 1, 2021, Proceedings. / [ed] Bauckhage C., Gall J., Schwing A., Springer, 2021, p. 206-221Conference paper, Published paper (Refereed)
    Abstract [en]

    Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous video real-time methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.

    Place, publisher, year, edition, pages
    Springer, 2021
    Series
    Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13024
    National Category
    Computer Vision and Robotics (Autonomous Systems)
    Identifiers
    urn:nbn:se:liu:diva-183945 (URN); 10.1007/978-3-030-92659-5_13 (DOI); 978-3-030-92658-8 (ISBN); 978-3-030-92659-5 (ISBN)
    Conference
    43rd DAGM German Conference, DAGM GCPR 2021, Bonn, Germany, September 28 – October 1, 2021
    Available from: 2022-03-28 Created: 2022-03-28 Last updated: 2022-03-29. Bibliographically approved
    Download full text (pdf)
    fulltext
  • 293.
    Johnander, Joakim
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Bhat, Goutam
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    On the Optimization of Advanced DCF-Trackers, 2019. In: Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part I / [ed] Laura Leal-Taixé, Stefan Roth, Cham: Springer Publishing Company, 2019, p. 54-69. Conference paper (Refereed)
    Abstract [en]

    Trackers based on discriminative correlation filters (DCF) have recently seen widespread success and in this work we dive into their numerical core. DCF-based trackers interleave learning of the target detector and target state inference based on this detector. Whereas the original formulation includes a closed-form solution for the filter learning, recently introduced improvements to the framework no longer have known closed-form solutions. Instead, a large-scale linear least-squares problem must be solved each time the detector is updated. We analyze the procedure used to optimize the detector and let the popular scheme introduced with ECO serve as a baseline. The ECO implementation is revisited in detail and several mechanisms are provided with alternatives. With comprehensive experiments we show which configurations are superior in terms of tracking capabilities and optimization performance.
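
    The filter update discussed above amounts to solving the normal equations of a large linear least-squares problem, for which iterative solvers are the natural choice. The sketch below shows a generic conjugate-gradient solver of the kind such trackers rely on; it illustrates the solver class, not the exact ECO implementation, and the toy problem at the end is purely illustrative.

```python
# Generic conjugate gradient for A x = b, where A is symmetric positive
# definite and available only through a matrix-vector product.
import numpy as np

def conjugate_gradient(matvec, b, x0, iters=100, tol=1e-8):
    x = x0.copy()
    r = b - matvec(x)          # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy regularized least squares: min_f ||A f - b||^2 + lam ||f||^2,
# solved via its normal equations (A^T A + lam I) f = A^T b.
A = np.random.randn(200, 50)
b = np.random.randn(200)
lam = 0.1
f = conjugate_gradient(lambda v: A.T @ (A @ v) + lam * v,
                       A.T @ b, np.zeros(50))
```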

    Download full text (pdf)
    On the Optimization of Advanced DCF-Trackers
  • 294.
    Johnander, Joakim
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenseact, Gothenburg, Sweden.
    Brissman, Emil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Saab, Linköping, Sweden.
    Danelljan, Martin
    Computer Vision Lab, ETH Zürich, Zürich, Switzerland.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. School of Engineering, University of KwaZulu-Natal, Durban, South Africa.
    Video Instance Segmentation with Recurrent Graph Neural Networks, 2021. In: Pattern Recognition: 43rd DAGM German Conference, DAGM GCPR 2021, Bonn, Germany, September 28 – October 1, 2021, Proceedings / [ed] Bauckhage C., Gall J., Schwing A., Springer, 2021, p. 206-221. Conference paper (Refereed)
    Abstract [en]

    Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous video real-time methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.

  • 295.
    Johnander, Joakim
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenuity, Sweden.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. ETH Zurich, Switzerland.
    Brissman, Emil
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Saab, Sweden.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. IIAI, UAE.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A generative appearance model for end-to-end video object segmentation, 2019. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 8945-8954. Conference paper (Refereed)
    Abstract [en]

    One of the fundamental challenges in video object segmentation is to find an effective representation of the target and background appearance. The best performing approaches resort to extensive fine-tuning of a convolutional neural network for this purpose. Besides being prohibitively expensive, this strategy cannot be truly trained end-to-end since the online fine-tuning procedure is not integrated into the offline training of the network. To address these issues, we propose a network architecture that learns a powerful representation of the target and background appearance in a single forward pass. The introduced appearance module learns a probabilistic generative model of target and background feature distributions. Given a new image, it predicts the posterior class probabilities, providing a highly discriminative cue, which is processed in later network modules. Both the learning and prediction stages of our appearance module are fully differentiable, enabling true end-to-end training of the entire segmentation pipeline. Comprehensive experiments demonstrate the effectiveness of the proposed approach on three video object segmentation benchmarks. We close the gap to approaches based on online fine-tuning on DAVIS17, while operating at 15 FPS on a single GPU. Furthermore, our method outperforms all published approaches on the large-scale YouTube-VOS dataset.
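
    The probabilistic mechanism described above can be illustrated with a much simpler stand-in: fit class-conditional Gaussians to target and background features in a single pass, then evaluate posterior class probabilities for new pixels. The paper's appearance module is richer and fully differentiable inside a network; the NumPy sketch below, assuming diagonal covariances and equal class priors, only shows the generative-model idea.

```python
# Sketch: single-pass generative appearance model with Gaussian
# class-conditionals and posterior class probabilities.
import numpy as np

def fit_gaussians(feats, labels):
    """feats: (N, C) features; labels: (N,) in {0: background, 1: target}."""
    params = {}
    for c in (0, 1):
        x = feats[labels == c]
        params[c] = (x.mean(0), x.var(0) + 1e-6)  # diagonal covariance
    return params

def posterior(feats, params):
    """Per-pixel posterior p(target | feature) under equal class priors."""
    logp = []
    for c in (0, 1):
        mu, var = params[c]
        # Log-density up to a constant, which cancels in the posterior.
        logp.append(-0.5 * (((feats - mu) ** 2) / var + np.log(var)).sum(-1))
    logp = np.stack(logp, -1)
    logp -= logp.max(-1, keepdims=True)      # numerical stability
    p = np.exp(logp)
    return p[..., 1] / p.sum(-1)

params = fit_gaussians(np.random.randn(1000, 16),
                       np.random.randint(0, 2, 1000))
probs = posterior(np.random.randn(50, 16), params)  # (50,) target posteriors
```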

    Download full text (pdf)
    fulltext
  • 296.
    Johnander, Joakim
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Danelljan, Martin
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad Shahbaz
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking, 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10424, p. 55-67. Conference paper (Refereed)
    Abstract [en]

    Discriminative Correlation Filter (DCF) based methods have shown competitive performance on tracking benchmarks in recent years. Generally, DCF based trackers learn a rigid appearance model of the target. However, this reliance on a single rigid appearance model is insufficient in situations where the target undergoes non-rigid transformations. In this paper, we propose a unified formulation for learning a deformable convolution filter. In our framework, the deformable filter is represented as a linear combination of sub-filters. Both the sub-filter coefficients and their relative locations are inferred jointly in our formulation. Experiments are performed on three challenging tracking benchmarks: OTB-2015, TempleColor and VOT2016. Our approach improves the baseline method, leading to performance comparable to state-of-the-art.
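
    The central idea above, a deformable filter expressed as a linear combination of sub-filters at learned relative locations, can be sketched as a detection score that sums coefficient-weighted, offset-shifted sub-filter responses. In the paper the coefficients and offsets are inferred jointly; in the sketch below they are fixed toy values, and the shifting/correlation routines are standard SciPy calls, not the authors' continuous-domain formulation.

```python
# Sketch: detection score of a deformable filter built from sub-filters.
import numpy as np
from scipy.ndimage import shift as subpixel_shift
from scipy.signal import correlate2d

def deformable_response(image, sub_filters, coeffs, offsets):
    """Sum of coefficient-weighted, offset-shifted sub-filter correlations."""
    total = np.zeros_like(image, dtype=float)
    for filt, c, (dy, dx) in zip(sub_filters, coeffs, offsets):
        resp = correlate2d(image, filt, mode="same", boundary="symm")
        total += c * subpixel_shift(resp, (dy, dx), order=1)
    return total

image = np.random.rand(64, 64)
subs = [np.random.randn(5, 5) for _ in range(3)]
score_map = deformable_response(image, subs, coeffs=[0.5, 0.3, 0.2],
                                offsets=[(0.0, 0.0), (2.5, -1.0), (-3.0, 2.0)])
```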

    Download full text (pdf)
    fulltext
  • 297.
    Johnander, Joakim
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenseact AB, Sweden.
    Edstedt, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed Univ AI, U Arab Emirates.
    Danelljan, Martin
    Swiss Fed Inst Technol, Switzerland.
    Dense Gaussian Processes for Few-Shot Segmentation, 2022. In: Computer Vision – ECCV 2022, Part XXIX, Springer International Publishing, 2022, Vol. 13689, p. 217-234. Conference paper (Refereed)
    Abstract [en]

    Few-shot segmentation is a challenging dense prediction task, which entails segmenting a novel query image given only a small annotated support set. The key problem is thus to design a method that aggregates detailed information from the support set, while being robust to large variations in appearance and context. To this end, we propose a few-shot segmentation method based on dense Gaussian process (GP) regression. Given the support set, our dense GP learns the mapping from local deep image features to mask values, capable of capturing complex appearance distributions. Furthermore, it provides a principled means of capturing uncertainty, which serves as another powerful cue for the final segmentation, obtained by a CNN decoder. Instead of a one-dimensional mask output, we further exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP. Our approach sets a new state-of-the-art on the PASCAL-5^i and COCO-20^i benchmarks, achieving an absolute gain of +8.4 mIoU in the COCO-20^i 5-shot setting. Furthermore, the segmentation quality of our approach scales gracefully when increasing the support set size, while achieving robust cross-dataset transfer.
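
    The dense GP regression at the heart of the method maps support-pixel features with known mask values to a predictive mean and variance at query-pixel features. The sketch below is a textbook single-output GP with an RBF kernel, shown only to make the mechanism concrete; the paper instead uses learned deep features and a high-dimensional output space trained end to end.

```python
# Sketch: GP regression from support features X (with mask values y)
# to predictive mean and variance at query features Xq.
import numpy as np

def rbf(a, b, ls=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_predict(X, y, Xq, noise=1e-2):
    K = rbf(X, X) + noise * np.eye(len(X))
    Kq = rbf(Xq, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Kq @ alpha                                # predictive mask signal
    v = np.linalg.solve(L, Kq.T)
    var = rbf(Xq, Xq).diagonal() - (v ** 2).sum(0)   # predictive uncertainty
    return mean, var

X = np.random.randn(100, 8)          # support-pixel features
y = np.random.rand(100)              # corresponding mask values
mean, var = gp_predict(X, y, np.random.randn(30, 8))
```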

  • 298.
    Jones, Andrew
    et al.
    USC Institute Creat Technology, CA 90094 USA.
    Nagano, Koki
    USC Institute Creat Technology, CA 90094 USA.
    Busch, Jay
    USC Institute Creat Technology, CA 90094 USA.
    Yu, Xueming
    USC Institute Creat Technology, CA 90094 USA.
    Peng, Hsuan-Yueh
    USC Institute Creat Technology, CA 90094 USA.
    Barreto, Joseph
    USC Institute Creat Technology, CA 90094 USA.
    Alexander, Oleg
    USC Institute Creat Technology, CA 90094 USA.
    Bolas, Mark
    USC Institute Creat Technology, CA 90094 USA.
    Debevec, Paul
    USC Institute Creat Technology, CA 90094 USA.
    Unger, Jonas
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Time-Offset Conversations on a Life-Sized Automultiscopic Projector Array, 2016. In: Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2016), IEEE, 2016, p. 927-935. Conference paper (Refereed)
    Abstract [en]

    We present a system for creating and displaying interactive life-sized 3D digital humans based on pre-recorded interviews. We use 30 cameras and an extensive list of questions to record a large set of video responses. Users access videos through a natural conversation interface that mimics face-to-face interaction. Recordings of answers, listening and idle behaviors are linked together to create a persistent visual image of the person throughout the interaction. The interview subjects are rendered using flowed light fields and shown life-size on a special rear-projection screen with an array of 216 video projectors. The display allows multiple users to see different 3D perspectives of the subject in proper relation to their viewpoints, without the need for stereo glasses. The display is effective for interactive conversations since it provides 3D cues such as eye gaze and spatial hand gestures.

  • 299.
    Jonnarth, Arvi
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Husqvarna Grp, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Univ KwaZulu Natal, South Africa.
    Importance Sampling CAMs for Weakly-Supervised Segmentation, 2022. In: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2022, p. 2639-2643. Conference paper (Refereed)
    Abstract [en]

    Classification networks can be used to localize and segment objects in images by means of class activation maps (CAMs). However, without pixel-level annotations, classification networks are known (1) to focus mainly on discriminative regions, and (2) to produce diffuse CAMs without well-defined prediction contours. In this work, we approach both problems with two contributions for improving CAM learning. First, we incorporate importance sampling based on the class-wise probability mass function induced by the CAMs to produce stochastic image-level class predictions. This results in CAMs which activate over a larger extent of objects. Second, we formulate a feature similarity loss term which aims to match the prediction contours with edges in the image. As a third contribution, we conduct experiments on the PASCAL VOC 2012 benchmark dataset to demonstrate that these modifications significantly increase the performance in terms of contour accuracy, while being comparable to current state-of-the-art methods in terms of region similarity.
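
    The first contribution can be illustrated directly: instead of pooling a CAM into a deterministic class score, the score is read out at a spatial location sampled from the probability mass function the CAM itself induces. The PyTorch sketch below shows this mechanism under illustrative assumptions (softmax normalization over space, one sample per map); it is not the authors' code, and the feature similarity loss is not shown.

```python
# Sketch: stochastic image-level class scores via importance sampling
# from the spatial pmf induced by each class activation map.
import torch

def importance_sampled_scores(cams):
    """cams: (B, K, H, W) class activation maps -> (B, K) stochastic scores."""
    B, K, H, W = cams.shape
    flat = cams.flatten(2)                     # (B, K, H*W)
    pmf = torch.softmax(flat, dim=-1)          # CAM-induced spatial pmf
    idx = torch.multinomial(pmf.reshape(B * K, -1), 1)  # one sample per map
    scores = flat.reshape(B * K, -1).gather(1, idx)
    # Sampling high-activation locations more often spreads the training
    # signal over a larger extent of the object than a single max would.
    return scores.reshape(B, K)

scores = importance_sampled_scores(torch.randn(2, 20, 32, 32))
```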

  • 300.
    Jonsson, Christian
    Linköping University, Department of Science and Technology.
    Detection of annual rings in wood, 2008. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This report describes an annual line detection algorithm for the WoodEye quality control system. The goal of the algorithm is to find the positions of annual lines on the four surfaces of a board; this result is then used to find the inner annual ring structure of the board. The work was done using image processing techniques to analyze images collected with WoodEye. The report gives the reader an insight into the requirements of quality control systems in the woodworking industry and the benefits of automated quality control versus manual inspection. The appearance and formation of annual lines are explained in detail to provide insight into how the problem should be approached. A comparison between annual rings and fingerprints is made to see whether ideas from this area of pattern recognition can be adapted to annual line detection. This comparison, together with a study of existing methods, led to the implementation of a fingerprint enhancement method, which became a central part of the annual line detection algorithm. The annual line detection algorithm consists of two main steps: enhancing the edges of the annual rings, and tracking along the edges to form lines. Different solutions for components of the algorithm were tested to compare performance. The final algorithm was tested with different input images to determine whether the annual line detection algorithm works better with images from a grayscale or an RGB camera.
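
    Classic fingerprint enhancement of the kind the thesis adapts estimates a local ridge orientation and then filters with a Gabor kernel tuned to it, so that ridge-like structures (here, annual lines) are amplified. The sketch below is a heavily simplified assumption-laden illustration: it uses a single global orientation for brevity (real fingerprint enhancement, and presumably the thesis pipeline, works block-wise on local orientations), and all parameter values are illustrative.

```python
# Sketch: Gabor-based enhancement of line structures at the dominant
# orientation, in the spirit of fingerprint enhancement methods.
import numpy as np
import cv2

def enhance_annual_lines(gray, ksize=15, sigma=3.0, wavelength=8.0):
    # Dominant gradient orientation from the double-angle structure-tensor
    # statistics of the gradient field.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    theta = 0.5 * np.arctan2(2 * (gx * gy).mean(),
                             ((gx ** 2) - (gy ** 2)).mean())
    # Gabor kernel oriented along the ridges (perpendicular to the gradient);
    # args: ksize, sigma, theta, wavelength, spatial aspect ratio.
    kern = cv2.getGaborKernel((ksize, ksize), sigma,
                              theta + np.pi / 2, wavelength, 0.5)
    return cv2.filter2D(gray, cv2.CV_32F, kern)

board = np.random.rand(128, 128).astype(np.float32)  # stand-in board image
enhanced = enhance_annual_lines(board)
```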

    Download full text (pdf)
    FULLTEXT01