Search for publications in DiVA (liu.se)
51-92 of 92
  • 51. Källhammer, Jan-Erik
    et al.
    Eriksson, Dick
    Granlund, Gösta
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Felsberg, Michael
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Moe, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Björn
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Wiklund, Johan
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Near Zone Pedestrian Detection using a Low-Resolution FIR Sensor, 2007. In: Intelligent Vehicles Symposium, 2007 IEEE, Istanbul, Turkey: IEEE, 2007, p. 339-345. Conference paper (Refereed)
    Abstract [en]

    This paper explores the possibility to use a single low-resolution FIR camera for detection of pedestrians in the near zone in front of a vehicle. A low resolution sensor reduces the cost of the system, as well as the amount of data that needs to be processed in each frame.

    We present a system that makes use of hot-spots and image positions of near-constant bearing to detect potential pedestrians. These detections provide seeds for an energy minimization algorithm that fits a pedestrian model to the detection. Since false alarms are hard to tolerate, the pedestrian model is then tracked, and the distance-to-collision (DTC) is measured by integrating size change measurements at sub-pixel accuracy together with the car velocity. The system should only engage braking for detections on a collision course, with a reliably measured DTC.

    Preliminary experiments on a number of recorded near collision sequences indicate that our method may be useful for ranges up to about 10m using an 80x60 sensor, and somewhat more using a 160x120 sensor. We also analyze the robustness of the evaluated algorithm with respect to dead pixels, a potential problem for low-resolution sensors.
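
    A note on the DTC computation above (our own reconstruction from the abstract, not the authors' exact formulation): if s(t) is the image size of the tracked pedestrian and v the measured car velocity, the size change rate gives a time-to-collision, from which distance follows:

        \tau = \frac{s(t)}{\dot{s}(t)}, \qquad \mathrm{DTC} \approx v \cdot \tau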

  • 52.
    Larsson, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Correlating Fourier descriptors of local patches for road sign recognition, 2011. In: IET Computer Vision, ISSN 1751-9632, E-ISSN 1751-9640, Vol. 5, no. 4, p. 244-254. Article in journal (Refereed)
    Abstract [en]

    Fourier descriptors (FDs) are a classical but still popular method for contour matching. The key idea is to apply the Fourier transform to a periodic representation of the contour, which results in a shape descriptor in the frequency domain. FDs are most commonly used to compare object silhouettes and object contours; the authors instead use this well-established machinery to describe local regions to be used in an object-recognition framework. Many approaches to matching FDs are based on the magnitude of each FD component, thus ignoring the information contained in the phase. Keeping the phase information requires us to take into account the global rotation of the contour and shifting of the contour samples. The authors show that the sum-of-squared differences of FDs can be computed without explicitly de-rotating the contours. The authors compare correlation-based matching against affine-invariant Fourier descriptors (AFDs) and WARP-matched FDs and demonstrate that the correlation-based approach outperforms AFDs and WARP on real data. As a practical application the authors demonstrate the proposed correlation-based matching on a road sign recognition task.
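
    The de-rotation trick can be made concrete: a global rotation of the contour by an angle alpha multiplies every FD coefficient by e^{i alpha}, so the rotation-minimal SSD follows from a single complex correlation. A minimal NumPy sketch of this idea (function names and the coefficient truncation are our own illustration, not the authors' code; the shift of the contour start point, which the paper also handles, is omitted):

        import numpy as np

        def fourier_descriptor(contour_xy, n_coeffs=16):
            # Treat contour points as complex numbers and keep the low-order
            # FFT coefficients; zeroing the DC term gives translation invariance.
            z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
            C = np.fft.fft(z) / len(z)
            C[0] = 0.0
            return np.concatenate([C[1:n_coeffs + 1], C[-n_coeffs:]])

        def ssd_min_over_rotation(A, B):
            # min over alpha of sum_n |A_n - e^{i alpha} B_n|^2, in closed form:
            # the optimum aligns e^{i alpha} with the correlation sum_n A_n conj(B_n).
            corr = np.vdot(B, A)  # np.vdot conjugates its first argument
            return (np.sum(np.abs(A) ** 2) + np.sum(np.abs(B) ** 2)
                    - 2.0 * np.abs(corr))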

  • 53.
    Larsson, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Patch Contour Matching by Correlating Fourier Descriptors, 2009. In: Digital Image Computing: Techniques and Applications (DICTA), IEEE Computer Society, 2009, p. 40-46. Conference paper (Refereed)
    Abstract [en]

    Fourier descriptors (FDs) are a classical but still popular method for contour matching. The key idea is to apply the Fourier transform to a periodic representation of the contour, which results in a shape descriptor in the frequency domain. Fourier descriptors have mostly been used to compare object silhouettes and object contours; we instead use this well-established machinery to describe local regions to be used in an object recognition framework. We extract local regions using the Maximally Stable Extremal Regions (MSER) detector and represent the external contour by FDs. Many approaches to matching FDs are based on the magnitude of each FD component, thus ignoring the information contained in the phase. Keeping the phase information requires us to take into account the global rotation of the contour and shifting of the contour samples. We show that the sum-of-squared differences of FDs can be computed without explicitly de-rotating the contours. We compare our correlation-based matching against affine-invariant Fourier descriptors (AFDs) and demonstrate that our correlation-based approach outperforms AFDs on real-world data.

  • 54.
    Larsson, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Using Fourier descriptors for local region matching, 2009. In: SSBA, 2009. Conference paper (Other academic)
  • 55.
    Lesmana, Martin
    et al.
    Computer Science, University of British Columbia, Canada.
    Landgren, Axel
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Pai, Dinesh K.
    Computer Science, University of British Columbia, Canada.
    Active Gaze Stabilization, 2014. In: Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing / [ed] A. G. Ramakrishnan, ACM Digital Library, 2014, p. 81:1-81:8. Conference paper (Refereed)
    Abstract [en]

    We describe a system for active stabilization of cameras mounted on highly dynamic robots. To focus on careful performance evaluation of the stabilization algorithm, we use a camera mounted on a robotic test platform that can have unknown perturbations in the horizontal plane, a commonly occurring scenario in mobile robotics. We show that the camera can be effectively stabilized using an inertial sensor and a single additional motor, without a joint position sensor. The algorithm uses an adaptive controller based on a model of the vertebrate Cerebellum for velocity stabilization, with additional drift correction. We have also developed a resolution-adaptive retinal slip algorithm that is robust to motion blur.

    We evaluated the performance quantitatively using another high speed robot to generate repeatable sequences of large and fast movements that a gaze stabilization system can attempt to counteract. Thanks to the high-accuracy repeatability, we can make a fair comparison of algorithms for gaze stabilization. We show that the resulting system can reduce camera image motion to about one pixel per frame on average even when the platform is rotated at 200 degrees per second. As a practical application, we also demonstrate how the common task of face detection benefits from active gaze stabilization.

  • 56.
    Meger, David
    et al.
    UBC.
    Forssén, Per-Erik
    Department of Computer Science, University of British Columbia, Vancouver B.C. V6T 1Z4, Canada.
    Lai, Kevin
    UBC.
    Helmer, Scott
    UBC.
    McCann, Sancho
    UBC.
    Southey, Tristram
    UBC.
    Baumann, Matthew
    UBC.
    Little, James J.
    UBC.
    Lowe, David G.
    UBC.
    Curious George: An Attentive Semantic Robot, 2008. In: IROS workshop, 2007, San Diego, CA: Elsevier, 2008, p. 503-511. Conference paper (Refereed)
  • 57.
    Meger, David
    et al.
    UBC.
    Forssén, Per-Erik
    University of British Columbia.
    Lai, Kevin
    UBC.
    Helmer, Scott
    UBC.
    McCann, Sancho
    UBC.
    Southey, Tristram
    UBC.
    Baumann, Matthew
    UBC.
    Little, James J.
    UBC.
    Lowe, David G.
    UBC.
    Curious George: An Attentive Semantic Robot, 2008. In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 56, no. 6, p. 503-511. Article in journal (Refereed)
    Abstract [en]

    State-of-the-art methods have recently achieved impressive performance for recognising the objects present in large databases of pre-collected images. There has been much less focus on building embodied systems that recognise objects present in the real world. This paper describes an intelligent system that attempts to perform robust object recognition in a realistic scenario, where a mobile robot moving through an environment must use the images collected from its camera directly to recognise objects. To perform successful recognition in this scenario, we have chosen a combination of techniques including a peripheral-foveal vision system, an attention system combining bottom-up visual saliency with structure from stereo, and a localisation and mapping technique. The result is a highly capable object recognition system that can be easily trained to locate the objects of interest in an environment, and subsequently build a spatial-semantic map of the region. This capability has been demonstrated during the Semantic Robot Vision Challenge, and is further illustrated with a demonstration of semantic mapping. We also empirically verify that the attention system outperforms an undirected approach even with a significantly lower number of foveations.

  • 58.
    Merino, Luis
    et al.
    Pablo de Olavide University, Crta. Utrera km. 1, 41013 Seville, Spain.
    Caballero, Fernando
    Robotics, Vision and Control Group, University of Seville, Camino de los Descubrimientos s/n, 41092 Seville, Spain.
    Ferruz, Joaquín
    Robotics, Vision and Control Group, University of Seville, Camino de los Descubrimientos s/n, 41092 Seville, Spain.
    Wiklund, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ollero, Anibal
    Robotics, Vision and Control Group, University of Seville, Camino de los Descubrimientos s/n, 41092 Seville, Spain.
    Multi-UAV Cooperative Perception Techniques, 2007. In: Multiple Heterogeneous Unmanned Aerial Vehicles / [ed] Aníbal Ollero and Ivan Maza, Berlin/Heidelberg: Springer, 2007, Vol. 37, p. 67-110. Chapter in book (Other (popular science, discussion, etc.))
    Abstract [en]

    This Chapter is devoted to the cooperation of multiple UAVs for environment perception. First, probabilistic methods for multi-UAV cooperative perception are analyzed. Then, the problem of multi-UAV detection, localization and tracking is described, and local image processing techniques are presented. Finally, the Chapter presents two approaches based on the Information Filter and on evidence grid representations.

  • 59.
    Merino, Luis
    et al.
    Escuela Politécnica Superior, Universidad Pablo de Olavide, 41013 Sevilla, Spain.
    Caballero, Fernando
    Escuela Superior de Ingenieros, Universidad de Sevilla, 41092 Sevilla, Spain.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Wiklund, Johan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ferruz, Joaquín
    Escuela Superior de Ingenieros, Universidad de Sevilla, 41092 Sevilla, Spain.
    Martinez-de Dios, Jose Ramiro
    Escuela Superior de Ingenieros, Universidad de Sevilla, 41092 Sevilla, Spain.
    Moe, Anders
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Nordberg, Klas
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ollero, Anibal
    Escuela Superior de Ingenieros, Universidad de Sevilla, 41092 Sevilla, Spain.
    Single and Multi-UAV Relative Position Estimation Based on Natural Landmarks, 2007. In: Advances in Unmanned Aerial Vehicles: State of the Art and the Road to Autonomy / [ed] Kimon P. Valavanis, Netherlands: Springer, 2007, p. 267-307. Chapter in book (Other (popular science, discussion, etc.))
    Abstract [en]

    This Chapter presents a vision-based method for unmanned aerial vehicle (UAV) motion estimation that uses as input an image motion field obtained from matches of point-like features. The Chapter enhances vision-based techniques developed for single-UAV localization and demonstrates how they can be modified to deal with the problem of multi-UAV relative position estimation. The proposed approach is built upon the assumption that if different UAVs identify, using their cameras, common objects in a scene, the relative pose displacement between the UAVs can be computed from these correspondences. However, although point-like features are suitable for local UAV motion estimation, finding matches between images collected using different cameras is a difficult task that may be overcome using blob features. Results justify the proposed approach.

  • 60.
    Merino, Luis
    et al.
    Pablo de Olavide University, Crta. Utrera km. 1, 41013 Seville, Spain.
    Wiklund, Johan
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Caballero, Fernando
    System Engineering and Automation Department.
    Moe, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Martinez-de Dios, Jose Ramiro
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Nordberg, Klas
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Ollero, Anibal
    Department of E.S Ingenieros, University of Seville.
    Vision-Based Multi-UAV Position Estimation, 2006. In: IEEE Robotics & Automation Magazine, ISSN 1070-9932, Vol. 13, no. 3, p. 53-62. Article in journal (Refereed)
    Abstract [en]

    This paper describes a method for vision-based unmanned aerial vehicle (UAV) motion estimation from multiple planar homographies. The paper also describes the determination of the relative displacement between different UAVs employing techniques for blob feature extraction and matching. It then presents experimental results from the application of the proposed technique to multi-UAV detection of forest fires.
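
    For reference, the standard planar-homography relation underlying this kind of motion estimation (a textbook identity, not a formula quoted from the paper): two calibrated views of a plane with normal n at distance d, related by rotation R and translation t, satisfy

        H \simeq K \left( R + \frac{t\, n^{\top}}{d} \right) K^{-1}

    so decomposing an estimated H recovers the relative displacement, up to the known ambiguities of the decomposition.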

  • 61.
    Nordberg, Klas
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Doherty, Patrick
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, KPLAB - Knowledge Processing Lab.
    Farnebäck, Gunnar
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Granlund, Gösta
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Moe, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Wiklund, Johan
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Vision for a UAV helicopter, 2002. In: International Conference on Intelligent Robots and Systems (IROS), Workshop on Aerial Robotics, Lausanne, Switzerland, 2002. Conference paper (Other academic)
    Abstract [en]

    This paper presents an overview of the basic and applied research carried out by the Computer Vision Laboratory, Linköping University, in the WITAS UAV Project. This work includes customizing and redesigning vision methods to fit the particular needs and restrictions imposed by the UAV platform, e.g., for low-level vision, motion estimation, navigation, and tracking. It also includes a new learning structure for association of perception-action activations, and a runtime system for implementation and execution of vision algorithms. The paper also contains a brief introduction to the WITAS UAV Project.

  • 62.
    Nordberg, Klas
    et al.
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Doherty, Patrick
    Linköping University, The Institute of Technology. Linköping University, Department of Computer and Information Science, KPLAB - Knowledge Processing Lab.
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Wiklund, Johan
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Andersson, Per
    A flexible runtime system for image processing in a distributed computational environment for an unmanned aerial vehicle, 2006. In: International Journal of Pattern Recognition and Artificial Intelligence, ISSN 0218-0014, Vol. 20, no. 5, p. 763-780. Article in journal (Refereed)
    Abstract [en]

    A runtime system for implementation of image processing operations is presented. It is designed for working in a flexible and distributed environment related to the software architecture of a newly developed UAV system. The software architecture can be characterized at a coarse scale as a layered system, with a deliberative layer at the top, a reactive layer in the middle, and a processing layer at the bottom. At a finer scale each of the three levels is decomposed into sets of modules which communicate using CORBA, allowing system development and deployment on the UAV to be made in a highly flexible way. Image processing takes place in a dedicated module located in the process layer, and is the main focus of the paper. This module has been designed as a runtime system for data flow graphs, allowing various processing operations to be created online and on demand by the higher levels of the system. The runtime system is implemented in Java, which allows development and deployment to be made on a wide range of hardware/software configurations. Optimizations for particular hardware platforms have been made using Java's native interface.

  • 63.
    Ogniewski, Jens
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Pushing the Limits for View Prediction in Video Coding, 2017. In: Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), Vol. 4, SciTePress, 2017, p. 68-76. Conference paper (Refereed)
    Abstract [en]

    More and more devices have depth sensors, making RGB+D video (colour+depth video) increasingly common. RGB+D video allows the use of depth image based rendering (DIBR) to render a given scene from different viewpoints, thus making it a useful asset in view prediction for 3D and free-viewpoint video coding. In this paper we evaluate a multitude of algorithms for scattered data interpolation, in order to optimize the performance of DIBR for video coding. This also includes novel contributions like a Kriging refinement step, an edge suppression step to suppress artifacts, and a scale-adaptive kernel. Our evaluation uses the depth extension of the Sintel datasets. Using ground-truth sequences is crucial for such an optimization, as it ensures that all errors and artifacts are caused by the prediction itself rather than noisy or erroneous data. We also present a comparison with the commonly used mesh-based projection.

  • 64.
    Ogniewski, Jens
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    What is the best depth-map compression for Depth Image Based Rendering?, 2017. In: Computer Analysis of Images and Patterns: 17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II / [ed] Michael Felsberg, Anders Heyden and Norbert Krüger, Springer, 2017, Vol. 10425, p. 403-415. Conference paper (Refereed)
    Abstract [en]

    Many of the latest smart phones and tablets come with integrated depth sensors, that make depth-maps freely available, thus enabling new forms of applications like rendering from different view points. However, efficient compression exploiting the characteristics of depth-maps as well as the requirements of these new applications is still an open issue. In this paper, we evaluate different depth-map compression algorithms, with a focus on tree-based methods and view projection as application.

    The contributions of this paper are the following: 1. extensions of existing geometric compression trees, 2. a comparison of a number of different trees, 3. a comparison of them to a state-of-the-art video coder, 4. an evaluation using ground-truth data that considers both depth-maps and predicted frames with arbitrary camera translation and rotation.

    Despite our best efforts, and contrary to earlier results, current video depth-map compression outperforms tree-based methods in most cases. The reason for this is likely that previous evaluations focused on low-quality, low-resolution depth maps, while high-resolution depth (as needed in the DIBR setting) has been ignored up until now. We also demonstrate that PSNR on depth-maps is not always a good measure of their utility.

  • 65.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Gyroscope-based video stabilisation with auto-calibration, 2015. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, p. 2090-2097. Conference paper (Refereed)
    Abstract [en]

    We propose a technique for joint calibration of a wide-angle rolling shutter camera (e.g. a GoPro) and an externally mounted gyroscope. The calibrated parameters are time scaling and offset, relative pose between gyroscope and camera, and gyroscope bias. The parameters are found using non-linear least squares minimisation using the symmetric transfer error as cost function. The primary contribution is methods for robust initialisation of the relative pose and time offset, which are essential for convergence. We also introduce a robust error norm to handle outliers. This results in a technique that works with general video content and does not require any specific setup or calibration patterns. We apply our method to stabilisation of videos recorded by a rolling shutter camera, with a rigidly attached gyroscope. After recording, the gyroscope and camera are jointly calibrated using the recorded video itself. The recorded video can then be stabilised using the calibrated parameters. We evaluate the technique on video sequences with varying difficulty and motion frequency content. The experiments demonstrate that our method can be used to produce high-quality stabilised videos even under difficult conditions, and that the proposed initialisation ends up within the basin of attraction. We also show that a residual based on the symmetric transfer error is more accurate than residuals based on the recently proposed epipolar plane normal coplanarity constraint.
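
    The symmetric transfer error used as cost function above is the standard two-view quantity: for correspondences x_i <-> x_i' related by a homography H,

        E(H) = \sum_i d(\mathbf{x}_i', H\mathbf{x}_i)^2 + d(\mathbf{x}_i, H^{-1}\mathbf{x}_i')^2

    The definition is standard; how H depends on the gyro-derived rotations and the sought calibration parameters is what the paper's non-linear least squares optimises.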

  • 66.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Spline Error Weighting for Robust Visual-Inertial Fusion, 2018. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, p. 321-329. Conference paper (Refereed)
    Abstract [en]

    In this paper we derive and test a probability-based weighting that can balance residuals of different types in spline fitting. In contrast to previous formulations, the proposed spline error weighting scheme also incorporates a prediction of the approximation error of the spline fit. We demonstrate the effectiveness of the prediction in a synthetic experiment, and apply it to visual-inertial fusion on rolling shutter cameras. This results in a method that can estimate 3D structure with metric scale on generic first-person videos. We also propose a quality measure for spline fitting, that can be used to automatically select the knot spacing. Experiments verify that the obtained trajectory quality corresponds well with the requested quality. Finally, by linearly scaling the weights, we show that the proposed spline error weighting minimizes the estimation errors on real sequences, in terms of scale and end-point errors.
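
    One plausible reading of the weighting scheme (our paraphrase of the abstract, not the paper's exact estimator): each residual is normalised by its predicted total variance, the sum of the measurement noise variance and the predicted spline approximation error variance,

        w_i = \frac{1}{\sigma_{n,i}^2 + \sigma_{a,i}^2}

    so that residuals of different types become comparable in the joint cost.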

  • 67.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Swedish Def Res Agcy, Sweden.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Trajectory representation and landmark projection for continuous-time structure from motion, 2019. In: The International Journal of Robotics Research, ISSN 0278-3649, E-ISSN 1741-3176, Vol. 38, no. 6, p. 686-701. Article in journal (Refereed)
    Abstract [en]

    This paper revisits the problem of continuous-time structure from motion, and introduces a number of extensions that improve convergence and efficiency. The formulation with a C2-continuous spline for the trajectory naturally incorporates inertial measurements, as derivatives of the sought trajectory. We analyze the behavior of split spline interpolation on SO(3) and on R3, and of a joint spline on SE(3), and show that the latter implicitly couples the direction of translation and rotation. Such an assumption can make good sense for a camera mounted on a robot arm, but not for hand-held or body-mounted cameras. Our experiments in the Spline Fusion framework show that a split spline on R3 and SO(3) is preferable over an SE(3) spline in all tested cases. Finally, we investigate the problem of landmark reprojection on rolling shutter cameras, and show that the tested reprojection methods give similar quality, whereas their computational load varies by a factor of two.

  • 68.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Törnqvist, David
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification, 2015. In: New Development in Robot Vision / [ed] Yu Sun, Aman Behal & Chi-Kit Ronald Chung, Springer Berlin/Heidelberg, 2015, p. 55-71. Chapter in book (Refereed)
    Abstract [en]

    Scene reconstruction, i.e. the process of creating a 3D representation (mesh) of some real world scene, has recently become easier with the advent of cheap RGB-D sensors (e.g. the Microsoft Kinect).

    Many such sensors use rolling shutter cameras, which produce geometrically distorted images when they are moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor.

    For scene reconstruction we use the Kinect Fusion algorithm to produce meshes. We create meshes from both raw and rectified depth scans, and these are then compared to a ground truth mesh. The types of motion we investigate are: pan, tilt and wobble (shaking) motions.

    As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.

    This chapter is an extension of a paper at the IEEE Workshop on Robot Vision [10]. Compared to that paper, we have improved the rectification to also correct for lens distortion, and use a coarse-to-fine search to find the time shift more quickly. We have extended our experiments to also investigate the effects of lens distortion, and to use more accurate ground truth. The experiments demonstrate that correction of rolling shutter effects yields a larger improvement of the 3D model than correction for lens distortion.

  • 69.
    Ovrén, Hannes
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Törnqvist, David
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Why Would I Want a Gyroscope on my RGB-D Sensor?, 2013. In: Proceedings of 2013 IEEE Workshop on Robot Vision (WORV), IEEE, 2013, p. 68-75. Conference paper (Refereed)
    Abstract [en]

    Many RGB-D sensors, e.g. the Microsoft Kinect, use rolling shutter cameras. Such cameras produce geometrically distorted images when the sensor is moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor. We examine the effectiveness of our rectification scheme by coupling it with the Kinect Fusion algorithm. By comparing Kinect Fusion models obtained from raw sensor scans and from rectified scans, we demonstrate improvement for three classes of sensor motion: panning motions cause slant distortions, tilt motions cause vertically elongated or compressed objects, and for wobble we also observe a loss of detail, compared to the reconstruction using rectified depth scans. As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.
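
    A rotation-only sketch of the rectification idea (variable names and the R_of_t interface are our own; the paper additionally estimates the gyro-to-camera pose and time synchronization):

        import numpy as np

        def rectify_depth_scan(points, row_times, R_of_t, t_ref):
            # points:    (H, W, 3) back-projected 3D points, one row per
            #            sensor row of the rolling-shutter depth frame
            # row_times: (H,) capture time of each row during readout
            # R_of_t:    callable t -> 3x3 camera-to-world rotation,
            #            integrated from gyroscope samples (assumed given)
            # t_ref:     time the whole frame is rectified to
            R_ref = R_of_t(t_ref)
            out = np.empty_like(points)
            for r, t in enumerate(row_times):
                # undo the orientation at row time t, then re-apply the
                # reference orientation (pure rotation model, no translation)
                R = R_ref @ R_of_t(t).T
                out[r] = points[r] @ R.T
            return out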

  • 70.
    Persson, Mikael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Independently Moving Object Trajectories from Sequential Hierarchical Ransac, 2021. In: VISAPP: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Vol. 5: VISAPP, SciTePress, 2021, p. 722-731. Conference paper (Refereed)
    Abstract [en]

    Safe robot navigation in a dynamic environment requires the trajectories of each independently moving object (IMO). We present the novel and effective system Sequential Hierarchical Ransac Estimation (Shire) designed for this purpose. The system uses a stereo camera stream to find the objects and trajectories in real time. Shire detects moving objects using geometric consistency and finds their trajectories using bundle adjustment. Relying on geometric consistency allows the system to handle objects regardless of semantic class, unlike approaches based on semantic segmentation. Most Visual Odometry (VO) systems are inherently limited to a single motion by the choice of tracker. This limitation allows for efficient and robust ego-motion estimation in real time, but precludes tracking the multiple motions sought. Shire instead uses a generic tracker and achieves accurate VO and IMO estimates using track analysis. This removes the restriction to a single motion while retaining the real-time performance required for live navigation. We evaluate the system by bounding box intersection over union and ID persistence on a public dataset, collected from an autonomous test vehicle driving in real traffic. We also show the velocities of estimated IMOs. We investigate variations of the system that provide trade-offs between accuracy, performance and limitations.

  • 71.
    Persson, Mikael
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ovrén, Hannes
    Swedish Def Res Agcy, Linkoping, Sweden.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Practical Pose Trajectory Splines With Explicit Regularization, 2021. In: 2021 International Conference on 3D Vision (3DV 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 156-165. Conference paper (Refereed)
    Abstract [en]

    We investigate spline-based continuous-time pose trajectory estimation using non-linear explicit motion priors. Current regularization priors either linearize the orientation, rely on the implicit regularization obtained from the used spline basis function, or use sampling-based regularization schemes. The latter is a special case of a Riemann sum approximation, and we demonstrate when and why this can fail, and propose a way to avoid these issues. In addition we provide a number of novel, practically useful theoretical contributions, including requirements on knot spacing for orientation splines, new basis functions for constant velocity extrapolation, and a generalization of the popular P-Spline penalty to orientation. We analyze the properties of the proposed approach using synthetic data. We validate our system using the standard task of visual-inertial calibration, and apply it to stereo visual odometry, where we demonstrate real-time performance on KITTI.

  • 72.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ahlberg, Jörgen
    Sensor Informatics Group, Swedish Defence Research Agency (FOI), Linköping.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Wadströmer, Niclas
    Sensor Informatics Group, Swedish Defence Research Agency (FOI), Linköping.
    Co-alignment of Aerial Push-broom Strips using Trajectory Smoothness Constraints, 2010. Conference paper (Other academic)
    Abstract [en]

    We study the problem of registering a sequence of scan lines (a strip) from an airborne push-broom imager to another sequence partly covering the same area. Such a registration has to compensate for deformations caused by attitude and speed changes in the aircraft. The registration is challenging, as both strips contain such deformations. Our algorithm estimates the 3D rotation of the camera for each scan line, by parametrising it as a linear spline with a number of knots evenly distributed in one of the strips. The rotations are estimated from correspondences between strips of the same area. Once the rotations are known, they can be compensated for, and each line of pixels can be transformed such that the ground traces of the two strips are registered with respect to each other.

  • 73.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ahlberg, Jörgen
    FOI, Swedish Defence Research Agency, Linköping, Sweden.
    Wadströmer, Niclas
    FOI, Swedish Defence Research Agency, Linköping, Sweden.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Co-aligning Aerial Hyperspectral Push-broom Strips for Change Detection, 2010. In: Proc. SPIE 7835, Electro-Optical Remote Sensing, Photonic Technologies, and Applications IV / [ed] Gary W. Kamerman; Ove Steinvall; Keith L. Lewis; Richard C. Hollins; Thomas J. Merlet; Gary J. Bishop; John D. Gonglewski, SPIE - International Society for Optical Engineering, 2010, Art. no. 7835B-36. Conference paper (Refereed)
    Abstract [en]

    We have performed a field trial with an airborne push-broom hyperspectral sensor, making several flights over the same area and with known changes (e.g., moved vehicles) between the flights. Each flight results in a sequence of scan lines forming an image strip, and in order to detect changes between two flights, the two resulting image strips must be geometrically aligned and radiometrically corrected. The focus of this paper is the geometrical alignment, and we propose an image- and gyro-based method for geometric co-alignment (registration) of two image strips. The method is particularly useful when the sensor is not stabilized, thus reducing the need for expensive mechanical stabilization. The method works in several steps, including gyro-based rectification, global alignment using SIFT matching, and a local alignment using KLT tracking. Experimental results are shown but not quantified, as ground truth is, by the nature of the trial, lacking.

  • 74.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    A Virtual Tripod for Hand-held Video Stacking on Smartphones, 2014. In: 2014 IEEE International Conference on Computational Photography (ICCP), IEEE, 2014. Conference paper (Refereed)
    Abstract [en]

    We propose an algorithm that can capture sharp, low-noise images in low-light conditions on a hand-held smartphone. We make use of the recent ability to acquire bursts of high-resolution images on high-end models such as the iPhone 5s. Frames are aligned, or stacked, using rolling shutter correction, based on motion estimated from the built-in gyro sensors and image feature tracking. After stacking, the images may be combined, using e.g. averaging, to produce a sharp, low-noise photo. We have tested the algorithm on a variety of different scenes, using several different smartphones. We compare our method to denoising, direct stacking, as well as a global-shutter based stacking, with favourable results.

  • 75.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Efficient Video Rectification and Stabilisation for Cell-Phones, 2012. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 96, no. 3, p. 335-352. Article in journal (Refereed)
    Abstract [en]

    This article presents a method for rectifying and stabilising video from cell-phones with rolling shutter (RS) cameras. Due to size constraints, cell-phone cameras have constant, or near constant focal length, making them an ideal application for calibrated projective geometry. In contrast to previous RS rectification attempts that model distortions in the image plane, we model the 3D rotation of the camera. We parameterise the camera rotation as a continuous curve, with knots distributed across a short frame interval. Curve parameters are found using non-linear least squares over inter-frame correspondences from a KLT tracker. By smoothing a sequence of reference rotations from the estimated curve, we can at a small extra cost, obtain a high-quality image stabilisation. Using synthetic RS sequences with associated ground-truth, we demonstrate that our rectification improves over two other methods. We also compare our video stabilisation with the methods in iMovie and Deshaker.
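
    For context, the pure-rotation rolling-shutter model that this line of work builds on maps a point imaged in a row captured at time t to its rectified position via

        \mathbf{x}_{\mathrm{rect}} \simeq K\, R(t_{\mathrm{ref}})\, R(t)^{\top} K^{-1}\, \mathbf{x}

    where R(t) is the continuous camera rotation and K the (near constant) intrinsic matrix. We state the standard form of this model here; the article gives the exact spline parametrisation of R(t).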

  • 76.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Rectifying rolling shutter video from hand-held devices, 2011. In: Proceedings SSBA'11 Symposium on Image Analysis, 2011. Conference paper (Other academic)
    Abstract [en]

    This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground-truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.

  • 77.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Scan Rectification for Structured Light Range Sensors with Rolling Shutters, 2011. In: IEEE International Conference on Computer Vision, Barcelona, Spain, 2011, p. 1575-1582. Conference paper (Other academic)
    Abstract [en]

    Structured light range sensors, such as the Microsoft Kinect, have recently become popular as perception devices for computer vision and robotic systems. These sensors use CMOS imaging chips with electronic rolling shutters (ERS). When using such a sensor on a moving platform, both the image, and the depth map, will exhibit geometric distortions. We introduce an algorithm that can suppress such distortions, by rectifying the 3D point clouds from the range sensor. This is done by first estimating the time continuous 3D camera trajectory, and then transforming the 3D points to where they would have been, if the camera had been stationary. To ensure that image and range data are synchronous, the camera trajectory is computed from KLT tracks on the structured-light frames, after suppressing the structured-light pattern. We evaluate our rectification, by measuring angles between the visible sides of a cube, before and after rectification. We also measure how much better the 3D point clouds can be aligned after rectification. The obtained improvement is also related to the actual rotational velocity, measured using a MEMS gyroscope.

  • 78.
    Ringaby, Erik
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Friman, Ola
    Linköping University, Department of Biomedical Engineering, Medical Informatics. Linköping University, The Institute of Technology. Sick IVP AB, Linköping, Sweden.
    Olsvik Opsahl, Thomas
    Norwegian Defence Research Establishment.
    Vegard Haavardsholm, Trym
    Norwegian Defence Research Establishment.
    Kåsen, Ingebjørg
    Norwegian Defence Research Establishment.
    Anisotropic Scattered Data Interpolation for Pushbroom Image Rectification, 2014. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no. 5, p. 2302-2314. Article in journal (Refereed)
    Abstract [en]

    This article deals with fast and accurate visualization of pushbroom image data from airborne and spaceborne platforms. A pushbroom sensor acquires images in a line-scanning fashion, and this results in scattered input data that needs to be resampled onto a uniform grid for geometrically correct visualization. To this end, we model the anisotropic spatial dependence structure caused by the acquisition process. Several methods for scattered data interpolation are then adapted to handle the induced anisotropic metric and compared for the pushbroom image rectification problem. A trick that exploits the semi-ordered line structure of pushbroom data to improve the computational complexity by several orders of magnitude is also presented.
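
    To illustrate how an anisotropic metric enters such interpolators, here is a sketch of inverse-distance weighting under a metric matrix M (IDW is only one of the methods the article compares, and the function below is our own illustration, not the article's code):

        import numpy as np

        def anisotropic_idw(query, pts, vals, M, p=2.0, eps=1e-9):
            # Replace the Euclidean distance by the anisotropic metric
            # d(x, y)^2 = (x - y)^T M (x - y); M can e.g. compress
            # distances along the flight track to model the acquisition.
            d = pts - query                          # (N, 2) offsets
            d2 = np.einsum('ni,ij,nj->n', d, M, d)   # squared metric distances
            w = 1.0 / (d2 ** (p / 2.0) + eps)        # inverse-distance weights
            return np.sum(w * vals) / np.sum(w)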

  • 79.
    Sandberg, David
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Ogniewski, Jens
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology.
    Model-Based Video Coding using Colour and Depth Cameras, 2011. In: Digital Image Computing: Techniques and Applications (DICTA11), IEEE, 2011, p. 158-163. Conference paper (Other academic)
    Abstract [en]

    In this paper, we present a model-based video coding method that uses input from colour and depth cameras, such as the Microsoft Kinect. The model-based approach uses a 3D representation of the scene, enabling several other applications besides video playback. Some of these applications are stereoscopic viewing, object insertion for augmented reality, and free viewpoint viewing. The video encoding step uses computer vision to estimate the camera motion. The scene geometry is represented by keyframes, which are encoded as 3D quads using a quadtree, allowing good compression rates. Camera motion in-between keyframes is approximated to be linear. The relative camera positions at keyframes and the scene geometry are then compressed and transmitted to the decoder. Our experiments demonstrate that the model-based approach delivers a high level of detail at competitively low bitrates.

  • 80.
    Scharr, Hanno
    et al.
    Forschungszentrum Juelich.
    Felsberg, Michael
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Noise Adaptive Channel Smoothing of Low-Dose Images, 2003. In: Computer Vision for the Nano-Scale Workshop accompanying CVPR 2003, Madison: IEEE Computer Society, 2003. Conference paper (Refereed)
  • 81.
    Spies, Hagen
    et al.
    ContextVision AB, Linköping Sweden.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Two-dimensional channel representation for multiple velocities, 2003. In: Proceedings of the 13th Scandinavian Conference on Image Analysis, SCIA 2003 / [ed] Josef Bigun and Tomas Gustavsson, Berlin, Heidelberg: SpringerLink, 2003, Vol. 2749, p. 356-362. Conference paper (Refereed)
    Abstract [en]

    We present a two-dimensional information representation, where small but overlapping Gaussian kernels are used to encode the data in a matrix. Apart from points we apply this to constraints that restrict the solution to a linear subspace. A localised decoding scheme accurately extracts multiple solutions together with an estimate of the covariances. We employ the method in optical flow computations to determine multiple velocities occurring at motion discontinuities.
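
    A minimal sketch of the 2D channel encoding described above (grid layout and names are our own illustration; the localised decoding scheme the paper uses to extract multiple solutions is omitted):

        import numpy as np

        def channel_encode_2d(x, y, centers_x, centers_y, sigma):
            # Encode one 2D measurement as the outer product of two 1D
            # Gaussian channel vectors: a matrix of overlapping kernel
            # responses. Summing the matrices of several measurements
            # preserves multiple hypotheses, e.g. multiple velocities.
            cx = np.exp(-0.5 * ((centers_x - x) / sigma) ** 2)
            cy = np.exp(-0.5 * ((centers_y - y) / sigma) ** 2)
            return np.outer(cy, cx)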

  • 82.
    Tavares, Anderson
    et al.
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. RISE SICS East, SE-58183 Linköping, Sweden.
    Järemo-Lawin, Felix
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Assessing Losses for Point Set Registration, 2020. In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 5, no. 2, p. 3360-3367. Article in journal (Refereed)
    Abstract [en]

    This letter introduces a framework for evaluation of the losses used in point set registration. In order for a loss to be useful with a local optimizer, such as Levenberg-Marquardt or expectation maximization (EM), it must be monotonic with respect to the sought transformation. This motivates us to introduce monotonicity violation probability (MVP) curves, and use these to assess monotonicity empirically for many different local distances, such as point-to-point, point-to-plane, and plane-to-plane. We also introduce a local shape-to-shape distance, based on the Wasserstein distance of the local normal distributions. Evaluation is done on a comprehensive benchmark of terrestrial lidar scans from two publicly available datasets. It demonstrates that matching robustness can be improved significantly, by using kernel versions of local distances together with inverse density based sample weighting.
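
    The shape-to-shape distance builds on the closed-form 2-Wasserstein distance between Gaussians (the closed form is standard; the letter describes its use as a registration loss on the local normal distributions of the two point sets):

        W_2^2\big(\mathcal{N}(\mu_1,\Sigma_1),\,\mathcal{N}(\mu_2,\Sigma_2)\big) = \|\mu_1-\mu_2\|^2 + \operatorname{tr}\!\Big(\Sigma_1+\Sigma_2-2\big(\Sigma_2^{1/2}\,\Sigma_1\,\Sigma_2^{1/2}\big)^{1/2}\Big)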

  • 83.
    Viksten, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Maximally Robust Range Regions, 2010. Report (Other academic)
    Abstract [en]

    In this work we present a region detector, an adaptation to range data of the popular Maximally Stable Extremal Regions (MSER) region detector. We call this new detector Maximally Robust Range Regions (MRRR). We apply the new detector to real range data captured by a commercially available laser range camera. Using this data we evaluate the repeatability of the new detector and compare it to some other recently published detectors. The presented detector shows a repeatability that is better than or equal to the best of the other detectors. The MRRR detector also offers additional data on the detected regions. The additional data could be crucial in applications such as registration or recognition.

  • 84.
    Viksten, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Björn
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Vision.
    Moe, Anders
    SICK/IVP.
    Comparison of Local Image Descriptors for Full 6 Degree-of-Freedom Pose Estimation, 2009. In: IEEE ICRA 2009, ISSN 1050-4729, Kobe: IEEE Robotics and Automation Society, 2009, p. 2779-2786. Conference paper (Refereed)
    Abstract [en]

    Recent years have seen advances in the estimation of full 6 degree-of-freedom object pose from a single 2D image. These advances have often been presented as a result of, or together with, a new local image descriptor. This paper examines how the performance of such a system varies with the choice of local descriptor. This is done by comparing the performance of a full 6 degree-of-freedom pose estimation system for fourteen types of local descriptors. The evaluation is done on a database with photos of complex objects with simple and complex backgrounds and varying lighting conditions. From the experiments we can conclude that duplet features, which use pairs of interest points, improve pose estimation accuracy, and that affine covariant features do not work well in current pose estimation frameworks. The data sets and their ground truth are available on the web to allow future comparison with novel algorithms.

  • 85.
    Viksten, Fredrik
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Johansson, Björn
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Moe, Anders
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    Local Image Descriptors for Full 6 Degree-of-Freedom Object Pose Estimation and Recognition, 2010. Article in journal (Refereed)
    Abstract [en]

    Recent years have seen advances in the estimation of full 6 degree-of-freedom object pose from a single 2D image. These advances have often been presented as a result of, or together with, a new local image feature type. This paper examines how the pose accuracy and recognition robustness of such a system vary with the choice of feature type. This is done by evaluating a full 6 degree-of-freedom pose estimation system for 17 different combinations of local descriptors and detectors. The evaluation is done on data sets with photos of challenging 3D objects with simple and complex backgrounds and varying illumination conditions. We examine the performance of the system under varying levels of object occlusion, and find that many features tolerate considerable object occlusion. From the experiments we can conclude that duplet features, which use pairs of interest points, improve pose estimation accuracy compared to single-point features. Interestingly, we can also show that many features previously used for recognition and wide-baseline stereo are unsuitable for pose estimation; one notable example is the affine covariant features, which have proven quite successful in other applications. The data sets and their ground truths are available on the web to allow future comparison with novel algorithms.

  • 86.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Dellen, Babette
    Institut de Robotica i Informatica Industrial (CSIC-UPC) Llorens i Artigas 4-6, 08028 Barcelona, Spain.
    Channel Coding for Joint Colour and Depth Segmentation, 2011. In: Proceedings of Pattern Recognition, 33rd DAGM Symposium, Frankfurt/Main, Germany, August 31 - September 2 / [ed] Rudolf Mester and Michael Felsberg, Springer, 2011, p. 306-315. Conference paper (Refereed)
    Abstract [en]

    Segmentation is an important preprocessing step in many applications. Compared to colour-only segmentation, fusing colour and depth greatly improves the segmentation result. Such a fusion is easy to achieve by stacking measurements along different value dimensions, but better alternatives exist. In this paper we perform the fusion using the channel representation, and demonstrate how a state-of-the-art segmentation algorithm can be modified to use channel values as inputs. We evaluate segmentation results on data collected using the Microsoft Kinect peripheral for Xbox 360, using the superparamagnetic clustering algorithm. Our experiments show that depth gradients are more useful than depth values for segmentation, and that channel coding both colour and depth gradients makes tuned parameter settings generalise better to novel images.
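
    Channel coding here refers to the channel representation: each scalar is encoded as soft votes in a small set of overlapping kernels. The sketch below uses the cos^2 kernel common in the channel-representation literature; the channel centres and width are our choices.

    ```python
    import numpy as np

    def channel_encode(x, centres, width):
        """Encode scalar values into overlapping cos^2 channels
        (kernel support 3 * width, one common choice)."""
        d = np.abs(x[:, None] - centres[None, :])       # distances to centres
        inside = d < 1.5 * width                        # kernel support
        return np.where(inside, np.cos(np.pi * d / (3.0 * width)) ** 2, 0.0)

    centres = np.arange(0.0, 256.0, 32.0)     # channel centres over an 8-bit range
    values = np.array([10.0, 100.0, 250.0])   # e.g. colour or depth-gradient values
    enc = channel_encode(values, centres, width=32.0)
    print(enc.round(2))   # each row is a soft, sparse code for one measurement
    ```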

  • 87.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Dellen, Babette
    Institut de Robotica i Informatica Industrial, Barcelona, Spain.
    Leaf Segmentation using the Kinect, 2011. In: Proceedings of SSBA 2011 Symposium on Image Analysis, 2011. Conference paper (Other (popular science, discussion, etc.))
    Abstract [en]

    Segmentation is an important preprocessing step in many applications. Purely colour-based segmentation is often problematic. For this reason, we here investigate fusion of depth and colour information, to facilitate robust segmentation of single images. We evaluate segmentation results on data collected using the Microsoft Kinect peripheral for Xbox 360, using superparamagnetic clustering. We also propose a method for aligning and encoding colour and depth information from the Kinect device. As we show in the paper, the fusion of depth and colour information produces more semantically relevant segments for scene analysis than either depth or colour separately.
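
    The alignment step mentioned above is, in the standard formulation, a reprojection of depth pixels into the colour camera. A sketch under assumed intrinsics and extrinsics (all numbers illustrative, not actual Kinect calibration values):

    ```python
    import numpy as np

    def align_depth_to_colour(depth, K_d, K_c, R, t):
        """Map each depth pixel to colour-image coordinates: back-project
        with the depth intrinsics, transform with the depth-to-colour
        extrinsics (R, t), and project with the colour intrinsics."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
        pts = np.linalg.inv(K_d) @ pix * depth.reshape(1, -1)  # 3D points
        pts_c = R @ pts + t.reshape(3, 1)                      # colour frame
        proj = K_c @ pts_c
        return (proj[:2] / proj[2:]).T.reshape(h, w, 2)

    K = np.array([[580.0, 0, 320], [0, 580.0, 240], [0, 0, 1.0]])
    R, t = np.eye(3), np.array([0.025, 0.0, 0.0])   # assumed ~2.5 cm baseline
    depth = np.full((480, 640), 1.5)                # flat scene at 1.5 m
    print(align_depth_to_colour(depth, K, K, R, t).shape)
    ```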

  • 88.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Attentional Masking for Pre-trained Deep Networks, 2017. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS17), Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 6149-6154. Conference paper (Refereed)
    Abstract [en]

    The ability to direct visual attention is a fundamental skill for seeing robots. Attention comes in two flavours: the gaze direction (overt attention) and attention to a specific part of the current field of view (covert attention), of which the latter is the focus of the present study. Specifically, we study the effects of attentional masking within pre-trained deep neural networks for the purpose of handling ambiguous scenes containing multiple objects. We investigate several variants of attentional masking on partially pre-trained deep neural networks and evaluate the effects on classification performance and sensitivity to attention mask errors in multi-object scenes. We find that a combined scheme consisting of multi-level masking and blending provides the best trade-off between classification accuracy and insensitivity to masking errors. This proposed approach is denoted multilayer continuous-valued convolutional feature masking (MC-CFM). For reasonably accurate masks it can suppress the influence of distracting objects and reach comparable classification performance to unmasked recognition in cases without distractors.
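
    The abstract does not give the exact MC-CFM blending rule, so the sketch below only shows the general mechanism: resize a continuous-valued attention mask to each feature map and blend it toward one so the masking stays soft. The layer shapes and the blend weight alpha are assumptions.

    ```python
    import torch
    import torch.nn.functional as F

    def mask_features(feats, mask, alpha=0.5):
        """Soft attentional masking of a conv feature map: resize the
        mask to the map's spatial size and blend toward 1. alpha and the
        blending rule are illustrative, not the paper's exact scheme."""
        m = F.interpolate(mask, size=feats.shape[-2:], mode="bilinear",
                          align_corners=False)
        return feats * (alpha + (1.0 - alpha) * m)

    # Hypothetical activations from two layers of a pre-trained network.
    feat1 = torch.randn(1, 64, 56, 56)
    feat2 = torch.randn(1, 256, 14, 14)
    mask = torch.zeros(1, 1, 224, 224)
    mask[..., 60:160, 80:180] = 1.0     # attended region in image coordinates

    out1, out2 = mask_features(feat1, mask), mask_features(feat2, mask)
    print(out1.shape, out2.shape)
    ```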

  • 89.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    A Research Platform for Embodied Visual Object Recognition, 2010. In: Proceedings of SSBA 2010 Symposium on Image Analysis / [ed] Hendriks Luengo and Milan Gavrilovic, 2010, p. 137-140. Conference paper (Other academic)
    Abstract [en]

    We present in this paper a research platform for development and evaluation of embodied visual object recognition strategies. The platform uses a stereoscopic peripheral-foveal camera system and a fast pan-tilt unit to perform saliency-based visual search. This is combined with a classification framework based on the bag-of-features paradigm with the aim of targeting, classifying and recognising objects. Interaction with the system is done via typed commands and speech synthesis. We also report the current classification performance of the system.
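
    As a rough sketch of the bag-of-features part of such a platform: cluster local descriptors into a visual vocabulary, then describe each view as a normalised word histogram. The descriptor source and vocabulary size here are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    train_desc = rng.random((5000, 64))   # stand-in local descriptors
    vocab = KMeans(n_clusters=50, n_init=10, random_state=0).fit(train_desc)

    def bof_histogram(descriptors, vocab):
        """Quantise one view's descriptors against the vocabulary and
        return a normalised visual-word histogram."""
        words = vocab.predict(descriptors)
        hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
        return hist / hist.sum()

    view_desc = rng.random((300, 64))     # descriptors from one camera view
    print(bof_histogram(view_desc, vocab)[:10])
    ```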

  • 90.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Embodied Object Recognition using Adaptive Target Observations, 2010. In: Cognitive Computation, ISSN 1866-9956, E-ISSN 1866-9964, Vol. 2, no 4, p. 316-325. Article in journal (Refereed)
    Abstract [en]

    In this paper, we study object recognition in the embodied setting. More specifically, we study the problem of whether the recognition system will benefit from acquiring another observation of the object under study, or whether it is time to give up, and report the observed object as unknown. We describe the hardware and software of a system that implements recognition and object permanence as two nested perception-action cycles. We have collected three data sets of observation sequences that allow us to perform controlled evaluation of the system behavior. Our recognition system uses a KNN classifier with bag-of-features prototypes. For this classifier, we have designed and compared three different uncertainty measures for target observation. These measures allow the system to (a) decide whether to continue to observe an object or to move on, and to (b) decide whether the observed object is previously seen or novel. The system is able to successfully reject all novel objects as “unknown”, while still recognizing most of the previously seen objects.
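
    The paper's three uncertainty measures are not detailed in the abstract; the sketch below shows one generic possibility, a nearest/second-nearest distance ratio driving the three-way decision (report a label, observe again, or reject as unknown). Both thresholds are assumed values.

    ```python
    import numpy as np

    def knn_decision(query, prototypes, labels, d_unknown=1.2, r_ambiguous=0.8):
        """Toy decision rule: distances to bag-of-features prototypes decide
        between a label, another observation, or 'unknown'. The ratio
        measure and thresholds are illustrative, not the paper's."""
        d = np.linalg.norm(prototypes - query, axis=1)
        order = np.argsort(d)
        best, second = d[order[0]], d[order[1]]
        if best > d_unknown:                 # far from everything seen before
            return "unknown", None
        if best / second > r_ambiguous:      # two classes nearly tied
            return "observe_again", None
        return "label", labels[order[0]]

    protos = np.random.rand(10, 128)         # stand-in prototype histograms
    print(knn_decision(np.random.rand(128), protos, list(range(10))))
    ```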

  • 91.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Improving Random Forests by Correlation-Enhancing Projections and Sample-Based Sparse Discriminant Selection, 2016. In: Proceedings of the 13th Conference on Computer and Robot Vision, CRV 2016, Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 222-227. Conference paper (Refereed)
    Abstract [en]

    Random Forests (RF) is a learning technique with very low run-time complexity. It has found a niche application in situations where input data is low-dimensional and computational performance is paramount. We wish to make RFs more useful for high-dimensional problems, and to this end we propose two extensions to RFs: firstly, a feature selection mechanism called correlation-enhancing projections, and secondly, sparse discriminant selection schemes for better accuracy and faster training. We evaluate the proposed extensions by performing age and gender estimation on the MORPH-II dataset, and demonstrate near-equal or improved estimation performance when using these extensions, despite a seventy-fold reduction in the number of data dimensions.
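
    The correlation-enhancing projections and sparse discriminant selection are specific to the paper; as a generic illustration of the reduce-dimensions-then-forest pattern they extend, here is a scikit-learn pipeline with an ordinary discriminant projection as a stand-in.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    # High-dimensional toy data standing in for e.g. face features.
    X, y = make_classification(n_samples=2000, n_features=700,
                               n_informative=50, n_classes=2, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Project to few dimensions before growing the forest; LDA is a
    # generic stand-in, not the paper's correlation-enhancing projection.
    clf = make_pipeline(LinearDiscriminantAnalysis(n_components=1),
                        RandomForestClassifier(n_estimators=100, random_state=0))
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))
    ```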

  • 92.
    Wallenberg, Marcus
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Teaching Stereo Perception to YOUR Robot, 2012. In: Proceedings of the 23rd British Machine Vision Conference, University of Surrey, UK, 2012, p. 1-12. Conference paper (Refereed)
    Abstract [en]

    This paper describes a method for generation of dense stereo ground-truth using a consumer depth sensor such as the Microsoft Kinect. Such ground-truth allows adaptation of stereo algorithms to a specific setting. The method uses a novel residual weighting based on error propagation from image plane measurements to 3D. We use this ground-truth in wide-angle stereo learning by automatically tuning a novel extension of the best-first-propagation (BFP) dense correspondence algorithm. We extend BFP by adding a coarse-to-fine scheme, and a structure measure that limits propagation along linear structures and flat areas. The tuned correspondence algorithm is evaluated in terms of accuracy, robustness, and ability to generalise. Both the tuning cost function, and the evaluation are designed to balance the accuracy-robustness trade-off inherent in patch-based methods such as BFP.
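
    The residual weighting in the paper propagates image-plane measurement errors to 3D; the details are not in the abstract. The sketch below shows only the standard ingredient such weighting builds on: for triangulation sensors like the Kinect, depth noise grows roughly quadratically with depth, so inverse-variance weights fall off sharply with distance. The constant k is an illustrative value.

    ```python
    import numpy as np

    def residual_weights(depth_m, k=2.8e-5):
        """Inverse-variance weights for 3D residuals under the common
        quadratic depth-noise model sigma_Z ~= k * Z**2 (k assumed)."""
        sigma = k * depth_m ** 2
        return 1.0 / sigma ** 2

    depths = np.array([0.8, 2.0, 4.0])   # metres
    w = residual_weights(depths)
    print(w / w.max())   # residuals at 4 m count far less than at 0.8 m
    ```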
