Channel-Coded Feature Maps for Computer Vision and Machine Learning
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
2008 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel positions and image feature values. Typical useful features include local orientation and color. Using these features, each channel measures the co-occurrence of a certain orientation and color at a certain position in an image or image patch. Channel-coded feature maps can be seen as a generalization of the SIFT descriptor with the options of including more features and replacing the linear interpolation between bins by a more general basis function.
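The soft-histogram idea can be illustrated with a minimal sketch. It assumes a linear-interpolation (triangle) kernel, a single non-periodic scalar feature, and coordinates pre-scaled so that channel centers lie on the integers; the names `triangle` and `channel_coded_feature_map` are hypothetical and not taken from the thesis.

```python
def triangle(d):
    """Linear-interpolation kernel: 1 at d = 0, falling to 0 at |d| = 1."""
    return max(0.0, 1.0 - abs(d))


def channel_coded_feature_map(pixels, n_x, n_y, n_f, kernel=triangle):
    """Soft histogram over joint (x, y, feature) space.

    pixels: iterable of (x, y, f) tuples, each coordinate scaled so that
    the channel centers along that axis are the integers 0..n-1.
    Returns a nested list indexed as cfm[ix][iy][i_f].
    """
    cfm = [[[0.0] * n_f for _ in range(n_y)] for _ in range(n_x)]
    for x, y, f in pixels:
        # Per-axis soft bin weights for this pixel.
        wx = [kernel(x - i) for i in range(n_x)]
        wy = [kernel(y - j) for j in range(n_y)]
        wf = [kernel(f - k) for k in range(n_f)]
        # Each channel accumulates the product of the three weights.
        for i in range(n_x):
            for j in range(n_y):
                for k in range(n_f):
                    cfm[i][j][k] += wx[i] * wy[j] * wf[k]
    return cfm


# Three illustrative pixels: (x, y, feature value).
pixels = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (1.0, 1.0, 2.5)]
cfm = channel_coded_feature_map(pixels, n_x=2, n_y=2, n_f=4)
total = sum(v for plane in cfm for row in plane for v in row)
```

Passing a different `kernel` swaps the interpolation scheme, which mirrors the generalization described above. With the triangle kernel, every in-range pixel contributes a total weight of one, spread over neighboring channels.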

The general idea of channel coding originates from a model of how information might be represented in the human brain. For example, different neurons tend to be sensitive to different orientations of local structures in the visual input. The sensitivity profiles tend to be smooth such that one neuron is maximally activated by a certain orientation, with a gradually decaying activity as the input is rotated.

This thesis extends previous work on using channel-coding ideas within computer vision and machine learning. By differentiating the channel-coded feature maps with respect to transformations of the underlying image, a method for image registration and tracking is constructed. By using piecewise polynomial basis functions, the channel coding can be computed more efficiently, and a general encoding method for N-dimensional feature spaces is presented.
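As a rough illustration of a piecewise polynomial basis function, the sketch below encodes a scalar with a quadratic B-spline kernel and also evaluates the derivative of each channel with respect to the encoded value, the basic ingredient needed when a channel vector is differentiated by the chain rule. The choice of the quadratic B-spline, the integer channel spacing, and all function names are assumptions made for this example, not details confirmed by the abstract.

```python
def bspline2(d):
    """Quadratic B-spline: piecewise polynomial, nonzero for |d| < 1.5."""
    a = abs(d)
    if a <= 0.5:
        return 0.75 - a * a
    if a < 1.5:
        return 0.5 * (1.5 - a) ** 2
    return 0.0


def bspline2_deriv(d):
    """Derivative of bspline2 with respect to d."""
    a = abs(d)
    if a <= 0.5:
        return -2.0 * d
    if a < 1.5:
        return -(1.5 - a) if d > 0 else (1.5 - a)
    return 0.0


def encode(x, n_channels):
    """Channel vector of x, with channel centers at 0..n_channels-1."""
    return [bspline2(x - n) for n in range(n_channels)]


def encode_deriv(x, n_channels):
    """Derivative of every channel value with respect to x."""
    return [bspline2_deriv(x - n) for n in range(n_channels)]


x = 2.3
c = encode(x, 6)
dc = encode_deriv(x, 6)
# For x away from the border channels, the values sum to one and the
# center-weighted sum recovers x.
decoded = sum(n * cn for n, cn in enumerate(c))
```

Only three channels are nonzero for any input, so the encoding cost does not grow with the number of channels once the active ones have been located.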

Furthermore, I argue for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose. The optimization of position, rotation and scale of the object in the image plane is then included in the optimization problem, leading to a simultaneous tracking and pose estimation algorithm. Apart from objects and poses, the thesis examines the use of channel coding in connection with Bayesian networks. The goal here is to avoid the hard discretizations usually required when Markov random fields are used on intrinsically continuous signals like depth for stereo vision or color values in image restoration.

Channel coding has previously been used to design machine learning algorithms that are robust to outliers, ambiguities, and discontinuities in the training data. This robustness is obtained by finding a linear mapping between channel-coded input and output values. This thesis extends the method with an incremental version and identifies and analyzes one of its key features: it can handle a learning situation where the correspondence structure between the input and output space is not completely known. In contrast to a traditional supervised learning setting, the training examples are groups of unordered input-output points, where the correspondence structure within each group is unknown. This behavior is studied theoretically, and the effect of outliers and the convergence properties are analyzed.
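A toy sketch can show why a linear map between channel vectors tolerates outliers. It is a deliberately simplified stand-in: the map is built by summing outer products of encoded outputs and inputs, which is an assumption made here for brevity and not the optimization used in the thesis. The quadratic B-spline kernel, the local decoding window, and all names are likewise hypothetical.

```python
def bspline2(d):
    """Quadratic B-spline kernel, nonzero for |d| < 1.5."""
    a = abs(d)
    if a <= 0.5:
        return 0.75 - a * a
    if a < 1.5:
        return 0.5 * (1.5 - a) ** 2
    return 0.0


def encode(v, n):
    """Channel vector of v, with channel centers at 0..n-1."""
    return [bspline2(v - k) for k in range(n)]


def train(pairs, n_in, n_out):
    """Accumulate C[j][i] = sum over samples of u_j(y) * a_i(x)."""
    C = [[0.0] * n_in for _ in range(n_out)]
    for x, y in pairs:
        a, u = encode(x, n_in), encode(y, n_out)
        for j in range(n_out):
            for i in range(n_in):
                C[j][i] += u[j] * a[i]
    return C


def predict(C, x, n_in):
    """Apply the linear map, then decode the strongest output mode."""
    a = encode(x, n_in)
    u = [sum(cji * ai for cji, ai in zip(row, a)) for row in C]
    peak = max(range(len(u)), key=u.__getitem__)
    # Local weighted mean over the peak channel and its neighbors.
    window = [k for k in (peak - 1, peak, peak + 1) if 0 <= k < len(u)]
    mass = sum(u[k] for k in window)
    return sum(k * u[k] for k in window) / mass


# Illustrative data: y = x + 1, plus one outlier (2.0, 6.0).
pairs = [(1.0, 2.0), (2.0, 3.0), (2.0, 3.0), (2.0, 6.0), (3.0, 4.0)]
C = train(pairs, n_in=5, n_out=8)
y_hat = predict(C, 2.0, n_in=5)
```

In this toy data the outlier forms a separate, weaker mode in the output channel vector, so the decoded value stays at the inlier output of 3 instead of moving to 4, the plain average of the three outputs observed at x = 2.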

All presented methods have been evaluated experimentally. The work has been conducted within the cognitive systems research project COSPAL, funded by EC FP6, and much of the content has been put to use in the final COSPAL demonstrator system.

Place, publisher, year, edition, pages
Institutionen för systemteknik, 2008. 155 p.
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1160
Keyword [en]
computer vision, machine learning, object recognition, pose estimation
National Category
Computer Vision and Robotics (Autonomous Systems)
URN: urn:nbn:se:liu:diva-11040
ISBN: 978-91-7393-988-1
OAI: diva2:17496
Public defence
2008-03-28, Glashuset, Hus B, Campus Valla, Linköpings Universitet, Linköping, 13:15 (English)
Available from: 2008-02-20 Created: 2008-02-20 Last updated: 2016-05-04

Author
Jonsson, Erik