liu.seSearch for publications in DiVA
Endre søk
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Algorithms in data mining using matrix and tensor methods
Linköpings universitet, Matematiska institutionen, Beräkningsvetenskap. Linköpings universitet, Tekniska högskolan.ORCID-id: 0000-0002-1542-2690
2008 (engelsk)Doktoravhandling, med artikler (Annet vitenskapelig)
Abstract [en]

In many fields of science, engineering, and economics large amounts of data are stored and there is a need to analyze these data in order to extract information for various purposes. Data mining is a general concept involving different tools for performing this kind of analysis. The development of mathematical models and efficient algorithms is of key importance. In this thesis we discuss algorithms for the reduced rank regression problem and algorithms for the computation of the best multilinear rank approximation of tensors.

The first two papers deal with the reduced rank regression problem, which is encountered in the field of state-space subspace system identification. More specifically the problem is

\[

\min_{\rank(X) = k} \det (B - X A)(B - X A)\tp,

\]

where $A$ and $B$ are given matrices and we want to find $X$ under a certain rank condition that minimizes the determinant. This problem is not properly stated since it involves implicit assumptions on $A$ and $B$ so that $(B - X A)(B - X A)\tp$ is never singular. This deficiency of the determinant criterion is fixed by generalizing the minimization criterion to rank reduction and volume minimization of the objective matrix. The volume of a matrix is defined as the product of its nonzero singular values. We give an algorithm that solves the generalized problem and identify properties of the input and output signals causing a singular objective matrix.

Classification problems occur in many applications. The task is to determine the label or class of an unknown object. The third paper concerns with classification of handwritten digits in the context of tensors or multidimensional data arrays. Tensor and multilinear algebra is an area that attracts more and more attention because of the multidimensional structure of the collected data in various applications. Two classification algorithms are given based on the higher order singular value decomposition (HOSVD). The main algorithm makes a data reduction using HOSVD of 98--99 \% prior the construction of the class models. The models are computed as a set of orthonormal bases spanning the dominant subspaces for the different classes. An unknown digit is expressed as a linear combination of the basis vectors. The resulting algorithm achieves 5\% in classification error with fairly low amount of computations.

The remaining two papers discuss computational methods for the best multilinear

rank approximation problem

\[

\min_{\cB} \| \cA - \cB\|

\]

where $\cA$ is a given tensor and we seek the best low multilinear rank approximation tensor $\cB$. This is a generalization of the best low rank matrix approximation problem. It is well known that for matrices the solution is given by truncating the singular values in the singular value decomposition (SVD) of the matrix. But for tensors in general the truncated HOSVD does not give an optimal approximation. For example, a third order tensor $\cB \in \RR^{I \x J \x K}$ with rank$(\cB) = (r_1,r_2,r_3)$ can be written as the product

\[

\cB = \tml{X,Y,Z}{\cC}, \qquad b_{ijk}=\sum_{\lambda,\mu,\nu}

x_{i\lambda} y_{j\mu} z_{k\nu} c_{\lambda\mu\nu},

\]

where $\cC \in \RR^{r_1 \x r_2 \x r_3}$ and $X \in \RR^{I \times r_1}$, $Y \in \RR^{J \times r_2}$, and $Z \in \RR^{K \times r_3}$ are matrices of full column rank. Since it is no restriction to assume that $X$, $Y$, and $Z$ have orthonormal columns and due to these constraints, the approximation problem can be considered as a nonlinear optimization problem defined on a product of Grassmann manifolds. We introduce novel techniques for multilinear algebraic manipulations enabling means for theoretical analysis and algorithmic implementation. These techniques are used to solve the approximation problem using Newton and Quasi-Newton methods specifically adapted to operate on products of Grassmann manifolds. The presented algorithms are suited for small, large and sparse problems and, when applied on difficult problems, they clearly outperform alternating least squares methods, which are standard in the field.

sted, utgiver, år, opplag, sider
Matematiska institutionen , 2008. , s. 29
Serie
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1178
Emneord [en]
Volume, Minimization criterion, Determinant, Rank deficient matrix, Reduced rank regression, System identification, Rank reduction, Volume minimization, General algorithm, Handwritten digit classification, Tensors, Higher order singular value decomposition, Tensor approximation, Least squares, Tucker model, Multilinear algebra, Notation, Contraction, Tensor matricization, Newton's method, Grassmann manifolds, Product manifolds, Quasi-Newton algorithms, BFGS and L-BFGS, Symmetric tensor approximation, Local/intrinsic coordinates, Global/embedded coordinates;
HSV kategori
Identifikatorer
URN: urn:nbn:se:liu:diva-11597ISBN: 978-91-7393-907-2 (tryckt)OAI: oai:DiVA.org:liu-11597DiVA, id: diva2:18011
Disputas
2008-05-27, Glashuset, B-huset, ing. 25, Campus Valla, Linköpings universitet, Linköping, 10:15 (engelsk)
Opponent
Veileder
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2013-10-11
Delarbeid
1. Dimensionality reduction and volume minimization - generalization of the determinant minimization criterion for reduced rank regression problems
Åpne denne publikasjonen i ny fane eller vindu >>Dimensionality reduction and volume minimization - generalization of the determinant minimization criterion for reduced rank regression problems
2006 (engelsk)Inngår i: Linear Algebra and its Applications, ISSN 0024-3795, Vol. 418, nr 1, s. 201-214Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

In this article we propose a generalization of the determinant minimization criterion. The problem of minimizing the determinant of a matrix expression has implicit assumptions that the objective matrix is always nonsingular. In case of singular objective matrix the determinant would be zero and the minimization problem would be meaningless. To be able to handle all possible cases we generalize the determinant criterion to rank reduction and volume minimization of the objective matrix. The generalized minimization criterion is used to solve the following ordinary reduced rank regression problem:

minrank(X)=kdet(B-XA)(B-XA)T,

where A and B are known and X is to be determined. This problem is often encountered in the system identification context.

Emneord
Volume; Minimization criterion; Determinant; Rank deficient matrix
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-13190 (URN)10.1016/j.laa.2006.01.032 (DOI)
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2013-11-06
2. Rank Reduction and Volume Minimization Approach to State-Space Subspace System Identification
Åpne denne publikasjonen i ny fane eller vindu >>Rank Reduction and Volume Minimization Approach to State-Space Subspace System Identification
2006 (engelsk)Inngår i: Signal Processing, ISSN 0165-1684, E-ISSN 1872-7557, Vol. 86, nr 11, s. 3275-3285Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

In this paper we consider the reduced rank regression problem

solved by maximum-likelihood-inspired state-space subspace system identification algorithms. We conclude that the determinant criterion is, due to potential rank-deficiencies, not general enough to handle all problem instances. The main part of the paper analyzes the structure of the reduced rank minimization problem and identifies signal properties in terms of geometrical concepts. A more general minimization criterion is considered, rank reduction followed by volume minimization. A numerically sound algorithm for minimizing this criterion is presented and validated on both simulated and experimental data.

sted, utgiver, år, opplag, sider
Elsevier, 2006
Emneord
Reduced rank regression, System identification, General algorithm, Determinant minimization criterion, Rank reduction, Volume minimization
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-13191 (URN)10.1016/j.sigpro.2006.01.008 (DOI)
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2017-12-13
3. Handwritten digit classification using higher order singular value decomposition
Åpne denne publikasjonen i ny fane eller vindu >>Handwritten digit classification using higher order singular value decomposition
2007 (engelsk)Inngår i: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 40, nr 3, s. 993-1003Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

In this paper we present two algorithms for handwritten digit classification based on the higher order singular value decomposition (HOSVD). The first algorithm uses HOSVD for construction of the class models and achieves classification results with error rate lower than 6%. The second algorithm uses the HOSVD for tensor approximation simultaneously in two modes. Classification results for the second algorithm are almost down at 5% even though the approximation reduces the original training data with more than 98% before the construction of the class models. The actual classification in the test phase for both algorithms is conducted by solving a series least squares problems. Considering computational amount for the test presented the second algorithm is twice as efficient as the first one.

Emneord
Handwritten digit classification, Tensors, Higher order singular value decomposition, Tensor approximation, Least squares
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-13192 (URN)10.1016/j.patcog.2006.08.004 (DOI)
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2017-12-13
4. A Newton-Grassmann method for computing the best multilinear rank-(r1,r2,r3) approximation of a tensor
Åpne denne publikasjonen i ny fane eller vindu >>A Newton-Grassmann method for computing the best multilinear rank-(r1,r2,r3) approximation of a tensor
2009 (engelsk)Inngår i: SIAM Journal on Matrix Analysis and Applications, ISSN 0895-4798, Vol. 32, nr 2, s. 248-271Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

We derive a Newton method for computing the best rank-$(r_1,r_2,r_3)$ approximation of a given $J\times K\times L$ tensor $\mathcal{A}$. The problem is formulated as an approximation problem on a product of Grassmann manifolds. Incorporating the manifold structure into Newton's method ensures that all iterates generated by the algorithm are points on the Grassmann manifolds. We also introduce a consistent notation for matricizing a tensor, for contracted tensor products and some tensor-algebraic manipulations, which simplify the derivation of the Newton equations and enable straightforward algorithmic implementation. Experiments show a quadratic convergence rate for the Newton–Grassmann algorithm.

Emneord
tensor, multilinear, rank, approximation, Grassmann manifold, Newton
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-13193 (URN)10.1137/070688316 (DOI)
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2013-10-11
5. Best multilinear rank approximation of tensors with quasi-Newton methods on Grassmannians
Åpne denne publikasjonen i ny fane eller vindu >>Best multilinear rank approximation of tensors with quasi-Newton methods on Grassmannians
Manuskript (Annet vitenskapelig)
Identifikatorer
urn:nbn:se:liu:diva-13194 (URN)
Tilgjengelig fra: 2008-04-29 Laget: 2008-04-29 Sist oppdatert: 2010-01-13

Open Access i DiVA

omslag(112 kB)81 nedlastinger
Filinformasjon
Fil COVER01.pdfFilstørrelse 112 kBChecksum MD5
f4aa6a895e856f523ef38f2a259f8a5b97c65d0fc9b70b6ad0df0d1f486a13170cda6ebc
Type coverMimetype application/pdf
fulltekst(792 kB)7827 nedlastinger
Filinformasjon
Fil FULLTEXT01.pdfFilstørrelse 792 kBChecksum MD5
5d6653ef303c764a049c03f9c0c3ea4e9187543fc158260cf28d3c5e299f04c5da9fa06f
Type fulltextMimetype application/pdf

Personposter BETA

Savas, Berkant

Søk i DiVA

Av forfatter/redaktør
Savas, Berkant
Av organisasjonen

Søk utenfor DiVA

GoogleGoogle Scholar
Totalt: 7828 nedlastinger
Antall nedlastinger er summen av alle nedlastinger av alle fulltekster. Det kan for eksempel være tidligere versjoner som er ikke lenger tilgjengelige

isbn
urn-nbn

Altmetric

isbn
urn-nbn
Totalt: 11547 treff
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf