Discriminant analysis in small and large dimensions
Department of Mathematics, Stockholm University. ORCID iD: 0000-0001-7855-8221
School of Business, Örebro University. ORCID iD: 0000-0002-1395-9427
Department of Mathematics, University of Dar es Salaam, Tanzania.
Delft Institute of Applied Mathematics, Delft University of Technology, The Netherlands.
2019 (English). In: Theory of Probability and Mathematical Statistics, ISSN 1547-7363, Vol. 100, p. 24-42. Article in journal (Refereed). Published.
Abstract [en]

We study the distributional properties of the linear discriminant function under the assumption of normality, comparing two groups with the same covariance matrix but different mean vectors. A stochastic representation of the discriminant function coefficients is derived and then used to obtain their asymptotic distribution under the high-dimensional asymptotic regime. We investigate the performance of classification based on the discriminant function in both small and large dimensions. A stochastic representation is established that allows the error rate to be computed efficiently. We further compare the calculated error rate with the optimal one obtained under the assumption that the covariance matrix and the two mean vectors are known. Finally, we present an analytical expression for the error rate in the high-dimensional asymptotic regime. The finite-sample properties of the derived theoretical results are assessed via an extensive Monte Carlo study.
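The comparison between the plug-in error rate and the optimal one can be illustrated with a short Monte Carlo sketch. This is not the paper's stochastic representation; the dimension, sample sizes, mean separation and identity covariance below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
p, n1, n2, reps = 5, 50, 50, 2000

# Two normal populations with a common covariance matrix (identity here)
# and mean separation along the first coordinate.
mu1, mu2 = np.zeros(p), np.zeros(p)
mu2[0] = 1.5
Sigma = np.eye(p)

# Optimal (Bayes) error when all parameters are known:
# Phi(-Delta/2), where Delta^2 = (mu1-mu2)' Sigma^{-1} (mu1-mu2).
Delta = np.sqrt((mu1 - mu2) @ np.linalg.solve(Sigma, mu1 - mu2))
optimal_error = norm.cdf(-Delta / 2)

errors = []
for _ in range(reps):
    X1 = rng.multivariate_normal(mu1, Sigma, n1)
    X2 = rng.multivariate_normal(mu2, Sigma, n2)
    xb1, xb2 = X1.mean(0), X2.mean(0)
    # Pooled sample covariance matrix
    S = ((n1 - 1) * np.cov(X1.T) + (n2 - 1) * np.cov(X2.T)) / (n1 + n2 - 2)
    a = np.linalg.solve(S, xb1 - xb2)          # discriminant coefficients
    c = a @ (xb1 + xb2) / 2                    # classification threshold
    # Conditional error rate for a new observation from population 1:
    # P(a'x < c | x ~ N(mu1, Sigma)) = Phi((c - a'mu1) / sqrt(a' Sigma a))
    errors.append(norm.cdf((c - a @ mu1) / np.sqrt(a @ Sigma @ a)))

print(f"optimal error        : {optimal_error:.4f}")
print(f"mean plug-in error   : {np.mean(errors):.4f}")
```

The plug-in rule pays a price for estimating the mean vectors and the covariance matrix, so its average conditional error sits above the optimal one.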

Place, publisher, year, edition, pages
Providence, Rhode Island: American Mathematical Society (AMS), 2019. Vol. 100, p. 24-42
Keywords [en]
Discriminant function, Stochastic representation, Large-dimensional asymptotics, Random matrix theory, Classification analysis
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-165557
OAI: oai:DiVA.org:liu-165557
DiVA, id: diva2:1428723
Available from: 2020-05-06 Created: 2020-05-06 Last updated: 2020-05-06 Bibliographically approved
In thesis
1. Contributions to linear discriminant analysis with applications to growth curves
2020 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis concerns contributions to linear discriminant analysis with applications to growth curves.

Firstly, we present the linear discriminant function coefficients in a stochastic representation using random variables from standard univariate distributions. We apply the characterized distribution in the classification function to approximate the classification error rate. The results are then extended to large-dimensional asymptotics under the assumption that the dimension p of the parameter space increases together with the sample size n to infinity, such that the ratio p/n converges to a positive constant c ∈ (0, 1).
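The effect of the regime p/n → c ∈ (0, 1) can be sketched by letting the dimension grow proportionally to the sample size while the Mahalanobis distance stays fixed. This is a minimal simulation, not the thesis's derivation; the chosen distance, sample size and identity covariance are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def plug_in_error(p, n, delta=2.0, reps=200):
    """Average conditional misclassification rate of the plug-in linear
    discriminant with p variables and n observations per group."""
    mu1, mu2 = np.zeros(p), np.zeros(p)
    mu1[0] = delta                       # fixed Mahalanobis distance delta
    errs = []
    for _ in range(reps):
        X1 = rng.standard_normal((n, p)) + mu1   # Sigma = identity
        X2 = rng.standard_normal((n, p)) + mu2
        xb1, xb2 = X1.mean(0), X2.mean(0)
        S = ((n - 1) * np.cov(X1.T) + (n - 1) * np.cov(X2.T)) / (2 * n - 2)
        a = np.linalg.solve(S, xb1 - xb2)
        c = a @ (xb1 + xb2) / 2
        # With Sigma = I, a' Sigma a = |a|^2
        errs.append(norm.cdf((c - a @ mu1) / np.linalg.norm(a)))
    return np.mean(errs)

# Dimension grows proportionally to the pooled sample size: p = c * 2n
results = {}
for ratio in (0.1, 0.5, 0.9):
    n = 100
    p = int(ratio * 2 * n)
    results[ratio] = plug_in_error(p, n)
    print(f"p/n ratio {ratio:.1f}: average error {results[ratio]:.3f}")
```

Even though the population separation is constant, the error rate of the estimated rule deteriorates as the ratio approaches one, which is exactly the regime the large-dimensional asymptotics describe.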

Secondly, the thesis treats repeated measures data, which correspond to multiple measurements taken on the same subject at different time points. We develop a linear classification function to classify an individual into one of two populations on the basis of repeated measures data when the means follow a growth curve structure. The growth curve structure we first consider assumes that all treatments (groups) follow the same growth profile. However, this is not necessarily true in general, so the problem is extended to linear classification where the means follow an extended growth curve structure, i.e., the treatments under the experimental design follow different growth profiles.
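The idea of classifying through a growth curve structure can be sketched as follows. This is not the thesis's estimator: the linear time profile, the known covariance matrix proportional to the identity, and the specific coefficients are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 6)                  # 6 repeated measurements per subject
B = np.vstack([np.ones_like(t), t]).T     # design for a linear growth profile
beta1 = np.array([0.0, 1.0])              # hypothetical group-1 profile
beta2 = np.array([0.5, 2.0])              # hypothetical group-2 profile
Sigma = 0.3 * np.eye(len(t))              # assumed known within-subject covariance

n = 40
X1 = rng.multivariate_normal(B @ beta1, Sigma, n)
X2 = rng.multivariate_normal(B @ beta2, Sigma, n)

# Estimate group mean curves through the growth curve structure by
# projecting the sample means onto the column space of B. Sigma is
# proportional to the identity here, so ordinary least squares coincides
# with generalized least squares.
b1, *_ = np.linalg.lstsq(B, X1.mean(0), rcond=None)
b2, *_ = np.linalg.lstsq(B, X2.mean(0), rcond=None)
m1, m2 = B @ b1, B @ b2

# Linear classification rule built on the structured mean curves
a = np.linalg.solve(Sigma, m1 - m2)

def classify(x):
    """Assign a vector of repeated measurements to group 1 or 2."""
    return 1 if a @ (x - (m1 + m2) / 2) > 0 else 2

x_new = rng.multivariate_normal(B @ beta1, Sigma)
print("new curve assigned to group", classify(x_new))
```

Using the structured means rather than the raw sample means exploits the temporal information in the repeated measurements: only two coefficients per group are estimated instead of six mean components.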

Lastly, a function of the inverse Wishart matrix and a normal distribution finds its application in portfolio theory, where the vector of optimal portfolio weights is proportional to the product of the inverse sample covariance matrix and the sample mean vector. Analytical expressions for higher-order and non-central moments of the portfolio weights are derived when the returns are assumed to be independently multivariate normally distributed. Moreover, expressions for the mean, variance, skewness and kurtosis of specific estimated weights are obtained. The results are complemented by a Monte Carlo simulation study, where data from the multivariate normal and t-distributions are discussed.
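The first moment of the product of the inverse sample covariance matrix and the sample mean vector can be checked by simulation, using the classical fact that for normal data the sample mean and covariance are independent and E[S⁻¹] = (n−1)/(n−p−2) Σ⁻¹. The mean returns and covariance below are hypothetical values, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, reps = 4, 60, 20000

mu = np.array([0.01, 0.02, 0.015, 0.005])        # hypothetical mean returns
Sigma = 0.02 * np.eye(p) + 0.005 * np.ones((p, p))

w_hat = np.empty((reps, p))
for i in range(reps):
    X = rng.multivariate_normal(mu, Sigma, n)     # i.i.d. normal returns
    S = np.cov(X.T)                               # unbiased, n-1 d.o.f.
    w_hat[i] = np.linalg.solve(S, X.mean(0))      # S^{-1} x-bar

# Exact first moment: x-bar and S are independent under normality and
# E[S^{-1}] = (n-1)/(n-p-2) * Sigma^{-1}, hence
# E[S^{-1} x-bar] = (n-1)/(n-p-2) * Sigma^{-1} mu.
theory = (n - 1) / (n - p - 2) * np.linalg.solve(Sigma, mu)

print("Monte Carlo mean :", np.round(w_hat.mean(0), 3))
print("theoretical mean :", np.round(theory, 3))
```

The Monte Carlo average matches the closed-form moment up to simulation noise, which is the kind of finite-sample check the simulation study performs for the higher-order moments as well.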

Abstract [sv]

This thesis studies discriminant analysis, classification of growth curves and portfolio theory.

Discriminant analysis and classification are multivariate techniques used to separate different sets of objects and to assign new objects to already defined groups (so-called classes). A classical approach is to use Fisher's linear discriminant function, and when all parameters are known one can easily compute the probabilities of misclassification. Unfortunately, this is seldom the case; the parameters must be estimated from data, and Fisher's linear discriminant function then becomes a function of a Wishart matrix and multivariate normally distributed vectors. In this thesis we study how the probability of misclassification can be computed approximately, under the assumption that the dimension of the parameter space increases together with the number of observations, by using a particular stochastic representation of the discriminant function.

Repeated measurements over time on the same individual or object can be modelled with so-called growth curves. When classifying growth curves, or rather the repeated measurements of a new individual, one should make use of both the spatial and the temporal information contained in these observations. We extend Fisher's linear discriminant function to suit repeated measurements and compute asymptotic probabilities of misclassification.

Finally, one may note that closely related functions of Wishart matrices and multivariate normally distributed vectors appear when computing the optimal weights in portfolio theory. Through a stochastic representation we study the properties of the portfolio weights and, in addition, carry out a simulation study to understand what happens when the normality assumption is not fulfilled.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2020. p. 47
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2071
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-165558 (URN)
10.3384/diss.diva-165558 (DOI)
9789179298562 (ISBN)
Public defence
2020-06-08, Online through Zoom (to register: https://bit.ly/36fuupt) and Hopningspunkten, B Building, Campus Valla, Linköping, 15:15 (English)
Opponent
Supervisors
Available from: 2020-05-06 Created: 2020-05-06 Last updated: 2020-10-28 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Ngailo, Edward

Search in DiVA

By author/editor
Bodnar, Taras; Mazur, Stepan; Ngailo, Edward
Probability Theory and Statistics

Search outside of DiVA

Google
Google Scholar
