Detection of Sparse and Weak Effects in High-Dimensional Feature Space, with an Application to Microbiome Data Analysis
Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden.
Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
The Centre for Translational Microbiome Research (CTMR), Department of Microbiology, Tumor, and Cell Biology, Karolinska Institutet, Stockholm, Sweden.
The Centre for Translational Microbiome Research (CTMR), Department of Microbiology, Tumor, and Cell Biology, Karolinska Institutet, Stockholm, Sweden.
2020 (English). In: Recent Developments in Multivariate and Random Matrix Analysis: Festschrift in Honour of Dietrich von Rosen / [ed] Holgersson, Thomas; Singull, Martin. Cham: Springer International Publishing, 2020, p. 287-311. Chapter in book (Refereed)
Abstract [en]

We present a family of goodness-of-fit (GOF) test statistics specifically designed for the detection of sparse-weak mixtures, where only a small fraction of the observational units are contaminated, arising from a different distribution. The test statistics are constructed as sup-functionals of weighted empirical processes, where the weight functions employed are the Chibisov-O’Reilly functions of a Brownian bridge. The study recovers and extends a number of previously known results on sparse detection using a weighted GOF (wGOF) approach. In particular, the results demonstrate the advantage of our approach over a common approach that uses a family of regularly varying weight functions. We show that the Chibisov-O’Reilly family has important advantages over better-known approaches, as it allows for optimally adaptive, fully data-driven test procedures. The theory is further developed to demonstrate that the entire family is a flexible device that adapts to many situations in modern scientific practice where the number of observations stays fixed or grows very slowly, while the number of automatically measured features grows dramatically and only a small fraction of these features are useful. Numerical studies are performed to investigate the finite-sample properties of the theoretical results. We show that the Chibisov-O’Reilly family compares favorably to related test statistics over a broad range of sparsity and weakness regimes for the Gaussian and high-dimensional Dirichlet types of sparse mixture. Finally, an example using a human gut microbiome data set is presented to illustrate that the family of tests can be applied to real-life sparse signal detection problems where the sample size is small relative to the feature dimension.
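The general shape of a weighted sup-type GOF statistic described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's procedure: the weight used here, sqrt(u(1-u)), is the familiar standard-deviation-proportional weight and serves only as a stand-in, since the record does not give the exact form of the Chibisov-O’Reilly weights. The function names, the truncation fraction `alpha0`, and the mixture parameters `eps` and `mu` are all hypothetical choices made for illustration. The weight is passed in as a parameter so that another weight function can be substituted.

```python
import math
import random


def normal_sf(x):
    """Upper-tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def sd_weight(u):
    """Stand-in weight sqrt(u(1-u)); NOT the Chibisov-O'Reilly weight."""
    return math.sqrt(u * (1.0 - u))


def weighted_sup_statistic(p_values, weight=sd_weight, alpha0=0.5):
    """One-sided sup of sqrt(n) * (i/n - p_(i)) / weight(i/n) over i <= alpha0 * n.

    p_(i) is the i-th smallest p-value; alpha0 is an assumed truncation
    fraction that keeps the weight away from zero at the upper end.
    """
    n = len(p_values)
    p_sorted = sorted(p_values)
    best = -math.inf
    for i in range(1, int(alpha0 * n) + 1):
        u = i / n
        value = math.sqrt(n) * (u - p_sorted[i - 1]) / weight(u)
        best = max(best, value)
    return best


def simulate_p_values(n, eps, mu, rng):
    """One-sided p-values from the mixture (1 - eps) N(0, 1) + eps N(mu, 1)."""
    p_values = []
    for _ in range(n):
        shift = mu if rng.random() < eps else 0.0
        p_values.append(normal_sf(rng.gauss(shift, 1.0)))
    return p_values


# Illustrative settings (assumptions): 1% of units contaminated with shift 3.
rng = random.Random(0)
n = 10_000
null_stat = weighted_sup_statistic(simulate_p_values(n, 0.0, 0.0, rng))
alt_stat = weighted_sup_statistic(simulate_p_values(n, 0.01, 3.0, rng))
print(f"null: {null_stat:.2f}, contaminated: {alt_stat:.2f}")
```

Passing a different callable as `weight` changes how strongly the extreme order statistics are emphasized, which is the design degree of freedom that the abstract attributes to the choice of weight family.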

Place, publisher, year, edition, pages
Cham: Springer International Publishing, 2020. p. 287-311
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-188316
DOI: 10.1007/978-3-030-56773-6_17
Libris ID: s84s89nmqps0pz6p
ISBN: 9783030567736 (electronic)
OAI: oai:DiVA.org:liu-188316
DiVA, id: diva2:1700575
Available from: 2022-10-03 Created: 2022-10-03 Last updated: 2022-12-20 Bibliographically approved


Authority records

Tillander, Annika
