Discrimination and scoring using small sets of genes for two-sample microarray data
2007 (English). In: Mathematical Biosciences, ISSN 0025-5564, E-ISSN 1879-3134, Vol. 205, no 2, p. 195-203. Article in journal (Refereed), Published
Comparison of gene expression between two groups of individuals forms an important subclass of microarray experiments. We study multivariate procedures, in particular the use of Hotelling's T² for discrimination between the groups, with special emphasis on methods based on only a few genes. We apply the methods to data from an experiment in which a group of atopic dermatitis patients is compared with a control group. We also compare our methodology with other recently proposed methods on publicly available datasets. It is found that (i) using several genes gives much improved discrimination between the groups compared with using one gene only; (ii) the genes that play the most important role in the multivariate analysis are not necessarily those that rank first in univariate comparisons of the groups; (iii) Linear Discriminant Analysis carried out with sets of 2-5 genes selected according to their Hotelling T² gives results comparable to state-of-the-art methods using many more genes, a feature of our method that might be crucial in clinical applications. Finding groups of genes that together give optimal multivariate discrimination (given the size of the group) can identify crucial pathways and networks of genes responsible for a disease. The computer code that we developed for the computations is available as an R package.
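The scoring idea in the abstract can be illustrated with a minimal sketch. It is written in Python with NumPy, not taken from the authors' R package, and it assumes the standard pooled-covariance form of the two-sample Hotelling statistic together with an exhaustive search over all gene subsets of a given size. The function names `hotelling_t2` and `best_gene_sets` are hypothetical, and the selection procedure actually used in the paper may differ.

```python
from itertools import combinations

import numpy as np


def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 for arrays of shape (n_samples, n_genes).

    Assumes a common covariance matrix in both groups (pooled estimate).
    """
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # np.cov returns a 0-d array for a single gene, so force 2-D.
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # diff' * pooled^{-1} * diff, scaled by n1*n2/(n1+n2)
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled, diff))


def best_gene_sets(x, y, size=2, top=3):
    """Score every gene subset of the given size and return the best ones.

    Exhaustive search is an assumption of this sketch; it is only
    feasible for small subset sizes and a pre-filtered gene list.
    """
    scores = []
    for idx in combinations(range(x.shape[1]), size):
        cols = list(idx)
        scores.append((hotelling_t2(x[:, cols], y[:, cols]), idx))
    scores.sort(reverse=True)  # largest T^2 first
    return scores[:top]
```

Calling `best_gene_sets(patients, controls, size=2)` on two expression matrices with matching gene columns returns the highest-scoring pairs as `(score, gene_indices)` tuples. Comparing that ranking with the `size=1` ranking is one way to see finding (ii): a gene with a weak univariate score can still belong to the best pair when it is strongly correlated with its partner.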
Place, publisher, year, edition, pages
Elsevier, 2007. Vol. 205, no 2, 195-203 p.
Differential analysis; Expression data; Discrimination; Small sets of genes; Hotelling statistic; Curse of dimension; Computational methods; Software; R package; Eczema
Medical and Health Sciences
Identifiers
URN: urn:nbn:se:liu:diva-98565; DOI: 10.1016/j.mbs.2006.08.007; ISI: 000244157100003; PubMedID: 17087979; OAI: oai:DiVA.org:liu-98565; DiVA: diva2:654966