Consistent feature selection for pattern recognition in polynomial time
2007 (English)In: Journal of machine learning research, ISSN 1532-4435, E-ISSN 1533-7928, Vol. 8, 589-612 p.Article in journal (Refereed) Published
We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomial-time algorithms. We also prove that ALL-RELEVANT is much harder than MINIMAL-OPTIMAL and propose two consistent, polynomial-time algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.
Place, publisher, year, edition, pages
2007. Vol. 8, 589-612 p.
IdentifiersURN: urn:nbn:se:liu:diva-38405Local ID: 44191OAI: oai:DiVA.org:liu-38405DiVA: diva2:259254