On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers
2007 (English). In: Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings / [ed] Petra Perner, Springer Berlin/Heidelberg, 2007, 2-16 p. Chapter in book (Refereed)
Computational procedures using independence assumptions in various forms are popular in machine learning, although checks on empirical data have given inconclusive results about their impact. Some theoretical understanding of when they work is available, but a definite answer seems to be lacking. This paper derives distributions that maximize the statewise difference to the respective product of marginals. These distributions are, in a sense, the worst distributions for predicting an outcome of the data-generating mechanism by independence. We also restrict the scope of new theoretical results by showing explicitly that, depending on context, independent ('Naïve') classifiers can be as bad as tossing coins. Regardless of this, independence may beat the generating model in learning supervised classification, and we explicitly provide one such scenario.
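To make the phrase "statewise difference to the respective product of marginals" concrete, here is a minimal Python sketch. It assumes the quantity meant is the largest absolute gap, over all joint states, between a joint probability and the product of its one-dimensional marginals; the abstract does not spell out the definition, so this reading is an assumption. The function names `marginals` and `max_statewise_difference` are hypothetical and do not come from the paper.

```python
from itertools import product
from math import prod


def marginals(joint, n_vars):
    """Per-variable marginals of a joint pmf given as {state tuple: probability}."""
    margs = [dict() for _ in range(n_vars)]
    for state, p in joint.items():
        for i, value in enumerate(state):
            margs[i][value] = margs[i].get(value, 0.0) + p
    return margs


def max_statewise_difference(joint):
    """Largest |P(x) - prod_i P_i(x_i)| over all states x (assumed definition)."""
    n_vars = len(next(iter(joint)))
    margs = marginals(joint, n_vars)
    # Enumerate the full product space so that states with zero joint
    # probability are also compared against the independence approximation.
    supports = [sorted(m) for m in margs]
    worst = 0.0
    for state in product(*supports):
        independent = prod(margs[i][v] for i, v in enumerate(state))
        worst = max(worst, abs(joint.get(state, 0.0) - independent))
    return worst


# Two perfectly correlated fair bits versus two independent fair bits.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
gap_correlated = max_statewise_difference(correlated)
gap_independent = max_statewise_difference(independent)
```

For the two correlated bits every state has independence approximation 0.25 while the joint assigns 0.5 or 0, so the gap is 0.25; for the independent bits the gap is 0. Exhaustive enumeration is only practical for small state spaces.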
Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2007. 2-16 p.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online); 4571
Keywords
independence, classification, supervised learning, pattern recognition, prediction
Identifiers
URN: urn:nbn:se:liu:diva-38249
DOI: 10.1007/978-3-540-73499-4_2
ISI: 000248523200001
Local ID: 43265
ISBN: 978-3-540-73498-7
ISBN: e-978-3-540-73499-4
ISBN: 3-540-73498-8
OAI: oai:DiVA.org:liu-38249
DiVA: diva2:259098