Data Mining: Naive Bayes (Overview)

on Wednesday 28 August 2013
A naive Bayes classifier (NBC) is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. A more descriptive term for the underlying probability model is "independent feature model".


In simple terms, an NBC assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature. For example, a fruit may be considered an apple if it is red, round, and about 4 inches in diameter. Even if these features depend on each other or on the presence of other features, an NBC assumes that all of them contribute independently to the probability that the fruit is an apple. Depending on the precise nature of the probability model, an NBC can be trained very efficiently in a supervised learning setting.
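To make the independence assumption concrete, here is a minimal sketch of the apple example. All probabilities, the second class ("orange"), and the names `priors`, `likelihoods`, and `posterior` are made up for illustration; the point is only that the per-feature probabilities are multiplied together as if the features were unrelated.

```python
from math import prod

# Hypothetical numbers, chosen only for illustration.
priors = {"apple": 0.5, "orange": 0.5}
likelihoods = {
    "apple": {"red": 0.8, "round": 0.9, "about_4_inches": 0.7},
    "orange": {"red": 0.1, "round": 0.9, "about_4_inches": 0.6},
}


def posterior(observed, priors, likelihoods):
    """Return P(class | observed features) under the naive independence assumption."""
    # Unnormalized score: class prior times the product of per-feature likelihoods.
    scores = {
        label: priors[label] * prod(likelihoods[label][f] for f in observed)
        for label in priors
    }
    # Normalize so the class probabilities sum to 1.
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


result = posterior(["red", "round", "about_4_inches"], priors, likelihoods)
print(result)
```

With these made-up numbers the "apple" class wins by a wide margin, mostly because "red" is far more likely for apples than for oranges.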

In many practical applications, parameter estimation for NBC models uses the method of maximum likelihood; in other words, one can work with the naive Bayes model without accepting Bayesian probability or using any other Bayesian methods. Despite their naive design and apparently oversimplified assumptions, NBCs have worked quite well in many complex real-world situations. In 2004, an analysis of the Bayesian classification problem showed that there are some theoretical reasons for the seemingly unreasonable efficacy of NBCs (Zhang, H., 2004).
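For categorical features, maximum likelihood estimation comes down to counting relative frequencies. The sketch below assumes a tiny hypothetical data set of (feature set, label) pairs; the function name `fit_mle` and the data are illustrative, not taken from any library.

```python
from collections import Counter, defaultdict


def fit_mle(samples):
    """Estimate class priors and P(feature present | class) by relative frequency."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)
    for features, label in samples:
        for f in features:
            feature_counts[label][f] += 1

    n = len(samples)
    # Prior: fraction of training samples that belong to each class.
    priors = {c: k / n for c, k in class_counts.items()}
    # Conditional: fraction of samples in a class that show the feature.
    cond = {
        c: {f: cnt / class_counts[c] for f, cnt in feature_counts[c].items()}
        for c in class_counts
    }
    return priors, cond


# Hypothetical training data.
data = [
    ({"red", "round"}, "apple"),
    ({"red", "round"}, "apple"),
    ({"round"}, "apple"),
    ({"round"}, "orange"),
]
priors, cond = fit_mle(data)
print(priors)
print(cond)
```

Note that under pure maximum likelihood a feature never seen with a class gets no entry at all, which corresponds to an estimated probability of zero.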

However, a comprehensive comparison with other classification methods in 2006 showed that Bayes classification is outperformed by more recent approaches, such as boosted trees or random forests (Caruana, R. & Niculescu-Mizil, A., 2006). An advantage of the NBC is that it requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification. Because the variables are assumed to be independent, only the variances of the variables for each class need to be determined, not the entire covariance matrix.
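For continuous features, one common choice is to model each feature within each class as a Gaussian, so that training only needs one mean and one variance per feature per class. The following is a rough sketch under that assumption; the function names, the second class ("plum"), and the measurements are all hypothetical, and the code assumes every feature has nonzero variance within each class.

```python
import math
from collections import defaultdict
from statistics import fmean, pvariance


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))  # one tuple of values per feature
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [fmean(col) for col in columns],
            # Population variance is the maximum likelihood estimate.
            # No covariance terms are needed because features are
            # treated as independent.
            "variances": [pvariance(col) for col in columns],
        }
    return model


def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian (var must be > 0)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def predict(model, row):
    """Return the class with the highest log posterior score."""
    def score(label):
        p = model[label]
        return math.log(p["prior"]) + sum(
            log_gaussian(x, m, v)
            for x, m, v in zip(row, p["means"], p["variances"])
        )
    return max(model, key=score)


# Hypothetical measurements: [diameter in inches, weight in grams].
X = [[4.0, 150.0], [4.2, 160.0], [3.8, 145.0],
     [2.9, 110.0], [3.1, 120.0], [3.0, 115.0]]
y = ["apple", "apple", "apple", "plum", "plum", "plum"]

model = fit_gaussian_nb(X, y)
print(predict(model, [4.1, 155.0]))
```

With d features and k classes this stores only 2·d·k numbers plus k priors, whereas a full covariance matrix per class would need on the order of d² values per class.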
