Not-so-naive Bayes. Improved Bayesian classifier.

September 15, 2022

2

Enhance the easy Bayesian classifier by releasing its naive assumption

Regardless of being quite simple, naive Bayes classifiers are likely to work decently in some real-world functions, famously doc classification or spam filtering. They don’t want a lot coaching knowledge and are very quick. In consequence, they’re typically adopted as easy baselines for classification duties. What many don’t know is that we will make them a lot much less naive by utilizing a easy trick.

Naive Bayes is a straightforward probabilistic algorithm that makes use of the Bayes’ Theorem, therefore the identify. Bayes’ Theorem is a straightforward mathematical rule that tells us the way to come from P(B|A) to P(A|B). If we all know the chance of one thing given another factor, we will revert it by following this straightforward equation:

For those who want a refresher on the probabilistic notation above, don’t hesitate to take a detour to this introductory article on the subject.

The Naive Bayes algorithm makes use of the Bayes’ Theorem in a quite simple style. It makes use of the coaching knowledge to calculate the chance distribution of every characteristic given the targets, after which, primarily based on the concept, it will get the reverse: the chance of the goal, given the options. This is sufficient to predict class chances for brand new knowledge as soon as we have now the options.

Let’s see it in motion. We’ll use the notorious Iris dataset wherein the duty is to categorise flowers into three iris species primarily based on petal and sepal measurements. To permit for intuitive visualization, we’ll solely use two options: sepal size and petal size. Let’s begin with loading the information and setting part of it apart for testing later.