Posts

Showing posts from January, 2013

Improving Bayesian Filters

Image
Last time we looked at a Simple Introduction to Naive Bayesian Filters (or classifiers), and saw how they work using the example of a box of chocolates. It turns out there's a simple trick you can use to make Bayesian filters more effective. I haven't seen it applied before, but it seems like an obvious technique in certain situations, and I found that it improves the efficacy of filtering by up to 10% in my test cases.

Bayesian filters are used for things like spam filters, where they look for certain words and phrases and rate the "spamminess" of emails. The presence of certain words tips off the spam filter and sends spam to your junk folder. When you're filtering spam, you don't really care about the non-spammy words that an email contains, and you don't care about words that are missing. There are over 600,000 words in the English language. Most of them will not be in an email. You simply don't care about the words that aren't there, if you&…

A Simple Introduction to Naive Bayesian Filters

Image
In the field of data analysis and machine learning algorithms, naive Bayesian filters remain popular since they're relatively easy to implement yet surprisingly effective.  Most introductions to Bayesian filtering dive straight into mathematics, but Bayesian filters are simple enough that you can intuitively see how they work by stepping through an example.

Let's say we've got a box of chocolates, and some of them have nuts inside, and we don't like nuts. We'd like to be able to tell which chocolates have nuts in them.  The chocolates are all different shapes, sizes and colors; some are wrapped and some aren't. The only way to tell if a chocolate contains nuts is to take a bite and see. By sampling the box of chocolates and keeping track of the shape, size, color and wrapping of all the ones we tried, we could come up with a pretty good idea of which ones are most likely to have nuts.

You could make a list of characteristics: size, color, shape, and wrapping. …

Kenka Matsuri

Image
Participants in the Kenka Masturi (Fight Festival) carrying a shrine, Himeji, Japan


© 2012 Darren DeRidder