Railslove 'round the world

Spam or Ham

reddavis:

I am in the process of applying to study Machine Learning at the University of Bristol next September. My project will be to build a Twitter spam classification system.

This is actually my second iteration of this project. For the last month and a bit I have been building the first prototype, and I’m currently finishing off my paper on the project.

I originally classified the training data myself, which took a long time. One thing I noticed is that spam on Twitter is very subjective: some people would classify an “SEO Marketing dude” as spam, whereas others wouldn’t.

The first prototype had only two classes (spam and ham), whereas this iteration has five. The side effect is that I need a lot more training data than before.

So this time I’ve opened it up so that anyone can help classify users.

For more info, take a look at the help page.

