Since I’ve redone the design on my blog, I figured it would be nice to enable comments again. I stumbled upon Steve Kemp’s blog post “When the light is green, the trap is clean.”, which is about an XML-RPC based antispam service for blogs and forums. I followed the link to blogspam.net and read a bit about the API. It seemed pretty straightforward and, as a bonus, it was open source.

There was no Java implementation, so I made one based on the Apache XML-RPC client (project site). It only implements the basics. The sources are up for grabs here.

Usage:

To test whether a comment is spam, use the isSpam method.

Example:

HashMap options = new HashMap();
options.put("comment","Ten animals I slam in a net");
options.put("ip","91.193.130.226");
String result = (new BlogSpamTest()).isSpam(options);

Now the string result contains the response sent from the blogspam.net server.
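If you want to act on that response, here is a minimal sketch of how the string could be turned into a verdict. It assumes the server answers with "OK", "SPAM:reason" or "ERROR:message", which is how I remember the blogspam.net API behaving, so check it against the API docs before relying on it. The SpamVerdict class and its fields are my own invention for this example and are not part of BlogSpamTest.

```java
// Hypothetical helper, not part of BlogSpamTest.
// Assumes responses of the form "OK", "SPAM:reason" or "ERROR:message".
final class SpamVerdict {
    final boolean spam;   // true if the server flagged the comment as spam
    final boolean error;  // true if the response was an error or unrecognised
    final String detail;  // reason or error message, empty when there is none

    private SpamVerdict(boolean spam, boolean error, String detail) {
        this.spam = spam;
        this.error = error;
        this.detail = detail;
    }

    static SpamVerdict parse(String result) {
        if (result == null) {
            return new SpamVerdict(false, true, "no response");
        }
        String trimmed = result.trim();
        if (trimmed.equalsIgnoreCase("OK")) {
            return new SpamVerdict(false, false, "");
        }
        // Split "HEAD:tail" at the first colon, if there is one.
        int colon = trimmed.indexOf(':');
        String head = colon < 0 ? trimmed : trimmed.substring(0, colon);
        String tail = colon < 0 ? "" : trimmed.substring(colon + 1).trim();
        if (head.equalsIgnoreCase("SPAM")) {
            return new SpamVerdict(true, false, tail);
        }
        // Anything else (including "ERROR:...") is treated as an error.
        return new SpamVerdict(false, true, tail.isEmpty() ? trimmed : tail);
    }
}
```

With this in place, SpamVerdict.parse(result).spam tells you whether to hold the comment back, and detail carries whatever reason the server gave.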

To train the Bayesian filtering system, use the train method, which takes two parameters:

  • comment – A string containing the comment to be trained upon.
  • spam – A boolean. true if the comment is spam, false otherwise.

Example:

/*
* Tell the server that following is spam, do this:
* Buy cheap viagra http://viagrashop.nl
*/

(new BlogSpamTest()).train("Buy cheap viagra http://viagrashop.nl",true);

/*
* On the other hand, if the server classified the sentence as spam:
* That is one nice car.
*
* and you don't think that it is, do this:
*/

(new BlogSpamTest()).train("That is one nice car.",false);

If you use it, improve it or have any questions, please let me know. :-)

UPDATE:

All methods have been made static.

I have added the getStats method, which takes one string as an argument: the hostname to get stats for.

Example:

HashMap stats = BlogSpamTest.getStats("http://blog.bredsaal.dk");
System.out.println("Received " + stats.get("ok") + " good comments and " + stats.get("spam") + " spam comments.");
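
If you would rather have a single number than two counters, the stats map can be boiled down to a spam ratio. This is only a sketch: I am assuming the "ok" and "spam" values arrive either as numbers or as numeric strings, and StatsSummary is a made-up helper for this example, not something shipped with BlogSpamTest.

```java
import java.util.Map;

// Hypothetical helper, not part of BlogSpamTest.
final class StatsSummary {

    // Convert a stats value to a count; assumes it is a Number or a numeric string.
    static long toCount(Object value) {
        if (value == null) {
            return 0;
        }
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        try {
            return Long.parseLong(value.toString().trim());
        } catch (NumberFormatException e) {
            return 0; // unparseable values count as zero
        }
    }

    // Fraction of all counted comments that were spam, or 0.0 if there are none.
    static double spamRatio(Map<?, ?> stats) {
        long ok = toCount(stats.get("ok"));
        long spam = toCount(stats.get("spam"));
        long total = ok + spam;
        return total == 0 ? 0.0 : (double) spam / total;
    }
}
```

Calling StatsSummary.spamRatio(stats) on the map from the example above gives a value between 0 and 1.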
