Identify and prevent SPAM from taking over your website
24th August, 2011
It's 2011, and still many, many websites get SPAM comments (including this one!). If you run or own a site which frequently gets SPAM, you know what I'm talking about when I say it can get very annoying!
This post should hopefully give anyone wanting to identify and prevent SPAM comments/messages some useful tips and advice on how to do so.
Firstly, you'll want to prevent any SPAM getting through. You will typically find SPAM anywhere on your site that allows for user input, for example on a comment form.
The best way to prevent spam, in my experience, is to use a Captcha (Completely Automated Public Turing test to tell Computers and Humans Apart). This is usually a form of image-based verification used to check that any comments or messages are coming from a real person, not some hijacked computer sending out SPAM all day!
An excellent and easy-to-implement PHP-based Captcha script (that I have used a few times now) is securimage (http://www.phpcaptcha.org). There are, of course, many others out there, such as the popular one at http://www.captcha.net.
But I'll be the first to admit that Captchas are not exactly fun or great for the real commenter who's left desperately trying to decipher all those swirly letters and numbers... There is, however, an answer to this, and that's jQuery.
So, how about a drag and drop based Captcha such as the one by webdesignbeach (http://www.webdesignbeach.com/beachbar/ajax-fancy-captcha-jquery-plugin) or a slider based one like QapTcha (http://www.myjqueryplugins.com/QapTcha/demo)?
Another method of finding out if your comments are from a human or a computer is to ask a question at the end of your form, such as "what's this website called?" or "what's 4 + 4?".
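To make the question idea a bit more concrete, here is a minimal sketch of how the check might look on the server. It is written in Python purely for illustration (the same idea works in PHP or anything else), and the question bank, the accepted answers and the function names are all made up for this example rather than taken from any particular library.

```python
# Hypothetical question bank: the questions and accepted answers are
# placeholders, so swap in ones that suit your own site.
CHALLENGES = {
    "what's 4 + 4?": {"8", "eight"},
    "what's this website called?": {"example blog"},
}


def normalise(answer: str) -> str:
    """Lower-case the answer and collapse any extra whitespace."""
    return " ".join(answer.lower().split())


def is_human_answer(question: str, answer: str) -> bool:
    """Return True if the submitted answer is one we accept for the question."""
    accepted = CHALLENGES.get(question, set())
    return normalise(answer) in accepted
```

In a real form you would probably remember which question was asked on the server (in the session, say) rather than trusting a value sent back by the browser, and rotate the questions now and then so they are harder to script against.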
Other methods of preventing SPAM, such as IP blocking scripts, are generally not very effective (as the people sending out the SPAM typically use a whole range of IPs). Also, stating, for example, that "you manually check and moderate all comments before they go live on the site" won't necessarily stop you from getting overrun with spam messages either (you'll just end up with hundreds of comments in your moderator's panel instead!), but it's still an excellent idea to moderate all comments first.
Good validation on forms is also a must-have, as is not allowing HTML or script tags in anything a user submits.
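As a rough illustration of what "good validation" and "no HTML" could mean in practice, here is a small Python sketch. The field names, the length limit and the deliberately loose email pattern are assumptions for the example, not rules from any standard. The only library call it relies on is `html.escape` from Python's standard library, which turns characters like `<` and `>` into harmless entities.

```python
import html
import re
from typing import List

MAX_COMMENT_LENGTH = 2000  # assumed limit; pick whatever suits your site
# Deliberately loose check: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_comment(name: str, email: str, body: str) -> List[str]:
    """Return a list of validation errors (empty if the comment looks fine)."""
    errors = []
    if not name.strip():
        errors.append("name is required")
    if not EMAIL_PATTERN.match(email.strip()):
        errors.append("email looks invalid")
    if not body.strip():
        errors.append("comment is empty")
    elif len(body) > MAX_COMMENT_LENGTH:
        errors.append("comment is too long")
    return errors


def render_safe(body: str) -> str:
    """Escape HTML special characters so tags are shown as text, not run."""
    return html.escape(body)
```

Escaping on output like this means a comment containing `<script>` is displayed as plain text instead of being executed in your visitors' browsers.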
If you find you're still getting spam, it's mostly easy to spot (just look for the comments with millions of links in them, or the ones that make no sense). Others, however, can be a bit more deceptive. For example, take a look at the one below:
Looks OK at a glance, right? And hey, they bookmarked your site! But I'm afraid it's very likely a lie. Take another look: do you really think their name is "Bob’s super backlinks", or do you think that they want those keywords on your site? Sadly, it's the latter. Some SPAM comments can even look quite convincing and real, but if you're unsure, a great way to find out is to copy and paste just a part of that comment into Google. If lots of other sites come up with a word-for-word match of that comment, it's likely SPAM.
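The two tell-tale signs mentioned above (lots of links, and the same comment turning up word for word elsewhere) can also be checked roughly in code. The Python sketch below is only a heuristic under my own assumptions: the link threshold is arbitrary, and instead of searching Google it compares a comment against a set of comments you have already rejected as spam yourself. All the names in it are hypothetical.

```python
import re

# Matches a whole URL so "http://www.example.com" counts as one link, not two.
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
MAX_LINKS = 2  # assumed threshold; tune it for your own comments


def count_links(text: str) -> int:
    """Count the URLs that appear in a comment."""
    return len(LINK_PATTERN.findall(text))


def fingerprint(text: str) -> str:
    """Lower-case the text and drop punctuation so near-identical copies match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def looks_like_spam(body: str, known_spam: set) -> bool:
    """Flag a comment with too many links or one matching known spam."""
    if count_links(body) > MAX_LINKS:
        return True
    return fingerprint(body) in known_spam
```

Something like this is best used to push suspicious comments to the top of your moderation queue rather than to delete them automatically, since a genuine commenter might well include a few links.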
So to recap, make sure your input forms have good validation and make use of a Captcha image, and that comments are moderated before being posted. That way you should stop SPAM in its tracks!
Hopefully the above has provided a bit more information on SPAM and how to identify and prevent it. As always, feel free to leave a comment if you think I've missed anything out or just want to say hi.