Favorite spam comment of the week

by Andrew Kagan 18. April 2011 12:25

Comment spam is an ongoing problem, and it’s very difficult to eliminate completely. Putting up spambot barriers is effective, but some spam still slips through, especially human-generated trackback attempts such as this one, masquerading as an anti-spam comment!

Hello, i read your blog occasionally and i own a similar one and i was just curious if you get a lot of spam remarks? If so how do you prevent it, any plugin or anything you can suggest? I get so much lately it's driving me mad so any help is very much appreciated.

Of course, this was posted using a bogus email and a trackback to a spam domain, and I wasn’t lured into enabling it. But unless you are moderating your comments, this one would likely escape attention.

Search engines such as Google, Yahoo and Bing are continuing to try to separate the “wheat from the chaff” when it comes to figuring out which backlinks are relevant to search, and which are just spammers trying to seed backlinks across thousands of unsuspecting blogs and message boards. This practice has accelerated even as search engines have become better at devaluing such links, leading to more headaches for moderators and admins.

As noted, comment spam can be completely prevented by moderating comments, but this requires the blog or message board admin to manually evaluate each comment. Even then, spambots and trackback services using live individuals create a tidal wave of comment spam, so using a third-party tool like Akismet is a necessity. Google’s free reCaptcha service is another useful tool, but it is vulnerable to brute-force attacks by trackback services using humans to solve reCaptchas.

Implementing both of those tools in concert will cut down on comment spam dramatically (usually by 95-98%), but still requires comment moderation to effectively block the 2-5% that gets through. The above spam comment got through both filters on the Search Partner Pro blog.

Can reCaptcha be hacked?

by Andrew Kagan 1. May 2009 08:27

Fascinating post a couple of days ago about Time Magazine's online annual "100 Most Influential People" poll getting hacked by Anonymous. Time Magazine allowed users to vote on its website for the person they considered most influential in 2008, using a simple form. Anonymous seized the opportunity to skew the results by spelling out a message with the first initials of the top 21 entries:


Anonymous used an army of bots to overwhelm Time's legitimate votes. In an effort to stem the attack, Time first took the form offline, where it continued to be exploited, and then finally put reCaptcha, a popular anti-spam visual-text-matching system, on the form (SearchPartner.Pro uses reCaptcha on our Contact Us form). reCaptcha is quite effective at defeating known exploits that attempt to use OCR (optical character recognition) to read the image and translate it to text, so Anonymous resorted to a "brute force" attack, using members (humans) to place as many votes as possible.

Anonymous also revealed many sophisticated techniques for defeating reCaptcha's pattern logic so that humans could submit entries faster. In the end, Time was unable to stop the hack and you can see the results in the image above. Time did not deny that it had been hacked and downplayed the importance of the results.

The news provoked a strong debate on the reCaptcha newsgroup. Was reCaptcha hacked? Typically, hacking a CAPTCHA would mean using a computer to defeat the protection, so that a human would not have to interact with the form. No one really knows if there is an OCR system that can do this right now, although hackers are constantly evolving this technology. Using brute force to defeat the system with human interaction is also quite common, and there are many teams of hackers in China, India and Russia (and elsewhere) that advertise these services, but this isn't so much of a hack as overwhelming a single point of protection. 

The lesson learned here is that relying on a single technology for protection will inevitably fail, while adding further steps can slow down brute-force attacks by many orders of magnitude: for example, restricting the number of submissions by IP address, embedding hidden text fields on forms (that only a bot would see and try to add data to), adding two-factor verification (e.g. CAPTCHA and random problem match), etc.
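
To make two of those extra layers concrete, here is a minimal sketch in Python of a hidden-field ("honeypot") check combined with a per-IP submission limit. It is only an illustration, not what Time or this blog actually runs: the class name SubmissionGuard, the field name website_url, and the limits (3 submissions per 60 seconds) are all assumptions made up for the example.

```python
import time
from collections import defaultdict, deque


class SubmissionGuard:
    """Hypothetical helper combining a honeypot check with per-IP rate limiting."""

    def __init__(self, max_per_window=3, window_seconds=60.0,
                 honeypot_field="website_url"):
        # All defaults here are illustrative assumptions, not recommended values.
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.honeypot_field = honeypot_field
        self._history = defaultdict(deque)  # ip -> timestamps of accepted posts

    def allow(self, ip, form, now=None):
        """Return True if the submission passes both checks."""
        now = time.monotonic() if now is None else now

        # Honeypot: the field is hidden from humans via CSS, so any value
        # in it suggests that a bot filled in the form.
        if (form.get(self.honeypot_field) or "").strip():
            return False

        # Rate limit: forget accepted submissions older than the window,
        # then refuse if this IP has already used up its allowance.
        recent = self._history[ip]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_per_window:
            return False

        recent.append(now)
        return True
```

A form handler would call guard.allow(ip, form) before running the CAPTCHA check or saving the comment. Only accepted submissions count toward the limit in this sketch; counting rejected attempts as well would be stricter. Neither layer stops a determined human, but each one raises the cost of a brute-force run.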

