This is the fourth entrant in The Blogging Essay Contest from WeblogToolsCollection.com. If you would like to participate, please email me your entry at mark at wltc dot net. Please rate this article using the star system below. The competition will be judged primarily on the input from readers like you. Thank you.
This is written by Mark Styles from Lambic
They’re everywhere, and they’re annoying. They’re called CAPTCHAs and they’ve become a ubiquitous part of blog commenting. Bloggers use them as a quick and dirty solution to an annoying problem without consideration for the annoyance they will cause the reader. I want to persuade all bloggers who are using them to please stop.
What are they?
CAPTCHA stands for “completely automated public Turing test to tell computers and humans apart”. I know it should really be CAPTTTTCHA but hey, I didn’t come up with the acronym. Before bigots destroyed his life, Alan Turing posited the idea of a test to determine machine sentience. His test was designed to decide if a computer had achieved artificial intelligence. So far no computer has passed a Turing test, but the CAPTCHA uses the idea of a Turing test in reverse, testing if a supposed person is really a person and not a computer program pretending to be a person. So a CAPTCHA is a test to make sure the person posting a comment (or anything else, but I’m concentrating on the blogging usage here) is really a person, and not a spam generator trying to post comments about card games, prescription drugs or sex. It usually involves an image showing some distorted text, requiring the user to type in what they see in the distorted text.
Why are they bad?
Anything that stops spammers is good, right? Well generally yes, but some things that stop spammers are better than others; so much better that the inferior solutions become unnecessary. There are many problems with CAPTCHAs:
- Any extra work required to comment is likely to deter some people from commenting at all.
- Sometimes the images are so distorted they’re almost impossible to read, even with perfect eyesight.
- CAPTCHAs are hackable. Spammers are smart; they can get past many of our barriers.
- Visually impaired users are completely excluded (although there are audio CAPTCHAs available now).
- Dyslexics have a hard time too.
- There are better and less intrusive solutions.
What are these better solutions?
Hopefully by now I’ve convinced you that CAPTCHAs are not the best solution to the spam flood. Now it’s time to bring in the alternatives, but before I offer my alternatives, we should decide what our requirements are. An effective and non-intrusive spam blocker should:
- Require nothing or as little as possible from the valid commenter.
- Require as little effort as possible from the webmaster/blog owner.
- Work on as many blog platforms as possible, or have similar alternatives for other blogging platforms.
- Stop as much spam as possible.
- Not interfere with valid comments.
Here are the solutions which I feel best meet these requirements:
Centralized spam database
This is what I use, in the form of Akismet. The idea is that all spam comments get submitted to a central server. Each time someone comments on your blog the comment gets checked against the central database. If the comment looks like spam it is automatically flagged as such. The person leaving the comment didn’t have to do anything. The blogger just has to check for false positives occasionally. Everybody is happy. So far Akismet has stopped over 15,000 comments from being published on my blog with about three false positives (comments marked as spam which were not spam) that I know of and about 5 false negatives (spam comments that did not get marked as spam). Akismet is designed for WordPress but will work with other blogging platforms, and the API is open source. The downside of this solution is the reliance you have on a central database. If the database goes down or disappears altogether then the spam flood will begin again. But while it’s around, why not take advantage of it?
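To make the idea concrete, here is a minimal sketch of a shared spam database. It is an in-memory stand-in, not Akismet's real API: the `CentralSpamDatabase` class, its method names and the exact-match fingerprinting are all assumptions made for illustration, and a real service would use far richer signals than a hash of the comment text.

```python
import hashlib


def fingerprint(comment_text):
    """Normalize case and whitespace, then hash, so repeated spam shares one key."""
    normalized = " ".join(comment_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class CentralSpamDatabase:
    """Hypothetical shared service; every participating blog talks to the same one."""

    def __init__(self):
        self._known_spam = set()

    def report_spam(self, comment_text):
        # Any blog can submit a spam comment it has caught.
        self._known_spam.add(fingerprint(comment_text))

    def report_ham(self, comment_text):
        # A blogger corrects a false positive.
        self._known_spam.discard(fingerprint(comment_text))

    def looks_like_spam(self, comment_text):
        # Every new comment is checked against what other blogs have reported.
        return fingerprint(comment_text) in self._known_spam


db = CentralSpamDatabase()
db.report_spam("Cheap   PILLS here!!!")            # caught on blog A
print(db.looks_like_spam("cheap pills here!!!"))   # blog B benefits: True
print(db.looks_like_spam("Great post, thanks."))   # False
```

The commenter does nothing extra here. The only manual work is the occasional `report_ham` call when a legitimate comment gets flagged.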
Comment analysis programs
Programs like the Bad Behavior plugin for WordPress take all comments received and analyze them for telltale signs of spamminess. Using data hidden in the HTTP headers like user agent information it is possible to tell if a comment came from a legitimate user or a spambot. The downside of this kind of solution is that it has to be smarter than the spammers, and spammers are smart. Bad Behavior works very well though, or so I’ve heard; Akismet takes care of things so well that I haven’t needed extra solutions.
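As a rough illustration of header-based analysis, the sketch below scores a request by a few telltale signs. The rules, weights and threshold are my own assumptions for the example and are not taken from any real plugin. A production tool would use many more checks, and some legitimate users suppress headers such as Referer for privacy.

```python
def suspicion_score(headers):
    """Count telltale signs in the request headers (illustrative rules only)."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    score = 0
    user_agent = h.get("user-agent", "").strip().lower()
    if not user_agent:
        score += 2  # browsers normally send a User-Agent
    elif any(tool in user_agent for tool in ("curl", "libwww", "python-requests")):
        score += 2  # scripting libraries often announce themselves
    if "accept" not in h:
        score += 1  # browsers normally say which content types they accept
    if "referer" not in h:
        score += 1  # the form may have been posted without loading the page
    return score


def looks_like_bot(headers, threshold=2):
    return suspicion_score(headers) >= threshold


browser = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
    "Accept": "text/html",
    "Referer": "https://example.com/my-post",
}
script = {"User-Agent": "libwww-perl/6.05"}
print(looks_like_bot(browser))  # False
print(looks_like_bot(script))   # True
```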
Filtering, whitelisting and blacklisting
If your spam problem isn’t big enough to warrant external tools, you can probably get a fairly good spam filter going just with what your blogging software offers natively. You should be able to filter out comments which contain common spammy words (like phentermine, poker, viagra, holdem, etc.). If spam is still getting through you can look at whitelisting; maybe your blog has an option like “only allow comments from people who have commented before” which is like an automatic whitelist after the first moderated comment is approved. Blacklisting is trickier, but if you see spam constantly coming from the same source then you can blacklist that source. Most spammers will get around this easily though. For a list of other spam busters, you can try this page, which is for WordPress, but the concepts still apply to other blog platforms.
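The three techniques combine naturally into one decision. The sketch below is hypothetical: the `classify` function, its arguments and the order of the checks are assumptions for illustration, and your blogging software will expose these options through its own settings rather than through code like this.

```python
import re

# The word list comes from the examples in the paragraph above.
SPAM_WORDS = {"phentermine", "poker", "viagra", "holdem"}


def classify(author_email, source_ip, text, approved_authors, blocked_ips):
    """Return 'reject', 'moderate' or 'publish' for a single comment."""
    if source_ip in blocked_ips:
        return "reject"      # blacklist: a source seen spamming before
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & SPAM_WORDS:
        return "moderate"    # word filter: hold it, it might still be legitimate
    if author_email in approved_authors:
        return "publish"     # whitelist: has had a comment approved before
    return "moderate"        # first-time commenter


approved = {"regular@example.com"}
blocked = {"203.0.113.5"}
print(classify("regular@example.com", "198.51.100.7", "Nice post!", approved, blocked))
# publish
print(classify("regular@example.com", "198.51.100.7", "Play poker now", approved, blocked))
# moderate
```

Holding filtered comments for moderation instead of deleting them is a deliberate choice in this sketch, since a spammy word can turn up in a perfectly valid comment.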
Summary
CAPTCHAs are bad. They don’t test for humans, they test for smart non-lazy humans with good eyesight and smart spambots that have CAPTCHAs all figured out. They are at best an annoyance and at worst discriminatory. Using some or all of the suggestions I offered above, you can eliminate your spam problem without making your readers jump through hoops and without losing your own time dealing with the problem. If your chosen blogging platform doesn’t support these solutions, then think seriously about changing your platform. I heartily recommend WordPress for all your blogging needs, either hosted or your own installation. My final piece of advice is for quitters. If you give up trying to deal with comment spam, or you give up blogging completely, please please please remember to disable commenting before abandoning your blog. Every spam comment that gets published is a victory for the spammers.
What would your solution be to spam on sign-up forms, e.g. on a web forum?
Akismet is fantastic; I use it on my blog and have integrated it into the comments system on my site. But for forums, I’d like to catch the spam bots *before* they have a chance to post spam… the captcha is effective for this use, although (as with every anti-spam tool, including Akismet) it still lets a few ‘smarter’ bots through…
I’d love to stop using captcha and make it easier for real people to sign up but in the current climate of spam it’s impossible to not have something.
The ban filter on the forum I run is massive and still doesn’t help, it’s not even a busy board! The new captcha on the latest version has been great, cutting new registrations from 20 per day to 1 in almost 2 months.
I believe you are talking about alternatives to CAPTCHA. If so, you forgot about Spam Karma 2!
Bob, I’m not a forum user so I can’t really comment on that aspect, my essay is targeted more at bloggers. There are alternatives to image based CAPTCHAs that are more accessible though, for example asking a question that a spambot would not be able to answer, like a simple math question (but then you’re excluding people who can’t do math ;))
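A simple math question like the one mentioned above could be sketched as follows. Everything here is an assumption for illustration: the function names, the `SECRET_KEY` placeholder, and the choice to sign the expected answer into a token (which would travel in a hidden form field) so the server does not need to remember which question it asked.

```python
import hashlib
import hmac
import random

SECRET_KEY = b"change-me"  # hypothetical per-site secret, never sent to the browser


def make_question(rng=random):
    """Return a question to display and a token for a hidden form field."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    question = f"What is {a} plus {b}?"
    token = hmac.new(SECRET_KEY, str(a + b).encode(), hashlib.sha256).hexdigest()
    return question, token


def check_answer(answer, token):
    """True if the submitted answer matches the one signed into the token."""
    expected = hmac.new(SECRET_KEY, answer.strip().encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)


question, token = make_question()
print(question)  # the numbers vary from run to run
```

With only a handful of possible answers, a determined bot could replay a known answer and token pair, so this is a speed bump rather than a real barrier.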
Ajay, I didn’t forget about SK2, I believe it fits into the comment analysis category where I used Bad Behavior as the example. SK2 is listed on the Codex page I linked to so I didn’t bother including it explicitly.
@Bob
It may be time to consider changing your forum solution. Akismet integrates beautifully with bbPress which was developed by the makers of WordPress. Although I understand the difficulty that presents if you run a forum with a lot of users.
I have already removed a graphical CAPTCHA from my blog – except the one from SK2 which is already beaten – and switched over to a combination of all the plugins you told us:
– Akismet for SK2
– Spam Karma 2 itself (unknowngenius.com)
– Bad Behavior 2 (homelandstupidity.us)
– Comments Post Rewriter Plugin (my own one)
And a plugin that is less about anti-spam and more about user validation: Skippy’s Comment Authorization plugin. 🙂
Have fun downloading and installing them… 😉
This was a great article, and I hope everyone who swears by CAPTCHAs will read it and reconsider. They are, as you write, “at worst discriminatory”.
bbPress is not the kind of forum solution I’m looking for, as nice as it is 🙂
It still wouldn’t stop users registering, just posting! I don’t like having loads of inactive accounts accumulating in my forum DB which take up a lot of names ‘real’ users would want to register…
I’ve had a think about this and am going to try a feature of the forum system I use (IPB) that lets you create ‘custom’ registration forms; a good example of this is the sci fi uk forums, which do this. Using a custom form without standard names for the fields (i.e. using blargh instead of email in the HTML, and adding a hidden input named email which, if filled in, will block the user from registering) will hopefully cause the bots to ignore it.
There are, as yet, no easy alternatives to CAPTCHA; the best is probably the math question one, but it does rely on numeracy skills.
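The hidden-field trick described in this comment could be handled on the server roughly like this. The field names `blargh` and `email` come from the comment itself. The function names and the assumption that the form arrives as a plain dictionary are hypothetical, and nothing here reflects how IPB actually implements custom forms.

```python
def is_bot_submission(form):
    """True if the hidden decoy field named 'email' was filled in.

    Real visitors never see the decoy, so they leave it empty.
    A bot that fills every familiar-looking field gives itself away.
    """
    return bool(form.get("email", "").strip())


def real_email(form):
    """The genuine address arrives under the non-standard name 'blargh'."""
    return form.get("blargh", "").strip()


human = {"blargh": "bob@example.com", "email": ""}
bot = {"blargh": "x@spam.example", "email": "x@spam.example"}
print(is_bot_submission(human))  # False
print(is_bot_submission(bot))    # True
print(real_email(human))         # bob@example.com
```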
I believe captchas as a primary spam prevention method are very bad. However I do believe they have some use as a backup prevention method (à la SK2). There definitely are issues with visually impaired users, but Akismet alone won’t stop spam. You need a blended approach. Here’s an interesting captcha I’ve used with phpbb until Akismet and other methods get more widely used (and thus useful):
http://www.kessels.com/captcha/
Probably will induce a seizure or two, but makes it harder for the bots to decipher.
Interesting article, but couldn’t disagree more. Here’s why:
Well formed captchas are easy for all but the seriously visually impaired. Sound based captchas are also available.
Umm, I’m signed up to a spam database for email. But as you point out spammers aren’t stupid, they soon create workarounds. I’d be interested to see how labour intensive Akismet really is? Going by some accounts – very!
Comment analysis programs are all very well but they have plenty of weaknesses that bots can exploit. Trust me, you can get yourself into a real lather trying to plug all the holes.
Whitelisting and blacklisting can be very helpful in resolving minor spam issues and very handy for blocking malicious or persistent offenders.
In summary, I struggle to see why you would put yourself to all this extra effort when a captcha very efficiently takes care of all but the most sophisticated and persistent spammers.
By the way – if you’re highlighting time as an issue then really anybody who has anything worth saying will not begrudge a few seconds tapping out a captcha.
Interesting debate.
I hate captchas. I have slight vision problems from retinopathy and I have the hardest time reading captchas when they are distorted at all. It’s so much of a pain, that I don’t spend as much time reading blogs that use them anymore. I would like to be able to comment without having to take a vision test!
As for forums, I think they are ok for sign up forms. That’s a one time thing, as opposed to posting on forums which is a more frequent thing, like commenting on blogs. So CAPTCHAs for forum sign ups are ok, just not for every forum post.
Chris, I’m not sure why you think Akismet is labour intensive. It took me five minutes to install. For the first little while after it was installed I checked the comments it marked as spam for false positives but now I don’t even do that, I just hit the Delete All button every now and then.
Obviously if you’re worried about false positives then it will take a bit longer to scan the marked comments but even then you can scroll through them pretty quickly. I don’t bother, and it’s only bitten me once, thanks to a reader whose username was a common online card game.
I use CAPTCHAs on my blogs and it does bum me out… I’d rather that there was a less labor intensive way for readers to be distinguished from bots. Even with the CAPTCHAs, some spam gets through, but it’s much better than it was when I turned them off.
I also aggressively monitor my comments, and use blacklisting, word banning etc. There are some words common to spam comments which I don’t want to blacklist, because they might also have legit uses.
I used to hold comments for approval, but I don’t care for the delay that this forces on the conversation. Plus, although it’s a good way to make sure that you reply to comments, it’s ultimately too time consuming.
Readers can sign in to my blog with TypeKey to avoid the CAPTCHAs, but I regard that as only a partial solution. Even though the service has improved, I don’t like that it doesn’t allow the URL link to be specified on a comment post, and it’s still an extra step.
I think the best solution will come from 3rd party identity verifiers eventually. You’ll sign in when you get online, and your verified ID will be used to mark you as a person. Sure, that system can be gamed also, but I think it shows promise.
Two points in favor of CAPTCHAs:
1. Since I have a comments feed, I feel very strongly that I need to do everything I can to save my readers from spam comments.
2. Sometimes, that extra step before posting a comment gives a reader just an extra moment to think about whether they really want to go on record with their comment. Also, it provides a moment to think about whether you’ve said all you need to say. I never use the “preview” feature of comments, but I have edited comments I was leaving to either include or delete part of the comment when I noticed something in the preview that usually comes up along with a CAPTCHA.
One point against:
I hate having to fill out the CAPTCHA myself when I leave a comment in response on my own blogs! I try to respond to most comments, so I end up having to fill out a lot more CAPTCHAs than I would if I were just commenting elsewhere!
I’m up for trying anything to combat this scourge, but false positives tend to seriously antagonise your readership in my experience – therefore labour intensive checking it has to be. Plus it hinders the spontaneity of discussion.
Also, just by looking at the comments on this post you could say that the average commenter of any worth will take a minimum of 3 – 4 minutes thoughtfully constructing his text. How long does it take to fill out a good captcha system? 5 seconds… if that… Not a great deal of time in the scheme of things.
John – good CMS software will enable the webmaster to comment without filling out captchas.
There are undoubtedly good and bad captcha systems hence my careful use of the term “good captcha system”. The poorer cousins are a menace – of that I can heartily agree. However, get it right and captchas can be very effective.
To Chris:
I hate filling out CAPTCHA, and I’ll generally not comment on something if forced to fill out a CAPTCHA.
Many of them are annoying to read, at best, and impossible to read, at worst.
I hate being forced to fill them out while trying to log in somewhere. I know my username, I know my password, just LET ME LOG IN, right?
Case in point: I had to log in on a specific website I visit. I’ve already told them of my displeasure with the CAPTCHA they added to the login. I knew my name, and my password, but it took me three tries to get the bloody CAPTCHA right! At 14 seconds a pop, that’s three quarters of a minute wasted on a stupid log in form.
(I also have a hateboner with requiring javascript to even log in.)
After using just about everything to control spam, I installed the auto-shutoff plugin and set it to shut off comments for all posts over 14 days old. It’s worked better than anything else.
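The auto-shutoff rule amounts to a single date comparison. In this sketch the function name and parameters are hypothetical, and only the 14-day cutoff comes from the comment above.

```python
from datetime import datetime, timedelta, timezone


def comments_open(published_at, now=None, max_age_days=14):
    """True while the post is no more than max_age_days old."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at <= timedelta(days=max_age_days)


posted = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(comments_open(posted, now=datetime(2024, 1, 10, tzinfo=timezone.utc)))  # True
print(comments_open(posted, now=datetime(2024, 2, 1, tzinfo=timezone.utc)))   # False
```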
I considered the auto-shutoff route, but I actually quite like it when someone comes across an old post of mine and decides to comment on it, so I didn’t want to lose that.
The problem is, for your average blogger, captcha is actually the simplest and most effective anti-spam tool available. Most blogging software has it installed, or as an installable option, and it doesn’t require the person who runs it to do anything.
I was deleting hundreds of spam each day on my personal blog – personal blog, nothing that draws traffic from anyone but my circle of family and friends. I had several modules that blocked based on known blacklists, but still I was getting hundreds of automated spam. Installed captcha and now I get something spammy through about once a month.
In a personal situation things are a bit different – no one I know has problems seeing the captcha screen or reading the letters/numbers. And I don’t want to have to worry about needing to approve posts. So it is simple – a couple of seconds for a visitor to decipher the screen and neither of us are bothered by unwanted spam.