Comment Spam with more Kung Fu?

35 responses, on September 23rd, 2008, in Blogging News, brainstorming, WordPress

It is no secret that Weblog Tools Collection is a magnet for comment spam. I am not sure if this is something to brag about, but Akismet has caught over 3,588,568 spam comments as of this writing. We spend a considerable amount of time moderating comments on this blog, and lately the time spent has increased manyfold. Let me explain.

Since automatic spamming of blogs has mostly been reduced to a trickle due to the likes of Akismet, spammers are now individually targeting blog posts with highly relevant, and in many cases highly convincing, comments. I moderated and subsequently spammed a comment today that was over a hundred words long, on the pros and cons of one of the themes on our daily theme posts. I thought the comment was a very well written review of the theme until I looked closely. The URI of the poster was a refinancing Made For AdSense page. One click of my itchy index finger and it was gone. Take that sucka!

This sort of relevant comment spam is not new, but the time and effort spent on writing it seems to have increased quite a bit, to the point where legit comments are harder to distinguish from spam. These comments now come with more Kung Fu and we need to spend more time and effort identifying them. We use a plugin written by Alex King on this blog that can delink comment authors. However, there is lurking danger in just delinking a comment author since if the author was really a spammer, authorizing a comment is akin to authorizing them to post comments on your blog without moderation (they still have to go through Akismet even if the author was previously authorized). So if a later comment looks really legit but has a lurking spam link in the URI, and Akismet thinks it is ham, then you as the blogger never see it because it never gets put into the moderation queue, and the spammer is successful.
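To make that whitelisting hole concrete, here is a minimal sketch of the flow described above. It is a simplified model and not WordPress's actual code: the `Comment` and `Blog` classes, the `route` and `approve` methods, and the idea of keying the whitelist on the author's email are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author_email: str
    author_url: str
    body: str


@dataclass
class Blog:
    # Hypothetical whitelist: emails of authors with an approved comment.
    approved_emails: set = field(default_factory=set)

    def route(self, comment, akismet_says_spam):
        """Return where a new comment lands: 'spam', 'published' or 'moderation'."""
        if akismet_says_spam:
            # Even whitelisted authors still go through the spam filter.
            return "spam"
        if comment.author_email in self.approved_emails:
            # Previously approved author: the moderation queue is skipped.
            return "published"
        return "moderation"

    def approve(self, comment, strip_url=False):
        """Approve a comment by hand, optionally delinking the author first."""
        if strip_url:
            comment.author_url = ""
        # Delinking leaves the email untouched, so the author is whitelisted anyway.
        self.approved_emails.add(comment.author_email)
```

In this model, approving one delinked comment is enough: the next comment from the same email goes straight to 'published' as long as the spam filter calls it ham, spammy URI and all.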

In addition to much better comments from spammers, trackback spam has gotten cleverer as well. Spammy trackbacks from Russian and Chinese sites have increased considerably. They are much harder to identify as spam because of the language barrier, and we have resorted to deleting them if they look anywhere close to being suspicious or add nothing to the conversation. We also delete any and all trackbacks from scraper sites. As a matter of fact, we use the blacklist feature to remove them from contention completely, but scrapers are another story for another day.
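For trackbacks, two checks come up in the comments on this post under the name Simple Trackback Validation: does the sender's IP match the server behind the trackback URL, and does that page really link back to you? Below is a rough sketch of those two checks. It is not the plugin's code; the function names and the plain substring test for the link are assumptions, and the resolver and fetcher are passed in as parameters so they can be swapped out.

```python
import socket
from urllib.parse import urlparse
from urllib.request import urlopen


def resolve_host(hostname):
    """Return the set of IPv4 addresses a hostname resolves to."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def fetch_page(url):
    """Download a page body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def trackback_looks_valid(sender_ip, source_url, blog_url,
                          resolve=resolve_host, fetch=fetch_page):
    """Check that the sender IP matches the source host and that the
    source page mentions our URL (a crude stand-in for 'links to us')."""
    hostname = urlparse(source_url).hostname
    if not hostname:
        return False
    try:
        if sender_ip not in resolve(hostname):
            return False
        return blog_url in fetch(source_url)
    except (OSError, ValueError):
        # DNS failure, network error or unusable URL: treat as not validated.
        return False
```

A one-time check like this would not catch the trick of swapping out the page content weeks later, since it only runs when the trackback arrives.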

So we are spending more time moderating comments and often feel guilty that we might be spamming or deleting legitimate comments from folks whose site/blog looks spammy to us (there was a poker blog that looked spammy but was written by a legitimate poker fan). Sadly, one person’s spam might look to another person like ham and vice versa. It is getting harder to determine the spamminess of a comment because the distinguishing features that separate ham from spam continue to get more and more blurry. Please be (more) careful when approving a comment because you might be letting a spammer in.

“Named must your fear be before banish it you can.”

35 Responses

 

Comments

  1. Bruce (2 comments.) says:

    Man… I thought I was having a stroke while trying to parse that Yoda quote at the end of your post.

    Anyway, I’ve seen a fair amount of comments written by “comment marketing” outfits lately. Boy, is that a sleazy way to do things. Usually the email address and author URL are the tip off.

  2. Tadd (89 comments.) says:

    I don’t get anywhere near the amount of spam you folks get, however Akismet has snagged about 2000 total off my silly little blog.

    But as you said, I get some spammers who seem to really put some nice thought into a copy/paste comment. It’s strange really. But, the email addy or the web URI is what always tips me off. I’ve considered just dropping their URI and letting the comment slide – but like you, I’ve seen the dangers.

  3. Dougal Campbell (35 comments.) says:

    Yup, that’s a trend I’ve been dealing with, as well. Many of the comments do indeed appear to be written by hand, by someone who has actually read the article that they are commenting on. But then you see that linkbait URL, and you know that the real reason that they are commenting on your site is for the Googlejuice.

    And as you noted, the foreign-language comments/sites can be hard to definitively classify as ham or spam.

  4. JJ (1 comments.) says:

    Most of my comment spam is written in Russian… so it’s pretty easy to distinguish :)

    • Aw (8 comments.) says:

      Cool, but it’s hard to distinguish English and German spammers as a Chinese blogger :(

  5. Calvin (3 comments.) says:

    captcha? Why not a combination of various captcha? maths captcha and alphabet captcha?

    The funny thing is, how do these spammers operate? I have this blog, a personal blog with the comment feature disabled, the comments.php removed, the PHP code for the comment box in single.php and home.php removed, and the best part is, there are comment spams that made it…

    • Mark Ghosh (386 comments.) says:

      Since most of these comment spammers are human, a captcha or any other such human test would be useless.

    • Ricky Buchanan (5 comments.) says:

      Aside from what Mark pointed out, another problem with CAPTCHAs is their inaccessibility to users with disabilities. A visual-only captcha will render your site inaccessible to blind users and many others with neurological or age-related vision issues. A visual captcha with an audio alternative is inaccessible to those with deafblindness or neurological issues. A maths-based captcha will often be inaccessible to commenters with dyscalculia (the math equivalent of dyslexia for reading). Comprehension-based captchas will often be inaccessible to commenters with neurological or memory issues.

      Pretty much any captcha is inherently inaccessible to some people with disabilities because captchas say “prove you are human by doing X” and I’m pretty sure there is no “X” which all humans (or all net using humans, even) are able to do but which a bot can’t replicate.

      I’m hyper-aware of this because I run a blog specifically for people with all types of disabilities. The solution I’ve chosen is to use Akismet and a lot of manual checking of the spam queue so there’s no login or captcha or other barrier to commenting. It’s far from ideal because of the amount of work required to wade through the spam, but I’d rather have me do the extra work than have the comments inaccessible to a group of people because of the nature of my blog.

      I wish this was a solvable problem, but at the moment it seems to be more like an arms race.

    • Viper007Bond (91 comments.) says:

      CAPTCHAs are so 2000. They are easily defeated and are just annoying to legit users.

      Plugins like Cookies For Comments are much more effective at stopping bots.

  6. Andrew (31 comments.) says:

    This kind of spam, and in fact all spam, is a symptom of the misguided (IMO) approach to comments that says that a commenter should be ‘rewarded’ with a link that gives Google juice.

    I don’t agree with this. The fact that I am commenting here says nothing about my website and shouldn’t entitle me to any search engine benefit. It is enough that real humans can find my site and find out more about me.

    The best answer to comment spam is for everyone to stop linking out. When the benefit is gone then the spammers will go elsewhere.

    • Bruce (2 comments.) says:

      What is the spammer really trying to accomplish though? With all the spam fighting methods in place, throughput to an actually published comment has to be very low nowadays… I’m beginning to believe that comment spam is more about the blog owner’s eyeballs than any traffic that might be arriving there.

      • Doug Smith (17 comments.) says:

        They’re going for Google juice. If you read the advice of the many self-proclaimed Internet “marketers”, they give instructions on finding blogs that you can leave comments on to get free links. Some sell tools to do this automatically. The goal is to have as many links as possible from sites across a variety of PageRanks, and spread over time, so it looks organic.

        It looks like weblog tools doesn’t mark commenter URLs as nofollow. I’m convinced that is just a big magnet for spammers. In fact, I’ve seen spammers promoting dofollow as a good deed we should all do to reward our commenters. Then they use software to collect lists of sites sporting a dofollow logo and sell access to the database for spamming. See my post Nofollow and the Spam War Arms Race for more on that.

  7. Rob (8 comments.) says:

    There is definitely more “Kung Fu”, as you put it, in comments. One thing I like having on my site is the MyBlogLog widget. By showing who was recently on the site, I can often confirm whether or not the comment was made by someone who was actually on the site. Might be a little more difficult for a really busy blog though.

  8. Pinyo (1 comments.) says:

    So far, I found the combination of Akismet, YAWASP, and Simple Trackback Validation to work quite well.

    Personally, I don’t like CAPTCHA because it interferes with real commenters.

    • Doug Smith (17 comments.) says:

      I’ll second the recommendation for Simple Trackback Validation. Its simple act of checking to see if the IP of the trackback is the same as the IP of the URL’s server does wonders. It also checks the URL to see if the page really links to your blog.

      The other thing to watch for on trackbacks is that I’m seeing some spammers link to you from real looking content, perhaps lifted from another topically-related site, then change out the content weeks later. They’re hoping you don’t notice that you end up giving them a link to their spammy keyword stuffed page from your old posts. It’s not a new incoming link so you don’t get any notification from WordPress.

  9. no says:

    Just because I come from Google doesn’t mean I found this in a google search. Your “please subscribe” is idiotic.

  10. Marcus (1 comments.) says:

    I have around 20 visitors/day on my blog, which results in, let’s say, 1 comment/week, yet about 160K ‘comments’ (mainly track-/pingbacks I guess, since the number has drastically decreased since using Simple Trackback Validation) blocked by Akismet. With that kind of traffic one might think my blog is really not worth being spammed manually, yet every third comment’s main purpose seems to be the backlink.
    I don’t agree with Andrew (#6), but I also don’t regard the links as a reward, merely as links, nothing more but especially nothing less. So only yesterday I decided to disable nofollow for commenters’ links once more and cleaned up my comments, most of the time deciding to merely discard the link rather than delete the whole comment (dang… I wish I had known of Alex’s plugin yesterday!), since they are by now part of the more lively discussions.
    Though I see the danger you mentioned, I think I can handle it given my traffic.
    And last but not least… I wanted to say thank you to all the hobby-SEOs who so selflessly give my blog the impression of being important only a little bit… :P

  11. Sue (4 comments.) says:

    I have noticed this recently, too, and I don’t get very many comments on my site due to the nature of the readers.

    I’ve been delinking the spammers, but never thought about marking it as spam instead so Akismet can learn. But there are hundreds of these commenters around.

    You see, this is just another make money scheme for clueless bloggers. They’re getting paid a few cents to write these comments, and they think they can earn big bucks doing this.

    I’m sure glad that I actually don’t have to deal with the amount this site gets. Or any other popular site. Phew. I’d never have time to do anything else. :D

  12. Mike (1 comments.) says:

    Holy crap, 3.5 million spam comments!

    I’ll never complain again.

  13. Jonathan (81 comments.) says:

    Maybe there should be an Akismet v2 that checks the comment author’s page for spamminess.

  14. Jesse Harris (10 comments.) says:

    What, no love for Spam Karma 2? I’ve been using it for a long time and despite a lack of updates for over 18 months, it still works perfectly with the latest version of WP and has almost zero false positives or false negatives.

  15. minanube (4 comments.) says:

    Akismet v2 should be worth waiting for. Let’s see what’s new with the new release.

  16. Book Maven (3 comments.) says:

    And I thought my comment spam problem was bad (getting about 10 trapped in Akismet per day), but 3.5 million, that’s just crazy!

    My take, however, was that the comment itself wasn’t spam, but the link provided was. The commentator obviously took the time (I assume) to read the post, look at the time, and offer constructive feedback. Chances are, if he hadn’t used a spammy mortgage refinancing link, you probably would have approved the comment as legit, right?

    In situations like that, I would probably go ahead and approve it after removing the link because the comment does add value to the blog entry in some way.

    However there is lurking danger in just de-linking a comment author since if the author was really a spammer, authorizing a comment is akin to authorizing them to post comments on your blog without moderation…

    Not necessarily.

    What I make sure to do on my blog is remove the check next to “Comment author must have a previously approved comment” under the Discussion settings so that spammers won’t be inadvertently white-listed and auto-approved. Or, I’ll deliberately change the email address they used slightly, so if they use it again, it won’t automatically be approved.

    My 0.02 cents.

    BM

  17. Book Maven (3 comments.) says:

    Sorry for the double post; “look at the time” above should read “look at the theme“.

    BM

  18. Anuj Seth (1 comments.) says:

    I’ll not complain about the spam I get on my blog again. It seems non-existing after reading about the volume of spam you get!

  19. Aw (8 comments.) says:

    The fight between bloggers and spammers will never end!

  20. redwall_hp (40 comments.) says:

    Perhaps you should just redact the URL? That way you get the nice long comment, but de-spammified?

  21. Ron says:

    I find the combination of Bad Behavior2 and Spam Karma2 keeps me spam-free on a five year old blog. My original anti-spam plugin was MT-Blacklist which helped a lot until it was abandoned. After that, things got really bad until I put in BB and SK, thankfully it’s been a non-existent problem since.

  22. LoneWolf (1 comments.) says:

    It’s been about 1.5 years since you wrote this post. Have you noticed any improvements in Akismet and other spam fighting tools?

    What I’m finding troublesome is the prevalence of spambots — getting comments on posts that have 0 visits according to Google Analytics is a bit annoying. At least I’m not getting the level of spam that you are here.

    I think that the trick is to find a way to make the spam comments worthless while still rewarding the conversation makers who post legitimate comments. Every time you throw up a road block to the spammers, they find a way. Plus you run the risk of alienating legitimate readers.
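Several commenters above point to followed author links as the incentive behind this kind of spam, and suggest either marking them nofollow or dropping the link altogether. As a rough illustration of what that choice looks like when a comment is rendered, here is a small sketch. The `author_link` function and its `follow` flag are hypothetical and not part of WordPress or any plugin mentioned here; only the `rel="nofollow"` attribute itself is standard.

```python
from html import escape


def author_link(name, url, follow=False):
    """Render a commenter's name, linked to their URL if one was given."""
    safe_name = escape(name)
    if not url:
        # Delinked author: show the name as plain text.
        return safe_name
    # rel="nofollow" asks search engines not to count the link as an endorsement.
    rel = "external" if follow else "external nofollow"
    return '<a href="{}" rel="{}">{}</a>'.format(escape(url, quote=True), rel, safe_name)
```

Passing an empty URL is the "redact the URL" option: the comment text stays, the link does not.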



Trackbacks/Pingbacks

  1. […] If you’re new here, you may want to subscribe to my RSS feed. Thanks for visiting! Since automattic spamming of blogs has mostly been reduced to a trickle due to the likes of Akismet, spammers are now individually targeting blog posts with highly relevant, and in many cases highly convincing comments. I moderated and subsequently spammed a comment today that was over a hundred words long, on the pros and cons of one of the themes on our daily theme posts. I thought the comment was a very well written review of theme until I looked closely. The URI of the poster was a refinancing Made For AdSense page. One click of my itchy index finger and it was gone. – – via Weblog Tools Collection. […]

  2. […] Comments has always been a problem with WordPress and just about any other blogging system. Of late it has just gotten worse. […]

  3. […] non-existent on my blog but because Spammers Got more involved! Same issue noted by Mark Gnosh at Weblog Tools Collection is hitting me as well. I have just implemented Simple Spam Filter plugin which does a very good job […]

  4. […] Rickmann left the following comment on the WLTC […]

Obviously Powered by WordPress. © 2003-2013
