
How to Defeat Comment Spam

Comment spam is undoubtedly one of the most annoying aspects of running a website. It sucks the joy out of it. Spam is everywhere, and everyone has problems with it. But there is actually a way to solve the problem.

Below you can see 8 ways to defeat comment spam - ranging from the simplest, but also least effective, solutions to some that are very advanced and also highly effective.

Comment spam is for the most part auto-generated. A spammer will write a script that can put out millions of spam comments in a very short time. What you need is, simply put, some obstacles that will prevent the script from running on your site.

The less effective - but simple methods

These work quite well for some people and it definitely doesn't hurt to implement them. But some sites will still be susceptible to spam (mostly high-traffic sites).

1: Remove WordPress, TypePad, MovableType and Blogger tags

Comment spammers are very lazy people, and as such they go for the biggest target - with the least effort. That means they specifically target popular blogging tools like WordPress, TypePad, MovableType and Blogger etc. The way they do this is to simply try to detect if your site runs on any of these things (it only takes one line of code to do).

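What they look for is things like the generator meta tag and the "Powered by WordPress" line in the footer that most templates include by default.

Remove these and you have just removed all the lazy spammers.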

Note: This only works if your site is not already targeted (and hence on the spammers' hotlist).
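As a rough illustration of how little effort this detection takes, here is a minimal Python sketch. The two patterns (a generator meta tag and a "Powered by" footer) and the function name detect_platform are assumptions made for the example, not taken from any real spam tool.

```python
import re
from typing import Optional

# Assumed fingerprints: a generator meta tag and a "Powered by ..." footer.
GENERATOR_TAG = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)
POWERED_BY = re.compile(r'powered by\s+(?:<a[^>]*>)?\s*(\w+)', re.IGNORECASE)


def detect_platform(html: str) -> Optional[str]:
    """Return the platform name a page advertises, or None if it hides it."""
    for pattern in (GENERATOR_TAG, POWERED_BY):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None
```

Running something like this against your own pages shows what a script would see. Once the tags are gone, it returns None.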

2: Rename your form and its elements

Another thing you need to do is to change your comment form. In the past people said that you should change the "action URL", but that doesn't work anymore. The spammer can detect your new URL with something as simple as a regular expression that reads the action attribute of your form.
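A minimal sketch of that kind of detection, in Python. The pattern and the name find_form_action are illustrative assumptions; the code simply reads the action attribute of the first form on the page, whatever the URL has been renamed to.

```python
import re
from typing import Optional

# Matches the action attribute inside the first <form ...> tag.
FORM_ACTION = re.compile(r'<form[^>]*\baction=["\']([^"\']*)["\']', re.IGNORECASE)


def find_form_action(html: str) -> Optional[str]:
    """Return the action URL of the first form in the page, if any."""
    match = FORM_ACTION.search(html)
    return match.group(1) if match else None
```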

What you need to do is to change every part of your comment form: the ID and NAME attributes of all your elements, the action URL - everything. Do not call your website field "website", and do not call your email field "email". Call it something like "joesfish" or "hubba26rrtdh2".

What this does is make it much harder for the spammer to write his scripts (remember, they aim for quantity, not quality).
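One way to do this without hand-editing templates is to let the server pick the names. The sketch below is only an illustration under that assumption: the helper names are hypothetical, and you would need to store the alias table yourself so that the same names are used when the form comes back.

```python
import secrets
from typing import Dict, Mapping

REAL_FIELDS = ("name", "email", "website", "comment")


def make_field_aliases(fields=REAL_FIELDS) -> Dict[str, str]:
    """Return {real_name: random_alias}; aliases start with a letter."""
    return {field: "f" + secrets.token_hex(6) for field in fields}


def decode_submission(posted: Mapping[str, str], aliases: Mapping[str, str]) -> Dict[str, str]:
    """Map posted aliases back to real names, ignoring any unknown keys."""
    reverse = {alias: real for real, alias in aliases.items()}
    return {reverse[key]: value for key, value in posted.items() if key in reverse}
```

A script that blindly posts a field called "email" is simply ignored by decode_submission, because that name no longer exists in the form.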

3: Make it look like something else

Spammers can still detect that you have a comment form on your site, simply because it contains 4 active fields - name, email, website and comment. All they really need to do is to detect if your page has 3 active input fields (not including hidden ones) and one textarea.

To stop them from doing that, you can simply add more. Why don't you have 7 input fields of various types, 3 textareas and 2 radio buttons? That will make it look like anything but a comment form.

Of course this will look very messy, but thanks to CSS you can add "display:none;" to those fields that should not be visible to real people.

It sure will make it a lot harder for the spammer to figure out.

The less simple - but more effective solutions

Let's move on to more drastic but also more effective methods. Let's make it really hard for the comment spammers. Both these methods successfully prevent external scripting - but they do not prevent in-page scripting (the kind where the script is executed in a browser on your site - for instance using automated bookmarklets).

4: Change your action URL on form submission

This is something many people have tried, very successfully. The idea is that you add a fake action URL in your form. This obviously causes any spam to be sent to that URL, but since it goes nowhere it simply vanishes into thin air.

Then to make it work for real people, you change the action URL into the correct one when the form is being submitted.

5: Encrypt your form

Another method is to encrypt your entire form - using JavaScript. This will make it look as if your site does not allow commenting in general.

Simply run your form HTML through a JavaScript encrypter like the one from Hivelogic (use the advanced form) - and insert the result instead. You can use the same method to encrypt email addresses to prevent spam in your mailbox.
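To give an idea of what such an encoder does, here is a minimal Python sketch that turns form HTML into a script tag that rebuilds it in the browser. This is a simple character-code shift chosen for illustration, not the algorithm Hivelogic uses, and the function name obfuscate_html and the offset are assumptions. It only handles characters that fit in a single UTF-16 code unit.

```python
def obfuscate_html(html: str, offset: int = 3) -> str:
    """Return a <script> block that writes `html` into the page at load time.

    Each character is stored as its code point plus `offset`, so the raw
    page source contains no recognizable form markup.
    """
    codes = ",".join(str(ord(ch) + offset) for ch in html)
    script = (
        "document.write(String.fromCharCode.apply(null,"
        f"[{codes}].map(function(c){{return c-{offset};}})));"
    )
    return f"<script>{script}</script>"
```

You would paste the returned string into your template where the form used to be. Visitors without JavaScript see nothing, which is the trade-off of this method.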

Note: I use this method on this site - and I do not get any comment spam.

The more advanced - but also the highly effective solutions

Finally, let's move on to some of the more advanced solutions. These will have a much greater effect, essentially eliminating spam completely.

6: Use WYSIWYG, and save with AJAX

The easiest way to get rid of comment spam is to not have a comment form on your site. So get rid of it. You can instead use an iframe in DesignMode to mimic the same visual experience as a form. It is more difficult to make, and you have to know about JavaScript and AJAX.

You will also limit commenting to the latest browsers (old browsers and non-browser devices will not work with this).

7: Replace your form with an image

Another method to remove the form from your interface is to replace it with an image. I know it sounds strange, but let me explain. What you do is insert an image that looks like a comment form, and replace it with an actual form when people click on it.

To the spammer it will look like a page with an image. To real people it is a normal form, because when activated the image is turned into a real form. This will prevent any kind of scripted spam.

Note: Make sure you detect where people click, and set the focus accordingly. If people click on the website area on the image, the image should be replaced with a real form with focus in the website field.

8: Detect keystroke speed

The last thing you can do is to simply detect how fast a comment is written. It generally takes 0.2 seconds to type a character, so you simply detect how long it took to write the full comment and the average pauses between each keystroke.

E.g. 200 characters should take more than 40 seconds to write, with an average keystroke pause of 0.2 seconds.

If it is faster than that, then it is written by a script (thus from a spammer).

Note: You also need to detect when people paste content into your form - for instance when they want to add a link.
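The arithmetic can be sketched on the server side like this, in Python. It assumes your page reports the total character count, the number of pasted characters and the elapsed time along with the comment (hypothetical values that you would have to collect with JavaScript); the 0.2 seconds per character is the figure used above.

```python
SECONDS_PER_CHAR = 0.2  # typing speed assumed in the article


def minimum_typing_seconds(typed_chars: int) -> float:
    """Shortest plausible time for a human to type this many characters."""
    return typed_chars * SECONDS_PER_CHAR


def looks_scripted(total_chars: int, pasted_chars: int, elapsed_seconds: float) -> bool:
    """Flag a comment that was written faster than a human could type it.

    Pasted characters are excluded, since pasting a link takes no typing time.
    """
    typed = max(total_chars - pasted_chars, 0)
    return elapsed_seconds < minimum_typing_seconds(typed)
```

With these numbers, a 200-character comment submitted after 2 seconds is flagged, while the same comment with 150 pasted characters submitted after 15 seconds is not.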


9: Double form Magic (UPDATE)

You might be able to defeat the spammer using a double form (read comment #14). This approach is 100% accessible and with no semantic problems.


What you do NOT want to do

Before we finish, let's take a short look at what you shouldn't do.

CAPTCHA

You could add a CAPTCHA (an image with some, often distorted, text) and require that people type the text in to add a comment. It works fine, but CAPTCHAs are also incredibly annoying. Do not do this - annoying your real visitors is not a good way to deal with spammers (who never see the CAPTCHA anyway).

Register/sign-in

This is another solution that also works quite well. But it is a terrible solution. Forcing people to go through a registration process is not only irritating, but it also removes focus. Do not do this!

Spam filters

Spam filters - like Akismet - are one way that many people try to get rid of spam. But it does not work. As with email spam filters, your genuine comments are sometimes flagged as spam, and spam is sometimes not flagged. Perhaps it does a really decent job 98% of the time, but since you cannot rely on it 100%, you still have to look through it.

It is not a solution; you are still forced to look at spam. Forget about it.

Comments

1

Clazh - Apr. 25, 2007

A really good article. And thanks for pointing out how spammers detect which blog platform it is. I think most free WordPress templates have the meta tag and "Powered by WordPress" in the footer.

2

Thomas Baekdal - Apr. 26, 2007

Thanks Clazh,

On a different note, here is an interview with one of those comment spammers:

http://www.theregister.co.uk/2005/01/31/link_spamer_interview/

3

Håvard Pedersen - Apr. 26, 2007

A lot of these suggestions will seriously degrade the semantics of a page's markup and will make your site break pretty badly without JavaScript.

Another option not mentioned here is client-automated CAPTCHA. What you do is use a normal CAPTCHA but add three hidden fields to the form (two of them get initialized with random data from the server). If JavaScript is present, you take the data from those two hidden fields, process it, store the result in the third field and hide the CAPTCHA stuff from the form. You then check the third field when the form is submitted; if it adds up, you know the comment comes from a browser. If not, you check the CAPTCHA as normal. :)

No semantic havoc and a nice fallback for people without javascript!
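A minimal server-side sketch of this idea, in Python. The "processing" is assumed here to be a plain sum of the two hidden values, and all names are hypothetical; the comment above does not say which operation is used.

```python
import secrets
from typing import Optional, Tuple


def issue_challenge() -> Tuple[int, int]:
    """Pick the two random values that go into the first two hidden fields."""
    return secrets.randbelow(1000), secrets.randbelow(1000)


def expected_answer(a: int, b: int) -> str:
    """What the page's JavaScript is assumed to write into the third field."""
    return str(a + b)


def accept_comment(a: int, b: int, third_field: Optional[str], captcha_ok: bool) -> bool:
    """Accept if the third field adds up; otherwise fall back to the CAPTCHA."""
    if third_field is not None and third_field == expected_answer(a, b):
        return True
    return captcha_ok
```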

4

Thomas Baekdal - Apr. 26, 2007

Håvard, I disagree - apart from perhaps item 3, the only effect is that the comment form will be inaccessible. It will not do any harm to the semantics of the rest of the page.

Take a look at this site - disable JavaScript and see how it looks. The only change is that the comment form disappears. You can also try to disable the CSS (and leave JavaScript on or off). Again, the semantics of this site are unaffected (you can comment with CSS off, but not with JavaScript off).

Thank you for your other suggestion using hidden fields with server data. It is not as effective because the spammer can simply detect what those hidden values are and manipulate the third field, before running his script. It does remove the lazy spammers though.

Your site cannot tell the difference between a script acting as if it was run in a browser - and a real browser (one of the negative sides of JavaScript).

5

Iacovos Constantinou - Apr. 26, 2007

I might be wrong, but in many cases spammers do not even parse a website in order to identify which software is used. In fact, it is more than enough to request common filenames of the most popular blogging software, like http://www.example.com/wp-trackback.php, http://www.example.com/wp-comment.php etc. If any of these files exists, then it is possible to submit a comment or trackback without even parsing a bit of the website.

6

Thomas Baekdal - Apr. 27, 2007

Iacovos, you are somewhat right. Spammers do not need to parse your site to submit a comment (if the action URL is known). This was why many people suggested that you should change your action URL into something else.

But, in the ever-escalating war, the big spammers soon found out that people did this, so now they are parsing the site to get the correct URL.

I do not think they parse the site on every request, only the initial one (or if the known action URL returns a 404 error).

7

Doug - Apr. 27, 2007

I'm a big fan of your usability related posts, but I find this one way off the mark. Sure, you can weed out a lot of bot activity by relying on JavaScript, but I am willing to bet that you will disable more legitimate users' ability to comment than Akismet will tag false positives. And to top it off, Akismet has a method to report false positives to them, so they can improve their filters.

There was recently an excellent CF Meetup about CF Form Protect that covered multiple points of detection in a weighted formula (no one point of failure). http://cfformprotect.riaforge.org/

In the end, if you want accurate spam protection and not inconvenience your users (with requiring JavaScript, logins, CAPTCHAs, etc), you will need to have some sort of spam management interface.

8

Doug - Apr. 27, 2007

Oh and your regex for changing the action of the form says font instead of form :-)

9

Doug - Apr. 27, 2007

I meant detecting the form action, not changing. I'll shut up now!

10

Thomas Baekdal - Apr. 27, 2007

Doug, The next two articles will be about Usability :)

The problem with spam filters is that you are forced to look through everything that is filtered out - in fear that some legitimate and useful comments have been flagged as spam. That means you, as a site owner, are still burdened by it.

The readers do not care about that, but I know from experience that if the owner feels that his/her time is wasted, it will have a big impact on the quality of the site. Spam sucks all the joy out of you - and that has a serious impact on this quality.

Fighting spam is as much about sparing the site owner the inconvenience as it is about sparing the readers.

As for how many people will not be able to comment due to the use of JavaScript (keeping in mind that this only applies to the more advanced solutions), I do agree that it presents a theoretical problem. It is theoretical because we do not know how many people that is. It could be only the ones browsing using a mobile phone or it could be many more.

Until we know how many that is, we cannot determine if it is a problem.

The last figure I saw, from January 2007, indicated that 94% have JavaScript turned on. But the same survey stated that only 94.3% were using a desktop browser. That is interesting, because to me that means that only 0.3% have deliberately turned JavaScript off - the rest cannot use it since they use other devices.

The real question now is: how many of the 6% would be able to comment using a normal form? Keep in mind that they browse using devices other than a desktop browser. My guess is - not that many.

In any case, if I am to choose between not having to deal with spam and making my site 100% accessible, accessibility loses. Keep in mind that these solutions only affect being able to comment, not using any other parts of the site.

11

Thomas Baekdal - Apr. 27, 2007

Ohhh... thanks for pointing me to the typo (that is what happens when you post late one evening) :)

12

Jesper Rønn-Jensen - Apr. 28, 2007

Numbers 2 and 4 don't work for us anymore. We are hit by 1000-1500 spam comments each day. We renamed and changed the form action and submit URL almost a year ago.

But late in the summer, comment spammers found a way through, just as you mention above.

Thanks for this brilliant writeup. I'd like you to go more into accessibility details -- especially those accessibility issues that prevent some of your users from submitting comments.

13

Thomas Baekdal - Apr. 28, 2007

Jesper, thanks!

There are indeed many accessibility issues with this, in that some people will not be able to submit a comment. These are generally people who:

  • have turned JavaScript off (very few)
  • browse using mobile devices
  • use alternative browsers
  • need screen readers

Is that a problem - yes, it is. But another question is whether it is a big enough problem if the alternative is that I have to live with spam (in your case a staggering 1000-1500 spam comments a day). This is mostly up to the individual to determine.

I have personally decided that to comment on this site you need Javascript. I really hate spam.

14

Thomas Baekdal - Apr. 28, 2007

I just received an email from a person who wrote that he did not understand how I could promote inaccessible solutions like the ones above - when I am a usability advocate.

I agree. I would really like to come up with a solution that both defeats spam - and is accessible at the same time. That would be really cool... but there is a slight problem. You see, in order to be accessible you cannot use these things:

  • Javascript (or any kind of scripting)
  • Cookies, many devices do not support them
  • Referer info, most phones and some screen readers do not send referer info
  • CSS3 hacks, in which you use CSS to include vital elements of the form using the :after selector

That pretty much leaves us with links and 100% standard form elements.

The only possible approach I can think of, within those constraints, is to do this:

On your blog pages you add a hidden field with the comment ID-code + a single form button named "Add a comment" (or something similar) - and use the post method. You then remove the comment form completely from this page.

On the page that the first form is submitted to, you add a normal comment form - and use the comment ID from the previous form.

You also set up the second page to validate that it has received a comment ID-code from the previous form (using the form request - not referer info).

This will require that the spammer detects that the first "simple form" is a gateway to the comment form. He then has to detect the form ID, and submit that using the designated action URL. Secondly, he has to retrieve the output from the second page, fill in the form elements and submit that form.

This should stop the spammer, because he would need to create a special script just for your site. And it is 100% accessible (no JavaScript, CSS, cookies or referers are used - only standard and semantic HTML).

It is a bit tricky, and somewhat theoretical - but it might work.
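A minimal sketch of the two pages, in Python and independent of any web framework. The URLs, the field name comment_id and the set of known IDs are all hypothetical; the only point is that the second page refuses to show a comment form unless the first form's ID arrived in the posted data.

```python
from html import escape
from typing import Mapping, Optional

# Hypothetical set of posts that accept comments.
KNOWN_POST_IDS = {"post-42"}


def render_gateway(post_id: str) -> str:
    """First page: a hidden comment ID and one button, no comment fields."""
    safe_id = escape(post_id, quote=True)
    return (
        '<form method="post" action="/comment-form">'
        f'<input type="hidden" name="comment_id" value="{safe_id}">'
        '<input type="submit" value="Add a comment">'
        "</form>"
    )


def render_comment_form(posted: Mapping[str, str]) -> Optional[str]:
    """Second page: show the real form only if the gateway's ID was posted."""
    post_id = posted.get("comment_id", "")
    if post_id not in KNOWN_POST_IDS:
        return None  # no valid ID from the first form, so no comment form
    safe_id = escape(post_id, quote=True)
    return (
        '<form method="post" action="/comment-save">'
        f'<input type="hidden" name="comment_id" value="{safe_id}">'
        '<input type="text" name="name">'
        '<textarea name="comment"></textarea>'
        '<input type="submit" value="Post comment">'
        "</form>"
    )
```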

15

Jonathan - Apr. 30, 2007

Before I moved from using Drupal to WordPress, I was getting about 30-50 spams a day. I then moved to WordPress and the spam went to zero for about three weeks, then suddenly ramped up to about 30-50 again.

I therefore concluded that spammers use at least a semi-manual process of hooking you up to their spam runs. So this article would not be very useful if I'm right.

BTW In the past 12 months, I have had almost 100% successful filtering with Akismet in Wordpress, with perhaps one or two legitimate comments put into the approval queue. I'm not complaining about that at all.

16

M. Kuikens - Apr. 30, 2007

I might slightly disagree about the captcha.

Since it's used so much nowadays people have accepted the need to type over some code to prevent the web from being spammed to death.

When I look at this entry form I have to guess where to put my name, mail, URL and comment; that's not very visitor friendly either, I think.

However, if the bots in the future find a way to deal with CAPTCHA, all hell will break loose on the web...

17

Doug - May. 1, 2007

Interesting results on the no-JS crowd compared to the non desktop browser crowd. I would never have guessed the numbers were so close.

As I mentioned, there are many different anti-spam discussions going on on various blogs and different techniques being discussed. But if you are committed to stopping spam but never actually moderating comments, then you will have to implement solutions that alienate a small percentage of users. My advice then is to make it apparent to the user in advance that they cannot post. So rendering the whole form by JavaScript might be acceptable in that case. The worst thing is to allow a user to type in their comment and then tell them it can't be posted because of some referer check or something (one should never use the referer for anything anyway, since many "security" packages such as Norton alter this unbeknownst to the user).

CAPTCHA plain sucks. I implemented it once before going on vacation and came back to less spam but lots of angry comments about how annoying it was. I didn't find it too bad myself, but after a week or so I began seeing many codes that I had to reenter and agreed it was very annoying. And there are manual spammers out there who will put up with it longer than your readers who are not getting paid to comment. In the end, those people are the bane of your existence after you've spent all that time thinking you have stopped spam :-)

18

Thomas Baekdal - May. 3, 2007

Jonathan, what most likely happened for you was that the spammers got back errors when spamming your site (after you changed platform). The three weeks was the time it took them to get a look at their logs, and activate the WordPress script instead.

M. Kuikens, the way I see CAPTCHAs is pretty much the same way as when someone at the office (accidentally) "lets out a wind". You can live through it, but it is annoying as hell.

Doug, I agree.

19

Ploum - May. 23, 2007

Here's my solution, which works perfectly for me:

http://ploum.frimouvy.org/?150-the-invisible-captcha-mechanism-icm-against-form-spam

20

Ploum - May. 23, 2007

Link broken because of the "?" mark. Here it is: http://tinyurl.com/3arrjc

21

mde - Jun. 12, 2007

I saw you yesterday...

22

Thomas Baekdal - Jun. 12, 2007

Where? :)

23

mde - Jun. 12, 2007

In town

24

Thomas Baekdal - Jun. 12, 2007

Uhh... okay. At the staff sale? (I did not look at all the people standing in line - I was in a hurry to get home...)

25

Anonymous - Jun. 12, 2007

Yep...

I was not exactly drawing attention to myself either...

Anyway - I am breaking the rule now - I just wanted to say hi!

26

Thomas Baekdal - Jun. 12, 2007

He he - well, I am not one to talk too loudly about rules :)

But I hope you are doing well.

27

Anonymous - Jun. 12, 2007

For the most part! And likewise...

28

Thomas Baekdal - Jun. 12, 2007

Thanks :o)

(now I am going to eat ice cream number 2 today)

29

Mr Eddy - Jul. 28, 2007

Could you discuss Ploum's method, explain it because it's a bit too complicated for me. Thanks

30

Figs - Sep. 9, 2007

I don't like the javascript solutions because I sometimes like to be able to turn javascript off. The captcha or JS trick sounds reasonable -- gradual degradation -- but would the extra forms affect people using screen readers? (Obviously, you'd need a different trick, like audio captcha alternatives... but that's not exactly new.)

31

Jabapyth - Sep. 9, 2007

You said in a comment that "if I am to choose between not having to deal with spam and making my site 100% accessible, accessibility loses", but you said that CAPTCHAs are a bad idea because they are bad for the user experience.

They are very effective, and while the user experience is somewhat compromised, isn't that a reasonable sacrifice? Most people are used to CAPTCHAs, and while they are annoying, they have become acceptable.

32

Thomas Baekdal - Sep. 9, 2007

Jabapyth, people do not accept CAPTCHA's because they have grown to like them - they accept them because they are forced to.

Forcing someone to use CAPTCHAs is like a dictator forcing a population into submission. Everyone will accept it if they do not have a choice, but everyone also wishes for the day that they regain their freedom.

I do not think it is a reasonable sacrifice :)

BTW: Accessibility and usability are, in my opinion, not the same. Usability is the ability to use an object, something I think is extremely important - often more important than anything else. Accessibility is the "level" of devices that, for example, your website can be used on.

My level of accessibility is: All devices can see the content. Only devices with javascript can add comments (about 0.3%).

33

Simon G. - Sep. 9, 2007

Interesting article, thanks. One thing related to your "rename form elements" suggestion: after renaming them, set up a honeypot field called, e.g., "email" or "website" (the type of thing that a spam crawler will look for), set it to display:none, use autocomplete="off", and fill it with something like "DO NOT CHANGE THIS VALUE".

When you validate the form, if this has changed, then reject it.

--Simon
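The validation step described in this comment could look roughly like this on the server, sketched in Python. The field name and the sentinel text follow the comment; the function name and the decision to treat a missing field as spam are assumptions.

```python
from typing import Mapping

HONEYPOT_FIELD = "email"  # bait name a spam crawler is likely to fill in
HONEYPOT_VALUE = "DO NOT CHANGE THIS VALUE"


def honeypot_tripped(posted: Mapping[str, str]) -> bool:
    """Return True if the hidden bait field was changed or left out.

    A real browser submits the hidden field untouched; a script that fills
    in every "email" field overwrites the sentinel value.
    """
    return posted.get(HONEYPOT_FIELD) != HONEYPOT_VALUE
```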

34

fdb - Sep. 10, 2007

I second simon's suggestion. I am already using it on my site, and it has proven very effective.

35

Maxim - Sep. 10, 2007

Super duper!

Especially the fact that you can see the effect of these rules on this very page!

Thanks for the post.

36

Merlin - Sep. 10, 2007

For blogs running under Dotclear, there is a plugin called "spamalgame": http://dev.becohouse.be/trac/spamalgame/wiki

It sends a JavaScript function (customizable) into the form. This code will compute some value and put it in a field of the form. When the comment is submitted, the server checks that the value corresponds to what was given in the page and accepts the comment.

If the users have disabled JavaScript, they see a login form. They'll still be able to post if they are registered.

I had tens of comment spams per day before and since I installed it about one year ago, I had absolutely NO spam at all. And my users don't notice anything.

37

auston - Sep. 10, 2007

In response to the blog and comments, you could:

-Put an image link that pulls a form via ajax

-put rel=nofollow tags on all links so they retain no value to spammers.

-check the HTTP Referer (make sure your blog is the one posting)

-disable HTML

38

Merlin - Sep. 10, 2007

Sorry for the multiple posts but when you submit the comment with referers disabled (Web developer toolbar), you get an error, believe your comment was refused but it wasn't...

39

Thomas Baekdal - Sep. 10, 2007

Merlin, The comment system doesn't work when referer info is disabled.

BTW: The comment should actually have been refused - but I had made a small error in the component that validates the comment. Fixed now.

40

Adam - Sep. 12, 2007

I must disagree with your comments about Akismet - I've found it tremendously effective. I've had 2 false negatives and 0 false positives so far, with ~1400 spams blocked. YMMV I guess. :)

41

Thomas Baekdal - Sep. 12, 2007

My point about Akismet was not really its effectiveness, but that its very presence makes you - as the site owner - look at it. If you got a spam filter, you always end up looking in your spam folder to see if something went wrong.

I want an anti-spam system that filters out spam so effectively that I never have to even consider looking at it. On this site I do just that. I don't have a "comment spam folder", so I never look at spam - neither do my visitors.

42

Ignas - Sep. 13, 2007

Another method not mentioned in site:

A CSS-hidden field that, if populated, marks the submission as spam. A normal user is not able to see or enter text in this field, while robots happily do.

43

Sven Fuchs - Sep. 14, 2007

Simon G. + Ignas,

I'm using an expansion of the CSS hidden field technique where the email field is transparently renamed to something else and an extra fake honeypot email field is inserted. After the form has been posted back, the plugin will detect data in the bait email field.

Result: 100% of spam caught for approx. 6 months now ...

See http://www.artweb-design.de/projects/mephisto-plugin-inverse-captcha-for-comments-anti-spam for an implementation as a Mephisto plugin

44

Armin Fritsch - Nov. 14, 2007

Hi

(First, I am sorry for my bad English.)

I am currently trying another solution for comments/guestbooks.

I add an extra field to the normal comment fields. This field asks a question, for example:

You are a:

and the answer options are:

human,

computer,

spammer

On the submit page I get the value of this field. If it is not the right one (human), I do not write anything to the database...

If it is right, it is OK.

I know this "trick" only works for me because it is adapted to the site (so the answers are not human, computer... they are more site-specific) and it is just one single site.

But I think that if you do stuff like this, it is not that easy for the spammers to get you, because they will look for other websites that they can spam automatically.

45

Max - Mar. 24, 2008

I used Akismet and this one! Like them both! Thanx for the site!

46

David - Apr. 12, 2008

Hello people! Thanks for the great article, it's very helpful. I hate spam as much as you do! I don't have my own blog but I have a mailbox that was filled with spam some time ago. I set up a Gafana.com account and I got rid of spam like forever. It uses great filters that don't let unsolicited mail come to my inbox and I feel extremely happy about that.

47

Design for interaction - Apr. 18, 2008

I have a similar experience as 'Max' with Akismet, it worked fine but as you pointed out Thomas, you start looking at Akismet all the time. It is always difficult to find the right balance between an open and user friendly website and preventing Spam. Nice article!

 

Published: Apr. 25, 2007 in Technology


Thomas Baekdal

Thomas Baekdal is a Writer, Interaction Designer, Change Advocate and Project Manager.
