Custom Spam Filter For Contact Form 7

I like Contact Form 7. It's been around forever, it's lightweight and it's simple to set up. It's perfect when you just need to throw a contact form onto your site.

However, it's not so good at spam filtering.

I usually use Honeypot for Contact Form 7, which helps, but a fair few bots still get through.

Contact Form 7 has some spam filtering, if you use the Disallowed Comment Keys in WordPress admin. However, it matches partial words rather than whole words, so if you add "press", it will mark form entries as spam that contain the word "WordPress" or "express". That's far too much to have to think about for each word you enter.
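To see the difference, here is a small standalone sketch in plain PHP (no WordPress needed). Both function names are made up for this illustration: the first imitates the partial matching described above, the second the whole-word matching I'm aiming for.

```php
<?php
// Hypothetical helper: partial (substring) matching, as described above.
function iw_demo_substring_match( $needle, $text ) {
  return false !== stripos( $text, $needle );
}

// Hypothetical helper: whole-word matching.
function iw_demo_whole_word_match( $needle, $text ) {
  // Lowercase, then split on anything that is not a letter or a digit.
  $words = preg_split( '/[^a-z0-9]+/', strtolower( $text ), -1, PREG_SPLIT_NO_EMPTY );
  return in_array( strtolower( $needle ), $words, true );
}

$message = 'I need help with my WordPress site.';

var_dump( iw_demo_substring_match( 'press', $message ) );      // bool(true)
var_dump( iw_demo_whole_word_match( 'press', $message ) );     // bool(false)
var_dump( iw_demo_whole_word_match( 'wordpress', $message ) ); // bool(true)
```

With partial matching, an innocent message about WordPress gets caught by "press". With whole-word matching, only the exact word does.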

You could also hook it up to Akismet or add reCAPTCHA, but I hate captchas from both a usability and an accessibility point of view.

I also wanted the spam failure to be invisible. Essentially I wanted it to appear to the spammer that the form had succeeded, but not send any mail. That way they can't train their bots. And if they hired people to fill out the form, they also would see the success and move on.

So I figured I'd build my own spam filtering solution.

Here is the code I came up with:

// Contact Form 7 Spam Filter
add_filter( 'wpcf7_spam', 'iw_dont_mark_spam', 20 );
function iw_dont_mark_spam() {
  return false;
}

add_filter( 'wpcf7_skip_mail', 'iw_skip_spam' );
function iw_skip_spam() {
  $submission = WPCF7_Submission::get_instance();

  // We're skipping spam check later. This makes Honeypot for CF7 work.
  if ( defined( 'HONEYPOT4CF7_PLUGIN' ) ) {
    if ( true == honeypot4cf7_spam_check( false, $submission ) ) {
      return true;
    }
  }

  $form_data = implode(
    ' ',
    wpcf7_array_flatten( $submission->get_posted_data() )
  );

  // Auto spam any Russian characters (the u modifier treats the pattern as UTF-8)
  if ( preg_match(
      '/[БбГгДдЁёЖжЗзИиЙйКкЛлПпФфЦцЧчШшЩщЪъЫыЬьЭэЮюЯя]/u',
      $form_data
  ) ) {
    return true;
  }

  $form_data = preg_replace( '/[^a-z0-9]+/i', ' ', strtolower( $form_data ) );
  $form_data = preg_replace( '/\s+/', ' ', $form_data );
  $form_data = explode( ' ', $form_data );

  // From Settings -> Discussion -> Disallowed Comment Keys
  $bad_words = get_option( 'disallowed_keys' );

  if ( empty( $bad_words ) ) {
    return false;
  }

  $bad_words = explode( "\n", trim( $bad_words ) );

  foreach ( $bad_words as $word ) {
    $word = trim( strtolower( $word ) );
    if ( strlen( $word ) < 3 ) {
      continue;
    }
    if ( in_array( $word, $form_data, true ) ) {
      return true;
    }
  }

  return false;
}

You can put that into the functions.php file of your child theme, or build it into a plugin.

Let's break that down, so you can customize it to your own usage.

Basic Overview of Spam Filter

We start by overriding any of the spam filters marking an entry as spam. Remember we want this to kill the spam silently.

That's the purpose of iw_dont_mark_spam() and setting it to priority 20. If you're finding this doesn't work, increase the priority. As both the default spam filters and the Honeypot for Contact Form 7 filters are set to priority 10, I didn't need any higher.
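If the priority numbers feel abstract, this toy model may help. It is not WordPress code: `Iw_Toy_Hook` is a made-up class that only imitates the idea that callbacks run from the lowest priority number to the highest, each one receiving the previous one's return value.

```php
<?php
// Toy stand-in for a filter hook (hypothetical, for illustration only).
class Iw_Toy_Hook {
  private $callbacks = array();

  public function add( $callback, $priority = 10 ) {
    $this->callbacks[ $priority ][] = $callback;
  }

  public function apply( $value ) {
    ksort( $this->callbacks ); // Lower priority numbers run first.
    foreach ( $this->callbacks as $group ) {
      foreach ( $group as $callback ) {
        $value = call_user_func( $callback, $value );
      }
    }
    return $value;
  }
}

$hook = new Iw_Toy_Hook();

// Registered first, but runs last because of its higher priority number.
$hook->add( function ( $spam ) { return false; }, 20 );
// Stands in for a spam check at the default priority of 10.
$hook->add( function ( $spam ) { return true; }, 10 );

var_dump( $hook->apply( false ) ); // bool(false)
```

The priority 10 callback says "spam", but the priority 20 callback has the last word, which is exactly what we want from iw_dont_mark_spam().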

Notice that this spam filter is hooked into wpcf7_skip_mail? That allows us to do our spam checks and fail the mail silently, while appearing to succeed.

We run our Honeypot check inside the iw_skip_spam() function, because we're skipping it in its default place. If you don't use Honeypot, you don't need to add that.

We automatically fail any form entries that contain Russian characters. I'm finding a lot of spam has them.

We then pull in the Disallowed Comment Keys, run them against the words in the form entry and fail the entry if we find one.

This checks against whole words, stripping out punctuation and capitalization for the checks. It won't affect the formatting of the form entry.

Detailed Overview of Each Section

Let's break the code down even further, in case any of the sections don't make sense, or you want to understand exactly which parts you need to customize for your individual use.

Skip Spam Check

add_filter( 'wpcf7_spam', 'iw_dont_mark_spam', 20 );
function iw_dont_mark_spam() {
  return false;
}

This function tells Contact Form 7 that whatever the results of any other spam checks, you don't want any form entries to be marked as spam.

If you want it to be marked as spam, and build on any other checks, you might put all of your code in this function (called something else) instead of hooking it into the wpcf7_skip_mail filter.

If you did so, you would need to do it as:

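add_filter( 'wpcf7_spam', 'iw_spam_filter', 20, 2 );
function iw_spam_filter( $spam, $submission ) {
  // Your own checks on $submission go here; return true to mark it as spam.
  return $spam;
}

In that case, $spam carries the verdict of any filters that ran before yours. As far as I can tell, in current versions of Contact Form 7 the second argument is the submission object, so that's the one you would run your checks against. Returning $spam unchanged keeps the earlier verdict.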

Otherwise, you won't need to touch anything in this function.

Setup Spam Filter

add_filter( 'wpcf7_skip_mail', 'iw_skip_spam' );
function iw_skip_spam() {
  $submission = WPCF7_Submission::get_instance();
}

We're hooking our spam filtering function into wpcf7_skip_mail. Returning true skips the mail silently, and returning false lets it send.

$submission holds the current submission, which is the form entry that was just posted. There are other ways of getting at it, but I found this way to be the most reliable.

Run Honeypot Check

if ( defined( 'HONEYPOT4CF7_PLUGIN' ) ) {
  if ( true == honeypot4cf7_spam_check( false, $submission ) ) {
    return true;
  }
}

The Honeypot for Contact Form 7 plugin works by hooking into wpcf7_spam. We're overriding the result of that filter, so we have to run its check here.

To make sure we're only running this if the honeypot plugin is installed and activated, we wrap it in:

if ( defined( 'HONEYPOT4CF7_PLUGIN' ) ) {}

If honeypot4cf7_spam_check finds spam, we want to automatically fail this spam check with return true. Otherwise, we continue with the checks.

Turn the whole form entry into a string

$form_data = implode(
  ' ',
  wpcf7_array_flatten( $submission->get_posted_data() )
);

We're doing a few things in this statement. I've spread it over several lines to fit the layout of this blog, but in my code it's all on one line.

We start by getting the form data with $submission->get_posted_data().

Using wpcf7_array_flatten, we turn it into a simple array, if necessary. Unless you have a complicated form, it will probably be a simple array at this point anyway.

Finally, we turn the whole form into a single string, with each field separated by a space, with implode().
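Here is a rough standalone sketch of those three steps with made-up form data. `iw_demo_flatten()` is a hypothetical stand-in I wrote for this example so that it runs without Contact Form 7; it is not the plugin's own function.

```php
<?php
// Hypothetical stand-in for a flattening helper: collects every value
// from a nested array into one flat list.
function iw_demo_flatten( array $input ) {
  $output = array();
  array_walk_recursive(
    $input,
    function ( $value ) use ( &$output ) {
      $output[] = $value;
    }
  );
  return $output;
}

// Made-up posted data: a checkbox group arrives as a nested array.
$posted = array(
  'your-name'    => 'Jane',
  'your-message' => 'Hello there',
  'interests'    => array( 'design', 'seo' ),
);

$form_data = implode( ' ', iw_demo_flatten( $posted ) );
echo $form_data; // Jane Hello there design seo
```

The field names are thrown away and only the values survive, which is all the spam checks need.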

Automatically fail if it contains Russian

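if ( preg_match( '/[БбГгДдЁёЖжЗзИиЙйКкЛлПпФфЦцЧчШшЩщЪъЫыЬьЭэЮюЯя]/u', $form_data ) ) {
  return true;
}

I included all the upper and lower case Russian letters that don't look like letters of the Latin alphabet. If you have other special characters that you want to automatically fail, add them between the square brackets. The u after the closing slash tells PHP to read the pattern as UTF-8. Without it, the characters are compared byte by byte, and the check can trip on other non-ASCII characters such as accented letters.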

Using preg_match, we look through the form for anything that matches a character in the list and return true to skip the mail if they are found.
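One thing worth testing in your own copy is how the pattern treats non-ASCII text that isn't Russian. As far as I understand PHP's regex engine, a pattern without the u modifier is matched one byte at a time, so a class full of two-byte Cyrillic letters can also fire on things like "é". The sketch below compares both variants on made-up sample strings; save the file as UTF-8.

```php
<?php
// Same character class as above, with and without the u modifier.
$letters   = 'БбГгДдЁёЖжЗзИиЙйКкЛлПпФфЦцЧчШшЩщЪъЫыЬьЭэЮюЯя';
$with_u    = '/[' . $letters . ']/u';
$without_u = '/[' . $letters . ']/';

var_dump( preg_match( $with_u, 'Привет, как дела?' ) );      // int(1)
var_dump( preg_match( $with_u, 'Meet me at the café' ) );    // int(0)
var_dump( preg_match( $without_u, 'Meet me at the café' ) ); // int(1), a false positive

// Alternative, assuming your PHP build supports Unicode script properties:
// match any Cyrillic character, including the Latin lookalikes.
var_dump( preg_match( '/\p{Cyrillic}/u', 'Привет' ) );       // int(1)
```

If you get genuine enquiries with accented names or curly quotes, the u modifier is the difference between receiving those emails and silently dropping them.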

Strip punctuation and turn the form into an array of words

$form_data = preg_replace( '/[^a-z0-9]+/i', ' ', strtolower( $form_data ) );
$form_data = preg_replace( '/\s+/', ' ', $form_data );
$form_data = explode( ' ', $form_data );

Here we are stripping all punctuation. If you wanted to run a check against email, you would do it before this step.
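As an example of such an email check, here is a minimal sketch that looks for addresses on a blocked domain while the @ and the dots are still intact. The function name and the domain list are made up for this illustration, and the pattern is deliberately loose rather than a full address validator.

```php
<?php
// Hypothetical helper: true if any email address in the text uses a blocked domain.
function iw_demo_has_blocked_email_domain( $text, array $blocked_domains ) {
  // Loose pattern; good enough to pull out candidate addresses.
  preg_match_all( '/[a-z0-9._%+-]+@([a-z0-9.-]+\.[a-z]{2,})/i', $text, $matches );
  foreach ( $matches[1] as $domain ) {
    if ( in_array( strtolower( $domain ), $blocked_domains, true ) ) {
      return true;
    }
  }
  return false;
}

$blocked = array( 'spam.example' ); // Made-up domain list.

var_dump( iw_demo_has_blocked_email_domain( 'Reply to bob@Spam.Example today', $blocked ) ); // bool(true)
var_dump( iw_demo_has_blocked_email_domain( 'Reply to bob@friendly.example', $blocked ) );   // bool(false)
```

In the filter itself, you would call something like this on $form_data before the preg_replace lines and return true on a hit.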

We start by turning any uppercase letters to lowercase with strtolower( $form_data ).

Using preg_replace, we turn anything that's not a letter or a number into a space. You could probably leave out the 0-9, but I wanted to keep my options open.

With preg_replace( '/\s+/', ' ', $form_data ); we're collapsing any runs of whitespace into single spaces. The + in the previous pattern already does most of that work, so treat this line as a safety net that keeps the array in the next step relatively clean.

Using explode(), we're splitting the form into an array of individual words, wherever there is a space.

The reason we're turning it into an array and not just operating on the string is so that we can easily get exact match, rather than partial matching.
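To check what the three lines produce, you can run them on a sample message outside WordPress. `iw_demo_tokenize()` is just a hypothetical wrapper around the same steps, and the message is made up.

```php
<?php
// Hypothetical wrapper around the three normalisation steps shown above.
function iw_demo_tokenize( $form_data ) {
  $form_data = preg_replace( '/[^a-z0-9]+/i', ' ', strtolower( $form_data ) );
  $form_data = preg_replace( '/\s+/', ' ', $form_data );
  return explode( ' ', $form_data );
}

$words = iw_demo_tokenize( 'Hi! Check out my SEO-services, 50% off...' );
print_r( $words );
// The words are: hi, check, out, my, seo, services, 50, off
// followed by one empty string left over from the trailing "..."
```

The empty string at the end is harmless here, because the spam check later skips any bad word shorter than three characters, so nothing can match it.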

Get the Disallowed Words into an array

$bad_words = get_option( 'disallowed_keys' );

if ( empty( $bad_words ) ) {
  return false;
}

$bad_words = explode( "\n", trim( $bad_words ) );

We get the list of bad words from Settings -> Discussion -> Disallowed Comment Keys. If there aren't any, we're finished with the checks.

The words in this list should be whole words, not phrases or partial words, because we're checking for whole words later in the code. If you want to do some partial or phrase matching, you would do it to the string earlier, rather than after you've turned it into an array.
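If you do want phrases, here is one way it could look, working on the string before it is split into words. This is only a sketch: the function name and the phrase list are made up, and it assumes your phrases are written in lowercase with single spaces.

```php
<?php
// Hypothetical helper: true if any phrase occurs in the normalised text.
function iw_demo_contains_phrase( $form_data, array $phrases ) {
  // Same normalisation as the word check, but kept as one string.
  $normalised = trim( preg_replace( '/[^a-z0-9]+/', ' ', strtolower( $form_data ) ) );
  // Pad with spaces so phrases only match on word boundaries.
  $haystack = ' ' . $normalised . ' ';
  foreach ( $phrases as $phrase ) {
    if ( false !== strpos( $haystack, ' ' . $phrase . ' ' ) ) {
      return true;
    }
  }
  return false;
}

$phrases = array( 'link building', 'guest post' ); // Made-up examples.

var_dump( iw_demo_contains_phrase( 'We offer LINK-BUILDING packages', $phrases ) ); // bool(true)
var_dump( iw_demo_contains_phrase( 'Linking buildings together', $phrases ) );       // bool(false)
```

Padding both the text and the phrase with spaces keeps the whole-word behaviour, so "linking buildings" doesn't trigger "link building".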

With explode(), we split the bad words into an array. If you look in the database under disallowed_keys (in wp_options), you will see the list of bad words as a single string. Don't let that fool you: they're invisibly separated by the line ending.
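Here is a small sketch of that split on a made-up stored value. I've deliberately added a stray carriage return and some spaces, in case the list ever reaches the database in a less tidy state, to show why each word is trimmed again later.

```php
<?php
// Made-up value imitating the stored option: one entry per line.
// The second entry ends in \r\n and the third has stray spaces.
$stored = "viagra\ncasino\r\n  Crypto  \n";

$bad_words = explode( "\n", trim( $stored ) );
var_dump( "casino\r" === $bad_words[1] ); // bool(true): the \r is still attached

// The same clean-up the spam check applies to each word.
$clean = array_map(
  function ( $word ) {
    return trim( strtolower( $word ) );
  },
  $bad_words
);
print_r( $clean ); // viagra, casino, crypto
```

Splitting on "\n" alone leaves the carriage return on the word, so without the trim() in the loop, "casino" would never match.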

If you prefer to hard code the bad words, rather than use the Disallowed Comment Keys setting in WordPress, add them as an array to $bad_words. This can be a good idea if you administer the site for a client and you don't want them to have to deal with or edit the bad words.

You might also hard code your bad words here if you want some words to fail the contact form checks that you're fine with allowing in the WordPress comments.

If you want to do both, you might add below the last line:

$more_bad_words = array( 'bad', 'word' );
$bad_words = array_merge( $bad_words, $more_bad_words );

Just replace 'bad', 'word' with an array of the words you want to fail the spam test.

The spam check in the next step stops at the first match, so if you have a long list, put the words that show up most often in your spam near the top.

Check for spam

foreach ( $bad_words as $word ) {
  $word = trim( strtolower( $word ) );
  if ( strlen( $word ) < 3 ) {
    continue;
  }
  if ( in_array( $word, $form_data, true ) ) {
    return true;
  }
}

return false;

In this step, we iterate over the list of bad words individually.

With the first few lines, we clean up any stray spaces or uppercase letters, and skip the word if it's shorter than three characters.

If the word is found in the list of form words, we've identified spam and fail the form without the need for further processing.

If we get to the end of all these checks, then the form is allowed to be sent. So we finish with return false.
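If you'd like to try the word check without submitting a real form, the logic can be pulled into a plain function and run from the command line. `iw_demo_is_spam()` and the word list are made up for this sketch; it mirrors the steps above but takes the bad words as an argument instead of reading them from WordPress.

```php
<?php
// Hypothetical standalone version of the word check, for testing outside WordPress.
function iw_demo_is_spam( $text, array $bad_words ) {
  $words = explode( ' ', preg_replace( '/[^a-z0-9]+/', ' ', strtolower( $text ) ) );

  foreach ( $bad_words as $word ) {
    $word = trim( strtolower( $word ) );
    if ( strlen( $word ) < 3 ) {
      continue; // Too short to be a useful filter word.
    }
    if ( in_array( $word, $words, true ) ) {
      return true; // The first hit is enough.
    }
  }
  return false;
}

$bad_words = array( 'Casino', 'seo', 'of' ); // Made-up list.

var_dump( iw_demo_is_spam( 'Best CASINO bonuses!', $bad_words ) );      // bool(true)
var_dump( iw_demo_is_spam( 'A word of thanks', $bad_words ) );          // bool(false)
var_dump( iw_demo_is_spam( 'I visited Seoul last year', $bad_words ) ); // bool(false)
```

The second message passes because "of" is too short to count, and the third passes because "seo" only appears inside "Seoul", not as a word of its own.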

Mike Haydon
