How to filter web traffic with blocklists

Create your own HTTP blocklist filter to block unwanted web traffic.
Published on Sunday, 27 May 2012


Block and filter unwanted HTTP traffic with blocklists, on both IIS and Apache. This PHP blocklist class makes it easy to protect your website. Let's create your own HTTP blocklist filter.

Update: renamed "blacklist" to "blocklist" as much as possible without changing code.

My need for an HTTP blocklist and why you need one too

As a systems administrator for web hosting company Vevida, I deal with unwanted HTTP traffic on a daily basis. Customer websites are hacked, comment forms, blogs and forums are flooded with spam, and so on.

There are numerous ways of blocking unwanted visitors, like forum or comment spammers and spam or hack bots, from your website. One of those methods is to filter traffic using existing blocklists. There are quite a few blocklists available, but as far as I know most are intended for SMTP traffic, not HTTP.

And I needed an HTTP blocklist.

Two HTTP blocklists that do exist are Project Honey Pot and StopForumSpam. A lot of "offending" IP addresses are listed in these blocklists, so why not use them? Both offer Application Programming Interfaces (APIs) to query their databases, and both publish lists of community-contributed implementations and plugins.

One thing I missed (or couldn't find) is an implementation that uses both blocklist databases and supports the creation of a local blocklist. This local blocklist reduces remote look-ups and decreases the load on both external databases.

So I decided to try to create such an implementation.

The blocklist can be stored either in a web.config file, for use with the Windows Server IIS IP Address and Domain Restrictions module, or in a flat text file, for use with .htaccess files. Both Helicon Ape and Apache mod_rewrite support a RewriteMap directive that can read such a text file.

This makes the blocklist, the code and its usage cross-platform!

As said, we'll be using Project Honey Pot and StopForumSpam.

About Project Honey Pot

Project Honey Pot is the first and only distributed system for identifying spammers and the spambots they use to scrape addresses from your website. Using the Project Honey Pot system you can install addresses that are custom-tagged to the time and IP address of a visitor to your site. If one of these addresses begins receiving email we not only can tell that the messages are spam, but also the exact moment when the address was harvested and the IP address that gathered it.

Project Honey Pot was created by Unspam Technologies, Inc - an anti-spam company with the singular mission of helping design and enforce effective anti-spam laws.

Project Honey Pot's http:BL has been implemented on a number of different web servers, content management systems, blogging platforms, and forums. These systems query the http:BL servers for visitors to your site and restrict their access if they are found to be malicious. See the available implementations for WordPress, Drupal, Typo3, ASP.NET / IIS, phpBB and so forth.
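
To make the http:BL mechanism concrete: a look-up is an ordinary DNS query for the access key, followed by the reversed IP address, under dnsbl.httpbl.org. The answer is an address whose octets encode the result. The sketch below follows my reading of the http:BL API documentation. The function names and the placeholder key are hypothetical and are not part of the class further down.

```php
<?php
// Hypothetical helper names; only the query and answer layout come from
// the http:BL API documentation.

// Build the DNS name to query: <key>.<reversed octets>.dnsbl.httpbl.org
function httpbl_query_name( string $accessKey, string $ip ): string {
  $reversed = implode( '.', array_reverse( explode( '.', $ip ) ) );
  return $accessKey . '.' . $reversed . '.dnsbl.httpbl.org';
}

// Decode an answer such as "127.3.20.5". Returns null when the string is
// not a valid http:BL answer (for example when the IP is not listed).
function httpbl_decode( string $answer ): ?array {
  $octets = explode( '.', $answer );
  if ( count( $octets ) !== 4 || $octets[0] !== '127' ) {
    return null;
  }
  $type = (int) $octets[3];   // bit set describing the visitor type
  return array(
    'days_since_last_activity' => (int) $octets[1],
    'threat_score'             => (int) $octets[2],
    'search_engine'            => $type === 0,
    'suspicious'               => ( $type & 1 ) === 1,
    'harvester'                => ( $type & 2 ) === 2,
    'comment_spammer'          => ( $type & 4 ) === 4,
  );
}
```

In a live setup the answer would come from gethostbyname( httpbl_query_name( $key, $ip ) ). That function returns the host name unchanged when the look-up fails, which httpbl_decode() treats as "not listed".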

About StopForumSpam

We provide lists of spammers that persist in abusing forums and blogs with their scams, ripoffs, exploits and other annoyances. We provide these lists so that you don't have to endure the never ending job of having to moderate, filter and delete their rubbish.

We provide a "free for use" site where you can check registrations and posts against our database. We list known forum and blog spammers, including IP and email addresses, usernames, how busy they are and, in some cases, evidence of their spam.

Do not use StopForumSpam as a firewall to your website!
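
The StopForumSpam API answers an IP query with a small structured document. The class later in this post reads the "appears" field under "ip" from the serialized PHP format. The sketch below reads the same field from a JSON answer instead, because calling unserialize() on remote data is best avoided. The function name is hypothetical, and the payload shape is an assumption that mirrors the fields the class uses.

```php
<?php
// Hypothetical helper: true when a StopForumSpam JSON answer marks the IP
// as listed. Assumes a payload shaped like {"success":1,"ip":{"appears":1}}.
function sfs_ip_appears( string $json ): bool {
  $data = json_decode( $json, true );
  if ( !is_array( $data ) || empty( $data['success'] ) ) {
    return false;   // malformed answer or failed query: treat as "not listed"
  }
  return isset( $data['ip']['appears'] ) && (int) $data['ip']['appears'] === 1;
}
```

Anything that is not a well-formed, successful answer counts as "not listed" here, so an outage at the remote side never locks out visitors.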

How to: Create a local HTTP web blocklist in PHP

Use Project Honey Pot and Stop Forum Spam to block spammers on your website. Here we'll create a local web blocklist of IP addresses we don't want to hit our website. The code is PHP, but you can port it to ASP.NET (C#, VB.NET), classic ASP, Perl or whatever you want.

Since Apache's mod_rewrite offers the same functionality as Helicon Ape, this blocklist can be used cross-platform, on both Windows Server IIS and Linux Apache.

The logic is as follows:

  1. A visitor visits your website.
  2. The IP address of the visitor is looked up in the local blacklist.txt file and (if set up) web.config file.
    • IP address found? Block the visitor and log the visit attempt.
    • IP address not found? Look up the IP address in the StopForumSpam and Project Honey Pot databases.
  3. If the IP address is found in either one of the databases:
    • write the IP address to the local blacklist.txt file and (if set up) web.config file
    • log the visit attempt
    • block the visitor

When the blacklist.txt file is in use, there is nothing more you have to do. A visitor will automatically receive an HTTP 403 Forbidden response from IIS.
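
The steps above can be sketched as a single function. This is only an illustration of the control flow, not the class itself: the function name, the in-memory array standing in for blacklist.txt, and the callables standing in for the remote look-ups are all hypothetical.

```php
<?php
// Sketch of the decision flow; all names here are hypothetical.
// $localList    : IP addresses already in the local blocklist
// $remoteChecks : callables that take an IP and return true when it is
//                 listed in a remote database
function decide_visit( string $ip, array &$localList, array $remoteChecks ): string {
  if ( in_array( $ip, $localList, true ) ) {
    return 'blocked-local';     // step 2: known offender, no remote look-up
  }
  foreach ( $remoteChecks as $check ) {
    if ( $check( $ip ) === true ) {
      $localList[] = $ip;       // step 3: remember the address for next time
      return 'blocked-remote';
    }
  }
  return 'allowed';
}
```

The point of the local list shows up on the second visit from the same address: it is answered from $localList and the remote databases are not queried again.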

Let's block us some visitors!

PHP web blocklist filter:

The code below is commented; the downloadable code is not. I've also created readme documentation (PDF, opens in a new window). Beware: I'm not a die-hard PHP programmer, so the code might be (is...) sloppy.

<?php
setlocale( LC_ALL, 'nld_nld' );

/*
 * ( c ) 2012 - Jan Reilink - jan@saotn.nl
 *  follow me on Twitter: @Jan_Reilink
 *  donate: https://www.paypal.me/jreilink
 * 
 * Yep, I host my PHP sites on Windows Server IIS : )
 * @ Vevida ( https://vevida.com )
 * 
 * PHP class for querying IP addresses against the Project Honey Pot and 
 * Stop Forum Spam databases.
 * 
 * - An Access Key is mandatory!
 * http://www.projecthoneypot.org/create_account.php
 * - Project Honey Pot appreciates a donation
 * http://www.projecthoneypot.org/donate.php
 * API information
 * http://www.projecthoneypot.org/httpbl_api.php
 * 
 * About Project Honey Pot ( PHP )
 * http://www.projecthoneypot.org/about_us.php
 * 
 * The same goes for Stop Forum Spam ( SFS )
 * http://www.stopforumspam.com/usage
 * http://www.stopforumspam.com/donate
 * http://www.stopforumspam.com/keys
 * 
 * 
 * Files used:
 * database\config.ini : with configuration settings
 * database\bl_log.txt : to log blocked requests
 * database\blacklist.txt : the blocklist with which Helicon Ape ( or
 *   mod_rewrite ) RewriteMap directive works
 * 
 * www\httpbl.class.php : the PHP class which does all the magic
 * www\web.config : IIS 7.0 / 7.5 configuration file for use with IIS 
 *   IP Address and Domain Restrictions. Needs to be writable
 * www\.htaccess : configuration file for IIS Helicon Ape or Apache 
 *   mod_rewrite. Needs to be writable
 * 
 * Example usage:
 * 
 * require_once( "httpbl.class.php" );
 * $mybl = new httpBL() ;
 * if( $mybl->_retrieve_IP_address_status_PHP( 
 *   $mybl->_retrieve_remote_IP_address()  ) )
 * {
 *      $output = "<html><head><title>Unauthorised access</title></head>";
 *      $output .= "<body><h1>Unauthorised access</h1><h2>Access from ";
 *      $output .= "your IP address to this website is prohibited!</h2>";
 *      $output .= "<span>Contact the webmaster if you believe this";
 *      $output .= " is an error.</span></body></html>";
 *      header( "HTTP/1.0 403 Forbidden" );
 *      die( $output );
 *  }
 * 
 */

Class httpBL {
  // CHANGE THE PATH ON THE LINE BELOW!
  public $configfile = "/path/to/your/config.ini";
  public $ini_array;
  
  public function _read_config_file() {
    return $this->ini_array = parse_ini_file( $this->configfile, true );
  }

  public function _check_required_params() {
    $this->_read_config_file() ;
    
    if( ( $this->ini_array["PHPaccesskey"] != "" ) &&
      ( $this->ini_array["SFSaccesskey"] != "" ) &&
      ( $this->ini_array["blfileloc"] != "" ) ) {
      return TRUE;
    } else {
      trigger_error( "not all required variables are filled out." );
      exit;
    }
  }
  
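  public function _retrieve_remote_IP_address() {
    // Note: X-Forwarded-For is supplied by the client and can be spoofed;
    // only rely on it when the site sits behind a proxy you control.
    $ip = ( isset( $_SERVER['HTTP_X_FORWARDED_FOR'] ) ? $_SERVER['HTTP_X_FORWARDED_FOR'] : $_SERVER['REMOTE_ADDR'] );
    return $ip;
  }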

  /**
   * Only for public IP v4 address space
   * http://www.php.net/manual/en/filter.filters.validate.php
   **/
  public function validate_IP_address( $ip ) {
    if( filter_var( $ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE ) !== FALSE ) {
      // IPV4 === TRUE
      return TRUE;
    } else {
      return FALSE;
    }
  }

  /**
   * Note that the IP address being queried should be sent in the reversed 
   * octet format. In other words, "127.1.1.7" should become "7.1.1.127" 
   * for all DNS queries. For more detailed information, please see the 
   * http:BL API ( http://www.projecthoneypot.org/httpbl_api.php ).
   **/
  public function _reverse_octet_format( $ip )
  {
    // another, old, method: return preg_replace( "/(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/",'${4}.${3}.${2}.${1}',$ip );
    return implode( ".", array_reverse( explode( '.', $ip ) ) );
  }

  /**
   * Look up IP address using DNS to retrieve its status with
   * Project Honey Pot ( PHP )
   * We can also validate IP addresses against different blocklists:
   *   dns_get_record( $revip.".cbl.abuseat.org", DNS_A ); or
   *   zen.spamhaus.org, but that's beyond the scope of this Class
   **/
  public function _retrieve_IP_address_status_PHP( $ip ) {
    if( $this->_check_required_params()  == TRUE ) {
      if( !$this->validate_IP_address( $ip ) ) {
        //trigger_error( "not a valid IP address" );
        return FALSE;
      }
      if( !$this->check_is_current_listed( $ip,'webconfig' ) && ( !$this->check_is_current_listed( $ip,'htaccess' ) ) ) {
        $lookup = $this->ini_array["PHPaccesskey"] .".".implode( ".", array_reverse( explode ( ".", $ip ) ) ) .".dnsbl.httpbl.org";
        $result = explode( ".", gethostbyname( $lookup ) );
        if( !empty( $result ) && ( $result["0"] == "127" ) ) {
          // $result[1] = days since last activity, $result[2] = threat score;
          // compared with minDayinBl ( 2 ) and minThreatLevel ( 5 ) from config.ini
          if( ( $result["1"] >= $this->ini_array["minDayinBl"] ) &&
            ( $result["2"] >= $this->ini_array["minThreatLevel"] ) ) {
            $this->save_positive_result( $ip );
            $this->save_any_listing_file( $ip );
            return TRUE;
          }
        }
      } else {
        $this->save_any_listing_file( $ip );
        return TRUE;
      }
    }
    else {
      trigger_error( "not all required variables are filled out." );
      exit;
    }
  }

  /**
   * function to retrieve the IP status from Stop Forum Spam
   * $ip : IP address in *normal* octet format - ( variable )
   * result is either: listed in Stop Forum Spam and now added to our 
   * own blocklist, or already listed in our own little blocklist
   **/
  public function _retrieve_IP_address_status_SFP( $ip ) {
    if( $this->_check_required_params()  == 1 ) {
      if( !$this->validate_IP_address( $ip ) ) {
        return FALSE;
      }
      if( !$this->check_is_current_listed( $ip,'webconfig' ) && ( !$this->check_is_current_listed( $ip,'htaccess' ) ) ) {
        if( $this->http_GET( $ip, 'serial' ) == TRUE ) {
          $this->save_positive_result( $ip );
          $this->save_any_listing_file( $ip );
          return TRUE;
        }
      } else {
        $this->save_positive_result( $ip );
        $this->save_any_listing_file( $ip );
        return TRUE;
      }
    } else {
      trigger_error( "not all required variables are filled out." );
      exit;
    }
  }
  
  /**
   * $url    : HTTP URL to Stop Forum Spam - ( fixed )
   * $ip    : IP address in *normal* octet format - ( variable )
   * $format  : serialized ( serialize()  or JSON ) - ( variable, 
   *         serialize()  is used as default )
   **/
  public function http_GET( $ip, $format='serial' ) {
    $ch = curl_init() ;
    $url = "http://www.stopforumspam.com/api?ip=".$ip."&f=".$format."";
    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    curl_setopt( $ch, CURLOPT_HEADER, 0 );
    $output = curl_exec( $ch );
    curl_close( $ch );
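    // curl_exec() returns FALSE on failure; treat that as "not listed"
    if( $output === FALSE ) {
      return FALSE;
    }
    $result = unserialize( $output );
    if( is_array( $result ) && isset( $result["ip"]["appears"] ) && $result["ip"]["appears"] == 1 ) {
      return TRUE;
    }
    return FALSE;
  }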
  
  /**
   * function to save a positive listing result in our own blocklist
   * $ip : IP address in *normal* octet format - ( variable )
   * 
   * Helicon Ape uses .htaccess, in combination with 
   * a blacklist.txt file ( RewriteMap ):
   * 
   * 108.59.10.145 -
   * 
   * More information:
   * http://www.saotn.org/htaccess-as-web-application-firewall-waf/
   * http://helicontech.blogspot.com/2009/02/isapirewrite-faq.html
   * 
   * Should also work for Apache's mod_rewrite RewriteMap
   * http://httpd.apache.org/docs/2.4/mod/mod_rewrite.html#rewritemap
   * http://httpd.apache.org/docs/2.4/rewrite/access.html#host-deny
   **/
  public function save_positive_result( $ip ) {
    if( file_exists( $this->ini_array["blfileloc"] ) &&
      is_writable( $this->ini_array["blfileloc"] ) ) {
      if( $this->check_is_current_listed( $ip, 'htaccess' ) != TRUE ) {
        $f = fopen( $this->ini_array["blfileloc"], "ab" );
        fwrite( $f, $ip ." -\r\n" );
        fclose( $f );
        clearstatcache() ;
      }
    } else {
      trigger_error( $this->ini_array["blfileloc"] ." not writable!" );
    }

    if( file_exists( $this->ini_array["webconfigFile"] ) &&
      is_writable( $this->ini_array["webconfigFile"] ) ) {
      if( !$this->check_is_current_listed( $ip, 'webconfig' ) ) {
        $this->_save_web_dot_config_positive_result( $ip );
      }
    } else {
      trigger_error( $this->ini_array["webconfigFile"] ." not writable!" );
    }
  }
  
  /**
   * function to see if it is already listed
   * $ip : IP address in *normal* octet format - ( variable )
   * result is either TRUE ( listed in own blocklist ) or FALSE ( not listed )
   **/
  public function check_is_current_listed( $ip, $file='' ) {
    if( $file == '' || $file == 'htaccess' ) {
      if( file_exists( $this->ini_array["blfileloc"] ) ) {
        $f = file_get_contents( $this->ini_array["blfileloc"] );
        if( strstr( $f, $ip ." -" ) != FALSE ) {
          return TRUE;
        }
      }
    }

    if( $file == 'webconfig' ) {
      if( ( $this->ini_array["use_webconfigFile"] == "1" ) && ( file_exists( $this->ini_array["webconfigFile"] ) ) ) {
        $f = file_get_contents( $this->ini_array["webconfigFile"] );
        // the inner double quotes must be escaped
        $sstring = "ipAddress=\"".$ip."\" allowed=\"false\"";
        if( strpos( $f, $sstring ) !== FALSE ) {
          return TRUE;
        }
      }
    }
    return FALSE;
  }

  /**
   * saves any hit of blocked IP addresses to a log file
   **/
  public function save_any_listing_file( $ip ) {
    if( file_exists( $this->ini_array["bllogfileloc"] ) &&
      is_writable( $this->ini_array["bllogfileloc"] ) ) {
      $f = fopen( $this->ini_array["bllogfileloc"],"ab" );
      fwrite( $f, date( "Y.m.d.G:i" ) ." - " . $ip ."\r\n" );
      fclose( $f );
      clearstatcache() ;
    } else {
      trigger_error( "could not find logfile "
        .$this->ini_array["bllogfileloc"] );
    }
  }

  /**
   * IIS 7 / 7.5 IP Address and Domain Restrictions format
   * uses web.config for configuration, see my WordPress web.config
   * for more information: /posts/my-wordpress-web-config/
   * 
   * <system.webServer>
   *   <security>
   *     <ipSecurity>
   *       <add ipAddress="aa.bbb.ccc.dd" allowed="false" />
   *     </ipSecurity>
   *   </security>
   * </system.webServer>
   **/
  public function _save_web_dot_config_positive_result( $ip ) {
    $formatxml = PHP_EOL;
    $formatxml .= "        <add ipAddress=\"".$ip."\" allowed=\"false\" />";

    $doc = new DOMDocument() ;
    if( $doc->load( $this->ini_array["webconfigFile"] ) === false  ) {
      return false;
    }
    $xpath = new DOMXPath( $doc );
    $iprestrictions_nodes = $xpath->query( '/configuration/system.webServer/security/ipSecurity[starts-with( @add,"ipAddress" )]' );
    if( $iprestrictions_nodes->length > 0 ) {
      return true;
    }

    $xmlnodes = $xpath->query( '/configuration/system.webServer/security/ipSecurity' );
    if ( $xmlnodes->length > 0 ) {
      $ipsecurity_node = $xmlnodes->item( 0 );
    } else {
      $ipsecurity_node = $doc->createElement( 'ipSecurity' );
    
      $xmlnodes = $xpath->query( '/configuration/system.webServer/security' );
      if( $xmlnodes->length > 0 ) {
        $security_node = $xmlnodes->item( 0 );
        $security_node->appendChild( $ipsecurity_node );
      } else {
        $security_node = $doc->createElement( 'security' );
        $security_node->appendChild( $ipsecurity_node );
        
        $xmlnodes = $xpath->query( '/configuration/system.webServer' );
        if( $xmlnodes->length > 0 )
        {
          $system_webServer_node = $xmlnodes->item( 0 );
          $system_webServer_node->appendChild( $security_node );
        } else {
          $system_webServer_node = $doc->createElement( 'system.webServer' );
          $system_webServer_node->appendChild( $security_node );

          $xmlnodes = $xpath->query( '/configuration' );
          if ( $xmlnodes->length > 0 ) {
            $config_node = $xmlnodes->item( 0 );
            $config_node->appendChild( $system_webServer_node );
          } else {
            $config_node = $doc->createElement( 'configuration' );
            $doc->appendChild( $config_node );
            $config_node->appendChild( $system_webServer_node );
          }
        }
      }
    }

    $rule_fragment = $doc->createDocumentFragment() ;
    $rule_fragment->appendXML( $formatxml );
    $ipsecurity_node->appendChild( $rule_fragment );

    $doc->encoding = "UTF-8";
    $doc->formatOutput = true;
    $this->saveDomDocument( $doc, $this->ini_array["webconfigFile"] );

    return true;
  }

  function saveDomDocument( $doc, $filename ) {
    $config = $doc->saveXML() ;
    // convert bare LF line endings to CRLF
    $config = preg_replace( "/([^\r])\n/", "$1\r\n", $config );
    $fp = fopen( $filename, 'wb' );
    fwrite( $fp, $config );
    fclose( $fp );
  }
}
?>

Download full source package, with all files and directory structure here. Text version of the class is here.

The code is sloppy, might be faulty, is provided AS-IS, is probably not secured and should not be used in production without testing! But it does work. Use at your own risk.
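
One concrete example of such a fault: check_is_current_listed() searches blacklist.txt for a substring, so a search for "1.2.3.4 -" also matches a line holding "11.2.3.4 -". A minimal sketch of an exact, line-by-line comparison follows, assuming the "IP -" line format that save_positive_result() writes. The function name is hypothetical.

```php
<?php
// Hypothetical alternative to the substring search: compare whole entries.
// $contents is the text of blacklist.txt, one "IP -" entry per line.
function blocklist_contains( string $contents, string $ip ): bool {
  foreach ( preg_split( '/\r\n|\r|\n/', $contents ) as $line ) {
    // The first whitespace-separated token on each line is the IP address.
    $parts = preg_split( '/\s+/', trim( $line ) );
    if ( $parts[0] === $ip ) {
      return true;
    }
  }
  return false;
}
```

For a large blocklist, reading the whole file on every request becomes the bottleneck either way; the comparison itself is cheap.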

I hope you find this PHP class helpful in protecting your website from unwanted visitors. If you do, then please donate me a beer or cup of coffee!

Frequently Asked Questions

I do not give support on how to use this class. The code is heavily commented and tested in production. Because I am not a PHP programmer, there is much to improve in the code. Please do so.

If you have an urgent question, please post it as a comment to this post. I'll try to answer it, and visitors are also free to respond of course.

Check an IP address against arbitrary blocklists, using PHP - Bonus

You can easily shorten the code above to check an IP address against one or two arbitrary blocklists, for example dnsbl.sorbs.net and b.barracudacentral.org, or zen.spamhaus.org.

If the IP is listed in either: deny website access. Here's an example class, you'll recognize some of the function names from above.

<?php
Class httpBL {
  public function _retrieve_remote_IP_address()  {
    $ip = ( isset( $_SERVER['HTTP_X_FORWARDED_FOR'] ) ? 
      $_SERVER['HTTP_X_FORWARDED_FOR'] : $_SERVER['REMOTE_ADDR'] );
    return $ip;
  }

  public function validate_IP_address( $ip ) {
    if( filter_var( $ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE ) !== false ) {
      return true;
    } else {
      return false;
    }
  }

  public function _retrieve_IP_address_status_PHP( $ip ) {
    if( !$this->validate_IP_address( $ip ) ) {
      return false;
    } else {
      $lookup1 = implode( ".", array_reverse( explode ( ".", $ip ) ) ) .".dnsbl.sorbs.net";
      $lookup2 = implode( ".", array_reverse( explode ( ".", $ip ) ) ) .".b.barracudacentral.org";
      $result1 = explode( ".", gethostbyname( $lookup1 ) );
      $result2 = explode( ".", gethostbyname( $lookup2 ) );
      if( !empty( $result1 ) && ( $result1["0"] === "127" ) ) {
        error_log( $ip ." is listed in dnsbl.sorbs.net" );
        return true;
      }
      if( !empty( $result2 ) && ( $result2["0"] === "127" ) ) {
        error_log( $ip ." is listed in b.barracudacentral.org" );
        return true;
      }
    }
    // not listed in either blocklist
    return false;
  }
}

$mybl = new httpBL();
if( $mybl->_retrieve_IP_address_status_PHP( $mybl->_retrieve_remote_IP_address() ) ) {
  $output = "<html><head><title>Unauthorised access</title></head>";
  $output .= "<body><h1>Unauthorised access</h1><h2>Access from "; 
  $output .= "your IP address to this website is prohibited!</h2>";
  $output .= "<span>Your IP address ". $mybl->_retrieve_remote_IP_address() ." is listed in either dnsbl.sorbs.net or b.barracudacentral.org. Contact the webmaster if you believe this";
  $output .= " is an error. This is only a temporary test.</span></body></html>";
  header( "HTTP/1.0 403 Forbidden" );
  die( $output );
}
?>
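
The two hard-coded look-ups generalise to any number of DNSBL zones. Here is a minimal sketch, assuming each zone answers with an address in 127.0.0.0/8 when an IP is listed. The function name and the injectable resolver are hypothetical additions that allow the logic to run without network access.

```php
<?php
// Hypothetical generalisation: check one IPv4 address against a list of
// DNSBL zones and return the zones that list it.
// $resolver defaults to gethostbyname(); pass another callable for testing.
function dnsbl_listed_in( string $ip, array $zones, ?callable $resolver = null ): array {
  $resolver = $resolver ?? 'gethostbyname';
  $reversed = implode( '.', array_reverse( explode( '.', $ip ) ) );
  $hits = array();
  foreach ( $zones as $zone ) {
    $answer = $resolver( $reversed . '.' . $zone );
    // A listed IP resolves to 127.x.y.z; gethostbyname() returns the
    // queried name unchanged when there is no record.
    if ( filter_var( $answer, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) !== false
      && strpos( $answer, '127.' ) === 0 ) {
      $hits[] = $zone;
    }
  }
  return $hits;
}
```

A call such as dnsbl_listed_in( $ip, array( 'dnsbl.sorbs.net', 'b.barracudacentral.org' ) ) then replaces the two separate look-ups, and a non-empty result means "deny access".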

Hope this helps! :)