Hiding Phishing Attack. Anti-Bot.

Quite a few phishing kits contain special code for anti-bot protection. The attackers clearly don't want their attack to be discovered, so when a search engine crawler or a security bot comes along, the kit hides its presence. This is what a typical phishing kit looks like.

$ ll ChaseClean/login/
total 224
drwxrwxr-x 5 white hat   4096 Jul  9  2019 ./
drwxrwxr-x 3 white hat   4096 Nov 14  2020 ../
drwxrwxr-x 2 white hat   4096 Jul  9  2019 XBALTI/
-rw-rw-r-- 1 white hat  31152 Jan 13  2020 antibot.php
-rw-rw-r-- 1 white hat  22809 Jul  9  2019 auth.php
-rw-rw-r-- 1 white hat 146173 Jun  1  2019 dashboard.php
-rw-rw-r-- 1 white hat   2759 Aug 11  2018 index.php
drwxrwxr-x 2 white hat   4096 Jul  9  2019 js/
drwxrwxr-x 6 white hat   4096 Jul  9  2019 style/

If we open antibot.php we'll see that this defence mechanism might be quite elaborate.

The easiest way to spot a bot is by its User-Agent HTTP header. This is an example of the most primitive strategy: every common bot that visits the page gets a fake "Internal Server Error".

if(preg_match('/bot|crawler|spider|facebook|alexa|twitter|curl/i', $_SERVER['HTTP_USER_AGENT'])) {
    logger("[BOT] {$_SERVER['REQUEST_URI']} - 500");
    header('HTTP/1.1 500 Internal Server Error');
    exit();
}

Alternatively, some kits redirect the visitor to the legitimate website that the kit impersonates.
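For analysts who want to see which clients the User-Agent check above would hide the page from, the pattern can be replayed offline. The following is a minimal sketch in Python; the function name `would_be_cloaked` and the sample User-Agent strings are illustrative assumptions and are not taken from the kit.

```python
import re

# Pattern copied from the PHP snippet above; re.IGNORECASE mirrors the /i flag.
BOT_PATTERN = re.compile(
    r"bot|crawler|spider|facebook|alexa|twitter|curl", re.IGNORECASE
)


def would_be_cloaked(user_agent: str) -> bool:
    """Return True if the kit's check would answer this User-Agent with the fake 500."""
    # preg_match() searches anywhere in the string, which re.search() does as well.
    return BOT_PATTERN.search(user_agent) is not None


# Illustrative User-Agent strings (assumptions, not observed traffic).
SAMPLES = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "curl/8.4.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

for ua in SAMPLES:
    print(would_be_cloaked(ua), ua)
```

Because the match is a plain substring search, any client whose User-Agent happens to contain one of the keywords is treated as a bot, whatever it actually is.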

The User-Agent header is not really reliable, because some bots override it with a value that resembles a browser, so the phishers also rely on IP addresses. A lot of kits contain a huge list of IP addresses or subnets that are considered a threat to the attack.

$bannedIP = array("^94.26.*.*", "^95.85.*.*", "^72.52.96.*", "^212.8.79.*", ... )

if(in_array($_SERVER['REMOTE_ADDR'], $bannedIP)) {
	header("location: https://www.google.com/404");
	exit;
} else {
	foreach($bannedIP as $ip) {
		if(preg_match('/' . $ip . '/',$_SERVER['REMOTE_ADDR'])){
			header("location: https://www.google.com/404");
			exit;
		}
	}
}

These lists can be quite long, containing hundreds of thousands of IP addresses. The code above first tries an exact match against the list and then applies each entry as a regular expression.
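To make the two-stage check above easier to reason about, here is a small Python replay of it. It is a sketch under stated assumptions: it uses only the four entries visible in the excerpt (the rest of the list is elided in the source), and the function names are hypothetical.

```python
import re

# The four entries visible in the excerpt above; the kit's full list is elided
# in the source and is not reproduced here.
BANNED_IP = ["^94.26.*.*", "^95.85.*.*", "^72.52.96.*", "^212.8.79.*"]


def exact_match(remote_addr: str) -> bool:
    """Counterpart of PHP's in_array(): plain string equality with each entry."""
    return remote_addr in BANNED_IP


def regex_match(remote_addr: str) -> bool:
    """Counterpart of the foreach/preg_match() loop: each entry is used as a regex."""
    return any(re.search(pattern, remote_addr) for pattern in BANNED_IP)


def is_banned(remote_addr: str) -> bool:
    """Banned if either stage matches, as in the PHP code."""
    return exact_match(remote_addr) or regex_match(remote_addr)


for addr in ("94.26.10.20", "72.52.96.7", "8.8.8.8"):
    print(addr, "exact:", exact_match(addr), "regex:", regex_match(addr))
```

For wildcard entries like these, the exact-match stage can never fire, since no client address contains `^` or `*`; it only helps for entries that are literal addresses. In the regex stage the dots are not escaped, so each `.` matches any character: `^94.26.*.*` also matches a string such as `94x26`. The entries behave like loose prefix patterns rather than precise subnet masks.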

The next trick in the arsenal is to look up the hostname of the visitor's IP address (a reverse DNS lookup) and trigger the protection if the hostname contains one of a set of keywords.

$blocked_words = array("cyveillance","phishtank","amazonaws","calyxinstitute","tor-exit", ...)

$hostname = gethostbyaddr($_SERVER['REMOTE_ADDR']);

foreach($blocked_words as $word) {
    if (substr_count($hostname, $word) > 0) {
        header("location: https://www.google.com/404");
        exit;
    }
}

The list of "blocked_words" can be quite long as well. As you can see, security companies, hosting providers and even Tor exit nodes are all excluded from the attack.
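The keyword test above is a plain substring search, which can be replayed against any hostname without doing a DNS lookup. The sketch below assumes only the five keywords visible in the excerpt; the function name and the sample hostnames (built on the `example.org` and `203.0.113.0/24` documentation ranges) are hypothetical.

```python
from typing import Optional

# The five keywords visible in the excerpt above; the kit's full list is elided.
BLOCKED_WORDS = ["cyveillance", "phishtank", "amazonaws", "calyxinstitute", "tor-exit"]


def matched_keyword(hostname: str) -> Optional[str]:
    """Return the first blocked keyword contained in hostname, or None."""
    for word in BLOCKED_WORDS:
        # Same test as substr_count($hostname, $word) > 0; both are case-sensitive.
        if word in hostname:
            return word
    return None


# Hypothetical hostnames for illustration only.
SAMPLES = [
    "ec2-203-0-113-10.compute-1.amazonaws.com",
    "tor-exit.example.org",
    "203.0.113.10",
]

for host in SAMPLES:
    print(host, "->", matched_keyword(host))
```

The last sample is a bare IP address on purpose: PHP's `gethostbyaddr()` returns the address unchanged when the reverse lookup fails, so a visitor without a PTR record is not caught by this particular check.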

Yet another trick we've seen being used is JavaScript redirects. Bots often just fetch the content and don't execute JavaScript, and the attackers can exploit this.

<script type="text/javascript">
    window.location = "http://real.deal/phishing.php";
</script>

The interesting advantage of this approach is that the page can even pretend to be a legitimate site if accessed with JavaScript disabled.
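A crawler that does not execute JavaScript can still notice a redirect like the one above by scanning the fetched HTML for assignments to `window.location`. The following Python sketch shows the idea; the regular expression and the function name `find_js_redirects` are assumptions made for illustration, and the pattern only covers plain string assignments.

```python
import re
from typing import List

# Matches window.location = "..." and window.location.href = '...'.
# Obfuscated or dynamically built redirects are not covered by this pattern.
REDIRECT_RE = re.compile(
    r"""window\.location(?:\.href)?\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def find_js_redirects(html: str) -> List[str]:
    """Return every URL that the page assigns to window.location as a string literal."""
    return REDIRECT_RE.findall(html)


# The landing page from the example above.
PAGE = """
<script type="text/javascript">
    window.location = "http://real.deal/phishing.php";
</script>
"""

print(find_js_redirects(PAGE))
```

A static match like this is only a first pass. Following the redirect reliably, especially when the script is obfuscated, generally requires rendering the page in a browser engine that executes JavaScript.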