If you’re running a WordPress site with the Apache web server and haven’t looked at your server logs, it’s time you did. Unless you’ve installed some fairly heavyweight server and network protections you’re going to see that probably most of the traffic to your web site are hackers looking for vulnerabilities they can exploit. Log entries like this:
116.193.76.165 - - [17/Nov/2020:00:55:37 -0500] "GET //wp-login.php HTTP/1.1" 200 6032 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0" 116.193.76.165 - - [17/Nov/2020:00:55:40 -0500] "POST //wp-login.php HTTP/1.1" 200 6131 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0" 116.193.76.165 - - [17/Nov/2020:00:55:42 -0500] "POST //xmlrpc.php HTTP/1.1" 403 3523 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0" 107.180.88.41 - - [17/Nov/2020:00:55:46 -0500] "GET //wp-login.php HTTP/1.1" 200 6032 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0" 107.180.88.41 - - [17/Nov/2020:00:55:47 -0500] "POST //wp-login.php HTTP/1.1" 200 6131 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0" 107.180.88.41 - - [17/Nov/2020:00:55:48 -0500] "POST //xmlrpc.php HTTP/1.1" 403 3523 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
While the image of a dateless incel with poor hygiene and greasy hair and nothing better to do than break into your hobby site may pop into your head, in fact most such random web server hacks are usually done by robot scripts. “Script kiddies”, with possibly no real computer skills, download script kits off hacker site. These bot scripts do the dirty work for them, for which these wannabe terrorists believe they earn bragging rights when they’re successful. One script with enough bandwidth could hit thousands of web sites before lunch. Don’t feel “special”. They’re just probing for site vulnerabilities that smarter brains than theirs can exploit.
Securing a website isn’t a monolithic thing. There are too many potential attack vectors for an all-in-one solution but there are dozens of things you can do, from securing directories writable by the web server user to sanitizing user input to carefully vetting any interactive software you host for anonymous users. I’ll get into some of them later but what I like best is snuffing these assholes before they get to your content. Presenting: the Apache kill file.
Blocking the IP addresses of known offenders can be done more efficiently with either or software or hardware firewall but I’ve found it more convenient to do this with Apache because it gives me an easily accessible log of who’s being bounced. On a busier site though you’d want to take the load off your web server software and look into a dedicated firewall. You may already have one in your operating system or on your router. Ubuntu for instance comes with an easily configured software firewall, ufw.
Apache has a couple of server directives to make hosting an IP or domain-based blackhole list pretty simple. Here it is, nestled inside Apache’s <Directory></Directory> token pair. It could also live inside Location tags.
<Directory /your/physical/web/directory/filepath> Options -Indexes <RequireAll> Require all granted Include conf/your_banned_ip_list.txt </RequireAll> </Directory>
The actual IP ban list is a text file that lives inside the /conf subdirectory in the Apache config directory, typically /etc/apache2/conf. Plant your unwanted IPs there. You specify one IP address in CIDR notation per line. CIDR allows you to ban any subset of the IP address: the Class C, B, A. If you don’t know how to construct CIDR, fear not. There are online utilities that will do it for you. Just hand it the range if IP addresses to be punted from your site.
Require not ip 46.118.0.0/16 Require not ip 46.119.0.0/32 ...
What this says is, “Grant everyone except the IP addresses in conf/your_banned_ip_list.txt to my local web server directory, /your/physical/web/directory/filepath. They will be bounced with an Apache 403 (permission denied) error when they try to access anything in that directory.
114.119.147.60 - - [07/May/2021:08:34:05 -0400] "GET /some/file HTTP/1.1" 403 284
So you’re probably wondering which IP addresses to ban from your web site. That’s up to you. There are prefab online lists of IP addresses of the usual suspects but I don’t recommend them because the most dangerous hackers rarely hack from their own networks but on victim servers they’ve already compromised. You could be blackholing an innocent party and only mildly inconveniencing a hacker.
What I do is look for patterns of hacker behavior in the logs, such as something repeatedly requesting /wp-login.php. If you see four or five of those in a row it usually indicates that someone is trying to crack your account credentials (you did remember not to use “admin” as your admin username, right?) Also lots of failed requests within a very short span of time, especially for files you don’t have on your site. One can be reasonably certain it’s a hacker script if you see something like this. No human types this fast.
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.git/config HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.env HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /sftp-config.json HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.ftpconfig HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.remote-sync.json HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.vscode/ftp-sync.json HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /.vscode/sftp.json HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /deployment-config.json HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /ftpsync.settings HTTP/1.1" 404 279 92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /wp-includes/wlwmanifest.xml HTTP/1.1" 404 279
Be careful though: legitimate search engine crawlers exhibit behavior similar to this. You can tell the difference because they typically identify themselves in your logs as “googlebot” and “bingbot” and they don’t choke a server with a tommy gun of requests but pad their requests with several seconds between them [1]. Hacker scripts have no such manners.
Hackers also have favorite targets on WordPress sites, like /xmlrpc.com. For 95+% of WordPress sites this PHP file should be disabled and only re-enabled if WordPress complains during a software update. It’s a dangerous legacy utility which is why hackers target it.
I’ll get into more security tips later. In the meantime I can’t recommend more highly WordPress security plugins like Wordfence and iThemes. The free versions are pretty awesome. Wordfence clued me into a hacker who’d “owned” an Apache-writable directory buried deep in my uploads hierarchy and had created a mini-site for spamming.
- One newer search engine, Huawei’s AspiegelBot a/k/a Petalbot, is an exception. It slammed my server with so many requests that I had to block it. It’s pissed off a lot of web admins.