I’ll be the first to admit that I’m not much of a network guy. Granted, I understand the concepts and can follow along well enough. It’s just that without Cisco gear at my disposal to play with, things get difficult at best (now I have GNS3, which fixes that problem). A few years ago a friend turned me on to MikroTiks and honestly, I haven’t looked back since. It’s just amazing what you can do with these cheap little routers with a few clicks of the mouse (it’s got an awesome GUI). This site currently runs behind one, as does my house and my dad’s. One of the features added in the last year is Layer7 packet inspection (scanning the actual contents of the packet). What this means is I can create regex rules, apply them to different firewall and mangle rules, and have a rule match only if the data in the packet matches the regex. As you ponder the possibilities, you can see how useful this can be. I take full advantage of this, and several other features, to ensure the integrity and speed of the site. As examples, I’ll show you a bit of what I currently do.
Blocking Hack Attempts
An important thing to me is ensuring my sites are up and running and free of exploits. Most of that involves keeping the software up to date, but it also includes some preemptive measures to block attacks (beyond htaccess files and file permissions). I make it a daily habit to scan over my HTTP access and error logs to see what is happening. One of the most common things to see is bots trying to run 10 to 20 different exploits against my forums, bug tracker, wiki, and even this blog. Granted, most of them are old exploits for old versions, but every now and then you see new things added to the list, as if the bots are being updated. So what I do is add L7 regex rules to catch these common exploits and block them before they even hit the server. The bot’s IP address is then black-holed for 24 hours to ensure no other scans from it get through (possibly blocking exploits I didn’t even know existed). On any given day I block a good 200 to 500 bots. While this doesn’t stop new exploits, it does clean up my access logs so that only new types of exploits show up, which helps me discover and block those too. CPU-wise, it doesn’t really seem to affect my router, since only port 80 traffic to the servers (i.e. HTTP requests) is scanned.
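On the MikroTik side, the general shape of this setup looks something like the following. This is a sketch, not my actual rules — the regex, list names, and exact rule placement are illustrative:

```
/ip firewall layer7-protocol
# example pattern for common probe URLs; real rules would list real exploit paths
add name=exploit-probe regexp="^(GET|POST) /+(phpmyadmin|setup\\.php|xmlrpc\\.php)"

/ip firewall filter
# drop anything from a host that has already been black-holed
add chain=forward src-address-list=blackhole action=drop
# match exploit probes against port 80 and black-hole the source for 24 hours
add chain=forward protocol=tcp dst-port=80 layer7-protocol=exploit-probe \
    action=add-src-to-address-list address-list=blackhole address-list-timeout=24h
```

Note that `add-src-to-address-list` doesn’t itself drop the matching packet; the first probe may still reach the server, but every subsequent packet from that source hits the drop rule above it.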
Port Knocking

One of the first things I did was disable all remote access to SSH, Webmin, and the other services I have running. SSH was set to run on an alternate port, but even then I still saw a few attempts to log in as root (which is disabled as well). Being able to easily SSH into my server to fix things was still a necessity, though. Even though I have a VPN set up, having to use it was a hassle (and I can’t fix things from a PC with no VPN client), so I devised a solution using port knocking. The concept is simple enough: you hit specific ports in a specific order within a specific amount of time, and access opens up. With the MikroTik, this can be expanded to be much more complex, and it can be done with no extra equipment or server (unlike most implementations). To use it securely, and to ensure no random port scan will trigger it, I do several things. First, instead of using a single IP, I spread the knock across my entire IP block, even onto IPs with no server listening on them. Second, the UDP ports are randomly spread across these IPs so that nothing is in natural order. Other ports are set up as traps that null the attempt: if you happen to hit one of these bad ports, your knock has to start over. These trap ports do not respond and are blocked by the firewall (just like every other port), so you can’t tell whether you hit a right port or a wrong one. Next, I have implemented L7 filters on each port, so not only do you have to hit the right ports on the right IPs in the right order, the data in each packet has to match exactly. Last of all, the whole knock has to happen within 1 second. Once your knock is complete, you are added to an address list that is given access to the administrative ports for 2 minutes. I have also set up a specific UDP port you can send a packet to (with L7 matching) that resets the timer.
This is done so that when you are done working, your access is removed fairly quickly. From here, I could easily expand it to require more than the 7 ports it currently uses, but I feel that would be overkill. I could also mix UDP and TCP ports.
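A stripped-down version of the knock chain might look like this. The IPs (from the 203.0.113.0/24 documentation range), ports, payloads, and list names are all placeholders, and I’ve shortened it to three stages; the trap-port “start over” trick takes a few extra rules and is only noted in a comment here:

```
/ip firewall filter
# stage 1: right IP, right UDP port, right payload starts the sequence
add chain=input protocol=udp dst-address=203.0.113.10 dst-port=1111 \
    layer7-protocol=knock-pay1 \
    action=add-src-to-address-list address-list=knock-1 address-list-timeout=1s
# stage 2: only hosts that completed stage 1 within the last second may advance
add chain=input protocol=udp dst-address=203.0.113.11 dst-port=2222 \
    src-address-list=knock-1 \
    action=add-src-to-address-list address-list=knock-2 address-list-timeout=1s
# final stage: grants the admin list for 2 minutes
# (the real setup also has trap ports between stages that void the attempt)
add chain=input protocol=udp dst-address=203.0.113.12 dst-port=3333 \
    src-address-list=knock-2 \
    action=add-src-to-address-list address-list=admin-ok address-list-timeout=2m
# keep-alive: re-adding to the list refreshes the 2 minute timeout
add chain=input protocol=udp dst-port=5555 src-address-list=admin-ok \
    action=add-src-to-address-list address-list=admin-ok address-list-timeout=2m
# finally, let knocked hosts reach the admin ports
add chain=input protocol=tcp dst-port=22 src-address-list=admin-ok action=accept
```

Each stage’s 1-second `address-list-timeout` is what enforces the “whole knock within 1 second” requirement, and the keep-alive rule works because re-adding an address simply restarts its timeout.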
Now, for my remote access: I carry a USB drive around with a compiled executable (it’s an AutoIt script) that does all the port knocking for me. Once opened, it sits in the system tray and resets the timer every minute until closed. This way, from any computer, I can quickly get remote access to my servers and easily shut that access off when done. I bundled PuTTY, WinSCP, and Portable Firefox with it to make it even more convenient.
Bandwidth Management

To ensure that my site is always responsive, even when someone is slurping down my CD, I have implemented priority queues and download limits. Since my hardware is currently graciously hosted by a local datacenter and I am given a set amount of bandwidth, I want to use it wisely and not allow one resource to adversely affect all my sites. My sites only use about 10 to 20 GB of bandwidth a day, so it’s not an excessive amount of traffic, but even so I like things to run smoothly. So I have broken everything into categories: SSH, normal website HTTP traffic, torrents (for my CactiEZ CD), HTTP downloads of my CD, and everything else.
In order of priority: SSH gets unlimited bandwidth and the highest priority. Website traffic can take up to 90%, but anything over 50% can be scavenged (dropped) for higher-priority traffic. HTTP download traffic (plugins) can take up to 50%, with scavenging after 10% and bursting to 80%. Torrent traffic can take a maximum of 60%, with scavenging after 30%. HTTP downloads of my CD are allowed a maximum of 30%, with scavenging after 10%. Everything else falls under OTHER and is allowed 40%, all of which can be scavenged.
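In RouterOS queue-tree terms, “up to X%” maps to a queue’s `max-limit` (its cap) and “scavenged after Y%” maps to `limit-at` (the guaranteed rate; anything above it can be taken back for higher-priority traffic). Assuming, purely for illustration, a 10 Mbps allocation and mangle rules (not shown) that apply the packet marks, the tree would look roughly like:

```
/queue tree
add name=uplink  parent=global-out max-limit=10M
add name=ssh     parent=uplink packet-mark=ssh     priority=1 max-limit=10M
add name=web     parent=uplink packet-mark=web     priority=2 limit-at=5M max-limit=9M
add name=plugins parent=uplink packet-mark=plugins priority=3 limit-at=1M max-limit=5M \
    burst-limit=8M burst-threshold=4M burst-time=10s
add name=torrent parent=uplink packet-mark=torrent priority=4 limit-at=3M max-limit=6M
add name=cd-http parent=uplink packet-mark=cd      priority=5 limit-at=1M max-limit=3M
add name=other   parent=uplink packet-mark=other   priority=6 limit-at=0  max-limit=4M
```

The exact parent name (`global-out` vs. `global`) and burst parameters vary a bit between RouterOS versions, so treat this as a starting point rather than a drop-in config.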
You may have noticed that the torrent of my CD is given more bandwidth and a higher priority than the HTTP download of it. This is to encourage use of the torrent. I run 2 other seeds in different locations just to help speed up the connection (there are 16 seeds at this moment). Using the HTTP download means sharing the meager bandwidth with everyone else, while the torrent lets you use part of the bandwidth of every seed and leecher currently connected. I also make use of L7 rules to distinguish between HTTP downloads of my plugins and HTTP downloads of my CD (both are served from the same server, same IP, same website). A simple regex matching the GET request for the CD classifies the entire connection.
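The classification trick works because a connection mark applies to both directions of the connection: the client’s GET request matches the regex once, and every packet of the download that follows inherits the mark. A sketch, with a guessed filename pattern:

```
/ip firewall layer7-protocol
# match the GET for the ISO; the path/filename here is an assumption
add name=cd-iso regexp="^GET /[^ ]*CactiEZ[^ ]*\\.iso"

/ip firewall mangle
# the first matching packet marks the whole connection...
add chain=forward protocol=tcp dst-port=80 layer7-protocol=cd-iso \
    action=mark-connection new-connection-mark=cd-conn passthrough=yes
# ...and every packet of that connection (including the server-to-client
# download traffic) gets the packet mark the queue tree matches on
add chain=forward connection-mark=cd-conn action=mark-packet \
    new-packet-mark=cd passthrough=no
```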
I further make use of L7 to hinder multiple connections to the HTTP download of my CD. As a server operator, I really despise download managers because of the unnecessary load they put on the server. When people use a download manager to grab the CD, it opens up to 20 HTTP connections to different portions of the file. This causes extra memory and CPU to be used by the excess HTTP threads that have to be opened. It causes massive disk I/O from the 20 connections making the disk jump back and forth to read each little bit from each spot. And it unfairly tries to consume a larger portion of the allocated bandwidth by squeezing out other downloaders. But none of these affect my server any longer. I recently implemented a 1-connection limit on my queue for the CD. This doesn’t block the extra connections; instead, I assign them to a different queue that is allowed only 64 Kbps of bandwidth total. When 60-80 connections pile into this queue, they effectively eliminate themselves by timing out. This is helped by the fact that I use a tiny packet buffer on this queue (1 packet), which means traffic is dropped immediately once the queue fills. While this doesn’t completely eliminate the issue, it controls it enough that I am satisfied with the result. When the #1 connection completes its download, within a few seconds one of the #2+ connections gets reclassified and the download continues.
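One way to build this penalty queue is with the `connection-limit` matcher, which matches once a source already has the given number of connections. This is my best guess at the shape of such a setup, not a verbatim config — rule ordering matters (the penalty mark must win over the normal CD mark), and `connection-limit` semantics differ slightly across RouterOS versions:

```
/ip firewall mangle
# a source IP's 2nd, 3rd, ... concurrent CD connection gets the penalty mark
# (this rule must sit above the normal cd-conn marking rule)
add chain=prerouting protocol=tcp dst-port=80 layer7-protocol=cd-iso \
    connection-limit=2,32 \
    action=mark-connection new-connection-mark=cd-extra passthrough=no
add chain=prerouting connection-mark=cd-extra action=mark-packet \
    new-packet-mark=cd-extra passthrough=no

/queue type
# a 1-packet FIFO: anything arriving while the queue is full is dropped at once
add name=one-packet kind=pfifo pfifo-limit=1

/queue tree
# all penalty connections share 64 kbps total
add name=cd-penalty parent=global-out packet-mark=cd-extra \
    max-limit=64k queue=one-packet
```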
If anyone would like, I can go a bit more in depth on how to configure these types of things, but this post is getting long enough as it is. As of today, except for a few little mishaps (I like to test different ideas out), everything seems to be working fairly smoothly. Hopefully in the future I will finish the move to ESXi and make things a bit snappier too.