What is high availability?
High availability is an aspect of system design that ensures your network is running when you need it to. Networks, and the computers that make them up, have many components that all need to be working in order for the system to be “available”. If one of these components isn’t working properly, you might experience significant downtime during which you can’t access your network or services.
When designing a system for high availability, it’s important to keep in mind how much downtime you can tolerate. If you’re willing to accept an availability of 99%, you’d have to face 1% downtime, or about 3.65 total days a year. To put this number into perspective, when Facebook had an outage in the middle of the night for only half an hour, one estimate put the lost revenue at half a million dollars. 100% availability is almost impossible to achieve, but a good IT design can get you close to this ideal. Let’s examine some strategies for maintaining high availability.
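To see how availability percentages translate into downtime, here’s a quick back-of-the-envelope calculation; the figures follow directly from the number of hours in a year:

```python
# Downtime per year implied by common availability targets.
HOURS_PER_YEAR = 365 * 24  # 8760 hours

for availability in (0.99, 0.999, 0.9999):
    downtime_hours = HOURS_PER_YEAR * (1 - availability)
    print(f"{availability:.2%} availability -> "
          f"{downtime_hours:.2f} hours/year down "
          f"({downtime_hours / 24:.2f} days)")
```

At 99% availability you lose about 87.6 hours (3.65 days) a year; each extra “nine” cuts that by a factor of ten.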
Backing it up
The most basic strategy is to create a physical backup of your data. Think of it this way: you’ve started a business from scratch, and you’re six months in. You’ve never saved any of your data, and one day, all of it is lost. It might not take you a full six months to rebuild all your information, but it’ll be pretty close. If you had made a physical backup, say once a week, you’d get your information back much faster. You’d still lose a few days of work, but that’s far better than starting over from scratch. If you’ve never implemented any sort of IT strategy, then at the very least do this.
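A weekly backup like the one described can be as simple as archiving your data folder with a date stamp. The sketch below uses only Python’s standard library; the directory paths are placeholders you’d replace with your own:

```python
import shutil
from datetime import date
from pathlib import Path

def weekly_backup(data_dir: str, backup_dir: str) -> Path:
    """Copy data_dir into backup_dir as a dated .zip archive."""
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    archive_base = Path(backup_dir) / f"backup-{date.today().isoformat()}"
    # shutil.make_archive appends ".zip" and returns the final path.
    return Path(shutil.make_archive(str(archive_base), "zip", data_dir))
```

You’d schedule something like this to run once a week (cron on Linux, Task Scheduler on Windows) and keep at least one copy off-site.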
For the rest of us who are committed to maintaining workflow, there are much better strategies that you can take advantage of.
RAID (Redundant Array of Independent Disks)
Most of us at home have just one independent “disk” that stores our data, whether inside a laptop or a desktop computer. In a RAID array, you add additional disks that store some or all of the data that would otherwise live on a single disk. This is advantageous because if one disk fails, another disk can pick up where the failed disk left off. A disadvantage is that writes carry extra overhead: each logical write becomes multiple physical writes, and parity-based RAID levels must compute parity data as well. Because the disks write in parallel, the penalty is rarely as severe as it sounds, and your RAID level can be chosen to minimize the impact on write performance.
Another limitation of RAID is that most levels assume only one disk will fail at any given time. If you’re hit by a virus or a natural disaster, whatever happens to one disk is likely to happen to all of them, since they sit in the same machine. RAID offers protection from isolated hardware failures, not from site-wide problems.
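To make the idea concrete, here is a toy sketch of RAID 1-style mirroring: every write goes to every disk, so a single disk failure loses nothing. This is an illustration of the concept only, not real storage code; the `MirroredArray` class and its methods are invented for this example:

```python
class MirroredArray:
    """Toy RAID 1: every block is written to every disk."""

    def __init__(self, disk_count: int = 2):
        # Each "disk" is just a dict mapping block number -> data.
        self.disks = [dict() for _ in range(disk_count)]

    def write(self, block: int, data: bytes) -> None:
        # One logical write becomes disk_count physical writes.
        for disk in self.disks:
            disk[block] = data

    def read(self, block: int, failed=frozenset()) -> bytes:
        # Any surviving disk can serve the read.
        for i, disk in enumerate(self.disks):
            if i not in failed:
                return disk[block]
        raise IOError("all disks failed")
```

With three disks, the array survives two simultaneous failures of this kind; real arrays differ by RAID level, which is exactly the single-failure assumption discussed above.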
Mirroring
Exactly what it sounds like, mirroring is the process of creating an identical copy of your main server. This tends to be costly, since it requires a second server identical to the one you use that sits idle except in an emergency. Clustering your servers is usually a better idea, but if you have an extra server lying around, you can go this route.
Clustering
When you have just one server for multiple workstations, you’re betting your business on that one server always doing its job. If you create a cluster of hosts, you can leverage the additional computing power into more uptime. If one server gets too many requests, the other machines can balance the load, which helps when a website gets a surge of traffic that a single web server couldn’t handle on its own. With a cluster of hosts, Server B can assume the duties of Server A if Server A stops working. Some server types have to be shut down to be upgraded or serviced; with a cluster, you can do this without taking your entire system offline.
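The failover behavior described above can be sketched as a tiny dispatcher that routes each request to the first healthy server. The `Cluster` class and its health flags are illustrative placeholders, not a real clustering API:

```python
class Cluster:
    """Toy failover: route each request to the first healthy server."""

    def __init__(self, servers):
        self.servers = servers  # list of (name, healthy) pairs

    def dispatch(self, request: str) -> str:
        for name, healthy in self.servers:
            if healthy:
                return f"{name} handled {request}"
        raise RuntimeError("no healthy servers: total outage")

cluster = Cluster([("Server A", True), ("Server B", True)])
print(cluster.dispatch("GET /"))          # Server A handles it

cluster.servers[0] = ("Server A", False)  # Server A goes down...
print(cluster.dispatch("GET /"))          # ...Server B picks up the slack
```

Real load balancers also spread requests across healthy servers rather than always picking the first one, which is how a cluster absorbs traffic spikes as well as failures.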
Since servers are connected through a network and don’t have to sit side by side, some companies choose to spread their server cluster across different locations. The reason is that if a disaster happens at one location (theft, flooding, fire, etc.), it won’t affect the other servers on the network. The rest of the servers pick up the slack while you fix the problem, with very little downtime.
Examine Your Weak Points
There are plenty of components that need protection if you really want to minimize your downtime. Your server’s environment, hardware, and software are all pressure points that can fail if not properly maintained. Keeping your servers in a humid basement? Don’t be surprised if you run into an electrical wiring issue. Not updating your server’s software regularly? A known vulnerability you haven’t patched puts you at risk. A comprehensive analysis of your needs and computing environment is essential to maximize your business’s uptime.
At ETech 7, we work with our clients to make sure they experience the maximum amount of uptime possible. Call or schedule a consultation today!