Mission-critical data should be backed up daily or weekly and stored off site. There are numerous services that provide offsite storage, often in combination with hierarchical storage management techniques. In the IBM environment, for example, this might entail storing frequently used data on a direct access storage device (DASD) for immediate usage, whereas data used only occasionally might go to optical drives, and data that has not been used in several months would be archived to a tape library.
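The tiering idea can be illustrated with a minimal sketch that assigns data to a storage tier by how long it has gone unused. The 30-day and 90-day thresholds and the tier names are assumptions chosen for illustration; the text only distinguishes frequently used data, occasionally used data, and data unused for several months.

```python
from datetime import date, timedelta

# Hypothetical thresholds: the text gives no exact figures.
OCCASIONAL_AFTER = timedelta(days=30)   # assumed cutoff for "used only occasionally"
ARCHIVE_AFTER = timedelta(days=90)      # assumed cutoff for "not used in several months"


def storage_tier(last_used: date, today: date) -> str:
    """Return the tier a data set belongs on, based on how long it has been idle."""
    idle = today - last_used
    if idle >= ARCHIVE_AFTER:
        return "tape"      # archived to the tape library
    if idle >= OCCASIONAL_AFTER:
        return "optical"   # occasionally used data
    return "dasd"          # frequently used data, kept for immediate access


if __name__ == "__main__":
    today = date(2024, 6, 1)
    for last_used in (date(2024, 5, 25), date(2024, 4, 15), date(2024, 1, 10)):
        print(last_used, "->", storage_tier(last_used, today))
```

A real hierarchical storage manager would weigh access frequency, data size, and retrieval cost as well; idle time alone is used here only to keep the example short.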
Carriers, computer vendors, and third-party firms offer vault storage for secure, offsite data storage of critical applications. Small companies need not employ such elaborate methods. They can back up their own data and have it delivered by overnight courier for storage at another company location or bring it to a bank safety deposit box. The typical bank vault can survive even a direct hit by a tornado.
In addition to backing up critical data, it is advisable to register all applications software with the manufacturer and keep the original program disks in a safe place at a different location. This minimizes the possibility of both copies being destroyed in the same catastrophe. Software licenses, manuals, and supplementary documentation should also be protected.
In storm-prone areas like the Southeast, frequent electrical storms can put sudden bursts of electricity, called spikes or surges, on telephone lines. These bursts can destroy router links and cause adapters and modems to fail. To protect equipment attached to telephone lines, surge-suppression devices can be installed between the telephone line and the communications device. Surge suppressors clamp these transient overvoltages before they reach the equipment. Many modems and other network devices have surge suppressors built in. The disaster recovery plan should specify the use of surge suppressors whenever possible, and equipment should be checked periodically to ensure proper operation.
Most companies can afford to stockpile spare cables and cards but not spare multiplexer and router components, which are typically too expensive to inventory. Pooling these items with another area business that uses the same equipment can be an economical form of protection should disaster strike. Such businesses can be identified through user group and association meetings. The equipment vendor is another source for this information.
After each party becomes familiar with the disaster recovery needs of the other, an agreement can be drawn up to pledge mutual assistance. Each party stocks half the necessary spare parts. The pool is drawn from as needed and restocked after the faulty parts come back from the vendor's repair facilities.
Carriers offer an economical form of disaster protection with their networks of digital switches. When one link goes down, voice and data calls are automatically rerouted or switched to other links on the carrier's network. Examples of switched digital services are switched 56K bps, ISDN, and frame relay. Many routers now offer interfaces for switched digital services, allowing data to take any available path on the network. The same level of protection is available on private networks, but it requires spare lines, which is often a very expensive solution.
A well-planned internetworking system avoids single points of failure. This entails equipping nodes with redundant subsystems, such as power, control logic, network cards, and WAN ports. Routers, for instance, need multiple WAN ports so that if a primary line goes down, the router can automatically use the backup line on another port.
Even branch sites with remote-access routers should have multiple WAN ports. If the first line goes down, the remote router is programmed to autodial into a second line on another port, which remains inactive until needed. The second port can dial up a switched 56K line that is paid for on a usage basis.
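The dial-backup behavior described above can be sketched as a priority-ordered port selection. The `WanPort` class, its field names, and the port names are hypothetical; they stand in for whatever a given router's configuration actually exposes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WanPort:
    """Hypothetical model of one WAN port on a remote-access router."""
    name: str
    usable: bool                   # line is up (or, for a dial line, can be dialed)
    dial_on_demand: bool = False   # True for a usage-billed backup such as switched 56K


def choose_port(ports: List[WanPort]) -> Optional[Tuple[str, str]]:
    """Pick the first usable port in priority order.

    Returns (port name, action), where action is "dial" for a dial-on-demand
    backup that must be brought up, or "use" for a line that is already active.
    Returns None when no port is usable.
    """
    for port in ports:
        if port.usable:
            return port.name, "dial" if port.dial_on_demand else "use"
    return None


if __name__ == "__main__":
    ports = [
        WanPort("wan0", usable=False),                        # primary line is down
        WanPort("wan1", usable=True, dial_on_demand=True),    # inactive backup line
    ]
    print(choose_port(ports))
```

Listing the ports in priority order keeps the usage-billed line idle until the primary fails, which matches the cost model of paying only when the backup is actually dialed.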
WANs of three or more sites often link the remote locations to the primary site but not to each other. Though this strategy saves linkage costs, it risks leaving remote workers stranded should the main office's services go down. To keep branches up and running, inexpensive links should be established among them. The links' bandwidth should be adequate to keep critical systems communicating.
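To see why a hub-and-spoke layout strands the branches, the following minimal sketch checks which sites a branch can still reach when the main office is down. The site names and the link lists are made up for illustration.

```python
from collections import deque


def reachable(links, start, down=frozenset()):
    """Return the set of sites reachable from `start`, ignoring sites in `down`.

    `links` is a list of (site_a, site_b) pairs treated as two-way links.
    `start` is assumed not to be one of the failed sites.
    """
    adjacency = {}
    for a, b in links:
        if a in down or b in down:
            continue  # a link is useless if either end is out of service
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for neighbor in adjacency.get(site, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


if __name__ == "__main__":
    hub_and_spoke = [("HQ", "A"), ("HQ", "B"), ("HQ", "C")]
    with_branch_links = hub_and_spoke + [("A", "B"), ("B", "C")]

    # With the main office down, branch A is isolated in the pure hub design...
    print(sorted(reachable(hub_and_spoke, "A", down={"HQ"})))
    # ...but can still reach B and C once inexpensive branch links exist.
    print(sorted(reachable(with_branch_links, "A", down={"HQ"})))
```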
As long as backup circuits are available, routers can run a routing protocol that understands link states and can reroute around points of network failure. The Routing Information Protocol (RIP), often used in smaller WANs, is a distance-vector protocol and does not track link states, but the Open Shortest Path First (OSPF) protocol does. OSPF runs on TCP/IP networks and is the protocol of choice for larger internetworks.
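Link-state protocols such as OSPF compute routes with a shortest-path-first calculation over a map of links and their costs, and recompute when a link changes state. The sketch below shows only that calculation, using Dijkstra's algorithm on a made-up three-site topology with assumed costs; it is not an OSPF implementation and omits flooding, areas, and timers.

```python
import heapq


def shortest_path(links, src, dst):
    """Return (total cost, path) for the cheapest route, or None if unreachable.

    `links` maps (site_a, site_b) pairs to a cost and is treated as two-way.
    """
    adjacency = {}
    for (a, b), cost in links.items():
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    heap = [(0, src, [src])]   # (cost so far, current site, path taken)
    settled = set()
    while heap:
        cost, site, path = heapq.heappop(heap)
        if site == dst:
            return cost, path
        if site in settled:
            continue
        settled.add(site)
        for neighbor, link_cost in adjacency.get(site, []):
            if neighbor not in settled:
                heapq.heappush(heap, (cost + link_cost, neighbor, path + [neighbor]))
    return None


if __name__ == "__main__":
    # Hypothetical topology: two leased lines to HQ plus a slower branch-to-branch link.
    links = {("HQ", "A"): 10, ("HQ", "B"): 10, ("A", "B"): 50}
    print(shortest_path(links, "A", "HQ"))

    # Simulate the A-to-HQ line failing and recompute over the remaining links.
    surviving = {pair: cost for pair, cost in links.items() if pair != ("HQ", "A")}
    print(shortest_path(surviving, "A", "HQ"))
```

After the failure, traffic from A reaches HQ through B at a higher cost, which is the rerouting behavior the backup circuit makes possible.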
It is advisable to test the disaster recovery plan periodically to check assumptions and to find out whether the plan really works. After giving users advance notice, the network manager can come in after business hours, unplug one of the communication links, and see what happens. If something unexpected occurs, it is necessary to fine-tune the disaster recovery plan and test again.
With certain types of network equipment, such as multiplexers and switches, several disaster-simulation scenarios can be programmed in advance and stored for emergency implementation. With the integral network modeling capability of some T1 multiplexers, network planners can simulate various disaster scenarios on an aggregate or node level anywhere in the network. This offline simulation allows planners to test and monitor changing conditions and determine their precise impact on network operations.
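The same kind of offline what-if analysis can be approximated in a few lines when no modeling tool is at hand. This minimal sketch fails each link in turn and reports which sites would lose contact with a chosen origin site. The topology is invented for illustration, and the code does not represent any multiplexer vendor's modeling feature.

```python
def stranded_after_failure(links, failed, origin):
    """Return the sites cut off from `origin` when the link `failed` is removed."""
    adjacency = {}
    for link in links:
        if link == failed:
            continue  # skip the link being failed in this scenario
        a, b = link
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {origin}
    stack = [origin]
    while stack:
        site = stack.pop()
        for neighbor in adjacency.get(site, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)

    all_sites = {site for link in links for site in link}
    return all_sites - seen


def simulate_single_link_failures(links, origin):
    """Map each link to the sorted list of sites stranded if that link alone fails."""
    return {
        failed: sorted(stranded_after_failure(links, failed, origin))
        for failed in links
    }


if __name__ == "__main__":
    links = [("HQ", "A"), ("HQ", "B"), ("A", "B"), ("HQ", "C")]
    for link, stranded in simulate_single_link_failures(links, "HQ").items():
        print(link, "->", stranded or "no sites stranded")
```

In this example only the link to C is a single point of failure, so it is the one that would merit a backup circuit.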
Any outage should be treated as an unannounced test of the disaster recovery plan. Network managers should determine whether the response was adequate and how it can be improved.