Error Detection. Error detection is augmented by logging systems that keep track of failures over a period of time. The logged information is examined to determine whether trends may adversely affect network performance. It might reveal, for example, that a particular component is continually introducing errors onto the network, or the monitoring system might detect that a component on the network has failed.
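The idea of examining a failure log for trends can be illustrated with a minimal Python sketch. The log format (timestamp, component name), the function name find_trending_components, and the window and threshold values are all assumptions made for illustration; the text does not describe a specific logging system.

```python
from collections import defaultdict


def find_trending_components(failure_log, window, threshold, now):
    """Return components with at least `threshold` failures
    logged during the last `window` seconds before `now`."""
    counts = defaultdict(int)
    for timestamp, component in failure_log:
        if now - window <= timestamp <= now:
            counts[component] += 1
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical log entries: (time in seconds, component name)
log = [
    (10, "bridge-2"),
    (40, "hub-1"),
    (55, "bridge-2"),
    (70, "bridge-2"),
    (300, "hub-1"),
]

suspects = find_trending_components(log, window=100, threshold=3, now=100)
print(suspects)
```

A real monitoring system would likely weigh error severity and rate of change as well; a simple count within a time window is only the most basic form of trend analysis.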
Configuration Assessment Component. This component takes information about the current system configuration, including connectivity, component placement, paths, and flows, and maps it onto the failed component. The result is analyzed to indicate how that particular failure is affecting the system and to isolate the cause of the failure. Once this assessment has been performed, a solution can be worked out and implemented.
The solution may consist of reconfiguring most of the operational processes to avoid the source of the error. The solution determination component examines the configuration and the affected hardware or software components, determines how to move resources around to bring the network back to an operational state or indicates what must be eliminated because of the failure, and identifies network components that must be serviced.
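The assessment and solution steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: flows are modeled as lists of component names, alternate paths are supplied in advance, and the names assess_failure, plan_solution, and the sample components are hypothetical.

```python
def assess_failure(flows, failed):
    """Map the failed component onto the configuration:
    split flows into those that cross it and those that do not."""
    affected = {name: path for name, path in flows.items() if failed in path}
    unaffected = {name: path for name, path in flows.items() if failed not in path}
    return affected, unaffected


def plan_solution(affected, alternates, failed):
    """For each affected flow, choose the first alternate path that
    avoids the failed component; otherwise mark the flow as eliminated."""
    rerouted, eliminated = {}, []
    for name in affected:
        usable = [p for p in alternates.get(name, []) if failed not in p]
        if usable:
            rerouted[name] = usable[0]
        else:
            eliminated.append(name)
    return rerouted, sorted(eliminated)


# Hypothetical configuration: each flow is the list of components it crosses.
flows = {
    "payroll": ["lan-a", "router-1", "lan-b"],
    "mail": ["lan-a", "router-2", "lan-c"],
    "print": ["lan-b", "router-1", "lan-c"],
}
alternates = {"payroll": [["lan-a", "router-2", "lan-b"]]}

affected, unaffected = assess_failure(flows, "router-1")
rerouted, eliminated = plan_solution(affected, alternates, "router-1")
print(rerouted, eliminated)
```

In this sketch the component to be serviced is simply the failed one passed in; isolating the cause of a failure in a real network is considerably harder than a lookup.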
Function Criticality. The determination of the most effective course of action is based on the criticality of keeping certain functions of the network operating and maintaining the resources available to do this. In some environments, nothing can be done to restore service because of device limitations (e.g., lack of redundant subsystems) or the lack of spare bandwidth. In such cases, about all that can be done is to indicate to the servicing agent what must be corrected and keep users informed of the situation.
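One way to picture a criticality-based decision is a greedy selection: keep the most critical functions that fit within the capacity that remains after the failure. The Python sketch below assumes each function has a numeric criticality score and a capacity demand; these fields, the function name select_functions, and the sample values are hypothetical and not taken from the text.

```python
def select_functions(functions, available_capacity):
    """Keep functions in descending order of criticality for as long
    as their demand fits in the remaining capacity; drop the rest."""
    kept, dropped = [], []
    remaining = available_capacity
    for name, criticality, demand in sorted(
        functions, key=lambda f: f[1], reverse=True
    ):
        if demand <= remaining:
            kept.append(name)
            remaining -= demand
        else:
            dropped.append(name)
    return kept, dropped


# Hypothetical functions: (name, criticality score, capacity demand)
functions = [
    ("order entry", 9, 40),
    ("e-mail", 5, 30),
    ("file transfer", 3, 50),
    ("printing", 4, 20),
]

kept, dropped = select_functions(functions, available_capacity=100)
print(kept, dropped)
```

When no spare capacity exists, every function lands in the dropped list, which corresponds to the case where all that can be done is to notify the servicing agent and the users.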
Once an alternate configuration has been determined, the reconfiguration system implements it. In most cases, this means rerouting transmissions, moving and restarting processes from failed devices, and reinitializing software that has failed because of some intermittent error condition. In some cases, however, nothing may need to be done except notify affected users that the failure is not severe enough to warrant system reconfiguration.
For WANs, connections among LANs may be accomplished over leased lines with a variety of devices, typically bridges and routers. An advantage of using routers for this purpose is that they permit the building of large mesh networks. With mesh networks, the routers can steer traffic around points of congestion or failure and balance the traffic load across the remaining links.
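The way a mesh lets traffic be steered around a failure can be shown with a small Python sketch. It uses a breadth-first search for the path with the fewest hops while skipping failed links; this is a simplification chosen for illustration, not a description of any particular routing protocol, and the node names and the function find_route are hypothetical.

```python
from collections import deque


def find_route(links, source, destination, failed_links=frozenset()):
    """Return a fewest-hop path from source to destination that
    avoids failed links, or None if no such path exists."""
    # Build an undirected adjacency map, leaving out failed links.
    neighbors = {}
    for a, b in links:
        if (a, b) in failed_links or (b, a) in failed_links:
            continue
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for nxt in neighbors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical mesh of four routers.
mesh = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("B", "C")]

normal = find_route(mesh, "A", "D")
detour = find_route(mesh, "A", "D", failed_links={("B", "D")})
print(normal, detour)
```

Load balancing across the remaining links would require link costs or utilization figures, which this hop-count sketch leaves out.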
A LAN whose shared resources are distributed can better protect users against the loss of information and unnecessary downtime than a network with all of its resources centralized at a single location. The vehicle for resource sharing is the server, which constitutes the heart of the LAN. The server gives the LAN its features, including those for security and data protection, as well as those for network management and resource accounting.
Types of Servers. The server determines the friendliness of the user interface and governs the number of users that share the network at one time. It resides in one or more networking cards that are typically added to microcomputers or workstations and may vary in processing power and memory capacity. However, servers are programs that provide services more than they are specific pieces of hardware. In addition, various types of servers are designed to share limited LAN resources (for example, laser printers, hard disks, and RAM mass memory). More impressive than the actual shared hardware are the functions provided by servers. Aside from file servers and communications servers, there are image and fax servers, electronic mail servers, printer servers, SQL servers, and a variety of other specialized servers, including those for videoconferencing over the LAN.
The addition of multiple special-purpose servers provides the capability, connectivity, and processing power not provided by the network operating system and file server alone. A single multiprocessor server combined with a network operating system designed to exploit its capabilities, such as UNIX, provides enough throughput to support 5 to 10 times the number of users and applications as a microcomputer that is used as a server. New bus and cache designs make it possible for the server to make full use of several processors at once, without the usual performance bottlenecks that slow application speed.
Server Characteristics. Distributing resources in this way minimizes the disruption to productivity that would result if all the resources were centralized and a failure were to occur. Moreover, the use of such specialized devices as servers permits the integration of diagnostic and maintenance capabilities not found in general-purpose microcomputers. Among these capabilities are error detection and correction, soft-controlled error detection and correction, and automatic shutdown in case of catastrophic error. Some servers include integral management functions (e.g., remote console management). The multiprocessing capabilities of specialized servers provide the power necessary to support the system overhead that all these sophisticated capabilities require.
Aside from physical faults on the network, there are various causes for lost or erroneous data. A software failure on the host, for example, can cause write errors to the user or server disk. Application software errors may generate inaccurate values, or faults, on the disk itself. Power surges can corrupt data and application programs, and power outages can shut down sessions, wiping out data that has not yet been written to disk. Viruses and worms that are brought into the LAN from external bulletin boards, shareware, and careless user uploads are another concern. User mistakes can also introduce errors into data or eliminate text. Strict adherence to security procedures is usually sufficient to minimize most of these problems, but it does not eliminate the need for backup and archival storage.