Using these ARS features, the NMS design engineers defined three primary trouble-ticket schemas. The event ticket schema specifies the date, time, and severity of the problem, as well as suggested problem resolution steps. The node contact schema provides information about repair personnel (i.e., phone number and other contact information). The common carrier schema provides information about the transport circuits and carrier contact data. All schemas adhere to a predefined escalation process that automatically notifies management of unresolved or forgotten event tickets.
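The three schemas and the escalation rule can be pictured with a minimal Python sketch. The field names, the severity numbering (1 as most severe), and the escalation thresholds are illustrative assumptions; they are not taken from the actual ARS schema definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class NodeContact:
    """Repair personnel for a node (hypothetical fields)."""
    node: str
    technician: str
    phone: str


@dataclass
class CarrierContact:
    """Transport circuit and its carrier contact (hypothetical fields)."""
    circuit_id: str
    carrier: str
    phone: str


@dataclass
class EventTicket:
    """Date, time, severity, and suggested resolution steps for one event."""
    ticket_id: int
    opened_at: datetime
    severity: int  # assumed convention: 1 = most severe
    description: str
    resolution_steps: List[str] = field(default_factory=list)
    resolved_at: Optional[datetime] = None


# Assumed thresholds: how long a ticket may stay open before management is told.
ESCALATION_AFTER = {
    1: timedelta(hours=1),
    2: timedelta(hours=4),
    3: timedelta(hours=24),
}


def needs_escalation(ticket: EventTicket, now: datetime) -> bool:
    """Return True if an unresolved ticket has been open past its threshold."""
    if ticket.resolved_at is not None:
        return False
    limit = ESCALATION_AFTER.get(ticket.severity, timedelta(hours=24))
    return now - ticket.opened_at > limit
```

A periodic job could run `needs_escalation` over all open tickets and send a notification for each one that returns True, which is one way to catch forgotten tickets.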
Beyond tracking events and communicating information, the help desk automatically retains a database of problem-solving experience; its designers coupled ARS transparently with a Sybase database for this purpose. The NOC operator or nodal technician can use this problem-solving information as a troubleshooting aid and as a source of statistics. Troubleshooting assistance is achieved by querying for solutions to similar previous problems. Furthermore, storing all previous event tickets allows the ARS to provide, for example, statistical analysis of the length of outages, the frequency of outages, the most frequent causes of events, and the amount of time spent by individual repair personnel. Coupled with this functionality is the capability to generate reports of statistical results, event ticket summaries, complete event tickets, and other information as needed.
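The kind of statistics described above can be sketched in a few lines of Python over a list of closed tickets. The record layout and sample values are invented for illustration, and time spent per technician is approximated here by ticket duration, which is an assumption rather than how ARS computes it.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical closed tickets; a real system would query the ticket database.
TICKETS = [
    {"cause": "circuit failure", "technician": "tech-a",
     "opened": datetime(2024, 1, 1, 8, 0), "closed": datetime(2024, 1, 1, 10, 0)},
    {"cause": "router fault", "technician": "tech-b",
     "opened": datetime(2024, 1, 2, 9, 0), "closed": datetime(2024, 1, 2, 10, 0)},
    {"cause": "circuit failure", "technician": "tech-a",
     "opened": datetime(2024, 1, 3, 12, 0), "closed": datetime(2024, 1, 3, 15, 0)},
]


def outage_hours(ticket):
    """Length of one outage in hours, from open to close."""
    return (ticket["closed"] - ticket["opened"]).total_seconds() / 3600


def summarize(tickets):
    """Summarize a non-empty list of closed tickets."""
    causes = Counter(t["cause"] for t in tickets)
    hours_by_technician = defaultdict(float)
    for t in tickets:
        hours_by_technician[t["technician"]] += outage_hours(t)
    return {
        "outage_count": len(tickets),
        "mean_outage_hours": mean(outage_hours(t) for t in tickets),
        "most_frequent_cause": causes.most_common(1)[0][0],
        "hours_by_technician": dict(hours_by_technician),
    }


summary = summarize(TICKETS)
```

The same aggregation could be expressed as SQL against the ticket tables; the Python form is used here only to make the calculation explicit.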
In preparing for the next release of the help desk, the organization is considering publishing these reports, along with the status of all trouble tickets, on a secured server that is accessible through either a Web or an e-mail interface. Such a capability would allow network users to look up and track the status of reported problems without calling the help desk. This, the organization believes, would let its NOC operators spend more time resolving network problems.
Rather than have a standalone help desk with no insight into the network status information collected and presented by the SNMP manager, the organization decided to integrate its ARS with HPOV. The coupling with HPOV relies on a combination of menu bar integration and in-house developed scripts that automatically fill in specified data fields when a trouble ticket is opened. This reduces the amount of information the NOC operators must enter for each trouble ticket. As the dynamics of network behavior are captured, analyzed, and understood by the engineers, the organization plans to build upon the existing HPOV integration by adding an SNMP trap-to-ARS event mapping list that automatically generates trouble tickets for selected network traps and events.
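A trap-to-ticket mapping list of the kind planned here might look like the following Python sketch. The trap names are standard SNMP generic traps, but the severities, summaries, ticket fields, and the function name are assumptions made for illustration and do not describe the organization's actual mapping.

```python
from typing import Optional

# Hypothetical mapping list: only traps listed here generate a ticket.
TRAP_TICKET_MAP = {
    "linkDown": {"severity": 1, "summary": "Interface down"},
    "coldStart": {"severity": 2, "summary": "Device restarted"},
    "authenticationFailure": {"severity": 3, "summary": "SNMP authentication failure"},
}


def ticket_from_trap(trap_name: str, node: str) -> Optional[dict]:
    """Build pre-filled ticket fields for a mapped trap, or None if unmapped."""
    rule = TRAP_TICKET_MAP.get(trap_name)
    if rule is None:
        return None  # trap is not selected for automatic ticketing
    return {
        "node": node,
        "severity": rule["severity"],
        "description": f"{rule['summary']} on {node}",
        "source": "HPOV trap",
    }
```

Keeping the mapping in a table rather than in code lets engineers add or retire traps as they learn which events deserve a ticket.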
Due to the mission-critical nature of the network, system reliability and redundancy required careful handling. For the NMS, several software components are used together to provide the required system reliability. The ODS is used to perform disk mirroring, which provides redundancy for all of the data and applications used by the primary and secondary NMS workstations, WS1 and WS2.
The Qualix Group's First Watch software monitors all of the applications running on the primary workstation and sends a heartbeat between the primary and secondary workstations. In the event of a fault or failure, First Watch maintains a log, notifies the NOC operator, and, if needed, shuts down the primary workstation and initiates the secondary workstation.
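The heartbeat-and-failover idea can be illustrated with a small Python sketch of the standby side. This is a generic illustration of the technique, not First Watch's implementation; the class name, the timeout value, and the log format are assumptions.

```python
class StandbyNode:
    """Secondary workstation that takes over when heartbeats stop arriving."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s   # assumed silence allowed before failover
        self.last_heartbeat = None   # time of the most recent heartbeat
        self.active = False          # True once the standby has taken over
        self.log = []                # fault log for the NOC operator

    def receive_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the primary workstation."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if the primary has been silent too long; return active state."""
        if self.active or self.last_heartbeat is None:
            return self.active
        if now - self.last_heartbeat > self.timeout_s:
            self.log.append(f"t={now}: heartbeat lost, standby taking over")
            self.active = True
        return self.active


standby = StandbyNode(timeout_s=10)
standby.receive_heartbeat(now=0)
```

Times are passed in explicitly so the logic can be exercised without real clocks; a deployed monitor would read the system clock and would also need to confirm that the primary is really down before starting services on the secondary.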
Isicad Corporation's Command software resides on WS4. It provides a graphical representation of the physical location of devices on the network, which is used to track network assets and device-to-device connectivity and to perform fault isolation. It also documents such useful information as circuit identification numbers, cable type, and personal identification numbers. It is used to pinpoint the exact location of a reported fault. Although it can be integrated with HPOV NNM and Remedy's ARS, the NMS will use it initially as a standalone application.
NETScout probes are installed on FDDI LANs at selected sites. These probes collect networking data on remote LANs and forward selected information to the network management system. Besides providing standard RMON information, the probes allow the network management system to maintain traffic accountability for individual users at the LAN level. This is extremely important for avoiding congestion on the organization's WAN backbone links. NETScout was chosen over competing products for its ability to collect FDDI as well as Ethernet data and because of its close integration with the HPOV platform.
To date, the organization has written about a half-dozen scripts that integrate the Remedy ARS, SAS, and NETScout into the OpenView NNM environment. These scripts assist in customizing the interdependencies among the multiple applications and the underlying UNIX processes. As an example, the SAS integration with HPOV NNM requires a script that ensures SAS's ability to access data files created by HPOV NNM. These scripts, in most instances, are no longer than a few lines, but they provide the necessary glue to implement an integrated network management solution.
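The text does not say how the SAS access script works. One plausible form of such glue, shown here purely as an assumption, is a few lines that add group read permission to a data file so that a second application running under the same UNIX group can open it. The function name is hypothetical.

```python
import os
import stat


def ensure_group_readable(path: str) -> bool:
    """Add group read permission to a file if missing; return True if changed."""
    mode = stat.S_IMODE(os.stat(path).st_mode)  # permission bits only
    if mode & stat.S_IRGRP:
        return False  # already readable by the group
    os.chmod(path, mode | stat.S_IRGRP)
    return True
```

A script like this would typically be run from a scheduler or after each data collection cycle, over whichever directory holds the files the second application needs.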
Simultaneous with the design and implementation of the NMS, the organization began developing an operations concept. This model addressed issues pertaining to operations roles and responsibilities, equipment sparing, problem diagnostic procedures, and external interfaces from the NOC perspective. Once these concepts and philosophies were sufficiently developed, an operations work-flow process was developed.
In the work-flow process, a network event is typically reported to the NOC either by a user or by HPOV through its map or event windows. For the initial deployment of the ARS, the NOC operator opens an event ticket to document each network event. The associated severity level is noted along with a brief problem description. Once the event information is entered, the ticket is assigned and forwarded electronically to the appropriate repair personnel or circuit carrier, who investigate the condition.