At the repair site, the technicians receive the assigned trouble ticket and begin the required fault isolation and resolution process. To meet the stringent mean-time-to-restore requirements, the organization has decided to maintain an inventory at each node, which allows the technicians to swap parts quickly and perform detailed diagnostics offline. A summary of these actions is entered in the Restoral Action and Problem Found fields of the open trouble ticket. Future releases will require the operator to enter the equipment-sparing loads at each site in the trouble ticket so that the sparing profile can be tracked and maintained. Upon problem resolution, the trouble ticket is assigned back to the NOC operators, who perform the verification process.
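The hand-off described above can be illustrated with a minimal sketch. Only the Restoral Action and Problem Found fields come from the text; the class name, the status names, the ticket identifier, and the rule that both fields must be filled in before the ticket returns to the NOC are assumptions made for illustration, not a description of the organization's actual trouble-ticket system.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical status names; the text does not list ticket states.
    ASSIGNED_TO_TECHNICIAN = "assigned to technician"
    AWAITING_NOC_VERIFICATION = "awaiting NOC verification"


@dataclass
class TroubleTicket:
    ticket_id: str
    status: Status = Status.ASSIGNED_TO_TECHNICIAN
    restoral_action: str = ""   # the "Restoral Action" field
    problem_found: str = ""     # the "Problem Found" field

    def resolve(self, restoral_action: str, problem_found: str) -> None:
        """Record the technician's summary and assign the ticket back to the NOC."""
        # Assumption: both summary fields are mandatory before hand-off.
        if not restoral_action or not problem_found:
            raise ValueError("both summary fields must be filled in")
        self.restoral_action = restoral_action
        self.problem_found = problem_found
        self.status = Status.AWAITING_NOC_VERIFICATION


ticket = TroubleTicket("TT-0001")
ticket.resolve("Swapped interface card from node spares", "Failed interface card")
```

After the call to `resolve`, the ticket carries the technician's summary and waits for the NOC operators to verify the repair.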
For specific faults, such as an outage in a carrier-provided circuit, the NOC operator monitors and records the actions taken by the carrier to restore the circuit. The associated durations are entered as time tags, and this information is eventually used as a metric to gauge compliance with the service-level agreements with the WAN carriers. A similar information base is maintained for the individual operators and technicians who are assigned trouble tickets. In the future, the organization is considering incorporating the service desk of its router vendor, to draw on the expertise of the vendor's operators in real time by involving them in the help desk process.
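As a rough illustration of how such time tags could be turned into a service-level metric, the sketch below sums the recorded outage durations for one circuit and derives the mean time to restore and the availability over a reporting period. The function names, the pairing of tags into (outage start, circuit restored) tuples, the sample dates, and the 30-day period are all assumptions; the text does not say which metrics the organization actually computes.

```python
from datetime import datetime, timedelta


def total_outage(time_tags):
    """Sum the durations of (outage_start, circuit_restored) pairs."""
    return sum((end - start for start, end in time_tags), timedelta())


def mean_time_to_restore(time_tags):
    """Average outage duration across the recorded events."""
    return total_outage(time_tags) / len(time_tags)


def availability(time_tags, period):
    """Fraction of the reporting period during which the circuit was up."""
    return 1 - total_outage(time_tags) / period


# Hypothetical time tags for one carrier-provided circuit.
tags = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 30)),     # 30 minutes
    (datetime(2024, 1, 17, 14, 0), datetime(2024, 1, 17, 15, 30)),  # 90 minutes
]
period = timedelta(days=30)

mttr = mean_time_to_restore(tags)
uptime = availability(tags, period)
```

The resulting figures can then be compared against whatever thresholds the carrier's service-level agreement specifies.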
As an aid to the current workflow process, rules and responsibilities are defined to ensure timely processing of each event. This allows escalation procedures to be implemented so that event conditions are tracked in a timely manner. The organization has developed a three-level escalation process that draws on the skill mix of many people. Level one escalation relies on NOC operators who, in conjunction with technicians, resolve failures. Level two includes a NOC manager, whose skill mix allows him or her to make system-level decisions to resolve problems in real time. Level three and beyond is the domain of design engineers and vendor support. At this level, the problem usually requires expert handling.
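One way to picture the three levels is as a rule that maps the time a ticket has been open to the level responsible for it. The sketch below is only an illustration: the three levels and their owners come from the text, but the one-hour and four-hour thresholds and the function name are hypothetical, since the text gives no timings and does not say that escalation is purely time-driven.

```python
from datetime import timedelta

# Hypothetical thresholds; the text names the levels but gives no timings.
LEVEL_TWO_AFTER = timedelta(hours=1)
LEVEL_THREE_AFTER = timedelta(hours=4)

# Owners of each escalation level, as described in the text.
OWNERS = {
    1: "NOC operators and technicians",
    2: "NOC manager",
    3: "design engineers and vendor support",
}


def escalation_level(time_open: timedelta) -> int:
    """Return the escalation level for a ticket that has been open for time_open."""
    if time_open >= LEVEL_THREE_AFTER:
        return 3
    if time_open >= LEVEL_TWO_AFTER:
        return 2
    return 1


level = escalation_level(timedelta(hours=2))
owner = OWNERS[level]
```

In practice, the rules would likely also weigh event severity and the affected resources, not elapsed time alone.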
The organization anticipates a four- to six-month field test of the network. During the field test and thereafter, the network design team will take on the new role of sustaining systems engineering and support. In this sustaining engineering role, the team will integrate and test third-party applications, perform trend analysis on collected data, and provide recommendations on operations strategies. It will also be an integral part of the organization's problem escalation process.
To date, the organization has deployed this NMS to monitor and manage its TCP/IP network resources. It has also developed an in-house structured training and certification program to ensure that the proper skill mix is maintained on the NOC floor. During the initial months, the NOC will house two operators on a 24x7 basis, with a NOC manager for the day shift. A maintenance contract with a private company will provide technicians at remote sites. Eventually, as the network grows, additional operators and managers will be hired.
Based on the initial feedback from the NOC floor, the design engineers have incorporated minor changes, especially in the area of data collection and in the formats and contents of reports. Currently, these design engineers are completing the as-built documentation suite and finalizing the operations concept and procedure guidelines. They are also beginning to evaluate middleware to be incorporated in Phase II.
From a high-level corporate perspective, the phased implementation approach has paid immense dividends. It has kept the initial hardware and software costs in check and minimized the up-front implementation risk. As its understanding of the NMS processes and the network dynamics increases, the organization finds itself in a better position to deploy additional software packages in future releases. Finally, such a phased and structured approach has allowed the organization to hire the proper skill mix gradually while retraining its existing operations staff.