Method and apparatus for analyzing the progress of a software upgrade on a telecommunications switch
6691300
Abstract
A method is provided for monitoring the progress of a software upgrade or retrofit on a telecommunications switch (110). A report stream of text messages relating to the state of, and events occurring on, the telecommunications switch is received by a server (102). Predetermined messages are detected in the report stream, including messages indicating entry into a stage of the retrofit, exit from a stage of the retrofit, a failure of a stage of the retrofit, alarms and errors. The time spent in a stage of the retrofit is determined and compared against an expected time (206, 210, 212). A visual portion (300, 400) of a user interface on a client (104) is updated to reflect entry and exit from a stage of the retrofit, and whether the time spent in a stage exceeds an expected time. The user interface reflects occurrences of alarms and errors visually and in some cases audibly.
Claims
What is claimed is:
Description
FIELD OF THE INVENTION
TABLE 1
Stage | State | Image | Recognition | ProgressID
Begin | Begin | 1 | REPT RETRO BEGIN RESUME WHEN SESSION HAS STARTED | 1
Begin | Begin Complete | 2 | UPD GEN BEGIN COMPLETED SUCCESSFULLY | 1
Enter | Enter | 1 | MOUNT TEXT TAPE FOR MHD 0/1 ON | 1
Enter | Enter Complete | 2 | REPT ENTER HOOK COMPLETED SUCCESSFULLY | 1
Tape | Tape | 1 | UPD GEN ENTER MOUNT TAPE AND CONTINUE | 0
Tape | Tape Loaded | 2 | /updtmp/site/toolxfer/info.out | 0
MOP | MOP | 1 | ISMOP REPORT | 0
MOP | MOP Complete | 2 | REPT PROC SCHED PROCEED PAUSED | 0
Pump SM | Pump SM | 1 | ST:OPUMP,SM | 0
Pump SM | Pump SM Complete | 2 | REPT PROC SCHED PROCEED PAUSED | 0
Pump CMP | Pump CMP | 1 | ST:OPUMP, CMP = 0,MATE | 0
Pump CMP | Pump CMP Complete | 2 | ST OPUMP CMP = 1-0 COMPLETED | 0
TSMOLD | TSMOLD | 1 | REPT RETRO PROCEED CONTINUE | 0
TSMOLD | TSMOLD Complete | 2 | UPD GEN TSM COMPLETED | 0
Proceed | Proceed | 1 | REPT RETRO PROCEED EAI SETUP | 1
Proceed | Proceed Complete | 2 | REPT PROC SCHED SWITCHFWD PAUSED AT STAGE BOUNDARY | 1
OFLBOOT | OFLBOOT | 1 | AM OFFLINE BOOT STARTED | 2
OFLBOOT | OFLBOOT Complete | 2 | EXC OFLBOOT COMPLETED | 2
OFLBOOT | OFLBOOT failed | 3 | OFFLINE BOOT FAILED | 2
OFLBOOT | OFLBOOT failed | 3 | OFLBOOT TERMINATED | 2
OFLBOOT | OFLBOOT failed | 3 | ERROR CODE f031 | 2
SWITCH FORWARD | SWITCHFWD | 1 | SWITCHING SMS | 1
SWITCH FORWARD | SWITCHFWD Complete | 2 | SWITCH ONLINE SIDE COMPLETED | 1
SWITCH FORWARD | SWITCHFWD | 3 | ERROR SWITCHING SMS | 1
TSMNEW | TSMNEW | 1 | REPT ALWCHKS STARTING ALWCHKS FOR SM | 0
TSMNEW | TSMNEW Complete | 2 | UPD GEN TSM CADN SUMMARY | 0
TSMRMV | TSMRMV | 1 | TSMRMV | 0
TSMRMV | TSMRMV Complete | 2 | UPD GEN TSM COMPLETED | 0
STOP OFLBOOT | STOP OFLBOOT | 1 | STOP OFLBOOT STARTED | 0
STOP OFLBOOT | STOP OFLBOOT Comp. | 2 | STOP OFLBOOT COMPLETED | 0
BOOTHOOK | BOOTHOOK | 1 | offrcr.out | 1
BOOTHOOK | BOOTHOOK Complete | 2 | BOOT HOOK COMPLETED | 1
CORCS | CORCS | 1 | CNVT CORCLOG LOAD STARTED | 0
CORCS | CORCS Complete | 2 | CNVT CORCLOG LOAD:CONCURRENT CONTROL PROCESS COMPL | 0
RECENT CHANGE | RECENT CHANGE | 1 | RCNEW ODDEVOL STARTED | 1
RECENT CHANGE | RECENT CHANGE | 1 | RC REAPPLICATION CONCURRENT PROCESS STARTED | 1
RECENT CHANGE | RECENT CHANGE Comp. | 2 | RCNEW ODDEVOL COMPLETED | 1
COMMIT | COMMIT | 1 | RETRO COMMIT CONTINUING | 1
COMMIT | COMMIT Complete | 2 | UPD GEN COMMIT COMPLETED | 1
COMMIT | COMMIT Complete | 2 | REPT CMT HOOK COMPLETED SUCCESSFULLY | 1
END | END | 1 | UPD GEN END APP EXECUTING | 1
END | END Complete | 2 | UPD GEN END COMPLETED | 1
Server 102 determines whether a message received in the report stream relates to an entry into a retrofit stage (204). If it does, a timer is set with the expected stage time (206); an expected time is predefined for each stage of the retrofit. This advantageously allows monitoring to determine whether a switch has remained in a stage of the retrofit for too long, which may indicate a problem requiring correction. Preferably, the message associated with entry into the retrofit stage is stored, including the time and date stamp from the report stream that is associated with the message, for use by the user interface as discussed further below.
As the report stream continues from the switch, the search for predetermined messages continues. Under normal operating procedures associated with a successful software retrofit, sometime after an entry message is received, an exit message associated with an exit from that stage of the retrofit is received. At step 208, a determination is made whether a message is for an exit from a stage of the retrofit. If the exit message is received, the message, including a time and date stamp, is stored for further use. In addition, the stage timer that was set upon entry into that stage is reset (210). Under normal conditions, where a stage completes before its expected stage time elapses, the reset of the stage timer is not visible to a user. On the other hand, if the stage timer expires prior to a reset (212), then a user is warned. In the preferred embodiment, the user is warned by updating the user interface in a predetermined manner, as discussed below. A reset of the stage timer (210) ends the time audit for a particular stage (216).
In addition to messages associated with an entry into, or an exit from, a stage of the retrofit, other messages are detected.
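The stage time audit of steps 204 through 216 can be illustrated with a minimal Python sketch. It is not the patented implementation (which the description says is preferably written in C++); the class name, method names and the 30-minute expected time used in the usage note are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class StageAudit:
    """Keeps one stage timer per retrofit stage (hypothetical helper)."""
    expected: Dict[str, timedelta]  # assumed per-stage expected times
    deadlines: Dict[str, datetime] = field(default_factory=dict)
    log: List[Tuple[str, str, datetime]] = field(default_factory=list)

    def on_entry(self, stage: str, stamp: datetime) -> None:
        # Step 206: set the timer with the expected stage time
        # and store the entry message with its time stamp.
        self.deadlines[stage] = stamp + self.expected[stage]
        self.log.append((stage, "entry", stamp))

    def on_exit(self, stage: str, stamp: datetime) -> None:
        # Step 210: reset the timer, which ends the audit for the stage (216).
        self.deadlines.pop(stage, None)
        self.log.append((stage, "exit", stamp))

    def overdue(self, now: datetime) -> List[str]:
        # Step 212: stages whose timer expired before a reset.
        return sorted(s for s, d in self.deadlines.items() if now > d)
```

For example, with an assumed expected time of 30 minutes for COMMIT, calling `on_entry("COMMIT", t0)` and then `overdue(t0 + timedelta(minutes=31))` reports the stage as overdue until `on_exit` is called. A caller would use that result to decide when to warn the user.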
Table 2 below lists other exemplary messages that are detected in the preferred embodiment of the invention relating to the 5ESS electronic switch. In general, these messages relate to events that should not occur during a software retrofit. If a message occurs, the database may be updated and a user interface may reflect the change in the database (222). The monitoring process continues until terminated by a user or the report stream ends (224, 226).
Table 2 has three columns entitled "Recognition1," "Recognition2," and "Recognition3." Each of these columns may contain a text string. The actual ROP message being identified is the Boolean combination of Recognition1 & Recognition2 (if populated) & Recognition3 (if populated) on the same line in the ROP stream. For example, referring to row 1, a combination of "ALW" and "RC" on the same line in the ROP stream is a message of interest to the retrofit process. The "Type" column stores a number that is associated with certain actions that are taken in response to receipt of the identified message. In general, the actions that may occur in response to the messages include visually updating a user interface, audibly updating a user interface, storing the ROP message in one or more files, and inserting additional text into a file or user interface. The "Text1" column includes text that is displayed in the user interface or stored in a file in response to receipt of the associated message.
TABLE 2
Recognition1 Recognition2 Recognition3 Type Text1
ALW: RC 0
AUD: 0
BKUP: ODD 0
CCS TRUNK 0
SIGNALING IN
SERVICE
CLR: AMA MAPS 0
CLR: FILESYS 0
CLR: TRN 0
CMTHOOK COMPLETE 0
CNVT: AMA CONFIG 0
CNVT: RCLOG 0
CORCS HAVE 0
BEEN LOG
DFC ERROR 0
ENDHOOK COMPLETE 0
ERROR CAN 0
ERROR INV 0
EXC OFLBOOT 0
STARTED
EXC: ENVIR UPROC 0
EXC: ODDRCVY 0
EXC: POPCO 0
GLCCS IS SET TO 0
INH: CORC 0
INH: RC 0
INIT: AM 0
MHD AUTONOMOUS 0
MNT OFL PTNS 0
OFFLINE PUMP
MODECD COMPLETE 0
OFFLINE PERIPH DISK 0
OFFLINE PERIPH ERROR 0
OFFLINE PERIPH HASH 0
OFFLINE PERIPH INCON 0
OFFLINE PERIPH PATH 0
OFFLINE PERIPH PROB 0
OFFLINE PERIPH UNAV 0
OFFLINE PERIPH VERIFY 0
OFFLINE PUMP DISK 0
OFFLINE PUMP ERROR 0
OFFLINE PUMP FAIL 0
OFFLINE PUMP HASH 0
OFFLINE PUMP INCON 0
OFFLINE PUMP PATH 0
OFFLINE PUMP PROB 0
OFFLINE PUMP UNAV 0
OFFLINE PUMP VERIFY 0
OP:GEN RESET 0
ORD:CPI 0
REPT BOOT HOOK 0
COMPLETED
REPT CACHE-ERR 0
REPT DFC CODE 0
REPT ISOLATE 0
REPT MHD 0 UNEQUIPPED 0
REPT PRCD HOOK 0
COMPLETED
REPT PROCEED PAUSED AT 0
STAGE BOUNDARY
REPT RETRO PROCEED 0
PERFORM
REPT DKDRV 0
REPT RETRO 0
BEGIN
ST:OPUMP 0
STOP OFLBOOT 0
COMPLETED
STOP: EXC ANY 0
STOP: EXC USER 0
STOP:OFLBOOT,R 0
ST
STOP:OLFBOOT,R 0
ST
STP: RCRLS 0
STP:OPUMP 0
SUPPORTED 0
RCVS LOGGED:
TCAP SIGNALING 0
IN SERVICE
TRANSITION SPEC 0
UPD GEN AM OFFLINE BOOT 0
SUCCESSFUL
UPD GEN END APP EXECUTING 0
UPD GEN ENTER BLOCKS 0
WRITTEN
UPD GEN OFFLINE 0
UPD GEN SELECTED FOR 0
UPD GEN SWITCHFWD COMPLETED 0
UPD GEN SWITCHFWD STARTED 0
UPD GEN TSM COMPLETE 0
UPD GEN TSM IN PROGRESS 0
UPD:BKOUT 0
UPD:BOLO 0
UPD:EXALL APPLY 0
UPD:EXALL OFC 0
UPD:EXALL SOAK 0
UPD:GEN APPLPROC "MOP" 0
UPD:GEN APPLPROC APPLY 0
UPD:GEN APPLPROC HOOK 0
UPD:GEN APPLPROC MODECD 0
UPD:GEN APPLPROC STOPMOP 0
UPD:GEN APPLPROC TOOLS 0
UPD:GEN APPLPROC TSMNEW 0
UPD:GEN APPLPROC TSMOLD 0
UPD:GEN APPLPROC TSMRMV 0
UPD:GEN APPLPROC WRTAMA 0
UPD:GEN BACKOUT 0
UPD:GEN CONTINUE 0
UPD:GEN SMBKOUT 0
UPD:GEN STOPOLB 0
UPD:GEN SWITCH 0
UPD:UPNAME 0
WRT:AMADATA 0
UPD:GEN BEGIN 1
MOUNT FOR MHD AND RESUME 2
UPD GEN ENTER MOUNT TAPE 2
AND CONTINUE
STOP: GEN 3
UPD:GEN RESTORE 3
UPD:GEN ENTER 4
UPD:GEN MHDSTAT 4
EXT-BOARD-ADDR 5
CNVT CORCLOG STOPPED 6
MOVELOG 6
REPT POPCO 6
SUPR ERROR 6
CODE
CNVT: CORC 7 START CORCs
EXC: RCRLS EVOL 7 START
RECENT CHANGE
EXC: RCRLS OSPS 7 START
RECENT CHANGE
RCNEW ODDEVOL 7 RECENT
CHANGE COMPLETE
COMPLETED
RCNEW ODDEVOL 7 RECENT
CHANGE COMPLETE
STOP
START OF CU RECOVERY 7 BOOT START
of
CU RECOVERY
RCNEW ODDEVOL 8 RECENT
CHANGE COMPLETE
ABORT
DATE = 9 *****BASE #
PROCESSOR: CP 9 ***** AM
ODD
PROCESSOR: IM 10
UPD:GEN APPLPROC LOOKODD 11
REPT CORR-BIT-ERR 12
REPT MEM-SYSTEM 12
REPT PARITY-ERROR 12
REPT HASHSUM FAILURE 13
CNVT CORCLOG LOAD SM = 14
CORCS IN ERROR 15
BOOTHOOK COMPLETE 16 BOOTHOOK
COMPLETE
CONCURRENT 16 CORCs
COMPLETE
CONTROL PROCESS
COMPLETED
ENTRHOOK COMPLETE 16 ENTRHOOK
COMPLETED
INSTALLTOOLS COMPLETE 16
INSTALLTOOLS COMPLETE
MODIFY COMPLETE 16 MODIFY
COMPLETE
PRCDHOOK COMPLETE 16 PRCDHOOK
COMPLETE
UPD GEN COMMIT COMPLETED 16 COMMIT
STAGE COMPLETE
UPD GEN ENTER COMPLETED 16 TAPE LOAD
COMPLETE
UPD GEN PROCEED COMPLETED 16 PROCEED
STAGE COMPLETE
UPD:GEN COMMIT 16 START
COMMIT STAGE
UPD:GEN END 16 END STAGE
UPD:GEN PROCEED 16 START
PROCEED STAGE
BACKOUT-RC 18 Fix RC
BACKOUT before
RC reapp
starts (RC/DB)
STOPPED WITH 19
ERROR CODE
REPT RETROFIT 21
TOTAL SYSTEM
DOWNTIME
DOWNTIME 22
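The matching rule described for Table 2 (Recognition1 AND Recognition2, if populated, AND Recognition3, if populated, on the same ROP line) can be sketched as follows. This is an illustrative Python sketch only. The `Rule` type and both functions are hypothetical, plain case-sensitive substring matching is an assumption, and the two rules are taken from the Table 2 rows "ALW: RC 0" and "CNVT: CORC 7 START CORCs".

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    """One Table 2 row (hypothetical representation)."""
    recognition1: str
    recognition2: str = ""
    recognition3: str = ""
    type_code: int = 0
    text1: str = ""


def matches(rule: Rule, line: str) -> bool:
    # Every populated Recognition field must appear on the same line.
    # Substring matching is an assumption; the source does not specify it.
    parts = (rule.recognition1, rule.recognition2, rule.recognition3)
    return all(part in line for part in parts if part)


def first_match(rules: List[Rule], line: str) -> Optional[Rule]:
    # Return the first rule that matches the line, or None.
    for rule in rules:
        if matches(rule, line):
            return rule
    return None


RULES = [
    Rule("ALW:", "RC", type_code=0),
    Rule("CNVT:", "CORC", type_code=7, text1="START CORCs"),
]
```

The `type_code` of the returned rule would then select the response (visual update, audible update, file write or text insertion), and `text1` supplies the text to display or store.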
In addition to the exemplary messages listed above, other messages are detected. In particular, in the preferred embodiment, messages referred to as processor recovery messages or PRMs are received from the administrative module. These messages are a string of 16 hexadecimal numbers. Certain of the messages are detected and the system responds by visually updating a user interface, audibly updating a user interface, storing the ROP message in one or more files, or inserting additional text into a file or user interface. Actions taken in response to PRMs convert the cryptic hexadecimal code into a message or sound that is readily perceived by a person. In some cases, multiple lines of a ROP message are parsed to determine a message stored and displayed in a user interface.
The method described above is preferably implemented with software running on servers 102, 106 and clients 104. The core monitoring application is preferably a multi-threaded NT service written in C++. This core monitoring application manages the connections to the switches 110, collects and stores data from the switches 110, and performs the analysis. For example, in order to connect to a switch, the server spawns a thread for that switch. All monitoring activities for that switch take place within the thread. Internal communications between applications are preferably accomplished using TCP/IP sockets. Data used to control the application, e.g., the predetermined messages, and data stored from the application, e.g., the identified messages, are stored in a database. A C++ library that encapsulates the database primitives provides application access to the database. The user interface is preferably a Microsoft Visual C++ application.
FIG. 3 is a diagram illustrating a visual portion of a user interface for displaying the progress of a software upgrade on a telecommunications switch in accordance with the present invention. Preferably, the user interface is implemented on a client 104.
Client 104 preferably has software that accesses the database on a server 102, 106 and reflects changes made in the database in a visually aesthetic manner. Preferably, the user interface is displayed on a computer monitor. Display 300 is a consolidated status display. Display 300 collectively shows in one screen the progress of multiple switches or offices undergoing a software upgrade. Display 300 includes a tool bar area 302, with short cuts or buttons for selecting certain commands, including commands to connect to and disconnect from a regional server 102 or national server 106. A region area 304 displays a label associated with the particular region being monitored, in this example, the "Southern" region. The display 300 is characterized by a set of columns and rows. Column 306 labeled "HC" relates to a "health check" conducted prior to the retrofit progress. The "health check" is performed prior to the retrofit to determine whether to proceed with the retrofit. This determination is described in co-pending patent applications referred to above in the cross-reference to related applications. Column 308 includes the name of the office or switch. In other words, column 308 lists the particular switch that is scheduled for a retrofit, generally by geographical designation. Columns 312, 314, 316, 318, 320 and 324 list certain stages of the retrofit and other pertinent fields relating to the retrofit. Column 312 relates to the COMMIT stage and column 314 relates to the END stage. Other stages are preferably displayed by scrolling horizontally in display 300. Column 316 gives the status of the retrofit. The status column is populated with "Pending," "Succeed," "Failure" or "Abort." Pending indicates the office will retrofit within five to seven days; Succeed indicates the retrofit was successful; Failure indicates the retrofit was not successfully completed; and Abort indicates the office retrofitted, but returned to the old generic. 
Column 318 relates to the down time associated with the retrofit; column 320 lists the time of the most recent note entered, if any; and column 324 lists the number of OFL boot attempts. Each row of display 300 relates to the switch or office listed in column 308. The area defined by the intersection of a column relating to a stage of the retrofit and a row relating to a particular office being retrofitted is used to display information associated with that particular stage of the retrofit. For example, in FIG. 3, area 350, which is the intersection of column 312 and row 332, relates to the commit stage of the retrofit for the Shreveport, Louisiana office listed in column 308, row 332. In the preferred embodiment, a date and time associated with the entry into the commit stage is listed in area 350, and a date and time associated with the exit out of the commit stage is shown below the entry data in area 350.
Area 350 is given a predetermined color to indicate certain status relating to the associated stage of the retrofit at the associated office. In the preferred embodiment, a background color of green is used to reflect no errors found in the associated stage of the retrofit. A flashing yellow background, that is, one alternating between yellow and another color or no color, is used to indicate that a stage has exceeded a predetermined expected stage time. A non-flashing yellow background color indicates the stage is in progress and has not exceeded an expected time. A red color is used to indicate that an error occurred during the associated stage at the associated office. A blue color is used to indicate the OFLBOOT stage. Other colors are optionally used. The area of the user interface containing the name of the office preferably reflects general status about the office. In particular, this area preferably flashes red to indicate a major or critical alarm at an office.
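The color conventions for a stage box described above amount to a small lookup from stage status to a background color and a flashing flag. The Python sketch below is illustrative only. The `StageStatus` names and the `cell_style` function are hypothetical, while the colors themselves are those given for the preferred embodiment.

```python
from enum import Enum
from typing import Tuple


class StageStatus(Enum):
    """Hypothetical status values for one stage box in display 300."""
    NO_ERRORS = "no errors found in the stage"
    IN_PROGRESS = "in progress, within expected time"
    OVERDUE = "exceeded expected stage time"
    ERROR = "error occurred during the stage"
    OFLBOOT = "OFLBOOT stage"


def cell_style(status: StageStatus) -> Tuple[str, bool]:
    """Return (background color, flashing) for a stage box."""
    styles = {
        StageStatus.NO_ERRORS: ("green", False),
        StageStatus.IN_PROGRESS: ("yellow", False),
        StageStatus.OVERDUE: ("yellow", True),   # flashing yellow
        StageStatus.ERROR: ("red", False),
        StageStatus.OFLBOOT: ("blue", False),
    }
    return styles[status]
```

Keeping the mapping in one table makes it straightforward to add the other colors that the description leaves optional.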
Selecting any stage box, for example, by double clicking on the stage box, causes the user interface to display a status display for the selected office. A stage box is defined by the intersection of a column relating to a retrofit stage and a row relating to a particular office or switch. The status display is discussed below with respect to FIG. 4. Selecting a row in the notes column 320 causes the user interface to display a window containing a note if one exists.
FIG. 4 shows a status display 400 relating to the progress of a retrofit for a single switch or office. Status display 400 provides additional details associated with a certain office in designated areas or panes. In particular, display 400 includes a ROP output pane 402, a MSG file output pane 404, an FRM file output pane 406, a stage pane 408, a switch pump pane 410 and an alarm pane 412. Display 400 also includes a toolbar 420, which displays icons for certain commands, and a status bar 440, which includes additional status.
ROP output pane 402 displays the text stream from the ROP as it is received and stored in a log file. The text in the pane scrolls as additional text is received, but is paused by selecting a pause command in toolbar 420. The MSG file output pane 404 displays messages written to a .msg file. The .msg file contains alarm messages received in the ROP for the switch during monitoring. Alarms are indications of events on the switch that may require corrective action. Alarms fall into three categories reflecting their level of severity: minor, major, or critical. The FRM output pane 406 displays output written to a form file for the office. The form file is written in response to the receipt of predetermined messages. The form file contains, for example, the entry and exit messages for each stage, failing processor recovery messages (PRMs), errors and other predetermined messages. The form file is useful to diagnose problems. The alarm pane 412 displays the most recent alarm message.
Alarm pane 412 is preferably colored green for minor alarms, yellow for major alarms and red for critical alarms. There are an acknowledge button 422 and an "ACK all" button 424 in the alarm pane. The acknowledge button 422 is used to acknowledge the occurrence of the current alarm. The ACK all button 424 is used to acknowledge all pending alarms. The buttons 422, 424 are made inactive when no alarms are pending. Alarms also cause an audible sound, with the sound varying with the severity of the alarm.
The switch pump pane 410 indicates when each switching module is "pumped" or completed. The word "pumped" appears next to the switching module number when the associated switching module is pumped. If there are errors in the pump process, the errors are indicated, rather than the word "pumped." The stage pane 408 shows the progress of the office through the stages of the retrofit. An area or stage square 430 is provided for each stage with the name of the stage designated below the square 430. The stage square 430 is given a predetermined color to reflect the status of the stage of the retrofit. Preferably, green indicates that the stage is complete; yellow indicates the stage is in progress; red indicates the stage failed; and blue indicates offline boot is running or OFLBOOT is in progress.
Display 400 includes a status bar 440 that includes additional status information. In particular, section 442 contains a voice phone number associated with the office; section 444 contains a foreign exchange number associated with the switch; section 446 indicates which retrofit tape is being loaded; and section 448 indicates whether notes have been entered for the office.
The present invention includes a system for regional and national monitoring of retrofits on telecommunication switches. Retrofit monitoring advantageously allows for early problem detection and correction to efficiently utilize resources during a retrofit.
Centralized retrofit monitoring advantageously allows for ready and remote diagnoses of problems due to expertise gained from previous retrofit monitoring. Though described above with respect to telecommunications switching systems, the retrofit monitoring process may equally apply to software upgrades on other processor-based systems. The invention being thus described, it will be evident that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention and all such modifications are intended to be included within the scope of the appended claims.
