is a tool you can use to troubleshoot network communications problems and related network errors.

Starting

Start and log in.

Click Tools > System > .

Tip

For instructions on using filtering, sorting, column selection, and pin/unpin to customize the display, see "Customizing and Navigating Interface Displays".

Figure callouts: A Expand/collapse tree, B Pin/Unpin button, C Diagnostics tabs, D Dynamic Filter field, E Navigation pane, F Diagnostics information pane

Navigation pane

Diagnostics information is grouped as follows:

Service Diagnostics: Contains diagnostics information for certain services (, and ).

Communication Diagnostics: Contains diagnostics information for the sites, hardware devices, and software nodes.

Select an item in the navigation pane to display its diagnostics information.

If you add a new device in while is open, you can refresh the tree view to display the new device by collapsing then expanding the root node of the tree.

Diagnostics Information pane

The diagnostics information pane displays detailed data about the state of your power monitoring system and devices.

Service Diagnostics

Service Diagnostics records communication problems and similar events occurring with the product's software components.

Communication Server diagnostics

Information about the communications server is arranged in these tabs:

Console Messages lists all and console messages for the current session.

Tip

The blank area below the Description column header is a dynamic filter field. Type the wildcard character (*) followed by the text you want to search for (for example, to display only messages prefixed with WARNING, type *warning). The diagnostics information pane then displays only the records that match the text you typed.

Connection Status displays the current status of the software components connected to Network Router.

Tree States displays the ION tree status of all nodes (hardware devices and software nodes).

diagnostics

The diagnostics information pane is split into two sections. The top section (Select Nodes pane) contains the available nodes, while the bottom section contains the node details.

Select nodes to display

In the Select Nodes pane, select the check box beside a node to display its diagnostics information. Clear the check box to hide that node’s diagnostics information.

Tip

If there are many nodes and you want to display only a few of them, right-click the Select Nodes area then click Clear All. Select only the nodes you want to display. To display all the nodes again, right-click the Select Nodes area and click Select All.

Node details

The node details are organized in these tabs:

Node Information provides diagnostics associated with the communication status of each selected node. If the is not configured to gather data from a given node, it does not appear in the list in the Node column. If the is configured to automatically gather information for a node, but that node has not yet been processed, it does not initially appear in the list. Once information becomes available, the node appears (if it has been selected).

Node Performance provides per-node performance summary information.

Log Performance provides performance information on a per log basis.

The following table summarizes the columns on the Node Information tab:

Column Description
Node The name of the device, VIP, or .
DeviceType The device type of the associated node that is returned by the device itself. The uses this to detect device swapouts.
SerialNumber The serial number of the device that is returned by the device. The uses this to detect device swapouts.
Configured Polling Interval (s) The requested polling interval in effect. It can be configured either from the log upload control or from the custom Windows Registry value. All of the nodes for which polling is disabled are identified with Polling Disabled in this column.
Average Update Interval (s) A weighted average of the time between polled results for the device. The most recent interval accounts for 20% of the value, and the previous average accounts for the remaining 80%. If the most recent interval deviates from the average by more than 30 seconds, the old average is discarded and the current interval is used. By default, the expected value for devices that support logs is the Configured Polling Interval (s) value. The expected value for devices that do not support logs is 60 seconds.
If the is selected but it is not configured to collect data from its System Log Controller, it appears in the diagnostics and shows 300s for Average Update Interval. Initially this value is n/a.
Time Since Update (s) The time in seconds since the last communication with the node. This time includes polling updates, record uploads, and configuration loads.
CommStatus Can be one of the following values:

alive – The node is communicating.

late – If a response to a polling program is not received within 3 minutes, the sends a ping. If the ping does not respond in 10 minutes, the communication status is set to late and another ping is sent. The system continues pinging every 10 minutes until a response is received.

expired – If a ping returns before the response for any preceding request, the original request was lost. The request is abandoned and the communication status is set to expired. A request can be lost if a destination Site Server or VIP is shut down. The state changes from expired when the device responds to a request.

timeout – A request to the device timed out. The device is not communicating.

site not connected – The site is currently not connected.

cannot send – An unrecoverable error. The cannot send a program to the communications subsystem. The shuts down if Network Router is not running. Restart the system.

invalid password – The password entered for this device in is invalid.

password changed – The password for this device has been changed. Update the password for the device in .

site not responding – The connection is unexpectedly broken during communication with the device.

device disabled – The device or its site is not enabled in . Note that automatically removes this node from this list if the node had been detected by automatic means.

does not exist – The device is not registered in the system. In auto-mode, the device eventually disappears from the list unless it is referenced remotely by a VIP.

pending – No responses have been processed.

nack'd – The request was not acknowledged. This could mean that the Site Server hosting the device is not running.

validating – Treemon reported that the device is not responding. A signal is sent to Treemon to validate the state. This state clears once Treemon (via Validator) establishes communications with the device.

Comments Under steady-state conditions, this is blank. While the attempts to upload configuration information, this can contain a string value indicating that the Tree is in use by another client. This indicates that the cannot process the device until the aforementioned client releases it. If the client is ION , it is not released until the node is closed in or is closed. If the client's name ends with -not-clean, the node is currently being evaluated by Treemon/Validator.
AggregateSetupCount The aggregate setup count of the device. The uses this to detect configuration changes.
RequestedIONs The number of ION registers, modules, and/or managers that have been requested from the tree. The needs to upload configuration information to determine which logs need to be processed, which labels should be used for measurement mapping and source resolution, and which labels to use for event cause and effects.
The retrieves the currently cached tree from Treemon, populating as needed by communicating directly with the device. The tree is locked for the duration of this process, and this prevents from opening the tree.
If the value is:

none – No configuration information is currently required. This is typical in a steady-state condition.

cache – Only the currently cached configuration is required. This is typically seen at startup.

A number – The needs specific information and that number of ION objects has been explicitly requested.

RequestStatus The status of the tree requests can include one of the following values:

ready – The does not require any configuration information.

requesting – The requires configuration information and is in the process of gathering it. The value in Request Update Time indicates how long it has been processing this request.

retrying – A previous tree request was not successful. (See the Comments column for the reason.) The request is retried, as shown by the value in Request Update Time. The amount of wait time before retrying a request depends on the nature of the unsuccessful tree request:

Tree in use by another node – 10 seconds.

Tree dirty – 10 seconds.

Not responding – 60 seconds.

Tree request timed out after 10 minutes – 5 minutes.

Comm error – 10 seconds.

Other errors – 5 minutes.

blocked – The requires configuration information but all available resources are in use. By default, the can simultaneously request only up to 2 trees per site and 6 trees in total. The Request Update Time value indicates how long the request has been pending.

processing – The has received the requested ION objects and is processing them. The Request Update Time value indicates how long this request has been in process, including the time spent in the "requesting" state.

abandoned – This is the same as the retrying status but the request of some of the configuration information was not successful following the successful receipt of some information. The recovers when it retries the request.

Request Update Time (s) The time varies depending on the status of the tree requests described for RequestStatus.
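
The weighting described for the Average Update Interval (s) column can be sketched as follows. This is only an illustration of the arithmetic stated in the table, not the product's implementation: the function and constant names are hypothetical, and the handling of the first sample (when the column still shows n/a) is an assumption.

```python
from typing import Optional

# Values taken from the column description above.
RECENT_WEIGHT = 0.2        # the most recent interval accounts for 20%
RESET_THRESHOLD_S = 30.0   # deviation beyond which the old average is discarded


def update_average(previous_avg: Optional[float], interval_s: float) -> float:
    """Return the new average after one polled result (hypothetical helper)."""
    if previous_avg is None:
        # Assumption: with no history (column shows n/a), start from the
        # first observed interval.
        return interval_s
    if abs(interval_s - previous_avg) > RESET_THRESHOLD_S:
        # Large deviation: discard the old average, use the current interval.
        return interval_s
    # Otherwise blend: 20% most recent interval, 80% previous average.
    return RECENT_WEIGHT * interval_s + (1.0 - RECENT_WEIGHT) * previous_avg
```

For example, with a previous average of 60 seconds, a new interval of 70 seconds gives roughly 62 seconds, while a new interval of 120 seconds replaces the average outright because it deviates by more than 30 seconds.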

The following table summarizes the columns on the Node Performance tab:

Column Description
Node The name of the device, VIP, or .
Responding Indicates whether or not the node is responding. For a VIP, this includes all external nodes connected, directly or indirectly, to the input of a Recorder. The responding state is used to determine whether or not the download of the log is caught up.
All Logs Polling Disabled Indicates if log upload is disabled for all recorders on the device. A Yes in the column indicates that log upload is disabled.
TotalLogs The total number of Data Recorders, Waveform Recorders, Event Log Controllers, and System Log Controllers that the is configured to collect data from a given node. Note that when automatically detecting these modules, this number may change as the gathers configuration information.
PendingRecords The total number of records that the has requested from the node but has not yet received.
OutstandingRecords The total number of records not yet uploaded based on the last read position counter on the device and the position of the last uploaded record, taking into account the maximum depth of each log.
ProcessedRecords The number of records that have been inserted into the database. Note that a record typically corresponds to a number of DataLog entries. The term "record" refers to records at the device level.
Generated Rec. per sec An estimate of the number of new records being generated per second.
Retrieved Rec. per sec An estimate of the number of records being uploaded per second.
Avg Retrieval Time (s) The average round-trip time in seconds taken to retrieve a record from a device.
Avg Processing Time (s) The average time in seconds necessary to insert a record into the database.
RestoredLogs The total number of logs that the is configured to gather information for.
ManagedLogs The total number from the value in RestoredLogs that is being monitored by an enabled Log Acquisition Module (LAM).
ConfiguredLogs The total number from the value in RestoredLogs that are Recorders and have source inputs or are Event Log Controllers or System Log Controllers.
ConfirmedLogs The total number from the value in RestoredLogs for which the current configuration is known.
NumCaughtUp The total number from the value in RestoredLogs for which the node is responding and there are no records outstanding or pending.

The following table summarizes the columns on the Log Performance tab:

Column Description
Node The name of the device, VIP, or in question.
LogHandle The handle of the Log Register or Event Log Register for this Node.
Responding Indicates whether or not the node is responding. For a VIP, this includes all external nodes connected, directly or indirectly, to the input of a Recorder. This state is used to determine whether or not it is caught up.
Polling Disabled Indicates which individual recorders are excluded from polling requests. A Yes in the column indicates which recorders are excluded.
PendingRecords The total number of records that the has requested from the node but has not yet received. This number includes event records that have been uploaded but are cached internally pending configuration information necessary to complete the processing of the cause and/or effect ION objects.
OutstandingRecords The total number of records not yet uploaded based on the last read position counter on the device and the position of the last uploaded record, taking into account the maximum depth of each log.
ProcessedRecords The number of records that have been inserted into the database. Note that a record typically corresponds to a number of DataLog entries. In this context, "record" refers to records at the device level.
Generated Rec. per sec An estimate of the number of new records being generated per second.
Retrieved Rec. per sec An estimate of the number of records being uploaded per second.
Avg Retrieval Time (s) The average round-trip time in seconds taken to retrieve a record from a device.
Avg Processing Time (s) The average time in seconds necessary to insert a record into the database.
Restored This is always yes. If the log is not "restored", it does not appear in the list.
Managed A Log Acquisition Module (LAM) is enabled that is monitoring this log.
Configured The log is a Recorder that has source inputs or it is an Event Log Controller or a System Log Controller.
Confirmed The latest configuration for the log has been uploaded. For a VIP Recorder that references external devices, directly or indirectly, the configuration information includes information from the external device.
CaughtUp The node is communicating, the current configuration is known, and there are no outstanding or pending records. For a VIP, any device on which the log depends for information must also be responding.
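
The CaughtUp condition described above combines several of the other columns. A minimal sketch, assuming a hypothetical `LogRow` record whose fields mirror the Responding, Confirmed, PendingRecords, and OutstandingRecords columns (the class and function names are illustrative and are not part of the product):

```python
from dataclasses import dataclass


@dataclass
class LogRow:
    """Hypothetical stand-in for one row of the Log Performance tab."""
    responding: bool          # Responding column
    confirmed: bool           # Confirmed column (current configuration known)
    pending_records: int      # PendingRecords column
    outstanding_records: int  # OutstandingRecords column


def is_caught_up(row: LogRow) -> bool:
    """True when the node responds, the configuration is known, and no
    records are pending or outstanding."""
    return (
        row.responding
        and row.confirmed
        and row.pending_records == 0
        and row.outstanding_records == 0
    )


def num_caught_up(rows: list) -> int:
    """Count the caught-up logs, analogous to NumCaughtUp on the
    Node Performance tab."""
    return sum(1 for row in rows if is_caught_up(row))
```

Under these assumptions, a log with even one pending or outstanding record is not counted as caught up, even if its node is responding.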

Alarm Service

Alarm Service provides the status of alarms that you configure and enable in the Alarm Configuration application.

The information is organized in a grid. The column labels indicate the type of information provided, such as Rule Name, Alarm Name, Alarm Status, and so on. See the Alarm Configuration Help (accessible from the Alarm Configuration application) for further information about configuring alarms for multiple sources and measurements.

Communications Diagnostics

Communications Diagnostics provides diagnostics information for sites and devices connected to the workstation.

Site overview

Diagnostics information for the sites is contained in these tabs:

Device Summary displays communications statistics for each site.

NetUser Status displays the number of ION programs currently in the queue (awaiting processing) and the total number of ION programs already processed.

Note

Requests and responses transmitted between the software components are referred to as “ION programs”.

Site/Device Diagnostics

Diagnostics information for sites and devices is summarized in these tabs:

Communication Status displays error rates and connection statistics for the selected site or device. The following information is available from the Communication Status tab:

Column Description
Node The device (or software node) name.
Requests The number of communications requests transmitted to the meter.
Responses The number of successful responses received.
Request Ratio The number of requests sent to the device to fulfil the last client request. The value is always 1 for ION devices but it varies for Modbus devices.
Total Errors The total number of communication errors.
Total Err Rate (%) The ratio of Total Errors to Requests.
Sliding Err Rate (%) The error rate in the last 100 requests. This can indicate a trend in communications performance.
Time Util (%) The percentage of the communication channel utilized (serial line or Ethernet) on the site.
Avg Resp Time (s) Average time in seconds for the meter to respond.
Last Resp Time (s) The last response time, in seconds.
Timeouts The number of timeouts. A timeout occurs when no data is received in response to a request.
Bad CRC The number of bad packets received, that is, those that do not pass the error-detection checksum.
Incompl. Frm The number of incomplete packets received, that is, those that did not have all the expected bytes.
Broken Conn. The number of times the connection to the meters on a site was lost.
Bad Frames The number of received packets that had an internal error.
HW Errors Number of errors reported by the computer’s communication hardware.
Misc Errors Number of other errors that do not fit any of the above descriptions.

Site Status displays site statistics such as connection status and totals.

Polling Status displays the number of programs currently in the queue (awaiting processing) and the total number of programs already processed.
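
The Total Err Rate (%) and Sliding Err Rate (%) columns on the Communication Status tab differ only in the window they cover. The sketch below illustrates that difference under a simplifying assumption that every error corresponds to exactly one request; the `CommStats` class and its method names are hypothetical and do not come from the product.

```python
from collections import deque


class CommStats:
    """Hypothetical per-node counters used only to illustrate the two rates."""

    def __init__(self, window: int = 100) -> None:
        self.requests = 0
        self.total_errors = 0
        # Outcomes of the newest `window` requests; True marks an error.
        self._recent = deque(maxlen=window)

    def record(self, is_error: bool) -> None:
        """Register the outcome of one request."""
        self.requests += 1
        if is_error:
            self.total_errors += 1
        self._recent.append(is_error)

    def total_err_rate(self) -> float:
        """Total Errors divided by Requests, as a percentage."""
        if self.requests == 0:
            return 0.0
        return 100.0 * self.total_errors / self.requests

    def sliding_err_rate(self) -> float:
        """Error percentage over the most recent requests only."""
        if not self._recent:
            return 0.0
        return 100.0 * sum(self._recent) / len(self._recent)


stats = CommStats()
for i in range(200):
    stats.record(is_error=(i < 10))  # ten early errors, then clean traffic
```

In this run the total rate still reflects the early errors, while the sliding rate has returned to zero because the last 100 requests all succeeded. This is why the sliding value is the better indicator of a trend.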

Communication Status vs. Site Status

This section explains the difference between the statistics provided on the Communication Status tab and those on the Site Status tab.

“Total Errors” in the Communication Status tab is an derived statistic, while “Bad Responses” in the Site Status tab is a client derived statistic.

To explain this difference, consider a situation where a direct site is experiencing timeouts. Communication with the device is attempted according to two parameters: Connect Attempts (an advanced site property in ) and Maximum Attempts Multiple (an advanced device property in ). The product of these two values is the number of attempts made to re-establish communications with the device.

For instance, if Connect Attempts is set to 1 and Maximum Attempts Multiple is set to 3, the device will go offline after 3 attempts (that is, 1 x 3).

The “Total Errors” statistic increases by one every time detects a timeout. However, the “Bad Responses” statistic only increments every time a response is sent back to a client.

Using the previous example, consider the case where four timeouts occurred and the device went offline. In this case, “Total Errors” increases by four, while “Bad Responses” only increases by one. If only two timeouts occurred, “Total Errors” would increase by two, while “Bad Responses” would not change.
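
The two counters in this example can be modeled with a short simulation. This is a sketch under stated assumptions, not the product's logic: it assumes that an error response goes back to the client only when the attempt limit is reached, and that the attempt counter then starts over. The function name and parameters are hypothetical.

```python
def count_errors(timeouts: int, connect_attempts: int,
                 max_attempts_multiple: int) -> tuple:
    """Return (total_errors, bad_responses) after consecutive timeouts."""
    # Attempt limit = Connect Attempts x Maximum Attempts Multiple.
    max_attempts = connect_attempts * max_attempts_multiple
    total_errors = 0
    bad_responses = 0
    current_attempt = 0
    for _ in range(timeouts):
        total_errors += 1        # every detected timeout counts
        current_attempt += 1
        if current_attempt == max_attempts:
            # Assumption: the device is flagged offline here and a single
            # error response is returned to the client.
            bad_responses += 1
            current_attempt = 0
    return total_errors, bad_responses


print(count_errors(4, 1, 3))
print(count_errors(2, 1, 3))
```

With Connect Attempts set to 1 and Maximum Attempts Multiple set to 3, four timeouts yield four total errors and one bad response, and two timeouts yield two total errors and no bad response, which matches the figures in the example above.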

The following information is available from the Site Status tab.

Column Description
Node The device (or software node) name.
Status The device communication status.
Current Attempt The current number of repeated attempts to communicate with the device.
Max Attempts The maximum number of attempts before flagging the device as offline (Timed-out).
Offline Count The total number of times the device went offline.
Bad Responses The total number of errors sent back to the clients, such as to .
Last Response The time when the last response was received.
Last Attempt The last time that a request was sent to the device.
RT Data Reqs The total number of requests to the device sent by the Real Time Data Service.
TreeMon Reqs The total number of requests to the device sent by the TreeMon service.
VISTA Reqs The total number of one-shot requests to the device sent by a client (for example, control or label requests).
LogInserter Reqs The total number of requests to the device sent by the LogInserter service.
IONSERVICE Reqs The total number of requests to the device sent by ION real-time services.

Note that the last five columns on the Site Status tab are dynamic. That is, the columns are only shown when requests were sent to the device from a service or client.

Additional commands

The following sections describe additional display options and shortcut menus available in .

Diagnostic Details

In the tabs on the diagnostics information pane, double-click a row to display its Diagnostic Details screen. This displays the diagnostic information for the selected item only.

Use the Previous and Next buttons to view the details of other rows in that tab of the diagnostics information pane.

To copy information to the clipboard, select the rows you want to copy, then press CTRL+C.

Diagnostics Information pane shortcut menu options

Right-click the diagnostics information pane to display a shortcut menu. The following table lists all the commands available (though not all panes in provide all the commands listed):

Right-click Option Description
Update Refreshes the information in the diagnostic table.
Reset Resets the information in the diagnostic table (not available in the Communications Server Diagnostics display).
Copy All Copies all selected information to the clipboard.
Auto Scroll Enabled by default, this option is only available in the Console Messages tab of the Communications Server Diagnostics display. This option automatically scrolls and selects the latest console message. Clear this option to disable scrolling (that is, select and view an older console message without jumping to the latest one when refreshes).
Options Displays the Options dialog where you can change the diagnostics refresh rate. Note that changing the refresh rate frequency can affect the product's performance.